Importance of linguistic constraints in statistical dependency parsing
Bharat Ram Ambati
Language Technologies Research Centre, IIIT-Hyderabad, India

Introduction

• Parsing
  – Major NLP task
  – Many applications

Motivation

• Machine Translation
  [Figure: translation among Indian languages, and from Indian languages to English]

• State-of-the-art dependency parsers
  – CoNLL-X and CoNLL-2007 Shared Tasks
  – ICON09 NLP Tools Contest

• Linguistic constraints
  – How to avoid multiple Subjects/Objects?

Approaches

• Two post-processing approaches: Naive Approach (NA) and Probabilistic Approach (PA)

• Pipeline (both approaches)
  – Sentence → parser (Malt or MST+MaxEnt) → syntactic dependency tree
  – Dependency tree [T] with k-best labels for each node
  – While T has a verb with multiple Subj/Obj:
    • Extract the multiple Subj/Obj of that verb
    • NA: assign Subj/Obj to the left-most node
    • PA: assign Subj/Obj to the node with the highest probability
    • Assign the 2nd-best label to the rest of the nodes
    • Update the k-best labels list
  – Otherwise: output the final dependency tree

• How to extract k-best labels with probabilities?
  – Malt: modified implementation
  – MST+MaxEnt: used MaxEnt APIs
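The NA/PA post-processing loop above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the tree representation, the `Subj`/`Obj` label names, and the function name are assumptions, and a real implementation would operate on MaltParser/MST output structures.

```python
import copy

def enforce_single_subj_obj(tree, use_probs):
    """Repair a parse so that no verb keeps more than one Subj (or Obj) dependent.

    tree: maps each verb id to a list of (node_id, kbest) dependents, where
          kbest is a list of (label, probability) pairs, best label first.
    use_probs=False -> Naive Approach: keep the left-most clashing node.
    use_probs=True  -> Probabilistic Approach: keep the highest-probability node.
    """
    tree = copy.deepcopy(tree)  # leave the caller's parse untouched
    while True:
        # Find a verb that still has multiple Subj/Obj dependents.
        clash = None
        for verb, children in tree.items():
            for label in ("Subj", "Obj"):
                nodes = [(n, kb) for n, kb in children if kb and kb[0][0] == label]
                if len(nodes) > 1:
                    clash = nodes
                    break
            if clash:
                break
        if clash is None:
            return tree  # final dependency tree: constraint satisfied
        # Decide which node keeps the contested label.
        if use_probs:
            keep = max(clash, key=lambda item: item[1][0][1])[0]  # PA: highest prob.
        else:
            keep = min(clash, key=lambda item: item[0])[0]        # NA: left-most node
        # The rest get their 2nd-best label: dropping the losing label from the
        # k-best list moves the next-best label to the front ("update k-best list").
        for node, kbest in clash:
            if node != keep:
                kbest.pop(0)
```

The outer `while` matters: demoting a node to its 2nd-best label can create a fresh Subj/Obj clash, so the check is repeated until the tree is clean.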

Experiments

• Hindi
  – Data: ICON09 NLP Tools Contest data
  – Test data: 150 sentences
  – Labels equivalent to Subj/Obj: ‘k1’ and ‘k2’
  – Multiple Subj/Obj instances: MaltParser output: 39; MST+MaxEnt output: 51

• Czech
  – Data: CoNLL-2007 Shared Task data
  – Test data: 286 sentences
  – Labels equivalent to Subj/Obj: ‘agent’ and ‘patient’
  – Multiple Subj/Obj instances: MaltParser output: 38

• Results

Comparison of NA and PA with previous best results for Hindi

              Malt                   MST+MaxEnt
          UAS    LAS    LS       UAS    LAS    LS
Baseline  90.14  74.48  76.38    91.26  72.75  75.26
NA        90.14  74.57  76.38    91.26  72.84  75.26
PA        90.14  74.74  76.56    91.26  73.36  75.87

Comparison of NA and PA with previous best results for Czech

          UAS    LAS    LS
Baseline  82.92  76.32  83.69
NA        82.92  75.92  83.35
PA        82.92  75.97  83.40
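The UAS/LAS/LS columns in the tables above are the standard attachment metrics. A minimal sketch of how they are computed; the `(head, label)` token representation and the function name are assumptions for illustration:

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS, LS) as percentages.

    gold, pred: one (head_index, dependency_label) pair per token.
    UAS counts tokens with the correct head, LS tokens with the correct
    label, and LAS tokens with both head and label correct.
    """
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred))
    ls = sum(g[1] == p[1] for g, p in zip(gold, pred))
    las = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * uas / n, 100.0 * las / n, 100.0 * ls / n
```

This separation explains why PA can raise LAS and LS while leaving UAS unchanged: the post-processing only relabels edges, it never moves a head.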

Discussion

• Probabilistic Approach better than Naive Approach

• Hindi
  – 0.26% (Malt) and 0.61% (MST+MaxEnt) improvement in LAS
  – Better probability estimates using MaxEnt

• Czech
  – No improvement
  – Limitation of the libsvm learner of MaltParser

Future Work

• All data-sets of the CoNLL-X and CoNLL-2007 Shared Tasks

• Both MST and Malt
  – libsvm and liblinear learners of Malt
  – MaxEnt labeler for MST

• Avoiding multiple instances of other labels

• More linguistic constraints

