Trimming CFG Parse Trees for Sentence Compression Using Machine Learning Approach Yuya Unno, Takashi Ninomiya, Yusuke Miyao and Jun’ichi Tsujii University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, Japan

I ntroduc tion

Method 2. Bottom-Up Method

Sentence compression is one of the summarization tasks. We compress an input sentence and create a new short grammatical sentence preserving its meaning. Yesterday I went to Tokyo by train I went to Tokyo  

We cannot learn some compression patterns using the previous method, because the two parse trees sometimes have different structures. Previous method S

We can only drop words

Input: sentence l Output: argmaxsP(s | l)

NP

VP

NP

Knight and Marcu’ s noisy-channel model [1] 1. Parse sentences in the training corpus 2. Compare the corresponding nodes of compressed and original parse trees from the root nodes 3. Estimate rewriting probabilities using count of applied CFG rules We revised this model in two points 1. Maximum-entropy model 2. Bottom-up method

DT The

PP on the table

N apple

S NP

NP

ADVP

Yesterday

VP V

I

PP

I

∏P

( rl , rs )∈R

exp

Left-most ‘ ADVP’

S

ADVP

NP

Yesterday

I

V

‘ Yesterday’ is removed

features  Mother

    

DT

N

The

apple

on the table

r ∈R '

NP

is red

DT

N

The

apple

NP

went to Tokyo

1 P( s | l ) = ∏ exp ∑ λi f i (rs , rl ) i ( rs ,rl )∈R Z  Depth

PP on the table

is red

Avg. length of original sentences: 23.8 Avg. length of compressed sentences: 12.5 Training set: 527 sentences Development set: 263 sentences Test set: 264 sentences F-measure Bigram F-measure BLEU score

80

I

VP

Extracted tree

70 60 50 40

75.3 63.3

30

50.2 47.1

20

80.9 64.1 62.1

72.0 69.5

S

PP

S NP

10

VP

Select the nodes which dominate the compressed sentence

Daughter nodes are corresponding

100 90

Probabilities depend on various features of a parse tree

node  Daughter nodes sequence  Daughter terminals that are removed

Compressed tree

Original tree

PP

(rl | rs )∏ Pcfg (r )

‘ S’is the root

V

apple

VP

We can easily introduce various features to the maximum-entropy model, such as the depth from the root node and which words are removed. Maximum entropy model

The

is red

E x pe rim e nta l R e s ult

went to Tokyo went to Tokyo Pexp(rl | rs): probability of rewriting rs to rl P ( s | l ) ∝ P (l | s ) P ( s ) P(l | s ) =

PP

S

NP

N

VP

NP

Probabilities only depend on mother and daughter nodes

rs

DT

{DT, N} is not a subsequence of {NP, PP}

Bottom-up method

Rewriting probabilities only depend on mother and daughter nonterminals in Knight and Marcu’s model.

S

is red

VP

In the bottom-up method, we only parse the original sentence, and extract a tree from the original parse tree.

Method 1. Maximum-Entropy Model

rl

S

NP

Original tree

A lgorithm

Knight and Marcu’ s Noisy-channel model

Daughter nodes are not corresponding

Noisy-channel Maximum EntropyMaximum Entropy with Bottom-up

VP V

Results of N-gram based evaluation

PP

Grammar

Importance

Human Noisy-channel

4.94 3.81

4.31 3.38

Maximum Entropy ME + Bottom-up

3.88 4.22

3.38 4.06

went to Tokyo

from the root  Left-most and right-most daughters  etc...

We used the same corpus as Knight and Marcu. We evaluated the results using F-measure and BLEU score [2], and human judgment. Our method exceeds the previous method in all evaluation criteria. Especially we obtained the highest score using the maximum entropy model with bottom-up method.

Results of human evaluation Grammar: Whether the output is grammatically correct  Importance: Whether the important words remain 

[1] K. Knight and D. Marcu. 2000. Statistics-Based Summarization - Step One: Sentence Compression. In Proc. of AAAI/IAAI' ‘00 [2] K. Papineni, S. Roukos, T. Ward, and W. Zhu. 2002. Bleu: a Method for Automatic Evaluation of Machine Translation. In Proc. of ACL'02.

Maximum-entropy model

Test set: 264 sentences. Noisy-channel. 63.3. 50.247.1. 75.3. 64.162.1. 80.9. 72.069.5. Maximum EntropyMaximum Entropy with Bottom-up. F-measure. Bigram F-measure. BLEU score. 10. 20. 30. 40. 50. 60. 70. 80. 90. 100. S. NP. VP. NP. PP. The apple on the table is red. DT. N. S. NP. VP. DT. N. The apple is red. S. NP.

1MB Sizes 0 Downloads 281 Views

Recommend Documents

AlgebraSolidGeometry_E_3sec model 1 And The model answer.pdf ...
Whoops! There was a problem loading more pages. AlgebraSolidGeometry_E_3sec model 1 And The model answer.pdf. AlgebraSolidGeometry_E_3sec model ...

Model Questions
The entrance test for admission to Master's Degree in Hospital Management is ... After successive discounts of 10% and 8% have been granted the net price of ...

4. Model Design of Weather Monitoring Model
model are AVHRR – LAC (Advanced Very. High Resolution Radiometer – Local Area. Coverage) type. Description about it could be seen in chapter 2.2.3. Actually, it has spatial resolution is 1,1 x 1,1 kilometers square and temporal resolution is one

Towards Automatic Model Synchronization from Model ...
School of Electronics Engineering and Computer Science ...... quate to support synchronization because the transforma- .... engineering, pages 362–365.

Model Typing for Improving Reuse in Model-Driven Engineering ... - Irisa
Mar 2, 2005 - on those found in object-oriented programming languages. .... The application of typing in model-driven engineering is seen at a number of.

Medical Model vs. Social Model - Kids As Self Advocates
Visit Kids As Self Advocates on the web at: www.fvkasa.org. KASA is a project of ... are a change in the interaction between the individual and society. 5.

A Behavioural Model for Client Reputation - A client reputation model ...
The problem: unauthorised or malicious activities performed by clients on servers while clients consume services (e.g. email spam) without behavioural history ...

Model Typing for Improving Reuse in Model-Driven Engineering Jim ...
typing in model-driven engineering, including a motivating example. Following this, in section 3 ... type system). Not all errors can be addressed by type systems, especially since one usually requires that type checking is easy; e.g., with static ty

model-integration.pdf
Sign in. Main menu.

MODEL QUESTION PAPER
28) Verify Euler's formula for the given network. 29) In ∆leABC, PQ II BC. AP = 3 cm, AR = 4.5 cm,. AQ = 6 cm, AB ... A motor boat whose speed is 15km/hr in still water goes 30 km down stream and comes back in a total of a 4 hours 30 minutes. Deter

Model PDQ's.pdf
Wa/iace'£ publication0. Since DarWin |maiiace^ubiicanon^-tim i^id and ... a// -fka Fi-eids Of fodtoy^.-. Page 3 of 5. Model PDQ's.pdf. Model PDQ's.pdf. Open.

spatial model - GitHub
Real survey data is messy ... Weather has a big effect on detectability. Need to record during survey. Disambiguate ... Parallel processing. Some models are very ...

Model in Word - Microsoft
ground domain), for which large amounts of train- ing data are available, to a different domain (the adaptation domain), for which only small amounts of training ...

Supporting Model-to-Model Transformations: The VMT ...
can attach a Java program that realizes the actual transformation (referred to as a ..... M. Clavel, F. Durän, S. Eker, P. Lincoln, N. Marti-Oliet, J. Meseguer, and J.

Model Typing for Improving Reuse in Model-Driven Engineering ... - Irisa
Mar 2, 2005 - paradigm, both for model transformation and for general ... From the perspective of the data structures involved, model-driven computing ..... tools that work regardless of the metamodel from which the object was instan- tiated.

The subspace Gaussian mixture model – a structured model for ...
Aug 7, 2010 - We call this a ... In HMM-GMM based speech recognition (see [11] for review), we turn the .... of the work described here has been published in conference .... ize the SGMM system; we do this in such a way that all the states' ...

Validation in Model-Driven Engineering: Testing Model ... - Irisa
using MDA for software development, there remain many challenges for the process of software validation, and in par- ticular software testing, in an MDA context ...

MODEL ADEZIUNE.pdf
There was a problem previewing this document. Retrying... Download. Connect more apps... Try one of the apps below to open or edit this item. MODEL ...

model acta.pdf
Page. 1. /. 1. Loading… Page 1 of 1. model acta.pdf. model acta.pdf. Open. Extract. Open with. Sign In. Main menu. Displaying model acta.pdf. Page 1 of 1.

Model AIC Deviance - GitHub
summary(dsm_all). Family: Tweedie(p=1.25). Link function: log. Formula: count ~ s(x, y) + s(Depth) + s(DistToCAS) + s(SST) + s(EKE) + s(NPP) + offset(off.set).

Business Model
Proof will be able to reach the segments through its primary advertising campaign of ... will utilize social networks such as Facebook, Twitter, Instagram, and Yelp to make the ...... Does your business need customer relationship management?

Model Checking
where v1, v2, . . . . v represents the current state and v., v, ..., v, represents the next state. By converting this ... one register is eventually equal to the sum of the values in two other registers. In such ... atomic proposition names. .... If