Recognition of Requisite Part and Effectuation Part in Law Sentences (RRE Task)

Ngo Xuan Bach
Joint work with: Nguyen Le Minh, Akira Shimazu
JAIST
ICCPOL 2010

Legal Engineering
• To achieve a trustworthy electronic society
• To examine and verify the validity of issues
  o Whether a law is established appropriately according to its purpose
  o Whether a law is consistent with related laws
  o etc.
• Two important goals
  o To help experts make complete and consistent laws
  o To design an information system which works based on laws
• Developing a system which can process legal texts automatically

The Logical Structure of Law Sentences
• In most cases, a law sentence can roughly be divided into two parts:
  o Requisite part
  o Effectuation part
• Example
  o “When the mayor designates a district for promoting beautification, s/he must in advance listen to opinions from the organizations and the administrative agencies which are recognized to be concerned with the district”

Analyzing the Logical Structure of Law Sentences
• Input
  o A law sentence
• Output
  o The logical parts
[Figure: the RRE task takes a law sentence and splits it into a Subject Part, a Requisite Part, and an Effectuation Part]

Motivation
• To understand the logical structure of legal texts
• To support other tasks in legal text processing
  o Translating legal articles to logical and formal representations
  o Verifying legal documents
  o Legal article retrieval
  o Legal text summarization
  o Question answering in legal domains

RRE Task
• Two kinds of sentences and seven kinds of parts
• Implication Sentences
  o Requisite Part: R
  o Effectuation Part: E
  o Subject Part: S1, S2, S3
    - S1: A Subject Part having an influence in the Requisite Part
    - S2: A Subject Part having an influence in the Effectuation Part
    - S3: A Subject Part having an influence in both Requisite and Effectuation Parts
• Equivalence Sentences
  o The Left Side Part: EL
  o The Right Side Part: ER

Solution (1)
Two stages: Sequence Learning, then Reranking

• Sequence Learning
  Sentence: w1 w2 w3 … wk wk+1 … wn

  Elements | w1  | w2  | … | wk  | wk+1 | … | wn
  Tags     | B-R | I-R | … | I-R | B-E  | … | I-E

• Input: sequence of elements
  o Words
  o Bunsetsus
• Output: sequence of tags

In each tag, the prefix (B or I) gives the position of the element within a part, and the suffix (R or E) gives the kind of part.
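To make the tagging scheme concrete, the sketch below groups a tag sequence back into logical parts. It is only an illustration: the function name `iob_to_parts` and the `(kind, start, end)` output format are assumptions made here, not something defined in the slides.

```python
from typing import List, Tuple


def iob_to_parts(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Group IOB tags into (kind, start, end) parts; end is exclusive."""
    parts = []
    kind, start = None, 0
    for i, tag in enumerate(tags):
        # "B-R" -> position "B", label "R"; a bare "O" has an empty label.
        position, _, label = tag.partition("-")
        starts_new = position == "B" or (position == "I" and label != kind)
        # Close the open part when we leave it or a new part begins.
        if kind is not None and (position == "O" or starts_new):
            parts.append((kind, start, i))
            kind = None
        # Open a new part on a begin tag (or a stray inside tag).
        if position != "O" and starts_new:
            kind, start = label, i
    if kind is not None:
        parts.append((kind, start, len(tags)))
    return parts


tags = ["B-R", "I-R", "I-R", "B-E", "I-E"]
print(iob_to_parts(tags))  # [('R', 0, 3), ('E', 3, 5)]
```

Here the first three elements form the requisite part and the last two form the effectuation part.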

Solution (2)
• Reranking
  o Two steps
    - Step 1: generate a set of candidates using a base model (GEN)
    - Step 2: rerank candidates using a score function

      $F(x) = \arg\max_{y \in GEN(x)} score(y) = \arg\max_{y \in GEN(x)} \Phi(y) \cdot W$

  o Advantage
    - Can utilize non-local, global features
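The score function can be read as a dot product between a candidate's feature vector Φ(y) and a weight vector W. The following is a minimal sketch of that step, assuming sparse features stored in dictionaries; the feature names and weight values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Features = Dict[str, float]


def score(features: Features, weights: Dict[str, float]) -> float:
    """Linear score: dot product of the feature vector Phi(y) with W."""
    return sum(value * weights.get(name, 0.0) for name, value in features.items())


def rerank(candidates: List[Features], weights: Dict[str, float]) -> int:
    """Return the index of the candidate in GEN(x) with the highest score."""
    return max(range(len(candidates)), key=lambda i: score(candidates[i], weights))


# Hypothetical feature vectors for two candidates of one sentence.
candidates = [
    {"base_log_prob": -0.4, "num_parts=2": 1.0},
    {"base_log_prob": -0.9, "num_parts=3": 1.0},
]
# Hypothetical weights; in the slides W is learned by a perceptron.
weights = {"base_log_prob": 1.0, "num_parts=3": 1.0}

best = rerank(candidates, weights)
print(best)  # 1
```

The second candidate wins even though the base model ranks it lower, because the global feature `num_parts=3` carries a positive weight in this toy setup.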

Solution (3) System Architecture
[Figure: Input Sentence → Base Model (Phase 1, using CRFs) → Output 1, Output 2, …, Output N → Reranking Model (Phase 2, using Perceptron) → Final Output]

Solution (4) Decoding Algorithm
• For each sample x
    If the highest probability output by GEN is greater than a threshold Then
      F(x) is the output of GEN with the highest probability
    Else
      $F(x) = \arg\max_{y \in GEN(x)} score(y) = \arg\max_{y \in GEN(x)} \Phi(y) \cdot W$
    End If
• End For
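The decoding rule can be sketched as follows. This is an illustration under assumptions: candidates are represented as `(tag sequence, base-model probability)` pairs, and `toy_score` is a hypothetical stand-in for the learned score function Φ(y)·W.

```python
from typing import Callable, List, Sequence, Tuple

Candidate = Tuple[Sequence[str], float]  # (tag sequence, base-model probability)


def decode(
    n_best: List[Candidate],
    rerank_score: Callable[[Sequence[str], float], float],
    threshold: float,
) -> Sequence[str]:
    """Trust the base model when it is confident; otherwise rerank."""
    top_tags, top_prob = max(n_best, key=lambda c: c[1])
    if top_prob > threshold:
        return top_tags
    best_tags, _ = max(n_best, key=lambda c: rerank_score(c[0], c[1]))
    return best_tags


def toy_score(tags: Sequence[str], prob: float) -> float:
    """Hypothetical scorer: rewards candidates that contain an E part."""
    return prob + (0.3 if "B-E" in tags else 0.0)


n_best = [
    (["B-R", "I-R", "I-R"], 0.55),
    (["B-R", "I-R", "B-E"], 0.40),
]
print(decode(n_best, toy_score, threshold=0.9))  # ['B-R', 'I-R', 'B-E']
print(decode(n_best, toy_score, threshold=0.5))  # ['B-R', 'I-R', 'I-R']
```

With a high threshold the base model is not confident enough, so the reranker decides. With a low threshold the top output of the base model is returned unchanged.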

Corpus
• Japanese National Pension Law (JNPL) Corpus
  o 764 sentences

  Sentence Type | Number | Part Type | Number
  Equivalence   | 11     | EL        | 11
                |        | ER        | 11
  Implication   | 753    | E         | 745
                |        | R         | 429
                |        | S1        | 9
                |        | S2        | 562
                |        | S3        | 102

Evaluation Method & Measure
• Evaluation method
  o 10-fold cross-validation test
• Measure
  o Precision, Recall, F1

  $precision = \frac{\#\text{correct parts}}{\#\text{predicted parts}}$

  $recall = \frac{\#\text{correct parts}}{\#\text{actual parts}}$

  $F_\beta = \frac{(\beta^2 + 1) \cdot precision \cdot recall}{\beta^2 \cdot precision + recall}$
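The measures above can be computed as in the sketch below. It assumes that a predicted part counts as correct only when its kind and boundaries match a gold part exactly; the slides do not spell out the matching criterion, so this is an assumption, as is the `(sentence id, kind, start, end)` representation.

```python
from typing import Set, Tuple

Part = Tuple[int, str, int, int]  # (sentence id, kind, start, end)


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-measure as defined on the slide; returns 0.0 when undefined."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (beta ** 2 + 1) * precision * recall / denominator


def evaluate(gold: Set[Part], predicted: Set[Part], beta: float = 1.0):
    """Return (precision, recall, F_beta) over exactly matching parts."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall, f_beta(precision, recall, beta)


gold = {(0, "R", 0, 3), (0, "E", 3, 6), (1, "S2", 0, 1), (1, "E", 1, 4)}
predicted = {(0, "R", 0, 3), (0, "E", 3, 6), (1, "E", 0, 4)}
p, r, f1 = evaluate(gold, predicted)
print(round(p, 4), round(r, 4), round(f1, 4))  # 0.6667 0.5 0.5714
```

As a sanity check, `f_beta(87.27, 85.50)` gives about 86.38, which agrees with the baseline F1 reported in the word-based experiments.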

Experiment Goals
Considering three problems:
1) Investigate which features are suitable for the RRE task
   o Word, POS tag, Katakana, Stem, Bunsetsu tag, Named Entities
2) Investigate how to model the RRE task efficiently
   o Word-based modeling
   o Bunsetsu-based modeling
3) Investigate which tag setting is suitable for the RRE task
   o IOB (Inside, Outside, Begin)
   o IOE (Inside, Outside, End)
   o FILC (First, Inside, Last, Consecutive)
   o FIL (First, Inside, Last)

Experiment Design
Four Models
1) Word-based model
   o Words are elements
2) Word reduction model
   o Important words are elements
3) Bunsetsu-based model
   o Bunsetsu are elements
4) Reranking model
   o Bunsetsu-based model + reranking

Word-Based Model (1)
• Modeling
  o Words are elements
• Example
  Source Sentence: 被保険者期間を計算する場合には、月によるものとする。
  (When a period of an insured is calculated, it is based on a month.)

  Word   | Reading | Tag
  被     | hi      | B-R
  保険   | hoken   | I-R
  者     | sha     | I-R
  期間   | kikan   | I-R
  を     | wo      | I-R
  計算   | keisan  | I-R
  する   | suru    | I-R
  場合   | baai    | I-R
  に     | ni      | I-R
  は     | wa      | I-R
  、     |         | I-R
  月     | tsuki   | B-E
  による | niyoru  | I-E
  もの   | mono    | I-E
  と     | to      | I-E
  する   | suru    | I-E
  。     |         | I-E

Word-Based Model (2)
• Features (CaboCha tool)
  o Word, POS tag, Katakana, Stem, Bunsetsu tag, Named Entities
• Experimental results

  Feature Sets          | Precision (%) | Recall (%) | F1 (%)
  Word (Baseline)       | 87.27         | 85.50      | 86.38
  Word + Katakana, Stem | 87.02         | 85.39      | 86.20 (-0.18)
  Word + POS            | 87.68         | 85.66      | 86.66 (+0.28)
  Word + Bunsetsu       | 86.15         | 84.86      | 85.50 (-0.88)
  Word + NE             | 87.22         | 85.45      | 86.32 (-0.06)

Word Reduction Model (1)
• Bunsetsu
  o In Japanese, a sentence is divided into chunks called Bunsetsu
  o Each Bunsetsu contains one or more content words (noun, verb, adjective, etc.) and may include some function words (case marker, punctuation, etc.)
• The head word
  o Is the rightmost content word
  o Carries the central meaning
• The functional word
  o Is the rightmost function word, excluding punctuation
  o Plays a grammatical role

Word Reduction Model (2)
• Sentence Reduction
  o Reducing a full sentence to a reduced sentence containing only important words
  o Important words: head words, functional words, and punctuation
• Example
  Source Sentence: 被保険者期間を計算する場合には、月によるものとする。
  (When a period of an insured is calculated, it is based on a month.)

  Bunsetsu | Original Sequence  | Original Tag        | Head Word | Functional Word | New Sequence | New Tag
  1        | 被 保険 者 期間 を | B-R I-R I-R I-R I-R | 期間      | を              | 期間 を      | B-R I-R
  2        | 計算 する          | I-R I-R             | する      |                 | する         | I-R
  3        | 場合 に は 、      | I-R I-R I-R I-R     | 場合      | は              | 場合 は      | I-R I-R
  4        | 月 による          | B-E I-E             | 月        | による          | 月 による    | B-E I-E
  5        | もの と            | I-E I-E             | もの      | と              | もの と      | I-E I-E
  6        | する 。            | I-E I-E             | する      |                 | する         | I-E
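The reduction step can be sketched as below. Several things here are assumptions made for illustration: the sentence is taken to be already segmented into Bunsetsu (in the slides this comes from CaboCha), the `content` / `function` / `punct` labels are assigned by hand, and punctuation is left out of the reduced sequence because the new sequence in the example does not show it.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    surface: str
    kind: str  # "content", "function", or "punct" (hand-assigned, hypothetical)
    part: str  # logical part the word belongs to, e.g. "R" or "E"


def reduce_bunsetsu(bunsetsu: List[Word]) -> List[Word]:
    """Keep the rightmost content word and the rightmost function word."""
    head = max((i for i, w in enumerate(bunsetsu) if w.kind == "content"), default=None)
    func = max((i for i, w in enumerate(bunsetsu) if w.kind == "function"), default=None)
    keep = sorted(i for i in (head, func) if i is not None)
    return [bunsetsu[i] for i in keep]


def iob_tags(words: List[Word]) -> List[str]:
    """Re-encode part labels as IOB tags (adjacent parts of one kind would merge)."""
    tags, previous = [], None
    for w in words:
        tags.append(("I-" if w.part == previous else "B-") + w.part)
        previous = w.part
    return tags


W = Word
sentence = [
    [W("被", "content", "R"), W("保険", "content", "R"), W("者", "content", "R"),
     W("期間", "content", "R"), W("を", "function", "R")],
    [W("計算", "content", "R"), W("する", "content", "R")],
    [W("場合", "content", "R"), W("に", "function", "R"), W("は", "function", "R"),
     W("、", "punct", "R")],
    [W("月", "content", "E"), W("による", "function", "E")],
    [W("もの", "content", "E"), W("と", "function", "E")],
    [W("する", "content", "E"), W("。", "punct", "E")],
]

reduced = [w for bunsetsu in sentence for w in reduce_bunsetsu(bunsetsu)]
print(" ".join(w.surface for w in reduced))  # 期間 を する 場合 は 月 による もの と する
print(" ".join(iob_tags(reduced)))  # B-R I-R I-R I-R I-R B-E I-E I-E I-E I-E
```

The 17-token sentence shrinks to 10 tokens, and the tags are recomputed so that the first kept word of each part receives the B tag.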

Word Reduction Model (3)
• Features
  o Head words, functional words, punctuation, and their POS tags
  o HFW: Head Functional Word
  o HFWP: Head Functional Word POS
• Experimental results

  Model    | Sentence  | Feature    | Prec (%) | Recall (%) | F1 (%)
  Baseline | Full      | Word       | 87.27    | 85.50      | 86.38
  Word     | Full      | Word + POS | 87.68    | 85.66      | 86.66 (+0.28)
  HFW      | Reduction | Word       | 88.09    | 86.30      | 87.19 (+0.81)
  HFWP     | Reduction | Word + POS | 87.74    | 86.52      | 87.12 (+0.74)

Bunsetsu-Based Model (1)
• Bunsetsus are elements
• Motivation
  o Each Bunsetsu only belongs to one part
  o Reduce the length of sequences
    - JNPL corpus: from 47.3 to 17.6 on average
  o Utilize important words
• Example
  Source Sentence: 被保険者期間を計算する場合には、月によるものとする。
  (When a period of an insured is calculated, it is based on a month.)

  Bunsetsu | Word Sequence      | Tag Sequence        | New Tag
  1        | 被 保険 者 期間 を | B-R I-R I-R I-R I-R | B-R
  2        | 計算 する          | I-R I-R             | I-R
  3        | 場合 に は 、      | I-R I-R I-R I-R     | I-R
  4        | 月 による          | B-E I-E             | B-E
  5        | もの と            | I-E I-E             | I-E
  6        | する 。            | I-E I-E             | I-E
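Because each Bunsetsu belongs to exactly one part, a Bunsetsu-level tag sequence can be derived from one part label per Bunsetsu. The sketch below encodes the example under the IOB and IOE settings; the function name `encode` is hypothetical, and the code assumes that two adjacent parts never share the same kind (otherwise their boundary would be lost).

```python
from typing import List


def encode(parts: List[str], scheme: str) -> List[str]:
    """Turn one part label per Bunsetsu into IOB or IOE tags."""
    tags = []
    for i, label in enumerate(parts):
        if scheme == "IOB":
            # B marks the first Bunsetsu of a part.
            first = i == 0 or parts[i - 1] != label
            tags.append(("B-" if first else "I-") + label)
        elif scheme == "IOE":
            # E marks the last Bunsetsu of a part.
            last = i == len(parts) - 1 or parts[i + 1] != label
            tags.append(("E-" if last else "I-") + label)
        else:
            raise ValueError(f"unsupported scheme: {scheme}")
    return tags


# Part labels of the six Bunsetsu in the example sentence.
bunsetsu_parts = ["R", "R", "R", "E", "E", "E"]
print(encode(bunsetsu_parts, "IOB"))  # ['B-R', 'I-R', 'I-R', 'B-E', 'I-E', 'I-E']
print(encode(bunsetsu_parts, "IOE"))  # ['I-R', 'I-R', 'E-R', 'I-E', 'I-E', 'E-E']
```

The IOB output matches the new tags in the example, while the IOE output marks the end of each part instead of its beginning.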

Bunsetsu-Based Model (2)
• Features
  o Head words, functional words, punctuation, co-occurrence of head words and functional words
• Experimental results

  Model    | Element        | Prec (%) | Recall (%) | F1 (%)
  Baseline | Word           | 87.27    | 85.50      | 86.38
  Word     | Word           | 87.68    | 85.66      | 86.66 (+0.28)
  HFW      | Important Word | 88.09    | 86.30      | 87.19 (+0.81)
  BC-IOB   | Bunsetsu       | 88.75    | 86.52      | 87.62 (+1.24)
  BC-IOE   | Bunsetsu       | 89.35    | 87.05      | 88.18 (+1.80)
  BC-FILC  | Bunsetsu       | 88.75    | 86.09      | 87.40 (+1.02)
  BC-FIL   | Bunsetsu       | 88.87    | 86.30      | 87.57 (+1.19)

Reranking Model (1)
• Motivation
  o Utilize non-local, global features
• Feature Representation
  o Candidate: I-R I-R I-R E-R I-S2 I-S2 I-S2 E-S2 I-E I-E I-E E-E
  o Tag sequence: START I-R E-R I-S2 E-S2 I-E E-E END
  o Part sequence: START R S2 E END
• Features
  o Probability of base model
  o Unigram, bigram, and trigram of tag sequences and part sequences
  o Number of parts
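The feature representation can be sketched as follows. The derivation of the tag and part sequences by collapsing consecutive repeats is inferred from the example on the slide, and the feature key names (`base_prob`, `tag:…`, `part:…`, `num_parts`) are hypothetical.

```python
from collections import Counter
from itertools import groupby
from typing import Iterable, List


def collapse(sequence: Iterable[str]) -> List[str]:
    """Drop consecutive repeats, e.g. I-R I-R E-R -> I-R E-R."""
    return [key for key, _ in groupby(sequence)]


def tag_sequence(candidate: List[str]) -> List[str]:
    return ["START"] + collapse(candidate) + ["END"]


def part_sequence(candidate: List[str]) -> List[str]:
    # "I-S2" -> "S2": keep only the kind of part.
    return ["START"] + collapse(t.split("-", 1)[1] for t in candidate) + ["END"]


def ngrams(sequence: List[str], n: int) -> List[str]:
    return [" ".join(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def features(candidate: List[str], base_prob: float) -> Counter:
    """Base-model probability, 1- to 3-grams of both sequences, number of parts."""
    feats = Counter()
    feats["base_prob"] = base_prob
    tags, parts = tag_sequence(candidate), part_sequence(candidate)
    for n in (1, 2, 3):
        for gram in ngrams(tags, n):
            feats["tag:" + gram] += 1
        for gram in ngrams(parts, n):
            feats["part:" + gram] += 1
    feats["num_parts"] = len(parts) - 2  # exclude START and END
    return feats


candidate = "I-R I-R I-R E-R I-S2 I-S2 I-S2 E-S2 I-E I-E I-E E-E".split()
print(" ".join(tag_sequence(candidate)))   # START I-R E-R I-S2 E-S2 I-E E-E END
print(" ".join(part_sequence(candidate)))  # START R S2 E END
print(features(candidate, 0.8)["num_parts"])  # 3
```

Features such as the part trigram `R S2 E` describe the whole candidate at once, which is the kind of non-local information a first-order sequence model cannot use directly.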

Reranking Model (2)
• Experiment setting
  o Training set 80%, development set 10%, test set 10%
  o GEN: 20-best outputs of the BC-IOE model
  o Algorithm: Perceptron algorithm (10 loops)
• Experimental results

  Model     | Precision (%) | Recall (%) | F1 (%)
  Baseline  | 87.27         | 85.50      | 86.38
  BC-IOE    | 89.35         | 87.05      | 88.18 (+1.80)
  Reranking | 89.42         | 87.75      | 88.58 (+2.20)
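The slides name the perceptron algorithm with 10 loops but do not give the update rule. The sketch below shows one common way to train a perceptron reranker, offered purely as an assumption: whenever the current weights prefer a candidate other than the target one, the weights move toward the target's features and away from the predicted candidate's features. The toy feature names and the choice of target candidate are hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]
Example = Tuple[List[Features], int]  # (candidate features, index of target candidate)


def dot(weights: Dict[str, float], features: Features) -> float:
    return sum(value * weights.get(name, 0.0) for name, value in features.items())


def train_perceptron(examples: List[Example], loops: int = 10) -> Dict[str, float]:
    """Plain perceptron updates over N-best lists (not the authors' exact recipe)."""
    weights: Dict[str, float] = {}
    for _ in range(loops):
        for candidates, target in examples:
            predicted = max(range(len(candidates)), key=lambda i: dot(weights, candidates[i]))
            if predicted != target:
                # Reward the target candidate's features, penalize the prediction's.
                for name, value in candidates[target].items():
                    weights[name] = weights.get(name, 0.0) + value
                for name, value in candidates[predicted].items():
                    weights[name] = weights.get(name, 0.0) - value
    return weights


# Toy data: the base model prefers candidate 0, but candidate 1 is the target.
candidates = [
    {"base_prob": 0.6, "num_parts=1": 1.0},
    {"base_prob": 0.4, "num_parts=2": 1.0},
]
weights = train_perceptron([(candidates, 1)], loops=10)
best = max(range(len(candidates)), key=lambda i: dot(weights, candidates[i]))
print(best)  # 1
```

After a single update the learned weights already rank the target candidate first, so the remaining loops leave the weights unchanged.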

Experimental Results (Overall)
[Figure: bar chart of Precision, Recall, and F1 (%) for the Word-Based Model, Word Reduction Model, Bunsetsu-Based Model, and Reranking Model; vertical axis from 83 to 90]

Conclusion
• Presented the RRE task
• Investigated several aspects of the RRE task
  o Linguistic features
    - Words and POS tags are suitable
  o Problem modeling
    - Modeling based on Bunsetsu is better than modeling based on words
  o Tag setting
    - IOE tag setting is suitable
• Presented four models for the RRE task
  o The best model: 88.58% in F1 score

References
1. Collins, M., Koo, T.: Discriminative Reranking for Natural Language Parsing. In Computational Linguistics, Volume 31, Issue 1, pp. 25-70 (2005).
2. Collins, M.: Discriminative Reranking for NLP. http://www.clsp.jhu.edu/ws2005/calendar/documents/CollinsJuly7.pdf
3. Freund, Y., Schapire, R.: Large Margin Classification using the Perceptron Algorithm. In Machine Learning, Volume 37, Issue 3, pp. 277-296 (1999).
4. Kudo, T.: Yet Another Japanese Dependency Structure Analyzer. http://chasen.org/~taku/software/cabocha/
5. Kudo, T.: CRF++: Yet Another CRF toolkit. http://crfpp.sourceforge.net/
6. Lafferty, J., McCallum, A., Pereira, F.: Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proc. of ICML, pp. 282-289 (2001).
7. Ludtke, D., Sato, S.: Fast Base NP Chunking with Decision Trees - Experiments on Different POS Tag Settings. In Proc. of CICLing, pp. 139-150 (2003).
8. Murata, M., Uchimoto, K., Ma, Q., Isahara, H.: Bunsetsu identification using category-exclusive rules. In Proc. of the 18th conference on Computational linguistics - Volume 1, pp. 565-571 (2000).
9. Nakamura, M., Nobuoka, S., Shimazu, A.: Towards Translation of Legal Sentences into Logical Forms. In Proc. of the 1st Int. Workshop on JURISIN (2007).
10. Tanaka, K., Kawazoe, I., Narita, H.: Standard structure of legal provisions - for the legal knowledge processing by natural language - (in Japanese). In IPSJ Research Report on Natural Language Processing, pp. 79-86 (1993).

Thank you for your attention!
