The Usability of Ambiguity Detection Methods for Context-Free Grammars LDTA 2008

Bas Basten CWI, Amsterdam

Motivation 

Problems:
- Writing LL/LR grammars is hard
- LL/LR grammars cannot be composed modularly

Solution:
- Use unconstrained context-free grammars

Problem:
- Context-free grammars can be ambiguous

Potential solution:
- Use ambiguity detection?

Overview
1. Ambiguity in Context-Free Grammars
2. Investigated Ambiguity Detection Methods
3. Comparison
4. Results
5. Conclusion
6. Future Directions

Background: Ambiguity in CFGs 



- A grammar G is ambiguous iff L(G) contains a string with multiple derivations (that is, more than one parse tree)
- Simple expression grammar:

    E → E + E
    E → E * E
    E → 0 | 1 | 2 ...

[Figure: two parse trees for the string 1 + 2 * 3. One groups it as 1 + (2 * 3) = 7, the other as (1 + 2) * 3 = 9.]
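The two readings of 1 + 2 * 3 can be checked mechanically. The following is a minimal sketch, not part of the original slides, that counts the parse trees of a token sequence under the expression grammar above; the function name `count_parse_trees` is hypothetical, and tokens are assumed to be single characters.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` under E -> E + E | E * E | digit."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct ways to derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i].isdigit() else 0
        total = 0
        # Try every operator position k as the root of the tree.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("1+2*3"))  # 2
```

A count above 1 for any string of L(G) is exactly what makes the grammar ambiguous.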


Background: Ambiguity in CFGs 

- Ambiguity problem: is grammar G ambiguous?
  - Undecidable in general
  - Semidecidable
    - Generating all strings of L(G) + checking for ambiguity only terminates if G is ambiguous
- LR(k) is a subclass of the unambiguous grammars
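The semidecision procedure can be sketched as a breadth-first enumeration of leftmost derivations that stops as soon as one terminal string is reached twice. This is an illustrative sketch only: the name `find_ambiguous_string`, the dictionary representation of the grammar, and the `max_len` bound are assumptions made here so that the search terminates. Without the bound, the loop would run forever on an unambiguous grammar, which is the behaviour described above.

```python
from collections import deque


def find_ambiguous_string(grammar, start, max_len):
    """Search for a string of length <= max_len with two leftmost derivations.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples of
    symbols); any symbol that is not a key is a terminal. Assumes no empty
    productions and no cyclic unit productions, so sentential forms never
    shrink and the bounded search is finite.
    """
    seen = set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, if any.
        pos = next((i for i, s in enumerate(form) if s in grammar), None)
        if pos is None:
            if form in seen:
                return form  # reached by a second leftmost derivation
            seen.add(form)
            continue
        for rhs in grammar[form[pos]]:
            new_form = form[:pos] + tuple(rhs) + form[pos + 1:]
            if len(new_form) <= max_len:
                queue.append(new_form)
    return None  # nothing found up to max_len; says nothing beyond that


expr = {"E": [("E", "+", "E"), ("E", "*", "E"), ("1",)]}
print(find_ambiguous_string(expr, "E", 5))
```

A `None` result only means that no ambiguity exists up to the chosen length; it is not a proof of unambiguity.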


Ambiguity Detection 

- An ambiguity detection method (ADM) should:
  - Correctly answer “ambiguous” or “unambiguous”
  - Terminate in acceptable time
  - Give information for disambiguation
- Perfect ADM cannot exist
  - Correctness/termination trade-off
  - Practical value?

Investigated ADMs 

- LR(k) test
  - D.E. Knuth, 1965
  - LR(k) parse table generation for increasing k
- Noncanonical Unambiguity (NU) test
  - S. Schmitz, 2006
  - Conservative approximation of parse automaton
- AMBER
  - F.W. Schröer, 2001
  - Derivation generator

Which one is the most practically usable?

Noncanonical Unambiguity test 

- Builds NFA that approximates parse automaton of G
- Searches NFA for ambiguous strings
  - Finite search space
- Conservative approximation
  - L(NFA) superset of L(G)
  - No ambiguities left out
  - Reports “unambiguous” and “potentially ambiguous”
- Different precisions: LR(0), SLR(1), LR(1)
  - Larger automaton → higher accuracy

Empirical Comparison 

- 2 grammar collections:
  - 84 toy grammars
    - 3-17 productions
    - 48 ambiguous
    - 36 unambiguous
  - 5 real world grammars:
    - HTML: 29 productions
    - SQL: 79 productions
    - Pascal: 176 productions
    - C: 212 productions
    - Java: 349 productions
    - Of each: 1 unambiguous + 5 ambiguous versions

Empirical Comparison 

- Accuracy
  - Percentage of correct reports
  - Toy grammars
- Performance
  - Computation time, memory consumption
  - Real world grammars
- Termination
  - Ability to terminate in given amount of time
  - Real world grammars
  - Time limits: 5 min, 15 hrs.

Results LR(k) test 

- Advantages:
  - 100% accurate
- Disadvantages:
  - Only reports “unambiguous” for LR(k) grammars
  - Nontermination on non-LR(k) grammars
  - Conflicts in parse tables are incomprehensible
  - Exponential performance
    - Max values of k testable in 15 hrs: 2-6 (ambiguous real world grammars)

Hardly usable for unconstrained CFGs, only for LR(k)

Results NU test 



- Accuracy (unambiguous toy grammars):

    Precision   LR(0)   SLR(1)   LR(1)
    Accuracy    61%     69%      86%

- Performance:
  - LR(0), SLR(1): very fast, all tests < 3 sec.
  - LR(1) also, but too much memory on C and Java grammars: swapping or crashing
- Incomprehensible reports

LR(1) precision pretty useful for |G| < 200 productions

Results AMBER 

- Advantages:
  - 100% accurate
  - Ambiguous example strings are very useful
  - High termination scores (ambiguous real world grammars):
    - 70% in 5 min, 90% in 15 hrs.
- Disadvantages:
  - Only reports “ambiguous”
  - Runs forever on unambiguous grammars

Very useful for ambiguous grammars

Conclusion 

Usability ranking on grammar collections:
1. AMBER
   - Very useful for ambiguous grammars
2. Noncanonical Unambiguity test
   - LR(1) precision pretty useful for |G| < 200 productions
3. LR(k) test
   - Hardly usable for unconstrained CFGs, only for LR(k)

Future Directions 

- Compare other ADMs, for instance:
  - Grambiguity
    - C. Brabrand, R. Giegerich and A. Møller, 2006
    - Regular (conservative) approximation
  - CFGAnalyzer
    - R. Axelsson, K. Heljanko and M. Lange, 2007
    - Incremental SAT solving

Future Directions 

- Iterative approach
  - Multiple checks with increasing detail
  - Filter out unambiguous grammar parts
  - Coarse grained, fast → ... → fine grained, slow
- For example:
  - Run NU test first:
    - “unambiguous” → done
    - “potentially ambiguous” → try filtering unambiguous parts
  - Run derivation generator (AMBER) on remainder
    - Smaller search space, better performance
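The control flow of this iterative approach might look roughly as follows. This is a hedged sketch only: `iterative_check` and its three callable parameters are hypothetical placeholders standing in for a conservative test (such as the NU test), a filtering step, and a derivation generator (such as AMBER). None of them corresponds to an actual tool interface.

```python
def iterative_check(grammar, conservative_test, filter_parts, derivation_search):
    """Run a fast conservative check first, then a slower exact search.

    All three callables are assumptions for illustration:
    - conservative_test(grammar) returns "unambiguous" or "potentially ambiguous"
    - filter_parts(grammar) returns the grammar without parts known to be harmless
    - derivation_search(grammar) returns an ambiguous string, or None if none found
    """
    # Coarse grained, fast: a conservative "unambiguous" answer is final.
    if conservative_test(grammar) == "unambiguous":
        return ("unambiguous", None)

    # Shrink the search space before the expensive step.
    remainder = filter_parts(grammar)

    # Fine grained, slow: look for a concrete ambiguous string.
    witness = derivation_search(remainder)
    if witness is not None:
        return ("ambiguous", witness)
    return ("potentially ambiguous", None)
```

Passing the stages in as functions keeps the sketch independent of any particular tool; in practice the derivation search would also need a time or length limit.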


References 

  





- F. W. Schröer. AMBER, an ambiguity checker for context-free grammars. Technical report, Fraunhofer Institute for Computer Architecture and Software Technology, 2001. http://accent.compilertools.net/Amber.html
- D. E. Knuth. On the translation of languages from left to right. Information and Control, 8(6):607–639, 1965.
- V. Makarov. MSTA (syntax description translator). May 1995. http://cocom.sourceforge.net/msta.html
- S. Schmitz. An experimental ambiguity detection tool. In A. Sloane and A. Johnstone, editors, Seventh Workshop on Language Descriptions, Tools, and Applications (LDTA '07), Braga, Portugal, March 2007.
- C. Brabrand, R. Giegerich, and A. Møller. Analyzing ambiguity of context-free grammars. In M. Balík and J. Holub, editors, 12th International Conference on Implementation and Application of Automata (CIAA '07), July 2007.
- R. Axelsson, K. Heljanko, M. Lange. CFGAnalyzer. 2007. http://www.tcs.ifi.lmu.de/~mlange/cfganalyzer/
