Exploring Wikipedia’s Category Graph for Query Classification

Milad Alemzadeh1, Richard Khoury2, and Fakhri Karray1

1 Department of Electrical and Computer Engineering, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada, N2L 3G1
{malemzad,karray}@uwaterloo.ca
2 Department of Software Engineering, Lakehead University, 955 Oliver Road, Thunder Bay, Ontario, Canada, P7A 5E1
[email protected]

Abstract. Wikipedia’s category graph is a network of 400,000 interconnected category labels, and can be a powerful resource for many classification tasks. However, its size and lack of order can make it difficult to navigate. In this paper, we present a new algorithm to efficiently explore this graph and discover accurate classification labels. We implement our algorithm as the core of a query classification system and demonstrate its reliability using the KDD CUP 2005 competition as a benchmark.

Keywords: Natural Language Processing, Query Classification, Category Labeling, Wikipedia.

1 Introduction

Query classification is the Natural Language Processing (NLP) task whose goal is to identify the category label, in a predefined set, that best represents the domain of a question being asked. An accurate query classification system would be beneficial in many practical systems, including search engines and question-answering systems. But while similar categorization tasks are found in several branches of NLP, the challenge of query classification is accentuated by the fact that a typical query is only between one and four words long [1], [2], rather than the hundreds or thousands of words one can get from an average text document. Such a limited number of keywords makes it difficult to select the correct category label, and moreover it makes the selection very sensitive to “noise words”, words unrelated to the query that the user entered for some reason, such as not remembering the correct name or technical term to query for. A second challenge of query classification comes from the fact that, while document libraries and databases can be specialized to a single domain, the users of query systems expect to be able to ask queries about any domain at all [1].

In this paper, we build upon our previous work on query labeling using the Wikipedia category graph [3]. We have already shown that Wikipedia offers a set of nearly 400,000 interconnected categories which can be used for query classification. Moreover, since these categories cover most domains of human knowledge at varying degrees of granularity, it is easy for system designers to identify a subset of them as “target categories” they wish to use as classification goals, rather than deal with the full set of 400,000 categories. This paper now presents a new algorithm to explore the graph of categories, to efficiently discover the best target category to classify a query into.

The rest of the paper is organized as follows. Section 2 presents an overview of the literature in the field of query classification, with a special focus on the use of Wikipedia for that task. We present our exploration and classification algorithm in detail in Section 3, then we move on in Section 4 to describe and analyze the experimental results we obtained with our system. Finally, we give some concluding remarks in Section 5.

M. Kamel et al. (Eds.): AIS 2011, LNAI 6752, pp. 222–230, 2011. © Springer-Verlag Berlin Heidelberg 2011

2 Background

Query classification is the task of NLP that focuses on inferring the domain information surrounding user-written queries, and on assigning each query to the category label that best represents its domain in a predefined set of labels. Given the ubiquity of search engines and question-handling systems today, the challenge of query classification has been receiving a growing amount of attention. Notably, it was the topic of the ACM’s annual KDD CUP competition in 2005 [4], where 37 systems competed to classify a set of 800,000 real web queries into a set of 67 categories designed to cover most topics found on the internet. The winning system was designed to classify a query by comparing its word vector to that of each website in a set pre-classified in the Google directory. The query was assigned the category of the most similar website, and the directory’s set of categories was mapped to the KDD CUP’s set [2]. This system was later improved by introducing a bridging classifier and an intermediate-level category taxonomy [5].

There are many other active research groups working in query classification. They all follow the basic pattern of mapping a query into an external knowledge source to classify it. There exists a great variety of such systems, using for example ontologies [6], web query logs [7], and Wikipedia [8], [9]. In fact, exploiting Wikipedia as a knowledge source has become commonplace in scientific research. Several hundred journal and conference papers have been published using this tool since its creation in 2001. However, to the best of our knowledge, aside from our previous work mentioned in Section 1, there have been only two query classification systems designed based on Wikipedia.

The first of these two systems was proposed by Hu et al. [8]. Their work assumes that there is a set of seed concepts that their query classification should be trained to recognize. They thus target the articles and categories relevant to these concepts, and construct a graph of Wikipedia domains by following the links in these articles using a Markov random walk algorithm. Each step from one concept to the next on the graph is assigned a transition probability, and these probabilities are then used to compute the likelihood of each domain. Once the knowledge base has been built in this way, a new user query can be classified simply by using its keywords to retrieve a list of relevant Wikipedia domains, and sorting them by likelihood. Unfortunately, their system remained small-scale and limited to only three basic domains, namely
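The random-walk likelihood idea described above can be illustrated with a toy sketch. Everything below — the concept graph, the transition probabilities, and the domain assignments — is invented for illustration and is not taken from [8]:

```python
# Toy sketch of a Markov random-walk domain likelihood, in the spirit of [8].
# All concept names, probabilities, and domain assignments are invented.
transition = {
    "c0": {"c1": 0.7, "c2": 0.3},   # P(step from c0 to c1) = 0.7, etc.
    "c1": {"c0": 0.5, "c2": 0.5},
    "c2": {"c0": 0.2, "c1": 0.8},
}

def walk(start, steps):
    """Propagate probability mass from `start` along the transitions."""
    p = {c: 0.0 for c in transition}
    p[start] = 1.0
    for _ in range(steps):
        nxt = {c: 0.0 for c in transition}
        for concept, mass in p.items():
            for dst, prob in transition[concept].items():
                nxt[dst] += mass * prob
        p = nxt
    return p

# start from the concept matched by the query's keywords
p = walk("c0", 10)
# the likelihood of a domain is the mass on the concepts it contains
domains = {"travel": ["c0"], "job": ["c1", "c2"]}
likelihood = {d: sum(p[c] for c in cs) for d, cs in domains.items()}
```

Sorting `likelihood` then ranks the candidate domains for the query.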


“travel”, “personal name” and “job”. It is not a general-domain classifier such as the one we aim to create. The second query classification system was designed by one of our co-authors in [9]. It follows Wikipedia’s encyclopedia structure to classify queries step-by-step, using the query’s words to select titles, then selecting articles based on these titles, then categories from the articles. At each step, the weights of the selected elements are computed based on the relevant elements in the previous step: a title’s weight depends on the words that selected it, an article’s weight on the titles’, and a category’s weight on the articles’. Unlike [8], this system was a general classifier that could handle queries from any domain, and its performance would have ranked near the top of the KDD CUP 2005 competition.

3 Methodology

Wikipedia’s category graph is a massive set of almost 400,000 category labels, describing every domain of knowledge and ranging from the very precise, such as “fictional secret agents and spies”, to the very general, such as “information”. The categories are connected by hypernym relationships, with a child category having an “is-a” relationship to its parents. However, the graph is not strictly hierarchical: there exist shortcuts in the connections (i.e. starting from one child category and going up two different paths of different lengths to reach the same parent category) as well as loops (i.e. starting from one child category and going up a path to reach the same child category again). The fact that the set of category labels covers practically every domain at every level of precision makes it easy for a system designer to identify a subset of categories to be used as “target categories” for a classification system. The query classifier we propose in this paper is designed to explore the graph of categories from any starting point until it reaches the nearest such target categories. The pseudocode of our new algorithm is shown in Figure 1.

3.1 Building the Category Graph

The list of categories in Wikipedia and the connections between categories can easily be extracted from the database dump made freely available by the Wikimedia Foundation. For this project, we used the version available from September 2008. However, our graph includes one extra piece of information in addition to the categories, namely the article titles. In Wikipedia, each article is an encyclopedic entry on a given topic which is classified in a set of categories, and which is pointed to by a number of titles: a single main title, some redirect titles (for common alternative names, including foreign translations and typos) and some disambiguation titles (for ambiguous names that may refer to it).
For example, the article for the United States is under the title “United States”, as well as the redirect titles “USA”, “United States of America” and “United Staets”, and the disambiguation title “America”. Our graph maps the titles directly to the categories of the articles, and then discards the articles. After this processing, we find that our category graph features 5,453,808 titles and 390,807 categories [3].
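As a minimal sketch, the title-to-category mapping described above might be built as follows. The loading of the actual dump tables is omitted, and the function and variable names are ours, not from the paper:

```python
from collections import defaultdict

def build_title_category_map(articles):
    """articles: iterable of (titles, categories) pairs, where `titles`
    bundles an article's main title, redirects, and disambiguation titles."""
    title_to_cats = defaultdict(set)
    for titles, categories in articles:
        for title in titles:
            # map every title variant straight to the article's categories;
            # the article itself can then be discarded
            title_to_cats[title.lower()].update(categories)
    return title_to_cats

# the United States example from the text, with made-up category names
toy_articles = [
    ({"United States", "USA", "United States of America",
      "United Staets", "America"},
     {"Countries in North America", "Federal republics"}),
]
index = build_title_category_map(toy_articles)
```

A redirect such as “USA” then resolves to the same categories as the main title.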


Define:
  CategoryGraph
  TargetCategories (a subset of CategoryGraph)
  Classification (classification results)
  ClassificationSize (number of classification results allowed per query)
Input: User query

0.  Classification ← {}
1.  TitleList ← the most relevant Wikipedia titles to the user query
2.  CatList ← the categories relating to TitleList
3.  Do for 20 iterations:
4.    NewClassification ← subset of CatList that are in TargetCategories
5.    If COUNT(Classification + NewClassification) <= ClassificationSize
6.      Classification ← Classification + NewClassification
7.    If COUNT(Classification + NewClassification) > ClassificationSize AND COUNT(Classification) > 0
8.      Break from loop
9.    If COUNT(Classification + NewClassification) > ClassificationSize AND COUNT(Classification) = 0
10.     Classification ← Select ClassificationSize elements from NewClassification
11.     Break from loop
12.   CatList ← unvisited parent categories directly connected to CatList
13. Return Classification

Fig. 1. Structure of our classification algorithm
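A runnable sketch of the loop in Figure 1 might look like this. The names are ours; `cat_graph` maps each category to its parent categories, and we additionally stop early when the frontier of unvisited parents empties, a case the pseudocode leaves implicit:

```python
import random

def classify(cat_graph, targets, initial_cats, size, max_iter=20):
    """Explore upward from `initial_cats` until `size` target
    categories are found or `max_iter` iterations elapse."""
    classification = []
    visited = set(initial_cats)
    frontier = set(initial_cats)
    for _ in range(max_iter):
        new = [c for c in frontier if c in targets]      # step 4
        if len(classification) + len(new) <= size:       # steps 5-6
            classification.extend(new)
        elif classification:                             # steps 7-8: reject overshoot
            break
        else:                                            # steps 9-11: sample at random
            classification = random.sample(new, size)
            break
        # step 12: unvisited parents directly connected to the frontier
        frontier = {p for c in frontier
                    for p in cat_graph.get(c, ())} - visited
        visited |= frontier
        if not frontier:        # our addition: nothing left to explore
            break
    return classification       # step 13
```

For example, with `cat_graph = {"a": ("b",), "b": ("c",)}` and target category `"c"`, `classify(cat_graph, {"c"}, ["a"], 1)` walks two levels up the graph and returns `["c"]`.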

3.2 Starting the Search

The first step of our algorithm is to map the user’s query to an initial set of categories from which the exploration of the graph will begin. This is accomplished by going through the titles included in the graph. The query is stripped of stopwords to keep only keywords; the system then generates the exhaustive list of titles that feature at least one of these keywords, and expands it into the exhaustive list of categories pointed to by these titles. Next, the algorithm considers each keyword/title/category triplet where the keyword is in the title and the title points to the category, and assigns each one a weight that is a function of how many query keywords are featured in the title, with a penalty for title keywords not featured in the query. The exact formula to compute the weight Wt of keywords in title t is given in equation (1). In this formula, Nk is the total number of query keywords featured in the title, Ck is the character count of the keywords featured in the title, and Ct is the total number of characters in the title. The rationale for using character counts in this formula is to shift some density weight to titles that match longer keywords in the query. The assumption is that, given that the user typically provides fewer than four keywords in the query, having one much longer keyword in the set could mean that this one keyword


is more important. Consequently, we give a higher weight to keywords in a title featuring the longer query keywords and missing the shorter ones, as opposed to a title featuring the shorter query keywords and missing the longer ones.

  Wt = 1 + (Nk × Ck − Ct) / Ck    (1)
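A sketch of this scoring step, using formula (1) as reconstructed above together with the per-category aggregation described next (maximum per keyword, sum per category, trim at half the top density). All function and variable names are ours:

```python
def title_weight(n_matched, chars_matched, chars_title):
    """Formula (1): Wt = 1 + (Nk * Ck - Ct) / Ck."""
    return 1 + (n_matched * chars_matched - chars_title) / chars_matched

def initial_categories(triplets, cutoff=0.5):
    """triplets: (keyword, category, weight) tuples from the
    keyword/title/category matching step."""
    # weight of a keyword given a category: max over the titles
    best = {}
    for kw, cat, w in triplets:
        best[(kw, cat)] = max(best.get((kw, cat), float("-inf")), w)
    # density of a category: sum of its keyword weights
    density = {}
    for (kw, cat), w in best.items():
        density[cat] = density.get(cat, 0.0) + w
    # trim everything below half of the highest density
    top = max(density.values())
    return {cat for cat, d in density.items() if d >= cutoff * top}
```

With this definition, a title that exactly matches a single query keyword (Ck = Ct) receives weight 1, and unmatched title characters reduce the weight.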

The weight of a keyword given a category is then defined as the maximum value that keyword takes in all titles that point to that category. Finally, the density value of each category is computed as the sum of the weights of all query keywords given that category. This process will generate a long list of categories, featuring some categories pointed to by high-weight words and summing to a high density score, and a lot of categories pointed to by only lower-weight words and having a lower score. The list is trimmed by discarding all categories having a score less than half that of the highest-density category. This trimmed set of categories is the initial set the exploration algorithm will proceed from. It corresponds to “CatList” at step 2 of our pseudocode in Figure 1. Through practical experiments, we found that this set typically contains approximately 28 categories.

3.3 Exploration Algorithm

Once the initial list of categories is available, the search algorithm explores the category graph step by step. At each step, the algorithm compares the set of newly-visited categories to the list of target categories defined as acceptable classification labels and adds any targets discovered to the list of classification results. It then generates the next generation of unvisited categories directly connected to the current set as parents and repeats the process. The exploration can thus be seen as radiating through the graph from each initial category. This process corresponds to steps 4 and 12 of the pseudocode algorithm in Figure 1.

There are two basic termination conditions for the exploration algorithm. The first is when a predefined maximum number of classification results have been discovered. This maximum could for example be 1, if the user wants a unique classification for each query, while it was set at 5 in the KDD CUP 2005 competition rules.
However, since the exploration algorithm can discover several target categories in a single iteration, it is possible to overshoot this maximum. The algorithm has two possible behaviors defined in that case. First, if some results have already been discovered, then the new categories are all rejected. For example, if the algorithm has already discovered four target categories to a given query out of a maximum of five and two more categories are discovered in the next iteration, both new categories are rejected and only four results are returned. The second behavior is for the special case where no target categories have been discovered yet and more than the allowed maximum are discovered at once. In that case, the algorithm simply selects randomly the maximum allowed number of results from the set. For example, if the algorithm discovers six target categories at once in an iteration, five of them will be kept at random and returned as the classification result.


The second termination condition for the algorithm is reaching a maximum of 20 iterations. The rationale for this is that, at each iteration, both the set of categories visited and the set of newly-generated categories expand. The limit of 20 iterations thus reflects a practical consideration, to prevent the size of the search from growing without constraint. Moreover, after 20 steps, we find that the algorithm has explored too far from the initial categories for the targets encountered to still be relevant. For comparison, in our experiments, the exploration algorithm discovered the maximum number of target categories in only 3 iterations on average, and never reached the 20-iteration limit. This limit thus also allows the algorithm to cut off the exploration of a region of the graph that is very far removed from target categories and will not generate relevant results.

4 Experimental Results

In order to test our system, we submitted it to the same challenge as the KDD CUP 2005 competition [4]. The 37 solutions entered in that competition were evaluated by classifying a set of 800 queries into up to 5 categories from a predefined set of 67 target categories ci, and comparing the results to the classification done by three human labelers. The 800 test queries were meaningful English queries selected randomly from MSN search logs, unedited and including the users’ typos and mistakes. The solutions were ranked based on overall precision and overall F1 value, as computed by Equations (2)-(6). The competition’s Performance Award was given to the system with the top overall F1 value, and the Precision Award was given to the system with the top overall precision value within the top 10 systems evaluated on overall F1 value. Note that participants had the option to enter their system for precision ranking but not F1 ranking, or vice-versa, rather than both, and several participants chose to use that option. Consequently, the top 10 systems ranked for precision are not the same as the top 10 systems ranked for F1 value, and there are some N/A values in the results in Table 1.

  Precision = Σi (number of queries correctly labeled as ci) / Σi (number of queries labeled as ci)    (2)

  Recall = Σi (number of queries correctly labeled as ci) / Σi (number of queries belonging to ci)    (3)

  F1 = 2 × Precision × Recall / (Precision + Recall)    (4)

  Overall Precision = (1/3) Σj=1..3 (Precision against labeler j)    (5)


  Overall F1 = (1/3) Σj=1..3 (F1 against labeler j)    (6)
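Equations (2)-(6) can be computed directly from per-category counts. A sketch, with our own function names and toy inputs rather than the competition data:

```python
def overall_scores(per_category):
    """per_category: list of (correct, labeled, belonging) counts for
    each category c_i; implements equations (2)-(4)."""
    correct = sum(c for c, _, _ in per_category)
    labeled = sum(l for _, l, _ in per_category)
    belonging = sum(b for _, _, b in per_category)
    precision = correct / labeled                       # equation (2)
    recall = correct / belonging                        # equation (3)
    f1 = 2 * precision * recall / (precision + recall)  # equation (4)
    return precision, recall, f1

def average_over_labelers(values):
    """Equations (5) and (6): mean of a score against the 3 labelers."""
    return sum(values) / len(values)
```

For instance, a single category with 5 correct labels out of 10 assigned and 10 true members yields precision, recall, and F1 of 0.5 each.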

In order for our system to compare to the KDD CUP competition results, we need to use the same set of category labels. As we mentioned in Section 3, the size and level of detail of Wikipedia’s category graph make it possible to identify categories to map any set of labels to. In our case, we identified 84 target categories in Wikipedia corresponding to the 67 KDD CUP categories. With the mapping done, we classified the 800 test queries with our system and evaluated the results on overall precision and F1 following the KDD CUP guidelines. Our results are presented in Table 1 along with the KDD CUP mean and median, the best system on precision, the best system on F1, and the worst system overall as reported in [4]. As can be seen from that table, our system performs well above the competition average, and in fact ranks in the top 10 of the competition.

Table 1. Classification results

System          F1 Rank   Precision Rank   Overall Precision   Overall F1
Best F1            1           N/A              0.4141           0.4444
Best Precision    N/A           1               0.4237           0.4261
Our System        10            7               0.3081           0.3005
Mean              18           13               0.2545           0.2353
Median            19           15               0.2446           0.2327
Worst             37           37               0.0509           0.0603

It is interesting to consider not only the final classification result, but also the performance of our exploration algorithm. To do this, we studied how frequently each of the termination conditions explained in Section 3.3 was reached. We can summarize from Section 3.3 that there are five distinct ways the algorithm can terminate. The first is “no initial list”, which is to say that the initial keyword-to-category mapping failed to generate any categories for our initial set, so the exploration cannot begin. If an initial set of categories is generated and the exploration begins, then there are still four ways it can terminate. The first is “failure”, if it reaches the cutoff value of 20 iterations without encountering a single target category. The second is “exploration limit”, if the algorithm reaches the cutoff value of 20 iterations but did discover some target categories along the way; these categories are returned as the classification results. The third is “overshoot”, if the algorithm discovers more than the maximum number of results in a single iteration and must select results randomly. And the final termination condition is “category limit”, which is when the algorithm has already found some categories and discovers more categories that bring it to or above the set maximum; if it goes above the maximum, the newly-discovered categories are discarded. In each case, we obtained the number of query searches that ended in that condition, the average number of iterations it took the algorithm to reach that condition, the average number of target categories found (which can be greater than the maximum allowed when more categories are found in the last iteration) and the average number of target categories returned. These results are presented in Table 2.


Table 2. Exploration performance

Termination         Number of   Average number   Average number of         Average number of
                    queries     of iterations    target categories found   target categories returned
No initial list        52            0                   0                         0
Failure                 0           20                   0                        N/A
Exploration limit       0           20                   0                        N/A
Overshoot              28            2.4                 7.0                       5
Category limit        720            3.3                 7.8                       3.3

As can be seen from Table 2, two of the five termination conditions we identified never occur at all. They are the two undesirable conditions where the exploration strays 20 iterations away from the initial categories. This result indicates that our exploration algorithm never diverges in wrong directions or misses the target categories, nor does it end up exploring in regions without target categories. However, there is still one undesirable condition that does occur, namely that of the algorithm finding no initial categories to begin the search from. This occurs when no titles featuring query words can be found, typically because the query consists only of unusual terms and abbreviations. For example, one query consisting only of the abbreviation “AATFCU” failed for this reason. Fortunately, this does not happen frequently: only 6.5% of the queries in our test set terminated for this reason. The most common termination conditions, accounting for 93.5% of query searches, are when the exploration successfully discovers the maximum number of target categories, either over several iterations or all in one, with the former case being much more common than the latter. In both cases, we can see that the system discovers these categories quickly, in fewer than 4 iterations on average. This demonstrates the success and efficiency of our exploration algorithm.

5 Conclusion

In this paper, we presented a novel algorithm to explore the Wikipedia category graph and discover the target categories nearest to a set of initial categories. To demonstrate its efficiency, we used the exploration algorithm as the core of a query classification system, and showed that its classification results compare favorably to those of the KDD CUP 2005 competition: our system would have ranked 7th on precision in that competition, with an increase of 6.4% compared to the competition median, and 10th on F1 with a 6.9% increase compared to the median. By using Wikipedia, our system gained the ability to classify queries into a set of almost 400,000 categories covering most of human knowledge, which can easily be mapped to a simpler application-specific set of categories when needed. But the core of our contribution remains the novel exploration algorithm, which can efficiently navigate the graph of 400,000 interconnected categories and discover the target categories to classify the query into in 3.3 iterations on average. Future work will focus on further refining the exploration algorithm to limit the number of categories generated at each iteration step by selecting the most promising directions to explore, as well as on developing ways to handle the 6.5% of queries that remain unclassified with our system.


References

1. Jansen, M.B.J., Spink, A., Saracevic, T.: Real life, real users, and real needs: a study and analysis of user queries on the web. Information Processing and Management 36(2), 207–227 (2000)
2. Shen, D., Pan, R., Sun, J.-T., Pan, J.J., Wu, K., Yin, J., Yang, Q.: Q2C@UST: our winning solution to query classification in KDDCUP 2005. ACM SIGKDD Explorations Newsletter 7(2), 100–110 (2005)
3. Alemzadeh, M., Karray, F.: An efficient method for tagging a query with category labels using Wikipedia towards enhancing search engine results. In: 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Toronto, Canada, pp. 192–195 (2010)
4. Li, Y., Zheng, Z., Dai, H.: KDD CUP-2005 report: Facing a great challenge. ACM SIGKDD Explorations Newsletter 7(2), 91–99 (2005)
5. Shen, D., Sun, J., Yang, Q., Chen, Z.: Building bridges for web query classification. In: Proceedings of SIGIR 2006, pp. 131–138 (2006)
6. Fu, J., Xu, J., Jia, K.: Domain ontology based automatic question answering. In: International Conference on Computer Engineering and Technology (ICCET 2008), vol. 2, pp. 346–349 (2009)
7. Beitzel, S.M., Jensen, E.C., Lewis, D.D., Chowdhury, A., Frieder, O.: Automatic classification of web queries using very large unlabeled query logs. ACM Transactions on Information Systems 25(2), article 9 (2007)
8. Hu, J., Wang, G., Lochovsky, F., Sun, J.-T., Chen, Z.: Understanding user’s query intent with Wikipedia. In: Proceedings of the 18th International Conference on World Wide Web, Spain, pp. 471–480 (2009)
9. Khoury, R.: Using Encyclopaedic Knowledge for Query Classification. In: Proceedings of the 2010 International Conference on Artificial Intelligence (ICAI 2010), Las Vegas, USA, vol. 2, pp. 857–862 (2010)
