Measuring Performance of Web Image Context Extraction
Sadet Alcic
Heinrich-Heine-University of Duesseldorf
Department of Computer Science
Institute for Databases and Information Systems
July 25th, 2010
Outline
1. Motivation
   - Web Image Context and its benefits
2. Evaluation Framework
   - Workflow
   - Data Collections
   - Existing WICE Algorithms
   - Performance Metrics
3. Evaluation Results
Motivation
Web Image Context and its benefits

Manual TAGs: plane, aircraft, white, airstrip, boeing 787, jet, clouds, sand.
Context TAGs: boeing, airbus, battle for skies, 47th Farnborough International Airshow, southern England, world’s biggest aerospace manufacturers, aircraft.

- Manual TAGs are expensive ⇔ Web Image Context provides TAGs for free
- Web Image Context has to be extracted
Current situation
A variety of applications in different research fields, such as
- Machine Learning,
- Data Mining,
- Information Retrieval,
use different algorithms to extract Web Image Context.
Key Question
Which is the best method to extract Web Image Context?
- Evaluation and comparison of algorithms is needed
- But: Web Image Context Extraction (WICE) is only a preprocessing step in existing applications
- Existing evaluations have therefore rarely investigated WICE on its own
Our contribution
- Evaluation framework to measure and compare the performance of WICE
  - Large-scale, realistic ground truth datasets
  - Performance measures adapted to fit the requirements of Web Image Context
  - Common WICE methods from the literature to test the capabilities of the framework
Evaluation Framework
Workflow
[Figure: workflow diagram with the elements Document Collections, Ground Truth, Context Extraction, Extracted Context, Performance Measures, Evaluation, and Results]
Data Collections
Ground truth data collections
- Manual collection created by an expert
  - 80 Web documents collected from 8 categories of the Yahoo! Directory
  - ⇒ labour-intensive and tedious, hence the collection is small
- Automatically created collections
      visit domain x
      IF x.content has significantly changed since last access
          THEN store x.content to disc
      wait n minutes
      REPEAT UNTIL collection large enough
  - Collects Web documents with a common structure
  - For each domain: a rule-based wrapper detects the Image Context
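The collection loop above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the slides do not say how pages are fetched or how "significantly changed" is decided, so the caller-supplied `fetch` function, the `similarity_threshold` parameter, the `max_visits` safety bound, and the use of `difflib.SequenceMatcher` are all hypothetical choices made for illustration.

```python
import time
from difflib import SequenceMatcher
from typing import Callable, List


def has_changed_significantly(old: str, new: str,
                              similarity_threshold: float = 0.9) -> bool:
    """Hypothetical change test: content counts as changed when the
    similarity ratio of old and new text falls below the threshold."""
    if not old:
        return True  # nothing stored yet, so any content is new
    return SequenceMatcher(None, old, new).ratio() < similarity_threshold


def collect(fetch: Callable[[], str], target_size: int,
            wait_seconds: float = 0.0, max_visits: int = 1000) -> List[str]:
    """Visit one domain repeatedly and keep each significantly changed snapshot.

    `fetch` returns the current page content (an assumption, e.g. a thin
    wrapper around an HTTP client). The returned list stands in for the
    documents that the original loop stores to disc.
    """
    snapshots: List[str] = []
    last = ""
    for _ in range(max_visits):  # safety bound, not part of the original loop
        if len(snapshots) >= target_size:
            break  # "UNTIL collection large enough"
        content = fetch()
        if has_changed_significantly(last, content):
            snapshots.append(content)
            last = content
        time.sleep(wait_seconds)  # "wait n minutes" in the original
    return snapshots
```

Storing only changed snapshots keeps the collection free of near-identical copies of the same page, which is presumably why the loop checks for change before storing.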
Resulting collections

Collection            #Documents    #Images
BBC                        1,077      7,878
CNN                          874     11,612
Golem                        789      3,061
Heise                         79      1,403
MSN                          375      9,264
New-York Times               556     10,927
Spiegel                    1,076     36,310
Telegraph                    530     10,503
The Globe and Mail           735     15,808
Wikipedia                  3,000      6,728
Yahoo! (english)           3,737     41,170
diverse (manual)              79        901
total                     12,907    155,565
Existing WICE Algorithms
Context Extraction Algorithms: heuristics-based
1. Full-Text Extraction (Ortega-Binderberger et al., 2000)
   - Baseline approach
   - The complete text of the Web document is extracted as Image Context
   - Used to show the benefits of other WICE algorithms
2. N-Term Environment (Sclaroff et al., 1999; Souza Coelho, 2004)
   - Extract the N terms before and after the Web image
   - The Web document is flattened to its visible contents (text and images)
   - We choose N = 10 and N = 20, as applied by Sclaroff and Souza Coelho
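The N-Term Environment heuristic can be illustrated with a short Python sketch. It assumes the document has already been flattened into a sequence of text terms and image markers; the `Image` class, the `Token` alias, and the function name are hypothetical, and skipping other images inside the window is an assumption the slides do not settle.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Image:
    """Hypothetical marker for an image in the flattened document."""
    src: str


Token = Union[str, Image]


def n_term_environment(flat: List[Token], image: Image, n: int) -> List[str]:
    """Return up to n terms before and up to n terms after `image`.

    Only text terms are counted; other images in the window are skipped
    (an assumption made for this sketch).
    """
    pos = flat.index(image)
    terms_before = [t for t in flat[:pos] if isinstance(t, str)]
    terms_after = [t for t in flat[pos + 1:] if isinstance(t, str)]
    # max(...) keeps the slice correct for n == 0 and for short documents
    before = terms_before[max(0, len(terms_before) - n):]
    after = terms_after[:n]
    return before + after
```

For example, with the flattened sequence `a b c d e <image> f g h` and N = 2, the sketch returns `d e f g`.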
Context Extraction Algorithms: DOM-based
3. Siblings Extractor (Yong-hong et al., 2005)
   - Starts at the image node
   - Walks up the DOM tree until the subtree of the current parent contains text nodes
   - All text nodes under this parent are extracted as Image Context
4. Monash Extractor (Fauzi and Belkhatir, 2009)
   - Uses the DOM structure to classify images into listed, semi-listed and unlisted images
   - DOM-based extraction rules are defined for each category
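The walk described for the Siblings Extractor can be sketched in Python on a well-formed XHTML fragment. This is an illustrative reading of the three bullet points, not the published implementation: the function name is hypothetical, and `xml.etree.ElementTree` is used only because it ships with Python (real Web pages would need a tolerant HTML parser).

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def siblings_context(root: ET.Element, img: ET.Element) -> List[str]:
    """Walk up from `img` until an ancestor's subtree contains text,
    then return all non-empty text pieces found under that ancestor."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of: Dict[ET.Element, ET.Element] = {
        child: parent for parent in root.iter() for child in parent
    }
    node = img
    while node in parent_of:
        node = parent_of[node]
        texts = [t.strip() for t in node.itertext() if t.strip()]
        if texts:
            return texts
    return []  # no ancestor contains any text
```

Because the walk stops at the first ancestor with text, the extracted context stays close to the image when the page nests the image and its caption in a common container.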
Context Extraction Algorithms: vision-based
5. Vision-based Page Segmentation (Cai et al., 2004; He et al., 2007)
   - Based on the VIPS algorithm (Cai et al., 2003), originally developed for web page segmentation
   - Hierarchical top-down approach, starting with the whole page as initial block
   - For each block:
     - compute the Degree of Coherence (DoC) based on DOM and visual cues
     - DoC determines the correlation of contents within a block
     - DoC values range from 1 to 10 (10 = maximum coherence)
   - The Permitted Degree of Coherence (PDoC) specifies the minimum coherence a block should have
   - If the PDoC is not reached, the block is subdivided until all blocks fulfill the condition
   - To use VIPS as a WICE method, extract the complete text of the visual block containing the image
   - We apply PDoC = 5, 6, 7
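The role of the PDoC threshold can be illustrated with a toy Python sketch. It does not implement VIPS: the DoC of each block is simply given as a number instead of being computed from DOM and visual cues, and the `Block` class and both function names are hypothetical. The sketch only shows the top-down subdivision and the extraction of the block containing the image.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """Hypothetical visual block with a precomputed Degree of Coherence."""
    doc: int                                   # DoC in 1..10, given, not computed
    terms: List[str] = field(default_factory=list)
    images: List[str] = field(default_factory=list)
    children: List["Block"] = field(default_factory=list)

    def all_terms(self) -> List[str]:
        terms = list(self.terms)
        for child in self.children:
            terms.extend(child.all_terms())
        return terms

    def contains(self, image: str) -> bool:
        return image in self.images or any(c.contains(image) for c in self.children)


def segment(block: Block, pdoc: int) -> List[Block]:
    """Subdivide top-down until each block reaches the PDoC or cannot be split."""
    if block.doc >= pdoc or not block.children:
        return [block]
    result: List[Block] = []
    for child in block.children:
        result.extend(segment(child, pdoc))
    return result


def vips_context(page: Block, image: str, pdoc: int) -> List[str]:
    """Return the complete text of the segmented block that contains `image`."""
    for block in segment(page, pdoc):
        if block.contains(image):
            return block.all_terms()
    return []
```

In this toy model a higher PDoC yields smaller blocks and therefore a shorter extracted context, which is why the choice of PDoC matters for the extraction result.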
Performance Metrics
Evaluation Metrics
- Testing for exact matches between computed and ground truth data is too strong a criterion
- Instead, partial accordance should be considered
- Web Image Context Extraction as an Information Retrieval task
  - An image Q of the document D is a query formulated on this document
  - Results are sequences of terms extracted as Image Context
  - Accordance is computed as the Longest Common Subsequence (LCS) (Hirschberg, 1975) of terms
- This formulation allows the adaptation of the precision P, recall R and Fscore metrics to WICE:

  $$P = \frac{\mathrm{LCS}}{\#\text{computed terms}}, \quad R = \frac{\mathrm{LCS}}{\#\text{correct terms}}, \quad F_{score} = 2 \cdot \frac{P \cdot R}{P + R}$$

- The standard deviation of the Fscore is computed for every collection to test stability
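The metrics above can be computed with a few lines of Python. This is a minimal sketch: the LCS length uses the standard dynamic-programming recurrence with two rows rather than a full implementation of Hirschberg's algorithm, the function names are hypothetical, and using the population standard deviation (`statistics.pstdev`) is an assumption, since the slides do not say which variant was used.

```python
from statistics import pstdev
from typing import List, Sequence, Tuple


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two term sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[len(b)]


def precision_recall_f(computed: Sequence[str],
                       correct: Sequence[str]) -> Tuple[float, float, float]:
    """P, R and Fscore of an extracted context against the ground truth."""
    lcs = lcs_length(computed, correct)
    p = lcs / len(computed) if computed else 0.0
    r = lcs / len(correct) if correct else 0.0
    f = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f


def fscore_stddev(fscores: List[float]) -> float:
    """Spread of the Fscore over all images of one collection."""
    return pstdev(fscores)
```

For the computed terms `a b c d` and the correct terms `b c e`, the LCS is `b c`, so P = 2/4, R = 2/3 and Fscore = 4/7.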
Evaluation Results
Results for 4 of the 12 collections
[Figure: four bar charts (Heise, MSN, Diverse (manual), and Wikipedia collections) showing F-score and standard deviation of F-score, on a scale from 0 to 1, for the extractors 10 terms, 20 terms, monash, siblings, full text, VIPS PDoC 5, VIPS PDoC 6, and VIPS PDoC 7]
- All extraction methods are significantly better than the baseline.
  Possible reason: while R is high for the baseline, P is very low, since the number of terms in the image context is significantly lower than the number of terms in the complete document.
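A small worked example makes this effect concrete. The sizes below are hypothetical and chosen only for illustration, and the sketch assumes that the complete ground-truth context occurs in order inside the full document text, so that the LCS equals the context length.

```python
from typing import Tuple


def baseline_scores(context_terms: int, document_terms: int) -> Tuple[float, float, float]:
    """P, R and Fscore of full-text extraction under the assumption that
    the whole ground-truth context is contained in the document text."""
    lcs = context_terms          # every correct term is recovered
    p = lcs / document_terms     # computed terms = complete document
    r = lcs / context_terms      # correct terms = ground-truth context
    f = 2 * p * r / (p + r)
    return p, r, f


# Hypothetical sizes: a 30-term context inside a 600-term document.
p, r, f = baseline_scores(30, 600)
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")  # prints P=0.050 R=1.000 F=0.095
```

Even with perfect recall, the Fscore stays below 0.1 in this example because the precision is bounded by the ratio of context length to document length.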
- Results for the N-Term Environment extractors lie in the middle third.
  Possible reason: the image context is often located only before or only after the image, but not both.
- Results for the VIPS-based method are unsteady, and the standard deviation of its Fscore is higher.
  Possible reason: VIPS depends highly on the chosen PDoC value; visual blocks are too wide or too narrow.
- DOM-based methods perform best.
  Possible reason: most of the popular Web domains use a CMS that produces well-structured documents.
Conclusion and future work
- Evaluation framework for WICE algorithms
- Existing algorithms implemented to test functionality
- Results can be used to choose the best extraction method
- Extension of the framework with more algorithms and metrics
- Development of new WICE methods
Thanks for listening!
Questions?