Precision-Oriented Query Facet Extraction
Weize Kong and James Allan
Center for Intelligent Information Retrieval, College of Information and Computer Sciences, University of Massachusetts Amherst
What are query facets?
Query: baggage allowance
Facet 1: ❑ AA ❑ Delta ❑ JetBlue
Facet 2: ❑ Business ❑ Economy
Facet 3: ❑ International ❑ Domestic
• A list of terms in a semantic class
• One aspect/facet of the query
• Helps clarify search intent
• Assists faceted query and exploratory search
Query facet extraction [Kong & Allan SIGIR’13]
Query: baggage allowance
Step 1: apply patterns to search results
Candidate facets
1. Delta, Facebook, Login
2. AA, Delta, British Airways
3. JetBlue, first, business, economy
…
Step 2: refine facets
Query facets
1. AA, Delta, JetBlue, …
2. international, domestic
3. weight, size, quantity
4. business, economy
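Step 1 can be illustrated with a minimal sketch. It assumes one simple lexical pattern of the form "item, item, and item" applied to plain-text snippets; the pattern, the function name `extract_candidate_lists`, and the sample snippets are illustrative assumptions, not the actual pattern set of [Kong & Allan SIGIR’13]. Step 2 (refinement) is not shown.

```python
import re

# Hypothetical lexical pattern: "A, B, and C" or "A, B or C",
# where each item is a single word. Real systems would also use
# HTML structure and handle multi-word items.
LIST_PATTERN = re.compile(r"\b(\w+(?:, \w+)+),? (?:and|or) (\w+)\b")


def extract_candidate_lists(snippet):
    """Return every enumeration in the snippet as a list of terms."""
    candidates = []
    for match in LIST_PATTERN.finditer(snippet):
        # group(1) holds the comma-separated items, group(2) the final item.
        items = match.group(1).split(", ") + [match.group(2)]
        candidates.append(items)
    return candidates


# Example call on a made-up snippet.
candidate_facets = extract_candidate_lists(
    "Fees vary for AA, Delta, and JetBlue flights."
)
```

Candidate lists produced this way are noisy (compare "Delta, Facebook, Login" above), which is why a refinement step follows.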
Faceted search
Facets not available for the web
Using query facets to extend faceted search to the web [Kong & Allan CIKM’14]
Query: baggage allowance
Search results:
American Airlines Baggage Allowance Information www.aa.com/i18n/.../baggage/baggageAllowance.jsp
Airline baggage allowance information from netflights www.netflights.com
Delta Baggage | Baggage Fees | Delta Air Lines www.delta.com/content/www/en_US/.../baggage.html
United Airlines - Baggage Information | Baggage Policy www.gsa.gov
⋮
Users select terms (x = selected):
Facet 1: ❑ AA x Delta ❑ JetBlue
Facet 2: x International ❑ Domestic
Facet 3: ❑ Weight ❑ Size ❑ Quantity
Facet 4: ❑ Business x Economy
Results matching the selected terms are re-ranked to the top.
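The feedback loop on this slide can be sketched as a stable re-ranking. The sketch assumes case-insensitive substring matching against result titles, which is an illustrative stand-in and not the feedback model of [Kong & Allan CIKM’14]; the function name `rerank_by_selected_terms` is hypothetical.

```python
def rerank_by_selected_terms(results, selected_terms):
    """Move results that mention more selected facet terms toward the top.

    Python's sort is stable, so results with the same number of
    matching terms keep their original relative order.
    """
    terms = [term.lower() for term in selected_terms]

    def match_count(text):
        lowered = text.lower()
        # Counts how many distinct selected terms appear in the text.
        return sum(term in lowered for term in terms)

    return sorted(results, key=match_count, reverse=True)


results = [
    "American Airlines Baggage Allowance Information",
    "Airline baggage allowance information from netflights",
    "Delta Baggage | Baggage Fees | Delta Air Lines",
    "United Airlines - Baggage Information | Baggage Policy",
]
reranked = rerank_by_selected_terms(results, ["Delta"])
```

With "Delta" selected, the Delta result moves to the top and the remaining results keep their original order.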
Precision-oriented scenarios
Ideal
Facet 1: ❑ AA ❑ Delta ❑ JetBlue
Facet 2: ❑ International ❑ Domestic
Facet 3: ❑ Weight ❑ Size ❑ Quantity
Facet 4: ❑ Business ❑ Economy
High “recall”
Facet 1: ❑ Delta ❑ Economy ❑ AA ❑ JetBlue
Facet 2: ❑ Boarding ❑ Lounges
Facet 3: ❑ International ❑ Domestic ❑ Business
Facet 4: ❑ Quantity ❑ Weight ❑ Size
High “precision” (users would prefer this)
Facet 1: ❑ AA ❑ Delta
Facet 2: ❑ Weight ❑ Size
Facet 3: ❑ Business ❑ Economy ❑ Lounges
Users care more about the correctness of the presented facets than about their completeness.
Previous models don’t work so well under precision-oriented scenarios
[Figure: bar chart comparing LDA, QDM, QFI, and QFJ on term precision, term recall, and term clustering F1 (y-axis from 0 to 0.7). Callout: low precision; labeled value: 0.4450.]
Overview of this work
• Improve our previous extraction model under precision-oriented scenarios
– Likelihood is a bad training objective
– Directly optimize the performance measure
• Selective query faceting
– Avoid showing facets for poorly performing queries (e.g., “self motivation”)
– Only trigger faceting for well-performing ones (e.g., “used cars”)
– Predict extraction performance
• Improve evaluation measures
– Not included in this talk
Optimize the performance measure
Evaluation measure
• Compare with human-created facets
– Extracted facets: ❑ International ❑ AA ❑ Domestic ❑ Delta ❑ Business ❑ Twitter
– Annotator facets (ground truth): ❑ AA ❑ International ❑ Business ❑ Delta ❑ Domestic ❑ Economy ❑ JetBlue
• Measures: how to measure similarity
– Term classification
– Term clustering
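The two kinds of measures can be made concrete with a small sketch. It assumes term classification is scored as set precision and recall over terms, and term clustering as an F1 over unordered term pairs that share a facet; the exact pair definition used in the talk may differ. The function names and the toy facet groupings are assumptions for illustration only.

```python
from itertools import combinations


def term_set(facets):
    """All terms that appear in any facet."""
    return {term for facet in facets for term in facet}


def pair_set(facets):
    """All unordered term pairs that share a facet."""
    return {
        frozenset(pair)
        for facet in facets
        for pair in combinations(sorted(set(facet)), 2)
    }


def term_and_pair_scores(extracted, gold):
    """Return (term precision, term recall, pair F1)."""
    ext_terms, gold_terms = term_set(extracted), term_set(gold)
    correct = ext_terms & gold_terms
    tp = len(correct) / len(ext_terms) if ext_terms else 0.0
    tr = len(correct) / len(gold_terms) if gold_terms else 0.0

    ext_pairs, gold_pairs = pair_set(extracted), pair_set(gold)
    shared = ext_pairs & gold_pairs
    denom = len(ext_pairs) + len(gold_pairs)
    pf = 2 * len(shared) / denom if denom else 0.0
    return tp, tr, pf


# Toy input: the groupings below are assumed, not taken from the slide.
extracted = [["AA", "Delta", "Twitter"],
             ["International", "Domestic", "Business"]]
gold = [["AA", "Delta", "JetBlue"],
        ["International", "Domestic"],
        ["Business", "Economy"]]
tp, tr, pf = term_and_pair_scores(extracted, gold)
```

On this toy input, five of the six extracted terms are correct, five of the seven annotator terms are found, and two term pairs are shared between the six extracted pairs and the five annotator pairs.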
$PRF_{\alpha,\beta}$
• Combine three factors
– Term precision (TP)
– Term recall (TR)
– Pair F1 (PF; term clustering F-measure)
• Using weighted harmonic mean

$$PRF_{\alpha,\beta} = \frac{\alpha^2 + \beta^2 + 1}{\frac{\alpha^2}{TP} + \frac{\beta^2}{TR} + \frac{1}{PF}}$$

• Adjust emphasis between factors
– Hold α = 1
– β = 1: equal importance
– β = ½: TR is ½ as important as TP and PF
– β = ⅓: TR is ⅓ as important as TP and PF
⋮
[Rijsbergen 1979]
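A short sketch of the weighted harmonic mean shows how lowering β reduces the influence of term recall. The function name `prf` and the convention of returning 0 when any factor is 0 are assumptions made for this illustration.

```python
def prf(tp, tr, pf, alpha=1.0, beta=1.0):
    """Weighted harmonic mean of term precision, term recall, and pair F1."""
    # Assumed convention: the score is 0 when any factor is 0,
    # which also avoids division by zero.
    if min(tp, tr, pf) <= 0.0:
        return 0.0
    numerator = alpha ** 2 + beta ** 2 + 1
    denominator = alpha ** 2 / tp + beta ** 2 / tr + 1 / pf
    return numerator / denominator


# A precise but low-recall output, scored with two settings of beta.
equal_weight = prf(tp=0.8, tr=0.2, pf=0.8, beta=1.0)
recall_down_weighted = prf(tp=0.8, tr=0.2, pf=0.8, beta=0.5)
```

For TP = 0.8, TR = 0.2, PF = 0.8, the score is 3 / 7.5 = 0.4 with β = 1 and 2.25 / 3.75 = 0.6 with β = ½, so the same output is rewarded more once recall is down-weighted.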
Optimize $PRF_{\alpha,\beta}$ directly
• Performance measure: $PRF_{\alpha,\beta}(Y^*, Z^*)$
• Training objective: $u(\theta) = PRF_{\alpha,\beta}(Y^*, Z^*; \theta)$
• Empirical utility maximization: the performance measure serves as the training objective for the query faceting model
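Optimize $PRF_{\alpha,\beta}$ directly
• But it’s difficult
– $\hat{y}_i = \mathbb{1}\{P(y_i = 1) > \lambda\}$
– Non-continuous, non-differentiable
• Solution: approximation by its expectation
– $\tilde{y}_i = E[y_i] = P(y_i = 1; \theta)$
– $\widetilde{PRF}_{\alpha,\beta} = E[PRF_{\alpha,\beta}] \approx PRF_{\alpha,\beta}(\tilde{Y}, \tilde{Z})$ (independence assumption)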
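The difference between thresholded decisions and their expectations can be sketched for the term-level factors alone. The sketch assumes per-term probabilities from some model and binary gold labels; the pair factor would be handled analogously and is omitted. Plugging expected counts into the ratio is an approximation of the expected score, not an exact expectation, and all names here are illustrative.

```python
def hard_decisions(probs, lam=0.5):
    """Thresholded labels: a step function of the probabilities."""
    return [1.0 if p > lam else 0.0 for p in probs]


def term_scores(weights, gold):
    """Term precision and recall from per-term weights.

    With 0/1 weights this is the usual precision and recall; with
    probabilities as weights it uses expected counts instead.
    """
    hits = sum(w * g for w, g in zip(weights, gold))
    n_pred, n_gold = sum(weights), sum(gold)
    tp = hits / n_pred if n_pred else 0.0
    tr = hits / n_gold if n_gold else 0.0
    return tp, tr


gold = [1, 0, 1]
probs_a = [0.9, 0.60, 0.2]
probs_b = [0.9, 0.55, 0.2]  # slightly less confident in the wrong term

hard_a = term_scores(hard_decisions(probs_a), gold)
hard_b = term_scores(hard_decisions(probs_b), gold)
soft_a = term_scores(probs_a, gold)
soft_b = term_scores(probs_b, gold)
```

The hard scores are identical for both probability vectors because no term crosses the threshold, so they give no gradient signal. The expectation-based precision rises when the probability of the wrong term drops, which is what makes the approximation usable as a training objective.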
Compare EUM & MLE
[Figure: line chart of $PRF_{1,\beta}$ (y-axis, 0.33 to 0.57) against $1/\beta$ (x-axis, 0 to 10) for EUM and MLE; larger $1/\beta$ down-weights term recall.]
• EUM: trained by optimizing $PRF_{1,0.5}$
• MLE: trained by optimizing likelihood
• Both for the QFJ model
• †: significant (p < 0.05) over MLE baselines
Utility is a better learning objective than likelihood for precision-oriented scenarios.
Selective query faceting
Only trigger faceting for well-performing queries
Predicting Extraction Performance
• Predict $PRF_{\alpha,\beta}$ based on its expectation
[Figure: scatter plot of real PRF (y-axis, 0 to 1) against predicted PRF (x-axis, 0 to 1), with a threshold marked.]
Feature: PRF; correlation: 0.6112; p-value: 1.4 × 10⁻¹¹
Results based on 10-fold cross validation
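Selective faceting with a threshold can be sketched as follows. The sketch assumes a predicted score per query is already available and that facets are shown only when the prediction reaches the threshold; the query ids, scores, and threshold are made-up toy values, not results from the talk.

```python
def select_queries(predicted, threshold):
    """Keep only queries whose predicted performance reaches the threshold."""
    return [query for query, score in predicted.items() if score >= threshold]


def average_performance(real, queries):
    """Mean real performance over the given queries (0.0 if none)."""
    if not queries:
        return 0.0
    return sum(real[query] for query in queries) / len(queries)


# Hypothetical toy values for three queries.
predicted = {"q1": 0.70, "q2": 0.60, "q3": 0.20}
real = {"q1": 0.65, "q2": 0.55, "q3": 0.15}

selected = select_queries(predicted, threshold=0.5)
avg_selected = average_performance(real, selected)
avg_all = average_performance(real, list(real))
```

Raising the threshold trades coverage for average quality: here two of three queries are kept and their average real score is higher than the average over all queries.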
Performance for the selected queries
[Figure: performance for the selected queries. Gray area indicates standard error with 95% confidence intervals.]
• $PRF_{1,1}$ = 0.5792 when 20 queries are selected
• $PRF_{1,1}$ = 0.4720 when not applying selective faceting
Selective query faceting can improve average performance with fair coverage of the search traffic.
Conclusions
• Precision-oriented scenarios
• Use utility objective instead of likelihood
• Expectation-based approximation is effective
• Selective query faceting can be useful
Future work
• Label query facets
– E.g., label Facet 1 (❑ AA ❑ Delta ❑ JetBlue) as “Airline”
• Rank/select facets and facet terms
– Critical for mobile search (smaller screen)
• Use query facets for exploratory search
– Recall-oriented?
– How to set the task and evaluate?
Thanks
Demo to play with =)
http://brooloo.cs.umass.edu