Extending Faceted Search to the General Web


Precision-Oriented Query Facet Extraction
Weize Kong and James Allan
Center for Intelligent Information Retrieval
College of Information and Computer Sciences
University of Massachusetts Amherst

What are query facets?

Query: baggage allowance

Facet 1: ❑ AA ❑ Delta ❑ JetBlue
Facet 2: ❑ Business ❑ Economy
Facet 3: ❑ International ❑ Domestic

• A list of terms in a semantic class
• One aspect/facet of the query
• Helps clarify search intent
• Assists faceted query and exploratory search

Query facet extraction [Kong & Allan SIGIR’13]

Query: baggage allowance

Step 1: apply patterns to search results

Candidate facets
1. Delta, Facebook, Login
2. AA, Delta, British Airways
3. JetBlue, first, business, economy
…

Step 2: refine facets

Query facets
1. AA, Delta, JetBlue, …
2. international, domestic
3. weight, size, quantity
4. business, economy
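Step 1 can be illustrated with a minimal sketch. The slide does not say which patterns are applied, so this example assumes one simple HTML pattern: the items of each `<ul>`, `<ol>`, or `<select>` element form one candidate facet. The class name `ListExtractor` and the helper `extract_candidate_facets` are hypothetical, and the refinement in step 2 is not covered here.

```python
from html.parser import HTMLParser


class ListExtractor(HTMLParser):
    """Collects the items of each <ul>, <ol>, or <select> as one candidate facet.

    Assumed pattern for illustration only; nested lists are not handled.
    """

    CONTAINERS = {"ul", "ol", "select"}
    ITEMS = {"li", "option"}

    def __init__(self):
        super().__init__()
        self.candidates = []   # one list of terms per candidate facet
        self._current = None   # items of the list currently being read
        self._in_item = False

    def handle_starttag(self, tag, attrs):
        if tag in self.CONTAINERS:
            self._current = []
        elif tag in self.ITEMS and self._current is not None:
            self._in_item = True
            self._current.append("")

    def handle_data(self, data):
        if self._in_item and self._current:
            self._current[-1] += data

    def handle_endtag(self, tag):
        if tag in self.ITEMS:
            self._in_item = False
        elif tag in self.CONTAINERS and self._current is not None:
            items = [t.strip() for t in self._current if t.strip()]
            if len(items) > 1:          # a single item is not a useful list
                self.candidates.append(items)
            self._current = None
            self._in_item = False


def extract_candidate_facets(html):
    """Return the candidate facets (lists of terms) found in one page."""
    parser = ListExtractor()
    parser.feed(html)
    parser.close()
    return parser.candidates


page = (
    "<ul><li>Delta</li><li>Facebook</li><li>Login</li></ul>"
    "<ol><li>AA</li><li>Delta</li><li>British Airways</li></ol>"
)
candidates = extract_candidate_facets(page)
```

As on the slide, the raw candidates are noisy (the first list mixes an airline with navigation links), which is why a refinement step is needed afterwards.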

Faceted search

Facets not available for the web

Using query facets to extend faceted search to the web [Kong & Allan CIKM’14]

Query: baggage allowance

Search results
American Airlines Baggage Allowance Information
www.aa.com/i18n/.../baggage/baggageAllowance.jsp
Airline baggage allowance information from netflights
www.netflights.com
Delta Baggage | Baggage Fees | Delta Air Lines
www.delta.com/content/www/en_US/.../baggage.html
United Airlines - Baggage Information | Baggage Policy
www.gsa.gov
⋮

Users select terms (x = selected); matching results are re-ranked to the top.

Facet 1: ❑ AA, x Delta, ❑ JetBlue
Facet 2: x International, ❑ Domestic
Facet 3: ❑ Weight, ❑ Size, ❑ Quantity
Facet 4: ❑ Business, x Economy
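The re-ranking step can be sketched as follows. This is not the feedback model from the cited CIKM’14 work; it assumes a much simpler rule, in which a result moves up when its title mentions more of the selected facet terms, and ties keep their original order. The function name `rerank` and the substring matching are assumptions made for illustration.

```python
def rerank(results, selected_terms):
    """Move results that mention more selected facet terms to the top.

    Assumption for illustration: case-insensitive substring matching on titles.
    sorted() is stable, so results with equal counts keep their original order.
    """
    terms = [t.lower() for t in selected_terms]

    def match_count(title):
        text = title.lower()
        return sum(term in text for term in terms)

    return sorted(results, key=match_count, reverse=True)


results = [
    "American Airlines Baggage Allowance Information",
    "Airline baggage allowance information from netflights",
    "Delta Baggage | Baggage Fees | Delta Air Lines",
    "United Airlines - Baggage Information | Baggage Policy",
]
reranked = rerank(results, ["Delta"])
```

With "Delta" selected, the Delta page moves to the first position and the remaining results keep their original relative order.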

Precision-oriented scenarios

Ideal
Facet 1: ❑ AA ❑ Delta ❑ JetBlue
Facet 2: ❑ International ❑ Domestic
Facet 3: ❑ Weight ❑ Size ❑ Quantity
Facet 4: ❑ Business ❑ Economy

High “recall”
Facet 1: ❑ Delta ❑ Economy ❑ AA ❑ JetBlue
Facet 2: ❑ Boarding ❑ Lounges
Facet 3: ❑ International ❑ Domestic ❑ Business
Facet 4: ❑ Quantity ❑ Weight ❑ Size

High “precision” (users would prefer this)
Facet 1: ❑ AA ❑ Delta
Facet 2: ❑ Weight ❑ Size
Facet 3: ❑ Business ❑ Economy ❑ Lounges

Users care more about the correctness of presented facets than the completeness of them.

Previous models don’t work so well under precision-oriented scenarios

• Low precision

[Figure: bar chart comparing the LDA, QDM, QFI, and QFJ models on term precision, term recall, and term clustering F1; y-axis from 0 to 0.7; one bar is labeled 0.4450.]

Overview of this work

• Improve our previous extraction model under precision-oriented scenarios
– Likelihood is a bad training objective
– Directly optimize the performance measure

• Selective query faceting
– Avoid showing facets for poorly performing queries (e.g. “self motivation”)
– Only trigger faceting for well-performing ones (e.g. “used cars”)
– Predict extraction performance

• Improve evaluation measures
– Not included in this talk

Optimize the performance measure


Evaluation measure

• Compare with human-created facets

Extracted facets
Facet 1: ❑ International ❑ Domestic ❑ Business
Facet 2: ❑ AA ❑ Delta ❑ Twitter

Annotator facets (ground truth)
Facet 1: ❑ AA ❑ Delta ❑ JetBlue
Facet 2: ❑ International ❑ Domestic
Facet 3: ❑ Business ❑ Economy

• Measures: how to measure similarity
– Term classification
– Term clustering
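The two kinds of measures can be made concrete with a small sketch. The slide names them but does not give formulas, so the definitions below are assumptions: term precision and recall compare the sets of terms, and the clustering score is an F1 over pairs of correctly extracted terms that share a facet. The grouping of the example facets is also an assumed reading of the slide, and the function names are hypothetical.

```python
from itertools import combinations


def term_set(facets):
    """All terms appearing in any facet."""
    return {term for facet in facets for term in facet}


def same_facet_pairs(facets, allowed):
    """Unordered pairs of allowed terms that share a facet."""
    pairs = set()
    for facet in facets:
        kept = sorted(t for t in facet if t in allowed)
        pairs.update(combinations(kept, 2))
    return pairs


def facet_measures(extracted, truth):
    """Return (term precision, term recall, pair F1) under the assumed definitions."""
    ext_terms, true_terms = term_set(extracted), term_set(truth)
    correct = ext_terms & true_terms
    tp = len(correct) / len(ext_terms) if ext_terms else 0.0
    tr = len(correct) / len(true_terms) if true_terms else 0.0

    # Clustering is judged only on terms that were correctly extracted.
    ext_pairs = same_facet_pairs(extracted, correct)
    true_pairs = same_facet_pairs(truth, correct)
    agree = len(ext_pairs & true_pairs)
    pair_p = agree / len(ext_pairs) if ext_pairs else 0.0
    pair_r = agree / len(true_pairs) if true_pairs else 0.0
    pf = 2 * pair_p * pair_r / (pair_p + pair_r) if pair_p + pair_r else 0.0
    return tp, tr, pf


extracted = [["International", "Domestic", "Business"], ["AA", "Delta", "Twitter"]]
truth = [["AA", "Delta", "JetBlue"], ["International", "Domestic"], ["Business", "Economy"]]
tp, tr, pf = facet_measures(extracted, truth)
```

In this example "Twitter" lowers term precision, the missing "JetBlue" and "Economy" lower term recall, and grouping "Business" with "International" and "Domestic" lowers the pair score.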

𝑃𝑅𝐹𝛼,𝛽

• Combine three factors
– Term Precision (TP)
– Term Recall (TR)
– Pair F1 (PF, term clustering F-measure)

• Using weighted harmonic mean

\[ PRF_{\alpha,\beta} = \frac{\alpha^2 + \beta^2 + 1}{\frac{\alpha^2}{TP} + \frac{\beta^2}{TR} + \frac{1}{PF}} \]

• α and β adjust the emphasis between factors [Rijsbergen 1979]
– Hold α = 1
– β = 1: equal importance
– β = ½: TR is ½ as important as TP, PF
– β = ⅓: TR is ⅓ as important as TP, PF
⋮
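A short sketch of the measure, reading the slide’s formula as (α² + β² + 1) / (α²/TP + β²/TR + 1/PF). The function name `prf` and the choice to return 0 when any factor is 0 are assumptions made here.

```python
def prf(tp, tr, pf, alpha=1.0, beta=1.0):
    """Weighted harmonic mean of term precision, term recall, and pair F1.

    Returning 0.0 when any factor is 0 is a convention assumed for this sketch.
    """
    if min(tp, tr, pf) <= 0:
        return 0.0
    numerator = alpha ** 2 + beta ** 2 + 1
    denominator = alpha ** 2 / tp + beta ** 2 / tr + 1 / pf
    return numerator / denominator


# A high-precision, low-recall output (values chosen for illustration).
equal_weights = prf(0.8, 0.2, 0.8, alpha=1, beta=1)       # 3 / 7.5
recall_half = prf(0.8, 0.2, 0.8, alpha=1, beta=0.5)       # 2.25 / 3.75
```

With equal weights the low recall drags the score down to about 0.4; with β = ½ the same output scores about 0.6, which is how the measure expresses a precision-oriented preference.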

Optimize 𝑃𝑅𝐹𝛼,𝛽 directly

• Performance measure: \( PRF_{\alpha,\beta}(Y^*, Z^*) \)
• Training objective for the query faceting model (empirical utility maximization):

\[ u(\theta) = PRF_{\alpha,\beta}(Y^*, Z^*; \theta) \]

• But it’s difficult: predictions are thresholded, so the objective is non-continuous and non-differentiable

\[ \hat{y}_i = \mathbb{1}\left[ P(y_i = 1) > \lambda \right] \]

• Solution: approximation by its expectation (independence assumption)

\[ \tilde{y}_i = E[y_i] = P(y_i = 1; \theta) \]

\[ \widetilde{PRF}_{\alpha,\beta} = E[PRF_{\alpha,\beta}] \approx PRF_{\alpha,\beta}(\tilde{Y}, \tilde{Z}) \]
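The difference between the thresholded measure and its expectation-based approximation can be sketched for the term-level factors only. This is an illustration under assumptions, not the training code behind the talk: each term is treated independently, expected counts replace hard counts, and the function names and probabilities are made up for the example.

```python
def term_precision_recall(selected, gold):
    """Precision and recall from (possibly fractional) selection weights."""
    expected_correct = sum(s * g for s, g in zip(selected, gold))
    expected_selected = sum(selected)
    n_gold = sum(gold)
    precision = expected_correct / expected_selected if expected_selected else 0.0
    recall = expected_correct / n_gold if n_gold else 0.0
    return precision, recall


def hard_scores(probs, gold, lam=0.5):
    """Threshold each probability at lam: a step function of the probabilities."""
    decisions = [1.0 if p > lam else 0.0 for p in probs]
    return term_precision_recall(decisions, gold)


def soft_scores(probs, gold):
    """Plug the expectations P(y_i = 1) in directly: smooth in the probabilities."""
    return term_precision_recall(probs, gold)


gold = [1, 0, 1]                      # ground-truth term labels
before = [0.9, 0.2, 0.6]              # hypothetical model probabilities
after = [0.9, 0.2, 0.7]               # the third probability improves slightly

hard_before, hard_after = hard_scores(before, gold), hard_scores(after, gold)
soft_before, soft_after = soft_scores(before, gold), soft_scores(after, gold)
```

The hard scores are identical before and after the small improvement, so they give a learner no signal, while the soft scores both increase, which is what makes a gradient-based optimizer usable.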

Compare EUM & MLE

Utility is a better learning objective than likelihood for precision-oriented scenarios.

[Figure: line chart of 𝑃𝑅𝐹1,𝛽 (y-axis, 0.33 to 0.57) against 1/𝛽 (x-axis, 0 to 10) for EUM and MLE; larger 1/𝛽 down-weights term recall.]

• EUM: trained by optimizing 𝑃𝑅𝐹1,𝛽
• MLE: trained by optimizing likelihood
• Both for the QFJ model
• †: significant (p<0.05) over MLE baselines

Selective query faceting

Only trigger faceting for well-performing queries

Predicting Extraction Performance

• Predict 𝑃𝑅𝐹𝛼,𝛽 based on its expectation

[Figure: scatter plot of real PRF (y-axis, 0 to 1) against predicted PRF (x-axis, 0 to 1), with a threshold marked.]

Feature    Correlation    p-value
𝑃𝑅𝐹        0.6112         1.4 × 10^−11

Results based on 10-fold cross validation
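Selective faceting with a threshold can be sketched in a few lines. The predicted scores below are hypothetical values chosen for illustration (only the two query strings come from the talk), and the function name `select_queries` and the threshold of 0.4 are assumptions.

```python
def select_queries(predicted_prf, threshold):
    """Keep only queries whose predicted PRF exceeds the threshold."""
    return [query for query, score in predicted_prf.items() if score > threshold]


# Hypothetical predicted scores; the real predictor is described on the slide above.
predicted_prf = {
    "used cars": 0.62,
    "self motivation": 0.18,
}
selected = select_queries(predicted_prf, threshold=0.4)
```

Raising the threshold trades coverage for quality: fewer queries show facets, but the facets that are shown are expected to be better on average.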

Performance for the selected queries

Selective query faceting can improve average performance with fair coverage of the search traffic.

• 𝑃𝑅𝐹1,1 = 0.5792 when 20 queries are selected
• 𝑃𝑅𝐹1,1 = 0.4720 when not applying selective faceting

Gray area indicates standard error with 95% confidence intervals.

Conclusions

• Precision-oriented scenarios
• Use utility objective instead of likelihood
• Expectation-based approximation is effective
• Selective query faceting can be useful

Future work

• Label query facets
– E.g. label Facet 1 (AA, Delta, JetBlue) as “Airline”

• Rank/select facets and facet terms
– Critical for mobile search (smaller screen)

• Use query facets for exploratory search
– Recall-oriented?
– How to set the task and evaluate?

Thanks

Demo to play with =)

http://brooloo.cs.umass.edu
