Probabilistic Partial User Model Similarity for Collaborative Filtering

1st International Workshop on Inductive Reasoning and Machine Learning (IRMLeS) 2009

Amancio Bouza, Gerald Reif, Abraham Bernstein
Department of Informatics, University of Zurich

SOFTWAREEVOLUTIONARCHITECTURELAB

Motivation

1. Jun. 2009

IRMLeS 2009 by Amancio Bouza


No common rated items, but similar preferences

Asian Food

Italian Food

Asian Food

Partial User Preference Similarity

Zurich


Heraklion


Agenda

Motivation
User preference models
Global similarity of user preferences
Partial similarity of user preferences
Evaluation
Conclusion

User preference models

Modeling preferences

Topics of interest (Balabanovic and Shoham 1997)

Weighted topics of interest (Good et al. 1999)

Topics from domain ontology (Middleton et al. 2002, 2004)

Preference vector (Anand et al. 2007)

Item rating vector (Resnick et al. 1994)

Item rating vector with prediction of missing values (Melville et al. 2002)

User function: $u(i) \to c_k$

Hypothesized user function: $h(i) + \varepsilon(i) \to c_k$

Preference model as hypothesized user function: $h : i \mapsto c_k$

User Preference Model

Modeling of items: definition of a feature set with relevant features

Mapping items to rating concepts: the user contributes by providing item ratings

Learning of an accurate user preference model: a program is said to learn if its performance P in task T improves with more experience E

User function: $u(i) = c_k$

Hypothesized user function: $h(i) + \varepsilon(i) = c_k$, with $h : i \mapsto c_k$

Concept learning

Ambiance information, e.g. #Mediterranean_Ambiance

Food information, e.g. #Italian_Food

Location information, e.g. #Business_District

Hypothesis: #Asian_Ambiance AND #Vegetarian_Food AND #Business_District

Further hypotheses:

#Greek_District AND #American_Food AND #cheap_Wine
#Italian_Food AND #excellent_Wine
#Asian_Food AND #Asian_Ambiance
#excellent_Wine AND NOT #Italian_Food

Hypothesized user function: $h : i \mapsto c_k$
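The hypotheses on the concept-learning slide are conjunctions of item features, optionally with a negated feature. A minimal sketch of how such a hypothesis could be tested against an item follows; the `Hypothesis` class, its `covers` method, and the `trattoria` item are hypothetical names introduced only for illustration and are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A conjunction of required and forbidden item features (hypothetical structure)."""
    required: frozenset
    forbidden: frozenset = frozenset()

    def covers(self, item_features: set) -> bool:
        # The item must have every required feature and none of the forbidden ones.
        has_required = self.required <= item_features
        has_forbidden = bool(self.forbidden & item_features)
        return has_required and not has_forbidden


# Two hypotheses copied from the slide.
h1 = Hypothesis(frozenset({"#Italian_Food", "#excellent_Wine"}))
h2 = Hypothesis(frozenset({"#excellent_Wine"}), frozenset({"#Italian_Food"}))

# A made-up item, described by its feature set.
trattoria = {"#Italian_Food", "#excellent_Wine", "#Mediterranean_Ambiance"}

print(h1.covers(trattoria))  # True
print(h2.covers(trattoria))  # False
```

Representing a hypothesis as feature sets keeps the coverage test to two set operations, which is convenient if a hypothesis is later used as an item filter.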

Global similarity of user preferences

Model Similarity

Item set

$h_a : i \mapsto c_k$

$h_b : i \mapsto c_k$

Similarity Metric

User Model:

$$u_a(i) = h_a(i) + \varepsilon(i)$$

User Preference Similarity:

$$\mathrm{sim}(u_a, u_b) \equiv \mathrm{sim}\big(u_a(i), u_b(i)\big)$$

$$\mathrm{sim}(u_a, u_b) \equiv \alpha \sum_{k=1}^{z} \sum_{j=1}^{m} P\big(u_a(j) = c_k \wedge u_b(j) = c_k\big)$$

$$\mathrm{sim}\big(u_a(i), u_b(i)\big) \approx \mathrm{sim}\big(h_a(i), h_b(i)\big)$$

$$\mathrm{sim}\big(u_a(i), u_b(i)\big) \approx \alpha \sum_{k=1}^{z} \sum_{j=1}^{m} P\big(h_a(j) = c_k \wedge h_b(j) = c_k\big)$$
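The model-based similarity metric can be sketched in a few lines. This is only an illustration under two stated assumptions: both hypothesized user functions are deterministic, so the joint probability $P(h_a(j) = c_k \wedge h_b(j) = c_k)$ reduces to a 0/1 indicator, and the normalization constant is taken as $\alpha = 1/m$. The item set, the two toy models, and the function name `global_similarity` are hypothetical.

```python
def global_similarity(h_a, h_b, items, concepts):
    """alpha * sum_k sum_j P(h_a(j) = c_k and h_b(j) = c_k), assuming alpha = 1/m."""
    m = len(items)
    if m == 0:
        return 0.0
    agreement = 0
    for c_k in concepts:
        for j in items:
            # Assumption: deterministic models, so the joint probability is 0 or 1.
            if h_a(j) == c_k and h_b(j) == c_k:
                agreement += 1
    return agreement / m


# Made-up item set: item name -> feature set.
items = {
    "trattoria": {"#Italian_Food"},
    "pizzeria": {"#Italian_Food"},
    "sushi_bar": {"#Asian_Food"},
    "diner": {"#American_Food"},
}


def h_a(j):
    # Toy model of user a: likes Italian food only.
    return 5 if "#Italian_Food" in items[j] else 2


def h_b(j):
    # Toy model of user b: likes Italian and Asian food.
    return 5 if items[j] & {"#Italian_Food", "#Asian_Food"} else 2


concepts = [1, 2, 3, 4, 5]  # rating concepts c_1 .. c_z
similarity = global_similarity(h_a, h_b, list(items), concepts)
print(similarity)  # 0.75
```

Because both models are evaluated on the whole item set, the comparison does not require the two users to have rated any item in common.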

Partial similarity of user preferences

Partial Model Similarity

Item set

Hypothesis 1

Hypothesis 2

Hypothesis 3

$h_a : i \mapsto c_k$

$h_b : i \mapsto c_k$

User Preference Similarity

Partial Similarity Metric

Partial User Preference Similarity:

$$\partial\mathrm{sim}(u_a, u_b \mid h_{a,q}) \equiv \mathrm{sim}\big(h_{a,q}, h_b(i)\big)$$

$$\mathrm{sim}\big(h_{a,q}, h_b(i)\big) \equiv \alpha \sum_{j=1}^{n} P\big(h_b(j) = c_k \wedge h_{a,q}(j) = c_k\big)$$

$$\equiv \alpha \sum_{j=1}^{n} P\big(h_b(j) = c_k \mid h_{a,q}(j) = c_k\big)\, P\big(h_{a,q}(j) = c_k\big)$$

Partial Preference Similarity:

$$\mathrm{sim}\big(h_{a,q}, h_b(i)\big) \equiv \alpha \sum_{k=1}^{n} P\big(h_b(i) = c_k \mid h_{a,q}(i) = c_k\big)$$
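One way to read the partial similarity is as a conditional agreement rate: a single hypothesis $h_{a,q}$ of user a selects the items it covers, and the model of user b is checked on those items only. The sketch below estimates $P(h_b(i) = c_k \mid h_{a,q}(i) = c_k)$ by counting, with $\alpha = 1$. The counting estimator, the function name `partial_similarity`, and the toy data are assumptions made for illustration, not the estimator used in the talk.

```python
def partial_similarity(covers, concept, h_b, items):
    """Estimate P(h_b(i) = concept | hypothesis covers i) by counting over covered items."""
    covered = [i for i in items if covers(i)]
    if not covered:
        return 0.0  # the hypothesis filters out every item
    matches = sum(1 for i in covered if h_b(i) == concept)
    return matches / len(covered)


# Made-up item set: item name -> feature set.
items = {
    "trattoria": {"#Italian_Food"},
    "pizzeria": {"#Italian_Food"},
    "sushi_bar": {"#Asian_Food"},
    "diner": {"#American_Food"},
}


def h_b(i):
    # Toy model of user b: likes Italian and Asian food.
    return 5 if items[i] & {"#Italian_Food", "#Asian_Food"} else 2


def likes_italian(i):
    # Hypothesis of user a used as an item filter: "#Italian_Food -> rating 5".
    return "#Italian_Food" in items[i]


def dislikes_asian(i):
    # Second hypothesis of user a: "#Asian_Food -> rating 2".
    return "#Asian_Food" in items[i]


sim_italian = partial_similarity(likes_italian, 5, h_b, list(items))
sim_asian = partial_similarity(dislikes_asian, 2, h_b, list(items))
print(sim_italian, sim_asian)  # 1.0 0.0
```

In this toy setting the two users agree fully on Italian food and not at all on Asian food, which is the kind of distinction a single global similarity value would average away.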

Collaborative Filtering

User-based collaborative filtering (Resnick et al. 1994):

$$\hat{r}_{aj} = \bar{r}_a + \kappa \sum_{b \neq a}^{n} \mathrm{sim}(a, b) \cdot (r_{bj} - \bar{r}_b)$$

Normalization factor $\kappa$:

$$\kappa = \frac{1}{\sum_{b \neq a}^{n} \mathrm{sim}(a, b)}$$

User   Item 1   Item 2   Item 3   Avg
a      2        ?        4        3
b      2        4        5        3.66
c      5        1        1        2.33
Avg    3        2.5      3.33

Similarity   a       b       c
a            1       0.875   0.25
b            0.875   1       0.125
c            0.25    0.125   1

Example calculation:

$$\hat{r}_{a2} = 3 + \frac{0.875 \cdot (4 - 3.66) + 0.25 \cdot (1 - 2.33)}{0.875 + 0.25} = 3 + \frac{0.298 - 0.333}{1.125} = 2.969$$
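The prediction formula and the rating table can be checked with a short script. This is a minimal sketch, not the authors' implementation: the function name `predict` and the dictionary layout are hypothetical, and neighbours without a rating for the target item are skipped (an assumption; in the example both neighbours have rated Item 2). Because the script uses exact user averages instead of the rounded values 3.66 and 2.33, it yields about 2.963 rather than 2.969.

```python
def predict(ratings, sims, a, j):
    """Mean-centred, similarity-weighted prediction of user a's rating for item j."""

    def mean(u):
        # Average over the items user u has actually rated.
        vals = [r for r in ratings[u].values() if r is not None]
        return sum(vals) / len(vals)

    weighted_sum = 0.0
    sim_total = 0.0
    for b in ratings:
        # Assumption: skip the target user and neighbours who have not rated item j.
        if b == a or ratings[b].get(j) is None:
            continue
        weighted_sum += sims[a][b] * (ratings[b][j] - mean(b))
        sim_total += sims[a][b]
    if sim_total == 0:
        return mean(a)  # no usable neighbours: fall back to the user's own average
    return mean(a) + weighted_sum / sim_total  # kappa = 1 / sim_total


# Rating table from the slide; None marks the missing rating.
ratings = {
    "a": {1: 2, 2: None, 3: 4},
    "b": {1: 2, 2: 4, 3: 5},
    "c": {1: 5, 2: 1, 3: 1},
}

# Similarity table from the slide.
sims = {
    "a": {"b": 0.875, "c": 0.25},
    "b": {"a": 0.875, "c": 0.125},
    "c": {"a": 0.25, "b": 0.125},
}

prediction = predict(ratings, sims, "a", 2)
print(round(prediction, 3))  # 2.963
```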

does it work?

Evaluation

Dataset: IMDb (movie features) + Netflix Prize (user ratings); 10’128 movies, 83’029’805 ratings, 479’437 users

Data analysis: avg. number of ratings per user (r/u): 173.2; median r/u: 80; avg. rating: 3.53; rating median: 4

Experimental setting:
Few ratings, few common rated items: 500 users, 50 r/u
Many ratings, many common rated items: 500 users, 200 r/u

Significance test: Wilcoxon signed-ranks test; significance level alpha = 0.01; Bonferroni correction for the family-wise error

Results, 50 ratings/user:

Algorithm            RMSE       MAE        Prec.    Recall   F1
pUMSim (Part)        1.097698   0.898961   66.23%   71.23%   68.64%
UMSim (SVM)          1.077945   0.88902    66.72%   71.33%   68.95%
UMSim (Part)         1.075843   0.885730   66.34%   68.34%   68.34%
CF (Pearson Corr.)   1.131921   0.929923   65.19%   71.14%   68.04%
SVM                  1.309146   0.976800   63.85%   71.68%   67.53%
Part                 1.334507   1.003800   64.32%   70.98%   67.49%

Results, 200 ratings/user:

Algorithm            RMSE       MAE        Prec.    Recall   F1
pUMSim (Part)        1.048786   0.843029   60.90%   66.83%   63.73%
UMSim (SVM)          1.035611   0.835009   60.77%   67.33%   63.88%
UMSim (Part)         1.032746   0.833374   60.89%   67.31%   63.94%
CF (Pearson Corr.)   1.035324   0.832373   60.56%   68.71%   64.38%
SVM                  1.230682   0.896450   58.54%   67.80%   62.83%
Part                 1.292360   0.953600   58.76%   64.72%   61.60%
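For readers unfamiliar with the columns of the result tables, the following sketch shows how RMSE, MAE, precision, recall, and F1 are commonly computed from predicted and actual ratings. The slides do not state how a "relevant" item was defined for precision and recall; the threshold of 4 stars used here is an assumption, and the rating lists are made-up toy data.

```python
import math


def rmse(predicted, actual):
    # Root mean squared error over paired predictions and true ratings.
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def mae(predicted, actual):
    # Mean absolute error over paired predictions and true ratings.
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def precision_recall_f1(predicted, actual, threshold=4):
    # Assumption: an item counts as relevant if its rating is >= threshold.
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p >= threshold and a >= threshold)
    fp = sum(1 for p, a in pairs if p >= threshold and a < threshold)
    fn = sum(1 for p, a in pairs if p < threshold and a >= threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up predictions and true ratings for four items.
predicted = [4.2, 3.1, 4.6, 2.0]
actual = [5, 4, 4, 1]

error_rmse = rmse(predicted, actual)
error_mae = mae(predicted, actual)
precision, recall, f1 = precision_recall_f1(predicted, actual)
print(error_rmse, error_mae, precision, recall, f1)
```

F1 is the harmonic mean of precision and recall; for example, the pUMSim (Part) row at 50 ratings/user gives 2 · 0.6623 · 0.7123 / (0.6623 + 0.7123) ≈ 0.6864, matching the reported 68.64%.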

Conclusion

Model similarity is important

Similarity based on user preference models sometimes significantly outperforms similarity based on common rated items, especially with few common rated items

Partial User Preference Similarity needs further improvement

Preprocessing needed for scalability

Thanks for your attention

References

1. S. S. Anand, P. Kearney, and M. Shapcott. Generating semantically enriched user profiles for web personalization. ACM Transactions on Internet Technology, 2007.
2. M. Balabanovic and Y. Shoham. Fab: Content-based, collaborative recommendation. Communications of the ACM, 1997.
3. C. Basu, H. Hirsh, and W. Cohen. Recommendation as classification: Using social and content-based information in recommendation. In AAAI, 1998.
4. J. Bennett and S. Lanning. The Netflix Prize. KDD Cup and Workshop, 2007.
5. J. S. Breese, D. Heckerman, and C. Kadie. Empirical analysis of predictive algorithms for collaborative filtering. In 14th Conference on Uncertainty in AI, 1998.
6. I. Cantador, A. Bellogín, and P. Castells. A multilayer ontology-based hybrid recommendation model. AI Communications, 2008.
7. E. Frank and I. H. Witten. Generating accurate rule sets without global optimization. In 15th International Conference on Machine Learning, 1998.
8. N. Good, J. B. Schafer, J. A. Konstan, A. Borchers, B. Sarwar, J. Herlocker, and J. Riedl. Combining collaborative filtering with personal agents for better recommendations. In AAAI/IAAI, 1999.
9. J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 2004.
10. D. Lemire and A. Maclachlan. Slope one predictors for online rating-based collaborative filtering. In Proceedings of SIAM Data Mining (SDM’05), 2005.
11. P. Melville, R. J. Mooney, and R. Nagarajan. Content-boosted collaborative filtering for improved recommendations. In AAAI, 2002.
12. S. E. Middleton, H. Alani, and D. C. de Roure. Exploiting synergy between ontologies and recommender systems. In WWW, 2002.
13. S. E. Middleton, N. R. Shadbolt, and D. C. de Roure. Ontological user profiling in recommender systems. ACM Transactions on Information Systems, 2004.
14. T. M. Mitchell. Machine Learning. 1997.
15. P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. GroupLens: an open architecture for collaborative filtering of netnews. In CSCW, 1994.
16. B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In WWW, 2001.
17. I. H. Witten and E. Frank. Data Mining - Practical Machine Learning Tools and Techniques. 2005.

Summary

Amancio Bouza, Gerald Reif, Abraham Bernstein: “Probabilistic Partial User Model Similarity for Collaborative Filtering”, IRMLeS 2009

Similarity based on user preference models is important

User preference similarity is good, but partial user preference similarity is not always; it needs further investigation

Partial user preference similarity is based on the similarity between a hypothesis and a user model:
Hypothesis extraction from user model
Hypothesis as item filter

User Model: $u(i) = h(i) + \varepsilon(i)$
