Collaborative filtering and privacy

Related work and background

Proposed scheme

Evaluation

Tailpiece

Privacy-preserving collaborative filtering for the cloud

Anirban Basu¹, Jaideep Vaidya², Theo Dimitrakos³, Hiroaki Kikuchi¹

¹ Graduate School of Engineering, Tokai University, Japan

² MSIS Department, Rutgers, The State University of New Jersey, USA

³ Research & Technology, British Telecom, UK

IEEE Cloudcom 2011, Athens, Greece

Anirban Basu, et al.

Cloud based privacy preserving CF

1/22


Outline

1. Collaborative filtering and privacy
   Recommendation through collaborative filtering
   Collaborative filtering on the cloud and privacy
   The research problem
   Our contributions
2. Related work and background
   Privacy preserving CF – the state-of-the-art
   Slope One – a collaborative filtering predictor
   The generalised weighted Slope One
3. Proposed scheme
   Privacy-preserving CF
   Piecing it together
4. Evaluation
   Implementation and results
5. Tailpiece
   Conclusions and future work
   Question time!


Recommendation through collaborative filtering

Recommendation and CF

A recommendation example: Amazon’s “people who buy this also buy that” (user profile analysis). Rating-based collaborative filtering (CF) – another mechanism for recommendation.


[Figure: sparse user-item rating matrix (m x n), with users u_1 ... u_m as rows and items i_1 ... i_n as columns; the entry for user u_x and item i_k is unknown.]

The task is to predict the rating user u_x will give to item i_k given the sparse user-item rating matrix.


CF: an illustrative example

An airline example (“-” denotes a missing rating):

Alice Bob Tracy Steve

Virgin Atlantic 3 3 3

Emirates ? 4 2 3

Singapore Airlines 5 5 4 -

Predict: how would Alice rate Emirates?


Collaborative filtering on the cloud and privacy

CF on the cloud – privacy risks

Recommendation providers may run on cloud computing infrastructures. Your private rating data may not be safe on the cloud because of insider and outsider threats.


[Figure: users submit ratings to, and query rating predictions from, a cloud computing infrastructure backed by distributed storage. Markers in the figure indicate where privacy concerns are raised.]


The research problem

Research problem: privacy preserving CF

Compute a privacy-preserving rating prediction on a Software-as-a-Service (SaaS) construction Platform-as-a-Service (PaaS) cloud, such that the scheme:
hides and/or de-links users' private rating data without using any trusted third party, and
is robust to insider threats from the cloud,
while assuming honest-but-curious users and identity-concealing network infrastructures (e.g. anonymous networks, pseudonyms, IPv4 NAT).


Our contributions

Our contributions

A privacy preserving CF solution for the Google App Engine for Java (GAE/J) [1] – a specialised SaaS construction PaaS cloud.
Can be extended to vertical partitions [2].
Feasible on a real world public PaaS cloud.

[1] http://code.google.com/appengine/
[2] See § IV.C in the paper. Left out of this presentation for simplicity.


Privacy preserving CF – the state-of-the-art

Types of collaborative filtering

CF can be: either memory based using similarity or deviations between users (user-based) or items (item-based); or model based, such as utilising the singular value decomposition technique.


Privacy-preserving CF

Classified, as per mechanism:
Encryption based – where privacy of data is preserved through homomorphic encryption [Canny2002a, Han2009].
Randomisation based – where privacy of data is preserved through random perturbation of the data or by anonymising identities [Polat2003, 2005 and 2006].

Classified, as per infrastructure, PPCF can be:
single machine or single cluster based [Tada2010, Basu2011], or
large-scale distributed [Berkovsky2007, Canny2002b].


I will not bore you with bibliography slides at the end... Please see the paper for detailed references of the cited work.


Slope One – a collaborative filtering predictor

What is Slope One?

The original paper on SlopeOne CF: Lemire, D., Maclachlan, A. 2005. Slope one predictors for online rating-based collaborative filtering. In: Society for Industrial Mathematics.


Collaborative filtering (CF) predictors of the form f(x) = x + b, hence “slope one”.
Weighted version is based on pre-computed average deviations between ratings of items, weighted by relative cardinalities of pairs of items.
Accurate, fast and incrementally updatable.


Why Slope One?

The choice of the CF scheme affects performance and privacy on the cloud.
Traditional user-based or item-based CF requires storage of private rating data; easy to update but slow to query.
Low-rank matrix approximations (e.g. SVD) are difficult to compute incrementally; otherwise slow to update from stored private rating data but fast to query.
Slope One uses an incrementally updatable item-item matrix model; fast to update and fast to query.


The generalised weighted Slope One

The weighted Slope One

The average deviation of ratings from item a to item b is given as:

$$\delta_{a,b} = \frac{\Delta_{a,b}}{\phi_{a,b}} = \frac{\sum_i \delta_{i,a,b}}{\phi_{a,b}} = \frac{\sum_i (r_{i,a} - r_{i,b})}{\phi_{a,b}} \qquad (1)$$

where $\phi_{a,b}$ is the count of the users who have rated both items, while $\delta_{i,a,b} = r_{i,a} - r_{i,b}$ is the deviation of the rating of item a from that of item b, both given by user i. Thus, the rating for user u and item x using the weighted Slope One is predicted as:

$$r_{u,x} = \frac{\sum_{a|a \neq x} (\delta_{x,a} + r_{u,a})\,\phi_{x,a}}{\sum_{a|a \neq x} \phi_{x,a}} = \frac{\sum_{a|a \neq x} (\Delta_{x,a} + r_{u,a}\,\phi_{x,a})}{\sum_{a|a \neq x} \phi_{x,a}} \qquad (2)$$
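Equations (1) and (2) can be sketched in plain, unencrypted Python. This is only an illustration: the function names `precompute` and `predict` and the small rating table are hypothetical and do not come from the paper, and the sum is restricted to items that the querying user has rated and that share at least one co-rating with x.

```python
from collections import defaultdict

def precompute(ratings):
    """Build total-deviation (Delta) and cardinality (phi) tables.

    ratings maps user -> {item: rating}.
    """
    delta = defaultdict(float)
    phi = defaultdict(int)
    for user_ratings in ratings.values():
        for a, r_a in user_ratings.items():
            for b, r_b in user_ratings.items():
                if a != b:
                    delta[(a, b)] += r_a - r_b  # numerator of equation (1)
                    phi[(a, b)] += 1            # users who rated both a and b
    return delta, phi

def predict(user_ratings, x, delta, phi):
    """Weighted Slope One prediction for item x, following equation (2)."""
    num = 0.0
    den = 0
    for a, r_ua in user_ratings.items():
        weight = phi.get((x, a), 0)
        if a != x and weight > 0:
            num += delta[(x, a)] + r_ua * weight
            den += weight
    return num / den if den else None

# Hypothetical ratings, for illustration only.
ratings = {
    "u1": {"A": 5, "B": 3, "C": 2},
    "u2": {"A": 3, "B": 4},
    "u3": {"B": 2, "C": 5},
}
delta, phi = precompute(ratings)
prediction = predict(ratings["u3"], "A", delta, phi)
```

Here the numerator is (1 + 2·2) + (3 + 5·1) = 13 and the denominator is 2 + 1 = 3, so the predicted rating of item A for u3 is 13/3.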


Pre-computed incrementally updatable matrices

Weighted Slope One predictor has the following two pre-computed, incrementally updatable matrices.
Deviation matrix or ∆: each element is the total deviation of ratings between a pair of items, calculated over cases where both items have been rated by the same user. If the ratings matrix is of dimension m x n (i.e. n items) then ∆ is of dimension n x n.
Cardinality matrix or φ: each element is the count of the cases where items in a pair have been both rated by the same user. It is of the same dimension as ∆.
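The incremental nature of the two matrices can be sketched as follows. This is a minimal illustration that assumes integer ratings and stores both orderings of each item pair; the helper names `add_pair`, `remove_pair` and `update_pair` are hypothetical and are not the algorithms listed in the paper.

```python
from collections import defaultdict

delta = defaultdict(int)  # total deviation per ordered item pair
phi = defaultdict(int)    # co-rating count per ordered item pair

def add_pair(a, r_a, b, r_b):
    """Record that one user rated item a as r_a and item b as r_b."""
    delta[(a, b)] += r_a - r_b
    delta[(b, a)] += r_b - r_a
    phi[(a, b)] += 1
    phi[(b, a)] += 1

def remove_pair(a, r_a, b, r_b):
    """Undo a previously recorded rating pair."""
    delta[(a, b)] -= r_a - r_b
    delta[(b, a)] -= r_b - r_a
    phi[(a, b)] -= 1
    phi[(b, a)] -= 1

def update_pair(a, old_a, new_a, b, old_b, new_b):
    """Replace an old rating pair with a new one; the cardinality is unchanged."""
    remove_pair(a, old_a, b, old_b)
    add_pair(a, new_a, b, new_b)

# Hypothetical usage: two users rate items A and B, then the second
# user revises the rating of A from 3 to 4.
add_pair("A", 5, "B", 3)
add_pair("A", 3, "B", 4)
update_pair("A", 3, 4, "B", 4, 4)
```

Each operation touches only the entries for the affected item pair, so no stored per-user rating vector is needed to keep ∆ and φ current.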


[Figure: the Slope One pre-computation phase turns the sparse user-item rating matrix (m x n) into sparse item-item deviation and cardinality matrices (n x n); a marker in the figure indicates private data.]


Privacy-preserving CF

Additively homomorphic Paillier cryptosystem

homomorphic addition: $E(m_1 + m_2) = E(m_1) \cdot E(m_2)$
homomorphic multiplication: $E(m_1 \cdot \pi) = E(m_1)^{\pi}$

We denote encryption and decryption functions as E() and D() respectively, with plaintext messages $m_1$, $m_2$ and integer multiplicand $\pi$.
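The two homomorphic properties can be checked with a toy Paillier implementation. The sketch below is a textbook construction with generator g = n + 1 and deliberately tiny primes; it is an assumption made for illustration, not the implementation used by the authors, and it must not be used for real data. It relies on `pow(x, -1, n)`, which needs Python 3.8 or later.

```python
import math
import secrets

def keygen(p, q):
    """Return (public modulus n, private (lam, mu)) from two distinct primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """E(m) = (n + 1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 equals 1 + m*n, so it is computed directly.
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    """D(c) = L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n

n, priv = keygen(1009, 1013)  # toy primes, far too small for real use
n2 = n * n
c1, c2 = encrypt(n, 7), encrypt(n, 35)

# Homomorphic addition: multiplying ciphertexts adds plaintexts (7 + 35).
sum_plain = decrypt(n, priv, c1 * c2 % n2)
# Homomorphic multiplication: exponentiation by pi = 6 multiplies (7 * 6).
prod_plain = decrypt(n, priv, pow(c1, 6, n2))
```

Both `sum_plain` and `prod_plain` decrypt to 42, although the party combining the ciphertexts never sees 7 or 35.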


Encrypted prediction query

Based on the previous equation for plaintext Slope One predictors, we can write:

$$\sum_{a|a \neq x} (\Delta_{x,a} + r_{u,a}\,\phi_{x,a}) = D\Big(\prod_{a|a \neq x} \big(E(\Delta_{x,a})\,E(r_{u,a})^{\phi_{x,a}}\big)\Big) \qquad (3)$$

and reducing the number of encryptions, the final prediction is given as:

$$r_{u,x} = \frac{D\Big(E\big(\sum_{a|a \neq x} \Delta_{x,a}\big) \prod_{a|a \neq x} E(r_{u,a})^{\phi_{x,a}}\Big)}{\sum_{a|a \neq x} \phi_{x,a}} \qquad (4)$$
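Equation (4) can be sketched end to end with a toy Paillier scheme. Everything below is a hedged illustration rather than the authors' implementation: the function `cloud_predict`, the tiny primes, the sample ∆ and φ values, the convention that the upper half of the plaintext range encodes negative numbers, and the choice to return the plaintext denominator alongside the ciphertext are all assumptions.

```python
import math
import secrets

# Toy Paillier with generator g = n + 1 (illustrative only).
def keygen(p, q):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n))

def encrypt(n, m):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + (m % n) * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    return m - n if m > n // 2 else m  # assumption: upper half means negative

# Cloud side: sees only ciphertexts and its own plaintext Delta and phi.
def cloud_predict(n, x, enc_ratings, delta, phi):
    n2 = n * n
    delta_sum = 0
    denom = 0
    prod = 1
    for a, c in enc_ratings.items():
        weight = phi.get((x, a), 0)
        if a == x or weight == 0:
            continue
        delta_sum += delta[(x, a)]
        denom += weight
        prod = prod * pow(c, weight, n2) % n2      # E(r_ua)^phi_xa
    enc_num = encrypt(n, delta_sum) * prod % n2    # E(sum of Delta) * product
    return enc_num, denom

# User side: encrypt own ratings, query, then decrypt locally.
n, priv = keygen(1009, 1013)                       # toy primes only
delta = {("A", "B"): 1, ("A", "C"): 3}             # hypothetical stored data
phi = {("A", "B"): 2, ("A", "C"): 1}
query = {"B": encrypt(n, 2), "C": encrypt(n, 5)}   # user's ratings of B and C
enc_num, denom = cloud_predict(n, "A", query, delta, phi)
prediction = decrypt(n, priv, enc_num) / denom
```

The cloud performs one encryption (of the summed deviations) regardless of how many items are in the query, which is the saving that equation (4) has over equation (3). With the sample values the decrypted numerator is 13 and the denominator is 3.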


Privacy preserving Slope One

Since ∆ and φ are not private information with respect to user data, these are stored unencrypted in the cloud.
These matrices are updated as ratings of items are added, updated or deleted in pairs.
The proposed solution uses a user-encrypted prediction query and response.


Piecing it together

Overview of the proposed scheme

[Figure: through an identity anonymiser, the user submits plaintext pair-wise ratings or deviations of ratings to the CF application (a cloud app instance) on the Google App Engine (GAE/J) or other PaaS cloud, which stores plaintext deviations and cardinalities in the distributed datastore. To query, the user sends a rating vector encrypted with the user's public key; a CF application cloud app instance computes the encrypted prediction from stored data and returns an encrypted prediction which only the user can decrypt.]


De-linking identities with IPv4 NAT

A simple IPv4 NAT can provide a naïve approach to make linkability between actual users and their WAN side IPs hard.

[Figure: user computers on the LAN side connect through a local router (NAT) to the ISP router on the WAN side, and from there to the cloud application.]

Dynamic WAN IP and NAT create a level of unlinkability between real users and the router's WAN-side IP visible to the cloud.


Addition, update, deletion and prediction of ratings [3]

[Figure: UML sequence diagram for addition, update or deletion of data between any one user and the cloud-based CF site. The user, whose client uses identity anonymising techniques, adds, updates or removes a rating pair or deviation of ratings for an item pair; the CF site updates the plaintext deviation and cardinality matrices.]

[3] See algorithms IV.1-IV.3 in the paper.


[Figure: UML sequence diagram for prediction between any one user and the cloud-based CF site. The user sends a prediction query encrypted with the user's public key; the CF site computes the encrypted prediction and returns the prediction response, also encrypted with the user's public key; the user decrypts the response locally.]


Implementation and results

Google App Engine for Java (GAE/J)

Specialised SaaS construction PaaS cloud.
SaaS application instances run on Java Virtual Machines with web front-ends.
Automatically allocated scalable resources for growing user requests.
Slow but high replication datastore access; and fast distributed in-memory cache.


Low CPU performance per application instance: affects cryptographic operations.
Various performance limitations on the free quota.
As of July 22, 2011, we measured some limitations of the GAE/J.

Feasibility of a PPCF scheme on the GAE/J

A. Basu, J. Vaidya, T. Dimitrakos, H. Kikuchi, Feasibility of a privacy preserving collaborative filtering scheme on the Google App Engine – a performance case study, Proceedings of the 27th ACM Symposium on Applied Computing (SAC) Cloud Computing track, Trento, Italy, 2012.

Performance results on the GAE/J

Bit size (a)   Vector size (b)   Prediction time
1024           5                 500 ms
1024           10                650 ms
2048           5                 3800 ms
2048           10                5000 ms

(a) Paillier cryptosystem modulus bit size, i.e. |n|.
(b) Size of the encrypted rating query vector.
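
Both table parameters bear on the cost of the homomorphic arithmetic: Paillier ciphertexts are manipulated modulo n squared, and the cloud performs one modular exponentiation per element of the encrypted query vector. The following is a minimal toy sketch of that idea, not the authors' implementation. It assumes textbook Paillier with g = n + 1, and it assumes that the cloud holds plaintext integer deviations and weights and combines them with the encrypted ratings into a weighted Slope One style numerator. All function names, the sample values, and the tiny (insecure) primes are hypothetical and chosen only for illustration.

```python
import math
import secrets

def keygen(p, q):
    """Toy Paillier key generation with g = n + 1 (assumed variant)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n (Python 3.8+)
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m; negative values are encoded modulo n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    # (1 + n)^m mod n^2 equals 1 + m*n mod n^2, so no exponentiation is needed for g^m.
    return (1 + (m % n) * n) * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    """Decrypt c and map the result back to a signed integer."""
    lam, mu = priv
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    return m - n if m > n // 2 else m

def encrypted_numerator(n, enc_ratings, deviations, weights):
    """Cloud side (assumed): E(sum_i w_i * (d_i + r_i)) without seeing any r_i."""
    n2 = n * n
    # The plaintext part sum_i w_i * d_i is encrypted under the user's public key.
    acc = encrypt(n, sum(w * d for w, d in zip(weights, deviations)))
    # E(r)^w = E(w * r): one modular exponentiation per query element.
    for c, w in zip(enc_ratings, weights):
        acc = acc * pow(c, w, n2) % n2
    return acc

# Illustrative run with two small Mersenne primes (far too small for real use).
n, priv = keygen(2**31 - 1, 2**61 - 1)
ratings = [4, 2, 5]        # user's private ratings (hypothetical)
deviations = [1, -2, 0]    # integer item-pair deviations held by the cloud (hypothetical)
weights = [3, 1, 2]        # co-rating counts held by the cloud (hypothetical)

query = [encrypt(n, r) for r in ratings]                  # user side
reply = encrypted_numerator(n, query, deviations, weights)  # cloud side
numerator = decrypt(n, priv, reply)                       # user side
prediction = numerator / sum(weights)
```

In this sketch the loop in `encrypted_numerator` is where the vector size enters, and the size of `n` determines how expensive each `pow` call is, which is consistent with the trend in the table.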

Time taken to predict grows linearly with the size of the query vector.
With 100 given ratings in the query vector, the prediction time will be about 50 seconds, an awfully long wait on a web interface!
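
The 50-second estimate can be checked with simple arithmetic on the four measurements. The sketch below is an illustration added here, with hypothetical function names. It shows two rough extrapolations: scaling a single measurement in proportion to the vector size, and fitting a straight line through the 5- and 10-element measurements for the same bit size. With only two points per bit size, both should be read as order-of-magnitude estimates.

```python
# Measured GAE/J prediction times from the table: (bit size, vector size) -> ms.
MEASURED_MS = {
    (1024, 5): 500,
    (1024, 10): 650,
    (2048, 5): 3800,
    (2048, 10): 5000,
}

def proportional_estimate_ms(bits, known_size, target_size):
    """Scale one measurement in direct proportion to the vector size."""
    return MEASURED_MS[(bits, known_size)] * target_size / known_size

def two_point_estimate_ms(bits, target_size):
    """Fit a straight line through the 5- and 10-element measurements."""
    t5, t10 = MEASURED_MS[(bits, 5)], MEASURED_MS[(bits, 10)]
    slope = (t10 - t5) / (10 - 5)   # ms per additional query element
    intercept = t5 - slope * 5      # fixed per-request overhead in ms
    return intercept + slope * target_size

if __name__ == "__main__":
    for bits in (1024, 2048):
        print(bits,
              proportional_estimate_ms(bits, 10, 100),
              two_point_estimate_ms(bits, 100))
```

Proportional scaling of the 2048-bit, 10-element measurement gives 50 000 ms, which is one way to arrive at the figure of about 50 seconds. The two-point line for 2048 bits gives a lower value because it treats part of the measured time as fixed overhead. Either way the wait is long for an interactive web page.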

Demo

Google App Engine for Java implementation: http://gaejppcf.appspot.com/.
Attack simulation on private data: in both cases, the cloud application tracks the user's IPv4 address, a typical attack scenario that attempts to link ratings to users.

Conclusions and future work

Conclusions

Our proposed scheme:
uses a user-encrypted prediction query and does not store users' rating data;
makes rating-to-user linkability hard; and
scales well on real-world cloud platforms.

Future work

Implement the proposal on a vertical partition of the data.
Introduce parallelism in prediction queries with large query vectors.
Conduct comparative performance analyses with other privacy preserving CF implementations on different SaaS construction PaaS clouds, e.g. Amazon Elastic Beanstalk.
Improve our scheme by discarding some assumptions (e.g. honest user) and dependencies (e.g. anonymiser networks).

Question time!

Thank you for listening!

Any questions?
