What and why? Privacy preserving collaborative filtering (PPCF) Implementation Tail piece
Perturbation based privacy preserving Slope One predictors for collaborative filtering
Anirban Basu (1), Jaideep Vaidya (2), Hiroaki Kikuchi (1)
(1) Department of Electrical Engineering, Tokai University (Japan)
(2) MSIS Department, Rutgers, The State University of New Jersey (USA)
The 6th Annual IFIP WG 11.11 International Conference on Trust Management (IFIPTM) 21-25 May 2012 – Surat, India
Anirban Basu, Jaideep Vaidya, Hiroaki Kikuchi
Perturbation based PPCF
Recommender systems and privacy
What is this talk about?
1. What and why?
     Recommender systems and privacy
2. Privacy preserving collaborative filtering (PPCF)
     Collaborative filtering (CF), briefly
     Privacy preserving Slope One
3. Implementation
     Performance evaluation
     Additive noise and encrypted query
4. Tail piece
     Conclusions and future avenues
     Question time!
Does this look familiar?
What was I looking at? Canon EOS 7D with a 15-85mm f/3.5-5.6 IS USM lens!
Recommendation and privacy
‘People who have bought this have also bought these’: recommendation, to attract buyers.
Collaborative filtering (CF): a recommendation based on opinions of the community.
What about privacy in rating based collaborative filtering?
Privacy and recommendation on the cloud?
Can someone (the cloud?) compute CF for users?
. . . and do so without compromising privacy of user ratings?
Privacy preserving collaborative filtering (PPCF) for the Software-as-a-Service cloud.
Collaborative filtering (CF), briefly Privacy preserving Slope One
A brief background
CF – a form of recommendation
User-item rating data like this [1][2]:
Alice Bob Carol Dave
Canon 7D 5 3 4
Leica M9 4 5 ? 3
Nikon D7000 2 4 -
... ... ... ... ...
Olympus OM-D 3 3 -
Find a rating for Leica M9 for Carol.
CF – a well-known recommendation technique, based on the preferences of the community.
[1] Note: “-” indicates the absence of a rating.
[2] Note: lack of context and sparseness of data.
Slope One predictors
Based on: Lemire, D., Maclachlan, A. 2005. Slope one predictors for online rating-based collaborative filtering. In: Society for Industrial Mathematics.
CF predictors of the form f(x) = x + b, hence “slope one”.
Simple and efficient (compared with Singular Value Decomposition, Pearson’s Product Moment Correlation Coefficient, Cosine Correlation).
Robust to certain types of data perturbation.
CF and Slope One
Pre-computation phase:
Deviation matrix ∆: deviation of ratings of an item pair by the same user; dimension: n × n, where n is the number of items.
Cardinality matrix φ: number of users who have rated both items of a pair; dimension same as ∆.
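The pre-computation phase can be sketched as below. This is an illustration only, assuming ∆ stores the summed (total) deviation per item pair and that ratings arrive as a nested dictionary; the function name `precompute` and the toy ratings are hypothetical and not taken from the slides.

```python
from collections import defaultdict


def precompute(ratings):
    """Build total-deviation and cardinality matrices from user ratings.

    ratings: dict mapping user -> dict mapping item -> rating (assumed layout).
    Returns (deviation, cardinality) as nested dicts keyed by item, then item.
    """
    deviation = defaultdict(lambda: defaultdict(float))
    cardinality = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for a, r_a in user_ratings.items():
            for b, r_b in user_ratings.items():
                if a == b:
                    continue
                # Deviation of item a from item b, both rated by the same user.
                deviation[a][b] += r_a - r_b
                # One more user has rated both a and b.
                cardinality[a][b] += 1
    return deviation, cardinality


# Hypothetical toy data, not the ratings shown in the slides.
toy = {
    "u1": {"x": 5, "y": 3, "z": 2},
    "u2": {"x": 3, "y": 4},
    "u3": {"y": 2, "z": 5},
}
deviation, cardinality = precompute(toy)
print(deviation["x"]["y"], cardinality["x"]["y"])
```

Both matrices are item-item, so their size depends on the number of items and not on the number of users.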
Figure: The general CF problem.
Figure: Slope One pre-computation creates a ‘model’ which is used for prediction.
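The weighted Slope One predictor
Average deviation:
\[ \bar{\delta}_{a,b} = \frac{\Delta_{a,b}}{\phi_{a,b}} = \frac{\sum_i \delta_{i,a,b}}{\phi_{a,b}} = \frac{\sum_i (r_{i,a} - r_{i,b})}{\phi_{a,b}} \]
$\phi_{a,b}$: the number of users who have rated both items; $\delta_{i,a,b} = r_{i,a} - r_{i,b}$: the deviation of the rating of item a from that of item b, both given by user i.
The weighted Slope One prediction:
\[ r_{u,x} = \frac{\sum_{a \mid a \neq x} (\bar{\delta}_{x,a} + r_{u,a})\,\phi_{x,a}}{\sum_{a \mid a \neq x} \phi_{x,a}} = \frac{\sum_{a \mid a \neq x} (\Delta_{x,a} + r_{u,a}\,\phi_{x,a})}{\sum_{a \mid a \neq x} \phi_{x,a}} \]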
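A minimal sketch of the weighted prediction, assuming the model is held as nested dictionaries of total deviations and cardinalities. The function name `predict`, the dictionary layout and the toy numbers are assumptions made for illustration.

```python
def predict(user_ratings, target, deviation, cardinality):
    """Weighted Slope One prediction of item `target` for one user.

    Evaluates sum(Delta[x][a] + r[u][a] * phi[x][a]) / sum(phi[x][a])
    over the items a != x that the user has rated and that co-occur with x.
    Returns None when no rated item co-occurs with the target.
    """
    numerator = 0.0
    denominator = 0
    for a, r_ua in user_ratings.items():
        if a == target:
            continue
        phi = cardinality.get(target, {}).get(a, 0)
        if phi == 0:
            continue  # no user has rated both items, so no evidence
        numerator += deviation[target][a] + r_ua * phi
        denominator += phi
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical model: total deviations and cardinalities for target item "x".
deviation = {"x": {"y": 1.0, "z": 3.0}}
cardinality = {"x": {"y": 2, "z": 1}}
prediction = predict({"y": 2, "z": 5}, "x", deviation, cardinality)
print(prediction)
```

Only the two item-item matrices and the querying user's own ratings are needed at prediction time.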
Preserving privacy with Slope One CF with perturbation
Pre-computation privacy: introducing random noise . . .
  . . . in individual ratings?
  . . . in deviations of pairwise ratings?
Prediction privacy: with random noise too?
Type of noise
Additive random noise.
Multiplicative random noise.
Noise distributions: Gaussian, Poisson, . . .
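The two noise types can be contrasted with a short sketch. It assumes Gaussian noise with mean 0 for the additive case and mean 1 for the multiplicative case; the helper names, the `sigma` values and the toy ratings are all hypothetical.

```python
import random


def additive_noise(values, sigma, rng):
    """Return values with zero-mean Gaussian noise added to each entry."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def multiplicative_noise(values, sigma, rng):
    """Return values scaled by Gaussian factors centred on 1 (an assumption)."""
    return [v * rng.gauss(1.0, sigma) for v in values]


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
ratings = [5, 3, 4, 2]   # hypothetical ratings
noisy_add = additive_noise(ratings, 1.0, rng)
noisy_mul = multiplicative_noise(ratings, 0.1, rng)
print(noisy_add)
print(noisy_mul)
```

With additive noise the size of the perturbation does not depend on the rating, whereas multiplicative noise perturbs high ratings more than low ones.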
Slope One predictor and additive noise
Figure: Slope One predictor and additive Gaussian noise.
PPCF proposals
A: Add noise in both the pre-computation and the prediction stages. Better performance, lower accuracy.
B: Add noise in the pre-computation stage and use encrypted prediction. Better accuracy, slower performance.
Performance evaluation Additive noise and encrypted query
How does this perform?
Parameters for the evaluation
The scenarios:
A1: Additive Gaussian noise [3] to both ratings and prediction query.
A2: Additive Gaussian noise to deviations and prediction query.
B1: Additive Gaussian noise to ratings but rounded off total deviations for encrypted prediction.
B2: Additive Gaussian noise to deviations but rounded off total deviations for encrypted prediction.
Dataset: MovieLens 100K.
Hardware: 64-bit Mac OS X 10.7.2 and 64-bit Java 1.6.0_29 on Apple MacBook Pro (64-bit 2.53GHz Intel Core i5, 8GB RAM).
[3] Noise distribution given as N(0, 5).
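The difference between perturbing ratings and perturbing deviations can be sketched as follows. Several points are assumptions: the 5 in N(0, 5) is treated as a standard deviation (it could equally be a variance), deviation noise is added to each per-user pairwise deviation, and the function names and toy ratings are hypothetical.

```python
import random


def total_deviations(ratings, rng, rating_sigma=0.0, deviation_sigma=0.0):
    """Return {(a, b): total deviation} with optional additive Gaussian noise.

    rating_sigma > 0 perturbs each rating first (ratings-style, as in A1/B1).
    deviation_sigma > 0 perturbs each pairwise deviation (as in A2/B2).
    """
    totals = {}
    for user_ratings in ratings.values():
        noisy = {item: r + rng.gauss(0.0, rating_sigma)
                 for item, r in user_ratings.items()}
        for a in noisy:
            for b in noisy:
                if a != b:
                    d = noisy[a] - noisy[b] + rng.gauss(0.0, deviation_sigma)
                    totals[(a, b)] = totals.get((a, b), 0.0) + d
    return totals


def rounded(totals):
    """Round totals to integers, as the B scenarios do before encryption."""
    return {pair: round(value) for pair, value in totals.items()}


# Hypothetical toy data, not MovieLens.
toy = {
    "u1": {"x": 5, "y": 3, "z": 2},
    "u2": {"x": 3, "y": 4},
    "u3": {"y": 2, "z": 5},
}
SIGMA = 5.0  # assumption: 5 read as a standard deviation
clean = total_deviations(toy, random.Random(0))
a1_style = total_deviations(toy, random.Random(0), rating_sigma=SIGMA)
a2_style = total_deviations(toy, random.Random(0), deviation_sigma=SIGMA)
b1_style = rounded(a1_style)
print(clean[("x", "y")], a1_style[("x", "y")], a2_style[("x", "y")])
```

In the ratings-style variant the totals for (a, b) and (b, a) stay exact negatives of each other, while in the deviations-style variant each pair receives independent noise.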
Performance results
                                  PPCF strategy  MAE     Prediction time [4]              Stored data
Non-PPCF baseline                 None           0.7019  0.22ms                           Item-item deviation, cardinality matrices
A1                                Perturbation   0.8346  0.23ms                           Item-item deviation, cardinality matrices
A2                                Perturbation   0.8307  0.234ms                          Item-item deviation, cardinality matrices
B1                                Perturbation   0.7113  0.233ms                          Item-item deviation, cardinality matrices
B2                                Perturbation   0.7081  0.231ms                          Item-item deviation, cardinality matrices
Basu et al. (IFIPTM, JISIS 2011)  Encryption     0.7057  4500ms (2048-bit Damgård-Jurik)  Item-item deviation, cardinality matrices
Polat and Du (SAC 2005)           Perturbation   0.7104  Unknown                          z-scored and randomised user-item rating matrix and its singular value decompositions
[4] Note: B1 and B2 prediction times will be significantly higher when encryption is actually used.
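The MAE column is presumably the usual mean absolute error between predicted and held-out ratings; the slides do not spell out the definition, so the sketch below is an assumption, with a hypothetical function name and toy numbers.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and true ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical predictions and held-out ratings.
mae = mean_absolute_error([4.0, 3.5, 2.0], [5, 3, 2])
print(mae)
```

Lower values mean predictions are closer to the held-out ratings.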
Encrypted prediction query
An additively homomorphic cryptosystem – the Paillier cryptosystem, defining homomorphic addition:
\[ E(m_1 + m_2) = E(m_1) \cdot E(m_2) \]
and homomorphic multiplication:
\[ E(m_1 \cdot \pi) = E(m_1)^{\pi} \]
where $m_1$ and $m_2$ are plaintexts and $\pi$ is a plaintext multiplicand.
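Both homomorphic properties can be checked with a toy Paillier implementation. This sketch uses tiny hard-coded primes and the common generator choice g = n + 1, so it is insecure and purely illustrative. The function names are hypothetical, and `pow(x, -1, n)` requires Python 3.8 or later.

```python
import math
import random


def keygen(p, q):
    """Return (n, private key) for toy primes p and q, with g fixed to n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # with g = n + 1, L(g^lam mod n^2) equals lam mod n
    return n, (lam, mu)


def encrypt(n, m, rng):
    """Encrypt integer m (0 <= m < n) as (1 + n)^m * r^n mod n^2."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2


def decrypt(n, private_key, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


n, private_key = keygen(61, 53)  # toy primes, far too small for real use
rng = random.Random(1)
c1 = encrypt(n, 12, rng)
c2 = encrypt(n, 30, rng)

# Homomorphic addition: multiplying ciphertexts adds the plaintexts.
sum_plain = decrypt(n, private_key, c1 * c2 % (n * n))
# Homomorphic multiplication: exponentiation scales the plaintext.
scaled_plain = decrypt(n, private_key, pow(c1, 7, n * n))
print(sum_plain, scaled_plain)
```

Plaintexts are integers modulo n, so results only match ordinary arithmetic while they stay below n.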
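Based on the previous equation for plaintext Slope One predictors, we can write:
\[ \sum_{a \mid a \neq x} (\Delta_{x,a} + r_{u,a}\,\phi_{x,a}) = D\Big(\prod_{a \mid a \neq x} \big(E(\Delta_{x,a})\,E(r_{u,a})^{\phi_{x,a}}\big)\Big) \]
Optimising the numerator, the final prediction is:
\[ r_{u,x} = \frac{D\Big(E\big(\sum_{a \mid a \neq x} \Delta_{x,a}\big) \prod_{a \mid a \neq x} E(r_{u,a})^{\phi_{x,a}}\Big)}{\sum_{a \mid a \neq x} \phi_{x,a}} \]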
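The step from a product of ciphertexts to a sum of plaintexts follows from the two homomorphic properties. A sketch of the expansion, ignoring reduction modulo the Paillier modulus and assuming all plaintexts are integers:

```latex
\begin{align*}
D\Big(\prod_{a \mid a \neq x} E(\Delta_{x,a})\,E(r_{u,a})^{\phi_{x,a}}\Big)
  &= D\Big(\prod_{a \mid a \neq x} E(\Delta_{x,a})\,E(r_{u,a}\,\phi_{x,a})\Big)
     && \text{(homomorphic multiplication)} \\
  &= D\Big(\prod_{a \mid a \neq x} E(\Delta_{x,a} + r_{u,a}\,\phi_{x,a})\Big)
     && \text{(homomorphic addition)} \\
  &= D\Big(E\Big(\sum_{a \mid a \neq x} (\Delta_{x,a} + r_{u,a}\,\phi_{x,a})\Big)\Big)
     && \text{(homomorphic addition)} \\
  &= \sum_{a \mid a \neq x} (\Delta_{x,a} + r_{u,a}\,\phi_{x,a}).
\end{align*}
```

The optimisation uses the same property in reverse: the product of the $E(\Delta_{x,a})$ decrypts to the same value as $E\big(\sum_{a \mid a \neq x} \Delta_{x,a}\big)$, so the deviations can be summed in plaintext and encrypted once instead of once per item. The integer requirement is presumably why the B scenarios round off the total deviations.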
Cloud deployment scenario for B2
Conclusions and future avenues Question time!
Let’s wrap up
Conclusions
Privacy preserving collaborative filtering with perturbation.
Additive random noise to Slope One predictors.
Level of privacy and level of accuracy are orthogonal.
Optimal combination of perturbation and encryption for privacy.
Future work: prototype implementation on a SaaS cloud – Google App Engine for Java.
Thank you for listening!
Any questions?