Noise Injection for Search Privacy Protection

Shaozhi Ye
Department of Computer Science
University of California, Davis

Aug. 28, 2009

Joint work with Felix Wu, Raju Pandey, and Hao Chen.


Outline

1. Search Privacy
2. Noise Injection Model
3. Perfect Privacy Protection
4. Limited and Independent Noise
5. Future Work
6. Summary


1.1 Motivation

Threats to search users:
- Large number of data mining algorithms and machines.
- Data retention windows range from months to years.
- Vulnerable data sanitization designs and improper implementations. AOL Gate: 20M queries from 650K "anonymized" users.
- Insider attacks.


1.2 Search User Profiling

User identification:
- IP address
- HTTP cookies
- Client-side tools: toolbar, desktop

User profiling:
- Queries
- Click-through
- Search preferences: languages, categories
- Rich client side: toolbar, desktop


1.3 Search Privacy Protection

Protection solutions:
- Server side: privacy-preserving data mining
- Network: proxies, Tor
- User side: noise injection

Notable existing tools:
- CustomizeGoogle
- Torbutton
- TrackMeNot

Image credit: Tim Boucher


2.1 Noise Injection Model

Noise injection:
- With probability $\epsilon$, the user sends a true query $Q_u$.
- With probability $1 - \epsilon$, the user sends a noise query $Q_n$.

The search engine observes $Q_s$, where for all $i$:

$$P(Q_s = q_i) = \epsilon P(Q_u = q_i) + (1 - \epsilon) P(Q_n = q_i)$$
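The mixing rule for the observed query can be sketched in a few lines of Python. This is only an illustration, assuming each distribution is a list indexed by the same queries and writing `eps` for the probability that a sent query is a true one; the function name and the example numbers are hypothetical.

```python
def observed_distribution(p_user, p_noise, eps):
    """Return P(Qs = q_i) = eps * P(Qu = q_i) + (1 - eps) * P(Qn = q_i)."""
    if len(p_user) != len(p_noise):
        raise ValueError("both distributions must cover the same queries")
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must be a probability")
    return [eps * u + (1.0 - eps) * n for u, n in zip(p_user, p_noise)]


# Hypothetical example: three possible queries, uniform noise, eps = 0.5.
p_user = [0.7, 0.2, 0.1]
p_noise = [1 / 3, 1 / 3, 1 / 3]
p_seen = observed_distribution(p_user, p_noise, 0.5)
```

The observed distribution is flatter than the user's own distribution, which is the effect the noise is meant to have.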


2.2 Measuring Privacy Breaches

Privacy breach: the distribution of $Q_u$ reveals user profiles. The breach is measured by the mutual information $I(Q_s; Q_u)$.

Problem: find a $Q_n$ such that $I(Q_s; Q_u)$ is minimized.
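A small Python sketch may help make the measure concrete. It assumes a channel view that the slides do not spell out: the observed query equals the true query with probability `eps` and is an independent noise draw otherwise. The function names and example values are hypothetical.

```python
import math


def mutual_information(p_user, channel):
    """I(Qs; Qu) in bits, where channel[i][j] = P(Qs = q_j | Qu = q_i)."""
    n = len(p_user)
    # Marginal distribution of the observed query Qs.
    p_seen = [sum(p_user[i] * channel[i][j] for i in range(n)) for j in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            joint = p_user[i] * channel[i][j]
            if joint > 0.0:
                total += joint * math.log2(channel[i][j] / p_seen[j])
    return total


def independent_noise_channel(p_noise, eps):
    """Assumed channel: Qs equals Qu with probability eps, else a noise draw."""
    n = len(p_noise)
    return [
        [eps * (1.0 if i == j else 0.0) + (1.0 - eps) * p_noise[j] for j in range(n)]
        for i in range(n)
    ]


p_user = [0.7, 0.2, 0.1]
uniform = [1 / 3, 1 / 3, 1 / 3]
leak_no_noise = mutual_information(p_user, independent_noise_channel(uniform, 1.0))
leak_half_noise = mutual_information(p_user, independent_noise_channel(uniform, 0.5))
```

With no noise (`eps = 1`) the leakage equals the entropy of the user's query distribution; adding noise reduces it.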


3. Perfect Privacy Protection

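Theorem: $I(Q_s; Q_u) = 0$ only if $\epsilon \le 1/N_Q$, where $\epsilon$ is the probability of sending a true query and $N_Q$ is the number of distinct queries $q_i$.

Corollary (lower bound on noise for perfect protection):

$$E(|Q_n|) = \frac{1 - \epsilon}{\epsilon} |Q_u| \ge (N_Q - 1) |Q_u|$$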
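To see that the bound can be met, here is a hedged Python sketch of one construction that reaches zero leakage at exactly `eps = 1/N_Q`. It assumes the same channel view as the noise injection model (the observed query is the true query with probability `eps`, otherwise a noise query) and lets the noise depend on the true query. The construction and all names are illustrative, not taken from the slides.

```python
import math


def perfect_noise_channel(num_queries):
    """P(Qs = q_j | Qu = q_i) with eps = 1/N_Q and noise avoiding the true query.

    Assumed construction: given Qu = q_i, the noise is uniform over the other
    N_Q - 1 queries, so every query is observed with probability 1/N_Q.
    Requires num_queries >= 2.
    """
    eps = 1.0 / num_queries
    rows = []
    for i in range(num_queries):
        row = []
        for j in range(num_queries):
            noise = 0.0 if i == j else 1.0 / (num_queries - 1)
            row.append(eps * (1.0 if i == j else 0.0) + (1.0 - eps) * noise)
        rows.append(row)
    return rows


def mutual_information(p_user, channel):
    """I(Qs; Qu) in bits, where channel[i][j] = P(Qs = q_j | Qu = q_i)."""
    n = len(p_user)
    p_seen = [sum(p_user[i] * channel[i][j] for i in range(n)) for j in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            joint = p_user[i] * channel[i][j]
            if joint > 0.0:
                total += joint * math.log2(channel[i][j] / p_seen[j])
    return total


def noise_per_real_query(eps):
    """Expected number of noise queries per true query: (1 - eps) / eps."""
    return (1.0 - eps) / eps


p_user = [0.4, 0.3, 0.2, 0.1]
leak = mutual_information(p_user, perfect_noise_channel(4))
ratio = noise_per_real_query(1.0 / 4)
```

For four possible queries the leakage is zero up to rounding, at a cost of three noise queries per true query, which is `N_Q - 1`.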


Limitations

- Expensive: the whole dictionary must be sent with each query.
- Limited bandwidth: search engines block users to prevent DoS attacks.
- Response delay: the expected waiting time for each real query is $1/\epsilon$.
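The response delay can be checked with a short derivation. It assumes each sent query is independently a real one with probability $\epsilon$, and writes $T$ (a symbol introduced here) for the number of queries sent up to and including the next real one.

```latex
\begin{align*}
P(T = k) &= \epsilon (1 - \epsilon)^{k-1}, \quad k = 1, 2, \dots \\
E(T) &= \sum_{k=1}^{\infty} k \, \epsilon (1 - \epsilon)^{k-1} = \frac{1}{\epsilon} \\
\epsilon \le \frac{1}{N_Q} &\implies E(T) \ge N_Q
\end{align*}
```

Under perfect protection, each real query therefore waits for at least $N_Q$ transmissions on average.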


4. Limited and Independent Noise

Let $Q_u$ and $Q_n$ be independent.

Optimization problem:

$$\arg\min_{n} I(n) \quad \text{subject to} \quad \sum_i n_i = 1, \quad n_i \ge 0 \;\; \forall i$$

where $n = (n_1, n_2, \cdots, n_{N_Q})$.

Solution: we prove that $I$ is a convex function of $n$, then use Lagrange multipliers to solve the optimization problem.
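The optimization can also be explored numerically. The sketch below is not the Lagrange multiplier solution from the talk: it swaps in exponentiated gradient descent on the probability simplex, which is a reasonable fit for a convex objective. It assumes $n_i = P(Q_n = q_i)$, that the observed query is the true query with probability `eps` and an independent noise draw otherwise, and `0 < eps < 1`. All names, the step size, and the example distribution are hypothetical.

```python
import math


def leakage(p_user, p_noise, eps):
    """I(Qs; Qu) in bits for noise drawn independently of the true query."""
    total = 0.0
    for u, n in zip(p_user, p_noise):
        seen = eps * u + (1.0 - eps) * n  # P(Qs = q_j)
        hit = eps + (1.0 - eps) * n       # P(Qs = q_j | Qu = q_j)
        miss = (1.0 - eps) * n            # P(Qs = q_j | Qu != q_j)
        if u > 0.0 and hit > 0.0:
            total += u * hit * math.log2(hit / seen)
        if u < 1.0 and miss > 0.0:
            total += (1.0 - u) * miss * math.log2(miss / seen)
    return total


def leakage_gradient(p_user, p_noise, eps):
    """Partial derivatives of the leakage with respect to each n_j (n_j > 0)."""
    grad = []
    for u, n in zip(p_user, p_noise):
        seen = eps * u + (1.0 - eps) * n
        hit = eps + (1.0 - eps) * n
        miss = (1.0 - eps) * n
        g = 0.0
        if u > 0.0:
            g += u * math.log2(hit / seen)
        if u < 1.0:
            g += (1.0 - u) * math.log2(miss / seen)
        grad.append((1.0 - eps) * g)
    return grad


def optimize_noise(p_user, eps, steps=500, step_size=0.5):
    """Exponentiated gradient descent, starting from uniform noise.

    Returns the best noise distribution seen and its leakage.
    """
    size = len(p_user)
    noise = [1.0 / size] * size
    best, best_leak = noise, leakage(p_user, noise, eps)
    for _ in range(steps):
        grad = leakage_gradient(p_user, noise, eps)
        # Multiplicative update; the floor keeps every n_j strictly positive.
        weights = [max(n * math.exp(-step_size * g), 1e-15)
                   for n, g in zip(noise, grad)]
        norm = sum(weights)
        noise = [w / norm for w in weights]
        leak = leakage(p_user, noise, eps)
        if leak < best_leak:
            best, best_leak = noise, leak
    return best, best_leak


# Hypothetical example: ten queries with popularity proportional to 1/i.
raw = [1.0 / i for i in range(1, 11)]
p_user = [w / sum(raw) for w in raw]
uniform_leak = leakage(p_user, [0.1] * 10, 0.5)
best_noise, best_leak = optimize_noise(p_user, 0.5)
```

Because the search starts from uniform noise and keeps the best iterate, the returned distribution never leaks more than uniform noise does.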


4.2 A Special Case: $E(|Q_n|) = |Q_u|$

Use a Taylor series to replace the logarithm functions and obtain an approximate solution. How close is our solution? The objective function is convex, and increasing the order of the Taylor series improves accuracy.

Caveat: computational cost when $N_Q$ is large.
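Since a true query is sent with probability $\epsilon$ and a noise query with probability $1 - \epsilon$, equal amounts of noise and real queries correspond to $\epsilon = 1/2$. The effect of the Taylor order on accuracy can be illustrated with a generic Python sketch. The expansion point and the test value are assumptions; the slides do not say where the logarithms are expanded.

```python
import math


def taylor_log(x, order, center=1.0):
    """Truncated Taylor series of ln(x) around `center`.

    Converges for |x - center| < center.
    """
    ratio = (x - center) / center
    total = math.log(center)
    for k in range(1, order + 1):
        total += (-1) ** (k + 1) * ratio ** k / k
    return total


# Approximation error at x = 1.3 for increasing orders.
errors = [abs(taylor_log(1.3, order) - math.log(1.3)) for order in (1, 2, 4)]
```

The error shrinks as the order grows, at the price of a higher-degree polynomial system to solve.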


4.3 Simulation Results

How to evaluate? The larger $H(Q_u)$ is, the larger $I(Q_s; Q_u)$ will be, so we use the relative mutual information $I(Q_s; Q_u) / H(Q_u)$.

$Q_u$ follows a power law distribution: the frequency of the $i$-th most popular query is proportional to $i^{-\alpha}$, $\alpha \in [1.0, 5.5]$.

[Figure: relative mutual information (y-axis, 0 to 0.5) versus $N_Q$ (x-axis, 100 to 1000), for optimized noise and uniform noise.]
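The evaluation setup can be reproduced in outline with a short Python sketch. It covers only the uniform-noise baseline and makes several assumptions: `eps = 0.5` (equal amounts of noise and real queries), noise drawn independently of the true query, and `alpha = 1.0`. It does not reproduce the talk's numbers, and the function names are hypothetical.

```python
import math


def power_law(num_queries, alpha):
    """Query distribution with P(q_i) proportional to i ** -alpha."""
    weights = [i ** -alpha for i in range(1, num_queries + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist if p > 0.0)


def leakage(p_user, p_noise, eps):
    """I(Qs; Qu) in bits for noise drawn independently of the true query."""
    total = 0.0
    for u, n in zip(p_user, p_noise):
        seen = eps * u + (1.0 - eps) * n  # P(Qs = q_j)
        hit = eps + (1.0 - eps) * n       # P(Qs = q_j | Qu = q_j)
        miss = (1.0 - eps) * n            # P(Qs = q_j | Qu != q_j)
        if u > 0.0 and hit > 0.0:
            total += u * hit * math.log2(hit / seen)
        if u < 1.0 and miss > 0.0:
            total += (1.0 - u) * miss * math.log2(miss / seen)
    return total


def relative_leakage_uniform(num_queries, alpha, eps=0.5):
    """Relative mutual information I(Qs; Qu) / H(Qu) under uniform noise."""
    p_user = power_law(num_queries, alpha)
    uniform = [1.0 / num_queries] * num_queries
    return leakage(p_user, uniform, eps) / entropy(p_user)


curve = {n: relative_leakage_uniform(n, alpha=1.0) for n in (100, 200, 400)}
```

Each value in `curve` is one point of a uniform-noise curve; sweeping `num_queries` and `alpha` over the ranges quoted above would give a rough analogue of the plotted baseline.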

4.4 Applicability

Privacy information is restricted to a relatively small set of queries.

Scalability: when $N_Q$ increases, the protection offered by random noise gets worse, while our solution does not exhibit such a trend.

Combining network-based solutions with noise injection will help.


5. Future Work

- Allow non-sensitive inferences.
- Allow attackers with external knowledge.
- Allow no prior knowledge of $Q_u$ (adaptive noise generator).
- Impose computational constraints on the attacker.


Summary

- Developed a noise injection model for search privacy protection.
- Proved a lower bound on the number of noise queries required for perfect privacy protection.
- Provided the optimal protection when noise is limited and independent of user queries.
- Computed an approximate solution for the case where the same amount of noise as real queries is injected, and evaluated the result with simulations.


Questions?

Thanks!

