Enabling Efficient Content Location and Retrieval in Peer-to-Peer Systems by Exploiting Locality in Interests
Kunwadee Sripanidkulchai, Bruce Maggs, Hui Zhang, Carnegie Mellon University

Challenges posed by peer-to-peer
• Want scalable and high performance content location and peer selection. Existing solutions provide scalable location, but have not addressed peer selection.
• Retrieval performance between end-hosts is highly variable and dynamic.

[Figure labels: Gnutella overlay; Peer list overlay; Content]

May 1, 2001

High variability (σ ≈ 1 sec) in ping times over a 24-hour period to a random end-host on the Internet (typical for 1/3 of 2400 end-hosts pinged in our experiments)

[Figure: Ping Time (ms) versus Time, 18:00:00 to 12:00:00]

3) The list evolves as more content is retrieved and more peers are discovered

Content location
Queries for content are sent to peers in the list

• Need to use up-to-date performance state to select a peer
• For scalability, cannot maintain up-to-date state for all peers
• Which peers should we maintain state for?
  - Peers that have locality in interests
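One simple way to quantify locality in interests between two peers is the fraction of one peer's content that another peer also holds; the poster's figure labels peers with fractions such as 2/3 and 3/3, which is consistent with this reading. The sketch below is illustrative only: the function name and the peer ids `p1` to `p4` are hypothetical, and the poster does not state that this exact measure is used.

```python
from fractions import Fraction


def interest_overlap(mine, theirs):
    """Fraction of my content that the other peer also holds."""
    if not mine:
        return Fraction(0)
    return Fraction(len(mine & theirs), len(mine))


# Content sets taken from the poster's figure; peer ids are made up.
mine = {"A", "B", "C"}
others = {
    "p1": {"A", "B", "C", "D"},
    "p2": {"A", "C", "D", "E"},
    "p3": {"D", "E", "F"},
    "p4": {"F", "G", "H"},
}

overlaps = {p: interest_overlap(mine, s) for p, s in others.items()}
# Peers with the largest overlap are the best candidates to keep state for.
ranked = sorted(others, key=lambda p: overlaps[p], reverse=True)
```

Under this measure only `p1` and `p2` share any interests with the reference peer, so they are the peers worth maintaining performance state for.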

Locality in interests
Observation: people share common interests. Can we exploit this to improve content location and retrieval?

[Figure: a peer holding content A, B, C and four other peers, each labeled with its content and the fraction of A, B, C it holds: A, B, C, D (3/3); A, C, D, E (2/3); D, E, F (0/3); F, G, H (0/3)]

Peer selection and content retrieval
Fine-grained dynamic performance state can be maintained for peers on the list

Potential benefits and overhead
• Use Boeing corporate web proxy traces to drive the request stream for the simulations
• Treat a request for a new document (a compulsory cache miss for a web cache) as a publish in peer-to-peer system
• Ran simulations over a period of 5 minutes to 3 hours
• Content location algorithms are based on
  - Asking random peers
  - Asking peers with same interests (1 hop)
  - Asking peers of peers with same interests (2 hops)

Proposed solution

A distributed algorithm for peers to self-organize into clusters based on interests (peer list)
Why is it easier to incorporate dynamic performance state when using locality in interests to locate and retrieve content?
- Only need to keep performance state for peers that are likely to provide the content one is looking for.

Peer list
1) Each peer maintains a list of peers who share the same interests
2) Peer lists are initially bootstrapped using existing protocols, such as Gnutella, Tapestry, Pastry, CAN, or Chord. We use the following heuristic: peers that have the content you are looking for have the same interests.
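The peer list behaviour described on this poster (query list peers first, bootstrap through an existing protocol, add any peer that had the content) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the class and function names, the size cap, and the least-recently-useful eviction rule are all hypothetical, and `fallback` merely stands in for a lookup through Gnutella, Chord, or a similar protocol.

```python
from collections import OrderedDict


class PeerList:
    """Bounded list of peers believed to share this peer's interests."""

    def __init__(self, max_size=10):
        # Assumed cap; the poster only says the list is small.
        self.max_size = max_size
        self._peers = OrderedDict()  # peer id -> times it had what we wanted

    def record_hit(self, peer_id):
        """Poster heuristic: a peer that had our content shares our interests."""
        self._peers[peer_id] = self._peers.get(peer_id, 0) + 1
        self._peers.move_to_end(peer_id)         # most recently useful last
        while len(self._peers) > self.max_size:  # assumed eviction policy
            self._peers.popitem(last=False)

    def peers(self):
        """Peers to query, most recently useful first."""
        return list(reversed(self._peers))


def locate(content, peer_list, holdings, fallback):
    """Ask list peers first; fall back to an existing protocol on a miss.

    holdings maps peer id -> set of content (stand-in for a network query).
    fallback(content) returns a peer id or None.
    """
    for peer_id in peer_list.peers():
        if content in holdings.get(peer_id, set()):
            peer_list.record_hit(peer_id)
            return peer_id, "list"
    peer_id = fallback(content)
    if peer_id is not None:
        peer_list.record_hit(peer_id)  # the list evolves as content is found
        return peer_id, "bootstrap"
    return None, "miss"


holdings = {"p1": {"A", "B", "C", "D"}, "p2": {"F", "G", "H"}}


def flood(content):
    """Toy fallback: scan every peer, as a flooding search would."""
    return next((p for p in sorted(holdings) if content in holdings[p]), None)


my_list = PeerList(max_size=2)
first = locate("A", my_list, holdings, flood)   # served via the fallback
second = locate("B", my_list, holdings, flood)  # served via the peer list
```

In this toy run the first lookup has to go through the fallback, after which `p1` is on the list and answers the second lookup directly.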

[Figures: two plots against Simulation length (s), 0 to 12000: Miss rate (%) and Number of peers. Plot labels: random, content-based 1 hop, content-based 2 hops, max, min, average over 16 runs]

Locating content amongst peers with locality in interests results in low miss rates

Maintaining a small list of peers who share the same interests provides good hit rate
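The three content location strategies compared in the simulations (asking random peers, asking list peers at 1 hop, asking peers of list peers at 2 hops) can be illustrated with a small trace-driven replay. This is a sketch under strong assumptions, not the authors' simulator: the Boeing traces are replaced by a made-up synthetic workload, the list size of 5 is arbitrary, and after every lookup the requester is assumed to obtain the document and learn one holder regardless of strategy. All function and parameter names are hypothetical.

```python
import random


def miss_rate(trace, strategy, list_size=5, seed=0):
    """Replay (peer, doc) requests; return the lookup miss rate in percent."""
    rng = random.Random(seed)
    holdings = {}      # peer -> documents it has published or retrieved
    lists = {}         # peer -> peers that earlier held something it wanted
    published = set()
    lookups = misses = 0
    for peer, doc in trace:
        mine = holdings.setdefault(peer, set())
        my_list = lists.setdefault(peer, [])
        if doc in mine:
            continue                   # already held locally
        if doc not in published:       # first request anywhere: a publish
            published.add(doc)
            mine.add(doc)
            continue
        others = [p for p in holdings if p != peer]
        if strategy == "random":
            asked = rng.sample(others, min(list_size, len(others)))
        elif strategy == "1hop":
            asked = list(my_list)
        elif strategy == "2hop":
            asked = list(my_list)
            for p in my_list:
                asked.extend(lists.get(p, []))
        else:
            raise ValueError(strategy)
        lookups += 1
        if not any(doc in holdings[p] for p in asked):
            misses += 1
        # Simplification: the requester always ends up with the document and
        # learns one holder (the earliest-joined one), whatever the strategy.
        holder = next(p for p in others if doc in holdings[p])
        if holder in my_list:
            my_list.remove(holder)
        my_list.insert(0, holder)
        del my_list[list_size:]
        mine.add(doc)
    return 100.0 * misses / lookups if lookups else 0.0


def synthetic_trace(n_groups=4, peers_per_group=5, docs_per_group=30,
                    n_requests=2000, seed=1):
    """Made-up workload: peers only request documents from their group's pool."""
    rng = random.Random(seed)
    trace = []
    for _ in range(n_requests):
        g = rng.randrange(n_groups)
        peer = f"g{g}-p{rng.randrange(peers_per_group)}"
        doc = f"g{g}-d{rng.randrange(docs_per_group)}"
        trace.append((peer, doc))
    return trace


trace = synthetic_trace()
rates = {s: miss_rate(trace, s) for s in ("random", "1hop", "2hop")}
```

Because the 2-hop strategy asks every peer the 1-hop strategy asks plus their list peers, its miss rate in this sketch can never exceed the 1-hop miss rate. The absolute numbers depend entirely on the synthetic workload and say nothing about the poster's results.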

Implementation status
• Refining algorithm by ranking peers in one’s list to select peers that are more likely to have content
• Exploring alternative mechanisms to bootstrap peer lists
• Developing techniques for incorporating dynamic performance state into algorithm
• Implementing our solution using Gnutella to bootstrap peer lists
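The poster lists the use of dynamic performance state as work in progress and does not describe a technique. Purely as an illustration of what keeping fine-grained state for a small peer list could look like, the sketch below keeps an exponentially weighted moving average of observed latency per peer and selects the peer with the lowest estimate. The class name, the smoothing weight, and the choice of latency as the metric are all assumptions and are not taken from the poster.

```python
class PerformanceState:
    """Smoothed latency estimate per list peer (hypothetical technique)."""

    def __init__(self, alpha=0.25):
        self.alpha = alpha     # weight of the newest sample (assumed value)
        self.latency_ms = {}   # peer id -> smoothed latency estimate

    def observe(self, peer_id, sample_ms):
        """Fold a new latency measurement into the peer's estimate."""
        old = self.latency_ms.get(peer_id)
        if old is None:
            self.latency_ms[peer_id] = float(sample_ms)
        else:
            self.latency_ms[peer_id] = (
                (1 - self.alpha) * old + self.alpha * sample_ms
            )

    def select(self, candidates):
        """Pick the measured candidate with the lowest estimate.

        Falls back to the first candidate if none has been measured yet.
        """
        known = [p for p in candidates if p in self.latency_ms]
        if known:
            return min(known, key=self.latency_ms.get)
        return candidates[0] if candidates else None


state = PerformanceState(alpha=0.5)
state.observe("p1", 100)
state.observe("p2", 300)
state.observe("p1", 900)   # p1 degrades: its estimate becomes 500 ms
best = state.select(["p1", "p2", "p3"])
```

State is kept only for peers that have been measured, which matches the poster's point that performance state is needed only for the peers likely to provide the content.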
