Social Network User Lifetime


Juan Lang
Department of Computer Science, University of California, Davis
[email protected]

S. Felix Wu
Department of Computer Science, University of California, Davis
[email protected]

Abstract—Online Social Network (OSN) operators are interested in promoting usage among their users, and try a variety of strategies to encourage use. Some recruit celebrities to their site, some allow third parties to develop applications that run on their sites, and all have features intended to encourage use. As important as usage is, we are unaware of any studies into what influences users to be active and to remain online. This paper is the first work studying the lifetime of OSN users, examining the factors that influence lifetime in one OSN, Buzznet. The major contributions of this work are the study of active lifetime, the features and behaviors that encourage activity, and the comparison of active lifetime to passive lifetime.

I. INTRODUCTION

Online Social Network (OSN) operators are interested in promoting usage among their users. Most are funded at least in part by advertising, so increased usage translates directly into increased revenue. Some social networks are funded directly by their users, so a loyal user base also translates directly into increased revenue for the networks’ operators.

Maintaining active users is a challenge for existing OSNs. For example, 40% of Twitter accounts have no activity, and 60% of users who visit Twitter in one month fail to return the following month [1], while approximately half of all Facebook accounts crawled in one study had no activity [2].

This work investigates how different aspects of a user’s use of one OSN, Buzznet (http://www.buzznet.com), correspond with the user’s lifetime in that OSN. Relative to the largest OSNs, Buzznet has a small user base, and it lacks some of the features available on larger OSNs such as Facebook. While Buzznet’s smaller user base might be a disadvantage for research purposes, its small size allowed us to crawl the largest connected component (LCC) of the social graph, avoiding selection bias [3]. Because Buzznet has fewer features than competing OSNs, the lifetime of its users can be studied independently of features that might encourage increased usage. The insights gained could therefore help decide which features are the most beneficial for promoting lifetime.

The major contributions of this work are the study of lifetime as a distinct phenomenon and the identification of features that appear to influence lifetime. Additionally, this work compares active lifetime to passive lifetime, examining to what extent they coincide and the factors that appear to influence passive lifetime. To our knowledge, ours is the first study into the factors that influence user lifetimes for any OSN.

The remainder of this work is organized as follows: Related work is discussed in Section II. Background for the work is discussed in Section III. Results are discussed in Section IV. Recommendations are given in Section V. We conclude in Section VI.

II. RELATED WORK

Examining user lifetime is related to studying churn, which has been studied in other contexts. Notably, Dasgupta et al. [4] studied churn in mobile telecom networks, and observed that users whose friends recently left the network are more likely to leave the network. In telecom networks, users have clear dates on which they leave the network, allowing such analysis to be conducted. Because most OSNs are free for their users, user accounts are more likely to be dormant than to be removed altogether.

Stutzbach and Rejaie [5] measured churn on peer-to-peer (P2P) networks, where peer lifetime is an important component of the network’s availability. Peer lifetime in P2P networks is a measure of how long a particular instance of a running peer remains connected to the network, which is a different phenomenon from the one we study here.

Wilson et al. [2] crawled 10 million Facebook profiles and observed that users are most likely to be active early in their profiles’ lifetimes. They also observed that approximately half of all users crawled generated no interactions, which is similar to the results from Twitter.

Viswanath et al. [6] analyze the activity graph of 60k users in the New Orleans network within Facebook. They focus only on pairs of users with activity between the two users. They show that the fraction of messages sent over time decreases after a link is created between two communicating users, which is consistent with the observation that user activity decays over time.

Benevenuto et al. [7] use a proxy server to capture accesses to several OSNs, including Orkut and MySpace, over a 12-day period. They observed that users are mainly voyeurs: 92% of user activity is a reading activity, and 85% of users in their capture had no write activities.


III. BACKGROUND

We performed a BFS-based crawl of the Buzznet social graph until we had obtained the largest connected component, containing approximately 750,000 users and 9 million directed edges. For each of these users, we collected the public notes (posts from other users) that the user had received, as well as all of the photos the user posted and the comments each photo received. In all, we retrieved approximately 5 million notes, 4 million photos, and 4 million photo comments. We did not collect other user activities, including videos and journal posts. By not crawling all available data, we reduced the time required for our crawl substantially, at the risk of introducing bias in our estimation. We revisit the issue shortly. For each user, we also collected account information including the account’s creation date, a list of the user’s interests, and demographic information.

More formally, let G = (V, E) represent the Buzznet social graph, where V is the set of users, and E is the set of edges. Given two users, u, v ∈ V, and an edge u → v ∈ E, u is one of v’s followers, and v is one of u’s friends.

Definition 1. User u’s in degree is k_i(u) = |{v ∈ V : v → u ∈ E}|.

Definition 2. User u’s out degree is k_o(u) = |{v ∈ V : u → v ∈ E}|.

The CDF of in degree for users in our crawl appears in Figure 1. Note that it resembles the familiar power-law distribution, with the largest group of users having an in degree of 1, and a very small fraction of users having a much larger in degree.

Fig. 1. CDF of in degree (x-axis: in degree, log scale from 1 to 1e+06; y-axis: CDF).

In order to measure the active lifetime of a user, we look at the date on which the user makes their last recorded activity on the site. Since our crawl did not obtain all available data, we needed to ensure that the data we did crawl are representative of usage on the site as a whole. To do so, we crawled all activity for a sample of 9,000 users chosen at random from the LCC discovered in our initial crawl. For these users, we compared the date of the last activity in the subset of data in our initial crawl, i.e. notes, photos, and photo comments, to the date of the actual last activity. 98% of the users in this crawl had no difference between the two dates, and the mean difference between the two dates was 2.1 days. Thus, for the purpose of dating a user’s last activity, the subset of data we crawled is representative of activity on the site as a whole.

Definition 3. A user’s active lifetime is the number of days between the user’s profile creation date and the last recorded activity from the user.

In addition to active lifetimes, we are interested in when users engage in passive activity. We attempt to capture this with the following definitions:

Definition 4. A user’s passive lifetime is the number of days between the user’s last recorded activity and the last time on which the user was logged in.

Definition 5. A user’s total lifetime is the number of days between the user’s profile creation date and the last time on which the user was logged in.

The definition of active lifetime does not distinguish between undirected activity, e.g. photos posted by a user to his or her own profile, and directed activity, e.g. comments on another user’s photo. In some cases, we need to distinguish directed activity.

Definition 6. Activity sent by a user u is activity generated by u and directed to another user v, v ≠ u.

Definition 7. Activity received by a user u is activity generated by another user v, v ≠ u, directed to u.

In the data in our crawl, directed activity includes notes from one user to another user and comments on a photo from a user other than the photo’s poster.

Because the behavior of the most popular users differs from that of other users, we defined a class of the most popular users, which we termed Celebrities.

Definition 8. A Celebrity is a user whose in degree is above the 99.99th percentile.

There were 101 such users in our crawl, with a minimum in degree of 3,032, and a maximum in degree of over 180,000. Edges incident on Celebrities represent approximately 31% of all edges in our crawl. We then defined two more classes based on this definition of Celebrity:

Definition 9. A Pure Fan is a user who follows only Celebrities.

Definition 10. A Mixed user is a user who is not a Celebrity and who follows at least one non-Celebrity.

We will make use of these classes in our results, which we present next.
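To make Definitions 3 through 5 and 8 through 10 concrete, the following is a minimal sketch of how the three lifetimes and the user classes might be computed. The record layout (`UserRecord`), the `following` mapping, and the `"Unclassified"` label are illustrative assumptions, not part of the crawl tooling described here. The sketch also assumes that Celebrities are classified first, and that users who follow nobody fall outside all three classes, since the definitions do not address either case explicitly.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Set


@dataclass
class UserRecord:
    """Hypothetical per-user record; the field names are assumptions."""
    created: date                   # profile creation date
    last_activity: Optional[date]   # last recorded activity, None if none
    last_online: Optional[date]     # last login date, None if not collected


def active_lifetime(u: UserRecord) -> Optional[int]:
    """Definition 3: days from profile creation to last recorded activity."""
    if u.last_activity is None:
        return None
    return (u.last_activity - u.created).days


def passive_lifetime(u: UserRecord) -> Optional[int]:
    """Definition 4: days from last recorded activity to last login."""
    if u.last_activity is None or u.last_online is None:
        return None
    return (u.last_online - u.last_activity).days


def total_lifetime(u: UserRecord) -> Optional[int]:
    """Definition 5: days from profile creation to last login."""
    if u.last_online is None:
        return None
    return (u.last_online - u.created).days


def classify(user: str, following: Dict[str, Set[str]],
             celebrities: Set[str]) -> str:
    """Assign a user class per Definitions 8-10.

    `following[u]` is the set of users that u follows (u's friends).
    `celebrities` is assumed to be precomputed from the in-degree cutoff.
    """
    if user in celebrities:
        return "Celebrity"
    friends = following.get(user, set())
    if not friends:
        # Assumption: users who follow nobody are left unclassified.
        return "Unclassified"
    if friends <= celebrities:
        return "Pure Fan"      # follows only Celebrities
    return "Mixed"             # follows at least one non-Celebrity
```

With these definitions, a user's total lifetime is the sum of the active and passive lifetimes whenever both are known.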

IV. RESULTS

In this section, we present the active lifetimes of users in our crawl, followed by predictors of their active lifetimes. We then explore the relationship between active lifetimes and passive lifetimes.

TABLE I. CORRELATION BETWEEN ACTIVE LIFETIME AND VARIOUS MEASURES

Measure                               Correlation coefficient
In degree                             0.03
Out degree                            0.02
Mean clustering coefficient           0.00
Number of account details supplied    0.11
Last received activity age            0.58
First received activity age           0.10

TABLE II. CORRELATION BETWEEN LIFETIME AND VARIOUS MEASURES FOR USERS BELOW 99TH PERCENTILE

Measure                               Correlation coefficient
In degree                             0.94
Out degree                            0.90
Mean clustering coefficient           -0.65
Number of account details supplied    0.68

Fig. 2. CDF of active lifetime (x-axis: profile age at last activity in days, log scale; y-axis: CDF).

Fig. 3. Mean active lifetime vs. in degree (x-axis: in degree, log scale; y-axis: mean active lifetime in days).

A. Active Lifetime

A CDF of users’ active lifetimes for users with any recorded activity is shown in Figure 2. Unlike many phenomena in social networks, the active lifetime does not follow a power-law distribution: the largest lifetimes are not all that large, and they are bounded by the lifetime of the social network itself. Still, little to no activity is the norm: nearly two thirds of the users we collected had no recorded activity. Of those users who do have recorded activity, most of it occurs early in their profiles’ lifetimes. One third of users with any activity had activity only on the day they created their profiles, and half had activity only within the first nine days of their profiles’ existence.

B. Predictors of Active Lifetime

In this section, we investigate predictors of users’ active lifetimes, grouped by hypotheses relating to various properties of the users and their activity.

1) Graph structural properties:

Hypothesis 1. A high degree predicts a long lifetime.

Hypothesis 1.1. A high in degree predicts a long lifetime. The intuition behind the hypothesis is that more popular users are active longer than less popular users.

Hypothesis 1.2. A high out degree predicts a long lifetime. The intuition behind the hypothesis is that users who follow many users may be more engaged in the OSN, and therefore have a longer lifetime.

Table I shows that, across all users, the correlation between degree (in or out) and active lifetime is close to zero, which by itself would suggest no relationship between them. The correlation coefficients don’t tell the whole story, however. Figure 3 shows a plot of the mean lifetime compared to the in degree of users. The active lifetime compared to the out degree is similar, and is omitted for brevity. As can be seen, there is a strong correlation between degree and mean active lifetime for smaller degrees, but as the degrees get larger there is a large amount of variation. This variation coincides with sparser data at larger degrees: the 99th percentile of in degree is 164, while the 99th percentile of out degree is 189, so relatively few users have larger degree. Table II shows the correlation coefficient between median active lifetime and the in and out degrees for users whose degrees are below the 99th percentile. As can be seen, for the overwhelming majority of users the correlation between degree and lifetime is strong in Buzznet.

Hypothesis 2. A high clustering coefficient predicts a long lifetime.

The hypothesis is that users who are more highly interconnected with their friends are more likely to remain active on the site. As Table I shows, there is no correlation between the active lifetime and the clustering coefficient. Again, the correlation doesn’t tell the whole story. Because the clustering coefficient is continuous rather than discrete, we grouped users into bins of width 0.01 based on their clustering coefficient, and computed the mean and median active lifetime of the users in each bin. The results are shown in Figure 4. In contrast to our hypothesis, the lifetime appears to decrease as the clustering coefficient increases, except at the smallest values of clustering coefficient.

Fig. 4. Active lifetime vs. clustering coefficient (x-axis: clustering coefficient; y-axis: active lifetime in days; series: mean, median).

Because this didn’t match our expectation, we investigated whether another variable, in degree, might correlate with clustering coefficient. The intuition behind the relationship between in degree and clustering coefficient is that more popular users, those with high in degree, are less likely to be arranged in tight clusters with their followers than less popular users. The results of our comparison between in degree and clustering coefficient are in Figure 5. As can be seen, there is a relatively strong negative correlation between in degree and clustering coefficient. Since in degree and lifetime correlate, the negative correlation between clustering coefficient and lifetime is to be expected. Still, as we saw with degree, the relationship becomes a little less clear as the in degree gets very large.

Fig. 5. Mean clustering coefficient vs. in degree (x-axis: in degree, log scale; y-axis: clustering coefficient).

In order to isolate whether the active lifetime and the clustering coefficient are related for users with smaller in degree, we repeated the comparison for users whose in degree is below the 99th percentile, shown in Table II. Even for these users, there is a relatively strong negative correlation between clustering coefficient and average lifetime, suggesting that a user’s popularity, as measured by in degree, has a stronger impact on lifetime than does the user’s degree of connectedness to his or her friends.

Hypothesis 3. Whom a user follows influences the user’s lifetime.

The intuition behind this hypothesis is that whom you follow influences whether you’re likely to be, and stay, active in an OSN. For example, if you follow people you know in real life, you may be more likely to interact with them online than if you’re following a stranger.

In order to investigate this hypothesis, we calculated the average active lifetimes for the user classes Celebrities, Pure Fans, and Mixed users. Table III shows the average active lifetimes for each class of users. Unsurprisingly, the Celebrities have the longest active lifetime. Intriguingly, the Pure Fans have much shorter mean and median active lifetimes than the Mixed users. The data support the hypothesis that whom you follow influences your likelihood of being active in the site.

TABLE III. AVERAGE ACTIVE LIFETIME IN DAYS

User class     Mean    Std. Dev.    Median
Celebrities    793     400          822
Pure Fans      35      124          0
Mixed          169     267          44

2) User behavior:

Hypothesis 4. The number of unique communication partners predicts lifetime.

In this hypothesis, a communication partner is a user to whom a user either sends a note or on whose photo a user comments. The intuition behind this hypothesis is that users who communicate with a variety of users over their lifetimes are more likely to remain active than users who communicate with fewer partners.

In order to test the hypothesis, 5 samples of users were chosen, each containing 10,000 users. For each sample, the mean probability of the users being active 180 days after their profiles were created was calculated. (Sampling was done in order to show the change in confidence in mean probability as the number of unique communication partners increases.) The probability is plotted against the number of unique communication partners each had in Figure 6. As can be seen, there is an increase in probability of being active after 180 days as the number of communication partners increases, though the error gets larger as the number of partners increases. In other words, there is some support for the hypothesis, though the predictive power of the number of communication partners is not very strong.

Fig. 6. Probability of being active vs. number of communication partners (x-axis: number of communication partners; y-axis: P(user active at n = 180 days)).

3) Activity timing:

Hypothesis 5. When a user last receives activity predicts lifetime.

The intuition behind this hypothesis is that usage of the OSN is mainly spurred by activity within the OSN, rather than by unrelated external activity. In order to investigate this hypothesis, we compared the last date on which a user received any activity to the same user’s active lifetime. The correlation between these two ages, as shown in Table I, is positive, 0.58. This gives the hypothesis some support.

Further supporting the hypothesis is the relationship between received activity and sent activity. Figure 7 shows the number of items sent vs. the number of items received for all users in our crawl, in a log-log scale. The correlation coefficient between the log of the number of items sent and the log of the number of items received is 0.76. In other words, there is a clear relationship between the number of items sent and the number of items received, as well as a relationship between the date on which the last item is received and the date on which the user is last active. This supports the hypothesis that received activity within the OSN encourages further activity within the OSN.

Fig. 7. Sent items vs. received items (x-axis: number of items received; y-axis: number of items sent; log-log scale).

Hypothesis 6. When a user first receives activity predicts lifetime.

A related hypothesis is that the first date on which a user receives activity is related to the user’s lifetime. There are several competing hypotheses regarding the relationship between receipt of the first activity and the active lifetime of the user:

Hypothesis 6.1. A user will be relatively inactive until receiving his or her first message or comment from another user. A positive correlation between the first received activity date and the user’s lifetime would tend to support this hypothesis.

Hypothesis 6.2. A user will only be active if the first activity he or she receives happens relatively soon after creating his or her profile. A negative correlation between the first received activity date and the user’s lifetime would tend to support this hypothesis, although a near-zero correlation might also suggest such a relationship.

Hypothesis 6.3. A user’s lifetime is unaffected by the date on which he or she first receives activity. A near-zero correlation between the first received activity date and the user’s lifetime could suggest that no relationship between the two exists.

Table I shows the correlation between the date of the first received activity and the date of the last activity of a user. As can be seen, there is only a very weak correlation between the two dates. This tends to support either Hypothesis 6.2, that receiving activity early predicts a longer lifetime, or Hypothesis 6.3, that the date on which the first activity is received is unrelated to lifetime. In order to differentiate between these hypotheses, we consider the following hypothesis:

Hypothesis 7. When a user first receives activity predicts subsequent lifetime.

The results for Hypothesis 5 suggest that late user activity is spurred by receiving late activity, but we wish to know whether there exists a “critical period” of received activity, i.e. whether the date of received activity influences the probability that a user remains active for some period of time. In other words, say two users received their first activity on different days: one after 10 days on the site, and another after 180 days on the site. How likely are these two users to be active 30 days after this first received activity?

Figure 8 shows the probability of a user being active both 30 and 180 days after the first activity is received. As the data show, there is a clear decrease in the probability of a user remaining active as the date on which they receive their first activity increases. These results tend to support the hypothesis that when activity is received matters: users who receive activity early in their profiles’ lifetimes are much more likely to remain active than those who do not.

Fig. 8. Probability of being active n days after first activity (x-axis: date of first received activity; y-axis: probability of being active; series: n = 30, n = 180).

Hypothesis 8. The amount of personal account information supplied predicts lifetime.

The intuition behind this hypothesis is that the amount of account detail supplied is an indicator of the amount of engagement with the OSN, and may correlate with the probability of remaining active. The hypothesis is important to investigate in order to see whether there is a tradeoff between privacy, in the form of not revealing too much about oneself, and activity.

As Table I shows, there is no correlation between active lifetime and the number of user account details supplied. We were concerned that the data were being overwhelmed by outliers, so we computed the active lifetime for users at or below the 99th percentile of the number of user account details supplied, and computed the correlation between the number of details supplied and the mean active lifetime, given in Table II. As can be seen, there is a positive correlation between mean lifetime and the number of account details supplied, though it isn’t as strong as the correlation between degree and lifetime. This suggests that revealing information about oneself may be one factor that contributes to remaining active on a site.

Fig. 9. PDF of total lifetime (x-axis: profile age in days; y-axis: fraction of users logging in; series: sample, all collected users).

Fig. 10. CDF of active and passive lifetimes (x-axis: lifetime in days, log scale; y-axis: CDF; series: active lifetime, passive lifetime).
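The outlier-trimmed correlations reported in Table II (used for Hypotheses 1, 2 and 8) can be illustrated with a short sketch. It rests on several assumptions that the text does not confirm: that the coefficient is Pearson's r, that the percentile is computed by the nearest-rank method, and that the correlation is taken between each distinct value of the measure and the mean lifetime of the users sharing that value. The function names are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def percentile_cutoff(values, pct):
    """Nearest-rank percentile (an assumed convention)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def trimmed_group_correlation(measure, lifetime, pct=99.0):
    """Correlate a per-user measure with mean lifetime per measure value.

    Users whose measure exceeds the pct-th percentile are dropped first,
    so that a handful of extreme users cannot dominate the coefficient.
    """
    cutoff = percentile_cutoff(measure, pct)
    groups = defaultdict(list)
    for m, life in zip(measure, lifetime):
        if m <= cutoff:            # "at or below" the percentile
            groups[m].append(life)
    xs = sorted(groups)
    ys = [mean(groups[x]) for x in xs]
    return pearson(xs, ys)
```

Averaging within each value of the measure before correlating removes the per-user scatter, which is one plausible reason the coefficients in Table II are so much larger than those in Table I.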


C. Passive Lifetime

As we described in Section III, we crawled all activity for a sample of 9,000 users. As a side effect of our crawl (for brevity, the details of the crawling method are omitted), we obtained the last online date for approximately 450,000 users. A plot of the PDF of users’ total lifetime is shown in Figure 9, for both the sampled users and for all the users whose online date was collected. Not surprisingly, user activity decays over time: far more users are likely to have logged in early in their profiles’ lifetimes than later. The lifetimes for all users are slightly shorter than those for the users in the sample, which may be explained by selection bias: the set of all users whose last online time was obtained have a mean in degree of 9.9, while the set of users in the sample have a mean in degree of 10.1. As we already showed in Hypothesis 1, lifetime is correlated with in degree, so this difference is to be expected.

Figure 10 shows a CDF of the active and passive lifetimes of all users whose online date was obtained in our crawl. Recall that the definition of passive lifetime is the number of days between the last recorded activity of each user and the user’s last online date. As the CDF shows, a higher fraction of users have a small active lifetime than have a small passive lifetime. In other words, the expected number of days of passive activity is greater than the expected number of days of recorded activity, which is consistent with the observation that users are mostly voyeurs [7].

As we described in our introduction, OSN operators are interested in promoting usage among their users. As we argued in Hypothesis 5, active usage is more valuable than passive usage, because it promotes usage among other users. A simple question remains, however: is passive usage also correlated with active usage? That is, how accurate an estimator of passive lifetime is active lifetime?

To answer this question, we calculated the mean error between the active and total lifetimes, as a fraction of the total lifetimes. By normalizing to the total lifetime, we could compare the rate of error across users with very different lifetimes. An error of 0% implies that a user had measurable activity on the same day he or she last logged in. An error of 100% implies that a user had an active lifetime of 0 days, i.e. that the user only had recorded activity on the same day the user created his or her profile, but that the user logged in at some point thereafter. A plot of the mean error is shown in Figure 11. As can be seen, for more than 70% of users, the error is less than 5%, i.e. for the majority of users active usage is a good approximation of passive usage. For a minority of users, there is a larger error rate between last online date and last activity date, and this error rate is relatively evenly distributed. Intriguingly, 94% of Pure Fans had an error less than 5%, while only 41% of Mixed users had such a low error rate. In other words, Pure Fans were much less likely to be online except when they had activity, and as we showed in Hypothesis 3, they were also less likely than other users to have activity.

Fig. 11. Mean error of active lifetime vs. total lifetime (x-axis: fraction of online age; y-axis: fraction of users; series: all users, Pure Fans, Mixed).

While such broad distinctions about user lifetimes are interesting, we are more interested in whether the events that predict users’ logging in can be identified. In order to do so, we looked at users whose last online date was greater than their last activity date, i.e. users with positive passive lifetime. We restricted ourselves to users whose last online date was within the range of our initial crawl, so that all activity related to them could be identified. There were approximately 70,000 such users. We then looked at two types of activity, received activity and friends’ activity, as predictors of logging in.
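The normalized error described above can be written down in a few lines. The sketch below is one reading of the text, under stated assumptions: the error is taken to be (total lifetime − active lifetime) / total lifetime, users with a total lifetime of 0 days are skipped because the ratio is undefined for them, and negative values are clamped to 0. None of these edge cases is discussed in the text, and the function names are hypothetical.

```python
from typing import Iterable, Optional


def lifetime_error(active_days: int, total_days: int) -> Optional[float]:
    """Error of active lifetime as an estimate of total lifetime.

    Returns 0.0 when the last activity falls on the last login day and
    1.0 when the active lifetime is 0 but the user logged in later.
    Returns None when total_days is 0 (ratio undefined; an assumption).
    """
    if total_days <= 0:
        return None
    # Clamp at 0 in case recorded activity postdates the last known login.
    return max(0.0, (total_days - active_days) / total_days)


def fraction_below(errors: Iterable[Optional[float]],
                   threshold: float = 0.05) -> float:
    """Fraction of users with a defined error strictly below threshold."""
    valid = [e for e in errors if e is not None]
    return sum(e < threshold for e in valid) / len(valid)
```

For example, a user with a total lifetime of 100 days whose last activity was on day 97 has an error of 3%, and so counts toward the "less than 5%" group.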

Hypothesis 9. Received activity predicts passive lifetime.

As we discussed in Hypothesis 5, received activity appears to be correlated with (sent) activity. But is it also correlated with passive activity? Ideally, to answer this we would compare all online times with the time of received activity, in order to compare the probability of being online with the probability of receiving activity. Because we only know the date each user was last online, we know that the probability the user was online was 1 on the date the user last generated activity, 1 on the user’s last online date, indeterminate between, and 0 thereafter. In other words, to compare the probability of receiving activity to the probability of logging in, we would have to know the probability of logging in during precisely the period we do not know.

Instead, we focus on the probability of receiving activity near the last day of logging in. For the 70,000 users with positive passive lifetime whose last online date was within the range of our initial crawl, we computed the probability of the user receiving activity on any day after the user’s last generated activity. If the hypothesis is true, we would expect the probability of receiving activity to be highest shortly before or on the user’s last online date, and generally lower after the user’s last online date.

Figure 12 shows a plot of the probability of receiving activity on any day after the user’s last recorded activity, normalized to the user’s last online date, shown as day 0. Activity received on negative days reflects activity received during the users’ passive lifetimes, i.e. before the user last logged in, while activity received on positive days reflects activity received after the users last logged in. For the 70,000 users with positive passive lifetime, the probability of receiving activity was highest on day 0, i.e. on the same day the user last logged in. While the probability itself was not large (approximately 2%), it is well above the probability of any user receiving activity on any given day, which is very small, approximately 0.5%.

Fig. 12. Last online date vs. received activity (x-axis: days after last online date; y-axis: probability of receiving activity).

There is a curious spike in the probability of receiving activity approximately 130 days before the users’ last login date. We have no explanation for this other than that it may be an anomaly due to a relatively small dataset. Still, the probability distribution is remarkably regular. The probability of a user receiving activity after logging in is also much lower than the probability of a user receiving activity before the last time he or she logged in, even though the data after logging in are overrepresented: on average, there are more days in which users are not logged in than days in the users’ passive lifetimes. Thus, we conclude that this hypothesis is likely to hold, i.e. that receiving activity predicts passive behavior.

Hypothesis 10. Undirected activity among a user’s friends predicts passive lifetime.

A further test is whether passive activity is predicted by undirected activity among a user’s friends. In our crawled data, photos posted are undirected. In order to determine whether photos posted have any impact on logging in, we computed the probability of a user’s friends posting photos on any day after the user’s last generated activity. If the hypothesis is true, we would expect the probability that a user’s friends posted photos to be higher shortly before the user’s last online date, and generally lower after the user’s last online date.

Figure 13 shows a plot of the probability of users’ friends posting photos on any day after the user’s last recorded activity, again normalized to the user’s last online date at day 0. Photos posted on negative days reflect photos posted during the users’ passive lifetimes, while photos posted on positive days reflect photos posted after the users last logged in. As with received activity, the probability of a user’s friends posting photos is highest on day 0, i.e. on the same day the user last logged in. Again, the probability itself is quite low, even lower than the probability of receiving activity: the peak probability is about 0.4%. Again, the probability of any user posting photos on any day is quite low: approximately 0.2% on average. The probability of a user’s friends posting photos is also lower after the user last logged in than before. We conclude that this hypothesis is likely to hold, i.e. that undirected activity among a user’s friends also predicts passive lifetime, though perhaps not as strongly as directed activity.

Fig. 13. Last online date vs. photos posted by friends (x-axis: days after last online date; y-axis: probability friends posted photos).

V. RECOMMENDATIONS

The premise of this work is that OSN operators are interested in promoting continued activity among their users or followers, and that the factors that encourage continued activity can be analyzed. Based on the hypotheses for which we found support, we can make the following recommendations:


Recommendation 1. Encourage users to form friendships. This follows from Hypothesis 1.

Recommendation 2. Recommend that users befriend users other than the most popular ones. This follows from Hypothesis 3, and is further supported in Section IV-C. It also suggests that luring well-known celebrities to an OSN, e.g. through partnerships, may be an ineffective means of encouraging users to remain active in the OSN. One way to encourage users to befriend users other than the most popular ones may be to implement a “People you may know” feature that recommends users whom a user’s friends follow. Another way may be to foster the creation of online groups, e.g. interest-based, geography-based, or event-based groups, to create forums in which users can meet one another online.

Recommendation 3. Encourage users to communicate with one another. This follows from Hypotheses 4, 5 and 9. One way is to suggest that users communicate with friends of theirs who have not been active in some time.

Recommendation 4. Welcome new users to the site. In particular, the welcome should come from existing users. This follows from Hypothesis 7. One way is to add a step to the friendship formation process, encouraging the new friends to send messages to one another or otherwise interact with one another.

Recommendation 5. Encourage users to post frequently. This follows from Hypothesis 10. Even if a user’s posts are not receiving many comments, they may serve to encourage passive activity among the user’s friends.

VI. CONCLUSION AND FUTURE WORK

In this work, we studied the active and passive lifetimes of the LCC of users in one OSN. We examined the behaviors and properties that predict both active and passive usage of the site, and used these characteristics to suggest features that would promote usage among an OSN’s users. It is tempting to speculate whether the presence or absence of features encouraging activity is sufficient to induce users’ continued activity. For example, Twitter lacks most of the features we recommend, and has a high rate of churn [8]. Facebook, in contrast, implements most of the features we recommend. While we are not aware of any estimate of Facebook’s rate of churn, Facebook itself states that more than half of active Facebook users return every month [9]. Unfortunately this leaves open the definition of an active user. According to Nielsen, Facebook is the third most popular brand online [10]. Is Facebook’s popularity due to its features that encourage activity? Perhaps in part.

For future work, we would like to validate our findings across multiple OSNs. We would also like to evaluate the impact of implementing the suggested features on the usage of an OSN.
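Validating the findings on another OSN would require reproducing the day-offset analysis behind Figures 12 and 13. The following is a minimal sketch of that normalization, not the code used in this study. It assumes a hypothetical mapping `last_online` from each user to the date of the last login and a hypothetical list `events` of (user, date) pairs, e.g. days on which a user's friends posted photos. It also ignores the limits of the crawl window, which a real analysis would need to account for.

```python
from collections import Counter
from datetime import date


def offset_probabilities(last_online, events, window=300):
    """Estimate, per day offset, the fraction of users with at least one event.

    last_online: dict mapping user -> date of last login (day 0 for that user).
    events: iterable of (user, date) pairs (hypothetical input format).
    window: offsets are reported for -window .. +window days.
    """
    # Keep distinct (user, offset) pairs so a user counts at most once per day.
    seen = set()
    for user, day in events:
        if user not in last_online:
            continue  # no known last online date for this user
        offset = (day - last_online[user]).days
        if -window <= offset <= window:
            seen.add((user, offset))

    counts = Counter(offset for _, offset in seen)
    n_users = len(last_online)
    if n_users == 0:
        return {offset: 0.0 for offset in range(-window, window + 1)}
    return {offset: counts[offset] / n_users
            for offset in range(-window, window + 1)}


if __name__ == "__main__":
    last_online = {"a": date(2010, 1, 10), "b": date(2010, 1, 20)}
    events = [("a", date(2010, 1, 9)), ("a", date(2010, 1, 10)),
              ("b", date(2010, 1, 20)), ("b", date(2010, 1, 25))]
    probs = offset_probabilities(last_online, events, window=5)
    for offset in sorted(probs):
        print(offset, probs[offset])
```

Negative offsets correspond to days before the last login and positive offsets to days after it, so a peak at offset 0 with lower values on the positive side would match the pattern reported for Buzznet.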

REFERENCES

[1] http://themetricsystem.rjmetrics.com/2010/01/26/new-data-on-twitters-users-and-engagement/. Online; accessed 14-January-2011.
[2] C. Wilson, B. Boe, A. Sala, K. P. Puttaswamy, and B. Y. Zhao, “User interactions in social networks and their implications,” in Proceedings of the 4th ACM European conference on Computer systems, EuroSys ’09, (New York, NY, USA), pp. 205–218, ACM, 2009.
[3] S. Ye, J. Lang, and F. Wu, “Crawling online social graphs,” in Web Conference (APWEB), 2010 12th International Asia-Pacific, pp. 236–242, 2010.
[4] K. Dasgupta, R. Singh, B. Viswanathan, D. Chakraborty, S. Mukherjea, A. A. Nanavati, and A. Joshi, “Social ties and their relevance to churn in mobile telecom networks,” in Proceedings of the 11th international conference on Extending database technology: Advances in database technology, EDBT ’08, (New York, NY, USA), pp. 668–677, ACM, 2008.
[5] D. Stutzbach and R. Rejaie, “Understanding churn in peer-to-peer networks,” in Proceedings of the 6th ACM SIGCOMM conference on Internet measurement, IMC ’06, (New York, NY, USA), pp. 189–202, ACM, 2006.
[6] B. Viswanath, A. Mislove, M. Cha, and K. P. Gummadi, “On the evolution of user interaction in facebook,” in Proceedings of the 2nd ACM workshop on Online social networks, WOSN ’09, (New York, NY, USA), pp. 37–42, ACM, 2009.
[7] F. Benevenuto, T. Rodrigues, M. Cha, and V. Almeida, “Characterizing user behavior in online social networks,” in Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference, IMC ’09, (New York, NY, USA), pp. 49–62, ACM, 2009.
[8] http://blog.nielsen.com/nielsenwire/online_mobile/twitter-quitters-post-roadblock-to-long-term-growth/. Online; accessed 14-January-2011.
[9] http://facebook.com/press/info.php?statistics. Online; accessed 14-January-2011.
[10] http://blog.nielsen.com/nielsenwire/online_mobile/social-media-accounts-for-22-percent-of-time-online/. Online; accessed 14-January-2011.
