On the Influence Propagation of Web Videos


Jiajun Liu, Yi Yang, Zi Huang, Yang Yang, Heng Tao Shen

Abstract: We propose a novel approach to analyze how a popular video propagates through cyberspace, to identify whether it originated from a certain sharing site, and to identify how it reached its current popularity. In addition, we estimate its influence across websites outside the major hosting website. Web video is gaining significance due to its rich and eye-catching content, a phenomenon amplified and accelerated by the advance of Web 2.0. When a video gains some degree of popularity, it tends to appear on various websites, including not only video-sharing websites but also news websites, social networks and even Wikipedia. Numerous video-sharing websites have hosted videos that reached a phenomenal level of visibility and popularity across the entire cyberspace. As a result, it is becoming more difficult to determine how the propagation took place: was the video an original work intentionally uploaded to its major hosting site by its authors, did it originate from some small site and reach the sharing site only after gaining considerable popularity, or did it originate elsewhere in cyberspace and become popular because of the sharing site? Existing studies of this flow of influence are lacking, and literature on estimating a video’s influence in the whole cyberspace also remains rare. In this article we introduce a novel framework to identify the propagation of popular videos from the perspective of their major hosting site, and to estimate their influence. We define a Unified Virtual Community Space (UVCS) to model the propagation and influence of a video, and devise a novel learning method called Noise-reductive Local-and-Global Learning (NLGL) to effectively estimate a video’s origin and influence. Without loss of generality, we conduct experiments on an annotated dataset collected from a major video-sharing site to evaluate the effectiveness of the framework. Based on the collected videos and their ranks, we also present some interesting discussions of video propagation and influence as well as user behavior.

Index Terms: Video influence estimation, Video origin estimation, Unified Virtual Community Space

1 INTRODUCTION

As the Web continues to evolve, one of the most noticeable phenomena is the prevalence of videos as a major source of multimedia information on the Web. The latest research conducted by comScore¹ reveals that in January 2011 alone, a U.S. Internet user spent 870.8 minutes on average viewing web videos. Web videos now influence society as never before. As we have witnessed the success of YouTube, Hulu and other video-sharing websites, we have also noticed how social networks have fueled the growth of online videos. When a video becomes popular, it can be spotted not only on one or more video-sharing websites but also on news websites, social networks, blogs or even Wikipedia. Vice versa, when a video from news websites, social networks or blogs gets attention, it is likely to be put on video-sharing sites as well. In this context, it is important to identify how the propagation took place, i.e., to determine whether a popular video on a video-sharing website actually originated from that website, or is merely a projection of influence from elsewhere in cyberspace. In this study we primarily focus on identifying the propagation patterns of web videos. We also study their influence in the entire cyberspace.

Jiajun Liu is with CSIRO, Australia. Email: [email protected]. Yi Yang, Zi Huang, Yang Yang, Heng Tao Shen are with School of Information Technology and Electrical Engineering, The University of Queensland, Australia. Email: {yi.yang,huang,yang.yang,shenht}@itee.uq.edu.au.

1. http://www.comscore.com/Insights/Press Releases/2011/2/ comScore Releases January 2011 U.S. Online Video Rankings

Fig. 1. Popular Web Videos

The problem we aim to solve is partially similar to analyzing a user’s friends and identifying his/her influence in a social network, but there are some key differences. In social network influence analysis, all users (nodes, from the network’s perspective) are normally considered to belong to a single website, in which a user’s influence can be identified with existing approaches [2] by analyzing friend relationships and interactions with other users. In such a case, the concept of origin for a user does not exist. However, the problem becomes more difficult when we consider an online video’s propagation and influence, because multiple websites need to be examined. Due to the open nature of the Web, an online video’s influence often flows in both directions. On the one hand, a video’s presence on a hosting site may be driven by events emerging on other websites. On the other hand, a video originating from a hosting site may become one of the most popular videos on the site and then draw dramatic attention from other websites.

Figure 1 shows the ten most viewed videos of all time on the largest online video-sharing site, YouTube.com. After close investigation of the videos’ propagation in cyberspace, we conclude that videos 1, 9 and 10, which are marked with stars, are the origins of other duplicate videos in cyberspace. These videos originate from YouTube.com and are then propagated through cyberspace via other websites. During the process, they have drawn remarkable public attention on both YouTube.com and other websites. A very good example is the video Charlie Bit My Finger - again². As the No. 1 web video with the most views in history (up to 2010), the video has been widely received and reported on Wikipedia, MySpace, Twitter, personal blogs, and numerous news websites such as The Telegraph³, Time⁴ and Sydney Morning Herald⁵. Clearly the video was first uploaded to YouTube.com and then became publicly popular on other websites. In this case, its original hosting site exerts great influence on cyberspace rather than receiving influence from it. In other words, the propagation of this video to other websites demonstrates a video-sharing site’s significance as an information source. We illustrate this propagation in Figure 2.

In contrast, other videos show a different direction of influence propagation. For instance, Coldplay - Viva La Vida, the No. 5 video, is popular in the YouTube U.K. community. The video is a duplicate of a music video that is already widely hosted on other websites, so YouTube is not the source of the influence.
When searching for the video title on major search engines, most of the relevant entries link to pages about the song rather than the video itself. Little attention is directed to the video’s hosting site compared with the previous example. In such cases, we say that the popularity the video receives on YouTube is a side effect of the public popularity of the song itself, and YouTube is not the origin of the propagation or influence. The same observation holds for the other unmarked videos in Figure 1. Our objective is to analyze the propagation as well as the direction of influence for a video, and then to evaluate its influence in the public domain. The problem we aim to solve is important to video-sharing site owners in the following scenarios.

2. http://www.youtube.com/watch?v= OBlgSz8sSM
3. http://www.telegraph.co.uk/news/uknews/3564392/ Finger-biting-brothers-become-YouTube-hit.html
4. http://www.time.com/time/specials/packages/article/0,28804, 1974961 1974925 1974954,00.html
5. http://www.smh.com.au/digital-life/digital-life-news/ once-bitten-now-watched-by-millions-on-youtube-20091028-hjsc.html

• Some video-sharing sites have developed schemes to encourage users to upload content. For instance, YouTube.com decided to give cash rewards to successful video uploaders⁶. Encouraging as this is for individual content producers, it also poses challenges when a video is produced somewhere else but shared on the site without proper permission. Although such videos can also become popular on the sharing site, uploading them should not be encouraged, let alone rewarded. Our study facilitates this decision-making process by providing means to analyze a user’s uploaded videos and determine whether the user’s uploading activities have been legitimate and whether the user should be rewarded.
• Advertisers often exercise multiple partnership models for online videos. They can cooperate with the distribution channels (here, the sharing sites), typically through commercials shown before the actual video plays, or they can cooperate with the content producers and embed the commercial directly in the videos. The two approaches can be mutually reinforcing: cooperation with the sharing sites reduces risk and establishes an expected level of exposure on those sites, while cooperation with the producers offers a wider spread over the Internet and a better chance of crossing the barriers between different sharing sites. Advertisers on video-sharing sites often do not aim solely for the best view counts on one sharing site; they also want their advertisements to be seen on multiple channels. Our study serves this need by helping them identify the best original producers who can also achieve very high view counts.

In this article, we make the following technical contributions.
• To model an online video’s propagation and influence in the cross-community cyberspace, we define a Unified Virtual Community Space (UVCS) that captures the propagation history of an online video.
The UVCS records key information about an online video, such as the video page’s ranking in the search results for a text query with the video’s title on search engines, and information about the video page’s inbound and outbound links. The UVCS is used as the raw feature for our algorithm to classify the propagation and rank the influence of an online video. A video’s UVCS is independent of any other video’s UVCS.
• We propose an advanced learning method called Noise-reductive Local-and-Global Learning (NLGL) to fulfill the following learning objectives:
  – The method should be able to reduce noise. The UVCS feature is a combination of multiple semantic components, and the significance of each component is not specified in the raw feature. Fields of the UVCS feature may be missing for some feature vectors due to the diverse nature of web pages. Overall, the feature is regarded as very noisy.
  – The method should be able to learn a classification/ranking model from a relatively limited amount of training data, and then apply the obtained model to out-of-learning-set data. Consequently, it should preserve the structure of the learning set while controlling the risk of over-fitting. With this model we are able to predict the propagation pattern and influence ranking for any new online video.
• Extensive experiments are conducted on popular online videos to verify the effectiveness of the proposed framework. Some observations on propagation and owners’ behaviors are also provided.

The rest of the paper is organized as follows. Section 2 discusses the existing literature, followed by an outline of the framework in Section 3. Section 4 describes the construction of the UVCS, and Section 5 is devoted to the technical details of the NLGL learning method. The experimental results and related discussions are presented in Section 6. Finally, Section 7 concludes the paper.

6. http://www.telegraph.co.uk/technology/3356176/ YouTube-to-give-cash-rewards-for-videos.html

Fig. 2. Influence Propagation of the Video “Charlie Bit My Finger - Again!”

2 RELATED WORK

The study of the propagation of online videos remains insufficient. Existing research usually concerns the propagation or influence of topics and users in a single community [2], [9], [3], [17]. There are also studies that aim to solve the video relevance ranking and video thread tracking problems. We survey the literature in these related fields, and review some machine learning methods relevant to our proposed framework.

First we survey recent studies of event discovery and other mining tasks in social networks. In [23], Twitter is viewed as a sensor network for event detection. Through semantic analysis of tweets, the authors establish a probabilistic model that derives the probability of an emerging event from the observed occurrences of related tweets. Then, with the assistance of Bayesian filters, the model determines the location of the event. It is reported to be effective for earthquakes, typhoons, and even new video game releases. Meanwhile, a statistical model called PET is defined in [11] to track events in social networks. It models events over time and exploits bursts of user interest, network structural information and the evolution of a topic for event tracking. The approach in [10] uses the query likelihood and a news headline prior for top news identification in the blogosphere. In [6], city landmarks are discovered by mining personal blogs and modeling photo correlations across blogs as a graph.

For influence analysis, [2] studies the interactive behavior of YouTube users; user behaviors and their ranks are also mined. [17] presents an analysis of social networks, stating that user interests in social networks also follow a power law, while the statistical analysis of YouTube-like communities in [3] shows that the popularity received by user-generated content (UGC) follows the Pareto Principle (i.e., the 80-20 rule). A model called Social Pixel [24] is proposed for aggregating social interests and detecting emerging events. [9], [2] confirm that users in social networks, like web pages, exhibit the characteristics of “hubs” and “authorities”. A framework called SocialTransfer is presented in [21] to analyze trending topics on the Internet for socialized video recommendation, and [22] uses similar cross-domain analysis to predict the popularity of social videos. [16] demonstrates a novel contextual advertising platform for online video services.
The platform can insert less intrusive advertisements into online videos based on their visual context. [27] studies the propagation of social videos to determine the optimal replication strategy for serving the videos.

Fig. 3. The Framework

Numerous studies have been conducted to analyze the relevance of videos or images based on their visual contents [12]. In [14], the authors propose to use correlation to analyze video similarity and hence detect near-duplicates. [19], [25] study a similar problem but place more emphasis on the scalability of the method. [33] introduces a method that gives videos meaning by transferring tags to them from visually similar images, while [13] presents a method to provide diverse views of locations by eliminating near-duplicate content.

Several studies have addressed video relevance ranking and video thread tracking. The method proposed in [30] utilizes near-duplicate video detection techniques to rerank news videos, while the later work in [31] leverages both PageRank and near-duplicate video detection for the same task. In [15], the authors use a bipartite graph model of the textual information of videos to discover and track video topics.

As for learning methods, preserving local and global consistency during learning has become popular in various applications. [35] proposes to add a weighted Laplacian matrix (affinity matrix) to relax the loss from unrelated labels and at the same time penalize the loss from closely related labels when predicting the labels. A similar technique is used in [1]. To tackle the transductive learning problem, local consistency is considered in [29]. In [18], manifold learning is used to assist semi-supervised learning. [8] defines a framework that simultaneously optimizes a dimension reduction problem and a multi-label classification problem. The reported results show that doing so better explores the relationships between labels: by jointly considering the reduction and the labels, the framework outperforms performing reduction and labeling separately. Recently, manifold structure has been explored in the image retrieval task and proved effective [34].
[32] proposed a novel technique that jointly considers local regression and global alignment when learning a metric for cross-media retrieval with multiple modalities. The model is closely related to manifold learning [26] and local learning [35]. Instead of deriving the Laplacian matrix directly from common functions such as the Gaussian [26], it learns the Laplacian matrix by local regression and global alignment, making it much more robust to different parameter settings. Unlike [35], which uses two separate steps to learn the Laplacian matrix, [32] unifies them into a single objective function for optimization. The setting of the learning method proposed in this work is inspired by [32], but there are two key differences. First, the method in [32] is transductive and cannot be applied to out-of-sample data, whereas the proposed method is not only inductive but also efficient enough to handle large-scale data. Second, our method simultaneously reduces the noise in the data by dimension reduction, which is not considered in [32].

3 PROBLEM FORMULATION AND FRAMEWORK OVERVIEW

In this section we formulate the problem and describe the general framework. Motivated by the applications described in Section 1, we give the following formulation of the problem studied in this article.

TABLE 2
Classes of Video Propagation Patterns

Class | Criteria
1 | The video was published on another site first but was propagated to the targeted sharing site, and both sides play important roles in the further propagation of the video.
2 | The video was published on the targeted site first but was propagated to other sites, and both sides play important roles in the further propagation of the video.
3 | The video was published on another site first but the targeted site has become the main source of further propagation.
4 | The video was published on the targeted site first and was then propagated to other sites, which became the main source of further propagation.
5 | The video was published on the targeted site first and the targeted site remains the primary distribution channel.
6 | The video was published on another site first and that site remains the primary distribution channel.

Formulation: Given a set of videos $V = \{v_i\}$, establish corresponding distinctive features $X = \{x_i\}$ to describe their patterns of propagation and overall influence, classify their propagation patterns into $C = \{c_i\}$ according to the classes listed in Table 2, and evaluate their influence scores $S = \{s_i\}$, which are interpreted by the criteria presented in Table 3.

Figure 3 outlines the proposed framework. The framework starts with the most popular videos retrieved from the particular sharing site that we aim to analyze. Those videos receive the most attention, as reflected by their view counts, ratings and discussions, so they are collected as candidates. For each candidate video, its title text is used as the query on search engines. We use this technique for three reasons: first, crawling all possible web pages and detecting duplicate embedded videos to identify relevant pages would require enormous effort; second, some pages only refer to the video in text and carry neither links nor an embedded video; finally, in our investigation we find that most of the relevant pages, with or without video links, can be reached through text search. Missing a video’s true origin in the text search results is highly unlikely. Hence we argue that this technique is effective enough for our task.

The pages returned by the search engines are analyzed, and based on them a corresponding feature vector in the Unified Virtual Community Space (UVCS) is constructed. The UVCS is a feature space that consists of elements relevant to a video’s propagation and influence, including the link relations of relevant pages and the tracking of the video’s presence on other websites, e.g., Twitter, Wikipedia, the blogosphere, and news websites. We formulate the feature for each candidate video so that it can be used by the NLGL algorithm. Finally, a web interface is illustrated to present the results of the analysis.

TABLE 1
Notations for the UVCS Feature Representation

Notation | Description
nt | The time stamp of the video page.
tr | The rank of the video page based on time stamp in the UVCS.
r | The rank of the video page in the search engine’s results in the UVCS.
ir | The rank of the video page based on the number of inbound links in the UVCS.
ni | The number of inbound links for the video page in the UVCS.
ipr | The rank of the video page based on the number of inbound-linked pages in the UVCS.
nip | The number of inbound-linked pages for the video page in the UVCS.
L1 | Row vector of binary indicators specifying whether the pages with top inbound links link to the video page. An entry equals 1 if that page links to the page we are assessing, 0 otherwise. Top-k pages are considered.
W1 | Row vector with the number of inbound links of the pages with top inbound links, after normalization. Top-k pages are considered.
L2 | Row vector of binary indicators specifying whether the pages with top inbound-linked pages link to the video page. An entry equals 1 if that page links to the page we are assessing, 0 otherwise. Top-k pages are considered.
W2 | Row vector with the number of inbound-linked pages of the pages with top inbound-linked pages, after normalization. Top-k pages are considered.
NT, NB, NN | Row vectors containing the number of related materials that emerged during the past day, week and month on Twitter, blogs, and news websites, respectively.
wi | Binary indicator. Equals 1 if a related Wikipedia page for the online video is found, 0 otherwise.
α ◦ β | The inner product of α and β.

TABLE 3
Public Influence Scores

Score Range | Criteria
[0, 0.25] | The video can be found on some social networks and individual websites but is not widely available. It cannot be found on influential media outlets.
(0.25, 0.5] | The video is widely available on major social networks and numerous individual websites but cannot be found on influential media outlets.
(0.5, 0.75] | The video is widely available on major social networks and numerous individual websites, but can only be found on either a news website or Wikipedia.
(0.75, 1] | The video is widely available on major social networks and numerous individual websites, and can be found on multiple influential media outlets, such as news websites, Wikipedia, etc.
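The score ranges of Table 3 partition the interval [0, 1] into four bands, each closed on the right. A small sketch of that partition follows; the function name `influence_band`, the constant `BANDS` and the 0-based band index are assumptions made for illustration only.

```python
# Score ranges exactly as listed in Table 3, in increasing order.
BANDS = ["[0, 0.25]", "(0.25, 0.5]", "(0.5, 0.75]", "(0.75, 1]"]


def influence_band(score: float) -> int:
    """Return the 0-based index into BANDS of the range containing `score`."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("influence score must lie in [0, 1]")
    if score <= 0.25:
        return 0
    if score <= 0.5:
        return 1
    if score <= 0.75:
        return 2
    return 3


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.3, 0.75, 1.0):
        print(s, BANDS[influence_band(s)])
```

Because every band is closed on the right, a boundary score such as 0.25 belongs to the lower band.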

4 UNIFIED VIRTUAL COMMUNITY SPACE

In a social network, user influence is often modeled as a network flow problem or, similarly, as a link analysis problem [2], [9]. The outbound and inbound links of a user, established by comments and friend connections, are used to assess the user’s role and influence within the social network. The existing approaches focus on estimating the within-community influence of an individual user or a group. Such analysis benefits from three properties of a single, unified community: a unified data format, indicators with the same scale and meaning, and the ease of constructing a link graph to monitor the propagation. However, in our case, which is a cross-community analysis, these helpful properties no longer hold. Different websites present their pages in different formats, and consequently the desired data elements come in different formats. For example, for video-sharing websites we are interested in a video’s statistics such as view count, upload time and location, while for news websites or Wikipedia we only need to find the references to the video. The communities also differ in the scale of their data and user numbers, and the link relations among web pages from different communities are more difficult to discover than in the single-community case.

Given the above characterization, to reasonably represent the factors relevant to a video’s propagation in cyberspace, we define the concept of the Unified Virtual Community Space (UVCS), which contains the web pages relevant to the video. In this UVCS, all pages are relevant to the video, and they have some intrinsic properties:
1) Each page has a time stamp, indicating the publishing time or modification time of the page. The temporal relationships among pages can be explored (pages with missing time stamps are given inferred time stamps). An older page naturally has a larger likelihood of being the origin of a video’s propagation.
2) Each page receives a set of inbound links. Link relations between pages inside the UVCS and pages outside the UVCS are ignored; that is, we count only the links from pages in the UVCS to pages in the UVCS. A graph of pages related to the same video is thus established.
3) Each page’s rank in the UVCS is known. Provided by the search engine, this factor mainly reflects the importance of the website that hosts the page. Combined with the previous item, it describes how important the page is from a PageRank-like point of view. Nevertheless, this factor does not provide much information about the direction of influence.
Beyond these properties, complementary information from other sources, i.e., related blog posts, tweets, news and encyclopedia pages, can be retrieved using specialized search engines. The properties and the complementary information described above roughly depict the history of a video’s propagation and the evidence of its public influence. Each element is essential, but none is sufficient on its own. For instance, one may argue that we can model the problem as a network flow problem with temporal constraints, in which only the time stamps are essential.
However, suppose some breaking news is broadcast on TV; users may post it to the sharing site even before the video appears on the official website of the TV channel. Hence time is an important but not a decisive factor. Similarly, the page that ranks highest in the UVCS is not necessarily the origin of all similar videos in cyberspace, as its superior rank may be the product of its host’s great influence in cyberspace rather than of its being the origin. The same applies to the other components of the UVCS. Clearly, when identifying the propagation and influence of a video, all these components need to be considered simultaneously, and further analysis needs to be performed on the UVCS to determine the propagation and influence of an online video.

Note that each video has its own individual UVCS. In other words, we describe a video’s propagation with a separate UVCS in which it lives. Each video and its corresponding UVCS are independent of other videos and their UVCSs. This characteristic gives the whole framework another advantage: any learning model we train can be applied to out-of-sample videos, provided that new UVCSs are constructed for them. Next we show how to construct the UVCS for a given video.

4.1 Feature Representation

In this subsection we explain how to construct the UVCS and how to extract features to describe it. For clarity, all notations are listed in Table 1. Given an online video, its title is used as the search term to perform web searches on general web search engines (such as Google web search) and specialized search engines (Google blog search, Twitter search, etc.). The top-ranked pages retrieved for these terms form the UVCS, which is then processed and analyzed. The properties of the UVCS are extracted and combined into a formally defined UVCS feature vector as follows:

$$x = [\mathit{nt}, \mathit{tr}, r, \mathit{ir}, \mathit{ni}, \mathit{ipr}, \mathit{nip}, L_1 \circ W_1, L_2 \circ W_2, \mathit{NT}, \mathit{NB}, \mathit{NN}, \mathit{wi}]. \tag{1}$$

$x$ is the feature vector generated for the estimation of both the propagation and the influence. All components are listed in Table 1. The feature can be considered as having three parts.
• [nt, tr]: The first part preserves the temporal order in which all the related pages are published. nt and tr record the time stamp of the video page and the rank of that time stamp in the UVCS. Together they describe the temporal relations between the video page and other related pages, which is an important indicator of the video page’s role during the propagation.
• [r, ir, ni, ipr, nip, NT, NB, NN, wi]: The second part contains the search engine’s rank of the video page as well as the related references in various communities of the UVCS. These components reflect how popular the video is in parts of cyberspace outside the one sharing site. For example, ir and ni directly describe the within-UVCS linkage pattern of the page without considering the weights or importance of the host websites.
• [L1 ◦ W1, L2 ◦ W2]: The rest of the feature vector helps to evaluate both the propagation and the influence of the online video. L1 ◦ W1 and L2 ◦ W2 are critical in cases where the publishers themselves use the online video website as the primary channel to spread the video.
Each component of the feature vector is normalized differently. In general, we normalize a component by its rank in the UVCS. For instance, if a page’s time stamp makes it the 10th latest page in a UVCS that contains 100 pages altogether, then its time stamp rank is normalized to 0.9.

Intuitively, SVM or least squares regression can be used to classify a video’s propagation and evaluate its influence. However, they would be quite ineffective for the following reasons. Although the UVCS feature representation contains the vital and discriminative information needed for the tasks, the downside of the wide range of information it carries is that noise and bias are also introduced. For some videos, r may be dominant, while for others, tr and nt may be of greater importance. Any other component of the feature vector may likewise be more significant than the rest for some videos. Meanwhile, the manual annotation of the training data may also introduce inconsistency. Naturally, we consider using feature selection or dimension reduction to minimize the impact of the noisy feature space.
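To illustrate two of the steps described in this section, the sketch below counts inbound links while ignoring any link that has an endpoint outside the UVCS, and then applies a rank-based normalization. The formula 1 - rank/total is an assumption inferred from the single example in the text (the 10th of 100 pages gives 0.9). The function names, the tie-breaking rule and the toy page identifiers are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def inbound_counts(pages: Set[str],
                   links: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count inbound links per page, keeping only links whose source and
    destination are both pages of the UVCS."""
    counts = {page: 0 for page in pages}
    for src, dst in links:
        if src in pages and dst in pages:
            counts[dst] += 1
    return counts


def rank_by_count(counts: Dict[str, int]) -> Dict[str, int]:
    """Assign 1-based ranks; rank 1 goes to the page with the most inbound
    links. Ties are broken by page identifier (an arbitrary choice here)."""
    ordered = sorted(counts, key=lambda page: (-counts[page], page))
    return {page: position + 1 for position, page in enumerate(ordered)}


def rank_normalize(rank: int, total: int) -> float:
    """Map a 1-based rank among `total` pages to 1 - rank / total
    (assumed formula, matching the 10th-of-100 example)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must satisfy 1 <= rank <= total")
    return 1.0 - rank / total


if __name__ == "__main__":
    uvcs = {"video", "blog", "news"}
    links = [("blog", "video"), ("news", "video"),
             ("outside", "video"), ("video", "blog")]
    counts = inbound_counts(uvcs, links)  # the link from "outside" is ignored
    ranks = rank_by_count(counts)
    print(counts, ranks, rank_normalize(ranks["video"], len(uvcs)))
```

In this toy UVCS the link from the page named "outside" does not contribute to the count, which mirrors property 2) above.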

5 THE NLGL ALGORITHM

As we noted in Section 1, the propagation of a video’s influence may show different patterns. Naturally, the video propagation analysis problem can be formulated as a classification problem, and this is the primary objective of the study. In addition, we show that the estimation of video influence can be modeled as a ranking problem, to demonstrate that NLGL is capable of similar applications. In this section, we propose the NLGL algorithm to unify the two problems into a single learning framework.

NLGL is designed to fit our application scenario. In our application, we gather a collection of UVCS features for popular online videos; however, only a small portion of them are annotated by expert annotators due to limited human resources. Whatever learning method is used should fully utilize the manually annotated data; ideally, it should use the annotated data to infer labels for the un-annotated data, and finally it should generate a model to predict the labels or ranks of out-of-sample data.

5.1 Utilizing the Un-annotated Sample Data

Before explaining NLGL, we first obtain a training dataset with $n$ videos in total, of which $n'$ are manually annotated ($n' < n$), as $X = [x_1, \ldots, x_n]^T \in \mathbb{R}^{n \times d}$, where $x_i$ is the $d$-dimensional UVCS feature vector of the $i$th video in the dataset. We denote the annotated videos as $X^l = [x_1, \ldots, x_{n'}]^T$, and use $Y^l = [y_1, \ldots, y_{n'}]^T \in \mathbb{R}^{n' \times c}$ to denote the class labels or ranks of the manually annotated videos $X^l$, where $c$ is an integer greater than or equal to one. Note that $Y^l$ is a unified representation of the class labels for the propagation and the ranks for the influence ranking. In the first case, $c$ is the number of classes, and the label $y_i$ of the $i$th labeled video in $X^l$ satisfies $y_{ij} = 1$ if the $i$th video belongs to the $j$th class, and $y_{ij} = -1$ otherwise. In the second case, $c$ equals 1 and $y_i$ is a decimal value ranging from 0 to 1, with a higher value representing higher influence.
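As a small illustration of the unified label matrix $Y^l$, the following sketch builds it for both cases. The function names are hypothetical, and class indices are assumed to be 0-based integers, which the text does not specify.

```python
import numpy as np


def encode_class_labels(class_ids, num_classes):
    """Classification case: (n', c) matrix with +1 at the true class, -1 elsewhere."""
    class_ids = np.asarray(class_ids, dtype=int)
    Y = -np.ones((class_ids.size, num_classes))
    Y[np.arange(class_ids.size), class_ids] = 1.0
    return Y


def encode_influence_scores(scores):
    """Ranking case (c = 1): (n', 1) column of influence scores in [0, 1]."""
    s = np.asarray(scores, dtype=float).reshape(-1, 1)
    if np.any(s < 0.0) or np.any(s > 1.0):
        raise ValueError("influence scores must lie in [0, 1]")
    return s


Y_pc = encode_class_labels([0, 2, 1], num_classes=3)   # three labeled videos, c = 3
Y_ie = encode_influence_scores([0.9, 0.2, 0.55])       # same videos, c = 1
```

Both outputs have one row per annotated video, so the rest of the method can treat classification and ranking identically apart from the value of $c$.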

From now on, we use the classification problem (as it is slightly more complicated than the ranking problem) as an example to discuss our learning method. The same method can easily be converted to a ranking problem, with $c$ set to 1 and $Y^l$ being the manually assigned influence ranks. To learn a classification model that predicts the label of new videos, a straightforward method is to minimize the empirical error of the following regression model:

$$\min_{W^o} \sum_{i=1}^{n'} \| W^{oT} x_i - y_i \|_F^2, \tag{2}$$

where $W^o \in \mathbb{R}^{d \times c}$ is the classifier which is learnt from the manually annotated data and is able to predict the labels for the un-annotated data. The performance of this classifier is closely related to the number of manually labeled videos, namely $n'$. However, the availability of labeled videos is rather limited, as excessive human effort is needed for manually labeling a large-scale dataset. To leverage the unlabeled data for a better performance, it is desirable to design an algorithm which is able to predict the labels of the unlabeled videos by exploiting the data distribution of $X$. Let $F = [f_1, \ldots, f_n]^T \in \mathbb{R}^{n \times c}$ and $Y = [y_1, \ldots, y_{n'}, y_{n'+1}, \ldots, y_n]^T \in \mathbb{R}^{n \times c}$ be the predicted labels from the learning algorithm and the actual labels marked by human annotators, where $f_i$ and $y_i$ are the predicted and actual labels of the $i$th video in $X$, respectively. $F$ should be consistent with $Y$, which is to minimize the following objective function:

$$\min_{f_i} \sum_{i=1}^{n} \| f_i - y_i \|_F^2 \;\Leftrightarrow\; \min_{F} \| F - Y \|_F^2. \tag{3}$$

All entries in $y_{n'+1}, \ldots, y_n$ are set to 0 for the unlabeled training data. Following [32], a diagonal matrix $D \in \mathbb{R}^{n \times n}$ is added to Equation (3), where $D_{ii} = \infty$ if $x_i$ is manually labeled$^7$ and $D_{ii} = 1$ otherwise. In this way, the objective is penalized far less by errors on the unlabeled videos than by errors on the labeled ones. We then rewrite Equation (3) as follows:

$$\min_{F} \mathrm{tr}\left( (F - Y)^T D (F - Y) \right). \tag{4}$$
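The construction of $Y$ and $D$ and the weighted error of Equation (4) can be sketched as follows. This is a minimal illustration under two assumptions: the labeled videos occupy the first $n'$ rows, as in the definition of $X^l$, and the infinite weight is replaced by the large constant $10^6$ mentioned in the footnote. The function names are hypothetical.

```python
import numpy as np


def build_targets_and_weights(Y_labeled, n_total, big=1e6):
    """Return the full label matrix Y (zeros for unlabeled rows) and diagonal D."""
    n_lab, c = Y_labeled.shape
    Y = np.zeros((n_total, c))
    Y[:n_lab] = Y_labeled              # labeled videos are assumed to come first
    weights = np.ones(n_total)         # D_ii = 1 for unlabeled videos
    weights[:n_lab] = big              # large constant standing in for infinity
    return Y, np.diag(weights)


def weighted_fit_error(F, Y, D):
    """Objective of Equation (4): tr((F - Y)^T D (F - Y))."""
    R = F - Y
    return float(np.trace(R.T @ D @ R))


Y_l = np.array([[1.0, -1.0], [-1.0, 1.0]])      # two labeled videos, c = 2
Y, D = build_targets_and_weights(Y_l, n_total=4)
```

With this weighting, a unit error on a labeled row costs $10^6$ while the same error on an unlabeled row costs 1, so the predicted labels of annotated videos stay close to the annotations.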

7. In the experiment, we set it to a very large constant, i.e., $10^6$.

5.2 Preserving Local Structure for the Annotated Data

Recently local learning has shown very promising effectiveness and often outperforms global learning methods [20], [29], [35]. In [32] a local learning method is proposed for cross media retrieval and it shows robust results in data ranking. The basic idea is to train a local linear regression model to predict the ranking score of each datum and its k-nearest neighbors, and then all the local linear regression models are optimized globally. Inspired by [32], we propose to employ a group of local linear regression models to predict $F$. Suppose $N_k(x_i)$ is a set that contains $x_i$ and its k-nearest neighbors in the training set. The prediction error for a single data point $x_j \in N_k(x_i)$ is defined as:

$$l(i, j) = \| p_i^T x_j + b_i - f_j \|_F^2, \tag{5}$$

where $p_i \in \mathbb{R}^{d \times c}$ is the local prediction model of $N_k(x_i)$, $f_j$ is the predicted label of $x_j$, and $b_i \in \mathbb{R}^c$ is the bias term in the linear regression model. For each data point $x_i$, the total loss of the local models $l(i, j)$ of its k-nearest neighbors can be written as:

$$L(i) = \sum_{x_j \in N_k(x_i)} \| p_i^T x_j + b_i - f_j \|_F^2 + \lambda\, \mathrm{tr}(p_i^T p_i). \tag{6}$$

In order to obtain the globally optimal predicted labels of all the training data, we sum up the loss functions for each $x_i$ and minimize the total loss. Let us denote $X_i = [x_i, x_{i_1}, \ldots, x_{i_k}]^T$ as the matrix of nearest neighbors of $x_i$, i.e., $N_k(x_i)$, and $f_{(i)} = [f_i, f_{i_1}, \ldots, f_{i_k}]^T$ as the matrix containing the predicted labels of the elements in $N_k(x_i)$, with $v_i = [i, i_1, \ldots, i_k]$ being the vector containing the indices of the samples in $N_k(x_i)$ and $1_{k+1} \in \mathbb{R}^{k+1}$ denoting the column vector with all ones. We derive the objective as:

$$\min_{f_{(i)}|_{i=1}^{n},\, b_i|_{i=1}^{n},\, p_i|_{i=1}^{n}} \sum_{i=1}^{n} \left( \| X_i p_i + 1_{k+1} b_i^T - f_{(i)} \|_F^2 + \lambda\, \mathrm{tr}(p_i^T p_i) \right). \tag{7}$$

After algebraic manipulation, the objective function in Equation (7) can be rewritten as [32]:

$$\min_{F} \mathrm{tr}\left( F^T [T_1, \ldots, T_n] \begin{bmatrix} A_1 & & \\ & \ddots & \\ & & A_n \end{bmatrix} [T_1, \ldots, T_n]^T F \right), \tag{8}$$

where $T$ and $A$ satisfy that:

$$(T_i)_{pq} = \begin{cases} 1 & \text{if } p = (v_i)_q \\ 0 & \text{otherwise} \end{cases}, \tag{9}$$

$$A_i = H - H X_i^T (X_i H X_i^T + \lambda I)^{-1} X_i H. \tag{10}$$

Let $A = \mathrm{diag}(A_1, \ldots, A_n)$ be the block-diagonal matrix above, $H = I - \frac{1}{k+1} 1_{k+1} 1_{k+1}^T$, and $T = [T_1, \ldots, T_n]$. Equation (8) is then written as:

$$\min_{F} \mathrm{tr}(F^T T A T^T F). \tag{11}$$

Combining Equations (11) and (4), we arrive at

$$\min_{F} \mathrm{tr}(F^T T A T^T F) + \mathrm{tr}\left( (F - Y)^T D (F - Y) \right). \tag{12}$$

5.3 Dimension Reduction and the Complete Model

Despite the high accuracy, a limitation of Equation (12) is that it is not able to predict the labels of videos outside the training set. To this end, we propose to simultaneously learn the predicted labels of the training data and a classifier which can be used to predict the label of a video outside the training set. Meanwhile, the representation of the feature vector in the UVCS could be noisy, which may degrade the performance. Previous research efforts have shown that dimension reduction can remove irrelevant, redundant and noisy information from a dataset, keeping only the informative, relevant or important information. It has also been shown in [8] that it is beneficial for supervised learning to project the data into a low dimensional subspace when training the classifiers. Suppose there is a linear transformation which transforms the video data $X$ in the feature space to a more compact and accurate representation $Z \in \mathbb{R}^{n \times d'}$, where $d' \le d$ is the reduced dimensionality. The transformation between $X$ and $Z$ can be formulated as:

$$Z = XQ, \tag{13}$$

where $Q \in \mathbb{R}^{d \times d'}$ is the transformation matrix. If $Q$ is properly trained, the classification based on $Z$ would result in better performance. While most of the existing classification algorithms rarely consider the correlation between the low dimensional subspace and a classifier during training, we argue that such information is helpful for semi-supervised learning. Therefore, we integrate training data label prediction, dimension reduction and classifier inference into a joint framework and propose to minimize the final objective function of NLGL:

$$\min_{W, F, Q:\, Q^T Q = I} \lambda_1 \left( \| XQW - F \|_F^2 + \lambda_2 \| W \|_F^2 \right) + \mathrm{tr}(F^T T A T^T F) + \mathrm{tr}\left( (F - Y)^T D (F - Y) \right), \tag{14}$$

where $\lambda_1$ and $\lambda_2$ are parameters. The first term aims at learning an optimal dimension reduction transformation and a classifier simultaneously. The second and third terms predict the labels of the unlabeled training data. Besides, $\| XQW - F \|_F^2$ can be regarded as a regularizer of the local learning algorithm, which brings global information into the predicted labels $F$. On the other hand, $\mathrm{tr}(F^T T A T^T F)$ can be regarded as a regularizer of the global regression model which preserves the local manifold structures of the training data.

5.4 Iterative Optimization

Next we propose an iterative approach to optimize the objective function of NLGL shown in Equation (14). The approach consists of two steps. In the first step, we fix $F$ and optimize $Q$ and $W$. In the second step, $Q$ and $W$ are fixed to optimize $F$. The two steps are iteratively performed until convergence.

First, we fix $F$ and optimize $Q$ and $W$. By setting the derivative of Equation (14) with respect to $W$ to 0, we have:

$$W = (Q^T X^T X Q + \lambda_2 I)^{-1} Q^T X^T F. \tag{15}$$

Substituting $W$ in Equation (14) by Equation (15), we arrive at:

$$\max_{Q^T Q = I} \mathrm{tr}\left( (Q^T (X^T X + \lambda_2 I) Q)^{-1} Q^T X^T F F^T X Q \right). \tag{16}$$
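The step from Equation (14) to Equations (15) and (16) can be spelled out as follows. This is a reconstruction from the definitions in the text rather than a derivation given by the authors; the shorthand $g(W)$ is introduced here only for illustration.

```latex
% With F and Q fixed, only the first term of (14) depends on W:
g(W) = \| XQW - F \|_F^2 + \lambda_2 \| W \|_F^2 .

% Setting the derivative to zero gives Equation (15):
\frac{\partial g}{\partial W}
  = 2\, Q^T X^T (XQW - F) + 2 \lambda_2 W = 0
\;\Longrightarrow\;
W = (Q^T X^T X Q + \lambda_2 I)^{-1} Q^T X^T F .

% Because Q^T Q = I, the matrix being inverted can be rewritten as
Q^T X^T X Q + \lambda_2 I = Q^T (X^T X + \lambda_2 I) Q .

% Substituting W back into g and using the cyclic property of the trace:
g(W) = \mathrm{tr}(F^T F)
     - \mathrm{tr}\!\left( (Q^T (X^T X + \lambda_2 I) Q)^{-1}
                            Q^T X^T F F^T X Q \right) .
```

Since $\mathrm{tr}(F^T F)$ and the remaining terms of Equation (14) do not depend on $Q$ or $W$, minimizing Equation (14) over $Q$ is equivalent to maximizing the trace in Equation (16).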


Algorithm 1: Iterative Optimization
Input: Feature matrix $X$, manual labels $Y$, number of iterations $N$
Output: Reduction matrix $Q$, prediction model $W$, predicted labels $F$
Algorithm:
1: Initialize $F$ by $F = (T A T^T + D)^{-1} D Y$
2: while the number of iterations $N$ is not reached do
3:   Update $Q$ according to Equation (16)
4:   Update $W$ according to Equation (15)
5:   Update $F$ according to Equation (18)
6: end while
7: return $Q$, $W$, $F$

The optimization problem shown in Equation (16) can be solved by generalized eigenvalue decomposition. Once the optimal $Q$ is obtained, the corresponding optimal $W$ can be computed according to Equation (15). Next, we fix $W$ and $Q$ to update $F$. By setting the derivative of Equation (14) with respect to $F$ to zero, we have:

$$2 \lambda_1 (F - XQW) + 2\, T A T^T F + 2 D (F - Y) = 0. \tag{17}$$

Collecting the terms in $F$ gives $(\lambda_1 I + T A T^T + D) F = \lambda_1 XQW + DY$, so $F$ can be represented by $W$ and $Q$ as follows:

$$F = (\lambda_1 I + T A T^T + D)^{-1} (\lambda_1 XQW + DY). \tag{18}$$

The problem shown in Equation (14) is non-convex, so only local optima can be achieved by the above iterative approach. Therefore, a good initialization is desirable. We propose to solve the optimization problem shown in Equation (12) to initialize $F$, i.e., $F$ is initialized by

$$F = (T A T^T + D)^{-1} D Y. \tag{19}$$
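The iterative procedure of Algorithm 1 can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the authors' implementation. All function and parameter names (`local_regularizer`, `nlgl_fit`, `nlgl_objective`, `nlgl_predict`, `d_red`) are hypothetical. Samples are stored as rows, so the local matrix of Equation (10) is written in the equivalent row-layout form $H - H X_i (X_i^T H X_i + \lambda I)^{-1} X_i^T H$. The maximization in Equation (16) is solved through a Cholesky-reduced generalized eigenproblem followed by a QR step, and the $F$-step solves the stationarity condition of Equation (17) directly.

```python
import numpy as np


def local_regularizer(X, k, lam):
    """Return M = sum_i T_i A_i T_i^T, the n x n matrix written T A T^T in Eq. (11)."""
    n, d = X.shape
    H = np.eye(k + 1) - np.ones((k + 1, k + 1)) / (k + 1)   # centering matrix
    sq_norm = (X ** 2).sum(axis=1)
    dist = sq_norm[:, None] + sq_norm[None, :] - 2.0 * X @ X.T
    M = np.zeros((n, n))
    for i in range(n):
        row = dist[i].copy()
        row[i] = -np.inf                    # make sure x_i itself is selected first
        idx = np.argsort(row)[:k + 1]       # indices v_i: x_i and its k neighbours
        Xi = X[idx]                         # (k+1, d), one sample per row (assumed layout)
        G = Xi.T @ H @ Xi + lam * np.eye(d)
        Ai = H - H @ Xi @ np.linalg.solve(G, Xi.T @ H)
        M[np.ix_(idx, idx)] += Ai           # same effect as adding T_i A_i T_i^T
    return M


def nlgl_objective(X, Y, D, M, Q, W, F, lam1, lam2):
    """Value of the joint objective in Eq. (14)."""
    fit = np.linalg.norm(X @ Q @ W - F) ** 2 + lam2 * np.linalg.norm(W) ** 2
    R = F - Y
    return lam1 * fit + np.trace(F.T @ M @ F) + np.trace(R.T @ D @ R)


def nlgl_fit(X, Y, D, M, d_red, lam1, lam2, n_iter=10):
    """Alternate the (Q, W)-step and the F-step for a fixed number of iterations."""
    n, d = X.shape
    F = np.linalg.solve(M + D, D @ Y)                  # initialization, Eq. (19)
    B = X.T @ X + lam2 * np.eye(d)                     # positive definite for lam2 > 0
    L = np.linalg.cholesky(B)                          # B = L L^T
    for _ in range(n_iter):
        # Q-step, Eq. (16): top generalized eigenvectors of (X^T F F^T X, B),
        # computed from the symmetric problem L^{-1} S L^{-T} u = mu u.
        S = X.T @ F @ F.T @ X
        C = np.linalg.solve(L, np.linalg.solve(L, S).T).T
        _, U = np.linalg.eigh((C + C.T) / 2.0)         # eigenvalues in ascending order
        V = np.linalg.solve(L.T, U[:, -d_red:])
        Q, _ = np.linalg.qr(V)                         # orthonormalize: Q^T Q = I
        # W-step, Eq. (15)
        XQ = X @ Q
        W = np.linalg.solve(XQ.T @ XQ + lam2 * np.eye(d_red), XQ.T @ F)
        # F-step: Eq. (17) solved for F
        F = np.linalg.solve(lam1 * np.eye(n) + M + D, lam1 * XQ @ W + D @ Y)
    return Q, W, F


def nlgl_predict(X_new, Q, W):
    """Row-wise form of Eq. (20): each output row is (W^T Q^T x)^T."""
    return X_new @ Q @ W
```

The QR step is a design choice of this sketch: the trace in Equation (16) is unchanged when $Q$ is multiplied on the right by an invertible matrix, so orthonormalizing the generalized eigenvectors enforces $Q^T Q = I$ without changing the objective value. The dense pairwise-distance matrix is only practical for moderate $n$; a nearest-neighbor index would be needed at larger scale.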

Once $W$ and $Q$ are obtained, we can use them to predict the label of a new video represented by a feature vector $x_i$ in the UVCS as follows:

$$f_i = W^T Q^T x_i^T. \tag{20}$$

The NLGL algorithm is summarized in Algorithm 1. As an inductive learning model, once the classifier is trained offline, it can process new, out-of-sample data in linear time with respect to the data size. After we use the related web pages to construct a new UVCS for each new out-of-sample video, we are able to apply the model to the new UVCS features to predict their labels. This makes our method very efficient for processing large-scale real-life data. With slightly different settings, i.e., letting the annotators mark the influence ranks of the videos instead of the propagation classes and changing the dimensionality of the related matrices, we can easily obtain a similar ranking model from NLGL. Next we examine how NLGL performs in experiments.

6 EXPERIMENTS

In this section, we mainly use the Propagation Classification (PC) task to evaluate the performance of NLGL. We also show results for the Influence Estimation (IE) task to demonstrate NLGL's capability and flexibility in applying to other tasks.

6.1 Dataset

The dataset we use for the evaluation consists of 300,000 related web pages regarding 3,000 of the most popular online video entries collected from the Internet (YouTube.com). To construct the UVCS, for each video, the title is used as the search terms to perform searches with a general search engine. We also use specialized search services such as blog search, news search, and tweet search to collect related web pages. A UVCS feature vector (see Equation (1)) is extracted for each video. In our experiment, the UVCS feature vector has a dimensionality of 117 in total.

The generation of $Y$ (used in Equations (3) to (19)) is manual. 12 different annotators, 8 male and 4 female, are asked to label and rank each video with a propagation class and an influence rank. 500 videos in the dataset are each assigned a PC label and an IE score in [0, 1], with higher values representing higher influence in the cyberspace. Each annotator labels 100 videos, so that each video is labeled by at least two of the annotators. Meanwhile, each annotator's 100 videos are covered by two other annotators, each of whom covers 50. For IE, the average score of each video is used. For PC, if the annotators' labels do not agree, then the label from the annotator who is considered more experienced (has more labels agreed with by other annotators) is used. Each video belongs to only one PC and has only one IE score.

Specifically, the criteria for PC labeling are listed in Table 2. The same approach is used in IE, and the same human effort is involved to manually assign an influence score to a video. For IE, Table 3 shows a general guideline for the influence scores. Annotators are free to adjust the actual scores according to the degree to which the video's influence matches our criteria.

6.2 Evaluation

The evaluation of NLGL follows K-fold cross-validation, with K = 5 in our experiments. That is, the 500 manually labeled videos are randomly partitioned into 5 folds, each of which contains 100 videos. The test is repeated 5 times. Each time, one of the folds is selected as the test data and the remaining 4 folds are used as part of the training data. The average performance of the 5 tests is reported. As for the performance metric, the Area Under the receiver operating characteristic Curve (AUC) has been frequently used to evaluate the effectiveness of classification, model selection, etc. The receiver operating characteristic curve is a plot of the true positive rate as a function of the false positive rate of a classifier, and the AUC functions as a performance metric for misclassification costs of classifiers. Since our application is a multi-class classification, we use both microAUC and macroAUC [7] to measure the performance of our method, similar to [28]. For the public influence ranking task, we use the average precision.

Fig. 4. Performance Comparison. (a) Comparison on PC. (b) Comparison on IE.

6.3 Performance Comparison

Under the same evaluation metric, 4 reference methods are used for comparison on the video propagation classification task, i.e., Least Square Regression (LSR), Support Vector Machine (SVM) [5], Classification with Dimensionality Reduction (CDR) [8], and LSR with Manifold Regularization (LSRMR) [1]. The ranking versions of these reference methods (such as in [4]) are used to compare with NLGL on the public influence task. For each reference method, the parameters have been carefully tuned and selected to achieve the best performance. We use the same experimental procedure to evaluate their performance on the PC and IE tasks. The 5-fold cross-validation is applied, and the average results are reported in Figure 4. Our method outperforms all 4 reference methods in all cases. For the PC task, the micro and macro AUC values of our method reach 0.86 and 0.736 respectively, while none of LSR, SVM or LSRMR has a micro AUC value higher than 0.75 or a macro AUC value higher than 0.64. Similarly, for IE the performance of NLGL surpasses that of the others by a great margin.

We argue that there are several reasons for the differences in performance. Most importantly, the feature definition is indeed noisy. Some dimensions, or some videos, can introduce great bias into the learned model. In our method, this impact is reduced by conducting dimension reduction simultaneously with the learning of the classifier. As this additional transformation is performed before the classifier predicts labels, the data sent into the classifier consist of the most important information from the original features, and most of the noise is suppressed. This statement is also supported by CDR's results. CDR has the second best performance in the tests, showing that the effect of dimension reduction is the major reason for the performance superiority. LSR considers only the global consistency and ignores the underlying local relationship between videos, making it the least effective. Noticeably, LSRMR outperforms LSR, showing that the manifold structure of the predicted $F$ effectively boosts the performance of the classifier $W$, which is also the reason that NLGL greatly outperforms CDR.
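For reference, the micro- and macro-averaged AUC values reported above can be computed along the following lines. The sketch uses one common set of definitions (macroAUC as the mean of per-class AUCs, microAUC as a single AUC over all pooled video-class pairs); whether these match the exact definitions of [7] is an assumption, and the function names are hypothetical.

```python
import numpy as np


def auc(scores, positives):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Cost is O(n_pos * n_neg), which is
    fine for folds of about 100 test videos.
    """
    scores = np.asarray(scores, dtype=float)
    positives = np.asarray(positives, dtype=bool)
    pos, neg = scores[positives], scores[~positives]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)


def macro_auc(S, Y):
    """Mean of the per-class AUCs; S and Y are (n, c), Y > 0 marks the true class."""
    return float(np.mean([auc(S[:, j], Y[:, j] > 0) for j in range(S.shape[1])]))


def micro_auc(S, Y):
    """Single AUC over all (video, class) pairs pooled together."""
    return float(auc(S.ravel(), Y.ravel() > 0))


# Toy example: 4 test videos, 2 classes, labels in the +1 / -1 encoding.
S = np.array([[0.90, 0.10], [0.20, 0.80], [0.25, 0.75], [0.30, 0.70]])
Y = np.array([[1, -1], [-1, 1], [1, -1], [-1, 1]])
```

In a 5-fold protocol these functions would be applied to the predicted scores of each held-out fold and the five values averaged.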

Fig. 5. Effect of Training Data Size. (a) PC: microAUC and macroAUC versus data size (1000 to 3000). (b) IE: average precision versus data size (1000 to 3000).

6.4 Effect of Training Data Size

Data size is known to be a key factor in the performance of semi-supervised classifiers. In our experiments we also test how the data size affects the results. The number of videos used for training begins at 1000 and increases by 500 each time until it reaches 3000. In the data used in each test, approximately 1/6 of the videos are human-labeled. Figure 5 shows that the performance of our learning method improves as the data size increases. The more labeled and unlabeled data are used for training, the less biased the learned model becomes, making the classifier more accurate.

6.5 Effect of Labeled Data Size

Figure 6 depicts the effect of the labeled data size (manually labeled data in $Y$) on the PC and IE tasks. In these 5 tests, we use 50, 100, 200, 300, and 400 labeled videos respectively, plus all other unlabeled data, as the training set. The changes of the AUC values and the average precision are reported. Naturally, more labeled data in $Y$ lead to a better predicted $F$, and subsequently affect the optimization of $Q$ and $W$. The selection of the training data size is a trade-off between the manpower involved and the classification performance. Results in Figure 6 indicate that the training data size is sufficient to show the effectiveness of our method.

Fig. 6. Effect of Labeled Data Size. (a) PC: microAUC and macroAUC versus labeled data size (50 to 400). (b) IE: average precision versus labeled data size (50 to 400).

6.6 Parameter Tuning

In this subsection we investigate the effect of the two parameters introduced in Equation (14), i.e., $\lambda_1$ and $\lambda_2$. The aim is to find the optimal balance between the linear regression model and the local manifold structure, so as to preserve as much relevant information in the training data as possible. At the same time, the effect of overfitting should be kept to a minimum. Figure 7 provides an overview of the effect and trends on the AUC values and the average precision values for both the PC and the IE tasks when we tune $\lambda_1$ and $\lambda_2$ simultaneously. NLGL is evidently stable with respect to the parameter changes. In all cases the microAUC remains above 0.8 and the macroAUC stays around 0.7 as well. IE is slightly more sensitive to the parameters, but every setting still yields an average precision around 0.7. We also find that the learning method is much less affected by $\lambda_2$ than by $\lambda_1$. This means that finding the optimal balance between the linear regression model and the local manifold structure is more important than preventing over-fitting. These results show that the performance peaks when $\lambda_1 = 10^4$. All other greater and lesser values of $\lambda_1$ lead to worse performance. In our experiments, we use $\lambda_1 = 10^4$ and $\lambda_2 = 1$ as our default setting.

Fig. 7. The Effect of $\lambda_1$ and $\lambda_2$. (a) MicroAUC of PC. (b) MacroAUC of PC. (c) Average Precision of IE. Both parameters take the values $10^{-6}$, $10^{-4}$, $10^{-2}$, $1$, $10^{2}$, $10^{4}$, $10^{6}$.

6.7 Convergence

The iterative solution in Algorithm 1 guarantees that the objective function value $g$ (Equation (14)) decreases until convergence. We show the relationship between the objective function value and the number of iterations in Figure 8. For both tasks the method converges quickly, i.e., in fewer than 5 iterations. This property facilitates the classification tasks on large-scale datasets during offline training.

Fig. 8. Convergence. (a) PC. (b) IE. Objective value $g$ ($\times 10^6$) versus iteration (1 to 10).

6.8 Observation and Discussion

Next we discuss some interesting cases in the predicted labels and ranks for PC and IE. In particular, we examine the videos retrieved from the ‘most-viewed of all time’ feed and the ‘most-viewed of the week’ feed. In summary, these two sets of results do not differ much, although we can see minor disagreement, as the human annotators and the machine have generated slightly different results. Since different annotators may also understand the criteria differently, these misalignments are reasonable. Closer observation of the results also reveals some interesting facts about a video's propagation patterns and its influence outside the targeted site.

Firstly we examine the aged videos. A lot of them were uploaded without any particular purpose other than sharing and amusing other people. However, when they became popular both inside and outside the targeted site, business, charity and casual activities surrounding the videos began to emerge. For instance, after the “Charlie bit my finger - again!” video and the “The Sneezing Baby Panda” video received tremendous attention in cyberspace, dedicated websites$^8$ were founded to exploit their fame. On the other hand, some videos, first published at other places on the Web, cannot be seen on any official websites. Hence the targeted site becomes the major source of their propagation. The “Paul Potts sings Nessun Dorma” video, the “Susan Boyle - Britains Got Talent 2009” video and some other videos are likely to have been recorded from TV programs; however, there are no official channels publicly available for these videos. In conclusion, if a video is labeled as having originated outside the targeted site but generates very high public influence, it is very likely to be an unofficially uploaded TV episode, a movie, or a music video.

Then we investigate recent videos. In this set we discover some interesting user behavior. People who have their own websites still use the targeted site as their primary publication channel for self-made videos to maximize the influence, since the targeted site itself is a very influential host of information. The “Strippers and Stuff” video is a good example of this kind of behavior.

In terms of IE, we notice that the average public influence of the aged videos is higher than that of the recent videos. The influence of aged videos tends to be greater as a result of accumulated attention over time. After looking into their UVCS features, we find that the web pages related to the aged videos clearly outnumber those related to the recent videos. However, the burst of videos shows quite different patterns. More blog posts, news pages and tweets are observed for the more recent videos. For instance, on Twitter, we find 195 tweets posted in the past week regarding the “Charlie bit my finger - again!” video, while the number of tweets about “Charlie Sheen: The Unedited Version” during the same time period is more than 4000. However, if we include older tweets, the first video beats the second in the number of related tweets by orders of magnitude. The same pattern is found on blogs, news websites and other websites.

8. http://harryandcharlie.blogspot.com.au, http://www.sneezingbabypanda.com

7 CONCLUSION AND FUTURE WORK

Online videos are so popular nowadays that they have begun to greatly change people's daily entertainment. The study of how online videos propagate and how influential they are outside a video sharing site is an increasingly significant research problem. The identification of an online video's origin and propagation patterns, from the video sharing site's perspective, is crucial to its business models as well as to its partners' decision making for their marketing strategies. In this article, we propose a novel learning model for the classification of video propagation and demonstrate that it can also be applied to the estimation of video influence. We determine whether a popular online video originated from the video sharing site or from somewhere else on the Internet. We determine how it became popular by analyzing its life cycle. We also give a rough estimation of its general influence on the Internet.

Technically, we have made four major contributions. First, we have made a pioneering effort to model a video's propagation and influence in cyberspace by a unified representation, namely the UVCS. The UVCS utilizes multimodal indicators including spatial information, page interlinkage relations, social network and news media exposure, and so on. It thereby offers a comprehensive and panoramic way of describing an online video's life cycle. Second, we devise a novel learning method called NLGL. NLGL exploits the benefits of local learning, manifold structure and dimension reduction. With NLGL, we model the propagation classification and influence estimation tasks as a classification problem and a ranking problem under the same learning framework. Third, a real-life dataset is collected from YouTube, Twitter, blogs and news websites for the evaluation of our method. When tested in extensive experiments with human assistance, the method demonstrates good effectiveness and stability. Its superior performance against other learning models is also supported by our results. Last but not least, we provide more insight into the interesting phenomena and behaviors around online videos and sharing site users.

In our future research, there is great potential to extend this work and equip it with other interesting capabilities. The most likely problem would be the identification of the actual origin of the video. We are also investigating the possibility of establishing an inter-sharing-site influence model to analyze how video sharing sites influence each other.

REFERENCES

[1] M. Belkin, P. Niyogi, and V. Sindhwani. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 7:2399–2434, 2006.
[2] F. Benevenuto, F. Duarte, T. Rodrigues, V. A. F. Almeida, J. M. Almeida, and K. W. Ross. Understanding video interactions in youtube. In ACM Multimedia, pages 761–764, 2008.
[3] M. Cha, H. Kwak, P. Rodriguez, Y.-Y. Ahn, and S. B. Moon. I tube, you tube, everybody tubes: analyzing the world's largest user generated content video system. In Internet Measurement Conference, pages 1–14, 2007.
[4] O. Chapelle and S. S. Keerthi. Efficient algorithms for ranking with svms. Inf. Retr., 13(3):201–215, 2010.
[5] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.
[6] R. Ji, X. Xie, H. Yao, and W.-Y. Ma. Mining city landmarks from blogs by graph modeling. In ACM Multimedia, pages 105–114, 2009.
[7] S. Ji, L. Tang, S. Yu, and J. Ye. Extracting shared subspace for multi-label classification. In KDD, pages 381–389, 2008.
[8] S. Ji and J. Ye. Linear dimensionality reduction for multi-label classification. In IJCAI, pages 1077–1082, 2009.
[9] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, pages 604–632, 1999.
[10] Y. Lee, H.-Y. Jung, W. Song, and J.-H. Lee. Mining the blogosphere for top news stories identification. In SIGIR, pages 395–402, 2010.
[11] C. X. Lin, B. Zhao, Q. Mei, and J. Han. Pet: a statistical model for popular events tracking in social communities. In KDD, pages 929–938, 2010.
[12] J. Liu, Z. Huang, H. Cai, H. T. Shen, C.-W. Ngo, and W. Wang. Near-duplicate video retrieval: Current research and future trends. ACM Computing Surveys, 45(4), 2013.
[13] J. Liu, Z. Huang, H. Cheng, Y. Chen, H. T. Shen, and Y. Zhang. Presenting diverse location views with real-time near-duplicate photo elimination. In ICDE, pages 505–516, 2013.
[14] J. Liu, Z. Huang, H. T. Shen, and B. Cui. Correlation-based retrieval for heavily changed near-duplicate videos. ACM Trans. Inf. Syst., 29(4):21, 2011.
[15] L. Liu, L. Sun, Y. Rui, Y. Shi, and S. Yang. Web video topic discovery and tracking via bipartite graph reinforcement model. In WWW, pages 1009–1018, 2008.
[16] T. Mei, L. Yang, X. S. Hua, H. Wei, and S. Li. VideoSense: a contextual video advertising system. In ACM Multimedia, pages 463–464. ACM, 2007.
[17] A. Mislove, M. Marcon, P. K. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and analysis of online social networks. In Internet Measurement Conference, pages 29–42, 2007.
[18] F. Nie, D. Xu, I. W.-H. Tsang, and C. Zhang. Flexible manifold embedding: A framework for semi-supervised and unsupervised dimension reduction. IEEE Transactions on Image Processing, 19(7):1921–1932, 2010.
[19] S. Paisitkriangkrai, T. Mei, J. Zhang, and X.-S. Hua. Scalable clip-based near-duplicate video detection with ordinal measure. In CIVR, pages 121–128, 2010.
[20] S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, pages 2323–2326, 2000.
[21] S. D. Roy, T. Mei, W. Zeng, and S. Li. Socialtransfer: cross-domain transfer learning from social streams for media applications. In ACM Multimedia, pages 649–658, 2012.
[22] S. D. Roy, T. Mei, W. Zeng, and S. Li. Towards cross domain learning for social video popularity prediction. IEEE Transactions on Multimedia, 2013.
[23] T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes twitter users: real-time event detection by social sensors. In WWW, pages 851–860, 2010.
[24] V. K. Singh, M. Gao, and R. Jain. Social pixels: genesis and evaluation. In ACM Multimedia, pages 481–490, 2010.
[25] J. Song, Y. Yang, Z. Huang, H. T. Shen, and R. Hong. Multiple feature hashing for real-time large scale near-duplicate video retrieval. In ACM Multimedia, pages 423–432, 2011.
[26] F. Wang and C. Zhang. Label propagation through linear neighborhoods. IEEE Trans. Knowl. Data Eng., 20(1):55–67, 2008.
[27] Z. Wang, L. Sun, X. Chen, W. Zhu, J. Liu, M. Chen, and S. Yang. Propagation-based social-aware replication for social video contents. In ACM Multimedia, pages 29–38, 2012.
[28] F. Wu, Y. Han, Q. Tian, and Y. Zhuang. Multi-label boosting for image annotation by structural grouping sparsity. In ACM Multimedia, pages 15–24, 2010.
[29] M. Wu and B. Scholkopf. Transductive classification via local learning regularization. In ICAIS, pages 628–635, 2007.
[30] X. Wu, I. Ide, and S. Satoh. News topic tracking and re-ranking with query expansion based on near-duplicate detection. In PCM, pages 755–766, 2009.
[31] X. Wu, I. Ide, and S. Satoh. Pagerank with text similarity and video near-duplicate constraints for news story re-ranking. In MMM, pages 533–544, 2010.
[32] Y. Yang, D. Xu, F. Nie, J. Luo, and Y. Zhuang. Ranking with local regression and global alignment for cross media retrieval. In ACM Multimedia, pages 175–184, 2009.
[33] Y. Yang, Y. Yang, and H. T. Shen. Effective transfer tagging from image to video. TOMCCAP, 9(2):14, 2013.
[34] L. Zhang, C. Chen, W. Chen, J. Bu, D. Cai, and X. He. Convex experimental design using manifold structure for image retrieval. In ACM Multimedia, pages 45–54, 2009.
[35] D. Zhou, O. Bousquet, T. Lal, J. Weston, and B. Scholkopf. Learning with local and global consistency. In NIPS, pages 595–602, 2003.

Dr. Jiajun Liu is a Post-doctoral Research Fellow in CSIRO, Australia. His research topics include multimedia content analysis, indexing, and retrieval. He received his BEng degree from Nanjing University, China, and obtained his PhD from the University of Queensland in 2013. He also worked as a Researcher/Software Engineer for IBM China during 2006 to 2008.

Dr. Yi Yang is a DECRA Research Fellow in School of Information Technology and Electrical Engineering, The University of Queensland. He received his Ph.D degree from Zhejiang University, in Computer Science in 2010. His research interests include machine learning and its applications to multimedia content analysis and indexing.

Dr. Zi Huang is a Lecturer and Australian Postdoctoral Fellow in School of Information Technology and Electrical Engineering, The University of Queensland. She received her BSc degree from Department of Computer Science, Tsinghua University, China, and her PhD in Computer Science from School of Information Technology and Electrical Engineering, The University of Queensland. Dr. Huang’s research interests include multimedia search, knowledge discovery, and social data analysis.

Dr. Yang Yang is a Research Fellow in the Department of Computer Science, National University of Singapore. His research topics include multimedia content understanding, indexing, and pattern recognition. He obtained his PhD from the University of Queensland in 2013.

Dr. Heng Tao Shen is a Professor of Computer Science in the School of Information Technology and Electrical Engineering, The University of Queensland. He obtained his BSc (with 1st class Honours) and PhD from the Department of Computer Science, National University of Singapore, in 2000 and 2004 respectively. His research interests include Multimedia/Mobile/Web Search and Database Management. He has published extensively and served on program committees in the most prestigious international publication venues of the Multimedia and Database communities. He is the winner of the Chris Wallace Award for Outstanding Research Contribution in 2010 from CORE Australasia. He is currently an Associate Editor of IEEE Transactions on Knowledge and Data Engineering (TKDE), and will serve as a PC Co-Chair for ACM Multimedia 2015.
