Detection of Spam in Online Social Networks (OSN)


IJRIT International Journal of Research in Information Technology, Volume 2, Issue 2, February 2014, Pg: 22- 27

International Journal of Research in Information Technology (IJRIT) www.ijrit.com

ISSN 2001-5569

Detection of Spam in Online Social Networks (OSN) Through Rule-based System

T. Suganya (1), K. Sridevi (2), M. ArulPrakash (3)

(1) ME, Department of Computer Science and Engineering, Karpagam University, Coimbatore, Tamil Nadu, India [email protected]

(2) Assistant Professor, Department of Computer Science and Engineering, SSCET, Palani, Tamil Nadu, India [email protected]

(3) Assistant Professor, Department of Information Technology, SSCET, Palani, Tamil Nadu, India [email protected]

Abstract

The Internet is growing rapidly, and Online Social Networks (OSNs) play a vital role in it. OSNs have become an important part of daily life for many people: they connect people with friends and groups so that they can share and communicate their views of life. Apart from entertainment, OSNs also provide useful information in the fields of business, profession and marketing. Unfortunately, the security measures that social networks provide are limited. Content-based preferences are not supported, so users cannot avoid unwanted messages. Spam is a major problem of today's Internet. Spam is usually transmitted by spammers, who collect user details from websites, groups, chat rooms and blogs and sell them to other spammers. Spammers spread spam messages from one user to another, which damages the reputation of the legitimate accounts involved. In this paper, we propose a method for detecting spam and compare its performance using a rule-based system.

Keywords: Online Social Networks, Filtering Rules, Blacklists, Short Text Classifier.

1. Introduction

Web mining, as the term "mining" implies, is the extraction of information from the web. It is usually defined as knowledge discovery, an application of data mining techniques. Web mining covers three areas: web content mining, web usage mining and web structure mining. Online social networks (OSNs) are very popular today and are used by a large number of people. They attract not only teenagers but people of all groups. OSN usage keeps increasing because an OSN is an effective medium for sharing and communicating with a group of people, namely friends and friends of friends. While using an OSN, people should therefore be aware of the privacy settings. Another important aspect is spam, which is a major problem in OSNs today.


People in an OSN make connections with friends, friends of friends and even groups. They share a part of their lives, including messages, photos, videos and other multimedia content. The major issue that affects people in OSNs is spam. Spam is an unwanted message posted on a user's wall by spammers. Spam keeps multiplying and spreads from one user to another in an OSN through various links. An effective detection method is therefore proposed to give OSN users a secure connection. Section 2 describes the related work, including the study of features and content classification. Section 3 deals with the existing system and Section 4 describes the proposed system. Section 5 presents the conclusions.

2. Related Work

OSNs are used daily by a growing number of users. In this work, data are mined from the OSN and a robust short text classifier (STC) is developed; in developing the STC we focus on a mined and selected set of messages. The work follows previous work [5], which provides the learning model and the rules for analyzing data mined from the OSN. The STC examines in detail the short text extracted from the OSN, together with other features related to the content of the mined data. For the learning model, neural learning is adopted to classify the text mined from the OSN effectively [4]. Radial Basis Function Networks (RBFN) support the overall classification of short text, especially in handling noisy data and other classes of messages, and act as soft classifiers. Apart from the classification strategies, the system [1] provides a rule layer with a strong language for specifying Filtering Rules (FRs). With FRs, an OSN user controls the content of the messages that appear on his or her private wall. FRs also support numerous types of filtering schemes, which OSN users can apply according to their needs and priorities. More importantly, FRs exploit the user's profile, the user's relationships and the result of the machine learning (ML) process to determine the filtering criteria.

Blacklists (BLs) are another important concept: they can be used to temporarily prevent listed users in an individual's friend list from posting messages on that individual's private wall in the OSN. This mechanism complements the filtering technique [1].

3. Existing System

OSNs have recently peaked in popularity because they make communication with friends and family easy. As OSN usage has increased rapidly, spam in OSNs has also increased sharply, and numerous approaches have been proposed to detect it. The existing method is applied to data extracted from the OSN and includes a short text classifier, content-based classification and a machine learning technique, which are then used to analyze spam in OSNs.

2.1 Short Text Classifier

Table 1: Short Text Classifier

Short words    Long words
Wat            what
S              yes
Gud            good
Wanna          Want to
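Table 1 can be read as a lookup from short forms to long forms. The following is a minimal sketch in Python, assuming a whitespace tokenizer and case-insensitive matching; the names SHORT_TO_LONG and expand_short_words are illustrative and are not taken from the paper.

```python
# Hypothetical helper: expand the short forms listed in Table 1.
SHORT_TO_LONG = {
    "wat": "what",
    "s": "yes",
    "gud": "good",
    "wanna": "want to",
}


def expand_short_words(message: str) -> str:
    """Replace each whitespace-separated token that matches a short form."""
    expanded = []
    for token in message.split():
        # Assumption: matching ignores case; unknown tokens pass through.
        expanded.append(SHORT_TO_LONG.get(token.lower(), token))
    return " ".join(expanded)
```

A real system would need a larger dictionary and handling for punctuation attached to tokens; this sketch covers only the four entries of Table 1.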

The major concepts of the system are Content-Based Message Filtering (CBMF) and the Short Text Classifier. The classification of messages is supported by a set of features.

The Filtered Wall (FW) intercepts the messages posted on a user's private wall. Metadata are extracted from the message content using machine learning (ML). The FW uses the extracted metadata together with the classification and the user's profile, and filters the messages according to the result obtained.

2.2 Machine Learning-Based Classification

We address short text categorization as a hierarchical two-level classification process. The first-level classifier performs a binary hard categorization that labels messages as Neutral and Non-neutral [3]. The first-level filtering task facilitates the subsequent second-level task, in which a finer-grained classification is performed.

Fig. 1 Block diagram of Existing System (blocks: Filtered wall, Content based message filtering, Short text classifier, Users' profile)

The second-level classifier performs a soft partition of Non-neutral messages, assigning a given message a gradual membership to each of the non-neutral classes. Among the variety of multiclass ML models well suited for text classification, we choose the RBFN model [6] for its experimentally demonstrated competitive behavior with respect to other state-of-the-art classifiers. RBFNs have a single hidden layer of processing units with a local, restricted activation domain: a Gaussian function is commonly used, but any other locally tunable function can be used. They were introduced as a neural network evolution of exact interpolation [7], and have been shown to have the universal approximation property [8], [9]. As outlined in [2], the main advantages of RBFN are that the classification function is nonlinear, the model may produce confidence values and it may be robust to outliers; the drawbacks are the potential sensitivity to input parameters and the potential sensitivity to overtraining. The first-level classifier is structured as a regular RBFN. In the second level of the classification stage, we introduce a modification of the standard use of RBFN. Its regular use in classification includes a hard decision on the output values: according to the winner-take-all rule, a given input pattern is assigned the class corresponding to the winner output neuron, which has the highest value. In our approach, we consider all values of the output neurons as the result of the classification task and interpret them as a gradual estimation of multi-membership to classes.
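The difference between the winner-take-all rule and the soft reading of the output neurons can be sketched in a few lines of Python. This is an illustrative forward pass only, not the authors' implementation: training is omitted, and the centers, weights, width and class names (class_a, class_b) are invented toy values.

```python
import math
from typing import Dict, List, Sequence


def gaussian_activations(x: Sequence[float],
                         centers: Sequence[Sequence[float]],
                         width: float) -> List[float]:
    """Hidden layer: one Gaussian unit per center, active only near it."""
    acts = []
    for c in centers:
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-sq_dist / (2.0 * width ** 2)))
    return acts


def output_values(hidden: Sequence[float],
                  weights: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Output layer: one linear unit per class over the hidden activations."""
    return {label: sum(w * h for w, h in zip(ws, hidden))
            for label, ws in weights.items()}


def winner_take_all(outputs: Dict[str, float]) -> str:
    """Regular (hard) use: keep only the class with the highest output."""
    return max(outputs, key=outputs.get)


# Toy values, invented for illustration only.
CENTERS = [[0.0, 0.0], [1.0, 1.0]]
WEIGHTS = {"class_a": [1.0, 0.0], "class_b": [0.0, 1.0]}

hidden = gaussian_activations([0.0, 0.0], CENTERS, width=1.0)
outputs = output_values(hidden, WEIGHTS)
hard_label = winner_take_all(outputs)  # a single label survives
memberships = outputs                  # soft use: every class keeps its value
```

With the hard rule only the winning label survives, whereas the soft reading keeps the smaller value for the other class, which later stages can compare against a user-chosen level.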

4. Proposed System

The proposed system offers a solution to spam detection by using a flexible rule-based system.

4.1 Short Text Classifier

Established mechanisms for classifying text are better suited to large data sets than to small ones [10]. Many kinds of text representation are available; our technique uses neural learning to classify short text. From the machine learning point of view, the technique takes a two-level approach, so that neutral sentences are identified and eliminated in a first step rather than handled at every step. The first level, hard classification, splits short texts into neutral and non-neutral sentences. The second level, soft classification, works on the non-neutral sentences to classify them further. The result is then used by the filtering rules.
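The control flow of the two-level approach can be sketched as follows. The two classifiers are passed in as plain callables because the paper does not define their interfaces; the type aliases and the function name classify_two_level are assumptions made for this illustration.

```python
from typing import Callable, Dict, Optional

# Hypothetical interfaces, not defined in the paper.
HardClassifier = Callable[[str], bool]              # True means neutral
SoftClassifier = Callable[[str], Dict[str, float]]  # class -> membership


def classify_two_level(message: str,
                       is_neutral: HardClassifier,
                       soft_classify: SoftClassifier
                       ) -> Optional[Dict[str, float]]:
    """Return None for neutral messages, else the soft memberships."""
    if is_neutral(message):
        # First level: neutral messages are eliminated here.
        return None
    # Second level: only non-neutral messages reach the soft classifier.
    return soft_classify(message)
```

Routing neutral messages out at the first level means the soft classifier only ever sees non-neutral text.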

4.2 Text Representation

The representation of the data extracted from OSNs affects the overall performance of the classification mechanism. Different approaches have been proposed for this problem [4], but a suitable measure for classifying short text is given by contextual features [11], [12], which identify correct words, bad words, capital words, question marks, exclamation marks and punctuation characters.
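As a rough illustration of the features listed above, the sketch below computes one ratio per feature for a single message. The normalization (per token or per character), the word lists and the function name text_features are assumptions; the paper does not specify how the features are computed.

```python
import string
from typing import Dict, Set


def text_features(message: str,
                  known_words: Set[str],
                  bad_words: Set[str]) -> Dict[str, float]:
    """Compute simple per-message ratios for the features named above."""
    tokens = [t.strip(string.punctuation) for t in message.split()]
    tokens = [t for t in tokens if t]
    n_tokens = max(len(tokens), 1)  # avoid division by zero
    n_chars = max(len(message), 1)
    return {
        "correct_words": sum(t.lower() in known_words for t in tokens) / n_tokens,
        "bad_words": sum(t.lower() in bad_words for t in tokens) / n_tokens,
        "capital_words": sum(t.isupper() for t in tokens) / n_tokens,
        "question_marks": message.count("?") / n_chars,
        "exclamation_marks": message.count("!") / n_chars,
        "punctuation": sum(ch in string.punctuation for ch in message) / n_chars,
    }
```

The resulting dictionary can be flattened into a fixed-order vector before it is given to a classifier.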

Fig. 2 Block diagram of Proposed System (blocks: Extracted messages, Short Text classifier, Text Representation, Filtering Rules, Knowledge discovery)

4.3 Filtering Rules

We use a special layer, the rule layer, to filter spam. To do so, we model the OSN as a directed graph in which each node represents a user and each directed edge indicates the type of relationship between two users. When using filtering rules, we should consider three issues that affect the filtering of spam in OSNs. First, in an OSN the same message can have different meanings depending on the user who posts it. The wall owner should therefore be able to place constraints on message creators, so that a rule applies to the different kinds of creators selected.
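One way to picture a rule over the directed graph is sketched below. The rule format (relationship type, message class, threshold) and all names are hypothetical; the paper states only that rules constrain message creators and draw on relationships and the classifier output.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Directed, labeled edges: (source user, target user) -> relationship type.
# Users and relationship names are made up for illustration.
Graph = Dict[Tuple[str, str], str]


@dataclass(frozen=True)
class FilteringRule:
    """Hypothetical rule: block a class of message from one kind of creator."""
    relationship: str   # kind of creator the rule applies to
    message_class: str  # class reported by the classifier
    threshold: float    # minimum membership that triggers the rule


def is_blocked(graph: Graph, owner: str, creator: str,
               memberships: Dict[str, float], rule: FilteringRule) -> bool:
    """Apply one rule to a message that `creator` posts on `owner`'s wall."""
    relation = graph.get((owner, creator))
    if relation != rule.relationship:
        return False  # the rule does not cover this kind of creator
    return memberships.get(rule.message_class, 0.0) >= rule.threshold
```

Because the edge label is looked up from the wall owner's side, the same message can be blocked for one kind of creator and allowed for another.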

4.4 Blacklists

Another approach we propose is the use of blacklists, which prevent unwanted posts from third parties. Put simply, a blacklist is a temporary restriction on friends in an OSN. The wall owner can temporarily restrict certain people on his or her friend list from posting messages on the owner's private wall; a restricted user is also unable to view the content of the wall. As with filtering rules, blacklists let the wall owner select the people to block according to their profiles and their relationships in the OSN. With blacklist rules, the wall owner can restrict people on the friend list whom he or she does not know directly, or of whom he or she has a bad opinion. The restriction lasts only for a certain period, after which the user can unblock the restricted people. If a permanent restriction is needed, those people can instead be blocked permanently.
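The temporary and permanent restrictions described above can be sketched as a small class. The class name, method names and the use of seconds for durations are assumptions for illustration; the current time is passed in explicitly so that the behavior is easy to check.

```python
from typing import Dict, Optional


class Blacklist:
    """Hypothetical per-wall blacklist with temporary or permanent entries."""

    def __init__(self) -> None:
        # user -> expiry time in seconds, or None for a permanent entry
        self._entries: Dict[str, Optional[float]] = {}

    def block(self, user: str, now: float,
              duration: Optional[float] = None) -> None:
        """Block `user` for `duration` seconds, or permanently if None."""
        self._entries[user] = None if duration is None else now + duration

    def unblock(self, user: str) -> None:
        """Lift a restriction before it expires."""
        self._entries.pop(user, None)

    def is_blocked(self, user: str, now: float) -> bool:
        """True while the user may neither post on nor view the wall."""
        if user not in self._entries:
            return False
        expiry = self._entries[user]
        return expiry is None or now < expiry
```

A wall would consult is_blocked both when a friend tries to post and when a friend tries to view its content.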

5. Conclusion

In this paper, we propose a solution for filtering spam in OSNs using a rule-based system, in which Filtering Rules (FRs) filter unwanted data. We also use a soft classifier based on neural learning to identify spam in OSNs. This is the first work to classify spam effectively by increasing the quality of classification. The data collected from OSNs are examined by a machine learning technique trained on a set of pre-classified data. Spam is thus filtered from the OSN.

6. Acknowledgments

I would like to extend my heartfelt gratitude to my family and friends for their vital encouragement and support. Special thanks to my mother Kowsalya Thangavel for her support and to my beloved friend Sridevi Kalyan. My sincere thanks to ArulPrakash for his guidance. And to God, who made all things possible.

7. References

[1] M. Vanetti, E. Binaghi, E. Ferrari, B. Carminati, and M. Carullo, “A System to Filter Unwanted Messages from OSN User Walls,” IEEE Trans. Knowledge and Data Engineering, vol. 25, no. 2, p. 285, Feb. 2013.
[2] A.K. Jain, R.P.W. Duin, and J. Mao, “Statistical Pattern Recognition: A Review,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4-37, Jan. 2000.
[3] T. Suganya and T. Hemalatha, “Spam Filtering in Online Social Networks Using Machine Learning Technique,” International Journal of Innovative Research in Computer and Communication Engineering, vol. 2, no. 1, pp. 2567-2568, Jan. 2014.
[4] F. Sebastiani, “Machine Learning in Automated Text Categorization,” ACM Computing Surveys, vol. 34, no. 1, pp. 1-47, 2002.
[5] M. Vanetti, E. Binaghi, B. Carminati, M. Carullo, and E. Ferrari, “Content-Based Filtering in On-Line Social Networks,” Proc. ECML/PKDD Workshop Privacy and Security Issues in Data Mining and Machine Learning (PSDML ’10), 2010.
[6] J. Moody and C. Darken, “Fast Learning in Networks of Locally-Tuned Processing Units,” Neural Computation, vol. 1, no. 2, pp. 281-294, 1989.
[7] M.J.D. Powell, “Radial Basis Functions for Multivariable Interpolation: A Review,” Algorithms for Approximation, pp. 143-167, Clarendon Press, 1987.
[8] E.J. Hartman, J.D. Keeler, and J.M. Kowalski, “Layered Neural Networks with Gaussian Hidden Units as Universal Approximations,” Neural Computation, vol. 2, pp. 210-215, 1990.
[9] J. Park and I.W. Sandberg, “Approximation and Radial-Basis-Function Networks,” Neural Computation, vol. 5, pp. 305-316, 1993.
[10] D.D. Lewis, Y. Yang, T.G. Rose, and F. Li, “Rcv1: A New Benchmark Collection for Text Categorization Research,” J. Machine Learning Research, vol. 5, pp. 361-397, 2004.
[11] M. Carullo, E. Binaghi, and I. Gallo, “An Online Document Clustering Technique for Short Web Contents,” Pattern Recognition Letters, vol. 30, pp. 870-876, July 2009.
[12] M. Carullo, E. Binaghi, I. Gallo, and N. Lamberti, “Clustering of Short Commercial Documents for the Web,” Proc. 19th Int’l Conf. Pattern Recognition (ICPR ’08), 2008.

Biography

Suganya Thangavel is presently in the final year of her M.E. (CSE) at Karpagam University, Coimbatore. She received her B.E. degree from Anna University in 2010. She holds the Yuva Kala Bharathi award for her academic skills. She has two years of teaching experience. She has published papers in international journals and presented several papers at national and international conferences. Her areas of interest include web mining and mobile computing.
