A SYNOPSIS ON

Enhancing Web Navigation Usability Using Web Usage Mining Techniques Submitted To Computer Science and Technology, Department of Technology Shivaji University, Kolhapur.

In partial Fulfilment of M.Tech (Computer Science & Technology) Degree Submitted By Mr. Patil Swapnil Shrimant

Under the Guidance of Prof. Mr. H.P. Khandagale

Computer Science & Technology, Department of Technology, Shivaji University, Kolhapur 2015-2016

Student Details

1. Name of the Institute: Department of Technology, Shivaji University, Kolhapur.

2. Name of the Course: M.Tech (Computer Science and Technology).

3. Dissertation Title: Enhancing Web Navigation Usability Using Web Usage Mining Techniques.

4. Student Details:
Name: Patil Swapnil Shrimant
E-mail Id: [email protected]
Contact No: 9970878367

5. Roll No.: 14

6. PRN No.: 2014010515

7. Present Official Address: AP: 281, Mal Galli, Koulage, TAL: Kagal, DIST: Kolhapur, 416235.

8. Permanent Address: AP: 281, Mal Galli, Koulage, TAL: Kagal, DIST: Kolhapur, 416235.

9. E-mail: [email protected]

10. Mobile: 9970878367

11. Guide Details:
Name: Prof. Mr. H. P. Khandagale
E-mail Id: k [email protected]
Contact No: 9881014185

12. P. G. Recognition No.: SU/PGBUTR/Recog/250 Dtd. 6th April 2013

13. Year of Admission: 2014-15

Abstract

The Internet has evolved significantly over the past few decades, and there are many reasons behind the explosive growth in web traffic. Almost every website has some form of navigation, but not every navigation structure is good. Most of the time, website navigation is put together by web designers who know a lot about making decorated websites but very little about marketing a website or building it from the users' point of view, which results in web navigation usability problems for users. This synopsis report, “Enhancing Web Navigation Usability Using Web Usage Mining Techniques”, describes a solution model that identifies navigation-related web usability problems by comparing actual and anticipated usage behavior. The aim is to give users better effectiveness (higher task completion rate) and efficiency (less time for given tasks). The web navigation structure is then changed automatically, which reduces the time spent by web developers.

Keywords - Web Server Log, Web-Navigation Usability, Usage Pattern Extraction, Usage Mining, Cognitive User Model.

Enhancing Web Navigation Usability Using Web Usage Mining Techniques

1.Introduction

The World Wide Web has expanded to serve millions of different users for a multitude of purposes in all parts of the world. Naturally, web content now needs to be filtered and personalized according to the particular needs of individual users. Users' interests, expectations, expertise, cognitive style, and perception are some of the factors that need to be considered when creating personalized interactive systems [13].

Web navigation refers to the process of navigating a network of information resources in the World Wide Web, which is organized as hypertext or hypermedia. Almost every website has some form of navigation, but not every navigation structure is good. Most of the time, website navigation is put together by web designers who know a lot about making pretty websites but very little about marketing a website or building it from the users' point of view. It is therefore necessary to identify navigation-related web usability problems so that users can achieve better effectiveness (higher task completion rate) and efficiency (less time for given tasks).

One of the greatest advantages of web-based user interfaces over traditional user interfaces is the ability to keep track of user interactions with the site. Thanks to the simple yet extremely useful concept of server log files, users' interaction with a website is kept in a raw format that can be easily processed by automated tools. Most web servers store this information by default. Statistical testing and reliability analysis can be used effectively to assure quality for web applications: web usage and failure information is extracted from existing web logs, the usage information is used to build models for statistical web testing, and the related failure information is used to measure the reliability of web applications and the potential effectiveness of statistical web testing [4].

Web usage mining applies data mining techniques to extract knowledge from these web log files [6]. Various tools can also be used to extract information from the raw log files. The extracted information can then be used to find user navigation patterns. By finding frequent user navigation sequences or user navigation sessions in server logs, we can compare actual user navigation trails with the designer's expected navigation trails and improve the interface of the site accordingly.
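As a minimal Python sketch of how such raw logs can be processed by an automated tool, the snippet below parses a single access-log line. The synopsis does not state the log format, so the Common Log Format is assumed here; the names CLF_PATTERN, parse_log_line, and SAMPLE_LINE are illustrative and not part of the proposed system.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
# host ident user [time] "method url protocol" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the textual fields that later stages need as typed values.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Made-up sample line used only for illustration.
SAMPLE_LINE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
entry = parse_log_line(SAMPLE_LINE)
```

Each parsed entry carries the host, timestamp, and requested URL, which are the fields the later user identification and session identification steps rely on.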

Student Signature Department of Technology

Guide Signature . Shivaji University Kolhapur

The proposed work involves the following steps:

• Data preparation [6] and pre-processing [8]: The log files are pre-processed and converted into sequential data. The pre-processing phase includes data cleaning, user identification, session identification, site structure, link details formation, path completion, and event generation. Sequential pattern mining takes a set of sequences and a support threshold as input and finds the complete set of frequent subsequences.
• Finding user navigation patterns: User navigation patterns are found from the sequential data using different pattern-finding algorithms. The raw log files from the web server on which the Sheet Exchange website resides are first simplified and converted into sequential data. A sequential pattern mining algorithm (the Apriori algorithm) is then applied.
• Path completion [2]: Path completion is a critical and difficult task in the pre-processing phase of web usage mining. The rules for missing references are discovered manually, based on site structure, referrer, and other heuristic information.
• Transaction identification: Client-side cookies are used to identify unique users and to identify and define unique transactions.
• Usage mining: The web usage mining algorithm PrefixSpan [15] is applied to the pre-processed data for pattern discovery and extraction.
• Cognitive user model: There is a growing need for those engaged in the design of web technology to understand the human factors involved in web-based interaction. Incorporating insights from cognitive science about the mechanisms, strengths, and limits of human perception and cognition can provide a number of benefits for web practitioners. Knowledge about the various constraints on cognition (e.g., limitations on working memory) and about patterns of strategy selection can inform the design and evaluation process and allow practitioners to develop technologies that are better suited to human abilities [14].
• Updating web navigation paths: The navigation paths are updated, and the original and anticipated user behavior are compared on the basis of the previous results, in terms of log records accessed per session.

The ultimate goal of this project is to identify and rectify web navigation usability problems based on a cognitive user model, yielding a quantified usability improvement in task success rate while minimizing task effort. The suggested corrections are applied to the website navigation structure automatically.
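The path completion step listed above can be illustrated with a small Python sketch. The synopsis says the rules are discovered manually, so the rule used here is only an assumption: a common referrer-based heuristic in which a request whose referrer is not the previously logged page is taken to mean that the user went back through cached pages, which the server never logged. The function name complete_path and the page names are hypothetical.

```python
def complete_path(requests):
    """Insert inferred backtracking steps into one session.

    requests: list of (page, referrer) tuples in time order.
    Returns the list of pages with the missing (cached) page views added.
    """
    path = []
    for page, referrer in requests:
        if path and referrer and referrer != path[-1] and referrer in path:
            # Find the most recent visit to the referrer page.
            idx = len(path) - 1 - path[::-1].index(referrer)
            # Pages the user stepped back through, ending at the referrer.
            back_steps = path[idx:-1][::-1]
            path.extend(back_steps)
        path.append(page)
    return path


# Made-up session: the user views A, B, C, then requests D from page A,
# which implies two unlogged "back" steps (C -> B -> A).
logged = [("/A", None), ("/B", "/A"), ("/C", "/B"), ("/D", "/A")]
completed = complete_path(logged)
```

Here the logged session /A, /B, /C, /D becomes /A, /B, /C, /B, /A, /D once the inferred back steps are inserted. A real implementation would also consult the site structure, as the text notes.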

2.Literature Survey

The references for this work were collected from conference papers, journal articles, websites, and books available on the Internet. Several base papers and related ideas were consulted to support the development, testing, and deployment phases of the project. Because the literature on this topic is vast, the survey mainly considers books and papers from the past 10-12 years, together with some reference books that give a good understanding of web mining concepts.

Ruili Geng and Jeff Tian published an IEEE Transactions paper [1], “Improving Web Navigation Usability by Comparing Actual and Anticipated Usage”, which is the main base paper of this project. The paper identifies navigation-related web usability problems faced by users by comparing actual and anticipated usage patterns. Based on the comparison results (obtained on a small service-oriented website), the authors highlight usability issues and suggest corrective actions to be performed manually by domain experts.

B. Liu [16] wrote the book “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data”. Chapters 6-12 of this book describe the basic concepts related to web-specific mining.

R. Cooley, B. Mobasher, and J. Srivastava published the paper [6] “Data preparation for mining World Wide Web browsing patterns” in Knowledge and Information Systems, which explains the basic concepts of web usage mining. The paper presents several data preparation techniques for identifying unique users and user sessions. It also defines a method for dividing user sessions into semantically meaningful transactions and successfully tests it against two other methods. Transactions identified by the proposed methods are used to discover association rules from real-world data using the WEBMINER system.

Om Kumar C. U. and P. Bhargavi published an article [7], “Analysis of web server log by web usage mining for extracting users’ patterns”, in which they describe web server log files and the use of web mining techniques to extract usage patterns with the WEKA software tool.

C. P. Sumathy, R. Padmaja Valli, and T. Santhanam published an article [8], “An overview of pre-processing of web log files for web usage mining”, in which they discuss why the data stored in log files does not present an accurate picture of users' accesses to a website. Hence, pre-processing of the web log data is an essential prerequisite before it can be used for knowledge discovery or mining tasks.

T. Arce, P. E. Roman, J. D. Velasquez, and V. Parada published the paper [12] “Identifying web sessions with simulated annealing” in Expert Systems with Applications, in which they discuss why the website redesign stage must take the behavior of users into account. They propose a heuristic approach based on simulated annealing to solve the sessionization problem.

Nirali Honest, Dr. Atul Patel, and Dr. Bankim Patel presented an IEEE conference paper [2], “A study of path completion techniques in web usage mining”. They address path completion by considering the different types of paths generated when accessing a website designed using a content management system, and give a novel algorithm to form the path.

Mr. Akshay Upadhyay and Mr. Balram Purswani published a paper [5], “Web Usage Mining has Pattern Discovery”, in the International Journal of Scientific and Research Publications. They present the pattern discovery aspect of web usage mining and describe how website designers should keep users' page-browsing behavior in hand, studying visitors' activities through web analysis to find patterns in those activities.

Melody Y. Ivory and Marti A. Hearst published an article [9], “The State of the Art in Automating Usability Evaluation of User Interfaces”, which presents an extensive survey of usability evaluation methods, organized according to a new taxonomy that emphasizes the role of automation. The survey analyzes existing techniques, identifies which aspects of usability evaluation automation are likely to be of use in future research, and suggests new ways to expand existing approaches to better support usability evaluation.

M. Heinath, J. Dzaack, and A. Wiesner published a paper [10], “Simplifying the development and the analysis of cognitive models”, at a cognitive science conference. They discuss drawbacks related to cognitive processes and their underlying structures and address these problems by developing HTAmap and SimTrA, which simplify the development and analysis of cognitive models.

Tonio Carta, Fabio Paterno, and Vagner Figueredo de Santana published an article [11], “Web Usability Probe: A Tool for Supporting Remote Usability Evaluation of Web Sites”, with Springer. They present a tool that supports remote usability evaluation of websites. The tool considers client-side data on user interactions and JavaScript events. In addition, it allows the definition of custom events, giving evaluators the flexibility to add specific events to be detected and considered in the evaluation. The tool supports evaluation of any website by exploiting a proxy-based architecture and enables the evaluator to compare actual user behavior with an optimal sequence of actions.

T. Tullis and B. Albert [17] wrote the book “Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics (Interactive Technologies)”, which covers all aspects of a person's interaction with a product, application, or system. It argues that user experience is measurable and quantifiable, and it describes usability issues and various user experience metrics.

3.Problem Statement

To implement a new method that identifies navigation-related web usability problems by comparing actual and anticipated usage patterns, and that solves these problems through automated navigation updates using web page link modification, thereby enhancing overall web navigation usability.

4.Choice of topic with reasoning

The WWW is a system of interlinked hypertext documents accessed via the Internet. Around 1,100 million people access the Internet daily, so the information available on the WWW is also growing. With this continued growth of information and the proliferation of web services and web-based information systems, the websites that host them are growing as well. Every website has some form of navigation, but not every website's navigation is good. Most of the time, a website's navigation is put together by web designers who know a lot about making pretty websites but very little about marketing a website or building it from the users' point of view. Today's web developers need a system that solves usability problems in terms of effectiveness, efficiency, and user satisfaction, and that helps users accomplish specific tasks. Such a system should solve web navigation problems from the users' point of view while saving the time and effort of the development team.

5.Objective

• To design and implement a system that identifies actual usage behavior from actual usage patterns.
• To implement a cognitive user model that quantifies the anticipated usage behavior.
• To design and implement a model that compares the actual and anticipated usage behavior.
• To implement a system that corrects problems based on the comparison results, leading to better functional convenience as characterized by both better effectiveness (higher task completion rate) and efficiency (less time for given tasks).
• To implement a system that takes the anticipated usage behavior as input and performs automated navigation updates using web page link modification.

6.Outline of proposed Work

PHASE 1: Discovery of usage patterns from web server logs
INPUT: Web server logs
OUTPUT: Discovered usage patterns

Figure 1: Data Preparation And Pre-Processing

1. Stage 1: Data Cleaning: Data cleaning is usually site-specific and involves removing irrelevant entries, such as those that represent multimedia data and scripts, and uninteresting entries, such as those that belong to top/bottom frames.
2. Stage 2: User Identification: Since several users may share a single machine name, certain heuristics are used to identify users. The user activity record refers to the sequence of logged activities belonging to the same user.
3. Stage 3: Session Identification: Sessionization is the process of segmenting the user activity record of each user into sessions, each representing a single visit to the site. The goal of a sessionization heuristic is to reconstruct, from the clickstream data, the actual sequence of actions performed by one user during one visit to the site.
4. Stage 4: Path Completion: The rules for missing references are discovered manually, based on site structure, referrer, and other heuristic information.
5. Stage 5: Transaction Identification: Transactions will most probably be identified using client-side cookies.
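Stages 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the synopsis does not name the heuristics, so users are distinguished here by the (IP address, user agent) pair and sessions are split after 30 minutes of inactivity, both of which are common but assumed choices. The record layout and all names (IGNORED_SUFFIXES, clean, identify_users, sessionize) are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed values: file types treated as irrelevant, and the inactivity timeout.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")
SESSION_TIMEOUT = timedelta(minutes=30)


def clean(records):
    """Stage 1: drop requests for images, stylesheets, and scripts."""
    return [r for r in records if not r["url"].lower().endswith(IGNORED_SUFFIXES)]


def identify_users(records):
    """Stage 2: group records into user activity records by (ip, agent)."""
    users = {}
    for r in records:
        users.setdefault((r["ip"], r["agent"]), []).append(r)
    return users


def sessionize(user_records, timeout=SESSION_TIMEOUT):
    """Stage 3: split one user's records into sessions on long gaps."""
    sessions = []
    last_time = None
    for r in sorted(user_records, key=lambda rec: rec["time"]):
        if last_time is None or r["time"] - last_time > timeout:
            sessions.append([])  # start a new visit
        sessions[-1].append(r["url"])
        last_time = r["time"]
    return sessions


# Made-up records used only for illustration.
t0 = datetime(2015, 8, 1, 10, 0, 0)
records = [
    {"ip": "10.0.0.1", "agent": "Firefox", "time": t0, "url": "/index.html"},
    {"ip": "10.0.0.1", "agent": "Firefox", "time": t0 + timedelta(seconds=1), "url": "/logo.png"},
    {"ip": "10.0.0.1", "agent": "Firefox", "time": t0 + timedelta(minutes=5), "url": "/products.html"},
    {"ip": "10.0.0.1", "agent": "Firefox", "time": t0 + timedelta(hours=2), "url": "/index.html"},
    {"ip": "10.0.0.1", "agent": "Chrome", "time": t0 + timedelta(minutes=1), "url": "/contact.html"},
]
users = identify_users(clean(records))
all_sessions = {user: sessionize(recs) for user, recs in users.items()}
```

In this made-up data the two browsers on the same IP address are treated as two users, and the Firefox user's request two hours later starts a second session.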

PHASE 2: Pattern Discovery and Extraction
INPUT: Pre-processed data
OUTPUT: Discovered actual usage patterns

A web usage mining algorithm (most probably the PrefixSpan method) is applied to the available data for sequential pattern discovery and extraction.
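To make the mining step concrete, the following Python sketch shows a simplified PrefixSpan-style miner. It assumes each session is a sequence of single page visits (not itemsets) and that support is counted as the number of sessions containing a pattern as a subsequence; the function name prefixspan, the sample sessions, and the threshold are illustrative only.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for all frequent sequential patterns.

    A pattern is a tuple of pages that occurs, in order but not necessarily
    contiguously, in at least min_support sequences.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs: the suffix of
        # each sequence that follows the first occurrence of the prefix.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1
        for item, count in sorted(counts.items()):
            if count < min_support:
                continue
            new_prefix = prefix + (item,)
            results[new_prefix] = count
            new_projected = []
            for seq_idx, start in projected:
                seq = sequences[seq_idx]
                try:
                    pos = seq.index(item, start)
                except ValueError:
                    continue  # this sequence does not extend the prefix
                new_projected.append((seq_idx, pos + 1))
            mine(new_prefix, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Made-up sessions used only for illustration.
sessions = [
    ["home", "products", "cart"],
    ["home", "cart"],
    ["home", "products", "checkout"],
]
patterns = prefixspan(sessions, min_support=2)
```

With a support threshold of 2, the frequent patterns in this sample include ("home", "products") and ("home", "cart"), while ("home", "products", "cart") occurs in only one session and is pruned.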

Figure 2: Proposed Architecture

PHASE 3: Building a Cognitive User Model
INPUT: Discovered patterns
OUTPUT: Cognitive model that specifies the anticipated user behavior

Cognitive modeling methods have been used to understand the complex interactive behavior involved in three tasks: (1) icon search, (2) graph reading, and (3) information retrieval on the World Wide Web (WWW). Here, a traditional cognitive user model (ACT-R (Adaptive Control of Thought-Rational) or EPIC (Executive Process/Interactive Control)) can be used to build an expert system that predicts anticipated usage behavior or performance based on the discovered patterns.

PHASE 4: Updating Website Navigation Links in an Automated Manner
INPUT: Anticipated usage behavior
OUTPUT: Updated web navigation links on the website

The final anticipated behavior computed by the system could be provided to the developer team so that they make the modifications needed to improve web navigation. Instead, this module performs the necessary modifications itself by updating the links on web pages automatically. The Document Object Model (DOM) technique can be used for this purpose, which saves the time and effort of the developer team.

PHASE 5: Comparing Actual and Anticipated User Behavior
INPUT: Website after web page link updates
OUTPUT: Number of log records

The website resulting from the link updates is passed to the next stage, where the original and anticipated user behavior are compared in terms of log records accessed per session. A smaller number of log records indicates better effectiveness (higher task completion rate) and efficiency (less time for given tasks).
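The automated link update of Phase 4 might look like the following Python sketch, which uses the standard library's DOM implementation to add a shortcut link to a navigation list. It assumes the page is well-formed XHTML and that the navigation menu is a ul element with a known id; the function add_nav_link, the id "nav", and the page names are hypothetical, and the synopsis does not specify how the DOM would actually be manipulated.

```python
from xml.dom import minidom


def add_nav_link(html, nav_id, href, label):
    """Append <li><a href=...>label</a></li> to the <ul> with the given id.

    The link is added only if the menu does not already contain it.
    Returns the updated markup as a string.
    """
    doc = minidom.parseString(html)
    for menu in doc.getElementsByTagName("ul"):
        if menu.getAttribute("id") != nav_id:
            continue
        existing = [a.getAttribute("href") for a in menu.getElementsByTagName("a")]
        if href in existing:
            continue  # nothing to do, the shortcut is already present
        anchor = doc.createElement("a")
        anchor.setAttribute("href", href)
        anchor.appendChild(doc.createTextNode(label))
        item = doc.createElement("li")
        item.appendChild(anchor)
        menu.appendChild(item)
    return doc.documentElement.toxml()


# Made-up page used only for illustration.
page = (
    '<html><body><ul id="nav">'
    '<li><a href="/index.html">Home</a></li>'
    '</ul></body></html>'
)
updated = add_nav_link(page, "nav", "/checkout.html", "Checkout")
```

Calling the function a second time with the same arguments leaves the page unchanged, so repeated runs of the update module would not duplicate links. Real-world HTML is often not well-formed XML, in which case a tolerant HTML parser would be needed instead.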

7.METHODOLOGY

1. Methods of Data Collection:
• Data downloaded from the Internet.
• The web log data set is collected from the Internet or generated synthetically.

2. Probable Methods of Data Analysis:

Pre-processing Activity
• Input will be the web server log data, with a size greater than 1000 and a minimum of four attributes.
• This data is stored in a MySQL database, converted into an extended log file, and pre-processed.

Usage Mining Algorithm
• Input will be the pre-processed data.
• Output gives the patterns that show the actual usage behavior of users.

Cognitive Modeling
• Input will be the patterns discovered in the previous stage.
• This model generates patterns used for identifying the anticipated usage behavior.

Comparing Actual and Anticipated Usage Behavior
• The data on actual and anticipated usage behavior is analyzed on a monthly basis.
• Based on the results, structural changes are made to the website navigation.
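The comparison step can be sketched in Python as below. The sketch assumes that the anticipated behavior for a task is available as an ordered list of pages and that each actual session is a list of requested pages; it computes the average number of log records per session (the measure named in this synopsis) and lists the pages a user visited outside the anticipated path. All names and data are hypothetical.

```python
def average_records_per_session(sessions):
    """Mean number of log records (page requests) per session."""
    if not sessions:
        return 0.0
    return sum(len(s) for s in sessions) / len(sessions)


def off_path_pages(session, anticipated_path):
    """Pages in an actual session that are not on the anticipated path."""
    expected = set(anticipated_path)
    return [page for page in session if page not in expected]


# Made-up data: sessions logged before and after the navigation update.
anticipated = ["/home", "/cart", "/checkout"]
before = [
    ["/home", "/products", "/home", "/cart", "/checkout"],
    ["/home", "/about", "/home", "/cart", "/checkout"],
]
after = [
    ["/home", "/cart", "/checkout"],
    ["/home", "/cart", "/checkout"],
]
avg_before = average_records_per_session(before)
avg_after = average_records_per_session(after)
detours = off_path_pages(before[0], anticipated)
```

In this illustration the average drops from 5.0 to 3.0 records per session, and "/products" is flagged as a detour in the first session, which is the kind of evidence that would motivate a structural change to the navigation.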

8.System Requirement

1. Software Requirement
• Operating System: Windows XP
• Development Tool: JDK 7, NetBeans IDE
• Application Server: Apache Tomcat 6.0
• Front End: HTML, JSP
• Scripts: JavaScript
• Server-side Script: Java Server Pages
• Database/Back End: MySQL

2. Hardware Requirement
• Processor: Pentium III 1.1 GHz and above
• RAM: 256 MB min.
• Hard Disk: 20 GB min.

9.References

[1] Ruili Geng and Jeff Tian, “Improving Web Navigation Usability by Comparing Actual and Anticipated Usage”, IEEE Transactions on Human-Machine Systems, vol. 45, no. 1, February 2015.
[2] Nirali Honest, Dr. Atul Patel, and Dr. Bankim Patel, “A study of path completion techniques in web usage mining”, IEEE International Conference on Computational Intelligence and Communication Technology, 2015.
[3] M. F. Arlitt and C. L. Williamson, “Internet Web servers: Workload characterization and performance implications”, IEEE/ACM Trans. Netw., vol. 5, no. 5, pp. 631-645, Oct. 1997.
[4] C. Kallepalli and J. Tian, “Measuring and modeling usage and reliability for statistical Web testing”, IEEE Trans. Softw. Engin., vol. 27, no. 11, pp. 1023-1036, Nov. 2001.
[5] Mr. Akshay Upadhyay and Mr. Balram Purswani, “Web Usage Mining has Pattern Discovery”, International Journal of Scientific and Research Publications, vol. 3, issue 2, February 2013.
[6] R. Cooley, B. Mobasher, and J. Srivastava, “Data preparation for mining world wide web browsing patterns”, Knowl. Inf. Syst., vol. 1, no. 1, pp. 5-32, 1999.
[7] Om Kumar C. U. and P. Bhargavi, “Analysis of web server log by web usage mining for extracting usage patterns”, IJCSEITR, vol. 3, issue 2, ISSN 2249-6831, pp. 123-136, June 2013.
[8] C. P. Sumathy, R. Padmaja Valli, and T. Santhanam, “An overview of pre-processing of web log files for web usage mining”, JATIT, vol. 34, no. 1, ISSN 1992-8645, 15 Dec. 2011.

[9] M. Y. Ivory and M. A. Hearst, “The state of the art in automating usability evaluation of user interfaces”, ACM Comput. Surveys, vol. 33, no. 4, pp. 470-516, 2001.
[10] M. Heinath, J. Dzaack, and A. Wiesner, “Simplifying the development and analysis of cognitive models”, in Proc. Eur. Cognitive Sci. Conference, Delphi, Greece, 2007, pp. 446-451.
[11] T. Carta, F. Paternò, and V. F. D. Santana, “Web usability probe: A tool for supporting remote usability evaluation of websites”, in Human-Computer Interaction - INTERACT 2011. New York, NY, USA: Springer, pp. 349-357, 2011.
[12] T. Arce, P. E. Román, J. D. Velásquez, and V. Parada, “Identifying web sessions with simulated annealing”, Expert Systems with Applications, vol. 41, no. 4, pp. 1593-1600, 2014.
[13] Marios Belk, Efi Papatheocharous, Panagiotis Germanakos, and George Samaras, “Investigating the Relation between users’ Cognitive Style and Web Navigation Behavior with K-means Clustering”, Dept. of Comp. Sci., University of Cyprus, Nicosia, Cyprus.
[14] D. Peebles and A. L. Cox, “Modeling interactive behaviour with a rational cognitive architecture”, in Human Computer Interaction: Concepts, Methodologies, Tools, and Applications, C. S. Ang and P. Zaphiris, Eds. Hershey, PA, USA: Inf. Sci. Ref., 2008, pp. 1154-1172.
[15] Dr. S. Vijayarani and Ms. S. Deepa, “An efficient algorithm for sequence generation in data mining”, International Journal on Cybernetics and Informatics (IJCI), vol. 3, no. 1, February 2014.
[16] B. Liu, “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data”. New York, NY, USA: Springer, 2007.
[17] T. Tullis and B. Albert, “Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics (Interactive Technologies)”. San Mateo, CA, USA: Morgan Kaufmann, 2008.

10.Planning of Work Chart

Time Period: Work to be Completed

01/08/2015 - 25/08/2015: Search and study of the requirements, design of the system.
26/08/2015 - 05/10/2015: Implementation of the data preparation and pre-processing model.
06/10/2015 - 05/12/2015: Implementation of the sequential pattern mining algorithm for pattern discovery and extraction.
06/12/2015 - 31/12/2015: Report and presentation preparation for dissertation phase-1.
01/01/2016 - 15/02/2016: Implementation of the cognitive user model.
16/02/2016 - 09/04/2016: Implementation of the structural and statistical models, which automatically update website navigation and compare log records accessed per session, respectively.
10/04/2016 - 27/05/2016: Combining all modules, testing, result analysis.
28/05/2016 - 30/06/2016: Report and presentation for dissertation phase-2.

Student
Mr. Patil Swapnil Shrimant
M. Tech II
Computer Science & Technology
Department of Technology,
Shivaji University, Kolhapur.

Guide
Mr. H. P. Khandagale
Assistant Professor.
Computer Science & Technology
Department of Technology,
Shivaji University, Kolhapur.

Coordinator M. Tech II Computer Science & Technology Department of Technology, Shivaji University, Kolhapur.
