Utilizing Recommender Systems to Support Software Requirements Elicitation Carlos Castro-Herrera and Jane Cleland-Huang Systems and Requirements Engineering Center (SAREC) DePaul University Chicago, USA

ABSTRACT Requirements Engineering involves a number of human-intensive activities designed to help project stakeholders discover, analyze, and specify the functional and non-functional needs for a software-intensive system. Recommender systems can support several different areas of this process, including identifying potential subject matter experts for a topic, keeping individual stakeholders informed of relevant issues, and even recommending possible features for stakeholders to consider and explore. This position paper summarizes an extensive series of experiments that were conducted to identify best-of-breed algorithms for recommending forums to stakeholders and recommending unexplored topics to project managers.

Categories and Subject Descriptors D.2.1 (Software Engineering): Requirements/Specifications; H.3.3 (Information Storage and Retrieval): Information Search and Retrieval.

General Terms Algorithms, management, human factors

Keywords Recommender systems, requirements elicitation, stakeholders, subject matter experts

1. INTRODUCTION Software development projects are typically initiated through a requirements gathering phase in which business analysts work with project stakeholders to elicit the functional and behavioral requirements of the system. This is a critical phase of every software project, and numerous case studies and surveys have shown that incomplete or incorrect requirements are one of the primary causes of project failure and can lead to millions of dollars in lost revenue every year [18]. Traditionally the process of requirements elicitation is performed using face-to-face techniques such as interviews, brainstorming sessions, and other interactive workshop activities. However, recent advancements in technology have led to a growing trend towards using online forums and wikis to facilitate the requirements process [10]. In these forums and wikis, project stakeholders gather in a virtual meeting place, often asynchronously, to explore and specify the project requirements. Generally, topic-based discussions take place within discussion threads, which are created on demand by the project stakeholders [9].

This paper describes several ways in which recommender systems can be used to provide timely and useful recommendations to stakeholders, discussion group members, and project managers as they engage in requirements gathering activities using wikis or forums. First, recommendations can be made to individual stakeholders to keep them informed of pertinent discussion topics so that key stakeholders can engage fully in the requirements gathering process. In prior work we have conducted an extensive series of experiments in order to identify the best type of recommender system for this task [5,6,7,8]. Section 2 of this paper summarizes our work by describing the recommender systems that we found to work best in the requirements domain.

The second area of recommendation, described in Section 3 of this paper, focuses on the project as a whole. The goal is to analyze the project’s vision documents and identify proposed features which have not been fully discussed by stakeholders in the forum. The recommender system can then suggest ideas for new discussion topics. This is an example of a content-based recommender which performs a gap analysis between features described in the project’s visionary documents and topics that have been discussed in the forums. Our approach utilizes a consensus-based requirements clustering algorithm that we developed in prior work and validated against data mined from feature request forums [12].

In addition to recommending topics to explore, a separate recommender system also identifies stakeholders who might serve as subject matter experts for unexplored topics. To address the cold-start problem of introducing a new topic, we propose three different approaches for identifying potential stakeholders to invite to the discussion forum [7]. Section 4 of this paper describes our approach. The primary contribution of this paper is to document best-of-breed algorithms for areas in which we have conducted extensive evaluations; in addition, Section 5 discusses other possible uses of recommender systems in the requirements elicitation domain.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. RSSE’10, May 4, 2010, Cape Town, South Africa. Copyright 2010 ACM 978-1-60558-974-9/10/05…$5.00.

2. STAKEHOLDER RECOMMENDATIONS The first goal of our research was to develop a recommender system for proposing topics that might interest an individual stakeholder during the requirements elicitation process [5,6,7]. However, the requirements domain presents different challenges from the typical recommender system domain of E-Commerce. Data in the requirements domain consists of high-dimensional textual representations; it is sparse, noisy, constantly changing, lacks clean classifications, and has no explicit ratings. Therefore techniques that work well for recommending carefully categorized books, or movies with ample ratings, will not necessarily work well for recommending discussion topics to stakeholders. Our work therefore involved an extensive series of experiments that evaluated numerous variants of content-based, collaborative filtering, knowledge-based, and matrix factorization techniques, with various underlying representations.

Recommendation problems are typically formulated as prediction tasks in which a predictive model is built according to prior training data [2,17]. The model is then used in conjunction with the dynamic profile of a user to predict the level of interest in a targeted item. Recommender systems have traditionally fallen into three categories: content-based systems, which make recommendations based on the semantic content of data; collaborative-filtering systems, which make recommendations by examining past interactions of a user with the system and identifying other stakeholders with similar interests [17]; and knowledge-based systems, which make recommendations based on knowledge of the user and pre-established heuristics. More recently, recommender systems research has focused on applying algorithms based on matrix factorization. Matrix factorization has been shown to be a highly effective alternative to collaborative filtering [19]. The matrix factorization techniques attempt to map both users and items to a joint latent factor space, such that a prediction to a user for an item can be computed as an inner product in that space [15]. More specifically, in matrix factorization the matrix of ratings $R_{m \times n}$ is approximated by the product of two lower dimension matrices $P_{m \times k}$ and $Q_{k \times n}$, where $m$ is the number of users, $n$ is the number of items and $k$ is the number of latent factors being used.

To evaluate these approaches, experiments were conducted against four datasets [5,6]. The Student dataset included 370 feature requests generated by DePaul graduate students for an Amazon-like student web portal; Sugar included 1000 discussion posts mined from the discussion forum of a customer relationship management system named SugarCRM; Secondlife represented 1250 posts mined from the discussion forums of the popular Internet-based video game; and finally Railway included 1031 simulated feature requests mined from the publication specifications of two large scale railway systems.

In this position paper we report only the best-of-breed solutions. Our experimental results showed that the best recommendation results were obtained through using a K-Nearest Neighbor (KNN) collaborative recommender system [17] with an underlying binary representation of users’ interests. It is interesting to note that the matrix factorization approach, which was recently used to win the coveted Netflix prize for movie recommendations, was outperformed by the approach presented here. It is therefore not discussed any further in this paper.

[Figure 1: two hit-ratio plots, "Hit Ratio - Student Dataset" and "Hit Ratio - Railway Dataset". X-axis: Top N recommendations; y-axis: Accumulated % Retrieved (0% to 100%); series: Binary, Inclusion of Known Data, Random.]

Figure 1. Hit ratio graph showing improvements from use of binary recommender and inclusion of known data [6].

In the binary KNN approach, user profiles are represented as a binary matrix, such that a membership score of 1 means that a user is associated with a given requirements topic. More formally, this is represented by a matrix $D = (d_{ij})_{m \times n}$, where $m$ is the number of stakeholders and $n$ the number of requirements topics. Unlike other forms of user profile that capture the degree of interest a user has in a topic, the binary profile simply shows that the user has interest. It should also be noted that a 0 only means that we do not know whether a user is interested in a topic, and should not be misinterpreted in a negative way. The similarity between a stakeholder $a$ and a potential neighbor $b$ is computed using the binary equivalent of the cosine similarity metric:

$$sim(a, b) = \frac{|T_a \cap T_b|}{\sqrt{|T_a| \cdot |T_b|}}$$

where $T_a$ and $T_b$ correspond to the topics of interest to each stakeholder (a row in $D$). This widely accepted similarity formula has been shown to be effective for binary profiles. The predicted level of interest that a stakeholder $u$ will have in requirements topic $i$ is then computed as

$$pred(u, i) = \frac{\sum_{v \in nbr(u)} sim(u, v) \cdot r_{v,i}}{\sum_{v \in nbr(u)} sim(u, v)}$$

where $r_{v,i}$ is the rating that user $v$ gave item $i$, and $v \in nbr(u)$ depicts the fact that $v$ is a neighbor of $u$ [17]. An in-depth explanation of this formula and the experiments we conducted to evaluate it are described in [5].

Our experiments also showed that recommendations could generally be improved if additional known data, such as a stakeholder’s role in the project, could be incorporated into the user’s binary profile. Several different techniques were evaluated [6,7,8] for combining this data; in the highest performing solution, additional data related to project roles and general project-related interests were incorporated into each of the users’ profiles, so that the profiles matrix included columns in which users (or their supervisors) could specify individual interests and skills. The enhanced profiles were then used to identify the K nearest neighbors.
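The binary similarity and the neighbor-weighted prediction described above can be illustrated with a minimal sketch. It assumes profiles are stored as sets of topic names (equivalent to rows of the binary matrix); the function names, the stakeholder names, the sample topics, and the choice of k = 2 are hypothetical and are not taken from the experiments reported in [5].

```python
from math import sqrt

def binary_cosine(a: set, b: set) -> float:
    """Binary cosine similarity between two sets of topic names."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def predict_interest(profiles: dict, user: str, topic: str, k: int = 2) -> float:
    """Similarity-weighted vote of the k nearest neighbors for one topic."""
    target = profiles[user]
    # Rank every other stakeholder by similarity to the target user.
    scored = sorted(
        ((binary_cosine(target, topics), name)
         for name, topics in profiles.items() if name != user),
        reverse=True,
    )
    # Keep at most k neighbors and ignore those with no overlap at all.
    neighbors = [(s, n) for s, n in scored[:k] if s > 0]
    total = sum(s for s, _ in neighbors)
    if total == 0:
        return 0.0
    # The binary "rating" is 1 if the neighbor is associated with the topic, else 0.
    return sum(s for s, n in neighbors if topic in profiles[n]) / total

# Hypothetical profiles, for illustration only.
profiles = {
    "alice": {"payments", "security"},
    "bob": {"payments", "security", "encryption"},
    "carol": {"ratings", "deals"},
    "dave": {"security", "encryption"},
}
print(predict_interest(profiles, "alice", "encryption"))
```

Known data such as project roles could be folded into this sketch by adding extra role or skill labels to each profile set before the neighbors are computed.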

The binary recommender system was evaluated against the four datasets, and the inclusion of known data against Student and Railway, as these were the only two datasets for which this additional known data was available. All of our experiments utilized a standard leave-one-out cross validation technique, in which each individual’s interests were removed one at a time, and the recommender algorithm then executed. The rank at which the stakeholder’s removed interest was recommended back was recorded. A hit-ratio graph was then generated to depict the number of interests recommended back at the 1st position, 2nd position, etc. of the ranked list. Results from these experiments are reported in Figure 1; however, due to space constraints we show only the Student and Railway datasets, for which both binary and known-data results are available.

As depicted in Figure 1, both the inclusion of known data and use of binary representations generated quite accurate recommendations. For example, in the Railway dataset, approximately 40% of known interests were placed into the top 5 recommendations using the binary recommender, and this number increased to just over 65% when known data was incorporated. Although not shown here, these methods were found to significantly outperform numerous other techniques and variants of the KNN approach that we experimented with. A full comparison of techniques is available in other papers [6,7,8].

These experiments show the feasibility of using recommender systems to help inform individual stakeholders of topics that might be of interest to them during the requirements elicitation process. A concrete example is provided in Figure 2, which depicts a recommendation made to a stakeholder to join two different discussion forums.
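The leave-one-out procedure and the hit-ratio measure can be sketched as follows. This is an illustration only: the recommender used here is a simple popularity ranking that stands in for the KNN recommender, and all names and sample profiles are hypothetical rather than drawn from the datasets above.

```python
from collections import Counter

def popularity_recommender(profiles, user):
    """Stand-in recommender: rank unseen topics by how many other users hold them."""
    counts = Counter(t for name, topics in profiles.items()
                     if name != user for t in topics)
    unseen = [t for t in counts if t not in profiles[user]]
    # Sort by descending count, then alphabetically for a deterministic order.
    return sorted(unseen, key=lambda t: (-counts[t], t))

def leave_one_out_ranks(profiles, recommend):
    """Remove each known interest in turn and record the rank at which it returns."""
    ranks = []
    for user, topics in profiles.items():
        for held_out in sorted(topics):
            trial = dict(profiles)
            trial[user] = topics - {held_out}
            ranking = recommend(trial, user)
            # None marks an interest that was never recommended back.
            ranks.append(ranking.index(held_out) + 1 if held_out in ranking else None)
    return ranks

def hit_ratio(ranks, n):
    """Fraction of held-out interests recovered within the top n recommendations."""
    hits = sum(1 for r in ranks if r is not None and r <= n)
    return hits / len(ranks) if ranks else 0.0

# Hypothetical profiles, for illustration only.
profiles = {
    "alice": {"payments", "security"},
    "bob": {"payments", "security", "encryption"},
    "carol": {"payments", "ratings"},
}
ranks = leave_one_out_ranks(profiles, popularity_recommender)
for n in (1, 2, 3):
    print(n, hit_ratio(ranks, n))
```

Plotting `hit_ratio(ranks, n)` against `n` gives a curve of the kind summarized in Figure 1; any function with the same `(profiles, user)` signature can be passed in place of the stand-in recommender.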

Figure 2. Thread recommendations made to a stakeholder.

3. RECOMMENDING FOCUS AREAS Recommender systems can also be usefully employed during the elicitation process to make recommendations to project managers concerning topics that need additional focus. These might represent features identified in the project vision document, but not yet discussed in the requirements forum, or could represent a stagnant forum in which ideas have been discussed without closure. Our approach first clusters the sections in the vision documents, then clusters the requirements, and finally looks for topics in the vision statements that are not fully covered by the requirements and/or feature requests.

Feature requests within the forum are clustered as follows. First, sentences are preprocessed to remove common words (such as “this” and “because”) which are not useful for identifying underlying themes. The remaining terms are then stemmed to their roots. Given the final dictionary of terms $T = \{t_1, t_2, \ldots, t_W\}$, each feature request $a$ is represented as a vector over $T$: $\vec{a} = (w_{1,a}, w_{2,a}, \ldots, w_{W,a})$, where $w_{i,a}$ is the weight associated with term $t_i$ in request $a$. The weights are typically computed using the standard tf-idf approach from information retrieval. Specifically, $w_{i,a} = tf(t_i, a) \cdot \log(N / df(t_i))$, where $N$ represents the number of feature requests, $tf(t_i, a)$ represents the term frequency of $t_i$ in the feature request $a$, and $df(t_i)$ represents the number of feature requests containing $t_i$. The similarity between each pair of request vectors $\vec{a}$ and $\vec{b}$ is then computed using the cosine similarity measure:

$$sim(a, b) = \frac{\sum_{i=1}^{W} w_{i,a} \, w_{i,b}}{\sqrt{\sum_{i=1}^{W} w_{i,a}^2} \cdot \sqrt{\sum_{i=1}^{W} w_{i,b}^2}}$$
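The preprocessing, tf-idf weighting, and cosine comparison of feature requests can be sketched in a few lines. The sketch makes several simplifying assumptions: the stop-word list is a tiny illustrative one, stemming is omitted, the natural logarithm is used for the idf term, and the sample requests are invented.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {"this", "because", "the", "a", "to", "of", "and", "for", "should"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words (stemming omitted)."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def tfidf_vectors(requests):
    """Return one {term: weight} dict per request using tf * log(N / df)."""
    docs = [Counter(tokenize(r)) for r in requests]
    n = len(docs)
    # Document frequency: number of requests containing each term.
    df = Counter(term for doc in docs for term in doc)
    return [{t: tf * math.log(n / df[t]) for t, tf in doc.items()} for doc in docs]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Invented feature requests, for illustration only.
requests = [
    "Encrypt payment data",
    "Secure payment checkout",
    "Rate used books",
]
vectors = tfidf_vectors(requests)
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

Storing vectors as sparse dictionaries keeps the sketch small and reflects the sparsity of short feature requests; a term that occurs in every request receives a weight of zero under this weighting.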

Based on prior experiments, the clustering framework utilizes a technique proposed by Can [4] to predict the appropriate number of clusters for the current dataset. Can’s approach considers the degree to which each feature request differentiates itself from other requests, and the ideal number of clusters $K$ is computed as follows:

$$K = \sum_{x \in D} \frac{1}{|x|} \sum_{t \in x} \frac{f_{x,t}^2}{N_t}$$

where $f_{x,t}$ is the frequency of term $t$ in artifact $x$, $|x|$ is the length of the artifact, $N_t$ represents the total occurrence of term $t$, and $D$ is the complete set of feature requests. This approach has been validated across multiple datasets [5,12].

Following an extensive series of previously conducted experiments to evaluate the best clustering technique for feature requests [12], Spherical K-Means (SPK), which exhibits fast running times and returns relatively high-quality results, was adopted. A set of $K$ centroids is initially selected. The distance between each feature request and each centroid is then computed, and each feature request is placed into the cluster associated with its closest centroid. Once all feature requests have been assigned to clusters, each of the centroids is repositioned in order to increase its average proximity to all its associated feature requests. This process is followed by a series of incremental optimizations in which an attempt is made to move a randomly selected feature request to the cluster for which it maximizes the overall cohesion gain. The process continues until no further optimizations are possible.

As the quality of the final clustering is dependent upon a good initial choice of centroids, we adopted a consensus clustering technique in which multiple candidate clusterings, called an ensemble, are generated and then a voting scheme is used to create the final result. There are two primary components in consensus clustering: ensemble generation, which describes how the ensemble is created, and integration, which describes the technique for transforming the clusterings in the ensemble into a final result. Our approach adopts sub-sampling SPK and average-link Agglomerative Hierarchical Clustering (AHC) as methods for ensemble generation and integration respectively. Sub-sampling constructs a clustering ensemble of size $r$ through subset clustering and classifying. In each of $r$ runs, a proportion $\alpha$ of the whole dataset is randomly extracted and then partitioned into $K$ clusters using SPK. Based on prior experiments, $\alpha$ was set to 0.8 [12]. The remaining feature requests are then classified into their most closely related clusters.
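As a small worked illustration of the cluster-count estimate attributed to Can [4], the sketch below sums, over all feature requests, the length-normalized squared term frequencies divided by each term's total occurrence. It assumes that the length of an artifact is its total number of term occurrences after preprocessing; the function name and the toy inputs are hypothetical.

```python
from collections import Counter

def estimate_cluster_count(documents):
    """Sum over documents of (1/length) * sum over terms of f^2 / N_t."""
    counts = [Counter(doc) for doc in documents]
    # N_t: total occurrences of each term across the whole collection.
    totals = Counter()
    for c in counts:
        totals.update(c)
    k = 0.0
    for c in counts:
        length = sum(c.values())
        if length == 0:
            continue  # an empty document contributes nothing
        k += sum(f * f / totals[t] for t, f in c.items()) / length
    return k

# Two requests with no shared terms versus two identical requests.
disjoint = [["payment", "card"], ["rating", "review"]]
identical = [["payment", "card"], ["payment", "card"]]
print(estimate_cluster_count(disjoint), estimate_cluster_count(identical))
```

The two toy inputs show the intuition: requests that share no vocabulary each count as a full cluster, whereas requests with identical vocabulary share a single cluster between them. In practice the estimate would presumably be rounded to an integer before being passed to the clustering step.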

Our ensemble integration is based on the concept of a co-association matrix $M$. Let $D = \{x_1, x_2, \ldots, x_N\}$ be the set of feature requests to be clustered. A clustering ensemble $\Pi = \{P_1, P_2, \ldots, P_r\}$ represents $r$ partitionings of $D$, where the partitioning $P_i = \{C_1^i, C_2^i, \ldots, C_K^i\}$ represents a set of clusters such that $\bigcup_{j=1}^{K} C_j^i = D$. Then each element of the co-association matrix represents a voting score between a pair of instances, $M(a, b) = n_{ab} / r$, where $n_{ab}$ is the number of times the instance pair $(x_a, x_b)$ is assigned to the same cluster over the ensemble. The underlying assumption is that pairs of feature requests that truly belong together in a cluster are likely to be placed together in more of the individual clusterings than pairs that do not belong together. The final partitioning is generated by using hierarchical clustering to cluster over the co-association matrix [11].

Although consensus clustering has a relatively long running time, it delivers clusters that are consistently of higher quality than the average SPK clustering, and invariably close to the optimal quality achievable by an individual SPK clustering. Experiments validating this finding for software requirements and feature requests are reported in a prior paper [12].

The sentences in the vision document are then clustered in a similar way. The end result is a set of topics extracted from the vision document, and another set of topics extracted from the feature requests. The cosine similarity formula is then used to compute the similarity between each topic in the vision document and each topic from the feature requests. Topics in the vision document with no matching topics in the feature requests, i.e. no topic that scores over a predefined similarity score, are identified as potentially unexplored topics and presented as recommendations to the project manager. Figure 3 depicts all of the topics identified in the student domain. The colorings provide a simulated example showing topics not fully covered in the discussion forums.

[Figure 3: graph of topic clusters with nodes Payments, Testing, Ratings, Security, Used Books, Product Code, Deals, Encryption, Quality, Development, Comparisons, Portability, Database, Tracking, Shopping Cart, Ease of Use, Availability, Performance.]

Figure 3. Topic coverage. Note: Graph was generated automatically using Consensus Clustering and graphing tool; however the shading was performed manually to illustrate results from the topic coverage algorithm.
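The ensemble integration step, in which pairs of feature requests vote for being clustered together and the votes are then merged by average-link agglomerative clustering, can be sketched as follows. The sketch assumes each clustering in the ensemble is given as a list of cluster labels, one per feature request; the function names and the three toy clusterings are hypothetical, and the naive merge loop is written for readability rather than speed.

```python
from itertools import combinations

def co_association(ensemble, n_items):
    """Fraction of clusterings in which each pair of items shares a cluster."""
    votes = [[0.0] * n_items for _ in range(n_items)]
    for labels in ensemble:
        for i, j in combinations(range(n_items), 2):
            if labels[i] == labels[j]:
                votes[i][j] += 1
                votes[j][i] += 1
    r = len(ensemble)
    return [[v / r for v in row] for row in votes]

def average_link(matrix, k):
    """Repeatedly merge the two clusters with the highest average vote until k remain."""
    clusters = [[i] for i in range(len(matrix))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            score = sum(matrix[i][j] for i, j in pairs) / len(pairs)
            if best is None or score > best[0]:
                best = (score, a, b)
        _, a, b = best
        # a < b always holds here, so deleting b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Three toy clusterings of four feature requests (cluster label per request).
ensemble = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
matrix = co_association(ensemble, 4)
print(average_link(matrix, 2))
```

Only agreement between labels within a single clustering matters, so the label values themselves do not need to be aligned across the ensemble.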

4. RECOMMENDING EXPERTS Once a topic has been recommended to the project manager for further attention, a different recommender system can suggest a set of stakeholders who might serve as subject matter experts.

Our prior work in this area has identified three different types of stakeholders [7]. The first group represents stakeholders who have directly contributed ideas to the discussion topic (assuming it pre-exists). This group of stakeholders obviously does not exist when an entirely new topic is launched.

The second group represents stakeholders who have contributed to related topics. These stakeholders are identified by using the cosine similarity to compute the similarity between each pair of topics. Topics are then represented as a graph, in which each topic is directly connected to other topics that exhibit a similarity score over a certain threshold. For example, Figure 4 depicts a graph showing topics related to Encryption. Stakeholders associated with all directly related groups, in this case payments, security, database, quality, availability, and privacy, can be selected. The selection process can either use a fixed threshold to select all stakeholders who have contributed to topics with similarities above a certain score, or can adopt a weighted approach which takes the similarity of the topic and the level of stakeholder participation into account.

[Figure 4: graph with central node Encryption connected to Payments, Database, Quality, Security, Privacy, and Availability.]

Figure 4. Topics directly related to “Encryption”

Finally, the third group of stakeholders can be identified using the collaborative recommender described in Section 2. This discovers stakeholders who, according to their user profiles and the interests of their nearest neighbors, might be interested in this topic. It is interesting to note that even if the topic initially has no known stakeholders, the use of the content-based recommender can circumvent the cold-start problem by creating an initial pool of interested people, and this can provide the data needed to run the collaborative recommender and find additional stakeholders.
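The selection of the second group of stakeholders, those who contributed to related topics, can be sketched as a threshold filter over topic similarities followed by a weighted score. Everything concrete in the sketch is an assumption made for illustration: the similarity values, the post counts, the 0.3 threshold, and the scoring rule (similarity multiplied by number of posts) are hypothetical and are not taken from [7].

```python
def related_topics(similarity, topic, threshold):
    """Topics whose similarity to `topic` exceeds the threshold."""
    return {other: s for other, s in similarity[topic].items() if s > threshold}

def rank_experts(similarity, contributions, topic, threshold=0.3):
    """Score stakeholders by similarity-weighted participation in related topics."""
    scores = {}
    for other, sim in related_topics(similarity, topic, threshold).items():
        for person, posts in contributions.get(other, {}).items():
            scores[person] = scores.get(person, 0.0) + sim * posts
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Invented topic similarities and post counts, for illustration only.
similarity = {
    "encryption": {"security": 0.8, "payments": 0.5, "ratings": 0.1},
}
contributions = {
    "security": {"dave": 3, "alice": 1},
    "payments": {"alice": 4},
    "ratings": {"carol": 9},
}
for person, score in rank_experts(similarity, contributions, "encryption"):
    print(person, round(score, 2))
```

The fixed-threshold variant described above corresponds to keeping every stakeholder returned by this function regardless of score, while the weighted variant would keep only the highest-ranked ones.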

5. RECOMMENDING FEATURES Several researchers have previously pointed out the importance of creativity in the requirements process. For example, Karlsen et al. investigated the use of a searching and browsing tool for exploring information to synthesize creative ideas for a project [14]. In their approach, they seeded the combinFormation tool with text from scenarios and then presented the resulting images and short textual clips to the user to encourage creative thinking. Gibiec [13] implemented a similar tool that took a topic as a search term, utilized the Google and Bing search engines to find related documents, and then used data-mining techniques to extract topics from the results. His approach returned key topics that are potentially related to the initial search term and which can be presented to the user to inspire ideas related to the general topic. An initial series of feasibility experiments has demonstrated that this tool can often generate useful topics; however, additional work is needed to discover best-of-breed algorithms. A scenario can be envisioned in which a recommender system, using this information, can suggest features to the project manager. The project manager can then choose to launch a new discussion thread to address this feature.

6. PRIOR WORK The use of recommender systems in the domain of requirements engineering is a fairly new concept. Our research group has produced an initial body of work that has been summarized in this paper [5,6,7,8]. Maalej and Thurimella have also raised this subject and presented a list of possible usages and applications of recommender systems in this domain [16]. While the application to the domain is new, the fundamentals of recommender systems have been extensively researched [1,2,3,17,19]. Adomavicius provides a good summary of the current state of recommender systems [1]. All of the techniques presented in this paper are built on the solid work done in this area. For example, the hybrid combination of content and collaborative recommenders discussed in section 4 has been found by Burke to be effective in other domains and an appropriate solution to the cold start problem [3].

7. CONCLUSIONS This paper has presented several ideas for integrating recommender systems into the online requirements gathering process by recommending topics, areas that need focus, and experts. This is an important area of work, as an increasing number of organizations are utilizing wikis and forums to manage feature requests and to gather requirements during the elicitation process. While web-based requirements gathering tools can facilitate a more inclusive requirements elicitation process, the potential overload of data can decrease their effectiveness. Fortunately, the recommender systems described in this paper offer one solution for helping to manage online collaborative requirements processes. Although all of the examples in this paper have been taken from online forums, we do not believe that the recommender systems described in this paper are limited to use in this context. Requirements management tools such as Requisite Pro and DOORS also attempt to organize stakeholder and system level requirements by groups, and recommender systems could be similarly built into these tools to highlight important and relevant areas to different project stakeholders. We are currently incorporating several of the recommender systems described in this paper into an application named ReqForum, built using the open source JForum package. This will provide the opportunity for evaluating the usefulness of the described recommender systems within an actual project where stakeholders can provide realistic feedback concerning the correctness of the recommendations.

8. ACKNOWLEDGMENTS This work is funded in part by the National Science Foundation under NSF grant #CCF-0810924. We thank several DePaul students who have helped build the current forum tool, including Adam Czauderna and Ming Chang Liu.

9. REFERENCES
[1] G. Adomavicius and A. Tuzhilin. Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Trans. on Knowl. and Data Eng., 17(6): 734-749, Jun. 2005.
[2] C. Basu, H. Hirsh, and W. Cohen. Recommendation as classification: using social and content-based information in recommendation. In Conf. on Artificial intelligence/Innovative applications of artificial intelligence (AAAI '98/IAAI '98), pages 714-720, Madison, WI, USA, 1998.
[3] R. Burke. Hybrid Recommender Systems: Survey and Experiments. User Modeling and User-Adapted Interaction, 12(4): 331-370, Nov. 2002.
[4] F. Can and E. A. Ozkarahan. Concepts and effectiveness of the cover-coefficient based clustering methodology for text databases. ACM Trans. on Database Systems, 15(4): 483-517, 1990.
[5] C. Castro-Herrera, J. Cleland-Huang, and B. Mobasher. Enhancing stakeholder profiles to improve recommendations in online requirements elicitation. In IEEE Conf. on Requirements Eng. (RE'09), Atlanta, GA, USA, Aug. 2009.
[6] C. Castro-Herrera, C. Duan, J. Cleland-Huang, and B. Mobasher. A recommender system for requirements elicitation in large-scale software projects. In ACM Symposium on Applied Computing (SAC'09), pages 1419-1426, Honolulu, HI, USA, Mar. 2009.
[7] C. Castro-Herrera and J. Cleland-Huang. A Machine Learning Approach for Identifying Expert Stakeholders. In 2nd International Workshop on Managing Requirements Knowledge, Atlanta, GA, USA, Sept. 2009.
[8] C. Castro-Herrera, J. Cleland-Huang, and B. Mobasher. A recommender system for dynamically evolving online forums. In Recommender Systems, pages 213-216, New York, USA, 2009.
[9] J. Cleland-Huang, H. Dumitru, C. Duan, and C. Castro-Herrera. Automated support for managing feature requests in open forums. Comm. of the ACM, 52(10): 68-74, 2009.
[10] B. Decker, E. Ras, J. Rech, P. Jaubert, and M. Rieth. Wiki-Based Stakeholder Participation in Requirements Engineering. IEEE Software, 24(2): 28-35, Mar. 2007.
[11] I. S. Dhillon and D. S. Modha. Concept decompositions for large sparse text data using clustering. Mach. Learn., 42(1/2): 143-175, 2001.
[12] C. Duan, J. Cleland-Huang, and B. Mobasher. A consensus based approach to constrained clustering of software requirements. In ACM Conf. on Information and Knowledge Management, pages 1073-1082, Napa Valley, CA, USA, 2008.
[13] M. Gibiec. A Hierarchical Approach to Mining Topic-based Features. Term Project, DePaul University.
[14] I. Karlsen, N. A. M. Maiden, and A. Kerne. Inventing Requirements with Creativity Support Tools. In Conf. on Reqs Eng. for Software Quality, pages 162-174, Amsterdam, 2009.
[15] Y. Koren, R. Bell, and C. Volinsky. Matrix Factorization Techniques for Recommender Systems. Computer, 42(8): 30-37, Aug. 2009.
[16] W. Maalej and A. Thurimella. Towards a Research Agenda for Recommendation Systems in Requirements Engineering. In Proceedings of the 2nd Int. Workshop on Managing Requirements Knowledge, Atlanta, GA, USA, Aug. 2009.
[17] J. B. Schafer, D. Frankowski, J. Herlocker, and S. Sen. Collaborative Filtering Recommender Systems. In The Adaptive Web: Methods and Strategies of Web Personalization (Lecture Notes in Computer Science). Springer-Verlag, Berlin, Germany, 2007.
[18] Standish Chaos Report, Standish Research Group. Available at http://www.standish.com
[19] G. Takács, I. Pilászy, B. Németh, and D. Tikk. Matrix factorization and neighbor based algorithms for the Netflix prize problem. In ACM Conf. on Recommender Systems, pages 267-274, Lausanne, Switzerland, Oct. 23-25, 2008.
