
IPSJ SIG Technical Report

Mining Contexts for Recommending Source Locations to Explore*

SEONAH LEE†  SUNGWON KANG†

To recommend source locations to visit, previous approaches have mined the associations between source locations in programmer interaction histories. However, these approaches result in low recommendation accuracy. To recommend source locations more accurately, and thus more effectively, we propose NavClus, which automatically forms collections of source locations that are relevant to the tasks performed by programmers and then retrieves the collections that best match a programmer’s navigation path. To evaluate NavClus, we simulated recommendations and conducted user studies.

1. Introduction

A software evolution task is an identifiable and essential unit of work that changes a software system, such as fixing a bug or enhancing a feature. While performing software evolution tasks, programmers first understand the context of source locations, then determine the locations to examine, and next verify whether those locations are relevant to their tasks. Finally, they identify and edit the locations necessary for accomplishing their tasks [3].

To assist programmers in determining which source locations to examine next, researchers have suggested recommending source locations by mining programmer interaction histories. For example, DeLine et al. proposed mining consecutive visits between methods and recommending subsequent methods to visit as a programmer navigates the code base [1]. Likewise, Singer et al. proposed recommending subsequent files to visit [10], and Sahm and Maalej proposed recommending software artifacts to visit [9]. However, various user studies show that these approaches yield low recommendation accuracy and thus do not effectively assist programmers.

To effectively assist programmers who navigate the code base, we introduce the concept of navigation contexts: the information that a programmer needs to explore and understand during a software evolution task. We then introduce our NavClus tool for mining navigation contexts, which are collections of source locations relevant to tasks, from programmer interaction histories. To recommend source locations with high accuracy, we also propose retrieving the past navigation context that best matches a programmer’s navigation path.

To evaluate the proposed approach, we simulate code recommendations. We simulate two tools that mine programmer interaction histories, the state-of-the-art recommender TeamTracks and the proposed recommender NavClus, and compare their recommendation accuracies. The simulations show that the recommendation accuracy of NavClus is twice as high as that of TeamTracks. We also conduct user studies in which programmers used the NavClus tool, and evaluate the results. The user studies show that most programmers responded positively to the NavClus tool, with all participants in the diary study stating that they intended to continue using NavClus.

* This paper is an abstracted version of the following paper: S. Lee and S. Kang, “Clustering Navigation Sequences to Create Contexts for Guiding Code Navigation,” Journal of Systems and Software, 2013.
† KAIST

ⓒ2013 Information Processing Society of Japan

This paper is organized as follows. Section 2 discusses related work. Section 3 suggests mining navigation contexts in programmer interaction histories. Section 4 reports our evaluations and results. Section 5 concludes the paper.

2. Related Work

A recommendation system for software engineering is defined as "a software application that provides information items estimated to be valuable for a software engineering task in a given context" [11]. To recommend source locations that are relevant to software evolution tasks, researchers have mined programmer interaction histories, which are the records of the events in which programmers visit and edit source locations. Previous work that mines programmer interaction histories includes TeamTracks, NavTracks, and Switch! [1, 9, 10].

TeamTracks recommends the methods to visit next for a programmer working on a method [1]. To recommend such methods, TeamTracks mines consecutive visits between two source locations in programmers’ interaction histories. In a lab study, TeamTracks demonstrated the potential of sharing navigation data among programmers.

NavTracks recommends files to visit next for a programmer currently working on a file [10]. To recommend such files, NavTracks segments programmer interaction histories into navigation loops and finds associations among files. NavTracks serves as a memory aid for revisiting previously viewed files. Both approaches are a primitive form of recommendation based upon the associations between two source locations.

Likewise, Sahm and Maalej proposed Switch! for recommending the artifacts to visit next [9]. This approach makes recommendations when a programmer clicks a button to ask for one. It recommends various artifacts, including tools and documents. However, its internal mechanism is similar to that of TeamTracks [1] in that it depends on associations between two artifacts.

These tools typically do not require prior knowledge or explicit input from the programmer. However, they yield low recommendation accuracy: in all these approaches, many of the recommended source locations are irrelevant or not highly relevant to the given task [1, 4]. Our approach aims to improve recommendation accuracy, and thus to assist programmers’ code navigation more effectively.


3. Proposed Approach

We first define the term “navigation context” (Section 3.1), and then present two principles that help identify navigation contexts within programmer interaction histories (Section 3.2). We next explain how navigation contexts are formed by mining programmer interaction histories (Section 3.3). We finally introduce the NavClus tool (Section 3.4).

3.1 Navigation Contexts

To define the collections of source locations that programmers need to visit during code navigation for their tasks, we introduce the following notion:

• Navigation Context: the information (a graph of source locations and the relationships between them) that a programmer needs to explore during an evolution task.

To guide a programmer’s code navigation effectively, a recommendation system needs to provide a navigation context on the spot as the programmer navigates the code base. The question is then how such navigation contexts can be formed automatically. We conjecture that the recommendation system can automatically form navigation contexts by using the collections of source locations that programmers navigated while performing tasks in the past. In data mining terms, we define the navigation context as a group of frequently visited locations that are relevant to similar tasks.

3.2 Principles for Creating Contexts

A source location is relevant to a task if it is needed by a programmer who performs the task. We proposed two principles for identifying collections of source locations relevant to tasks, in order to automatically create navigation contexts from programmer interaction histories [4].

We define the frequency of a source location as the number of visits that programmers make to that source location. As researchers have reported that the more frequently a source location is visited during a task, the greater its importance in completing that task [1, 3], we propose principle 1.

• Principle 1 (Relevance by Frequency): The source locations that programmers frequently visited are likely to be significantly or highly relevant to their tasks.

We view the successive visits that programmers make from one source location to another as an association between them. As researchers have reported that the more frequently two source locations are successively visited, the more contextually associated they are [9, 10], we propose principle 2.

• Principle 2 (Relevance by Association): If a source location is relevant to a task, it is likely that the other source locations that programmers sequentially visited from that location are relevant to the same task.

3.3 Steps for Creating Contexts

To recommend collections of source locations that are relevant to a programmer’s task at hand, we mine navigation contexts from programmer interaction histories. To mine contexts, we use the following formula:

• Navigation Context = Retrieve(Mine(Segment(InteractionHistories)), NavigationPath)
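The composition Retrieve(Mine(Segment(...)), NavigationPath) can be read as a three-stage pipeline. The following is a minimal Python sketch of that reading, using the concrete rules described in Sections 3.3.1 to 3.3.3; it is an illustration, not the NavClus implementation. The function names mirror the formula, while the `PAUSE` marker, the use of sets for binary location vectors, and the `min_count` cutoff for "high frequency" are assumptions introduced here.

```python
from collections import Counter
from math import sqrt

PAUSE = None  # hypothetical marker for "the programmer stopped for a while"


def segment(events):
    """Split a stream of visited locations into navigation sequences.

    A sequence ends when the programmer revisits a location already in the
    current sequence (the new sequence starts at that location) or pauses.
    """
    sequences, current = [], []
    for location in events:
        if location is PAUSE:
            if current:
                sequences.append(tuple(current))
            current = []
        elif location in current:
            sequences.append(tuple(current))
            current = [location]
        else:
            current.append(location)
    if current:
        sequences.append(tuple(current))
    return sequences


def cosine(seq_a, seq_b):
    """Cosine similarity of two sequences viewed as binary location vectors."""
    a, b = set(seq_a), set(seq_b)
    return len(a & b) / sqrt(len(a) * len(b))


def mine(sequences, threshold=0.5):
    """Cluster similar sequences, then count visits per location in each group."""
    groups = []
    for seq in sequences:
        best_group, best_sim = None, threshold
        for group in groups:
            # Similarity to a group = similarity to its most similar member.
            sim = max(cosine(seq, earlier) for earlier in group)
            if sim > best_sim:
                best_group, best_sim = group, sim
        if best_group is None:
            groups.append([seq])  # nothing similar enough: start a new group
        else:
            best_group.append(seq)
    return [Counter(loc for seq in group for loc in seq) for group in groups]


def retrieve(contexts, path, min_count=2):
    """Pick the context sharing the most locations with the path, then
    return its unvisited locations with at least min_count visits."""
    visited = set(path)
    best = max(contexts, key=lambda context: len(visited & set(context)))
    return [loc for loc, count in best.most_common()
            if count >= min_count and loc not in visited]


# Event stream from the worked example in Section 3.3.1.
events = ["a", "b", "c", "a", "b", "d", "b", "d", PAUSE,
          "e", "f", "g", "e", "f", "c", "f", "c", "c", "a", "b", "x", "b", "d"]
contexts = mine(segment(events))
recommended = retrieve(contexts, ["c", "d"])
print(recommended)
```

With the example event stream of Section 3.3.1 and the navigation path (c, d), this sketch prints `['b', 'a']`, which agrees with the recommendation of a and b described in Section 3.3.3; x is dropped because its count of 1 falls below the assumed cutoff.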


3.3.1 Segmenting

As a programmer interaction history is a continuous data stream that grows over time, we first need to segment it into sequences of source locations that could be relevant to a task. As the unit of segmentation, we adopt the navigation sequence, and segment an interaction history into navigation sequences:

• Navigation Sequence: A navigation sequence is a sequence of source locations successively visited by a programmer; a navigation sequence ends where a programmer returns to a source location that s/he has already visited, and a new sequence begins at the repeated source location.

Let us assume that a system simply observes a sequence of events that represent programmers’ visits to source locations, as follows:

a, b, c, a, b, d, b, d, [pause], e, f, g, e, f, c, f, c, c, a, b, x, b, d

The marker ‘[pause]’ in the sequence above represents a time point when a programmer stopped for a while. As the segmentation shows, such a pause also closes the current navigation sequence. This sequence of events is segmented into navigation sequences as follows:

(a, b, c), (a, b, d), (b, d), (e, f, g), (e, f, c), (f, c), (c, a, b, x), (b, d)

3.3.2 Mining

To mine navigation contexts, we need to integrate the collections of locations relevant to similar tasks and separate the collections of locations relevant to dissimilar tasks. To meet this need, we propose clustering navigation sequences. To explain our idea, we first define the following key terms.

• Similar Tasks: Two tasks are similar if programmers modify the same feature by using the same software artifacts in the same development environment.

• Similar Navigation Sequences: Two navigation sequences are similar if they contain more of the same source locations than different source locations.

To determine if two navigation sequences are similar, we can use a similarity metric.
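One standard choice of metric, and the one used in the example that follows, is cosine similarity. As a reminder, the textbook definition is sketched below; the symbols u, v, A, and B are introduced here for illustration and are not notation from the original paper. When each navigation sequence is encoded as a binary vector over source locations, the dot product counts the shared locations and each norm is the square root of the number of distinct locations in that sequence:

```latex
\[
\cos(u, v) \;=\; \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}
          \;=\; \frac{|A \cap B|}{\sqrt{|A| \, |B|}}
\]
```

Here u and v are the binary vectors of two navigation sequences, and A and B are their sets of distinct source locations.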
If we use the cosine similarity metric [2], navigation sequences (a, b, c) and (a, b, d) are converted to the two vectors (1, 1, 1, 0) and (1, 1, 0, 1) over the locations (a, b, c, d), and their cosine similarity is calculated as 2/3 ≈ 0.67. If the cosine similarity is greater than a certain threshold (e.g., 0.5), the navigation sequences (a, b, c) and (a, b, d) are determined to be similar.

Our insight is that clustering similar sequences can form collections of locations relevant to similar tasks. According to principle 2, source locations in a navigation sequence are likely to be relevant to a task. Let α and β be navigation sequences created from the locations programmers navigated while performing certain tasks, say Ta and Tb. If α and β share locations, the locations in α are likely to be relevant to Tb (and likewise the locations in β are likely to be relevant to Ta). The more locations the two navigation sequences α and β share, the more likely the locations in α are relevant to task Tb, and vice versa.

We then cluster similar navigation sequences by putting a navigation sequence α into the group that contains the previous navigation sequence most similar to α, provided that the similarity is over a threshold (for example, 0.5). The first two navigation sequences (a, b, c) and (a, b, d) are grouped into group Ga because their similarity value is 0.67. The navigation sequence (b, d) is added to the same group Ga because the similarity value of (b, d) and (a, b, d) is 0.82. In this way, we can
cluster the above navigation sequences into two groups:

Ga: {(a, b, c), (a, b, d), (b, d), (c, a, b, x), (b, d)},
Gb: {(e, f, g), (e, f, c), (f, c)}

Once we cluster the sequences into two groups of source locations that are contextually related, we then identify the locations that have significance by counting their frequencies, based on principle 1. We call these groups navigation contexts.

Ga: {(a, 3), (b, 5), (c, 2), (d, 3), (x, 1)},
Gb: {(e, 2), (f, 3), (g, 1), (c, 2)}

3.3.3 Retrieving

These navigation contexts can be retrieved to recommend more source locations that are significant to the current task of a programmer. As a programmer interacts with the code base, the collection that contains the greatest number of locations that the programmer has recently visited is retrieved. From that collection, the source locations that have high frequencies are recommended. For example, if a programmer navigates c and d, the context Ga is retrieved (it contains both c and d, whereas Gb contains only c), and the programmer receives a recommendation of a and b, which are frequently visited in similar tasks. The source location x is excluded from the recommendation because it has a low visit frequency.

3.4 Tool

NavClus is built around the notion of navigation contexts, which roughly correspond to a network of method views and edits. The approach is implemented as an extension to the Eclipse IDE for easy use by programmers, as shown in Fig. 1.

Figure 1. NavClus User Interface, extracted from [8]

NavClus performs five core functions [8]:

• Display history: As a programmer navigates the code base, the methods and classes already explored incrementally appear in a class diagram.
• Create and display recommendations: Based on the current working context and previously seen working contexts, recommendations for additional methods to explore are presented in the class diagram.
• Update diagram layout: Programmers can manually rearrange the class diagram via the mouse, or can let NavClus automatically update the layout for readability.
• Jump to source locations: Previously visited or recommended source locations can be visited by double-clicking on the methods in the class diagram.
• Collect interaction traces: As a programmer visits and edits methods and classes, the programmer’s actions are recorded as interaction traces. This information is used to improve later recommendations.

4. Evaluation

To evaluate NavClus, we simulated code recommendations using experimental data and real data (Section 4.1), and also conducted a wizard-of-oz study and a diary study (Section 4.2).

4.1 Simulations

This section presents two simulations. Section 4.1.1 shows the results of a simulation using experimental data to compare the recommendation accuracies of NavClus and TeamTracks. To generalize the results, Section 4.1.2 shows the results of a simulation using real data to compare the recommendation accuracy of NavClus with that of TeamTracks.

4.1.1 Simulation using Experimental Data

We first used the interaction traces that were created when twelve programmers performed the same four tasks in an experiment, and compared the recommendation accuracies of NavClus and TeamTracks [5]. We mined the interaction histories of the first eight programmers and simulated code recommendations with the interaction histories of the last four programmers. The simulation created recommendations based upon the ten source locations that a programmer initially navigated to perform tasks. To measure the recommendation accuracy, we used Cumulated Gain (CG) [2], which is commonly used to measure the effectiveness of information retrieval techniques. NavClus shows a CG that is more than twice that of TeamTracks (35 vs. 15).

4.1.2 Simulation using Real Data

We also used 4,397 interaction traces obtained from the Eclipse Bugzilla system, where programmers fixed real bugs and enhanced features [6]. With the traces, we simulated the recommendations of NavClus and TeamTracks, and compared their recommendation accuracies. We used an on-line learning evaluation method that mines all of the past histories to make a recommendation at the current time point. When the simulator makes recommendations from interaction trace Ti, the simulator mines the interaction traces that occurred prior to Ti (from T1 to
Ti-1). To measure the recommendation accuracy, we used the F-measure, the harmonic mean of precision and recall [2]. The overall F-measure values were low because recall has a large denominator: Ti occasionally includes more than a hundred locations, whereas the simulation recommends at most 10 locations. Thus, the recall value is occasionally less than 0.1. In this case, the F-measure stays below about 0.18 even if the precision is 1.0, since 2 × 1.0 × 0.1 / (1.0 + 0.1) ≈ 0.18. The F-measure of NavClus is about twice as high as that of TeamTracks (0.144 vs. 0.073).

4.2 User Studies

This section presents two user studies. Section 4.2.1 shows the results of a wizard-of-oz study that gathered users’ feedback in the early phase of a work-in-progress system. Section 4.2.2 shows the results of a diary study conducted to understand the effectiveness of the system in users’ real environments.

4.2.1 Wizard-of-oz Study

To evaluate a graphical code recommender that embeds NavClus, we conducted a wizard-of-oz study [7]. A wizard-of-oz study enables us to gather users’ feedback on a work-in-progress recommender at an early phase. In the study, programmers were given a graphical view on the fly as they navigated the code base. They were observed and interviewed, and their responses and feedback were gathered. Eleven programmers were recruited for the study. They were asked to perform the same tasks, based on the interaction traces we used for the simulation in Section 4.1.1. The study aimed to find out whether programmers find the recommended source locations useful while navigating the code; nine out of eleven participants evaluated the prototype positively. The main reason the participants favored the prototype was that they could find the source locations relevant to the task they were performing. We interpreted this result as evidence that the prototype helped programmers navigate toward task-relevant source locations, and went on to conduct another user study.
4.2.2 Diary Study

To investigate the effectiveness of the graphical code recommender NavClus in real-world development, we conducted a diary study [8]. In a diary study, participants are asked to keep a long-term record of their everyday experiences, from which researchers can infer underlying patterns. The diary study format allows participants to reflect on their daily work, and allows us to explore our questions in a realistic scenario. Ten programmers were recruited for the study. They were asked to install the NavClus tool and to write a daily diary report for a month. At the end of the study, they were interviewed about their experience with NavClus. All of the participants evaluated NavClus highly, stating that they would use NavClus in the future, but this appears to be largely due to the visualization of the source locations visited rather than the recommendations of source locations to visit. We attribute this to the fact that the participants in the diary study received recommendations based on their own interaction histories, accumulated over a month. If a programmer leaves and a newly recruited programmer becomes responsible for the code base, the new programmer could then get significant assistance from NavClus, based on the departing programmer’s history data.

5. Conclusions

To effectively assist programmers who navigate the code base, we proposed mining navigation contexts from programmer interaction histories. For this, we developed a technique that clusters navigation sequences, and incorporated this clustering technique into the NavClus tool. To evaluate the NavClus tool, we performed simulations and conducted user studies. The two simulations showed that the clustering technique yields recommendation accuracy about twice that of the TeamTracks tool. The two user studies showed that recommendations of source locations visualized in a class diagram can facilitate programmers’ code exploration activities.

In the future, we will continue to research and develop a code recommender that recognizes a programmer’s situation and uses the recognized situational information to provide information relevant to the programmer’s task at hand. We are currently extending the NavClus tool to a collaborative version, and plan to conduct another user study to investigate the effect of NavClus on collaboration among programmers.

References

1) R. DeLine, M. Czerwinski, and G. Robertson, "Easing program comprehension by sharing navigation data," VL/HCC, IEEE, pp.241-248, 2005.
2) J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000.
3) M. Kersten and G. C. Murphy, "Using task context to improve programmer productivity," FSE, 2006.
4) A. J. Ko, B. A. Myers, M. J. Coblenz, and H. H. Aung, "An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks," IEEE Transactions on Software Engineering, 32(12), pp.971-987, 2006.
5) S. Lee and S. Kang, "Clustering and Recommending Collections of Code Relevant to Tasks," ICSM ERA, Williamsburg, USA, Sep 25 - Oct 1, 2011.
6) S. Lee and S. Kang, "Clustering Navigation Sequences to Create Contexts for Guiding Code Navigation," Journal of Systems and Software, 2013.
7) S. Lee and S. Kang, "A Study on Guiding Programmers' Code Navigation with a Graphical Code Recommender," Studies in Computational Intelligence, Springer-Verlag Berlin Heidelberg, 2011.
8) S. Lee, S. Kang, and M. Staats, "A Graphical Recommender for Assisting Code Exploration," ICSE Formal Demo, 2013.
9) A. Sahm and W. Maalej, "Switch! Recommending artifacts needed next based on personal and shared context," In: G. Engels, M. Luckey, A. Pretschner, R. Reussner (Eds.), Soft. Eng., LNI, vol. 160, pp.473-484, 2010.
10) J. Singer, R. Elves, and M. Storey, "NavTracks: supporting navigation in software," IWPC, pp.173-175, 2005.
11) M. Robillard, R. Walker, and T. Zimmermann, "Recommendation systems for software engineering," IEEE Soft., 2009.

Acknowledgment

This research was supported by the Basic Science Research Program and the International Cooperation Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2012-0007069 and 2013K2A1A2055116).
