Component Reconn-exion

by Andrew Le Gear

Submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science at the University of Limerick, Ireland

Supervised by Jim Buckley
Examined by Tony Cahill and Gail Murphy

9th November 2006

Declaration

The work described in this thesis is, except where otherwise stated, entirely that of the author and has not been submitted as an exercise for a degree at this or any other university.

Signed: Andrew Le Gear

In my opinion, the work described in this thesis is, except where otherwise stated, entirely that of the author.

Signed: Jim Buckley

"There is no such thing as a long piece of work, except one that you dare not start." - Charles Baudelaire

Abstract

For over thirty years, increased software reuse and replaceability have been touted as a means of easier software development. Unfortunately, this is a non-trivial task. Component-based development attempts to ease the creation of replaceable and reusable software. However, the majority of legacy systems are not implemented using the component-based development paradigm. To enable the reuse of portions of legacy software as part of a component-based development process, a component recovery technique must first be employed. The two phases of component recovery are (1) encapsulation of a candidate component, followed by (2) the application of a component wrapper to allow the component to be used with component-based technologies. This thesis focuses on the first phase, proposing and evaluating a human-driven process for targeted component encapsulation using a combination of two existing techniques: a dynamic software understanding technique called Software Reconnaissance and a design recovery technique called Reflexion Modelling. Specifically, reuse of core assets in a system is identified using a variation of the Software Reconnaissance technique called the reuse perspective. The set of reuse elements in the reuse perspective is subsequently investigated and partitioned into cohesive units of functionality using an adapted version of the Reflexion Modelling technique. The potential usefulness of this process is demonstrated and evaluated using two large-scale industrial case studies. The results of the studies, for the most part, would seem to indicate that the process is worthwhile and affords significant time-saving opportunities for software engineers.

Dedication

Dedicated to a little multi-purpose air conditioner.

Acknowledgements

First and foremost I'd like to give a big thanks to my supervisor, Jim Buckley: a man of eternal patience, with me, who always had time for a question, and who educated me so well from my beginning as a "research illiterate" in September 2003. None of this would have been possible without him. To my parents, Betty and Joseph Le Gear, who have unfailingly supported me through the years leading to my PhD. I'll never forget your unquestioning support as I announced it would be a good idea to spend another three years without a job, after seventeen years of education. To Nicola Quinn, for always seeming so interested as I'd tell her about what a "great day I had at the lab," and for never once saying "I told you so" when I didn't. To everyone else, past and present, from the SAE group and CS1-045: J.J. Collins, Brendan Cleary, Damien Conroy, Thomas Collins, Deirdre Carew, Andrea Suchankova, Seamus Galvin, Finbar McGurren, Darwin Slattery and Chris Exton. Thank you all for discussion, debate, inspiration, help and friendship, and, in darker times, for the "post graduate support group" for those "thesis blues." A thank you to those at QAD, our research and funding partners, and to IBM as research partners. Thanks to Enterprise Ireland, Lero and SFI for their generous funding of the research project. Thank you to those at the CSIS department in UL for their support of this research. Finally, thanks to my examiners, Prof. Tony Cahill and Gail Murphy.

Contents

Part I: Literature Review

1 Introduction
   1.1 Software Maintenance and Development
   1.2 Software Reuse
   1.3 Reusable and Replaceable Software Through Component-based Development
   1.4 Problem Statement: The Legacy Dilemma
   1.5 Reengineering Towards Components
   1.6 Objectives and Contributions
   1.7 Thesis Structure

2 Software Components
   2.1 A Component Definition
   2.2 Interfaces
   2.3 Introducing Component-Based Development
      2.3.1 A Component-Based Development Process
   2.4 The Legacy Dilemma Revisited
   2.5 Encapsulation
      2.5.1 A Brief History of Encapsulation in Software Development
         2.5.1.1 Monitors
         2.5.1.2 Information Hiding
         2.5.1.3 Object Oriented Programming
      2.5.2 Coupling and Cohesion
      2.5.3 Encapsulation Features of Component-Based Development

3 Software Reengineering
   3.1 Agendas for Reengineering Software Systems
   3.2 Dynamic versus Static Analysis
   3.3 Reuse Identification Techniques
      3.3.1 Clone Detection
      3.3.2 Fan-in Analysis
      3.3.3 Frequency Spectrum Analysis
   3.4 Dependency Graphs
   3.5 Reengineering Towards Components
      3.5.1 Design Recovery
      3.5.2 Clustering for Architectural Recovery and Component Recovery
         3.5.2.1 Dataflow-based Approaches
         3.5.2.2 Structure-Based Approaches
         3.5.2.3 Domain-Model Based Approaches
      3.5.3 Aggregated Recovery Approaches
      3.5.4 Componentisation Processes
      3.5.5 Component Wrappers

4 Software Reconnaissance
   4.1 A Functionality View of Software
      4.1.1 Common Software Elements
      4.1.2 Potentially Involved Software Elements
      4.1.3 Indispensably Involved Software Elements
      4.1.4 Uniquely Involved Software Elements
   4.2 Related Work
      4.2.1 Software Instrumentation Enabling Software Reconnaissance
      4.2.2 Best Practices When Applying Software Reconnaissance
      4.2.3 Previous Work Using Software Reconnaissance

5 Software Reflexion Modelling
   5.1 The Reflexion Modelling Process
   5.2 Related Work
      5.2.1 Early Experiences with Reflexion Modelling
      5.2.2 Extensions and Further Uses of Reflexion Modelling
      5.2.3 A Cognitive Basis for Reflexion Modelling
         5.2.3.1 Encoding
         5.2.3.2 Retrieval
         5.2.3.3 Human Learning
         5.2.3.4 Learning Preferences

6 Research Methodology
   6.1 Scientific Method
   6.2 Validity
   6.3 Quantitative and Qualitative Research Methods
   6.4 The Culture of Research Evaluation in Computer Science
      6.4.1 Arguing for Hybrid Approaches to Research in Computer Science
   6.5 A Research Model for This Thesis
      6.5.1 Empirical Techniques Employed

Part II: "Component Reconn-exion": Reengineering Towards Components Using Variations on Reconnaissance and Reflexion

7 Reconn-exion
   7.1 A Conjecture for Prompting Component Abstractions
      7.1.1 A New Reuse Perspective Derived from Software Reconnaissance
   7.2 A Hypothesis for Encapsulating Components Using Reflexion Modelling
   7.3 Hypothesising a Process For Component Encapsulation
   7.4 A Small Example
      7.4.1 The House Application
      7.4.2 Part 1: A Reuse Perspective
      7.4.3 Part 2: Encapsulating with Reflexion

8 Evaluating the Basis Techniques of Reconn-exion
   8.1 Validating the Reuse Perspective: The JIT/S Shipping Case Study
      8.1.1 Purpose and Research Questions
      8.1.2 The Subject System: JIT/S
      8.1.3 The Participants
      8.1.4 Tool Support
      8.1.5 The Study Protocol
      8.1.6 Case Study Part 1
      8.1.7 Case Study Part 2
      8.1.8 Discussion
   8.2 Validating Reflexion-Based Component Encapsulation: The Workspace Case Study
      8.2.1 Purpose and Research Questions
      8.2.2 The Subject System: The Learning Management System
      8.2.3 The Participants
      8.2.4 Participant Tasks
      8.2.5 Tool Support
      8.2.6 The Study Protocol
      8.2.7 Enacting the Process
         8.2.7.1 Participant 1
         8.2.7.2 Participant 2
      8.2.8 Evaluation
         8.2.8.1 Questions 1 and 2
         8.2.8.2 Question 3
         8.2.8.3 Question 3
         8.2.8.4 Question 4
      8.2.9 Discussion

9 Evaluating Reconn-exion: The AIM Case Study
   9.1 Purpose and Research Questions
   9.2 The Subject System: The Advanced Inventory Management Application
   9.3 The Participants
   9.4 Tool Support
   9.5 Study Protocol
   9.6 Recounting the Process
      9.6.1 Participant 1
         9.6.1.1 Creating the reuse perspective
         9.6.1.2 Using the reuse perspective to prompt component abstractions
         9.6.1.3 Encapsulation with Reflexion
         9.6.1.4 Identifying Multiple Interfaces
      9.6.2 Participant 2
         9.6.2.1 Creating the reuse perspective
         9.6.2.2 Using the reuse perspective to prompt component abstractions
         9.6.2.3 Encapsulation with Reflexion
         9.6.2.4 Identifying Multiple Interfaces
   9.7 Evaluation
      9.7.1 Process: The Effectiveness of Reuse Perspective as a Prompt
      9.7.2 Process: Reconn-exion for Component Encapsulation
      9.7.3 Product: An Architect's Assessment of Encapsulated Components
      9.7.4 Product: Metrics of Coupling and Cohesion on Encapsulated Components
      9.7.5 Contextual Knowledge
   9.8 Discussion

10 Scoping Reconn-exion
   10.1 Validity of Studies
   10.2 Theoretical Limitations of Reconn-exion

11 Future Work
   11.1 Catalogue of Guidelines
   11.2 Exploring Database Accesses
   11.3 Component Wrapping
   11.4 Combining with Automated Techniques
   11.5 Feature-based Decomposition of Software
   11.6 Software Product Line Recovery
   11.7 Aspect Recovery
   11.8 Inverse Structural Summarization
   11.9 Collaborative Design Recovery
   11.10 Temporal Summarization
   11.11 Alternate Source Models
   11.12 Metrics and Heuristics

12 Conclusion

Part III: Bibliography

Part IV: Appendices

A Reuse Perspectives
   A.1 Scrabble Emulator Reuse Perspective
B Pilot Models and Maps
C Workplace - Participant 1
D Workplace - Participant 2
E AIM - Participant 1
F AIM - Participant 2
G Interfaces on the House Application
H Peer Reviewed Publications

List of Figures

2.1 Levels of interface specification adapted from Beugnard et al. (Beugnard et al., 1999).
2.2 A Generic Component Architecture adapted from Bachmann et al. (Bachmann et al., 2000).
2.3 The component-based development process adapted from (Cheesman and Daniels, 2001).
2.4 A detailed description of the provisioning workflow.
2.5 A code fragment from a C program.
2.6 A visualisation of the encapsulation exercise shown in figure 2.7.
2.7 A revised version of the code fragment in figure 2.5.
2.8 An inheritance hierarchy code sample.
2.9 A visualisation of the inheritance hierarchy in 2.8.
2.10 Using polymorphism for encapsulation code example.
2.11 An example of a loosely coupled and highly cohesive component.
2.12 Many classes without an encapsulation policy.
2.13 Many classes from figure 2.12 encapsulated by a component.
2.14 Event handling code sample.
2.15 A deployment diagram of two distributed components.
2.16 A call between the distributed components shown in 2.15.
3.1 Code example 1.
3.2 A possible graph representation of code example 1 in figure 3.1.
3.3 Code example 2.
3.4 A possible graph representation of code example 2 in figure 3.3.
3.5 Code example 3.
3.6 A possible graph representation of code example 3 in figure 3.5.
3.7 Code example 4.
3.8 A possible graph representation of code example 4 in figure 3.7.
4.1 Identifying Features from Running Systems.
4.2 Common Software Elements.
4.3 Potentially Involved Software Elements, shaded in black.
4.4 Indispensably Involved Software Elements, shaded in black.
4.5 Uniquely involved software elements, shaded in black.
5.1 Collapsing strategy in operation.
5.2 The Software Reflexion Modelling Process.
5.3 Adapted from (Brace and Roth, 2005).
7.1 The three sets used to form the shared set.
7.2 Encapsulating a component and making its interface explicit.
7.3 Identifying multiple interfaces on a component using Reflexion Modelling.
7.4 The Component Reconn-exion process.
7.5 A screenshot of the house application.
7.6 First house application Reflexion model.
7.7 First house application Reflexion model map.
7.8 Second house application Reflexion model.
7.9 Second house application Reflexion model map.
7.10 Third house application Reflexion model.
7.11 Third house application Reflexion model map.
8.1 A screenshot of the jRMTool eclipse plug-in.
8.2 First iteration by the first participant.
8.3 Last iteration by the first participant.
8.4 Web administration user interface component. (Note the bolded black box is not part of the tool's visual output.)
8.5 A design pattern in the LMS.
9.1 Procedure clusters prompted by the reuse perspective (participant 1). File names are blurred for copyright reasons.
9.2 Participant one's first Reflexion model and map.
9.3 Participant one's final Reflexion model.
9.4 Procedure clusters prompted by the reuse perspective (participant 2). Note that the file names are blurred for copyright reasons.
9.5 Participant two's third Reflexion model.
9.6 Participant two's final Reflexion model.
11.1 Decomposing a system in terms of its SHARED sets.
11.2 Including the common software elements in the model of the system.
11.3 A feature-based decomposition of a software system that shows shared, unique and common software elements of a system.
11.4 A simple temporal source model.
11.5 An example temporal summarization.
12.1 The Component Reconn-exion process.
B.1 Pilot Study - Scrabble Emulator - Reflexion model and map 1.
B.2 Pilot Study - Scrabble Emulator - Reflexion model and map 2.
B.3 Pilot Study - Scrabble Emulator - Reflexion model and map 3.
B.4 Pilot Study - Scrabble Emulator - Reflexion model and map 4.
C.1 Case Study - Workplace - Participant 1 - Reflexion model and map 1.
C.2 Case Study - Workplace - Participant 1 - Reflexion model and map 2.
C.3 Case Study - Workplace - Participant 1 - Reflexion model and map 3.
C.4 Case Study - Workplace - Participant 1 - Reflexion model and map 4.
C.5 Case Study - Workplace - Participant 1 - Reflexion model and map 5.
C.6 Case Study - Workplace - Participant 1 - Reflexion model 6.
C.7 Case Study - Workplace - Participant 1 - Reflexion model and map 7.
C.8 Case Study - Workplace - Participant 1 - Reflexion model and map 8.
D.1 Case Study - Workplace - Participant 2 - Reflexion model and map 1.
D.2 Case Study - Workplace - Participant 2 - Reflexion model and map 2.
D.3 Case Study - Workplace - Participant 2 - Reflexion model and map 3.
D.4 Case Study - Workplace - Participant 2 - Reflexion model and map 4.
D.5 Case Study - Workplace - Participant 2 - Reflexion model and map 5.
D.6 Case Study - Workplace - Participant 2 - Reflexion model and map 6.
D.7 Case Study - Workplace - Participant 2 - Reflexion model and map 7.
D.8 Case Study - Workplace - Participant 2 - Reflexion model and map 8.
D.9 Case Study - Workplace - Participant 2 - Reflexion model and map 9.
D.10 Case Study - Workplace - Participant 2 - Reflexion model 10.
D.11 Case Study - Workplace - Participant 2 - Reflexion model and map 11.
E.1 Case Study - AIM - Participant 1 - Reflexion model and map 1.
E.2 Case Study - AIM - Participant 1 - Reflexion model and map 2.
E.3 Case Study - AIM - Participant 1 - Reflexion model and map 3.
E.4 Case Study - AIM - Participant 1 - Reflexion model and map 4.
E.5 Case Study - AIM - Participant 1 - Reflexion model and map 5.
E.6 Case Study - AIM - Participant 1 - Reflexion model and map 6.
E.7 Case Study - AIM - Participant 1 - Reflexion model and map 7.
E.8 Case Study - AIM - Participant 1 - Reflexion model and map 8.
E.9 Case Study - AIM - Participant 1 - Reflexion model and map 9.
E.10 Case Study - AIM - Participant 1 - Reflexion model and map 10.
F.1 Case Study - AIM - Participant 2 - Reflexion model 1.
F.2 Case Study - AIM - Participant 2 - Reflexion model 2.
F.3 Case Study - AIM - Participant 2 - Reflexion model 3.
F.4 Case Study - AIM - Participant 2 - Reflexion model 4.
F.5 Case Study - AIM - Participant 2 - Reflexion model 5.
F.6 Case Study - AIM - Participant 2 - Reflexion model 8.
F.7 Case Study - AIM - Participant 2 - Reflexion model 9.
G.1 "Transforms" component's interface on "Main." Provides interface.
G.2 "Transforms" component's interface on "GUI." Provides interface.

List of Tables

2.1 Motivations for software components described by four tiers.
2.2 Levels of maturity among CBD Technologies.
7.1 Features identified for the house application.
8.1 JIT/S Summary.
8.2 Particulars of the participants of the JIT/S study.
8.3 Features identified in JIT/S.
8.4 Summary of reuse perspective.
8.5 Workplace case study participant details.
9.1 Participant details.
9.2 Feature set examined by participant 1 during case study.
9.3 The set of features chosen by the second participant.

Part I

Literature Review

Chapter 1

Introduction

"Every new beginning comes from some other beginning's end." - Dan Wilson, Closing Time

1.1 Software Maintenance and Development

Despite advances in software development techniques, such as object-oriented programming (OOPSLA, 2006), aspect-oriented programming (Kiczales et al., 1997) and component-based development (Cheesman and Daniels, 2001), existing implementation abstractions are not used on a large scale to create new systems, and therefore the majority of software is still built from scratch (Greenfield et al., 2004). Software systems are projected to become ever more complex in the future (Corbi, 1989). This will require increased time and effort when developing new software systems (Voas, 1998; Meyer and Mingins, 1999; Zweben et al., 1995), making such development even less feasible in the future (Meyer and Mingins, 1999). Not only is it imperative that we begin developing software artifacts that are reusable to address this agenda (Voas, 1998; Meyer and Mingins, 1999; Washizaki et al., 2002), but it is also desirable with respect to software maintenance. Software artifacts that are reusable and replaceable (Cheesman and Daniels, 2001) will facilitate software evolution and help to curb the time spent on maintenance and evolution, an activity (Leintz and Swanson, 1980) that accounts for up to 80% of the entire development process (Bowen et al., 1993; Seaman, 2002).[1]

[1] It should be noted that a comprehensive survey of software maintenance has not been carried out since that of Lientz and Swanson in 1978 (Leintz and Swanson, 1980; Lientz and Swanson, 1978). However, the increasing complexity of software systems (Pressman, 2004) and the longevity of software systems such as Microsoft Word (Microsoft, 2006b) or Adobe Acrobat (Inc., 2006) suggest, if anything, that the proportion of effort consumed in manufacturing and evolving software systems has increased since their landmark study.

1.2 Software Reuse

Software reuse refers to the construction and evolution of software systems from existing software assets, rather than creating a new system from scratch (Krueger, 1992). The potential gains of reuse were first voiced at a NATO software engineering conference in 1968 as a means of curbing the unsustainable trend of creating increasingly larger systems in this way (Nauer and Randell, 1968). Furthermore, software solutions for a specific domain contain many valuable business rules that an organisation will find difficult to reimplement. Over time, these rules become embedded in a system. They are therefore no longer explicit to developers and maintainers, making it difficult to reuse these valuable assets in other systems within an organisation (Verhoef, 2000).

The promise of pragmatic software reuse has been partially realised. During the mid-1980s, the reuse of up to 40% of software within Japanese organisations was reported (Mii and Takeshita, 1993). Software process models, once optimally applied within an organisation, have also prompted increased amounts of reuse during software development (Institute, 1997; Richardson, 1999). However, reuse in this way is entirely dependent upon the institutionalised cultural norms within the organisation's development teams, rather than on a prescribed technological approach such as object orientation or structured programming, and such a high cultural standard is notoriously difficult to achieve (Richardson, 1999).

Other attempts at enabling future reuse on a larger scale have included approaches to software development such as object-orientation, and the availability of reuse libraries (see section 3.3) for common tasks such as string manipulation or lists. Neither of these, however, provides the sought-after panacea for software development. Object orientation has proved to be too fine grained for large scale reuse, while library reuse is often too generic to have a major impact on the domain specific software solutions produced within many organisations (Greenfield et al., 2004).

1.3 Reusable and Replaceable Software Through Component-based Development

Component-based development refers to the assembly of software from pre-packaged components (Cheesman and Daniels, 2001), and builds upon earlier development approaches such as object-oriented programming, procedural programming, fourth generation languages and aspect-oriented software development. Component-based development presents itself as a possible solution that can introduce reuse and replaceability to software development practice on a large scale. Components will often encapsulate much larger bodies of code than their object-oriented predecessors and lead to more easily understood software solutions (Zweben et al., 1995).

This approach to software development promises many benefits. Most notably, with respect to this thesis, it accommodates the reuse and replaceability of existing software during development and evolution at an encapsulated level of interaction (Parnas, 2002, 1971, 1972). For example, an organisation's business rules, realised within a component of one of its systems, may be reused by deploying that component to augment the functionality of another system. Conversely, if a business rule changes, a new component implementing the new rule could quickly replace the existing one.

1.4 Problem Statement: The Legacy Dilemma

The vision of large-scale reuse and replaceability using a modern development approach, such as component-based development, would seem to afford significant benefits in hiding complexity and facilitating software reuse. However, the vast majority of existing software was developed in accordance with older software development paradigms, such as structured programming or object-orientation (Kontogiannis et al., 1996; Johnson, 2002). These systems lessen the potential for reuse.

This thesis addresses the need to reuse portions of existing non-component-based software when implementing and evolving new or existing component-based systems within a domain vertical.

1.5 Reengineering Towards Components

Reengineering and maintenance research provides a plethora of techniques for analysing and altering existing systems. Attempts at recovering modules for reuse have existed since the early 1980s (Belady and Evangelisti, 1982; Hutchens and Basili, 1985; Livadas and Johnson, 1994), with the most comprehensive review on the subject to date found in (Koschke, 2000a). Automatic approaches, in isolation, have shown poor rates of success when recovering components (Koschke, 2000a). Hence, Koschke (Koschke, 1999) states that the best approach to component recovery should:

Aggregate several techniques: Research has shown that any single automated component recovery technique has a success rate no greater than 40%. Two or more techniques should therefore be used in combination to increase accuracy to acceptable levels.

Include the human in the recovery process: Human inference on data often provides insight that cannot be achieved algorithmically. Thus, a semi-automated recovery or encapsulation process can be greatly assisted by human insight.

Include domain knowledge: Domain knowledge can attribute semantics to the syntax of the program under analysis. Without meaningful semantics attached to the code under analysis, the identified component is often less useful. For example, semantic information such as "what is the purpose of the identified component?" is vital domain knowledge that can be provided by a software engineer during recovery.

This thesis is cognizant of these requirements when exploring existing reengineering and maintenance research to arrive at a proposed solution to the legacy dilemma stated in section 1.4.

1.6 Objectives and Contributions

In approaching component recovery, two distinct steps can be identified:

1. Encapsulating reusable code.
2. Wrapping the reusable code to conform with a component-based development process.

If step one is adequately performed then step two becomes trivial; thus the main focus of this thesis lies with the first step, with step two given a lighter treatment (Le Gear et al., 2004). Given this focus, the objectives of this thesis can be outlined as follows:

1. To produce a repeatable process for targeted component encapsulation that is driven by the software engineer, makes use of domain knowledge, is human-oriented and aggregates two or more appropriate techniques, based on a comprehensive review of the literature of the field.

2. To evaluate this process and its products in ecologically valid settings.

The key contributions of this thesis include:

1. Exploiting a previously unexplored variation on Software Reconnaissance that can be used to identify reuse in software systems. The view produced is called the reuse perspective.

2. Creating a tailored version of Reflexion Modelling designed specifically for component interface identification and component encapsulation.

3. The novel combination of the reuse perspective and the variation on Reflexion Modelling as a process for component recovery. This combination forms a component encapsulation process called Reconn-exion.

4. The positioning of Reconn-exion within existing literature.

5. An empirical evaluation of Reconn-exion, acquired from realistic industrial settings, that assesses the validity of both the process and its products, and suggests some future refinements to the process.

6. A repeatable process for component recovery.

1.7 Thesis Structure

The remainder of this thesis is structured as follows:

• Chapters 2, 3, 4 and 5 form a literature review relevant to Component Reconn-exion. The broad areas of software components and software reengineering are first discussed, followed by a more detailed look at two software analysis techniques: Software Reconnaissance and Software Reflexion Modelling.

• Chapter 6 explores research methods and, in particular, describes the research approach that will be adopted in this thesis.

• Chapter 7 presents the Component Reconn-exion process itself: the reuse perspective and a variation on Reflexion Modelling, combined into a process for component encapsulation.

• Chapters 8 and 9 follow with an evaluation of Component Reconn-exion. First, the two constituent parts of Reconn-exion, which are core contributions in themselves, are evaluated in industrial case studies. Then a large industrial case study is undertaken to evaluate the complete Reconn-exion process.

• Chapter 10 reflects on the previous three chapters, providing a critical evaluation in terms of the validity of the studies undertaken and the theoretical limitations of the technique.

• Chapter 11 suggests potential future work that could be undertaken to refine Reconn-exion and evaluate results to date, as well as suggesting potential expansions to Reconn-exion.

• Finally, in chapter 12 a conclusion to the thesis is provided.

Chapter 2

Software Components

"Architecture starts when you carefully put two bricks together. There it begins." - Ludwig Mies van der Rohe (German-born American architect, 1886-1969)

Though many definitions exist that describe components, no officially recognised standard definition of components that is sufficiently constrained exists at present (Hamlet, 2001). The word component has been used to describe procedures (Wilde and Scully, 1995), collections of reusable code (Mii and Takeshita, 1993), identified modules within systems (Cimitile and Visaggio, 1995; Girard and Koschke, 1997), class libraries (Zweben et al., 1995) and, more recently, black-box units of composition in software (Eddon, 1999; Ran et al., 2001). These are but a few understandings of components from a much wider list of contradictory viewpoints (Bachmann et al., 2000; McGurren, 2004; Szyperski, 2003; Johnson, 2002; Wang et al., 1999; Stevens and Pooley, 2000; Girard and Koschke, 1997; Chiricota et al., 2003; Cimitile and Visaggio, 1995; Cheesman and Daniels, 2001; Allen and Frost, 1998; Wilde and Scully, 1995; Ran et al., 2001; Mii and Takeshita, 1993).

More recently, Clemens Szyperski took a different approach to component definition (Szyperski, 2003). Rather than attempting to adopt a single interpretation that serves as a panacea, he instead categorised existing viewpoints on components into a four-tiered classification framework based upon their use and history of inception. This is summarised by table 2.1.

The first two tiers in table 2.1 describe reuse only. Terms that relate to more recent component-based development are introduced in tiers 3 and 4. The entries in this table are core issues of consideration when designing a state-of-the-art component (Woodman et al., 2001) and should be addressed when implementing component recovery. For example, the designer must decide if the system needs to be able to undergo dynamic alteration (Buckley et al., 2003), be deployable, or is likely to be regarded as reusable again. Szyperski refined this categorisation to describe four tiers of maturity among component technologies (table 2.2), defining where the elements of reuse are available and whether they can be introduced dynamically.

Table 2.1: Motivations for software components described by four tiers.

Tier 1 - Basic Reuse: A programmer informally reuses source as he writes a program, i.e. cut-and-paste code cloning.

Tier 2 - Advanced Reuse: Standard libraries of reusable source code are introduced and used, as-is, across a company or several companies.

Tier 3 - Deployable Components: Units of construction are built and deployed to software systems. Real-world examples include the JavaBean (Ran et al., 2001; Inc, 2005) and COM+ (Eddon, 1999) implementations of component-based development. They allow static composition only.

Tier 4 - Dynamic Components: The units of construction allow runtime alteration of their properties and the dynamic replacement of components in assembled systems. Examples may be found in (Sadd, 2003) and (Dowling et al., 1999).


Table 2.2: Levels of maturity among CBD technologies.

1. Maintainability: Modular solutions to software systems.
2. Internal Reuse: E.g. product lines within companies, for companies promoting CMM 5 development processes.
3.a.i. Closed Composition: Make and buy from a closed pool of organisations.
3.a.ii. Open Composition: Make and buy from the open market.
3.b. Closed Dynamic: Dynamically upgrade from restricted markets.
4. Open and Dynamic: Completely open and dynamic upgrades from a potentially unlimited pool of organisations.

The component technologies referred to in this report are concerned primarily with tier 3 of table 2.1 at maturity level 3.a.i., closed composition. That is, this thesis focuses on reuse identification and the replaceability of components from an organisation's existing systems. In doing so, it is likely that the reusable entities identified will be of most use in generating varietal systems for that organisation, or for organisations in a similar domain; alternatively, they could be used in evolving the organisation's current systems.

2.1 A Component Definition

By confining the definition of a component in this report to tier three of table 2.1, at a maturity level of 3.a.i., our ability to precisely define the nature of components relevant to this thesis is considerably clarified. Three core criteria are identified to characterise the nature of components:

1. A black-box implementation of some functionality (Bass et al., 2000; Bachmann et al., 2000; Wallnau, 2003).

2. May be reused "as-is" by a third-party consumer (Washizaki et al., 2002).

3. Conforms to some component model (Councill, 2001).

This definition can be further expanded to account for the maintenance life cycle in evolving systems by additionally describing components as units of versioning and replacement (Szyperski, 2003; Cheesman and Daniels, 2001). This definition should not be taken as firmly established yet. For example, debate surrounds the requirement that a component must be a black-box implementation (Cho et al., 2001). However, for the purposes of this thesis, we will consider these characteristics as defining components.

2.2 Interfaces

Core to components, and to the assembly of components into a component architecture, is the concept of an interface (Cheesman and Daniels, 2001; Lau, 2001). The IEEE describes an interface as,

"A shared boundary across which information is passed ... To connect two or more components for the purpose of passing information from one to the other." (IEEE, 1990)

This sixteen-year-old definition still carries weight in describing the underlying principles of interfaces. Expanding upon this, the shared boundary described is essentially a formalism for controlling dependencies between software implementations, where a software implementation could be an operating system's modules, class libraries or even procedure libraries (Bachmann et al., 2000). Beugnard et al. (Beugnard et al., 1999) describe how complete interface specifications can be constructed by categorising interface properties into four distinct levels (figure 2.1) (McGurren, 2004; Bachmann et al., 2000):

[Figure 2.1: Levels of interface specification adapted from Beugnard et al. (Beugnard et al., 1999): Level 1, Syntactic; Level 2, Behavioural; Level 3, Synchronisation (negotiable); Level 4, Quality-of-Service (dynamically negotiable).]

Syntactic Level: The format of method and function signatures, as prescribed by the grammar of a programming language, caters for this interface level. APIs already adequately cater for syntactic level interfaces.

Behavioural Level: A behavioural specification is a formal description of what should happen when a software artifact executes, and is often called a contract (Cicalese and Rotenstreich, 1999). Languages such as Eiffel (Eiffel Software, 2004) and OCL (Clark, 2002) support behavioural level interfaces.

Synchronisation Level: At this level, properties describing component synchronisation, mutual exclusion, atomicity and transactions are specified. Java already implements a lightweight version of synchronisation through its "synchronized" keyword.

Quality-of-Service Level: The previous three levels reason about properties that can be precisely defined. The quality-of-service level, however, is concerned with quantifying component properties such as "average response" and "quality of result." These are a measure of one's trust in a component (Councill, 2001) and are usually specified by third-party certification.
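To ground the first three levels, consider a small illustrative sketch in C++. This is not from the thesis: the BoundedStack class and its operations are invented for illustration. The method signature represents the syntactic level, the assertions approximate a behavioural contract, and the mutex supplies the synchronisation level; the quality-of-service level has no direct language construct and is noted only in a comment.

    // Illustrative only: a hypothetical bounded stack showing where the
    // first three interface levels can surface in code. The quality-of-
    // service level (level 4) is typically specified outside the language,
    // e.g. by third-party certification or measurement.
    #include <cassert>
    #include <cstddef>
    #include <mutex>
    #include <vector>

    class BoundedStack {
    public:
        explicit BoundedStack(std::size_t capacity) : capacity_(capacity) {}

        // Level 1 (syntactic): the signature alone tells a caller how to
        // invoke push, but nothing about what push promises to do.
        void push(int value) {
            std::lock_guard<std::mutex> lock(mutex_); // level 3: mutual exclusion
            assert(items_.size() < capacity_);        // level 2: precondition
            items_.push_back(value);
            assert(!items_.empty());                  // level 2: postcondition
        }

        // Level 4 (quality-of-service) would state something like "push
        // completes in under 1 ms on average" - a measured, certified
        // property rather than a language construct.

    private:
        std::size_t capacity_;
        std::vector<int> items_;
        std::mutex mutex_;
    };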

2.3 Introducing Component-Based Development

In section 1.3 component-based development was specifically cited as a means of curbing the problems associated with monolithic software development by explicitly placing reuse at the core of the process. Component-based development can be simply defined as

"... the building of software systems out of prepackaged generic elements." (Meyer and Mingins, 1999)

"... [this] involves the technical steps for designing and implementing software components, assembling systems from pre-built software components, and deploying assembled systems into their target environments." (Bass et al., 2000)

A graphical description of the component architectural style, adapted from (Bachmann et al., 2000), can be seen in figure 2.2. It includes (Bachmann et al., 2000):

1. Components - these form the building blocks of the system.

2. Clearly defined interfaces on the components to describe the services that the component offers.

3. The components assembled in accordance with clearly defined contracts that describe the interaction between component instances.

4. Multiple instances of component types, which describe families of component instances in the same way that an object is an instance of a class.

5. Each instance of these component types can be deployed either statically or dynamically, forming a component-based piece of software. A statically deployed component is deployed at implementation time; a dynamically deployed component is deployed at runtime.

6. The combination of component types, their interfaces and an explicit description of their valid patterns of interaction forms a component model.

7. The component model is supported by a component framework. A framework will consist of a set of supporting services and other components that are useful, and sometimes necessary, in building the component-based application.

8. The component framework provides an array of runtime services that enforce the component model.
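As a rough sketch of items 1, 2 and 5 above (illustrative only: the ILogger and FileLogger names are invented here, and no particular component model is implied), a component type in C++ might expose its services solely through a clearly defined interface, so that consumers never depend on the concrete implementation:

    // Hypothetical sketch of a component type behind an explicit interface.
    // The names ILogger and FileLogger are invented for illustration; a real
    // component model (e.g. COM+ or JavaBeans) would add contracts, a
    // framework and runtime services around this basic shape.
    #include <iostream>
    #include <memory>
    #include <string>

    // A clearly defined interface describing the services the component offers.
    class ILogger {
    public:
        virtual ~ILogger() = default;
        virtual void log(const std::string& message) = 0;
    };

    // A component type implementing that interface; consumers depend only
    // on ILogger and never on this concrete type.
    class FileLogger : public ILogger {
    public:
        void log(const std::string& message) override {
            std::cout << "[file] " << message << '\n'; // stand-in for file I/O
        }
    };

    int main() {
        // An instance of the component type "deployed" (here simply
        // instantiated); any other ILogger could be substituted without
        // affecting this caller.
        std::unique_ptr<ILogger> logger = std::make_unique<FileLogger>();
        logger->log("component instance in use");
        return 0;
    }

Swapping FileLogger for another implementation of ILogger leaves the caller untouched, which is the replaceability property this architectural style is designed to deliver.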

[Figure 2.2: A Generic Component Architecture adapted from Bachmann et al. (Bachmann et al., 2000), showing components, interfaces, contracts, component types, deployed instances, the component model, the component framework and runtime services.]

With this architecture comes the potential to take software development from a practiced craft to a fully fledged engineering discipline (Johnson, 2002; Whittaker and Voas, 2002) that includes the predictable assembly (Wallnau, 2003) of software systems. This component-based software engineering approach can be defined as

"... the practices needed to perform [component-based development] in a repeatable way to build systems that have predictable properties." (Bass et al., 2000)

While this "holy grail" is yet to be achieved, positive research towards component-based software engineering (Wallnau, 2003; Cheesman and Daniels, 2001; Hamlet, 2001), plus software support for component-based development principles (Eddon, 1999; Ran et al., 2001; Sadd, 2003), is emerging. Two examples are Progress Dynamics (Sadd, 2003) and the .NET framework (Microsoft, 2006a).

2.3.1 A Component-Based Development Process

Several subtly different approaches to component-based development exist (Allen and Frost, 1998; Cheesman and Daniels, 2001; Wallnau, 2003), and these are often supported by existing component technologies (Ran et al., 2001; Eddon, 1999; Progress Software, 2003). Here, Cheesman and Daniels' process of specifying component-based software is discussed to contextualise the research (Cheesman and Daniels, 2001). Built upon the UML notation, the process is portable to a wide variety of platforms and component technologies.[2] All projects follow two processes simultaneously: a management process with a subservient development process (Cheesman and Daniels, 2001, page 25). The management process accounts for time constraints and the setting of milestones and goals.

[2] Cheesman and Daniels also extend the current UML notation to explicitly handle components. However, the focus of this discussion is on the process.

[Figure 2.3: The component-based development process, adapted from (Cheesman and Daniels, 2001): the Requirements, Specification, Provisioning, Assembly, Test and Deployment workflows, linking business requirements, use case models, business concept models, existing assets, technical constraints, components and applications.]

The development process, which we are concerned with, creates working software from requirements. Figure 2.3 describes Cheesman and Daniels' component-based development process. The process is driven by five workflows, with a workflow being a sequence of actions that produce an output of observable value (Kruchten, 1999):

Requirements: The requirements of the system are gathered and organised in a useful way. Two new artifacts are output by this workflow: the business concept model and the use case model. The business concept model is a conceptual model of the business domain that provides a common vocabulary to be used by software engineers and project managers in relation to the project.

The use case model is a set of use cases describing all identified functional requirements of the system.

Specification: A complete set of requirements, a business concept model and the set of use case models are taken as input and combined with other existing information regarding software assets. This extra information could include existing documentation, designs, or recovered or existing software components. These are used to produce a complete set of component specifications and a component architecture as output. The component specifications describe, in detail, what component types will be required. The component architecture shows how instances of these types will interact. The specification workflow can be subdivided further into three major tasks:

Component identification: Taking the business concept model and the use case model as input, the component identification stage identifies an initial set of component interfaces and an architecture.

Component interaction: This stage examines how the system's operations will be achieved using the identified component architecture, thus refining the output of the component identification task.

Component specification: Detailed specifications for components are created, along with an interface information model artifact. The interface information model describes operations, states and constraints that are enforced on the component.

Provisioning: The component specifications and architecture, taken as input, are used to determine the components that are available, the ones that must be built and the ones that must be bought. It is the job of the provisioning workflow to make the required components available for subsequent workflows. The reuse of components is explicitly catered for here, as can be seen in figure 2.4.

Furthermore, the reuse is not confined to components, and can include any existing software assets. The potential of reengineering towards components from existing legacy applications, to supplement the provisioning workflow, is the focus of this thesis.

Assembly: The components, a suitable user interface and existing software assets, such as recovered components or components from a repository, are combined to form an application.

Testing and Deployment: During this workflow, standard testing and roll-out of the new application occur. Individual components will be unit tested and the entire assembly will be functionally tested.

2.4 The Legacy Dilemma Revisited

The previous section introduced the concepts of Cheesman and Daniels' component-based development process. In particular, it is suggested as a means of introducing the widespread reuse of software across systems. However, component-based development remains a relatively new concept. This implies that the majority of existing software is written using different, or even obsolete, development paradigms. This existing source code should somehow be exploited for reuse in modern component-based development processes, since it is prohibitively difficult to reimplement the implicit business rules in the existing system (Verhoef, 2000). The provisioning workflow (figure 2.3), whose task it is to make components available for subsequent development, caters explicitly for just such exploitation. Figure 2.4 expands the provisioning workflow presented in figure 2.3, showing that components may be acquired from three sources:

• Components may be bought from external vendors.

• Components may be built.

• Components may be reused from a repository of existing components.

[Figure 2.4: A detailed description of the provisioning workflow: bought components, built components and existing components, the latter drawn from a repository fed by leveraging existing components and by reengineering towards components.]

Of particular interest is the ability to take existing software components from a repository for reuse. This repository may be established in advance, particularly as part of a development philosophy such as product line software development (Bergey et al., 2000; Simon and Eisenbarth, 2002; Eisenbarth and Simon, 2001). Alternatively, this repository could be populated by recovering components from existing software; that is, legacy source code reengineered towards components can populate the repository. Reengineering legacy systems towards components for reuse in existing or new systems is the primary concern of this thesis.

2.5 Encapsulation

Encapsulation is a means of reducing interdependence between parts of a software system (Snyder, 1986). By applying encapsulation to portions of software appropriately, increased ease of development can be afforded to software engineers (Zweben et al., 1995). This section explores the evolution of encapsulation in software development, describes encapsulation in detail, discusses the core quality measures for encapsulation (coupling and cohesion), and finally discusses why component-based development is yet another improved means of development, one that better supports encapsulation in software.

2.5.1 A Brief History of Encapsulation in Software Development

2.5.1.1 Monitors

During the early 1970s Hoare introduced the concept of monitors as a means of controlling access to procedures and local data in a running program (Hoare, 1974). A monitor can be declared according to the following template:

    monitorname: monitor
    begin
        ... declarations of data local to the monitor
        ... procedure declarations
    end

The monitor construct would allow any number of processes in the operating system to request access to the monitor source code. However, no more than one process would ever be allowed to execute the source code, or access the local data, of the monitor at any particular time. In this fashion a monitor achieves process encapsulation by grouping operations and data that should only be executed together.

The grouping of data and procedures would be referred to simply by the monitor name, hence achieving the desired abstraction effect afforded by encapsulation.
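Hoare's notation predates today's mainstream languages, but the same discipline can be sketched in modern C++. The following is an illustration under that reading, not Hoare's own construct: the data is private, every public operation acquires the monitor's lock, and so at most one thread executes inside the monitor at a time. Condition synchronisation (Hoare's wait and signal operations) is omitted for brevity.

    // A monitor-style class sketched in C++ (an illustration, not Hoare's
    // notation): data local to the monitor and the procedures that act on
    // it are grouped behind one name, and a mutex ensures that at most one
    // thread executes inside the monitor at any particular time.
    #include <mutex>

    class Counter {                       // the "monitorname"
    public:
        void increment() {                // a monitor procedure
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }

        int value() {                     // another monitor procedure
            std::lock_guard<std::mutex> lock(mutex_);
            return count_;
        }

    private:
        std::mutex mutex_;                // enforces one-at-a-time execution
        int count_ = 0;                   // data local to the monitor
    };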

2.5.1.2 Information Hiding

At roughly the same time as Hoare published his work on monitors (Hoare, 1974), Parnas had begun to introduce the concept of information hiding (Parnas, 1971, 1972, 2002), first presented in (Parnas, 1972). Information hiding, similar to the process encapsulation afforded by Hoare's monitors, advocates hiding portions of a program's data, and the operations associated with that data, behind a defined interface (Parnas, 2002, 1972). Unlike Hoare's monitors, information hiding advocates encapsulation and abstraction of the static structure of the program, rather than encapsulation of the program in terms of the running processes of the operating system. The principles of information hiding provide the necessary basis for dividing a software system into modules, hiding the complexity of the system and interacting through well defined interfaces (Wikipedia, 2006b). This form of encapsulation would eventually form the basis for mainstream software development and design.
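As a minimal sketch of Parnas-style information hiding (an invented example, not drawn from the thesis), a module can expose a small set of operations while the data structure behind them remains a hidden "secret" of one translation unit:

    // symbol_table.cpp - a Parnas-style module sketched in C++ (an invented
    // example). Clients see only the two functions below, e.g. via a header;
    // the map that implements them is a hidden "secret" of this translation
    // unit and could be replaced without touching any client code.
    #include <map>
    #include <string>

    namespace {
        std::map<std::string, int> table; // the hidden data representation
    }

    // The defined interface behind which the data is hidden.
    void define(const std::string& name, int value) {
        table[name] = value;
    }

    int lookup(const std::string& name) {
        auto it = table.find(name);
        return it == table.end() ? -1 : it->second; // -1 assumed to mean "not found"
    }

Because clients can name only define and lookup, the map could later be replaced by, say, a hash table without any client code changing; this changeability is the core of Parnas's argument for information hiding.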

2.5 Encapsulation

27

Zweben et al., 1995) and the need to promote these concepts during software development, object oriented languages, with their explicit use of data encapsulation, began to rise in popularity. By the early 1990’s object oriented programming had gained widespread acceptance in software development and has been shown to provide significant benefits in ease and manageability of development (Zweben et al., 1995). Object oriented languages achieve better encapsulation over their non-object oriented counterparts by providing several key language concepts to the programmer: • The ability to group related operations and data using the class construct. • The ability to limit access to methods and data to a given scope. This is achieved using keywords such as public, private and protected. • The ability to abstract over related class types using inheritance and polymorphism. One possible interpretation of encapsulation is that its purpose is to protect portions of a system from operations and data that are irrelevant to those portions. In the non-object oriented code example in figure 2.5 no such protection is put in place in the program. All four procedures potentially have access to all the data of the code fragment, in spite of the fact that “procedure1” and “procedure2” only access variables r, s, t, and u and “procedure3” and “procedure4” only access v, w, x, y and z. A clear division of data and the operations that act over that data exists in the code fragment, however no first class entity of the programming language exists that makes explicit or enforces this encapsulation of data is available. Furthermore, within our two divisions of the code fragment, “procedure2” and “procedure4” are only ever accessed via “procedure1” and “procedure3” respectively and never from “main”. Again, as the fragment currently stands the potential to access “procedure2” and “procedure4” from “main” does exist. The language offers no means of encapsulation that would “hide” access to “procedure2” and “procedure4” from “main”.


int r, s, t, u, v, w, x, y, z;

void main() {
    procedure1();
    procedure3();
}

void procedure1() {
    r = s + t;
    procedure2();
}

void procedure2() {
    u = r + s + t;
}

void procedure3() {
    v = w + x;
    procedure4();
}

void procedure4() {
    x = y;
    y = v;
    z++;
    z += v;
}

Figure 2.5: A code fragment from a C program.


Object oriented languages provide these needed language constructs. The code example in figure 2.7 is a revised version of the code example in figure 2.5 that makes appropriate use of the object oriented language's class and scoping constructs, in C++. Using the class construct, the separate variable groupings mentioned above have been placed into separate classes and grouped with the procedures that operate on those variables. The variables of these classes have been scoped as private, since we wish to encapsulate this data within the scope of the class and deny access to the data from outside the class. Likewise, one of the procedures within each of the classes has been marked as "private" ("procedure2" and "procedure4"), because no procedures outside of their respective classes access these procedures. Notice, in particular, how even tighter scoping can be achieved by declaring certain variables as local variables within the methods. In this case u is declared locally in "procedure2", w is declared locally in "procedure3", and y and z are declared locally in "procedure4". This is because those variables are used exclusively by the procedures in which they are now declared. The classes themselves, and the procedures within them to which we wish to provide access, are marked with the "public" modifier. This allows access to these elements program wide. The result of these measures is a reduction in the list of operations and data that classes and procedures can access, through information hiding. This encapsulation results in a reduction in complexity when creating and maintaining the application, by separating the concerns of the program into explicitly scoped groupings of data and associated operations, accessible only through clearly defined interfaces. In figure 2.6 a visualisation of this encapsulation is shown to help clarify what has been achieved. Further encapsulation benefits are provided by object oriented languages through effective use of the object oriented concepts of "inheritance" and "polymorphism". Inheritance allows one to define a hierarchy of class types in a program. The inheriting type will inherit the characteristics (data and operations) of the type it inherits from. Take the "Animal" inheritance hierarchy in figure 2.8, which is visualised in figure 2.9,


Figure 2.6: A visualisation of the encapsulation exercise shown in figure 2.7.

this time using Java syntax (Sun Microsystems, 2006). The hierarchy describes a set of animals that share some characteristics, and become more specialised as we descend the hierarchy. Carefully note the use of the "protected" modifier. "private", as we saw, limits access to the enclosing class. "protected", however, limits access to the enclosing class and any classes that inherit from that class. Classes outside of the inheritance hierarchy will still have no access to the protected members of the class. Polymorphism is a feature of object orientation that operates on inheritance hierarchies and provides the ability to treat a derived class just like its parent class, sometimes to the extent that the derived class' use becomes invisible to the programmer. This encapsulation effect is demonstrated in the code example in figure 2.10. The code example models a scenario where an animal is caught in the "Wild", brought to a "Clinic" to be treated, and then put into captivity in a "Zoo". This example makes use of the inheritance hierarchy in figure 2.8. Notice how the "capture" method will capture a specific type of


void main() {
    MyClass1 cl1 = new MyClass1();
    MyClass2 cl2 = new MyClass2();
    cl1.procedure1();
    cl2.procedure3();
}

public class MyClass1 {
    private int r, s, t;

    public void procedure1() {
        r = s + t;
        procedure2();
    }

    private void procedure2() {
        int u;
        u = r + s + t;
    }
}

public class MyClass2 {
    private int v, x;

    public void procedure3() {
        int w;
        v = w + x;
        procedure4();
    }

    private void procedure4() {
        int y, z;
        x = y;
        y = v;
        z++;
        z += v;
    }
}

Figure 2.7: A revised version of the code fragment in figure 2.5.


class Animal {
    protected int morale = 0;

    public void raiseMorale() {
        morale++;
    }

    public void decreaseMorale() {
        morale--;
    }
}

class Biped extends Animal { }
class Quadruped extends Animal { }
class Monkey extends Biped { }
class Orangutan extends Biped { }
class Dog extends Quadruped { }
class Cat extends Quadruped { }

Figure 2.8: An inheritance hierarchy code sample.

Figure 2.9: A visualisation of the inheritance hierarchy in figure 2.8.


animal depending on the circumstances. However, the type of the animal is not of any concern to the "Clinic" class, as the clinic will treat any type of animal and place it in the "Zoo". Using polymorphism, this form of information hiding can be achieved in the example. All operations in the "Clinic" occur on the type "Animal", and the "Clinic" class remains agnostic to the actual type of the instance it is dealing with. In this way encapsulation over an inheritance hierarchy can be achieved, shielding, where possible, portions of the program from the complexities of the type hierarchies.

2.5.2 Coupling and Cohesion

By the late 1970s researchers had begun to arrive at a consensus regarding the merits of encapsulation and abstraction during software development and design. The focus next began to shift to how to assess the quality of encapsulation. This led to two commonly accepted measures of encapsulation quality - coupling and cohesion. These indicators of "good" design were conceived over thirty years ago (Stephens et al., 1974). Coupling is the degree of interdependence between components or modules, and cohesion is the extent to which a component or module's individual parts are needed to perform the same task (Yourdon and Constantine, 1979). Low coupling and high cohesion often indicate a more replaceable (and reusable) component and, by measuring coupling and cohesion, we can get an indirect measure of replaceability and reusability. We define and measure coupling between two modules in terms of the type and degree of communication between them (Fenton, 1991). Figure 2.11 is an example of a recovered component ("Transforms") from later in the thesis. Notice the high number of internal connections within the components relative to the interconnections between the components. In line with the use of encapsulation suggested in the previous section (section 2.5.1.3), it could be said that this component is well encapsulated, since many calls that are irrelevant to clients are encapsulated in the component, with a minimised number of calls between the components.


public class Zoo {
    public void incarcerate(Animal animal) {
        if (animal instanceof Dog) {
            Dog dog = (Dog) animal;
            dog.raiseMorale();
        } else {
            Monkey monkey = (Monkey) animal;
            monkey.decreaseMorale();
        }
    }
}

public class Wild {
    String loc;

    Wild(String location) {
        loc = location;
    }

    public Animal capture() {
        if (loc.equals("Europe"))
            return new Dog();
        else
            return new Monkey();
    }
}

public class Clinic {
    public static void main(String[] args) {
        Wild theWild = new Wild("Europe");
        Zoo theZoo = new Zoo();
        Animal capturedAnimal = theWild.capture();
        treatAnimal(capturedAnimal);
        theZoo.incarcerate(capturedAnimal);
    }

    static void treatAnimal(Animal theAnimal) {
        theAnimal.raiseMorale();
    }
}

Figure 2.10: Using polymorphism for encapsulation code example.


Figure 2.11: An example of a loosely coupled and highly cohesive component.

Correspondingly, these components are loosely coupled due to the low interdependence between the components, and they display high cohesion by virtue of the fact that there is a large number of internal, hidden calls relative to the inter-component calls.

Fenton (Fenton, 1991) describes a classification of six types of coupling between two modules x and y, which can be arranged by increasing strength (0-5) and are based upon the type of communication between the two modules:

• 0: x and y have no communication; that is, they are totally independent of one another.

• 1: x and y communicate by parameters, where each parameter is either a single data element or a homogeneous set of data items that incorporate no control element. This type of coupling is necessary for any meaningful communication between modules.

• 2: x and y accept the same record type as a parameter. This type of coupling may cause interdependency between otherwise unrelated modules.


• 3: x passes a parameter to y with the intention of controlling its behavior; that is, the parameter is a flag.

• 4: x and y refer to the same global data. This type of coupling is undesirable; if the format of the global data must be changed, then all common coupled modules must also be changed.

• 5: x refers to the inside of y; that is, it branches into, changes data in, or alters a statement in y.

He continues to provide a calculable metric for coupling,

c(x, y) = i + n / (n + 1)

where c is the coupling between two modules x and y, i is the level of coupling on the six part scale and n is the number of interconnections between x and y (a short illustrative sketch of this metric follows the cohesion scale below). To measure cohesion, Yourdon and Constantine (Yourdon and Constantine, 1979) provided a seven point scale of decreasing cohesion. Functional cohesion, where the module performs a single well defined function, is the best; subsequent items are presented here in terms of decreasing cohesion:

• Functional: the module performs a single well defined function.

• Sequential: the module performs more than one function, but they occur in an order prescribed in the specification.

• Communicational: the module performs multiple functions, but all on the same body of data (which is not organised as a single type or structure).

• Procedural: the module performs more than one function, and they are related only to a common procedure of the software.


• Temporal: the module performs more than one function and they are only related by the fact that they must occur within the same timespan.

• Logical: the module performs more than one function, and they are related only logically.

• Coincidental: the module performs more than one function and they are unrelated.
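To make Fenton's coupling metric concrete, the following is a minimal illustrative sketch (mine, not from the thesis; the names are hypothetical):

// Fenton's coupling metric: c(x, y) = i + n / (n + 1), where i is the
// strength level (0-5) of the communication between modules x and y and
// n is the number of interconnections between them.
public final class CouplingMetric {

    static double coupling(int i, int n) {
        if (i < 0 || i > 5 || n < 0) {
            throw new IllegalArgumentException("i must be in 0-5 and n >= 0");
        }
        return i + (double) n / (n + 1);
    }

    public static void main(String[] args) {
        // Two modules sharing global data (level 4) over 3 interconnections:
        System.out.println(coupling(4, 3)); // prints 4.75
    }
}

Note how the n / (n + 1) term keeps the contribution of the interconnection count below 1, so the qualitative level i always dominates the score.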

2.5.3 Encapsulation Features of Component-Based Development

As stated earlier in this chapter, software components and component-based development claim to provide a better means of software development than the current state of the practice in the software industry. Software components are intended to build upon existing object oriented technologies (Meyer and Mingins, 1999) by adding to and improving the means of encapsulation during development (Meyer and Mingins, 1999; Cheesman and Daniels, 2001). A software component may constitute any number of classes; thus, encapsulation can be implemented on a much larger scale. This is an important feature of encapsulation that becomes desirable as an application, or the domain being modelled, grows large. Take, for example, the classes in the diagram in figure 2.12. The edges in the diagram represent dependencies in the program between the classes. After a brief examination of the diagram you will notice that two distinct groupings of classes exist (d, e, f, g, h, u, v and w, x, y, z, a, b, c) and that all communication between these two groupings occurs via only three classes (u, v, w). Similar to the problem posed in figure 2.5 in the previous section, a mechanism for aggregating classes and hiding complexity through encapsulation would be useful (see figure 2.13). Software components provide this explicit construct as a first class entity. Unlike packages in Java, for example, which may also be considered in this light, typical component technologies provide one or several explicit,


Figure 2.12: Many classes without an encapsulation policy.

localised interfaces that declare the public services of the component, thus increasing encapsulation. When we encapsulate the two groupings into components, as in figure 2.13, the interface of the components (the classes public with respect to a component) becomes u, v and w. By making explicit which classes are public and private to a component, one removes the potential for breaking the desired encapsulation. Earlier, in section 2.3, it was noted how a component framework offers an array of runtime services to the programmer. One such service is an event-based model of programming. In such a model, client code may register with a component to listen for a specific event that occurs in the component. When such an event is fired, the client code can respond to it with the invocation of a specified procedure. Figure 2.14 shows a sample C#.Net code fragment (Microsoft, 2006a) that demonstrates client code registering to listen for a "ComponentShutDown" event. In this circumstance, when the component is shut down, the component ("Component") will raise an event. Because the client code has registered to listen for this event, it will notice that the event has occurred and respond to it by invoking its own procedure ("ClientProcedure"). Better encapsulation of the state of the component is achieved by hiding more internal data of the component using the event based model.



Figure 2.13: Many classes from figure 2.12 encapsulated by a component.

public class ClientCode {
    ClientCode() {
        // Registers the client to listen for an event on the component.
        Component.ComponentShutDownEvent += new EventHandler(ClientProcedure);
    }

    public void ClientProcedure(object sender, EventArgs e) {
        // Operations that respond to the event go here.
    }
}

Figure 2.14: Event handling code sample.

The alternative to using such a model would have been to have the client source code continually poll the component to see if there has been a change of state in the component. Instead, with the event based model, the client code becomes a passive entity and the relationship between client and component becomes inverted. The state of the component is no longer a concern of the client code until it is informed of an event. The onus is on the component to provide clients with notification of an event, and information about that event (including changes in state), via the "EventArgs e" argument passed by the component to the client code in figure 2.14.


Figure 2.15: A deployment diagram of two distributed components.

A form of geographic or topological encapsulation is also provided by a component framework. Component frameworks such as J2EE or .Net provide a mechanism for components2 (Microsoft, 2006a; Inc, 2005) on different machines, which could potentially be in very different geographic locations, to register with a component framework. This makes the component available for use in a distributed fashion. However, calls to these components may be made as though they were on the same machine. Take, for example, the two components on different machines shown in the deployment diagram in figure 2.15. Once properly registered with the framework, a method call between the two components could potentially be as simple as shown in figure 2.16. In this way, information regarding the location of components, and the information required to reach these components over a network, is encapsulated by the component and the framework, and completely hidden from clients of that component. Also noted in section 2.3 was that components can communicate with each other using clearly defined interfaces. The intention here is that the only knowledge we have about a component should be through its interface and that all other information about the component should be encapsulated, including the original implementation language of that component.

2 The specific names for these types of component are Enterprise JavaBeans (J2EE) and Web Services (.Net).


// Component1 definition
...
// definition of a class within the component
public class MyClass {
    public void someMethod() {
        Component2.Component2Class component2Class = new Component2.Component2Class();
        component2Class.component2Method();
    }
}
...
// remainder of component

Figure 2.16: A call between the distributed components shown in figure 2.15.

This suggests that a component-based system could potentially be composed of many components written in many different languages. The .Net component framework, for example, supports the definition of components written in over 50 different languages (Ritchie, 2006).

Chapter 3

Software Reengineering

"It is the neglect of timely repair that makes rebuilding necessary." - Richard Whately

3.1 Agendas for Reengineering Software Systems

When software needs to evolve to prolong its lifetime, software development teams may have only three choices:

1. Purchase a new system.

2. Develop a new system.

3. Leverage the existing system.

The third choice is often the only feasible option, since the former two routes are generally too expensive (Rochester and Douglass, 1991). As a result, a large body of research has been produced in the areas of reengineering, maintaining and leveraging existing systems. Reengineering is a subset of software maintenance, specifically directed at leveraging existing systems. Several definitions of reengineering exist (Chikofsky and Cross II, 1990; Arnold, 1993; Corp., 1989). These definitions differ only in allowing or disallowing the behaviour of a system to be altered as a result of applying a reengineering technique. We will use the widely accepted Chikofsky and Cross (Chikofsky and Cross II, 1990) definition of reengineering:

"... the examination and alteration of a subject system to reconstitute it in a new form and the subsequent implementation of the new form."

Thus, we can view reengineering as an extension of maintenance where the new form is an evolved version of the system (Tilley et al., 1994; Leintz and Swanson, 1980). This definition does not explicitly exclude alteration of the system's behaviour (Arnold, 1993); however, it remains ambiguous on this point. Many fields of research exist within the category of Software Reengineering and Maintenance:


• Software Comprehension (O'Brien and Buckley, 2001).

• Design Recovery (Biggerstaff, 1989).
  – Architectural Recovery (Aldrich et al., 2002).
  – Component Recovery (Koschke, 2000a).

• Refactoring and Restructuring (Chikofsky and Cross II, 1990).
  – Language Transformations (Terekhov, 2000).
  – Rearchitecting (Fowler et al., 1997).
  – Wrapping (Aldrich et al., 2002).

• Data Analysis.
  – Slicing (Weiser, 1982).
  – Control flow analysis (Urschler, 1975).
  – Normalisation (Connolly and Begg, 2004).

• Reuse Identification.
  – Clone Detection (Baxter et al., 1998).
  – Frequency Spectrum Analysis (Ball, 1999).
  – Fan-in analysis (Fenton, 1991).

The following sections in this chapter focus only on the relevant topics from software reengineering and maintenance that are applicable to the objectives of this thesis, namely component recovery, reuse identification, and dynamic and static analysis.

3.2 Dynamic versus Static Analysis

In reengineering, some form of analysis of the software artifact must be undertaken. This analysis can derive information such as call relations, data flows or metrics of complexity, some of which may be necessary before reconstituting the system in a new form. Techniques employed to analyse software can be broadly categorised as static and dynamic (Tip, 1995; Ritsch and Sneed, 1993). The difference between the two lies in the distinction between programs and processes1. A program is a static representation and is characterized by source code. A process is an instance of that program executing, and is dynamic. The scenario is analogous to a recipe and the baking of a cake (O'Gorman, 2001); the recipe being the program and the baking of it being the process. Thus, a static analysis will present information based upon the source representation of the system, while a dynamic analysis will glean its information from the source's execution at runtime. This runtime information is typically retrieved in the form of a coverage profile or program trace (Ball, 1999) using a form of software instrumentation (Wilde, 1998). Deciding which approach to employ is a matter of context. For example, consider control or data flow analysis. In the case of a static analysis the resulting data set can be program wide. However, this can be problematic where programs are large, yielding a massive data set after analysis. Attempts to identify software components within legacy software, for the purpose of extraction or modernization, are well documented in the reengineering literature, with varying degrees of success (Riva and Deursen, 2001; Johnson, 2002; Cimitile and Visaggio, 1995; Girard and Koschke, 1997; Quilici, 1993; Eisenbarth, Koschke and Simon, 2001; Zhao et al., 2004). With the exception of a few solutions, such as the concept analysis based feature location (Koschke, 2004) described by Eisenbarth et al. (Eisenbarth, Koschke and Simon, 2001), most rely heavily upon static analyses and utilise little or

1 This is the operating system notion of a process.


no information gleaned from the dynamic execution of the software. However, static and dynamic approaches may be viewed as complementary when analysing software (Ball, 1999). Techniques that exclude dynamic analyses deny access to key information regarding (Ritsch and Sneed, 1993):

1. The software elements that are used, and those that are not, for given execution scenarios.

2. Performance information.

3. Relationships between code and particular business transactions.

4. Sequence of execution.

The first and third points are particularly relevant to this thesis' core agenda - reengineering towards components - as they indicate that it is possible to relate source code to a prescribed execution scenario (often realised by a test case) and then to further relate an execution scenario to the business transaction it instantiates. This offers the potential to identify the implementations of behaviours of interest during the targeting phase of the component recovery process.

3.3 Reuse Identification Techniques

As identified for component-based development in the previous chapter, software reuse is a core concern for a software engineer. The software reengineering and maintenance literature provides us with several means of identifying reuse in software systems. Several types of reuse exist; based on the review of reuse undertaken here, two broad categories appear to emerge:

1. Reuse internal to a system.


2. Reuse across systems.

The latter is probably the most familiar type of reuse and realises the "Software Reuse" approach to software development (Nauer and Randell, 1968), defined as:

"the process of creating software from existing software rather than building software systems from scratch." (Krueger, 1992)

This type of reuse can be realised in any number of ways, including component deployment from repositories and the use of libraries in the form of header files or web services (Priéto-Diáz, 1991). The principles of component-based development are intended to foster this type of reuse (Cheesman and Daniels, 2001). Reuse internal to a system can exist in several forms, which are identifiable by their detection techniques, as illustrated in the following sections.

3.3.1 Clone Detection

A software clone is duplicated code within a system (Baxter et al., 1998). Typically, between 5% and 10% of an application consists of code clones (Baxter et al., 1998). Clones in a system tend to be viewed as a maintenance risk, since a change to a cloned piece of code may require changes to the other clones of that piece of code that are not immediately obvious to the maintainer. This is particularly true given that identifying clones in a system is not straightforward, since subtle changes to the piece of code being cloned may have occurred in the cloning process. As a result, several algorithms have proliferated that attempt clone detection in software (Baxter et al., 1998; Baker, 1997; Johnson, 1994). These algorithms work off the source code text, or the abstract syntax tree of the partially compiled program, to identify its clones. It is also worth noting that the existence of a clone is not always bad and may indicate to the software engineer portions of code that are highly reusable, since a clone is the explicit


reuse, by a programmer, of some implementation abstraction (Johnson, 1994). Clones will not be made apparent to the user if a dynamic analysis approach such as Software Reconnaissance is used. For example, if code that implements logging is cloned in a system and the logging feature is traced using a test case, only one example of the duplicated logging code will be captured.

3.3.2 Fan-in Analysis

Calls to a procedure from various parts of a system demonstrate another type of reuse, known as fan-in. A fan-in analysis determines the number of incoming calls for a procedure or class (Fenton, 1991). Fan-in analysis can also provide other valuable insights into a system, including the identification of aspects, since procedures called from many diverse locations can indicate the presence of aspects (Marin et al., 2004). However, while fan-in is useful in identifying direct procedural reuse, the reuse is not shown to be associated with a particular domain feature set, as it is with the reuse perspective defined in this thesis.
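As a simple illustration of the analysis (a sketch of mine with hypothetical names, not from the thesis), fan-in can be computed directly from call-graph edges:

import java.util.*;

// Given call edges (caller -> set of callees), the fan-in of a procedure
// is the number of distinct callers that reference it.
class FanInAnalysis {
    static Map<String, Integer> fanIn(Map<String, Set<String>> calls) {
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> callees : calls.values()) {
            for (String callee : callees) {
                counts.merge(callee, 1, Integer::sum); // one per distinct caller
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> calls = new HashMap<>();
        calls.put("main", Set.of("log", "parse"));
        calls.put("parse", Set.of("log"));
        System.out.println(fanIn(calls)); // "log" has fan-in 2: a reuse candidate
    }
}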

3.3.3 Frequency Spectrum Analysis

Execution traces can be used to determine how often a software element is used for a given run of the program. Measuring this type of reuse is called frequency spectrum analysis (FSA) (Ball, 1999). This analysis can provide runtime reuse frequency information for particular elements, or patterns of reuse for groups of elements. Calculation of the reuse perspective also relies on dynamically generated information.
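A minimal sketch of the counting step behind FSA (my illustration; the input format is hypothetical):

import java.util.*;

// Frequency spectrum analysis: count how often each element occurs in an
// execution trace. Elements executed once per run (e.g. initialisation)
// separate cleanly from heavily reused utility code.
class FrequencySpectrum {
    static Map<String, Long> spectrum(List<String> trace) {
        Map<String, Long> frequencies = new HashMap<>();
        for (String element : trace) {
            frequencies.merge(element, 1L, Long::sum);
        }
        return frequencies;
    }

    public static void main(String[] args) {
        List<String> trace = List.of("init", "read", "log", "read", "log", "read");
        System.out.println(spectrum(trace)); // init=1, read=3, log=2 (order may vary)
    }
}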

int x;
int y;
int z;
x = 1;
z = 1;
y = x + z;

Figure 3.1: Code example 1.

Figure 3.2: A possible graph representation of code example 1 in figure 3.1.

3.4 Dependency Graphs

A dependency graph is a graph representation of the dependencies in a software system. This intermediate representation of the system is a convenient depiction of the source code that lends itself easily to analysis (Larsen and Harrold, 1996) and to the code optimization of programs (Ferrante and Warren, 1987). In the code example in figure 3.1 we have a number of statements and assignments. To help understand data flow, one might decide to model the assignments and declarations in the program as in the graph in figure 3.2. The next code example is a program with an 'if' statement and a 'while' loop (figure 3.3). It is possible to model the control flow structures of this program in graph form, as shown in figure 3.4.


int x = 0;
int y = 0;
while (x == 0)
{
    if (y > 10)
    {
        x = -1;
    }
    else
    {
        x++;
    }
    y++;
}
x = 0;
y = 0;

Figure 3.3: Code example 2.

Figure 3.4: A possible graph representation of code example 2 in figure 3.3.


public class Animal { }
public class Dog extends Animal { }
public class Cat extends Animal { }
public class Greyhound extends Dog { }

Figure 3.5: Code example 3.

Figure 3.6: A possible graph representation of code example 3 in figure 3.5.

Early use of dependency graphs saw them used to help implement code optimizations and analyses such as program slicing (Ottenstein and Ottenstein, 1984; Ferrante and Warren, 1987). Ferrante and Ottenstein describe what they call the "Program Dependency Graph", which combines data flow and control flow for a program in a single graph. To model programs written in object oriented languages, dependencies such as inheritance relationships may also be included. Take the code example in figure 3.5, where a three tier inheritance hierarchy exists. A dependency graph modelling this type of dependency can be seen in figure 3.6. Similarly, other constructs that introduce dependencies in object oriented languages, such as polymorphism and the friend construct, may be modelled in a graph representation.


int procedure1() {
    procedure2(7);
    procedure3("hello", 3);
}

int procedure2(int x) {
    procedure3("hello again", x);
}

int procedure3(String str, int y) { }

Figure 3.7: Code example 4.

Another commonly derived dependency is the method or procedure calls made in a program. Take the code example in figure 3.7 and its corresponding graph representation in figure 3.8. The resulting dependency graph is known as a call graph (Fenton, 1991). In (Larsen and Harrold, 1996) the authors describe a "System Dependency Graph", where the special dependency relations for object oriented software, described above, and the call relation dependencies are combined with the dependencies of the program dependency graph to form a large, comprehensive dependency graph of the system. Their application of the graph is to enable slicing in object oriented software. In this context the call graph can be seen as a subset of the system dependency graph. The analyses performed by the technique proposed in this thesis only require the call relations of a program. Therefore it is the call graph representation that is used in this thesis as the basis for analyses.

Figure 3.8: A possible graph representation of code example 4 in figure 3.7.
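As an illustration (mine, not from the thesis), the call graph of figure 3.7 can be represented as a simple adjacency map, which is all the dependency information the analyses in this thesis require:

import java.util.*;

// A call graph as an adjacency map from caller to callees.
class CallGraph {
    private final Map<String, Set<String>> edges = new HashMap<>();

    void addCall(String caller, String callee) {
        edges.computeIfAbsent(caller, k -> new LinkedHashSet<>()).add(callee);
    }

    Set<String> callees(String caller) {
        return edges.getOrDefault(caller, Set.of());
    }

    public static void main(String[] args) {
        CallGraph graph = new CallGraph();
        graph.addCall("procedure1", "procedure2");
        graph.addCall("procedure1", "procedure3");
        graph.addCall("procedure2", "procedure3");
        System.out.println(graph.callees("procedure1")); // [procedure2, procedure3]
    }
}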

3.5 Reengineering Towards Components

3.5.1 Design Recovery

Design recovery is a subset of reverse engineering. Chikofsky and Cross (Chikofsky and Cross II, 1990), in their taxonomy, define design recovery as:

"a subset of reverse engineering in which domain knowledge, external information, and deduction or fuzzy reasoning are added to the observations of the subject system to identify meaningful higher level abstractions beyond those obtained directly by examining the system itself"

Other descriptions of design recovery do exist (Stoemer et al., 2003; Dean and Chen, 2003; Sartipi et al., 2000; Malton and Schneider, 2001), but they all essentially capture the same basic concept - that the implicit agenda behind design recovery is to help the programmer understand the system and its design. Biggerstaff (Biggerstaff, 1989) first brought the term into the mainstream in 1989 with his accompanying tool DESIRE. Here, the inadequacies of source code alone in


an understanding context are identified. Application domain, programming style and supplementary documentation are just a few of the factors, external to the source code, that have an impact on the understanding of the source code (Shaft, 1995). Design recovery can include elements of domain knowledge regarding the system, the system's context, documentation supporting the system and input from an expert developer of the system. Core to this topic is the concept of a domain model. A domain model records the expectations of a programmer, regarding the real-world situation the system is modelling, during an understanding process, and attempts to match these expectations with source code, hence introducing traceability from hypotheses to source code. An attempt at automation was made in Biggerstaff's DESIRE tool (Biggerstaff, 1989). The tool is analysed further in (Biggerstaff et al., 1993), where he identifies what is known as the concept assignment problem: the problem of matching expectations and hypotheses to their source code implementations (Brooks, 1983). Where these source implementations are clichéd they are known as programming plans (Brooks, 1983). Creating domain models automatically has proved difficult (Biggerstaff et al., 1993). Research in the areas of plan detection (Quilici, 1993; Quilici et al., 1997; Quilici and Yang, 1996; Rich, 1984; Woods and Quilici, 1996) and pattern detection (O'Cinneide, 2001; O'Cinneide and Nixon, 1999, 2000, 2001; Heuzeroth et al., 2003), though worthwhile and partially grounded in comprehension theory, has not yet reached a level of practical application. At present, the best application for automated design recovery through plan detection would seem to be in vertical domains, where a far narrower range of plans and expectations exists, thus making the solution space (i.e. the coding alternatives for each plan) manageable (Quilici et al., 1997). Given that automating design recovery is not currently practical, semi-automated approaches are being investigated as viable solutions. In recent years, semi-automated approaches, such as Reflexion Modelling, the CME and FEAT, have been used with very promising results (Kosche and Daniel, 2003; Murphy and Notkin, 1997; Murphy et al.,


1995; Sartipi, 2001; Tran et al., 2000; Murphy et al., 2001; Walenstein, 2002; Chung et al., 2005; Robillard and Murphy, 2002; Lindvall et al., 2002). These processes follow these general steps2:

1. Hypothesise categories and relationships between the hypothesised categories in the application under analysis.

2. Map parts of the application into these categories, creating a hypothesised model.

3. Extract a concrete, lower level model of the application.

4. Compare the hypothesised model against the concrete model of the system.

5. Refine the results and repeat the process until satisfied.

Dynamic analysis techniques have also shown promise as a means of design recovery (Ritsch and Sneed, 1993; Heuzeroth et al., 2003; Komondoor and Horwitz, 2003; Rajlich and Wilde, 2002). Dynamic analysis offers the potential to remove the need for source code domain knowledge3 prior to analysing the system. For example, dynamic analysis techniques for feature location, such as Software Reconnaissance or concept analysis (Wilde and Scully, 1995; Eisenbarth et al., 2003; Wong et al., 1999), use knowledge of the system's execution with respect to test cases that exhibit certain business transactions to relate code to business function.

2 Some of these steps may be implicit in the use of the technique, or may appear merged to the user; however, they do exist.
3 Knowledge of the style of source code written for that domain. E.g. all compilers may have approximately the same design, so someone with domain knowledge of compiler development would expect certain modules to exist in the implementation.

3.5.2 Clustering for Architectural Recovery and Component Recovery

With respect to step one (the encapsulation phase) of reengineering towards components, the most relevant reengineering and maintenance techniques are those that involve clustering. Clustering is a widely used technique of software maintenance and reengineering that identifies the contents of potential modules in a system and the cohesive interfaces between those modules. The contents of these modules are called clusters (Hutchens and Basili, 1985). Clustering is often used to aid software comprehension, design recovery, component recovery and architectural reconstruction (Doval et al., 1999; Mitchell et al., 2002; Rennard, 2000; Ogando et al., 1994; Choi and Scacchi, 1990; Lindig and Snelting, 1997; Gall and Klösch, 1995; Patel et al., 1992; Valasareddi and Carver, 1998; Yeh et al., 1995; Kazman and Carrière, 1997; Murphy and Notkin, 1997). Component recovery and architectural recovery are highly related and yet subtly different software analysis tasks, used for different purposes. An architectural recovery process will generally follow two steps (Koschke, 1999):

1. Identify the code that implements each component in a system.

2. Identify dependencies between the code of the components of the system.

This type of analysis is used to redocument systems, communicate their design and help software engineers understand unfamiliar systems. In contrast, the goal of component recovery is to identify individual components in a system and extract them, possibly for reuse in other systems (Koschke, 2000b). To achieve this, a limited form of architectural recovery will occur; however, the global view that architectural recovery achieves is generally not required. We suggest that a component recovery process follows these generic steps (illustrated earlier in section 2.4), which are similar to those of architectural recovery:


1. Identify the code that implements the component of interest only.

2. Identify dependencies on the code of the component of interest only.

3. Conform with a component model by wrapping the component with a component wrapper. This is discussed in section 3.5.5.

It is important to note the final step, where a component wrapper is applied to achieve conformance with a component model. The majority of component recovery techniques described here recover components that conform to an older and simpler definition of a component (table 2.1 tier 1, basic reuse) and therefore the last step is often not necessary. Unless otherwise explicitly stated, the review of clustering techniques for component recovery in this section does not include the final step. However, this does not pose a dilemma, since, if the first two steps are carried out to define a cohesive, reusable component, the application of a component wrapper becomes relatively trivial. Approaches to clustering can be placed into three broad categories, based upon the type of information they act upon (Koschke, 2000a):

• Dataflow-based approaches.

• Structure-based approaches.

• Domain-model-based approaches.

3.5.2.1 Dataflow-based Approaches

Dataflow-based approaches cluster based upon data relationships in the source. The relationships examined can be data types (Doval et al., 1999), abstract data types (Ogando et al., 1994; Yeh et al., 1995) or simply the declared variables themselves (Hutchens and Basili, 1985; Gallagher and Lyle, 1991). The way in which we clustered parts of a code fragment to form valid encapsulations in the example in section 2.5.1.3 could be considered a form of data clustering.


Hutchens and Basili (Hutchens and Basili, 1985) describe a dataflow clustering technique based upon whether data is passed, received, used or altered for two or more procedures. Their work demonstrates some of the earliest evaluation of clustering as a means of architectural recovery. They compared the structure recovered by their technique against descriptions of the systems produced by software engineers to determine the success or failure of the approach. Their results exhibited preliminary success for architectural recovery and set a precedent for evaluating future automated clustering techniques geared toward architectural recovery. Unfortunately, their technique was limited by their inability to perform analysis of abstract data types and pointer usage. The work of Livadas and Johnson (Livadas and Johnson, 1994) overcame some of these shortcomings through the use of system dependency graphs (SDGs). Livadas and Johnson successfully implemented several clustering algorithms in (Livadas and Johnson, 1994), based upon data type usage identified in an SDG, to recover objects from source code which was not object-oriented. In (Gall and Klösch, 1995) the authors also implemented a clustering technique based on the analysis of data types. The goal of their work was to identify abstractions in procedural code which could be transformed to object-oriented code, with the specific goal of reuse. Their approach is semi-automated and human oriented. It begins by extracting low level program representations such as data-flow diagrams and call graphs. Using these, two types of component are identified algorithmically:

• Data store entities (DSE): that is, a clustering of source code that uses the same persistent data.

• Non-data store entities (NDSE): that is, a clustering of source code that uses the same internal data.

Components identified in this fashion are then compared against a domain model


generated from human derived information, such as requirements documents. Mappings are made manually between the domain concepts and the recovered components. In doing so, it becomes clearer which components are valid and which are not. The human orientation of the component recovery technique was ahead of its time: the semi-automated, human-oriented nature of the approach is seen as best practice in the component recovery literature (Koschke, 2000a). More recently, dataflow clustering techniques have been incorporated into aggregated approaches for component recovery (Ogando et al., 1994). These are discussed later in section 3.5.3.

3.5.2.2 Structure-Based Approaches

Structure-based approaches to clustering operate by analysing the structure of the system. Examples include (Girard and Koschke, 1997; Schwanke, 1991; Siff and Reps, 1997). Some structure-based approaches operate by applying a specific metric to the relations between elements in the system. For example, Schwanke (Schwanke, 1991) calculates a similarity measure between variables in procedures in a system to create a weighted relationship between them. Unfortunately, using this method on its own produced poor results. Results were improved when the author introduced an AI-based tuning method to weight the metric, over time, in favour of the user's clustering preferences. Other structure-based approaches that derive a weighted relationship between clusters in a system include (Chiricota et al., 2003) and (Muller et al., 1993). Another style of structure-based approach applies graph theory algorithms to a dependency graph of a system to cluster elements together. One of the most common examples is dominance analysis (Cimitile and Visaggio, 1995; Girard and Koschke, 1997), which clusters procedures together in a similar fashion to the first example shown in section 2.5.1.3. Thus the technique suggests code clusters based on high encapsulation in the system. In (Cimitile and Visaggio, 1995) the authors effectively demonstrate


how modules of software can be identified using dominance analysis with the goal of reuse in mind. A more recent flavour of clustering has seen the use of concept analysis in clustering systems. Concept analysis is a mathematical technique used to analyse binary relations (Koschke, 2004). In recent years it has been successfully applied to the software engineering field (Snelting, 1998, 1996; Eisenbarth et al., 2003). One potential application of concept analysis is the identification of modules in source code. For example, in (Lindig and Snelting, 1997) the authors attempt to reengineer modules from legacy code by examining the binary relationship between procedures and global variables. Their results were mixed: an architectural recovery seemed possible only where an underlying structure existed in the first place. Two of the three case studies they performed their analysis on had undergone serious structural degradation as a result of years of ongoing maintenance, and did not yield a recovered architecture as a result of their analysis. In (Siff and Reps, 1997) the authors also apply concept analysis to recover modules. This time the binary relation was placed over functions and their properties (i.e. arguments and return values).

3.5.2.3 Domain-Model Based Approaches

The concept of a domain model was introduced in section 3.5.1. A domain model, formed using the domain knowledge of a user, can be very effective in producing an accurate4 decomposition of a software system (Murphy and Notkin, 1997; Kazman and Carrière, 1997), particularly as a prelude to reuse (Patel et al., 1992). One of the most common domain-model-based approaches to software clustering is the Reflexion Modelling technique (Murphy and Notkin, 1997; Murphy et al., 1995, 2001; Murphy, 1996, 2003). Reflexion Modelling forms part of the component recovery process proposed by this thesis and is described in greater detail in chapter 5. Other domain-model based

4 Accurate from that user's point of view.


approaches include FEAT (Robillard and Murphy, 2002) and the CME (Chung et al., 2005).

3.5.3 Aggregated Recovery Approaches

It has been shown in the literature that clustering techniques generally perform poorly as a means of component or architectural recovery when used in isolation (Koschke, 2000a; Kazman and Carrière, 1997). Furthermore, it has been suggested that future approaches to solving this problem should aggregate existing individual approaches, include the human more in the decisions of the process, make use of dataflow information and place more emphasis on domain knowledge (Koschke, 1999). The approach proposed in this thesis is an aggregated process for component recovery that satisfies these recommendations. A good example of an aggregated approach is that of Ogando et al. (Ogando et al., 1994), who proposed an aggregated approach to recovering objects from source code. Their approach uses a combination of bottom-up and top-down understanding to achieve object recovery. From a top down standpoint, objects are identified using two techniques:

• Routines are grouped together based on what global variables they use. For example, if a single global variable is used by four procedures then they are grouped together.

• User defined data types and the routines that use them are grouped together.

These clustering techniques provide an initial architectural recovery of the system, thus facilitating understanding from the top down. From a bottom up perspective, a human-oriented grouping of subcomponents is performed. For example, sometimes the automated clustering techniques used will suggest that a routine belongs to many different objects. These conflicts are resolved in a bottom-up, semi-automated fashion by presenting them to the user. The type of domain knowledge used was reported to be mainly derived from the naming conventions in the code.
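A minimal sketch of the first grouping rule above (my illustration; names and input format are hypothetical):

import java.util.*;

// Ogando et al.'s first rule: routines that use the same global variable
// are grouped together as a candidate object.
class GlobalVariableClustering {
    static Map<String, Set<String>> cluster(Map<String, Set<String>> routineToGlobals) {
        // Invert routine -> globals into global -> routines.
        Map<String, Set<String>> clusters = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : routineToGlobals.entrySet()) {
            for (String global : entry.getValue()) {
                clusters.computeIfAbsent(global, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return clusters;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> uses = Map.of(
                "open", Set.of("fileTable"),
                "close", Set.of("fileTable"),
                "render", Set.of("screenBuffer"));
        // "fileTable" groups {close, open}; "screenBuffer" groups {render}.
        System.out.println(cluster(uses));
    }
}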


In (Girard and Koschke, 1997) a framework for component recovery is proposed that uses dominance analysis as the primary technique and combines it with two dataflow clustering techniques and another graph-based structural clustering technique:

1. First, all mutually recursive routines are clustered.

2. Then each global variable and the procedures that use it are clustered together. These are called abstract state encapsulations (ASE).

3. Then each user defined data type, and the procedures that use it, are clustered together. These are called abstract data types (ADT).

4. Finally, dominance analysis is performed on the collapsed call graph to yield further component suggestions.

The results of the authors' studies showed a marked improvement upon using dominance analysis alone. Two aggregated approaches to architectural/component recovery are described in (Koschke, 1999) and (Kazman and Carrière, 1997), where over fifteen clustering techniques are placed at the user's disposal to apply at his discretion. These approaches demonstrate the most comprehensive solutions to date. Importantly, both approaches incorporate domain knowledge input from the user, which increased the effectiveness of their solutions. The ultimate goal of this thesis is not to replace techniques like these, but rather to evaluate our technique with an eventual view to integrating it into larger aggregated processes.

3.5.4 Componentisation Processes

In this thesis componentisation refers to techniques used to convert entire systems to a component-based implementation; hence it is more similar to architectural recovery than component recovery. In a componentisation process the improved encapsulation


mechanisms are applied to non-component-based parts of a system, similar to what was described in section 2.5.3. An early componentisation process was described in (Choi and Scacchi, 1990). Using their module interconnection language, NuMIL, the authors describe their suggested process for augmenting a program with what would, in modern terms, be described as a component architecture description. The modules that they describe, however, are not consistent with the definition of component used in this thesis. Another componentisation approach that is related to the recovery of components is Aldrich et al.'s application of the ArchJava component based software development language to a legacy system (Aldrich et al., 2002). Reflexion Modelling is used to explicitly and accurately identify component boundaries in the system before applying the ArchJava language to it (see chapter 5). Though successful in its goal of applying a component language extension to an existing system, their approach does not explore the objective of identifying individual components with the goal of reuse in mind. P. D. Johnson in (Johnson, 2002) describes another componentisation approach using black-box reengineering. Black-box reengineering is any reengineering approach that only requires the maintainer to understand the system down to the functionality level and not the detail of implementation (understanding down to the implementation level during reengineering is known as white-box reengineering). A good example of black-box reengineering techniques are the many feature location approaches that exist5 (Eisenbarth et al., 2003; Wilde and Scully, 1995; Zhao et al., 2004; Wong et al., 1999). These techniques often do not require understanding of the implementation to locate the source code responsible for implementing a feature. Removing the requirement for a detailed understanding of implementation details, of course, presents time saving benefits at the comprehension stage of a "reengineering to components" process. P. D. Johnson's process follows three steps:

5 Where a feature is any operation that produces a result of observable value (Eisenbarth et al., 2003).


1. Identify business components: apply a chosen technique that identifies components in code.

2. Create wrapper components: supplement the code chosen to be recovered as a component with wrapping code so that it may conform to the definition of a component (Bergey et al., 2000). This is a necessary step when reengineering towards components and is described in the next section.

3. Deploy wrapper components: use the recovered components in a system. In the case of P. D. Johnson's process, he only considers deployment in the existing system, to replace the same piece of the system that was chosen to be wrapped as a component.

Importantly, this process is only a framework process for componentisation, and leaves many of the details of the precise process steps, and of what tools and techniques to use to achieve those steps, to the discretion of the user. The process proposed by this thesis partially fits into this framework by describing in detail a set of steps that can be used to fulfil step one (component encapsulation) of componentisation. Also, it is crucial that the differences between what is proposed by this thesis and the above componentisation process are understood: Reconn-exion implements targeted component recovery, where individual components are chosen by the software engineer and encapsulated, whereas componentisation is a process that is applied to every component in a system. For this thesis, only the first step is within scope, i.e. only the encapsulation of a new component is of concern, not its alteration to conform with a component model or its integration into a new system. However, for completeness, step two is considered briefly in the next section.

3.5.5 Component Wrappers

Wrapping is a mechanism by which legacy source code may be supplemented to modernise the system to conform with new development paradigms (Bergey et al., 2000). By legacy system we mean one that meets the following requirements (Juric et al., 2000):

• Has existing code.

• Is currently useful.

• Is currently used.

• Does not conform to the component model for which we are applying the wrapping technique.

Due to the recent appearance of component-based development, few legacy systems, or the software assets that constitute them, conform to the requirements of existing component models and frameworks6. Therefore, before source code from a legacy system can be considered fully reengineered towards components, it must first be amended so as to achieve conformance with the component model and framework to which it will be applied. This is known as component wrapping (Comella-Dorda et al., 2000). A simple process for wrapping a legacy system as a JavaBean, for example, follows these three steps (Comella-Dorda et al., 2000):

1. Modularise by identifying the component's code and distinct interfaces in the legacy system.

2. By identifying the interfaces, the points of contact to the remainder of the system are identified.

6 Interestingly, the legacy source code does conform to a model of sorts - the operating system - which enforces constraints that today seem ubiquitous, such as process management, memory management and file management.


3. With sufficient information about the component now present, implement a wrapper bean for each component.

A more generic approach would perhaps be to apply a component wrapper using a language independent component model. In this model a state of the art architectural description language that has been extended to allow the definition of components, such as xADL (Galvin et al., 2004), could be used. An example of this can be found in (Le Gear et al., 2004). A generic approach using this technology would follow these steps:

1. Identify a portion of legacy source to wrap as a component.

2. Identify its interface boundaries.

3. Apply the mark-up languages appropriately to describe a xADL component type.

Component wrapping is a necessary step when reengineering towards components, but it is not a core contribution of this thesis and is not discussed further.
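Purely as an illustration of the wrapping idea (a hypothetical sketch of mine, not the thesis' method), a recovered component such as the "Transforms" example from figure 2.11 might be exposed behind a JavaBean-style facade:

import java.io.Serializable;

// The bean exposes only the recovered interface; everything else in the
// legacy code stays hidden behind it.
public class TransformsBean implements Serializable {
    public double[] rotate(double[] points, double angle) {
        return LegacyTransforms.rotate(points, angle); // delegate to legacy code
    }
}

// Stand-in for the encapsulated legacy implementation.
class LegacyTransforms {
    static double[] rotate(double[] points, double angle) {
        // legacy implementation elided
        return points;
    }
}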

Chapter 4 Software Reconnaissance “What is the difference between exploring and being lost?” - Dan Eldon, photojournalist.


Figure 4.1: Identifying Features from Running Systems (execution paths through a system; a test case can isolate a path for the feature it exhibits).

Software Reconnaissance1 is a dynamic analysis technique that, through the acquisition of coverage profiles (Ball, 1999) yielded by exercising carefully selected test cases on instrumented code (see section 4.2.1), allows mappings to be created (see section 4.1) between program features and the software elements that implement them (Wilde and Scully, 1995). Figure 4.1 illustrates how Software Reconnaissance works at an abstract level. The term program feature is understood as being a realised functional requirement that produces an observable result of value to the user (Eisenbarth, Koschke and Simon, 2001; Eisenbarth et al., 2003).

1. The dynamic search method is another name for the technique (Rajlich and Wilde, 2002).

The software


elements2, to which the features are mapped, vary in granularity depending upon the level of instrumentation, and may be branches of the decision tree (Wilde and Scully, 1995), individual statements (Wong et al., 1999) or procedures (Eisenbarth et al., 2003). The appropriate choice of granularity depends upon the context of use.

4.1 A Functionality View of Software

The mappings created between program features and code during Software Reconnaissance create a functionality view of software. Described more formally (Wilde et al., 1992), given a set of potentially overlapping functionalities3, FUNCS = {f1, f2, . . . , fN}, and a set of source elements, ELEMS = {e1, e2, . . . , eN}, it should be possible to construct an implements relation, IMPL, over FUNCS × ELEMS, revealing which functionalities are implemented by which source elements. The link that allows us to construct this relation is the test case, since a test case Ti represents an execution scenario that may exhibit one or more functionalities, F(Ti) = {fi,1, fi,2, . . .}, and will also exercise a set of source elements which can be identified through instrumentation (see section 4.2.1), E(Ti) = {ei,1, ei,2, . . .}.

2. In Norman Wilde's seminal papers on Software Reconnaissance (Wilde et al., 1992; Wilde and Scully, 1995), he repeatedly refers to components. However, due to the potential confusion between this definition and the modern concept of components described in section 2.1, the term software elements is used in this thesis.

3. Feature and functionality are used interchangeably.


Put another way, we can define an EXERCISES relation over T × ELEMS, where EXERCISES(t, e) is true if a software element e is exercised by a test case t, and where T is the set of test cases, T = {t1, t2, . . . , tN}. Furthermore, the relation EXHIBITS over T × FUNCS may be defined, where EXHIBITS(t, f) is true if test case t exhibits functionality f (Wilde and Scully, 1995). This type of analysis allows us to identify a number of interesting sets. These are discussed in the following subsections.

4.1.1 Common Software Elements

The set of common software elements refers to the source code that will always be executed regardless of the test case. This is illustrated in figure 4.2, where the large circles represent test cases and the small circles represent the software elements executed by them. The small, black circles are members of the set of common software elements, while the small, white circles are not. The set generally contains the utility code of the system (Wilde and Scully, 1995). The set of common software elements, CELEMS, is defined as:

CELEMS = {e : ELEMS | ∀t ∈ T, EXERCISES(t, e)}
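To illustrate, CELEMS is simply the intersection of the coverage profiles of all test cases. The following minimal Java sketch computes it from procedure-level profiles; the test case and procedure names are invented for illustration.

    import java.util.*;

    public class CommonElements {
        // Coverage profiles: each test case maps to the set of elements it exercised.
        static Set<String> commonElements(Map<String, Set<String>> coverage) {
            Set<String> common = null;
            for (Set<String> exercised : coverage.values()) {
                if (common == null) {
                    common = new HashSet<>(exercised); // start from the first profile
                } else {
                    common.retainAll(exercised);       // intersect with each further profile
                }
            }
            return common == null ? new HashSet<>() : common;
        }

        public static void main(String[] args) {
            Map<String, Set<String>> coverage = new HashMap<>();
            coverage.put("t1", new HashSet<>(Arrays.asList("init", "parse", "log")));
            coverage.put("t2", new HashSet<>(Arrays.asList("init", "print", "log")));
            System.out.println(commonElements(coverage)); // init and log, in some order
        }
    }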

4.1.2 Potentially Involved Software Elements

The set of potentially involved software elements for a feature f includes the software elements exercised by any test case exhibiting f. This is illustrated in figure 4.3, where the large ovals represent test cases and the small circles represent the software elements executed by them. The small, black circles are members of the set of potentially involved software elements. This set will include software elements directly involved


Figure 4.2: Common Software Elements (test cases exhibiting any feature; common software elements shown in black).


in the implementation of f; however, it may also include elements that do not directly implement the feature f. The set of potentially involved software elements also tends to be quite large as a percentage of the entire system for most features (Wilde and Scully, 1995). When trying to map from feature to location, the primary use of the set of potentially involved software elements is as a foundation for more refined sets. The set of potentially involved software elements, IELEMS, is formally defined as:

IELEMS(f) = {e : ELEMS | ∃t ∈ T, EXHIBITS(t, f) ∧ EXERCISES(t, e)}

4.1.3 Indispensably Involved Software Elements

The set of indispensably involved software elements, IIELEMS, is a refinement of the set of potentially involved software elements. IIELEMS is the set of software elements exercised by all test cases exhibiting f. Figure 4.4 illustrates this set. The small, black circles are members of the set of indispensably involved software elements and the small, white circles are not. This yields a set the same size as, or smaller than, the set of potentially involved software elements for the same feature. However, the problem of scale remains when trying to locate the code that implements a specific feature (Wilde and Scully, 1995): the elements that solely implement the feature are sometimes only a small proportion of the set. Again, the set of indispensably involved software elements is more useful in defining more refined sets, as described in the following sections. The set of indispensably involved software elements, IIELEMS, can itself be formally defined as:

IIELEMS(f) = {e : ELEMS | ∀t ∈ T, EXHIBITS(t, f) ⇒ EXERCISES(t, e)}

Figure 4.3: Potentially Involved Software Elements, shaded in black (test cases exhibiting the same feature).

Figure 4.4: Indispensably Involved Software Elements, shaded in black (test cases exhibiting the same feature).

4.1.4 Uniquely Involved Software Elements

The set of uniquely involved software elements, UELEMS, is a further refinement of the previously described sets. It contains software elements used only by the functionality f and no other. The set is arrived at by taking the set of software elements exercised by any test case exhibiting f and removing any elements that are also exercised by test cases that do not exhibit f. Figure 4.5 illustrates this set, where the large ovals represent test cases and the small circles represent the software elements executed by them. The small, black circles are members of the set of uniquely involved software elements and the small, white circles are not. The set can be defined as follows:

UELEMS(f) = IELEMS(f) − {e : ELEMS | ∃t ∈ T, ¬EXHIBITS(t, f) ∧ EXERCISES(t, e)}

It has been shown experimentally that UELEMS provides a useful starting point when trying to understand the implementation of a particular feature (Wilde and Scully, 1995; Wilde et al., 1992) and that IIELEMS provides a context for understanding when using UELEMS.
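The feature-specific sets can be computed directly from the EXERCISES and EXHIBITS relations. The following Java sketch does so under the same assumptions as the earlier sketch: coverage maps each test case to the elements it exercised, and exhibits encodes EXHIBITS for a fixed feature f; all names are illustrative.

    import java.util.*;
    import java.util.function.Predicate;

    public class FeatureSets {
        // IELEMS(f): union of the coverage of every test case exhibiting f.
        static Set<String> ielems(Map<String, Set<String>> coverage, Predicate<String> exhibits) {
            Set<String> result = new HashSet<>();
            coverage.forEach((t, elems) -> { if (exhibits.test(t)) result.addAll(elems); });
            return result;
        }

        // IIELEMS(f): intersection of the coverage of every test case exhibiting f.
        static Set<String> iielems(Map<String, Set<String>> coverage, Predicate<String> exhibits) {
            Set<String> result = null;
            for (Map.Entry<String, Set<String>> e : coverage.entrySet()) {
                if (!exhibits.test(e.getKey())) continue;
                if (result == null) result = new HashSet<>(e.getValue());
                else result.retainAll(e.getValue());
            }
            return result == null ? new HashSet<>() : result;
        }

        // UELEMS(f): IELEMS(f) minus anything exercised by a test not exhibiting f.
        static Set<String> uelems(Map<String, Set<String>> coverage, Predicate<String> exhibits) {
            Set<String> result = ielems(coverage, exhibits);
            coverage.forEach((t, elems) -> { if (!exhibits.test(t)) result.removeAll(elems); });
            return result;
        }
    }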

4.2 Related Work

4.2.1 Software Instrumentation Enabling Software Reconnaissance

Obviously, some form of instrumentation is required for Software Reconnaissance. Instrumentation is a dynamic software analysis technique that involves the inclusion of output statements in source code to help developers understand programs (Wilde, 1998). Several approaches to instrumentation exist, and have been comprehensively reviewed in (Wilde, 1998). An instrumented program, when run, will output a trace or a profile of its execution. A trace will show what was run during execution, and in what sequence; a profile will only show what was run (Ball, 1999).
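The following sketch shows what hand-written, procedure-level instrumentation might look like in Java; a real instrumentor would inject the probes automatically, and all names here are invented. The distinction between a profile and a trace falls out naturally.

    import java.util.*;

    public class Instrumented {
        static final Set<String> profile = new TreeSet<>();  // profile: what ran
        static final List<String> trace = new ArrayList<>(); // trace: what ran, in sequence

        static void probe(String procedure) { // the injected output statement
            profile.add(procedure);
            trace.add(procedure);
        }

        static void parse() { probe("parse"); emit(); }
        static void emit()  { probe("emit"); }

        public static void main(String[] args) {
            parse();
            System.out.println("profile: " + profile); // [emit, parse]
            System.out.println("trace:   " + trace);   // [parse, emit]
        }
    }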


Figure 4.5: Uniquely involved software elements, shaded in black (test cases exhibiting the feature f1 versus test cases exhibiting features other than f1).

4.2.2 Best Practices When Applying Software Reconnaissance

Early applications of Software Reconnaissance, which were concerned with finding general starting points for understanding specific program features, determined that instrumentation to the branches of the decision tree was necessary for optimal results (Wilde and Casey, 1996; Gunderson et al., 1995; Wilde et al., 2001; Wilde and Scully, 1995; Fantozzi, 2002). However, more recent variations of the technique use procedure-level instrumentation (Eisenbarth et al., 2003; Eisenbarth, Koschke and Simon, 2001; Zhao et al., 2004) and statement-level instrumentation (Wong et al., 1999). The former approach is used for mapping features to source code by incorporating concept analysis, while the latter focuses on software comprehension using execution slices. Most influential to the ultimate outcome of the technique's use is the user's choice of test cases (Wilde and Casey, 1996). For example, if chosen carefully, no more than two test cases may be required (Rajlich and Wilde, 2002) - one test case exhibiting the feature and another that does not4 - but the exact nature and number of test cases will vary depending upon the technique's use. In general, it has been experimentally shown that the fewer test cases used to identify a feature, the better the outcome of the technique (Wilde and Casey, 1996). Also, simply choosing existing test suites used in regression testing will not always suffice (Wilde and Casey, 1996). Test cases for software testing purposes are usually more complex, since they tend to be attempting to reveal errors and may therefore exercise many features in a single test case (Eisenbarth et al., 2003). However, with the rise of new approaches to testing, such as unit testing and test-driven development (JUnit, 2006), the nature of test cases themselves is changing. In a publication from Eisenberg and De Volder (Eisenberg and De Volder, 2005) the authors successfully apply Software Reconnaissance using an existing suite of JUnit test cases.

4. Of course, this will mean that the involved and indispensably involved software elements sets will be the same.
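The minimal two-test-case scenario can be sketched as a single set difference: the elements exercised by the feature-exhibiting test but not by the other test become the starting point for locating the feature. The feature (zoom) and the element names below are invented for illustration.

    import java.util.*;

    public class TwoTestCases {
        public static void main(String[] args) {
            Set<String> withFeature = new HashSet<>(Arrays.asList("init", "zoom", "redraw", "log"));
            Set<String> withoutFeature = new HashSet<>(Arrays.asList("init", "redraw", "log"));

            Set<String> candidates = new HashSet<>(withFeature);
            candidates.removeAll(withoutFeature); // coverage difference of the two tests

            System.out.println(candidates); // [zoom] - where to begin looking for the feature
        }
    }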

4.2.3 Previous Work Using Software Reconnaissance

In (Wilde et al., 1992) Norman Wilde published his seminal work on feature location and defined many of the fundamentals that would eventually become Software Reconnaissance. In this paper a feature location case study on a telecommunications switch application called PBX is presented. Even with feature location techniques at an early stage of research maturity, the study managed to return results with up to 50% accuracy. Such was the importance of the publication that it received the "Most Influential Paper" accolade at the International Conference on Software Maintenance. However, the authors did draw the following conclusions from their experience:

• The technique cannot replace the knowledge of an informed human.

• The technique cannot identify source code responsible for a feature if it is always executed.

In 1995 Wilde and Scully published "Software Reconnaissance: Mapping Program Features to Code" (Wilde and Scully, 1995), which defined and demonstrated Software Reconnaissance for the first time. Much of the functionality view of software described in section 4.1 is derived from this work. The authors describe a case study on a portion of a C compiler of approximately 15 KLOC and successfully demonstrated the feasibility of the approach. In particular, they identify the set of uniquely involved software elements as a useful point for a software engineer to begin searching for the implementation of a feature of the system. In (Wilde and Casey, 1996) the authors evaluate Software Reconnaissance with the agenda of enabling technology transfer to industry. Three case studies on industrial or commercial systems are presented. The systems included:

• The visitor control program: An application for logging and issuing passes for visitors to a company.


• The graph display system: An XWindows graph display and browsing application.

• The test coverage monitor: A program that provides test coverage information when executing test cases.

Their experience in transferring the technique to industry allowed the authors to draw some interesting conclusions:

• Test cases should be as few and as simple as possible when implementing Software Reconnaissance.

• Software Reconnaissance is best applied to an unfamiliar system where focus on a single area is needed.

• The use of existing test cases that are part of a regression testing suite is not suitable for applying Software Reconnaissance. New test cases, designed to exhibit specific features, are needed.

• For any technology being transferred to industry, flexibility and portability are needed to enable industrial trials.

A means of visualising program traces to aid Software Reconnaissance is described in (Lukoit et al., 2000) using a tool called TraceGraph. The tool is applied in two case studies:

• JointSTARS, a defence system from the US DoD, approximately 300 KLOC in size.

• The Mosaic web browser.

Based on observations from the studies the authors noticed:


• Only features that the user can control can be located using Software Reconnaissance.

• The quality of results will depend upon the test cases used.

• In some cases the human eye, in conjunction with the visual representation provided by TraceGraph, was able to replace the set difference operator of older Software Reconnaissance tools.

The type of Software Reconnaissance used in the case studies is a slight variation on the original technique, designed for multi-threaded applications (Wilde et al., 1997). The instrumentor required must be able to return a single trace where many traces and processes exist. This extension to Software Reconnaissance was proposed as a solution to problems that practitioners were experiencing while applying Software Reconnaissance during a case study on a system called InterBase in (Gunderson et al., 1995). In (Wilde et al., 2001) the authors compare Software Reconnaissance to a static analysis feature location technique called the Dependency Graph method. The authors present a case study on a legacy Fortran application called CONVERT3, which was used by the U.S. Navy as a raytracer for modelling explosions. Two teams applied the techniques to find two features and observed the following differences between the two feature location techniques:

• Because the Dependency Graph method forces the user to understand more of the source code, the technique may be more suitable for users who lack domain knowledge regarding the system.

• Software Reconnaissance requires far less browsing of the code to locate a feature.

• For large, infrequently changing programs Software Reconnaissance is a better alternative to the Dependency Graph method.


In general the empirical evidence strongly suggests that Software Reconnaissance is a useful technique when trying to locate features in unfamiliar source code (Wilde and Scully, 1995; Wilde et al., 2001, 1997; Wilde and Casey, 1996; Wilde et al., 1998, 2003; Loeckx and Sieber, 1987; Wilde, 1998, 1994; Wilde et al., 1992; Fantozzi, 2002; Lukoit et al., 2000; Gunderson et al., 1995). Other techniques that resemble Software Reconnaissance are used in other problem domains, such as software architecture reconstruction (Yan et al., 2004; Riva and Deursen, 2001) and component recovery from legacy software (Eisenbarth et al., 2003; Eisenbarth, Koschke and Simon, 2001), which is discussed at greater length in section 3.5. Finally, the activities required to undertake Software Reconnaissance resemble many debugging activities, with, of course, the crucial difference that Software Reconnaissance is attempting to locate features, not faults5 (Wilde and Scully, 1995; IEEE, 1990).

5. However, it could be argued that a fault is simply an undesired feature.

Chapter 5 Software Reflexion Modelling “Any sufficiently advanced technology is indistinguishable from magic.” - Arthur C. Clarke.


One of the most successful software understanding and design recovery techniques of recent years has been Reflexion Modelling (Murphy et al., 2001). Impressive improvements in the time taken by software engineers to understand systems have been reported while using the technique. The details of the Software Reflexion Modelling process are explained in the next section; however, the underlying principles of the technique are based on what are known as "collapsing strategies" (Stoermer et al., 2004). Given a software system, modelled as a dependency graph, we can decide to group certain elements of the graph together. This is the essence of a collapsing strategy. Figure 5.1 explains this in three steps (a minimal code sketch is also given at the end of this section):

1. We start with a dependency graph of a system (A).

2. Certain elements of this are chosen to be collapsed together (B) (marked in black).

3. After the collapse is performed we have a slightly more abstracted graph (C).

Many of the clustering techniques discussed in section 3.5.2 use collapsing strategies to achieve their goal. Importantly, a collapsing strategy alone is not a clustering technique. Clustering consists of an analysis of the software followed by a decision to cluster based upon that analysis. For example, dominance clustering is a popular clustering technique (Girard and Koschke, 1997): first dominance analysis is performed, which determines dominating nodes in a call graph (the analysis); this is followed by a decision to cluster based upon the dominating nodes (the collapsing strategy). Collapsing strategies that are preceded by a manual analysis guided by human input have been shown to be helpful in allowing software engineers to recover the designs of software systems (Refl, 2005; Murphy and Notkin, 1997; Robillard and Murphy, 2002; Chung et al., 2005). The manual analysis performed is typically a mnemonic analysis (Refl, 2005; Murphy and Notkin, 1997; Robillard and Murphy, 2002; Chung et al., 2005). Mnemonics, in this context, usually refers to the naming conventions of software

Figure 5.1: Collapsing strategy in operation (panels A, B and C correspond to the three steps above).


elements in a system (Refl, 2005). Software Reflexion Modelling (Murphy and Notkin, 1997), FEAT (Robillard and Murphy, 2002) and the CME (Chung et al., 2005) are all modern examples of this. The next section discusses the Software Reflexion Modelling process in greater detail.
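As a minimal illustration of the collapsing strategy of figure 5.1, the following Java sketch maps source elements onto higher-level nodes and lifts the dependency edges accordingly; the file and node names are invented.

    import java.util.*;

    public class Collapse {
        public static void main(String[] args) {
            Map<String, String> map = new HashMap<>(); // element -> collapsed node
            map.put("lexer.c", "FrontEnd");
            map.put("parser.c", "FrontEnd");
            map.put("codegen.c", "BackEnd");

            String[][] deps = { {"lexer.c", "parser.c"}, {"parser.c", "codegen.c"} };

            Set<String> collapsed = new TreeSet<>();
            for (String[] d : deps) {
                String from = map.get(d[0]), to = map.get(d[1]);
                if (!from.equals(to)) collapsed.add(from + " -> " + to); // drop edges internal to a node
            }
            System.out.println(collapsed); // [FrontEnd -> BackEnd]
        }
    }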

5.1 The Reflexion Modelling Process

Software Reflexion Modelling is a semi-automated, diagram-based, structural summarisation technique that programmers can use to aid their comprehension of particular software systems. Introduced by Murphy et al. (Murphy et al., 1995), the technique is primarily aimed towards aiding software understanding. Reflexion Modelling follows a six-step process, illustrated in figure 5.2:

1. The programmer who wishes to understand the subject system hypothesises a high-level conceptual model of the system.

2. The computer extracts, using a program analysis tool, a dependency graph of the subject system's source code called the source model.

3. The programmer then creates a map which maps the elements of the source model onto individual nodes of the high-level model (a collapsing strategy).

4. The computer then assesses the call relationships and data accesses in the source code to generate its own high-level model (called the reflexion model). This model shows the relationships between the source code elements mapped to different nodes in the programmer's high-level model. This allows the computer's model to be compared with the programmer's model, and the tool can report consistencies or inconsistencies in three ways:


• A dashed edge in the reflexion model represents a dependency between elements of the programmer's high-level model that exists in the source model, but was not actually included in the high-level model.

• A dotted edge in the reflexion model represents a hypothesised dependency edge of the programmer's high-level model that does not actually exist in the source model.

• A solid edge in the reflexion model represents a hypothesised edge of the programmer's high-level model that was also found to exist in the source model.

5. By targeting and studying the inconsistencies highlighted by the reflexion model, the programmer can alter their hypothesised map or high-level model to produce a better recovered model of the system.

6. The previous two steps repeat until the software engineer is satisfied that the recovered model is correct.

Many approaches to software understanding and design recovery have since used Software Reflexion Modelling as a basis for their techniques (Tran et al., 2000; Kosche and Daniel, 2003; Hassan and Holt, 2004; Chung et al., 2005; Robillard and Murphy, 2002). The success of Reflexion as a software understanding technique can be attributed to its parallels with the state of the art in cognitive psychology (explored in detail later in this chapter) and the state of the art in component recovery (Koschke, 2000a).
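Step 4 can be sketched as a comparison of two edge sets: the hypothesised high-level edges and the edges lifted from the source model through the map. Each candidate edge is then classified using the solid/dashed/dotted conventions above; the edge sets here are invented for illustration.

    import java.util.*;

    public class ReflexionEdges {
        public static void main(String[] args) {
            Set<String> hypothesised = new HashSet<>(Arrays.asList("A->B", "B->C"));
            Set<String> lifted = new HashSet<>(Arrays.asList("A->B", "A->C")); // from the source model

            Set<String> all = new TreeSet<>(hypothesised);
            all.addAll(lifted);

            for (String edge : all) {
                if (hypothesised.contains(edge) && lifted.contains(edge))
                    System.out.println(edge + ": solid (in both models)");
                else if (lifted.contains(edge))
                    System.out.println(edge + ": dashed (in the source model only)");
                else
                    System.out.println(edge + ": dotted (hypothesised only)");
            }
        }
    }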


Figure 5.2: The Software Reflexion Modelling Process (the high-level model, source model and map combine to produce the reflexion model).

5.2 Related Work

5.2.1 Early Experiences with Reflexion Modelling

In Gail Murphy's seminal paper on Reflexion Modelling she formally defines the Reflexion Modelling technique and applies it in an example to the NetBSD UNIX operating system (Murphy et al., 1995). In only a few hours the user of the technique was able to produce an accurately recovered model of the system. A case study that better demonstrates the usefulness of Reflexion Modelling followed shortly after this publication in (Murphy and Notkin, 1997). A Microsoft software engineer, with ten years of development experience, applied the technique in order to gain a sufficient understanding of the Microsoft Excel spreadsheet product, with the aim of later performing experimental engineering. After one month of applying the technique, the software engineer said that he had gained a level of understanding of the system that would normally have taken two years using other available approaches. While this reported improvement may seem dramatic, the result is consistent with subsequent Reflexion Modelling case studies (Murphy et al., 2001), including those undertaken by this thesis.

5.2.2 Extensions and Further Uses of Reflexion Modelling

In (Kosche and Daniel, 2003) the original Reflexion Modelling technique is adapted to incorporate the use of hierarchies in the high-level model description. To demonstrate the approach, it is applied to two C compilers, using a generic compiler reference architecture as an initial high-level model. While the authors' evaluation is weak, the case studies were useful in highlighting a number of points:

• Architectural recovery remains a difficult task.

• The Reflexion Modelling approach to architectural recovery is highly iterative.


• Much of the Reflexion Modelling process remains manual.

• Domain knowledge is necessary for the success of Reflexion Modelling, at least in the absence of other software understanding aids. This finding is also supported in (Christl et al., 2005).

In (Tran et al., 2000) a form of Reflexion Modelling is used to repair the architecture of two open source systems. Unlike the original Reflexion Modelling technique, and similar to the approach used in (Kosche and Daniel, 2003), hierarchies are used to describe the high-level model, which they call the architectural level. The authors suggest four possible courses of action if the architecture is in need of repair:

1. Splitting: A high-level entity may be split into two separate nodes.

2. Kidnapping: Source model elements may have their mappings changed, thus moving them from one high-level entity to another.

3. Change high-level model dependencies: Unexpected dependencies may be placed in the high-level model, thus making them expected.

4. Change source model dependencies: The source code may be refactored to make it conform to architectural constraints. This is a highly invasive procedure compared with the other options.

The authors applied their technique to the Linux Kernel and the VIM text editor. They used the original, published architectures of both systems as a base and repaired the architectures of both. However, it is important to note that this repair occurred without changing the source code (i.e. only splitting, kidnapping and changes to the high-level model dependencies were made). This makes their evaluation purely a modelling exercise, and it does not examine repair or refactoring in the traditional sense.


In (Christl et al., 2005) the authors augment the Reflexion Modelling technique with two automatic clustering techniques to help aid the user in the mapping process. The two clustering techniques used, called MQAttract and CountAttract, identify mappings based on measures of coupling and cohesion in the system. The techniques assume that a good set of modules in a system is one with a low level of inter-module coupling. Their technique was applied in a case study of the SHriMP graph visualisation tool (Storey and Muller, 1995) and, consistent with Reflexion-based case studies, showed good results. Their approach is similar to the one proposed in this thesis in the sense that Reconn-exion augments Reflexion Modelling with a further reengineering technique. However, in (Christl et al., 2005) the authors use only static analysis techniques. As identified in the previous chapter, dynamic analysis techniques can facilitate the software comprehension stage of component recovery by providing a link from the initial, behavioural understanding of a system down to the architectural and implementation details of that system. Software Reconnaissance and Reflexion Modelling are proposed as a means of facilitating this understanding in Reconn-exion. Another dimension of information that the use of static analysis techniques will not address is historical information with respect to the software system being analysed (temporal analysis). The authors in (Hassan and Holt, 2004) derive historical information for software systems from source code control repositories and use it to augment the original Reflexion Modelling approach using what they call sticky notes. When iterating through the Reflexion Modelling process, the authors suggest, extra information from the source control system can help explain divergences in a model and could be useful in answering important questions for the software developer, such as:

• Who introduced the unexpected dependency?

• When was the unexpected dependency introduced?


• Why was the dependency introduced?

The authors evaluate their approach on a large, open source operating system, NetBSD, to demonstrate how it could be applied. In (Aldrich et al., 2002) the authors propose and apply a new architectural description language extension to Java called ArchJava. While the focus of the paper is not Reflexion Modelling, the authors do use Reflexion Modelling to help them reengineer a case study system to apply their architectural description language. In effect, Reflexion Modelling is used, in this instance, to help componentise the system. However, no changes to the Reflexion Modelling process are made to suit componentisation, and no comments or evaluation are attributed to Reflexion Modelling as a means of componentisation in the publication. A further application of Reflexion Modelling that has gained popularity is its use as a means of enforcing architectural decisions in an evolving software system (Tvedt et al., 2002; Eick et al., 2001; Sefika et al., 1996; Hochstein and Lindvall, 2003; Tvedt et al., 2004; Lindvall et al., 2002). Using Reflexion Modelling in this fashion requires the software engineer to produce a high-level model at design time, as part of the normal forward engineering process. Then, as the parts of the system are implemented, they are mapped to the high-level model and the reflexion model is created to reveal whether dependencies exist that should not be there. In (Tvedt et al., 2002) and (Lindvall et al., 2002) the authors use this approach during the reimplementation of a commercial software system, written in Java, called the Experience Management System. Using the approach as part of development in the company, the authors observed several benefits:

• The premise of the reimplementation project was to produce a better system. The concrete, as-implemented view provided by the reflexion model was useful in demonstrating to management that the project was succeeding.


• Developers were initially resistant to the new architecture, which was based upon the mediator pattern. Using Reflexion Modelling, programmers who did not adhere to the pattern could be identified immediately.

• Even with the team's best efforts, the desired architecture could not be fully achieved. Using Reflexion Modelling they were at the very least able to document these violations, where they would normally have been overlooked.

• Catching architectural problems early ultimately led to an easier-to-understand and more evolvable software system and prevented the normally inevitable decay of the architecture (Eick et al., 2001).

A more detailed record of this study, by the authors, is available in (Tvedt et al., 2004) and (Hochstein and Lindvall, 2003).

5.2.3 A Cognitive Basis for Reflexion Modelling

The component recovery process described in this thesis is as much an exercise in memory efficiency and recall as it is in creating a new component, since the process is semi-automated and the recovered artifact is based on a system that already exists. Reflexion Modelling, and hence the tailored version of Reflexion Modelling that Reconn-exion employs, can be seen as software and process support for many memory retention and recall practices, as proposed and demonstrated by the field of cognitive psychology. The following is a review of the state of the art in cognitive psychology with respect to recovery, retention and recall practices, much of which is collated in (Brace and Roth, 2005), (Littleton et al., 2005) and (Fleming, 2006). The human memory processing structure can be divided into three main workflows (Brace and Roth, 2005), as illustrated in figure 5.3:

1. Encoding involves taking information from the outside world via the senses and


Figure 5.3: Encoding (putting information into memory), storage (retaining information in memory) and retrieval (getting information back out of memory). Adapted from (Brace and Roth, 2005).

creating a representation for that piece of information so it can be stored internally in the brain.

2. Storage is the process of retaining the encoded information in the brain for long periods.

3. Retrieval involves getting coded information from where it is stored in memory. Two types of retrieval process exist: recognition and recall.

The success of Reflexion can be directly attributed to its support for encoding and retrieval. These are discussed in the following sections.

5.2.3.1 Encoding

Encoding is responsible for taking in, preparing and placing information into storage in the brain. Information can be encoded to different depths (Craik and Lockhart, 1972). For example, learning a passage of a book by simply repeating it over and over would constitute a shallow depth of encoding, since the information is simply encoded using its physical characteristics (i.e. the sound of the passage in this case). This type of encoding is known as maintenance encoding and is not a very robust means of encoding information. A deeper means of processing the information would be to encode it in terms of its meaning. Encoding in this fashion requires more effort, but tends to be far more lasting and useful in memory. It is known as semantic processing or elaborative rehearsal.


Reflexion can be seen as a form of elaborative rehearsal, since the user is not only linking the system to appropriate abstractions (that may typically associate to domain concepts) but also learning the syntax of the system. This also begins to explain why a lack of domain knowledge can have a detrimental effect on the use of the technique. Elaborative rehearsal has been shown to dramatically improve encoding (Craik and Tulving, 1975) and there is no reason to believe that it should be any different with the application of Reflexion. A further interesting addendum to elaborative rehearsal is what is known as the generation effect (Craik and Tulving, 1975). This states that an individual is more likely to remember information generated themselves rather than information presented to them by a third party. This helps to further explain the success of Reflexion, where high-level elements and mappings are created and named by the user, thus making it easier for the individual to remember information recovered from the system. The encoding of large bodies of information is also more robust if spread over many separate sessions rather than confined to a single one. This is known as the spacing effect (Ebbinghaus, 1913). The technique described in this thesis places no restrictions on the software engineer to complete his task in a single sitting. The work may be saved and restored over the course of many sessions. Indeed, in many of the case studies reported in this thesis, this is exactly what happened. This may in part contribute to a more effective process. For large bodies of information, organising parts of it into meaningful categories and hierarchies of categories has been shown to dramatically increase the amount of information that can be encoded in a given time frame (Bousfield, 1953; Bower et al., 1969). The information encoded also tends to be far more lasting and easily retrievable from memory, and the subsequent retrieval tends to occur by category. This is known (in cognitive psychology terms) as clustering. The nature of this categorisation is almost identically supported by the Reflexion process and the tool support provided in


this thesis, through the use of high-level models and mappings. Evidence also exists that new information is not encoded as received. It has been shown that when we are presented with new information we use our knowledge of past experiences to make sense of it. This has been termed effort after meaning (Bartlett, 1932), and describes schemas in memory that exist based upon our past experiences. New information is interpreted in terms of these schemata in memory. With respect to Reflexion Modelling, the existing schemata are the application domain knowledge that the person has. In this instance, by application domain knowledge we mean the typical implementation of a programming solution for that domain. For example, while all compilers are implemented quite differently, they may all have many design concepts specific to that domain in common. Knowing this constitutes application domain knowledge. Having no application domain knowledge is equivalent to having no schemata into which new information can fit. The presence of these schemata makes it easier to encode information. This further explains why the Reflexion Modelling technique can be more effectively used by those with extensive domain knowledge regarding the application and programming domains. A user of Reflexion Modelling who has domain knowledge will have an existing schema in memory, therefore the creation of an initial high-level model will be much easier.

5.2.3.2 Retrieval

Retrieval processes are responsible for fetching information from memory and bringing it to consciousness. A well encoded piece of information is useless if it cannot be effectively retrieved. However, it will come as no surprise that the effectiveness of retrieval can be heavily dependent on how the information was previously encoded, as described in Tulving's encoding specificity principle (Tulving, 1983, 1975). Information is retrieved from memory by accessing existing cues (routes to the piece of information in the brain). Information previously encoded using elaborative rehearsal


provides far more cues than a shallowly encoded piece of information and is therefore far easier to retrieve. The previous section showed how the Reflexion process facilitates the user in encoding information using hierarchical clustering, elaborative rehearsal, the spacing effect and the generation effect. This, combined with the fact that recognition provides stronger cues than recall, makes for a very deeply encoded piece of information with many cues for retrieval (Tulving, 1983, 1975). In practice this means that information discovered about the system using Reflexion is very easily accessible by that person at a later date. Furthermore, the presence of domain knowledge also provides cues at discovery time, making it easier again to encode new information into an existing schema present due to domain knowledge.

5.2.3.3 Human Learning

Learning is the process of change as a result of experience (Littleton et al., 2005). Several models of the human learning process have been proposed in the literature. Of most interest to us is a learning approach called category learning. Again, this theory of learning is proposed by the field of cognitive psychology. A state of the art review of learning approaches proposed in cognitive psychology is collated in (Brace and Roth, 2005), (Littleton et al., 2005) and (Fleming, 2006). We have already touched on the subject when we discussed clustering in the memory process in section 5.2.3.1. Category learning is the learning that occurs when people come to understand that certain objects or entities belong together in particular categories (Littleton et al., 2005). The learning process is theorised as having these steps:

• A group of entities are observed by the individual.

• The individual forms a hypothesis that a portion of these belong together.

• The individual tests his hypothesis against his existing knowledge.


• The hypothesis is accepted, rejected indefinitely, or put on hold until new knowledge causes a change in understanding.

We can see from this brief overview that the Reflexion Modelling process is almost an identical implementation of category learning. This direct support for what is thought to be the way in which human learning processes actually operate in part explains the success of the Reflexion Modelling technique (Littleton et al., 2005; Brace and Roth, 2005; Fleming, 2006). Furthermore, studies in category learning have convincingly shown that prior knowledge and past experiences play an integral part in the formation of categories (Murphy and Medin, 1985; Murphy and Allopenna, 1994; Kaplan and Murphy, 2000). In the same way we can explain why existing domain knowledge is of benefit to those who use the Reflexion Modelling technique. Another learning model popular in cognitive psychology is constructivism (Huitt, 2003; Ausbel, 1968; Bruner, 1990; Seaman, 1999). An authoritative collation of constructivism research can be found in (Huitt, 2003). Constructivism states that an individual learner must actively "build" knowledge and skills and that information exists within these built constructs rather than in the external environment (Huitt, 2003; Bruner et al., 1956). This learning model suggests that the learner's processing of stimuli from the environment produces adaptive behaviour, thus inducing learning. The Reflexion Modelling process supports this form of learning by continually creating stimuli for the user. Each iteration of the reflexion model produces stimuli in the form of feedback regarding the correctness of the user's high-level model. If the stimuli challenge the user's understanding of the environment (in this case the model of the system) an adaptive change is induced (i.e. the learner learns something new about the system). This adaptive change is eventually realised when the learner alters his high-level model for the following iteration.

5.2.3.4 Learning Preferences

Separate from the underlying learning process are the individual preferences of people with respect to learning. People exhibit different likes and dislikes when assimilating new information and integrating it with their understanding of the world. These are our learning preferences, over which we have little control, and they form part of a learning style that includes a range of other influencing factors over which we do have control (e.g. diet or environment) (Fleming, 2006). One of the more popular learning preference models is known as VARK1 and was first pioneered in 1992 (Fleming and Mills, 1992; Fleming, 2006). The model suggests that there are four types of learning modes that individuals may have a preference towards (or a mixture of these, making that person multimodal) (Fleming, 2006):

Visual This is the preference to take information in in the form of charts, graphs, symbols or hierarchies. It does not include movies or text-based presentations.

Aural This is the preference to learn from information that is heard or spoken to you. For example, lectures, tapes, email or group discussions.

Read/Write This is a preference to learn from information presented to you in the form of words. This is typically reading or writing or both.

Kinesthetic This is the preference to "learn by doing." People of this preference learn better from practical experience or dialogue rather than through direct tuition.

Statistically the mixture of preferences among people is diverse, with over 58% of people being multimodal. The process and tool support examined in this thesis is particularly suited to the multimodality of the general population. Three out of the four learning preferences are directly appealed to by the process. The kinesthetic preference

1. Which stands for Visual, Aural, Read/write and Kinesthetic, the four types of learning preference examined by the VARK assessment.


is immediately satisfied through the hypothesis → test → refine nature of the process, strongly appealing to a learn-by-doing approach. The interpretation of results in the reflexion models and the creation of maps are presented in a textual form to the user, thus satisfying the read/write preference. Finally, the visual preference is addressed through the presentation of reflexion models using a graphical output. This broad appeal, allied with other theories on learning, encoding and retrieval, may explain the positive results and comments with respect to Reflexion and Reconn-exion as a learning tool. Experience from this thesis would also seem to support this conjecture.

Chapter 6 Research Methodology “In the fields of observation chance favors only the prepared mind.” - Louis Pasteur, lecture.


A secondary goal of this thesis is to evaluate the proposed approach. To do this, research methodology must first be discussed. A research methodology is a strategy of inquiry that moves from the underlying philosophical assumptions of the researcher through empirical research design and data collection (Myers, 1997). Science offers the modern researcher an arsenal of methodologies that may be used when evaluating a hypothesis. It is the purpose of this chapter to explain existing research methods, relate them to the approach adopted by this thesis and justify this approach to the reader.

6.1 Scientific Method

The foundations of modern science rest upon what is known as "the scientific method." The purpose of the method is to ensure repeatable, rigorous evaluation of hypotheses across science. Depending on the specific context the method will vary; however, across many scientific disciplines it follows these common steps (O'Callaghan, 2005; Basili, 1996; Perry et al., 1997a):

1. Observation: In this initial step a scientist observes some interesting property of the world he wishes to investigate. This guides the formation of subsequent hypotheses and experimentation.

2. Hypothesis: Based on these observations a scientist will form a hypothesis. The hypothesis aims to explain the observations made.

3. Evaluation: Next an evaluation of the hypothesis is designed and implemented. Special care is needed when designing the evaluation.

4. Collection and interpretation of data: Once the evaluation is performed the gathered data must be scrutinised and understood. The degree of certainty with which we can make statements regarding the data is known as validity (O'Brien et al., 2005).


5. Conclusion: Having assessed the validity and interpreted the data it is now possible to draw conclusions with respect to the hypothesis. The conclusions may support or refute the hypothesis. Sometimes it may not be possible to draw a conclusion from an evaluation and the results may be deemed inconclusive.

6. Relating the conclusion to existing knowledge: In order to attain a greater understanding of the meaning of the conclusions drawn, the hypothesis, the experiments and the results should be positioned within existing literature.

7. Reporting and publishing results: This final step ensures that the knowledge gained is not lost. The publication of results will allow other researchers to confirm the claims made and, more importantly, to build further hypotheses that extend the work.

6.2 Validity

All research is based on some underlying assumptions about what constitutes valid research evaluation (step 3 of the scientific method) and which research methods are appropriate (Myers, 1997). Validity refers to the meaning of research results (O'Brien et al., 2005) and the degree to which one can make statements about those results from within the researcher's adopted research method. Thus it specifically concerns steps 4 and 5 of the scientific method presented at the beginning of the chapter. Validity is described in three ways (Perry et al., 1997b):

External validity is the degree to which the conclusions of a study are applicable to the wider population. The larger and more representative the sample population used, the more applicable the results will be.

Internal validity is the certainty with which we can say that the known independent variables in the study are the only causes of what was observed in the dependent variables. Internal validity can be maintained by producing many streams of complementary evidence (Kitchenham et al., 2005) that support the hypothesis being researched.

Construct validity refers to the degree to which the structure of the experiment affords the measurement of what the experimenter set out to measure. For example, the experiment may be valid, but the variable under scrutiny may not describe what is proposed in the hypothesis.

These descriptions of validity are often in conflict and difficult to balance. For example, to maintain a high level of external validity we may decide to perform studies of industrial programmers in their workplace as opposed to the laboratory. This makes the population more representative and is known as ecological validity (Kellogg, 2003; Perry et al., 1997b). However, in doing so, control over the variables of the experiment may not be possible, thus affecting internal validity.

6.3 Quantitative and Qualitative Research Methods

Once the researcher has decided upon a guiding philosophy for his research, he must then choose, from the array of research methods, ones that will be appropriate for evaluating his hypothesis. The types of research method can be categorised in many ways; however, the most common distinction made is between qualitative and quantitative research methods (Myers, 1997). Quantitative methods imply the ability to numerically measure facets of an experimental setting and include methods such as:

• Surveys - A written or oral survey of questions can be presented to a population and statistical results inferred from the answers (Yip, 1995).


• Laboratory experiments - Using a laboratory experiment the researcher can control the independent variables that affect the object of the hypothesis under scrutiny. Usually a single variable is adjusted by the researcher and the resulting effect observed. If the outcome is in line with the prediction of the hypothesis then one can say that the experiment has produced evidence in support of the hypothesis. Evaluating the outcome of such experiments is often associated with statistical hypothesis testing approaches (Carew et al., 2005).

• Formal methods - An example of a formal method would be econometrics, which is a combination of mathematical economics, statistics, economic statistics and economic theory (Myers, 1997).

Qualitative methods produce data of a textual nature, as opposed to the numerical output of quantitative methods. Qualitative methods can use a range of types of qualitative data1 to evaluate the hypothesis under scrutiny, such as data produced by:

• Action research - This type of research is aimed at examining hypotheses that can be applied directly to an industrial setting and their benefit assessed. This is not to be confused with applied science. In the case of action research there is a real contribution back to the scientific community, as well as industry, as a result of the application of the hypothesis (Myers, 1997).

• Case study - This type of method is an empirical enquiry that investigates one's hypothesis in a real-life context, known as in-vivo (Basili, 1996). However, the boundaries between what is under evaluation and the context are not necessarily clear (Myers, 1997). Often with a case study there are only one, or a few, data points (participants from a population); therefore the data is not suited to a statistical evaluation. A richer insight into the context is achieved through qualitative data capture.

1. Often produced by techniques that can also generate quantitative data.


• Ethnography - In an ethnographic study the researcher immerses himself in the context of the hypothesis under study. This is often very time consuming; however, it also provides a rich data set. A typical ethnographic study in an IT organisation may involve spending several months working as part of a software development team (Myers, 1997).

• Grounded theory - This research method suggests that a hypothesis to explain a certain phenomenon can emerge from an analysis of the gathered data, rather than an a priori hypothesis guiding the formation of the data gathering (Myers, 1997).

Data used to realise these qualitative research methods can be gathered from various data sources, including:

• Observation - The participant or object is simply observed with no interference apart from the study set-up.

• Interviews - The participant is prompted or questioned to express their views and answers on various topics of relevance to the study.

• Questionnaires - These are similar to the quantitative surveys mentioned above. However, using qualitative methods the questionnaire can also include essay-style answers, allowing the participant to express his or her opinion.

• Documents or texts - Documentation, emails, letters, memos, faxes, dictations and diaries can all be used as valid data sources.

• Researcher's impressions - The researcher himself may draw conclusions from observation during the study, before analysis.

• Think-aloud - The think-aloud method, pioneered by Ericsson and Simon during the 1980s (Ericsson and Simon, 1993), is implemented by having the participant


of a study speak his thoughts out loud while performing the tasks of the experiment. Think-aloud is known to provide the richest insight into a person's mental state at a given moment in time (Russo et al., 1989) when carried out in line with Ericsson and Simon's best practice guidelines (Ericsson and Simon, 1993).

6.4 The Culture of Research Evaluation in Computer Science

Thus far research methods in general have been discussed. While generally applicable to the research question of this thesis, the more specific research culture of computer science must also be considered when arguing for a chosen set of research methods. Existing reviews highlight a severe lack of research evaluation of any philosophical tradition in computer science (Glass et al., 2002; Segal et al., 2005). Only 14% of research papers surveyed in (Glass et al., 2002) were found to be evaluative. Even in a journal such as Empirical Software Engineering, whose focus is intended to be that of empirical studies, it was found that between the years of 1997 and 2003 only 53% of the papers within it were evaluative (Segal et al., 2005). Of the evaluative papers a hypothesis testing-based, quantitative approach dominated the evaluations. Furthermore, the evaluations tended to be laboratory-based, did not refer to other scientific disciplines and were not people focussed.

6.4.1 Arguing for Hybrid Approaches to Research in Computer Science

The culture of research in computer science, as highlighted in the previous section and by Basili in (Basili, 1996), indicates that computer science is an emerging discipline with an immature research model. Other, more established disciplines have seen a


research scenario emerge where the research community divides into two groups - theorists and practitioners. In physics, for example, theoretical physicists create mathematical models of the universe, while experimental physicists test these models. Likewise in medicine, theorists and practitioners of their emerging science exist. However, their fundamental difference from computer science is that the essence of what they are studying is unchanging - the nature of the universe will always be examined by physicists, and medical researchers will always be concerned with the human species. Computer scientists, on the other hand, not only attempt to improve the process that operates on an artifact in question, but the artifact itself can also be improved. Thus, in computer science the model of evaluation must be cognizant of both the process and the product (Basili, 1996). The closest scientific analogy to this scenario can be found in the manufacturing domain, where research is undertaken to improve the processes for producing products. However, similar to computer science, the product itself can also be improved. Therefore the role of the researcher in computer science is to understand the evolving nature of processes and products and the relationship between them (Basili, 1996). Moreover, in evaluating technologies or techniques that aim to improve software development, the human will always be a key element in their operation, and therefore in their evaluation. This complicates experimentation, since different results will be obtained depending upon the people involved (Basili, 1996). Research in the cognitive sciences has developed a long-established evaluation approach, called the socio-cultural perspective, which suggests that to evaluate a hypothesis involving people, studies should be undertaken using real activities, in real situations, in their natural environment (O'Brien et al., 2005). Advocates of the approach argue that the richness of context of such a setting cannot be replicated by any feasible laboratory-controlled evaluation. From a computer science perspective this translates to the suggestion that all models created by computer science theorists should eventually be evaluated by computer science


tioners in software laboratories where real, commercial software is actually being developed (known as in-vivo) (Harrison, 2006). As it currently stands, however, there is a gross imbalance between the body of theoretical models produced by computer science theorists and a corresponding body of work that evaluates these models, in favour of the former (Buckley, 2002; Basili, 1996). Proponents of purist positivist research philosophies often cite that in-vivo evaluation is not repeatable, therefore the results cannot be corroborated. Segal provides a retort that best counters this standpoint, “An argument is often made against field studies is that they cannot be replicated - but neither can a software software engineering activity in the real world (one cannot dip one’s toes in the same river twice!). Validation of the study cannot be based on the replication of the study but on the replication of the interpretation: the question to ask is, would other researchers from the same scientific cultural tradition as the original researcher(s) and given the same data, come to the same conclusions?” It should be noted that performing experiments in-vivo does not preclude the gathering of quantitative data. However, the degree to which once can draw conclusions from quantitative data gathered in an in-vivo experiment can be limited, since many subtle, immeasurable factors may be occurring external to those measurable factors. Such is the nature of human-based evaluations. Therefore, there is a convincing need for the use of qualitative data sources. Take, for example, in a hypothetical evaluation where the performance of a programmer using a tool is investigated and where the user unexpectedly underperforms. Quantitative measures of time and productivity gathered will measure this underperformance and results can be reported on this data. However, later upon the gathering of data using a qualitative data source, such as an interview, we find that that particular

6.5 A Research Model for This Thesis

109

participant had a headache that day, impeding his performance, thus highlighting that his underperformance had little to do with the process under evaluation; nor was it an accurate reflection of the average person using the process under evaluation. Therefore, vital information would have been overlooked had qualitative methods not been used alongside the quantitative ones. Also notice how the interpretivist and positivist philosophies complement one another in this instance. Furthermore, the key to a compelling evaluation is to provide a convincing argument in favour of one's hypothesis. To this end, mounting evidence should ideally be provided by many streams (Kitchenham et al., 2005). For example, a combination of quantitative measures and qualitative data sources, assessing both the product and the process, would evaluate a hypothesis from many angles. This is known as triangulation and is a means of creating a large body of evidence in support of one's hypothesis while also appeasing a range of research philosophies (Myers, 1997).

6.5 A Research Model for This Thesis

Deciding upon a research methodology depends primarily upon the research objectives. This thesis attempts to perform an initial evaluation of a repeatable process for component encapsulation that is useful to software engineers and industrially applicable. These objectives immediately highlight certain requirements when choosing an appropriate methodology for the thesis:
• Industrial applicability requires an ecologically valid setting for evaluation.
• Investigating the usefulness of the component encapsulation approach to programmers requires methods that can reveal the full complexity of human-computer interaction. Again, an ecologically valid setting would be advisable; however, this also presents strong motivation for the use of qualitative methods of evaluation.
• The outcome of the component encapsulation process is a software artifact, which is quantifiable. Therefore, quantitative measures in the form of software metrics would seem appropriate when assessing the product. Complementary qualitative measures should also be used to buttress the findings.
An important observation on these requirements is that the research methods required do not fall under a single research philosophy or method grouping. Immediately we see the opportunity for triangulation, as discussed in the previous section.

6.5.1 Empirical Techniques Employed

Some quantitative measures are used in an attempt to provide an objective evaluation of the product of our process:
• Software metrics used to assess the product of the process.
• Project data, such as the length of time for a project or lines of code.
Assessing the process for its usefulness requires more intricate methods that can examine the full richness and complexity of context and human-computer interaction. The available qualitative methods can account for this complexity. Importantly, the evaluation is carried out in-vivo (Basili, 1996), helping to preserve a high level of ecological validity (O'Brien et al., 2005). This in-vivo evaluation takes the form of several case studies. Qualitative data sources used in the evaluation include:
• Observation: The participant will be observed and video recorded. The result of this form of observation can then be analysed to gain insight into the process.
• Diaries: The participant will produce a diary of his experience of the process.
• Note-taking: At any point during the case study, interesting information with respect to the study can be recorded in the form of notes taken by the investigator.
• Think-aloud: During the participant's actuation of the process, the participant will be encouraged to speak his thoughts out loud. This data, it is expected, will provide a deep insight into the mental state and impressions of the participant during the process.
• Interviews: After the process has taken place, interviews will be used to further assess the process and also the product components produced as a result of the process.
• Project documents: Existing documentation with respect to the subject system can be used to further help the assessment of both the process and the product of the process.
From these streams of evidence a strong triangulation of evidence is built. Both the process and the product are evaluated using several research methods from both the quantitative and qualitative categories, with think-aloud data being the most used stream of qualitative evidence. Several actions have been taken to raise the validity of the studies:
• All of the studies performed in the thesis are designed to have high external, ecological validity. This is due to the in-vivo nature of all the evaluations.
• A high level of internal validity is maintained by creating several streams of evidence through triangulation.
• Construct validity is kept to a high level by clearly identifying the attributes of the process and the product that lead to quality components. This has been discussed during the earlier literature review.
The appropriateness of these measures has also been confirmed in a pilot study undertaken on Reconn-exion, found in (Le Gear and Buckley, 2005a).

Part II “Component Reconn-exion”: Reengineering Towards Components Using Variations on Reconnaissance and Reflexion

Chapter 7 Reconn-exion “Live out of your imagination, not your history.” - Stephen Covey.


In the previous two chapters an in-depth review of Software Reconnaissance and Reflexion Modelling was provided. Extensions of both these techniques are combined in this chapter to form a process called "Reconn-exion." This process, allied with its evaluation, forms the core contribution of this thesis.

7.1 A Conjecture for Prompting Component Abstractions

7.1.1 A New Reuse Perspective Derived from Software Reconnaissance

In section 2.5.2 the reusability of components was highlighted as an important quality attribute. Previously, in section 3.3, several types of software reuse internal to a system were defined, along with techniques for identifying them. In this section we define a new type of software reuse and a means of identifying it. Software Reconnaissance (chapter 4) showed how the source code responsible for implementing observable behavior of a system could be identified by gathering execution profiles. However, another facet of the functionality view provided by Software Reconnaissance is the set of shared software elements. This set contains software elements that are neither unique to the functionality in question nor utility code common to all features; rather, they are shared across some features but not all. This is calculated as

SHARED(f) = IIELEMS(f) - UELEMS(f) - CELEMS

That is, the set contains the software elements indispensably involved in feature f (IIELEMS(f)), excluding the elements unique to that feature (UELEMS(f)) and the common software elements (CELEMS). The set of shared software elements gives a snapshot of the software elements being reused by the features of the running system, from the context of feature f. Though difficult to visualise graphically, figure 7.1 is provided to facilitate understanding of the SHARED set.


features of running system, from the context of feature f. Though, difficult to visualise graphically, figure 7.1 is provided to facilitate your understanding of the SHARED set. In Norman Wilde’s seminal paper on Software Reconnaissance he remarks on the potential worth of the source code shared across features exhibited by a system, but never investigates it further (Wilde and Scully, 1995). We can extend this view by combining the sets of shared software elements for all features in the feature set. This gives a reuse view for the entire domain(s) or feature class(es) profiled and is calculated as:

⋃_{k=1}^{n} SHARED(f_k)

where n is the number of features in a particular feature set. At this point we no longer think in terms of features and the source code elements shared by features; the view produced should simply be considered as another reuse view, giving an interesting perspective on the software system. We call this view a feature-based reuse perspective of a software system, or the reuse perspective for short. In particular, this view should contain software elements that are generic, reused and architecturally core in the system. By reused we mean that the software element is used more than once within the context of the features under examination in the reuse perspective. By generic we mean that the potential exists for the elements to be reused in a wider context (i.e. reusable). Finally, by architecturally core, we are again alluding to the idea that the software elements are somewhat generic; such architecturally core source code may be generic boilerplate or framework code. The truth of our conjectures on the contents of the reuse perspective will be examined during the evaluation in this thesis and is one of its core contributions. Our hypothesis is that this shared code, reused in implementing more than one feature, is a useful starting point that warrants further investigation when identifying code that is reused, reusable, or forms part of architecturally core components of that system. This set is called the "reuse perspective" and is a core contribution of this thesis.

Figure 7.1: The three sets used to form the shared set.
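To make the set arithmetic above concrete, the following is a minimal sketch, in Java, of how the reuse perspective could be computed from per-feature execution profiles. The class and method names are hypothetical, and each feature's profile is taken to stand in for its set of indispensably involved elements, IIELEMS(f); an actual Software Reconnaissance tool would derive these sets from instrumented traces.

    import java.util.*;

    // Minimal sketch (hypothetical names): deriving the reuse perspective
    // from per-feature execution profiles, following the set definitions
    // of section 7.1.
    public class ReusePerspectiveCalc {

        // CELEMS: utility elements exercised by every feature profiled.
        static Set<String> celems(Map<String, Set<String>> profiles) {
            Set<String> common = null;
            for (Set<String> p : profiles.values()) {
                if (common == null) common = new HashSet<>(p);
                else common.retainAll(p);
            }
            return common == null ? new HashSet<>() : common;
        }

        // UELEMS(f): elements exercised by feature f and by no other feature.
        static Set<String> uelems(String f, Map<String, Set<String>> profiles) {
            Set<String> unique = new HashSet<>(profiles.get(f));
            for (Map.Entry<String, Set<String>> e : profiles.entrySet())
                if (!e.getKey().equals(f)) unique.removeAll(e.getValue());
            return unique;
        }

        // SHARED(f) = IIELEMS(f) - UELEMS(f) - CELEMS.
        static Set<String> shared(String f, Map<String, Set<String>> profiles) {
            Set<String> s = new HashSet<>(profiles.get(f)); // IIELEMS(f), approximated by the profile
            s.removeAll(uelems(f, profiles));
            s.removeAll(celems(profiles));
            return s;
        }

        // The reuse perspective: the union of SHARED(f_k) over all n features.
        static Set<String> reusePerspective(Map<String, Set<String>> profiles) {
            Set<String> view = new HashSet<>();
            for (String f : profiles.keySet()) view.addAll(shared(f, profiles));
            return view;
        }
    }

The house application of section 7.4 provides a small data set against which this calculation can be checked by hand.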

7.2 A Hypothesis for Encapsulating Components Using Reflexion Modelling

Another component quality attribute identified in chapter 2 was replaceability. This should be addressed in a component recovery process just as the reusability quality attribute was addressed in the previous section. Given the successes of Reflexion Modelling as a means of partitioning a system into higher-level abstractions, it is one of the conjectures of this thesis that Reflexion Modelling could also be adapted to unambiguously encapsulate components of existing systems and aid in the definition of their interfaces, thus supporting the important replaceability quality attribute necessary for a good component. We propose the following guidelines for Reflexion Modelling specifically for the encapsulation of components:
1. The programmer creates a high-level model that contains only two nodes:
   • A high-level node representing a first attempt at the component he wishes to encapsulate.
   • A second high-level node that represents the remainder of the system (see figure 7.2).
2. The programmer maps the appropriate elements of the software system to the nodes in the high-level model.
3. The user then iterates through the traditional Reflexion Modelling process, allowing the tool to build the reflexion model and the programmer to study the edges
between the nodes for expected and unexpected dependencies. Refinements at this stage are limited to changing the map only, not the high-level model. This process iterates until he is satisfied that the component has been encapsulated. This, we suggest, explicitly identifies the interface of the component, as illustrated by figure 7.2. (A sketch of the model computation the tool performs at this step is given after this list.)
4. The programmer then proceeds to divide up the rest of the system into its major constituent parts, by first altering the high-level model and then altering the mappings of the map appropriately. The division of the remainder of the system is usually guided by the programmer's domain knowledge of the major services provided by the system. This is shown in figure 7.3.
5. Again, the programmer will continue with several iterations until he is satisfied with the new breakdown of the system.
6. The model will now potentially show the dependencies that the component displays with several parts of the system. We suggest that this identifies the roles the component plays in the system.
Of course, the potential for variation upon these guidelines does exist. Some expected variations include:
• A user of the process may decide to define a sub-architecture for the component that he is encapsulating.
• A user may not be able to fully clarify the contents of his component until he has begun to break down the remainder of the system.
• A component with a single interface may be encapsulated. It will, at most, have one role in the system.
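Step 3 above relies on the tool computing the reflexion model from the map. The following is a minimal sketch, not jRMTool's actual implementation, of what that computation could look like in the two-node set-up: the Dep record, the node names "Component" and "RestOfSystem", and the defaulting of unmapped elements are all assumptions made for illustration.

    import java.util.*;

    // Sketch (hypothetical names): lifting source-level dependencies through
    // the map in the two-node set-up of the guidelines above.
    public class TwoNodeReflexionSketch {

        record Dep(String from, String to) {}

        // Assumption: unmapped elements default to the rest-of-system node.
        static String node(Map<String, String> map, String element) {
            return map.getOrDefault(element, "RestOfSystem");
        }

        // High-level edges of the reflexion model: one edge per pair of
        // distinct high-level nodes connected by some source dependency.
        static Set<Dep> liftedEdges(Set<Dep> sourceDeps, Map<String, String> map) {
            Set<Dep> lifted = new HashSet<>();
            for (Dep d : sourceDeps) {
                String from = node(map, d.from()), to = node(map, d.to());
                if (!from.equals(to)) lifted.add(new Dep(from, to));
            }
            return lifted;
        }

        // Source elements whose dependencies cross the component boundary;
        // these approximate the interface the component provides to, and
        // requires of, the rest of the system.
        static Set<String> interfaceElements(Set<Dep> sourceDeps, Map<String, String> map) {
            Set<String> boundary = new HashSet<>();
            for (Dep d : sourceDeps) {
                boolean fromIn = node(map, d.from()).equals("Component");
                boolean toIn = node(map, d.to()).equals("Component");
                if (fromIn != toIn) boundary.add(fromIn ? d.from() : d.to());
            }
            return boundary;
        }
    }

Studying the lifted edges, and comparing them with the dependencies the programmer expected, is what drives the map refinements of step 3.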


Figure 7.2: Encapsulating a component and making its interface explicit. (The model shows two nodes - the encapsulated component and the remainder of the system - with the edge between them forming the component's interface with the system.)

Figure 7.3: Identifying multiple interfaces on a component using Reflexion Modelling. (The model shows the desired component connected to three parts of the remainder of the system, with the edges to each part forming the multiple interfaces of the component.)

7.3 Hypothesising a Process For Component Encapsulation

The two previous sections defined two novel approaches, derived from Software Reconnaissance and Reflexion Modelling, designed to aid in the identification and encapsulation of components in software systems. This section brings these together as a new process for component recovery called Reconn-exion. Two motivations drive the decision to combine the reuse perspective and the variation on Reflexion Modelling into a single, aggregated process for component recovery:
• The reuse perspective, generated through the proposed adaptation of Software Reconnaissance, may be a useful means of narrowing the search for reusable, generic and core components of a system. However, it lacks the means of following through and allowing the user to explicitly encapsulate them. Reflexion-based techniques, in contrast, have a proven track record of being able to clearly identify the boundaries and contents of components in existing systems. Given this observation, the reuse perspective and the variation on Reflexion Modelling appear complementary to one another.
• Likewise, Reflexion Modelling has the potential to make recovered components more replaceable. However, the identification of clusters forming the map in a normal Reflexion model is most often based upon naming conventions in the source code (Kosche and Daniel, 2003; Christl et al., 2005). Existing tools explicitly support this by allowing the user to define regular expressions that encompass groupings of software elements with a specified naming convention (Murphy, 2003). Naming conventions have been shown to be an excellent means of aiding comprehension and are pervasive throughout industry (Refl, 2005). However, relying solely upon the naming conventions of a system does create a single point of failure for the Reflexion Modelling technique, hence the variation on it proposed by this thesis and, indeed, by other authors (Christl et al., 2005; Hassan and Holt, 2004). Any means of reducing the dependence upon naming conventions may be a benefit, and the reuse perspective is one potential means of reducing this dependence, since it is generated by analysing the behavior of the system.
The proposed process, illustrated by figure 7.4, will consist of the following steps:
1. The proposed adaptation of Software Reconnaissance is performed on the subject system, automatically producing the reuse perspective as output.
2. The participant is then presented with the reuse perspective of the subject system. He uses this, combined with the existing naming convention approach, to prompt initial mappings for possible component abstractions in the system.

Figure 7.4: The Component Reconn-exion process. (The figure depicts execution profiles feeding a Software Reconnaissance tool that produces the reuse perspective; the user browses the reuse perspective, chooses a component to recover, creates an initial Reflexion model - high-level model and map - using the adapted approach, and repeats the process until the component is encapsulated.)

3. From these component abstractions the participant chooses a component of interest that he wishes to recover.
4. The participant then creates his initial Reflexion model, as prescribed by this thesis' proposed variation, with the map being prompted by the examination of the reuse perspective.
5. Further iterations of the variation on Reflexion Modelling are undertaken. The participant is free to refer to the reuse perspective at any stage. This continues until the component is encapsulated.
This integrated process for component recovery is called "Component Reconn-exion" and is another core contribution of this thesis.
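As a small illustration of how the map of step 4 might be seeded programmatically, the sketch below (again with hypothetical names) maps every element in the reuse perspective to the candidate component node and everything else to the rest-of-system node; the engineer then refines this seed over subsequent iterations.

    import java.util.*;

    // Sketch: seeding the initial two-node map of step 4 from the
    // reuse perspective.
    public class SeedMap {
        static Map<String, String> seed(Set<String> reusePerspective, Set<String> allElements) {
            Map<String, String> map = new HashMap<>();
            for (String e : allElements)
                map.put(e, reusePerspective.contains(e) ? "Component" : "RestOfSystem");
            return map;
        }
    }

Because such a seed is derived from the behaviour of the system rather than its identifiers, it reduces the reliance on naming conventions discussed above.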


Table 7.1: Features identified for the house application.

No.  Feature
1    Start and stop the application.
2    Rotate house.
3    Translate house.
4    Scale house.
5    Change house colour.

7.4 A Small Example

This section introduces a small example to demonstrate the Reconn-exion process.

7.4.1 The House Application

A small house application is used as the example. The application was an undergraduate, third-year graphics project created by the author. Figure 7.5 is a screenshot from the application. Its purpose is to allow the user to manipulate the drawing of the house on the screen: the user can scale the size of the house, change its colour, rotate it and translate it to a different position. The application is approximately 1000 LOC in size, contained in one file and written in Java.

7.4.2 Part 1: A Reuse Perspective

The first part of Reconn-exion requires that a reuse perspective of the system be generated. This involves:
• Identifying the features of the application.
• Exercising appropriate test cases to exhibit and profile the named features.
• Examining the profiles to calculate the reuse perspective.


Figure 7.5: A screenshot of the house application.


The features identified by the author in the house application are listed in table 7.1. Next, test cases were designed and executed for each feature, and their execution was profiled. In this example only one test case is created for each feature; however, one could potentially execute many test cases for a feature, with small profiling differences. The profiles for each of the test cases exercised the following procedures:

• Test case 1, exhibits feature 1 (Start and stop the application) and exercises:
  – AssignOne9921966.main
  – House_GUI.House_GUI
  – House_Transforms1.Bezier_Walls
  – House_Transforms1.Draw_House
  – House_Transforms1.Line_Clipped_House
  – House_Transforms1.Setup
  – House_Transforms1.bezier
  – House_Transforms1.clipcode
  – House_Transforms1.clipline
  – House_Transforms1.drawLine_Level1
  – House_Transforms1.drawLine_Level2
  – House_Transforms1.middle
  – House_Transforms1.paint
• Test case 2, exhibits feature 2 (Rotate house) and exercises:
  – AssignOne9921966.main
  – House_GUI.House_GUI
  – House_Transforms1.Bezier_Walls
  – House_Transforms1.Draw_House
  – House_Transforms1.Line_Clipped_House
  – House_Transforms1.Rotate_House
  – House_Transforms1.Setup
  – House_Transforms1.Translate_House
  – House_Transforms1.bezier
  – House_Transforms1.clipcode
  – House_Transforms1.clipline
  – House_Transforms1.drawLine_Level1
  – House_Transforms1.drawLine_Level2
  – House_Transforms1.matrix_multiplication
  – House_Transforms1.middle
  – House_Transforms1.paint
• Test case 3, exhibits feature 3 (Translate house) and exercises:
  – AssignOne9921966.main
  – House_GUI.House_GUI
  – House_Transforms1.Bezier_Walls
  – House_Transforms1.Draw_House
  – House_Transforms1.Line_Clipped_House
  – House_Transforms1.Setup
  – House_Transforms1.Translate_House
  – House_Transforms1.bezier
  – House_Transforms1.clipcode
  – House_Transforms1.clipline
  – House_Transforms1.drawLine_Level1
  – House_Transforms1.drawLine_Level2
  – House_Transforms1.get_firatX
  – House_Transforms1.get_firstY
  – House_Transforms1.matrix_multiplication
  – House_Transforms1.middle
  – House_Transforms1.paint
• Test case 4, exhibits feature 4 (Scale house) and exercises:
  – AssignOne9921966.main
  – House_GUI.House_GUI
  – House_Transforms1.Bezier_Walls
  – House_Transforms1.Draw_House
  – House_Transforms1.Line_Clipped_House
  – House_Transforms1.Scale_House
  – House_Transforms1.Setup
  – House_Transforms1.Translate_House
  – House_Transforms1.bezier
  – House_Transforms1.clipcode
  – House_Transforms1.clipline
  – House_Transforms1.drawLine_Level1
  – House_Transforms1.drawLine_Level2
  – House_Transforms1.matrix_multiplication
  – House_Transforms1.middle
  – House_Transforms1.paint
• Test case 5, exhibits feature 5 (Change house colour) and exercises:
  – AssignOne9921966.main
  – House_GUI.House_GUI
  – House_Transforms1.Bezier_Walls
  – House_Transforms1.Draw_House
  – House_Transforms1.Line_Clipped_House
  – House_Transforms1.Setup
  – House_Transforms1.bezier
  – House_Transforms1.change_color
  – House_Transforms1.clipcode
  – House_Transforms1.clipline
  – House_Transforms1.drawLine_Level1
  – House_Transforms1.drawLine_Level2
  – House_Transforms1.middle
  – House_Transforms1.paint

Given the profiles listed, the reuse perspective can be calculated from the shared sets, as described in section 7.1. Two elements comprise the resulting reuse perspective for the house application:


House_Transforms1.Translate_House: This is a generic method that can be used to translate coordinates to different positions. It is directly involved in implementing the translate house feature (no. 3), and indirectly involved in the implementation of the rotate and scale house features, making it a generic component, core to this application.

House_Transforms1.matrix_multiplication: All the features of the house application that involve graphical transforms (nos. 2, 3 and 4) are implemented, in part, using matrix multiplication. However, the use of the method is not necessarily confined to graphical transforms; it could potentially be used to multiply any two matrices. Again, this makes the method reusable within the application, but also reusable in many graphical or maths-based applications.

The size of the reuse perspective is small because the set of common software elements is large. As predicted, the elements of the reuse perspective seem architecturally core, generic and reused. Steps one and two of the Reconn-exion process state that the reuse perspective should be presented to a software engineer, who can examine and use portions or all of it to help him begin the component recovery process. This example provides an excellent starting point from which to begin recovering a "transforms" component, encapsulating generic graphical transformation routines, from the house application. Both elements of the reuse perspective, in this case, will form part of the new component. Given larger applications, with larger reuse perspectives, it may be possible to notice several potential components within the reuse perspective.
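As a check, this reuse perspective can be reproduced mechanically from the five profiles listed earlier. The driver below is a hypothetical example reusing the ReusePerspectiveCalc sketch from section 7.1, with the house application's profiles abbreviated to unqualified method names:

    import java.util.*;

    // Hypothetical driver: the five house application profiles of
    // section 7.4.2 fed to the ReusePerspectiveCalc sketch.
    public class HouseExample {

        // The thirteen elements exercised by every test case, plus extras.
        static Set<String> profile(String... extras) {
            Set<String> p = new HashSet<>(Arrays.asList(
                "main", "House_GUI", "Bezier_Walls", "Draw_House",
                "Line_Clipped_House", "Setup", "bezier", "clipcode", "clipline",
                "drawLine_Level1", "drawLine_Level2", "middle", "paint"));
            Collections.addAll(p, extras);
            return p;
        }

        public static void main(String[] args) {
            Map<String, Set<String>> profiles = new HashMap<>();
            profiles.put("start-stop", profile());
            profiles.put("rotate", profile("Rotate_House", "Translate_House", "matrix_multiplication"));
            // "get_firatX" is transcribed verbatim from the profile listing.
            profiles.put("translate", profile("Translate_House", "get_firatX", "get_firstY", "matrix_multiplication"));
            profiles.put("scale", profile("Scale_House", "Translate_House", "matrix_multiplication"));
            profiles.put("colour", profile("change_color"));

            // Prints Translate_House and matrix_multiplication (in some order).
            System.out.println(ReusePerspectiveCalc.reusePerspective(profiles));
        }
    }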

7.4.3 Part 2: Encapsulating with Reflexion

Now that we have chosen the component we wish to encapsulate, we must describe it using our adaptation of Reflexion Modelling. We begin by creating a high-level model of the application consisting of a transforms component node and a remainder of system component node. For our initial map we map the elements prompted to us by the reuse perspective to the transforms component node, and the remainder of the elements to the rest of system component node. This is illustrated by figures 7.6 and 7.7; the map prompted from the reuse perspective is very much a seed set with regard to mappings. From this initial reflexion model, prompted by the reuse perspective, we can now attempt the encapsulation of our component. By examining the edges between the transforms component and the rest of the system, further software elements that should be mapped to the transforms component were identified (see figure 7.6).

Figure 7.6: First house application Reflexion model.

Figure 7.7: First house application Reflexion model map.


Figure 7.8: Second house application Reflexion model.

The second reflexion model produced took further elements into the mapped definition of the transforms component. This is shown in figures 7.8 and 7.9. In this case it is interesting to note the seemingly high degree of two-way coupling between the transforms component and the rest of the system. On further investigation it was found that these dependencies were solely on the Java framework and could later be ignored; this is the reason that we were later able to remove this link. The edges entering and leaving the transforms component of the reflexion model in figure 7.9 represent the interface that the component provides to, and requires of, the rest of the system (an investigation of the edge with the jRMTool will provide a list of the software elements involved in that edge). However, we can gain better insight into the different roles that the component plays in the system by investigating in further detail the transforms component's interaction with other components of the system. For this reason we now begin to divide up the remainder of the system in an effort to reveal multiple interfaces of our component. We decide that the rest of system component can be divided into "GUI," a component that handles the display of graphics, and "Main," which is the main driver of the application. This resembles a model-view-controller (MVC) architectural style.

class="House_Transforms" mapTo="Transforms"/>
method="House_Transforms" mapTo="Transforms"/>
method="matrix_multiplication" mapTo="Transforms"/>
method="Translate" mapTo="Transforms"/>
method="bezier" mapTo="Transforms"/>
method="Bezier_Walls" mapTo="Transforms"/>
method="change_color" mapTo="Transforms"/>
method="clipCode" mapTo="Transforms"/>
method="clipLine" mapTo="Transforms"/>
method="Draw_House" mapTo="Transforms"/>
method="drawLine_Level1" mapTo="Transforms"/>
method="drawLine_Level2" mapTo="Transforms"/>
method="get_first" mapTo="Transforms"/>
method="Line_Clipped_House" mapTo="Transforms"/>
method="middle" mapTo="Transforms"/>
method="paint" mapTo="Transforms"/>
method="Rotate" mapTo="Transforms"/>
method="Scale" mapTo="Transforms"/>
method="Setup" mapTo="Transforms"/>
method=".*" mapTo="RestOfSystem"/>
class=".*" mapTo="RestOfSystem"/>

Figure 7.9: Second house application reflexion model map.


Figure 7.10: Third house application reflexion model.

The resulting reflexion model and map for this iteration are shown in figures 7.10 and 7.11. The reflexion model reveals two interfaces, defined by the edges of the model; the contents of these interfaces can be found in appendix G. Given an arbitrary system, the user may choose to refine the model further from here, prompting further iterations. In this particular case we are satisfied with the component we have encapsulated.

method="House_GUI" mapTo="GUI"/>
class="House_Transforms" mapTo="Transforms"/>
method="House_Transforms" mapTo="Transforms"/>
method="matrix_multiplication" mapTo="Transforms"/>
method="Translate" mapTo="Transforms"/>
method="bezier" mapTo="Transforms"/>
method="Bezier_Walls" mapTo="Transforms"/>
method="change_color" mapTo="Transforms"/>
method="clipCode" mapTo="Transforms"/>
method="clipLine" mapTo="Transforms"/>
method="Draw_House" mapTo="Transforms"/>
method="drawLine_Level1" mapTo="Transforms"/>
method="drawLine_Level2" mapTo="Transforms"/>
method="get_first" mapTo="Transforms"/>
method="Line_Clipped_House" mapTo="Transforms"/>
method="middle" mapTo="Transforms"/>
method="paint" mapTo="Transforms"/>
method="Rotate" mapTo="Transforms"/>
method="Scale" mapTo="Transforms"/>
method="Setup" mapTo="Transforms"/>
method="AssignOne9921966" mapTo="Main"/>
class="AssignOne9921966" mapTo="Main"/>
class="House_GUI" mapTo="GUI"/>

Figure 7.11: Third house application reflexion model map.

Chapter 8 Evaluating the Basis Techniques of Reconn-exion “Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so.” - Douglas Adams.

8.1 Validating the Reuse Perspective: The JIT/S Shipping Case Study

8.1.1 Purpose and Research Questions

The following case study takes the reuse perspective in isolation and evaluates it in an industrial-scale context. It should be noted that the entire process is not being evaluated here; rather, the hypothesis being evaluated is that the reuse perspective contains software elements of a system that are reused, core and/or generic. To clarify: by reused we mean that the software element is used more than once within the context of the features under examination in the reuse perspective. By generic we mean that the potential exists for the elements to be reused in a wider context (i.e. reusable). Finally, by architecturally core, we are again alluding to the idea that the software elements are boilerplate or framework code, which may also be generic. The evaluation is intended to answer the following questions:
1. Are some elements of the reuse perspective generic?
2. Are some elements of the reuse perspective architecturally core to the system?
3. Do all elements of the reuse perspective fall into at least one of the categories mentioned (reused, generic, core)?

8.1.2 The Subject System: JIT/S

The subject system for this case study is a commercially available, just-in-time, just-in-sequence application called JIT/S that extends an ERP (Enterprise Resource Planning) application called MFGPRO (Ltd., 2006). Both of these are products of a company called QAD, who specialise in producing ERP software products and services, catering specifically to the automotive domain vertical.

Table 8.1: JIT/S Summary.

Age        30 years
No. Files  5
KLOC       20
Language   Progress 4GL

QAD was founded in the mid-1970s in California and quickly became a market leader in its product space. We specifically examine the Shipping module of JIT/S in this case study. The combined application was in excess of 2 MLOC; a summary of details for the shipping module can be found in table 8.1. Progress 4GL, the language used to develop JIT/S, is a fourth-generation language that contains many domain-specific features for the manufacturing sector. The language is largely procedural but also contains many features that closely tie it to database implementations. For example, database SELECT and UPDATE statements are actual keyword language constructs.

8.1.3 The Participants

Two participants were involved in this case study:
1. An ex-employee of QAD Ltd., familiar with the behavior of the shipping module, participated to identify the features of the shipping module and gather the profiles necessary to compute the reuse perspective. For the purposes of this study she will simply be referred to as "the participant."
2. An original architect and implementor of the shipping module participated in an interview to evaluate the contents of the reuse perspective with respect to the research questions posed in section 8.1.1. For the purposes of this study he will simply be referred to as "the architect."
Details of both participants can be found in table 8.2.


Table 8.2: Particulars of the participants of the JIT/S study.

                 The Participant   The Architect
Age              29                40+
Yrs. Experience  2                 20
Yrs. in Company  1                 8
Education        Degree            Degree

8.1.4 Tool Support

The tools used to enable this study were:
• The Progress 4GL tracing tool, Pro*Spy Plus (McIntosh, 2003). This tool allows program traces to be gathered from an executing Progress 4GL program.
• A tool developed by the author at the University of Limerick, called "ReconCalc," which was used to calculate the results of the Software Reconnaissance and the reuse perspective from the gathered profiles / traces (Le Gear, 2006).

8.1.5 The Study Protocol

The following protocol is adopted for the study:
1. All aspects of the study will take place in the working environment of both the participant and the architect.
2. Based on her domain knowledge of the shipping module and her perusal of the module's documentation, the participant will be asked to create a comprehensive list of features for the shipping module.
3. The participant will then gather traces for each of these features by running the JIT/S system.
4. The participant's role in the study will then be finished.


5. The profiles will be taken away by the coordinator of the study and the reuse perspective calculated using ReconCalc.
6. Next, in a separate session, the reuse perspective will be presented to the architect for evaluation.
7. This evaluation takes the form of an interview and is audio recorded. It is hoped that the evidence gathered here will answer the research questions in section 8.1.1.
8. In the interview the architect will have a printout of the reuse perspective and will be asked about each element in it.

8.1.6 Case Study Part 1: Generating the Reuse Perspective

The participant identified 19 features, based upon the user manual for the shipping module and her domain knowledge of it. The list should be seen as fairly comprehensive for the module because of both the extensive domain experience of the participant and the knowledge she gathered from browsing the module's documentation. This list is summarised in table 8.3. She proceeded to design appropriate test cases to exhibit these features and subsequently retrieved traces using the Progress 4GL Pro*Spy tracing application (McIntosh, 2003). Using ReconCalc, the reuse perspective was then calculated. The process, from identification of features to the creation of the reuse perspective, was undertaken in approximately three hours. The resulting reuse perspective contained 14 software elements (see next section). In this case a software element corresponds to a Progress 4GL subprocedure.

Table 8.3: Features identified in JIT/S.

No.  Feature
1    Start-Stop
2    Scan in Package
3    Create Packing slip
4    Add package to existing packing slip
5    Use existing BOL
6    Create 2nd packing slip
7    Scan package not in stock ERROR
8    Scan same package twice ERROR
9    Scan a package already on another BOL ERROR
10   Try to scan a package that's finalised ERROR
11   Finalise a BOL & close the shipment for shipping
12   Finalise a packing slip with one package
13   Finalise a packing slip with many packages
14   Finalise many packing slips with many packages
15   Delete a package
16   Delete a packing slip, therefore the packing slip too
17   Delete a packing slip that has many packages
18   Delete the BOL, therefore the packages and packing slips too
19   Create BOL

8.1.7 Case Study Part 2: Addressing the Research Questions with an Architect's Critique

Next, we presented the reuse perspective to the architect for an evaluation of its contents. It is important to note, at this point, that the architect is evaluating the contents of the reuse perspective and not its ability to serve as a prompt for component abstractions. An evaluation of the latter will be carried out in the next chapter; first we must confirm that the contents of the reuse perspective are relevant to the task. Therefore it is appropriate, in this case, to use the architect to evaluate the contents. The research questions outlined in section 8.1.1 are addressed by the evaluation in this section. The architect explained to us the use, role and function of each of the fourteen elements of the reuse perspective. The following is a paraphrasing of his comments, with actual phrases from his summary included in quotations. Note that the file names have been obfuscated for copyright reasons. Also, it is important to note that the architect was prompted to comment explicitly on the genericity, reusability and core-ness of the elements of the reuse perspective:
1. tiuecpeftcp3.w, CreateBOL - A "general purpose" procedure that creates a place holder for bill of lading (BOL) information. Used by "almost every feature" in the shipping module.
2. tiuicpifgvm3p.w, getObjectType - "Framework" code and thus by definition "generic." This is "architecturally core" to all Progress 4GL applications, not just the shipping module.
3. shtdbodeful2o.w, getObjectType - "Framework" code and thus by definition "generic." This is "architecturally core" to all Progress 4GL applications, not just the shipping module.
4. tiuecpeftcp3.w, GetRuleGroup - This is a "generic" routine that should be "used by any procedure" involving packing slips. User-specific customisations are stored in a rule group; every time an operation on a packing slip occurs, the rule group is referred to via this procedure.
5. tiuecpeftcp3.w, RetrieveObjectHandles - This procedure is "architecturally core" and is intended to be used by customers who wish to write their own extensions to JIT/S. The procedure is used as a "form of API," returning the internal data structures of the shipping module to a programmer who wishes to retrieve data from the shipping module programmatically.
6. tiuecpeftcp3.w, AddPackage - "Functionally core" to the behaviour of shipping.
7. tiuecpeftcp3.w, IdentifyCustomer - A "general purpose" routine that retrieves a customer based upon an ID from a packing slip. It is "used application wide."
8. tiuecpeftcp3.w, AddPackageObj - Stores data for a package and is "used application wide."
9. tiuecpeftcp3.w, RefreshQuery - An "architecturally core" procedure, used for synchronisation; it manages updating between the database and the Progress Dynamics framework (McIntosh, 2003).
10. tiuecpeftcp3.w, FinalizeShipment - This is the "common step" for all business processes and workflows through the shipping module.
11. tiuecpeftcp3.w, RunRuleGroup - This instantiates the rule group retrieved earlier. It is another "highly reused" procedure.
12. tiuecpeftcp3.w, RemovePackage - This is "used by all deletion" operations - not just the removal of packages.


13. tiuecpeftcp3.w, serverCommitTransaction - "Framework" code and by definition "generic." This is "architecturally core" to all Progress 4GL applications, not just the shipping module.
14. tiuecpeftcp3.w, getObjectType - "Framework code" and thus by definition "generic." This is "architecturally core" to all Progress 4GL applications, not just the shipping module.

Table 8.4: Summary of reuse perspective.

             Reused   Generic   Architecturally Core   > 1 Attribute
Elements     14       9         8                      11
Percentage   100%     64%       57%                    79%

The results of the examination are summarised in table 8.4, which details how many of the 14 summaries made by the architect contained the descriptive key phrases of "reuse," "genericity," and "architecturally core." Importantly, the entire set was confirmed by the architect as being reused by many features. Large proportions of the remainder of the set were cited as architecturally core or generic, and large proportions of the set also had two or three of these attributes. This evidence appears to support the research questions posed in section 8.1.1. Of the fourteen entries, twelve originated from the same procedure file, "tiuecpeftcp3.w." This was an interesting correlation, and when the architect was asked about the significance of this file he stated: "That's the GOD class / the controller - That's the API class. That's your way into shipping." "tiuecpeftcp3.w" acts as an API through which all other modules of the system communicate with the shipping module. One would expect this file to be used by a multitude of features, and this is evidenced as true, given the output of the reuse perspective.


Again, this is evidence in support of the research questions posed in section 8.1.1. This finding also raises the question: were there more than just 14 reused elements? Almost certainly the answer is yes. Reasons that the reuse perspective did not capture all of them could include:
• Not all the features may have been elucidated.
• The other reused elements may not be subject to the same type of reuse that the reuse perspective captures (refer back to the discussion of reuse types in section 3.3).
• It is possible that a test case did not correctly trace the right feature. For example, a test case may not have been correctly designed and therefore may not actually have exhibited the desired feature.

8.1.8 Discussion

From the analysis of the architect's evaluation of the reuse perspective, 100% of the contents of the reuse perspective were marked as being reused. This result should come as no surprise, since the formal definition of the set in section 7.1 is the set of reuse across features. Nonetheless, empirical evidence in support of that fact is comforting. At the beginning of this section three specific evaluation questions were posed, to be answered during the case study:
1. Are some elements of the reuse perspective generic (i.e. reusable)?
• 64% of the elements of the reuse perspective were marked as generic. This suggests that more than just "some" of the contents are generic; potentially a "large proportion" or even a "majority" are generic.
2. Are some elements of the reuse perspective core to the system (i.e. elements of the system that could be considered framework, infrastructural or rarely changing)?


• 57% of the contents of the reuse perspective were marked as "core." This result is similar to the result for the "generic" property. Adding further weight to this stream of evidence was the evaluator's comment regarding the "GOD class," with reference to the fact that almost the entire contents of the reuse perspective belonged to a single file that represented the core gateway to that module.
3. Do all elements of the reuse perspective fall into at least one of the categories mentioned (reused, generic, core)?
• This question yielded a conclusive yes, with most of the contents having two or even three of these attributes.
However, the extent to which these results can be generalised to other systems is difficult to gauge. The ecological validity of the experiment is strong: the system was a real commercial system of substantial size, and the use of the original architect provided an evaluation separate from the creator of the reuse perspective, thus introducing a degree of objectivity. However, as with ecologically valid studies of this nature (Basili, 1996), the population size is small and control of variables is difficult (and of limited value, given that such studies typically generate one data point). Their value is in having high ecological validity and in the richness of the data, which invites and supports deeper analysis. In an effort to address these issues, the evaluation of the reuse perspective is revisited during the AIM case study in section 9.


8.2 Validating Reflexion-Based Component Encapsulation: The Workspace Case Study

8.2.1 Purpose and Research Questions

In section 7.2 we described a plausible technique for component encapsulation. This section takes the proposed technique and actuates it in an industrial setting, under realistic circumstances, on a large commercial system. Two software engineers who currently maintain that system volunteered as participants. They performed their component encapsulation tasks in their normal working environment, with the exception that the Reflexion tool plug-in (jRMTool) was installed in their IDE. Both the participants' processes and products were subsequently evaluated to assess:
1. Did the process seem to support programmers in component encapsulation?
2. Did the participants adhere to the process, or can the case study reveal improvements/refinements to our guidelines?
3. Is the product of the process a valid component?
4. Is this a better means of component encapsulation than the current approach employed in the company?
It is important to note that this case study, unlike the previous one, does not examine the reuse perspective; it examines only the Reflexion Modelling-based portion of Reconn-exion.

8.2.2 The Subject System: The Learning Management System

The subject system in this case study is the Learning Management System (LMS), currently being developed and maintained at IBM’s Dublin Software Laboratory, Ireland.


It provides collaborative services for administering eLearning to the employees of a company. Specifically, it makes learning content available to employees as courses, allowing them to enroll in those courses, tracking their progress through courses and reporting on mandatory training compliance. The LMS is part of a larger application called "Workplace Collaborative Learning" (WCL). WCL adds to the LMS collaborative features, which allow employees and instructors to discuss and share information and resources around their courses, and skills management features, which can be used to assist in the automatic provisioning of eLearning to employees. The current LMS was originally developed from a legacy stand-alone product that was altered to become part of WCL. The WCL in turn forms part of a yet larger application called "Workplace Collaborative Services" (WCS), or simply Workplace. WCS aims to provide employees with a new way to perform their work, where all their applications (email, calendaring, document management) are integrated with each other and with the other employees in the company through collaborative applications like instant messaging (IM), web conferencing and presence awareness. WCL is intended to be the part of Workplace that administers learning to the company's employees. The sub-application, in toto, comprises approximately half a million lines of code.

8.2.3 The Participants

Two volunteer participants, who are developers of the LMS, took part in the study. Additionally, two architects of the LMS participated as evaluators of the output produced by the participants. The details of both the participants and the architects are summarised in table 8.5.


Table 8.5: Workplace case study participant details.

                     Participant 1   Participant 2   Architect 1   Architect 2
Age                  20-25           20-25           35-40         35-40
Yrs. Experience      2               2               10            10+
Yrs. Experience LMS  2               2               3             8
Education            Degree          Degree          Degree        Degree

8.2.4 Participant Tasks

The task of the first participant was to arrive at a better architectural understanding of one of the components of the system (in the work context of arriving at a better architectural understanding of the system as a whole). This was one of his work tasks, and although not his highest priority, the task was ecologically valid. The second participant's task was a legitimate maintenance task that the support team had attempted before, but had cancelled on realising the associated difficulty (it was a GUI and business logic separation project). The participant also undertook some refactoring of the system, prompted by the technique.

8.2.5 Tool Support

To enable the case study to proceed we used the jRMTool reflexion modelling plugin (Murphy, 2003), provided by Gail Murphy, for the Eclipse Java IDE (Eclipse IDE Homepage, 2005). This tool provides automated support for creating and viewing high-level models, for mapping software elements to high-level model elements, for displaying the resulting Reflexion models and for displaying summary information regarding the edges of the model and unmapped values. A screenshot of the tool can be seen in figure 8.1. As can be seen in the figure, the "mappings" of the source code to the high-level model are expressed in a basic XML language. Also note that, by clicking on an edge, the user gets a list of the relationships that make up that edge ("edge information"), and that unexpected edges in the Reflexion model in figure 8.1 are displayed with a dashed line, in accordance with the description of the technique in chapter 5. It is worth noting how natural a fit this tool was for the participants: as an Eclipse plugin, it integrated with their normal work tool.

Figure 8.1: A screenshot of the jRMTool Eclipse plug-in.

8.2.6 The Study Protocol

The following protocol will be adopted for the study:
• An existing piece of work, that is also a suitable application of component recovery, will be assigned to the participant. This task will not necessarily be the same for each participant.

• The component encapsulation task will take place at the participant's work desk and on the machine that the participant uses (i.e. his natural work environment). The situation will be kept as ecologically valid as possible as a result of this measure.
• The option to undertake his assigned task over several sessions will also be afforded to the participant. In this way the task will assume further ecological validity by making it easier for the task to slot in with the participant's natural working schedule.
• Prior to the case study each participant will be trained in the use of the jRMTool tool.
• Once the case study begins, each participant will perform the component encapsulation task in a separate session (to each other).
• The session will be video or audio recorded, if possible. Furthermore, each iteration of the Reflexion model generated will be saved by the user. It is hoped that this evidence, with other data sources, will be sufficient to answer the process-based questions posed in section 8.2.1.
• The study will begin with the participant being instructed to think aloud for the duration of the study. This instruction, along with the chosen encapsulation task, will be the only direction given.
• When the participant decides that the task is completed, this portion of the study will finish.
• Next, the encapsulated components will be presented to the original architects of the system for evaluation.
• It is hoped that this portion of the evaluation will provide evidence to answer the product-based research questions posed in section 8.2.1. The degree of freedom given to the participant by this protocol is important to maintain the ecological validity of an in-vivo study (O'Brien et al., 2005; Basili, 1996).

8.2.7 Enacting the Process

8.2.7.1 Participant 1

The first participant chose to encapsulate a Helpers Data Transfer Objects component in the LMS. The participant was aware of this logical entity as a component and of its use, but was not certain of its makeup or how it was connected in the system. The Helper objects are data transfer objects (DTOs, also known as Value Objects) that get passed between the different layers of the system. The helper DTOs attempt to encapsulate some business object, such as a course or educational offering, retrieved from or persisted to the underlying database or visual output. For instance, when an HTML form is submitted to the presentation layer, the Java Struts components build a helper DTO and fill it with all the data from the form, before passing it on to the service layer for further processing. From there, the helper DTO may be passed to the persistence layer, or back up the stack again, as required. It is a tidier method of passing data around than having interfaces with many individual parameters, but does incur some overhead. Using the Eclipse plug-in described in the tool support section, the participant spent two hours in a single sitting encapsulating the Helpers DTOs component, using our prescribed guidelines. Elements were added to the high-level model through ten model iterations, in the following sequence (the steps of the proposed Reflexion-adapted process for each iteration are given in square brackets):
• Iteration 1: Added RestOfLearning and Helpers. [Steps 1 and 2]
• Iteration 2: Added Struts. [Steps 3, 4, 5]

8.2 Validating Reflexion-Based Component Encapsulation: The Workspace Case Study 151 • Iteration 3: Replaced Struts with StrutsActionHanders, StrutsActionForms. [Step 3,4,5] • Iteration 4: Added Services. [Step 3,4,5] • Iteration 5: Replaced Services with ServiceInterfaces, ServiceImpls. [Step 3,4,5] • Iteration 6: Added PersistenceMgrs. [Step 3,4,5] • Iteration 7: Added JSPTags and WebserviceAPIs. [Step 3,4,5] • Iteration 8: Added DSPersistenceBeans and RemoteProcessingCommands. [Step 3,4,5] • Iteration 9: Added LMMPersistenceBeans. [Step 3,4,5] • Iteration 10: No further additions - Finalised Map. [Step 6] In creating his first Reflexion model (see point 1 above) the participant followed step one and two of the prescribed process by separating out the helper DTO component from the remainder of the system, based on the naming conventions of the code base. He felt, after this first model, that he had correctly identified the contents of the helper DTO component. Figure 8.2 shows the initial reflexion model and corresponding map. However, he did comment that he would not be absolutely certain of this until he separated out the “RestOfLearning” node. Over the course of a further nine iterations 12 high-level elements were added to the model and two replacements were made, thus iterating through steps three, four and five of our process repeatedly. The models produced can be found in the appendices. Replacements were typically made in situations where a high-level element needed to be subdivided into further sub-architectures. For example, in iteration five, the participant replaced the services component with two components - one which provided interfaces to the services and another which provided the implementation of the services.


Figure 8.2: First iteration by the first participant.

By the ninth iteration in the process (figure 8.3) the participant was satisfied that the important interactions with the helper DTO had been isolated, thereby identifying its many interfaces with the system. This was confirmed when the participant finalised his map in the tenth Reflexion model, thus completing step six of our process.

8.2.7.2 Participant 2

The second participant chose to encapsulate the web administration user interface (UI) of the LMS. This component forms part of a larger web interface component. The web component is the collection of Java classes that comprises the front end of the legacy web application of the LMS, prior to its integration with Workplace (the LMS was originally a standalone J2EE-based web application but has since been integrated into Workplace). As the LMS is now being integrated into Workplace, it was important to partition the system into its legacy web administration element and the rest of the system (for inclusion into Workplace). Hence this was an outstanding task that the participant had been assigned. The participant proceeded to encapsulate the web administration UI component over the course of 11 iterations. Elements were added to the high-level model in the following sequence, finishing with the model in figure 8.4:



Figure 8.3: Last iteration by the first participant.

• Iteration 1: [Steps 1 and 2]
  – The web administration UI component - Added Delivery, Servelet, Actions.
  – Remainder of system - Added RestOfSystem.
• Iteration 2: The web administration UI component - Added View. [Steps 3, 4 and 5]
• Iteration 3: The web administration UI component - Added OddBalls. [Steps 3, 4 and 5]
• Iteration 4: The web administration UI component - Removed OddBalls, Added Navigation. [Steps 3, 4 and 5]
• Iteration 5: The web administration UI component - Removed Navigation, Added WebUtil. [Steps 3, 4 and 5]
• Iterations 6-9: The high-level model remained the same for these iterations. The participant only made alterations to the map over the course of several refactorings to the code base. [Steps 3, 4 and 5]
• Iteration 10: The web administration UI component - Removed Delivery. [Steps 3, 4 and 5]
• Iteration 11: The web administration UI component - Removed WebUtil. [Step 6]

The participant immediately subdivided the web administration UI component (in his first iteration) into four major sub components, again highlighting the need for flexibility during the process. Also note that in this particular case (because it is a UI separation task) the participant decided that there was a single interface, therefore removing the need to split up the RestOfSystem component. This took five minutes.


Figure 8.4: Web administration user interface component. (Note the bolded black box is not part of the tool’s visual output.)


Based on previous domain knowledge, several more sub components of the web administration UI component were added and deleted over the course of the 11 iterations. In iteration three the OddBalls component was added by the participant to hold software elements that he had yet to find a place for. However, as it transpired, the participant never had a need to use this placeholder. The high-level model stabilised in iterations six through nine. At this point only changes to the map were made as refactorings to the system were implemented. After the eleventh iteration, the participant was confident that he had encapsulated the web administration UI component and successfully decoupled it from the system, both in the logical reflexion model and through the associated refactorings he made in the code of the system. The participant chose only to identify a single interface with the rest of the system. Again, the participant did not adhere strictly to the prescribed process and instead demonstrated some potential improvements. This observation addresses research question 2 in section 8.2.1.

8.2.8 Evaluation

We are interested in evaluating two aspects of our technique:

1. The process.
2. The product as a result of the process.

In evaluating the process we used as our data:

• Concurrent think-aloud [TA] generated by the participants as they performed their task. The participants were asked to state everything that came into their mind as they proceeded. This is in line with best practice use of the technique (Ericsson and Simon, 1993). While speaking aloud is not the most natural way for a person to behave, it has been shown to provide the richest insight into the mental state of the participant at a point in time (Russo et al., 1989), when the information gathered is concurrent, unfiltered and comments on state only, not process.

• Diaries [D] kept by the participants as they proceeded with the process, which contained their thoughts on the tool, the technique and details of each iteration of the process.

• Interviews [I] with the participants after the case study.

• A fourth stream of evidence is found in the form of contextual knowledge [CK]. We define contextual knowledge, in this instance, as knowledge we had from the LMS team about the process and its impact. This is explored in greater detail in the section entitled "Contextual Knowledge."

In evaluating the product we relied on data generated by an architect of the system. This architect viewed the models produced by both participants and evaluated them aloud while being recorded to MP3. In addition, a complementary analysis of the encapsulated components' quality was attained by applying coupling and cohesion metrics to the product, as suggested in section 2.5.2. Apart from the coupling and cohesion metrics, the methods reported here are not suited to quantitative, statistical analysis, due to the qualitative nature of the data gathered. Indeed, given the high ecological validity of this in-vivo study (Basili, 1996) and the range of uncontrolled variables that this suggests, this evaluation work is more in line with the approach suggested by Seaman (Seaman, 1999) or O'Brien (O'Brien et al., 2005) in providing qualitative evidence of high ecological validity to complement other studies of a quantitative nature.

8.2.8.1 Process - Did the process seem to support programmers in component encapsulation, and do the participants adhere to the process, or can the case study reveal improvements to our guidelines? (Questions 1 and 2)

The evidence gathered in this section shows support in favour of the first and second research questions in section 8.2.1. Both the participants found the process very helpful


in identifying the component abstractions and stated so on several occasions:

"That's interesting, looks like this package should be catered for in another node." [referring to view node] [TA]

"Reflexion great for spotting where dependencies were nonetheless . . . " [D]

"That's fairly revealing actually." [TA]

"A really good tool for getting a high-level idea of a big amount of source code." [I]

"I certainly intend to continue to use it myself." [I]

By the end of the process participant 2 even self-evaluated his own results:

"Delivery is back with the 'RestOfSystem.' No unwanted links to RestOfSystem, job done basically . . . Now I have just what I always wanted: the LMS doesn't even know the web component exists. This is how it should be. It's good to know that the list of modifications is actually quite short (summed up in 7 bullet points in the notes) which I'm taking as a good sign. The two aren't nearly as highly interwoven as I had feared, or as I thought from having attempted this separation task a few months back. There's been a few hacks along the way, but Reflexion allowed me to see the big picture, safe in the knowledge that this was, at least, the minimum number of hacks in just the right places." [D]

In a previous attempt to isolate the web administration UI component, participant 2 had relied upon compiler errors generated as a result of changes he manually introduced to the source code when trying to isolate the component. However, he noted that using our process changed all that for him. He could now work at the modelling level and not


be faced with many hundreds of dependency errors after each compile when working at the source code level:

"[the] compiler is like a final check, reflexion should be your first check."

For each participant, evidence of the value of the iterative "hypothesis → test with tool → new hypothesis" nature of the process is seen throughout the gathered data, with many comments such as:

"I'd be interested to know if the action forms are using helpers, because I'd totally expect the action handler to use them . . . [calculates reflexion model] so there are some calls . . . most of the calls from struts are going through the action handlers . . . expected." [TA]

"I would expect the service tier to use it, . . . so I'll add another node . . . and I would be very, very surprised if there was an arc from the services to the helpers . . . it shouldn't happen, these guys shouldn't have any actual processing logic . . . [calculates reflexion model] . . . so there are three calls from the helpers to the services, which suggests there might be a problem there, sounds like something is being violated there." [TA]

"There is a call from helpers to struts . . . didn't expect that . . . It's always been unclear to me whether the validation occurs in the action handle or if it validates itself . . . " [TA]

"'Reflexion Model Information' showed that I should include an extra package, 'views,' in my model, as part of the web component. This only moves the problem however, reducing the unwanted links to 'Actions' but adding new ones into 'Views'." [D]

"It also looked like navigation needed to go into the web component. In fact it does but since its a component that requires initialisation from the


LMS (and is not just a bunch of 'dumb' action classes) I think it belongs in the 'RestOfSystem' after all." [D]

"maybe I'll split services up into the implementation and interface, I'd be interested in knowing how it's using helpers." [TA]

"Looks like there's a few links alright, from rest of system back to view, . . . I was hoping there would be none, so that's a shame." [TA]

Only one comment of warning was made regarding the process, by the first participant. In this case he highlighted the importance of mnemonics in undertaking the process:

"You'd have a real problem doing this [process] if you didn't have a consistent naming scheme."

Two worthwhile points are illustrated by this comment. Firstly, without a consistent naming scheme for a subject system, the map for the source model would become very large, since users of the jRMTool would not be able to catch a large set of software elements in a single regular expression. Secondly, if the naming scheme were inconsistent and the names held little clue as to the meanings of the classes and methods, then the participant would find the creation of the map very difficult (as it would not be at all obvious which node a source code element should be assigned to). This suggests that there is scope for combining search techniques other than mnemonics with our process as part of the component encapsulation process. In our proposal the alternate search technique would be the Software Reconnaissance technique, which localises its search for elements of software components based upon the behaviour of the program rather than the naming conventions of the source code (Wilde and Scully, 1995).
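The dependence on naming conventions can be made concrete with a small sketch (the class names below are invented for illustration): with a consistent scheme, one regular expression catches every element destined for a node, whereas inconsistently named elements would each need their own map entry.

import re

# One expression suffices when helper classes share a naming convention.
HELPER_RX = re.compile(r".*Helper$")

elements = ["CourseHelper", "OfferingHelper", "LoginAction", "CourseHelperTest"]
print([e for e in elements if HELPER_RX.match(e)])  # ['CourseHelper', 'OfferingHelper']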


Also, while undertaking his encapsulation task, participant one deviated slightly from the proposed process by breaking up the RestOfSystem component before completely deciding on the contents of the component he was encapsulating. This suggests that, in practice, a definite list of contents for the component to be encapsulated would not be decided upon until later iterations of the process. Thus a variant of the process allows the participant to break down the encapsulated component or the "RestOfSystem" node further, to become sure of its contents. This was also evidenced in observations of participant two. These observations are crucial in answering question two of the research questions in section 8.2.1 (Do the participants adhere to the process, or can the participants reveal improvements to the guidelines?). The participants did not stick rigidly to the prescribed process, highlighting the need for additional flexibility. This additional flexibility brings the encapsulation process closer to the original Reflexion Modelling process.

8.2.8.2 Product: Architect Assessments - Is the product of the process a valid component? (Question 3)

In order to gather evidence in support of the third research question, which assesses the product of the process, we presented the resultant Reflexion models to an architect of the LMS in order to gain his evaluation of the component abstractions produced. This evaluation concentrated on:

• The quality of the component.
• The correctness of the mappings to the component.

The architect agreed with the component encapsulated by participant 1 and the mappings made to it, with the exception of one disagreement over the pattern to which the Helpers DTO component belonged. In figure 8.5, we see a portion of participant one's final model. It was the hypothesis of the participant that communication from JSPTags to DBPersistence beans was a breach of a design pattern implemented in the system and that all communication should be routed through the Helpers component.


The architect initially saw the model as implementing a data transfer assembly pattern, which permits this type of communication. After further examination of the Reflexion model and discussion with the participant, the architect amended his view, saying:

"looks like helpers are being relied upon to do any manipulation." [I]

"Seems to be doing two patterns . . . object assembler and view helper pattern." [I]

He summed up his review of the component stating that the component and interfaces were:

"valid and interesting . . . I would have never thought of this [helpers] as a view holder. Didn't know the extent of use." [I]

"This is not just [Participant 1's] perception . . . it is quite a common pattern." [I]

Indeed, the architect noted that 90% of the calls to the DB persistence beans were through the helpers DTO component, a strong indication of the view helper pattern. This prompted a validating review by a second architect of the LMS, who was an original architect of the system. He indicated that the actual interpretation was a mixture of both the first architect's and the participant's opinions. The helpers component is for gathering information for the UI (like a View Helper pattern), but the UI is not necessarily restricted to this. More importantly, this architect also agreed with the participant's model and the mappings produced.

For participant 2's web administration UI decoupling task, the first architect completely agreed with the high-level model and its contents. Not only could he appreciate the validity of the component encapsulation and its interface, but also the motivation. Once correctly encapsulated he could see that:


Figure 8.5: A design pattern in the LMS.

"all communication could potentially happen over a webservice." [With the refactored system] [I]

which was also the ultimate intention of participant 2. With respect to the correctness of the model, the architect stated:

"there is unlikely to be missing or extra edges here because we've done the work . . . that is what he wanted it to be . . . that's good, well done [Participant 2] . . . there's not an awful lot else to say . . . it's right." [I]

To summarise, the qualitative evidence gathered through interviews with the original architects of the LMS would seem to lend support to the third research question, which concerns the quality of the component recovered.

8.2.8.3 Product: Metrics of Coupling and Cohesion - Is the product of the process a valid component? (Question 3)

In section 2.5.2 metrics for coupling and cohesion were described. With encapsulation, and thus replaceability, in mind, we calculated the coupling and cohesion for the components encapsulated by our participants. This provides another, more objective stream


of evidence in support of the third research question for this study. We presented the scales of coupling and cohesion to both participants and asked them to assess their respective components. Participant 1 had trouble stating exactly which type of coupling his component exhibited:

"Certainly it is not 0; x and y do communicate. However I don't know if Helpers really communicates with the outside system (calls out) in any meaningful way. In my opinion, it could be 1 or 2." [I]

We decided to remain conservative in our calculations and accepted a coupling value of two (participant 1's worst case coupling) for the helpers DTO component's coupling with the rest of the system. The following coupling measure for the helpers component was calculated:

c(helpers, RestOfSystem) = 2 + 5847/(5847 + 1) = 2.9998

Participant two faced a similar dilemma to participant one when assessing the web administration UI component on the coupling scale:

"Coupling between the 'rest of system' (ROS) and the web module is either 2 or 3 . . . probably 3, since there is, for example, sometimes some (very very basic) business logic contained in the web module. There's a subtle difference (parameters as flags) between 2 and 3 . . . but I think it's closer to 3." [I]

Again we remained conservative and calculated the coupling for the web administration UI using a coupling value of 3 from the scale:

c(webadministrationUI, RestOfSystem) = 3 + 18739/(18739 + 1) = 3.9999


Based upon Fenton's studies, values above 4 are tightly coupled and values below 2 are loosely coupled (Fenton, 1991). Both components in this case lie below the tightly coupled threshold. We can compare this to the self coupling of the components. Participant 1 assessed the self coupling of the helpers DTO component as a four on the coupling scale:

"I would probably concur that it is level 4 coupling - same global data." [I]

which works out to be:

c(helpers, helpers) = 4 + 5356/(5356 + 1) = 4.9998

Participant 2 stated that the internal coupling measure for his component also resided at level four on the scale:

c(webadministrationUI, webadministrationUI) = 4 + 27843/(27843 + 1) = 4.9999

This suggests that the components are tightly coupled internally. As intra-component coupling is higher than inter-component coupling in both cases, the components have some coupling/cohesion merit. In assessing the cohesion of the components, the first participant noted that the type of cohesion exhibited by the helpers DTO component was communicational cohesion:

"I would actually say that Helpers exhibits cohesion level 3 - Communicational. Based on the difficulty [Architect] and I had on agreeing on the true role of the Helpers component, I certainly don't think it could be said to have a single well defined function (level 1 - Functional). In fact, both [Architect] and I think that it is used as both a DTO (Data Transfer Object or Value Object) and as a view helper. But in the case of either function, the same body of data is acted upon. I think this qualifies it as communicational. (Though the original intent may have been for it to perform a single well defined function, probably DTO or model bean.)" [I]


This suggests reasonably good cohesion for the helpers component. Participant 2 assessed the web administration UI component as having functional cohesion:

"For functional cohesiveness, yes 1 (functional cohesion) is appropriate I think." [I]

This suggests an extremely high cohesion measure for the web administration UI.
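The coupling figures above can be reproduced mechanically. The sketch below assumes the measure from section 2.5.2 takes Fenton's form c(x, y) = i + n/(n + 1), where i is the ordinal coupling level and n the number of interconnections; the truncation to four decimal places mirrors how the values are reported here.

import math

def fenton_coupling(level: int, interconnections: int) -> float:
    # Ordinal level i plus n/(n+1), which approaches i+1 as n grows.
    return level + interconnections / (interconnections + 1)

def truncate4(x: float) -> float:
    # Truncate (not round) to four decimal places, as reported in the text.
    return math.floor(x * 10_000) / 10_000

print(truncate4(fenton_coupling(2, 5847)))   # helpers with RestOfSystem: 2.9998
print(truncate4(fenton_coupling(3, 18739)))  # web admin UI with RestOfSystem: 3.9999
print(truncate4(fenton_coupling(4, 5356)))   # helpers self-coupling: 4.9998
print(truncate4(fenton_coupling(4, 27843)))  # web admin UI self-coupling: 4.9999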

8.2.8.4 Contextual Knowledge: Is this a better means of component encapsulation than the current approach in the company? (Question 4)

For us to make the statement that our process is "better" or "useful," our process must be "better than" or "useful compared to" something else. Unfortunately, this is difficult to establish definitively using industrial case studies. Contextual knowledge, which is existing data from projects and systems prior to the beginning of the study, allows us to make this comparison. The most obvious contextual evidence supporting the benefits of this approach over current practice emerged primarily from subjective statements about the process. For example, in reference to his normal, ad hoc approach to partitioning the system into components, the first participant said:

"Normally it would take so long because you only get to see small pieces of the code when developing." [I]

Another example, from the first participant's think-aloud, arose when he had hypothesised no dependency between helpers and struts when in fact there was one. When asked if he thought that this was unexpected he said:

"yeah, yeah I do . . . I wrote that code myself." [TA]


The fact that he was only made aware of the dependency when using our approach, even in code he wrote himself, is evidence that our approach, or the Reflexion model, can improve current practice. Another example from the same model arose when the validation services for the helper DTOs were revealed to the participant. Using his traditional methods, he had said:

"It was always unclear where this type of validation occurred." [TA]

Only with the help of the process was he now able to explicitly document this interface, by identifying the dependencies the component had with the remainder of the system.

Further contextual evidence can be identified for our second participant. A previous project had spent two weeks attempting to encapsulate the web administration UI component before being cancelled.

"i did try this same job about two months ago and gave up after two weeks." [TA]

With the variation on Reflexion, the same task was accomplished by our second participant in four hours (0.5 days). Using the original attempt as a baseline, our technique allowed the encapsulation and refactoring of the same component 17.5 times faster than the approach normally used. In the organisation, it appeared that the standard means of carrying out this kind of refactoring separation of components was by repeatedly making changes and checking subsequent compile errors. Using Reflexion appeared to be a far better solution:

"The alternative to this, maybe, is to make the changes to the code. But if I could use the model ... again, not touching the compiler, until I really have to. I could gradually put stuff into this oddball node. Then at the end when I wanna start actually changing code you can, am, actually start the


refactoring proper and keep rebuilding my reflexion model, and eventually all the stuff in my odd ball node will go down to zero and I'll have no stray links, so I'm gonna try that." [TA]

Importantly, it should be noted that this evidence shows that the technique is better than the state-of-the-practice for the participant, not the state-of-the-art in general. An even better solution would be to create automatic refactorings, triggered by the Reflexion model. Work underway at the University of Bremen is currently investigating this goal (Christl et al., 2005). As a final stream of contextual knowledge, the two participants stated that they would actively use the technique during their ongoing work. The architects who assessed the product of the process also expressed an interest in using the approach. Four more programmers in the department have also requested to use the technique as a result of the review provided by the participants, and another architect/developer is actively using the approach in his ongoing work. This seems to indicate proliferation of the technique in the organisation.

8.2.9 Discussion

Close to the beginning of this section, four evaluation questions were posed to be evaluated by this case study:

1. Did the process seem to support programmers in component encapsulation?

• The documented think-aloud, interviews and diaries all seem to indicate that the process directly contributed to the programmers' ability to encapsulate their respective components. Participant 2 was even able to encapsulate a component that could not be encapsulated using his standard practice. In addition, the two participants stated that they would actively use the technique during their ongoing work and the architects who assessed the product


of the process also expressed an interest in using the approach. Four more programmers in the department have also requested to use the technique and another architect/developer is actively using the approach in his ongoing work. This seems to indicate technology transfer of the technique in the organisation.

2. Do the participants adhere to the process, or can the case study reveal improvements to our guidelines?

• One useful change became apparent during the first study. We later incorporated this change when undertaking the second study. It involved allowing the participant to break down the component under scrutiny into sub-architectures, to enhance his certainty that the contents of his chosen component were correct and to help break down the interfaces.

3. Is the product of the process a valid component?

• We assessed the products of both studies using two independent architects of the system, and an objective measure of the products (components) was applied using software metrics of coupling and cohesion. Both assessments indicated that the components were valid.

4. Is this a better means of component encapsulation than the current approach in the company?

• From the evidence we gathered it appears so. Compared to the contextual data gathered, the encapsulation guidelines allowed the encapsulation of components up to 17.5 times faster, reaching the same level of usefulness, than with the traditional approach within the company. Of course, this is only a single quantitative data point; however, other qualitative contextual data gathered also suggests that the process is better.


The evaluation indicates that our process afforded significant benefits to the participants. The ecological validity of this study is also quite high. The industry setting, a large commercial system, actual maintainers, objective metric measures and original architects for assessment made for a highly ecologically valid case study, with many streams of evidence to support the evaluation. These streams included think-aloud, diaries, metrics, interviews and assessments by original architects. A common shortcoming of experiments of this nature is the small population size. This is partially tackled here because two participants were used in this case study, as opposed to one. However, this still falls far short of a valid population size. As a result, a statistical analysis was not possible, hence the strong reliance on rich qualitative data. Furthermore, the reliability of these results will be further assessed as two further participants are added in the next case study, where the entire Reconn-exion process is examined.

Chapter 9

Evaluating Reconn-exion: The AIM Case Study

"A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools." - Douglas Adams.

9.1 Purpose and Research Questions

The purpose of this case study is to examine the entire Reconn-exion process as a single unit. Central to the combined process is whether the reuse perspective is a useful means of prompting the initial Reflexion models of the user, and this is an evaluation not present in the previous case studies. The following evaluation questions are thus examined by this case study:

1. Does the reuse perspective influence the prompting of component abstractions and subsequently influence iterations of Reflexion models?
2. Do the participants adhere to the process, or can the case study reveal improvements to the guidelines?
3. Do participants find the process useful?
4. Is the produced component of high quality?
5. Is this a better means of encapsulating reusable components than the current approach in the company?

9.2 The Subject System: The Advanced Inventory Management Application

The subject system in this case study is a warehousing management application called Advanced Inventory Management, or AIM, a commercial product of QAD Ltd. (Ltd., 2006). AIM is used as an extension to MFGPRO, an enterprise resource planning application and the flagship product of QAD. AIM consists of approximately 250 KLOC in over 2572 procedure files. The application provides warehousing, picking and placing algorithms for large storage spaces, as a feed to a manufacturing environment.

Table 9.1: Participant details.

                           Participant 1      Participant 2
Age                        30+                40+
Yrs. Experience            10                 16
Yrs. Experience with AIM   0.5                0.5
VARK Learning Preference   Mild kinesthetic   Mild multimodal

The application was acquired by QAD Ltd. and, more recently, the responsibility for its maintenance was shifted to QAD Ltd.'s Irish development facility. As a result, several software developers who had little or no experience with the application (but had experience in the domain) were faced with the task of learning AIM and maintaining it.

9.3 The Participants

Two participants demonstrated the Reconn-exion process in two different sessions. Their details are summarised in table 9.1. An original architect of the AIM application also took part in the study, in order to assess the output of the Reconn-exion process. The architect had worked on the application since it had been acquired by QAD Ltd. four years previously.

9.4 Tool Support

Tool support necessary to enable Reconn-exion for this study included:

• The Progress 4GL profiling option. This was enabled to create code coverage profiles from the running code (Progress Software, 2003).
• An open source tool developed by the author at the University of Limerick called "ReconCalc": this was used to calculate the results of the Software Reconnaissance from the gathered profiles (Le Gear, 2006).


• A static analysis tool called "XREF" (Progress Software, 2003): this was used to derive the source model necessary to perform Reflexion Modelling.
• A Perl-based analysis tool called "RMTool": this was used to calculate the results of Reflexion Modelling (Murphy, 2003).
• A graph layout and display tool called GraphViz: this was used to display the results of Reflexion Modelling to the user (AT and T, 2005).

9.5 Study Protocol

The protocol for the study was planned to execute as follows:

• Prior to a monitored session with the participant, the participant will choose features of the application to analyse.
• The person running the study will profile the chosen features and calculate the reuse perspective.
• In the company the following set-up is used:
  – The participant's own computer and workplace (i.e. the participant's natural working environment).
  – The AIM source code, the RMTool and the reuse perspective are open and ready to browse or use on the computer.
  – A video camera is set up to record the study.


• The participant will first be invited to browse the reuse perspective to see if it prompts component abstractions for him. The participant will be asked to talk aloud throughout the experiment.
• If the reuse perspective prompts a component abstraction, he will sketch the initial software elements belonging to the component(s), place them in an initial map file and compute a Reflexion model.
• The participant will then continue to add to and refine his Reflexion model until both the component and the roles it plays in the system are captured by the Reflexion model. Each iteration of the Reflexion model, map and high-level model is saved for later perusal.
• When the participant decides that he is finished he will inform the study supervisor and is free to go.
• The time will be noted, and the recordings and saved models stored for later analysis.
• Next, the resulting encapsulated components will be presented to the original architects of the system for appraisal.

9.6 Recounting the Process

This section describes how the case study unfolded with respect to the protocol of the study and the Reconn-exion process. In particular, the section can help to answer the following research question posed in section 9.1:

• Do the participants adhere to the process, or can the case study reveal improvements to the guidelines?


Table 9.2: Feature set examined by participant 1 during the case study.

No.  Name
1    Start - Stop
2    Log in (no additional key presses)
3    Exiting MFGPro
4    Log in (space bar is hit immediately after login)
5    Run AIM
6    Exit AIM
7    Unplanned receipt including transaction checking
8    Unplanned receipt
9    Unplanned Receipt - Transaction checking
10   Unplanned Receipt - Confirmation
11   MFGPRO communicating with AIM
12   Radio Frequency (RF) screen Log in
13   Update batch picking control file
14   Batch picking
15   Creating sales order and releasing goods to ship
16   Confirmation of batch pick
17   Change logical format for the warehouse
18   Select a warehouse (logical format menu)
19   Batch picking - work order
20   Create a work order
21   Receive a task for a work order

9.6.1 Participant 1

9.6.1.1 Creating the reuse perspective

Participant 1 identified a rich feature set totalling twenty-one features. The maintenance agenda of the AIM evolution team was used as a seed set (Robillard and Murphy, 2002) in this instance. Table 9.2 lists these features. Using the profiling option in Progress 4GL (Progress Software, 2003), coverage profiles for one or more test cases exhibiting each of the features (table 9.2) were retrieved. Using tool automation developed by the author at the University of Limerick, the reuse perspective was calculated. A reuse perspective 99 procedures in size was produced (the listing of the reuse perspective cannot be included due to copyright reasons).
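While the exact set arithmetic of the reuse perspective is defined earlier in this thesis, its flavour can be sketched as follows, assuming (as a simplification) that the reuse perspective collects the procedures exercised by the test cases of more than one feature; the profile contents below are hypothetical, since the real procedure listing is withheld:

from collections import Counter

def reuse_perspective(profiles: dict[str, set[str]]) -> set[str]:
    # `profiles` maps a feature name to the set of procedures its test
    # cases executed (e.g. as reported by the Progress 4GL profiler).
    counts = Counter(proc for covered in profiles.values() for proc in covered)
    return {proc for proc, n in counts.items() if n >= 2}

profiles = {
    "batch_picking": {"whgp01.p", "whrf02.p", "whcore.p"},
    "rf_login":      {"whrf01.p", "whrf02.p", "whcore.p"},
}
print(sorted(reuse_perspective(profiles)))  # ['whcore.p', 'whrf02.p']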


9.6.1.2 Using the reuse perspective to prompt component abstractions

The participant was first asked if, without a prompt, he could identify any components of AIM. The participant could not. The participant was then presented with the reuse perspective and asked to sketch components of the system, prompted by the list of software elements in the reuse perspective. The reuse perspective served as a sufficient prompt and allowed the participant to sketch what he considered worthwhile components, as shown in figure 9.1. In the sketch, consecutive lines represent a cluster. The participant circled the best candidate clusters.

9.6.1.3 Encapsulation with Reflexion

Based upon the components prompted to him by the reuse perspective, the participant chose to encapsulate a utility component called 'AIM_Util.' The participant created a simple high-level model consisting of 'AIM_Util' and the remainder of the system, 'RestofAIM.' The participant mapped all procedures with the naming convention 'whgp*' to 'AIM_Util' and everything else to 'RestofAIM' and calculated the Reflexion model. The result of this is shown in figure 9.2. Upon examining the resulting Reflexion model the participant was satisfied that he had encapsulated the utilities component.

9.6.1.4 Identifying Multiple Interfaces

The participant next began to divide out the 'RestofAIM' component in order to reveal the multiple interfaces that the component had. Over the course of nine further iterations, nodes were added to the model (dividing out RestofAIM) in the following order:



Figure 9.1: Procedure clusters prompted by the reuse perspective (participant 1). File names are blurred for copyright reasons.


[ file=^whgp mapTo=AIM_Util ]
[ file=.* mapTo=RestofAIM ]

Figure 9.2: Participant one's first Reflexion model and map.



• Iteration 1: Added AIM_Util and RestofAIM. [Steps 1 and 2]
• Iteration 2: Added Engine. [Steps 3, 4 and 5]
• Iteration 3: Added UIBrowsers. [Steps 3, 4 and 5]
• Iteration 4: No changes to the model; only changes to the mappings were made. [Steps 3, 4 and 5]
• Iteration 5: No changes to the model; only changes to the mappings were made. [Steps 3, 4 and 5]
• Iteration 6: Added AIMDB_Triggers. [Steps 3, 4 and 5]
• Iteration 7: Added Printing and Algorithms. [Steps 3, 4 and 5]
• Iteration 8: Added Reporting. [Steps 3, 4 and 5]
• Iteration 9: Added MFGPRO and ControlFiles. [Steps 3, 4 and 5]
• Iteration 10: Added RFClient, Picking, DOSessionTriggers, InventoryIC, OrderMGT and Item. [Step 6]

By the tenth iteration the participant felt he had adequately encapsulated the utilities component and identified the important interfaces it had with the remainder of the system. Not all of the other components divided from RestofAIM represented a new interface. For example, the UIBrowsers component had no communication with AIM_Util, therefore AIM_Util had no role to play in servicing UIBrowsers. The final Reflexion model of participant one is shown in figure 9.3. To view the other models produced, please refer to appendix E.


Figure 9.3: Participant one’s final Reflexion model.



9.6.2 Participant 2

Based upon a request from the participant, the source model for this study included not only call dependencies but also database dependencies. For more information about alternate source models please refer to section 11.11.

9.6.2.1 Creating the reuse perspective

The participant identified 13 features in the AIM application, and coverage profiles were retrieved in the same manner as for the first participant. The participant's own maintenance agenda was used to seed the choice of features, and this feature set differed significantly from participant 1's. These are shown in table 9.3. The participant, in conjunction with the study coordinator, then proceeded to gather profiles for each of the features. The reuse perspective was calculated using ReconCalc and resulted in a set of 58 procedures (the listing cannot be included in the appendix due to copyright reasons).

9.6.2.2 Using the reuse perspective to prompt component abstractions

The participant, when asked, could not initially think of any components and their contents in the application without a prompt. The reuse perspective was then presented to the participant, and he was asked to examine it and attempt to sketch component clusters from AIM. The participant's sketch is shown in figure 9.4. Each circled regular expression or list represents the contents of a component. The name of the component is written either above or below the component.



Table 9.3: The set of features chosen by the second participant.

No.  Feature                       Description
1    Start-stop                    The application is simply started and stopped.
2    Wave selection menu           The main menu for the wave selection portion of AIM.
3    Create a wave                 Create a new wave plan to perform a pick from the warehouse.
4    Set up wave replenish         Perform the set up task to repeat an existing wave plan.
5    Replenish                     Re-perform an existing wave plan.
6    RF Login                      Log in through the radio frequency login screen.
7    Complete Replenish - ERROR    Incorrectly complete a wave replenish.
8    Wave release                  Execute an existing wave.
9    Complete pick                 Complete a single pick from the warehouse.
10   Batch pick                    Complete many picks from a warehouse.
11   Containerise - 1 Box          Pick and fill a pallet.
12   Containerise - 2nd Box        Pick and fill another pallet.
13   Containerise - Transfer Box   Move contents of a pallet to another pallet.



Figure 9.4: Procedure clusters prompted by the reuse perspective (participant 2). Note that the file names are blurred for copyright reasons.


9.6.2.3 Encapsulation with Reflexion

The participant, upon examining the reuse perspective and resulting sketch, chose to encapsulate the 'RF_Screens' component. This component was an alternate interface to the AIM system for people using the system in conjunction with a radio frequency gun. For example, as deliveries arrive at the warehouse, an employee with a scanner can scan pallets as they enter the warehouse, instead of the normal access via the main computer. The participant spent the first three iterations encapsulating this component, using a simple model consisting of 'RF_SCREENS' and 'EVERYTHINGELSE.' In the participant's view the 'RF_SCREENS' component should exist with a one-way interface onto 'EVERYTHINGELSE.' In this way 'EVERYTHINGELSE' is unaware of the existence of the 'RF_SCREENS' component and the 'RF_SCREENS' component is purely a user of the services of 'EVERYTHINGELSE,' similar to a pull model for user interface / system interaction. By the third model the participant had finalised this encapsulation, as shown in figure 9.5. To see the other Reflexion models please refer to appendix F.

9.6.2.4 Identifying Multiple Interfaces

The participant next began to divide the 'EVERYTHINGELSE' node in order to identify the multiple interfaces for the 'RF_SCREENS' component. Six further iterations ensued after iteration 3, adding and deleting nodes in the following order:

• Iteration 1: Added RF_SCREENS and EVERYTHINGELSE.
• Iteration 2: Only changes to the map were made.
• Iteration 3: Only changes to the map were made.
• Iteration 4: Added WAVE, MFG_DATA and AIM_DATA.


[ procedure=whrf mapTo=RF_SCREENS ]
[ procedure=whextaut.p mapTo=RF_SCREENS ]
[ procedure=whextman.p mapTo=RF_SCREENS ]
[ procedure=.* mapTo=EVERYTHINGELSE ]

Figure 9.5: Participant two's third Reflexion model.



Figure 9.6: Participant two’s final Reflexion model.

• Iteration 5: Added CONF and MENU (removed WAVE).
• Iteration 6: The additions are too numerous to list in full; during this iteration the participant experimented by creating a node for every table in the database (removed AIM_DATA and MFG_DATA).
• Iteration 7: Only changes to the map were made.
• Iteration 8: The nodes added during iteration six were conglomerated. Added AIM_RF_DATA, MFG_DATA, AIM_DATA, AIM_SEQUENCE and AIM_RF_SEQUENCE (removed CONF).
• Iteration 9: Added START, CONF, RF_TASKS, RF_BPICK and RF_CONT.

During the final iteration the participant began to divide up the 'RF_SCREENS' component to possibly identify a subarchitecture. This subarchitecture is represented by 'RF_SCREENS', 'RF_TASKS', 'RF_BPICK' and 'RF_CONT'. This creation of a subarchitecture is similar to the behaviour of participant 2 in the Workplace case study in the previous chapter, and is an example of a deviation from the process, addressed in research question 2. The final Reflexion model is shown in figure 9.6.
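The one-way interface the participant is aiming for can be checked mechanically from a map and a source model. The sketch below is a simplified illustration of the Reflexion computation, not the tooling actually used: first-match-wins mapping is an assumption here, and the call edges are invented, since real AIM procedure names are withheld.

import re

MAP = [  # mirrors participant two's map; first matching entry wins (assumed)
    (re.compile(r"^whrf"), "RF_SCREENS"),
    (re.compile(r"^whextaut\.p$"), "RF_SCREENS"),
    (re.compile(r"^whextman\.p$"), "RF_SCREENS"),
    (re.compile(r".*"), "EVERYTHINGELSE"),
]

def node(procedure: str) -> str:
    return next(name for rx, name in MAP if rx.match(procedure))

def lifted_edges(calls: set[tuple[str, str]]) -> set[tuple[str, str]]:
    # Lift source-level call edges to high-level model edges via the map.
    return {(node(a), node(b)) for a, b in calls if node(a) != node(b)}

calls = {("whrf01.p", "whcore.p"), ("whrf02.p", "whwave.p")}  # hypothetical
edges = lifted_edges(calls)
# The interface is one-way if nothing calls back into RF_SCREENS:
print(("EVERYTHINGELSE", "RF_SCREENS") not in edges)  # True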

9.7 Evaluation

As with previous case studies, several streams of evidence were used to assess Reconn-exion. The results were examined from the point of view of both the process and the product of that process. Data sources included diaries, think-aloud, interviews, videoed observations, email conversations and metrics. The opportunity was also taken to revisit the evaluation of the reuse perspective.

9.7.1 Process: The Effectiveness of the Reuse Perspective as a Prompt

This section presents evidence regarding the following research question from section 9.1:

• Does the reuse perspective influence the prompting of component abstractions and subsequently influence iterations of Reflexion models?

Both participants were first asked whether, without a prompt, they could identify reusable components of the system. Both replied that they could not, and it was not until prompted with the reuse perspective that they could identify components. This suggests that the prompt is relevant to the task. In recovering components, one of the core criteria is that the resultant components are reusable. Several quotes here show that the software engineers anticipated that the components would be reusable:

". . . you now know that if I am doing some new functionality that I should be using that [he said pointing to a reusable element in set] . . . you reuse what should be reused."

"here's thirteen files that are doing a lot of generic work . . . and the next stage would be 'lets go and see what going on here' . . . you are getting some


information on what I am working with . . . understanding what services are provided."

The original architect of the system was also consulted and confirmed that the reuse perspective contained core and reusable elements of the AIM system:

"Absolutely agree. Some of the procedures in the list are the most important procedures in AIM."

"Yes. Some of them are key in AIM."

"Yes indeed. Radio Frequency, Picking, AIM creation tasks, Wave Functionality." [with reference to their reusability]

The benefit of clustering core functionality by the view was also noted:

". . . you are building what are the core players."

"xijd*, i think they're fairly core."

and several comments were made highlighting how participants actively grouped the reuse perspective elements into clusters:

"All of the trigger code to do with deletes and writes is in that block up to."

"the 'fo' programs are all to do with the 'engine W' table."

"'tpqlbmm' and 'ng', I know they're part of a separate application."

"the last four, they're to do with 'wave planning'."

Some of the best evidence with respect to the reuse perspective as an effective prompt to component abstractions is the component sketches produced as a result of viewing the reuse perspective. These are shown earlier in figures 9.1 and 9.4. The sketches were also presented to an original architect of the system for evaluation:


"I agree with [Participant 1] comments. Obviously there are a large amount of programs involves in those categories but I could see that it is the right way to classify these programs."

The prompt provided by the reuse perspective also had a lasting influence through many iterations of the Reflexion model, not simply the first. For example, the first participant continued to refer to the reuse perspective for the first nine Reflexion model iterations.

9.7.2 Process: Reconn-exion for Component Encapsulation

This section provides evidence regarding the following research question from section 9.1:

• Do participants find the process useful?

It should be noted that the value of the reuse perspective portion of the process is implicit in the quotes from the previous subsection, thus the majority of quotes here refer to the Reflexion portion of Reconn-exion. As previously mentioned, both participants were videoed talking aloud as they performed the case study. This provided a wealth of evidence that demonstrates the success of the process as a means of component encapsulation. Throughout the case study many statements can be cited where we can actively see the process support of Reconn-exion contributing to the encapsulation of components. These are some examples (file names have been obfuscated for copyright reasons):

"We've got 19 calls between aim_util and 'rest of aim,' which is interesting, right, we need to figure that out because . . . we'd need to find out what those 19 things are going back here, because they theoretically should belong in there [pointing to 'aim_util'] as well. Because you'd make a call to utility,


9.7 Evaluation

it should really be encapsulated, . . . give you back an answer, . . . unless its a call back thing.” “Let’s put all these engine files, let’s put all these ‘when’ files somewhere, . . . so let’s create a thing [high level node] with engine in it.” “UIbrowsers, there’s no way it should be calling back, they’ll all call it, . . . actually engine mightn’t call it . . . [calculates Reflexion model] . . . so, yeah look there’s no call back between engine and browsers. There’s one call between AIM_Util and browsers . . . thats interesting, we gotta find out why that is.” “So he’s making one call to aim_util. We need to find out what that is, OK? . . . basically this belongs in the rest of aim, actually, yunno what we’ll do?, we’ll put all that into a triggers [node].” “I think that there shouldn’t be. The RF should be . . . it should go from the RF to the rest, I don’t see the RF going back into the rest. It could happen but I don’t think and if it does it should be small. [computes Reflexion model] now that’s interesting. Pretty much on the mark about the one way direction. There’s only four going back into the RF_SCREENS, curious to know what they would be [checks edge information] OK, this makes sense. These make sense. These two programs are actually still RF, they’re exceptions to the rule about the ‘xisg’ naming convention. I’ve seen them before. So that actually does make sense. In actual fact to clear that up we’d need to add those to that group [RF_SCREENS] as well . . . That’s nice actually. There’s sort of a - you can clearly see that there’s definitely only one way.” “So there’s one [call back from RF_SCREENS], that’s interesting, let’s have a look and see what that is [examines edge] OK yes there was one

191

9.7 Evaluation

192

other, . . . so we're gradually pulling them all in [into the RF_SCREENS component]. The next step would be to pull that one in and see if that clears it up."

The participants also openly commented on the usefulness of Reconn-exion:

"this is great craic! very powerful, I must admit!" ("craic" is an Irish colloquialism meaning good fun, worthwhile or of value)

"Gee, that's an interesting tool you have here, I must say."

As before, this is only a fraction of the available think-aloud in support of the process.

9.7.3 Product: An Architect's Assessment of Encapsulated Components

This section provides evidence regarding the following research question from section 9.1:

• Is the produced component of high quality?

To gain an independent assessment of the product of Reconn-exion (the component), an original architect of the system, situated in Spain, was consulted (the same architect from the previous section). Due to the distance factor this assessment was carried out over an e-mail dialogue. The architect was posed the following questions with respect to the product of Reconn-exion, for both participants:

"Encapsulating a component: I have included a list of procedures from AIM that the participant grouped together. He felt that these files could be grouped as a [Component type] in AIM. I have two questions:



1: do you agree that these files constitute a [Component type]? In your opinion is anything omitted?

2: The participant grouped the files together on the premise that they form a cohesive and independent component of the AIM system. Do you agree?"

In both cases a positive response was attained:

"If you are just talking about procedures (*.p) I could say that in this list are all the important ones."

"Yes. RF functionality is a very independent module from a programming point of view."

With respect to the multiple interfaces for the utilities component encapsulated by the first participant, the architect agreed with the interface categories chosen, saying, "Yes indeed."

9.7.4 Product: Metrics of Coupling and Cohesion on Encapsulated Components

This section provides evidence regarding the following research question from section 9.1:

• Is the produced component of high quality?

This section provides another evaluation of the components produced by the participants, this time using software metrics to assess quality. As with the previous case study, the metrics of coupling and cohesion described in section 2.5.2 are used to assess the components encapsulated using Reconn-exion. The first participant assessed the utilities component as having a coupling rating of 3. The resulting coupling measure then works out as:

c(AIM_Util, RestOfSystem) = 3 + 435/(435 + 1) = 3.9977

Likewise, the second participant assessed his component as being coupled at level 4, providing a coupling measure of:

c(RF_SCREENS, EVERYTHINGELSE) = 4 + 9009/(9009 + 1) = 4.9999

With respect to cohesion, the participants assessed their components as having a cohesion measure of 7 for the utilities component and a measure of 3 for the RF_SCREENS component. Fenton states that components with a coupling measure above 4 are tightly coupled. The utilities component appears not to be tightly coupled but, based on the cohesion measure, does not perform a single well defined function. That is, its cohesion is coincidental. This, however, may be considered satisfactory considering the purpose of the component. As a utilities component it will provide many services to the remainder of the system, similar to a library of reusable routines. Therefore, one could say that the component is of reasonable quality from a metrics point of view. The RF_SCREENS component appears to be tightly coupled to the system because it operates upon the same global data as the rest of the system. This may prevent reuse. However, within the context of the system, the component appears to be reasonably cohesive as a single radio frequency (RF) user interface provider. Therefore this component, though probably not reusable beyond the context of this system, is a cohesive component within the system. Furthermore, the unidirectional dependency between the RF_SCREENS component and the remainder of the system makes the component "pluggable" in AIM. That is, the remainder of the system is oblivious to the RF_SCREENS component's existence, and the component could potentially be replaced with any RF UI component without any disruption to the remainder of the system.

9.7.5 Contextual Knowledge

This section provides evidence regarding the following research question from section 9.1:

• Is this a better means of encapsulating reusable components than the current approach in the company?

Contextual knowledge was far more scarce than in the previous study; however, some was gathered, enabling us to address this research question. Upon interviewing one of the participants, it emerged that one of the standard practices for architectural recovery in the organisation involved examining the call stack for various scenarios of execution of the system. A drawback of this approach is the inability to ever get an overall, summarised view, as with Reconn-exion:

"The other way is top down, which is where you start tracing through the program, and you could spend weeks doing that and you might never get to where you want to be. Whereas here you can just look up and say 'why is this happening?' and go in . . . its a different approach."

As before, it should be noted that this evidence demonstrates that the technique is better than the state-of-the-practice for the participant and not the state-of-the-art in general. According to the participants, Reconn-exion is far better than this approach, and vocal comments of support were made throughout the case studies, for example, as reported earlier:

"this is great craic! very powerful, I must admit!"

Furthermore, the process itself could be seen as actively contributing to the formation of components (see section 9.7.2).


Also, as noted earlier, prior to being presented with the reuse perspective, each participant was asked to list components of AIM without prompting from existing documentation. Neither could. Participant 1 replied: “No, can’t off the top of my head.”

9.8 Discussion

At the beginning of this chapter, several evaluation questions were proposed to be addressed by this case study.

Does the reuse perspective have an influence in prompting component abstractions and subsequently influence iterations of Reflexion models? From viewing the reuse perspectives generated, both participants were able to sketch 17 components (see figures 9.1 and 9.4). Neither was able to sketch potential components without the prompt. In each case one of these components was the basis for the targeted component refinement using the variation on Reflexion Modelling. The architect of the system was able to confirm that the identified abstractions were correct. Video evidence from the case study also shows how both participants then based their component choice and formation on the component abstractions sketched. With participant one, the reuse perspective continued to influence the resulting models and maps for a further eight iterations.

Do the participants adhere to the process, or can the case study reveal improvements to the guidelines? By and large the participants adhered to the process. One notable improvement emerged from the first participant, which was to include database accesses in the source model to reveal component dependencies on the data model. The language being examined was Progress 4GL. In this language the database and language


are tightly integrated, so much so that database fields are treated almost identically to variables. The importance of this was reflected in the improved source model. Also notice how participant one broke his component down into a sub-architecture, similar to the participants from the previous chapter.

Do participants find the process useful? Each of the participants voiced their satisfaction with the process on a number of occasions. These were discussed in section 9.7.2.

Is the produced component of high quality? Based upon the evaluations from both the original architect and the interpretation of the results in light of objective metrics, the components seem valid. The reusability of the utilities component was higher than that of the RF_SCREENS component, but both seemed cohesive within the system.

Is this a better means of component encapsulation than the current approach in the company? This, as always, is difficult to assess using case studies. Evidence to support this question was scarce; however, some contextual knowledge did exist demonstrating that the technique was better than what already existed in the organisation, as shown in section 9.7.5.

Do all elements of the reuse perspective fall into at least one of the categories “reused,” “generic,” or “core”? The original architect was able to confirm these properties as correct.

Chapter 10

Scoping Reconn-exion

“What you perceive, your observations, feelings, interpretations, are all your truth. Your truth is important. Yet it is not The Truth.” - Linda Ellinor.

10.1 Validity of Studies

In chapter 6, the concept of validity was explained. It is the purpose of this section to revisit the two evaluation chapters (chapters 8 and 9) and examine the evaluation in terms of validity. The evaluation in this thesis attempted to heighten validity by examining variations of the potential populations during the studies:

• Many different component types were recovered during the course of the studies. These component types included:
– A highly reusable utilities component.
– Two unidirectional user interface components:
∗ A character screen user interface component (RF_SCREENS).
∗ A web interface component.
– A data access layer component (helpers DTO).
• The participants had varying degrees of experience, ranging from 2 to 16 years.
• Applications from a variety of domains were used, including:
– An enterprise resource planning (ERP) domain.
– A distributed learning and collaboration domain.
• The applications varied in size from 20,000 to 2,000,000 lines of code.
• For the two larger case studies, two participants took part in each, increasing the population size and therefore the degree of control for each study. However, this population size falls far short of cancelling the invalidity effect and this is


possibly the largest threat to the validity of the studies. The low population size prevents us from drawing significance from statistical analyses. However, large numbers are difficult to obtain in in-vivo studies, and further replication by interested researchers may serve to buttress these results. In line with the state-of-the-art recommendations described in chapter 6, the studies undertaken were qualitative and very high in ecological validity:

• All participants were maintainers of the systems examined.
• The component encapsulation tasks were in line with the maintenance agenda of the participants.
• The studies were undertaken in the natural work environment of the participants.

One threat to validity in this evaluation is the inability to implement control and isolation of variables when using case studies. However, as highlighted in chapter 6 by Segal et al. (2005), the inability to replicate an experiment, due to the lack of controls and known variables, is not at issue. Rather, the replication of interpretation by others is what should be achieved. This position is evidenced by the many peer-reviewed publications arising from the work of this thesis (Le Gear, 2004; Le Gear and Buckley, 2005b,c,a; Le Gear, Buckley, Cleary and Collins, 2005; Le Gear, Buckley, Collins and O’Dea, 2005; Le Gear et al., 2004; Le Gear, Cleary, Buckley and Exton, 2005; Le Gear et al., 2006). That is not to say that experimental control should be ignored. The studies undertaken in this thesis do lack laboratory control. This removes a valuable avenue of triangulation of evidence and should be addressed as future work. However, as a preliminary evaluation of this prototype process, the findings here provide valuable initial evidence. Another threat to validity arose during the evaluation of the components recovered with the original architect of the AIM application. Some of the questions posed to the architect were leading and may have biased his subsequent answers.


Also, it is possible that the use of observation and think-aloud may have introduced an element of the Hawthorne effect, whereby the very knowledge of being observed can affect the performance of a participant (Wikipedia, 2006a). While an element of this may be unavoidable, the contextual evidence suggests that its influence on the study is small. Familiarisation time with the tools used in the studies is also a factor that should be taken into account, and would have had an impact on the participants’ productivity at the outset of the study. However, it should be noted that any such difficulties were quickly overcome, to such an extent that the participants have even recommended the tool to colleagues in the organisation. At present, 20 further installations of the jRMTool have been made, with reports of 6 software engineers using the tool to aid their regular working tasks. Indeed, if anything, this unfamiliarity would have lowered productivity and raised resentment toward the process. Yet our findings contradict these hypothesised trends. It is also possible that there is a threat to the construct validity of the studies. By examining think-aloud data, quotations were gathered that appear to show evidence in support of the research questions under study. However, deciding what qualitative data supports the question being analysed is a subjective task, and introduces an element of ambiguity to the interpretation of results. Finally, it should also be noted that, due to spatial, copyright, anonymity and ethical reasons, only approximately 25% of the relevant quotes gathered have been included in this thesis.

10.2 Theoretical Limitations of Reconn-exion

In chapter 7 Reconn-exion was described. Its potential merits were discussed and justified in the literature, and a demonstration provided to show the usefulness and benefits of


the technique. This section aims to constructively criticise the technique by examining its theoretical limitations.

A first, and pervasive, limitation is the inability of static analysis techniques to identify all dependencies in source code. Dependency identification is necessary when identifying the source model used to implement the variation on Reflexion Modelling in Reconn-exion. An example of where dependencies would be missed is when analysing C or C++. When analysing these languages it may not always be possible to know where pointer references point at a given time, thus some call dependencies may be missed. In this thesis the languages Progress 4GL and Java were analysed. In Progress 4GL an equivalent to a function pointer can sometimes be used to store a reference to a procedure. This resulted in some, though not a detrimental number, of call dependencies being missed in the source model. In the Java source code analysed in this thesis, some of the calls were made using SOAP, a distributed messaging protocol (W3C, 2006), and were not identified in the generated source model. Fortunately, as with Progress 4GL, the dependencies missed were not very prevalent.

Similarly, dynamic analysis, as used in the creation of the reuse perspective in Reconn-exion, cannot guarantee complete coverage of a system, since it may not be possible to design a set of test cases that exercises all code paths (even for a chosen, constrained domain). In terms of Reconn-exion, this means that, for a chosen set of features, it is not possible to guarantee exercising all the source code responsible for implementing those features. For example, the systems analysed by the case studies in this thesis, JIT/S, AIM and Workplace, are all highly configurable enterprise systems. The number of possible configuration variations could be extremely large, therefore making it infeasible to profile all subtle configurations of the system that would ensure complete source code coverage. However, this problem is partly mitigated in the studies in this thesis for two reasons:

• Since the basis of the reuse perspective is in the relationship between features, it


is the feature coverage that is more important than source code coverage.

• Furthermore, the set of features being selected usually covers only a selected part of the system; therefore one is rarely trying to achieve complete source code coverage when creating a reuse perspective, but rather focussing on functionality of interest.

The identification of features, which is a necessary part of the Reconn-exion process, is a highly subjective task. The decision as to what features exist, and the definition of those features, are almost completely left to the discretion of the user. This can lead to different interpretations of what constitutes a feature, and of what can be classed as a comprehensive feature set for the domain. For example, in the example of the creation of the reuse perspective in section 7.4, five features are identified. Three of these are rotate, translate and scale. However, another person choosing features of the application may have understood rotating and scaling as types of translation, therefore classing those three different behaviours as the same feature. In the case studies in this thesis, this problem is partly mitigated in certain cases, due to the fact that the participants had extensive experience in the application domain. However, regardless of experience, conflicting and equally valid interpretations of feature sets can be formed, even by separate experts. Also, assuming the features have already been chosen, the design and implementation of test cases that exhibit those features correctly is a task open to error. This is again due to the subjective manner in which a person can create a test case. For example, given two people designing a test case to exhibit a feature, both may design different test cases due either to different interpretations of that feature or to different understandings of what the designed test case exercises.

Up to 10% of all source code in a system is often cloned (Baxter et al., 1998). Given such circumstances it is possible to have what appears to be a single feature in operation implemented in several places in the source code, unbeknownst to the user. This can


have an adverse effect on the application of the Software Reconnaissance technique, and will result in none of the cloned source code being identified for that feature. For example, given two profiles gathered from two test cases:

• TestCaseProfile1 = {method1, method2, method3}
• TestCaseProfile2 = {method1, method4, method5}

Both test cases are profiles representing the same feature. Method 1 is initialisation code and does not implement the feature. Methods 4 and 5 are clones of 2 and 3 respectively, and are responsible for implementing the feature being profiled. If we now calculate the set of involved software elements we get:

IELEMS = {method1}

Methods 2, 3, 4 and 5, which are the ones we want to locate, are lost in the calculation (a sketch of this calculation is given at the end of this section). Users generating a reuse perspective should be aware of this limitation when applying the technique.

Finally, Reconn-exion is an encapsulation technique, which means that the component is simply modelled and not extracted, as with component recovery. This has been adequately highlighted in this thesis; however, it remains a limitation, and further work will need to be performed if one wishes to extract and reuse the component.
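Returning to the clone example above, the following minimal sketch computes the involved-elements set as the intersection of the profiles, as in that example. This is an illustration only, not the tooling used in the studies; the class and method names are assumptions.

import java.util.*;

public class InvolvedElements {

    // IELEMS: the software elements common to every profile of a feature.
    static Set<String> involved(List<Set<String>> profiles) {
        Set<String> result = new HashSet<>(profiles.get(0));
        for (Set<String> profile : profiles.subList(1, profiles.size())) {
            result.retainAll(profile); // set intersection
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> profile1 = Set.of("method1", "method2", "method3");
        Set<String> profile2 = Set.of("method1", "method4", "method5");
        // Because methods 4 and 5 are clones of 2 and 3, the intersection
        // retains only the shared initialisation code, printing [method1].
        System.out.println(involved(List.of(profile1, profile2)));
    }
}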

Chapter 11

Future Work

“The future, according to some scientists, will be exactly like the past, only far more expensive.” - John Sladek


The work of this thesis has prompted several interesting avenues of research that could form the basis for a substantial amount of future work. All the proposed hypotheses are directly related to Reconn-exion. The ideas presented are far from being fully developed, but should serve as a form of guidance for future researchers to expand beyond the work of Reconn-exion. Sections 11.1 to 11.4 explore potential future work that aims to better refine the Reconn-exion technique and copperfasten the results of the work begun by this thesis. This is followed in sections 11.5 to 11.12 by suggestions for work that will expand upon Reconn-exion.

11.1 Catalogue of Guidelines

Further studies are needed to help build a catalogue of guidelines for using Reconn-exion. Topics where guidance is needed include:

• Choice of test cases.
• Choice of features.
• Appropriate use of tool support.
• How to create an initial high-level model.
• How to create initial mappings.

Such research would create a “best practices” body of knowledge for Reconn-exion.

11.2 Exploring Database Accesses

In the case study in chapter 9 the prospect of using database accesses as part of the source model is explored, although not evaluated in detail. Future work should examine the effects and usefulness of incorporating database accesses into the Reflexion model. A


further useful refinement would be to distinguish between reads and writes on particular database accesses in the analysis.

11.3 Component Wrapping

The solution presented in this thesis stops at component encapsulation. Future work exists in investigating the component wrapping stage that allows the extraction and reuse of that component. The topic was previously visited in section 3.5.5. In a publication arising from this thesis, a prototype wrapping solution using the xADL 2.0 architecture description language was implemented with success (Le Gear et al., 2004). However, this was a fairly trivial example. Much work remains in comparing wrapping solutions and in documenting best practices for wrapping.

11.4 Combining with Automated Techniques

The field of reengineering and maintenance now presents an array of useful reengineering algorithms and techniques that the software maintainer can avail of, many of which were reviewed in the literature review chapters of this thesis. They have become even more valuable as they have been integrated into large tool sets such as Bauhaus (Koschke, 2005) and Dali (Kazman and Carrière, 1997). These tool sets will in turn become even more useful as they become integrated into widely used development environments such as Eclipse (Eclipse IDE Homepage, 2005) or Visual Studio (Microsoft, 2006c). This thesis evaluates Reconn-exion in isolation. However, much work remains in investigating its use in conjunction with other techniques as part of an architectural recovery process (Christl et al., 2005).

11.5 Feature-based Decomposition of Software

The reuse perspective is the union of the SHARED sets calculated for each feature. The evidence gathered in chapter 8 suggested that these SHARED sets contain core architectural elements of a system. It may be possible to create an architectural view of a system based upon the relationships between these various SHARED sets. For example, given three features and their corresponding SHARED sets:

• SHARED(Feature1) = {a, b, c, d, e, f}
• SHARED(Feature2) = {b, d, e, g, h, i}
• SHARED(Feature3) = {a, c, j, k, l}

Notice that these SHARED sets overlap in places (as they do in practice):

• SHARED(Feature1) ∩ SHARED(Feature2) = {b, d, e}
• SHARED(Feature1) ∩ SHARED(Feature3) = {a, c}

One could then create a model of the system with high-level elements corresponding to:

• Shared by feature one only.
• Shared by feature two only.
• Shared by feature three only.
• Shared by features one and two only.
• Shared by features one and three only.


Figure 11.1: Decomposing a system in terms of its SHARED sets.


Figure 11.2: Including the common software elements in the model of the system.

In this way a summary of the system, similar to a Reflexion Model compared against the call graph of the system, could be produced. This is shown in figure 11.1. This process, unlike other design recovery techniques, could provide an automated route from profiled features to a recovered design. The set of common software elements (CELEMS) is the intersection of all profiles retrieved from the system. This set represents utility and initialisation code. Including this set in the model would add further meaning to the model (figure 11.2). Notice the one-way relationship that will often exist between CELEMS and the remainder of the system in figure 11.2, indicating that the source code in CELEMS will usually be executed before the remainder of the system. Finally, the elements unique to a feature could also be included in the model, as in figure 11.3.


Figure 11.3: A feature-based decomposition of a software system showing shared, unique and common software elements.

What is important about these models of a system is that they can be generated from a behavioural specification of the system without consulting the source code, in a process moving from system execution to a recovered architecture in a single, automatic step. Furthermore, the models recovered are not in terms of generic architectural concepts such as “data layer” or “user interface”; rather, a domain-specific architectural recovery could potentially be achieved.
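As a minimal sketch of this decomposition, assuming the SHARED sets from the example above, each software element can be grouped by the exact set of features that share it; each group then becomes one high-level element of the model (all names are illustrative):

import java.util.*;

public class FeatureDecomposition {
    public static void main(String[] args) {
        Map<String, Set<String>> shared = new TreeMap<>(Map.of(
            "Feature1", Set.of("a", "b", "c", "d", "e", "f"),
            "Feature2", Set.of("b", "d", "e", "g", "h", "i"),
            "Feature3", Set.of("a", "c", "j", "k", "l")));

        // Group each software element by the exact set of features that
        // share it; each group is one high-level element of the model.
        Map<Set<String>, Set<String>> regions = new HashMap<>();
        for (Set<String> elems : shared.values()) {
            for (String elem : elems) {
                Set<String> owners = new TreeSet<>();
                for (Map.Entry<String, Set<String>> e : shared.entrySet()) {
                    if (e.getValue().contains(elem)) owners.add(e.getKey());
                }
                regions.computeIfAbsent(owners, k -> new TreeSet<>()).add(elem);
            }
        }
        // Prints, for example: [Feature1, Feature2] -> [b, d, e]
        regions.forEach((owners, elems) ->
            System.out.println(owners + " -> " + elems));
    }
}

The intersection of all profiles (CELEMS) and the per-feature unique sets (UELEMS) could be derived in the same fashion, completing the model of figure 11.3.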

11.6 Software Product Line Recovery

Software product lines are the underlying, generic architectures (for a domain) of software products that allow for shorter time-to-market and shorter, cheaper


development cycles through reuse of a set of domain assets across a range of applications (the product line) (Greenfield et al., 2004; Eisenbarth and Simon, 2001; Priéto-Diáz, 1991). The usefulness of the software product line has been effectively shown for new product families. However, where existing legacy applications exist, migrating to a product line philosophy can be difficult (Simon and Eisenbarth, 2002) and the process of modernization (Seacord et al., 2003) unclear. Investigating whether the SHARED and UELEMS (unique) sets can be used as a means of identifying potential software to reuse in a new product line architecture for an organisation would be useful. Of even more interest to an organisation would be the potential to identify product lines that implicitly exist across their products already. This could possibly be achieved through a combination of Software Reconnaissance and clone detection (section 3.3.1) in the following steps:

1. Software Reconnaissance is performed on a catalogue of products within an organisation. The SHARED and UELEMS sets identify the commonalities and variabilities for each individual product.

2. Clone detection is applied to the catalogue. This is different to the normal application of clone detection: normally, one is trying to identify clones within an existing system; in this case, clones of source code across different systems are being searched for.

3. Correspondences between clones and SHARED sets are searched for across systems (a sketch of this step follows below). If the same, cloned, SHARED set, or a portion of it, appears in more than one product, then a portion of an existing, implicit product line within that organisation may have been identified.
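A minimal sketch of step 3, under assumed inputs: the SHARED sets come from step 1, the cross-system clone pairs from step 2, and all element names are hypothetical.

import java.util.*;

public class ImplicitProductLine {
    public static void main(String[] args) {
        // Step 1 output: SHARED sets for two products (hypothetical).
        Set<String> sharedA = Set.of("A.init", "A.price", "A.stock");
        Set<String> sharedB = Set.of("B.init", "B.pricing", "B.report");

        // Step 2 output: clone pairs detected across the two products.
        Map<String, String> clones = Map.of(
            "A.price", "B.pricing",
            "A.init", "B.init",
            "A.stock", "B.audit"); // B.audit lies outside SHARED(B)

        // Step 3: a clone pair with both ends inside a SHARED set hints
        // at a portion of an implicit product line.
        clones.forEach((a, b) -> {
            if (sharedA.contains(a) && sharedB.contains(b)) {
                System.out.println("Candidate product line asset: " + a + " ~ " + b);
            }
        });
    }
}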

11.7 Aspect Recovery

The means of identifying an interface in Reconn-exion could be extended to identify the complicated set of join points for an aspect during aspect-oriented software development (given an appropriate source model). To implement this, a far more detailed source model containing data flow information would be necessary, allowing the model generator to know what data is entering and leaving the aspect. The technology to achieve this goal exists in other solutions. For example, the “extract method” facility in Visual Studio .NET (Microsoft, 2006a) allows one to highlight consecutive lines of code and automatically create a new method from them. The data model in the IDE determines the data items entering and leaving that region of code. This type of model is necessary for an automated aspect recovery technique, since information regarding the data entering and leaving an aspect is vital for aspect weaving.

11.8 Inverse Structural Summarization

Reflexion Modelling has traditionally been used as a means of analysing existing software that is unfamiliar to the user. Interestingly, the reverse use of this structural summarization technique could be of use as a means of design control for a software architect in a development team. The process could proceed like this (Le Gear et al., 2006):

1. At the design phase of the software lifecycle, the software architect creates a high-level architectural model of the system.

2. During the development phase of the lifecycle, each new software element implemented is mapped to part of the high-level model.

3. Each time new mappings are made, a Reflexion Model is generated. Any divergences produced represent a violation of the prescribed architecture. Thus,


architectural violations can be identified immediately when they occur. Once identified, the architect can choose to:

• Have that portion of the system altered to conform to the architecture.
• Update his architectural model to accommodate the unanticipated architectural need.

This approach presents a viable means for an architect to control the architecture of his developing system, or at the very least, to document changes to his architecture. Gail Murphy suggested a similar idea (Murphy et al., 2001), with the key difference being that her suggestion’s intended application was for existing systems, rather than being incorporated from the beginning of the software development process.
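The conformance check at the heart of step 3 can be sketched as follows. This is a minimal illustration under assumed inputs, not the jRMTool implementation: each source-level dependency is lifted through the map and compared against the architect's prescribed model, and any lifted edge absent from the model is reported as a divergence. All names are hypothetical.

import java.util.*;

public class ArchitectureCheck {
    record Dep(String from, String to) {}

    public static void main(String[] args) {
        // Step 1: the architect's prescribed high-level dependencies.
        Set<Dep> model = Set.of(new Dep("UI", "Logic"), new Dep("Logic", "Data"));

        // Step 2: map from implemented elements to high-level entities.
        Map<String, String> map = Map.of(
            "Screen.draw", "UI", "Order.total", "Logic", "Db.read", "Data");

        // Dependencies extracted from the newly written source code.
        List<Dep> source = List.of(
            new Dep("Screen.draw", "Order.total"),  // UI -> Logic
            new Dep("Screen.draw", "Db.read"));     // UI -> Data

        // Step 3: lift each dependency through the map and classify it.
        for (Dep d : source) {
            Dep lifted = new Dep(map.get(d.from()), map.get(d.to()));
            System.out.println(lifted.from() + " -> " + lifted.to()
                + (model.contains(lifted) ? ": conforms" : ": DIVERGENCE"));
        }
    }
}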

11.9 Collaborative Design Recovery

During the evaluation it was noticed how the models developed by the participants were useful in stimulating a learning dialogue between the participants and the architects. Both sides found an opportunity to learn from these dialogues. This is particularly illustrated in section 8.2. This suggests that useful research could be undertaken in investigating a collaborative version of Reconn-exion or Reflexion Modelling, whereby a group of people can create high-level models and maps and interpret their Reflexion Models as a team. Such a process could be incorporated into the regular design meetings of a development team. Implementing the collaborative approach would also be relatively inexpensive in effort, since all that would be necessary is for each team member to be logged in over a remote desktop, using a single application. A study by the author is currently underway to investigate this.

Figure 11.4: A simple temporal source model.

Figure 11.5: An example temporal summarization (high-level nodes include Initialisation, Create new palette and Framework).

11.10 Temporal Summarization

Reflexion Modelling, as used in this thesis, is a form of structural summarization. That is, the models created by the user are compared against a source model that describes the structure of the system, and more specifically the call relationships between procedures and data sources in the source code. However, different source models can also be produced. One that is potentially useful is a source model that represents the sequential or temporal relationship between the procedures of a program. This can easily be derived from program traces (section 4.2.1). Figure 11.4 shows a simple source model using temporal relations. Using a source model like this, the normal Reflexion Modelling process could be implemented, except this time a software engineer would be performing a temporal summarization of some business process, rather than a structural summarization of the architecture. That is, business process recovery could be undertaken as opposed to architectural recovery. Figure 11.5 shows an example of temporal summarization using the source model in figure 11.4. Temporal summarization in this way could allow a software engineer to reason over


a large business process in a system.
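Deriving such a temporal source model from a trace is straightforward: each consecutive pair of procedure calls becomes a “precedes” edge, as this minimal sketch with an illustrative trace shows:

import java.util.*;

public class TemporalSourceModel {
    public static void main(String[] args) {
        // An illustrative program trace (compare figure 11.4).
        List<String> trace = List.of("A", "B", "C", "D", "E");

        // Successor relation: procedure -> procedures observed next.
        Map<String, Set<String>> precedes = new LinkedHashMap<>();
        for (int i = 0; i + 1 < trace.size(); i++) {
            precedes.computeIfAbsent(trace.get(i), k -> new LinkedHashSet<>())
                    .add(trace.get(i + 1));
        }
        // A Reflexion-style comparison could then summarize these edges
        // against a user's high-level model of a business process.
        precedes.forEach((p, next) -> System.out.println(p + " precedes " + next));
    }
}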

11.11 Alternate Source Models

As explained in the previous section, the source model used in the examples in this thesis is structural (the call graph). Already in this chapter, two more types of source model have been named as necessary to implement some of the suggested future work:

• A data-flow source model for aspect recovery.
• A temporal source model for temporal summarization.

Different types of source model enable analyses to be performed from different viewpoints. Describing a system from different viewpoints is standard practice during design (Kruchten, 1995) and should be no different during design recovery. The types of source model that would be interesting to investigate further include:

• Temporal - for business process recovery.
• State machines - automated recovery of activity diagrams or state charts.
• Events and publish/subscribe - a message-oriented model of a system, useful for design recovery on enterprise systems.
• Database access - provides a data-oriented view of a system.
• Feature mapping - a domain-oriented recovery of a system.
• Data type usage - a domain concept breakdown of a system.
• Data flow - another business process recovery of a system.
• Physical deployment - allows the recovery of deployment diagrams.


• Concurrency - useful for the design recovery of state charts and sequence diagrams.

In terms of software comprehension or component recovery, richer source models can provide a software engineer with more information on dependencies, thus allowing for more informed decisions to be made.

11.12 Metrics and Heuristics

Two interesting observations were made during evaluation that could be developed further as useful metrics and heuristics when analysing software:

1. A new reuse metric could be developed from the concept of the reuse perspective. One possible measure, for example, could capture how many features a software element is shared across. This could be an indication of domain reuse (a sketch of this measure follows at the end of the section).

2. For each of the studies of Reconn-exion, it took each of the participants about ten iterations to arrive at what they felt was a finished encapsulation of the component, in spite of differences in age, experience, system size and application domain. Perhaps this is a “magic number” that can allow us to estimate the number of iterations required to encapsulate a component using Reconn-exion in general.

Of course, much work remains to develop these concepts into hypotheses. Nonetheless, they appear to be interesting avenues for future work.
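As a minimal sketch of the reuse measure proposed in point 1 above, assuming the SHARED sets from section 11.5, one can simply count, for each software element, the number of features whose SHARED set contains it (all names are illustrative):

import java.util.*;

public class DomainReuseMetric {
    public static void main(String[] args) {
        List<Set<String>> sharedSets = List.of(
            Set.of("a", "b", "c", "d", "e", "f"),   // SHARED(Feature1)
            Set.of("b", "d", "e", "g", "h", "i"),   // SHARED(Feature2)
            Set.of("a", "c", "j", "k", "l"));       // SHARED(Feature3)

        // Count how many features each software element is shared across.
        Map<String, Integer> reuse = new TreeMap<>();
        for (Set<String> shared : sharedSets) {
            for (String elem : shared) reuse.merge(elem, 1, Integer::sum);
        }
        // Elements with higher counts (here b, d and e score 2) would
        // suggest stronger domain reuse.
        reuse.forEach((elem, count) -> System.out.println(elem + ": " + count));
    }
}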

Chapter 12

Conclusion

“Whatever we learn has a purpose and whatever we do affects everything and everyone else, if even in the tiniest way. Why, when a housefly flaps his wings, a breeze goes round the world; when a speck of dust falls to the ground, the entire planet weighs a little more; and when you stamp your foot, the earth moves slightly off its course. Whenever you laugh, gladness spreads like the ripples in a pond; and whenever you’re sad, no one anywhere can be really happy. And it’s much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer.” - Norton Juster, The Phantom Tollbooth.


This last chapter briefly revisits the main themes of this thesis and draws conclusions on them. Conclusions with respect to the level to which the stated objectives and contributions were achieved, and an overview of the interpretation of the results, are provided.

Earlier, in chapter 7, the Component Reconn-exion process was defined as a new process for component encapsulation based on an extensive literature review. The diagram in figure 12.1 illustrates the steps of the process as being:

1. A reuse perspective is created from dynamic analysis of the subject system.

2. The participant is presented with the reuse perspective. He uses this to prompt initial mappings for possible component abstractions in the system.

3. From these component abstractions the participant chooses a component of interest that he wishes to recover.

4. The participant can then create his initial Reflexion model, as prescribed by this thesis’ proposed variation on Reflexion, with the map being prompted by the examination of the reuse perspective.

5. Further iterations of the variation on Reflexion Modelling are undertaken. The participant is free to refer to the reuse perspective at any stage. This continues until the component is encapsulated.

A series of evaluations was then undertaken. This included an industrial-scale, in-vivo case study of the constituent parts of Reconn-exion, followed by a complete industrial-scale, in-vivo case study of the entire Reconn-exion process. The following conclusions can be drawn from the work of this thesis:

Certain validity issues and theoretical limitations exist. Users of Reconn-exion should be aware of its limitations as well as the potential gains of using it. The typical, pervasive limitations of static and dynamic analysis exist. However, it is also important

Figure 12.1: The Component Reconn-exion process.

to note that the existence of source code clones can also confuse the resulting reuse perspective in certain instances. Also, when interpreting the results of the case studies, one should be aware of the threats to validity that accompany these studies. Most prominent, perhaps, is the small population size for the studies undertaken.

The preliminary empirical evidence suggests that the reuse perspective does indeed contain core, reused and generic software elements. This provides a unique view of a software system and is a valuable addition to a software engineer's set of analyses. Potential exists, as future work, in combining this view with other techniques to allow product line and component recovery.

The reuse perspective could potentially serve as a useful prompt for software engineers when choosing reusable component abstractions. The final case study undertaken showed how software engineers were able to identify several core components of systems by examining the reuse perspective. This has benefits for both component recovery and architectural recovery.


Reconn-exion appears to be a useful process for component encapsulation and an improvement on standard practices. In line with other results from (Murphy and Notkin, 1997; Murphy et al., 2001, 1995), the Reflexion-based process proposed by this thesis allows significantly quicker and more accurate identification of targeted components and architectural elements. Based on our studies, the resulting components output by Reconn-exion and the variation on Reflexion are of good quality and are reusable, as evaluated by the original architects and objective metrics. Because Reconn-exion allows for quicker and easier encapsulation of components, accuracy on the software engineer's part appears to be increased. An error in accuracy can lead to serious component error. The improvement in accuracy was most evident from the Workplace case study (section 8.2), where a software engineer revisited a component encapsulation task that had previously failed, only to complete it 30 times quicker using the altered version of Reflexion.

Cognitive psychology provides explanations for the successes of Reconn-exion. Reference to cognitive psychology seems to be underutilised in the computer science literature in explaining the success of techniques such as Reconn-exion. Furthermore, if correctly applied, cognitive psychology may also be able to help guide aspects of tool design in the future.

The evidence gathered suggests that Reconn-exion is repeatable. The use of more than one participant in more than one setting showed how Reconn-exion is a repeatable process across organisations, participants and systems for component encapsulation. This, although a provisional finding, is of immense importance if the process is ever to be adopted in practice.

Reconn-exion is useful to industry. In both industrial locations where the technique was applied, technology transfer to the organisation began to occur. Apart from the evidence gathered during evaluation, this is probably the best form of evidence that a technique is useful to industry.


During the component encapsulation phase the participant should be free to define sub-architectures of the chosen component in order to finalise its contents. In this way the software engineer can become more certain of the contents of a component before defining the roles that the component has in the system.

Part III Bibliography

Bibliography

Aldrich, J., Chambers, C. and Notkin, D. (2002), ArchJava: Connecting software architecture to implementation, in ‘International Conference on Software Engineering’, Orlando, Florida, USA.

Allen, P. and Frost, S. (1998), Component-based Development for Enterprise Systems: Applying the SELECT Perspective, Managing Object Technology Series, Cambridge University Press.

Arnold, R. S. (1993), Software Reengineering, IEEE Computer Society Press.

AT&T (2005), ‘Graphviz’. http://www.graphviz.org/.

Ausubel, D. (1968), Educational Psychology: A Cognitive View, Holt, Rinehart and Winston.

Bachmann, F., Bass, L., Buhman, C., Comella-Dorda, S., Long, F., Robert, J., Seacord, R. and Wallnau, K. (2000), Volume II: Technical concepts of component-based software engineering, 2nd edition, Technical Report CMU/SEI-2000-TR-008, ESC-TR-2000-007, Carnegie-Mellon Software Engineering Institute, Pittsburgh, PA 15213-3890.

Baker, B. S. (1997), ‘Parameterized duplication in strings: Algorithms and an application to software maintenance’, SIAM Journal on Computing 26(5), 1343–1362.


Ball, T. (1999), The concept of dynamic analysis, in ‘Proceedings of the 7th European software engineering conference held jointly with the 7th ACM SIGSOFT international symposium on Foundations of software engineering’, Toulouse, France, pp. 216–234.

Bartlett, F. (1932), Remembering: A Study in Experimental and Social Psychology, Cambridge University Press.

Basili, V. R. (1996), The role of experimentation in software engineering: Past, current and future, in ‘International Conference on Software Engineering’, pp. 442–449.

Bass, L., Buhman, C., Comella-Dorda, S., Long, F., Robert, J., Seacord, R. and Wallnau, K. (2000), Volume I: Market assessment of component-based software engineering, Internal Research and Development CMU/SEI-2001-TN-007, Software Engineering Institute, Carnegie-Mellon University.

Baxter, I. D., Yahin, A., Moura, L., Anna, M. S. and Bier, L. (1998), Clone detection using abstract syntax trees, in ‘International Conference on Software Maintenance’.

Belady, L. A. and Evangelisti, C. J. (1982), ‘System partitioning and its measure’, Journal of Systems and Software 2(1), 23–29.

Bergey, J., O’Brien, L. and Smith, D. (2000), Mining existing assets for software product lines, Technical Note CMU/SEI-2000-TN-008, Software Engineering Institute, Carnegie-Mellon University. Product Line Practice Initiative.

Beugnard, A., Jezequel, J.-M., Plouzeau, N. and Watkins, D. (1999), ‘Making components contract aware’, IEEE Computer 32(7), 38–44.

Biggerstaff, T. J. (1989), ‘Design recovery for maintenance and reuse’, IEEE Computer 22(7), 36–49.


Biggerstaff, T., Mitbander, B. and Webster, D. (1993), The concept assignment problem in program understanding, in ‘Proceedings of Working Conference on Reverse Engineering’, pp. 27–43.

Bousfield, W. (1953), ‘The occurrence of clustering in recall of randomly arranged associates’, Journal of General Psychology 49, 229–240.

Bowen, J. P., Breuer, P. T. and Lano, K. C. (1993), ‘A compendium of formal techniques for software maintenance’, Software Engineering Journal 8(5), 253–262.

Bower, G., Clark, M., Lesgold, A. and Winzenz, D. (1969), ‘Hierarchical retrieval schemes in recall of categorized word lists’, Journal of Verbal Learning and Verbal Behavior 8, 323–343.

Brace, N. and Roth, I. (2005), Mapping Psychology, Vol. Chapter 8: Memory: Structures, Processes and Skills, Open University Press.

Brooks, R. (1983), ‘Towards a theory of comprehension of computer programs’, International Journal of Man-Machine Studies (18), 543–554.

Bruner, J. (1990), Acts of Meaning, Harvard University Press.

Bruner, J., Goodnow, J. and Austin, G. (1956), A Study of Thinking, John Wiley and Sons.

Buckley, J. (2002), System monitoring: a tool for capturing software engineers’ information-seeking behaviour, PhD thesis, University of Limerick.

Buckley, J., Mens, T., Zenger, M., Rashid, A. and Kniesel, G. (2003), ‘Towards a taxonomy of software change’, Journal of Software Maintenance and Evolution: Research and Practice (to appear).


Carew, D., Exton, C. and Buckley, J. (2005), An empirical investigation of the comprehensibility of requirements specification, in ‘ISESE’.

Cheesman, J. and Daniels, J. (2001), UML Components: A Simple Process for Specifying Component-Based Software, Component Software Series, Addison Wesley.

Chikofsky, E. J. and Cross II, J. H. (1990), ‘Reverse engineering and design recovery: A taxonomy’, IEEE Software pp. 13–17.

Chiricota, Y., Jourdan, F. and Melancon, G. (2003), Software components capture using graph clustering, in ‘International Workshop on Program Comprehension’, pp. 217–227.

Cho, E. S., Kim, M. S. and Kim, S. D. (2001), Component metrics to measure component quality, in ‘Eighth Asia-Pacific Software Engineering Conference’, IEEE, Macao, China, p. 419.

Choi, S. C. and Scacchi, W. (1990), ‘Extracting and restructuring the design of large systems’, IEEE Software 7(1), 66–71.

Christl, A., Koschke, R. and Storey, M.-A. (2005), Equipping the reflexion method with automated clustering, in ‘Working Conference on Reverse Engineering’.

Chung, W., Harrison, W., Kruskal, V., Ossher, H., Stanley, J., Sutton, M. and Tarr, P. (2005), Working with implicit concerns in the concern manipulation environment, in ‘Linking Aspect Technology and Evolution, co-hosted with Aspect-Oriented Software Development (AOSD 05)’, Chicago, USA.

Cicalese, C. D. T. and Rotenstreich, S. (1999), ‘Behavioral specification of distributed software component interfaces’, IEEE Computer 32(7), 46–53.

Cimitile, A. and Visaggio, G. (1995), ‘Software salvaging and the call dominance tree’, Journal of Systems Software 28(2), 117–127.


Clark, T. (2002), Object Modeling with the OCL (Lecture Notes in Computer Science), Springer-Verlag Berlin and Heidelberg GmbH & Co. KG.

Comella-Dorda, S., Wallnau, K. and Seacord, R. C. (2000), A survey of legacy system modernization approaches, Technical Note CMU/SEI-2000-TN-003, Software Engineering Institute, Carnegie-Mellon University. COTS-Based Systems Initiative.

Connolly, T. and Begg, C. (2004), Database Systems: A Practical Approach to Design, Implementation and Management, Addison Wesley.

Corbi, T. A. (1989), ‘Program understanding: Challenge for the 1990’s’, IBM Systems Journal 28(2), 294–306.

Guide International Corp. (1989), Application reengineering, Guide Pub. GPP-208, Chicago.

Councill, B. (2001), Third party certification and its required elements, in ‘4th ICSE Workshop on Component-Based Software Engineering’, Toronto, Canada.

Craik, F. and Lockhart, R. (1972), ‘Levels of processing: A framework for memory research’, Journal of Verbal Learning and Verbal Behavior 11, 671–684.

Craik, F. and Tulving, E. (1975), ‘Depth of processing and retention of words in episodic memory’, Journal of Experimental Psychology: General 104, 268–294.

Dean, T. and Chen, Y. (2003), Design recovery of a two-level system, in ‘International Conference on Program Comprehension’, pp. 23–32.

Doval, D., Mancoridis, S. and Mitchell, B. S. (1999), Automatic clustering of software systems using a genetic algorithm, in ‘Proceedings of the International Conference on Software Tools and Engineering Practice’.


Dowling, J., Schafer, T., Cahill, V., Haraszti and Redmond, B. (1999), Using reflection to support dynamic adaptation of system software: A case study driven evaluation, in ‘Workshop on Object-Oriented Reflection and Software Engineering’, Denver, Colorado, pp. 153–172.

Ebbinghaus, H. (1913), Memory, New York: Teachers College (paperback edn, New York: Dover, 1964).

Eclipse IDE Homepage (2005). http://www.eclipse.org.

Eddon, G. (1999), ‘COM+: the evolution of component services’, IEEE Computer 32(7), 104–106.

Eick, S. G., Graves, T. L., Karr, A. F., Marron, J. S. and Mockus, A. (2001), ‘Does code decay? Assessing the evidence from change management data’, IEEE Transactions on Software Engineering 27(1), 1–12.

Eiffel Software Inc. (2004), ‘Eiffel Software home page’. http://www.eiffel.com.

Eisenbarth, T., Koschke, R. and Simon, D. (2001), Feature-driven program understanding using concept analysis of execution traces, in ‘International Workshop on Programming Comprehension (IWPC)’, IEEE Computer Society, IEEE, Toronto, Canada, pp. 300–309.

Eisenbarth, T., Koschke, R. and Simon, D. (2001), Aiding program comprehension by static and dynamic feature analysis, in ‘IEEE International Conference on Software Maintenance (ICSM’01)’, IEEE, Universität Stuttgart, Breitwiesenstr. 20-22, 70565 Stuttgart, Germany, p. 602.

Eisenbarth, T., Koschke, R. and Simon, D. (2003), ‘Locating features in source code’, IEEE Transactions on Software Engineering 29(3), 210–224.


Eisenbarth, T. and Simon, D. (2001), Guiding feature asset mining for software product line development, in ‘International Workshop on Product Line Engineering: The Early Steps: Planning, Modelling and Managing’.

Eisenberg, A. D. and De Volder, K. (2005), Dynamic feature traces: Finding features in unfamiliar code, in ‘Proceedings of the 21st International Conference on Software Maintenance’.

Ericsson, K. A. and Simon, H. A. (1993), Protocol Analysis - Revised Edition: Verbal Reports as Data, The MIT Press.

Fantozzi, A. (2002), Locating features in vim: A software reconnaissance case study. Submission to University of West Florida for CEN 6015.

Fenton, N. E. (1991), Software Metrics: A Rigorous Approach, Chapman & Hall, Ltd., London, UK.

Ferrante, J. and Warren, J. D. (1987), ‘The program dependence graph and its use in optimization’, ACM Transactions on Programming Languages and Systems 9(3), 319–349.

Fleming, N. (2006), ‘VARK - a guide to learning styles’. http://www.vark-learn.com/english.

Fleming, N. D. and Mills, C. (1992), ‘Not another inventory, rather a catalyst for reflection’, To Improve the Academy 11, 137.

Fowler, M., Beck, K., Brant, J., Opdyke, W. and Roberts, D. (1997), Refactoring: Improving the Design of Existing Code, The SEI Series in Software Engineering, Addison Wesley Professional, Inc.


Gall, H. and Klösch, R. (1995), Finding objects in procedural programs: An alternate approach, in ‘Proceedings of the Second Working Conference on Reverse Engineering’, pp. 208–216.

Gallagher, K. B. and Lyle, J. R. (1991), ‘Using program slicing in software maintenance’, IEEE Transactions on Software Engineering 17(8), 751–761.

Galvin, S., Collins, J., Exton, C. and McGurren, F. (2004), Enhancing the role of interfaces in software architecture description languages (ADLs), in ‘The Workshop on Architecture Description Languages (WADL ’04)’, Toulouse, France.

Girard, J.-F. and Koschke, R. (1997), Finding components in a hierarchy of modules: a step towards architectural understanding, in ‘International Conference on Software Maintenance’, Bari, Italy, pp. 58–65.

Glass, R., Vessey, I. and Ramesh, V. (2002), ‘Research in software engineering: An analysis of the literature’, Journal of Information and Software Technology 44, 491–506.

Greenfield, J., Short, K., Cook, S. and Kent, S. (2004), Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools, Wiley.

Gunderson, A., Wilde, N. and Casey, C. (1995), Locating features in InterBase: A software reconnaissance case study at GTE Government Systems, Technical Report SERC-TR-77-F, Software Engineering Research Center, University of Florida, Gainesville, FL 32611, USA.

Hamlet, D. (2001), ‘Component synthesis theory: The problem of scale’, 4th ICSE Workshop on Component-Based Software Engineering.

Harrison, W. (2006), ‘Editorial’, Empirical Software Engineering 2(1).


Hassan, A. E. and Holt, R. C. (2004), Using development history sticky notes to understand software architecture, in ‘International Workshop on Programming Comprehension (IWPC)’, IEEE Computer Society, IEEE, Bari, Italy, pp. 183–192.

Heuzeroth, D., Holl, T., Hogstrom, G. and Lowe, W. (2003), Automatic design pattern detection, in ‘International Conference on Program Comprehension’, pp. 94–103.

Hoare, C. (1974), ‘Monitors: An operating system structuring concept’, Communications of the ACM 17(10), 549–557.

Hochstein, L. and Lindvall, M. (2003), Diagnosing architectural degeneration, in ‘28th Annual NASA Goddard Software Engineering Workshop’.

Huitt, W. (2003), Constructivism. Educational Psychology Interactive, Valdosta State University. http://chiron.valdosta.edu/whuitt/col/cogsys/construct.html.

Hutchens, D. H. and Basili, V. R. (1985), ‘System structure analysis: Clustering with data bindings’, IEEE Transactions on Software Engineering SE-11(8), 749–757.

IEEE (1990), IEEE standard glossary of software engineering terminology, Std. 610.12-1990, IEEE Standards Board.

Adobe Inc. (2006), ‘Adobe Acrobat Reader homepage’. http://www.adobe.com/products/acrobat/.

Sun Microsystems Inc. (2005), ‘Java home page’. http://java.sun.com.

Software Engineering Institute (1997), The Capability Maturity Model: Guidelines for Improving the Software Process, The SEI Series in Software Engineering, Addison Wesley Longman, Inc.


Johnson, J. H. (1994), Substring matching for clone detection and change tracking, in ‘International Conference on Software Maintenance’, IEEE Computer Society, pp. 120–126.

Johnson, P. D. (2002), Mining legacy systems for business components: An architecture for an integrated toolkit, in ‘Proceedings of the 26th International Computer Software and Applications Conference’, IEEE.

JUnit (2006), ‘JUnit’. http://www.junit.org.

Juric, M. B., Rozman, I., Hericko, M. and Domajnko, T. (2000), ‘Integrating legacy systems in distributed object architecture’, ACM Software Engineering Notes 25(2), 35–39.

Kaplan, A. and Murphy, G. (2000), ‘Category learning with minimal prior knowledge’, Journal of Experimental Psychology: Learning, Memory and Cognition 26(4), 829–846.

Kazman, R. and Carrière, S. J. (1997), Playing detective: Reconstructing software architecture from available evidence, Technical Report CMU/SEI-97-TR-010, ESC-TR-97-010, The Software Engineering Institute at Carnegie Mellon University.

Kellogg, R. T. (2003), Cognitive Psychology, Sage Publications Incorporated.

Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C., Lopes, C. V., Loingtier, J.-M. and Irwin, J. (1997), Aspect-oriented programming, in ‘Proceedings of European Conference on Object-Oriented Programming (ECOOP)’.

Kitchenham, B., Budgen, D., Brereton, P. and Linkman, S. (2005), Realising evidence-based software engineering, in ‘REBSE ’05’, St. Louis, Missouri, USA.

Komondoor, R. and Horwitz, S. (2003), Effective, automatic procedure extraction, in ‘International Conference on Program Comprehension’, pp. 33–42.


Kontogiannis, K. A., De Mori, R., Merlo, E., Galler, M. and Bernstein, M. (1996), ‘Pattern matching for clone and concept detection’, Automated Software Engineering (3), 77–108.

Koschke, R. and Simon, D. (2003), Hierarchical reflexion models, in ‘Working Conference on Reverse Engineering’.

Koschke, R. (1999), An incremental semi-automatic method for component recovery, in ‘Working Conference on Reverse Engineering’, IEEE Computer Society Press.

Koschke, R. (2000a), Atomic Architectural Component Recovery for Program Understanding and Evolution, PhD thesis, Institute for Computer Science, University of Stuttgart.

Koschke, R. (2000b), ‘Bauhaus - clustering’. http://www.iste.uni-stuttgart.de/ps/clustering/.

Koschke, R. (2004), A concept analysis primer.

Koschke, R. (2005), Bauhaus. http://www.iste.uni-stuttgart.de/ps/bauhaus/papers/.

Kruchten, P. (1995), ‘Architectural blueprints - the “4+1” view model of software architecture’, IEEE Software 12(6), 42–50.

Kruchten, P. (1999), The Rational Unified Process - An Introduction, Addison-Wesley.

Krueger, C. W. (1992), ‘Software reuse’, ACM Computing Surveys (CSUR) 24(2), 131–183.

Larsen, L. and Harrold, M. J. (1996), Slicing object-oriented software, in ‘International Conference on Software Engineering’, pp. 495–505.

Lau, K.-K. (2001), ‘Component certification and system prediction: Is there a role for formality?’, 4th ICSE Workshop on Component-Based Software Engineering.


Le Gear, A. (2004), Thematic review of software reengineering and maintenance, Technical Report UL-CSIS-04-3, University of Limerick, Plassy, Castletroy, Co. Limerick, Ireland.

Le Gear, A. (2006), ‘ReconCalc’. http://sourceforge.net/projects/recon-calc.

Le Gear, A. and Buckley, J. (2005a), Component reconn-exion: component recovery using a variation on software reconnaissance and reflexion modelling, Technical Report UL-CSIS-05-3, University of Limerick, Plassy, Castletroy, Co. Limerick, Ireland.

Le Gear, A. and Buckley, J. (2005b), ‘Reengineering towards components using “reconn-exion”’, ACM SIGSOFT Software Engineering Notes 30(5), 32. ACM Press.

Le Gear, A. and Buckley, J. (2005c), Reengineering towards components using “reconn-exion”, in ‘ESEC/FSE Doctoral Symposium 2005’, Lisbon, Portugal.

Le Gear, A., Buckley, J., Cleary, B. and Collins, J. (2005), ‘Achieving a reuse perspective within a component recovery process: An industrial case study’, International Workshop on Programming Comprehension, pp. 279–288.

Le Gear, A., Buckley, J., Collins, J. and O’Dea, K. (2005), Software reconn-exion: Understanding software using a variation on software reconnaissance and reflexion modelling, in ‘International Symposium on Empirical Software Engineering’, IEEE, Noosa, Australia, pp. 33–42.

Le Gear, A., Buckley, J., Galvin, S. and Cleary, B. (2004), A process for transforming portions of existing software for reuse in modern development approaches, in ‘1st International Workshop on Software Evolution Transformations’, Delft, the Netherlands, pp. 40–43.


Le Gear, A., Buckley, J. and McIlwaine, C. (2006), Exercising control over the design of evolving software systems using an inverse application of reflexion modeling, in ‘CASCON’, ACM Press.

Le Gear, A., Cleary, B., Buckley, J. and Exton, C. (2005), Making a reuse aspectual view explicit in existing software, in ‘Linking Aspect Technology and Evolution (LATE)’.

Lientz, B. and Swanson, E. (1980), Software Maintenance Management: A Study of the Maintenance of Computer Application Software in 487 Data Processing Organizations, Addison-Wesley.

Lientz, B. P. and Swanson, E. B. (1978), ‘Characteristics of application software maintenance’, Communications of the ACM 21(6), 466–471.

Lindig, C. and Snelting, G. (1997), Assessing modular structure of legacy code based on mathematical concept analysis, in ‘International Conference on Software Engineering’, Boston, MA, USA, pp. 349–359.

Lindvall, M., Tesoriero, R. and Costa, P. (2002), Avoiding architectural degeneration: An evaluation process for software architecture, in ‘IEEE Symposium on Software Metrics’.

Littleton, K., Toates, F. and Braisby, N. (2005), Mapping Psychology, Vol. Chapter 3: Three Approaches to Learning, Open University Press.

Livadas, P. E. and Johnson, T. (1994), ‘A new approach to finding objects in programs’, Journal of Software Maintenance: Research and Practice 6, 249–260.

Loeckx, J. and Sieber, K. (1987), The Foundations of Program Verification, Wiley Teubner Series in Computer Science, second edition, B. G. Teubner and John Wiley and Sons.

QAD Ltd. (2006), ‘QAD products: MFG/PRO eB2’. http://www.qad.com/solutions/eb2.html.

Lukoit, K., Wilde, N., Stowell, S. and Hennessy, T. (2000), Tracegraph: Immediate visual location of software features, Technical Report SERC-TR-86-F, Software Engineering Research Center, Purdue University, 1398 Dept. of Computer Science, West Lafayette, IN 47906.

Malton, A. J. and Schneider, K. A. (2001), Processing software source text in automated design recovery, in ‘International Conference on Program Comprehension’, pp. 127–134.

Marin, M., van Deursen, A. and Moonen, L. (2004), Identifying aspects using fan-in analysis, in ‘Proceedings of the 11th Working Conference on Reverse Engineering (WCRE’04)’, pp. 132–141.

McGurren, F. (2004), Component composition and architectural reflection, Master’s thesis, University of Limerick, Plassy, Castletroy, Co. Limerick, Ireland.

McIntosh, K. S. (2003), Progress Pro*Spy Plus, Technical Report Edition 1, Progress Software.

Meyer, B. and Mingins, C. (1999), ‘Component-based development: From buzz to spark’, IEEE Software 32(7), 35–37.

Microsoft Corp. (2006a), ‘Microsoft .NET homepage’. http://www.microsoft.com/net/default.mspx.

Microsoft Corp. (2006b), ‘Microsoft Word homepage’. http://www.microsoft.com/word/.

Microsoft, C. (2006c), Visual studio 2005. http://msdn.microsoft.com/vstudio/.
Mii, N. and Takeshita, T. (1993), ‘Software re-engineering and reuse from a Japanese point of view’, Information and Software Technology 35(1), 45–53.
Mitchell, B. S., Mancoridis, S. and Traverso, M. (2002), Search based reverse engineering, in ‘Proceedings of the 14th international conference on Software engineering and knowledge engineering’, ACM Press, New York, USA, Ischia, Italy, pp. 431–438.
Muller, H. A., Orgun, M. A., Tilley, S. R. and Uhl, J. S. (1993), ‘A reverse engineering approach to subsystem structure identification’, Software Maintenance: Research and Practice 5(4), 181–204.
Murphy, G. and Allopenna, P. (1994), ‘The locus of knowledge effects in concept learning’, Journal of Experimental Psychology: Learning, Memory and Cognition 20(4), 904–919.
Murphy, G. C. (1996), Lightweight Structural Summarization as an Aid to Software Evolution, PhD thesis, University of Washington.
Murphy, G. C. (2003), ‘jRMTool’, World Wide Web. http://www.cs.ubc.ca/~murphy/jRMTool/doc/.
Murphy, G. C. and Notkin, D. (1997), ‘Reengineering with reflexion models: A case study’, IEEE Computer 17(2), 29–36.
Murphy, G. C., Notkin, D. and Sullivan, K. (1995), Software reflexion models: Bridging the gap between source and high-level models, in ‘Symposium on the Foundations of Software Engineering’, ACM SIGSOFT, Washington D.C., pp. 18–28.
Murphy, G. C., Notkin, D. and Sullivan, K. (2001), ‘Software reflexion models: Bridging the gap between design and implementation’, IEEE Transactions on Software Engineering 27(4), 364–380.

Murphy, G. and Medin, D. (1985), ‘The role of theories in conceptual coherence’, Psychological Review 92, 289–316.
Myers, M. D. (1997), ‘Qualitative research in information systems’, MIS Quarterly Discovery 21(2), 241.
Naur, P. and Randell, B. (1968), Software engineering: Report on a conference by the NATO Science Committee, Technical report, NATO Scientific Affairs Division.
O’Brien, M. P. and Buckley, J. (2001), Inference-based and expectation-based processing in program comprehension, in ‘International Conference on Program Comprehension’, Vol. 9, pp. 71–78.
O’Brien, M. P., Buckley, J. and Exton, C. (2005), Empirically studying software practitioners - bridging the gap between theory and practice, in ‘International Conference on Software Maintenance’, IEEE, pp. 433–442.
O’Callaghan, M. (2005), Biology, The Educational Company of Ireland.
O’Cinneide, M. (2001), Automated Application of Design Patterns: A Refactoring Approach, PhD thesis, University of Dublin, Trinity College.
O’Cinneide, M. and Nixon, P. (1999), ‘A methodology for the automated introduction of design patterns’, Proceedings of the International Conference on Software Maintenance.
O’Cinneide, M. and Nixon, P. (2000), ‘Composite refactorings for java programs’, Workshop on Formal Techniques for Java Programs, European Conference on Object-Oriented Programming.
O’Cinneide, M. and Nixon, P. (2001), ‘Automated software evolution towards design patterns’, Proceedings of the International Workshop on the Principles of Software Evolution.

Ogando, R. M., Yau, S. S. and Liu, S. S. (1994), ‘An object finder for program structure understanding in software maintenance’, Journal of Software Maintenance: Research and Practice 6(5), 262–283.
O’Gorman, J. (2001), Operating Systems with Linux, Cornerstones of Computing, Palgrave Publishers Ltd., Houndsmills, Basingstoke, Hampshire, RG21 6SX and 175 Fifth Avenue, New York, NY 10010.
OOPSLA, ed. (2006), ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, Portland, Oregon, USA.
Ottenstein, K. J. and Ottenstein, L. M. (1984), The program dependence graph in a software development environment, in ‘ACM SIGPLAN/SIGSOFT Symposium on Practical Programming Development Environments’, ACM SIGPLAN/SIGSOFT, Pittsburgh, Pennsylvania, pp. 177–184.
Parnas, D. L. (1971), ‘Information distribution aspects of design methodology’, Proc. Int. Fed. Inform. Processing Congr. TA-3, 26–30.
Parnas, D. L. (1972), ‘On the criteria to be used in decomposing systems into modules’, Commun. Ass. Comput. Mach. 15, 1053–1058.
Parnas, D. L. (2002), ‘The secret history of information hiding’, Software pioneers: contributions to software engineering pp. 399–409.
Patel, S., Chu, W. and Baxter, R. (1992), A measure for composite module cohesion, in ‘Proceedings of the 14th international conference on Software engineering’, ACM Press New York, NY, USA, Melbourne, Australia, pp. 38–48.
Perry, D., Porter, A. and Votta, L. (1997a), A primer on empirical studies, in ‘International Conference on Software Maintenance’.

Perry, D., Porter, A. and Votta, L. (1997b), A primer on empirical studies, in ‘Tutorial Presented at the International Conference on Software Maintenance’.
Pressman, R. S. (2004), Software Engineering: A Practitioner’s Approach, McGraw-Hill Professional.
Prieto-Díaz (1991), ‘Implementing a faceted classification for software reuse’, Communications of the ACM 34(5), 88–97.
Progress Software, C. (2003), Getting Started With Progress Dynamics, Progress Dynamics Software Corporation.
Quilici, A. (1993), A hybrid approach to recognizing programming plans, in ‘IEEE Workshop on Program Comprehension’, Capri, Italy, pp. 96–103.
Quilici, A., Woods, S. and Zhang, Y. (1997), New experiments with a constraint-based approach to program matching, in ‘Working Conference on Reverse Engineering’, pp. 114–123.
Quilici, A. and Yang, Q. (1996), Applying plan recognition algorithms to program understanding, in ‘Knowledge-Based Software Engineering Conference’, Syracuse, NY, USA, pp. 96–103.
Rajlich, V. and Wilde, N. (2002), The role of concepts in program comprehension, in ‘International Conference on Program Comprehension’, pp. 271–278.
Ran, S., Brebner, P. and Gorton, I. (2001), The rigorous evaluation of enterprise java bean technology, in ‘The 15th International Conference on Information Networking’, IEEE Computer Society, IEEE, Beppu City, Oita, Japan, pp. 93–100.
Relf, P. A. (2005), Tool assisted identifier naming for improved software readability: An empirical study, in ‘International Symposium on Empirical Software Engineering’.

Rennard, J.-P. (2000), Introduction to genetic algorithms. http://www.rennard.org/alife/english/gavintrgb.html.
Rich, C. (1984), ‘Artificial intelligence and software engineering: The programmer’s apprentice project’, ACM Annual Conference.
Richardson, I. (1999), Improving the Software Process in Small Indigenous Software Development Companies using a Model based on Quality Function Deployment, PhD thesis, University of Limerick, Castletroy, Limerick, Ireland.
Ritchie, B. (2006), www.dotnetPowered.com.
Ritsch, H. and Sneed, H. (1993), Reverse engineering programs via dynamic analysis, in ‘Proceedings of Working Conference on Reverse Engineering’, pp. 192–201.
Riva, C. and Deursen, A. v. (2001), Software architecture reconstruction, in ‘International Conference on Software Engineering’, IEEE, IEEE, Edinburgh, Scotland. Tutorial T8.
Robillard, M. P. and Murphy, G. C. (2002), Concern graphs: Finding and describing concerns using structural program dependencies, in ‘International Conference on Software Engineering’.
Rochester, J. B. and Douglass, D. P. (1991), ‘Re-engineering existing systems’, I/S Analyser 29(10), 1–12.
Russo, J., Johnson, E. and Stephens, D. (1989), ‘The validity of verbal protocols’, Memory and Cognition 17.
Sadd, J. (2003), Progress Dynamics Developer’s Guide, Expert Series, Progress Software Corporation.

Sartipi, K. (2001), Alborz: A query-based tool for software architecture recovery, in ‘International Conference on Program Comprehension’, pp. 115–116.
Sartipi, K., Kontogiannis, K. and Mavaddat, F. (2000), A pattern matching framework for software architecture recovery and restructuring, in ‘International Conference on Program Comprehension’, pp. 37–47.
Schwanke, R. W. (1991), An intelligent tool for re-engineering software modularity, in ‘International Conference on Software Engineering’, pp. 83–92.
Seacord, R. C., Plakosh, D. and Lewis, G. A. (2003), Modernizing Legacy Systems, Addison Wesley.
Seaman, C. (1999), ‘Qualitative methods in empirical studies of software engineering’, IEEE Transactions on Software Engineering 25(1), 557–572.
Seaman, C. B. (2002), ‘The information gathering strategies of software maintainers’, Proceedings of the International Conference on Software Maintenance pp. 141–149.
Sefika, M., Sane, A. and Campbell, R. H. (1996), Monitoring compliance of a software system with its high-level design models, in ‘International Conference on Software Engineering’, pp. 387–396.
Segal, J., Grinyer, A. and Sharp, H. (2005), The type of evidence produced by empirical software engineers, in ‘REBSE ’05’, St. Louis, Missouri, USA.
Shaft, T. M. (1995), ‘The relevance of application domain knowledge: The case of computer program comprehension’, Information Systems Research 6, 286.
Siff, M. and Reps, T. (1997), Identifying modules via concept analysis, in ‘International Conference on Software Maintenance’, Bari, Italy, pp. 170–179.

Simon, D. and Eisenbarth, T. (2002), ‘Evolutionary introduction of software product lines’, Lecture Notes in Computer Science 2379, 272–282.
Simula (2006), The simula programming languages. http://www.engin.umd.umich.edu/CIS/course.des/cis400/simula/simula.html.
Snelting, G. (1996), ‘Reengineering of configurations based on mathematical concept analysis’, ACM Transactions on Software Engineering and Methodology 5(2), 146–189.
Snelting, G. (1998), Reengineering of class hierarchies using concept analysis, in ‘Sixth ACM SIGSOFT Symposium on the Foundations of Software Engineering’, pp. 99–110.
Snyder, A. (1986), Encapsulation and inheritance in object-oriented programming languages, in ‘OOPSLA’.
Stephens, W., Myers, G. and Constantine, L. (1974), ‘Structured design’, IBM Systems Journal 13(2).
Stevens, P. and Pooley, R. (2000), Using UML - Software Engineering with Objects and Components, Object Technology Series, Addison-Wesley.
Stoermer, C., O’Brien, L. and Verhoef, C. (2003), Moving towards quality attribute driven software architecture reconstruction, in ‘Working Conference on Reverse Engineering’.
Stoermer, C., O’Brien, L. and Verhoef, C. (2004), Architectural views through collapsing strategies, in ‘International Workshop on Program Comprehension (IWPC’04)’.
Storey, M.-A. D. and Muller, H. A. (1995), Manipulating and documenting software structures using shrimp views, in ‘International Conference in Software Maintenance’, IEEE, Nice, France, pp. 275–285.

Sun Microsystems, I. (2006), ‘Java home page’, World Wide Web. http://java.sun.com.
Szyperski, C. (2003), Component technology - what, where and how?, in ‘International Conference on Software Engineering’, IEEE, pp. 684–693.
Terekhov, A. A. and Verhoef, C. (2000), ‘The realities of language conversions’, IEEE Software (6), 111–124.
Tilley, S. R., Wong, K., Storey, M.-A. D. and Muller, H. A. (1994), ‘Programmable reverse engineering’, International Journal of Software Engineering and Knowledge Engineering 4(4), 501–520.
Tip, F. (1995), ‘A survey of program slicing techniques’, Journal of Programming Languages 3, 121–189.
Tran, J. B., Godfrey, M. W., Lee, E. H. and Holt, R. C. (2000), Architectural repair of open source software, in ‘International Conference on Program Comprehension’, pp. 48–59.
Tulving, E. (1975), Ecphoric processing in recall and recognition, in ‘Recall and Recognition’, Wiley.
Tulving, E. (1983), Elements of Episodic Memory, Oxford University Press.
Tvedt, R. T., Costa, P. and Lindvall, M. (2002), Does the code match the design? A process for architecture evaluation, in ‘International Conference on Software Maintenance’.
Tvedt, R. T., Costa, P. and Lindvall, M. (2004), ‘Evaluating software architectures’, Advances in Computers 61, 1–43.
Urschler, G. (1975), ‘Automatic structuring of programs’, IBM Journal of Research and Development 19, 181–194.

Valasareddi, R. R. and Carver, D. L. (1998), A graph-based object identification process for procedural programs, in ‘Proceedings of the Working Conference on Reverse Engineering (WCRE’98)’, IEEE Computer Society, Washington, DC, USA, p. 50.
Verhoef, C. (2000), ‘The realities of large software portfolios’, citeseer.ist.psu.edu/verhoef00realities.html.
Voas, J. M. (1998), ‘Certifying off-the-shelf software components’, IEEE Computer 31(6), 53–59.
W3C (2006), ‘Soap specification’, World Wide Web. http://www.w3.org/TR/soap/.
Walenstein, A. (2002), ‘Theory-based analysis of cognitive support in software comprehension tools’, International Workshop on Program Comprehension 10, 75–84.
Wallnau, K. C. (2003), Volume III: A technology for predictable assembly from certifiable components, Technical Report CMU/SEI-2003-TR-009, ESC-TR-2003-009, Software Engineering Institute, Carnegie-Mellon University.
Wang, G., Ungar, L. and Klawitter, D. (1999), ‘Component assembly for oo distributed systems’, IEEE Computer 32(7), 71–78.
Washizaki, H., Yamamoto, H. and Fukazawa, Y. (2002), Software component metrics and its experimental evaluation, in ‘Proc. of the International Symposium on Empirical Software Engineering (ISESE 2002)’.
Weiser, M. (1982), ‘Programmers use slices when debugging’, Communications of the ACM 25(7), 446–452.
Whittaker, J. A. and Voas, J. M. (2002), ‘50 years of software: Key principles for software’, IEEE IT Professional 4(6), 28–35.

Wikipedia (2006a), ‘Hawthorne effect’, World Wide Web. http://en.wikipedia.org/wiki/hawthorne_effect.
Wikipedia (2006b), Locating features in vim: A software reconnaissance case study. http://en.wikipedia.org/wiki/Information_hiding.
Wilde, N. (1994), Faster reuse and maintenance using “software reconnaissance”, Technical Report SERC-TR-75F, University of West Florida, Pensacola, Florida 32514, USA.
Wilde, N. (1998), Understanding embedded software through instrumentation: Preliminary results from a survey of techniques, Technical Report SERC-TR-85-F, Software Engineering Research Center, Purdue University, 1398 Department of Computer Science, West Lafayette, IN 47906.
Wilde, N., Blackwell, K. and Justice, R. (1998), Understanding data-sensitive code: One piece of the year 2000 puzzle, Report SERC-TR-83-F, Software Engineering Research Center, University of Florida, CIS Department, Gainesville, FL 32611, January, 1998.
Wilde, N., Buckellew, M., Page, H., Rajlich, V. and Pounds, L. (2003), ‘A comparison of methods for locating features in legacy software’, Journal of Systems and Software 65(2), 105–114.
Wilde, N. and Casey, C. (1996), Early field experience with the software reconnaissance technique for program comprehension, in ‘Proceedings of the 1996 International Conference on Software Maintenance’, pp. 312–318.
Wilde, N., Casey, C., Vandeville, J., Trio, G. and Hotz, D. (1997), Reverse engineering of software threads: A design recovery technique for large multi-process systems, Report SERC-TR-82-F, Software Engineering Research Center, Computer Science Department, Purdue University, West Lafayette, IN 47907.

Wilde, N., Gomez, J., Gust, T. and Strasburg, D. (1992), Locating user functionality in old code, in ‘Conference on Software Maintenance’, IEEE, pp. 200–205.
Wilde, N., Page, H. and Rajlich, V. R. (2001), A case study of feature location in unstructured legacy fortran code, in ‘Proceedings of the Fifth European Conference on Software Maintenance and Reengineering’, IEEE Computer Society, p. 68.
Wilde, N. and Scully, M. C. (1995), ‘Software reconnaissance: Mapping program features to code’, Journal of Software Maintenance: Research and Practice 7(1), 49–62.
Wong, W. E., Gokhale, S., Horgan, J. R. and Trivedi, K. S. (1999), Locating program features using execution slices, in ‘Proceedings of the 1999 IEEE Symposium on Application-Specific Systems and Software Engineering and Technology’, IEEE, IEEE Computer Society, p. 194.
Woodman, M., Benediktsson, O., Lefever, B. and Stallinger, F. (2001), ‘Issues of cbd product quality and process quality’, 4th ICSE Workshop on Component-Based Software Engineering.
Woods, S. and Quilici, A. (1996), Some experiments toward understanding how program plan recognition algorithms scale, in ‘Working Conference on Reverse Engineering’, Monterey, CA, USA, pp. 21–30.
Yan, H., Garlan, D., Schmerl, B., Aldrich, J. and Kazman, R. (2004), Discotect: A system for discovering architectures from running systems, in ‘International Conference on Software Engineering’, IEEE, IEEE, Edinburgh, Scotland, pp. 470–479.
Yeh, A. S., Harris, D. R. and Reubenstein, H. B. (1995), Recovering abstract data types and object instances from a conventional procedure language, in ‘Working Conference on Reverse Engineering’, pp. 227–236.

Yip, S. W. L. (1995), Software maintenance in Hong Kong, in ‘International Conference on Software Maintenance’, IEEE, IEEE, Opio (Nice), France, pp. 88–97.
Yourdon, E. and Constantine, L. (1979), Structured Design, Prentice Hall.
Zhao, W., Zhang, L., Liu, Y., Sun, J. and Yang, F. (2004), Sniafl: Towards a static non-interactive approach to feature location, in ‘International Conference on Software Engineering’, IEEE, IEEE Computer Society, Edinburgh, Scotland, pp. 293–303.
Zweben, S. H., Edwards, S. H., Weide, B. W. and Hollingsworth, J. E. (1995), ‘The effects of layering and encapsulation on software development cost and quality’, IEEE Transactions on Software Engineering 21(3), 200–208.

Part IV Appendices

Appendix A Reuse Perspectives

A.1 Scrabble Emulator Reuse Perspective

Bsprite.cpp: Release, Restore
Buttons.cpp: PointInRect, Restore, hit
ComputerPlayer.cpp: ChangeTileState, GenerateMove, MoveTileToBoard, Player, ReturnTileIndexOnR, ReturnTileLetterOnRac
Csprite.cpp: draw
HumanPlayer.cpp: CalculateTileT, FindCorrectSquare, MoveTileAroundScreen, Player, ResetBClicked, checkPointinRect
Input.cpp: CInputManager, GameKeyboard, GameLMouseDown, IntroMouseUp, Keyboard, MainMenuKeyboard, MainMenuLMouseUp, OrderMenuKeyboard, OrderMenuLMouseUp, Restore, SPPLMouseUp, SelectMenuKeyboard, SelectMenuLMouseUp, SetupOrderMenuButt, SetupPlayingBut, SetupSelectMenuBu, buttondownhandler, buttonuphandler
Letters.cpp: GetNextLetter, Letters, NumOfTilesLeft, PrintLetterArray, ReturnLetterToArray, ReturnValueOfLetter
Main.cpp: ComposeFrame, Redraw, RestoreSurfaces
Objects.cpp: ChangeTileStat, CheckTileState, CreateTileOnBoard, DisplayAssociatedLe, GetAssociatedLet, ResetAssociatedLe, SetPos, UpdateLetter, WhatPos, checkPointinR, create, draw, move, updateRect
Player.cpp: CalculateScore, CheckPlayerType, CreateTileObjects, DisplayTilesToS, GetPlayerName, GetScoreForMove, GetTileObject, GetTotalPlayerScore, IsRackEmpty, IsTileEmpty, MoveTilesOffScreen, MoveTilesOntoScreen, Player, PutTilesBackOnRack, ResetAssociatedLetter, ResetTileStates, ReturnLetters, ReturnTileState, SetPlayerName, SetTileOffscreenLocatio, UpdateLetterOnRack
Referee.cpp: AnalyseMove, CheckDictionary, DealTiles, GetPassCount, IsFirstMove, ReplaceTiles, SetFirstMove
Timer.cpp: elapsed
board.cpp: AddToCurrentMove, AreCrosschecksViolated, AssignTileObject, AtLeastOneAnchor, CalcAnchorsAndCrosschecks, CalculateScoreOfMove, CheckCentreSquare, DeleteFromTempCurrentMove, DirectionOfWord, ExtendDirectly, ExtendDown, ExtendDownDirectly, ExtendRight, GenerateFirstMove, GenerateHorizMoves, GenerateVertMoves, GetSquareColLocation, GetSquareRowLocation, IsInCrosscheckSet, IsMoveLinear, IsNewlyOccupiedSquare, IsOnlyOneWord, IsTileOnBoard, LeftPart, MoveEmptyPart, PlaceMove, PlaceMoveOnBoard, PrintAnchorsCrosschecks, PrintBoard, RecordLegalMove, ResetBoard, ResetCurrentCrosschecks, ResetCurrentMoveArray, ResetCurrentMoveDirection, ResetMoveInfo, ResetSquare, ResetSquares, RestoreTilesOnBoard, ReturnSquareColumn, ReturnSquareRow, ReturnWord, SetCurrentMoveDirection, UpdateCrossChecks, UpdateHorizXChkLetters, UpdateSquare, UpdateVertXChkLetters, UpdateXChkSqValue
dictionary.cpp: GetAssociatedLetter, GetAssociatedNumbe, GetEdgeLetter, IsEndOfNode, IsInNode, IsWordValid, ReturnNextNode
gameman.cpp: Begin, CalculateTileToBeMoved, ChangeTileState, CheckGameState, CompPlayer, Destroy, DetermineHighest, DisplayTile, DisplayTilesToScreen, FindCorrectSquare, GetTileIndexOnRack, GetTileLetterOnRack, GetTileObject, HandleMove, MoveTileAroundScreen, MoveTileToBoard, O_Display, O_SetUpNextMove, OrderHandler, P_Display, P_SetUpNextMove, PlayersChosen, ReturnLetters, ReturnTileState, Select
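The reuse perspective above is recorded as a flat set of (file, routine) pairs. As a minimal illustration of how such a set can be derived, the sketch below applies the trace-comparison idea behind Software Reconnaissance to per-feature execution traces. The trace data, the routine selection and the two-feature threshold are hypothetical assumptions for illustration only; they are not the thesis's actual tooling or data.

```python
from collections import Counter

# Hypothetical per-feature execution traces: each feature maps to the set of
# (file, routine) pairs exercised when that feature is run. Real traces in
# the study came from instrumented runs of the subject system.
traces = {
    "place_word": {("board.cpp", "PlaceMove"),
                   ("Letters.cpp", "GetNextLetter"),
                   ("dictionary.cpp", "IsWordValid")},
    "exchange_tiles": {("Letters.cpp", "GetNextLetter"),
                       ("Letters.cpp", "ReturnLetterToArray")},
    "score_move": {("board.cpp", "CalculateScoreOfMove"),
                   ("Letters.cpp", "GetNextLetter")},
}

def reuse_perspective(traces, min_features=2):
    """Routines exercised by at least `min_features` features are reuse candidates."""
    counts = Counter(r for trace in traces.values() for r in trace)
    return {r for r, n in counts.items() if n >= min_features}

for file_name, routine in sorted(reuse_perspective(traces)):
    print(file_name, routine)  # prints: Letters.cpp GetNextLetter
```

Run on the hypothetical traces above, only GetNextLetter crosses the threshold; run on real traces, the same counting yields a listing of the kind shown in Section A.1.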

Appendix B Pilot Study - Scrabble Emulator Reflexion Models and Maps

[ file=board.cpp mapTo=Board ]
[ file=board.h mapTo=Board ]
[ class=^CBoard$ mapTo=Board ]
[ class=^CBaseSprite$ mapTo=OutsideSys ]
[ class=^CBmpFileReader$ mapTo=OutsideSys ]
[ class=^CBmpSpriteFileReader$ mapTo=OutsideSys ]
[ class=^CButtonManager$ mapTo=OutsideSys ]
[ class=^CClippedSprite$ mapTo=OutsideSys ]
[ class=^CGameManager$ mapTo=OutsideSys ]
[ class=^CInputManager$ mapTo=OutsideSys ]
[ class=^ComputerPlayer$ mapTo=OutsideSys ]
[ class=^HumanPlayer$ mapTo=OutsideSys ]
[ class=^LetterData$ mapTo=OutsideSys ]
[ class=^CPlayerOrderData$ mapTo=OutsideSys ]
[ class=^CTimer$ mapTo=OutsideSys ]
[ class=^DictionaryEdge$ mapTo=OutsideSys ]
[ class=^Dictionary$ mapTo=OutsideSys ]
[ class=^Player$ mapTo=OutsideSys ]
[ class=^Referee$ mapTo=OutsideSys ]
[ class=^Square$ mapTo=OutsideSys ]
[ class=^TileObject$ mapTo=OutsideSys ]

Figure B.1: Pilot Study - Scrabble Emulator - Reflexion model and map 1.
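Each map in this and the following appendices is an ordered list of rules, each pairing a regular expression over source entity names (files or classes) with a high-level model entity; a source entity is assigned to the module of the first rule it matches, which is why catch-all rules such as class=^.*$ or file=.* always appear last. The sketch below illustrates that first-match-wins reading of the maps. The rule syntax is paraphrased from the figures, and the function name map_entity is an invention for this sketch; the matching details of the actual mapping tools may differ.

```python
import re

# Ordered map rules in the spirit of Figure B.1: (kind, pattern, module).
# Order matters: an entity takes the module of the FIRST rule it matches,
# so a catch-all rule must come last or it would swallow every entity.
RULES = [
    ("file",  r"board\.cpp",    "Board"),
    ("class", r"^CBoard$",      "Board"),
    ("class", r"^CBaseSprite$", "OutsideSys"),
    ("class", r".*",            "OutsideSys"),  # catch-all
]

def map_entity(kind, name, rules=RULES):
    """Return the high-level module of the first rule matching this entity."""
    for rule_kind, pattern, module in rules:
        if rule_kind == kind and re.search(pattern, name):
            return module
    return None  # unmapped source entities stay outside the model

print(map_entity("class", "CBoard"))   # Board
print(map_entity("class", "Referee"))  # OutsideSys, via the catch-all
```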

[ file=board.cpp mapTo=Board ]
[ file=board.h mapTo=Board ]
[ class=^CBoard$ mapTo=Board ]
[ class=^CBaseSprite$ mapTo=GraphicsEngine ]
[ class=^CBmpFileReader$ mapTo=GraphicsEngine ]
[ class=^CBmpSpriteFileReader$ mapTo=GraphicsEngine ]
[ class=^CButtonManager$ mapTo=GraphicsEngine ]
[ class=^CClippedSprite$ mapTo=GraphicsEngine ]
[ class=^CGameManager$ mapTo=GameManager ]
[ class=^CInputManager$ mapTo=OutsideSys ]
[ class=^ComputerPlayer$ mapTo=PlayerEngine ]
[ class=^HumanPlayer$ mapTo=PlayerEngine ]
[ class=^LetterData$ mapTo=Letters_DBase ]
[ class=^Letters$ mapTo=Letters_DBase ]
[ class=^CPlayerOrderData$ mapTo=GameManager ]
[ class=^CTimer$ mapTo=OutsideSys ]
[ class=^DictionaryEdge$ mapTo=Dictionary_DBase ]
[ class=^Dictionary$ mapTo=Dictionary_DBase ]
[ class=^Player$ mapTo=PlayerEngine ]
[ class=^Referee$ mapTo=Referee ]
[ class=^Square$ mapTo=Board ]
[ class=^TileObject$ mapTo=GraphicsEngine ]
[ file=^Main.cpp$ mapTo=MainDriver ]
[ file=^ddsetup.cpp$ mapTo=GraphicsEngine ]

Figure B.2: Pilot Study - Scrabble Emulator - Reflexion model and map 2.

[ file=board.cpp mapTo=Board ]
[ file=board.h mapTo=Board ]
[ class=^CBoard$ mapTo=Board ]
[ class=^CBaseSprite$ mapTo=GraphicsEngine ]
[ class=^CBmpFileReader$ mapTo=GraphicsEngine ]
[ class=^CBmpSpriteFileReader$ mapTo=GraphicsEngine ]
[ class=^CButtonManager$ mapTo=GraphicsEngine ]
[ class=^CClippedSprite$ mapTo=GraphicsEngine ]
[ class=^CGameManager$ mapTo=GameManager ]
[ class=^CInputManager$ mapTo=OutsideSys ]
[ class=^ComputerPlayer$ mapTo=PlayerEngine ]
[ class=^HumanPlayer$ mapTo=PlayerEngine ]
[ class=^LetterData$ mapTo=Letters_DBase ]
[ class=^Letters$ mapTo=Letters_DBase ]
[ class=^CPlayerOrderData$ mapTo=GameManager ]
[ class=^CTimer$ mapTo=OutsideSys ]
[ class=^DictionaryEdge$ mapTo=Dictionary_DBase ]
[ class=^Dictionary$ mapTo=Dictionary_DBase ]
[ class=^Player$ mapTo=PlayerEngine ]
[ class=^Referee$ mapTo=Referee ]
[ class=^Square$ mapTo=Board ]
[ class=^TileObject$ mapTo=GraphicsEngine ]
[ file=^Main.cpp$ mapTo=MainDriver ]
[ file=^ddsetup.cpp$ mapTo=GraphicsEngine ]

Figure B.3: Pilot Study - Scrabble Emulator - Reflexion model and map 3.

[ file=board.cpp mapTo=Board ]
[ file=board.h mapTo=Board ]
[ class=^CBoard$ mapTo=Board ]
[ class=^CBaseSprite$ mapTo=GraphicsEngine ]
[ class=^CBmpFileReader$ mapTo=GraphicsEngine ]
[ class=^CBmpSpriteFileReader$ mapTo=GraphicsEngine ]
[ class=^CButtonManager$ mapTo=GraphicsEngine ]
[ class=^CClippedSprite$ mapTo=GraphicsEngine ]
[ class=^CGameManager$ mapTo=GameManager ]
[ class=^CInputManager$ mapTo=OutsideSys ]
[ class=^ComputerPlayer$ mapTo=PlayerEngine ]
[ class=^HumanPlayer$ mapTo=PlayerEngine ]
[ class=^LetterData$ mapTo=Letters_DBase ]
[ class=^Letters$ mapTo=Letters_DBase ]
[ class=^CPlayerOrderData$ mapTo=GameManager ]
[ class=^CTimer$ mapTo=OutsideSys ]
[ class=^DictionaryEdge$ mapTo=Dictionary_DBase ]
[ class=^Dictionary$ mapTo=Dictionary_DBase ]
[ class=^Player$ mapTo=PlayerEngine ]
[ class=^Referee$ mapTo=Referee ]
[ class=^Square$ mapTo=Board ]
[ class=^TileObject$ mapTo=GraphicsEngine ]
[ file=^Main.cpp$ mapTo=MainDriver ]
[ file=^ddsetup.cpp$ mapTo=GraphicsEngine ]

Figure B.4: Pilot Study - Scrabble Emulator - Reflexion model and map 4.

Appendix C Case Study - Workplace - Participant 1 - Reflexion Models and Maps




Figure C.1: Case Study - Workplace - Participant 1 - Reflexion model and map 1

class="^.*Helper$" mapTo="Helpers"/>
class="^AuditAction$" mapTo="RestOfLearning"/>
class="^.*Action$" mapTo="StrutsActionHandlers"/>
class="^.*Form$" mapTo="StrutsActionForms"/>
class="^.*$" mapTo="RestOfLearning"/>

Figure C.2: Case Study - Workplace - Participant 1 - Reflexion model and map 2

class="^.*Helper$" mapTo="Helpers"/>
class="^AuditAction$" mapTo="RestOfLearning"/>
class="^.*Action$" mapTo="StrutsActionHandlers"/>
class="^.*Form$" mapTo="StrutsActionForms"/>
class="^.*Module" mapTo="Services"/>
class="^.*$" mapTo="RestOfLearning"/>

Figure C.3: Case Study - Workplace - Participant 1 - Reflexion model and map 3

class="^.*Helper$" mapTo="Helpers"/>
class="^AuditAction$" mapTo="RestOfLearning"/>
class="^.*Action$" mapTo="StrutsActionHandlers"/>
class="^.*Form$" mapTo="StrutsActionForms"/>
class="^.*Module$" mapTo="ServiceInterfaces"/>
class="^.*ModuleImpl$" mapTo="ServiceImpls"/>
class="^.*$" mapTo="RestOfLearning"/>

Figure C.4: Case Study - Workplace - Participant 1 - Reflexion model and map 4

class="^.*Helper$" mapTo="Helpers"/>
class="^AuditAction$" mapTo="RestOfLearning"/>
class="^.*Action$" mapTo="StrutsActionHandlers"/>
class="^.*Form$" mapTo="StrutsActionForms"/>
class="^.*Module$" mapTo="ServiceInterfaces"/>
class="^.*ModuleImpl$" mapTo="ServiceImpls"/>
class="^.*Mgr" mapTo="PersistenceMgrs"/>
class="^.*$" mapTo="RestOfLearning"/>

Figure C.5: Case Study - Workplace - Participant 1 - Reflexion model and map 5


Figure C.6: Case Study - Workplace - Participant 1 - Reflexion model 6

class="^.*Helper$" mapTo="Helpers"/> class="^.*Tag$" mapTo="JSPTags"/> class="^AuditAction$" mapTo="RestOfLearning"/> class="^.*Action$" mapTo="StrutsActionHandlers"/> class="^.*Form$" mapTo="StrutsActionForms"/> class="^.*Module$" mapTo="ServiceInterfaces"/> class="^.*ModuleImpl$" mapTo="ServiceImpls"/> class="^.*Service$" mapTo="WebServiceAPIs"/> class="^.*Element$" mapTo="WebServiceAPIs"/> class="^.*Mgr" mapTo="PersistenceMgrs"/> class="^.*$" mapTo="RestOfLearning"/>

class="^.*Helper$" mapTo="Helpers"/>
class="^.*Tag$" mapTo="JSPTags"/>
class="^AuditAction$" mapTo="RestOfLearning"/>
class="^.*Action$" mapTo="StrutsActionHandlers"/>
class="^.*Form$" mapTo="StrutsActionForms"/>
class="^.*Module$" mapTo="ServiceInterfaces"/>
class="^.*ModuleImpl$" mapTo="ServiceImpls"/>
class="^.*Service$" mapTo="WebServiceAPIs"/>
class="^.*Element$" mapTo="WebServiceAPIs"/>
class="^.*Command$" mapTo="RemoteProcessingCommands"/>
class="^.*Mgr" mapTo="PersistenceMgrs"/>
class="^.*DsBean$" mapTo="DSPersistentBeans"/>
class="^.*$" mapTo="RestOfLearning"/>

Figure C.7: Case Study - Workplace - Participant 1 - Reflexion model and map 7

class="^.*Helper$" mapTo="Helpers"/>
class="^.*Tag$" mapTo="JSPTags"/>
class="^AuditAction$" mapTo="RestOfLearning"/>
class="^.*Action$" mapTo="StrutsActionHandlers"/>
class="^.*Form$" mapTo="StrutsActionForms"/>
class="^.*Module$" mapTo="ServiceInterfaces"/>
class="^.*ModuleImpl$" mapTo="ServiceImpls"/>
class="^.*Service$" mapTo="WebServiceAPIs"/>
class="^.*Element$" mapTo="WebServiceAPIs"/>
class="^.*Command$" mapTo="RemoteProcessingCommands"/>
class="^.*Mgr" mapTo="PersistenceMgrs"/>
class="^.*DsBean$" mapTo="DSPersistentBeans"/>
class="^.*Bean$" mapTo="LMMPersistentBeans"/>
class="^.*$" mapTo="RestOfLearning"/>

Figure C.8: Case Study - Workplace - Participant 1 - Reflexion model and map 8

Appendix D Case Study - Workplace - Participant 2 - Reflexion Models and Maps

package="com.ibm.workplace.elearn.action" mapTo="Actions"/>
package="com.ibm.workplace.elearn.delivery" mapTo="Delivery"/>
package="com.ibm.workplace.elearn.servlet" mapTo="Servlet"/>
package="com.ibm.workplace.elearn.taglib" mapTo="TagLib"/>
class=".*" mapTo="RestOfSystem"/>

Figure D.1: Case Study - Workplace - Participant 2 - Reflexion model and map 1

package="com.ibm.workplace.elearn.action" mapTo="Actions"/>
package="com.ibm.workplace.elearn.delivery" mapTo="Delivery"/>
package="com.ibm.workplace.elearn.servlet" mapTo="Servlet"/>
package="com.ibm.workplace.elearn.taglib" mapTo="TagLib"/>
package="com.ibm.workplace.elearn.view" mapTo="View"/>
class=".*" mapTo="RestOfSystem"/>

Figure D.2: Case Study - Workplace - Participant 2 - Reflexion model and map 2

package="com.ibm.workplace.elearn.action" mapTo="Actions"/>
package="com.ibm.workplace.elearn.delivery" mapTo="Delivery"/>
package="com.ibm.workplace.elearn.servlet" mapTo="Servlet"/>
package="com.ibm.workplace.elearn.taglib" mapTo="TagLib"/>
package="com.ibm.workplace.elearn.view" mapTo="View"/>



Figure D.3: Case Study - Workplace - Participant 2 - Reflexion model and map 3

package="com.ibm.workplace.elearn.action" mapTo="Actions"/>
package="com.ibm.workplace.elearn.delivery" mapTo="Delivery"/>
package="com.ibm.workplace.elearn.servlet" mapTo="Servlet"/>
package="com.ibm.workplace.elearn.taglib" mapTo="TagLib"/>
package="com.ibm.workplace.elearn.view" mapTo="View"/>
package="com.ibm.workplace.elearn.navigation" mapTo="Navigation"/>



Figure D.4: Case Study - Workplace - Participant 2 - Reflexion model and map 4

package="com.ibm.workplace.elearn.action" mapTo="Actions"/>
package="com.ibm.workplace.elearn.delivery" mapTo="Delivery"/>
package="com.ibm.workplace.elearn.servlet" mapTo="Servlet"/>
package="com.ibm.workplace.elearn.taglib" mapTo="TagLib"/>
package="com.ibm.workplace.elearn.view" mapTo="View"/>
package="com.ibm.workplace.elearn.webutil" mapTo="WebUtil"/>



Figure D.5: Case Study - Workplace - Participant 2 - Reflexion model and map 5

package="com.ibm.workplace.elearn.action" mapTo="Actions"/>
package="com.ibm.workplace.elearn.delivery" mapTo="Delivery"/>
package="com.ibm.workplace.elearn.servlet" mapTo="Servlet"/>
package="com.ibm.workplace.elearn.taglib" mapTo="TagLib"/>
package="com.ibm.workplace.elearn.view" mapTo="View"/>
package="com.ibm.workplace.elearn.webutil" mapTo="WebUtil"/>



Figure D.6: Case Study - Workplace - Participant 2 - Reflexion model and map 6

package="com.ibm.workplace.elearn.action" mapTo="Actions"/>
package="com.ibm.workplace.elearn.delivery" mapTo="Delivery"/>
package="com.ibm.workplace.elearn.servlet" mapTo="Servlet"/>
package="com.ibm.workplace.elearn.taglib" mapTo="TagLib"/>
package="com.ibm.workplace.elearn.view" mapTo="View"/>
package="com.ibm.workplace.elearn.webutil" mapTo="WebUtil"/>



Figure D.7: Case Study - Workplace - Participant 2 - Reflexion model and map 7

package="com.ibm.workplace.elearn.action" mapTo="Actions"/>
package="com.ibm.workplace.elearn.delivery" mapTo="Delivery"/>
package="com.ibm.workplace.elearn.servlet" mapTo="Servlet"/>
package="com.ibm.workplace.elearn.taglib" mapTo="TagLib"/>
package="com.ibm.workplace.elearn.view" mapTo="View"/>
package="com.ibm.workplace.elearn.webutil" mapTo="WebUtil"/>



Figure D.8: Case Study - Workplace - Participant 2 - Reflexion model and map 8

package="com.ibm.workplace.elearn.action" mapTo="Actions"/>
package="com.ibm.workplace.elearn.delivery" mapTo="Delivery"/>
package="com.ibm.workplace.elearn.servlet" mapTo="Servlet"/>
package="com.ibm.workplace.elearn.taglib" mapTo="TagLib"/>
package="com.ibm.workplace.elearn.view" mapTo="View"/>
package="com.ibm.workplace.elearn.webutil" mapTo="WebUtil"/>



Figure D.9: Case Study - Workplace - Participant 2 - Reflexion model and map 9


Figure D.10: Case Study - Workplace - Participant 2 - Reflexion model 10




Figure D.11: Case Study - Workplace - Participant 2 - Reflexion model and map 11

Appendix E Case Study - AIM - Participant 1 - Reflexion Models and Maps

Note that the file names are obfuscated for copyright reasons.

[ file=^xihq mapTo=AIM_Util ]
[ file=.* mapTo=RestofAIM ]

Figure E.1: Case Study - AIM - Participant 1 - Reflexion Model and map 1

[ file=^xihq mapTo=AIM_Util ]
[ file=^xifo mapTo=Engine ]
[ file=.* mapTo=RestofAIM ]

Figure E.2: Case Study - AIM - Participant 1 - Reflexion Model and map 2

[ file=^xihq mapTo=AIM_Util ]
[ file=^xifo mapTo=Engine ]
[ file=^xics mapTo=UIBrowses ]
[ file=^ximv mapTo=UIBrowses ]
[ file=.* mapTo=RestofAIM ]

Figure E.3: Case Study - AIM - Participant 1 - Reflexion Model and map 3

[ file=^xihq mapTo=AIM_Util ]
[ file=^xifo mapTo=Engine ]
[ file=^ximv mapTo=UIBrowses ]
[ file=^xicq mapTo=UIBrowses ]
[ file=.* mapTo=RestofAIM ]

Figure E.4: Case Study - AIM - Participant 1 - Reflexion Model and map 4

[ file=^xiho mapTo=AIM_Util ]
[ file=^xifo mapTo=Engine ]
[ file=ximvtijq mapTo=RestofAIM ]
[ file=hqjfmq mapTo=UIBrowses ]
[ file=^ximv mapTo=UIBrowses ]
[ file=^xics mapTo=UIBrowses ]
[ file=.* mapTo=RestofAIM ]

Figure E.5: Case Study - AIM - Participant 1 - Reflexion Model and map 5

[ file=\.t$ mapTo=AIMDB_Triggers ]
[ file=^xihq mapTo=AIM_Util ]
[ file=^xifo mapTo=Engine ]
[ file=ximvtijq mapTo=RestofAIM ]
[ file=hqifmq mapTo=UIBrowses ]
[ file=^ximv mapTo=UIBrowses ]
[ file=^xics mapTo=UIBrowses ]
[ file=.* mapTo=RestofAIM ]

Figure E.6: Case Study - AIM - Participant 1 - Reflexion Model and map 6

[ file=\.t$ mapTo=AIMDB_Triggers ]
[ file=^xihq mapTo=AIM_Util ]
[ file=^xifo mapTo=Engine ]
[ file=ximvtijq mapTo=RestofAIM ]
[ file=hqifmq mapTo=UIBrowses ]
[ file=^ximv mapTo=UIBrowses ]
[ file=^xicq mapTo=UIBrowses ]
[ file=^xiqs mapTo=Printing ]
[ file=^ximg1 mapTo=Algorithms ]
[ file=^xirb1 mapTo=Algorithms ]
[ file=^xiqb1 mapTo=Algorithms ]
[ file=^xiql1 mapTo=Algorithms ]
[ file=.* mapTo=RestofAIM ]

Figure E.7: Case Study - AIM - Participant 1 - Reflexion Model and map 7

[ file=\.t$ mapTo=AIMDB_Triggers ]
[ file=^xihq mapTo=AIM_Util ]
[ file=^xifo mapTo=Engine ]
[ file=ximvtijq mapTo=RestofAIM ]
[ file=hqifmq mapTo=UIBrowses ]
[ file=^ximv mapTo=UIBrowses ]
[ file=^xicq mapTo=UIBrowses ]
[ file=^xiqs mapTo=Printing ]
[ file=^ximg1 mapTo=Algorithms ]
[ file=^xirb1 mapTo=Algorithms ]
[ file=^xiqb1 mapTo=Algorithms ]
[ file=^xisq mapTo=Reporting ]
[ file=.* mapTo=RestofAIM ]

Figure E.8: Case Study - AIM - Participant 1 - Reflexion Model and map 8

[ file=\.t$ mapTo=AIMDB_Triggers ]
[ file=^xihq mapTo=AIM_Util ]
[ file=^xifo mapTo=Engine ]
[ file=ximvtijq mapTo=RestofAIM ]
[ file=hqifmq mapTo=UIBrowses ]
[ file=^ximv mapTo=UIBrowses ]
[ file=^xicq mapTo=UIBrowses ]
[ file=^xiqs mapTo=Printing ]
[ file=^ximg1 mapTo=Algorithms ]
[ file=^xirb1 mapTo=Algorithms ]
[ file=^xiqb1 mapTo=Algorithms ]
[ file=^xisq mapTo=Reporting ]
[ file=^d mapTo=MFGPRO ]
[ file=^f mapTo=MFGPRO ]
[ file=^i mapTo=MFGPRO ]
[ file=^m mapTo=MFGPRO ]
[ file=^p mapTo=MFGPRO ]
[ file=^r mapTo=MFGPRO ]
[ file=^s mapTo=MFGPRO ]
[ file=dusm\.p$ mapTo=ControlFiles ]
[ file=.* mapTo=RestofAIM ]

Figure E.9: Case Study - AIM - Participant 1 - Reflexion Model and map 9

[ file=\.t$ mapTo=AIMDB_Triggers ]
[ file=dusm\.p$ mapTo=ControlFiles ]
[ file=^xihq mapTo=AIM_Util ]
[ file=^xifo mapTo=Engine ]
[ file=ximvtijq mapTo=RestofAIM ]
[ file=hqifmq mapTo=UIBrowses ]
[ file=^ximv mapTo=UIBrowses ]
[ file=^xicq mapTo=UIBrowses ]
[ file=^xiqs mapTo=Printing ]
[ file=^ximg1 mapTo=Algorithms ]
[ file=^xirb1 mapTo=Algorithms ]
[ file=^xiqb1 mapTo=Algorithms ]
[ file=^xisq mapTo=Reporting ]
[ file=^d mapTo=MFGPRO ]
[ file=^f mapTo=MFGPRO ]
[ file=^i mapTo=MFGPRO ]
[ file=^m mapTo=MFGPRO ]
[ file=^p mapTo=MFGPRO ]
[ file=^r mapTo=MFGPRO ]
[ file=^s mapTo=MFGPRO ]
[ file=^xicq mapTo=Picking ]
[ file=^xicc mapTo=Picking ]
[ file=^xiet mapTo=DOSessionTriggers ]
[ file=^xify mapTo=RFClient ]
[ file=^xisg mapTo=RFClient ]
[ file=^xijd mapTo=InventoryIC ]
[ file=^xipn mapTo=OrderMgt ]
[ file=^xiql mapTo=Picking ]
[ file=^xiqq mapTo=Item ]
[ file=.* mapTo=RestofAIM ]

Figure E.10: Case Study - AIM - Participant 1 - Reflexion Model and map 10

Appendix F Case Study - AIM - Participant 2 - Reflexion Models and Maps

File names are removed for copyright reasons.

Case Study - AIM - Participant 2 - Map 6
Case Study - AIM - Participant 2 - Map 7

%[ file=^xihq mapTo=AIM_Util ]
%[ file=.* mapTo=RestofAIM ]

Figure F.1: Case Study - AIM - Participant 2 - Reflexion Model 1

%[ file=^xihq mapTo=AIM_Util ]
%[ file=^xifo mapTo=Engine ]
%[ file=.* mapTo=RestofAIM ]

Figure F.2: Case Study - AIM - Participant 2 - Reflexion Model 2

%[ file=^xihq mapTo=AIM_Util ]
%[ file=^xifo mapTo=Engine ]
%[ file=^xics mapTo=UIBrowses ]
%[ file=^ximv mapTo=UIBrowses ]
%[ file=.* mapTo=RestofAIM ]

Figure F.3: Case Study - AIM - Participant 2 - Reflexion Model 3

%[ file=^xihq mapTo=AIM_Util ]
%[ file=^xifo mapTo=Engine ]
%[ file=^ximv mapTo=UIBrowses ]
%[ file=^xics mapTo=UIBrowses ]
%[ file=.* mapTo=RestofAIM ]

Figure F.4: Case Study - AIM - Participant 2 - Reflexion Model 4

%[ file=^xihq mapTo=AIM_Util ]
%[ file=^xifo mapTo=Engine ]
%[ file=ximvtijq mapTo=RestofAIM ]
%[ file=hqifmq mapTo=UIBrowses ]
%[ file=^ximv mapTo=UIBrowses ]
%[ file=^xics mapTo=UIBrowses ]
%[ file=.* mapTo=RestofAIM ]

Figure F.5: Case Study - AIM - Participant 2 - Reflexion Model 5

%[ file=\.t$ mapTo=AIMDB_Triggers ]
%[ file=^xihq mapTo=AIM_Util ]
%[ file=^xifo mapTo=Engine ]
%[ file=ximvtijq mapTo=RestofAIM ]
%[ file=hqifmq mapTo=UIBrowses ]
%[ file=^ximv mapTo=UIBrowses ]
%[ file=^xics mapTo=UIBrowses ]
%[ file=^xiqs mapTo=Printing ]
%[ file=^ximg1 mapTo=Algorithms ]
%[ file=^xirb1 mapTo=Algorithms ]
%[ file=^xiqb1 mapTo=Algorithms ]
%[ file=^xiql1 mapTo=Algorithms ]
%[ file=^xisq mapTo=Reporting ]
%[ file=.* mapTo=RestofAIM ]

Figure F.6: Case Study - AIM - Participant 2 - Reflexion Model 8

%[ file=\.t$ mapTo=AIMDB_Triggers ]
%[ file=^xihq mapTo=AIM_Util ]
%[ file=^xifo mapTo=Engine ]
%[ file=ximvtijq mapTo=RestofAIM ]
%[ file=hqifmq mapTo=UIBrowses ]
%[ file=^ximv mapTo=UIBrowses ]
%[ file=^xics mapTo=UIBrowses ]
%[ file=^xiqs mapTo=Printing ]
%[ file=^ximg1 mapTo=Algorithms ]
%[ file=^xirb1 mapTo=Algorithms ]
%[ file=^xiqb1 mapTo=Algorithms ]
%[ file=^xiql1 mapTo=Algorithms ]
%[ file=^xisq mapTo=Reporting ]
%[ file=^d mapTo=MFGPRO ]
%[ file=^f mapTo=MFGPRO ]
%[ file=^i mapTo=MFGPRO ]
%[ file=^m mapTo=MFGPRO ]
%[ file=^p mapTo=MFGPRO ]
%[ file=^r mapTo=MFGPRO ]
%[ file=^s mapTo=MFGPRO ]
%[ file=dusm\.p$ mapTo=ControlFiles ]
%[ file=.* mapTo=RestofAIM ]

Figure F.7: Case Study - AIM - Participant 2 - Reflexion Model 9

Appendix G Interfaces on the House Application

Figure G.1: “Transforms” component’s interface on “Main.” Provides interface.


Figure G.2: “Transforms” component’s interface on “GUI.” Provides interface.

Appendix H Peer Reviewed Publications

This appendix contains peer reviewed publications, first authored by the author, arising from the work of this thesis. The list includes:

• (Le Gear et al., 2004) A Process for Transforming Portions of Existing Software for Reuse in Modern Development Approaches, 1st International Workshop on Software Evolution Transformations, Andrew Le Gear, Jim Buckley, Seamus Galvin and Brendan Cleary, November 2004, pages 40-43, Delft, the Netherlands.

• (Le Gear, Cleary, Buckley and Exton, 2005) Making a Reuse Aspectual View Explicit in Existing Software, Linking Aspect Technology and Evolution (LATE), Andrew Le Gear, Brendan Cleary, Jim Buckley and Chris Exton, March 2005, Chicago, IL, USA.

• (Le Gear, Buckley, Cleary and Collins, 2005) Achieving a Reuse Perspective within a Component Recovery Process: An Industrial Case Study, International Workshop on Programming Comprehension, Andrew Le Gear, Jim Buckley, Brendan Cleary and J.J. Collins, May 2005, pages 279-288, St. Louis, USA.

• (Le Gear and Buckley, 2005c) Reengineering Towards Components Using “Reconn-exion,” ESEC/FSE Doctoral Symposium 2005, Andrew Le Gear and Jim Buckley, September 2005, Lisbon, Portugal.

• (Le Gear, Buckley, Collins and O’Dea, 2005) Software Reconn-exion: Understanding Software Using a Variation on Software Reconnaissance and Reflexion Modelling, International Symposium on Empirical Software Engineering, Andrew Le Gear, Jim Buckley, J.J. Collins, and Kieran O’Dea, November 2005, pages 33-42, Noosa, Australia.

• (Le Gear and Buckley, 2005b) Reengineering Towards Components Using “Reconn-exion,” ACM SIGSOFT Software Engineering Notes, Andrew Le Gear and Jim Buckley, September 2005, 30(5):32, ACM Press.

• (Le Gear et al., 2006) Exercising Control Over the Design of Evolving Software Systems Using an Inverse Application of Reflexion Modeling, CASCON, Andrew Le Gear, Jim Buckley and Colin McIlwaine, September 2006.


“I may not have gone where I intended to go, but I think I have ended up where I intended to be.” -Douglas Adams.
