Clone detection

Designing and Maintaining Software (DAMS)
 Louis Rose

Habitable Software

Leaner

Avoids Duplication

Less Complex

Clearer

Loosely Coupled

More Extensible

More Cohesive

???

Goal

Automatically detect similar fragments of code.

class StuffedCrust
  def title
    "Stuffed Crust " +
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class DeepPan
  def title
    "Deep Pan " +
    @ingredients.title +
    " Pizza"
  end

  def cost
    @ingredients.cost + 4
  end
end

Beginnings

Automatically detect identical fragments of text.

class StuffedCrust
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class DeepPan
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 4
  end
end

Comparing Fragments

Simple approach: identify lines of A appearing in B.

class StuffedCrust
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class DeepPan
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 4
  end
end
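The simple approach could be sketched roughly as follows. This is only an illustration: the method name `shared_lines` and the two constants are hypothetical, and each fragment is assumed to be a multi-line string.

```ruby
# Hypothetical helper: returns the lines of fragment `a` that also appear,
# character for character, somewhere in fragment `b`.
def shared_lines(a, b)
  lines_of_b = b.lines.map(&:chomp)
  a.lines.map(&:chomp).select { |line| lines_of_b.include?(line) }
end

STUFFED_CRUST = <<~RUBY
  class StuffedCrust
    def title
      @toppings.title +
      " Pizza"
    end
    def cost
      @toppings.cost + 6
    end
  end
RUBY

DEEP_PAN = <<~RUBY
  class DeepPan
    def title
      @toppings.title +
      " Pizza"
    end
    def cost
      @toppings.cost + 4
    end
  end
RUBY

puts shared_lines(STUFFED_CRUST, DEEP_PAN)
```

On the simplified pizza classes this reports every line except the class declarations and the two cost expressions. Note that indentation takes part in the comparison here, so the same code at a different nesting depth would be missed.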

Comparing Fragments

Increase fragment size until no new clones appear.

class StuffedCrust
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class DeepPan
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 4
  end
end
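One way to read "increase fragment size" is to compare runs of consecutive lines and grow the run length until nothing matches. The sketch below assumes that reading; `shared_windows` and `largest_clones` are hypothetical names, and the inputs are arrays of already-trimmed lines.

```ruby
# Hypothetical helper: every run of `size` consecutive lines found in both inputs.
def shared_windows(a_lines, b_lines, size)
  windows_of_b = b_lines.each_cons(size).to_a
  a_lines.each_cons(size).select { |window| windows_of_b.include?(window) }.uniq
end

# Grow the window one line at a time and stop at the first size that
# yields no shared run; the previous size holds the largest clones.
def largest_clones(a_lines, b_lines)
  best = []
  size = 1
  loop do
    found = shared_windows(a_lines, b_lines, size)
    break if found.empty?

    best = found
    size += 1
  end
  best
end

a = ["def title", "@toppings.title +", "\" Pizza\"", "end",
     "def cost", "@toppings.cost + 6", "end", "end"]
b = ["def title", "@toppings.title +", "\" Pizza\"", "end",
     "def cost", "@toppings.cost + 4", "end", "end"]

largest_clones(a, b).each { |clone| puts clone.join("\n"), "--" }
```

For these inputs the longest shared run is the five lines from `def title` to `def cost`; the differing cost expression stops it from growing further. The nested comparison is quadratic in the number of lines, which is acceptable for a sketch but not for a large code base.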

Results

The simple approach indicates the following clones:

class StuffedCrust
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class DeepPan
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 4
  end
end

Tweaking

From now on, strip out whitespace and comments.

class StuffedCrust
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class DeepPan
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 4
  end
end
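Stripping whitespace and comments might look something like the sketch below. The name `normalise` is hypothetical, and the regular expression assumes that `#` never occurs inside a string literal; a real tool would more likely work from a lexer.

```ruby
# Hypothetical normaliser: removes `#` comments, leading and trailing
# whitespace, and blank lines, returning the remaining lines as an array.
# Assumption: no `#` inside string literals (so "#{...}" would be mangled).
def normalise(source)
  source.lines
        .map { |line| line.sub(/#.*/, "").strip }
        .reject(&:empty?)
end

messy = <<~RUBY
  def cost   # price in pounds
      @toppings.cost + 6

  end
RUBY

p normalise(messy)
```

After this step, two fragments that differ only in indentation, blank lines or comments compare as identical line by line.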

More Tweaking

Ignore fragments containing only Ruby keywords.

class StuffedCrust
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class DeepPan
  def title
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 4
  end
end
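A keyword-only filter could be sketched as below. The keyword list is a deliberately short, hand-picked assumption (Ruby reserves more words than these), and `keywords_only?` is a hypothetical name.

```ruby
# Assumption: a small subset of Ruby's reserved words, enough for the example.
KEYWORDS = %w[begin class def do else elsif end ensure if module rescue unless while].freeze

# A fragment (an array of lines) is uninteresting when every word in it
# is a keyword, e.g. the trailing "end / end" shared by almost all classes.
def keywords_only?(fragment)
  fragment.join(" ").split.all? { |word| KEYWORDS.include?(word) }
end

puts keywords_only?(["end", "end"])
puts keywords_only?(["def cost", "end"])
```

Filtering candidates with a check like this keeps runs of closing `end` lines from being reported as clones.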

Recap: Goal

Automatically detect similar fragments of code.

class StuffedCrust
  def title
    "Stuffed Crust " +
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class DeepPan
  def title
    "Deep Pan " +
    @ingredients.title +
    " Pizza"
  end

  def cost
    @ingredients.cost + 4
  end
end

Parameterisation

Alter the source before clone detection, replacing names and literals with placeholders one kind at a time:

class names -> C
names of defined methods -> M
string literals -> S
instance variable names -> A
integer literals -> I

class C
  def M
    S +
    @A.title +
    S
  end

  def M
    @A.cost + I
  end
end

class C
  def M
    S +
    @A.title +
    S
  end

  def M
    @A.cost + I
  end
end
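The substitutions shown on these slides can be approximated with a few regular expressions. This is a rough sketch only: `parameterise` is a hypothetical name, the patterns assume simple double-quoted strings with no escaped quotes, and a real implementation would more plausibly rewrite lexer tokens.

```ruby
# Hypothetical regex-based parameteriser. Strings are replaced first so that
# words inside them are not mistaken for class, method or variable names.
def parameterise(source)
  source
    .gsub(/"[^"]*"/, "S")              # string literals          -> S
    .gsub(/\bclass\s+\w+/, "class C")  # class names              -> C
    .gsub(/\bdef\s+\w+/, "def M")      # names of defined methods -> M
    .gsub(/@\w+/, "@A")                # instance variable names  -> A
    .gsub(/\b\d+\b/, "I")              # integer literals         -> I
end

stuffed_crust = <<~RUBY
  class StuffedCrust
    def title
      "Stuffed Crust " +
      @toppings.title +
      " Pizza"
    end

    def cost
      @toppings.cost + 6
    end
  end
RUBY

deep_pan = <<~RUBY
  class DeepPan
    def title
      "Deep Pan " +
      @ingredients.title +
      " Pizza"
    end

    def cost
      @ingredients.cost + 4
    end
  end
RUBY

puts parameterise(stuffed_crust)
puts parameterise(stuffed_crust) == parameterise(deep_pan)
```

Once both classes are parameterised they are textually identical, so the identical-text detection described earlier can find them.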

Clone Detection

Now perform detection on parameterised source.

class C
  def M
    S +
    @A.title +
    S
  end

  def M
    @A.cost + I
  end
end

class C
  def M
    S +
    @A.title +
    S
  end

  def M
    @A.cost + I
  end
end

Baker’s Algorithm

“Duplication may be introduced into a large system as modifications are made to add new features or to fix bugs. Rather than rewrite working sections of code, programmers may copy and modify sections of code. It has long been known that copying sections of code may make the code larger, more complex, and more difficult to maintain.”
- Brenda S. Baker
  A Program for Identifying Duplicated Code
  Computing Science and Statistics, 24, 1992

Performance

[Figure omitted. Image source: https://en.wikipedia.org/wiki/Suffix_tree]

Limitations

Baker’s algorithm cannot detect syntactically distinct but semantically equivalent clones.

class StuffedCrust
  def title
    "Stuffed Crust " +
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class Stonebaked
  def cost
    6 + @toppings.cost
  end

  def title
    "Stonebaked #{@toppings.title} Pizza"
  end
end
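To make the limitation concrete, the sketch below runs the two `cost` methods from this slide and compares both their results and their text. `Toppings` is a hypothetical stand-in for whatever object the slides pass in, and the `initialize` methods are added only so the example can run.

```ruby
# Hypothetical stand-in for the toppings object used on the slides.
Toppings = Struct.new(:title, :cost)

class StuffedCrust
  def initialize(toppings)
    @toppings = toppings
  end

  def cost
    @toppings.cost + 6
  end
end

class Stonebaked
  def initialize(toppings)
    @toppings = toppings
  end

  def cost
    6 + @toppings.cost
  end
end

toppings = Toppings.new("Margherita", 3)

# Same behaviour: both methods compute the same price.
puts StuffedCrust.new(toppings).cost == Stonebaked.new(toppings).cost

# Different text: a textual comparison sees two unrelated lines.
puts "@toppings.cost + 6" == "6 + @toppings.cost"
```

The first comparison prints `true` and the second `false`. Renaming variables or replacing literals does not help here, because the operands appear in a different order.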

Clones are Indicators

No algorithm can definitively determine whether a clone is accidental or essential duplication.

class StuffedCrust
  def title
    "Stuffed Crust " +
    @toppings.title +
    " Pizza"
  end

  def cost
    @toppings.cost + 6
  end
end

class Stonebaked
  def cost
    6 + @toppings.cost
  end

  def title
    "Stonebaked #{@toppings.title} Pizza"
  end
end

Summary

Clones can indicate potentially duplicated code.

Baker’s is the most straightforward clone detection algorithm, and there are many more sophisticated ones.

No tool can distinguish accidental from essential duplication.
