<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>blog of josh &#187; code-clones</title>
	<atom:link href="http://landofjosh.com/tag/code-clones/feed/" rel="self" type="application/rss+xml" />
	<link>http://landofjosh.com</link>
	<description>software development under the big arch</description>
	<lastBuildDate>Thu, 22 Oct 2009 06:26:24 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Agent Ralph In Action</title>
		<link>http://landofjosh.com/2009/08/agent-ralph-in-action/</link>
		<comments>http://landofjosh.com/2009/08/agent-ralph-in-action/#comments</comments>
		<pubDate>Wed, 05 Aug 2009 05:15:51 +0000</pubDate>
		<dc:creator>josh</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[agent ralph]]></category>
		<category><![CDATA[code-clones]]></category>
		<category><![CDATA[resharper]]></category>

		<guid isPermaLink="false">http://landofjosh.com/?p=169</guid>
		<description><![CDATA[I&#8217;ve been yack yack yacking about clone detection and Agent Ralph.  It&#8217;s time to put up or shut up.  This post shows some screenshots of Agent Ralph in action. Agent Ralph&#8217;s front end is a Resharper plug-in.  Any clones detected are passed up to the plug-in which presents them to the user as highlights and [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been yack yack yacking about clone detection and Agent Ralph.  It&#8217;s time to put up or shut up.  This post shows some screenshots of Agent Ralph in action.</p>
<p><a title="Agent Ralph Project" href="http://code.google.com/p/agentralphplugin/">Agent Ralph</a>&#8217;s front end is a <a title="JetBrains' Resharper" href="http://www.jetbrains.com/resharper/">Resharper</a> plug-in.  Any clones detected are passed up to the plug-in, which presents them to the user as highlights and quick fixes.  This is how we achieve the automated repair that a modern clone tool needs.  The backend scans source files handed to it by the front end, using the techniques I&#8217;ve been <a href="http://landofjosh.com/2009/07/an-idea-for-robust-clone-detection-using-abstract-syntax-trees/">blogging about</a>.  Specifically, clones are identified by comparing abstract syntax trees of methods.  The ASTs may be modified by the application of safe refactorings (refactorings that do not change the inputs, outputs, or side effects).  If an AST can be safely coerced until it matches another, then we can consider the originals functionally equivalent clones.  This technique will detect clones that would otherwise be overlooked by text based clone finders.</p>
<p>So, here&#8217;s the basic case.  Two identical methods:</p>
<p><img class="size-full wp-image-170 alignnone" title="identicalmethodshighlight-cropped" src="http://landofjosh.com/wp-content/uploads/2009/08/identicalmethodshighlight-cropped.png" alt="identicalmethodshighlight-cropped" width="436" height="195" /></p>
<p>Note the Resharper squigglies telling us something is up.  Passing the mouse over either method name brings up a tooltip identifying the method as a clone of the other.</p>
<p>Placing the cursor on the method name prompts you with a <a title="Resharper Quick Fixes" href="http://www.jetbrains.com/resharper/features/code_analysis.html#Quick-Fixes">quick fix</a>&#8230;</p>
<p><img class="alignnone size-full wp-image-173" title="identicalmethodsquickfix-cropped" src="http://landofjosh.com/wp-content/uploads/2009/08/identicalmethodsquickfix-cropped.png" alt="identicalmethodsquickfix-cropped" width="471" height="229" /></p>
<p>&#8230;and invoking it&#8230;</p>
<p><img class="alignnone size-full wp-image-172" title="identicalmethodsquickfixapplied-cropped" src="http://landofjosh.com/wp-content/uploads/2009/08/identicalmethodsquickfixapplied-cropped.png" alt="identicalmethodsquickfixapplied-cropped" width="455" height="198" /></p>
<p>&#8230;replaces the body of the clone with a call to the original.  That&#8217;s automated clone repair!  An inline method applied to Test1 will complete the removal.</p>
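<p>In text form, the effect of the quick fix looks roughly like the sketch below.  The class name, method names, and method bodies (QuickFixSketch, Original, Clone) are placeholders of mine, not the code in the screenshots.</p>
<pre name="code" class="csharp">
public static class QuickFixSketch {
    // The method treated as the original.  The quick fix leaves it untouched.
    public static int Original(int x) {
        int doubled = x * 2;
        return doubled + 1;
    }

    // Before the quick fix this body was a copy of Original's body:
    //     int doubled = x * 2;
    //     return doubled + 1;
    // After the quick fix it simply forwards to the original.
    public static int Clone(int x) {
        return Original(x);
    }
}
</pre>
<p>From there, an inline method refactoring on the forwarding method replaces each call to it with a call to the original, and the clone is gone.</p>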
<p>The next methods are identical, but only once a rename local variable refactoring is applied.  And indeed you can see that it has been, as the highlighting and quick fix offering show.</p>
<p><img class="size-full wp-image-175 alignnone" title="clonewithrenamelocal-cropped" src="http://landofjosh.com/wp-content/uploads/2009/08/clonewithrenamelocal-cropped.png" alt="clonewithrenamelocal-cropped" width="458" height="265" /></p>
<p>The last example is one I am particularly proud of.   Here we are detecting a clone that is a block within a larger method.  Methods EmbeddedClone1 and EmbeddedClone2 both contain clones of Test2.</p>
<p><img class="alignnone size-full wp-image-179" title="embeddedclonequickfix-cropped" src="http://landofjosh.com/wp-content/uploads/2009/08/embeddedclonequickfix-cropped.png" alt="embeddedclonequickfix-cropped" width="470" height="415" /></p>
<p>Thus far I&#8217;ve restricted myself to using methods as the only unit of comparison.  Doing so made it easier to reason about and implement my ideas as I worked through them.  At some point I realized that I could use an extract method refactoring to create provisional methods from arbitrary code blocks on the fly.  If the provisional method is a clone then it follows that the original code block is a clone.  In this way I can continue to think and code in terms of methods, yet rely on the extract method refactoring to apply my algorithms to sub-units of a method (that is, arbitrary blocks and statements).</p>
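<p>Here&#8217;s a toy sketch of the provisional method idea.  It is not Agent Ralph&#8217;s code: statements are plain strings rather than AST nodes, a string comparison stands in for the AST comparison, and the names EmbeddedCloneSketch and FindEmbeddedClone are made up for illustration.  A real extract method refactoring also has to work out parameters and return values, which this ignores.</p>
<pre name="code" class="csharp">
using System;
using System.Linq;

public static class EmbeddedCloneSketch {
    // Returns the index of the first statement of a block inside hostBody
    // that matches candidateBody, or -1 when no block matches.
    public static int FindEmbeddedClone(string[] hostBody, string[] candidateBody) {
        int blockLength = candidateBody.Length;
        if (blockLength == 0) return -1;
        if (blockLength > hostBody.Length) return -1;

        // Slide a window of blockLength statements over the host method.
        int windowCount = hostBody.Length - blockLength + 1;
        for (int start = 0; start != windowCount; start++) {
            // "Extract" the window into a provisional method body.
            string[] provisional = hostBody.Skip(start).Take(blockLength).ToArray();

            // Stand-in for comparing the provisional method against a known method.
            if (provisional.SequenceEqual(candidateBody)) return start;
        }
        return -1;
    }
}
</pre>
<p>Given a host body of four statements whose second and third statements equal a two statement candidate body, FindEmbeddedClone reports the block starting at index 1.</p>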
]]></content:encoded>
			<wfw:commentRss>http://landofjosh.com/2009/08/agent-ralph-in-action/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>An Idea For Robust Clone Detection Using Abstract Syntax Trees</title>
		<link>http://landofjosh.com/2009/07/an-idea-for-robust-clone-detection-using-abstract-syntax-trees/</link>
		<comments>http://landofjosh.com/2009/07/an-idea-for-robust-clone-detection-using-abstract-syntax-trees/#comments</comments>
		<pubDate>Sun, 19 Jul 2009 20:09:50 +0000</pubDate>
		<dc:creator>josh</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[code-clones]]></category>

		<guid isPermaLink="false">http://landofjosh.com/?p=145</guid>
		<description><![CDATA[My last post concluded with the promise to go into detail on some implementation ideas of my clone analyzer. As I argued previously, a text based matching tool is not good enough, it&#8217;s simply too easy to fool. What we want is a matching tool that considers the full syntax of the language being analyzed. [...]]]></description>
			<content:encoded><![CDATA[<p>My last <a href="http://landofjosh.com/?p=77">post</a> concluded with the promise to go into detail on some implementation ideas of my clone analyzer.</p>
<p>As I argued previously, a text based matching tool is not good enough; it&#8217;s simply too easy to fool. What we want is a matching tool that considers the full syntax of the language being analyzed.   That leads us to a solution based on the analysis of abstract syntax trees (ASTs).</p>
<p>ASTs can be generated easily from a partial compilation of the files under analysis.  Basic comparison is easy too.  It&#8217;s a straightforward tree walking algorithm.  There are many ways to do it, and I&#8217;ll go into my implementation later.  Where things get interesting is when we consider ASTs that are functionally equivalent but not identical.  That is, ASTs that differ in unimportant ways.  The initial impulse is to begin &#8216;relaxing&#8217; the tree walking comparison, i.e., to ignore things like local variable names, parameter order, and other details that are irrelevant to functional equivalence.  Instead I propose that we attempt to refactor one tree and see if we can transform it into an AST that does match. If so, we can conclude the original ASTs match.  We don&#8217;t actually need to perform these refactorings &#8216;for real&#8217;.  Knowing the safe transform exists is enough for a clone repair tool to replace one method with the other.</p>
<p>Now, my theory here can be divided into two distinct parts.  First, we need to be able to tell when any two methods are completely identical.  Second, if we can take two non-matching methods and automatically apply a series of safe refactorings that converts one into an identical match of the other, then we can conclude that the original methods are clones.</p>
<p><strong>Part I &#8211; Matching Identical ASTs</strong></p>
<p>The initial match algorithm is a careful tree traversal with a node for node comparison which exits at the first mismatch.  For this part of the project I relied on the open source and very nice NRefactory project.  It includes a C# parser, among other useful stuff.  Thanks to the availability of source and decent examples I was able to get up and running very quickly.</p>
<p>The first step is to get an AST by passing the class file to a Parse() function.  One caveat of my implementation is that it will not work on code that does not compile.  When Parse() encounters a syntax error it returns null.  In practice, I don&#8217;t anticipate this limitation having much effect on usefulness.</p>
<p>The AST generated from this method&#8230;</p>
<pre name="code" class="csharp">
int Foo() {
    return 7 + 8 * (4 - 6);
}
</pre>
<p>&#8230;looks like this[1]:<br />
<img src="http://landofjosh.com/wp-content/uploads/2009/07/syntax_tree.png" alt="syntax_tree" title="syntax_tree" width="554" height="208" class="aligncenter size-full wp-image-160" /><br />
There are a couple of things to note about these ASTs.  Each node has a distinct type like MethodDeclaration, ReturnStatement, Operator, Literal, etc.  Some nodes also have other properties.  For example, Operators have an Op property that is (in this example) one of +, -, or *.  Literals hold the literal value in the property named Val (&#8216;7&#8217;, &#8216;8&#8217;, &#8216;4&#8217;, and &#8216;6&#8217; here), and the type of that value, called Type.</p>
<p>We now need to compare the ASTs.  Let&#8217;s call the function to do this Compare(left_tree, right_tree):bool.  Starting at the root node (in this case, a MethodDeclaration node) of the left hand tree we begin walking that tree.  At each left hand node we compare to the corresponding right hand node.  The individual node comparison first checks that the node types match (both are Operators, both are Literals, etc.).  Then it compares the values of each of the node properties.  At this point we have confirmed the node matches and we can proceed to its children.</p>
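<p>Before getting to the Visitor version, here&#8217;s the same walk as a minimal recursive sketch.  The Node class is a simplified stand-in I made up for illustration (one Type string, one Value string that flattens the node&#8217;s properties, and an array of children); it is not NRefactory&#8217;s node model.</p>
<pre name="code" class="csharp">
public class Node {
    public string Type;      // node type, e.g. "Operator" or "Literal"
    public string Value;     // flattened node properties, e.g. "+" or "int:7"
    public Node[] Children;  // child nodes in source order

    public Node(string type, string value, params Node[] children) {
        Type = type;
        Value = value;
        Children = children;
    }
}

public static class AstCompare {
    // True only when both trees match node for node; exits at the first mismatch.
    public static bool Compare(Node left, Node right) {
        if (left.Type != right.Type) return false;      // node types match?
        if (left.Value != right.Value) return false;    // node properties match?
        if (left.Children.Length != right.Children.Length) return false;
        for (int i = 0; i != left.Children.Length; i++) {
            if (!Compare(left.Children[i], right.Children[i])) return false;
        }
        return true;
    }

    // Builds the tree for: int Foo() { return 7 + 8 * (4 - 6); }
    public static Node BuildFoo() {
        Node sub = new Node("Operator", "-", new Node("Literal", "int:4"), new Node("Literal", "int:6"));
        Node mul = new Node("Operator", "*", new Node("Literal", "int:8"), sub);
        Node add = new Node("Operator", "+", new Node("Literal", "int:7"), mul);
        return new Node("MethodDeclaration", "int Foo", new Node("ReturnStatement", "", add));
    }
}
</pre>
<p>Two trees from BuildFoo() compare as equal; change any single literal in one of them and Compare returns false at that node.</p>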
<p>The actual implementation is based on a slightly modified <a href="http://en.wikipedia.org/wiki/Visitor_pattern">Visitor</a> pattern.  Each of the Visitor class&#8217;s Visit methods takes a second parameter of type INode (the base type of all AST nodes), in addition to the normal strictly typed first parameter.  The second parameter is there because we need to drag the right hand node along on each Visit call, and then pass corresponding right hand child node(s) to each Accept call.  Here&#8217;s the partial IVisitor definition:</p>
<pre  name="code" class="csharp">
public interface IVisitor {
    void Visit(MethodDeclaration left, INode right);
    void Visit(ReturnStatement left, INode right);
    void Visit(Operator left, INode right);
    void Visit(Literal left, INode right);
}
</pre>
<p>Here&#8217;s the modified INode.Accept method interface.  Note the inclusion of the second parameter, right, which will hold the right hand tree node that corresponds to the left hand AST node.  The left node is of course &#8216;this&#8217;.</p>
<pre name="code" class="csharp">
interface INode {
    void Accept(IVisitor v, INode right);
    ...
}
</pre>
<p>And all Accept implementations look pretty much identical.  Here&#8217;s Operator&#8217;s.  Notice it is dutifully passing that right parameter on to the Visit call?</p>
<pre name="code" class="csharp">
public class Operator : INode {
    public void Accept(IVisitor v, INode right) {
        v.Visit(this, right);
    }

    public INode LeftExpr;
    public INode RightExpr;
}
</pre>
<p>So far it&#8217;s been pretty boilerplate Visitor pattern stuff, with the inclusion of the extra INode parameter named right.  Now let&#8217;s look at the concrete Visitor implementation, which is where the good stuff happens.</p>
<p>If you recall, I said earlier that the actual comparison of two single INodes involves these steps:<br />
1.    Confirm the nodes&#8217; types match.<br />
2.    Compare each of the node specific property values.  Since they are of the same type, they have the same property sets.<br />
3.    Recursively call Accept() on the left hand node&#8217;s children, passing the right hand node&#8217;s children as the second parameter.</p>
<p>In standard Visitor pattern fashion there is an IVisitor.Visit method for every non-abstract INode subclass.  Operator&#8217;s Visit looks like this[2]:</p>
<pre name="code" class="csharp">
public class ComparisonVisitor : IVisitor {
    public void Visit(Operator left, INode right) {
        // 1.    Confirm nodes' types match.
        Operator right_operator = right as Operator;
        if(right_operator == null) {
            SetFailure();
            return;
        }

        // 2.    Compare each of the node specific property values.
        if(left.Op != right_operator.Op) {
            SetFailure();
            return;
        }

        // 3.    Recursively call Accept on the left children, passing the right children.
        left.LeftExpr.Accept(this, right_operator.LeftExpr);
        left.RightExpr.Accept(this, right_operator.RightExpr);
    }
    ... // Repeat for the remaining IVisitor implementations
}
</pre>
<p>This does require adding a new Accept(visitor, right) method overload on each node.  That is, I had to go back and modify the NRefactory AST implementation to make this work.  It was one of those moments where I realize, again, how much open source rocks.</p>
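<p>For anyone who wants to run the idea, here&#8217;s a trimmed down assembly of the pieces above, limited to Operator and Literal nodes.  The constructors, the Matched field, the string types chosen for Op, Type, and Val, and the static Compare helper are my own additions for the sketch; the real NRefactory node classes look different.</p>
<pre name="code" class="csharp">
public interface IVisitor {
    void Visit(Operator left, INode right);
    void Visit(Literal left, INode right);
}

public interface INode {
    void Accept(IVisitor v, INode right);
}

public class Literal : INode {
    public string Type;
    public string Val;
    public Literal(string type, string val) { Type = type; Val = val; }
    public void Accept(IVisitor v, INode right) { v.Visit(this, right); }
}

public class Operator : INode {
    public string Op;
    public INode LeftExpr;
    public INode RightExpr;
    public Operator(string op, INode leftExpr, INode rightExpr) {
        Op = op;
        LeftExpr = leftExpr;
        RightExpr = rightExpr;
    }
    public void Accept(IVisitor v, INode right) { v.Visit(this, right); }
}

public class ComparisonVisitor : IVisitor {
    public bool Matched = true;

    void SetFailure() { Matched = false; }

    public void Visit(Literal left, INode right) {
        Literal r = right as Literal;
        if (r == null || left.Type != r.Type || left.Val != r.Val) SetFailure();
    }

    public void Visit(Operator left, INode right) {
        // Steps 1 and 2: node type, then node specific properties.
        Operator r = right as Operator;
        if (r == null || left.Op != r.Op) { SetFailure(); return; }
        // Step 3: recurse, dragging the right hand children along.
        left.LeftExpr.Accept(this, r.LeftExpr);
        left.RightExpr.Accept(this, r.RightExpr);
    }

    // Convenience wrapper: walk the left tree against the right tree.
    public static bool Compare(INode left, INode right) {
        ComparisonVisitor v = new ComparisonVisitor();
        left.Accept(v, right);
        return v.Matched;
    }
}
</pre>
<p>Comparing two trees for 7 + 8 reports a match, while comparing 7 + 8 against 7 - 8, or against a bare literal, does not.</p>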
<p><strong>Further</strong></p>
<p>This is surprisingly trivial to implement.  In fact it was so easy that I got bored doing it and ended up autogenerating all of the tree walking and much of the comparison code.  NRefactory has its own generator that creates all of the INode subclasses (Operator, MethodDeclaration, etc.). It lays out the node specific properties, default ctors, and Accept implementations.  It even generates some premade concrete Visitor implementations of its own.  I hijacked this and hooked in the additional generation of my modified Visitor to the concrete INode subclasses.  It also generates the vast majority of my ComparisonVisitor as a partial class.  The part that remains is the node specific property matching (step 2), which lives in node type specific Match(left,right) functions.  An example is bool Match(Operator left, Operator right), and it is called right there in the conditional of the step 2 if.  I wrote just a handful of those so that I could have a decent subset of C# to carry on with.</p>
<p>As I wrote this blog post it occurred to me that I might be able to auto generate the Match functions too.  Clearly enough info is delivered to the generation routines so that they can lay out the properties on the INode subclasses.  I can use the same info to autogenerate the Match method of step 2.</p>
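<p>One way to picture what a generated Match would do, without touching the generator, is runtime reflection.  To be clear, this swaps in a different technique: it is not how NRefactory&#8217;s generator works, and it would be slower than generated code.  The ReflectiveMatch class and the cut down INode and Operator types are assumptions made for the sketch.</p>
<pre name="code" class="csharp">
using System;
using System.Reflection;

public interface INode { }

public class Operator : INode {
    public string Op;
    public INode LeftExpr;
    public INode RightExpr;
}

public static class ReflectiveMatch {
    // Step 2 for any node type: compare every public field
    // that is not itself a child node.
    public static bool Match(INode left, INode right) {
        if (left.GetType() != right.GetType()) return false;
        foreach (FieldInfo field in left.GetType().GetFields()) {
            // Child nodes are handled by the tree walk (step 3), not here.
            if (typeof(INode).IsAssignableFrom(field.FieldType)) continue;
            if (!object.Equals(field.GetValue(left), field.GetValue(right))) return false;
        }
        return true;
    }
}
</pre>
<p>The generator has the same information available at generation time (the list of properties per node type), so it could emit the equivalent comparisons as straight line code instead.</p>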
<p><strong>Part II &#8211; Applying refactorings</strong></p>
<p>There you have it, the basic, core clone detection algorithm.  It&#8217;s so basic in fact, that it&#8217;s going to do no better than a text based match which ignores whitespace.  It will not detect a clone like this:</p>
<pre  name="code" class="csharp">public double Area(double radius) {
   double PI = 3.14;
   return PI*Math.Pow(radius, 2);
}

public double CircleArea(double radius) {
   double pi = 3.14;
   return pi*Math.Pow(radius, 2);
}</pre>
<p>The difference in the clones is the name of a local variable.  At the beginning I stated that we were starting with the case of identical method clones.  Because we are working from that precondition it made our Compare() function extremely easy to implement.  But actually it&#8217;s no more productive than a basic text based clone finder.  A more useful AST comparison implementation might ignore the names of automatic variables.  This is not how I go about it.</p>
<p>The way I do it is to transform the AST in a way that does not change its functionality, yet creates a new AST that can be recompared.  We apply so called &#8216;safe&#8217; AST transformations &#8211; transformations that produce new ASTs yet the methods take the same inputs, produce the same outputs, and have not had their side effects modified.  The ASTs under consideration can be considered clones if there is a safe transformation &#8211; or even a series of safe transformations &#8211; that would convert one tree into the other.  These &#8220;safe AST transformations&#8221; are simply common code refactorings.  Going back to the example above, we could apply a rename local variable refactoring directly to the AST.  A recomparison would show them as equivalent, and that is enough to deem the original methods clones.  Each refactoring can be coded and applied in isolation.  That will help keep the implementation complexity low.</p>
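<p>As a rough illustration of rename local variable applied straight to a tree, here&#8217;s a sketch on a toy node model.  The Node class, the "Local" node type, and the RenameLocal helpers are invented for the example; the sketch also assumes the new name is not already in use, which a real refactoring would have to check.</p>
<pre name="code" class="csharp">
public class Node {
    public string Type;
    public string Value;
    public Node[] Children;

    public Node(string type, string value, params Node[] children) {
        Type = type;
        Value = value;
        Children = children;
    }
}

public static class RenameLocal {
    // Returns a copy of the tree in which every local named oldName is named newName.
    public static Node Apply(Node tree, string oldName, string newName) {
        Node[] children = new Node[tree.Children.Length];
        for (int i = 0; i != children.Length; i++) {
            children[i] = Apply(tree.Children[i], oldName, newName);
        }
        string value = tree.Value;
        if (tree.Type == "Local") {
            if (value == oldName) value = newName;
        }
        return new Node(tree.Type, value, children);
    }

    // Strict node for node comparison, as in Part I.
    public static bool Equal(Node left, Node right) {
        if (left.Type != right.Type) return false;
        if (left.Value != right.Value) return false;
        if (left.Children.Length != right.Children.Length) return false;
        for (int i = 0; i != left.Children.Length; i++) {
            if (!Equal(left.Children[i], right.Children[i])) return false;
        }
        return true;
    }

    // Body of the area methods above, with the local's name as a parameter.
    public static Node Body(string pi) {
        return new Node("Block", "",
            new Node("LocalDeclaration", "double", new Node("Local", pi), new Node("Literal", "3.14")),
            new Node("ReturnStatement", "",
                new Node("Operator", "*",
                    new Node("Local", pi),
                    new Node("Call", "Math.Pow", new Node("Parameter", "radius"), new Node("Literal", "2")))));
    }
}
</pre>
<p>Body("PI") and Body("pi") fail the strict comparison, but after Apply(Body("PI"), "PI", "pi") the two trees compare as equal.</p>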
<p><strong>Summary</strong></p>
<p>And that&#8217;s all there is to it.  Apply refactorings to one abstract syntax tree until it matches the other abstract syntax tree, or not.  My algorithm at this point is really just brute force.  I try all combinations of available refactorings (exactly one at the time of this writing) by applying them to one of the trees until I get a match or exhaust the possibilities.  One of my next steps is looking for heuristics that will allow us to reduce the number of refactorings that get performed during the search.  For example, if the compare fails due to a name mismatch on a local variable then it can store that fact for use in later selecting a candidate refactoring like &#8216;rename local variable&#8217;. </p>
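<p>To make the heuristic idea concrete, here&#8217;s a sketch of a search that lets the first mismatch pick the refactoring.  It is a guess at how such a search might look, not my actual implementation: the toy Node model, the GuidedSearch class, and the MaxSteps budget are all made up for the example.  It is deliberately conservative, so it will miss some clones (two locals whose names are swapped, for instance).</p>
<pre name="code" class="csharp">
public class Node {
    public string Type;
    public string Value;
    public Node[] Children;

    public Node(string type, string value, params Node[] children) {
        Type = type;
        Value = value;
        Children = children;
    }
}

public static class GuidedSearch {
    const int MaxSteps = 100;  // arbitrary budget so the search always terminates

    // Returns the first mismatching pair { left, right }, or null when the trees match.
    static Node[] FirstMismatch(Node left, Node right) {
        bool same = left.Type == right.Type;
        if (left.Value != right.Value) same = false;
        if (left.Children.Length != right.Children.Length) same = false;
        if (!same) return new Node[] { left, right };
        for (int i = 0; i != left.Children.Length; i++) {
            Node[] found = FirstMismatch(left.Children[i], right.Children[i]);
            if (found != null) return found;
        }
        return null;
    }

    // True when any node in the tree already carries this name.
    static bool UsesName(Node tree, string name) {
        if (tree.Value == name) return true;
        foreach (Node child in tree.Children) {
            if (UsesName(child, name)) return true;
        }
        return false;
    }

    // Copy of the tree with local oldName renamed to newName.
    static Node Rename(Node tree, string oldName, string newName) {
        Node[] children = new Node[tree.Children.Length];
        for (int i = 0; i != children.Length; i++) {
            children[i] = Rename(tree.Children[i], oldName, newName);
        }
        string value = tree.Value;
        if (tree.Type == "Local") {
            if (value == oldName) value = newName;
        }
        return new Node(tree.Type, value, children);
    }

    // True when left can be turned into right using only safe rename local steps.
    public static bool AreClones(Node left, Node right) {
        Node current = left;
        for (int step = 0; step != MaxSteps; step++) {
            Node[] pair = FirstMismatch(current, right);
            if (pair == null) return true;
            // Only a mismatch between two locals suggests a rename.
            if (pair[0].Type != "Local") return false;
            if (pair[1].Type != "Local") return false;
            // Renaming onto a name already in use could capture another variable.
            if (UsesName(current, pair[1].Value)) return false;
            current = Rename(current, pair[0].Value, pair[1].Value);
        }
        return false;  // budget exhausted, treat as not clones
    }
}
</pre>
<p>The point of the design is that the comparison failure itself carries the hint: a name mismatch on a local selects &#8216;rename local variable&#8217;, and any other kind of mismatch ends the search instead of trying refactorings blindly.</p>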
<p>In the next post I&#8217;ll show how I used Extract Method to deal with the methods-only limitation of the tool so far.  And, I&#8217;ll be writing more refactoring operations and I&#8217;m sure I&#8217;ll learn some things worth writing about then as well.</p>
<p><em>Special thanks to my friend <a href="http://seanfoy.blogspot.com/">Sean</a> for all the proofreading, feedback, skepticism, and challenging questions.</em></p>
<p>[1]   I produced the graph with <a href="http://ironcreek.net/phpsyntaxtree/">this nifty tool</a>, using the phrase [MethodDeclaration(Name=Foo,ReturnType=int) [ReturnStatement [Operator(Op=+) Literal(Type=int,Val=7) [Operator(Op=*) [Literal(Type=int,Val=8)][Operator(Op=-) [Literal(Type=int,Val=4)][Literal(Type=int,Val=6)]]]]]].</p>
<p>[2]  Why don&#8217;t I return bools from the IVisitor methods instead of calling SetFailure() and returning anyway?  &#8211; Because I reserve the right to continue analyzing in the event of a mismatch.  This might be useful when choosing heuristics for later refactoring application.</p>
]]></content:encoded>
			<wfw:commentRss>http://landofjosh.com/2009/07/an-idea-for-robust-clone-detection-using-abstract-syntax-trees/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Code Clones</title>
		<link>http://landofjosh.com/2009/05/code-clones/</link>
		<comments>http://landofjosh.com/2009/05/code-clones/#comments</comments>
		<pubDate>Sat, 30 May 2009 22:29:02 +0000</pubDate>
		<dc:creator>josh</dc:creator>
				<category><![CDATA[unnecessary-complexity]]></category>
		<category><![CDATA[code-clones]]></category>

		<guid isPermaLink="false">http://landofjosh.com/?p=77</guid>
		<description><![CDATA[Code clones are code constructs or functionality that is repeated throughout a system.  It&#8217;s a well documented problem.  In short, the issue is duplication of logic.  Take this example: double Area(double radius) { return 3.14*Math.Pow(radius, 2); } double ComputeArea(double radius) { return 3.14*Math.Pow(radius, 2); } Two functions identical in every way except name.  The cost of [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Code clones are code constructs or functionality that are repeated throughout a system.  It&#8217;s a well <a title="DRY - Don't Repeat Yourself" href="http://en.wikipedia.org/wiki/Don't_repeat_yourself">documented</a> problem.  In short, the issue is duplication of logic.  Take this example:</p>
<pre name="code" class="csharp">double Area(double radius)
{
    return 3.14*Math.Pow(radius, 2);
}

double ComputeArea(double radius)
{
    return 3.14*Math.Pow(radius, 2);
}</pre>
<p style="margin-bottom: 0in;">Two functions identical in every way except name.  The cost of future modifications to whatever system they reside in is increasing.  Suffice it to say that when the maintenance programmer decides to increase precision of the area calculation he must append digits to the 3.14 constant in at least two places.  When this kind of duplication is allowed it leads to unmaintainable systems.  It is yet another form of unnecessary complexity, and perhaps even the worst.  Take any single instance of unnecessary complexity and it can likely be brushed aside as poor form that is easily corrected.  While that is true, it is the <em>repetition</em> of poor form that ultimately renders our systems unmaintainable.</p>
<p><br/></p>
<p style="margin-bottom: 0in;">We usually think of clones as being textually equivalent sections of code (think cut and paste coding), but they can also be <span style="font-style: normal;">functionally equivalent</span>.  I define <em>textually equivalent</em> to mean two code sections which are line for line identical (comments and whitespace optionally ignored).  Two code blocks are <em>functionally equivalent</em> when their logic is the same even though the text may differ.  Building on our earlier example, here pi is called out explicitly:</p>
<pre name="code" class="csharp">double Area(double radius)
{
     const double PI = 3.14;
     return PI*Math.Pow(radius, 2);
}</pre>
<p style="margin-bottom: 0in;">Here pi is a literal, defined inline.</p>
<pre  name="code" class="csharp">double ComputeArea(double radius)
{
    return 3.14*Math.Pow(radius, 2);
}</pre>
<p style="margin-bottom: 0in;">Clearly these are <em>functionally equivalent.</em><span style="font-style: normal;"> For all inputs they produce the same output. They are</span> not <em>textually equivalent.</em><span style="font-style: normal;"> The magic number has been replaced with a constant.</span> Another example: two functions that differ only by local variable naming, where all types and operations are identical, are functionally equivalent.</p>
<p><br/></p>
<p style="margin-bottom: 0in;"><span style="font-style: normal;">My general term for correcting code clones is collapsing. </span><em>Collapsing</em> is the process of safely merging code clones into, or replacing code clones with, a common implementation.  The &#8216;safely&#8217; part implies that a collapsing operation does not modify the expected inputs, outputs, and side effects, as perceived by the clients of the former clone.  It is a form of refactoring specifically targeted at the elimination of code duplication.  For the last example, the collapsing operation consists of replacing each call to ComputeArea with a call to Area.  Once ComputeArea is no longer used anywhere it can be deleted.</p>
<p><br/></p>
<p style="margin-bottom: 0in;">We do have tools available to us for identifying clones.  I have used three good ones: Simian, Clone Detective, and TeamCity.  These tools are lacking in two distinct ways.  First, they seem to treat clone detection as a text matching problem.  Second, they do not provide clone relationship analysis and automated correction assistance.</p>
<p><br/></p>
<p style="margin-bottom: 0in;">Text matching clone finders are easily defeated by differences in variable names, whitespace, comments, brace/delimiter placement, and literals.  Differences of these types are inconsequential in most languages and will cause primitive clone detectors to return fewer results, or fragmented result sets (many smaller clones instead of fewer large clones).  Member order within cloned classes also causes result fragmentation.  I will say that all the tools I&#8217;ve mentioned do have options that take these inhibitors into account.  However, those issues are relatively easy problems to solve.  Not so easy are things like parameter order, which screws up matches at both the cloned parameter definitions and the cloned call sites.</p>
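<p style="margin-bottom: 0in;">To see how little it takes to defeat a primitive matcher, here&#8217;s a deliberately crude sketch.  It is not how Simian, Clone Detective, or TeamCity work; TextMatchSketch and its whitespace stripping normalization are assumptions chosen only to illustrate the failure mode.</p>
<pre name="code" class="csharp">using System.Text.RegularExpressions;

public static class TextMatchSketch {
    // Strips all whitespace before comparing.  Crude: it can also glue
    // separate tokens together, which a real tool would avoid.
    public static string Normalize(string code) {
        return Regex.Replace(code, @"\s+", "");
    }

    public static bool TextuallyEquivalent(string a, string b) {
        return Normalize(a) == Normalize(b);
    }
}</pre>
<p style="margin-bottom: 0in;">This copes with reformatting, so return 3.14*Math.Pow(radius, 2); still matches a copy with extra spaces.  Rename a single local variable, though, and the two snippets no longer match even though their logic is unchanged.</p>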
<p><br/></p>
<p style="margin-bottom: 0in;">Ultimately, text based clone detectors do not provide the rich result sets that we need to create powerful automated clone collapsing tools.  The alternative to text based matchers that I propose is clone detectors that operate directly on parsed syntax tree representations of source code.  Analyzers operating against syntax structures will facilitate the elimination of the match noise in clean, language specific ways.</p>
<p><br/></p>
<p style="margin-bottom: 0in;">The poor state of clone identification being what it is, simple detection is not enough.  A proper clone analysis tool should analyze the clone and its context and then present the user with automated options for collapsing and removal.  Unfortunately, it is not always as simple as doing a couple of extract method operations with some find and replace.   Consider the clones in these two classes (the lines above and below the calculation of the <code>area</code> variable, on lines 7 and 19):</p>
<pre name="code" class="csharp">class Cylinder
{
     public double Volume()
     {
          double height = GetHeight();
          double scaling = GetScalingFactor();
          double area = 3.14 * Math.Pow(this.radius, 2);
          double volume =  height * area;
          return volume * scaling;
     }
     ...
}
class Box
{
     public double ComputeVolume()
     {
          double height = GetHeight();
          double scaling = GetScalingFactor();
          double area = this.length * this.width;
          double volume =  height * area;
          return volume * scaling;
     }
     ...
}</pre>
<p style="margin-bottom: 0in;">The ideal collapse result is to isolate the differing logic into two distinct functional units.  That is, perform an extract method on the unique logic <em>between</em> the clones.  Now it becomes clear that the two classes are collapsible into a single class where the unique logic can be inserted via polymorphism or dependency injection.  Like so:</p>
<pre name="code" class="csharp">abstract class Shape
{
     public abstract double ComputeArea();

     public double Volume()
     {
          double height = GetHeight();
          double scaling = GetScalingFactor();
          double area = ComputeArea();
          double volume =  height * area;
          return volume * scaling;
     }
     ...
}
class Cylinder : Shape
{
     public override double ComputeArea()
     {
          return 3.14 * Math.Pow(this.radius, 2);
     }
     ...
}
class Box : Shape
{
     public override double ComputeArea()
     {
          return this.length * this.width;
     }
     ...
}</pre>
<p style="margin-bottom: 0in;">The contextual analysis of the clones&#8217; relationship to surrounding code and other clones led us to a better result.  Without it, the most likely path would have been to jump in with extract methods against the clones themselves.  That would have left us with a number of identical methods instead of identical blocks, but with no clear direction on the best way to proceed with collapsing.</p>
<p><br/></p>
<p style="margin-bottom: 0in;">It&#8217;s worth noting that sometimes collapsing will result in more lines of code.  This does not diminish the value of collapsing.   The extra lines are boilerplate like class, interface, and function definitions.   The value is derived from the reduction and isolation of the formerly duplicated logic.  Additionally, as we increase the number of functional units (methods and classes) we make the code more malleable, because these units provide the boundaries that our existing code manipulation tools tend to operate on.</p>
<p><br/></p>
<p style="margin-bottom: 0in;">In my next post I will present some implementation ideas for the rich clone analyzer I have outlined here.  In further posts I will discuss how else we might put such a tool to use.</p>
]]></content:encoded>
			<wfw:commentRss>http://landofjosh.com/2009/05/code-clones/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
