<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>blog of josh &#187; unnecessary-complexity</title>
	<atom:link href="http://landofjosh.com/tag/unnecessary-complexity/feed/" rel="self" type="application/rss+xml" />
	<link>http://landofjosh.com</link>
	<description>software development under the big arch</description>
	<lastBuildDate>Thu, 22 Oct 2009 06:26:24 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Code Clones</title>
		<link>http://landofjosh.com/2009/05/code-clones/</link>
		<comments>http://landofjosh.com/2009/05/code-clones/#comments</comments>
		<pubDate>Sat, 30 May 2009 22:29:02 +0000</pubDate>
		<dc:creator>josh</dc:creator>
				<category><![CDATA[unnecessary-complexity]]></category>
		<category><![CDATA[code-clones]]></category>

		<guid isPermaLink="false">http://landofjosh.com/?p=77</guid>
		<description><![CDATA[Code clones are code constructs or functionality that are repeated throughout a system.  It&#8217;s a well-documented problem.  In short, the issue is duplication of logic.  Take this example: double Area(double radius) { return 3.14*Math.Pow(radius, 2); } double ComputeArea(double radius) { return 3.14*Math.Pow(radius, 2); } Two functions identical in every way except name.  The cost of [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Code clones are code constructs or functionality that are repeated throughout a system.  It&#8217;s a well <a title="DRY - Don't Repeat Yourself" href="http://en.wikipedia.org/wiki/Don't_repeat_yourself">documented</a> problem.  In short, the issue is duplication of logic.  Take this example:</p>
<pre name="code" class="csharp">double Area(double radius)
{
    return 3.14*Math.Pow(radius, 2);
}

double ComputeArea(double radius)
{
    return 3.14*Math.Pow(radius, 2);
}</pre>
<p style="margin-bottom: 0in;">Two functions identical in every way except name.  Every duplicate like this raises the cost of future modifications to whatever system it resides in.  When the maintenance programmer decides to increase the precision of the area calculation, he must append digits to the 3.14 constant in at least two places, and missing one leaves the two functions silently disagreeing.  When this kind of duplication is allowed it leads to unmaintainable systems.  It is yet another form of unnecessary complexity, and perhaps even the worst.  Take any single instance of unnecessary complexity and it can likely be brushed aside as poor form that is easily corrected.  While that is true, it is the <em>repetition</em> of poor form that ultimately renders our systems unmaintainable.</p>
<p><br/></p>
<p style="margin-bottom: 0in;">We usually think of clones as being textually equivalent sections of code (think cut and paste coding), but they can also be <span style="font-style: normal;">functionally equivalent</span>.  I define <em>textually equivalent</em> to mean two code sections which are line for line identical (comments and whitespace optionally ignored).  Two code blocks are <em>functionally equivalent</em> when their logic is the same even though the text may differ.  Building on our earlier example, here pi is called out explicitly:</p>
<pre name="code" class="csharp">double Area(double radius)
{
     const double PI = 3.14;
     return PI*Math.Pow(radius, 2);
}</pre>
<p style="margin-bottom: 0in;">Here pi is a literal, defined inline.</p>
<pre  name="code" class="csharp">double ComputeArea(double radius)
{
    return 3.14*Math.Pow(radius, 2);
}</pre>
<p style="margin-bottom: 0in;">Clearly these are <em>functionally equivalent.</em><span style="font-style: normal;"> For all inputs they produce the same output. They are</span> not <em>textually equivalent.</em><span style="font-style: normal;"> The magic number has been replaced with a constant.</span> As another example, two functions that differ only in local variable naming, with all types and operations identical, are also functionally equivalent.</p>
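<p style="margin-bottom: 0in;">A minimal sketch of the naming case follows.  The class and method names are hypothetical and chosen only for illustration; the two methods use the same types and operations and differ only in the names of the method and its local variable.</p>
<pre name="code" class="csharp">using System;

// Hypothetical container class, not part of the earlier examples.
static class NamingClones
{
    public static double Hypotenuse(double a, double b)
    {
        double sumOfSquares = a * a + b * b;
        return Math.Sqrt(sumOfSquares);
    }

    // Same types and operations as Hypotenuse; only the names differ.
    public static double Distance(double a, double b)
    {
        double total = a * a + b * b;
        return Math.Sqrt(total);
    }
}</pre>
<p style="margin-bottom: 0in;">A line for line comparison reports these as different, even though renaming <code>total</code> to <code>sumOfSquares</code> would make the bodies identical.</p>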
<p><br/></p>
<p style="margin-bottom: 0in;"><span style="font-style: normal;">My general term for correcting code clones is collapsing. </span><em>Collapsing</em> is the process of safely merging code clones into, or replacing code clones with, a common implementation.  The &#8216;safely&#8217; part implies that a collapsing operation does not modify the expected inputs, outputs, and side effects, as perceived by the clients of the former clone.  It is a form of refactoring specifically targeted at the elimination of code duplication.  For the last example, the collapsing operation consists of replacing each call to ComputeArea with a call to Area.  Once ComputeArea is no longer used anywhere it can be deleted.</p>
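<p style="margin-bottom: 0in;">One way to stage that collapse, sketched here under a few assumptions, is to have the former clone forward to the surviving implementation while the call sites are being migrated.  The <code>Geometry</code> class, the static modifiers, and the use of <code>[Obsolete]</code> are my additions for illustration and are not required by the collapsing operation itself.</p>
<pre name="code" class="csharp">using System;

// Hypothetical container class for the two functions from the example.
static class Geometry
{
    public static double Area(double radius)
    {
        return 3.14 * Math.Pow(radius, 2);
    }

    // Intermediate step (an assumption, not part of the original example):
    // the former clone forwards to Area, so the logic now lives in one place.
    // The attribute makes the compiler warn at each remaining call site.
    [Obsolete("Use Area instead.")]
    public static double ComputeArea(double radius)
    {
        return Area(radius);
    }
}</pre>
<p style="margin-bottom: 0in;">Clients see the same inputs and outputs throughout, which is the safety requirement.  When the last warning is gone, <code>ComputeArea</code> can be deleted.</p>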
<p><br/></p>
<p style="margin-bottom: 0in;">We do have tools available to us for identifying clones.  I have used three good ones: Simian, Clone Detective, and TeamCity.  These tools are lacking in two distinct ways.  First, they seem to treat clone detection as a text matching problem.  Second, they do not provide clone relationship analysis and automated correction assistance.</p>
<p><br/></p>
<p style="margin-bottom: 0in;">Text matching clone finders are easily defeated by differences in variable names, whitespace, comments, brace/delimiter placement, and literals.  Differences of these types are inconsequential in most languages and will cause primitive clone detectors to return fewer results, or fragmented result sets (many smaller clones instead of fewer large clones).  Member order within cloned classes also causes result fragmentation.  I will say that all the tools I&#8217;ve mentioned do have options that take these inhibitors into account.  However, those issues are relatively easy problems to solve.  Not so easy are things like parameter order, which breaks matches at both the cloned parameter definitions and the cloned call sites.</p>
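<p style="margin-bottom: 0in;">To make the parameter order problem concrete, here is a small hypothetical pair of clones.  The names are invented for illustration.  The bodies are identical, but the signatures and the call sites no longer match as text.</p>
<pre name="code" class="csharp">// Hypothetical example; none of these names come from a real code base.
static class ParameterOrderClones
{
    public static double CylinderVolume(double radius, double height)
    {
        return 3.14 * radius * radius * height;
    }

    // Same logic, but the parameters are declared in the opposite order.
    public static double ComputeCylinderVolume(double height, double radius)
    {
        return 3.14 * radius * radius * height;
    }

    public static bool CallSitesAgree()
    {
        // The argument lists differ textually, yet both calls compute the same value.
        double first = CylinderVolume(2.0, 10.0);
        double second = ComputeCylinderVolume(10.0, 2.0);
        return first == second;
    }
}</pre>
<p style="margin-bottom: 0in;">Ignoring whitespace or renaming variables does not help here.  Collapsing these two requires knowing which argument maps to which parameter at every call site.</p>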
<p><br/></p>
<p style="margin-bottom: 0in;">Ultimately, text-based clone detectors do not provide the rich result sets that we need to create powerful automated clone collapsing tools.  The alternative to text-based matchers that I propose is clone detectors that operate directly on parsed syntax tree representations of source code.  Analyzers operating against syntax structures can eliminate the match noise in clean, language-specific ways.</p>
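<p style="margin-bottom: 0in;">As a rough sketch of the idea, and not the design of any existing tool, consider a toy syntax node and a structural comparison over it.  The <code>Node</code> and <code>CloneMatcher</code> types are hypothetical; a real analyzer would work against the tree produced by the language&#8217;s own parser.  Whitespace, comments, and brace placement never reach the tree, and identifier names are deliberately skipped during comparison.</p>
<pre name="code" class="csharp">using System;

// Hypothetical, minimal syntax node used only for this sketch.
sealed class Node
{
    public readonly string Kind;      // e.g. "Multiply", "Identifier", "Literal"
    public readonly string Text;      // identifier name or literal text, otherwise null
    public readonly Node[] Children;

    public Node(string kind, string text, params Node[] children)
    {
        Kind = kind;
        Text = text;
        Children = children;
    }
}

static class CloneMatcher
{
    // Two trees match when they have the same shape, node kinds, and literal text.
    // Identifier names are ignored, so renamed variables still match.
    public static bool SameStructure(Node a, Node b)
    {
        if (a.Kind != b.Kind) return false;
        if (a.Children.Length != b.Children.Length) return false;
        if (a.Kind != "Identifier")
        {
            if (a.Text != b.Text) return false;
        }
        for (int i = 0; i != a.Children.Length; i++)
        {
            if (!SameStructure(a.Children[i], b.Children[i])) return false;
        }
        return true;
    }
}</pre>
<p style="margin-bottom: 0in;">With this, the trees for <code>3.14 * radius</code> and <code>3.14 * r</code> match.  The sketch is too permissive as written: it would also match <code>a * b</code> against <code>a * a</code>, so a fuller version would need to check that identifiers are renamed consistently.</p>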
<p><br/></p>
<p style="margin-bottom: 0in;">The poor state of clone identification being what it is, simple detection is not enough.  A proper clone analysis tool should analyze the clone and its context and then present the user with automated options for collapsing and removal.  Unfortunately, it is not always as simple as doing a couple of extract method operations with some find and replace.   Consider the clones in these two classes (the lines above and below the calculation of the <code>area</code> variable, on lines 7 and 19):</p>
<pre name="code" class="csharp">class Cylinder
{
     public double Volume()
     {
          double height = GetHeight();
          double scaling = GetScalingFactor();
          double area = 3.14 * Math.Pow(this.radius, 2);
          double volume =  height * area;
          return volume * scaling;
     }
     ...
}
class Box
{
     public double ComputeVolume()
     {
          double height = GetHeight();
          double scaling = GetScalingFactor();
          double area = this.length * this.width;
          double volume =  height * area;
          return volume * scaling;
     }
     ...
}</pre>
<p style="margin-bottom: 0in;">The ideal collapse result is to isolate the differing logic into two distinct functional units.  That is, perform an extract method on the unique logic <em>between</em> the clones.  Now it becomes clear that the two classes are collapsible into a single class where the unique logic can be inserted via polymorphism or dependency injection.  Like so:</p>
<p style="margin-bottom: 0in;">
<pre name="code" class="csharp">abstract class Shape
{
     public abstract double ComputeArea();

     public double Volume()
     {
          double height = GetHeight();
          double scaling = GetScalingFactor();
          double area = ComputeArea();
          double volume =  height * area;
          return volume * scaling;
     }
     ...
}
class Cylinder : Shape
{
     public override double ComputeArea()
     {
          return 3.14 * Math.Pow(this.radius, 2);
     }
     ...
}
class Box : Shape
{
     public override double ComputeArea()
     {
          return this.length * this.width;
     }
     ...
}</pre>
<p style="margin-bottom: 0in;">
<p style="margin-bottom: 0in;">The contextual analysis of the clones&#8217; relationship to surrounding code and other clones led us to a better result.  Without it, the most likely path would have been to jump in with extract method operations against the clones themselves.  That would have left us with a number of identical methods instead of identical blocks, and with no clear direction on the best way to proceed with collapsing.</p>
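<p style="margin-bottom: 0in;">For comparison, here is a sketch of the dependency injection variant of the same collapse.  Every name in it is hypothetical, and because the original classes do not show how <code>GetHeight()</code> and <code>GetScalingFactor()</code> work, I have assumed plain constructor arguments in their place.</p>
<pre name="code" class="csharp">using System;

// Hypothetical abstraction for the logic that differed between the clones.
interface IBaseArea
{
    double ComputeArea();
}

sealed class CircleBase : IBaseArea
{
    private readonly double radius;
    public CircleBase(double radius) { this.radius = radius; }
    public double ComputeArea() { return 3.14 * Math.Pow(this.radius, 2); }
}

sealed class RectangleBase : IBaseArea
{
    private readonly double length;
    private readonly double width;
    public RectangleBase(double length, double width) { this.length = length; this.width = width; }
    public double ComputeArea() { return this.length * this.width; }
}

// The formerly duplicated logic now exists exactly once.
sealed class Solid
{
    private readonly IBaseArea baseArea;
    private readonly double height;    // assumed stand-in for GetHeight()
    private readonly double scaling;   // assumed stand-in for GetScalingFactor()

    public Solid(IBaseArea baseArea, double height, double scaling)
    {
        this.baseArea = baseArea;
        this.height = height;
        this.scaling = scaling;
    }

    public double Volume()
    {
        double area = this.baseArea.ComputeArea();
        double volume = this.height * area;
        return volume * this.scaling;
    }
}</pre>
<p style="margin-bottom: 0in;">A box is then <code>new Solid(new RectangleBase(2.0, 3.0), 4.0, 1.0)</code>.  Whether inheritance or injection is the better fit depends on how the surrounding code already constructs these objects.</p>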
<p><br/></p>
<p style="margin-bottom: 0in;">It&#8217;s worth noting that sometimes collapsing will result in more lines of code.  This does not diminish the value of collapsing.   The extra lines are boilerplate such as class, interface, and function definitions.   The value is derived from the reduction and isolation of the formerly duplicated logic.  Additionally, as we increase the number of functional units (methods and classes) we make the code more malleable, because these units provide the boundaries that our existing code manipulation tools tend to operate on.</p>
<p><br/></p>
<p style="margin-bottom: 0in;">In my next post I will present some implementation ideas for the rich clone analyzer I have outlined here.  In further posts I will discuss how else we might put such a tool to use.</p>
]]></content:encoded>
			<wfw:commentRss>http://landofjosh.com/2009/05/code-clones/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Unnecessary Complexity Case Study #1: Untyped Enum Comparisons</title>
		<link>http://landofjosh.com/2009/05/unnecessary-complexity-case-study-1-untyped-enum-comparisons/</link>
		<comments>http://landofjosh.com/2009/05/unnecessary-complexity-case-study-1-untyped-enum-comparisons/#comments</comments>
		<pubDate>Thu, 07 May 2009 13:27:27 +0000</pubDate>
		<dc:creator>josh</dc:creator>
				<category><![CDATA[unnecessary-complexity]]></category>
		<category><![CDATA[agent ralph]]></category>
		<category><![CDATA[resharper]]></category>

		<guid isPermaLink="false">http://landofjosh.com/?p=12</guid>
		<description><![CDATA[This post is the first in a series of posts on specific examples of unnecessary complexity.   Consider this code: enum MyEnum  { First, Second } public void Match1(MyEnum e) {     if (e.ToString() == "First") { } } The problem lies in the condition of the if.  The developer is converting an enum instance [...]]]></description>
			<content:encoded><![CDATA[<p>This post is the first in a series of posts on specific examples of unnecessary complexity.  </p>
<p>Consider this code:</p>
<pre name="code" class="csharp">enum MyEnum  { First, Second }
public void Match1(MyEnum e)
{
    if (e.ToString() == "First") { }
}</pre>
<p>The problem lies in the condition of the if.  The developer is converting an enum instance to a string for the purposes of a comparison, circumventing type safety.  The primary problem here is that the statement can no longer safely participate in automated refactorings.  For example, if MyEnum.First is renamed, this code will still compile, but the condition will always be false and the body will never execute.  A Find Usages executed on MyEnum.First would not identify this site.  </p>
<p>It also performs worse than the alternatives.  I hesitate to even mention that, though, because like most micro performance optimizations it is only going to matter in uncommon situations such as a tight loop or heavy load.</p>
<p>The correct code is this:</p>
<pre name="code" class="csharp">if(e == MyEnum.First) { }</pre>
<p>The code is now typesafe, fast (it&#8217;s a comparison of integers), and refactoring friendly.  And it&#8217;s a simple enough fix.  The exact steps are:</p>
<ol>
<li>Scan the enum values list for a member whose name exactly matches the string literal.</li>
<li>If found, replace the string literal with a reference to the qualified enum value.</li>
</ol>
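<p>The two steps can be sketched with reflection.  This is only an illustration of the matching logic, not how a refactoring tool would implement it; a tool would work on the syntax tree at edit time.  The <code>EnumLiteralMatcher</code> class and its method are hypothetical names.</p>
<pre name="code" class="csharp">using System;

enum MyEnum { First, Second }

// Hypothetical helper illustrating the two steps above.
static class EnumLiteralMatcher
{
    // Step 1: scan the declared member names for an exact, case sensitive match.
    // Step 2: if one is found, return the qualified reference that would replace
    // the string literal.  Returns null when nothing matches, in which case the
    // comparison should be left alone.
    public static string FindReplacement(Type enumType, string literal)
    {
        string[] names = Enum.GetNames(enumType);
        if (Array.IndexOf(names, literal) == -1) return null;
        return enumType.Name + "." + literal;
    }
}</pre>
<p>For the example above, <code>FindReplacement(typeof(MyEnum), "First")</code> yields the text <code>MyEnum.First</code>, while a literal such as <code>"first"</code> yields no replacement.</p>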
<p>There are some special cases to consider as well.</p>
<p>Here the string is being manipulated before comparing, by calling ToLower().</p>
<pre name="code" class="csharp">if(e.ToString().ToLower() == "first") { }</pre>
<p>When the result of a ToLower() or ToUpper() call is compared, the string constant is written in a single case, likely with the intention of reducing typos.  I suspect the programmer considered this <a href="http://en.wikipedia.org/wiki/Defensive_programming">defensive programming</a>.</p>
<p>Unnecessary complexity begets bugs.  When your code base is littered with untyped enum comparisons, this mistake is inevitable:</p>
<pre name="code" class="csharp">if(e.ToString().ToLower() == "First") { }</pre>
<p>The string constant actually matches the declared enum casing, yet due to the ToLower() call the condition is always false, and the branch is never executed.  This permutation is particularly dangerous as the fix revives a branch that hasn&#8217;t been executing. You are eliminating a side effect, and a danger of eliminating side effects is breaking other parts of the system that could be dependent on them.</p>
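<p>A correction tool could tell these permutations apart before touching the code.  The following is a sketch under my own assumptions, with hypothetical names, and it uses invariant casing where the original code uses the culture sensitive ToLower().</p>
<pre name="code" class="csharp">using System;

// Hypothetical helper for classifying ToLower() comparisons.
static class LoweredComparison
{
    // A comparison of the form  e.ToString().ToLower() == literal  can only
    // succeed when the literal is already entirely lower case.
    public static bool CanEverMatch(string literal)
    {
        return literal == literal.ToLowerInvariant();
    }

    // Finds the member the programmer most likely meant, ignoring case.
    // Returns null when no member matches.
    public static string FindIntendedMember(Type enumType, string literal)
    {
        foreach (string name in Enum.GetNames(enumType))
        {
            if (string.Equals(name, literal, StringComparison.OrdinalIgnoreCase)) return name;
        }
        return null;
    }
}</pre>
<p>When <code>CanEverMatch</code> returns false the branch has been dead code, so it seems safer for a tool to flag the site for review than to apply the fix silently.</p>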
<p>I have created a tool for automating the correction of untyped enum comparisons.  It&#8217;s a ReSharper plug-in based <a href="http://www.jetbrains.com/resharper/features/code_analysis.html#Quick-Fixes">quick fix</a> and is available as part of my <a title="Agent Ralph" href="http://code.google.com/p/agentralphplugin/">Agent Ralph</a> project.</p>
<div id="attachment_18" class="wp-caption alignnone" style="width: 516px"><img class="size-full wp-image-18  " title="Make Enum Comparison Typesafe" src="http://landofjosh.com/wp-content/uploads/2009/05/makeenumcomparisontypesafe1.png" alt="Make Enum Comparison Typesafe" width="506" height="120" /><p class="wp-caption-text">Using Agent Ralph to automate a Make Enum Comparison Typesafe code correction.</p></div>
<p>Currently it only handles the simple case, pictured above.  I hope to correct that deficiency soon.</p>
<p><em>Thanks to <a href="http://www.fooberry.com">Mark</a> for the great suggestion on a WordPress friendly <a href="http://wordpress.org/extend/plugins/google-syntax-highlighter/">code syntax highlighter</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://landofjosh.com/2009/05/unnecessary-complexity-case-study-1-untyped-enum-comparisons/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
