<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments for blog of josh</title>
	<atom:link href="http://landofjosh.com/comments/feed/" rel="self" type="application/rss+xml" />
	<link>http://landofjosh.com</link>
	<description>software development under the big arch</description>
	<lastBuildDate>Sat, 27 Apr 2013 15:09:50 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.1</generator>
	<item>
		<title>Comment on I have assembled the Triforce by josh</title>
		<link>http://landofjosh.com/2009/10/i-have-assembled-the-triforce/comment-page-1/#comment-100</link>
		<dc:creator>josh</dc:creator>
		<pubDate>Fri, 11 May 2012 13:54:16 +0000</pubDate>
		<guid isPermaLink="false">http://landofjosh.com/?p=213#comment-100</guid>
		<description>Hi Ranji,

I love hearing that your team is using Agent Ralph.  

I&#039;ll see if I can get Agent Ralph updated for the latest R# soon. 

Thanks,
Josh</description>
		<content:encoded><![CDATA[<p>Hi Ranji,</p>
<p>I love hearing that your team is using Agent Ralph.  </p>
<p>I&#8217;ll see if I can get Agent Ralph updated for the latest R# soon. </p>
<p>Thanks,<br />
Josh</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on I have assembled the Triforce by Ranji</title>
		<link>http://landofjosh.com/2009/10/i-have-assembled-the-triforce/comment-page-1/#comment-99</link>
		<dc:creator>Ranji</dc:creator>
		<pubDate>Fri, 11 May 2012 06:39:47 +0000</pubDate>
		<guid isPermaLink="false">http://landofjosh.com/?p=213#comment-99</guid>
		<description>Hello Josh,

I have found Agent Ralph very useful and my team is using it extensively. Could you please add support to ReSharper 6.x as we have upgraded to ReSharper 6.x?

Thanks and Regards
Ranji</description>
		<content:encoded><![CDATA[<p>Hello Josh,</p>
<p>I have found Agent Ralph very useful and my team is using it extensively. Could you please add support to ReSharper 6.x as we have upgraded to ReSharper 6.x?</p>
<p>Thanks and Regards<br />
Ranji</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Unnecessary Complexity Case Study #1: Untyped Enum Comparisons by Andy Garcia</title>
		<link>http://landofjosh.com/2009/05/unnecessary-complexity-case-study-1-untyped-enum-comparisons/comment-page-1/#comment-97</link>
		<dc:creator>Andy Garcia</dc:creator>
		<pubDate>Fri, 02 Dec 2011 08:48:48 +0000</pubDate>
		<guid isPermaLink="false">http://landofjosh.com/?p=12#comment-97</guid>
		<description>@Brian Rodewalt. This simple refactoring you can add yourself with &quot;Search and Replace pattern&quot;, something like &quot;$string1$.ToLower() == $string2$&quot; and replacement &quot;$string1$.Equals($string2$, StringComparison.OrdinalIgnoreCase)&quot;</description>
		<content:encoded><![CDATA[<p>@Brian Rodewalt. This simple refactoring you can add yourself with &#8220;Search and Replace pattern&#8221;, something like &#8220;$string1$.ToLower() == $string2$&#8221; and replacement &#8220;$string1$.Equals($string2$, StringComparison.OrdinalIgnoreCase)&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on An Idea For Robust Clone Detection Using Abstract Syntax Trees by josh</title>
		<link>http://landofjosh.com/2009/07/an-idea-for-robust-clone-detection-using-abstract-syntax-trees/comment-page-1/#comment-91</link>
		<dc:creator>josh</dc:creator>
		<pubDate>Sat, 30 Jan 2010 20:20:08 +0000</pubDate>
		<guid isPermaLink="false">http://landofjosh.com/?p=145#comment-91</guid>
		<description>Ira,

I didn&#039;t realize that there was an exact match following the hashed match.  I agree, my false positive concern is misplaced.

I am looking forward to reading your papers.

Josh</description>
		<content:encoded><![CDATA[<p>Ira,</p>
<p>I didn&#8217;t realize that there was an exact match following the hashed match.  I agree, my false positive concern is misplaced.</p>
<p>I am looking forward to reading your papers.</p>
<p>Josh</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on An Idea For Robust Clone Detection Using Abstract Syntax Trees by Ira Baxter</title>
		<link>http://landofjosh.com/2009/07/an-idea-for-robust-clone-detection-using-abstract-syntax-trees/comment-page-1/#comment-90</link>
		<dc:creator>Ira Baxter</dc:creator>
		<pubDate>Thu, 28 Jan 2010 04:33:46 +0000</pubDate>
		<guid isPermaLink="false">http://landofjosh.com/?p=145#comment-90</guid>
		<description>Well, the algorithm only uses hashes to find possible matches, and then compares them exactly.  So it detects exact clones without error.

It also detects &quot;near miss&quot; clones which you can think of as parameterized code, e.g., if you made a macro out of a block of code and replaced well-formed sections, you&#039;d end up with a parameterized clone.

The present (2010) implementation operates somewhat differently than the 1998 paper, but the basic ideas are the same.  A 2004 study (published in IEEE Transactions on Software Engineering) by Steve Bellon compared several detectors, and concluded that ours produces the smallest number of false positives, so I think your worry is misplaced.

You&#039;ve observed that matching exact trees is &quot;easy&quot;.  In fact, I agree. What isn&#039;t easy is matching inexact trees to produce the near miss clones, and making this work at the 2 million line scale for multiple programming languages.

You can find a number of examples of clone detection runs on different languages at the website.

-- IDB</description>
		<content:encoded><![CDATA[<p>Well, the algorithm only uses hashes to find possible matches, and then compares them exactly.  So it detects exact clones without error.</p>
<p>It also detects &#8220;near miss&#8221; clones which you can think of as parameterized code, e.g., if you made a macro out of a block of code and replaced well-formed sections, you&#8217;d end up with a parameterized clone.</p>
<p>The present (2010) implementation operates somewhat differently than the 1998 paper, but the basic ideas are the same.  A 2004 study (published in IEEE Transactions on Software Engineering) by Steve Bellon compared several detectors, and concluded that ours produces the smallest number of false positives, so I think your worry is misplaced.</p>
<p>You&#8217;ve observed that matching exact trees is &#8220;easy&#8221;.  In fact, I agree. What isn&#8217;t easy is matching inexact trees to produce the near miss clones, and making this work at the 2 million line scale for multiple programming languages.</p>
<p>You can find a number of examples of clone detection runs on different languages at the website.</p>
<p>&#8211; IDB</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on An Idea For Robust Clone Detection Using Abstract Syntax Trees by josh</title>
		<link>http://landofjosh.com/2009/07/an-idea-for-robust-clone-detection-using-abstract-syntax-trees/comment-page-1/#comment-89</link>
		<dc:creator>josh</dc:creator>
		<pubDate>Mon, 25 Jan 2010 05:59:00 +0000</pubDate>
		<guid isPermaLink="false">http://landofjosh.com/?p=145#comment-89</guid>
		<description>Hello Ira, 

I read your paper, though I have not tried out your implementation.  If I remember correctly, the high level view of the algorithm was to compute a hash on the ASTs and sub-ASTs, and then compare hashes for clone detection.  I think that one advantage to my approach is that when a clone is reported there is a higher degree of certainty that it is valid.  (I think like 100% certainty, though I haven&#039;t proven that or anything.)  The &#039;fuzzy hash&#039;  idea where the hash calculation handles some constructs differently in an effort to detect near miss clones (like ignoring small sub trees for example) seems like it would generate false positives.  For my tool false positives aren&#039;t acceptable as I want to automate the clone repair.  Of course, a hash based implementation would be a lot faster.  I am dealing with the kind of poor algorithmic complexity you mention in your paper, like O(n^3) and worse sometimes.

I would like to read more about how you automated the repair of the clones.  Do you talk about that in one of the other papers?  I&#039;ve only read the one.

Thanks for the comment,
Josh</description>
		<content:encoded><![CDATA[<p>Hello Ira, </p>
<p>I read your paper, though I have not tried out your implementation.  If I remember correctly, the high level view of the algorithm was to compute a hash on the ASTs and sub-ASTs, and then compare hashes for clone detection.  I think that one advantage to my approach is that when a clone is reported there is a higher degree of certainty that it is valid.  (I think like 100% certainty, though I haven&#8217;t proven that or anything.)  The &#8216;fuzzy hash&#8217;  idea where the hash calculation handles some constructs differently in an effort to detect near miss clones (like ignoring small sub trees for example) seems like it would generate false positives.  For my tool false positives aren&#8217;t acceptable as I want to automate the clone repair.  Of course, a hash based implementation would be a lot faster.  I am dealing with the kind of poor algorithmic complexity you mention in your paper, like O(n^3) and worse sometimes.</p>
<p>I would like to read more about how you automated the repair of the clones.  Do you talk about that in one of the other papers?  I&#8217;ve only read the one.</p>
<p>Thanks for the comment,<br />
Josh</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on An Idea For Robust Clone Detection Using Abstract Syntax Trees by Ira Baxter</title>
		<link>http://landofjosh.com/2009/07/an-idea-for-robust-clone-detection-using-abstract-syntax-trees/comment-page-1/#comment-88</link>
		<dc:creator>Ira Baxter</dc:creator>
		<pubDate>Sat, 23 Jan 2010 23:00:51 +0000</pubDate>
		<guid isPermaLink="false">http://landofjosh.com/?p=145#comment-88</guid>
		<description>I implemented and wrote a technical paper on a clone detector based on AST tree matching back in 1998.   Check out the web site for discussion, link to technical paper, and sample clone analysis reports for several languages.</description>
		<content:encoded><![CDATA[<p>I implemented and wrote a technical paper on a clone detector based on AST tree matching back in 1998.   Check out the web site for discussion, link to technical paper, and sample clone analysis reports for several languages.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on The Oscillating Shrinking Window by josh</title>
		<link>http://landofjosh.com/2009/08/the-oscillating-shrinking-window/comment-page-1/#comment-62</link>
		<dc:creator>josh</dc:creator>
		<pubDate>Sun, 06 Sep 2009 23:07:59 +0000</pubDate>
		<guid isPermaLink="false">http://landofjosh.com/?p=199#comment-62</guid>
		<description>Hi Scott.

On performance:  I opened a ~75 line method (the extract method implementation itself, ironically) and it tanked.  I left it to spin and went to bed.  After about 7 hours it still wasn&#039;t done.  I realized I needed a heuristic.  So, I modified the algorithm to only consider extraction windows whose size matches an existing method.  For example, don&#039;t bother extracting a three statement block if you&#039;re just going to compare the result to a four statement block.  This made scanning the big methods practical.  I think it  changes the algorithmic complexity to be linear, maybe even sub linear.

On robustifying:  My philosophy in this project is that I want to detect clones by proving two pieces of code functionally equivalent.  To prove that, I apply a series of strict refactorings, where strict means the refactored result has not changed its inputs, outputs, or side effects.  If the code can be safely coerced to a perfect match of some other code then they are clones.  So to answer your question, the way I would solve your example is not by &quot;relaxing&quot; the comparison to, say, disregard literal input values.  Instead I would apply an Introduce Local Variable refactoring to the literals.  Subsequent extract methods would exclude said variable definitions producing ASTs that when compared now do not fail due to different literal inputs.

It&#039;s a different way of accomplishing the same thing, but I think it has some advantages.  One is that refactoring implementations are external, sort of like plug-ins.  I can implement a new one (or accept someone else&#039;s) and just add it right to the list, improving the whole system without having to touch the core code.</description>
		<content:encoded><![CDATA[<p>Hi Scott.</p>
<p>On performance:  I opened a ~75 line method (the extract method implementation itself, ironically) and it tanked.  I left it to spin and went to bed.  After about 7 hours it still wasn&#8217;t done.  I realized I needed a heuristic.  So, I modified the algorithm to only consider extraction windows whose size matches an existing method.  For example, don&#8217;t bother extracting a three statement block if you&#8217;re just going to compare the result to a four statement block.  This made scanning the big methods practical.  I think it  changes the algorithmic complexity to be linear, maybe even sub linear.</p>
<p>On robustifying:  My philosophy in this project is that I want to detect clones by proving two pieces of code functionally equivalent.  To prove that, I apply a series of strict refactorings, where strict means the refactored result has not changed its inputs, outputs, or side effects.  If the code can be safely coerced to a perfect match of some other code then they are clones.  So to answer your question, the way I would solve your example is not by &#8220;relaxing&#8221; the comparison to, say, disregard literal input values.  Instead I would apply an Introduce Local Variable refactoring to the literals.  Subsequent extract methods would exclude said variable definitions producing ASTs that when compared now do not fail due to different literal inputs.</p>
<p>It&#8217;s a different way of accomplishing the same thing, but I think it has some advantages.  One is that refactoring implementations are external, sort of like plug-ins.  I can implement a new one (or accept someone else&#8217;s) and just add it right to the list, improving the whole system without having to touch the core code.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on The Oscillating Shrinking Window by Scott Wegner</title>
		<link>http://landofjosh.com/2009/08/the-oscillating-shrinking-window/comment-page-1/#comment-60</link>
		<dc:creator>Scott Wegner</dc:creator>
		<pubDate>Mon, 31 Aug 2009 21:33:29 +0000</pubDate>
		<guid isPermaLink="false">http://landofjosh.com/?p=199#comment-60</guid>
		<description>Very cool Josh.  A couple questions:

You point out that this won&#039;t really scale well, and I can see why.  Have you looked at any real-life performance results?  How large can the source file be for the refactoring to run in some bearable amount of time?

Do you have plans for extending this refactoring to be more robust?  It might be neat to extract clones which only differ by input value, so you could extract things like MyClone(100), MyClone(200).  I imagine this would add some additional complexity to your AST representation.</description>
		<content:encoded><![CDATA[<p>Very cool Josh.  A couple questions:</p>
<p>You point out that this won&#8217;t really scale well, and I can see why.  Have you looked at any real-life performance results?  How large can the source file be for the refactoring to run in some bearable amount of time?</p>
<p>Do you have plans for extending this refactoring to be more robust?  It might be neat to extract clones which only differ by input value, so you could extract things like MyClone(100), MyClone(200).  I imagine this would add some additional complexity to your AST representation.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
