<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Spin Foam</title>
	<atom:link href="http://mchouza.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://mchouza.wordpress.com</link>
	<description>Just another WordPress.com weblog</description>
	<lastBuildDate>Sat, 14 Jan 2012 04:47:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='mchouza.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Spin Foam</title>
		<link>http://mchouza.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://mchouza.wordpress.com/osd.xml" title="Spin Foam" />
	<atom:link rel='hub' href='http://mchouza.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Quines [remix]</title>
		<link>http://mchouza.wordpress.com/2012/01/14/quines-remix/</link>
		<comments>http://mchouza.wordpress.com/2012/01/14/quines-remix/#comments</comments>
		<pubDate>Sat, 14 Jan 2012 01:11:00 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[C/C++]]></category>
		<category><![CDATA[cs]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=2724</guid>
		<description><![CDATA[As the quines in the previous post were criticized as boring and ordinary, I did a remix: You can check that it works on the web or by downloading the file and testing it: I&#8217;m aware that there are some impressive examples out there, but I haven&#8217;t analyzed them to avoid spoiling the fun. [...]]]></description>
			<content:encoded><![CDATA[<p>As the quines <a href="http://mchouza.wordpress.com/2012/01/08/quines/">in the previous post</a> were criticized as boring and ordinary <img src='http://s2.wp.com/wp-includes/images/smilies/icon_razz.gif' alt=':-P' class='wp-smiley' /> , <a href="http://www.everythingisaremix.info/watch-the-series/">I did a remix</a>:</p>
<p><pre class="brush: python;">
#include &lt;stdio.h&gt;
#define s char s[]
s=&quot;#if 0\nimport json;r=json.dumps\nprint'#include &lt;stdio.h&gt;\\n#define s char s[]\\ns=%s;\\n%s'%(r(s),s)\n\&quot;\&quot;\&quot; \&quot;\n#elif 1\n#undef s\nint main(void)\n{\n  char *t = s;\n  printf(\&quot;#include &lt;stdio.h&gt;\\n#define s char s[]\\ns=\\\&quot;\&quot;);\n  while (*t)\n  {\n    if (*t == '\\n')\n      printf(\&quot;\\\\n\&quot;);\n    else if (*t == '\&quot;')\n      printf(\&quot;\\\\\\\&quot;\&quot;);\n    else if (*t == '\\\\')\n      printf(\&quot;\\\\\\\\\&quot;);\n    else\n      printf(\&quot;%c\&quot;, *t);\n    t++;\n  }\n  printf(\&quot;\\\&quot;;\\n%s\\n\&quot;, s);\n  return 0;\n}\n#elif 0\n\&quot; \&quot;\&quot;\&quot;\n#endif&quot;;
#if 0
import json;r=json.dumps
print'#include &lt;stdio.h&gt;\n#define s char s[]\ns=%s;\n%s'%(r(s),s)
&quot;&quot;&quot; &quot;
#elif 1
#undef s
int main(void)
{
  char *t = s;
  printf(&quot;#include &lt;stdio.h&gt;\n#define s char s[]\ns=\&quot;&quot;);
  while (*t)
  {
    if (*t == '\n')
      printf(&quot;\\n&quot;);
    else if (*t == '&quot;')
      printf(&quot;\\\&quot;&quot;);
    else if (*t == '\\')
      printf(&quot;\\\\&quot;);
    else
      printf(&quot;%c&quot;, *t);
    t++;
  }
  printf(&quot;\&quot;;\n%s\n&quot;, s);
  return 0;
}
#elif 0
&quot; &quot;&quot;&quot;
#endif
</pre></p>
<p>You can check that it works on <a href="http://www.ideone.com/bKgiR">the</a> <a href="http://www.ideone.com/DP59E">web</a> or <a href="http://mchouza.googlecode.com/svn-history/r122/trunk/quine/polyquine.c">by downloading the file</a> and testing it:</p>
<p><pre class="brush: plain;">
$ python polyquine.c | diff polyquine.c -
$ gcc -ansi -pedantic -Wall polyquine.c -o polyquine &amp;&amp; ./polyquine | diff polyquine.c -
</pre></p>
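<p>How one file can be read by two languages may be easier to see on a reduced layout. What follows is only a minimal sketch, not the file above: <strong>POLYGLOT</strong>, <strong>result</strong> and <strong>run_python_side</strong> are hypothetical names, the text is not a quine, and only the Python side is exercised (the sketch is written for Python 3). Python sees every <strong>#</strong> line as a comment and hides the C part inside a triple-quoted string, while a C preprocessor is expected to skip the <strong>#if 0</strong> group instead.</p>

```python
# Hypothetical miniature of the layout used above (not a quine).
# Python treats every '#...' line as a comment and swallows the C part
# inside a triple-quoted string; a C preprocessor is expected to skip the
# '#if 0' group instead, but the C side is not exercised by this sketch.
POLYGLOT = '''#if 0
result = "python ran"
""" "
#elif 1
int puts(const char *text);
int main(void) { puts("C ran"); return 0; }
#elif 0
" """
#endif
'''


def run_python_side(text):
    """Execute the text as Python and return the names it defined."""
    namespace = {}
    exec(compile(text, "polyglot_sketch", "exec"), namespace)
    namespace.pop("__builtins__", None)
    return namespace


if __name__ == "__main__":
    print(run_python_side(POLYGLOT))
```

<p>The only name Python defines is the one assigned inside the <strong>#if 0</strong> group; everything between the two triple quotes is an unused string expression.</p>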
<p>I&#8217;m aware that there are <a href="http://www.nyx.net/~gthompso/poly/polyglot.htm">some</a> <a href="http://scienceblogs.com/goodmath/2007/04/true_pathology_a_multilingual.php">impressive</a> <a href="http://www.phong.org/bf/polyglotC++PerlPythonC.c">examples</a> out there, but I haven&#8217;t analyzed them to avoid spoiling the fun.</p>
<p>What other language should I add? Reader contributions are welcome!</p>
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2012/01/14/quines-remix/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>
	</item>
		<item>
		<title>Quines</title>
		<link>http://mchouza.wordpress.com/2012/01/08/quines/</link>
		<comments>http://mchouza.wordpress.com/2012/01/08/quines/#comments</comments>
		<pubDate>Sun, 08 Jan 2012 02:35:54 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[C/C++]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[cs]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=2705</guid>
		<description><![CDATA[Introduction One interesting programming exercise is to write a quine, a program that outputs its own source code (excluding empty programs!). If we naively try to write it just by using the print statement we will get into an infinite regress: The basic problem is that the instructions required to write other instructions lead to [...]]]></description>
			<content:encoded><![CDATA[<h4>Introduction</h4>
<p>One interesting programming exercise is to write <a href="http://en.wikipedia.org/wiki/Quine_(computing)">a quine</a>, a program that outputs its own source code (<a href="http://www.ioccc.org/1994/smr.hint">excluding empty programs</a>!). If we naively try to write it just by using <a href="http://docs.python.org/reference/simple_stmts.html#grammar-token-print_stmt">the print statement</a>, we will get into an infinite regress:</p>
<p><pre class="brush: python;">
print 'print \' print \\\'print \\\\\\\'...
</pre></p>
<p>The basic problem is that the instructions required to write other instructions lead to an infinite recursion. So how can we avoid this problem?</p>
<h4>The trick</h4>
<p>One general way to do it is the one used by <a href="http://en.wikipedia.org/wiki/Von_Neumann">Von Neumann</a> to make <a href="http://en.wikipedia.org/wiki/Von_Neumann_universal_constructor">his abstract self-replicating machines</a> and <a href="http://en.wikipedia.org/wiki/Dna_replication">by nature itself</a>. The trick is to handle the same instructions in two different ways: as instructions to be followed and as data to be copied.</p>
<h4>Writing a C quine</h4>
<p>As we will need to access the same data in two different ways, we must define a variable to avoid duplicating it:</p>
<p><pre class="brush: cpp;">
char s[] = &quot;&lt;data will go here&gt;&quot;;
</pre></p>
<p>It&#8217;s easy to write the code that prints everything before the data,</p>
<p><pre class="brush: cpp;">
int main(void)
{
  printf(&quot;char s[] = \&quot;&quot;);
</pre></p>
<p>but now we need to print the data as represented inside the string. Fortunately, of the characters we use, only the <a href="http://en.wikipedia.org/wiki/Newline#In_programming_languages">newline</a>, the quote and the backslash require <a href="http://en.wikipedia.org/wiki/Escape_sequence">escape sequences</a>:</p>
<p><pre class="brush: cpp;">
  char *t = s;
  while (*t)
  {
    if (*t == '\n')
      printf(&quot;\\n&quot;);
    else if (*t == '&quot;')
      printf(&quot;\\\&quot;&quot;);
    else if (*t == '\\')
      printf(&quot;\\\\&quot;);
    else
      printf(&quot;%c&quot;, *t);
    t++;
  }
</pre></p>
<p>Finally, we print the data as a normal string and exit <strong>main()</strong>:</p>
<p><pre class="brush: cpp;">
  printf(&quot;%s&quot;, s);
  return 0;
}
</pre></p>
<p>This program is finished, with the exception of filling in the data in the variable <strong>s</strong>. As this is quite tedious to do by hand, we will do it <a href="http://www.ideone.com/mJCHA">using a program</a>:</p>
<p><pre class="brush: cpp;">
char s[] = &quot;\&quot;;\nint main(void)\n{\n  printf(\&quot;char s[] = \\\&quot;\&quot;);\n  char *t = s;\n  while (*t)\n  {\n    if (*t == '\\n')\n      printf(\&quot;\\\\n\&quot;);\n    else if (*t == '\&quot;')\n      printf(\&quot;\\\\\\\&quot;\&quot;);\n    else if (*t == '\\\\')\n      printf(\&quot;\\\\\\\\\&quot;);\n    else\n      printf(\&quot;%c\&quot;, *t);\n    t++;\n  }\n  printf(\&quot;%s\&quot;, s);\n  return 0;\n}\n&quot;;
</pre></p>
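<p>The linked program is not reproduced in this post. As a rough idea of what such a generator could look like, here is a minimal sketch in Python 3. The names <strong>PREFIX</strong>, <strong>TAIL</strong>, <strong>c_escape</strong>, <strong>build_source</strong>, <strong>c_unescape_literal</strong> and <strong>simulate_run</strong> are hypothetical. It applies the same three escaping rules as the C loop and then mimics what the compiled program would print, to check that the result is a fixed point.</p>

```python
# Sketch of a generator for the data line of the C quine (assumed helper
# names; this is not the program linked from the post).

PREFIX = 'char s[] = "'

# Everything that follows the data in the C source, starting with the
# closing quote of the string literal (raw string: backslashes are literal).
TAIL = r'''";
int main(void)
{
  printf("char s[] = \"");
  char *t = s;
  while (*t)
  {
    if (*t == '\n')
      printf("\\n");
    else if (*t == '"')
      printf("\\\"");
    else if (*t == '\\')
      printf("\\\\");
    else
      printf("%c", *t);
    t++;
  }
  printf("%s", s);
  return 0;
}
'''


def c_escape(text):
    """Escape newline, double quote and backslash, as the C loop does."""
    out = []
    for ch in text:
        if ch == "\n":
            out.append("\\n")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\\":
            out.append("\\\\")
        else:
            out.append(ch)
    return "".join(out)


def build_source(tail):
    """Full source: prefix, the tail as an escaped literal, then the tail."""
    return PREFIX + c_escape(tail) + tail


def c_unescape_literal(source, start):
    """Decode the C string literal body that begins at source[start]."""
    decode = {"n": "\n", '"': '"', "\\": "\\"}
    out = []
    i = start
    while source[i] != '"':
        if source[i] == "\\":
            i += 1
            out.append(decode[source[i]])
        else:
            out.append(source[i])
        i += 1
    return "".join(out)


def simulate_run(source):
    """Mimic the C program: print the prefix, the escaped data, the data."""
    s = c_unescape_literal(source, len(PREFIX))
    return PREFIX + c_escape(s) + s


if __name__ == "__main__":
    source = build_source(TAIL)
    print(source.splitlines()[0])           # the generated data line
    print(simulate_run(source) == source)   # fixed point check
```

<p>The simulation only covers the three escape sequences used here; a real C compiler accepts many more, so this is a consistency check rather than a substitute for compiling the file.</p>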
<p>Putting it all together, we get <a href="http://www.ideone.com/sgKAx">a program that prints its own source</a>:</p>
<p><pre class="brush: cpp;">
char s[] = &quot;\&quot;;\nint main(void)\n{\n  printf(\&quot;char s[] = \\\&quot;\&quot;);\n  char *t = s;\n  while (*t)\n  {\n    if (*t == '\\n')\n      printf(\&quot;\\\\n\&quot;);\n    else if (*t == '\&quot;')\n      printf(\&quot;\\\\\\\&quot;\&quot;);\n    else if (*t == '\\\\')\n      printf(\&quot;\\\\\\\\\&quot;);\n    else\n      printf(\&quot;%c\&quot;, *t);\n    t++;\n  }\n  printf(\&quot;%s\&quot;, s);\n  return 0;\n}\n&quot;;
int main(void)
{
  printf(&quot;char s[] = \&quot;&quot;);
  char *t = s;
  while (*t)
  {
    if (*t == '\n')
      printf(&quot;\\n&quot;);
    else if (*t == '&quot;')
      printf(&quot;\\\&quot;&quot;);
    else if (*t == '\\')
      printf(&quot;\\\\&quot;);
    else
      printf(&quot;%c&quot;, *t);
    t++;
  }
  printf(&quot;%s&quot;, s);
  return 0;
}
</pre></p>
<p>But, though it works in GCC, <a href="http://www.ideone.com/yIucF">it&#8217;s not</a> a strict C99 file:</p>
<p><pre class="brush: plain;">
cc1: warnings being treated as errors
prog.c: In function ‘main’:
prog.c:4: error: implicit declaration of function ‘printf’
prog.c:4: error: incompatible implicit declaration of built-in function ‘printf’
</pre></p>
<p>This can be fixed <a href="http://www.ideone.com/KoPNz">by including the <strong>stdio.h</strong> header</a>:</p>
<p><pre class="brush: cpp;">
#include &lt;stdio.h&gt;
char s[] = &quot;\&quot;;\nint main(void)\n{\n  printf(\&quot;#include &lt;stdio.h&gt;\\nchar s[] = \\\&quot;\&quot;);\n  char *t = s;\n  while (*t)\n  {\n    if (*t == '\\n')\n      printf(\&quot;\\\\n\&quot;);\n    else if (*t == '\&quot;')\n      printf(\&quot;\\\\\\\&quot;\&quot;);\n    else if (*t == '\\\\')\n      printf(\&quot;\\\\\\\\\&quot;);\n    else\n      printf(\&quot;%c\&quot;, *t);\n    t++;\n  }\n  printf(\&quot;%s\&quot;, s);\n  return 0;\n}\n&quot;;
int main(void)
{
  printf(&quot;#include &lt;stdio.h&gt;\nchar s[] = \&quot;&quot;);
  char *t = s;
  while (*t)
  {
    if (*t == '\n')
      printf(&quot;\\n&quot;);
    else if (*t == '&quot;')
      printf(&quot;\\\&quot;&quot;);
    else if (*t == '\\')
      printf(&quot;\\\\&quot;);
    else
      printf(&quot;%c&quot;, *t);
    t++;
  }
  printf(&quot;%s&quot;, s);
  return 0;
}
</pre></p>
<h4>Doing a Python quine</h4>
<p>Doing a Python quine is much easier, because <a href="http://docs.python.org/library/functions.html#repr">the built-in function <strong>repr()</strong></a> gives a string representation of an object, allowing us to skip most of the code of the C quine. Using it we can get <a href="http://www.ideone.com/3KhFc">this natural three line quine</a>:</p>
<p><pre class="brush: python;">
s = &quot;print 's = %s' % repr(s)\nprint s&quot;
print 's = %s' % repr(s)
print s
</pre></p>
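<p>The listing above uses the Python 2 <strong>print</strong> statement. As a hedged aside, the same construction can be checked mechanically under Python 3, where <strong>print</strong> is a function. In this sketch <strong>BODY</strong>, <strong>SOURCE</strong> and <strong>run_and_capture</strong> are hypothetical names: the program text is built from its own data, executed, and its output compared with its source.</p>

```python
import contextlib
import io

# Python 3 adaptation (an assumption of this sketch): print is a function,
# so the two print statements gain parentheses.
BODY = "print('s = %s' % repr(s))\nprint(s)"

# The full program: the data line followed by the code the data describes.
SOURCE = "s = %s\n%s\n" % (repr(BODY), BODY)


def run_and_capture(source):
    """Execute source in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


if __name__ == "__main__":
    print(SOURCE, end="")
    print("is a quine:", run_and_capture(SOURCE) == SOURCE)
```

<p>Building <strong>SOURCE</strong> with <strong>repr()</strong> mirrors the trick itself: the data is used once as a quoted literal and once as plain code.</p>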
<p>If we want to do <a href="http://en.wikipedia.org/wiki/Code_golf">code golfing</a>, we can:</p>
<ul>
<li>Remove unnecessary spaces and line breaks.</li>
<li>Print the representation of the data and the data itself using only one <strong>print</strong> statement.</li>
</ul>
<p>This gives us <a href="http://www.ideone.com/NLx9v">a much shorter quine</a>:</p>
<p><pre class="brush: python;">
s=&quot;print's=%s;%s'%(repr(s),s)&quot;;print's=%s;%s'%(repr(s),s)
</pre></p>
<p>though still longer <a href="http://stackoverflow.com/questions/6223285/shortest-python-quine">than the record one</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2012/01/08/quines/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>
	</item>
		<item>
		<title>The fall of the slinky I</title>
		<link>http://mchouza.wordpress.com/2011/11/14/the-fall-of-the-slinky-i/</link>
		<comments>http://mchouza.wordpress.com/2011/11/14/the-fall-of-the-slinky-i/#comments</comments>
		<pubDate>Mon, 14 Nov 2011 01:30:05 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[math]]></category>
		<category><![CDATA[physics]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=2378</guid>
		<description><![CDATA[The video The video shows how a spring suspended from one of its ends reacts when it&#8217;s dropped. It can be observed that the lower end &#8220;doesn&#8217;t know what happened&#8221; until a wave propagates to it. In this post we will make a computer simulation of its behavior, to see if we can reproduce the [...]]]></description>
			<content:encoded><![CDATA[<h5>The video</h5>
<span style="text-align:center; display: block;"><a href="http://mchouza.wordpress.com/2011/11/14/the-fall-of-the-slinky-i/"><img src="http://img.youtube.com/vi/eCMmmEEyOO0/2.jpg" alt="" /></a></span>
<p>The video shows how <a href="http://en.wikipedia.org/wiki/Slinky">a spring</a> suspended from one of its ends reacts when it&#8217;s dropped. It can be observed that the lower end <a href="http://en.wikipedia.org/wiki/Supersonic_speed">&#8220;doesn&#8217;t know what happened&#8221;</a> until <a href="http://en.wikipedia.org/wiki/Longitudinal_wave">a wave</a> propagates to it. In this post we will make a computer simulation of its behavior, to see if we can reproduce the phenomenon, and in the next we will apply <a href="http://en.wikipedia.org/wiki/Wave_equation">a more analytic approach</a> to the same problem.</p>
<h5>Discretization</h5>
<p>As we are studying the actions of gravity and inertia, we need to model the slinky as a <a href="http://en.wikipedia.org/wiki/Spring_(device)">massive spring</a>. In reality the mass and elasticity are <a href="http://en.wikipedia.org/wiki/Continuum_mechanics">continuously distributed</a> throughout the slinky (macroscopically speaking) but, for the purpose of simulating its behavior with a computer model, <a href="http://en.wikipedia.org/wiki/Discretization">we will represent</a> the object as a series of masses connected with massless ideal springs:</p>
<div id="attachment_2496" class="wp-caption aligncenter" style="width: 410px"><a href="http://mchouza.files.wordpress.com/2011/11/slinky_dicretization.png"><img src="http://mchouza.files.wordpress.com/2011/11/slinky_dicretization.png?w=474&#038;h=334" alt="" title="slinky_dicretization" width="474" height="334" class="size-full wp-image-2496" /></a><p class="wp-caption-text">Slinky discretization, showing how the discretized element properties relate to the original ones.</p></div>
<p>If we divide a slinky of total mass <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='M' title='M' class='latex' /> into <img src='http://s0.wp.com/latex.php?latex=N&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='N' title='N' class='latex' /> small masses, each of them will have the value <img src='http://s0.wp.com/latex.php?latex=m+%3D+M%2FN&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='m = M/N' title='m = M/N' class='latex' />. <a href="http://en.wikipedia.org/wiki/Hooke%27s_law#Multiple_springs">As the spring constants of series springs add like the resistances of parallel resistors</a>, if we have a slinky with overall spring constant <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='K' title='K' class='latex' /> divided into <img src='http://s0.wp.com/latex.php?latex=N-1&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='N-1' title='N-1' class='latex' /> smaller springs, each of them will have a bigger spring constant, <img src='http://s0.wp.com/latex.php?latex=k+%3D+K%28N-1%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='k = K(N-1)' title='k = K(N-1)' class='latex' /> and (obviously) a smaller unstressed length, <img src='http://s0.wp.com/latex.php?latex=l_0+%3D+L%2F%28N-1%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='l_0 = L/(N-1)' title='l_0 = L/(N-1)' class='latex' />.</p>
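<p>These relations are easy to check numerically. The following is a minimal sketch for Python 3, where <strong>discretize</strong> and <strong>series_constant</strong> are hypothetical helper names: it computes the per-element values and confirms that the small springs in series recover the overall spring constant.</p>

```python
def discretize(total_mass, overall_k, rest_length, n_masses):
    """Return (m, k, l0) for n_masses masses joined by n_masses - 1 springs."""
    n_springs = n_masses - 1
    m = total_mass / n_masses
    k = overall_k * n_springs
    l0 = rest_length / n_springs
    return m, k, l0


def series_constant(spring_constants):
    """Equivalent constant of springs in series: 1/K = sum(1/k_i)."""
    return 1.0 / sum(1.0 / k for k in spring_constants)


if __name__ == "__main__":
    m, k, l0 = discretize(1.0, 1.0, 0.5, 40)
    print(m, k, l0)
    print(series_constant([k] * 39))
```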
<h5>Writing the simulation</h5>
<p>Now it&#8217;s just a question of applying <a href="http://en.wikipedia.org/wiki/Newton%27s_laws_of_motion">Newton&#8217;s laws of motion</a> and the <a href="http://en.wikipedia.org/wiki/Hooke%27s_law">ideal spring equation</a> to get a system of <a href="http://en.wikipedia.org/wiki/Ordinary_differential_equation">ordinary differential equations</a>:</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cfrac%7Bd%5E2x_i%7D%7Bdt%5E2%7D+%3D+g+%2B+%5Cfrac%7Bk%7D%7Bm%7D%5Cleft%5B%28x_%7Bi%2B1%7D-x_i-l_0%29H%5BN-2-i%5D-%28x_i-x_%7Bi-1%7D-l_0%29H%5Bi-1%5D%5Cright%5D&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle &#92;frac{d^2x_i}{dt^2} = g + &#92;frac{k}{m}&#92;left[(x_{i+1}-x_i-l_0)H[N-2-i]-(x_i-x_{i-1}-l_0)H[i-1]&#92;right]' title='&#92;displaystyle &#92;frac{d^2x_i}{dt^2} = g + &#92;frac{k}{m}&#92;left[(x_{i+1}-x_i-l_0)H[N-2-i]-(x_i-x_{i-1}-l_0)H[i-1]&#92;right]' class='latex' /></p>
<p>where <img src='http://s0.wp.com/latex.php?latex=x_i&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='x_i' title='x_i' class='latex' /> are the coordinates of the masses (with <img src='http://s0.wp.com/latex.php?latex=i&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='i' title='i' class='latex' /> going from 0 to <img src='http://s0.wp.com/latex.php?latex=N-1&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='N-1' title='N-1' class='latex' />), <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='g' title='g' class='latex' /> is the acceleration of gravity, <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='k' title='k' class='latex' /> is the spring constant of each massless spring, <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='m' title='m' class='latex' /> is the value of each small mass, <img src='http://s0.wp.com/latex.php?latex=l_0&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='l_0' title='l_0' class='latex' /> is the unstressed length of each massless spring and <img src='http://s0.wp.com/latex.php?latex=H%5Bn%5D&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='H[n]' title='H[n]' class='latex' /> is the <a href="http://en.wikipedia.org/wiki/Heaviside_step_function#Discrete_form">discrete Heaviside step function</a> (used to avoid depending on undefined values).</p>
<p>This second order system can be reduced to a system of first order <a href="http://en.wikipedia.org/wiki/Ordinary_differential_equation">ODE</a>s <a href="http://en.wikipedia.org/wiki/Ordinary_differential_equation#Reduction_to_a_first_order_system">in the usual way</a>. Integrating it using <a href="http://numericalmethods.eng.usf.edu/topics/runge_kutta_4th_method.html">RK4</a>, we get the following Python code:</p>
<p><pre class="brush: python;">
def get_xdot(x):
    sk = K * (NX - 1)
    sl = L / (NX - 1)
    mm = M / NX
    xdot = [x[NX + i] if i &lt; NX else G
            for i in range(2 * NX)]
    for i in range(NX - 1):
        a = sk * (x[i + 1] - x[i] - sl) / mm
        xdot[NX + i] += a
        xdot[NX + i + 1] -= a
    return xdot

def rk4_step(x, dt):
    k1 = get_xdot(x)
    k2 = get_xdot([x[i] + dt * k1[i] / 2.0 for i in range(len(x))])
    k3 = get_xdot([x[i] + dt * k2[i] / 2.0 for i in range(len(x))])
    k4 = get_xdot([x[i] + dt * k3[i] for i in range(len(x))])
    return [x[i] + dt * (k1[i] + 2.0 * (k2[i] + k3[i]) + k4[i]) / 6.0
            for i in range(len(x))]
</pre></p>
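<p>To see the reduction and the integrator at work on a problem with a known answer, we can try them on a single harmonic oscillator, x'' = -x, whose state is the pair [x, v]. This is only a sketch for Python 3: <strong>oscillator_xdot</strong> and <strong>integrate</strong> are hypothetical names, and this <strong>rk4_step</strong> takes the derivative function as an extra argument, unlike the one above.</p>

```python
import math


def oscillator_xdot(state):
    """x'' = -x written as a first order system: state = [x, v]."""
    x, v = state
    return [v, -x]


def rk4_step(f, x, dt):
    """One classical RK4 step for x' = f(x), same structure as above."""
    n = range(len(x))
    k1 = f(x)
    k2 = f([x[i] + dt * k1[i] / 2.0 for i in n])
    k3 = f([x[i] + dt * k2[i] / 2.0 for i in n])
    k4 = f([x[i] + dt * k3[i] for i in n])
    return [x[i] + dt * (k1[i] + 2.0 * (k2[i] + k3[i]) + k4[i]) / 6.0
            for i in n]


def integrate(f, x0, t_end, steps):
    """Advance x0 from t = 0 to t_end with a fixed number of RK4 steps."""
    x = list(x0)
    dt = t_end / steps
    for _ in range(steps):
        x = rk4_step(f, x, dt)
    return x


if __name__ == "__main__":
    x, v = integrate(oscillator_xdot, [1.0, 0.0], 1.0, 100)
    # Exact solution for these initial conditions: x = cos(t), v = -sin(t).
    print(x - math.cos(1.0), v + math.sin(1.0))
```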
<p>Now we need to define the initial conditions. If we just start with the masses separated by a distance <img src='http://s0.wp.com/latex.php?latex=l_0&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='l_0' title='l_0' class='latex' /> and at rest, we won&#8217;t match the initial conditions of the slinky, because it was being stretched by the action of gravity. It&#8217;s not very difficult to compute initial conditions that leave the system at rest if it&#8217;s being held:</p>
<p><pre class="brush: python;">
def initial_x():
    sl0 = L / (NX - 1)
    mm = M / NX
    sk = K * (NX - 1)
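    w = M - mm  # mass hanging below the current spring
    x = [0.0 for i in range(2 * NX)]
    for i in range(1, NX):
        # each spring stretches by (weight below it) / (spring constant)
        x[i] = x[i - 1] + sl0 + w * G / sk
        w -= mm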
    return x
</pre></p>
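<p>A quick way to convince ourselves that these initial conditions describe a hanging slinky at rest is to compute the acceleration of every mass. The following is a sketch for Python 3, where <strong>accelerations</strong> is a hypothetical helper and the parameter values (with K = 5) are chosen only for illustration. Each spring is stretched by the weight hanging below it divided by its spring constant. Every mass except the top one should feel no net force, and the top one, once nobody holds it, should feel the weight of the whole slinky.</p>

```python
NX = 40   # number of masses
L = 0.5   # slinky length
K = 5.0   # slinky overall spring constant
M = 1.0   # slinky mass
G = 1.0   # gravitational acceleration


def initial_x():
    """Positions of a hanging slinky at rest, followed by zero velocities."""
    sl0 = L / (NX - 1)
    mm = M / NX
    sk = K * (NX - 1)
    w = M - mm  # mass hanging below the first spring
    x = [0.0] * (2 * NX)
    for i in range(1, NX):
        x[i] = x[i - 1] + sl0 + w * G / sk
        w -= mm
    return x


def accelerations(x):
    """Acceleration of every mass when nothing holds the top one."""
    sk = K * (NX - 1)
    sl = L / (NX - 1)
    mm = M / NX
    acc = [G] * NX
    for i in range(NX - 1):
        a = sk * (x[i + 1] - x[i] - sl) / mm
        acc[i] += a
        acc[i + 1] -= a
    return acc


if __name__ == "__main__":
    acc = accelerations(initial_x())
    print(acc[0], max(abs(a) for a in acc[1:]))
```

<p>The top mass starts with an acceleration of N times g, because it carries the force that was supporting all N masses.</p>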
<p>The remaining code is <a href="http://science.martinsewell.com/computer-science.html">just</a> <a href="http://blog.willbenton.com/2009/05/computer-science-and-plumbing/">plumbing</a> and <a href="http://matplotlib.sourceforge.net/">matplotlib</a> presentation code. The whole program <a href="http://code.google.com/p/mchouza/source/browse/trunk/slinky/slinky.py?r=116">can be seen at the repository</a>.</p>
<h5>Running the simulation</h5>
<p>If we run the simulation with the parameters</p>
<p><pre class="brush: plain;">
NT = 1000 # number of timesteps
NX = 40   # number of masses
T = 1.0   # simulation duration

L = 0.5   # slinky length
K = 1.0   # slinky overall spring constant
M = 1.0   # slinky mass
G = 1.0   # gravitational acceleration
</pre></p>
<p>we get the following results:</p>
<div id="attachment_2476" class="wp-caption aligncenter" style="width: 410px"><a href="http://mchouza.files.wordpress.com/2011/11/slinky_soft.png"><img src="http://mchouza.files.wordpress.com/2011/11/slinky_soft.png?w=474&#038;h=355" alt="" title="slinky_soft" width="474" height="355" class="size-full wp-image-2476" /></a><p class="wp-caption-text">A simulation where the springs are too soft, giving some negative spring lengths (and consequent overlapping) near t = 1.</p></div>
<p>In this plot the gray lines represent the trajectory of the small masses and the black lines the trajectory of the slinky&#8217;s ends.</p>
<p>Clearly the springs are too soft and we are getting unphysical results, as the spring lengths go negative when <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='t' title='t' class='latex' /> nears 1. To fix that, let&#8217;s run the simulation with a greater spring constant, <img src='http://s0.wp.com/latex.php?latex=K+%3D+5&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='K = 5' title='K = 5' class='latex' />:</p>
<div id="attachment_2481" class="wp-caption aligncenter" style="width: 410px"><a href="http://mchouza.files.wordpress.com/2011/11/slinky_ok.png"><img src="http://mchouza.files.wordpress.com/2011/11/slinky_ok.png?w=474&#038;h=355" alt="" title="slinky_ok" width="474" height="355" class="size-full wp-image-2481" /></a><p class="wp-caption-text">A simulation where the springs have a physically reasonable constant, giving an intriguing behavior to its ends.</p></div>
<p>Now we get a more reasonable result, showing a phenomenon that is more similar to the one observed in the video: the bottom of the slinky remains in place while the top begins to fall. Now we can check if the slinky remains stationary when held by <img src='http://s0.wp.com/latex.php?latex=m_0&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='m_0' title='m_0' class='latex' />:</p>
<p><pre class="brush: python;">
def get_xdot(x):
    sk = K * (NX - 1)
    sl = L / (NX - 1)
    mm = M / NX
    xdot = [x[NX + i] if i &lt; NX else G
            for i in range(2 * NX)]
    for i in range(NX - 1):
        a = sk * (x[i + 1] - x[i] - sl) / mm
        xdot[NX + i] += a
        xdot[NX + i + 1] -= a
    xdot[NX] = 0.0 # holding the slinky
    return xdot
</pre></p>
<div id="attachment_2483" class="wp-caption aligncenter" style="width: 410px"><a href="http://mchouza.files.wordpress.com/2011/11/slinky_held.png"><img src="http://mchouza.files.wordpress.com/2011/11/slinky_held.png?w=474&#038;h=355" alt="" title="slinky_held" width="474" height="355" class="size-full wp-image-2483" /></a><p class="wp-caption-text">Simulation to check if the slinky remains stationary when held from its upper end.</p></div>
<h5>Conclusions</h5>
<p>This simulation validates the main point of the original video: the lower end &#8220;doesn&#8217;t know&#8221; the upper end was released until a compression wave reaches it, at <img src='http://s0.wp.com/latex.php?latex=t+%5Capprox+0.45+&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='t &#92;approx 0.45 ' title='t &#92;approx 0.45 ' class='latex' /> in our simulation. But the detailed behavior differs, as the slinky only shows a compression wave once it reaches the nonlinear regime (when there is no more space between the coils).</p>
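<p>The arrival time can also be read off without a plot. The following is a minimal sketch for Python 3 that reuses the functions of this post with K = 5. <strong>bottom_arrival_time</strong> and its <strong>threshold</strong> are hypothetical, and the value it returns depends on that arbitrary threshold, so it should only be compared loosely with the time quoted above. For reference, the transit time of a wave along a uniform spring is the square root of M/K, about 0.447 for these parameters.</p>

```python
NT = 1000  # number of timesteps
NX = 40    # number of masses
T = 1.0    # simulation duration
L = 0.5    # slinky length
K = 5.0    # slinky overall spring constant
M = 1.0    # slinky mass
G = 1.0    # gravitational acceleration


def get_xdot(x):
    """Derivative of the state [positions..., velocities...]."""
    sk = K * (NX - 1)
    sl = L / (NX - 1)
    mm = M / NX
    xdot = [x[NX + i] if i < NX else G for i in range(2 * NX)]
    for i in range(NX - 1):
        a = sk * (x[i + 1] - x[i] - sl) / mm
        xdot[NX + i] += a
        xdot[NX + i + 1] -= a
    return xdot


def rk4_step(x, dt):
    """One classical RK4 step of the whole chain."""
    n = range(len(x))
    k1 = get_xdot(x)
    k2 = get_xdot([x[i] + dt * k1[i] / 2.0 for i in n])
    k3 = get_xdot([x[i] + dt * k2[i] / 2.0 for i in n])
    k4 = get_xdot([x[i] + dt * k3[i] for i in n])
    return [x[i] + dt * (k1[i] + 2.0 * (k2[i] + k3[i]) + k4[i]) / 6.0
            for i in n]


def initial_x():
    """Hanging slinky at rest: stretched positions, zero velocities."""
    sl0 = L / (NX - 1)
    mm = M / NX
    sk = K * (NX - 1)
    w = M - mm
    x = [0.0] * (2 * NX)
    for i in range(1, NX):
        x[i] = x[i - 1] + sl0 + w * G / sk
        w -= mm
    return x


def bottom_arrival_time(threshold=1e-3):
    """First time at which the bottom mass has moved more than threshold."""
    x = initial_x()
    start = x[NX - 1]
    dt = T / NT
    for step in range(1, NT + 1):
        x = rk4_step(x, dt)
        if abs(x[NX - 1] - start) > threshold:
            return step * dt
    return None


if __name__ == "__main__":
    print(bottom_arrival_time())
```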
<p>In the next post we will show an analysis of this nonlinear behavior and the analytical solution to the idealized slinky drop that was numerically simulated in this post.</p>
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2011/11/14/the-fall-of-the-slinky-i/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/11/slinky_dicretization.png" medium="image">
			<media:title type="html">slinky_dicretization</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/11/slinky_soft.png" medium="image">
			<media:title type="html">slinky_soft</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/11/slinky_ok.png" medium="image">
			<media:title type="html">slinky_ok</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/11/slinky_held.png" medium="image">
			<media:title type="html">slinky_held</media:title>
		</media:content>
	</item>
		<item>
		<title>GPGPU with WebGL: simulating fluids (trailer)</title>
		<link>http://mchouza.wordpress.com/2011/08/29/gpgpu-with-webgl-simulating-fluids-trailer/</link>
		<comments>http://mchouza.wordpress.com/2011/08/29/gpgpu-with-webgl-simulating-fluids-trailer/#comments</comments>
		<pubDate>Mon, 29 Aug 2011 03:26:03 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[CFD]]></category>
		<category><![CDATA[physics]]></category>
		<category><![CDATA[webgl]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=2198</guid>
		<description><![CDATA[In a previous post we solved Laplace&#8217;s Equation using WebGL. We will see how to implement the Lattice Boltzmann algorithm using WebGL shaders in the next post, but this post has a preview of the solution: Video of the simulation (best seen in 720!):<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mchouza.wordpress.com&amp;blog=6184898&amp;post=2198&amp;subd=mchouza&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://mchouza.wordpress.com/2011/02/21/gpgpu-with-webgl-solving-laplaces-equation/">In a previous post we solved Laplace&#8217;s Equation using WebGL</a>. We will see how to implement the <a href="http://en.wikipedia.org/wiki/Lattice_Boltzmann_methods">Lattice Boltzmann algorithm</a> using WebGL shaders in the next post, but this post has a preview of the solution:</p>
<div id="attachment_2204" class="wp-caption aligncenter" style="width: 505px"><a href="http://mchouza.googlecode.com/svn-history/r114/trunk/webgl-proc/lb-fluid-sim/lb_fluid_sim.html"><img src="http://mchouza.files.wordpress.com/2011/08/lb-demo.png?w=474" alt="" title="lb-demo"   class="size-full wp-image-2204" /></a><p class="wp-caption-text">Click on the image to go to the demo. New obstacles can be created by dragging the mouse over the simulation area.</p></div>
<p><strong>Video of the simulation (best seen in 720!):</strong></p>
<span style="text-align:center; display: block;"><a href="http://mchouza.wordpress.com/2011/08/29/gpgpu-with-webgl-simulating-fluids-trailer/"><img src="http://img.youtube.com/vi/zju2FR1wueo/2.jpg" alt="" /></a></span>
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2011/08/29/gpgpu-with-webgl-simulating-fluids-trailer/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/08/lb-demo.png" medium="image">
			<media:title type="html">lb-demo</media:title>
		</media:content>
	</item>
		<item>
		<title>Top k &#8211; Solution</title>
		<link>http://mchouza.wordpress.com/2011/07/11/top-k-solution/</link>
		<comments>http://mchouza.wordpress.com/2011/07/11/top-k-solution/#comments</comments>
		<pubDate>Mon, 11 Jul 2011 16:59:16 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[C/C++]]></category>
		<category><![CDATA[cs]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=2108</guid>
		<description><![CDATA[[Answer to the problem in Top k.] This was the problem: Given a list of n numbers, get the top k ones in O(n) time. (Partial credit for o(n log n) solutions.) n log k solution The easiest way to solve this problem in o(n log n) time is to scan the n elements, keeping [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mchouza.wordpress.com&amp;blog=6184898&amp;post=2108&amp;subd=mchouza&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><strong>[Answer to the problem in <a href="http://mchouza.wordpress.com/2011/06/20/top-k/"><em>Top k</em></a>.]</strong></p>
<p>This was the problem:</p>
<blockquote><p>Given a list of <em>n</em> numbers, get the top <em>k</em> ones in O(<em>n</em>) time. (Partial credit for o(<em>n</em> log <em>n</em>) solutions.)</p></blockquote>
<h5><em>n</em> log <em>k</em> solution</h5>
<p>The easiest way to solve this problem in o(<em>n</em> log <em>n</em>) time is to scan the <em>n</em> elements, keeping track of the greatest <em>k</em> elements seen up to the current element. Using a <a href="http://en.wikipedia.org/wiki/Heap_(data_structure)">heap</a> to store the top <em>k</em> elements, we get the following algorithm:</p>
<p><pre class="brush: cpp;">
#define SWAP(type, a, b) {type swap_temp = a; a = b; b = swap_temp;}

static void minheap_pushdown(int *h, size_t hs, size_t i)
{
    size_t j = 0;
    if (2 * i + 2 &lt; hs)
        j = (h[2 * i + 1] &lt; h[2 * i + 2]) ? 2 * i + 1 : 2 * i + 2;
    else if (2 * i + 1 &lt; hs)
        j = 2 * i + 1;
    if (j != 0 &amp;&amp; h[j] &lt; h[i])
    {
        SWAP(int, h[i], h[j]);
        minheap_pushdown(h, hs, j);
    }
}

static void minheap_raise(int *h, size_t i)
{
    if (i == 0)
        return;
    if (h[i] &lt; h[(i - 1) / 2])
    {
        SWAP(int, h[i], h[(i - 1) / 2]);
        minheap_raise(h, (i - 1) / 2);
    }
}

void top_k_nlogk_cp(int *top_k, size_t k, const int *l, size_t n)
{
    size_t i;

    for (i = 0; i &lt; k; i++)
    {
        top_k[i] = l[i];
        minheap_raise(top_k, i);
    }

    for (i = k; i &lt; n; i++)
    {
        if (l[i] &gt; top_k[0])
        {
            top_k[0] = l[i];
            minheap_pushdown(top_k, k, 0);
        }
    }
}
</pre></p>
<p>As <em>minheap_pushdown()</em> and <em>minheap_raise()</em> have O(log <em>k</em>) cost, the total cost of this solution will be O(<em>n</em> log <em>k</em>). This is asymptotically better than O(<em>n</em> log <em>n</em>) if <em>k</em> grows slower than any positive power of <em>n</em> (if <em>k</em> is constant or O(log <em>n</em>), for example).</p>
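<p>For comparison, the same scan can be sketched in Python with the standard <em>heapq</em> module. This is only an illustrative sketch, not the benchmarked code: the name <em>top_k_heap</em> is made up here, and it assumes 0 &#8804; <em>k</em> &#8804; <em>n</em>.</p>

```python
import heapq

def top_k_heap(items, k):
    """Return the k greatest items (in no particular order) in O(n log k)."""
    if k == 0:
        return []
    heap = list(items[:k])      # current candidates, kept as a min-heap
    heapq.heapify(heap)
    for x in items[k:]:
        # heap[0] is the smallest of the current candidates.
        if x > heap[0]:
            heapq.heapreplace(heap, x)   # pop the minimum, push x: O(log k)
    return heap

print(sorted(top_k_heap([5, 1, 9, 3, 7, 8, 2], 3)))  # [7, 8, 9]
```

<p>Each element costs at most one O(log <em>k</em>) heap operation, which is where the O(<em>n</em> log <em>k</em>) total comes from.</p>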
<p>The previous solution is enough to get partial credit, but we can do better <img src='http://s0.wp.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':-D' class='wp-smiley' /> </p>
<h5>Linear time solution</h5>
<p><a href="http://en.wikipedia.org/wiki/Quicksort">Quicksort</a> is based on a linear-time partition operation. This operation starts with an array and an element of this array, called the pivot, and ends with a partially sorted array that has the pivot in its sorted position, smaller elements before it and bigger ones after it. A single partition operation would then suffice&#8230; if we could guarantee that the chosen pivot is the (<em>n</em>-<em>k</em>+1)-th element of the sorted array <img src='http://s0.wp.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':-D' class='wp-smiley' /> </p>
<p>We cannot do that, but we can do something that is asymptotically equivalent: doing a quicksort-style recursion, though only in the partition that contains the desired pivot. The result is this (converting the <a href="http://en.wikipedia.org/wiki/Tail_call">tail recursion</a> to iteration):</p>
<p><pre class="brush: cpp;">
int *top_k_n_ip(int *l, size_t n, size_t k)
{
    size_t lo, hi, pos, i, j;
    int pivot;
   
    lo = 0;
    hi = n;
    pos = n - k;

    while (hi - lo &gt; 1)
    {
        i = lo + 1;
        j = hi - 1;
        pivot = l[lo];

        while (1)
        {
            while (i &lt; hi &amp;&amp; l[i] &lt;= pivot)
                i++;
            while (j &gt; lo &amp;&amp; l[j] &gt; pivot)
                j--;
            if (i &gt; j)
                break;
            SWAP(int, l[i], l[j]);
        }

        SWAP(int, l[lo], l[j]);

        if (j &lt; pos)
            lo = j + 1;
        else if (j &gt; pos)
            hi = j;
        else
            break;
    }

    return l + n - k;
}
</pre></p>
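<p>The idea of recursing only into the partition that contains the desired position may be easier to follow in a compact Python sketch. It differs from the C code above in two ways that are assumptions of this sketch: it builds new lists instead of working in place, and it picks the pivot at random instead of taking the first element. The name <em>top_k_quickselect</em> is made up for the example.</p>

```python
import random

def top_k_quickselect(items, k):
    """Return the k greatest items, in no particular order (expected O(n))."""
    if k == 0:
        return []
    candidates = list(items)
    result = []
    # Invariant: the answer is `result` plus the top (k - len(result))
    # elements of `candidates`.
    while True:
        need = k - len(result)
        if need >= len(candidates):
            return result + candidates
        pivot = random.choice(candidates)
        greater = [x for x in candidates if x > pivot]
        equal = [x for x in candidates if x == pivot]
        smaller = [x for x in candidates if pivot > x]
        if len(greater) >= need:
            candidates = greater          # everything we need is above the pivot
        elif len(greater) + len(equal) >= need:
            return result + greater + equal[:need - len(greater)]
        else:
            result += greater + equal     # keep these, continue below the pivot
            candidates = smaller

print(sorted(top_k_quickselect([5, 1, 9, 3, 7, 8, 2], 3)))  # [7, 8, 9]
```

<p>Only one of the two sides of each partition is ever examined again, which is what separates this from a full quicksort.</p>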
<p>Now we need to check that this gives us a linear-time algorithm. Each complete run of the <code>while (1)</code> loop takes a time bounded from above by <em>C</em> times <em>hi</em> &#8211; <em>lo</em>, where <em>C</em> is a constant. If we assume that the chosen pivot divides the <em>l</em> [ <em>lo</em> .. <em>hi</em> ] array section into two roughly equal parts (this assumption can be made rigorous by <a href="http://www.ics.uci.edu/~eppstein/161/960130.html">choosing the pivot carefully</a>), we get</p>
<p><img src='http://s0.wp.com/latex.php?latex=T%28n%29+%5Cleq+A+%2B+%28B+%2B+C+%5Ccdot+n%29+%2B+%28B+%2B+C+%5Ccdot+n%2F2%29+%2B+%5Chdots+%2B+%28B+%2B+C+%5Ccdot+1%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='T(n) &#92;leq A + (B + C &#92;cdot n) + (B + C &#92;cdot n/2) + &#92;hdots + (B + C &#92;cdot 1)' title='T(n) &#92;leq A + (B + C &#92;cdot n) + (B + C &#92;cdot n/2) + &#92;hdots + (B + C &#92;cdot 1)' class='latex' /></p>
<p>where <em>A</em>, <em>B</em> and <em>C</em> are constants.</p>
<p>Summing all the terms containing <em>B</em> and taking out all the <em>C</em> factors we get:</p>
<p><img src='http://s0.wp.com/latex.php?latex=T%28n%29+%5Cleq+A+%2B+%5Clceil%5Clog_2+n%5Crceil+B+%2B+C+%281+%2B+%5Chdots+%2B+n%2F2+%2B+n%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='T(n) &#92;leq A + &#92;lceil&#92;log_2 n&#92;rceil B + C (1 + &#92;hdots + n/2 + n)' title='T(n) &#92;leq A + &#92;lceil&#92;log_2 n&#92;rceil B + C (1 + &#92;hdots + n/2 + n)' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=T%28n%29+%5Cleq+A+%2B+%5Clceil%5Clog_2+n%5Crceil+B+%2B+3+C+n&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='T(n) &#92;leq A + &#92;lceil&#92;log_2 n&#92;rceil B + 3 C n' title='T(n) &#92;leq A + &#92;lceil&#92;log_2 n&#92;rceil B + 3 C n' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=T%28n%29+%3D+O%28n%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='T(n) = O(n)' title='T(n) = O(n)' class='latex' /></p>
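<p>The step that bounds the sum 1 + &#8230; + <em>n</em>/2 + <em>n</em> by a multiple of <em>n</em> can be checked numerically. The sketch below assumes integer halving, which is one possible reading of the idealized "two roughly equal parts" assumption; <em>halving_sum</em> is a name made up for the example.</p>

```python
def halving_sum(n):
    """Sum n + n/2 + n/4 + ... down to 1, using integer halving."""
    total = 0
    while n >= 1:
        total += n
        n //= 2
    return total

# Print n, the sum, and the ratio sum / n for a few sizes.
for n in (10, 1000, 10**6):
    print(n, halving_sum(n), halving_sum(n) / n)
```

<p>The ratio stays below 2, so the factor 3 used in the bound leaves some slack.</p>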
<p>As we know that any correct algorithm must take <img src='http://s0.wp.com/latex.php?latex=%5COmega%28n%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;Omega(n)' title='&#92;Omega(n)' class='latex' /> time (it needs to inspect all elements), we also know that this algorithm has <img src='http://s0.wp.com/latex.php?latex=%5CTheta%28n%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;Theta(n)' title='&#92;Theta(n)' class='latex' /> time complexity. Now let&#8217;s check that against reality <img src='http://s0.wp.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':-D' class='wp-smiley' /> </p>
<h5>Benchmarks</h5>
<p>We will <a href="http://code.google.com/p/mchouza/source/browse/?r=106#svn%2Ftrunk%2Ftop_k_algs">benchmark</a> the following algorithm implementations:</p>
<ul>
<li><strong><a href="http://code.google.com/p/mchouza/source/browse/trunk/top_k_algs/top_k.c?r=106#38"><em>n</em> log <em>n</em> (copying)</a>:</strong> it just sorts a copy of the input array and copies the top <em>k</em> elements.</li>
<li><strong><a href="http://code.google.com/p/mchouza/source/browse/trunk/top_k_algs/top_k.c?r=106#51"><em>n</em> log <em>k</em> (copying)</a>:</strong> scans the input array, keeping the top <em>k</em> elements in a heap.</li>
<li><strong><a href="http://code.google.com/p/mchouza/source/browse/trunk/top_k_algs/top_k.c?r=106#71"><em>n</em> (copying)</a>:</strong> applies the O(<em>n</em>) in-place algorithm over a copy of the input array.</li>
<li><strong><a href="http://code.google.com/p/mchouza/source/browse/trunk/top_k_algs/top_k.c?r=106#85"><em>n</em> log <em>n</em> (in-place)</a>:</strong> sorts the input array in-place and returns a pointer to the top <em>k</em> elements.</li>
<li><strong><a href="http://code.google.com/p/mchouza/source/browse/trunk/top_k_algs/top_k.c?r=106#91"><em>n</em> (in-place)</a>:</strong> applies the O(<em>n</em>) quicksort-derived algorithm that was described in the previous section.</li>
<li><strong><a href="http://code.google.com/p/mchouza/source/browse/trunk/top_k_algs/top_k_cc.cpp?r=106#5"><em>n</em> (in-place, C++)</a>:</strong> <a href="https://mmack.wordpress.com/">Miles Macklin</a>&#8216;s <a href="http://mchouza.wordpress.com/2011/06/20/top-k/#comment-176">solution</a>, just calling <em><a href="http://stdcxx.apache.org/doc/stdlibref/nth-element.html">std::nth_element()</a></em> <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </li>
</ul>
<div id="attachment_2133" class="wp-caption aligncenter" style="width: 505px"><a href="http://mchouza.files.wordpress.com/2011/07/k30f_all.png"><img src="http://mchouza.files.wordpress.com/2011/07/k30f_all.png?w=474" alt="" title="k30f_all"   class="size-full wp-image-2133" /></a><p class="wp-caption-text">Execution times for all the algorithms when <em>k</em> is fixed at 30.</p></div>
<p>As <em>k</em> has a fixed value in the previous plot, the O(<em>n</em> log <em>k</em>) algorithm is effectively an O(<em>n</em>) algorithm. In fact, it performs better than the truly linear algorithms, probably due to cache effects.</p>
<p>These plots also show how hard it is to tell <em>n</em> from <em>n</em> log <em>n</em> in a log-log plot, but <a href="http://www.wolframalpha.com/input/?i=log+plot+|+{10^n+log+%2810^n%29%2C+10^n}+|+n+%3D+2+to++7">that is something to be expected</a>.</p>
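<p>One way to see why: in a log-log plot a function shows up through its slope, and the local slope of <em>n</em> log <em>n</em> is 1 + 1/ln <em>n</em>, which approaches the slope 1 of a linear function as <em>n</em> grows. A small sketch that measures the slope over one decade (the helper names are made up for the example):</p>

```python
import math

def loglog_slope(f, n, factor=10.0):
    """Slope of f between n and factor * n, as seen in a log-log plot."""
    return (math.log(f(factor * n)) - math.log(f(n))) / math.log(factor)

def linear(n):
    return n

def linearithmic(n):
    return n * math.log(n)

# Print n followed by the measured slopes of n and of n log n.
for n in (1e2, 1e4, 1e6):
    print(n, loglog_slope(linear, n), loglog_slope(linearithmic, n))
```

<p>For the input sizes in these benchmarks the two slopes differ by well under 0.1 at the large end, which is hard to notice by eye.</p>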
<div id="attachment_2140" class="wp-caption aligncenter" style="width: 505px"><a href="http://mchouza.files.wordpress.com/2011/07/k30p_all.png"><img src="http://mchouza.files.wordpress.com/2011/07/k30p_all.png?w=474" alt="" title="k30p_all"   class="size-full wp-image-2140" /></a><p class="wp-caption-text">Execution times for all the algorithms when <em>k</em> is 30% of <em>n</em>.</p></div>
<p>Here the qualitative behavior of the O(<em>n</em> log <em>k</em>) solution <a href="http://www.wolframalpha.com/input/?i=log+plot+|+{10^n+log+%2810^n%29%2C+10^n%2C+10^n+log+%2810^%280.3+n%29%29}+|+n+%3D+2+to++7">is the expected one</a>. The C implementation of the O(<em>n</em>) algorithm looks surprisingly good, performing competitively against the C++ implementation included in the standard library.</p>
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2011/07/11/top-k-solution/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/07/k30f_all.png" medium="image">
			<media:title type="html">k30f_all</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/07/k30p_all.png" medium="image">
			<media:title type="html">k30p_all</media:title>
		</media:content>
	</item>
		<item>
		<title>Top k</title>
		<link>http://mchouza.wordpress.com/2011/06/20/top-k/</link>
		<comments>http://mchouza.wordpress.com/2011/06/20/top-k/#comments</comments>
		<pubDate>Mon, 20 Jun 2011 19:54:19 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[cs]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=2097</guid>
		<description><![CDATA[Given a list of n numbers, get the top k ones in O(n) time. This problem is too easy to solve in O(n log n) time but o(n log n) solutions can be presented for partial credit.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mchouza.wordpress.com&amp;blog=6184898&amp;post=2097&amp;subd=mchouza&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<blockquote><p>
Given a list of <em>n</em> numbers, get the top <em>k</em> ones in O(<em>n</em>) time.
</p></blockquote>
<p>This problem is too easy to solve in O(<em>n</em> log <em>n</em>) time</p>
<p><pre class="brush: python;">
def top_k_n_log_n(l, k):
    sl = sorted(l)
    return sl[-k:]
</pre></p>
<p>but <a href="http://en.wikipedia.org/wiki/Big_O_notation#Family_of_Bachmann.E2.80.93Landau_notations">o(<em>n</em> log <em>n</em>)</a> solutions can be presented for partial credit. <img src='http://s2.wp.com/wp-includes/images/smilies/icon_razz.gif' alt=':-P' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2011/06/20/top-k/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>
	</item>
		<item>
		<title>Solving the &#8220;product, sum &amp; difference riddle&#8221;</title>
		<link>http://mchouza.wordpress.com/2011/05/31/solving-the-product-sum-difference-riddle/</link>
		<comments>http://mchouza.wordpress.com/2011/05/31/solving-the-product-sum-difference-riddle/#comments</comments>
		<pubDate>Tue, 31 May 2011 22:53:05 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[math]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=2070</guid>
		<description><![CDATA[A post in the scala-user group (via Mauro Ciancio) asked the following problem: there are 3 people. let&#8217;s name them peter, simon and daniel. the three are supposed to figure out a pair of numbes. the possible pairs are all combinations of numbers from 0 to 1000, meaning: (0,0), (1,0), (2,0)&#8230;.(1000,0),(1,1),(1,2) up to (1000,1000) peter [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mchouza.wordpress.com&amp;blog=6184898&amp;post=2070&amp;subd=mchouza&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://groups.google.com/group/scala-user/msg/282289c9998b1979?dmode=source">A post</a> in the <a href="http://groups.google.com/group/scala-user">scala-user group</a> (<a href="http://twitter.com/#!/maurociancio/status/74604232854081536">via</a> <a href="http://maurociancio.wordpress.com/">Mauro Ciancio</a>) asked the following problem:</p>
<blockquote><p>
there are 3 people. let&#8217;s name them peter, simon and daniel. the three<br />
are supposed to figure out a pair of numbes. the possible pairs are all<br />
combinations of numbers from 0 to 1000, meaning:<br />
(0,0), (1,0), (2,0)&#8230;.(1000,0),(1,1),(1,2) up to (1000,1000)<br />
peter knows the product of the pair, simon knows the sum, and daniel<br />
knows the difference.</p>
<p>the following conversation takes places:</p>
<p>peter: i don&#8217;t know the solution<br />
simon: i already knew that<br />
peter: know i know the solution<br />
simon: know i know it, too<br />
daniel: wtf? i can only suspect one of the numbers but can&#8217;t be sure<br />
peter: the number you are suspecting is wrong<br />
daniel: k, now i now the numbers, too
</p></blockquote>
<p>As this riddle is a variation on the product &amp; sum riddle, giving us an additional protagonist and a different sequence of questions and answers, we can try to adapt <a href="http://mchouza.wordpress.com/2010/09/06/solving-the-product-sum-riddle/">our previous solution</a> to this new problem.</p>
<blockquote><p>the possible pairs are all<br />
combinations of numbers from 0 to 1000, meaning:<br />
(0,0), (1,0), (2,0)&#8230;.(1000,0),(1,1),(1,2) up to (1000,1000)</p></blockquote>
<p>This section of the riddle isn&#8217;t very clear, but we can tentatively translate it to Python as:</p>
<p><pre class="brush: python;">
all_pairs = [(i, j) for i in range(1000+1) for j in range(i, 1000+1)]
</pre></p>
<blockquote><p>peter knows the product of the pair, simon knows the sum, and daniel<br />
knows the difference.</p>
<p>peter: i don’t know the solution<br />
simon: i already knew that<br />
peter: know i know the solution<br />
simon: know i know it, too</p></blockquote>
<p>The following four sections of the riddle are identical to those of the &#8220;product &amp; sum&#8221; riddle, so they can be solved using essentially the same code (read <a href="http://mchouza.wordpress.com/2010/09/06/solving-the-product-sum-riddle/">the previous solution</a> for the detailed explanation!):</p>
<p><pre class="brush: python;">
def make_freq_table(it, initial_freqs=None):
    freq_table = {} if initial_freqs is None else dict(initial_freqs)
    for e in it:
        if e not in freq_table:
            freq_table[e] = 0
        freq_table[e] += 1
    return freq_table

# peter: i don't know the solution
num_pairs_by_prod = make_freq_table(x * y for x, y in all_pairs)
pairs_1 = [(x, y) for x, y in all_pairs if num_pairs_by_prod[x * y] &gt; 1]

# simon: i already knew that
identif_by_prod_pairs_sums = set(x + y for x, y in all_pairs
                                 if num_pairs_by_prod[x * y] == 1)
pairs_2 = [(x, y) for x, y in pairs_1
           if x + y not in identif_by_prod_pairs_sums]

# peter: know i know the solution
num_pairs_by_prod_2 = make_freq_table(x * y for x, y in pairs_2)
pairs_3 = [(x, y) for x, y in pairs_2 if num_pairs_by_prod_2[x * y] == 1]

# simon: know i know it, too
num_pairs_by_sum_3 = make_freq_table(x + y for x, y in pairs_3)
pairs_4 = [(x, y) for x, y in pairs_3 if num_pairs_by_sum_3[x + y] == 1]
</pre></p>
<blockquote><p>daniel: wtf? i can only suspect one of the numbers but can&#8217;t be sure</p></blockquote>
<p>This tells us that, based on the information he has available, Daniel is unable to choose one solution. Furthermore, there is one number that appears more frequently than the others in the pairs he is considering. As Daniel is aware of the information provided by Peter and Simon, we can start by grouping the pairs that are still candidates after their statements by their difference:</p>
<p><pre class="brush: python;">
def get_pairs_by_diff(pairs):
    pairs_by_diff = {}
    for x, y in pairs:
        if x - y not in pairs_by_diff:
            pairs_by_diff[x - y] = []
        pairs_by_diff[x - y].append((x, y))
    return pairs_by_diff
pairs_by_diff_4 = get_pairs_by_diff(pairs_4)
</pre></p>
<p>The fact that Daniel is still unsure indicates that, whatever the value of the difference might be, there must be more than one pair associated with it. And the fact he has one &#8220;suspect number&#8221; indicates that one number appears more often than the others (in other words, there is <em>only one</em> <a href="http://en.wikipedia.org/wiki/Mode_(statistics)">modal value</a>). Translating this reasoning to code:</p>
<p><pre class="brush: python;">
def get_modal_elems(pairs):
    ft = make_freq_table((p[1] for p in pairs),
                         make_freq_table(p[0] for p in pairs))
    max_freq = max(ft.values())
    return [e for e in ft if ft[e] == max_freq]
pairs_5 = [(x, y) for x, y in pairs_4
           if len(pairs_by_diff_4[x - y]) &gt; 1 and\
           len(get_modal_elems(pairs_by_diff_4[x - y])) == 1]
</pre></p>
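<p>To make the "only one modal value" condition concrete, here is a small standalone sketch on made-up candidate lists (the pairs are hypothetical and have nothing to do with the riddle's actual candidates). It uses <em>collections.Counter</em> instead of <em>make_freq_table()</em> and returns the modes sorted, but counts the numbers in the same way:</p>

```python
from collections import Counter

def modal_elems(pairs):
    """Return the number(s) that occur most often across all the pairs."""
    counts = Counter(n for pair in pairs for n in pair)
    max_freq = max(counts.values())
    return sorted(n for n, c in counts.items() if c == max_freq)

# Hypothetical candidate lists for a single difference value (made-up data).
print(modal_elems([(1, 4), (4, 7), (7, 10)]))  # [4, 7]: two modes, no single suspect
print(modal_elems([(1, 4), (4, 7)]))           # [4]: one mode, 4 is the suspect
```

<p>Under the interpretation used in this post, only the second kind of list is compatible with Daniel&#8217;s statement.</p>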
<blockquote><p>peter: the number you are suspecting is wrong</p></blockquote>
<p>This means that, of the previously selected pairs, only those that don&#8217;t have the &#8220;suspect number&#8221; need to be considered. First we get the &#8220;suspect number&#8221; for each difference (we know that there is exactly one, as we have already selected by this criterion):</p>
<p><pre class="brush: python;">
pairs_by_diff_5 = get_pairs_by_diff(pairs_5)
susp_num_by_diff_5 = dict((d, get_modal_elems(p)[0])
                          for d, p in pairs_by_diff_5.items())
</pre></p>
<p>Now, based on Peter&#8217;s information, we know we can remove the pairs containing the &#8220;suspect number&#8221; associated with their difference. Doing that:</p>
<p><pre class="brush: python;">
pairs_6 = [(x, y) for x, y in pairs_5
           if susp_num_by_diff_5[x - y] not in (x, y)]
</pre></p>
<blockquote><p>daniel: k, now i now the numbers, too</p></blockquote>
<p>As Daniel now knows the answer, we know that there is only one pair associated with the difference value known to him. Selecting these pairs:</p>
<p><pre class="brush: python;">
num_pairs_by_diff_6 = make_freq_table(x - y for x, y in pairs_6)
pairs_7 = [(x, y) for x, y in pairs_6 if num_pairs_by_diff_6[x - y] == 1]
</pre></p>
<p>Finally, if the riddle is uniquely answerable, we can get the pair items now:</p>
<p><pre class="brush: python;">
assert len(pairs_7) == 1
print(pairs_7[0])
</pre></p>
<p><a href="http://www.ideone.com/Rc9OG">Putting all the code together and running it</a>, we do indeed obtain a single pair: <strong>(64, 73)</strong>.</p>
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2011/05/31/solving-the-product-sum-difference-riddle/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>
	</item>
		<item>
		<title>Searching for bits in C</title>
		<link>http://mchouza.wordpress.com/2011/05/30/searching-for-bits-in-c/</link>
		<comments>http://mchouza.wordpress.com/2011/05/30/searching-for-bits-in-c/#comments</comments>
		<pubDate>Mon, 30 May 2011 01:29:55 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[C/C++]]></category>
		<category><![CDATA[cs]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=2026</guid>
		<description><![CDATA[Some years ago, a friend asked me how to (efficiently) get the position of some bit with value 1 inside an (unsigned) integer. For example, taking 32 bit integers and taking the least significant bit (LSB) as the bit 0, these are valid answers: If we reduce the problem to a more specific one, finding [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mchouza.wordpress.com&amp;blog=6184898&amp;post=2026&amp;subd=mchouza&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Some years ago, a friend asked me how to (efficiently) get the position of some bit with value 1 inside an (<a href="http://www.cppreference.com/wiki/language/types">unsigned</a>) integer. For example, taking 32 bit integers and taking the least significant bit (LSB) as the bit 0, these are valid answers:</p>
<p><pre class="brush: plain;">
0xa9e7da24 --&gt; 5
0x1d56b8b0 --&gt; 28
0x9459ffbb --&gt; 0
0x9f0c2a38 --&gt; 24
</pre></p>
<p>If we reduce the problem to a more specific one, finding the <em>lowest</em> bit with value 1 inside an integer, it&#8217;s easy to <a href="http://www.ideone.com/Ieg4F">get a solution</a> just by doing a bitwise AND against a moving mask:</p>
<p><pre class="brush: cpp;">
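#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Returns the position of the lowest set bit, or 32 if a == 0. */
size_t bitscan(uint32_t a)
{
  size_t i;
  uint32_t mask = 1;
  /* Slide a one bit mask upwards until it hits a set bit. */
  for (i = 0; i &lt; 32 &amp;&amp; !(a &amp; mask); i++)
    mask &lt;&lt;= 1;
  return i;
}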
</pre></p>
<p>but it&#8217;s relatively slow. Can we do it faster?</p>
<p>The answer is clearly yes if we are using x86 assembly, as this instruction set has a specific instruction for this purpose: <a href="http://web.itu.edu.tr/kesgin/mul06/intel/instr/bsf.html">Bit Scan Forward (BSF)</a>. <a href="http://graphics.stanford.edu/~seander/bithacks.html">But, even in C, we can do better by a three step process</a>:</p>
<ul>
<li>First we isolate the least significant bit <em>that is set</em> by using: <em>a</em> &amp; (-<em>a</em>).</li>
<li>Then we take the residue of dividing this power of two by 37.</li>
<li>Finally, we look up the associated bit position, using the residue obtained in the previous step as an index.</li>
</ul>
<p>Why does the first step work? We can start with zero: doing a bitwise AND of 0 against -0 gives us just zero. As zero doesn&#8217;t have any 1 bits to start with, we can say we&#8217;ve &#8220;isolated&#8221; the least significant one, though in a <a href="http://www.cut-the-knot.org/do_you_know/falsity.shtml">somewhat vacuous sense</a>.</p>
<p>To check the nonzero integers, let&#8217;s define the bit expansion of an unsigned integer <em>a</em>:</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+a+%3D+%5Csum_%7Bi%3D0%7D%5EN+a_i%5C%2C2%5Ei&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle a = &#92;sum_{i=0}^N a_i&#92;,2^i' title='&#92;displaystyle a = &#92;sum_{i=0}^N a_i&#92;,2^i' class='latex' /></p>
<p>Then the result of computing -<em>a</em> (remember we are dealing with unsigned integers, so the arithmetic is done modulo 2<sup><em>N</em>+1</sup>) will be:</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+-a+%3D+%5Csum_%7Bi%3D0%7D%5EN+%281-a_i%29%5C%2C2%5Ei+%2B+1&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle -a = &#92;sum_{i=0}^N (1-a_i)&#92;,2^i + 1' title='&#92;displaystyle -a = &#92;sum_{i=0}^N (1-a_i)&#92;,2^i + 1' class='latex' /></p>
<p>If the number is nonzero, there must be a least significant nonzero bit. Let&#8217;s call its position <em>M</em>. Writing <em>a</em> &amp; (-<em>a</em>) now:</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+a%5C%2C%5C%26%5C%2C+%28-a%29+%3D+%5Cleft%28%5Csum_%7Bi%3D0%7D%5EN+a_i%5C%2C2%5Ei%5Cright%29%5C%26%5Cleft%28%5Csum_%7Bi%3D0%7D%5EN+%281-a_i%29%5C%2C2%5Ei+%2B+1%5Cright%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle a&#92;,&#92;&amp;&#92;, (-a) = &#92;left(&#92;sum_{i=0}^N a_i&#92;,2^i&#92;right)&#92;&amp;&#92;left(&#92;sum_{i=0}^N (1-a_i)&#92;,2^i + 1&#92;right)' title='&#92;displaystyle a&#92;,&#92;&amp;&#92;, (-a) = &#92;left(&#92;sum_{i=0}^N a_i&#92;,2^i&#92;right)&#92;&amp;&#92;left(&#92;sum_{i=0}^N (1-a_i)&#92;,2^i + 1&#92;right)' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%3D+%5Cleft%28%5Csum_%7Bi%3DM%2B1%7D%5EN+a_i%5C%2C2%5Ei+%2B+1%5Ccdot2%5EM+%2B+%5Csum_%7Bi%3D0%7D%5E%7BM-1%7D0%5Ccdot2%5Ei%5Cright%29%5C%26&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle = &#92;left(&#92;sum_{i=M+1}^N a_i&#92;,2^i + 1&#92;cdot2^M + &#92;sum_{i=0}^{M-1}0&#92;cdot2^i&#92;right)&#92;&amp;' title='&#92;displaystyle = &#92;left(&#92;sum_{i=M+1}^N a_i&#92;,2^i + 1&#92;cdot2^M + &#92;sum_{i=0}^{M-1}0&#92;cdot2^i&#92;right)&#92;&amp;' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cleft%28%5Csum_%7Bi%3DM%2B1%7D%5EN+%281-a_i%29%5C%2C2%5Ei+%2B+%281-1%29%5Ccdot2%5EM+%2B+%5Csum_%7Bi%3D0%7D%5E%7BM-1%7D%281-0%29%5Ccdot2%5Ei+%2B+1%5Cright%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle &#92;left(&#92;sum_{i=M+1}^N (1-a_i)&#92;,2^i + (1-1)&#92;cdot2^M + &#92;sum_{i=0}^{M-1}(1-0)&#92;cdot2^i + 1&#92;right)' title='&#92;displaystyle &#92;left(&#92;sum_{i=M+1}^N (1-a_i)&#92;,2^i + (1-1)&#92;cdot2^M + &#92;sum_{i=0}^{M-1}(1-0)&#92;cdot2^i + 1&#92;right)' class='latex' /> (definition of <em>M</em>)</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%3D+%5Cleft%28%5Csum_%7Bi%3DM%2B1%7D%5EN+a_i%5C%2C2%5Ei+%2B+2%5EM%5Cright%29%5C%26%5Cleft%28%5Csum_%7Bi%3DM%2B1%7D%5EN+%281-a_i%29%5C%2C2%5Ei+%2B+%5Csum_%7Bi%3D0%7D%5E%7BM-1%7D2%5Ei+%2B+1%5Cright%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle = &#92;left(&#92;sum_{i=M+1}^N a_i&#92;,2^i + 2^M&#92;right)&#92;&amp;&#92;left(&#92;sum_{i=M+1}^N (1-a_i)&#92;,2^i + &#92;sum_{i=0}^{M-1}2^i + 1&#92;right)' title='&#92;displaystyle = &#92;left(&#92;sum_{i=M+1}^N a_i&#92;,2^i + 2^M&#92;right)&#92;&amp;&#92;left(&#92;sum_{i=M+1}^N (1-a_i)&#92;,2^i + &#92;sum_{i=0}^{M-1}2^i + 1&#92;right)' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%3D+%5Cleft%28%5Csum_%7Bi%3DM%2B1%7D%5EN+a_i%5C%2C2%5Ei+%2B+2%5EM%5Cright%29%5C%26%5Cleft%28%5Csum_%7Bi%3DM%2B1%7D%5EN+%281-a_i%29%5C%2C2%5Ei+%2B+2%5EM%5Cright%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle = &#92;left(&#92;sum_{i=M+1}^N a_i&#92;,2^i + 2^M&#92;right)&#92;&amp;&#92;left(&#92;sum_{i=M+1}^N (1-a_i)&#92;,2^i + 2^M&#92;right)' title='&#92;displaystyle = &#92;left(&#92;sum_{i=M+1}^N a_i&#92;,2^i + 2^M&#92;right)&#92;&amp;&#92;left(&#92;sum_{i=M+1}^N (1-a_i)&#92;,2^i + 2^M&#92;right)' class='latex' /> (<a href="http://en.wikipedia.org/wiki/Geometric_series#Formula">geometric series sum</a>)</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%3D+%5Csum_%7Bi%3DM%2B1%7D%5EN+a_i%281-a_i%29%5C%2C2%5Ei+%2B+2%5EM&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle = &#92;sum_{i=M+1}^N a_i(1-a_i)&#92;,2^i + 2^M' title='&#92;displaystyle = &#92;sum_{i=M+1}^N a_i(1-a_i)&#92;,2^i + 2^M' class='latex' /> (AND bit by bit)</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%3D+%5Csum_%7Bi%3DM%2B1%7D%5EN+0%5Ccdot+2%5Ei+%2B+2%5EM+%3D+2%5EM&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle = &#92;sum_{i=M+1}^N 0&#92;cdot 2^i + 2^M = 2^M' title='&#92;displaystyle = &#92;sum_{i=M+1}^N 0&#92;cdot 2^i + 2^M = 2^M' class='latex' /></p>
<p>we get the isolated <em>M</em><sup>th</sup> bit, as desired.</p>
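<p>As a quick sanity check of this derivation, here is a small Python sketch (not part of the original code; the names <em>neg32</em>, <em>isolate_lowest_set_bit</em> and <em>lowest_set_bit_slow</em> are made up for this illustration). It emulates the 32 bit unsigned negation with the &#8220;flip every bit and add one&#8221; formula used above and compares the result with a slow reference on the four sample values from the beginning of the post:</p>
<p><pre class="brush: python;">
MASK32 = 0xFFFFFFFF

def neg32(a):
    # Unsigned 32 bit negation: flip every bit and add one, modulo 2**32.
    return ((a ^ MASK32) + 1) % 2**32

def isolate_lowest_set_bit(a):
    # Step one of the method: a &amp; (-a).
    return a &amp; neg32(a)

def lowest_set_bit_slow(a):
    # Reference for nonzero a: strip factors of two until the number is odd.
    bit = 1
    while a % 2 == 0:
        a //= 2
        bit *= 2
    return bit

if __name__ == '__main__':
    for a in (0xa9e7da24, 0x1d56b8b0, 0x9459ffbb, 0x9f0c2a38):
        assert isolate_lowest_set_bit(a) == lowest_set_bit_slow(a)
        print(hex(a), isolate_lowest_set_bit(a))
</pre></p>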
<p><a href="http://www.ideone.com/VFScl">We can look for the smallest divisor giving different residues for 2<sup>0</sup>, 2<sup>1</sup>, &#8230;, 2<sup>31</sup> via a small piece of code</a>:</p>
<p><pre class="brush: python;">
def get_modulo(n):
    m = 1
    while True:
        if len(set(2**i % m for i in range(n))) == n:
            return m
        m += 1

if __name__ == '__main__':
    print(get_modulo(32))
</pre></p>
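<p>One way to see why the search stops at 37, sketched here for odd moduli only: the residues of the powers of two repeat with a period equal to the multiplicative order of 2, so an odd modulus works only if that order is at least 32. The helper name <em>order_of_two</em> is hypothetical and not part of the linked code:</p>
<p><pre class="brush: python;">
def order_of_two(m):
    # Multiplicative order of 2 modulo an odd m greater than 1:
    # the smallest positive k such that 2**k % m == 1.
    k, r = 1, 2 % m
    while r != 1:
        r = (r * 2) % m
        k += 1
    return k

if __name__ == '__main__':
    for m in (33, 35, 37):
        print(m, order_of_two(m))
</pre></p>
<p>The orders for 33 and 35 are 10 and 12, too short for 32 different residues, while the order for 37 is 36. The even candidates 32, 34 and 36 fail too, as the brute force search in <em>get_modulo</em> confirms.</p>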
<p>Now we just have to put the bit numbers in their associated positions inside the lookup table. <a href="http://www.ideone.com/CauM1">We can calculate the table</a> using the following Python code:</p>
<p><pre class="brush: python;">
def get_lut(max_exp, modulo):
    lut = [-1] * modulo
    for i in range(max_exp + 1):
        assert lut[2 ** i % modulo] == -1
        lut[2 ** i % modulo] = i
    return lut

def print_table(table):
    print('{%s}' % ', '.join('%d' % t for t in table))

if __name__ == '__main__':
    print_table(get_lut(32, 37))
</pre></p>
<p>Then, putting these two pieces together, we get the <a href="http://www.ideone.com/rIOQ9">following C function</a>:</p>
<p><pre class="brush: cpp;">
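#include &lt;stdint.h&gt;

/* Returns the position of the lowest set bit, or -1 if a == 0. */
int lut_bit_pos(uint32_t a)
{
  /* Indexed by (2 to the i) % 37. The entry 32 at index 7 comes from
     the table generator and is never reached for 32 bit inputs. */
  static const int bit_pos_lut[37] =
  {
    -1, 0, 1, 26, 2, 23, 27, 32, 3, 16,
    24, 30, 28, 11, -1, 13, 4, 7, 17,
    -1, 25, 22, 31, 15, 29, 10, 12, 6,
    -1, 21, 14, 9, 5, 20, 8, 19, 18
  };
  return bit_pos_lut[(a &amp; (-a)) % 37];
}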
</pre></p>
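<p>The lookup can also be cross-checked quickly in Python. This is only a sketch that reuses the table from the C function; it relies on the fact that Python integers behave like two&#8217;s complement numbers of unlimited width under bitwise operations, so <em>a</em> &amp; -<em>a</em> isolates the lowest set bit there too:</p>
<p><pre class="brush: python;">
BIT_POS_LUT = [
    -1, 0, 1, 26, 2, 23, 27, 32, 3, 16,
    24, 30, 28, 11, -1, 13, 4, 7, 17,
    -1, 25, 22, 31, 15, 29, 10, 12, 6,
    -1, 21, 14, 9, 5, 20, 8, 19, 18,
]

def lut_bit_pos(a):
    # Isolate the lowest set bit, reduce it modulo 37 and look it up.
    return BIT_POS_LUT[(a &amp; -a) % 37]

if __name__ == '__main__':
    # Every power of two must map back to its exponent.
    for i in range(32):
        assert lut_bit_pos(2**i) == i
    # Setting an extra bit above position i must not change the answer.
    for i in range(31):
        assert lut_bit_pos(2**i + 2**31) == i
    assert lut_bit_pos(0) == -1
    print(lut_bit_pos(0x1d56b8b0))
</pre></p>
<p>For 0x1d56b8b0 this prints 4, the position of the lowest set bit; the 28 listed at the beginning of the post is another bit that is set in the same number.</p>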
<p>As we are working with 32 bit integers, we can do a full test&#8230; <a href="http://www.ideone.com/PHkz7">though not using Ideone</a> (it exceeds the runtime limits).</p>
<h5>Edited 5/30/11 &#8211; Bit sets</h5>
<p><a name="bitsets">The original problem</a> was to handle a set of task slots, each slot having two possible states: occupied or empty. This set required efficient support for the following operations:</p>
<ul>
<li>Checking if a slot is occupied.</li>
<li>Occupying/freeing a slot.</li>
<li><em>Getting a free slot if one is available.</em></li>
</ul>
<p>As this problem doesn&#8217;t ask us for the slot number, we can use an opaque type and internally codify the slot as its associated power of two:</p>
<p><pre class="brush: cpp;">
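typedef uint32_t set_t;
typedef uint32_t slot_t;

#define NUM_SLOTS 32

/* A 1 bit marks a free slot, so the empty set is all ones. */
#define EMPTY_SET ((set_t)-1)
#define FULL_SET ((set_t)0)

/* A valid slot is a nonzero power of two. */
#define IS_VALID_SLOT(slot) ((slot)&amp;&amp;!((slot)&amp;((slot)-1)))
/* Shift an unsigned 1: in C, shifting a signed 1 by 31 is undefined. */
#define SLOT_FROM_INDEX(slot_idx) ((slot_t)1&lt;&lt;(slot_idx))

/* Lowest free slot, or 0 if the set is full. Negating the unsigned
   value avoids the signed overflow that the int32_t cast could hit
   when only bit 31 is set. */
#define GET_EMPTY_SLOT(set) ((set)&amp;(-(set_t)(set)))

#define SET_SLOT(set, slot) ((void)((set)&amp;=~(slot)))
#define CLEAR_SLOT(set, slot) ((void)((set)|=(slot)))

#define IS_SLOT_SET(set, slot) (!((set)&amp;(slot)))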
</pre></p>
<p><a href="http://www.ideone.com/Ng7Ba">We can check the implementation using Ideone.</a></p>
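<p>To make the inverted encoding easier to follow (a 1 bit means &#8220;free&#8221;), here is a rough Python model of the same operations. It is only an illustration: the function names mirror the macros but are otherwise hypothetical, and the functions return the new set instead of modifying it in place:</p>
<p><pre class="brush: python;">
NUM_SLOTS = 32
EMPTY_SET = 2**NUM_SLOTS - 1  # every bit is 1: every slot is free
FULL_SET = 0

def get_empty_slot(s):
    # Lowest free slot as a power of two, or 0 if there is none.
    return s &amp; -s

def set_slot(s, slot):
    # Occupy a slot by clearing its bit.
    return s &amp; ~slot

def clear_slot(s, slot):
    # Free a slot by setting its bit again.
    return s | slot

def is_slot_set(s, slot):
    return (s &amp; slot) == 0

if __name__ == '__main__':
    s = EMPTY_SET
    taken = []
    while get_empty_slot(s):
        slot = get_empty_slot(s)
        s = set_slot(s, slot)
        taken.append(slot)
    assert s == FULL_SET and len(taken) == NUM_SLOTS
    s = clear_slot(s, taken[5])
    assert get_empty_slot(s) == taken[5]
</pre></p>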
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2011/05/30/searching-for-bits-in-c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>
	</item>
		<item>
		<title>Intermezzo: permutations</title>
		<link>http://mchouza.wordpress.com/2011/03/28/intermezzo-permutations/</link>
		<comments>http://mchouza.wordpress.com/2011/03/28/intermezzo-permutations/#comments</comments>
		<pubDate>Mon, 28 Mar 2011 01:17:43 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[cs]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[math]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=1995</guid>
		<description><![CDATA[As I haven&#8217;t progressed very much with my WebGL apps due to a variety of reasons/excuses (lack of willpower, a crash course on nuclear reactors engineering since the Fukushima nuclear crisis started, etc.), I decided to make a quick post showing a solution to a simple problem. In other words, the main purpose of this [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mchouza.wordpress.com&amp;blog=6184898&amp;post=1995&amp;subd=mchouza&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>As I haven&#8217;t progressed very much with my WebGL apps due to a variety of reasons/excuses (<a href="http://en.wikipedia.org/wiki/Akrasia">lack of willpower</a>, a <a href="http://metamodern.com/2009/05/27/how-to-learn-about-everything/">crash course</a> on <a href="http://web.mit.edu/nuclearpower/">nuclear</a> <a href="http://www.threemileisland.org/downloads/354.pdf">reactors</a> <a href="http://ocw.mit.edu/courses/nuclear-engineering/22-091-nuclear-reactor-safety-spring-2008/lecture-notes/MIT22_091S08_lec15.pdf">engineering</a> <a href="http://www.acme-nuclear.com/">since</a> <a href="http://www.iaea.org/newscenter/news/tsunamiupdate01.html">the</a> <a href="http://allthingsnuclear.org/tagged/Japan_nuclear">Fukushima</a> <a href="http://bravenewclimate.com/">nuclear</a> <a href="http://mitnse.com/">crisis</a> <a href="http://www.jaif.or.jp/english/index.php">started</a>, etc.), I decided to make a quick post showing a solution to a simple problem. In other words, the main purpose of this post is just to avoid leaving a month without posts <img src='http://s0.wp.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':-D' class='wp-smiley' /> </p>
<h5>The problem</h5>
<blockquote><p>
Given an arithmetic expression without parentheses, insert them to get a specified result. For example, if the expression is 2+2/2 and the desired result is 2, there is a single solution: (2+2)/2. On the other hand, if the desired result is 0 there is no solution.
</p></blockquote>
<p>It&#8217;s easy to see the answers in simple cases like the last one. But it&#8217;s not so easy to see if we can get a result of 18 by adding parentheses to 18-6/2+4-1*2&#8230; so let&#8217;s write code <a href="http://www.hhhh.org/wiml/virtues.html">to do the hard work for us</a>.</p>
<h5>A solution</h5>
<p>The parentheses only change the order in which the different operators apply and the effects of any operation order can be obtained with some <a href="http://en.wikipedia.org/wiki/Parenthesis#Parentheses_.28_.29">parenthesization</a>. So, if we search for a <a href="http://en.wikipedia.org/wiki/Permutation">permutation</a> of the order in which the operators are applied that gives us the desired result, we are getting a solution for the original problem.</p>
<p>This new problem is not exactly the same as the original one, for two reasons that can be exemplified as follows:</p>
<ul>
<li>Given the expression 2+2*2+2, the orders of operation 1-3-2 and 3-1-2 are associated to the same parenthesized expression: (2+2)*(2+2).</li>
<li>The parenthesizations (2+2)+2, 2+(2+2) and 2+2+2 are equivalent due to the associativity of addition.</li>
</ul>
<p>But, as this only gives us some redundant solutions that can be easily eliminated with some post-processing, it&#8217;s not a problem from the correctness point of view. Being able to remove these redundant solutions without needing to enumerate them would be a great thing performance-wise, but <a href="http://en.wikipedia.org/wiki/KISS_principle">it doesn&#8217;t seem easy and it&#8217;s not required for small expressions</a> (10! &lt; 10<sup>7</sup>).</p>
<h6>Permutations in lexicographic order</h6>
<p>To avoid accidentally skipping some permutations, it&#8217;s useful to follow a specific order when enumerating them. In this section we will see how we can go through all the permutations of an array [0, 1, 2, ..., <em>n</em> - 1] in <a href="http://en.wikipedia.org/wiki/Lexicographic_order">lexicographic order</a>.</p>
<p>Given some permutation of the array, that we will call <em>a</em>, the next one will probably share some prefix and will differ in the order of the last elements. As we cannot get a lexicographically greater permutation by permuting a suffix that is in reverse order (starting from the greater elements), we need to permute a suffix that is not in reverse order. Then we will make our first step to search for this suffix:</p>
<blockquote><p>
Find the largest index k such that a[k] &lt; a[k + 1].
</p></blockquote>
<p>Asking for the largest index is equivalent to searching for the <em>minimal</em> suffix to permute, which is required to get the permutation that <em>immediately</em> follows in lexicographic order. If no such index exists, we are at the last permutation.</p>
<p>Now we need to decide what to permute. It&#8217;s clear that we cannot get a greater permutation just by permuting the suffix <em>a</em>[<em>k</em>+1 .. <em>n</em>-1], as its elements are in reverse order. So we need to replace <em>a</em>[<em>k</em>] by some element and the obvious choice is to search for the immediately greater element in the suffix or, as the suffix <em>a</em>[<em>k</em>+1 .. <em>n</em>-1] is in reverse order:</p>
<blockquote><p>
Find the largest index l such that a[k] &lt; a[l].
</p></blockquote>
<p>The lexicographically smallest permutation starting with the original <em>a</em>[<em>l</em>] can be obtained by appending all the other elements of the original suffix in ascending order. But, as the suffix is in descending order and <em>l</em> is the largest index such that a[k] &lt; a[l], we can get the same result in an easier way:</p>
<blockquote><p>
Swap a[k] with a[l].<br />
Reverse the sequence from a[k + 1] up to and including the final element a[n - 1].
</p></blockquote>
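<p>The steps above can be sketched in Python as follows. This is not the JS code used by the solver, just a compact illustration of the same steps, and the function name <em>next_permutation</em> is chosen here for convenience:</p>
<p><pre class="brush: python;">
def next_permutation(a):
    # Rearranges the list a in place into its lexicographic successor.
    # Returns False (leaving a untouched) if a is already the last one.
    n = len(a)
    # Largest k such that a[k] &lt; a[k + 1].
    k = n - 2
    while k &gt;= 0 and a[k] &gt;= a[k + 1]:
        k -= 1
    if k &lt; 0:
        return False
    # Largest l such that a[k] &lt; a[l]; it exists because a[k + 1] qualifies.
    l = n - 1
    while a[l] &lt;= a[k]:
        l -= 1
    a[k], a[l] = a[l], a[k]
    # The suffix is still in descending order, so reversing it sorts it.
    a[k + 1:] = a[k + 1:][::-1]
    return True

if __name__ == '__main__':
    a = [0, 1, 2]
    print(a)
    while next_permutation(a):
        print(a)
</pre></p>
<p>Starting from the sorted array, the loop visits each of the <em>n</em>! permutations exactly once and stops after the fully reversed one.</p>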
<p>The final code in JS <a href="http://code.google.com/p/mchouza/source/browse/trunk/misc/parenthesize/parenthesize.js#4">can be seen here</a>.</p>
<h6>Evaluating an arithmetic expression with different orders of operations</h6>
<p>I started using an array whose members were alternately numbers and operators. The main problem is that, while it&#8217;s easy to go from an operator to the adjacent numbers in an array, this is not what we want to do. The desired behavior is to go from an operator to the operands, and they can be the result of a sequence of previous operations.</p>
<p>To solve this problem in a moderately efficient way, I adapted the parent node concept from the <a href="http://en.wikipedia.org/wiki/Disjoint-set_data_structure#Disjoint-set_forests">disjoint-set data structure</a>. Each operation adds the result to the operator node and puts this node as the parent node of both operands, allowing efficient access to the value of the whole operand from any node that is included in it. The code that does these operations can be seen <a href="http://code.google.com/p/mchouza/source/browse/trunk/misc/parenthesize/parenthesize.js#146">at the SVN repository of this blog</a>.</p>
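<p>A minimal Python sketch of this idea, under some assumptions of mine: the expression is already split into a list of numbers and a list of operators, the order is given as a list of operator indices, and the parent links are followed without any path compression. The names <em>find_root</em> and <em>evaluate</em> are hypothetical and do not come from the JS code:</p>
<p><pre class="brush: python;">
import operator

OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}

def find_root(parent, node):
    # Climb the parent links up to the node holding the value of the
    # whole subexpression that contains this node.
    while parent[node] is not None:
        node = parent[node]
    return node

def evaluate(numbers, operators, order):
    # Nodes alternate: even indices hold numbers, odd indices operators.
    value = [None] * (2 * len(numbers) - 1)
    value[0::2] = numbers
    parent = [None] * len(value)
    for op_idx in order:
        node = 2 * op_idx + 1
        left = find_root(parent, node - 1)
        right = find_root(parent, node + 1)
        value[node] = OPS[operators[op_idx]](value[left], value[right])
        parent[left] = parent[right] = node
    return value[find_root(parent, 0)]

if __name__ == '__main__':
    # 2+2/2 with the addition applied first, that is, (2+2)/2.
    print(evaluate([2, 2, 2], ['+', '/'], [0, 1]))
    # The same expression with the division applied first, 2+(2/2).
    print(evaluate([2, 2, 2], ['+', '/'], [1, 0]))
</pre></p>
<p>Combining this with an enumeration of the operator orders gives a brute force search for the desired result. An order that ends up dividing by zero raises an exception in this sketch, so a real search would have to catch it.</p>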
<h5>Testing the solution</h5>
<p>The whole solution can be tested <a href="http://mchouza.googlecode.com/svn/trunk/misc/parenthesize/parenthesize_en.html">at the same SVN repository</a>, but let&#8217;s see some specific examples:</p>
<p><strong>Input expression:</strong> 3+2*2-1*0.5<br />
<strong>Input result 1:</strong> 3<br />
<strong>Output:</strong> (3+2*2-1)*0.5<br />
<strong>Input result 2:</strong> 6<br />
<strong>Output:</strong> 3+2*(2-1*0.5)</p>
<p><strong>Input expression:</strong> 18-6/2+4-1*2<br />
<strong>Input result:</strong> 18<br />
<strong>Output 1:</strong> ((18-6)/2+4-1)*2<br />
<strong>Output 2:</strong> 18-(6/(2+4)-1)*2</p>
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2011/03/28/intermezzo-permutations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>
	</item>
		<item>
		<title>GPGPU with WebGL: solving Laplace&#8217;s equation</title>
		<link>http://mchouza.wordpress.com/2011/02/21/gpgpu-with-webgl-solving-laplaces-equation/</link>
		<comments>http://mchouza.wordpress.com/2011/02/21/gpgpu-with-webgl-solving-laplaces-equation/#comments</comments>
		<pubDate>Mon, 21 Feb 2011 02:29:24 +0000</pubDate>
		<dc:creator>mchouza</dc:creator>
				<category><![CDATA[cs]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[physics]]></category>
		<category><![CDATA[webgl]]></category>

		<guid isPermaLink="false">http://mchouza.wordpress.com/?p=1941</guid>
		<description><![CDATA[This is the first post in what will hopefully be a series of posts exploring how to use WebGL to do GPGPU (General-purpose computing on graphics processing units). In this installment we will solve a partial differential equation using WebGL, the Laplace&#8217;s equation more specifically. Discretizing the Laplace&#8217;s equation The Laplace&#8217;s equation, , is one [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mchouza.wordpress.com&amp;blog=6184898&amp;post=1941&amp;subd=mchouza&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>This is the first post in what will hopefully be a series of posts exploring <a href="http://learningwebgl.com/blog/?page_id=1217">how to use</a> <a href="http://en.wikipedia.org/wiki/Webgl">WebGL</a> to do <a href="http://en.wikipedia.org/wiki/GPGPU">GPGPU (General-purpose computing on graphics processing units)</a>. In this installment we will solve a partial differential equation using WebGL, the Laplace&#8217;s equation more specifically.</p>
<h5>Discretizing the Laplace&#8217;s equation</h5>
<p>The <a href="http://en.wikipedia.org/wiki/Laplace%27s_equation">Laplace&#8217;s equation</a>, <img src='http://s0.wp.com/latex.php?latex=%5Cnabla%5E2+%5Cphi+%3D+0&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;nabla^2 &#92;phi = 0' title='&#92;nabla^2 &#92;phi = 0' class='latex' />, is one of the most ubiquitous <a href="http://en.wikipedia.org/wiki/Partial_differential_equation">partial differential equations</a> in physics. It appears in a lot of areas, including electrostatics, heat conduction and fluid flow.</p>
<p>To get a <a href="http://en.wikipedia.org/wiki/Numerical_partial_differential_equations">numerical solution of a differential equation</a>, the first step is to replace <a href="http://en.wikipedia.org/wiki/Manifold">the continuous domain</a> by a <a href="http://en.wikipedia.org/wiki/Lattice_(group)">lattice</a> and the <a href="http://en.wikipedia.org/wiki/Differential_operator">differential operators</a> with <a href="http://en.wikipedia.org/wiki/Difference_operator">their discrete versions</a>. In our case, we just have to replace the <a href="http://en.wikipedia.org/wiki/Laplacian">Laplacian</a> by <a href="http://en.wikipedia.org/wiki/Discrete_Laplace_operator">its discrete version</a>:</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cnabla%5E2+%5Cphi%28x%29+%3D+0+%5Crightarrow+%5Cfrac%7B1%7D%7Bh%5E2%7D%5Cleft%28%5Cphi_%7Bi-1%5C%2Cj%7D+%2B+%5Cphi_%7Bi%2B1%5C%2Cj%7D+%2B+%5Cphi_%7Bi%5C%2Cj-1%7D+%2B+%5Cphi_%7Bi%5C%2Cj%2B1%7D+-+4%5Cphi_%7Bi%5C%2Cj%7D%5Cright%29+%3D+0&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle &#92;nabla^2 &#92;phi(x) = 0 &#92;rightarrow &#92;frac{1}{h^2}&#92;left(&#92;phi_{i-1&#92;,j} + &#92;phi_{i+1&#92;,j} + &#92;phi_{i&#92;,j-1} + &#92;phi_{i&#92;,j+1} - 4&#92;phi_{i&#92;,j}&#92;right) = 0' title='&#92;displaystyle &#92;nabla^2 &#92;phi(x) = 0 &#92;rightarrow &#92;frac{1}{h^2}&#92;left(&#92;phi_{i-1&#92;,j} + &#92;phi_{i+1&#92;,j} + &#92;phi_{i&#92;,j-1} + &#92;phi_{i&#92;,j+1} - 4&#92;phi_{i&#92;,j}&#92;right) = 0' class='latex' />,</p>
<p>where <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='h' title='h' class='latex' /> is the grid size.</p>
<p>If we apply this equation at all internal points of the lattice (the external points must retain fixed values if we use <a href="http://en.wikipedia.org/wiki/Dirichlet_boundary_condition">Dirichlet boundary conditions</a>) we get a big <a href="http://en.wikipedia.org/wiki/System_of_linear_equations">system of linear equations</a> whose solution will give a numerical approximation to a solution of the Laplace&#8217;s equation. Of the <a href="http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/csa/node38.html">various methods to solve big linear systems</a>, the <a href="http://en.wikipedia.org/wiki/Jacobi_method">Jacobi relaxation method</a> seems the best fit to <a href="http://en.wikipedia.org/wiki/Shader_(realtime,_logical)">shaders</a>, because it applies the same expression at every lattice point and doesn&#8217;t have dependencies between computations. Applying this method to our linear system, we get the following expression for the iteration:</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cphi_%7Bi%5C%2Cj%7D%5E%7B%28k%2B1%29%7D+%3D+%5Cfrac%7B1%7D%7B4%7D%5Cleft%28%5Cphi_%7Bi-1%5C%2Cj%7D%5E%7B%28k%29%7D+%2B+%5Cphi_%7Bi%2B1%5C%2Cj%7D%5E%7B%28k%29%7D+%2B+%5Cphi_%7Bi%5C%2Cj-1%7D%5E%7B%28k%29%7D+%2B+%5Cphi_%7Bi%5C%2Cj%2B1%7D%5E%7B%28k%29%7D%5Cright%29&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;displaystyle &#92;phi_{i&#92;,j}^{(k+1)} = &#92;frac{1}{4}&#92;left(&#92;phi_{i-1&#92;,j}^{(k)} + &#92;phi_{i+1&#92;,j}^{(k)} + &#92;phi_{i&#92;,j-1}^{(k)} + &#92;phi_{i&#92;,j+1}^{(k)}&#92;right)' title='&#92;displaystyle &#92;phi_{i&#92;,j}^{(k+1)} = &#92;frac{1}{4}&#92;left(&#92;phi_{i-1&#92;,j}^{(k)} + &#92;phi_{i+1&#92;,j}^{(k)} + &#92;phi_{i&#92;,j-1}^{(k)} + &#92;phi_{i&#92;,j+1}^{(k)}&#92;right)' class='latex' />,</p>
<p>where <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='k' title='k' class='latex' /> is a step index.</p>
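<p>Before moving to shaders, the iteration can be tried on the CPU. The following pure Python sketch is only an illustration under my own assumptions (a small square grid whose edge points are all Dirichlet values, the top edge held at 1 and the rest at 0); the names <em>jacobi_step</em> and <em>solve</em> are made up for it:</p>
<p><pre class="brush: python;">
def jacobi_step(phi, fixed):
    # One Jacobi sweep: every free point becomes the average of its four
    # neighbors, always reading from the previous iterate phi.
    n = len(phi)
    new = [row[:] for row in phi]
    for i in range(n):
        for j in range(n):
            if not fixed[i][j]:
                new[i][j] = 0.25 * (phi[i - 1][j] + phi[i + 1][j] +
                                    phi[i][j - 1] + phi[i][j + 1])
    return new

def solve(n, steps):
    # Dirichlet data: the top edge is held at 1, the other edges at 0.
    phi = [[1.0 if i == 0 else 0.0 for j in range(n)] for i in range(n)]
    fixed = [[i in (0, n - 1) or j in (0, n - 1) for j in range(n)]
             for i in range(n)]
    for _ in range(steps):
        phi = jacobi_step(phi, fixed)
    return phi

if __name__ == '__main__':
    phi = solve(5, 100)
    print(round(phi[2][2], 6))
</pre></p>
<p>By symmetry, the value at the center of the grid should approach 0.25: adding the four rotated copies of this problem gives a grid that is 1 everywhere, and each copy contributes the same amount at the center.</p>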
<h5>Solving the discretized problem using WebGL shaders</h5>
<p>If we use a texture to represent the domain and a <a href="http://en.wikipedia.org/wiki/Pixel_shader">fragment shader</a> to do the Jacobi relaxation steps, the shader will follow this general pseudocode:</p>
<ol>
<li>Check if this fragment is a boundary point. If it&#8217;s one, return the previous value of this point.</li>
<li>Get the four nearest neighbors&#8217; values.</li>
<li>Return the average of their values.</li>
</ol>
<p>To flesh out this pseudocode, we need to define a specific representation for the discretized domain. Taking into account that the currently available WebGL versions don&#8217;t support floating point textures, we can use 32-bit RGBA fragments and do the following mapping:</p>
<p><strong>R:</strong> Higher byte of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />.<br />
<strong>G:</strong> Lower byte of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=eeeae8&amp;fg=4a4a49&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />.<br />
<strong>B:</strong> Unused.<br />
<strong>A:</strong> 1 if it&#8217;s a boundary value, 0 otherwise.</p>
<p>Most of the code is straightforward, but doing the multiprecision arithmetic is tricky, as the quantities we are working with behave as floating point numbers in the shaders but are stored as integers. More specifically, the color numbers in the normal range, [0.0, 1.0], are multiplied by 255 and rounded to the nearest byte value when stored in the target texture.</p>
<p>My first idea was to start by reconstructing the floating point numbers for each input value, do the required operations with the floating point numbers and convert the results to color components that can be reliably stored (without losing precision). This gives us the following pseudocode for the iteration shader:</p>
<p><pre class="brush: cpp;">
// wc is the color to the &quot;west&quot;, ec is the color to the &quot;east&quot;, ...
float w_val = wc.r + wc.g / 255.0;
float e_val = ec.r + ec.g / 255.0;
// ...
float val = (w_val + e_val + n_val + s_val) / 4.0;
float hi = val - mod(val, 1.0 / 255.0);
float lo = (val - hi) * 255.0;
fragmentColor = vec4(hi, lo, 0.0, 0.0);
</pre></p>
<p>The reason why we multiply by 255 instead of 256 is that we need <em>lo</em> to keep track of the part of <em>val</em> that will be lost when we store it as a color component. As each byte value of a discrete color component will be associated with a range of size 1/255 in its continuous counterpart, we need to use the &#8220;low byte&#8221; to store the position of the continuous component within that range.</p>
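<p>The round trip can be modelled in Python to get a feeling for the precision. This is a sketch under the assumption stated above, that each component is multiplied by 255 and rounded to the nearest byte when stored; <em>encode</em> and <em>decode</em> are hypothetical names, and <em>math.fmod</em> stands in for the shader&#8217;s <em>mod</em>, which is equivalent for nonnegative values:</p>
<p><pre class="brush: python;">
import math

STEP = 1.0 / 255.0

def encode(val):
    # Split val in [0, 1] into the two color components, then quantize
    # each one to a byte, as happens when the fragment is stored.
    hi = val - math.fmod(val, STEP)
    lo = (val - hi) * 255.0
    return round(hi * 255.0), round(lo * 255.0)

def decode(r_byte, g_byte):
    # Same reconstruction as in the shader: r + g / 255, with r and g
    # being the byte values mapped back to [0, 1].
    return r_byte / 255.0 + (g_byte / 255.0) / 255.0

if __name__ == '__main__':
    for val in (0.0, 0.1, 0.123456, 0.5, 0.999):
        r_byte, g_byte = encode(val)
        print(val, (r_byte, g_byte), abs(decode(r_byte, g_byte) - val))
</pre></p>
<p>The error after one round trip should stay around half a step of the low byte, that is, about 0.5 / 255<sup>2</sup>.</p>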
<p>Simplifying the code to avoid redundant operations, we get:</p>
<p><pre class="brush: cpp;">
float val = (wc.r + ec.r + nc.r + sc.r) / 4.0 +
	(wc.g + ec.g + nc.g + sc.g) / (4.0 * 255.0);
float hi = val - mod(val, 1.0 / 255.0);
float lo = (val - hi) * 255.0;
fragmentColor = vec4(hi, lo, 0.0, 0.0);
</pre></p>
<p>The result of running <a href="http://code.google.com/p/mchouza/source/browse/trunk/laplace-eq-webgl/laplace_1.html">the full code</a>, implemented in <a href="http://www.opengl.org/documentation/glsl/">GLSL</a>, is:</p>
<div id="attachment_1982" class="wp-caption aligncenter" style="width: 505px"><a href="http://mchouza.googlecode.com/svn/trunk/laplace-eq-webgl/laplace_1.html#32"><img src="http://mchouza.files.wordpress.com/2011/02/laplace_1.png?w=474" alt="" title="laplace_1"   class="size-full wp-image-1982" /></a><p class="wp-caption-text">Solving the Laplace&#039;s equation using a 32x32 grid. Click the picture to see the live solving process (if your browser supports WebGL).</p></div>
<p>As can be seen, it has quite low resolution but converges fast. But if we just crank up the number of points, the convergence gets slower:</p>
<div id="attachment_1985" class="wp-caption aligncenter" style="width: 505px"><a href="http://mchouza.googlecode.com/svn/trunk/laplace-eq-webgl/laplace_1.html#512"><img src="http://mchouza.files.wordpress.com/2011/02/laplace_2.png?w=474" alt="" title="laplace_2"   class="size-full wp-image-1985" /></a><p class="wp-caption-text">Incompletely converged solution in a 512x512 grid. Click the picture to see a live version.</p></div>
<p>How can we get both high resolution and fast convergence?</p>
<h5>Multigrid</h5>
<p>The basic idea behind <a href="http://en.wikipedia.org/wiki/Multigrid_method">multigrid methods</a> is to apply the relaxation method on a hierarchy of increasingly fine discretizations of the problem, using the solution obtained on the previous, coarser grid as the &#8220;starting guess&#8221; for each step. In this way, the long-wavelength parts of the solution (those that converge slowly on the finer grids) are obtained in the first coarse iterations, and the last iterations just add the finer details (those that converge relatively easily on the finer grids).</p>
<p><a href="http://code.google.com/p/mchouza/source/browse/trunk/laplace-eq-webgl/laplace_2.html">The implementation is quite straightforward</a>, giving us fast convergence and high resolution at the same time:</p>
<div id="attachment_1988" class="wp-caption aligncenter" style="width: 505px"><a href="http://mchouza.googlecode.com/svn/trunk/laplace-eq-webgl/laplace_2.html"><img src="http://mchouza.files.wordpress.com/2011/02/laplace_3.png?w=474" alt="" title="laplace_3"   class="size-full wp-image-1988" /></a><p class="wp-caption-text">Multigrid solution using grids from 8x8 to 512x512. Click the picture to see the live version.</p></div>
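<p>The coarse-to-fine idea can also be sketched on the CPU in a few lines of Python. This is an illustration under stated assumptions, not the code linked above: the boundary values (1 on the top edge, 0 elsewhere), the helper names and the refinement by copying each coarse value into a 2x2 block are all choices made for this sketch.</p>
<p><pre class="brush: python;">
def with_boundary(u):
    """Impose the assumed boundary values: 1 on the top edge, 0 elsewhere."""
    n = len(u)
    for k in range(n):
        u[k][0] = 0.0          # left edge
        u[k][n - 1] = 0.0      # right edge
        u[n - 1][k] = 0.0      # bottom edge
    for k in range(n):
        u[0][k] = 1.0          # top edge (set last, so it owns the corners)
    return u


def relax(u, sweeps):
    """Jacobi relaxation: each interior point becomes the mean of its 4 neighbours."""
    n = len(u)
    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                new[i][j] = (u[i - 1][j] + u[i + 1][j] +
                             u[i][j - 1] + u[i][j + 1]) / 4.0
        u = new
    return u


def refine(u):
    """Double the resolution by copying each coarse value into a 2x2 block."""
    n = len(u)
    return [[u[i // 2][j // 2] for j in range(2 * n)] for i in range(2 * n)]


def coarse_to_fine(coarsest, levels, sweeps):
    """Relax on the coarsest grid, then refine and relax again at each level."""
    u = relax(with_boundary([[0.0] * coarsest for _ in range(coarsest)]), sweeps)
    for _ in range(levels - 1):
        u = relax(with_boundary(refine(u)), sweeps)
    return u
</pre></p>
<p>For instance, <em>coarse_to_fine(4, 3, 30)</em> relaxes on 4x4, 8x8 and 16x16 grids with 30 sweeps each. Starting from zeros directly on the 16x16 grid, the same 30 sweeps should leave the result noticeably further from the converged solution, because the long-wavelength part has not had time to settle.</p>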
<h5>Conclusions</h5>
<p>It&#8217;s quite viable to use WebGL for at least basic GPGPU tasks, though it is, in a certain sense, a step backward in time: there is no <a href="http://en.wikipedia.org/wiki/CUDA">CUDA</a>, no <a href="http://www.khronos.org/message_boards/viewtopic.php?f=35&amp;t=2970">floating point textures</a>, and no other feature that helps when working on non-graphics problems, so you are on your own. But with WebGL support growing in modern browsers, it&#8217;s an interesting way of partially accessing the <a href="http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf">enormous computational power</a> present in modern video cards from any JS application, without requiring the installation of a native application.</p>
<p>In upcoming posts we will explore other kinds of problems where WebGL can provide a great performance boost.</p>
]]></content:encoded>
			<wfw:commentRss>http://mchouza.wordpress.com/2011/02/21/gpgpu-with-webgl-solving-laplaces-equation/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/5277207dadc9ce68a228f38bf8d5f6a7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mchouza</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/02/laplace_1.png" medium="image">
			<media:title type="html">laplace_1</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/02/laplace_2.png" medium="image">
			<media:title type="html">laplace_2</media:title>
		</media:content>

		<media:content url="http://mchouza.files.wordpress.com/2011/02/laplace_3.png" medium="image">
			<media:title type="html">laplace_3</media:title>
		</media:content>
	</item>
	</channel>
</rss>
