Universality with only two NOT gates

In a previous post we have asked how many NOTs are needed to compute an arbitrary Boolean function. In this post we will see that only two NOT gates are enough.

Building 3 NOT gates starting from 2

If we call the inputs X, Y and Z, we can make a function detecting when no more than one input is active using a single NOT gate:

\displaystyle f(X, Y, Z) = \overline{XY + YZ + XZ}.

Detects when no more than one input is active.

Detects when no more than one input is active.

By selecting only the cases where at least one input is present, adding a term to detect when all the inputs are active and using an additional NOT gate, we can detect when exactly zero or two inputs are active:

\displaystyle g(X, Y, Z) = \overline{f(X, Y, Z)(X + Y + Z) + XYZ}

\displaystyle = \overline{\overline{XY + YZ + XZ}(X + Y + Z) + XYZ}.

Detects when zero or two inputs are active.

Detects when zero or two inputs are active.

Now we know that if X is not present, we either have:

  • 0 inputs present: we can check that by simultaneously ensuring that we don’t have more than one input present and that we have either zero or two inputs present, f(X, Y, Z)\cdot g(X, Y, Z).
  • 1 input present: we should have no more than one input present and Y or Z should be present, f(X, Y, Z)\cdot(Y + Z).
  • 2 inputs present: we can check that by simultaneously ensuring that either zero or two inputs are present and that Y and Z are present, g(X, Y, Z)\cdot YZ.

Putting all together and adding the symmetrical cases:

\displaystyle \overline{X} = f(X, Y, Z) \cdot (Y + Z) + (f(X, Y, Z) + YZ)\cdot g(X, Y, Z)

\displaystyle \overline{Y} = f(X, Y, Z) \cdot (X + Z) + (f(X, Y, Z) + XZ)\cdot g(X, Y, Z)

\displaystyle \overline{Z} = f(X, Y, Z) \cdot (X + Y) + (f(X, Y, Z) + XY)\cdot g(X, Y, Z).

Example of computing NOT x.

Computing NOT X (the other cases are symmetrical).

Checking the solution

To get independent confirmation, let’s check the truth tables with a simple Python script:

from itertools import product

for x, y, z in product((False, True), repeat=3):
	f_xyz = not (x and y or x and z or y and z)
	g_xyz = not (f_xyz and (x or y or z) or x and y and z)
	not_x = f_xyz and (y or z) or (f_xyz or y and z) and g_xyz
	not_y = f_xyz and (x or z) or (f_xyz or x and z) and g_xyz
	not_z = f_xyz and (x or y) or (f_xyz or x and y) and g_xyz
	assert (not x == not_x) and (not y == not_y) and (not z == not_z)

Conclusion

As this technique allows us to expand two NOT gates to three and it can be applied repeatedly, we can compute an arbitrary Boolean function with a circuit containing only two NOT gates.

In a following post we will see how I arrived to the solution by using brute force.

Advertisements

42 code golf

A nice and easy interview problem (link not posted to avoid giving good answers) is the following:

Print the number of integers below one million whose decimal digits sum to 42.

It can be solved with some simple Python code like the following:

print sum(1 if sum(int(c) for c in '%d' % n) == 42 else 0
          for n in range(1000000))

A more interesting problem is to try to write the smallest C program that solves the problem, where C program is defined as something that can be compiled & executed by Ideone in “C” mode. I know it can be done in 83 bytes, but can it be done using less?

Kaufman decimals

Ben Orlin posted an interesting extension of the normal repeating decimals called “Kaufman decimals” (because they were developed jointly with Jeff Kaufman). In this post I will try to define them more precisely and show they can be totally ordered.

Introduction

The Kaufman digit sequences (the decimals without the initial “0.” prefix) are formed by starting from single digit sequences (“0”, “1”, …, “9”) and applying the following operations a finite number of times:

  • Concatenation: taking two digit sequences and putting one after the other.
  • Repetition: taking a digit sequence and repeating it an infinite number of times.

As it’s difficult to start directly with the infinite case, let’s start by defining formally a more simple case.

Finite repetitions

If we replace the infinite repetition by repeating the sequence K times, it’s easy to analyze the resulting sequences. For example, if K = 3,

\displaystyle \overline{\overline{\strut 9}\,1}\,2 = 9991999199912.

So now we can define concatenation in the following way:

\displaystyle \mathcal{C}(a_{i<n}, b_{j<m})_{k<n+m} = \begin{cases}a_k & \text{if }k < n \\ b_{k-n} & \text{otherwise}\end{cases}.

Repetition is also quite easy:

\displaystyle \mathcal{R}_K(a_{i<n})_{j < n \cdot K} = a_{k \bmod n}.

We can check the definition with the following Python code:

def l(a):
    from itertools import count
    for i in count():
        if a(i) is None:
            return i
 
def r(k, a):
    l_a = l(a)
    return lambda i: a(i % l_a) if i < l_a * k else None
 
def c(a, b):
    l_a, l_b = l(a), l(b)
    return lambda i: a(i) if i < l_a else b(i - l_a) if i < l_a + l_b else None
 
def s(c):
    assert len(c) == 1
    return lambda i: c if i == 0 else None
 
def p(a):
    print ''.join(a(i) for i in range(l(a)))
 
if __name__ == '__main__':
    p(c(r(3,c(r(3,s('9')),s('1'))),s('2')))

that gives us the expected output:

9991999199912
Infinite repetitions

For the real Kaufman digit sequences we can promote the ordinary digit sequences to transfinite ones (we use a slightly non-standard definition, allowing sequences of any length), using ordinal indices. Then the concatenation and repetition operations can be defined in essentially the same way:

  • Concatenation: \mathcal{C}(a_{\alpha < \delta}, b_{\beta < \zeta})_{\gamma < \delta + \zeta} = \begin{cases}a_\gamma & \text{if }\gamma < \delta \\ b_{\gamma - \delta} & \text{otherwise}\end{cases}
  • Repetition: \mathcal{R}(a_{\alpha < \gamma})_{\beta < \gamma \cdot \omega} = a_{\beta \bmod \gamma}.

Ordinal modulus is easily defined by using the division theorem for ordinals.

We can construct the set containing all the Kaufman sequences by starting with the one digit “sequences”,

\displaystyle K_0 = \{"0", "1", ..., "9"\} (where “0” is the length 1 sequence having the single digit 0),

and defining the next step based on the previous one,

\displaystyle K_{n+1} = K_n \cup \{\mathcal{C}(a_{\alpha < \delta}, b_{\beta < \zeta}) | a_{\alpha < \delta}, b_{\beta < \zeta} \in K_n\} \cup \{\mathcal{R}(a_{\alpha < \delta}) | a_{\alpha < \delta} \in K_n\}.

As we have defined K_n for all integer n, now we can define K_\omega, the set of Kaufman sequences, as

\displaystyle K_\omega = \bigcup_{n < \omega} K_n.

Total order

It’s easy to see that this sequences can be totally ordered. Let’s define the extension operation for the Kaufman digit sequences:

\displaystyle \mathcal{E}(a_{\alpha < \gamma})_{\beta \in \mathrm{On}} = \begin{cases}a_\beta & \text{if }\beta < \gamma \\ 0 & \text{otherwise}\end{cases}

Then we define two sequences are equal if their extensions match:

\displaystyle a_{\alpha < \delta} = b_{\beta < \zeta} \Leftrightarrow \forall \gamma \in \mathrm{On} : \mathcal{E}(a_{\alpha < \delta})_\gamma = \mathcal{E}(b_{\beta < \zeta})_\gamma.

If they don't match, they must differ in some digit and we can order them based on that digit:

\displaystyle a_{\alpha < \delta} < b_{\beta < \zeta} \Leftrightarrow \exists \gamma \in \mathrm{On} : \left(\mathcal{E}(a_{\alpha < \delta})_\gamma < \mathcal{E}(b_{\beta < \zeta})_\gamma \right)\wedge \left(\forall \lambda < \gamma : \mathcal{E}(a_{\alpha < \delta})_\lambda < \mathcal{E}(b_{\beta < \zeta})_\lambda\right).

Conclusions

The Kaufman decimals can be formally defined and totally ordered by using transfinite sequences. It wouldn’t be too hard to adapt the previous Python code to accept ordinal indices in Cantor normal form, but I’m still not sure if there is an efficient procedure to order Kaufman decimals.

Baez’s “42” – Solving the first puzzle

In a blog post, John Baez proposes the following puzzle:

Consider solutions of

\displaystyle \frac{1}{p} + \frac{1}{q} + \frac{1}{r} = \frac{1}{2}

with positive integers p \le q \le r, and show that the largest possible value of r is 42.

We can start by writing the following simple Python program to solve the problem, using integer arithmetic to avoid floating point problems:

from itertools import count
for r in count(1):
    for q in xrange(1, r + 1):
        for p in xrange(1, q + 1):
            # 1/p + 1/q + 1/r = 1/2
            # qr + pr + pq = pqr/2
            # 2(qr + pr + pq) = pqr
            if 2*(q*r + p*r + p*q) == p*q*r:
                print p, q, r

This program shows us ten possible values (including 42 in the last triple),

6 6 6
4 8 8
5 5 10
4 6 12
3 12 12
3 10 15
3 9 18
4 5 20
3 8 24
3 7 42

but, as it’s trying to enumerate all integer triples, it doesn’t halt.

To be sure not to miss any interesting integer triple, we must explore them more carefully:

from itertools import count
for p in count(1):
    # checks if we are leaving room for 1/q + 1/r
    # 1/p >= 1/2
    if 2 >= p:
        continue # we need smaller values of 1/p
    # checks if we can reach 1/2 with the biggest legal values for 1/q and 1/r
    # 1/p + 1/p + 1/p < 1/2
    if 3 * 2 < p: 
        break
    for q in count(p):
        # checks if we are leaving room for 1/r
        # 1/p + 1/q >= 1/2
        if 2 * (q + p) >= p * q: 
            continue
        # checks if we can reach 1/2 with the biggest legal value for 1/r
        # 1/p + 1/q + 1/q < 1/2
        if 2 * (q + 2 * p) < p * q:
            break
        for r in count(q):
            lhs = 2 * (q * r + p * r + p * q)
            rhs = p * q * r
            # 1/p + 1/q + 1/r > 1/2
            if lhs > rhs:
                continue
            # 1/p + 1/q + 1/r < 1/2
            elif lhs < rhs:
                break
            # 1/p + 1/q + 1/r = 1/2
            else:
                print p, q, r

The results are the same, but now we can be sure of having the whole solution.

Powers of two: the solution

(This post solves the puzzle given in the previous post.)

Doing an empirical analysis

As we are interested in powers of two starting with a given set of digits, without caring about their overall magnitude, let’s plot a histogram of the values of

\displaystyle \frac{2^n}{10^m},

with n going from 1 to 100000 and m chosen to get values between 0.1 and 1.

By using a simple python script and matplotlib, we get the following distribution:

Distribution of the first digits (considered as a fraction between 0.10 and 0.99) of the first 100000 powers of two.

The distribution follows Benford’s law quite precisely

The previous plot, adding the prediction given by Benford's law.

making us expect a uniform distribution in log space

Distribution of the first digits (considered as a fraction between 0.10 and 0.99) of the first 100000 powers of two but grouped in logarithmic bins.

As the leading digits expressed as fractions cover the interval between 0.1 and 1 quite densely, we can expect that some power of two will start with the digits 31415926535897932384626433832795028841971, but this plot is quite far from being a proof.

Translating the problem

We are looking for a power of two 2^n starting with a given sequence of m digits, k. By expressing the digits sequence as a fraction between 0.1 and 1, k/10^m, we can represent the desired condition by the following inequality:

\displaystyle \frac{k}{10^m} \cdot 10^d \le 2^n < \frac{k+1}{10^m} \cdot 10^d.

As logarithms are monotonically increasing, we can apply them on both sides of the inequality:

\displaystyle \log_{10}\frac{k}{10^m} + d \le n \log_{10} 2 < \log_{10}\frac{k+1}{10^m} + d.

If we assume k+1 is not a power of ten, we can translate the inequality to the fractional parts of each member:

\displaystyle \left\{\log_{10}\frac{k}{10^m} + d\right\} \le \left\{n \log_{10} 2\right\} < \left\{\log_{10}\frac{k+1}{10^m} + d\right\}

\displaystyle \log_{10}\frac{k}{10^m} \le \left\{n \log_{10} 2\right\} < \log_{10}\frac{k+1}{10^m}.

Now the problem is reduced to see if we can find a multiple of \log_{10} 2 such that its fractional part falls between two given real numbers. In the next section we will see how this problem can be solved.

Solving the problem

By the Archimedean property of the real numbers, we know that there must be a natural number q such that

\displaystyle \frac{1}{q} < \log_{10}\frac{k+1}{10^m} - \log_{10}\frac{k}{10^m}.

Now, let’s analyze the distribution of \{n\log_{10}2\} for n in the range 1, …, q+1. As we have q+1 numbers inside the unit interval, we must have two numbers with separation less than or equal to 1/q (it’s easy to prove by contradiction). Then there are two exponents, r and s, such that

\displaystyle 0 \le \left\{r\log_{10}2\right\} - \left\{s\log_{10}2\right\} \le \frac{1}{q},

\displaystyle 0 < \left\{(r-s)\log_{10}2\right\} \le \frac{1}{q},

where we know that the inequality with zero is strict because \log_{10}2 is irrational. Then we can use the Archimedean property again to extend the inequality:

\displaystyle \frac{1}{t} < \left\{(r-s)\log_{10}2\right\} \le \frac{1}{q}.

It's intuitively easy to see (and not very difficult to prove) that we must have a number u such that

\displaystyle u\left\{(r-s)\log_{10}2\right\} < \log_{10}\frac{k}{10^m} < (u+1)\left\{(r-s)\log_{10}2\right\} < \log_{10}\frac{k+1}{10^m}

As all the involved numbers are in the range [0.1, 1), we can put the multipliers inside the fractional parts and we can also discard the first inequality because it's not important for our proof:

\displaystyle \log_{10}\frac{k}{10^m} < \left\{(u+1)(r-s)\log_{10}2\right\} < \log_{10}\frac{k+1}{10^m}.

Removing the fractional parts:

\displaystyle \log_{10}\frac{k}{10^m} + \lfloor(u+1)(r-s)\log_{10}2\rfloor < (u+1)(r-s)\log_{10}2

\displaystyle (u+1)(r-s)\log_{10}2 < \log_{10}\frac{k+1}{10^m} + \lfloor(u+1)(r-s)\log_{10}2\rfloor

Exponentiating:

\displaystyle \frac{k}{10^m} \cdot 10^{\lfloor(u+1)(r-s)\log_{10}2\rfloor} < 2^{(u+1)(r-s)}

\displaystyle 2^{(u+1)(r-s)} < \frac{k+1}{10^m}\cdot 10^{\lfloor(u+1)(r-s)\log_{10}2\rfloor}

Conclusion

We have found that we can get a power of two starting with the sequence of digits we want, though we haven’t found any bounds on the required exponents. In fact, the powers of two follow Benford’s law and we can use that to estimate the magnitude of the required exponent:

\displaystyle P({\rm power\ of\ two\ starts\ with\ required\ digits}) \approx \log_{10}\left(1 + \frac{1}{3.2 \cdot 10^{40}}\right)

\displaystyle P({\rm power\ of\ two\ starts\ with\ required\ digits}) \approx \frac{1 + \frac{1}{3.2 \cdot 10^{40}} - 1}{\ln 10}

\displaystyle P({\rm power\ of\ two\ starts\ with\ required\ digits}) \approx 10^{-41}

So we can be quite confident that an exponent smaller than 10^{43} should meet the requirements, though it’s obviously impossible to find it by brute force. But it should be feasible to compute values of u, r and s and getting the first 100 digits of 2^{(u+1)(r-s)}.

Quines [remix]

As the quines in the previous post were criticized as boring and ordinary :-P, I did a remix:

#include <stdio.h>
#define s char s[]
s="#if 0\nimport json;r=json.dumps\nprint'#include <stdio.h>\\n#define s char s[]\\ns=%s;\\n%s'%(r(s),s)\n\"\"\" \"\n#elif 1\n#undef s\nint main(void)\n{\n  char *t = s;\n  printf(\"#include <stdio.h>\\n#define s char s[]\\ns=\\\"\");\n  while (*t)\n  {\n    if (*t == '\\n')\n      printf(\"\\\\n\");\n    else if (*t == '\"')\n      printf(\"\\\\\\\"\");\n    else if (*t == '\\\\')\n      printf(\"\\\\\\\\\");\n    else\n      printf(\"%c\", *t);\n    t++;\n  }\n  printf(\"\\\";\\n%s\\n\", s);\n  return 0;\n}\n#elif 0\n\" \"\"\"\n#endif";
#if 0
import json;r=json.dumps
print'#include <stdio.h>\n#define s char s[]\ns=%s;\n%s'%(r(s),s)
""" "
#elif 1
#undef s
int main(void)
{
  char *t = s;
  printf("#include <stdio.h>\n#define s char s[]\ns=\"");
  while (*t)
  {
    if (*t == '\n')
      printf("\\n");
    else if (*t == '"')
      printf("\\\"");
    else if (*t == '\\')
      printf("\\\\");
    else
      printf("%c", *t);
    t++;
  }
  printf("\";\n%s\n", s);
  return 0;
}
#elif 0
" """
#endif

You can check that it works on the web or by downloading the file and testing it:

$ python polyquine.c | diff polyquine.c -
$ gcc -ansi -pedantic -Wall polyquine.c -o polyquine && ./polyquine | diff polyquine.c -

I’m aware that there are some impressive examples out there, but I haven’t analyzed them to avoid spoiling the fun.

What other language should I add? Reader contributions are welcome!

Quines

Introduction

One interesting programming exercise is to write a quine, a program that outputs its own source code (excluding empty programs!). If we naively try to write it just by using the print statement we will get into an infinite regression:

print 'print \' print \\\'print \\\\\\\'...

The basic problem is that the instructions required to write other instructions lead to an infinite recursion. So how can we avoid this problem?

The trick

One general way to do it is the one used by Von Neumann to make his abstract self-replicating machines and by nature itself. The trick is to handle the same instructions in two different ways: as instructions to be followed and as data to be copied.

Writing a C quine

As we will need to access the same data in two different ways, we must define a variable to avoid duplicating it:

char s[] = "<data will go here>";

It’s easy to write the code that prints everything before the data,

int main(void)
{
  printf("char s[] = \"");

but now we need to print the data as represented inside the string. Fortunately, of the characters we use, only the newline, the quote and the backlash require escape sequences:

  char *t = s;
  while (*t)
  {
    if (*t == '\n')
      printf("\\n");
    else if (*t == '"')
      printf("\\\"");
    else if (*t == '\\')
      printf("\\\\");
    else
      printf("%c", *t);
    t++;
  }

Finally, we print the data as a normal string and exit main():

  printf("%s", s);
  return 0;
}

This program is finished with exception of filling in the data in variable s. As this is quite tedious to do by hand, we will do it using a program:

char s[] = "\";\nint main(void)\n{\n  printf(\"char s[] = \\\"\");\n  char *t = s;\n  while (*t)\n  {\n    if (*t == '\\n')\n      printf(\"\\\\n\");\n    else if (*t == '\"')\n      printf(\"\\\\\\\"\");\n    else if (*t == '\\\\')\n      printf(\"\\\\\\\\\");\n    else\n      printf(\"%c\", *t);\n    t++;\n  }\n  printf(\"%s\", s);\n  return 0;\n}\n";

Putting it all together, we get a program that prints its own source:

char s[] = "\";\nint main(void)\n{\n  printf(\"char s[] = \\\"\");\n  char *t = s;\n  while (*t)\n  {\n    if (*t == '\\n')\n      printf(\"\\\\n\");\n    else if (*t == '\"')\n      printf(\"\\\\\\\"\");\n    else if (*t == '\\\\')\n      printf(\"\\\\\\\\\");\n    else\n      printf(\"%c\", *t);\n    t++;\n  }\n  printf(\"%s\", s);\n  return 0;\n}\n";
int main(void)
{
  printf("char s[] = \"");
  char *t = s;
  while (*t)
  {
    if (*t == '\n')
      printf("\\n");
    else if (*t == '"')
      printf("\\\"");
    else if (*t == '\\')
      printf("\\\\");
    else
      printf("%c", *t);
    t++;
  }
  printf("%s", s);
  return 0;
}

But, though it works in GCC, it’s not a strict C99 file:

cc1: warnings being treated as errors
prog.c: In function ‘main’:
prog.c:4: error: implicit declaration of function ‘printf’
prog.c:4: error: incompatible implicit declaration of built-in function ‘printf’

This can be fixed by including the stdio.h header:

#include <stdio.h>
char s[] = "\";\nint main(void)\n{\n  printf(\"#include <stdio.h>\\nchar s[] = \\\"\");\n  char *t = s;\n  while (*t)\n  {\n    if (*t == '\\n')\n      printf(\"\\\\n\");\n    else if (*t == '\"')\n      printf(\"\\\\\\\"\");\n    else if (*t == '\\\\')\n      printf(\"\\\\\\\\\");\n    else\n      printf(\"%c\", *t);\n    t++;\n  }\n  printf(\"%s\", s);\n  return 0;\n}\n";
int main(void)
{
  printf("#include <stdio.h>\nchar s[] = \"");
  char *t = s;
  while (*t)
  {
    if (*t == '\n')
      printf("\\n");
    else if (*t == '"')
      printf("\\\"");
    else if (*t == '\\')
      printf("\\\\");
    else
      printf("%c", *t);
    t++;
  }
  printf("%s", s);
  return 0;
}

Doing a Python quine

Doing a Python quine is much easier, because the built-in function repr() gives a string representation of an object, allowing us to skip most of the code of the C quine. Using it we can get this natural three line quine:

s = "print 's = %s' % repr(s)\nprint s"
print 's = %s' % repr(s)
print s

If we want to do code golfing, we can:

  • Remove unnecesary spaces and line breaks.
  • Printing the representation of the data and the data using only one print statement.

This give us a much shorter quine:

s="print's=%s;%s'%(repr(s),s)";print's=%s;%s'%(repr(s),s)

though still longer than the record one.