Baez’s “42″ – Solving the first puzzle

In a blog post, John Baez proposes the following puzzle:

Consider solutions of

\displaystyle \frac{1}{p} + \frac{1}{q} + \frac{1}{r} = \frac{1}{2}

with positive integers p \le q \le r, and show that the largest possible value of r is 42.

We can start by writing the following simple Python program to solve the problem, using integer arithmetic to avoid floating point problems:

from itertools import count
for r in count(1):
    for q in xrange(1, r + 1):
        for p in xrange(1, q + 1):
            # 1/p + 1/q + 1/r = 1/2
            # qr + pr + pq = pqr/2
            # 2(qr + pr + pq) = pqr
            if 2*(q*r + p*r + p*q) == p*q*r:
                print p, q, r

This program shows us ten possible values (including 42 in the last triple),

6 6 6
4 8 8
5 5 10
4 6 12
3 12 12
3 10 15
3 9 18
4 5 20
3 8 24
3 7 42

but, as it’s trying to enumerate all integer triples, it doesn’t halt.

To be sure not to miss any interesting integer triple, we must explore them more carefully:

from itertools import count
for p in count(1):
    # checks if we are leaving room for 1/q + 1/r
    # 1/p >= 1/2
    if 2 >= p:
        continue # we need smaller values of 1/p
    # checks if we can reach 1/2 with the biggest legal values for 1/q and 1/r
    # 1/p + 1/p + 1/p < 1/2
    if 3 * 2 < p: 
        break
    for q in count(p):
        # checks if we are leaving room for 1/r
        # 1/p + 1/q >= 1/2
        if 2 * (q + p) >= p * q: 
            continue
        # checks if we can reach 1/2 with the biggest legal value for 1/r
        # 1/p + 1/q + 1/q < 1/2
        if 2 * (q + 2 * p) < p * q:
            break
        for r in count(q):
            lhs = 2 * (q * r + p * r + p * q)
            rhs = p * q * r
            # 1/p + 1/q + 1/r > 1/2
            if lhs > rhs:
                continue
            # 1/p + 1/q + 1/r < 1/2
            elif lhs < rhs:
                break
            # 1/p + 1/q + 1/r = 1/2
            else:
                print p, q, r

The results are the same, but now we can be sure of having the whole solution.

A simple combinatorics problem

To break the long hiatus, I’m going to solve a short combinatorics problem extracted from “A Walk Through Combinatorics”:

Find all positive integers a < b < c such that \frac{1}{a} + \frac{1}{b} + \frac{1}{c} = 1.

As a is the smallest of the three integers, we know that \frac{1}{a} will be the bigger of the three fractions. The three fractions must add to one and they must be different, so \frac{1}{a} must be equal to \frac{1}{2}. Rewriting the restriction using this newly found knowledge:

\displaystyle \frac{1}{2} + \frac{1}{b} + \frac{1}{c} = 1

\displaystyle \frac{1}{b} + \frac{1}{c} = \frac{1}{2}.

b must be greater than 2, but it cannot be bigger than 3 because, otherwise,

\displaystyle \frac{1}{b} + \frac{1}{c} \leq \frac{1}{4} + \frac{1}{5} = \frac{9}{20} < \frac{1}{2}.

Then we have b = 3 and

\displaystyle c = \frac{1}{1 - \frac{1}{2} - \frac{1}{3}} = 6.

Solving the binary puzzle

(This is the answer to the puzzle proposed in the previous post.)

If you write all binary numbers between 0 and 2n-1 you will notice that the columns show a pattern. See for example the first 8 binary numbers:

000
001
010
011
100
101
110
111

Each column has exactly the same number of zeroes and ones! Then the total number of ones is half the total number of digits:

\displaystyle \frac{n 2^n}{2} = n 2^{n-1}.

Once we have the formula, the values for n = 1 to 50 can easily be calculated.

A binary number puzzle

As I haven’t finished the posts I’ve planned to write, here is a quick puzzle involving binary numbers:

Compute the total number of one-bits in the binary representations of the numbers 0 to 2n-1 for n from 1 to 50.

For example, for n = 2 we have four different binary representations: 00, 01, 10 and 11. Adding the number of one-bits: 0 + 1 + 1 + 2 = 4.

Hint: it’s quite easy once you get the pattern…

Powers of two: the solution

(This post solves the puzzle given in the previous post.)

Doing an empirical analysis

As we are interested in powers of two starting with a given set of digits, without caring about their overall magnitude, let’s plot a histogram of the values of

\displaystyle \frac{2^n}{10^m},

with n going from 1 to 100000 and m chosen to get values between 0.1 and 1.

By using a simple python script and matplotlib, we get the following distribution:

Distribution of the first digits (considered as a fraction between 0.10 and 0.99) of the first 100000 powers of two.

The distribution follows Benford’s law quite precisely

The previous plot, adding the prediction given by Benford's law.

making us expect a uniform distribution in log space

Distribution of the first digits (considered as a fraction between 0.10 and 0.99) of the first 100000 powers of two but grouped in logarithmic bins.

As the leading digits expressed as fractions cover the interval between 0.1 and 1 quite densely, we can expect that some power of two will start with the digits 31415926535897932384626433832795028841971, but this plot is quite far from being a proof.

Translating the problem

We are looking for a power of two 2^n starting with a given sequence of m digits, k. By expressing the digits sequence as a fraction between 0.1 and 1, k/10^m, we can represent the desired condition by the following inequality:

\displaystyle \frac{k}{10^m} \cdot 10^d \le 2^n < \frac{k+1}{10^m} \cdot 10^d.

As logarithms are monotonically increasing, we can apply them on both sides of the inequality:

\displaystyle \log_{10}\frac{k}{10^m} + d \le n \log_{10} 2 < \log_{10}\frac{k+1}{10^m} + d.

If we assume k+1 is not a power of ten, we can translate the inequality to the fractional parts of each member:

\displaystyle \left\{\log_{10}\frac{k}{10^m} + d\right\} \le \left\{n \log_{10} 2\right\} < \left\{\log_{10}\frac{k+1}{10^m} + d\right\}

\displaystyle \log_{10}\frac{k}{10^m} \le \left\{n \log_{10} 2\right\} < \log_{10}\frac{k+1}{10^m}.

Now the problem is reduced to see if we can find a multiple of \log_{10} 2 such that its fractional part falls between two given real numbers. In the next section we will see how this problem can be solved.

Solving the problem

By the Archimedean property of the real numbers, we know that there must be a natural number q such that

\displaystyle \frac{1}{q} < \log_{10}\frac{k+1}{10^m} - \log_{10}\frac{k}{10^m}.

Now, let’s analyze the distribution of \{n\log_{10}2\} for n in the range 1, …, q+1. As we have q+1 numbers inside the unit interval, we must have two numbers with separation less than or equal to 1/q (it’s easy to prove by contradiction). Then there are two exponents, r and s, such that

\displaystyle 0 \le \left\{r\log_{10}2\right\} - \left\{s\log_{10}2\right\} \le \frac{1}{q},

\displaystyle 0 < \left\{(r-s)\log_{10}2\right\} \le \frac{1}{q},

where we know that the inequality with zero is strict because \log_{10}2 is irrational. Then we can use the Archimedean property again to extend the inequality:

\displaystyle \frac{1}{t} < \left\{(r-s)\log_{10}2\right\} \le \frac{1}{q}.

It's intuitively easy to see (and not very difficult to prove) that we must have a number u such that

\displaystyle u\left\{(r-s)\log_{10}2\right\} < \log_{10}\frac{k}{10^m} < (u+1)\left\{(r-s)\log_{10}2\right\} < \log_{10}\frac{k+1}{10^m}

As all the involved numbers are in the range [0.1, 1), we can put the multipliers inside the fractional parts and we can also discard the first inequality because it's not important for our proof:

\displaystyle \log_{10}\frac{k}{10^m} < \left\{(u+1)(r-s)\log_{10}2\right\} < \log_{10}\frac{k+1}{10^m}.

Removing the fractional parts:

\displaystyle \log_{10}\frac{k}{10^m} + \lfloor(u+1)(r-s)\log_{10}2\rfloor < (u+1)(r-s)\log_{10}2

\displaystyle (u+1)(r-s)\log_{10}2 < \log_{10}\frac{k+1}{10^m} + \lfloor(u+1)(r-s)\log_{10}2\rfloor

Exponentiating:

\displaystyle \frac{k}{10^m} \cdot 10^{\lfloor(u+1)(r-s)\log_{10}2\rfloor} < 2^{(u+1)(r-s)}

\displaystyle 2^{(u+1)(r-s)} < \frac{k+1}{10^m}\cdot 10^{\lfloor(u+1)(r-s)\log_{10}2\rfloor}

Conclusion

We have found that we can get a power of two starting with the sequence of digits we want, though we haven’t found any bounds on the required exponents. In fact, the powers of two follow Benford’s law and we can use that to estimate the magnitude of the required exponent:

\displaystyle P({\rm power\ of\ two\ starts\ with\ required\ digits}) \approx \log_{10}\left(1 + \frac{1}{3.2 \cdot 10^{40}}\right)

\displaystyle P({\rm power\ of\ two\ starts\ with\ required\ digits}) \approx \frac{1 + \frac{1}{3.2 \cdot 10^{40}} - 1}{\ln 10}

\displaystyle P({\rm power\ of\ two\ starts\ with\ required\ digits}) \approx 10^{-41}

So we can be quite confident that an exponent smaller than 10^{43} should meet the requirements, though it’s obviously impossible to find it by brute force. But it should be feasible to compute values of u, r and s and getting the first 100 digits of 2^{(u+1)(r-s)}.

Powers of two: a puzzle

While the answer to the previous post remains in its editing phase, this is a quick puzzle that cannot be brute forced:

Is there any power of two whose decimal representation starts with 31415926535897932384626433832795028841971?

My answer will be included in the next “quick post”.

Do we need to live with the uncertainty?

Introduction

In the previous post we have seen that the halting problem cannot be solved. But maybe the problem is just in our programming languages, so let’s see if we can improve them to disallow non-halting programs.

Bounded C

A program can fail to halt by two reasons (we are ignoring I/O, limited memory sizes and all that earthly issues :D ): it can get trapped in an infinite loop or we can enter an infinite recursion. So let’s take a language like C and remove these “troublesome” features from it by

  • only allowing iteration via the for (b = 0; b < a; b++) construct and
  • removing the possibility of doing either direct or indirect recursion.

At first sight we haven’t lost a lot. For example we can easily raise numbers to a power

int my_pow(int base, int exp)
{
    int p = 1, i;
    for (i = 0; i < exp; i++)
        p *= base;
    return p;
}

or compute the elements of the Fibonacci series.

int fib(int n)
{
    int a = 0, b = 1, i, t;
    for (i = 0; i < n; i++)
    {
        t = a + b;
        a = b;
        b = t;
    }
    return a;
}

But maybe we have lost something subtler…

Challenge

Can we make a function in “normal C” that cannot be implemented in our “bounded C”? The answer will be given in the following post… or in the comments :D