42 code golf

A nice and easy interview problem (link not posted to avoid giving good answers) is the following:

Print the number of integers below one million whose decimal digits sum to 42.

It can be solved with some simple Python code like the following:

print sum(1 if sum(int(c) for c in '%d' % n) == 42 else 0
          for n in range(1000000))

A more interesting problem is to try to write the smallest C program that solves the problem, where C program is defined as something that can be compiled & executed by Ideone in “C” mode. I know it can be done in 83 bytes, but can it be done using less?

Inverse kinematics and the Jacobian transpose

Generalities

This is a relatively technical post. Its purpose is mainly to teach myself why the Jacobian transpose is so useful when doing inverse kinematics.

We are going to solve the following problem:

We have a mechanism with effectors applying forces whose components are given by f_i, with i going from 1 to N. The required power is provided by a series of torques, whose components are called \tau_j, where j goes from 1 to M. Get the required values of \tau_j as a function of f_i.

Jacobian

Let’s call \theta_j the angular coordinate associated with the torque components \tau_j and x_i the normal (“linear”) coordinate of the effector associated with the force component f_i. In most useful cases, the positions of the effectors can be expressed as a function of the angular coordinates, x_i(\theta_j). The Jacobian will be the linearization of this relationship around some point \theta_j^0,

\displaystyle J_{ij}(\theta_j^0) = \left. \frac{\partial x_i}{\partial \theta_j} \right|_{\theta_j=\theta_j^0}.

Virtual work

If we can ignore inertia forces (either because we are dealing with a purely static problem or because inertia is negligible), we can get

\displaystyle \sum_{i=1}^N f_i \delta x_i = \sum_{j=1}^M \tau_j \delta \theta_j,

where \delta x_i is the infinitesimal linear displacement associated with the force component f_i and \delta \theta_j is the infinitesimal angular displacement associated with the torque component \tau_j.

Putting things together

As the previous expression uses infinitesimal movements, we can use the Jacobian to relate the linear displacements to the angular ones:

\displaystyle \delta x_i = \sum_{j=1}^M J_{ij} \delta \theta_j.

If we replace that result in the virtual work equation,

\displaystyle \sum_{i=1}^N f_i \left(\sum_{j=1}^M J_{ij} \delta \theta_j\right) = \sum_{j=1}^M \tau_j \delta \theta_j,

and we do some rearrangements, we get an expression with infinitesimal angular displacements in both sides:

\displaystyle  \sum_{j=1}^M \left(\sum_{i=1}^N f_i J_{ij}\right) \delta \theta_j = \sum_{j=1}^M \tau_j \delta \theta_j.

As the infinitesimal angular displacements \delta \theta_j are arbitrary, their factors should match:

\displaystyle \sum_{i=1}^N f_i J_{ij} = \tau_j.

By representing this equation in matrix form,

\displaystyle \boldsymbol{\tau} = \mathbf{J}^T \mathbf{f},

we see how we arrive naturally to the Jacobian transpose.

Force over a dipole

In an arbitrary field

The easiest way to get the force over a dipole is to consider it as the limit of two oppositely charged monopoles that are closely spaced. If the dipole has moment \mathbf{m} and is at position \mathbf{r_0}, it can be considered as the limit of two monopoles, one with charge -\epsilon^{-1} at position \mathbf{r_0} - (\epsilon / 2) \mathbf{m} and the other with charge \epsilon^{-1} at position \mathbf{r_0} + (\epsilon / 2) \mathbf{m}, when \epsilon goes to zero.

If we consider the finite size dipole immersed in a (let’s say magnetic) field \mathbf{B}(\mathbf{r}), the total force will be

\displaystyle \mathbf{F} = -\epsilon^{-1}\mathbf{B}(\mathbf{r_0} - (\epsilon / 2) \mathbf{m}) +\epsilon^{-1}\mathbf{B}(\mathbf{r_0} + (\epsilon / 2) \mathbf{m})

\displaystyle \mathbf{F} = \frac{\mathbf{B}(\mathbf{r_0} + (\epsilon / 2) \mathbf{m}) - \mathbf{B}(\mathbf{r_0} - (\epsilon / 2) \mathbf{m})}{\epsilon}.

We can get the force for the infinitesimal dipole by taking the limit when \epsilon goes to zero

\displaystyle \mathbf{F} = \lim_{\epsilon \to 0}\frac{\mathbf{B}(\mathbf{r_0} + (\epsilon / 2) \mathbf{m}) - \mathbf{B}(\mathbf{r_0} - (\epsilon / 2) \mathbf{m})}{\epsilon}

\displaystyle \mathbf{F} = \mathbf{m} \cdot (\mathbf{\nabla B})(\mathbf{r_0}),

where \mathbf{\nabla B} is the (tensor) derivative of the magnetic field.

Between two antialigned dipoles

The general field of a magnetic dipole of moment \mathbf{m} at position \mathbf{r_0} is

\displaystyle \mathbf{B}_d(\mathbf{r}) = \frac{\mu_0}{4\pi}\left(\frac{3\mathbf{m}\cdot(\mathbf{r}-\mathbf{r_0})(\mathbf{r}-\mathbf{r_0})-\mathbf{m}|\mathbf{r}-\mathbf{r_0}|^2}{|\mathbf{r}-\mathbf{r_0}|^5}\right).

If we assume we have one dipole at x = 0 with its moment +m\mathbf{\hat{x}} and the other at x = +R with its moment -m\mathbf{\hat{x}}, we get a field at x = +R of

\displaystyle \mathbf{B}(+R\mathbf{\hat{x}}) = \frac{\mu_0}{4\pi}\left(\frac{3m\mathbf{\hat{x}}\cdot(+R\mathbf{\hat{x}})(+R\mathbf{\hat{x}})-m\mathbf{\hat{x}}|+R\mathbf{\hat{x}}|^2}{|+R\mathbf{\hat{x}}|^5}\right)

\displaystyle \mathbf{B}(+R\mathbf{\hat{x}}) = \frac{\mu_0}{4\pi}\left(\frac{3mR^2-mR^2}{R^5}\right)\mathbf{\hat{x}}

\displaystyle \mathbf{B}(+R\mathbf{\hat{x}}) = \frac{\mu_0 m}{2\pi R^3}\mathbf{\hat{x}}.

By symmetry, we are only interested in the x-component of the x-derivative of the field,

\displaystyle \left(\frac{\partial\mathbf{B}}{\partial x}\right)_x = -\frac{3\mu_0 m}{2\pi R^4}.

And the force on the dipole at x = +R will be

\displaystyle \mathbf{F} = -m\mathbf{\hat{x}} \cdot (\mathbf{\nabla B})(+R\mathbf{\hat{x}})

\displaystyle \mathbf{F} = -m \left(-\frac{3\mu_0 m}{2\pi R^4}\right)\mathbf{\hat{x}}

\displaystyle \mathbf{F} = \frac{3\mu_0 m^2}{2\pi R^4}\mathbf{\hat{x}},

a repulsive force as expected of antialigned dipoles.

Yet another FizzBuzz without conditional branches

A branch-free FizzBuzz in assembly was posted in Hacker News. In this post I will try to do another implementation, for a Linux environment, without using function pointers.

C implementation

We can start by writing a C implementation:

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

int main( void )
{
	int i = 1;
	const char fizz[] = "Fizz";
	const char buzz[] = "Buzz";
	const char nl[] = "\n";
	char digits[ 2 ];
	while( 1 )
	{
		char *p = digits;
		*p = i / 10 + '0';
		p += ( i >= 10 );
		*p = i % 10 + '0';
		p++;
		syscall( SYS_exit * ( i > 100 ) + SYS_write * ( i <= 100 ), i <= 100,
		         digits, ( i % 3 != 0 && i % 5 != 0 ) * ( p - digits ) );
		syscall( SYS_write, 1, fizz, ( i % 3 == 0 ) * ( sizeof( fizz ) - 1 )  );
		syscall( SYS_write, 1, buzz, ( i % 5 == 0 ) * ( sizeof( buzz ) - 1 )  );
		syscall( SYS_write, 1, nl, sizeof( nl ) - 1 );
		i++;
	}
	/* NOT REACHED */
	return 0;
}

(Check the output in Ideone.)

The basic idea is to calculate the syscall number arithmetically and using the fact that a zero-length write is essentially a no-op.

In at least one associated assembly listing (obtained with Godbolt’s Interactive compiler) there are no conditional branches:

main:                                   # @main
	pushq	%rbp
	movq	%rsp, %rbp
	pushq	%r15
	pushq	%r14
	pushq	%r13
	pushq	%r12
	pushq	%rbx
	pushq	%rax
	movl	$1, %r15d
	leaq	-42(%rbp), %r14
.LBB0_1:                                # %._crit_edge
	cmpl	$100, %r15d
	movl	$0, %edi
	movl	$60, %eax
	cmovgl	%eax, %edi
	movslq	%r15d, %r15
	cmpl	$101, %r15d
	setl	%al
	movzbl	%al, %esi
	orl	%esi, %edi
	imulq	$1431655766, %r15, %rcx # imm = 0x55555556
	imulq	$1717986919, %r15, %rax # imm = 0x66666667
	movq	%rax, %r10
	shrq	$63, %r10
	shrq	$32, %rax
	movl	%eax, %edx
	sarl	$2, %edx
	leal	(%rdx,%r10), %ebx
	movq	%rcx, %r11
	shrq	$63, %r11
	cmpl	$9, %r15d
	setg	%r8b
	imull	$10, %ebx, %ebx
	negl	%ebx
	leal	48(%rdx,%r10), %edx
	movb	%dl, -42(%rbp)
	leal	48(%r15,%rbx), %r9d
	shrq	$32, %rcx
	addl	%r11d, %ecx
	leal	(%rcx,%rcx,2), %ecx
	movl	%r15d, %edx
	subl	%ecx, %edx
	sete	%r12b
	movzbl	%r8b, %ecx
	movb	%r9b, -42(%rbp,%rcx)
	setne	%bl
	sarl	%eax
	addl	%r10d, %eax
	leal	(%rax,%rax,4), %eax
	movl	%r15d, %edx
	subl	%eax, %edx
	sete	%r13b
	setne	%al
	andb	%bl, %al
	movzbl	%al, %eax
	shlq	$63, %rax
	sarq	$63, %rax
	incq	%rcx
	andq	%rax, %rcx
	movq	%r14, %rdx
	xorb	%al, %al
	callq	syscall
	movzbl	%r12b, %ecx
	shlq	$2, %rcx
	movl	$1, %edi
	movl	$1, %esi
	movl	main::fizz, %edx
	xorb	%al, %al
	callq	syscall
	movzbl	%r13b, %ecx
	shlq	$2, %rcx
	movl	$1, %edi
	movl	$1, %esi
	movl	main::buzz, %edx
	xorb	%al, %al
	callq	syscall
	movl	$1, %edi
	movl	$1, %esi
	movl	main::nl, %edx
	movl	$1, %ecx
	xorb	%al, %al
	callq	syscall
	incl	%r15d
	jmp	.LBB0_1

main::fizz:
	.asciz	 "Fizz"

main::buzz:
	.asciz	 "Buzz"

main::nl:
	.asciz	 "\n"

But there still are comparisons and conditional sets and moves.

x86-64 assembly implementation

We can write a cleaner implementation if we do it directly in assembly:

;; FizzBuzz in x86-64 ASM without conditional branches

section .data
fizz: db 'Fizz'
buzz: db 'Buzz'
nl:   db 10

section .bss
digits: resb 2

section .text

global _start
_start:
	mov rbx, 1

main_loop:

	; if ( rbx > 100 )
	;   exit( 0 )
	; else
	;   write( 1, digits, 0 )
	mov rax, rbx
	neg rax
	add rax, 100
	sar rax, 63
	mov rdi, rax
	and rax, 59
	inc rax
	neg rdi
	mov rsi, digits
	xor rdx, rdx
	syscall

	; r8 <- rbx % 3 != 0 ? -1 : 0
	; r9 <- rbx % 5 != 0 ? -1 : 0
	mov rax, rbx
	mov rcx, 3
	div rcx
	mov r8, rdx
	neg r8
	sar r8, 63
	mov rax, rbx
	mov rcx, 5
	xor rdx, rdx
	div rcx
	mov r9, rdx
	neg r9
	sar r9, 63

	; rcx <- digits
	; *rcx = rbx / 10 + '0'
	; rcx += ( rbx >= 10 )
	; *rcx = rbx % 10 + '0'
	; rcx++
	mov rcx, digits
	mov rax, rbx
	xor rdx, rdx
	mov rdi, 10
	div rdi
	mov rsi, rax
	add rax, '0'
	mov [rcx], rax
	neg rsi
	sar rsi, 63
	neg rsi
	add rcx, rsi
	add rdx, '0'
	mov [rcx], rdx
	inc rcx

	; write( 1, digits, ( rbx % 3 != 0 && rbx % 5 != 0 ) ? rcx - digits : 0 )
	mov rax, 1
	mov rdi, 1
	mov rsi, digits
	mov rdx, rcx
	sub rdx, digits
	and rdx, r8
	and rdx, r9
	syscall

	; write( 1, fizz, rbx % 3 == 0 ? 4 : 0 )
	mov rax, 1
	mov rdi, 1
	mov rsi, fizz
	mov rdx, r8
	not rdx
	and rdx, 4
	syscall

	; write( 1, buzz, rbx % 5 == 0 ? 4 : 0 )
	mov rax, 1
	mov rdi, 1
	mov rsi, buzz
	mov rdx, r9
	not rdx
	and rdx, 4
	syscall

	; write( 1, nl, 1 )
	mov rax, 1
	mov rdi, 1
	mov rsi, nl
	mov rdx, 1
	syscall	

	inc rbx

	jmp main_loop

If the source file is named fizzbuzz.s, it can be compiled with the following command line:

nasm -f elf64 -o fizzbuzz.o fizzbuzz.s; ld -o fizzbuzz fizzbuzz.o

No increasing functions in the long line

For our purposes, let’s define the “long line” as the set L = (0, 1) \times [0, 1) together with a lexicographic ordering:

\displaystyle (a, b) \prec (c, d) \Leftrightarrow (a < c) \vee (a = c) \wedge (b < d).

We want to prove that we cannot have a strictly increasing real-valued function in L, meaning that there is no function f: L \to \mathbb{R} such that

\displaystyle f((a, b)) < f((c, d)) \Leftrightarrow (a, b) \prec (c, d).

To start with the proof we are going to require a lemma.

Between two different real numbers there is at least one rational number

The basic idea (lifted from this math.SE answer) is to define “a grid” of rationals that is thick enough to be sure we have at least one rational between the two reals. So, if a < b are the two real numbers, we can use as denominator

\displaystyle n = \left\lceil \frac{2}{b - a}\right\rceil

and as numerator

\displaystyle m = \lfloor n a + 1 \rfloor.

Now we need to prove that m/n is always strictly between a and b.

It’s easy to see that is bigger than a,

\displaystyle \frac{m}{n} = \frac{\lfloor n a + 1 \rfloor}{n} > \frac{n a}{n} = a,

and that (m-1)/n is less than or equal to a:

\displaystyle \frac{m-1}{n} = \frac{\lfloor n a + 1 \rfloor - 1}{n} \le \frac{n a + 1 - 1}{n} = a.

As we know that 1/n is smaller than b - a because

\displaystyle \frac{b - a}{1 / n} = n(b - a) = \left\lceil \frac{2}{b - a}\right\rceil (b - a) > \frac{2}{b - a}(b - a) = 2,

we have

\displaystyle \frac{m}{n} = \frac{m-1}{n} + \frac{1}{n} < a + (b - a) = b.

Building an impossible one-to-one function

Using g as a name for the previously defined function that takes a pair of reals and returns a rational between them and f for an increasing function in L, we can define h: (0, 1) \to \mathbb{Q} by the following expression:

\displaystyle h(x) = g(f((x, 0)), f((x, 0.5))).

We know that is one-to-one because if x \ne y, assuming WLOG that x < y,

\displaystyle h(x) = g(f((x, 0)), f((x, 0.5)))
\displaystyle < f((x, 0.5))
\displaystyle < f((y, 0))
\displaystyle < g(f((y, 0)), f((y, 0.5))) = h(y).

But this result would imply that the cardinality of (0, 1) is smaller than the cardinality of a subset of \mathbb{Q} and that is absurd.

Conclusions

This result would seem a random trivia, but it has an important consequence for microeconomics: lexicographic preferences cannot be described by (real-valued) utility functions.

Kaufman decimals

Ben Orlin posted an interesting extension of the normal repeating decimals called “Kaufman decimals” (because they were developed jointly with Jeff Kaufman). In this post I will try to define them more precisely and show they can be totally ordered.

Introduction

The Kaufman digit sequences (the decimals without the initial “0.” prefix) are formed by starting from single digit sequences (“0”, “1”, …, “9”) and applying the following operations a finite number of times:

  • Concatenation: taking two digit sequences and putting one after the other.
  • Repetition: taking a digit sequence and repeating it an infinite number of times.

As it’s difficult to start directly with the infinite case, let’s start by defining formally a more simple case.

Finite repetitions

If we replace the infinite repetition by repeating the sequence K times, it’s easy to analyze the resulting sequences. For example, if K = 3,

\displaystyle \overline{\overline{\strut 9}\,1}\,2 = 9991999199912.

So now we can define concatenation in the following way:

\displaystyle \mathcal{C}(a_{i<n}, b_{j<m})_{k<n+m} = \begin{cases}a_k & \text{if }k < n \\ b_{k-n} & \text{otherwise}\end{cases}.

Repetition is also quite easy:

\displaystyle \mathcal{R}_K(a_{i<n})_{j < n \cdot K} = a_{k \bmod n}.

We can check the definition with the following Python code:

def l(a):
    from itertools import count
    for i in count():
        if a(i) is None:
            return i
 
def r(k, a):
    l_a = l(a)
    return lambda i: a(i % l_a) if i < l_a * k else None
 
def c(a, b):
    l_a, l_b = l(a), l(b)
    return lambda i: a(i) if i < l_a else b(i - l_a) if i < l_a + l_b else None
 
def s(c):
    assert len(c) == 1
    return lambda i: c if i == 0 else None
 
def p(a):
    print ''.join(a(i) for i in range(l(a)))
 
if __name__ == '__main__':
    p(c(r(3,c(r(3,s('9')),s('1'))),s('2')))

that gives us the expected output:

9991999199912
Infinite repetitions

For the real Kaufman digit sequences we can promote the ordinary digit sequences to transfinite ones (we use a slightly non-standard definition, allowing sequences of any length), using ordinal indices. Then the concatenation and repetition operations can be defined in essentially the same way:

  • Concatenation: \mathcal{C}(a_{\alpha < \delta}, b_{\beta < \zeta})_{\gamma < \delta + \zeta} = \begin{cases}a_\gamma & \text{if }\gamma < \delta \\ b_{\gamma - \delta} & \text{otherwise}\end{cases}
  • Repetition: \mathcal{R}(a_{\alpha < \gamma})_{\beta < \gamma \cdot \omega} = a_{\beta \bmod \gamma}.

Ordinal modulus is easily defined by using the division theorem for ordinals.

We can construct the set containing all the Kaufman sequences by starting with the one digit “sequences”,

\displaystyle K_0 = \{"0", "1", ..., "9"\} (where “0” is the length 1 sequence having the single digit 0),

and defining the next step based on the previous one,

\displaystyle K_{n+1} = K_n \cup \{\mathcal{C}(a_{\alpha < \delta}, b_{\beta < \zeta}) | a_{\alpha < \delta}, b_{\beta < \zeta} \in K_n\} \cup \{\mathcal{R}(a_{\alpha < \delta}) | a_{\alpha < \delta} \in K_n\}.

As we have defined K_n for all integer n, now we can define K_\omega, the set of Kaufman sequences, as

\displaystyle K_\omega = \bigcup_{n < \omega} K_n.

Total order

It’s easy to see that this sequences can be totally ordered. Let’s define the extension operation for the Kaufman digit sequences:

\displaystyle \mathcal{E}(a_{\alpha < \gamma})_{\beta \in \mathrm{On}} = \begin{cases}a_\beta & \text{if }\beta < \gamma \\ 0 & \text{otherwise}\end{cases}

Then we define two sequences are equal if their extensions match:

\displaystyle a_{\alpha < \delta} = b_{\beta < \zeta} \Leftrightarrow \forall \gamma \in \mathrm{On} : \mathcal{E}(a_{\alpha < \delta})_\gamma = \mathcal{E}(b_{\beta < \zeta})_\gamma.

If they don't match, they must differ in some digit and we can order them based on that digit:

\displaystyle a_{\alpha < \delta} < b_{\beta < \zeta} \Leftrightarrow \exists \gamma \in \mathrm{On} : \left(\mathcal{E}(a_{\alpha < \delta})_\gamma < \mathcal{E}(b_{\beta < \zeta})_\gamma \right)\wedge \left(\forall \lambda < \gamma : \mathcal{E}(a_{\alpha < \delta})_\lambda < \mathcal{E}(b_{\beta < \zeta})_\lambda\right).

Conclusions

The Kaufman decimals can be formally defined and totally ordered by using transfinite sequences. It wouldn’t be too hard to adapt the previous Python code to accept ordinal indices in Cantor normal form, but I’m still not sure if there is an efficient procedure to order Kaufman decimals.

Baez’s “42” – Solving the first puzzle

In a blog post, John Baez proposes the following puzzle:

Consider solutions of

\displaystyle \frac{1}{p} + \frac{1}{q} + \frac{1}{r} = \frac{1}{2}

with positive integers p \le q \le r, and show that the largest possible value of r is 42.

We can start by writing the following simple Python program to solve the problem, using integer arithmetic to avoid floating point problems:

from itertools import count
for r in count(1):
    for q in xrange(1, r + 1):
        for p in xrange(1, q + 1):
            # 1/p + 1/q + 1/r = 1/2
            # qr + pr + pq = pqr/2
            # 2(qr + pr + pq) = pqr
            if 2*(q*r + p*r + p*q) == p*q*r:
                print p, q, r

This program shows us ten possible values (including 42 in the last triple),

6 6 6
4 8 8
5 5 10
4 6 12
3 12 12
3 10 15
3 9 18
4 5 20
3 8 24
3 7 42

but, as it’s trying to enumerate all integer triples, it doesn’t halt.

To be sure not to miss any interesting integer triple, we must explore them more carefully:

from itertools import count
for p in count(1):
    # checks if we are leaving room for 1/q + 1/r
    # 1/p >= 1/2
    if 2 >= p:
        continue # we need smaller values of 1/p
    # checks if we can reach 1/2 with the biggest legal values for 1/q and 1/r
    # 1/p + 1/p + 1/p < 1/2
    if 3 * 2 < p: 
        break
    for q in count(p):
        # checks if we are leaving room for 1/r
        # 1/p + 1/q >= 1/2
        if 2 * (q + p) >= p * q: 
            continue
        # checks if we can reach 1/2 with the biggest legal value for 1/r
        # 1/p + 1/q + 1/q < 1/2
        if 2 * (q + 2 * p) < p * q:
            break
        for r in count(q):
            lhs = 2 * (q * r + p * r + p * q)
            rhs = p * q * r
            # 1/p + 1/q + 1/r > 1/2
            if lhs > rhs:
                continue
            # 1/p + 1/q + 1/r < 1/2
            elif lhs < rhs:
                break
            # 1/p + 1/q + 1/r = 1/2
            else:
                print p, q, r

The results are the same, but now we can be sure of having the whole solution.