The one who doesn’t match

A very common C interview problem is the following:

In an array with n positive integers, all values appear twice with the exception of one, that appears only once. Can we find it using constant space?

The usual trick is to use bitwise XOR in the following way,

unsigned answer1(const unsigned *a, size_t n)
{
    unsigned ret = 0;
    for (size_t i = 0; i < n; i++)
        ret ^= a[i];
    return ret;
}

and it works due to the properties a^b == b^a, a^(b^c) == (a^b)^c, a^0 == a and a^a == 0 (following the usual conventions, we assume that constant space refers to a constant number of computer words).

It is fairly straightforward to extend that solution to

In an array with n positive integers, all values appear k times with the exception of one, that appears only once. How much space and time do we need to find it?

(hint: XOR can be seen as addition modulo 2)

but what extensions are possible? If we generalize it to

In an array with n positive integers, all values appear k times with the exception of m values, that appear p times. How much space and time doe we need to find them?

a trivial upper bound is O(n) space and O(n log n) time, achievable by just sorting and counting. We know we can do better in some cases, but how can we characterize them?

Some answers in the next post (hopefully it won’t take one year to arrive!).

Advertisement

An average problem

This is a simple problem to keep the posting rhythm.

Given an array a of N elements, we keep fixed the first element at 0.0 and the last element at 1.0, while all the other elements are initialized to 0.5. In each time step, all the element in the middle of the array are set to the average of their adjacent elements.

For example, if we have N = 4,

[0.0, 0.5, 0.5, 1.0] # Initialization
[0.0, 0.25, 0.75, 1.0] # After first step
[0.0, 0.375, 0.675, 1.0] # After second step
...

We can easily express it in numpy:

import numpy as np

N = 4
S = 10

a = 0.5 * np.ones(N)
a[0] = 0.0
a[-1] = 1.0

print a
for i in xrange(S):
	a[1:-1] = 0.5 * (a[:-2] + a[2:])
	print a

The three questions are:

  • Does the array converge to a limit?
  • If it does, does it depend on its initial values?
  • Assuming it converges and given a tolerance \epsilon, how many steps does it take to get to a distance \epsilon of the limit?

Answers in the next post.

Paraxial approximation for spherical Green functions

Note: This post is just some notes for a discussion.

We can start by writing it simply as

\displaystyle f(\mathbf{r}) = \frac{e^{ikr}}{r}.

Now we use \mathbf{r} = [x\,y\,z]^T and assume |z| \gg |x|, |y|. After expanding the expression using the coordinates,

\displaystyle f(x, y, z) = \frac{\exp(ik\sqrt{z^2 + x^2 + y^2})}{\sqrt{z^2 + x^2 + y^2}},

we have to find now a way to get rid of the square roots. Taking z^2 as a common factor,

\displaystyle \sqrt{z^2 + x^2 + y^2} = \sqrt{z^2\left(1 + \frac{x^2 + y^2}{z^2}\right)}
\displaystyle = |z|\sqrt{1 + \frac{x^2 + y^2}{z^2}}.

As we know the argument of the square root is very close to 1, we can use a first order approximation, \sqrt{1 + x} \approx 1 + \frac{x}{2}:

\approx |z|\left(1 + \frac{x^2 + y^2}{2z^2}\right)

\approx |z|\left(1 + \frac{x^2 + y^2}{2z^2}\right)

\approx |z| + \frac{x^2 + y^2}{2|z|}.

Replacing in the original expression,

\displaystyle f(x, y, z) \approx \frac{1}{|z|}\exp\left(ik|z| + ik\frac{x^2+y^2}{2|z|}\right)

we get expressions matching the ones in the slides apart from some amplitude and phase conventions.

Superpolynomial

This post is just to help a friend that wanted to see why

\displaystyle \sum_{i=1}^{\lfloor\sqrt{n}\rfloor} \binom{n}{i}

was superpolynomial.

Getting a simple lower bound

We can start by keeping only the last term. As all the terms are positive, that will give us a lower bound:

\displaystyle \sum_{i=1}^{\lfloor\sqrt{n}\rfloor} \binom{n}{i} \ge \binom{n}{\lfloor\sqrt{n}\rfloor}.

We can expand the binomial coefficient in terms of factorials

\displaystyle \binom{n}{\lfloor\sqrt{n}\rfloor} = \frac{n!}{(n - \lfloor\sqrt{n}\rfloor)!\lfloor\sqrt{n}\rfloor!},

and some of the factors inside n! can then be cancelled with the ones in (n - \lfloor\sqrt{n}\rfloor)!, leaving the following expression:

\displaystyle \binom{n}{\lfloor\sqrt{n}\rfloor} = \frac{n(n-1)\hdots(n - \lfloor\sqrt{n}\rfloor + 1)}{\lfloor\sqrt{n}\rfloor!}.

As we want a lower bound, we can bound the numerator by below and the denominator by above, leaving us a bound without factorials:

\displaystyle \frac{n(n-1)\hdots(n - \lfloor\sqrt{n}\rfloor + 1)}{\lfloor\sqrt{n}\rfloor!} \ge \frac{(n - \sqrt{n})^{\sqrt{n} - 1}}{\sqrt{n}^{\sqrt{n}}}.

For any reasonably big value of n we will have n - \lfloor\sqrt{n}\rfloor \ge n/2. If we combine that with factoring out the -1 in the exponent we get

\displaystyle \frac{(n - \sqrt{n})^{\sqrt{n} - 1}}{\sqrt{n}^{\sqrt{n}}} \ge \frac{1}{n - \sqrt{n}}\left(\frac{n}{\sqrt{n}}\right)^{\sqrt{n}} \ge \frac{1}{n}\sqrt{n}^{\sqrt{n}},

removing the problematic floors and arriving to a much simpler expression.

Combining all these bounds we get

\displaystyle \sum_{i=1}^{\lfloor\sqrt{n}\rfloor} \binom{n}{i} \ge \frac{1}{n}\sqrt{n}^{\sqrt{n}},

a much simpler lower bound when compared to the original expression.

Proving it is superpolynomial

If w take an arbitrary polynomial p(n) of degree k we can say for n \ge 1 that

\displaystyle p(n) = \sum_{i=1}^k p_i\,n^i \le \left(\sum_{i=1}^k |p_i|\right)n^k,

meaning we can bound by above any polynomial of degree k by a polynomial of the same degree with a single term.

Taking the limit of the quotient of logarithms,

\displaystyle \lim_{x \to +\infty} \frac{\log \frac{1}{n}\sqrt{n}^{\sqrt{n}}}{\log a\,n^k} = \lim_{x \to +\infty} \frac{\frac{\sqrt{n}}{2}\log n - \log n}{\log a + k\,\log n}

\displaystyle = \lim_{x \to +\infty} \frac{\log n\left(\frac{\sqrt{n}}{2} - 1\right)}{\log n\left(\frac{\log a}{\log n} + k\right)}

\displaystyle = \lim_{x \to +\infty} \frac{\frac{\sqrt{n}}{2} - 1}{\frac{\log a}{\log n} + k}

\displaystyle = +\infty

we can see the lower bound of the original expression is superpolynomial, making the original expression superpolynomial too.

Escaping variables

(Just a short post to break the “dry spell”.)

One interesting consequence of C having non-pure functions is that every function you call can be accessing your local variables if pointers to them “escape”.

If we compile the following C code with -O3

extern void f(int *a);
extern void g(void);

int main(void)
{
  int a = 0;
  f(0);
  a++;
  g();
  a++;
  g();
  a++;
  return a;
}

we get machine code where the additions are collapsed and a is not even given a memory position:

main:
        push    {r3, lr}
        movs    r0, #0
        bl      f(int*)
        bl      g()
        bl      g()
        movs    r0, #3
        pop     {r3, pc}

But if we pass a pointer to the local variable to a single function,

extern void f(int *a);
extern void g(void);

int main(void)
{
  int a = 0;
  f(&a);
  a++;
  g();
  a++;
  g();
  a++;
  return a;
}

we get a load/store after every function call:

main:
        push    {lr}
        sub     sp, sp, #12
        add     r0, sp, #8
        movs    r3, #0
        str     r3, [r0, #-4]!
        bl      f(int*)
        ldr     r3, [sp, #4]
        adds    r3, r3, #1
        str     r3, [sp, #4]
        bl      g()
        ldr     r3, [sp, #4]
        adds    r3, r3, #1
        str     r3, [sp, #4]
        bl      g()
        ldr     r0, [sp, #4]
        adds    r0, r0, #1
        add     sp, sp, #12
        ldr     pc, [sp], #4

Once a pointer to a variable has escaped, the optimizer has to assume that it’s part of the global state and every function call could be modifying it.

Inverse kinematics and the Jacobian transpose

Generalities

This is a relatively technical post. Its purpose is mainly to teach myself why the Jacobian transpose is so useful when doing inverse kinematics.

We are going to solve the following problem:

We have a mechanism with effectors applying forces whose components are given by f_i, with i going from 1 to N. The required power is provided by a series of torques, whose components are called \tau_j, where j goes from 1 to M. Get the required values of \tau_j as a function of f_i.

Jacobian

Let’s call \theta_j the angular coordinate associated with the torque components \tau_j and x_i the normal (“linear”) coordinate of the effector associated with the force component f_i. In most useful cases, the positions of the effectors can be expressed as a function of the angular coordinates, x_i(\theta_j). The Jacobian will be the linearization of this relationship around some point \theta_j^0,

\displaystyle J_{ij}(\theta_j^0) = \left. \frac{\partial x_i}{\partial \theta_j} \right|_{\theta_j=\theta_j^0}.

Virtual work

If we can ignore inertia forces (either because we are dealing with a purely static problem or because inertia is negligible), we can get

\displaystyle \sum_{i=1}^N f_i \delta x_i = \sum_{j=1}^M \tau_j \delta \theta_j,

where \delta x_i is the infinitesimal linear displacement associated with the force component f_i and \delta \theta_j is the infinitesimal angular displacement associated with the torque component \tau_j.

Putting things together

As the previous expression uses infinitesimal movements, we can use the Jacobian to relate the linear displacements to the angular ones:

\displaystyle \delta x_i = \sum_{j=1}^M J_{ij} \delta \theta_j.

If we replace that result in the virtual work equation,

\displaystyle \sum_{i=1}^N f_i \left(\sum_{j=1}^M J_{ij} \delta \theta_j\right) = \sum_{j=1}^M \tau_j \delta \theta_j,

and we do some rearrangements, we get an expression with infinitesimal angular displacements in both sides:

\displaystyle  \sum_{j=1}^M \left(\sum_{i=1}^N f_i J_{ij}\right) \delta \theta_j = \sum_{j=1}^M \tau_j \delta \theta_j.

As the infinitesimal angular displacements \delta \theta_j are arbitrary, their factors should match:

\displaystyle \sum_{i=1}^N f_i J_{ij} = \tau_j.

By representing this equation in matrix form,

\displaystyle \boldsymbol{\tau} = \mathbf{J}^T \mathbf{f},

we see how we arrive naturally to the Jacobian transpose.

Terrenos multifractales – estado actual

Este es un screenshot del estado actual de la demo:

Screenshot de la demo de terrenos multifractales en su estado actual.

Screenshot de la demo de terrenos multifractales en su estado actual.

El terreno se ve bastante poco atractivo por la carencia absoluta de texturas e iluminación y existen algunos problemas en la forma de las costas, pero parece un camino prometedor. prácticamente no sufrió cambios en el aspecto desde hace un par de semanas, ya que me concentré en migrar desde SDL y GLEW a GLFW y GLEE (en busca de mayor simplicidad y menos DLLs).

El número de la barra de títulos es el número de FPS, pero no es tan bajo por la ineficiencia de mi código (que es mucha, sin lugar a dudas, pero compensada por la placa de video) sino porque no logro descubrir como desactivar el vsync :-S