Escaping variables

(Just a short post to break the “dry spell”.)

One interesting consequence of C having non-pure functions is that every function you call can be accessing your local variables if pointers to them “escape”.

If we compile the following C code with -O3

extern void f(int *a);
extern void g(void);

int main(void)
{
  int a = 0;
  f(0);
  a++;
  g();
  a++;
  g();
  a++;
  return a;
}

we get machine code where the additions are collapsed and a is not even given a memory position:

main:
        push    {r3, lr}
        movs    r0, #0
        bl      f(int*)
        bl      g()
        bl      g()
        movs    r0, #3
        pop     {r3, pc}

But if we pass a pointer to the local variable to a single function,

extern void f(int *a);
extern void g(void);

int main(void)
{
  int a = 0;
  f(&a);
  a++;
  g();
  a++;
  g();
  a++;
  return a;
}

we get a load/store after every function call:

main:
        push    {lr}
        sub     sp, sp, #12
        add     r0, sp, #8
        movs    r3, #0
        str     r3, [r0, #-4]!
        bl      f(int*)
        ldr     r3, [sp, #4]
        adds    r3, r3, #1
        str     r3, [sp, #4]
        bl      g()
        ldr     r3, [sp, #4]
        adds    r3, r3, #1
        str     r3, [sp, #4]
        bl      g()
        ldr     r0, [sp, #4]
        adds    r0, r0, #1
        add     sp, sp, #12
        ldr     pc, [sp], #4

Once a pointer to a variable has escaped, the optimizer has to assume that it’s part of the global state and every function call could be modifying it.

Advertisements

Abusing macros for typechecking

One commonly used macro in C programming is ASIZE(), generally defined as something like this

#define ASIZE(a) (sizeof(a)/sizeof(a[0]))

and used to calculate the number of elements in an array.

The main problem with this macro, as written, is that it doesn’t distinguish between arrays and pointers. If passed a pointer, it will silently produce wrong results:

Code

#include <stdio.h>

#define ASIZE(a) (sizeof (a) / sizeof((a)[0]))

int main(void)
{
	short a[3];
	short *b;
	int c[2];
	int *d;
	long long e[5][4];
	char *f[4];
	char (*g)[4];
	(void)a; (void)b; (void)c; (void)d; (void)e; (void)f; (void)g;
	printf("ASIZE() accepts pointers, producing invalid results.\n");
	printf("%zu\n", ASIZE( a ));
	printf("%zu\n", ASIZE( b ));
	printf("%zu\n", ASIZE( c ));
	printf("%zu\n", ASIZE( d ));
	printf("%zu\n", ASIZE( e ));
	printf("%zu\n", ASIZE( f ));
	printf("%zu\n", ASIZE( g ));
	return 0;
}

Output

3
2
2
1
5
4
1

By adding a new macro checking if the parameter is an array, we can define a safer ASIZE():

#define CHECK_ARRAY(a) ((void)(0&&((int (*)(__typeof__(a[0])(*)[ASIZE(a)]))NULL)(&(a))))
#define ASIZE_SAFE(a) (CHECK_ARRAY(a), ASIZE(a))

Checking this new version, we see it gets the correct results when passed arrays, but now the compilation fails when applied to pointers:

Code

#include <stdio.h>

#define ASIZE(a) (sizeof (a) / sizeof((a)[0]))

#define CHECK_ARRAY(a) ((void)(0&&((int (*)(__typeof__(a[0])(*)[ASIZE(a)]))NULL)(&(a))))

#define ASIZE_SAFE(a) (CHECK_ARRAY(a), ASIZE(a))

int main(void)
{
	short a[3];
	short *b;
	int c[2];
	int *d;
	long long e[5][4];
	char *f[4];
	char (*g)[4];
	(void)a; (void)b; (void)c; (void)d; (void)e; (void)f; (void)g;
	printf("ASIZE() accepts pointers, producing invalid results.\n");
	printf("%zu\n", ASIZE( a ));
	printf("%zu\n", ASIZE( b ));
	printf("%zu\n", ASIZE( c ));
	printf("%zu\n", ASIZE( d ));
	printf("%zu\n", ASIZE( e ));
	printf("%zu\n", ASIZE( f ));
	printf("%zu\n", ASIZE( g ));
	printf("ASIZE_SAFE() only accepts arrays (try uncommenting).\n");
	printf("%zu\n", ASIZE_SAFE( a ));
	//printf("%zu\n", ASIZE_SAFE( b ));
	printf("%zu\n", ASIZE_SAFE( c ));
	//printf("%zu\n", ASIZE_SAFE( d ));
	printf("%zu\n", ASIZE_SAFE( e ));
	//printf("%zu\n", ASIZE_SAFE( f ));
	//printf("%zu\n", ASIZE_SAFE( g ));
	return 0;
}

Output

ASIZE() accepts pointers, producing invalid results.
3
2
2
1
5
4
1
ASIZE_SAFE() only accepts arrays (try uncommenting).
3
2
5

It works in a relatively straightforward way, though I have put the details in a gist to avoid spoiling them.

Universality with only two NOT gates

In a previous post we have asked how many NOTs are needed to compute an arbitrary Boolean function. In this post we will see that only two NOT gates are enough.

Building 3 NOT gates starting from 2

If we call the inputs X, Y and Z, we can make a function detecting when no more than one input is active using a single NOT gate:

\displaystyle f(X, Y, Z) = \overline{XY + YZ + XZ}.

Detects when no more than one input is active.

Detects when no more than one input is active.

By selecting only the cases where at least one input is present, adding a term to detect when all the inputs are active and using an additional NOT gate, we can detect when exactly zero or two inputs are active:

\displaystyle g(X, Y, Z) = \overline{f(X, Y, Z)(X + Y + Z) + XYZ}

\displaystyle = \overline{\overline{XY + YZ + XZ}(X + Y + Z) + XYZ}.

Detects when zero or two inputs are active.

Detects when zero or two inputs are active.

Now we know that if X is not present, we either have:

  • 0 inputs present: we can check that by simultaneously ensuring that we don’t have more than one input present and that we have either zero or two inputs present, f(X, Y, Z)\cdot g(X, Y, Z).
  • 1 input present: we should have no more than one input present and Y or Z should be present, f(X, Y, Z)\cdot(Y + Z).
  • 2 inputs present: we can check that by simultaneously ensuring that either zero or two inputs are present and that Y and Z are present, g(X, Y, Z)\cdot YZ.

Putting all together and adding the symmetrical cases:

\displaystyle \overline{X} = f(X, Y, Z) \cdot (Y + Z) + (f(X, Y, Z) + YZ)\cdot g(X, Y, Z)

\displaystyle \overline{Y} = f(X, Y, Z) \cdot (X + Z) + (f(X, Y, Z) + XZ)\cdot g(X, Y, Z)

\displaystyle \overline{Z} = f(X, Y, Z) \cdot (X + Y) + (f(X, Y, Z) + XY)\cdot g(X, Y, Z).

Example of computing NOT x.

Computing NOT X (the other cases are symmetrical).

Checking the solution

To get independent confirmation, let’s check the truth tables with a simple Python script:

from itertools import product

for x, y, z in product((False, True), repeat=3):
	f_xyz = not (x and y or x and z or y and z)
	g_xyz = not (f_xyz and (x or y or z) or x and y and z)
	not_x = f_xyz and (y or z) or (f_xyz or y and z) and g_xyz
	not_y = f_xyz and (x or z) or (f_xyz or x and z) and g_xyz
	not_z = f_xyz and (x or y) or (f_xyz or x and y) and g_xyz
	assert (not x == not_x) and (not y == not_y) and (not z == not_z)

Conclusion

As this technique allows us to expand two NOT gates to three and it can be applied repeatedly, we can compute an arbitrary Boolean function with a circuit containing only two NOT gates.

In a following post we will see how I arrived to the solution by using brute force.

How many NOTs are needed?

It’s relatively easy to see that we cannot compute an arbitrary Boolean function using only AND and OR gates. For example, even the NOT function cannot be computed using only those gates (why?).

Can we build a circuit to compute an arbitrary Boolean function using a constant number of NOT gates?


Solution to the “42 code golf” problem

This was my best result:

n=1e6,m,c,d;main(){while(c+=d==42,d=0,m=--n)while(d+=m%10,m/=10);printf("%d\n",c);}

It would have been nice to find a solution under 80 bytes but, after one hour of trying, that was the best I could manage…

Solving the viral Singaporean math problem

The following problem has become very popular in social media:

math3-superJumbo-v2

In this blog we have solved similar problems before, but this is one that can be easily solved by hand. We only need to be careful not to confuse our knowledge state with the one of Albert & Bernard.

Albert and Bernard just became friends with Cheryl, and they want to know when her birthday is. Cheryl gives them a list of 10 possible dates.

May 15 May 16 May 19

June 17 June 18

July 14 July 16

August 14 August 15 August 17

Cheryl then tells Albert and Bernard separately the month and the day of her birthday, respectively.

We will describe a list of the possible knowledge states of Albert and Bernard after being given that information:

Albert
  1. May 15 or May or May 19
  2. June 17 or June 18
  3. July 14 or July 16
  4. August 14 or August 15 or August 17
Bernard
  1. July 14 or August 14
  2. May 15 or August 15
  3. May 16 or July 16
  4. June 17 or August 17
  5. June 18
  6. May 19

Albert: I don’t know when Cheryl’s birthday is, but I know that Bernard does not know, too.

We already knew that Albert wouldn’t know the day based just on being given the month, but he is giving us additional information by telling us that he knows that Bernard doesn’t know. Bernard would know the date if the day were 18 or 19, so Albert knows that those days could not be the right ones. That excludes options 1 and 2 from our knowledge of Albert knowledge:

Albert
  1. July 14 or July 16
  2. August 14 or August 15 or August 17

Bernard can do the same deductions we have and eliminate the options that are not possible from his state of knowledge (all the options with months different from July and August).

Bernard
  1. July 14 or August 14
  2. August 15
  3. July 16
  4. August 17

Bernard: At first, I didn’t know when Cheryl’s birthday is, but I know now.

Updating our knowledge of Bernard knowledge:

Bernard
  1. August 15
  2. July 16
  3. August 17

As Albert also knows what we know about Bernard knowledge…

Albert
  1. July 16
  2. August 15 or August 17

Albert: Then I also know when Cheryl’s birthday is.

Now we know the right date:

Albert
  1. July 16
Bernard
  1. July 16

42 code golf

A nice and easy interview problem (link not posted to avoid giving good answers) is the following:

Print the number of integers below one million whose decimal digits sum to 42.

It can be solved with some simple Python code like the following:

print sum(1 if sum(int(c) for c in '%d' % n) == 42 else 0
          for n in range(1000000))

A more interesting problem is to try to write the smallest C program that solves the problem, where C program is defined as something that can be compiled & executed by Ideone in “C” mode. I know it can be done in 83 bytes, but can it be done using less?