Intermezzo: permutations

As I haven’t progressed very much with my WebGL apps due to a variety of reasons/excuses (lack of willpower, a crash course on nuclear reactor engineering since the Fukushima nuclear crisis started, etc.), I decided to make a quick post showing a solution to a simple problem. In other words, the main purpose of this post is just to avoid leaving a month without posts 😄

The problem

Given an arithmetic expression without parentheses, insert them to get a specified result. For example, if the expression is 2+2/2 and the desired result is 2, there is a single solution: (2+2)/2. On the other hand, if the desired result is 0 there is no solution.

It’s easy to see the answers in simple cases like the last one. But it’s not so easy to see if we can get a result of 18 by adding parentheses to 18-6/2+4-1*2… so let’s write code to do the hard work for us.

A solution

Parentheses only change the order in which the operators are applied, and the effect of any order of application can be obtained with some parenthesization. So, if we search for a permutation of the order in which the operators are applied that gives the desired result, we get a solution to the original problem.

This new problem is not exactly the same as the original one, for two reasons that can be illustrated as follows:

  • Given the expression 2+2*2+2, the operator orders 1-3-2 and 3-1-2 are associated with the same parenthesized expression: (2+2)*(2+2).
  • The parenthesizations (2+2)+2, 2+(2+2) and 2+2+2 are equivalent due to the associativity of addition.

But, as this only gives us some redundant solutions that can easily be eliminated with some post-processing, it’s not a problem from the correctness point of view. Being able to avoid these redundant solutions without enumerating them would be a great thing performance-wise, but it doesn’t seem easy and it’s not required for small expressions (10! < 10^7).

Permutations in lexicographic order

To avoid accidentally skipping some permutations, it’s useful to follow a specific order when enumerating them. In this section we will see how we can go through all the permutations of an array [0, 1, 2, …, n – 1] in lexicographic order.

Given some permutation of the array, which we will call a, the next one will probably share some prefix with it and differ in the order of the last elements. As we cannot get a lexicographically greater permutation by permuting a suffix that is already in reverse (descending) order, we need to permute a suffix that is not in reverse order. This gives us the first step, the search for that suffix:

Find the largest index k such that a[k] < a[k + 1].

Specifying the largest such index is equivalent to searching for the minimal suffix to permute, which is required to get the lexicographically immediate next permutation. If no such index exists, we are at the last permutation.

Now we need to decide what to permute. It’s clear that we cannot get a greater permutation just by permuting the suffix a[k+1 .. n-1], as its elements are in reverse order. So we need to replace a[k], and the obvious choice is the immediately greater element in the suffix or, as the suffix a[k+1 .. n-1] is in reverse order:

Find the largest index l such that a[k] < a[l].

The lexicographically smallest permutation starting with the original a[l] can be obtained by appending all the other elements of the original suffix in ascending order. But, as the suffix is in descending order and l is the largest index such that a[k] < a[l], we can get the same result in an easier way:

Swap a[k] with a[l].
Reverse the sequence from a[k + 1] up to and including the final element a[n - 1].

The final code in JS can be seen here.
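As a rough sketch of these four steps (the linked code is the authoritative version), the whole procedure can be written in JS as:

// Transforms a into its lexicographic successor in place;
// returns false if a was already the last permutation.
function nextPermutation(a) {
  // Step 1: find the largest k such that a[k] < a[k + 1]
  let k = a.length - 2;
  while (k >= 0 && a[k] >= a[k + 1]) k--;
  if (k < 0) return false; // whole array in reverse order: last permutation

  // Step 2: find the largest l such that a[k] < a[l]
  let l = a.length - 1;
  while (a[k] >= a[l]) l--;

  // Step 3: swap a[k] with a[l]
  [a[k], a[l]] = [a[l], a[k]];

  // Step 4: reverse the suffix a[k + 1 .. n - 1], leaving it ascending
  for (let i = k + 1, j = a.length - 1; i < j; i++, j--) {
    [a[i], a[j]] = [a[j], a[i]];
  }
  return true;
}

Starting from [0, 1, …, n - 1] and calling nextPermutation(a) until it returns false visits all n! permutations in lexicographic order.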

Evaluating an arithmetic expression with different orders of operations

I started with an array whose elements were alternately numbers and operators. The main problem is that, while it’s easy to go from an operator to the adjacent numbers in an array, this is not what we want to do. The desired behavior is to go from an operator to its operands, and those can be the results of a sequence of previous operations.

To solve this problem in a moderately efficient way, I adapted the parent-node concept from the disjoint-set data structure. Each operation stores its result at the operator node and makes that node the parent of both operands, allowing efficient access to the value of a whole operand from any node included in it. The code that does these operations can be seen at the SVN repository of this blog.
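A sketch of the idea (names are mine; the repository code is the reference), with tokens alternating numbers and operator characters and order listing the operator positions in the order they should be applied:

// Evaluate tokens (e.g. [2, '+', 2, '/', 2]) applying the operators at
// the positions listed in order (odd indices into tokens).
function evaluateWithOrder(tokens, order) {
  const parent = tokens.map((_, i) => i); // every node starts as its own root
  const value = tokens.slice(); // operator slots will receive results

  function find(i) { // root holding the value of the group containing i
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path compression
      i = parent[i];
    }
    return i;
  }

  for (const op of order) {
    const left = find(op - 1), right = find(op + 1);
    switch (tokens[op]) {
      case '+': value[op] = value[left] + value[right]; break;
      case '-': value[op] = value[left] - value[right]; break;
      case '*': value[op] = value[left] * value[right]; break;
      case '/': value[op] = value[left] / value[right]; break;
    }
    parent[left] = op;  // the operator node becomes the parent
    parent[right] = op; // of both of its operands
  }
  return value[find(0)];
}

For example, evaluateWithOrder([2, '+', 2, '/', 2], [3, 1]) gives 3 (i.e. 2+(2/2)), while evaluateWithOrder([2, '+', 2, '/', 2], [1, 3]) gives 2 (i.e. (2+2)/2).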

Testing the solution

The whole solution can be tested at the same SVN repository, but let’s see some specific examples:

Input expression: 3+2*2-1*0.5
Input result 1: 3
Output: (3+2*2-1)*0.5
Input result 2: 6
Output: 3+2*(2-1*0.5)

Input expression: 18-6/2+4-1*2
Input result: 18
Output 1: ((18-6)/2+4-1)*2
Output 2: 18-(6/(2+4)-1)*2

GPGPU with WebGL: solving Laplace’s equation

This is the first post in what will hopefully be a series of posts exploring how to use WebGL to do GPGPU (general-purpose computing on graphics processing units). In this installment we will solve a partial differential equation using WebGL: more specifically, Laplace’s equation.

Discretizing Laplace’s equation

Laplace’s equation, \nabla^2 \phi = 0, is one of the most ubiquitous partial differential equations in physics. It appears in many areas, including electrostatics, heat conduction and fluid flow.

To get a numerical solution of a differential equation, the first step is to replace the continuous domain by a lattice and the differential operators with their discrete versions. In our case, we just have to replace the Laplacian by its discrete version:

\displaystyle \nabla^2 \phi(x) = 0 \rightarrow \frac{1}{h^2}\left(\phi_{i-1\,j} + \phi_{i+1\,j} + \phi_{i\,j-1} + \phi_{i\,j+1} - 4\phi_{i\,j}\right) = 0,

where h is the grid size.

If we apply this equation at all internal points of the lattice (the external points must retain fixed values if we use Dirichlet boundary conditions), we get a big system of linear equations whose solution gives a numerical approximation to a solution of Laplace’s equation. Of the various methods for solving big linear systems, the Jacobi relaxation method seems the best fit for shaders, because it applies the same expression at every lattice point and has no dependencies between computations. Applying this method to our linear system, we get the following expression for the iteration:

\displaystyle \phi_{i\,j}^{(k+1)} = \frac{1}{4}\left(\phi_{i-1\,j}^{(k)} + \phi_{i+1\,j}^{(k)} + \phi_{i\,j-1}^{(k)} + \phi_{i\,j+1}^{(k)}\right),

where k is a step index.
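As a plain-JS illustration of this update rule (a CPU-side sketch with names of my own; the shader version comes below), one relaxation step over an n x n grid could look like:

// One Jacobi step: phi is a Float64Array of length n * n; the edge
// points act as fixed Dirichlet boundary values and are copied as-is.
function jacobiStep(phi, n) {
  const next = phi.slice();
  for (let i = 1; i < n - 1; i++) {
    for (let j = 1; j < n - 1; j++) {
      const idx = i * n + j;
      next[idx] = 0.25 * (phi[idx - n] + phi[idx + n] +
                          phi[idx - 1] + phi[idx + 1]);
    }
  }
  return next;
}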

Solving the discretized problem using WebGL shaders

If we use a texture to represent the domain and a fragment shader to do the Jacobi relaxation steps, the shader will follow this general pseudocode:

  1. Check whether this fragment is a boundary point. If it is, return the previous value of this point.
  2. Get the values of the four nearest neighbors.
  3. Return their average.

To flesh out this pseudocode, we need to define a specific representation for the discretized domain. Taking into account that the currently available WebGL versions don’t support floating point textures, we can use 32-bit RGBA fragments and do the following mapping:

R: Higher byte of \phi.
G: Lower byte of \phi.
B: Unused.
A: 1 if it’s a boundary value, 0 otherwise.

Most of the code is straightforward, but the multiprecision arithmetic is tricky, as the quantities we are working with behave as floating point numbers in the shaders but are stored as integers. More specifically, color values in the normal range, [0.0, 1.0], are multiplied by 255 and rounded to the nearest byte value when stored in the target texture.

My first idea was to reconstruct the floating point number from each input value, do the required operations with the floating point numbers, and convert the results back to color components that can be reliably stored (without losing precision). This gives us the following pseudocode for the iteration shader:

// wc is the color to the "west", ec is the color to the "east", ...
float w_val = wc.r + wc.g / 255.0;
float e_val = ec.r + ec.g / 255.0;
// ...
float val = (w_val + e_val + n_val + s_val) / 4.0;
float hi = val - mod(val, 1.0 / 255.0);
float lo = (val - hi) * 255.0;
fragmentColor = vec4(hi, lo, 0.0, 0.0);

The reason why we multiply by 255 instead of 256 is that we need lo to keep track of the part of val that will be lost when it is stored as a color component. As each byte value of a discrete color component is associated with a range of size 1/255 in its continuous counterpart, we need to use the “low byte” to store the position of the continuous component within that range.

Simplifying the code to avoid redundant operations, we get:

float val = (wc.r + ec.r + nc.r + sc.r) / 4.0 +
	(wc.g + ec.g + nc.g + sc.g) / (4.0 * 255.0);
float hi = val - mod(val, 1.0 / 255.0);
float lo = (val - hi) * 255.0;
fragmentColor = vec4(hi, lo, 0.0, 0.0);
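On the JS side, the two bytes read back from the texture (for example with gl.readPixels) can be recombined in the same way. A hedged sketch, with unpackPhi being a helper name of my own:

// pixels: Uint8Array of RGBA bytes; returns the reconstructed phi values
function unpackPhi(pixels, numTexels) {
  const phi = new Float32Array(numTexels);
  for (let i = 0; i < numTexels; i++) {
    const r = pixels[4 * i], g = pixels[4 * i + 1]; // R and G channels
    phi[i] = r / 255.0 + g / (255.0 * 255.0); // high byte + low byte
  }
  return phi;
}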

The result of running the full code, implemented in GLSL, is:

Solving Laplace's equation using a 32x32 grid. Click the picture to see the live solving process (if your browser supports WebGL).

As can be seen, it has quite low resolution but converges fast. But if we just crank up the number of points, the convergence gets slower:

Incompletely converged solution in a 512x512 grid. Click the picture to see a live version.

How can we reconcile fast convergence with high resolution?

Multigrid

The basic idea behind multigrid methods is to apply the relaxation method on a hierarchy of increasingly finer discretizations of the problem, using at each step the coarse solution obtained on the previous grid as the starting guess. In this way, the long wavelength parts of the solution (those that converge slowly on the finer grids) are obtained in the first coarse iterations, and the last iterations just add the finer details of the solution (those that converge relatively easily on the finer grids).

The implementation is quite straightforward, giving us fast convergence and high resolution at the same time:

Multigrid solution using grids from 8x8 to 512x512. Click the picture to see the live version.
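As a rough CPU-side sketch of the scheme (reusing jacobiStep from the sketch above; the nearest-neighbor upsampling is a simplification of mine, not necessarily the interpolation used in the actual code):

// Duplicate each coarse cell into a 2x2 block of the finer grid;
// crude, but enough for a starting guess.
function upsample(coarse, n) {
  const m = 2 * n, fine = new Float64Array(m * m);
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < m; j++) {
      fine[i * m + j] = coarse[(i >> 1) * n + (j >> 1)];
    }
  }
  return fine;
}

// Relax at each resolution, then use the result as the starting guess
// for the next finer grid.
function multigridSolve(phiCoarse, nCoarse, nFine, stepsPerLevel) {
  let phi = phiCoarse;
  for (let n = nCoarse; ; n *= 2) {
    for (let s = 0; s < stepsPerLevel; s++) phi = jacobiStep(phi, n);
    if (n >= nFine) return phi;
    phi = upsample(phi, n);
  }
}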

Conclusions

It’s quite viable to use WebGL for at least basic GPGPU tasks, though it is, in a certain sense, a step backward in time, as there is no CUDA, no floating point textures, nor any other feature that helps when working with non-graphics problems: you are on your own. But with the growing presence of WebGL support in modern browsers, it’s an interesting way of partially accessing the enormous computational power of modern video cards from any JS application, without requiring the installation of a native application.

In the next posts we will explore other kinds of problems where WebGL can provide a great performance boost.

Visualizing the Mandelbrot set with JavaScript

Fractals and the Mandelbrot set

Even though mathematical structures that are now called fractals were first analysed in the 19th century, the study of self-similar structures got a significant boost when computers able to visualize them started to appear in the 1970s. In fact, the term fractal was coined in 1975 by the late Benoit Mandelbrot to describe structures whose parts are similar to the whole.

The Mandelbrot set is a particular fractal set defined as:

\displaystyle M = \{ c \in \mathbb{C}: \lim_{n\to\infty} P_c^n(0) \neq \infty \},

where P_c^n(z) is the polynomial resulting from iterating the parameterized quadratic polynomial P_c(z) n times:

\displaystyle P_c(z) = z^2 + c.

More about the Mandelbrot set

The limit of a complex sequence z_n is said to be infinite if, for any positive number M, there is an element z_N of the sequence such that its modulus and the moduli of all succeeding elements are greater than M. In formal notation:

\displaystyle \lim_{n\to\infty} z_n = \infty \iff \forall M > 0\,\exists N > 0\,\forall n \geq N: |z_n| > M.

All the points of the complex plane that have a modulus greater than 2 are outside the set. To show that, let’s take a point c_0 whose modulus is strictly greater than 2:

\displaystyle |c_0| > 2.

To avoid cluttering the inequalities, let’s make two definitions:

\displaystyle a_n = |P_{c_0}^n(0)| and

\displaystyle k = |c_0|.

Then the statement that all points with modulus greater than two are outside the Mandelbrot set is equivalent to

\displaystyle \lim_{n \to \infty} a_n = \infty.

It’s clear from the definition that a_n is non-negative. Let’s analyse carefully the relationship between a term and the previous one:

\displaystyle a_{n+2} = |P_{c_0}^{n+2}(0)| = \left|\left(P_{c_0}^{n+1}(0)\right)^2+c_0\right|

\displaystyle a_{n+2} \geq \left|\left|\left(P_{c_0}^{n+1}(0)\right)^2\right|-|c_0|\right|

\displaystyle a_{n+2} \geq \left|\left|P_{c_0}^{n+1}(0)\right|^2-k\right|

\displaystyle a_{n+2} \geq \left|a_{n+1}^2-k\right|.

We can now inductively explore the hypothesis a_{n+2} \geq (1 + \epsilon) a_{n+1}, with \epsilon = k - 2.

Checking the base case:

\displaystyle a_2 \geq |a_1^2 - k| = |k^2 - k| = |(k-1)k| = (k-1)k = (1+\epsilon)k = (1+\epsilon)a_1

Applying the strong inductive hypothesis to a_{n+1} we get:

\displaystyle a_{n+1} \geq (1 + \epsilon)a_n \geq \ldots \geq (1 + \epsilon)^n a_1 = (1 + \epsilon)^n k

Using this result in the previous inequality:

\displaystyle a_{n+2} \geq a_{n+1}^2-k \quad (a_{n+1}^2 > a_{n+1} > k)

\displaystyle a_{n+2} > a_{n+1}^2-a_{n+1} \quad (a_{n+1} > k)

\displaystyle a_{n+2} > a_{n+1}(a_{n+1} - 1)

\displaystyle a_{n+2} > a_{n+1}(k - 1) \quad (a_{n+1} > k)

\displaystyle a_{n+2} > (1 + \epsilon) a_{n+1} \quad (1 + \epsilon = k - 1)

This implies that a_n will diverge at least geometrically, with ratio 1+ \epsilon. Then any point c whose modulus is greater than 2 will be outside the set.

It can be seen by a slight extension of this argument that, if there is some n such that |P_c^n(0)| is greater than 2, c will be outside the Mandelbrot set too.

Plotting the Mandelbrot set

The simplest way to plot the Mandelbrot set is to iterate the quadratic polynomial P_c(z) over the points c that are associated with an image in the usual way (via some choice of origin and scale factor). If an iterated point reaches a modulus greater than 2, we know that its modulus will diverge and that c is outside the set. There is no simple test to know if a point is inside the set, but we can choose a fixed number of iterations, or stop the process when no new points outside the set are detected.
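A minimal escape-time sketch of this procedure (names are my own, not necessarily the structure of the actual plotting code):

// Iterate z -> z^2 + c from z = 0; return the iteration at which |z|
// exceeds 2, or maxIter if the point is presumed inside the set.
function mandelbrotIterations(cRe, cIm, maxIter) {
  let zRe = 0, zIm = 0;
  for (let n = 0; n < maxIter; n++) {
    if (zRe * zRe + zIm * zIm > 4) return n; // |z| > 2: outside the set
    const newRe = zRe * zRe - zIm * zIm + cRe; // real part of z^2 + c
    zIm = 2 * zRe * zIm + cIm;                 // imaginary part of z^2 + c
    zRe = newRe;
  }
  return maxIter;
}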

Any point known to be outside the set (reached a modulus greater than 2 when iterated) can be colored white, using black to color the points of the set:

A Mandelbrot set plotted using black for the points inside the set and white for those outside the set. (Click to see a live version.)

We can make prettier visualizations if we use some measure of the “difficulty of divergence” of each point. One natural measure of this quantity is the number of iterations required to reach a given modulus. Combining this number of iterations with a function mapping the potentially unbounded number of iterations to a color in the black to white range, we get:

Mandelbrot set in gray scale. The gray levels of the points outside the set are related to the number of iterations required to reach a given modulus. (Click to see a live version.)

This visualization is nicer, and shows more clearly the existence of filament-like structures (in fact, the set is connected, but I don’t know how to prove that… 😄). But we can do better by interpolating the iteration number in a continuous way, giving us a less discrete measure of the “difficulty of divergence”. If we use the algorithm described by Linas Vepstas,

\mu = n + 1 - \frac{1}{\log 2}\log \log \left|P_c^n(0)\right|,

where \mu is the “continuous” iteration count, we get this picture:

A representation of the Mandelbrot set using the "continuous iteration count" to colorize the exterior. (Click to see a live version.)
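In JS, the smoothed count might be computed as follows (a sketch under the assumption that z = (zRe, zIm) is the first iterate whose modulus exceeded the bailout, after n iterations):

// "Continuous" iteration count as described by Vepstas
function continuousIterationCount(n, zRe, zIm) {
  const modulus = Math.sqrt(zRe * zRe + zIm * zIm);
  return n + 1 - Math.log(Math.log(modulus)) / Math.log(2);
}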

The source code of the visualization software can be seen in the SVN repository of this blog, and it can be used to explore the self-similarity of the fractal (left click zooms in, right click zooms out):

A "copy" of the Mandelbrot set in a filament. Zoom out in the live version to see its location.

Fluid dynamics in JavaScript

Inspired by an example ported by Oliver Hunt from C to JS (and frustrated because I couldn’t figure out how to solve the Ophanim level in Manufactoria 😄), I decided to port my Lattice Boltzmann (LB) fluid simulation code to JavaScript and add some interactivity. The visualization is somewhat primitive, as I concentrated on optimizing the computations without paying much attention to optimizing the canvas handling; I will probably improve that aspect in a future version.

The code has some limitations inherent to using a 64 x 64 grid as its base (the largest size with “realtime” performance) and, very probably, to my lack of knowledge about how to tune the LB parameters. The compressibility of the fluid is particularly noticeable, and it’s not very hard to make the simulation “blow up” (it recovers after a reset).

As WordPress doesn’t allow including JavaScript code in posts, you can view the simulation by clicking on the controls reference image.

Reference for the simulation controls. Click to start the simulation.

Using the controls is very simple:

  • Play/pause: starts or stops the simulation.
  • Reset: restarts the simulation from scratch.
  • Visualization reset: restarts the visualization, creating a new set of particles to trace the fluid’s movement (without altering the state of the fluid itself).

To give an impulse to a part of the fluid, just drag over the visualization area.

In a later post I will include details about how LB works and about the optimizations made (I have to read more, especially regarding the connection with the physical values 😄).

Why you can’t always win

Answer to the previous post

The technique shown in the previous post is called a martingale and is very old. Obviously, despite what some may say, it doesn’t work. But why did the previous simulation seem to show otherwise?

The main reason the simulation fails is that it doesn’t limit the amount of money the player has available to bet. As the amount that must be wagered grows exponentially with the number of rounds, after a relatively small number of them the player’s reserves will be exceeded and they won’t be able to keep betting.

Another problem with the simulation is that even a large number of runs can miss very improbable events. This isn’t problematic when the events in question don’t have exceptional consequences but, when they lead to an impact proportional to their improbability, the expected value can be heavily distorted (try the new JavaScript simulation with M = 100000 for N = 1000 and N = 1000000, and observe the differences in the estimated expected values).

But it can be objected (rightly so!) that we have only shown the failure of one betting strategy, not of all possible ones. Therefore, we will now show that the result applies to any strategy.

Why betting systems don’t work

As in any mathematical proof, we will need to make certain assumptions about the problem at hand:

  1. The game consists of a series of rounds, each associated with a random variable X_i. We assume all the X_i have the same distribution.
  2. In round i the player bets an amount m_i on a random event e_i (where we take e_i to be a subset of the values that X_i could take).
  3. If the event the player bet on occurs, the player receives the amount bet multiplied by a constant k(e_i) that depends on the chosen event. Otherwise, the player loses the amount bet.
  4. k(e) P(e) \le 1.
  5. The values of the random variables X_1, X_2, ..., X_{i-1} give no information about the value of X_i.

The first 3 assumptions are simply a general description of a game of chance, while assumption 5 says that previous rounds give us no information about the current one. Assumption 4 looks more complex, but it essentially says that the k factors have been correctly chosen by the casino.

To see this, suppose there were an event e such that k(e) P(e) > 1. Then, if we chose the strategy of betting a unit amount on the event e N times, our expected balance would be:

\displaystyle E[b] = E[\sum_{i=1}^N b_i]

\displaystyle = \sum_{i=1}^N E[b_i]

\displaystyle =  \sum_{i=1}^N ((k(e)-1)P(e) - (1-P(e)))

\displaystyle = N (k(e)P(e) - 1)

\displaystyle > 0,

where b is the player’s overall balance and b_i is the balance of round i.

That would imply we could win simply by repeating the same bet over and over (it can be shown without much difficulty that we wouldn’t even need very large reserves, only of order \sqrt{N}). As this clearly doesn’t happen in casinos, the multipliers are obviously chosen correctly (in fact, in roulette for example, they are chosen “in favor” of the casino thanks to the existence of the zero).

Let’s now find the expected value of a betting strategy m_i(i, X_1, X_2, ..., X_{i-1}), e_i(i, X_1, X_2, ..., X_{i-1}):

\displaystyle E[b] = E[\sum_{i=1}^N b_i]

\displaystyle = \sum_{i=1}^N E[b_i]

\displaystyle = \sum_{i=1}^N ((k(e_i)-1)P(e_i) - (1-P(e_i)))m_i

\displaystyle = \sum_{i=1}^N (k(e_i)P(e_i) - 1)m_i

where e_i and m_i are the functions just described. But assumption 5 tells us that

P(e_i|i,X_1=x_1, X_2=x_2, ..., X_{i-1} = x_{i-1}) = P(e_i),

that is, that information doesn’t change the probability of any event and, in particular, of e_i. Combining this with assumption 4, we arrive at

\displaystyle E[b] \le 0,

whenever m_i is positive, as is reasonable.
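A quick Monte Carlo check of this bound, in a toy setup of my own choosing (an even-money bet in single-zero roulette, so k = 2 and P = 18/37, and assumption 4 holds):

// Average final balance over numTrials games of numRounds unit bets;
// the result should approach numRounds * (k * P - 1) <= 0.
function averageBalance(numTrials, numRounds) {
  const k = 2, p = 18 / 37; // k * p = 36/37 < 1
  let total = 0;
  for (let t = 0; t < numTrials; t++) {
    let balance = 0;
    for (let i = 0; i < numRounds; i++) {
      balance += Math.random() < p ? k - 1 : -1; // win pays k - 1, loss costs 1
    }
    total += balance;
  }
  return total / numTrials;
}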

Escape routes

Even if a proof were perfect, its conclusions would never be stronger than its hypotheses. In this section we will analyze some ways in which those hypotheses can fail:

m_i < 0: This is essentially the method used by the casinos. It’s usually quite hard to pull off, but very profitable 🙂

Failure of hypothesis 4: The multipliers are easy to determine when the probabilities are known, and generally “nominal” probabilities are used (for example, assuming the 37 or 38 numbers of the roulette wheel are equiprobable). But if some observer finds evidence pointing to probabilities different from the nominal ones, there may be bets with positive expected value.

Failure of hypothesis 5: There may be unexpected correlations between the random variables. This is more common in computerized games, which usually use pseudorandom numbers and, in some cases, have implementation errors that make the generated sequences predictable.

How to always win :-)

A betting system, widely used judging by what can be found on Google, follows this scheme (here applied to a double-or-nothing bet on a “fair” coin):

  1. Start by betting one unit.
  2. Bet on “heads”.
  3. If you win, walk away.
  4. If you lose, double the bet and go back to step 2.

Let’s analyze this system by running a million simulations of the technique, using 3 programming languages: C++ (codepad), Python (codepad, N = 1000) and JavaScript. A sketch of the JS version is shown below.
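The sketch uses variable names of my own; the linked versions are the ones that produced the numbers below:

function simulateMartingale(numSimulations) {
  let totalRounds = 0;
  let minBalance = Infinity, maxBalance = -Infinity;
  for (let i = 0; i < numSimulations; i++) {
    let bet = 1, balance = 0, rounds = 0;
    for (;;) {
      rounds++;
      if (Math.random() < 0.5) { balance += bet; break; } // heads: we win
      balance -= bet; // tails: lose the bet, double it and play again
      bet *= 2;
    }
    totalRounds += rounds;
    minBalance = Math.min(minBalance, balance);
    maxBalance = Math.max(maxBalance, balance);
  }
  console.log('Number of simulations: ' + numSimulations);
  console.log('Minimum player balance: ' + minBalance);
  console.log('Maximum player balance: ' + maxBalance);
  console.log('Average number of rounds before winning: ' +
    totalRounds / numSimulations);
}

simulateMartingale(1000000);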

As can be seen, the results are essentially identical:

Number of simulations: 1000000
Minimum player balance: 1
Maximum player balance: 1
Average number of rounds before winning: 2.000831

In every run we observe a fixed gain of one unit, obtained on average after a small number of rounds.

Where is the problem with this system? The answer in the next post… 😄