This is the first post in what will hopefully be a series exploring how to use WebGL for GPGPU (general-purpose computing on graphics processing units). In this installment we will use WebGL to solve a partial differential equation, specifically Laplace’s equation.
Discretizing Laplace’s equation
To get a numerical solution of a differential equation, the first step is to replace the continuous domain by a lattice and the differential operators with their discrete versions. In our case, we just have to replace the Laplacian by its discrete version:
$$\nabla^2 u(x, y) \approx \frac{u(x+h, y) + u(x-h, y) + u(x, y+h) + u(x, y-h) - 4\,u(x, y)}{h^2}$$
where $u$ is the unknown function and $h$ is the grid size.
If we apply this equation at all internal points of the lattice (the external points must retain fixed values if we use Dirichlet boundary conditions), we get a big system of linear equations whose solution gives a numerical approximation to a solution of Laplace’s equation. Of the various methods for solving big linear systems, the Jacobi relaxation method seems the best fit for shaders, because it applies the same expression at every lattice point and has no dependencies between the computations within a step. Applying this method to our linear system, we get the following expression for the iteration:
$$u^{(k+1)}(x, y) = \frac{1}{4}\left[u^{(k)}(x+h, y) + u^{(k)}(x-h, y) + u^{(k)}(x, y+h) + u^{(k)}(x, y-h)\right]$$
where $k$ is a step index and $u^{(k)}(x, y)$ is the value at lattice point $(x, y)$ after $k$ steps, with neighbors one grid spacing $h$ away.
Solving the discretized problem using WebGL shaders
If we use a texture to represent the domain and a fragment shader to do the Jacobi relaxation steps, the shader will follow this general pseudocode:
- Check if this fragment is a boundary point. If it’s one, return the previous value of this point.
- Get the four nearest neighbors’ values.
- Return the average of their values.
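As a rough CPU-side illustration of the three steps above, here is a minimal JavaScript sketch that emulates what the fragment shader would do for each texel. The names `relaxCell` and `jacobiStep` and the separate `isBoundary` mask are assumptions made for this example, not part of the actual shader. It also assumes every edge cell is flagged as boundary, so neighbor lookups stay inside the grid.

```javascript
// Hypothetical CPU stand-in for the fragment shader: new value of one cell.
function relaxCell(values, isBoundary, width, x, y) {
  const idx = y * width + x;
  // Step 1: boundary points keep their previous value.
  if (isBoundary[idx]) return values[idx];
  // Step 2: fetch the four nearest neighbors.
  const west = values[idx - 1];
  const east = values[idx + 1];
  const north = values[idx - width];
  const south = values[idx + width];
  // Step 3: return their average.
  return (west + east + north + south) / 4;
}

// One Jacobi step: every cell is computed from the previous grid only,
// which is what lets a GPU process all fragments independently.
function jacobiStep(values, isBoundary, width, height) {
  const next = new Float64Array(values.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      next[y * width + x] = relaxCell(values, isBoundary, width, x, y);
    }
  }
  return next;
}

// 3x3 demo grid: the left edge is held at 1, the other edges at 0.
const demoValues = Float64Array.from([1, 0, 0, 1, 0, 0, 1, 0, 0]);
const demoBoundary = [1, 1, 1, 1, 0, 1, 1, 1, 1];
const demoNext = jacobiStep(demoValues, demoBoundary, 3, 3);
console.log(demoNext[4]); // 0.25, the average of 1, 0, 0 and 0
```

Writing into a fresh array instead of updating in place mirrors the way a shader reads from one texture and renders into another.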
To flesh out this pseudocode, we need to define a specific representation for the discretized domain. Taking into account that the currently available WebGL versions don’t support floating point textures, we can use 32-bit RGBA fragments and do the following mapping:
R: Higher byte of the value at this lattice point.
G: Lower byte of the value at this lattice point.
A: 1 if it’s a boundary value, 0 otherwise.
Most of the code is straightforward, but doing the multiprecision arithmetic is tricky, as the quantities we are working with behave as floating point numbers in the shaders but are stored as integers. More specifically, the color components in the normal range, [0.0, 1.0], are multiplied by 255 and rounded to the nearest byte value when stored in the target texture.
My first idea was to start by reconstructing the floating point number for each input value, do the required operations with these floating point numbers, and then convert the result to color components that can be reliably stored (without losing precision). This gives us the following pseudocode for the iteration shader:
    // wc is the color to the "west", ec is the color to the "east", ...
    float w_val = wc.r + wc.g / 255.0;
    float e_val = ec.r + ec.g / 255.0;
    // ...
    float val = (w_val + e_val + n_val + s_val) / 4.0;
    float hi = val - mod(val, 1.0 / 255.0);
    float lo = (val - hi) * 255.0;
    fragmentColor = vec4(hi, lo, 0.0, 0.0);
The reason why we multiply by 255 instead of 256 is that we need lo to keep track of the part of val that will be lost when hi is stored as a color component. As each byte value of a discrete color component is associated with a range of size 1/255 in its continuous counterpart, we use the “low byte” to store the position of the continuous component within that range.
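To make the 255-based split more concrete, the following JavaScript sketch emulates the round trip on the CPU: split a value into two components, store each as a byte, and rebuild the value the way the next iteration would. The helper names (`storeAsByte`, `splitValue`, `joinValue`, `roundTripError`) are hypothetical, and the round-to-nearest storage is an assumption taken from the description above rather than from a measured GPU.

```javascript
// Emulates writing a color component in [0, 1] to an 8-bit channel
// (assumption: the component is scaled by 255 and rounded to nearest).
function storeAsByte(component) {
  const clamped = Math.min(Math.max(component, 0), 1);
  return Math.round(clamped * 255);
}

// Split a value in [0, 1] into the two bytes the shader would output.
function splitValue(val) {
  const hi = val - (val % (1 / 255)); // val rounded down to a multiple of 1/255
  const lo = (val - hi) * 255;        // position inside that 1/255-wide range
  return [storeAsByte(hi), storeAsByte(lo)];
}

// Rebuild the value as the shader does when it samples the texture:
// each byte comes back as a component in [0, 1].
function joinValue(hiByte, loByte) {
  const r = hiByte / 255;
  const g = loByte / 255;
  return r + g / 255;
}

// Absolute error introduced by one store/load round trip.
function roundTripError(val) {
  const [hiByte, loByte] = splitValue(val);
  return Math.abs(joinValue(hiByte, loByte) - val);
}

for (const v of [0, 0.123456, 0.5, 0.999, 1]) {
  console.log(v, splitValue(v), roundTripError(v));
}
```

Under these assumptions the round-trip error stays at or below roughly 0.5 / 255², about 8e-6, compared with about 2e-3 when only a single byte is used.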
Simplifying the code to avoid redundant operations, we get:
    float val = (wc.r + ec.r + nc.r + sc.r) / 4.0 +
                (wc.g + ec.g + nc.g + sc.g) / (4.0 * 255.0);
    float hi = val - mod(val, 1.0 / 255.0);
    float lo = (val - hi) * 255.0;
    fragmentColor = vec4(hi, lo, 0.0, 0.0);
As can be seen, it has quite low resolution but converges fast. But if we just crank up the number of points, the convergence gets slower:
How can we reconcile these approaches?
The basic idea behind multigrid methods is to apply the relaxation method on a hierarchy of increasingly fine discretizations of the problem, using at each level the coarse solution obtained on the previous grid as the “starting guess”. In this way, the long wavelength parts of the solution (those that converge slowly on the finer grids) are obtained in the first coarse iterations, and the last iterations just add the finer parts of the solution (those that converge relatively easily on the finer grids).
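The coarse-to-fine idea can be sketched on the CPU in plain JavaScript. This is only an illustration under stated assumptions, not the WebGL implementation: the grid sizes, the sweep count, the bilinear interpolation between levels and the test boundary function `boundary(x, y) = x² - y²` (chosen because it is harmonic, so the exact solution is known) are all choices made for this example, and every function name is hypothetical.

```javascript
// Assumed demo boundary data: x^2 - y^2 is harmonic, so it is also the exact solution.
const boundary = (x, y) => x * x - y * y;

// Build an n x n grid on [0, 1]^2: edges from `boundary`, interior from `guess`.
function makeGrid(n, guess) {
  const u = new Float64Array(n * n);
  for (let j = 0; j < n; j++) {
    for (let i = 0; i < n; i++) {
      const x = i / (n - 1), y = j / (n - 1);
      const onEdge = i === 0 || j === 0 || i === n - 1 || j === n - 1;
      u[j * n + i] = onEdge ? boundary(x, y) : guess(x, y);
    }
  }
  return u;
}

// Run `sweeps` Jacobi steps, ping-ponging between two buffers.
function relax(u, n, sweeps) {
  let cur = u, next = Float64Array.from(u);
  for (let s = 0; s < sweeps; s++) {
    for (let j = 1; j < n - 1; j++) {
      for (let i = 1; i < n - 1; i++) {
        const k = j * n + i;
        next[k] = 0.25 * (cur[k - 1] + cur[k + 1] + cur[k - n] + cur[k + n]);
      }
    }
    const tmp = cur; cur = next; next = tmp;
  }
  return cur;
}

// Bilinear sample of an n x n grid at (x, y) in [0, 1]^2.
function sample(u, n, x, y) {
  const fx = x * (n - 1), fy = y * (n - 1);
  const i = Math.min(Math.floor(fx), n - 2), j = Math.min(Math.floor(fy), n - 2);
  const tx = fx - i, ty = fy - j;
  const a = u[j * n + i], b = u[j * n + i + 1];
  const c = u[(j + 1) * n + i], d = u[(j + 1) * n + i + 1];
  return (1 - ty) * ((1 - tx) * a + tx * b) + ty * ((1 - tx) * c + tx * d);
}

// Relax on each grid in turn, seeding every level with the previous (coarser) result.
function solveCoarseToFine(sizes, sweeps) {
  let u = null, prevN = 0;
  for (const n of sizes) {
    const coarse = u, coarseN = prevN;
    const guess = coarse ? (x, y) => sample(coarse, coarseN, x, y) : () => 0;
    u = relax(makeGrid(n, guess), n, sweeps);
    prevN = n;
  }
  return u;
}

// Largest deviation from the known exact solution.
function maxError(u, n) {
  let err = 0;
  for (let j = 0; j < n; j++) {
    for (let i = 0; i < n; i++) {
      err = Math.max(err, Math.abs(u[j * n + i] - boundary(i / (n - 1), j / (n - 1))));
    }
  }
  return err;
}

const solution = solveCoarseToFine([3, 5, 9], 200);
console.log(maxError(solution, 9));
```

In a WebGL version each level would presumably be a texture of the corresponding size, with the interpolation step done by rendering the coarse texture into the finer one.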
The implementation is quite straightforward, giving us fast convergence and high resolution at the same time:
It’s quite viable to use WebGL to do at least basic GPGPU tasks, though it is, in a certain sense, a step backward in time, as there is no CUDA, no floating point textures, nor any other feature that helps when working with non-graphics problems: you are on your own. But with the growing presence of WebGL support in modern browsers, it’s an interesting way of partially accessing the enormous computational power present in modern video cards from any JS application, without requiring the installation of a native application.
In the next posts we will explore other kinds of problem-solving where WebGL can provide a great performance boost.