Force on a dipole

In an arbitrary field

The easiest way to get the force on a dipole is to consider it as the limit of two closely spaced, oppositely charged monopoles. If the dipole has moment \mathbf{m} and is at position \mathbf{r_0}, it can be considered as the limit of two monopoles, one with charge -\epsilon^{-1} at position \mathbf{r_0} - (\epsilon / 2) \mathbf{m} and the other with charge \epsilon^{-1} at position \mathbf{r_0} + (\epsilon / 2) \mathbf{m}, as \epsilon goes to zero.

If we consider this finite-size dipole immersed in a (let’s say magnetic) field \mathbf{B}(\mathbf{r}), the total force will be

\displaystyle \mathbf{F} = -\epsilon^{-1}\mathbf{B}(\mathbf{r_0} - (\epsilon / 2) \mathbf{m}) +\epsilon^{-1}\mathbf{B}(\mathbf{r_0} + (\epsilon / 2) \mathbf{m})

\displaystyle \mathbf{F} = \frac{\mathbf{B}(\mathbf{r_0} + (\epsilon / 2) \mathbf{m}) - \mathbf{B}(\mathbf{r_0} - (\epsilon / 2) \mathbf{m})}{\epsilon}.

We can get the force on the infinitesimal dipole by taking the limit as \epsilon goes to zero

\displaystyle \mathbf{F} = \lim_{\epsilon \to 0}\frac{\mathbf{B}(\mathbf{r_0} + (\epsilon / 2) \mathbf{m}) - \mathbf{B}(\mathbf{r_0} - (\epsilon / 2) \mathbf{m})}{\epsilon}

\displaystyle \mathbf{F} = \mathbf{m} \cdot (\mathbf{\nabla B})(\mathbf{r_0}),

where \mathbf{\nabla B} is the (tensor) derivative of the magnetic field.
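This limit is easy to check numerically: for any smooth field, the two-monopole force should approach \mathbf{m} \cdot (\mathbf{\nabla B}) as \epsilon shrinks. A quick sketch with a made-up field (the field, moment and position below are arbitrary test values, used only for illustration):

```python
import numpy as np

# A made-up smooth field, used only for testing: B(r) = (x*y, y*z, z*x).
def B(r):
    x, y, z = r
    return np.array([x * y, y * z, z * x])

m  = np.array([0.2, -0.1, 0.3])  # dipole moment (arbitrary)
r0 = np.array([1.0, 2.0, 3.0])   # dipole position (arbitrary)

# Force on the two-monopole model for a small but finite epsilon.
def two_monopole_force(eps):
    return (B(r0 + (eps / 2) * m) - B(r0 - (eps / 2) * m)) / eps

# Analytic force F_j = sum_i m_i dB_j/dx_i, from the Jacobian of B at r0.
x, y, z = r0
jacobian = np.array([[y, x, 0.0],   # gradient of B_x = (y, x, 0)
                     [0.0, z, y],   # gradient of B_y = (0, z, y)
                     [z, 0.0, x]])  # gradient of B_z = (z, 0, x)
F_exact = jacobian @ m

print(two_monopole_force(1e-6))  # -> approximately [0.3 0.3 0.9]
print(F_exact)
```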

Between two antialigned dipoles

The general field of a magnetic dipole of moment \mathbf{m} at position \mathbf{r_0} is

\displaystyle \mathbf{B}_d(\mathbf{r}) = \frac{\mu_0}{4\pi}\left(\frac{3\left[\mathbf{m}\cdot(\mathbf{r}-\mathbf{r_0})\right](\mathbf{r}-\mathbf{r_0})-\mathbf{m}|\mathbf{r}-\mathbf{r_0}|^2}{|\mathbf{r}-\mathbf{r_0}|^5}\right).

If we assume we have one dipole at x = 0 with moment +m\mathbf{\hat{x}} and the other at x = +R with moment -m\mathbf{\hat{x}}, the field of the first dipole at x = +R is

\displaystyle \mathbf{B}(+R\mathbf{\hat{x}}) = \frac{\mu_0}{4\pi}\left(\frac{3\left[m\mathbf{\hat{x}}\cdot(+R\mathbf{\hat{x}})\right](+R\mathbf{\hat{x}})-m\mathbf{\hat{x}}|+R\mathbf{\hat{x}}|^2}{|+R\mathbf{\hat{x}}|^5}\right)

\displaystyle \mathbf{B}(+R\mathbf{\hat{x}}) = \frac{\mu_0}{4\pi}\left(\frac{3mR^2-mR^2}{R^5}\right)\mathbf{\hat{x}}

\displaystyle \mathbf{B}(+R\mathbf{\hat{x}}) = \frac{\mu_0 m}{2\pi R^3}\mathbf{\hat{x}}.

By symmetry, we are only interested in the x-component of the x-derivative of the field,

\displaystyle \left(\frac{\partial\mathbf{B}}{\partial x}\right)_x = -\frac{3\mu_0 m}{2\pi R^4}.

And the force on the dipole at x = +R will be

\displaystyle \mathbf{F} = -m\mathbf{\hat{x}} \cdot (\mathbf{\nabla B})(+R\mathbf{\hat{x}})

\displaystyle \mathbf{F} = -m \left(-\frac{3\mu_0 m}{2\pi R^4}\right)\mathbf{\hat{x}}

\displaystyle \mathbf{F} = \frac{3\mu_0 m^2}{2\pi R^4}\mathbf{\hat{x}},

a repulsive force as expected of antialigned dipoles.
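We can cross-check this result numerically, taking the on-axis field from the expression above and differentiating it by central differences (the values of m and R below are arbitrary test values):

```python
from math import pi

MU0 = 4e-7 * pi   # vacuum permeability
m, R = 0.5, 0.2   # dipole moment and separation (arbitrary test values)

# On-axis field of the dipole +m x-hat at the origin, from the result above.
def Bx(x):
    return MU0 * m / (2 * pi * x**3)

# Force on the antialigned dipole at x = R: F_x = (-m) dBx/dx,
# with the derivative taken by central differences.
h = 1e-6
F_numeric = -m * (Bx(R + h) - Bx(R - h)) / (2 * h)
F_exact = 3 * MU0 * m**2 / (2 * pi * R**4)

print(F_numeric > 0, F_exact > 0)          # -> True True (repulsion)
print(abs(F_numeric - F_exact) / F_exact)  # relative error is tiny
```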

The fall of the slinky I

The video

The video shows how a spring suspended from one of its ends reacts when it’s dropped. It can be observed that the lower end “doesn’t know what happened” until a wave propagates to it. In this post we will make a computer simulation of its behavior, to see if we can reproduce the phenomenon, and in the next one we will apply a more analytic approach to the same problem.

Discretization

As we are studying the actions of gravity and inertia, we need to model the slinky as a massive spring. In reality the mass and elasticity are continuously distributed throughout the slinky (macroscopically speaking) but, for the purpose of simulating its behavior with a computer model, we will represent the object as a series of point masses connected by massless ideal springs:

Slinky discretization, showing how the discretized element properties relate to the original ones.

If we divide a slinky of total mass M and length L into N small masses, each of them will have the value m = M/N. As the spring constants of springs in series combine like the resistances of resistors in parallel, if we have a slinky with overall spring constant K divided into N-1 smaller springs, each of them will have a bigger spring constant, k = K(N-1), and (obviously) a smaller unstressed length, l_0 = L/(N-1).
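These relations are easy to sanity-check in code; a small sketch (the parameter values are arbitrary):

```python
N, M, K, L = 10, 1.0, 5.0, 0.5  # arbitrary test values

m  = M / N        # each small mass
k  = K * (N - 1)  # spring constant of each small spring
l0 = L / (N - 1)  # unstressed length of each small spring

# N-1 springs in series combine like resistors in parallel:
# 1/K_total = (N-1)/k, which recovers the overall constant K.
K_total = 1.0 / sum(1.0 / k for _ in range(N - 1))

print(K_total, N * m, (N - 1) * l0)  # -> approximately 5.0 1.0 0.5
```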

Writing the simulation

Now it’s just a question of applying Newton’s laws of motion and the ideal spring equation to get a system of ordinary differential equations:

\displaystyle \frac{d^2x_i}{dt^2} = g + \frac{k}{m}\left[(x_{i+1}-x_i-l_0)H[N-1-i]-(x_i-x_{i-1}-l_0)H[i]\right]

where x_i are the coordinates of the masses (with i going from 0 to N-1), g is the acceleration of gravity, k is the spring constant of each massless spring, m is the value of each small mass, l_0 is the unstressed length of each massless spring and H[n] is the discrete Heaviside step function (with H[n] = 1 for n > 0 and 0 otherwise), used here to drop the spring terms that don’t exist at the two ends of the chain.

This second order system can be reduced to a system of first order ODEs in the usual way. Integrating it using RK4, we get the following Python code:

def get_xdot(x):
    # The state vector x holds the NX positions followed by the NX velocities.
    sk = K * (NX - 1)  # spring constant of each small spring
    sl = L / (NX - 1)  # unstressed length of each small spring
    mm = M / NX        # value of each small mass
    # The position derivatives are the velocities; G is the base acceleration.
    xdot = [x[NX + i] if i < NX else G
            for i in range(2 * NX)]
    # Each spring pulls its two neighboring masses together (Hooke's law).
    for i in range(NX - 1):
        a = sk * (x[i + 1] - x[i] - sl) / mm
        xdot[NX + i] += a
        xdot[NX + i + 1] -= a
    return xdot

def rk4_step(x, dt):
    k1 = get_xdot(x)
    k2 = get_xdot([x[i] + dt * k1[i] / 2.0 for i in range(len(x))])
    k3 = get_xdot([x[i] + dt * k2[i] / 2.0 for i in range(len(x))])
    k4 = get_xdot([x[i] + dt * k3[i] for i in range(len(x))])
    return [x[i] + dt * (k1[i] + 2.0 * (k2[i] + k3[i]) + k4[i]) / 6.0
            for i in range(len(x))]
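Before applying it to the slinky, we can sanity-check the same RK4 stepper on a system with a known solution; a sketch using a harmonic oscillator, with the derivative function passed as a parameter in place of the global get_xdot:

```python
import math

# Same RK4 step as above, but with the derivative function as a parameter
# (standing in for the global get_xdot).
def rk4_step(f, x, dt):
    k1 = f(x)
    k2 = f([x[i] + dt * k1[i] / 2.0 for i in range(len(x))])
    k3 = f([x[i] + dt * k2[i] / 2.0 for i in range(len(x))])
    k4 = f([x[i] + dt * k3[i] for i in range(len(x))])
    return [x[i] + dt * (k1[i] + 2.0 * (k2[i] + k3[i]) + k4[i]) / 6.0
            for i in range(len(x))]

# Harmonic oscillator x'' = -x, as the state [position, velocity];
# the exact solution starting from [1, 0] is x(t) = cos(t).
f = lambda x: [x[1], -x[0]]

x, dt = [1.0, 0.0], 0.01
for _ in range(628):          # integrate up to t = 6.28, about one period
    x = rk4_step(f, x, dt)

print(abs(x[0] - math.cos(6.28)))  # -> very small (RK4 is 4th order)
```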

Now we need to define the initial conditions. If we just start with the masses separated by a distance l_0 and at rest, we won’t match the initial conditions of the slinky, because it was being stretched by the action of gravity. It’s not very difficult to compute initial conditions that leave the system at rest if it’s being held:

def initial_x():
    sl0 = L / (NX - 1)
    mm = M / NX
    sk = K * (NX - 1)
    # w is the mass hanging below the spring being stretched.
    w = M - mm
    x = [0.0 for i in range(2 * NX)]
    for i in range(1, NX):
        # Each spring is stretched by the weight w * G it supports.
        x[i] = x[i - 1] + sl0 + w * G / sk
        w -= mm
    return x
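We can verify that these initial conditions really are an equilibrium: every acceleration should vanish except for the top mass, which is being held. A self-contained sketch, repeating the two functions above with the stiffer parameters used later (K = 5) and the factor G written explicitly in the stretch term:

```python
NX, L, K, M, G = 40, 0.5, 5.0, 1.0, 1.0  # parameters as in the next section

def get_xdot(x):  # as above
    sk, sl, mm = K * (NX - 1), L / (NX - 1), M / NX
    xdot = [x[NX + i] if i < NX else G for i in range(2 * NX)]
    for i in range(NX - 1):
        a = sk * (x[i + 1] - x[i] - sl) / mm
        xdot[NX + i] += a
        xdot[NX + i + 1] -= a
    return xdot

def initial_x():  # as above, with the G factor explicit in the stretch term
    sl0, mm, sk = L / (NX - 1), M / NX, K * (NX - 1)
    w = M - mm
    x = [0.0 for i in range(2 * NX)]
    for i in range(1, NX):
        x[i] = x[i - 1] + sl0 + w * G / sk
        w -= mm
    return x

xdot = get_xdot(initial_x())
# All masses except the held top one should have zero net acceleration...
print(max(abs(a) for a in xdot[NX + 1:]))  # -> approximately 0
# ...while the top mass feels NX * G: the hand must cancel the whole
# weight M * G = (NX * mm) * G.
print(xdot[NX])                            # -> approximately NX * G = 40
```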

The remaining code is just plumbing and matplotlib presentation code. The whole program can be seen at the repository.

Running the simulation

If we run the simulation with the parameters

NT = 1000 # number of timesteps
NX = 40   # number of masses
T = 1.0   # simulation duration

L = 0.5   # slinky length
K = 1.0   # slinky overall spring constant
M = 1.0   # slinky mass
G = 1.0   # gravitational acceleration

we get the following results:

A simulation where the springs are too soft, giving some negative spring lengths (and consequent overlapping) near t = 1.

In this plot the gray lines represent the trajectory of the small masses and the black lines the trajectory of the slinky’s ends.

Clearly the springs are too soft and we are getting unphysical results, as the spring lengths go negative when t nears 1. To fix that, let’s run the simulation with a greater spring constant, K = 5:

A simulation where the springs have a physically reasonable constant, giving an intriguing behavior to its ends.

Now we get a more reasonable result, showing a phenomenon that is more similar to the one observed in the video: the bottom of the slinky remains in place while the top begins to fall. Now we can check that the slinky remains stationary when held by the top mass, m_0:

def get_xdot(x):
    sk = K * (NX - 1)
    sl = L / (NX - 1)
    mm = M / NX
    xdot = [x[NX + i] if i < NX else G
            for i in range(2 * NX)]
    for i in range(NX - 1):
        a = sk * (x[i + 1] - x[i] - sl) / mm
        xdot[NX + i] += a
        xdot[NX + i + 1] -= a
    xdot[NX] = 0.0 # holding the slinky
    return xdot

Simulation to check if the slinky remains stationary when held from its upper end.

Conclusions

This simulation validates the main point of the original video: the lower end “doesn’t know” the upper end was released until a compression wave reaches it, at t \approx 0.45 in our simulation. But the detailed behavior differs, as the slinky only shows a compression wave once it reaches the nonlinear regime (when there is no more space left between the coils).

In the next post we will show an analysis of this nonlinear behavior and the analytical solution to the idealized slinky drop that was numerically simulated in this post.

GPGPU with WebGL: solving Laplace’s equation

This is the first post in what will hopefully be a series of posts exploring how to use WebGL to do GPGPU (general-purpose computing on graphics processing units). In this installment we will solve a partial differential equation using WebGL: more specifically, Laplace’s equation.

Discretizing Laplace’s equation

Laplace’s equation, \nabla^2 \phi = 0, is one of the most ubiquitous partial differential equations in physics. It appears in a lot of areas, including electrostatics, heat conduction and fluid flow.

To get a numerical solution of a differential equation, the first step is to replace the continuous domain by a lattice and the differential operators with their discrete versions. In our case, we just have to replace the Laplacian by its discrete version:

\displaystyle \nabla^2 \phi = 0 \rightarrow \frac{1}{h^2}\left(\phi_{i-1\,j} + \phi_{i+1\,j} + \phi_{i\,j-1} + \phi_{i\,j+1} - 4\phi_{i\,j}\right) = 0,

where h is the grid size.

If we apply this equation at all internal points of the lattice (the external points must retain fixed values if we use Dirichlet boundary conditions), we get a big system of linear equations whose solution will give a numerical approximation to a solution of Laplace’s equation. Of the various methods to solve big linear systems, the Jacobi relaxation method seems the best fit for shaders, because it applies the same expression at every lattice point and doesn’t have dependencies between computations. Applying this method to our linear system, we get the following expression for the iteration:

\displaystyle \phi_{i\,j}^{(k+1)} = \frac{1}{4}\left(\phi_{i-1\,j}^{(k)} + \phi_{i+1\,j}^{(k)} + \phi_{i\,j-1}^{(k)} + \phi_{i\,j+1}^{(k)}\right),

where k is a step index.
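Before moving to shaders, the iteration is easy to prototype on the CPU; a NumPy sketch with a hypothetical test problem (top boundary held at 1, the rest at 0):

```python
import numpy as np

# Jacobi relaxation for Laplace's equation on an n x n grid, with the
# boundary rows/columns held fixed (Dirichlet conditions). NumPy evaluates
# the right-hand side before assigning, so this is a true Jacobi step.
def jacobi(phi, steps):
    phi = phi.copy()
    for _ in range(steps):
        phi[1:-1, 1:-1] = (phi[:-2, 1:-1] + phi[2:, 1:-1] +
                           phi[1:-1, :-2] + phi[1:-1, 2:]) / 4.0
    return phi

n = 32
phi0 = np.zeros((n, n))
phi0[0, :] = 1.0          # top boundary held at 1, the rest at 0

phi = jacobi(phi0, 2000)
# At convergence each interior point equals the average of its neighbors.
residual = np.abs(phi[1:-1, 1:-1] -
                  (phi[:-2, 1:-1] + phi[2:, 1:-1] +
                   phi[1:-1, :-2] + phi[1:-1, 2:]) / 4.0).max()
print(residual)           # -> small after enough iterations
```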

Solving the discretized problem using WebGL shaders

If we use a texture to represent the domain and a fragment shader to do the Jacobi relaxation steps, the shader will follow this general pseudocode:

  1. Check if this fragment is a boundary point. If it’s one, return the previous value of this point.
  2. Get the four nearest neighbors’ values.
  3. Return the average of their values.

To flesh out this pseudocode, we need to define a specific representation for the discretized domain. Taking into account that the currently available WebGL versions don’t support floating point textures, we can use 32-bit RGBA texels and do the following mapping:

R: Higher byte of \phi.
G: Lower byte of \phi.
B: Unused.
A: 1 if it’s a boundary value, 0 otherwise.

Most of the code is straightforward, but the multiprecision arithmetic is tricky, as the quantities we are working with behave as floating point numbers in the shaders but are stored as integers. More specifically, the color components in the normal range, [0.0, 1.0], are multiplied by 255 and rounded to the nearest byte value when stored in the target texture.

My first idea was to reconstruct the floating point number from each input value, do the required operations with the floating point numbers and then convert the result back to color components that can be reliably stored (without losing precision). This gives us the following pseudocode for the iteration shader:

// wc is the color to the "west", ec is the color to the "east", ...
float w_val = wc.r + wc.g / 255.0;
float e_val = ec.r + ec.g / 255.0;
// ...
float val = (w_val + e_val + n_val + s_val) / 4.0;
float hi = val - mod(val, 1.0 / 255.0);
float lo = (val - hi) * 255.0;
fragmentColor = vec4(hi, lo, 0.0, 0.0);

The reason why we multiply by 255 instead of 256 is that we need lo to keep track of the part of val that will be lost when we store it as a color component. As each byte value of a discrete color component is associated with a range of size 1/255 in its continuous counterpart, we need to use the “low byte” to store the position of the continuous component within that range.
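The same hi/lo split can be checked outside the shader; a Python sketch that also simulates the rounding to bytes that happens when the result is written to the texture:

```python
# Simulate the shader's hi/lo split, plus the rounding to bytes that
# happens when the result is written to an RGBA8 texture.
def pack(val):
    hi = val - val % (1.0 / 255.0)  # multiple of 1/255 at or below val
    lo = (val - hi) * 255.0         # position inside that 1/255-wide bin
    return round(hi * 255.0), round(lo * 255.0)  # stored byte values

def unpack(hi_byte, lo_byte):
    # Mirrors the shader's wc.r + wc.g / 255.0 on normalized components.
    return hi_byte / 255.0 + lo_byte / (255.0 * 255.0)

val = 0.123456
hi, lo = pack(val)
print(hi, lo, abs(unpack(hi, lo) - val))  # error stays below 1/255^2
```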

Simplifying the code to avoid redundant operations, we get:

float val = (wc.r + ec.r + nc.r + sc.r) / 4.0 +
	(wc.g + ec.g + nc.g + sc.g) / (4.0 * 255.0);
float hi = val - mod(val, 1.0 / 255.0);
float lo = (val - hi) * 255.0;
fragmentColor = vec4(hi, lo, 0.0, 0.0);

The result of running the full code, implemented in GLSL, is:

Solving the Laplace's equation using a 32x32 grid. Click the picture to see the live solving process (if your browser supports WebGL).

As can be seen, it has quite low resolution but converges fast. But if we just crank up the number of points, the convergence gets slower:

Incompletely converged solution in a 512x512 grid. Click the picture to see a live version.

How can we reconcile fast convergence with high resolution?

Multigrid

The basic idea behind multigrid methods is to apply the relaxation method on a hierarchy of increasingly fine discretizations of the problem, using at each step the coarse solution obtained on the previous grid as the starting guess. In this way, the long-wavelength parts of the solution (those that converge slowly on the finer grids) are obtained in the first, coarse iterations, and the later iterations just add the finer details of the solution (those that converge relatively easily on the finer grids).
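A CPU sketch of this coarse-to-fine idea, again in NumPy (the test problem and the nearest-neighbor upsampling are illustrative choices, not the exact scheme used in the shaders):

```python
import numpy as np

def jacobi(phi, fixed, steps):
    # Relax only the non-fixed points (fixed plays the alpha channel's role).
    for _ in range(steps):
        avg = np.zeros_like(phi)
        avg[1:-1, 1:-1] = (phi[:-2, 1:-1] + phi[2:, 1:-1] +
                           phi[1:-1, :-2] + phi[1:-1, 2:]) / 4.0
        phi = np.where(fixed, phi, avg)
    return phi

def problem(n):
    # Illustrative test problem: top edge held at 1, other edges at 0.
    phi = np.zeros((n, n))
    fixed = np.zeros((n, n), dtype=bool)
    fixed[0, :] = fixed[-1, :] = fixed[:, 0] = fixed[:, -1] = True
    phi[0, :] = 1.0
    return phi, fixed

# Coarse to fine: solve on 8x8 first, then use each solution (upsampled by
# nearest neighbor) as the starting guess for the next, finer grid.
phi, _ = problem(8)
for n in (8, 16, 32, 64):
    b, fixed = problem(n)
    guess = np.kron(phi, np.ones((n // phi.shape[0],) * 2))
    phi = np.where(fixed, b, guess)   # re-impose the boundary at this level
    phi = jacobi(phi, fixed, 50)

residual = np.abs(phi[1:-1, 1:-1] -
                  (phi[:-2, 1:-1] + phi[2:, 1:-1] +
                   phi[1:-1, :-2] + phi[1:-1, 2:]) / 4.0).max()
print(phi.shape, residual)  # a 64x64 solution after only 200 total sweeps
```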

The implementation is quite straightforward, giving us fast convergence and high resolution at the same time:

Multigrid solution using grids from 8x8 to 512x512. Click the picture to see the live version.

Conclusions

It’s quite viable to use WebGL for at least basic GPGPU tasks, though it is, in a certain sense, a step back in time: there is no CUDA, no floating point textures, nor any other feature that helps when working with non-graphics problems; you are on your own. But with the growing presence of WebGL support in modern browsers, it’s an interesting way of partially accessing the enormous computational power of modern video cards from any JS application, without requiring the installation of a native application.

In the next posts we will explore other kinds of problems where WebGL can provide a great performance boost.

Carmack’s unequal lights problem

The problem

John Carmack tweeted:

Thinking about the shape of the surface of equal contribution between two inverse square lights of different intensities.

Without loss of generality (by scaling, translating and rotating), we can use lights with these parameters:

Light 1 position: \mathbf{r}_1 = \mathbf{0}

Light 2 position: \mathbf{r}_2 = \mathbf{\hat{x}}

Light 1 strength: s_1 = 1

Light 2 strength: s_2 = \lambda < 1

1D variant

We get the following equation using the inverse square law:

\displaystyle \frac{1}{x^2} = \frac{\lambda}{(x - 1)^2}

Rearranging:

\displaystyle (x - 1)^2 = \lambda x^2

\displaystyle x^2 - 2x + 1 - \lambda x^2 = 0

\displaystyle (1-\lambda) x^2 - 2x + 1 = 0

Solving this quadratic equation:

\displaystyle x = \frac{2\pm\sqrt{4-4(1-\lambda)}}{2(1-\lambda)}

\displaystyle x = \frac{2\pm\sqrt{4-4+4\lambda}}{2(1-\lambda)}

\displaystyle x = \frac{1\pm\sqrt{\lambda}}{1-\lambda}
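A quick numerical check of the two roots (with an arbitrary \lambda):

```python
from math import sqrt

lam = 0.25  # any 0 < lambda < 1 works here

# The two equal-contribution points on the line through the lights:
# one between them, one beyond the weaker light.
roots = [(1 - sqrt(lam)) / (1 - lam), (1 + sqrt(lam)) / (1 - lam)]

for x in roots:
    # Both must satisfy the inverse square balance 1/x^2 = lambda/(x-1)^2.
    print(x, 1 / x**2, lam / (x - 1)**2)
```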

Full 3D variant

Using the inverse square law we get an equation for the surface of equal intensity:

\displaystyle \frac{1}{\mathbf{r}^2} = \frac{\lambda}{(\mathbf{r} - \mathbf{\hat{x}})^2}

Rearranging:

\displaystyle (\mathbf{r} - \mathbf{\hat{x}})^2 = \lambda\mathbf{r}^2

\displaystyle \mathbf{r}^2 - 2\mathbf{r}\cdot\mathbf{\hat{x}} + \mathbf{\hat{x}}^2 - \lambda\mathbf{r}^2 = 0

\displaystyle (1-\lambda)\mathbf{r}^2 - 2\mathbf{r}\cdot\mathbf{\hat{x}} + \mathbf{\hat{x}}^2 = 0

\displaystyle (1-\lambda)(x^2 + y^2 + z^2) - 2x + 1 = 0.

Using \mu = \sqrt{1 - \lambda} and completing the square:

\displaystyle \mu^2(x^2 + y^2 + z^2) - 2x + 1 = 0

\displaystyle \mu^2(y^2 + z^2) + (\mu x)^2 - 2x + 1 = 0

\displaystyle \mu^2(y^2 + z^2) + \left[(\mu x)^2 - 2(\mu x)\mu^{-1} + (\mu^{-1})^2\right] - (\mu^{-1})^2 + 1 = 0

\displaystyle (\mu y)^2 + (\mu z)^2 + (\mu x - \mu^{-1})^2 = (\mu^{-1})^2 - 1

Multiplying all terms by \mu^{-2} and reordering:

\displaystyle (x - \mu^{-2})^2 + y^2 + z^2 = (\mu^{-2})^2 - \mu^{-2}

\displaystyle (x - \mu^{-2})^2 + y^2 + z^2 = \mu^{-2}\left[\mu^{-2} - 1\right]

Replacing by the expression of \mu as function of \lambda:

\displaystyle \left(x - \frac{1}{1-\lambda}\right)^2 + y^2 + z^2 = \frac{\frac{1}{1-\lambda} - 1}{1-\lambda}

\displaystyle \left(x - \frac{1}{1-\lambda}\right)^2 + y^2 + z^2 = \frac{\frac{1-(1-\lambda)}{1-\lambda}}{1-\lambda}

\displaystyle \left(x - \frac{1}{1-\lambda}\right)^2 + y^2 + z^2 = \frac{\frac{\lambda}{1-\lambda}}{1-\lambda}

\displaystyle \left(x - \frac{1}{1-\lambda}\right)^2 + y^2 + z^2 = \frac{\lambda}{(1-\lambda)^2}

This is a sphere with

\displaystyle \text{center} = \left[\frac{1}{1-\lambda},0,0\right]

\displaystyle \text{radius} = \frac{\sqrt{\lambda}}{1-\lambda},

matching the results found for the 1D case.
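As a final check, a few sample points on this sphere should receive equal contributions from both lights (an arbitrary \lambda again):

```python
from math import sqrt

lam = 0.25  # arbitrary
center = 1 / (1 - lam)
radius = sqrt(lam) / (1 - lam)

# A few sample points on the sphere; each should satisfy the balance
# 1/|r|^2 = lambda/|r - x_hat|^2 from the inverse square law.
points = [(center + radius, 0.0, 0.0),
          (center - radius, 0.0, 0.0),
          (center, radius, 0.0),
          (center, 0.0, -radius)]

for x, y, z in points:
    d1 = x * x + y * y + z * z          # squared distance to light 1
    d2 = (x - 1)**2 + y * y + z * z     # squared distance to light 2
    print(1 / d1, lam / d2)             # the two sides should match
```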