Force over a dipole

In an arbitrary field

The easiest way to get the force over a dipole is to consider it as the limit of two oppositely charged monopoles that are closely spaced. If the dipole has moment \mathbf{m} and is at position \mathbf{r_0}, it can be considered as the limit of two monopoles, one with charge -\epsilon^{-1} at position \mathbf{r_0} - (\epsilon / 2) \mathbf{m} and the other with charge \epsilon^{-1} at position \mathbf{r_0} + (\epsilon / 2) \mathbf{m}, when \epsilon goes to zero.

If we consider the finite size dipole immersed in a (let’s say magnetic) field \mathbf{B}(\mathbf{r}), the total force will be

\displaystyle \mathbf{F} = -\epsilon^{-1}\mathbf{B}(\mathbf{r_0} - (\epsilon / 2) \mathbf{m}) +\epsilon^{-1}\mathbf{B}(\mathbf{r_0} + (\epsilon / 2) \mathbf{m})

\displaystyle \mathbf{F} = \frac{\mathbf{B}(\mathbf{r_0} + (\epsilon / 2) \mathbf{m}) - \mathbf{B}(\mathbf{r_0} - (\epsilon / 2) \mathbf{m})}{\epsilon}.

We can get the force for the infinitesimal dipole by taking the limit when \epsilon goes to zero

\displaystyle \mathbf{F} = \lim_{\epsilon \to 0}\frac{\mathbf{B}(\mathbf{r_0} + (\epsilon / 2) \mathbf{m}) - \mathbf{B}(\mathbf{r_0} - (\epsilon / 2) \mathbf{m})}{\epsilon}

\displaystyle \mathbf{F} = \mathbf{m} \cdot (\mathbf{\nabla B})(\mathbf{r_0}),

where \mathbf{\nabla B} is the (tensor) derivative of the magnetic field.

Between two antialigned dipoles

The general field of a magnetic dipole of moment \mathbf{m} at position \mathbf{r_0} is

\displaystyle \mathbf{B}_d(\mathbf{r}) = \frac{\mu_0}{4\pi}\left(\frac{3\mathbf{m}\cdot(\mathbf{r}-\mathbf{r_0})(\mathbf{r}-\mathbf{r_0})-\mathbf{m}|\mathbf{r}-\mathbf{r_0}|^2}{|\mathbf{r}-\mathbf{r_0}|^5}\right).

If we assume we have one dipole at x = 0 with its moment +m\mathbf{\hat{x}} and the other at x = +R with its moment -m\mathbf{\hat{x}}, we get a field at x = +R of

\displaystyle \mathbf{B}(+R\mathbf{\hat{x}}) = \frac{\mu_0}{4\pi}\left(\frac{3m\mathbf{\hat{x}}\cdot(+R\mathbf{\hat{x}})(+R\mathbf{\hat{x}})-m\mathbf{\hat{x}}|+R\mathbf{\hat{x}}|^2}{|+R\mathbf{\hat{x}}|^5}\right)

\displaystyle \mathbf{B}(+R\mathbf{\hat{x}}) = \frac{\mu_0}{4\pi}\left(\frac{3mR^2-mR^2}{R^5}\right)\mathbf{\hat{x}}

\displaystyle \mathbf{B}(+R\mathbf{\hat{x}}) = \frac{\mu_0 m}{2\pi R^3}\mathbf{\hat{x}}.

By symmetry, we are only interested in the x-component of the x-derivative of the field,

\displaystyle \left(\frac{\partial\mathbf{B}}{\partial x}\right)_x = -\frac{3\mu_0 m}{2\pi R^4}.

And the force on the dipole at x = +R will be

\displaystyle \mathbf{F} = -m\mathbf{\hat{x}} \cdot (\mathbf{\nabla B})(+R\mathbf{\hat{x}})

\displaystyle \mathbf{F} = -m \left(-\frac{3\mu_0 m}{2\pi R^4}\right)\mathbf{\hat{x}}

\displaystyle \mathbf{F} = \frac{3\mu_0 m^2}{2\pi R^4}\mathbf{\hat{x}},

a repulsive force as expected of antialigned dipoles.

Advertisements

Yet another FizzBuzz without conditional branches

A branch-free FizzBuzz in assembly was posted in Hacker News. In this post I will try to do another implementation, for a Linux environment, without using function pointers.

C implementation

We can start by writing a C implementation:

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

int main( void )
{
	int i = 1;
	const char fizz[] = "Fizz";
	const char buzz[] = "Buzz";
	const char nl[] = "\n";
	char digits[ 2 ];
	while( 1 )
	{
		char *p = digits;
		*p = i / 10 + '0';
		p += ( i >= 10 );
		*p = i % 10 + '0';
		p++;
		syscall( SYS_exit * ( i > 100 ) + SYS_write * ( i <= 100 ), i <= 100,
		         digits, ( i % 3 != 0 && i % 5 != 0 ) * ( p - digits ) );
		syscall( SYS_write, 1, fizz, ( i % 3 == 0 ) * ( sizeof( fizz ) - 1 )  );
		syscall( SYS_write, 1, buzz, ( i % 5 == 0 ) * ( sizeof( buzz ) - 1 )  );
		syscall( SYS_write, 1, nl, sizeof( nl ) - 1 );
		i++;
	}
	/* NOT REACHED */
	return 0;
}

(Check the output in Ideone.)

The basic idea is to calculate the syscall number arithmetically and using the fact that a zero-length write is essentially a no-op.

In at least one associated assembly listing (obtained with Godbolt’s Interactive compiler) there are no conditional branches:

main:                                   # @main
	pushq	%rbp
	movq	%rsp, %rbp
	pushq	%r15
	pushq	%r14
	pushq	%r13
	pushq	%r12
	pushq	%rbx
	pushq	%rax
	movl	$1, %r15d
	leaq	-42(%rbp), %r14
.LBB0_1:                                # %._crit_edge
	cmpl	$100, %r15d
	movl	$0, %edi
	movl	$60, %eax
	cmovgl	%eax, %edi
	movslq	%r15d, %r15
	cmpl	$101, %r15d
	setl	%al
	movzbl	%al, %esi
	orl	%esi, %edi
	imulq	$1431655766, %r15, %rcx # imm = 0x55555556
	imulq	$1717986919, %r15, %rax # imm = 0x66666667
	movq	%rax, %r10
	shrq	$63, %r10
	shrq	$32, %rax
	movl	%eax, %edx
	sarl	$2, %edx
	leal	(%rdx,%r10), %ebx
	movq	%rcx, %r11
	shrq	$63, %r11
	cmpl	$9, %r15d
	setg	%r8b
	imull	$10, %ebx, %ebx
	negl	%ebx
	leal	48(%rdx,%r10), %edx
	movb	%dl, -42(%rbp)
	leal	48(%r15,%rbx), %r9d
	shrq	$32, %rcx
	addl	%r11d, %ecx
	leal	(%rcx,%rcx,2), %ecx
	movl	%r15d, %edx
	subl	%ecx, %edx
	sete	%r12b
	movzbl	%r8b, %ecx
	movb	%r9b, -42(%rbp,%rcx)
	setne	%bl
	sarl	%eax
	addl	%r10d, %eax
	leal	(%rax,%rax,4), %eax
	movl	%r15d, %edx
	subl	%eax, %edx
	sete	%r13b
	setne	%al
	andb	%bl, %al
	movzbl	%al, %eax
	shlq	$63, %rax
	sarq	$63, %rax
	incq	%rcx
	andq	%rax, %rcx
	movq	%r14, %rdx
	xorb	%al, %al
	callq	syscall
	movzbl	%r12b, %ecx
	shlq	$2, %rcx
	movl	$1, %edi
	movl	$1, %esi
	movl	main::fizz, %edx
	xorb	%al, %al
	callq	syscall
	movzbl	%r13b, %ecx
	shlq	$2, %rcx
	movl	$1, %edi
	movl	$1, %esi
	movl	main::buzz, %edx
	xorb	%al, %al
	callq	syscall
	movl	$1, %edi
	movl	$1, %esi
	movl	main::nl, %edx
	movl	$1, %ecx
	xorb	%al, %al
	callq	syscall
	incl	%r15d
	jmp	.LBB0_1

main::fizz:
	.asciz	 "Fizz"

main::buzz:
	.asciz	 "Buzz"

main::nl:
	.asciz	 "\n"

But there still are comparisons and conditional sets and moves.

x86-64 assembly implementation

We can write a cleaner implementation if we do it directly in assembly:

;; FizzBuzz in x86-64 ASM without conditional branches

section .data
fizz: db 'Fizz'
buzz: db 'Buzz'
nl:   db 10

section .bss
digits: resb 2

section .text

global _start
_start:
	mov rbx, 1

main_loop:

	; if ( rbx > 100 )
	;   exit( 0 )
	; else
	;   write( 1, digits, 0 )
	mov rax, rbx
	neg rax
	add rax, 100
	sar rax, 63
	mov rdi, rax
	and rax, 59
	inc rax
	neg rdi
	mov rsi, digits
	xor rdx, rdx
	syscall

	; r8 <- rbx % 3 != 0 ? -1 : 0
	; r9 <- rbx % 5 != 0 ? -1 : 0
	mov rax, rbx
	mov rcx, 3
	div rcx
	mov r8, rdx
	neg r8
	sar r8, 63
	mov rax, rbx
	mov rcx, 5
	xor rdx, rdx
	div rcx
	mov r9, rdx
	neg r9
	sar r9, 63

	; rcx <- digits
	; *rcx = rbx / 10 + '0'
	; rcx += ( rbx >= 10 )
	; *rcx = rbx % 10 + '0'
	; rcx++
	mov rcx, digits
	mov rax, rbx
	xor rdx, rdx
	mov rdi, 10
	div rdi
	mov rsi, rax
	add rax, '0'
	mov [rcx], rax
	neg rsi
	sar rsi, 63
	neg rsi
	add rcx, rsi
	add rdx, '0'
	mov [rcx], rdx
	inc rcx

	; write( 1, digits, ( rbx % 3 != 0 && rbx % 5 != 0 ) ? rcx - digits : 0 )
	mov rax, 1
	mov rdi, 1
	mov rsi, digits
	mov rdx, rcx
	sub rdx, digits
	and rdx, r8
	and rdx, r9
	syscall

	; write( 1, fizz, rbx % 3 == 0 ? 4 : 0 )
	mov rax, 1
	mov rdi, 1
	mov rsi, fizz
	mov rdx, r8
	not rdx
	and rdx, 4
	syscall

	; write( 1, buzz, rbx % 5 == 0 ? 4 : 0 )
	mov rax, 1
	mov rdi, 1
	mov rsi, buzz
	mov rdx, r9
	not rdx
	and rdx, 4
	syscall

	; write( 1, nl, 1 )
	mov rax, 1
	mov rdi, 1
	mov rsi, nl
	mov rdx, 1
	syscall	

	inc rbx

	jmp main_loop

If the source file is named fizzbuzz.s, it can be compiled with the following command line:

nasm -f elf64 -o fizzbuzz.o fizzbuzz.s; ld -o fizzbuzz fizzbuzz.o