Friday, March 7, 2014

MCMC, C and Julia

A while ago on his blog, Darren Wilkinson compared the performance of a simple Gibbs sampler implemented in several different languages. He sounded quite excited about Scala, which has a nice syntax and respectable performance, but the C implementation was still the fastest. In this post, I compare the C implementation with one in Julia.

The sampler

The idea was to construct a Gibbs sampler for the bivariate distribution:

\[ \begin{align} f(x,y) & = kx^2 \exp\{-xy^2-y^2+2y-4x\},\quad x>0,\ y \in \mathbb{R} \end{align} \]

with unknown normalising constant \( k \).

The full conditionals are:

\[ \begin{align} x|y & \sim Ga(3,y^2+4) \cr y|x & \sim N\left(\frac{1}{1+x},\frac{1}{2\left(1+x\right)}\right) \end{align} \]
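
These follow directly from the joint density; as a quick check, fixing \( y \) leaves a gamma kernel in \( x \), while fixing \( x \) and completing the square leaves a Gaussian kernel in \( y \):

\[ \begin{align} f(x|y) & \propto x^2 \exp\{-(y^2+4)x\} \cr f(y|x) & \propto \exp\{-(1+x)y^2+2y\} \propto \exp\left\{-(1+x)\left(y-\frac{1}{1+x}\right)^2\right\} \end{align} \]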

In C

Interfacing with C code is fairly straightforward in Julia. First, we need to write a C library that can be dynamically loaded.

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>

/* Arrays xa and ya are passed by reference */
void gibbs(int N, int thin, double **xa, double **ya)
{
  int i,j;
  gsl_rng *r = gsl_rng_alloc(gsl_rng_mt19937);
  double x=0;
  double y=0;
  for (i=0;i<N;i++) {
    for (j=0;j<thin;j++) {
      x=gsl_ran_gamma(r,3.0,1.0/(y*y+4));              /* shape 3, scale 1/(y^2+4) */
      y=1.0/(x+1)+gsl_ran_gaussian(r,1.0/sqrt(2*x+2)); /* mean 1/(1+x), sd 1/sqrt(2(1+x)) */
    }
    (*xa)[i]=x; /* Note how the arrays are dereferenced */
    (*ya)[i]=y;
  }
  gsl_rng_free(r); /* release the RNG state with gsl_rng_free, not free */
}

The idea here is that we allocate the vectors for x and y in Julia and pass them to the C function, which fills in the values. That is the reason for the ugliness above: the arrays are passed by reference (hence the double pointers in the function signature), and each value is written through a dereference.

On my Linux box, I compile the above as follows.

gcc -O3 -fpic -c libgibbs.c -o libgibbs.o -I/usr/local/include
gcc -shared -o libgibbs.so libgibbs.o -L/usr/local/lib -lgsl -lgslcblas -lm

On my OS X box, where GSL is installed via Homebrew, I can skip the include and library directory flags.
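
For reference, the same two commands without those flags look like this (a minimal sketch, assuming Homebrew has put the GSL headers and libraries on the compiler's default search paths):

gcc -O3 -fpic -c libgibbs.c -o libgibbs.o
gcc -shared -o libgibbs.so libgibbs.o -lgsl -lgslcblas -lm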

This can then be loaded into Julia.

N=50000
thin=1000
xac=Array(Float64,N)
yac=Array(Float64,N)
# Cint matches the C function's int parameters
@elapsed ccall((:gibbs, "./libgibbs.so"), Void, (Cint,Cint,Ptr{Ptr{Cdouble}},Ptr{Ptr{Cdouble}}), N, thin, &xac, &yac)

This performs well: on my machine it finishes in about 6.5 seconds.

In Julia

using Distributions # provides the Gamma and Normal distributions

function gibbs(N::Int64,thin::Int64)
  xa=Array(Float64,N)
  ya=Array(Float64,N)
  x=0.0
  y=0.0
  for i in 1:N
    for j in 1:thin
      x=rand(Gamma(3.0,1.0/(y*y+4.0)))
      y=rand(Normal(1.0/(x+1.0),1.0/sqrt(2.0*x+2.0)))
    end
    xa[i]=x
    ya[i]=y
  end
  return(xa,ya)
end

We can run this as follows.

@elapsed xaj,yaj=gibbs(50000,1000)

Oddly enough, this runs in less time than the C version, at about 4.3 seconds on my Mac. In addition, the code avoids all the memory management fiddliness of the C version.
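
As a quick sanity check (not part of the timing comparison), the two samplers target the same distribution, so their marginal summaries should agree up to Monte Carlo error:

# Compare marginal summaries from the C and Julia runs; exact values will
# differ because the two samplers use different random number generators.
println("C:     ", mean(xac), " ", var(xac), " ", mean(yac), " ", var(yac))
println("Julia: ", mean(xaj), " ", var(xaj), " ", mean(yaj), " ", var(yaj))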