Dear Salonnières,

Thank you so much for attending the compu-salon, episode 4. Here's a summary of our discussion.

The next salon will be this Friday, Feb 17, at 4pm. We'll talk about making beautiful and informative plots, with a special focus on the teachings of the statistical-graphics guru, Edward Tufte. As I mentioned last Friday, I would like all of you to bring (and send me in advance) one of your own plots that you're especially proud of, or that you feel needs work. We'll discuss what's great about them, and how they could be improved.

I'll send out a reminder on Friday morning. Until then!

Michele

Compu-salon ep. 4 (2012/02/10) in summary

Also at www.vallis.org/salon.

In our fourth meeting, we talked about:

Ising and Markov

The principles and (broadly) the mathematics of the Ising model and Markov Chain Monte Carlo (MCMC) sampling. We won't repeat them here; among many other places, you can look at MacKay's book, Sethna's book, and, for a historical perspective on why the Metropolis algorithm should really be called the Rosenbluth algorithm, Gubernatis.

OOP

The history and principles of object-oriented programming (OOP). Wikipedia has a decent article. On Friday, we used our MCMC code to exemplify four widely recognized (if partly overlapping) principles:

  • Dynamic dispatch
  • Encapsulation (or abstraction)
  • Polymorphism
  • Inheritance

We used encapsulation/abstraction when we defined a class state to represent a single state of a simple Markov chain, which we built to produce samples distributed as exp(−x⁴):

class state(object):
    def __init__(self,x):
        self.x = x

    def logp(self):
        # the constant normalizes the density: -0.594875 = -log(integral of exp(-x^4) dx)
        return -0.594875 - self.x**4

Remember the all-important distinction between the class (which defines the behavior of the object) and its instances (of which there can be many, and which actually hold the data). The Python code above defines the constructor __init__, which is called to create every instance, and which (here) stores the argument x in the instance variable self.x; it also defines the method logp, which uses the value stored in self.x. Now, self is a special variable that refers to the class instance; in Python, self is always the first argument of every method. Thus, in the console:

>>> s1 = state(0)
>>> s1.x
0
>>> s1.logp()
-0.59487500000000004
>>> s2 = state(1)
>>> s2.x
1
>>> s2.logp()
-1.594875

Next, we again used encapsulation/abstraction when we defined a class chain to represent a Markov chain. The job, as it were, of a chain is to hold the current state (in self.current), to step to a new state using the Metropolis rule (with the step() method), and to repeat the stepping n times (with the run() method).

class chain(object):
    def __init__(self,init,propose):
        self.current = init
        self.propose = propose
    
        self.states = [init]

    def step(self):
        new = self.propose(self.current)
    
        logpn, logpc = new.logp(), self.current.logp()
    
        if ( (logpn > logpc) or
             (math.log(random.random()) < logpn - logpc) ):
            self.current = new
    
        self.states.append(self.current)

    def run(self,n):
        for i in xrange(n):
            self.step()

Note that step() calls a function self.propose that we must provide to the chain constructor. In Python, functions are first-class objects, so they can be stored in a variable and called later (see the console sketch after the two proposals below). In our salon, we came up with two different proposals for the stepper:

def randomwalk(s):
    dx = random.random() - 0.5    
    return state(s.x + dx)

def curtwalk(s):
    return state(10*random.random() - 5)
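
To see first-class functions at work, here's a quick console sketch (not from Friday's session, just an illustration):

>>> stepper = randomwalk          # no parentheses: we store the function object itself
>>> new = stepper(state(0))       # ...and call it through the variable later
>>> isinstance(new, state)        # the proposal handed us back a state instance
True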

We're then ready to use our Markov chain, and perform a first integration:

>>> c = chain(state(0),randomwalk)
>>> c.run(10000)
>>> N.mean([s.x**2 for s in c.states])
0.33987152744696092

Note how we used a list comprehension to extract the instance variable x from every saved state. (We also used mean from numpy—by the way, did you start your session with the standard incantation "import numpy as N, matplotlib.pyplot as P"?)
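
If the comprehension syntax is new to you, it builds exactly the same list as an explicit loop (xs is just a scratch name of mine):

xs = []
for s in c.states:
    xs.append(s.x**2)

N.mean(xs)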

Next, we moved on to the Ising model, and created a class ising to hold the state (this time an n × n matrix of +1 and −1 spins), and to compute the energy, magnetization, and of course the probability.

class ising(object):
    def __init__(self,init):
        self.x = init

    def energy(self):
        return -(N.sum(self.x[:-1,:] * self.x[1:,:]) +
                 N.sum(self.x[:,:-1] * self.x[:,1:]) +
                 N.sum(self.x[0,:] * self.x[-1,:]) +
                 N.sum(self.x[:,0] * self.x[:,-1]))

    def magnetization(self):
        return abs(N.sum(self.x))

    def logp(self,T):
        return -self.energy()/T
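
As a quick sanity check (try it in your console), in the all-ones configuration every one of the 2n² periodic-boundary bonds contributes −1, so a 4 × 4 lattice should have energy −32:

>>> s = ising(N.ones((4,4)))
>>> s.energy()
-32.0
>>> s.magnetization()
16.0
>>> s.logp(T=2)     # = -energy/T
16.0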

(For homework, make sure you understand why the matrix expression in energy() implements the Ising energy for periodic boundary conditions; a brute-force check is sketched right after the spinflip code below.) Again we need a proposal for the Metropolis stepper. The simplest thing we can do is to flip a single spin at random:

def spinflip(s):
    newx = s.x.copy()

    i,j = random.randrange(0,len(newx)), random.randrange(0,len(newx))
    newx[i,j] *= -1

    return ising(newx)
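
And here's the promised brute-force check: a helper (brute_energy is my name for it, not something we wrote on Friday) that accumulates E = −Σ s_i s_j over nearest-neighbor pairs one bond at a time; it should agree with the sliced energy() for any configuration.

def brute_energy(s):
    # visit every site and count its bonds to the right and below,
    # with wrap-around indices, so each bond is counted exactly once
    n, E = len(s.x), 0.0
    for i in range(n):
        for j in range(n):
            E -= s.x[i,j] * (s.x[(i+1) % n, j] + s.x[i, (j+1) % n])
    return E

>>> s = ising(N.where(N.random.rand(20,20) < 0.5, 1.0, -1.0))
>>> brute_energy(s) == s.energy()
True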

We're almost ready to go, but we still need to modify the class chain slightly to take a temperature as a constructor argument, and to use it when calling logp(). [We can retrofit our p ∝ exp(−x⁴) state to work with the modified chain by giving its logp() a dummy argument T, set to None by default.] OK, we can start an Ising chain with a matrix of ones:

>>> c = chain(init=ising(N.ones((20,20))),propose=spinflip,T=2)
>>> c.run(100)
>>> c.states[-1].x  # see the final state
array([[ 1.,  1.,  1.,  1., -1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
         1.,  1.,  1.,  1.,  1., -1.,  1.],
[...and so on...]

In constructing the chain, we could have used positional arguments instead of keyword arguments as above, but the latter are nice for clarity.

Now, notice how the class chain works just as well with ising as it did with state! Indeed, all that a generic state-like class needs to do in order to work with chain is to define the method logp(). The fact that chain ends up calling different code for logp() depending on the type of self.current is an example of dynamic dispatch.

It is also an example of subtype polymorphism—that is, multiple state-like objects exist, which can define different methods logp(). This however would be more evident in a language like C++, where we would have to explicitly define an abstract class (say, abstractstate) with a virtual method logp(), and have both state and ising inherit from it. The C++ chain class would then hold a pointer to abstractstate, and calling logp() on it would select the correct implementation (say, ising's), from the abstractstate virtual method table. If you're not familiar with C++ and didn't follow any of that, don't worry, you don't need to. That's why you're learning Python.

Instead, Python uses a much less formal name-based polymorphism, where the interface of a class (i.e., the variables and methods that it makes available to the world) is defined implicitly by how the class can be used. This is also known as duck typing: if it walks, swims, and quacks like a duck, it's a duck. (Or a pa-pap, as my son would say.)
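
For instance (a made-up example, not one from Friday), any new "duck" that defines logp() will swim happily with chain, such as a Gaussian state:

class gaussianstate(object):
    def __init__(self,x):
        self.x = x

    def logp(self,T=None):                  # same name and signature as state.logp
        return -0.5*self.x**2 - 0.918939    # -0.918939 = -log(sqrt(2 pi))

def gaussianwalk(s):
    return gaussianstate(s.x + random.random() - 0.5)

>>> c = chain(init=gaussianstate(0),propose=gaussianwalk,T=None)
>>> c.run(10000)
>>> N.mean([s.x**2 for s in c.states])   # should come out close to 1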

Since it's not practical to store the entire Ising configuration for every state in the chain, we will again modify chain slightly, to store only the energy and magnetization. There are many ways we could do this; to be least disruptive to our existing code, we'll give ising a method obs() that returns a tuple of energy and magnetization; chain will then try to call obs(), and fall back to saving the whole state (using a Python exception) if obs() is not available. See the full code below for details.
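
The relevant fragment (shown here out of context; you'll find it in the full listing) is the Pythonic "easier to ask forgiveness than permission" idiom:

try:
    self.states.append(self.current.obs())   # ising provides obs()...
except AttributeError:
    self.states.append(self.current)         # ...a plain state doesn't, so save it whole

Thus: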

>>> c = chain(init=ising(N.ones((20,20))),propose=spinflip,T=2)
>>> c.run(10000)
>>> P.plot(N.array(c.states)[:,0])  # plot the energy

This Markov chain mixes (thermalizes) very slowly, because the spin-flip proposals are so gradual; indeed, it's almost unusable for simulations. We can do much better with the Wolff update scheme, which selects a probabilistic cluster of spins with the same orientation, and flips all the spins in the cluster at once. See Sethna (2006) for an explanation of this elegant scheme, and see the full code below for a Python implementation.

The Wolff update does not even require the Metropolis stepper, since the update itself satisfies the MCMC detailed-balance condition. Thus, we need to once again change the logic in chain; we do so by using object-oriented inheritance to define a class wolffchain that inherits all the chain methods, but redefines step():

class wolffchain(chain):
    def step(self):
        self.current = wolff(self.current,self.T)
    
        self.states.append(self.current)
        # ...or the slightly more complicated code to save observables...

Let's give it a spin (we'll pass None for propose since wolffchain does not need it):

>>> c = wolffchain(init=ising(N.ones((20,20))),propose=None,T=2)
>>> c.run(10000)
>>> P.plot(N.array(c.states)[:,0])

This mixes much better, and indeed it can be used to plot, say, the expectation value of energy as a function of temperature. For convenience, we define

def runchain(T,n,size=20):
    c = wolffchain(ising(N.ones((size,size))),propose=None,T=T)
    c.run(n)
    return N.mean(N.array(c.states),axis=0) / size**2   # average E,m per spin

and run

>>> Ts = N.linspace(1.5,3.5,30)
>>> Es = N.array([runchain(t,1000) for t in Ts])
>>> P.plot(Ts,Es)

Ta-da! We can see the beginnings of a phase transition, which becomes sharp only in the limit of infinite lattice size; the crossover happens around Onsager's exact critical temperature T_c = 2/log(1 + √2) ≈ 2.27.

As usual, let me know if you have questions, or if I left some typos and misprogrammings here and there.

Why OOP?

Curt asked, why OOP? Actually, what he asked was, if OOP is so good, why not use it all the time? Well, OOP is about organizing and reusing code (and ideas). If your application is one-shot, and simple enough that you can hold its logic in your brain without abstracting subconcepts into objects, then by all means write it as non-OOP procedural code. This threshold is somewhat language dependent: since OOP requires more boilerplate in C++ than in Python, and even more in Java, you may find that for the same problem you'd happily write a class implementation in Python, but not in other languages. In some languages OOP is just not supported (C, F77, Mathematica), or it's too, too painful (MATLAB).

Pointers

I could not find a good specific reference for object-oriented programming in Python, but you could start from "Classes" in the official Python tutorial. The rest you'll pick up as you start writing code. Later in this salon we'll discuss some advanced language features (who knows, maybe even metaclass programming), which are explained very well in Pro Python.

Python code!

import math, random
import numpy as N

# a simple MC state abstraction for a particular probability
class state(object):
    def __init__(self,x):
        self.x = x

    def logp(self,T=None):
        # T is a dummy argument, so state also works with the temperature-aware chain;
        # the constant -0.594875 = -log(integral of exp(-x^4) dx) normalizes the density
        return -0.594875 - self.x**4


# a Markov chain with a Metropolis-rule stepper
class chain(object):
    def __init__(self,init,propose,T):
        self.current = init
        self.propose = propose
        self.T = T
    
        # we'll collect all states (or their observables) in a list
        try:
            self.states = [init.obs()]
        except AttributeError:
            self.states = [init]

    def step(self):
        new = self.propose(self.current)
    
        # we work with logarithms so that tiny probabilities don't underflow
        logpn, logpc = new.logp(self.T), self.current.logp(self.T)
    
        # Metropolis rule: if p_new >= p_old, always accept the new state;
        #                  otherwise accept it with probability p_new/p_old,
        #                  and keep the current state if the draw fails
        if ( (logpn > logpc) or
             (math.log(random.random()) < logpn - logpc) ):
            self.current = new
    
        try:
            self.states.append(self.current.obs())
        except AttributeError:
            self.states.append(self.current)

    def run(self,n):
        for i in xrange(n):
            self.step()


# random-walk proposal with a uniform random step
def randomwalk(s):
    dx = random.random() - 0.5  # pick a number in [-0.5,0.5]

    return state(s.x + dx)      # create a new state instance with the new x

# flat proposal
def curtwalk(s):
    return state(10*random.random() - 5)

# an MC state abstraction for the 2D Ising model
class ising(object):
    def __init__(self,init):
        self.x = init

    def energy(self):
        return -(N.sum(self.x[:-1,:] * self.x[1:,:]) +  # slicing tricks: each spin times its neighbor below...
                 N.sum(self.x[:,:-1] * self.x[:,1:]) +  # ...and its neighbor to the right
                 N.sum(self.x[0,:] * self.x[-1,:])   +  # wrap-around bonds give periodic boundaries
                 N.sum(self.x[:,0] * self.x[:,-1]))

    def magnetization(self):
        return abs(N.sum(self.x))

    def logp(self,T):
        return -self.energy()/T

    def obs(self):
        return (self.energy(),self.magnetization())


def spinflip(s):
    newx = s.x.copy()   # copy the current matrix

    # select two indices at random, and flip the spin
    i,j = random.randrange(0,len(newx)), random.randrange(0,len(newx))
    newx[i,j] *= -1

    # call the ising constructor to create a new state from the matrix
    return ising(newx)


# return a list of four nearest-neighbor positions to pos (as tuples)
# (needed by wolff)
def neighbors(pos,size):
    return [((pos[0] + D[0]) % size,
             (pos[1] + D[1]) % size) for D in ((1,0),(-1,0),(0,1),(0,-1))]


def wolff(current,T):
    newx = current.x.copy()
    size = len(newx)

    # flipping probability needed for detailed balance in the Wolff scheme
    p = 1.0 - math.exp(-2.0/T)

    start     = random.randrange(0,size),random.randrange(0,size)
    startspin = newx[start]                 # pick a random position to begin, note the spin

    toflip = [start]                        # start a FIFO queue of locations to flip

    while toflip:                           # while there are positions left in our queue
        if newx[toflip[0]] == startspin:
            newx[toflip[0]] *= -1           # flip the first one if it's still the same color
    
            for pos in neighbors(toflip[0],size):    # add neighbors of the original color with probability p 
                if newx[pos] == startspin and random.random() < p:
                    toflip.append(pos)
    
        # else: the spin was flipped already
    
        del toflip[0]                       # then remove it from the queue

    return ising(newx)


class wolffchain(chain):
    def step(self):
        self.current = wolff(self.current,self.T)
    
        try:
            self.states.append(self.current.obs())
        except AttributeError:
            self.states.append(self.current)


def runchain(T,n,size=20):
    c = wolffchain(ising(N.ones((size,size))),propose=None,T=T)

    # run n steps of the chain
    c.run(n)

    # return the average energy and magnetization per spin
    return N.mean(N.array(c.states),axis=0) / size**2
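

# a hedged usage sketch (not part of Friday's session): run this file as a
# script to reproduce the energy-vs-temperature plot from the summary above
if __name__ == '__main__':
    import matplotlib.pyplot as P

    Ts = N.linspace(1.5,3.5,30)
    Es = N.array([runchain(t,1000) for t in Ts])[:,0]   # column 0 = energy per spin

    P.plot(Ts,Es)
    P.xlabel('T'); P.ylabel('E per spin')
    P.show()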