Dear Salonnières,

thank you so much for attending the compu-salon, episode 4. Here's a summary of our discussion.

The next salon will be this Friday, Feb 17, at 4pm. We'll talk about making beautiful and informative plots, with a special focus on the teachings of the statistical-graphics guru, Edward Tufte. As I mentioned last Friday, I would like all of you to bring (and send me in advance) one of your own plots that you're especially proud of, or that you feel needs work. We'll discuss what's great about them, and how they could be improved.

I'll send out a reminder on Friday morning. Until then!

Michele

# Compu-salon ep. 4 (2012/02/10) in summary

Also at www.vallis.org/salon.

In our fourth meeting, we talked about:

## Ising and Markov

The principles and (broadly) mathematics of the Ising model and Markov Chain Monte Carlo (MCMC) sampling. We won't repeat them here, but among many other places you can look in Mackay's book, Sethna's book, and for a historical perspective of why the Metropolis algorithm should be the Rosenblush algorithm, Gubernatis.

## OOP

The history and principles of object oriented programming (OOP). Wikipedia has a decent article. On Friday, we used our MCMC code to exemplify four widely recognized (if partly overlapping) principles:

• Dynamic dispatch
• Encapsulation (or abstraction)
• Polymorphism
• Inheritance

We used encapsulation/abstraction when we defined a class `state` to represent a single state of a simple Markov chain, which we built to produce samples distributed as exp( − x4):

``````class state(object):
def __init__(self,x):
self.x = x

def logp(self):
return -0.594875 - self.x**4``````

Remember the all-important distinction between the class (which defines the behavior of the object) and its instances (of which there can be many, and which actually hold the data). The Python code above defines the constructor `__init__`, which is called to create every instance, and which (here) stores the argument `x` in the instance variable `self.x`; it defines also the method `logp`, which uses the value stored in the instance variable `self.x`. Now, `self` is a special variable that refers to the class instance; in Python, `self` is always the first argument of all class methods. Thus, in the console:

``````>>> s1 = state(0)
>>> s1.x
0
>>> s1.logp()
-0.59487500000000004
>>> s2 = state(1)
>>> s2.x
0
>>> s2.logp()
-1.594875``````

Next, we again used encapsulation/abstraction when we defined a class `chain` to represent a Markov chain. The job, as it were, of a chain is to hold the current state (in `self.current`), to step to a new state using the Metropolis rule (with the `step()` method), and to repeat the stepping `n` times (with the `run()` method).

``````class chain(object):
def __init__(self,init,propose):
self.current = init
self.propose = propose

self.xs = [init]

def step(self):
new = self.propose(self.current)

logpn, logpc = new.logp(), self.current.logp()

if ( (logpn > logpc) or
(math.log(random.random()) < logpn - logpc) ):
self.current = new

self.states.append(self.current)

def run(self,n):
for i in xrange(n):
self.step()``````

Note that `step()` calls a function `self.propose` that we must provide to the `chain` constructor. In Python, functions are first-class objects, so they can be stored in a variable, to call later. In our salon, we came up with two different proposals for the stepper:

``````def randomwalk(s):
dx = random.random() - 0.5
return state(s.x + dx)

def curtwalk(s):
return state(10*random.random() - 5)``````

We're then ready to use our Markov chain, and perform a first integration:

``````>>> c = chain(state(0),randomwalk)
>>> c.run(10000)
>>> N.mean([s.x**2 for s in c.states])
0.33987152744696092``````

Note how we used a list comprehension to extract the instance variable `x` from every saved state. (We also used `mean` from `numpy`—by the way, did you start your session with the standard incantation "`import numpy as N, matplotlib.pyplot as P`"?)

Next, we moved on to the Ising model, and created a class `ising` to hold the state (this time an N × N matrix of 1s and -1s), and to compute the energy, magnetization, and of course the probability.

``````class ising(object):
def __init__(self,init):
self.x = init

def energy(self):
return -(N.sum(self.x[:-1,:] * self.x[1:,:]) +
N.sum(self.x[:,:-1] * self.x[:,1:]) +
N.sum(self.x[0,:] * self.x[-1,:]) +
N.sum(self.x[:,0] * self.x[:,-1]))

def magnetization(self):
return abs(N.sum(self.x))

def logp(self,T):
return -self.energy()/T``````

(For homework, make sure you understand why the matrix expression in `energy()` implements the Ising energy for periodic boundary conditions.) Again we need a proposal for the Metropolis stepper. The simplest thing we can do is to flip a single spin at random:

``````def spinflip(s):
newx = s.x.copy()

i,j = random.randrange(0,len(newx)), random.randrange(0,len(newx))
newx[i,j] *= -1

return ising(newx)``````

We're almost ready to go, but we still need to modify the class `chain` slightly to take a temperature as a constructor argument, and to use it when calling `logp()`. [We can retrofit our p exp( − x4) `state` to work with the modified `chain` by taking a dummy argument `T` (set to `None` by default) to `logp()`.] OK, we can start an Ising chain with a matrix of zeros:

``````>>> c = chain(init=ising(N.ones((20,20))),propose=spinflip,T=2)
>>> c.run(100)
>>> c.states[-1].x  # see the final state
array([[ 1.,  1.,  1.,  1., -1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1., -1.,  1.],
[...and so on...]``````

In constructing the chain, we could have used positional arguments instead of keyword arguments as above, but the latter are nice for clarity.

Now, notice how the class `chain` works just as well with `ising` as it did with `state`! Indeed, all that a generic state-like class needs to do in order to work with `chain` is to define the method `logp()`. The fact that `chain` ends up calling different code for `logp()` depending on the type of `self.current` is an example of dynamic dispatch.

It is also an example of subtype polymorphism—that is, multiple state-like objects exist, which can define different methods `logp()`. This however would be more evident in a language like C++, where we would have to explicitely define an abstract class (say, `abstractstate`) with a virtual method `logp()`, and have both `state` and `ising` inherit from it. The C++ `chain` class would then hold a pointer to `abstractstate`, and calling `logp()` on it would select the correct implementation (say, `ising`'s), from the `abstractstate` virtual method table. If you're not familiar with C++ and didn't follow any of that, don't worry, you don't need to. That's why you're learning Python.

Instead, Python uses a much less formal name-based polymorphism where the interface of a class (i.e., the variables and methods that it makes available to the world) are defined implicitly by how the class can be used. This is known also as duck-typing: if it walks, swims, and quacks like a duck, it's a duck. (Or a pa-pap, as my son would say.)

Since it's not practical to store the entire Ising configuration of all the states in a chain, we will again modify `chain` slightly, to store only the energy and magnetization. There are many ways we could do this—to be least disruptive to our existing code, we'll give `ising` a method `obs()` that returns a tuple of energy and magnetization; `chain` will then try to call `obs()`, and default to saving the state (using a Python exception) if `obs()` is not available. See the full code below for details. Thus:

``````>>> c = chain(init=ising(N.ones((20,20))),propose=spinflip,T=2)
>>> c.run(10000)
>>> P.plot(N.array(c.states)[:,0])  # plot the energy``````

This Markov chain mixes (thermalizes) very slowly, because the spin-flip proposals are so gradual; indeed, it's almost unusable for simulations. We can do much better with the Wolff update scheme, which selects a probabilistic cluster of spins with the same orientation, and flips all the spins in the cluster at once. See Sethna (2006) for an explanation of this elegant scheme, and see the full code below for a Python implementation.

The Wolff update does not even require the Metropolis stepper, since the update itself satisfies the MCMC detailed-balance condition. Thus, we need to once again change the logic in `chain`; we do so by using object-oriented inheritance to define a class `wolffchain` that inherits all the `chain` methods, but redefines step():

``````class wolffchain(chain):
def step(self):
self.current = wolff(self.current,self.T)

self.states.append(self.current)
# ...or the slightly more complicated code to save observables...``````

Let's give it a spin (we'll pass `None` for `propose` since `wolffchain` does not need it):

``````>>> c = wolffchain(init=ising(N.ones((20,20))),propose=None,T=2)
>>> c.run(10000)
>>> P.plot(N.array(c.states)[:,0])``````

This mixes much better, and indeed it can be used to plot, say, the expectation value of energy as a function of temperature. For convenience, we define

``````def runchain(T,n,size=20):
c = wolffchain(ising(N.ones((size,size))),propose=None,T=T)
c.run(n)
return N.mean(N.array(c.states),axis=0) / size**2   # average E,m per spin``````

and run

``````>>> Ts = N.linspace(1.5,3.5,30)
>>> Es = N.array([runchain(t,1000) for t in Ts])
>>> P.plot(Ts,Es)``````

Ta-da! We can see the beginnings of a phase transition, which is realized in the limit of lattice size going to infinity.

As usual, let me know if you have questions, or if I left some typos and misprogrammings here and there.

## Why OOP?

Curt asked, why OOP? Actually, what he asked was, if OOP is so good, why not use it all the time? Well, OOP is about organizing and reusing code (and ideas). If your application is one-shot, and simple enough that you can hold its logic in your brain without abstracting subconcepts into objects, then by all means write it as non-OOP procedural code. This threshold is somewhat language dependent: since OOP requires more boilerplate in C++ than in Python, and even more in Java, you may find that for the same problem you'd happily write a class implementation in Python, but not in other languages. In some languages OOP is just not supported (C, F77, Mathematica), or it's too, too painful (MATLAB).

## Pointers

I could not find a good specific reference for object-oriented programming in Python, but you could start from "Classes" in the official Python tutorial. The rest you'll pick up as you start writing code. Later in this salon we'll discuss some advanced language features (who knows, maybe even metaclass programming), which are explained very well in Pro Python.

## Python code!

``````import math, random
import numpy as N

# a simple MC state abstraction for a particular probability
class state(object):
def __init__(self,x):
self.x = x

def logp(self,T=None):
return -0.594875 - self.x**4

# a Markov chain with a Metropolis-rule stepper
class chain(object):
def __init__(self,init,propose,T):
self.current = init
self.propose = propose
self.T = T

# we'll collect all states (or their observables) in a list
try:
self.states = [init.obs()]
except AttributeError:
self.states = [init]

def step(self):
new = self.propose(self.current)

# we work with logarithms in case probabilities get large
logpn, logpc = new.logp(self.T), self.current.logp(self.T)

# Metropolis rule: if p_new > p_old, always choose the new
#                  if p_new < p_old, choose the new with probability p_new/p_old
#                  otherwise let new = old
if ( (logpn > logpc) or
(math.log(random.random()) < logpn - logpc) ):
self.current = new

try:
self.states.append(self.current.obs())
except AttributeError:
self.states.append(self.current)

def run(self,n):
for i in xrange(n):
self.step()

# random-length random-walk proposal
def randomwalk(s):
dx = random.random() - 0.5  # pick a number between [-0.5,0.5]

return state(s.x + dx)      # create a new state instance with the new x

# flat proposal
def curtwalk(s):
return state(10*random.random() - 5)

# an MC state abstraction for the 2D Ising model
class ising(object):
def __init__(self,init=None,size=20):
self.x = init

def energy(self):
return -(N.sum(self.x[:-1,:] * self.x[1:,:]) +  # note the slicing tricks to multiply to the left and bottom
N.sum(self.x[:,:-1] * self.x[:,1:]) +
N.sum(self.x[0,:] * self.x[-1,:])   +  # we also need wrap-around borders
N.sum(self.x[:,0] * self.x[:,-1]))

def magnetization(self):
return abs(N.sum(self.x))

def logp(self,T):
return -self.energy()/T

def obs(self):
return (self.energy(),self.magnetization())

def spinflip(s):
newx = s.x.copy()   # copy the current matrix

# select two indices at random, and flip the spin
i,j = random.randrange(0,len(newx)), random.randrange(0,len(newx))
newx[i,j] *= -1

# call the ising constructor to create a new state from the matrix
return ising(newx)

# return a list of four nearest-neighbor positions to pos (as tuples)
# (needed by wolff)
def neighbors(pos,size):
return [((pos + D) % size,
(pos + D) % size) for D in ((1,0),(-1,0),(0,1),(0,-1))]

def wolff(current,T):
newx = current.x.copy()
size = len(newx)

# flipping probability needed for detailed balance in the Wolff scheme
p = 1.0 - math.exp(-2.0/T)

start     = random.randrange(0,size),random.randrange(0,size)
startspin = newx[start]                 # pick a random position to begin, note the spin

toflip = [start]                        # start a FIFO queue of locations to flip

while toflip:                           # if there are spins in our queue
if newx[toflip] == startspin:
newx[toflip] *= -1           # flip the first one if it's still the same color

for pos in neighbors(toflip,size):    # add neighbors of the original color with probability p
if newx[pos] == startspin and random.random() < p:
toflip.append(pos)

# else: the spin was flipped already

del toflip                       # then remove it from the queue

return ising(newx)

class wolffchain(chain):
def step(self):
self.current = wolff(self.current,self.T)

try:
self.states.append(self.current.obs())
except AttributeError:
self.states.append(self.current)

def runchain(T,n,size=20):
c = wolffchain(ising(N.ones((size,size))),propose=None,T=T)

# run n steps of the chain
c.run(n)

# return the average energy and magnetization per spin
return N.mean(N.array(c.states),axis=0) / size**2``````