Distributed Python
This tutorial will get you started with IPython's parallel features for running parallel and distributed tasks on the cluster.
(I couldn't get it to work across different machines yet. Please edit this page if you get it to work.)
MPI
Make sure you have access to the mpiexec command on the cluster. If not, run the mpi-selector-menu command and choose a default MPI distribution.
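As a quick sanity check that MPI itself works, you can run a minimal MPI program. This is a sketch assuming mpi4py is installed on the cluster:
# check_mpi.py -- minimal MPI sanity check (assumes mpi4py is installed)
# run it with, e.g.:  $ mpiexec -n 4 python check_mpi.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
print("rank %d of %d" % (comm.Get_rank(), comm.Get_size()))
If every rank prints its number, mpiexec is working and you can move on to the IPython setup.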
Create an IPython MPI profile
To do this, run the following command on a cluster machine:
$ ipython profile create --parallel --profile=mpi
This will create an IPython profile called 'mpi'. The configuration files will be in the folder ~/.ipython/profile_mpi/
In the file ~/.ipython/profile_mpi/ipcluster_config.py, find the line that sets the c.IPClusterEngines.engine_launcher_class parameter, uncomment it and set it to MPIExecEngineSetLauncher:
c.IPClusterEngines.engine_launcher_class = 'MPIExecEngineSetLauncher'
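For reference, the relevant part of ipcluster_config.py might look something like this after editing (the commented-out n setting is an optional alternative to passing --n on the command line):
# ~/.ipython/profile_mpi/ipcluster_config.py (excerpt)
c = get_config()

# launch the engines with mpiexec instead of the default local launcher
c.IPClusterEngines.engine_launcher_class = 'MPIExecEngineSetLauncher'

# optionally, set a default number of engines here instead of using --n
# c.IPClusterEngines.n = 12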
Launch the engines
To run the engines using the mpi profile you just created:
$ ipcluster start --n=12 --profile=mpi
(the --n option specifies the number of engines you want to run)
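Once the engines are up, a quick way to verify that they all registered is to connect a client and count them:
from IPython.parallel import Client

rc = Client(profile='mpi')
print(len(rc.ids))  # should print 12 once all engines have registered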
More info on launching engines is available in the IPython parallel documentation.
Using DirectView to run code on the engines
Example:
from IPython.parallel import Client
rc = Client(profile='mpi')
dview = rc[:]  # get access to all the engines

# execute code on each engine
dview.execute('import random; a = random.random()')

# execute a script on each engine
dview.run('my_script.py')

# get and set variables by name
dview['a']

# map a function using all engines concurrently
powers = dview.map_sync(lambda x: x**10, range(32))
Note that IPython uses pickle to serialize objects in order to get and set remote variables (NumPy arrays are treated differently for efficiency).
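For larger arrays it is often better to move data explicitly with scatter and gather. Here is a minimal sketch, assuming NumPy is available on the client and on every engine, and reusing the dview from the example above:
import numpy as np

dview.scatter('x', np.arange(32), block=True)  # partition the array across engines
dview.execute('y = x * 2')                     # each engine works on its own slice
result = dview.gather('y', block=True)         # reassemble the pieces on the client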
More on DirectView, and on IPython's parallel features in general, can be found in the IPython documentation.