Multiprocessing Lambdas


This tutorial explains how to run BICePs sampling for several lambda values in parallel. We construct a simple wrapper function around the code to be multiprocessed, then decorate that wrapper with @biceps.multiprocess.

In the preparation tutorial, we generated the input data that will be used here.

Please be aware of the following:

  • Never use multiprocessing when debugging.

  • Always set verbose=False for every method called inside the wrapper function.
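
For debugging, one option is to leave the decorator off and call the wrapper in a plain loop, so that errors and tracebacks come from a single process. The sketch below illustrates that idea only; `run_serially` and `toy_wrapper` are hypothetical names introduced here and are not part of BICePs.

```python
def run_serially(wrapper, values):
    """Call wrapper once per value, in order, in the current process."""
    results = []
    for value in values:
        results.append(wrapper(value))
    return results


def toy_wrapper(lam):
    # Hypothetical stand-in for the real per-lambda work.
    return round(lam * 2.0, 2)


serial_results = run_serially(toy_wrapper, [0.0, 1.0])
print(serial_results)
```

Once the wrapper behaves correctly in the serial loop, the decorator can be added back to run the same function in parallel.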


[1]:
import numpy as np
import pandas as pd
import biceps
BICePs - Bayesian Inference of Conformational Populations, Version 2.0
Warning on use of the timeseries module: If the inherent timescales of the system are long compared to those being analyzed, this statistical inefficiency may be an underestimate.  The estimate presumes the use of many statistically independent samples.  Tests should be performed to assess whether this condition is satisfied.   Be cautious in the interpretation of the data.

Data, output directories and parameters

[4]:
energies = np.loadtxt('cineromycin_B/cineromycinB_QMenergies.dat')*627.509  # convert from hartrees to kcal/mol
energies = energies/0.5959   # convert to reduced free energies F = f/kT
energies -= energies.min()  # set ground state to zero, just in case
input_data = biceps.toolbox.sort_data('cineromycin_B/J_NOE')
nsteps = 100000
n_lambdas = 2
outdir = '%s_steps_%s_lam'%(nsteps, n_lambdas)
biceps.toolbox.mkdir(outdir)
lambda_values = np.linspace(0.0, 1.0, n_lambdas)
parameters = [
    dict(ref="uniform", sigma=(0.05, 20.0, 1.02)),
    dict(ref="exponential", sigma=(0.05, 5.0, 1.02), gamma=(0.2, 5.0, 1.02)),
]
pd.DataFrame(parameters)
[4]:
   ref          sigma               gamma
0  uniform      (0.05, 20.0, 1.02)  NaN
1  exponential  (0.05, 5.0, 1.02)   (0.2, 5.0, 1.02)
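
The energy preparation in the cell above is three arithmetic steps: convert hartrees to kcal/mol, divide by kT, and shift so the lowest state is zero. A minimal sketch of the same arithmetic follows. It uses plain Python lists instead of NumPy so that it runs without dependencies, and the input energies are made-up values for illustration, not the cineromycin B data.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509  # same factor as in the cell above
KT_KCAL_PER_MOL = 0.5959           # same divisor (kT) as in the cell above


def to_reduced_free_energies(energies_hartree):
    """Convert hartree energies to reduced free energies with minimum zero."""
    kcal = [e * HARTREE_TO_KCAL_PER_MOL for e in energies_hartree]
    reduced = [e / KT_KCAL_PER_MOL for e in kcal]
    lowest = min(reduced)
    return [e - lowest for e in reduced]


# Hypothetical energies in hartrees, for illustration only.
toy_energies = [-1.000, -0.998, -0.999]
reduced = to_reduced_free_energies(toy_energies)
print(reduced)
```

After the shift, the lowest-energy state sits at exactly zero and every other state is a non-negative number in units of kT.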
[5]:
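# Run mp_lambdas once for each value in lambda_values
@biceps.multiprocess(iterable=lambda_values)
def mp_lambdas(lam):
    print(f"lambda: {lam}")
    # Build the ensemble for this lambda value and attach the restraints
    ensemble = biceps.Ensemble(lam, energies)
    ensemble.initialize_restraints(input_data, parameters)
    # Sample the posterior; keep verbose=False inside multiprocessed code
    sampler = biceps.PosteriorSampler(ensemble)
    sampler.sample(nsteps, verbose=False)
    # Save the trajectory results and the sampler object for this lambda
    sampler.traj.process_results(outdir+'/traj_lambda%2.2f.npz'%(lam))
    filename = outdir+'/sampler_lambda%2.2f.pkl'%(lam)
    biceps.toolbox.save_object(sampler, filename)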
Number of CPUs: 10
Number of processes: 2
lambda: 0.0
lambda: 1.0
100%|███████████████████████████████████████████████████████████| 100000/100000 [00:01<00:00, 66316.92it/s]

Accepted 70.836 %


Accepted [ ...Nuisance paramters..., state] %
Accepted [32.905 29.871 29.871  8.06 ] %
100%|███████████████████████████████████████████████████████████| 100000/100000 [00:01<00:00, 66102.80it/s]



Accepted 64.41900000000001 %


Accepted [ ...Nuisance paramters..., state] %
Accepted [32.817 29.933 29.933  1.669] %

Note that the output for each lambda value is written to stdout as soon as its job has completed.
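
After the run, it can be useful to check that each lambda value produced its files. The sketch below rebuilds the file names from the same format strings the wrapper uses and reports any that are missing. It assumes `process_results` and `save_object` write to exactly the paths they are given; `expected_outputs` is a hypothetical helper, not a BICePs function.

```python
import os


def expected_outputs(outdir, lambda_values):
    """List the trajectory and sampler paths the wrapper uses per lambda."""
    paths = []
    for lam in lambda_values:
        paths.append(outdir + '/traj_lambda%2.2f.npz' % lam)
        paths.append(outdir + '/sampler_lambda%2.2f.pkl' % lam)
    return paths


# Same naming scheme as the tutorial: nsteps=100000, n_lambdas=2
outdir = '%s_steps_%s_lam' % (100000, 2)
paths = expected_outputs(outdir, [0.0, 1.0])

# Any path listed here was not found on disk
missing = [p for p in paths if not os.path.exists(p)]
print(missing)
```

An empty list means all four files are present; anything else names the files that the corresponding job did not write.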
