Analysis

In this tutorial, we show how to instantiate the biceps.Analysis class, which uses MBAR to predict the population of each conformational state and to compute a BICePs score. We also briefly describe the output files produced by the analysis and show the figures of the posterior distributions of the populations and nuisance parameters. Please refer to the documentation of Analysis for more details.

[6]:

import biceps
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
warnings.filterwarnings('ignore', category=FutureWarning)
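
Before instantiating Analysis, it can be useful to confirm which sampled trajectories are present in the output directory. The sketch below is not part of biceps: `find_lambda_values` is a hypothetical helper, and it assumes the `traj_lambda<value>.npz` naming that appears in the log output of the next cell.

```python
import re
from pathlib import Path

# Hypothetical helper (not a biceps function): parse lambda values from
# trajectory file names of the assumed form traj_lambda<value>.npz.
_TRAJ_PATTERN = re.compile(r"^traj_lambda(\d+\.\d+)\.npz$")

def find_lambda_values(outdir):
    """Return the sorted lambda values found among the .npz files in outdir."""
    values = []
    for path in Path(outdir).glob("traj_lambda*.npz"):
        match = _TRAJ_PATTERN.match(path.name)
        if match:
            values.append(float(match.group(1)))
    return sorted(values)

# Example call (requires an existing output directory):
# lam = find_lambda_values("results")
```

For a directory holding traj_lambda0.00.npz and traj_lambda1.00.npz, this returns the two values 0.0 and 1.0, which can be compared against the `lam = ...` line that Analysis prints.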

[10]:

%matplotlib inline
A = biceps.Analysis(outdir="results", nstates=100, verbose=True)
fig = A.plot(plottype="hist")  # alternatively: plottype="step"

Loading results/traj_lambda0.00.npz ...
Loading results/traj_lambda1.00.npz ...
not all state sampled, these states [ 0  3  4  5  8  9 11 13 14 15 16 18 19 20 21 22 23 24 25 26 27 28 29 31
 34 40 41 42 43 44 48 49 51 52 53 54 55 57 60 61 62 64 69 71 72 73 74 76
 77 78 81 82 83 86 87 88 89 95 96 97 98 99] are not sampled
Loading results/traj_lambda0.00.pkl ...
Loading results/traj_lambda1.00.pkl ...
lam = [0.0, 1.0]
nstates 100
Time for MBAR: 0.077 s
Writing results/BS.dat...
Writing results/populations.dat...
Top 10 states: [46, 85, 92, 45, 39, 80, 65, 90, 59, 38]
Top 10 populations: [0.01001833 0.0212999  0.02792732 0.03482791 0.07746092 0.07924925
 0.08826894 0.09715671 0.20440265 0.32912632]

../../../_images/examples_Tutorials_Prep_Rest_Post_Ana_analysis_2_1.png

../../../_images/examples_Tutorials_Prep_Rest_Post_Ana_analysis_2_2.png

The output files, shown above, include the population information (“populations.dat”), a figure of the sampled parameter distributions (“BICePs.pdf”), and the BICePs score information (“BS.dat”).
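
The log above lists the top 10 states in ascending order of population. A similar ranking can be sketched with NumPy from a single column of populations. This is an illustration only, not the code biceps runs internally: `top_states` is a hypothetical helper, and the example array is made-up data rather than BICePs output.

```python
import numpy as np

def top_states(populations, n=10):
    """Return (state indices, populations) of the n most populated states,
    ordered from least to most populated."""
    populations = np.asarray(populations, dtype=float)
    # Treat NaN entries as zero population so they never rank highly.
    cleaned = np.nan_to_num(populations, nan=0.0)
    order = np.argsort(cleaned)[-n:]
    return order, cleaned[order]

# Synthetic example with 6 states (made-up numbers).
example = np.array([0.05, 0.40, 0.00, 0.25, 0.10, 0.20])
states, pops = top_states(example, n=3)
print(states)  # [5 3 1]
print(pops)    # [0.2  0.25 0.4 ]
```

To apply this to real output, one could load the file with `np.loadtxt('results/populations.dat')` and pass one of its population columns, for example `pops[:, 1]`, assuming that column holds the populations for the lambda value of interest.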

Now, let’s take a look at the populations file:

The file has 100 rows, one per clustered state, and 4 columns. The first 2 columns give the population of each state at the 2 lambda values, and the last 2 columns give the estimated uncertainty in each of those populations (NaN for states with zero population, such as row 4 below).

[11]:

import pandas as pd
import numpy as np
pops = np.loadtxt('results/populations.dat')
df = pd.DataFrame(pops)
df

[11]:

	0	1	2	3
0	0.017975	2.496320e-05	0.004195	5.931122e-06
1	0.009977	2.292886e-05	0.003138	7.283074e-06
2	0.002693	3.068112e-04	0.001553	1.773218e-04
3	0.000990	1.008111e-05	0.000989	1.008553e-05
4	0.000000	0.000000e+00	NaN	NaN
...	...	...	...	...
95	0.006000	1.247277e-11	0.002441	5.105694e-12
96	0.000993	6.847524e-06	0.000993	6.850553e-06
97	0.004988	1.150700e-05	0.002225	5.157574e-06
98	0.004000	4.474545e-11	0.001996	2.241289e-11
99	0.002999	5.949074e-07	0.001729	3.439323e-07

100 rows × 4 columns
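
The DataFrame above has unlabeled integer columns. The sketch below shows one way to label the two population columns and rank the states that carry nonzero population. The column names, the `label_populations` helper, and the demo array are all assumptions made for illustration; the lambda values (0.0, 1.0) are taken from the log output earlier in this tutorial.

```python
import numpy as np
import pandas as pd

def label_populations(pops, lambdas=(0.0, 1.0)):
    """Put the first len(lambdas) columns (the populations) in a labeled DataFrame."""
    pops = np.asarray(pops, dtype=float)
    k = len(lambdas)
    columns = [f"pop_lambda{lam:.2f}" for lam in lambdas]
    df = pd.DataFrame(pops[:, :k], columns=columns)
    df.index.name = "state"
    return df

# Synthetic 4-state example (made-up numbers); only the first two columns are used.
demo = np.array([
    [0.50, 0.10, 0.05, 0.01],
    [0.30, 0.70, 0.04, 0.06],
    [0.00, 0.00, np.nan, np.nan],
    [0.20, 0.20, 0.03, 0.02],
])
df = label_populations(demo)

# Keep states with nonzero population for at least one lambda value.
sampled = df[(df > 0).any(axis=1)]

# Rank by population at lambda = 1.00, largest first.
ranked = sampled.sort_values("pop_lambda1.00", ascending=False)
print(ranked)
```

With the real file, `label_populations(np.loadtxt('results/populations.dat'))` would give the same kind of table for all 100 states.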

Conclusion

In this tutorial, we showed how to use the biceps.Analysis class to analyze the trajectory data and automatically plot the posterior distributions of the populations and nuisance parameters.

This concludes the series of tutorials: Preparation, Restraint, PosteriorSampler, and Analysis. To learn more about BICePs, please check out our other examples and tutorials.
