Working with trajectory ensembles#
Welcome
Welcome to the MD section of the EncoderMap tutorial. All EncoderMap tutorials are provided as jupyter notebooks, that you can run locally, on binderhub, or even on google colab.
Run this notebook on Google Colab:
Find the documentation of EncoderMap:
https://ag-peter.github.io/encodermap
Goals:
In this tutorial you will learn:
How you can write your own Features that will put CVs into SingleTraj and TrajEnsemble.
How to save your SingleTrajs and TrajEnsembles along with their CVs.
For Google colab only:#
If you’re on Google colab, please uncomment these lines and install EncoderMap.
[1]:
# !wget https://gist.githubusercontent.com/kevinsawade/deda578a3c6f26640ae905a3557e4ed1/raw/b7403a37710cb881839186da96d4d117e50abf36/install_encodermap_google_colab.sh
# !sudo bash install_encodermap_google_colab.sh
Primer#
Collective Variables#
Collective Variables (CVs) are often used to simplify and filter the xyz-Data of MD simulations. They are often employed to make sense of a complex protein (or more general molecular) system by breaking it down into just a few well-defined descriptors. They are similar to reaction coordinates, which are 1-dimensional variables along a reaction pathway but can be higher dimensional. When we think about a receptor-ligand system, the distance between the two species is often used as a reaction coordinate. A set of this distance CV and the relative rotation between receptor and ligand can add much more information and help understand the system. It becomes apparent that clever selection of CVs is an important task that many scientists in different fields face day to day.
With the tools presented in this notebook you will be able to work in a fluent and natural way with MD trajectories and their associated CV data. In the scope of this work we define CVs as:
A collection of values that is in one of its dimensions aligned with the frames/timesteps of the underlying simulation data.
Example CVs#
Here’s a list of widely used CVs:
Distances: This category of CVs can contain a 1-diemsnional scalar value describing the distance between two species in a receptor-ligand system. A 1-dimensional CV of the end-to-end distance of a protein can describe the proteins folding state in an approximate and generalizing manner. Distance CVs can also be higher-dimensional. The pairwise distances between \(\mathrm{C_\alpha}\) atoms of a protein with \(n\) residues can be captured as a \(n \times n\) matrix. A so-called hollow matrix, where all diagonal-elements are zero which is also symmetric. Most often a vector of length \(\binom{n}{2}\) is used to describe the pairwise distances in a protein. The distance between a protein and a membrane/interface would also fall into this category.
Angles: Angular CVs lie in a periodic space of \((-\pi, \pi]\) or \((0, 2\pi]\), or \((0, 360]\). The well-known Ramachandran plot uses a protein’s \(\phi\) and \(\psi\) angles to define regions of \(psi\)-\(phi\) combinations which correlate to different secondary structure motifs. Besides the backbone angles other angles can become important in understanding a protein’s conformations. Lysine is often modified via acetylation, phosphorylation, methylation, ubiquitylation and it’s sidechain angles (\(\chi_1\) to \(\chi_5\)) can be important descriptors in such a system. Pseudo-Dihedral angles lie also in this category.
Integer/binary values. If a protein has well-defined states (folded and unfolder) a binary value describing these states could also be a useful CV.
Values from other calculations and hyperparameters. An example for this category could be the temperature at which the simulation was carried out, when simulations were conducted at multiple temperatures. If phase space sub states were obtained by either using Markov-Chain models, or by using some sort of clustering algorithm (GROMOS), the membership to such cluster could also present CVs.
Positional values: Maybe even the full position of an atom or a group of atoms could be an important descriptor for a system.
etc.
We can view the raw xyz-data of trajectories as a high-dimensional array with a shape of (n_frames, n_atoms, 3). The last axis correxponds to the cartesian coordinates. Indexing this axis will give us the x, y, and z coordinates, respectively.
[2]:
# todo: delete
from __future__ import annotations
import plotly.graph_objects as go
import numpy as np
from pathlib import Path
import encodermap as em
%load_ext autoreload
%autoreload 2
/home/kevin/git/encoder_map_private/encodermap/__init__.py:194: GPUsAreDisabledWarning: EncoderMap disables the GPU per default because most tensorflow code runs with a higher compatibility when the GPU is disabled. If you want to enable GPUs manually, set the environment variable 'ENCODERMAP_ENABLE_GPU' to 'True' before importing EncoderMap. To do this in python you can run:
import os; os.environ['ENCODERMAP_ENABLE_GPU'] = 'True'
before importing encodermap.
_warnings.warn(
[3]:
traj = em.load_project('linear_dimers', traj=0)
print(f"First frame, first atom, x-coordinate:\n{traj.xyz[0, 0, 0]=}")
print(f"First 2 frames, first atom, y-coordinates:\n{traj.xyz[:2, 0, 1]=}")
print(f"All frames, all atoms. Only z-coordinates:\n{traj.xyz[..., -1]}")
/home/kevin/git/encoder_map_private/encodermap/kondata.py:477: UserWarning: EncoderMap's repository at https://sawade.io/encodermap_data/linear_dimers is unreachable.
warnings.warn(f"EncoderMap's repository at {url} is unreachable.")
First frame, first atom, x-coordinate:
traj.xyz[0, 0, 0]=6.1210003
First 2 frames, first atom, y-coordinates:
traj.xyz[:2, 0, 1]=array([6.42 , 6.3940005], dtype=float32)
All frames, all atoms. Only z-coordinates:
[[1.1060001 1.092 1.028 ... 5.7790003 5.7790003 5.881 ]
[1.056 1.0610001 0.96400005 ... 5.76 5.689 5.8650002 ]
[1.062 1.0090001 1.144 ... 5.8500004 5.9550004 5.8240004 ]
...
[0.12100001 0.031 0.12400001 ... 4.783 4.846 4.7850003 ]
[0.116 0.081 0.125 ... 4.617 4.6980004 4.5950003 ]
[0.06900001 0.06 0.063 ... 4.5290003 4.486 4.6260004 ]]
The 3-dimensions of this data can be used for plotting. However, we don’t gain much from these plots.
[4]:
em.plot.plot_raw_data(
traj,
frame_slice=slice(0, 5001, 1000),
atom_slice=slice(0, 1500, 150),
)
Adding more context to the raw xyz-data (like size, color, bonds) we can better understand the data.
[5]:
em.plot.plot_ball_and_stick(traj)
/home/kevin/git/encoder_map_private/encodermap/trajinfo/info_single.py:1691: UserWarning:
Dropping CVs from trajectory. Slicing CVs with this method is currently not possible. Raise an issue if you want to have this feature added.
We can get more information by using additional libraries like NGLView
which also render secondary structure. However, we can’t really understand how the protein moves from one folding state to another. Hit the “Play” button and have a look at the first 200 frames of the simulation. Can you spot the LEU56-SER57-ASP58-TYR59 residues transition from a turn into an \(\alpha\)-helix?
[6]:
import nglview as nv
nv.show_mdtraj(traj.traj[::100])
That’s when collective variables can come in handy. They break down and combine different xyz data to help us understand the grand picture of protein folding.
A Ramachandran plot is a very general description of secondary structure motifs. The test protein does indeed contain \(alpha\)-helices and \(beta\)-sheets. We can’t really understand the conformation of the protein.
[7]:
em.plot.plot_ramachandran(traj, subsample=1000)
A DSSP plot doesn’t condense the \(phi\) and \(psi\) angles into a single plot. Here we can assign secondary structure motifs to certain residues at certain times of the simulation.
[8]:
em.plot.plot_dssp(
traj,
simplified=False,
subsample=slice(0, 500),
)
Distances can also be used as useful collective variables. The end-to-end distance can give helpful insights into a proteins general conformation (collapsed vs. extended).
[9]:
em.plot.plot_end2end(traj)
Sharing MD data#
To play with MD data we must first obtain them. Instead of running long MD simulations you can just download the datasets from the University of Konstanz’ data repository KonDATA. EncoderMap has a convenience function get_from_kondata()
, that allows you to fetch those files:
[10]:
import encodermap as em
output_dir = em.get_from_kondata(
"linear_dimers",
mk_parentdir=True,
silence_overwrite_message=True,
)
/home/kevin/git/encoder_map_private/encodermap/kondata.py:477: UserWarning:
EncoderMap's repository at https://sawade.io/encodermap_data/linear_dimers is unreachable.
[11]:
output_dir
[11]:
'/home/kevin/git/encoder_map_private/tests/data/linear_dimers'
Classes for working with MD data#
After we have obtained some files let us work with Encodermap’s SingleTraj
and TrajEnsemble
classes. We need some more imports to also plot some data and visualize it.
[12]:
# Packages for numerical data
import xarray as xr
import numpy as np
# Packages for MD data
import mdtraj as md
import MDAnalysis as mda
# Packages for plotting
import matplotlib as mpl
import matplotlib.pyplot as plt
# Packages for file operations
from glob import glob
from pathlib import Path
import os
import tempfile
# nice printing
from rich.pretty import pprint
# With nglview you can view structural data in the notebook
# If not installed you should check it out.
try:
import nglview as nv
except ImportError:
print("Check out nglview: https://github.com/nglviewer/nglview")
The new SingleTraj
class#
The SingleTraj
class is meant as a container to hold a trajectory’s xyz coordinates, its topology, and its CVs. This class builds the backbone of the TrajEnsemble
class which will be looked at later.
In the background of the SingleTraj
class the MDTraj package (mdtraj/mdtraj) works its magic, however this class offers more benefits:
The
SingleTraj
class keeps track of CVs while indexing and slicing.The trajectories’ xyz data is only loaded when it is actually needed. Most manipuilation can thus be done before creating huge objects in memory.
The SingleTraj
class can be initialized in many ways:
From a trajectory file (.xtc, .dcd, .lammpstrj) and a topology file (.gro, .pdb).
[13]:
output_dir = Path(output_dir)
traj_from_xtc_file = em.SingleTraj(output_dir / "01.xtc", top=output_dir / "01.pdb")
print(traj_from_xtc_file)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. Not containing any CVs.
From an url of the pdb database.
[14]:
traj_from_url = em.SingleTraj('https://files.rcsb.org/view/1YUF.pdb')
print(traj_from_url)
encodermap.SingleTraj object. Currently not in memory. Basename is '1YUF'. Not containing any CVs.
Or by just by providing a pdb code to the
from_pdb_id()
constructor of theSingleTraj
class.
[15]:
traj_from_pdb_id = em.SingleTraj.from_pdb_id('1YUG')
print(traj_from_pdb_id)
encodermap.SingleTraj object. Currently not in memory. Basename is '1YUG'. Not containing any CVs. Common string is '1YUG'.
You can also directly load the trajectory from an EncoderMap project. In this case, the trajectory already has CVs.
[16]:
# with the load_project() method
traj = em.load_project('linear_dimers', traj=0)
print(traj)
encodermap.SingleTraj object. Currently not in memory. Basename is 'trajs'. CV central_angles with shape (1, 5001, 454) loaded. CV central_cartesians with shape (1, 5001, 456, 3) loaded. CV central_dihedrals with shape (1, 5001, 453) loaded. CV central_distances with shape (1, 5001, 455) loaded. CV lowd with shape (1, 5001, 2) loaded. CV side_dihedrals with shape (1, 5001, 322) loaded.
``SingleTraj``s keep track of the files they have been instantiated from.
[17]:
print(
f"basename = {traj_from_xtc_file.basename:<70}{traj_from_url.basename:<70}\n"
f"traj_file = {traj_from_xtc_file.traj_file:<70}{traj_from_url.traj_file:<70}\n"
f"top_file = {traj_from_xtc_file.top_file:<70}{traj_from_url.top_file:<70}\n"
f"extension = {traj_from_xtc_file.extension:<70}{traj_from_url.extension:<70}\n"
f"n_frames = {traj_from_xtc_file.n_frames:<70}{traj_from_url.n_frames:<70}\n"
f"n_atoms = {traj_from_xtc_file.n_atoms:<70}{traj_from_url.n_atoms:<70}"
)
basename = 01 1YUF
traj_file = /home/kevin/git/encoder_map_private/tests/data/linear_dimers/01.xtc https://files.rcsb.org/view/1YUF.pdb
top_file = /home/kevin/git/encoder_map_private/tests/data/linear_dimers/01.pdb https://files.rcsb.org/view/1YUF.pdb
extension = .xtc .pdb
n_frames = 5001 16
n_atoms = 1521 720
Difference between ``traj``, ``trajectory`` and ``top``, ``topology``
traj
and top
always give mdtraj.Trajectory
and mdtraj.Topology
, respectively. They are loaded on demand and return the corresponding mdtraj
object.
trajectory
and topology
can be False
and represent the current backend of the SingleTraj
object.
When instantiated, SingleTraj
d do not load the data from disk. Only when requested.
[18]:
print(traj_from_xtc_file.topology)
print(traj_from_xtc_file.top)
print(traj_from_xtc_file.topology)
False
<mdtraj.Topology with 1 chains, 152 residues, 1521 atoms, 1534 bonds>
False
[19]:
print(traj_from_xtc_file.trajectory)
print(traj_from_xtc_file.traj)
print(traj_from_xtc_file.trajectory)
False
<mdtraj.Trajectory with 5001 frames, 1521 atoms, 152 residues, and unitcells>
False
Loading can be forced
Using the load()
method, the trajectory will be loaded. From that point forwards, the xyz data is kept in RAM and can be accessed. The unload()
function does the reverse and frees up the RAM, but the xyz data can be loaded again. If your RAM is large enough you would not need the unload()
function, but it is there nonetheless.
[20]:
traj_from_xtc_file.load_traj()
print(traj_from_xtc_file.topology)
traj_from_xtc_file.unload()
print(traj_from_xtc_file.topology)
<mdtraj.Topology with 1 chains, 152 residues, 1521 atoms, 1534 bonds>
False
Inside a context manager, the SingleTraj
is always loaded and unloaded afterwards.
[21]:
with traj_from_xtc_file as t:
print(t.topology)
print(traj_from_xtc_file.topology)
<mdtraj.Topology with 1 chains, 152 residues, 1521 atoms, 1534 bonds>
False
If nglview is set up, you can take a look at the trajectory with this code:
[22]:
view = traj_from_xtc_file.show_traj()
view
Some methods and attributes are duplicated from mdtraj
. This allows us to call some mdtraj
functions on the SingleTraj
object directly.
[23]:
selection = traj_from_xtc_file.select('name CA')
print(selection[:5])
dssp = md.compute_dssp(traj_from_xtc_file)
print(dssp[0, :5].tolist())
[ 4 13 25 34 51]
['C', 'E', 'E', 'E', 'E']
[24]:
md.compute_center_of_mass(traj_from_xtc_file)[0]
[24]:
array([7.60928699, 7.31795958, 3.59687811])
By indexing a SingleTraj
, you get another instance of SingleTraj
containing only one frame.
[25]:
frame = traj_from_xtc_file[0]
print(frame)
encodermap.SingleTraj object. Data currently in memory. Basename is '01'. At indices (0,). Not containing any CVs.
The indexing of SingleTraj
s is postponed, meaning, that for certain indexing types (integers, sequences of integers, slices without the step argument) the trajectory data is not loaded from disk. Only, when loaded, are the indexes applied one after the other.
This cell is fast. No file operations are carried out here.
[26]:
traj_from_xtc_file.unload()
frame = traj_from_xtc_file[:10][[0, 1]]
print(frame)
print(frame.trajectory)
print(frame.n_frames)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices (slice(None, 10, None), [0, 1]). Not containing any CVs.
False
2
This cell is a bit slower. Here, the trajectory is loaded from file.
[27]:
frame.load_traj()
print(frame)
encodermap.SingleTraj object. Data currently in memory. Basename is '01'. At indices (slice(None, 10, None), [0, 1]). Not containing any CVs.
If the index is larger than the number of frames, the SingleTraj
class will throw an exception.
[28]:
traj_from_xtc_file.unload()
frame = traj_from_xtc_file[10_000]
print(frame)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[28], line 2
1 traj_from_xtc_file.unload()
----> 2 frame = traj_from_xtc_file[10_000]
3 print(frame)
File ~/git/encoder_map_private/encodermap/trajinfo/info_single.py:1986, in SingleTraj.__getitem__(self, key)
1984 if isinstance(key, (int, np.integer)):
1985 if key > self.n_frames:
-> 1986 raise IndexError(
1987 f"Index {key} out of range for traj with "
1988 f"{self.n_frames} frames."
1989 )
1990 if isinstance(key, (list, np.ndarray)):
1991 if any([k > self.n_frames for k in key]):
IndexError: Index 10000 out of range for traj with 5001 frames.
SingleTraj
s keep the number of their frames organized. If a SingleTraj
is loaded from disk the frames are represented by a list of integers:
[29]:
traj_from_xtc_file.id
[29]:
array([ 0, 1, 2, ..., 4998, 4999, 5000])
ndexing these frames with a [::2]
slice will give the frames [0, 2, 4, ..., n]
.
[30]:
traj_from_xtc_file[::2].id
[30]:
array([ 0, 2, 4, ..., 4996, 4998, 5000])
This begs the question what happens if we index these frames again? Which frame numbers will be chosen
The answer is: There are two methods.
The normal indexing
[[0, 1, 2]]
does not care for the original frame indices. It will select whichever frame is at positions 0, 1, and 2, regardless of what their original id was.
[31]:
traj_from_xtc_file[::2][[0, 1, 2, 3]].id
[31]:
array([0, 2, 4, 6])
[32]:
traj_from_xtc_file[::5][[0, 1, 2, 5, 6]].id
[32]:
array([ 0, 5, 10, 25, 30])
The other method uses the
fsel[]
selector. It expects the original frame indices.
[33]:
traj_from_xtc_file[::2].fsel[[0, 2, 4, 6]].id
[33]:
array([0, 2, 4, 6])
[34]:
traj_from_xtc_file[::5].fsel[[0, 5, 10, 25, 30]].id
[34]:
array([ 0, 5, 10, 25, 30])
This also means, that the fsel selector can fail for frames that are not present in a SingleTraj
.
This means, that some frames are not accessible, when we do some weird slicing:
[35]:
traj_from_xtc_file[::5].fsel[3]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[35], line 1
----> 1 traj_from_xtc_file[::5].fsel[3]
File ~/git/encoder_map_private/encodermap/trajinfo/info_single.py:199, in SingleTrajFsel.__getitem__(self, item)
194 raise ValueError(
195 f"The `fsel[]` method of `SingleTraj` takes {CanBeIndex} types, "
196 f"but {type(item)} was provided."
197 )
198 if len(idx) == 0:
--> 199 raise ValueError(
200 f"No frames with frame index {item} in trajectory {self.other} "
201 f"with frames: {self.other._frames}"
202 )
203 return self.other[idx]
ValueError: No frames with frame index 3 in trajectory encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices (slice(None, None, 5),). Not containing any CVs. with frames: [ 0 5 10 ... 4990 4995 5000]
You can also give a numpy array, a list or even a slice into the indexing.
Indexing with without a stop ([::5]
) will put the trajectory into memory.
[36]:
traj_from_xtc_file.unload()
subsample = traj_from_xtc_file[::2]
print(traj_from_xtc_file.n_frames)
print(subsample)
print(subsample.n_frames)
5001
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices (slice(None, None, 2),). Not containing any CVs.
2501
[37]:
traj_from_xtc_file.unload()
subsample = traj_from_xtc_file[[0, 1, 5, 6]]
print(subsample)
print(subsample.n_frames)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices ([0, 1, 5, 6],). Not containing any CVs.
4
Note: Andvanced slicing can result in trajectories with 0 frames in them, or possibly reverse the time axis. Use this feature only if you are sure about what you are doing.
[38]:
traj_from_xtc_file.unload()
subsample = traj_from_xtc_file[5:46:3]
print(subsample)
print(subsample.n_frames)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices (slice(5, 46, 3),). Not containing any CVs.
14
The HDF5 file format (ending wiht the .h5 extension) allows us to directly extract frames and accelerate loading. We can save a .h5 formatted file from an SingleTraj
class by calling:
[39]:
with tempfile.TemporaryDirectory() as d:
d = Path(d)
traj_from_xtc_file.save(d / 'my_traj.h5', overwrite=True)
Loading of .h5 files is similar to all files:
[40]:
with tempfile.TemporaryDirectory() as d:
d = Path(d)
traj_from_xtc_file.save(d / 'my_traj.h5', overwrite=True)
traj_h5 = em.SingleTraj(d / 'my_traj.h5')
print(traj_h5[0])
encodermap.SingleTraj object. Currently not in memory. Basename is 'my_traj'. At indices (0,). Not containing any CVs.
The SingleTraj
class can be used as an iterator in two ways:
Iterate over the frames with
for frame in traj
.Iterate over frame_number and frame with
for frame_no, frame in traj.iterframes()
.
[41]:
for frame in traj[::2][:5]:
print(frame.index)
(None, slice(None, None, 2), slice(None, 5, None), 0)
(None, slice(None, None, 2), slice(None, 5, None), 1)
(None, slice(None, None, 2), slice(None, 5, None), 2)
(None, slice(None, None, 2), slice(None, 5, None), 3)
(None, slice(None, None, 2), slice(None, 5, None), 4)
[42]:
for frame_num, frame in traj[::2][:5].iterframes():
print(frame_num)
0
2
4
6
8
The TrajEnsemble
class contains multiple SingleTraj
s#
The trajectory ensemble is everything you’ve always wanted, and more. Really, it is. Trajectory ensembles unlock fundamental ideas in statistical mechanics, including connections between equilibrium and non-equilibrium phenomena.
Daniel M. Zuckerman (https://statisticalbiophysicsblog.org/?p=92#more-92)
EncoderMap tries to implement the idea of a trajectory ensemble with the TrajectoryEnsemble
class. A container for multiple SingleTraj
s.
A trajectory ensemble can be created by providing it multiple trajectory files:
[43]:
output_dir = Path(em.get_from_kondata("pASP_pGLU", mk_parentdir=True, silence_overwrite_message=True))
trajs_from_files = em.load(
[
output_dir / "asp7.xtc",
output_dir / "glu7.xtc",
],
[
output_dir / "asp7.pdb",
output_dir / "glu7.pdb",
],
common_str=["asp7", "glu7"]
)
/home/kevin/git/encoder_map_private/encodermap/kondata.py:477: UserWarning:
EncoderMap's repository at https://sawade.io/encodermap_data/pASP_pGLU is unreachable.
[44]:
trajs_from_files
[44]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 2 trajectories. Common str is ['asp7', 'glu7']. Not containing any CVs. Object at 0x7055c7b2fa60>
The common_str
argument provides a means to group similar trajectories. It searches for matching substrings in the filename of the provided trajectories and topologies. In the trajs_from_files
, there are only one TrajEnsemble
with a single trajectory per common_str
.
[45]:
for common_str, trajs_subset in trajs_from_files.trajs_by_common_str.items():
print(f"The common_str '{common_str}' contains a subset of {trajs_subset.n_trajs} trajectories.")
The common_str 'asp7' contains a subset of 1 trajectories.
The common_str 'glu7' contains a subset of 1 trajectories.
Something that EncoderMap does different than packages like MDTraj
or MDAnalysis
is that trajectories can be grouped together even when their topologies are different.
[46]:
for top, trajs_subset in trajs_from_files.trajs_by_top.items():
print(f"The topology {top} has {trajs_subset.n_trajs} in the ensemble.")
The topology <mdtraj.Topology with 1 chains, 7 residues, 73 atoms, 71 bonds> has 1 in the ensemble.
The topology <mdtraj.Topology with 1 chains, 7 residues, 80 atoms, 78 bonds> has 1 in the ensemble.
The trajs_from_files
instance has trajectories with 2 different topologies in it. This feature is meant to represent the expression of mutations in biological systems.
https://evolution.berkeley.edu/dna-and-mutations/types-of-mutations/
Indexing TrajEnsemble
s can yield different output types. Indexing with a sinlge int
will yield the corresponding trajectory.
[47]:
trajs_from_files[1]
[47]:
<encodermap.SingleTraj object. Currently not in memory. Basename is 'glu7'. Not containing any CVs. Common string is 'glu7'. Object at 0x7055c895a920>
Python lists, NumPy arrays and slices can also be used for indexing. They all return TrajEnsemble
object and raise Errors, when they are indexing trajectories that do not exist in the ensemble.
[48]:
trajs_from_files[[0, 1]]
[48]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 2 trajectories. Common str is ['glu7', 'asp7']. Not containing any CVs. Object at 0x7055ce6b1a20>
[49]:
trajs_from_files[[0, 1, 2]]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[49], line 1
----> 1 trajs_from_files[[0, 1, 2]]
File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:3051, in TrajEnsemble.__getitem__(self, key)
3049 return self.trajs[key]
3050 elif isinstance(key, list) and not isinstance(key[0], list):
-> 3051 new_class = self._return_trajs_by_index(key)
3052 return new_class
3053 elif isinstance(key, np.ndarray):
File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:2566, in TrajEnsemble._return_trajs_by_index(self, index)
2564 trajs_subset = self.trajs[ind]._gen_ensemble()
2565 else:
-> 2566 new_traj = self.trajs[ind]._gen_ensemble()
2567 trajs_subset += new_traj
2568 trajs_subset.common_str = new_common_str
IndexError: list index out of range
Slices can contain start, stop, step or any combination, thereof:
[50]:
trajs_from_files[0:1]
[50]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 1 trajectories. Common str is ['asp7']. Not containing any CVs. Object at 0x7055cdfd58a0>
[51]:
trajs_from_files[::3]
[51]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 1 trajectories. Common str is ['asp7']. Not containing any CVs. Object at 0x7055c93a5db0>
They will also raise Errors, if trajectories are not accessible.
[52]:
trajs_from_files[5:]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[52], line 1
----> 1 trajs_from_files[5:]
File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:3069, in TrajEnsemble.__getitem__(self, key)
3067 start, stop, step = key.indices(self.n_trajs)
3068 list_ = list(range(start, stop, step))
-> 3069 new_class = self[list_]
3070 return new_class
3071 elif isinstance(key, list) and all(isinstance(k, list) for k in key):
File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:3050, in TrajEnsemble.__getitem__(self, key)
3048 if isinstance(key, (int, np.int32, np.int64)):
3049 return self.trajs[key]
-> 3050 elif isinstance(key, list) and not isinstance(key[0], list):
3051 new_class = self._return_trajs_by_index(key)
3052 return new_class
IndexError: list index out of range
Similar to trajectories, which can contain frames with various frame-numbers, TrajEnsemble
can contain trajectories with various trajectory-numbers. The regular indexing []
does not use the trajectory number.
[53]:
trajs = trajs_from_files.copy()
trajs[0].traj_num = 10
trajs[1].traj_num = 20
trajs[0].traj_num
[53]:
10
But the tsel[]
selector uses the trajectory-number:
[54]:
trajs.tsel[20].traj_num
[54]:
20
And raises Errors, if a trajectory with that index is not present.
[55]:
trajs.tsel[0]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[55], line 1
----> 1 trajs.tsel[0]
File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:765, in TrajEnsembleTsel.__getitem__(self, item)
763 if isinstance(item, (int, np.int64)):
764 if item not in items:
--> 765 raise ValueError(
766 f"No trajectories with traj_num {item} in TrajEnsemble {self.other} "
767 f"with trajectories: {items}"
768 )
769 return self.other.trajs_by_traj_num[item]
770 elif isinstance(item, (list, np.ndarray)):
ValueError: No trajectories with traj_num 0 in TrajEnsemble encodermap.TrajEnsemble object. Current backend is no_load. Containing 2 trajectories. Common str is ['asp7', 'glu7']. Not containing any CVs. with trajectories: [10 20]
2-dimensional numpy arrays can be used for indexing TrajEnsembles
, by providing trajectory-numbers and frame-numbers arranged in the rows of the array. This type of selection does not respect the actual trajectory-numbers. That’s what - you guessed it - tsel[]
is for.
[56]:
index = np.array([
[0, 0],
[0, 1],
[0, 2],
[0, 3],
[1, 0],
[1, 2],
[1, 4],
])
trajs[index].id
[56]:
array([[10, 0],
[10, 1],
[10, 2],
[10, 3],
[20, 0],
[20, 2],
[20, 4]])
The tsel[]
selector does the same, but with trajectory-numbers and frame-numbers.
[57]:
index = np.array([
[10, 0],
[10, 10],
[10, 20],
[10, 30],
[20, 0],
[20, 2],
[20, 4],
])
trajs.tsel[index].id
[57]:
array([[10, 0],
[10, 10],
[10, 20],
[10, 30],
[20, 0],
[20, 2],
[20, 4]])
Think about the TrajectoryEnsemble
as a high-dimensional object, that contains an extra axis. The trajectory axis (The SingleTraj
object only has an atom and a time axis).
[58]:
trajs[0] + trajs[1]
[58]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 2 trajectories. Common str is ['asp7', 'glu7']. Not containing any CVs. Object at 0x7055c98a69e0>
As we have previously seen, there are some out-of-the box TrajectoryEnsembles
that come with EncoderMap. You can get a list of all available projects via:
[59]:
em.ALL_PROJECT_NAMES
[59]:
typing.Union[typing.Literal['linear_dimers'], typing.Literal['pASP_pGLU'], typing.Literal['Ub_K11_mutants'], typing.Literal['cube'], typing.Literal['1am7'], typing.Literal['H1Ub']]
Projects are automatically loaded from either:
The University of Konstanz’ data repository KonDATA
A web-based repository with sample data accessible under: https://134.34.112.158/encodermap_data
[60]:
trajs = em.load_project("pASP_pGLU")
trajs
[60]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 7 trajectories. Common str is ['asp10', 'glu8', 'glu6', 'glu7', 'asp8', 'asp6', 'asp7']. CV central_dihedrals with shape (35007, 27) loaded. CV central_angles with shape (35007, 28) loaded. CV side_dihedrals with shape (35007, 28) loaded. CV central_distances with shape (35007, 29) loaded. CV central_cartesians with shape (35007, 30, 3) loaded. Object at 0x7055cdfd5870>
Loading CVs#
After learning about the basics of the SingleTraj
and TrajEnsemble
class we will come back to collective variables. EncoderMap offers a way of “adding” CV data to your trajectories and treat them as a single instance. A trajectory with CV data. The easiest way to load CV data is to provide an already existing NumPy array. However, you will be asked to also provide the attribute name (attr_name
) of the array. With this you could load multiple CV datasets, that differ in ther attribute
names. Here’s an example:
From numpy#
[61]:
# rename the traj to make the following code more readable
traj = traj_from_xtc_file
traj.del_CVs()
# random phi/psi angles in a [0, 2pi] interval
random_raman_angles = np.random.random((traj.n_frames, 2 * traj.n_residues)) * 2 * np.pi
# define labels:
phi_angles = [f'phi {i}' for i in range(traj.n_residues)]
psi_angles = [f'psi {i}' for i in range(traj.n_residues)]
raman_labels = [None]*(len(phi_angles)+len(psi_angles))
raman_labels[::2] = phi_angles
raman_labels[1::2] = psi_angles
# load the CV
traj.load_CV(random_raman_angles, 'raman', labels=raman_labels)
# define some integer values (can be cluster memberships)
random_integers_per_frame = np.random.randint(0, 3, size=traj.n_frames)
traj.load_CV(random_integers_per_frame, 'cluster_membership')
These values can be accessed via directly calling their attribute names (so make sure to use valid identifiers).
[62]:
print(traj.cluster_membership)
[[2]
[0]
[1]
...
[0]
[1]
[0]]
[63]:
print(traj.raman)
[[3.19887292 3.93664429 2.2612379 ... 3.97329588 2.61313514 1.29523575]
[5.67751813 4.56987798 5.1660988 ... 2.93391639 0.37501569 1.58019254]
[3.07029815 2.67067956 6.19632697 ... 5.52159601 1.92555569 5.32793883]
...
[3.32619363 5.30403618 5.29339896 ... 2.86859017 5.97150589 3.54585877]
[5.42811641 4.65150347 0.79362821 ... 4.20676372 2.6362959 3.80984363]
[0.84258919 0.50104588 4.03725794 ... 4.49656793 0.037803 5.22552255]]
There’s also the attribute CVs
that is a dict of these collective variables.
[64]:
traj.CVs
[64]:
{'raman': array([[3.19887292, 3.93664429, 2.2612379 , ..., 3.97329588, 2.61313514,
1.29523575],
[5.67751813, 4.56987798, 5.1660988 , ..., 2.93391639, 0.37501569,
1.58019254],
[3.07029815, 2.67067956, 6.19632697, ..., 5.52159601, 1.92555569,
5.32793883],
...,
[3.32619363, 5.30403618, 5.29339896, ..., 2.86859017, 5.97150589,
3.54585877],
[5.42811641, 4.65150347, 0.79362821, ..., 4.20676372, 2.6362959 ,
3.80984363],
[0.84258919, 0.50104588, 4.03725794, ..., 4.49656793, 0.037803 ,
5.22552255]]),
'cluster_membership': array([[2],
[0],
[1],
...,
[0],
[1],
[0]])}
However, this is not the end. CVs in a SingleTraj
class are stored as xarray.Dataset
s. The dataset can be accessed via _CVs
.
[65]:
traj._CVs
[65]:
<xarray.Dataset> Dimensions: (traj_num: 1, frame_num: 5001, RAMAN: 304, CLUSTER_MEMBERSHIP: 1) Coordinates: * traj_num (traj_num) object None traj_name (traj_num) <U2 '01' * frame_num (frame_num) int64 0 1 2 3 4 ... 4996 4997 4998 4999 5000 * RAMAN (RAMAN) <U7 'phi 0' 'psi 0' ... 'phi 151' 'psi 151' * CLUSTER_MEMBERSHIP (CLUSTER_MEMBERSHIP) <U26 'CLUSTER_MEMBERSHIP FEATURE' Data variables: raman (traj_num, frame_num, RAMAN) float64 3.199 ... 5.226 cluster_membership (traj_num, frame_num, CLUSTER_MEMBERSHIP) int64 2 ... 0 Attributes: length_units: nm time_units: ps feature_axes: ['RAMAN', 'CLUSTER_MEMBERSHIP'] full_path: /home/kevin/git/encoder_map_private/tests/data/linear_dim... topology_file: /home/kevin/git/encoder_map_private/tests/data/linear_dim...
Why xarray?
The underlying xarray.Dataset
is intended to make sure “everything is correct”. Every value can be accessed via an unambigous identifier.
[66]:
traj._CVs['raman'].loc[{'frame_num': 20, 'RAMAN': 'psi 50'}].values
[66]:
array([5.31620961])
Slicing with CVs.#
Slicing keeps your values where they should be.
[67]:
index = np.where(np.array(raman_labels) == 'psi 50')[0]
print(traj[0].raman[0, index])
print(traj[[0, 5, 10, 20]].raman[:,index])
[0.3531243]
[[0.3531243 ]
[4.37541582]
[4.62936691]
[5.31620961]]
Loading from files#
CVs can be loaded by providing a string to files. First, let us save some files.
[68]:
# save numpy
np.save('raman_file.npy', traj.CVs["raman"])
# save text
np.savetxt('cluster_membership_file.txt', traj.cluster_membership)
# save full CV dataset as NetCDF
traj._CVs.to_netcdf('full_CV_dataset.nc', engine="h5netcdf", )
If not providing an attr_name
, while loading files, the filename will be used:
[69]:
traj.load_CV('raman_file.npy')
traj.load_CV('cluster_membership_file.txt')
[70]:
print(traj.CVs.keys())
dict_keys(['raman', 'cluster_membership', 'raman_file', 'cluster_membership_file'])
Multiple CVs can be reconstructed from xarray NetCDF files (most end with .nc). If there are conflicts the new data from disk will overwrite the old.
[71]:
traj = em.SingleTraj(traj.traj_file, traj.top_file)
print(traj.CVs.keys())
traj.load_CV('full_CV_dataset.nc')
print(traj.CVs.keys())
dict_keys([])
dict_keys(['raman', 'cluster_membership'])
Loading with EncoderMap’s featurzier#
EncoderMap comes with a Featurizer
class, that mimics the great PyEMMA package which has now been deprecated. Check it out at http://emma-project.org/latest/.
The Featurizer
centralizes the computation of collective variables. It can contain multiple features, each feature describing a set of collective variables and fixes the computation of collective variables that are part of TrajectoryEnsembles
with non-uniform topologies. To instantiate a featurizer, we provide it with either a SingleTraj
or even a TrajectoryEnsemble
instance.
[72]:
feat = em.Featurizer(traj)
feat
[72]:
EncoderMap Featurizer with features:
[, ...]
The Featurizer
is currently empty. But we can add features via the add_*
methods
[73]:
for m in feat.__dir__():
if m.startswith("add"):
print(m)
add_list_of_feats
add_custom_feature
add_distances_ca
add_distances
add_backbone_torsions
add_angles
add_all
add_selection
add_inverse_distances
add_contacts
add_residue_mindist
add_group_COM
add_residue_COM
add_dihedrals
add_sidechain_torsions
add_minrmsd_to_ref
[74]:
feat.add_backbone_torsions()
feat.add_sidechain_torsions()
feat
/home/kevin/git/encoder_map_private/encodermap/loading/features.py:470: CitePYEMMAWarning:
EncoderMap's featurization uses code from the now deprecated python package PyEMMA (https://github.com/markovmodel/PyEMMA). Please make sure to also cite them, when using EncoderMap.
[74]:
EncoderMap Featurizer with features:
['PHI 0 GLN 2',
'PSI 0 MET 1',
'PHI 0 ILE 3',
'PSI 0 GLN 2',
'PHI 0 PHE 4',
'PSI 0 ILE 3',
'PHI 0 VAL 5',
'PSI 0 PHE 4',
'PHI 0 LYS 6',
'PSI 0 VAL 5', ...]
The idea behind EncoderMap’s feraturizer is to prepare these calculations and define indices which are later used to actually calculate the values of the CVs. The Featurizer
contains a list of Features
. Every feature has a name
and an attribute called *_indexes
.
[75]:
for feature in feat.active_features:
print(f"{feature.__class__.__name__}\n{feature.angle_indexes}\n")
BackboneTorsionFeature
[[ 9 11 13 21]
[ 0 4 9 11]
[ 21 23 25 30]
...
[1493 1495 1508 1510]
[1513 1515 1517 1518]
[1510 1512 1513 1515]]
SideChainTorsions
[[ 0 4 5 6]
[ 11 13 14 15]
[ 23 25 26 27]
...
[1281 1282 1284 1285]
[1472 1473 1475 1476]
[1498 1499 1501 1502]]
In this case, the attribute angle_indexes
attribute contain the 0-based atom ids which are to be used for the calculation of torsion angles. The result of the calculation can be obtained via:
[76]:
feat.get_output()
[76]:
<xarray.Dataset> Dimensions: (traj_num: 1, frame_num: 5001, BACKBONETORSIONFEATURE: 302, SIDECHAINTORSIONS: 322, ATOM_NO: 4) Coordinates: * traj_num (traj_num) object None traj_name (traj_num) <U2 '01' * frame_num (frame_num) int64 0 1 ... 4999 5000 * BACKBONETORSIONFEATURE (BACKBONETORSIONFEATURE) <U13 'PH... time (frame_num) float32 0.0 ... 5e+04 * SIDECHAINTORSIONS (SIDECHAINTORSIONS) <U14 'CHI1 0 ... * ATOM_NO (ATOM_NO) int64 0 1 2 3 Data variables: BackboneTorsionFeature (traj_num, frame_num, BACKBONETORSIONFEATURE) float32 ... SideChainTorsions (traj_num, frame_num, SIDECHAINTORSIONS) float32 ... BackboneTorsionFeature_feature_indices (traj_num, BACKBONETORSIONFEATURE, ATOM_NO) int32 ... SideChainTorsions_feature_indices (traj_num, SIDECHAINTORSIONS, ATOM_NO) int32 ... Attributes: length_units: nm time_units: ps feature_axes: ['SIDECHAINTORSIONS', 'BACKBONETORSIONFEATURE'] angle_units: rad full_path: /home/kevin/git/encoder_map_private/tests/data/linear_dim... topology_file: /home/kevin/git/encoder_map_private/tests/data/linear_dim...
The advantages of this method are:
The same can be done with the
TrajEnsemble
class (more on that later), which is also parallelized.Most of the features contain comprehensive labels themselves.
The SingleTraj
class can be used to direclty load the output of a Featurizer
:
[77]:
traj.load_CV(feat)
And the new features from the Featurizer
are added to the preexisting features of the SingleTraj
.
[78]:
traj._CVs
[78]:
<xarray.Dataset> Dimensions: (traj_num: 1, frame_num: 5001, RAMAN: 304, CLUSTER_MEMBERSHIP: 1, BACKBONETORSIONFEATURE: 302, SIDECHAINTORSIONS: 322, ATOM_NO: 4) Coordinates: * traj_num (traj_num) float64 nan traj_name (traj_num) <U2 '01' * frame_num (frame_num) int64 0 1 ... 4999 5000 * RAMAN (RAMAN) <U7 'phi 0' ... 'psi 151' * CLUSTER_MEMBERSHIP (CLUSTER_MEMBERSHIP) <U26 'CLUSTE... * BACKBONETORSIONFEATURE (BACKBONETORSIONFEATURE) <U13 'PH... time (frame_num) float32 0.0 ... 5e+04 * SIDECHAINTORSIONS (SIDECHAINTORSIONS) <U14 'CHI1 0 ... * ATOM_NO (ATOM_NO) int64 0 1 2 3 Data variables: raman (traj_num, frame_num, RAMAN) float64 ... cluster_membership (traj_num, frame_num, CLUSTER_MEMBERSHIP) int64 ... BackboneTorsionFeature (traj_num, frame_num, BACKBONETORSIONFEATURE) float32 ... SideChainTorsions (traj_num, frame_num, SIDECHAINTORSIONS) float32 ... BackboneTorsionFeature_feature_indices (traj_num, BACKBONETORSIONFEATURE, ATOM_NO) int32 ... SideChainTorsions_feature_indices (traj_num, SIDECHAINTORSIONS, ATOM_NO) int32 ... Attributes: length_units: nm time_units: ps feature_axes: ['BACKBONETORSIONFEATURE', 'RAMAN', 'SIDECHAINTORSIONS', ... angle_units: rad full_path: /home/kevin/git/encoder_map_private/tests/data/linear_dim... topology_file: /home/kevin/git/encoder_map_private/tests/data/linear_dim...
Wrtiting custom features No 1#
Writing your custom features can be done by subclassing pyemma’s features. Required methods and attributes to make your feature work are:
The methods
__init__
,describe
, andcall
.The instance attribute
dimension
, which defines the shape of the returned array.
If you want to change the name of the feature, as it appears in the xarray.Dataset
you can set the attribute name
.
In the next cell we will define a Feature that provides a random float for the atoms in a trajectory.
[79]:
class RandomFloatForAtomFeature(em.features.CustomFeature):
# write an __init__
def __init__(self, traj, selstr='all'):
"""Init of RandomFloatForAtomFeature.
Args:
top (mdtraj.Topology): The topology to select atoms from.
Keyword Args:
selstr (str, optional): The string to provide to top.select().
Defaults to 'all'.
"""
self.traj = traj
self.top = traj.top
self.indexes = self.traj.top.select(selstr)
# Here, we set a required attribute
# The dimension is needed and should
# be an integer value of the length of the feature
# in our case, that's how many atoms are in `self.indexes`
self.dimension = len(self.indexes)
def describe(self):
"""This method is not allowed to take any arguments.
This method provides labels.
Returns:
list: A lsit of str, each str describing one feature.
"""
# In this method we will build a list of str
# Each str should describe one of our features
# We assign floats to atoms, so the labels should tell something about the atoms
getlbl = (
lambda at: f"atom {at.name:>4}:{at.index:5} {at.residue.name}:{at.residue.resSeq:>4}"
)
labels = []
for i in self.indexes:
i = self.traj.top.atom(i)
labels.append(f"Random float for {getlbl(i)}")
return labels
def call(self, traj):
"""This method provides values.
Args:
traj (mdtraj.Trajectory): An mdtraj.Trajectory.
Returns:
np.ndarray: The values of the features defined in describe.
"""
# Make sure that the returned array has correct shape
# It is a good idea, that this array has the same length as
# the trajectory has frames
values = traj.xyz[..., 0]
for i in self.indexes:
values[:, i] = float(str(hash(str(self.traj.top.atom(i))))[-5:])
return values
@property
def name(self):
# define the name of the feature to appear in `SingleTraj._CVs`
return 'MyAwesomeFeature'
[80]:
traj = em.SingleTraj(traj.traj_file, traj.top_file)
print(traj)
featurizer = em.Featurizer(traj)
feat = RandomFloatForAtomFeature(traj)
for i in feat.describe()[:200:25]:
print(i)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. Not containing any CVs.
Random float for atom N: 0 MET: 1
Random float for atom CA: 25 ILE: 3
Random float for atom H: 50 VAL: 5
Random float for atom HG1: 75 THR: 7
Random float for atom C: 100 GLY: 10
Random float for atom H: 125 ILE: 13
Random float for atom O: 150 LEU: 15
Random float for atom OE1: 175 GLU: 18
[81]:
featurizer.add_custom_feature(feat)
[82]:
traj.load_CV(featurizer)
[83]:
traj._CVs.coords['MYAWESOMEFEATURE'].values
[83]:
array(['Random float for atom N: 0 MET: 1',
'Random float for atom H: 1 MET: 1',
'Random float for atom H2: 2 MET: 1', ...,
'Random float for atom C: 1518 GLY: 152',
'Random float for atom O: 1519 GLY: 152',
'Random float for atom OXT: 1520 GLY: 152'], dtype='<U41')
Writing custom features No 2#
In this example we will implement a method of calculating a nematic order parameter. This example will be quite different (working with coarse-grained carbon-hydrate chains (so-called telechelics), and not with proteins), but we will work our way through. Here are some references you might consider:
@article{mukherjee2012derivation,
title={Derivation of coarse grained models for multiscale simulation of liquid crystalline phase transitions},
author={Mukherjee, Biswaroop and Delle Site, Luigi and Kremer, Kurt and Peter, Christine},
journal={The Journal of Physical Chemistry B},
volume={116},
number={29},
pages={8474--8484},
year={2012},
publisher={ACS Publications}
}
@article{flachmuller2021coarse,
title={Coarse grained simulation of the aggregation and structure control of polyethylene nanocrystals},
author={Flachm{\"u}ller, Alexander and Mecking, Stefan and Peter, Christine},
journal={Journal of Physics: Condensed Matter},
volume={33},
number={26},
pages={264001},
year={2021},
publisher={IOP Publishing}
}
Come back later to check out this part of the tutorial.
[ ]:
Saving trajectory and CVs into one file#
A trajectory can (with its CVs) be saved as one comprehensive file with the save()
method. What’s more: Loading such a file again makes it possible to access any frames and their corresponding CVs almost instantaneously.
[84]:
test_data = Path(em.__file__).parent.parent / "tests/data"
traj = em.SingleTraj(test_data / '1am7_corrected.xtc', test_data / '1am7_protein.pdb')
traj.load_CV('all')
traj.save('1am7_all_CVs.h5', overwrite=True)
[85]:
new_traj = em.SingleTraj('1am7_all_CVs.h5')
frames = new_traj[[0, 5, 20, 35]]
frames.central_cartesians.shape
[85]:
(4, 474, 3)
[86]:
frames = new_traj[::5]
print(frames.central_distances.shape)
print(frames._CVs.coords['CENTRAL_DISTANCES'].values)
(11, 473)
['CENTERDISTANCE ATOM N: 0 MET: 1 DIST ATOM CA: 4 MET: 1 CHAIN 0'
'CENTERDISTANCE ATOM CA: 4 MET: 1 DIST ATOM C: 17 MET: 1 CHAIN 0'
'CENTERDISTANCE ATOM C: 17 MET: 1 DIST ATOM N: 19 VAL: 2 CHAIN 0'
'CENTERDISTANCE ATOM N: 19 VAL: 2 DIST ATOM CA: 21 VAL: 2 CHAIN 0'
'CENTERDISTANCE ATOM CA: 21 VAL: 2 DIST ATOM C: 33 VAL: 2 CHAIN 0'
'CENTERDISTANCE ATOM C: 33 VAL: 2 DIST ATOM N: 35 GLU: 3 CHAIN 0'
'CENTERDISTANCE ATOM N: 35 GLU: 3 DIST ATOM CA: 37 GLU: 3 CHAIN 0'
'CENTERDISTANCE ATOM CA: 37 GLU: 3 DIST ATOM C: 48 GLU: 3 CHAIN 0'
'CENTERDISTANCE ATOM C: 48 GLU: 3 DIST ATOM N: 50 ILE: 4 CHAIN 0'
'CENTERDISTANCE ATOM N: 50 ILE: 4 DIST ATOM CA: 52 ILE: 4 CHAIN 0'
'CENTERDISTANCE ATOM CA: 52 ILE: 4 DIST ATOM C: 67 ILE: 4 CHAIN 0'
'CENTERDISTANCE ATOM C: 67 ILE: 4 DIST ATOM N: 69 ASN: 5 CHAIN 0'
'CENTERDISTANCE ATOM N: 69 ASN: 5 DIST ATOM CA: 71 ASN: 5 CHAIN 0'
'CENTERDISTANCE ATOM CA: 71 ASN: 5 DIST ATOM C: 81 ASN: 5 CHAIN 0'
'CENTERDISTANCE ATOM C: 81 ASN: 5 DIST ATOM N: 83 ASN: 6 CHAIN 0'
'CENTERDISTANCE ATOM N: 83 ASN: 6 DIST ATOM CA: 85 ASN: 6 CHAIN 0'
'CENTERDISTANCE ATOM CA: 85 ASN: 6 DIST ATOM C: 95 ASN: 6 CHAIN 0'
'CENTERDISTANCE ATOM C: 95 ASN: 6 DIST ATOM N: 97 GLN: 7 CHAIN 0'
'CENTERDISTANCE ATOM N: 97 GLN: 7 DIST ATOM CA: 99 GLN: 7 CHAIN 0'
'CENTERDISTANCE ATOM CA: 99 GLN: 7 DIST ATOM C: 112 GLN: 7 CHAIN 0'
'CENTERDISTANCE ATOM C: 112 GLN: 7 DIST ATOM N: 114 ARG: 8 CHAIN 0'
'CENTERDISTANCE ATOM N: 114 ARG: 8 DIST ATOM CA: 116 ARG: 8 CHAIN 0'
'CENTERDISTANCE ATOM CA: 116 ARG: 8 DIST ATOM C: 136 ARG: 8 CHAIN 0'
'CENTERDISTANCE ATOM C: 136 ARG: 8 DIST ATOM N: 138 LYS: 9 CHAIN 0'
'CENTERDISTANCE ATOM N: 138 LYS: 9 DIST ATOM CA: 140 LYS: 9 CHAIN 0'
'CENTERDISTANCE ATOM CA: 140 LYS: 9 DIST ATOM C: 158 LYS: 9 CHAIN 0'
'CENTERDISTANCE ATOM C: 158 LYS: 9 DIST ATOM N: 160 ALA: 10 CHAIN 0'
'CENTERDISTANCE ATOM N: 160 ALA: 10 DIST ATOM CA: 162 ALA: 10 CHAIN 0'
'CENTERDISTANCE ATOM CA: 162 ALA: 10 DIST ATOM C: 168 ALA: 10 CHAIN 0'
'CENTERDISTANCE ATOM C: 168 ALA: 10 DIST ATOM N: 170 PHE: 11 CHAIN 0'
'CENTERDISTANCE ATOM N: 170 PHE: 11 DIST ATOM CA: 172 PHE: 11 CHAIN 0'
'CENTERDISTANCE ATOM CA: 172 PHE: 11 DIST ATOM C: 188 PHE: 11 CHAIN 0'
'CENTERDISTANCE ATOM C: 188 PHE: 11 DIST ATOM N: 190 LEU: 12 CHAIN 0'
'CENTERDISTANCE ATOM N: 190 LEU: 12 DIST ATOM CA: 192 LEU: 12 CHAIN 0'
'CENTERDISTANCE ATOM CA: 192 LEU: 12 DIST ATOM C: 207 LEU: 12 CHAIN 0'
'CENTERDISTANCE ATOM C: 207 LEU: 12 DIST ATOM N: 209 ASP: 13 CHAIN 0'
'CENTERDISTANCE ATOM N: 209 ASP: 13 DIST ATOM CA: 211 ASP: 13 CHAIN 0'
'CENTERDISTANCE ATOM CA: 211 ASP: 13 DIST ATOM C: 219 ASP: 13 CHAIN 0'
'CENTERDISTANCE ATOM C: 219 ASP: 13 DIST ATOM N: 221 MET: 14 CHAIN 0'
'CENTERDISTANCE ATOM N: 221 MET: 14 DIST ATOM CA: 223 MET: 14 CHAIN 0'
'CENTERDISTANCE ATOM CA: 223 MET: 14 DIST ATOM C: 236 MET: 14 CHAIN 0'
'CENTERDISTANCE ATOM C: 236 MET: 14 DIST ATOM N: 238 LEU: 15 CHAIN 0'
'CENTERDISTANCE ATOM N: 238 LEU: 15 DIST ATOM CA: 240 LEU: 15 CHAIN 0'
'CENTERDISTANCE ATOM CA: 240 LEU: 15 DIST ATOM C: 255 LEU: 15 CHAIN 0'
'CENTERDISTANCE ATOM C: 255 LEU: 15 DIST ATOM N: 257 ALA: 16 CHAIN 0'
'CENTERDISTANCE ATOM N: 257 ALA: 16 DIST ATOM CA: 259 ALA: 16 CHAIN 0'
'CENTERDISTANCE ATOM CA: 259 ALA: 16 DIST ATOM C: 265 ALA: 16 CHAIN 0'
'CENTERDISTANCE ATOM C: 265 ALA: 16 DIST ATOM N: 267 TRP: 17 CHAIN 0'
'CENTERDISTANCE ATOM N: 267 TRP: 17 DIST ATOM CA: 269 TRP: 17 CHAIN 0'
'CENTERDISTANCE ATOM CA: 269 TRP: 17 DIST ATOM C: 289 TRP: 17 CHAIN 0'
'CENTERDISTANCE ATOM C: 289 TRP: 17 DIST ATOM N: 291 SER: 18 CHAIN 0'
'CENTERDISTANCE ATOM N: 291 SER: 18 DIST ATOM CA: 293 SER: 18 CHAIN 0'
'CENTERDISTANCE ATOM CA: 293 SER: 18 DIST ATOM C: 300 SER: 18 CHAIN 0'
'CENTERDISTANCE ATOM C: 300 SER: 18 DIST ATOM N: 302 GLU: 19 CHAIN 0'
'CENTERDISTANCE ATOM N: 302 GLU: 19 DIST ATOM CA: 304 GLU: 19 CHAIN 0'
'CENTERDISTANCE ATOM CA: 304 GLU: 19 DIST ATOM C: 315 GLU: 19 CHAIN 0'
'CENTERDISTANCE ATOM C: 315 GLU: 19 DIST ATOM N: 317 GLY: 20 CHAIN 0'
'CENTERDISTANCE ATOM N: 317 GLY: 20 DIST ATOM CA: 319 GLY: 20 CHAIN 0'
'CENTERDISTANCE ATOM CA: 319 GLY: 20 DIST ATOM C: 322 GLY: 20 CHAIN 0'
'CENTERDISTANCE ATOM C: 322 GLY: 20 DIST ATOM N: 324 THR: 21 CHAIN 0'
'CENTERDISTANCE ATOM N: 324 THR: 21 DIST ATOM CA: 326 THR: 21 CHAIN 0'
'CENTERDISTANCE ATOM CA: 326 THR: 21 DIST ATOM C: 336 THR: 21 CHAIN 0'
'CENTERDISTANCE ATOM C: 336 THR: 21 DIST ATOM N: 338 ASP: 22 CHAIN 0'
'CENTERDISTANCE ATOM N: 338 ASP: 22 DIST ATOM CA: 340 ASP: 22 CHAIN 0'
'CENTERDISTANCE ATOM CA: 340 ASP: 22 DIST ATOM C: 348 ASP: 22 CHAIN 0'
'CENTERDISTANCE ATOM C: 348 ASP: 22 DIST ATOM N: 350 ASN: 23 CHAIN 0'
'CENTERDISTANCE ATOM N: 350 ASN: 23 DIST ATOM CA: 352 ASN: 23 CHAIN 0'
'CENTERDISTANCE ATOM CA: 352 ASN: 23 DIST ATOM C: 362 ASN: 23 CHAIN 0'
'CENTERDISTANCE ATOM C: 362 ASN: 23 DIST ATOM N: 364 GLY: 24 CHAIN 0'
'CENTERDISTANCE ATOM N: 364 GLY: 24 DIST ATOM CA: 366 GLY: 24 CHAIN 0'
'CENTERDISTANCE ATOM CA: 366 GLY: 24 DIST ATOM C: 369 GLY: 24 CHAIN 0'
'CENTERDISTANCE ATOM C: 369 GLY: 24 DIST ATOM N: 371 ARG: 25 CHAIN 0'
'CENTERDISTANCE ATOM N: 371 ARG: 25 DIST ATOM CA: 373 ARG: 25 CHAIN 0'
'CENTERDISTANCE ATOM CA: 373 ARG: 25 DIST ATOM C: 393 ARG: 25 CHAIN 0'
'CENTERDISTANCE ATOM C: 393 ARG: 25 DIST ATOM N: 395 GLN: 26 CHAIN 0'
'CENTERDISTANCE ATOM N: 395 GLN: 26 DIST ATOM CA: 397 GLN: 26 CHAIN 0'
'CENTERDISTANCE ATOM CA: 397 GLN: 26 DIST ATOM C: 410 GLN: 26 CHAIN 0'
'CENTERDISTANCE ATOM C: 410 GLN: 26 DIST ATOM N: 412 LYS: 27 CHAIN 0'
'CENTERDISTANCE ATOM N: 412 LYS: 27 DIST ATOM CA: 414 LYS: 27 CHAIN 0'
'CENTERDISTANCE ATOM CA: 414 LYS: 27 DIST ATOM C: 432 LYS: 27 CHAIN 0'
'CENTERDISTANCE ATOM C: 432 LYS: 27 DIST ATOM N: 434 THR: 28 CHAIN 0'
'CENTERDISTANCE ATOM N: 434 THR: 28 DIST ATOM CA: 436 THR: 28 CHAIN 0'
'CENTERDISTANCE ATOM CA: 436 THR: 28 DIST ATOM C: 446 THR: 28 CHAIN 0'
'CENTERDISTANCE ATOM C: 446 THR: 28 DIST ATOM N: 448 ARG: 29 CHAIN 0'
'CENTERDISTANCE ATOM N: 448 ARG: 29 DIST ATOM CA: 450 ARG: 29 CHAIN 0'
'CENTERDISTANCE ATOM CA: 450 ARG: 29 DIST ATOM C: 470 ARG: 29 CHAIN 0'
'CENTERDISTANCE ATOM C: 470 ARG: 29 DIST ATOM N: 472 ASN: 30 CHAIN 0'
'CENTERDISTANCE ATOM N: 472 ASN: 30 DIST ATOM CA: 474 ASN: 30 CHAIN 0'
'CENTERDISTANCE ATOM CA: 474 ASN: 30 DIST ATOM C: 484 ASN: 30 CHAIN 0'
'CENTERDISTANCE ATOM C: 484 ASN: 30 DIST ATOM N: 486 HIS: 31 CHAIN 0'
'CENTERDISTANCE ATOM N: 486 HIS: 31 DIST ATOM CA: 488 HIS: 31 CHAIN 0'
'CENTERDISTANCE ATOM CA: 488 HIS: 31 DIST ATOM C: 501 HIS: 31 CHAIN 0'
'CENTERDISTANCE ATOM C: 501 HIS: 31 DIST ATOM N: 503 GLY: 32 CHAIN 0'
'CENTERDISTANCE ATOM N: 503 GLY: 32 DIST ATOM CA: 505 GLY: 32 CHAIN 0'
'CENTERDISTANCE ATOM CA: 505 GLY: 32 DIST ATOM C: 508 GLY: 32 CHAIN 0'
'CENTERDISTANCE ATOM C: 508 GLY: 32 DIST ATOM N: 510 TYR: 33 CHAIN 0'
'CENTERDISTANCE ATOM N: 510 TYR: 33 DIST ATOM CA: 512 TYR: 33 CHAIN 0'
'CENTERDISTANCE ATOM CA: 512 TYR: 33 DIST ATOM C: 529 TYR: 33 CHAIN 0'
'CENTERDISTANCE ATOM C: 529 TYR: 33 DIST ATOM N: 531 ASP: 34 CHAIN 0'
'CENTERDISTANCE ATOM N: 531 ASP: 34 DIST ATOM CA: 533 ASP: 34 CHAIN 0'
'CENTERDISTANCE ATOM CA: 533 ASP: 34 DIST ATOM C: 541 ASP: 34 CHAIN 0'
'CENTERDISTANCE ATOM C: 541 ASP: 34 DIST ATOM N: 543 VAL: 35 CHAIN 0'
'CENTERDISTANCE ATOM N: 543 VAL: 35 DIST ATOM CA: 545 VAL: 35 CHAIN 0'
'CENTERDISTANCE ATOM CA: 545 VAL: 35 DIST ATOM C: 557 VAL: 35 CHAIN 0'
'CENTERDISTANCE ATOM C: 557 VAL: 35 DIST ATOM N: 559 ILE: 36 CHAIN 0'
'CENTERDISTANCE ATOM N: 559 ILE: 36 DIST ATOM CA: 561 ILE: 36 CHAIN 0'
'CENTERDISTANCE ATOM CA: 561 ILE: 36 DIST ATOM C: 576 ILE: 36 CHAIN 0'
'CENTERDISTANCE ATOM C: 576 ILE: 36 DIST ATOM N: 578 VAL: 37 CHAIN 0'
'CENTERDISTANCE ATOM N: 578 VAL: 37 DIST ATOM CA: 580 VAL: 37 CHAIN 0'
'CENTERDISTANCE ATOM CA: 580 VAL: 37 DIST ATOM C: 592 VAL: 37 CHAIN 0'
'CENTERDISTANCE ATOM C: 592 VAL: 37 DIST ATOM N: 594 GLY: 38 CHAIN 0'
'CENTERDISTANCE ATOM N: 594 GLY: 38 DIST ATOM CA: 596 GLY: 38 CHAIN 0'
'CENTERDISTANCE ATOM CA: 596 GLY: 38 DIST ATOM C: 599 GLY: 38 CHAIN 0'
'CENTERDISTANCE ATOM C: 599 GLY: 38 DIST ATOM N: 601 GLY: 39 CHAIN 0'
'CENTERDISTANCE ATOM N: 601 GLY: 39 DIST ATOM CA: 603 GLY: 39 CHAIN 0'
'CENTERDISTANCE ATOM CA: 603 GLY: 39 DIST ATOM C: 606 GLY: 39 CHAIN 0'
'CENTERDISTANCE ATOM C: 606 GLY: 39 DIST ATOM N: 608 GLU: 40 CHAIN 0'
'CENTERDISTANCE ATOM N: 608 GLU: 40 DIST ATOM CA: 610 GLU: 40 CHAIN 0'
'CENTERDISTANCE ATOM CA: 610 GLU: 40 DIST ATOM C: 621 GLU: 40 CHAIN 0'
'CENTERDISTANCE ATOM C: 621 GLU: 40 DIST ATOM N: 623 LEU: 41 CHAIN 0'
'CENTERDISTANCE ATOM N: 623 LEU: 41 DIST ATOM CA: 625 LEU: 41 CHAIN 0'
'CENTERDISTANCE ATOM CA: 625 LEU: 41 DIST ATOM C: 640 LEU: 41 CHAIN 0'
'CENTERDISTANCE ATOM C: 640 LEU: 41 DIST ATOM N: 642 PHE: 42 CHAIN 0'
'CENTERDISTANCE ATOM N: 642 PHE: 42 DIST ATOM CA: 644 PHE: 42 CHAIN 0'
'CENTERDISTANCE ATOM CA: 644 PHE: 42 DIST ATOM C: 660 PHE: 42 CHAIN 0'
'CENTERDISTANCE ATOM C: 660 PHE: 42 DIST ATOM N: 662 THR: 43 CHAIN 0'
'CENTERDISTANCE ATOM N: 662 THR: 43 DIST ATOM CA: 664 THR: 43 CHAIN 0'
'CENTERDISTANCE ATOM CA: 664 THR: 43 DIST ATOM C: 674 THR: 43 CHAIN 0'
'CENTERDISTANCE ATOM C: 674 THR: 43 DIST ATOM N: 676 ASP: 44 CHAIN 0'
'CENTERDISTANCE ATOM N: 676 ASP: 44 DIST ATOM CA: 678 ASP: 44 CHAIN 0'
'CENTERDISTANCE ATOM CA: 678 ASP: 44 DIST ATOM C: 686 ASP: 44 CHAIN 0'
'CENTERDISTANCE ATOM C: 686 ASP: 44 DIST ATOM N: 688 TYR: 45 CHAIN 0'
'CENTERDISTANCE ATOM N: 688 TYR: 45 DIST ATOM CA: 690 TYR: 45 CHAIN 0'
'CENTERDISTANCE ATOM CA: 690 TYR: 45 DIST ATOM C: 707 TYR: 45 CHAIN 0'
'CENTERDISTANCE ATOM C: 707 TYR: 45 DIST ATOM N: 709 SER: 46 CHAIN 0'
'CENTERDISTANCE ATOM N: 709 SER: 46 DIST ATOM CA: 711 SER: 46 CHAIN 0'
'CENTERDISTANCE ATOM CA: 711 SER: 46 DIST ATOM C: 718 SER: 46 CHAIN 0'
'CENTERDISTANCE ATOM C: 718 SER: 46 DIST ATOM N: 720 ASP: 47 CHAIN 0'
'CENTERDISTANCE ATOM N: 720 ASP: 47 DIST ATOM CA: 722 ASP: 47 CHAIN 0'
'CENTERDISTANCE ATOM CA: 722 ASP: 47 DIST ATOM C: 730 ASP: 47 CHAIN 0'
'CENTERDISTANCE ATOM C: 730 ASP: 47 DIST ATOM N: 732 HIS: 48 CHAIN 0'
'CENTERDISTANCE ATOM N: 732 HIS: 48 DIST ATOM CA: 734 HIS: 48 CHAIN 0'
'CENTERDISTANCE ATOM CA: 734 HIS: 48 DIST ATOM C: 747 HIS: 48 CHAIN 0'
'CENTERDISTANCE ATOM C: 747 HIS: 48 DIST ATOM N: 749 PRO: 49 CHAIN 0'
'CENTERDISTANCE ATOM N: 749 PRO: 49 DIST ATOM CA: 759 PRO: 49 CHAIN 0'
'CENTERDISTANCE ATOM CA: 759 PRO: 49 DIST ATOM C: 761 PRO: 49 CHAIN 0'
'CENTERDISTANCE ATOM C: 761 PRO: 49 DIST ATOM N: 763 ARG: 50 CHAIN 0'
'CENTERDISTANCE ATOM N: 763 ARG: 50 DIST ATOM CA: 765 ARG: 50 CHAIN 0'
'CENTERDISTANCE ATOM CA: 765 ARG: 50 DIST ATOM C: 785 ARG: 50 CHAIN 0'
'CENTERDISTANCE ATOM C: 785 ARG: 50 DIST ATOM N: 787 LYS: 51 CHAIN 0'
'CENTERDISTANCE ATOM N: 787 LYS: 51 DIST ATOM CA: 789 LYS: 51 CHAIN 0'
'CENTERDISTANCE ATOM CA: 789 LYS: 51 DIST ATOM C: 807 LYS: 51 CHAIN 0'
'CENTERDISTANCE ATOM C: 807 LYS: 51 DIST ATOM N: 809 LEU: 52 CHAIN 0'
'CENTERDISTANCE ATOM N: 809 LEU: 52 DIST ATOM CA: 811 LEU: 52 CHAIN 0'
'CENTERDISTANCE ATOM CA: 811 LEU: 52 DIST ATOM C: 826 LEU: 52 CHAIN 0'
'CENTERDISTANCE ATOM C: 826 LEU: 52 DIST ATOM N: 828 VAL: 53 CHAIN 0'
'CENTERDISTANCE ATOM N: 828 VAL: 53 DIST ATOM CA: 830 VAL: 53 CHAIN 0'
'CENTERDISTANCE ATOM CA: 830 VAL: 53 DIST ATOM C: 842 VAL: 53 CHAIN 0'
'CENTERDISTANCE ATOM C: 842 VAL: 53 DIST ATOM N: 844 THR: 54 CHAIN 0'
'CENTERDISTANCE ATOM N: 844 THR: 54 DIST ATOM CA: 846 THR: 54 CHAIN 0'
'CENTERDISTANCE ATOM CA: 846 THR: 54 DIST ATOM C: 856 THR: 54 CHAIN 0'
'CENTERDISTANCE ATOM C: 856 THR: 54 DIST ATOM N: 858 LEU: 55 CHAIN 0'
'CENTERDISTANCE ATOM N: 858 LEU: 55 DIST ATOM CA: 860 LEU: 55 CHAIN 0'
'CENTERDISTANCE ATOM CA: 860 LEU: 55 DIST ATOM C: 875 LEU: 55 CHAIN 0'
'CENTERDISTANCE ATOM C: 875 LEU: 55 DIST ATOM N: 877 ASN: 56 CHAIN 0'
'CENTERDISTANCE ATOM N: 877 ASN: 56 DIST ATOM CA: 879 ASN: 56 CHAIN 0'
'CENTERDISTANCE ATOM CA: 879 ASN: 56 DIST ATOM C: 889 ASN: 56 CHAIN 0'
'CENTERDISTANCE ATOM C: 889 ASN: 56 DIST ATOM N: 891 PRO: 57 CHAIN 0'
'CENTERDISTANCE ATOM N: 891 PRO: 57 DIST ATOM CA: 901 PRO: 57 CHAIN 0'
'CENTERDISTANCE ATOM CA: 901 PRO: 57 DIST ATOM C: 903 PRO: 57 CHAIN 0'
'CENTERDISTANCE ATOM C: 903 PRO: 57 DIST ATOM N: 905 LYS: 58 CHAIN 0'
'CENTERDISTANCE ATOM N: 905 LYS: 58 DIST ATOM CA: 907 LYS: 58 CHAIN 0'
'CENTERDISTANCE ATOM CA: 907 LYS: 58 DIST ATOM C: 925 LYS: 58 CHAIN 0'
'CENTERDISTANCE ATOM C: 925 LYS: 58 DIST ATOM N: 927 LEU: 59 CHAIN 0'
'CENTERDISTANCE ATOM N: 927 LEU: 59 DIST ATOM CA: 929 LEU: 59 CHAIN 0'
'CENTERDISTANCE ATOM CA: 929 LEU: 59 DIST ATOM C: 944 LEU: 59 CHAIN 0'
'CENTERDISTANCE ATOM C: 944 LEU: 59 DIST ATOM N: 946 LYS: 60 CHAIN 0'
'CENTERDISTANCE ATOM N: 946 LYS: 60 DIST ATOM CA: 948 LYS: 60 CHAIN 0'
'CENTERDISTANCE ATOM CA: 948 LYS: 60 DIST ATOM C: 966 LYS: 60 CHAIN 0'
'CENTERDISTANCE ATOM C: 966 LYS: 60 DIST ATOM N: 968 SER: 61 CHAIN 0'
'CENTERDISTANCE ATOM N: 968 SER: 61 DIST ATOM CA: 970 SER: 61 CHAIN 0'
'CENTERDISTANCE ATOM CA: 970 SER: 61 DIST ATOM C: 977 SER: 61 CHAIN 0'
'CENTERDISTANCE ATOM C: 977 SER: 61 DIST ATOM N: 979 THR: 62 CHAIN 0'
'CENTERDISTANCE ATOM N: 979 THR: 62 DIST ATOM CA: 981 THR: 62 CHAIN 0'
'CENTERDISTANCE ATOM CA: 981 THR: 62 DIST ATOM C: 991 THR: 62 CHAIN 0'
'CENTERDISTANCE ATOM C: 991 THR: 62 DIST ATOM N: 993 GLY: 63 CHAIN 0'
'CENTERDISTANCE ATOM N: 993 GLY: 63 DIST ATOM CA: 995 GLY: 63 CHAIN 0'
'CENTERDISTANCE ATOM CA: 995 GLY: 63 DIST ATOM C: 998 GLY: 63 CHAIN 0'
'CENTERDISTANCE ATOM C: 998 GLY: 63 DIST ATOM N: 1000 ALA: 64 CHAIN 0'
'CENTERDISTANCE ATOM N: 1000 ALA: 64 DIST ATOM CA: 1002 ALA: 64 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1002 ALA: 64 DIST ATOM C: 1008 ALA: 64 CHAIN 0'
'CENTERDISTANCE ATOM C: 1008 ALA: 64 DIST ATOM N: 1010 GLY: 65 CHAIN 0'
'CENTERDISTANCE ATOM N: 1010 GLY: 65 DIST ATOM CA: 1012 GLY: 65 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1012 GLY: 65 DIST ATOM C: 1015 GLY: 65 CHAIN 0'
'CENTERDISTANCE ATOM C: 1015 GLY: 65 DIST ATOM N: 1017 ARG: 66 CHAIN 0'
'CENTERDISTANCE ATOM N: 1017 ARG: 66 DIST ATOM CA: 1019 ARG: 66 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1019 ARG: 66 DIST ATOM C: 1039 ARG: 66 CHAIN 0'
'CENTERDISTANCE ATOM C: 1039 ARG: 66 DIST ATOM N: 1041 TYR: 67 CHAIN 0'
'CENTERDISTANCE ATOM N: 1041 TYR: 67 DIST ATOM CA: 1043 TYR: 67 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1043 TYR: 67 DIST ATOM C: 1060 TYR: 67 CHAIN 0'
'CENTERDISTANCE ATOM C: 1060 TYR: 67 DIST ATOM N: 1062 GLN: 68 CHAIN 0'
'CENTERDISTANCE ATOM N: 1062 GLN: 68 DIST ATOM CA: 1064 GLN: 68 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1064 GLN: 68 DIST ATOM C: 1077 GLN: 68 CHAIN 0'
'CENTERDISTANCE ATOM C: 1077 GLN: 68 DIST ATOM N: 1079 LEU: 69 CHAIN 0'
'CENTERDISTANCE ATOM N: 1079 LEU: 69 DIST ATOM CA: 1081 LEU: 69 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1081 LEU: 69 DIST ATOM C: 1096 LEU: 69 CHAIN 0'
'CENTERDISTANCE ATOM C: 1096 LEU: 69 DIST ATOM N: 1098 LEU: 70 CHAIN 0'
'CENTERDISTANCE ATOM N: 1098 LEU: 70 DIST ATOM CA: 1100 LEU: 70 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1100 LEU: 70 DIST ATOM C: 1115 LEU: 70 CHAIN 0'
'CENTERDISTANCE ATOM C: 1115 LEU: 70 DIST ATOM N: 1117 SER: 71 CHAIN 0'
'CENTERDISTANCE ATOM N: 1117 SER: 71 DIST ATOM CA: 1119 SER: 71 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1119 SER: 71 DIST ATOM C: 1126 SER: 71 CHAIN 0'
'CENTERDISTANCE ATOM C: 1126 SER: 71 DIST ATOM N: 1128 ARG: 72 CHAIN 0'
'CENTERDISTANCE ATOM N: 1128 ARG: 72 DIST ATOM CA: 1130 ARG: 72 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1130 ARG: 72 DIST ATOM C: 1150 ARG: 72 CHAIN 0'
'CENTERDISTANCE ATOM C: 1150 ARG: 72 DIST ATOM N: 1152 TRP: 73 CHAIN 0'
'CENTERDISTANCE ATOM N: 1152 TRP: 73 DIST ATOM CA: 1154 TRP: 73 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1154 TRP: 73 DIST ATOM C: 1174 TRP: 73 CHAIN 0'
'CENTERDISTANCE ATOM C: 1174 TRP: 73 DIST ATOM N: 1176 TRP: 74 CHAIN 0'
'CENTERDISTANCE ATOM N: 1176 TRP: 74 DIST ATOM CA: 1178 TRP: 74 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1178 TRP: 74 DIST ATOM C: 1198 TRP: 74 CHAIN 0'
'CENTERDISTANCE ATOM C: 1198 TRP: 74 DIST ATOM N: 1200 ASP: 75 CHAIN 0'
'CENTERDISTANCE ATOM N: 1200 ASP: 75 DIST ATOM CA: 1202 ASP: 75 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1202 ASP: 75 DIST ATOM C: 1210 ASP: 75 CHAIN 0'
'CENTERDISTANCE ATOM C: 1210 ASP: 75 DIST ATOM N: 1212 ALA: 76 CHAIN 0'
'CENTERDISTANCE ATOM N: 1212 ALA: 76 DIST ATOM CA: 1214 ALA: 76 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1214 ALA: 76 DIST ATOM C: 1220 ALA: 76 CHAIN 0'
'CENTERDISTANCE ATOM C: 1220 ALA: 76 DIST ATOM N: 1222 TYR: 77 CHAIN 0'
'CENTERDISTANCE ATOM N: 1222 TYR: 77 DIST ATOM CA: 1224 TYR: 77 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1224 TYR: 77 DIST ATOM C: 1241 TYR: 77 CHAIN 0'
'CENTERDISTANCE ATOM C: 1241 TYR: 77 DIST ATOM N: 1243 ARG: 78 CHAIN 0'
'CENTERDISTANCE ATOM N: 1243 ARG: 78 DIST ATOM CA: 1245 ARG: 78 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1245 ARG: 78 DIST ATOM C: 1265 ARG: 78 CHAIN 0'
'CENTERDISTANCE ATOM C: 1265 ARG: 78 DIST ATOM N: 1267 LYS: 79 CHAIN 0'
'CENTERDISTANCE ATOM N: 1267 LYS: 79 DIST ATOM CA: 1269 LYS: 79 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1269 LYS: 79 DIST ATOM C: 1287 LYS: 79 CHAIN 0'
'CENTERDISTANCE ATOM C: 1287 LYS: 79 DIST ATOM N: 1289 GLN: 80 CHAIN 0'
'CENTERDISTANCE ATOM N: 1289 GLN: 80 DIST ATOM CA: 1291 GLN: 80 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1291 GLN: 80 DIST ATOM C: 1304 GLN: 80 CHAIN 0'
'CENTERDISTANCE ATOM C: 1304 GLN: 80 DIST ATOM N: 1306 LEU: 81 CHAIN 0'
'CENTERDISTANCE ATOM N: 1306 LEU: 81 DIST ATOM CA: 1308 LEU: 81 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1308 LEU: 81 DIST ATOM C: 1323 LEU: 81 CHAIN 0'
'CENTERDISTANCE ATOM C: 1323 LEU: 81 DIST ATOM N: 1325 GLY: 82 CHAIN 0'
'CENTERDISTANCE ATOM N: 1325 GLY: 82 DIST ATOM CA: 1327 GLY: 82 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1327 GLY: 82 DIST ATOM C: 1330 GLY: 82 CHAIN 0'
'CENTERDISTANCE ATOM C: 1330 GLY: 82 DIST ATOM N: 1332 LEU: 83 CHAIN 0'
'CENTERDISTANCE ATOM N: 1332 LEU: 83 DIST ATOM CA: 1334 LEU: 83 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1334 LEU: 83 DIST ATOM C: 1349 LEU: 83 CHAIN 0'
'CENTERDISTANCE ATOM C: 1349 LEU: 83 DIST ATOM N: 1351 LYS: 84 CHAIN 0'
'CENTERDISTANCE ATOM N: 1351 LYS: 84 DIST ATOM CA: 1353 LYS: 84 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1353 LYS: 84 DIST ATOM C: 1371 LYS: 84 CHAIN 0'
'CENTERDISTANCE ATOM C: 1371 LYS: 84 DIST ATOM N: 1373 ASP: 85 CHAIN 0'
'CENTERDISTANCE ATOM N: 1373 ASP: 85 DIST ATOM CA: 1375 ASP: 85 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1375 ASP: 85 DIST ATOM C: 1383 ASP: 85 CHAIN 0'
'CENTERDISTANCE ATOM C: 1383 ASP: 85 DIST ATOM N: 1385 PHE: 86 CHAIN 0'
'CENTERDISTANCE ATOM N: 1385 PHE: 86 DIST ATOM CA: 1387 PHE: 86 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1387 PHE: 86 DIST ATOM C: 1403 PHE: 86 CHAIN 0'
'CENTERDISTANCE ATOM C: 1403 PHE: 86 DIST ATOM N: 1405 SER: 87 CHAIN 0'
'CENTERDISTANCE ATOM N: 1405 SER: 87 DIST ATOM CA: 1407 SER: 87 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1407 SER: 87 DIST ATOM C: 1414 SER: 87 CHAIN 0'
'CENTERDISTANCE ATOM C: 1414 SER: 87 DIST ATOM N: 1416 PRO: 88 CHAIN 0'
'CENTERDISTANCE ATOM N: 1416 PRO: 88 DIST ATOM CA: 1426 PRO: 88 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1426 PRO: 88 DIST ATOM C: 1428 PRO: 88 CHAIN 0'
'CENTERDISTANCE ATOM C: 1428 PRO: 88 DIST ATOM N: 1430 LYS: 89 CHAIN 0'
'CENTERDISTANCE ATOM N: 1430 LYS: 89 DIST ATOM CA: 1432 LYS: 89 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1432 LYS: 89 DIST ATOM C: 1450 LYS: 89 CHAIN 0'
'CENTERDISTANCE ATOM C: 1450 LYS: 89 DIST ATOM N: 1452 SER: 90 CHAIN 0'
'CENTERDISTANCE ATOM N: 1452 SER: 90 DIST ATOM CA: 1454 SER: 90 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1454 SER: 90 DIST ATOM C: 1461 SER: 90 CHAIN 0'
'CENTERDISTANCE ATOM C: 1461 SER: 90 DIST ATOM N: 1463 GLN: 91 CHAIN 0'
'CENTERDISTANCE ATOM N: 1463 GLN: 91 DIST ATOM CA: 1465 GLN: 91 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1465 GLN: 91 DIST ATOM C: 1478 GLN: 91 CHAIN 0'
'CENTERDISTANCE ATOM C: 1478 GLN: 91 DIST ATOM N: 1480 ASP: 92 CHAIN 0'
'CENTERDISTANCE ATOM N: 1480 ASP: 92 DIST ATOM CA: 1482 ASP: 92 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1482 ASP: 92 DIST ATOM C: 1490 ASP: 92 CHAIN 0'
'CENTERDISTANCE ATOM C: 1490 ASP: 92 DIST ATOM N: 1492 ALA: 93 CHAIN 0'
'CENTERDISTANCE ATOM N: 1492 ALA: 93 DIST ATOM CA: 1494 ALA: 93 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1494 ALA: 93 DIST ATOM C: 1500 ALA: 93 CHAIN 0'
'CENTERDISTANCE ATOM C: 1500 ALA: 93 DIST ATOM N: 1502 VAL: 94 CHAIN 0'
'CENTERDISTANCE ATOM N: 1502 VAL: 94 DIST ATOM CA: 1504 VAL: 94 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1504 VAL: 94 DIST ATOM C: 1516 VAL: 94 CHAIN 0'
'CENTERDISTANCE ATOM C: 1516 VAL: 94 DIST ATOM N: 1518 ALA: 95 CHAIN 0'
'CENTERDISTANCE ATOM N: 1518 ALA: 95 DIST ATOM CA: 1520 ALA: 95 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1520 ALA: 95 DIST ATOM C: 1526 ALA: 95 CHAIN 0'
'CENTERDISTANCE ATOM C: 1526 ALA: 95 DIST ATOM N: 1528 LEU: 96 CHAIN 0'
'CENTERDISTANCE ATOM N: 1528 LEU: 96 DIST ATOM CA: 1530 LEU: 96 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1530 LEU: 96 DIST ATOM C: 1545 LEU: 96 CHAIN 0'
'CENTERDISTANCE ATOM C: 1545 LEU: 96 DIST ATOM N: 1547 GLN: 97 CHAIN 0'
'CENTERDISTANCE ATOM N: 1547 GLN: 97 DIST ATOM CA: 1549 GLN: 97 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1549 GLN: 97 DIST ATOM C: 1562 GLN: 97 CHAIN 0'
'CENTERDISTANCE ATOM C: 1562 GLN: 97 DIST ATOM N: 1564 GLN: 98 CHAIN 0'
'CENTERDISTANCE ATOM N: 1564 GLN: 98 DIST ATOM CA: 1566 GLN: 98 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1566 GLN: 98 DIST ATOM C: 1579 GLN: 98 CHAIN 0'
'CENTERDISTANCE ATOM C: 1579 GLN: 98 DIST ATOM N: 1581 ILE: 99 CHAIN 0'
'CENTERDISTANCE ATOM N: 1581 ILE: 99 DIST ATOM CA: 1583 ILE: 99 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1583 ILE: 99 DIST ATOM C: 1598 ILE: 99 CHAIN 0'
'CENTERDISTANCE ATOM C: 1598 ILE: 99 DIST ATOM N: 1600 LYS: 100 CHAIN 0'
'CENTERDISTANCE ATOM N: 1600 LYS: 100 DIST ATOM CA: 1602 LYS: 100 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1602 LYS: 100 DIST ATOM C: 1620 LYS: 100 CHAIN 0'
'CENTERDISTANCE ATOM C: 1620 LYS: 100 DIST ATOM N: 1622 GLU: 101 CHAIN 0'
'CENTERDISTANCE ATOM N: 1622 GLU: 101 DIST ATOM CA: 1624 GLU: 101 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1624 GLU: 101 DIST ATOM C: 1635 GLU: 101 CHAIN 0'
'CENTERDISTANCE ATOM C: 1635 GLU: 101 DIST ATOM N: 1637 ARG: 102 CHAIN 0'
'CENTERDISTANCE ATOM N: 1637 ARG: 102 DIST ATOM CA: 1639 ARG: 102 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1639 ARG: 102 DIST ATOM C: 1659 ARG: 102 CHAIN 0'
'CENTERDISTANCE ATOM C: 1659 ARG: 102 DIST ATOM N: 1661 GLY: 103 CHAIN 0'
'CENTERDISTANCE ATOM N: 1661 GLY: 103 DIST ATOM CA: 1663 GLY: 103 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1663 GLY: 103 DIST ATOM C: 1666 GLY: 103 CHAIN 0'
'CENTERDISTANCE ATOM C: 1666 GLY: 103 DIST ATOM N: 1668 ALA: 104 CHAIN 0'
'CENTERDISTANCE ATOM N: 1668 ALA: 104 DIST ATOM CA: 1670 ALA: 104 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1670 ALA: 104 DIST ATOM C: 1676 ALA: 104 CHAIN 0'
'CENTERDISTANCE ATOM C: 1676 ALA: 104 DIST ATOM N: 1678 LEU: 105 CHAIN 0'
'CENTERDISTANCE ATOM N: 1678 LEU: 105 DIST ATOM CA: 1680 LEU: 105 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1680 LEU: 105 DIST ATOM C: 1695 LEU: 105 CHAIN 0'
'CENTERDISTANCE ATOM C: 1695 LEU: 105 DIST ATOM N: 1697 PRO: 106 CHAIN 0'
'CENTERDISTANCE ATOM N: 1697 PRO: 106 DIST ATOM CA: 1707 PRO: 106 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1707 PRO: 106 DIST ATOM C: 1709 PRO: 106 CHAIN 0'
'CENTERDISTANCE ATOM C: 1709 PRO: 106 DIST ATOM N: 1711 MET: 107 CHAIN 0'
'CENTERDISTANCE ATOM N: 1711 MET: 107 DIST ATOM CA: 1713 MET: 107 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1713 MET: 107 DIST ATOM C: 1726 MET: 107 CHAIN 0'
'CENTERDISTANCE ATOM C: 1726 MET: 107 DIST ATOM N: 1728 ILE: 108 CHAIN 0'
'CENTERDISTANCE ATOM N: 1728 ILE: 108 DIST ATOM CA: 1730 ILE: 108 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1730 ILE: 108 DIST ATOM C: 1745 ILE: 108 CHAIN 0'
'CENTERDISTANCE ATOM C: 1745 ILE: 108 DIST ATOM N: 1747 ASP: 109 CHAIN 0'
'CENTERDISTANCE ATOM N: 1747 ASP: 109 DIST ATOM CA: 1749 ASP: 109 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1749 ASP: 109 DIST ATOM C: 1757 ASP: 109 CHAIN 0'
'CENTERDISTANCE ATOM C: 1757 ASP: 109 DIST ATOM N: 1759 ARG: 110 CHAIN 0'
'CENTERDISTANCE ATOM N: 1759 ARG: 110 DIST ATOM CA: 1761 ARG: 110 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1761 ARG: 110 DIST ATOM C: 1781 ARG: 110 CHAIN 0'
'CENTERDISTANCE ATOM C: 1781 ARG: 110 DIST ATOM N: 1783 GLY: 111 CHAIN 0'
'CENTERDISTANCE ATOM N: 1783 GLY: 111 DIST ATOM CA: 1785 GLY: 111 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1785 GLY: 111 DIST ATOM C: 1788 GLY: 111 CHAIN 0'
'CENTERDISTANCE ATOM C: 1788 GLY: 111 DIST ATOM N: 1790 ASP: 112 CHAIN 0'
'CENTERDISTANCE ATOM N: 1790 ASP: 112 DIST ATOM CA: 1792 ASP: 112 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1792 ASP: 112 DIST ATOM C: 1800 ASP: 112 CHAIN 0'
'CENTERDISTANCE ATOM C: 1800 ASP: 112 DIST ATOM N: 1802 ILE: 113 CHAIN 0'
'CENTERDISTANCE ATOM N: 1802 ILE: 113 DIST ATOM CA: 1804 ILE: 113 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1804 ILE: 113 DIST ATOM C: 1819 ILE: 113 CHAIN 0'
'CENTERDISTANCE ATOM C: 1819 ILE: 113 DIST ATOM N: 1821 ARG: 114 CHAIN 0'
'CENTERDISTANCE ATOM N: 1821 ARG: 114 DIST ATOM CA: 1823 ARG: 114 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1823 ARG: 114 DIST ATOM C: 1843 ARG: 114 CHAIN 0'
'CENTERDISTANCE ATOM C: 1843 ARG: 114 DIST ATOM N: 1845 GLN: 115 CHAIN 0'
'CENTERDISTANCE ATOM N: 1845 GLN: 115 DIST ATOM CA: 1847 GLN: 115 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1847 GLN: 115 DIST ATOM C: 1860 GLN: 115 CHAIN 0'
'CENTERDISTANCE ATOM C: 1860 GLN: 115 DIST ATOM N: 1862 ALA: 116 CHAIN 0'
'CENTERDISTANCE ATOM N: 1862 ALA: 116 DIST ATOM CA: 1864 ALA: 116 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1864 ALA: 116 DIST ATOM C: 1870 ALA: 116 CHAIN 0'
'CENTERDISTANCE ATOM C: 1870 ALA: 116 DIST ATOM N: 1872 ILE: 117 CHAIN 0'
'CENTERDISTANCE ATOM N: 1872 ILE: 117 DIST ATOM CA: 1874 ILE: 117 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1874 ILE: 117 DIST ATOM C: 1889 ILE: 117 CHAIN 0'
'CENTERDISTANCE ATOM C: 1889 ILE: 117 DIST ATOM N: 1891 ASP: 118 CHAIN 0'
'CENTERDISTANCE ATOM N: 1891 ASP: 118 DIST ATOM CA: 1893 ASP: 118 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1893 ASP: 118 DIST ATOM C: 1901 ASP: 118 CHAIN 0'
'CENTERDISTANCE ATOM C: 1901 ASP: 118 DIST ATOM N: 1903 ARG: 119 CHAIN 0'
'CENTERDISTANCE ATOM N: 1903 ARG: 119 DIST ATOM CA: 1905 ARG: 119 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1905 ARG: 119 DIST ATOM C: 1925 ARG: 119 CHAIN 0'
'CENTERDISTANCE ATOM C: 1925 ARG: 119 DIST ATOM N: 1927 CYS: 120 CHAIN 0'
'CENTERDISTANCE ATOM N: 1927 CYS: 120 DIST ATOM CA: 1929 CYS: 120 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1929 CYS: 120 DIST ATOM C: 1936 CYS: 120 CHAIN 0'
'CENTERDISTANCE ATOM C: 1936 CYS: 120 DIST ATOM N: 1938 SER: 121 CHAIN 0'
'CENTERDISTANCE ATOM N: 1938 SER: 121 DIST ATOM CA: 1940 SER: 121 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1940 SER: 121 DIST ATOM C: 1947 SER: 121 CHAIN 0'
'CENTERDISTANCE ATOM C: 1947 SER: 121 DIST ATOM N: 1949 ASN: 122 CHAIN 0'
'CENTERDISTANCE ATOM N: 1949 ASN: 122 DIST ATOM CA: 1951 ASN: 122 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1951 ASN: 122 DIST ATOM C: 1961 ASN: 122 CHAIN 0'
'CENTERDISTANCE ATOM C: 1961 ASN: 122 DIST ATOM N: 1963 ILE: 123 CHAIN 0'
'CENTERDISTANCE ATOM N: 1963 ILE: 123 DIST ATOM CA: 1965 ILE: 123 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1965 ILE: 123 DIST ATOM C: 1980 ILE: 123 CHAIN 0'
'CENTERDISTANCE ATOM C: 1980 ILE: 123 DIST ATOM N: 1982 TRP: 124 CHAIN 0'
'CENTERDISTANCE ATOM N: 1982 TRP: 124 DIST ATOM CA: 1984 TRP: 124 CHAIN 0'
'CENTERDISTANCE ATOM CA: 1984 TRP: 124 DIST ATOM C: 2004 TRP: 124 CHAIN 0'
'CENTERDISTANCE ATOM C: 2004 TRP: 124 DIST ATOM N: 2006 ALA: 125 CHAIN 0'
'CENTERDISTANCE ATOM N: 2006 ALA: 125 DIST ATOM CA: 2008 ALA: 125 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2008 ALA: 125 DIST ATOM C: 2014 ALA: 125 CHAIN 0'
'CENTERDISTANCE ATOM C: 2014 ALA: 125 DIST ATOM N: 2016 SER: 126 CHAIN 0'
'CENTERDISTANCE ATOM N: 2016 SER: 126 DIST ATOM CA: 2018 SER: 126 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2018 SER: 126 DIST ATOM C: 2025 SER: 126 CHAIN 0'
'CENTERDISTANCE ATOM C: 2025 SER: 126 DIST ATOM N: 2027 LEU: 127 CHAIN 0'
'CENTERDISTANCE ATOM N: 2027 LEU: 127 DIST ATOM CA: 2029 LEU: 127 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2029 LEU: 127 DIST ATOM C: 2044 LEU: 127 CHAIN 0'
'CENTERDISTANCE ATOM C: 2044 LEU: 127 DIST ATOM N: 2046 PRO: 128 CHAIN 0'
'CENTERDISTANCE ATOM N: 2046 PRO: 128 DIST ATOM CA: 2056 PRO: 128 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2056 PRO: 128 DIST ATOM C: 2058 PRO: 128 CHAIN 0'
'CENTERDISTANCE ATOM C: 2058 PRO: 128 DIST ATOM N: 2060 GLY: 129 CHAIN 0'
'CENTERDISTANCE ATOM N: 2060 GLY: 129 DIST ATOM CA: 2062 GLY: 129 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2062 GLY: 129 DIST ATOM C: 2065 GLY: 129 CHAIN 0'
'CENTERDISTANCE ATOM C: 2065 GLY: 129 DIST ATOM N: 2067 ALA: 130 CHAIN 0'
'CENTERDISTANCE ATOM N: 2067 ALA: 130 DIST ATOM CA: 2069 ALA: 130 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2069 ALA: 130 DIST ATOM C: 2075 ALA: 130 CHAIN 0'
'CENTERDISTANCE ATOM C: 2075 ALA: 130 DIST ATOM N: 2077 GLY: 131 CHAIN 0'
'CENTERDISTANCE ATOM N: 2077 GLY: 131 DIST ATOM CA: 2079 GLY: 131 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2079 GLY: 131 DIST ATOM C: 2082 GLY: 131 CHAIN 0'
'CENTERDISTANCE ATOM C: 2082 GLY: 131 DIST ATOM N: 2084 TYR: 132 CHAIN 0'
'CENTERDISTANCE ATOM N: 2084 TYR: 132 DIST ATOM CA: 2086 TYR: 132 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2086 TYR: 132 DIST ATOM C: 2103 TYR: 132 CHAIN 0'
'CENTERDISTANCE ATOM C: 2103 TYR: 132 DIST ATOM N: 2105 GLY: 133 CHAIN 0'
'CENTERDISTANCE ATOM N: 2105 GLY: 133 DIST ATOM CA: 2107 GLY: 133 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2107 GLY: 133 DIST ATOM C: 2110 GLY: 133 CHAIN 0'
'CENTERDISTANCE ATOM C: 2110 GLY: 133 DIST ATOM N: 2112 GLN: 134 CHAIN 0'
'CENTERDISTANCE ATOM N: 2112 GLN: 134 DIST ATOM CA: 2114 GLN: 134 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2114 GLN: 134 DIST ATOM C: 2127 GLN: 134 CHAIN 0'
'CENTERDISTANCE ATOM C: 2127 GLN: 134 DIST ATOM N: 2129 PHE: 135 CHAIN 0'
'CENTERDISTANCE ATOM N: 2129 PHE: 135 DIST ATOM CA: 2131 PHE: 135 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2131 PHE: 135 DIST ATOM C: 2147 PHE: 135 CHAIN 0'
'CENTERDISTANCE ATOM C: 2147 PHE: 135 DIST ATOM N: 2149 GLU: 136 CHAIN 0'
'CENTERDISTANCE ATOM N: 2149 GLU: 136 DIST ATOM CA: 2151 GLU: 136 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2151 GLU: 136 DIST ATOM C: 2162 GLU: 136 CHAIN 0'
'CENTERDISTANCE ATOM C: 2162 GLU: 136 DIST ATOM N: 2164 HIS: 137 CHAIN 0'
'CENTERDISTANCE ATOM N: 2164 HIS: 137 DIST ATOM CA: 2166 HIS: 137 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2166 HIS: 137 DIST ATOM C: 2179 HIS: 137 CHAIN 0'
'CENTERDISTANCE ATOM C: 2179 HIS: 137 DIST ATOM N: 2181 LYS: 138 CHAIN 0'
'CENTERDISTANCE ATOM N: 2181 LYS: 138 DIST ATOM CA: 2183 LYS: 138 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2183 LYS: 138 DIST ATOM C: 2201 LYS: 138 CHAIN 0'
'CENTERDISTANCE ATOM C: 2201 LYS: 138 DIST ATOM N: 2203 ALA: 139 CHAIN 0'
'CENTERDISTANCE ATOM N: 2203 ALA: 139 DIST ATOM CA: 2205 ALA: 139 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2205 ALA: 139 DIST ATOM C: 2211 ALA: 139 CHAIN 0'
'CENTERDISTANCE ATOM C: 2211 ALA: 139 DIST ATOM N: 2213 ASP: 140 CHAIN 0'
'CENTERDISTANCE ATOM N: 2213 ASP: 140 DIST ATOM CA: 2215 ASP: 140 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2215 ASP: 140 DIST ATOM C: 2223 ASP: 140 CHAIN 0'
'CENTERDISTANCE ATOM C: 2223 ASP: 140 DIST ATOM N: 2225 SER: 141 CHAIN 0'
'CENTERDISTANCE ATOM N: 2225 SER: 141 DIST ATOM CA: 2227 SER: 141 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2227 SER: 141 DIST ATOM C: 2234 SER: 141 CHAIN 0'
'CENTERDISTANCE ATOM C: 2234 SER: 141 DIST ATOM N: 2236 LEU: 142 CHAIN 0'
'CENTERDISTANCE ATOM N: 2236 LEU: 142 DIST ATOM CA: 2238 LEU: 142 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2238 LEU: 142 DIST ATOM C: 2253 LEU: 142 CHAIN 0'
'CENTERDISTANCE ATOM C: 2253 LEU: 142 DIST ATOM N: 2255 ILE: 143 CHAIN 0'
'CENTERDISTANCE ATOM N: 2255 ILE: 143 DIST ATOM CA: 2257 ILE: 143 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2257 ILE: 143 DIST ATOM C: 2272 ILE: 143 CHAIN 0'
'CENTERDISTANCE ATOM C: 2272 ILE: 143 DIST ATOM N: 2274 ALA: 144 CHAIN 0'
'CENTERDISTANCE ATOM N: 2274 ALA: 144 DIST ATOM CA: 2276 ALA: 144 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2276 ALA: 144 DIST ATOM C: 2282 ALA: 144 CHAIN 0'
'CENTERDISTANCE ATOM C: 2282 ALA: 144 DIST ATOM N: 2284 LYS: 145 CHAIN 0'
'CENTERDISTANCE ATOM N: 2284 LYS: 145 DIST ATOM CA: 2286 LYS: 145 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2286 LYS: 145 DIST ATOM C: 2304 LYS: 145 CHAIN 0'
'CENTERDISTANCE ATOM C: 2304 LYS: 145 DIST ATOM N: 2306 PHE: 146 CHAIN 0'
'CENTERDISTANCE ATOM N: 2306 PHE: 146 DIST ATOM CA: 2308 PHE: 146 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2308 PHE: 146 DIST ATOM C: 2324 PHE: 146 CHAIN 0'
'CENTERDISTANCE ATOM C: 2324 PHE: 146 DIST ATOM N: 2326 LYS: 147 CHAIN 0'
'CENTERDISTANCE ATOM N: 2326 LYS: 147 DIST ATOM CA: 2328 LYS: 147 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2328 LYS: 147 DIST ATOM C: 2346 LYS: 147 CHAIN 0'
'CENTERDISTANCE ATOM C: 2346 LYS: 147 DIST ATOM N: 2348 GLU: 148 CHAIN 0'
'CENTERDISTANCE ATOM N: 2348 GLU: 148 DIST ATOM CA: 2350 GLU: 148 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2350 GLU: 148 DIST ATOM C: 2361 GLU: 148 CHAIN 0'
'CENTERDISTANCE ATOM C: 2361 GLU: 148 DIST ATOM N: 2363 ALA: 149 CHAIN 0'
'CENTERDISTANCE ATOM N: 2363 ALA: 149 DIST ATOM CA: 2365 ALA: 149 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2365 ALA: 149 DIST ATOM C: 2371 ALA: 149 CHAIN 0'
'CENTERDISTANCE ATOM C: 2371 ALA: 149 DIST ATOM N: 2373 GLY: 150 CHAIN 0'
'CENTERDISTANCE ATOM N: 2373 GLY: 150 DIST ATOM CA: 2375 GLY: 150 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2375 GLY: 150 DIST ATOM C: 2378 GLY: 150 CHAIN 0'
'CENTERDISTANCE ATOM C: 2378 GLY: 150 DIST ATOM N: 2380 GLY: 151 CHAIN 0'
'CENTERDISTANCE ATOM N: 2380 GLY: 151 DIST ATOM CA: 2382 GLY: 151 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2382 GLY: 151 DIST ATOM C: 2385 GLY: 151 CHAIN 0'
'CENTERDISTANCE ATOM C: 2385 GLY: 151 DIST ATOM N: 2387 THR: 152 CHAIN 0'
'CENTERDISTANCE ATOM N: 2387 THR: 152 DIST ATOM CA: 2389 THR: 152 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2389 THR: 152 DIST ATOM C: 2399 THR: 152 CHAIN 0'
'CENTERDISTANCE ATOM C: 2399 THR: 152 DIST ATOM N: 2401 VAL: 153 CHAIN 0'
'CENTERDISTANCE ATOM N: 2401 VAL: 153 DIST ATOM CA: 2403 VAL: 153 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2403 VAL: 153 DIST ATOM C: 2415 VAL: 153 CHAIN 0'
'CENTERDISTANCE ATOM C: 2415 VAL: 153 DIST ATOM N: 2417 ARG: 154 CHAIN 0'
'CENTERDISTANCE ATOM N: 2417 ARG: 154 DIST ATOM CA: 2419 ARG: 154 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2419 ARG: 154 DIST ATOM C: 2439 ARG: 154 CHAIN 0'
'CENTERDISTANCE ATOM C: 2439 ARG: 154 DIST ATOM N: 2441 GLU: 155 CHAIN 0'
'CENTERDISTANCE ATOM N: 2441 GLU: 155 DIST ATOM CA: 2443 GLU: 155 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2443 GLU: 155 DIST ATOM C: 2454 GLU: 155 CHAIN 0'
'CENTERDISTANCE ATOM C: 2454 GLU: 155 DIST ATOM N: 2456 ILE: 156 CHAIN 0'
'CENTERDISTANCE ATOM N: 2456 ILE: 156 DIST ATOM CA: 2458 ILE: 156 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2458 ILE: 156 DIST ATOM C: 2473 ILE: 156 CHAIN 0'
'CENTERDISTANCE ATOM C: 2473 ILE: 156 DIST ATOM N: 2475 ASP: 157 CHAIN 0'
'CENTERDISTANCE ATOM N: 2475 ASP: 157 DIST ATOM CA: 2477 ASP: 157 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2477 ASP: 157 DIST ATOM C: 2485 ASP: 157 CHAIN 0'
'CENTERDISTANCE ATOM C: 2485 ASP: 157 DIST ATOM N: 2487 VAL: 158 CHAIN 0'
'CENTERDISTANCE ATOM N: 2487 VAL: 158 DIST ATOM CA: 2489 VAL: 158 CHAIN 0'
'CENTERDISTANCE ATOM CA: 2489 VAL: 158 DIST ATOM C: 2501 VAL: 158 CHAIN 0']
However, CVs are deleted, when the number of atoms is altered.
[87]:
frames.atom_slice(frames.select('name CA'))
/home/kevin/git/encoder_map_private/encodermap/trajinfo/info_single.py:1691: UserWarning:
Dropping CVs from trajectory. Slicing CVs with this method is currently not possible. Raise an issue if you want to have this feature added.
[88]:
frames.CVs
[88]:
{}
Saving a complete trajectory Ensemble#
The same is possible with a complete TrajEnsemble
. We can just save and load it after we have introduced all required collective variables.
The filesize of such files can be really large. They are note meant for archival purposes. It is discouraged to store these files on a managed filesystem, which makes constant backups of files. Please save these files only on your local PCs and keep the original .xtc and .pdb files in an archive.
A TrajEnsemble
can be easily saved and loaded.
[89]:
trajs.save("pASP_pGLU_ensemble.h5", overwrite=True)
The file size is rather large with around 50 MB.
[90]:
Path("pASP_pGLU_ensemble.h5").stat().st_size / 1024 / 1024
[90]:
50.07163143157959
And reloaded
[91]:
loaded_trajs = em.TrajEnsemble.from_dataset("pASP_pGLU_ensemble.h5")
loaded_trajs._CVs
[91]:
<xarray.Dataset> Dimensions: (ATOM: 30, ATOM_NO: 4, CENTRAL_ANGLES: 28, CENTRAL_DIHEDRALS: 27, CENTRAL_DISTANCES: 29, COORDS: 3, SIDE_DIHEDRALS: 28, traj_num: 7, frame_num: 5001, traj_name: 1) Coordinates: * ATOM (ATOM) <U2 '1' '2' '3' ... '29' '30' * ATOM_NO (ATOM_NO) int64 0 1 2 3 * CENTRAL_ANGLES (CENTRAL_ANGLES) <U18 'CENTERANGLE ... * CENTRAL_DIHEDRALS (CENTRAL_DIHEDRALS) <U18 'CENTERDIH P... * CENTRAL_DISTANCES (CENTRAL_DISTANCES) <U18 'CENTERDISTA... * COORDS (COORDS) <U10 'POSITION X' ... 'POSIT... * SIDE_DIHEDRALS (SIDE_DIHEDRALS) <U18 'SIDECHDIH CHI1... * frame_num (frame_num) int64 0 1 2 ... 4999 5000 time (traj_num, frame_num) float32 0.0 ...... * traj_num (traj_num) int64 0 1 2 3 4 5 6 * traj_name (traj_name) <U18 'pASP_pGLU_ensemble' Data variables: central_angles (traj_num, frame_num, CENTRAL_ANGLES) float32 ... central_angles_feature_indices (traj_num, CENTRAL_ANGLES, ATOM_NO) float64 ... central_cartesians (traj_num, frame_num, ATOM, COORDS) float32 ... central_cartesians_feature_indices (traj_num, ATOM) float64 0.0 3.0 ... nan central_dihedrals (traj_num, frame_num, CENTRAL_DIHEDRALS) float32 ... central_dihedrals_feature_indices (traj_num, CENTRAL_DIHEDRALS, ATOM_NO) float64 ... central_distances (traj_num, frame_num, CENTRAL_DISTANCES) float32 ... central_distances_feature_indices (traj_num, CENTRAL_DISTANCES, ATOM_NO) float64 ... side_dihedrals (traj_num, frame_num, SIDE_DIHEDRALS) float32 ... side_dihedrals_feature_indices (traj_num, SIDE_DIHEDRALS, ATOM_NO) float64 ... Attributes: angle_units: rad feature_axes: ['SIDE_DIHEDRALS', 'ATOM', 'CENTRAL_DIHEDRALS', 'CENTRAL... full_paths: ['/home/kevin/git/encoder_map_private/tests/data/pASP_pG... length_units: nm time_units: ps topology_files: ['/home/kevin/git/encoder_map_private/tests/data/pASP_pG...
Conclusion#
Thinking of collections of trajectories describing the dynamic phenomena of a protein as an ensemble opens up new ways of working and understanding MD data. EncoderMap’s autoencoder-based dimensionality reduction builds on this framework by using neural networks to reduce the dimensionality of a input feature space. This input feature space and thus the low-dimensional representation also, contain ensemble statistics of the protein or the family of proteins we will look at.
Using the SingleTraj
and TrajEnsemble
classes can help with the dimensionality reduction, but with much more of the book-keeping a computational chemist has to do. We hope the tools presented in this notebook can help you in tackling the itering protein dynamic questions you are faced with.