Working with trajectory ensembles#

Welcome

Welcome to the MD section of the EncoderMap tutorial. All EncoderMap tutorials are provided as jupyter notebooks, that you can run locally, on binderhub, or even on google colab.

Run this notebook on Google Colab:

Open in Colab

Find the documentation of EncoderMap:

https://ag-peter.github.io/encodermap

Goals:

In this tutorial you will learn:

For Google colab only:#

If you’re on Google colab, please uncomment these lines and install EncoderMap.

[1]:
# !wget https://gist.githubusercontent.com/kevinsawade/deda578a3c6f26640ae905a3557e4ed1/raw/b7403a37710cb881839186da96d4d117e50abf36/install_encodermap_google_colab.sh
# !sudo bash install_encodermap_google_colab.sh

Primer#

Collective Variables#

Collective Variables (CVs) are often used to simplify and filter the xyz-Data of MD simulations. They are often employed to make sense of a complex protein (or more general molecular) system by breaking it down into just a few well-defined descriptors. They are similar to reaction coordinates, which are 1-dimensional variables along a reaction pathway but can be higher dimensional. When we think about a receptor-ligand system, the distance between the two species is often used as a reaction coordinate. A set of this distance CV and the relative rotation between receptor and ligand can add much more information and help understand the system. It becomes apparent that clever selection of CVs is an important task that many scientists in different fields face day to day.

With the tools presented in this notebook you will be able to work in a fluent and natural way with MD trajectories and their associated CV data. In the scope of this work we define CVs as:

A collection of values that is in one of its dimensions aligned with the frames/timesteps of the underlying simulation data.

Example CVs#

Here’s a list of widely used CVs:

  • Distances: This category of CVs can contain a 1-diemsnional scalar value describing the distance between two species in a receptor-ligand system. A 1-dimensional CV of the end-to-end distance of a protein can describe the proteins folding state in an approximate and generalizing manner. Distance CVs can also be higher-dimensional. The pairwise distances between \(\mathrm{C_\alpha}\) atoms of a protein with \(n\) residues can be captured as a \(n \times n\) matrix. A so-called hollow matrix, where all diagonal-elements are zero which is also symmetric. Most often a vector of length \(\binom{n}{2}\) is used to describe the pairwise distances in a protein. The distance between a protein and a membrane/interface would also fall into this category.

  • Angles: Angular CVs lie in a periodic space of \((-\pi, \pi]\) or \((0, 2\pi]\), or \((0, 360]\). The well-known Ramachandran plot uses a protein’s \(\phi\) and \(\psi\) angles to define regions of \(psi\)-\(phi\) combinations which correlate to different secondary structure motifs. Besides the backbone angles other angles can become important in understanding a protein’s conformations. Lysine is often modified via acetylation, phosphorylation, methylation, ubiquitylation and it’s sidechain angles (\(\chi_1\) to \(\chi_5\)) can be important descriptors in such a system. Pseudo-Dihedral angles lie also in this category.

  • Integer/binary values. If a protein has well-defined states (folded and unfolder) a binary value describing these states could also be a useful CV.

  • Values from other calculations and hyperparameters. An example for this category could be the temperature at which the simulation was carried out, when simulations were conducted at multiple temperatures. If phase space sub states were obtained by either using Markov-Chain models, or by using some sort of clustering algorithm (GROMOS), the membership to such cluster could also present CVs.

  • Positional values: Maybe even the full position of an atom or a group of atoms could be an important descriptor for a system.

  • etc.

We can view the raw xyz-data of trajectories as a high-dimensional array with a shape of (n_frames, n_atoms, 3). The last axis correxponds to the cartesian coordinates. Indexing this axis will give us the x, y, and z coordinates, respectively.

[2]:
# todo: delete
from __future__ import annotations
import plotly.graph_objects as go
import numpy as np
from pathlib import Path

import encodermap as em

%load_ext autoreload
%autoreload 2
/home/kevin/git/encoder_map_private/encodermap/__init__.py:194: GPUsAreDisabledWarning: EncoderMap disables the GPU per default because most tensorflow code runs with a higher compatibility when the GPU is disabled. If you want to enable GPUs manually, set the environment variable 'ENCODERMAP_ENABLE_GPU' to 'True' before importing EncoderMap. To do this in python you can run:

import os; os.environ['ENCODERMAP_ENABLE_GPU'] = 'True'

before importing encodermap.
  _warnings.warn(
[3]:
traj = em.load_project('linear_dimers', traj=0)

print(f"First frame, first atom, x-coordinate:\n{traj.xyz[0, 0, 0]=}")
print(f"First 2 frames, first atom, y-coordinates:\n{traj.xyz[:2, 0, 1]=}")
print(f"All frames, all atoms. Only z-coordinates:\n{traj.xyz[..., -1]}")
/home/kevin/git/encoder_map_private/encodermap/kondata.py:477: UserWarning: EncoderMap's repository at https://sawade.io/encodermap_data/linear_dimers is unreachable.
  warnings.warn(f"EncoderMap's repository at {url} is unreachable.")
First frame, first atom, x-coordinate:
traj.xyz[0, 0, 0]=6.1210003
First 2 frames, first atom, y-coordinates:
traj.xyz[:2, 0, 1]=array([6.42     , 6.3940005], dtype=float32)
All frames, all atoms. Only z-coordinates:
[[1.1060001  1.092      1.028      ... 5.7790003  5.7790003  5.881     ]
 [1.056      1.0610001  0.96400005 ... 5.76       5.689      5.8650002 ]
 [1.062      1.0090001  1.144      ... 5.8500004  5.9550004  5.8240004 ]
 ...
 [0.12100001 0.031      0.12400001 ... 4.783      4.846      4.7850003 ]
 [0.116      0.081      0.125      ... 4.617      4.6980004  4.5950003 ]
 [0.06900001 0.06       0.063      ... 4.5290003  4.486      4.6260004 ]]

The 3-dimensions of this data can be used for plotting. However, we don’t gain much from these plots.

[4]:
em.plot.plot_raw_data(
    traj,
    frame_slice=slice(0, 5001, 1000),
    atom_slice=slice(0, 1500, 150),
)

Adding more context to the raw xyz-data (like size, color, bonds) we can better understand the data.

[5]:
em.plot.plot_ball_and_stick(traj)
/home/kevin/git/encoder_map_private/encodermap/trajinfo/info_single.py:1691: UserWarning:

Dropping CVs from trajectory. Slicing CVs with this method is currently not possible. Raise an issue if you want to have this feature added.

We can get more information by using additional libraries like NGLView which also render secondary structure. However, we can’t really understand how the protein moves from one folding state to another. Hit the “Play” button and have a look at the first 200 frames of the simulation. Can you spot the LEU56-SER57-ASP58-TYR59 residues transition from a turn into an \(\alpha\)-helix?

[6]:
import nglview as nv
nv.show_mdtraj(traj.traj[::100])

That’s when collective variables can come in handy. They break down and combine different xyz data to help us understand the grand picture of protein folding.

A Ramachandran plot is a very general description of secondary structure motifs. The test protein does indeed contain \(alpha\)-helices and \(beta\)-sheets. We can’t really understand the conformation of the protein.

[7]:
em.plot.plot_ramachandran(traj, subsample=1000)

A DSSP plot doesn’t condense the \(phi\) and \(psi\) angles into a single plot. Here we can assign secondary structure motifs to certain residues at certain times of the simulation.

[8]:
em.plot.plot_dssp(
    traj,
    simplified=False,
    subsample=slice(0, 500),
)

Distances can also be used as useful collective variables. The end-to-end distance can give helpful insights into a proteins general conformation (collapsed vs. extended).

[9]:
em.plot.plot_end2end(traj)

Sharing MD data#

To play with MD data we must first obtain them. Instead of running long MD simulations you can just download the datasets from the University of Konstanz’ data repository KonDATA. EncoderMap has a convenience function get_from_kondata(), that allows you to fetch those files:

[10]:
import encodermap as em
output_dir = em.get_from_kondata(
    "linear_dimers",
    mk_parentdir=True,
    silence_overwrite_message=True,
)
/home/kevin/git/encoder_map_private/encodermap/kondata.py:477: UserWarning:

EncoderMap's repository at https://sawade.io/encodermap_data/linear_dimers is unreachable.

[11]:
output_dir
[11]:
'/home/kevin/git/encoder_map_private/tests/data/linear_dimers'

Classes for working with MD data#

After we have obtained some files let us work with Encodermap’s SingleTraj and TrajEnsemble classes. We need some more imports to also plot some data and visualize it.

[12]:
# Packages for numerical data
import xarray as xr
import numpy as np

# Packages for MD data
import mdtraj as md
import MDAnalysis as mda

# Packages for plotting
import matplotlib as mpl
import matplotlib.pyplot as plt

# Packages for file operations
from glob import glob
from pathlib import Path
import os
import tempfile

# nice printing
from rich.pretty import pprint

# With nglview you can view structural data in the notebook
# If not installed you should check it out.
try:
    import nglview as nv
except ImportError:
    print("Check out nglview: https://github.com/nglviewer/nglview")

The new SingleTraj class#

The SingleTraj class is meant as a container to hold a trajectory’s xyz coordinates, its topology, and its CVs. This class builds the backbone of the TrajEnsemble class which will be looked at later.

In the background of the SingleTraj class the MDTraj package (mdtraj/mdtraj) works its magic, however this class offers more benefits:

  • The SingleTraj class keeps track of CVs while indexing and slicing.

  • The trajectories’ xyz data is only loaded when it is actually needed. Most manipuilation can thus be done before creating huge objects in memory.

The SingleTraj class can be initialized in many ways:

  • From a trajectory file (.xtc, .dcd, .lammpstrj) and a topology file (.gro, .pdb).

[13]:
output_dir = Path(output_dir)
traj_from_xtc_file = em.SingleTraj(output_dir / "01.xtc", top=output_dir / "01.pdb")
print(traj_from_xtc_file)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. Not containing any CVs.
  • From an url of the pdb database.

[14]:
traj_from_url = em.SingleTraj('https://files.rcsb.org/view/1YUF.pdb')
print(traj_from_url)
encodermap.SingleTraj object. Currently not in memory. Basename is '1YUF'. Not containing any CVs.
  • Or by just by providing a pdb code to the from_pdb_id() constructor of the SingleTraj class.

[15]:
traj_from_pdb_id = em.SingleTraj.from_pdb_id('1YUG')
print(traj_from_pdb_id)
encodermap.SingleTraj object. Currently not in memory. Basename is '1YUG'. Not containing any CVs. Common string is '1YUG'.
  • You can also directly load the trajectory from an EncoderMap project. In this case, the trajectory already has CVs.

[16]:
# with the load_project() method
traj = em.load_project('linear_dimers', traj=0)
print(traj)
encodermap.SingleTraj object. Currently not in memory. Basename is 'trajs'. CV central_angles with shape (1, 5001, 454) loaded. CV central_cartesians with shape (1, 5001, 456, 3) loaded. CV central_dihedrals with shape (1, 5001, 453) loaded. CV central_distances with shape (1, 5001, 455) loaded. CV lowd with shape (1, 5001, 2) loaded. CV side_dihedrals with shape (1, 5001, 322) loaded.

``SingleTraj``s keep track of the files they have been instantiated from.

[17]:
print(
    f"basename  = {traj_from_xtc_file.basename:<70}{traj_from_url.basename:<70}\n"
    f"traj_file = {traj_from_xtc_file.traj_file:<70}{traj_from_url.traj_file:<70}\n"
    f"top_file  = {traj_from_xtc_file.top_file:<70}{traj_from_url.top_file:<70}\n"
    f"extension = {traj_from_xtc_file.extension:<70}{traj_from_url.extension:<70}\n"
    f"n_frames  = {traj_from_xtc_file.n_frames:<70}{traj_from_url.n_frames:<70}\n"
    f"n_atoms   = {traj_from_xtc_file.n_atoms:<70}{traj_from_url.n_atoms:<70}"
)
basename  = 01                                                                    1YUF
traj_file = /home/kevin/git/encoder_map_private/tests/data/linear_dimers/01.xtc   https://files.rcsb.org/view/1YUF.pdb
top_file  = /home/kevin/git/encoder_map_private/tests/data/linear_dimers/01.pdb   https://files.rcsb.org/view/1YUF.pdb
extension = .xtc                                                                  .pdb
n_frames  = 5001                                                                  16
n_atoms   = 1521                                                                  720

Difference between ``traj``, ``trajectory`` and ``top``, ``topology``

traj and top always give mdtraj.Trajectory and mdtraj.Topology, respectively. They are loaded on demand and return the corresponding mdtraj object.

trajectory and topology can be False and represent the current backend of the SingleTraj object.

When instantiated, SingleTrajd do not load the data from disk. Only when requested.

[18]:
print(traj_from_xtc_file.topology)
print(traj_from_xtc_file.top)
print(traj_from_xtc_file.topology)
False
<mdtraj.Topology with 1 chains, 152 residues, 1521 atoms, 1534 bonds>
False
[19]:
print(traj_from_xtc_file.trajectory)
print(traj_from_xtc_file.traj)
print(traj_from_xtc_file.trajectory)
False
<mdtraj.Trajectory with 5001 frames, 1521 atoms, 152 residues, and unitcells>
False

Loading can be forced

Using the load() method, the trajectory will be loaded. From that point forwards, the xyz data is kept in RAM and can be accessed. The unload() function does the reverse and frees up the RAM, but the xyz data can be loaded again. If your RAM is large enough you would not need the unload() function, but it is there nonetheless.

[20]:
traj_from_xtc_file.load_traj()
print(traj_from_xtc_file.topology)
traj_from_xtc_file.unload()
print(traj_from_xtc_file.topology)
<mdtraj.Topology with 1 chains, 152 residues, 1521 atoms, 1534 bonds>
False

Inside a context manager, the SingleTraj is always loaded and unloaded afterwards.

[21]:
with traj_from_xtc_file as t:
    print(t.topology)
print(traj_from_xtc_file.topology)
<mdtraj.Topology with 1 chains, 152 residues, 1521 atoms, 1534 bonds>
False

If nglview is set up, you can take a look at the trajectory with this code:

[22]:
view = traj_from_xtc_file.show_traj()
view

Some methods and attributes are duplicated from mdtraj. This allows us to call some mdtraj functions on the SingleTraj object directly.

[23]:
selection = traj_from_xtc_file.select('name CA')
print(selection[:5])
dssp = md.compute_dssp(traj_from_xtc_file)
print(dssp[0, :5].tolist())
[ 4 13 25 34 51]
['C', 'E', 'E', 'E', 'E']
[24]:
md.compute_center_of_mass(traj_from_xtc_file)[0]
[24]:
array([7.60928699, 7.31795958, 3.59687811])

By indexing a SingleTraj, you get another instance of SingleTraj containing only one frame.

[25]:
frame = traj_from_xtc_file[0]
print(frame)
encodermap.SingleTraj object. Data currently in memory. Basename is '01'. At indices (0,). Not containing any CVs.

The indexing of SingleTrajs is postponed, meaning, that for certain indexing types (integers, sequences of integers, slices without the step argument) the trajectory data is not loaded from disk. Only, when loaded, are the indexes applied one after the other.

This cell is fast. No file operations are carried out here.

[26]:
traj_from_xtc_file.unload()
frame = traj_from_xtc_file[:10][[0, 1]]
print(frame)
print(frame.trajectory)
print(frame.n_frames)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices (slice(None, 10, None), [0, 1]). Not containing any CVs.
False
2

This cell is a bit slower. Here, the trajectory is loaded from file.

[27]:
frame.load_traj()
print(frame)
encodermap.SingleTraj object. Data currently in memory. Basename is '01'. At indices (slice(None, 10, None), [0, 1]). Not containing any CVs.

If the index is larger than the number of frames, the SingleTraj class will throw an exception.

[28]:
traj_from_xtc_file.unload()
frame = traj_from_xtc_file[10_000]
print(frame)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[28], line 2
      1 traj_from_xtc_file.unload()
----> 2 frame = traj_from_xtc_file[10_000]
      3 print(frame)

File ~/git/encoder_map_private/encodermap/trajinfo/info_single.py:1986, in SingleTraj.__getitem__(self, key)
   1984 if isinstance(key, (int, np.integer)):
   1985     if key > self.n_frames:
-> 1986         raise IndexError(
   1987             f"Index {key} out of range for traj with "
   1988             f"{self.n_frames} frames."
   1989         )
   1990 if isinstance(key, (list, np.ndarray)):
   1991     if any([k > self.n_frames for k in key]):

IndexError: Index 10000 out of range for traj with 5001 frames.

SingleTrajs keep the number of their frames organized. If a SingleTraj is loaded from disk the frames are represented by a list of integers:

[29]:
traj_from_xtc_file.id
[29]:
array([   0,    1,    2, ..., 4998, 4999, 5000])

ndexing these frames with a [::2] slice will give the frames [0, 2, 4, ..., n].

[30]:
traj_from_xtc_file[::2].id
[30]:
array([   0,    2,    4, ..., 4996, 4998, 5000])

This begs the question what happens if we index these frames again? Which frame numbers will be chosen

The answer is: There are two methods.

  1. The normal indexing [[0, 1, 2]] does not care for the original frame indices. It will select whichever frame is at positions 0, 1, and 2, regardless of what their original id was.

[31]:
traj_from_xtc_file[::2][[0, 1, 2, 3]].id
[31]:
array([0, 2, 4, 6])
[32]:
traj_from_xtc_file[::5][[0, 1, 2, 5, 6]].id
[32]:
array([ 0,  5, 10, 25, 30])
  1. The other method uses the fsel[] selector. It expects the original frame indices.

[33]:
traj_from_xtc_file[::2].fsel[[0, 2, 4, 6]].id
[33]:
array([0, 2, 4, 6])
[34]:
traj_from_xtc_file[::5].fsel[[0, 5, 10, 25, 30]].id
[34]:
array([ 0,  5, 10, 25, 30])

This also means, that the fsel selector can fail for frames that are not present in a SingleTraj.

This means, that some frames are not accessible, when we do some weird slicing:

[35]:
traj_from_xtc_file[::5].fsel[3]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[35], line 1
----> 1 traj_from_xtc_file[::5].fsel[3]

File ~/git/encoder_map_private/encodermap/trajinfo/info_single.py:199, in SingleTrajFsel.__getitem__(self, item)
    194         raise ValueError(
    195             f"The `fsel[]` method of `SingleTraj` takes {CanBeIndex} types, "
    196             f"but {type(item)} was provided."
    197         )
    198 if len(idx) == 0:
--> 199     raise ValueError(
    200         f"No frames with frame index {item} in trajectory {self.other} "
    201         f"with frames: {self.other._frames}"
    202     )
    203 return self.other[idx]

ValueError: No frames with frame index 3 in trajectory encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices (slice(None, None, 5),). Not containing any CVs. with frames: [   0    5   10 ... 4990 4995 5000]

You can also give a numpy array, a list or even a slice into the indexing.

Indexing with without a stop ([::5]) will put the trajectory into memory.

[36]:
traj_from_xtc_file.unload()
subsample = traj_from_xtc_file[::2]
print(traj_from_xtc_file.n_frames)
print(subsample)
print(subsample.n_frames)
5001
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices (slice(None, None, 2),). Not containing any CVs.
2501
[37]:
traj_from_xtc_file.unload()
subsample = traj_from_xtc_file[[0, 1, 5, 6]]
print(subsample)
print(subsample.n_frames)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices ([0, 1, 5, 6],). Not containing any CVs.
4

Note: Andvanced slicing can result in trajectories with 0 frames in them, or possibly reverse the time axis. Use this feature only if you are sure about what you are doing.

[38]:
traj_from_xtc_file.unload()
subsample = traj_from_xtc_file[5:46:3]
print(subsample)
print(subsample.n_frames)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. At indices (slice(5, 46, 3),). Not containing any CVs.
14

The HDF5 file format (ending wiht the .h5 extension) allows us to directly extract frames and accelerate loading. We can save a .h5 formatted file from an SingleTraj class by calling:

[39]:
with tempfile.TemporaryDirectory() as d:
    d = Path(d)
    traj_from_xtc_file.save(d / 'my_traj.h5', overwrite=True)

Loading of .h5 files is similar to all files:

[40]:
with tempfile.TemporaryDirectory() as d:
    d = Path(d)
    traj_from_xtc_file.save(d / 'my_traj.h5', overwrite=True)
    traj_h5 = em.SingleTraj(d / 'my_traj.h5')

    print(traj_h5[0])
encodermap.SingleTraj object. Currently not in memory. Basename is 'my_traj'. At indices (0,). Not containing any CVs.

The SingleTraj class can be used as an iterator in two ways:

  • Iterate over the frames with for frame in traj.

  • Iterate over frame_number and frame with for frame_no, frame in traj.iterframes().

[41]:
for frame in traj[::2][:5]:
    print(frame.index)
(None, slice(None, None, 2), slice(None, 5, None), 0)
(None, slice(None, None, 2), slice(None, 5, None), 1)
(None, slice(None, None, 2), slice(None, 5, None), 2)
(None, slice(None, None, 2), slice(None, 5, None), 3)
(None, slice(None, None, 2), slice(None, 5, None), 4)
[42]:
for frame_num, frame in traj[::2][:5].iterframes():
    print(frame_num)
0
2
4
6
8

The TrajEnsemble class contains multiple SingleTrajs#

The trajectory ensemble is everything you’ve always wanted, and more. Really, it is. Trajectory ensembles unlock fundamental ideas in statistical mechanics, including connections between equilibrium and non-equilibrium phenomena.

EncoderMap tries to implement the idea of a trajectory ensemble with the TrajectoryEnsemble class. A container for multiple SingleTrajs.

A trajectory ensemble can be created by providing it multiple trajectory files:

[43]:
output_dir = Path(em.get_from_kondata("pASP_pGLU", mk_parentdir=True, silence_overwrite_message=True))

trajs_from_files = em.load(
    [
        output_dir / "asp7.xtc",
        output_dir / "glu7.xtc",
    ],
    [
        output_dir / "asp7.pdb",
        output_dir / "glu7.pdb",
    ],
    common_str=["asp7", "glu7"]
)
/home/kevin/git/encoder_map_private/encodermap/kondata.py:477: UserWarning:

EncoderMap's repository at https://sawade.io/encodermap_data/pASP_pGLU is unreachable.

[44]:
trajs_from_files
[44]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 2 trajectories. Common str is ['asp7', 'glu7']. Not containing any CVs. Object at 0x7055c7b2fa60>

The common_str argument provides a means to group similar trajectories. It searches for matching substrings in the filename of the provided trajectories and topologies. In the trajs_from_files, there are only one TrajEnsemble with a single trajectory per common_str.

[45]:
for common_str, trajs_subset in trajs_from_files.trajs_by_common_str.items():
    print(f"The common_str '{common_str}' contains a subset of {trajs_subset.n_trajs} trajectories.")
The common_str 'asp7' contains a subset of 1 trajectories.
The common_str 'glu7' contains a subset of 1 trajectories.

Something that EncoderMap does different than packages like MDTraj or MDAnalysis is that trajectories can be grouped together even when their topologies are different.

[46]:
for top, trajs_subset in trajs_from_files.trajs_by_top.items():
    print(f"The topology {top} has {trajs_subset.n_trajs} in the ensemble.")
The topology <mdtraj.Topology with 1 chains, 7 residues, 73 atoms, 71 bonds> has 1 in the ensemble.
The topology <mdtraj.Topology with 1 chains, 7 residues, 80 atoms, 78 bonds> has 1 in the ensemble.

The trajs_from_files instance has trajectories with 2 different topologies in it. This feature is meant to represent the expression of mutations in biological systems.

https://evolution.berkeley.edu/dna-and-mutations/types-of-mutations/

Indexing TrajEnsembles can yield different output types. Indexing with a sinlge int will yield the corresponding trajectory.

[47]:
trajs_from_files[1]
[47]:
<encodermap.SingleTraj object. Currently not in memory. Basename is 'glu7'. Not containing any CVs. Common string is 'glu7'. Object at 0x7055c895a920>

Python lists, NumPy arrays and slices can also be used for indexing. They all return TrajEnsemble object and raise Errors, when they are indexing trajectories that do not exist in the ensemble.

[48]:
trajs_from_files[[0, 1]]
[48]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 2 trajectories. Common str is ['glu7', 'asp7']. Not containing any CVs. Object at 0x7055ce6b1a20>
[49]:
trajs_from_files[[0, 1, 2]]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[49], line 1
----> 1 trajs_from_files[[0, 1, 2]]

File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:3051, in TrajEnsemble.__getitem__(self, key)
   3049     return self.trajs[key]
   3050 elif isinstance(key, list) and not isinstance(key[0], list):
-> 3051     new_class = self._return_trajs_by_index(key)
   3052     return new_class
   3053 elif isinstance(key, np.ndarray):

File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:2566, in TrajEnsemble._return_trajs_by_index(self, index)
   2564         trajs_subset = self.trajs[ind]._gen_ensemble()
   2565     else:
-> 2566         new_traj = self.trajs[ind]._gen_ensemble()
   2567         trajs_subset += new_traj
   2568 trajs_subset.common_str = new_common_str

IndexError: list index out of range

Slices can contain start, stop, step or any combination, thereof:

[50]:
trajs_from_files[0:1]
[50]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 1 trajectories. Common str is ['asp7']. Not containing any CVs. Object at 0x7055cdfd58a0>
[51]:
trajs_from_files[::3]
[51]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 1 trajectories. Common str is ['asp7']. Not containing any CVs. Object at 0x7055c93a5db0>

They will also raise Errors, if trajectories are not accessible.

[52]:
trajs_from_files[5:]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[52], line 1
----> 1 trajs_from_files[5:]

File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:3069, in TrajEnsemble.__getitem__(self, key)
   3067     start, stop, step = key.indices(self.n_trajs)
   3068     list_ = list(range(start, stop, step))
-> 3069     new_class = self[list_]
   3070     return new_class
   3071 elif isinstance(key, list) and all(isinstance(k, list) for k in key):

File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:3050, in TrajEnsemble.__getitem__(self, key)
   3048 if isinstance(key, (int, np.int32, np.int64)):
   3049     return self.trajs[key]
-> 3050 elif isinstance(key, list) and not isinstance(key[0], list):
   3051     new_class = self._return_trajs_by_index(key)
   3052     return new_class

IndexError: list index out of range

Similar to trajectories, which can contain frames with various frame-numbers, TrajEnsemble can contain trajectories with various trajectory-numbers. The regular indexing [] does not use the trajectory number.

[53]:
trajs = trajs_from_files.copy()
trajs[0].traj_num = 10
trajs[1].traj_num = 20
trajs[0].traj_num
[53]:
10

But the tsel[] selector uses the trajectory-number:

[54]:
trajs.tsel[20].traj_num
[54]:
20

And raises Errors, if a trajectory with that index is not present.

[55]:
trajs.tsel[0]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[55], line 1
----> 1 trajs.tsel[0]

File ~/git/encoder_map_private/encodermap/trajinfo/info_all.py:765, in TrajEnsembleTsel.__getitem__(self, item)
    763 if isinstance(item, (int, np.int64)):
    764     if item not in items:
--> 765         raise ValueError(
    766             f"No trajectories with traj_num {item} in TrajEnsemble {self.other} "
    767             f"with trajectories: {items}"
    768         )
    769     return self.other.trajs_by_traj_num[item]
    770 elif isinstance(item, (list, np.ndarray)):

ValueError: No trajectories with traj_num 0 in TrajEnsemble encodermap.TrajEnsemble object. Current backend is no_load. Containing 2 trajectories. Common str is ['asp7', 'glu7']. Not containing any CVs. with trajectories: [10 20]

2-dimensional numpy arrays can be used for indexing TrajEnsembles, by providing trajectory-numbers and frame-numbers arranged in the rows of the array. This type of selection does not respect the actual trajectory-numbers. That’s what - you guessed it - tsel[] is for.

[56]:
index = np.array([
    [0, 0],
    [0, 1],
    [0, 2],
    [0, 3],
    [1, 0],
    [1, 2],
    [1, 4],
])
trajs[index].id
[56]:
array([[10,  0],
       [10,  1],
       [10,  2],
       [10,  3],
       [20,  0],
       [20,  2],
       [20,  4]])

The tsel[] selector does the same, but with trajectory-numbers and frame-numbers.

[57]:
index = np.array([
    [10, 0],
    [10, 10],
    [10, 20],
    [10, 30],
    [20, 0],
    [20, 2],
    [20, 4],
])
trajs.tsel[index].id
[57]:
array([[10,  0],
       [10, 10],
       [10, 20],
       [10, 30],
       [20,  0],
       [20,  2],
       [20,  4]])

Think about the TrajectoryEnsemble as a high-dimensional object, that contains an extra axis. The trajectory axis (The SingleTraj object only has an atom and a time axis).

[58]:
trajs[0] + trajs[1]
[58]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 2 trajectories. Common str is ['asp7', 'glu7']. Not containing any CVs. Object at 0x7055c98a69e0>

As we have previously seen, there are some out-of-the box TrajectoryEnsembles that come with EncoderMap. You can get a list of all available projects via:

[59]:
em.ALL_PROJECT_NAMES
[59]:
typing.Union[typing.Literal['linear_dimers'], typing.Literal['pASP_pGLU'], typing.Literal['Ub_K11_mutants'], typing.Literal['cube'], typing.Literal['1am7'], typing.Literal['H1Ub']]

Projects are automatically loaded from either:

[60]:
trajs = em.load_project("pASP_pGLU")
trajs
[60]:
<encodermap.TrajEnsemble object. Current backend is no_load. Containing 7 trajectories. Common str is ['asp10', 'glu8', 'glu6', 'glu7', 'asp8', 'asp6', 'asp7']. CV central_dihedrals with shape (35007, 27) loaded. CV central_angles with shape (35007, 28) loaded. CV side_dihedrals with shape (35007, 28) loaded. CV central_distances with shape (35007, 29) loaded. CV central_cartesians with shape (35007, 30, 3) loaded. Object at 0x7055cdfd5870>

Loading CVs#

After learning about the basics of the SingleTraj and TrajEnsemble class we will come back to collective variables. EncoderMap offers a way of “adding” CV data to your trajectories and treat them as a single instance. A trajectory with CV data. The easiest way to load CV data is to provide an already existing NumPy array. However, you will be asked to also provide the attribute name (attr_name) of the array. With this you could load multiple CV datasets, that differ in ther attribute names. Here’s an example:

From numpy#

[61]:
# rename the traj to make the following code more readable
traj = traj_from_xtc_file
traj.del_CVs()

# random phi/psi angles in a [0, 2pi] interval
random_raman_angles = np.random.random((traj.n_frames, 2 * traj.n_residues)) * 2 * np.pi

# define labels:
phi_angles = [f'phi {i}' for i in range(traj.n_residues)]
psi_angles = [f'psi {i}' for i in range(traj.n_residues)]
raman_labels = [None]*(len(phi_angles)+len(psi_angles))
raman_labels[::2] = phi_angles
raman_labels[1::2] = psi_angles

# load the CV
traj.load_CV(random_raman_angles, 'raman', labels=raman_labels)

# define some integer values (can be cluster memberships)
random_integers_per_frame = np.random.randint(0, 3, size=traj.n_frames)
traj.load_CV(random_integers_per_frame, 'cluster_membership')

These values can be accessed via directly calling their attribute names (so make sure to use valid identifiers).

[62]:
print(traj.cluster_membership)
[[2]
 [0]
 [1]
 ...
 [0]
 [1]
 [0]]
[63]:
print(traj.raman)
[[3.19887292 3.93664429 2.2612379  ... 3.97329588 2.61313514 1.29523575]
 [5.67751813 4.56987798 5.1660988  ... 2.93391639 0.37501569 1.58019254]
 [3.07029815 2.67067956 6.19632697 ... 5.52159601 1.92555569 5.32793883]
 ...
 [3.32619363 5.30403618 5.29339896 ... 2.86859017 5.97150589 3.54585877]
 [5.42811641 4.65150347 0.79362821 ... 4.20676372 2.6362959  3.80984363]
 [0.84258919 0.50104588 4.03725794 ... 4.49656793 0.037803   5.22552255]]

There’s also the attribute CVs that is a dict of these collective variables.

[64]:
traj.CVs
[64]:
{'raman': array([[3.19887292, 3.93664429, 2.2612379 , ..., 3.97329588, 2.61313514,
         1.29523575],
        [5.67751813, 4.56987798, 5.1660988 , ..., 2.93391639, 0.37501569,
         1.58019254],
        [3.07029815, 2.67067956, 6.19632697, ..., 5.52159601, 1.92555569,
         5.32793883],
        ...,
        [3.32619363, 5.30403618, 5.29339896, ..., 2.86859017, 5.97150589,
         3.54585877],
        [5.42811641, 4.65150347, 0.79362821, ..., 4.20676372, 2.6362959 ,
         3.80984363],
        [0.84258919, 0.50104588, 4.03725794, ..., 4.49656793, 0.037803  ,
         5.22552255]]),
 'cluster_membership': array([[2],
        [0],
        [1],
        ...,
        [0],
        [1],
        [0]])}

However, this is not the end. CVs in a SingleTraj class are stored as xarray.Datasets. The dataset can be accessed via _CVs.

[65]:
traj._CVs
[65]:
<xarray.Dataset>
Dimensions:             (traj_num: 1, frame_num: 5001, RAMAN: 304,
                         CLUSTER_MEMBERSHIP: 1)
Coordinates:
  * traj_num            (traj_num) object None
    traj_name           (traj_num) <U2 '01'
  * frame_num           (frame_num) int64 0 1 2 3 4 ... 4996 4997 4998 4999 5000
  * RAMAN               (RAMAN) <U7 'phi 0' 'psi 0' ... 'phi 151' 'psi 151'
  * CLUSTER_MEMBERSHIP  (CLUSTER_MEMBERSHIP) <U26 'CLUSTER_MEMBERSHIP FEATURE'
Data variables:
    raman               (traj_num, frame_num, RAMAN) float64 3.199 ... 5.226
    cluster_membership  (traj_num, frame_num, CLUSTER_MEMBERSHIP) int64 2 ... 0
Attributes:
    length_units:   nm
    time_units:     ps
    feature_axes:   ['RAMAN', 'CLUSTER_MEMBERSHIP']
    full_path:      /home/kevin/git/encoder_map_private/tests/data/linear_dim...
    topology_file:  /home/kevin/git/encoder_map_private/tests/data/linear_dim...

Why xarray?

The underlying xarray.Dataset is intended to make sure “everything is correct”. Every value can be accessed via an unambigous identifier.

[66]:
traj._CVs['raman'].loc[{'frame_num': 20, 'RAMAN': 'psi 50'}].values
[66]:
array([5.31620961])

Slicing with CVs.#

Slicing keeps your values where they should be.

[67]:
index = np.where(np.array(raman_labels) == 'psi 50')[0]
print(traj[0].raman[0, index])
print(traj[[0, 5, 10, 20]].raman[:,index])
[0.3531243]
[[0.3531243 ]
 [4.37541582]
 [4.62936691]
 [5.31620961]]

Loading from files#

CVs can be loaded by providing a string to files. First, let us save some files.

[68]:
# save numpy
np.save('raman_file.npy', traj.CVs["raman"])

# save text
np.savetxt('cluster_membership_file.txt', traj.cluster_membership)

# save full CV dataset as NetCDF
traj._CVs.to_netcdf('full_CV_dataset.nc', engine="h5netcdf", )

If not providing an attr_name, while loading files, the filename will be used:

[69]:
traj.load_CV('raman_file.npy')
traj.load_CV('cluster_membership_file.txt')
[70]:
print(traj.CVs.keys())
dict_keys(['raman', 'cluster_membership', 'raman_file', 'cluster_membership_file'])

Multiple CVs can be reconstructed from xarray NetCDF files (most end with .nc). If there are conflicts the new data from disk will overwrite the old.

[71]:
traj = em.SingleTraj(traj.traj_file, traj.top_file)
print(traj.CVs.keys())
traj.load_CV('full_CV_dataset.nc')
print(traj.CVs.keys())
dict_keys([])
dict_keys(['raman', 'cluster_membership'])

Loading with EncoderMap’s featurzier#

EncoderMap comes with a Featurizer class, that mimics the great PyEMMA package which has now been deprecated. Check it out at http://emma-project.org/latest/.

The Featurizer centralizes the computation of collective variables. It can contain multiple features, each feature describing a set of collective variables and fixes the computation of collective variables that are part of TrajectoryEnsembles with non-uniform topologies. To instantiate a featurizer, we provide it with either a SingleTraj or even a TrajectoryEnsemble instance.

[72]:
feat = em.Featurizer(traj)
feat
[72]:
EncoderMap Featurizer with features:
[, ...]

The Featurizer is currently empty. But we can add features via the add_* methods

[73]:
for m in feat.__dir__():
    if m.startswith("add"):
        print(m)
add_list_of_feats
add_custom_feature
add_distances_ca
add_distances
add_backbone_torsions
add_angles
add_all
add_selection
add_inverse_distances
add_contacts
add_residue_mindist
add_group_COM
add_residue_COM
add_dihedrals
add_sidechain_torsions
add_minrmsd_to_ref
[74]:
feat.add_backbone_torsions()
feat.add_sidechain_torsions()
feat
/home/kevin/git/encoder_map_private/encodermap/loading/features.py:470: CitePYEMMAWarning:

EncoderMap's featurization uses code from the now deprecated python package PyEMMA (https://github.com/markovmodel/PyEMMA). Please make sure to also cite them, when using EncoderMap.

[74]:
EncoderMap Featurizer with features:
['PHI 0 GLN 2',
 'PSI 0 MET 1',
 'PHI 0 ILE 3',
 'PSI 0 GLN 2',
 'PHI 0 PHE 4',
 'PSI 0 ILE 3',
 'PHI 0 VAL 5',
 'PSI 0 PHE 4',
 'PHI 0 LYS 6',
 'PSI 0 VAL 5', ...]

The idea behind EncoderMap’s feraturizer is to prepare these calculations and define indices which are later used to actually calculate the values of the CVs. The Featurizer contains a list of Features. Every feature has a name and an attribute called *_indexes.

[75]:
for feature in feat.active_features:
    print(f"{feature.__class__.__name__}\n{feature.angle_indexes}\n")
BackboneTorsionFeature
[[   9   11   13   21]
 [   0    4    9   11]
 [  21   23   25   30]
 ...
 [1493 1495 1508 1510]
 [1513 1515 1517 1518]
 [1510 1512 1513 1515]]

SideChainTorsions
[[   0    4    5    6]
 [  11   13   14   15]
 [  23   25   26   27]
 ...
 [1281 1282 1284 1285]
 [1472 1473 1475 1476]
 [1498 1499 1501 1502]]

In this case, the attribute angle_indexes attribute contain the 0-based atom ids which are to be used for the calculation of torsion angles. The result of the calculation can be obtained via:

[76]:
feat.get_output()
[76]:
<xarray.Dataset>
Dimensions:                                 (traj_num: 1, frame_num: 5001,
                                             BACKBONETORSIONFEATURE: 302,
                                             SIDECHAINTORSIONS: 322, ATOM_NO: 4)
Coordinates:
  * traj_num                                (traj_num) object None
    traj_name                               (traj_num) <U2 '01'
  * frame_num                               (frame_num) int64 0 1 ... 4999 5000
  * BACKBONETORSIONFEATURE                  (BACKBONETORSIONFEATURE) <U13 'PH...
    time                                    (frame_num) float32 0.0 ... 5e+04
  * SIDECHAINTORSIONS                       (SIDECHAINTORSIONS) <U14 'CHI1 0 ...
  * ATOM_NO                                 (ATOM_NO) int64 0 1 2 3
Data variables:
    BackboneTorsionFeature                  (traj_num, frame_num, BACKBONETORSIONFEATURE) float32 ...
    SideChainTorsions                       (traj_num, frame_num, SIDECHAINTORSIONS) float32 ...
    BackboneTorsionFeature_feature_indices  (traj_num, BACKBONETORSIONFEATURE, ATOM_NO) int32 ...
    SideChainTorsions_feature_indices       (traj_num, SIDECHAINTORSIONS, ATOM_NO) int32 ...
Attributes:
    length_units:   nm
    time_units:     ps
    feature_axes:   ['SIDECHAINTORSIONS', 'BACKBONETORSIONFEATURE']
    angle_units:    rad
    full_path:      /home/kevin/git/encoder_map_private/tests/data/linear_dim...
    topology_file:  /home/kevin/git/encoder_map_private/tests/data/linear_dim...

The advantages of this method are:

  • The same can be done with the TrajEnsemble class (more on that later), which is also parallelized.

  • Most of the features contain comprehensive labels themselves.

The SingleTraj class can be used to direclty load the output of a Featurizer:

[77]:
traj.load_CV(feat)

And the new features from the Featurizer are added to the preexisting features of the SingleTraj.

[78]:
traj._CVs
[78]:
<xarray.Dataset>
Dimensions:                                 (traj_num: 1, frame_num: 5001,
                                             RAMAN: 304, CLUSTER_MEMBERSHIP: 1,
                                             BACKBONETORSIONFEATURE: 302,
                                             SIDECHAINTORSIONS: 322, ATOM_NO: 4)
Coordinates:
  * traj_num                                (traj_num) float64 nan
    traj_name                               (traj_num) <U2 '01'
  * frame_num                               (frame_num) int64 0 1 ... 4999 5000
  * RAMAN                                   (RAMAN) <U7 'phi 0' ... 'psi 151'
  * CLUSTER_MEMBERSHIP                      (CLUSTER_MEMBERSHIP) <U26 'CLUSTE...
  * BACKBONETORSIONFEATURE                  (BACKBONETORSIONFEATURE) <U13 'PH...
    time                                    (frame_num) float32 0.0 ... 5e+04
  * SIDECHAINTORSIONS                       (SIDECHAINTORSIONS) <U14 'CHI1 0 ...
  * ATOM_NO                                 (ATOM_NO) int64 0 1 2 3
Data variables:
    raman                                   (traj_num, frame_num, RAMAN) float64 ...
    cluster_membership                      (traj_num, frame_num, CLUSTER_MEMBERSHIP) int64 ...
    BackboneTorsionFeature                  (traj_num, frame_num, BACKBONETORSIONFEATURE) float32 ...
    SideChainTorsions                       (traj_num, frame_num, SIDECHAINTORSIONS) float32 ...
    BackboneTorsionFeature_feature_indices  (traj_num, BACKBONETORSIONFEATURE, ATOM_NO) int32 ...
    SideChainTorsions_feature_indices       (traj_num, SIDECHAINTORSIONS, ATOM_NO) int32 ...
Attributes:
    length_units:   nm
    time_units:     ps
    feature_axes:   ['BACKBONETORSIONFEATURE', 'RAMAN', 'SIDECHAINTORSIONS', ...
    angle_units:    rad
    full_path:      /home/kevin/git/encoder_map_private/tests/data/linear_dim...
    topology_file:  /home/kevin/git/encoder_map_private/tests/data/linear_dim...

Wrtiting custom features No 1#

Writing your custom features can be done by subclassing pyemma’s features. Required methods and attributes to make your feature work are:

  • The methods __init__, describe, and call.

  • The instance attribute dimension, which defines the shape of the returned array.

If you want to change the name of the feature, as it appears in the xarray.Dataset you can set the attribute name.

In the next cell we will define a Feature that provides a random float for the atoms in a trajectory.

[79]:
class RandomFloatForAtomFeature(em.features.CustomFeature):
    # write an __init__
    def __init__(self, traj, selstr='all'):
        """Init of RandomFloatForAtomFeature.

        Args:
            top (mdtraj.Topology): The topology to select atoms from.

        Keyword Args:
            selstr (str, optional): The string to provide to top.select().
            Defaults to 'all'.

        """
        self.traj = traj
        self.top = traj.top
        self.indexes = self.traj.top.select(selstr)

        # Here, we set a required attribute
        # The dimension is needed and should
        # be an integer value of the length of the feature
        # in our case, that's how many atoms are in `self.indexes`
        self.dimension = len(self.indexes)

    def describe(self):
        """This method is not allowed to take any arguments.

        This method provides labels.

        Returns:
            list: A lsit of str, each str describing one feature.

        """
        # In this method we will build a list of str
        # Each str should describe one of our features
        # We assign floats to atoms, so the labels should tell something about the atoms
        getlbl = (
            lambda at: f"atom {at.name:>4}:{at.index:5} {at.residue.name}:{at.residue.resSeq:>4}"
        )
        labels = []
        for i in self.indexes:
            i = self.traj.top.atom(i)
            labels.append(f"Random float for {getlbl(i)}")
        return labels

    def call(self, traj):
        """This method provides values.

        Args:
            traj (mdtraj.Trajectory): An mdtraj.Trajectory.

        Returns:
            np.ndarray: The values of the features defined in describe.

        """
        # Make sure that the returned array has correct shape
        # It is a good idea, that this array has the same length as
        # the trajectory has frames
        values = traj.xyz[..., 0]
        for i in self.indexes:
            values[:, i] = float(str(hash(str(self.traj.top.atom(i))))[-5:])
        return values

    @property
    def name(self):
        # define the name of the feature to appear in `SingleTraj._CVs`
        return 'MyAwesomeFeature'
[80]:
traj = em.SingleTraj(traj.traj_file, traj.top_file)
print(traj)
featurizer = em.Featurizer(traj)
feat = RandomFloatForAtomFeature(traj)
for i in feat.describe()[:200:25]:
    print(i)
encodermap.SingleTraj object. Currently not in memory. Basename is '01'. Not containing any CVs.
Random float for atom    N:    0 MET:   1
Random float for atom   CA:   25 ILE:   3
Random float for atom    H:   50 VAL:   5
Random float for atom  HG1:   75 THR:   7
Random float for atom    C:  100 GLY:  10
Random float for atom    H:  125 ILE:  13
Random float for atom    O:  150 LEU:  15
Random float for atom  OE1:  175 GLU:  18
[81]:
featurizer.add_custom_feature(feat)
[82]:
traj.load_CV(featurizer)
[83]:
traj._CVs.coords['MYAWESOMEFEATURE'].values
[83]:
array(['Random float for atom    N:    0 MET:   1',
       'Random float for atom    H:    1 MET:   1',
       'Random float for atom   H2:    2 MET:   1', ...,
       'Random float for atom    C: 1518 GLY: 152',
       'Random float for atom    O: 1519 GLY: 152',
       'Random float for atom  OXT: 1520 GLY: 152'], dtype='<U41')

Writing custom features No 2#

In this example we will implement a method of calculating a nematic order parameter. This example will be quite different (working with coarse-grained carbon-hydrate chains (so-called telechelics), and not with proteins), but we will work our way through. Here are some references you might consider:

@article{mukherjee2012derivation,
  title={Derivation of coarse grained models for multiscale simulation of liquid crystalline phase transitions},
  author={Mukherjee, Biswaroop and Delle Site, Luigi and Kremer, Kurt and Peter, Christine},
  journal={The Journal of Physical Chemistry B},
  volume={116},
  number={29},
  pages={8474--8484},
  year={2012},
  publisher={ACS Publications}
}

@article{flachmuller2021coarse,
  title={Coarse grained simulation of the aggregation and structure control of polyethylene nanocrystals},
  author={Flachm{\"u}ller, Alexander and Mecking, Stefan and Peter, Christine},
  journal={Journal of Physics: Condensed Matter},
  volume={33},
  number={26},
  pages={264001},
  year={2021},
  publisher={IOP Publishing}
}

Come back later to check out this part of the tutorial.

[ ]:

Saving trajectory and CVs into one file#

A trajectory can (with its CVs) be saved as one comprehensive file with the save() method. What’s more: Loading such a file again makes it possible to access any frames and their corresponding CVs almost instantaneously.

[84]:
test_data = Path(em.__file__).parent.parent / "tests/data"
traj = em.SingleTraj(test_data / '1am7_corrected.xtc', test_data / '1am7_protein.pdb')
traj.load_CV('all')
traj.save('1am7_all_CVs.h5', overwrite=True)
[85]:
new_traj = em.SingleTraj('1am7_all_CVs.h5')
frames = new_traj[[0, 5, 20, 35]]
frames.central_cartesians.shape
[85]:
(4, 474, 3)
[86]:
frames = new_traj[::5]
print(frames.central_distances.shape)
print(frames._CVs.coords['CENTRAL_DISTANCES'].values)
(11, 473)
['CENTERDISTANCE  ATOM     N:    0 MET:   1 DIST  ATOM    CA:    4 MET:   1 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:    4 MET:   1 DIST  ATOM     C:   17 MET:   1 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:   17 MET:   1 DIST  ATOM     N:   19 VAL:   2 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:   19 VAL:   2 DIST  ATOM    CA:   21 VAL:   2 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:   21 VAL:   2 DIST  ATOM     C:   33 VAL:   2 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:   33 VAL:   2 DIST  ATOM     N:   35 GLU:   3 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:   35 GLU:   3 DIST  ATOM    CA:   37 GLU:   3 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:   37 GLU:   3 DIST  ATOM     C:   48 GLU:   3 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:   48 GLU:   3 DIST  ATOM     N:   50 ILE:   4 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:   50 ILE:   4 DIST  ATOM    CA:   52 ILE:   4 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:   52 ILE:   4 DIST  ATOM     C:   67 ILE:   4 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:   67 ILE:   4 DIST  ATOM     N:   69 ASN:   5 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:   69 ASN:   5 DIST  ATOM    CA:   71 ASN:   5 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:   71 ASN:   5 DIST  ATOM     C:   81 ASN:   5 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:   81 ASN:   5 DIST  ATOM     N:   83 ASN:   6 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:   83 ASN:   6 DIST  ATOM    CA:   85 ASN:   6 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:   85 ASN:   6 DIST  ATOM     C:   95 ASN:   6 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:   95 ASN:   6 DIST  ATOM     N:   97 GLN:   7 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:   97 GLN:   7 DIST  ATOM    CA:   99 GLN:   7 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:   99 GLN:   7 DIST  ATOM     C:  112 GLN:   7 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  112 GLN:   7 DIST  ATOM     N:  114 ARG:   8 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  114 ARG:   8 DIST  ATOM    CA:  116 ARG:   8 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  116 ARG:   8 DIST  ATOM     C:  136 ARG:   8 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  136 ARG:   8 DIST  ATOM     N:  138 LYS:   9 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  138 LYS:   9 DIST  ATOM    CA:  140 LYS:   9 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  140 LYS:   9 DIST  ATOM     C:  158 LYS:   9 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  158 LYS:   9 DIST  ATOM     N:  160 ALA:  10 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  160 ALA:  10 DIST  ATOM    CA:  162 ALA:  10 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  162 ALA:  10 DIST  ATOM     C:  168 ALA:  10 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  168 ALA:  10 DIST  ATOM     N:  170 PHE:  11 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  170 PHE:  11 DIST  ATOM    CA:  172 PHE:  11 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  172 PHE:  11 DIST  ATOM     C:  188 PHE:  11 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  188 PHE:  11 DIST  ATOM     N:  190 LEU:  12 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  190 LEU:  12 DIST  ATOM    CA:  192 LEU:  12 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  192 LEU:  12 DIST  ATOM     C:  207 LEU:  12 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  207 LEU:  12 DIST  ATOM     N:  209 ASP:  13 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  209 ASP:  13 DIST  ATOM    CA:  211 ASP:  13 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  211 ASP:  13 DIST  ATOM     C:  219 ASP:  13 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  219 ASP:  13 DIST  ATOM     N:  221 MET:  14 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  221 MET:  14 DIST  ATOM    CA:  223 MET:  14 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  223 MET:  14 DIST  ATOM     C:  236 MET:  14 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  236 MET:  14 DIST  ATOM     N:  238 LEU:  15 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  238 LEU:  15 DIST  ATOM    CA:  240 LEU:  15 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  240 LEU:  15 DIST  ATOM     C:  255 LEU:  15 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  255 LEU:  15 DIST  ATOM     N:  257 ALA:  16 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  257 ALA:  16 DIST  ATOM    CA:  259 ALA:  16 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  259 ALA:  16 DIST  ATOM     C:  265 ALA:  16 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  265 ALA:  16 DIST  ATOM     N:  267 TRP:  17 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  267 TRP:  17 DIST  ATOM    CA:  269 TRP:  17 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  269 TRP:  17 DIST  ATOM     C:  289 TRP:  17 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  289 TRP:  17 DIST  ATOM     N:  291 SER:  18 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  291 SER:  18 DIST  ATOM    CA:  293 SER:  18 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  293 SER:  18 DIST  ATOM     C:  300 SER:  18 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  300 SER:  18 DIST  ATOM     N:  302 GLU:  19 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  302 GLU:  19 DIST  ATOM    CA:  304 GLU:  19 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  304 GLU:  19 DIST  ATOM     C:  315 GLU:  19 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  315 GLU:  19 DIST  ATOM     N:  317 GLY:  20 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  317 GLY:  20 DIST  ATOM    CA:  319 GLY:  20 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  319 GLY:  20 DIST  ATOM     C:  322 GLY:  20 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  322 GLY:  20 DIST  ATOM     N:  324 THR:  21 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  324 THR:  21 DIST  ATOM    CA:  326 THR:  21 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  326 THR:  21 DIST  ATOM     C:  336 THR:  21 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  336 THR:  21 DIST  ATOM     N:  338 ASP:  22 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  338 ASP:  22 DIST  ATOM    CA:  340 ASP:  22 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  340 ASP:  22 DIST  ATOM     C:  348 ASP:  22 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  348 ASP:  22 DIST  ATOM     N:  350 ASN:  23 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  350 ASN:  23 DIST  ATOM    CA:  352 ASN:  23 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  352 ASN:  23 DIST  ATOM     C:  362 ASN:  23 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  362 ASN:  23 DIST  ATOM     N:  364 GLY:  24 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  364 GLY:  24 DIST  ATOM    CA:  366 GLY:  24 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  366 GLY:  24 DIST  ATOM     C:  369 GLY:  24 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  369 GLY:  24 DIST  ATOM     N:  371 ARG:  25 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  371 ARG:  25 DIST  ATOM    CA:  373 ARG:  25 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  373 ARG:  25 DIST  ATOM     C:  393 ARG:  25 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  393 ARG:  25 DIST  ATOM     N:  395 GLN:  26 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  395 GLN:  26 DIST  ATOM    CA:  397 GLN:  26 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  397 GLN:  26 DIST  ATOM     C:  410 GLN:  26 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  410 GLN:  26 DIST  ATOM     N:  412 LYS:  27 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  412 LYS:  27 DIST  ATOM    CA:  414 LYS:  27 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  414 LYS:  27 DIST  ATOM     C:  432 LYS:  27 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  432 LYS:  27 DIST  ATOM     N:  434 THR:  28 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  434 THR:  28 DIST  ATOM    CA:  436 THR:  28 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  436 THR:  28 DIST  ATOM     C:  446 THR:  28 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  446 THR:  28 DIST  ATOM     N:  448 ARG:  29 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  448 ARG:  29 DIST  ATOM    CA:  450 ARG:  29 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  450 ARG:  29 DIST  ATOM     C:  470 ARG:  29 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  470 ARG:  29 DIST  ATOM     N:  472 ASN:  30 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  472 ASN:  30 DIST  ATOM    CA:  474 ASN:  30 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  474 ASN:  30 DIST  ATOM     C:  484 ASN:  30 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  484 ASN:  30 DIST  ATOM     N:  486 HIS:  31 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  486 HIS:  31 DIST  ATOM    CA:  488 HIS:  31 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  488 HIS:  31 DIST  ATOM     C:  501 HIS:  31 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  501 HIS:  31 DIST  ATOM     N:  503 GLY:  32 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  503 GLY:  32 DIST  ATOM    CA:  505 GLY:  32 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  505 GLY:  32 DIST  ATOM     C:  508 GLY:  32 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  508 GLY:  32 DIST  ATOM     N:  510 TYR:  33 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  510 TYR:  33 DIST  ATOM    CA:  512 TYR:  33 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  512 TYR:  33 DIST  ATOM     C:  529 TYR:  33 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  529 TYR:  33 DIST  ATOM     N:  531 ASP:  34 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  531 ASP:  34 DIST  ATOM    CA:  533 ASP:  34 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  533 ASP:  34 DIST  ATOM     C:  541 ASP:  34 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  541 ASP:  34 DIST  ATOM     N:  543 VAL:  35 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  543 VAL:  35 DIST  ATOM    CA:  545 VAL:  35 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  545 VAL:  35 DIST  ATOM     C:  557 VAL:  35 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  557 VAL:  35 DIST  ATOM     N:  559 ILE:  36 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  559 ILE:  36 DIST  ATOM    CA:  561 ILE:  36 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  561 ILE:  36 DIST  ATOM     C:  576 ILE:  36 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  576 ILE:  36 DIST  ATOM     N:  578 VAL:  37 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  578 VAL:  37 DIST  ATOM    CA:  580 VAL:  37 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  580 VAL:  37 DIST  ATOM     C:  592 VAL:  37 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  592 VAL:  37 DIST  ATOM     N:  594 GLY:  38 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  594 GLY:  38 DIST  ATOM    CA:  596 GLY:  38 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  596 GLY:  38 DIST  ATOM     C:  599 GLY:  38 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  599 GLY:  38 DIST  ATOM     N:  601 GLY:  39 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  601 GLY:  39 DIST  ATOM    CA:  603 GLY:  39 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  603 GLY:  39 DIST  ATOM     C:  606 GLY:  39 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  606 GLY:  39 DIST  ATOM     N:  608 GLU:  40 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  608 GLU:  40 DIST  ATOM    CA:  610 GLU:  40 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  610 GLU:  40 DIST  ATOM     C:  621 GLU:  40 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  621 GLU:  40 DIST  ATOM     N:  623 LEU:  41 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  623 LEU:  41 DIST  ATOM    CA:  625 LEU:  41 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  625 LEU:  41 DIST  ATOM     C:  640 LEU:  41 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  640 LEU:  41 DIST  ATOM     N:  642 PHE:  42 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  642 PHE:  42 DIST  ATOM    CA:  644 PHE:  42 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  644 PHE:  42 DIST  ATOM     C:  660 PHE:  42 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  660 PHE:  42 DIST  ATOM     N:  662 THR:  43 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  662 THR:  43 DIST  ATOM    CA:  664 THR:  43 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  664 THR:  43 DIST  ATOM     C:  674 THR:  43 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  674 THR:  43 DIST  ATOM     N:  676 ASP:  44 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  676 ASP:  44 DIST  ATOM    CA:  678 ASP:  44 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  678 ASP:  44 DIST  ATOM     C:  686 ASP:  44 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  686 ASP:  44 DIST  ATOM     N:  688 TYR:  45 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  688 TYR:  45 DIST  ATOM    CA:  690 TYR:  45 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  690 TYR:  45 DIST  ATOM     C:  707 TYR:  45 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  707 TYR:  45 DIST  ATOM     N:  709 SER:  46 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  709 SER:  46 DIST  ATOM    CA:  711 SER:  46 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  711 SER:  46 DIST  ATOM     C:  718 SER:  46 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  718 SER:  46 DIST  ATOM     N:  720 ASP:  47 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  720 ASP:  47 DIST  ATOM    CA:  722 ASP:  47 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  722 ASP:  47 DIST  ATOM     C:  730 ASP:  47 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  730 ASP:  47 DIST  ATOM     N:  732 HIS:  48 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  732 HIS:  48 DIST  ATOM    CA:  734 HIS:  48 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  734 HIS:  48 DIST  ATOM     C:  747 HIS:  48 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  747 HIS:  48 DIST  ATOM     N:  749 PRO:  49 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  749 PRO:  49 DIST  ATOM    CA:  759 PRO:  49 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  759 PRO:  49 DIST  ATOM     C:  761 PRO:  49 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  761 PRO:  49 DIST  ATOM     N:  763 ARG:  50 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  763 ARG:  50 DIST  ATOM    CA:  765 ARG:  50 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  765 ARG:  50 DIST  ATOM     C:  785 ARG:  50 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  785 ARG:  50 DIST  ATOM     N:  787 LYS:  51 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  787 LYS:  51 DIST  ATOM    CA:  789 LYS:  51 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  789 LYS:  51 DIST  ATOM     C:  807 LYS:  51 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  807 LYS:  51 DIST  ATOM     N:  809 LEU:  52 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  809 LEU:  52 DIST  ATOM    CA:  811 LEU:  52 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  811 LEU:  52 DIST  ATOM     C:  826 LEU:  52 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  826 LEU:  52 DIST  ATOM     N:  828 VAL:  53 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  828 VAL:  53 DIST  ATOM    CA:  830 VAL:  53 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  830 VAL:  53 DIST  ATOM     C:  842 VAL:  53 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  842 VAL:  53 DIST  ATOM     N:  844 THR:  54 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  844 THR:  54 DIST  ATOM    CA:  846 THR:  54 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  846 THR:  54 DIST  ATOM     C:  856 THR:  54 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  856 THR:  54 DIST  ATOM     N:  858 LEU:  55 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  858 LEU:  55 DIST  ATOM    CA:  860 LEU:  55 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  860 LEU:  55 DIST  ATOM     C:  875 LEU:  55 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  875 LEU:  55 DIST  ATOM     N:  877 ASN:  56 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  877 ASN:  56 DIST  ATOM    CA:  879 ASN:  56 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  879 ASN:  56 DIST  ATOM     C:  889 ASN:  56 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  889 ASN:  56 DIST  ATOM     N:  891 PRO:  57 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  891 PRO:  57 DIST  ATOM    CA:  901 PRO:  57 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  901 PRO:  57 DIST  ATOM     C:  903 PRO:  57 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  903 PRO:  57 DIST  ATOM     N:  905 LYS:  58 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  905 LYS:  58 DIST  ATOM    CA:  907 LYS:  58 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  907 LYS:  58 DIST  ATOM     C:  925 LYS:  58 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  925 LYS:  58 DIST  ATOM     N:  927 LEU:  59 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  927 LEU:  59 DIST  ATOM    CA:  929 LEU:  59 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  929 LEU:  59 DIST  ATOM     C:  944 LEU:  59 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  944 LEU:  59 DIST  ATOM     N:  946 LYS:  60 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  946 LYS:  60 DIST  ATOM    CA:  948 LYS:  60 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  948 LYS:  60 DIST  ATOM     C:  966 LYS:  60 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  966 LYS:  60 DIST  ATOM     N:  968 SER:  61 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  968 SER:  61 DIST  ATOM    CA:  970 SER:  61 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  970 SER:  61 DIST  ATOM     C:  977 SER:  61 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  977 SER:  61 DIST  ATOM     N:  979 THR:  62 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  979 THR:  62 DIST  ATOM    CA:  981 THR:  62 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  981 THR:  62 DIST  ATOM     C:  991 THR:  62 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  991 THR:  62 DIST  ATOM     N:  993 GLY:  63 CHAIN 0'
 'CENTERDISTANCE  ATOM     N:  993 GLY:  63 DIST  ATOM    CA:  995 GLY:  63 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA:  995 GLY:  63 DIST  ATOM     C:  998 GLY:  63 CHAIN 0'
 'CENTERDISTANCE  ATOM     C:  998 GLY:  63 DIST  ATOM     N: 1000 ALA:  64 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1000 ALA:  64 DIST  ATOM    CA: 1002 ALA:  64 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1002 ALA:  64 DIST  ATOM     C: 1008 ALA:  64 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1008 ALA:  64 DIST  ATOM     N: 1010 GLY:  65 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1010 GLY:  65 DIST  ATOM    CA: 1012 GLY:  65 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1012 GLY:  65 DIST  ATOM     C: 1015 GLY:  65 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1015 GLY:  65 DIST  ATOM     N: 1017 ARG:  66 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1017 ARG:  66 DIST  ATOM    CA: 1019 ARG:  66 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1019 ARG:  66 DIST  ATOM     C: 1039 ARG:  66 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1039 ARG:  66 DIST  ATOM     N: 1041 TYR:  67 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1041 TYR:  67 DIST  ATOM    CA: 1043 TYR:  67 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1043 TYR:  67 DIST  ATOM     C: 1060 TYR:  67 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1060 TYR:  67 DIST  ATOM     N: 1062 GLN:  68 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1062 GLN:  68 DIST  ATOM    CA: 1064 GLN:  68 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1064 GLN:  68 DIST  ATOM     C: 1077 GLN:  68 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1077 GLN:  68 DIST  ATOM     N: 1079 LEU:  69 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1079 LEU:  69 DIST  ATOM    CA: 1081 LEU:  69 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1081 LEU:  69 DIST  ATOM     C: 1096 LEU:  69 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1096 LEU:  69 DIST  ATOM     N: 1098 LEU:  70 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1098 LEU:  70 DIST  ATOM    CA: 1100 LEU:  70 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1100 LEU:  70 DIST  ATOM     C: 1115 LEU:  70 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1115 LEU:  70 DIST  ATOM     N: 1117 SER:  71 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1117 SER:  71 DIST  ATOM    CA: 1119 SER:  71 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1119 SER:  71 DIST  ATOM     C: 1126 SER:  71 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1126 SER:  71 DIST  ATOM     N: 1128 ARG:  72 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1128 ARG:  72 DIST  ATOM    CA: 1130 ARG:  72 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1130 ARG:  72 DIST  ATOM     C: 1150 ARG:  72 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1150 ARG:  72 DIST  ATOM     N: 1152 TRP:  73 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1152 TRP:  73 DIST  ATOM    CA: 1154 TRP:  73 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1154 TRP:  73 DIST  ATOM     C: 1174 TRP:  73 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1174 TRP:  73 DIST  ATOM     N: 1176 TRP:  74 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1176 TRP:  74 DIST  ATOM    CA: 1178 TRP:  74 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1178 TRP:  74 DIST  ATOM     C: 1198 TRP:  74 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1198 TRP:  74 DIST  ATOM     N: 1200 ASP:  75 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1200 ASP:  75 DIST  ATOM    CA: 1202 ASP:  75 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1202 ASP:  75 DIST  ATOM     C: 1210 ASP:  75 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1210 ASP:  75 DIST  ATOM     N: 1212 ALA:  76 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1212 ALA:  76 DIST  ATOM    CA: 1214 ALA:  76 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1214 ALA:  76 DIST  ATOM     C: 1220 ALA:  76 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1220 ALA:  76 DIST  ATOM     N: 1222 TYR:  77 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1222 TYR:  77 DIST  ATOM    CA: 1224 TYR:  77 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1224 TYR:  77 DIST  ATOM     C: 1241 TYR:  77 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1241 TYR:  77 DIST  ATOM     N: 1243 ARG:  78 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1243 ARG:  78 DIST  ATOM    CA: 1245 ARG:  78 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1245 ARG:  78 DIST  ATOM     C: 1265 ARG:  78 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1265 ARG:  78 DIST  ATOM     N: 1267 LYS:  79 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1267 LYS:  79 DIST  ATOM    CA: 1269 LYS:  79 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1269 LYS:  79 DIST  ATOM     C: 1287 LYS:  79 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1287 LYS:  79 DIST  ATOM     N: 1289 GLN:  80 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1289 GLN:  80 DIST  ATOM    CA: 1291 GLN:  80 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1291 GLN:  80 DIST  ATOM     C: 1304 GLN:  80 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1304 GLN:  80 DIST  ATOM     N: 1306 LEU:  81 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1306 LEU:  81 DIST  ATOM    CA: 1308 LEU:  81 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1308 LEU:  81 DIST  ATOM     C: 1323 LEU:  81 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1323 LEU:  81 DIST  ATOM     N: 1325 GLY:  82 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1325 GLY:  82 DIST  ATOM    CA: 1327 GLY:  82 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1327 GLY:  82 DIST  ATOM     C: 1330 GLY:  82 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1330 GLY:  82 DIST  ATOM     N: 1332 LEU:  83 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1332 LEU:  83 DIST  ATOM    CA: 1334 LEU:  83 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1334 LEU:  83 DIST  ATOM     C: 1349 LEU:  83 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1349 LEU:  83 DIST  ATOM     N: 1351 LYS:  84 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1351 LYS:  84 DIST  ATOM    CA: 1353 LYS:  84 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1353 LYS:  84 DIST  ATOM     C: 1371 LYS:  84 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1371 LYS:  84 DIST  ATOM     N: 1373 ASP:  85 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1373 ASP:  85 DIST  ATOM    CA: 1375 ASP:  85 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1375 ASP:  85 DIST  ATOM     C: 1383 ASP:  85 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1383 ASP:  85 DIST  ATOM     N: 1385 PHE:  86 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1385 PHE:  86 DIST  ATOM    CA: 1387 PHE:  86 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1387 PHE:  86 DIST  ATOM     C: 1403 PHE:  86 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1403 PHE:  86 DIST  ATOM     N: 1405 SER:  87 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1405 SER:  87 DIST  ATOM    CA: 1407 SER:  87 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1407 SER:  87 DIST  ATOM     C: 1414 SER:  87 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1414 SER:  87 DIST  ATOM     N: 1416 PRO:  88 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1416 PRO:  88 DIST  ATOM    CA: 1426 PRO:  88 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1426 PRO:  88 DIST  ATOM     C: 1428 PRO:  88 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1428 PRO:  88 DIST  ATOM     N: 1430 LYS:  89 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1430 LYS:  89 DIST  ATOM    CA: 1432 LYS:  89 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1432 LYS:  89 DIST  ATOM     C: 1450 LYS:  89 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1450 LYS:  89 DIST  ATOM     N: 1452 SER:  90 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1452 SER:  90 DIST  ATOM    CA: 1454 SER:  90 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1454 SER:  90 DIST  ATOM     C: 1461 SER:  90 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1461 SER:  90 DIST  ATOM     N: 1463 GLN:  91 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1463 GLN:  91 DIST  ATOM    CA: 1465 GLN:  91 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1465 GLN:  91 DIST  ATOM     C: 1478 GLN:  91 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1478 GLN:  91 DIST  ATOM     N: 1480 ASP:  92 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1480 ASP:  92 DIST  ATOM    CA: 1482 ASP:  92 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1482 ASP:  92 DIST  ATOM     C: 1490 ASP:  92 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1490 ASP:  92 DIST  ATOM     N: 1492 ALA:  93 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1492 ALA:  93 DIST  ATOM    CA: 1494 ALA:  93 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1494 ALA:  93 DIST  ATOM     C: 1500 ALA:  93 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1500 ALA:  93 DIST  ATOM     N: 1502 VAL:  94 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1502 VAL:  94 DIST  ATOM    CA: 1504 VAL:  94 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1504 VAL:  94 DIST  ATOM     C: 1516 VAL:  94 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1516 VAL:  94 DIST  ATOM     N: 1518 ALA:  95 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1518 ALA:  95 DIST  ATOM    CA: 1520 ALA:  95 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1520 ALA:  95 DIST  ATOM     C: 1526 ALA:  95 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1526 ALA:  95 DIST  ATOM     N: 1528 LEU:  96 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1528 LEU:  96 DIST  ATOM    CA: 1530 LEU:  96 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1530 LEU:  96 DIST  ATOM     C: 1545 LEU:  96 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1545 LEU:  96 DIST  ATOM     N: 1547 GLN:  97 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1547 GLN:  97 DIST  ATOM    CA: 1549 GLN:  97 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1549 GLN:  97 DIST  ATOM     C: 1562 GLN:  97 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1562 GLN:  97 DIST  ATOM     N: 1564 GLN:  98 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1564 GLN:  98 DIST  ATOM    CA: 1566 GLN:  98 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1566 GLN:  98 DIST  ATOM     C: 1579 GLN:  98 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1579 GLN:  98 DIST  ATOM     N: 1581 ILE:  99 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1581 ILE:  99 DIST  ATOM    CA: 1583 ILE:  99 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1583 ILE:  99 DIST  ATOM     C: 1598 ILE:  99 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1598 ILE:  99 DIST  ATOM     N: 1600 LYS: 100 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1600 LYS: 100 DIST  ATOM    CA: 1602 LYS: 100 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1602 LYS: 100 DIST  ATOM     C: 1620 LYS: 100 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1620 LYS: 100 DIST  ATOM     N: 1622 GLU: 101 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1622 GLU: 101 DIST  ATOM    CA: 1624 GLU: 101 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1624 GLU: 101 DIST  ATOM     C: 1635 GLU: 101 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1635 GLU: 101 DIST  ATOM     N: 1637 ARG: 102 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1637 ARG: 102 DIST  ATOM    CA: 1639 ARG: 102 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1639 ARG: 102 DIST  ATOM     C: 1659 ARG: 102 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1659 ARG: 102 DIST  ATOM     N: 1661 GLY: 103 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1661 GLY: 103 DIST  ATOM    CA: 1663 GLY: 103 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1663 GLY: 103 DIST  ATOM     C: 1666 GLY: 103 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1666 GLY: 103 DIST  ATOM     N: 1668 ALA: 104 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1668 ALA: 104 DIST  ATOM    CA: 1670 ALA: 104 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1670 ALA: 104 DIST  ATOM     C: 1676 ALA: 104 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1676 ALA: 104 DIST  ATOM     N: 1678 LEU: 105 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1678 LEU: 105 DIST  ATOM    CA: 1680 LEU: 105 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1680 LEU: 105 DIST  ATOM     C: 1695 LEU: 105 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1695 LEU: 105 DIST  ATOM     N: 1697 PRO: 106 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1697 PRO: 106 DIST  ATOM    CA: 1707 PRO: 106 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1707 PRO: 106 DIST  ATOM     C: 1709 PRO: 106 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1709 PRO: 106 DIST  ATOM     N: 1711 MET: 107 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1711 MET: 107 DIST  ATOM    CA: 1713 MET: 107 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1713 MET: 107 DIST  ATOM     C: 1726 MET: 107 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1726 MET: 107 DIST  ATOM     N: 1728 ILE: 108 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1728 ILE: 108 DIST  ATOM    CA: 1730 ILE: 108 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1730 ILE: 108 DIST  ATOM     C: 1745 ILE: 108 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1745 ILE: 108 DIST  ATOM     N: 1747 ASP: 109 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1747 ASP: 109 DIST  ATOM    CA: 1749 ASP: 109 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1749 ASP: 109 DIST  ATOM     C: 1757 ASP: 109 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1757 ASP: 109 DIST  ATOM     N: 1759 ARG: 110 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1759 ARG: 110 DIST  ATOM    CA: 1761 ARG: 110 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1761 ARG: 110 DIST  ATOM     C: 1781 ARG: 110 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1781 ARG: 110 DIST  ATOM     N: 1783 GLY: 111 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1783 GLY: 111 DIST  ATOM    CA: 1785 GLY: 111 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1785 GLY: 111 DIST  ATOM     C: 1788 GLY: 111 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1788 GLY: 111 DIST  ATOM     N: 1790 ASP: 112 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1790 ASP: 112 DIST  ATOM    CA: 1792 ASP: 112 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1792 ASP: 112 DIST  ATOM     C: 1800 ASP: 112 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1800 ASP: 112 DIST  ATOM     N: 1802 ILE: 113 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1802 ILE: 113 DIST  ATOM    CA: 1804 ILE: 113 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1804 ILE: 113 DIST  ATOM     C: 1819 ILE: 113 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1819 ILE: 113 DIST  ATOM     N: 1821 ARG: 114 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1821 ARG: 114 DIST  ATOM    CA: 1823 ARG: 114 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1823 ARG: 114 DIST  ATOM     C: 1843 ARG: 114 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1843 ARG: 114 DIST  ATOM     N: 1845 GLN: 115 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1845 GLN: 115 DIST  ATOM    CA: 1847 GLN: 115 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1847 GLN: 115 DIST  ATOM     C: 1860 GLN: 115 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1860 GLN: 115 DIST  ATOM     N: 1862 ALA: 116 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1862 ALA: 116 DIST  ATOM    CA: 1864 ALA: 116 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1864 ALA: 116 DIST  ATOM     C: 1870 ALA: 116 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1870 ALA: 116 DIST  ATOM     N: 1872 ILE: 117 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1872 ILE: 117 DIST  ATOM    CA: 1874 ILE: 117 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1874 ILE: 117 DIST  ATOM     C: 1889 ILE: 117 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1889 ILE: 117 DIST  ATOM     N: 1891 ASP: 118 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1891 ASP: 118 DIST  ATOM    CA: 1893 ASP: 118 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1893 ASP: 118 DIST  ATOM     C: 1901 ASP: 118 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1901 ASP: 118 DIST  ATOM     N: 1903 ARG: 119 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1903 ARG: 119 DIST  ATOM    CA: 1905 ARG: 119 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1905 ARG: 119 DIST  ATOM     C: 1925 ARG: 119 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1925 ARG: 119 DIST  ATOM     N: 1927 CYS: 120 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1927 CYS: 120 DIST  ATOM    CA: 1929 CYS: 120 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1929 CYS: 120 DIST  ATOM     C: 1936 CYS: 120 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1936 CYS: 120 DIST  ATOM     N: 1938 SER: 121 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1938 SER: 121 DIST  ATOM    CA: 1940 SER: 121 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1940 SER: 121 DIST  ATOM     C: 1947 SER: 121 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1947 SER: 121 DIST  ATOM     N: 1949 ASN: 122 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1949 ASN: 122 DIST  ATOM    CA: 1951 ASN: 122 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1951 ASN: 122 DIST  ATOM     C: 1961 ASN: 122 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1961 ASN: 122 DIST  ATOM     N: 1963 ILE: 123 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1963 ILE: 123 DIST  ATOM    CA: 1965 ILE: 123 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1965 ILE: 123 DIST  ATOM     C: 1980 ILE: 123 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 1980 ILE: 123 DIST  ATOM     N: 1982 TRP: 124 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 1982 TRP: 124 DIST  ATOM    CA: 1984 TRP: 124 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 1984 TRP: 124 DIST  ATOM     C: 2004 TRP: 124 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2004 TRP: 124 DIST  ATOM     N: 2006 ALA: 125 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2006 ALA: 125 DIST  ATOM    CA: 2008 ALA: 125 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2008 ALA: 125 DIST  ATOM     C: 2014 ALA: 125 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2014 ALA: 125 DIST  ATOM     N: 2016 SER: 126 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2016 SER: 126 DIST  ATOM    CA: 2018 SER: 126 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2018 SER: 126 DIST  ATOM     C: 2025 SER: 126 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2025 SER: 126 DIST  ATOM     N: 2027 LEU: 127 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2027 LEU: 127 DIST  ATOM    CA: 2029 LEU: 127 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2029 LEU: 127 DIST  ATOM     C: 2044 LEU: 127 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2044 LEU: 127 DIST  ATOM     N: 2046 PRO: 128 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2046 PRO: 128 DIST  ATOM    CA: 2056 PRO: 128 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2056 PRO: 128 DIST  ATOM     C: 2058 PRO: 128 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2058 PRO: 128 DIST  ATOM     N: 2060 GLY: 129 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2060 GLY: 129 DIST  ATOM    CA: 2062 GLY: 129 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2062 GLY: 129 DIST  ATOM     C: 2065 GLY: 129 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2065 GLY: 129 DIST  ATOM     N: 2067 ALA: 130 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2067 ALA: 130 DIST  ATOM    CA: 2069 ALA: 130 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2069 ALA: 130 DIST  ATOM     C: 2075 ALA: 130 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2075 ALA: 130 DIST  ATOM     N: 2077 GLY: 131 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2077 GLY: 131 DIST  ATOM    CA: 2079 GLY: 131 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2079 GLY: 131 DIST  ATOM     C: 2082 GLY: 131 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2082 GLY: 131 DIST  ATOM     N: 2084 TYR: 132 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2084 TYR: 132 DIST  ATOM    CA: 2086 TYR: 132 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2086 TYR: 132 DIST  ATOM     C: 2103 TYR: 132 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2103 TYR: 132 DIST  ATOM     N: 2105 GLY: 133 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2105 GLY: 133 DIST  ATOM    CA: 2107 GLY: 133 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2107 GLY: 133 DIST  ATOM     C: 2110 GLY: 133 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2110 GLY: 133 DIST  ATOM     N: 2112 GLN: 134 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2112 GLN: 134 DIST  ATOM    CA: 2114 GLN: 134 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2114 GLN: 134 DIST  ATOM     C: 2127 GLN: 134 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2127 GLN: 134 DIST  ATOM     N: 2129 PHE: 135 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2129 PHE: 135 DIST  ATOM    CA: 2131 PHE: 135 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2131 PHE: 135 DIST  ATOM     C: 2147 PHE: 135 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2147 PHE: 135 DIST  ATOM     N: 2149 GLU: 136 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2149 GLU: 136 DIST  ATOM    CA: 2151 GLU: 136 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2151 GLU: 136 DIST  ATOM     C: 2162 GLU: 136 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2162 GLU: 136 DIST  ATOM     N: 2164 HIS: 137 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2164 HIS: 137 DIST  ATOM    CA: 2166 HIS: 137 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2166 HIS: 137 DIST  ATOM     C: 2179 HIS: 137 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2179 HIS: 137 DIST  ATOM     N: 2181 LYS: 138 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2181 LYS: 138 DIST  ATOM    CA: 2183 LYS: 138 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2183 LYS: 138 DIST  ATOM     C: 2201 LYS: 138 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2201 LYS: 138 DIST  ATOM     N: 2203 ALA: 139 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2203 ALA: 139 DIST  ATOM    CA: 2205 ALA: 139 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2205 ALA: 139 DIST  ATOM     C: 2211 ALA: 139 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2211 ALA: 139 DIST  ATOM     N: 2213 ASP: 140 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2213 ASP: 140 DIST  ATOM    CA: 2215 ASP: 140 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2215 ASP: 140 DIST  ATOM     C: 2223 ASP: 140 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2223 ASP: 140 DIST  ATOM     N: 2225 SER: 141 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2225 SER: 141 DIST  ATOM    CA: 2227 SER: 141 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2227 SER: 141 DIST  ATOM     C: 2234 SER: 141 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2234 SER: 141 DIST  ATOM     N: 2236 LEU: 142 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2236 LEU: 142 DIST  ATOM    CA: 2238 LEU: 142 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2238 LEU: 142 DIST  ATOM     C: 2253 LEU: 142 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2253 LEU: 142 DIST  ATOM     N: 2255 ILE: 143 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2255 ILE: 143 DIST  ATOM    CA: 2257 ILE: 143 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2257 ILE: 143 DIST  ATOM     C: 2272 ILE: 143 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2272 ILE: 143 DIST  ATOM     N: 2274 ALA: 144 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2274 ALA: 144 DIST  ATOM    CA: 2276 ALA: 144 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2276 ALA: 144 DIST  ATOM     C: 2282 ALA: 144 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2282 ALA: 144 DIST  ATOM     N: 2284 LYS: 145 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2284 LYS: 145 DIST  ATOM    CA: 2286 LYS: 145 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2286 LYS: 145 DIST  ATOM     C: 2304 LYS: 145 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2304 LYS: 145 DIST  ATOM     N: 2306 PHE: 146 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2306 PHE: 146 DIST  ATOM    CA: 2308 PHE: 146 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2308 PHE: 146 DIST  ATOM     C: 2324 PHE: 146 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2324 PHE: 146 DIST  ATOM     N: 2326 LYS: 147 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2326 LYS: 147 DIST  ATOM    CA: 2328 LYS: 147 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2328 LYS: 147 DIST  ATOM     C: 2346 LYS: 147 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2346 LYS: 147 DIST  ATOM     N: 2348 GLU: 148 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2348 GLU: 148 DIST  ATOM    CA: 2350 GLU: 148 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2350 GLU: 148 DIST  ATOM     C: 2361 GLU: 148 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2361 GLU: 148 DIST  ATOM     N: 2363 ALA: 149 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2363 ALA: 149 DIST  ATOM    CA: 2365 ALA: 149 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2365 ALA: 149 DIST  ATOM     C: 2371 ALA: 149 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2371 ALA: 149 DIST  ATOM     N: 2373 GLY: 150 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2373 GLY: 150 DIST  ATOM    CA: 2375 GLY: 150 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2375 GLY: 150 DIST  ATOM     C: 2378 GLY: 150 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2378 GLY: 150 DIST  ATOM     N: 2380 GLY: 151 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2380 GLY: 151 DIST  ATOM    CA: 2382 GLY: 151 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2382 GLY: 151 DIST  ATOM     C: 2385 GLY: 151 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2385 GLY: 151 DIST  ATOM     N: 2387 THR: 152 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2387 THR: 152 DIST  ATOM    CA: 2389 THR: 152 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2389 THR: 152 DIST  ATOM     C: 2399 THR: 152 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2399 THR: 152 DIST  ATOM     N: 2401 VAL: 153 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2401 VAL: 153 DIST  ATOM    CA: 2403 VAL: 153 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2403 VAL: 153 DIST  ATOM     C: 2415 VAL: 153 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2415 VAL: 153 DIST  ATOM     N: 2417 ARG: 154 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2417 ARG: 154 DIST  ATOM    CA: 2419 ARG: 154 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2419 ARG: 154 DIST  ATOM     C: 2439 ARG: 154 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2439 ARG: 154 DIST  ATOM     N: 2441 GLU: 155 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2441 GLU: 155 DIST  ATOM    CA: 2443 GLU: 155 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2443 GLU: 155 DIST  ATOM     C: 2454 GLU: 155 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2454 GLU: 155 DIST  ATOM     N: 2456 ILE: 156 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2456 ILE: 156 DIST  ATOM    CA: 2458 ILE: 156 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2458 ILE: 156 DIST  ATOM     C: 2473 ILE: 156 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2473 ILE: 156 DIST  ATOM     N: 2475 ASP: 157 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2475 ASP: 157 DIST  ATOM    CA: 2477 ASP: 157 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2477 ASP: 157 DIST  ATOM     C: 2485 ASP: 157 CHAIN 0'
 'CENTERDISTANCE  ATOM     C: 2485 ASP: 157 DIST  ATOM     N: 2487 VAL: 158 CHAIN 0'
 'CENTERDISTANCE  ATOM     N: 2487 VAL: 158 DIST  ATOM    CA: 2489 VAL: 158 CHAIN 0'
 'CENTERDISTANCE  ATOM    CA: 2489 VAL: 158 DIST  ATOM     C: 2501 VAL: 158 CHAIN 0']

However, CVs are deleted, when the number of atoms is altered.

[87]:
frames.atom_slice(frames.select('name CA'))
/home/kevin/git/encoder_map_private/encodermap/trajinfo/info_single.py:1691: UserWarning:

Dropping CVs from trajectory. Slicing CVs with this method is currently not possible. Raise an issue if you want to have this feature added.

[88]:
frames.CVs
[88]:
{}

Saving a complete trajectory Ensemble#

The same is possible with a complete TrajEnsemble. We can just save and load it after we have introduced all required collective variables.

A TrajEnsemble can be easily saved and loaded.

[89]:
trajs.save("pASP_pGLU_ensemble.h5", overwrite=True)

The file size is rather large with around 50 MB.

[90]:
Path("pASP_pGLU_ensemble.h5").stat().st_size / 1024 / 1024
[90]:
50.07163143157959

And reloaded

[91]:
loaded_trajs = em.TrajEnsemble.from_dataset("pASP_pGLU_ensemble.h5")
loaded_trajs._CVs
[91]:
<xarray.Dataset>
Dimensions:                             (ATOM: 30, ATOM_NO: 4,
                                         CENTRAL_ANGLES: 28,
                                         CENTRAL_DIHEDRALS: 27,
                                         CENTRAL_DISTANCES: 29, COORDS: 3,
                                         SIDE_DIHEDRALS: 28, traj_num: 7,
                                         frame_num: 5001, traj_name: 1)
Coordinates:
  * ATOM                                (ATOM) <U2 '1' '2' '3' ... '29' '30'
  * ATOM_NO                             (ATOM_NO) int64 0 1 2 3
  * CENTRAL_ANGLES                      (CENTRAL_ANGLES) <U18 'CENTERANGLE   ...
  * CENTRAL_DIHEDRALS                   (CENTRAL_DIHEDRALS) <U18 'CENTERDIH P...
  * CENTRAL_DISTANCES                   (CENTRAL_DISTANCES) <U18 'CENTERDISTA...
  * COORDS                              (COORDS) <U10 'POSITION X' ... 'POSIT...
  * SIDE_DIHEDRALS                      (SIDE_DIHEDRALS) <U18 'SIDECHDIH CHI1...
  * frame_num                           (frame_num) int64 0 1 2 ... 4999 5000
    time                                (traj_num, frame_num) float32 0.0 ......
  * traj_num                            (traj_num) int64 0 1 2 3 4 5 6
  * traj_name                           (traj_name) <U18 'pASP_pGLU_ensemble'
Data variables:
    central_angles                      (traj_num, frame_num, CENTRAL_ANGLES) float32 ...
    central_angles_feature_indices      (traj_num, CENTRAL_ANGLES, ATOM_NO) float64 ...
    central_cartesians                  (traj_num, frame_num, ATOM, COORDS) float32 ...
    central_cartesians_feature_indices  (traj_num, ATOM) float64 0.0 3.0 ... nan
    central_dihedrals                   (traj_num, frame_num, CENTRAL_DIHEDRALS) float32 ...
    central_dihedrals_feature_indices   (traj_num, CENTRAL_DIHEDRALS, ATOM_NO) float64 ...
    central_distances                   (traj_num, frame_num, CENTRAL_DISTANCES) float32 ...
    central_distances_feature_indices   (traj_num, CENTRAL_DISTANCES, ATOM_NO) float64 ...
    side_dihedrals                      (traj_num, frame_num, SIDE_DIHEDRALS) float32 ...
    side_dihedrals_feature_indices      (traj_num, SIDE_DIHEDRALS, ATOM_NO) float64 ...
Attributes:
    angle_units:     rad
    feature_axes:    ['SIDE_DIHEDRALS', 'ATOM', 'CENTRAL_DIHEDRALS', 'CENTRAL...
    full_paths:      ['/home/kevin/git/encoder_map_private/tests/data/pASP_pG...
    length_units:    nm
    time_units:      ps
    topology_files:  ['/home/kevin/git/encoder_map_private/tests/data/pASP_pG...

Conclusion#

Thinking of collections of trajectories describing the dynamic phenomena of a protein as an ensemble opens up new ways of working and understanding MD data. EncoderMap’s autoencoder-based dimensionality reduction builds on this framework by using neural networks to reduce the dimensionality of a input feature space. This input feature space and thus the low-dimensional representation also, contain ensemble statistics of the protein or the family of proteins we will look at.

Using the SingleTraj and TrajEnsemble classes can help with the dimensionality reduction, but with much more of the book-keeping a computational chemist has to do. We hope the tools presented in this notebook can help you in tackling the itering protein dynamic questions you are faced with.