encodermap.loading package#

Submodules#

encodermap.loading.dask_featurizer module#

encodermap.loading.delayed module#

encodermap.loading.features module#

Classes to be used as custom features with PyEMMA's add_custom_feature.

Todo

  • Write tests

  • Put the describe_last_feats function into utils.

  • Add Nan feature.

  • Write Examples.

class encodermap.loading.features.AllBondDistances(*args, **kwargs)[source]#

Bases: DistanceFeature

Feature that collects all bonds in a topology.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

prefix_label#

A prefix for the labels. In this case it is ‘DISTANCE’.

Type:

str

__serialize_fields = ('distance_indexes', 'periodic')#

attribute names to serialize

__serialize_version = 0#

version of class definition

describe()[source]#

Returns a list of labels, that can be used to unambiguously define atoms in the protein topology.

Returns:

A list of labels. This list has as many entries as atoms in self.top.

Return type:

list[str]

generic_describe()[source]#

property indexes#

A numpy array of shape (n_distances, 2) giving the atom indices of the distances to be calculated.

Type:

np.ndarray

property name#

The name of the class: “AllBondDistances”.

Type:

str

prefix_label = 'DISTANCE        '#

class encodermap.loading.features.AllCartesians(*args, **kwargs)[source]#

Bases: SelectionFeature

Feature that collects the Cartesian positions of all atoms in the trajectory.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

prefix_label#

A prefix for the labels. In this case it is ‘POSITION’.

Type:

str

__init__(top)[source]#

Instantiate the AllCartesians class.

Parameters:

top (mdtraj.Topology) – An mdtraj topology.

__serialize_fields = ('indexes',)#

attribute names to serialize

__serialize_version = 0#

version of class definition

describe()[source]#

Returns a list of labels, that can be used to unambiguously define atoms in the protein topology.

Returns:

A list of labels. This list has as many entries as atoms in self.top.

Return type:

list[str]

property name#

The name of this class: ‘AllCartesians’

Type:

str

prefix_label = 'POSITION '#

class encodermap.loading.features.CentralAngles(*args, **kwargs)[source]#

Bases: AngleFeature

Feature that collects all angles in the backbone of a topology.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

prefix_label#

A prefix for the labels. In this case it is ‘CENTERANGLE’.

Type:

str

__serialize_fields = ('angle_indexes', 'deg', 'cossin', 'periodic')#

attribute names to serialize

__serialize_version = 0#

version of class definition

describe()[source]#

Returns a list of labels, that can be used to unambiguously define atoms in the protein topology.

Returns:

A list of labels. This list has as many entries as atoms in self.top.

Return type:

list[str]

generic_describe()[source]#

property indexes#

A (n_angles, 3) shaped numpy array giving the atom indices of the angles to be calculated.

Type:

np.ndarray

property name#

The name of the class: “CentralAngles”.

Type:

str

prefix_label = 'CENTERANGLE '#

class encodermap.loading.features.CentralBondDistances(*args, **kwargs)[source]#

Bases: AllBondDistances

Feature that collects all bonds in the backbone of a topology.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

prefix_label#

A prefix for the labels. In this case it is ‘CENTERDISTANCE’.

Type:

str

__serialize_fields = ('distance_indexes', 'periodic')#

attribute names to serialize

__serialize_version = 0#

version of class definition

property indexes#

A numpy array of shape (n_distances, 2) giving the atom indices of the distances to be calculated.

Type:

np.ndarray

property name#

The name of the class: “CentralBondDistances”.

Type:

str

prefix_label = 'CENTERDISTANCE  '#

class encodermap.loading.features.CentralCartesians(*args, **kwargs)[source]#

Bases: AllCartesians

Feature that collects the Cartesian positions of the backbone atoms.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

prefix_label#

A prefix for the labels. In this case it is ‘CENTERPOS’.

Type:

str

__serialize_fields = ('indexes',)#

attribute names to serialize

__serialize_version = 0#

version of class definition

describe()[source]#

Returns a list of labels, that can be used to unambiguously define atoms in the protein topology.

Returns:

A list of labels. This list has as many entries as atoms in self.top.

Return type:

list[str]

generic_describe()[source]#

property name#

The name of the class: “CentralCartesians”.

Type:

str

prefix_label = 'CENTERPOS'#

class encodermap.loading.features.CentralDihedrals(*args, **kwargs)[source]#

Bases: DihedralFeature

Feature that collects all dihedrals in the backbone of a topology.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

__init__(topology, selstr=None, deg=False, cossin=False, periodic=True, omega=True, generic_labels=False)[source]#

Instantiate this feature class.

Parameters:
  • topology (mdtraj.Topology) – A topology to build features from.

  • selstr (Optional[str]) – A string that limits the selection of dihedral angles. Only dihedral angles whose atoms are matched by selstr are considered. The selection string follows MDTraj’s atom selection language: https://mdtraj.org/1.9.3/atom_selection.html. Can also be None, in which case all backbone dihedrals (including omega) are considered. Defaults to None.

  • deg (bool) – Whether to return the result in degrees (deg=True) or in radians (deg=False). Defaults to False (radians).

  • cossin (bool) – If True, each angle will be returned as a pair of (sin(x), cos(x)). This is useful if you calculate the mean (e.g., TICA/PCA, clustering) in that space. Defaults to False.

  • periodic (bool) – Whether to recognize periodic boundary conditions and work under the minimum image convention. Defaults to True.
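The effect of the cossin option can be illustrated with a small sketch that uses only the Python standard library. It is not EncoderMap's implementation, and the helper names cossin_encode and circular_mean are hypothetical. It only shows why averaging (sin(x), cos(x)) pairs is more robust than averaging raw angles close to the ±180° wrap-around.

```python
import math


def cossin_encode(angles):
    """Map each angle (in radians) to a (sin(x), cos(x)) pair."""
    return [(math.sin(a), math.cos(a)) for a in angles]


def circular_mean(angles):
    """Average the angles in (sin, cos) space and map the result back to an angle."""
    pairs = cossin_encode(angles)
    mean_sin = sum(s for s, _ in pairs) / len(pairs)
    mean_cos = sum(c for _, c in pairs) / len(pairs)
    return math.atan2(mean_sin, mean_cos)


# Two angles on either side of the periodic boundary at 180 degrees.
angles = [math.radians(179.0), math.radians(-179.0)]

# The arithmetic mean lands at 0 rad, on the opposite side of the circle.
naive_mean = sum(angles) / len(angles)

# The mean taken in (sin, cos) space stays at the boundary (magnitude pi).
wrapped_mean = circular_mean(angles)
```

In this sketch, naive_mean points in the opposite direction of both inputs, whereas wrapped_mean stays next to them. This is the kind of situation in which a (sin, cos) representation is typically preferred.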

__serialize_fields = ('selstr', '_phi_inds', '_psi_inds', '_omega_inds')#

attribute names to serialize

__serialize_version = 0#

version of class definition

property dask_transform#

describe()[source]#

Returns a list of labels, that can be used to unambiguously define atoms in the protein topology.

Returns:

A list of labels. This list has as many entries as atoms in self.top.

Return type:

list[str]

generic_describe()[source]#

Returns a list of generic labels that do not contain residue names. These can be used to stack features computed from different topologies.

Returns:

A list of labels.

Return type:

list[str]
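One way to picture stacking by generic labels is the following standard-library sketch. It is an assumption about how such labels could be used, not code taken from EncoderMap. The function stack_by_label and the label strings are hypothetical, and the actual label format returned by generic_describe() is not shown in this document.

```python
import math


def stack_by_label(feature_sets):
    """Align feature rows from different topologies on a shared label axis.

    feature_sets is a list of (labels, values) tuples. Features that are
    missing for a given topology are filled with NaN.
    """
    # Collect the union of all labels, keeping first-seen order.
    all_labels = []
    for labels, _ in feature_sets:
        for label in labels:
            if label not in all_labels:
                all_labels.append(label)

    # Build one row per topology, padded with NaN where a label is absent.
    rows = []
    for labels, values in feature_sets:
        lookup = dict(zip(labels, values))
        rows.append([lookup.get(label, math.nan) for label in all_labels])
    return all_labels, rows


# Hypothetical generic labels for a shorter and a longer peptide.
short_peptide = (["PSI 1", "PHI 2"], [1.0, -1.2])
long_peptide = (["PSI 1", "PHI 2", "PSI 2", "PHI 3"], [0.9, -1.1, 2.4, -0.7])

labels, rows = stack_by_label([short_peptide, long_peptide])
```

Because the labels carry no residue names, the same label can refer to the equivalent dihedral in both peptides, and the shorter peptide simply receives NaN entries for the dihedrals it lacks.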

property indexes#

A (n_angles, 4) shaped numpy array giving the atom indices of the dihedral angles to be calculated.

Type:

np.ndarray
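To make the role of an (n_angles, 4) index array concrete, here is a minimal standard-library sketch that computes one dihedral angle per row of four atom indices. It uses the common atan2 formulation and is not the code used by EncoderMap or MDTraj. The coordinates, the dihedral helper, and the sign convention are assumptions for illustration only.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Return the dihedral angle (radians) defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.atan2(y, x)


# Made-up coordinates of five atoms in a single frame.
xyz = [
    (1.0, 0.0, 0.0),
    (0.0, 0.0, 0.0),
    (0.0, 0.0, 1.0),
    (1.0, 0.0, 1.0),
    (0.0, 1.0, 1.0),
]

# Each row holds the four atom indices of one dihedral, mimicking an
# (n_angles, 4) index array.
indexes = [(0, 1, 2, 3), (0, 1, 2, 4)]

angles = [dihedral(*(xyz[i] for i in row)) for row in indexes]
```

The first row describes a planar cis arrangement and the second a quarter turn around the central bond, so the two rows yield angles of magnitude 0 and pi/2 in this sketch.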

property name#

The name of the class: “CentralDihedrals”.

Type:

str

class encodermap.loading.features.SideChainAngles(*args, **kwargs)[source]#

Bases: AngleFeature

Feature that collects all angles not in the backbone of a topology.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

prefix_label#

A prefix for the labels. In this case it is ‘SIDECHANGLE’.

Type:

str

__serialize_fields = ('angle_indexes', 'deg', 'cossin', 'periodic')#

attribute names to serialize

__serialize_version = 0#

version of class definition

describe()[source]#

Returns a list of labels, that can be used to unambiguously define atoms in the protein topology.

Returns:

A list of labels. This list has as many entries as atoms in self.top.

Return type:

list[str]

property indexes#

A (n_angles, 3) shaped numpy array giving the atom indices of the angles to be calculated.

Type:

np.ndarray

property name#

The name of the class: “SideChainAngles”.

Type:

str

prefix_label = 'SIDECHANGLE '#

class encodermap.loading.features.SideChainBondDistances(*args, **kwargs)[source]#

Bases: AllBondDistances

Feature that collects all bonds not in the backbone of a topology.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

prefix_label#

A prefix for the labels. In this case it is ‘SIDECHDISTANCE’.

Type:

str

__serialize_fields = ('distance_indexes', 'periodic')#

attribute names to serialize

__serialize_version = 0#

version of class definition

property indexes#

A numpy array of shape (n_distances, 2) giving the atom indices of the distances to be calculated.

Type:

np.ndarray

property name#

The name of the class: “SideChainBondDistances”.

Type:

str

prefix_label = 'SIDECHDISTANCE  '#

class encodermap.loading.features.SideChainCartesians(*args, **kwargs)[source]#

Bases: AllCartesians

Feature that collects the Cartesian positions of all non-backbone atoms.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

prefix_label#

A prefix for the labels. In this case it is ‘SIDECHPOS’.

Type:

str

__serialize_fields = ('indexes',)#

attribute names to serialize

__serialize_version = 0#

version of class definition

property name#

The name of the class: “SideChainCartesians”.

Type:

str

prefix_label = 'SIDECHPOS'#

class encodermap.loading.features.SideChainDihedrals(*args, **kwargs)[source]#

Bases: DihedralFeature

Feature that collects all sidechain dihedrals of a topology.

top#

Topology of this feature.

Type:

mdtraj.Topology

indexes#

The numpy array returned from top.select(‘all’).

Type:

np.ndarray

options#

A list of possible sidechain angles [‘chi1’ to ‘chi5’].

Type:

list[str]

__serialize_fields: tuple[str] = ('_prefix_label_lengths',)#

attribute names to serialize

__serialize_version: int = 0#

version of class definition

describe()[source]#

Returns a list of labels, that can be used to unambiguously define atoms in the protein topology.

Returns:

A list of labels. This list has as many entries as atoms in self.top.

Return type:

list[str]

generic_describe()[source]#

property indexes#

A (n_angles, 4) shaped numpy array giving the atom indices of the dihedral angles to be calculated.

Type:

np.ndarray

property name#

The name of the class: “SideChainDihedrals”.

Type:

str

options: list[str] = ['chi1', 'chi2', 'chi3', 'chi4', 'chi5']#

encodermap.loading.featurizer module#

Classes to be used as custom features with PyEMMA's add_custom_feature.

Todo

  • Write Docstrings.

  • Write Examples.

  • Sidechain angles, distances not working correctly.

class encodermap.loading.featurizer.Featurizer(trajs, in_memory=True)[source]#

Bases: type

encodermap.loading.featurizer._validate_uri(str_)[source]#

Checks whether the str_ is a valid uri.

encodermap.loading.pipeline module#

Todo

  • Make a pyemma pipeline

  • Load features, save them as xarray in HDF5, and close the file

  • This way everything will be easy on memory

  • Fix this TypeError, when inheriting from the same class twice. Make an option to change the metaclass of the second one.

  • TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
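The metaclass conflict named in this list can be reproduced with plain Python. The sketch below is not taken from EncoderMap; the class names MetaA, MetaB, MetaAB, A, B and Fixed are hypothetical. It shows the error and one common workaround, which is to derive a combined metaclass from both metaclasses and pass it explicitly.

```python
class MetaA(type):
    """First, unrelated metaclass."""


class MetaB(type):
    """Second, unrelated metaclass."""


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


# Inheriting from both bases fails, because neither metaclass is a
# subclass of the other.
try:
    class Broken(A, B):
        pass
except TypeError as err:
    conflict_message = str(err)


# Workaround: a metaclass that derives from both resolves the conflict.
class MetaAB(MetaA, MetaB):
    pass


class Fixed(A, B, metaclass=MetaAB):
    pass
```

Whether this workaround fits the pipeline described in the Todo list depends on what the involved metaclasses do in their own __new__ and __init__ methods, which this document does not show.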

encodermap.loading.utils module#

Utility functions for loading trajectories with different topologies.

encodermap.loading.utils.put_nan_back_in()[source]#

Module contents#