encodermap.models package#

Submodules#

encodermap.models.layers module#

Module that implements custom layers, mainly needed for handling periodicity, backmapping, or sparsity.

class BackMapLayer(*args, **kwargs)[source]#

Bases: Layer

Layer that implements backmapping from distances, angles, and torsions to Euclidean coordinates.

Parameters:
  • left_split (int)

  • right_split (int)

call(inputs)[source]#

Call the layer. The inputs should be a tuple shaped so that it can be unpacked as distances, angles, dihedrals = inputs.

Parameters:

inputs (tuple[Tensor, Tensor, Tensor])

Return type:

Tensor

classmethod from_config(config)[source]#

Reconstructs this keras serializable from a dict.

Parameters:

config (dict[Any, Any]) – A dictionary.

Returns:

An instance of the BackMapLayer.

Return type:

BackMapLayerType

get_config()[source]#

Serializes this keras serializable.

Returns:

A dict with the serializable objects.

Return type:

dict[Any, Any]
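get_config and from_config follow the standard Keras serialization contract: get_config returns a dict of everything needed to rebuild the layer, and from_config reconstructs an equivalent instance from that dict. A minimal, framework-free sketch of the pattern (ToyBackMapLayer is an illustrative stand-in, not EncoderMap's actual class):

```python
class ToyBackMapLayer:
    """Illustrative stand-in for a serializable Keras layer."""

    def __init__(self, left_split, right_split):
        self.left_split = left_split
        self.right_split = right_split

    def get_config(self):
        # Everything needed to rebuild the layer goes into the dict.
        return {"left_split": self.left_split, "right_split": self.right_split}

    @classmethod
    def from_config(cls, config):
        # Reconstruct an equivalent instance from the dict.
        return cls(**config)


layer = ToyBackMapLayer(left_split=3, right_split=5)
clone = ToyBackMapLayer.from_config(layer.get_config())
assert clone.left_split == 3 and clone.right_split == 5
```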

class MeanAngles(*args, **kwargs)[source]#

Bases: Layer

Layer that implements the mean of periodic angles.

call(inputs)[source]#

Call the layer.
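Averaging periodic angles arithmetically fails at the ±π boundary; the standard remedy, which a layer like this can implement, is to average the sin and cos components and recover the angle with arctan2. A NumPy sketch of that math (periodic_mean is illustrative, not the layer's code):

```python
import numpy as np

def periodic_mean(angles, axis=0):
    """Circular mean of angles given in radians."""
    # Average on the unit circle instead of on the real line.
    mean_sin = np.mean(np.sin(angles), axis=axis)
    mean_cos = np.mean(np.cos(angles), axis=axis)
    return np.arctan2(mean_sin, mean_cos)

# Two angles just on either side of the +/- pi boundary.
angles = np.array([np.pi - 0.1, -np.pi + 0.1])
assert np.isclose(np.abs(periodic_mean(angles)), np.pi)
# The naive arithmetic mean would wrongly give 0.
assert np.isclose(np.mean(angles), 0.0)
```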

class PeriodicInput(*args, **kwargs)[source]#

Bases: EncoderMapBaseLayer

Layer that handles periodic input. Needed if angles are treated. Input angles will be split into sin and cos components, and a tensor with shape[0] = 2 * inp_shape[0] will be returned.

call(inputs)[source]#

Call the layer.

Parameters:

inputs (Tensor)

Return type:

Tensor
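The shape-doubling described above can be sketched in NumPy (shown here on the last axis; periodic_input is illustrative, not the layer's code):

```python
import numpy as np

def periodic_input(angles):
    """Map angles to their sin/cos components, doubling the feature axis."""
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

rng = np.random.default_rng(0)
batch = rng.uniform(-np.pi, np.pi, size=(4, 10))  # (batch, n_angles)
assert periodic_input(batch).shape == (4, 20)  # feature axis doubled
```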

class PeriodicOutput(*args, **kwargs)[source]#

Bases: EncoderMapBaseLayer

Layer that reverses the PeriodicInput layer.

call(inputs)[source]#

Call the layer. Inputs should be a tuple of (sin, cos) of the same angles.
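Reversing the sin/cos encoding amounts to an arctan2 of the two components; the round trip is lossless for angles in (-π, π]. A NumPy sketch (periodic_output is illustrative, not the layer's code):

```python
import numpy as np

def periodic_output(sin_part, cos_part):
    """Recover angles from their (sin, cos) components."""
    return np.arctan2(sin_part, cos_part)

rng = np.random.default_rng(0)
angles = rng.uniform(-np.pi, np.pi, size=(4, 10))
recovered = periodic_output(np.sin(angles), np.cos(angles))
assert np.allclose(recovered, angles)  # round trip recovers the angles
```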

encodermap.models.models module#

gen_functional_model(input_shapes: DatasetV2 | tuple[tuple[int], tuple[int], tuple[int, int], tuple[int], tuple[int]], parameters: ADCParameters | None = None, sparse: bool = False, sidechain_only_sparse: bool = False, kernel_initializer: dict[str, ndarray] | Literal['ones', 'VarianceScaling', 'deterministic'] = 'VarianceScaling', bias_initializer: dict[str, ndarray] | Literal['ones', 'RandomNormal', 'deterministic'] = 'RandomNormal', write_summary: bool = True, use_experimental_model: bool = True) ADCFunctionalModelTesting[source]#
gen_functional_model(input_shapes: DatasetV2 | tuple[tuple[int], tuple[int], tuple[int, int], tuple[int], tuple[int]], parameters: ADCParameters | None = None, sparse: bool = False, sidechain_only_sparse: bool = False, kernel_initializer: dict[str, ndarray] | Literal['ones', 'VarianceScaling', 'deterministic'] = 'VarianceScaling', bias_initializer: dict[str, ndarray] | Literal['ones', 'RandomNormal', 'deterministic'] = 'RandomNormal', write_summary: bool = True, use_experimental_model: bool = False) ADCFunctionalModel
gen_functional_model(input_shapes: DatasetV2 | tuple[tuple[int], tuple[int], tuple[int], tuple[int], tuple[int]], parameters: ADCParameters | None = None, sparse: bool = False, sidechain_only_sparse: bool = False, kernel_initializer: dict[str, ndarray] | Literal['ones', 'VarianceScaling', 'deterministic'] = 'VarianceScaling', bias_initializer: dict[str, ndarray] | Literal['ones', 'RandomNormal', 'deterministic'] = 'RandomNormal', write_summary: bool = True, use_experimental_model: bool = False) ADCSparseFunctionalModel

New implementation of the functional model API for AngleCartesianDihedralEncoderMap.

The functional API is much more flexible than the sequential API, in that models with multiple inputs and outputs can be defined. Custom layers and submodels can be intermixed. In EncoderMap’s case, the functional API is used to build the AngleDihedralCartesianAutoencoder, which takes input data in form of a tf.data.Dataset with:

  • backbone_angles (angles between C, CA, N atoms in the backbone).

  • backbone_torsions (dihedral angles in the backbone, commonly known as omega, phi, psi).

  • cartesian_coordinates (coordinates of the C, CA, N backbone atoms. This data has ndim 3, the others have ndim 2).

  • backbone_distances (distances between the C, CA, N backbone atoms).

  • sidechain_torsions (dihedral angles in the sidechain, commonly known as chi1, chi2, chi3, chi4, chi5).

Packing and unpacking that data in the correct order is important. Make sure to double-check whether you are using angles or dihedrals. A simple print of the shape can be enough.
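A quick shape check of this kind catches most packing mistakes. A sketch with dummy NumPy arrays in the documented order (the sizes are arbitrary examples):

```python
import numpy as np

n_atoms = 30  # e.g. 10 residues * 3 backbone atoms
batch = 16

packed = (
    np.zeros((batch, n_atoms - 2)),  # backbone_angles
    np.zeros((batch, n_atoms - 3)),  # backbone_torsions
    np.zeros((batch, n_atoms, 3)),   # cartesian_coordinates
    np.zeros((batch, n_atoms - 1)),  # backbone_distances
    np.zeros((batch, 7)),            # sidechain_torsions (S depends on sequence)
)

angles, torsions, cartesians, distances, side_torsions = packed
# Only the cartesians have ndim 3; everything else has ndim 2.
assert cartesians.ndim == 3
assert all(t.ndim == 2 for t in (angles, torsions, distances, side_torsions))
# Consistency of the N*3-1 / N*3-2 / N*3-3 shape relations.
assert angles.shape[1] == torsions.shape[1] + 1 == distances.shape[1] - 1
```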

Parameters:
  • input_shapes (Union[tf.data.Dataset, tuple[int, int, int, int, int]]) – The input shapes that will be used in the construction of the model.

  • parameters (Optional[encodermap.parameters.ADCParameters]) – An instance of encodermap.parameters.ADCParameters, which holds further parameters in network construction. If None is provided, a new instance with default parameters will be created. Defaults to None.

  • sparse (bool) – Whether sparse inputs are expected. Defaults to False.

  • sidechain_only_sparse (bool) – A special case, when the proteins have the same number of residues, but different numbers of sidechain dihedrals. In that case only the sidechain dihedrals are considered to be sparse. Defaults to False.

  • kernel_initializer (Union[dict[str, np.ndarray], Literal["ones", "VarianceScaling", "deterministic"]]) – How to initialize the weights. If "ones" is provided, the weights will be initialized with tf.keras.initializers.Constant(1). If "VarianceScaling" is provided, the weights will be initialized with tf.keras.initializers.VarianceScaling(). If "deterministic" is provided, a seed will be used with VarianceScaling. If a dict with weight matrices is supplied, the keys should follow this naming convention: ["dense/kernel", "dense_1/kernel", "dense_2/kernel", etc.]. This is tensorflow's naming convention for unnamed dense layers. Defaults to "VarianceScaling".

  • bias_initializer (Union[dict[str, np.ndarray], Literal["ones", "RandomNormal", "deterministic"]]) – How to initialize the biases. If "ones" is provided, the biases will be initialized with tf.keras.initializers.Constant(1). If "RandomNormal" is provided, the biases will be initialized with tf.keras.initializers.RandomNormal(0.1, 0.05). If "deterministic" is provided, a seed will be used with RandomNormal. If a dict with bias matrices is supplied, the keys should follow this naming convention: ["dense/bias", "dense_1/bias", "dense_2/bias", etc.]. This is tensorflow's naming convention for unnamed dense layers. Defaults to "RandomNormal".

  • write_summary (bool) – Whether to print a summary. If p.tensorboard is True, a file will be generated at the main_path.

Returns:

The model.

Return type:

tf.keras.models.Model

Here’s a scheme of the generated network:

┌───────────────────────────────────────────────────────────────────────────────────────┐
│A linear protein with N standard residues has N*3 backbone atoms (..C-N-CA-C-N..)      │
│it has N*3 - 1 distances between these atoms                                           │
│it has N*3 - 2 angles between three atoms                                              │
│it has N*3 - 3 dihedrals between 4 atoms                                               │
│it has S sidechain dihedrals based on the sequence                                     │
└───────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬───────┘
        │                 │                 │                 │                 │
        │                 │                 │                 │                 │
        │                 │                 │                 │                 │
┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐
│cartesians     │ │distances      │ │angles         │ │dihedrals      │ │side dihedrals │
│(batch, N*3, 3)│ │(batch, N*3-1) │ │(batch, N*3-2) │ │(batch, N*3-3) │ │(batch, S)     ├───────┐
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘ └───────┬───────┘ └────────┬──────┘       │Every type
        │                 │                 │                 │                  │              │of angular
        │                 │       ┌─────────┼─────────────────┼──────────────────┤              │input has
        │                 │       │         │                 │                  │              │its own
┌───────┴───────┐         │       │ ┌───────┴───────┐ ┌───────┴───────┐ ┌────────┴──────┐       │cost contri
│pair cartesians│         │ ┌─────┼─┤unitcircle ang │ │unitcircle dih │ │unitcircle sdih│       │bution
│(batch, batch) │         │ │if no│ │(b, (N*3-2)*2) │ │(b, (N*3-3)*2) │ │(b, S*2)       │       │which
└───────┬───────┘         │ │angles └───────┬───────┘ └───────┬───────┘ └────────┬──────┘       │compares
        │compare the pair │ │are  │         │                 │                  │              │input
        │wise distances of│ │fed  │       if│use_backbone_angles               if│use_sidechains│and
        │the input cartesi│ │through        │                 │                  │              │output
        │ans with the gene│ │the  ┼ ┌───────┴─────────────────┴──────────────────┴──────┐       │->
        │rated cartesians │ │network│concatenate the angle-inputs. Based on parameters. │       │angle_cost
        │-> cartesian loss│ │use  │ │(batch, sum(angle_shapes)                          │       │dihedral_cost
        │                 │ │mean │ └─────────────────────────┬─────────────────────────┘       │side_dihedral
        │                 │ │angles                           │                                 │_cost
        │                 │ │     │                           │                                 │
        │                 │ │     │                           │                                 │
        │                 │ │     │             ┌─────────────┴──────────────┐                  │
        │                 │ │     │             │Encoder layers              │                  │
        │                 │ │     │             │(batch, n_neurons)          │                  │
        │                 │ │     │             └─────────────┬──────────────┘                  │
        │                 │ │     │                           │                                 │
        │                 │ │     │                           │                                 │
        │                 │ │     │add a sigmoid-weighted     │            add a loss function  │
        │      compare the│ │     │loss function that┌────────┴────────┐   to center the points │
        │      ┌──────────┼─┼─────┴──────────────────┤Bottleneck,Latent├────────────────────    │
        │      │generated │ │      compares the pair-│ (batch, 2)      │   around the origin    │
        │      │cartesians│ │      wise distances of └────────┬────────┘   -> center loss       │
        │      │with the  │ │      input and latent           │                                 │
        │      │pairwise  │ │      samples                    │                                 │
        │      │distances │ │      -> distance loss           │                                 │
        │      │of the    │ │                   ┌─────────────┴──────────────┐                  │
        │      │bottleneck│ │                   │Decoder layers              │                  │
        │      │use a 2nd │ │                   │(batch, n_neurons)          │                  │
        │      │sigmoid   │ │                   └─────────────┬──────────────┘                  │
        │      │function  │ │                                 │                                 │
        │      │for this  │ │                                 │                                 │
        │      │->        │ │                                 │                                 │
        │      │cartesian │ │       ┌─────────────────────────┴─────────────────────────┐       │
        │      │distance  │ │       │split the output of the decoder to get angles back │       │
        │      │loss      │ │       │(batch, sum(angle_shapes)                          │       │
        │      │          │ │       └───────┬─────────────────┬─────────────────┬───────┘       │
        │      │          │ │               │                 │                 │               │
        │      │          │ │               │                 │                 │               │
        │      │          │ │               │                 │                 │               │
        │      │          │ │       ┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐       │
        │      │          │ │       │unitcircle ang │ │unitcircle dih │ │unitcircle sdih│       │
        │      │          │ │       │(b, (N*3-2)*2) │ │(b, (N*3-3)*2) │ │(b, S*2)       │       │
        │      │          │ │       └───────┬───────┘ └───────┬───────┘ └────────┬──────┘       │
        │      │          │ │               │                 │                  │              │
        │      │          │ │             if│use_backbone_angles               if│use_sidechains│
        │      │          │ │               │                 │                  │              │
        │      │          │ │       ┌───────┴───────┐ ┌───────┴───────┐ ┌────────┴──────┐       │
        │      │          │ └───────┤(mean) angles  │ │dihedrals      │ │side dihedrals │       │
        │      │          │         │(batch,3N*3-2) │ │(batch,3N*3-3) │ │(batch, S)     ├───────┘
        │      │          │         └───────┬───────┘ └───────┬───────┘ └───────────────┘
        │      │          │                 │                 │
        │      │          │                 │                 │
        │      │          │                 │                 │
        │      │  ┌───────┴─────────────────┴─────────────────┴──────┐
        │      │  │create new cartesians with chain-in-plane and     │
        │      │  │rotation matrices (batch, 3*N, 3)                 │
        │      │  └───────┬──────────────────────────────────────────┘
        │      │          │
        │      │          │
        │      │          │
        │      │  ┌───────┴───────┐
        │      └──┤gen pair cartes│
        │         │(batch,batch)  │
        └─────────┴───────────────┘
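
The counting rules at the top of the scheme can be condensed into a small helper for sanity-checking shapes (backbone_shapes is a hypothetical convenience function, not part of EncoderMap):

```python
def backbone_shapes(n_residues):
    """Expected feature sizes for a linear protein with N standard residues."""
    n_atoms = n_residues * 3          # three backbone atoms per residue
    return {
        "cartesians": n_atoms,        # (batch, N*3, 3)
        "distances": n_atoms - 1,     # (batch, N*3 - 1)
        "angles": n_atoms - 2,        # (batch, N*3 - 2)
        "dihedrals": n_atoms - 3,     # (batch, N*3 - 3)
    }

shapes = backbone_shapes(10)
assert shapes == {"cartesians": 30, "distances": 29, "angles": 28, "dihedrals": 27}
```
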
gen_sequential_model(input_shape, parameters=None, sparse=False)[source]#

Returns a tf.keras model with the specified input shape and the parameters in the Parameters class.

Parameters:
  • input_shape (int) – The input shape of the returned model. In most cases that is data.shape[1] of your data.

  • parameters (Optional[AnyParameters]) – The parameters to use on the returned model. If None is provided, the default parameters in encodermap.Parameters.defaults are used. You can look at the defaults with print(em.Parameters.defaults_description()). Defaults to None.

  • sparse (bool) – Whether sparse inputs are expected. Defaults to False.

Returns:

A subclass of tf.keras.Model built with the specified parameters.

Return type:

em.SequentialModel

Module contents#

EncoderMap’s tensorflow models.

In tensorflow a model is a grouping of layers with training/inference features.