Models#

class ADCFunctionalModel(*args, **kwargs)[source]#

Bases: Model

A subclass of tf.keras.Model, that implements the logic for the AngleDihedralCartesianEncoderMap.

Parameters:
  • parameters (ADCParameters)

  • inputs (Iterable[tf.Tensor])

  • outputs (Iterable[tf.Tensor])

  • encoder (tf.keras.Model)

  • decoder (tf.keras.Model)

compile(*args, **kwargs)[source]#

Configures the model for training.

Example:

```python
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.BinaryAccuracy(), tf.keras.metrics.FalseNegatives()],
)
```

Parameters:
  • optimizer – String (name of optimizer) or optimizer instance. See tf.keras.optimizers.

  • loss – Loss function. May be a string (name of loss function), or a tf.keras.losses.Loss instance. See tf.keras.losses. A loss function is any callable with the signature loss = fn(y_true, y_pred), where y_true are the ground truth values, and y_pred are the model’s predictions. y_true should have shape (batch_size, d0, .. dN) (except in the case of sparse loss functions such as sparse categorical crossentropy which expects integer arrays of shape (batch_size, d0, .. dN-1)). y_pred should have shape (batch_size, d0, .. dN). The loss function should return a float tensor. If a custom Loss instance is used and reduction is set to None, return value has shape (batch_size, d0, .. dN-1) i.e. per-sample or per-timestep loss values; otherwise, it is a scalar. If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses, unless loss_weights is specified.

  • metrics – List of metrics to be evaluated by the model during training and testing. Each of these can be a string (name of a built-in function), a function, or a tf.keras.metrics.Metric instance. See tf.keras.metrics. Typically you will use metrics=['accuracy']. A function is any callable with the signature result = fn(y_true, y_pred). To specify different metrics for different outputs of a multi-output model, you could also pass a dictionary, such as metrics={'output_a': 'accuracy', 'output_b': ['accuracy', 'mse']}. You can also pass a list to specify a metric or a list of metrics for each output, such as metrics=[['accuracy'], ['accuracy', 'mse']] or metrics=['accuracy', ['accuracy', 'mse']]. When you pass the strings 'accuracy' or 'acc', we convert this to one of tf.keras.metrics.BinaryAccuracy, tf.keras.metrics.CategoricalAccuracy, tf.keras.metrics.SparseCategoricalAccuracy based on the shapes of the targets and of the model output. We do a similar conversion for the strings 'crossentropy' and 'ce' as well. The metrics passed here are evaluated without sample weighting; if you would like sample weighting to apply, you can specify your metrics via the weighted_metrics argument instead.

  • loss_weights – Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model’s outputs. If a dict, it is expected to map output names (strings) to scalar coefficients.

  • weighted_metrics – List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing.

  • run_eagerly – Bool. If True, this Model’s logic will not be wrapped in a tf.function. Recommended to leave this as None unless your Model cannot be run inside a tf.function. run_eagerly=True is not supported when using tf.distribute.experimental.ParameterServerStrategy. Defaults to False.

  • steps_per_execution – Int or ‘auto’. The number of batches to run during each tf.function call. If set to “auto”, keras will automatically tune steps_per_execution during runtime. Running multiple batches inside a single tf.function call can greatly improve performance on TPUs, when used with distributed strategies such as ParameterServerStrategy, or with small models with a large Python overhead. At most, one full epoch will be run each execution. If a number larger than the size of the epoch is passed, the execution will be truncated to the size of the epoch. Note that if steps_per_execution is set to N, Callback.on_batch_begin and Callback.on_batch_end methods will only be called every N batches (i.e. before/after each tf.function execution). Defaults to 1.

  • jit_compile – If True, compile the model training step with XLA. [XLA](https://www.tensorflow.org/xla) is an optimizing compiler for machine learning. jit_compile is not enabled by default. Note that jit_compile=True may not necessarily work for all models. For more information on supported operations please refer to the [XLA documentation](https://www.tensorflow.org/xla). Also refer to [known XLA issues](https://www.tensorflow.org/xla/known_issues) for more details.

  • pss_evaluation_shards – Integer or ‘auto’. Used for tf.distribute.ParameterServerStrategy training only. This arg sets the number of shards to split the dataset into, to enable an exact visitation guarantee for evaluation, meaning the model will be applied to each dataset element exactly once, even if workers fail. The dataset must be sharded to ensure separate workers do not process the same data. The number of shards should be at least the number of workers for good performance. A value of ‘auto’ turns on exact evaluation and uses a heuristic for the number of shards based on the number of workers. A value of 0 means no visitation guarantee is provided. NOTE: Custom implementations of Model.test_step will be ignored when doing exact evaluation. Defaults to 0.

  • **kwargs – Arguments supported for backwards compatibility only.

Return type:

None
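
The way loss and loss_weights combine per-output losses can be illustrated with plain arithmetic. The following is only a sketch of the rule stated above (plain sum without weights, weighted sum with them); the helper name, the output names, and the loss values are hypothetical and not part of EncoderMap or Keras.

```python
def total_loss(losses, loss_weights=None):
    """Combine per-output losses as described for compile().

    losses: dict mapping output name -> scalar loss value.
    loss_weights: optional dict mapping output name -> scalar coefficient.
    """
    if loss_weights is None:
        # Without weights, the minimized value is the plain sum.
        return sum(losses.values())
    # Outputs without an explicit weight get 1.0 (an assumption of this sketch).
    return sum(loss_weights.get(name, 1.0) * value for name, value in losses.items())


# Hypothetical per-output loss values.
losses = {"output_a": 0.5, "output_b": 2.0}
unweighted = total_loss(losses)
weighted = total_loss(losses, {"output_a": 1.0, "output_b": 0.25})
print(unweighted, weighted)
```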

property decoder: Model#
property encoder: Model#
classmethod from_config(config, custom_objects=None)[source]#

Reconstructs this keras serializable from a dict.

Returns:

An instance of the ADCFunctionalModel.

Return type:

ADCFunctionalModelType

get_config()[source]#

Serializes this keras serializable.

Returns:

A dict with the serializable objects.

Return type:

dict[str, Any]
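
The get_config / from_config pair forms a serialization round trip. The sketch below shows that pattern on a small stand-in class; TinyConfigurable and its attributes are hypothetical and do not reflect the actual configuration keys of ADCFunctionalModel.

```python
from typing import Any


class TinyConfigurable:
    """Hypothetical stand-in for a Keras-serializable object."""

    def __init__(self, units: int, name: str = "tiny") -> None:
        self.units = units
        self.name = name

    def get_config(self) -> dict[str, Any]:
        # Return only plain, serializable Python objects.
        return {"units": self.units, "name": self.name}

    @classmethod
    def from_config(cls, config: dict[str, Any]) -> "TinyConfigurable":
        # Rebuild an equivalent instance from the dict.
        return cls(**config)


original = TinyConfigurable(units=4, name="demo")
restored = TinyConfigurable.from_config(original.get_config())
print(restored.get_config())
```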

get_loss(inp)[source]#
Parameters:

inp (tuple[Tensor, Tensor, Tensor, Tensor] | tuple[Tensor, Tensor, Tensor, Tensor, Tensor])

Return type:

Tensor

train_step(data)[source]#

Can receive three types of data.

  • use_backbone_angles = False, use_sidechains = False:

    Will receive a four-tuple in the order: angles, dihedrals, cartesians, distances. The angles will be used to construct mean angles.

  • use_backbone_angles = True, use_sidechains = False:

    Will receive the same four-tuple as above, but the angles will be fed through the autoencoder.

  • use_backbone_angles = True, use_sidechains = True:

    Will receive a five-tuple in the order: angles, dihedrals, cartesians, distances, side dihedrals. The angles, central dihedrals and side dihedrals will be fed through the autoencoder.

Parameters:

data (tuple[Tensor, Tensor, Tensor, Tensor] | tuple[Tensor, Tensor, Tensor, Tensor, Tensor])

Return type:

None
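
The three cases above can be summarized as a small dispatch on the two flags and the tuple length. This is a hedged sketch only: the function name is hypothetical, the tuple entries are placeholders, and the real train_step works on tensors rather than names.

```python
def autoencoder_inputs(data, use_backbone_angles, use_sidechains):
    """Return the names of the inputs that are fed through the autoencoder."""
    if use_sidechains:
        if not use_backbone_angles:
            raise ValueError("Not one of the three documented cases.")
        if len(data) != 5:
            raise ValueError("Expected a five-tuple.")
        return ["angles", "dihedrals", "side_dihedrals"]
    if len(data) != 4:
        raise ValueError("Expected a four-tuple.")
    if use_backbone_angles:
        return ["angles", "dihedrals"]
    # Angles are only used to construct mean angles in this case.
    return ["dihedrals"]


four = ("angles", "dihedrals", "cartesians", "distances")
five = four + ("side_dihedrals",)
print(autoencoder_inputs(four, False, False))
print(autoencoder_inputs(four, True, False))
print(autoencoder_inputs(five, True, True))
```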

class ADCFunctionalModelSidechainReconstruction(*args, **kwargs)[source]#

Bases: ADCSparseFunctionalModel

Parameters:
  • parameters (ADCParameters)

  • inputs (Iterable[tf.Tensor])

  • outputs (Iterable[tf.Tensor])

  • encoder (tf.keras.Model)

  • decoder (tf.keras.Model)

  • kwargs (Any)

classmethod from_config(config)[source]#

Reconstructs this keras serializable from a dict.

Parameters:

config (dict[Any, Any]) – A dictionary.

Returns:

An instance of the ADCFunctionalModelSidechainReconstruction.

Return type:

BackMapLayerType

get_config()[source]#

Serializes this keras serializable.

Returns:

A dict with the serializable objects.

Return type:

dict[str, Any]

get_loss(inp)[source]#
Parameters:

inp (tuple[Tensor, ...])

class ADCSparseFunctionalModel(*args, **kwargs)[source]#

Bases: ADCFunctionalModel

Parameters:
  • parameters (ADCParameters)

  • inputs (Iterable[tf.Tensor])

  • outputs (Iterable[tf.Tensor])

  • encoder (tf.keras.Model)

  • decoder (tf.keras.Model)

  • get_dense_model_central_angles (tf.keras.Model)

  • get_dense_model_central_dihedrals (tf.keras.Model)

  • get_dense_model_cartesians (tf.keras.Model)

  • get_dense_model_distances (tf.keras.Model)

  • get_dense_model_side_dihedrals (Union[tf.keras.Model, None])

classmethod from_config(config, custom_objects=None)[source]#

Reconstructs this keras serializable from a dict.

Returns:

An instance of the ADCSparseFunctionalModel.

Return type:

ADCSparseFunctionalModelType

get_config()[source]#

Serializes this keras serializable.

Returns:

A dict with the serializable objects.

Return type:

dict[str, Any]

get_loss(inp)[source]#
class MyBiasInitializer(bias)[source]#

Bases: Initializer

Custom Bias initializer to make bias deterministic.

Gets a numpy array called bias. When called, it checks whether the requested shape matches the shape of the numpy array and then returns the array.

Examples

>>> # Imports
>>> from encodermap.models.models import MyBiasInitializer
>>> import numpy as np
>>> import tensorflow as tf
>>> from tensorflow import keras
>>> from tensorflow.keras import layers
...
>>> # Create a model with the bias initializer
>>> model = tf.keras.models.Sequential(
...     [
...         layers.Dense(
...             2,
...             activation="relu",
...             name="layer1",
...             bias_initializer=MyBiasInitializer(np.array([1.0, 0.5])),
...         ),
...         layers.Dense(
...             3,
...             activation="relu",
...             name="layer2",
...             bias_initializer=MyBiasInitializer(np.array([0.1, 0.2, 0.3])),
...         ),
...         layers.Dense(4, name="layer3"),
...     ]
... )
...
>>> model.build(input_shape=(10, 2))
>>> for layer in model.layers:
...     print(layer.get_weights()[1])
[1.  0.5]
[0.1 0.2 0.3]
[0. 0. 0. 0.]
>>> # This example fails with an AssertionError, because the
>>> # bias shape of the second layer is wrong:
>>> model = tf.keras.models.Sequential(
...     [
...         layers.Dense(
...             2,
...             activation="relu",
...             name="layer1",
...             bias_initializer=MyBiasInitializer(np.array([1.0, 0.5])),
...         ),
...         layers.Dense(
...             3,
...             activation="relu",
...             name="layer2",
...             bias_initializer=MyBiasInitializer(np.array([0.1, 0.2])),
...         ),
...         layers.Dense(4, name="layer3"),
...     ]
... )
...
>>> model.build(input_shape=(10, 2))  
Traceback (most recent call last):
AssertionError: Can't initialize Bias. Requested shape: (3,) shape of pre-set bias: (2,)
Parameters:

bias (np.ndarray)

class MyKernelInitializer(weights)[source]#

Bases: Initializer

Custom Kernel initializer to make weights deterministic.

Gets a numpy array called weights. When called, it checks whether the requested shape matches the shape of the numpy array and then returns the array. For example, see the documentation of MyBiasInitializer.

Parameters:

weights (np.ndarray)
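
The shape check described for both initializers can be sketched without TensorFlow. The class below is a hypothetical, framework-free stand-in (it does not subclass the Keras Initializer and is not the EncoderMap implementation); it only shows the idea of returning a pre-set array when the requested shape matches.

```python
import numpy as np


class FixedArrayInitializer:
    """Hypothetical stand-in for a deterministic initializer."""

    def __init__(self, values):
        self.values = np.asarray(values)

    def __call__(self, shape, dtype=None):
        # Refuse to initialize if a different shape is requested.
        assert tuple(shape) == self.values.shape, (
            f"Can't initialize. Requested shape: {tuple(shape)}, "
            f"shape of pre-set array: {self.values.shape}"
        )
        return self.values if dtype is None else self.values.astype(dtype)


kernel = FixedArrayInitializer(np.ones((2, 3)))
print(kernel((2, 3)).shape)
```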

class SequentialModel(*args, **kwargs)[source]#

Bases: Model

Parameters:
  • input_dim (int)

  • parameters (Optional[Parameters])

  • sparse (bool)

  • get_dense_model (Optional[tf.keras.Model])

build(input_shape)[source]#

Builds the model based on input shapes received.

This is to be used for subclassed models, which do not know at instantiation time what their inputs look like.

This method only exists for users who want to call model.build() in a standalone way (as a substitute for calling the model on real data to build it). It will never be called by the framework (and thus it will never throw unexpected errors in an unrelated workflow).

Parameters:

input_shape – Single tuple, TensorShape instance, or list/dict of shapes, where shapes are tuples, integers, or TensorShape instances.

Raises:
  • ValueError

    1. In case of invalid user-provided data (not of type tuple, list, TensorShape, or dict).
    2. If the model requires call arguments that are agnostic to the input shapes (positional or keyword arg in call signature).
    3. If not all layers were properly built.
    4. If float type inputs are not supported within the layers.

    In each of these cases, the user should build their model by calling it on real tensor data.

call(x, training=False)[source]#

Calls the model on new inputs and returns the outputs as tensors.

In this case call() just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).

Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__() method, i.e. model(inputs), which relies on the underlying call() method.

Parameters:
  • inputs – Input tensor, or dict/list/tuple of input tensors.

  • training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.

  • mask – A mask or list of masks. A mask can be either a boolean tensor or None (no mask). For more details, check the guide [here](https://www.tensorflow.org/guide/keras/masking_and_padding).

Returns:

A tensor if there is a single output, or a list of tensors if there is more than one output.

compile(*args, **kwargs)[source]#

Configures the model for training.

Example:

```python
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.BinaryAccuracy(), tf.keras.metrics.FalseNegatives()],
)
```

Parameters:
  • optimizer – String (name of optimizer) or optimizer instance. See tf.keras.optimizers.

  • loss – Loss function. May be a string (name of loss function), or a tf.keras.losses.Loss instance. See tf.keras.losses. A loss function is any callable with the signature loss = fn(y_true, y_pred), where y_true are the ground truth values, and y_pred are the model’s predictions. y_true should have shape (batch_size, d0, .. dN) (except in the case of sparse loss functions such as sparse categorical crossentropy which expects integer arrays of shape (batch_size, d0, .. dN-1)). y_pred should have shape (batch_size, d0, .. dN). The loss function should return a float tensor. If a custom Loss instance is used and reduction is set to None, return value has shape (batch_size, d0, .. dN-1) i.e. per-sample or per-timestep loss values; otherwise, it is a scalar. If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses, unless loss_weights is specified.

  • metrics – List of metrics to be evaluated by the model during training and testing. Each of these can be a string (name of a built-in function), a function, or a tf.keras.metrics.Metric instance. See tf.keras.metrics. Typically you will use metrics=['accuracy']. A function is any callable with the signature result = fn(y_true, y_pred). To specify different metrics for different outputs of a multi-output model, you could also pass a dictionary, such as metrics={'output_a': 'accuracy', 'output_b': ['accuracy', 'mse']}. You can also pass a list to specify a metric or a list of metrics for each output, such as metrics=[['accuracy'], ['accuracy', 'mse']] or metrics=['accuracy', ['accuracy', 'mse']]. When you pass the strings 'accuracy' or 'acc', we convert this to one of tf.keras.metrics.BinaryAccuracy, tf.keras.metrics.CategoricalAccuracy, tf.keras.metrics.SparseCategoricalAccuracy based on the shapes of the targets and of the model output. We do a similar conversion for the strings 'crossentropy' and 'ce' as well. The metrics passed here are evaluated without sample weighting; if you would like sample weighting to apply, you can specify your metrics via the weighted_metrics argument instead.

  • loss_weights – Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model’s outputs. If a dict, it is expected to map output names (strings) to scalar coefficients.

  • weighted_metrics – List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing.

  • run_eagerly – Bool. If True, this Model’s logic will not be wrapped in a tf.function. Recommended to leave this as None unless your Model cannot be run inside a tf.function. run_eagerly=True is not supported when using tf.distribute.experimental.ParameterServerStrategy. Defaults to False.

  • steps_per_execution – Int or ‘auto’. The number of batches to run during each tf.function call. If set to “auto”, keras will automatically tune steps_per_execution during runtime. Running multiple batches inside a single tf.function call can greatly improve performance on TPUs, when used with distributed strategies such as ParameterServerStrategy, or with small models with a large Python overhead. At most, one full epoch will be run each execution. If a number larger than the size of the epoch is passed, the execution will be truncated to the size of the epoch. Note that if steps_per_execution is set to N, Callback.on_batch_begin and Callback.on_batch_end methods will only be called every N batches (i.e. before/after each tf.function execution). Defaults to 1.

  • jit_compile – If True, compile the model training step with XLA. [XLA](https://www.tensorflow.org/xla) is an optimizing compiler for machine learning. jit_compile is not enabled by default. Note that jit_compile=True may not necessarily work for all models. For more information on supported operations please refer to the [XLA documentation](https://www.tensorflow.org/xla). Also refer to [known XLA issues](https://www.tensorflow.org/xla/known_issues) for more details.

  • pss_evaluation_shards – Integer or ‘auto’. Used for tf.distribute.ParameterServerStrategy training only. This arg sets the number of shards to split the dataset into, to enable an exact visitation guarantee for evaluation, meaning the model will be applied to each dataset element exactly once, even if workers fail. The dataset must be sharded to ensure separate workers do not process the same data. The number of shards should be at least the number of workers for good performance. A value of ‘auto’ turns on exact evaluation and uses a heuristic for the number of shards based on the number of workers. A value of 0 means no visitation guarantee is provided. NOTE: Custom implementations of Model.test_step will be ignored when doing exact evaluation. Defaults to 0.

  • **kwargs – Arguments supported for backwards compatibility only.
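
The effect of steps_per_execution on batch-level callbacks can be modeled with a few lines of plain Python. This is a simplified model of the behavior described above (callbacks fire once per execution, and an execution never spans more than one epoch), not Keras code; the function name is hypothetical.

```python
def callback_batches(steps_per_epoch, steps_per_execution):
    """Return the batch indices after which on_batch_end would fire in one epoch."""
    # An execution is truncated to the size of the epoch.
    n = min(steps_per_execution, steps_per_epoch)
    ends = list(range(n - 1, steps_per_epoch, n))
    # A final, shorter execution covers any leftover batches.
    if not ends or ends[-1] != steps_per_epoch - 1:
        ends.append(steps_per_epoch - 1)
    return ends


print(callback_batches(10, 4))
print(callback_batches(10, 100))
```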

decoder(x, training=False)[source]#
encoder(x, training=False)[source]#

In the sequential model, the encoder is a method (as opposed to a model).

This method handles the input when the periodicity of the input data is finite, i.e. less than float(‘inf’).

Parameters:
  • x (Union[np.ndarray, tf.Tensor]) – The input.

  • training (bool) – Whether the model is in training mode and gradients are computed.

Returns:

The output of the encoder.

Return type:

Union[np.ndarray, tf.Tensor]

classmethod from_config(config, custom_objects=None)[source]#

Reconstructs this keras serializable from a dict.

Returns:

An instance of the SequentialModel.

Return type:

SequentialModelType

get_config()[source]#

Serializes this keras serializable.

Returns:

A dict with the serializable objects.

Return type:

dict[str, Any]

train_step(data)[source]#

Overrides the default train_step.

The difference is small: the provided data is still expected to be a tuple of (data, classes), i.e. (x, y), as in classification tasks. The data is unpacked, and y is discarded, because the Autoencoder Model is a regression task.

Parameters:

data (tuple) – The (x, y) data of this train step.
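
The unpacking step can be sketched in plain Python. The helper below is hypothetical and ignores everything else a real train_step does (forward pass, loss, gradients); accepting a bare batch in addition to an (x, y) tuple is an assumption of this sketch.

```python
def unpack_autoencoder_batch(data):
    """Return (inputs, targets) for an autoencoder.

    y is discarded and the input itself serves as the target.
    """
    x = data[0] if isinstance(data, tuple) else data
    return x, x


batch = ([[0.1, 0.2], [0.3, 0.4]], [0, 1])  # hypothetical (x, y) pair
inputs, targets = unpack_autoencoder_batch(batch)
print(inputs is targets)
```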

_concatenate_inputs(p, angles_unit_circle, central_dihedrals_unit_circle, side_dihedrals_unit_circle=None, input_cartesians_pairwise_defined_shape=None)[source]#

Concatenates input Tensors for the AngleDihedralCartesianEncoderMap.

As the AngleDihedralCartesianEncoderMap model can use one of three input combinations for its Encoder (central_dihedrals only; central_angles and central_dihedrals; or central_angles, central_dihedrals, and side_dihedrals), these input sources need to be concatenated (after they have been projected onto a unit circle). This function concatenates these inputs in the correct order and ensures a correct shape of the inputs.

Parameters:
  • p (encodermap.parameters.ADCParameters) – A parameter instance.

  • angles_unit_circle (Union[tf.Tensor, None]) – Can be None, in case only the central_dihedrals are used for training. Otherwise, needs to be the central angles.

  • central_dihedrals_unit_circle (tf.Tensor) – The unit circle projected central dihedrals.

  • side_dihedrals_unit_circle (Tensor | None) – Can be None, in case the side dihedrals are not used for training. Otherwise, needs to be the side dihedrals.

  • input_cartesians_pairwise_defined_shape (Optional[tf.Tensor]) – The pairwise distances of the input cartesians.

Returns:

A tuple containing the following:
  • list[int]: A list of the shape[1] of the input tensors. If only dihedrals are used for training, this list has only one entry. In the other cases, this list can be used to split the output of the decoder again into the constituents of central_angles, central_dihedrals, side_dihedrals.

  • tf.Tensor: The concatenated inputs.

Return type:

tuple
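
The concatenation and the returned list of widths can be sketched with NumPy arrays in place of tensors. This is an illustration under assumptions: the helper name, its argument order, and the array sizes are hypothetical, and the real function additionally takes the parameter instance p.

```python
import numpy as np


def concatenate_inputs(dihedrals, angles=None, side_dihedrals=None):
    """Concatenate the available inputs in the order angles, dihedrals, side dihedrals."""
    parts = [t for t in (angles, dihedrals, side_dihedrals) if t is not None]
    # Remember each width so the decoder output can be split again later.
    splits = [t.shape[1] for t in parts]
    return splits, np.concatenate(parts, axis=1)


angles = np.zeros((5, 4))     # hypothetical batch of 5 samples
dihedrals = np.ones((5, 6))
splits, merged = concatenate_inputs(dihedrals, angles=angles)
only_splits, only_merged = concatenate_inputs(dihedrals)
print(splits, merged.shape)
```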

_concatenate_inputs_reconstruct_sidechains(p, central_angles_unit_circle, central_dihedrals_unit_circle, side_angles_unit_circle, side_dihedrals_unit_circle)[source]#

Concatenates input Tensors for the AngleDihedralCartesianEncoderMap with sidechain reconstruction.

Parameters:
  • p (ADCParameters)

  • central_angles_unit_circle (Tensor)

  • central_dihedrals_unit_circle (Tensor)

  • side_angles_unit_circle (Tensor)

  • side_dihedrals_unit_circle (Tensor)

Return type:

tuple[list[int], Tensor]

_create_inputs_non_periodic_maybe_sparse(shape, p, name, sparse, reshape=None)[source]#

Creates an input Tensor.

Parameters:
  • shape (Union[tuple[int], tuple[int, int]]) – The shape can be either a tuple with one int (in case of the central distances) or a tuple of two ints (in case of central cartesians), in which case, the 2nd is checked to be 3 (for the xyz coordinates).

  • name (str) – The name of this input tensor. Will be preceded with ‘input_’.

  • sparse (bool) – Whether a sparse->dense model should be returned. Defaults to False.

  • reshape (Optional[int]) – If given, the input is expected to hold flattened cartesians and is reshaped to (shape // reshape, reshape). Only reshape=3 is currently used in EncoderMap. If None is specified, the output will not be reshaped. Defaults to None.

  • p (ADCParameters)

Returns:

A tuple containing the following:
  • tf.Tensor: The placeholder tensor for the input. If sparse is True, this Tensor will first be fed through a Dense layer to use sparse matrix multiplication to make it dense again.

  • Union[tf.Tensor, None]: The Dense output of the Tensor, if sparse is True.

  • Union[tf.keras.Model, None]: The model to get from sparse to dense. If sparse is False, None will be returned here.

Return type:

tuple
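
The effect of the reshape argument on flattened cartesians can be shown with NumPy. The array sizes below are made up for illustration, and NumPy stands in for the Keras reshape that the real function would apply.

```python
import numpy as np

# Two frames, four atoms each, with xyz coordinates flattened to 12 values.
flat = np.arange(2 * 12, dtype=float).reshape(2, 12)

reshape = 3  # the only value the text above reports as used in EncoderMap
n_atoms = flat.shape[1] // reshape

# Per frame, (12,) becomes (shape // reshape, reshape) = (4, 3).
cartesians = flat.reshape(-1, n_atoms, reshape)
print(cartesians.shape)
```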

_create_inputs_periodic_maybe_sparse(shape, p, name, sparse)[source]#

Creates an input Tensor and also projects it onto a unit circle (i.e. returns the sin, cos, sin, cos, … of the values).

Parameters:
  • shape (int) – The shape can be either a tuple with one int (in case of the central distances) or a tuple of two ints (in case of central cartesians), in which case, the 2nd is checked to be 3 (for the xyz coordinates).

  • p (encodermap.parameters.ADCParameters) – An instance of ADCParameters, which contains info about the periodicity of the input space.

  • name (str) – The name of this input tensor. Will be preceded with ‘input_’. The unit-circle input will be called ‘input_{name}_to_unit_circle’.

  • sparse (bool) – Whether a sparse->dense model should be returned.

Returns:

A tuple containing the following:
  • tf.Tensor: The placeholder tensor for the input. If sparse is True, this Tensor will first be fed through a Dense layer to use sparse matrix multiplication to make it dense again.

  • tf.Tensor: The PeriodicInput of the same tensor.

  • Union[tf.keras.Model, None]: The model to get from sparse to dense. If sparse is False, a None will be returned here.

Return type:

tuple
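
A unit-circle projection of periodic values can be sketched with the standard library. The function below is hypothetical: it rescales values so that one period spans 2π and then returns sines and cosines. The column layout used here (all sines, then all cosines) is an assumption of this sketch and may differ from the layer used in EncoderMap.

```python
import math


def to_unit_circle(values, periodicity=2 * math.pi):
    """Project periodic values onto the unit circle."""
    # Rescale so that one full period corresponds to 2*pi.
    scaled = [v / periodicity * 2 * math.pi for v in values]
    return [math.sin(v) for v in scaled] + [math.cos(v) for v in scaled]


# With a period of 360 degrees, -180 and 180 map to the same point.
a = to_unit_circle([-180.0], periodicity=360.0)
b = to_unit_circle([180.0], periodicity=360.0)
same_point = all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(a, b))
print(same_point)
```

The projection removes the artificial jump at the period boundary, so that values on either side of it end up close together in the input space.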

_get_adc_decoder(p, splits, input_angles_placeholder=None, kernel_initializer='VarianceScaling', kernel_regularizer=<keras.src.regularizers.L2 object>, bias_initializer='RandomNormal', write_summary=False, input_placeholder=None, n_proteins=None)[source]#

Special function to run a decoder and unpack the outputs.

This function calls _get_decoder_model to get a standard decoder and then splits the output according to the provided splits and the parameters p.

Parameters:
  • p (encodermap.parameters.ADCParameters) – The parameters.

  • splits (list[int]) – A list of ints giving the splits of the decoder outputs. It is expected that the splits follow the logic of angles-dihedrals-sidedihedrals. If only dihedrals are used for training, splits is expected to be a list of len 1.

  • input_angles_placeholder (Optional[tf.Tensor]) – When only using dihedrals for training, this placeholder should be provided to create a set of mean angles. Can also be None, in case len(splits) >= 2.

  • kernel_initializer (Union[dict[str, np.ndarray], Literal["ones", "VarianceScaling", "deterministic"]]) – How to initialize the weights. If "ones" is provided, the weights will be initialized with tf.keras.initializers.Constant(1). If "VarianceScaling" is provided, the weights will be initialized with tf.keras.initializers.VarianceScaling(). Defaults to "VarianceScaling". If "deterministic" is provided, a seed will be used with VarianceScaling. If a dict with weight matrices is supplied, the keys should follow this naming convention: ["dense/kernel", "dense_1/kernel", "dense_2/kernel", etc.] This is tensorflow’s naming convention for unnamed dense layers.

  • kernel_regularizer (tf.keras.regularizers.Regularizer) – The regularizer for the kernel (i.e. the layer weights). Standard in EncoderMap is to use the l2 regularizer with a regularization constant of 0.001.

  • bias_initializer (Union[dict[str, np.ndarray], Literal["ones", "RandomNormal", "deterministic"]]) – How to initialize the biases. If "ones" is provided, the biases will be initialized with tf.keras.initializers.Constant(1). If "RandomNormal" is provided, the biases will be initialized with tf.keras.initializers.RandomNormal(0.1, 0.05). Defaults to "RandomNormal". If "deterministic" is provided, a seed will be used with RandomNormal. If a dict with bias matrices is supplied, the keys should follow this naming convention: ["dense/bias", "dense_1/bias", "dense_2/bias", etc.] This is tensorflow’s naming convention for unnamed dense layers.

  • write_summary (bool) – Whether to print a summary. If p.tensorboard is True, a file will be generated at the main_path.

  • input_placeholder (Tensor | None)

  • n_proteins (Optional[int]) – If not None, number of proteins that constitute the multimer group that is trained.

Returns:

A tuple containing the following:
  • tf.keras.models.Model: The decoder model.

  • tf.Tensor: The angles (either mean, or learned angles).

  • tf.Tensor: The dihedrals.

  • Union[None, tf.Tensor]: The sidechain dihedrals. If p.use_sidechains is false, None will be returned.

  • Union[None, tf.Tensor]: The homogeneous transformation matrices for multimer training. If p.multimer_training is None, None will be returned.

Return type:

tuple[Model, Tensor, Tensor, None | Tensor, None | Tensor]
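
Splitting a decoder output according to splits can be sketched with NumPy. The helper name and the array sizes are hypothetical, and NumPy stands in for the tensor operations of the real function; the order angles, dihedrals, side dihedrals follows the description of splits above.

```python
import numpy as np


def split_decoder_output(output, splits):
    """Split a (batch, sum(splits)) array column-wise into len(splits) blocks."""
    assert output.shape[1] == sum(splits), "splits must cover every output column"
    if len(splits) == 1:
        # Only dihedrals were used for training; nothing to split.
        return [output]
    # Column indices at which one block ends and the next begins.
    boundaries = np.cumsum(splits)[:-1]
    return np.split(output, boundaries, axis=1)


out = np.arange(20).reshape(2, 10)
angles, dihedrals = split_decoder_output(out, [4, 6])
print(angles.shape, dihedrals.shape)
```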

_get_decoder_model(p, out_shape, kernel_initializer='VarianceScaling', kernel_regularizer=<keras.src.regularizers.L2 object>, bias_initializer='RandomNormal', write_summary=False, input_placeholder=None)[source]#

Create a decoder to the requested specs.

Contrary to the _get_encoder_model function, this function doesn’t require an input placeholder. The input placeholder is created in the function body. Thus, a combined autoencoder model can be built by stacking the encoder and decoder like so: output = decoder(encoder(input)).

Parameters:
  • p (encodermap.parameters.ADCParameters) – The parameters.

  • out_shape (int) – The output shape of the decoder. Make sure to match it with the input shape of the encoder.

  • kernel_initializer (dict[str, ndarray] | Literal['ones', 'VarianceScaling', 'deterministic']) – How to initialize the weights. If “ones” is provided, the weights will be initialized with tf.keras.initializers.Constant(1). If “VarianceScaling” is provided, the weights will be initialized with tf.keras.initializers.VarianceScaling(). If “deterministic” is provided, a seed will be used with VarianceScaling. If a dict with weight matrices is supplied, the keys should follow this naming convention: [“dense/kernel”, “dense_1/kernel”, “dense_2/kernel”, etc.]. This is tensorflow’s naming convention for unnamed dense layers. Defaults to “VarianceScaling”.

  • kernel_regularizer (tf.keras.regularizers.Regularizer) – The regularizer for the kernel (i.e. the layer weights). Standard in EncoderMap is to use the l2 regularizer with a regularization constant of 0.001.

  • bias_initializer (dict[str, ndarray] | Literal['ones', 'RandomNormal', 'deterministic']) – How to initialize the biases. If “ones” is provided, the biases will be initialized with tf.keras.initializers.Constant(1). If “RandomNormal” is provided, the biases will be initialized with tf.keras.initializers.RandomNormal(0.1, 0.05). If “deterministic” is provided, a seed will be used with RandomNormal. If a dict with bias matrices is supplied, the keys should follow this naming convention: [“dense/bias”, “dense_1/bias”, “dense_2/bias”, etc.]. This is tensorflow’s naming convention for unnamed dense layers. Defaults to “RandomNormal”.

  • write_summary (bool) – Whether to print a summary. If p.tensorboard is True, a file will be generated at the main_path.

  • input_placeholder (Tensor | None)

Returns:

A tuple containing the following:
  • tf.keras.models.Model: The decoder model.

  • tf.Tensor: The output tensor with shape out_shape.

  • tf.Tensor: The input placeholder tensor with shape p.n_neurons.

Return type:

tuple[Model, Tensor, Tensor]
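The key naming convention for a user-supplied dict of weights or biases can be sketched as follows. This is a minimal illustration and not EncoderMap code: the helper dense_layer_keys and the choice of three layers are hypothetical, and only the key pattern itself is taken from the parameter descriptions above.

```python
def dense_layer_keys(n_layers: int, kind: str = "kernel") -> list[str]:
    """Return the dict keys for `n_layers` unnamed dense layers.

    Hypothetical helper: the first layer is called "dense", every later
    layer gets a running index suffix ("dense_1", "dense_2", ...).
    """
    if kind not in ("kernel", "bias"):
        raise ValueError(f"kind must be 'kernel' or 'bias', got {kind!r}")
    keys = []
    for i in range(n_layers):
        layer_name = "dense" if i == 0 else f"dense_{i}"
        keys.append(f"{layer_name}/{kind}")
    return keys


# Keys for a network with three dense layers (layer count is an assumption).
kernel_keys = dense_layer_keys(3, "kernel")
bias_keys = dense_layer_keys(3, "bias")
print(kernel_keys)
print(bias_keys)
```

A dict passed as kernel_initializer or bias_initializer would then map each of these keys to the matching array for that layer.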

_get_deterministic_random_normal(mean=0.1, stddev=0.05, seed=None)[source]#

Returns a deterministic random_normal_initializer with TensorFlow 1.

For the tf2 implementation, look into MyKernelInitializer. Moving from tf1 to tf2, the seeding method has changed, so that the same seed can’t be used to get the same random data in tf1 and tf2.

Return type:

RandomNormal
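What a seeded initializer provides is reproducibility: the same seed yields the same initial values. The sketch below illustrates only that idea with Python's random module instead of TensorFlow, so seeded_normal is a hypothetical stand-in and not the initializer returned by this function. The defaults for mean and stddev are copied from the signature above.

```python
import random


def seeded_normal(n, mean=0.1, stddev=0.05, seed=None):
    """Draw n values from a normal distribution with an optional seed."""
    rng = random.Random(seed)  # independent generator, global state untouched
    return [rng.gauss(mean, stddev) for _ in range(n)]


a = seeded_normal(4, seed=42)
b = seeded_normal(4, seed=42)
c = seeded_normal(4, seed=43)
print(a == b)  # same seed, same values
print(a == c)  # different seed, different values
```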

_get_deterministic_variance_scaling(seed=None)[source]#

Returns a deterministic variance_scaling_initializer with TensorFlow 1.

For the tf2 implementation, look into MyBiasInitializer. Moving from tf1 to tf2, the seeding method has changed, so that the same seed can’t be used to get the same random data in tf1 and tf2.

Parameters:

seed (int | None)

Return type:

VarianceScaling

_get_encoder_model(inp, p, input_list, kernel_initializer='VarianceScaling', kernel_regularizer=<keras.src.regularizers.L2 object>, bias_initializer='RandomNormal', write_summary=False)[source]#

Create an encoder model and feed the inp through it.

Parameters:
  • inp (tf.Tensor) – The input tensor of the encoder.

  • p (encodermap.parameters.ADCParameters) – The parameters.

  • input_list (list[tf.Tensor]) – This list contains the input placeholders for the encoder. Make sure that these input tensors point to the inp tensor in some way.

  • kernel_initializer (dict[str, ndarray] | Literal['ones', 'VarianceScaling', 'deterministic']) – How to initialize the weights. If “ones” is provided, the weights will be initialized with tf.keras.initializers.Constant(1). If “VarianceScaling” is provided, the weights will be initialized with tf.keras.initializers.VarianceScaling(). If “deterministic” is provided, a seed will be used with VarianceScaling. If a dict with weight matrices is supplied, the keys should follow this naming convention: [“dense/kernel”, “dense_1/kernel”, “dense_2/kernel”, etc.]. This is tensorflow’s naming convention for unnamed dense layers. Defaults to “VarianceScaling”.

  • kernel_regularizer (tf.keras.regularizers.Regularizer) – The regularizer for the kernel (i.e. the layer weights). Standard in EncoderMap is to use the l2 regularizer with a regularization constant of 0.001.

  • bias_initializer (dict[str, ndarray] | Literal['ones', 'RandomNormal', 'deterministic']) – How to initialize the biases. If “ones” is provided, the biases will be initialized with tf.keras.initializers.Constant(1). If “RandomNormal” is provided, the biases will be initialized with tf.keras.initializers.RandomNormal(0.1, 0.05). If “deterministic” is provided, a seed will be used with RandomNormal. If a dict with bias matrices is supplied, the keys should follow this naming convention: [“dense/bias”, “dense_1/bias”, “dense_2/bias”, etc.]. This is tensorflow’s naming convention for unnamed dense layers. Defaults to “RandomNormal”.

  • write_summary (bool) – Whether to print a summary. If p.tensorboard is True, a file will be generated at the main_path.

Returns:

A tuple containing:
  • tf.keras.models.Model: The encoder model.

  • tf.Tensor: The output of the model.

Return type:

tuple[Model, Tensor]

_unpack_and_assert_input_shapes(input_shapes, p, input_sparse=False, input_sidechain_only_sparse=False)[source]#

This function unpacks and asserts the input_shapes for the regular protein case.

Parameters:
  • input_shapes (Union[tf.data.Dataset, tuple[int, int, int, int, int]]) – The input shapes that will be used in the construction of the model.

  • p (encodermap.parameters.ADCParameters) – An instance of encodermap.parameters.ADCParameters, which holds further parameters in network construction.

  • input_sparse (bool) – Whether sparse inputs are expected. Defaults to False.

  • input_sidechain_only_sparse (bool) – Whether only the sidechain dihedrals are sparse. In that case, the input shape of the cartesians is different, because the cartesians are flattened to a rank 2 tensor before running them through a dense layer and then stacking them again to shape (n_frames, n_atoms, 3).

Returns:

A tuple containing the following:
  • int: The input shape for the training angles.

  • int: The input shape for the training dihedrals.

  • int: The input shape for the cartesians.

  • int: The input shape for the distances.

  • Union[int, None]: The input shape for the training sidechain dihedrals. Can be None, if they are not used for training.

Return type:

tuple

_unpack_and_assert_input_shapes_multimers(input_shapes, p)[source]#

Return type:

tuple[int, int, int, int, int, bool, bool]

_unpack_and_assert_input_shapes_w_sidechains(input_shapes, p, input_sparse=False, input_sidechain_only_sparse=False)[source]#

This function unpacks and asserts the input_shapes for the regular protein case.

In contrast to _unpack_and_assert_input_shapes, a full sidechain reconstruction will be executed.

Parameters:
  • input_shapes (Union[tf.data.Dataset, tuple[int, int, int, int, int]]) – The input shapes that will be used in the construction of the model.

  • p (encodermap.parameters.ADCParameters) – An instance of encodermap.parameters.ADCParameters, which holds further parameters in network construction.

  • input_sparse (bool) – Whether sparse inputs are expected. Defaults to False.

  • input_sidechain_only_sparse (bool) – Whether only the sidechain dihedrals are sparse. In that case, the input shape of the cartesians is different, because the cartesians are flattened to a rank 2 tensor before running them through a dense layer and then stacking them again to shape (n_frames, n_atoms, 3).

Returns:

A tuple containing the following:
  • int: The input shape for the training angles.

  • int: The input shape for the training dihedrals.

  • int: The input shape for the cartesians.

  • int: The input shape for the distances.

  • Union[int, None]: The input shape for the training sidechain dihedrals. Can be None, if they are not used for training.

Return type:

tuple

gen_functional_model(input_shapes: DatasetV2 | tuple[tuple[int], tuple[int], tuple[int, int], tuple[int], tuple[int]], parameters: ADCParameters | None = None, sparse: bool = False, sidechain_only_sparse: bool = False, kernel_initializer: dict[str, ndarray] | Literal['ones', 'VarianceScaling', 'deterministic'] = 'VarianceScaling', bias_initializer: dict[str, ndarray] | Literal['ones', 'RandomNormal', 'deterministic'] = 'RandomNormal', write_summary: bool = True, use_experimental_model: bool = True) → ADCFunctionalModelTesting[source]#
gen_functional_model(input_shapes: DatasetV2 | tuple[tuple[int], tuple[int], tuple[int, int], tuple[int], tuple[int]], parameters: ADCParameters | None = None, sparse: bool = False, sidechain_only_sparse: bool = False, kernel_initializer: dict[str, ndarray] | Literal['ones', 'VarianceScaling', 'deterministic'] = 'VarianceScaling', bias_initializer: dict[str, ndarray] | Literal['ones', 'RandomNormal', 'deterministic'] = 'RandomNormal', write_summary: bool = True, use_experimental_model: bool = False) → ADCFunctionalModel
gen_functional_model(input_shapes: DatasetV2 | tuple[tuple[int], tuple[int], tuple[int], tuple[int], tuple[int]], parameters: ADCParameters | None = None, sparse: bool = False, sidechain_only_sparse: bool = False, kernel_initializer: dict[str, ndarray] | Literal['ones', 'VarianceScaling', 'deterministic'] = 'VarianceScaling', bias_initializer: dict[str, ndarray] | Literal['ones', 'RandomNormal', 'deterministic'] = 'RandomNormal', write_summary: bool = True, use_experimental_model: bool = False) → ADCSparseFunctionalModel

New implementation of the functional model API for AngleDihedralCartesianEncoderMap.

The functional API is much more flexible than the sequential API, in that models with multiple inputs and outputs can be defined. Custom layers and submodels can be intermixed. In EncoderMap’s case, the functional API is used to build the AngleDihedralCartesianAutoencoder, which takes input data in the form of a tf.data.Dataset with:

  • backbone_angles (angles between the C, CA, N atoms in the backbone).

  • backbone_torsions (dihedral angles in the backbone, commonly known as omega, phi, psi).

  • cartesian_coordinates (coordinates of the C, CA, N backbone atoms. This data has ndim 3, the others have ndim 2).

  • backbone_distances (distances between the C, CA, N backbone atoms).

  • sidechain_torsions (dihedral angles in the sidechain, commonly known as chi1, chi2, chi3, chi4, chi5).

Packing and unpacking that data in the correct order is important. Make sure to double-check whether you are using angles or dihedrals. A simple print of the shape can be enough.
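Because the five inputs are easy to mix up, the expected per-sample shapes can be derived from the number of residues and compared with the shapes of the actual data. The sketch below uses the counts for a linear protein with N standard residues (3N backbone atoms, 3N-1 distances, 3N-2 angles, 3N-3 dihedrals). The helper expected_input_shapes and the example values are hypothetical and not part of EncoderMap.

```python
def expected_input_shapes(n_residues: int, n_sidechain_dihedrals: int):
    """Per-sample shapes in the order angles, dihedrals, cartesians,
    distances, sidechain dihedrals (hypothetical helper)."""
    n_atoms = 3 * n_residues  # N, CA and C for every residue
    return (
        (n_atoms - 2,),            # backbone_angles
        (n_atoms - 3,),            # backbone_torsions
        (n_atoms, 3),              # cartesian_coordinates
        (n_atoms - 1,),            # backbone_distances
        (n_sidechain_dihedrals,),  # sidechain_torsions
    )


# Example values are assumptions: 10 residues with 18 sidechain dihedrals.
shapes = expected_input_shapes(n_residues=10, n_sidechain_dihedrals=18)
names = ("angles", "dihedrals", "cartesians", "distances", "side dihedrals")
for name, shape in zip(names, shapes):
    print(f"{name:15s}{shape}")
```

The angles always have exactly one more column than the dihedrals, which makes a swapped pair easy to spot.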

Parameters:
  • input_shapes (Union[tf.data.Dataset, tuple[int, int, int, int, int]]) – The input shapes that will be used in the construction of the model.

  • parameters (Optional[encodermap.parameters.ADCParameters]) – An instance of encodermap.parameters.ADCParameters, which holds further parameters in network construction. If None is provided, a new instance with default parameters will be created. Defaults to None.

  • sparse (bool) – Whether sparse inputs are expected. Defaults to False.

  • sidechain_only_sparse (bool) – A special case, when the proteins have the same number of residues, but different numbers of sidechain dihedrals. In that case only the sidechain dihedrals are considered to be sparse. Defaults to False.

  • kernel_initializer (dict[str, ndarray] | Literal['ones', 'VarianceScaling', 'deterministic']) – How to initialize the weights. If “ones” is provided, the weights will be initialized with tf.keras.initializers.Constant(1). If “VarianceScaling” is provided, the weights will be initialized with tf.keras.initializers.VarianceScaling(). If “deterministic” is provided, a seed will be used with VarianceScaling. If a dict with weight matrices is supplied, the keys should follow this naming convention: [“dense/kernel”, “dense_1/kernel”, “dense_2/kernel”, etc.]. This is tensorflow’s naming convention for unnamed dense layers. Defaults to “VarianceScaling”.

  • bias_initializer (dict[str, ndarray] | Literal['ones', 'RandomNormal', 'deterministic']) – How to initialize the biases. If “ones” is provided, the biases will be initialized with tf.keras.initializers.Constant(1). If “RandomNormal” is provided, the biases will be initialized with tf.keras.initializers.RandomNormal(0.1, 0.05). If “deterministic” is provided, a seed will be used with RandomNormal. If a dict with bias matrices is supplied, the keys should follow this naming convention: [“dense/bias”, “dense_1/bias”, “dense_2/bias”, etc.]. This is tensorflow’s naming convention for unnamed dense layers. Defaults to “RandomNormal”.

  • write_summary (bool) – Whether to print a summary. If p.tensorboard is True, a file will be generated at the main_path.

Returns:

The model.

Return type:

tf.keras.models.Model

Here’s a scheme of the generated network:

┌───────────────────────────────────────────────────────────────────────────────────────┐
│A linear protein with N standard residues has N*3 backbone atoms (..C-N-CA-C-N..)      │
│it has N*3 - 1 distances between these atoms                                           │
│it has N*3 - 2 angles between three atoms                                              │
│it has N*3 - 3 dihedrals between 4 atoms                                               │
│it has S sidechain dihedrals based on the sequence                                     │
└───────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬───────┘
        │                 │                 │                 │                 │
        │                 │                 │                 │                 │
        │                 │                 │                 │                 │
┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐
│cartesians     │ │distances      │ │angles         │ │dihedrals      │ │side dihedrals │
│(batch, N*3, 3)│ │(batch, N*3-1) │ │(batch, N*3-2) │ │(batch, N*3-3) │ │(batch, S)     ├───────┐
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘ └───────┬───────┘ └────────┬──────┘       │Every type
        │                 │                 │                 │                  │              │of angular
        │                 │       ┌─────────┼─────────────────┼──────────────────┤              │input has
        │                 │       │         │                 │                  │              │its own
┌───────┴───────┐         │       │ ┌───────┴───────┐ ┌───────┴───────┐ ┌────────┴──────┐       │cost contri
│pair cartesians│         │ ┌─────┼─┤unitcircle ang │ │unitcircle dih │ │unitcircle sdih│       │bution
│(batch, batch) │         │ │if no│ │(b, (N*3-2)*2) │ │(b, (N*3-3)*2) │ │(b, S*2)       │       │which
└───────┬───────┘         │ │angles └───────┬───────┘ └───────┬───────┘ └────────┬──────┘       │compares
        │compare the pair │ │are  │         │                 │                  │              │input
        │wise distances of│ │fed  │       if│use_backbone_angles               if│use_sidechains│and
        │the input cartesi│ │through        │                 │                  │              │output
        │ans with the gene│ │the  ┼ ┌───────┴─────────────────┴──────────────────┴──────┐       │->
        │rated cartesians │ │network│concatenate the angle-inputs. Based on parameters. │       │angle_cost
        │-> cartesian loss│ │use  │ │(batch, sum(angle_shapes))                         │       │dihedral_cost
        │                 │ │mean │ └─────────────────────────┬─────────────────────────┘       │side_dihedral
        │                 │ │angles                           │                                 │_cost
        │                 │ │     │                           │                                 │
        │                 │ │     │                           │                                 │
        │                 │ │     │             ┌─────────────┴──────────────┐                  │
        │                 │ │     │             │Encoder layers              │                  │
        │                 │ │     │             │(batch, n_neurons)          │                  │
        │                 │ │     │             └─────────────┬──────────────┘                  │
        │                 │ │     │                           │                                 │
        │                 │ │     │                           │                                 │
        │                 │ │     │add a sigmoid-weighted     │            add a loss function  │
        │      compare the│ │     │loss function that┌────────┴────────┐   to center the points │
        │      ┌──────────┼─┼─────┴──────────────────┤Bottleneck,Latent├────────────────────    │
        │      │generated │ │      compares the pair-│ (batch, 2)      │   around the origin    │
        │      │cartesians│ │      wise distances of └────────┬────────┘   -> center loss       │
        │      │with the  │ │      input and latent           │                                 │
        │      │pairwise  │ │      samples                    │                                 │
        │      │distances │ │      -> distance loss           │                                 │
        │      │of the    │ │                   ┌─────────────┴──────────────┐                  │
        │      │bottleneck│ │                   │Decoder layers              │                  │
        │      │use a 2nd │ │                   │(batch, n_neurons)          │                  │
        │      │sigmoid   │ │                   └─────────────┬──────────────┘                  │
        │      │function  │ │                                 │                                 │
        │      │for this  │ │                                 │                                 │
        │      │->        │ │                                 │                                 │
        │      │cartesian │ │       ┌─────────────────────────┴─────────────────────────┐       │
        │      │distance  │ │       │split the output of the decoder to get angles back │       │
        │      │loss      │ │       │(batch, sum(angle_shapes))                         │       │
        │      │          │ │       └───────┬─────────────────┬─────────────────┬───────┘       │
        │      │          │ │               │                 │                 │               │
        │      │          │ │               │                 │                 │               │
        │      │          │ │               │                 │                 │               │
        │      │          │ │       ┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐       │
        │      │          │ │       │unitcircle ang │ │unitcircle dih │ │unitcircle sdih│       │
        │      │          │ │       │(b, (N*3-2)*2) │ │(b, (N*3-3)*2) │ │(b, S*2)       │       │
        │      │          │ │       └───────┬───────┘ └───────┬───────┘ └────────┬──────┘       │
        │      │          │ │               │                 │                  │              │
        │      │          │ │             if│use_backbone_angles               if│use_sidechains│
        │      │          │ │               │                 │                  │              │
        │      │          │ │       ┌───────┴───────┐ ┌───────┴───────┐ ┌────────┴──────┐       │
        │      │          │ └───────┤(mean) angles  │ │dihedrals      │ │side dihedrals │       │
        │      │          │         │(batch, N*3-2) │ │(batch, N*3-3) │ │(batch, S)     ├───────┘
        │      │          │         └───────┬───────┘ └───────┬───────┘ └───────────────┘
        │      │          │                 │                 │
        │      │          │                 │                 │
        │      │          │                 │                 │
        │      │  ┌───────┴─────────────────┴─────────────────┴──────┐
        │      │  │create new cartesians with chain-in-plane and     │
        │      │  │rotation matrices (batch, 3*N, 3)                 │
        │      │  └───────┬──────────────────────────────────────────┘
        │      │          │
        │      │          │
        │      │          │
        │      │  ┌───────┴───────┐
        │      └──┤gen pair cartes│
        │         │(batch,batch)  │
        └─────────┴───────────────┘
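
The “unitcircle” boxes in the scheme double the feature count because every angle is replaced by two values. A common way to do this, assumed here, is to map each angle to its sine and cosine, which removes the jump between -π and π. Whether EncoderMap lays the values out as all sines followed by all cosines (as below) is not stated in this document. The function to_unit_circle and the example numbers are hypothetical.

```python
import math


def to_unit_circle(angles):
    """Map each angle (radians) to a point on the unit circle.

    Returns the sines followed by the cosines, so n angles become
    2 * n features (assumed layout, for illustration only).
    """
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]


n_residues = 4
n_angles = 3 * n_residues - 2     # backbone angles
n_dihedrals = 3 * n_residues - 3  # backbone dihedrals

encoded = to_unit_circle([0.0] * n_dihedrals)
print(len(encoded))  # 2 * n_dihedrals

# -pi and pi describe the same direction and end up at (almost) the same point.
left = to_unit_circle([-math.pi])
right = to_unit_circle([math.pi])
print(max(abs(l - r) for l, r in zip(left, right)) < 1e-12)

# Width of the concatenated encoder input when backbone angles are used
# and sidechains are not.
encoder_width = 2 * n_angles + 2 * n_dihedrals
print(encoder_width)
```
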
gen_sequential_model(input_shape, parameters=None, sparse=False)[source]#

Returns a tf.keras model with the specified input shape and the parameters in the Parameters class.

Parameters:
  • input_shape (int) – The input shape of the returned model. In most cases that is data.shape[1] of your data.

  • parameters (Optional[AnyParameters]) – The parameters to use on the returned model. If None is provided, the default parameters in encodermap.Parameters.defaults are used. You can look at the defaults with print(em.Parameters.defaults_description()). Defaults to None.

  • sparse (bool) – Whether sparse inputs are expected. Defaults to False.

Returns:

A subclass of tf.keras.Model built with the specified parameters.

Return type:

em.SequentialModel