Parameter Classes#
Parameters#
- class Parameters(**kwargs)[source]#
Class to hold Parameters for the Autoencoder
Parameters can be set via keyword arguments while instantiating the class, set as instance attributes, or read from disk. This class can write parameters to disk in .yaml or .json format.
- defaults#
Class variable (dict) that holds the defaults, even when the current values might have changed.
- Type:
dict
- n_neurons#
List containing the number of neurons for each layer up to the bottleneck layer. For example, [128, 128, 2] stands for an autoencoder with the following architecture {i, 128, 128, 2, 128, 128, i}, where i is the number of dimensions of the input data. The input and output layers are not trained.
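As an illustration of this layout (the value of `input_dim` below is a hypothetical example, not taken from the library), the full symmetric layer sequence can be derived from n_neurons like this:

```python
# Sketch: derive the full symmetric autoencoder architecture from
# n_neurons. `input_dim` (10) is an assumed example value for i.
n_neurons = [128, 128, 2]
input_dim = 10

# encoder layers up to the bottleneck, then the mirrored decoder layers
layers = [input_dim] + n_neurons + n_neurons[-2::-1] + [input_dim]
print(layers)  # [10, 128, 128, 2, 128, 128, 10]
```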
- activation_functions#
List of activation function names as implemented in TensorFlow. For example: “relu”, “tanh”, “sigmoid” or “” to use no activation function. The encoder part of the network takes the activation functions from the list starting with the second element. The decoder part of the network takes the activation functions in reversed order, starting with the second element from the back. For example, [“”, “relu”, “tanh”, “”] would result in an autoencoder with {“relu”, “tanh”, “”, “tanh”, “relu”, “”} as the sequence of activation functions.
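The slicing described above can be sketched directly (an illustration of the indexing rule, not the library's code):

```python
# Sketch of how the activation sequence is assembled from the list.
acts = ["", "relu", "tanh", ""]
encoder_acts = acts[1:]        # starting with the second element
decoder_acts = acts[::-1][1:]  # reversed, starting second from the back
print(encoder_acts + decoder_acts)
# ['relu', 'tanh', '', 'tanh', 'relu', '']
```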
- periodicity#
Defines the distance between periodic walls for the inputs. For example 2pi for angular values in radians. All periodic data processed by EncoderMap must be wrapped to one periodic window. E.g. data with 2pi periodicity may contain values from -pi to pi or from 0 to 2pi. Set the periodicity to float(“inf”) for non-periodic inputs.
- Type:
float
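Wrapping data to one periodic window, as required above, can be sketched like this for a 2*pi periodicity (an illustrative snippet, not part of EncoderMap's API):

```python
import numpy as np

# Sketch: wrap angular data with 2*pi periodicity into [-pi, pi)
angles = np.array([3.5, -3.5, 1.0])
wrapped = (angles + np.pi) % (2 * np.pi) - np.pi
print(wrapped)  # all values now lie in [-pi, pi)
```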
- dist_sig_parameters#
Parameters for the sigmoid functions applied to the high- and low-dimensional distances in the following order (sig_h, a_h, b_h, sig_l, a_l, b_l)
- Type:
tuple of floats
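A sketch-map style sigmoid with these parameters has the following general form (a reconstruction for illustration with example parameter values, not the library's exact implementation):

```python
import numpy as np

def sig(r, sigma, a, b):
    # Sketch-map style sigmoid; by construction sig(sigma, sigma, a, b) == 0.5
    return 1.0 - (1.0 + (2.0 ** (a / b) - 1.0) * (r / sigma) ** a) ** (-b / a)

# hypothetical example values in the order (sig_h, a_h, b_h, sig_l, a_l, b_l)
sig_h, a_h, b_h, sig_l, a_l, b_l = 4.5, 12, 6, 1, 2, 6
print(sig(sig_h, sig_h, a_h, b_h))  # 0.5 at r == sigma
```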
- distance_cost_scale#
Adjusts how much the distance based metric is weighted in the cost function.
- Type:
float
- auto_cost_variant#
Defines how the auto cost is calculated. Must be one of:
“mean_square”
“mean_abs”
“mean_norm”.
- Type:
str
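The three variants can be sketched as follows (an illustrative reconstruction, not the library's exact code):

```python
import numpy as np

def auto_cost(inp, out, variant):
    # Sketch of the three variants named above
    diff = inp - out
    if variant == "mean_square":
        return np.mean(np.square(diff))
    if variant == "mean_abs":
        return np.mean(np.abs(diff))
    if variant == "mean_norm":
        # mean of the per-sample Euclidean norms
        return np.mean(np.linalg.norm(diff, axis=1))
    raise ValueError(f"Unknown auto_cost_variant: {variant}")
```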
- center_cost_scale#
Adjusts how much the centering cost is weighted in the cost function.
- Type:
float
- l2_reg_constant#
Adjusts how much the L2 regularisation is weighted in the cost function.
- Type:
float
- gpu_memory_fraction#
Specifies the fraction of GPU memory blocked. If set to 0, memory is allocated as needed.
- Type:
float
- id#
Can be any name for the run. Can be useful, for example, to distinguish analyses of different data sets.
- Type:
str
- model_api#
A string defining the API used to build the keras model. Defaults to “sequential”. Possible strings are:
“functional” will use keras’ functional API.
“sequential” will define a keras Model containing two other models built with the Sequential API. These two models are encoder and decoder.
“custom” will create a custom Model where even the layers are custom.
- Type:
str
- loss#
A string defining the loss function. Defaults to “emap_cost”. Possible losses are:
“reconstruction_loss” will try to train output == input.
“mse” returns a mean squared error loss.
“emap_cost” is the EncoderMap loss function. Depending on the class (Autoencoder, EncoderMap, ADCAutoencoder), different contributions are used for a combined loss. Autoencoder uses auto_cost, reg_cost, and center_cost. The EncoderMap class adds sigmoid_loss.
- Type:
str
- training#
A string defining what kind of training is performed when autoencoder.train() is called.
“auto” does a regular model.compile() and model.fit() procedure.
“custom” uses gradient tape and calculates losses and gradients manually.
- Type:
str
- seed#
Fixes the state of all operations using random numbers. Defaults to None.
- Type:
Union[int, None]
- write_summary#
If True, a summary.txt of the models is written into main_path. If tensorboard is True, summaries will also be written.
- Type:
bool
- trainable_dense_to_sparse#
When using different topologies to train the AngleDihedralCartesianEncoderMap, some inputs might be sparse, meaning they have missing values. Creating a dense input is done by first passing these sparse tensors through tf.keras.layers.Dense layers. These layers have trainable weights, and if this parameter is True, these weights will be changed by the optimizer.
- Type:
bool
- using_hypercube#
This parameter is not meant to be set by the user. It allows us to print better error messages when re-loading and re-training a model. It contains a boolean indicating whether a model has been trained on the hypercube example data. If your data is 4-dimensional and you reload a model but forget to provide your data, the model will happily train with the hypercube (and not your) data. This variable implements a check.
- Type:
bool
Examples
>>> import encodermap as em
>>> import tempfile
>>> from pathlib import Path
...
>>> with tempfile.TemporaryDirectory() as td:
...     td = Path(td)
...     p = em.Parameters()
...     print(p.auto_cost_variant)
...     savepath = p.save(td / "parameters.json")
...     print(savepath)
...     new_params = em.Parameters.from_file(td / "parameters.json")
...     print(new_params.main_path)
mean_abs
/tmp...parameters.json
seems like the parameter file was moved to another directory. Parameter file is updated ...
/home...
Instantiate the Parameters class
Takes a dict as input and overwrites the class defaults. The dict is directly stored as an attribute and can be accessed via instance attributes.
- Parameters:
**kwargs (dict) – Dict containing values. If unknown keys are passed, they will be dropped.
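The instantiation behavior described above can be sketched as follows (an illustrative pattern with made-up example defaults, not the library's actual implementation):

```python
# Sketch: known keyword args override class defaults, unknown ones are
# dropped. The default values here are hypothetical examples.
class ParametersSketch:
    defaults = {"auto_cost_variant": "mean_abs", "center_cost_scale": 0.0001}

    def __init__(self, **kwargs):
        merged = dict(self.defaults)
        # keep only keys that are known parameters
        merged.update({k: v for k, v in kwargs.items() if k in self.defaults})
        self.__dict__.update(merged)

p = ParametersSketch(center_cost_scale=1.0, not_a_parameter=42)
print(p.center_cost_scale)            # 1.0 (overridden)
print(hasattr(p, "not_a_parameter"))  # False (dropped)
```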
ADCParameters#
- class ADCParameters(**kwargs)[source]#
This is the parameter object for the AngleDihedralCartesianEncoderMap. It holds all the parameters that the Parameters object includes, plus the following attributes:
- track_clashes#
Whether to track the number of clashes during training. The average number of clashes is the average number of distances in the reconstructed cartesian coordinates with a distance smaller than 1 (nm). Defaults to False.
- Type:
bool
- track_RMSD#
Whether to track the RMSD of the input and reconstructed cartesians during training. The RMSDs are computed along the batch by minimizing

.. math::

    \text{RMSD}(\mathbf{x}, \mathbf{x}^{\text{ref}}) = \min_{\mathsf{R}, \mathbf{t}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[ (\mathsf{R}\cdot\mathbf{x}_{i}(t) + \mathbf{t}) - \mathbf{x}_{i}^{\text{ref}} \right]^{2}}
This results in n RMSD values, where n is the size of the batch. The mean RMSD of the batch and the individual values for the batch will be logged to tensorboard.
- Type:
bool
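The minimization over the rotation R and translation t can be sketched with the Kabsch algorithm (an illustrative NumPy implementation, not the library's code):

```python
import numpy as np

def kabsch_rmsd(x, x_ref):
    # Sketch: RMSD after optimal superposition (Kabsch algorithm)
    x = x - x.mean(axis=0)              # optimal translation: center both
    x_ref = x_ref - x_ref.mean(axis=0)
    u, _, vt = np.linalg.svd(x.T @ x_ref)
    d = np.sign(np.linalg.det(vt.T @ u.T))   # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T  # optimal rotation R
    x_rot = (r @ x.T).T
    return np.sqrt(np.mean(np.sum((x_rot - x_ref) ** 2, axis=1)))

# a rotated and translated copy should give (numerically) zero RMSD
pts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.],
                [0., 0., 1.], [1., 1., 1.]])
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta), 0.],
                [np.sin(theta),  np.cos(theta), 0.],
                [0., 0., 1.]])
moved = pts @ rot.T + np.array([1., 2., 3.])
print(kabsch_rmsd(moved, pts))  # ~0.0
```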
- cartesian_pwd_start#
Index of the first atom to use for the pairwise distance calculation.
- Type:
int
- cartesian_pwd_step#
Step for the calculation of pairwise distances. E.g. for a chain of atoms N-C_a-C-N-C_a-C…, cartesian_pwd_start=1 and cartesian_pwd_step=3 will result in using all C-alpha atoms for the pairwise distance calculation.
- Type:
int
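The atom selection described above amounts to simple strided indexing (an illustrative snippet with random example coordinates):

```python
import numpy as np

# Sketch: pick C-alpha atoms from a backbone chain N-CA-C-N-CA-C-...
coords = np.random.rand(9, 3)  # 9 backbone atoms (3 residues), example data
cartesian_pwd_start, cartesian_pwd_step = 1, 3
ca = coords[cartesian_pwd_start::cartesian_pwd_step]  # atoms 1, 4, 7

# pairwise distance matrix between the selected atoms
pwd = np.linalg.norm(ca[:, None, :] - ca[None, :, :], axis=-1)
print(pwd.shape)  # (3, 3)
```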
- use_backbone_angles#
Allows to define whether backbone bond angles should be learned (True) or if instead mean values should be used to generate conformations (False).
- Type:
bool
- angle_cost_variant#
Defines how the angle cost is calculated. Must be one of:
“mean_square”
“mean_abs”
“mean_norm”.
- Type:
str
- angle_cost_reference#
Can be used to normalize the angle cost with the cost of the same dummy reference model.
- Type:
- dihedral_cost_variant#
Defines how the dihedral cost is calculated. Must be one of:
“mean_square”
“mean_abs”
“mean_norm”.
- Type:
str
- dihedral_cost_reference#
Can be used to normalize the dihedral cost with the cost of the same dummy reference model.
- Type:
- side_dihedral_cost_scale#
Adjusts how much the side dihedral cost is weighted in the cost function.
- Type:
float
- side_dihedral_cost_variant#
Defines how the side dihedral cost is calculated. Must be one of:
“mean_square”
“mean_abs”
“mean_norm”.
- Type:
str
- side_dihedral_cost_reference#
Can be used to normalize the side dihedral cost with the cost of the same dummy reference model.
- Type:
- cartesian_cost_scale#
Adjusts how much the cartesian cost is weighted in the cost function.
- Type:
float
- cartesian_cost_scale_soft_start#
Allows to slowly turn on the cartesian cost. Must be a tuple of (start, end) or (None, None). If start and end are given, cartesian_cost_scale will be increased linearly in the given range.
- Type:
tuple
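The linear ramp described above can be sketched as follows (an illustrative helper, not part of EncoderMap's API):

```python
def soft_start_scale(step, cartesian_cost_scale, soft_start):
    # Sketch: linear ramp of the cartesian cost between (start, end) steps
    start, end = soft_start
    if start is None or end is None:
        return cartesian_cost_scale   # no soft start: full scale immediately
    if step < start:
        return 0.0
    if step >= end:
        return cartesian_cost_scale
    return cartesian_cost_scale * (step - start) / (end - start)

print(soft_start_scale(50, 1.0, (0, 100)))     # 0.5
print(soft_start_scale(50, 1.0, (None, None))) # 1.0
```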
- cartesian_cost_variant#
Defines how the cartesian cost is calculated. Must be one of:
“mean_square”
“mean_abs”
“mean_norm”.
- Type:
str
- cartesian_cost_reference#
Can be used to normalize the cartesian cost with the cost of the same dummy reference model.
- Type:
- cartesian_dist_sig_parameters#
Parameters for the sigmoid functions applied to the high- and low-dimensional distances in the following order (sig_h, a_h, b_h, sig_l, a_l, b_l).
- Type:
tuple of floats
- cartesian_distance_cost_scale#
Adjusts how much the cartesian distance cost is weighted in the cost function.
- Type:
float
- multimer_training#
Experimental feature.
- Type:
Any
- multimer_topology_classes#
Experimental feature.
- Type:
Any
- multimer_connection_bridges#
Experimental feature.
- Type:
Any
- multimer_lengths#
Experimental feature.
- Type:
Any
Examples
>>> import encodermap as em
>>> import tempfile
>>> from pathlib import Path
...
>>> with tempfile.TemporaryDirectory() as td:
...     td = Path(td)
...     p = em.Parameters()
...     print(p.auto_cost_variant)
...     savepath = p.save(td / "parameters.json")
...     print(savepath)
...     new_params = em.Parameters.from_file(td / "parameters.json")
...     print(new_params.main_path)
mean_abs
/tmp...parameters.json
seems like the parameter file was moved to another directory. Parameter file is updated ...
/home...
Instantiate the ADCParameters class
Takes a dict as input and overwrites the class defaults. The dict is directly stored as an attribute and can be accessed via instance attributes.
- Parameters:
**kwargs (dict) – Dict containing values. If unknown keys are passed, they will be dropped.