encodermap.parameters package#
Submodules#
encodermap.parameters.parameters module#
Parameter Classes for Encodermap.
This module contains parameter classes which are used to hold information for the encodermap autoencoder. Parameters can be set from keyword arguments, by overwriting the class attributes or by reading them from .json, .yaml or ASCII files.
- Features:
Setting and saving parameters with the Parameters class.
Loading parameters from disk and continuing where you left off.
The Parameters and ADCParameters classes already contain good default values.
- class encodermap.parameters.parameters.ADCParameters(**kwargs: Optional[Union[float, int, str, bool, list[int], list[str], list[float], tuple[int, None]]])[source]#
Bases:
ParametersFramework
This is the parameter object for the AngleDihedralCartesianEncoder. It holds all the parameters that the Parameters object includes, plus the following attributes:
- cartesian_pwd_start#
Index of the first atom to use for the pairwise distance calculation.
- Type:
int
- cartesian_pwd_stop#
Index of the last atom to use for the pairwise distance calculation.
- Type:
int
- cartesian_pwd_step#
Step for the calculation of pairwise distances. E.g. for a chain of atoms N-C_a-C-N-C_a-C… cartesian_pwd_start=1 and cartesian_pwd_step=3 will result in using all C-alpha atoms for the pairwise distance calculation.
- Type:
int
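The C-alpha selection described for cartesian_pwd_start and cartesian_pwd_step can be illustrated with plain Python slicing. This is only a sketch: it assumes the three cartesian_pwd_* values behave like the arguments of a slice, and the atom list is made up for illustration.

```python
# Hypothetical backbone atom names for a three-residue chain.
atoms = ["N", "CA", "C"] * 3

# Assumption: start, stop and step behave like the arguments of a slice.
cartesian_pwd_start = 1
cartesian_pwd_stop = None
cartesian_pwd_step = 3

selected = atoms[cartesian_pwd_start:cartesian_pwd_stop:cartesian_pwd_step]
print(selected)  # ['CA', 'CA', 'CA']
```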
- use_backbone_angles#
Defines whether backbone bond angles should be learned (True) or whether mean values should be used instead to generate conformations (False).
- Type:
bool
- use_sidechains#
Whether sidechain dihedrals should be passed through the autoencoder.
- Type:
bool
- angle_cost_scale#
Adjusts how much the angle cost is weighted in the cost function.
- Type:
int
- angle_cost_variant#
Defines how the angle cost is calculated. Must be one of: “mean_square”, “mean_abs”, “mean_norm”.
- Type:
str
- angle_cost_reference#
Can be used to normalize the angle cost with the cost of a reference (dummy) model.
- Type:
int
- dihedral_cost_scale#
Adjusts how much the dihedral cost is weighted in the cost function.
- Type:
int
- dihedral_cost_variant#
Defines how the dihedral cost is calculated. Must be one of: “mean_square”, “mean_abs”, “mean_norm”.
- Type:
str
- dihedral_cost_reference#
Can be used to normalize the dihedral cost with the cost of a reference (dummy) model.
- Type:
int
- side_dihedral_cost_scale#
Adjusts how much the side dihedral cost is weighted in the cost function.
- Type:
int
- side_dihedral_cost_variant#
Defines how the side dihedral cost is calculated. Must be one of: “mean_square”, “mean_abs”, “mean_norm”.
- Type:
str
- side_dihedral_cost_reference#
Can be used to normalize the side dihedral cost with the cost of a reference (dummy) model.
- Type:
int
- cartesian_cost_scale#
Adjusts how much the cartesian cost is weighted in the cost function.
- Type:
int
- cartesian_cost_scale_soft_start#
Allows the cartesian cost to be turned on gradually. Must be a tuple (start, end) or (None, None). If start and end are given, cartesian_cost_scale is increased linearly in the given range.
- Type:
tuple
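The linear ramp described for cartesian_cost_scale_soft_start might look like the following sketch. The function name soft_start_scale is hypothetical, and the behaviour outside the range (zero before start, full scale after end) is an assumption, not something stated in this reference.

```python
def soft_start_scale(step, target_scale, soft_start):
    """Return an effective cost scale for a training step (illustrative only)."""
    start, end = soft_start
    # (None, None) means no soft start: use the full scale from the beginning.
    if start is None or end is None:
        return target_scale
    # Assumption: the cost is switched off before the ramp begins.
    if step <= start:
        return 0.0
    # Assumption: the full scale applies once the ramp has finished.
    if step >= end:
        return target_scale
    # Linear increase between start and end.
    return target_scale * (step - start) / (end - start)
```

For example, with soft_start=(100, 200) and a target scale of 2.0, step 150 lies halfway through the ramp and yields 1.0.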
- cartesian_cost_variant#
Defines how the cartesian cost is calculated. Must be one of: “mean_square”, “mean_abs”, “mean_norm”.
- Type:
str
- cartesian_cost_reference#
Can be used to normalize the cartesian cost with the cost of a reference (dummy) model.
- Type:
int
- cartesian_dist_sig_parameters#
Parameters for the sigmoid functions applied to the high- and low-dimensional distances in the following order (sig_h, a_h, b_h, sig_l, a_l, b_l).
- Type:
tuple of floats
- cartesian_distance_cost_scale#
Adjusts how much the cartesian distance cost is weighted in the cost function.
- Type:
int
Examples
>>> import encodermap as em
>>> parameters = em.ADCParameters()
>>> parameters.auto_cost_variant
mean_abs
>>> parameters.save(path='/path/to/dir')
/path/to/dir/parameters.json
>>> # alternative constructor
>>> new_params = em.Parameters.from_file('/path/to/dir/parameters.json')
>>> new_params.main_path
/path/to/dir/parameters.json
- __init__(**kwargs: Optional[Union[float, int, str, bool, list[int], list[str], list[float], tuple[int, None]]]) None [source]#
Instantiate the ADCParameters class
Takes a dict as input and overwrites the class defaults. The dict is directly stored as an attribute and can be accessed via instance attributes.
- Parameters:
**kwargs (dict) – Dict containing values. If unknown keys are passed, they will be dropped.
- activation_functions: list[str]#
- defaults = {'activation_functions': ['', 'tanh', 'tanh', ''], 'analysis_path': '', 'angle_cost_reference': 1, 'angle_cost_scale': 0, 'angle_cost_variant': 'mean_abs', 'auto_cost_scale': None, 'auto_cost_variant': 'mean_abs', 'batch_size': 256, 'batched': True, 'cartesian_cost_reference': 1, 'cartesian_cost_scale': 1, 'cartesian_cost_scale_soft_start': (None, None), 'cartesian_cost_variant': 'mean_abs', 'cartesian_dist_sig_parameters': (4.5, 12, 6, 1, 2, 6), 'cartesian_distance_cost_scale': 1, 'cartesian_pwd_start': None, 'cartesian_pwd_step': None, 'cartesian_pwd_stop': None, 'center_cost_scale': 0.0001, 'checkpoint_step': 5000, 'dihedral_cost_reference': 1, 'dihedral_cost_scale': 1, 'dihedral_cost_variant': 'mean_abs', 'dist_sig_parameters': (4.5, 12, 6, 1, 2, 6), 'distance_cost_scale': None, 'gpu_memory_fraction': 0, 'id': '', 'l2_reg_constant': 0.001, 'learning_rate': 0.001, 'loss': 'emap_cost', 'model_api': 'functional', 'n_neurons': [128, 128, 2], 'n_steps': 100000, 'periodicity': 6.283185307179586, 'seed': None, 'side_dihedral_cost_reference': 1, 'side_dihedral_cost_scale': 0.5, 'side_dihedral_cost_variant': 'mean_abs', 'summary_step': 10, 'tensorboard': False, 'training': 'auto', 'use_backbone_angles': False, 'use_sidechains': False}#
- classmethod defaults_description() str [source]#
str: A string that contains tabulated default parameter values.
- n_neurons: list[int]#
- class encodermap.parameters.parameters.Parameters(**kwargs: Optional[Union[float, int, str, bool, list[int], list[str], list[float], tuple[int, None]]])[source]#
Bases:
ParametersFramework
Class to hold Parameters for the Autoencoder
Parameters can be set via keyword args while instantiating the class, set as instance attributes or read from disk. This class can write parameters to disk in .yaml or .json format.
- defaults#
Class variable dict that holds the defaults, even when the current values have changed.
- Type:
dict
- main_path#
Defines a main path where the parameters and other things might be stored.
- Type:
str
- n_neurons#
List containing the number of neurons for each layer up to the bottleneck layer. For example [128, 128, 2] stands for an autoencoder with the following architecture {i, 128, 128, 2, 128, 128, i} where i is the number of dimensions of the input data. The two outer layers of size i are the input/output layers, which are not trained.
- Type:
list of int
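The mirrored architecture described for n_neurons can be written out with a small helper. This is a sketch for illustration; layer_sizes is a hypothetical name and not part of encodermap.

```python
def layer_sizes(n_neurons, input_dim):
    """Expand an n_neurons list into the full symmetric list of layer sizes."""
    encoder = list(n_neurons)
    # Mirror the encoder without repeating the bottleneck layer.
    decoder = encoder[-2::-1]
    return [input_dim] + encoder + decoder + [input_dim]


print(layer_sizes([128, 128, 2], input_dim=10))
# [10, 128, 128, 2, 128, 128, 10]
```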
- activation_functions#
List of activation function names as implemented in TensorFlow. For example: “relu”, “tanh”, “sigmoid” or “” to use no activation function. The encoder part of the network takes the activation functions from the list starting with the second element. The decoder part of the network takes the activation functions in reversed order starting with the second element from the back. For example [“”, “relu”, “tanh”, “”] would result in an autoencoder with {“relu”, “tanh”, “”, “tanh”, “relu”, “”} as sequence of activation functions.
- Type:
list of str
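The ordering rule for activation_functions can be reproduced with two slices. The helper name expand_activations is hypothetical; the sketch only mirrors the rule stated in the description and says nothing about how the library builds its layers.

```python
def expand_activations(activation_functions):
    """Return the per-layer activation sequence for encoder plus decoder."""
    # Encoder: the list starting with the second element.
    encoder = activation_functions[1:]
    # Decoder: reversed order, starting with the second element from the back.
    decoder = activation_functions[::-1][1:]
    return encoder + decoder


print(expand_activations(["", "relu", "tanh", ""]))
# ['relu', 'tanh', '', 'tanh', 'relu', '']
```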
- periodicity#
Defines the distance between periodic walls for the inputs. For example 2pi for angular values in radians. All periodic data processed by EncoderMap must be wrapped to one periodic window. E.g. data with 2pi periodicity may contain values from -pi to pi or from 0 to 2pi. Set the periodicity to float(“inf”) for non-periodic inputs.
- Type:
float
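A minimal sketch of what wrapping to one periodic window means in practice is given below. The helpers wrap and periodic_distance are hypothetical names written with the standard library only; they are not encodermap functions.

```python
import math


def wrap(value, periodicity=2 * math.pi):
    """Wrap a value into the window [-periodicity/2, periodicity/2)."""
    if math.isinf(periodicity):
        return value  # non-periodic input: nothing to wrap
    half = periodicity / 2
    return (value + half) % periodicity - half


def periodic_distance(a, b, periodicity=2 * math.pi):
    """Shortest distance between two values across the periodic walls."""
    d = abs(a - b)
    if math.isinf(periodicity):
        return d
    d = d % periodicity
    return min(d, periodicity - d)
```

With the default 2pi periodicity, the angles -3.0 and 3.0 are about 0.28 apart rather than 6.0, because the shortest path crosses the periodic wall.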
- learning_rate#
Learning rate used by the optimizer.
- Type:
float
- n_steps#
Number of training steps.
- Type:
int
- batch_size#
Number of training points used in each training step.
- Type:
int
- summary_step#
A summary for TensorBoard is written every summary_step steps.
- Type:
int
- checkpoint_step#
A checkpoint is written every checkpoint_step steps.
- Type:
int
- dist_sig_parameters#
Parameters for the sigmoid functions applied to the high- and low-dimensional distances in the following order (sig_h, a_h, b_h, sig_l, a_l, b_l).
- Type:
tuple of floats
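To give a feeling for the six values, the sketch below evaluates a sigmoid of the form used in the EncoderMap publication, 1 - (1 + (2^(a/b) - 1) (r/sig)^a)^(-b/a). That functional form is an assumption here; this reference does not spell it out and the library's internal implementation may differ.

```python
def sigmoid(r, sig, a, b):
    """Assumed EncoderMap-style sigmoid mapping a distance r >= 0 into [0, 1)."""
    return 1.0 - (1.0 + (2.0 ** (a / b) - 1.0) * (r / sig) ** a) ** (-b / a)


# Default values, unpacked in the documented order.
sig_h, a_h, b_h, sig_l, a_l, b_l = (4.5, 12, 6, 1, 2, 6)

# With this form, the inflection parameter sig marks the distance mapped to 0.5.
high_d = sigmoid(4.5, sig_h, a_h, b_h)
low_d = sigmoid(1.0, sig_l, a_l, b_l)
```

Under this assumption, sig sets the distance at which the sigmoid reaches one half, while a and b shape how quickly it rises below and saturates above that distance.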
- distance_cost_scale#
Adjusts how much the distance based metric is weighted in the cost function.
- Type:
int
- auto_cost_scale#
Adjusts how much the autoencoding cost is weighted in the cost function.
- Type:
int
- auto_cost_variant#
Defines how the auto cost is calculated. Must be one of: mean_square, mean_abs, mean_norm.
- Type:
str
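The three variant names suggest the following reductions over the per-feature differences between input and output. This is a sketch inferred from the names only; the exact definitions used by encodermap are an assumption, and cost is a hypothetical helper.

```python
import math


def cost(differences, variant="mean_abs"):
    """Reduce a list of rows (one row of differences per sample) to a scalar."""
    flat = [d for row in differences for d in row]
    if variant == "mean_square":
        # Mean of the squared differences over all entries.
        return sum(d * d for d in flat) / len(flat)
    if variant == "mean_abs":
        # Mean of the absolute differences over all entries.
        return sum(abs(d) for d in flat) / len(flat)
    if variant == "mean_norm":
        # Mean over samples of the Euclidean norm of each row.
        norms = [math.sqrt(sum(d * d for d in row)) for row in differences]
        return sum(norms) / len(norms)
    raise ValueError(f"unknown variant: {variant!r}")
```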
- center_cost_scale#
Adjusts how much the centering cost is weighted in the cost function.
- Type:
float
- l2_reg_constant#
Adjusts how much the L2 regularisation is weighted in the cost function.
- Type:
float
- gpu_memory_fraction#
Specifies the fraction of GPU memory blocked. If set to 0, memory is allocated as needed.
- Type:
float
- analysis_path#
A path that can be used to store analysis.
- Type:
str
- id#
Can be any name for the run. Might be useful for example for specific analysis for different data sets.
- Type:
str
- model_api#
A string defining the API to be used to build the Keras model. Defaults to sequential. Possible strings are:
- functional will use Keras’ functional API.
- sequential will define a Keras Model containing two other models built with the Sequential API. These two models are the encoder and the decoder.
- custom will create a custom Model where even the layers are custom.
- Type:
str
- loss#
A string defining the loss function. Defaults to emap_cost. Possible losses are:
- reconstruction_loss will try to train output == input.
- mse returns a mean squared error loss.
- emap_cost is the EncoderMap loss function. Depending on the class (Autoencoder, EncoderMap, ACDAutoencoder), different contributions are used for a combined loss. Autoencoder uses auto_cost, reg_cost and center_cost. The EncoderMap class adds sigmoid_loss.
- Type:
str
- batched#
Whether the dataset is batched or not.
- Type:
bool
- training#
A string defining what kind of training is performed when autoencoder.train() is called:
- auto does a regular model.compile() and model.fit() procedure.
- custom uses gradient tape and calculates losses and gradients manually.
- Type:
str
- tensorboard#
Whether to print tensorboard information. Defaults to False.
- Type:
bool
- seed#
Fixes the state of all operations using random numbers. Defaults to None.
- Type:
Union[int, None]
Examples
>>> import encodermap as em
>>> parameters = em.Parameters()
>>> parameters.auto_cost_variant
mean_abs
>>> parameters.save(path='/path/to/dir')
/path/to/dir/parameters.json
>>> # alternative constructor
>>> new_params = em.Parameters.from_file('/path/to/dir/parameters.json')
>>> new_params.main_path
/path/to/dir/parameters.json
- __init__(**kwargs: Optional[Union[float, int, str, bool, list[int], list[str], list[float], tuple[int, None]]]) None [source]#
Instantiate the Parameters class
Takes a dict as input and overwrites the class defaults. The dict is directly stored as an attribute and can be accessed via instance attributes.
- Parameters:
**kwargs (dict) – Dict containing values. If unknown keys are passed, they will be dropped.
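The pattern of class-level defaults, keyword overrides and dropped unknown keys can be sketched with a small stand-in class. SketchParameters and its methods are hypothetical and much simpler than the real Parameters class; the JSON round trip uses only the standard library.

```python
import json


class SketchParameters:
    """Hypothetical stand-in showing defaults plus keyword overrides."""

    # Class variable: stays unchanged even when instance values change.
    defaults = {"learning_rate": 0.001, "n_steps": 100000}

    def __init__(self, **kwargs):
        # Start from the class defaults.
        for key, value in self.defaults.items():
            setattr(self, key, value)
        # Overwrite with known keys; unknown keys are silently dropped.
        for key, value in kwargs.items():
            if key in self.defaults:
                setattr(self, key, value)

    def to_json(self):
        """Serialize the current values of all known parameters."""
        return json.dumps({key: getattr(self, key) for key in self.defaults})

    @classmethod
    def from_json(cls, text):
        """Alternative constructor that rebuilds an instance from JSON text."""
        return cls(**json.loads(text))


params = SketchParameters(learning_rate=0.01, bogus=1)
restored = SketchParameters.from_json(params.to_json())
```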
- activation_functions: list[str]#
- defaults = {'activation_functions': ['', 'tanh', 'tanh', ''], 'analysis_path': '', 'auto_cost_scale': 1, 'auto_cost_variant': 'mean_abs', 'batch_size': 256, 'batched': True, 'center_cost_scale': 0.0001, 'checkpoint_step': 5000, 'dist_sig_parameters': (4.5, 12, 6, 1, 2, 6), 'distance_cost_scale': 500, 'gpu_memory_fraction': 0, 'id': '', 'l2_reg_constant': 0.001, 'learning_rate': 0.001, 'loss': 'emap_cost', 'model_api': 'sequential', 'n_neurons': [128, 128, 2], 'n_steps': 100000, 'periodicity': 6.283185307179586, 'seed': None, 'summary_step': 10, 'tensorboard': False, 'training': 'auto'}#
- classmethod defaults_description() str [source]#
str: A string that contains tabulated default parameter values.
- n_neurons: list[int]#
- encodermap.parameters.parameters._datetime_windows_and_linux_compatible()[source]#
Portable way to get the current time as either a Linux- or Windows-compatible string.
- For linux systems strings in this manner will be returned:
2022-07-13T16:04:04+02:00
- For windows systems strings in this manner will be returned:
2022-07-13_16-04-46
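The two formats shown above can be produced with the standard library alone. The function portable_timestamp and its arguments are hypothetical, and using os.name to detect the platform is an assumption, not a description of how the private helper decides.

```python
import os
from datetime import datetime, timedelta, timezone


def portable_timestamp(now=None, windows=None):
    """Return a timestamp string that is safe to use on the current platform."""
    if now is None:
        # Local time with UTC offset, truncated to whole seconds.
        now = datetime.now().astimezone().replace(microsecond=0)
    if windows is None:
        windows = os.name == "nt"  # assumption: detect Windows via os.name
    if windows:
        # Colons are not allowed in Windows file names, so avoid them.
        return now.strftime("%Y-%m-%d_%H-%M-%S")
    return now.isoformat()


fixed = datetime(2022, 7, 13, 16, 4, 4, tzinfo=timezone(timedelta(hours=2)))
print(portable_timestamp(fixed, windows=False))  # 2022-07-13T16:04:04+02:00
print(portable_timestamp(fixed, windows=True))   # 2022-07-13_16-04-04
```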