pyroVED models

trVAE

class pyroved.models.trVAE(data_dim, latent_dim=2, coord=3, num_classes=0, hidden_dim_e=128, hidden_dim_d=128, num_layers_e=2, num_layers_d=2, activation='tanh', sampler_d='bernoulli', sigmoid_d=True, seed=1, **kwargs)[source]

Bases: pyroved.models.base.baseVAE

Variational autoencoder that enforces rotational and/or translational invariances

Parameters
  • data_dim (Tuple[int]) – Dimensionality of the input data; use (height, width) for images or (length,) for spectra.

  • latent_dim (int) – Number of latent dimensions.

  • coord (int) – For 2D systems, coord=0 is a vanilla VAE, coord=1 enforces rotational invariance, coord=2 enforces translational invariance, and coord=3 enforces both rotational and translational invariances. For 1D systems, coord=0 is a vanilla VAE and coord>0 enforces translational invariance.

  • num_classes (int) – Number of classes (if any) for class-conditioned (t)(r)VAE.

  • hidden_dim_e (int) – Number of hidden units in each layer of the encoder (inference network). The default value is 128.

  • hidden_dim_d (int) – Number of hidden units in each layer of the decoder (generator network). The default value is 128.

  • num_layers_e (int) – Number of layers in encoder (inference network). The default value is 2.

  • num_layers_d (int) – Number of layers in decoder (generator network). The default value is 2.

  • activation (str) – Non-linear activation for the inner layers of the encoder and decoder. The available activations are ReLU (‘relu’), leaky ReLU (‘lrelu’), hyperbolic tangent (‘tanh’), and softplus (‘softplus’). The default activation is ‘tanh’.

  • sampler_d (str) – Decoder sampler, defined as p(x|z) = sampler(decoder(z)). The available samplers are ‘bernoulli’, ‘continuous_bernoulli’, and ‘gaussian’ (Default: ‘bernoulli’).

  • sigmoid_d (bool) – Sigmoid activation for the decoder output (Default: True)

  • seed (int) – Seed used in torch.manual_seed(seed) and torch.cuda.manual_seed_all(seed)

  • kwargs (float) – Additional keyword arguments are dx_prior and dy_prior for setting the translational prior(s), and decoder_sig for setting sigma in the decoder’s sampler when the latter is set to “gaussian”.

Example:

Initialize a VAE model with rotational invariance

>>> data_dim = (28, 28)
>>> rvae = trVAE(data_dim, latent_dim=2, coord=1)

Initialize a class-conditioned VAE model with rotational invariance for a dataset with 10 classes

>>> data_dim = (28, 28)
>>> cvae = trVAE(data_dim, latent_dim=2, num_classes=10, coord=1)
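
Train the first model above (rvae) and inspect the learned manifold. This is a minimal sketch: it assumes pyroVED's SVItrainer and init_dataloader utilities are available and that train_data is a pre-loaded tensor of shape (n, 28, 28); the number of epochs is illustrative.

>>> train_loader = pyroved.utils.init_dataloader(train_data, batch_size=200)
>>> trainer = pyroved.trainers.SVItrainer(rvae)
>>> for e in range(100):  # illustrative number of training epochs
...     trainer.step(train_loader)
>>> rvae.manifold2d(d=10)  # plot the learned latent manifold
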
model(x, y=None, **kwargs)[source]

Defines the model p(x|z)p(z)

Return type

None

guide(x, y=None, **kwargs)[source]

Defines the guide q(z|x)

Return type

None

split_latent(z)[source]

Splits the latent variable into the part(s) encoding rotation and/or translation and the part encoding image content

Return type

Tuple[Tensor]

encode(x_new, **kwargs)[source]

Encodes data using a trained inference (encoder) network

Parameters
  • x_new (Tensor) – Data to encode with a trained trVAE. The new data must have the same dimensions (image height and width, or spectrum length) as the data used for training.

  • kwargs (int) – Batch size as ‘batch_size’ (for encoding large volumes of data)

Return type

Tensor
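
Example (a minimal sketch; x_new is assumed to be a tensor of new 28 x 28 images and rvae a trained model from the example above):

>>> z = rvae.encode(x_new, batch_size=100)  # latent embedding of the new data
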

decode(z, y=None, **kwargs)[source]

Decodes a batch of latent coordinates

Parameters
  • z (Tensor) – Latent coordinates (without rotational and translational parts)

  • y (Optional[Tensor]) – Class (if any) as a batch of one-hot vectors

  • kwargs (int) – Batch size as ‘batch_size’

Return type

Tensor
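
Example (a minimal sketch; assumes latent_dim=2 and a trained model rvae):

>>> import torch
>>> z_new = torch.randn(5, 2)   # 5 points in the 2D latent space
>>> x_gen = rvae.decode(z_new)  # corresponding decoded (generated) images
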

manifold2d(d, plot=True, **kwargs)[source]

Plots a learned latent manifold in the image space

Parameters
  • d (int) – Grid size

  • plot (bool) – Plots the generated manifold (Default: True)

  • kwargs (Union[str, int]) – Keyword arguments include ‘label’ for class label (if any), custom min/max values for grid boundaries passed as ‘z_coord’ (e.g. z_coord = [-3, 3, -3, 3]) and plot parameters (‘padding’, ‘padding_value’, ‘cmap’, ‘origin’, ‘ylim’)

Return type

Tensor
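
Example (the keyword names follow the kwargs listed above):

>>> imgs_grid = rvae.manifold2d(d=12, cmap='gnuplot', origin='lower')
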

load_weights(filepath)

Loads saved weights of encoder(s) and decoder

Return type

None

save_weights(filepath)

Saves trained weights of encoder(s) and decoder

Return type

None
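
Example (a minimal sketch; the ‘.pt’ extension is written explicitly here, and the exact handling of file extensions may depend on the pyroVED version):

>>> rvae.save_weights('rvae_weights.pt')
>>> rvae.load_weights('rvae_weights.pt')
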

set_decoder(decoder_net)

Sets a user-defined decoder neural network

Return type

None

set_encoder(encoder_net)

Sets a user-defined encoder neural network

Return type

None

j-trVAE

class pyroved.models.jtrVAE(data_dim, latent_dim, discrete_dim, coord=0, hidden_dim_e=128, hidden_dim_d=128, num_layers_e=2, num_layers_d=2, activation='tanh', sampler_d='bernoulli', sigmoid_d=True, seed=1, **kwargs)[source]

Bases: pyroved.models.base.baseVAE

Variational autoencoder for jointly learning discrete and continuous latent representations of data with arbitrary rotations and/or translations

Parameters
  • data_dim (Tuple[int]) – Dimensionality of the input data; use (height, width) for images or (length,) for spectra.

  • latent_dim (int) – Number of continuous latent dimensions.

  • discrete_dim (int) – Number of discrete latent dimensions.

  • coord (int) – For 2D systems, coord=0 is a vanilla VAE, coord=1 enforces rotational invariance, coord=2 enforces translational invariance, and coord=3 enforces both rotational and translational invariances. For 1D systems, coord=0 is a vanilla VAE and coord>0 enforces translational invariance.

  • hidden_dim_e (int) – Number of hidden units in each layer of the encoder (inference network).

  • hidden_dim_d (int) – Number of hidden units in each layer of the decoder (generator network).

  • num_layers_e (int) – Number of layers in encoder (inference network).

  • num_layers_d (int) – Number of layers in decoder (generator network).

  • activation (str) – Non-linear activation for the inner layers of the encoder and decoder. The available activations are ReLU (‘relu’), leaky ReLU (‘lrelu’), hyperbolic tangent (‘tanh’), and softplus (‘softplus’). The default activation is ‘tanh’.

  • sampler_d (str) – Decoder sampler, defined as p(x|z) = sampler(decoder(z)). The available samplers are ‘bernoulli’, ‘continuous_bernoulli’, and ‘gaussian’ (Default: ‘bernoulli’).

  • sigmoid_d (bool) – Sigmoid activation for the decoder output (Default: True)

  • seed (int) – Seed used in torch.manual_seed(seed) and torch.cuda.manual_seed_all(seed)

  • kwargs (float) – Additional keyword arguments are dx_prior and dy_prior for setting the translational prior(s), and decoder_sig for setting sigma in the decoder’s sampler when the latter is set to “gaussian”.

Example:

Initialize a joint VAE model with rotational invariance for 10 discrete classes

>>> data_dim = (28, 28)
>>> jrvae = jtrVAE(data_dim, latent_dim=2, discrete_dim=10, coord=1)
model(x, **kwargs)[source]

Defines the model p(x|z,c)p(z)p(c)

Return type

None

guide(x, **kwargs)[source]

Defines the guide q(z,c|x)

Return type

None

split_latent(zs)[source]

Splits the latent variable into the part(s) encoding rotation and/or translation and the part encoding image content

Return type

Tuple[Tensor]

encode(x_new, **kwargs)[source]

Encodes data using a trained inference (encoder) network

Parameters
  • x_new (Tensor) – Data to encode with a trained jtrVAE. The new data must have the same dimensions (image height and width, or spectrum length) as the data used for training.

  • kwargs (int) – Batch size as ‘batch_size’ (for encoding large volumes of data)

Return type

Tensor

decode(z, y, **kwargs)[source]

Decodes a batch of latent coordinates

Parameters
  • z (Tensor) – Latent coordinates (without rotational and translational parts)

  • y (Tensor) – Classes as one-hot vectors for each sample in z

Return type

Tensor
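
Example (a minimal sketch; assumes latent_dim=2 and discrete_dim=10 as in the example above, and a trained model jrvae):

>>> import torch
>>> import torch.nn.functional as F
>>> z = torch.randn(5, 2)  # continuous latent coordinates
>>> y = F.one_hot(torch.tensor([3] * 5), num_classes=10).float()  # discrete part as one-hot vectors
>>> x_gen = jrvae.decode(z, y)
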

manifold2d(d, disc_idx=0, plot=True, **kwargs)[source]

Plots a learned latent manifold in the image space

Parameters
  • d (int) – Grid size

  • disc_idx (int) – Discrete dimension for which we plot continuous latent manifolds

  • plot (bool) – Plots the generated manifold (Default: True)

  • kwargs (Union[str, int]) – Keyword arguments include custom min/max values for grid boundaries passed as ‘z_coord’ (e.g. z_coord = [-3, 3, -3, 3]) and plot parameters (‘padding’, ‘padding_value’, ‘cmap’, ‘origin’, ‘ylim’)

Return type

Tensor
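
Example (plots the continuous latent manifold corresponding to the fourth discrete class of the trained model above):

>>> jrvae.manifold2d(d=10, disc_idx=3)
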

load_weights(filepath)

Loads saved weights of encoder(s) and decoder

Return type

None

save_weights(filepath)

Saves trained weights of encoder(s) and decoder

Return type

None

set_decoder(decoder_net)

Sets a user-defined decoder neural network

Return type

None

set_encoder(encoder_net)

Sets a user-defined encoder neural network

Return type

None

ss-trVAE

class pyroved.models.sstrVAE(data_dim, latent_dim, num_classes, coord=3, hidden_dim_e=128, hidden_dim_d=128, hidden_dim_cls=128, num_layers_e=2, num_layers_d=2, num_layers_cls=2, sampler_d='bernoulli', sigmoid_d=True, seed=1, **kwargs)[source]

Bases: pyroved.models.base.baseVAE

Semi-supervised variational autoencoder with rotational and/or translational invariance

Parameters
  • data_dim (Tuple[int]) – Dimensionality of the input data; use (height, width) for images or (length,) for spectra.

  • latent_dim (int) – Number of latent dimensions.

  • num_classes (int) – Number of classes in the classification scheme

  • coord (int) – For 2D systems, coord=0 is a vanilla VAE, coord=1 enforces rotational invariance, coord=2 enforces translational invariance, and coord=3 enforces both rotational and translational invariances. For 1D systems, coord=0 is a vanilla VAE and coord>0 enforces translational invariance.

  • hidden_dim_e (int) – Number of hidden units in each layer of the encoder (inference network).

  • hidden_dim_d (int) – Number of hidden units in each layer of the decoder (generator network).

  • hidden_dim_cls (int) – Number of hidden units (“neurons”) in each layer of the classifier.

  • num_layers_e (int) – Number of layers in encoder (inference network).

  • num_layers_d (int) – Number of layers in decoder (generator network).

  • num_layers_cls (int) – Number of layers in the classifier.

  • sampler_d (str) – Decoder sampler, defined as p(x|z) = sampler(decoder(z)). The available samplers are ‘bernoulli’, ‘continuous_bernoulli’, and ‘gaussian’ (Default: ‘bernoulli’).

  • sigmoid_d (bool) – Sigmoid activation for the decoder output (Default: True)

  • seed (int) – Seed used in torch.manual_seed(seed) and torch.cuda.manual_seed_all(seed)

  • kwargs (float) – Additional keyword arguments are dx_prior and dy_prior for setting the translational prior(s), and decoder_sig for setting sigma in the decoder’s sampler when the latter is set to “gaussian”.

Example:

Initialize a VAE model with rotational invariance for semi-supervised learning on a dataset with 10 classes

>>> data_dim = (28, 28)
>>> ssvae = sstrVAE(data_dim, latent_dim=2, num_classes=10, coord=1)
model(xs, ys=None, **kwargs)[source]

Model of the generative process p(x|z,y)p(y)p(z)

Return type

None

guide(xs, ys=None, **kwargs)[source]

Guide q(z|y,x)q(y|x)

Return type

None

split_latent(zs)[source]

Splits the latent variable into the part(s) encoding rotation and/or translation and the part encoding image content

Return type

Tuple[Tensor]

model_classify(xs, ys=None, **kwargs)[source]

Models an auxiliary (supervised) loss

Return type

None

guide_classify(xs, ys=None, **kwargs)[source]

Dummy guide function to accompany model_classify

set_classifier(cls_net)[source]

Sets a user-defined classification network

Return type

None

classifier(x_new, **kwargs)[source]

Classifies data

Parameters
  • x_new (Tensor) – Data to classify with a trained ss-trVAE. The new data must have the same dimensions (image height and width, or spectrum length) as the data used for training.

  • kwargs (int) – Batch size as ‘batch_size’ (for encoding large volumes of data)

Return type

Tensor
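
Example (a minimal sketch; x_unlabeled is assumed to be new, unlabeled data with the same dimensions as the training data, and ssvae a trained model from the example above):

>>> y_pred = ssvae.classifier(x_unlabeled, batch_size=100)
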

encode(x_new, y=None, **kwargs)[source]

Encodes data using a trained inference (encoder) network

Parameters
  • x_new (Tensor) – Data to encode with a trained ss-trVAE. The new data must have the same dimensions (image height and width, or spectrum length) as the data used for training.

  • y (Optional[Tensor]) – Classes as one-hot vectors for each sample in x_new. If not provided, the ss-trVAE’s classifier will be used to predict the classes.

  • kwargs (int) – Batch size as ‘batch_size’ (for encoding large volumes of data)

Return type

Tensor
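
Example (a minimal sketch; y_onehot is an assumed batch of one-hot class vectors matching x_labeled):

>>> z_unlabeled = ssvae.encode(x_unlabeled)           # classes inferred by the built-in classifier
>>> z_labeled = ssvae.encode(x_labeled, y=y_onehot)   # classes supplied explicitly
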

decode(z, y, **kwargs)[source]

Decodes a batch of latent coordinates

Parameters
  • z (Tensor) – Latent coordinates (without rotational and translational parts)

  • y (Tensor) – Classes as one-hot vectors for each sample in z

  • kwargs (int) – Batch size as ‘batch_size’

Return type

Tensor

manifold2d(d, plot=True, **kwargs)[source]

Returns a learned latent manifold in the image space

Parameters
  • d (int) – Grid size

  • plot (bool) – Plots the generated manifold (Default: True)

  • kwargs (Union[str, int]) – Keyword arguments include ‘label’ for class label (if any), custom min/max values for grid boundaries passed as ‘z_coord’ (e.g. z_coord = [-3, 3, -3, 3]) and plot parameters (‘padding’, ‘padding_value’, ‘cmap’, ‘origin’, ‘ylim’)

Return type

Tensor

load_weights(filepath)

Loads saved weights of encoder(s) and decoder

Return type

None

save_weights(filepath)

Saves trained weights of encoder(s) and decoder

Return type

None

set_decoder(decoder_net)

Sets a user-defined decoder neural network

Return type

None

set_encoder(encoder_net)

Sets a user-defined encoder neural network

Return type

None

VED

class pyroved.models.VED(input_dim, output_dim, input_channels=1, output_channels=1, latent_dim=2, hidden_dim_e=32, hidden_dim_d=96, num_layers_e=None, num_layers_d=None, activation='lrelu', batchnorm=False, sampler_d='bernoulli', sigmoid_d=True, seed=1, **kwargs)[source]

Bases: pyroved.models.base.baseVAE

Variational encoder-decoder model where the inputs and outputs are not identical. This model can be used to realize im2spec- and spec2im-type models, where 1D spectra are predicted from 2D image data and vice versa.

Parameters
  • input_dim (Tuple[int]) – Dimensionality of the input data; use (height, width) for images or (length,) for spectra.

  • output_dim (Tuple[int]) – Dimensionality of the output data; use (height, width) for images or (length,) for spectra. It does not have to match the input dimensionality.

  • input_channels (int) – Number of input channels (Default: 1)

  • output_channels (int) – Number of output channels (Default: 1)

  • latent_dim (int) – Number of latent dimensions.

  • hidden_dim_e (int) – Number of hidden units (convolutional filters) for each layer in the first block of the encoder NN. The number of units in the consecutive blocks is defined as hidden_dim_e * n, where n = 2, 3, …, n_blocks (Default: 32).

  • hidden_dim_d (int) – Number of hidden units (convolutional filters) for each layer in the first block of the decoder NN. The number of units in the consecutive blocks is defined as hidden_dim_d // n, where n = 2, 3, …, n_blocks (Default: 96).

  • num_layers_e (Optional[List[int]]) – List with numbers of layers per each block of the encoder NN. Defaults to [1, 2, 2] if none is specified.

  • num_layers_d (Optional[List[int]]) – List with numbers of layers per each block of the decoder NN. Defaults to [2, 2, 1] if none is specified.

  • activation (str) – Non-linear activation for the inner layers of the encoder and decoder. The available activations are ReLU (‘relu’), leaky ReLU (‘lrelu’), hyperbolic tangent (‘tanh’), and softplus (‘softplus’). The default activation is ‘lrelu’.

  • batchnorm (bool) – Batch normalization attached to each convolutional layer after non-linear activation (except for layers with 1x1 filters) in the encoder and decoder NNs (Default: False)

  • sampler_d (str) – Decoder sampler, defined as p(x|z) = sampler(decoder(z)). The available samplers are ‘bernoulli’, ‘continuous_bernoulli’, and ‘gaussian’ (Default: ‘bernoulli’).

  • sigmoid_d (bool) – Sigmoid activation for the decoder output (Default: True)

  • seed (int) – Seed used in torch.manual_seed(seed) and torch.cuda.manual_seed_all(seed)

  • kwargs (float) – An additional keyword argument is decoder_sig for setting sigma in the decoder’s sampler when the latter is set to “gaussian”.

Example:

Initialize a VED model for predicting 1D spectra from 2D images

>>> input_dim = (32, 32) # image height and width
>>> output_dim = (16,) # spectrum length
>>> ved = VED(input_dim, output_dim, latent_dim=2)
model(x=None, y=None, **kwargs)[source]

Defines the model p(y|z)p(z)

Return type

None

guide(x=None, y=None, **kwargs)[source]

Defines the guide q(z|x)

Return type

None

encode(x_new, **kwargs)[source]

Encodes data using a trained inference (encoder) network

Parameters
  • x_new (Tensor) – Data to encode with a trained VED model. The new data must have the same dimensions (image height and width, or spectrum length) as the data used for training.

  • kwargs (int) – Batch size as ‘batch_size’ (for encoding large volumes of data)

Return type

Tensor

decode(z, **kwargs)[source]

Decodes a batch of latent coordinates

Parameters

z (Tensor) – Latent coordinates

Return type

Tensor

predict(x_new, **kwargs)[source]

Forward prediction (encode -> sample -> decode)

Return type

Tensor
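
Example (a minimal sketch; imgs_new is assumed to be a batch of new 32 x 32 images for the model initialized in the example above):

>>> spectra_pred = ved.predict(imgs_new)
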

manifold2d(d, plot=True, **kwargs)[source]

Plots a learned latent manifold in the image space

Parameters
  • d (int) – Grid size

  • plot (bool) – Plots the generated manifold (Default: True)

  • kwargs (Union[str, int]) – Keyword arguments include custom min/max values for grid boundaries passed as ‘z_coord’ (e.g. z_coord = [-3, 3, -3, 3]) and plot parameters (‘padding’, ‘padding_value’, ‘cmap’, ‘origin’, ‘ylim’)

Return type

Tensor

load_weights(filepath)

Loads saved weights of encoder(s) and decoder

Return type

None

save_weights(filepath)

Saves trained weights of encoder(s) and decoder

Return type

None

set_decoder(decoder_net)

Sets a user-defined decoder neural network

Return type

None

set_encoder(encoder_net)

Sets a user-defined encoder neural network

Return type

None