ctgan model tag. Below is an example configuration that may be used to create a Gretel CTGAN model. All Gretel models implement a common interface to train or fine-tune synthetic data models from the model-specific config. See the reference example to train a model.
embedding_dim
(int, required, defaults to 128) - Size of the random sample passed to the Generator (z vector).
generator_dim
(List(int), required, defaults to [256, 256]) - Size of the output samples for each of the Residuals. Adding more numbers to this list will create more Residuals, one for each number. This is equivalent to increasing the depth of the Generator.
discriminator_dim
(List(int), required, defaults to [256, 256]) - Size of the output samples for each of the discriminator linear layers. A new linear layer will be created for each number added to this list.
generator_lr
(float, required, defaults to 2e-4) - Learning rate for the Generator.
generator_decay
(float, required, defaults to 1e-6) - Weight decay for the Generator's Adam optimizer.
discriminator_lr
(float, required, defaults to 2e-4) - Learning rate for the discriminator.
discriminator_decay
(float, required, defaults to 1e-6) - Weight decay for the discriminator's Adam optimizer.
batch_size
(int, required, defaults to 500) - Number of examples the model sees at each step. Importantly, this must be a multiple of 10, as specified by the CTGAN training scheme.
epochs
(int, required, defaults to 300) - Number of epochs the model trains for. A larger number results in longer training times but potentially higher-quality synthetic data.
discriminator_steps
(int, required, defaults to 1) - Number of discriminator steps taken for each Generator step. The original WGAN paper took 5 discriminator steps for each Generator step. The default here is 1, which follows the original CTGAN implementation.
log_frequency
(bool, required, defaults to True) - Whether to use the log frequency of categorical counts during conditional sampling. In some cases, switching to False improves performance.
verbose
(bool, required, defaults to False) - Whether to print progress during training.
pac
(int, required, defaults to 10) - Number of samples to group together when applying the discriminator.
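
The parameters in this list can be collected into a single mapping and sanity-checked before a training job is submitted. The following is a minimal sketch, not part of the Gretel client: `DEFAULT_CTGAN_PARAMS` and `validate_ctgan_params` are hypothetical names introduced here for illustration. The default values are copied from the list, and the checks encode only what this section states (the parameter types and the requirement that `batch_size` be a multiple of 10). How the mapping is embedded in a full model config is not shown.

```python
# Hypothetical helpers for illustration; these names are NOT part of any Gretel API.
# Default values are copied from the parameter list in this section.
DEFAULT_CTGAN_PARAMS = {
    "embedding_dim": 128,
    "generator_dim": [256, 256],
    "discriminator_dim": [256, 256],
    "generator_lr": 2e-4,
    "generator_decay": 1e-6,
    "discriminator_lr": 2e-4,
    "discriminator_decay": 1e-6,
    "batch_size": 500,
    "epochs": 300,
    "discriminator_steps": 1,
    "log_frequency": True,
    "verbose": False,
    "pac": 10,
}


def _is_positive_int(value):
    """True for ints >= 1; bools are rejected even though bool subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 1


def validate_ctgan_params(overrides=None):
    """Merge overrides into the documented defaults and run basic checks."""
    params = {**DEFAULT_CTGAN_PARAMS, **(overrides or {})}

    # Reject keys that are not in the documented parameter list.
    unknown = set(params) - set(DEFAULT_CTGAN_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    # Each entry in a *_dim list adds one layer of that size.
    for name in ("generator_dim", "discriminator_dim"):
        dims = params[name]
        if not dims or not all(_is_positive_int(d) for d in dims):
            raise ValueError(f"{name} must be a non-empty list of positive ints")

    for name in ("embedding_dim", "batch_size", "epochs", "discriminator_steps", "pac"):
        if not _is_positive_int(params[name]):
            raise ValueError(f"{name} must be a positive int")

    # This section states that batch_size must be a multiple of 10.
    if params["batch_size"] % 10 != 0:
        raise ValueError("batch_size must be a multiple of 10")

    return params


if __name__ == "__main__":
    # A deeper Generator (three Residuals) with a larger batch size.
    params = validate_ctgan_params(
        {"generator_dim": [256, 256, 256], "batch_size": 600}
    )
    print(params["generator_dim"], params["batch_size"])
```

Passing only the values that differ from the defaults keeps an override short and makes it easy to see what was changed. A value such as `batch_size=505` raises `ValueError` here before any training is attempted.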