Models
This section covers the generative machine learning models supported by Gretel APIs as well as core use cases and capabilities.

Supported Features

This section compares features of different generative data models supported by Gretel APIs.
✅ = Supported
✖️ = Not yet supported

| Feature                  | LSTM           | Gretel-GPT              | CTGAN                          | Amplify     | DGAN                           |
|--------------------------|----------------|-------------------------|--------------------------------|-------------|--------------------------------|
| Tag                      | synthetics     | gpt_x                   | ctgan                          | amplify     | timeseries_dgan                |
| Type                     | Language Model | Language Model          | Generative Adversarial Network | Statistical | Generative Adversarial Network |
| Model                    | LSTM           | Pre-trained Transformer | GAN                            | Statistical | GAN                            |
| Privacy filters          | ✅             | ✖️                      | ✅                             | ✖️          | ✖️                             |
| Differential privacy     | ✅             | ✖️                      | ✖️                             | ✖️          | ✖️                             |
| Synthetic quality report | ✅             | ✖️                      | ✅                             | ✖️          | ✖️                             |
| Tabular                  | ✅             | ✖️                      | ✅                             | ✅          | ✖️                             |
| Time-series              | ✅             | ✖️                      | ✖️                             | ✖️          | ✅                             |
| Natural language         | ✅             | ✅                      | ✖️                             | ✖️          | ✖️                             |
| Conditional generation   | ✅             | ✅                      | ✅                             | ✖️          | ✖️                             |
| Pre-trained              | ✖️             | ✅                      | ✖️                             | ✖️          | ✖️                             |
| Gretel cloud             | ✅             | ✅                      | ✅                             | ✅          | ✅                             |
| On-premises              | ✅             | ✅                      | ✅                             | ✅          | ✅                             |
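When choosing a model, the feature comparison above can be treated as a lookup. The sketch below transcribes the feature rows into a Python dictionary keyed by model tag and filters on the features you need. `FEATURES` and `models_supporting` are illustrative names introduced here, not part of any Gretel API, and the Gretel cloud and on-premises rows are omitted because every model supports both.

```python
# Feature support transcribed from the comparison above, keyed by model tag.
# FEATURES and models_supporting are illustrative names, not a Gretel API.
FEATURES = {
    "synthetics": {
        "privacy_filters", "differential_privacy", "quality_report",
        "tabular", "time_series", "natural_language", "conditional_generation",
    },
    "gpt_x": {"natural_language", "conditional_generation", "pre_trained"},
    "ctgan": {"privacy_filters", "quality_report", "tabular", "conditional_generation"},
    "amplify": {"tabular"},
    "timeseries_dgan": {"time_series"},
}


def models_supporting(*required):
    """Return the sorted tags of models that support every required feature."""
    needed = set(required)
    return sorted(tag for tag, feats in FEATURES.items() if needed <= feats)


print(models_supporting("tabular", "conditional_generation"))
print(models_supporting("time_series"))
```

A call with no arguments returns every tag, since the empty requirement set is a subset of each model's features.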
Check out our GitHub repository, gretelai/gretel-synthetics, for research, source code, and examples, including our core synthetic data generation library for structured and unstructured text, featuring differentially private learning.

Create and train a model

Below is an example configuration that may be used to create and fine-tune a synthetic data model. Save it to model-config.yaml.
  • Replace [model_id] with the type of model you wish to train (e.g. synthetics, gpt_x, ctgan, timeseries_dgan, amplify).
  • data_source must point to a valid and accessible file in CSV, JSON, or JSONL format.
    • Supported storage locations include S3, GCS, Azure Blob Storage, HDFS, WebHDFS, HTTP, HTTPS, SFTP, or the local filesystem.
    • data_source: __temp__ can be used when the source file is specified elsewhere using:
      • --in-data parameter via CLI,
      • parameter via SDK,
      • dataset button via Console.
schema_version: "1.0"
name: "my-model"

models:
  - [model_id]:
      data_source: foo.csv
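To avoid hand-editing the placeholder, the config can be generated for a specific model tag. The following is a minimal sketch, assuming the nesting shown in the template (the model tag as a list item under `models`, with `data_source` beneath it); `render_config`, `MODEL_TAGS`, and `CONFIG_TEMPLATE` are hypothetical helper names, not part of the Gretel SDK.

```python
from pathlib import Path

# Model tags listed in this section.
MODEL_TAGS = ("synthetics", "gpt_x", "ctgan", "amplify", "timeseries_dgan")

# Assumed layout: the model tag is a list item under "models".
CONFIG_TEMPLATE = """\
schema_version: "1.0"
name: "{name}"

models:
  - {model_id}:
      data_source: {data_source}
"""


def render_config(model_id, data_source="foo.csv", name="my-model"):
    """Fill the template, rejecting tags this section does not list."""
    if model_id not in MODEL_TAGS:
        raise ValueError(f"unknown model tag: {model_id!r}")
    return CONFIG_TEMPLATE.format(
        name=name, model_id=model_id, data_source=data_source
    )


config_text = render_config("ctgan")
Path("model-config.yaml").write_text(config_text)  # file name used in this section
print(config_text)
```

Validating the tag before writing catches a typo locally rather than at model creation time.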
Use the following CLI command to train and create the synthetic data model.
  • The exports are not required; they only keep the models create command cleaner.
  • --in-data is optional and can be used to override the data_source specified in the config.
export CONFIG_PATH=model-config.yaml
export DATASOURCE=foo.csv

gretel models create \
  --config "$CONFIG_PATH" \
  --runner cloud \
  --in-data "$DATASOURCE" > my-model.json

Generate data from a model

Below is an example CLI command that may be used to generate data from a model.
  • --model-id supports both a model uid and the JSON that models create outputs.
  • --in-data (optional) allows you to specify a CSV file to prompt the model for conditional data generation tasks.
gretel models run --model-id my-model.json \
  --runner cloud \
  --in-data prompts.csv \
  --output .
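The prompts.csv file passed to --in-data can be produced with a short script. The sketch below assumes the prompt file is an ordinary CSV with a header row naming the columns to condition on and one row per prompt; that layout is an assumption, not something this section specifies. The column names and values (`state`, `plan`) and the `write_prompts` helper are hypothetical and for illustration only.

```python
import csv

# Hypothetical conditioning columns and values, for illustration only.
PROMPT_COLUMNS = ["state", "plan"]
PROMPT_ROWS = [
    {"state": "CA", "plan": "basic"},
    {"state": "NY", "plan": "premium"},
]


def write_prompts(path, columns, rows):
    """Write a header row followed by one line per prompt record."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


written = write_prompts("prompts.csv", PROMPT_COLUMNS, PROMPT_ROWS)
print(f"wrote {written} prompt rows to prompts.csv")
```

Per the feature comparison, only models that support conditional generation (synthetics, gpt_x, ctgan) would make use of such a file.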
​

​
