`model-config.yaml`.

- Replace `[model_id]` with the type of model you wish to train (e.g. `synthetics`, `gpt_x`, `ctgan`).
- `data_source` must point to a valid and accessible file URL in CSV format. Supported storage formats include S3, GCS, Azure Blob Storage, HDFS, WebHDFS, HTTP, HTTPS, SFTP, or local filesystem.
- The `export` statements are not required; they are only used to keep the `models create` command cleaner.
- `--in_data` is optional and can be used to override the `data_source` specified in the config.
- `--model-id` supports both a model `uid` and the JSON that `models create` outputs.
- `--in-data` (optional) allows you to specify a CSV file to prompt the model for conditional data generation tasks.
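
To illustrate what "supports both a model uid and the JSON that `models create` outputs" can mean in practice, here is a minimal Python sketch of how a value passed to `--model-id` might be resolved. It is not the CLI's actual implementation: the helper name `resolve_model_id` and the `uid` field name in the JSON are assumptions made for this example, so check them against the real `models create` output before relying on them.

```python
import json


def resolve_model_id(value: str, key: str = "uid") -> str:
    """Return a model uid from either a bare uid or a JSON document.

    `key` is an assumed field name (hypothetical); adjust it to match
    the JSON that `models create` actually prints.
    """
    text = value.strip()
    if not text:
        raise ValueError("empty model id")
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Not valid JSON, so treat the input as a bare uid.
        return text
    if isinstance(parsed, dict):
        if key not in parsed:
            raise ValueError(f"JSON has no {key!r} field")
        return str(parsed[key])
    # Any other JSON value (e.g. a numeric-looking uid) is used as-is.
    return text


if __name__ == "__main__":
    # A bare uid is passed through unchanged.
    print(resolve_model_id("60dca3d09c03f7c6edadee91"))
    # A JSON document yields the value stored under the assumed key.
    print(resolve_model_id('{"uid": "abc123", "status": "completed"}'))
```

Accepting both forms is convenient because the JSON written by one command can be fed to the next without first extracting the uid by hand.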