Model Configurations
Gretel configurations are declarative objects that specify how a model should be created. Configurations can be authored in YAML or JSON.
Gretel has configuration templates that may be helpful as starting points for creating your model.
All Gretel models follow the same high-level configuration file structure. All configurations include schema_version and name keys, as well as a models array whose entries are keyed by a [model_id]. Within the [model_id] object, all model configurations have a data_source key.
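As a rough illustration of the structure described above, the sketch below builds a minimal configuration as a Python dictionary and serializes it to JSON. The schema_version value, the name, the file path, and the choice of actgan as the model_id are placeholder assumptions, not values taken from this page; a real configuration would add model-specific keys.

```python
import json

# Placeholder values for illustration only: "1.0", "my-model", and
# "training_data.csv" are assumptions, not documented defaults.
config = {
    "schema_version": "1.0",
    "name": "my-model",
    # "models" is an array; each entry is keyed by a model_id.
    "models": [
        {
            "actgan": {
                # Every model configuration has a data_source key.
                "data_source": "training_data.csv",
            }
        }
    ],
}

# Configurations can be authored in JSON, so a plain dump is enough here.
config_json = json.dumps(config, indent=2)
print(config_json)
```

The same nesting applies when the configuration is written in YAML: top-level schema_version and name, then a models list whose entry is keyed by the model_id.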
[model_id] is replaced with the type of model you wish to train (e.g. navigator_ft, gpt_x, actgan, tabular_dp, or transform_v2). The mapping between Gretel models and configuration model_id values is:
Tabular Fine-Tuning: navigator_ft
Text Fine-Tuning: gpt_x
Tabular GAN: actgan
Tabular DP: tabular_dp
Transform: transform_v2
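The mapping above can be kept in a small lookup table when configurations are generated programmatically. The following is a minimal sketch; MODEL_IDS and model_id_for are hypothetical names introduced here for illustration and are not part of any Gretel library.

```python
# Model names and model_id values as listed in the mapping above.
MODEL_IDS = {
    "Tabular Fine-Tuning": "navigator_ft",
    "Text Fine-Tuning": "gpt_x",
    "Tabular GAN": "actgan",
    "Tabular DP": "tabular_dp",
    "Transform": "transform_v2",
}


def model_id_for(model_name: str) -> str:
    """Return the configuration model_id for a Gretel model name.

    Hypothetical helper: raises ValueError for names not in the table.
    """
    try:
        return MODEL_IDS[model_name]
    except KeyError:
        known = ", ".join(sorted(MODEL_IDS))
        raise ValueError(
            f"Unknown model {model_name!r}; expected one of: {known}"
        ) from None


print(model_id_for("Tabular GAN"))
```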
data_source must point to a valid and accessible file in CSV, JSON, or JSONL format. Supported storage locations include S3, GCS, Azure Blob Storage, HDFS, WebHDFS, HTTP, HTTPS, SFTP, or the local filesystem.
Note: Some models have specific data source format requirements.
data_source: __tmp__ can be used when the source file is specified elsewhere, using the --in_data parameter via the CLI, a parameter via the SDK, or the dataset button via the Console.
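The data_source rules above can be approximated with a quick local check before a configuration is submitted. The sketch below only inspects the string: it accepts the __tmp__ placeholder or a path or URL ending in .csv, .json, or .jsonl. It does not verify that the file exists or is accessible, and it is not Gretel's own validation; check_data_source and its return values are hypothetical names chosen for this example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File formats named in the text above.
SUPPORTED_EXTENSIONS = {".csv", ".json", ".jsonl"}
# Placeholder used when the source file is supplied elsewhere.
PLACEHOLDER = "__tmp__"


def check_data_source(data_source: str) -> str:
    """Classify a data_source value as 'placeholder' or 'file'.

    Hypothetical helper: raises ValueError if the value is neither the
    placeholder nor a path/URL with a supported file extension.
    """
    if data_source == PLACEHOLDER:
        return "placeholder"
    # urlparse handles both URLs (s3://, https://, ...) and plain paths.
    path = urlparse(data_source).path
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in SUPPORTED_EXTENSIONS:
        return "file"
    raise ValueError(
        f"data_source {data_source!r} is not {PLACEHOLDER!r} and does not "
        f"end in one of {sorted(SUPPORTED_EXTENSIONS)}"
    )


print(check_data_source("__tmp__"))
print(check_data_source("s3://my-bucket/train.csv"))
```

Because some models have their own data source format requirements, a check like this is only a first filter.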
Each Gretel model has different additional keys within the model_id object and unique configuration parameters specific to that model. For details on the configuration parameters for each model, see the page for that specific model.