Model Configurations
Gretel configurations are declarative objects that specify how a model should be created. Configurations can be authored in YAML or JSON.
Gretel has configuration templates that may be helpful as starting points for creating your model.
All Gretel models follow the same high-level configuration file structure. All configurations include schema_version and name keys, as well as a models array whose entries are keyed by a [model_id]. Within the [model_id] object, all model configurations have a data_source key.
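As a minimal sketch of that shared structure, the snippet below builds a configuration as a Python dictionary and serializes it to JSON, one of the two authoring formats mentioned above. The schema version string, the configuration name, and the file path are placeholder assumptions for illustration, not values taken from this page.

```python
import json

# Placeholder values throughout: the schema version string, the
# configuration name, and the data source path are assumptions.
config = {
    "schema_version": "1.0",
    "name": "my-model",
    "models": [
        {
            # This key is the [model_id]; "synthetics" is one of the
            # model types listed in this document.
            "synthetics": {
                "data_source": "data/train.csv",
            }
        }
    ],
}

# Serialize to JSON so the configuration can be saved to a file.
config_json = json.dumps(config, indent=2)
print(config_json)
```

Model-specific keys would sit alongside data_source inside the [model_id] object.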
[model_id] is replaced with the type of model you wish to train (e.g. synthetics, gpt_x, actgan, timeseries_dgan, amplify, transform, classify).
data_source must point to a valid and accessible file in CSV, JSON, or JSONL format. Supported storage locations include S3, GCS, Azure Blob Storage, HDFS, WebHDFS, HTTP, HTTPS, SFTP, or the local filesystem.
Note: Some models have specific data source format requirements.
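A quick local check on the file format can catch mistakes before a configuration is submitted. The helper below is a hypothetical sketch, not part of any Gretel tooling: it inspects only the file extension of a path or URL, and does not verify that the file is accessible or that its contents are well formed.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File formats named in this document for data_source.
SUPPORTED_EXTENSIONS = {".csv", ".json", ".jsonl"}


def has_supported_format(data_source: str) -> bool:
    """Return True if the path or URL ends in .csv, .json, or .jsonl."""
    # urlparse strips any scheme and host (s3://bucket, https://host),
    # leaving only the path component; plain local paths pass through.
    path = urlparse(data_source).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `has_supported_format("s3://my-bucket/data/train.csv")` passes the extension check, while a path ending in `.parquet` does not. The bucket and file names here are made up.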
data_source: __tmp__ can be used when the source file is specified elsewhere, using:
the --in_data parameter via the CLI,
parameter via the SDK,
dataset button via the Console.
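The idea behind the __tmp__ placeholder can be illustrated with a small sketch: a configuration template carries the placeholder, and the real data source is filled in later. The function `with_data_source` below is a hypothetical helper written for this example. It is not how the Gretel CLI, SDK, or Console resolve the placeholder internally, and the template values are assumptions.

```python
import copy

PLACEHOLDER = "__tmp__"


def with_data_source(config: dict, data_source: str) -> dict:
    """Return a copy of config with each __tmp__ data_source replaced."""
    resolved = copy.deepcopy(config)  # leave the template untouched
    for model in resolved.get("models", []):
        # Each entry in models maps a [model_id] to its settings object.
        for settings in model.values():
            if settings.get("data_source") == PLACEHOLDER:
                settings["data_source"] = data_source
    return resolved


# Placeholder template values, chosen for illustration only.
template = {
    "schema_version": "1.0",
    "name": "my-model",
    "models": [{"synthetics": {"data_source": "__tmp__"}}],
}
resolved = with_data_source(template, "data/train.csv")
```

Because the function works on a deep copy, the same template can be reused with different source files.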
Each Gretel model has different additional keys within the [model_id] object and unique configuration parameters specific to that model. For details on the configuration parameters for each model, see the specific model page:
Evaluate. Note that the Evaluate model is not included in the Models documentation, but running a Gretel Evaluate Job does require passing the correct evaluate model configuration.