Like gretel_model, the gretel_tabular action can train Gretel models and generate records from them. Its primary advantage is that it maintains referential integrity between related tables, so it is recommended for workflows involving relational databases or data warehouses. gretel_tabular also lets you specify different model configs for different tables, or instruct Gretel to find optimal model configs for your data via Gretel Tuner.
Inputs
Outputs
Example Configs
Generate a synthetic database by applying a consistent synthetics model to all tables in the dataset. Note that the model config can be specified as a full object...
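As a minimal sketch of this setup, the fragment below applies one synthetics model to every table by pointing train.model_config at a blueprint via from. The dataset key name, the "{outputs.extract.dataset}" reference syntax, the upstream action name extract, and the blueprint name synthetics/tabular-actgan are assumptions made for illustration; they are not confirmed by this page.

```yaml
# Sketch of a gretel_tabular action config; all names are illustrative.
train:
  # Assumed key and reference syntax: points at the dataset output of a
  # hypothetical upstream action named "extract".
  dataset: "{outputs.extract.dataset}"
  # A single model config, applied to every table in the dataset.
  model_config:
    from: synthetics/tabular-actgan  # assumed blueprint name
```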
Instead of providing a specific model config, you can instruct the gretel_tabular action to run trials to identify the best model config for each table. To do so, set the autotune option inside a model_config field, either at the root train level to apply to all tables or inside a train.table_specific_configs entry to apply to only a subset of tables.
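The two placements of autotune might look roughly like the sketch below, which keeps a fixed model config for most tables and enables tuning for one table only. The dataset key and reference syntax, the blueprint name, and the table name orders are hypothetical.

```yaml
train:
  dataset: "{outputs.extract.dataset}"  # assumed key and reference syntax
  # Root-level default: a fixed config for every table not listed below.
  # Placing the autotune block here instead would tune all tables.
  model_config:
    from: synthetics/tabular-actgan     # assumed blueprint name
  table_specific_configs:
    - tables:
        - orders                        # hypothetical table name
      # Run tuning trials for this subset of tables only.
      model_config:
        autotune:
          enabled: true
```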
Data to use for training, including relationships between tables (if applicable). This should be a reference to a dataset output from a previous action.
train.model
(Deprecated; prefer train.model_config.)
A reference to a blueprint or config location. If a config location is used, it must be addressable by the workflow action.
This field is mutually exclusive with train.model_config.
train.model_config
A YAML object that accepts one of three shapes (detailed below): 1) a complete Gretel model config; 2) a reference to a blueprint or config location (via from); 3) an autotune configuration.
train.skip_tables
(List of tables to pass through unaltered to outputs, see following fields)
train.skip_tables.table
The name of a table to skip, i.e. omit from model training and pass through unaltered.
train.table_specific_configs
(List of table-specific training details, see following fields)
train.table_specific_configs.tables
A list of table names to which the other fields in this object apply.
train.table_specific_configs.model_config
An alternative to the global default train.model_config value defined above.
run
(Run details, see following fields)
run.encode_keys
(Transform models only.) Whether to transform primary and foreign key columns. Defaults to false.
run.num_records_multiplier
(Synthetics models only.) Parameter for scaling output table size. Defaults to 1.0.
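The input fields above could be combined as in the following sketch. It is illustrative only: the dataset key and reference syntax, the table names (countries, users, events), and both blueprint names are assumptions rather than values taken from this page.

```yaml
train:
  dataset: "{outputs.extract.dataset}"  # assumed key and reference syntax
  # Global default config, used for any table without a specific override.
  model_config:
    from: synthetics/tabular-actgan     # assumed blueprint name
  # Tables omitted from training and passed through unaltered.
  skip_tables:
    - table: countries                  # hypothetical table name
  # Per-table overrides of the global default.
  table_specific_configs:
    - tables:
        - users                         # hypothetical table names
        - events
      model_config:
        from: synthetics/tabular-lstm   # assumed blueprint name
run:
  # Synthetics models only: generate twice as many records per table.
  num_records_multiplier: 2.0
  # Transform models only: uncomment to also transform key columns.
  # encode_keys: true
```

Because run.encode_keys applies only to transform models and run.num_records_multiplier only to synthetics models, a real config would typically set just one of them.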
dataset
A dataset object containing the outputs from the models created by this action.
enabled
This boolean field must be explicitly set to true to enable config tuning.
trials_per_table
Optionally specify the number of trials to run for each table. Defaults to 4.
metric
The metric to optimize for. Defaults to synthetic_data_quality_score; also accepts field_correlation_stability, field_distribution_stability, principal_component_stability.
tuner_config
The specific Gretel Tuner config to use. Like model_config, this accepts either full configuration objects, or references to blueprints via from.
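Putting the four autotune fields together, a model_config entry might look like the sketch below. The nesting of these fields under an autotune key follows the description of the autotune option earlier on this page; the tuner blueprint name tuner/tabular-actgan is an assumption.

```yaml
model_config:
  autotune:
    enabled: true                        # must be set explicitly to true
    trials_per_table: 8                  # overrides the default of 4
    metric: field_correlation_stability  # one of the accepted metrics
    tuner_config:
      from: tuner/tabular-actgan         # assumed blueprint name
```

This block can sit at the root train level or inside an entry of train.table_specific_configs, depending on which tables should be tuned.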