Concepts

Terminology and core concepts that make up Gretel Workflows.

Workflow

A Workflow is typically created for a specific use case or data source. You can think of a Workflow as a data pipeline, or as a directed acyclic graph (DAG) of steps.

Workflow Config

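The core configuration interface is a YAML config. You can create and edit Workflow YAML configs from the Console, SDK, or CLI. The config defines what the Workflow does and when it runs. The example below runs once a day, reads CSV files from a GCS bucket, and passes them to a Gretel model as training data.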

name: synthesize-metrics-analytics

trigger:
  cron:
    pattern: "@daily"

actions:
  - name: gcs-read
    type: gcs_source
    connection: c_1
    config:
      bucket: my-analytics-bucket
      glob_filter: "*.csv"
      path: metrics/

  - name: model-train-run
    type: gretel_model
    input: gcs-read
    config:
      project_id: proj_1
      model: synthetics/default
      run_params:
        params: {}
      training_data: "{outputs.gcs-read.dataset.files.data}"

For a more detailed reference please see the Config Syntax docs.
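The training_data value in the example config refers to the output of an earlier action by name. As a rough illustration, the Python sketch below picks such a reference apart. It assumes the shape `{outputs.<action name>.<output path>}`, which is inferred only from the example above; the Config Syntax docs remain the authoritative description, and `parse_output_ref` is a hypothetical helper, not part of any Gretel SDK.

```python
import re

# Assumed shape, inferred from the example config only:
#   {outputs.<action name>.<output path>}
OUTPUT_REF = re.compile(r"^\{outputs\.(?P<action>[^.{}]+)\.(?P<path>[^{}]+)\}$")


def parse_output_ref(value):
    """Split an output reference into (action name, output path)."""
    match = OUTPUT_REF.match(value)
    if match is None:
        raise ValueError(f"not an output reference: {value!r}")
    return match.group("action"), match.group("path")


action, path = parse_output_ref("{outputs.gcs-read.dataset.files.data}")
print(action, path)
```

Reading the reference this way makes the dependency explicit: the model action consumes something that the gcs-read action produces.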

Workflow Actions

A Workflow is composed of one or more Workflow Actions. Each action is configured with inputs and produces outputs, and the way one action's outputs feed another action's inputs determines the execution flow of the Workflow.

Each Workflow Action is responsible for integrating with some service and performing some processing on its set of inputs. These services could be external data stores (e.g. for reading source data or writing synthetic data), or Gretel (e.g. for training and running models).
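To make the link between inputs and execution flow concrete, here is a minimal Python sketch. It is not Gretel's scheduler: it mirrors the two actions from the example config as plain dictionaries, assumes that each action's `input` field names the single action it depends on, and uses the standard library's `graphlib` (Python 3.9+) to derive a valid order. The helper `execution_order` is hypothetical.

```python
from graphlib import TopologicalSorter

# Actions mirrored from the example config; only the fields needed here.
ACTIONS = [
    {"name": "gcs-read", "type": "gcs_source"},
    {"name": "model-train-run", "type": "gretel_model", "input": "gcs-read"},
]


def execution_order(actions):
    """Return action names so that every action comes after its input."""
    graph = {}
    for action in actions:
        upstream = action.get("input")  # assumed: at most one upstream action
        graph[action["name"]] = {upstream} if upstream else set()
    return list(TopologicalSorter(graph).static_order())


print(execution_order(ACTIONS))
```

Because the model action names gcs-read as its input, the source action is ordered first. A cycle between actions would make `static_order` raise `graphlib.CycleError`, which matches the idea that a Workflow behaves like a DAG.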

Connections

Connections are used to authenticate a Workflow Action to an external service such as GCS or Snowflake. Each action is tied to at most one external service, and an action that uses an external service must be configured with a connection for that service.

For more detail on connections, including a full list of available connector types, see Connectors.

Triggers

Triggers are defined as a property of the Workflow config (the trigger key in the example config) and can be used to run a Workflow on a schedule.

See Scheduled Workflows for more information.

Workflow Run

A Workflow Run represents the concrete execution of a Workflow. Whenever a Workflow is triggered, either manually or by a schedule, a Workflow Run is created.

Workflow Task

A Workflow Task represents the concrete execution of a Workflow Action. Each task has an associated set of logs, inputs, and outputs.
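The relationship between Workflows, Runs, Actions, and Tasks can be pictured with a small data model. The Python sketch below is purely illustrative: the class names, fields, and the `start_run` helper are hypothetical and do not reflect Gretel's API, and it assumes one task per action in each run.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowTask:
    """Concrete execution of one Workflow Action within a run."""
    action_name: str
    logs: list = field(default_factory=list)
    inputs: dict = field(default_factory=dict)
    outputs: dict = field(default_factory=dict)


@dataclass
class WorkflowRun:
    """Concrete execution of a Workflow."""
    workflow_name: str
    triggered_by: str  # "manual" or "schedule"
    tasks: list = field(default_factory=list)


def start_run(workflow_name, action_names, triggered_by):
    """Create a run with one task per action (an assumption of this sketch)."""
    run = WorkflowRun(workflow_name, triggered_by)
    run.tasks = [WorkflowTask(name) for name in action_names]
    return run


run = start_run(
    "synthesize-metrics-analytics",
    ["gcs-read", "model-train-run"],
    triggered_by="schedule",
)
print([task.action_name for task in run.tasks])
```

In this picture the Workflow and its Actions are the definition, while each Run and its Tasks record one particular execution of that definition.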
