Gretel Workflows
Automate and operationalize synthetic data using Gretel Workflows
Gretel Workflows provide an easy-to-use, config-driven API for automating and operationalizing Gretel. Using Connectors, you can connect Gretel Workflows to data sources such as S3 or MySQL and schedule recurring jobs, making it easy to securely share data across your organization.
A Gretel Workflow is built from actions that connect to various services, including object stores and databases. These actions are composed into a pipeline for processing data with Gretel. In a typical workflow:
A source action is configured to extract data from a source, such as S3 or MySQL.
The extracted source data is passed as inputs to Gretel Models. Using Workflows you can chain together different types of models based on specific use cases or privacy needs.
A destination action writes output data from the models to a sink.
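The source, model, and destination stages described above can be pictured as a chain in which each action consumes the output of an earlier one. The following is a minimal Python sketch of that idea only. The `Action` class, the `validate_pipeline` helper, and the action names are hypothetical illustrations and are not part of the Gretel SDK or the Workflow config schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Action:
    """One step in a pipeline (hypothetical type, not a Gretel API)."""
    name: str
    kind: str                    # "source", "model", or "destination"
    input: Optional[str] = None  # name of the upstream action, if any


def validate_pipeline(actions: List[Action]) -> List[str]:
    """Check that every non-source action reads from an earlier action.

    Returns the action names in execution order, or raises ValueError.
    """
    seen = set()
    for action in actions:
        if action.name in seen:
            raise ValueError(f"duplicate action name {action.name!r}")
        if action.kind == "source":
            # A source starts the chain, so it has nothing upstream.
            if action.input is not None:
                raise ValueError(f"source {action.name!r} must not have an input")
        elif action.input not in seen:
            raise ValueError(
                f"action {action.name!r} reads from unknown input {action.input!r}"
            )
        seen.add(action.name)
    return [a.name for a in actions]


# Assumed example: one source, two chained models, one destination.
pipeline = [
    Action("read-source", "source"),
    Action("transform", "model", input="read-source"),
    Action("synthesize", "model", input="transform"),
    Action("write-destination", "destination", input="synthesize"),
]
order = validate_pipeline(pipeline)
print(" -> ".join(order))
```

Chaining two model actions here mirrors the point that different model types can be combined for a given use case or privacy need. The real configuration format is YAML, as described later on this page.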
Creating Workflows in the Gretel Console
Log into the Gretel Console.
Navigate to the Workflows page using the menu item in the left sidebar, then follow the instructions to create a new workflow.
The wizard-based flow will guide you through model selection, data source and destination creation, and workflow configuration.
Once the workflow is created, you can view its runs on the Workflow page, or view runs for all workflows and models on the Activity page.
For more detailed step-by-step instructions, see Managing Workflows.
Workflows as YAML
Workflows are configured using YAML. Below is an example workflow config that crawls an Amazon S3 bucket and creates an anonymized synthetic copy of the bucket contents in a destination bucket.
This second example workflow config connects to a MySQL database, creates a synthetic version of the database, and writes it to an output MySQL database.
Next Steps
Next, we'll dive deeper into the components that make up Workflows. For a list of supported sources and sinks, see Connectors.