Automate and operationalize synthetic data using Gretel Workflows
Gretel Workflows provide an easy-to-use, config-driven API for automating and operationalizing Gretel. With Workflows, you can connect to data sources such as S3 or MySQL and schedule recurring jobs, making it easier to secure and share data across your organization.
A Gretel Workflow is built from actions that connect to services such as object stores and databases. These actions are composed into a pipeline for processing data with Gretel. A typical pipeline has three stages:
- 1. A source action is configured to extract data from a source such as S3 or MySQL.
- 2. The extracted data is passed as input to Gretel Models. With Workflows, you can chain together different types of models to suit specific use cases or privacy needs.
- 3. A destination action takes the model outputs and writes them to a destination such as S3 or MySQL.
- 1. Log in to the Gretel Console.
- 2. Navigate to the Workflows page using the menu item in the left sidebar and follow the instructions to create a new workflow.
- 3. The wizard-based flow guides you through model selection, data source and destination creation, and workflow configuration.
- 4. Once the workflow is created, you can view its runs on the Workflow page, or view runs for all workflows and models on the Activity page.
Workflows are configured using YAML. Below is an example workflow config that crawls an S3 bucket and creates an anonymized synthetic copy of the bucket contents in a destination bucket.
- name: s3-crawl
- name: model-train-run
- name: s3-sync
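The three action names above line up with the source, model, and destination stages described earlier. As a rough illustration of how such a chain hangs together, here is a minimal Python sketch. It is not the Gretel config schema or SDK: the `Action` class, its `kind` and `input` fields, and the `validate_chain` helper are all hypothetical and exist only to show each action consuming the output of the one before it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Action:
    """Hypothetical stand-in for one workflow action (not a Gretel type)."""
    name: str
    kind: str                    # assumed label: "source", "model", or "destination"
    input: Optional[str] = None  # name of the upstream action, if any


def validate_chain(actions: List[Action]) -> List[str]:
    """Return action names in order.

    Raises ValueError if a name is repeated or if an action refers to an
    input that has not been defined earlier in the list.
    """
    seen = set()
    order = []
    for action in actions:
        if action.name in seen:
            raise ValueError(f"duplicate action name: {action.name}")
        if action.input is not None and action.input not in seen:
            raise ValueError(
                f"{action.name} depends on unknown action {action.input}"
            )
        seen.add(action.name)
        order.append(action.name)
    return order


# Action names taken from the example config; everything else is assumed.
workflow = [
    Action("s3-crawl", "source"),
    Action("model-train-run", "model", input="s3-crawl"),
    Action("s3-sync", "destination", input="model-train-run"),
]

print(" -> ".join(validate_chain(workflow)))
```

Checking that every input refers to an earlier action is one simple way to catch a misordered or misspelled step before a scheduled run starts.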