Workflows can be managed from the Console, CLI or SDK.
To manage workflows from the Console, select the Workflows tab in the left-side navigation bar. This brings you to a list of Workflows, where you can view more details for each Workflow.
Using the CLI, you can view the commands for working with workflows by running:
gretel workflows --help
Creating Workflows
Workflows can be created either from the Gretel Console or CLI.
First, create a file on your computer containing a YAML workflow config. Then run the following command
Log into the Gretel Console, and navigate to the Workflows page. Select the New Workflow button.
Next, select the project in which you'd like to create the workflow. For first time users, a Default Project will automatically be created.
Projects hold workflows, models and connections. Sharing a project also shares the entities inside it. Conversely, you cannot run a workflow if the connection it uses is in a project that is not accessible to you.
Now, select the model type. This depends on the use case. For example, if the goal is to generate synthetic data with differential privacy guarantees, choose Tabular DP.
The next step is selecting the remote data source. Since workflows are meant to be run automatically, you can't manually upload a data source. While you are still creating and evaluating models, we recommend creating a model directly rather than through a workflow. That model can then be referenced in the workflow config when it's time to operationalize your data generation.
Existing connections will show up automatically in the dropdown. If there are no connections, select New connection to define one. Add a descriptive connection name (separated by hyphens), and enter your credentials.
Provide data source and file name details. Gretel supports multiple files being processed at once. All files will create the same model type that was selected earlier in the flow.
Configure the destination. Generated data can be uploaded to the Gretel Cloud for easier access and sharing. It can also be written to a remote connection: either the same one that was configured as the data source or an entirely new one.
By default, processed files are output to the configured bucket path using the name of the data file for the model run. If you want to customize the filename or path, you can modify the destination action in the YAML config after completing the wizard.
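As a rough illustration of the default described above, the sketch below joins a destination bucket path with the base name of the source data file. The function name `default_output_path` and the example paths are hypothetical, and the exact naming Gretel applies (for example, any prefix or suffix added to the file name) may differ; treat this as a mental model rather than the actual implementation.

```python
import posixpath


def default_output_path(bucket_path: str, source_file: str) -> str:
    """Join the destination bucket path with the source file's base name.

    Hypothetical helper for illustration only; not part of any Gretel API.
    """
    filename = posixpath.basename(source_file)
    return posixpath.join(bucket_path, filename)


# Example paths are made up for this sketch.
output_path = default_output_path("exports/synthetic", "raw/2024/customers.csv")
print(output_path)  # exports/synthetic/customers.csv
```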
With the source and destination defined, select whether the workflow should run manually or on a schedule. We provide some pre-defined schedule types, but you can also create your own schedule using a cron expression. Read more about cron expressions, or use this generator for help creating one.
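If you write your own schedule, it can help to check the expression before saving it. The sketch below assumes the standard five-field cron syntax (minute, hour, day of month, month, day of week); confirm which cron variant the scheduler accepts. The helper `describe_cron` is hypothetical and only labels the fields, it does not validate their ranges.

```python
from typing import Dict

# Field order of a standard five-field cron expression (assumed here).
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> Dict[str, str]:
    """Split a five-field cron expression into named fields.

    Hypothetical helper for illustration only; not part of any Gretel API.
    """
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# In standard cron syntax, "0 6 * * 1" means 06:00 every Monday.
weekly = describe_cron("0 6 * * 1")
for field_name, value in weekly.items():
    print(f"{field_name}: {value}")
```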
Scheduled workflows don't run immediately. To test the workflow, select the Run workflow button after the workflow has been created.
The final step is reviewing the workflow configuration. For an example of a workflow config, see the section below.
The workflow configuration can be edited from this page, and the model type updated. Once the workflow has been created, it will appear on the Workflows screen. Click the workflow list item to run the workflow. Workflow run activity details will be displayed, along with detailed logs for each step.
When a workflow has successfully completed, all generated artifacts will be available in the remote destination. This includes the generated data, quality and utility reports, and log files.
Workflows can be edited by navigating to the workflow and clicking the configuration tab. Use the YAML code editor to modify workflow parameters, and select Done when finished. The new configuration will take effect for all subsequent runs of that workflow. To test the changes, select the Run workflow now button in the top right.
Building on the previous example from Creating Workflows
Workflows are organized under projects and inherit the permissions of the project that owns them.
Project Permission Mappings
To run a workflow, you must also have project write access to the connections and models configured for that workflow.
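The rule above can be pictured as a simple set check: collect the projects that own the workflow's connections and models, then compare them against the projects where you have write access. This is a minimal sketch of that reasoning, not Gretel's implementation; the `WorkflowRefs` class, the `missing_write_access` function, and the project names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class WorkflowRefs:
    """Projects that own the resources a workflow references (hypothetical model)."""
    connection_projects: List[str] = field(default_factory=list)
    model_projects: List[str] = field(default_factory=list)


def missing_write_access(refs: WorkflowRefs, writable_projects: Set[str]) -> List[str]:
    """Return the referenced projects where the user lacks write access."""
    required = set(refs.connection_projects) | set(refs.model_projects)
    return sorted(required - writable_projects)


# Made-up example: the connection lives in one project, the model in another.
refs = WorkflowRefs(connection_projects=["proj-a"], model_projects=["proj-b"])
blocked = missing_write_access(refs, writable_projects={"proj-a"})
print(blocked)  # ['proj-b']
```

An empty result means every referenced connection and model sits in a project you can write to, so this particular requirement would not block the run.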
Sharing Workflows
You can share a Workflow by sharing the project it is owned by. If a workflow references models or connections in a different project, be sure you have the appropriate level of access to that project.