Azure Blob
Connect to your Azure Blob containers.
Getting Started
To create an Azure Blob based workflow, you will need:
A connection to Azure Blob.
A source container.
A destination container. This can be the same as your source container.
Configuring an Azure Blob Connection
Azure Blob related actions require creating an azure connection. The connection must be configured with the correct permissions for each Gretel Action.
For specific permissions, please refer to the Minimum Permissions section under each corresponding action.
There are three ways to authenticate a Gretel Azure Blob Connection. Each method requires different fields for connection creation:
Account Access Key
Connection Creation Parameters
| Display name of your choosing used to identify your connection within Gretel. |
| Name of the Storage Account. |
| |
| Default container to crawl data from. Different containers can be chosen at the |
First, create a file on your local computer containing the connection credentials. This file should also include type, name, config, and credentials. connection_target_type is optional; if omitted, the connection can be used for both source and destination actions. The config and credentials fields should contain fields that are specific to the connection being created.
Below is an example Azure Blob connection using access key credentials:
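As a minimal sketch, the file can be generated from Python so that its structure is explicit. The top-level keys (type, name, config, credentials) come from the description above. The nested field names account_name, default_container, and access_key, and every value shown, are assumptions for illustration and should be checked against the parameter table.

```python
import json


def build_access_key_connection(name, account_name, access_key, default_container):
    """Build an access-key connection definition as a plain dict.

    The nested field names are assumed for illustration; they are not
    confirmed by this page.
    """
    return {
        "type": "azure",
        "name": name,
        # connection_target_type is optional and omitted here, so the
        # connection can be used for both source and destination actions.
        "config": {
            "account_name": account_name,
            "default_container": default_container,
        },
        "credentials": {
            "access_key": access_key,
        },
    }


access_key_connection = build_access_key_connection(
    name="my-azure-connection",
    account_name="mystorageaccount",
    access_key="<storage account access key>",
    default_container="my-container",
)
print(json.dumps(access_key_connection, indent=2))
```

Saving the printed JSON to a local file gives the credentials file used in the next step.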
Now that you've created the credentials file, use the CLI to create the connection.
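The CLI call can also be wrapped in a small script. This is a sketch only: the subcommand and flags (gretel connections create with --project and --from-file) are assumed here and should be confirmed with the CLI's own help output. The project name and file name are placeholders.

```python
import shutil
import subprocess


def build_create_command(project, credentials_file):
    """Return the argument list for the (assumed) connection-creation command."""
    return [
        "gretel", "connections", "create",
        "--project", project,
        "--from-file", credentials_file,
    ]


def create_connection(project, credentials_file):
    """Run the command, failing early if the gretel CLI is not installed."""
    if shutil.which("gretel") is None:
        raise RuntimeError("gretel CLI not found on PATH")
    subprocess.run(build_create_command(project, credentials_file), check=True)


# Print the command instead of running it, so the sketch has no side effects.
print(" ".join(build_create_command("my-project", "azure_connection.json")))
```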
Entra ID
| Display name of your choosing used to identify your connection within Gretel. |
| Name of the Storage Account. |
| Application (client) ID. |
| Directory (tenant) ID. |
| Email of the Service Account. |
| Password of the Service Account. |
| Default container to crawl data from. Different containers can be chosen at the |
First, create a file on your local computer containing the connection credentials. This file should also include type, name, config, and credentials. connection_target_type is optional; if omitted, the connection can be used for both source and destination actions. The config and credentials fields should contain fields that are specific to the connection being created.
Below is an example Azure Blob connection using Entra ID credentials:
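A minimal sketch of an Entra ID connection file, built in Python. The parameter table above lists an application (client) ID, a directory (tenant) ID, and the service account's email and password. The field names used for them here (client_id, tenant_id, username, password), their split between config and credentials, and all values are assumptions for illustration.

```python
import json


def build_entra_id_connection(name, account_name, client_id, tenant_id,
                              username, password, default_container):
    """Build an Entra ID connection definition as a plain dict.

    Field names and their placement under config or credentials are
    assumed, not confirmed by this page.
    """
    return {
        "type": "azure",
        "name": name,
        "config": {
            "account_name": account_name,
            "client_id": client_id,
            "tenant_id": tenant_id,
            "username": username,
            "default_container": default_container,
        },
        "credentials": {
            "password": password,
        },
    }


entra_connection = build_entra_id_connection(
    name="my-azure-connection",
    account_name="mystorageaccount",
    client_id="<application (client) ID>",
    tenant_id="<directory (tenant) ID>",
    username="service-account@example.com",
    password="<service account password>",
    default_container="my-container",
)
print(json.dumps(entra_connection, indent=2))
```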
Now that you've created the credentials file, use the CLI to create the connection.
SAS Token
| Display name of your choosing used to identify your connection within Gretel. |
| Name of the Storage Account. |
| |
| Default container to crawl data from. Different containers can be chosen at the |
First, create a file on your local computer containing the connection credentials. This file should also include type, name, config, and credentials. connection_target_type is optional; if omitted, the connection can be used for both source and destination actions. The config and credentials fields should contain fields that are specific to the connection being created.
Below is an example Azure Blob connection file using SAS token credentials:
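A minimal sketch of a SAS token connection file, built in Python. As with the other methods, the top-level keys come from the description above, while the nested field names (account_name, default_container, sas_token) and all values are assumptions for illustration.

```python
import json


def build_sas_token_connection(name, account_name, sas_token, default_container):
    """Build a SAS-token connection definition as a plain dict.

    The nested field names are assumed, not confirmed by this page.
    """
    return {
        "type": "azure",
        "name": name,
        "config": {
            "account_name": account_name,
            "default_container": default_container,
        },
        "credentials": {
            "sas_token": sas_token,
        },
    }


sas_connection = build_sas_token_connection(
    name="my-azure-connection",
    account_name="mystorageaccount",
    sas_token="<SAS token>",
    default_container="my-container",
)
print(json.dumps(sas_connection, indent=2))
```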
Now that you've created the credentials file, use the CLI to create the connection.
Azure Blob Source
| Type | azure_source |
| Connection | azure |
The azure_source action can be used to read an object from an Azure Blob container into Gretel Models.
This action works as an incremental crawler. Each time a workflow runs, the action crawls new files that have landed in the container since the last crawl.
For details on how the action works more generally, please see https://github.com/Gretellabs/docs/blob/main/workflows-and-connectors/connectors/object-storage/broken-reference/README.md.
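The incremental behavior can be pictured with a short sketch. This page does not describe how crawl state is tracked, so the sketch assumes the simplest scheme: remember when the last crawl happened and pick up only blobs modified after it. BlobInfo and select_new_blobs are hypothetical names, not part of any Gretel API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BlobInfo:
    """Hypothetical stand-in for a container listing entry."""
    name: str
    last_modified: datetime


def select_new_blobs(blobs, last_crawl):
    """Return blobs modified after the previous crawl (all blobs on the first run)."""
    if last_crawl is None:
        return list(blobs)
    return [blob for blob in blobs if blob.last_modified > last_crawl]


listing = [
    BlobInfo("metrics/jan.csv", datetime(2024, 1, 5, tzinfo=timezone.utc)),
    BlobInfo("metrics/feb.csv", datetime(2024, 2, 5, tzinfo=timezone.utc)),
]

# First run: nothing has been crawled yet, so every blob is selected.
first_run = select_new_blobs(listing, last_crawl=None)

# Later run: only blobs that landed after the previous crawl are selected.
later_run = select_new_blobs(
    listing, last_crawl=datetime(2024, 1, 31, tzinfo=timezone.utc)
)
print([blob.name for blob in later_run])  # ['metrics/feb.csv']
```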
Inputs
| Container to crawl data from. If empty, will default to |
| A glob filter may be used to match file names matching a specific pattern. Please see the Glob Filter Reference for more details. |
| Prefix to crawl objects from. If no |
| Default |
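To illustrate how a path prefix and a glob filter might combine, here is a minimal sketch using Python's fnmatch module as a stand-in. The exact matching rules are defined by the Glob Filter Reference and may differ from fnmatch. The function name and the choice to match the filter against the file name only are assumptions.

```python
import posixpath
from fnmatch import fnmatchcase


def is_selected(blob_name, path_prefix="", glob_filter="*"):
    """Return True if a blob sits under the prefix and its file name matches the glob."""
    if not blob_name.startswith(path_prefix):
        return False
    # Match against the file name only, not the full object path.
    return fnmatchcase(posixpath.basename(blob_name), glob_filter)


blob_names = ["metrics/2024/jan.csv", "metrics/2024/notes.txt", "logs/jan.csv"]
selected = [n for n in blob_names if is_selected(n, "metrics/", "*.csv")]
print(selected)  # ['metrics/2024/jan.csv']
```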
Outputs
| A dataset object containing file and table representations of the found objects. |
Minimum Permissions
The associated service account must have the following permissions for the configured container:
Storage Blob Data Reader role permissions, or higher
The SAS Token must have the following permissions for the configured container or storage account:
List
Read
A SAS Token created for the storage account must allow the Container and Object resource types.
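A SAS token can be checked locally before it is used in a connection. The sketch below relies on the standard Azure SAS query parameters: sp holds the signed permissions (r for read, l for list) and, for account-level tokens, srt holds the signed resource types (c for container, o for object). The function name and the sample token are hypothetical.

```python
from urllib.parse import parse_qs


def missing_sas_grants(sas_token, required_permissions, required_resource_types=""):
    """Return (missing permissions, missing resource types) for a SAS token.

    Pass required_resource_types only for account-level tokens; tokens
    scoped to a single container do not carry the srt parameter.
    """
    params = parse_qs(sas_token.lstrip("?"))
    granted_permissions = set(params.get("sp", [""])[0])
    granted_resource_types = set(params.get("srt", [""])[0])
    return (
        sorted(set(required_permissions) - granted_permissions),
        sorted(set(required_resource_types) - granted_resource_types),
    )


# Placeholder account-level token granting read and list on containers and objects.
sample_token = "sv=2022-11-02&ss=b&srt=co&sp=rl&se=2030-01-01T00:00:00Z&sig=REDACTED"
print(missing_sas_grants(sample_token, "rl", "co"))  # ([], [])
```

For the destination action, the same check would use the create, list, and write permissions listed in its own section.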
Azure Blob Destination
| Type | azure_destination |
| Connection | azure |
The azure_destination action may be used to write gretel_model or gretel_tabular outputs to Azure Blob containers.
For details on how the action works more generally, please see Reading Objects.
Inputs
| Container to write data to. If empty, will default to |
| Defines the path prefix to write the object into. |
| Name of the file to write data back to. This file name will be appended to the |
| Data to write to the file. This should be a reference to the output from a previous action. |
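As a rough illustration of how the path prefix and file name inputs could combine into a destination object name, consider the sketch below. The joining rule (a single forward slash between prefix and file name) is an assumption, and destination_object_name is a hypothetical helper, not a Gretel function.

```python
import posixpath


def destination_object_name(path_prefix, filename):
    """Join a path prefix and a file name into one object name."""
    # Strip a leading slash so the file name is always treated as relative.
    return posixpath.join(path_prefix, filename.lstrip("/"))


print(destination_object_name("metrics/", "jan.csv"))   # metrics/jan.csv
print(destination_object_name("metrics", "jan.csv"))    # metrics/jan.csv
print(destination_object_name("", "2024/jan.csv"))      # 2024/jan.csv
```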
Outputs
None
Minimum Permissions
The associated service account must have the following permissions for the configured container:
Storage Blob Data Contributor role permissions, or higher
The SAS Token must have the following permissions for the configured container or storage account:
Create
List
Write
A SAS Token created for the storage account must allow the Container and Object resource types.
Examples
Create a synthetic copy of your Azure Blob container. The following config will crawl a container, train and run a synthetic model, then write the outputs of the model back to a destination container while maintaining the same folder structure as the source container.
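The overall shape of such a workflow can be sketched as a Python dictionary. Only the action types (azure_source, gretel_tabular, azure_destination) come from this page. The key names (name, actions, connection, input, config, container, glob_filter, path, filename), the container names, and the placeholder values are assumptions for illustration. Model settings and the syntax for referencing a previous action's outputs are not covered here and are left as placeholders.

```python
import json

workflow = {
    "name": "sample-azure-workflow",
    "actions": [
        {
            # Crawl the source container for new CSV files.
            "name": "azure-read",
            "type": "azure_source",
            "connection": "<connection id>",
            "config": {
                "container": "my-source-container",
                "glob_filter": "*.csv",
                "path": "metrics/",
            },
        },
        {
            # Train and run a synthetic model on the crawled data.
            "name": "model-train-run",
            "type": "gretel_tabular",
            "input": "azure-read",
            "config": {},  # model settings omitted; not covered on this page
        },
        {
            # Write the model outputs under the same prefix as the source.
            "name": "azure-write",
            "type": "azure_destination",
            "connection": "<connection id>",
            "input": "model-train-run",
            "config": {
                "container": "my-destination-container",
                "path": "metrics/",
                "filename": "<reference to the source file name>",
                "input": "<reference to the model-train-run output>",
            },
        },
    ],
}

print(json.dumps(workflow, indent=2))
```

Reusing the same path prefix on both the source and destination actions is what keeps the folder structure aligned in this sketch.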