Amazon S3

Connect Gretel to your Amazon S3 buckets.

This guide will walk you through connecting source and destination S3 buckets to Gretel. Source buckets will be crawled and used as training inputs to Gretel models. Model outputs get written to the configured S3 destination.

Getting Started

To create an Amazon S3 based workflow, you will need:

  1. A connection to Amazon S3.

  2. A source bucket.

  3. (optional) A destination bucket. This can be the same as your source bucket, or omitted entirely.

Configuring a Connection

Amazon S3 related actions require creating an s3 connection. The connection must be configured with the correct IAM permissions for each Gretel Action.

You can configure the following properties for a connection:

access_key_id

Unique identifier used to authenticate and identify the user.

secret_access_key

Secret value used to sign requests.

All credentials sent to Gretel are encrypted both in transit and at rest.
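
As a minimal sketch, the two properties above could be collected from the standard AWS environment variables before being supplied to a connection. The helper name `s3_connection_properties` and the flat dictionary layout are assumptions made for illustration; this page does not specify how the values are submitted to Gretel.

```python
import os
from typing import Mapping


def s3_connection_properties(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the two connection properties from environment variables.

    AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the standard AWS variable
    names. The flat dict returned here is a hypothetical layout, not a
    documented Gretel payload.
    """
    access_key_id = env.get("AWS_ACCESS_KEY_ID", "").strip()
    secret_access_key = env.get("AWS_SECRET_ACCESS_KEY", "").strip()
    # Fail early rather than creating a connection with missing credentials.
    if not access_key_id or not secret_access_key:
        raise ValueError(
            "Both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set"
        )
    return {
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
    }
```

Reading the keys from the environment keeps static credentials out of source files and shell history.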

The following policy can be used to enable access for all S3 related actions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GretelS3Source",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::your-source-bucket-here",
        "arn:aws:s3:::your-source-bucket-here/*"
      ]
    },
    {
      "Sid": "GretelS3Destination",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts",
        "s3:ListBucketMultipartUploads",
        "s3:CreateMultipartUpload",
        "s3:UploadPart",
        "s3:CompleteMultipartUpload"
      ],
      "Resource": [
        "arn:aws:s3:::your-destination-bucket-here/*"
      ]
    }
  ]
}

More granular permissions for each action can be found in the action's respective Minimum Permissions section.
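
If you manage several buckets, the combined policy above can be generated instead of edited by hand. The sketch below reproduces that policy with the bucket names filled in; the function name `build_gretel_s3_policy` is hypothetical, and the action lists are copied verbatim from the policy shown above.

```python
import json


def build_gretel_s3_policy(source_bucket: str, destination_bucket: str) -> dict:
    """Return the combined source/destination policy for the given buckets."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GretelS3Source",
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    "s3:GetObject",
                    "s3:GetBucketLocation",
                ],
                # Bucket-level and object-level ARNs for the source bucket.
                "Resource": [
                    f"arn:aws:s3:::{source_bucket}",
                    f"arn:aws:s3:::{source_bucket}/*",
                ],
            },
            {
                "Sid": "GretelS3Destination",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts",
                    "s3:ListBucketMultipartUploads",
                    "s3:CreateMultipartUpload",
                    "s3:UploadPart",
                    "s3:CompleteMultipartUpload",
                ],
                # Object-level ARN only, matching the policy above.
                "Resource": [f"arn:aws:s3:::{destination_bucket}/*"],
            },
        ],
    }


def policy_as_json(source_bucket: str, destination_bucket: str) -> str:
    """Serialize the policy so it can be pasted into the IAM console."""
    return json.dumps(
        build_gretel_s3_policy(source_bucket, destination_bucket), indent=2
    )
```

Passing the same bucket name for both arguments covers the case where the destination is the same as the source.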

Creating Access Keys

The following documentation provides instructions for creating IAM users and access keys from your AWS account.

Creating an IAM Role

You can configure your Gretel S3 connector to use an IAM role for authorization. Using IAM roles you can grant Gretel systems access to your bucket without sharing any static access keys.

Before setting up your IAM role, you must first locate the Gretel Project ID for the project in which you wish to create the connection. You will use the Project ID as the external ID for the IAM role.

You may find your Gretel Project ID from the Console or SDK using the following instructions:

Navigate to the Projects page, and select Copy UID from the project drop-down on the right.

This copies the Project ID to your clipboard.

Now that you have the external ID, you will need to create an AWS IAM role. To create the role, navigate to your AWS IAM Console, select the Roles page from the left menu, select Create Role, and follow the instructions for Gretel Cloud below.

From the Role Creation dialog:

  1. Select AWS account as the Trusted entity type.

  2. Select Another AWS account and enter Gretel's AWS account ID, 074762682575.

  3. Check Require external ID and enter the Gretel Project ID from the previous step as the External ID.

  4. Select Next and add the appropriate IAM policies for the bucket.

The final trust policy on your IAM role should look similar to:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "074762682575"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "sts:ExternalId": "<your gretel project id, eg proj_28N5smcmkGnD6H5pd17tZwfYkQ1>"
                }
            }
        }
    ]
}
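
The trust policy above differs between projects only in the external ID, so it can be templated. The following is a minimal sketch; the function name `build_trust_policy` is hypothetical, and the account ID default is the Gretel account number given in step 2.

```python
import json

# Gretel's AWS account ID, as given in the role creation steps above.
GRETEL_AWS_ACCOUNT_ID = "074762682575"


def build_trust_policy(
    project_id: str, principal_account: str = GRETEL_AWS_ACCOUNT_ID
) -> dict:
    """Return a trust policy that uses the Gretel Project ID as external ID."""
    if not project_id:
        raise ValueError("project_id must be a non-empty string")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_account},
                "Action": "sts:AssumeRole",
                # The role can only be assumed when the caller presents
                # this exact external ID.
                "Condition": {
                    "StringEquals": {"sts:ExternalId": project_id}
                },
            }
        ],
    }


def trust_policy_as_json(project_id: str) -> str:
    """Serialize the trust policy for use when creating the role."""
    return json.dumps(build_trust_policy(project_id), indent=4)
```

Because the external ID is matched with `StringEquals`, a role created for one project cannot be reused from a connection in a different project without updating its trust policy.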

For more information about delegating permissions to an AWS IAM user, please reference the following AWS documentation:

Now that you have the role configured, you can create a Gretel connection using the role ARN from the previous step.

From the Gretel Console, navigate to the Create Connection dialog, select S3, select the Role ARN authentication method, and enter the role ARN created in the previous steps.
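
Before pasting the role ARN into the dialog, it can help to check that the value is an IAM role ARN and not, for example, a bucket or user ARN. The sketch below is a local sanity check only; the helper name `is_iam_role_arn` is hypothetical, and it is not a description of any validation Gretel performs.

```python
import re

# General shape of an IAM role ARN:
#   arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
_ROLE_ARN_PATTERN = re.compile(
    r"arn:aws[a-z-]*:iam::\d{12}:role/[A-Za-z0-9_+=,.@/-]+"
)


def is_iam_role_arn(value: str) -> bool:
    """Return True if the value has the general shape of an IAM role ARN."""
    return _ROLE_ARN_PATTERN.fullmatch(value.strip()) is not None
```

The check confirms only the format; whether Gretel can assume the role still depends on the trust policy and external ID configured above.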
