CLI & SDK

Get up and running with Gretel's CLI and SDK.

The Gretel CLI and Python SDK are available through both PyPI (most common) and GitHub.

Installation

Prerequisites

The CLI and SDK require Python 3.9 or newer. You can download Python 3.9 (or newer) here and install it manually, or you can install it from your terminal. If you are working with a new Python installation or environment, you should also verify that pip is installed.
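
As a quick sanity check of the prerequisites above, the following minimal sketch reports whether the current interpreter meets the 3.9+ requirement and whether pip is importable. The helper name check_prerequisites is hypothetical and is not part of the Gretel client.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the prerequisites


def check_prerequisites(version_info=None):
    """Return a list of human-readable problems; an empty list means ready."""
    # Allow a version tuple to be passed in so the check is easy to test.
    version = tuple((version_info or sys.version_info)[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found, but 3.9+ is required"
        )
    # pip usually ships with Python, but minimal environments may omit it.
    if importlib.util.find_spec("pip") is None:
        problems.append("pip is not installed for this interpreter")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

An empty result means both prerequisites look satisfied for the interpreter that ran the script.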

Gretel Client

To get started, you will need to set up your environment and install the appropriate packages.

The most straightforward way to install the gretel-client CLI and SDK is with pip:

pip install -U gretel-client

The -U flag ensures the most recent version is installed. Occasionally we ship a Release Candidate (RC) version of the package. These are generally safe to install; to include them, add the --pre flag.
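
To confirm which version pip actually installed, you can query the package metadata from Python. This is a small sketch that uses only the standard library; installed_version is a hypothetical helper name, not a Gretel API.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="gretel-client"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "gretel-client is not installed")
```

Run it with the same interpreter you used for pip, since each Python environment keeps its own set of installed packages.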

If you want the most recent development features, you can also install directly from GitHub with the following command. Our Customer Success team may suggest this if you are testing new features that have not been fully released yet.

pip install git+https://github.com/gretelai/gretel-python-client@main

Gretel Hybrid Dependencies

If you are using Gretel Hybrid to run Gretel jobs on your own cloud infrastructure, the Gretel CLI and SDK will require your cloud provider's respective Python libraries. To install these dependencies, run the command for your cloud provider. The AWS variant is shown below.

pip install -U "gretel-client[aws]"

Authentication

After installing the package, you should configure authentication with Gretel Cloud. This will be required in order to create and utilize any models.

If you are installing Gretel on a system that you own or wholly control, we highly recommend configuring the CLI and SDK once with our configuration assistant. If you do this once, you will be able to use the CLI and SDK without doing specific authentication before running any commands.

To begin the CLI configuration process, use the command:

gretel configure

This will walk you through a series of prompts. Press <ENTER> to accept the default, which is shown in square brackets for each prompt. The prompts will look similar to:

Gretel.ai COPYRIGHT Notice


The Gretel CLI and Python SDK, installed through the "gretel-client"
package or other mechanism is free and open source software under
the Apache 2.0 License.

When using the CLI or SDK, you may launch "Gretel Worker(s)"
that are hosted in your local environment as containers. These
workers are launched automatically when running commands that create
models or process data records.

The "Gretel Worker" and all code within it is copyrighted and an
extension of the Gretel Service and licensed under the Gretel.ai
Terms of Service.  These terms can be found at https://gretel.ai/terms
section G paragraph 2.


Endpoint [https://api.gretel.cloud]: 
Artifact Endpoint [cloud]: 
Default Runner (cloud, local, hybrid) [cloud]: 
Gretel API Key [grtuf6c5****]: 
Default Project []: 

  1. Press <ENTER> to accept the default value for the Endpoint. (https://api.gretel.cloud)

  2. The Artifact Endpoint is only required for Gretel Hybrid users. If you are using Gretel Cloud, press <ENTER> to accept the default value of cloud. If you are a Gretel Hybrid user, the configured value should be the URI of the Sink Bucket that was created during the Gretel Hybrid deployment. This is the resource identifier for an Amazon S3 Bucket, Azure Storage Container, or Google Cloud Storage Bucket.

    • Amazon S3 Example: s3://your-sink-bucket

    • Azure Storage Example: azure://your-sink-bucket

    • Google Cloud Storage Example: gcs://your-sink-bucket

  3. The Default Runner defaults to cloud. Press <ENTER> to accept the default value unless you are a Gretel Hybrid user or are running Gretel locally on your own machine(s). We recommend keeping cloud as the default runner, which uses Gretel Cloud's auto-scaling GPU and CPU fleet to create and utilize models.

    • If you are a Gretel Hybrid user, set this value to hybrid to utilize hybrid runners.

    • If you need to run compute on your own machine(s), set this value to local.

  4. When prompted for your Gretel API Key, paste the key you created in the Gretel Console.

  5. When prompted for your Default Project, you may optionally enter a Project Name or press <ENTER> to accept the default.
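
Before entering an Artifact Endpoint at the prompt in step 2, it can help to check that the value has the expected shape. The sketch below is a local format check only, assuming the URI schemes shown in the examples above (s3, azure, gcs); check_artifact_endpoint is a hypothetical helper, and the Gretel client performs its own validation.

```python
from urllib.parse import urlparse

# Schemes taken from the examples in step 2; "cloud" is the Gretel Cloud default.
KNOWN_SCHEMES = {"s3", "azure", "gcs"}


def check_artifact_endpoint(value):
    """Return True if value is "cloud" or a bucket URI with a known scheme."""
    if value == "cloud":
        return True
    parsed = urlparse(value)
    # A sink bucket URI needs both a recognized scheme and a bucket name.
    return parsed.scheme in KNOWN_SCHEMES and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("cloud", "s3://your-sink-bucket", "your-sink-bucket"):
        print(candidate, check_artifact_endpoint(candidate))
```

A value that is missing the scheme separator, such as a bare bucket name, fails the check because no scheme or bucket name can be parsed from it.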

Finally, you can test your configuration using the command:

gretel whoami

If the configuration is valid, you should see output like this:

{
    "email": "user@domain.com",
    "config": {
        "endpoint": "https://api.gretel.cloud",
        "artifact_endpoint": "cloud",
        "api_key": "grtuf6c5****",
        "default_project_name": "my-synthetic-project",
        "default_runner": "cloud",
        "preview_features": "disabled"
    }
}
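
In scripts, you may want to inspect this output programmatically instead of reading it by eye. The following sketch parses text shaped like the sample above, on the assumption that the command prints a single JSON document in that form; summarize_whoami is a hypothetical helper, and the sample values are the placeholders from this page.

```python
import json

# Sample output copied from the example above (placeholder values).
SAMPLE_WHOAMI = """
{
    "email": "user@domain.com",
    "config": {
        "endpoint": "https://api.gretel.cloud",
        "artifact_endpoint": "cloud",
        "api_key": "grtuf6c5****",
        "default_project_name": "my-synthetic-project",
        "default_runner": "cloud",
        "preview_features": "disabled"
    }
}
"""


def summarize_whoami(raw):
    """Extract the fields most useful for a quick configuration check."""
    data = json.loads(raw)
    config = data["config"]
    return {
        "email": data["email"],
        "endpoint": config["endpoint"],
        "runner": config["default_runner"],
        "project": config["default_project_name"],
    }


if __name__ == "__main__":
    print(summarize_whoami(SAMPLE_WHOAMI))
```

To use it with real output, capture the text printed by gretel whoami and pass it to the helper in place of the sample.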

At this point, you are authenticated with Gretel, and can use the CLI without needing to re-authenticate. If you run into trouble, feel free to contact us for help!

Gretel Python Client docs can be found here.

Cloud Provider Authentication for Gretel Hybrid

The Gretel Client uses cloud-provider-specific libraries to interact with the underlying object storage via the smart_open library. If you're a Gretel Hybrid user, you may need to configure your environment with proper credentials for your specific cloud provider.

AWS

When using AWS, the Gretel Client will look for default credentials already configured on your system. Docs for configuring credentials can be found here. For Gretel CLI usage we recommend configuring the "credentials file" (which can be done using the AWS CLI) or utilizing environment variables. Both of these authentication methods are outlined in the linked documentation.
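
To see which of the two recommended sources is present on a machine, a sketch like the one below can help. It assumes the standard AWS variable names (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SHARED_CREDENTIALS_FILE) and the default ~/.aws/credentials location. It does not cover other sources such as SSO or instance roles, and aws_credential_sources is a hypothetical helper.

```python
import os
from pathlib import Path


def aws_credential_sources(env=None, home=None):
    """List the common AWS credential sources that appear to be configured."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    sources = []
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment variables")
    # AWS_SHARED_CREDENTIALS_FILE overrides the default file location.
    cred_file = Path(
        env.get("AWS_SHARED_CREDENTIALS_FILE") or home / ".aws" / "credentials"
    )
    if cred_file.is_file():
        sources.append(f"credentials file ({cred_file})")
    return sources


if __name__ == "__main__":
    found = aws_credential_sources()
    print(found if found else "No common AWS credential source found")
```

The check only confirms that a source exists; it does not verify that the credentials are valid or have access to your sink bucket.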

Azure

When using Azure, the Gretel Client will look for credentials already configured on your system. Docs for configuring credentials can be found here. For Gretel CLI usage we recommend authenticating with the Azure CLI or utilizing environment variables. Both of these authentication methods are outlined in the linked documentation.

When interacting with Azure Storage, the Gretel Client will also need information exported via the AZURE_STORAGE_CONNECTION_STRING or AZURE_STORAGE_ACCOUNT_NAME environment variables.

To fetch a connection string for the AZURE_STORAGE_CONNECTION_STRING you can run the following command from your terminal using the Azure CLI.

# --output tsv returns the raw string without the surrounding JSON quotes
export AZURE_STORAGE_CONNECTION_STRING=$(az storage account show-connection-string \
    --name "${STORAGE_ACCOUNT_NAME}" \
    --resource-group "${RESOURCE_GROUP}" \
    --query connectionString --output tsv)

Be sure to replace STORAGE_ACCOUNT_NAME and RESOURCE_GROUP with the appropriate values for your storage container.

If you want to use AZURE_STORAGE_ACCOUNT_NAME instead, export the following:

# Replace with your Azure storage account
export AZURE_STORAGE_ACCOUNT_NAME="my-storage-account-name"

And then use the Gretel Client as you normally would with your already configured authentication mechanism.

The AZURE_STORAGE_ACCOUNT_NAME variable may be used to configure the Gretel Client with system-assigned managed identities, but the client will try all of the options supported by DefaultAzureCredential. AZURE_STORAGE_ACCOUNT_NAME should contain the name of the storage account associated with your storage container.
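
The sketch below reports which of the two Azure storage variables is set in the current environment. The lookup order is an arbitrary choice for this example, since the Gretel Client's own precedence is not described here, and azure_storage_setting is a hypothetical helper.

```python
import os

# The two variables named in this section, checked in an arbitrary order.
AZURE_STORAGE_VARS = (
    "AZURE_STORAGE_CONNECTION_STRING",
    "AZURE_STORAGE_ACCOUNT_NAME",
)


def azure_storage_setting(env=None):
    """Return the name of the first Azure storage variable set, or None."""
    env = os.environ if env is None else env
    for name in AZURE_STORAGE_VARS:
        # Treat empty strings the same as unset variables.
        if env.get(name):
            return name
    return None


if __name__ == "__main__":
    print(azure_storage_setting() or "No Azure storage variable is set")
```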

GCP

When using GCP, the Gretel Client will look for default credentials already configured on your machine. Docs for configuring GCP credentials can be found here. For Gretel CLI usage we recommend authenticating with the GCP CLI (gcloud).
