The Gretel CLI and Python SDK are available through both PyPI (most common) and GitHub.
Installation
Prerequisites
The CLI and SDK require Python 3.9 or newer. You can download Python 3.9 (or newer) here and install it manually, or install it from your terminal. If you are working with a new Python installation or environment, you should also verify that pip is installed.
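As a quick way to confirm both prerequisites from inside the interpreter you plan to use, the following is a minimal sketch using only the standard library. The helper names `python_is_supported` and `pip_is_available` are hypothetical and are not part of the Gretel client.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the prerequisites


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given version meets the 3.9+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def pip_is_available() -> bool:
    """Return True if this interpreter can locate the pip module."""
    return importlib.util.find_spec("pip") is not None


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "supported:", python_is_supported())
    print("pip available:", pip_is_available())
```

Running the check with the same interpreter that will run `pip install` avoids confusion when several Python versions are installed side by side.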
Gretel Client
To get started, you will need to set up your environment and install the appropriate packages.
The most straightforward way to install the gretel-client CLI and SDK is with pip:
pip install -U gretel-client
The -U flag ensures the most recent version is installed. Occasionally we ship a Release Candidate (RC) version of the package. These are generally safe to install; to include them, add the --pre flag.
If you wish to have the most recent development features, you may also choose to install directly from GitHub with the following command. Our Customer Success team may suggest this if you are testing new features that have not been fully released yet.
If you are using Gretel Hybrid to run Gretel jobs on your own cloud infrastructure, the Gretel CLI and SDK will require your cloud provider's respective Python libraries. To install these dependencies run the relevant command below.
pip install -U "gretel-client[aws]"
pip install -U "gretel-client[azure]"
pip install -U "gretel-client[gcp]"
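In a notebook or provisioning script it can be convenient to build the same install command programmatically. The sketch below only assembles the argument list for the three extras shown above; the helper `pip_install_command` is a hypothetical name, not something the Gretel client provides.

```python
import sys

# Extras named in the install commands above.
PROVIDER_EXTRAS = ("aws", "azure", "gcp")


def pip_install_command(provider: str) -> list:
    """Build the pip command that installs gretel-client with a provider extra."""
    if provider not in PROVIDER_EXTRAS:
        raise ValueError(f"Unknown provider {provider!r}; expected one of {PROVIDER_EXTRAS}")
    # Using sys.executable targets the pip that belongs to the running interpreter.
    return [sys.executable, "-m", "pip", "install", "-U", f"gretel-client[{provider}]"]
```

The returned list can be passed to `subprocess.check_call` to perform the installation.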
Authentication
After installing the package, you should configure authentication with Gretel Cloud. This will be required in order to create and utilize any models.
If you are installing Gretel on a system that you own or wholly control, we highly recommend configuring the CLI and SDK once with our configuration assistant. If you do this once, you will be able to use the CLI and SDK without doing specific authentication before running any commands.
To begin the CLI configuration process, use the command:
gretel configure
This will walk you through some prompts. You may press <ENTER> to accept the default which is shown in square brackets for each prompt. The prompt will look similar to:
Gretel.ai COPYRIGHT Notice
The Gretel CLI and Python SDK, installed through the "gretel-client"
package or other mechanism is free and open source software under
the Apache 2.0 License.
When using the CLI or SDK, you may launch "Gretel Worker(s)"
that are hosted in your local environment as containers. These
workers are launched automatically when running commands that create
models or process data records.
The "Gretel Worker" and all code within it is copyrighted and an
extension of the Gretel Service and licensed under the Gretel.ai
Terms of Service. These terms can be found at https://gretel.ai/terms
section G paragraph 2.
Endpoint [https://api.gretel.cloud]:
Artifact Endpoint [cloud]:
Default Runner (cloud, local, hybrid) [cloud]:
Gretel API Key [grtuf6c5****]:
Default Project []:
Press <ENTER> to accept the default value for the Endpoint. (https://api.gretel.cloud)
The Artifact Endpoint is only required for Gretel Hybrid users. If you are using Gretel Cloud, press <ENTER> to accept the default value of cloud. If you are a Gretel Hybrid user, the configured value should be the URI for the Sink Bucket that was created during the Gretel Hybrid deployment. This is the resource identifier for an Amazon S3 Bucket, Azure Storage Container, or Google Cloud Storage Bucket.
Amazon S3 Example: s3://your-sink-bucket
Azure Storage Example: azure://your-sink-bucket
Google Cloud Storage Example: gcs://your-sink-bucket
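A typo in the Artifact Endpoint is easy to miss at the prompt, so a small pre-check can help. The sketch below accepts the default value cloud or a URI with one of the s3, azure, or gcs schemes; the function `check_artifact_endpoint` and the accepted-scheme set are assumptions made for illustration and are not the client's own validation.

```python
from urllib.parse import urlparse

# Schemes assumed for this sketch, matching the three storage providers.
SUPPORTED_SCHEMES = {"s3", "azure", "gcs"}


def check_artifact_endpoint(endpoint: str) -> str:
    """Return the endpoint unchanged if it looks valid, else raise ValueError."""
    if endpoint == "cloud":
        # Default value for Gretel Cloud users.
        return endpoint
    parsed = urlparse(endpoint)
    # A valid URI needs a known scheme and a bucket/container name after "://".
    if parsed.scheme not in SUPPORTED_SCHEMES or not parsed.netloc:
        raise ValueError(
            f"Expected 'cloud' or a URI like s3://your-sink-bucket, got {endpoint!r}"
        )
    return endpoint
```

A value missing the colon after the scheme is rejected by this check because no scheme can be parsed from it.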
The Default Runner defaults to cloud. Press <ENTER> to accept the default value unless you are a Gretel Hybrid user or are running Gretel locally on your own machine(s). We recommend keeping cloud as the default runner, which uses Gretel Cloud's auto-scaling GPU and CPU fleet to create and run models.
If you are a Gretel Hybrid user set this value to hybrid to utilize hybrid runners.
If you need to run compute on your own machine(s) set this value to local.
When prompted for your Gretel API Key, paste the key you created in the Gretel Console.
When prompted for your Default Project, you may optionally enter a Project Name or press <ENTER> to accept the default.
Finally, you can test your configuration using the command:
gretel whoami
If the configuration is good to go, you should get back an output like this:
At this point, you are authenticated with Gretel, and can use the CLI without needing to re-authenticate. If you run into trouble, feel free to contact us for help!
There are a few different options to configure your Gretel Cloud connection through the SDK.
If you are using an ephemeral environment (such as Google Colab) and only wish to configure your connection for the duration of your Python session, you can configure your connection like this:
from gretel_client import configure_session

configure_session(api_key="grtu****", validate=True)

# If in a Notebook or similar environment you should see...
# Using endpoint https://api.gretel.cloud
# Logged in as user@domain.com ✅
Never commit code with your Gretel API key exposed! Generally you should load your Gretel API key in from some secure secrets manager or an environment variable.
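One way to keep the key out of source code is to read it from an environment variable at run time. The following is a minimal sketch: the variable name `GRETEL_API_KEY` and the helpers `load_api_key` and `configure_from_env` are assumptions chosen for this example, while `configure_session(api_key=..., validate=True)` is the call shown above.

```python
import os

API_KEY_ENV_VAR = "GRETEL_API_KEY"  # assumed variable name for this sketch


def load_api_key(env=os.environ) -> str:
    """Read the API key from the environment, failing loudly if it is missing."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} before configuring the session")
    return key


def configure_from_env() -> None:
    """Configure the Gretel session using the key found in the environment."""
    # Imported lazily so the helper above can be used without the package installed.
    from gretel_client import configure_session

    configure_session(api_key=load_api_key(), validate=True)
```

The same pattern works with a secrets manager: replace the body of `load_api_key` with the lookup your platform provides.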
If you are creating a Notebook or other shared code, see the options below for prompting the user for an API key instead.
Prompting
If you wish to maintain code that others may use, you can also use the following modification for configuring your session with Gretel Cloud. By passing the value "prompt" as the API key, you will be presented with a prompt to enter your API key.
from gretel_client import configure_session

configure_session(api_key="prompt", validate=True)

# If in a Notebook or similar environment you should see...
# Using endpoint https://api.gretel.cloud
# Logged in as user@domain.com ✅
Using the prompt option will only work if you do not already have Gretel credentials saved on disk. If credentials are already found on disk, configure_session() will utilize those and validate the connection with Gretel Cloud.
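To anticipate whether the prompt will appear, you can check for a saved configuration file before configuring the session. This sketch assumes the credentials are stored at `~/.gretel/config.json`; that location and the helper `has_saved_credentials` are assumptions for illustration, so adjust the path if your installation differs.

```python
from pathlib import Path

# Assumed location of the saved CLI/SDK configuration; verify for your installation.
ASSUMED_CONFIG_PATH = Path.home() / ".gretel" / "config.json"


def has_saved_credentials(config_path: Path = ASSUMED_CONFIG_PATH) -> bool:
    """Return True if a configuration file exists at the given path."""
    return config_path.is_file()


if __name__ == "__main__":
    if has_saved_credentials():
        print("Saved configuration found; the prompt will likely be skipped.")
    else:
        print("No saved configuration found; expect to be prompted for a key.")
```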
Hybrid Support
If you want to configure your session to run in Hybrid mode, run the following as part of configure_session:
from gretel_client import configure_session

configure_session(
    api_key="grtu****",
    validate=True,
    default_runner="hybrid",
    artifact_endpoint="s3://my-bucket"  # or gcs:// azure://
)

# If in a Notebook or similar environment you should see...
# Using endpoint https://api.gretel.cloud
# Logged in as user@domain.com ✅
The hybrid environment configuration will apply to everything run with the Gretel client, including libraries like Gretel Trainer and Gretel Relational.
See additional storage setup instructions per cloud provider here.
The Gretel Client uses cloud provider specific libraries to interact with the underlying object storage via the smart_open library. If you're a Gretel Hybrid user you may need to configure your environment with proper credentials for your specific cloud provider.
AWS
When using AWS, the Gretel Client will look for default credentials already configured on your system. Docs for configuring credentials can be found here. For Gretel CLI usage we recommend configuring the "credentials file" (which can be done using the AWS CLI) or utilizing environment variables. Both of these authentication methods are outlined in the linked documentation.
Azure
When using Azure, the Gretel Client will look for credentials already configured on your system. Docs for configuring credentials can be found here. For Gretel CLI usage we recommend authenticating with the Azure CLI or utilizing environment variables. Both of these authentication methods are outlined in the linked documentation.
When interacting with Azure Storage, the Gretel Client will also need information exported via the AZURE_STORAGE_CONNECTION_STRING or AZURE_STORAGE_ACCOUNT_NAME environment variables.
To fetch a connection string for the AZURE_STORAGE_CONNECTION_STRING you can run the following command from your terminal using the Azure CLI.
Be sure to replace STORAGE_ACCOUNT_NAME and RESOURCE_GROUP with the appropriate values for your storage container.
If you want to use AZURE_STORAGE_ACCOUNT_NAME, then you'll simply export the following:
# Replace with your Azure storage account
export AZURE_STORAGE_ACCOUNT_NAME="my-storage-account-name"
And then use the Gretel Client as you normally would with your already configured authentication mechanism.
The AZURE_STORAGE_ACCOUNT_NAME variable may be used to configure the Gretel Client with system-assigned managed identities, but the client will try all of the options supported by DefaultAzureCredential. AZURE_STORAGE_ACCOUNT_NAME should contain the name of the storage account associated with your storage container.
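Before running a Hybrid job against Azure Storage, it can help to confirm that one of the two variables described above is actually exported. The sketch below reports which one is present; the helper `azure_storage_auth_mode` is hypothetical, and the order in which it checks the variables is a choice made for this example, not a statement about the client's own precedence.

```python
import os


def azure_storage_auth_mode(env=os.environ) -> str:
    """Report which Azure Storage variable is set, or raise if neither is."""
    if env.get("AZURE_STORAGE_CONNECTION_STRING"):
        return "connection_string"
    if env.get("AZURE_STORAGE_ACCOUNT_NAME"):
        return "account_name"
    raise RuntimeError(
        "Export AZURE_STORAGE_CONNECTION_STRING or AZURE_STORAGE_ACCOUNT_NAME "
        "before using Azure Storage with the Gretel Client"
    )
```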
GCP
When using GCP, the Gretel Client will look for default credentials already configured on your machine. Docs for configuring GCP credentials can be found here. For Gretel CLI usage we recommend authenticating with the GCP CLI (gcloud).