Architecture

Last updated 1 year ago

Architecture Diagram

Data sent to Gretel's control plane when using Hybrid mode

When running in Hybrid mode, the following data is stored in Gretel's control plane and may be passed between your Gretel Hybrid environment and the Gretel API:

  • Project names and descriptions

  • Model configuration (the YAML configuration created for each model)

  • Model name and ID

  • Model status (created, active, completed, etc.)

  • Model run ID (when using a model to create more data)

  • Model run status (created, active, completed, etc.)

  • Workflow IDs, Workflow Run IDs, and Workflow Task IDs

  • Workflow Task Statuses and overall Workflow Run Status

  • The email address of the user that created a model

  • The email address of the user that ran a model

  • Model creation and model run logs. These logs only include metadata and error information.

  • Workflow Task logs. These logs include metadata and error information, and allow users to view logs in the Console.

  • Names of data sources and results (file names only; no data is stored)

The following data is not stored in Gretel's control plane when using Hybrid mode:

  • Model training data. This is stored in, and accessed from, your own object storage (buckets you create).

  • Model training artifacts. These are written to your object storage (buckets you create) instead. They include:

    • The trained model archive / weights

    • Quality and privacy reports

    • Sample data generated during training

  • Model run artifacts. These are written to your object storage instead. They include:

    • Generated data

    • Model run reports (if applicable)
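The split described by the two lists above can be pictured as an allowlist: only named metadata fields leave your environment, and everything else stays in your own storage. The following is a minimal sketch of that idea, not Gretel's implementation. The field names and the `split_record` helper are hypothetical and chosen only to mirror the lists; they are not Gretel's actual API schema.

```python
from typing import Any, Dict, Tuple

# Hypothetical field names that mirror the "stored in the control plane" list.
# These are illustrative assumptions, NOT Gretel's real API schema.
CONTROL_PLANE_FIELDS = {
    "project_name", "project_description",
    "model_config", "model_name", "model_id", "model_status",
    "model_run_id", "model_run_status",
    "workflow_ids", "workflow_statuses",
    "creator_email", "runner_email",
    "logs",
    "data_source_name", "result_names",
}


def split_record(record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a job record into (control-plane metadata, data that stays local).

    Any field not explicitly allowlisted is treated as staying local, so
    training data, model weights, reports, and generated data never end up
    in the first dictionary.
    """
    shared = {k: v for k, v in record.items() if k in CONTROL_PLANE_FIELDS}
    local = {k: v for k, v in record.items() if k not in CONTROL_PLANE_FIELDS}
    return shared, local
```

For example, a record holding `model_name`, `data_source_name`, and `training_data` would place the first two in the shared metadata and keep `training_data` on the local side.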

An example of viewing a hybrid job using the Gretel Transform API:

Outbound Network Requirements

Gretel Hybrid relies on outbound connections to reach the Gretel API and pull container images. No inbound network connections are required for Gretel Hybrid to function. The endpoints below must be reachable from the network associated with the Kubernetes cluster hosting Gretel Hybrid.

  • api.gretel.cloud (HTTPS / TCP 443) - The Gretel API. This must be reachable by all Gretel pods running within your Kubernetes cluster for the purposes of job scheduling and orchestration.

  • artifacts.gretel.cloud (HTTPS / TCP 443) - This endpoint provides presigned S3 URLs for pulling certain base model artifacts when a model training job starts. This must be reachable by all Gretel pods running within your Kubernetes cluster.

  • 074762682575.dkr.ecr.us-west-2.amazonaws.com (HTTPS / TCP 443) - Gretel's Container Registry hosted on AWS ECR. This must be reachable by Kubernetes nodes so that pod container images may be pulled.

  • s3.amazonaws.com (HTTPS / TCP 443) - AWS S3 is the persistent storage that backs ECR. This endpoint must be reachable by Kubernetes nodes so that they can pull Gretel container images.

  • s3-us-west-2.amazonaws.com (HTTPS / TCP 443) - AWS S3 is the persistent storage that backs ECR. This endpoint must be reachable by Kubernetes nodes so that they can pull Gretel container images.
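Before deploying, it can help to confirm that the endpoints listed above accept outbound TCP connections on port 443 from inside the cluster network. The following is a minimal sketch using only the Python standard library. The `check_endpoints` and `tcp_connect` helpers are illustrative names and not part of any Gretel tooling. The check only opens a TCP connection, so it does not validate TLS, HTTP proxies, or authentication.

```python
import socket
from typing import Callable, Dict, Iterable, Tuple

# Hosts and ports copied from the list above.
GRETEL_HYBRID_ENDPOINTS = [
    ("api.gretel.cloud", 443),
    ("artifacts.gretel.cloud", 443),
    ("074762682575.dkr.ecr.us-west-2.amazonaws.com", 443),
    ("s3.amazonaws.com", 443),
    ("s3-us-west-2.amazonaws.com", 443),
]


def tcp_connect(host: str, port: int, timeout: float = 5.0) -> None:
    """Open and immediately close a TCP connection; raises OSError on failure."""
    with socket.create_connection((host, port), timeout=timeout):
        pass


def check_endpoints(
    endpoints: Iterable[Tuple[str, int]],
    connect: Callable[[str, int], None] = tcp_connect,
) -> Dict[str, bool]:
    """Return a mapping of host -> True if the connection succeeded."""
    results: Dict[str, bool] = {}
    for host, port in endpoints:
        try:
            connect(host, port)
            results[host] = True
        except OSError:
            # Covers DNS failures, timeouts, and refused connections.
            results[host] = False
    return results
```

Calling `check_endpoints(GRETEL_HYBRID_ENDPOINTS)` from a pod or node in the cluster returns a dictionary of host names to reachability flags. Running it from a workstation outside the cluster network would not reflect the cluster's egress rules.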

Logs are the only artifacts stored in Gretel Cloud. Data source and generated result names can be viewed, but data is not stored in Gretel Cloud.