Gretel Safe Synthetics

Reference docs for Gretel Safe Synthetics.

Last updated 1 month ago

Gretel Safe Synthetics allows you to create private versions of your sensitive data. You can use Safe Synthetics to redact and replace sensitive Personally Identifiable Information (PII) with Transform, obfuscate quasi-identifiers with Synthetics, and apply differential privacy for mathematical guarantees of privacy protection. Once your data is generated, Gretel will automatically generate an evaluation report to help measure the quality and privacy of your synthetic data.

You can run a Safe Synthetics workflow by combining the steps that are relevant to you. The recommended flow runs Transform to replace and protect true identifiers, followed by Synthetics to protect quasi-identifying information.

Transform

Gretel’s Transform model combines data classification with data transformation to detect and then anonymize or mutate sensitive data. Gretel’s data classification can detect a variety of supported entities, such as PII, which can then be used to define transforms.

We generally recommend combining Gretel Transform with Gretel Synthetics to redact or replace sensitive data before training a synthetics model. This ensures that there is no possibility the model can learn the sensitive PII.
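The ordering described above (replace sensitive values first, train second) can be illustrated with a small sketch. This is not Gretel's API or how Transform detects entities: the regex, the `redact_emails` helper, and the sample records are all hypothetical stand-ins. It only shows that anything removed before training never reaches the model.

```python
import re

# Hypothetical stand-in for a Transform step: one regex for email addresses.
# Gretel Transform's own entity detection is much broader than this.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(records, placeholder="<EMAIL>"):
    """Return copies of the records with emails replaced in every string field."""
    cleaned = []
    for record in records:
        cleaned.append({
            key: EMAIL_PATTERN.sub(placeholder, value) if isinstance(value, str) else value
            for key, value in record.items()
        })
    return cleaned


# Illustrative sample data (made up for this sketch).
raw_records = [
    {"name": "user_1", "note": "Contact: alice@example.com", "age": 34},
    {"name": "user_2", "note": "No contact details on file", "age": 51},
]

# Only the redacted copies would be handed to a synthetics model for training.
training_records = redact_emails(raw_records)
```

Because the function returns new dictionaries, the original records are left untouched and only `training_records` would be passed on to training.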

You can find out more in the Gretel Transform documentation.

Synthetics

Gretel's Synthetics models generate synthetic datasets that mimic the statistical properties of real-world data, but without containing any actual real-world observations.

The models are trained to understand the patterns, distributions, and relationships within and across the columns of the real-world data. Synthetic records are then generated to match those statistical properties, without any one-to-one mapping to the original records.

Gretel offers the following synthetics models:

  1. Tabular Fine-Tuning - Gretel’s flagship LLM-based model for generating privacy-preserving, real-world quality synthetic data across numeric, categorical, text, JSON, and event-based tabular data with up to ~50 columns.

    1. Data types: Numeric, categorical, text, JSON, event-based

    2. Differential privacy: Optional

  2. Text Fine-Tuning - Gretel’s model for generating privacy-preserving synthetic text using your choice of top-performing open-source models.

    1. Data types: Text

    2. Differential privacy: Optional

  3. Tabular GAN - Gretel’s model for quickly generating synthetic numeric and categorical data for high-dimensional datasets (>50 columns) while preserving relationships between numeric and categorical columns.

    1. Data types: Numeric, categorical

    2. Differential privacy: Not supported

  4. Tabular DP - Gretel’s model for generating differentially private data with very low epsilon values (maximum privacy). It is best for basic analytics use cases (e.g. pairwise modeling) and runs on CPU. If your use case is training an ML model to learn deep insights from the data, Tabular Fine-Tuning is your best option.

    1. Data types: Numeric, categorical

    2. Differential privacy: Required; the model cannot run without differential privacy
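The characteristics listed above can be condensed into a rough selection heuristic. The sketch below is an illustration only, not part of Gretel's SDK: `DatasetProfile`, its fields, and `suggest_model` are hypothetical names, and the rules are a simplified reading of the list (data types, the ~50-column guidance, and differential privacy support).

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    """Hypothetical summary of a dataset and its privacy needs."""
    column_count: int
    text_only: bool = False               # dataset is free text only
    has_text_or_json: bool = False        # tabular data with text, JSON, or event columns
    wants_dp: bool = False                # differential privacy is desired
    wants_very_low_epsilon: bool = False  # maximum privacy is the priority


def suggest_model(profile: DatasetProfile) -> str:
    """Return a model name from the list above using simplified rules."""
    if profile.text_only:
        return "Text Fine-Tuning"

    numeric_categorical_only = not profile.has_text_or_json

    # Tabular DP handles numeric and categorical data and always runs with DP.
    if profile.wants_very_low_epsilon and numeric_categorical_only:
        return "Tabular DP"

    # Tabular GAN targets wide datasets (>50 columns) but does not support DP.
    if profile.column_count > 50 and numeric_categorical_only and not profile.wants_dp:
        return "Tabular GAN"

    # Assumption of this sketch: fall back to the flagship model otherwise.
    return "Tabular Fine-Tuning"


print(suggest_model(DatasetProfile(column_count=120)))
print(suggest_model(DatasetProfile(column_count=20, has_text_or_json=True)))
```

A heuristic like this is only a starting point; cases it does not model, such as wide datasets that also need differential privacy, are better resolved with the guidance in the next section.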

Which models are right for your use case?

You can use the flow chart below to help determine whether Transform, Synthetics (with or without Differential Privacy), or the combination is best for your use case.

If you decide to use Synthetics as part of your use case, you can use the next flow chart to help determine which Synthetics model may be best.

You can learn more in the Gretel Synthetics models documentation.
