Evaluate

Analyze the quality and utility of synthetic data.


Last updated 1 month ago

Overview

By default, Gretel runs an Evaluate step at the end of your Safe Synthetics workflow. This step returns a Synthetic Quality & Privacy Report.

In the Console Builder, you can request a Quality & Privacy Report by checking the box for "Generate Quality & Privacy Report." We recommend choosing this option so you can quickly analyze the results of your Safe Synthetics run.

Several of the report metrics can only be computed with a holdout dataset. We recommend applying the holdout so that these metrics are available, unless you have fewer than 500 rows of data. In that case, we recommend turning off the holdout so that all of the data is used for training. In the Console Builder, you can apply the holdout by checking the box under "Data holdout."

If you need to set a group-by parameter for Holdout (for example, if you have event-driven data), go to the Advanced tab in the Builder and add the parameter inside the Holdout step. The configuration below holds out 5% of the records, up to a maximum of 2000 records, and groups by the values in the "state" column.

steps:
  - name: holdout
    task: holdout
    inputs: [{file_id}]
    config:
      holdout: 0.05
      max_holdout: 2000
      group_by: "state"
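
To make the interplay of the three settings concrete, here is a minimal Python sketch of a grouped holdout split. It is an illustration only and an assumption about the general idea, not Gretel's implementation: the function name `grouped_holdout_split`, the rounding, the `seed` argument, and the greedy group selection are all hypothetical. It shows how a fraction, a record cap, and a group-by column can combine so that every record sharing a group value lands on the same side of the split.

```python
import random
from collections import defaultdict


def grouped_holdout_split(rows, group_by, holdout=0.05, max_holdout=2000, seed=0):
    """Split rows into (train, held_out), keeping each group on one side.

    Hypothetical illustration; Gretel's actual holdout logic may differ.
    """
    # Target size: the requested fraction of rows, capped at max_holdout.
    target = min(round(len(rows) * holdout), max_holdout)

    # Bucket rows by the value in the group_by column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row)

    # Visit groups in a reproducible random order.
    keys = list(groups)
    random.Random(seed).shuffle(keys)

    train, held_out = [], []
    for key in keys:
        members = groups[key]
        # Hold out a whole group only if it still fits under the target.
        if len(held_out) + len(members) <= target:
            held_out.extend(members)
        else:
            train.extend(members)
    return train, held_out
```

Because whole groups are moved together, the held-out set in this sketch can be smaller than the target when groups are large relative to the holdout size.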

In the SDK, both the Holdout and Evaluate steps run by default. The standard template for Safe Synthetics, which includes both Holdout and Evaluate, is:

synthetic_dataset = gretel.safe_synthetic_dataset \
    .from_data_source(ds) \
    .transform() \
    .synthesize() \
    .create()

If you want to turn off the holdout, you can do so from the .from_data_source() step.

synthetic_dataset = gretel.safe_synthetic_dataset \
    .from_data_source(ds, holdout=None) \
    .transform() \
    .synthesize() \
    .create()

Similarly, to adjust the size of the holdout, pass a value in the .from_data_source() step. By default, the holdout is set to 5%. The example below sets it to 10%.

synthetic_dataset = gretel.safe_synthetic_dataset \
    .from_data_source(ds, holdout=0.1) \
    .transform() \
    .synthesize() \
    .create()
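
The guidance above (turn off the holdout when you have fewer than 500 rows, otherwise keep it) can be expressed as a small helper. This is a sketch, not part of the Gretel SDK: `recommended_holdout` and its parameters are hypothetical names, and the 500-row threshold and 5% default are taken from the text above.

```python
def recommended_holdout(n_rows, default=0.05, min_rows=500):
    """Return the holdout value to use for a dataset with n_rows rows.

    Hypothetical helper: returns None (holdout off) for datasets with
    fewer than min_rows rows, otherwise the default holdout fraction.
    """
    if n_rows < min_rows:
        return None  # too little data; use every row for training
    return default
```

The returned value could then be passed as the holdout argument, for example `.from_data_source(ds, holdout=recommended_holdout(n_rows))`, where `n_rows` is the row count of your own dataset.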

If you want to turn off Evaluate, you must explicitly disable it:

synthetic_dataset = gretel.safe_synthetic_dataset \
    .from_data_source(ds) \
    .transform() \
    .synthesize() \
    .evaluate(disable=True) \
    .create()
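
If you switch these options on and off often, the variants above can be folded into one function. The sketch below uses only the builder calls shown on this page (`from_data_source`, `transform`, `synthesize`, `evaluate(disable=True)`, `create`). The wrapper `build_safe_synthetics`, its parameters, and the `_UNSET` sentinel are hypothetical and not part of the SDK. It assumes `gretel` is an already configured client object and `ds` is your data source.

```python
# Sentinel that means "caller did not override the holdout".
_UNSET = object()


def build_safe_synthetics(gretel, ds, holdout=_UNSET, evaluate=True):
    """Chain the Safe Synthetics steps with optional holdout/evaluate overrides.

    Hypothetical wrapper around the builder calls shown in this page.
    """
    # Only pass holdout when overridden, so the SDK default (5%) applies otherwise.
    kwargs = {} if holdout is _UNSET else {"holdout": holdout}
    builder = gretel.safe_synthetic_dataset.from_data_source(ds, **kwargs)
    builder = builder.transform().synthesize()
    if not evaluate:
        # Evaluate runs by default, so it must be disabled explicitly.
        builder = builder.evaluate(disable=True)
    return builder.create()
```

For example, `build_safe_synthetics(gretel, ds, holdout=None, evaluate=False)` would skip both the holdout and the Evaluate step, while `build_safe_synthetics(gretel, ds)` matches the standard template.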
