Define your Data Columns

Introduction to Column Definition

In Data Designer, columns are the fundamental building blocks that determine what data you'll generate and how it will be structured. This guide introduces the key concepts for defining columns that produce high-quality synthetic data.

The Column Definition Process

The Data Designer workflow revolves around defining columns that work together to produce realistic data. Each column definition specifies:

  • What type of data to generate (statistical distributions, categories, AI-generated text, etc.)

  • How to generate the data (parameters, prompts, dependencies)

  • Relationships with other columns (constraints, dependencies)
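As a rough illustration of these three parts, a column definition can be pictured as a small record. The sketch below uses a plain Python dataclass; `ColumnSpec`, its field names, and `missing_dependencies` are hypothetical names chosen for this example and are not part of the Data Designer SDK.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnSpec:
    """Hypothetical stand-in for a column definition (not the Data Designer API)."""
    name: str
    kind: str                                        # what type of data to generate
    params: dict = field(default_factory=dict)       # how to generate it
    depends_on: list = field(default_factory=list)   # relationships with other columns


# Illustrative column set: two sampled columns and one text column that
# depends on a sampled value.
columns = [
    ColumnSpec("department", "category", {"values": ["Sales", "Engineering"]}),
    ColumnSpec("salary", "uniform", {"low": 40_000, "high": 120_000}),
    ColumnSpec(
        "bio",
        "llm-text",
        {"prompt": "Write a one-line bio for someone in {department}."},
        depends_on=["department"],
    ),
]


def missing_dependencies(specs):
    """Return (column, dependency) pairs that point at undefined columns."""
    known = {spec.name for spec in specs}
    return [
        (spec.name, dep)
        for spec in specs
        for dep in spec.depends_on
        if dep not in known
    ]


print(missing_dependencies(columns))
```

Keeping dependencies explicit in the definition makes it easy to check that every referenced column exists before any data is generated.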

Key Concepts in Column Definition

Column Types

Data Designer supports a rich variety of column types, from simple statistical distributions to complex AI-generated content:

  • Sampling-based columns: Generate data through statistical methods (categories, numbers, dates)

  • LLM-based columns: Generate realistic text and structured content using large language models
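To make the distinction concrete, here is a minimal sketch in plain Python. It assumes nothing about the Data Designer SDK: the sampling helpers use the standard `random` module, and `fake_llm` is a placeholder that stands in for a real language model call. All function and column names are invented for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so repeated runs behave the same way


def sample_category(values):
    """Sampling-based column: pick one value from a fixed list."""
    return rng.choice(values)


def sample_uniform(low, high):
    """Sampling-based column: draw a number from a uniform range."""
    return round(rng.uniform(low, high), 2)


def fake_llm(prompt):
    """Placeholder for an LLM call; a real model would return generated text."""
    return f"[generated text for: {prompt}]"


def generate_row():
    # Sampled columns come first because they need no other inputs.
    row = {
        "department": sample_category(["Sales", "Engineering", "Support"]),
        "salary": sample_uniform(40_000, 120_000),
    }
    # The LLM-based column builds its prompt from an already sampled value.
    row["bio"] = fake_llm(f"Write a one-line bio for someone in {row['department']}.")
    return row


print(generate_row())
```

The ordering matters in this sketch: the text column can only be produced after the sampled value it references is available.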

Learn more about column types →

Column Constraints

Constraints allow you to control the values your columns can contain, enforcing business rules and maintaining data consistency:

  • Scalar constraints: Restrict numerical values to specific ranges

  • Column relationships: Ensure logical relationships between columns
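The two kinds of constraint can be sketched with simple rejection sampling: draw a candidate row, then keep it only if every rule holds. This is an illustrative technique chosen for the example, not a description of how Data Designer enforces constraints internally. The column names (`age`, `start_date`, `end_date`) and the bounds are assumptions.

```python
import random
from datetime import date, timedelta

rng = random.Random(42)


def satisfies(row):
    """Check one scalar constraint and one column relationship."""
    scalar_ok = 18 <= row["age"] <= 65                    # scalar constraint
    relation_ok = row["end_date"] >= row["start_date"]    # column relationship
    return scalar_ok and relation_ok


def draw_row():
    """Draw an unconstrained candidate row."""
    year_start = date(2024, 1, 1)
    return {
        "age": rng.randint(0, 90),
        "start_date": year_start + timedelta(days=rng.randint(0, 364)),
        "end_date": year_start + timedelta(days=rng.randint(0, 364)),
    }


def sample_with_constraints(n, max_tries=10_000):
    """Keep drawing candidates until n rows satisfy every constraint."""
    rows = []
    for _ in range(max_tries):
        candidate = draw_row()
        if satisfies(candidate):
            rows.append(candidate)
            if len(rows) == n:
                return rows
    raise RuntimeError("constraints are too strict for this sampler")


for row in sample_with_constraints(3):
    print(row)
```

Rejection sampling is easy to reason about but wastes draws when constraints are tight, which is one reason to keep the constrained range close to the sampled range.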

Learn more about column constraints →

Designing an Effective Column Strategy

Best Practices for Column Definition

  1. Start with seed data: Define your categorical and structural columns first

  2. Build relationships: Create dependencies between related columns

  3. Layer complexity: Begin with basic columns, then add more sophisticated ones

  4. Preview frequently: Use aidd.preview() to validate your design iteratively
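The four practices above can be walked through with a small local stand-in. `SketchDesigner` below is a hypothetical class written for this example; only the idea of calling a preview method after each change comes from the text, and the real `aidd.preview()` may accept different arguments and return a different object.

```python
import random


class SketchDesigner:
    """Local stand-in that illustrates the workflow; not the Data Designer SDK."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._columns = []  # (name, generator) pairs in definition order

    def add_column(self, name, generator):
        self._columns.append((name, generator))
        return self

    def preview(self, num_records=3):
        """Generate a few rows, filling columns in the order they were added."""
        rows = []
        for _ in range(num_records):
            row = {}
            for name, generator in self._columns:
                row[name] = generator(self._rng, row)
            rows.append(row)
        return rows


designer = SketchDesigner()

# 1. Start with a seed column, then preview right away.
designer.add_column("plan", lambda rng, row: rng.choice(["free", "pro"]))
print(designer.preview())

# 2-3. Layer on a column that depends on the seed column.
designer.add_column(
    "monthly_price",
    lambda rng, row: 0 if row["plan"] == "free" else rng.choice([20, 50]),
)

# 4. Preview again to confirm the relationship holds before scaling up.
preview_rows = designer.preview()
print(preview_rows)
```

Previewing after each added column keeps mistakes local: if a relationship looks wrong, the most recent column is the likely cause.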


Last updated 29 days ago