Batch Job SDK

Documentation for the batch job SDK, which lets you use Playground at scale.


Last updated 1 month ago


  1. Initialize Playground Batch with a model config:

model_config = """
schema_version: 1.0
models:
 - navigator:
       model_id: "gretelai/auto"
       output_format: "jsonl"
"""
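When the model ID or output format needs to vary between runs, the config string can be built programmatically instead of edited by hand. The sketch below relies on the fact that YAML 1.2 is designed as a superset of JSON, so JSON text can be read as YAML. The function name `build_model_config` is hypothetical and not part of the Gretel SDK, and whether the SDK's config loader accepts JSON-style text is an assumption worth verifying in your environment.

```python
import json


def build_model_config(model_id: str = "gretelai/auto", output_format: str = "jsonl") -> str:
    """Return a config string with the same structure as the hand-written YAML.

    Hypothetical helper, not part of the Gretel SDK.
    """
    config = {
        "schema_version": 1.0,
        "models": [
            {"navigator": {"model_id": model_id, "output_format": output_format}}
        ],
    }
    # JSON text is also readable as YAML 1.2, so this string is assumed
    # to be usable wherever the literal YAML string would be.
    return json.dumps(config, indent=2)


model_config = build_model_config()
```

The defaults reproduce the values from the config above, so calling the function without arguments yields an equivalent document.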
  2. Use these helper functions to work with the batch SDK in your own workflows:

import pandas as pd
from gretel_client.helpers import poll


def submit_generate(model, prompt: str, params: dict, ref_data=None) -> pd.DataFrame:
    """
    Generate or augment data from the Navigator model.

    Args:
        model: The model object that will process the prompt.
        prompt (str): The text prompt to generate data from.
        params (dict): Parameters for data generation.
        ref_data: Optional existing dataset to edit or augment.

    Returns:
        pd.DataFrame: The generated data.
    """
    # Wrap the prompt in a single-row DataFrame and create a record handler for it.
    data_processor = model.create_record_handler_obj(
        data_source=pd.DataFrame({"prompt": [prompt]}),
        params=params,
        ref_data=ref_data,
    )
    # Submit the job to Gretel Cloud and block until it finishes.
    data_processor.submit_cloud()
    poll(data_processor, verbose=False)
    # Read the "data" artifact as gzip-compressed JSON Lines.
    return pd.read_json(data_processor.get_artifact_link("data"), lines=True, compression="gzip")
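Since the batch SDK is meant for running Playground at scale, a common next step is to run several prompts with the same settings. The following is a minimal sketch of that pattern. `run_prompts` is a hypothetical name, not part of the Gretel SDK, and the function takes the generation callable as an argument so it can wrap `submit_generate` or any function with the same keyword arguments.

```python
from typing import Any, Callable, Dict, Iterable, List


def run_prompts(
    generate: Callable[..., Any],
    model: Any,
    prompts: Iterable[str],
    params: Dict[str, Any],
) -> List[Any]:
    """Call `generate` once per prompt, in order, and collect the results.

    Hypothetical helper, not part of the Gretel SDK.
    """
    results = []
    for prompt in prompts:
        # Pass a copy of params so one call cannot mutate the settings of the next.
        results.append(generate(model=model, prompt=prompt, params=dict(params)))
    return results
```

With the helper above, usage would look like `frames = run_prompts(submit_generate, model, [prompt_a, prompt_b], params)`. Because `submit_generate` polls until each job completes, the prompts run one after another rather than in parallel.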

Example:

# Generate mock dataset
prompt = """\
Generate a mock dataset for users from the Foo company based in France.


Each user should have the following columns:
* first_name: traditional French first names.
* last_name: traditional French surnames.
* email: formatted as the first letter of their first name followed by their last name @foo.io (e.g., jdupont@foo.io).
* gender: Male/Female/Non-binary.
* city: a city in France.
* country: always 'France'.
"""

params = {
    "num_records": 10,
    "temperature": 0.8,
    "top_p": 1,
    "top_k": 50,
}
# `model` is the model object created from the model config above.
df = submit_generate(model=model, prompt=prompt, params=params)
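Because each call submits a cloud job, it can help to catch obviously malformed generation parameters before submitting. The sketch below checks the four keys used in the example. `check_params` is a hypothetical helper, and the ranges it enforces are the conventional ones for sampling parameters, assumed here rather than taken from the Gretel documentation.

```python
from typing import Any, Dict, List


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _is_positive_int(value: Any) -> bool:
    return isinstance(value, int) and not isinstance(value, bool) and value >= 1


def check_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper. The accepted ranges are assumptions, not Gretel's rules.
    """
    problems = []
    if not _is_positive_int(params.get("num_records")):
        problems.append("num_records should be a positive integer")
    temperature = params.get("temperature")
    if temperature is not None and not (_is_number(temperature) and temperature >= 0):
        problems.append("temperature should be a non-negative number")
    top_p = params.get("top_p")
    if top_p is not None and not (_is_number(top_p) and 0 < top_p <= 1):
        problems.append("top_p should be in the interval (0, 1]")
    top_k = params.get("top_k")
    if top_k is not None and not _is_positive_int(top_k):
        problems.append("top_k should be a positive integer")
    return problems
```

Calling `check_params(params)` before `submit_generate` and stopping when the returned list is non-empty avoids submitting a job that is likely to be rejected.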