The Magic interface within Data Designer allows you to interactively define columns, preview samples, and refine data generation through natural language.
Key Benefits
Automatic prompt generation for LLM columns that reference other columns with correct formatting
Automatic structured output configuration for complex JSON schema definitions
Simplified categorical data creation with automatic inference of appropriate values
Interactive refinement through a conversational interface
Creating and Editing Columns with Magic
Magic SDK offers multiple ways to create columns.
Sampling Columns
Automatically create columns based on distributions or categories without manual configuration.
# Import path assumed from the gretel_client package; adjust if your installation differs.
from gretel_client.navigator_client import Gretel

dd = Gretel().data_designer.new()
dd.magic.add_sampling_column(
    "weather",
    "Possible weather types for Tokyo."
)
# values: ["Sunny", "Cloudy", "Rainy", "Snowy", "Windy"]
This call does the following:
Configures any relevant distribution parameters (here: the likelihood of occurrence of each weather type).
Updates the Data Designer object (dd) with this new column definition.
If you aren't pleased with the result, or you'd like to edit it in some way, call the same function again. Subsequent calls to the same function edit existing columns in place.
dd.magic.add_sampling_column(
    "weather",
    "All values should be in Japanese."
)
# values: ["晴れ", "曇り", "雨", "雪", "風"]
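To make the idea of a category column with per-value likelihoods concrete, here is a minimal sketch in plain Python. It is not how Data Designer samples internally; the weights are made-up illustrative values, and sample_category is a hypothetical helper.

```python
import random
from collections import Counter

# Hypothetical category definition: values plus relative likelihoods.
# The weights are illustrative assumptions, not output from Magic.
weather_values = ["Sunny", "Cloudy", "Rainy", "Snowy", "Windy"]
weather_weights = [0.4, 0.3, 0.2, 0.05, 0.05]


def sample_category(values, weights, n, seed=None):
    """Draw n values, each chosen with probability proportional to its weight."""
    rng = random.Random(seed)
    return rng.choices(values, weights=weights, k=n)


samples = sample_category(weather_values, weather_weights, n=1000, seed=0)
counts = Counter(samples)
print(counts.most_common())
```

With weights like these, common values such as "Sunny" should appear far more often than rare ones such as "Snowy".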
This function can even be used to edit pre-existing, manually created columns.
dd = Gretel().data_designer.new()
## Manually define a uniform sampler column...
dd.add_column(
    name="temperature",
    type="uniform",
    params={"low": 32.0, "high": 212.0}
)
## ...and edit it with Magic.
dd.magic.add_sampling_column(
    "temperature",
    "Change from F to C"
)
"""
SamplerColumn(
    name='temperature',
    type='uniform',
    params={
        "low": 0.0,
        "high": 100.0
    }
)
"""
Magic can create columns for a wide range of possible sampling types.
dd.magic.add_sampling_column("person_1", "An older man from upstate NY.")
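The Fahrenheit-to-Celsius edit of the temperature column shown earlier can be checked by hand. The sketch below applies the standard conversion formula to the original uniform bounds; fahrenheit_to_celsius is a hypothetical helper written for this illustration, not part of the SDK.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Bounds of the manually defined uniform sampler, in Fahrenheit.
original_params = {"low": 32.0, "high": 212.0}

# Convert each bound; the result matches the edited column's params.
converted_params = {
    key: fahrenheit_to_celsius(value) for key, value in original_params.items()
}
print(converted_params)  # {'low': 0.0, 'high': 100.0}
```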
A common way to increase the diversity of a dataset is to include more possible values that a sampling column can take on. To help with this common pattern, Magic offers tools specifically for this.
We can also extend categories created by Magic, not just hand-crafted ones.
dd = Gretel().data_designer.new()
## Generate a category and then boost its values for extra diversity
dd.magic.add_sampling_column("japan_city", "Cities in Japan.")
dd.magic.extend_category("japan_city", n=3)
dd.magic.extend_category("japan_city", n=3)
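As a rough way to see why more category values mean more diversity, the sketch below computes the Shannon entropy of a uniform choice among n values. The sizes used are assumptions for illustration: a hypothetical starting list of 5 cities, each extension adding exactly 3 new values, and uniform sampling across values.

```python
import math


def uniform_entropy_bits(n_values: int) -> float:
    """Shannon entropy, in bits, of a uniform choice among n_values categories."""
    return math.log2(n_values)


# Hypothetical sizes: 5 starting values, then two extensions of 3 values each.
for n in (5, 8, 11):
    print(n, round(uniform_entropy_bits(n), 2))
```

Entropy grows with the number of values, so each extension gives the sampler more distinct outcomes to draw from.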
dd = Gretel().data_designer.new()
dd.magic.add_sampling_column("japan_city", "Cities in Japan")
dd.magic.add_sampling_column("weather", "Possible weather types for Japan")
dd.magic.add_sampling_column("temperature", "Possible temperature in Japan, in C")
# Create a text description that depends on the weather column
dd.magic.add_column(
    "forecast",
    "A realistic weather forecast, as would be written in a newspaper. Two to three sentences."
)
Generated Config for Column forecast
LLMGenColumn(
    model_suite='apache-2.0',
    error_rate=0.2,
    model_configs=None,
    model_alias='text',
    prompt='Write a realistic weather forecast for {{ japan_city }} as it would appear in a newspaper. Include the weather conditions ({{ weather }}) and the temperature ({{ temperature }}°C) in your forecast. Keep it to two to three sentences.',
    name='forecast',
    system_prompt=None,
    output_type='text',
    output_format=None,
    description='A realistic weather forecast for a Japanese city, as it would appear in a newspaper, including weather conditions and temperature, written in two to three sentences.'
)
dd.preview() Output
| Column | Value |
| --- | --- |
| japan_city | Kyoto |
| weather | Sunny |
| temperature | 22.137... |
| forecast | Tomorrow in Kyoto will be sunny with clear skies throughout the day. The temperature will reach a high of 22°C. |
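The generated prompt is a template whose {{ ... }} placeholders are filled in from the other columns of each record. The sketch below imitates that substitution with a regular expression so it runs without extra dependencies. It handles only simple {{ name }} placeholders; render_prompt and the row values are hypothetical, and this is not Data Designer's actual rendering code.

```python
import re

# Matches simple {{ column_name }} placeholders only (no filters, no dotted access).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, row: dict) -> str:
    """Replace each {{ name }} placeholder with the matching value from row."""
    return PLACEHOLDER.sub(lambda match: str(row[match.group(1)]), template)


template = (
    "Write a realistic weather forecast for {{ japan_city }} as it would appear "
    "in a newspaper. Include the weather conditions ({{ weather }}) and the "
    "temperature ({{ temperature }}°C) in your forecast."
)
# Hypothetical sampled record.
row = {"japan_city": "Kyoto", "weather": "Sunny", "temperature": 22}
rendered = render_prompt(template, row)
print(rendered)
```

Each generated record gets its own rendered prompt, which is why the forecast text can mention the sampled city, weather, and temperature.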
As with sampling columns, we can edit an existing LLM generation column in place by calling the same function with new instructions. We can also specify exactly which pre-existing columns the generated prompt template must depend on.
dd.magic.add_column(
    "forecast",
    "The forecast should be two detailed paragraphs.",
    must_depend_on=["weather", "japan_city", "temperature"]
)
Updated Config for Column forecast
LLMGenColumn(
    model_suite='apache-2.0',
    error_rate=0.2,
    model_configs=None,
    model_alias='text',
    prompt='Write a realistic weather forecast for {{ japan_city }} as it would appear in a reputable Japanese newspaper. The forecast should include two detailed paragraphs:\n\n1. The first paragraph should provide a comprehensive description of the current weather conditions ({{ weather }}) and temperature ({{ temperature }}°C). Include any notable atmospheric phenomena, local impacts, and how the current weather affects daily life in the city.\n\n2. The second paragraph should offer a detailed outlook for the next day, including any expected changes in weather and temperature, potential impacts on daily activities, and any precautions residents should take. Use specific examples to illustrate your points.',
    name='forecast',
    system_prompt=None,
    output_type='text',
    output_format=None,
    description='A realistic weather forecast for a Japanese city, as it would appear in a reputable newspaper. The forecast includes the city name, weather conditions, and temperature in two detailed paragraphs.'
)
First Prompt Template:
Write a realistic weather forecast for {{ japan_city }} as it would appear in a newspaper. Include the weather conditions ({{ weather }}) and the temperature ({{ temperature }}°C) in your forecast. Keep it to two to three sentences.

Updated Prompt Template:
Write a realistic weather forecast for {{ japan_city }} as it would appear in a reputable Japanese newspaper. The forecast should include two detailed paragraphs:
1. The first paragraph should provide a comprehensive description of the current weather conditions ({{ weather }}) and temperature ({{ temperature }}°C). Include any notable atmospheric phenomena, local impacts, and how the current weather affects daily life in the city.
2. The second paragraph should offer a detailed outlook for the next day, including any expected changes in weather and temperature, potential impacts on daily activities, and any precautions residents should take. Use specific examples to illustrate your points.
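One way to think about must_depend_on is as a requirement that every listed column appears as a placeholder in the generated template. The sketch below checks that property on a template string. It is an illustration of the idea only, not Magic's implementation; referenced_columns and missing_dependencies are hypothetical helpers.

```python
import re

# Captures the top-level name inside {{ ... }}, e.g. "farmer" in {{ farmer.first_name }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)")


def referenced_columns(template: str) -> set:
    """Return the top-level column names referenced by {{ ... }} placeholders."""
    return set(PLACEHOLDER.findall(template))


def missing_dependencies(template: str, must_depend_on: list) -> list:
    """List required columns that the template does not reference."""
    used = referenced_columns(template)
    return [name for name in must_depend_on if name not in used]


# Hypothetical template that forgets the temperature column.
template = "Write a forecast for {{ japan_city }} with {{ weather }} conditions."
required = ["weather", "japan_city", "temperature"]
print(missing_dependencies(template, required))  # ['temperature']
```

A check like this is handy when reviewing a generated config by eye, since long prompts make a missing placeholder easy to overlook.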
Structured Outputs
Magic isn't limited to generating text columns. It can also generate configurations for structured output columns without requiring you to know JSONSchema or Pydantic.
dd.magic.add_column(
    "hourly_weather_data",
    "Structured data with hourly temperature, humidity, and wind speed predictions.",
    must_depend_on=["forecast", "japan_city"]
)
Generated Config for Column hourly_weather_data
LLMGenColumn(
    model_suite='apache-2.0',
    error_rate=0.2,
    model_configs=None,
    model_alias='text',
    prompt="Given the following weather forecast for {{ japan_city }}: '{{ forecast }}'. Generate structured hourly predictions for temperature, humidity, and wind speed for the next 24 hours. Format the data as follows: Hour, Temperature (°C), Humidity (%), Wind Speed (km/h).",
    name='hourly_weather_data',
    system_prompt=None,
    output_type='structured',
    output_format={
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "Hour": {
                    "type": "string",
                    "description": "The hour of the day in 24-hour format (e.g., '00:00', '01:00', ..., '23:00')."
                },
                "Temperature": {
                    "type": "number",
                    "description": "The predicted temperature in Celsius for the given hour."
                },
                "Humidity": {
                    "type": "number",
                    "description": "The predicted humidity percentage for the given hour."
                },
                "Wind_Speed": {
                    "type": "number",
                    "description": "The predicted wind speed in kilometers per hour for the given hour."
                }
            },
            "required": [
                "Hour",
                "Temperature",
                "Humidity",
                "Wind_Speed"
            ]
        }
    }
)
In this configuration, we can see that Magic has correctly chosen the "structured" output type and has also included a JSONSchema definition for the output structure. This structure will be followed during data generation, as shown below.
Generated Sample of hourly_weather_data using dd.preview()
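To show what following this structure means for a generated value, the sketch below checks a parsed JSON value against the same shape: an array of objects with the four required fields and the stated types. It is a hand-written check for this one schema, not a general JSONSchema validator and not what Data Designer runs; the sample record is invented for the illustration.

```python
import json

# Required fields and their expected Python types, mirroring the schema above.
REQUIRED_FIELDS = {
    "Hour": str,
    "Temperature": (int, float),
    "Humidity": (int, float),
    "Wind_Speed": (int, float),
}


def validate_hourly_records(records) -> list:
    """Return a list of problems found; an empty list means the data conforms."""
    if not isinstance(records, list):
        return ["top-level value must be an array"]
    problems = []
    for i, item in enumerate(records):
        if not isinstance(item, dict):
            problems.append(f"item {i}: must be an object")
            continue
        for field, expected in REQUIRED_FIELDS.items():
            if field not in item:
                problems.append(f"item {i}: missing required field {field!r}")
            # bool is a subclass of int in Python, so reject it explicitly.
            elif isinstance(item[field], bool) or not isinstance(item[field], expected):
                problems.append(f"item {i}: field {field!r} has the wrong type")
    return problems


# Invented sample record, for illustration only.
sample = json.loads(
    '[{"Hour": "00:00", "Temperature": 18.5, "Humidity": 70, "Wind_Speed": 12.0}]'
)
print(validate_hourly_records(sample))  # []
```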
The output data structure definition can also be edited with successive calls. For instance, we can rename its fields or require additional ones.
dd.magic.add_column(
    "hourly_weather_data",
    "Make the fields lowercase. Also require an additional field, air_quality_index."
)
Generated Sample of hourly_weather_data using dd.preview()
Refining Prompts
Because tweaking prompt templates is a common task when designing data generation steps, Magic also has tools specifically for refining and varying prompts (and nothing else). Consider the following example.
dd = Gretel().data_designer.new()
dd.add_column(name="farmer", type="person")
dd.add_column(
    name="question",
    prompt="Ask {{ farmer.first_name }} about the price of apples today."
)
dd.magic.refine_prompt("question", "Use the farmer's full name.")
dd.magic.refine_prompt("question", "Ask about pears instead.")
dd.magic.refine_prompt("question", "Ask about pears if the farmer is a man and oranges if not.")
| Step | Prompt Template |
| --- | --- |
| Original | Ask {{ farmer.first_name }} about the price of apples today. |
| First Edit | Ask {{ farmer.first_name }} {{ farmer.last_name }} about the price of apples today. |
| Second Edit | Ask {{ farmer.first_name }} {{ farmer.last_name }} about the price of pears today. |
| Third Edit | Ask {{ farmer.first_name }} {{ farmer.last_name }} about the price of {% if farmer.sex == 'Male' %}pears{% else %}oranges{% endif %} today. |
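The conditional in the final template can be read as an ordinary if/else. The sketch below is a plain-Python equivalent of that template, written to show which branch each record takes. render_question and the farmer records are hypothetical, and the field names simply mirror those used in the template.

```python
def render_question(farmer: dict) -> str:
    """Plain-Python equivalent of the final template's if/else branch."""
    fruit = "pears" if farmer["sex"] == "Male" else "oranges"
    return (
        f"Ask {farmer['first_name']} {farmer['last_name']} "
        f"about the price of {fruit} today."
    )


# Hypothetical person records, for illustration only.
farmers = [
    {"first_name": "Ken", "last_name": "Sato", "sex": "Male"},
    {"first_name": "Ana", "last_name": "Lopez", "sex": "Female"},
]
questions = [render_question(farmer) for farmer in farmers]
for question in questions:
    print(question)
```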
This use of refine_prompt is quite close to add_column's ability to edit columns. However, refine_prompt ensures that there are no spurious changes to any other part of the LLM generation config.
Sometimes, though, you simply want to rephrase a prompt. A bare call to refine_prompt on the target column will produce a variation of that prompt template.
for _ in range(3):
    dd.magic.refine_prompt("question")
Prompt Template Variations
You are a customer at a local market. Ask {{ farmer.first_name }} {{ farmer.last_name }} about the price of {% if farmer.sex == 'Male' %}pears{% else %}oranges{% endif %} today. Your question should sound natural and polite, as if you are having a conversation with a neighbor or friend.
You are a customer at a local market. Politely ask {{ farmer.first_name }} {{ farmer.last_name }} about the price of {% if farmer.sex == 'Male' %}pears{% else %}oranges{% endif %} today. Your question should sound natural and friendly, as if you are having a conversation with a neighbor or friend.
You are a customer at a local market. Please politely ask {{ farmer.first_name }} {{ farmer.last_name }} about the price of {% if farmer.sex == 'Male' %}pears{% else %}oranges{% endif %} today. Your question should sound natural and friendly, as if you are having a conversation with a neighbor or friend. Begin your question with 'Hi' or 'Hello' and use a casual, friendly tone.
Interactive Workflow
Sometimes you might want to see possible outputs to help you request edits to column configurations. Magic supports an interactive, chat-like interface, which can be accessed by setting interactive=True:
magic.add_sampling_column(..., interactive=True)
magic.add_column(..., interactive=True)
While in this mode, you will be able to view Magic-proposed column configurations, optionally preview the outputs of that column for the current state of the Data Designer object, and then optionally accept those changes or request further edits via text commands.
dd = Gretel().data_designer.new()
dd.add_column(name="farmer", type="person")
dd.add_column(
    name="question",
    prompt="Ask {{ farmer.first_name }} about the price of apples today.",
)
dd.magic.add_column(
    "question",
    "Use the farmer's full name.",
    interactive=True
)
| Command | Description |
| --- | --- |
| accept | Save the current configuration |
| cancel | Discard changes |
| start-over | Reset to initial state |
| retry | Generate configuration again |
| preview | Generate sample data |
| preview-on/preview-off | Toggle automatic previews |
Best Practices
For most projects, combine Magic SDK with the standard Data Designer SDK:
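# Import paths below are assumed from the gretel_client package layout; adjust if yours differs.
from gretel_client.navigator_client import Gretel
from gretel_client.data_designer import columns as C
from gretel_client.data_designer import params as P

# Initialize
gretel = Gretel(api_key="YOUR_API_KEY")
dd = gretel.data_designer.new(model_suite="apache-2.0")
# Use explicit configuration for well-defined base columns
dd.add_column(
    C.SamplerColumn(
        name="employee_id",
        type=P.SamplerType.UUID,
        params=P.UUIDSamplerParams(
            prefix="GRETEL_",
            short_form=True,
            uppercase=True
        )
    )
)
# Use Magic for more complex columns
dd.magic.add_sampling_column(
    "product_category",
    "Main product categories in an e-commerce store"
)
# Create the LLM column that refine_prompt operates on below
dd.magic.add_column(
    "product_description",
    "A short description of a product in the given category."
)
# Refine and extend
dd.magic.refine_prompt("product_description", "Add more technical specifications")
dd.magic.extend_category("product_category", n=10)
# Preview a sample of the generated data
preview = dd.preview()
preview.display_sample_record()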
Magic can also help with creating column configurations for LLM generation columns, helping you draft prompts and even define structured outputs. To get started, let's return to our previous weather examples.