Gretel Playground [Legacy] Inference API
Real-time data generation with Gretel Playground
from gretel_client import Gretel
gretel = Gretel(api_key="prompt")
Real-time vs batch data generation
This section introduces the Playground inference API, which generates high-quality synthetic tabular and text data in real time with just a few lines of code.
Playground currently supports two data generation modes: tabular and natural_language. In both modes, you can choose the backend model that powers the generation, as described in more detail below.
Tabular data generation
The Gretel object has a factories attribute that provides helper methods for creating new objects that interact with Gretel's non-project-based APIs. Let's use the factories attribute to fetch the available backend models that power Playground's tabular data generation:
print(gretel.factories.get_navigator_model_list("tabular"))
This prints the list of available models. The first is gretelai/auto, which automatically selects the current default model; that default will change over time as models continue to evolve.
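Because the model behind gretelai/auto changes over time, a script that needs stable behavior may prefer to pin a specific model and fall back to the default only when that model is no longer listed. The helper below is a minimal sketch of that idea. choose_backend_model is a hypothetical name, not part of gretel_client, and the sketch assumes the model list is a plain list of model-name strings. The model names in the usage lines are placeholders.

```python
from typing import Iterable, Optional

DEFAULT_MODEL = "gretelai/auto"  # default named in the text above


def choose_backend_model(available: Iterable[str], preferred: Optional[str] = None) -> str:
    """Return `preferred` if it is listed, otherwise fall back to the default.

    Hypothetical helper (not part of gretel_client). Assumes `available`
    is an iterable of model-name strings.
    """
    models = list(available)
    if preferred is not None and preferred in models:
        return preferred
    return DEFAULT_MODEL


# Stand-in list; in practice this would come from
# gretel.factories.get_navigator_model_list("tabular").
example_models = ["gretelai/auto", "example/model-a"]
chosen = choose_backend_model(example_models, preferred="example/model-b")
print(chosen)
```

The returned name can then be passed as the backend_model argument when initializing the inference API.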
To initialize the Playground Tabular inference API, we use the initialize_navigator_api method. Then, we can generate synthetic data in real time using its generate method:
# the `backend_model` argument is optional and defaults to "gretelai/auto"
tabular = gretel.factories.initialize_navigator_api("tabular", backend_model="gretelai/auto")
prompt = """\
Generate customer bank transaction data. Include the following columns:
- customer_name
- customer_id
- transaction_date
- transaction_amount
- transaction_type
- transaction_category
- account_balance
"""
# generate tabular data from a natural language prompt
df = tabular.generate(prompt, num_records=25)
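Since the columns are requested in free-form text, it can be worth checking that the returned table actually contains them before using it downstream. The following is a minimal sketch. missing_columns is a hypothetical helper, and the returned list in the usage lines is a stand-in for list(df.columns), assuming df is a pandas DataFrame as the variable name suggests.

```python
from typing import Iterable, List


def missing_columns(expected: Iterable[str], actual: Iterable[str]) -> List[str]:
    """Return the expected column names absent from `actual`, in order.

    Comparison ignores case and surrounding whitespace, since a model may
    not reproduce the requested names character for character.
    """
    present = {str(name).strip().lower() for name in actual}
    return [name for name in expected if name.strip().lower() not in present]


expected = ["customer_name", "customer_id", "transaction_date", "transaction_amount"]

# Stand-in for list(df.columns) from the generate call above.
returned = ["customer_name", "Customer_ID", "transaction_date"]

print(missing_columns(expected, returned))
```

An empty result means every requested column is present; otherwise the prompt can be tightened and the generation retried.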
You can augment an existing dataset using the edit method:
# add column to the generated table using the `edit` method
edit_prompt = """\
Add the following column to the provided table:
- customer_address
"""
df_edited = tabular.edit(edit_prompt, seed_data=df)
Finally, Playground's tabular mode supports streaming data generation. To enable streaming, set the stream parameter to True:
prompt = """\
Generate positive and negative reviews for common household products purchased online.
Columns are: the product name, number of stars (1-5), review and customer id
"""
for record in tabular.generate(
    prompt=prompt,
    num_records=150,
    stream=True,
    sample_buffer_size=5,
):
    print(record)
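When streaming, it is often useful to stop reading early or to collect the records into a list once enough have arrived. The sketch below assumes each streamed record is a dictionary keyed by column name, which the text above does not confirm. take_records and fake_stream are hypothetical names, with fake_stream standing in for tabular.generate(..., stream=True) so the example runs without an API key.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List


def take_records(stream: Iterable[Dict[str, Any]], limit: int) -> List[Dict[str, Any]]:
    """Consume at most `limit` records from a stream and return them as a list."""
    return list(islice(stream, limit))


def fake_stream(n: int) -> Iterator[Dict[str, Any]]:
    """Stand-in generator that yields records one at a time."""
    for i in range(n):
        yield {"product_name": f"product-{i}", "stars": (i % 5) + 1}


records = take_records(fake_stream(150), limit=10)
print(len(records))
```

Because islice is lazy, the helper pulls only as many records from the stream as it returns.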
Natural language generation
Playground's natural_language mode gives you access to state-of-the-art LLMs for generating text data. Let's fetch the available backend models that power Playground's natural_language data generation:
print(gretel.factories.get_navigator_model_list("natural_language"))
As in tabular mode, this prints the list of available models. The first is gretelai/gpt-auto, which automatically selects the current default model.
To initialize the Playground Natural Language inference API, we again use the initialize_navigator_api method. Then, we can generate synthetic text data in real time using its generate method:
llm = gretel.factories.initialize_navigator_api("natural_language")
text = llm.generate("Please tell me a funny joke about data scientists.")
print(text)
# let's see if the llm is funnier with a higher temperature
text_higher_temp = llm.generate("Please tell me a funny joke about data scientists.", temperature=2)
print(text_higher_temp)
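To compare temperature settings side by side, the same prompt can be run over several values. The following is a minimal sketch. sweep_temperatures is a hypothetical helper that accepts any callable with the generate(prompt, temperature=...) shape used above, and fake_generate is a stand-in so the example runs without an API key. The temperature values are illustrative; the accepted range is not stated here.

```python
from typing import Callable, Dict, Sequence


def sweep_temperatures(
    generate: Callable[..., str],
    prompt: str,
    temperatures: Sequence[float],
) -> Dict[float, str]:
    """Call `generate` once per temperature and map each temperature to its output."""
    return {t: generate(prompt, temperature=t) for t in temperatures}


def fake_generate(prompt: str, temperature: float = 1.0) -> str:
    """Stand-in for llm.generate; echoes its inputs instead of calling a model."""
    return f"[temperature={temperature}] {prompt}"


results = sweep_temperatures(fake_generate, "Tell me a joke.", [0.5, 1.0])
for temperature, text in results.items():
    print(temperature, text)
```

With a real client, llm.generate would be passed in place of fake_generate.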