Prompts Tips & Best Practices

Writing a clear, well-formatted prompt goes a long way toward high-quality tabular results and often resolves errors you may be experiencing. Follow these guidelines to get the best from Navigator.

Write prompts relevant for tabular data

Do not submit spam or prompts irrelevant to a tabular dataset (like "hello") to Navigator. If you're looking for question-and-answer, try submitting your prompt to Gretel GPT instead (select "Gretel GPT" in the dropdown in the Playground).

Describe the data you want in detail

The more detail you include about what the output should and shouldn’t look like, the better the results. In particular:

  • List the columns you want the output to have.

  • Describe each column, including the format you want the data to follow (e.g. YYYY-MM-DD for dates), the range of values, if applicable, and the context.

If you include sample data, make sure your prompt matches it

A mismatch between the text prompt and the sample data you provide can confuse the model and cause errors. For best results, always make sure the two match.
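One way to catch a mismatch before submitting is to compare the column names your prompt mentions against the header of your sample data. The sketch below assumes the sample data is CSV text with a header row; the helper name mismatched_columns is hypothetical and not part of Navigator.

```python
import csv
import io


def mismatched_columns(prompt_columns, sample_csv):
    """Return (missing_from_sample, missing_from_prompt) as sorted lists.

    Hypothetical helper: assumes the sample data is CSV text whose
    first row is the header.
    """
    reader = csv.reader(io.StringIO(sample_csv))
    header = next(reader, [])
    # Compare case-insensitively and ignore surrounding whitespace.
    sample = {name.strip().lower() for name in header}
    prompt = {name.strip().lower() for name in prompt_columns}
    return sorted(prompt - sample), sorted(sample - prompt)


sample_csv = "first_name,last_name,dob\nJean,Dupont,2001-04-12\n"
missing_from_sample, missing_from_prompt = mismatched_columns(
    ["first_name", "last_name", "email"], sample_csv
)
print(missing_from_sample)  # columns the prompt asks for but the sample lacks
print(missing_from_prompt)  # columns the sample has but the prompt omits
```

If either list is non-empty, update the prompt or the sample data so that both describe the same columns.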

When adding an example table of records, include a short text prompt alongside it

This helps the model parse the example table. The text prompt can be as simple as an instruction to generate more data following the example data. Example:

Generate 30 rows of data exactly like the following table
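If you assemble prompts in code, the instruction and the example table can be joined into a single string. This is a minimal sketch: the function name prompt_with_example and the CSV layout of the example table are assumptions made for illustration, and how you submit the resulting prompt to Navigator is not shown.

```python
def prompt_with_example(rows, example_table):
    """Prefix an example table with a short instruction (hypothetical helper)."""
    instruction = f"Generate {rows} rows of data exactly like the following table"
    # A blank line separates the instruction from the table.
    return f"{instruction}:\n\n{example_table.strip()}"


# Illustrative sample records only.
example_table = """
first_name,last_name,city
Jean,Dupont,Lyon
Marie,Martin,Paris
"""

prompt = prompt_with_example(30, example_table)
print(prompt)
```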

When using SQL statements as your prompt, use CREATE TABLE statements

A CREATE TABLE statement carries more information than a SELECT statement. You can also combine the two, for example:

CREATE TABLE users (
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    dob DATE,
    email VARCHAR(100), -- formatted as the first letter of their first name followed by their last name @foo.io (e.g., jdupont@foo.io)
    city VARCHAR(50),
    country VARCHAR(50) -- all users are from France
);
SELECT * FROM users WHERE dob > '2000-01-01';

Add clear column descriptions

If you want to generate a table with multiple columns, use a bulleted list and a short, clear description of the data you want in each column. Example:

Create a U.S. flight passenger dataset with the following columns:
- Traveler ID: a 6-character alphanumeric ID
- Departing city: a city in the U.S.
- Arrival city: a city in the U.S.
- Duration: duration of the flight, in minutes
- Number of seats: seats on the flight
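If your column definitions already live in code, a prompt in this bulleted shape can be generated rather than typed by hand. The sketch below is one possible approach; build_column_prompt is a hypothetical helper, not a Navigator function.

```python
def build_column_prompt(dataset_description, columns):
    """Build a prompt with one bullet per column (hypothetical helper).

    `columns` is a list of (name, description) pairs, kept in order.
    """
    lines = [f"{dataset_description} with the following columns:"]
    for name, description in columns:
        lines.append(f"- {name}: {description}")
    return "\n".join(lines)


flight_columns = [
    ("Traveler ID", "a 6-character alphanumeric ID"),
    ("Departing city", "a city in the U.S."),
    ("Arrival city", "a city in the U.S."),
    ("Duration", "duration of the flight, in minutes"),
    ("Number of seats", "seats on the flight"),
]

flight_prompt = build_column_prompt(
    "Create a U.S. flight passenger dataset", flight_columns
)
print(flight_prompt)
```

Keeping the columns in a list makes it easy to add, remove, or reword a column description without retyping the whole prompt.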
