Data Quality Reports
We currently support evaluation and validation only for Text-to-Python and Text-to-SQL data generation tasks. Support for other generation tasks is coming soon!
As part of our commitment to providing high-quality synthetic data, Data Designer includes a comprehensive evaluation process for generated datasets. This evaluation is designed to give you insights into the quality, diversity, and usefulness of your synthetic data. Here's an overview of how we evaluate datasets in the current version of Data Designer:
Evaluation Process
Automatic Evaluation: By default, evaluations run as part of the synthetic data generation pipeline. However, you have the option to disable evaluation if you prefer to speed up the generation process.
Preview Mode: We offer the ability to run evaluations in preview mode, allowing you to quickly assess and iterate on your dataset before full generation (see the sketch after this list).
Full Dataset Analysis: Our primary focus is on full dataset evaluations, providing you with a comprehensive view of your generated data.
Visualization: We generate visualizations as part of our evaluations, making it easier for you to interpret the results at a glance.
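For illustration only, the sketch below shows how these toggles might look in code. The client and method names (`DataDesignerClient`, `preview`, `generate`, and the `evaluate` flag) are hypothetical placeholders, not the actual Data Designer API.

```python
# Hypothetical sketch only -- names below are placeholders, not the real Data Designer API.

class DataDesignerClient:
    """Stand-in client illustrating how evaluation toggles could be exposed."""

    def preview(self, config: dict, evaluate: bool = True) -> dict:
        # Generate a small sample and (optionally) evaluate it,
        # so you can iterate quickly on your configuration.
        ...

    def generate(self, config: dict, evaluate: bool = True) -> dict:
        # Run the full generation pipeline; evaluation runs by default
        # but can be disabled to speed up generation.
        ...


client = DataDesignerClient()

# Quick iteration: preview a small sample with evaluation enabled.
preview_result = client.preview(config={"task": "text-to-sql"}, evaluate=True)

# Full run with evaluation disabled for faster generation.
dataset = client.generate(config={"task": "text-to-sql"}, evaluate=False)
```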
Evaluation Metrics
Our evaluation process includes both general metrics applicable to all datasets and use-case-specific metrics for text-to-code datasets. Here are the key metrics we provide:
General Metrics
Diversity Analysis (illustrated in the sketch after this list):
Percentage of unique records
Number of unique values per column
Per-column distribution and diversity index
Distribution and visualization of contextual tags
LLM-based Quality Assessment:
We use a large language model (LLM) as a judge to evaluate the dataset on relevance, readability, scalability, and adherence to coding standards.
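To make the diversity metrics concrete, here is a minimal, self-contained sketch of how they can be computed with pandas. It is not the implementation Data Designer uses; in particular, the normalized Shannon entropy shown for the diversity index is an assumption.

```python
import math

import pandas as pd

def diversity_report(df: pd.DataFrame) -> dict:
    """Compute simple dataset-level and per-column diversity metrics."""
    report = {
        # Percentage of fully unique records across all columns.
        "pct_unique_records": 100.0 * (~df.duplicated()).mean(),
        "columns": {},
    }
    for col in df.columns:
        counts = df[col].value_counts(normalize=True)
        n_unique = int(df[col].nunique())
        # Normalized Shannon entropy as a diversity index in [0, 1]:
        # 0 means a single repeated value, 1 means a uniform distribution.
        entropy = -sum(p * math.log(p) for p in counts if p > 0)
        diversity_index = entropy / math.log(n_unique) if n_unique > 1 else 0.0
        report["columns"][col] = {
            "n_unique": n_unique,
            "diversity_index": round(diversity_index, 3),
            "distribution": counts.to_dict(),
        }
    return report

# Example with contextual tags such as industry and complexity:
df = pd.DataFrame({
    "industry": ["finance", "finance", "healthcare", "retail"],
    "complexity": ["easy", "medium", "medium", "hard"],
})
print(diversity_report(df))
```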
Text-to-Code Specific Metrics
Code Validity:
Fraction of valid code in the dataset (see the sketch after this list)
Code Quality Assessment:
LLM-based evaluation using a code-specific rubric, considering factors such as modularity, readability, and scalability
Static Analysis:
Linter-based code assessments for Python and SQL code
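As a rough sketch of what these checks can look like (not Data Designer's actual implementation), the snippet below measures Python syntactic validity with the standard-library ast module and shells out to a linter for static analysis; ruff is used here purely as a stand-in for whichever linters the pipeline runs.

```python
import ast
import subprocess
import tempfile

def python_validity_fraction(snippets: list[str]) -> float:
    """Fraction of snippets that parse as syntactically valid Python."""
    def is_valid(code: str) -> bool:
        try:
            ast.parse(code)
            return True
        except SyntaxError:
            return False
    return sum(is_valid(s) for s in snippets) / len(snippets)

def lint_python(code: str) -> str:
    """Run a linter (ruff here, as a stand-in) over a single snippet."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run(["ruff", "check", path], capture_output=True, text=True)
    return result.stdout

snippets = ["def add(a, b):\n    return a + b\n", "def broken(:\n    pass\n"]
print(python_validity_fraction(snippets))  # 0.5
```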
Evaluation Report
After the evaluation process, we provide you with a comprehensive report that includes:
Quantitative metrics for each evaluation category
Visualizations to help you quickly grasp the characteristics of your dataset
Qualitative feedback from our LLM-based assessments
This report is stored as part of the workflow artifacts, allowing you to review and compare evaluations across different iterations of your dataset.
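Purely as an illustration of the idea (the directory layout and file name below, e.g. workflow_artifacts/ and evaluation_report.json, are assumptions rather than the documented format), a stored report could be loaded and compared across dataset iterations like this:

```python
import json
from pathlib import Path

# Hypothetical artifact layout -- the real location and file name may differ;
# see the documentation on accessing evaluation reports.
ARTIFACTS_DIR = Path("workflow_artifacts")

def load_report(run_id: str) -> dict:
    """Load the evaluation report saved alongside a run's workflow artifacts."""
    return json.loads((ARTIFACTS_DIR / run_id / "evaluation_report.json").read_text())

# Compare a headline metric across two iterations of the dataset.
baseline = load_report("run-001")
improved = load_report("run-002")
print(baseline.get("pct_unique_records"), improved.get("pct_unique_records"))
```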
Benefits of Our Evaluation Process
Trust and Transparency: Our evaluation process helps build trust by providing clear, objective measures of data quality.
Iterative Improvement: By offering evaluations in preview mode, we enable you to quickly iterate and improve your data generation process.
Use-Case Optimization: Our text-to-code specific metrics ensure that generated code meets high standards of validity and quality.
Comprehensive Insights: By combining quantitative metrics, visualizations, and LLM-based assessments, we provide a holistic view of your dataset's characteristics and quality.
You can learn how to access the evaluation report here.