Synthetic Data Quality Score (SQS)
How to evaluate any two datasets to generate a Gretel Synthetic Data Quality Report.
Remember, we suggest using your synthetic data as the in-data parameter, and the data you wish to compare it with as the ref-data parameter.
For more details about how to interpret and use the report, please see our Synthetic Data Quality Report page.
Because sqs is the default evaluation task type, you can simply reference the default evaluate configuration via the GitHub blueprint shortcut: evaluate/default.
The CLI usage to create a Quality Report is:
$ gretel models create --config evaluate/default --in-data synthetic.csv --ref-data compare.csv --output report-dir
This will upload both datasets to Gretel Cloud, generate the report, and download the report artifacts to the report-dir directory. Within this directory, the artifacts of interest are:
report.html.gz, which is an HTML document that contains the full report
report_json.json.gz, which is a JSON version of the report
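Because both artifacts are gzip-compressed, they need to be decompressed before use. A minimal sketch of reading them with the Python standard library follows. The helper names load_report_json and load_report_html are hypothetical, and the sketch assumes only the two file names listed above; it makes no assumption about the keys inside the JSON report.

```python
import gzip
import json
from pathlib import Path


def load_report_json(report_dir):
    """Return the parsed JSON report stored in report_dir (hypothetical helper)."""
    path = Path(report_dir) / "report_json.json.gz"
    # "rt" opens the gzip stream in text mode so json.load can read it directly.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def load_report_html(report_dir):
    """Return the decompressed HTML report as a string (hypothetical helper)."""
    path = Path(report_dir) / "report.html.gz"
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()
```

For example, load_report_json("report-dir") would return the report as a Python dictionary once the CLI command above has finished.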
If you want this job to run on your local host (the machine where you run the command), add the --runner local flag.
The Gretel SDK provides Python classes specifically to run reports. The QualityReport() class uses evaluate with the sqs task type to generate a Synthetic Data Quality Report. The most basic usage is below:
from gretel_client.evaluation.quality_report import QualityReport
# NOTE: These data sources may also be Pandas DataFrames!
data_source = "synthetic.csv"
ref_data = "compare.csv"
report = QualityReport(data_source=data_source, ref_data=ref_data)
report.run() # this will wait for the job to finish
# This will return the full report JSON details
report.as_dict
# This will return the full HTML contents of the report
report.as_html
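If you want to keep the SDK results on disk, one option is to write the two values to files yourself. The sketch below is illustrative only: save_report_outputs and the output file names report.json and report.html are hypothetical, and the function takes the dictionary and HTML string as plain arguments rather than relying on anything else in the SDK.

```python
import json
from pathlib import Path


def save_report_outputs(report_dict, report_html, out_dir):
    """Write the report dictionary and HTML string to out_dir (hypothetical helper)."""
    out = Path(out_dir)
    # Create the directory if it does not exist yet.
    out.mkdir(parents=True, exist_ok=True)
    json_path = out / "report.json"  # hypothetical file name
    html_path = out / "report.html"  # hypothetical file name
    json_path.write_text(json.dumps(report_dict, indent=2), encoding="utf-8")
    html_path.write_text(report_html, encoding="utf-8")
    return json_path, html_path
```

With the report object from the example above, this could be called as save_report_outputs(report.as_dict, report.as_html, "report-dir").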
If you do not specify a project parameter when using the QualityReport() class, then a temporary project will be created and deleted after the report finishes and the artifacts are downloaded. This differs slightly from CLI behavior, where temporary projects are not used.
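To avoid the temporary project, you can pass an existing project to the class. The following is a rough sketch, not a definitive recipe: run_report_in_project is a hypothetical wrapper, and it assumes that your installed gretel_client exposes create_or_get_unique_project in gretel_client.projects and that your Gretel credentials are already configured. Check both against your client version before relying on it.

```python
def run_report_in_project(project_name, data_source, ref_data):
    """Run a quality report inside a named project (hypothetical wrapper)."""
    # Imported here so this module can be loaded without gretel_client installed.
    from gretel_client.evaluation.quality_report import QualityReport
    # Assumption: this helper is available in your gretel_client version.
    from gretel_client.projects import create_or_get_unique_project

    project = create_or_get_unique_project(name=project_name)
    report = QualityReport(
        project=project,
        data_source=data_source,
        ref_data=ref_data,
    )
    report.run()  # waits for the job to finish
    return report.as_dict
```

Because the project is supplied explicitly, no temporary project is involved and the report job should remain associated with that project after the run.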