Gretel Relational Report

Assess the accuracy and privacy of your synthetic database.

Introduction

To use data, you need to trust it. The Gretel Relational Report provides unique accuracy and privacy scores to help you verify the quality of your synthetic database. In addition to overall database scores, the report provides table-level insights that measure how well both in-table and cross-table relationships are maintained. This consumable report provides confidence that your data is accurate and secure.

Database SQS

The Synthetic Data Quality Score (SQS) is an estimate of how well the generated synthetic data maintains the same statistical properties as the original dataset. In this sense, the SQS can be viewed as a utility score, or as a confidence score for whether scientific conclusions drawn from the synthetic database would match those drawn from the original database. If you do not require statistical symmetry, as might be the case in a testing or demo environment, a lower score may be acceptable. If your SQS is not as high as you'd like it to be, check out our Tips to Improve Synthetic Data Quality.

Database PPL

The Privacy Protection Level (PPL) is determined by the model chosen for synthesis. Gretel Relational Synthetics supports Gretel LSTM, Gretel ACTGAN, Gretel Tabular DP, and Gretel Amplify. In general, Tabular DP will have the highest privacy scores, followed by LSTM and ACTGAN, and finally Amplify. Synthetic data is inherently more private than real-world data, so even a synthetic database with a Normal PPL is more secure than a non-synthesized database. When sharing data internally within a company, a PPL of Normal or better is recommended. When sharing data outside of your organization, we recommend a PPL of Very Good or higher.
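The sharing guidance above can be expressed as a simple threshold check. The following is a minimal sketch, not part of the Gretel SDK: the function name, the `audience` labels, and the full ordering of PPL grades are assumptions made for illustration (only Normal and Very Good are named above).

```python
# Assumed ordering of PPL grades, lowest to highest. Only "Normal" and
# "Very Good" appear in the guidance above; the rest is an assumption.
PPL_ORDER = ["Normal", "Good", "Very Good", "Excellent"]

# Minimum recommended PPL for each sharing context, per the guidance above.
MIN_PPL = {"internal": "Normal", "external": "Very Good"}


def meets_recommendation(ppl: str, audience: str) -> bool:
    """Return True if `ppl` is at or above the recommended level for `audience`."""
    return PPL_ORDER.index(ppl) >= PPL_ORDER.index(MIN_PPL[audience])
```

For example, under these assumptions `meets_recommendation("Good", "external")` is false, because Good ranks below Very Good.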

Table Relationships

The report includes a visual of the key relationships between tables in the database, as shown below. When you hover the cursor over a key, its related keys and tables are highlighted.

Individual and Cross-table SQS

For each table, individual and cross-table Synthetic Data Quality Reports are generated, each of which includes additional quality scores. The individual report evaluates the statistical accuracy of the individual synthetic table compared to the real-world table it is based on. This provides insight into the quality of the stand-alone synthetic table. The cross-table report evaluates the synthetic data of the table and all its ancestor tables. This provides insight into the accuracy of the table in the context of the database as a whole.
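To make "all its ancestor tables" concrete, here is a minimal sketch that walks foreign-key relationships upward from a table. The schema, the `parents` mapping, and the function name are hypothetical and are not taken from Gretel's implementation.

```python
def ancestor_tables(table: str, parents: dict[str, list[str]]) -> set[str]:
    """Collect every table reachable by following foreign keys upward.

    `parents` maps a table name to the tables its foreign keys reference.
    """
    seen: set[str] = set()
    stack = list(parents.get(table, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    # A self-referencing or cyclic key should not make a table its own ancestor.
    seen.discard(table)
    return seen


# Hypothetical schema: order_items references orders and products,
# and orders references users.
parents = {"order_items": ["orders", "products"], "orders": ["users"]}
print(sorted(ancestor_tables("order_items", parents)))
```

In this hypothetical schema, a cross-table report for `order_items` would take `orders`, `products`, and `users` into account, while a cross-table report for `users` would cover only `users` itself.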

The individual, cross-table, and relational reports are all bundled in the gretel_tabular output archive file, which can be found in Gretel Console under the Data Sources tab of your project.
