June 2024
Release notes for the Gretel Platform, June 2024
2024.6.9
Add support for setting crawl limits when configuring Gretel Workflow object storage connectors. To set a limit, configure `limit` on your object storage source connector.
2024.6.8
Improvements to Workflow config validation. Workflow action names are now validated to ensure uniqueness within a Workflow config.
Gretel BigQuery connections can now be created without specifying a `dataset`. You can instead configure the BigQuery dataset by passing `bq_dataset` when configuring a `bigquery_source` action.
Bugfix to database subsetting. When collecting batches of data, those batches previously needed to contain the same set of columns. This constraint would sometimes break subsetting if columns were sparsely populated.
Hybrid Model docker images have now been consolidated into a single Model image.
Hybrid Workflow docker images have now been consolidated into a single Workflow image.
Intermediate Workflow artifacts are now immediately cleaned up when a Workflow completes. When a Workflow is configured with a sink, any intermediate model artifacts produced by the Workflow are cleaned up and removed when the Workflow completes.
GPT-x: update config validation to limit `epsilon` to be between 0.1 and 100.
GPT-x: ensure sampling probability is never larger than 1.0.
Bugfix: When writing objects to Azure Blob Storage, data was written in blocks that were too small, leading to errors when writing larger objects. Objects are now written in larger 25 MB blocks.
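To illustrate why block size matters for the Azure Blob Storage fix above: Azure block blobs are committed as a list of blocks, and the service caps a blob at 50,000 committed blocks, so the block size bounds the largest object that can be written. The sketch below is only arithmetic, not Gretel's writer. The helper names are hypothetical, and reading "25 MB" as 25 MiB is an assumption. The release note does not state the previous block size or the exact failure mode.

```python
from typing import Iterator

# Assumption: "25 MB" is read as 25 MiB for this illustration.
BLOCK_SIZE = 25 * 1024 * 1024
# Azure block blobs allow at most 50,000 committed blocks per blob.
MAX_BLOCKS_PER_BLOB = 50_000


def block_count(object_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks needed to upload an object of object_size bytes."""
    if object_size < 0 or block_size <= 0:
        raise ValueError("object_size must be >= 0 and block_size must be > 0")
    # Integer ceiling division avoids float rounding on very large sizes.
    return (object_size + block_size - 1) // block_size


def max_object_size(block_size: int = BLOCK_SIZE) -> int:
    """Largest object (in bytes) that fits under the block-count cap."""
    return block_size * MAX_BLOCKS_PER_BLOB


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of data, each at most block_size bytes."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]
```

With these assumptions, a 10 GiB object needs `block_count(10 * 1024**3)` blocks at 25 MiB each, and `max_object_size()` gives the ceiling imposed by the block-count cap at that block size.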
2024.6.7
Standardize Tv2 `column` properties. The `column` object can be used to access specific properties of a column that is being evaluated in Tv2. See the Tv2 reference for more details.
Update Tv2 to maintain referential integrity. By default, the `gretel_tabular` action when using Tv2 will ensure that PK/FK columns are not transformed. By setting `run.encode_keys: true` within the action, keys will be transformed to integers or UUIDs.
Bugfix in `gretel_tabular` where null foreign keys could be included when using subsetting.
Bugfix for Synthetic Quality Score for field correlation stability when missing values are in the data.
Bugfix for enforcing Teams runtime limits (max objects crawled, max bytes processed) on Workflows. These limits were previously loaded from the individual user. They are now loaded from the Team if the user is a member of one.
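The referential integrity note above says keys can be transformed to integers or UUIDs while keeping PK/FK relationships intact. One way to picture the integer case is a single mapping built from the primary key column and reused for every foreign key column that references it. This is a minimal sketch of that idea, not Gretel's implementation. The function names, table names, and sample values are all hypothetical.

```python
from typing import Dict, Hashable, List, Optional


def build_key_map(primary_keys: List[Hashable]) -> Dict[Hashable, int]:
    """Assign each distinct primary key a stable integer, starting at 1."""
    mapping: Dict[Hashable, int] = {}
    for value in primary_keys:
        if value not in mapping:
            mapping[value] = len(mapping) + 1
    return mapping


def encode_column(
    values: List[Optional[Hashable]], mapping: Dict[Hashable, int]
) -> List[Optional[int]]:
    """Encode a key column with the shared mapping, leaving nulls as nulls."""
    return [None if v is None else mapping[v] for v in values]


# Hypothetical parent table (users) and child table (orders).
users_pk = ["u-17", "u-42", "u-99"]
orders_fk = ["u-42", "u-17", None, "u-42"]

key_map = build_key_map(users_pk)
encoded_pk = encode_column(users_pk, key_map)
encoded_fk = encode_column(orders_fk, key_map)
```

Because both columns go through the same `key_map`, every encoded foreign key still points at the row it referenced before encoding, and null foreign keys stay null.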
2024.6.6
🚀 Hello Navigator Fine-Tuning! Our newest multi-modal model is live!
Check out the blog for even more details!
This model is available via the `models-navigator_ft` container for Hybrid customers.
2024.6.5
Improve error messages within Gretel Navigator
Added new `partial_mask()` filter to Tv2.
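The note above does not describe the arguments or behavior of `partial_mask()`, so the following is only a generic illustration of what partial masking usually means: keep a few leading or trailing characters and replace the rest. The function name `partial_mask_sketch` and all of its parameters are hypothetical and should not be read as the Tv2 filter's signature. See the Tv2 reference for the actual filter.

```python
def partial_mask_sketch(
    value: str,
    keep_prefix: int = 0,
    keep_suffix: int = 0,
    mask_char: str = "*",
) -> str:
    """Mask the middle of value, keeping keep_prefix and keep_suffix characters.

    Hypothetical helper for illustration only; not the Tv2 partial_mask() filter.
    """
    if keep_prefix < 0 or keep_suffix < 0:
        raise ValueError("keep_prefix and keep_suffix must be non-negative")
    if len(mask_char) != 1:
        raise ValueError("mask_char must be a single character")
    # If the kept portions would cover the whole value, mask everything
    # rather than return the value unmasked.
    if keep_prefix + keep_suffix >= len(value):
        return mask_char * len(value)
    masked_len = len(value) - keep_prefix - keep_suffix
    suffix_start = len(value) - keep_suffix
    return value[:keep_prefix] + mask_char * masked_len + value[suffix_start:]
```

Masking the entire value when it is too short is a design choice made here so that short inputs are never returned in the clear.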
2024.6.4
Update model names within Gretel Navigator
Bug fix for Gretel Navigator edit mode when adding numerical columns.
2024.6.3
For GPT-x, the `delta` hyperparam will only be automatically updated if `dp: true`. Previously it was updated regardless of DP being enabled, which was unnecessary.
Improvements to the SQS Text Statistical Score for measuring quality of synthetic natural language data.
2024.6.02
Improved prompt validation for Gretel Navigator
When using Tv2 with `gretel_tabular`, columns are no longer restored to their original order. Reordering caused issues when Tv2 configs added or removed columns.
2024.6.1
Tv2 NER will utilize GPUs when available.
Databricks destination connector optimizations.
Better handling for foreign key columns with null values in `gretel_tabular`.