Snorkel AI Data Development Platform v25.9 (STS) release notes

Breaking changes

The update() method for the Dataset, Benchmark, Criteria, and Evaluator classes now returns nothing (None) for consistency across the SDK. Code that relies on the return value of update() must be changed.
Legacy Node and OperatorNode classes have been removed from the SDK.
SageMaker integration support has been removed.
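
The update() change above can be illustrated with a minimal sketch. It does not import the Snorkel SDK: FakeDataset and its name parameter are hypothetical stand-ins, used only to show which calling pattern breaks when update() returns None and which one keeps working.

```python
# Hypothetical stand-in class; this is NOT the real Snorkel SDK Dataset class,
# and the `name` parameter is an assumption made for illustration only.
class FakeDataset:
    def __init__(self, name: str) -> None:
        self.name = name

    def update(self, name: str) -> None:
        """Mutate the object in place and return nothing."""
        self.name = name


dataset = FakeDataset("reviews-v1")

# Pattern that breaks: treating the return value of update() as the object.
result = dataset.update(name="reviews-v2")
print(result)  # None

# Pattern that keeps working: call update(), then keep using the same object.
dataset.update(name="reviews-v3")
print(dataset.name)  # reviews-v3
```

When migrating, a search for assignments or chained calls on the result of update() is a reasonable first pass.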

Deprecations

No deprecations in this release.

Features and improvements

Prompt development

You can now run prompts on the first N traces of a trace dataset for quicker iteration and testing.

Evaluation

You can now view datapoint-level results for benchmark evaluation runs in either table or individual record view, in addition to aggregate results.
You can now select which model to use for error analysis from the criteria editor page. If no selection is made, the first configured model in the Foundation Model suite is used by default.
You can now see if a criterion has conditions and view the applied condition.
You can add conditions during criteria creation. Criteria with conditions will only evaluate datapoints that meet those conditions.
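
As a conceptual illustration of conditional criteria, the sketch below scores only the datapoints that satisfy a condition and leaves the rest unscored. It is not the platform's implementation: evaluate_with_condition, the dictionary datapoints, and the use of None for unscored datapoints are all assumptions made for this example.

```python
from typing import Any, Callable, Optional

Datapoint = dict[str, Any]


def evaluate_with_condition(
    datapoints: list[Datapoint],
    score_fn: Callable[[Datapoint], float],
    condition: Optional[Callable[[Datapoint], bool]] = None,
) -> list[Optional[float]]:
    """Score datapoints that meet the condition; leave the others as None."""
    scores: list[Optional[float]] = []
    for datapoint in datapoints:
        if condition is None or condition(datapoint):
            scores.append(score_fn(datapoint))
        else:
            # Datapoint does not meet the condition, so it is not evaluated.
            scores.append(None)
    return scores


# Illustrative data only; the field names are hypothetical.
datapoints = [
    {"language": "en", "response": "Hello"},
    {"language": "fr", "response": "Bonjour"},
    {"language": "en", "response": ""},
]

scores = evaluate_with_condition(
    datapoints,
    score_fn=lambda dp: 1.0 if dp["response"] else 0.0,
    condition=lambda dp: dp["language"] == "en",
)
print(scores)  # [1.0, None, 0.0]
```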

Infrastructure

Python has been upgraded to version 3.12 for improved performance and security.
Dependencies have been upgraded to numpy==2.0.2, spacy==3.8.6, and scikit-learn==1.7.1 for enhanced functionality.
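
If you maintain an environment that is meant to mirror these pins, a quick check like the following may help spot drift. This is a minimal sketch using only the Python standard library. The helper name find_mismatches is hypothetical, and whether your client environment needs to match the platform's versions is an assumption to confirm for your deployment.

```python
import sys
from importlib import metadata

# Versions listed in these release notes.
EXPECTED_PYTHON = (3, 12)
EXPECTED_PACKAGES = {
    "numpy": "2.0.2",
    "spacy": "3.8.6",
    "scikit-learn": "1.7.1",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    if sys.version_info[:2] != EXPECTED_PYTHON:
        print(f"Python {sys.version_info[0]}.{sys.version_info[1]} found, expected 3.12")
    for package, (wanted, installed) in find_mismatches(EXPECTED_PACKAGES).items():
        print(f"{package}: expected {wanted}, found {installed}")
```

The get_version parameter exists only so the check can be exercised without the packages installed; in normal use the default lookup through importlib.metadata is enough.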

SDK

You can now create, get, and delete benchmark executions through the SDK.
You can now delete slices through the SDK.
A new Clusters class has been added to the SDK for managing data clusters. You can use this class to fetch clusters generated by error analysis.
A new Error Analysis class has been added to the SDK for triggering error analysis runs.

Bug fixes

User interface

The unnecessary primary text field has been removed from classification tasks.

Data management

Serialized columns are now correctly deserialized and displayed in the UI.

Evaluation

Criteria names are no longer truncated in the benchmark evaluations table.
Issues with prompt execution functionality have been resolved.

Annotation

Single label schemas now render properly in reviewer mode.

Known issues

Application

The breadcrumb shown to annotators includes admin pages.
A typo appears on the label schema creation form.
The search input box for sequence tagging doesn't return labels in record view.
Trace view within the prompt editing UI fails to load for old benchmarks.
When you filter the list of users to assign for annotation and then select all, every available user is selected instead of only the filtered users.
Failures of ground truth import jobs are not surfaced to the UI.
Ground truth does not update properly for trace datasets in the traces view.
When using targeted criteria, the tooltip on the evaluation runs page shows an inaccurate slice count.
The "Datapoints with no score" warning for targeted criteria doesn't match the number of datapoints the criteria evaluated.
Breadcrumb navigation may get overlapped by content while scrolling the page.

Infrastructure

Splitting a datasource by percentage can fail if an unused connector is disabled.

Data management

Dataset size discrepancies may occur between the actual file size and what is shown in the UI.
