Snorkel Flow v25.4 (LTS) release notes
Breaking changes
- Legacy notebook functionality on Kubernetes has been removed.
Features and improvements
Annotation
- Improved role-based UI restrictions: Labelers (Annotators) now see only tabs relevant to their workflow.
- You can now perform bulk labeling for sequence tagging multi-schema annotation (Text).
- Label schemas can now be included during annotation task creation.
- The loading state for annotations has been streamlined for better performance.
- The delimiter for highlight UIDs has been updated.
- Added a Data Explorer tab to the datasets page, where users can view individual data points in a dataset.
Prompt development
- You can now run your prompt on just the first N rows of data for faster testing and iteration in both traditional and LLM-as-a-judge (LLMAJ) prompt development.
- The prompt editor now correctly highlights columns regardless of case sensitivity.
- Ground truth filter errors in LLMAJ and prompt development have been fixed.
ML tasks: text
- Error reporting for Ground Truth upload has been improved, with partial writes now disabled.
Infrastructure
- Added SDK function sf.get_model_provider_status() to validate operational status of foundation model providers.
- Added support for running concurrent LLM inference requests on Bedrock.
- Improved responsiveness of Helm charts.
User interface
- Dataset splits now appear in the UI immediately upon first upload.
Integrations
- You can now send Llama API prompt inference requests via the Custom Inference Service.
SDK
- Fixed a circular dependency issue when installing a wheel.
Bug fixes
ML tasks
- Fixed an issue with Ground Truth uploads.
SDK
- The workspace_uid parameter has been removed from set_secret.
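Code that still passes workspace_uid to set_secret will need updating. As a stopgap, a thin wrapper can discard the argument and warn. The following is a minimal sketch only: call_without_workspace_uid and fake_set_secret are hypothetical names introduced for illustration, and the stand-in function does not reflect the real set_secret signature, which these notes do not describe.

```python
import warnings
from typing import Any, Callable


def call_without_workspace_uid(
    set_secret: Callable[..., Any], *args: Any, **kwargs: Any
) -> Any:
    """Forward to set_secret, discarding a leftover workspace_uid keyword.

    Hypothetical helper: it only strips the removed keyword argument and
    makes no other assumption about how set_secret is called.
    """
    if "workspace_uid" in kwargs:
        kwargs.pop("workspace_uid")
        warnings.warn(
            "workspace_uid is no longer accepted by set_secret; dropping it",
            DeprecationWarning,
            stacklevel=2,
        )
    return set_secret(*args, **kwargs)


# Stand-in used only so the sketch runs without the SDK installed.
# It is NOT the real set_secret; its parameters are assumptions.
def fake_set_secret(key: str, value: str) -> dict:
    return {"key": key, "value": value}


# An old-style call that still passes workspace_uid goes through the wrapper.
result = call_without_workspace_uid(
    fake_set_secret, "api_key", "s3cr3t", workspace_uid=1
)
```

In practice, the SDK's own set_secret function would be passed in place of the stand-in. The wrapper is best treated as temporary until call sites are updated to stop passing workspace_uid.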
Known issues
Data upload
- New data sources do not have embeddings generated if that feature is not activated.
- Uploading large CSV files can show unrelated errors during data upload.
- There may be dataset size discrepancies between the actual file size and what is shown in the GUI.
- Downloading PDFs with the https URL fails.
Annotation
- For PDFs, the annotation filter for negative ground truth doesn't work.
- In Annotation Trace view, the first document might not load existing annotations on page load.
- Focus on free text multi-schema annotation inputs may act erratically, especially when using Tab to switch between fields.
Data development
- You may receive an error that certain datapoints are not in index after resampling a dev split.
- Studio /dataset and /advanced-lf-state endpoints error out with a cryptic error message when there is no span.
- context_pages and page_docs need to be dropped at the /dataset and /context-dataset endpoints.
Prompt development
- You may encounter errors when creating an LLMAJ with the exact name of a previously deleted one.
- Errors may occur when loading ground truth for LLM responses in prompt development.
- Inaccurate counts may appear in traces batch.
- Trace step pagination may not load all steps of a trace when loading a long trace.
Evaluation
- Multi-trace dictionary values for trace steps have limitations with int, float, and bool types.
- Benchmark graphs may load before the run is completed.
- The LLMAJ Schema filter is not properly scoped to the selected benchmark.
- The GT filter in All Prompt views asks for a value; it should offer 'present/absent' instead.
- In Prompt table view, truncated text may be difficult to read.
- You may experience double-loading when clicking to the next page in Traces.
SDK
- The SDK aggregate_annotations() method may fail.