Snorkel AI Data Development Platform v25.6 (STS) release notes

Breaking changes

SDK

  • The following SDK functions have been removed from snorkelai.sdk.client.datasources:

    • create_datasource
    • delete_datasource
    • get_datasources
    • prep_and_ingest_datasource
    • split_datasources_by_percent
    • update_datasource
    • add_active_datasources
    • get_node_datasources
    • put_node_datasource
    • refresh_active_datasources
  • The following SDK functions have been removed from snorkelai.sdk.client.evaluation:

    • client.evaluation.create_evaluation_report
    • client.evaluation.preview_custom_prompt_metric

    Use the Snorkel GUI to run evaluations.

  • The following SDK functions have been removed from snorkelai.sdk.client.metrics:

    • client.metrics.add_metric_to_node
    • client.metrics.delete_metric_from_node
    • client.metrics.get_candidate_extractor_metrics
    • client.metrics.get_df_metrics
    • client.metrics.get_model_metrics
    • client.metrics.list_available_metrics
    • client.metrics.register_custom_metric
    • client.metrics.tune_threshold_on_valid
  • The following SDK functions have been removed from snorkelai.sdk.client.models:

    • client.models.add_predictions
    • client.models.get_models
    • client.models.get_predictions
    • client.models.register_model
    • client.models.register_trained_model
    • client.models.train_custom_model
    • client.models.train_model
  • The following SDK functions have been removed from snorkelai.sdk.client.nodes:

    • client.nodes.commit_model_to_node
  • The following SDK functions no longer take start_date and end_date as parameters (see the migration sketch after this list):

    • client.get_node_data
    • client.get_ground_truth
    • client.get_span_level_ground_truth_conflicts
    • client.get_model_metrics
    • client.get_predictions
  • snorkelai.sdk.client.download_remote_object has been removed.
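
As a migration example, calls that pass date bounds to the functions above now fail. The sketch below is illustrative only: client is assumed to be your authenticated Snorkel SDK client, and node_uid is an assumed argument name rather than the documented signature, so consult the SDK reference for each function's exact parameters.

    # Before (v25.5 and earlier) -- start_date and end_date were accepted:
    # data = client.get_node_data(node_uid, start_date="2025-01-01", end_date="2025-03-31")

    # After (v25.6) -- call without the removed date parameters.
    # node_uid is an assumed argument name for illustration; check the SDK
    # reference for the exact signature of each affected function.
    data = client.get_node_data(node_uid)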

User interface

  • The JupyterLab application no longer includes a button linking to the integrated Jupyter notebook app. Access the app from the Notebook option in the left navigation menu instead.

Features and improvements

Annotation

  • Annotation now supports multi-label tagging for sequence tagging tasks.
  • For spans, the inter-annotator agreement metric can now be configured so that strict overlap is not required to count as agreement. When you create a new label schema and enable the overlapping span option, you can set the percentage of overlap required. Instructions for defining a custom agreement threshold are included in the multi-schema annotation dataset upload workflow.
  • Reviewers can now choose which labels are ground truth on a per-annotation, per-document basis.
  • The annotation task creation interface now displays error messages when names conflict and when entries exceed the character limit.
  • Annotation task creation supports both single-label and multi-label sequence tagging.
  • Various quality-of-life and GUI improvements to annotation task creation.

Prompt development

  • You can now export a CSV file with data from the current prompt run, including inputs, model information, and the LLM's response.
  • You can now export a JSON file with a prompt template, containing the model, prompt, and metadata for a prompt version.

Evaluation

  • From the evaluation dashboard, you can now filter data by slice, score, and inter-annotator agreement. Use these filters to quickly surface problematic outputs and identify targets for prompt development.
  • The agreement score filter now uses the more intuitive agree and disagree options for binary and ordinal criteria, replacing the earlier percentage-based inputs.
  • You can now create custom code-based evaluators via the SDK (snorkelai.sdk.develop.CodeEvaluator). Code evaluators let you use Python to deterministically, quickly, and automatically assign the correct label to a datapoint during evaluation. In the Snorkel GUI, you can run these evaluators as part of a benchmark and see the results. Read about how to create a code evaluator using the SDK.
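
For illustration, the sketch below shows the shape such an evaluator might take. The class path snorkelai.sdk.develop.CodeEvaluator is the documented one, but the scoring function's signature and the constructor arguments shown here are assumptions, not the documented API; consult the SDK reference for the exact interface.

    # A minimal sketch of a code-based evaluator. The CodeEvaluator class path
    # is documented (snorkelai.sdk.develop.CodeEvaluator), but the scoring
    # function's signature and the constructor arguments below are assumptions
    # for illustration only.
    from snorkelai.sdk.develop import CodeEvaluator

    def response_is_nonempty(datapoint: dict) -> str:
        """Deterministically label a datapoint from the LLM response text."""
        response = datapoint.get("response", "")
        return "pass" if response.strip() else "fail"

    # Hypothetical registration call: argument names are illustrative.
    evaluator = CodeEvaluator(
        name="response_is_nonempty",  # evaluator name surfaced in the GUI
        func=response_is_nonempty,    # Python function that assigns the label
    )

Once registered, such an evaluator can run as part of a benchmark in the Snorkel GUI, with its results shown alongside other criteria.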

Bug fixes

Annotation

  • Fixed a bug where attempting to export multiple batches exported only one batch.
  • Cursor position for label schemas no longer resets after saving.
  • Fixed a scrolling bug with tables.

Known issues

Data upload

  • The dataset size shown in the GUI does not always match the actual file size.

Annotation

  • In review mode, the GUI does not scroll to the annotation.
  • In review mode, the annotation filter shows individual status rather than group status.
  • If all labels are rejected, an annotation is marked Resolved even if no ground truth was committed.

Prompt development

  • In the prompt workflow, users are unable to select a freeform annotation label schema.