Snorkel AI Data Development Platform v25.7 (LTS) release notes

Breaking changes

SDK

  • Dataset.get_dataframe no longer accepts both split and datasource_uid parameters simultaneously.
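
Call sites that previously passed both parameters need to pick one. A minimal sketch of a guard that could be placed in front of such calls is shown here. The helper name `get_dataframe_kwargs` and the argument types are assumptions made for illustration and are not part of the Snorkel SDK; only the parameter names `split` and `datasource_uid` come from the note above.

```python
from typing import Any, Dict, Optional


def get_dataframe_kwargs(
    split: Optional[str] = None,
    datasource_uid: Optional[int] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a Dataset.get_dataframe call.

    Hypothetical helper (not an SDK function). The str/int types are
    assumed; adjust them to whatever your code already passes.
    """
    # 25.7 rejects the combination, so fail early with a clear message.
    if split is not None and datasource_uid is not None:
        raise ValueError(
            "Pass either split or datasource_uid to get_dataframe, not both."
        )

    # Forward only the argument that was actually provided.
    kwargs: Dict[str, Any] = {}
    if split is not None:
        kwargs["split"] = split
    if datasource_uid is not None:
        kwargs["datasource_uid"] = datasource_uid
    return kwargs
```

At a call site this might be used as `dataset.get_dataframe(**get_dataframe_kwargs(split="train"))`, where `dataset` is an existing Dataset object.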

SDK module, function, and class list at bottom of documentation

Snorkel has removed a large number of older SDK functions as part of an ongoing effort to streamline the platform. Because the list for 25.7 is long, see the SDK removals section at the end of this document.

Features and improvements

Data management

  • You can now use external Amazon S3 buckets to store datasets and associated files. See S3 external storage.
  • Updated SDK for data management:
    • Dataset.create_datasource now automatically adds datasources to the annotation node.
    • Dataset.create_datasource can now automatically generate uid_col values.

Annotation

Prompt development

  • You can enhance prompts with one- and few-shot ground-truth examples by selecting one or more datapoints to inject directly into the prompt. Each datapoint includes the input columns, the LLM's response from the current run, and the ground truth. See Improve LLMAJ alignment.
  • You can run prompts on a filtered subset of datapoints in both prompt development and LLMAJ evaluation workflows.

Evaluation

  • You can use the new Improve Prompt error analysis feature to diagnose and improve prompt accuracy in LLMAJ workflows. After you run a prompt, this feature automatically groups evaluator–human disagreements and suggests targeted prompt changes, so you can see what went wrong and how to fix it. See Improve LLMAJ alignment.
  • You can execute benchmark runs on the test split via snorkelai.sdk.develop.benchmarks.Benchmark.execute(splits=["test"]).
  • You can edit ground truth scores directly from the evaluation results interface to quickly correct or add ground truth labels without leaving the evaluation workflow.
  • You can pivot the evaluation table to display criteria as either rows or columns, depending on your analytical needs.

SDK updates for evaluation

  • You can create, update, and retrieve LLMAJ evaluators with the create(), update(), and get() methods; see snorkelai.sdk.develop.PromptEvaluator for details.
  • You can execute LLMAJ runs using PromptEvaluator methods: execute(), get_execution_result(), poll_execution_result(), and get_executions().
  • You can manage benchmarks and criteria through the SDK with comprehensive create(), get(), update(), and archive() capabilities.
  • You no longer need to provide a workspace_id in Benchmark.create(); if omitted, the system infers workspace_uid from context.
  • You can provide the workspace name instead of workspace ID in Benchmark.create().

Integrations

Bug fixes

SDK

  • Fixed Dataset.create_datasource when used with pandas DataFrames.
  • Fixed optional description and workspace_uid fields in Benchmark and Criteria create() and update() methods.

Known issues

User interface

  • Exporting multiple batches only exports one batch.

Data management

  • The dataset size shown in the GUI does not always match the actual file size.

SDK removals

Entire modules removed

  • The entire snorkelai.sdk.client.nodes module has been removed, including these functions:

    • client.nodes.add_active_datasources
    • client.nodes.add_node
    • client.nodes.add_node_hierarchy
    • client.nodes.commit_builtin_operator
    • client.nodes.commit_custom_operator
    • client.nodes.delete_node
    • client.nodes.fit_and_commit
    • client.nodes.get_model_node
    • client.nodes.get_model_nodes
    • client.nodes.get_node
    • client.nodes.get_node_data
    • client.nodes.get_node_datasources
    • client.nodes.get_node_input_cols
    • client.nodes.get_node_inputs_data
    • client.nodes.get_node_label_map
    • client.nodes.get_node_output_data
    • client.nodes.get_node_settings
    • client.nodes.get_node_uid
    • client.nodes.get_preprocessing_issues
    • client.nodes.list_nodes
    • client.nodes.put_node_datasource
    • client.nodes.refresh_active_datasources
    • client.nodes.set_node_settings
    • client.nodes.uncommit_operator
  • The entire snorkelai.sdk.client.operators module has been removed, including these functions:

    • client.operators.add_operator
    • client.operators.add_operator_class
    • client.operators.check_conflicting_operator_name
    • client.operators.delete_operator
    • client.operators.execute_operators
    • client.operators.get_custom_operators
    • client.operators.get_default_operator
    • client.operators.get_default_operators
    • client.operators.get_operator_code
    • client.operators.get_operator_config
  • The entire snorkelai.sdk.client.lfs module has been removed, including these functions:

    • client.lfs.archive_lf
    • client.lfs.archive_lfs
    • client.lfs.delete_lf
    • client.lfs.execute_lfs
    • client.lfs.get_lf
    • client.lfs.get_lfs
  • The entire snorkelai.sdk.client.lf_packages module has been removed, including these functions:

    • client.lf_packages.delete_lf_package
    • client.lf_packages.export_lf_packages
    • client.lf_packages.import_lf_packages
    • client.lf_packages.transfer_lf_packages
  • The entire snorkelai.sdk.client.lf_templates module has been removed, including these functions:

    • client.lf_templates.add_lf_template_class
    • client.lf_templates.delete_lf_template
    • client.lf_templates.get_lf_templates
  • The entire snorkelai.sdk.client.dataset_views module has been removed, including these functions:

    • client.dataset_views.create_dataset_view
    • client.dataset_views.delete_dataset_view
    • client.dataset_views.get_dataset_view
    • client.dataset_views.get_dataset_views
    • client.dataset_views.update_dataset_view
  • The entire snorkelai.sdk.client.batches module has been removed, including these functions:

    • client.batches.create_batches
    • client.batches.delete_batch
    • client.batches.get_batches
    • client.batches.get_x_uids_from_batch
    • client.batches.update_batch
  • The entire snorkelai.sdk.client.training_sets module has been removed, including these functions:

    • client.training_sets.delete_training_set
    • client.training_sets.get_training_set

Functions removed from existing modules

  • The following SDK functions have been removed from snorkelai.sdk.client.applications:

    • client.applications.add_block_to_application
    • client.applications.create_app_version
    • client.applications.create_application
    • client.applications.create_classification_application
    • client.applications.create_hocr_classification_application
    • client.applications.create_hocr_extraction_application
    • client.applications.create_multilabel_classification_application
    • client.applications.create_native_pdf_classification_application
    • client.applications.create_native_pdf_extraction_application
    • client.applications.create_sequence_tagging_application
    • client.applications.create_text_extraction_application
    • client.applications.delete_application
    • client.applications.duplicate_application
    • client.applications.execute_graph_on_data
    • client.applications.get_application
    • client.applications.get_applications
    • client.applications.list_app_versions
    • client.applications.load_app_version
    • client.applications.set_application_visibility
    • client.applications.update_application
    • client.applications.visualize_application_graph
  • The following SDK functions have been removed from snorkelai.sdk.client.blocks:

    • client.blocks.delete_operator_block
    • client.blocks.duplicate_block
    • client.blocks.get_operator_block
    • client.blocks.get_operator_blocks
  • The following SDK functions have been removed from snorkelai.sdk.client.fm_suite:

    • client.fm_suite.run_lf_inference
  • The following SDK functions have been removed from snorkelai.sdk.client.gts:

    • client.gts.get_inferred_document_ground_truth_from_span_ground_truth
  • The following SDK functions have been removed from snorkelai.sdk.client.transfer:

    • client.transfer.export_lfs
    • client.transfer.export_node_data
    • client.transfer.import_lfs
    • client.transfer.import_node_data
    • client.transfer.transfer_annotations
    • client.transfer.transfer_gts
    • client.transfer.transfer_lfs
    • client.transfer.transfer_lfs_by_name

Classes removed

  • The ModelNode SDK class has been removed from snorkelai.sdk.develop.
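
To check whether existing code still depends on the fully removed modules, a small import scanner can help. The following is a sketch using only the Python standard library; the function name `find_removed_imports` is hypothetical, and the module names are taken from the "Entire modules removed" list above. It does not cover the individual functions removed from modules that still exist (applications, blocks, fm_suite, gts, transfer), which would need a separate check.

```python
import ast

# Modules removed entirely in 25.7, per the list above.
REMOVED_MODULES = {
    "nodes", "operators", "lfs", "lf_packages",
    "lf_templates", "dataset_views", "batches", "training_sets",
}
CLIENT_PREFIX = "snorkelai.sdk.client"


def find_removed_imports(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, module path) pairs for removed-module imports."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. import snorkelai.sdk.client.nodes as nodes
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            # e.g. from snorkelai.sdk.client import nodes
            #      from snorkelai.sdk.client.nodes import add_node
            names = [node.module] + [
                f"{node.module}.{alias.name}" for alias in node.names
            ]
        else:
            continue
        for name in names:
            for module in REMOVED_MODULES:
                path = f"{CLIENT_PREFIX}.{module}"
                if name == path or name.startswith(path + "."):
                    hits.append((node.lineno, path))
                    break
    # De-duplicate: one import statement can match through several names.
    return sorted(set(hits))
```

Running the function over the text of each Python file in a project (for example, `find_removed_imports(open(path).read())`) gives a list of lines to update before upgrading.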