Migration guide: Batches to Tasks

With release v25.11, Snorkel Flow introduces Annotation Tasks, a redesigned workflow that replaces the legacy Annotation Batches system. Tasks provide a more flexible, stable, and scalable way to manage annotation workloads across projects and teams.

This guide explains what changed, why it changed, and how to migrate your existing workflows in both the UI and the SDK.

What Changed?

Annotation Tasks offer a more flexible, datapoint-aware model than Batches. Key improvements:

  • Datapoint-level visibility and control

    • View progress per datapoint
    • Assign or reassign specific datapoints to annotators
    • Add new datapoints to an existing task
  • Editable annotation forms

    • Add or remove label schemas (and thus questions) even after task creation
  • Clearer task lifecycle

    • Track annotation progress through status transitions

Side-by-Side Comparison

In the UI

Batches | Tasks
Batches tab | Linked Annotation Tasks tab
Create batch | Create task
No datapoint-level visibility | Full datapoint-level visibility (progress, assignments)
Batch-level annotator assignment only | Datapoint-level assignment & rebalancing
Static label schemas | Editable annotation forms
Must create new batch to add datapoints | Add datapoints directly to a task

In the SDK

Batches | Tasks
Batch class | AnnotationTask class
Batch object exposes batch_size, x_uids | Task object exposes annotation form, dynamic x_uids, creator info, and schema IDs
Batch.create() | AnnotationTask.create()
Fixed datapoint set on creation | Add/remove datapoints dynamically (add_datapoints, remove_datapoints)
Label schemas fixed on creation | Add label schemas anytime (add_label_schemas)
Batch-level assignment control | Datapoint-level assignment (add_assignees, remove_assignees, list_user_assignments)
Annotations are accessed from Datasets via get_dataframe(include_annotations=True) / export(include_annotations=True) and committed as ground truth via commit(source_uid, label_schema_uids=None) | Annotations are accessed from Tasks via add_annotations(annotations, user_format=True), get_annotations(user_format=True, user_uids=None, label_schema_uids=None, source_uids=None), and get_annotation_status(user_format=True) for datapoint-level assignees & status

Code Migration Examples

Example 1 — Creating a Batch → Creating a Task

Before (Batches)

from snorkelai.sdk.develop import Dataset

dataset = Dataset.get("contracts-dataset")

batches = dataset.create_batches(
    name="contracts-review-batch",
    assignees=[101, 102],  # user UIDs
    label_schemas=[schema_1, schema_2],
    batch_size=500,
    split="train",
)
batch = batches[0]

After (Tasks)

from snorkelai.sdk.develop import Dataset, AnnotationTask

dataset = Dataset.get("contracts-dataset")

task = AnnotationTask.create(
    dataset_uid=dataset.uid,
    name="contracts-review-task",
    description="Initial review for contracts",
)

# Attach the same label schemas (by UID)
task.add_label_schemas(label_schema_uids=[schema_1.uid, schema_2.uid])

# Choose datapoints (e.g., first 500 UIDs)
df = dataset.get_dataframe(max_rows=500, target_columns=["__DATAPOINT_UID"])
x_uids = df["__DATAPOINT_UID"].tolist()
task.add_datapoints(x_uids=x_uids)

# Assign annotators at the datapoint level
task.add_assignees(users=[101, 102], x_uids=x_uids)
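
The call above gives both annotators the same 500 datapoints. If each annotator should instead receive a disjoint share, the split has to be computed before calling add_assignees. The following is a minimal sketch of one way to do that with a round-robin split. The helper name split_round_robin and the "doc::N" UIDs are illustrative placeholders, not part of the SDK, and the add_assignees call appears only as a comment because it needs a live task.

```python
from itertools import cycle
from typing import Dict, List, Sequence


def split_round_robin(x_uids: Sequence[str], users: Sequence[int]) -> Dict[int, List[str]]:
    """Deal datapoint UIDs out to annotators one at a time, like cards."""
    if not users:
        raise ValueError("at least one annotator is required")
    shares: Dict[int, List[str]] = {user: [] for user in users}
    for x_uid, user in zip(x_uids, cycle(users)):
        shares[user].append(x_uid)
    return shares


# Placeholder UIDs for illustration only.
example_uids = [f"doc::{i}" for i in range(5)]
shares = split_round_robin(example_uids, users=[101, 102])

for user, uids in shares.items():
    print(user, uids)
    # With a real task, each share could then be assigned separately, e.g.:
    # task.add_assignees(users=[user], x_uids=uids)
```

Round-robin keeps share sizes within one datapoint of each other. It assumes the user UIDs are unique.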

Example 2 — Get list of Batches → list of Tasks

Before (Batches)

from snorkelai.sdk.develop import Dataset

dataset = Dataset.get("contracts-dataset")
batches = dataset.batches # property: List[Batch]

After (Tasks)

from snorkelai.sdk.develop import Dataset, AnnotationTask

dataset = Dataset.get("contracts-dataset")
tasks = AnnotationTask.list(dataset=dataset.uid) # List[AnnotationTask]

Example 3 — Export batch annotations → Export Task Annotations

Before (Batches)

from snorkelai.sdk.develop import Batch

batch = Batch.get(batch_uid=123)
df = batch.get_dataframe(
    selected_fields=["__DATAPOINT_UID", "text"],
    include_annotations=True,
    include_ground_truth=False,
)

After (Tasks)

from snorkelai.sdk.develop import AnnotationTask

task = AnnotationTask.get(annotation_task_uid=456)
annotations = task.get_annotations(
    user_format=True,
    label_schema_uids=[101],
)

Example 4 — Commit ground truth from annotations

Before (Batches)

from snorkelai.sdk.develop import Batch

batch = Batch.get(batch_uid=123)
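df = batch.get_dataframe(
    selected_fields=["__DATAPOINT_UID", "text"],
    include_annotations=True,
    include_ground_truth=False,
)

# Commit one annotator's labels as ground truth.
# The source_uid and label schema UID below are placeholder values.
batch.commit(source_uid=7, label_schema_uids=[101])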

After (Tasks)

from snorkelai.sdk.develop import Dataset, AnnotationTask
import pandas as pd

dataset = Dataset.get("contracts-dataset")
task = AnnotationTask.get(annotation_task_uid=456)

# 1. Pull finalized annotations from a task
ann = task.get_annotations(
    user_format=True,
    label_schema_uids=[101],
)

# 2. Transform into Dataset.add_ground_truth format
gt_df = ann.reset_index()[["__DATAPOINT_UID", "label"]]

# 3. Write ground truth to the dataset
job_uid = dataset.add_ground_truth(
    label_schema_uid=101,
    data=gt_df,
    user_format=True,
    sync=False,
)
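
If several annotators labeled the same datapoint, the pulled annotations may contain more than one label per UID, and they need to be reduced to one label each before being written as ground truth. The sketch below shows one common way to do this, a simple majority vote. It is an assumption-laden illustration: the helper majority_vote, the (UID, label) record shape, and the sample labels are hypothetical and do not reflect the actual layout returned by get_annotations.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def majority_vote(records: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each datapoint UID to its most frequent label; skip tied datapoints."""
    votes: Dict[str, Counter] = defaultdict(Counter)
    for x_uid, label in records:
        votes[x_uid][label] += 1

    resolved: Dict[str, str] = {}
    for x_uid, counter in votes.items():
        ranked = counter.most_common(2)
        # A tie between the top two labels is left unresolved for manual review.
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue
        resolved[x_uid] = ranked[0][0]
    return resolved


# Hypothetical (datapoint UID, label) pairs, one per annotator vote.
records = [
    ("doc::0", "approve"), ("doc::0", "approve"), ("doc::0", "reject"),
    ("doc::1", "reject"),
    ("doc::2", "approve"), ("doc::2", "reject"),  # tie, so doc::2 is skipped
]
resolved = majority_vote(records)
print(resolved)
```

Tied datapoints are dropped here rather than resolved arbitrarily, so they can be routed back for another review pass.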

Migration Strategies

Strategy 1: Keep Using Batches Temporarily

  • Batches are currently in maintenance mode
  • Continue using Batch SDKs
  • Begin exploring Tasks for new workflows

Strategy 2: Migrate Gradually

  • Create Tasks for new annotation flows
  • Allow existing batches to complete naturally

Strategy 3: Full Migration Immediately

  • Replace all Batch workflows with their Task equivalents

Known Limitations

  • Some advanced datapoint selection strategies that Batches handled (e.g., randomization) must be implemented client-side
  • Auto-migration of existing batches is not provided
  • Some older UI pages may still reference “batch” terminology until batches are fully deprecated
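
For the first limitation, randomized selection can be done on the client before datapoints are added to a task. The following is a minimal sketch using only the Python standard library. The helper name sample_uids and the "doc::N" UIDs are illustrative placeholders, and the add_datapoints call is shown only as a comment because it needs a live task.

```python
import random
from typing import List, Optional, Sequence


def sample_uids(x_uids: Sequence[str], k: int, seed: Optional[int] = None) -> List[str]:
    """Return up to k datapoint UIDs chosen uniformly at random, without replacement."""
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    k = min(k, len(x_uids))
    return rng.sample(list(x_uids), k)


# Placeholder UIDs for illustration only.
all_uids = [f"doc::{i}" for i in range(1000)]
picked = sample_uids(all_uids, k=500, seed=42)

# With a real task, the sample could then be attached via:
# task.add_datapoints(x_uids=picked)
```

Passing a fixed seed makes the selection reproducible, which helps when the same sample has to be recreated later.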

FAQs

Do I need to migrate immediately?

No. For now, Batches continue to work but emit deprecation warnings.

What happens to existing batches?

They remain fully functional. You do not need to recreate them.

Can I use Tasks and Batches simultaneously?

Yes. Both appear in the UI and SDK from v25.11 onwards.

Will Batch SDKs break?

No, but they are deprecated and will no longer be actively supported.

Are Tasks a 1:1 replacement for Batches?

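Not exactly. Tasks are more flexible and expose datapoint-level operations, assignment tools, and editable forms, but some Batch conveniences, such as randomized datapoint selection, must now be implemented client-side.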

When will Batches be removed?

Planned for H1 2026, subject to change.