Migration guide: Batches to Tasks
With release v25.11, Snorkel Flow introduces Annotation Tasks, a redesigned workflow that replaces the legacy Annotation Batches system. Tasks provide a more flexible, stable, and scalable way to manage annotation workloads across projects and teams.
This guide explains what changed, why it changed, and how to migrate your existing workflows in both the UI and the SDK.
What Changed?
Annotation Tasks offer a more flexible, datapoint-aware model than Batches. Key improvements:
- Datapoint-level visibility and control
  - View progress per datapoint
  - Assign or reassign specific datapoints to annotators
  - Add new datapoints to an existing task
- Editable annotation forms
  - Add or remove label schemas (and thus questions) even after task creation
- Clearer task lifecycle
  - Track annotation progress through status transitions
Side-by-Side Comparison
In the UI
| Batches | Tasks |
|---|---|
| Batches tab | Linked Annotation Tasks tab |
| Create batch | Create task |
| No datapoint-level visibility | Full datapoint-level visibility (progress, assignments) |
| Batch-level annotator assignment only | Datapoint-level assignment & rebalancing |
| Static label schemas | Editable annotation forms |
| Must create new batch to add datapoints | Add datapoints directly to a task |
In the SDK
| Batches | Tasks |
|---|---|
| Batch class | AnnotationTask class |
| Batch object exposes batch_size, x_uids | Task object exposes annotation form, dynamic x_uids, creator info, and schema IDs |
| Batch.create() | AnnotationTask.create() |
| Fixed datapoint set on creation | Add/remove datapoints dynamically (add_datapoints, remove_datapoints) |
| Label schemas fixed on creation | Add label schemas anytime (add_label_schemas) |
| Batch-level assignment control | Datapoint-level assignment (add_assignees, remove_assignees, list_user_assignments) |
| Annotations are accessed from Datasets via get_dataframe(include_annotations=True) / export(include_annotations=True) and committed as ground truth via commit(source_uid, label_schema_uids=None) | Annotations are accessed from Tasks via add_annotations(annotations, user_format=True), get_annotations(user_format=True, user_uids=None, label_schema_uids=None, source_uids=None), and get_annotation_status(user_format=True) for datapoint-level assignees & status |
Code Migration Examples
Example 1 — Creating a Batch → Creating a Task
Before (Batches)
from snorkelai.sdk.develop import Dataset
dataset = Dataset.get("contracts-dataset")
batches = dataset.create_batches(
    name="contracts-review-batch",
    assignees=[101, 102],  # user UIDs
    label_schemas=[schema_1, schema_2],
    batch_size=500,
    split="train",
)
batch = batches[0]
After (Tasks)
from snorkelai.sdk.develop import Dataset, AnnotationTask
dataset = Dataset.get("contracts-dataset")
task = AnnotationTask.create(
    dataset_uid=dataset.uid,
    name="contracts-review-task",
    description="Initial review for contracts",
)
# Attach the same label schemas (by UID)
task.add_label_schemas(label_schema_uids=[schema_1.uid, schema_2.uid])
# Choose datapoints (e.g., first 500 UIDs)
df = dataset.get_dataframe(max_rows=500, target_columns=["__DATAPOINT_UID"])
x_uids = df["__DATAPOINT_UID"].tolist()
task.add_datapoints(x_uids=x_uids)
# Assign annotators at the datapoint level
task.add_assignees(users=[101, 102], x_uids=x_uids)
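The call above passes both annotators together with the full list of datapoints. If you would rather give each annotator a disjoint share, one option is to split the UIDs client-side and call add_assignees once per user. The following is a minimal sketch, not an SDK feature: split_round_robin and assign_disjoint are hypothetical helper names, and the sketch assumes that add_assignees(users=..., x_uids=...) accepts a single-user list, as the signature in Example 1 suggests.

```python
from typing import Any, Dict, Hashable, List, Sequence


def split_round_robin(
    x_uids: Sequence[str], users: Sequence[Hashable]
) -> Dict[Hashable, List[str]]:
    """Deal datapoint UIDs out to users one at a time, like dealing cards."""
    if not users:
        raise ValueError("at least one user is required")
    shares: Dict[Hashable, List[str]] = {user: [] for user in users}
    for position, x_uid in enumerate(x_uids):
        shares[users[position % len(users)]].append(x_uid)
    return shares


def assign_disjoint(
    task: Any, x_uids: Sequence[str], users: Sequence[Hashable]
) -> Dict[Hashable, List[str]]:
    """Assign each user only their own share (hypothetical helper).

    `task` is expected to be an AnnotationTask; only add_assignees is used.
    """
    shares = split_round_robin(x_uids, users)
    for user, share in shares.items():
        if share:  # skip users who received no datapoints
            task.add_assignees(users=[user], x_uids=share)
    return shares
```

With the objects from Example 1, assign_disjoint(task, x_uids, [101, 102]) would give each annotator roughly 250 datapoints instead of all 500.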
Example 2 — Get list of Batches → list of Tasks
Before (Batches)
from snorkelai.sdk.develop import Dataset
dataset = Dataset.get("contracts-dataset")
batches = dataset.batches # property: List[Batch]
After (Tasks)
from snorkelai.sdk.develop import Dataset, AnnotationTask
dataset = Dataset.get("contracts-dataset")
tasks = AnnotationTask.list(dataset=dataset.uid) # List[AnnotationTask]
Example 3 — Export batch annotations → Export Task Annotations
Before (Batches)
from snorkelai.sdk.develop import Batch
batch = Batch.get(batch_uid=123)
df = batch.get_dataframe(
    selected_fields=["__DATAPOINT_UID", "text"],
    include_annotations=True,
    include_ground_truth=False,
)
After (Tasks)
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=456)
annotations = task.get_annotations(
    user_format=True,
    label_schema_uids=[101],
)
Example 4 — Commit ground truth from annotations
Before (Batches)
from snorkelai.sdk.develop import Batch
batch = Batch.get(batch_uid=123)
df = batch.get_dataframe(
    selected_fields=["__DATAPOINT_UID", "text"],
    include_annotations=True,
    include_ground_truth=False,
)
After (Tasks)
from snorkelai.sdk.develop import Dataset, AnnotationTask
import pandas as pd
dataset = Dataset.get("contracts-dataset")
task = AnnotationTask.get(annotation_task_uid=456)
# 1. Pull finalized annotations from a task
ann = task.get_annotations(
    user_format=True,
    label_schema_uids=[101],
)
# 2. Transform into Dataset.add_ground_truth format
gt_df = ann.reset_index()[["__DATAPOINT_UID", "label"]]
# 3. Write ground truth to the dataset
job_uid = dataset.add_ground_truth(
    label_schema_uid=101,
    data=gt_df,
    user_format=True,
    sync=False,
)
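When several annotators label the same datapoint, the annotations may contain more than one row per datapoint, and you will probably want a single label per datapoint before writing ground truth. The sketch below shows one possible client-side approach, a simple majority vote. It is an illustration under assumptions, not SDK behavior: majority_labels is a hypothetical helper, and it assumes the annotations can be turned into rows with "__DATAPOINT_UID" and "label" keys (for example via a DataFrame's to_dict("records")). The actual shape returned by get_annotations may differ.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, Hashable, Iterable, List, Mapping, Tuple


def majority_labels(
    rows: Iterable[Mapping[str, Any]],
    uid_key: str = "__DATAPOINT_UID",  # assumed column name, as in step 2 above
    label_key: str = "label",          # assumed column name, as in step 2 above
) -> Tuple[Dict[Hashable, Any], List[Hashable]]:
    """Pick the most frequent label per datapoint.

    Returns (resolved, tied): `resolved` maps each datapoint UID to its
    winning label; `tied` lists UIDs whose top two labels got equal votes,
    so they can be sent back for review instead of being guessed.
    """
    votes: Dict[Hashable, Counter] = defaultdict(Counter)
    for row in rows:
        votes[row[uid_key]][row[label_key]] += 1

    resolved: Dict[Hashable, Any] = {}
    tied: List[Hashable] = []
    for uid, counter in votes.items():
        ranked = counter.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            tied.append(uid)
        else:
            resolved[uid] = ranked[0][0]
    return resolved, tied
```

Datapoints that end up in the tied list are left out of the result on purpose; whether to re-annotate them or apply a tie-breaking rule is a workflow decision.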
Migration Strategies
Strategy 1: Keep Using Batches Temporarily
- Batches are currently in maintenance mode
- Continue using Batch SDKs
- Begin exploring Tasks for new workflows
Strategy 2: Gradual Migration (Recommended)
- Create Tasks for new annotation flows
- Allow existing batches to complete naturally
Strategy 3: Full Migration Immediately
- Replace all Batch workflows with their Task equivalents
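For a full migration, the steps from Example 1 can be wrapped in a small helper that recreates one batch as a task. This is a sketch only: migrate_batch_to_task is a hypothetical name, the task factory is passed in as a callable so the snippet does not depend on the SDK, and it relies on the x_uids attribute and the add_label_schemas, add_datapoints, and add_assignees methods listed in the SDK comparison table. It does not copy existing annotations.

```python
from typing import Any, Callable, Iterable, Sequence


def migrate_batch_to_task(
    batch: Any,
    create_task: Callable[..., Any],
    name: str,
    label_schema_uids: Sequence[int],
    assignees: Iterable[int] = (),
) -> Any:
    """Recreate a batch's datapoints, schemas, and assignees as a new task.

    Hypothetical helper: existing annotations are NOT copied, only the
    task setup is reproduced.
    """
    task = create_task(name=name, description=f"Migrated from batch: {name}")
    task.add_label_schemas(label_schema_uids=list(label_schema_uids))

    # Batch objects expose x_uids according to the SDK comparison table.
    x_uids = list(batch.x_uids)
    task.add_datapoints(x_uids=x_uids)

    users = list(assignees)
    if users:  # assignment is optional; skip it when no users are given
        task.add_assignees(users=users, x_uids=x_uids)
    return task
```

In practice, create_task could be functools.partial(AnnotationTask.create, dataset_uid=dataset.uid), so the helper only has to supply the name and description.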
Known Limitations
- Some advanced datapoint selection strategies that Batches handled (e.g., randomization) must be implemented client-side
- Auto-migration of existing batches is not provided
- Some older UI pages may still reference “batch” terminology until batches are fully deprecated
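Randomized datapoint selection, mentioned in the first limitation, can be reproduced client-side with the Python standard library before calling add_datapoints. The helper below is a minimal sketch; sample_x_uids is a hypothetical name and the seed parameter is only there to make a selection repeatable.

```python
import random
from typing import List, Optional, Sequence


def sample_x_uids(
    x_uids: Sequence[str], k: int, seed: Optional[int] = None
) -> List[str]:
    """Return up to k datapoint UIDs chosen uniformly without replacement.

    Passing the same seed and the same input order yields the same sample.
    """
    rng = random.Random(seed)  # private generator; global random state is untouched
    k = min(k, len(x_uids))    # never ask for more items than are available
    return rng.sample(list(x_uids), k)
```

For example, task.add_datapoints(x_uids=sample_x_uids(all_x_uids, 500, seed=0)) would add a reproducible random subset of 500 datapoints, where all_x_uids is the full UID list from the dataset.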
FAQs
Do I need to migrate immediately?
No. Batches continue to work for now with deprecation warnings.
What happens to existing batches?
They remain fully functional. You do not need to recreate them.
Can I use Tasks and Batches simultaneously?
Yes. Both appear in the UI and SDK from v25.11 onwards.
Will Batch SDKs break?
No, but they are deprecated and will no longer be actively supported.
Are Tasks a 1:1 replacement for Batches?
Not exactly. Tasks add datapoint-level operations, assignment tools, and editable forms, while some Batch behaviors, such as randomized datapoint selection, must be implemented client-side.
When will Batches be removed?
Planned for H1 2026, subject to change.
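To find remaining Batch usage ahead of the removal, you can record the deprecation warnings a piece of code emits. The sketch below uses only Python's standard warnings module and assumes that the Batch SDKs raise their warnings through it with the DeprecationWarning category (or a subclass), which this guide does not confirm. deprecation_messages is a hypothetical helper, and legacy_batch_call is a stand-in for your own code, not an SDK function.

```python
import warnings
from typing import Any, Callable, List


def deprecation_messages(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> List[str]:
    """Run fn and return the text of every DeprecationWarning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record warnings even if normally hidden
        fn(*args, **kwargs)
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


def legacy_batch_call() -> None:
    """Stand-in for code that still uses Batch APIs (hypothetical)."""
    warnings.warn(
        "Batch is deprecated; use AnnotationTask",
        DeprecationWarning,
        stacklevel=2,
    )
```

Calling deprecation_messages(legacy_batch_call) returns the list of warning messages, which can be logged or asserted empty in a test suite once the migration is complete.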