Version: 25.10

snorkelai.sdk.develop.AnnotationTask

final class snorkelai.sdk.develop.AnnotationTask(name, annotation_task_uid, dataset_uid, created_by_user_uid, created_at, description=None, annotation_form=None, x_uids=None)

Bases: Base

Represents an annotation task within a Snorkel dataset for managing annotation workflows.

An annotation task defines a set of datapoints that need to be annotated, along with the annotation form, user assignments, and task configuration. This class provides methods for creating, retrieving, updating, and managing annotation tasks.

AnnotationTask objects should not be instantiated directly; use the create() or get() class methods instead.

__init__

__init__(name, annotation_task_uid, dataset_uid, created_by_user_uid, created_at, description=None, annotation_form=None, x_uids=None)

Create an in-memory AnnotationTask object with the necessary properties. Do not call this constructor directly; obtain instances through the create() and get() class methods instead.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| name | str | | The name of the annotation task. |
| annotation_task_uid | int | | The unique identifier for the annotation task. |
| dataset_uid | int | | The UID of the dataset that the annotation task belongs to. |
| created_by_user_uid | int | | The UID of the user who created the annotation task. |
| created_at | datetime | | The timestamp when the annotation task was created. |
| description | Optional[str] | None | A description of the annotation task, by default None. |
| annotation_form | Optional[AnnotationForm] | None | The annotation form associated with the task, by default None. |
| x_uids | Optional[List[str]] | None | List of datapoint UIDs in this annotation task, by default None. |

Methods

| Method | Description |
| --- | --- |
| __init__(name, annotation_task_uid, ...[, ...]) | Create an AnnotationTask object in-memory with necessary properties. |
| add_annotations(annotations[, user_format]) | Add annotations to an annotation task. |
| add_assignees(users, x_uids) | Assign all of the listed users to the listed datapoints in the annotation task. |
| add_datapoints(x_uids) | Add datapoints to the annotation task. |
| add_label_schemas(label_schema_uids) | Add label schemas to the annotation task. |
| create(dataset_uid, name[, description]) | Create an annotation task. |
| delete(annotation_task_uid) | Delete an annotation task. |
| get(annotation_task_uid) | Get an annotation task by UID. |
| get_annotation_status([user_format]) | Fetch the task columns (assignees, status) for all the datapoints in an annotation task. |
| get_annotations([user_format, user_uids, ...]) | Get annotations from an annotation task, filtered by the uids specified. |
| get_dataframe([limit, offset]) | Fetch the dataset columns for all the datapoints in an annotation task. |
| list(dataset) | List all annotation tasks for a given dataset. |
| list_user_assignments(users[, user_format]) | Get user assignments in an annotation task. |
| remove_assignees([users, x_uids]) | Remove all of the listed users from the listed datapoints in the annotation task. |
| remove_datapoints(x_uids) | Remove datapoints from the annotation task. |
| update([name, description]) | Update an annotation task. |

Attributes

| Attribute | Description |
| --- | --- |
| annotation_form | Return the annotation form of the annotation task |
| created_at | Return the creation timestamp of the annotation task |
| created_by_user_uid | Return the UID of the user who created the annotation task |
| dataset_uid | Return the UID of the dataset that the annotation task belongs to |
| description | Return the description of the annotation task |
| label_schema_uids | Return the list of label schema UIDs associated with this annotation task. |
| name | Return the name of the annotation task |
| uid | Return the UID of the annotation task |
| x_uids | Return the list of datapoint UIDs in this annotation task |

add_annotations

add_annotations(annotations, user_format=True)

Add annotations to an annotation task.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| annotations | Union[DataFrame, List[Dict]] | | Annotations to add to the task. Can be provided in one of two formats: (1) DataFrame: A pandas DataFrame with annotation data; (2) List of Dicts: A list of dictionaries with annotation parameters. |
| user_format | bool | True | True if annotation labels in data are in user format, otherwise they must be raw label values. |

Return type

None

Examples

DataFrame Input:

>>> import pandas as pd
>>> df = pd.DataFrame([
... {
... 'x_uid': 'doc::1',
... 'dataset_uid': 1001,
... 'label_schema_uid': 101,
... 'label': 'positive',
... 'metadata': {'confidence': 0.95},
... 'freeform_text': None
... },
... {
... 'x_uid': 'doc::2',
... 'dataset_uid': 1001,
... 'label_schema_uid': 101,
... 'label': 'negative',
... 'metadata': {'confidence': 0.87},
... 'freeform_text': None
... }
... ])
>>> annotation_task.add_annotations(df)

List of Dictionaries Input:

>>> annotations_list = [
... {
... 'x_uid': 'doc::3',
... 'dataset_uid': 1002,
... 'label_schema_uid': 102,
... 'label': {'spans': [[0, 10, 'PERSON'], [15, 25, 'ORG']]},
... 'metadata': {'annotator': 'user_123'},
... 'freeform_text': None
... },
... {
... 'x_uid': 'doc::4',
... 'dataset_uid': 1002,
... 'label_schema_uid': 103,
... 'label': {}, # Empty for text annotations
... 'metadata': {},
... 'freeform_text': 'This document discusses climate change impacts.'
... }
... ]
>>> annotation_task.add_annotations(annotations_list)

Dictionaries of different label types:

>>> # Single-choice classification
>>> single_choice = [{'x_uid': 'doc::6', 'dataset_uid': 1003, 'label_schema_uid': 101, 'label': 'category_a'}]
>>>
>>> # Multi-choice classification
>>> multi_choice = [{'x_uid': 'doc::7', 'dataset_uid': 1003, 'label_schema_uid': 102, 'label': ['tag1', 'tag2', 'tag3']}]
>>>
>>> # Sequence tagging (NER)
>>> sequence_tags = [{'x_uid': 'doc::8', 'dataset_uid': 1003, 'label_schema_uid': 103, 'label': [[0, 5, 'B-PER'], [6, 15, 'B-LOC']]}]
>>>
>>> # Text annotation (freeform)
>>> text_annotation = [{'x_uid': 'doc::9', 'dataset_uid': 1003, 'label_schema_uid': 104, 'label': {}, 'freeform_text': 'User feedback here'}]
>>>
>>> annotation_task.add_annotations(single_choice)
>>> annotation_task.add_annotations(multi_choice)
>>> annotation_task.add_annotations(sequence_tags)
>>> annotation_task.add_annotations(text_annotation)

Notes

  • All annotations must belong to label schemas associated with this annotation task

  • The x_uid must correspond to datapoints in the task’s dataset

  • For text-based labels (is_text_label=True), use freeform_text instead of label

  • For structured labels, use the label field with appropriate format for the label type

  • Metadata is optional and can contain arbitrary key-value pairs

  • Timestamps (ts) are auto-generated if not provided

Raises

  • ValueError – If the annotation format is invalid or required fields are missing

  • UserInputError – If x_uid is empty or label_schema_uid is not associated with this task
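
The notes above suggest a few checks that can be run locally before calling add_annotations. The following is a minimal sketch, not part of the SDK: `find_record_problems` and its arguments are hypothetical names, and the only rules it applies are those listed in the notes (a non-empty x_uid, a label_schema_uid associated with the task, and freeform_text for text labels). The SDK's own validation may well be stricter.

```python
from typing import Any, Collection, Dict, Iterable, List


def find_record_problems(
    records: Iterable[Dict[str, Any]],
    task_schema_uids: Collection[int],
    text_schema_uids: Collection[int] = (),
) -> List[str]:
    """Return human-readable problems found in annotation records.

    Hypothetical helper: it mirrors the documented notes only.
    """
    problems = []
    for i, rec in enumerate(records):
        if not rec.get("x_uid"):
            problems.append(f"record {i}: x_uid is empty")
        schema_uid = rec.get("label_schema_uid")
        if schema_uid not in task_schema_uids:
            problems.append(f"record {i}: label schema {schema_uid} is not on this task")
        elif schema_uid in text_schema_uids and not rec.get("freeform_text"):
            problems.append(f"record {i}: text label needs freeform_text")
    return problems


records = [
    {"x_uid": "doc::1", "dataset_uid": 1001, "label_schema_uid": 101, "label": "positive"},
    {"x_uid": "", "dataset_uid": 1001, "label_schema_uid": 101, "label": "negative"},
    {"x_uid": "doc::3", "dataset_uid": 1001, "label_schema_uid": 999, "label": "positive"},
    {"x_uid": "doc::4", "dataset_uid": 1001, "label_schema_uid": 104, "label": {}},
]
problems = find_record_problems(records, task_schema_uids={101, 104}, text_schema_uids={104})
for line in problems:
    print(line)
```

In practice, `task_schema_uids` could be taken from the task's label_schema_uids property; which of those schemas are text labels would have to be supplied by the caller.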

add_assignees

add_assignees(users, x_uids)

Assign all of the listed users to the listed datapoints in the annotation task.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| users | List[Union[int, str]] | | List of users to assign to the datapoints. Can be user UIDs (int) or usernames (str). |
| x_uids | List[str] | | List of datapoint UIDs to assign the users to. |

Raises

ValueError – If users or x_uids are empty, or if user input is invalid

Return type

None

Examples

Add assignees using user UIDs:

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.add_assignees(users=[101, 102, 103], x_uids=["doc::1", "doc::2", "doc::3"])

Add assignees using usernames:

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.add_assignees(users=["alice", "bob", "charlie"], x_uids=["doc::1", "doc::2", "doc::3"])

Add assignees using mixed usernames and UIDs:

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.add_assignees(users=["alice", 102, "charlie"], x_uids=["doc::1", "doc::2", "doc::3"])
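
Because add_assignees assigns every listed user to every listed datapoint, splitting a workload between annotators takes one call per user. The following is a minimal sketch of one way to build those per-user batches. `round_robin_batches` is a hypothetical helper rather than an SDK function, and round-robin is only one possible split policy.

```python
from typing import Dict, List, Sequence, Union

User = Union[int, str]


def round_robin_batches(users: Sequence[User], x_uids: Sequence[str]) -> Dict[User, List[str]]:
    """Deal datapoint UIDs out to users one at a time, like cards."""
    if not users:
        raise ValueError("users must not be empty")
    batches: Dict[User, List[str]] = {user: [] for user in users}
    for i, x_uid in enumerate(x_uids):
        batches[users[i % len(users)]].append(x_uid)
    return batches


batches = round_robin_batches(["alice", 102], ["doc::1", "doc::2", "doc::3"])
print(batches)

# With a real task object, each non-empty batch would become one call:
# for user, batch in batches.items():
#     if batch:
#         task.add_assignees(users=[user], x_uids=batch)
```

The `if batch` guard in the commented loop reflects the documented ValueError for an empty x_uids list.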

add_datapoints

add_datapoints(x_uids)

Add datapoints to the annotation task.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| x_uids | List[str] | | List of datapoint UIDs to add to the annotation task. |

Return type

None

add_label_schemas

add_label_schemas(label_schema_uids)

Add label schemas to the annotation task.

Label schemas will be displayed in the order in which they are added.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| label_schema_uids | List[int] | | List of label schema UIDs to add to the annotation task. |

Raises

ValueError – If label_schema_uids is empty, if any of the label schemas does not exist in the dataset, or if updating the annotation task fails

Return type

None

Example

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.add_label_schemas(label_schema_uids=[1, 2, 3])

create

classmethod create(dataset_uid, name, description=None)

Create an annotation task.

Note: This method accepts only dataset_uid, name, and description as parameters. Other properties (such as the annotation form, datapoint UIDs, and questions) can be set later through other methods.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| dataset_uid | int | | The UID of the dataset for the annotation task. |
| name | str | | The name of the annotation task. |
| description | Optional[str] | None | A description of the annotation task, by default None. |

Returns

The created annotation task object

Return type

AnnotationTask

delete

classmethod delete(annotation_task_uid)

Delete an annotation task.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| annotation_task_uid | int | | The UID of the annotation task to delete. |

Return type

None

get

classmethod get(annotation_task_uid)

Get an annotation task by UID.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| annotation_task_uid | int | | The UID of the annotation task to retrieve. |

Returns

The annotation task object

Return type

AnnotationTask

get_annotation_status

get_annotation_status(user_format=True)

Fetch the task columns (assignees, status) for all the datapoints in an annotation task.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| user_format | bool | True | If True, assignee names are returned instead of uids. |

Returns

A DataFrame with columns: x_uid (this is the index), assignees (list of user UIDs or usernames), status (str). The DataFrame will have one row per datapoint in the annotation task.

Example:

| Data Point ID | Assignee(s) | Status |
| --- | --- | --- |
| doc::1 | [101, 102] | IN_ANNOTATION |
| doc::2 | [103] | READY_FOR_REVIEW |
| doc::3 | [101, 104] | COMPLETED |
| doc::4 | [] | NEEDS_ASSIGNEES |

Return type

pd.DataFrame

get_annotations

get_annotations(user_format=True, user_uids=None, label_schema_uids=None, source_uids=None)

Get annotations from an annotation task, filtered by the uids specified.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| user_format | bool | True | If True, convert raw label value to label names. |
| user_uids | Optional[List[int]] | None | List of user UIDs to filter annotations by, by default None. |
| label_schema_uids | Optional[List[int]] | None | List of label schema UIDs to filter annotations by, by default None. |
| source_uids | Optional[List[int]] | None | List of source UIDs to filter annotations by, by default None. |

Returns

DataFrame containing the filtered annotations with label values transformed to label names if user_format is True

Return type

pd.DataFrame

get_dataframe

get_dataframe(limit=None, offset=0)

Fetch the dataset columns for all the datapoints in an annotation task.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| limit | Optional[int] | None | The max number of rows to return. If None, all rows will be returned. |
| offset | int | 0 | Rows will be returned starting at this index. |

Returns

DataFrame containing the dataset data

Return type

pd.DataFrame
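
The limit and offset parameters allow a large task to be read in pages. The following is a minimal sketch of such a loop. It assumes, which this reference does not state, that a page shorter than limit means the data is exhausted. `iter_pages` is a hypothetical helper, and `fake_get_dataframe` is a list-backed stand-in so that the example runs without the SDK.

```python
from typing import Any, Callable, Iterator, List

ROWS = [f"doc::{i}" for i in range(1, 8)]  # stand-in data: 7 rows


def fake_get_dataframe(limit=None, offset=0) -> List[str]:
    """Stand-in that accepts the same keyword arguments as get_dataframe."""
    end = None if limit is None else offset + limit
    return ROWS[offset:end]


def iter_pages(fetch: Callable[..., Any], page_size: int) -> Iterator[Any]:
    """Yield successive pages from fetch(limit=..., offset=...)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch(limit=page_size, offset=offset)
        if len(page) == 0:
            break
        yield page
        if len(page) < page_size:  # short page: assumed to be the last one
            break
        offset += page_size


page_sizes = [len(page) for page in iter_pages(fake_get_dataframe, page_size=3)]
print(page_sizes)
```

With a real task, `iter_pages(task.get_dataframe, page_size=1000)` would yield one DataFrame per page, which could then be combined with pandas.concat if a single frame is needed.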

list

classmethod list(dataset)

List all annotation tasks for a given dataset.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| dataset | Union[str, int] | | The dataset UID or dataset object to list annotation tasks for. |

Returns

A list of annotation task objects

Return type

List[AnnotationTask]

list_user_assignments

list_user_assignments(users, user_format=True)

Get user assignments in an annotation task.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| users | List[Union[int, str]] | | List of users to fetch annotation assignments for. Can be user UIDs (int) or usernames (str). |
| user_format | bool | True | If true, return user names as keys; if false, return user UIDs as keys. |

Returns

A dictionary with user keys (names if user_format is True, UIDs otherwise) and values containing lists of datapoint_uids that the user is assigned to

Example:

assignments = {
"Dr Bubbles": ["doc::1", "doc::2"],
"Rebekah": ["doc::5"],
"Hiromu": [],
}

Return type

Dict[str | int, List[str]]

Raises

ValueError – If users is empty or if fetching assignments fails

Examples

Get assignments using user UIDs, returning usernames as keys:

>>> from snorkelai.sdk.develop import AnnotationTask
>>> task = AnnotationTask.get(annotation_task_uid=123)
>>> assignments = task.list_user_assignments(users=[101, 102, 103])
>>> # Returns dictionary with usernames as keys
>>> # {'alice': ['doc::1', 'doc::2'], 'bob': ['doc::3'], 'charlie': []}

Get assignments using usernames, returning usernames as keys:

>>> from snorkelai.sdk.develop import AnnotationTask
>>> task = AnnotationTask.get(annotation_task_uid=123)
>>> assignments = task.list_user_assignments(users=['alice', 'bob', 'charlie'])
>>> # Returns dictionary with usernames as keys
>>> # {'alice': ['doc::1', 'doc::2'], 'bob': ['doc::3'], 'charlie': []}

Get assignments using user UIDs, returning UIDs as keys:

>>> from snorkelai.sdk.develop import AnnotationTask
>>> task = AnnotationTask.get(annotation_task_uid=123)
>>> assignments = task.list_user_assignments(users=[101, 102], user_format=False)
>>> # Returns dictionary with user UIDs as keys
>>> # {101: ['doc::1', 'doc::2'], 102: ['doc::3']}

Get assignments using mixed input (usernames and UIDs):

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
assignments = task.list_user_assignments(users=['alice', 102, 'charlie'])
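
The returned dictionary is keyed by user, but it is often more convenient to look up who is assigned to a given datapoint. The following is a minimal sketch of inverting the mapping in plain Python. `assignees_by_datapoint` is a hypothetical helper, and the `assignments` literal only imitates the documented return shape.

```python
from collections import defaultdict
from typing import Dict, List, Union

User = Union[int, str]


def assignees_by_datapoint(assignments: Dict[User, List[str]]) -> Dict[str, List[User]]:
    """Turn {user: [x_uid, ...]} into {x_uid: [user, ...]}."""
    by_x_uid = defaultdict(list)
    for user, x_uids in assignments.items():
        for x_uid in x_uids:
            by_x_uid[x_uid].append(user)
    return dict(by_x_uid)


# Same shape as the documented return value of list_user_assignments.
assignments = {
    "alice": ["doc::1", "doc::2"],
    "bob": ["doc::2", "doc::3"],
    "charlie": [],
}
by_x_uid = assignees_by_datapoint(assignments)
print(by_x_uid)

# Users who were queried but currently have nothing assigned.
idle_users = [user for user, x_uids in assignments.items() if not x_uids]
print(idle_users)
```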

remove_assignees

remove_assignees(users=None, x_uids=None)

Remove all of the listed users from the listed datapoints in the annotation task.

If both users and x_uids are None, it will remove all assignees from all datapoints in the task. This is a non-destructive operation – it removes the assignments but retains the annotations.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| users | Optional[List[Union[int, str]]] | None | A list of users to remove from listed datapoints, by default None. Can be user UIDs (int) or usernames (str). If None, all users assigned to listed datapoints will be removed from those datapoints. |
| x_uids | Optional[List[str]] | None | A list of the x_uids of datapoints to remove users from, by default None. If None, listed users will be removed from all the datapoints they are assigned to. |

Raises

ValueError – If fetching or deleting assignments fails

Return type

None

Examples

Remove specific users from specific datapoints using UIDs:

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(users=[101, 102], x_uids=["doc::1", "doc::2"])

Remove specific users from specific datapoints using usernames:

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(users=["alice", "bob"], x_uids=["doc::1", "doc::2"])

Remove specific users from specific datapoints using mixed identifiers:

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(users=["alice", 102], x_uids=["doc::1", "doc::2"])

Remove specific users from all datapoints they are assigned to:

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(users=["alice", "bob"])

Remove all users from specific datapoints:

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(x_uids=["doc::1", "doc::2"])

Remove all assignees from all datapoints in the task:

from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees()
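
The four combinations of users and x_uids can be summarised with a small in-memory model. This is only an illustration of the documented semantics, not the SDK's implementation: `remove_from` is a hypothetical function that operates on a plain mapping from x_uid to a set of usernames.

```python
from typing import Dict, Iterable, Optional, Set


def remove_from(
    assigned: Dict[str, Set[str]],
    users: Optional[Iterable[str]] = None,
    x_uids: Optional[Iterable[str]] = None,
) -> Dict[str, Set[str]]:
    """Return a copy of `assigned` with the selected assignments removed.

    None for `users` means "every user"; None for `x_uids` means
    "every datapoint".
    """
    targets = set(assigned) if x_uids is None else set(x_uids)
    drop = None if users is None else set(users)
    result = {}
    for x_uid, current in assigned.items():
        if x_uid not in targets:
            result[x_uid] = set(current)  # untouched datapoint
        elif drop is None:
            result[x_uid] = set()  # remove everyone from this datapoint
        else:
            result[x_uid] = current - drop  # remove only the listed users
    return result


assigned = {"doc::1": {"alice", "bob"}, "doc::2": {"alice"}, "doc::3": {"bob"}}
print(remove_from(assigned, users=["alice"], x_uids=["doc::1"]))
print(remove_from(assigned, users=["alice"]))
print(remove_from(assigned, x_uids=["doc::1"]))
print(remove_from(assigned))
```

The model returns a new mapping instead of mutating its input, which keeps the four calls independent of one another.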

remove_datapoints

remove_datapoints(x_uids)

Remove datapoints from the annotation task.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| x_uids | List[str] | | List of datapoint UIDs to remove from the annotation task. |

Return type

None

update

update(name=None, description=None)

Update an annotation task.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| name | Optional[str] | None | The new name for the annotation task, by default None. |
| description | Optional[str] | None | The new description for the annotation task, by default None. |

Return type

None

property annotation_form: AnnotationForm

Return the annotation form of the annotation task

property created_at: datetime

Return the creation timestamp of the annotation task

property created_by_user_uid: int

Return the UID of the user who created the annotation task

property dataset_uid: int

Return the UID of the dataset that the annotation task belongs to

property description: str | None

Return the description of the annotation task

property label_schema_uids: List[int]

Return the list of label schema UIDs associated with this annotation task.

property name: str

Return the name of the annotation task

property uid: int

Return the UID of the annotation task

property x_uids: List[str]

Return the list of datapoint UIDs in this annotation task