snorkelai.sdk.develop.AnnotationTask
- final class snorkelai.sdk.develop.AnnotationTask(name, annotation_task_uid, dataset_uid, created_by_user_uid, created_at, description=None, annotation_form=None, x_uids=None)
Bases: Base
Represents an annotation task within a Snorkel dataset for managing annotation workflows.
An annotation task defines a set of datapoints that need to be annotated, along with the annotation form, user assignments, and task configuration. This class provides methods for creating, retrieving, updating, and managing annotation tasks.
AnnotationTask objects should not be instantiated directly. Use the create() or get() class methods instead.
- __init__(name, annotation_task_uid, dataset_uid, created_by_user_uid, created_at, description=None, annotation_form=None, x_uids=None)
Create an AnnotationTask object in-memory with necessary properties. This constructor should not be called directly, and should instead be accessed through the create() and get() methods.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| name | str | | The name of the annotation task. |
| annotation_task_uid | int | | The unique identifier for the annotation task. |
| dataset_uid | int | | The UID of the dataset that the annotation task belongs to. |
| created_by_user_uid | int | | The UID of the user who created the annotation task. |
| created_at | datetime | | The timestamp when the annotation task was created. |
| description | Optional[str] | None | A description of the annotation task, by default None. |
| annotation_form | Optional[AnnotationForm] | None | The annotation form associated with the task, by default None. |
| x_uids | Optional[List[str]] | None | List of datapoint UIDs in this annotation task, by default None. |
Methods
| Method | Description |
|---|---|
| __init__(name, annotation_task_uid, ...[, ...]) | Create an AnnotationTask object in-memory with necessary properties. |
| add_annotations(annotations[, user_format]) | Add annotations to an annotation task. |
| add_assignees(users, x_uids) | Assign all of the listed users to the listed datapoints in the annotation task. |
| add_datapoints(x_uids) | Add datapoints to the annotation task. |
| add_label_schemas(label_schema_uids) | Add label schemas to the annotation task. |
| create(dataset_uid, name[, description]) | Create an annotation task. |
| delete(annotation_task_uid) | Delete an annotation task. |
| get(annotation_task_uid) | Get an annotation task by UID. |
| get_annotation_status([user_format]) | Fetch the task columns (assignees, status) for all the datapoints in an annotation task. |
| get_annotations([user_format, user_uids, ...]) | Get annotations from an annotation task, filtered by the uids specified. |
| get_dataframe([limit, offset]) | Fetch the dataset columns for all the datapoints in an annotation task. |
| list(dataset) | List all annotation tasks for a given dataset. |
| list_user_assignments(users[, user_format]) | Get user assignments in an annotation task. |
| remove_assignees([users, x_uids]) | Remove all of the listed users from the listed datapoints in the annotation task. |
| remove_datapoints(x_uids) | Remove datapoints from the annotation task. |
| update([name, description]) | Update an annotation task. |
Attributes
| Attribute | Description |
|---|---|
| annotation_form | Return the annotation form of the annotation task |
| created_at | Return the creation timestamp of the annotation task |
| created_by_user_uid | Return the UID of the user who created the annotation task |
| dataset_uid | Return the UID of the dataset that the annotation task belongs to |
| description | Return the description of the annotation task |
| label_schema_uids | Return the list of label schema UIDs associated with this annotation task. |
| name | Return the name of the annotation task |
| uid | Return the UID of the annotation task |
| x_uids | Return the list of datapoint UIDs in this annotation task |
- add_annotations(annotations, user_format=True)
Add annotations to an annotation task.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| annotations | Union[DataFrame, List[Dict]] | | Annotations to add to the task. Can be provided in one of two formats: DataFrame (a pandas DataFrame with annotation data) or List of Dicts (a list of dictionaries with annotation parameters). |
| user_format | bool | True | True if annotation labels in data are in user format, otherwise they must be raw label values. |
Return type
None
Examples
DataFrame Input:
>>> import pandas as pd
>>> df = pd.DataFrame([
... {
... 'x_uid': 'doc::1',
... 'dataset_uid': 1001,
... 'label_schema_uid': 101,
... 'label': 'positive',
... 'metadata': {'confidence': 0.95},
... 'freeform_text': None
... },
... {
... 'x_uid': 'doc::2',
... 'dataset_uid': 1001,
... 'label_schema_uid': 101,
... 'label': 'negative',
... 'metadata': {'confidence': 0.87},
... 'freeform_text': None
... }
... ])
>>> annotation_task.add_annotations(df)
List of Dictionaries Input:
>>> annotations_list = [
... {
... 'x_uid': 'doc::3',
... 'dataset_uid': 1002,
... 'label_schema_uid': 102,
... 'label': {'spans': [[0, 10, 'PERSON'], [15, 25, 'ORG']]},
... 'metadata': {'annotator': 'user_123'},
... 'freeform_text': None
... },
... {
... 'x_uid': 'doc::4',
... 'dataset_uid': 1002,
... 'label_schema_uid': 103,
... 'label': {}, # Empty for text annotations
... 'metadata': {},
... 'freeform_text': 'This document discusses climate change impacts.'
... }
... ]
>>> annotation_task.add_annotations(annotations_list)
Dictionaries of different label types:
>>> # Single-choice classification
>>> single_choice = [{'x_uid': 'doc::6', 'dataset_uid': 1003, 'label_schema_uid': 101, 'label': 'category_a'}]
>>>
>>> # Multi-choice classification
>>> multi_choice = [{'x_uid': 'doc::7', 'dataset_uid': 1003, 'label_schema_uid': 102, 'label': ['tag1', 'tag2', 'tag3']}]
>>>
>>> # Sequence tagging (NER)
>>> sequence_tags = [{'x_uid': 'doc::8', 'dataset_uid': 1003, 'label_schema_uid': 103, 'label': [[0, 5, 'B-PER'], [6, 15, 'B-LOC']]}]
>>>
>>> # Text annotation (freeform)
>>> text_annotation = [{'x_uid': 'doc::9', 'dataset_uid': 1003, 'label_schema_uid': 104, 'label': {}, 'freeform_text': 'User feedback here'}]
>>>
>>> annotation_task.add_annotations(single_choice)
>>> annotation_task.add_annotations(multi_choice)
>>> annotation_task.add_annotations(sequence_tags)
>>> annotation_task.add_annotations(text_annotation)
Notes
All annotations must belong to label schemas associated with this annotation task
The x_uid must correspond to datapoints in the task’s dataset
For text-based labels (is_text_label=True), use freeform_text instead of label
For structured labels, use the label field with appropriate format for the label type
Metadata is optional and can contain arbitrary key-value pairs
Timestamps (ts) are auto-generated if not provided
Raises
ValueError – If annotation format is invalid or contains missing required fields
UserInputError – If x_uid is empty or label_schema_uid is not associated with this task
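The notes above can be turned into a quick local check before calling add_annotations(). The following is a minimal sketch, not part of the SDK: check_annotations and REQUIRED_KEYS are hypothetical names, and the set of required keys is an assumption inferred from the examples above rather than a documented rule.

```python
from typing import Any, Dict, List

# Assumption: these keys are treated as required because every example above
# includes them; the SDK's own validation may be stricter or looser.
REQUIRED_KEYS = ("x_uid", "dataset_uid", "label_schema_uid")


def _is_blank(value: Any) -> bool:
    """True for None and for empty dict, list, or string values."""
    return value is None or value == {} or value == [] or value == ""


def check_annotations(annotations: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for i, ann in enumerate(annotations):
        missing = [key for key in REQUIRED_KEYS if key not in ann]
        if missing:
            problems.append(f"annotation {i}: missing {', '.join(missing)}")
        # An x_uid that is present but empty is reported separately.
        if "x_uid" in ann and not ann["x_uid"]:
            problems.append(f"annotation {i}: x_uid is empty")
        # Each record should carry a structured label or freeform text.
        if _is_blank(ann.get("label")) and _is_blank(ann.get("freeform_text")):
            problems.append(f"annotation {i}: neither label nor freeform_text is set")
    return problems
```

Running the check first gives a full list of problems in one pass, whereas the SDK call may stop at the first invalid record.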
- add_assignees(users, x_uids)
Assign all of the listed users to the listed datapoints in the annotation task.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| users | List[Union[int, str]] | | List of users to assign to the datapoints. Can be user UIDs (int) or usernames (str). |
| x_uids | List[str] | | List of datapoint UIDs to assign the users to. |
Raises
ValueError – If users or x_uids are empty, or if user input is invalid
Return type
None
Examples
Add assignees using user UIDs:
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.add_assignees(users=[101, 102, 103], x_uids=["doc::1", "doc::2", "doc::3"])
Add assignees using usernames:
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.add_assignees(users=["alice", "bob", "charlie"], x_uids=["doc::1", "doc::2", "doc::3"])
Add assignees using mixed usernames and UIDs:
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.add_assignees(users=["alice", 102, "charlie"], x_uids=["doc::1", "doc::2", "doc::3"])
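Because add_assignees() assigns every listed user to every listed datapoint, dividing work between annotators takes one call per user. The sketch below shows one way to prepare those calls; split_round_robin is a hypothetical helper, not an SDK function, and the SDK call itself is left as a comment.

```python
from typing import Dict, List, Union

User = Union[int, str]


def split_round_robin(users: List[User], x_uids: List[str]) -> Dict[User, List[str]]:
    """Deal datapoints out to users one at a time, in order."""
    if not users:
        raise ValueError("users must not be empty")
    batches: Dict[User, List[str]] = {user: [] for user in users}
    for i, x_uid in enumerate(x_uids):
        batches[users[i % len(users)]].append(x_uid)
    return batches


batches = split_round_robin(["alice", 102], ["doc::1", "doc::2", "doc::3"])

# One add_assignees call per user would then look like this (not executed here).
# Empty batches are skipped because add_assignees raises ValueError on empty x_uids.
# for user, batch in batches.items():
#     if batch:
#         task.add_assignees(users=[user], x_uids=batch)
```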
- add_datapoints(x_uids)
Add datapoints to the annotation task.
- add_label_schemas(label_schema_uids)
Add label schemas to the annotation task.
Label schemas will be displayed in the order in which they are added.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| label_schema_uids | List[int] | | List of label schema UIDs to add to the annotation task. |
Raises
ValueError – If label_schema_uids is empty, if any label schema does not exist in the dataset, or if updating the annotation task fails
Return type
None
Example
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.add_label_schemas(label_schema_uids=[1, 2, 3])
- classmethod create(dataset_uid, name, description=None)
Create an annotation task.
Note: This method only accepts dataset_uid, name, and description as parameters. Other properties (such as annotation form, datapoint UIDs, and questions) can be set later through other methods.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| dataset_uid | int | | The UID of the dataset for the annotation task. |
| name | str | | The name of the annotation task. |
| description | Optional[str] | None | A description of the annotation task, by default None. |
Returns
The created annotation task object
Return type
AnnotationTask
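Since create() only sets the name and description, a new task is usually filled in by follow-up calls. The sketch below chains the methods documented on this page; set_up_task and its task_cls parameter are hypothetical, and the call order shown is an assumption rather than a documented requirement.

```python
from typing import List, Optional, Union


def set_up_task(task_cls, dataset_uid: int, name: str,
                label_schema_uids: List[int], x_uids: List[str],
                users: List[Union[int, str]],
                description: Optional[str] = None):
    """Create a task, then attach label schemas, datapoints, and assignees."""
    # task_cls is expected to be AnnotationTask (or a stand-in with the same methods).
    task = task_cls.create(dataset_uid=dataset_uid, name=name, description=description)
    task.add_label_schemas(label_schema_uids=label_schema_uids)
    task.add_datapoints(x_uids=x_uids)
    # Every listed user is assigned to every listed datapoint.
    task.add_assignees(users=users, x_uids=x_uids)
    return task
```

With the SDK installed, the call would look like set_up_task(AnnotationTask, 1001, "Sentiment review", [101], ["doc::1", "doc::2"], ["alice"]), where all argument values are placeholders. Passing the class in as a parameter keeps the helper testable without a live Snorkel instance.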
- classmethod delete(annotation_task_uid)
Delete an annotation task.
- classmethod get(annotation_task_uid)
Get an annotation task by UID.
- get_annotation_status(user_format=True)
Fetch the task columns (assignees, status) for all the datapoints in an annotation task.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| user_format | bool | True | If True, assignee names are returned instead of UIDs. |
Returns
A DataFrame with columns: x_uid (this is the index), assignees (list of user UIDs or usernames), and status (str). The DataFrame will have one row per datapoint in the annotation task.
Example:
| Data Point ID | Assignee(s) | Status |
|---|---|---|
| doc::1 | [101, 102] | IN_ANNOTATION |
| doc::2 | [103] | READY_FOR_REVIEW |
| doc::3 | [101, 104] | COMPLETED |
| doc::4 | [] | NEEDS_ASSIGNEES |
Return type
pd.DataFrame
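A common follow-up is to summarize progress by status. The sketch below counts status values with the standard library only; count_statuses is a hypothetical helper, and the sample values are copied from the example above rather than fetched from a real task.

```python
from collections import Counter
from typing import Dict, Iterable


def count_statuses(statuses: Iterable[str]) -> Dict[str, int]:
    """Count how many datapoints are in each status."""
    return dict(Counter(statuses))


# Status values copied from the example above.
counts = count_statuses(
    ["IN_ANNOTATION", "READY_FOR_REVIEW", "COMPLETED", "NEEDS_ASSIGNEES"]
)

# With a real task, the status column could be passed in directly (not executed here):
# counts = count_statuses(task.get_annotation_status()["status"])
```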
- get_annotations(user_format=True, user_uids=None, label_schema_uids=None, source_uids=None)
Get annotations from an annotation task, filtered by the uids specified.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| user_format | bool | True | If True, convert raw label values to label names. |
| user_uids | Optional[List[int]] | None | List of user UIDs to filter annotations by, by default None. |
| label_schema_uids | Optional[List[int]] | None | List of label schema UIDs to filter annotations by, by default None. |
| source_uids | Optional[List[int]] | None | List of source UIDs to filter annotations by, by default None. |
Returns
DataFrame containing the filtered annotations with label values transformed to label names if user_format is True
Return type
pd.DataFrame
- get_dataframe(limit=None, offset=0)
Fetch the dataset columns for all the datapoints in an annotation task.
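The limit and offset parameters suggest the result can be read in pages. The sketch below assumes conventional paging semantics (limit is the maximum number of rows returned, offset is the number of rows skipped), which this page does not confirm. iter_pages is a hypothetical helper that works with any callable accepting those two keyword arguments.

```python
from typing import Callable, Iterator, Sized


def iter_pages(fetch: Callable[..., Sized], page_size: int) -> Iterator[Sized]:
    """Yield pages from fetch(limit=..., offset=...) until the data runs out."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch(limit=page_size, offset=offset)
        if len(page) == 0:
            break  # nothing left to read
        yield page
        if len(page) < page_size:
            break  # a short page means this was the last one
        offset += page_size
```

With a real task this could be used as `for page in iter_pages(task.get_dataframe, page_size=500): ...`, where 500 is an arbitrary page size.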
- classmethod list(dataset)
List all annotation tasks for a given dataset.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| dataset | Union[str, int] | | The dataset UID or dataset object to list annotation tasks for. |
Returns
A list of annotation task objects
Return type
List[AnnotationTask]
- list_user_assignments(users, user_format=True)
Get user assignments in an annotation task.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| users | List[Union[int, str]] | | List of users to fetch annotation assignments for. Can be user UIDs (int) or usernames (str). |
| user_format | bool | True | If True, return user names as keys; if False, return user UIDs as keys. |
Returns
A dictionary with user keys (names if user_format is True, UIDs otherwise) and values containing lists of datapoint_uids that the user is assigned to
Example:
assignments = {
    "Dr Bubbles": ["doc::1", "doc::2"],
    "Rebekah": ["doc::5"],
    "Hiromu": [],
}
Return type
Dict[str | int, List[str]]
Raises
ValueError – If user_uids is empty or if fetching assignments fails
Examples
Get assignments using user UIDs, returning usernames as keys:
>>> from snorkelai.sdk.develop import AnnotationTask
>>> task = AnnotationTask.get(annotation_task_uid=123)
>>> assignments = task.list_user_assignments(users=[101, 102, 103])
>>> # Returns dictionary with usernames as keys
>>> # {'alice': ['doc::1', 'doc::2'], 'bob': ['doc::3'], 'charlie': []}
Get assignments using usernames, returning usernames as keys:
>>> from snorkelai.sdk.develop import AnnotationTask
>>> task = AnnotationTask.get(annotation_task_uid=123)
>>> assignments = task.list_user_assignments(users=['alice', 'bob', 'charlie'])
>>> # Returns dictionary with usernames as keys
>>> # {'alice': ['doc::1', 'doc::2'], 'bob': ['doc::3'], 'charlie': []}
Get assignments using user UIDs, returning UIDs as keys:
>>> from snorkelai.sdk.develop import AnnotationTask
>>> task = AnnotationTask.get(annotation_task_uid=123)
>>> assignments = task.list_user_assignments(users=[101, 102], user_format=False)
>>> # Returns dictionary with user UIDs as keys
>>> # {101: ['doc::1', 'doc::2'], 102: ['doc::3']}
Get assignments using mixed input (usernames and UIDs):
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
assignments = task.list_user_assignments(users=['alice', 102, 'charlie'])
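The returned dictionary is keyed by user, but it is often useful to see who is assigned to each datapoint, or which users have nothing assigned. The sketch below works on a dictionary shaped like the return value described above; invert_assignments is a hypothetical helper and the sample data is made up for illustration.

```python
from typing import Dict, List, Union

User = Union[int, str]


def invert_assignments(assignments: Dict[User, List[str]]) -> Dict[str, List[User]]:
    """Turn {user: [x_uid, ...]} into {x_uid: [user, ...]}."""
    by_datapoint: Dict[str, List[User]] = {}
    for user, x_uids in assignments.items():
        for x_uid in x_uids:
            by_datapoint.setdefault(x_uid, []).append(user)
    return by_datapoint


# Made-up sample shaped like the list_user_assignments return value.
assignments = {"alice": ["doc::1", "doc::2"], "bob": ["doc::3", "doc::1"], "charlie": []}
by_datapoint = invert_assignments(assignments)

# Users with an empty list currently have no datapoints assigned.
idle_users = [user for user, x_uids in assignments.items() if not x_uids]
```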
- remove_assignees(users=None, x_uids=None)
Remove all of the listed users from the listed datapoints in the annotation task.
If both users and x_uids are None, it will remove all assignees from all datapoints in the task. This is a non-destructive operation – it removes the assignments but retains the annotations.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| users | Optional[List[Union[int, str]]] | None | A list of users to remove from listed datapoints, by default None. Can be user UIDs (int) or usernames (str). If None, all users assigned to listed datapoints will be removed from those datapoints. |
| x_uids | Optional[List[str]] | None | A list of the x_uids of datapoints to remove users from, by default None. If None, listed users will be removed from all the datapoints they are assigned to. |
Raises
ValueError – If fetching or deleting assignments fails
Return type
None
Examples
Remove specific users from specific datapoints using UIDs:
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(users=[101, 102], x_uids=["doc::1", "doc::2"])
Remove specific users from specific datapoints using usernames:
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(users=["alice", "bob"], x_uids=["doc::1", "doc::2"])
Remove specific users from specific datapoints using mixed identifiers:
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(users=["alice", 102], x_uids=["doc::1", "doc::2"])
Remove specific users from all datapoints they are assigned to:
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(users=["alice", "bob"])
Remove all users from specific datapoints:
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees(x_uids=["doc::1", "doc::2"])
Remove all assignees from all datapoints in the task:
from snorkelai.sdk.develop import AnnotationTask
task = AnnotationTask.get(annotation_task_uid=123)
task.remove_assignees()
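The four combinations of users and x_uids can be easier to follow as a small in-memory model. The sketch below mirrors the behavior described above on a plain dictionary. It is an illustration only: remove_from_model is a hypothetical function, it does not call the SDK, and it uses usernames only for simplicity.

```python
from typing import Dict, List, Optional, Set


def remove_from_model(assignments: Dict[str, Set[str]],
                      users: Optional[List[str]] = None,
                      x_uids: Optional[List[str]] = None) -> Dict[str, Set[str]]:
    """Return a copy of {x_uid: {user, ...}} with the selected assignments removed."""
    result = {}
    for x_uid, assigned in assignments.items():
        if x_uids is not None and x_uid not in x_uids:
            result[x_uid] = set(assigned)  # datapoint not selected: left untouched
        elif users is None:
            result[x_uid] = set()  # no user filter: clear every assignee
        else:
            result[x_uid] = set(assigned) - set(users)  # drop only the listed users
    return result


model = {"doc::1": {"alice", "bob"}, "doc::2": {"alice"}}
after_alice = remove_from_model(model, users=["alice"])   # alice removed everywhere
after_doc1 = remove_from_model(model, x_uids=["doc::1"])  # doc::1 cleared
after_all = remove_from_model(model)                      # everything cleared
```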
- remove_datapoints(x_uids)
Remove datapoints from the annotation task.
- update(name=None, description=None)
Update an annotation task.
- property annotation_form: AnnotationForm
Return the annotation form of the annotation task
- property created_at: datetime
Return the creation timestamp of the annotation task
- property created_by_user_uid: int
Return the UID of the user who created the annotation task
- property dataset_uid: int
Return the UID of the dataset that the annotation task belongs to
- property description: str | None
Return the description of the annotation task
- property label_schema_uids: List[int]
Return the list of label schema UIDs associated with this annotation task.
- property name: str
Return the name of the annotation task
- property uid: int
Return the UID of the annotation task
- property x_uids: List[str]
Return the list of datapoint UIDs in this annotation task