snorkelflow.client.annotations.aggregate_annotations
- snorkelflow.client.annotations.aggregate_annotations(node, batch_name, sources=None, strategy=None)
Combine annotations from multiple sources into a single set of annotations. The “strategy” argument specifies how to combine the annotations; the “sources” argument specifies which sources to aggregate. If “sources” is None, then all sources will be aggregated. This function will return a list of new annotations that were created by the aggregation. This aggregated set of annotations will then show up in the Batches page in the UI.
Aggregation strategies differ based on what kind of task you are aggregating annotations for.
For multi-class classification tasks, the “simple_majority” strategy will select the label that was most frequently assigned to each data point.
For multi-label classification tasks, the “simple_union” strategy will take the union over all votes for all classes for each data point. Conflicts will be broken by selecting the vote that was most frequent.
For sequence tagging tasks, the “simple_intersection” strategy will label the intersection of the spans marked by all annotators as the final span.
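The three strategies above can be sketched in plain Python to show how votes are combined for a single data point. This is an illustrative sketch, not the library's implementation. The function names, the "PRESENT"/"ABSENT" vote encoding, the half-open `(start, end)` span format, and the tie-breaking rule (first vote seen wins) are all assumptions made for the example.

```python
from collections import Counter
from functools import reduce


def _most_frequent(votes):
    """Return the most frequent vote; ties go to the vote seen first (assumption)."""
    counts = Counter(votes)
    # max() returns the first maximal key in insertion order.
    return max(counts, key=counts.get)


def simple_majority(labels):
    """Multi-class: pick the label assigned most often to one data point."""
    return _most_frequent(labels)


def simple_union(votes_per_annotator):
    """Multi-label: collect every class any annotator voted on.

    Each annotator's votes are a dict of class name -> "PRESENT" or "ABSENT"
    (hypothetical encoding); a missing class means the annotator abstained.
    Conflicting votes on a class are resolved by the most frequent vote.
    """
    collected = {}
    for votes in votes_per_annotator:
        for cls, vote in votes.items():
            collected.setdefault(cls, []).append(vote)
    return {cls: _most_frequent(v) for cls, v in collected.items()}


def _intersect_two(a, b):
    """Intersect two sorted lists of non-overlapping (start, end) spans."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # keep only non-empty overlaps
            out.append((start, end))
        # Advance whichever span ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def simple_intersection(spans_per_annotator):
    """Sequence tagging: keep only the text covered by every annotator's spans.

    Expects at least one annotator; spans are for a single label (assumption).
    """
    return reduce(_intersect_two, (sorted(s) for s in spans_per_annotator))
```

For example, `simple_majority(["A", "B", "A"])` returns `"A"`, and `simple_intersection([[(0, 10), (20, 30)], [(5, 25)]])` returns `[(5, 10), (20, 25)]`.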
Examples
>>> sf.aggregate_annotations(1, "test-batch-name", sources=["user1", "user2"], strategy="simple_majority")
__DATAPOINT_UID annotation_uid
doc::10005 4679185 {'annotation_uid': 4679185, 'x_uid': 'doc::100...
doc::10006 4679186 {'annotation_uid': 4679186, 'x_uid': 'doc::100...
doc::10007 4679187 {'annotation_uid': 4679187, 'x_uid': 'doc::100...
doc::10009 4679188 {'annotation_uid': 4679188, 'x_uid': 'doc::100...
Name: annotation, dtype: object
Parameters
Name        Type                 Default  Info
node        int                           UID of the node that we are committing the annotations to.
batch_name  str                           The name of the batch where the annotations are being aggregated.
sources     Optional[List[str]]  None     [Optional] The list of sources where the annotations are being aggregated.
strategy    Optional[str]        None     The strategy to use for aggregation.
Return type
List[Annotation]