Version: 25.5

snorkelai.sdk.client.gts.get_document_ground_truth

snorkelai.sdk.client.gts.get_document_ground_truth(node, context_uids=None)

Get document-level ground truths as Pandas DataFrame.

This function gets the ground truth for an entire input document. It can only be used for information extraction tasks. For text extraction tasks, this gets a list of extractions for each document. For entity classification tasks, this gets a dictionary mapping entities to classes for each document. The ground truth is always represented as integers rather than strings. A mapping from integer to string labels can be retrieved using sf.get_node_label_map. For information about how to interpret ground truth in information extraction tasks, see Format for ground truth interaction in the SDK.

Examples

>>> sf.get_document_ground_truth(123, context_uids=[456, 789])
<pd.DataFrame>
__DATAPOINT_UID | ground_truth
doc::0 | [[0, 10, 0]]
doc::1 | [[0, 5, 1], [5, 15, 0]]
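To make the integer labels in the output above easier to read, they can be mapped back to strings. The following is only a minimal sketch under stated assumptions: the DataFrame is rebuilt by hand from the example output rather than fetched from the SDK, each inner list is assumed to be `[start, end, label]`, and `label_map` and `decode_spans` are hypothetical names, with the map standing in for whatever `sf.get_node_label_map` returns for your node.

```python
import pandas as pd

# Hand-built stand-in for the DataFrame shown in the example above.
# In practice this would come from get_document_ground_truth.
df = pd.DataFrame(
    {"ground_truth": [[[0, 10, 0]], [[0, 5, 1], [5, 15, 0]]]},
    index=pd.Index(["doc::0", "doc::1"], name="__DATAPOINT_UID"),
)

# Hypothetical integer-to-string label map (assumption: the real map
# would be obtained from sf.get_node_label_map for the same node).
label_map = {0: "NEGATIVE", 1: "POSITIVE"}


def decode_spans(spans, label_map):
    """Replace the integer label in each span with its string name.

    Assumes each span has the layout [start, end, label].
    """
    return [(start, end, label_map[label]) for start, end, label in spans]


# Decode every document's ground truth, keeping the original index.
decoded = df["ground_truth"].map(lambda spans: decode_spans(spans, label_map))

for uid, spans in decoded.items():
    print(uid, spans)
```

If the span layout for your task differs (see the ground truth format documentation referenced above), only `decode_spans` needs to change.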

Parameters

Name | Type | Default | Info
node | int | | The UID of the node whose document ground truth to get.
context_uids | Optional[List[int]] | None | Optional list of context_uids of documents whose ground truth to get. If None, get all document ground truth.

Return type

DataFrame

Returns

  • Pandas DataFrame containing document ground truth, with the document context_uid as index