Version: 0.96

snorkelflow.sdk.ModelNode

class snorkelflow.sdk.ModelNode(uid, application_uid, config)

Bases: Node

The ModelNode class represents a model node in an application.

__init__(uid, application_uid, config)

Methods

`__init__`(uid, application_uid, config)
`get`(node_uid)	Fetches a node by its UID.
`get_comments`([username])	Retrieve the comments left on the current model node.
`get_dataframe`([split, columns])	Retrieve the data being passed directly through this node.
`get_ground_truth`([split, user_format])	Retrieve ground truth data for the current model node.
`get_lfs`()	Retrieve a list of currently active labeling functions for the current model node.
`get_model_preds`([model_uid, split, ...])	Retrieve model predictions and probabilities for the current model node and a given model UID.
`get_tags`([is_context])	Retrieve tags put on the current model node.
`get_training_set`(training_set_uid[, split, ...])	Retrieve a training set for this model node, specified by the training set UID.

Attributes

`application_uid`	The unique identifier for the application this node belongs to
`config`	Returns the detailed configuration information for this node
`uid`	The unique identifier for this node

get_comments

get_comments(username=None)

Retrieve the comments left on the current model node. This method returns a Pandas DataFrame whose columns contain the metadata and content of each comment.

Examples

>>> my_node.get_comments()
    comment_uid user_uid    x_uid   body    created_at      is_edited
    7           3           doc::1  hello   2023-09-26T17   False

Parameters

Name	Type	Default	Info
username	`Optional[str]`	`None`	Optionally, return only a specific user’s comments. By default, returns all comments.

Returns: A Pandas DataFrame containing the comments left on the model node.

Return type: pd.DataFrame

get_dataframe

get_dataframe(split=None, columns=None)

Retrieve the data being passed directly through this node. Can be filtered by a split or by a subset of columns (useful for large datasets). The data can also optionally include tag and comment metadata.

This dataframe is not the same as the dataframe returned by Dataset.get_dataframe(). While Dataset.get_dataframe() returns the source data, the dataframe returned by Node.get_dataframe() has also undergone all the preprocessing/DAG transformations up to this point in the processing pipeline.

Parameters

Name	Type	Default	Info
split	`Optional[str]`	`None`	Optionally restrict the data retrieved to a particular split, by default None (i.e., all splits).
columns	`Optional[List[str]]`	`None`	Optionally restrict the columns returned by this function, by default None. Useful for large datasets to significantly speed up retrieval time.

Returns: A dataframe of the data being passed directly through this node, optionally filtered by split and/or columns and indexed by x_uid. This DataFrame is the result of all preprocessing in the DAG pipeline up to this point.

Return type: pd.DataFrame
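As a rough illustration of the split and columns filters, the sketch below calls get_dataframe once per split and requests only a subset of columns. `FakeNode`, `frames_by_split`, and the `split`, `text`, and `length` fields are hypothetical stand-ins so the code runs without a Snorkel Flow instance; the stand-in returns plain row dicts, whereas a real node returns a pd.DataFrame.

```python
from typing import Dict, List, Optional, Sequence


class FakeNode:
    """Stand-in so the sketch runs without Snorkel Flow.

    Assumption: mirrors only the get_dataframe(split, columns) signature and
    returns plain row dicts instead of a pd.DataFrame.
    """

    def __init__(self, rows: List[dict]):
        self._rows = rows

    def get_dataframe(self, split: Optional[str] = None,
                      columns: Optional[List[str]] = None) -> List[dict]:
        # Keep every row when split is None, otherwise only the matching split.
        rows = [r for r in self._rows if split is None or r["split"] == split]
        if columns is not None:
            rows = [{c: r[c] for c in columns} for r in rows]
        return rows


def frames_by_split(node, splits: Sequence[str],
                    columns: Optional[List[str]] = None) -> Dict[str, object]:
    """Call node.get_dataframe once per split, requesting only `columns`."""
    return {s: node.get_dataframe(split=s, columns=columns) for s in splits}


node = FakeNode([
    {"split": "train", "text": "a", "length": 1},
    {"split": "valid", "text": "bb", "length": 2},
])
frames = frames_by_split(node, ["train", "valid"], columns=["text"])
print(frames["valid"])  # [{'text': 'bb'}]
```

With a real node, the same helper would be called as `frames_by_split(my_node, ["train", "valid"], columns=[...])`, and each value would be a DataFrame indexed by x_uid.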

get_ground_truth

get_ground_truth(split=None, user_format=False)

Retrieve ground truth data for the current model node. Optionally filter by a particular split.

Parameters

Name	Type	Default	Info
split	`Optional[str]`	`None`	Which data split to select, by default None (all splits). Can be one of “dev”, “train”, “valid”, or “test”.
user_format	`bool`	`False`	Whether to return the ground truth in a human-readable format, by default False.

Returns: A Pandas DataFrame mapping the data index to the ground truth label. If user_format is True, the label column will contain human-readable label names.

Return type: pd.DataFrame
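Because split accepts only a fixed set of names, a small guard can catch typos before a call is made. This is a minimal sketch: `check_split` is a hypothetical helper, not part of the SDK, and this page does not say whether the SDK performs its own validation.

```python
from typing import Optional

# Split names listed in the parameter table above.
VALID_SPLITS = ("dev", "train", "valid", "test")


def check_split(split: Optional[str]) -> Optional[str]:
    """Return `split` unchanged if it is None or a known split name.

    Hypothetical helper, not part of the SDK.
    """
    if split is not None and split not in VALID_SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {VALID_SPLITS}")
    return split


# Intended use (needs a real node, so it is left as a comment):
# gt = my_node.get_ground_truth(split=check_split("valid"), user_format=True)
```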

get_lfs

get_lfs()

Retrieve a list of currently active labeling functions for the current model node.

Examples

>>> my_node.get_lfs()
[
    LF(name='LF 1', label=3, templates=[...]),
    LF(name='LF 2', label=2, templates=[...]),
    LF(name='LF 3', label=1, templates=[...]),
]

Returns: A list of all currently active labeling functions for the current model node.

Return type: List[LF]
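The returned list can be summarized per label. The sketch below groups labeling function names by the label they vote for. `FakeLF` and `lf_names_by_label` are hypothetical; it assumes that LF objects expose `name` and `label` as attributes, which the example output above suggests but does not confirm.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class FakeLF:
    """Stand-in for the SDK's LF objects (assumption: `name` and `label` attributes)."""
    name: str
    label: int


def lf_names_by_label(lfs: Iterable) -> Dict[int, List[str]]:
    """Group labeling function names by the label each one assigns."""
    grouped = defaultdict(list)
    for lf in lfs:
        grouped[lf.label].append(lf.name)
    return dict(grouped)


# Illustrative data only; with a real node this would be my_node.get_lfs().
lfs = [FakeLF("LF 1", 3), FakeLF("LF 2", 2), FakeLF("LF 3", 3)]
print(lf_names_by_label(lfs))  # {3: ['LF 1', 'LF 3'], 2: ['LF 2']}
```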

get_model_preds

get_model_preds(model_uid=None, split=None, is_context=False, user_format=True)

Retrieve model predictions and probabilities for the current model node and a given model UID. If no model UID is provided, the most recent model’s predictions are returned.

Examples

>>> my_node.get_model_preds()
            preds   probs
x_uid
doc::994    0       [0.543..., 0.080..., 0.37...]
doc::999    2       [0.327..., 0.201..., 0.4...]

Parameters

Name	Type	Default	Info
model_uid	`Optional[int]`	`None`	The UID of a trained model, by default the latest model. All trained models can be seen from the “Models” accordion in Developer Studio.
split	`Optional[str]`	`None`	Optionally filter model predictions by split, by default returns predictions for all splits. Splits can be one of “train”, “dev”, “valid”, or “test”.
is_context	`bool`	`False`	When True, retrieves predictions at the document level instead of the span level, by default False. Only applicable for information extraction tasks.
user_format	`bool`	`True`	Whether to return the predictions in a human-readable or compressed integer format, by default True (returning a human-readable format).

Returns: A Pandas DataFrame of model predictions and probabilities, indexed by x_uid. If user_format is True, the preds column will contain human-readable label names.

Return type: pd.DataFrame
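One common use of the probs column is to flag low-confidence predictions for review. The sketch below is a minimal, SDK-independent version of that idea: `low_confidence` is a hypothetical helper, the probabilities are made-up illustrative values, and it assumes each probs entry is a list of per-class floats as in the example output above.

```python
from typing import Iterable, List, Sequence, Tuple


def low_confidence(rows: Iterable[Tuple[str, Sequence[float]]],
                   threshold: float) -> List[str]:
    """Return the x_uids whose highest class probability is below `threshold`."""
    return [x_uid for x_uid, probs in rows if max(probs) < threshold]


# Illustrative values only, not real model output.
example = {
    "doc::1": [0.90, 0.05, 0.05],
    "doc::2": [0.40, 0.35, 0.25],
}
print(low_confidence(example.items(), threshold=0.5))  # ['doc::2']

# With a real node (assumption: `probs` holds one list of floats per row):
# preds = my_node.get_model_preds(split="valid")
# uncertain = low_confidence(preds["probs"].items(), threshold=0.5)
```

The helper takes `(x_uid, probs)` pairs, so it accepts both `dict.items()` and `pd.Series.items()` without importing pandas itself.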

get_tags

get_tags(is_context=False)

Retrieve tags put on the current model node. For information extraction tasks, this method allows for fine-grained control over whether you want to retrieve tags at the document level or at the span level.

Examples

>>> my_node.get_tags()
x_uid
doc::10005        [loan-err, new_tag1]
doc::10006                  [new_tag1]
doc::10198             [Key-EMP-error]
Name: tags, dtype: object

Parameters

Name	Type	Default	Info
is_context	`bool`	`False`	When True, retrieves tags at the document level instead of the span level, by default False. Only applicable for information extraction tasks.

Returns: A Pandas Series containing the tags put on the model node, indexed by x_uid.

Return type: pd.Series
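Since the result maps each x_uid to a list of tags, it is often handy to invert it and look up data points by tag. The sketch below does this for any object with an `.items()` method, which covers both a plain dict and a pd.Series. `x_uids_by_tag` is a hypothetical helper, and the dict is a hand-copied version of the example output above.

```python
from collections import defaultdict
from typing import Dict, List


def x_uids_by_tag(tags) -> Dict[str, List[str]]:
    """Invert an x_uid -> [tag, ...] mapping into tag -> [x_uid, ...]."""
    inverted = defaultdict(list)
    for x_uid, tag_list in tags.items():
        for tag in tag_list:
            inverted[tag].append(x_uid)
    return dict(inverted)


# Plain-dict copy of the example output above.
tags = {
    "doc::10005": ["loan-err", "new_tag1"],
    "doc::10006": ["new_tag1"],
    "doc::10198": ["Key-EMP-error"],
}
print(x_uids_by_tag(tags)["new_tag1"])  # ['doc::10005', 'doc::10006']

# With a real node: x_uids_by_tag(my_node.get_tags())
```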

get_training_set

get_training_set(training_set_uid, split=None, user_format=True)

Retrieve a training set for this model node, specified by the training set UID. Also allows for filtering the training set by a particular data split.

Examples

>>> my_node.get_training_set(1)
            training_set_labels     training_set_probs
doc::100    stock                   [0.030..., 0.024...]

Parameters

Name	Type	Default	Info
training_set_uid	`int`		A training set UID, which can be found under the “Models” accordion in Developer Studio.
split	`Optional[str]`	`None`	An optional data split to return predictions for, by default None (all splits). Can be one of “train”, “dev”, “valid”, or “test”.
user_format	`bool`	`True`	Whether to return the predictions in a human-readable or compressed integer format, by default True (returning a human-readable format).

Returns: A Pandas DataFrame of training set labels and probabilities, indexed by x_uid. If user_format is True, the training_set_labels column will contain human-readable label names.

Return type: pd.DataFrame
