Version: 0.91


class snorkelflow.sdk.OperatorNode(uid, application_uid, config)

Bases: Node

OperatorNode class represents a non-model, operator node.

__init__(uid, application_uid, config)




Fetches a node by its UID.

get_dataframe([max_input_rows, ...])

Retrieve the data being passed directly through this node.



The unique identifier for the application this node belongs to


Returns the detailed configuration information for this node


The unique identifier for this node

get_dataframe(max_input_rows=10, datasource_uids=None, partition=None)

Retrieve the data being passed directly through this node. By default, this function processes at most 10 rows of data, to prevent the Notebook from running out of memory. To override this limit, set max_input_rows to a higher value.

This dataframe is not the same as the dataframe returned by Dataset.get_dataframe(). While Dataset.get_dataframe() returns the source data, the dataframe returned by Node.get_dataframe() has also undergone all the preprocessing/DAG transformations up to this point in the processing pipeline.

Parameters:

  • max_input_rows (int, default: 10) – The maximum number of rows to push through this node.

  • datasource_uids (Optional[List[int]], default: None) – A list of datasource UIDs to process, useful when you want to examine specific datasources. See the Dataset class for more information on fetching a datasource UID.

  • partition (Optional[int], default: None) – A specific file partition to process. Only applicable if the source dataset files are already stored in a partitioned format.
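The parameters above can be illustrated with a minimal sketch. It does not connect to Snorkel Flow: `StubNode` is a hypothetical stand-in that copies only the documented `get_dataframe` signature and returns a list of dicts instead of a DataFrame, and its filtering and row-limit behavior is an assumption made for illustration. With a real `OperatorNode`, the calls would presumably look the same.

```python
from typing import Dict, List, Optional


class StubNode:
    """Hypothetical stand-in for an OperatorNode (not part of the SDK).

    It mimics only the documented get_dataframe signature so that the
    calls below run without a Snorkel Flow instance.
    """

    def __init__(self) -> None:
        # 25 fake rows, alternating between datasource UIDs 1 and 2.
        self._rows: List[Dict[str, int]] = [
            {"row": i, "datasource_uid": 1 + i % 2} for i in range(25)
        ]

    def get_dataframe(
        self,
        max_input_rows: int = 10,
        datasource_uids: Optional[List[int]] = None,
        partition: Optional[int] = None,  # accepted but ignored by this stub
    ) -> List[Dict[str, int]]:
        rows = self._rows
        if datasource_uids is not None:
            # Assumed behavior: keep only rows from the requested datasources.
            rows = [r for r in rows if r["datasource_uid"] in datasource_uids]
        # Assumed behavior: cap the number of rows that are processed.
        return rows[:max_input_rows]


node = StubNode()

# Default call: limited to 10 rows.
default_preview = node.get_dataframe()

# Raise the limit explicitly to see more rows.
larger_preview = node.get_dataframe(max_input_rows=20)

# Restrict processing to a single datasource (UID 2 is made up here).
filtered_preview = node.get_dataframe(max_input_rows=20, datasource_uids=[2])

print(len(default_preview), len(larger_preview), len(filtered_preview))
```

Starting from the default limit and raising `max_input_rows` gradually is the cautious choice, since the default exists to protect notebook memory.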


Returns:

A DataFrame containing the results of pushing the source dataset through this node.

Return type:
