Version: 0.94

snorkelflow.sdk.OperatorNode

class snorkelflow.sdk.OperatorNode(uid, application_uid, config)

Bases: Node

The OperatorNode class represents a non-model operator node.

__init__(uid, application_uid, config)

Methods

`__init__`(uid, application_uid, config)
`get`(node_uid)	Fetches a node by its UID.
`get_dataframe`([max_input_rows, ...])	Retrieve the data being passed directly through this node.

Attributes

`application_uid`	The unique identifier for the application this node belongs to
`config`	Returns the detailed configuration information for this node
`uid`	The unique identifier for this node

get_dataframe

get_dataframe(max_input_rows=10, datasource_uids=None, partition=None)

Retrieve the data being passed directly through this node. By default, this function will only process a maximum of 10 rows of data, to prevent the Notebook from running out of memory. To override this limit, set max_input_rows to a higher value.

This dataframe is not the same as the dataframe returned by Dataset.get_dataframe(). While Dataset.get_dataframe() returns the source data, the dataframe returned by Node.get_dataframe() has also undergone all the preprocessing/DAG transformations up to this point in the processing pipeline.

Parameters

Name	Type	Default	Info
max_input_rows	`int`	`10`	The number of rows that should be pushed through this node, by default 10.
datasource_uids	`Optional[List[int]]`	`None`	A list of datasource UIDs to process, useful if you have some specific datasources you want to examine, by default None. See the `Dataset` class for more information on fetching a datasource UID.
partition	`Optional[int]`	`None`	A specific file partition to process, by default None. Only applicable if the source dataset files are in a readily partitioned format.

Returns: A DataFrame displaying the results when the source dataset is pushed through this node.

Return type: pd.DataFrame
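The call pattern for get_dataframe can be sketched as follows. This is a minimal illustration, not SDK code: `preview_node_output` and `StubNode` are hypothetical names introduced here. The stub stands in for an OperatorNode so the snippet runs without a Snorkel Flow instance. It returns a plain list of dicts instead of a pd.DataFrame, and it mimics the row limit by truncating its output, whereas the real method limits the rows fed into the node.

```python
from typing import Any, Dict, Iterable, List, Optional


def preview_node_output(
    node: Any,
    max_input_rows: int = 10,
    datasource_uids: Optional[List[int]] = None,
    partition: Optional[int] = None,
) -> Any:
    """Hypothetical helper: forward the documented keyword arguments.

    `node` is assumed to expose get_dataframe() with the signature shown
    in this reference (for example, an OperatorNode).
    """
    return node.get_dataframe(
        max_input_rows=max_input_rows,
        datasource_uids=datasource_uids,
        partition=partition,
    )


class StubNode:
    """Hypothetical stand-in for OperatorNode, used only to make the sketch runnable."""

    def __init__(self, rows: Iterable[Dict[str, str]]) -> None:
        self._rows = list(rows)
        self.last_call: Dict[str, Any] = {}

    def get_dataframe(self, max_input_rows=10, datasource_uids=None, partition=None):
        # Record the arguments so the caller can inspect what was requested.
        self.last_call = {
            "max_input_rows": max_input_rows,
            "datasource_uids": datasource_uids,
            "partition": partition,
        }
        # Approximate the row cap by truncating the stored rows.
        return self._rows[:max_input_rows]


stub = StubNode({"text": f"doc {i}"} for i in range(25))

# Default call: capped at 10 rows.
default_preview = preview_node_output(stub)

# Raise the cap and restrict processing to one datasource UID (the value 1 is made up).
larger_preview = preview_node_output(stub, max_input_rows=20, datasource_uids=[1])

print(len(default_preview), len(larger_preview))
```

Against a real deployment, the same keyword arguments would presumably be passed to a node fetched with `OperatorNode.get(node_uid)`, for example `node.get_dataframe(max_input_rows=100)`. Raise the limit gradually, since the default of 10 exists to keep the notebook from running out of memory.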
