Version: 25.4

snorkelflow.client.nodes.add_node

warning

This is a beta function in 25.4. Beta features may have known gaps or bugs, but are functional workflows and eligible for Snorkel Support. To access beta features, contact Snorkel Support to enable the feature flag for your Snorkel-hosted instance.

snorkelflow.client.nodes.add_node(application, input_node_uids=None, expected_op_type=None, node_config=None, output_node_uid=None, output_node_uids=None, node_cls='ApplicationNode', op_type=None, op_config=None, add_to_parent_block=False)

Adds a new node to an application’s data processing pipeline.

Creates a node in the application’s directed acyclic graph (DAG), optionally with a committed operator. Each node is a single step in the pipeline’s sequence of data transformations. A node carries its input and output connections together with the operation to perform on the data at that step.

When creating a node, you must specify input_node_uids and optionally output_node_uids to define the node’s connections in the pipeline. Use input_node_uids when building the pipeline forward from source to target, and output_node_uids when building backward from target to source.
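To make forward and backward wiring concrete, here is a minimal sketch of a toy in-memory DAG. It does not call the Snorkel Flow SDK: ToyPipeline and its methods are hypothetical names, and the rule that inserting a node between two connected nodes replaces their direct edge is an assumption of this toy model, not documented behavior. Only the use of -1 for the initial dataset node comes from this page.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


class ToyPipeline:
    """Hypothetical stand-in for an application DAG (not the Snorkel Flow API)."""

    def __init__(self) -> None:
        self.children: Dict[int, Set[int]] = defaultdict(set)
        self._next_uid = 100

    def add_node(
        self,
        input_node_uids: Optional[List[int]] = None,
        output_node_uids: Optional[List[int]] = None,
    ) -> int:
        inputs = input_node_uids or []
        outputs = output_node_uids or []
        if not inputs and not outputs:
            raise ValueError("Specify input_node_uids or output_node_uids.")
        uid = self._next_uid
        self._next_uid += 1
        for parent in inputs:
            for child in outputs:
                # Toy assumption: the new node replaces the direct parent -> child edge.
                self.children[parent].discard(child)
            self.children[parent].add(uid)
        for child in outputs:
            self.children[uid].add(child)
        return uid

    def edges(self) -> List[Tuple[int, int]]:
        return sorted((p, c) for p, cs in self.children.items() for c in cs)


pipeline = ToyPipeline()
# Forward: attach a node directly to the dataset node (-1).
first = pipeline.add_node(input_node_uids=[-1])
# Backward: name the existing node as an output to insert ahead of it.
second = pipeline.add_node(input_node_uids=[-1], output_node_uids=[first])
print(pipeline.edges())
```

In this toy model the second call ends up between the dataset node and the first node, which is the same shape that Example 2 below describes for the real pipeline.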

You must specify either op_type (to commit an operator immediately) or expected_op_type (to create a placeholder node that will have an operator committed later). Use op_type when you know the exact operator and configuration, and expected_op_type when you want to reserve a spot in the pipeline for a specific type of operation.
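The two modes differ only in which keyword arguments are passed. The sketch below assembles both argument sets as plain dictionaries. placeholder_kwargs and committed_kwargs are hypothetical helper names, not part of the SDK; the parameter names and the operator values are the ones used on this page.

```python
from typing import Any, Dict, List


def placeholder_kwargs(
    application: str, input_node_uids: List[int], expected_op_type: str
) -> Dict[str, Any]:
    """Arguments for a placeholder node: no operator is committed yet."""
    return {
        "application": application,
        "input_node_uids": input_node_uids,
        "expected_op_type": expected_op_type,
    }


def committed_kwargs(
    application: str,
    input_node_uids: List[int],
    op_type: str,
    op_config: Dict[str, Any],
) -> Dict[str, Any]:
    """Arguments for a node whose operator is committed immediately."""
    return {
        "application": application,
        "input_node_uids": input_node_uids,
        "op_type": op_type,
        "op_config": op_config,
    }


placeholder = placeholder_kwargs("your-app-name", [-1], "Featurizer")
committed = committed_kwargs(
    "your-app-name",
    [-1],
    "TruncatePreprocessor",
    {"field": "text", "target_field": "text_truncated", "by": "chars", "length": 512},
)
```

Either dictionary could then be unpacked into the call, for example add_node(**placeholder), assuming a connected client session.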

Parameters

Name | Type | Default | Info
application | Union[str, int] | (required) | Name or UID of the application where you want to add the node.
input_node_uids | Optional[List[int]] | None | List of input node UIDs that feed data into this node. Use [-1] to connect to the initial dataset node. Required.
expected_op_type | Optional[str] | None | The expected type of operator for this node (e.g., “Featurizer”, “Filter”, “Model”). Required if not providing op_type, otherwise can be omitted. See the operators reference for a comprehensive list of operator types.
node_config | Optional[Dict[str, Any]] | None | Dictionary with configuration for the node. For model nodes, this can include label_map containing class-to-index mappings.
output_node_uids | Optional[List[int]] | None | List of node UIDs that receive data from this node.
output_node_uid | Optional[int] | None | DEPRECATED. Use output_node_uids instead.
node_cls | str | 'ApplicationNode' | The node class type. Valid node class types include:

  • ApplicationNode: Default node class for general purposes.

  • ClassificationNode: Node specific to text classification tasks.

  • SequenceTaggingNode: Node specific to sequence tagging tasks.

  • WordClassificationNode: Node for word-level classification tasks.

  • EntityClassificationNode: Node for entity classification tasks.

op_type | Optional[str] | None | Type of operator to commit to the node (e.g., “TruncatePreprocessor”, “RegexFilter”). Required if providing op_config. See the operators reference for a comprehensive list of operator types.
op_config | Optional[Dict[str, Any]] | None | Dictionary with configuration specific to the operator type. For example, a TruncatePreprocessor requires field, target_field, length, and by parameters.
add_to_parent_block | Optional[bool] | False | When True, adds the node to the parent block of the output node. This affects node nesting in the application structure. When used in a single-block application with input_node_uids=[-1] and add_to_parent_block=True, the node is added ahead of the block, not within it.
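Because output_node_uid is deprecated in favor of output_node_uids, calling code that still passes the singular form may need migrating. The following is a minimal caller-side sketch. normalize_output_uids is a hypothetical helper, and merging the two arguments is an assumption made for illustration, not a description of how the SDK reconciles them.

```python
import warnings
from typing import List, Optional


def normalize_output_uids(
    output_node_uid: Optional[int] = None,
    output_node_uids: Optional[List[int]] = None,
) -> Optional[List[int]]:
    """Fold the deprecated singular argument into the list form."""
    if output_node_uid is None:
        return output_node_uids
    warnings.warn(
        "output_node_uid is deprecated; use output_node_uids.",
        DeprecationWarning,
        stacklevel=2,
    )
    if output_node_uids is None:
        return [output_node_uid]
    if output_node_uid in output_node_uids:
        return list(output_node_uids)
    # Keep the caller's list order and append the legacy value last.
    return [*output_node_uids, output_node_uid]


with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    migrated = normalize_output_uids(output_node_uid=100)
print(migrated)
```

The result can be passed as output_node_uids, so the deprecated keyword never reaches the call.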

Raises

  • ValueError – When op_config is provided without op_type.

  • ValueError – When neither input_node_uids nor output_node_uids is specified.
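Both documented errors depend only on the arguments, so they can be caught before a request is made. Below is a minimal sketch of such a pre-flight check. check_add_node_args is a hypothetical helper that mirrors the two conditions listed above; it is not the SDK's own validation and the error messages are invented for illustration.

```python
from typing import Any, Dict, List, Optional


def check_add_node_args(
    input_node_uids: Optional[List[int]] = None,
    output_node_uids: Optional[List[int]] = None,
    op_type: Optional[str] = None,
    op_config: Optional[Dict[str, Any]] = None,
) -> None:
    """Raise ValueError for the two argument combinations listed under Raises."""
    if op_config is not None and op_type is None:
        raise ValueError("op_config was provided without op_type.")
    if input_node_uids is None and output_node_uids is None:
        raise ValueError("Specify input_node_uids or output_node_uids.")


# A valid combination passes silently.
check_add_node_args(input_node_uids=[-1], op_type="RegexFilter", op_config={})

# A call with no connections is rejected.
try:
    check_add_node_args(op_type="RegexFilter")
    rejected = False
except ValueError:
    rejected = True
```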

Return type

Dict[str, Any]

Examples

Example 1

Adds a placeholder node with an expected operator type of "Featurizer".

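# Assumes the client module is imported under the alias sf
import snorkelflow.client as sf

# Add a placeholder Featurizer node
placeholder_featurizer = sf.add_node(
    application="your-app-name",
    input_node_uids=[-1],  # Connect to the initial dataset node
    expected_op_type="Featurizer",
    node_config={
        "description": "Text feature extraction node"
    }
)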

Example 1 return

Returns information about the newly created node.

{
    "node_uid": 100,
    "op_version_uid": 200
}

Example 2

Adds a node with a committed operator, inserting it into the pipeline ahead of the placeholder node.

preprocessing_node = sf.add_node(
    application="your-app-name",
    input_node_uids=[-1],  # Connect to dataset node
    op_type="TruncatePreprocessor",  # Specific operator type
    op_config={
        "field": "text",
        "target_field": "text_truncated",
        "by": "chars",
        "length": 512
    },
    output_node_uids=[100]  # Feed into the placeholder node from Example 1
)

Example 2 return

Returns information about the newly created preprocessing node.

{
    "node_uid": 101,
    "op_version_uid": 201
}

Example 3

Adds a model node with a label map in the node configuration and adds it to the parent block.

# Add a model node with a classification label map
model_node = sf.add_node(
    application="classification-example-app",
    input_node_uids=[101],
    expected_op_type="Model",
    node_cls="ClassificationNode",
    node_config={
        "label_map": {
            "negative": -1,
            "neutral": 0,
            "positive": 1
        }
    },
    add_to_parent_block=True
)

Example 3 return

Returns information about the newly created model node.

{
    "node_uid": 102,
    "op_version_uid": 202
}
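The examples above reuse the returned node_uid by hand (100, then 101). The sketch below shows one way to thread that value through a sequence of calls. It runs against fake_add_node, a stand-in that only mimics the documented return shape; chain_nodes is also a hypothetical helper. With a real client session, the SDK function would be passed in place of the stand-in.

```python
from itertools import count
from typing import Any, Callable, Dict, List

_uids = count(100)


def fake_add_node(
    application: str, input_node_uids: List[int], **kwargs: Any
) -> Dict[str, Any]:
    """Stand-in for add_node: returns a dict shaped like the example returns."""
    uid = next(_uids)
    return {"node_uid": uid, "op_version_uid": uid + 100}


def chain_nodes(
    add_node: Callable[..., Dict[str, Any]],
    application: str,
    steps: List[Dict[str, Any]],
) -> List[int]:
    """Add steps in order, feeding each new node from the one before it."""
    previous_uid = -1  # Start from the initial dataset node.
    created: List[int] = []
    for step in steps:
        result = add_node(
            application=application, input_node_uids=[previous_uid], **step
        )
        previous_uid = result["node_uid"]
        created.append(previous_uid)
    return created


steps = [
    {
        "op_type": "TruncatePreprocessor",
        "op_config": {
            "field": "text",
            "target_field": "text_truncated",
            "by": "chars",
            "length": 512,
        },
    },
    {"expected_op_type": "Featurizer"},
]
node_uids = chain_nodes(fake_add_node, "your-app-name", steps)
print(node_uids)
```

Passing the function as an argument keeps the wiring logic testable without a running instance.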