snorkelai.sdk.develop.Slice
- class snorkelai.sdk.develop.Slice(dataset, slice_uid, name, description=None, config=None)
Bases: object
Represents a slice within a Snorkel dataset for identifying and managing subsets of datapoints.
A slice is a logical subset of datapoints within a dataset that can be created either manually by adding specific datapoints, or programmatically using slicing functions defined through templates and configurations. Slices are essential for data analysis, model evaluation, and targeted data operations within Snorkel workflows.
Key capabilities:
Manual datapoint management through add/remove operations
Programmatic datapoint identification using configurable slicing functions
This class provides methods for creating, retrieving, and updating slices, and for managing slice membership. Slice objects should not be instantiated directly; use the create() or get() class methods instead.
For more information on slices and data management, see the Data Management Guide.
Parameters
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| dataset | IdType | | The UID or name for the dataset within Snorkel. |
| slice_uid | int | | The unique identifier for the slice within Snorkel. |
| name | str | | The display name of the slice. |
| description | str, optional | | A description of the slice’s purpose and contents. |
| config | SliceConfig, optional | | Configuration defining slicing functions for programmatic datapoint identification. |
- __init__(dataset, slice_uid, name, description=None, config=None)
Create a Slice object in-memory with necessary properties. This constructor should not be called directly; use the create() and get() methods instead.
Parameters
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| dataset | Union[str, int] | | The UID or name for the dataset within Snorkel. |
| slice_uid | int | | The UID for the slice within Snorkel. |
| name | str | | The name of the slice. |
| description | Optional[str] | None | The description of the slice. |
Methods
| Method | Description |
| --- | --- |
| __init__(dataset, slice_uid, name[, ...]) | Create a Slice object in-memory with necessary properties. |
| add_x_uids(x_uids) | Add datapoints to a slice. |
| create(dataset, name[, description, config]) | Create a slice for a dataset. |
| get(dataset, slice) | Retrieve a slice by UID. |
| get_x_uids() | Retrieve the UIDs of the datapoints in the slice. |
| list(dataset) | Retrieve all slices for a dataset. |
| remove_x_uids(x_uids) | Remove datapoints from a slice. |
| update([name, description, config]) | Update the slice properties. |
Attributes
| Attribute | Description |
| --- | --- |
| dataset_uid | Return the UID of the dataset that the slice belongs to |
| description | Return the description of the slice |
| name | Return the name of the slice |
| slice_uid | Return the UID of the slice |
- add_x_uids(x_uids)
Add datapoints to a slice.
- classmethod create(dataset, name, description='', config=None)
Create a slice for a dataset. Slices are used to identify a subset of datapoints in a dataset. You can add datapoints to a slice manually, or, if you define a config, programmatically. Slice membership can contain both manually and programmatically identified datapoints.
Parameters
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| dataset | Union[str, int] | | The UID or name for the dataset within Snorkel Flow. |
| name | str | | The name of the slice. |
| description | str | '' | A description of the slice, by default the empty string. |
| config | Optional[SliceConfig] | None | A SliceConfig object, by default None. You can reference the schema in the template module when constructing this config. The config defines the slicing function (templates and graph) for the slice, allowing datapoints to be added to the slice membership programmatically. |
Returns
The slice object
Return type
Slice
Raises
ValueError – If the dataset doesn’t exist or cannot be found by name/UID
ValueError – If the slice name is a reserved name or already exists for the dataset
ValueError – If there are other validation or server errors during slice creation
Examples
>>> from templates.keyword_template import KeywordTemplateSchema
>>> from snorkelai.sdk.develop.slices import Slice, SliceConfig
>>> from snorkelai.sdk.utils.graph import DEFAULT_GRAPH
>>> Slice.create(
...     dataset=dataset_uid,
...     name="slice_name",
...     description="description",
...     config=SliceConfig(
...         templates=[
...             {
...                 "transform_type": "dataset_template_filter",
...                 "config": {
...                     "transform_config_type": "filter_schema",
...                     "filter_type": "text_template",
...                     "filter_config_type": "dataset_text_template",
...                     "dataset_uid": 1,
...                     "template_config": {
...                         "field": "text_col",
...                         "keywords": ["keyword1", "keyword2"],
...                         "operator": "CONTAINS",
...                         "case_sensitive": False,
...                         "tokenize": True,
...                     },
...                 },
...             },
...         ],
...         graph=DEFAULT_GRAPH,
...     ),
... )
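The template dictionary in the example above is verbose, so a small helper can keep repeated keyword filters consistent. The sketch below only assembles a plain dictionary with the same keys and values as that example. `keyword_filter_template` is a hypothetical helper, not part of the SDK, and it makes no claim about which other values the `operator` key accepts.

```python
from typing import Any, Dict, List


def keyword_filter_template(
    dataset_uid: int,
    field: str,
    keywords: List[str],
    case_sensitive: bool = False,
    tokenize: bool = True,
) -> Dict[str, Any]:
    """Build a keyword-filter template dict shaped like the example above.

    Hypothetical helper (not part of the SDK): it only assembles a plain dict.
    """
    if not keywords:
        raise ValueError("keywords must contain at least one entry")
    return {
        "transform_type": "dataset_template_filter",
        "config": {
            "transform_config_type": "filter_schema",
            "filter_type": "text_template",
            "filter_config_type": "dataset_text_template",
            "dataset_uid": dataset_uid,
            "template_config": {
                "field": field,
                "keywords": list(keywords),  # copy so the caller's list is not shared
                "operator": "CONTAINS",
                "case_sensitive": case_sensitive,
                "tokenize": tokenize,
            },
        },
    }


template = keyword_filter_template(1, "text_col", ["keyword1", "keyword2"])
```

The resulting dictionary could then be placed in the `templates` list passed to `SliceConfig`, in the same position as the literal dictionary in the example above.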
- classmethod get(dataset, slice)
Retrieve a slice by UID.
- get_x_uids()
Retrieve the UIDs of the datapoints in the slice.
- classmethod list(dataset)
Retrieve all slices for a dataset.
- remove_x_uids(x_uids)
Remove datapoints from a slice.
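The three membership methods (add_x_uids, get_x_uids, remove_x_uids) can be thought of as operations on a set of datapoint UIDs. The sketch below models that idea with a small in-memory class. `InMemoryMembership` is hypothetical and not part of the SDK. The UID strings, the silent handling of duplicates and of non-members on removal, and the sorted return order are all assumptions made for illustration, not documented behavior.

```python
from typing import Iterable, List, Set


class InMemoryMembership:
    """Hypothetical stand-in that mimics manual slice membership locally."""

    def __init__(self) -> None:
        self._x_uids: Set[str] = set()

    def add_x_uids(self, x_uids: Iterable[str]) -> None:
        # Assumption: adding a UID that is already a member is a no-op.
        self._x_uids.update(x_uids)

    def remove_x_uids(self, x_uids: Iterable[str]) -> None:
        # Assumption: removing a UID that is not a member is ignored.
        self._x_uids.difference_update(x_uids)

    def get_x_uids(self) -> List[str]:
        # Sorted only so the result is deterministic in this sketch.
        return sorted(self._x_uids)


membership = InMemoryMembership()
membership.add_x_uids(["doc::0", "doc::1", "doc::2"])
membership.add_x_uids(["doc::1"])               # duplicate, no effect
membership.remove_x_uids(["doc::0", "doc::9"])  # "doc::9" was never added
remaining = membership.get_x_uids()
```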
- update(name=None, description=None, config=None)
Update the slice properties.
Parameters
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| name | Optional[str] | None | The new name for the slice, by default None. |
| description | Optional[str] | None | The new description for the slice, by default None. |
| config | Optional[SliceConfig] | None | A SliceConfig object with the new configuration for the slice, by default None. |
Raises
ValueError – If there are other errors during slice update
Return type
None
- property dataset_uid: int
Return the UID of the dataset that the slice belongs to
- property description: str | None
Return the description of the slice
- property name: str
Return the name of the slice
- property slice_uid: int
Return the UID of the slice