snorkelflow.sdk.Slice
- class snorkelflow.sdk.Slice(dataset, slice_uid, name, description=None, config=None)
Bases: object
- __init__(dataset, slice_uid, name, description=None, config=None)
Create a Slice object in memory with the necessary properties. Do not call this constructor directly; obtain Slice objects through the create() and get() methods instead.
- Parameters:
dataset (Union[str, int]) – The UID or name of the dataset within Snorkel Flow
slice_uid (int) – The UID of the slice within Snorkel Flow
name (str) – The name of the slice
description (Optional[str], default: None) – The description of the slice
Methods
__init__(dataset, slice_uid, name[, ...]) – Create a Slice object in memory with the necessary properties.
add_x_uids(x_uids) – Add datapoints to a slice.
create(dataset, name[, description, config]) – Create a slice for a dataset.
get(dataset, slice) – Retrieve a slice by UID or name.
get_x_uids() – Retrieve the UIDs of the datapoints in the slice.
list(dataset) – Retrieve all slices for a dataset.
remove_x_uids(x_uids) – Remove datapoints from a slice.
update([name, description, config]) – Update the slice properties.
Attributes
dataset_uid – Return the UID of the dataset that the slice belongs to
description – Return the description of the slice
name – Return the name of the slice
slice_uid – Return the UID of the slice
- add_x_uids(x_uids)
Add datapoints to a slice.
- Parameters:
x_uids (List[str]) – List of UIDs of the datapoints you want to add to the slice
- Return type:
None
- classmethod create(dataset, name, description='', config=None)
Create a slice for a dataset. Slices are used to identify a subset of datapoints in a dataset. You can add datapoints to a slice manually or, if you define a config, programmatically. Slice membership can include both manually and programmatically identified datapoints.
- Parameters:
dataset (Union[str, int]) – The UID or name of the dataset within Snorkel Flow
name (str) – The name of the slice
description (str, default: '') – A description of the slice
config (Optional[SliceConfig], default: None) – A SliceConfig object; see the schema in the template module for how to construct it. The config defines the Slicing Function (templates and graph) for the slice, which lets it add datapoints to the slice membership programmatically.
- Returns:
The slice object
- Return type:
Slice
Examples
>>> from templates.keyword_template import KeywordTemplateSchema
>>> from snorkelflow.sdk.slices import Slice, SliceConfig
>>> from snorkelflow.utils.graph import DEFAULT_GRAPH
>>> # dataset_uid is assumed to hold the UID or name of an existing dataset
>>> Slice.create(
...     dataset=dataset_uid,
...     name="slice_name",
...     description="description",
...     config=SliceConfig(
...         templates=[
...             KeywordTemplateSchema(
...                 field="text_col",
...                 keywords=["keyword1", "keyword2"],
...                 operator="CONTAINS",
...                 case_sensitive=False,
...                 tokenize=True,
...             )
...         ],
...         graph=DEFAULT_GRAPH,
...     ),
... )
- classmethod get(dataset, slice)
Retrieve a slice by UID or name.
- Parameters:
dataset (Union[str, int]) – The UID or name of the dataset within Snorkel Flow
slice (Union[str, int]) – The UID or name of the slice
- Returns:
The slice object
- Return type:
Slice
- Raises:
ValueError – If no slice is found with the given UID or name
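Because get() raises ValueError when no slice matches, callers that treat a missing slice as an ordinary case may want a thin wrapper. The following is a minimal sketch under stated assumptions: find_slice is a hypothetical helper name (not part of the SDK), and slice_cls stands in for the Slice class so the function relies only on the documented get() behavior.

```python
from typing import Any, Optional, Union


def find_slice(
    slice_cls: Any, dataset: Union[str, int], slice_id: Union[str, int]
) -> Optional[Any]:
    """Return the matching slice, or None when get() reports it is missing.

    Hypothetical helper: slice_cls is expected to be snorkelflow.sdk.Slice
    or any class exposing the same get(dataset, slice) classmethod.
    """
    try:
        return slice_cls.get(dataset, slice_id)
    except ValueError:
        # Documented failure mode of get(): no slice was found.
        return None
```

With the SDK imported, a call would look like find_slice(Slice, dataset_uid, "slice_name"), where dataset_uid is whatever dataset identifier you already use.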
- get_x_uids()
Retrieve the UIDs of the datapoints in the slice.
- Returns:
List of UIDs of the datapoints in the slice
- Return type:
List[str]
- classmethod list(dataset)
Retrieve all slices for a dataset.
- Parameters:
dataset (Union[str, int]) – The UID or name of the dataset within Snorkel Flow
- Returns:
A list of all the slices available for that dataset
- Return type:
List[Slice]
- Raises:
ValueError – If no dataset is found with the given ID
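One common use of list() is to check whether a slice with a given name already exists before creating it. The sketch below combines list(), the name property, and create(); get_or_create_slice is a hypothetical helper name, slice_cls stands in for the Slice class, and the helper simply returns the first slice whose name matches, since this page does not say whether names are unique within a dataset.

```python
from typing import Any, Union


def get_or_create_slice(
    slice_cls: Any, dataset: Union[str, int], name: str, description: str = ""
) -> Any:
    """Reuse the first slice called `name` from list(), else create one.

    Hypothetical helper: slice_cls is expected to be snorkelflow.sdk.Slice
    or any class exposing the same list() and create() classmethods.
    """
    for existing in slice_cls.list(dataset):
        if existing.name == name:
            return existing
    # No match found, so fall back to creating a new, manually managed slice.
    return slice_cls.create(dataset, name, description=description)
```

A ValueError from list() for an unknown dataset is deliberately left to propagate, since creating a slice on a missing dataset would not succeed either.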
- remove_x_uids(x_uids)
Remove datapoints from a slice.
- Parameters:
x_uids (List[str]) – List of UIDs of the datapoints you want to remove from the slice
- Return type:
None
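The three membership methods, get_x_uids(), add_x_uids(), and remove_x_uids(), can be combined to bring a slice in line with a target set of datapoints. The following is a sketch rather than an SDK feature: sync_slice_membership is a hypothetical helper name, and slice_obj is any object exposing those three methods. How removal interacts with datapoints added programmatically through a config is not described on this page, so the sketch is best read as applying to manually managed slices.

```python
from typing import Any, Iterable, List, Tuple


def sync_slice_membership(
    slice_obj: Any, desired_x_uids: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Add and remove datapoints so the slice matches desired_x_uids.

    Hypothetical helper. Returns (added, removed) as sorted lists of UIDs.
    """
    desired = set(desired_x_uids)
    current = set(slice_obj.get_x_uids())

    to_add = sorted(desired - current)
    to_remove = sorted(current - desired)

    # Skip the calls entirely when there is nothing to change.
    if to_add:
        slice_obj.add_x_uids(to_add)
    if to_remove:
        slice_obj.remove_x_uids(to_remove)
    return to_add, to_remove
```

Computing the difference first keeps each call limited to the UIDs that actually change, and the returned lists can be logged for auditing.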
- update(name=None, description=None, config=None)
Update the slice properties.
- Parameters:
name (Optional[str], default: None) – The new name for the slice
description (Optional[str], default: None) – The new description for the slice
config (Optional[SliceConfig], default: None) – A SliceConfig object with the new configuration for the slice
- Return type:
None
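To avoid issuing an update when nothing has changed, a caller can compare the requested values against the name and description properties first. The sketch below is illustrative only: update_if_changed is a hypothetical helper name, and it assumes that an argument left at its default of None leaves that property untouched, which the signature suggests but this page does not state explicitly.

```python
from typing import Any, Dict, Optional


def update_if_changed(
    slice_obj: Any, name: Optional[str] = None, description: Optional[str] = None
) -> Dict[str, str]:
    """Call update() only with the fields that differ from current values.

    Hypothetical helper. Returns the keyword arguments that were sent,
    or an empty dict when no update call was made.
    """
    changes: Dict[str, str] = {}
    if name is not None and name != slice_obj.name:
        changes["name"] = name
    if description is not None and description != slice_obj.description:
        changes["description"] = description

    if changes:
        # Assumption: fields omitted here (left as None) stay unchanged.
        slice_obj.update(**changes)
    return changes
```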
- property dataset_uid: int
Return the UID of the dataset that the slice belongs to
- property description: str | None
Return the description of the slice
- property name: str
Return the name of the slice
- property slice_uid: int
Return the UID of the slice