Version: 0.95

snorkelflow.sdk.Slice

class snorkelflow.sdk.Slice(dataset, slice_uid, name, description=None, config=None)

Bases: object

__init__(dataset, slice_uid, name, description=None, config=None)

Create a Slice object in memory with the necessary properties. Do not call this constructor directly; obtain Slice objects through the create() and get() methods instead.

Parameters:
  • dataset (Union[str, int]) – The UID or name for the dataset within Snorkel Flow

  • slice_uid (int) – The UID for the slice within Snorkel Flow

  • name (str) – The name of the slice

  • description (Optional[str], default: None) – The description of the slice

  • config (Optional[SliceConfig], default: None) – The SliceConfig object for the slice

Methods

__init__(dataset, slice_uid, name[, ...])

Create a Slice object in-memory with necessary properties.

add_x_uids(x_uids)

Add datapoints to a slice.

create(dataset, name[, description, config])

Create a slice for a dataset.

get(dataset, slice)

Retrieve a slice by UID or name.

get_x_uids()

Retrieve the UIDs of the datapoints in the slice.

list(dataset)

Retrieve all slices for a dataset.

remove_x_uids(x_uids)

Remove datapoints from a slice.

update([name, description, config])

Update the slice properties.

Attributes

dataset_uid

Return the UID of the dataset that the slice belongs to

description

Return the description of the slice

name

Return the name of the slice

slice_uid

Return the UID of the slice

add_x_uids(x_uids)

Add datapoints to a slice.

Parameters:

x_uids (List[str]) – List of UIDs of the datapoints you want to add to the slice

Return type:

None
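
The following is a minimal sketch of adding only the UIDs that are not yet members, assuming a slice object that exposes get_x_uids() and add_x_uids() as documented on this page. FakeSlice is a hypothetical in-memory stand-in so the snippet runs without a Snorkel Flow instance, add_missing_x_uids is an illustrative helper name rather than part of the SDK, and the "doc::N" UID strings are placeholder values. This page does not say how the SDK treats duplicate UIDs, so the sketch filters them up front.

```python
from typing import Iterable, List


class FakeSlice:
    """Hypothetical in-memory stand-in for snorkelflow.sdk.Slice (not part of the SDK)."""

    def __init__(self) -> None:
        self._x_uids: List[str] = []

    def get_x_uids(self) -> List[str]:
        return list(self._x_uids)

    def add_x_uids(self, x_uids: List[str]) -> None:
        self._x_uids.extend(x_uids)


def add_missing_x_uids(slice_obj, x_uids: Iterable[str]) -> List[str]:
    """Add only the UIDs that are not already in the slice; return what was added."""
    existing = set(slice_obj.get_x_uids())
    to_add: List[str] = []
    for x_uid in x_uids:
        # Skip current members and duplicates within the input itself.
        if x_uid not in existing:
            existing.add(x_uid)
            to_add.append(x_uid)
    if to_add:
        slice_obj.add_x_uids(to_add)
    return to_add


demo_slice = FakeSlice()
demo_slice.add_x_uids(["doc::0", "doc::1"])
added = add_missing_x_uids(demo_slice, ["doc::1", "doc::2", "doc::2"])
```

With a real Slice returned by create() or get(), the same helper should apply unchanged, since it relies only on the two documented methods.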

classmethod create(dataset, name, description='', config=None)

Create a slice for a dataset. Slices are used to identify a subset of datapoints in a dataset. You can add datapoints to a slice manually or, if you define a config, programmatically. Slice membership can contain both manually and programmatically identified datapoints.

Parameters:
  • dataset (Union[str, int]) – The UID or name for the dataset within Snorkel Flow

  • name (str) – The name of the slice

  • description (str, default: '') – A description of the slice, by default the empty string

  • config (Optional[SliceConfig], default: None) – A SliceConfig object, by default None. See the schema in the template module for how to construct this config. The config defines the Slicing Function (templates and graph) for the slice, which lets the slice add datapoints to its membership programmatically.

Returns:

The slice object

Return type:

Slice

Examples

>>> from templates.keyword_template import KeywordTemplateSchema
>>> from snorkelflow.sdk.slices import Slice, SliceConfig
>>> from snorkelflow.utils.graph import DEFAULT_GRAPH
>>> Slice.create(
...     dataset=dataset_uid,
...     name="slice_name",
...     description="description",
...     config=SliceConfig(
...         templates=[
...             KeywordTemplateSchema(
...                 field="text_col",
...                 keywords=["keyword1", "keyword2"],
...                 operator="CONTAINS",
...                 case_sensitive=False,
...                 tokenize=True,
...             )
...         ],
...         graph=DEFAULT_GRAPH,
...     ),
... )

classmethod get(dataset, slice)

Retrieve a slice by UID or name.

Parameters:
  • dataset (Union[str, int]) – The UID or name for the dataset within Snorkel Flow

  • slice (Union[str, int]) – The UID or name of the slice

Returns:

The slice object

Return type:

Slice

Raises:

ValueError – If no slice is found with the given UID
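
Because get() raises ValueError when no slice is found, a "get or create" pattern can be sketched as below. This is an illustration under assumptions, not an SDK feature: FakeSlice is a hypothetical stand-in that mirrors the documented create() and get() classmethods, get_or_create_slice is an invented helper name, and the sketch assumes the same ValueError is raised when the lookup is by name rather than UID, which this page does not state explicitly.

```python
from typing import Dict, Tuple, Union


class FakeSlice:
    """Hypothetical stand-in mirroring the documented create()/get() classmethods."""

    # In-memory registry keyed by (dataset, slice name); not how the SDK stores slices.
    _registry: Dict[Tuple[Union[str, int], str], "FakeSlice"] = {}

    def __init__(self, dataset, slice_uid, name, description=None):
        self.dataset = dataset
        self.slice_uid = slice_uid
        self.name = name
        self.description = description

    @classmethod
    def create(cls, dataset, name, description="", config=None):
        new_slice = cls(dataset, len(cls._registry) + 1, name, description)
        cls._registry[(dataset, name)] = new_slice
        return new_slice

    @classmethod
    def get(cls, dataset, slice):
        try:
            return cls._registry[(dataset, slice)]
        except KeyError:
            raise ValueError(f"No slice {slice!r} in dataset {dataset!r}") from None


def get_or_create_slice(slice_cls, dataset, name, description=""):
    """Return the existing slice, or create it when get() raises ValueError."""
    try:
        return slice_cls.get(dataset, name)
    except ValueError:
        return slice_cls.create(dataset, name, description=description)


first = get_or_create_slice(FakeSlice, "my-dataset", "long_docs", "Long documents")
second = get_or_create_slice(FakeSlice, "my-dataset", "long_docs")
```

Passing the class as slice_cls keeps the helper testable; in real use the argument would be the SDK's Slice class.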

get_x_uids()

Retrieve the UIDs of the datapoints in the slice.

Returns:

List of UIDs of the datapoints in the slice

Return type:

List[str]

classmethod list(dataset)

Retrieve all slices for a dataset.

Parameters:

dataset (Union[str, int]) – The UID or name for the dataset within Snorkel Flow

Returns:

A list of all the slices available for that dataset

Return type:

List[Slice]

Raises:

ValueError – If no dataset is found with the given id
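
As a small illustration of combining list() with the name and slice_uid attributes, the sketch below builds a name-to-UID lookup for a dataset. FakeSlice and slice_uids_by_name are hypothetical names introduced for this example only; the stand-in raises ValueError for an unknown dataset to mirror the documented behavior.

```python
from typing import Dict, List


class FakeSlice:
    """Hypothetical stand-in exposing the documented name and slice_uid attributes."""

    # In-memory storage for the example; the real SDK queries Snorkel Flow instead.
    _slices_by_dataset: Dict[str, List["FakeSlice"]] = {}

    def __init__(self, slice_uid: int, name: str) -> None:
        self.slice_uid = slice_uid
        self.name = name

    @classmethod
    def list(cls, dataset):
        if dataset not in cls._slices_by_dataset:
            raise ValueError(f"No dataset found for {dataset!r}")
        return cls._slices_by_dataset[dataset][:]


def slice_uids_by_name(slice_cls, dataset) -> Dict[str, int]:
    """Map each slice name in the dataset to its slice UID."""
    return {s.name: s.slice_uid for s in slice_cls.list(dataset)}


FakeSlice._slices_by_dataset["my-dataset"] = [
    FakeSlice(1, "short_docs"),
    FakeSlice(2, "long_docs"),
]
uid_lookup = slice_uids_by_name(FakeSlice, "my-dataset")
```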

remove_x_uids(x_uids)

Remove datapoints from a slice.

Parameters:

x_uids (List[str]) – List of UIDs of the datapoints you want to remove from the slice

Return type:

None
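
One way to bring a slice's membership in line with a target list is to diff the current UIDs against the desired ones and then call add_x_uids() and remove_x_uids() with the differences. The sketch below assumes only the three documented membership methods; FakeSlice and sync_slice_membership are hypothetical names. This page does not describe how remove_x_uids() interacts with datapoints added programmatically through a config, so treat this as a sketch for manually managed slices.

```python
from typing import Iterable, List, Tuple


class FakeSlice:
    """Hypothetical in-memory stand-in for the documented membership methods."""

    def __init__(self, x_uids: Iterable[str]) -> None:
        self._x_uids: List[str] = list(x_uids)

    def get_x_uids(self) -> List[str]:
        return list(self._x_uids)

    def add_x_uids(self, x_uids: List[str]) -> None:
        self._x_uids.extend(x_uids)

    def remove_x_uids(self, x_uids: List[str]) -> None:
        drop = set(x_uids)
        self._x_uids = [uid for uid in self._x_uids if uid not in drop]


def sync_slice_membership(
    slice_obj, desired_x_uids: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Make the slice contain exactly desired_x_uids; return (added, removed)."""
    desired = set(desired_x_uids)
    current = set(slice_obj.get_x_uids())
    to_add = sorted(desired - current)
    to_remove = sorted(current - desired)
    # Only call the mutating methods when there is something to change.
    if to_add:
        slice_obj.add_x_uids(to_add)
    if to_remove:
        slice_obj.remove_x_uids(to_remove)
    return to_add, to_remove


demo_slice = FakeSlice(["doc::0", "doc::1", "doc::2"])
added, removed = sync_slice_membership(demo_slice, ["doc::1", "doc::3"])
```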

update(name=None, description=None, config=None)

Update the slice properties.

Parameters:
  • name (Optional[str], default: None) – The new name for the slice, by default None

  • description (Optional[str], default: None) – The new description for the slice, by default None

  • config (Optional[SliceConfig], default: None) – A SliceConfig object with the new configuration for the slice, by default None

Return type:

None
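
The sketch below shows one possible use of update(): appending a note to the existing description while leaving the name untouched. It handles the case where description is None, as allowed by the property's type. FakeSlice and append_to_description are hypothetical names, and the stand-in assumes that arguments left as None keep their current values, which the defaults suggest but this page does not state.

```python
class FakeSlice:
    """Hypothetical stand-in for the documented update() method and attributes."""

    def __init__(self, name, description=None):
        self.name = name
        self.description = description

    def update(self, name=None, description=None, config=None):
        # Assumption: None means "leave unchanged"; config is ignored by this stand-in.
        if name is not None:
            self.name = name
        if description is not None:
            self.description = description


def append_to_description(slice_obj, note: str) -> str:
    """Append a note to the slice description and return the new description."""
    current = slice_obj.description or ""  # description may be None
    new_description = f"{current} {note}".strip()
    slice_obj.update(description=new_description)
    return new_description


demo_slice = FakeSlice("long_docs")
append_to_description(demo_slice, "Reviewed by QA.")
append_to_description(demo_slice, "Needs relabeling.")
```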

property dataset_uid: int

Return the UID of the dataset that the slice belongs to

property description: str | None

Return the description of the slice

property name: str

Return the name of the slice

property slice_uid: int

Return the UID of the slice