Version: 0.94

snorkelflow.sdk.QualityDataset

class snorkelflow.sdk.QualityDataset(df, dataset_uid, label_schema_uid, model_node_uid, label_map)

Bases: FTDataset

__init__(df, dataset_uid, label_schema_uid, model_node_uid, label_map)

Methods

`__init__`(df, dataset_uid, label_schema_uid, ...)
`append`(ft_dataset)	Append the given FTDataset to the current FTDataset.
`create_annotation_batches`([assignees])	Create an annotation batch for the ft dataset.
`export_data`(format, filepath)	Export the data in the FTDataset to the specified format and write to the provided filepath.
`filter`([source_uids, splits, x_uids, ...])	Filter the dataset based on the given filters.
`get_data`()	Get the data associated with the fine tuning dataset.
`get_x_uids`()	Get the x_uids in the FTDataset.
`mix`(mix_on, weights, n_samples[, seed])	Mix the dataset by split, source_uid, or slice based on the given weights, returning up to n_samples samples.
`sample`(n[, seed])	Sample n samples from the FTDataset.
`save`(name)	Save the FTDataset as a slice.
`set_as_dev_set`()	Resample the x_uids within the FTDataset as the dev set for the fine tuning application.

filter

filter(source_uids=None, splits=None, x_uids=None, feature_hashes=None, slices=None, has_gt=None, labels=None, confidence_threshold=None)

Filter the dataset based on the given filters.

Parameters

Name	Type	Default	Info
source_uids	`Optional[List[int]]`	`None`	The source uids to filter by.
splits	`Optional[List[str]]`	`None`	The splits to filter by.
x_uids	`Optional[List[str]]`	`None`	The x uids to filter by.
feature_hashes	`Optional[List[str]]`	`None`	The feature hashes to filter by.
slices	`Optional[List[Slice]]`	`None`	The slices to filter by; rows within at least one slice will be included.
has_gt	`Optional[bool]`	`None`	Filter by the existence / non-existence of ground truth.
labels	`Optional[List[str]]`	`None`	The labels to filter by.
confidence_threshold	`Optional[float]`	`None`	The confidence threshold to filter by.

Returns: The filtered dataset

Return type: QualityDataset
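
The following is a minimal sketch of how optional filters of this kind are commonly combined. It is not the SDK implementation and does not call `snorkelflow`. The helper `filter_rows`, the sample rows, and the column names (`x_uid`, `split`, `label`, `confidence`, `has_gt`) are hypothetical. It assumes that a filter left as `None` is skipped, that supplied filters are combined with AND, and that `confidence_threshold` keeps rows at or above the threshold; this page does not confirm any of these behaviors.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]

# Hypothetical rows standing in for the dataset's DataFrame.
# The column names are assumptions made for illustration only.
ROWS: List[Row] = [
    {"x_uid": "doc::0", "split": "train", "label": "good", "confidence": 0.92, "has_gt": True},
    {"x_uid": "doc::1", "split": "train", "label": "bad", "confidence": 0.40, "has_gt": False},
    {"x_uid": "doc::2", "split": "valid", "label": "good", "confidence": 0.75, "has_gt": True},
    {"x_uid": "doc::3", "split": "train", "label": "good", "confidence": 0.55, "has_gt": True},
]


def filter_rows(
    rows: List[Row],
    splits: Optional[List[str]] = None,
    x_uids: Optional[List[str]] = None,
    has_gt: Optional[bool] = None,
    labels: Optional[List[str]] = None,
    confidence_threshold: Optional[float] = None,
) -> List[Row]:
    """Hypothetical stand-in: skip filters that are None, AND the rest together."""
    kept = []
    for row in rows:
        if splits is not None and row["split"] not in splits:
            continue
        if x_uids is not None and row["x_uid"] not in x_uids:
            continue
        if has_gt is not None and row["has_gt"] != has_gt:
            continue
        if labels is not None and row["label"] not in labels:
            continue
        # Assumption: rows at or above the threshold are kept.
        if confidence_threshold is not None and row["confidence"] < confidence_threshold:
            continue
        kept.append(row)
    return kept


# Keep training rows that have ground truth, are labeled "good",
# and have confidence of at least 0.5.
selected = filter_rows(
    ROWS,
    splits=["train"],
    has_gt=True,
    labels=["good"],
    confidence_threshold=0.5,
)
selected_uids = [row["x_uid"] for row in selected]
print(selected_uids)
```

Under the same assumptions, the equivalent call on a real dataset object would pass the same keyword arguments to `filter`, which returns a new `QualityDataset` rather than a list of rows.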
