snorkelflow.sdk.LabelSchema
- class snorkelflow.sdk.LabelSchema(name, uid, dataset_uid, label_map, description)
Bases: object
The LabelSchema object represents a label schema in Snorkel Flow. Currently, this interface only represents Dataset-level (not Node-level) label schemas.
- __init__(name, uid, dataset_uid, label_map, description)
Create a label schema object in-memory with necessary properties. This constructor should not be called directly, and should instead be accessed through the create() and get() methods.
- Parameters:
name (str) – The name of the label schema
uid (int) – The UID for the label schema within Snorkel Flow
dataset_uid (int) – The UID for the dataset within Snorkel Flow
label_map (Dict[str, int]) – The label map of the label schema
description (Optional[str]) – The description of the label schema
Methods
__init__(name, uid, dataset_uid, label_map, ...) – Create a label schema object in-memory with necessary properties.
copy(name[, description, label_map, ...]) – Copy a label schema.
create(dataset_uid, name, data_type, ...[, ...]) – Create a label schema for a dataset.
delete(label_schema) – Delete a label schema by name or UID.
get(label_schema) – Retrieve a label schema by name or UID.
Attributes
dataset_uid – The UID for the dataset within Snorkel Flow.
description – The description of the label schema.
label_map – The label map of the label schema.
name – The name of the label schema.
uid – The UID for the label schema within Snorkel Flow.
- copy(name, description=None, label_map=None, label_descriptions=None, updated_label_schema=None)
Copy a label schema.
- Parameters:
name (str) – The name of the new label schema
description (Optional[str], default: None) – The description of the new label schema
label_map (Optional[Dict[str, int]], default: None) – The label map of the new label schema
label_descriptions (Optional[Dict[str, str]], default: None) – The label descriptions of the new label schema
updated_label_schema (Optional[Dict[str, str]], default: None) – The update mapping to apply to the new label schema. This is a dictionary mapping label names for the current label schema to those for the new label schema. If a label for the current label schema is removed, it is mapped to None. Examples:
1. Rename "old_1" to "new_1" and remove "old_2": {"old_1": "new_1", "old_2": None}
2. Merge "old_1" and "old_2" to "new_1": {"old_1": "new_1", "old_2": "new_1"}
3. Split "old_1" to "new_1" and "new_2", and keep assets labeled as "old_1" at "new_1": {"old_1": "new_1"}
4. Add "new_3": None (no change to the existing assets)
- Returns:
The new label schema object
- Return type:
LabelSchema
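The update mapping is easier to follow when applied to a few label names. The following is a minimal sketch in plain Python of the semantics described for updated_label_schema. The helper `remap_label` is hypothetical and written only for illustration; it is not part of the snorkelflow SDK and makes no claim about how Snorkel Flow applies the mapping internally. It also assumes that a label absent from the mapping keeps its name, which the description above does not state.

```python
from typing import Dict, List, Optional


def remap_label(
    label: str, update: Optional[Dict[str, Optional[str]]]
) -> Optional[str]:
    """Return the new name for `label`, or None if the label is removed."""
    if update is None:
        # No update mapping: existing labels are left untouched.
        return label
    # Assumption: a label missing from the mapping keeps its name.
    return update.get(label, label)


existing: List[str] = ["old_1", "old_2", "old_3"]

# Example 1 from the parameter description: rename one label, remove another.
rename_and_remove = {"old_1": "new_1", "old_2": None}
# Example 2: merge two labels into one.
merge = {"old_1": "new_1", "old_2": "new_1"}

renamed = [remap_label(label, rename_and_remove) for label in existing]
merged = [remap_label(label, merge) for label in existing]
unchanged = [remap_label(label, None) for label in existing]
```

In a real call, a dictionary shaped like `rename_and_remove` or `merge` is what the updated_label_schema argument of copy() expects.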
- classmethod create(dataset_uid, name, data_type, task_type, label_map, multi_label=False, description=None, label_column=None, label_descriptions=None, primary_field=None)
Create a label schema for a dataset.
Typically, Dataset.create_label_schema() is the recommended entrypoint for creating label schemas.
- Parameters:
dataset_uid (int) – The UID for the dataset within Snorkel Flow
name (str) – The name of the label schema
data_type (str) – The data type of the label schema
task_type (str) – The task type of the label schema
label_map (Union[Dict[str, int], List[str]]) – A dictionary mapping label names to their integer values, or a list of label names
multi_label (bool, default: False) – Whether the label schema is a multi-label schema, by default False
description (Optional[str], default: None) – A description of the label schema, by default None
label_column (Optional[str], default: None) – The name of the column that contains the labels, by default None
label_descriptions (Optional[Dict[str, str]], default: None) – A dictionary mapping label names to their descriptions, by default None
primary_field (Optional[str], default: None) – The primary field of the label schema, by default None
- Returns:
The label schema object
- Return type:
LabelSchema
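Because label_map accepts either a dictionary or a list, and label_descriptions is keyed by label name, it can help to check the two for consistency before calling create(). The sketch below is one possible client-side pre-check in plain Python. `check_label_inputs` is a hypothetical helper, not an SDK function, and the rules it enforces (unique names, unique integer values, descriptions only for known labels) are assumptions about sensible input rather than documented requirements. The label names and values are placeholders.

```python
from typing import Dict, List, Optional, Union


def check_label_inputs(
    label_map: Union[Dict[str, int], List[str]],
    label_descriptions: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return the label names, raising ValueError on inconsistent inputs."""
    # Iterating a dict yields its keys, so this works for both input forms.
    names = list(label_map)
    if len(set(names)) != len(names):
        raise ValueError("label names must be unique")
    if isinstance(label_map, dict):
        values = list(label_map.values())
        if len(set(values)) != len(values):
            raise ValueError("label values must be unique")
    # Every described label should exist in the label map.
    unknown = set(label_descriptions or {}) - set(names)
    if unknown:
        raise ValueError(f"descriptions given for unknown labels: {sorted(unknown)}")
    return names


names_from_dict = check_label_inputs(
    {"SPAM": 1, "HAM": 0}, {"SPAM": "Unwanted mail"}
)
names_from_list = check_label_inputs(["SPAM", "HAM"])
```

Once the inputs pass, the same label_map and label_descriptions values can be handed to create() or to Dataset.create_label_schema().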
- classmethod delete(label_schema)
Delete a label schema by name or UID.
- Parameters:
label_schema (Union[str, int]) – The name or UID of the label schema
- Return type:
None
- classmethod get(label_schema)
Retrieve a label schema by name or UID.
- Parameters:
label_schema (Union[str, int]) – The name or UID of the label schema
- Returns:
The label schema object
- Return type:
LabelSchema
- Raises:
ValueError – If no label schema is found with the given name or UID
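Since get() raises ValueError when nothing matches, callers that only want to test for existence may prefer a wrapper that returns None instead. The sketch below shows that pattern in plain Python. `get_or_none` and `fake_get` are hypothetical names introduced for illustration; `fake_get` is a stand-in that mimics the documented behavior (lookup by name or UID, ValueError on a miss) so the example runs without a Snorkel Flow instance.

```python
from typing import Callable, Optional, TypeVar, Union

T = TypeVar("T")


def get_or_none(
    getter: Callable[[Union[str, int]], T], label_schema: Union[str, int]
) -> Optional[T]:
    """Call `getter` and translate a ValueError into None."""
    try:
        return getter(label_schema)
    except ValueError:
        return None


# Stand-in data and lookup used in place of LabelSchema.get for this sketch.
_known = [{"name": "sentiment", "uid": 7}]


def fake_get(label_schema: Union[str, int]) -> dict:
    for schema in _known:
        if label_schema in (schema["name"], schema["uid"]):
            return schema
    raise ValueError(f"No label schema found for {label_schema!r}")


found_by_name = get_or_none(fake_get, "sentiment")
found_by_uid = get_or_none(fake_get, 7)
missing = get_or_none(fake_get, "does_not_exist")
```

With the SDK available, passing LabelSchema.get as the `getter` argument would give the same behavior against real label schemas.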
- property dataset_uid: int
The UID for the dataset within Snorkel Flow.
- property description: str | None
The description of the label schema.
- property label_map: Dict[str, int]
The label map of the label schema.
- property name: str
The name of the label schema.
- property uid: int
The UID for the label schema within Snorkel Flow.
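The label_map property maps label names to integers, so turning integer labels back into names requires the inverse mapping. The following is a small sketch in plain Python. `invert_label_map` is a hypothetical helper, not part of the SDK, and the label names and values are placeholders standing in for whatever a real label_map contains.

```python
from typing import Dict


def invert_label_map(label_map: Dict[str, int]) -> Dict[int, str]:
    """Build an integer-to-name lookup from a name-to-integer label map."""
    inverted = {value: name for name, value in label_map.items()}
    # Two names sharing one integer would silently collapse, so fail loudly.
    if len(inverted) != len(label_map):
        raise ValueError("label_map contains duplicate integer values")
    return inverted


# Placeholder for the dictionary returned by the label_map property.
label_map = {"NEGATIVE": 0, "POSITIVE": 1}
int_to_name = invert_label_map(label_map)

integer_labels = [1, 0, 1]
label_names = [int_to_name[value] for value in integer_labels]
```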