snorkelflow.sdk.LabelSchema
- class snorkelflow.sdk.LabelSchema(name, uid, dataset_uid, label_map, description, is_text_label=False)
Bases: object
The LabelSchema object represents a label schema in Snorkel Flow. Currently, this interface only represents Dataset-level (not Node-level) label schemas.
- __init__(name, uid, dataset_uid, label_map, description, is_text_label=False)
Create a label schema object in-memory with necessary properties. This constructor should not be called directly, and should instead be accessed through the create() and get() methods.
- Parameters:
name (str) – The name of the label schema
uid (int) – The UID for the label schema within Snorkel Flow
dataset_uid (int) – The UID for the dataset within Snorkel Flow
label_map (Dict[str, int]) – The label map of the label schema
description (Optional[str]) – The description of the label schema
is_text_label (bool, default: False) – Whether the label schema is a text label schema
Methods
__init__(name, uid, dataset_uid, label_map, ...) – Create a label schema object in-memory with necessary properties.
copy(name[, description, label_map, ...]) – Copy a label schema.
create(dataset_uid, name, data_type, ...[, ...]) – Create a label schema for a dataset.
delete(label_schema) – Delete a label schema by name or UID.
get(label_schema) – Retrieve a label schema by name or UID.
Attributes
dataset_uid – The UID for the dataset within Snorkel Flow.
description – The description of the label schema.
is_text_label – Whether the label schema is a text label schema.
label_map – The label map of the label schema.
name – The name of the label schema.
uid – The UID for the label schema within Snorkel Flow.
- copy(name, description=None, label_map=None, label_descriptions=None, updated_label_schema=None)
Copy a label schema.
- Parameters:
name (str) – The name of the new label schema
description (Optional[str], default: None) – The description of the new label schema
label_map (Optional[Dict[str, int]], default: None) – The label map of the new label schema
label_descriptions (Optional[Dict[str, str]], default: None) – The label descriptions of the new label schema
updated_label_schema (Optional[Dict[str, str]], default: None) – The update mapping to apply to the new label schema. This is a dictionary mapping label names for the current label schema to those for the new label schema. If a label for the current label schema is removed, it is mapped to None. Examples:
1. Rename "old_1" to "new_1" and remove "old_2": {"old_1": "new_1", "old_2": None}
2. Merge "old_1" and "old_2" to "new_1": {"old_1": "new_1", "old_2": "new_1"}
3. Split "old_1" to "new_1" and "new_2", and keep assets labeled as "old_1" at "new_1": {"old_1": "new_1"}
4. Add "new_3": None (no change to the existing assets)
- Returns:
The new label schema object
- Return type:
LabelSchema
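The effect of an updated_label_schema mapping on existing label names can be illustrated with a small standalone sketch. It does not call the SDK and is not Snorkel Flow's implementation; the helper name remap_labels is hypothetical, and it assumes that names absent from the mapping are left unchanged.

```python
from typing import Dict, List, Optional


def remap_labels(
    labels: List[str],
    updated_label_schema: Dict[str, Optional[str]],
) -> List[Optional[str]]:
    """Apply an old-name -> new-name mapping to a list of label names.

    Assumptions (illustrative only): a name mapped to None is treated as
    removed, and a name missing from the mapping is kept as-is.
    """
    return [updated_label_schema.get(label, label) for label in labels]


# Rename "old_1" to "new_1" and remove "old_2".
renamed = remap_labels(
    ["old_1", "old_2", "other"],
    {"old_1": "new_1", "old_2": None},
)

# Merge "old_1" and "old_2" into "new_1".
merged = remap_labels(
    ["old_1", "old_2"],
    {"old_1": "new_1", "old_2": "new_1"},
)
```

Here renamed holds "new_1", None, and "other" in that order, and merged holds "new_1" twice, matching the rename and merge examples in the parameter description.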
- classmethod create(dataset_uid, name, data_type, task_type, label_map, multi_label=False, description=None, label_column=None, label_descriptions=None, primary_field=None, is_text_label=False)
Create a label schema for a dataset.
Typically, Dataset.create_label_schema() is the recommended entrypoint for creating label schemas.
- Parameters:
dataset_uid (int) – The UID for the dataset within Snorkel Flow
name (str) – The name of the label schema
data_type (str) – The data type of the label schema
task_type (str) – The task type of the label schema
label_map (Union[Dict[str, int], List[str]]) – A dictionary mapping label names to their integer values, or a list of label names
multi_label (bool, default: False) – Whether the label schema is a multi-label schema, by default False
description (Optional[str], default: None) – A description of the label schema, by default None
label_column (Optional[str], default: None) – The name of the column that contains the labels, by default None
label_descriptions (Optional[Dict[str, str]], default: None) – A dictionary mapping label names to their descriptions, by default None
primary_field (Optional[str], default: None) – The primary field of the label schema, by default None
is_text_label (bool, default: False) – Whether the label schema is a text label schema, by default False
- Returns:
The label schema object
- Return type:
LabelSchema
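As a rough sketch of the call shape, the wrapper below forwards the five required arguments of create() by keyword and leaves the optional ones at their defaults. The wrapper name create_schema is hypothetical, and the class is passed in as schema_cls so the sketch does not depend on an installed SDK or a running Snorkel Flow instance. The accepted values for data_type and task_type are not listed here, so the caller supplies them.

```python
from typing import Any, Dict, List, Union


def create_schema(
    schema_cls: Any,
    dataset_uid: int,
    name: str,
    data_type: str,
    task_type: str,
    label_map: Union[Dict[str, int], List[str]],
) -> Any:
    """Call schema_cls.create() with the required arguments as keywords.

    schema_cls is expected to be snorkelflow.sdk.LabelSchema (or a test
    double exposing the same create() classmethod).
    """
    return schema_cls.create(
        dataset_uid=dataset_uid,
        name=name,
        data_type=data_type,
        task_type=task_type,
        label_map=label_map,
    )
```

In real use, schema_cls would be the LabelSchema class imported from snorkelflow.sdk. As noted above, Dataset.create_label_schema() is typically the recommended entrypoint.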
- classmethod delete(label_schema)
Delete a label schema by name or UID.
- Parameters:
label_schema (Union[str, int]) – The name or UID of the label schema
- Return type:
None
- classmethod get(label_schema)
Retrieve a label schema by name or UID.
- Parameters:
label_schema (Union[str, int]) – The name or UID of the label schema
- Returns:
The label schema object
- Return type:
LabelSchema
- Raises:
ValueError – If no label schema is found with the given name or UID
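Because get() raises ValueError when nothing matches, callers that treat a missing schema as a normal case may want to catch it. The following is a minimal sketch; get_schema_or_none is a hypothetical helper, and the class is injected as schema_cls so the example runs without the SDK installed.

```python
from typing import Any, Optional, Union


def get_schema_or_none(
    schema_cls: Any,
    label_schema: Union[str, int],
) -> Optional[Any]:
    """Return the label schema, or None if get() raises ValueError.

    schema_cls is expected to be snorkelflow.sdk.LabelSchema (or any
    object exposing a compatible get() classmethod).
    """
    try:
        return schema_cls.get(label_schema)
    except ValueError:
        return None
```

With the SDK available, passing the LabelSchema class from snorkelflow.sdk as schema_cls looks the schema up by name or UID and returns None instead of raising.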
- property dataset_uid: int
The UID for the dataset within Snorkel Flow.
- property description: str | None
The description of the label schema.
- property is_text_label: bool
Whether the label schema is a text label schema.
- property label_map: Dict[str, int]
The label map of the label schema.
- property name: str
The name of the label schema.
- property uid: int
The UID for the label schema within Snorkel Flow.
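Since label_map maps label names to integers, decoding integer labels back to names needs the inverse lookup. A minimal sketch follows; invert_label_map is a hypothetical helper, the label names are made up for illustration, and the duplicate check is a defensive assumption rather than documented SDK behavior.

```python
from typing import Dict


def invert_label_map(label_map: Dict[str, int]) -> Dict[int, str]:
    """Build an integer -> name lookup from a name -> integer label map.

    Raises ValueError if two names share the same integer value, since the
    inverse would then be ambiguous.
    """
    inverse: Dict[int, str] = {}
    for name, value in label_map.items():
        if value in inverse:
            raise ValueError(
                f"labels {inverse[value]!r} and {name!r} share value {value}"
            )
        inverse[value] = name
    return inverse


# Illustrative label map; in practice this would come from schema.label_map.
int_to_name = invert_label_map({"HAM": 0, "SPAM": 1})
```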