snorkelflow.sdk.FineTuningApp
- class snorkelflow.sdk.FineTuningApp(app_uid, model_node_uid, dataset_uid, label_schema_uid, workspace_uid, fine_tuning_app_config)
Bases: object
- __init__(app_uid, model_node_uid, dataset_uid, label_schema_uid, workspace_uid, fine_tuning_app_config)
Methods

| Method | Description |
| --- | --- |
| __init__(app_uid, model_node_uid, ...) | |
| create(app_name, fine_tuning_app_config) | Create a new fine tuning application with the given name and configuration. The dataset, application, and label schema will be set up for you. |
| create_evaluation_report(split[, ...]) | Create an evaluation report for the quality dataset. |
| delete() | Delete the fine tuning application. |
| get(application) | Initialize a FineTuningApp object from an existing fine tuning application previously created with the SDK. |
| get_annotation_batches() | Get the annotation batches associated with the entire fine tuning dataset and label schema. |
| get_dataframe([split, source_uids, x_uids, ...]) | Get data from the dataset associated with the fine tuning application with the given filters applied. |
| get_evaluation_report(evaluation_report_uid) | Get the evaluation report associated with the given evaluation report uid. |
| get_ft_dataset() | Get the fine tuning dataset associated with the fine tuning application. |
| get_quality_dataset(model_uid) | Create a QualityDataset object from a trained model's predictions. |
| get_sources() | Get the sources within the current workspace. |
| import_data(data, split, source_uid[, name, ...]) | Import data into the fine tuning application. |
| import_ground_truth(gt_df, gt_column, ...[, ...]) | Import ground truth labels into the fine tuning dataset. |
| list_evaluation_reports() | List the evaluation reports associated with the fine tuning application. |
| list_quality_models() | List the quality models associated with the fine tuning application. |
| register_custom_metric(metric_name, metric_func) | Register a user-defined metric with the FineTuningApp. |
| register_model_source(model_name[, metadata]) | Register a model source with the given model name and metadata. |
| register_source(source_name, source_type, ...) | Register a source in the platform. |
| setup_studio() | Set up the studio for the fine tuning application. |
Attributes

| Attribute | Description |
| --- | --- |
| datasource_metadata | Get metadata about each datasource, including details about the source. |
- classmethod create(app_name, fine_tuning_app_config)
Create a new fine tuning application with the given name and configuration. The dataset, application, and label schema will be set up for you.
- create_evaluation_report(split, quality_models=None, finetuned_model_sources=None, slices=None)
Create an evaluation report for the quality dataset.
Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| split | str | | The split of the data to evaluate. |
| quality_models | Union[List[str], List[int], None] | None | The quality models to evaluate (if not provided, the committed quality model or the most recently trained model will be used, in that order). |
| finetuned_model_sources | Union[List[str], List[int], None] | None | The finetuned model sources to evaluate (if not provided, all finetuned models associated with the datasources will be used). |
| slices | Union[List[str], List[int], None] | None | The slices to evaluate (if not provided, all slices in the given dataset will be evaluated). |

Returns
A dictionary containing the evaluation results
Return type
Dict[str, Any]
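
The call described above can be sketched as a small helper that creates one report per split. This is only an illustration: `evaluate_splits` is a hypothetical name, `app` is assumed to be a FineTuningApp obtained elsewhere (for example through `FineTuningApp.get`), and the split names passed in are assumed to exist in your dataset.

```python
from typing import Any, Dict, Iterable, List, Optional


def evaluate_splits(
    app: Any,
    splits: Iterable[str],
    slices: Optional[List[str]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Create one evaluation report per split, keyed by split name.

    `app` only needs to expose create_evaluation_report() with the
    signature documented above (hypothetical helper, not part of the SDK).
    """
    reports: Dict[str, Dict[str, Any]] = {}
    for split in splits:
        # quality_models and finetuned_model_sources stay None so the
        # documented fallbacks apply (committed or most recent quality
        # model; all finetuned models associated with the datasources).
        reports[split] = app.create_evaluation_report(
            split=split,
            quality_models=None,
            finetuned_model_sources=None,
            slices=slices,
        )
    return reports
```

Leaving `slices` as None evaluates every slice in the dataset, as stated in the parameter table.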
- delete()
Delete the fine tuning application. Dataset must be deleted separately.
Return type
None
- classmethod get(application)
Initialize a FineTuningApp object from an existing fine tuning application previously created with the SDK.
- get_annotation_batches()
Get the annotation batches associated with the entire fine tuning dataset and label schema.
- get_dataframe(split=None, source_uids=None, x_uids=None, datasource_uids=None)
Get data from the dataset associated with the fine tuning application, with the given filters applied.
Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| split | Optional[str] | None | The split of the data to get. |
| source_uids | Optional[List[int]] | None | The source uids to filter by. |
| x_uids | Optional[List[str]] | None | The x uids to filter by. |
| datasource_uids | Optional[List[str]] | None | The datasource uids to filter by. |

Return type
DataFrame
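
As a rough illustration of the filters above, the sketch below counts rows per split. `count_rows_by_split` is a hypothetical helper, `app` is assumed to be an existing FineTuningApp, and the split names are whatever your dataset uses.

```python
from typing import Any, Dict, Iterable, List, Optional


def count_rows_by_split(
    app: Any,
    splits: Iterable[str],
    source_uids: Optional[List[int]] = None,
) -> Dict[str, int]:
    """Return the number of rows get_dataframe() yields for each split.

    Hypothetical helper: it relies only on the documented get_dataframe()
    keyword arguments and on len() of the returned DataFrame being its
    row count.
    """
    counts: Dict[str, int] = {}
    for split in splits:
        df = app.get_dataframe(split=split, source_uids=source_uids)
        counts[split] = len(df)
    return counts
```

Passing `source_uids=None` leaves that filter off, matching the default in the table.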
- get_evaluation_report(evaluation_report_uid)
Get the evaluation report associated with the given evaluation report uid.
- get_ft_dataset()
Get the fine tuning dataset associated with the fine tuning application.
- get_quality_dataset(model_uid)
Create a QualityDataset object from a trained model’s predictions.
- get_sources()
Get the sources within the current workspace.
- import_data(data, split, source_uid, name=None, sync=True, refresh_datasources=True, prompt_template=None)
Import data into the fine tuning application.
Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| data | Union[str, DataFrame] | | A file path or a pandas DataFrame of the data to import into the dataset. |
| split | str | | The split of the data. |
| source_uid | int | | The source to associate the data with for data lineage. |
| name | Optional[str] | None | The name of the data source. |
| sync | bool | True | Whether to wait for the ingestion job to complete before returning. |
| refresh_datasources | bool | True | Whether to refresh datasources for the downstream model node after ingestion. Can only be set if sync is True. |
| prompt_template | Optional[str] | None | The prompt template used when the data was generated. |

Returns
The job_id of the ingestion job
Return type
str
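
The interaction between `sync` and `refresh_datasources` is easy to get wrong, so a caller may want to check it before submitting. The sketch below is one possible way to do that. `build_import_kwargs` is a hypothetical helper, and it assumes the documented constraint means that `refresh_datasources=True` is invalid when `sync=False`.

```python
from typing import Any, Dict, Optional


def build_import_kwargs(
    data: Any,
    split: str,
    source_uid: int,
    name: Optional[str] = None,
    sync: bool = True,
    refresh_datasources: bool = True,
    prompt_template: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for import_data() (hypothetical helper)."""
    # Assumption: "Can only be set if sync is True" is read as
    # "refresh_datasources=True requires sync=True".
    if refresh_datasources and not sync:
        raise ValueError("refresh_datasources=True requires sync=True")
    return {
        "data": data,
        "split": split,
        "source_uid": source_uid,
        "name": name,
        "sync": sync,
        "refresh_datasources": refresh_datasources,
        "prompt_template": prompt_template,
    }


# Usage, assuming `app` is an existing FineTuningApp and the file exists:
# job_id = app.import_data(**build_import_kwargs("train.csv", "train", source_uid=1))
```

For an asynchronous import under this reading, pass both `sync=False` and `refresh_datasources=False`.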
- import_ground_truth(gt_df, gt_column, join_column, source_uid=None, user_format=True)
Import ground truth labels into the fine tuning dataset.
Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| gt_df | DataFrame | | The ground truth labels DataFrame. |
| gt_column | str | | The column in the ground truth DataFrame that contains the labels. |
| join_column | str | | The column to join the gt_df and the fine tuning dataset on to associate the ground truth labels with the fine tuning dataset. |
| source_uid | Optional[int] | None | The source uid to associate the annotations with. Defaults to the requesting user’s source uid if not set. |
| user_format | bool | True | Whether the labels are in the user format or not (the label map string value vs the int value). If true, the label map will be used to convert the labels to their integer values. |

Return type
None
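
To make the `user_format` distinction concrete, the sketch below shows the kind of string-to-integer conversion the description refers to. It is purely illustrative: the SDK performs its own conversion internally, `to_integer_labels` is a hypothetical function, and the label map in the usage comment is made up.

```python
from typing import Dict, List


def to_integer_labels(labels: List[str], label_map: Dict[str, int]) -> List[int]:
    """Convert user-format (string) labels to their integer values.

    Hypothetical illustration of what user_format=True implies; this is
    not the SDK's implementation.
    """
    # Fail early on labels that the label map does not define.
    unknown = sorted({label for label in labels if label not in label_map})
    if unknown:
        raise KeyError(f"labels missing from label map: {unknown}")
    return [label_map[label] for label in labels]


# With a made-up label map {"reject": 0, "accept": 1}:
# to_integer_labels(["accept", "reject", "accept"], {"reject": 0, "accept": 1})
# returns [1, 0, 1]
```

If the labels in `gt_df` are already integers like the output here, `user_format=False` is the setting that matches the description.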
- list_evaluation_reports()
List the evaluation reports associated with the fine tuning application.
Return type
List[Dict[str, Any]]
- list_quality_models()
List the quality models associated with the fine tuning application.
Return type
DataFrame
- register_custom_metric(metric_name, metric_func, overwrite=False)
Register a user-defined metric with the FineTuningApp.
Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| metric_name | str | | The display name of this metric. |
| metric_func | Callable | | A python function to compute this metric. |
| overwrite | Optional[bool] | False | Overwrite a metric of the same name if one already exists. |

Returns
id of the registered metric.
Return type
int
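
A registration call might look like the sketch below. The documentation above does not state what arguments `metric_func` receives, so the `(golds, preds)` signature used here is an assumption, and both `exact_match_rate` and `register_exact_match` are hypothetical names. Check the expected callable signature for your SDK version before relying on it.

```python
from typing import Any, Sequence


def exact_match_rate(golds: Sequence[str], preds: Sequence[str]) -> float:
    """Fraction of predictions equal to the gold text, ignoring outer whitespace.

    Assumption: the metric receives parallel sequences of gold and
    predicted strings; the real callable signature is not documented here.
    """
    if len(golds) != len(preds):
        raise ValueError("golds and preds must have the same length")
    if not golds:
        return 0.0
    matches = sum(1 for gold, pred in zip(golds, preds) if gold.strip() == pred.strip())
    return matches / len(golds)


def register_exact_match(app: Any, overwrite: bool = False) -> int:
    """Register the metric on an existing FineTuningApp and return its id."""
    return app.register_custom_metric(
        metric_name="exact_match_rate",
        metric_func=exact_match_rate,
        overwrite=overwrite,
    )
```

Setting `overwrite=True` replaces an existing metric of the same name, per the parameter table.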
- register_model_source(model_name, metadata=None)
Register a model source with the given model name and metadata.
Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| model_name | str | | The name of the model. |
| metadata | Optional[ModelSourceMetadata] | None | The metadata associated with the model source. If not provided, the provided model name will be used as the model name in the metadata. |

Returns
The registered model source.
Return type
Dict[str, Any]
- classmethod register_source(source_name, source_type, user_uid, metadata=None)
Register a source in the platform.
Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| source_name | str | | The name of the source. |
| source_type | SvcSourceType | | The type of the source. |
| user_uid | Optional[int] | | The user uid to associate with the source. |
| metadata | Optional[Dict[str, Any]] | None | The metadata to associate with the source. |

Returns
The created source
Return type
Dict[str, Any]
- setup_studio()
Set up the studio for the fine tuning application. This will refresh any stale datasources associated with the fine tuning application.
Return type
None
- property datasource_metadata: Dict[int, Any]
Get metadata about each datasource, including details about the source.