Version: 0.96

snorkelflow.sdk.FineTuningApp

class snorkelflow.sdk.FineTuningApp(app_uid, model_node_uid, dataset_uid, label_schema_uid, workspace_uid, fine_tuning_app_config)

Bases: object

__init__(app_uid, model_node_uid, dataset_uid, label_schema_uid, workspace_uid, fine_tuning_app_config)

Methods

`__init__`(app_uid, model_node_uid, ...)
`create`(app_name, fine_tuning_app_config)	Create a new fine tuning application with the given name and configuration. The dataset, application, and label schema will be set up for you.
`create_evaluation_report`([split, ...])	Create an evaluation report for the quality dataset.
`delete`()	Delete the fine tuning application.
`get`(application)	Initialize a FineTuningApp object from an existing fine tuning application previously created with the SDK.
`get_annotation_batches`()	Get the annotation batches associated with the entire fine tuning dataset and label schema.
`get_dataframe`([split, source_uids, x_uids, ...])	Get data from the dataset associated with the fine tuning application, with the given filters applied.
`get_evaluation_report`(evaluation_report_uid)	Get the evaluation report associated with the given evaluation report uid.
`get_ft_dataset`()	Get the fine tuning dataset associated with the fine tuning application.
`get_quality_dataset`(model_uid)	Create a QualityDataset object from a trained model's predictions.
`get_sources`()	Get the sources within the current workspace.
`import_data`(data, split, source_uid[, name, ...])	Import data into the fine tuning application.
`import_ground_truth`(gt_df, gt_column, ...[, ...])	Import ground truth labels into the fine tuning dataset.
`list_evaluation_reports`()	List the evaluation reports associated with the fine tuning application.
`list_quality_models`()	List the quality models associated with the fine tuning application.
`register_custom_metric`(metric_name, metric_func)	Register a user-defined metric with the FineTuningApp.
`register_metric`(metric_schema)	Register a defined metric with the FineTuningApp.
`register_model_source`(model_name[, metadata])	Register a model source with the given model name and metadata.
`register_source`(source_name, source_type, ...)	Register a source in the platform.
`setup_studio`()	Set up the studio for the fine tuning application.
`unregister_metric`(metric_name)	Unregister a metric with the FineTuningApp.

Attributes

`datasource_metadata`	Get metadata about each datasource, including details about the source.

create

classmethod create(app_name, fine_tuning_app_config)

Create a new fine tuning application with the given name and configuration. The dataset, application, and label schema will be set up for you.

Parameters

Name	Type	Default	Info
app_name	`str`		The name of the fine tuning application.
fine_tuning_app_config	`FineTuningAppConfig`		The configuration of the fine tuning application.

Returns: The fine tuning application object
Return type: FineTuningApp

create_evaluation_report

create_evaluation_report(split=None, quality_models=None, finetuned_model_sources=None, slices=None)

Create an evaluation report for the quality dataset.

Parameters

Name	Type	Default	Info
split	`Optional[str]`	`None`	The split of the data to evaluate (if not provided, metrics will be computed for all splits).
quality_models	`Union[List[str], List[int], None]`	`None`	The quality models to evaluate (if not provided, the committed quality model or the most recently trained model will be used, in that order).
finetuned_model_sources	`Union[List[str], List[int], None]`	`None`	The finetuned model sources to evaluate (if not provided, all finetuned models associated with the datasources will be used).
slices	`Union[List[str], List[int], None]`	`None`	The slices to evaluate (if not provided, all slices in the given dataset will be evaluated).

Returns: A dictionary containing the evaluation results
Return type: Dict[str, Any]
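The following is a minimal sketch of producing one report per split, under stated assumptions. The helper name `evaluate_splits` is hypothetical, `app` is assumed to be a `FineTuningApp` obtained elsewhere (for example via `FineTuningApp.get`), and the split names you pass are your own. Only the documented `split` and `slices` keyword arguments are used, so `quality_models` and `finetuned_model_sources` keep their documented defaults.

```python
from typing import Any, Dict, Iterable, List, Optional


def evaluate_splits(
    app: Any,
    splits: Iterable[str],
    slices: Optional[List[str]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Create one evaluation report per split and key the results by split.

    `app` is assumed to expose create_evaluation_report with the signature
    documented above; this helper is illustrative, not part of the SDK.
    """
    reports: Dict[str, Dict[str, Any]] = {}
    for split in splits:
        # quality_models / finetuned_model_sources are left unset so the
        # documented defaults apply.
        reports[split] = app.create_evaluation_report(split=split, slices=slices)
    return reports
```

Calling `create_evaluation_report()` with no `split` computes metrics for all splits in a single report; the per-split loop is only useful when you want separate reports.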

delete

delete()

Delete the fine tuning application. Dataset must be deleted separately.

Return type: None

get

classmethod get(application)

Initialize a FineTuningApp object from an existing fine tuning application previously created with the SDK.

Parameters

Name	Type	Default	Info
application	`Union[str, int]`		The name or uid of the fine tuning application.

Returns: The fine tuning application object
Return type: FineTuningApp

get_annotation_batches

get_annotation_batches()

Get the annotation batches associated with the entire fine tuning dataset and label schema.

Return type: List[Batch]

get_dataframe

get_dataframe(split=None, source_uids=None, x_uids=None, datasource_uids=None)

Get data from the dataset associated with the fine tuning application, with the given filters applied.

Parameters

Name	Type	Default	Info
split	`Optional[str]`	`None`	The split of the data to get.
source_uids	`Optional[List[int]]`	`None`	The source uids to filter by.
x_uids	`Optional[List[str]]`	`None`	The x uids to filter by.
datasource_uids	`Optional[List[str]]`	`None`	The datasource uids to filter by.

Return type: DataFrame
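As a rough illustration of the filters, the sketch below narrows the dataset to a set of sources. The helper name `dataframe_for_sources` is hypothetical and `app` is assumed to be a `FineTuningApp`. The uids are deduplicated and cast to `int` because `source_uids` is documented as `List[int]`.

```python
from typing import Any, Iterable, Optional


def dataframe_for_sources(
    app: Any,
    source_uids: Iterable[int],
    split: Optional[str] = None,
) -> Any:
    """Fetch rows belonging to the given sources, optionally within one split.

    Illustrative helper, not part of the SDK. The return value is whatever
    app.get_dataframe returns (documented above as a DataFrame).
    """
    # Normalize to a sorted list of unique ints to match List[int].
    uids = sorted({int(uid) for uid in source_uids})
    return app.get_dataframe(split=split, source_uids=uids)
```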

get_evaluation_report

get_evaluation_report(evaluation_report_uid)

Get the evaluation report associated with the given evaluation report uid.

Parameters

Name	Type	Default	Info
evaluation_report_uid	`int`		The unique identifier for the evaluation report to retrieve.

Returns: A dictionary containing the details of the evaluation report.
Return type: Dict[str, Any]

get_ft_dataset

get_ft_dataset()

Get the fine tuning dataset associated with the fine tuning application.

Returns: The fine tuning dataset object
Return type: FTDataset

get_quality_dataset

get_quality_dataset(model_uid)

Create a QualityDataset object from a trained model’s predictions.

Parameters

Name	Type	Default	Info
model_uid	`int`		The unique identifier of the trained model.

Returns: The QualityDataset object.
Return type: QualityDataset

get_sources

get_sources()

Get the sources within the current workspace.

Returns: A list of dictionaries containing the details of the sources.
Return type: List[Dict[str, Any]]

import_data

import_data(data, split, source_uid, name=None, sync=True, refresh_datasources=True, prompt_template=None)

Import data into the fine tuning application.

Parameters

Name	Type	Default	Info
data	`Union[str, DataFrame]`		A file path or a pandas DataFrame of the data to import into the dataset.
split	`str`		The split of the data.
source_uid	`int`		The source to associate the data with for data lineage.
name	`Optional[str]`	`None`	The name of the data source.
sync	`bool`	`True`	Whether to wait for the ingestion job to complete before returning.
refresh_datasources	`bool`	`True`	Whether to refresh datasources for the downstream model node after ingestion. Can only be set if sync is True.
prompt_template	`Optional[str]`	`None`	The prompt template used when the data was generated.

Returns: The job_id of the ingestion job
Return type: str

Notes

If sync is set to False, the method returns immediately after submitting the ingestion job, and neither the datasource refresh (refresh_datasources) nor the prediction backfill is performed. To ensure all post-ingestion tasks are completed, keep sync as True (the default).
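The interaction between `sync` and `refresh_datasources` can be captured in a small wrapper. This is a sketch under assumptions: the helper name `import_split` and its `wait` flag are hypothetical, `app` is assumed to be a `FineTuningApp`, and passing `refresh_datasources=False` whenever `sync` is False is one reading of "can only be set if sync is True", not confirmed behavior.

```python
from typing import Any, Optional


def import_split(
    app: Any,
    data: Any,
    split: str,
    source_uid: int,
    *,
    name: Optional[str] = None,
    wait: bool = True,
) -> str:
    """Import data and return the ingestion job id.

    Illustrative helper, not part of the SDK. `data` is a file path or a
    pandas DataFrame, as documented for import_data.
    """
    return app.import_data(
        data,
        split,
        source_uid,
        name=name,
        sync=wait,
        # Assumption: only request a datasource refresh when we also wait,
        # since refresh_datasources is documented as requiring sync=True.
        refresh_datasources=wait,
    )
```

With `wait=False` the returned job id is the only handle on the ingestion job, and datasources stay stale until a later refresh, for example through `setup_studio()`.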

import_ground_truth

import_ground_truth(gt_df, gt_column, join_column, source_uid=None, user_format=True)

Import ground truth labels into the fine tuning dataset.

Parameters

Name	Type	Default	Info
gt_df	`DataFrame`		The ground truth labels DataFrame.
gt_column	`str`		The column in the ground truth DataFrame that contains the labels.
join_column	`str`		The column to join the gt_df and the fine tuning dataset on to associate the ground truth labels with the fine tuning dataset.
source_uid	`Optional[int]`	`None`	The source uid to associate the annotations with. Defaults to the requesting user’s source uid if not set.
user_format	`bool`	`True`	Whether the labels are in the user format or not (the label map string value vs the int value). If true, the label map will be used to convert the labels to their integer values.

Return type: None
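To make the `user_format` distinction concrete, the sketch below shows the kind of string-to-integer conversion a label map implies. It is a conceptual illustration only: `to_int_labels` and the example `label_map` are hypothetical, and the SDK's internal conversion may differ in detail.

```python
from typing import Dict, List


def to_int_labels(labels: List[str], label_map: Dict[str, int]) -> List[int]:
    """Convert user-format (string) labels to integer labels via a label map.

    Raises ValueError if any label is missing from the map, so unmapped
    values are caught before an import.
    """
    unknown = sorted(set(labels) - set(label_map))
    if unknown:
        raise ValueError(f"labels missing from label map: {unknown}")
    return [label_map[label] for label in labels]


# Hypothetical label map for a two-class label schema.
label_map = {"reject": 0, "accept": 1}
print(to_int_labels(["accept", "reject", "accept"], label_map))  # [1, 0, 1]
```

Labels left as strings correspond to `user_format=True` (the default); labels already converted to integers like this correspond to `user_format=False`.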

list_evaluation_reports

list_evaluation_reports()

List the evaluation reports associated with the fine tuning application.

Return type: List[Dict[str, Any]]

list_quality_models

list_quality_models()

List the quality models associated with the fine tuning application.

Return type: DataFrame

register_custom_metric

register_custom_metric(metric_name, metric_func, overwrite=False)

Register a user-defined metric with the FineTuningApp.

Parameters

Name	Type	Default	Info
metric_name	`str`		The display name of this metric.
metric_func	`Callable`		A python function to compute this metric.
overwrite	`Optional[bool]`	`False`	Overwrite a metric of the same name if one already exists.

Returns: The id of the registered metric.
Return type: int
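When several metrics are registered at once, it can help to keep the returned ids together. The sketch below assumes `app` is a `FineTuningApp`; the helper name `register_metrics` is hypothetical. The arguments that `metric_func` receives are not specified on this page, so the helper treats each function as an opaque callable.

```python
from typing import Any, Callable, Dict


def register_metrics(
    app: Any,
    metrics: Dict[str, Callable[..., Any]],
    overwrite: bool = False,
) -> Dict[str, int]:
    """Register each named metric function and return {metric_name: metric_id}.

    Illustrative helper, not part of the SDK. With overwrite=False the
    documented default applies, so existing metrics are not overwritten.
    """
    return {
        name: app.register_custom_metric(name, func, overwrite=overwrite)
        for name, func in metrics.items()
    }
```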

register_metric

register_metric(metric_schema)

Register a defined metric with the FineTuningApp.

Parameters

Name	Type	Default	Info
metric_schema	`MetricSchema`		A MetricSchema object.

Return type: None

register_model_source

register_model_source(model_name, metadata=None)

Register a model source with the given model name and metadata.

Parameters

Name	Type	Default	Info
model_name	`str`		The name of the model.
metadata	`Optional[ModelSourceMetadata]`	`None`	The metadata associated with the model source. If not provided, the provided model name will be used as the model name in the metadata.

Returns: The registered model source.
Return type: Dict[str, Any]

register_source

classmethod register_source(source_name, source_type, user_uid, metadata=None)

Register a source in the platform.

Parameters

Name	Type	Default	Info
source_name	`str`		The name of the source.
source_type	`SvcSourceType`		The type of the source.
user_uid	`Optional[int]`		The user uid to associate with the source.
metadata	`Optional[Dict[str, Any]]`	`None`	The metadata to associate with the source.

Returns: The created source
Return type: Dict[str, Any]

setup_studio

setup_studio()

Set up the studio for the fine tuning application. This will refresh any stale datasources associated with the fine tuning application.

Return type: None

unregister_metric

unregister_metric(metric_name)

Unregister a metric with the FineTuningApp.

Parameters

Name	Type	Default	Info
metric_name	`str`		The display name of the metric to unregister.

Return type: None

property datasource_metadata: Dict[int, Any]

Get metadata about each datasource, including details about the source.
