Version: 0.93

snorkelflow.sdk.FineTuningApp

class snorkelflow.sdk.FineTuningApp(app_uid, model_node_uid, dataset_uid, label_schema_uid, workspace_uid, fine_tuning_app_config)

Bases: object

__init__

__init__(app_uid, model_node_uid, dataset_uid, label_schema_uid, workspace_uid, fine_tuning_app_config)

Methods

__init__(app_uid, model_node_uid, ...)

create(app_name, fine_tuning_app_config)

Create a new fine tuning application with the given name and configuration. The dataset, application, and label schema will be set up for you.

create_evaluation_report(split[, ...])

Create an evaluation report for the quality dataset.

delete()

Delete the fine tuning application.

get(application)

Initialize a FineTuningApp object from an existing fine tuning application previously created with the SDK.

get_annotation_batches()

Get the annotation batches associated with the entire fine tuning dataset and label schema.

get_dataframe([split, source_uids, x_uids, ...])

Get data from the dataset associated with the fine tuning application, with the given filters applied.

get_evaluation_report(evaluation_report_uid)

Get the evaluation report associated with the given evaluation report uid.

get_ft_dataset()

Get the fine tuning dataset associated with the fine tuning application.

get_quality_dataset(model_uid)

Create a QualityDataset object from a trained model's predictions.

import_data(data, split, source_uid[, name, ...])

Import data into the fine tuning application.

import_ground_truth(gt_df, gt_column, ...[, ...])

Import ground truth labels into the fine tuning dataset.

list_evaluation_reports()

List the evaluation reports associated with the fine tuning application.

list_quality_models()

List the quality models associated with the fine tuning application.

register_model_source(model_name[, metadata])

Register a model source with the given model name and metadata.

register_source(source_name, source_type, ...)

Register a source in the platform.

setup_studio()

Set up the studio for the fine tuning application.

Attributes

datasource_metadata

Get metadata about each datasource, including details about the source.

create

classmethod create(app_name, fine_tuning_app_config)

Create a new fine tuning application with the given name and configuration. The dataset, application, and label schema will be set up for you.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| app_name | str | | The name of the fine tuning application. |
| fine_tuning_app_config | FineTuningAppConfig | | The configuration of the fine tuning application. |

Returns

The fine tuning application object

Return type

FineTuningApp

create_evaluation_report

create_evaluation_report(split, quality_models=None, finetuned_model_sources=None, slices=None)

Create an evaluation report for the quality dataset.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| split | str | | The split of the data to evaluate. |
| quality_models | Union[List[str], List[int], None] | None | The quality models to evaluate (if not provided, the committed quality model or the most recently trained model will be used, in that order). |
| finetuned_model_sources | Union[List[str], List[int], None] | None | The finetuned model sources to evaluate (if not provided, all finetuned models associated with the datasources will be used). |
| slices | Union[List[str], List[int], None] | None | The slices to evaluate (if not provided, all slices in the given dataset will be evaluated). |

Returns

A dictionary containing the evaluation results

Return type

Dict[str, Any]
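
As a hedged usage sketch, the helper below calls create_evaluation_report once per split and collects the returned dictionaries. It expects an already-constructed FineTuningApp passed in as `app`. The helper name `evaluate_splits` and any split names you pass are assumptions for illustration and are not part of the SDK.

```python
from typing import Any, Dict, Iterable


def evaluate_splits(app: Any, splits: Iterable[str]) -> Dict[str, Dict[str, Any]]:
    """Create one evaluation report per split (hypothetical helper).

    `app` is expected to be a FineTuningApp. quality_models,
    finetuned_model_sources, and slices are left at None, so the
    documented defaults apply.
    """
    reports: Dict[str, Dict[str, Any]] = {}
    for split in splits:
        reports[split] = app.create_evaluation_report(split)
    return reports
```

Keeping the optional arguments at their defaults means the model and slice selection follows the fallback rules described in the parameter table; pass them explicitly if you need a fixed comparison set.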

delete

delete()

Delete the fine tuning application. The dataset must be deleted separately.

Return type

None

get

classmethod get(application)

Initialize a FineTuningApp object from an existing fine tuning application previously created with the SDK.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| application | Union[str, int] | | The name or uid of the fine tuning application. |

Returns

The fine tuning application object

Return type

FineTuningApp

get_annotation_batches

get_annotation_batches()

Get the annotation batches associated with the entire fine tuning dataset and label schema.

Return type

List[Batch]

get_dataframe

get_dataframe(split=None, source_uids=None, x_uids=None, datasource_uids=None)

Get data from the dataset associated with the fine tuning application, with the given filters applied.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| split | Optional[str] | None | The split of the data to get. |
| source_uids | Optional[List[int]] | None | The source uids to filter by. |
| x_uids | Optional[List[str]] | None | The x uids to filter by. |
| datasource_uids | Optional[List[str]] | None | The datasource uids to filter by. |

Return type

DataFrame
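
A minimal sketch of one way to use the `split` filter: the helper below fetches each split separately and reports its row count. It assumes `app` is a FineTuningApp and relies only on the documented `get_dataframe(split=...)` call; the helper name `rows_per_split` is hypothetical.

```python
from typing import Any, Dict, Iterable


def rows_per_split(app: Any, splits: Iterable[str]) -> Dict[str, int]:
    """Return the number of rows in each split (hypothetical helper).

    len() of a pandas DataFrame is its row count, so no pandas import
    is needed here.
    """
    return {split: len(app.get_dataframe(split=split)) for split in splits}
```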

get_evaluation_report

get_evaluation_report(evaluation_report_uid)

Get the evaluation report associated with the given evaluation report uid.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| evaluation_report_uid | int | | The unique identifier for the evaluation report to retrieve. |

Returns

A dictionary containing the details of the evaluation report.

Return type

Dict[str, Any]

get_ft_dataset

get_ft_dataset()

Get the fine tuning dataset associated with the fine tuning application.

Returns

The fine tuning dataset object

Return type

FTDataset

get_quality_dataset

get_quality_dataset(model_uid)

Create a QualityDataset object from a trained model’s predictions.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| model_uid | int | | The unique identifier of the trained model. |

Returns

The QualityDataset object.

Return type

QualityDataset

import_data

import_data(data, split, source_uid, name=None, sync=True, refresh_datasources=True)

Import data into the fine tuning application.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| data | Union[str, DataFrame] | | A file path or a pandas DataFrame of the data to import into the dataset. |
| split | str | | The split of the data. |
| source_uid | int | | The source to associate the data with for data lineage. |
| name | Optional[str] | None | The name of the data source. |
| sync | bool | True | Whether to wait for the ingestion job to complete before returning. |
| refresh_datasources | bool | True | Whether to refresh datasources for the downstream model node after ingestion. Can only be set if sync is True. |

Returns

The job_id of the ingestion job

Return type

str
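
Because refresh_datasources can only be set if sync is True, one cautious pattern is to drive both flags from a single argument. The sketch below does that; it assumes `app` is a FineTuningApp, and the helper name `import_split` and its `wait` parameter are hypothetical. How the SDK reacts to an inconsistent combination is not stated here, so this sketch simply avoids it.

```python
from typing import Any


def import_split(app: Any, data: Any, split: str, source_uid: int, wait: bool = True) -> str:
    """Import data and return the ingestion job id (hypothetical helper).

    Both flags follow `wait`, so refresh_datasources is only True
    when sync is also True.
    """
    return app.import_data(
        data,
        split,
        source_uid,
        sync=wait,
        refresh_datasources=wait,
    )
```

With `wait=False` the call returns without waiting for ingestion and skips the datasource refresh; setup_studio() is documented as refreshing stale datasources and may be a way to catch up later.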

import_ground_truth

import_ground_truth(gt_df, gt_column, join_column, source_uid=None, user_format=True)

Import ground truth labels into the fine tuning dataset.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| gt_df | DataFrame | | The ground truth labels DataFrame. |
| gt_column | str | | The column in the ground truth DataFrame that contains the labels. |
| join_column | str | | The column to join the gt_df and the fine tuning dataset on to associate the ground truth labels with the fine tuning dataset. |
| source_uid | Optional[int] | None | The source uid to associate the annotations with. |
| user_format | bool | True | Whether the labels are in the user format or not (the label map string value vs the int value). If True, the label map will be used to convert the labels to their integer values. |

Return type

None
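
To make the user_format distinction concrete, the sketch below shows the kind of string-to-integer conversion that a label map implies. It illustrates the idea only and is not the SDK's internal implementation; the function name `to_int_labels` and the label map values are hypothetical.

```python
from typing import Dict, List


def to_int_labels(labels: List[str], label_map: Dict[str, int]) -> List[int]:
    """Map string labels to integer values, failing loudly on unknown labels."""
    unknown = sorted(set(labels) - set(label_map))
    if unknown:
        raise ValueError(f"Labels missing from label map: {unknown}")
    return [label_map[label] for label in labels]


# Hypothetical label map; real values come from the app's label schema.
label_map = {"reject": 0, "accept": 1}
print(to_int_labels(["accept", "reject", "accept"], label_map))  # prints [1, 0, 1]
```

If your ground truth column already holds the integer values, the parameter table suggests passing user_format=False so no conversion is attempted.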

list_evaluation_reports

list_evaluation_reports()

List the evaluation reports associated with the fine tuning application.

Return type

List[Dict[str, Any]]

list_quality_models

list_quality_models()

List the quality models associated with the fine tuning application.

Return type

DataFrame

register_model_source

register_model_source(model_name, metadata=None)

Register a model source with the given model name and metadata.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| model_name | str | | The name of the model. |
| metadata | Optional[ModelSourceMetadata] | None | The metadata associated with the model source. If not provided, the provided model name will be used as the model name in the metadata. |

Returns

The registered model source.

Return type

Dict[str, Any]

register_source

classmethod register_source(source_name, source_type, user_uid, metadata=None)

Register a source in the platform.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| source_name | str | | The name of the source. |
| source_type | SvcSourceType | | The type of the source. |
| user_uid | Optional[int] | | The user uid to associate with the source. |
| metadata | Optional[Dict[str, Any]] | None | The metadata to associate with the source. |

Returns

The created source

Return type

Dict[str, Any]

setup_studio

setup_studio()

Set up the studio for the fine tuning application. This will refresh any stale datasources associated with the fine tuning application.

Return type

None

property datasource_metadata: Dict[int, Any]

Get metadata about each datasource, including details about the source.