Version: 25.3

snorkelflow.sdk.FineTuningApp

class snorkelflow.sdk.FineTuningApp(app_uid, model_node_uid, dataset_uid, label_schema_uid, workspace_uid, fine_tuning_app_config)

Bases: object

__init__

__init__(app_uid, model_node_uid, dataset_uid, label_schema_uid, workspace_uid, fine_tuning_app_config)

Methods

__init__(app_uid, model_node_uid, ...)

create(app_name, fine_tuning_app_config)

Create a new fine tuning application with the given name and configuration. The dataset, application, and label schema will be set up for you.

create_evaluation_report([split, ...])

Create an evaluation report for the quality dataset.

delete()

Delete the fine tuning application.

get(application)

Initialize a FineTuningApp object from an existing fine tuning application previously created with the SDK.

get_annotation_batches()

Get the annotation batches associated with the entire fine tuning dataset and label schema.

get_dataframe([split, source_uids, x_uids, ...])

Get data from the dataset associated with the fine tuning application, with the given filters applied.

get_evaluation_report(evaluation_report_uid)

Get the evaluation report associated with the given evaluation report uid.

get_ft_dataset()

Get the fine tuning dataset associated with the fine tuning application.

get_quality_dataset(model_uid)

Create a QualityDataset object from a trained model's predictions.

get_sources()

Get the sources within the current workspace.

import_data(data, split, source_uid[, name, ...])

Import data into the fine tuning application.

import_ground_truth(gt_df, gt_column, ...[, ...])

Import ground truth labels into the fine tuning dataset.

list_evaluation_reports()

List the evaluation reports associated with the fine tuning application.

list_quality_models()

List the quality models associated with the fine tuning application.

register_custom_metric(metric_name, metric_func)

Register a user-defined metric with the FineTuningApp.

register_metric(metric_schema)

Register a defined metric with the FineTuningApp.

register_model_source(model_name[, metadata])

Register a model source with the given model name and metadata.

register_source(source_name, source_type, ...)

Register a source in the platform.

setup_studio()

Set up the studio for the fine tuning application.

unregister_metric(metric_name)

Unregister a metric with the FineTuningApp.

Attributes

datasource_metadata

Get metadata about each datasource, including details about the source.

create

classmethod create(app_name, fine_tuning_app_config)

Create a new fine tuning application with the given name and configuration. The dataset, application, and label schema will be set up for you.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| app_name | str | | The name of the fine tuning application. |
| fine_tuning_app_config | FineTuningAppConfig | | The configuration of the fine tuning application. |

Returns

The fine tuning application object

Return type

FineTuningApp

create_evaluation_report

create_evaluation_report(split=None, quality_models=None, finetuned_model_sources=None, slices=None)

Create an evaluation report for the quality dataset.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| split | Optional[str] | None | The split of the data to evaluate (if not provided, metrics will be computed for all splits). |
| quality_models | Union[List[str], List[int], None] | None | The quality models to evaluate (if not provided, the committed quality model or the most recently trained model will be used, in that order). |
| finetuned_model_sources | Union[List[str], List[int], None] | None | The finetuned model sources to evaluate (if not provided, all finetuned models associated with the datasources will be used). |
| slices | Union[List[str], List[int], None] | None | The slices to evaluate (if not provided, all slices in the given dataset will be evaluated). |

Returns

A dictionary containing the evaluation results

Return type

Dict[str, Any]
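A minimal sketch of calling this method once per split, assuming `app` is a FineTuningApp obtained elsewhere (for example via `FineTuningApp.get`). The helper name `evaluate_splits` and the split names in the usage comment are illustrative and not part of the SDK.

```python
from typing import Any, Dict, List, Optional


def evaluate_splits(
    app: Any,
    splits: List[str],
    slices: Optional[List[str]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Create one evaluation report per split, keyed by split name.

    `app` is expected to expose create_evaluation_report() as documented
    above. quality_models and finetuned_model_sources are left unset so the
    method falls back to its documented defaults.
    """
    reports: Dict[str, Dict[str, Any]] = {}
    for split in splits:
        reports[split] = app.create_evaluation_report(split=split, slices=slices)
    return reports


# Usage (hypothetical app name and split names):
# app = FineTuningApp.get("my-ft-app")
# reports = evaluate_splits(app, ["train", "valid"])
```

Passing `slices=None` keeps the documented behavior of evaluating all slices in the dataset.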

delete

delete()

Delete the fine tuning application. The dataset must be deleted separately.

Return type

None

get

classmethod get(application)

Initialize a FineTuningApp object from an existing fine tuning application previously created with the SDK.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| application | Union[str, int] | | The name or uid of the fine tuning application. |

Returns

The fine tuning application object

Return type

FineTuningApp

get_annotation_batches

get_annotation_batches()

Get the annotation batches associated with the entire fine tuning dataset and label schema.

Return type

List[Batch]

get_dataframe

get_dataframe(split=None, source_uids=None, x_uids=None, datasource_uids=None)

Get data from the dataset associated with the fine tuning application, with the given filters applied.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| split | Optional[str] | None | The split of the data to get. |
| source_uids | Optional[List[int]] | None | The source uids to filter by. |
| x_uids | Optional[List[str]] | None | The x uids to filter by. |
| datasource_uids | Optional[List[str]] | None | The datasource uids to filter by. |

Return type

DataFrame
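As an illustration of the filters, the sketch below counts rows per source by requesting one source uid at a time. It assumes `app` is an existing FineTuningApp; `rows_per_source` is a hypothetical helper name, and the uids in the usage comment are placeholders.

```python
from typing import Any, Dict, List, Optional


def rows_per_source(
    app: Any,
    source_uids: List[int],
    split: Optional[str] = None,
) -> Dict[int, int]:
    """Return the number of rows get_dataframe() yields for each source uid."""
    counts: Dict[int, int] = {}
    for uid in source_uids:
        # Filter by a single source; split=None leaves the split unfiltered.
        df = app.get_dataframe(split=split, source_uids=[uid])
        counts[uid] = len(df)
    return counts


# Usage (hypothetical source uids):
# counts = rows_per_source(app, [1, 2, 3], split="train")
```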

get_evaluation_report

get_evaluation_report(evaluation_report_uid)

Get the evaluation report associated with the given evaluation report uid.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| evaluation_report_uid | int | | The unique identifier for the evaluation report to retrieve. |

Returns

A dictionary containing the details of the evaluation report.

Return type

Dict[str, Any]
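A short sketch of retrieving several reports by uid, assuming `app` is an existing FineTuningApp and the uids are already known. `fetch_reports` is a hypothetical helper name; the structure of each returned dictionary is not specified here, so the sketch treats it as opaque.

```python
from typing import Any, Dict, Iterable


def fetch_reports(app: Any, report_uids: Iterable[int]) -> Dict[int, Dict[str, Any]]:
    """Fetch each evaluation report and key it by its uid."""
    return {uid: app.get_evaluation_report(uid) for uid in report_uids}


# Usage (hypothetical uids):
# reports = fetch_reports(app, [101, 102])
```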

get_ft_dataset

get_ft_dataset()

Get the fine tuning dataset associated with the fine tuning application.

Returns

The fine tuning dataset object

Return type

FTDataset

get_quality_dataset

get_quality_dataset(model_uid)

Create a QualityDataset object from a trained model’s predictions.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| model_uid | int | | The unique identifier of the trained model. |

Returns

The QualityDataset object.

Return type

QualityDataset

get_sources

get_sources()

Get the sources within the current workspace.

Returns

A list of dictionaries containing the details of the sources.

Return type

List[Dict[str, Any]]

import_data

import_data(data, split, source_uid, name=None, sync=True, refresh_datasources=True, prompt_template=None)

Import data into the fine tuning application.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| data | Union[str, DataFrame] | | A file path or a pandas DataFrame of the data to import into the dataset. |
| split | str | | The split of the data. |
| source_uid | int | | The source to associate the data with for data lineage. |
| name | Optional[str] | None | The name of the data source. |
| sync | bool | True | Whether to wait for the ingestion job to complete before returning. |
| refresh_datasources | bool | True | Whether to refresh datasources for the downstream model node after ingestion. Can only be set if sync is True. |
| prompt_template | Optional[str] | None | The prompt template used when the data was generated. |

Returns

The job_id of the ingestion job

Return type

str

Notes

If sync is set to False, the method returns immediately after submitting the ingestion job, and neither the datasource refresh (refresh_datasources) nor the prediction backfill is performed. To ensure all post-ingestion tasks are completed, keep sync as True (the default).
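The interaction between sync and refresh_datasources can be sketched as a small wrapper. This assumes `app` is an existing FineTuningApp; `import_split` is a hypothetical helper name. Whether the SDK requires refresh_datasources=False to be passed explicitly when sync is False is not stated here, so the sketch passes it explicitly as a conservative assumption.

```python
from typing import Any


def import_split(app: Any, data: Any, split: str, source_uid: int, wait: bool = True) -> str:
    """Submit an ingestion job and return its job_id.

    With wait=True the call blocks and post-ingestion tasks run. With
    wait=False the call returns right after submission and those tasks
    are skipped.
    """
    if wait:
        return app.import_data(
            data, split, source_uid, sync=True, refresh_datasources=True
        )
    # Assumption: explicitly disable the refresh, since it can only be set
    # when sync is True.
    return app.import_data(
        data, split, source_uid, sync=False, refresh_datasources=False
    )


# Usage (hypothetical file path, split name, and source uid):
# job_id = import_split(app, "responses.csv", "train", source_uid=3)
```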

import_ground_truth

import_ground_truth(gt_df, gt_column, join_column, source_uid=None, user_format=True)

Import ground truth labels into the fine tuning dataset.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| gt_df | DataFrame | | The ground truth labels DataFrame. |
| gt_column | str | | The column in the ground truth DataFrame that contains the labels. |
| join_column | str | | The column to join the gt_df and the fine tuning dataset on to associate the ground truth labels with the fine tuning dataset. |
| source_uid | Optional[int] | None | The source uid to associate the annotations with. Defaults to the requesting user’s source uid if not set. |
| user_format | bool | True | Whether the labels are in the user format or not (the label map string value vs the int value). If true, the label map will be used to convert the labels to their integer values. |

Return type

None
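To make the user_format distinction concrete, the sketch below shows the kind of string-to-integer conversion a label map implies, followed by a thin wrapper around the import call. The label map contents, `to_int_labels`, and `import_labels` are hypothetical; the SDK performs its own conversion when user_format is True, and this is not its implementation.

```python
from typing import Any, Dict, List


def to_int_labels(labels: List[str], label_map: Dict[str, int]) -> List[int]:
    """Convert user-format (string) labels to integer labels via a label map."""
    unknown = sorted(set(labels) - set(label_map))
    if unknown:
        raise ValueError(f"Labels missing from label map: {unknown}")
    return [label_map[label] for label in labels]


def import_labels(app: Any, gt_df: Any, gt_column: str, join_column: str,
                  already_int: bool = False) -> None:
    """Import ground truth, telling the SDK whether labels are already ints."""
    app.import_ground_truth(
        gt_df,
        gt_column=gt_column,
        join_column=join_column,
        user_format=not already_int,
    )


# Usage (hypothetical label map and column names):
# to_int_labels(["good", "bad"], {"bad": 0, "good": 1})
# import_labels(app, gt_df, gt_column="label", join_column="prompt_id")
```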

list_evaluation_reports

list_evaluation_reports()

List the evaluation reports associated with the fine tuning application.

Return type

List[Dict[str, Any]]

list_quality_models

list_quality_models()

List the quality models associated with the fine tuning application.

Return type

DataFrame

register_custom_metric

register_custom_metric(metric_name, metric_func, overwrite=False)

Register a user-defined metric with the FineTuningApp.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| metric_name | str | | The display name of this metric. |
| metric_func | Callable | | A python function to compute this metric. |
| overwrite | Optional[bool] | False | Overwrite a metric of the same name if one already exists. |

Returns

id of the registered metric.

Return type

int
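A hedged sketch of registering a user-defined metric. This reference does not specify which arguments the SDK passes to metric_func, so the two-argument signature of `exact_match` below is purely an assumption for illustration; `register_exact_match` is likewise a hypothetical helper.

```python
from typing import Any


def exact_match(response: str, reference: str) -> float:
    """Return 1.0 when the two strings match after trimming whitespace.

    Assumption: the (response, reference) signature is illustrative only;
    the real metric_func contract is not documented in this section.
    """
    return float(response.strip() == reference.strip())


def register_exact_match(app: Any, name: str = "exact_match") -> int:
    """Register the metric, replacing any existing metric of the same name."""
    return app.register_custom_metric(name, exact_match, overwrite=True)


# Usage (hypothetical):
# metric_id = register_exact_match(app)
```

Setting overwrite=True makes repeated registration under the same name replace the earlier metric, which is convenient while iterating on the function.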

register_metric

register_metric(metric_schema)

Register a defined metric with the FineTuningApp.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| metric_schema | MetricSchema | | A MetricSchema object. |

Return type

None

register_model_source

register_model_source(model_name, metadata=None)

Register a model source with the given model name and metadata.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| model_name | str | | The name of the model. |
| metadata | Optional[ModelSourceMetadata] | None | The metadata associated with the model source. If not provided, the provided model name will be used as the model name in the metadata. |

Returns

The registered model source.

Return type

Dict[str, Any]

register_source

classmethod register_source(source_name, source_type, user_uid, metadata=None)

Register a source in the platform.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| source_name | str | | The name of the source. |
| source_type | SvcSourceType | | The type of the source. |
| user_uid | Optional[int] | | The user uid to associate with the source. |
| metadata | Optional[Dict[str, Any]] | None | The metadata to associate with the source. |

Returns

The created source

Return type

Dict[str, Any]

setup_studio

setup_studio()

Set up the studio for the fine tuning application. This will refresh any stale datasources associated with the fine tuning application.

Return type

None

unregister_metric

unregister_metric(metric_name)

Unregister a metric with the FineTuningApp.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| metric_name | str | | The display name of the metric to unregister. |

Return type

None

property datasource_metadata: Dict[int, Any]

Get metadata about each datasource, including details about the source.
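Because datasource_metadata is a property rather than a method, it is read without call parentheses. The sketch below lists its entries in uid order, assuming `app` is an existing FineTuningApp; `list_datasources` is a hypothetical helper, and the shape of each metadata value is not specified here.

```python
from typing import Any, List, Tuple


def list_datasources(app: Any) -> List[Tuple[int, Any]]:
    """Return (datasource uid, metadata) pairs sorted by uid."""
    # Property access: no parentheses after datasource_metadata.
    return sorted(app.datasource_metadata.items(), key=lambda item: item[0])


# Usage (hypothetical):
# for uid, meta in list_datasources(app):
#     print(uid, meta)
```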