Version: 25.9

snorkelai.sdk.develop.BenchmarkExecution

final class snorkelai.sdk.develop.BenchmarkExecution(benchmark_uid, benchmark_execution_uid, name, created_at, created_by, archived)

Bases: Base

Represents a single execution run of a benchmark for a dataset.

A benchmark execution exports comprehensive evaluation data: per-datapoint scores (evaluator outputs, rationales, and ground truth agreement), slice membership, and benchmark and execution metadata such as timing information and execution context.

__init__

__init__(benchmark_uid, benchmark_execution_uid, name, created_at, created_by, archived)

Parameters

Name	Type	Info
benchmark_uid	`int`	The unique identifier of the parent Benchmark. The `benchmark_uid` is visible in the URL of the benchmark page in the Snorkel GUI. For example, `https://YOUR-SNORKEL-INSTANCE/benchmarks/100/` indicates a benchmark with `benchmark_uid` of `100`.
benchmark_execution_uid	`int`	The unique identifier for this execution.
name	`str`	The name of the execution.
created_at	`datetime`	Timestamp of when this execution was run.
created_by	`str`	Username of the user who ran this execution.
archived	`bool`	Whether this execution is archived.
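
Because the `benchmark_uid` appears in the benchmark page URL, it can be read from a copied URL rather than typed by hand. The following is a minimal sketch using only the Python standard library. The helper name `benchmark_uid_from_url` is hypothetical and not part of the SDK, and it assumes the URL path follows the `/benchmarks/<uid>/` pattern shown in the table above.

```python
from urllib.parse import urlparse


def benchmark_uid_from_url(url: str) -> int:
    """Extract the benchmark UID from a Snorkel benchmark page URL.

    Hypothetical helper (not part of the SDK). Assumes the path contains
    a `benchmarks` segment immediately followed by the integer UID.
    """
    # Split the path into non-empty segments, e.g. ["benchmarks", "100"].
    segments = [s for s in urlparse(url).path.split("/") if s]
    try:
        position = segments.index("benchmarks")
        return int(segments[position + 1])
    except (ValueError, IndexError) as exc:
        raise ValueError(f"No benchmark UID found in URL: {url!r}") from exc


uid = benchmark_uid_from_url("https://YOUR-SNORKEL-INSTANCE/benchmarks/100/")
print(uid)  # 100
```

The resulting integer can then be passed wherever a `benchmark_uid` argument is expected.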

Methods

`__init__`(benchmark_uid, ...)
`create`(benchmark_uid[, name, criteria_uids, ...])	Create a benchmark execution.
`delete`(benchmark_uid, benchmark_execution_uid)	Delete (archive) a benchmark execution.
`export`(filepath[, config, connector_config_uid])	Export information associated with this benchmark execution.
`get`(benchmark_uid, benchmark_execution_uid)	Get a benchmark execution by its unique identifier.
`list`(benchmark_uid[, include_archived])	List all benchmark executions for a given benchmark.
`update`(archived)	Update the state of the benchmark execution.

Attributes

`archived`	Return whether the benchmark execution is archived
`benchmark_execution_uid`	Return the UID of the benchmark execution
`benchmark_uid`	Return the UID of the parent benchmark
`created_at`	Return the timestamp when the benchmark execution was created
`created_by`	Return the username of the user who created the benchmark execution
`name`	Return the name of the benchmark execution
`uid`	Return the UID of the benchmark execution

create

classmethod create(benchmark_uid, name=None, criteria_uids=None, datasource_uids=None, splits=None)

Create a benchmark execution.

Returns: The created benchmark execution.
Return type: BenchmarkExecution

Parameters

Name	Type	Default	Info
benchmark_uid	`int`		The unique identifier of the benchmark to create an execution for.
name	`Optional[str]`	`None`	Optional name for the benchmark execution.
criteria_uids	`Optional[List[int]]`	`None`	List of criteria UIDs to include in the execution.
datasource_uids	`Optional[List[int]]`	`None`	List of datasource UIDs to include in the execution.
splits	`Optional[List[str]]`	`None`	List of splits to include in the execution.

Example

from snorkelai.sdk.develop import BenchmarkExecution

# create() returns the newly created BenchmarkExecution.
execution = BenchmarkExecution.create(benchmark_uid=123, name="Test Execution", datasource_uids=[1, 2, 3], splits=["train", "test"])

delete

classmethod delete(benchmark_uid, benchmark_execution_uid)

Delete (archive) a benchmark execution.

This performs a soft delete by archiving the benchmark execution. Hard deletion is not supported.

Raises: ValueError – If the benchmark execution is not found.
Return type: None

Parameters

Name	Type	Default	Info
benchmark_uid	`int`		The unique identifier of the benchmark.
benchmark_execution_uid	`int`		The unique identifier of the benchmark execution to delete.

Example

from snorkelai.sdk.develop import BenchmarkExecution
BenchmarkExecution.delete(benchmark_uid=123, benchmark_execution_uid=456)

export

export(filepath, config=None, connector_config_uid=None)

Export information associated with this benchmark execution. The exported data includes:

- Benchmark metadata for the associated benchmark
- Execution metadata for this execution
- The evaluation score of each datapoint, which includes:
  - The evaluator outputs
  - Rationale
  - Agreement with ground truth
- The slice membership(s) of each datapoint
- (CSV exports only) Uploaded user columns and ground truth

The export includes all datapoints without filtering or sampling. Some datapoints may have missing evaluation scores if the benchmark was not executed against them (for example, datapoints in the test split).

Return type: None

Parameters

Name	Type	Default	Info
filepath	`str`		The filepath where you want to write the exported data.
config	`Union[JsonExportConfig, CsvExportConfig, None]`	`None`	A `JsonExportConfig` or `CsvExportConfig` object. Defaults to JSON. No additional configuration is required for JSON exports. For CSV exports, the following parameters are supported: `sep`: The separator between columns. Default is `,`. `quotechar`: The character used to quote fields. Default is `"`. `escapechar`: The character used to escape special characters. Default is `\`.
connector_config_uid	`Optional[int]`	`None`	Optional UID of the connector config to use for the export. Required only if the export destination is a remote, private bucket (a private S3 or GCS bucket that requires credentials). Ignored if the export destination is a public bucket (a public S3 or GCS bucket that does not require credentials) or if the export destination is a local file.
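
The three CSV parameters (`sep`, `quotechar`, `escapechar`) correspond to common CSV dialect settings, so a CSV export can be read back with matching settings. The sketch below uses Python's standard `csv` module and does not call the SDK. The helper name `read_exported_csv`, the sample text, and its column names are invented for illustration and do not reflect the actual export schema.

```python
import csv
import io
from typing import List


def read_exported_csv(text: str, sep: str = ",", quotechar: str = '"',
                      escapechar: str = "\\") -> List[List[str]]:
    """Parse CSV text using the same dialect settings chosen for the export.

    Hypothetical helper: the keyword defaults mirror the documented defaults
    of `sep`, `quotechar`, and `escapechar`.
    """
    reader = csv.reader(
        io.StringIO(text),
        delimiter=sep,          # `sep` maps to the csv module's `delimiter`
        quotechar=quotechar,
        escapechar=escapechar,
    )
    return list(reader)


# Invented sample; a quoted field may contain the separator.
sample = 'datapoint,rationale\n1,"Clear, concise answer"\n'
rows = read_exported_csv(sample)
print(rows)  # [['datapoint', 'rationale'], ['1', 'Clear, concise answer']]
```

If the export was written with a different `sep` (for example a tab), pass the same value when reading so that the columns split correctly.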

Examples

Example 1

Export a benchmark execution to a local file:

from snorkelai.sdk.develop import Benchmark

benchmark = Benchmark.get(100)
execution = benchmark.list_executions()[0]
execution.export("benchmark_execution.json")

Example 2

Export a benchmark execution to an S3 bucket using a connector config:

from snorkelai.sdk.develop import Benchmark

benchmark = Benchmark.get(100)
execution = benchmark.list_executions()[0]
execution.export("s3://MY-BUCKET/MY-PATH/benchmark_execution.json", connector_config_uid=1)

get

classmethod get(benchmark_uid, benchmark_execution_uid)

Get a benchmark execution by its unique identifier.

Returns: The requested benchmark execution.
Return type: BenchmarkExecution
Raises: ValueError – If the benchmark execution is not found.

Parameters

Name	Type	Default	Info
benchmark_uid	`int`		The unique identifier of the benchmark.
benchmark_execution_uid	`int`		The unique identifier of the benchmark execution.

Example

from snorkelai.sdk.develop import BenchmarkExecution

BenchmarkExecution.get(benchmark_uid=123, benchmark_execution_uid=456)

list

static list(benchmark_uid, include_archived=False)

List all benchmark executions for a given benchmark.

Return type: List[BenchmarkExecution]

Parameters

Name	Type	Default	Info
benchmark_uid	`int`		The unique identifier of the parent Benchmark. The `benchmark_uid` is visible in the URL of the benchmark page in the Snorkel GUI. For example, `https://YOUR-SNORKEL-INSTANCE/benchmarks/100/` indicates a benchmark with `benchmark_uid` of `100`.
include_archived	`bool`	`False`	Whether to include archived executions. Defaults to False.
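
A common follow-up to `list` is picking one execution out of the returned list. The sketch below shows one way to select the most recent non-archived execution using the documented `archived` and `created_at` attributes. To stay runnable without a Snorkel instance it operates on stand-in objects; the helper name `latest_active_execution` and the sample values are hypothetical.

```python
from datetime import datetime
from types import SimpleNamespace
from typing import Optional, Sequence


def latest_active_execution(executions: Sequence) -> Optional[object]:
    """Return the most recently created non-archived execution, or None."""
    active = [e for e in executions if not e.archived]
    return max(active, key=lambda e: e.created_at, default=None)


# Stand-ins exposing only the attributes used here (not real SDK objects).
executions = [
    SimpleNamespace(benchmark_execution_uid=1, created_at=datetime(2025, 1, 5), archived=False),
    SimpleNamespace(benchmark_execution_uid=2, created_at=datetime(2025, 3, 1), archived=True),
    SimpleNamespace(benchmark_execution_uid=3, created_at=datetime(2025, 2, 10), archived=False),
]

latest = latest_active_execution(executions)
print(latest.benchmark_execution_uid)  # 3
```

With the SDK available, the input would instead come from a call such as `BenchmarkExecution.list(benchmark_uid=100, include_archived=True)`.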

update

update(archived)

Update the state of the benchmark execution.

Return type: None

Parameters

Name	Type	Default	Info
archived	`bool`		Whether the benchmark execution should be archived.
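
Since `update` takes the desired `archived` state, it can presumably be used both to archive an execution and to restore one. The sketch below applies a target state to a batch of executions and skips those already in that state. It runs against a stand-in class rather than the SDK: `FakeExecution` and `set_archived` are hypothetical names, and the stand-in assumes that `update` is reflected in the `archived` property afterwards, which the real class may or may not do.

```python
from typing import Iterable


class FakeExecution:
    """Stand-in mimicking only the documented `archived` property and `update` method."""

    def __init__(self, uid: int, archived: bool = False) -> None:
        self.uid = uid
        self._archived = archived

    @property
    def archived(self) -> bool:
        return self._archived

    def update(self, archived: bool) -> None:
        # Assumption: the local state mirrors the requested state after update.
        self._archived = archived


def set_archived(executions: Iterable, archived: bool) -> int:
    """Call update() on each execution whose state differs; return how many changed."""
    changed = 0
    for execution in executions:
        if execution.archived != archived:
            execution.update(archived=archived)
            changed += 1
    return changed


runs = [FakeExecution(1), FakeExecution(2, archived=True)]
changed = set_archived(runs, archived=True)
print(changed)  # 1
```

The same helper with `archived=False` would request that archived executions be restored.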

property archived: bool

Return whether the benchmark execution is archived.

property benchmark_execution_uid: int

Return the UID of the benchmark execution.

property benchmark_uid: int

Return the UID of the parent benchmark.

property created_at: datetime

Return the timestamp when the benchmark execution was created.

property created_by: str

Return the username of the user who created the benchmark execution.

property name: str

Return the name of the benchmark execution.

property uid: int

Return the UID of the benchmark execution.
