Version: 25.6

snorkelai.sdk.develop.BenchmarkExecution

class snorkelai.sdk.develop.BenchmarkExecution(*args, **kwargs)

Bases: BaseModel

Represents a single execution run of a benchmark for a dataset.

A benchmark execution exports comprehensive evaluation data: per-datapoint scores (evaluator outputs, rationales, and ground truth agreement), slice membership, and benchmark and execution metadata, including timing information and execution context.

Parameters

Name | Type | Default | Info
benchmark_uid | int | | The unique identifier of the parent Benchmark. The benchmark_uid is visible in the URL of the benchmark page in the Snorkel GUI. For example, https://YOUR-SNORKEL-INSTANCE/benchmarks/100/ indicates a benchmark with benchmark_uid of 100.
benchmark_execution_uid | int | | The unique identifier for this execution.
name | str | | The name of the execution.
created_at | datetime | | Timestamp of when this execution was run.
created_by | str | | Username of the user who ran this execution.
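The fields above are enough to pick one run out of a list of executions, for example the most recent one. The following is a minimal sketch under stated assumptions: `ExecutionRecord` is a hypothetical stand-in that mirrors the five documented fields and is not an SDK class, `latest_execution` is a hypothetical helper, and the sample values are invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ExecutionRecord:
    """Hypothetical stand-in mirroring the documented fields; not an SDK class."""

    benchmark_uid: int
    benchmark_execution_uid: int
    name: str
    created_at: datetime
    created_by: str


def latest_execution(executions):
    """Return the execution with the most recent created_at timestamp."""
    if not executions:
        raise ValueError("no executions to choose from")
    return max(executions, key=lambda e: e.created_at)


# Illustrative sample data only.
runs = [
    ExecutionRecord(100, 1, "baseline", datetime(2025, 1, 5, 9, 0), "alice"),
    ExecutionRecord(100, 2, "prompt-v2", datetime(2025, 2, 1, 14, 30), "bob"),
]
newest = latest_execution(runs)
print(newest.benchmark_execution_uid, newest.name)
```

Selecting by `created_at` rather than by list position avoids relying on any particular ordering of the list, which this page does not specify.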

__init__

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)
export(filepath[, config, connector_config_uid]) | Export information associated with this benchmark execution.

Attributes

benchmark_uid
benchmark_execution_uid
name
created_at
created_by

export

export(filepath, config=None, connector_config_uid=None)

Export information associated with this benchmark execution. The exported data includes:

  • Benchmark metadata for the associated benchmark

  • Execution metadata for this execution

  • The evaluation score of each datapoint, which includes:
    • The evaluator outputs
    • Rationale
    • Agreement with ground truth

  • The slice membership(s) of each datapoint

  • (CSV exports only) Uploaded user columns and ground truth

The export includes all datapoints without filtering or sampling. Some datapoints may have missing evaluation scores if the benchmark was not executed against them (for example, datapoints in the test split).
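Because unscored datapoints are still exported, downstream code typically needs to separate them before computing aggregates. The sketch below assumes the export has already been loaded into a list of dictionaries and that each record stores its score under a key named `score`; both the record layout and the key name are hypothetical, since this page does not document the export schema.

```python
def split_by_score(datapoints, score_key="score"):
    """Partition records into (scored, unscored).

    A record counts as unscored when score_key is absent or None.
    The key name is an assumption, not a documented export field.
    """
    scored, unscored = [], []
    for record in datapoints:
        if record.get(score_key) is None:
            unscored.append(record)
        else:
            scored.append(record)
    return scored, unscored


# Illustrative records only; the real export layout may differ.
sample = [
    {"uid": "a", "score": 1.0},
    {"uid": "b", "score": None},
    {"uid": "c"},
    {"uid": "d", "score": 0.0},
]
scored, unscored = split_by_score(sample)
print(len(scored), "scored;", len(unscored), "unscored")
```

The check uses `is None` rather than truthiness so that a legitimate score of 0 is not mistaken for a missing one.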

Parameters

Name | Type | Default | Info
filepath | str | | The filepath where you want to write the exported data.
config | Union[JsonExportConfig, CsvExportConfig, None] | None | A JsonExportConfig or CsvExportConfig object. Defaults to JSON. No additional configuration is required for JSON exports. For CSV exports, the parameters listed below the table are supported.
connector_config_uid | Optional[int] | None | Optional UID of the connector config to use for the export. Required only if the export destination is a remote, private bucket (a private S3 or GCS bucket that requires credentials). Ignored if the export destination is a public bucket (a public S3 or GCS bucket that does not require credentials) or a local file.

Supported CsvExportConfig parameters:

  • sep: The separator between columns. Default is ,.

  • quotechar: The character used to quote fields. Default is ".

  • escapechar: The character used to escape special characters. Default is \.
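To make the three CSV options concrete, the sketch below uses Python's standard `csv` module, not the Snorkel SDK, to show how a separator, quote character, and escape character interact when a field itself contains those characters. The helper names `write_rows` and `read_rows` are hypothetical, and setting `doublequote=False` (so that embedded quotes are escaped rather than doubled) is an assumption about how an escape character would be applied; the SDK's actual writer may behave differently.

```python
import csv
import io


def write_rows(rows, sep=",", quotechar='"', escapechar="\\"):
    """Serialize rows to CSV text using the given dialect settings."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=sep,
        quotechar=quotechar,
        escapechar=escapechar,
        doublequote=False,  # assumption: escape embedded quotes instead of doubling
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()


def read_rows(text, sep=",", quotechar='"', escapechar="\\"):
    """Parse CSV text written with the same dialect settings."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=sep,
        quotechar=quotechar,
        escapechar=escapechar,
        doublequote=False,
    )
    return list(reader)


# The second field contains both the default separator and the quote character.
rows = [["uid", "response"], ["1", 'She said "a, b"']]

comma_text = write_rows(rows)           # the documented default separator
tab_text = write_rows(rows, sep="\t")   # a tab-separated variant
print(comma_text)
print(tab_text)
```

Reading a file back requires the same three settings that were used to write it; a mismatched separator or escape character usually shows up as shifted or merged columns.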

Return type

None

Examples

Example 1

Export a benchmark execution to a local file:

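from snorkelai.sdk.develop import Benchmark

benchmark = Benchmark(100)  # benchmark_uid taken from the benchmark page URL
execution = benchmark.list_executions()[0]  # first execution in the returned list
execution.export("benchmark_execution.json")  # JSON is the default format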

Example 2

Export a benchmark execution to an S3 bucket using a connector config:

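from snorkelai.sdk.develop import Benchmark

benchmark = Benchmark(100)
execution = benchmark.list_executions()[0]
# connector_config_uid supplies credentials for a private bucket
execution.export("s3://MY-BUCKET/MY-PATH/benchmark_execution.json", connector_config_uid=1)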
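Since connector_config_uid matters only for remote destinations, a caller may want to check the destination before deciding whether to look one up. The following is a minimal sketch using only the standard library: `is_remote_destination` is a hypothetical helper, and treating the `s3` and `gs` URI schemes as the remote ones is an assumption (only `s3://` appears in the examples here).

```python
from urllib.parse import urlparse

# Assumption: these URI schemes denote S3 and GCS destinations.
REMOTE_SCHEMES = {"s3", "gs"}


def is_remote_destination(filepath):
    """Return True if filepath looks like a bucket URI rather than a local path."""
    return urlparse(filepath).scheme.lower() in REMOTE_SCHEMES


for path in (
    "benchmark_execution.json",
    "s3://MY-BUCKET/MY-PATH/benchmark_execution.json",
):
    kind = "remote" if is_remote_destination(path) else "local"
    print(f"{path}: {kind}")
```

The helper cannot tell a private bucket from a public one, so it only narrows down when a connector config might be needed.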

benchmark_execution_uid: int
benchmark_uid: int
created_at: datetime
created_by: str
name: str