snorkelai.sdk.develop.BenchmarkExecution
- class snorkelai.sdk.develop.BenchmarkExecution(*args, **kwargs)
Bases: BaseModel
Represents a single execution run of a benchmark for a dataset.
A benchmark execution exports comprehensive evaluation data: per-datapoint scores (evaluator outputs, rationales, and ground truth agreement), slice membership, and benchmark and execution metadata such as timing information and execution context.
Parameters
benchmark_uid (int)
: The unique identifier of the parent Benchmark. The benchmark_uid is visible in the URL of the benchmark page in the Snorkel GUI. For example, https://YOUR-SNORKEL-INSTANCE/benchmarks/100/ indicates a benchmark with a benchmark_uid of 100.
benchmark_execution_uid (int)
: The unique identifier for this execution.
name (str)
: The name of the execution.
created_at (datetime)
: Timestamp of when this execution was run.
created_by (str)
: Username of the user who ran this execution.
- __init__(*args, **kwargs)
Methods
__init__(*args, **kwargs)
export(filepath[, config, connector_config_uid])
: Export information associated with this benchmark execution.
Attributes
benchmark_uid
benchmark_execution_uid
name
created_at
created_by
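The benchmark_uid parameter description notes that the identifier appears in the benchmark page URL. As a minimal sketch of reading it programmatically, the helper below pulls the integer that follows /benchmarks/ out of such a URL. The function name benchmark_uid_from_url is hypothetical and is not part of the Snorkel SDK; it uses only the Python standard library and assumes the URL follows the pattern shown in the example above.

```python
import re
from urllib.parse import urlparse


def benchmark_uid_from_url(url: str) -> int:
    """Return the integer that follows /benchmarks/ in a benchmark page URL.

    Hypothetical helper, not part of the Snorkel SDK.
    """
    path = urlparse(url).path
    match = re.search(r"/benchmarks/(\d+)(?:/|$)", path)
    if match is None:
        raise ValueError(f"No benchmark_uid found in URL: {url!r}")
    return int(match.group(1))


# The placeholder host from the parameter description above.
uid = benchmark_uid_from_url("https://YOUR-SNORKEL-INSTANCE/benchmarks/100/")
print(uid)
```

The resulting integer is the value the documentation's examples pass to Benchmark(...).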
- export(filepath, config=None, connector_config_uid=None)
Export information associated with this benchmark execution. The exported data includes:
- Benchmark metadata for the associated benchmark
- Execution metadata for this execution
- Each datapoint lists its evaluation score, which includes:
  - The evaluator outputs
  - Rationale
  - Agreement with ground truth
- Each datapoint lists its slice membership(s)
- (CSV exports only) Uploaded user columns and ground truth
The export includes all datapoints without filtering or sampling. Some datapoints may have missing evaluation scores if the benchmark was not executed against them (for example, datapoints in the test split).
Parameters
filepath (str)
: The filepath where you want to write the exported data.
config (Union[JsonExportConfig, CsvExportConfig, None], default None)
: A JsonExportConfig or CsvExportConfig object. Defaults to JSON. No additional configuration is required for JSON exports. For CSV exports, the following parameters are supported:
  - sep: The separator between columns. Default is a comma (,).
  - quotechar: The character used to quote fields. Default is a double quote (").
  - escapechar: The character used to escape special characters. Default is a backslash (\).
connector_config_uid (Optional[int], default None)
: Optional UID of the connector config to use for the export. Required only if the export destination is a remote, private bucket (a private S3 or GCS bucket that requires credentials). Ignored if the export destination is a public bucket (a public S3 or GCS bucket that does not require credentials) or if the export destination is a local file.
Return type
None
Examples
Example 1
Export a benchmark execution to a local file:
from snorkelai.sdk.develop import Benchmark  # import path assumed from this class's module
benchmark = Benchmark(100)
execution = benchmark.list_executions()[0]
execution.export("benchmark_execution.json")
Example 2
Export a benchmark execution to an S3 bucket using a connector config:
from snorkelai.sdk.develop import Benchmark  # import path assumed from this class's module
benchmark = Benchmark(100)
execution = benchmark.list_executions()[0]
execution.export("s3://MY-BUCKET/MY-PATH/benchmark_execution.json", connector_config_uid=1)
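The three CSV options listed for export (sep, quotechar, escapechar) control how fields containing special characters are written. The sketch below illustrates their effect using Python's standard csv module with the documented defaults. It does not call the Snorkel SDK, and it assumes the exporter behaves like a conventional CSV writer that prefixes embedded quote characters with the escape character; the source does not confirm this. The sample row is made up for illustration.

```python
import csv
import io

# Documented CSV defaults: sep=",", quotechar='"', escapechar="\\".
SEP = ","
QUOTECHAR = '"'
ESCAPECHAR = "\\"

# Hypothetical row: an ID, a rationale containing commas and quotes, a score.
row = ["dp-1", 'He said "yes", then left', "0.5"]

# Write the row. doublequote=False makes the writer prefix embedded quote
# characters with the escape character (an assumption about the exporter).
buffer = io.StringIO()
writer = csv.writer(
    buffer,
    delimiter=SEP,
    quotechar=QUOTECHAR,
    escapechar=ESCAPECHAR,
    doublequote=False,
    lineterminator="\n",
)
writer.writerow(row)
encoded = buffer.getvalue()

# Read the line back with the same settings to recover the original fields.
reader = csv.reader(
    io.StringIO(encoded),
    delimiter=SEP,
    quotechar=QUOTECHAR,
    escapechar=ESCAPECHAR,
    doublequote=False,
)
decoded = next(reader)

print(encoded, end="")
print(decoded == row)
```

A reader of the exported file needs the same three settings that were used for the export; otherwise fields with embedded separators or quotes may be split incorrectly.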
- benchmark_execution_uid: int
- benchmark_uid: int
- created_at: datetime
- created_by: str
- name: str