Version: 25.4

snorkelflow.sdk.Benchmark

class snorkelflow.sdk.Benchmark(benchmark_uid)

Bases: object

SDK client for benchmark operations

__init__

__init__(benchmark_uid)

Initialize a Benchmark object for SDK operations

Parameters

Name | Type | Default | Info
benchmark_uid | int | | The unique identifier of the benchmark.

Methods

Method | Description
__init__(benchmark_uid) | Initialize a Benchmark object for SDK operations.
export_config(filepath[, format]) | Export benchmark configuration to the specified format and write to the provided filepath.
export_latest_execution(filepath[, config]) | Export information associated with the latest benchmark execution.
list_executions() | Get all benchmark executions for this benchmark, sorted by creation date.

export_config

export_config(filepath, format=BenchmarkExportFormat.JSON)

Export benchmark configuration to the specified format and write to the provided filepath.

Parameters

Name | Type | Default | Info
filepath | str | | The filepath to write the exported config to.
format | BenchmarkExportFormat | <BenchmarkExportFormat.JSON: 'json'> | The format to export the config to. Currently only JSON is supported.

Return type

None

Examples

>>> benchmark = Benchmark(123)
>>> benchmark.export_config("benchmark_config.json")
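
As a minimal sketch of working with the exported file, the helper below calls export_config and then parses the result with the standard json module. The helper name export_and_load_config is hypothetical and not part of the SDK. It assumes that the exported file is a single UTF-8 JSON document and that the target directory may need to be created first.

```python
import json
from pathlib import Path
from typing import Any


def export_and_load_config(benchmark: Any, filepath: str) -> Any:
    """Export a benchmark's config as JSON, then parse the written file.

    `benchmark` is expected to be a snorkelflow.sdk.Benchmark instance; any
    object with a compatible export_config(filepath) method works as well.
    This helper is illustrative and not part of the SDK.
    """
    path = Path(filepath)
    # Assumption: the parent directory may not exist yet, so create it first.
    path.parent.mkdir(parents=True, exist_ok=True)
    # JSON is the default (and currently the only supported) export format.
    benchmark.export_config(str(path))
    # Assumption: the exported file is a single UTF-8 encoded JSON document.
    with path.open(encoding="utf-8") as f:
        return json.load(f)
```

With the SDK installed, this could be called as export_and_load_config(Benchmark(123), "exports/benchmark_config.json"). The structure of the parsed config is not documented on this page, so inspect the result before relying on specific keys.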

export_latest_execution

export_latest_execution(filepath, config=None)

Export information associated with the latest benchmark execution. The exported dataset includes:

  • Per-datapoint evaluation information:

    • Evaluation scores, namely:

      • Parsed evaluator outputs

      • Rationale

      • Agreement with ground truth

    • Slice membership

  • Benchmark metadata

  • Execution metadata

  • (CSV only) Uploaded user columns and ground truth

This export includes all datapoints without filtering or sampling. Some datapoints may have missing evaluation scores if the benchmark has not been executed against them (e.g. those in the test split).

Parameters

Name | Type | Default | Info
filepath | str | | The filepath to write the exported data to.
config | Union[JsonExportConfig, CsvExportConfig, None] | None | Export format configuration; details follow this table.

A JsonExportConfig or CsvExportConfig object. If not provided, JSON will be used by default. No additional configuration is required for JSON exports. For CSV exports, the following parameters are supported:

  • sep: The separator between columns. Default is ,.

  • quotechar: The character used to quote fields. Default is ".

  • escapechar: The character used to escape special characters. Default is \.

Return type

None

Examples

>>> benchmark = Benchmark(123)
>>> benchmark.export_latest_execution("benchmark_execution.json")
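
For CSV exports (produced by passing a CsvExportConfig as config), the file can be read back with Python's standard csv module using the same sep, quotechar, and escapechar values. The sketch below defaults to the documented CSV defaults. The helper name read_exported_csv is hypothetical, and it assumes the first row of the export is a header row and that the file is UTF-8 encoded; neither is confirmed on this page.

```python
import csv
from typing import Dict, List


def read_exported_csv(
    filepath: str,
    sep: str = ",",
    quotechar: str = '"',
    escapechar: str = "\\",
) -> List[Dict[str, str]]:
    """Read a CSV export into a list of row dicts keyed by column header.

    The defaults mirror the documented CsvExportConfig defaults. Pass the
    same values that were used for the export if they were overridden.
    This helper is illustrative and not part of the SDK.
    """
    # newline="" lets the csv module handle line endings inside quoted fields.
    with open(filepath, newline="", encoding="utf-8") as f:
        # Assumption: the first row of the export holds the column names.
        reader = csv.DictReader(
            f, delimiter=sep, quotechar=quotechar, escapechar=escapechar
        )
        return list(reader)
```

All values come back as strings, so evaluation scores need to be converted explicitly. Datapoints the benchmark has not been executed against may have empty score fields.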

list_executions

list_executions()

Get all benchmark executions for this benchmark, sorted by creation date.

Returns

A list of BenchmarkExecution objects.

Return type

List[BenchmarkExecution]

Examples

>>> benchmark = Benchmark(123)
>>> executions = benchmark.list_executions()
>>> print(executions)
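
One way to combine list_executions with export_latest_execution is to check that at least one execution exists before exporting. This is a sketch only: the helper name export_latest_if_any is hypothetical, and how export_latest_execution behaves when a benchmark has no executions is not documented here, so the guard is a precaution rather than a requirement.

```python
from typing import Any


def export_latest_if_any(benchmark: Any, filepath: str) -> bool:
    """Export the latest execution only if the benchmark has been executed.

    Returns True if an export was written, False if there were no executions.
    `benchmark` is expected to be a snorkelflow.sdk.Benchmark instance.
    This helper is illustrative and not part of the SDK.
    """
    executions = benchmark.list_executions()
    if not executions:
        # Nothing has been executed yet, so skip the export.
        return False
    # With no config argument, the export is written as JSON.
    benchmark.export_latest_execution(filepath)
    return True
```

With the SDK installed, this could be called as export_latest_if_any(Benchmark(123), "benchmark_execution.json").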