snorkelai.sdk.develop.Criteria
- final class snorkelai.sdk.develop.Criteria(benchmark_uid, criteria_uid, name, metric_label_schema_uid, description=None, rationale_label_schema_uid=None, archived=False)
Bases: Base
A criteria represents a specific characteristic or feature being evaluated as part of a benchmark.
Criteria define what aspects of a model or AI application’s performance are being measured, such as accuracy, relevance, safety, and other qualities. Each criteria is associated with a benchmark and has an evaluator that assesses whether a model’s output satisfies that criteria.
The heart of each criteria is its associated label schema, which defines what, exactly, the criteria is measuring, and maps each option to an integer.
For example, a criteria that measures accuracy might have a label schema that defines the following labels:
INCORRECT: 0
CORRECT: 1
A criteria that measures readability might have a label schema that defines the following labels:
POOR: 0
ACCEPTABLE: 1
EXCELLENT: 2
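As a rough illustration of how a label schema pairs each option with an integer, the readability labels above could be modeled as a plain dictionary. The names `READABILITY_LABELS` and `label_for` are hypothetical and are not part of the SDK; this is only a sketch of the mapping idea.

```python
# Hypothetical stand-in for a label schema: option name -> integer.
READABILITY_LABELS = {"POOR": 0, "ACCEPTABLE": 1, "EXCELLENT": 2}

# Reverse lookup so an integer result can be displayed as its option name.
_LABEL_BY_VALUE = {value: name for name, value in READABILITY_LABELS.items()}


def label_for(value: int) -> str:
    """Return the option name for an integer; raises KeyError if unmapped."""
    return _LABEL_BY_VALUE[value]


print(label_for(2))  # EXCELLENT
```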
Read more in the Evaluation overview.
- __init__(benchmark_uid, criteria_uid, name, metric_label_schema_uid, description=None, rationale_label_schema_uid=None, archived=False)
Parameters
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| benchmark_uid | int | | The unique identifier of the parent Benchmark. The benchmark_uid is visible in the URL of the benchmark page in the Snorkel GUI. For example, https://YOUR-SNORKEL-INSTANCE/benchmarks/100/ indicates a benchmark with benchmark_uid of 100. |
| criteria_uid | int | | The unique identifier for this criteria. |
| name | str | | The name of the criteria. |
| metric_label_schema_uid | int | | The ID of the schema defining the metric labels. |
| description | Optional[str] | None | A detailed description of what the criteria measures. |
| rationale_label_schema_uid | Optional[int] | None | The ID of the schema defining rationale labels (if applicable). |
| archived | bool | False | Whether the criteria is archived. |
Examples
Using the Criteria class requires the following import:
from snorkelai.sdk.develop import Criteria
Create a new criteria:
# Create a new criteria
criteria = Criteria.create(
    benchmark_uid=100,
    name="Accuracy",
    description="Measures response accuracy",
    label_map={"Correct": 1, "Incorrect": 0},
    requires_rationale=True
)
Get an existing criteria:
# Get existing criteria
criteria = Criteria.get(criteria_uid=100)
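The parameter table notes that the benchmark_uid appears in the URL of the benchmark page. A minimal sketch of reading it from a copied URL, using only the Python standard library, might look like this. The helper name `benchmark_uid_from_url` is hypothetical and not part of the SDK, and the sketch assumes the `/benchmarks/<uid>/` path shape shown in the example URL.

```python
import re
from urllib.parse import urlparse


def benchmark_uid_from_url(url: str) -> int:
    """Extract the integer that follows /benchmarks/ in a benchmark page URL."""
    path = urlparse(url).path
    match = re.search(r"/benchmarks/(\d+)(?:/|$)", path)
    if match is None:
        raise ValueError(f"No benchmark UID found in URL: {url!r}")
    return int(match.group(1))


print(benchmark_uid_from_url("https://YOUR-SNORKEL-INSTANCE/benchmarks/100/"))  # 100
```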
Methods
| Method | Description |
| --- | --- |
| __init__(benchmark_uid, criteria_uid, name, ...) | |
| archive() | Archives the criteria, hiding it from the UI and Benchmark.list_criteria method. |
| create(benchmark_uid, name, label_map[, ...]) | Create a new criteria for a benchmark. |
| delete(criteria_uid) | Deletion of a criteria is not implemented. |
| get(criteria_uid) | Get an existing criteria by its UID. |
| get_evaluator() | Retrieves the evaluator associated with this criteria. |
| update([name, description, archived]) | Updates the criteria with the given parameters. |
Attributes
| Attribute | Description |
| --- | --- |
| archived | Return whether the criteria is archived |
| benchmark_uid | Return the UID of the parent benchmark |
| criteria_uid | Return the UID of the criteria |
| description | Return the description of the criteria |
| metric_label_schema_uid | Return the UID of the metric label schema |
| name | Return the name of the criteria |
| rationale_label_schema_uid | Return the UID of the rationale label schema |
| uid | Return the UID of the criteria |
- archive()
Archives the criteria, hiding it from the UI and Benchmark.list_criteria method.
Use snorkelai.sdk.develop.benchmarks.Benchmark.list_criteria() with include_archived=True to view archived criteria.
Return type
None
- static create(benchmark_uid, name, label_map, description=None, requires_rationale=False)
Create a new criteria for a benchmark.
Your label_map must use consecutive integers starting from 0. For example, if you have three labels, you must use the values 0, 1, and 2.
Parameters
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| benchmark_uid | int | | The unique identifier of the parent Benchmark. |
| name | str | | The name of the criteria. |
| label_map | Dict[str, int] | | A dictionary mapping user-friendly labels to numeric values. The key “UNKNOWN” will always be added with value -1. Dictionary values must be consecutive integers starting from 0. |
| description | Optional[str] | None | A detailed description of what the criteria measures. |
| requires_rationale | bool | False | Whether the criteria requires rationale. |
Returns
A new Criteria object representing the created criteria.
Return type
Criteria
Raises
ValueError – If label_map is empty or has invalid values.
Example
criteria = Criteria.create(
    benchmark_uid=200,
    name="Accuracy",
    description="Measures response accuracy",
    label_map={"Correct": 1, "Incorrect": 0},
    requires_rationale=True
)
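Because create raises ValueError for an empty or invalid label_map, it can be convenient to check the map locally before calling the SDK. The sketch below mirrors the documented rule (values must be consecutive integers starting from 0). The helper `validate_label_map` is hypothetical, and the SDK's own validation may be stricter or differ in detail. It does not add the “UNKNOWN” entry, which the documentation says is added for you.

```python
from typing import Dict


def validate_label_map(label_map: Dict[str, int]) -> None:
    """Raise ValueError unless the values are exactly 0, 1, ..., n-1."""
    if not label_map:
        raise ValueError("label_map must not be empty")
    values = list(label_map.values())
    if not all(isinstance(v, int) for v in values):
        raise ValueError("label_map values must all be integers")
    # Sorting lets the labels be declared in any order, e.g. {"Correct": 1, "Incorrect": 0}.
    if sorted(values) != list(range(len(values))):
        raise ValueError(
            f"label_map values must be consecutive integers starting from 0; got {sorted(values)}"
        )


validate_label_map({"Correct": 1, "Incorrect": 0})  # passes silently
```

A map such as `{"Low": 1, "High": 2}` would be rejected by this check because it skips 0.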
- classmethod delete(criteria_uid)
Deletion of a criteria is not implemented.
- static get(criteria_uid)
Get an existing criteria by its UID.
Parameters
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| criteria_uid | int | | The unique identifier for the criteria. |
Returns
A Criteria object representing the existing criteria.
Return type
Criteria
Raises
ValueError – If the criteria is not found.
Example
criteria = Criteria.get(criteria_uid=100)
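Since get raises ValueError when the criteria is not found, callers that treat a missing criteria as a normal case may want to wrap the lookup. The following is a minimal sketch of that pattern. `get_or_none` and `fake_get` are hypothetical names, and `fake_get` is only a stand-in so the example runs without the SDK installed.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def get_or_none(getter: Callable[..., T], criteria_uid: int) -> Optional[T]:
    """Call getter(criteria_uid=...) and turn a ValueError into None."""
    try:
        return getter(criteria_uid=criteria_uid)
    except ValueError:
        return None


# Stand-in for Criteria.get, used here only for demonstration.
def fake_get(criteria_uid: int) -> str:
    known = {100: "Accuracy"}
    if criteria_uid not in known:
        raise ValueError(f"Criteria {criteria_uid} not found")
    return known[criteria_uid]


print(get_or_none(fake_get, 100))  # Accuracy
print(get_or_none(fake_get, 999))  # None
```

With the SDK available, the same wrapper could be called as `get_or_none(Criteria.get, 100)`. Note that it would also swallow a ValueError raised for any other reason, so it suits cases where "not found" is the only expected failure.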
- get_evaluator()
Retrieves the evaluator associated with this criteria.
An evaluator is a prompt or code snippet that assesses whether a model’s output satisfies the criteria. Each criteria has one evaluator that assesses each datapoint against the criteria’s label schema and chooses the most appropriate label, in the form of the associated integer.
The evaluator can be either a code evaluator (using custom Python functions) or a prompt evaluator (using LLM prompts).
Example
Example 1
Get the evaluator for a criteria and check its type:
from snorkelai.sdk.develop import CodeEvaluator, Criteria, PromptEvaluator
# Look up the criteria, then fetch its evaluator
criteria = Criteria.get(criteria_uid=100)
evaluator = criteria.get_evaluator()
# Report which kind of evaluator is attached
if isinstance(evaluator, CodeEvaluator):
    print("This is a code evaluator")
elif isinstance(evaluator, PromptEvaluator):
    print("This is a prompt evaluator")
- update(name=None, description=None, archived=None)
Updates the criteria with the given parameters. If a parameter is not provided or is None, the existing value will be left unchanged.
Parameters
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| name | Optional[str] | None | The name of the criteria. |
| description | Optional[str] | None | A detailed description of what the criteria measures. |
| archived | Optional[bool] | None | Whether the criteria is archived. |
Return type
None
Example
criteria = Criteria.get(criteria_uid=100)
criteria.update(name="New Name", description="New description")
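The "None leaves the value unchanged" rule described for update can be pictured with a small stand-alone sketch. This is not the SDK's implementation. `merge_update` is a hypothetical helper that applies the same rule to a plain dictionary.

```python
from typing import Any, Dict, Optional


def merge_update(
    current: Dict[str, Any],
    name: Optional[str] = None,
    description: Optional[str] = None,
    archived: Optional[bool] = None,
) -> Dict[str, Any]:
    """Return a copy of current with only the non-None arguments applied."""
    changes = {"name": name, "description": description, "archived": archived}
    merged = dict(current)
    merged.update({key: value for key, value in changes.items() if value is not None})
    return merged


before = {"name": "Accuracy", "description": "Measures response accuracy", "archived": True}
after = merge_update(before, name="New Name", archived=False)
print(after["name"], after["description"], after["archived"])
```

In this sketch `archived=False` is applied because it is not None, while the omitted description keeps its previous value. Under the same rule, passing None cannot be used to clear a field.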
- property archived: bool
Return whether the criteria is archived
- property benchmark_uid: int
Return the UID of the parent benchmark
- property criteria_uid: int
Return the UID of the criteria
- property description: str | None
Return the description of the criteria
- property metric_label_schema_uid: int
Return the UID of the metric label schema
- property name: str
Return the name of the criteria
- property rationale_label_schema_uid: int | None
Return the UID of the rationale label schema
- property uid: int
Return the UID of the criteria