Connect external models
Snorkel supports connecting third-party foundation models, such as OpenAI and Claude, to run inference-related workflows like evaluation on the Snorkel AI Data Development Platform.
To connect to external models, follow these steps:
- Create an account with the third-party service provider.
- Configure a connection to that service within your Snorkel instance.
This connection setup allows all workflows in your instance to access the third-party models for inference operations.
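At a glance, a minimal SDK sketch of this flow looks like the following (OpenAI is used as the example provider; each call is covered in detail in the sections below):
from snorkelai.client.tdm.models import ExternalLLMProvider, FMType
# 1. Store the provider credential in the Snorkel secrets service.
sai.set_secret("openai_api_key", "**YOUR API KEY**")
# 2. Register a model served by that provider so workflows can use it for inference.
sai.set_external_model_endpoint(
    model_name="openai/gpt-4o",
    endpoint="https://api.openai.com/v1/chat/completions",
    model_provider=ExternalLLMProvider.OPENAI,
    fm_type=FMType.TEXT2TEXT,
)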
Prerequisites
- An account and credentials with the providers you want to connect
How to connect external services
You can connect external integrations to your Snorkel instance. The Snorkel secrets service securely manages stored API keys and secrets.
Connect to model providers using the Snorkel GUI
Prerequisites:
- Log in to Snorkel with superadmin permissions
- In the Snorkel GUI, select your profile.
- Select Admin Settings > Foundation Models.
- Select Connect for the integration you want to add.
- Enter the required details from your external integration provider and select Save.
- Configure available models for the provider you added.
- (Optional) To make changes, select the Edit or Delete button for your integration.
Warning: Deleting an integration also deletes all models associated with that integration.

Connect to model providers using the SDK
- To list the integrations that are currently connected, use this SDK command:
sai.list_integrations()
- To check the operational status of an integration, use the get_model_provider_status command for your associated provider:
sai.get_model_provider_status(ExternalLLMProvider.OPENAI)
- To add a connection to a new integration, use the set_secret command for your supported service:
  - Hugging Face
    sai.set_secret("huggingface::inference::api_token", "**YOUR API KEY**")
    For more, see Hugging Face token settings.
  - OpenAI
    sai.set_secret("openai_api_key", "**YOUR API KEY**")
    For more, see OpenAI API keys.
  - Azure OpenAI
    sai.set_secret("azure_openai_api_key", "**YOUR API KEY**")
  - Azure Machine Learning
    sai.set_secret("azure::ml::api_key", "**YOUR API KEY**")
  - Vertex AI
    sai.set_secret("vertexai_lm_location", "**YOUR PROJECT LOCATION**")
    sai.set_secret("vertexai_lm_project_id", "**YOUR PROJECT ID**")
    sai.set_secret("vertexai_lm_credentials_json", "**YOUR CREDENTIALS JSON**")
  - Amazon SageMaker
    sai.set_secret("aws::finetuning::region", "**YOUR REGION**")
    sai.set_secret("aws::finetuning::access_key_id", "**YOUR ACCESS KEY ID**")
    sai.set_secret("aws::finetuning::secret_access_key", "**YOUR SECRET ACCESS KEY**")
    sai.set_secret("aws::finetuning::sagemaker_execution_role", "**YOUR EXECUTION ROLE**")
  - Amazon Bedrock
    sai.set_secret("aws::bedrock::region", "**YOUR REGION**")
    sai.set_secret("aws::bedrock::access_key_id", "**YOUR ACCESS KEY ID**")
    sai.set_secret("aws::bedrock::secret_access_key", "**YOUR SECRET ACCESS KEY**")
    sai.set_secret("aws::bedrock::bedrock_execution_role", "**YOUR EXECUTION ROLE**")
  - Custom inference service: see Custom Inference Service API.
How to add models in Snorkel
After you connect an external service to Snorkel, you can add the models from that service to Snorkel.
Add models using the Snorkel GUI
- In the Snorkel GUI, select your profile.
- Select Admin Settings > Foundation Models.
- Select the Add models button.
- From the dropdown, choose the model provider you connected in the previous section. Enter the details for Model type, Model name, and Endpoint URL.
The model you added is now visible in the Models section and is available for use in Snorkel.
Add models using the SDK
To view the models that are currently connected, use this SDK command:
sai.get_external_model_endpoints()
You can view detailed information about configured models using the detail parameter:
sai.get_external_model_endpoints(detail=True)
You can inspect configuration information for a particular model using the model_name parameter:
sai.get_external_model_endpoints(model_name="openai/gpt-4o", detail=True)
To add models to Snorkel, include the following information for each model:
- model_name: The name of the provider followed by the name of the model. For example, openai/gpt-4.
- endpoint: The endpoint used for inference requests.
- model_provider: The provider serving the model. For example, Hugging Face, OpenAI, Vertex AI, or a custom inference service.
- fm_type: The associated task type for the model.
This example shows how to add the openai/gpt-4o model to Snorkel:
from snorkelai.client.tdm.models import ExternalLLMProvider, FMType
sai.set_external_model_endpoint(
model_name="openai/gpt-4o", # Compatible chat completions model
endpoint="https://api.openai.com/v1/chat/completions", # Chat completions endpoint
model_provider=ExternalLLMProvider.OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)
To delete a configured model, use this SDK command:
sai.delete_external_model_endpoint("openai/gpt-4o")
Hugging Face 🤗 inference endpoints
To add a Hugging Face model, launch your chosen model on Hugging Face's servers using Hugging Face Inference Endpoints. You will be given an endpoint URL for that model.
This example shows how to set up a Hugging Face Text2Text model for Prompt Dev applications:
sai.set_external_model_endpoint(
model_name="**MODEL NAME**", # Set this to the model you have created the endpoint for
endpoint="**ENDPOINT URL**", # Set this to the URL for the model found in the Inference Endpoints Dashboard
model_provider=ExternalLLMProvider.HUGGINGFACE, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model; in this case, text2text
)
OpenAI API
To add an OpenAI model, set up your API token and ensure that you are using the appropriate API endpoint. Specify a chat completions endpoint for chat models or a completions endpoint for language models. For more information about which API endpoint a model belongs to, see OpenAI's Text generation models documentation.
This example shows how to configure an OpenAI model with the chat completions API endpoint:
sai.set_external_model_endpoint(
model_name="openai/o1-mini", # Compatible chat completions model
endpoint="https://api.openai.com/v1/chat/completions", # Chat completions endpoint
model_provider=ExternalLLMProvider.OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)
This example shows how to configure an OpenAI model with the legacy completions API endpoint:
sai.set_external_model_endpoint(
model_name="openai/gpt-3.5-turbo-instruct", # Compatible completions model
endpoint="https://api.openai.com/v1/completions", # Completions endpoint
model_provider=ExternalLLMProvider.OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)
Azure OpenAI API
To add an Azure OpenAI model, set up your API token and ensure that you are using the appropriate API endpoint.
This example shows how to configure a supported Azure OpenAI chat model.
sai.set_external_model_endpoint(
model_name="azure_openai/your-deployment-name", # Compatible chat completions model
endpoint="https://your-instance-name.openai.azure.com/chat/completions", # Chat completions endpoint
model_provider=ExternalLLMProvider.AZURE_OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)
Azure Machine Learning API
To add an Azure Machine Learning model, set up your API token and ensure that you are using the appropriate API endpoint.
This example shows how to configure a supported Azure Machine Learning model.
from snorkelai.models.prompts.prompts_services.azure import AzureDataInferenceInterfaceTypes
sai.set_external_model_endpoint(
model_name="your-model-name", # Name of your deployed Azure ML model
endpoint="https://<deployment-name>.westus2.inference.ml.azure.com/score", # Chat completions endpoint
model_provider=ExternalLLMProvider.AZURE_ML, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
azure_task_type=AzureDataInferenceInterfaceTypes.Llama.value,
)
Bedrock Claude API
To add a Bedrock Claude model endpoint:
sai.set_external_model_endpoint(
model_name="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0", # Model name
endpoint="bedrock-runtime.us-west-2.amazonaws.com", # Endpoint
model_provider=ExternalLLMProvider.Bedrock, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)
Supported Claude models
The following models are currently supported via AWS Bedrock:
- Claude 3.5 Sonnet v2
- Claude 3.7
For the corresponding model IDs, see the AWS Bedrock model documentation.
Vertex AI language models API
To add a Vertex AI model, set up your Vertex AI location, project ID, and credentials JSON.
This example shows how to configure a Vertex AI model:
sai.set_external_model_endpoint(
model_name="vertexai_lm/gemini-1.5-pro-002", # Model name
endpoint="https://cloud.google.com/vertex-ai", # Endpoint
model_provider=ExternalLLMProvider.VERTEXAI_LM, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)
Amazon SageMaker API
To add an Amazon SageMaker model endpoint for predictive use cases:
sai.set_external_model_endpoint(
model_name="sagemaker/jumpstart-dft-meta-textgeneration-llama-3-8b-instruct", # Model name
endpoint="https://runtime.sagemaker.<region-name>.amazonaws.com/endpoints/jumpstart-dft-meta-textgeneration-llama-3-8b-instruct/invocations", # Endpoint
model_provider=ExternalLLMProvider.SAGEMAKER, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)
Custom inference service API
The custom inference service enables users to configure custom endpoints and additional foundation model providers.
OpenAI-compatible services
The custom inference service supports endpoints from foundation model providers that follow the OpenAI API schema (see the OpenAI API specification), such as Together AI and Groq. To integrate a supported model, complete the following steps.
1. Set API secrets
sai.set_secret("custom_inference_api_key", "**YOUR API KEY**")
# Optional: Set default HTTP headers to send with requests (use these if authentication is not via a Bearer token)
sai.set_secret("custom_inference_optional_headers", {
"**HEADER_NAME**": "**HEADER_VALUE**"
})
NOTE: By default, custom_inference_api_key is used as a Bearer token in the Authorization header for requests (i.e., Authorization: Bearer <custom_inference_api_key>).
If your service uses a different or additional authentication method, set the correct headers using custom_inference_optional_headers. However, you still need to provide an arbitrary value for custom_inference_api_key.
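For example, here is a minimal sketch for a service that authenticates with a custom header instead of a Bearer token (the x-api-key header name is an illustrative assumption; use whatever header your service expects):
# Sketch: authenticate via a custom header rather than the default Bearer token.
# "x-api-key" is a hypothetical header name for illustration only.
sai.set_secret("custom_inference_api_key", "placeholder")  # still required; any arbitrary value works
sai.set_secret("custom_inference_optional_headers", {
    "x-api-key": "**YOUR API KEY**",
})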
2. Register model endpoints
After setting up your custom inference API key, add a model through the custom inference service that conforms to the OpenAI API specification. As with the OpenAI setup, specify a chat completions endpoint for chat models or a completions endpoint for language models. This inference service is also extensible to other foundation model providers that conform to OpenAI's API specification, such as Together AI.
This example shows how to set up a supported model with the chat completions API endpoint:
sai.set_external_model_endpoint(
model_name="meta-llama/Llama-3.2-3B-Instruct-Turbo", # Chat model
endpoint="https://api.together.xyz/v1/chat/completions", # Inference service chat endpoint
model_provider=ExternalLLMProvider.CUSTOM_INFERENCE_SERVICE, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)
This example shows how to set up a supported model with the completions API endpoint:
sai.set_external_model_endpoint(
model_name="meta-llama/Meta-Llama-3-70B", # Language model
endpoint="https://api.together.xyz/v1/completions", # Inference service completions endpoint
model_provider=ExternalLLMProvider.CUSTOM_INFERENCE_SERVICE, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)
Non-OpenAI-compatible services
If your external inference service does not conform to the OpenAI API spec, you can still integrate it using a transform configuration. This allows Snorkel to map requests and responses between OpenAI-compatible schemas and your custom API.
Steps:
- Set the custom_inference_api_key and custom_inference_optional_headers secrets as described above.
- Create a transform configuration. The transform configuration is a JSON object with the following required fields:
  - /chat/completions: The root key representing the OpenAI endpoint you wish to transform. This key defines which OpenAI endpoint's requests and responses will be mapped.
  - custom_path: The path on your custom LLM API that corresponds to the OpenAI /chat/completions endpoint. Snorkel routes requests to this path instead of the default OpenAI endpoint.
  - openai_to_custom_request: An object that defines how to transform the OpenAI request JSON into the format expected by your custom LLM API.
    - Keys: Field names as expected by your custom API.
    - Values: JMESPath expressions describing how to extract or map values from the OpenAI request schema.
  - custom_to_openai_response: An object that defines how to transform your custom LLM API's response into the OpenAI response schema.
    - Keys: Field names as expected by the OpenAI API/SDK.
    - Values: JMESPath expressions describing how to extract or map values from your custom API's response.
When configuring a custom inference service with a transform spec and associated models, set the endpoint to the base URL of the custom inference service (do not include the chat completions path). Instead, add the chat completions path to the custom_path field in the transform configuration.
Example transform configuration:
import json
transform_config = {
"/chat/completions": {
"custom_path": "/v1/generate",
"openai_to_custom_request": {
"useCase": "'text-generation'",
"contextId": "model",
"preSeed_injection_map": {
"system": "messages[?role=='system'].content | join(' ', @)",
"user": "messages[?role=='user'].content | join(' ', @)"
},
"parameters": {
"temperature": "temperature",
"maxOutputTokens": "max_tokens",
"topP": "top_p",
"responseMimeType": "'application/json'"
}
},
"custom_to_openai_response": {
"id": "responseMetadata.vegasTransactionId",
"model": "responseMetadata.llmResponsePayload[0].modelVersion",
"choices": "[{index: to_number('0'), message: {role: 'assistant', content: responseMetadata.llmResponsePayload[0].candidates[0].content.parts[0].text}, finish_reason: responseMetadata.llmResponsePayload[0].finishReason == 'STOP' && 'stop' || 'length'}]",
"usage": {
"prompt_tokens": "responseMetadata.llmResponsePayload[0].usageMetadata.promptTokenCount || to_number('0')",
"completion_tokens": "responseMetadata.llmResponsePayload[0].usageMetadata.candidatesTokenCount || to_number('0')",
"total_tokens": "responseMetadata.llmResponsePayload[0].usageMetadata.totalTokenCount || to_number('0')"
}
}
}
}
sai.set_secret("custom_inference_transform_spec", json.dumps(transform_config))
How to specify model hyper-parameters
Snorkel supports setting arbitrary model hyper-parameters when you configure a model in Snorkel. Example parameters include:
- temperature
- top_p
- max_input_length
- max_output_length
When setting a model hyper-parameter, ensure that its name appears exactly as documented by the model's provider. For example, when configuring the temperature hyper-parameter, some models use the keyword temp, others use t, and some require the full word temperature.
Once you have confirmed the hyper-parameter names and values you'd like to set, provide them as keyword arguments to sai.set_external_model_endpoint. For example, to set temperature and max_tokens on openai/gpt-4o, run:
sai.set_external_model_endpoint(
model_name="openai/gpt-4o", # Compatible chat completions model
endpoint="https://api.openai.com/v1/chat/completions", # Chat completions endpoint
model_provider=ExternalLLMProvider.OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
temperature=0.5,
max_tokens=500,
)
How to set rate limits
Snorkel supports custom rate limits for the OpenAI, Azure OpenAI, Bedrock, and custom inference service integrations.
To enable this feature, use the SDK function sai.set_external_model_endpoint() to configure the requests_per_sec and tokens_per_sec parameters. Snorkel consumes up to the provided quota across the whole Snorkel instance when computing previews of prompt LFs.
When setting these values, derive them from your organization's usage tier and express them as per-second values. If you provide incorrect values, Snorkel might hit OpenAI's rate limits or underutilize the available quota.
To determine OpenAI usage limits, visit the OpenAI organization limits page. Find the per-minute request and token limits for your desired model and divide each by 60 to compute the per-second values. For example, if the model gpt-4o-mini shows a request limit of 10,000 RPM and a token limit of 30,000,000 TPM, provide requests_per_sec=166 and tokens_per_sec=500000 as extra keyword arguments to the set_external_model_endpoint() function.
Here is an example for configuring gpt-4o-mini with rate limits:
# Delete the existing model endpoint, if one is already registered
sai.delete_external_model_endpoint("openai/gpt-4o-mini")
sai.set_external_model_endpoint(
model_name="openai/gpt-4o-mini", # Compatible chat completions model
endpoint="https://api.openai.com/v1/chat/completions", # Chat completions endpoint
model_provider=ExternalLLMProvider.OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
requests_per_sec=166, # request limit per second
tokens_per_sec=500000, # token limit per second
)