Skip to main content
Version: 0.96

Using external models

Many foundation model suite workflows can use third-party external services to run inference over your data. To configure these third-party models, you need to create an account with the third-party service. Once an account has been established, you can configure an instance-wide connection for your selected service.

You can link external integrations to your Snorkel Flow instance. Snorkel Flow secrets service securely manages your API keys and secrets.

Using Snorkel Flow Admin Settings

  1. Superadmin roles can add new integrations through Snorkel Flow Admin Settings. Select your profile, and then navigate to Admin Settings > Foundation Models.

  2. Select Connect for the integration you want to add.

  3. Enter the required details from your external integration provider and select Save.

  4. Configure available models for the provider you added.

  5. (Optional) To make changes, select the Edit or Delete button for your integration.

    warning

    Deleting an integration also deletes all models associated with that integration.

Add vendor.webp

Using the SDK

  1. To list the integrations that are currently connected, use this SDK command:

    sf.list_integrations()
  2. To add a connection to a new integration, use of the set_secret command for your supported services:

    • Hugging Face

      sf.set_secret("huggingface::inference::api_token", "**YOUR API KEY**")

      For more, see Hugging Face token settings.

    • OpenAI

      sf.set_secret("openai_api_key", "**YOUR API KEY**")

      For more, see OpenAI API keys.

    • Azure OpenAI

      sf.set_secret("azure_openai_api_key", "**YOUR API KEY**")
    • Azure Machine Learning

      sf.set_secret("azure::ml::api_key", "**YOUR API KEY**")
    • Vertex AI

      sf.set_secret("vertexai_lm_location", "**YOUR PROJECT LOCATION**")
      sf.set_secret("vertexai_lm_project_id", "**YOUR PROJECT ID**")
      sf.set_secret("vertexai_lm_credentials_json", "**YOUR CREDENTIALS JSON**")
    • Amazon SageMaker

      sf.set_secret("aws::finetuning::region", "**YOUR REGION**")
      sf.set_secret("aws::finetuning::access_key_id", "**YOUR ACCESS KEY ID**")
      sf.set_secret("aws::finetuning::secret_access_key", "**YOUR SECRET ACCESS KEY**")
      sf.set_secret("aws::finetuning::sagemaker_execution_role", "**YOUR EXECUTION ROLE**")
    • Custom inference service (conforms to the OpenAI API specification)

      sf.set_secret("custom_inference_api_key", "**YOUR API KEY**")

To configure available models

Once you link a service, you can add models that are served within those services.

Using Snorkel Flow Admin Settings

  1. In the Admin Settings > Foundation Models page, select the Add models button.
  2. From the dropdown, choose the model provider you set up and enter the details for Model type, Model name, and the Endpoint URL.

The model you added shows up under the Models section and is available for use in Snorkel Flow.

Add a model.webp

Using SDK

To view the models that are currently connected, use this SDK command:

sf.get_external_model_endpoints()

You can view detailed information about configured models using the detail parameter:

sf.get_external_model_endpoints(detail=True)

You can inspect configuration information for a particular model using the model_name parameter:

sf.get_external_model_endpoints(model_name="openai/gpt-4o", detail=True)

To add models to Snorkel Flow, include the following information for each model:

  • Model name: The name of the provider followed by the name of the model. For example, openai/gpt-4.

  • Endpoint URL: The endpoint to perform inference requests.

  • Model provider: The provider serving the model. For example, Hugging Face, OpenAI, Vertex AI, or a custom inference service.

  • FM type: The associated task type for the model. For more information on which model type to select when configuring a model, see Supported Model Types.

    This example shows how to add the openai/gpt-4o model to Snorkel Flow:

    from snorkelflow.client_v3.tdm.models import ExternalLLMProvider, FMType

    sf.set_external_model_endpoint(
       model_name="openai/gpt-4o", # Compatible chat completions model
       endpoint="https://api.openai.com/v1/chat/completions", # Chat completions endpoint
       model_provider=ExternalLLMProvider.OPENAI, # Model provider
       fm_type=FMType.TEXT2TEXT, # The task type of the model
    )

To delete a configured model, use this SDK command:

sf.delete_external_model_endpoint("openai/gpt-4o")

Hugging Face 🤗 inference endpoints

To add a Hugging Face model, launch your chosen Hugging Face model on their servers using Hugging Face endpoints. You will be given an endpoint URL for that model.

This example shows how to set up a Hugging Face Text2Text or TextGeneration model for Classification, Sequence Tagging, and PDF Extraction applications:

sf.set_external_model_endpoint(
model_name="**MODEL NAME**", # Set this to the model you have created the endpoint for
endpoint="**ENDPOINT URL**", # Set this to the URL for the model found in the Inference Endpoints Dashboard
model_provider=ExternalLLMProvider.HUGGINGFACE, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model; in this case, text2text
)

This example shows how to set up a Hugging Face Question Answering (QA) model for Sequence Tagging applications:

sf.set_external_model_endpoint(
model_name="**MODEL NAME**", # Set this to the model you have created the endpoint for
endpoint="**ENDPOINT URL**", # Set this to the URL for the model found in the Inference Endpoints Dashboard
model_provider=ExternalLLMProvider.HUGGINGFACE, # Model provider
fm_type=FMType.QA, # The task type of the model; in this case, question answering
)

This example shows how to set up a Hugging Face Document Question Answering (DocVQA) model for PDF Extraction applications:

sf.set_external_model_endpoint(
model_name="**MODEL NAME**", # Set this to the model you have created the endpoint for
endpoint="**ENDPOINT URL**", # Set this to the URL for the model found in the Inference Endpoints Dashboard
model_provider=ExternalLLMProvider.HUGGINGFACE, # Model provider
fm_type=FMType.DOCVQA, # The task type of the model; in this case, DocVQA
)

OpenAI API

To add an Open AI model, set up your API token and ensure that you are using the appropriate API endpoint. Specify a chat completions endpoint for chat models or a completions endpoint for language models. For more information about what API endpoint a model belongs, see OpenAI's Text generation models documentation.

This example shows how to configure an OpenAI model with the chat completions API endpoint:

sf.set_external_model_endpoint(
model_name="openai/o1-mini", # Compatible chat completions model
endpoint="https://api.openai.com/v1/chat/completions", # Chat completions endpoint
model_provider=ExternalLLMProvider.OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)

This example shows how to configure an OpenAI model with the legacy completions API endpoint:

sf.set_external_model_endpoint(
model_name="openai/gpt-3.5-turbo-instruct", # Compatible completions model
endpoint="https://api.openai.com/v1/completions", # Completions endpoint
model_provider=ExternalLLMProvider.OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)

Azure OpenAI API

To add an Azure OpenAI model, set up your API token and ensure that you are using the appropriate API endpoint.

This example shows how to configure a supported Azure OpenAI chat model.

sf.set_external_model_endpoint(
model_name="azure_openai/your-deployment-name", # Compatible chat completions model
endpoint="https://your-instance-name.openai.azure.com/chat/completions", # Chat completions endpoint
model_provider=ExternalLLMProvider.AZURE_OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)

Azure Machine Learning API

To add an Azure Machine Learning AI model, set up your API token and ensure that you are using the appropriate API endpoint.

This example shows how to configure a supported Azure Machine Learning model.

from snorkelflow.models.prompts.prompts_services.azure import AzureDataInferenceInterfaceTypes

sf.set_external_model_endpoint(
model_name="your-model-name", # Name of your deployed Azure ML model
endpoint="https://<deployment-name>.westus2.inference.ml.azure.com/score", # Chat completions endpoint
model_provider=ExternalLLMProvider.AZURE_ML, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
azure_task_type=AzureDataInferenceInterfaceTypes.Llama.value,
)

Vertex AI language models API

To add a Vertex AI model, set up your Vertex AI location, project ID, and credentials JSON.

This example shows how to configure a Vertex AI model:

sf.set_external_model_endpoint(
   model_name="vertexai_lm/gemini-1.5-pro-002", # Model name
   endpoint="https://cloud.google.com/vertex-ai", # Endpoint
   model_provider=ExternalLLMProvider.VERTEXAI_LM, # Model provider
   fm_type=FMType.TEXT2TEXT, # The task type of the model
)

Amazon SageMaker API

To add an Amazon Sagemaker model endpoint for predictive use cases:

sf.set_external_model_endpoint(
model_name="sagemaker/jumpstart-dft-meta-textgeneration-llama-3-8b-instruct", # Model name
endpoint="https://runtime.sagemaker.<region-name>.amazonaws.com/endpoints/jumpstart-dft-meta-textgeneration-llama-3-8b-instruct/invocations", # Endpoint
model_provider=ExternalLLMProvider.SAGEMAKER, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)

To add an Amazon SageMaker model for generative use cases, see the LLM fine-tuning and alignment tutorial.

Custom Inference Service API

The custom inference service enables users to configure custom endpoints and alternative FM providers that adhere to the OpenAI API specification. Some examples of FM providers that are supported by the custom inference service includes Together AI and Groq.

After setting up your custom inference API key, add a model with custom inference service that conforms to the OpenAI API specification. Similar to the OpenAI setup, specify a chat completions endpoint for chat models or a completions endpoint for language models. This inference service is also extensible to other foundation model providers that conform to OpenAI's API specification, such as Together AI.

This example shows how to set up a supported model with the chat completions API endpoint:

sf.set_external_model_endpoint(
model_name="meta-llama/Llama-3.2-3B-Instruct-Turbo", # Chat model
endpoint="https://api.together.xyz/v1/chat/completions", # Inference service chat endpoint
model_provider=ExternalLLMProvider.CUSTOM_INFERENCE_SERVICE, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)

This example shows how to set up a supported model with the completions API endpoint:

sf.set_external_model_endpoint(
model_name="meta-llama/Meta-Llama-3-70B", # Language model
endpoint="https://api.together.xyz/v1/completions", # Inference service completions endpoint
model_provider=ExternalLLMProvider.CUSTOM_INFERENCE_SERVICE, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
)

Specifying model hyper-parameters

We support the setting of arbitrary model hyper-parameters when you configure a model in Snorkel Flow. Some example parameters that this include are:

  • temperature
  • top_p
  • max_input_length
  • max_output_length

When setting a model hyper-parameter, please ensure it appears exactly as documented by the model's provider. For example, when configuring the temperature hyper-parameter, some models leverage the keyword temp, others use t, and some require the full word temperature.

Once you have confirmed the hyper-parameter names and values you'd like to set, provide them as keyword arguments to sf.set_external_model_endpoint. For example, to set temperature and max_tokens on openai/gpt-4o, you'd run:

sf.set_external_model_endpoint(
model_name="openai/gpt-4o", # Compatible chat completions model
endpoint="https://api.openai.com/v1/chat/completions", # Chat completions endpoint
model_provider=ExternalLLMProvider.OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
temperature=0.5,
max_tokens=500,
)

Configuring rate limits

Snorkel Flow supports more efficient interactions for select providers by maximizing the utilization of your organization’s quota while previewing and creating prompt labeling functions (LFs). Currently, the OpenAI, Azure OpenAI, and custom inference service integrations support custom rate limits.

To enable this feature, use the SDK function sf.set_external_model_endpoint() to configure the requests_per_sec and tokens_per_sec parameters. Snorkel Flow consumes as much as the provided quota when computing the preview of prompt LFs across your instance.

When setting these values, get the correct values from your organization’s usage tier and provide the values in per-second level. If you provide incorrect values, Snorkel Flow might hit rate limits for OpenAI or underutilize the provided quota.

To determine OpenAI usage limits, visit OpenAI organization limits page. Find the token limits and request limits per minute for your desired model and divide it by 60 to compute the per-second value. For example, when the model gpt-4o-mini shows 10,000 RPM for request and other limits and 30,000,000 TPM for token limits, provide requests_per_sec=166 and tokens_per_sec=500000 as extra keyword arguments in the set_external_model_endpoint() function.

Here is an example for configuring gpt-4o-mini with rate limits:

# delete an existing model endpoint, only if already registered
sf.delete_external_model_endpoint("openai/gpt-4o-mini")
sf.set_external_model_endpoint(
model_name="openai/gpt-4o-mini", # Compatible chat completions model
endpoint="https://api.openai.com/v1/chat/completions", # Chat completions endpoint
model_provider=ExternalLLMProvider.OPENAI, # Model provider
fm_type=FMType.TEXT2TEXT, # The task type of the model
requests_per_sec=166, # request limit per second
tokens_per_sec=500000, # token limit per second
)