Version: 0.93

Foundation model suite

Overview

The foundation model suite is a collection of foundation model-based features incorporated into the end-to-end Snorkel workflow. These features use the data-centric development workflow to distill, adapt, and fine-tune foundation models into specialized, enterprise-ready production models.

What are foundation models?

Foundation models (FMs), a category that includes large language models (LLMs), are extremely large models trained on massive amounts of data. They form a general foundation that can be adapted to more specific tasks.

Snorkel Flow provides the bridge for these powerful generic models to be applied to real-world enterprise AI use cases.

For more about foundation models, see Foundation models: a guide.

Use cases

The FM suite focuses on predictive AI use cases. Predictive AI is critical to deriving value from AI in enterprise software, especially for automating mission-critical processes such as underwriting, know your customer (KYC), and document intelligence.

Most real-world use cases in enterprise software are complex and performance-critical, so generalist foundation models struggle to deliver value out of the box because they lack domain-specific knowledge. The data-centric FM suite helps you use modern foundation models to accelerate the development of deployable specialist models tailored to the specific use case at hand, all within Snorkel Flow.

What is in the FM suite?

The FM Suite contains these main features:

  • Prompt Builder: Explore and label data with natural language prompts, translating FM knowledge into labels that are ready for your weakly supervised learning use cases. See Prompt builder for more information.
  • Warm Start: Auto-label training data with the power of foundation models plus state-of-the-art zero/few-shot learning techniques during onboarding. This approach helps you get to a powerful baseline first pass with minimal human effort. See Warm start for more information.
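The idea of turning a prompt response into a weak label can be illustrated with a small sketch. This is not Snorkel Flow's API: the label schema, the `build_prompt`, `response_to_label`, and `prompt_labeling_function` helpers, and the `fake_model` stand-in are all hypothetical names introduced here for illustration. The sketch assumes the FM returns free text and that unclear answers should abstain rather than vote.

```python
from typing import Callable

# Assumption: -1 is used as the "abstain" value, a common convention in
# weak supervision. The label schema below is hypothetical.
ABSTAIN = -1
LABELS = {"spam": 1, "ham": 0}


def build_prompt(text: str) -> str:
    """Wrap a document in a natural language question for the FM."""
    return (
        "Is the following message spam or ham? Answer with one word.\n\n"
        f"Message: {text}\nAnswer:"
    )


def response_to_label(response: str) -> int:
    """Map a free-text FM response to a class label, abstaining when unclear."""
    words = response.strip().lower().split()
    if not words:
        return ABSTAIN
    # Only the first word is inspected, with surrounding punctuation removed.
    first = words[0].strip(".,!:;\"'")
    return LABELS.get(first, ABSTAIN)


def prompt_labeling_function(text: str, complete: Callable[[str], str]) -> int:
    """Label one document by prompting an FM through the `complete` callable."""
    return response_to_label(complete(build_prompt(text)))


def fake_model(prompt: str) -> str:
    """Stand-in for a real FM call so the sketch runs offline."""
    return "Spam." if "free money" in prompt.lower() else "I am not sure"


if __name__ == "__main__":
    for message in ["Claim your FREE MONEY now", "Lunch at noon?"]:
        print(message, "->", prompt_labeling_function(message, fake_model))
```

In practice, `complete` would wrap a call to whichever external provider you have configured. Abstaining on unrecognized responses keeps noisy FM output from being forced into a class.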

Infrastructure requirements

| Deployment type | Feature | Release v0.93 |
| --- | --- | --- |
| Snorkel-hosted | Warm Start | After upgrading to 0.93, models are downloaded and readily available for use in Snorkel Flow. Infrastructure: 1 GPU, 16 GB memory. If GPUs are unavailable, contact Snorkel to assess alternatives and trade-offs. |
| Snorkel-hosted | Prompt Builder | Does not require a GPU because it can run on external infrastructure. Requires a valid account for the infrastructure (Hugging Face, OpenAI, Vertex AI, Azure ML, Azure OpenAI, or Amazon SageMaker). If external connections are not possible, contact the Snorkel team to explore alternatives. |
| Snorkel-hosted | Fine-tuning | Hugging Face models are accessible in the Model Zoo in Snorkel Flow. Infrastructure: 1 GPU recommended; possible to run on CPU with significantly longer run times. |
| Customer-hosted (on-prem + private cloud) | Warm Start | Requires internet access to download models for the first use. If an internet connection is unavailable, contact Snorkel support. Infrastructure: 1 GPU, 16 GB memory. If GPUs are unavailable, contact Snorkel to assess alternatives and trade-offs. |
| Customer-hosted (on-prem + private cloud) | Prompt Builder | FM inference is widely supported for connections outside of the Snorkel platform, including Hugging Face, OpenAI, Vertex AI, Azure ML, Azure OpenAI, and Amazon SageMaker. Requires internet access. If an internet connection is unavailable, contact Snorkel for alternatives. |
| Customer-hosted (on-prem + private cloud) | Fine-tuning | Hugging Face models are accessible in Snorkel. Infrastructure: 1 GPU recommended; possible to run on CPU with significantly longer run times. |
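For customer-hosted deployments, a quick pre-flight check against the Warm Start numbers above (1 GPU, 16 GB memory) might look like the following. This is a minimal sketch, not a Snorkel tool. It assumes NVIDIA GPUs that are visible through `nvidia-smi`, and it assumes the 16 GB figure refers to system RAM, which the table does not state. The function names are hypothetical.

```python
import os
import shutil
import subprocess

# Thresholds taken from the Warm Start row of the requirements table.
MIN_GPUS = 1
MIN_MEMORY_GB = 16


def meets_requirements(gpu_count: int, memory_gb: float) -> bool:
    """Return True when both the GPU and memory minimums are satisfied."""
    return gpu_count >= MIN_GPUS and memory_gb >= MIN_MEMORY_GB


def detect_gpu_count() -> int:
    """Count NVIDIA GPUs via `nvidia-smi -L`; 0 if the tool is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return 0
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return 0
    if result.returncode != 0:
        return 0
    # Each detected device is listed on its own line starting with "GPU <index>:".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def detect_system_memory_gb() -> float:
    """Total physical RAM in GiB on POSIX systems; 0.0 where sysconf is unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30
    except (AttributeError, ValueError, OSError):
        return 0.0


if __name__ == "__main__":
    gpus = detect_gpu_count()
    memory = detect_system_memory_gb()
    print(f"GPUs: {gpus}, memory: {memory:.1f} GiB")
    print("Meets Warm Start minimums:", meets_requirements(gpus, memory))
```

A machine sold as "16 GB" often reports slightly less than 16 GiB of usable RAM, so treat a near miss on the memory check as a prompt to investigate rather than a hard failure.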

Next steps

You can begin using the FM suite immediately with Snorkel Flow’s built-in foundation models, or configure external models through Snorkel’s extensive list of integrations.