Version: 0.95

Snorkel Flow Kubernetes installation overview

This topic is your starting point to deploy Snorkel Flow on Kubernetes. Installing Snorkel Flow is a two-step process:

  1. Preparing infrastructure resources
  2. Installing Snorkel Flow into the Kubernetes cluster

Preparing infrastructure resources

At a high level, Snorkel Flow requires the following pieces of infrastructure for deployment:

  • An operational Kubernetes (K8s) cluster
  • NFS or equivalent (NAS drive, etc.)
  • A domain to create URLs for the platform

If you are using a major cloud provider (AWS, GCP, or Azure) for Snorkel Flow, Snorkel recommends deploying into a new K8s cluster. However, deploying into an existing cluster or on-premises is also possible. Snorkel offers Terraform and manual methods for provisioning all required cloud infrastructure on the three major cloud providers in preparation for installing Snorkel Flow.

Assumptions and requirements

Snorkel Flow can run on any Kubernetes installation as long as it satisfies the cluster specifications and the storage and ingress requirements described below.

Cluster Specifications

Category                              Requirement
Minimum Kubernetes version            1.25+
Recommended Kubernetes distribution   AWS EKS, GCP GKE, Azure AKS, OpenShift 4.9+
Node processor                        x86_64
Namespace total CPU                   64+
Namespace total RAM                   360 GB+
Namespace total GPU                   4+ T4 or better
Available per-pod CPU                 16+
Available per-pod RAM                 64 GB+
Storage volumes                       768 GB+ (use-case dependent), NFS-equivalent with read/write access
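
As a rough pre-flight aid, the numeric thresholds in the table can be expressed as a small check. The following Python sketch is illustrative only and is not part of Snorkel Flow: the function names and the keys of the capacity dictionary are hypothetical, and the capacity figures are assumed to be gathered separately (for example, from your cloud console or cluster tooling). It does not verify GPU type, processor architecture, or distribution.

```python
import re

# Thresholds copied from the cluster specifications table.
MIN_K8S_VERSION = (1, 25)
REQUIREMENTS = {
    "namespace_cpu": 64,       # total CPU cores available to the namespace
    "namespace_ram_gb": 360,   # total RAM (GB) available to the namespace
    "namespace_gpu": 4,        # GPU count (type is not checked here)
    "per_pod_cpu": 16,         # largest CPU allocation a single pod can get
    "per_pod_ram_gb": 64,      # largest RAM allocation (GB) a single pod can get
    "storage_gb": 768,         # shared storage capacity (GB)
}


def parse_k8s_version(git_version):
    """Extract (major, minor) from a string such as 'v1.27.3-eks-a5565ad'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def check_cluster(git_version, capacity):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if parse_k8s_version(git_version) < MIN_K8S_VERSION:
        problems.append(f"Kubernetes {git_version} is older than 1.25")
    for key, minimum in REQUIREMENTS.items():
        actual = capacity.get(key, 0)  # a missing figure counts as zero
        if actual < minimum:
            problems.append(f"{key}: have {actual}, need at least {minimum}")
    return problems


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    example = {
        "namespace_cpu": 96,
        "namespace_ram_gb": 384,
        "namespace_gpu": 4,
        "per_pod_cpu": 16,
        "per_pod_ram_gb": 64,
        "storage_gb": 512,
    }
    for problem in check_cluster("v1.27.3", example):
        print(problem)
```

Storage is listed as use-case dependent, so treat the 768 GB figure as a starting point and not a hard limit.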

Storage and ingress

To run Snorkel Flow in your K8s cluster, whether it’s running in a private datacenter or public cloud, you must meet these requirements:

  • A StorageClass that supports the ReadWriteOnce access mode.
  • A StorageClass that supports the ReadWriteMany access mode (e.g., NFS or another shared filesystem).
    • For AWS, Elastic File System (EFS) is a good option.
  • An IngressController that is running.
  • A DNS domain/zone that is ready for Snorkel Flow to use. Snorkel Flow requires five subdomains.
  • A TLS certificate issued by a publicly trusted certificate authority.
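
One way to confirm that a StorageClass supports a given access mode is to create a small throwaway PersistentVolumeClaim against it and see whether it binds. The Python sketch below only builds such a manifest as JSON, which `kubectl apply -f` accepts alongside YAML. The claim names, the StorageClass names (`gp3`, `efs-sc`), and the 1Gi size are hypothetical placeholders, not values required by Snorkel Flow. Note that with a StorageClass whose volume binding mode is WaitForFirstConsumer, the claim stays Pending until a pod mounts it.

```python
import json

# Access modes defined by the Kubernetes PersistentVolume API.
VALID_ACCESS_MODES = {
    "ReadWriteOnce",
    "ReadOnlyMany",
    "ReadWriteMany",
    "ReadWriteOncePod",
}


def build_probe_pvc(name, storage_class, access_mode, size="1Gi"):
    """Build a minimal PersistentVolumeClaim manifest as a plain dict."""
    if access_mode not in VALID_ACCESS_MODES:
        raise ValueError(f"unknown access mode: {access_mode!r}")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical StorageClass names; substitute the ones in your cluster.
    probes = [
        build_probe_pvc("rwo-probe", "gp3", "ReadWriteOnce"),
        build_probe_pvc("rwx-probe", "efs-sc", "ReadWriteMany"),
    ]
    for pvc in probes:
        print(json.dumps(pvc, indent=2))
```

Each printed manifest can be saved to its own file and applied to a test namespace. Delete the claims once they have bound.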

Major cloud infrastructure scripts

For the three major cloud providers, we provide Terraform and manual instructions to spin up infrastructure that satisfies the requirements above.

AWS

GCP

Azure

Install Snorkel Flow into the Kubernetes cluster

Once all of the required infrastructure resources are provisioned, you're ready to deploy Snorkel Flow into the Kubernetes cluster. Choose from these methods of installation into the cluster: