Snorkel Flow Kubernetes installation overview
This topic is your starting point to deploy Snorkel Flow on Kubernetes. Installing Snorkel Flow is a two-step process:
Preparing infrastructure resources
At a high level, Snorkel Flow requires the following pieces of infrastructure for deployment:
- An operational Kubernetes (K8s) cluster
- NFS or equivalent (NAS drive, etc.)
- A domain to create URLs for the platform
If you are running Snorkel Flow on a major cloud provider (AWS, GCP, or Azure), Snorkel recommends deploying into a new, dedicated K8s cluster. Deploying into an existing cluster or on-prem is also possible. For each of the three major cloud providers, Snorkel offers both Terraform and manual methods for provisioning all of the cloud infrastructure required before installing Snorkel Flow.
Assumptions and requirements
Snorkel Flow can run on any Kubernetes installation that satisfies the requirements in the Cluster Specifications and Storage and ingress sections.
Cluster Specifications
Category | Requirement |
---|---|
Minimum Kubernetes version | 1.25+ |
Recommended Kubernetes distribution | AWS EKS, GCP GKE, Azure AKS, OpenShift 4.9+ |
Node Processor | x86_64 |
Namespace total CPU | 64+ |
Namespace total RAM | 360 GB+ |
Namespace total GPU | 4+ T4 or better |
Available per-pod CPU | 16+ |
Available per-pod RAM | 64 GB+ |
Storage volumes | 768 GB+ (use-case dependent) NFS-equivalent with read/write access |
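The minimums in the table can be checked before installation. The following is a minimal sketch, not part of the Snorkel Flow tooling: the function names, the dictionary keys, and the idea of passing in resource totals by hand are all assumptions made for illustration. Only the threshold values come from the table above.

```python
import re

# Minimums copied from the Cluster Specifications table.
# The key names are hypothetical labels chosen for this sketch.
MINIMUMS = {
    "namespace_cpu": 64,
    "namespace_ram_gb": 360,
    "namespace_gpu": 4,
    "per_pod_cpu": 16,
    "per_pod_ram_gb": 64,
    "storage_gb": 768,
}
MIN_K8S_VERSION = (1, 25)


def parse_k8s_version(version):
    """Return (major, minor) from a string such as 'v1.27.3' or '1.25'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_cluster(k8s_version, resources):
    """Return a list of shortfalls; an empty list means every minimum is met."""
    problems = []
    if parse_k8s_version(k8s_version) < MIN_K8S_VERSION:
        problems.append(f"Kubernetes {k8s_version} is older than 1.25")
    for key, minimum in MINIMUMS.items():
        # A missing key is treated as zero, so it is reported as a shortfall.
        actual = resources.get(key, 0)
        if actual < minimum:
            problems.append(f"{key}: have {actual}, need at least {minimum}")
    return problems
```

Calling `check_cluster("v1.27.3", {...})` with the totals for your namespace returns the unmet requirements. The check does not cover the GPU model (T4 or better) or the processor architecture, which need to be verified separately.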
Storage and ingress
To run Snorkel Flow in your K8s cluster, whether it’s running in a private datacenter or public cloud, you must meet these requirements:
- A StorageClass that supports the ReadWriteOnce access mode.
- A StorageClass that supports the ReadWriteMany access mode (e.g., NFS or another shared filesystem).
  - For AWS, Elastic File System (EFS) is a good option.
- A running IngressController.
  - For AWS, AWS Load Balancer Controller is a good option.
- A DNS domain/zone that is ready for Snorkel Flow to use. Snorkel Flow requires five subdomains.
- A TLS certificate issued by a publicly trusted certificate authority.
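The storage part of this checklist can be sketched as a small coverage check. This is an illustrative helper, not a Snorkel Flow or Kubernetes API. It assumes you supply, by hand, which access modes each StorageClass's provisioner supports, and it assumes the single-node mode required above is Kubernetes' `ReadWriteOnce`. The StorageClass names in the usage note are hypothetical.

```python
# Access modes that Kubernetes defines for PersistentVolumes.
VALID_ACCESS_MODES = {
    "ReadWriteOnce",
    "ReadOnlyMany",
    "ReadWriteMany",
    "ReadWriteOncePod",
}

# Modes this sketch assumes Snorkel Flow needs, based on the list above.
REQUIRED_ACCESS_MODES = ("ReadWriteOnce", "ReadWriteMany")


def missing_access_modes(storage_classes):
    """Return the required access modes that no listed StorageClass supports.

    storage_classes maps a StorageClass name to the access modes its
    provisioner supports (user-supplied; not read from the cluster).
    """
    supported = set()
    for name, modes in storage_classes.items():
        unknown = set(modes) - VALID_ACCESS_MODES
        if unknown:
            # Catch misspelled mode names early.
            raise ValueError(f"{name}: unknown access mode(s) {sorted(unknown)}")
        supported.update(modes)
    return [mode for mode in REQUIRED_ACCESS_MODES if mode not in supported]
```

For example, `missing_access_modes({"block-sc": ["ReadWriteOnce"]})` reports that `ReadWriteMany` is still uncovered, which is the gap a shared filesystem such as NFS or EFS would fill.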
Major cloud infrastructure scripts
For the three major cloud providers, we provide Terraform and manual instructions to spin up infrastructure that satisfies the requirements above.
AWS
GCP
Azure
Install Snorkel Flow into the Kubernetes cluster
Once all of the required infrastructure resources are provisioned, you're ready to deploy Snorkel Flow into the Kubernetes cluster. Choose from these methods of installation into the cluster: