Version: 0.96

AWS Infrastructure Setup - Terraform (Recommended)

Overview

This document guides you through creating a new Kubernetes cluster in your existing AWS account, including all required resources. The process uses the AWS web console and our Terraform configuration.

Assumptions/Prerequisites

By default, Snorkel Flow is deployed into an existing AWS VPC and its subnets, and the setup creates new infrastructure in your AWS account. We start with a few assumptions:

  • You have access to an AWS account
  • You have access to a VPC, along with 3 or more subnets within that VPC
  • You have a domain that you own, as well as a certificate for TLS
    • The domain is configured in Route53
    • The certificate is configured in AWS Certificate Manager
  • The individual executing these setup tasks has admin access to the VPC
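
The assumptions above can be spot-checked from a terminal before starting. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials. The `preflight` function name and its two arguments (a VPC ID and a domain) are hypothetical and are not part of the Snorkel package.

```shell
# Hypothetical pre-flight check; pass your own VPC ID and domain.
# Assumes the AWS CLI is installed and has credentials configured.
preflight() {
  vpc_id="$1"
  domain="$2"

  # Confirm which AWS account and identity the CLI is using.
  aws sts get-caller-identity || return 1

  # List the subnet IDs in the VPC; there should be 3 or more.
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=${vpc_id}" \
    --query 'Subnets[].SubnetId' --output text || return 1

  # Look up the Route53 hosted zone for the domain.
  aws route53 list-hosted-zones-by-name --dns-name "${domain}" || return 1

  # List certificates in AWS Certificate Manager (current region only).
  aws acm list-certificates
}
```

A call would look like `preflight vpc-0123456789abcdef0 example.com`, with both values replaced by your own. The sketch only lists resources; it does not verify admin permissions on the VPC.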

Subnet Tagging

To ensure the Kubernetes cluster created later in this document recognizes the subnets you provide (3 or more), add 2 tags to each of them:

  • Navigate to subnets in the AWS Console
    • VPC -> Subnets
  • For each subnet, add an elb tag (used by the aws-load-balancer-controller later in this document) and a cluster tag
    • Subnet -> Manage tags
    • Add an elb tag
      • For private subnets
        • Key: kubernetes.io/role/internal-elb
        • Value: 1
      • For public subnets
        • Key: kubernetes.io/role/elb
        • Value: 1
    • Add a cluster tag
      • Key: kubernetes.io/cluster/snorkel-flow-aws
      • Value: shared
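
The same tags can also be applied from the command line instead of the console. The sketch below assumes the AWS CLI is available; the helper name `tag_subnet_cmd` and the subnet IDs are placeholders. It only prints the `aws ec2 create-tags` commands so they can be reviewed first.

```shell
# Cluster name, taken from the cluster tag key listed above.
CLUSTER_NAME="snorkel-flow-aws"

# Print the AWS CLI command that adds both tags to one subnet.
# $1 = subnet ID, $2 = "private" or "public"
tag_subnet_cmd() {
  subnet_id="$1"
  kind="$2"
  case "$kind" in
    private) role_key="kubernetes.io/role/internal-elb" ;;
    public)  role_key="kubernetes.io/role/elb" ;;
    *) echo "unknown subnet kind: $kind" >&2; return 1 ;;
  esac
  echo aws ec2 create-tags \
    --resources "$subnet_id" \
    --tags "Key=${role_key},Value=1" \
    "Key=kubernetes.io/cluster/${CLUSTER_NAME},Value=shared"
}

# Placeholder subnet IDs; replace them with your own.
tag_subnet_cmd subnet-aaaa1111 private
tag_subnet_cmd subnet-bbbb2222 private
tag_subnet_cmd subnet-cccc3333 public
```

Each printed line is a complete command. Once the output looks right, it can be executed by piping it to `sh`.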

Resource Creation

In this step, we use the Terraform package provided by the Snorkel team to create the infrastructure needed to run Snorkel Flow:

  • Download and extract the AWS Terraform files, and store them somewhere accessible on the machine that has the required admin permissions to the VPC
  • `cd` into the top-level directory that contains the `variables.tf` and `terraform.tfvars` files. The `terraform.tfvars` file supplies your values for the variables declared in `variables.tf`.
  • In `terraform.tfvars`, fill in the required variables:
    • vpc_id: ID of the existing VPC to deploy into
    • subnets: a list of existing subnets in the VPC (3 or more) to use for Snorkel Flow
    • domain_filter: the name of an existing hosted zone in Route53, to use as the domain for Snorkel Flow URLs
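
As an illustration of what the filled-in values might look like, the sketch below writes a separate example file rather than touching the shipped `terraform.tfvars`. Every value is a made-up placeholder, and the value types (a string, a list of subnet ID strings, a string) are assumptions based on the descriptions above; check `variables.tf` for the declared types.

```shell
# Write an example variables file with placeholder values.
# All IDs and the domain are invented; the value types are assumed,
# not confirmed by the Snorkel package.
cat > example.tfvars <<'EOF'
vpc_id        = "vpc-0123456789abcdef0"
subnets       = ["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"]
domain_filter = "example.com"
EOF
```

Copy the equivalent lines, with your real values, into `terraform.tfvars`.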

Once the variables are set, run Terraform and wait for the resources to be created, which can take about 15-20 minutes:

  • terraform init
  • terraform plan
  • terraform apply
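
The three commands can be chained so that a failure in one step stops the rest. This is a minimal sketch, not the Snorkel team's script: the `deploy` function name and the `tfplan` file name are hypothetical. It differs slightly from the list above in that it saves the plan to a file and applies that file, so `apply` runs exactly what was planned and does not prompt for confirmation.

```shell
# Run the three Terraform steps in order, stopping at the first failure.
# Call this from the directory that contains variables.tf.
# The plan is saved to "tfplan" and then applied without a prompt.
deploy() {
  terraform init &&
    terraform plan -out=tfplan &&
    terraform apply tfplan
}
```

To review the plan before anything is created, run `terraform plan` on its own first and call `deploy` afterwards.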

Once Terraform finishes, you should have a new Kubernetes cluster to deploy Snorkel Flow into.
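
One way to confirm the cluster is reachable is sketched below. It rests on assumptions this document does not confirm: that the Terraform configuration creates an Amazon EKS cluster, that the cluster is named `snorkel-flow-aws` (inferred from the cluster tag), and that both the AWS CLI and `kubectl` are installed. The `verify_cluster` function name is hypothetical.

```shell
# Hypothetical post-deploy check. Assumes an Amazon EKS cluster and
# that the AWS CLI and kubectl are installed.
# $1 = cluster name (assumed from the cluster tag), $2 = AWS region
# Step 1 adds the cluster to the local kubeconfig; step 2 lists its nodes.
verify_cluster() {
  cluster_name="$1"
  region="$2"
  aws eks update-kubeconfig --name "$cluster_name" --region "$region" &&
    kubectl get nodes
}
```

A call would look like `verify_cluster snorkel-flow-aws us-west-2`, with the region replaced by the one you deployed into.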