Secret storage and encryption
NOTE: This topic applies to Snorkel Flow 0.91.24 and later.
Snorkel Flow sometimes requires storing client credentials or other secret data. For example, Snorkel Flow might need to store an OIDC client secret or a third-party API key on the platform. Snorkel Flow stores these secrets with the following configuration:
- Storage: Secrets are stored in the Snorkel Flow Postgres database.
- Encryption: Snorkel Flow encrypts these secrets before storing them in the Postgres database. Snorkel Flow uses AES-256 encryption with a randomized initialization vector (IV) and an encryption key.
- Key: Snorkel recommends providing a user-created encryption key. If no user-created key is present, Snorkel Flow relies on a generated key.
Best practice: User-created encryption key
If your installation of Snorkel Flow is running on Kubernetes, create your own encryption key in the project's namespace.
Your Snorkel installation stores this key separately from the Snorkel Postgres database, in a Kubernetes Secret object.
The advantage of this approach is that someone who gains access to the database or its backups cannot decrypt the stored secrets without also obtaining the key. Snorkel Flow does not mount keys stored this way into the Kubernetes environment. Instead, Snorkel Flow calls the Kubernetes API to read the key when it needs to encrypt or decrypt.
WARNING: Do not rotate this key after it is set. If you change this value, you risk losing all encrypted data.
The secret looks like this:
apiVersion: v1
data:
  root-key: <YOUR_ENCODED_KEY>
kind: Secret
metadata:
  name: root-key-secret
  namespace: <YOUR_NAME_SPACE>
type: Opaque
These are the key parameters in this Kubernetes .yaml file:
- root-key: The base64-encoded key.
- name: Unique name for this secret.
- namespace: Shared namespace for the Snorkel Flow project.
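As a minimal sketch of how the root-key value could be produced, the Python below generates a random key, base64-encodes it, and fills in the manifest shown above. The 32-byte key length is an assumption based on AES-256 using a 256-bit key; this page does not state the required key format. The function names and the example namespace are hypothetical.

```python
import base64
import secrets


def generate_encoded_key(num_bytes: int = 32) -> str:
    """Return a random key, base64-encoded for a Secret's `data` field.

    Assumption: 32 random bytes (256 bits) is an acceptable root key.
    """
    raw_key = secrets.token_bytes(num_bytes)  # cryptographically secure random bytes
    return base64.b64encode(raw_key).decode("ascii")


def render_secret_manifest(encoded_key: str, namespace: str) -> str:
    """Fill the manifest from this page with an encoded key and a namespace."""
    return (
        "apiVersion: v1\n"
        "data:\n"
        f"  root-key: {encoded_key}\n"
        "kind: Secret\n"
        "metadata:\n"
        "  name: root-key-secret\n"
        f"  namespace: {namespace}\n"
        "type: Opaque\n"
    )


if __name__ == "__main__":
    # "my-snorkel-namespace" is a placeholder; use your project's namespace.
    print(render_secret_manifest(generate_encoded_key(), "my-snorkel-namespace"))
```

Treat the printed manifest as sensitive, because it contains the key. Keep a secure copy of the key outside the cluster, since changing the value later risks losing all encrypted data.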
For added security, use the secret manager and Kubernetes secrets integration provided by your cloud provider:
- For GCP, use the Secret Manager add-on with Google Kubernetes Engine.
- For AWS, use AWS Secrets Manager secrets with Kubernetes.
Default: Generated encryption key
If no user-provided key is available or the installation is not running on Kubernetes, Snorkel Flow generates a default key for each Snorkel Flow installation.
Snorkel Flow generates this key deterministically, using the Snorkel Key Component, which is known only to Snorkel AI, and a Customer Key Component, which Snorkel Flow generates securely for each installation.
However, Snorkel highly recommends that you provide your own encryption key to ensure that the stored secrets can be decrypted only with access to both the database contents and the secret store.
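To illustrate what deterministic generation from two components means, the sketch below combines two byte strings with HMAC-SHA256. This is not Snorkel Flow's actual derivation, which this page does not describe; the choice of HMAC-SHA256, the function name, and the sample component values are all assumptions made for illustration only.

```python
import hashlib
import hmac


def derive_key(vendor_component: bytes, customer_component: bytes) -> bytes:
    """Deterministically combine two components into a 32-byte key.

    Illustrative only: HMAC-SHA256 is an assumed construction,
    not the documented Snorkel Flow algorithm.
    """
    return hmac.new(vendor_component, customer_component, hashlib.sha256).digest()


if __name__ == "__main__":
    # Hypothetical placeholder values, not real key components.
    key_a = derive_key(b"vendor-component", b"customer-component-1")
    key_b = derive_key(b"vendor-component", b"customer-component-1")
    key_c = derive_key(b"vendor-component", b"customer-component-2")
    print(key_a == key_b)  # same inputs give the same key
    print(key_a == key_c)  # a different customer component gives a different key
```

A derivation of this kind can be reproduced by anyone who holds both components, which is one reason a user-created key kept in a separate secret store gives stronger separation.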