Secret storage and encryption
NOTE: This topic applies to Snorkel AI Data Development Platform 0.91.24 and later.
The Snorkel AI Data Development Platform sometimes needs to store client credentials or other secret data, such as an OIDC client secret or a third-party API key. Snorkel stores these secrets with the following configuration:
- Storage: Secrets are stored in the Snorkel Postgres database.
- Encryption: Snorkel encrypts these secrets before storing them in the Postgres database. Snorkel uses AES-256 encryption with a randomized initialization vector (IV) and an encryption key.
- Key: Snorkel recommends that you provide a user-created encryption key. If no user key is present, Snorkel relies on a generated key.
Best practice: User-created encryption key
If your installation of the Snorkel AI Data Development Platform is running on Kubernetes, create your own encryption key in the project's namespace.
Your Snorkel installation stores this key separately from the Snorkel Postgres database, using a Kubernetes Secret object.
The advantage of this approach is that if someone gains access to the database or its backups, they cannot decrypt the stored secrets. The Snorkel AI Data Development Platform does not mount keys stored this way into the Kubernetes environment. Instead, the Snorkel AI Data Development Platform calls the Kubernetes API for encryption and decryption when needed.
WARNING: Do not rotate this key after it is set. If you change this value, you risk losing all encrypted data.
The secret looks like this:
apiVersion: v1
data:
  root-key: <YOUR_ENCODED_KEY>
kind: Secret
metadata:
  name: root-key-secret
  namespace: <YOUR_NAME_SPACE>
type: Opaque
These are the key parameters in this Kubernetes .yaml file:
- root-key: The base64-encoded key.
- name: Unique name for this secret.
- namespace: Shared namespace for the Snorkel project.
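The following is a minimal sketch of one way to produce a value for the root-key field and fill in the manifest above. It assumes a 32-byte random key, chosen to match the AES-256 key size; this page does not state the required key length or format, so confirm it for your installation. The function names and the "snorkel" namespace are hypothetical placeholders.

```python
import base64
import secrets

# Assumption: a 32-byte (256-bit) key, matching the AES-256 key size.
# The required length is not stated in this topic.
KEY_LENGTH_BYTES = 32


def generate_encoded_key(length: int = KEY_LENGTH_BYTES) -> str:
    """Return a random key, base64-encoded for the Secret's data field."""
    raw_key = secrets.token_bytes(length)  # cryptographically secure random bytes
    return base64.b64encode(raw_key).decode("ascii")


def render_secret_manifest(encoded_key: str, namespace: str) -> str:
    """Fill the manifest shown above with the encoded key and namespace."""
    return (
        "apiVersion: v1\n"
        "data:\n"
        f"  root-key: {encoded_key}\n"
        "kind: Secret\n"
        "metadata:\n"
        "  name: root-key-secret\n"
        f"  namespace: {namespace}\n"
        "type: Opaque\n"
    )


if __name__ == "__main__":
    # "snorkel" is a placeholder namespace, not a value from this topic.
    print(render_secret_manifest(generate_encoded_key(), namespace="snorkel"))
```

Because the rendered manifest contains the key itself, treat the output as sensitive: apply it to the cluster directly and avoid committing it to version control.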
For added security, use the secret manager and Kubernetes secrets integration provided by your cloud provider:
- For GCP, use the Secret Manager add-on with Google Kubernetes Engine.
- For AWS, use AWS Secrets Manager secrets with Kubernetes.
Default: Generated encryption key
If no user-provided key is available or the installation is not running on Kubernetes, the Snorkel AI Data Development Platform generates a default key for each installation.
The Snorkel AI Data Development Platform generates this key deterministically, using the Snorkel Key Component, which is known only to Snorkel AI, and a Customer Key Component, which Snorkel generates securely for each installation.
However, Snorkel highly recommends that you provide your own encryption key to ensure that the stored secrets can be decrypted only with access to both the database contents and the secret store.
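To illustrate what deterministic generation from two components means, here is a small sketch. It is not Snorkel's actual derivation, which this topic does not describe. HMAC-SHA256, the function name, and the placeholder byte strings are all assumptions chosen only to show that the same two inputs always yield the same key, and that changing either input yields a different one.

```python
import hashlib
import hmac


def derive_default_key(vendor_component: bytes, customer_component: bytes) -> bytes:
    """Deterministically combine two components into a 32-byte key.

    Assumption: HMAC-SHA256 stands in for the real, undocumented derivation.
    """
    return hmac.new(vendor_component, customer_component, hashlib.sha256).digest()


if __name__ == "__main__":
    # Placeholder values for illustration only.
    key = derive_default_key(b"vendor-component", b"customer-component")
    # The same inputs always reproduce the same key.
    assert key == derive_default_key(b"vendor-component", b"customer-component")
    print(len(key))  # 32 bytes, the AES-256 key size
```

In a scheme of this shape, anyone holding both components can regenerate the key, which is why a user-provided key kept in a separate secret store gives stronger separation.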