Version: 25.5

Secret storage and encryption

NOTE: This topic applies to Snorkel AI Data Development Platform 0.91.24 and later.

Snorkel AI Data Development Platform sometimes requires storing client credentials or other secret data. For example, the platform might need to store an OIDC client secret or a third-party API key. Snorkel stores these secrets with the following configuration:

  • Storage: Secrets are stored in the Snorkel Postgres database.
  • Encryption: Snorkel encrypts these secrets before storing them in the Postgres database. Snorkel uses AES-256 encryption with a randomized initialization vector (IV) and an encryption key.
  • Key: Snorkel recommends that you provide a user-created encryption key. If no user key is present, Snorkel relies on a generated key.

Best practice: User-created encryption key

If your installation of the Snorkel AI Data Development Platform is running on Kubernetes, create your own encryption key in the project's namespace.

Your Snorkel installation stores this key separately from the Snorkel Postgres database, in a Kubernetes Secret object.

The advantage of this approach is that if someone gains access to the database or its backups, they cannot decrypt the stored secrets. The Snorkel AI Data Development Platform does not mount keys stored this way into the Kubernetes environment. Instead, the Snorkel AI Data Development Platform calls the Kubernetes API for encryption and decryption when needed.

WARNING: Do not rotate this key after it is set. If you change this value, you risk losing all encrypted data.

The secret looks like this:

apiVersion: v1
data:
  root-key: <YOUR_ENCODED_KEY>
kind: Secret
metadata:
  name: root-key-secret
  namespace: <YOUR_NAME_SPACE>
type: Opaque

These are the key parameters in this Kubernetes .yaml file:

  • root-key: The base64-encoded key.
  • name: Unique name for this secret.
  • namespace: Shared namespace for the Snorkel project.
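One way to produce a value for root-key is sketched below in Python, using only the standard library. It assumes the key is 32 random bytes (256 bits, matching AES-256) and that the base64 encoding of those raw bytes is what belongs in the data field. This page does not state the expected key length or format, so confirm both before use. The function names and the namespace value are hypothetical.

```python
import base64
import json
import secrets


def generate_encoded_key(num_bytes: int = 32) -> str:
    """Return a random key, base64-encoded for a Kubernetes Secret `data` field.

    Assumption: 32 bytes (256 bits) is the right size for an AES-256 key here.
    """
    raw_key = secrets.token_bytes(num_bytes)  # cryptographically secure random bytes
    return base64.b64encode(raw_key).decode("ascii")


def build_secret_manifest(encoded_key: str, namespace: str) -> dict:
    """Build the Secret manifest shown above as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "data": {"root-key": encoded_key},
        "kind": "Secret",
        "metadata": {"name": "root-key-secret", "namespace": namespace},
        "type": "Opaque",
    }


if __name__ == "__main__":
    # "my-snorkel-namespace" is a placeholder; use your project's namespace.
    manifest = build_secret_manifest(generate_encoded_key(), "my-snorkel-namespace")
    print(json.dumps(manifest, indent=2))
```

Kubernetes accepts manifests in JSON as well as YAML, so the printed output can be saved to a file and applied like the .yaml version. Because the output contains the key, do not write it to logs or commit it to version control.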

For added security, use the secret manager and Kubernetes secrets integration provided by your cloud provider.

Default: Generated encryption key

If no user-provided key is available or the installation is not running on Kubernetes, the Snorkel AI Data Development Platform generates a default key for each installation.

The Snorkel AI Data Development Platform generates this key deterministically, using the Snorkel Key Component, which is known only to Snorkel AI, and a Customer Key Component, which Snorkel generates securely for each installation.
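To illustrate what "deterministically" means here, the following Python sketch combines two components into a 32-byte key. This is not Snorkel's derivation, which this page does not describe. HMAC-SHA256 is a stand-in chosen for the example, and the function and parameter names are hypothetical. The sketch only demonstrates that the same two inputs always yield the same key, so the key can be regenerated rather than stored.

```python
import hashlib
import hmac


def derive_key(snorkel_component: bytes, customer_component: bytes) -> bytes:
    """Derive a 32-byte key from two components (illustrative stand-in only).

    HMAC-SHA256 is used here as an example of a deterministic keyed function;
    the real derivation used by the platform is not documented on this page.
    """
    return hmac.new(snorkel_component, customer_component, hashlib.sha256).digest()


if __name__ == "__main__":
    # Placeholder inputs; real components would be secret, high-entropy values.
    first = derive_key(b"example-snorkel-component", b"example-customer-component")
    second = derive_key(b"example-snorkel-component", b"example-customer-component")
    print(first == second)  # same inputs, same key
    print(len(first))       # SHA-256 digests are 32 bytes long
```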

However, Snorkel highly recommends that you provide your own encryption key so that the stored secrets can be decrypted only with access to both the database contents and the secret store.