Version: 0.95

Audit Logging in Snorkel Flow

This documentation provides an overview of audit logging in Snorkel Flow, including what it encompasses, the parts of the platform covered, and how to access audit logs.

Overview

Audit logging in Snorkel Flow is designed to track and record critical actions and events performed within the platform. The audit logs capture essential details about each activity to provide comprehensive visibility and accountability. Audit logs support security, compliance, and operational monitoring.

Prerequisites

To access and review audit logs, you need:

  • An existing Snorkel Flow deployment with access to a notebook server.
  • Administrative access to the Snorkel Flow deployment.

What are Audit Logs?

Audit logs provide a way to see which users have performed what actions within the platform. Specific actions and their metadata are logged in the platform, capturing:

  • What activity was performed? - The specific operation or action executed (e.g., user creation, dataset deletion).
  • Who or what performed the activity? - The identity of the user or service responsible for executing the action.
  • Where was the activity performed from? - The source system or IP address from which the action originated (partially captured).
  • What was the activity performed on? - The target entity or object affected by the action (e.g., dataset, application, user).
  • When was the activity performed? - The timestamp indicating when the action occurred.
  • What was the status, outcome, or result of the activity? - The result or status of the action, such as success or failure (partially captured).

What Do Audit Logs Encompass?

Audit logs in Snorkel Flow cover a broad range of actions across various components of the platform, including user management, data management, and application management. The following operations are currently audited:

Access Management

  • Access Attempts: Tracks user access to the system, including successful and failed attempts.
  • API Authorization Failure: Records details of failed API authorization attempts.
  • API Key Generation: Audits creation and regeneration of API keys.

User Management

  • User Create/Update/Delete: Audits actions involving user accounts, including creation, modification, and deletion.
  • Invite - Create: Tracks the creation of invitations for new users.
  • Role - CRUD: Logs create, read, update, and delete operations on user roles.
  • RBAC Create/Update/Delete: Audits actions related to Role-Based Access Control (RBAC) changes.
  • Password Changes: Tracks all user password change activities.

Data Management

  • Datasource Create/Update/Delete: Logs operations involving data source management, such as adding, updating, or deleting data sources (only through the UI).
  • Dataset - Delete: Tracks deletion of datasets.
  • Static Asset Upload: Logs uploads of static assets.
  • File Operations: Audits the start of data creation operations, including file uploads to MinIO and downloads from remote locations.

Application Management

  • Application Create/Update/Delete: Records actions related to the creation, updating, or deletion of applications.
  • Workspace Create/Update/Delete: Logs operations involving workspaces, such as their creation, updating, or deletion.

Python Environment Management

  • Custom Pip Install/Wheel Install/Reset: Audits all changes made to the Python environment, including custom package installations and resets.

Snorkel Flow Actions

  • Annotation Create/Update/Delete/Bulk Delete/Import: Logs all actions related to annotation management, including bulk operations and imports.
  • Annotation Transfer: Records details of transferring annotations between different entities or users.
  • Annotation Commit: Logs the committing of annotations to the system.
  • Batch Creation: Audits the creation of data batches.
  • Label Schema Create/Delete: Logs operations involving the creation or deletion of label schemas.
  • Training Set Create: Tracks the creation of training sets.
  • External LLM Configurations: Audits operations involving configuration changes for external language models.
  • Label Function (LF) CRUD Operations: Logs create, read, update, and delete actions on label functions.

Super Admin Actions

All actions performed by super admin users are fully audited, and the logs differentiate between actions performed through the user interface (UI) and those executed through the software development kit (SDK).

How to Access Audit Logs

You can access audit logs by calling the /audit-logs endpoint through the Snorkel Flow SDK. You can run the call inside a notebook server if you have SuperAdmin privileges. Here’s a sample code snippet:

# Inside Snorkel Flow notebook server
import snorkelflow.client_v3 as sf

# Configure client context for Snorkel Flow instance
ctx = sf.SnorkelFlowContext.from_kwargs()
resp = ctx.tdm_client.get('/audit-logs?limit=100')

print(resp)

This returns a JSON-formatted list of values. An example is shown below.

[
{
"event_id": 1,
"event_time": "2021-08-17T20:51:47.268843",
"event_name": "api_key",
"event_type": "create",
"event_details": {
"censored_key": "************************************************************E4YO"
},
"user_uid": 2
}
]

The meaning of these fields is as follows:

  • event_id: A unique ID for each event that has occurred on the platform.
  • event_time: The time the event took place.
  • event_name: The name of the type of event. This usually points to the resource the event relates to (e.g., lf, application, workspace, user).
  • event_type: The type of event that has occurred (e.g., create, read, update, delete).
  • event_details: A JSON object whose contents are specific to each event type. The information in this field is only relevant for the specific event_name and event_type (e.g., for a login event, you will have {'user_uid': 22, 'username': 'john_wick'}).
  • user_uid: The uid of the user who took the action.
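
As a rough illustration of how these fields might be used, the sketch below filters a list of events shaped like the sample above by event_name and event_type. The helper filter_events is hypothetical and not part of the Snorkel Flow SDK. The second event in the sample payload is invented for illustration only, including its event_name value.

```python
import json
from datetime import datetime

# The first event is copied from the example above. The second is invented
# purely for illustration; its event_name and details are assumptions.
SAMPLE = """
[
  {"event_id": 1, "event_time": "2021-08-17T20:51:47.268843",
   "event_name": "api_key", "event_type": "create",
   "event_details": {"censored_key": "****E4YO"}, "user_uid": 2},
  {"event_id": 2, "event_time": "2021-08-17T21:03:10.000000",
   "event_name": "workspace", "event_type": "delete",
   "event_details": {}, "user_uid": 2}
]
"""


def filter_events(events, event_name=None, event_type=None):
    """Return the events matching the given name and/or type (hypothetical helper)."""
    return [
        e for e in events
        if (event_name is None or e.get("event_name") == event_name)
        and (event_type is None or e.get("event_type") == event_type)
    ]


events = json.loads(SAMPLE)
api_key_events = filter_events(events, event_name="api_key", event_type="create")

for e in api_key_events:
    # event_time in the sample is an ISO 8601 string without a timezone.
    when = datetime.fromisoformat(e["event_time"])
    print(e["event_id"], when.date(), e["user_uid"])
```

Filtering on the client side like this is only a convenience. This page does not describe any server-side filters beyond limit and last_id.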

Endpoint: Get Audit Logs

This endpoint allows you to fetch a list of audit logs from the system, which can be used for tracking and monitoring events related to the application's activities.

URL

GET YOUR_SNORKEL_INSTANCE_URL/audit-logs

Headers

  • Authorization: This header must contain the API key for authorization. For example, "Authorization: key API_KEY".
  • Content-Type: This should be set to application/json.

Query Parameters

  • limit (optional, integer): The maximum number of audit log events to retrieve.

    • Default: 200
    • Maximum: 1000
    • Minimum: 1
  • last_id (optional, integer): The ID of the last audit log event from a previous request. Used for pagination to fetch the next set of audit logs.
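
The sketch below shows one way to assemble these query parameters while enforcing the documented bounds on limit before a request is sent. The function name build_audit_logs_path is hypothetical. Raising an error on the client side is an assumption, since this page does not say how the server handles out-of-range values.

```python
from urllib.parse import urlencode

# Bounds and default taken from the parameter description above.
LIMIT_MIN, LIMIT_MAX, LIMIT_DEFAULT = 1, 1000, 200


def build_audit_logs_path(limit=LIMIT_DEFAULT, last_id=None):
    """Return the /audit-logs path with a validated query string (hypothetical helper)."""
    if not LIMIT_MIN <= limit <= LIMIT_MAX:
        raise ValueError(f"limit must be between {LIMIT_MIN} and {LIMIT_MAX}")
    params = {"limit": limit}
    if last_id is not None:
        # Only include last_id when continuing from a previous request.
        params["last_id"] = last_id
    return "/audit-logs?" + urlencode(params)


print(build_audit_logs_path(limit=100, last_id=50))
```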

Request Example

curl -L -H "Authorization: key API_KEY" -H "Content-Type: application/json" "https://edge-tdm-api.k8s.g498.io/audit-logs?limit=100&last_id=50"
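
For callers outside a notebook server, a roughly equivalent request can be built with Python's standard library. This is a minimal sketch, not an official client. The host https://snorkel.example.com and the key API_KEY are placeholders, and the function names are hypothetical. Like curl -L, urllib.request.urlopen follows redirects by default.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_audit_logs_request(base_url, api_key, limit=100, last_id=None):
    """Build (but do not send) a GET request for the /audit-logs endpoint."""
    params = {"limit": limit}
    if last_id is not None:
        params["last_id"] = last_id
    url = f"{base_url.rstrip('/')}/audit-logs?{urlencode(params)}"
    headers = {
        "Authorization": f"key {api_key}",  # header format described above
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_audit_logs(request, timeout=30):
    """Send the request and decode the JSON body. Requires network access."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholder values; substitute your own instance URL and API key.
req = build_audit_logs_request(
    "https://snorkel.example.com", "API_KEY", limit=100, last_id=50
)
print(req.get_method(), req.full_url)
```

Calling fetch_audit_logs(req) sends the request. It is left uncalled here because the host is a placeholder.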

Response

The response is a JSON object containing:

  • events (array of AuditEvent objects): A list of audit events.
  • last_id (integer): The ID of the last audit log event retrieved, which can be used for pagination in subsequent requests.
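
The last_id value can drive a simple pagination loop, sketched below. The sketch assumes that an empty events array signals the end of the log and that passing the returned last_id to the next request yields the following page. Neither behavior is confirmed on this page. The names iter_audit_events and fake_fetch_page are hypothetical. The fake fetcher stands in for a real HTTP call so that the loop can run offline.

```python
def iter_audit_events(fetch_page, limit=200):
    """Yield audit events page by page.

    fetch_page(limit, last_id) must return a dict with 'events' and 'last_id',
    mirroring the response shape described above.
    """
    last_id = None
    while True:
        page = fetch_page(limit, last_id)
        events = page.get("events", [])
        if not events:
            break  # assumption: an empty page means there is nothing left
        yield from events
        new_last_id = page.get("last_id")
        if new_last_id is None or new_last_id == last_id:
            break  # guard against looping forever on a repeated cursor
        last_id = new_last_id


def fake_fetch_page(limit, last_id):
    """Offline stand-in for the endpoint: serves event IDs 1 through 5 in order."""
    start = 0 if last_id is None else last_id
    ids = list(range(start + 1, 6))[:limit]
    return {
        "events": [{"event_id": i} for i in ids],
        "last_id": ids[-1] if ids else last_id,
    }


all_ids = [e["event_id"] for e in iter_audit_events(fake_fetch_page, limit=2)]
print(all_ids)
```

In real use, fetch_page would wrap an authenticated GET request to /audit-logs and return the decoded JSON body.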