Data and application limit requirements
This page is a reference for data and application limit requirements for a standard Snorkel Flow installation. When creating new datasets or applications, please refer to these class, dataset, and datapoint limits for the best performance.
Application type | # Unique classes | Max total dataset size | Max datapoint size (i.e., single row) | GPU required? |
---|---|---|---|---|
Text classification (multi-class) | 2 - 100 | 2 GB | 10 KB | Recommended |
Text classification (multi-label) | 1 - 100 | 2 GB | 10 KB | Recommended |
Text extraction (candidate-based) | 2 - 100 | 2 GB | 10 KB | Recommended |
Text extraction (sequence tagging) | 1 - 25 | 250 MB | 10 KB | Required |
PDF extraction | 2 - 20 | 1.6 GB | 1.6 MB | Recommended |
Image classification | 1 - 10 | 20 GB | 500 KB | Required |
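To check a planned dataset against the table above before creating it, the limits can be encoded as data. The following is a minimal sketch, not part of Snorkel Flow: the dictionary keys, the `AppLimits` class, and the `check_dataset` function are hypothetical names, and it assumes the table uses decimal units (1 KB = 1,000 bytes).

```python
from dataclasses import dataclass

# Assumption: the table uses decimal units (1 KB = 1,000 bytes).
KB, MB, GB = 10**3, 10**6, 10**9


@dataclass(frozen=True)
class AppLimits:
    min_classes: int
    max_classes: int
    max_dataset_bytes: int
    max_datapoint_bytes: int
    gpu: str  # "Recommended" or "Required"


# Keys are hypothetical identifiers for this sketch, not Snorkel Flow API names.
LIMITS = {
    "text_classification_multi_class": AppLimits(2, 100, 2 * GB, 10 * KB, "Recommended"),
    "text_classification_multi_label": AppLimits(1, 100, 2 * GB, 10 * KB, "Recommended"),
    "text_extraction_candidate_based": AppLimits(2, 100, 2 * GB, 10 * KB, "Recommended"),
    "text_extraction_sequence_tagging": AppLimits(1, 25, 250 * MB, 10 * KB, "Required"),
    "pdf_extraction": AppLimits(2, 20, 1600 * MB, 1600 * KB, "Recommended"),
    "image_classification": AppLimits(1, 10, 20 * GB, 500 * KB, "Required"),
}


def check_dataset(app_type: str, n_classes: int, dataset_bytes: int,
                  largest_datapoint_bytes: int) -> list[str]:
    """Return a list of limit violations; an empty list means all checks pass."""
    limits = LIMITS[app_type]
    problems = []
    if not limits.min_classes <= n_classes <= limits.max_classes:
        problems.append(
            f"{n_classes} classes is outside the range "
            f"{limits.min_classes} - {limits.max_classes}"
        )
    if dataset_bytes > limits.max_dataset_bytes:
        problems.append(
            f"dataset size {dataset_bytes} bytes exceeds {limits.max_dataset_bytes}"
        )
    if largest_datapoint_bytes > limits.max_datapoint_bytes:
        problems.append(
            f"datapoint size {largest_datapoint_bytes} bytes exceeds "
            f"{limits.max_datapoint_bytes}"
        )
    return problems
```

For example, `check_dataset("text_extraction_sequence_tagging", 30, 300 * MB, 5 * KB)` returns two messages, one for the class count and one for the dataset size, while the datapoint size passes.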
Additional Limits
In addition to the application-specific limits above, the following table outlines limits that apply to each application:
Limit Type | Maximum Value |
---|---|
Number of Model Nodes | 20 |
Number of Data Sources | 50 |
Max Data Source Size | 100 MB |
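The per-application limits in this table can be checked in the same spirit. This is a rough sketch under stated assumptions: `check_application` and the constant names are hypothetical, sizes are assumed to be in decimal megabytes, and the constants should be changed if your installation uses adjusted limits.

```python
MB = 10**6  # assumption: decimal megabytes

# Values taken from the table above; adjust if your installation differs.
MAX_MODEL_NODES = 20
MAX_DATA_SOURCES = 50
MAX_DATA_SOURCE_BYTES = 100 * MB


def check_application(n_model_nodes: int, data_source_bytes: list[int]) -> list[str]:
    """Return limit violations for one application; empty list means none."""
    problems = []
    if n_model_nodes > MAX_MODEL_NODES:
        problems.append(f"{n_model_nodes} model nodes exceeds {MAX_MODEL_NODES}")
    if len(data_source_bytes) > MAX_DATA_SOURCES:
        problems.append(
            f"{len(data_source_bytes)} data sources exceeds {MAX_DATA_SOURCES}"
        )
    # Each data source is checked individually against the size cap.
    for index, size in enumerate(data_source_bytes):
        if size > MAX_DATA_SOURCE_BYTES:
            problems.append(
                f"data source {index} is {size} bytes, over {MAX_DATA_SOURCE_BYTES}"
            )
    return problems
```

Calling `check_application(3, [50 * MB, 80 * MB])` returns an empty list, since every value is within the limits.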
These limits are designed to ensure optimal performance and resource utilization within Snorkel Flow. They may be adjusted based on your specific installation and requirements.
Reach out to your Snorkel representative for more information about instance sizing.