Version: 0.94

Data and application limit requirements

This page lists the data and application limits for a standard Snorkel Flow installation. When creating new datasets or applications, refer to these class, dataset, and datapoint limits for the best performance.

| Application type | # Unique classes | Max total dataset size | Max datapoint size (i.e., single row) | GPU required? |
| --- | --- | --- | --- | --- |
| Text classification (multi-class) | 2 - 100 | 2 GB | 10 KB | Recommended |
| Text classification (multi-label) | 1 - 100 | 2 GB | 10 KB | Recommended |
| Text extraction (candidate-based) | 2 - 100 | 2 GB | 10 KB | Recommended |
| Text extraction (sequence tagging) | 1 - 25 | 250 MB | 10 KB | Required |
| PDF extraction | 2 - 20 | 1.6 GB | 1.6 MB | Recommended |
| Image classification | 1 - 10 | 20 GB | 500 KB | Required |
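
As an illustration of how the limits in the table above might be checked before uploading data, here is a minimal Python sketch. It is not part of Snorkel Flow: the `AppLimits` class, the `LIMITS` keys, and the `check_dataset` function are hypothetical names, and the sketch assumes decimal units (1 KB = 1,000 bytes), which this page does not specify.

```python
from dataclasses import dataclass

# Assumption: decimal units; the page does not state which convention applies.
KB, MB, GB = 1000, 1000**2, 1000**3


@dataclass(frozen=True)
class AppLimits:
    min_classes: int
    max_classes: int
    max_dataset_bytes: int
    max_datapoint_bytes: int


# Values copied from the table above; the keys are hypothetical labels.
LIMITS = {
    "text_classification_multi_class": AppLimits(2, 100, 2 * GB, 10 * KB),
    "text_classification_multi_label": AppLimits(1, 100, 2 * GB, 10 * KB),
    "text_extraction_candidate_based": AppLimits(2, 100, 2 * GB, 10 * KB),
    "text_extraction_sequence_tagging": AppLimits(1, 25, 250 * MB, 10 * KB),
    "pdf_extraction": AppLimits(2, 20, 1600 * MB, 1600 * KB),
    "image_classification": AppLimits(1, 10, 20 * GB, 500 * KB),
}


def check_dataset(app_type: str, num_classes: int, rows: list[bytes]) -> list[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    limits = LIMITS[app_type]
    problems = []

    if not limits.min_classes <= num_classes <= limits.max_classes:
        problems.append(
            f"{num_classes} classes is outside "
            f"{limits.min_classes} - {limits.max_classes}"
        )

    total_bytes = sum(len(row) for row in rows)
    if total_bytes > limits.max_dataset_bytes:
        problems.append(f"dataset is {total_bytes} bytes, above the maximum")

    oversized = sum(1 for row in rows if len(row) > limits.max_datapoint_bytes)
    if oversized:
        problems.append(f"{oversized} datapoint(s) exceed the per-row maximum")

    return problems
```

Calling `check_dataset("pdf_extraction", 25, rows)` on rows that are already serialized to bytes returns one message per violated limit, so the result can be logged or used to block an upload.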

Additional Limits

In addition to the application-specific limits above, the following table outlines limits that apply to each application:

| Limit Type | Maximum Value |
| --- | --- |
| Number of Model Nodes | 20 |
| Number of Data Sources | 50 |
| Max Datasource Size | 100 MB |

These limits are designed to ensure optimal performance and resource utilization within Snorkel Flow. They may be adjusted based on your specific installation and requirements.
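
The per-application limits can be checked in the same spirit. The following Python sketch assumes the default values listed above have not been adjusted for your installation and that 1 MB = 1,000,000 bytes. The constant names and the `check_application` function are hypothetical and do not correspond to any Snorkel Flow API.

```python
# Assumption: decimal megabytes; the page does not state the convention.
MB = 1000**2

# Default per-application limits from the "Additional Limits" table.
MAX_MODEL_NODES = 20
MAX_DATA_SOURCES = 50
MAX_DATA_SOURCE_BYTES = 100 * MB


def check_application(num_model_nodes: int, data_source_sizes: list[int]) -> list[str]:
    """Return limit violations for one application (empty list if none).

    data_source_sizes holds the size in bytes of each data source.
    """
    problems = []

    if num_model_nodes > MAX_MODEL_NODES:
        problems.append(f"{num_model_nodes} model nodes exceeds {MAX_MODEL_NODES}")

    if len(data_source_sizes) > MAX_DATA_SOURCES:
        problems.append(
            f"{len(data_source_sizes)} data sources exceeds {MAX_DATA_SOURCES}"
        )

    for index, size in enumerate(data_source_sizes):
        if size > MAX_DATA_SOURCE_BYTES:
            problems.append(f"data source {index} is {size} bytes, above the maximum")

    return problems
```

If your Snorkel representative has adjusted the limits for your instance, the three constants at the top are the only values that would need to change.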

Reach out to your Snorkel representative for more information about instance sizing.