snorkelflow.client.datasets.get_dataset_data
- snorkelflow.client.datasets.get_dataset_data(dataset, split=None, start_date=None, end_date=None, target_columns=None)
Load raw data for the given dataset (prior to applying any processors).
Deprecated since version 2024.R4: Use snorkelflow.sdk.Dataset.get_dataframe() instead.
Parameters
Listed in the Name / Type / Default / Info table below.
Returns
An [n_data_points x n_fields] Pandas DataFrame containing the dataset data.
Return type
DataFrame
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| dataset | Union[str, int] | | Name or UID of the dataset to load unlabeled data from. |
| split | Optional[str] | None | Name of the split (“train”, “valid”, “test”) to load. None means load all splits. |
| start_date | Optional[str] | None | Fetch data starting from this date. Defaults to minus infinity. |
| end_date | Optional[str] | None | Fetch data up to this date. Defaults to infinity. |
| target_columns | Optional[List[str]] | None | Optional list of columns needed in the DataFrame. Defaults to all columns. |
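A minimal sketch of how a call might be assembled from the parameters listed above. The helper names `build_call_kwargs` and `load_raw_dataset`, the dataset name "my-dataset", and the column names are hypothetical placeholders. The import path is taken from the function's documented location, and the real call needs an installed snorkelflow package and a reachable Snorkel Flow instance. The date parameters are left out because their string format is not specified here.

```python
from typing import Any, Dict, List, Optional, Union

# Split names listed for the `split` parameter.
KNOWN_SPLITS = ("train", "valid", "test")


def build_call_kwargs(
    dataset: Union[str, int],
    split: Optional[str] = None,
    target_columns: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: check arguments and assemble keyword arguments."""
    if split is not None and split not in KNOWN_SPLITS:
        raise ValueError(
            f"split must be one of {KNOWN_SPLITS} or None, got {split!r}"
        )
    kwargs: Dict[str, Any] = {"dataset": dataset, "split": split}
    # Omitting target_columns keeps the documented default (all columns).
    if target_columns is not None:
        kwargs["target_columns"] = list(target_columns)
    return kwargs


def load_raw_dataset(**kwargs: Any):
    """Hypothetical wrapper around the documented (deprecated) function."""
    # Imported lazily so build_call_kwargs works without snorkelflow installed.
    from snorkelflow.client.datasets import get_dataset_data

    return get_dataset_data(**kwargs)


# Placeholder dataset name and columns, for illustration only.
example_kwargs = build_call_kwargs(
    "my-dataset", split="train", target_columns=["text", "label"]
)
print(example_kwargs)
```

Passing `example_kwargs` to `load_raw_dataset(**example_kwargs)` would request only the train split and the two named columns. Because the function is deprecated, new code should prefer snorkelflow.sdk.Dataset.get_dataframe().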