Skip to main content
Version: 25.4

snorkelflow.ingest.conversation_json_to_parquet

warning

This is a beta function in 25.4. Beta features may have known gaps or bugs, but are functional workflows and eligible for Snorkel Support. To access beta features, contact Snorkel Support to enable the feature flag for your Snorkel-hosted instance.

snorkelflow.ingest.conversation_json_to_parquet(input_json_file_path, output_parquet_file_path)

Generates SnorkelFlow ingestible PARQUET file from JSON. Note that we convert all columns to str using json.dumps(). Please use json.loads() if you want to use these columns later.

note
Since v0.73, this function no longer adds the output parquet file to a dataset. Please use create_datasource to do so.

Parameters

NameTypeDefaultInfo
input_json_file_pathstrPath to the input JSON file. Local path and MinIO path are supported.
output_parquet_file_pathstrPath of the generated parquet file. Only MinIO path is supported.

Return type

None