Version: 0.91

snorkelflow.ingest.conversation_json_to_parquet

snorkelflow.ingest.conversation_json_to_parquet(input_json_file_path, output_parquet_file_path)

Generates a SnorkelFlow-ingestible Parquet file from a JSON file. Every column is converted to str using json.dumps(), so call json.loads() on a column's values if you need the original objects later.
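The string conversion described above can be illustrated with the standard json module alone. This is a minimal sketch, not the function's actual implementation: the record and its field names (conversation_id, turns, speaker, text) are hypothetical, and the dict comprehensions only imitate a per-column json.dumps() followed by a later json.loads().

```python
import json

# Hypothetical conversation record; the field names are illustrative only.
record = {
    "conversation_id": 7,
    "turns": [
        {"speaker": "user", "text": "Hi"},
        {"speaker": "agent", "text": "Hello"},
    ],
}

# Imitate the documented conversion: every column value becomes a JSON string.
stringified = {column: json.dumps(value) for column, value in record.items()}

# Decode each column with json.loads() before using it as a Python object again.
restored = {column: json.loads(value) for column, value in stringified.items()}
```

Scalar values are stringified as well, so a numeric column would also need json.loads() before it can be used as a number.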

note
Since v0.73, this function no longer adds the output parquet file to a dataset. Please use create_datasource to do so.
Parameters:
  • input_json_file_path (str) – Path to the input JSON file. Local path and MinIO path are supported.

  • output_parquet_file_path (str) – Path of the generated parquet file. Only MinIO path is supported.

Return type:

None