Version: 0.91

operators.spacy.SpacyTokenizer

class operators.spacy.SpacyTokenizer(text_field, tokens_field=None)

Preprocessor that parses a document with spaCy and adds a JSON column containing its tokens.

Used by Sequence Tagging applications to attach token-level metadata to each document.

Parameters:
  • text_field (str) – The field to parse with spaCy.

  • tokens_field (Optional[str], default: None) – The field in which to store the tokens list object.
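
The field-in, field-out contract described above can be illustrated with a minimal sketch. This is not the operator's implementation: the helper `add_tokens`, the regex tokenizer (a dependency-free stand-in for spaCy), the field names `body` and `body_tokens`, and the `text`/`start`/`end` token schema are all assumptions made for illustration. The behavior when `tokens_field` is left as `None` is not documented here, so the sketch requires it explicitly.

```python
import json
import re

# Stand-in for spaCy tokenization (assumption): words and single punctuation marks.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def add_tokens(row, text_field, tokens_field):
    """Return a copy of `row` with a JSON token list stored under `tokens_field`.

    Hypothetical helper; the token schema (text, start, end) is an assumption.
    """
    tokens = [
        {"text": m.group(), "start": m.start(), "end": m.end()}
        for m in _TOKEN_RE.finditer(row[text_field])
    ]
    out = dict(row)  # leave the input row unmodified
    out[tokens_field] = json.dumps(tokens)
    return out


row = {"body": "Hello, world!"}
result = add_tokens(row, text_field="body", tokens_field="body_tokens")
tokens = json.loads(result["body_tokens"])
print([t["text"] for t in tokens])
```

Storing character offsets alongside each token's text is one common choice for sequence tagging, since span labels can then be mapped back to positions in the original text.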