operators.spacy.SpacyPreprocessor
- class operators.spacy.SpacyPreprocessor(field, target_field='doc', model='en_core_web_sm', disable=None, **spacy_kwargs)
Preprocessor that parses a document field with spaCy and adds a JSON doc column.
Used by sequence tagging applications to attach additional document metadata.
Parameters
Name | Type | Default | Info
field | str | | The field to parse with spaCy.
target_field | str | 'doc' | The field in which to store the parsed doc object.
model | str | 'en_core_web_sm' | The model to load into spaCy (only supports models in strap).
disable | Optional[List[str]] | None | Optional list of pipeline steps to disable.
spacy_kwargs | Dict[str, Any] | | Kwargs to forward to the spacy.load function.
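The sketch below illustrates how these parameters might fit together. It is not the implementation of SpacyPreprocessor: the helper names `load_pipeline` and `add_doc_column` are hypothetical, rows are modelled as plain dicts, and producing the JSON doc with spaCy's `Doc.to_json()` is an assumption about what the "json doc column" contains.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional


def load_pipeline(
    model: str = "en_core_web_sm",
    disable: Optional[List[str]] = None,
    **spacy_kwargs: Any,
):
    """Load a spaCy pipeline, forwarding extra keyword arguments to spacy.load."""
    # Imported lazily so add_doc_column can be used with any callable pipeline.
    import spacy

    # Treat disable=None as "disable nothing".
    return spacy.load(model, disable=disable or [], **spacy_kwargs)


def add_doc_column(
    rows: Iterable[Dict[str, Any]],
    field: str,
    target_field: str = "doc",
    nlp: Optional[Callable[[str], Any]] = None,
) -> List[Dict[str, Any]]:
    """Parse row[field] and store the JSON form of the result in row[target_field].

    Hypothetical helper: assumes the parsed object exposes to_json(),
    as spaCy Doc objects do.
    """
    if nlp is None:
        nlp = load_pipeline()

    parsed_rows = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unmodified
        new_row[target_field] = nlp(row[field]).to_json()
        parsed_rows.append(new_row)
    return parsed_rows
```

Under these assumptions, `add_doc_column(rows, field="text", nlp=load_pipeline(disable=["ner"]))` would return copies of the rows with a `doc` key holding the parsed document as JSON. Running it requires spaCy and the named model to be installed.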