Version: 0.91

operators.candidates.extractor_spacy.TokenSpanFeaturizer

class operators.candidates.extractor_spacy.TokenSpanFeaturizer(field, tokenizer='spacy', **spacy_span_kwargs)

A SpanFeaturizer that yields every token as a span, using the selected tokenization strategy.

Given a valid tokenization strategy, this operator tokenizes the `field` column of the input dataframe and produces spans from the resulting tokens.

Name	Type	Default	Info
field	`str`		The dataframe column to apply the tokenization strategy over.
tokenizer	`str`	`'spacy'`	The tokenization strategy (one of `'spacy'` or `'whitespace'`).
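
To illustrate what token-based spans can look like, here is a minimal standalone sketch of the `'whitespace'` strategy. It is not this operator's implementation: the helper `whitespace_token_spans`, the `rows` list standing in for a dataframe, the `"text"` column name, and the `(row_idx, char_start, char_end, token)` output layout are all assumptions made for illustration.

```python
import re


def whitespace_token_spans(text):
    """Return (char_start, char_end, token) for each whitespace-delimited token.

    Hypothetical helper: offsets are character positions, end-exclusive.
    """
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


# Hypothetical rows standing in for a dataframe with a "text" column,
# i.e. what the `field` parameter would point at.
rows = [
    {"text": "The quick fox"},
    {"text": "  two  tokens "},
]

# One span per token, tagged with the index of the row it came from.
spans = [
    (row_idx, start, end, token)
    for row_idx, row in enumerate(rows)
    for start, end, token in whitespace_token_spans(row["text"])
]

for span in spans:
    print(span)
```

Each tuple records the source row and the character offsets of a token, so leading or repeated whitespace changes the offsets but never produces empty tokens. The `'spacy'` strategy would typically split punctuation and contractions as well, so it may produce more spans for the same text.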