Version: 0.91

operators.candidates.extractor_spacy.TokenSpanFeaturizer

class operators.candidates.extractor_spacy.TokenSpanFeaturizer(field, tokenizer='spacy', **spacy_span_kwargs)

A SpanFeaturizer that yields one span for every token produced by the selected tokenization strategy.

Given a valid tokenization strategy, this operator tokenizes the text in the field column of the input dataframe and emits a span for each resulting token.

Parameters

Name       Type  Default  Info
field      str            The dataframe column to apply the tokenization strategy over.
tokenizer  str   'spacy'  The tokenizer strategy (one of 'spacy' or 'whitespace').
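
To make the idea of token spans concrete, here is a minimal sketch of what a whitespace-style strategy might produce for one column of text. It is not this operator's implementation and does not call the library: the helper whitespace_token_spans, the (char_start, char_end, token) tuple layout, the exclusive end offset, and the sample rows are all assumptions made for illustration.

```python
import re


def whitespace_token_spans(text):
    """Return (char_start, char_end, token) for each run of non-whitespace.

    Hypothetical helper: the tuple layout and the exclusive end offset are
    assumptions, not the operator's documented output format.
    """
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


# Stand-in for a dataframe: one dict per row, with "text" playing the role
# of the column named by the `field` parameter.
rows = [
    {"text": "Spans are token offsets."},
    {"text": "Two  spaces"},
]
field = "text"

# One list of spans per row, one span per token.
spans_per_row = [whitespace_token_spans(row[field]) for row in rows]

for row, spans in zip(rows, spans_per_row):
    print(repr(row[field]), "->", spans)
```

Character offsets are used here so that each span can be mapped back to the original string with text[char_start:char_end]. A 'spacy' strategy would typically also split punctuation into its own token, so the same text can yield more spans than the whitespace sketch does.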