operators.candidates.rich_doc_features.RichDocRegexNGramDetector
- class operators.candidates.rich_doc_features.RichDocRegexNGramDetector(regex, target_field=None, capture_group=0, case_sensitive=True)
Featurizer that detects ngrams matching a regex pattern.
Parameters
- regex (str): The regex pattern to search for.
- target_field (Optional[str], default None): The name of the field to store the detected ngrams in.
- capture_group (int, default 0): The capture group to use when extracting the ngram text.
- case_sensitive (bool, default True): Whether the regex match is case sensitive.
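The sketch below illustrates how the `regex`, `capture_group`, and `case_sensitive` parameters plausibly interact, using only Python's standard `re` module. It is not the library's implementation: `detect_matches` and the sample text are hypothetical names introduced for illustration, and the real featurizer works on rich-document ngrams rather than a plain string.

```python
import re
from typing import List


def detect_matches(text: str, regex: str, capture_group: int = 0,
                   case_sensitive: bool = True) -> List[str]:
    """Hypothetical stand-in: return the extracted text of each regex match."""
    # Assumption: case_sensitive=False corresponds to re.IGNORECASE.
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile(regex, flags)
    # group(0) is the whole match; group(n) is the n-th parenthesized group.
    return [m.group(capture_group) for m in pattern.finditer(text)]


text = "Invoice INV-1042 was paid; invoice inv-2077 is pending."

# Defaults: whole match, case sensitive.
whole = detect_matches(text, r"INV-(\d+)")

# capture_group=1 keeps only the digits inside the parentheses.
digits = detect_matches(text, r"INV-(\d+)", capture_group=1)

# case_sensitive=False also picks up the lowercase "inv-2077".
any_case = detect_matches(text, r"INV-(\d+)", capture_group=1,
                          case_sensitive=False)

print(whole, digits, any_case)
```

With the defaults, `capture_group=0` returns the full matched text, and a nonzero value narrows the result to one parenthesized group of the pattern.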