operators.candidates.rich_doc_features.RichDocRegexNGramDetector
- class operators.candidates.rich_doc_features.RichDocRegexNGramDetector(regex, target_field=None, capture_group=0, case_sensitive=True)
Featurizer that detects ngrams matching a regex pattern.
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| regex | str | | The regex pattern to search for. |
| target_field | Optional[str] | None | The name of the field to store the detected ngrams in. |
| capture_group | int | 0 | The capture group to use when extracting the ngram text. |
| case_sensitive | bool | True | Whether the regex match is case-sensitive. Set to False to ignore case. |
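
The sketch below illustrates how the regex, capture_group, and case_sensitive parameters might interact, using only Python's standard re module. It is not the library's implementation: find_regex_matches is a hypothetical helper, and it assumes that case_sensitive=True means exact-case matching and that capture_group follows the usual re convention, where group 0 is the whole match.

```python
import re


def find_regex_matches(text, regex, capture_group=0, case_sensitive=True):
    """Hypothetical helper (not part of the library): return the text of the
    chosen capture group for every non-overlapping match of `regex`."""
    # Assumption: case_sensitive=False maps to re.IGNORECASE.
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile(regex, flags)
    # Group 0 is the entire match; group 1 is the first parenthesized group.
    return [m.group(capture_group) for m in pattern.finditer(text)]


text = "Invoice INV-1042 was paid; invoice inv-2077 is pending."

# Default settings: whole match, exact case.
whole = find_regex_matches(text, r"INV-(\d+)")

# Keep only the digits captured by the first group.
ids = find_regex_matches(text, r"INV-(\d+)", capture_group=1)

# Ignore case so the lowercase "inv-2077" is also detected.
any_case = find_regex_matches(
    text, r"INV-(\d+)", capture_group=1, case_sensitive=False
)
```

With the defaults, only the uppercase identifier is detected. Setting capture_group=1 narrows the extracted text to the numeric part, and case_sensitive=False also picks up the lowercase variant.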