operators.candidates.extractor.EntityDictSpanExtractor
- class operators.candidates.extractor.EntityDictSpanExtractor(entity_dict_path, field, ignore_case=False, link_entities=True, col_suffix=None)
SpanExtractor that yields (and optionally links) spans matching the aliases in an entity-to-aliases dictionary.
This extractor is used for entity classification tasks. It additionally annotates each span with the linked entity, using the dictionary value. This operator is optimized for keyword aliases.
An example of the entity-to-aliases dictionary can be found in Entity Classification Tutorials.
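As a rough illustration of what an entity-to-aliases dictionary might look like, the sketch below writes one to disk. The JSON layout (entity name mapped to a list of aliases), the sample entities, and the helper name `write_entity_dict` are all assumptions made for this example; consult the tutorials for the format the operator actually expects.

```python
import json
import os
import tempfile

# Hypothetical entities and aliases, for illustration only.
entity_to_aliases = {
    "Apple Inc.": ["Apple", "AAPL"],
    "Microsoft Corporation": ["Microsoft", "MSFT"],
}


def write_entity_dict(mapping, path):
    """Write an entity-to-aliases mapping to `path` as JSON.

    The JSON layout is an assumption, not a documented file format.
    """
    with open(path, "w", encoding="utf-8") as f:
        json.dump(mapping, f, indent=2)


# A path like this one is what would be passed as `entity_dict_path`.
entity_dict_path = os.path.join(tempfile.mkdtemp(), "entity_dict.json")
write_entity_dict(entity_to_aliases, entity_dict_path)
```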
Parameters
| Name | Type | Default | Info |
|---|---|---|---|
| entity_dict_path | str | | The path to the entity-to-aliases dictionary. |
| field | str | | The dataframe column to extract spans from. |
| ignore_case | bool | False | If true, extraction is not case sensitive (defaults to false). |
| link_entities | bool | True | If true, each extracted span is linked with its original entity/aliases. |
| col_suffix | Optional[str] | None | An optional suffix for the column containing the extracted spans. |
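To make the effect of `ignore_case` and `link_entities` concrete, here is a minimal pure-Python sketch of dictionary-based span extraction. It is not the operator's implementation: the function `extract_spans`, the output keys (`char_start`, `char_end`, `span_text`, `entity`), and the whole-word matching rule are assumptions chosen for illustration.

```python
import re


def extract_spans(text, entity_to_aliases, ignore_case=False, link_entities=True):
    """Return alias matches found in `text` (hypothetical helper).

    Each match is a dict with character offsets (end exclusive) and the
    matched text. When `link_entities` is true, the dict also carries the
    entity whose alias matched.
    """
    flags = re.IGNORECASE if ignore_case else 0
    spans = []
    for entity, aliases in entity_to_aliases.items():
        for alias in aliases:
            # Match the alias only when it is not embedded in a longer word.
            pattern = r"(?<!\w)" + re.escape(alias) + r"(?!\w)"
            for match in re.finditer(pattern, text, flags):
                span = {
                    "char_start": match.start(),
                    "char_end": match.end(),
                    "span_text": match.group(),
                }
                if link_entities:
                    span["entity"] = entity
                spans.append(span)
    # Order spans by position so the output is deterministic.
    return sorted(spans, key=lambda s: (s["char_start"], s["char_end"]))
```

With the default `ignore_case=False`, an alias such as `Apple` would not match the lowercase text `apple`; setting `ignore_case=True` makes it match. Setting `link_entities=False` yields the same spans without the `entity` key.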