operators.candidates.extractor.EntityDictSpanExtractor
- class operators.candidates.extractor.EntityDictSpanExtractor(entity_dict_path, field, ignore_case=False, link_entities=True, col_suffix=None)
SpanExtractor that yields (and optionally links) spans matching the aliases in an entity-to-aliases dictionary.
This is used for entity classification tasks. It additionally annotates each span with the linked entity, using the dictionary value. This operator is optimized for keyword aliases.
An example of the entity-to-aliases dictionary can be found in Entity Classification Tutorials.
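As a rough illustration of the shape such a dictionary might take, the sketch below assumes a JSON file that maps each entity name to a list of aliases. The JSON format, the file name, and the entities are all assumptions made for illustration; the tutorials remain the reference for the format that entity_dict_path actually expects.

```python
import json
import os
import tempfile

# Hypothetical entity-to-aliases dictionary: each key is an entity,
# each value lists the surface forms (aliases) that refer to it.
entity_to_aliases = {
    "Apple Inc.": ["Apple", "AAPL"],
    "Microsoft Corporation": ["Microsoft", "MSFT"],
}

# Assumption: the dictionary is stored as JSON. The real on-disk format
# may differ, so check the tutorials before relying on this.
entity_dict_path = os.path.join(tempfile.mkdtemp(), "entity_dict.json")
with open(entity_dict_path, "w", encoding="utf-8") as f:
    json.dump(entity_to_aliases, f, indent=2)

# Inverting the mapping gives an alias -> entity lookup, which is the
# information needed to link a matched span back to its entity.
alias_to_entity = {
    alias: entity
    for entity, aliases in entity_to_aliases.items()
    for alias in aliases
}
```

The inverted lookup is only there to show why the dictionary is keyed by entity: one entity can own many aliases, and each matched alias resolves back to a single entity.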
Parameters
| Name | Type | Default | Info |
| --- | --- | --- | --- |
| entity_dict_path | str | | The path to the entity-to-aliases dictionary. |
| field | str | | The dataframe column to extract spans from. |
| ignore_case | bool | False | If true, extraction is case-insensitive. |
| link_entities | bool | True | If true, each extracted span is linked with its original entity/aliases. |
| col_suffix | Optional[str] | None | An optional suffix for the column containing the extracted spans. |
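To make the effect of ignore_case and link_entities more concrete, here is a minimal pure-Python sketch of dictionary-based span extraction. It is not the operator's implementation: the function extract_alias_spans, the (start, end, entity) span tuple, and the regex word-boundary matching are all hypothetical choices made for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical span representation: (char_start, char_end, linked entity or None).
Span = Tuple[int, int, Optional[str]]


def extract_alias_spans(
    text: str,
    entity_to_aliases: Dict[str, List[str]],
    ignore_case: bool = False,
    link_entities: bool = True,
) -> List[Span]:
    """Find every whole-word occurrence of any alias in `text`."""
    flags = re.IGNORECASE if ignore_case else 0
    spans: List[Span] = []
    for entity, aliases in entity_to_aliases.items():
        for alias in aliases:
            # Escape the alias so it is matched literally, as a keyword.
            pattern = r"\b" + re.escape(alias) + r"\b"
            for match in re.finditer(pattern, text, flags):
                linked = entity if link_entities else None
                spans.append((match.start(), match.end(), linked))
    # Return spans in document order.
    return sorted(spans, key=lambda s: (s[0], s[1]))


entities = {
    "Apple Inc.": ["Apple", "AAPL"],
    "Microsoft Corporation": ["Microsoft", "MSFT"],
}
text = "apple and MSFT rallied."

case_sensitive = extract_alias_spans(text, entities)
case_insensitive = extract_alias_spans(text, entities, ignore_case=True)
unlinked = extract_alias_spans(text, entities, link_entities=False)
```

In this sketch the case-sensitive call finds only "MSFT", because lowercase "apple" does not match the alias "Apple". With ignore_case=True both mentions are found, and with link_entities=False the spans carry no entity.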