operators.candidates.extractor.EntityDictSpanExtractor
- class operators.candidates.extractor.EntityDictSpanExtractor(entity_dict_path, field, ignore_case=False, link_entities=True, col_suffix=None)
SpanExtractor that yields (and optionally links) spans that match an alias in an entity-to-aliases dictionary.
This is used for entity classification tasks. It additionally annotates each span with the linked entity, using the dictionary value. This operator is optimized for keyword aliases.
An example of the entity-to-aliases dictionary can be found in Entity Classification Tutorials.
- Parameters:
entity_dict_path (str) – The path to the entity-to-aliases dictionary.
field (str) – The dataframe column to extract spans from.
ignore_case (bool, default: False) – If true, extraction is not case sensitive.
link_entities (bool, default: True) – If true, each extracted span is linked with its original entity/aliases.
col_suffix (Optional[str], default: None) – An optional suffix for the column containing the extracted spans.
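The behavior described above can be illustrated with a small, self-contained sketch. This is not the operator's implementation and does not call the library: the function `extract_alias_spans`, the in-memory `ENTITY_TO_ALIASES` mapping (entity name to list of aliases), the tuple layout of a span, and the use of word-boundary regex matching are all assumptions made for illustration. It only mirrors the roles of `ignore_case` and `link_entities`.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical entity-to-aliases mapping (assumed shape: entity -> list of aliases).
# The real dictionary file format may differ; see the tutorials for an actual example.
ENTITY_TO_ALIASES: Dict[str, List[str]] = {
    "Apple Inc.": ["Apple", "AAPL"],
    "Microsoft": ["Microsoft", "MSFT"],
}

# Assumed span layout for this sketch: (char_start, char_end, matched_text, entity).
Span = Tuple[int, int, str, Optional[str]]


def extract_alias_spans(
    text: str,
    entity_to_aliases: Dict[str, List[str]],
    ignore_case: bool = False,
    link_entities: bool = True,
) -> List[Span]:
    """Return every alias occurrence in `text`, ordered by position."""
    flags = re.IGNORECASE if ignore_case else 0
    spans: List[Span] = []
    for entity, aliases in entity_to_aliases.items():
        for alias in aliases:
            # Word-boundary matching is an assumption suited to keyword-like aliases.
            pattern = r"\b" + re.escape(alias) + r"\b"
            for match in re.finditer(pattern, text, flags):
                linked = entity if link_entities else None
                spans.append((match.start(), match.end(), match.group(0), linked))
    return sorted(spans, key=lambda span: (span[0], span[1]))


sample = "Apple and MSFT rallied; apple pie did not."
for span in extract_alias_spans(sample, ENTITY_TO_ALIASES, ignore_case=True):
    print(span)
```

With `ignore_case=False` the lowercase "apple" in the sample is skipped, and with `link_entities=False` the last element of each tuple is `None`, so only the span offsets and text are returned.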