operators.candidates.extractor.EntityDictSpanFeaturizer
- class operators.candidates.extractor.EntityDictSpanFeaturizer(entity_dict_path, field, ignore_case=False, link_entities=True, col_suffix=None)
SpanFeaturizer that yields (and optionally links) spans found in an entity-to-aliases dictionary.
This is used for entity classification tasks. It additionally annotates each span with the linked entity, using the dictionary value. This operator is optimized for keyword aliases.
- Parameters:
entity_dict_path (str) – A path (either local or to remote storage) that contains the entity linking definitions
field (str) – The field of the input dataframe from which entities will be extracted
ignore_case (bool, default: False) – If true, ignore case when matching entities
link_entities (bool, default: True) – If true, link each extracted span to its entity
col_suffix (Optional[str], default: None) – An optional suffix for the column containing the extracted spans
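To make the matching behaviour concrete, here is a minimal, standalone sketch of dictionary-based span extraction with entity linking. It does not use or reproduce this operator's implementation: the function `find_alias_spans`, the in-memory dictionary layout (entity name mapped to a list of aliases), and the sample text are all assumptions for illustration, and the on-disk format expected at `entity_dict_path` is not shown here.

```python
import re
from typing import Dict, List, Tuple


def find_alias_spans(
    text: str,
    entity_dict: Dict[str, List[str]],
    ignore_case: bool = False,
) -> List[Tuple[int, int, str]]:
    """Return (char_start, char_end, entity) for every alias found in text.

    Hypothetical helper: it illustrates the idea of matching aliases and
    linking each span to its entity, not the library's actual logic.
    """
    flags = re.IGNORECASE if ignore_case else 0
    spans = []
    for entity, aliases in entity_dict.items():
        for alias in aliases:
            # Word boundaries keep "Apple" from matching inside "Pineapple".
            # This assumes aliases begin and end with word characters.
            pattern = r"\b" + re.escape(alias) + r"\b"
            for match in re.finditer(pattern, text, flags):
                spans.append((match.start(), match.end(), entity))
    return sorted(spans)


# Assumed dictionary layout: entity -> list of keyword aliases.
entity_dict = {
    "Apple Inc.": ["Apple", "AAPL"],
    "Microsoft": ["Microsoft", "MSFT"],
}
text = "apple and MSFT both rose."

spans_case_sensitive = find_alias_spans(text, entity_dict)
spans_ignore_case = find_alias_spans(text, entity_dict, ignore_case=True)
```

In this sketch the case-sensitive call finds only `MSFT`, while the `ignore_case=True` call also matches the lowercase `apple` and links it to `Apple Inc.`, which mirrors what the `ignore_case` and `link_entities` parameters describe.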