Version: 0.91

operators.candidates.extractor.EntityDictSpanFeaturizer

class operators.candidates.extractor.EntityDictSpanFeaturizer(entity_dict_path, field, ignore_case=False, link_entities=True, col_suffix=None)

SpanFeaturizer that yields (and optionally links) spans in an entity-to-aliases dictionary

This is used for entity classification tasks. It additionally annotates each span with the linked entity, using the dictionary value. This operator is optimized for keyword aliases.

Parameters

| Name | Type | Default | Info |
| --- | --- | --- | --- |
| entity_dict_path | str | | A path (either local or to remote storage) that contains the entity linking definitions. |
| field | str | | The field of the input dataframe that entities will be extracted from. |
| ignore_case | bool | False | If true, ignore case when matching entities. |
| link_entities | bool | True | If true, link each extracted span to its entity. |
| col_suffix | Optional[str] | None | An optional suffix for the column containing the extracted spans. |
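
To make the effect of `ignore_case` and `link_entities` concrete, here is a minimal pure-Python sketch of dictionary-based span matching. It is not the operator's implementation: the function name `find_alias_spans`, the in-memory dictionary layout (entity name mapped to a list of aliases), and the tuple output format are all assumptions made for illustration. The page above does not specify the file format behind `entity_dict_path` or the columns the operator produces.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical output shape: (char_start, char_end, matched_text, linked_entity)
Span = Tuple[int, int, str, Optional[str]]


def find_alias_spans(
    text: str,
    entity_to_aliases: Dict[str, List[str]],
    ignore_case: bool = False,
    link_entities: bool = True,
) -> List[Span]:
    """Find every occurrence of every alias in `text` (illustrative only)."""
    flags = re.IGNORECASE if ignore_case else 0
    spans: List[Span] = []
    for entity, aliases in entity_to_aliases.items():
        for alias in aliases:
            # Match the alias as a whole keyword: no word character may
            # directly precede or follow it.
            pattern = r"(?<!\w)" + re.escape(alias) + r"(?!\w)"
            for match in re.finditer(pattern, text, flags):
                linked = entity if link_entities else None
                spans.append((match.start(), match.end(), match.group(0), linked))
    # Order spans by their position in the text.
    return sorted(spans, key=lambda s: (s[0], s[1]))


# Assumed dictionary layout: entity -> list of aliases.
entity_dict = {
    "Apple Inc.": ["Apple", "AAPL"],
    "Microsoft": ["Microsoft", "MSFT"],
}
text = "apple and MSFT both rose."

case_sensitive = find_alias_spans(text, entity_dict)
case_insensitive = find_alias_spans(text, entity_dict, ignore_case=True)
unlinked = find_alias_spans(text, entity_dict, link_entities=False)
print(case_sensitive)
print(case_insensitive)
print(unlinked)
```

In this sketch, the case-sensitive call skips the lowercase "apple", while `ignore_case=True` matches it and links it to "Apple Inc.". With `link_entities=False` the spans are still found but carry no entity.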