Version: 0.91

operators.candidates.extractor.EntityDictSpanFeaturizer

class operators.candidates.extractor.EntityDictSpanFeaturizer(entity_dict_path, field, ignore_case=False, link_entities=True, col_suffix=None)

SpanFeaturizer that yields (and optionally links) spans that match aliases in an entity-to-aliases dictionary.

This is used for entity classification tasks. It additionally annotates each span with the linked entity, using the dictionary value. This operator is optimized for keyword aliases.

Parameters:
  • entity_dict_path (str) – A path (either local or to remote storage) that contains the entity linking definitions

  • field (str) – The field of the input dataframe from which entities are extracted

  • ignore_case (bool, default: False) – If true, ignore case when matching entities

  • link_entities (bool, default: True) – If true, annotate each extracted span with its linked entity

  • col_suffix (Optional[str], default: None) – An optional suffix for the column containing the extracted spans
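
The following is a minimal, self-contained sketch of the general idea behind dictionary-based span extraction with optional entity linking. It is not this operator's implementation and does not use its API. The function name `extract_dict_spans`, the in-memory dictionary format (entity name mapped to a list of aliases), the whole-word matching rule, and the tuple layout of the returned spans are all assumptions made for illustration; only the `ignore_case` and `link_entities` flags mirror the parameters documented above.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical span layout: (char_start, char_end, span_text, linked_entity)
Span = Tuple[int, int, str, Optional[str]]


def extract_dict_spans(
    text: str,
    entity_dict: Dict[str, List[str]],
    ignore_case: bool = False,
    link_entities: bool = True,
) -> List[Span]:
    """Find every alias occurrence in `text` and optionally attach its entity.

    Illustrative only: assumes aliases are plain keywords that begin and end
    with word characters, so that whole-word matching with \\b is meaningful.
    """
    flags = re.IGNORECASE if ignore_case else 0
    spans: List[Span] = []
    for entity, aliases in entity_dict.items():
        for alias in aliases:
            # Escape the alias so it is matched literally, as a whole word.
            pattern = r"\b" + re.escape(alias) + r"\b"
            for match in re.finditer(pattern, text, flags):
                linked = entity if link_entities else None
                spans.append((match.start(), match.end(), match.group(0), linked))
    # Order spans by their position in the text.
    return sorted(spans, key=lambda s: (s[0], s[1]))


# Assumed example data, not taken from any real entity dictionary file.
entity_dict = {"IBM": ["IBM", "Big Blue"], "Apple Inc.": ["Apple"]}
text = "IBM and Big Blue met apple."

case_sensitive = extract_dict_spans(text, entity_dict)
case_insensitive = extract_dict_spans(text, entity_dict, ignore_case=True)
unlinked = extract_dict_spans(text, entity_dict, link_entities=False)
```

In this sketch, the case-sensitive call finds "IBM" and "Big Blue" and links both to the entity "IBM", while "apple" is only matched once `ignore_case=True` is passed. With `link_entities=False` the same spans are returned with no entity attached.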