Version: 0.96

operators.candidates.extractor.EmailAddressSpanFeaturizer

class operators.candidates.extractor.EmailAddressSpanFeaturizer(field, col_suffix=None)

Extracts spans (slices of a document) that contain email addresses, using a regular expression.

This operator applies a regex to the parent document and extracts every span that contains a properly formatted email address according to RFC 6530.

Parameters:
  • field (str) – The dataframe column to extract email address spans from

  • col_suffix (Optional[str], default: None) – An optional suffix for the column containing the extracted spans
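To illustrate the general idea of regex-based span extraction, here is a minimal standalone sketch in plain Python. It does not use or reproduce this operator's implementation: the pattern `EMAIL_PATTERN` and the helper `extract_email_spans` are hypothetical names introduced for illustration, and the simplified ASCII-only pattern is an assumption that does not cover everything RFC 6530 permits (for example, internationalized addresses).

```python
import re
from typing import List, Tuple

# Simplified, ASCII-only pattern (an assumption for illustration, not the
# operator's actual regex): local part, "@", then a dotted domain name.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_email_spans(text: str) -> List[Tuple[int, int, str]]:
    """Return (char_start, char_end, span_text) for each email-like match.

    char_end is exclusive, so text[char_start:char_end] == span_text.
    """
    return [(m.start(), m.end(), m.group()) for m in EMAIL_PATTERN.finditer(text)]


# Hypothetical document text standing in for one value of the `field` column.
doc = "Contact alice@example.com or bob.smith@mail.example.org for details."
spans = extract_email_spans(doc)

for start, end, span_text in spans:
    print(start, end, span_text)
```

Each result pairs character offsets with the matched text, which is the sense in which a span is a "slice" of the parent document. How the operator itself stores spans, and how `col_suffix` affects the output column name, is not specified here and is not modeled by this sketch.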