Should normalized_email be stripped of dots and +parts, too? #51

@ambv

While this isn't part of RFC 2822, the biggest email provider on the planet treats dots as if they weren't there. Example where this turned out to be a problem: python/cpython#93651

In theory this can lead to a malicious actor:

  • noticing somebody's got an email address without a dot
  • registering an email address on the same provider with a dot
  • using their new email address in commits to circumvent the CLA check

In practice, though, the malicious actor could simply have used the other user's email address all along, without having to register their own. Besides, there is no likely motive for trying to fool the CLA bot in the first place.

The more likely undesirable scenario is when two non-malicious people have addresses at the same domain, say @office.com, that differ only by a dot, and both are contributors. However, I checked the 5,386 CLAs signed so far for CPython at the time of writing, and this never happens. I'd say the likelihood of this is very low.
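For reference, a check like the one described above could be sketched roughly as follows. This is only an illustration, not the script that was actually run: the function name `find_dot_collisions` and the plain list-of-strings input are assumptions.

```python
from collections import defaultdict


def find_dot_collisions(emails):
    """Group addresses that differ only by dots in the local part.

    Hypothetical helper: returns a dict mapping (dotless local part, domain)
    to the sorted list of distinct addresses that collapse to that key.
    """
    groups = defaultdict(set)
    for email in emails:
        local, _, domain = email.strip().lower().rpartition("@")
        key = (local.replace(".", ""), domain)
        groups[key].add(f"{local}@{domain}")
    # Only keys reached by more than one distinct address are collisions.
    return {key: sorted(addrs) for key, addrs in groups.items() if len(addrs) > 1}
```

An empty result means no two signers would be merged by stripping dots.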

52% of email addresses are @gmail.com, suggesting that it's worthwhile stripping the dots.
2% of email addresses contain +, suggesting that it might be worthwhile to strip those as well to improve the user experience in case a different +suffix creeps in. We've seen cases like this already in CPython.
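A minimal sketch of what the proposed normalization might look like, assuming dots are only stripped for domains known to ignore them. The function name `normalize_email`, the `DOT_INSENSITIVE_DOMAINS` set, and the decision to strip +suffixes for every domain are all assumptions for illustration, not the project's existing code.

```python
# Assumption: domains where dots in the local part are ignored by the provider.
DOT_INSENSITIVE_DOMAINS = frozenset({"gmail.com", "googlemail.com"})


def normalize_email(email, dot_insensitive_domains=DOT_INSENSITIVE_DOMAINS):
    """Return a hypothetical normalized form of an email address.

    Lowercases the address, drops everything from the first "+" in the
    local part, and removes dots for the given dot-insensitive domains.
    """
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {email!r}")
    # Strip the +suffix (sub-addressing) for all domains.
    local = local.split("+", 1)[0]
    # Strip dots only where the provider is known to ignore them.
    if domain in dot_insensitive_domains:
        local = local.replace(".", "")
    return f"{local}@{domain}"
```

With this sketch, `Jane.Doe+cpython@gmail.com` and `janedoe@gmail.com` normalize to the same value, while `jane.doe@example.org` keeps its dot. Whether dots should be stripped for all domains instead is exactly the open question here.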
