Implement email sending via SES #3580

Merged
merged 23 commits into from
Apr 11, 2018

dstufft
Member

@dstufft dstufft commented Apr 7, 2018

This implements sending emails via Amazon SES. It has the following features:

  • Implements a SESEmailSender service that sends emails and logs them to the database.
  • Implements a web hook that SES, via SNS, will call with various delivery-related events.
  • Unflags emails as verified when they bounce or draw a complaint, requiring users to re-verify their emails.
  • Provides an admin panel that shows the emails sent (omitting their body, in case the body has sensitive information in it) as well as the email status (accepted, delivered, bounced, complained) and any delivery events that were received about the email.
  • Provides a flag on the email to indicate it has been having delivery issues.
    • Long term goal of this flag is to provide better UX in the user's account management page to indicate that the reason this email is no longer verified is because of delivery issues. This PR does not implement that though.
  • Cleans up logged emails after 14 days. This is to keep the table from growing unbounded (see the journals table) and also to provide a finite lifespan on any email addresses or similar that get logged into this table.
  • Allows searching the email logs for specific addresses, etc.
  • Allows configuring the email sending backend similarly to the way the file storage backend is configured.
  • Handles flagging an account as having delivery problems after N transient delivery failures.
  • Authenticates/validates that the web hook was actually called by SNS for the desired topic, and not by an attacker attempting to DoS the email handling.

Overall it still needs some cleanup (tests need to be written, some region values are hardcoded, error handling is broken, etc.). However, it has all the basic functionality!

Here are some screenshots of the admin interface:

[Four screenshots: download 2, download 1, download, download 3]

@dstufft
Member Author

dstufft commented Apr 7, 2018

Note: Before we deploy this, we will need to change the environment variables in our deploys. The new mechanism uses the MAIL_BACKEND=<backend> foo=bar format that the storage, docs, and similar settings use.

@dstufft
Member Author

dstufft commented Apr 8, 2018

I've hardcoded the number of transient bounces that an email address can have before it gets flagged for re-verification to 5. I'm not sure whether that's a good number, or whether it should be configurable.

  • Whenever we get a soft bounce that doesn't occur after a delivery (i.e. not an OOTO auto-responder), increment the number of transient bounces by 1.
  • Whenever we get a successful email delivery, reset the number of transient bounces to 0.
  • Whenever the number of transient bounces is greater than 5, treat it as if the email had hard bounced and un-verify the email address.
  • Whenever an email gets verified (either the first time, or any subsequent times) we reset the number of transient bounces back to 0.
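
The four rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `EmailState`, `on_soft_bounce`, `on_delivery`, and `on_verified` are hypothetical names, and only the threshold of 5 comes from the comment.

```python
from dataclasses import dataclass

# Threshold taken from the comment above; everything else here is hypothetical.
MAX_TRANSIENT_BOUNCES = 5


@dataclass
class EmailState:
    """Hypothetical stand-in for the per-address state tracked in the database."""

    verified: bool = True
    transient_bounces: int = 0


def on_soft_bounce(email: EmailState, after_delivery: bool = False) -> None:
    # A soft bounce that follows a delivery (e.g. an out-of-office reply) is ignored.
    if after_delivery:
        return
    email.transient_bounces += 1
    # More than the threshold: treat it like a hard bounce and un-verify.
    if email.transient_bounces > MAX_TRANSIENT_BOUNCES:
        email.verified = False


def on_delivery(email: EmailState) -> None:
    # A successful delivery resets the counter.
    email.transient_bounces = 0


def on_verified(email: EmailState) -> None:
    # Verifying (the first time or again) also resets the counter.
    email.verified = True
    email.transient_bounces = 0
```

Because the comparison is "greater than 5", an address in this sketch is only un-verified on its sixth consecutive counted soft bounce.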


def _validate_topic(self, topic):
    comparer = functools.partial(hmac.compare_digest, topic)
    if not all(map(comparer, self.topics)):
Member

I don't understand this code at all. As far as I can tell the only time this could be valid is if all the items in topics are the same. Is this supposed to be an any or something?

It looks like you only ever pass a single topic, maybe only support one?
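
For illustration, an `any`-based check might look like the sketch below. `topic_is_allowed` and its arguments are hypothetical names, not the code in this PR. The comparisons are all evaluated up front so that `hmac.compare_digest` is still called once per configured topic.

```python
import hmac
from typing import Iterable


def topic_is_allowed(topic: str, allowed_topics: Iterable[str]) -> bool:
    # Encode to bytes so compare_digest accepts non-ASCII input as well.
    candidate = topic.encode("utf-8")
    # Build the full list first so every configured topic is compared,
    # then accept the message if at least one comparison matched.
    results = [
        hmac.compare_digest(candidate, allowed.encode("utf-8"))
        for allowed in allowed_topics
    ]
    return any(results)
```

With `all` in place of `any`, a message would only pass when every configured topic equals the incoming one, which is the behaviour questioned above.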


self._validate_topic(message["TopicArn"])
self._validate_timestamp(message["Timestamp"])
self._validate_signature(message)
Member

Shouldn't you validate the signature first thing, cryptographic doom principle and all?
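
A sketch of the ordering being suggested, with the signature check moved ahead of any other inspection of the message. The three validator callables are hypothetical placeholders passed in for illustration; they stand in for the `_validate_*` methods quoted above.

```python
from typing import Callable, Mapping


def validate_message(
    message: Mapping[str, str],
    validate_signature: Callable[[Mapping[str, str]], None],
    validate_topic: Callable[[str], None],
    validate_timestamp: Callable[[str], None],
) -> None:
    # Authenticate first: nothing else in the message is trusted until
    # the signature has been checked.
    validate_signature(message)
    # Only then look at fields taken from the (now authenticated) message.
    validate_topic(message["TopicArn"])
    validate_timestamp(message["Timestamp"])
```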

# Before we do anything, we need to verify that the URL for the
# signature matches what we expect.
cert_host = urllib.parse.urlparse(cert_url).netloc
if _signing_url_host_re.search(cert_host) is None:
Member

Prefer .match() to .search(). It doesn't matter in this case since your regexp is anchored, but better safe than sorry.

Member Author

I'm torn! I generally always use .search() because .match() messes with my mental model of how regular expressions work. In this case, .match() is probably better because I don't know that it ever makes sense to have this not anchored at the beginning.
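
A small sketch of the difference under discussion. The pattern here is a hypothetical one with the leading anchor left out on purpose; it is not the `_signing_url_host_re` from this PR.

```python
import re

# Hypothetical host pattern with the leading anchor forgotten.
unanchored = re.compile(r"sns\.[a-z0-9-]+\.amazonaws\.com$")

good_host = "sns.us-west-2.amazonaws.com"
bad_host = "evil-sns.us-west-2.amazonaws.com"

# search() scans for a match anywhere, so the unexpected prefix slips through.
assert unanchored.search(bad_host) is not None
# match() only matches at the start of the string, so the same pattern rejects it.
assert unanchored.match(bad_host) is None
# Both accept the expected host.
assert unanchored.search(good_host) is not None
assert unanchored.match(good_host) is not None
```

Note that `.match()` only anchors the start of the string, so the trailing `$` is still needed in either case.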


def _get_data_to_sign(self, message):
    if message["Type"] == "Notification":
        parts = self._get_parts_to_sign_notiifcation(message)
Member

This is not how you spell notification

@dstufft
Member Author

dstufft commented Apr 8, 2018

To switch to SES, you need a configuration like:

MAIL_BACKEND=warehouse.email.services.SESEmailSender [email protected] region=us-west-2 topic=...

That assumes you have the other environment variables set up to configure authentication for AWS.
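
As a rough illustration of the `MAIL_BACKEND=<backend> foo=bar` format, a value like this could be split into a backend path and keyword options as sketched below. `parse_backend_setting` is a hypothetical helper, not Warehouse's actual configuration code, and it assumes that option values contain no whitespace.

```python
from typing import Dict, Tuple


def parse_backend_setting(value: str) -> Tuple[str, Dict[str, str]]:
    # The first whitespace-separated token is the backend's dotted path;
    # the remaining tokens are key=value options.
    backend, *pairs = value.split()
    # Split on the first "=" only, so values may themselves contain "=".
    options = dict(pair.split("=", 1) for pair in pairs)
    return backend, options


# The topic value below is a made-up placeholder.
backend, options = parse_backend_setting(
    "warehouse.email.services.SESEmailSender region=us-west-2 topic=arn:aws:sns:example"
)
```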

@dstufft dstufft changed the title from "[WIP] Implement email sending via SES" to "Implement email sending via SES" Apr 8, 2018
@dstufft dstufft requested a review from ewdurbin April 8, 2018 17:56
@dstufft
Member Author

dstufft commented Apr 8, 2018

Note: After this merges and deploys, pypi/infra#5 will need to be merged and applied to wire the SNS topic up to Warehouse's web hook handler. It will likely take two runs of terraform for this to work, since we need to know the expected TopicArn before we can subscribe to it.

Member

@ewdurbin ewdurbin left a comment

No blockers in my sight, just a couple questions.



@tasks.task(ignore_result=True, acks_late=True)
def cleanup(request):
Member

I think it might behoove us to clean up unsuccessful deliveries at a slower pace than successful deliveries. I'd suggest we keep 30-90 days of failures for debugging, as it's quite possible for someone to not notice that things have gone awry in the short term, particularly for email notifications for things that they did not initiate.

The alternative to holding onto logs for a longer period is to persist some kind of summary as to why the email address was unverified, rather than just that it was.

Either option works for me, but the latter actually has the benefit of persisting for a much longer period than even 30-90 days which is nice.

When a user finally reaches out, we can at least tell them what the "last straw" for their previously verified address was.

Member

To be clear, the boolean fields stored on accounts_email are probably enough to continue forward but may not capture all the necessary information to help a user get things back up and running. Also we may need more insight for certain spam greylisting/blacklisting situations.

Member Author

Set it up so failures are kept for 90 days, and successes are kept for 14 days.
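
The retention rule can be sketched as below. This is an assumption-laden illustration rather than the cleanup task in this PR: `should_delete` is a hypothetical helper, and treating "bounced" and "complained" as the failure statuses is a guess based on the status names listed in the PR description.

```python
from datetime import datetime, timedelta, timezone

SUCCESS_RETENTION = timedelta(days=14)
FAILURE_RETENTION = timedelta(days=90)

# Assumption: these two statuses are the ones counted as failures.
FAILURE_STATUSES = {"bounced", "complained"}


def should_delete(status: str, created: datetime, now: datetime) -> bool:
    # Failures are kept longer so there is something to debug with
    # when a user eventually reports a problem.
    retention = FAILURE_RETENTION if status in FAILURE_STATUSES else SUCCESS_RETENTION
    return now - created > retention


now = datetime(2018, 4, 11, tzinfo=timezone.utc)
old_delivery_deleted = should_delete("delivered", now - timedelta(days=15), now)
old_bounce_deleted = should_delete("bounced", now - timedelta(days=15), now)
```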

@@ -209,6 +209,9 @@ def includeme(config):
     )
     config.add_route("packaging.file", files_url)

     # SES Webhooks
     config.add_route("ses.hook", "/_/ses-hook/", domain=warehouse)
Member

I've not used SES -> SNS -> webhook for this in the past. Can we ensure we have the appropriate DeliveryPolicies in place to give these webhooks a fighting chance?

    require_csrf=False,
    header="x-amz-sns-message-type:Notification",
)
def notification(request):
Member

This route needs to lean towards failing hard to keep retry logic simple with SNS webhooks.

SNS webhook retry conditions:

  • HTTP status in the range 500-599.
  • HTTP status outside the range 200-599.

Looks like we should be OK, as the explicit HTTPBadRequest responses are indeed for requests that are successfully ignored, but something to keep in mind.
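
The two retry conditions listed above can be written as a small predicate. `sns_will_retry` is a hypothetical helper that encodes only the rules as stated in this comment.

```python
def sns_will_retry(status: int) -> bool:
    # Retry on any 5xx response...
    if 500 <= status <= 599:
        return True
    # ...and on any status outside the 200-599 range.
    return not (200 <= status <= 599)
```

Under these rules a 400 response is not retried, which is why returning HTTPBadRequest for messages that should be dropped is safe, while an unhandled error (500) causes SNS to try again.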

),
sa.Column("message_id", sa.Text(), nullable=False),
sa.Column("from", sa.Text(), nullable=False),
sa.Column("to", sa.Text(), nullable=False),
Member

Should we index this to help with admin search?

Member Author

I've added an index now!

@dstufft dstufft merged commit 6602e85 into pypi:master Apr 11, 2018
@dstufft dstufft deleted the ses branch April 11, 2018 06:57