-
Notifications
You must be signed in to change notification settings - Fork 1k
Implement email sending via SES #3580
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Note: Before we deploy this we will need to change the environment variables in our deploys, the new mechanism uses |
I've hardcoded the number of transient bounces that an email address can have before it gets flagged for re-verification to 5. I'm not sure if that's a good number or not or if that should be generally configurable or not.
|
warehouse/utils/sns.py
Outdated
|
||
def _validate_topic(self, topic): | ||
comparer = functools.partial(hmac.compare_digest, topic) | ||
if not all(map(comparer, self.topics)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this code at all. As far as I can tell the only time this could be valid is if all the items in topics are the same. Is this supposed to be an any
or something?
It looks like you only ever pass a single topic, maybe only support one?
warehouse/utils/sns.py
Outdated
|
||
self._validate_topic(message["TopicArn"]) | ||
self._validate_timestamp(message["Timestamp"]) | ||
self._validate_signature(message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't you validate the signature first thing, cryptographic doom principle and all?
warehouse/utils/sns.py
Outdated
# Before we do anything, we need to verify that the URL for the | ||
# signature matches what we expect. | ||
cert_host = urllib.parse.urlparse(cert_url).netloc | ||
if _signing_url_host_re.search(cert_host) is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer .match()
to .search()
. It doesn't matter in this case since your regexp is anchored, but better safe than sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm torn! I generally always use .search()
because .match()
messes with my mental model of how regular expressions work. In this case, .match()
is probably better because I don't know that it ever makes sense to have this not anchored at the beginning.
warehouse/utils/sns.py
Outdated
|
||
def _get_data_to_sign(self, message): | ||
if message["Type"] == "Notification": | ||
parts = self._get_parts_to_sign_notiifcation(message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not how you spell notification
To switch to SES, you need a configuration like:
That assumes you have the other environment variables setup to configure authentication for AWS. |
Note: After this merges and deploys, pypi/infra#5 will need to be merged and applied to wire the SNS topic up to Warehouse's web hook handler. It will likely take two runs of terraform for this to work, since we need to know the expected TopicArn before we can subscribe to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No blockers in my sight, just a couple questions.
warehouse/email/ses/tasks.py
Outdated
|
||
|
||
@tasks.task(ignore_result=True, acks_late=True) | ||
def cleanup(request): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might behoove us to cleanup unsuccessful deliveries on a slower pace than successful deliveries. I'd suggest we keep 30-90days of failures for debugging as it's quite possible for someone to not notice that things have gone awry in the short term. Particularly for email notifications for things that they did not initiate.
The alternative to holding onto logs for longer period is to persist some kind of summary as to why the email address was unverified, rather than just that it was.
Either option works for me, but the latter actually has the benefit of persisting for a much longer period than even 30-90 days which is nice.
When a user finally reaches out, we can at least tell them what the "last straw" for their previously verified address was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, the boolean fields stored on accounts_email are probably enough to continue forward but may not capture all the necessary information to help a user get things back up and running. Also we may need more insight for certain spam greylisting/blacklisting situations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set it up so failures are kept for 90 days, and success are kept for 14 days.
@@ -209,6 +209,9 @@ def includeme(config): | |||
) | |||
config.add_route("packaging.file", files_url) | |||
|
|||
# SES Webhooks | |||
config.add_route("ses.hook", "/_/ses-hook/", domain=warehouse) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not used SES -> SNS -> webhook for this in the past, can we ensure we have the appropriate DeliveryPolicies in place to give these webhooks a fighting chance?
require_csrf=False, | ||
header="x-amz-sns-message-type:Notification", | ||
) | ||
def notification(request): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This route needs to lean towards failing hard to make Retry logic simple with SNS webhooks.
SNS webhook retry conditions:
- HTTP status in the range 500-599.
- HTTP status outside the range 200-599.
Looks like we should be OK, as the explicit HTTPBadRequest
responses are indeed for requests that are successfully ignored, but something to keep in mind.
), | ||
sa.Column("message_id", sa.Text(), nullable=False), | ||
sa.Column("from", sa.Text(), nullable=False), | ||
sa.Column("to", sa.Text(), nullable=False), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we index this to help with admin search?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added an index now!
This implements sending emails via Amazon SES. It has the following features:
Overall it still needs some cleanups as well (tests need written, some hardcoded region values, error handling is broken, etc). However it has all the basic functionality!
Here's some screenshots of the admin interface: