Closed
Description
parse_report_email() does not handle text/html MIME parts.
As a result, Microsoft’s DMARC “Preview” emails — which contain an HTML body and no actual DMARC XML in the main message — cause parsedmarc to try parsing the HTML as an aggregate report XML. This results in:
InvalidAggregateReport: Missing field: 'feedback'
ParserError: Message ... is not a valid DMARC report
The error is thrown before the actual DMARC attachments (.xml.gz) are processed, so the whole email is incorrectly classified as invalid and moved to “Archive/Invalid”.
Environment
- Python: 3.x
- OS: Windows / Linux (reproduced in Docker python:3.13-slim)
- parsedmarc version: 18.19.0
- Mail sender: Microsoft enterprise.protection.outlook.com
- Mail subject examples:
[Preview] Report Domain: example.com Submitter: enterprise.protection.outlook.com Report-ID: ...
Steps to Reproduce
1. Receive a Microsoft DMARC “Preview” email.
These emails contain:
- an HTML body with a human-readable summary
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<div style="font-family:Segoe UI; font-size:14px;">This is a DMARC aggregate report from Microsoft Corporation. For Emails received between 2025-11-26 00:00:00 UTC to 2025-11-27 00:00:00 UTC.</><br><br>
You're receiving this email because you have included your email address in the 'rua' tag of your DMARC record in DNS for netige.pl. Please remove your email address from the 'rua' tag if you don't want to receive this email.<br><br>
<div style="font-family:Segoe UI; font-size:12px; color:#666666;">Please do not respond to this e-mail. This mailbox is not monitored and you will not receive a response. For any feedback/suggestions, kindly mail to [email protected].<br><br>Microsoft respects your privacy. Review our Online Services
<a href="https://privacy.microsoft.com/en-us/privacystatement">Privacy Statement</a>.<br>
One Microsoft Way, Redmond, WA, USA 98052.
</>
- optional attachments with the real DMARC .xml.gz report
2. Run:
parsedmarc.parse_report_email(eml_content)
3. Observe that the payload for one MIME part is HTML, e.g.:
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<div>This is a DMARC aggregate report from Microsoft Corporation...</div>
4. parse_report_email eventually reaches the else: block and tries to parse this HTML as XML, which raises Missing field: 'feedback'.
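For anyone without a real Microsoft preview email at hand, the MIME layout described in the steps above can be approximated with the standard library. This is only a sketch: the addresses, the placeholder XML, the file name report.xml.gz, and the application/gzip attachment type are assumptions made for illustration, and a real Microsoft message may be structured differently.

```python
import gzip
from email import message_from_bytes
from email.message import EmailMessage

# Placeholder aggregate report; a real report carries many more fields.
SAMPLE_XML = b"<?xml version='1.0'?><feedback><report_metadata/></feedback>"


def build_preview_like_email() -> bytes:
    """Build a synthetic email: HTML body plus a gzipped XML attachment."""
    msg = EmailMessage()
    msg["Subject"] = "[Preview] Report Domain: example.com"
    msg["From"] = "sender@example.com"  # hypothetical address
    msg["To"] = "rua@example.com"  # hypothetical address
    # The human-readable summary becomes a text/html part.
    msg.set_content(
        "<div>This is a DMARC aggregate report from Microsoft Corporation...</div>",
        subtype="html",
    )
    # The report itself is attached as gzip-compressed XML.
    msg.add_attachment(
        gzip.compress(SAMPLE_XML),
        maintype="application",
        subtype="gzip",
        filename="report.xml.gz",
    )
    return msg.as_bytes()


def list_content_types(raw: bytes) -> list[str]:
    """Return the content type of every MIME part, in walk() order."""
    return [part.get_content_type() for part in message_from_bytes(raw).walk()]


print(list_content_types(build_preview_like_email()))
# ['multipart/mixed', 'text/html', 'application/gzip']
```

The bytes returned by build_preview_like_email() could then be passed to parsedmarc.parse_report_email to check whether the text/html part is reached before the attachment.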
Expected Behavior
- parse_report_email should ignore text/html parts of multipart emails.
- It should keep scanning MIME parts until it finds the real DMARC XML or .xml.gz file.
Actual Behavior
- The HTML preview body is treated as potential DMARC XML.
- parse_aggregate_report_xml throws:
InvalidAggregateReport: Missing field: 'feedback'
- The whole email is incorrectly rejected as invalid.
Root Cause
parse_report_email() is missing a condition for:
elif content_type == "text/html":
    pass
Therefore, HTML is sent into the fallback XML/JSON parsing path.
Proposed Fix
Add explicit handling of text/html:
elif content_type == "text/html":
    # HTML bodies (e.g. Microsoft preview messages) are not DMARC reports
    logger.debug("Skipping HTML body in DMARC email preview")
    continue
Why This Fix Is Necessary
Microsoft commonly sends DMARC preview emails that contain:
- an HTML summary
- a gzip-compressed DMARC XML report (.xml.gz)
Without ignoring HTML, parsedmarc rejects valid DMARC report emails.