parse_report_email incorrectly attempts to parse HTML bodies as DMARC reports (missing support for text/html) #626

`parse_report_email()` does not handle `text/html` MIME parts.
As a result, Microsoft’s DMARC “Preview” emails, which contain an HTML body and no actual DMARC XML in the main message, cause parsedmarc to try to parse the HTML as aggregate report XML. This results in:

```python
InvalidAggregateReport: Missing field: 'feedback'
ParserError: Message ... is not a valid DMARC report
``` 


The error is raised **before** the actual DMARC attachments (.xml.gz) are processed, so the whole email is incorrectly classified as invalid and moved to “Archive/Invalid”.

### **Environment**

- Python: 3.x
- OS: Windows / Linux (reproduced in Docker python:3.13-slim)
- parsedmarc version: 18.19.0
- Mail sender: Microsoft enterprise.protection.outlook.com
- Mail subject examples:

```python
[Preview] Report Domain: example.com Submitter: enterprise.protection.outlook.com Report-ID: ...
``` 

### **Steps to Reproduce**
#### **1.** Receive a Microsoft DMARC “Preview” email.
These emails contain:

- an HTML body with a human-readable summary
```html
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<div style="font-family:Segoe UI; font-size:14px;">This is a DMARC aggregate report from Microsoft Corporation. For Emails received between 2025-11-26 00:00:00 UTC to 2025-11-27 00:00:00 UTC.</><br><br>
You're receiving this email because you have included your email address in the 'rua' tag of your DMARC record in DNS for netige.pl. Please remove your email address from the 'rua' tag if you don't want to receive this email.<br><br>
<div style="font-family:Segoe UI; font-size:12px; color:#666666;">Please do not respond to this e-mail. This mailbox is not monitored and you will not receive a response. For any feedback/suggestions, kindly mail to dmarcreportfeedback@microsoft.com.<br><br>Microsoft respects your privacy. Review our Online Services 
<a href="https://privacy.microsoft.com/en-us/privacystatement">Privacy Statement</a>.<br>
One Microsoft Way, Redmond, WA, USA 98052.
</>
``` 
- optional attachments with the real DMARC .xml.gz report

#### **2.** Run:
```python
parsedmarc.parse_report_email(eml_content)
``` 
#### **3.** Observe that payload for one MIME part is HTML, e.g.:
```html
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<div>This is a DMARC aggregate report from Microsoft Corporation...</div>
``` 
#### **4.** parse_report_email eventually reaches the `else:` block and tries to parse this HTML as XML, which fails with `Missing field: 'feedback'`.
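
The part layout from step 3 can be inspected without a real Microsoft message. The following is a minimal sketch using only the standard library `email` package; the helper names `build_preview_like_email` and `list_parts`, the addresses, and the placeholder XML are assumptions made for illustration and are not taken from an actual report.

```python
import gzip
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_preview_like_email() -> bytes:
    """Build a synthetic message: HTML body plus a gzip-compressed XML attachment."""
    msg = EmailMessage()
    msg["Subject"] = "[Preview] Report Domain: example.com"
    msg["From"] = "reports@example.net"  # hypothetical sender
    msg["To"] = "dmarc@example.com"      # hypothetical recipient
    msg.set_content("<div>This is a DMARC aggregate report...</div>", subtype="html")
    msg.add_attachment(
        gzip.compress(b"<feedback></feedback>"),  # placeholder, not a valid report
        maintype="application",
        subtype="gzip",
        filename="report.xml.gz",
    )
    return msg.as_bytes()


def list_parts(raw: bytes) -> list[str]:
    """Return the content type of every MIME part, in walk() order."""
    msg = message_from_bytes(raw, policy=policy.default)
    return [part.get_content_type() for part in msg.walk()]


print(list_parts(build_preview_like_email()))
# ['multipart/mixed', 'text/html', 'application/gzip']
```

The `text/html` part comes before the attachment, so any parser that treats it as a report candidate fails before it reaches the `.xml.gz` part.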


### **Expected Behavior**
- parse_report_email should ignore text/html parts of multipart emails.
- It should keep scanning MIME parts until it finds the real DMARC XML or .xml.gz file.
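
The expected behavior could look roughly like the sketch below. It is a standalone illustration built on the standard library, not parsedmarc's implementation: `find_report_xml`, `build_sample`, and the `<feedback` substring check are assumptions chosen to keep the example short.

```python
import gzip
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def find_report_xml(raw: bytes) -> Optional[bytes]:
    """Walk all MIME parts, skip HTML, and return the first XML that looks like a report."""
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        content_type = part.get_content_type().lower()
        if content_type.startswith("multipart/"):
            continue  # container, no payload of its own
        if content_type == "text/html":
            continue  # human-readable preview, never a report
        payload = part.get_payload(decode=True)  # bytes after base64/QP decoding
        if not payload:
            continue
        if payload.startswith(GZIP_MAGIC):
            payload = gzip.decompress(payload)
        if b"<feedback" in payload:  # crude check, for illustration only
            return payload
    return None


def build_sample(with_attachment: bool) -> bytes:
    """Hypothetical test message: HTML body, optionally with a .xml.gz attachment."""
    msg = EmailMessage()
    msg["Subject"] = "[Preview] Report Domain: example.com"
    msg.set_content("<div>This is a DMARC aggregate report...</div>", subtype="html")
    if with_attachment:
        msg.add_attachment(
            gzip.compress(b"<feedback><report_metadata/></feedback>"),
            maintype="application", subtype="gzip", filename="report.xml.gz",
        )
    return msg.as_bytes()


print(find_report_xml(build_sample(with_attachment=True)))
print(find_report_xml(build_sample(with_attachment=False)))
```

With the attachment present, the function returns the decompressed XML. With only the HTML body, it returns `None` instead of raising on the HTML.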

### **Actual Behavior**

- The HTML preview body is treated as potential DMARC XML.
- parse_aggregate_report_xml raises:
```python
InvalidAggregateReport: Missing field: 'feedback'
``` 
- The whole email is incorrectly rejected as invalid.

### **Root Cause**
parse_report_email() is missing a condition for:
```python
elif content_type == "text/html":
    pass
``` 
Therefore HTML is sent into the fallback XML/JSON parsing path.
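
The fall-through can be illustrated with a toy dispatcher. This is a simplified stand-in, not parsedmarc's actual code: the function `route_part`, the `handle_html` flag, and the branch labels are all hypothetical and only model the shape of an `if`/`elif`/`else` chain keyed on content type.

```python
def route_part(content_type: str, handle_html: bool) -> str:
    """Return a label for the branch a MIME part would take (hypothetical model)."""
    if content_type.startswith("multipart/"):
        return "skip"
    elif handle_html and content_type == "text/html":
        return "skip"  # the branch this issue proposes
    elif content_type == "text/plain":
        return "plain-text handler"
    else:
        return "fallback XML/JSON parser"


# Without a text/html branch, the HTML preview lands in the fallback parser.
print(route_part("text/html", handle_html=False))  # fallback XML/JSON parser
# With the branch, it is skipped and the attachment is still examined.
print(route_part("text/html", handle_html=True))   # skip
print(route_part("application/gzip", handle_html=True))  # fallback XML/JSON parser
```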

### **Proposed Fix**
Add explicit handling of text/html:

```python
elif content_type == "text/html":
    # HTML bodies (e.g. Microsoft preview messages) are not DMARC reports
    logger.debug("Skipping HTML body in DMARC email preview")
    continue
```


### **Why This Fix Is Necessary**

Microsoft routinely sends DMARC preview emails that contain:
- an HTML summary
- a gzip-compressed DMARC XML report (.xml.gz)

Unless HTML parts are ignored, parsedmarc rejects these valid DMARC report emails.
