XSS via _encode_incomplete_tags bypass #625

@renbou

Describe the bug
When markdown2 is used with untrusted input in safe_mode, the output can still contain unescaped HTML. The _hash_html_spans method encodes input by alternating calls to _sanitize_html (safe) and _encode_incomplete_tags (unsafe). _encode_incomplete_tags contains an escape hatch for so-called "auto links" that also triggers for non-autolink elements, which can result in XSS:

def _encode_incomplete_tags(self, text: str) -> str:
    if text.endswith(">"):
        return text  # this is not an incomplete tag, this is a link in the form <http://x.y.z>
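To make the problem concrete, here is a minimal standalone sketch of that check. It is a simplified model, not the library's code: the function name `encode_incomplete_tags_model` and the fallback escaping are assumptions made for illustration. It shows that the `endswith(">")` test cannot tell an autolink from a complete HTML tag.

```python
def encode_incomplete_tags_model(text: str) -> str:
    """Hypothetical, simplified model of the quoted check (not markdown2 itself)."""
    if text.endswith(">"):
        # Escape hatch: anything ending in ">" is assumed to be an autolink.
        return text
    # Assumed fallback for this sketch: escape the opening bracket.
    return text.replace("<", "&lt;")


autolink = "<http://x.y.z>"
payload = "<img src=x onerror=alert('xss')//>"
incomplete = "<img src=x"

# Both strings end in ">", so both pass through unchanged.
print(encode_incomplete_tags_model(autolink))
print(encode_incomplete_tags_model(payload))
# Only the genuinely incomplete tag is escaped.
print(encode_incomplete_tags_model(incomplete))
```

In this model the payload is returned verbatim for the same reason the autolink is: the only property examined is the final character.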

To Reproduce

import markdown2

markdown2.markdown(
    "<x><img src=x onerror=alert('xss')//><x>",
    safe_mode="escape",
)

This returns <p>&lt;x&gt;<img src=x onerror=alert('xss')//>&lt;x&gt;</p>. The img element is left unescaped, so its onerror handler can be used to trigger an XSS payload. To test this, assign the supposedly "safe" output to document.body.innerHTML on a web page through the dev console, e.g. on about:blank.
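The same observation can be checked without a browser. The sketch below feeds the output string reported above to Python's standard `html.parser` and lists the elements a parser actually sees. The `TagCollector` class is a hypothetical helper written for this example, and a stdlib parser only approximates what a browser does.

```python
from html.parser import HTMLParser

# Output string as reported in this issue for markdown2 2.5.3.
reported_output = "<p>&lt;x&gt;<img src=x onerror=alert('xss')//>&lt;x&gt;</p>"


class TagCollector(HTMLParser):
    """Hypothetical helper: records each start tag and its attribute names."""

    def __init__(self) -> None:
        super().__init__()
        self.found: list[tuple[str, list[str]]] = []

    def handle_starttag(self, tag, attrs):
        self.found.append((tag, [name for name, _ in attrs]))


collector = TagCollector()
collector.feed(reported_output)
collector.close()

# The escaped <x> tags show up as text, but img is parsed as a real element.
for tag, attr_names in collector.found:
    print(tag, attr_names)
```

The escaped `&lt;x&gt;` sequences are treated as text, while `img` is reported as a real element carrying an `onerror` attribute.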

Expected behavior
markdown2.Markdown._encode_incomplete_tags should correctly identify <img src=x onerror=alert('xss')//> as a non-autolink element, and escape it, for example to &lt;img src=x onerror=alert('xss')//&gt;.
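One possible direction, sketched under assumptions rather than proposed as the actual fix: match the whole span against a strict autolink pattern instead of looking only at the last character. The regex, the scheme list (`http`, `https`, `ftp`) and both function names below are hypothetical, and markdown2's own notion of an autolink may differ.

```python
import re

# Assumption: an autolink is "<scheme://...>" with no whitespace, quotes,
# or nested angle brackets, limited to a small set of schemes.
AUTOLINK_RE = re.compile(r"<(?:https?|ftp)://[^\s<>\"']+>")


def is_autolink(text: str) -> bool:
    """Return True only if the entire string looks like an autolink."""
    return AUTOLINK_RE.fullmatch(text) is not None


def encode_unless_autolink(text: str) -> str:
    """Hypothetical replacement check: escape anything that is not an autolink."""
    if is_autolink(text):
        return text
    return text.replace("<", "&lt;").replace(">", "&gt;")


print(encode_unless_autolink("<http://x.y.z>"))
print(encode_unless_autolink("<img src=x onerror=alert('xss')//>"))
```

With this check the autolink passes through unchanged, while the img payload is escaped to the form given in the expected behavior above. Email-style autolinks and other schemes are not covered by this sketch.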

Debug info

>>> markdown2.__version__
'2.5.3'
