Summary
Bleach clean() / Cleaner() fails to sanitize dangerous URI schemes in allowed formaction attributes.
Bleach applies URI protocol sanitization only to attributes listed in attr_val_is_uri. While URI-bearing attributes such as action, href, src, and poster are included in that set, formaction is not. As a result, if a downstream application explicitly allows formaction on submit-capable controls in untrusted HTML, Bleach preserves dangerous values such as javascript:alert(1) instead of stripping them.
This can lead to submit-triggered JavaScript execution in applications that rely on Bleach to sanitize untrusted HTML and allow the relevant tag/attribute combination.
Details
The issue appears to be a URI-sanitization coverage gap in Bleach’s sanitizer logic.
Relevant code paths:
- bleach/sanitizer.py: BleachSanitizerFilter.allow_token (around line 553)
- bleach/_vendor/html5lib/filters/sanitizer.py: attr_val_is_uri (around line 525)
In BleachSanitizerFilter.allow_token, URI protocol sanitization is only applied when:
if namespaced_name in self.attr_val_is_uri:
However, (None, 'formaction') is currently missing from attr_val_is_uri.
This creates an inconsistency where action is protocol-sanitized, but formaction is not.
As a result, if a downstream application allows:
- tags such as <button> or <input>
- the formaction attribute
then Bleach preserves dangerous URI schemes such as javascript: in formaction.
Examples of affected submit-capable controls include:
- <button> (default submit behavior unless type="button" is set)
- <input type="submit">
- <input type="image">
This appears to be a real library-side sanitizer gap rather than only an application misuse issue, because Bleach already treats similar URI-bearing attributes (such as action) as protocol-sensitive and sanitizes them.
Suggested minimal fix:
Add (None, 'formaction') to attr_val_is_uri in bleach/_vendor/html5lib/filters/sanitizer.py.
I also prepared a minimal patch and focused regression tests if helpful.
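The gap and the effect of the suggested fix can be illustrated with a simplified, standalone model. This is not Bleach's actual code: filter_attrs, ALLOWED_PROTOCOLS, and the contents of ATTR_VAL_IS_URI below are illustrative stand-ins that only mirror the membership check described above, and the scheme parsing is far less thorough than a real sanitizer's.

```python
from urllib.parse import urlparse

# Illustrative stand-ins (assumptions), not the library's real definitions.
ALLOWED_PROTOCOLS = {"http", "https", "mailto"}
ATTR_VAL_IS_URI = {(None, "action"), (None, "href"), (None, "src"), (None, "poster")}


def filter_attrs(attrs, uri_attrs):
    """Drop attributes in uri_attrs whose scheme is not allowed; keep the rest."""
    kept = {}
    for namespaced_name, value in attrs.items():
        # Only attributes listed in uri_attrs are ever protocol-checked.
        if namespaced_name in uri_attrs:
            scheme = urlparse(value.strip().lower()).scheme
            if scheme and scheme not in ALLOWED_PROTOCOLS:
                continue  # disallowed scheme: the attribute is dropped
        kept[namespaced_name] = value
    return kept


attrs = {
    (None, "action"): "javascript:alert(1)",
    (None, "formaction"): "javascript:alert(1)",
}

# Without formaction in the set, only action is checked and removed.
before = filter_attrs(attrs, ATTR_VAL_IS_URI)

# With (None, "formaction") added, both attributes are checked and removed.
after = filter_attrs(attrs, ATTR_VAL_IS_URI | {(None, "formaction")})

print(before)  # {(None, 'formaction'): 'javascript:alert(1)'}
print(after)   # {}
```

In this model a relative value such as /submit has no scheme, so it passes the check unchanged once formaction is added to the set.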
PoC
Below are minimal reproductions using bleach.clean().
1) <button>
from bleach import clean

print(clean(
    '<form><button formaction="javascript:alert(1)">go</button></form>',
    tags={'form', 'button'},
    attributes={'button': ['formaction']},
))
Actual output:
<form><button formaction="javascript:alert(1)">go</button></form>
Expected output:
<form><button>go</button></form>
2) <input type="submit">
from bleach import clean

print(clean(
    '<form><input type="submit" formaction="javascript:alert(1)" value="go"></form>',
    tags={'form', 'input'},
    attributes={'input': ['type', 'formaction', 'value']},
))
Actual output:
<form><input type="submit" formaction="javascript:alert(1)" value="go"></form>
Expected output:
<form><input type="submit" value="go"></form>
3) <input type="image">
from bleach import clean

print(clean(
    '<form><input type="image" formaction="javascript:alert(1)" src="/foo.png"></form>',
    tags={'form', 'input'},
    attributes={'input': ['type', 'formaction', 'src']},
))
Actual output:
<form><input type="image" formaction="javascript:alert(1)" src="/foo.png"></form>
Expected output:
<form><input type="image" src="/foo.png"></form>
Impact
This is a client-side HTML sanitization bypass / dangerous URI preservation issue.
If an application relies on Bleach to sanitize untrusted HTML and explicitly allows:
- formaction
- submit-capable controls such as <button> or <input>
then Bleach can emit sanitized output that still contains a dangerous javascript: URI in formaction.
That can lead to submit-triggered JavaScript execution when the user activates the control.
Impact is limited to configurations that explicitly allow the relevant tag/attribute combination, but the issue is still security-relevant because:
- formaction is a real browser sink
- Bleach already protocol-sanitizes similar URI-bearing attributes like action
- the omission creates inconsistent sanitizer coverage for dangerous URI schemes
I would currently assess this as Medium severity.
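Until a library-side fix is available, an affected application could check formaction values itself. The sketch below is one possible shape for such a check, not Bleach's own logic: has_safe_scheme, allow_button_attr, SAFE_SCHEMES, and the normalization rules are all assumptions made for illustration, and the scheme allowlist should be adjusted to the application's needs.

```python
import html
import re

# Assumption: only these schemes are acceptable as form submission targets.
SAFE_SCHEMES = {"http", "https"}

# Whitespace, control characters, and backticks are removed before the check,
# since they can be used to disguise a scheme.
_IGNORED = re.compile(r"[\x00-\x20\x7f-\xa0`]+")
# A scheme is whatever precedes the first ':' when no '/', '?' or '#' comes first.
_SCHEME = re.compile(r"^([^/?#:]*):")


def has_safe_scheme(value):
    """Return True for scheme-less (relative) values or allowlisted schemes."""
    # Decode character references first so "javascript&#58;..." is caught too.
    normalized = _IGNORED.sub("", html.unescape(value)).lower()
    match = _SCHEME.match(normalized)
    if match is None:
        return True
    return match.group(1) in SAFE_SCHEMES


def allow_button_attr(tag, name, value):
    """Attribute filter with a (tag, name, value) signature: formaction only."""
    return name == "formaction" and has_safe_scheme(value)
```

The (tag, name, value) signature follows the callable form that Bleach documents for the attributes argument, so a filter like this could be supplied as attributes={'button': allow_button_attr}. It is a stopgap under the stated assumptions and does not replace fixing attr_val_is_uri.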
If useful, I also have:
- a minimal patch
- focused regression tests for:
  - <button formaction="javascript:...">
  - <input type="submit" formaction="javascript:...">
  - <input type="image" formaction="javascript:...">
  - confirming that formaction="/submit" is preserved
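For reference, the regression cases listed above could be expressed roughly as follows. This is a hypothetical sketch, not the tests mentioned in this report: CASES and failing_cases are made-up names, the first three expected strings are taken from the PoC section, and the expected output of the fourth case is an assumption about how a safe value would be serialized. The cleaning function is passed in as a parameter so the table itself does not depend on a particular Bleach version.

```python
# Each case: (input HTML, allowed tags, allowed attributes, expected output).
CASES = [
    (
        '<form><button formaction="javascript:alert(1)">go</button></form>',
        {"form", "button"},
        {"button": ["formaction"]},
        "<form><button>go</button></form>",
    ),
    (
        '<form><input type="submit" formaction="javascript:alert(1)" value="go"></form>',
        {"form", "input"},
        {"input": ["type", "formaction", "value"]},
        '<form><input type="submit" value="go"></form>',
    ),
    (
        '<form><input type="image" formaction="javascript:alert(1)" src="/foo.png"></form>',
        {"form", "input"},
        {"input": ["type", "formaction", "src"]},
        '<form><input type="image" src="/foo.png"></form>',
    ),
    (
        # Assumption: a safe relative value should come back unchanged.
        '<form><button formaction="/submit">go</button></form>',
        {"form", "button"},
        {"button": ["formaction"]},
        '<form><button formaction="/submit">go</button></form>',
    ),
]


def failing_cases(clean_fn):
    """Return (input, actual, expected) for every case whose output differs."""
    failures = []
    for source, tags, attributes, expected in CASES:
        actual = clean_fn(source, tags=tags, attributes=attributes)
        if actual != expected:
            failures.append((source, actual, expected))
    return failures
```

Calling failing_cases with bleach.clean as clean_fn would be expected to report the three javascript: cases on an affected version and no failures once formaction is protocol-sanitized.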