fix: remove double-encoding of image alt text #440
The image() method called escape_text() on text that was already HTML-escaped by the rendering pipeline (render_tokens → text → escape_text), causing entities like `&amp;` to become `&amp;amp;` in the alt attribute. Since striptags() preserves entities from the already-escaped input, the redundant escape_text() call is removed. Fixes lepture#432
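A minimal sketch of the double-escaping described above. It does not import mistune: `escape_text` and `striptags` here are simplified stand-ins built on the standard library (an assumption for illustration; the real helpers may differ in detail), and `alt_before` / `alt_after` are hypothetical names.

```python
import html
import re


def escape_text(s: str) -> str:
    # Stand-in for the renderer's escape helper (assumption): escapes &, <, > and quotes.
    return html.escape(s, quote=True)


def striptags(s: str) -> str:
    # Simplified stand-in: removes anything that looks like an HTML tag,
    # leaving entity references such as &amp; untouched.
    return re.sub(r"<[^>]*>", "", s)


raw = "dogs & cats"

# The rendering pipeline escapes the text once before image() receives it.
text = escape_text(raw)

# Old behaviour: escaping again turns the "&" of "&amp;" into "&amp;".
alt_before = escape_text(striptags(text))

# New behaviour: only strip tags; the text is already escaped exactly once.
alt_after = striptags(text)

print(alt_before)
print(alt_after)
```

The same pattern applies to `>` and `"`: any character escaped by the pipeline gains a second `&amp;` prefix when the result is escaped again.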
Pull request overview
Fixes double-encoding in rendered <img alt="..."> attributes by removing redundant escaping in the HTML renderer, and adds a regression test to prevent recurrence.
Changes:
- Stop re-escaping already-escaped image alt text in `HTMLRenderer.image()`.
- Add regression tests covering `&`, `>`, and `"` cases in image alt rendering.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/mistune/renderers/html.py | Removes redundant escaping in image() to prevent alt double-encoding. |
| tests/test_misc.py | Adds regression coverage ensuring alt is encoded exactly once. |
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #440 +/- ##
=======================================
Coverage 91.69% 91.69%
=======================================
Files 34 34
Lines 2638 2638
Branches 430 430
=======================================
Hits 2419 2419
Misses 147 147
Partials 72 72
Fixes #432
Problem
Image alt attributes are double-encoded. For example:
Cause
In `HTMLRenderer.image()`, the `text` parameter arrives already HTML-escaped from the rendering pipeline (`render_tokens` → `text()` → `escape_text()`). Calling `escape_text(striptags(text))` escapes entities a second time.
Fix
Remove the redundant `escape_text()` call. `striptags()` only strips HTML tags and preserves existing entities, so its output is already safe for use in an HTML attribute.
Before / After
| Before | After |
|---|---|
| `alt="dogs &amp;amp; cats"` | `alt="dogs &amp; cats"` |
| `alt="dogs &amp;gt; cats"` | `alt="dogs &gt; cats"` |
| `alt="&amp;quot;quoted&amp;quot;"` | `alt="&quot;quoted&quot;"` |
Validation
All 959 tests pass (including 596 CommonMark spec tests). A regression test is included.
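As a rough illustration of what a check for this class of bug could look like, the sketch below scans a rendered HTML fragment for entity references whose leading ampersand has itself been escaped. The helper name `find_double_encoded` is hypothetical and is not part of mistune or of the regression test in this PR. It is a heuristic: a document that intentionally contains the literal text `&amp;` would also match.

```python
import re

# Matches an escaped ampersand immediately followed by the rest of an entity
# reference, e.g. "&amp;amp;", "&amp;gt;", "&amp;#39;", "&amp;#x27;".
_DOUBLE_ENCODED = re.compile(
    r"&amp;(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);"
)


def find_double_encoded(fragment: str) -> list[str]:
    """Return every substring of `fragment` that looks double-encoded."""
    return _DOUBLE_ENCODED.findall(fragment)


before = 'alt="dogs &amp;amp; cats"'
after = 'alt="dogs &amp; cats"'

print(find_double_encoded(before))
print(find_double_encoded(after))
```

A check like this only complements exact-output assertions, which remain the more precise way to pin the expected `alt` value.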