fix: reset MagicEncode after font switch (ESC M) and hw init (ESC @) #729
larsblumberg wants to merge 5 commits into
Some printers (confirmed: NT-5890K) silently reset their active code page
back to the factory default after a font switch (`ESC M n`). `MagicEncode`
was unaware of this hardware-side reset: its cached `self.encoding` remained
stale, causing subsequent `text()` calls to skip `CODEPAGE_CHANGE`. Non-ASCII
characters were then sent in the previously-active encoding (e.g. CP1257)
but interpreted by the printer under its default code page, producing
garbled output.
`ESC @` (`hw("INIT")`) is defined in the ESC/POS spec as a full printer reset
that restores all settings to factory defaults, including the active code
page — so the same fix applies there by spec.
Fix: add `MagicEncode.reset_encoding()` which clears both `self.encoding`
and `self.encoder.used_encodings`, and call it from `set()` after every
font change and from `hw()` after `INIT`. This forces a fresh code page
selection and a `CODEPAGE_CHANGE` re-emission before the next text output.
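The effect of the reset can be sketched with a toy model. This is a minimal illustration, not the library's code: `ToyMagicEncode`, its `write()` signature, and the `CODEPAGE_CHANGE:` marker bytes are hypothetical stand-ins, and `used_encodings` is kept on the object itself rather than on a separate encoder.

```python
class ToyMagicEncode:
    """Simplified, hypothetical stand-in for MagicEncode (not the real class)."""

    def __init__(self, emit):
        self.emit = emit             # callable receiving bytes "sent" to the printer
        self.encoding = None         # code page the printer is believed to use
        self.used_encodings = set()  # code pages selected so far

    def reset_encoding(self):
        """Forget cached state after the printer may have reset its code page."""
        self.encoding = None
        self.used_encodings.clear()

    def write(self, text, encoding):
        # Announce a code page change only when the cached encoding differs.
        if self.encoding != encoding:
            self.emit(b"CODEPAGE_CHANGE:" + encoding.encode("ascii"))
            self.encoding = encoding
            self.used_encodings.add(encoding)
        self.emit(text.encode(encoding))


sent = []
magic = ToyMagicEncode(sent.append)
magic.write("ü", "cp850")  # first write: code page change, then text
magic.write("ü", "cp850")  # cached encoding matches: text only
magic.reset_encoding()     # e.g. after a font switch or INIT
magic.write("ü", "cp850")  # the code page change is announced again
```

Without the `reset_encoding()` call, the third write would send only the text byte, which is the stale-cache situation described above.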
Why `used_encodings` must also be cleared:
`self.encoding = None` is enough to ensure a `CODEPAGE_CHANGE` is emitted.
However, `used_encodings` biases `find_suitable_encoding()` toward previously
used code pages. After a reset, that preference is stale: on NT-5890K the
previously used CP1257 does not function correctly after `ESC M`, yet without
clearing, `MagicEncode` would re-select it, emit `CODEPAGE_CHANGE` → CP1257,
and send e.g. `ü` as `0xFC`, which is wrong in the printer's default code page. Clearing
`used_encodings` removes the stale bias and lets slot-number ordering take
over, landing on CP850 where `ü` = `0x81`, a byte that is correct in
virtually every Western code page regardless of whether the printer
honours the code page switch.
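The byte values can be checked with Python's built-in codecs. The list of "Western" code pages below is an assumption chosen for illustration; the PR does not state which code page the NT-5890K falls back to.

```python
CHAR = "ü"

# Bytes the host would send for "ü" under each encoding.
byte_cp1257 = CHAR.encode("cp1257")
byte_cp850 = CHAR.encode("cp850")

# Assumed candidates for a printer's default code page (illustrative only).
western_pages = ["cp437", "cp850", "cp852", "cp858"]

# What the printer would render if it ignored the switch and used its default.
rendered_from_cp1257_byte = {cp: byte_cp1257.decode(cp) for cp in western_pages}
rendered_from_cp850_byte = {cp: byte_cp850.decode(cp) for cp in western_pages}

for cp in western_pages:
    print(cp, repr(rendered_from_cp1257_byte[cp]), repr(rendered_from_cp850_byte[cp]))
```

In each of these code pages the CP850 byte still decodes to `ü`, while the CP1257 byte decodes to some other character.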
Follow-up: `used_encodings` could be removed from `MagicEncode` entirely.
`self.encoding` is the mechanism that actually avoids redundant switches: it
keeps the current code page as long as it can encode the next character,
without consulting `used_encodings` at all. `used_encodings` is only consulted
when a switch is already unavoidable — at which point it cannot prevent
any switch, it can only gamble on which encoding might be needed again
later. That saves at most one future switch in the rare case where the
same non-default code page is needed again after having been forced away.
Removing `used_encodings` would also make `reset_encoding()` unnecessary:
with only `self.encoding` to clear, callers would just write
`self.magic.encoding = None` directly, with no need for a helper method.
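The argument above can be sketched as a selection function. This is a simplified illustration of the behaviour described in this PR, not the library's implementation: `choose_encoding`, `can_encode`, and the slot numbers in `SLOTS` are hypothetical.

```python
def can_encode(char, codepage):
    """Return True if the code page has a byte for this character."""
    try:
        char.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


def choose_encoding(char, current, slots, used_encodings):
    # Keep the current code page whenever it can encode the character;
    # used_encodings plays no role on this path.
    if current is not None and can_encode(char, current):
        return current
    # A switch is unavoidable: prefer previously used pages, then lowest slot.
    ranked = sorted(slots, key=lambda cp: (cp not in used_encodings, slots[cp]))
    for cp in ranked:
        if can_encode(char, cp):
            return cp
    return None


# Hypothetical slot numbers, for illustration only.
SLOTS = {"cp850": 2, "cp858": 19, "cp1257": 25}

kept = choose_encoding("ü", "cp1257", SLOTS, set())      # current page is kept
biased = choose_encoding("ü", None, SLOTS, {"cp1257"})   # stale bias wins
unbiased = choose_encoding("ü", None, SLOTS, set())      # lowest slot wins
forced = choose_encoding("ą", "cp850", SLOTS, set())     # cp850 cannot encode it
```

The first branch is what avoids redundant switches; the `used_encodings` term in the sort key only changes which page is picked once a switch is already happening.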
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers the four key behaviours introduced by the fix:
- reset_encoding() sets self.encoding to None
- reset_encoding() clears encoder.used_encodings
- the next write after reset always re-emits CODEPAGE_CHANGE (even for the same encoding that was active before the reset)
- after clearing used_encodings, find_suitable_encoding picks the lowest-slot encoding rather than the previously-used high-slot one (the exact scenario that caused the NT-5890K umlaut bug)
Hi @larsblumberg and welcome to python-escpos. First of all, I want to thank you for opening this PR, our first AI-assisted contribution!! Well, we don't have an AI policy yet, so I don't know if @patkan or other contributors would have any concerns about AI-assisted code. I'm short on time but I'll try to review this PR in the next few days.
Hi @belono, thanks for getting back!
Thank you for your warm welcome!
I've been using this library for 3 years now, and it has been doing a great job so far! Except that I wanted to finally tackle the char code problem that I've been noticing for a long time. LLM support has made the root cause analysis for the proposed bug fix very simple! As the PR description shows, I tested the proposed fix properly, with proof attached via before/after prints. Since we are using this library in 2 projects I'd love to continue contributing to
Thank you, I'd appreciate your feedback on this PR and getting the fix eventually merged.
After a first look at the implementation I think the code looks quite good and the thorough tests are very welcome. The helper method reset_encoding() looks good to me too as it helps with readability. Although, I see a side effect in the new behavior of the
If we cannot work around this behavior, we should at least document it.
…fault()
set_with_default() always passes font="a" to set(), which triggered reset_encoding() even when the font hadn't changed. Track _font state so reset only fires on actual font switches.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
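The font-tracking idea from this commit can be sketched as follows. `ToyPrinter` is a hypothetical stand-in: the real `set()` takes many more parameters, and whether the very first font command triggers a reset in the actual change is an assumption made here.

```python
class ToyPrinter:
    """Hypothetical stand-in showing the _font guard (not the library's class)."""

    def __init__(self):
        self.sent = []            # raw commands "sent" to the printer
        self._font = None         # last font sent via ESC M
        self.encoding_resets = 0  # counts reset_encoding() calls

    def reset_encoding(self):
        self.encoding_resets += 1

    def set(self, font="a"):
        # Skip the command and the reset when the font is unchanged.
        if font == self._font:
            return
        # ESC M n: n=0 selects font A, n=1 selects font B.
        self.sent.append(b"\x1bM" + (b"\x00" if font == "a" else b"\x01"))
        self._font = font
        self.reset_encoding()

    def set_with_default(self):
        # Always asks for font "a", as described in the commit message.
        self.set(font="a")


printer = ToyPrinter()
printer.set_with_default()  # first call: font A is sent, encoding is reset
printer.set_with_default()  # same font again: nothing is sent, no reset
printer.set(font="b")       # actual font switch: sent, encoding is reset
```

Only the two real font changes produce an `ESC M` command and a reset; the repeated `set_with_default()` call is a no-op.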
belono left a comment
The implementation looks very good to me. Clear, simple and explanatory.
I find some inline comments a bit verbose and redundant, as they practically repeat the information found in the docstring of reset_encoding(), which is where users can find the documentation. However, it's not a blocker for me.
Feel free to edit my suggestions with your own words.
Suggested change:
- # Last font sent via ESC M; lets set() skip redundant font commands
- # and avoid spurious reset_encoding() from set_with_default(). (#729)
+ # Keep track of the current font
I find these comments too verbose. One line is enough.
Suggested change:
- # Some printers (confirmed: NT-5890K) reset their active code page
- # when switching fonts (ESC M). Invalidate the cached encoding so
- # the next text() call re-emits CODEPAGE_CHANGE before sending text.
- # See https://github.com/python-escpos/python-escpos/pull/729
+ # Force a fresh code page selection as required by some printer models (confirmed: NT-5890K)
I find these comments too verbose. Condense into one line.
Suggested change:
- # ESC @ is defined in the ESC/POS spec as a full printer reset that
- # restores all settings to factory defaults, including the active
- # code page. Invalidate the cached encoding so the next text() call
- # re-emits CODEPAGE_CHANGE rather than silently sending the wrong bytes.
+ # ESC @ resets all settings including the active code page.
+ # Force a fresh code page selection.
I find these comments too verbose. Condense into fewer lines.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hi @belono, thanks for your feedback! I agree that the comments were too chatty. I initially preferred to over-communicate why the new behavior was needed, but simpler comments work just as well. Just pushed a commit applying your suggestions.
Bug description
Some printers (confirmed: Netum NT-5890K) silently reset their active code page back to the factory default after a font switch (`ESC M n`). `MagicEncode` was unaware of this hardware-side reset: its cached `self.encoding` remained stale, causing subsequent `text()` calls to skip `CODEPAGE_CHANGE`. Non-ASCII characters were then sent in the previously active encoding (e.g. CP1257) but interpreted by the printer under its default code page, producing garbled output.
`ESC @` (`hw("INIT")`) is defined in the ESC/POS spec as a full printer reset that restores all settings to factory defaults, including the active code page, so the same fix applies there by spec.
Proposed fix
Fix: add `MagicEncode.reset_encoding()`, which clears both `self.encoding` and `self.encoder.used_encodings`, and call it from `set()` after every font change and from `hw()` after `INIT`. This forces a fresh code page selection and a `CODEPAGE_CHANGE` re-emission before the next text output.
Why `used_encodings` must also be cleared: `self.encoding = None` is enough to ensure a `CODEPAGE_CHANGE` is emitted. However, `used_encodings` biases `find_suitable_encoding()` toward previously used code pages. After a reset, that preference is stale: on NT-5890K the previously used CP1257 does not function correctly after `ESC M`, yet without clearing, `MagicEncode` would re-select it, emit `CODEPAGE_CHANGE` → CP1257, and send e.g. `ü` as `0xFC`, which is wrong in the printer's default code page. Clearing `used_encodings` removes the stale bias and lets slot-number ordering take over, landing on CP850 where `ü` = `0x81`, a byte that is correct in virtually every Western code page regardless of whether the printer honours the code page switch.
Demo of the bug and the applied fix: [images: before/after prints]
Future work
Follow-up: `used_encodings` could be removed from `MagicEncode` entirely. `self.encoding` is the mechanism that actually avoids redundant switches: it keeps the current code page as long as it can encode the next character, without consulting `used_encodings` at all. `used_encodings` is only consulted when a switch is already unavoidable. At that point it cannot prevent any switch; it can only gamble on which encoding might be needed again later. That saves at most one future switch in the rare case where the same non-default code page is needed again after having been forced away. Removing `used_encodings` would also make `reset_encoding()` unnecessary: with only `self.encoding` to clear, callers would just write `self.magic.encoding = None` directly, with no need for a helper method.