author Avi Halachmi (:avih) <avihpit@yahoo.com> 2026-03-18 13:38:24 +0200
committer Ron Yorston <rmy@pobox.com> 2026-03-19 15:04:16 +0000
commit df652277439a30a973438577b1a370f4a7d2f47c
tree 68a2b9a321058d929d4bfec675a94041a6a04fad /scripts/embedded_scripts
parent 0b0ab67527a62f567a88fd674fbe0c2b2499c87e
win32: UTF8_OUTPUT: refine bad-sequence output
Previously, in writeCon_utf8, when we detected a byte that was invalid for the current state, we printed one '?' which also covered any following invalid bytes, until a byte valid for state 0 was detected. For instance, printf '\377\377\377A' printed '?A' (3 bad bytes, 1 '?'). This was by design, to avoid excessive '?' noise.

However, other terminals (xterm), and specifically the Windows console (and terminal), print one '?' for every decoding error, and also reset the decoding state after every error. I.e. the same input would error 3 times and display '???A'.

Now we do the same, which also happens to simplify the code.

The reference behavior is the Windows console/terminal in the UTF-8 codepage (which writeCon_utf8 tries to emulate in other console codepages). To compare, do 'chcp 65001' to set the console to UTF-8 - which also bypasses writeCon_utf8 - and check how the terminal displays some sequence. We should be the same, up to the CONFIG_SUBST_WCHAR value.

The "state" comment is updated since we no longer maintain a bad state. While at it, also refine a few nearby comments.
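
For illustration only, the "one '?' per decoding error, reset the state after each error" rule could be sketched as below. This is not the writeCon_utf8 code: the function name sanitize_utf8 is hypothetical, the replacement character is hardcoded to '?' rather than taken from CONFIG_SUBST_WCHAR, the whole input is assumed to arrive in one buffer (no state is carried across calls), and only lead-byte ranges and continuation bytes are checked, so overlong forms and surrogates are not rejected.

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration, not busybox-w32 code).
 * Copies src to dst, emitting one '?' for every decoding error and returning
 * to state 0 after each error. Well-formed sequences are copied unchanged.
 * dst must have room for n + 1 bytes. Returns the number of bytes written,
 * not counting the terminating NUL.
 */
size_t sanitize_utf8(const unsigned char *src, size_t n, char *dst)
{
    size_t i = 0, out = 0;

    while (i < n) {
        unsigned char c = src[i];
        size_t need;    /* continuation bytes expected after the lead byte */
        size_t k, j;

        if (c < 0x80) {
            need = 0;
        } else if (c >= 0xC2 && c <= 0xDF) {
            need = 1;
        } else if (c >= 0xE0 && c <= 0xEF) {
            need = 2;
        } else if (c >= 0xF0 && c <= 0xF4) {
            need = 3;
        } else {
            /* Invalid in state 0 (stray continuation or bad lead byte):
             * one error, one '?', and we stay in state 0. */
            dst[out++] = '?';
            i++;
            continue;
        }

        /* Count how many bytes of the sequence are actually present. */
        k = 1;
        while (k <= need && i + k < n && (src[i + k] & 0xC0) == 0x80)
            k++;

        if (k <= need) {
            /* Incomplete sequence: one '?' for the part seen so far.
             * The offending byte is then re-examined in state 0. */
            dst[out++] = '?';
            i += k;
            continue;
        }

        /* Complete sequence: copy it through unchanged. */
        for (j = 0; j <= need; j++)
            dst[out++] = (char)src[i + j];
        i += need + 1;
    }

    dst[out] = '\0';
    return out;
}
```

With this sketch, the bytes of '\377\377\377A' come out as '???A' (three errors, three '?'), matching the new behavior described above, while a sequence cut short by an ASCII byte, such as '\342\202A', yields a single '?' followed by 'A'.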
Diffstat (limited to 'scripts/embedded_scripts')
0 files changed, 0 insertions, 0 deletions