busybox-w32 - A mirror of https://github.com/rmyorston/busybox-w32.git

	Commit message	Author	Age	Files	Lines
...
*	win32: fix uname(2) if ARM architecture is undefined (tag: FRP-5301-gda71f7c57)	Ron Yorston	2024-02-20	1	-0/+2
\| \| \| \| \|	Older versions of mingw don't define PROCESSOR_ARCHITECTURE_ARM64. Don't let this stop the build.
*	win32: avoid console windows from CGI scripts	Ron Yorston	2024-02-07	1	-1/+1
\| \| \| \| \| \| \| \| \|	When httpd is run in the background its processes are detached from the console. CGI scripts could create subprocesses which needed a console, resulting in annoying console windows appearing. Prevent this by changing the creation flags for CGI scripts to CREATE_NO_WINDOW.
*	win32: UTF8_OUTPUT: recover quicker from bad byte	Avi Halachmi (:avih)	2024-01-31	1	-12/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an unexpected value is detected in UTF-8, we should print the placeholder codepoint, and then recover whenever we detect a value which is valid for starting a new UTF-8 codepoint (including ASCII7). However, previously, we only tested recovery at the bytes following the unexpected one, and so if the first unexpected value was also valid for a new codepoint, we didn't recover from it. Now we check for recovery from the first unexpected byte, which, if recoverable, requires both placeholder printout and recovery, so the recovery "unwinding" is modified a bit to allow the placeholder. Example of a sequence which now recovers quicker than before (where UTF-8 for U+1F600 "😀" is: 0xF0 0x9F 0x98 0x80): printf "\xF0\xF0\x9F\x98\x80A" Previously: ?A Now: ?😀A
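The recovery rule described above can be illustrated with a minimal sketch. This is not the busybox-w32 console writer: `utf8_tail_len()` and `utf8_sanitize()` are hypothetical names, the placeholder is a plain '?', and overlong or surrogate encodings are not checked. The point is only that the byte which breaks a sequence is retested as a possible lead byte instead of being skipped.

```c
#include <stddef.h>

/* Number of continuation bytes expected after lead byte c, or -1 if c
 * cannot start a UTF-8 sequence (hypothetical helper). */
static int utf8_tail_len(unsigned char c)
{
	if (c < 0x80)
		return 0;			/* ASCII */
	if (c >= 0xC2 && c <= 0xDF)
		return 1;
	if (c >= 0xE0 && c <= 0xEF)
		return 2;
	if (c >= 0xF0 && c <= 0xF4)
		return 3;
	return -1;			/* continuation byte or invalid lead */
}

/* Copy src to dst, replacing each malformed sequence with '?'.
 * dst needs room for len bytes; the return value is the output length. */
size_t utf8_sanitize(const unsigned char *src, size_t len, unsigned char *dst)
{
	size_t i = 0, o = 0;

	while (i < len) {
		int need = utf8_tail_len(src[i]);
		size_t j = i + 1;

		if (need < 0) {			/* cannot start a sequence */
			dst[o++] = '?';
			i++;
			continue;
		}
		while (need > 0 && j < len && (src[j] & 0xC0) == 0x80) {
			j++;
			need--;
		}
		if (need == 0) {
			while (i < j)		/* complete sequence: copy it */
				dst[o++] = src[i++];
		} else {
			dst[o++] = '?';		/* placeholder for the bad prefix */
			i = j;			/* retest the unexpected byte as a lead */
		}
	}
	return o;
}
```

Fed the six bytes from the commit's example (0xF0 0xF0 0x9F 0x98 0x80 'A'), the sketch emits '?' followed by the intact four-byte sequence and 'A'.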
*	win32: import dirname(3) from mingw-w64	Ron Yorston	2024-01-30	2	-0/+288
\| \| \| \| \| \| \| \| \| \| \| \| \|	The mingw-w64 project has updated its implementation of dirname(3). In some circumstances the new version doesn't preserve the type of the user-supplied top-level directory separator. As a result of this the dirname-handles-root test case failed. Import the new implementation and tweak it to preserve the type of the separator. This only affects mingw-w64 versions 12 and above. Currently only the aarch64 build using llvm-mingw is affected.
*	win32: hardcode numeric value of MANIFEST resource	Ron Yorston	2024-01-23	1	-2/+3
\| \| \| \| \|	It seems windres in llvm doesn't understand MANIFEST resources. Use the numeric value 24 instead.
*	win32: add aarch64 to uname(2)	Ron Yorston	2024-01-21	1	-0/+3
\| \| \| \| \|	For Windows on ARM we need to report the aarch64 processor architecture.
*	build system: more clang/llvm tweaks	Ron Yorston	2024-01-18	2	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Linkers associated with clang/llvm may not support the -r option. This is used to create built-in.o object files. It turns out that all such files in busybox-w32 are either empty or only contain one object file. The first case is already supported and the second can be handled by simply copying the object file to built-in.o. The linker is therefore never invoked with the -r option. One adjustment is required: the workaround adopted for GitHub issue #200 linked the dummy C file with the resource object file. This is no longer done so only one object file is used. Since it was the linking that broke the resource file, copying it is an equally effective fix for the issue. Some old linkers don't support the --warn-common option. The lack of this option was being detected but it was still sometimes used.
*	win32: fix detection of directories in stat(2)	Ron Yorston	2024-01-04	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The implementation of stat(2) detected whether a pathname ending with a directory separator was valid by checking for the error code ERROR_INVALID_NAME when GetFileAttributesExA() failed. This works if the path refers to an actual disk but not if it's on a share. In the latter case the glob '*/' incorrectly returned files that weren't directories. Add code to handle this case. Costs 16-32 bytes. (GitHub issue #381)
*	win32: code shrink procps_scan()	Ron Yorston	2023-12-31	1	-2/+2
\| \| \| \| \| \|	Use getpid() instead of GetProcessId(GetCurrentProcess()). Saves 16 bytes.
*	httpd: consistently leak memory, or not	Ron Yorston	2023-12-31	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	create_detached_process() is only used when running a CGI script. Previously it leaked the return values from quote_arg() but freed the command line it built. Whether or not the CGI script is successfully run its parent process exits almost immediately, so there's no pressing need to free the memory. If FEATURE_CLEAN_UP is disabled (which it is by default) don't bother. Saves 16 bytes.
*	win32: fix clang error/warning	Ron Yorston	2023-12-31	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Since clang doesn't seem to know about ffs(3) make it use __builtin_ffs() instead. Fix a warning in process_escape() in winansi.c: result of comparison of constant -1 with expression of type 'WORD' (aka 'unsigned short') is always true. Change the error value returned by process_colour() from -1 to 0xffff. Costs 16 bytes.
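The warning arises because an unsigned 16-bit value is promoted to an int in the range 0..65535 before the comparison, so it can never equal -1. A small sketch of the in-range sentinel approach follows; `word_t`, `parse_colour()` and `colour_is_valid()` are hypothetical stand-ins, not the code in winansi.c.

```c
#include <stdint.h>

typedef uint16_t word_t;		/* stand-in for the Windows WORD type */
#define COLOUR_ERROR 0xffffu		/* sentinel that fits in 16 bits */

/* Hypothetical parser: accept colour indices 0..7, else report an error. */
static word_t parse_colour(int n)
{
	return (n >= 0 && n <= 7) ? (word_t)n : (word_t)COLOUR_ERROR;
}

/* The comparison is meaningful because the sentinel is representable
 * in word_t; comparing against -1 would never match. */
int colour_is_valid(int n)
{
	return parse_colour(n) != COLOUR_ERROR;
}
```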
*	httpd: enable support for CGI	Ron Yorston	2023-12-20	1	-2/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The upstream code uses fork/exec when running a CGI process. Emulate this by: - Spawning a child httpd process with the special '-I 0' option, along with the options provided on the server command line. This sets up the proper state then calls the cgi_handler() function. - The cgi_handler() function fixes the pipe file descriptors and starts another child process to run the CGI script. These processes are detached from the console on creation. When spawn() functions are run in P_DETACH mode they don't connect to the standard file descriptors. Normally this doesn't matter but the process which runs the CGI scripts needs to inherit the pipe endpoints. The create_detached_process() function handles this. See: https://github.com/rprichard/win32-console-docs/blob/master/README.md Adds about 2.9Kb to the size of the binary. (GitHub issue #266)
*	win32: code shrink execve(2) implementation	Ron Yorston	2023-12-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	Commit 6d6856355a (win32: handle -1 return status from execve(2)) added a test of errno to distinguish between failure to run a program and the program returning -1. Subsequent changes in commit 9db9b34ada (win32: ignore ctrl-c in parent of execve(2)) make this test unnecessary. Remove it. Saves 16-32 bytes.
*	httpd: fix return code when run in background	Ron Yorston	2023-12-15	1	-11/+5
\| \| \| \| \| \| \| \| \| \| \|	When httpd was run in the background the return code of the parent process was incorrect. It seems when spawn() is run in _P_DETACH mode it returns 0 on success, not a process handle. Fix the test for the return code and alter mingw_spawn_detach() so it doesn't treat the return from spawn() as a handle. Saves 32 bytes.
*	win32: only search PATH for compressor	Ron Yorston	2023-11-14	1	-12/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mingw_fork_compressor() uses CreateProcess() to run the compressor program. This will often be an instance of BusyBox, but since the xz and lzma applets in BusyBox don't support compression it can be an external program. It was intended that the external program should be found using PATH. However, CreateProcess() looks in various other places before trying PATH. In particular, it first looks in the directory of the current executable, then in the current directory of the process. This can result in the wrong xz.exe or lzma.exe being found. Perform an explicit PATH search and force CreateProcess() to use the result. This change only affects the search for a compressor. The same problem also affects other uses of our popen(3) emulation. These may be addressed in future. Costs 64-80 bytes. (GitHub issue #376)
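The idea of an explicit PATH search can be sketched in portable C. This is an illustration only, not the busybox-w32 code: `find_in_path()` is a hypothetical helper, the separator is passed in (';' on Windows, ':' elsewhere), existence is probed with fopen() for simplicity, and executable extensions are not handled. On Windows the resulting full path would then be handed to CreateProcess() as the application name so that no further searching takes place.

```c
#include <stdio.h>
#include <string.h>

/* Look for 'name' in each directory of the PATH-style list 'path'.
 * On success the full path is left in 'out' and 1 is returned;
 * otherwise 0 is returned and 'out' is unspecified. */
int find_in_path(const char *path, char sep, const char *name,
		char *out, size_t outlen)
{
	while (path && *path) {
		const char *end = strchr(path, sep);
		size_t dirlen = end ? (size_t)(end - path) : strlen(path);

		/* skip empty elements and candidates that don't fit */
		if (dirlen > 0 && dirlen + 1 + strlen(name) + 1 <= outlen) {
			FILE *fp;

			memcpy(out, path, dirlen);
			out[dirlen] = '/';
			strcpy(out + dirlen + 1, name);
			fp = fopen(out, "rb");	/* crude existence test */
			if (fp) {
				fclose(fp);
				return 1;
			}
		}
		path = end ? end + 1 : NULL;
	}
	return 0;
}
```

Because only the listed directories are consulted, neither the directory of the running executable nor the current directory can shadow the intended program.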
*	win32: avoid terminal weirdness induced by Gradle	Ron Yorston	2023-10-24	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Commit 87a3ddc06 (win32: avoid terminal weirdness induced by Gradle?) correctly diagnosed the problem but got the cure wrong. Reset DISABLE_NEWLINE_AUTO_RETURN in the new mode, not the old one. Otherwise the change isn't applied. Saves 48 bytes. (GitHub issue #372)
*	win32: avoid terminal weirdness induced by Gradle?	Ron Yorston	2023-10-22	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	GitHub issue #372 reports that in certain circumstances (which I've been unable to reproduce) Gradle leaves the terminal in a state where linefeeds seem not to result in a carriage return. This might be because Gradle sets DISABLE_NEWLINE_AUTO_RETURN in the terminal mode. Reset DISABLE_NEWLINE_AUTO_RETURN to zero before the shell prompt is issued to see if this makes any difference. Costs 16-32 bytes.
*	win32: fix handling of relative paths	Ron Yorston	2023-10-04	1	-12/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 548ec7045b (win32: interpret absolute paths as relative to %SYSTEMDRIVE%) introduced the function xabsolute_path() to make relative paths absolute. This is used in 'dpkg' and 'man' to avoid having to tinker with absolute paths from upstream. Unfortunately, it's too eager to use the relative path. This results in dpkg failing to install deb files with a relative path. Saves 32-48 bytes. (GitHub issue #371)
*	sort: add support for sorting version strings	Ron Yorston	2023-10-01	2	-0/+63
\| \| \| \| \| \| \| \| \| \| \|	Add an implementation of strverscmp from musl so that the 'sort -V' option works. Add '-V' to the trivial usage message. Costs 248-256 bytes. (GitHub issue #370)
*	win32: missing support for "app exec link" reparse points	Ron Yorston	2023-09-18	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	Commit 603af9bb9 (win32: support "app exec link" reparse points) added support for IO_REPARSE_TAG_APPEXECLINK reparse points by pretending they're symbolic links. One change was missed, in the implementation of dirent. Costs 16 bytes.
*	win32: improved support for manifests	Ron Yorston	2023-09-14	4	-0/+50
\| \| \| \| \| \| \| \| \| \|	The UTF-8 manifest has been updated to include features from the standard application manifest. Include a copy of the standard application manifest for toolchains that don't provide one. (GitHub issue #366)
*	win32: convert exit codes	Ron Yorston	2023-09-14	2	-8/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add two utility functions to convert Windows process exit codes. - exit_code_to_wait_status() converts to a POSIX wait status. This is used in ash and the implementations of system(3) and mingw_wait3(). - exit_code_to_posix() converts to a POSIX exit code. (Not that POSIX has much to say about them.) As a result it's possible for more applets to report when child processes are killed as if by a signal. 'time', 'drop' and 'su -W', for example. Adds 64-80 bytes.
*	make: return non-zero exit status when a command fails	Ron Yorston	2023-09-12	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a build command returned a non-zero exit status 'make' reported a warning and returned an exit code of zero. This was due to the misuse of the status returned by system(3). As the man page says: the return value is a "wait status" that can be examined using the macros described in waitpid(2). (i.e., WIFEXITED(), WEXITSTATUS(), and so on). Use the error() function to correctly report the problem on stderr and return an exit status of 2. Some additional changes in the same area: - When a target is removed report the diagnostic on stderr, as required by POSIX. - When a build command receives a signal GNU make removes the target. bmake doesn't and it isn't required by POSIX. Implement this as an extension. - Expand the error message when a build command fails so it includes the exit status or signal number, as obtained from the value returned by system(3). - Alter the WIN32 implementation of system(3) to handle exit codes which represent termination as if by a signal. Adds 200-240 bytes. (GitHub issue #354)
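The distinction between a wait status and an exit code can be shown with a short POSIX sketch. It is not the code from the 'make' applet: `status_to_exit_code()` and `run_command()` are hypothetical names, and mapping a signal to 128 plus its number is the common shell convention, assumed here purely for illustration.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Decode the wait status returned by system(3) into an exit code.
 * Returns -1 if the command could not be run or the status is neither
 * a normal exit nor termination by a signal. */
int status_to_exit_code(int status)
{
	if (status == -1)
		return -1;			/* system(3) itself failed */
	if (WIFEXITED(status))
		return WEXITSTATUS(status);	/* normal exit */
	if (WIFSIGNALED(status))
		return 128 + WTERMSIG(status);	/* assumed shell convention */
	return -1;
}

/* Run a command through the shell and report its decoded exit code. */
int run_command(const char *cmd)
{
	return status_to_exit_code(system(cmd));
}
```

Testing the raw return value of system(3) against zero happens to work for success, but treating it as the exit code loses the difference between "exited with status N" and "killed by signal N".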
*	win32: UTF8_INPUT: fix combining of some surrogates pairs	Avi Halachmi (:avih)	2023-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \|	The construction of a codepoint from a surrogates pair was incorrect when the result should have had the 0x10000 bit unset, due to logical "\|" instead of arithmetic "+" of 0x10000 (so the 0x10000 bit was set incorrectly when the result should have been U+[1]{0,2,4...C,E}XXXX). For instance: typing or pasting U+20000 𠀀
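The difference between OR and addition is easy to check in isolation. The sketch below uses hypothetical function names and standard UTF-16 arithmetic; it is not the busybox-w32 source. For U+20000 (surrogates 0xD840 0xDC00) the 20-bit payload already has bit 0x10000 set, so OR-ing in 0x10000 changes nothing while adding it carries into the next bit.

```c
#include <stdint.h>

/* Correct: add the 0x10000 offset to the 20-bit payload. */
uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
	uint32_t payload = ((uint32_t)(hi - 0xD800) << 10) |
				(uint32_t)(lo - 0xDC00);
	return 0x10000 + payload;
}

/* Buggy variant for comparison: OR cannot carry, so any payload that
 * already has bit 0x10000 set yields the wrong codepoint. */
uint32_t combine_surrogates_buggy(uint16_t hi, uint16_t lo)
{
	uint32_t payload = ((uint32_t)(hi - 0xD800) << 10) |
				(uint32_t)(lo - 0xDC00);
	return 0x10000 | payload;
}
```

With U+1F600 (0xD83D 0xDE00) both variants agree, which is why the bug only showed up for some codepoints.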
*	Merge pull request #355 from avih/utf8-output-speedup	Ron Yorston	2023-08-25	1	-2/+8
\|\ \| \| \| \|	win32: UTF8_OUTPUT: speedup for big outputs
\| *	win32: UTF8_OUTPUT: speedup for big outputs	Avi Halachmi (:avih)	2023-08-24	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the native Windows console, writeCon_utf8 which converts a stream of UTF8 into console output is about 1.4x slower for big unicode writes than the native fwrite (e.g. when the console codepage is UTF8), which is not too bad. However, newer versions of conhost are quicker, e.g. OpenConsole.exe (which is conhost) which ships with the Windows terminal is about 4x faster than the native conhost in processing (unicode?) input. And when conhost can process inputs much quicker, it turned out that fwrite throughput was nearly 3x better than writeCon_utf8. Luckily, this turned out to be mainly due to the internal 256 wide chars buffer which writeCon_utf8 uses, and that with 4096 buffer it becomes only ~ 10% slower than fwrite, which is much better. However, making the console window very small such that it needs to spend very little time on rendering, makes it apparent that there's still a difference - writeCon_utf8 is about 30% slower than fwrite, but that's still not bad, and that's also an uncommon use case. So this commit increases the buffer, and also allocates it dynamically (once) to avoid abusing the stack with an additional 8K in one call.
* \|	make: fix POSIX build	Ron Yorston	2023-08-24	1	-2/+1
\|/ \| \| \| \| \| \| \|	If upstream BusyBox had a 'make' applet a native build with it enabled should match the corresponding build from the busybox-w32 source. Make it so.
*	win32: replace readlink(2)	Ron Yorston	2023-08-21	1	-11/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Windows implementation of readlink(2) has caused problems in the past. As, for example, with commit c29dc205d2 (win32: fix implementation of readlink(2)). Most uses of readlink(2) in BusyBox are actually calls to the (considerably more convenient) library function xmalloc_readlink(). Implement a Windows version of that and use it instead of readlink(2). This improves the handling of symbolic links (and similar reparse points) in CJK and UTF-8 code pages. Saves 48-80 bytes.
*	ash: detect console state on shell start up (branch: noconsole2)	Ron Yorston	2023-08-20	1	-14/+0
\| \| \| \| \| \| \| \| \| \| \|	Set 'noconsole' to match the actual state of the console (normal/ iconified) when the shell is started. Thus ShowWindow() will only be called if the actual state differs from the default or user defined state. Costs 20-24 bytes. (GitHub issue #325)
*	win32: disable console output conversion with LC_ALL=C	Avi Halachmi (:avih)	2023-08-03	1	-6/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, when writing to the console, the non-unicode build always assumed the source data is in the ANSI codepage, and used charToCon to convert it unconditionally to the console CP. Similarly, the unicode build made the same assumption (where ANSI CP is UTF8), and always tried to convert it so that it's printed correctly (at least when FEATURE_UTF8_OUTPUT is enabled - which it is by default in the unicode build). However, there could be cases where this assumption is incorrect, for instance if the data comes from a file encoded for some codepage X, and the user, after also changing the console CP to X, does 'cat file.X'. This commit allows disabling this conversion, using the same env vars which can be used to disable the locale/unicode elsewhere (LANG, LC_CTYPE, LC_ALL as "C"), e.g. 'LC_ALL=C cat file.X' now doesn't convert, and the console renders it according to its own codepage.
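A check of this kind might look like the sketch below. The function names are hypothetical and the lookup order (LC_ALL, then LC_CTYPE, then LANG, using the first one that is set and non-empty) is the usual POSIX precedence, assumed here rather than taken from the busybox-w32 source.

```c
#include <stdlib.h>
#include <string.h>

/* Return 1 if the effective locale setting is exactly "C".
 * Any argument may be NULL (variable unset). */
int locale_is_c(const char *lc_all, const char *lc_ctype, const char *lang)
{
	const char *s = (lc_all && *lc_all) ? lc_all :
			(lc_ctype && *lc_ctype) ? lc_ctype : lang;

	return s != NULL && strcmp(s, "C") == 0;
}

/* Hypothetical wrapper: should console output conversion be skipped? */
int console_conversion_disabled(void)
{
	return locale_is_c(getenv("LC_ALL"), getenv("LC_CTYPE"),
				getenv("LANG"));
}
```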
*	win32: add FEATURE_UTF8_OUTPUT (enabled with unicode)	Avi Halachmi (:avih)	2023-08-03	1	-0/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, the unicode build required console (out) codepage of UTF8 in order for unicode output to be printed correctly - e.g. at the shell command prompt or the output of `ls` for unicode file names. This is inconvenient, because by default it's not UTF8, and so unless the user invoked 'chcp 65001' - by default unicode output didn't work. This feature (which is now enabled for the unicode build) makes it print unicode output correctly regardless of the console CP, by using a new stream-conversion function from UTF8 chars to wchar_t, and writing those using WriteConsoleW. If the console CP happens to be UTF8 - this conversion is disabled. We could have instead changed the console CP to UTF8, but that's a slippery slope, and some old programs which expect the default CP might get broken, so achieving the same result without touching the console CP is hopefully better.
*	win32: unify 'convert and write to console' (no-op)	Avi Halachmi (:avih)	2023-08-03	1	-17/+44
\| \| \| \| \| \| \| \|	Use one call to do both charToCon and then write it to the console. Technically, this commit only reduces boilerplate code slightly, but it also makes it easier for future modifications to make changes to this sequence in one place.
*	win32: support build with FEATURE_UNICODE_SUPPORT	Avi Halachmi (:avih)	2023-07-22	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	FEATURE_UTF8_MANIFEST enables Unicode args and filenames on Win 10+. FEATURE_UTF8_INPUT allows the shell prompt to digest correctly Unicode strings (as UTF8) which are typed or pasted. This commit adds support for building with FEATURE_UNICODE_SUPPORT (mostly by supporting 32 bit wchar_t which busybox expects): - Unicode-aware line-edit - for the most part cursor movement/del being (UTF8) codepoint-aware rather than assuming that one-byte equals one-char-on-screen. - Codepoint-aware operations in some other utils, like rev or wc -c. - When UNICODE_COMBINING_WCHARS and UNICODE_WIDE_WCHARS are enabled, some screen-width-aware operations, like with fold, ls, expand, etc. The busybox Unicode support is incomplete, and even less so with the builtin libc replacement functions, like wcwidth, which are active when UNICODE_USING_LOCALE is unset (mingw lacks those functions). FEATURE_CHECK_UNICODE_IN_ENV should be set so that Unicode is not hardcoded but rather depends on the ANSI codepage and some env vars: LC_ALL=C disables Unicode support, else it's enabled if ACP is UTF8. There's at least one known issue where the tab-completion-prefix-case is not updated correctly, e.g. ~/desk<tab> completes to ~/desktop/ instead of ~/Desktop/, because the code which handles it exists only at the non-unicode code paths, but that's not very critical. That seems to be the only case where mingw-specific code is disabled when Unicode is enabled, but there could be other unknown issues. None of the Unicode options is enabled by default, and the next commit will make it easier to create a build which supports Unicode.
*	win32: UTF8 input: improve missing-key-down hack	Avi Halachmi (:avih)	2023-07-21	1	-11/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The UTF8 input code works around an issue when pasting at the windows console (but not terminal) that sometimes we get key-up without a prior matching key-down - in which case it generates a key-down. However, previously it detected this by comparing an up-event to the last down-event, which could result in a false positive in cases like: X-down Y-down X-up Y-up (e.g. when typing quickly). Now it remembers the last 8 key-down events when searching for a prior matching key-down, which fixes an issue of incorrect repeated keys (in the example above X-up was incorrectly changed to X-down).
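A small ring of recent key-downs is enough to handle interleaved events. The sketch below is an assumption-laden illustration, not the busybox-w32 code: the structure, the function names and the choice to clear an entry once its key-up is seen are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define KEY_HISTORY 8	/* number of remembered key-down events */

struct key_history {
	uint16_t key[KEY_HISTORY];
	bool pending[KEY_HISTORY];	/* down seen, up not yet seen */
	unsigned next;			/* next slot to overwrite */
};

/* Record a key-down, overwriting the oldest slot. */
void key_down(struct key_history *h, uint16_t key)
{
	unsigned slot = h->next++ % KEY_HISTORY;

	h->key[slot] = key;
	h->pending[slot] = true;
}

/* On key-up: return true if a matching key-down was remembered (and
 * forget it). A false return means the up-event arrived on its own. */
bool key_up_has_down(struct key_history *h, uint16_t key)
{
	for (unsigned i = 0; i < KEY_HISTORY; i++) {
		if (h->pending[i] && h->key[i] == key) {
			h->pending[i] = false;
			return true;
		}
	}
	return false;
}
```

With the sequence X-down Y-down X-up Y-up both up-events find their match, whereas comparing only against the most recent down-event would misclassify X-up.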
*	win32: avoid crashing the console with poll(2)	Ron Yorston	2023-07-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 8e6991733 (ash: fix 'read' shell built-in (1)) introduced the use of poll(2) in the shell 'read' built-in. When the UTF8 code page is in use this results in the console crashing if a 3 or more byte UTF8 character is entered. The crash is caused by the use of PeekConsoleInputA() which, like ReadConsoleInputA(), is broken. It can be avoided by using PeekConsoleInputW() instead. The number of key events will differ but this doesn't matter in this case as poll(2) effectively runs in a busy loop with a 1ms sleep.
*	date: allow system date to be set	Ron Yorston	2023-07-16	1	-0/+23
\| \| \| \| \| \| \| \| \|	Implement clock_settime(2) and enable the '-s' option to allow the system time to be set. This requires elevated privileges. The code in date.c is now identical to upstream BusyBox. Costs 256-272 bytes.
*	ash: fix 'read' shell built-in (2)	Ron Yorston	2023-07-12	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enabling polling in the previous commit resulted in the following incorrect behaviour: { echo -n te; sleep 3; echo st; } \| (read -t 1 x; echo "$x") An empty "$x" is echoed immediately, not after 1 second. { echo -n te; sleep 1; echo st; } \| (read -t 3 x; echo "$x") An empty "$x" is echoed immediately. "test" should be echoed after 1 second. This arises because poll(2) from gnulib is unable to handle anonymous pipes properly due to deficiencies in Microsoft Windows. These have been acknowledged and fixed in relation to select(2): https://lists.gnu.org/archive/html/bug-gnulib/2014-06/msg00051.html Apply a similar fix to poll(2). Costs 104-156 bytes.
*	ash: properly echo console input to 'read' built-in	Ron Yorston	2023-07-12	1	-0/+11
\| \| \| \| \| \| \|	The 'read' shell built-in echoed console input to stdout. Echo directly to the console instead. Costs 124-136 bytes.
*	win32: more console input character conversions	Ron Yorston	2023-07-07	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add wrappers for the following input functions with conversions for console input. Applications suitable for testing these changes are appended in brackets. - getchar (xargs) - fgetc (tac) - getline (shuf) - fgets (rev) Costs 112-120 bytes.
*	win32: character conversion for fread(3)	Ron Yorston	2023-07-06	1	-0/+15
\| \| \| \| \| \| \|	Some applets use fread(3): dd and od, for example. Perform the necessary conversion when input is coming from the console. Costs 96-112 bytes.
*	win32: don't crash the console and handle CJK input	Ron Yorston	2023-07-02	2	-20/+21
\| \| \| \| \| \| \| \| \| \| \| \|	The previous commit prevented the console from crashing with a UTF8 input code page on Windows 10/11 in the default configuration. But it broke input in CJK code pages. Again. Handle both cases. Costs 36-72 bytes. (GitHub issue #335)
*	win32: revert to previous console input method by default	Ron Yorston	2023-07-01	2	-6/+12
\| \| \| \| \| \| \| \|	Although the input method used for euro support is no longer required for that reason it does provide a more lightweight workaround for the problem with ReadConsoleInputA and UTF8. Repurpose FEATURE_EURO_INPUT as FEATURE_UTF8_INPUT.
*	win32: code shrink readConsoleInput_utf8	Ron Yorston	2023-07-01	2	-4/+5
\| \| \| \| \| \| \|	Move decision about how to read console input from windows_read_key() to readConsoleInput_utf8(). Saves 48-64 bytes.
*	win32: remove superfluous euro code	Ron Yorston	2023-07-01	1	-18/+0
\| \| \| \| \| \| \| \| \|	Commit ebe80f3e5 (win32: don't assume console CP equals OEM CP) fixed the incorrect character conversions which required special treatment for the euro symbol. The unnecessary code has been removed. Saves 64-80 bytes.
*	win32: UTF8 console input: don't spin the CPU	Avi Halachmi (:avih)	2023-06-30	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a regression from ec99f03ae which changed Read into Peek in order to keep the record at the console queue. However, it failed to take into account that as a result, if no input is pending, that readConsoleInput_utf8 now returns immediately without waiting for input - unlike ReadConsoleInput. Other than incorrectly returning a FALSE value in such case, it also caused a busy-wait loop of windows_read_key and high CPU usage. Fix that by waiting till there's input before the peek. This should make it just like ReadConsoleInput - which idles till there's input.
*	win32: UTF8 input: avoid timeout when delivering UTF8 bytes	Avi Halachmi (:avih)	2023-06-28	1	-6/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When windows_read_key - which is the sole consumer of readConsoleInput_utf8 - is called with a timeout value, it uses WaitForSingleObject to test whether the console has pending input. Previously, readConsoleInput_utf8 consumed the input record before it delivered the UTF8 bytes which are generated from it. It's not an issue with ASCII-7 input - because indeed there are no buffered bytes once it's delivered, and, except for console bugs (when only key-up record exists) also not an issue with 2 or 3 bytes UTF8 codepoints - because these are generated from a single wchar_t input record on key-down, and the key-up event is not yet dequeued while delivering the key-down UTF8 bytes. But with a surrogate pair, which consumes two wchar_t records to realize the UTF8 sequence, we previously consumed the records up to and including the key-up event of the 2nd surrogate half. This could result in a timeout if there are no further records at the queue - even though some UTF8 bytes are still buffered/pending. Such a timeout can result in the shell aborting - windows_read_key returns -1, which is later interpreted as EOF of the shell input, and quits the shell. Now readConsoleInput_utf8 dequeues an input record only once the last byte which was generated from this record is delivered, which we do using PeekConsoleInputW instead of ReadConsoleInputW. This avoids a timeout as long as there are input bytes to deliver.
*	win32: the great UTF8 ReadConsoleInput hack	Avi Halachmi (:avih)	2023-06-28	2	-2/+163
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 597d31ee (EURO_INPUT), ReadConsoleInputA is the default. The main problem with that is that if the console codepage is UTF8, e.g. after "chcp 65001", then typing or pasting can result in a crash of the console itself (the Windows Terminal or cmd.exe window closes). Additionally and regardless of this crash, ReadConsoleInputA is apparently buggy with UTF8 CP also otherwise. For instance, on Windows 7 only ASCII values work - others become '?'. Or sometimes in Windows 10 (cmd.exe console but not Windows terminal) only key-up events arrive for some non-ASCII codepoints (without a prior key-down), and more. So this commit implements readConsoleInput_utf8 which delivers UTF8 regardless of CP, including surrogate pairs, and works on win 7/10. Other than fixing the crash and working much better with UTF8 console CP, it also allows a build with the UTF8 manifest to capture correctly arbitrary unicode inputs which are typed or pasted into the console regardless of the console CP. However, it doesn't look OK unless the console CP is set to UTF8 (which we don't do automatically, but the user can chcp 65001), and editing is still lacking due to missing screen-length awareness. To reproduce the crash: start a new console window, 'chcp 65001', run this program (or busybox sh), and paste "ಀ" or "😀" (U+0C80, U+1F600) #include <windows.h> int main() { HANDLE h = GetStdHandle(STD_INPUT_HANDLE); INPUT_RECORD r; DWORD n; while (ReadConsoleInputA(h, &r, 1, &n)) /* NOP */; return 0; }
*	win32: don't assume console CP equals OEM CP	Avi Halachmi (:avih)	2023-06-28	2	-55/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, console input was converted to the ANSI codepage using OemToChar[Buff], and ANSI to console conversion used CharToOem[Buff]. However, while typically true by default, it's not guaranteed that the console CP is the same as the OEM CP. Now the code uses the console input/output CP as appropriate instead of the OEM CP. It uses full wide-char conversion code, which was previously limited to FEATURE_EURO, and now may be used also otherwise. While at it, the code now bypasses the conversion altogether if the src/dst CPs happen to be identical - which can definitely happen. Other than saving some CPU cycles, this also happens to fix an issue with the UTF8 manifest (in both input and output), because apparently the Oem/Char conversion APIs fail to convert one char at a time (which is not a complete UTF8 codepoint sequence) even if both the OEM and the ANSI CPs are UTF8 (as is the case when using UTF8 manifest). Conversion is also skipped: - if the converted output would be longer than the input; - if the input length is 1 and the input is multi-byte.
*	win32: reduce impact of euro support (2)	Ron Yorston	2023-06-23	1	-1/+3
\| \| \| \| \| \| \|	winansi_OemToCharBuff() needs to call the real OemToCharBuff(), not itself! (GitHub issue #335)
*	win32: reduce impact of euro support	Ron Yorston	2023-06-23	1	-10/+12
\| \| \| \| \| \| \| \| \| \|	The workaround for euro support in busybox-w32 is only intended to work in the 858 code page. Skip the workaround if any other code page is in use. Costs 8-36 bytes. (GitHub issue #335)