Commit message  Author  Age  Files  Lines
...
* | ash: fix 'read' shell built-in (1)  Ron Yorston  2023-07-12  1  -16/+7
| | Consider this test case:
| |
| |    { echo -n te; sleep 3; echo st; } | (read -t 1 x; echo "$x")
| |
| | - bash echoes "te" after 1 second.
| | - Upstream BusyBox echoes an empty "$x" after 1 second.
| | - busybox-w32 echoes an empty "$x" after 3 seconds.
| |
| | The delayed echo in busybox-w32 arises because the 'read' shell built-in omits the code to poll for input. Rearrange the code so that polling takes place.
| |
| | This doesn't address the difference between BusyBox and bash.
| |
| | Costs 48-64 bytes.
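The polling step described above can be sketched in portable C. This is only an illustration under POSIX assumptions, not the busybox-w32 code: the helper name `read_byte_with_timeout` is hypothetical, and `poll(2)` stands in for whatever wait primitive the shell actually uses.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then read a single byte.
 * Returns 1 if a byte was read, 0 on timeout or EOF, -1 on error. */
static int read_byte_with_timeout(int fd, int timeout_ms, char *out)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    assert(out != NULL);

    /* Without this poll a blocking read() would wait for the writer,
     * however long that takes, and the timeout would be ignored. */
    ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return ready;           /* 0: timed out, -1: poll failed */

    return (int)read(fd, out, 1);
}
```

In the test case above, a reader built this way would give up one second after the last byte arrived instead of waiting for the writer's `sleep 3` to finish.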
* | ash: properly echo console input to 'read' built-in  Ron Yorston  2023-07-12  4  -4/+16
| | The 'read' shell built-in echoed console input to stdout. Echo directly to the console instead.
| |
| | Costs 124-136 bytes.
* | win32: more console input character conversions  Ron Yorston  2023-07-07  2  -0/+25
| | Add wrappers for the following input functions with conversions for console input. Applications suitable for testing these changes are appended in brackets.
| |
| | - getchar (xargs)
| | - fgetc (tac)
| | - getline (shuf)
| | - fgets (rev)
| |
| | Costs 112-120 bytes.
* | win32: character conversion for fread(3)  Ron Yorston  2023-07-06  2  -0/+17
| | Some applets use fread(3): dd and od, for example. Perform the necessary conversion when input is coming from the console.
| |
| | Costs 96-112 bytes.
* | win32: don't crash the console *and* handle CJK input  Ron Yorston  2023-07-02  2  -20/+21
| | The previous commit prevented the console from crashing with a UTF8 input code page on Windows 10/11 in the default configuration. But it broke input in CJK code pages. Again.
| |
| | Handle both cases.
| |
| | Costs 36-72 bytes.
| |
| | (GitHub issue #335)
* | win32: revert to previous console input method by default  Ron Yorston  2023-07-01  3  -12/+19
| | Although the input method used for euro support is no longer required for that reason, it does provide a more lightweight workaround for the problem with ReadConsoleInputA and UTF8.
| |
| | Repurpose FEATURE_EURO_INPUT as FEATURE_UTF8_INPUT.
* | win32: code shrink readConsoleInput_utf8  Ron Yorston  2023-07-01  2  -4/+5
| | Move decision about how to read console input from windows_read_key() to readConsoleInput_utf8().
| |
| | Saves 48-64 bytes.
* | win32: remove superfluous euro code  Ron Yorston  2023-07-01  1  -18/+0
| | Commit ebe80f3e5 (win32: don't assume console CP equals OEM CP) fixed the incorrect character conversions which required special treatment for the euro symbol. The unnecessary code has been removed.
| |
| | Saves 64-80 bytes.
* | Merge pull request #338 from avih/concp  Ron Yorston  2023-07-01  1  -3/+5
|\ \
| | | win32: UTF8 console input: don't spin the CPU
| * | win32: UTF8 console input: don't spin the CPU  Avi Halachmi (:avih)  2023-06-30  1  -3/+5
|/ /  This is a regression from ec99f03ae, which changed Read into Peek in order to keep the record at the console queue.
| |
| | However, it failed to take into account that, as a result, readConsoleInput_utf8 now returns immediately if no input is pending, without waiting for input - unlike ReadConsoleInput.
| |
| | Besides incorrectly returning FALSE in that case, it also caused a busy-wait loop of windows_read_key and high CPU usage.
| |
| | Fix that by waiting till there's input before the peek. This should make it just like ReadConsoleInput - which idles till there's input.
* | Merge pull request #337 from avih/concp  Ron Yorston  2023-06-30  3  -2/+174
|\ \
| | | Improve console codepage awareness
| * | win32: UTF8 input: avoid timeout when delivering UTF8 bytes  Avi Halachmi (:avih)  2023-06-28  1  -6/+14
| | | When windows_read_key - which is the sole consumer of readConsoleInput_utf8 - is called with a timeout value, it uses WaitForSingleObject to test whether the console has pending input.
| | |
| | | Previously, readConsoleInput_utf8 consumed the input record before it delivered the UTF8 bytes which are generated from it.
| | |
| | | It's not an issue with ASCII-7 input - because indeed there are no buffered bytes once it's delivered, and, except for console bugs (when only a key-up record exists), also not an issue with 2- or 3-byte UTF8 codepoints - because these are generated from a single wchar_t input record on key-down, and the key-up event is not yet dequeued while delivering the key-down UTF8 bytes.
| | |
| | | But with a surrogate pair, which consumes two wchar_t records to realize the UTF8 sequence, we previously consumed the records up to and including the key-up event of the 2nd surrogate half. This could result in a timeout if there are no further records at the queue - even though some UTF8 bytes are still buffered/pending.
| | |
| | | Such a timeout can result in the shell aborting - windows_read_key returns -1, which is later interpreted as EOF of the shell input, and quits the shell.
| | |
| | | Now readConsoleInput_utf8 dequeues an input record only once the last byte which was generated from this record is delivered, which we do using PeekConsoleInputW instead of ReadConsoleInputW.
| | |
| | | This avoids a timeout as long as there are input bytes to deliver.
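To see why a surrogate pair yields several buffered bytes from two input records, here is a minimal sketch of the standard UTF-16 to UTF-8 arithmetic. The function names are hypothetical and this is not the readConsoleInput_utf8 implementation; it only shows that two 16-bit halves become one 4-byte sequence that then has to be delivered byte by byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine a UTF-16 surrogate pair into a single code point. */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    assert(hi >= 0xD800 && hi <= 0xDBFF);   /* high (leading) half  */
    assert(lo >= 0xDC00 && lo <= 0xDFFF);   /* low (trailing) half  */
    return 0x10000u + (((uint32_t)hi - 0xD800u) << 10)
                    + ((uint32_t)lo - 0xDC00u);
}

/* Encode a code point (assumed to be a valid scalar value) as UTF-8.
 * Returns the number of bytes written to out (1 to 4). */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

For U+1F600 the two halves are 0xD83D and 0xDE00, and the encoder produces the four bytes F0 9F 98 80. If both input records were dequeued before the first byte was handed out, three bytes would still be pending with an empty console queue, which is the timeout scenario described above.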
| * | win32: the great UTF8 ReadConsoleInput hack  Avi Halachmi (:avih)  2023-06-28  3  -2/+166
|/ /  Since commit 597d31ee (EURO_INPUT), ReadConsoleInputA is the default.
| |
| | The main problem with that is that if the console codepage is UTF8, e.g. after "chcp 65001", then typing or pasting can result in a crash of the console itself (the Windows Terminal or cmd.exe window closes).
| |
| | Additionally, and regardless of this crash, ReadConsoleInputA is apparently buggy with the UTF8 CP in other ways too. For instance, on Windows 7 only ASCII values work - others become '?'. Or sometimes in Windows 10 (cmd.exe console but not Windows Terminal) only key-up events arrive for some non-ASCII codepoints (without a prior key-down), and more.
| |
| | So this commit implements readConsoleInput_utf8, which delivers UTF8 regardless of CP, including surrogate pairs, and works on Windows 7/10.
| |
| | Other than fixing the crash and working much better with the UTF8 console CP, it also allows a build with the UTF8 manifest to correctly capture arbitrary Unicode input typed or pasted into the console, regardless of the console CP.
| |
| | However, it doesn't look OK unless the console CP is set to UTF8 (which we don't do automatically, but the user can chcp 65001), and editing is still lacking due to missing screen-length awareness.
| |
| | To reproduce the crash: start a new console window, 'chcp 65001', run this program (or busybox sh), and paste "ಀ" or "😀" (U+0C80, U+1F600):
| |
| |     #include <windows.h>
| |
| |     int main() {
| |         HANDLE h = GetStdHandle(STD_INPUT_HANDLE);
| |         INPUT_RECORD r;
| |         DWORD n;
| |
| |         while (ReadConsoleInputA(h, &r, 1, &n)) /* NOP */;
| |         return 0;
| |     }
* | win32: don't assume console CP equals OEM CP  Avi Halachmi (:avih)  2023-06-28  5  -62/+61
| | Previously, console input was converted to the ANSI codepage using OemToChar[Buff], and ANSI to console conversion used CharToOem[Buff].
| |
| | However, while typically true by default, it's not guaranteed that the console CP is the same as the OEM CP.
| |
| | Now the code uses the console input/output CP as appropriate instead of the OEM CP. It uses full wide-char conversion code, which was previously limited to FEATURE_EURO, and now may be used also otherwise.
| |
| | While at it, the code now bypasses the conversion altogether if the src/dst CPs happen to be identical - which can definitely happen.
| |
| | Other than saving some CPU cycles, this also happens to fix an issue with the UTF8 manifest (in both input and output), because apparently the Oem/Char conversion APIs fail to convert one char at a time (which is not a complete UTF8 codepoint sequence) even if both the OEM and the ANSI CPs are UTF8 (as is the case when using UTF8 manifest).
| |
| | Conversion is also skipped:
| | - if the converted output would be longer than the input;
| | - if the input length is 1 and the input is multi-byte.
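The remark that a single char "is not a complete UTF8 codepoint sequence" can be made concrete with a small sketch. The helpers below are hypothetical and are not part of the commit; they only show how the lead byte of a UTF-8 sequence determines how many bytes a converter needs before it can produce anything.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes in the UTF-8 sequence introduced by this lead byte,
 * or 0 if the byte cannot start a sequence (continuation or invalid). */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* plain ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 0;
}

/* True if buf[0..len) starts with at least one complete sequence,
 * judged by length only (continuation bytes are not validated). */
static int utf8_is_complete(const unsigned char *buf, size_t len)
{
    size_t need;

    assert(buf != NULL);
    if (len == 0)
        return 0;
    need = utf8_seq_len(buf[0]);
    return need != 0 && len >= need;
}
```

For example, the euro sign is E2 82 AC in UTF-8. Handing a converter only the first byte gives it nothing it can convert, which is consistent with skipping conversion when the input length is 1 and the input is multi-byte.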
* | win32: reduce impact of euro support (2)  Ron Yorston  2023-06-23  1  -1/+3
| | winansi_OemToCharBuff() needs to call the real OemToCharBuff(), not itself!
| |
| | (GitHub issue #335)
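One common way a wrapper ends up calling itself is a project-wide macro that redirects the real function name to the wrapper. Whether that is the exact mechanism behind this fix is an assumption; the sketch below uses made-up names (`api_convert`, `wrap_convert`) rather than the Windows API so that it compiles anywhere.

```c
#include <assert.h>

/* Stand-in for the "real" system function (hypothetical). */
static int api_convert(int c)
{
    return c ^ 0x20;
}

static int wrap_convert(int c);

/* Simulates a shared header that redirects every caller to the wrapper. */
#define api_convert wrap_convert

/* In the wrapper's own source file the redirect has to be removed first.
 * With the macro still active, the call inside wrap_convert() would
 * expand to wrap_convert() and recurse forever. */
#undef api_convert

static int wrap_convert(int c)
{
    assert(c >= 0);
    if (c < 0x80)
        return c;               /* ASCII: nothing to convert */
    return api_convert(c);      /* the real function, not the wrapper */
}

/* Restore the redirect for all other callers. */
#define api_convert wrap_convert
```

After the final `#define`, a call written as `api_convert(x)` anywhere else goes through `wrap_convert()`, which in turn reaches the real function exactly once.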
* | win32: reduce impact of euro support  Ron Yorston  2023-06-23  1  -10/+12
| | The workaround for euro support in busybox-w32 is only intended to work in the 858 code page. Skip the workaround if any other code page is in use.
| |
| | Costs 8-36 bytes.
| |
| | (GitHub issue #335)
* | win32: make support for euro input a separate option  Ron Yorston  2023-06-22  4  -12/+29
| | Commit 93a63809f (win32: add support for the euro currency symbol) made various changes to enable support for the euro symbol.
| |
| | One of these changes allows the euro to be entered from the console even if the current code page doesn't support it. This is probably of limited use: the symbol can be entered but won't be displayed correctly.
| |
| | Move this capability into a separate configuration option, FEATURE_EURO_INPUT, which is disabled by default.
| |
| | Saves 48-64 bytes in the new default case.
| |
| | (GitHub issue #335)
* | pgrep,pkill: remove non-functional options  Ron Yorston  2023-06-22  1  -0/+55
| | Due to limitations of the process scanning code on Windows some features of pgrep and pkill don't work properly:
| |
| | - display/matching of the full command line (-a/-f)
| | - killing the oldest or newest process (-o/-n)
| | - matching session id (-s)
| |
| | To avoid disappointment or error, support for these features has been removed.
| |
| | Saves 408-464 bytes.
* | ash: standardise treatment of winxp option  Ron Yorston  2023-06-21  1  -13/+41
| | Although 'winxp' is a shell option, it could only be set with '-X' on the command line. Fully implement 'winxp' so it can also be set within the shell by 'set -o winxp' and 'set +o winxp'. '-X' no longer needs to be the first option on the command line.
| |
| | Track which shell variables have been imported from a native Windows environment so only those are affected when 'winxp' is changed. The tracking persists in a subshell but is lost when shell variables are exported to the environment, so 'set -/+o winxp' is ineffective in a child shell.
| |
| | Costs 48-52 bytes.
| |
| | (GitHub issue #322)
* | ash: code shrink  Ron Yorston  2023-06-21  1  -28/+20
| | - There's no need to set USER, LOGNAME, HOME and SHELL as environment variables: making them shell variables is enough.
| | - Use endofname() to detect invalid characters in variable names and take the copy of the invalid variable before it's modified.
| |
| | Saves 48-64 bytes.
* | win32: include UTF-8 manifest  Ron Yorston  2023-06-20  5  -1/+21
| | Include a manifest in the binary to set the process code page to UTF-8. This only has an effect from Windows 10 version 1903.
| |
| | Controlled by the FEATURE_UTF8_MANIFEST config setting, disabled by default.
* | win32: more applet look-up tweaks  Ron Yorston  2023-06-19  1  -4/+21
| | When an applet is overridden by BB_OVERRIDE_APPLETS it should still function in certain cases:
| |
| |    busybox applet
| |    applet.exe
| |    busybox --help applet
| |
| | Arrange for this by adding really_find_applet_by_name() as a static function in libbb/appletlib.c. find_applet_by_name() is implemented using this for external use while really_find_applet_by_name() is used internally in some instances.
| |
| | Adds 32 bytes.
| |
| | (GitHub issue #329)
* | Merge branch 'busybox' into merge  Ron Yorston  2023-06-16  21  -265/+892
|\|
| * awk: fix subst code to handle "start of word" pattern correctly (needs REG_STARTEND)  Denys Vlasenko  2023-06-08  2  -26/+51
| | function                 old     new   delta
| | awk_sub                  637     714     +77
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: fix SEGV on read error in -f PROGFILE  Denys Vlasenko  2023-06-07  1  -2/+2
| | function                 old     new   delta
| | awk_main                 829     843     +14
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: code shrink  Denys Vlasenko  2023-06-06  1  -8/+10
| | function                 old     new   delta
| | awk_sub                  544     548      +4
| | exec_builtin            1136    1130      -6
| | ------------------------------------------------------------------------------
| | (add/remove: 0/0 grow/shrink: 1/1 up/down: 4/-6)  Total: -2 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: fix backslash handling in sub() builtins  Denys Vlasenko  2023-06-03  2  -22/+66
| | function                 old     new   delta
| | awk_sub                  559     544     -15
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: fix precedence of = relative to ==  Denys Vlasenko  2023-05-30  2  -21/+50
| | Discovered while adding code to disallow assignments to non-lvalues
| |
| | function                 old     new   delta
| | parse_expr               936     991     +55
| | .rodata               105243  105247      +4
| | ------------------------------------------------------------------------------
| | (add/remove: 0/0 grow/shrink: 2/0 up/down: 59/0)  Total: 59 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * tunctl: code shrink  Denys Vlasenko  2023-05-29  1  -2/+1
| | function                 old     new   delta
| | .rodata               105246  105243      -3
| | tunctl_main              349     344      -5
| | ------------------------------------------------------------------------------
| | (add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-8)  Total: -8 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: printf(INVALID_FMT) prints it verbatim  Denys Vlasenko  2023-05-29  1  -3/+9
| | function                 old     new   delta
| | awk_printf               628     640     +12
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: shrink - use setvar_sn() to set variables from non-NUL terminated strings  Denys Vlasenko  2023-05-28  1  -14/+9
| | function                 old     new   delta
| | setvar_sn                  -      39     +39
| | exec_builtin            1145    1136      -9
| | awk_getline              591     559     -32
| | ------------------------------------------------------------------------------
| | (add/remove: 1/0 grow/shrink: 0/2 up/down: 39/-41)  Total: -2 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: code shrink  Denys Vlasenko  2023-05-28  1  -23/+24
| | function                 old     new   delta
| | awk_getline              620     591     -29
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: fix closing of non-opened file  Denys Vlasenko  2023-05-28  1  -8/+15
| | function                 old     new   delta
| | setvar_ERRNO               -      53     +53
| | .rodata               105252  105246      -6
| | awk_getline              639     620     -19
| | evaluate                3402    3377     -25
| | ------------------------------------------------------------------------------
| | (add/remove: 1/0 grow/shrink: 0/3 up/down: 53/-50)  Total: 3 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * libbb/dump: code shrink  Denys Vlasenko  2023-05-28  1  -5/+7
| | function                 old     new   delta
| | .rodata               105252  105246      -6
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: do not read ARGIND, only set it (gawk compat)  Denys Vlasenko  2023-05-27  1  -5/+14
| | function                 old     new   delta
| | next_input_file          216     243     +27
| | evaluate                3396    3402      +6
| | awk_main                 826     829      +3
| | ------------------------------------------------------------------------------
| | (add/remove: 0/0 grow/shrink: 3/0 up/down: 36/0)  Total: 36 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: remove a local variable "caching" a struct member  Denys Vlasenko  2023-05-27  1  -6/+4
| | Since we take its address, the variable lives on the stack (not in a GPR). Thus, nothing is improved by caching it.
| |
| | function                 old     new   delta
| | awk_getline              642     639      -3
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: get rid of one indirection level for iF (input file structure)  Denys Vlasenko  2023-05-27  1  -37/+41
| | function                 old     new   delta
| | try_to_assign              -      91     +91
| | next_input_file          214     216      +2
| | awk_main                 827     826      -1
| | evaluate                3403    3396      -7
| | is_assignment             91       -     -91
| | ------------------------------------------------------------------------------
| | (add/remove: 1/1 grow/shrink: 1/2 up/down: 93/-99)  Total: -6 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: fix splitting with default FS  Denys Vlasenko  2023-05-27  2  -5/+15
| | function                 old     new   delta
| | awk_split                543     544      +1
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * libbb/dump: make xxd_displayoff member conditional on xxd  Denys Vlasenko  2023-05-27  2  -6/+14
| | With xxd not selected:
| |
| | function                 old     new   delta
| | display                 1459    1444     -15
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * od: -l,I,L indeed depend on sizeof(long), fix this  Denys Vlasenko  2023-05-26  3  -29/+39
| | function                 old     new   delta
| | .rodata               105255  105252      -3
| | od_main                 1917    1901     -16
| | ------------------------------------------------------------------------------
| | (add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-19)  Total: -19 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * awk: fix use-after-realloc (CVE-2021-42380), closes 15601  Denys Vlasenko  2023-05-26  2  -6/+75
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
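For readers unfamiliar with the bug class named in the subject, the following is a generic sketch of how a use-after-realloc is avoided. It is not the awk fix itself, and the `struct buf` type and both helpers are hypothetical. The idea is to remember an offset rather than a pointer whenever the buffer may grow, and to recompute the pointer only after the growth.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct buf {
    char *data;
    size_t len;     /* bytes in use   */
    size_t cap;     /* bytes reserved */
};

/* Make room for `extra` more bytes.  May move b->data, so any pointer
 * into the old storage must be treated as invalid afterwards. */
static int buf_reserve(struct buf *b, size_t extra)
{
    if (b->len + extra > b->cap) {
        size_t cap = (b->len + extra) * 2;
        char *p = realloc(b->data, cap);

        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    return 0;
}

/* Append a copy of bytes [off, off + n) of the buffer to its own end.
 * The source is named by offset; a pointer saved before buf_reserve()
 * could dangle if realloc() moved the storage. */
static int buf_append_self(struct buf *b, size_t off, size_t n)
{
    assert(off + n <= b->len);
    if (buf_reserve(b, n) != 0)
        return -1;
    /* Recompute the source pointer from the (possibly new) base. */
    memcpy(b->data + b->len, b->data + off, n);
    b->len += n;
    return 0;
}
```

The buggy variant of `buf_append_self()` would take `const char *src = b->data + off` before calling `buf_reserve()` and then copy from `src`, reading freed memory whenever the buffer had to move.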
| * hexdump: code shrink  Denys Vlasenko  2023-05-26  1  -10/+14
| | function                 old     new   delta
| | add_format                 -      50     +50
| | add_first                 10       -     -10
| | hexdump_main             401     366     -35
| | .rodata               105306  105255     -51
| | ------------------------------------------------------------------------------
| | (add/remove: 1/1 grow/shrink: 0/2 up/down: 50/-96)  Total: -46 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * hexdump, xxd: shrink strings  Denys Vlasenko  2023-05-26  2  -11/+11
| | function                 old     new   delta
| | add_first                 12      10      -2
| | .rodata               105321  105306     -15
| | ------------------------------------------------------------------------------
| | (add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-17)  Total: -17 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * libbb/dump: correct handling of 1-byte signed int format  Denys Vlasenko  2023-05-26  2  -21/+50
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * libbb/dump: use fputs_stdout where appropriate  Denys Vlasenko  2023-05-26  1  -2/+2
| | function                 old     new   delta
| | display                 1485    1483      -2
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * od, hexdump: byte 0x11 is "dc1" not "dcl"  Denys Vlasenko  2023-05-26  3  -9/+45
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * od: actually remove -IL from --help, as comment says  Denys Vlasenko  2023-05-26  1  -1/+1
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * od: support -DOHXIL  Denys Vlasenko  2023-05-26  3  -45/+46
| | function                 old     new   delta
| | od_main                 1866    1917     +51
| | .rodata               105306  105321     +15
| | ------------------------------------------------------------------------------
| | (add/remove: 0/0 grow/shrink: 2/0 up/down: 66/0)  Total: 66 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * libbb/dump: conditionalize code used only by xxd and od  Denys Vlasenko  2023-05-26  3  -2/+10
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
| * od: fix default format, shrink  Denys Vlasenko  2023-05-26  2  -20/+32
| | function                 old     new   delta
| | od_main                  556     568     +12
| | .rodata               104613  104555     -58
| | ------------------------------------------------------------------------------
| | (add/remove: 0/0 grow/shrink: 1/1 up/down: 12/-58)  Total: -46 bytes
| |
| | Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>