field | value | date |
---|---|---|
author | Ron Yorston <rmy@pobox.com> | 2016-03-16 11:36:45 +0000 |
committer | Ron Yorston <rmy@pobox.com> | 2016-03-16 11:36:45 +0000 |
commit | 3ebb829683aab54f5089a878120294ae9e5d2fcf | |
tree | 6e9f1a56da7ef7c3f955fd83c5becddb5a844a01 | |
parent | 037cdc9c020ae6875a0707aeab5f7ecceef7e351 | |
sed: drop \r when reading input

An upstream bug (https://bugs.busybox.net/show_bug.cgi?id=8791) reports:

    $ cat myfile
    a
    b
    c
    $ sed "s/\(.*\)/\1\1/" myfile
    a
    b
    c

It should be:

    $ sed "s/\(.*\)/\1\1/" myfile
    aa
    bb
    cc

This happened because busybox-w32 opens files in binary mode. Lines read
by sed had trailing LFs removed but not trailing CRs. The CRs ended up
in the matched strings and were output, thus giving the appearance that
only one of the backreferences was printed.
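
The effect described above can be sketched in a few lines of C. The helper `double_line` is hypothetical and is not part of BusyBox; it only imitates what `s/\(.*\)/\1\1/` does to a line that has already had its LF removed.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not BusyBox code): imitate s/\(.*\)/\1\1/ by
 * writing the whole line into out twice.
 * Returns 0 on success, -1 if out is too small.
 */
static int double_line(const char *line, char *out, size_t outsz)
{
	size_t n = strlen(line);

	if (2 * n + 1 > outsz)
		return -1;
	memcpy(out, line, n);      /* first backreference */
	memcpy(out + n, line, n);  /* second backreference */
	out[2 * n] = '\0';
	return 0;
}
```

For the input `"a\r"` the result is `"a\ra\r"`. Both copies are present, but a terminal typically moves the cursor back to column 0 at each CR, so the second `a` overwrites the first and only one is visible.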
The same happens on Linux when a DOS file is processed by BusyBox sed
or GNU sed. However, this behaviour is arguably incorrect on Windows.
I've modified busybox-w32 to drop trailing CRs as well as LFs from
input lines.
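
As a minimal standalone sketch of the same idea, the function below strips a trailing LF and, only when an LF was found, the CR directly before it. The name `chomp_crlf` is an assumption made for illustration. The actual change lives inside `get_next_line()` in editors/sed.c, is compiled only when `ENABLE_PLATFORM_MINGW32` is set, and also handles NUL-terminated input, which this sketch ignores.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the busybox-w32 code): remove one trailing
 * LF from s and, if the LF was preceded by a CR, remove that CR too.
 * A CR that is not followed by an LF is left in place.
 * Returns the new length of s.
 */
static size_t chomp_crlf(char *s)
{
	size_t len = strlen(s);

	if (len > 0 && s[len - 1] == '\n') {
		s[--len] = '\0';
		if (len > 0 && s[len - 1] == '\r')
			s[--len] = '\0';
	}
	return len;
}
```

With this helper, `"a\r\n"` and `"a\n"` both become `"a"`, while `"a\r"` with no LF is left unchanged, mirroring the `c == '\n'` condition in the patch.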
-rw-r--r-- | editors/sed.c | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/editors/sed.c b/editors/sed.c
index 4c7f75521..a0c713f58 100644
--- a/editors/sed.c
+++ b/editors/sed.c
@@ -990,6 +990,11 @@ static char *get_next_line(char *gets_char, char *last_puts_char, char last_gets
 	char c = temp[len-1];
 	if (c == '\n' || c == '\0') {
 		temp[len-1] = '\0';
+#if ENABLE_PLATFORM_MINGW32
+		if (c == '\n' && len > 1 && temp[len-2] == '\r') {
+			temp[len-2] = '\0';
+		}
+#endif
 		gc = c;
 		if (c == '\0') {
 			int ch = fgetc(fp);