author    Ron Yorston <rmy@pobox.com>  2016-03-16 11:36:45 +0000
committer Ron Yorston <rmy@pobox.com>  2016-03-16 11:36:45 +0000
commit    3ebb829683aab54f5089a878120294ae9e5d2fcf
tree      6e9f1a56da7ef7c3f955fd83c5becddb5a844a01
parent    037cdc9c020ae6875a0707aeab5f7ecceef7e351
sed: drop \r when reading input

An upstream bug (https://bugs.busybox.net/show_bug.cgi?id=8791) reports:

   $ cat myfile
   a
   b
   c
   $ sed "s/\(.*\)/\1\1/" myfile
   a
   b
   c

it should be:

   $ sed "s/\(.*\)/\1\1/" myfile
   aa
   bb
   cc

This happened because busybox-w32 opens files in binary mode.  Lines
read by sed had trailing LFs removed but not trailing CRs.  The CRs
ended up in the matched strings and were output, thus giving the
appearance that only one of the backreferences was printed.

The same happens on Linux when a DOS file is processed by BusyBox sed
or GNU sed.  However, this behaviour is arguably incorrect on Windows.
I've modified busybox-w32 to drop trailing CRs as well as LFs from
input lines.

-rw-r--r-- editors/sed.c 5
1 files changed, 5 insertions, 0 deletions
diff --git a/editors/sed.c b/editors/sed.c
index 4c7f75521..a0c713f58 100644
--- a/editors/sed.c
+++ b/editors/sed.c
@@ -990,6 +990,11 @@ static char *get_next_line(char *gets_char, char *last_puts_char, char last_gets
 			char c = temp[len-1];
 			if (c == '\n' || c == '\0') {
 				temp[len-1] = '\0';
+#if ENABLE_PLATFORM_MINGW32
+				if (c == '\n' && len > 1 && temp[len-2] == '\r') {
+					temp[len-2] = '\0';
+				}
+#endif
 				gc = c;
 				if (c == '\0') {
 					int ch = fgetc(fp);
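
For readers who want to try the idea outside of sed.c, the following is a minimal standalone sketch of the same trimming rule: drop a trailing LF and, only if that LF was present, the CR just before it. The helper name chomp_crlf is hypothetical and does not exist in BusyBox; the sketch also returns the shortened length, which the patch itself does not need to do.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of BusyBox.
 * Removes one trailing LF from buf and, if that LF was preceded by a CR,
 * removes the CR as well. A CR that is not followed by LF is left alone,
 * mirroring the `c == '\n'` condition in the patch.
 * Returns the new length of the string.
 */
static size_t chomp_crlf(char *buf, size_t len)
{
	if (len > 0 && buf[len - 1] == '\n') {
		buf[--len] = '\0';          /* drop the LF */
		if (len > 0 && buf[len - 1] == '\r')
			buf[--len] = '\0';  /* drop the CR of a CRLF pair */
	}
	return len;
}
```

With a DOS-style line such as "a\r\n" this leaves "a", so a substitution like s/\(.*\)/\1\1/ would produce "aa" rather than "a\ra\r", which a terminal displays as a single "a".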