author | Denys Vlasenko <vda.linux@googlemail.com> | 2017-07-09 00:39:15 +0200 |
---|---|---|
committer | Denys Vlasenko <vda.linux@googlemail.com> | 2017-07-09 00:39:15 +0200 |
commit | 9de9c871bf44b931c7a1bb66d3134e2deb811f88 (patch) | |
tree | 68f7f95835b7e2c1c60825e2745d00f5aacfd92b /coreutils | |
parent | d18b2000967cddd0b84091d90a914aec58025310 (diff) | |
shuf: fix random line selection. Closes 9971
"""
For example, given input file:
foo
bar
baz
After shuffling the input file, foo will never end up back on the first line.
This came to light when I ran into a use case where someone was selecting
a random line from a file using shuf | head -n 1, and the results on BusyBox
showed a statistical anomaly (the first line was never picked) compared with
the same process running in environments that had GNU coreutils installed.
On line https://git.busybox.net/busybox/tree/coreutils/shuf.c#n56 it uses
r %= i, which will result in 0 <= r < i, while the algorithm specifies
0 <= r <= i.
"""
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
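The off-by-one described in the report quoted above can be checked exhaustively on its three-line example. The following C sketch is a minimal illustration, not the BusyBox source: shuffle3 and foo_first_count are hypothetical names, and the random values are taken from an enumerated picks[] array instead of rand() so that the result is deterministic.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not BusyBox code: run the swap loop on
 * {"foo", "bar", "baz"}, taking the "random" values from picks[] instead
 * of rand(). inclusive == 0 models the old `r %= i`; inclusive != 0
 * models the patched `r %= i + 1`. */
static void shuffle3(const char *out[3], const unsigned picks[2], int inclusive)
{
	static const char *const input[3] = { "foo", "bar", "baz" };
	unsigned i, k = 0;

	memcpy(out, input, sizeof(input));
	for (i = 2; i > 0; i--) {
		unsigned r = picks[k++] % (inclusive ? i + 1 : i);
		const char *tmp;

		assert(r <= i);
		tmp = out[i];
		out[i] = out[r];
		out[r] = tmp;
	}
}

/* Try 12 pick sequences that cover every residue equally often and count
 * how many of them leave "foo" on the first line. */
static unsigned foo_first_count(int inclusive)
{
	unsigned a, b, count = 0;

	for (a = 0; a < 6; a++) {
		for (b = 0; b < 2; b++) {
			const unsigned picks[2] = { a, b };
			const char *out[3];

			shuffle3(out, picks, inclusive);
			if (strcmp(out[0], "foo") == 0)
				count++;
		}
	}
	return count;
}
```

With the old bound, foo_first_count(0) returns 0: lines[i] is always swapped with a different element, so no line can stay where it started. With the patched bound, foo_first_count(1) returns 4, one third of the 12 sequences, which is what a uniform shuffle of three lines should give.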
Diffstat (limited to 'coreutils')
-rw-r--r-- | coreutils/shuf.c | 2 |
1 file changed, 1 insertion, 1 deletion
diff --git a/coreutils/shuf.c b/coreutils/shuf.c
index 9f61f2f7d..217f15c97 100644
--- a/coreutils/shuf.c
+++ b/coreutils/shuf.c
@@ -53,7 +53,7 @@ static void shuffle_lines(char **lines, unsigned numlines)
 		/* RAND_MAX can be as small as 32767 */
 		if (i > RAND_MAX)
 			r ^= rand() << 15;
-		r %= i;
+		r %= i + 1;
 		tmp = lines[i];
 		lines[i] = lines[r];
 		lines[r] = tmp;
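For readers who want to try the patched loop outside the BusyBox tree, here is a standalone sketch built around the lines shown in the hunk. It rests on assumptions: shuffle_strings is a hypothetical name, the surrounding loop and the guard for fewer than two lines are filled in for illustration, and seeding with srand() is left to the caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical standalone variant of the patched loop; not the BusyBox
 * source. Shuffles lines[0..numlines-1] in place. */
static void shuffle_strings(const char **lines, unsigned numlines)
{
	unsigned i;

	if (numlines < 2)
		return; /* nothing to shuffle; also avoids numlines - 1 wrapping */

	for (i = numlines - 1; i > 0; i--) {
		unsigned r = (unsigned)rand();
		const char *tmp;

		/* RAND_MAX can be as small as 32767, so widen r for large i */
		if (i > (unsigned)RAND_MAX)
			r ^= (unsigned)rand() << 15;
		r %= i + 1; /* 0 <= r <= i, so lines[i] may stay in place */
		assert(r <= i);

		tmp = lines[i];
		lines[i] = lines[r];
		lines[r] = tmp;
	}
}
```

The only behavioural difference from the pre-patch code is the `i + 1` bound, which allows r == i and therefore lets a line keep its position.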