1 file changed, 91 insertions, 201 deletions
diff --git a/bzip2.txt b/bzip2.txt
index aee8e2b..898dfe8 100644
--- a/bzip2.txt
+++ b/bzip2.txt
@@ -1,22 +1,22 @@
 bzip2(1)                                                 bzip2(1)
 NAME
-       bzip2, bunzip2 - a block-sorting file compressor, v0.1
+       bzip2, bunzip2 - a block-sorting file compressor, v0.9.0
+       bzcat - decompresses files to stdout
       bzip2recover - recovers data from damaged bzip2 files
 SYNOPSIS
-       bzip2 [ -cdfkstvVL123456789 ] [ filenames ...  ]
+       bzip2 [ -cdfkstvzVL123456789 ] [ filenames ...  ]
-       bunzip2 [ -kvsVL ] [ filenames ...  ]
+       bunzip2 [ -fkvsVL ] [ filenames ...  ]
+       bzcat [ -s ] [ filenames ...  ]
       bzip2recover filename
 DESCRIPTION
-       Bzip2  compresses  files  using the Burrows-Wheeler block-
+       bzip2  compresses  files  using the Burrows-Wheeler block-
       sorting text compression algorithm,  and  Huffman  coding.
       Compression  is  generally  considerably  better than that
       achieved by more conventional LZ77/LZ78-based compressors,
@@ -26,7 +26,7 @@ DESCRIPTION
       The command-line options are deliberately very similar  to
       those of GNU Gzip, but they are not identical.
-       Bzip2  expects  a list of file names to accompany the com-
+       bzip2  expects  a list of file names to accompany the com-
       mand-line flags.  Each file is replaced  by  a  compressed
       version  of  itself,  with  the  name "original_name.bz2".
       Each compressed file has the same  modification  date  and
@@ -38,8 +38,8 @@ DESCRIPTION
       cepts, or have serious file name length restrictions, such
       as MS-DOS.
-       Bzip2  and  bunzip2  will not overwrite existing files; if
+       bzip2  and  bunzip2 will by default not overwrite existing
-       you want this to happen, you should delete them first.
+       files; if you want this to happen, specify the -f flag.
       If no file names  are  specified,  bzip2  compresses  from
       standard  input  to  standard output.  In this case, bzip2
@@ -47,28 +47,29 @@ DESCRIPTION
       this  would  be  entirely  incomprehensible  and therefore
       pointless.
-       Bunzip2 (or bzip2 -d ) decompresses and restores all spec-
+       bunzip2 (or bzip2 -d ) decompresses and restores all spec-
       ified files whose names end in ".bz2".  Files without this
       suffix are ignored.  Again, supplying no filenames  causes
       decompression from standard input to standard output.
-       You  can also compress or decompress files to the standard
+       bunzip2 will correctly decompress a file which is the con-
-       output by giving the -c flag.  You can decompress multiple
+       catenation of two or more compressed files.  The result is
-       files  like  this, but you may only compress a single file
+       the concatenation of the corresponding uncompressed files.
-       this way, since it would otherwise be difficult  to  sepa-
+       Integrity testing (-t) of concatenated compressed files is
-       rate  out  the  compressed representations of the original
+       also supported.
-       files.
-                                                                1
-bzip2(1)                                                 bzip2(1)
+       You  can also compress or decompress files to the standard
+       output by giving the -c flag.  Multiple files may be  com-
+       pressed and decompressed like this.  The resulting outputs
+       are fed sequentially to stdout.  Compression  of  multiple
+       files  in this manner generates a stream containing multi-
+       ple compressed file representations.  Such a stream can be
+       decompressed  correctly  only  by  bzip2  version 0.9.0 or
+       later.  Earlier versions of bzip2 will stop  after  decom-
+       pressing the first file in the stream.
+       bzcat  (or bzip2 -dc ) decompresses all specified files to
+       the standard output.
       Compression is always performed, even  if  the  compressed
       file  is slightly larger than the original.  Files of less
@@ -108,13 +109,14 @@ MEMORY MANAGEMENT
       file, and bunzip2 then allocates itself just enough memory
       to decompress the file.  Since block sizes are  stored  in
       compressed  files,  it follows that the flags -1 to -9 are
-       irrelevant to and so ignored during  decompression.   Com-
+       irrelevant  to  and  so  ignored   during   decompression.
-       pression  and decompression requirements, in bytes, can be
-       estimated as:
+       Compression  and decompression requirements, in bytes, can
+       be estimated as:
             Compression:   400k + ( 7 x block size )
-             Decompression: 100k + ( 5 x block size ), or
+             Decompression: 100k + ( 4 x block size ), or
                            100k + ( 2.5 x block size )
       Larger  block  sizes  give  rapidly  diminishing  marginal
@@ -125,19 +127,8 @@ MEMORY MANAGEMENT
       requirement  is  set  at compression-time by the choice of
       block size.
-                                                                2
-bzip2(1)                                                 bzip2(1)
       For files compressed with the  default  900k  block  size,
-       bunzip2  will require about 4600 kbytes to decompress.  To
+       bunzip2  will require about 3700 kbytes to decompress.  To
       support decompression of any file on a 4 megabyte machine,
       bunzip2  has  an  option to decompress using approximately
       half this amount of memory, about 2300 kbytes.  Decompres-
@@ -157,8 +148,8 @@ bzip2(1)                                                 bzip2(1)
       file 20,000 bytes long with the flag  -9  will  cause  the
       compressor  to  allocate  around 6700k of memory, but only
       touch 400k + 20000 * 7 = 540 kbytes of it.  Similarly, the
-       decompressor  will  allocate  4600k  but only touch 100k +
+       decompressor  will  allocate  3700k  but only touch 100k +
-       20000 * 5 = 200 kbytes.
+       20000 * 4 = 180 kbytes.
       Here is a table which summarises the maximum memory  usage
       for  different  block  sizes.   Also recorded is the total
@@ -172,15 +163,15 @@ bzip2(1)                                                 bzip2(1)
                  Compress   Decompress   Decompress   Corpus
           Flag     usage      usage       -s usage     Size
-            -1      1100k       600k         350k      914704
+            -1      1100k       500k         350k      914704
-            -2      1800k      1100k         600k      877703
+            -2      1800k       900k         600k      877703
-            -3      2500k      1600k         850k      860338
+            -3      2500k      1300k         850k      860338
-            -4      3200k      2100k        1100k      846899
+            -4      3200k      1700k        1100k      846899
-            -5      3900k      2600k        1350k      845160
+            -5      3900k      2100k        1350k      845160
-            -6      4600k      3100k        1600k      838626
+            -6      4600k      2500k        1600k      838626
-            -7      5400k      3600k        1850k      834096
+            -7      5400k      2900k        1850k      834096
-            -8      6000k      4100k        2100k      828642
+            -8      6000k      3300k        2100k      828642
-            -9      6700k      4600k        2350k      828642
+            -9      6700k      3700k        2350k      828642
 OPTIONS
@@ -189,47 +180,37 @@ OPTIONS
              decompress multiple files to stdout, but will  only
              compress a single file to stdout.
-                                                                3
-bzip2(1)                                                 bzip2(1)
       -d --decompress
-              Force  decompression.  Bzip2 and bunzip2 are really
+              Force  decompression.  bzip2, bunzip2 and bzcat are
-              the same program, and the decision about whether to
+              really the same program,  and  the  decision  about
-              compress  or  decompress  is  done  on the basis of
+              what  actions to take is done on the basis of which
-              which name is used.  This flag overrides that mech-
+              name is used.  This flag overrides that  mechanism,
-              anism, and forces bzip2 to decompress.
+              and forces bzip2 to decompress.
-       -f --compress
+       -z --compress
              The  complement  to -d: forces compression, regard-
              less of the invokation name.
       -t --test
              Check integrity of the specified file(s), but don't
              decompress  them.   This  really  performs  a trial
-              decompression and throws away the result, using the
+              decompression and throws away the result.
-              low-memory decompression algorithm (see -s).
+       -f --force
+              Force overwrite of output files.   Normally,  bzip2
+              will not overwrite existing output files.
       -k --keep
              Keep  (don't delete) input files during compression
              or decompression.
       -s --small
-              Reduce  memory  usage,  both  for  compression  and
+              Reduce memory usage, for compression, decompression
-              decompression.  Files are decompressed using a mod-
+              and  testing.   Files  are  decompressed and tested
-              ified algorithm which only requires 2.5  bytes  per
+              using a modified algorithm which only requires  2.5
-              block  byte.   This  means  any  file can be decom-
+              bytes  per  block byte.  This means any file can be
-              pressed in 2300k of memory,  albeit  somewhat  more
+              decompressed in 2300k of memory,  albeit  at  about
-              slowly than usual.
+              half the normal speed.
              During  compression,  -s  selects  a  block size of
              200k, which limits memory use to  around  the  same
@@ -238,36 +219,21 @@ bzip2(1)                                                 bzip2(1)
              megabytes  or  less),  use  -s for everything.  See
              MEMORY MANAGEMENT above.
       -v --verbose
              Verbose mode -- show the compression ratio for each
              file  processed.   Further  -v's  increase the ver-
              bosity level, spewing out lots of information which
              is primarily of interest for diagnostic purposes.
-       -L --license
+       -L --license -V --version
              Display  the  software  version,  license terms and
              conditions.
-       -V --version
-              Same as -L.
       -1 to -9
              Set the block size to 100 k, 200 k ..  900  k  when
              compressing.   Has  no  effect  when decompressing.
              See MEMORY MANAGEMENT above.
-                                                                4
-bzip2(1)                                                 bzip2(1)
       --repetitive-fast
              bzip2 injects some small  pseudo-random  variations
              into  very  repetitive  blocks  to limit worst-case
@@ -278,7 +244,6 @@ bzip2(1)                                                 bzip2(1)
              would take before resorting to randomisation.  This
              flag makes it give up much sooner.
       --repetitive-best
              Opposite  of  --repetitive-fast;  try  a lot harder
              before resorting to randomisation.
@@ -306,10 +271,10 @@ RECOVERING DATA FROM DAMAGED FILES
       bzip2recover takes a single argument, the name of the dam-
       aged file, and writes a number of files "rec0001file.bz2",
       "rec0002file.bz2", etc, containing the  extracted  blocks.
-       The output filenames are designed so that the use of wild-
+       The  output  filenames  are  designed  so  that the use of
-       cards in subsequent processing -- for example, "bzip2  -dc
+       wildcards in subsequent processing -- for example,  "bzip2
-       rec*file.bz2  >  recovered_data" -- lists the files in the
+       -dc  rec*file.bz2  > recovered_data" -- lists the files in
-       "right" order.
+       the "right" order.
       bzip2recover should be of most use dealing with large .bz2
       files,  as  these will contain many blocks.  It is clearly
@@ -322,18 +287,6 @@ RECOVERING DATA FROM DAMAGED FILES
 PERFORMANCE NOTES
       The sorting phase of compression gathers together  similar
-                                                                5
-bzip2(1)                                                 bzip2(1)
       strings  in  the  file.  Because of this, files containing
       very long runs of  repeated  symbols,  like  "aabaabaabaab
       ..."   (repeated   several  hundred  times)  may  compress
@@ -348,10 +301,6 @@ bzip2(1)                                                 bzip2(1)
       severe slowness in compression, try making the block  size
       as small as possible, with flag -1.
-       Incompressible or virtually-incompressible data may decom-
-       press rather more slowly than one would hope.  This is due
-       to a naive implementation of the move-to-front coder.
       bzip2  usually  allocates  several  megabytes of memory to
       operate in, and then charges all over it in a fairly  ran-
       dom  fashion.   This means that performance, both for com-
@@ -362,12 +311,6 @@ bzip2(1)                                                 bzip2(1)
       large performance improvements.  I imagine bzip2 will per-
       form best on machines with very large caches.
-       Test mode (-t) uses the low-memory decompression algorithm
-       (-s).  This means test mode does not run  as  fast  as  it
-       could;  it  could  run as fast as the normal decompression
-       machinery.  This could easily be fixed at the cost of some
-       code bloat.
 CAVEATS
       I/O  error  messages  are not as helpful as they could be.
@@ -375,91 +318,38 @@ CAVEATS
       but  the  details  of  what  the problem is sometimes seem
       rather misleading.
-       This manual page pertains to version 0.1 of bzip2.  It may
+       This manual page pertains to version 0.9.0 of bzip2.  Com-
-       well  happen that some future version will use a different
+       pressed  data created by this version is entirely forwards
-       compressed file format.  If you try to  decompress,  using
+       and backwards compatible with the previous public release,
-       0.1,  a  .bz2  file created with some future version which
+       version  0.1pl2,  but  with the following exception: 0.9.0
-       uses a different compressed file format, 0.1 will complain
+       can correctly decompress multiple concatenated  compressed
-       that  your  file  "is not a bzip2 file".  If that happens,
+       files.   0.1pl2  cannot do this; it will stop after decom-
-       you should obtain a more recent version of bzip2  and  use
+       pressing just the first file in the stream.
-       that to decompress the file.
       Wildcard expansion for Windows 95 and NT is flaky.
-       bzip2recover  uses  32-bit integers to represent bit posi-
+       bzip2recover uses 32-bit integers to represent  bit  posi-
-       tions in compressed files, so it cannot handle  compressed
+       tions  in compressed files, so it cannot handle compressed
+       files more than 512 megabytes long.  This could easily  be
-                                                                6
-bzip2(1)                                                 bzip2(1)
-       files  more than 512 megabytes long.  This could easily be
       fixed.
-       bzip2recover sometimes reports a  very  small,  incomplete
-       final  block.  This is spurious and can be safely ignored.
-RELATIONSHIP TO bzip-0.21
-       This program is a descendant of the bzip program,  version
-       0.21,  which  I released in August 1996.  The primary dif-
-       ference of bzip2 is its avoidance of the possibly patented
-       algorithms  which  were  used  in 0.21.  bzip2 also brings
-       various useful refinements (-s,  -t),  uses  less  memory,
-       decompresses  significantly  faster,  and  has support for
-       recovering data from damaged files.
-       Because bzip2 uses Huffman coding to  construct  the  com-
-       pressed  bitstream, rather than the arithmetic coding used
-       in 0.21, the compressed representations generated  by  the
-       two  programs are incompatible, and they will not interop-
-       erate.  The change in suffix from  .bz  to  .bz2  reflects
-       this.   It would have been helpful to at least allow bzip2
-       to decompress files created by 0.21, but this would defeat
-       the primary aim of having a patent-free compressor.
-       For a more precise statement about patent issues in bzip2,
-       please see the README file in the distribution.
-       Huffman  coding  necessarily  involves some coding ineffi-
-       ciency compared to arithmetic  coding.   This  means  that
-       bzip2  compresses about 1% worse than 0.21, an unfortunate
-       but unavoidable fact-of-life.  On the other  hand,  decom-
-       pression  is approximately 50% faster for the same reason,
-       and the change in file format gave an opportunity  to  add
-       data-recovery features.  So it is not all bad.
 AUTHOR
       Julian Seward, jseward@acm.org.
+       http://www.muraroa.demon.co.uk
-       The ideas embodied in bzip and bzip2 are due to (at least)
-       the following people: Michael Burrows  and  David  Wheeler
+       The ideas embodied in bzip2 are due to (at least) the fol-
-       (for  the  block  sorting  transformation),  David Wheeler
+       lowing people: Michael Burrows and David Wheeler (for  the
-       (again, for the Huffman coder),  Peter  Fenwick  (for  the
+       block  sorting  transformation), David Wheeler (again, for
-       structured  coding  model  in 0.21, and many refinements),
+       the Huffman coder), Peter Fenwick (for the structured cod-
-       and Alistair Moffat, Radford Neal and Ian Witten (for  the
+       ing model in the original bzip, and many refinements), and
-       arithmetic  coder  in 0.21).  I am much indebted for their
+       Alistair Moffat, Radford Neal  and  Ian  Witten  (for  the
-       help, support and advice.  See the file ALGORITHMS in  the
+       arithmetic  coder  in  the  original  bzip).   I  am  much
-       source  distribution for pointers to sources of documenta-
+       indebted for their help, support and advice.  See the man-
-       tion.  Christian von Roques  encouraged  me  to  look  for
+       ual  in the source distribution for pointers to sources of
-       faster  sorting algorithms, so as to speed up compression.
+       documentation.  Christian von Roques encouraged me to look
-       Bela Lubkin encouraged me to improve the  worst-case  com-
+       for  faster sorting algorithms, so as to speed up compres-
-       pression  performance.   Many  people sent patches, helped
+       sion.  Bela Lubkin encouraged me to improve the worst-case
+       compression performance.  Many people sent patches, helped
       with portability problems, lent machines, gave advice  and
       were generally helpful.
-                                                                7
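
Review note: the MEMORY MANAGEMENT hunks change the decompression estimate from 5 x block size to 4 x block size, and the Decompress column of the table changes with it. The sketch below is one way to cross-check that the new figures agree with the new formula. The helper names are illustrative and not part of bzip2, and "k" is assumed to mean 1000 bytes, which is what the worked example "400k + 20000 * 7 = 540 kbytes" implies.

```python
# Cross-check of the revised memory figures in this diff.
# Formulas come from the MEMORY MANAGEMENT section of the man page;
# the function and variable names are hypothetical, for illustration only.

K = 1000  # assumption: the man page's "k" is 1000 bytes


def compress_bytes(block_size):
    """Compression estimate: 400k + (7 x block size)."""
    return 400 * K + 7 * block_size


def decompress_bytes(block_size, small=False):
    """Decompression estimate: 100k + (4 x block size),
    or 100k + (2.5 x block size) when -s is given."""
    if small:
        return 100 * K + (5 * block_size) // 2
    return 100 * K + 4 * block_size


# "Decompress usage" column as it reads after this diff, in k, keyed by flag.
NEW_DECOMPRESS_K = {1: 500, 2: 900, 3: 1300, 4: 1700, 5: 2100,
                    6: 2500, 7: 2900, 8: 3300, 9: 3700}


def mismatches():
    """Return the flags whose table entry disagrees with the formula."""
    bad = []
    for flag, table_k in NEW_DECOMPRESS_K.items():
        block_size = flag * 100 * K  # -1 .. -9 select 100k .. 900k blocks
        if decompress_bytes(block_size) != table_k * K:
            bad.append(flag)
    return bad


if __name__ == "__main__":
    print("flags that disagree with 100k + 4 x block size:", mismatches())
    # Memory actually touched for the 20,000-byte file in the worked example.
    print("compress, 20000-byte file:", compress_bytes(20000))
    print("decompress, 20000-byte file:", decompress_bytes(20000))
```

Under these assumptions every entry in the new Decompress column matches 100k + 4 x block size, and the 20,000-byte example gives 540k for compression and 180k for decompression, as the revised text states. The Compress column is untouched by this diff; its -7 entry (5400k) is 100k above what 400k + 7 x 700k yields.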