1 file changed, 176 insertions, 264 deletions
diff --git a/bzip2.1.preformatted b/bzip2.1.preformatted
index 8c4fab1..96b44be 100644
--- a/bzip2.1.preformatted
+++ b/bzip2.1.preformatted
@@ -1,24 +1,20 @@
-bzip2(1)                                                 bzip2(1)
 NNAAMMEE
-       bzip2, bunzip2 - a block-sorting file compressor, v0.9.0
+       bzip2, bunzip2 - a block-sorting file compressor, v0.9.5
       bzcat - decompresses files to stdout
       bzip2recover - recovers data from damaged bzip2 files
 SSYYNNOOPPSSIISS
-       bbzziipp22 [ --ccddffkkssttvvzzVVLL112233445566778899 ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
+       bbzziipp22 [ --ccddffkkqqssttvvzzVVLL112233445566778899 ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
       bbuunnzziipp22 [ --ffkkvvssVVLL ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
       bbzzccaatt [ --ss ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
       bbzziipp22rreeccoovveerr _f_i_l_e_n_a_m_e
 DDEESSCCRRIIPPTTIIOONN
-       _b_z_i_p_2  compresses  files  using the Burrows-Wheeler block-
+       _b_z_i_p_2  compresses  files  using  the Burrows-Wheeler block
       sorting text compression algorithm,  and  Huffman  coding.
       Compression  is  generally  considerably  better than that
       achieved by more conventional LZ77/LZ78-based compressors,
@@ -26,22 +22,22 @@ DDEESSCCRRIIPPTTIIOONN
       tistical compressors.
       The command-line options are deliberately very similar  to
-       those of _G_N_U _G_z_i_p_, but they are not identical.
+       those of _G_N_U _g_z_i_p_, but they are not identical.
       _b_z_i_p_2  expects  a list of file names to accompany the com-
       mand-line flags.  Each file is replaced  by  a  compressed
       version  of  itself,  with  the  name "original_name.bz2".
-       Each compressed file has the same  modification  date  and
+       Each compressed file has the same modification date,  per-
-       permissions  as  the corresponding original, so that these
+       missions, and, when possible, ownership as the correspond-
-       properties can  be  correctly  restored  at  decompression
+       ing original, so that these properties  can  be  correctly
-       time.  File name handling is naive in the sense that there
+       restored  at  decompression  time.   File name handling is
-       is no mechanism for preserving original file  names,  per-
+       naive in the sense that there is no mechanism for preserv-
-       missions  and  dates  in filesystems which lack these con-
+       ing  original file names, permissions, ownerships or dates
-       cepts, or have serious file name length restrictions, such
+       in filesystems which lack these concepts, or have  serious
-       as MS-DOS.
+       file name length restrictions, such as MS-DOS.
       _b_z_i_p_2  and  _b_u_n_z_i_p_2 will by default not overwrite existing
-       files; if you want this to happen, specify the -f flag.
+       files.  If you want this to happen, specify the -f flag.
       If no file names  are  specified,  _b_z_i_p_2  compresses  from
       standard  input  to  standard output.  In this case, _b_z_i_p_2
@@ -49,42 +45,50 @@ DDEESSCCRRIIPPTTIIOONN
       this  would  be  entirely  incomprehensible  and therefore
       pointless.
-       _b_u_n_z_i_p_2 (or _b_z_i_p_2 _-_d ) decompresses and restores all spec-
+       _b_u_n_z_i_p_2 (or _b_z_i_p_2 _-_d_) decompresses  all  specified  files.
-       ified files whose names end in ".bz2".  Files without this
+       Files which were not created by _b_z_i_p_2 will be detected and
-       suffix are ignored.  Again, supplying no filenames  causes
+       ignored, and a warning issued.  _b_z_i_p_2  attempts  to  guess
-       decompression from standard input to standard output.
+       the  filename  for  the decompressed file from that of the
+       compressed file as follows:
+              filename.bz2    becomes   filename
+              filename.bz     becomes   filename
+              filename.tbz2   becomes   filename.tar
+              filename.tbz    becomes   filename.tar
+              anyothername    becomes   anyothername.out
+       If the file does not end in one of the recognised endings,
+       _._b_z_2_,  _._b_z_,  _._t_b_z_2 or _._t_b_z_, _b_z_i_p_2 complains that it cannot
+       guess the name of the original file, and uses the original
+       name with _._o_u_t appended.
+       As  with compression, supplying no filenames causes decom-
+       pression from standard input to standard output.
       _b_u_n_z_i_p_2 will correctly decompress a file which is the con-
       catenation of two or more compressed files.  The result is
       the concatenation of the corresponding uncompressed files.
       Integrity testing (-t) of concatenated compressed files is
-                                                                1
-bzip2(1)                                                 bzip2(1)
       also supported.
-       You  can also compress or decompress files to the standard
+       You can also compress or decompress files to the  standard
-       output by giving the -c flag.  Multiple files may be  com-
+       output  by giving the -c flag.  Multiple files may be com-
       pressed and decompressed like this.  The resulting outputs
-       are fed sequentially to stdout.  Compression  of  multiple
+       are  fed  sequentially to stdout.  Compression of multiple
-       files  in this manner generates a stream containing multi-
+       files in this manner generates a stream containing  multi-
       ple compressed file representations.  Such a stream can be
-       decompressed  correctly  only  by  _b_z_i_p_2  version 0.9.0 or
+       decompressed correctly only  by  _b_z_i_p_2  version  0.9.0  or
-       later.  Earlier versions of _b_z_i_p_2 will stop  after  decom-
+       later.   Earlier  versions of _b_z_i_p_2 will stop after decom-
       pressing the first file in the stream.
-       _b_z_c_a_t  (or _b_z_i_p_2 _-_d_c ) decompresses all specified files to
+       _b_z_c_a_t (or _b_z_i_p_2 _-_d_c_) decompresses all specified  files  to
       the standard output.
+       _b_z_i_p_2  will  read arguments from the environment variables
+       _B_Z_I_P_2 and _B_Z_I_P_, in  that  order,  and  will  process  them
+       before  any  arguments  read  from the command line.  This
+       gives a convenient way to supply default arguments.
       Compression is always performed, even  if  the  compressed
       file  is slightly larger than the original.  Files of less
       than about one hundred bytes tend to get larger, since the
@@ -101,121 +105,19 @@ bzip2(1)                                                 bzip2(1)
       corruption  going  undetected  is  microscopic,  about one
       chance in four billion for each file processed.  Be aware,
       though,  that  the  check occurs upon decompression, so it
-       can only tell you that that something is wrong.  It  can't
+       can only tell you that something is wrong.  It can't  help
-       help  you recover the original uncompressed data.  You can
+       you  recover  the original uncompressed data.  You can use
-       use _b_z_i_p_2_r_e_c_o_v_e_r to  try  to  recover  data  from  damaged
+       _b_z_i_p_2_r_e_c_o_v_e_r to try to recover data from damaged files.
-       files.
-       Return  values:  0  for a normal exit, 1 for environmental
+       Return values: 0 for a normal exit,  1  for  environmental
-       problems (file not found, invalid flags, I/O errors,  &c),
+       problems  (file not found, invalid flags, I/O errors, &c),
       2 to indicate a corrupt compressed file, 3 for an internal
       consistency error (eg, bug) which caused _b_z_i_p_2 to panic.
-MMEEMMOORRYY MMAANNAAGGEEMMEENNTT
-       _B_z_i_p_2 compresses large files in blocks.   The  block  size
-       affects  both  the  compression  ratio  achieved,  and the
-       amount of memory needed both for  compression  and  decom-
-       pression.   The flags -1 through -9 specify the block size
-       to be 100,000 bytes through 900,000  bytes  (the  default)
-       respectively.   At decompression-time, the block size used
-       for compression is read from the header of the  compressed
-       file, and _b_u_n_z_i_p_2 then allocates itself just enough memory
-       to decompress the file.  Since block sizes are  stored  in
-       compressed  files,  it follows that the flags -1 to -9 are
-       irrelevant  to  and  so  ignored   during   decompression.
-                                                                2
-bzip2(1)                                                 bzip2(1)
-       Compression  and decompression requirements, in bytes, can
-       be estimated as:
-             Compression:   400k + ( 7 x block size )
-             Decompression: 100k + ( 4 x block size ), or
-                            100k + ( 2.5 x block size )
-       Larger  block  sizes  give  rapidly  diminishing  marginal
-       returns;  most of the compression comes from the first two
-       or three hundred k of block size, a fact worth bearing  in
-       mind  when  using  _b_z_i_p_2  on  small  machines.  It is also
-       important to  appreciate  that  the  decompression  memory
-       requirement  is  set  at compression-time by the choice of
-       block size.
-       For files compressed with the  default  900k  block  size,
-       _b_u_n_z_i_p_2  will require about 3700 kbytes to decompress.  To
-       support decompression of any file on a 4 megabyte machine,
-       _b_u_n_z_i_p_2  has  an  option to decompress using approximately
-       half this amount of memory, about 2300 kbytes.  Decompres-
-       sion  speed  is also halved, so you should use this option
-       only where necessary.  The relevant flag is -s.
-       In general, try and use the largest block size memory con-
-       straints  allow,  since  that  maximises  the  compression
-       achieved.  Compression and decompression speed are  virtu-
-       ally unaffected by block size.
-       Another  significant point applies to files which fit in a
-       single block -- that  means  most  files  you'd  encounter
-       using  a  large  block  size.   The  amount of real memory
-       touched is proportional to the size of the file, since the
-       file  is smaller than a block.  For example, compressing a
-       file 20,000 bytes long with the flag  -9  will  cause  the
-       compressor  to  allocate  around 6700k of memory, but only
-       touch 400k + 20000 * 7 = 540 kbytes of it.  Similarly, the
-       decompressor  will  allocate  3700k  but only touch 100k +
-       20000 * 4 = 180 kbytes.
-       Here is a table which summarises the maximum memory  usage
-       for  different  block  sizes.   Also recorded is the total
-       compressed size for 14 files of the Calgary Text  Compres-
-       sion  Corpus totalling 3,141,622 bytes.  This column gives
-       some feel for how  compression  varies  with  block  size.
-       These  figures  tend to understate the advantage of larger
-       block sizes for larger files, since the  Corpus  is  domi-
-       nated by smaller files.
-                  Compress   Decompress   Decompress   Corpus
-           Flag     usage      usage       -s usage     Size
-            -1      1100k       500k         350k      914704
-            -2      1800k       900k         600k      877703
-                                                                3
-bzip2(1)                                                 bzip2(1)
-            -3      2500k      1300k         850k      860338
-            -4      3200k      1700k        1100k      846899
-            -5      3900k      2100k        1350k      845160
-            -6      4600k      2500k        1600k      838626
-            -7      5400k      2900k        1850k      834096
-            -8      6000k      3300k        2100k      828642
-            -9      6700k      3700k        2350k      828642
 OOPPTTIIOONNSS
       --cc ----ssttddoouutt
-              Compress or decompress to standard output.  -c will
+              Compress or decompress to standard output.
-              decompress multiple files to stdout, but will  only
-              compress a single file to stdout.
       --dd ----ddeeccoommpprreessss
              Force  decompression.  _b_z_i_p_2_, _b_u_n_z_i_p_2 and _b_z_c_a_t are
@@ -235,7 +137,9 @@ OOPPTTIIOONNSS
       --ff ----ffoorrccee
              Force overwrite of output files.   Normally,  _b_z_i_p_2
-              will not overwrite existing output files.
+              will  not  overwrite  existing  output files.  Also
+              forces _b_z_i_p_2 to break hard links to files, which it
+              otherwise wouldn't do.
       --kk ----kkeeeepp
              Keep  (don't delete) input files during compression
@@ -254,19 +158,12 @@ OOPPTTIIOONNSS
              figure,  at  the expense of your compression ratio.
              In short, if your  machine  is  low  on  memory  (8
              megabytes  or  less),  use  -s for everything.  See
-              MEMORY MANAGEMENT above.
+              MEMORY MANAGEMENT below.
-                                                                4
-bzip2(1)                                                 bzip2(1)
+       --qq ----qquuiieett
+              Suppress non-essential warning messages.   Messages
+              pertaining  to I/O errors and other critical events
+              will not be suppressed.
       --vv ----vveerrbboossee
              Verbose mode -- show the compression ratio for each
@@ -281,22 +178,96 @@ bzip2(1)                                                 bzip2(1)
       --11 ttoo --99
              Set the block size to 100 k, 200 k ..  900  k  when
              compressing.   Has  no  effect  when decompressing.
-              See MEMORY MANAGEMENT above.
+              See MEMORY MANAGEMENT below.
+       ----     Treats all subsequent arguments as file names, even
+              if they start with a dash.  This is so you can han-
+              dle files with names beginning  with  a  dash,  for
+              example: bzip2 -- -myfilename.
+       ----rreeppeettiittiivvee--ffaasstt ----rreeppeettiittiivvee--bbeesstt
+              These  flags  are  redundant  in versions 0.9.5 and
+              above.  They provided some coarse control over  the
+              behaviour  of the sorting algorithm in earlier ver-
+              sions, which was sometimes useful.  0.9.5 and above
+              have  an  improved  algorithm  which  renders these
+              flags irrelevant.
+MMEEMMOORRYY MMAANNAAGGEEMMEENNTT
+       _b_z_i_p_2 compresses large files in blocks.   The  block  size
+       affects  both  the  compression  ratio  achieved,  and the
+       amount of memory needed for compression and decompression.
+       The  flags  -1  through  -9  specify  the block size to be
+       100,000 bytes through 900,000 bytes (the default)  respec-
+       tively.   At  decompression  time, the block size used for
+       compression is read from  the  header  of  the  compressed
+       file, and _b_u_n_z_i_p_2 then allocates itself just enough memory
+       to decompress the file.  Since block sizes are  stored  in
+       compressed  files,  it follows that the flags -1 to -9 are
+       irrelevant to and so ignored during decompression.
-       ----rreeppeettiittiivvee--ffaasstt
+       Compression and decompression requirements, in bytes,  can
-              _b_z_i_p_2 injects some small  pseudo-random  variations
+       be estimated as:
-              into  very  repetitive  blocks  to limit worst-case
-              performance during compression.   If  sorting  runs
-              into  difficulties,  the  block  is randomised, and
-              sorting is restarted.  Very roughly, _b_z_i_p_2 persists
-              for  three  times  as  long as a well-behaved input
-              would take before resorting to randomisation.  This
-              flag makes it give up much sooner.
+              Compression:   400k + ( 8 x block size )
-       ----rreeppeettiittiivvee--bbeesstt
+              Decompression: 100k + ( 4 x block size ), or
-              Opposite  of  --repetitive-fast;  try  a lot harder
+                             100k + ( 2.5 x block size )
-              before resorting to randomisation.
+       Larger  block  sizes  give  rapidly  diminishing  marginal
+       returns.  Most of the compression comes from the first two
+       or  three hundred k of block size, a fact worth bearing in
+       mind when using _b_z_i_p_2  on  small  machines.   It  is  also
+       important  to  appreciate  that  the  decompression memory
+       requirement is set at compression time by  the  choice  of
+       block size.
+       For  files  compressed  with  the default 900k block size,
+       _b_u_n_z_i_p_2 will require about 3700 kbytes to decompress.   To
+       support decompression of any file on a 4 megabyte machine,
+       _b_u_n_z_i_p_2 has an option to  decompress  using  approximately
+       half this amount of memory, about 2300 kbytes.  Decompres-
+       sion speed is also halved, so you should use  this  option
+       only where necessary.  The relevant flag is -s.
+       In general, try and use the largest block size memory con-
+       straints  allow,  since  that  maximises  the  compression
+       achieved.   Compression and decompression speed are virtu-
+       ally unaffected by block size.
+       Another significant point applies to files which fit in  a
+       single  block  --  that  means  most files you'd encounter
+       using a large block  size.   The  amount  of  real  memory
+       touched is proportional to the size of the file, since the
+       file is smaller than a block.  For example, compressing  a
+       file  20,000  bytes  long  with the flag -9 will cause the
+       compressor to allocate around 7600k of  memory,  but  only
+       touch 400k + 20000 * 8 = 560 kbytes of it.  Similarly, the
+       decompressor will allocate 3700k but  only  touch  100k  +
+       20000 * 4 = 180 kbytes.
+       Here  is a table which summarises the maximum memory usage
+       for different block sizes.  Also  recorded  is  the  total
+       compressed  size for 14 files of the Calgary Text Compres-
+       sion Corpus totalling 3,141,622 bytes.  This column  gives
+       some  feel  for  how  compression  varies with block size.
+       These figures tend to understate the advantage  of  larger
+       block  sizes  for  larger files, since the Corpus is domi-
+       nated by smaller files.
+                  Compress   Decompress   Decompress   Corpus
+           Flag     usage      usage       -s usage     Size
+            -1      1200k       500k         350k      914704
+            -2      2000k       900k         600k      877703
+            -3      2800k      1300k         850k      860338
+            -4      3600k      1700k        1100k      846899
+            -5      4400k      2100k        1350k      845160
+            -6      5200k      2500k        1600k      838626
+            -7      6100k      2900k        1850k      834096
+            -8      6800k      3300k        2100k      828642
+            -9      7600k      3700k        2350k      828642
 RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD FFIILLEESS
@@ -314,7 +285,7 @@ RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD F
       _b_z_i_p_2_r_e_c_o_v_e_r is a  simple  program  whose  purpose  is  to
       search  for blocks in .bz2 files, and write each block out
-       into its own .bz2 file.  You can then use _b_z_i_p_2 _-_t to test
+       into its own .bz2 file.  You can then use _b_z_i_p_2 -t to test
       the integrity of the resulting files, and decompress those
       which are undamaged.
@@ -322,21 +293,9 @@ RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD F
       aged file, and writes a number of files "rec0001file.bz2",
       "rec0002file.bz2", etc, containing the  extracted  blocks.
       The  output  filenames  are  designed  so  that the use of
-                                                                5
-bzip2(1)                                                 bzip2(1)
       wildcards in subsequent processing -- for example,  "bzip2
-       -dc  rec*file.bz2  > recovered_data" -- lists the files in
+       -dc   rec*file.bz2 > recovered_data" -- lists the files in
-       the "right" order.
+       the correct order.
       _b_z_i_p_2_r_e_c_o_v_e_r should be of most use dealing with large .bz2
       files,  as  these will contain many blocks.  It is clearly
@@ -351,17 +310,15 @@ PPEERRFFOORRMMAANNCCEE NNOOTTEESS
       The sorting phase of compression gathers together  similar
       strings  in  the  file.  Because of this, files containing
       very long runs of  repeated  symbols,  like  "aabaabaabaab
-       ..."   (repeated   several  hundred  times)  may  compress
+       ..."   (repeated  several hundred times) may compress more
-       extraordinarily slowly.  You can use the -vvvvv option  to
+       slowly than normal.  Versions 0.9.5 and  above  fare  much
-       monitor progress in great detail, if you want.  Decompres-
+       better  than previous versions in this respect.  The ratio
-       sion speed is unaffected.
+       between worst-case and average-case compression time is in
+       the  region  of  10:1.  For previous versions, this figure
-       Such pathological cases seem rare in  practice,  appearing
+       was more like 100:1.  You can use the -vvvv option to mon-
-       mostly in artificially-constructed test files, and in low-
+       itor progress in great detail, if you want.
-       level disk images.  It may be inadvisable to use _b_z_i_p_2  to
-       compress  the  latter.   If you do get a file which causes
+       Decompression speed is unaffected by these phenomena.
-       severe slowness in compression, try making the block  size
-       as small as possible, with flag -1.
       _b_z_i_p_2  usually  allocates  several  megabytes of memory to
       operate in, and then charges all over it in a fairly  ran-
@@ -376,88 +333,43 @@ PPEERRFFOORRMMAANNCCEE NNOOTTEESS
 CCAAVVEEAATTSS
       I/O  error  messages  are not as helpful as they could be.
-       _B_z_i_p_2 tries hard to detect I/O errors  and  exit  cleanly,
+       _b_z_i_p_2 tries hard to detect I/O errors  and  exit  cleanly,
       but  the  details  of  what  the problem is sometimes seem
       rather misleading.
-       This manual page pertains to version 0.9.0 of _b_z_i_p_2_.  Com-
+       This manual page pertains to version 0.9.5 of _b_z_i_p_2_.  Com-
       pressed  data created by this version is entirely forwards
-       and backwards compatible with the previous public release,
+       and  backwards  compatible  with   the   previous   public
-       version  0.1pl2,  but  with the following exception: 0.9.0
+       releases,  versions 0.1pl2 and 0.9.0, but with the follow-
-       can correctly decompress multiple concatenated  compressed
+       ing exception: 0.9.0 and above  can  correctly  decompress
-       files.   0.1pl2  cannot do this; it will stop after decom-
+       multiple  concatenated compressed files.  0.1pl2 cannot do
-       pressing just the first file in the stream.
+       this; it will stop after decompressing just the first file
+       in the stream.
+       _b_z_i_p_2_r_e_c_o_v_e_r  uses  32-bit integers to represent bit posi-
+       tions in compressed files, so it cannot handle  compressed
-                                                                6
+       files  more than 512 megabytes long.  This could easily be
-bzip2(1)                                                 bzip2(1)
-       Wildcard expansion for Windows 95 and NT is flaky.
-       _b_z_i_p_2_r_e_c_o_v_e_r uses 32-bit integers to represent  bit  posi-
-       tions  in compressed files, so it cannot handle compressed
-       files more than 512 megabytes long.  This could easily  be
       fixed.
 AAUUTTHHOORR
       Julian Seward, jseward@acm.org.
       http://www.muraroa.demon.co.uk
       The ideas embodied in _b_z_i_p_2 are due to (at least) the fol-
-       lowing people: Michael Burrows and David Wheeler (for  the
+       lowing  people: Michael Burrows and David Wheeler (for the
-       block  sorting  transformation), David Wheeler (again, for
+       block sorting transformation), David Wheeler  (again,  for
       the Huffman coder), Peter Fenwick (for the structured cod-
       ing model in the original _b_z_i_p_, and many refinements), and
-       Alistair Moffat, Radford Neal  and  Ian  Witten  (for  the
+       Alistair  Moffat,  Radford  Neal  and  Ian Witten (for the
       arithmetic  coder  in  the  original  _b_z_i_p_)_.   I  am  much
       indebted for their help, support and advice.  See the man-
-       ual  in the source distribution for pointers to sources of
+       ual in the source distribution for pointers to sources  of
       documentation.  Christian von Roques encouraged me to look
-       for  faster sorting algorithms, so as to speed up compres-
+       for faster sorting algorithms, so as to speed up  compres-
       sion.  Bela Lubkin encouraged me to improve the worst-case
       compression performance.  Many people sent patches, helped
-       with portability problems, lent machines, gave advice  and
+       with  portability problems, lent machines, gave advice and
       were generally helpful.
-                                                                7
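
As a rough illustration of two behaviours this patch documents, the sketch below models the output-name guessing rules added to DESCRIPTION and the revised memory estimates in MEMORY MANAGEMENT (compression now 400k + 8 x block size). It is not taken from the bzip2 sources. All function and constant names are hypothetical, and "k" is assumed to mean 1000 bytes because the page equates -1 with 100,000 bytes.

```python
# Illustrative sketch only; names are hypothetical, not bzip2 internals.

# Suffix rules as listed in the new DESCRIPTION text.
SUFFIX_RULES = [
    (".bz2", ""),
    (".bz", ""),
    (".tbz2", ".tar"),
    (".tbz", ".tar"),
]


def guess_output_name(compressed_name):
    """Guess the decompressed file name from the compressed one."""
    for suffix, replacement in SUFFIX_RULES:
        if compressed_name.endswith(suffix):
            return compressed_name[: -len(suffix)] + replacement
    # No recognised ending: the page says ".out" is appended.
    return compressed_name + ".out"


K = 1000  # assumption: "k" read as 1000 bytes, matching "100,000 bytes" for -1


def compress_estimate(level):
    """Estimated bytes allocated when compressing with flag -1 .. -9."""
    return 400 * K + 8 * level * 100 * K


def decompress_estimate(level, small=False):
    """Estimated bytes allocated when decompressing; small=True models -s."""
    factor = 2.5 if small else 4
    return int(100 * K + factor * level * 100 * K)


def touched_when_compressing(file_size):
    """Estimated bytes actually touched for a file smaller than one block."""
    return 400 * K + 8 * file_size


if __name__ == "__main__":
    for name in ("filename.bz2", "filename.tbz2", "anyothername"):
        print(name, "->", guess_output_name(name))
    print("-9 compress:", compress_estimate(9) // K, "k")
    print("-9 decompress:", decompress_estimate(9) // K, "k")
    print("-9 decompress -s:", decompress_estimate(9, small=True) // K, "k")
    print("20,000-byte file touches:", touched_when_compressing(20_000) // K, "k")
```

The formulas are estimates. They reproduce most rows of the revised table (for example 7600k, 3700k and 2350k for -9), but the -7 compression row lists 6100k where the formula gives 6000k.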