bzip2-0.9.0c

author: Julian Seward <jseward@acm.org> 1998-08-23 22:13:13 +0200
committer: Julian Seward <jseward@acm.org> 1998-08-23 22:13:13 +0200
commit: 977101ad5f833f5c0a574bfeea408e5301a6b052 (patch)
tree: fc1e8fed202869c116cbf6b8c362456042494a0a /bzip2.1.preformatted
parent: 1eb67a9d8f7f05ae310bc9ef297d176f3a3f8a37 (diff)
download: bzip2-0.9.0c.tar.gz bzip2-0.9.0c.tar.bz2 bzip2-0.9.0c.zip
1 files changed, 158 insertions, 160 deletions
diff --git a/bzip2.1.preformatted b/bzip2.1.preformatted
index 5206e05..8c4fab1 100644
--- a/bzip2.1.preformatted
+++ b/bzip2.1.preformatted
@@ -5,18 +5,20 @@ bzip2(1)                                                 bzip2(1)
 NNAAMMEE
-       bzip2, bunzip2 - a block-sorting file compressor, v0.1
+       bzip2, bunzip2 - a block-sorting file compressor, v0.9.0
+       bzcat - decompresses files to stdout
       bzip2recover - recovers data from damaged bzip2 files
 SSYYNNOOPPSSIISS
-       bbzziipp22 [ --ccddffkkssttvvVVLL112233445566778899 ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
+       bbzziipp22 [ --ccddffkkssttvvzzVVLL112233445566778899 ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
-       bbuunnzziipp22 [ --kkvvssVVLL ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
+       bbuunnzziipp22 [ --ffkkvvssVVLL ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
+       bbzzccaatt [ --ss ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
       bbzziipp22rreeccoovveerr _f_i_l_e_n_a_m_e
 DDEESSCCRRIIPPTTIIOONN
-       _B_z_i_p_2  compresses  files  using the Burrows-Wheeler block-
+       _b_z_i_p_2  compresses  files  using the Burrows-Wheeler block-
       sorting text compression algorithm,  and  Huffman  coding.
       Compression  is  generally  considerably  better than that
       achieved by more conventional LZ77/LZ78-based compressors,
@@ -26,7 +28,7 @@ DDEESSCCRRIIPPTTIIOONN
       The command-line options are deliberately very similar  to
       those of _G_N_U _G_z_i_p_, but they are not identical.
-       _B_z_i_p_2  expects  a list of file names to accompany the com-
+       _b_z_i_p_2  expects  a list of file names to accompany the com-
       mand-line flags.  Each file is replaced  by  a  compressed
       version  of  itself,  with  the  name "original_name.bz2".
       Each compressed file has the same  modification  date  and
@@ -38,8 +40,8 @@ DDEESSCCRRIIPPTTIIOONN
       cepts, or have serious file name length restrictions, such
       as MS-DOS.
-       _B_z_i_p_2  and  _b_u_n_z_i_p_2  will not overwrite existing files; if
+       _b_z_i_p_2  and  _b_u_n_z_i_p_2 will by default not overwrite existing
-       you want this to happen, you should delete them first.
+       files; if you want this to happen, specify the -f flag.
       If no file names  are  specified,  _b_z_i_p_2  compresses  from
       standard  input  to  standard output.  In this case, _b_z_i_p_2
@@ -47,17 +49,15 @@ DDEESSCCRRIIPPTTIIOONN
       this  would  be  entirely  incomprehensible  and therefore
       pointless.
-       _B_u_n_z_i_p_2 (or _b_z_i_p_2 _-_d ) decompresses and restores all spec-
+       _b_u_n_z_i_p_2 (or _b_z_i_p_2 _-_d ) decompresses and restores all spec-
       ified files whose names end in ".bz2".  Files without this
       suffix are ignored.  Again, supplying no filenames  causes
       decompression from standard input to standard output.
-       You  can also compress or decompress files to the standard
+       _b_u_n_z_i_p_2 will correctly decompress a file which is the con-
-       output by giving the -c flag.  You can decompress multiple
+       catenation of two or more compressed files.  The result is
-       files  like  this, but you may only compress a single file
+       the concatenation of the corresponding uncompressed files.
-       this way, since it would otherwise be difficult  to  sepa-
+       Integrity testing (-t) of concatenated compressed files is
-       rate  out  the  compressed representations of the original
-       files.
@@ -70,6 +70,21 @@ DDEESSCCRRIIPPTTIIOONN
 bzip2(1)                                                 bzip2(1)
+       also supported.
+       You  can also compress or decompress files to the standard
+       output by giving the -c flag.  Multiple files may be  com-
+       pressed and decompressed like this.  The resulting outputs
+       are fed sequentially to stdout.  Compression  of  multiple
+       files  in this manner generates a stream containing multi-
+       ple compressed file representations.  Such a stream can be
+       decompressed  correctly  only  by  _b_z_i_p_2  version 0.9.0 or
+       later.  Earlier versions of _b_z_i_p_2 will stop  after  decom-
+       pressing the first file in the stream.
+       _b_z_c_a_t  (or _b_z_i_p_2 _-_d_c ) decompresses all specified files to
+       the standard output.
       Compression is always performed, even  if  the  compressed
       file  is slightly larger than the original.  Files of less
       than about one hundred bytes tend to get larger, since the
@@ -108,36 +123,37 @@ MMEEMMOORRYY MMAANNAAGGEEMMEENNTT
       file, and _b_u_n_z_i_p_2 then allocates itself just enough memory
       to decompress the file.  Since block sizes are  stored  in
       compressed  files,  it follows that the flags -1 to -9 are
-       irrelevant to and so ignored during  decompression.   Com-
+       irrelevant  to  and  so  ignored   during   decompression.
-       pression  and decompression requirements, in bytes, can be
-       estimated as:
-             Compression:   400k + ( 7 x block size )
-             Decompression: 100k + ( 5 x block size ), or
-                            100k + ( 2.5 x block size )
-       Larger  block  sizes  give  rapidly  diminishing  marginal
+                                                                2
-       returns;  most of the compression comes from the first two
-       or three hundred k of block size, a fact worth bearing  in
-       mind  when  using  _b_z_i_p_2  on  small  machines.  It is also
-       important to  appreciate  that  the  decompression  memory
-       requirement  is  set  at compression-time by the choice of
-       block size.
-                                                                2
+bzip2(1)                                                 bzip2(1)
+       Compression  and decompression requirements, in bytes, can
+       be estimated as:
-bzip2(1)                                                 bzip2(1)
+             Compression:   400k + ( 7 x block size )
+             Decompression: 100k + ( 4 x block size ), or
+                            100k + ( 2.5 x block size )
+       Larger  block  sizes  give  rapidly  diminishing  marginal
+       returns;  most of the compression comes from the first two
+       or three hundred k of block size, a fact worth bearing  in
+       mind  when  using  _b_z_i_p_2  on  small  machines.  It is also
+       important to  appreciate  that  the  decompression  memory
+       requirement  is  set  at compression-time by the choice of
+       block size.
       For files compressed with the  default  900k  block  size,
-       _b_u_n_z_i_p_2  will require about 4600 kbytes to decompress.  To
+       _b_u_n_z_i_p_2  will require about 3700 kbytes to decompress.  To
       support decompression of any file on a 4 megabyte machine,
       _b_u_n_z_i_p_2  has  an  option to decompress using approximately
       half this amount of memory, about 2300 kbytes.  Decompres-
@@ -157,8 +173,8 @@ bzip2(1)                                                 bzip2(1)
       file 20,000 bytes long with the flag  -9  will  cause  the
       compressor  to  allocate  around 6700k of memory, but only
       touch 400k + 20000 * 7 = 540 kbytes of it.  Similarly, the
-       decompressor  will  allocate  4600k  but only touch 100k +
+       decompressor  will  allocate  3700k  but only touch 100k +
-       20000 * 5 = 200 kbytes.
+       20000 * 4 = 180 kbytes.
       Here is a table which summarises the maximum memory  usage
       for  different  block  sizes.   Also recorded is the total
@@ -172,64 +188,66 @@ bzip2(1)                                                 bzip2(1)
                  Compress   Decompress   Decompress   Corpus
           Flag     usage      usage       -s usage     Size
-            -1      1100k       600k         350k      914704
+            -1      1100k       500k         350k      914704
-            -2      1800k      1100k         600k      877703
+            -2      1800k       900k         600k      877703
-            -3      2500k      1600k         850k      860338
-            -4      3200k      2100k        1100k      846899
-            -5      3900k      2600k        1350k      845160
-            -6      4600k      3100k        1600k      838626
-            -7      5400k      3600k        1850k      834096
-            -8      6000k      4100k        2100k      828642
-            -9      6700k      4600k        2350k      828642
-OOPPTTIIOONNSS
-       --cc ----ssttddoouutt
-              Compress or decompress to standard output.  -c will
-              decompress multiple files to stdout, but will  only
-              compress a single file to stdout.
+                                                                3
-                                                                3
+bzip2(1)                                                 bzip2(1)
+            -3      2500k      1300k         850k      860338
+            -4      3200k      1700k        1100k      846899
+            -5      3900k      2100k        1350k      845160
+            -6      4600k      2500k        1600k      838626
+            -7      5400k      2900k        1850k      834096
+            -8      6000k      3300k        2100k      828642
+            -9      6700k      3700k        2350k      828642
-bzip2(1)                                                 bzip2(1)
+OOPPTTIIOONNSS
+       --cc ----ssttddoouutt
+              Compress or decompress to standard output.  -c will
+              decompress multiple files to stdout, but will  only
+              compress a single file to stdout.
       --dd ----ddeeccoommpprreessss
-              Force  decompression.  _B_z_i_p_2 and _b_u_n_z_i_p_2 are really
+              Force  decompression.  _b_z_i_p_2_, _b_u_n_z_i_p_2 and _b_z_c_a_t are
-              the same program, and the decision about whether to
+              really the same program,  and  the  decision  about
-              compress  or  decompress  is  done  on the basis of
+              what  actions to take is done on the basis of which
-              which name is used.  This flag overrides that mech-
+              name is used.  This flag overrides that  mechanism,
-              anism, and forces _b_z_i_p_2 to decompress.
+              and forces _b_z_i_p_2 to decompress.
-       --ff ----ccoommpprreessss
+       --zz ----ccoommpprreessss
              The  complement  to -d: forces compression, regard-
              less of the invokation name.
       --tt ----tteesstt
              Check integrity of the specified file(s), but don't
              decompress  them.   This  really  performs  a trial
-              decompression and throws away the result, using the
+              decompression and throws away the result.
-              low-memory decompression algorithm (see -s).
+       --ff ----ffoorrccee
+              Force overwrite of output files.   Normally,  _b_z_i_p_2
+              will not overwrite existing output files.
       --kk ----kkeeeepp
              Keep  (don't delete) input files during compression
              or decompression.
       --ss ----ssmmaallll
-              Reduce  memory  usage,  both  for  compression  and
+              Reduce memory usage, for compression, decompression
-              decompression.  Files are decompressed using a mod-
+              and  testing.   Files  are  decompressed and tested
-              ified algorithm which only requires 2.5  bytes  per
+              using a modified algorithm which only requires  2.5
-              block  byte.   This  means  any  file can be decom-
+              bytes  per  block byte.  This means any file can be
-              pressed in 2300k of memory,  albeit  somewhat  more
+              decompressed in 2300k of memory,  albeit  at  about
-              slowly than usual.
+              half the normal speed.
              During  compression,  -s  selects  a  block size of
              200k, which limits memory use to  around  the  same
@@ -239,35 +257,32 @@ bzip2(1)                                                 bzip2(1)
              MEMORY MANAGEMENT above.
+                                                                4
+bzip2(1)                                                 bzip2(1)
       --vv ----vveerrbboossee
              Verbose mode -- show the compression ratio for each
              file  processed.   Further  -v's  increase the ver-
              bosity level, spewing out lots of information which
              is primarily of interest for diagnostic purposes.
-       --LL ----lliicceennssee
+       --LL ----lliicceennssee --VV ----vveerrssiioonn
              Display  the  software  version,  license terms and
              conditions.
-       --VV ----vveerrssiioonn
-              Same as -L.
       --11 ttoo --99
              Set the block size to 100 k, 200 k ..  900  k  when
              compressing.   Has  no  effect  when decompressing.
              See MEMORY MANAGEMENT above.
-                                                                4
-bzip2(1)                                                 bzip2(1)
       ----rreeppeettiittiivvee--ffaasstt
              _b_z_i_p_2 injects some small  pseudo-random  variations
              into  very  repetitive  blocks  to limit worst-case
@@ -306,34 +321,34 @@ RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD F
       _b_z_i_p_2_r_e_c_o_v_e_r takes a single argument, the name of the dam-
       aged file, and writes a number of files "rec0001file.bz2",
       "rec0002file.bz2", etc, containing the  extracted  blocks.
-       The output filenames are designed so that the use of wild-
+       The  output  filenames  are  designed  so  that the use of
-       cards in subsequent processing -- for example, "bzip2  -dc
-       rec*file.bz2  >  recovered_data" -- lists the files in the
-       "right" order.
-       _b_z_i_p_2_r_e_c_o_v_e_r should be of most use dealing with large .bz2
-       files,  as  these will contain many blocks.  It is clearly
-       futile to use it on damaged single-block  files,  since  a
-       damaged  block  cannot  be recovered.  If you wish to min-
-       imise any potential data loss through media  or  transmis-
-       sion errors, you might consider compressing with a smaller
-       block size.
-PPEERRFFOORRMMAANNCCEE NNOOTTEESS
+                                                                5
-       The sorting phase of compression gathers together  similar
-                                                                5
+bzip2(1)                                                 bzip2(1)
+       wildcards in subsequent processing -- for example,  "bzip2
+       -dc  rec*file.bz2  > recovered_data" -- lists the files in
+       the "right" order.
-bzip2(1)                                                 bzip2(1)
+       _b_z_i_p_2_r_e_c_o_v_e_r should be of most use dealing with large .bz2
+       files,  as  these will contain many blocks.  It is clearly
+       futile to use it on damaged single-block  files,  since  a
+       damaged  block  cannot  be recovered.  If you wish to min-
+       imise any potential data loss through media  or  transmis-
+       sion errors, you might consider compressing with a smaller
+       block size.
+PPEERRFFOORRMMAANNCCEE NNOOTTEESS
+       The sorting phase of compression gathers together  similar
       strings  in  the  file.  Because of this, files containing
       very long runs of  repeated  symbols,  like  "aabaabaabaab
       ..."   (repeated   several  hundred  times)  may  compress
@@ -348,10 +363,6 @@ bzip2(1)                                                 bzip2(1)
       severe slowness in compression, try making the block  size
       as small as possible, with flag -1.
-       Incompressible or virtually-incompressible data may decom-
-       press rather more slowly than one would hope.  This is due
-       to a naive implementation of the move-to-front coder.
       _b_z_i_p_2  usually  allocates  several  megabytes of memory to
       operate in, and then charges all over it in a fairly  ran-
       dom  fashion.   This means that performance, both for com-
@@ -362,12 +373,6 @@ bzip2(1)                                                 bzip2(1)
       large performance improvements.  I imagine _b_z_i_p_2 will per-
       form best on machines with very large caches.
-       Test mode (-t) uses the low-memory decompression algorithm
-       (-s).  This means test mode does not run  as  fast  as  it
-       could;  it  could  run as fast as the normal decompression
-       machinery.  This could easily be fixed at the cost of some
-       code bloat.
 CCAAVVEEAATTSS
       I/O  error  messages  are not as helpful as they could be.
@@ -375,19 +380,14 @@ CCAAVVEEAATTSS
       but  the  details  of  what  the problem is sometimes seem
       rather misleading.
-       This manual page pertains to version 0.1 of _b_z_i_p_2_.  It may
+       This manual page pertains to version 0.9.0 of _b_z_i_p_2_.  Com-
-       well  happen that some future version will use a different
+       pressed  data created by this version is entirely forwards
-       compressed file format.  If you try to  decompress,  using
+       and backwards compatible with the previous public release,
-       0.1,  a  .bz2  file created with some future version which
+       version  0.1pl2,  but  with the following exception: 0.9.0
-       uses a different compressed file format, 0.1 will complain
+       can correctly decompress multiple concatenated  compressed
-       that  your  file  "is not a bzip2 file".  If that happens,
+       files.   0.1pl2  cannot do this; it will stop after decom-
-       you should obtain a more recent version of _b_z_i_p_2  and  use
+       pressing just the first file in the stream.
-       that to decompress the file.
-       Wildcard expansion for Windows 95 and NT is flaky.
-       _b_z_i_p_2_r_e_c_o_v_e_r  uses  32-bit integers to represent bit posi-
-       tions in compressed files, so it cannot handle  compressed
@@ -400,61 +400,59 @@ CCAAVVEEAATTSS
 bzip2(1)                                                 bzip2(1)
-       files  more than 512 megabytes long.  This could easily be
+       Wildcard expansion for Windows 95 and NT is flaky.
+       _b_z_i_p_2_r_e_c_o_v_e_r uses 32-bit integers to represent  bit  posi-
+       tions  in compressed files, so it cannot handle compressed
+       files more than 512 megabytes long.  This could easily  be
       fixed.
-       _b_z_i_p_2_r_e_c_o_v_e_r sometimes reports a  very  small,  incomplete
-       final  block.  This is spurious and can be safely ignored.
+AAUUTTHHOORR
+       Julian Seward, jseward@acm.org.
+       http://www.muraroa.demon.co.uk
+       The ideas embodied in _b_z_i_p_2 are due to (at least) the fol-
+       lowing people: Michael Burrows and David Wheeler (for  the
+       block  sorting  transformation), David Wheeler (again, for
+       the Huffman coder), Peter Fenwick (for the structured cod-
+       ing model in the original _b_z_i_p_, and many refinements), and
+       Alistair Moffat, Radford Neal  and  Ian  Witten  (for  the
+       arithmetic  coder  in  the  original  _b_z_i_p_)_.   I  am  much
+       indebted for their help, support and advice.  See the man-
+       ual  in the source distribution for pointers to sources of
+       documentation.  Christian von Roques encouraged me to look
+       for  faster sorting algorithms, so as to speed up compres-
+       sion.  Bela Lubkin encouraged me to improve the worst-case
+       compression performance.  Many people sent patches, helped
+       with portability problems, lent machines, gave advice  and
+       were generally helpful.
-RREELLAATTIIOONNSSHHIIPP TTOO bbzziipp--00..2211
-       This program is a descendant of the _b_z_i_p program,  version
-       0.21,  which  I released in August 1996.  The primary dif-
-       ference of _b_z_i_p_2 is its avoidance of the possibly patented
-       algorithms  which  were  used  in 0.21.  _b_z_i_p_2 also brings
-       various useful refinements (-s,  -t),  uses  less  memory,
-       decompresses  significantly  faster,  and  has support for
-       recovering data from damaged files.
-       Because _b_z_i_p_2 uses Huffman coding to  construct  the  com-
-       pressed  bitstream, rather than the arithmetic coding used
-       in 0.21, the compressed representations generated  by  the
-       two  programs are incompatible, and they will not interop-
-       erate.  The change in suffix from  .bz  to  .bz2  reflects
-       this.   It would have been helpful to at least allow _b_z_i_p_2
-       to decompress files created by 0.21, but this would defeat
-       the primary aim of having a patent-free compressor.
-       For a more precise statement about patent issues in bzip2,
-       please see the README file in the distribution.
-       Huffman  coding  necessarily  involves some coding ineffi-
-       ciency compared to arithmetic  coding.   This  means  that
-       _b_z_i_p_2  compresses about 1% worse than 0.21, an unfortunate
-       but unavoidable fact-of-life.  On the other  hand,  decom-
-       pression  is approximately 50% faster for the same reason,
-       and the change in file format gave an opportunity  to  add
-       data-recovery features.  So it is not all bad.
-AAUUTTHHOORR
-       Julian Seward, jseward@acm.org.
-       The ideas embodied in _b_z_i_p and _b_z_i_p_2 are due to (at least)
-       the following people: Michael Burrows  and  David  Wheeler
-       (for  the  block  sorting  transformation),  David Wheeler
-       (again, for the Huffman coder),  Peter  Fenwick  (for  the
-       structured  coding  model  in 0.21, and many refinements),
-       and Alistair Moffat, Radford Neal and Ian Witten (for  the
-       arithmetic  coder  in 0.21).  I am much indebted for their
-       help, support and advice.  See the file ALGORITHMS in  the
-       source  distribution for pointers to sources of documenta-
-       tion.  Christian von Roques  encouraged  me  to  look  for
-       faster  sorting algorithms, so as to speed up compression.
-       Bela Lubkin encouraged me to improve the  worst-case  com-
-       pression  performance.   Many  people sent patches, helped
-       with portability problems, lent machines, gave advice  and
-       were generally helpful.
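
The revised MEMORY MANAGEMENT figures in this patch (decompression now costs 4 bytes per block byte instead of 5) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of bzip2: the function names are hypothetical, and it assumes "k" means 1000 bytes, which matches the worked 20,000-byte example in the man page.

```python
# Sketch of the memory estimates quoted in the updated man page.
# Assumption: "k" is taken as 1000 bytes; all names here are illustrative.
K = 1000


def compress_bytes(block_size):
    """Compression estimate: 400k + (7 x block size)."""
    return 400 * K + 7 * block_size


def decompress_bytes(block_size, small=False):
    """Decompression estimate: 100k + (4 x block size),
    or 100k + (2.5 x block size) when -s is given."""
    factor = 2.5 if small else 4
    return int(100 * K + factor * block_size)


def usage_table():
    """Rows of (flag, compress k, decompress k, decompress -s k) for -1 to -9."""
    rows = []
    for level in range(1, 10):
        block = level * 100 * K  # -1 .. -9 select 100k .. 900k blocks
        rows.append((
            level,
            compress_bytes(block) // K,
            decompress_bytes(block) // K,
            decompress_bytes(block, small=True) // K,
        ))
    return rows


if __name__ == "__main__":
    for flag, comp, decomp, decomp_small in usage_table():
        print(f"-{flag}  {comp:5d}k  {decomp:5d}k  {decomp_small:5d}k")
```

The formula reproduces the new Decompress column of the table (500k for -1 up to 3700k for -9). For -7 it gives 5300k for compression, whereas the table in the man page lists 5400k; the sketch follows the formula and does not try to explain that difference.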
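
The behavioural change described in this patch, that concatenated compressed files decompress to the concatenation of the originals while older versions stop after the first file, can be illustrated with Python's standard bz2 module. This is a sketch under stated assumptions: it uses Python's bz2 bindings rather than the bzip2 command line, the helper names are hypothetical, and the single-stream decompressor only stands in for the 0.1pl2 behaviour.

```python
import bz2


def concat_compress(chunks):
    """Compress each chunk separately and join the results,
    similar in effect to `bzip2 -c file1 file2 > out.bz2`."""
    return b"".join(bz2.compress(chunk) for chunk in chunks)


def decompress_all(data):
    """bz2.decompress handles multiple concatenated streams
    (Python 3.3 and later)."""
    return bz2.decompress(data)


def decompress_first_only(data):
    """A single BZ2Decompressor stops at the end of the first stream;
    whatever follows is left in unused_data."""
    decompressor = bz2.BZ2Decompressor()
    first = decompressor.decompress(data)
    return first, decompressor.unused_data


if __name__ == "__main__":
    stream = concat_compress([b"hello ", b"world"])
    print(decompress_all(stream))
    first, rest = decompress_first_only(stream)
    print(first, len(rest))
```

Feeding the leftover bytes from decompress_first_only to a fresh decompressor recovers the second file, which is in effect what a multi-stream-aware reader does in a loop.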