author | Julian Seward <jseward@acm.org> | 2005-02-15 22:13:13 +0100 |
---|---|---|
committer | Julian Seward <jseward@acm.org> | 2005-02-15 22:13:13 +0100 |
commit | 4d540bfc95a4b0eefc1d1f388ec33534aaeb3a2f (patch) | |
tree | 3b7e9c650b4c61d114e1716c4698e40d5c8d7ef7 /bzip2.1.preformatted | |
parent | 099d844292f60f9d58914da29e5773204dc55e7a (diff) | |
bzip2-1.0.3
Diffstat (limited to 'bzip2.1.preformatted')
-rw-r--r-- | bzip2.1.preformatted | 247 |
1 files changed, 124 insertions, 123 deletions
diff --git a/bzip2.1.preformatted b/bzip2.1.preformatted
index 0f20cb5..129ca83 100644
--- a/bzip2.1.preformatted
+++ b/bzip2.1.preformatted
@@ -3,43 +3,43 @@ bzip2(1) bzip2(1) | |||
3 | 3 | ||
4 | 4 | ||
5 | NNAAMMEE | 5 | NNAAMMEE |
6 | bzip2, bunzip2 - a block-sorting file compressor, v1.0.2 | 6 | bzip2, bunzip2 − a block‐sorting file compressor, v1.0.3 |
7 | bzcat - decompresses files to stdout | 7 | bzcat − decompresses files to stdout |
8 | bzip2recover - recovers data from damaged bzip2 files | 8 | bzip2recover − recovers data from damaged bzip2 files |
9 | 9 | ||
10 | 10 | ||
11 | SSYYNNOOPPSSIISS | 11 | SSYYNNOOPPSSIISS |
12 | bbzziipp22 [ --ccddffkkqqssttvvzzVVLL112233445566778899 ] [ _f_i_l_e_n_a_m_e_s _._._. ] | 12 | bbzziipp22 [ −−ccddffkkqqssttvvzzVVLL112233445566778899 ] [ _f_i_l_e_n_a_m_e_s _._._. ] |
13 | bbuunnzziipp22 [ --ffkkvvssVVLL ] [ _f_i_l_e_n_a_m_e_s _._._. ] | 13 | bbuunnzziipp22 [ −−ffkkvvssVVLL ] [ _f_i_l_e_n_a_m_e_s _._._. ] |
14 | bbzzccaatt [ --ss ] [ _f_i_l_e_n_a_m_e_s _._._. ] | 14 | bbzzccaatt [ −−ss ] [ _f_i_l_e_n_a_m_e_s _._._. ] |
15 | bbzziipp22rreeccoovveerr _f_i_l_e_n_a_m_e | 15 | bbzziipp22rreeccoovveerr _f_i_l_e_n_a_m_e |
16 | 16 | ||
17 | 17 | ||
18 | DDEESSCCRRIIPPTTIIOONN | 18 | DDEESSCCRRIIPPTTIIOONN |
19 | _b_z_i_p_2 compresses files using the Burrows-Wheeler block | 19 | _b_z_i_p_2 compresses files using the Burrows‐Wheeler block |
20 | sorting text compression algorithm, and Huffman coding. | 20 | sorting text compression algorithm, and Huffman coding. |
21 | Compression is generally considerably better than that | 21 | Compression is generally considerably better than that |
22 | achieved by more conventional LZ77/LZ78-based compressors, | 22 | achieved by more conventional LZ77/LZ78‐based compressors, |
23 | and approaches the performance of the PPM family of sta | 23 | and approaches the performance of the PPM family of sta |
24 | tistical compressors. | 24 | tistical compressors. |
25 | 25 | ||
26 | The command-line options are deliberately very similar to | 26 | The command‐line options are deliberately very similar to |
27 | those of _G_N_U _g_z_i_p_, but they are not identical. | 27 | those of _G_N_U _g_z_i_p_, but they are not identical. |
28 | 28 | ||
29 | _b_z_i_p_2 expects a list of file names to accompany the com | 29 | _b_z_i_p_2 expects a list of file names to accompany the com |
30 | mand-line flags. Each file is replaced by a compressed | 30 | mand‐line flags. Each file is replaced by a compressed |
31 | version of itself, with the name "original_name.bz2". | 31 | version of itself, with the name "original_name.bz2". |
32 | Each compressed file has the same modification date, per | 32 | Each compressed file has the same modification date, per |
33 | missions, and, when possible, ownership as the correspond | 33 | missions, and, when possible, ownership as the correspond |
34 | ing original, so that these properties can be correctly | 34 | ing original, so that these properties can be correctly |
35 | restored at decompression time. File name handling is | 35 | restored at decompression time. File name handling is |
36 | naive in the sense that there is no mechanism for preserv | 36 | naive in the sense that there is no mechanism for preserv |
37 | ing original file names, permissions, ownerships or dates | 37 | ing original file names, permissions, ownerships or dates |
38 | in filesystems which lack these concepts, or have serious | 38 | in filesystems which lack these concepts, or have serious |
39 | file name length restrictions, such as MS-DOS. | 39 | file name length restrictions, such as MS‐DOS. |
40 | 40 | ||
41 | _b_z_i_p_2 and _b_u_n_z_i_p_2 will by default not overwrite existing | 41 | _b_z_i_p_2 and _b_u_n_z_i_p_2 will by default not overwrite existing |
42 | files. If you want this to happen, specify the -f flag. | 42 | files. If you want this to happen, specify the −f flag. |
43 | 43 | ||
44 | If no file names are specified, _b_z_i_p_2 compresses from | 44 | If no file names are specified, _b_z_i_p_2 compresses from |
45 | standard input to standard output. In this case, _b_z_i_p_2 | 45 | standard input to standard output. In this case, _b_z_i_p_2 |
@@ -47,7 +47,7 @@ DDEESSCCRRIIPPTTIIOONN | |||
47 | this would be entirely incomprehensible and therefore | 47 | this would be entirely incomprehensible and therefore |
48 | pointless. | 48 | pointless. |
49 | 49 | ||
50 | _b_u_n_z_i_p_2 (or _b_z_i_p_2 _-_d_) decompresses all specified files. | 50 | _b_u_n_z_i_p_2 (or _b_z_i_p_2 _−_d_) decompresses all specified files. |
51 | Files which were not created by _b_z_i_p_2 will be detected and | 51 | Files which were not created by _b_z_i_p_2 will be detected and |
52 | ignored, and a warning issued. _b_z_i_p_2 attempts to guess | 52 | ignored, and a warning issued. _b_z_i_p_2 attempts to guess |
53 | the filename for the decompressed file from that of the | 53 | the filename for the decompressed file from that of the |
@@ -64,26 +64,26 @@ DDEESSCCRRIIPPTTIIOONN | |||
64 | guess the name of the original file, and uses the original | 64 | guess the name of the original file, and uses the original |
65 | name with _._o_u_t appended. | 65 | name with _._o_u_t appended. |
66 | 66 | ||
67 | As with compression, supplying no filenames causes decom | 67 | As with compression, supplying no filenames causes decom |
68 | pression from standard input to standard output. | 68 | pression from standard input to standard output. |
69 | 69 | ||
70 | _b_u_n_z_i_p_2 will correctly decompress a file which is the con | 70 | _b_u_n_z_i_p_2 will correctly decompress a file which is the con |
71 | catenation of two or more compressed files. The result is | 71 | catenation of two or more compressed files. The result is |
72 | the concatenation of the corresponding uncompressed files. | 72 | the concatenation of the corresponding uncompressed files. |
73 | Integrity testing (-t) of concatenated compressed files is | 73 | Integrity testing (−t) of concatenated compressed files is |
74 | also supported. | 74 | also supported. |
75 | 75 | ||
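Editorial note, not part of the commit: the concatenation behaviour described above can be checked with a minimal sketch. It assumes Python's standard `bz2` module (which wraps libbzip2) rather than the `bunzip2` binary itself; the sample payloads are made up for illustration.

```python
import bz2

# Two independent payloads, each compressed into its own complete bzip2 stream.
first = b"alpha\n" * 1000
second = b"beta\n" * 1000

# Joining the two compressed streams mimics concatenating two .bz2 files.
concatenated = bz2.compress(first) + bz2.compress(second)

# bz2.decompress() accepts multi-stream input and returns the
# concatenation of the corresponding uncompressed data.
restored = bz2.decompress(concatenated)
print(restored == first + second)
```

The result is the two original payloads back to back, which is what the man page promises for concatenated compressed files.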
76 | You can also compress or decompress files to the standard | 76 | You can also compress or decompress files to the standard |
77 | output by giving the -c flag. Multiple files may be com | 77 | output by giving the −c flag. Multiple files may be com |
78 | pressed and decompressed like this. The resulting outputs | 78 | pressed and decompressed like this. The resulting outputs |
79 | are fed sequentially to stdout. Compression of multiple | 79 | are fed sequentially to stdout. Compression of multiple |
80 | files in this manner generates a stream containing multi | 80 | files in this manner generates a stream containing multi |
81 | ple compressed file representations. Such a stream can be | 81 | ple compressed file representations. Such a stream can be |
82 | decompressed correctly only by _b_z_i_p_2 version 0.9.0 or | 82 | decompressed correctly only by _b_z_i_p_2 version 0.9.0 or |
83 | later. Earlier versions of _b_z_i_p_2 will stop after decom | 83 | later. Earlier versions of _b_z_i_p_2 will stop after decom |
84 | pressing the first file in the stream. | 84 | pressing the first file in the stream. |
85 | 85 | ||
86 | _b_z_c_a_t (or _b_z_i_p_2 _-_d_c_) decompresses all specified files to | 86 | _b_z_c_a_t (or _b_z_i_p_2 _‐_d_c_) decompresses all specified files to |
87 | the standard output. | 87 | the standard output. |
88 | 88 | ||
89 | _b_z_i_p_2 will read arguments from the environment variables | 89 | _b_z_i_p_2 will read arguments from the environment variables |
@@ -99,15 +99,15 @@ DDEESSCCRRIIPPTTIIOONN | |||
99 | most file compressors) is coded at about 8.05 bits per | 99 | most file compressors) is coded at about 8.05 bits per |
100 | byte, giving an expansion of around 0.5%. | 100 | byte, giving an expansion of around 0.5%. |
101 | 101 | ||
102 | As a self-check for your protection, _b_z_i_p_2 uses 32-bit | 102 | As a self‐check for your protection, _b_z_i_p_2 uses 32‐bit |
103 | CRCs to make sure that the decompressed version of a file | 103 | CRCs to make sure that the decompressed version of a file |
104 | is identical to the original. This guards against corrup | 104 | is identical to the original. This guards against corrup |
105 | tion of the compressed data, and against undetected bugs | 105 | tion of the compressed data, and against undetected bugs |
106 | in _b_z_i_p_2 (hopefully very unlikely). The chances of data | 106 | in _b_z_i_p_2 (hopefully very unlikely). The chances of data |
107 | corruption going undetected is microscopic, about one | 107 | corruption going undetected is microscopic, about one |
108 | chance in four billion for each file processed. Be aware, | 108 | chance in four billion for each file processed. Be aware, |
109 | though, that the check occurs upon decompression, so it | 109 | though, that the check occurs upon decompression, so it |
110 | can only tell you that something is wrong. It can't help | 110 | can only tell you that something is wrong. It can’t help |
111 | you recover the original uncompressed data. You can use | 111 | you recover the original uncompressed data. You can use |
112 | _b_z_i_p_2_r_e_c_o_v_e_r to try to recover data from damaged files. | 112 | _b_z_i_p_2_r_e_c_o_v_e_r to try to recover data from damaged files. |
113 | 113 | ||
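Editorial note, not part of the commit: a small sketch of the CRC self-check described above, assuming Python's standard `bz2` module as a stand-in for the command-line tool. The byte offsets are an assumption about the stream layout (a 4-byte `BZh<level>` header, a 48-bit block marker, then the 32-bit block CRC), and the code asserts the first two before relying on them.

```python
import bz2

original = b"The quick brown fox jumps over the lazy dog.\n" * 200
compressed = bytearray(bz2.compress(original))

# Assumed layout for the first block: "BZh" + level digit, then the
# 48-bit block marker, then the stored 32-bit block CRC in bytes 10..13.
assert compressed[:3] == b"BZh"
assert bytes(compressed[4:10]) == bytes.fromhex("314159265359")

# Flip a single bit of the stored CRC; the compressed payload is untouched.
compressed[10] ^= 0x01

try:
    bz2.decompress(bytes(compressed))
    detected = False
except OSError:
    # The mismatch is noticed only once the block has been decompressed.
    detected = True

print(detected)
```

As the text says, the check only reports that something is wrong; nothing in the exception helps reconstruct the original data.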
@@ -118,41 +118,41 @@ DDEESSCCRRIIPPTTIIOONN | |||
118 | 118 | ||
119 | 119 | ||
120 | OOPPTTIIOONNSS | 120 | OOPPTTIIOONNSS |
121 | --cc ----ssttddoouutt | 121 | −−cc ‐‐‐‐ssttddoouutt |
122 | Compress or decompress to standard output. | 122 | Compress or decompress to standard output. |
123 | 123 | ||
124 | --dd ----ddeeccoommpprreessss | 124 | −−dd ‐‐‐‐ddeeccoommpprreessss |
125 | Force decompression. _b_z_i_p_2_, _b_u_n_z_i_p_2 and _b_z_c_a_t are | 125 | Force decompression. _b_z_i_p_2_, _b_u_n_z_i_p_2 and _b_z_c_a_t are |
126 | really the same program, and the decision about | 126 | really the same program, and the decision about |
127 | what actions to take is done on the basis of which | 127 | what actions to take is done on the basis of which |
128 | name is used. This flag overrides that mechanism, | 128 | name is used. This flag overrides that mechanism, |
129 | and forces _b_z_i_p_2 to decompress. | 129 | and forces _b_z_i_p_2 to decompress. |
130 | 130 | ||
131 | --zz ----ccoommpprreessss | 131 | −−zz ‐‐‐‐ccoommpprreessss |
132 | The complement to -d: forces compression, | 132 | The complement to −d: forces compression, |
133 | regardless of the invocation name. | 133 | regardless of the invocation name. |
134 | 134 | ||
135 | --tt ----tteesstt | 135 | −−tt ‐‐‐‐tteesstt |
136 | Check integrity of the specified file(s), but don't | 136 | Check integrity of the specified file(s), but don’t |
137 | decompress them. This really performs a trial | 137 | decompress them. This really performs a trial |
138 | decompression and throws away the result. | 138 | decompression and throws away the result. |
139 | 139 | ||
140 | --ff ----ffoorrccee | 140 | −−ff ‐‐‐‐ffoorrccee |
141 | Force overwrite of output files. Normally, _b_z_i_p_2 | 141 | Force overwrite of output files. Normally, _b_z_i_p_2 |
142 | will not overwrite existing output files. Also | 142 | will not overwrite existing output files. Also |
143 | forces _b_z_i_p_2 to break hard links to files, which it | 143 | forces _b_z_i_p_2 to break hard links to files, which it |
144 | otherwise wouldn't do. | 144 | otherwise wouldn’t do. |
145 | 145 | ||
146 | bzip2 normally declines to decompress files which | 146 | bzip2 normally declines to decompress files which |
147 | don't have the correct magic header bytes. If | 147 | don’t have the correct magic header bytes. If |
148 | forced (-f), however, it will pass such files | 148 | forced (‐f), however, it will pass such files |
149 | through unmodified. This is how GNU gzip behaves. | 149 | through unmodified. This is how GNU gzip behaves. |
150 | 150 | ||
151 | --kk ----kkeeeepp | 151 | −−kk ‐‐‐‐kkeeeepp |
152 | Keep (don't delete) input files during compression | 152 | Keep (don’t delete) input files during compression |
153 | or decompression. | 153 | or decompression. |
154 | 154 | ||
155 | --ss ----ssmmaallll | 155 | −−ss ‐‐‐‐ssmmaallll |
156 | Reduce memory usage, for compression, decompression | 156 | Reduce memory usage, for compression, decompression |
157 | and testing. Files are decompressed and tested | 157 | and testing. Files are decompressed and tested |
158 | using a modified algorithm which only requires 2.5 | 158 | using a modified algorithm which only requires 2.5 |
@@ -160,46 +160,46 @@ OOPPTTIIOONNSS | |||
160 | decompressed in 2300k of memory, albeit at about | 160 | decompressed in 2300k of memory, albeit at about |
161 | half the normal speed. | 161 | half the normal speed. |
162 | 162 | ||
163 | During compression, -s selects a block size of | 163 | During compression, −s selects a block size of |
164 | 200k, which limits memory use to around the same | 164 | 200k, which limits memory use to around the same |
165 | figure, at the expense of your compression ratio. | 165 | figure, at the expense of your compression ratio. |
166 | In short, if your machine is low on memory (8 | 166 | In short, if your machine is low on memory (8 |
167 | megabytes or less), use -s for everything. See | 167 | megabytes or less), use −s for everything. See |
168 | MEMORY MANAGEMENT below. | 168 | MEMORY MANAGEMENT below. |
169 | 169 | ||
170 | --qq ----qquuiieett | 170 | −−qq ‐‐‐‐qquuiieett |
171 | Suppress non-essential warning messages. Messages | 171 | Suppress non‐essential warning messages. Messages |
172 | pertaining to I/O errors and other critical events | 172 | pertaining to I/O errors and other critical events |
173 | will not be suppressed. | 173 | will not be suppressed. |
174 | 174 | ||
175 | --vv ----vveerrbboossee | 175 | −−vv ‐‐‐‐vveerrbboossee |
176 | Verbose mode -- show the compression ratio for each | 176 | Verbose mode ‐‐ show the compression ratio for each |
177 | file processed. Further -v's increase the ver | 177 | file processed. Further −v’s increase the ver |
178 | bosity level, spewing out lots of information which | 178 | bosity level, spewing out lots of information which |
179 | is primarily of interest for diagnostic purposes. | 179 | is primarily of interest for diagnostic purposes. |
180 | 180 | ||
181 | --LL ----lliicceennssee --VV ----vveerrssiioonn | 181 | −−LL ‐‐‐‐lliicceennssee ‐‐VV ‐‐‐‐vveerrssiioonn |
182 | Display the software version, license terms and | 182 | Display the software version, license terms and |
183 | conditions. | 183 | conditions. |
184 | 184 | ||
185 | --11 ((oorr ----ffaasstt)) ttoo --99 ((oorr ----bbeesstt)) | 185 | −−11 ((oorr −−−−ffaasstt)) ttoo −−99 ((oorr −−−−bbeesstt)) |
186 | Set the block size to 100 k, 200 k .. 900 k when | 186 | Set the block size to 100 k, 200 k .. 900 k when |
187 | compressing. Has no effect when decompressing. | 187 | compressing. Has no effect when decompressing. |
188 | See MEMORY MANAGEMENT below. The --fast and --best | 188 | See MEMORY MANAGEMENT below. The −−fast and −−best |
189 | aliases are primarily for GNU gzip compatibility. | 189 | aliases are primarily for GNU gzip compatibility. |
190 | In particular, --fast doesn't make things signifi | 190 | In particular, −−fast doesn’t make things signifi |
191 | cantly faster. And --best merely selects the | 191 | cantly faster. And −−best merely selects the |
192 | default behaviour. | 192 | default behaviour. |
193 | 193 | ||
194 | ---- Treats all subsequent arguments as file names, even | 194 | −−‐‐ Treats all subsequent arguments as file names, even |
195 | if they start with a dash. This is so you can han | 195 | if they start with a dash. This is so you can han |
196 | dle files with names beginning with a dash, for | 196 | dle files with names beginning with a dash, for |
197 | example: bzip2 -- -myfilename. | 197 | example: bzip2 −‐ −myfilename. |
198 | 198 | ||
199 | ----rreeppeettiittiivvee--ffaasstt ----rreeppeettiittiivvee--bbeesstt | 199 | −−‐‐rreeppeettiittiivvee‐‐ffaasstt ‐‐‐‐rreeppeettiittiivvee‐‐bbeesstt |
200 | These flags are redundant in versions 0.9.5 and | 200 | These flags are redundant in versions 0.9.5 and |
201 | above. They provided some coarse control over the | 201 | above. They provided some coarse control over the |
202 | behaviour of the sorting algorithm in earlier ver | 202 | behaviour of the sorting algorithm in earlier ver |
203 | sions, which was sometimes useful. 0.9.5 and above | 203 | sions, which was sometimes useful. 0.9.5 and above |
204 | have an improved algorithm which renders these | 204 | have an improved algorithm which renders these |
205 | flags irrelevant. | 205 | flags irrelevant. |
@@ -209,13 +209,13 @@ MMEEMMOORRYY MMAANNAAGGEEMMEENNTT | |||
209 | _b_z_i_p_2 compresses large files in blocks. The block size | 209 | _b_z_i_p_2 compresses large files in blocks. The block size |
210 | affects both the compression ratio achieved, and the | 210 | affects both the compression ratio achieved, and the |
211 | amount of memory needed for compression and decompression. | 211 | amount of memory needed for compression and decompression. |
212 | The flags -1 through -9 specify the block size to be | 212 | The flags −1 through −9 specify the block size to be |
213 | 100,000 bytes through 900,000 bytes (the default) respec | 213 | 100,000 bytes through 900,000 bytes (the default) respec |
214 | tively. At decompression time, the block size used for | 214 | tively. At decompression time, the block size used for |
215 | compression is read from the header of the compressed | 215 | compression is read from the header of the compressed |
216 | file, and _b_u_n_z_i_p_2 then allocates itself just enough memory | 216 | file, and _b_u_n_z_i_p_2 then allocates itself just enough memory |
217 | to decompress the file. Since block sizes are stored in | 217 | to decompress the file. Since block sizes are stored in |
218 | compressed files, it follows that the flags -1 to -9 are | 218 | compressed files, it follows that the flags −1 to −9 are |
219 | irrelevant to and so ignored during decompression. | 219 | irrelevant to and so ignored during decompression. |
220 | 220 | ||
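Editorial note, not part of the commit: the point that the block size travels in the compressed file's header can be illustrated with a short sketch. It assumes Python's standard `bz2` module, whose `compresslevel` argument (1 to 9) plays the role of the flags -1 to -9; the sample data is made up.

```python
import bz2

data = b"block-sorting file compressor\n" * 5000

for level in (1, 5, 9):
    stream = bz2.compress(data, compresslevel=level)
    # The fourth header byte is the ASCII digit of the level used,
    # i.e. the block size in units of 100,000 bytes.
    recorded = chr(stream[3])
    # Decompression takes no level argument: it reads the header instead.
    assert bz2.decompress(stream) == data
    print(level, bytes(stream[:4]), recorded)
```

Because the decompressor reads the digit from the stream, there is nothing for a -1 to -9 flag to influence at decompression time.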
221 | Compression and decompression requirements, in bytes, can | 221 | Compression and decompression requirements, in bytes, can |
@@ -238,21 +238,21 @@ MMEEMMOORRYY MMAANNAAGGEEMMEENNTT | |||
238 | _b_u_n_z_i_p_2 will require about 3700 kbytes to decompress. To | 238 | _b_u_n_z_i_p_2 will require about 3700 kbytes to decompress. To |
239 | support decompression of any file on a 4 megabyte machine, | 239 | support decompression of any file on a 4 megabyte machine, |
240 | _b_u_n_z_i_p_2 has an option to decompress using approximately | 240 | _b_u_n_z_i_p_2 has an option to decompress using approximately |
241 | half this amount of memory, about 2300 kbytes. Decompres | 241 | half this amount of memory, about 2300 kbytes. Decompres |
242 | sion speed is also halved, so you should use this option | 242 | sion speed is also halved, so you should use this option |
243 | only where necessary. The relevant flag is -s. | 243 | only where necessary. The relevant flag is ‐s. |
244 | 244 | ||
245 | In general, try and use the largest block size memory con | 245 | In general, try and use the largest block size memory con |
246 | straints allow, since that maximises the compression | 246 | straints allow, since that maximises the compression |
247 | achieved. Compression and decompression speed are virtu | 247 | achieved. Compression and decompression speed are virtu |
248 | ally unaffected by block size. | 248 | ally unaffected by block size. |
249 | 249 | ||
250 | Another significant point applies to files which fit in a | 250 | Another significant point applies to files which fit in a |
251 | single block -- that means most files you'd encounter | 251 | single block ‐‐ that means most files you’d encounter |
252 | using a large block size. The amount of real memory | 252 | using a large block size. The amount of real memory |
253 | touched is proportional to the size of the file, since the | 253 | touched is proportional to the size of the file, since the |
254 | file is smaller than a block. For example, compressing a | 254 | file is smaller than a block. For example, compressing a |
255 | file 20,000 bytes long with the flag -9 will cause the | 255 | file 20,000 bytes long with the flag ‐9 will cause the |
256 | compressor to allocate around 7600k of memory, but only | 256 | compressor to allocate around 7600k of memory, but only |
257 | touch 400k + 20000 * 8 = 560 kbytes of it. Similarly, the | 257 | touch 400k + 20000 * 8 = 560 kbytes of it. Similarly, the |
258 | decompressor will allocate 3700k but only touch 100k + | 258 | decompressor will allocate 3700k but only touch 100k + |
@@ -260,59 +260,59 @@ MMEEMMOORRYY MMAANNAAGGEEMMEENNTT | |||
260 | 260 | ||
261 | Here is a table which summarises the maximum memory usage | 261 | Here is a table which summarises the maximum memory usage |
262 | for different block sizes. Also recorded is the total | 262 | for different block sizes. Also recorded is the total |
263 | compressed size for 14 files of the Calgary Text Compres | 263 | compressed size for 14 files of the Calgary Text Compres |
264 | sion Corpus totalling 3,141,622 bytes. This column gives | 264 | sion Corpus totalling 3,141,622 bytes. This column gives |
265 | some feel for how compression varies with block size. | 265 | some feel for how compression varies with block size. |
266 | These figures tend to understate the advantage of larger | 266 | These figures tend to understate the advantage of larger |
267 | block sizes for larger files, since the Corpus is domi | 267 | block sizes for larger files, since the Corpus is domi |
268 | nated by smaller files. | 268 | nated by smaller files. |
269 | 269 | ||
270 | Compress Decompress Decompress Corpus | 270 | Compress Decompress Decompress Corpus |
271 | Flag usage usage -s usage Size | 271 | Flag usage usage ‐s usage Size |
272 | 272 | ||
273 | -1 1200k 500k 350k 914704 | 273 | ‐1 1200k 500k 350k 914704 |
274 | -2 2000k 900k 600k 877703 | 274 | ‐2 2000k 900k 600k 877703 |
275 | -3 2800k 1300k 850k 860338 | 275 | ‐3 2800k 1300k 850k 860338 |
276 | -4 3600k 1700k 1100k 846899 | 276 | ‐4 3600k 1700k 1100k 846899 |
277 | -5 4400k 2100k 1350k 845160 | 277 | ‐5 4400k 2100k 1350k 845160 |
278 | -6 5200k 2500k 1600k 838626 | 278 | ‐6 5200k 2500k 1600k 838626 |
279 | -7 6100k 2900k 1850k 834096 | 279 | ‐7 6100k 2900k 1850k 834096 |
280 | -8 6800k 3300k 2100k 828642 | 280 | ‐8 6800k 3300k 2100k 828642 |
281 | -9 7600k 3700k 2350k 828642 | 281 | ‐9 7600k 3700k 2350k 828642 |
282 | 282 | ||
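Editorial note, not part of the commit: the table above is close to a simple linear fit, sketched below. The formulas are an assumption inferred from the table's figures and from the "400k + 20000 * 8" worked example earlier in the page; the function name `memory_estimate_k` is hypothetical. The fit reproduces every row except the compress figure for -7, where it gives 6000k against the table's 6100k.

```python
def memory_estimate_k(level, small=False):
    """Estimate (compress, decompress) memory in units of k for flags -1..-9.

    Assumed fit to the table: 400k + 8 bytes per block byte to compress,
    100k + 4 bytes per block byte to decompress, or 100k + 2.5 bytes per
    block byte when decompressing with -s.
    """
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    block_k = 100 * level                      # block size in k
    compress_k = 400 + 8 * block_k
    if small:
        decompress_k = 100 + (5 * block_k) // 2
    else:
        decompress_k = 100 + 4 * block_k
    return compress_k, decompress_k


for level in range(1, 10):
    compress_k, decompress_k = memory_estimate_k(level)
    _, small_k = memory_estimate_k(level, small=True)
    print(f"-{level}  {compress_k}k  {decompress_k}k  {small_k}k")
```

The `small` argument only changes the decompression estimate here; the separate effect of -s on compression (selecting a 200k block size) is not modelled.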
283 | 283 | ||
284 | RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD FFIILLEESS | 284 | RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD FFIILLEESS |
285 | _b_z_i_p_2 compresses files in blocks, usually 900kbytes long. | 285 | _b_z_i_p_2 compresses files in blocks, usually 900kbytes long. |
286 | Each block is handled independently. If a media or trans | 286 | Each block is handled independently. If a media or trans |
287 | mission error causes a multi-block .bz2 file to become | 287 | mission error causes a multi‐block .bz2 file to become |
288 | damaged, it may be possible to recover data from the | 288 | damaged, it may be possible to recover data from the |
289 | undamaged blocks in the file. | 289 | undamaged blocks in the file. |
290 | 290 | ||
291 | The compressed representation of each block is delimited | 291 | The compressed representation of each block is delimited |
292 | by a 48-bit pattern, which makes it possible to find the | 292 | by a 48‐bit pattern, which makes it possible to find the |
293 | block boundaries with reasonable certainty. Each block | 293 | block boundaries with reasonable certainty. Each block |
294 | also carries its own 32-bit CRC, so damaged blocks can be | 294 | also carries its own 32‐bit CRC, so damaged blocks can be |
295 | distinguished from undamaged ones. | 295 | distinguished from undamaged ones. |
296 | 296 | ||
297 | _b_z_i_p_2_r_e_c_o_v_e_r is a simple program whose purpose is to | 297 | _b_z_i_p_2_r_e_c_o_v_e_r is a simple program whose purpose is to |
298 | search for blocks in .bz2 files, and write each block out | 298 | search for blocks in .bz2 files, and write each block out |
299 | into its own .bz2 file. You can then use _b_z_i_p_2 -t to test | 299 | into its own .bz2 file. You can then use _b_z_i_p_2 −t to test |
300 | the integrity of the resulting files, and decompress those | 300 | the integrity of the resulting files, and decompress those |
301 | which are undamaged. | 301 | which are undamaged. |
302 | 302 | ||
303 | _b_z_i_p_2_r_e_c_o_v_e_r takes a single argument, the name of the dam | 303 | _b_z_i_p_2_r_e_c_o_v_e_r takes a single argument, the name of the dam |
304 | aged file, and writes a number of files | 304 | aged file, and writes a number of files |
305 | "rec00001file.bz2", "rec00002file.bz2", etc, containing | 305 | "rec00001file.bz2", "rec00002file.bz2", etc, containing |
306 | the extracted blocks. The output filenames are | 306 | the extracted blocks. The output filenames are |
307 | designed so that the use of wildcards in subsequent pro | 307 | designed so that the use of wildcards in subsequent pro |
308 | cessing -- for example, "bzip2 -dc rec*file.bz2 > recov | 308 | cessing ‐‐ for example, "bzip2 ‐dc rec*file.bz2 > recov |
309 | ered_data" -- processes the files in the correct order. | 309 | ered_data" ‐‐ processes the files in the correct order. |
310 | 310 | ||
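Editorial note, not part of the commit: the recovery workflow described above (test each extracted piece, then decompress the intact ones in name order) can be sketched as follows. This assumes Python's standard `bz2` module in place of `bzip2 -t` and `bzip2 -dc`; the helper name `salvage` is hypothetical, and the demo writes stand-in files rather than real `bzip2recover` output.

```python
import bz2
import glob
import os
import tempfile


def salvage(pattern):
    """Join the contents of every intact file matching pattern, in sorted name order."""
    recovered = []
    skipped = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as handle:
            raw = handle.read()
        try:
            # A trial decompression doubles as the integrity test.
            recovered.append(bz2.decompress(raw))
        except (OSError, ValueError):
            skipped.append(os.path.basename(path))
    return b"".join(recovered), skipped


# Demo: three stand-in files named the way the man page describes,
# with the second one deliberately damaged.
with tempfile.TemporaryDirectory() as workdir:
    pieces = [b"one\n" * 50, b"two\n" * 50, b"three\n" * 50]
    for number, piece in enumerate(pieces, start=1):
        blob = bytearray(bz2.compress(piece))
        if number == 2:
            blob[10] ^= 0x01  # corrupt one bit of the stored block CRC
        name = os.path.join(workdir, f"rec{number:05d}file.bz2")
        with open(name, "wb") as handle:
            handle.write(blob)
    data, skipped = salvage(os.path.join(workdir, "rec*file.bz2"))

print(skipped)
print(data == pieces[0] + pieces[2])
```

The zero-padded numbers in the file names are what make a plain lexicographic sort (or a shell wildcard) visit the pieces in their original order.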
311 | _b_z_i_p_2_r_e_c_o_v_e_r should be of most use dealing with large .bz2 | 311 | _b_z_i_p_2_r_e_c_o_v_e_r should be of most use dealing with large .bz2 |
312 | files, as these will contain many blocks. It is clearly | 312 | files, as these will contain many blocks. It is clearly |
313 | futile to use it on damaged single-block files, since a | 313 | futile to use it on damaged single‐block files, since a |
314 | damaged block cannot be recovered. If you wish to min | 314 | damaged block cannot be recovered. If you wish to min |
315 | imise any potential data loss through media or transmis | 315 | imise any potential data loss through media or transmis |
316 | sion errors, you might consider compressing with a smaller | 316 | sion errors, you might consider compressing with a smaller |
317 | block size. | 317 | block size. |
318 | 318 | ||
@@ -324,21 +324,21 @@ PPEERRFFOORRMMAANNCCEE NNOOTTEESS | |||
324 | ..." (repeated several hundred times) may compress more | 324 | ..." (repeated several hundred times) may compress more |
325 | slowly than normal. Versions 0.9.5 and above fare much | 325 | slowly than normal. Versions 0.9.5 and above fare much |
326 | better than previous versions in this respect. The ratio | 326 | better than previous versions in this respect. The ratio |
327 | between worst-case and average-case compression time is in | 327 | between worst‐case and average‐case compression time is in |
328 | the region of 10:1. For previous versions, this figure | 328 | the region of 10:1. For previous versions, this figure |
329 | was more like 100:1. You can use the -vvvv option to mon | 329 | was more like 100:1. You can use the −vvvv option to mon |
330 | itor progress in great detail, if you want. | 330 | itor progress in great detail, if you want. |
331 | 331 | ||
332 | Decompression speed is unaffected by these phenomena. | 332 | Decompression speed is unaffected by these phenomena. |
333 | 333 | ||
334 | _b_z_i_p_2 usually allocates several megabytes of memory to | 334 | _b_z_i_p_2 usually allocates several megabytes of memory to |
335 | operate in, and then charges all over it in a fairly ran | 335 | operate in, and then charges all over it in a fairly ran |
336 | dom fashion. This means that performance, both for com | 336 | dom fashion. This means that performance, both for com |
337 | pressing and decompressing, is largely determined by the | 337 | pressing and decompressing, is largely determined by the |
338 | speed at which your machine can service cache misses. | 338 | speed at which your machine can service cache misses. |
339 | Because of this, small changes to the code to reduce the | 339 | Because of this, small changes to the code to reduce the |
340 | miss rate have been observed to give disproportionately | 340 | miss rate have been observed to give disproportionately |
341 | large performance improvements. I imagine _b_z_i_p_2 will per | 341 | large performance improvements. I imagine _b_z_i_p_2 will per |
342 | form best on machines with very large caches. | 342 | form best on machines with very large caches. |
343 | 343 | ||
344 | 344 | ||
@@ -348,50 +348,51 @@ CCAAVVEEAATTSS | |||
348 | but the details of what the problem is sometimes seem | 348 | but the details of what the problem is sometimes seem |
349 | rather misleading. | 349 | rather misleading. |
350 | 350 | ||
351 | This manual page pertains to version 1.0.2 of _b_z_i_p_2_. Com | 351 | This manual page pertains to version 1.0.3 of _b_z_i_p_2_. Com |
352 | pressed data created by this version is entirely forwards | 352 | pressed data created by this version is entirely forwards |
353 | and backwards compatible with the previous public | 353 | and backwards compatible with the previous public |
354 | releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1, | 354 | releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0, 1.0.1 and |
355 | but with the following exception: 0.9.0 and above can cor | 355 | 1.0.2, but with the following exception: 0.9.0 and above |
356 | rectly decompress multiple concatenated compressed files. | 356 | can correctly decompress multiple concatenated compressed |
357 | 0.1pl2 cannot do this; it will stop after decompressing | 357 | files. 0.1pl2 cannot do this; it will stop after decom |
358 | just the first file in the stream. | 358 | pressing just the first file in the stream. |
359 | 359 | ||
360 | _b_z_i_p_2_r_e_c_o_v_e_r versions prior to this one, 1.0.2, used | 360 | _b_z_i_p_2_r_e_c_o_v_e_r versions prior to 1.0.2 used 32‐bit integers |
361 | 32-bit integers to represent bit positions in compressed | 361 | to represent bit positions in compressed files, so they |
362 | files, so it could not handle compressed files more than | 362 | could not handle compressed files more than 512 megabytes |
363 | 512 megabytes long. Version 1.0.2 and above uses 64-bit | 363 | long. Versions 1.0.2 and above use 64‐bit ints on some |
364 | ints on some platforms which support them (GNU supported | 364 | platforms which support them (GNU supported targets, and |
365 | targets, and Windows). To establish whether or not | 365 | Windows). To establish whether or not bzip2recover was |
366 | bzip2recover was built with such a limitation, run it | 366 | built with such a limitation, run it without arguments. |
367 | without arguments. In any event you can build yourself an | 367 | In any event you can build yourself an unlimited version |
368 | unlimited version if you can recompile it with MaybeUInt64 | 368 | if you can recompile it with MaybeUInt64 set to be an |
369 | set to be an unsigned 64-bit integer. | 369 | unsigned 64‐bit integer. |
370 | 370 | ||
371 | 371 | ||
372 | 372 | ||
373 | 373 | ||
374 | AAUUTTHHOORR | 374 | AAUUTTHHOORR |
375 | Julian Seward, jseward@acm.org. | 375 | Julian Seward, jsewardbzip.org. |
376 | 376 | ||
377 | http://sources.redhat.com/bzip2 | 377 | http://www.bzip.org |
378 | 378 | ||
379 | The ideas embodied in _b_z_i_p_2 are due to (at least) the fol | 379 | The ideas embodied in _b_z_i_p_2 are due to (at least) the fol |
380 | lowing people: Michael Burrows and David Wheeler (for the | 380 | lowing people: Michael Burrows and David Wheeler (for the |
381 | block sorting transformation), David Wheeler (again, for | 381 | block sorting transformation), David Wheeler (again, for |
382 | the Huffman coder), Peter Fenwick (for the structured cod | 382 | the Huffman coder), Peter Fenwick (for the structured cod |
383 | ing model in the original _b_z_i_p_, and many refinements), and | 383 | ing model in the original _b_z_i_p_, and many refinements), and |
384 | Alistair Moffat, Radford Neal and Ian Witten (for the | 384 | Alistair Moffat, Radford Neal and Ian Witten (for the |
385 | arithmetic coder in the original _b_z_i_p_)_. I am much | 385 | arithmetic coder in the original _b_z_i_p_)_. I am much |
386 | indebted for their help, support and advice. See the man | 386 | indebted for their help, support and advice. See the man |
387 | ual in the source distribution for pointers to sources of | 387 | ual in the source distribution for pointers to sources of |
388 | documentation. Christian von Roques encouraged me to look | 388 | documentation. Christian von Roques encouraged me to look |
389 | for faster sorting algorithms, so as to speed up compres | 389 | for faster sorting algorithms, so as to speed up compres |
390 | sion. Bela Lubkin encouraged me to improve the worst-case | 390 | sion. Bela Lubkin encouraged me to improve the worst‐case |
391 | compression performance. The bz* scripts are derived from | 391 | compression performance. Donna Robinson XMLised the docu |
392 | those of GNU gzip. Many people sent patches, helped with | 392 | mentation. The bz* scripts are derived from those of GNU |
393 | portability problems, lent machines, gave advice and were | 393 | gzip. Many people sent patches, helped with portability |
394 | generally helpful. | 394 | problems, lent machines, gave advice and were generally |
395 | helpful. | ||
395 | 396 | ||
396 | 397 | ||
397 | 398 | ||