Diffstat (limited to 'README')
-rw-r--r-- README 230
1 files changed, 57 insertions, 173 deletions
diff --git a/README b/README
index d58bb49..2f59ef7 100644
--- a/README
+++ b/README
@@ -1,194 +1,61 @@
 
-GREETINGS!
 
- This is the README for bzip2, my block-sorting file compressor,
- version 0.1.
+This is the README for bzip2, a block-sorting file compressor, version
+0.9.0. This version is fully compatible with the previous public
+release, bzip2-0.1pl2.
 
- bzip2 is distributed under the GNU General Public License version 2;
- for details, see the file LICENSE. Pointers to the algorithms used
- are in ALGORITHMS. Instructions for use are in bzip2.1.preformatted.
+bzip2-0.9.0 is distributed under a BSD-style license. For details,
+see the file LICENSE.
 
- Please read all of this file carefully.
+Complete documentation is available in Postscript form (manual.ps)
+or html (manual_toc.html). A plain-text version of the manual page is
+available as bzip2.txt.
 
 
+HOW TO BUILD -- UNIX
 
-HOW TO BUILD
+Type `make'.
 
- -- for UNIX:
+This creates binaries "bzip2" and "bzip2recover".
 
- Type `make'. (tough, huh? :-)
+It also runs four compress-decompress tests to make sure things are
+working properly. If all goes well, you should be up & running.
+Please be sure to read the output from `make' just to be sure that the
+tests went ok.
 
- This creates binaries "bzip2", and "bunzip2",
- which is a symbolic link to "bzip2".
+To install bzip2 properly:
 
- It also runs four compress-decompress tests to make sure
- things are working properly. If all goes well, you should be up &
- running. Please be sure to read the output from `make'
- just to be sure that the tests went ok.
+* Copy the binaries "bzip2" and "bzip2recover" to a publically visible
+ place, possibly /usr/bin or /usr/local/bin.
 
- To install bzip2 properly:
+* In that directory, make "bunzip2" and "bzcat" be symbolic links
+ to "bzip2".
 
- -- Copy the binary "bzip2" to a publically visible place,
- possibly /usr/bin, /usr/common/bin or /usr/local/bin.
-
- -- In that directory, make "bunzip2" be a symbolic link
- to "bzip2".
-
- -- Copy the manual page, bzip2.1, to the relevant place.
- Probably the right place is /usr/man/man1/.
-
- -- for Windows 95 and NT:
+* Copy the manual page, bzip2.1, to the relevant place.
+ Probably the right place is /usr/man/man1/.
 
- For a start, do you *really* want to recompile bzip2?
- The standard distribution includes a pre-compiled version
- for Windows 95 and NT, `bzip2.exe'.
+If you want to program with the library, you'll need to copy libbz2.a
+and bzlib.h to /usr/lib and /usr/include respectively.
+
 
- This executable was created with Jacob Navia's excellent
- port to Win32 of Chris Fraser & David Hanson's excellent
- ANSI C compiler, "lcc". You can get to it at the pages
- of the CS department of Princeton University,
- www.cs.princeton.edu.
- I have not tried to compile this version of bzip2 with
- a commercial C compiler such as MS Visual C, as I don't
- have one available.
-
- Note that lcc is designed primarily to be portable and
- fast. Code quality is a secondary aim, so bzip2.exe
- runs perhaps 40% slower than it could if compiled with
- a good optimising compiler.
-
- I compiled a previous version of bzip (0.21) with Borland
- C 5.0, which worked fine, and with MS VC++ 2.0, which
- didn't. Here is an comment from the README for bzip-0.21.
-
- MS VC++ 2.0's optimising compiler has a bug which, at
- maximum optimisation, gives an executable which produces
- garbage compressed files. Proceed with caution.
- I do not know whether or not this happens with later
- versions of VC++.
-
- Edit the defines starting at line 86 of bzip.c to
- select your platform/compiler combination, and then compile.
- Then check that the resulting executable (assumed to be
- called bzip.exe) works correctly, using the SELFTEST.BAT file.
- Bearing in mind the previous paragraph, the self-test is
- important.
-
- Note that the defines which bzip-0.21 had, to support
- compilation with VC 2.0 and BC 5.0, are gone. Windows
- is not my preferred operating system, and I am, for the
- moment, content with the modestly fast executable created
- by lcc-win32.
-
- A manual page is supplied, unformatted (bzip2.1),
- preformatted (bzip2.1.preformatted), and preformatted
- and sanitised for MS-DOS (bzip2.txt).
-
-
-
-COMPILATION NOTES
-
- bzip2 should work on any 32 or 64-bit machine. It is known to work
- [meaning: it has compiled and passed self-tests] on the
- following platform-os combinations:
-
- Intel i386/i486 running Linux 2.0.21
- Sun Sparcs (various) running SunOS 4.1.4 and Solaris 2.5
- Intel i386/i486 running Windows 95 and NT
- DEC Alpha running Digital Unix 4.0
-
- Following the release of bzip-0.21, many people mailed me
- from around the world to say they had made it work on all sorts
- of weird and wonderful machines. Chances are, if you have
- a reasonable ANSI C compiler and a 32-bit machine, you can
- get it to work.
-
- The #defines starting at around line 82 of bzip2.c supply some
- degree of platform-independance. If you configure bzip2 for some
- new far-out platform which is not covered by the existing definitions,
- please send me the relevant definitions.
-
- I recommend GNU C for compilation. The code is standard ANSI C,
- except for the Unix-specific file handling, so any ANSI C compiler
- should work. Note however that the many routines marked INLINE
- should be inlined by your compiler, else performance will be very
- poor. Asking your compiler to unroll loops gives some
- small improvement too; for gcc, the relevant flag is
- -funroll-loops.
-
- On a 386/486 machines, I'd recommend giving gcc the
- -fomit-frame-pointer flag; this liberates another register for
- allocation, which measurably improves performance.
-
- I used the abovementioned lcc compiler to develop bzip2.
- I would highly recommend this compiler for day-to-day development;
- it is fast, reliable, lightweight, has an excellent profiler,
- and is generally excellent. And it's fun to retarget, if you're
- into that kind of thing.
-
- If you compile bzip2 on a new platform or with a new compiler,
- please be sure to run the four compress-decompress tests, either
- using the Makefile, or with the test.bat (MSDOS) or test.cmd (OS/2)
- files. Some compilers have been seen to introduce subtle bugs
- when optimising, so this check is important. Ideally you should
- then go on to test bzip2 on a file several megabytes or even
- tens of megabytes long, just to be 110% sure. ``Professional
- programmers are paranoid programmers.'' (anon).
+HOW TO BUILD -- Windows 95, NT, DOS, Mac, etc.
 
+It's difficult for me to support compilation on all these platforms.
+My approach is to collect binaries for these platforms, and put them
+on my web page (http://www.muraroa.demon.co.uk). Look there.
 
 
 VALIDATION
 
- Correct operation, in the sense that a compressed file can always be
- decompressed to reproduce the original, is obviously of paramount
- importance. To validate bzip2, I used a modified version of
- Mark Nelson's churn program. Churn is an automated test driver
- which recursively traverses a directory structure, using bzip2 to
- compress and then decompress each file it encounters, and checking
- that the decompressed data is the same as the original. As test
- material, I used several runs over several filesystems of differing
- sizes.
-
- One set of tests was done on my base Linux filesystem,
- 410 megabytes in 23,000 files. There were several runs over
- this filesystem, in various configurations designed to break bzip2.
- That filesystem also contained some specially constructed test
- files designed to exercise boundary cases in the code.
- This included files of zero length, various long, highly repetitive
- files, and some files which generate blocks with all values the same.
+Correct operation, in the sense that a compressed file can always be
+decompressed to reproduce the original, is obviously of paramount
+importance. To validate bzip2, I used a modified version of Mark
+Nelson's churn program. Churn is an automated test driver which
+recursively traverses a directory structure, using bzip2 to compress
+and then decompress each file it encounters, and checking that the
+decompressed data is the same as the original. There are more details
+in Section 4 of the user guide.
 
- The other set of tests was done just with the "normal" configuration,
- but on a much larger quantity of data.
-
- Tests are:
-
- Linux FS, 410M, 23000 files
-
- As above, with --repetitive-fast
-
- As above, with -1
-
- Low level disk image of a disk containing
- Windows NT4.0; 420M in a single huge file
-
- Linux distribution, incl Slackware,
- all GNU sources. 1900M in 2300 files.
-
- Approx ~100M compiler sources and related
- programming tools, running under Purify.
-
- About 500M of data in 120 files of around
- 4 M each. This is raw data from a
- biomagnetometer (SQUID-based thing).
-
- Overall, total volume of test data is about
- 3300 megabytes in 25000 files.
-
- The distribution does four tests after building bzip. These tests
- include test decompressions of pre-supplied compressed files, so
- they not only test that bzip works correctly on the machine it was
- built on, but can also decompress files compressed on a different
- machine. This guards against unforseen interoperability problems.
 
 
 Please read and be aware of the following:
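Reviewer note on the VALIDATION text in this hunk: the churn idea (walk a directory tree, compress and decompress each file, compare with the original) can be sketched in a few lines. This is not Mark Nelson's churn program, nor the author's modified version of it. It is a minimal illustration that assumes Python's standard `bz2` module as a stand-in for the bzip2 binary, and the sample file names and contents are hypothetical, chosen to echo the boundary cases the removed text mentions (zero-length and highly repetitive files).

```python
import bz2
import os
import tempfile


def churn(root):
    """Round-trip every regular file under `root`; return the paths that fail."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and anything that is not a regular file
            with open(path, "rb") as handle:
                original = handle.read()  # whole file in memory: fine for a sketch
            restored = bz2.decompress(bz2.compress(original))
            if restored != original:
                failures.append(path)
    return failures


# Hypothetical test material (names and contents are assumptions for this sketch).
samples = {
    "empty": b"",
    "repetitive": b"abc" * 100_000,
    "all_same": b"\x00" * 50_000,
}

with tempfile.TemporaryDirectory() as root:
    for name, data in samples.items():
        with open(os.path.join(root, name), "wb") as handle:
            handle.write(data)
    failures = churn(root)

print("failures:", failures)
```

A real driver would stream large files rather than read them whole, and would invoke the freshly built binary so that compiler-introduced bugs are actually exercised.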
@@ -234,14 +101,30 @@ PATENTS:
 End of legalities.
 
 
+WHAT'S NEW IN 0.9.0 (as compared to 0.1pl2) ?
+
+ * Approx 10% faster compression, 30% faster decompression
+ * -t (test mode) is a lot quicker
+ * Can decompress concatenated compressed files
+ * Programming interface, so programs can directly read/write .bz2 files
+ * Less restrictive (BSD-style) licensing
+ * Flag handling more compatible with GNU gzip
+ * Much more documentation, i.e., a proper user manual
+ * Hopefully, improved portability (at least of the library)
+
+
 I hope you find bzip2 useful. Feel free to contact me at
  jseward@acm.org
 if you have any suggestions or queries. Many people mailed me with
-comments, suggestions and patches after the releases of 0.15 and 0.21,
-and the changes in bzip2 are largely a result of this feedback.
-I thank you for your comments.
+comments, suggestions and patches after the releases of bzip-0.15,
+bzip-0.21 and bzip2-0.1pl2, and the changes in bzip2 are largely a
+result of this feedback. I thank you for your comments.
+
+At least for the time being, bzip2's "home" is
+http://www.muraroa.demon.co.uk.
 
 Julian Seward
+jseward@acm.org
 
 Manchester, UK
 18 July 1996 (version 0.15)
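Reviewer note on the "Can decompress concatenated compressed files" item in the WHAT'S NEW list of this hunk: the behaviour is that two independently compressed streams, joined byte for byte, decompress to the concatenation of their inputs. The sketch below shows this with Python's standard `bz2` module, which is an assumption made for illustration; it does not exercise the bzip2 0.9.0 command-line tool itself, and the sample byte strings are made up.

```python
import bz2

# Two independently compressed streams, as if produced by two separate runs.
first = bz2.compress(b"hello, ")
second = bz2.compress(b"world\n")

# Joining the streams byte for byte, much like `cat a.bz2 b.bz2 > both.bz2`.
combined = first + second

# bz2.decompress handles multi-stream input (Python 3.3 and later).
restored = bz2.decompress(combined)
print(restored)
```

This is what makes it possible to append to a compressed archive without recompressing what is already there.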
@@ -250,4 +133,5 @@ Manchester, UK
 Guildford, Surrey, UK
 7 August 1997 (bzip2, version 0.1)
 29 August 1997 (bzip2, version 0.1pl2)
+23 August 1998 (bzip2, version 0.9.0)
 