diff options
Diffstat (limited to '')
-rw-r--r-- | algorithm.txt | 34 |
1 files changed, 15 insertions, 19 deletions
diff --git a/algorithm.txt b/algorithm.txt index cdc830b..f64f7c3 100644 --- a/algorithm.txt +++ b/algorithm.txt | |||
@@ -59,10 +59,10 @@ but saves time since there are both fewer insertions and fewer searches. | |||
59 | 59 | ||
60 | 2.1 Introduction | 60 | 2.1 Introduction |
61 | 61 | ||
62 | The real question is, given a Huffman tree, how to decode fast. The most | 62 | The key question is how to represent a Huffman code (or any prefix code) so |
63 | important realization is that shorter codes are much more common than | 63 | that you can decode fast. The most important characteristic is that shorter |
64 | longer codes, so pay attention to decoding the short codes fast, and let | 64 | codes are much more common than longer codes, so pay attention to decoding the |
65 | the long codes take longer to decode. | 65 | short codes fast, and let the long codes take longer to decode. |
66 | 66 | ||
67 | inflate() sets up a first level table that covers some number of bits of | 67 | inflate() sets up a first level table that covers some number of bits of |
68 | input less than the length of longest code. It gets that many bits from the | 68 | input less than the length of longest code. It gets that many bits from the |
@@ -77,20 +77,16 @@ table took no time (and if you had infinite memory), then there would only | |||
77 | be a first level table to cover all the way to the longest code. However, | 77 | be a first level table to cover all the way to the longest code. However, |
78 | building the table ends up taking a lot longer for more bits since short | 78 | building the table ends up taking a lot longer for more bits since short |
79 | codes are replicated many times in such a table. What inflate() does is | 79 | codes are replicated many times in such a table. What inflate() does is |
80 | simply to make the number of bits in the first table a variable, and set it | 80 | simply to make the number of bits in the first table a variable, and then |
81 | for the maximum speed. | 81 | to set that variable for the maximum speed. |
82 | 82 | ||
83 | inflate() sends new trees relatively often, so it is possibly set for a | 83 | For inflate, which has 286 possible codes for the literal/length tree, the size |
84 | smaller first level table than an application that has only one tree for | 84 | of the first table is nine bits. Also the distance trees have 30 possible |
85 | all the data. For inflate, which has 286 possible codes for the | 85 | values, and the size of the first table is six bits. Note that for each of |
86 | literal/length tree, the size of the first table is nine bits. Also the | 86 | those cases, the table ended up one bit longer than the ``average'' code |
87 | distance trees have 30 possible values, and the size of the first table is | 87 | length, i.e. the code length of an approximately flat code which would be a |
88 | six bits. Note that for each of those cases, the table ended up one bit | 88 | little more than eight bits for 286 symbols and a little less than five bits |
89 | longer than the ``average'' code length, i.e. the code length of an | 89 | for 30 symbols. |
90 | approximately flat code which would be a little more than eight bits for | ||
91 | 286 symbols and a little less than five bits for 30 symbols. It would be | ||
92 | interesting to see if optimizing the first level table for other | ||
93 | applications gave values within a bit or two of the flat code size. | ||
94 | 90 | ||
95 | 91 | ||
96 | 2.2 More details on the inflate table lookup | 92 | 2.2 More details on the inflate table lookup |
@@ -158,7 +154,7 @@ Let's make the first table three bits long (eight entries): | |||
158 | 110: -> table X (gobble 3 bits) | 154 | 110: -> table X (gobble 3 bits) |
159 | 111: -> table Y (gobble 3 bits) | 155 | 111: -> table Y (gobble 3 bits) |
160 | 156 | ||
161 | Each entry is what the bits decode to and how many bits that is, i.e. how | 157 | Each entry is what the bits decode as and how many bits that is, i.e. how |
162 | many bits to gobble. Or the entry points to another table, with the number of | 158 | many bits to gobble. Or the entry points to another table, with the number of |
163 | bits to gobble implicit in the size of the table. | 159 | bits to gobble implicit in the size of the table. |
164 | 160 | ||