| author | Mike Pall <mike> | 2021-06-07 12:03:22 +0200 |
|---|---|---|
| committer | Mike Pall <mike> | 2021-06-07 12:03:22 +0200 |
| commit | ac02a120ef249aac37b4847705a3099bd4b92967 (patch) | |
| tree | ce8dde84c0cf6017752dd605088dc80f8626ea1a /doc/ext_buffer.html | |
| parent | 4216bdfb2a18b213d226da26361417c537c36743 (diff) | |
String buffers, part 2e: add serialization string dictionary.
Sponsored by fmad.io.
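
As a rough illustration of what the dictionary option is meant to achieve, the sketch below encodes the same table with and without a `dict` and compares the resulting lengths. It assumes a LuaJIT build that ships the string buffer library, loaded via `require("string.buffer")`. The key names and the sample object are hypothetical, and only methods that appear in the documentation diff below are used (`buffer.new`, `reset`, `encode`, `get`, `set`, `decode`).

```lua
-- Minimal sketch, not part of the commit. Requires a LuaJIT build that
-- includes the string buffer library.
local buffer = require("string.buffer")

-- Hypothetical key names, chosen only for this illustration.
local options = { dict = { "name", "email", "created_at" } }

local buf_plain = buffer.new()        -- no dictionary
local buf_dict  = buffer.new(options) -- keys listed in dict are encoded as indexes
local buf_dec   = buffer.new(options) -- the decoder needs the same dict

-- Hypothetical sample object whose keys all appear in the dictionary.
local obj = { name = "Ann", email = "ann@example.com", created_at = 1623060202 }

local plain = buf_plain:reset():encode(obj):get()
local short = buf_dict:reset():encode(obj):get()
local back  = buf_dec:set(short):decode()

-- Encoded lengths in bytes, without and with the dictionary.
print(#plain, #short)
-- Fields recovered by the dictionary-aware decoder.
print(back.name, back.email, back.created_at)
```

Decoding `short` with a buffer created without the same `dict` is not expected to work, since the documentation below requires the encoder and decoder dictionaries to match.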
Diffstat (limited to 'doc/ext_buffer.html')
| -rw-r--r-- | doc/ext_buffer.html | 70 |
1 files changed, 63 insertions, 7 deletions
diff --git a/doc/ext_buffer.html b/doc/ext_buffer.html
index 94af757d..2443fc90 100644
--- a/doc/ext_buffer.html
+++ b/doc/ext_buffer.html
| @@ -175,14 +175,19 @@ object itself as a convenience. This allows method chaining, e.g.: | |||
| 175 | 175 | ||
| 176 | <h2 id="create">Buffer Creation and Management</h2> | 176 | <h2 id="create">Buffer Creation and Management</h2> |
| 177 | 177 | ||
| 178 | <h3 id="buffer_new"><tt>local buf = buffer.new([size])</tt></h3> | 178 | <h3 id="buffer_new"><tt>local buf = buffer.new([size [,options]])<br> |
| 179 | local buf = buffer.new([options])</tt></h3> | ||
| 179 | <p> | 180 | <p> |
| 180 | Creates a new buffer object. | 181 | Creates a new buffer object. |
| 181 | </p> | 182 | </p> |
| 182 | <p> | 183 | <p> |
| 183 | The optional <tt>size</tt> argument ensures a minimum initial buffer | 184 | The optional <tt>size</tt> argument ensures a minimum initial buffer |
| 184 | size. This is strictly an optimization for cases where the required | 185 | size. This is strictly an optimization when the required buffer size is |
| 185 | buffer size is known beforehand. | 186 | known beforehand. The buffer space will grow as needed, in any case. |
| 187 | </p> | ||
| 188 | <p> | ||
| 189 | The optional table <tt>options</tt> sets various | ||
| 190 | <a href="#serialize_options">serialization options</a>. | ||
| 186 | </p> | 191 | </p> |
| 187 | 192 | ||
| 188 | <h3 id="buffer_reset"><tt>buf = buf:reset()</tt></h3> | 193 | <h3 id="buffer_reset"><tt>buf = buf:reset()</tt></h3> |
| @@ -205,7 +210,7 @@ immediately. | |||
| 205 | 210 | ||
| 206 | <h2 id="write">Buffer Writers</h2> | 211 | <h2 id="write">Buffer Writers</h2> |
| 207 | 212 | ||
| 208 | <h3 id="buffer_put"><tt>buf = buf:put([str|num|obj] [, ...])</tt></h3> | 213 | <h3 id="buffer_put"><tt>buf = buf:put([str|num|obj] [,…])</tt></h3> |
| 209 | <p> | 214 | <p> |
| 210 | Appends a string <tt>str</tt>, a number <tt>num</tt> or any object | 215 | Appends a string <tt>str</tt>, a number <tt>num</tt> or any object |
| 211 | <tt>obj</tt> with a <tt>__tostring</tt> metamethod to the buffer. | 216 | <tt>obj</tt> with a <tt>__tostring</tt> metamethod to the buffer. |
| @@ -217,7 +222,7 @@ internally. But it still involves a copy. Better combine the buffer | |||
| 217 | writes to use a single buffer. | 222 | writes to use a single buffer. |
| 218 | </p> | 223 | </p> |
| 219 | 224 | ||
| 220 | <h3 id="buffer_putf"><tt>buf = buf:putf(format, ...)</tt></h3> | 225 | <h3 id="buffer_putf"><tt>buf = buf:putf(format, …)</tt></h3> |
| 221 | <p> | 226 | <p> |
| 222 | Appends the formatted arguments to the buffer. The <tt>format</tt> | 227 | Appends the formatted arguments to the buffer. The <tt>format</tt> |
| 223 | string supports the same options as <tt>string.format()</tt>. | 228 | string supports the same options as <tt>string.format()</tt>. |
| @@ -298,7 +303,7 @@ method, if nothing is added to the buffer (e.g. on error). | |||
| 298 | Returns the current length of the buffer data in bytes. | 303 | Returns the current length of the buffer data in bytes. |
| 299 | </p> | 304 | </p> |
| 300 | 305 | ||
| 301 | <h3 id="buffer_concat"><tt>res = str|num|buf .. str|num|buf [...]</tt></h3> | 306 | <h3 id="buffer_concat"><tt>res = str|num|buf .. str|num|buf […]</tt></h3> |
| 302 | <p> | 307 | <p> |
| 303 | The Lua concatenation operator <tt>..</tt> also accepts buffers, just | 308 | The Lua concatenation operator <tt>..</tt> also accepts buffers, just |
| 304 | like strings or numbers. It always returns a string and not a buffer. | 309 | like strings or numbers. It always returns a string and not a buffer. |
| @@ -319,7 +324,7 @@ Skips (consumes) <tt>len</tt> bytes from the buffer up to the current | |||
| 319 | length of the buffer data. | 324 | length of the buffer data. |
| 320 | </p> | 325 | </p> |
| 321 | 326 | ||
| 322 | <h3 id="buffer_get"><tt>str, ... = buf:get([len|nil] [,...])</tt></h3> | 327 | <h3 id="buffer_get"><tt>str, … = buf:get([len|nil] [,…])</tt></h3> |
| 323 | <p> | 328 | <p> |
| 324 | Consumes the buffer data and returns one or more strings. If called | 329 | Consumes the buffer data and returns one or more strings. If called |
| 325 | without arguments, the whole buffer data is consumed. If called with a | 330 | without arguments, the whole buffer data is consumed. If called with a |
| @@ -444,6 +449,56 @@ data after decoding a single top-level object. The buffer method leaves | |||
| 444 | any left-over data in the buffer. | 449 | any left-over data in the buffer. |
| 445 | </p> | 450 | </p> |
| 446 | 451 | ||
| 452 | <h3 id="serialize_options">Serialization Options</h3> | ||
| 453 | <p> | ||
| 454 | The <tt>options</tt> table passed to <tt>buffer.new()</tt> may contain | ||
| 455 | the following members (all optional): | ||
| 456 | </p> | ||
| 457 | <ul> | ||
| 458 | <li> | ||
| 459 | <tt>dict</tt> is a Lua table holding a <b>dictionary of strings</b> that | ||
| 460 | commonly occur as table keys of objects you are serializing. These keys | ||
| 461 | are compactly encoded as indexes during serialization. A well chosen | ||
| 462 | dictionary saves space and improves serialization performance. | ||
| 463 | </li> | ||
| 464 | </ul> | ||
| 465 | <p> | ||
| 466 | <tt>dict</tt> needs to be an array of strings, starting at index 1 and | ||
| 467 | without holes (no <tt>nil</tt> inbetween). The table is anchored in the | ||
| 468 | buffer object and internally modified into a two-way index (don't do | ||
| 469 | this yourself, just pass a plain array). The table must not be modified | ||
| 470 | after it has been passed to <tt>buffer.new()</tt>. | ||
| 471 | </p> | ||
| 472 | <p> | ||
| 473 | The <tt>dict</tt> tables used by the encoder and decoder must be the | ||
| 474 | same. Put the most common entries at the front. Extend at the end to | ||
| 475 | ensure backwards-compatibility — older encodings can then still be | ||
| 476 | read. You may also set some indexes to <tt>false</tt> to explicitly drop | ||
| 477 | backwards-compatibility. Old encodings that use these indexes will throw | ||
| 478 | an error when decoded. | ||
| 479 | </p> | ||
| 480 | <p> | ||
| 481 | Note: parsing and preparation of the options table is somewhat | ||
| 482 | expensive. Create a buffer object only once and recycle it for multiple | ||
| 483 | uses. Avoid mixing encoder and decoder buffers, since the | ||
| 484 | <tt>buf:set()</tt> method frees the already allocated buffer space: | ||
| 485 | </p> | ||
| 486 | <pre class="code"> | ||
| 487 | local options = { | ||
| 488 | dict = { "commonly", "used", "string", "keys" }, | ||
| 489 | } | ||
| 490 | local buf_enc = buffer.new(options) | ||
| 491 | local buf_dec = buffer.new(options) | ||
| 492 | |||
| 493 | local function encode(obj) | ||
| 494 | return buf_enc:reset():encode(obj):get() | ||
| 495 | end | ||
| 496 | |||
| 497 | local function decode(str) | ||
| 498 | return buf_dec:set(str):decode() | ||
| 499 | end | ||
| 500 | </pre> | ||
| 501 | |||
| 447 | <h3 id="serialize_stream">Streaming Serialization</h3> | 502 | <h3 id="serialize_stream">Streaming Serialization</h3> |
| 448 | <p> | 503 | <p> |
| 449 | In some contexts, it's desirable to do piecewise serialization of large | 504 | In some contexts, it's desirable to do piecewise serialization of large |
| @@ -536,6 +591,7 @@ uint64 → 0x11 uint.L // FFI uint64_t | |||
| 536 | complex → 0x12 re.L im.L // FFI complex | 591 | complex → 0x12 re.L im.L // FFI complex |
| 537 | 592 | ||
| 538 | string → (0x20+len).U len*char.B | 593 | string → (0x20+len).U len*char.B |
| 594 | | 0x0f (index-1).U // Dict entry | ||
| 539 | 595 | ||
| 540 | .B = 8 bit | 596 | .B = 8 bit |
| 541 | .I = 32 bit little-endian | 597 | .I = 32 bit little-endian |
