From ac02a120ef249aac37b4847705a3099bd4b92967 Mon Sep 17 00:00:00 2001
From: Mike Pall
Date: Mon, 7 Jun 2021 12:03:22 +0200
Subject: String buffers, part 2e: add serialization string dictionary.

Sponsored by fmad.io.
---
doc/ext_buffer.html | 70 +++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 63 insertions(+), 7 deletions(-)

(limited to 'doc/ext_buffer.html')

diff --git a/doc/ext_buffer.html b/doc/ext_buffer.html
index 94af757d..2443fc90 100644
--- a/doc/ext_buffer.html
+++ b/doc/ext_buffer.html
@@ -175,14 +175,19 @@ object itself as a convenience. This allows method chaining, e.g.:

Buffer Creation and Management

-

local buf = buffer.new([size])

+

local buf = buffer.new([size [,options]])
+local buf = buffer.new([options])

Creates a new buffer object.

The optional size argument ensures a minimum initial buffer
-size. This is strictly an optimization for cases where the required
-buffer size is known beforehand.
+size. This is strictly an optimization when the required buffer size is
+known beforehand. The buffer space will grow as needed, in any case.
+

+

+The optional table options sets various
+serialization options.

buf = buf:reset()

@@ -205,7 +210,7 @@ immediately.

Buffer Writers

-

buf = buf:put([str|num|obj] [, ...])

+

buf = buf:put([str|num|obj] [,…])

Appends a string str, a number num or any object obj with a __tostring metamethod to the buffer.
@@ -217,7 +222,7 @@ internally. But it still involves a copy. Better combine the buffer writes to use a single buffer.

-

buf = buf:putf(format, ...)

+

buf = buf:putf(format, …)

Appends the formatted arguments to the buffer. The format string supports the same options as string.format().
@@ -298,7 +303,7 @@ method, if nothing is added to the buffer (e.g. on error). Returns the current length of the buffer data in bytes.

-

res = str|num|buf .. str|num|buf [...]

+

res = str|num|buf .. str|num|buf […]

The Lua concatenation operator .. also accepts buffers, just like strings or numbers. It always returns a string and not a buffer.
@@ -319,7 +324,7 @@ Skips (consumes) len bytes from the buffer up to the current length of the buffer data.

-

str, ... = buf:get([len|nil] [,...])

+

str, … = buf:get([len|nil] [,…])

Consumes the buffer data and returns one or more strings. If called without arguments, the whole buffer data is consumed. If called with a
@@ -444,6 +449,56 @@ data after decoding a single top-level object. The buffer method leaves any left-over data in the buffer.

+

Serialization Options

+

+The options table passed to buffer.new() may contain
+the following members (all optional):
+

+ +

+dict needs to be an array of strings, starting at index 1 and
+without holes (no nil in between). The table is anchored in the
+buffer object and internally modified into a two-way index (don't do
+this yourself, just pass a plain array). The table must not be modified
+after it has been passed to buffer.new().
+

+

+The dict tables used by the encoder and decoder must be the
+same. Put the most common entries at the front. Extend at the end to
+ensure backwards-compatibility — older encodings can then still be
+read. You may also set some indexes to false to explicitly drop
+backwards-compatibility. Old encodings that use these indexes will throw
+an error when decoded.
+

+

+Note: parsing and preparation of the options table is somewhat
+expensive. Create a buffer object only once and recycle it for multiple
+uses. Avoid mixing encoder and decoder buffers, since the
+buf:set() method frees the already allocated buffer space:
+

+
+local options = {
+  dict = { "commonly", "used", "string", "keys" },
+}
+local buf_enc = buffer.new(options)
+local buf_dec = buffer.new(options)
+
+local function encode(obj)
+  return buf_enc:reset():encode(obj):get()
+end
+
+local function decode(str)
+  return buf_dec:set(str):decode()
+end
+
+

Streaming Serialization

In some contexts, it's desirable to do piecewise serialization of large
@@ -536,6 +591,7 @@ uint64 → 0x11 uint.L // FFI uint64_t
complex → 0x12 re.L im.L // FFI complex
string → (0x20+len).U len*char.B
+ | 0x0f (index-1).U // Dict entry
.B = 8 bit
.I = 32 bit little-endian
--
cgit v1.2.3-55-g6feb