From 4c6b669c419f313306b9e6ee43be4ad5f6d73ec6 Mon Sep 17 00:00:00 2001
From: Mike Pall
Date: Thu, 25 Mar 2021 02:21:31 +0100
Subject: String buffers, part 1: object serialization.

Sponsored by fmad.io.
---
 doc/contact.html           |   2 +
 doc/ext_buffer.html        | 275 +++++++++++++++++++++++++++++++++++++++++++++
 doc/ext_c_api.html         |   2 +
 doc/ext_ffi.html           |   2 +
 doc/ext_ffi_api.html       |   2 +
 doc/ext_ffi_semantics.html |   2 +
 doc/ext_ffi_tutorial.html  |   2 +
 doc/ext_jit.html           |   2 +
 doc/ext_profiler.html      |   2 +
 doc/extensions.html        |   2 +
 doc/faq.html               |   2 +
 doc/install.html           |   2 +
 doc/luajit.html            |   2 +
 doc/running.html           |   2 +
 doc/status.html            |   2 +
 15 files changed, 303 insertions(+)
 create mode 100644 doc/ext_buffer.html

diff --git a/doc/contact.html b/doc/contact.html
index b7980091..c253a08b 100644
--- a/doc/contact.html
+++ b/doc/contact.html
@@ -37,6 +37,8 @@
 FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/ext_buffer.html b/doc/ext_buffer.html new file mode 100644 index 00000000..455c298d --- /dev/null +++ b/doc/ext_buffer.html @@ -0,0 +1,275 @@ + + + +String Buffers + + + + + + + +
    +Lua +
    + + +
    +

    + +The string buffer library allows high-performance manipulation of +string-like data. + +

    +

    + +Unlike Lua strings, which are constants, string buffers are +mutable sequences of 8-bit (binary-transparent) characters. Data +can be stored, formatted and encoded into a string buffer and later +converted, decoded or extracted. + +

    +

    + +The convenient string buffer API simplifies common string manipulation +tasks that would otherwise require creating many intermediate strings. +String buffers improve performance by eliminating redundant memory +copies, object creation, string interning and garbage collection +overhead. In conjunction with the FFI library, they allow zero-copy +operations. + +

    + +

    Using the String Buffer Library

    +

    +The string buffer library is built into LuaJIT by default, but it's not +loaded by default. Add this to the start of every Lua file that needs +one of its functions: +

    +
    +local buffer = require("string.buffer")
    +
    + +

    Work in Progress

    + +

    + +This library is a work in progress. More +functions will be added soon. + +

    + +

    Serialization of Lua Objects

    +

    + +The following functions and methods allow high-speed serialization +(encoding) of a Lua object into a string and decoding it back to a Lua +object. This allows convenient storage and transport of structured +data. + +

    +

    + +The encoded data is in an internal binary +format. The data can be stored in files, binary-transparent +databases or transmitted to other LuaJIT instances across threads, +processes or networks. + +

    +

    + +Encoding speed can reach up to 1 Gigabyte/second on a modern desktop- or +server-class system, even when serializing many small objects. Decoding +speed is mostly constrained by object creation cost. + +

    +

    + +The serializer handles most Lua types, common FFI number types and +nested structures. Functions, thread objects, other FFI cdata, full +userdata and associated metatables cannot be serialized (yet). + +

    +

    + +The encoder serializes nested structures as trees. Multiple references +to a single object will be stored separately and create distinct objects +after decoding. Circular references cause an error. + + +
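The tree behaviour described above can be observed directly. The following is a minimal sketch that assumes a LuaJIT build which includes the string buffer library; the table and field names are illustrative only.

```lua
local buffer = require("string.buffer")

-- One table referenced twice from the same parent (illustrative data).
local shared = { 1, 2, 3 }
local parent = { a = shared, b = shared }
assert(parent.a == parent.b)          -- same object before encoding

-- The encoder stores each reference separately, so decoding
-- creates two distinct tables with equal contents.
local decoded = buffer.decode(buffer.encode(parent))
assert(decoded.a ~= decoded.b)        -- distinct objects after decoding
assert(decoded.a[3] == 3 and decoded.b[3] == 3)
```

Code that relies on object identity needs to re-establish shared references itself after decoding.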

    + +

    str = buffer.encode(obj)

    +

    + +Serializes (encodes) the Lua object obj into the string +str. + +

    +

    + +obj can be any of the supported Lua types — it doesn't +need to be a Lua table. + +

    +

    + +This function may throw an error when attempting to serialize +unsupported object types, circular references or deeply nested tables. + +
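Since encoding may throw, callers that handle untrusted or arbitrary objects may want to wrap the call in `pcall`. This is a minimal sketch, assuming a LuaJIT build which includes the string buffer library; the variable names are illustrative and the exact error messages are not shown because they are not specified here.

```lua
local buffer = require("string.buffer")

-- A table that refers to itself: a circular reference.
local circular = {}
circular.self = circular
local ok_circular, err_circular = pcall(buffer.encode, circular)

-- Functions are not a supported object type.
local ok_func, err_func = pcall(buffer.encode, function() end)

-- Both calls fail; the second value holds the error message.
assert(not ok_circular and err_circular ~= nil)
assert(not ok_func and err_func ~= nil)

-- Supported values encode without error.
local ok_plain, encoded = pcall(buffer.encode, { 1, 2, 3 })
assert(ok_plain and type(encoded) == "string")
```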

    + +

    obj = buffer.decode(str)

    +

    + +De-serializes (decodes) the string str into the Lua object +obj. + +

    +

    + +The returned object may be any of the supported Lua types — +even nil. + +

    +

    + +This function may throw an error when fed with malformed or incomplete +encoded data. The standalone function throws when there's left-over data +after decoding a single top-level object. + +
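A round trip through both functions might look like the following sketch. It assumes a LuaJIT build which includes the string buffer library; the sample data is made up for illustration.

```lua
local buffer = require("string.buffer")

-- Nested table with array and hash parts (illustrative data).
local original = {
  name = "example",
  values = { 1, 2.5, "three" },
  flags = { enabled = true, retries = 3 },
}

local encoded = buffer.encode(original)  -- binary string
local decoded = buffer.decode(encoded)   -- a new table with equal contents

assert(type(encoded) == "string")
assert(decoded ~= original)
assert(decoded.name == "example")
assert(decoded.values[2] == 2.5 and decoded.values[3] == "three")
assert(decoded.flags.enabled == true and decoded.flags.retries == 3)

-- Non-table values work, too.
assert(buffer.decode(buffer.encode(42)) == 42)
assert(buffer.decode(buffer.encode("text")) == "text")

-- Two concatenated objects leave data behind after the first one,
-- so the standalone decode function throws.
local two_objects = buffer.encode(1) .. buffer.encode(2)
local ok_leftover = pcall(buffer.decode, two_objects)
assert(not ok_leftover)
```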

    + +

    Serialization Format Specification

    +

    + +This serialization format is designed for internal use by LuaJIT +applications. Serialized data is upwards-compatible and portable across +all supported LuaJIT platforms. + +

    +

    + +It's an 8-bit binary format and not human-readable. For example, it uses +embedded zero bytes and stores embedded Lua string objects unmodified, +which are 8-bit clean, too. Encoded data can be safely concatenated for +streaming and later decoded one top-level object at a time. + +

    +

    + +The encoding is reasonably compact, but tuned for maximum performance, +not for minimum space usage. It compresses well with any of the common +byte-oriented data compression algorithms. + +

    +

    + +Although documented here for reference, this format is explicitly +not intended to be a 'public standard' for structured data +interchange across computer languages (like JSON or MessagePack). Please +do not use it as such. + +

    +

    + +The specification is given below as a context-free grammar with a +top-level object as the starting point. Alternatives are +separated by the | symbol and * indicates repeats. +Grouping is implicit or indicated by {…}. Terminals are +either plain hex numbers, encoded as bytes, or have a .format +suffix. + +

    +
    +object    → nil | false | true
    +          | null | lightud32 | lightud64
    +          | int | num | tab
    +          | int64 | uint64 | complex
    +          | string
    +
    +nil       → 0x00
    +false     → 0x01
    +true      → 0x02
    +
    +null      → 0x03                            // NULL lightuserdata
    +lightud32 → 0x04 data.I                   // 32 bit lightuserdata
    +lightud64 → 0x05 data.L                   // 64 bit lightuserdata
    +
    +int       → 0x06 int.I                                 // int32_t
    +num       → 0x07 double.L
    +
    +tab       → 0x08                                   // Empty table
    +          | 0x09 h.U h*{object object}          // Key/value hash
    +          | 0x0a a.U a*object                    // 0-based array
    +          | 0x0b a.U a*object h.U h*{object object}      // Mixed
    +          | 0x0c a.U (a-1)*object                // 1-based array
    +          | 0x0d a.U (a-1)*object h.U h*{object object}  // Mixed
    +
    +int64     → 0x10 int.L                             // FFI int64_t
    +uint64    → 0x11 uint.L                           // FFI uint64_t
    +complex   → 0x12 re.L im.L                         // FFI complex
    +
    +string    → (0x20+len).U len*char.B
    +
    +.B = 8 bit
    +.I = 32 bit little-endian
    +.L = 64 bit little-endian
    +.U = prefix-encoded 32 bit unsigned number n:
    +     0x00..0xdf   → n.B
    +     0xe0..0x1fdf → (0xe0|(((n-0xe0)>>8)&0x1f)).B ((n-0xe0)&0xff).B
    +   0x1fe0..       → 0xff n.I
    +
    +
    +
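To make the `.U` prefix encoding and the `string` rule more concrete, here is a sketch in plain Lua that follows the grammar above. The helpers `encode_u`, `decode_u` and `encode_string` are hypothetical names written for this illustration. They are not part of the string buffer library and are not how LuaJIT implements the format internally.

```lua
-- Hypothetical helper: encode n (0 <= n < 2^32) using the .U scheme.
local function encode_u(n)
  if n < 0xe0 then
    return string.char(n)                       -- one byte: n.B
  elseif n < 0x1fe0 then
    local m = n - 0xe0                          -- two bytes
    -- 0xe0 has its low 5 bits clear, so adding equals OR-ing here.
    return string.char(0xe0 + math.floor(m / 256) % 32, m % 256)
  else
    return string.char(0xff,                    -- 0xff, then n.I
      n % 256,
      math.floor(n / 256) % 256,
      math.floor(n / 65536) % 256,
      math.floor(n / 16777216) % 256)
  end
end

-- Hypothetical helper: decode a .U number starting at byte pos of s.
-- Returns the number and the position of the next unread byte.
local function decode_u(s, pos)
  local b = s:byte(pos)
  if b < 0xe0 then
    return b, pos + 1
  elseif b < 0xff then
    return (b - 0xe0) * 256 + s:byte(pos + 1) + 0xe0, pos + 2
  else
    local b0, b1, b2, b3 = s:byte(pos + 1, pos + 4)
    return b0 + b1 * 256 + b2 * 65536 + b3 * 16777216, pos + 5
  end
end

-- Hypothetical helper: the string rule, (0x20+len).U len*char.B
local function encode_string(s)
  return encode_u(0x20 + #s) .. s
end

-- Boundary values of the three .U ranges.
assert(encode_u(0xdf) == string.char(0xdf))
assert(encode_u(0xe0) == string.char(0xe0, 0x00))
assert(encode_u(0x1fdf) == string.char(0xfe, 0xff))
assert(encode_u(0x1fe0) == string.char(0xff, 0xe0, 0x1f, 0x00, 0x00))

-- "abc" has length 3, so its tag byte is 0x20 + 3 = 0x23.
assert(encode_string("abc") == string.char(0x23) .. "abc")
```

The two-byte form never produces a first byte of `0xff`, because the largest value in that range gives `0xfe`. This leaves `0xff` free to mark the five-byte form.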
    + + + diff --git a/doc/ext_c_api.html b/doc/ext_c_api.html index 6079e5ac..9f1ad212 100644 --- a/doc/ext_c_api.html +++ b/doc/ext_c_api.html @@ -37,6 +37,8 @@ FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/ext_ffi.html b/doc/ext_ffi.html index 13b75bda..b934dc78 100644 --- a/doc/ext_ffi.html +++ b/doc/ext_ffi.html @@ -37,6 +37,8 @@ FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/ext_ffi_api.html b/doc/ext_ffi_api.html index b7ace808..061cc42a 100644 --- a/doc/ext_ffi_api.html +++ b/doc/ext_ffi_api.html @@ -42,6 +42,8 @@ td.abiparam { font-weight: bold; width: 6em; } FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/ext_ffi_semantics.html b/doc/ext_ffi_semantics.html index 904ee51d..fef39c32 100644 --- a/doc/ext_ffi_semantics.html +++ b/doc/ext_ffi_semantics.html @@ -42,6 +42,8 @@ td.convop { font-style: italic; width: 40%; } FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/ext_ffi_tutorial.html b/doc/ext_ffi_tutorial.html index 8ed61364..ca71be4d 100644 --- a/doc/ext_ffi_tutorial.html +++ b/doc/ext_ffi_tutorial.html @@ -44,6 +44,8 @@ td.idiomlua b { font-weight: normal; color: #2142bf; } FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/ext_jit.html b/doc/ext_jit.html index 84302fa0..6dd54c70 100644 --- a/doc/ext_jit.html +++ b/doc/ext_jit.html @@ -37,6 +37,8 @@ FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/ext_profiler.html b/doc/ext_profiler.html index 0e8d3691..2783abdb 100644 --- a/doc/ext_profiler.html +++ b/doc/ext_profiler.html @@ -37,6 +37,8 @@ FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/extensions.html b/doc/extensions.html index 77cf444c..799679a3 100644 --- a/doc/extensions.html +++ b/doc/extensions.html @@ -54,6 +54,8 @@ td.excinterop { FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/faq.html b/doc/faq.html index b71e6e7c..a5d744d2 100644 --- a/doc/faq.html +++ b/doc/faq.html @@ -40,6 +40,8 @@ dd { margin-left: 1.5em; } FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/install.html b/doc/install.html index fab0b2ca..e4af9dde 100644 --- a/doc/install.html +++ b/doc/install.html @@ -65,6 +65,8 @@ td.compatno { FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/luajit.html b/doc/luajit.html index 42c0ac83..a25267a6 100644 --- a/doc/luajit.html +++ b/doc/luajit.html @@ -122,6 +122,8 @@ table.feature small { FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/running.html b/doc/running.html index ae4911d5..b55b8439 100644 --- a/doc/running.html +++ b/doc/running.html @@ -59,6 +59,8 @@ td.param_default { FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API diff --git a/doc/status.html b/doc/status.html index e1f024bf..1d3ba984 100644 --- a/doc/status.html +++ b/doc/status.html @@ -40,6 +40,8 @@ ul li { padding-bottom: 0.3em; } FFI Semantics
  • +String Buffers +
  • jit.* Library
  • Lua/C API -- cgit v1.2.3-55-g6feb