author Mike Pall <mike> 2011-02-09 01:26:02 +0100
committer Mike Pall <mike> 2011-02-09 01:26:02 +0100
commit 24c314e8fcfb3d12ea05c1f9bf7add40d24ae0cd (patch)
tree bf7bd5d2b852f9c13b70f6392c24b315364cb968 /doc
parent 2388a7fcc017b9e9a75a4674aa81933b510882f7 (diff)
FFI: Add more docs on FFI semantics.
Diffstat (limited to 'doc')
-rw-r--r-- doc/ext_ffi_semantics.html 292
1 files changed, 268 insertions, 24 deletions
diff --git a/doc/ext_ffi_semantics.html b/doc/ext_ffi_semantics.html
index 9b7cac70..f48c6406 100644
--- a/doc/ext_ffi_semantics.html
+++ b/doc/ext_ffi_semantics.html
@@ -57,18 +57,159 @@
57</div> 57</div>
58<div id="main"> 58<div id="main">
59<p> 59<p>
60TODO 60This page describes the detailed semantics underlying the FFI library
61and its interaction with both Lua and C&nbsp;code.
62</p>
63<p>
64Given that the FFI library is designed to interface with C&nbsp;code
65and that declarations can be written in plain C&nbsp;syntax, it
66closely follows the C&nbsp;language semantics wherever possible. Some
67concessions are needed for smoother interoperation with Lua language
68semantics. But it should be straightforward to write applications
69using the LuaJIT FFI for developers with a C or C++ background.
61</p> 70</p>
62 71
63<h2 id="clang">C Language Support</h2> 72<h2 id="clang">C Language Support</h2>
64<p> 73<p>
65TODO 74The FFI library has a built-in C&nbsp;parser with a minimal memory
75footprint. It's used by the <a href="ext_ffi_api.html">ffi.* library
76functions</a> to declare C&nbsp;types or external symbols.
77</p>
78<p>
79Its only purpose is to parse C&nbsp;declarations, as found e.g. in
80C&nbsp;header files. Although it does evaluate constant expressions,
81it's <em>not</em> a C&nbsp;compiler. The body of <tt>inline</tt>
82C&nbsp;function definitions is simply ignored.
83</p>
84<p>
85Also, this is <em>not</em> a validating C&nbsp;parser. It expects and
86accepts correctly formed C&nbsp;declarations, but it may choose to
87ignore bad declarations or show rather generic error messages. If in
88doubt, please check the input against your favorite C&nbsp;compiler.
89</p>
90<p>
91The C&nbsp;parser complies with the <b>C99 language standard</b> plus
92the following extensions:
93</p>
94<ul>
95
96<li>C++-style comments (<tt>//</tt>).</li>
97
98<li>The <tt>'\e'</tt> escape in character and string literals.</li>
99
100<li>The <tt>long long</tt> 64&nbsp;bit integer type.</li>
101
102<li>The C99/C++ boolean type, declared with the keywords <tt>bool</tt>
103or <tt>_Bool</tt>.</li>
104
105<li>Complex numbers, declared with the keywords <tt>complex</tt> or
106<tt>_Complex</tt>.</li>
107
108<li>Two complex number types: <tt>complex</tt> (aka
109<tt>complex&nbsp;double</tt>) and <tt>complex&nbsp;float</tt>.</li>
110
111<li>Vector types, declared with the GCC <tt>mode</tt> or
112<tt>vector_size</tt> attribute.</li>
113
114<li>Unnamed ('transparent') <tt>struct</tt>/<tt>union</tt> fields
115inside a <tt>struct</tt>/<tt>union</tt>.</li>
116
117<li>Incomplete <tt>enum</tt> declarations, handled like incomplete
118<tt>struct</tt> declarations.</li>
119
120<li>Unnamed <tt>enum</tt> fields inside a
121<tt>struct</tt>/<tt>union</tt>. This is similar to a scoped C++
122<tt>enum</tt>, except that declared constants are visible in the
123global namespace, too.</li>
124
125<li>C++-style scoped <tt>static&nbsp;const</tt> declarations inside a
126<tt>struct</tt>/<tt>union</tt>.</li>
127
128<li>Zero-length arrays (<tt>[0]</tt>), empty
129<tt>struct</tt>/<tt>union</tt>, variable-length arrays (VLA,
130<tt>[?]</tt>) and variable-length structs (VLS, with a trailing
131VLA).</li>
132
133<li>Alternate GCC keywords with '<tt>__</tt>', e.g.
134<tt>__const__</tt>.</li>
135
136<li>GCC <tt>__attribute__</tt> with the following attributes:
137<tt>aligned</tt>, <tt>packed</tt>, <tt>mode</tt>,
138<tt>vector_size</tt>, <tt>cdecl</tt>, <tt>fastcall</tt>,
139<tt>stdcall</tt>.</li>
140
141<li>The GCC <tt>__extension__</tt> keyword and the GCC
142<tt>__alignof__</tt> operator.</li>
143
144<li>GCC <tt>__asm__("symname")</tt> symbol name redirection for
145function declarations.</li>
146
147<li>MSVC keywords for fixed-length types: <tt>__int8</tt>,
148<tt>__int16</tt>, <tt>__int32</tt> and <tt>__int64</tt>.</li>
149
150<li>MSVC <tt>__cdecl</tt>, <tt>__fastcall</tt>, <tt>__stdcall</tt>,
151<tt>__ptr32</tt>, <tt>__ptr64</tt>, <tt>__declspec(align(n))</tt>
152and <tt>#pragma&nbsp;pack</tt>.</li>
153
154<li>All other GCC/MSVC-specific attributes are ignored.</li>
155
156</ul>
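As a rough illustration of the list above, the following declarations exercise a handful of the listed extensions. This is only a sketch: the names `i64`, `packet`, `PACKET_ALIGN` and `PAYLOAD_OFFSET` are hypothetical and do not come from the FFI documentation, and the snippet assumes a GCC-compatible compiler, in line with the advice to check declarations against a C compiler when in doubt.

```c
#include <stddef.h>

// C++-style comment, one of the accepted extensions.
typedef long long i64;            /* 64 bit integer type */

struct packet {
  _Bool valid;                    /* C99 boolean type */
  union {                         /* unnamed ('transparent') union field */
    int   as_int;
    float as_float;
  };
  i64 stamp __attribute__((aligned(8)));  /* GCC aligned attribute */
  unsigned char payload[0];       /* zero-length array */
};

/* The GCC __alignof__ operator and offsetof in constant expressions. */
enum {
  PACKET_ALIGN   = __alignof__(struct packet),
  PAYLOAD_OFFSET = offsetof(struct packet, payload)
};
```

Declarations of this shape could then be handed to the parser as a string, provided every construct used is on the list above.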
157<p>
158The following C&nbsp;types are pre-defined by the C&nbsp;parser (like
159a <tt>typedef</tt>, except re-declarations will be ignored):
66</p> 160</p>
161<ul>
162
163<li>Vararg handling: <tt>va_list</tt>, <tt>__builtin_va_list</tt>,
164<tt>__gnuc_va_list</tt>.</li>
165
166<li>From <tt>&lt;stddef.h&gt;</tt>: <tt>ptrdiff_t</tt>,
167<tt>size_t</tt>, <tt>wchar_t</tt>.</li>
168
169<li>From <tt>&lt;stdint.h&gt;</tt>: <tt>int8_t</tt>, <tt>int16_t</tt>,
170<tt>int32_t</tt>, <tt>int64_t</tt>, <tt>uint8_t</tt>,
171<tt>uint16_t</tt>, <tt>uint32_t</tt>, <tt>uint64_t</tt>,
172<tt>intptr_t</tt>, <tt>uintptr_t</tt>.</li>
173
174</ul>
175<p>
176You're encouraged to use these types in preference to the
177compiler-specific extensions or the target-dependent standard types.
178E.g. <tt>char</tt> differs in signedness and <tt>long</tt> differs in
179size, depending on the target architecture and platform ABI.
180</p>
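The portability point above can be checked with a small C sketch. The struct `record` and the helpers `plain_char_is_signed` and `long_size` are hypothetical names chosen for illustration. The fixed-width members have the same size on every target, while the two helpers report properties that depend on the platform ABI.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width types: the member sizes do not depend on the target. */
struct record {
  int32_t  id;      /* always 4 bytes */
  uint64_t offset;  /* always 8 bytes */
};

/* Returns 1 if plain char is signed on this target, 0 if it is unsigned. */
static int plain_char_is_signed(void) { return CHAR_MIN < 0; }

/* Size of long in bytes: typically 8 on 64 bit Unix-like targets,
   4 on 64 bit Windows and on most 32 bit targets. */
static size_t long_size(void) { return sizeof(long); }
```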
181<p>
182The following C&nbsp;features are <b>not</b> supported:
183</p>
184<ul>
185
186<li>A declaration must always have a type specifier; it doesn't
187default to an <tt>int</tt> type.</li>
188
189<li>Old-style empty function declarations (K&amp;R) are not allowed.
190All C&nbsp;functions must have a proper prototype declaration. A
191function declared without parameters (<tt>int&nbsp;foo();</tt>) is
192treated as a function taking zero arguments, like in C++.</li>
193
194<li>The <tt>long double</tt> C&nbsp;type is parsed correctly, but
195there's no support for the related conversions, accesses or arithmetic
196operations.</li>
197
198<li>Wide character strings and character literals are not
199supported.</li>
200
201<li><a href="#status">See below</a> for features that are currently
202not implemented.</li>
203
204</ul>
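A minimal sketch of declarations that avoid the first two restrictions above, assuming hypothetical functions `add` and `answer`: each declaration carries an explicit type specifier and a full prototype, and the zero-argument case is spelled with `void`, so that a C compiler and the FFI parser read it the same way.

```c
/* Every declaration names its type explicitly; nothing defaults to int. */
int add(int a, int b);

/* Write 'void' for a function without parameters instead of relying on
   'int answer();', which C (before C23) reads as "unspecified arguments". */
int answer(void);

/* Definitions, only so the sketch can be compiled and exercised. */
int add(int a, int b) { return a + b; }
int answer(void) { return 42; }
```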
67 205
68<h2 id="convert">C Type Conversion Rules</h2> 206<h2 id="convert">C Type Conversion Rules</h2>
69<p> 207<p>
70TODO 208TODO
71</p> 209</p>
210<h3 id="convert_tolua">Conversions from C&nbsp;types to Lua objects</h3>
211<h3 id="convert_fromlua">Conversions from Lua objects to C&nbsp;types</h3>
212<h3 id="convert_between">Conversions between C&nbsp;types</h3>
72 213
73<h2 id="init">Initializers</h2> 214<h2 id="init">Initializers</h2>
74<p> 215<p>
@@ -81,8 +222,8 @@ initializers and the C&nbsp;types involved:
81<li>If no initializers are given, the object is filled with zero bytes.</li> 222<li>If no initializers are given, the object is filled with zero bytes.</li>
82 223
83<li>Scalar types (numbers and pointers) accept a single initializer. 224<li>Scalar types (numbers and pointers) accept a single initializer.
84The standard <a href="#convert">C&nbsp;type conversion rules</a> 225The Lua object is <a href="#convert_fromlua">converted to the scalar
85apply.</li> 226C&nbsp;type</a>.</li>
86 227
87<li>Valarrays (complex numbers and vectors) are treated like scalars 228<li>Valarrays (complex numbers and vectors) are treated like scalars
88when a single initializer is given. Otherwise they are treated like 229when a single initializer is given. Otherwise they are treated like
@@ -111,16 +252,6 @@ initializer or a compatible aggregate, of course.</li>
111 252
112</ul> 253</ul>
113 254
114<h2 id="clib">C Library Namespaces</h2>
115<p>
116A C&nbsp;library namespace is a special kind of object which allows
117access to the symbols contained in libraries. Indexing it with a
118symbol name (a Lua string) automatically binds it to the library.
119</p>
120<p>
121TODO
122</p>
123
124<h2 id="ops">Operations on cdata Objects</h2> 255<h2 id="ops">Operations on cdata Objects</h2>
125<p> 256<p>
126TODO 257TODO
@@ -158,9 +289,9 @@ Similar rules apply for Lua strings which are implicitly converted to
158<tt>"const&nbsp;char&nbsp;*"</tt>: the string object itself must be 289<tt>"const&nbsp;char&nbsp;*"</tt>: the string object itself must be
159referenced somewhere or it'll be garbage collected eventually. The 290referenced somewhere or it'll be garbage collected eventually. The
160pointer will then point to stale data, which may have already been 291pointer will then point to stale data, which may have already been
161overwritten. Note that string literals are automatically kept alive as 292overwritten. Note that <em>string literals</em> are automatically kept
162long as the function containing it (actually its prototype) is not 293alive as long as the function containing it (actually its prototype)
163garbage collected. 294is not garbage collected.
164</p> 295</p>
165<p> 296<p>
166Objects which are passed as an argument to an external C&nbsp;function 297Objects which are passed as an argument to an external C&nbsp;function
@@ -181,6 +312,121 @@ indistinguishable from pointers returned by C functions (which is one
181of the reasons why the GC cannot follow them). 312of the reasons why the GC cannot follow them).
182</p> 313</p>
183 314
315<h2 id="clib">C Library Namespaces</h2>
316<p>
317A C&nbsp;library namespace is a special kind of object which allows
318access to the symbols contained in shared libraries or the default
319symbol namespace. The default
320<a href="ext_ffi_api.html#ffi_C"><tt>ffi.C</tt></a> namespace is
321automatically created when the FFI library is loaded. C&nbsp;library
322namespaces for specific shared libraries may be created with the
323<a href="ext_ffi_api.html#ffi_load"><tt>ffi.load()</tt></a> API
324function.
325</p>
326<p>
327Indexing a C&nbsp;library namespace object with a symbol name (a Lua
328string) automatically binds it to the library. First the symbol type
329is resolved &mdash; it must have been declared with
330<a href="ext_ffi_api.html#ffi_cdef"><tt>ffi.cdef</tt></a>. Then the
331symbol address is resolved by searching for the symbol name in the
332associated shared libraries or the default symbol namespace. Finally,
333the resulting binding between the symbol name, the symbol type and its
334address is cached. Missing symbol declarations or nonexistent symbol
335names cause an error.
336</p>
337<p>
338This is what happens on a <b>read access</b> for the different kinds of
339symbols:
340</p>
341<ul>
342
343<li>External functions: a cdata object with the type of the function
344and its address is returned.</li>
345
346<li>External variables: the symbol address is dereferenced and the
347loaded value is <a href="#convert_tolua">converted to a Lua object</a>
348and returned.</li>
349
350<li>Constant values (<tt>static&nbsp;const</tt> or <tt>enum</tt>
351constants): the constant is <a href="#convert_tolua">converted to a
352Lua object</a> and returned.</li>
353
354</ul>
355<p>
356This is what happens on a <b>write access</b>:
357</p>
358<ul>
359
360<li>External variables: the value to be written is
361<a href="#convert_fromlua">converted to the C&nbsp;type</a> of the
362variable and then stored at the symbol address.</li>
363
364<li>Writing to constant variables or to any other symbol type causes
365an error, like any other attempted write to a constant location.</li>
366
367</ul>
368<p>
369C&nbsp;library namespaces themselves are garbage collected objects. If
370the last reference to the namespace object is gone, the garbage
371collector will eventually release the shared library reference and
372remove all memory associated with the namespace. Since this may
373trigger the removal of the shared library from the memory of the
374running process, it's generally <em>not safe</em> to use function
375cdata objects obtained from a library if the namespace object may be
376unreferenced.
377</p>
378<p>
379Performance notice: the JIT compiler specializes to the identity of
380namespace objects and to the strings used to index them. This
381effectively turns function cdata objects into constants. It's not
382useful and actually counter-productive to explicitly cache these
383function objects, e.g. <tt>local strlen = ffi.C.strlen</tt>. On the other hand, it
384<em>is</em> useful to cache the namespace itself, e.g. <tt>local C =
385ffi.C</tt>.
386</p>
387
388<h2 id="policy">No Hand-holding!</h2>
389<p>
390The FFI library has been designed as <b>a low-level library</b>. The
391goal is to interface with C&nbsp;code and C&nbsp;data types with a
392minimum of overhead. This means <b>you can do anything you can do
393from&nbsp;C</b>: access all memory, overwrite anything in memory, call
394machine code at any memory address and so on.
395</p>
396<p>
397The FFI library provides <b>no memory safety</b>, unlike regular Lua
398code. It will happily allow you to dereference a <tt>NULL</tt>
399pointer, to access arrays out of bounds or to misdeclare
400C&nbsp;functions. If you make a mistake, your application might crash,
401just like equivalent C&nbsp;code would.
402</p>
403<p>
404This behavior is inevitable, since the goal is to provide full
405interoperability with C&nbsp;code. Adding extra safety measures, like
406bounds checks, would be futile. There's no way to detect
407misdeclarations of C&nbsp;functions, since shared libraries only
408provide symbol names, but no type information. Likewise there's no way
409to infer the valid range of indexes for a returned pointer.
410</p>
411<p>
412Again: the FFI library is a low-level library. This implies it needs
413to be used with care, but its flexibility and performance often
414outweigh this concern. If you're a C or C++ developer, it'll be easy
415to apply your existing knowledge. On the other hand, writing code for the FFI
416library is not for the faint of heart and probably shouldn't be the
417first exercise for someone with little experience in Lua, C or C++.
418</p>
419<p>
420As a corollary of the above, the FFI library is <b>not safe for use by
421untrusted Lua code</b>. If you're sandboxing untrusted Lua code, you
422definitely don't want to give this code access to the FFI library or
423to <em>any</em> cdata object (except 64&nbsp;bit integers or complex
424numbers). Any properly engineered Lua sandbox needs to provide safety
425wrappers for many of the standard Lua library functions &mdash;
426similar wrappers need to be written for high-level operations on FFI
427data types, too.
428</p>
429
184<h2 id="status">Current Status</h2> 430<h2 id="status">Current Status</h2>
185<p> 431<p>
186The initial release of the FFI library has some limitations and is 432The initial release of the FFI library has some limitations and is
@@ -200,18 +446,15 @@ obscure constructs.</li>
200<li><tt>static const</tt> declarations only work for integer types 446<li><tt>static const</tt> declarations only work for integer types
201up to 32&nbsp;bits. Neither declaring string constants nor 447up to 32&nbsp;bits. Neither declaring string constants nor
202floating-point constants is supported.</li> 448floating-point constants is supported.</li>
203<li>The <tt>long double</tt> C&nbsp;type is parsed correctly, but
204there's no support for the related conversions, accesses or
205arithmetic operations.</li>
206<li>Packed <tt>struct</tt> bitfields that cross container boundaries 449<li>Packed <tt>struct</tt> bitfields that cross container boundaries
207are not implemented.</li> 450are not implemented.</li>
208<li>Native vector types may be defined with the GCC <tt>mode</tt> and 451<li>Native vector types may be defined with the GCC <tt>mode</tt> or
209<tt>vector_size</tt> attributes. But no operations other than loading, 452<tt>vector_size</tt> attribute. But no operations other than loading,
210storing and initializing them are supported, yet.</li> 453storing and initializing them are supported, yet.</li>
211<li>The <tt>volatile</tt> type qualifier is currently ignored by 454<li>The <tt>volatile</tt> type qualifier is currently ignored by
212compiled code.</li> 455compiled code.</li>
213<li><a href="ext_ffi_api.html#ffi_cdef">ffi.cdef</a> silently ignores 456<li><a href="ext_ffi_api.html#ffi_cdef"><tt>ffi.cdef</tt></a> silently
214all redeclarations.</li> 457ignores all redeclarations.</li>
215</ul> 458</ul>
216<p> 459<p>
217The JIT compiler already handles a large subset of all FFI operations. 460The JIT compiler already handles a large subset of all FFI operations.
@@ -238,6 +481,7 @@ two.</li>
238value.</li> 481value.</li>
239<li>Calls to C&nbsp;functions with 64 bit arguments or return values 482<li>Calls to C&nbsp;functions with 64 bit arguments or return values
240on 32 bit CPUs.</li> 483on 32 bit CPUs.</li>
484<li>Accesses to external variables in C&nbsp;library namespaces.</li>
241<li><tt>tostring()</tt> for cdata types.</li> 485<li><tt>tostring()</tt> for cdata types.</li>
242<li>The following <a href="ext_ffi_api.html">ffi.* API</a> functions: 486<li>The following <a href="ext_ffi_api.html">ffi.* API</a> functions:
243<tt>ffi.sizeof()</tt>, <tt>ffi.alignof()</tt>, <tt>ffi.offsetof()</tt>. 487<tt>ffi.sizeof()</tt>, <tt>ffi.alignof()</tt>, <tt>ffi.offsetof()</tt>.