author Thijs Schreijer <thijs@thijsschreijer.nl> 2022-08-24 12:31:18 +0200
committer GitHub <noreply@github.com> 2022-08-24 12:31:18 +0200
commit 87c48f3e4ddba13d9c014067e62568ba906cc410
tree c99889ec0b9bbab2bc5ee14cd9046d0a122eb226
parent 95b7efa9da506ef968c1347edf3fc56370f0deed
parent 97d5194f302d3fb9fe27874d9b5f73004a208d01
Merge pull request #364 from lunarmodules/cleanup
-rw-r--r-- .luacheckrc 2
-rw-r--r-- FIX 28
-rw-r--r-- TODO 81
-rw-r--r-- TODO.md 135
-rw-r--r-- WISH 22
-rw-r--r-- docs/logo.ps (renamed from logo.ps) 0
-rw-r--r-- docs/lua05.ppt bin 304128 -> 0 bytes
-rw-r--r-- etc/README 89
-rw-r--r-- gem/ex1.lua 4
-rw-r--r-- gem/ex10.lua 17
-rw-r--r-- gem/ex11.lua 7
-rw-r--r-- gem/ex12.lua 34
-rw-r--r-- gem/ex2.lua 11
-rw-r--r-- gem/ex3.lua 15
-rw-r--r-- gem/ex4.lua 5
-rw-r--r-- gem/ex5.lua 15
-rw-r--r-- gem/ex6.lua 14
-rw-r--r-- gem/ex7.lua 16
-rw-r--r-- gem/ex8.lua 5
-rw-r--r-- gem/ex9.lua 3
-rw-r--r-- gem/gem.c 54
-rw-r--r-- gem/gt.b64 206
-rw-r--r-- gem/input.bin bin 11732 -> 0 bytes
-rw-r--r-- gem/ltn012.tex 695
-rw-r--r-- gem/luasocket.png bin 11732 -> 0 bytes
-rw-r--r-- gem/makefile 14
-rwxr-xr-x gem/myps2pdf 113
-rw-r--r-- gem/t1.lua 25
-rw-r--r-- gem/t1lf.txt 5
-rw-r--r-- gem/t2.lua 36
-rw-r--r-- gem/t2.txt 4
-rw-r--r-- gem/t2gt.qp 5
-rw-r--r-- gem/t3.lua 25
-rw-r--r-- gem/t4.lua 10
-rw-r--r-- gem/t5.lua 30
-rw-r--r-- gem/test.lua 46
-rw-r--r-- ltn012.md 390
-rw-r--r-- ltn012.wiki 393
-rw-r--r-- ltn013.md 191
-rw-r--r-- ltn013.wiki 194
-rw-r--r-- luasocket-scm-3.rockspec 1
-rw-r--r-- makefile.dist 28
-rw-r--r-- samples/README 90
-rw-r--r-- samples/b64.lua (renamed from etc/b64.lua) 0
-rw-r--r-- samples/check-links.lua (renamed from etc/check-links.lua) 0
-rw-r--r-- samples/check-memory.lua (renamed from etc/check-memory.lua) 0
-rw-r--r-- samples/cookie.lua (renamed from etc/cookie.lua) 0
-rw-r--r-- samples/dict.lua (renamed from etc/dict.lua) 0
-rw-r--r-- samples/dispatch.lua (renamed from etc/dispatch.lua) 0
-rw-r--r-- samples/eol.lua (renamed from etc/eol.lua) 0
-rw-r--r-- samples/forward.lua (renamed from etc/forward.lua) 0
-rw-r--r-- samples/get.lua (renamed from etc/get.lua) 0
-rw-r--r-- samples/links (renamed from etc/links) 0
-rw-r--r-- samples/lp.lua (renamed from etc/lp.lua) 0
-rw-r--r-- samples/qp.lua (renamed from etc/qp.lua) 0
-rw-r--r-- samples/tftp.lua (renamed from etc/tftp.lua) 0
56 files changed, 814 insertions, 2244 deletions
diff --git a/.luacheckrc b/.luacheckrc
index 8b25dd7..a3b4f63 100644
--- a/.luacheckrc
+++ b/.luacheckrc
@@ -15,8 +15,6 @@ include_files = {
 }
 
 exclude_files = {
-    "etc/*.lua",
-    "etc/**/*.lua",
     "test/*.lua",
     "test/**/*.lua",
     "samples/*.lua",
diff --git a/FIX b/FIX
deleted file mode 100644
index 40f30a1..0000000
--- a/FIX
+++ /dev/null
@@ -1,28 +0,0 @@
1
2
3
4
5
6
7http was preserving old host header during redirects
8fix smtp.send hang on source error
9add create field to FTP and SMTP and fix HTTP ugliness
10clean timeout argument to open functions in SMTP, HTTP and FTP
11eliminate globals from namespaces created by module().
12url.absolute was not working when base_url was already parsed
13http.request was redirecting even when the location header was empty
14tcp{client}:shutdown() was checking for group instead of class.
15tcp{client}:send() now returns i+sent-1...
16get rid of a = socket.try() in the manual, except for protected cases. replace it with assert.
17get rid of "base." kludge in package.loaded
18check all "require("http")" etc in the manual.
19make sure sock_gethostname.* only return success if the hp is not null!
20change 'l' prefix in C libraries to 'c' to avoid clash with LHF libraries
21 don't forget the declarations in luasocket.h and mime.h!!!
22setpeername was using udp{unconnected}
23fixed a bug in http.lua that caused some requests to fail (Florian Berger)
24fixed a bug in select.c that prevented sockets with descriptor 0 from working (Renato Maia)
25fixed a "bug" that caused dns.toip to crash under uLinux
26fixed a "bug" that caused a crash in gethostbyname under VMS
27DEBUG and VERSION became _DEBUG and _VERSION
28send returns the right value if input is "". Alexander Marinov
diff --git a/TODO b/TODO
deleted file mode 100644
index a838fc0..0000000
--- a/TODO
+++ /dev/null
@@ -1,81 +0,0 @@
1- bizarre default values for getnameinfo should throw error instead!
2
3> It's just too bad it can't talk to gmail -
4> reason 1: they absolutely want TLS
5> reason 2: unlike all the other SMTP implementations, they
6> don't
7> tolerate missing < > around adresses
8
9- document the new bind and connect behavior.
10- shouldn't we instead make the code compatible to Lua 5.2
11 without any compat stuff, and use a compatibility layer to
12 make it work on 5.1?
13- add what's new to manual
14- should there be an equivalent to tohostname for IPv6?
15- should we add service name resolution as well to getaddrinfo?
16- Maybe the sockaddr to presentation conversion should be done with getnameinfo()?
17
18- add http POST sample to manual
19 people keep asking stupid questions
20- documentation of dirty/getfd/setfd is problematic because of portability
21 same for unix and serial.
22 what to do about this? add a stronger disclaimer?
23- fix makefile with decent defaults?
24
25Done:
26
27- added IPv6 support to getsockname
28- simplified getpeername implementation
29- added family to return of getsockname and getpeername
30 and added modification to the manual to describe
31
32- connect and bind try all adresses returned by getaddrinfo
33- document headers.lua?
34- update copyright date everywhere?
35- remove RCSID from files?
36- move version to 2.1 rather than 2.1.1?
37- fixed url package to support ipv6 hosts
38- changed domain to family
39- implement getfamily methods.
40
41- remove references to Lua 5.0 from documentation, add 5.2?
42- update lua and luasocket version in samples in documentation
43- document ipv5_v6only default option being set?
44- document tcp6 and udp6
45- document dns.getaddrinfo
46- documented zero-sized datagram change?
47 no.
48- document unix socket and serial socket? add raw support?
49 no.
50- document getoption
51- merge luaL_typeerror into auxiliar to avoid using luaL prefix?
52
53
54
55
56
57
58
59
60
61
62replace \r\n with \0xD\0xA in everything
63New mime support
64
65ftp send should return server replies?
66make sure there are no object files in the distribution tarball
67http handling of 100-continue, see DB patch
68DB ftp.lua bug.
69test unix.c to return just a function and works with require"unix"
70get rid of setmetatable(, nil) since packages don't need this anymore in 5.1
71compat-5.1 novo
72ajeitar pra lua-5.1
73
74adicionar exemplos de expansão: pipe, local, named pipe
75testar os options!
76
77
78- Thread-unsafe functions to protect
79 gethostbyname(), gethostbyaddr(), gethostent(),
80inet_ntoa(), strerror(),
81
diff --git a/TODO.md b/TODO.md
new file mode 100644
index 0000000..d265694
--- /dev/null
+++ b/TODO.md
@@ -0,0 +1,135 @@
1## FIX
2
3http was preserving old host header during redirects
4fix smtp.send hang on source error
5add create field to FTP and SMTP and fix HTTP ugliness
6clean timeout argument to open functions in SMTP, HTTP and FTP
7eliminate globals from namespaces created by module().
8url.absolute was not working when base_url was already parsed
9http.request was redirecting even when the location header was empty
10tcp{client}:shutdown() was checking for group instead of class.
11tcp{client}:send() now returns i+sent-1...
12get rid of a = socket.try() in the manual, except for protected cases. replace it with assert.
13get rid of "base." kludge in package.loaded
14check all "require("http")" etc in the manual.
15make sure sock_gethostname.* only return success if the hp is not null!
16change 'l' prefix in C libraries to 'c' to avoid clash with LHF libraries
17 don't forget the declarations in luasocket.h and mime.h!!!
18setpeername was using udp{unconnected}
19fixed a bug in http.lua that caused some requests to fail (Florian Berger)
20fixed a bug in select.c that prevented sockets with descriptor 0 from working (Renato Maia)
21fixed a "bug" that caused dns.toip to crash under uLinux
22fixed a "bug" that caused a crash in gethostbyname under VMS
23DEBUG and VERSION became _DEBUG and _VERSION
24send returns the right value if input is "". Alexander Marinov
25
26
27## WISH
28
29... as an l-value to get all results of a function call?
30at least ...[i] and #...
31extend to full tuples?
32
33__and __or __not metamethods
34
35lua_tostring, lua_tonumber, lua_touseradta etc push values in stack
36__tostring,__tonumber, __touserdata metamethods are checked
37and expected to push an object of correct type on stack
38
39lua_rawtostring, lua_rawtonumber, lua_rawtouserdata don't
40push anything on stack, return data of appropriate type,
41skip metamethods and throw error if object not of exact type
42
43package.findfile exported
44module not polluting the global namespace
45
46coxpcall with a coroutine pool for efficiency (reusing coroutines)
47
48exception mechanism formalized? just like the package system was.
49
50a nice bitlib in the core
51
52
53## TODO
54
55- bizarre default values for getnameinfo should throw error instead!
56
57> It's just too bad it can't talk to gmail -
58> reason 1: they absolutely want TLS
59> reason 2: unlike all the other SMTP implementations, they
60> don't
61> tolerate missing < > around adresses
62
63- document the new bind and connect behavior.
64- shouldn't we instead make the code compatible to Lua 5.2
65 without any compat stuff, and use a compatibility layer to
66 make it work on 5.1?
67- add what's new to manual
68- should there be an equivalent to tohostname for IPv6?
69- should we add service name resolution as well to getaddrinfo?
70- Maybe the sockaddr to presentation conversion should be done with getnameinfo()?
71
72- add http POST sample to manual
73 people keep asking stupid questions
74- documentation of dirty/getfd/setfd is problematic because of portability
75 same for unix and serial.
76 what to do about this? add a stronger disclaimer?
77- fix makefile with decent defaults?
78
79## Done:
80
81- added IPv6 support to getsockname
82- simplified getpeername implementation
83- added family to return of getsockname and getpeername
84 and added modification to the manual to describe
85
86- connect and bind try all adresses returned by getaddrinfo
87- document headers.lua?
88- update copyright date everywhere?
89- remove RCSID from files?
90- move version to 2.1 rather than 2.1.1?
91- fixed url package to support ipv6 hosts
92- changed domain to family
93- implement getfamily methods.
94
95- remove references to Lua 5.0 from documentation, add 5.2?
96- update lua and luasocket version in samples in documentation
97- document ipv5_v6only default option being set?
98- document tcp6 and udp6
99- document dns.getaddrinfo
100- documented zero-sized datagram change?
101 no.
102- document unix socket and serial socket? add raw support?
103 no.
104- document getoption
105- merge luaL_typeerror into auxiliar to avoid using luaL prefix?
106
107
108
109
110
111
112
113
114
115
116replace \r\n with \0xD\0xA in everything
117New mime support
118
119ftp send should return server replies?
120make sure there are no object files in the distribution tarball
121http handling of 100-continue, see DB patch
122DB ftp.lua bug.
123test unix.c to return just a function and works with require"unix"
124get rid of setmetatable(, nil) since packages don't need this anymore in 5.1
125compat-5.1 novo
126ajeitar pra lua-5.1
127
128adicionar exemplos de expansão: pipe, local, named pipe
129testar os options!
130
131
132- Thread-unsafe functions to protect
133 gethostbyname(), gethostbyaddr(), gethostent(),
134inet_ntoa(), strerror(),
135
diff --git a/WISH b/WISH
deleted file mode 100644
index e7e9c07..0000000
--- a/WISH
+++ /dev/null
@@ -1,22 +0,0 @@
1... as an l-value to get all results of a function call?
2at least ...[i] and #...
3extend to full tuples?
4
5__and __or __not metamethods
6
7lua_tostring, lua_tonumber, lua_touseradta etc push values in stack
8__tostring,__tonumber, __touserdata metamethods are checked
9and expected to push an object of correct type on stack
10
11lua_rawtostring, lua_rawtonumber, lua_rawtouserdata don't
12push anything on stack, return data of appropriate type,
13skip metamethods and throw error if object not of exact type
14
15package.findfile exported
16module not polluting the global namespace
17
18coxpcall with a coroutine pool for efficiency (reusing coroutines)
19
20exception mechanism formalized? just like the package system was.
21
22a nice bitlib in the core
diff --git a/logo.ps b/docs/logo.ps
index 8b5809a..8b5809a 100644
--- a/logo.ps
+++ b/docs/logo.ps
diff --git a/docs/lua05.ppt b/docs/lua05.ppt
deleted file mode 100644
index e2b7ab4..0000000
--- a/docs/lua05.ppt
+++ /dev/null
Binary files differ
diff --git a/etc/README b/etc/README
deleted file mode 100644
index cfd3e37..0000000
--- a/etc/README
+++ /dev/null
@@ -1,89 +0,0 @@
1This directory contains code that is more useful than the
2samples. This code *is* supported.
3
4 tftp.lua -- Trivial FTP client
5
6This module implements file retrieval by the TFTP protocol.
7Its main use was to test the UDP code, but since someone
8found it usefull, I turned it into a module that is almost
9official (no uploads, yet).
10
11 dict.lua -- Dict client
12
13The dict.lua module started with a cool simple client
14for the DICT protocol, written by Luiz Henrique Figueiredo.
15This new version has been converted into a library, similar
16to the HTTP and FTP libraries, that can be used from within
17any luasocket application. Take a look on the source code
18and you will be able to figure out how to use it.
19
20 lp.lua -- LPD client library
21
22The lp.lua module implements the client part of the Line
23Printer Daemon protocol, used to print files on Unix
24machines. It is courtesy of David Burgess! See the source
25code and the lpr.lua in the examples directory.
26
27 b64.lua
28 qp.lua
29 eol.lua
30
31These are tiny programs that perform Base64,
32Quoted-Printable and end-of-line marker conversions.
33
34 get.lua -- file retriever
35
36This little program is a client that uses the FTP and
37HTTP code to implement a command line file graber. Just
38run
39
40 lua get.lua <remote-file> [<local-file>]
41
42to download a remote file (either ftp:// or http://) to
43the specified local file. The program also prints the
44download throughput, elapsed time, bytes already downloaded
45etc during download.
46
47 check-memory.lua -- checks memory consumption
48
49This is just to see how much memory each module uses.
50
51 dispatch.lua -- coroutine based dispatcher
52
53This is a first try at a coroutine based non-blocking
54dispatcher for LuaSocket. Take a look at 'check-links.lua'
55and at 'forward.lua' to see how to use it.
56
57 check-links.lua -- HTML link checker program
58
59This little program scans a HTML file and checks for broken
60links. It is similar to check-links.pl by Jamie Zawinski,
61but uses all facilities of the LuaSocket library and the Lua
62language. It has not been thoroughly tested, but it should
63work. Just run
64
65 lua check-links.lua [-n] {<url>} > output
66
67and open the result to see a list of broken links. Make sure
68you check the '-n' switch. It runs in non-blocking mode,
69using coroutines, and is MUCH faster!
70
71 forward.lua -- coroutine based forward server
72
73This is a forward server that can accept several connections
74and transfers simultaneously using non-blocking I/O and the
75coroutine-based dispatcher. You can run, for example
76
77 lua forward.lua 8080:proxy.com:3128
78
79to redirect all local conections to port 8080 to the host
80'proxy.com' at port 3128.
81
82 unix.c and unix.h
83
84This is an implementation of Unix local domain sockets and
85demonstrates how to extend LuaSocket with a new type of
86transport. It has been tested on Linux and on Mac OS X.
87
88Good luck,
89Diego.
diff --git a/gem/ex1.lua b/gem/ex1.lua
deleted file mode 100644
index 327a542..0000000
--- a/gem/ex1.lua
+++ /dev/null
@@ -1,4 +0,0 @@
local CRLF = "\013\010"
local input = source.chain(source.file(io.stdin), normalize(CRLF))
local output = sink.file(io.stdout)
pump.all(input, output)
diff --git a/gem/ex10.lua b/gem/ex10.lua
deleted file mode 100644
index 2b1b98f..0000000
--- a/gem/ex10.lua
+++ /dev/null
@@ -1,17 +0,0 @@
function pump.step(src, snk)
    local chunk, src_err = src()
    local ret, snk_err = snk(chunk, src_err)
    if chunk and ret then return 1
    else return nil, src_err or snk_err end
end

function pump.all(src, snk, step)
    step = step or pump.step
    while true do
        local ret, err = step(src, snk)
        if not ret then
            if err then return nil, err
            else return 1 end
        end
    end
end
diff --git a/gem/ex11.lua b/gem/ex11.lua
deleted file mode 100644
index 79c99af..0000000
--- a/gem/ex11.lua
+++ /dev/null
@@ -1,7 +0,0 @@
local input = source.chain(
    source.file(io.open("input.bin", "rb")),
    encode("base64"))
local output = sink.chain(
    wrap(76),
    sink.file(io.open("output.b64", "w")))
pump.all(input, output)
diff --git a/gem/ex12.lua b/gem/ex12.lua
deleted file mode 100644
index de17d76..0000000
--- a/gem/ex12.lua
+++ /dev/null
@@ -1,34 +0,0 @@
local smtp = require"socket.smtp"
local mime = require"mime"
local ltn12 = require"ltn12"

CRLF = "\013\010"

local message = smtp.message{
    headers = {
        from = "Sicrano <sicrano@example.com>",
        to = "Fulano <fulano@example.com>",
        subject = "A message with an attachment"},
    body = {
        preamble = "Hope you can see the attachment" .. CRLF,
        [1] = {
            body = "Here is our logo" .. CRLF},
        [2] = {
            headers = {
                ["content-type"] = 'image/png; name="luasocket.png"',
                ["content-disposition"] =
                    'attachment; filename="luasocket.png"',
                ["content-description"] = 'LuaSocket logo',
                ["content-transfer-encoding"] = "BASE64"},
            body = ltn12.source.chain(
                ltn12.source.file(io.open("luasocket.png", "rb")),
                ltn12.filter.chain(
                    mime.encode("base64"),
                    mime.wrap()))}}}

assert(smtp.send{
    rcpt = "<diego@cs.princeton.edu>",
    from = "<diego@cs.princeton.edu>",
    server = "localhost",
    port = 2525,
    source = message})
diff --git a/gem/ex2.lua b/gem/ex2.lua
deleted file mode 100644
index 94bde66..0000000
--- a/gem/ex2.lua
+++ /dev/null
@@ -1,11 +0,0 @@
function filter.cycle(lowlevel, context, extra)
    return function(chunk)
        local ret
        ret, context = lowlevel(context, chunk, extra)
        return ret
    end
end

function normalize(marker)
    return filter.cycle(eol, 0, marker)
end
diff --git a/gem/ex3.lua b/gem/ex3.lua
deleted file mode 100644
index 60b4423..0000000
--- a/gem/ex3.lua
+++ /dev/null
@@ -1,15 +0,0 @@
local function chainpair(f1, f2)
    return function(chunk)
        local ret = f2(f1(chunk))
        if chunk then return ret
        else return (ret or "") .. (f2() or "") end
    end
end

function filter.chain(...)
    local f = select(1, ...)
    for i = 2, select('#', ...) do
        f = chainpair(f, select(i, ...))
    end
    return f
end
diff --git a/gem/ex4.lua b/gem/ex4.lua
deleted file mode 100644
index c48b77e..0000000
--- a/gem/ex4.lua
+++ /dev/null
@@ -1,5 +0,0 @@
local qp = filter.chain(normalize(CRLF), encode("quoted-printable"),
    wrap("quoted-printable"))
local input = source.chain(source.file(io.stdin), qp)
local output = sink.file(io.stdout)
pump.all(input, output)
diff --git a/gem/ex5.lua b/gem/ex5.lua
deleted file mode 100644
index 196b30a..0000000
--- a/gem/ex5.lua
+++ /dev/null
@@ -1,15 +0,0 @@
function source.empty(err)
    return function()
        return nil, err
    end
end

function source.file(handle, io_err)
    if handle then
        return function()
            local chunk = handle:read(20)
            if not chunk then handle:close() end
            return chunk
        end
    else return source.empty(io_err or "unable to open file") end
end
diff --git a/gem/ex6.lua b/gem/ex6.lua
deleted file mode 100644
index a3fdca0..0000000
--- a/gem/ex6.lua
+++ /dev/null
@@ -1,14 +0,0 @@
function source.chain(src, f)
    return function()
        if not src then
            return nil
        end
        local chunk, err = src()
        if not chunk then
            src = nil
            return f(nil)
        else
            return f(chunk)
        end
    end
end
diff --git a/gem/ex7.lua b/gem/ex7.lua
deleted file mode 100644
index c766988..0000000
--- a/gem/ex7.lua
+++ /dev/null
@@ -1,16 +0,0 @@
function sink.table(t)
    t = t or {}
    local f = function(chunk, err)
        if chunk then table.insert(t, chunk) end
        return 1
    end
    return f, t
end

local function null()
    return 1
end

function sink.null()
    return null
end
diff --git a/gem/ex8.lua b/gem/ex8.lua
deleted file mode 100644
index 81e288c..0000000
--- a/gem/ex8.lua
+++ /dev/null
@@ -1,5 +0,0 @@
local input = source.file(io.stdin)
local output, t = sink.table()
output = sink.chain(normalize(CRLF), output)
pump.all(input, output)
io.write(table.concat(t))
diff --git a/gem/ex9.lua b/gem/ex9.lua
deleted file mode 100644
index b857698..0000000
--- a/gem/ex9.lua
+++ /dev/null
@@ -1,3 +0,0 @@
for chunk in source.file(io.stdin) do
    io.write(chunk)
end
diff --git a/gem/gem.c b/gem/gem.c
deleted file mode 100644
index 976f74d..0000000
--- a/gem/gem.c
+++ /dev/null
@@ -1,54 +0,0 @@
#include "lua.h"
#include "lauxlib.h"

#define CR '\xD'
#define LF '\xA'
#define CRLF "\xD\xA"

#define candidate(c) (c == CR || c == LF)
static int pushchar(int c, int last, const char *marker,
        luaL_Buffer *buffer) {
    if (candidate(c)) {
        if (candidate(last)) {
            if (c == last)
                luaL_addstring(buffer, marker);
            return 0;
        } else {
            luaL_addstring(buffer, marker);
            return c;
        }
    } else {
        luaL_putchar(buffer, c);
        return 0;
    }
}

static int eol(lua_State *L) {
    int context = luaL_checkint(L, 1);
    size_t isize = 0;
    const char *input = luaL_optlstring(L, 2, NULL, &isize);
    const char *last = input + isize;
    const char *marker = luaL_optstring(L, 3, CRLF);
    luaL_Buffer buffer;
    luaL_buffinit(L, &buffer);
    if (!input) {
        lua_pushnil(L);
        lua_pushnumber(L, 0);
        return 2;
    }
    while (input < last)
        context = pushchar(*input++, context, marker, &buffer);
    luaL_pushresult(&buffer);
    lua_pushnumber(L, context);
    return 2;
}

static luaL_reg func[] = {
    { "eol", eol },
    { NULL, NULL }
};

int luaopen_gem(lua_State *L) {
    luaL_openlib(L, "gem", func, 0);
    return 0;
}
diff --git a/gem/gt.b64 b/gem/gt.b64
deleted file mode 100644
index a74c0b3..0000000
--- a/gem/gt.b64
+++ /dev/null
@@ -1,206 +0,0 @@
1iVBORw0KGgoAAAANSUhEUgAAAIAAAACACAIAAABMXPacAAAtU0lEQVR42u19eXRURdb4rarXa5LO
2RshKEshC2MLOBIjsCoMLGJhRPnUEcUGZEX7j4Iw6zqd+zjkzzowL6gzKMOoBRHAAPyQKUZQlxLAk
3EIEkQkhCyEoISegs3f1eVf3+qPTj0Z3udEJImN/Pe/rkdF6/V6/q3qp7b92tEOccfoT+A9zfHfj/
4HX4kQD/DjwToZ/iRAP0MPxKgn+FHAvQz/EiAfgapvzvQQ3DfviCE+rtTPYH/AAKouEYIcc4ForUX
5tXeKexhj6k8IIe2DvdUl0SYAcN7RGYQ63oAQ4hx8fBu6BXfC6vBcsHyDeNRi7cYboZQjBIRgl/lB
6KQcAQnyl+q1IAC9YU7/s2bOnsrKSUupwOHQ63cMPP2wymRhjGOOrV6/m5ORYLJbg4OABAwZYLBaD
7waBtQUsD34mqRT0hHc/abEpNjbWlxYEQCgw0RET463QEABjjjHFfyND/LEg737XsQpblhoaGioqK
8CxcunD9/fv78+ampqepgZFk2mUwBAQEYY6PRSAhRG7Tb7cXFxXa73W63W63Wn/zkJ4sXL1YfVHGB
9EFI5VZc0EDcwxjnnkoRbWhw7dxZt316Yn19TW9siyxQADAZddHRAWlrMffeNnDcvUa8nlDKEAGNv
107ffbClCnoYoFFRFiIufn53/88cfBwcERERERERHjxo2LjIz0ZbaqFLXb7ZcuXZIkKSoqShAYY7xn
11z576+vpJkybFxcUZjUZfOJKKfQBACP75z/yXXtpfXX0JAAFIAAQAAXAADsAAZAA0dGjMa6/Nueee
12FEoZQsgLDfqTAFqWIstyRUVFXFycJEniJ6vV2tTUFBUVRQhxkb0q2TTS7xr9tNxG/bdjtAjl5eXl
135ubW1dUhhJKTkzMyMkwmk0p4AMAYq91Tv1DKCMENDW0PPLBj797vEdJjrAfgjF2HP+d8B8YcAMry
145VP//vf5Oh3h3OM66P8V0NTU9N133+Xl5SmKsnr16qCgIBc8MsbE5HXXgjqdU9oRie8YY5c2W1tb
15CwsLS0tLFy5cqEoILWnFI84rHGNUXW29/fYPCwsvSpI/pQLxntYNxxhjDIpinTNn1K5d/2Uy6Zwd
16cNWO+o4A7mjFGOfk5OzcuTMsLGzixInjxo2zWCwqIlSpAL2k47tMc+18FN8vXLgAAHFxce4Cqa1N
17njlzw9GjZZLkryiK6KP3twEgnY7I8tWf/WzCtm33McZVJVV7H3nppZf6BvXaL+rAFEVJSEhYvHjx
184MGDDQaDykxAw1h6S38XLxUcRnRGnXyiM4cOHdqyZUtDQ0N0dLSfn5/4SUz/Z57Zs3PnCZ0uQFEU
19ANQV9jvIwxiTJOPp0xdCQgLS0gZRyjF2Hc5NXwEu866lpUWv1+v1enVBqFsnwWS0dLrZ4K7dlpSU
20ZGZmVlVVpaen33PPPYL1HzlSOXnyewCk+6gSo2OhocaCgl9GR1vEOtCO7qbbglQsY4yPHj366quv
21nj59GjScWtBGq0f2mVHBZbVxzhMSElatWvXzn//cORUAANau/Y5zB8YYoLsUQJxzQqSGhqb1648D
22gFClXO+4eSNUZ9alS5e2b99eXl4+d+7cqVOnCrl361hvOt2LCNWlttY6bNjbTU22Hk9WhBDnjhEj
23IgoKVoqdc1+vAFmW//WvfymK8uyzz86aNUvlP72HPrjBWaR2RkgIoXeJ2ZqbW9nUdBVj0uPGOecA
24ujNn6s+cuQRui6CXd8JaJUedSsJUEBoaqtfrtdd9p4HQ3rTGL9UE1ik2BZ/trmnMRePinAFAQUEt
25AMMYuXMP34EQRKnjzJlLqakRLr3uTQJoJarLzigyMpIxJiStVr/0pTXOQdgAMEaEYACOEPb+tKCU
26UOEVhYq9qKCKTwYyzW0XL169cUaNEAJglZVXwc2Q3msE0GKfEFJYWGg2m+Pj41UtyMeJr8W7olCB
27dFVS2mxKZeXVqqqrFRXN9fVtDQ1tbW2yw0EBQK8nJpNuwABTWJjfoEGB0dEBMTEWk0mHEBYPU8oY
28Y04S+roEbTalt1Bkt1P3i728AjjnhJCjR49u3rw5IyNDEACcvBW8ajgqRhSFCUsvQhghVF/fmptb
29efjwxWPHqs6da6iutlLqAFA86yQIQCJEHxkZkJQUMnFi9JQpg9LSYsLD/THusCtw3mHR7JIMfn66
303sKP2dxJU70sAzDGBw4c2Llz5/333z958mRVqfD+lBb1GCNhxa2oaP788x8++6z4yJFKq9UKQAGI
31+CCkw1jvqVkhPylllZVXKivrv/22EID4+wdMmhS9YEHKggVD4+KCxAqjlHkig9DfASA+PkismO7r
32oNeAMQ6A4+ODwG0K9o4aqtoajx07tnnz5mXLlo0ePVplO12iXhjZMUYYI1mme/aUrF+f/9VXJTZb
33CwAG0GFMhDHLxfjlHQTTF/KTMQogAzCDwW/27ITHHhs/f36SXk+8GO4VhUkSzsoqmTv3XxgbbkQI
34A3BJQmfO/DI5eYAQhL1JAK0l68qVK1euXElMTOyS6av6EqViI4bb2+WNGwveeCO3uLgSAAAMhBCA
35Dh/TjQMhCABRSgHsAJCUFL16ddrDD4/289OrfQDNahBGiKYm2/Dha2tqrAj1YCcMAIAxYsw+aVLs
36kSMr3G2IN7QPcOqFXJ3IISEhCQkJvmBfaIeKIqQifPDBiREj3n3iiW3FxTUYmwgxCWT1FvYBgFJO
37KQVAhJgwNp07V7ty5afDh7+7fn0e50AIVhTGmNZiCIrCgoKMixYNB7D3aCcMTvalPPjgGNEHl597
38vgI8Gd8FL/JkLnaf+IcPV6xatScv7zxCEsYGdQd0k6HDvs2Yg3PH6NFD3npr3vTp8Wqv1D0Hxqik
395MrYse+0tFCn48X3LSTHGDMmJySEnDjxy4AAfa+tAK1yWVpampubqxJDMLhOub9W2BKC29uVX/7y
40i/T09/LyygjxQ0hPKe0T7AMAYoxTShGSCPEvKKiYMWP9E0/sbm11iKXgHAIoCktMDHnxxVkAbTpd
41t9DFnahW/vSneQEBHYzOBS09IYA62THGra2tmzZtOnfunO9PCeF25Ejl+PHr3n13PyE6jI1O1Pex
42dQgxBpRSjA2E6N9//+DYseuysyskCVPKBTsiBDHGn302ffHiCbJs1ekkJ3K7GC5CSKfDlFrXrJm1
43ePFwShnGnYyuJwTQ+vk2bdrk5+e3ZMkS9Scv2GeMU8p1OvLOO0enTn3v7Nk6QvwpFQbRfjTMIcYY
44pZwQ/9LS+mnT3n/99e8kCQtmKNYB53zTpkV33jlGlpslSWzIPZFBhKUQjLksW596auZrr92hYt8d
45Pz1cAQKhmZmZpaWlS5culSRJsKNOJYrWqY0xeuKJz3/1q38DYIz1lIrNYT9gHyFXAxGlFGM9xtIz
46z+xctuwzYUESXnXOQacj//u/S3796zsUxU6pDSGQJEKIsHB0fAhBkkQQ4pS2Ygyvv77o3XfvFNjv
47zagIVZLs27cvMDBwwoQJqpHHE98Xno3WVvlnP9v65ZcFkhSgKKybAu0GgQMgse2iVIQviIFjjDHG
48YnvccZskYUWxzp49cseO+y0Wg+i82DFIEj58uOL55/cdPFgKYHfuDcUoGAAFYISY77572B//OGv4
498DBFYd6jg3pIAE8hCF6w39xsu+uuTdnZZyXJv2+x34F6xhjndgAOoPfzM5nNEqXcarXLsg1AAdBh
50rIcOB5GgQcukSQlffPGL0FCTGIJgSmI65+VV79xZnJNzsby8UQ3MSkgImT49PiNjWHJyqBrC5d3u
511A0CuHstvOv7KufBGFmtjnnzPsrJKZEkP0WhfTnxnV1t0+mMs2YlLVyYMnFiVHS0xWzWUcqammzn
52zl359tuyf/+7sKSkGiEJIT1jFAAkiShK68SJg7OylgYFGcVAAECrqiKEZJm2tysIgdmsc14EWRY2
53FY/q+A0RQG3Re2yIerMsszvv3Pj114WS5N/n2McACufKz38+/uWXZ6SkDHDvs4rH7duLXnjh69LS
54GkLMlHIALmgwbVry3r0PGwwd4T3gNDcJkqiUUC8SgjEWPoyuba6+CmFtAMH+/ftra2s7COjVuim0
55iEcf/axfsI8x5twRGGjYufPhrVsXJyeHUsrEdlf7oZTLMiUE33//yFOnVj7yyBRK2wgBAKQoVJL8
56Dh78YenSHerqV13cOl2HhUr1DmGMdDpSX3/p22/3C1+3FnU3RAC1obNnz+7atau9vd1L007WzwnB
57r756YOPGI/0y9xmTo6IsBw8+vnBhiixT4dIRWNN+CEE6HRF7LoOBbNiw4JVX5lNqwxg5aeC/deux
58F1/cRwimVJV/AM79ppAK6opvb2/ftWtXSUlJl9iHbsUFiXds2rQpOTl52rRpnoydzoAfJkk4M/Ps
59Y4/twNjotIH0ndQFYP7+ur17l40ZEyHLVJKwpy26+q/Q7hWFzZw5uKVFyck5R4gwjQDGhgMHzqam
60Ro8YMVBs472YuYKDg69cuVJQUJCWlubi5nQHn1aAuu5OnDhRU1MzZ84c7/cLda2mpuWJJz4DQJx3
6114Ryo4AxAnC8+ead48dHORxUhIx7R4Rzb48IwYyx116bm56eRGm7sMFxDgDSU0/9b0VFsyRhL/YS
628Yrbb7+9trY2Ly9Pxd4NEUCFc+fOTZgwYeDAgWL6u9+g2kcB4Omnd1dVNRCi57wvN7rC/mWbNWvo
638uXjKWU6He5SErrQQAjb116bCyAJAwnnjBBdXV3jr36122WY7sAYCwsLGz9+vOBCXbzURy3Iydap
64oijafIfr7+kw4UoS3rLl1H/912ZCTJT2tZkBIcS5PTNz6fz5yaIzvicMqWillEsSzsjYsnNnASEm
65oRQRgilt+/DD+x9+eKyzZe6GhA7M2O12Qoga7O3pdb6yIPEXY+w1qodzziUJNzXZXnghC0ByKgJ9
66BxgD546UlIjbb08AAEKuCUwfQTu0hx4aDYDUKcoYB9D9/vdfX77c5oURiZWk1+tFYD14FcVdEECr
67fbq8wH36g9Ph8Ne/ZpeV1fU581HRp8ycOVinI6pVuQftCH1/6tTYoCALY1SIUs45IfrKyvo///mQ
68kx6uyHVHTqc49JUA2na1Ar2zUXHOQZJweXnTO+/kAhj7nvmoMG5c9I08rlpABw70T0oKBVCc4xV+
69JNM//nHk3LkGwdw6fVz7txc2YoyxrVu3lpaWImecs4fbOACsXftdc7OVEOlGwgh6DJwDAImNDdTi
70omcghhMTYwFg2glNCGltbX3jjRzoLNhWizSEUHl5+datW51G307AGwFU/amqqur48eOSJHm9EyQJ
71V1Vd/fDDEwCG/jLxc84BkNEoAXRD8HpoCgDAZNJdP5PEIjBs2lRQXt4kFoEXFi9J0vHjxysrK8GD
72PurTCvj+++9jYmJiY2O9CHQxFz766ERjYxMh0s1OO/AEIoDH4VBUDN4g2GyK20zihEhW69UPPsgD
73z4tACIOYmJiYmBgRkd8pdEEAsXssKioaOnQoeBAj4pokYYeDbtpUAKDrD+eiOmwAoCIKE3ywBHgd
74OwKAqqqrAC68XvBh/ebN37e3y5KEPWOGA0BycnJRURFowgOve0uX/bBarYqiCAJ4gI44hm++KS0q
75qkVI31/TX2AHAPLza26kCTU5oKGhraTkCgBxGRHngLHu/PlLWVkl0FmwiRaGDx8uy3JTU1Onv3at
76hgYEBKxevTo2NhY8y3TRvU8/PQ1ARZbnTcaytw4DSPv3lzHGvMxN39qB3NyLDQ3NGEvubYjYrU8/
77PeOpBRVXMTExq1evDgwM7PQ2bwRQce2Siu4OkoStVntW1vn+5T8AwBhHSHfqVPWBAxfAq5biCdSg
78MQDYvPl7pwrE3V8EoP/669LGxnZP+qgAQojJZPLkG/BIAHXiMK/bWTWO6tixqsrKKwjp+rv2hBgk
79FWqi6Ex3nU6UMknCBQW1//73GQADpZ1MKc4BY6murik3txKgI4PBS8ue3ANdywDkBPDo/AIA2Lev
80FEDpNPSlbwExxhEyff756W3bTksSVhSP4RpuA7mWmgAAzz2XJcs2LxGJgtL79p33gjoXBLpDFwRo
81bGwsLi7W1gXopAmMACAn56K7sOonEGUbpJUrPz93rkGnI7JMVX+Wx2ec2JdlJkn4j3888OWXZwgx
82ednQcM4ByHffXVSR4OEeYIz98MMPjY2N3SCAQHphYeG2bdu8+h0BY9TY2H7mzCUA7+o/BwBJwuKD
838Q1F3HsFYVWWLl+23nXXxoqKZkED1UnrptJ0/KsojFKu15O///3Y73+/F2NTp8zn+gelwsLLly61
84CiO2xw4htHXr1sLCQnBj6dhz0wAADQ0N4eHhXpawuF5aeqW+vsVrKnOHl0pRWsSHMYcz1vWm0IAx
85hrHh7NlLU6a8n51dIXwyAsXOND+uutFlmQonEsbouee+XrlyB8Z6sey9vINzQAg3NbWWlDQAeHMP
86IIQiIyMvXboE18cVgpcMGTHrm5qagoKCwHMqj2iqqOgygEyI5FkjRgA0JMT/oYemMMbNZik7u+Lw
874dKbKbQ7aFBV1Txjxvqnnpry/PO3RUT4u3gyEOpYxAihb74pW7MmKz+/lBATpeCLFw9jRKlcVHR5
88ypRY7wMJCQnpdCvQBQFqampGjRrllQAcAM6fvwLAvOTxYIwYYxER/m++OU+WqU5H/vzn7MOHfyDE
89IIzGN48GCOk452+/vf/DD/MXLhy+cGHK2LER4eH+BgNhjLe0OMrKmg4evLBly+mjR0sBgBA/Sn2N
90GxNDPn/+CnheAeK62WwWDjIXNHZBgGnTpkVFRUFX4ebl5U2+ONc45yIwRKcjvZh54R1FnDPOESF+
91Vqt948bcjRuP6HTmsDA/k0lijDc12RsbW0SQIcZGABBJHD5uZYTtr7y8CTy4SVS8DR8+XPASn1iQ
922sqUKVPUnAsPdwIA1Na2+DhfCMGS1FHWrk8IAKJjlFIATIiZc5BlWl3d6JzjCIBIkr8QBt0NHhDR
93QLW1LeDZ9C2iZuPi4uLj413Q65EAmjypTqrruOAUABobbW4Wq1sN1KhCBIAQujZwkSmlva27LTc2
942gDAwxS9LoPapRwXdOkPgK58GkL/bWlx9GuAfzeQ5RyaWu/gWnC5Om7fmxMsqLXVIaLYfbv/OvDG
95grR830vrjHFZ7gPvu8hX6ZhBIkyhM6q73MY830Mo5ZxTkQ/sXBmYENJVRTJXbMkyY4x7spZ5R6a3
96fUBLS8uWLVvq6+vBqzlFNQfdzG2wCM6hYg9BaZsT+7yz2xTnbe2aeobqDYKjUkVp4dxuNOojI4Ni
97YkIiIgJNJj3nsqK0cE67lRPp3RAkfrpy5cqWLVuam5tdEOUtU16W5ZMnT6alpYWFhXnxhWGMhOHX
98R5NLDwAhxLmSmDhw6dIxisIaG9vffvuou5EAIcS5nJoac999IxWFVVdffe+945p7OIDI226LjBzw
994INjfvrTxKSk0MBAA8ZI5AqUlFzZu/f8Bx/k1dZewdjkm2OVq3GPngiAEGptbT1x4oQIKtQi0xsB
100JEkym83ecSra0uvJTfUBYIwoVZKSQl54YRoAlJc3/f3vx9yttOK21NTw55+/DQAKCmrfe++YBoMI
101IWDM9sQT6X/961x/f9cAJ4vFEBUVMH16/G9/O3X58s+2by/A2OidBsJwrdcTX5Q6s9ks/Oq+pqmK
102ux0Oh1cCdHS9D5wwKsZFioTnLl2z7WgvY4w4t/2f/zNt3bq7jUZJWEnb2uTy8qZz5xpqaqxCkDoc
1031GIxbNt23223JTDmS342t1gMahKcJ7DZbACg07nW6/C2AvR6vUhE7Wq0KDTUBNC9ALQegLrKnUmK
104ncO11S1h7UXG5Li4ga+8MotzTgi6etX+4ovf7thRePlyG6XcYCCDBwc//fRPHntsvMNB9Xry7LO3
105HTpUKp72/C4AYCEhRuiq8Ep7eztCSK/Xd4MAGOPHHntM1PL0nH8KABAdbfEgFW8VEEabO+5I9Pc3
106tLXJZrPu/vs/3bPnBMZ+jImodKWwsOrxxz9ubZVXr04DgPHjowIC/K1WG0Letzg8OtqiosIdBOqS
107kpIef/xx99CeLvwB4eHhQgx42oWJ9e6s6dLfaO4KxoyJBACzWXfgQNmePWckKciZ44gAMCEGAOMn
108n5wUN1ssBn9/PYA3didsQaIOjXcsmUymiIgI9xsk762L8nVqRpj78+JKSkooAOmrKgM9AcY6nPWt
109rQ4AyM4uBxD7gA59X5hFAXBbm+K7QUIUAkpJGQDXMwltipxKg04R6G0jxjVlNzyB2AkPHTqAEEM/
110BoN2CZxzAN2nn5749NPjAICQjhADAEeoo2QQ54xzBaAlPn6okyRdj4UxBmBwEuAa6kGjC6hGuk43
111Yt6iDcUKsFqtfn5+nuISRVNxcUFxcUGlpZcRkm5VixAC4BgbCUGEYIdDobTdyV4wgC4gwBgVFTB9
112+k9efHG6ry0i4JzGxAQPHhwMzrmoTSRV+YdLQrX2YhcEqK+vX7du3YoVK8LDwz3xOEqZwSCNHRtR
113WlqLsa6v7Mw9Ac5BURRZtpnNAWPHJo0eHT506IDBg4NiYizh4f4DBpj1euKJV7iD2HaMGRMhSj6p
114GawIIVGhua2tbefOnQ888IBIquCaepLqsujCHxAYGMg5r62tDQ8PBw9iQEz5GTPit28/0d8Y9oZ8
115hDDnsr+/Yc2a2cuXj42OtrjkPAuk1NW1DhhgliRfeCkC4NOnx6tI4M6ikQcPHszOzo6MjLRarcXF
116xXFxcRaLRSS3MsbKysqioqLE8RHehDDn3Gg0hoWFlZeXjx492jOlOABMnz4Yof7MCegCVQhxLkdF
117WfbsWTpq1EBhvUEItbfLVVXWysqrZWWNZ8827N9fTik7cuQx8MG0RSkD0M+cORg6WLHgchgApkyZ
118Eh8fn5WVxRj7/PPPbTabxWKJiopKTEwMDAz8+OOPn3zySXEgiDcCCGIOGjSouLgYPAgl9YyUUaPC
119x42LyMu7eMP17W4UtPsvFUSm0IYN944aNdBmU4xG6fDhin/841hOzsXKyquybAdQMAbG6MiR8T7y
120H8Yco0ZFjh0bKf510gA45xaLJSgoqLq6OiIiYuTIkefPn7948eKFCxf279/f0NCQkpISGRkJLn6J
121zpArVMyU9vZ2tR5Kp3dSyiUJ3XNPSl5eGUJGgJu7DrwkmwLwyEg/l6uEIErtkycPmTcvyeGgRqP0
122t79995vf7EKIca5T62ASgh0Ouyj02hWIIgjyXXcNxRiJkihOSndwaUrpnDlzBMYSEhKGDBkixHJj
123Y6PZbAY199UL9gVPTEpKSk5O9u6cEZczMob/z/8cuHkZ8S6ntbj/DsABsJiSLmMBoGlpMQCg15Pq
124auvLL2cBSJKkUxQm3DLCNwDABUftCkSahnHx4hHunXGWLcCHDh3Ky8tDCA0aNGjq1KkiwCc0NFSV
12585zzLjxiWsekp4Q/5KzNOXJk+OzZgwEcvgQoIoQAsBqn5eXj3CJdA6NRMplc3B8dWbQDBwbOnDmk
12609GEh/uLb+XlV6xWGWNJRGupN0gSAXAMGxaqGbtHCzyAfcaM+HHjotQCNi5427VrV2ZmZnJycmJi
127Yk5OjsPhOHnypOpcUbUgn6xa2mM/PBn9Bd9/9NEJaje8E4BzGaBFUVrVUC1PH84V56JmAKAoLDzc
128f9y4CACbXt9R+EGSCCEYoPU3v7ltwACzqCbtAlZrh1k3IiJAr8ecc0lSH0eSRByOlvDw0Fdeud05
129duHkwm7hNuI7f/TR8eAWgC12r3V1dceOHVuxYsX8+fMTEhLi4uIGDRqUm5u7bds2uD5+ouvSxej6
1302kyeQDDBBQuGjRoVfepUDcZ6T6JYrI/x4wc98sjtAQEGr1l/YDJJu3efLS6uA5AqKpplmYrH//Sn
131eXPm1FitzSK0i3PKOaxcOXvNmnS1sI8WKQD4++9rAcDhoEOGhDz2WNq77+5jTM8YEtoj5zBpUuLG
132jfeKoiqEYEIwxgqAnXOjtmAlQpgxx9ChkYsWjQC38A6BpbKystDQUBEGcezYsYSEBAC4995733nn
133ncrKypiYGLXUQBcEUGNSDh482NzcfPfdd3dapAA5yyHqdHjVqsmPProNIYO7KBaF6MUsnjVryJw5
134CV62PMLxK0m4vr61uPiiJPn98EPd4cMVM2YMttuVSZOi8/OfWrs2Ny+vRlFYQkLwL34xZt68RADY
135uLHgrruSQ0PN6pZQxPLv23e+pKQhMTFUlunatT/9yU+it207U1fXoteThISQBQtS7rwzyWCQGhvb
136jUbJaEQGg/SrX6W/8UZua6ujudnmHAvHGFOqPP30ZOFUEDWxtKgAAD8/v6tXrzocDs55ZWXlrFmz
137AMBisRiNRhf/iq95wnq9/rvvvrNarWpghadF8NBDY1JTB1HaiStD6KyEYEKQpyqCngBjBMDWrNnb
1380uIwGCRZpoMHB61dOz87+5EjRx7bvHnRHXckAMBf/5rzyiv7goONoIlY5hwwJm1tbatXfymyORnj
139Dz00eteuJUeOPJ6dvfzDDxcuXJhiMEj5+TXp6RvKyhoRQna7smpVWlXVMw8/PAbARggSyg+l9pSU
140qGXLxrlMf62eMmzYMKPRuHXr1ry8vIEDB0ZHRwPA6dOnKaXiu08uSe1948aNy8rKOnny5G233ebJ
141LCoWgV5P/vCHmYsXb3KZzgCorU0+dOiC78YixlhgoLG2tgUAKwrD2HD8+IVZsz745z8XpqaGq3HO
142oj/Nze0vv3zgzTe/iY+PPHSowmzWnTp1SdsUxsbMzNNz5360bt09Q4YEO+cQF1HTly+3vv320ddf
143P9Ta2rxhw4m//W2uWoxAOFydwQ3AOX3xxZkmk+v0V3l1W1ub2Wx+5JFHNm7cKPhPTk5OTU1Nbm5u
144RkaGwWDQchGf4gkFF9q9e/fJkyefe+457dmCbljukEJ33bUxM/MMIWZnpJ/qrunBeQgEAKsBDYzZ
145JUmaPj1xxoy4uLggnY5cvtyan1+7e/cP9fUNGJsZY863IAA1XxyphVSMRuOcOUnp6bExMRaEUG2t
1469ejR6qysksbGKxibADBjjgULRt5zT4rJJFVVWf/1r/yiolqEJIQQY+1z5qR89dVS7cFsKkIF9r/4
1474osFCxbodDpZlk+dOnXq1Kn6+nqz2Zyeni7OI9VObp8IIO6ur6/funXrkiVLhCbrKVZXBBsXFdVP
148nPiP1lbFibsOGvTAaaNWkxT/OQ9BsQOoQZxC2OjV8Gz1LW7hPeJxUT6ROTmw+rhOhOUihDi3qSH1
149AHonq+BGI8rNXTF6dIRaDVQb+EYIaWxsfOutt1asWBEREUEpdT8IE67Hgk8pSuJLaGjok08+6QX7
150HS1ipChs2LCwV16ZA2BzMQyIXU+3Pi7dYYxxDoQYJcmfEDMhJvEFIaI66zXPos4eR86nTNc/TtXH
151CTGpjSMkidgTgPY//GHW6NERatF3AfX19QL7lNLg4OCwsDCRGAwaxb2trU1dKNpJ373kKRfC+MaI
152ThPi52RE/6HACSGUtt1+e0pW1jXmI5Bgs9lef/31gICABQsWDBo0CCH0zTffHDlyJCUlpbGx0Waz
153ORyOpqamMWPGLFy40L3OW/fKVoLGeOuJBiLaUj2BdPLkdRUVTRjr+7tAdM+xL0rQR0YG5OauiI0N
154FEPT8pPa2tq9e/eeOXNm6NChCxcuBIC33norPDw8MDDQZDL5+fkZDIbU1NROmUf3YtmcWZy0tbU1
155ICDAMw2u1e07ePDCnDkbZFn1Cv1n0aDj9BiEWFbWstmzh7gXylLnYmVl5e7du0tLS8ePH19RUbFg
156wYLk5GRtbdtO0dW9mnGilYKCgrffflsEunRKvw5nm4QVhU2bFrdhwyIAu6hZeMvGrXQ6XBHKyLn9
157/ffvnT17iKi+6C5UBURHR69YsWLFihX19fXV1dUHDhxoaWkRKoOQLp1O1m4fZ4sQCgkJOXz4cFNT
1580/Dhw9UW3TNDOOeEYEWhY8dG+vub9+49TYj+epXmVgYOgCQJUdr6xz/euWpVmkjs6TQHpr29/bPP
159PtuxY0dRUdGkSZOmTZsWExNTVFSUlZVlt9tjY2NFPFanWUbdI4DQeXU6XVhY2K5duxISEgRf8xCa
160isQ5RpTy9PRYQvTffHNGkv4jaNCBfUVpfeGFef/93zO0ey4XwwNj7P33329sbExPT9fr9YmJiQI/
161aWlpFoslPz9/xIgRJpMJPOjg3ZYB4NRwPv7448rKymeffRa8pvAh5ylVkoT/9Kfs5577nBAjY7jv
162y8n5PkqEMMac0vaXX57/hz9M91SCXjipjh07lpWVtWbNGrWcoSzLe/bsSU9PDw4OppS6HMbuAt07
163yE3b0J133nnlirfsQO39hICi0N/9Lj0kxLRixQ7OMSG6W1I35RgTzmVK6TvvLF65cqIn7KuGkKqq
164qoiICL1eL8syxlhUNTlx4oSiKPfee2+X7+v5ESYWi2Xw4MEuEqlTd42TBliW6eOPj//yy0eCg42U
165tkuScKrcImJZJPITxtoDAgyff7505cqJskxdsK8OkznPlIuKiqqoqGhtbRWRz4qi6HS66dOni6TU
166Ls9w7DYBtL1Rjy1xiezw9IgkYVmmc+cmHD/+1MSJgxWlhRDo6flcvYx9jDEhoCgtY8bEHj/+5F13
167JQudx9MACSFiso8dO9ZsNn/44YeiUqu48/Lly2qCu/cXd1sLguvLMoovly5dUhTFZDJ5OstE02+s
168KCwkxLRs2Vi7nWRnn+dcIUTv9Oj2PUfqyBdjzME5Xb165iefLB440F/oPNrxav2INpvt8OHDR48e
169tVqt0dHRI0eOzM7OPnjwoF6vlyQpNzf38OHD9913X1BQkJcM347GbySpSDWUbtiwwWq1Pv300ypt
170vItlcWCLOI9lxYrdp0+XI2TEWHKu674hA3dGSimc21JSYtetu+v6s9w6hgiaEAWEUGNj4/r16yml
171AwcOLCsrE5bnkJCQL7/8sqCgQJZlPz+/u+++e8SIEVor6U0hgIrQq1evvvHGG3FxcUuXLgXPSpH2
172EVU1stuVd9459uqr+5uaGvuKDNeh3mIJfP756atWpQkPl/ASg5PBqtNfDeh8//33CSHLly8HgLa2
173to8//ri0tHTVqlXh4eF2u729vT0gIEA1gnYZ5dgTFnQdARFijBmNxmHDhu3Zs+fixYujR4/2/mIt
174OxJG3alTY5ctGwugP3WqzmazAiCMJe8FYHqGdwDkFKoK5+1+fuaVK9O3bFk8b16SKJWrMn2xshlj
175Fy9erK2t9fPz0+v1CKGmpqY9e/YsWrQoKCiIUmowGMaNG1dcXHzmzJlJkyYRQoxGI3Kecuc9lkfA
176jZ4nrHY0PDx8+fLl3377rcPhMBgM4HUdqNNKnISgKCwszO8vf7n9179Oe++9vPXr86qr6wEAQC8E
177XbdOse3sdcI9KU4HdQBARMSARx8dt2LFhOhoC2PcRdcUgyopKdm5c6fVahWCbfHixampqeJXNW1L
178WPx/+tOfrl+/vq6uLjw8XCj+XmoL9DIBtNSOj49ftmyZOgzBSbyXOVBrjgosRET4v/TSjDVrpmRm
179nvvoo5P795e1tVkBAEAHIKk4UvPcPaFbcA6V0XGuUKoAcJMpYNq05IcfHn333UNFlqTgOcLCIxoU
1806M7Pz//kk09mzJiRnp5OCMnMzBTFZgIDA+Pi4r766qvhw4cTQhRFAYCgoCBCiN1uB429wUffU68d
1816KyuXK28cr/i4XEQfFk9XlkMoLraundvyZ49JTk5FysrmwDEKWDCQyk+1zXpNHIw50ds9PRRUUFT
182pgyaNy9x7tzEmJiOoGj1CGn3GOnGxsa//OUv99xzT1pamjYmU8yn+vr6N998MyEh4cEHH9TpdAih
183L7/88uTJk7/97W99n/i9TACVDNfaRSgvLy8iIiI6OrrL7bg7ISnlCF07q6u9Xf7hh4a8vOrvv68r
184LKyvrLx66VKr1eqQZVlzJh4CwDqd5O+vHzjQLybGMmxYWGpq+PjxUcOGDTCZdFoFzNP5aoKlZGdn
185Hzhw4He/+506lxFCLS0ttbW1JpMpOjq6srLygw8+UBRlxIgRjY2NFy9efOSRR4YMGeLLIeIu0Jtn
186yrsYab///vtt27YtW7YsJSVFXQq+tAAA6lmaooSM0SiNGRMxenQ4dIh93txsa262NzfbbDZFVKrQ
1876bDRKAUGGi0WQ1CQ0WVqi7P7xKmFWut8px0wGAytra1NTU2hoaGKopSXlx85cqS4uNhms1FKp0yZ
188snjx4meeeSY3N/f8+fMhISH33nvvwIEDuQ8ZXZ0MuRdXgArq8L744ouvvvrqjjvumD17ttejNzy1
189I8JAROHBDtYv+IYXh6jTRX7tLFRN8lAXJdWdC679jTfeYIwlJiaWl5c3NDRERUVNmDBhyJAhZWVl
19027dv/8UvfjF27NgunS39QwAt98cYnzlzZvPmzUuWLBk1apSWn/asu2pvPVVkVaN3tP92t32EUHV1
191dWZmZnNzc0JCwsSJE0U0lfhp3bp1gYGBS5YsURRF3eX2gPvfLAK406ClpcVgMOh0Og361KolXWvK
192fQlaa4/LF+HVkiTp7bffjo6OzsjIELLtBvvfwyPNvYM6u4Uyqk2yFIYUdffgyX7Xl6BqONq9K3cm
1931MmyzJ1nF0qSdOjQocrKysmTJ4NTON/g7OlNIawFtVtaHU5c+eijjzDGGRkZAwYM8FE43yTQmnVB
194M+XVBVpXV/fBBx/Mnj07NTX16tWr+/bty8vLe+CBByIjIz2dpNZtRPXZ7FOXc2lp6RdffFFRUTF2
1957NhZs2aJBNjr+tQj8dDdzqjTXFWRtdtGZ2CHsmvXrtzcXJPJpChKWFhYRkZGbGyslwOsuwt9vfxV
196Mpw9e3bHjh1JSUmLFi1y2eyoJtxep4SLyFH/LS8vz8zMHD16dHp6urtuc+nSpbq6uuDg4KioKME5
197u9xa3ooEUMejVmJUFEVRFJEuK8Zjs9lUY1ZH/9yQ1bP3goa0Ku7sdntOTk5+fn59fX1CQsIdd9wR
198FxenfbX7svDdyuYj3CwZ4A7qNk0MQARTqmfNAYDNZlu7dq3FYpkwYUJSUpI4ckKrh2hnnIvBw9O7
199tPeD2ykuIm8rMTHxoYceEjsp7SMuEkIVxb27KPtHA3HX9gTDPXv27MmTJ8+fP2+1WtPS0jIyMnqw
200uXdRIgU0NzdXVlYWFhaOHz8+ISFBZXoqu+uyQupNgr5bAVpwd2oCgCRJw4YNGz58uKIo586dcxED
201R44cqampGTRoUGBgoMViCQ4OFhsLLaIZY4qiUEpFjSN1J7hjxw5ZlgkhgYGBqampLj1RVaA+EP6d
202oKJ/dXABWg4LTkah5d0iSe3YsWMOh8Nms8myvHLlyujoaDGR29vb169f39LSIqwI4eHhK1euBKdh
203ubq6uqioaMiQIZGRkULegJvZqh93grcEAQRop7N2q6xlVoyx1tZWq9U6YMAAbSDU8ePHEULiANOg
204oKDY2FithHCRFv0y0z3BLUQAT6C6d7TaIfiAR5c9bZcBA/0C/wEEEKDtZ6duHy1a3Wtk37LwH0OA
205/1fhphjjfgTf4f8C4VLHz/5KLxoAAAA8dEVYdGNvbW1lbnQAIEltYWdlIGdlbmVyYXRlZCBieSBH
206TlUgR2hvc3RzY3JpcHQgKGRldmljZT1wbm1yYXcpCvqLFvMAAAAASUVORK5CYII=
diff --git a/gem/input.bin b/gem/input.bin
deleted file mode 100644
index d24a954..0000000
--- a/gem/input.bin
+++ /dev/null
Binary files differ
diff --git a/gem/ltn012.tex b/gem/ltn012.tex
deleted file mode 100644
index 8027ecc..0000000
--- a/gem/ltn012.tex
+++ /dev/null
@@ -1,695 +0,0 @@
1\documentclass[10pt]{article}
2\usepackage{fancyvrb}
3\usepackage{url}
4\DefineVerbatimEnvironment{lua}{Verbatim}{fontsize=\small,commandchars=\@\#\%}
5\DefineVerbatimEnvironment{C}{Verbatim}{fontsize=\small,commandchars=\@\#\%}
6\DefineVerbatimEnvironment{mime}{Verbatim}{fontsize=\small,commandchars=\$\#\%}
7\newcommand{\stick}[1]{\vbox{\setlength{\parskip}{0pt}#1}}
8\newcommand{\bl}{\ensuremath{\mathtt{\backslash}}}
9\newcommand{\CR}{\texttt{CR}}
10\newcommand{\LF}{\texttt{LF}}
11\newcommand{\CRLF}{\texttt{CR~LF}}
12\newcommand{\nil}{\texttt{nil}}
13
14\title{Filters, sources, sinks, and pumps\\
15 {\large or Functional programming for the rest of us}}
16\author{Diego Nehab}
17
18\begin{document}
19
20\maketitle
21
22\begin{abstract}
23Certain data processing operations can be implemented in the
24form of filters. A filter is a function that can process
25data received in consecutive invocations, returning partial
26results each time it is called. Examples of operations that
27can be implemented as filters include the end-of-line
28normalization for text, Base64 and Quoted-Printable transfer
29content encodings, the breaking of text into lines, SMTP
30dot-stuffing, and there are many others. Filters become
31even more powerful when we allow them to be chained together
32to create composite filters. In this context, filters can be
33seen as the internal links in a chain of data transformations.
34Sources and sinks are the corresponding end points in these
35chains. A source is a function that produces data, chunk by
36chunk, and a sink is a function that takes data, chunk by
37chunk. Finally, pumps are procedures that actively drive
38data from a source to a sink, and indirectly through all
39intervening filters. In this article, we describe the design of an
40elegant interface for filters, sources, sinks, chains, and
41pumps, and we illustrate each step with concrete examples.
42\end{abstract}
43
44\section{Introduction}
45
46Within the realm of networking applications, we are often
47required to apply transformations to streams of data. Examples
48include the end-of-line normalization for text, Base64 and
49Quoted-Printable transfer content encodings, breaking text
50into lines with a maximum number of columns, SMTP
51dot-stuffing, \texttt{gzip} compression, HTTP chunked
52transfer coding, and the list goes on.
53
54Many complex tasks require a combination of two or more such
55transformations, and therefore a general mechanism for
56promoting reuse is desirable. In the process of designing
57\texttt{LuaSocket~2.0}, we repeatedly faced this problem.
58The solution we reached proved to be very general and
59convenient. It is based on the concepts of filters, sources,
60sinks, and pumps, which we introduce below.
61
62\emph{Filters} are functions that can be repeatedly invoked
63with chunks of input, successively returning processed
64chunks of output. Naturally, the result of
65concatenating all the output chunks must be the same as the
66result of applying the filter to the concatenation of all
67input chunks. In fancier language, filters \emph{commute}
68with the concatenation operator. More importantly, filters
69must handle input data correctly no matter how the stream
70has been split into chunks.
71
72A \emph{chain} is a function that transparently combines the
73effect of one or more filters. The interface of a chain is
74indistinguishable from the interface of its component
75filters. This allows a chained filter to be used wherever
76an atomic filter is accepted. In particular, chains can be
77themselves chained to create arbitrarily complex operations.
78
79Filters can be seen as internal nodes in a network through
80which data will flow, potentially being transformed many
81times along the way. Chains connect these nodes together.
82The initial and final nodes of the network are
83\emph{sources} and \emph{sinks}, respectively. Less
84abstractly, a source is a function that produces new chunks
85of data every time it is invoked. Conversely, sinks are
86functions that give a final destination to the chunks of
87data they receive in successive calls. Naturally, sources
88and sinks can also be chained with filters to produce
89filtered sources and sinks.
90
91Finally, filters, chains, sources, and sinks are all passive
92entities: they must be repeatedly invoked in order for
93anything to happen. \emph{Pumps} provide the driving force
94that pushes data through the network, from a source to a
95sink, and indirectly through all intervening filters.
96
97In the following sections, we start with a simplified
98interface, which we later refine. The evolution we present
99is not contrived: it recreates the steps we ourselves
100followed as we consolidated our understanding of these
101concepts within our application domain.
102
103\subsection{A simple example}
104
105The end-of-line normalization of text is a good
106example to motivate our initial filter interface.
107Assume we are given text in an unknown end-of-line
108convention (including possibly mixed conventions) out of the
109commonly found Unix (\LF), Mac OS (\CR), and
110DOS (\CRLF) conventions. We would like to be able to
111use the following code to normalize the end-of-line markers:
112\begin{quote}
113\begin{lua}
114@stick#
115local CRLF = "\013\010"
116local input = source.chain(source.file(io.stdin), normalize(CRLF))
117local output = sink.file(io.stdout)
118pump.all(input, output)
119%
120\end{lua}
121\end{quote}
122
123This program should read data from the standard input stream
124and normalize the end-of-line markers to the canonic
125\CRLF\ marker, as defined by the MIME standard.
126Finally, the normalized text should be sent to the standard output
127stream. We use a \emph{file source} that produces data from
128standard input, and chain it with a filter that normalizes
129the data. The pump then repeatedly obtains data from the
130source, and passes it to the \emph{file sink}, which sends
131it to the standard output.
132
133In the code above, the \texttt{normalize} \emph{factory} is a
134function that creates our normalization filter, which
135replaces any end-of-line marker with the canonic marker.
136The initial filter interface is
137trivial: a filter function receives a chunk of input data,
138and returns a chunk of processed data. When there are no
139more input data left, the caller notifies the filter by invoking
140it with a \nil\ chunk. The filter responds by returning
141the final chunk of processed data (which could of course be
142the empty string).
143
144Although the interface is extremely simple, the
145implementation is not so obvious. A normalization filter
146respecting this interface needs to keep some kind of context
147between calls. This is because a chunk boundary may lie between
148the \CR\ and \LF\ characters marking the end of a single line. This
149need for contextual storage motivates the use of
150factories: each time the factory is invoked, it returns a
151filter with its own context so that we can have several
152independent filters being used at the same time. For
153efficiency reasons, we must avoid the obvious solution of
154concatenating all the input into the context before
155producing any output chunks.
156
157To that end, we break the implementation into two parts:
158a low-level filter, and a factory of high-level filters. The
159low-level filter is implemented in C and does not maintain
160any context between function calls. The high-level filter
161factory, implemented in Lua, creates and returns a
162high-level filter that maintains whatever context the low-level
163filter needs, but isolates the user from its internal
164details. That way, we take advantage of C's efficiency to
165perform the hard work, and take advantage of Lua's
166simplicity for the bookkeeping.
167
168\subsection{The Lua part of the filter}
169
170Below is the complete implementation of the factory of high-level
171end-of-line normalization filters:
172\begin{quote}
173\begin{lua}
174@stick#
175function filter.cycle(lowlevel, context, extra)
176 return function(chunk)
177 local ret
178 ret, context = lowlevel(context, chunk, extra)
179 return ret
180 end
181end
182%
183
184@stick#
185function normalize(marker)
186 return filter.cycle(eol, 0, marker)
187end
188%
189\end{lua}
190\end{quote}
191
192The \texttt{normalize} factory simply calls a more generic
193factory, the \texttt{cycle}~factory, passing the low-level
194filter~\texttt{eol}. The \texttt{cycle}~factory receives a
195low-level filter, an initial context, and an extra
196parameter, and returns a new high-level filter. Each time
197the high-level filter is passed a new chunk, it invokes the
198low-level filter with the previous context, the new chunk,
199and the extra argument. It is the low-level filter that
200does all the work, producing the chunk of processed data and
201a new context. The high-level filter then replaces its
202internal context, and returns the processed chunk of data to
203the user. Notice that we take advantage of Lua's lexical
204scoping to store the context in a closure between function
205calls.
206
207\subsection{The C part of the filter}
208
209As for the low-level filter, we must first accept
210that there is no perfect solution to the end-of-line marker
211normalization problem. The difficulty comes from an
212inherent ambiguity in the definition of empty lines within
213mixed input. However, the following solution works well for
214any consistent input, as well as for non-empty lines in
215mixed input. It also does a reasonable job with empty lines
216and serves as a good example of how to implement a low-level
217filter.
218
219The idea is to consider both \CR\ and~\LF\ as end-of-line
220\emph{candidates}. We issue a single break if any candidate
221is seen alone, or if it is followed by a different
222candidate. In other words, \CR~\CR~and \LF~\LF\ each issue
223two end-of-line markers, whereas \CR~\LF~and \LF~\CR\ issue
224only one marker each. It is easy to see that this method
225correctly handles the most common end-of-line conventions.
226
227With this in mind, we divide the low-level filter into two
228simple functions. The inner function~\texttt{pushchar} performs the
229normalization itself. It takes each input character in turn,
230deciding what to output and how to modify the context. The
231context tells if the last processed character was an
232end-of-line candidate, and if so, which candidate it was.
233For efficiency, we use Lua's auxiliary library's buffer
234interface:
235\begin{quote}
236\begin{C}
237@stick#
238@#define candidate(c) (c == CR || c == LF)
239static int pushchar(int c, int last, const char *marker,
240 luaL_Buffer *buffer) {
241 if (candidate(c)) {
242 if (candidate(last)) {
243 if (c == last)
244 luaL_addstring(buffer, marker);
245 return 0;
246 } else {
247 luaL_addstring(buffer, marker);
248 return c;
249 }
250 } else {
251 luaL_addchar(buffer, c);
252 return 0;
253 }
254}
255%
256\end{C}
257\end{quote}
258
259The outer function~\texttt{eol} simply interfaces with Lua.
260It receives the context and input chunk (as well as an
261optional custom end-of-line marker), and returns the
262transformed output chunk and the new context.
263Notice that if the input chunk is \nil, the operation
264is considered to be finished. In that case, the loop will
265not execute a single time and the context is reset to the
266initial state. This allows the filter to be reused many
267times:
268\begin{quote}
269\begin{C}
270@stick#
271static int eol(lua_State *L) {
272 int context = luaL_checkint(L, 1);
273 size_t isize = 0;
274 const char *input = luaL_optlstring(L, 2, NULL, &isize);
275 const char *last = input + isize;
276 const char *marker = luaL_optstring(L, 3, CRLF);
277 luaL_Buffer buffer;
278 luaL_buffinit(L, &buffer);
279 if (!input) {
280 lua_pushnil(L);
281 lua_pushnumber(L, 0);
282 return 2;
283 }
284 while (input < last)
285 context = pushchar(*input++, context, marker, &buffer);
286 luaL_pushresult(&buffer);
287 lua_pushnumber(L, context);
288 return 2;
289}
290%
291\end{C}
292\end{quote}
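
The candidate rule is easy to try out without compiling anything. The following is a minimal pure-Lua sketch of the same logic, not the code shipped with \texttt{LuaSocket}: the name \texttt{luaeol} and the use of an empty string (instead of \texttt{0}) as the initial context are assumptions made for illustration. It shows that a \CRLF\ pair split across two chunks still produces a single marker.

```lua
local CR, LF = "\r", "\n"

-- Low-level filter (hypothetical pure-Lua counterpart of the C function eol).
-- context is "" or the end-of-line candidate that was seen last.
local function luaeol(context, chunk, marker)
  if not chunk then return nil, "" end   -- end of data: reset the context
  local out = {}
  for i = 1, #chunk do
    local c = chunk:sub(i, i)
    if c == CR or c == LF then
      if context ~= "" then
        -- a repeated candidate is a second break; a different one only
        -- completes the pair that has already produced a marker
        if c == context then out[#out + 1] = marker end
        context = ""
      else
        out[#out + 1] = marker
        context = c
      end
    else
      out[#out + 1] = c
      context = ""
    end
  end
  return table.concat(out), context
end

-- High-level factory, as in the article: the closure keeps the context.
local function cycle(lowlevel, context, extra)
  return function(chunk)
    local ret
    ret, context = lowlevel(context, chunk, extra)
    return ret
  end
end

local function normalize(marker)
  return cycle(luaeol, "", marker)
end

local f = normalize("\r\n")
-- the first chunk ends between the CR and the LF of a DOS line ending
local result = f("one\r") .. f("\ntwo\n\n") .. f("three\rfour")
print((result:gsub("\r\n", "<CRLF>")))
```

With the markers made visible, the program prints \texttt{one<CRLF>two<CRLF><CRLF>three<CRLF>four}: the split pair counts once, and the repeated \LF\ counts twice.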
293
294When designing filters, the challenging part is usually
295deciding what to store in the context. For line breaking, for
296instance, it could be the number of bytes that still fit in the
297current line. For Base64 encoding, it could be a string
298with the bytes that remain after the division of the input
299into 3-byte atoms. The MIME module in the \texttt{LuaSocket}
300distribution has many other examples.
301
302\section{Filter chains}
303
304Chains greatly increase the power of filters. For example,
305according to the standard for Quoted-Printable encoding,
306text should be normalized to a canonic end-of-line marker
307prior to encoding. After encoding, the resulting text must
308be broken into lines of no more than 76 characters, with the
309use of soft line breaks (a line terminated by the \texttt{=}
310sign). To help specify complex transformations like
311this, we define a chain factory that creates a composite
312filter from one or more filters. A chained filter passes
313data through all its components, and can be used wherever a
314primitive filter is accepted.
315
316The chaining factory is very simple. The auxiliary
317function~\texttt{chainpair} chains two filters together,
318taking special care if the chunk is the last. This is
319because the final \nil\ chunk notification has to be
320pushed through both filters in turn:
321\begin{quote}
322\begin{lua}
323@stick#
324local function chainpair(f1, f2)
325 return function(chunk)
326 local ret = f2(f1(chunk))
327 if chunk then return ret
328 else return ret .. f2() end
329 end
330end
331%
332
333@stick#
334function filter.chain(...)
335 local f = select(1, ...)
336 for i = 2, select('@#', ...) do
337 f = chainpair(f, select(i, ...))
338 end
339 return f
340end
341%
342\end{lua}
343\end{quote}
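
The need to push the final \nil\ through both filters is easiest to see with a second filter that holds data back. The sketch below repeats the chain factory and applies it to two hypothetical filters that are not part of \texttt{LuaSocket}: \texttt{upper}, which is stateless, and \texttt{blocks}, which only releases complete blocks of \texttt{n} bytes and flushes the remainder at the end. Both are assumed to follow the initial interface, returning a (possibly empty) string for the \nil\ chunk.

```lua
local filter = {}

local function chainpair(f1, f2)
  return function(chunk)
    local ret = f2(f1(chunk))
    if chunk then return ret
    else return ret .. f2() end   -- flush f1 into f2, then flush f2 itself
  end
end

function filter.chain(...)
  local f = select(1, ...)
  for i = 2, select('#', ...) do
    f = chainpair(f, select(i, ...))
  end
  return f
end

-- hypothetical stateless filter: upper-cases every chunk
local function upper(chunk)
  if chunk then return chunk:upper() end
  return ""
end

-- hypothetical factory: releases only complete blocks of n bytes
local function blocks(n)
  local pending = ""
  return function(chunk)
    if not chunk then            -- end of data: hand out what is left
      local rest = pending
      pending = ""
      return rest
    end
    pending = pending .. chunk
    local cut = #pending - #pending % n
    local out = pending:sub(1, cut)
    pending = pending:sub(cut + 1)
    return out
  end
end

local f = filter.chain(upper, blocks(4))
local parts = { f("abc"), f("defgh"), f("ij"), f(nil) }
print(table.concat(parts, "|"))
```

The first and third calls return empty strings because \texttt{blocks} is still waiting for a complete block; the trailing bytes only appear once the \nil\ chunk has reached the second filter.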
344
345Thanks to the chain factory, we can
346define the Quoted-Printable conversion as such:
347\begin{quote}
348\begin{lua}
349@stick#
350local qp = filter.chain(normalize(CRLF), encode("quoted-printable"),
351 wrap("quoted-printable"))
352local input = source.chain(source.file(io.stdin), qp)
353local output = sink.file(io.stdout)
354pump.all(input, output)
355%
356\end{lua}
357\end{quote}
358
359\section{Sources, sinks, and pumps}
360
361The filters we introduced so far act as the internal nodes
362in a network of transformations. Information flows from node
363to node (or rather from one filter to the next) and is
364transformed along the way. Chaining filters together is our
365way to connect nodes in this network. As the starting point
366for the network, we need a source node that produces the
367data. At the end of the network, we need a sink node that
368gives a final destination to the data.
369
370\subsection{Sources}
371
372A source returns the next chunk of data each time it is
373invoked. When there is no more data, it simply returns~\nil.
374In the event of an error, the source can inform the
375caller by returning \nil\ followed by the error message.
376
377Below are two simple source factories. The \texttt{empty} source
378returns no data, possibly returning an associated error
379message. The \texttt{file} source yields the contents of a file
380in a chunk by chunk fashion:
381\begin{quote}
382\begin{lua}
383@stick#
384function source.empty(err)
385 return function()
386 return nil, err
387 end
388end
389%
390
391@stick#
392function source.file(handle, io_err)
393 if handle then
394 return function()
395 local chunk = handle:read(2048)
396 if not chunk then handle:close() end
397 return chunk
398 end
399 else return source.empty(io_err or "unable to open file") end
400end
401%
402\end{lua}
403\end{quote}
404
405\subsection{Filtered sources}
406
407A filtered source passes its data through the
408associated filter before returning it to the caller.
409Filtered sources are useful when working with
410functions that get their input data from a source (such as
411the pumps in our examples). By chaining a source with one or
412more filters, such functions can be transparently provided
413with filtered data, with no need to change their interfaces.
414Here is a factory that does the job:
415\begin{quote}
416\begin{lua}
417@stick#
418function source.chain(src, f)
419 return function()
420 if not src then
421 return nil
422 end
423 local chunk, err = src()
424 if not chunk then
425 src = nil
426 return f(nil)
427 else
428 return f(chunk)
429 end
430 end
431end
432%
433\end{lua}
434\end{quote}
435
436\subsection{Sinks}
437
438Just as we defined an interface for a source of data, we can
439also define an interface for a data destination. We call
440any function respecting this interface a sink. In our first
441example, we used a file sink connected to the standard
442output.
443
444Sinks receive consecutive chunks of data, until the end of
445data is signaled by a \nil\ input chunk. A sink can be
446notified of an error with an optional extra argument that
447contains the error message, following a \nil\ chunk.
448If a sink detects an error itself, and
449wishes not to be called again, it can return \nil,
450followed by an error message. A return value that
451is not \nil\ means the sink will accept more data.
452
453Below are two useful sink factories.
454The table factory creates a sink that stores
455individual chunks into an array. The data can later be
456efficiently concatenated into a single string with Lua's
457\texttt{table.concat} library function. The \texttt{null} sink
458simply discards the chunks it receives:
459\begin{quote}
460\begin{lua}
461@stick#
462function sink.table(t)
463 t = t or {}
464 local f = function(chunk, err)
465 if chunk then table.insert(t, chunk) end
466 return 1
467 end
468 return f, t
469end
470%
471
472@stick#
473local function null()
474 return 1
475end
476
477function sink.null()
478 return null
479end
480%
481\end{lua}
482\end{quote}
483
484Naturally, filtered sinks are just as useful as filtered
485sources. A filtered sink passes each chunk it receives
486through the associated filter before handing it down to the
487original sink. In the following example, we use a source
488that reads from the standard input. The input chunks are
489sent to a table sink, which has been coupled with a
490normalization filter. The filtered chunks are then
491concatenated from the output array, and finally sent to
492standard out:
493\begin{quote}
494\begin{lua}
495@stick#
496local input = source.file(io.stdin)
497local output, t = sink.table()
498output = sink.chain(normalize(CRLF), output)
499pump.all(input, output)
500io.write(table.concat(t))
501%
502\end{lua}
503\end{quote}
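
The example relies on a \texttt{sink.chain} factory that is not listed in the text. A minimal sketch, assuming the simple filter interface described so far (one output chunk per call, and a final chunk in response to \nil), could look like the following. It is an illustration only; the function in the LTN12 module may differ. The \texttt{shout} filter is a hypothetical stand-in for a real filter.

```lua
local sink = {}

function sink.table(t)
  t = t or {}
  local f = function(chunk, err)
    if chunk then table.insert(t, chunk) end
    return 1
  end
  return f, t
end

-- Sketch of a filtered sink for the simple filter interface.
function sink.chain(f, snk)
  return function(chunk, err)
    if chunk then
      return snk(f(chunk), err)
    end
    -- end of data: let the filter flush, then pass the nil on
    local last = f(nil)
    if last and last ~= "" then
      local ok, snk_err = snk(last)
      if not ok then return nil, snk_err end
    end
    return snk(nil, err)
  end
end

-- hypothetical filter: upper-cases data and emits "!" as its final chunk
local function shout(chunk)
  if chunk then return chunk:upper() end
  return "!"
end

local out, t = sink.table()
out = sink.chain(shout, out)
out("ab")
out("cd")
out(nil)
local text = table.concat(t)
print(text)
```

The final chunk produced by the filter reaches the table before the end-of-data notification does, so nothing the filter was holding back is lost.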
504
505\subsection{Pumps}
506
507Although not on purpose, our interface for sources is
508compatible with Lua iterators. That is, a source can be
509neatly used in conjunction with \texttt{for} loops. Using
510our file source as an iterator, we can write the following
511code:
512\begin{quote}
513\begin{lua}
514@stick#
515for chunk in source.file(io.stdin) do
516 io.write(chunk)
517end
518%
519\end{lua}
520\end{quote}
521
522Loops like this will always be present because everything
523we designed so far is passive. Sources, sinks, filters: none
524of them can do anything on their own. The operation of
525pumping all data a source can provide into a sink is so
526common that it deserves its own function:
527\begin{quote}
528\begin{lua}
529@stick#
530function pump.step(src, snk)
531 local chunk, src_err = src()
532 local ret, snk_err = snk(chunk, src_err)
533 if chunk and ret then return 1
534 else return nil, src_err or snk_err end
535end
536%
537
538@stick#
539function pump.all(src, snk, step)
540 step = step or pump.step
541 while true do
542 local ret, err = step(src, snk)
543 if not ret then
544 if err then return nil, err
545 else return 1 end
546 end
547 end
548end
549%
550\end{lua}
551\end{quote}
552
553The \texttt{pump.step} function moves one chunk of data from
554the source to the sink. The \texttt{pump.all} function takes
555an optional \texttt{step} function and uses it to pump all the
556data from the source to the sink.
557Here is an example that uses the Base64 and the
558line wrapping filters from the \texttt{LuaSocket}
559distribution. The program reads a binary file from
560disk and stores it in another file, after encoding it to the
561Base64 transfer content encoding:
562\begin{quote}
563\begin{lua}
564@stick#
565local input = source.chain(
566 source.file(io.open("input.bin", "rb")),
567 encode("base64"))
568local output = sink.chain(
569 wrap(76),
570 sink.file(io.open("output.b64", "w")))
571pump.all(input, output)
572%
573\end{lua}
574\end{quote}
575
576The way we split the filters here is not intuitive, on
577purpose. Alternatively, we could have chained the Base64
578encode filter and the line-wrap filter together, and then
579chained the resulting filter with either the file source or
580the file sink. It doesn't really matter.
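
To experiment with the pump functions without files or the MIME filters, the pieces defined so far can be put together in a single script. In the sketch below, \texttt{chunked} is a hypothetical helper (it is not one of the factories from the text) that turns a string into a source yielding pieces of a given size; everything else repeats the definitions given earlier.

```lua
local source, sink, pump = {}, {}, {}

function source.empty(err)
  return function()
    return nil, err
  end
end

-- hypothetical helper: a source that yields s in pieces of `size` bytes
local function chunked(s, size)
  local i = 1
  return function()
    if i > #s then return nil end
    local chunk = s:sub(i, i + size - 1)
    i = i + size
    return chunk
  end
end

function sink.table(t)
  t = t or {}
  local f = function(chunk, err)
    if chunk then table.insert(t, chunk) end
    return 1
  end
  return f, t
end

function pump.step(src, snk)
  local chunk, src_err = src()
  local ret, snk_err = snk(chunk, src_err)
  if chunk and ret then return 1
  else return nil, src_err or snk_err end
end

function pump.all(src, snk, step)
  step = step or pump.step
  while true do
    local ret, err = step(src, snk)
    if not ret then
      if err then return nil, err
      else return 1 end
    end
  end
end

-- normal run: three chunks travel from the source to the table
local out, t = sink.table()
local ok, err = pump.all(chunked("hello, pumps", 5), out)

-- failing run: the error message of the source is reported by the pump
local discard = sink.table()
local bad, msg = pump.all(source.empty("boom"), discard)
print(ok, #t, table.concat(t), bad, msg)
```

Note that \texttt{sink.table} returns two values, so its result is stored in a variable before being passed on; otherwise the table would be taken for the optional \texttt{step} argument of \texttt{pump.all}.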
581
582\section{Exploding filters}
583
584Our current filter interface has one serious shortcoming.
585Consider for example a \texttt{gzip} decompression filter.
586During decompression, a small input chunk can be exploded
587into a huge amount of data. To address this problem, we
588decided to change the filter interface and allow exploding
589filters to return large quantities of output data in a chunk
590by chunk manner.
591
592More specifically, after passing each chunk of input to
593a filter, and collecting the first chunk of output, the
594user must now loop to receive other chunks from the filter until no
595filtered data is left. Within these secondary calls, the
596caller passes an empty string to the filter. The filter
597responds with an empty string when it is ready for the next
598input chunk. In the end, after the user passes a
599\nil\ chunk notifying the filter that there is no
600more input data, the filter might still have to produce too
601much output data to return in a single chunk. The user has
602to loop again, now passing \nil\ to the filter each time,
603until the filter itself returns \nil\ to notify the
604user it is finally done.
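
The calling convention just described can be sketched as follows. Both the filter factory \texttt{exploder} (which repeats every input byte \texttt{times} times and hands out at most \texttt{limit} bytes per call) and the driver \texttt{drain} are hypothetical names invented for this illustration; they are not part of LTN12.

```lua
-- hypothetical exploding filter following the extended interface
local function exploder(times, limit)
  local pending = ""   -- output already produced but not yet returned
  return function(chunk)
    if chunk and chunk ~= "" then
      pending = pending .. (chunk:gsub(".", function(c)
        return c:rep(times)
      end))
    end
    if pending == "" then
      if chunk == nil then return nil end   -- input over, nothing left
      return ""                             -- ready for the next chunk
    end
    local out = pending:sub(1, limit)
    pending = pending:sub(limit + 1)
    return out
  end
end

-- driver loop: what a caller of an exploding filter has to do
local function drain(f, chunks)
  local out = {}
  for _, chunk in ipairs(chunks) do
    local piece = f(chunk)
    while piece ~= "" do        -- secondary calls pass an empty string
      out[#out + 1] = piece
      piece = f("")
    end
  end
  local piece = f(nil)          -- no more input
  while piece ~= nil do         -- keep passing nil until the filter is done
    out[#out + 1] = piece
    piece = f(nil)
  end
  return out
end

local pieces = drain(exploder(3, 4), { "ab", "c" })
print(table.concat(pieces, "|"))
```

No single call returns more than \texttt{limit} bytes, however much the input expands, which is the property the extended interface is meant to provide.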
605
606Fortunately, it is very easy to modify a filter to respect
607the new interface. In fact, the end-of-line translation
608filter we presented earlier already conforms to it. The
609complexity is encapsulated within the chaining functions,
610which must now include a loop. Since these functions only
611have to be written once, the user is rarely affected.
612Interestingly, the modifications do not have a measurable
613negative impact on the performance of filters that do
614not need the added flexibility. On the other hand, for a
615small price in complexity, the changes make exploding
616filters practical.
617
618\section{A complex example}
619
620The LTN12 module in the \texttt{LuaSocket} distribution
621implements all the ideas we have described. The MIME
622and SMTP modules are tightly integrated with LTN12,
623and can be used to showcase the expressive power of filters,
624sources, sinks, and pumps. Below is an example
625of how a user would proceed to define and send a
626multipart message, with attachments, using \texttt{LuaSocket}:
627\begin{quote}
628\begin{mime}
629local smtp = require"socket.smtp"
630local mime = require"mime"
631local ltn12 = require"ltn12"
632
633local message = smtp.message{
634 headers = {
635 from = "Sicrano <sicrano@example.com>",
636 to = "Fulano <fulano@example.com>",
637 subject = "A message with an attachment"},
638 body = {
639 preamble = "Hope you can see the attachment" .. CRLF,
640 [1] = {
641 body = "Here is our logo" .. CRLF},
642 [2] = {
643 headers = {
644 ["content-type"] = 'image/png; name="luasocket.png"',
645 ["content-disposition"] =
646 'attachment; filename="luasocket.png"',
647 ["content-description"] = 'LuaSocket logo',
648 ["content-transfer-encoding"] = "BASE64"},
649 body = ltn12.source.chain(
650 ltn12.source.file(io.open("luasocket.png", "rb")),
651 ltn12.filter.chain(
652 mime.encode("base64"),
653 mime.wrap()))}}}
654
655assert(smtp.send{
656 rcpt = "<fulano@example.com>",
657 from = "<sicrano@example.com>",
658 source = message})
659\end{mime}
660\end{quote}
661
662The \texttt{smtp.message} function receives a table
663describing the message, and returns a source. The
664\texttt{smtp.send} function takes this source, chains it with the
665SMTP dot-stuffing filter, connects a socket sink
666with the server, and simply pumps the data. The message is never
667assembled in memory. Everything is produced on demand,
668transformed in small pieces, and sent to the server in chunks,
669including the file attachment which is loaded from disk and
670encoded on the fly. It just works.
671
672\section{Conclusions}
673
674In this article, we introduced the concepts of filters,
675sources, sinks, and pumps to the Lua language. These are
676useful tools for stream processing in general. Sources provide
677a simple abstraction for data acquisition. Sinks provide an
678abstraction for final data destinations. Filters define an
679interface for data transformations. The chaining of
680filters, sources and sinks provides an elegant way to create
681arbitrarily complex data transformations from simpler
682components. Pumps simply push the data through.
683
684\section{Acknowledgements}
685
686The concepts described in this text are the result of long
687discussions with David Burgess. A version of this text has
688been released on-line as the Lua Technical Note 012, hence
689the name of the corresponding LuaSocket module, LTN12. Wim
690Couwenberg contributed to the implementation of the module,
691and Adrian Sietsma was the first to notice the
692correspondence between sources and Lua iterators.
693
694
695\end{document}
diff --git a/gem/luasocket.png b/gem/luasocket.png
deleted file mode 100644
index d24a954..0000000
--- a/gem/luasocket.png
+++ /dev/null
Binary files differ
diff --git a/gem/makefile b/gem/makefile
deleted file mode 100644
index a4287c2..0000000
--- a/gem/makefile
+++ /dev/null
@@ -1,14 +0,0 @@
1ltn012.pdf: ltn012.ps
2 ./myps2pdf ltn012.ps
3
4ltn012.ps: ltn012.dvi
5 dvips -G0 -t letter -o ltn012.ps ltn012.dvi
6
7ltn012.dvi: ltn012.tex
8 latex ltn012
9
10clean:
11 rm -f *~ *.log *.aux *.bbl *.blg ltn012.pdf ltn012.ps ltn012.dvi ltn012.lof ltn012.toc ltn012.lot
12
13pdf: ltn012.pdf
14 open ltn012.pdf
diff --git a/gem/myps2pdf b/gem/myps2pdf
deleted file mode 100755
index 78c23e5..0000000
--- a/gem/myps2pdf
+++ /dev/null
@@ -1,113 +0,0 @@
1#!/bin/sh -
2do_opt=1
3best=0
4rot=0
5a4=0
6eps=0
7usage="Usage: $0 [-no_opt] [-best] [-rot] [-a4] [-eps] in.ps [out.pdf]"
8
9case "x$1" in
10"x-no_opt") do_opt=0 ; shift ;;
11esac
12
13case "x$1" in
14"x-best") best=1 ; shift ;;
15esac
16
17case "x$1" in
18"x-rot") rot=1 ; shift ;;
19esac
20
21case "x$1" in
22"x-a4") a4=1 ; shift ;;
23esac
24
25case "x$1" in
26"x-eps") eps=1 ; shift ;;
27esac
28
29case $# in
302) ifilename=$1 ; ofilename=$2 ;;
311) ifilename=$1
32 if `echo $1 | grep -i '\.e*ps$' > /dev/null`
33 then
34 ofilename=`echo $1 | sed 's/\..*$/.pdf/'`
35 else
36 echo "$usage" 1>&2
37 exit 1
38 fi ;;
39*) echo "$usage" 1>&2 ; exit 1 ;;
40esac
41
42if [ $best == 1 ]
43then
44 options="-dPDFSETTINGS=/prepress \
45 -r1200 \
46 -dMonoImageResolution=1200 \
47 -dGrayImageResolution=1200 \
48 -dColorImageResolution=1200 \
49 -dDownsampleMonoImages=false \
50 -dDownsampleGrayImages=false \
51 -dDownsampleColorImages=false \
52 -dAutoFilterMonoImages=false \
53 -dAutoFilterGrayImages=false \
54 -dAutoFilterColorImages=false \
55 -dMonoImageFilter=/FlateEncode \
56 -dGrayImageFilter=/FlateEncode \
57 -dColorImageFilter=/FlateEncode"
58else
59 options="-dPDFSETTINGS=/prepress \
60 -r600 \
61 -dDownsampleMonoImages=true \
62 -dDownsampleGrayImages=true \
63 -dDownsampleColorImages=true \
64 -dMonoImageDownsampleThreshold=2.0 \
65 -dGrayImageDownsampleThreshold=1.5 \
66 -dColorImageDownsampleThreshold=1.5 \
67 -dMonoImageResolution=600 \
68 -dGrayImageResolution=600 \
69 -dColorImageResolution=600 \
70 -dAutoFilterMonoImages=false \
71 -dMonoImageFilter=/FlateEncode \
72 -dAutoFilterGrayImages=true \
73 -dAutoFilterColorImages=true"
74fi
75
76if [ $rot == 1 ]
77then
78 options="$options -dAutoRotatePages=/PageByPage"
79fi
80
81if [ $eps == 1 ]
82then
83 options="$options -dEPSCrop"
84fi
85
86set -x
87
88if [ $a4 == 1 ]
89then
90 # Resize from A4 to letter size
91 psresize -Pa4 -pletter "$ifilename" myps2pdf.temp.ps
92 ifilename=myps2pdf.temp.ps
93fi
94
95gs -q -dSAFER -dNOPAUSE -dBATCH \
96 -sDEVICE=pdfwrite -sPAPERSIZE=letter -sOutputFile=myps2pdf.temp.pdf \
97 -dCompatibilityLevel=1.3 \
98 $options \
99 -dMaxSubsetPct=100 \
100 -dSubsetFonts=true \
101 -dEmbedAllFonts=true \
102 -dColorConversionStrategy=/LeaveColorUnchanged \
103 -dDoThumbnails=true \
104 -dPreserveEPSInfo=true \
105 -c .setpdfwrite -f "$ifilename"
106
107if [ $do_opt == 1 ]
108then
109 pdfopt myps2pdf.temp.pdf $ofilename
110else
111 mv myps2pdf.temp.pdf $ofilename
112fi
113rm -f myps2pdf.temp.pdf myps2pdf.temp.ps
diff --git a/gem/t1.lua b/gem/t1.lua
deleted file mode 100644
index 0c054c9..0000000
--- a/gem/t1.lua
+++ /dev/null
@@ -1,25 +0,0 @@
1source = {}
2sink = {}
3pump = {}
4filter = {}
5
6-- source.chain
7dofile("ex6.lua")
8
9-- source.file
10dofile("ex5.lua")
11
12-- normalize
13require"gem"
14eol = gem.eol
15dofile("ex2.lua")
16
17-- sink.file
18require"ltn12"
19sink.file = ltn12.sink.file
20
21-- pump.all
22dofile("ex10.lua")
23
24-- run test
25dofile("ex1.lua")
diff --git a/gem/t1lf.txt b/gem/t1lf.txt
deleted file mode 100644
index 8cddd1b..0000000
--- a/gem/t1lf.txt
+++ /dev/null
@@ -1,5 +0,0 @@
1this is a test file
2it should have been saved as lf eol
3but t1.lua will convert it to crlf eol
4otherwise it is broken!
5
diff --git a/gem/t2.lua b/gem/t2.lua
deleted file mode 100644
index a81ed73..0000000
--- a/gem/t2.lua
+++ /dev/null
@@ -1,36 +0,0 @@
1source = {}
2sink = {}
3pump = {}
4filter = {}
5
6-- filter.chain
7dofile("ex3.lua")
8
9-- normalize
10require"gem"
11eol = gem.eol
12dofile("ex2.lua")
13
14-- encode
15require"mime"
16encode = mime.encode
17
18-- wrap
19wrap = mime.wrap
20
21-- source.chain
22dofile("ex6.lua")
23
24-- source.file
25dofile("ex5.lua")
26
27-- sink.file
28require"ltn12"
29sink.file = ltn12.sink.file
30
31-- pump.all
32dofile("ex10.lua")
33
34-- run test
35CRLF = "\013\010"
36dofile("ex4.lua")
diff --git a/gem/t2.txt b/gem/t2.txt
deleted file mode 100644
index f484fe8..0000000
--- a/gem/t2.txt
+++ /dev/null
@@ -1,4 +0,0 @@
1esse é um texto com acentos
2quoted-printable tem que quebrar linhas longas, com mais que 76 linhas de texto
3fora que as quebras de linhas têm que ser normalizadas
4vamos ver o que dá isso aqui
diff --git a/gem/t2gt.qp b/gem/t2gt.qp
deleted file mode 100644
index 355a845..0000000
--- a/gem/t2gt.qp
+++ /dev/null
@@ -1,5 +0,0 @@
1esse =E9 um texto com acentos
2quoted-printable tem que quebrar linhas longas, com mais que 76 linhas de t=
3exto
4fora que as quebras de linhas t=EAm que ser normalizadas
5vamos ver o que d=E1 isso aqui
diff --git a/gem/t3.lua b/gem/t3.lua
deleted file mode 100644
index 4bb98ba..0000000
--- a/gem/t3.lua
+++ /dev/null
@@ -1,25 +0,0 @@
1source = {}
2sink = {}
3pump = {}
4filter = {}
5
6-- source.file
7dofile("ex5.lua")
8
9-- sink.table
10dofile("ex7.lua")
11
12-- sink.chain
13require"ltn12"
14sink.chain = ltn12.sink.chain
15
16-- normalize
17require"gem"
18eol = gem.eol
19dofile("ex2.lua")
20
21-- pump.all
22dofile("ex10.lua")
23
24-- run test
25dofile("ex8.lua")
diff --git a/gem/t4.lua b/gem/t4.lua
deleted file mode 100644
index 8b8071c..0000000
--- a/gem/t4.lua
+++ /dev/null
@@ -1,10 +0,0 @@
1source = {}
2sink = {}
3pump = {}
4filter = {}
5
6-- source.file
7dofile("ex5.lua")
8
9-- run test
10dofile("ex9.lua")
diff --git a/gem/t5.lua b/gem/t5.lua
deleted file mode 100644
index 7c569ea..0000000
--- a/gem/t5.lua
+++ /dev/null
@@ -1,30 +0,0 @@
1source = {}
2sink = {}
3pump = {}
4filter = {}
5
6-- source.chain
7dofile("ex6.lua")
8
9-- source.file
10dofile("ex5.lua")
11
12-- encode
13require"mime"
14encode = mime.encode
15
16-- sink.chain
17require"ltn12"
18sink.chain = ltn12.sink.chain
19
20-- wrap
21wrap = mime.wrap
22
23-- sink.file
24sink.file = ltn12.sink.file
25
26-- pump.all
27dofile("ex10.lua")
28
29-- run test
30dofile("ex11.lua")
diff --git a/gem/test.lua b/gem/test.lua
deleted file mode 100644
index a937b9a..0000000
--- a/gem/test.lua
+++ /dev/null
@@ -1,46 +0,0 @@
1function readfile(n)
2 local f = io.open(n, "rb")
3 local s = f:read("*a")
4 f:close()
5 return s
6end
7
8lf = readfile("t1lf.txt")
9os.remove("t1crlf.txt")
10os.execute("lua t1.lua < t1lf.txt > t1crlf.txt")
11crlf = readfile("t1crlf.txt")
12assert(crlf == string.gsub(lf, "\010", "\013\010"), "broken")
13
14gt = readfile("t2gt.qp")
15os.remove("t2.qp")
16os.execute("lua t2.lua < t2.txt > t2.qp")
17t2 = readfile("t2.qp")
18assert(gt == t2, "broken")
19
20os.remove("t1crlf.txt")
21os.execute("lua t3.lua < t1lf.txt > t1crlf.txt")
22crlf = readfile("t1crlf.txt")
23assert(crlf == string.gsub(lf, "\010", "\013\010"), "broken")
24
25t = readfile("test.lua")
26os.execute("lua t4.lua < test.lua > t")
27t2 = readfile("t")
28assert(t == t2, "broken")
29
30os.remove("output.b64")
31gt = readfile("gt.b64")
32os.execute("lua t5.lua")
33t5 = readfile("output.b64")
34assert(gt == t5, "failed")
35
36print("1 2 5 6 10 passed")
37print("2 3 4 5 6 10 passed")
38print("2 5 6 7 8 10 passed")
39print("5 9 passed")
40print("5 6 10 11 passed")
41
42os.remove("t")
43os.remove("t2.qp")
44os.remove("t1crlf.txt")
45os.remove("t11.b64")
46os.remove("output.b64")
diff --git a/ltn012.md b/ltn012.md
new file mode 100644
index 0000000..fa26b4a
--- /dev/null
+++ b/ltn012.md
@@ -0,0 +1,390 @@
1# Filters, sources and sinks: design, motivation and examples
2### or Functional programming for the rest of us
3by DiegoNehab
4
5## Abstract
6
Certain operations can be implemented in the form of filters. A filter is a function that processes data received in consecutive function calls, returning partial results chunk by chunk. Examples of operations that can be implemented as filters include the end-of-line normalization for text, Base64 and Quoted-Printable transfer content encodings, the breaking of text into lines, SMTP byte stuffing, and many others. Filters become even more powerful when we allow them to be chained together to create composite filters. Filters can be seen as middle nodes in a chain of data transformations. Sources and sinks are the corresponding end points of these chains. A source is a function that produces data, chunk by chunk, and a sink is a function that takes data, chunk by chunk. In this technical note, we define an elegant interface for filters, sources, sinks and chaining. We evolve our interface progressively, until we reach a high degree of generality. We discuss difficulties that arise during the implementation of this interface and we provide solutions and examples.
8
9## Introduction
10
Applications sometimes have too much data to fit in memory and are thus forced to process it in smaller parts. Even when there is enough memory, processing all the data atomically may take long enough to frustrate a user who wants to interact with the application. Furthermore, complex transformations can often be defined as series of simpler operations. Several different complex transformations might share the same simpler operations, so that a uniform interface to combine them is desirable. The following concepts constitute our solution to these problems.
12
13"Filters" are functions that accept successive chunks of input, and produce successive chunks of output. Furthermore, the result of concatenating all the output data is the same as the result of applying the filter over the concatenation of the input data. As a consequence, boundaries are irrelevant: filters have to handle input data split arbitrarily by the user.
14
15A "chain" is a function that combines the effect of two (or more) other functions, but whose interface is indistinguishable from the interface of one of its components. Thus, a chained filter can be used wherever an atomic filter can be used. However, its effect on data is the combined effect of its component filters. Note that, as a consequence, chains can be chained themselves to create arbitrarily complex operations that can be used just like atomic operations.
16
17Filters can be seen as internal nodes in a network through which data flows, potentially being transformed along its way. Chains connect these nodes together. To complete the picture, we need "sources" and "sinks" as initial and final nodes of the network, respectively. Less abstractly, a source is a function that produces new data every time it is called. On the other hand, sinks are functions that give a final destination to the data they receive. Naturally, sources and sinks can be chained with filters.
18
19Finally, filters, chains, sources, and sinks are all passive entities: they need to be repeatedly called in order for something to happen. "Pumps" provide the driving force that pushes data through the network, from a source to a sink.
20
21 Hopefully, these concepts will become clear with examples. In the following sections, we start with simplified interfaces, which we improve several times until we can find no obvious shortcomings. The evolution we present is not contrived: it follows the steps we followed ourselves as we consolidated our understanding of these concepts.
22
23### A concrete example
24
25Some data transformations are easier to implement as filters than others. Examples of operations that can be implemented as filters include the end-of-line normalization for text, the Base64 and Quoted-Printable transfer content encodings, the breaking of text into lines, SMTP byte stuffing, and many others. Let's use the end-of-line normalization as an example to define our initial filter interface. We later discuss why the implementation might not be trivial.
26
27Assume we are given text in an unknown end-of-line convention (including possibly mixed conventions) out of the commonly found Unix (LF), Mac OS (CR), and DOS (CRLF) conventions. We would like to be able to write code like the following:
```lua
input = source.chain(source.file(io.stdin), normalize("\r\n"))
output = sink.file(io.stdout)
pump.all(input, output)
```
33
34This program should read data from the standard input stream and normalize the end-of-line markers to the canonic CRLF marker defined by the MIME standard, finally sending the results to the standard output stream. For that, we use a "file source" to produce data from standard input, and chain it with a filter that normalizes the data. The pump then repeatedly gets data from the source, and moves it to the "file sink" that sends it to standard output.
35
36To make the discussion even more concrete, we start by discussing the implementation of the normalization filter. The `normalize` "factory" is a function that creates such a filter. Our initial filter interface is as follows: the filter receives a chunk of input data, and returns a chunk of processed data. When there is no more input data, the user notifies the filter by invoking it with a `nil` chunk. The filter then returns the final chunk of processed data.
37
38Although the interface is extremely simple, the implementation doesn't seem so obvious. Any filter respecting this interface needs to keep some kind of context between calls. This is because chunks can be broken between the CR and LF characters marking the end of a line. This need for context storage is what motivates the use of factories: each time the factory is called, it returns a filter with its own context so that we can have several independent filters being used at the same time. For the normalization filter, we know that the obvious solution (i.e. concatenating all the input into the context before producing any output) is not good enough, so we will have to find another way.
39
40We will break the implementation in two parts: a low-level filter, and a factory of high-level filters. The low-level filter will be implemented in C and will not carry any context between function calls. The high-level filter factory, implemented in Lua, will create and return a high-level filter that keeps whatever context the low-level filter needs, but isolates the user from its internal details. That way, we take advantage of C's efficiency to perform the dirty work, and take advantage of Lua's simplicity for the bookkeeping.
41
42### The Lua part of the implementation
43
44Below is the implementation of the factory of high-level end-of-line normalization filters:
```lua
function filter.cycle(low, ctx, extra)
  return function(chunk)
    local ret
    -- call the low-level filter and keep the context it hands back
    ret, ctx = low(ctx, chunk, extra)
    return ret
  end
end

function normalize(marker)
  return filter.cycle(eol, 0, marker)
end
```
58
The `normalize` factory simply calls a more generic factory, the `cycle` factory. This factory receives a low-level filter, an initial context and some extra value and returns the corresponding high-level filter. Each time the high-level filter is called with a new chunk, it calls the low-level filter passing the previous context, the new chunk and the extra argument. The low-level filter produces the chunk of processed data and a new context. Finally, the high-level filter updates its internal context and returns the processed chunk of data to the user. It is the low-level filter that does all the work. Notice that this implementation takes advantage of the Lua 5.0 lexical scoping rules to store the context locally, between function calls.
60
61Moving to the low-level filter, we notice there is no perfect solution to the end-of-line marker normalization problem itself. The difficulty comes from an inherent ambiguity on the definition of empty lines within mixed input. However, the following solution works well for any consistent input, as well as for non-empty lines in mixed input. It also does a reasonable job with empty lines and serves as a good example of how to implement a low-level filter.
62
Here is what we do: CR and LF are considered candidates for line break. We issue "one" end-of-line marker if one of the candidates is seen alone, or followed by a "different" candidate. That is, CR CR and LF LF issue two end-of-line markers each, but CR LF and LF CR issue only one marker. This idea takes care of Mac OS, Mac OS X, VMS and Unix, DOS and MIME, as well as probably other more obscure conventions.
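
The rule above can be tried out without compiling any C. The following is a minimal pure-Lua sketch of it, not the implementation used by LuaSocket: the names `eol_lua` and `cycle` are hypothetical stand-ins, and the context is assumed to be `0` or the byte value of the last candidate seen. It also checks that splitting the input between a CR and an LF does not change the result.

```lua
local CR, LF = 13, 10

-- Hypothetical pure-Lua stand-in for the low-level filter.
-- ctx is 0, or the byte value of the last line-break candidate seen.
local function eol_lua(ctx, chunk, marker)
  if not chunk then return nil, 0 end        -- end of data: reset the context
  local out = {}
  for i = 1, #chunk do
    local c = chunk:byte(i)
    if c == CR or c == LF then
      if ctx == CR or ctx == LF then
        -- second candidate in a row: only a repeated one starts a new line
        if c == ctx then out[#out + 1] = marker end
        ctx = 0
      else
        out[#out + 1] = marker
        ctx = c
      end
    else
      out[#out + 1] = string.char(c)
      ctx = 0
    end
  end
  return table.concat(out), ctx
end

-- Same shape as the cycle factory: keeps the context between calls.
local function cycle(low, ctx, extra)
  return function(chunk)
    local ret
    ret, ctx = low(ctx, chunk, extra)
    return ret
  end
end

local function normalize(marker)
  return cycle(eol_lua, 0, marker)
end

-- The same text, fed in one piece and split between CR and LF.
local whole = normalize("\r\n")("one\ntwo\r\nthree\r")
local f = normalize("\r\n")
local split = f("one\ntwo\r") .. f("\nthree\r")
```

Both `whole` and `split` hold the same normalized text, which is the boundary independence that the filter definition requires.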
64
65### The C part of the implementation
66
67The low-level filter is divided into two simple functions. The inner function actually does the conversion. It takes each input character in turn, deciding what to output and how to modify the context. The context tells if the last character seen was a candidate and, if so, which candidate it was.
```c
#define candidate(c) (c == CR || c == LF)
static int process(int c, int last, const char *marker, luaL_Buffer *buffer) {
    if (candidate(c)) {
        if (candidate(last)) {
            if (c == last) luaL_addstring(buffer, marker);
            return 0;
        } else {
            luaL_addstring(buffer, marker);
            return c;
        }
    } else {
        luaL_putchar(buffer, c);
        return 0;
    }
}
```
85
86The inner function makes use of Lua's auxiliary library's buffer interface for its efficiency and ease of use. The outer function simply interfaces with Lua. It receives the context and the input chunk (as well as an optional end-of-line marker), and returns the transformed output and the new context.
```c
static int eol(lua_State *L) {
    int ctx = luaL_checkint(L, 1);
    size_t isize = 0;
    const char *input = luaL_optlstring(L, 2, NULL, &isize);
    const char *last = input + isize;
    const char *marker = luaL_optstring(L, 3, CRLF);
    luaL_Buffer buffer;
    luaL_buffinit(L, &buffer);
    if (!input) {
        lua_pushnil(L);
        lua_pushnumber(L, 0);
        return 2;
    }
    while (input < last)
        ctx = process(*input++, ctx, marker, &buffer);
    luaL_pushresult(&buffer);
    lua_pushnumber(L, ctx);
    return 2;
}
```
108
109Notice that if the input chunk is `nil`, the operation is considered to be finished. In that case, the loop will not execute a single time and the context is reset to the initial state. This allows the filter to be reused indefinitely. It is a good idea to write filters like this, when possible.
110
111Besides the end-of-line normalization filter shown above, many other filters can be implemented with the same ideas. Examples include Base64 and Quoted-Printable transfer content encodings, the breaking of text into lines, SMTP byte stuffing etc. The challenging part is to decide what will be the context. For line breaking, for instance, it could be the number of bytes left in the current line. For Base64 encoding, it could be the bytes that remain in the division of the input into 3-byte atoms.
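
To illustrate the choice of context for line breaking, here is a minimal sketch of a wrapping filter whose only state is the number of bytes left in the current line. It is an illustration under stated assumptions, not the `mime.wrap` implementation: the name `wrap_sketch` is hypothetical, the input is assumed to contain no line breaks of its own (as is the case for Base64 output), and the final call is assumed to return an empty string.

```lua
-- Hypothetical line-wrapping filter factory.
-- Context: how many bytes may still be written to the current line.
local function wrap_sketch(length)
  length = length or 76
  local left = length
  return function(chunk)
    if not chunk then          -- end of data: reset so the filter can be reused
      left = length
      return ""
    end
    local out = {}
    local pos = 1
    while pos <= #chunk do
      if left == 0 then        -- current line is full: start a new one
        out[#out + 1] = "\r\n"
        left = length
      end
      local piece = chunk:sub(pos, pos + left - 1)
      out[#out + 1] = piece
      left = left - #piece
      pos = pos + #piece
    end
    return table.concat(out)
  end
end

-- Chunk boundaries do not affect the result.
local whole = wrap_sketch(4)("abcdefghi")
local w = wrap_sketch(4)
local split = w("abcdef") .. w("gh") .. w("i") .. w(nil)
```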
112
113## Chaining
114
115Filters become more powerful when the concept of chaining is introduced. Suppose you have a filter for Quoted-Printable encoding and you want to encode some text. According to the standard, the text has to be normalized into its canonic form prior to encoding. A nice interface that simplifies this task is a factory that creates a composite filter that passes data through multiple filters, but that can be used wherever a primitive filter is used.
```lua
local function chain2(f1, f2)
  return function(chunk)
    local ret = f2(f1(chunk))
    if chunk then return ret
    -- final call: push the end-of-data notification through f2 as well;
    -- a filter may return nil at the end, so guard the concatenation
    else return (ret or "") .. (f2() or "") end
  end
end

function filter.chain(...)
  local arg = {...}
  local f = arg[1]
  for i = 2, #arg do
    f = chain2(f, arg[i])
  end
  return f
end

local chain = filter.chain(normalize("\r\n"), encode("quoted-printable"))
while 1 do
  local chunk = io.read(2048)
  io.write(chain(chunk))
  if not chunk then break end
end
```
141
142The chaining factory is very simple. All it does is return a function that passes data through all filters and returns the result to the user. It uses the simpler auxiliary function that knows how to chain two filters together. In the auxiliary function, special care must be taken if the chunk is final. This is because the final chunk notification has to be pushed through both filters in turn. Thanks to the chain factory, it is easy to perform the Quoted-Printable conversion, as the above example shows.
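
The need to push the final notification through both filters is easiest to see with filters that hold data back. The sketch below is a toy illustration, not part of LuaSocket: `delay` is a hypothetical filter that always withholds the last byte it has seen and releases it only on the final `nil` call. Chaining two of them loses no data only because the end of data reaches each filter in turn.

```lua
-- Hypothetical filter that withholds the last byte until the end of data.
local function delay()
  local held = ""
  return function(chunk)
    if not chunk then            -- final call: release what was held back
      local rest = held
      held = ""
      return rest
    end
    local data = held .. chunk
    held = data:sub(-1)          -- keep the last byte for later
    return data:sub(1, -2)       -- emit everything before it
  end
end

local function chain2(f1, f2)
  return function(chunk)
    local ret = f2(f1(chunk))
    if chunk then return ret
    -- f1's leftover has just gone through f2 as ordinary data;
    -- now tell f2 that the data is over and collect its leftover too
    else return (ret or "") .. (f2() or "") end
  end
end

local chain = chain2(delay(), delay())
local out = chain("ab") .. chain("cd") .. chain(nil)
```

Without the extra `f2()` call, the byte withheld by the second filter would never be returned.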
143
144## Sources, sinks, and pumps
145
146As we noted in the introduction, the filters we introduced so far act as the internal nodes in a network of transformations. Information flows from node to node (or rather from one filter to the next) and is transformed on its way out. Chaining filters together is the way we found to connect nodes in the network. But what about the end nodes? In the beginning of the network, we need a node that provides the data, a source. In the end of the network, we need a node that takes in the data, a sink.
147
148### Sources
149
150We start with two simple sources. The first is the `empty` source: It simply returns no data, possibly returning an error message. The second is the `file` source, which produces the contents of a file in a chunk by chunk fashion, closing the file handle when done.
```lua
function source.empty(err)
  return function()
    return nil, err
  end
end

function source.file(handle, io_err)
  if handle then
    return function()
      local chunk = handle:read(2048)
      if not chunk then handle:close() end
      return chunk
    end
  else return source.empty(io_err or "unable to open file") end
end
```
168
169A source returns the next chunk of data each time it is called. When there is no more data, it just returns `nil`. If there is an error, the source can inform the caller by returning `nil` followed by an error message. Adrian Sietsma noticed that, although not on purpose, the interface for sources is compatible with the idea of iterators in Lua 5.0. That is, a data source can be nicely used in conjunction with `for` loops. Using our file source as an iterator, we can rewrite our first example:
```lua
local process = normalize("\r\n")
for chunk in source.file(io.stdin) do
  io.write(process(chunk))
end
io.write(process(nil))
```
177
178Notice that the last call to the filter obtains the last chunk of processed data. The loop terminates when the source returns `nil` and therefore we need that final call outside of the loop.
179
180### Maintaining state between calls
181
182It is often the case that a source needs to change its behavior after some event. One simple example would be a file source that wants to make sure it returns `nil` regardless of how many times it is called after the end of file, avoiding attempts to read past the end of the file. Another example would be a source that returns the contents of several files, as if they were concatenated, moving from one file to the next until the end of the last file is reached.
183
184One way to implement this kind of source is to have the factory declare extra state variables that the source can use via lexical scoping. Our file source could set the file handle itself to `nil` when it detects the end-of-file. Then, every time the source is called, it could check if the handle is still valid and act accordingly:
```lua
function source.file(handle, io_err)
  if handle then
    return function()
      if not handle then return nil end
      local chunk = handle:read(2048)
      if not chunk then
        handle:close()
        handle = nil
      end
      return chunk
    end
  else return source.empty(io_err or "unable to open file") end
end
```
200
Another way to implement this behavior involves a change in the source interface that makes it more flexible. Let's allow a source to return a second value, besides the next chunk of data. If the returned chunk is `nil`, the extra return value tells us what happened. A second `nil` means that there is just no more data and the source is empty. Any other value is considered to be an error message. On the other hand, if the chunk was "not" `nil`, the second return value tells us whether the source wants to be replaced. If it is `nil`, we should proceed using the same source. Otherwise it has to be another source, which we have to use from then on, to get the remaining data.
202
203This extra freedom is good for someone writing a source function, but it is a pain for those that have to use it. Fortunately, given one of these "fancy" sources, we can transform it into a simple source that never needs to be replaced, using the following factory.
```lua
function source.simplify(src)
  return function()
    local chunk, err_or_new = src()
    src = err_or_new or src
    if not chunk then return nil, err_or_new
    else return chunk end
  end
end
```
214
215The simplification factory allows us to write fancy sources and use them as if they were simple. Therefore, our next functions will only produce simple sources, and functions that take sources will assume they are simple.
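
As a small illustration of the replacement protocol, the sketch below builds a fancy source that produces a header once and then asks to be replaced by another source. The helpers `with_header` and `from_list` are hypothetical and exist only for this demonstration; `source.simplify` is repeated so that the example runs on its own.

```lua
local source = {}

function source.simplify(src)
  return function()
    local chunk, err_or_new = src()
    src = err_or_new or src
    if not chunk then return nil, err_or_new
    else return chunk end
  end
end

-- Hypothetical in-memory source: returns the items of a list, then nil.
local function from_list(items)
  local i = 0
  return function()
    i = i + 1
    return items[i]
  end
end

-- Hypothetical fancy source: returns `header` together with the source
-- that should be used from then on.
local function with_header(header, rest)
  return function()
    return header, rest
  end
end

local src = source.simplify(
  with_header("HEADER\n", from_list{"line 1\n", "line 2\n"}))

-- The caller sees a plain source and never handles the replacement itself.
local seen = {}
for chunk in src do seen[#seen + 1] = chunk end
```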
216
Going back to our file source, the extended interface allows for a more elegant implementation. The new source just asks to be replaced by an empty source as soon as there is no more data. There is no repeated checking of the handle. To make things simpler for the user, the factory itself simplifies the fancy file source before returning it to the user:
```lua
function source.file(handle, io_err)
  if handle then
    return source.simplify(function()
      local chunk = handle:read(2048)
      if not chunk then
        handle:close()
        return "", source.empty()
      end
      return chunk
    end)
  else return source.empty(io_err or "unable to open file") end
end
```
232
We can make these ideas even more powerful if we use a new feature of Lua 5.0: coroutines. Coroutines suffer from a great lack of advertisement, and I am going to play my part here. Just like lexical scoping, coroutines taste odd at first, but once you get used to the concept, it can save your day. I have to admit that using coroutines to implement our file source would be overkill, so let's implement a concatenated source factory instead.
```lua
-- drops the first result of coroutine.resume (the status flag)
local function shift(a, b, c)
  return b, c
end

function source.cat(...)
  local arg = {...}
  local co = coroutine.create(function()
    local i = 1
    while i <= #arg do
      local chunk, err = arg[i]()
      if chunk then coroutine.yield(chunk)
      elseif err then return nil, err
      else i = i + 1 end
    end
  end)
  return function()
    -- once all sources are exhausted, keep returning nil
    if coroutine.status(co) == "dead" then return nil end
    return shift(coroutine.resume(co))
  end
end
```
251
252The factory creates two functions. The first is an auxiliary that does all the work, in the form of a coroutine. It reads a chunk from one of the sources. If the chunk is `nil`, it moves to the next source, otherwise it just yields returning the chunk. When it is resumed, it continues from where it stopped and tries to read the next chunk. The second function is the source itself, and just resumes the execution of the auxiliary coroutine, returning to the user whatever chunks it returns (skipping the first result that tells us if the coroutine terminated). Imagine writing the same function without coroutines and you will notice the simplicity of this implementation. We will use coroutines again when we make the filter interface more powerful.
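
A short usage sketch may help here. It concatenates three in-memory sources, one of them empty, and collects the chunks with a `for` loop. The `from_list` helper is hypothetical, the `shift` helper is assumed to simply drop the status flag returned by `coroutine.resume`, and the check for a dead coroutine is an addition that keeps the source returning `nil` after it is exhausted.

```lua
local source = {}

-- Hypothetical in-memory source: returns the items of a list, then nil.
local function from_list(items)
  local i = 0
  return function()
    i = i + 1
    return items[i]
  end
end

-- Drops the status flag returned by coroutine.resume.
local function shift(a, b, c)
  return b, c
end

function source.cat(...)
  local arg = {...}
  local co = coroutine.create(function()
    local i = 1
    while i <= #arg do
      local chunk, err = arg[i]()
      if chunk then coroutine.yield(chunk)
      elseif err then return nil, err
      else i = i + 1 end
    end
  end)
  return function()
    if coroutine.status(co) == "dead" then return nil end
    return shift(coroutine.resume(co))
  end
end

local src = source.cat(from_list{"a", "b"}, from_list{}, from_list{"c"})
local parts = {}
for chunk in src do parts[#parts + 1] = chunk end
local after_end = src()   -- calling again after the end is harmless
```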
253
254### Chaining Sources
255
256What does it mean to chain a source with a filter? The most useful interpretation is that the combined source-filter is a new source that produces data and passes it through the filter before returning it. Here is a factory that does it:
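```lua
function source.chain(src, f)
  return source.simplify(function()
    local chunk, err = src()
    -- a filter may return nil on the final call; substitute "" so that the
    -- replacement source is installed instead of being reported as an error
    if not chunk then return f(nil) or "", source.empty(err)
    else return f(chunk) end
  end)
end
```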
266
267Our motivating example in the introduction chains a source with a filter. The idea of chaining a source with a filter is useful when one thinks about functions that might get their input data from a source. By chaining a simple source with one or more filters, the same function can be provided with filtered data even though it is unaware of the filtering that is happening behind its back.
268
269### Sinks
270
271Just as we defined an interface for an initial source of data, we can also define an interface for a final destination of data. We call any function respecting that interface a "sink". Below are two simple factories that return sinks. The table factory creates a sink that stores all obtained data into a table. The data can later be efficiently concatenated into a single string with the `table.concat` library function. As another example, we introduce the `null` sink: A sink that simply discards the data it receives.
```lua
function sink.table(t)
  t = t or {}
  local f = function(chunk, err)
    if chunk then table.insert(t, chunk) end
    return 1
  end
  return f, t
end

local function null()
  return 1
end

function sink.null()
  return null
end
```
290
Sinks receive consecutive chunks of data, until the end of data is notified with a `nil` chunk. An error is notified by an extra argument giving an error message after the `nil` chunk. If a sink detects an error itself and wishes not to be called again, it should return `nil`, optionally followed by an error message. A return value that is not `nil` means the sink will accept more data. Finally, just as sources can choose to be replaced, so can sinks, following the same interface. Once again, it is easy to implement a `sink.simplify` factory that transforms a fancy sink into a simple sink.
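
One plausible reading of "the same interface" for sinks is that a successful call may return a replacement sink as its second value. Under that assumption, a `sink.simplify` factory could be sketched as follows. The fancy sink `first_then_rest` is hypothetical and exists only to exercise the factory.

```lua
local sink = {}

-- Sketch: hides the replacement protocol from the caller.
function sink.simplify(snk)
  return function(chunk, err)
    local ret, err_or_new = snk(chunk, err)
    if not ret then return nil, err_or_new end   -- failure: second value is an error
    snk = err_or_new or snk                      -- success: optional replacement
    return 1
  end
end

function sink.table(t)
  t = t or {}
  local f = function(chunk, err)
    if chunk then table.insert(t, chunk) end
    return 1
  end
  return f, t
end

-- Hypothetical fancy sink: keeps the first chunk in `head`, then asks to be
-- replaced by a table sink that collects everything else in `rest`.
local head, rest = {}, {}
local function first_then_rest(chunk, err)
  if chunk then head[#head + 1] = chunk end
  return 1, (sink.table(rest))   -- parentheses keep only the sink function
end

local snk = sink.simplify(first_then_rest)
snk("first")
snk("second")
snk("third")
snk(nil)
```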
292
293As an example, let's create a source that reads from the standard input, then chain it with a filter that normalizes the end-of-line convention and let's use a sink to place all data into a table, printing the result in the end.
```lua
local load = source.chain(source.file(io.stdin), normalize("\r\n"))
local store, t = sink.table()
while 1 do
  local chunk = load()
  store(chunk)
  if not chunk then break end
end
print(table.concat(t))
```
304
305Again, just as we created a factory that produces a chained source-filter from a source and a filter, it is easy to create a factory that produces a new sink given a sink and a filter. The new sink passes all data it receives through the filter before handing it in to the original sink. Here is the implementation:
306```lua
307function sink.chain(f, snk)
308 return function(chunk, err)
309 local r, e = snk(f(chunk))
310 if not r then return nil, e end
311 if not chunk then return snk(nil, err) end
312 return 1
313 end
314end
315```
316
317### Pumps
318
319There is a while loop that has been around for too long in our examples. It's always there because everything that we designed so far is passive. Sources, sinks, filters: None of them will do anything on their own. The operation of pumping all data a source can provide into a sink is so common that we will provide a couple of helper functions to do that for us.
320```lua
321function pump.step(src, snk)
322 local chunk, src_err = src()
323 local ret, snk_err = snk(chunk, src_err)
324 return chunk and ret and not src_err and not snk_err, src_err or snk_err
325end
326
327function pump.all(src, snk, step)
328 step = step or pump.step
329 while true do
330 local ret, err = step(src, snk)
331 if not ret then return not err, err end
332 end
333end
334```
335
336The `pump.step` function moves one chunk of data from the source to the sink. The `pump.all` function takes an optional `step` function and uses it to pump all the data from the source to the sink. We can now use everything we have to write a program that reads a binary file from disk and stores it in another file, after encoding it to the Base64 transfer content encoding:
337```lua
338local load = source.chain(
339 source.file(io.open("input.bin", "rb")),
340 encode("base64")
341)
342local store = sink.chain(
343 wrap(76),
344 sink.file(io.open("output.b64", "w"))
345)
346pump.all(load, store)
347```
348
349The way we split the filters here is not intuitive, on purpose. Alternatively, we could have chained the Base64 encode filter and the line-wrap filter together, and then chained the resulting filter with either the file source or the file sink. It doesn't really matter.
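
To make the point about filter placement concrete, here is a small self-contained sketch. It is not the code from this note: `source.string` is a hypothetical helper, the two filters (`upper` and `dash`) are stateless stand-ins for `encode` and `wrap`, and the chain and pump functions are simplified versions that assume a filter returns `nil` when given `nil`.

```lua
local source, sink, pump = {}, {}, {}

-- Hypothetical helper: a source that hands out a string in fixed-size pieces.
function source.string(s, size)
    local i = 1
    return function()
        if i > #s then return nil end
        local chunk = s:sub(i, i + size - 1)
        i = i + size
        return chunk
    end
end

function sink.table(t)
    t = t or {}
    return function(chunk)
        if chunk then table.insert(t, chunk) end
        return 1
    end, t
end

-- Simplified chains, valid only for stateless filters.
function source.chain(src, f)
    return function()
        local chunk, err = src()
        if not chunk then return nil, err end
        return f(chunk)
    end
end

function sink.chain(f, snk)
    return function(chunk, err)
        if not chunk then return snk(nil, err) end
        return snk(f(chunk))
    end
end

function pump.all(src, snk)
    while true do
        local chunk, src_err = src()
        local ret, snk_err = snk(chunk, src_err)
        if not (chunk and ret) then
            local err = src_err or snk_err
            return not err, err
        end
    end
end

-- Two stateless stand-in filters.
local upper = string.upper
local function dash(chunk) return (chunk:gsub(" ", "-")) end

-- Arrangement 1: one filter on the source side, one on the sink side.
local store1, t1 = sink.table()
pump.all(source.chain(source.string("a b c d", 3), upper),
         sink.chain(dash, store1))

-- Arrangement 2: both filters on the sink side.
local store2, t2 = sink.table()
pump.all(source.string("a b c d", 3),
         sink.chain(upper, sink.chain(dash, store2)))
```

Both tables concatenate to the same string, which is the sense in which the placement of the filters does not matter.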
350
351## One last important change
352
353Turns out we still have a problem. When David Burgess was writing his gzip filter, he noticed that the decompression filter can explode a small input chunk into a huge amount of data. Although we wished we could ignore this problem, we soon agreed we couldn't. The only solution is to allow filters to return partial results, and that is what we chose to do. After invoking the filter to pass input data, the user now has to loop invoking the filter to find out if it has more output data to return. Note that these extra calls can't pass more data to the filter.
354
355More specifically, after passing a chunk of input data to a filter and collecting the first chunk of output data, the user invokes the filter repeatedly, passing the empty string, to get extra output chunks. When the filter itself returns an empty string, the user knows there is no more output data, and can proceed to pass the next input chunk. In the end, after the user passes a `nil` notifying the filter that there is no more input data, the filter might still have produced too much output data to return in a single chunk. The user has to loop again, this time passing `nil` each time, until the filter itself returns `nil` to notify the user it is finally done.
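
The calling protocol just described can be sketched as a small driver. This is only an illustration under stated assumptions: `feed` and `limit` are hypothetical names, and `limit` is a toy filter that never returns more than `max` bytes per call, standing in for something like a decompression filter.

```lua
-- Passes one input chunk (or nil for end of input) to filter f and hands
-- every output chunk it produces to emit.
local function feed(f, chunk, emit)
    -- Extra calls pass "" while data is flowing and nil after the end.
    local again = chunk and ""
    local out = f(chunk)
    while out ~= nil and out ~= "" do
        emit(out)
        out = f(again)
    end
end

-- Toy filter: passes data through unchanged, at most max bytes per call.
local function limit(max)
    local pending = ""
    return function(chunk)
        if chunk then pending = pending .. chunk end
        if pending == "" then
            -- Nothing left: answer "" to "", and nil to nil.
            if chunk == nil then return nil end
            return ""
        end
        local out = pending:sub(1, max)
        pending = pending:sub(max + 1)
        return out
    end
end

local pieces = {}
local function collect(s) pieces[#pieces + 1] = s end

local f = limit(3)
feed(f, "abcdefgh", collect)   -- emits "abc", "def", "gh"
feed(f, nil, collect)          -- nothing pending, so nothing is emitted
```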
356
357Most filters won't need this extra freedom. Fortunately, the new filter interface is easy to implement. In fact, the end-of-line translation filter we created in the introduction already conforms to it. On the other hand, the chaining function becomes much more complicated. If it wasn't for coroutines, I wouldn't be happy to implement it. Let me know if you can find a simpler implementation that does not use coroutines!
358```lua
359local function chain2(f1, f2)
360 local co = coroutine.create(function(chunk)
361 while true do
362 local filtered1 = f1(chunk)
363 local filtered2 = f2(filtered1)
364 local done2 = filtered1 and ""
365 while true do
366 if filtered2 == "" or filtered2 == nil then break end
367 coroutine.yield(filtered2)
368 filtered2 = f2(done2)
369 end
370 if filtered1 == "" then chunk = coroutine.yield(filtered1)
371 elseif filtered1 == nil then return nil
372 else chunk = chunk and "" end
373 end
374 end)
375 return function(chunk)
376 local _, res = coroutine.resume(co, chunk)
377 return res
378 end
379end
380```
381
382Chaining sources also becomes more complicated, but a similar solution is possible with coroutines. Chaining sinks is just as simple as it has always been. Interestingly, these modifications do not have a measurable negative impact on the performance of filters that didn't need the added flexibility. They do greatly improve the efficiency of filters like the gzip filter, though, and that is why we are keeping them.
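
As one possible illustration of chaining a source with a filter under the new interface, here is a coroutine-based sketch. It is an assumption about how this could be written, not the code of the LTN12 module; `chain_source`, `string_source` and `limit` are hypothetical names used only here.

```lua
-- Chains a source with a filter that may return its output in several pieces.
local function chain_source(src, f)
    return coroutine.wrap(function()
        local chunk, err
        repeat
            chunk, err = src()
            -- Extra calls pass "" while data is flowing and nil after the end.
            local again = chunk and ""
            local out = f(chunk)
            while out ~= nil and out ~= "" do
                coroutine.yield(out)
                out = f(again)
            end
        until chunk == nil
        -- Exhausted: keep answering nil (plus any source error) forever.
        while true do coroutine.yield(nil, err) end
    end)
end

-- Toy source: returns the whole string once, then nil.
local function string_source(s)
    return function()
        local chunk = s
        s = nil
        return chunk
    end
end

-- Toy filter: passes data through unchanged, at most max bytes per call.
local function limit(max)
    local pending = ""
    return function(chunk)
        if chunk then pending = pending .. chunk end
        if pending == "" then
            if chunk == nil then return nil end
            return ""
        end
        local out = pending:sub(1, max)
        pending = pending:sub(max + 1)
        return out
    end
end

local src = chain_source(string_source("abcdefgh"), limit(3))
local pieces = {}
for chunk in src do pieces[#pieces + 1] = chunk end
```

The chained source still looks like a plain source to its caller, so it can be used in a `for` loop or handed to a pump.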
383
384## Final considerations
385
386These ideas were created during the development of [LuaSocket](https://github.com/lunarmodules/luasocket) 2.0, and are available as the LTN12 module. As a result, the [LuaSocket](https://github.com/lunarmodules/luasocket) implementation was greatly simplified and became much more powerful. The MIME module is closely integrated with LTN12 and provides many other filters. We felt these concepts deserved to be made public even to those that don't care about [LuaSocket](https://github.com/lunarmodules/luasocket), hence the LTN.
387
388One extra application that deserves mentioning makes use of an identity filter. Suppose you want to provide some feedback to the user while a file is being downloaded into a sink. Chaining the sink with an identity filter (a filter that simply returns the received data unaltered), you can update a progress counter on the fly. The original sink doesn't have to be modified. Another interesting idea is that of a T sink: A sink that sends data to two other sinks. In summary, there appears to be enough room for many other interesting ideas.
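
Both ideas fit in a few lines. The sketch below is illustrative only: `progress`, `tee` and `table_sink` are hypothetical names, and the sinks follow the simple interface (return a non-`nil` value to accept more data, or `nil` plus an error message).

```lua
-- Identity filter that reports the running byte count through a callback.
local function progress(report)
    local total = 0
    return function(chunk)
        if chunk then
            total = total + #chunk
            report(total)
        end
        return chunk   -- data is returned unaltered
    end
end

-- T sink: forwards every chunk to two sinks and fails if either one fails.
local function tee(snk1, snk2)
    return function(chunk, err)
        local r1, e1 = snk1(chunk, err)
        if not r1 then return nil, e1 end
        local r2, e2 = snk2(chunk, err)
        if not r2 then return nil, e2 end
        return 1
    end
end

-- Minimal table sink for the demonstration.
local function table_sink(t)
    return function(chunk)
        if chunk then t[#t + 1] = chunk end
        return 1
    end
end

local a, b, seen = {}, {}, 0
local count = progress(function(n) seen = n end)
local snk = tee(table_sink(a), table_sink(b))

for _, chunk in ipairs({ "hello ", "world" }) do
    snk(count(chunk))
end
snk(count(nil))   -- end of data reaches both sinks
```

Neither original sink had to change: the counter lives in the filter and the duplication lives in the T sink.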
389
390In this technical note we introduced filters, sources, sinks, and pumps. These are useful tools for data processing in general. Sources provide a simple abstraction for data acquisition. Sinks provide an abstraction for final data destinations. Filters define an interface for data transformations. The chaining of filters, sources and sinks provides an elegant way to create arbitrarily complex data transformation from simpler transformations. Pumps just put the machinery to work.
diff --git a/ltn012.wiki b/ltn012.wiki
deleted file mode 100644
index 96b13ae..0000000
--- a/ltn012.wiki
+++ /dev/null
@@ -1,393 +0,0 @@
1===Filters, sources and sinks: design, motivation and examples===
2==or Functional programming for the rest of us==
3by DiegoNehab
4
5{{{
6
7}}}
8
9===Abstract===
10Certain operations can be implemented in the form of filters. A filter is a function that processes data received in consecutive function calls, returning partial results chunk by chunk. Examples of operations that can be implemented as filters include the end-of-line normalization for text, Base64 and Quoted-Printable transfer content encodings, the breaking of text into lines, SMTP byte stuffing, and many others. Filters become even more powerful when we allow them to be chained together to create composite filters. Filters can be seen as middle nodes in a chain of data transformations. Sources and sinks are the corresponding end points of these chains. A source is a function that produces data, chunk by chunk, and a sink is a function that takes data, chunk by chunk. In this technical note, we define an elegant interface for filters, sources, sinks and chaining. We evolve our interface progressively, until we reach a high degree of generality. We discuss difficulties that arise during the implementation of this interface and we provide solutions and examples.
11
12===Introduction===
13
14Applications sometimes have too much information to process to fit in memory and are thus forced to process data in smaller parts. Even when there is enough memory, processing all the data atomically may take long enough to frustrate a user that wants to interact with the application. Furthermore, complex transformations can often be defined as series of simpler operations. Several different complex transformations might share the same simpler operations, so that a uniform interface to combine them is desirable. The following concepts constitute our solution to these problems.
15
16''Filters'' are functions that accept successive chunks of input, and produce successive chunks of output. Furthermore, the result of concatenating all the output data is the same as the result of applying the filter over the concatenation of the input data. As a consequence, boundaries are irrelevant: filters have to handle input data split arbitrarily by the user.
17
18A ''chain'' is a function that combines the effect of two (or more) other functions, but whose interface is indistinguishable from the interface of one of its components. Thus, a chained filter can be used wherever an atomic filter can be used. However, its effect on data is the combined effect of its component filters. Note that, as a consequence, chains can be chained themselves to create arbitrarily complex operations that can be used just like atomic operations.
19
20Filters can be seen as internal nodes in a network through which data flows, potentially being transformed along its way. Chains connect these nodes together. To complete the picture, we need ''sources'' and ''sinks'' as initial and final nodes of the network, respectively. Less abstractly, a source is a function that produces new data every time it is called. On the other hand, sinks are functions that give a final destination to the data they receive. Naturally, sources and sinks can be chained with filters.
21
22Finally, filters, chains, sources, and sinks are all passive entities: they need to be repeatedly called in order for something to happen. ''Pumps'' provide the driving force that pushes data through the network, from a source to a sink.
23
24 Hopefully, these concepts will become clear with examples. In the following sections, we start with simplified interfaces, which we improve several times until we can find no obvious shortcomings. The evolution we present is not contrived: it follows the steps we followed ourselves as we consolidated our understanding of these concepts.
25
26== A concrete example ==
27
28Some data transformations are easier to implement as filters than others. Examples of operations that can be implemented as filters include the end-of-line normalization for text, the Base64 and Quoted-Printable transfer content encodings, the breaking of text into lines, SMTP byte stuffing, and many others. Let's use the end-of-line normalization as an example to define our initial filter interface. We later discuss why the implementation might not be trivial.
29
30Assume we are given text in an unknown end-of-line convention (including possibly mixed conventions) out of the commonly found Unix (LF), Mac OS (CR), and DOS (CRLF) conventions. We would like to be able to write code like the following:
31 {{{
32input = source.chain(source.file(io.stdin), normalize("\r\n"))
33output = sink.file(io.stdout)
34pump(input, output)
35}}}
36
37This program should read data from the standard input stream and normalize the end-of-line markers to the canonic CRLF marker defined by the MIME standard, finally sending the results to the standard output stream. For that, we use a ''file source'' to produce data from standard input, and chain it with a filter that normalizes the data. The pump then repeatedly gets data from the source, and moves it to the ''file sink'' that sends it to standard output.
38
39To make the discussion even more concrete, we start by discussing the implementation of the normalization filter. The {{normalize}} ''factory'' is a function that creates such a filter. Our initial filter interface is as follows: the filter receives a chunk of input data, and returns a chunk of processed data. When there is no more input data, the user notifies the filter by invoking it with a {{nil}} chunk. The filter then returns the final chunk of processed data.
40
41Although the interface is extremely simple, the implementation doesn't seem so obvious. Any filter respecting this interface needs to keep some kind of context between calls. This is because chunks can be broken between the CR and LF characters marking the end of a line. This need for context storage is what motivates the use of factories: each time the factory is called, it returns a filter with its own context so that we can have several independent filters being used at the same time. For the normalization filter, we know that the obvious solution (i.e. concatenating all the input into the context before producing any output) is not good enough, so we will have to find another way.
42
43We will break the implementation in two parts: a low-level filter, and a factory of high-level filters. The low-level filter will be implemented in C and will not carry any context between function calls. The high-level filter factory, implemented in Lua, will create and return a high-level filter that keeps whatever context the low-level filter needs, but isolates the user from its internal details. That way, we take advantage of C's efficiency to perform the dirty work, and take advantage of Lua's simplicity for the bookkeeping.
44
45==The Lua part of the implementation==
46
47Below is the implementation of the factory of high-level end-of-line normalization filters:
48 {{{
49function filter.cycle(low, ctx, extra)
50 return function(chunk)
51 local ret
52 ret, ctx = low(ctx, chunk, extra)
53 return ret
54 end
55end
56
57function normalize(marker)
58 return filter.cycle(eol, 0, marker)
59end
60}}}
61
62The {{normalize}} factory simply calls a more generic factory, the {{cycle}} factory. This factory receives a low-level filter, an initial context and some extra value and returns the corresponding high-level filter. Each time the high-level filter is called with a new chunk, it calls the low-level filter passing the previous context, the new chunk and the extra argument. The low-level filter produces the chunk of processed data and a new context. Finally, the high-level filter updates its internal context and returns the processed chunk of data to the user. It is the low-level filter that does all the work. Notice that this implementation takes advantage of the Lua 5.0 lexical scoping rules to store the context locally, between function calls.
63
64Moving to the low-level filter, we notice there is no perfect solution to the end-of-line marker normalization problem itself. The difficulty comes from an inherent ambiguity on the definition of empty lines within mixed input. However, the following solution works well for any consistent input, as well as for non-empty lines in mixed input. It also does a reasonable job with empty lines and serves as a good example of how to implement a low-level filter.
65
66Here is what we do: CR and LF are considered candidates for line break. We issue ''one'' end-of-line marker if one of the candidates is seen alone, or followed by a ''different'' candidate. That is, CR&nbsp;CR and LF&nbsp;LF issue two end-of-line markers each, but CR&nbsp;LF and LF&nbsp;CR issue only one marker. This idea takes care of Mac OS, Mac OS X, VMS and Unix, DOS and MIME, as well as probably other more obscure conventions.
67
68==The C part of the implementation==
69
70The low-level filter is divided into two simple functions. The inner function actually does the conversion. It takes each input character in turn, deciding what to output and how to modify the context. The context tells if the last character seen was a candidate and, if so, which candidate it was.
71 {{{
72#define candidate(c) (c == CR || c == LF)
73static int process(int c, int last, const char *marker, luaL_Buffer *buffer) {
74 if (candidate(c)) {
75 if (candidate(last)) {
76 if (c == last) luaL_addstring(buffer, marker);
77 return 0;
78 } else {
79 luaL_addstring(buffer, marker);
80 return c;
81 }
82 } else {
83 luaL_putchar(buffer, c);
84 return 0;
85 }
86}
87}}}
88
89The inner function makes use of Lua's auxiliary library's buffer interface for its efficiency and ease of use. The outer function simply interfaces with Lua. It receives the context and the input chunk (as well as an optional end-of-line marker), and returns the transformed output and the new context.
90 {{{
91static int eol(lua_State *L) {
92 int ctx = luaL_checkint(L, 1);
93 size_t isize = 0;
94 const char *input = luaL_optlstring(L, 2, NULL, &isize);
95 const char *last = input + isize;
96 const char *marker = luaL_optstring(L, 3, CRLF);
97 luaL_Buffer buffer;
98 luaL_buffinit(L, &buffer);
99 if (!input) {
100 lua_pushnil(L);
101 lua_pushnumber(L, 0);
102 return 2;
103 }
104 while (input < last)
105 ctx = process(*input++, ctx, marker, &buffer);
106 luaL_pushresult(&buffer);
107 lua_pushnumber(L, ctx);
108 return 2;
109}
110}}}
111
112Notice that if the input chunk is {{nil}}, the operation is considered to be finished. In that case, the loop will not execute a single time and the context is reset to the initial state. This allows the filter to be reused indefinitely. It is a good idea to write filters like this, when possible.
113
114Besides the end-of-line normalization filter shown above, many other filters can be implemented with the same ideas. Examples include Base64 and Quoted-Printable transfer content encodings, the breaking of text into lines, SMTP byte stuffing etc. The challenging part is to decide what will be the context. For line breaking, for instance, it could be the number of bytes left in the current line. For Base64 encoding, it could be the bytes that remain in the division of the input into 3-byte atoms.
115
116===Chaining===
117
118Filters become more powerful when the concept of chaining is introduced. Suppose you have a filter for Quoted-Printable encoding and you want to encode some text. According to the standard, the text has to be normalized into its canonic form prior to encoding. A nice interface that simplifies this task is a factory that creates a composite filter that passes data through multiple filters, but that can be used wherever a primitive filter is used.
119 {{{
120local function chain2(f1, f2)
121 return function(chunk)
122 local ret = f2(f1(chunk))
123 if chunk then return ret
124 else return ret .. f2() end
125 end
126end
127
128function filter.chain(...)
129 local arg = {...}
130 local f = arg[1]
131 for i = 2, #arg do
132 f = chain2(f, arg[i])
133 end
134 return f
135end
136
137local chain = filter.chain(normalize("\r\n"), encode("quoted-printable"))
138while 1 do
139 local chunk = io.read(2048)
140 io.write(chain(chunk))
141 if not chunk then break end
142end
143}}}
144
145The chaining factory is very simple. All it does is return a function that passes data through all filters and returns the result to the user. It uses the simpler auxiliary function that knows how to chain two filters together. In the auxiliary function, special care must be taken if the chunk is final. This is because the final chunk notification has to be pushed through both filters in turn. Thanks to the chain factory, it is easy to perform the Quoted-Printable conversion, as the above example shows.
146
147===Sources, sinks, and pumps===
148
149As we noted in the introduction, the filters we introduced so far act as the internal nodes in a network of transformations. Information flows from node to node (or rather from one filter to the next) and is transformed on its way out. Chaining filters together is the way we found to connect nodes in the network. But what about the end nodes? In the beginning of the network, we need a node that provides the data, a source. In the end of the network, we need a node that takes in the data, a sink.
150
151==Sources==
152
153We start with two simple sources. The first is the {{empty}} source: It simply returns no data, possibly returning an error message. The second is the {{file}} source, which produces the contents of a file in a chunk by chunk fashion, closing the file handle when done.
154 {{{
155function source.empty(err)
156 return function()
157 return nil, err
158 end
159end
160
161function source.file(handle, io_err)
162 if handle then
163 return function()
164 local chunk = handle:read(2048)
165 if not chunk then handle:close() end
166 return chunk
167 end
168 else return source.empty(io_err or "unable to open file") end
169end
170}}}
171
172A source returns the next chunk of data each time it is called. When there is no more data, it just returns {{nil}}. If there is an error, the source can inform the caller by returning {{nil}} followed by an error message. Adrian Sietsma noticed that, although not on purpose, the interface for sources is compatible with the idea of iterators in Lua 5.0. That is, a data source can be nicely used in conjunction with {{for}} loops. Using our file source as an iterator, we can rewrite our first example:
173 {{{
174local process = normalize("\r\n")
175for chunk in source.file(io.stdin) do
176 io.write(process(chunk))
177end
178io.write(process(nil))
179}}}
180
181Notice that the last call to the filter obtains the last chunk of processed data. The loop terminates when the source returns {{nil}} and therefore we need that final call outside of the loop.
182
183==Maintaining state between calls==
184
185It is often the case that a source needs to change its behavior after some event. One simple example would be a file source that wants to make sure it returns {{nil}} regardless of how many times it is called after the end of file, avoiding attempts to read past the end of the file. Another example would be a source that returns the contents of several files, as if they were concatenated, moving from one file to the next until the end of the last file is reached.
186
187One way to implement this kind of source is to have the factory declare extra state variables that the source can use via lexical scoping. Our file source could set the file handle itself to {{nil}} when it detects the end-of-file. Then, every time the source is called, it could check if the handle is still valid and act accordingly:
188 {{{
189function source.file(handle, io_err)
190 if handle then
191 return function()
192 if not handle then return nil end
193 local chunk = handle:read(2048)
194 if not chunk then
195 handle:close()
196 handle = nil
197 end
198 return chunk
199 end
200 else return source.empty(io_err or "unable to open file") end
201end
202}}}
203
204Another way to implement this behavior involves a change in the source interface to make it more flexible. Let's allow a source to return a second value, besides the next chunk of data. If the returned chunk is {{nil}}, the extra return value tells us what happened. A second {{nil}} means that there is just no more data and the source is empty. Any other value is considered to be an error message. On the other hand, if the chunk was ''not'' {{nil}}, the second return value tells us whether the source wants to be replaced. If it is {{nil}}, we should proceed using the same source. Otherwise it has to be another source, which we have to use from then on, to get the remaining data.
205
206This extra freedom is good for someone writing a source function, but it is a pain for those that have to use it. Fortunately, given one of these ''fancy'' sources, we can transform it into a simple source that never needs to be replaced, using the following factory.
207 {{{
208function source.simplify(src)
209 return function()
210 local chunk, err_or_new = src()
211 src = err_or_new or src
212 if not chunk then return nil, err_or_new
213 else return chunk end
214 end
215end
216}}}
217
218The simplification factory allows us to write fancy sources and use them as if they were simple. Therefore, our next functions will only produce simple sources, and functions that take sources will assume they are simple.
219
220Going back to our file source, the extended interface allows for a more elegant implementation. The new source just asks to be replaced by an empty source as soon as there is no more data. There is no repeated checking of the handle. To make things simpler for the user, the factory itself simplifies the fancy file source before returning it to the user:
221 {{{
222function source.file(handle, io_err)
223 if handle then
224 return source.simplify(function()
225 local chunk = handle:read(2048)
226 if not chunk then
227 handle:close()
228 return "", source.empty()
229 end
230 return chunk
231 end)
232 else return source.empty(io_err or "unable to open file") end
233end
234}}}
235
236We can make these ideas even more powerful if we use a new feature of Lua 5.0: coroutines. Coroutines suffer from a great lack of advertisement, and I am going to play my part here. Just like lexical scoping, coroutines taste odd at first, but once you get used to the concept, it can save your day. I have to admit that using coroutines to implement our file source would be overkill, so let's implement a concatenated source factory instead.
237 {{{
238function source.cat(...)
239 local arg = {...}
240 local co = coroutine.create(function()
241 local i = 1
242 while i <= #arg do
243 local chunk, err = arg[i]()
244 if chunk then coroutine.yield(chunk)
245 elseif err then return nil, err
246 else i = i + 1 end
247 end
248 end)
249 return function()
250 return shift(coroutine.resume(co))
251 end
252end
253}}}
254
255The factory creates two functions. The first is an auxiliary that does all the work, in the form of a coroutine. It reads a chunk from one of the sources. If the chunk is {{nil}}, it moves to the next source, otherwise it just yields returning the chunk. When it is resumed, it continues from where it stopped and tries to read the next chunk. The second function is the source itself, and just resumes the execution of the auxiliary coroutine, returning to the user whatever chunks it returns (skipping the first result that tells us if the coroutine terminated). Imagine writing the same function without coroutines and you will notice the simplicity of this implementation. We will use coroutines again when we make the filter interface more powerful.
256
257==Chaining Sources==
258
259What does it mean to chain a source with a filter? The most useful interpretation is that the combined source-filter is a new source that produces data and passes it through the filter before returning it. Here is a factory that does it:
260 {{{
261function source.chain(src, f)
262 return source.simplify(function()
263 local chunk, err = src()
264 if not chunk then return f(nil), source.empty(err)
265 else return f(chunk) end
266 end)
267end
268}}}
269
270Our motivating example in the introduction chains a source with a filter. The idea of chaining a source with a filter is useful when one thinks about functions that might get their input data from a source. By chaining a simple source with one or more filters, the same function can be provided with filtered data even though it is unaware of the filtering that is happening behind its back.
271
272==Sinks==
273
274Just as we defined an interface for an initial source of data, we can also define an interface for a final destination of data. We call any function respecting that interface a ''sink''. Below are two simple factories that return sinks. The table factory creates a sink that stores all obtained data into a table. The data can later be efficiently concatenated into a single string with the {{table.concat}} library function. As another example, we introduce the {{null}} sink: A sink that simply discards the data it receives.
275 {{{
276function sink.table(t)
277 t = t or {}
278 local f = function(chunk, err)
279 if chunk then table.insert(t, chunk) end
280 return 1
281 end
282 return f, t
283end
284
285local function null()
286 return 1
287end
288
289function sink.null()
290 return null
291end
292}}}
293
294Sinks receive consecutive chunks of data, until the end of data is notified with a {{nil}} chunk. An error is notified by an extra argument giving an error message after the {{nil}} chunk. If a sink detects an error itself and wishes not to be called again, it should return {{nil}}, optionally followed by an error message. A return value that is not {{nil}} means the sink will accept more data. Finally, just as sources can choose to be replaced, so can sinks, following the same interface. Once again, it is easy to implement a {{sink.simplify}} factory that transforms a fancy sink into a simple sink.
295
296As an example, let's create a source that reads from the standard input, then chain it with a filter that normalizes the end-of-line convention and let's use a sink to place all data into a table, printing the result in the end.
297 {{{
298local load = source.chain(source.file(io.stdin), normalize("\r\n"))
299local store, t = sink.table()
300while 1 do
301 local chunk = load()
302 store(chunk)
303 if not chunk then break end
304end
305print(table.concat(t))
306}}}
307
308Again, just as we created a factory that produces a chained source-filter from a source and a filter, it is easy to create a factory that produces a new sink given a sink and a filter. The new sink passes all data it receives through the filter before handing it in to the original sink. Here is the implementation:
309 {{{
310function sink.chain(f, snk)
311 return function(chunk, err)
312 local r, e = snk(f(chunk))
313 if not r then return nil, e end
314 if not chunk then return snk(nil, err) end
315 return 1
316 end
317end
318}}}
319
320==Pumps==
321
322There is a while loop that has been around for too long in our examples. It's always there because everything that we designed so far is passive. Sources, sinks, filters: None of them will do anything on their own. The operation of pumping all data a source can provide into a sink is so common that we will provide a couple of helper functions to do that for us.
323 {{{
324function pump.step(src, snk)
325 local chunk, src_err = src()
326 local ret, snk_err = snk(chunk, src_err)
327 return chunk and ret and not src_err and not snk_err, src_err or snk_err
328end
329
330function pump.all(src, snk, step)
331 step = step or pump.step
332 while true do
333 local ret, err = step(src, snk)
334 if not ret then return not err, err end
335 end
336end
337}}}
338
339The {{pump.step}} function moves one chunk of data from the source to the sink. The {{pump.all}} function takes an optional {{step}} function and uses it to pump all the data from the source to the sink. We can now use everything we have to write a program that reads a binary file from disk and stores it in another file, after encoding it to the Base64 transfer content encoding:
340 {{{
341local load = source.chain(
342 source.file(io.open("input.bin", "rb")),
343 encode("base64")
344)
345local store = sink.chain(
346 wrap(76),
347 sink.file(io.open("output.b64", "w"))
348)
349pump.all(load, store)
350}}}
351
352The way we split the filters here is not intuitive, on purpose. Alternatively, we could have chained the Base64 encode filter and the line-wrap filter together, and then chained the resulting filter with either the file source or the file sink. It doesn't really matter.
353
354===One last important change===
355
356Turns out we still have a problem. When David Burgess was writing his gzip filter, he noticed that the decompression filter can explode a small input chunk into a huge amount of data. Although we wished we could ignore this problem, we soon agreed we couldn't. The only solution is to allow filters to return partial results, and that is what we chose to do. After invoking the filter to pass input data, the user now has to loop invoking the filter to find out if it has more output data to return. Note that these extra calls can't pass more data to the filter.
357
358More specifically, after passing a chunk of input data to a filter and collecting the first chunk of output data, the user invokes the filter repeatedly, passing the empty string, to get extra output chunks. When the filter itself returns an empty string, the user knows there is no more output data, and can proceed to pass the next input chunk. In the end, after the user passes a {{nil}} notifying the filter that there is no more input data, the filter might still have produced too much output data to return in a single chunk. The user has to loop again, this time passing {{nil}} each time, until the filter itself returns {{nil}} to notify the user it is finally done.
359
360Most filters won't need this extra freedom. Fortunately, the new filter interface is easy to implement. In fact, the end-of-line translation filter we created in the introduction already conforms to it. On the other hand, the chaining function becomes much more complicated. If it weren't for coroutines, I wouldn't be happy to implement it. Let me know if you can find a simpler implementation that does not use coroutines!
361 {{{
362local function chain2(f1, f2)
363 local co = coroutine.create(function(chunk)
364 while true do
365 local filtered1 = f1(chunk)
366 local filtered2 = f2(filtered1)
367 local done2 = filtered1 and ""
368 while true do
369 if filtered2 == "" or filtered2 == nil then break end
370 coroutine.yield(filtered2)
371 filtered2 = f2(done2)
372 end
373 if filtered1 == "" then chunk = coroutine.yield(filtered1)
374 elseif filtered1 == nil then return nil
375 else chunk = chunk and "" end
376 end
377 end)
378 return function(chunk)
379 local _, res = coroutine.resume(co, chunk)
380 return res
381 end
382end
383}}}
384
385Chaining sources also becomes more complicated, but a similar solution is possible with coroutines. Chaining sinks is just as simple as it has always been. Interestingly, these modifications do not have a measurable negative impact on the performance of filters that didn't need the added flexibility. They do greatly improve the efficiency of filters like the gzip filter, though, and that is why we are keeping them.
386
387===Final considerations===
388
389These ideas were created during the development of {{LuaSocket}}[http://www.tecgraf.puc-rio.br/luasocket] 2.0, and are available as the LTN12 module. As a result, the {{LuaSocket}}[http://www.tecgraf.puc-rio.br/luasocket] implementation was greatly simplified and became much more powerful. The MIME module is tightly integrated with LTN12 and provides many other filters. We felt these concepts deserved to be made public even to those who don't care about {{LuaSocket}}[http://www.tecgraf.puc-rio.br/luasocket], hence the LTN.
390
391One extra application that deserves mentioning makes use of an identity filter. Suppose you want to provide some feedback to the user while a file is being downloaded into a sink. Chaining the sink with an identity filter (a filter that simply returns the received data unaltered), you can update a progress counter on the fly. The original sink doesn't have to be modified. Another interesting idea is that of a T sink: A sink that sends data to two other sinks. In summary, there appears to be enough room for many other interesting ideas.
392
393In this technical note we introduced filters, sources, sinks, and pumps. These are useful tools for data processing in general. Sources provide a simple abstraction for data acquisition. Sinks provide an abstraction for final data destinations. Filters define an interface for data transformations. The chaining of filters, sources and sinks provides an elegant way to create arbitrarily complex data transformation from simpler transformations. Pumps just put the machinery to work.
diff --git a/ltn013.md b/ltn013.md
new file mode 100644
index 0000000..9c56805
--- /dev/null
+++ b/ltn013.md
@@ -0,0 +1,191 @@
1# Using finalized exceptions
2### or How to get rid of all those if statements
3by DiegoNehab
4
5
6## Abstract
7This little LTN describes a simple exception scheme that greatly simplifies error checking in Lua programs. All the needed functionality ships standard with Lua, but is hidden between the `assert` and `pcall` functions. To make it more evident, we stick to a convenient standard (one you probably already use anyway) for Lua function return values, and define two very simple helper functions (either in C or in Lua itself).
8
9## Introduction
10
11Most Lua functions return `nil` in case of error, followed by a message describing the error. If you don't use this convention, you probably have good reasons. Hopefully, after reading on, you will realize your reasons are not good enough.
12
13If you are like me, you hate error checking. Most nice little code snippets that look beautiful when you first write them lose some of their charm when you add all that error checking code. Yet, error checking is as important as the rest of the code. How sad.
14
15Even if you stick to a return convention, any complex task involving several function calls makes error checking both boring and error-prone (do you see the "error" below?)
16```lua
17function task(arg1, arg2, ...)
18 local ret1, err = task1(arg1)
19 if not ret1 then
20 cleanup1()
21 return nil, error
22 end
23 local ret2, err = task2(arg2)
24 if not ret then
25 cleanup2()
26 return nil, error
27 end
28 ...
29end
30```
31
32The standard `assert` function provides an interesting alternative. To use it, simply wrap every function call to be error checked in a call to `assert`. The `assert` function checks the value of its first argument. If it is `nil` or `false`, `assert` throws the second argument as an error message. Otherwise, `assert` lets all arguments through as if it had not been there. The idea greatly simplifies error checking:
33```lua
34function task(arg1, arg2, ...)
35 local ret1 = assert(task1(arg1))
36 local ret2 = assert(task2(arg2))
37 ...
38end
39```
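
To make the pass-through behavior concrete, here is a minimal sketch. The helper `half` is hypothetical and exists only to follow the `nil`-plus-message convention; it is not part of any library.

```lua
-- Hypothetical helper that follows the nil-plus-message convention.
local function half(n)
  if n % 2 ~= 0 then
    return nil, "odd number"
  end
  return n / 2, "exact"
end

-- On success, assert hands back every argument it received.
local value, note = assert(half(10))

-- On failure, assert raises an error built from the second argument.
-- We wrap the call in pcall here only to observe the outcome.
local ok, err = pcall(function()
  return assert(half(3))
end)
```

In the first call both return values of `half` come out of `assert` untouched. In the second, `ok` is `false` and `err` carries the message returned by `half`.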
40
41If any task fails, the execution is aborted by `assert` and the error message is displayed to the user as the cause of the problem. If no error happens, the task completes as before. There isn't a single `if` statement and this is great. However, there are some problems with the idea.
42
43First, the topmost `task` function doesn't respect the protocol followed by the lower-level tasks: it raises an error instead of returning `nil` followed by the error message. Here is where the standard `pcall` comes in handy.
44```lua
45function xtask(arg1, arg2, ...)
46 local ret1 = assert(task1(arg1))
47 local ret2 = assert(task2(arg2))
48 ...
49end
50
51function task(arg1, arg2, ...)
52 local ok, ret_or_err = pcall(xtask, arg1, arg2, ...)
53 if ok then return ret_or_err
54 else return nil, ret_or_err end
55end
56```
57
58Our new `task` function is well behaved. The `pcall` function catches any error raised by the calls to `assert` and returns it after the status code. That way, errors don't get propagated to the user of the high-level `task` function.
59
60These are the main ideas for our exception scheme, but there are still a few glitches to fix:
61
62* Directly using `pcall` ruined the simplicity of the code;
63* What happened to the cleanup function calls? What if we have to, say, close a file?
64* `assert` messes with the error message before raising the error (it adds line number information).
65
66Fortunately, all these problems are very easy to solve and that's what we do in the following sections.
67
68## Introducing the `protect` factory
69
70We used the `pcall` function to shield the user from errors that could be raised by the underlying implementation. Instead of directly using `pcall` (and thus duplicating code) every time, we prefer a factory that does the same job:
71```lua
72local function pack(ok, ...)
73 return ok, {...}
74end
75
76function protect(f)
77 return function(...)
78 local ok, ret = pack(pcall(f, ...))
79 if ok then return unpack(ret)
80 else return nil, ret[1] end
81 end
82end
83```
84
85The `protect` factory receives a function that might raise exceptions and returns a function that respects our return value convention. Now we can rewrite the top-level `task` function in a much cleaner way:
86```lua
87task = protect(function(arg1, arg2, ...)
88 local ret1 = assert(task1(arg1))
89 local ret2 = assert(task2(arg2))
90 ...
91end)
92```
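
The `protect` listing above calls the global `unpack`, which, as far as we know, moved to `table.unpack` in Lua 5.2, and a plain `{...}` can lose track of trailing `nil` results. Here is a minimal sketch that sidesteps both, assuming Lua 5.1 or later. The names `protect_portable` and `risky` are illustrative only and do not come from LuaSocket.

```lua
-- Fall back to the global unpack on Lua 5.1.
local unpack = table.unpack or unpack

-- Keep the number of results next to the table so nils survive.
local function pack(ok, ...)
  return ok, select("#", ...), {...}
end

local function protect_portable(f)
  return function(...)
    local ok, n, ret = pack(pcall(f, ...))
    if ok then
      return unpack(ret, 1, n)
    end
    -- On failure, ret[1] holds the error value caught by pcall.
    return nil, ret[1]
  end
end

-- Hypothetical function that raises on bad input.
local risky = protect_portable(function(x)
  if type(x) ~= "number" then
    error("number expected", 0)
  end
  return x * 2, nil, "done"
end)

local a, b, c = risky(21)
local res, err = risky("oops")
```

The first call returns all three results, including the `nil` in the middle. The second call returns `nil` followed by the error message instead of raising.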
93
94The Lua implementation of the `protect` factory suffers from the creation of tables to hold multiple arguments and return values. It is possible (and easy) to implement the same function in C, without any table creation.
95```c
96static int safecall(lua_State *L) {
97 lua_pushvalue(L, lua_upvalueindex(1));
98 lua_insert(L, 1);
99 if (lua_pcall(L, lua_gettop(L) - 1, LUA_MULTRET, 0) != 0) {
100 lua_pushnil(L);
101 lua_insert(L, 1);
102 return 2;
103 } else return lua_gettop(L);
104}
105
106static int protect(lua_State *L) {
107 lua_pushcclosure(L, safecall, 1);
108 return 1;
109}
110```
111
112## The `newtry` factory
113
114Let's solve the two remaining issues with a single shot and use a concrete example to illustrate the proposed solution. Suppose you want to write a function to download an HTTP document. You have to connect, send the request and read the reply. Each of these tasks can fail, but if something goes wrong after you connected, you have to close the connection before returning the error message.
115```lua
116get = protect(function(host, path)
117 local c
118 -- create a try function with a finalizer to close the socket
119 local try = newtry(function()
120 if c then c:close() end
121 end)
122 -- connect and send request
123 c = try(connect(host, 80))
124 try(c:send("GET " .. path .. " HTTP/1.0\r\n\r\n"))
125 -- get headers
126 local h = {}
127 while 1 do
128 local l = try(c:receive())
129 if l == "" then break end
130 table.insert(h, l)
131 end
132 -- get body
133 local b = try(c:receive("*a"))
134 c:close()
135 return b, h
136end)
137```
138
139The `newtry` factory returns a function that works just like `assert`. The differences are that the `try` function doesn't mess with the error message and it calls an optional "finalizer" before raising the error. In our example, the finalizer simply closes the socket.
140
141Even with a simple example like this, we see that the finalized exceptions simplified our life. Let's see what we gain in general, not just in this example:
142
143* We don't need to declare dummy variables to hold error messages in case any ever shows up;
144* We avoid using a variable to hold something that could either be a return value or an error message;
145* We didn't have to use several "if" statements to check for errors;
146* If an error happens, we know our finalizer is going to be invoked automatically;
147* Exceptions get propagated, so we don't repeat these "if" statements until the error reaches the user.
148
149Try writing the same function without the tricks we used above and you will see that the code gets ugly. Longer sequences of operations with error checking would get even uglier. So let's implement the `newtry` function in Lua:
150```lua
151function newtry(f)
152 return function(...)
153 if not arg[1] then
154 if f then f() end
155 error(arg[2], 0)
156 else
157 return ...
158 end
159 end
160end
161```
162
163Again, the implementation suffers from the creation of tables at each function call, so we prefer the C version:
164```c
165static int finalize(lua_State *L) {
166 if (!lua_toboolean(L, 1)) {
167 lua_pushvalue(L, lua_upvalueindex(1));
168 lua_pcall(L, 0, 0, 0);
169 lua_settop(L, 2);
170 lua_error(L);
171 return 0;
172 } else return lua_gettop(L);
173}
174
175static int do_nothing(lua_State *L) {
176 (void) L;
177 return 0;
178}
179
180static int newtry(lua_State *L) {
181 lua_settop(L, 1);
182 if (lua_isnil(L, 1))
183 lua_pushcfunction(L, do_nothing);
184 lua_pushcclosure(L, finalize, 1);
185 return 1;
186}
187```
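
The Lua listing of `newtry` shown earlier relies on the implicit `arg` table, which to our knowledge belongs to Lua 5.0 and is not available in later versions inside a function that also uses `...`. For readers who want a pure Lua version on Lua 5.1 or later, here is a sketch under that assumption. The name `newtry_vararg` is ours, chosen to avoid confusion with the real function.

```lua
-- Sketch of newtry using named parameters instead of the arg table.
local function newtry_vararg(finalizer)
  return function(ok, ...)
    if not ok then
      -- Run the optional finalizer before raising.
      if finalizer then finalizer() end
      -- (...) keeps only the first extra argument, the message.
      -- Level 0 asks error not to add position information.
      error((...), 0)
    end
    return ok, ...
  end
end

-- Usage: the finalizer records that cleanup happened.
local closed = false
local try = newtry_vararg(function() closed = true end)

-- Success: every argument passes through unchanged.
local v1, v2 = try("data", 4)

-- Failure: the finalizer runs, then the message is raised as is.
local ok, err = pcall(try, nil, "timeout")
```

Because the first argument is a named parameter, no table is created per call, which also addresses the allocation concern without resorting to C.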
188
189## Final considerations
190
191The `protect` and `newtry` functions saved a *lot* of work in the implementation of [LuaSocket](https://github.com/lunarmodules/luasocket). The size of some modules was cut in half by these ideas. It's true the scheme is not as generic as the exception mechanism of programming languages like C++ or Java, but the power/simplicity ratio is favorable and I hope it serves you as well as it served [LuaSocket](https://github.com/lunarmodules/luasocket).
diff --git a/ltn013.wiki b/ltn013.wiki
deleted file mode 100644
index a622424..0000000
--- a/ltn013.wiki
+++ /dev/null
@@ -1,194 +0,0 @@
1===Using finalized exceptions===
2==or How to get rid of all those if statements==
3by DiegoNehab
4
5{{{
6
7}}}
8
9===Abstract===
10This little LTN describes a simple exception scheme that greatly simplifies error checking in Lua programs. All the needed functionality ships standard with Lua, but is hidden between the {{assert}} and {{pcall}} functions. To make it more evident, we stick to a convenient standard (one you probably already use anyway) for Lua function return values, and define two very simple helper functions (either in C or in Lua itself).
11
12===Introduction===
13
14Most Lua functions return {{nil}} in case of error, followed by a message describing the error. If you don't use this convention, you probably have good reasons. Hopefully, after reading on, you will realize your reasons are not good enough.
15
16If you are like me, you hate error checking. Most nice little code snippets that look beautiful when you first write them lose some of their charm when you add all that error checking code. Yet, error checking is as important as the rest of the code. How sad.
17
18Even if you stick to a return convention, any complex task involving several function calls makes error checking both boring and error-prone (do you see the ''error'' below?)
19 {{{
20function task(arg1, arg2, ...)
21 local ret1, err = task1(arg1)
22 if not ret1 then
23 cleanup1()
24 return nil, error
25 end
26 local ret2, err = task2(arg2)
27 if not ret then
28 cleanup2()
29 return nil, error
30 end
31 ...
32end
33}}}
34
35The standard {{assert}} function provides an interesting alternative. To use it, simply wrap every function call to be error checked in a call to {{assert}}. The {{assert}} function checks the value of its first argument. If it is {{nil}} or {{false}}, {{assert}} throws the second argument as an error message. Otherwise, {{assert}} lets all arguments through as if it had not been there. The idea greatly simplifies error checking:
36 {{{
37function task(arg1, arg2, ...)
38 local ret1 = assert(task1(arg1))
39 local ret2 = assert(task2(arg2))
40 ...
41end
42}}}
43
44If any task fails, the execution is aborted by {{assert}} and the error message is displayed to the user as the cause of the problem. If no error happens, the task completes as before. There isn't a single {{if}} statement and this is great. However, there are some problems with the idea.
45
46First, the topmost {{task}} function doesn't respect the protocol followed by the lower-level tasks: it raises an error instead of returning {{nil}} followed by the error message. Here is where the standard {{pcall}} comes in handy.
47 {{{
48function xtask(arg1, arg2, ...)
49 local ret1 = assert(task1(arg1))
50 local ret2 = assert(task2(arg2))
51 ...
52end
53
54function task(arg1, arg2, ...)
55 local ok, ret_or_err = pcall(xtask, arg1, arg2, ...)
56 if ok then return ret_or_err
57 else return nil, ret_or_err end
58end
59}}}
60
61Our new {{task}} function is well behaved. The {{pcall}} function catches any error raised by the calls to {{assert}} and returns it after the status code. That way, errors don't get propagated to the user of the high-level {{task}} function.
62
63These are the main ideas for our exception scheme, but there are still a few glitches to fix:
64
65 * Directly using {{pcall}} ruined the simplicity of the code;
66 * What happened to the cleanup function calls? What if we have to, say, close a file?
67 * {{assert}} messes with the error message before raising the error (it adds line number information).
68
69Fortunately, all these problems are very easy to solve and that's what we do in the following sections.
70
71== Introducing the {{protect}} factory ==
72
73We used the {{pcall}} function to shield the user from errors that could be raised by the underlying implementation. Instead of directly using {{pcall}} (and thus duplicating code) every time, we prefer a factory that does the same job:
74 {{{
75local function pack(ok, ...)
76 return ok, {...}
77end
78
79function protect(f)
80 return function(...)
81 local ok, ret = pack(pcall(f, ...))
82 if ok then return unpack(ret)
83 else return nil, ret[1] end
84 end
85end
86}}}
87
88The {{protect}} factory receives a function that might raise exceptions and returns a function that respects our return value convention. Now we can rewrite the top-level {{task}} function in a much cleaner way:
89 {{{
90task = protect(function(arg1, arg2, ...)
91 local ret1 = assert(task1(arg1))
92 local ret2 = assert(task2(arg2))
93 ...
94end)
95}}}
96
97The Lua implementation of the {{protect}} factory suffers from the creation of tables to hold multiple arguments and return values. It is possible (and easy) to implement the same function in C, without any table creation.
98 {{{
99static int safecall(lua_State *L) {
100 lua_pushvalue(L, lua_upvalueindex(1));
101 lua_insert(L, 1);
102 if (lua_pcall(L, lua_gettop(L) - 1, LUA_MULTRET, 0) != 0) {
103 lua_pushnil(L);
104 lua_insert(L, 1);
105 return 2;
106 } else return lua_gettop(L);
107}
108
109static int protect(lua_State *L) {
110 lua_pushcclosure(L, safecall, 1);
111 return 1;
112}
113}}}
114
115===The {{newtry}} factory===
116
117Let's solve the two remaining issues with a single shot and use a concrete example to illustrate the proposed solution. Suppose you want to write a function to download an HTTP document. You have to connect, send the request and read the reply. Each of these tasks can fail, but if something goes wrong after you connected, you have to close the connection before returning the error message.
118 {{{
119get = protect(function(host, path)
120 local c
121 -- create a try function with a finalizer to close the socket
122 local try = newtry(function()
123 if c then c:close() end
124 end)
125 -- connect and send request
126 c = try(connect(host, 80))
127 try(c:send("GET " .. path .. " HTTP/1.0\r\n\r\n"))
128 -- get headers
129 local h = {}
130 while 1 do
131 local l = try(c:receive())
132 if l == "" then break end
133 table.insert(h, l)
134 end
135 -- get body
136 local b = try(c:receive("*a"))
137 c:close()
138 return b, h
139end)
140}}}
141
142The {{newtry}} factory returns a function that works just like {{assert}}. The differences are that the {{try}} function doesn't mess with the error message and it calls an optional ''finalizer'' before raising the error. In our example, the finalizer simply closes the socket.
143
144Even with a simple example like this, we see that the finalized exceptions simplified our life. Let's see what we gain in general, not just in this example:
145
146 * We don't need to declare dummy variables to hold error messages in case any ever shows up;
147 * We avoid using a variable to hold something that could either be a return value or an error message;
148 * We didn't have to use several ''if'' statements to check for errors;
149 * If an error happens, we know our finalizer is going to be invoked automatically;
150 * Exceptions get propagated, so we don't repeat these ''if'' statements until the error reaches the user.
151
152Try writing the same function without the tricks we used above and you will see that the code gets ugly. Longer sequences of operations with error checking would get even uglier. So let's implement the {{newtry}} function in Lua:
153 {{{
154function newtry(f)
155 return function(...)
156 if not arg[1] then
157 if f then f() end
158 error(arg[2], 0)
159 else
160 return ...
161 end
162 end
163end
164}}}
165
166Again, the implementation suffers from the creation of tables at each function call, so we prefer the C version:
167 {{{
168static int finalize(lua_State *L) {
169 if (!lua_toboolean(L, 1)) {
170 lua_pushvalue(L, lua_upvalueindex(1));
171 lua_pcall(L, 0, 0, 0);
172 lua_settop(L, 2);
173 lua_error(L);
174 return 0;
175 } else return lua_gettop(L);
176}
177
178static int do_nothing(lua_State *L) {
179 (void) L;
180 return 0;
181}
182
183static int newtry(lua_State *L) {
184 lua_settop(L, 1);
185 if (lua_isnil(L, 1))
186 lua_pushcfunction(L, do_nothing);
187 lua_pushcclosure(L, finalize, 1);
188 return 1;
189}
190}}}
191
192===Final considerations===
193
194The {{protect}} and {{newtry}} functions saved a ''lot'' of work in the implementation of {{LuaSocket}}[http://www.tecgraf.puc-rio.br/luasocket]. The size of some modules was cut in half by these ideas. It's true the scheme is not as generic as the exception mechanism of programming languages like C++ or Java, but the power/simplicity ratio is favorable and I hope it serves you as well as it served {{LuaSocket}}.
diff --git a/luasocket-scm-3.rockspec b/luasocket-scm-3.rockspec
index 1045251..5a5f7b5 100644
--- a/luasocket-scm-3.rockspec
+++ b/luasocket-scm-3.rockspec
@@ -130,6 +130,5 @@ build = {
130 copy_directories = { 130 copy_directories = {
131 "docs" 131 "docs"
132 , "samples" 132 , "samples"
133 , "etc"
134 , "test" } 133 , "test" }
135} 134}
diff --git a/makefile.dist b/makefile.dist
index a27ba57..5ef44d3 100644
--- a/makefile.dist
+++ b/makefile.dist
@@ -22,20 +22,17 @@ SAMPLES = \
22 samples/lpr.lua \ 22 samples/lpr.lua \
23 samples/talker.lua \ 23 samples/talker.lua \
24 samples/tinyirc.lua 24 samples/tinyirc.lua
25 25 samples/b64.lua \
26ETC = \ 26 samples/check-links.lua \
27 etc/README \ 27 samples/check-memory.lua \
28 etc/b64.lua \ 28 samples/dict.lua \
29 etc/check-links.lua \ 29 samples/dispatch.lua \
30 etc/check-memory.lua \ 30 samples/eol.lua \
31 etc/dict.lua \ 31 samples/forward.lua \
32 etc/dispatch.lua \ 32 samples/get.lua \
33 etc/eol.lua \ 33 samples/lp.lua \
34 etc/forward.lua \ 34 samples/qp.lua \
35 etc/get.lua \ 35 samples/tftp.lua
36 etc/lp.lua \
37 etc/qp.lua \
38 etc/tftp.lua
39 36
40SRC = \ 37SRC = \
41 src/makefile \ 38 src/makefile \
@@ -117,9 +114,6 @@ dist:
117 cp -vf README.md $(DIST) 114 cp -vf README.md $(DIST)
118 cp -vf $(MAKE) $(DIST) 115 cp -vf $(MAKE) $(DIST)
119 116
120 mkdir -p $(DIST)/etc
121 cp -vf $(ETC) $(DIST)/etc
122
123 mkdir -p $(DIST)/src 117 mkdir -p $(DIST)/src
124 cp -vf $(SRC) $(DIST)/src 118 cp -vf $(SRC) $(DIST)/src
125 119
diff --git a/samples/README b/samples/README
index e63a6f5..4ee06b6 100644
--- a/samples/README
+++ b/samples/README
@@ -1,11 +1,95 @@
1This directory contains some sample programs using 1This directory contains some sample programs using
2LuaSocket. This code is not supported. 2LuaSocket. This code is not supported.
3 3
4 tftp.lua -- Trivial FTP client
5
6This module implements file retrieval by the TFTP protocol.
7Its main use was to test the UDP code, but since someone
8found it useful, I turned it into a module that is almost
9official (no uploads, yet).
10
11 dict.lua -- Dict client
12
13The dict.lua module started with a cool simple client
14for the DICT protocol, written by Luiz Henrique Figueiredo.
15This new version has been converted into a library, similar
16to the HTTP and FTP libraries, that can be used from within
17any luasocket application. Take a look at the source code
18and you will be able to figure out how to use it.
19
20 lp.lua -- LPD client library
21
22The lp.lua module implements the client part of the Line
23Printer Daemon protocol, used to print files on Unix
24machines. It is courtesy of David Burgess! See the source
25code and the lpr.lua in the examples directory.
26
27 b64.lua
28 qp.lua
29 eol.lua
30
31These are tiny programs that perform Base64,
32Quoted-Printable and end-of-line marker conversions.
33
34 get.lua -- file retriever
35
36This little program is a client that uses the FTP and
37HTTP code to implement a command line file grabber. Just
38run
39
40 lua get.lua <remote-file> [<local-file>]
41
42to download a remote file (either ftp:// or http://) to
43the specified local file. The program also prints the
44download throughput, elapsed time, bytes already downloaded
45etc during download.
46
47 check-memory.lua -- checks memory consumption
48
49This is just to see how much memory each module uses.
50
51 dispatch.lua -- coroutine based dispatcher
52
53This is a first try at a coroutine based non-blocking
54dispatcher for LuaSocket. Take a look at 'check-links.lua'
55and at 'forward.lua' to see how to use it.
56
57 check-links.lua -- HTML link checker program
58
59This little program scans a HTML file and checks for broken
60links. It is similar to check-links.pl by Jamie Zawinski,
61but uses all facilities of the LuaSocket library and the Lua
62language. It has not been thoroughly tested, but it should
63work. Just run
64
65 lua check-links.lua [-n] {<url>} > output
66
67and open the result to see a list of broken links. Make sure
68you check the '-n' switch. It runs in non-blocking mode,
69using coroutines, and is MUCH faster!
70
71 forward.lua -- coroutine based forward server
72
73This is a forward server that can accept several connections
74and transfers simultaneously using non-blocking I/O and the
75coroutine-based dispatcher. You can run, for example
76
77 lua forward.lua 8080:proxy.com:3128
78
79to redirect all local connections to port 8080 to the host
80'proxy.com' at port 3128.
81
82 unix.c and unix.h
83
84This is an implementation of Unix local domain sockets and
85demonstrates how to extend LuaSocket with a new type of
86transport. It has been tested on Linux and on Mac OS X.
87
4 listener.lua -- socket to stdout 88 listener.lua -- socket to stdout
5 talker.lua -- stdin to socket 89 talker.lua -- stdin to socket
6 90
7listener.lua and talker.lua are about the simplest 91listener.lua and talker.lua are about the simplest
8applications you can write using LuaSocket. Run 92applications you can write using LuaSocket. Run
9 93
10 'lua listener.lua' and 'lua talker.lua' 94 'lua listener.lua' and 'lua talker.lua'
11 95
@@ -17,13 +101,13 @@ be printed by listen.lua.
17This is a cool program written by David Burgess to print 101This is a cool program written by David Burgess to print
18files using the Line Printer Daemon protocol, widely used in 102files using the Line Printer Daemon protocol, widely used in
19Unix machines. It uses the lp.lua implementation, in the 103Unix machines. It uses the lp.lua implementation, in the
20etc directory. Just run 'lua lpr.lua <filename> 104samples directory. Just run 'lua lpr.lua <filename>
21queue=<printername>' and the file will print! 105queue=<printername>' and the file will print!
22 106
23 cddb.lua -- CDDB client 107 cddb.lua -- CDDB client
24 108
25This is the first try on a simple CDDB client. Not really 109This is the first try on a simple CDDB client. Not really
26useful, but one day it might become a module. 110useful, but one day it might become a module.
27 111
28 daytimeclnt.lua -- day time client 112 daytimeclnt.lua -- day time client
29 113
diff --git a/etc/b64.lua b/samples/b64.lua
index 11eeb2d..11eeb2d 100644
--- a/etc/b64.lua
+++ b/samples/b64.lua
diff --git a/etc/check-links.lua b/samples/check-links.lua
index 283f3ac..283f3ac 100644
--- a/etc/check-links.lua
+++ b/samples/check-links.lua
diff --git a/etc/check-memory.lua b/samples/check-memory.lua
index 7bd984d..7bd984d 100644
--- a/etc/check-memory.lua
+++ b/samples/check-memory.lua
diff --git a/etc/cookie.lua b/samples/cookie.lua
index fec10a1..fec10a1 100644
--- a/etc/cookie.lua
+++ b/samples/cookie.lua
diff --git a/etc/dict.lua b/samples/dict.lua
index 8c5b711..8c5b711 100644
--- a/etc/dict.lua
+++ b/samples/dict.lua
diff --git a/etc/dispatch.lua b/samples/dispatch.lua
index 2485415..2485415 100644
--- a/etc/dispatch.lua
+++ b/samples/dispatch.lua
diff --git a/etc/eol.lua b/samples/eol.lua
index eeaf0ce..eeaf0ce 100644
--- a/etc/eol.lua
+++ b/samples/eol.lua
diff --git a/etc/forward.lua b/samples/forward.lua
index 05ced1a..05ced1a 100644
--- a/etc/forward.lua
+++ b/samples/forward.lua
diff --git a/etc/get.lua b/samples/get.lua
index d53c465..d53c465 100644
--- a/etc/get.lua
+++ b/samples/get.lua
diff --git a/etc/links b/samples/links
index 087f1c0..087f1c0 100644
--- a/etc/links
+++ b/samples/links
diff --git a/etc/lp.lua b/samples/lp.lua
index 25f0b95..25f0b95 100644
--- a/etc/lp.lua
+++ b/samples/lp.lua
diff --git a/etc/qp.lua b/samples/qp.lua
index 523238b..523238b 100644
--- a/etc/qp.lua
+++ b/samples/qp.lua
diff --git a/etc/tftp.lua b/samples/tftp.lua
index ed99cd1..ed99cd1 100644
--- a/etc/tftp.lua
+++ b/samples/tftp.lua