author     Diego Nehab <diego@tecgraf.puc-rio.br>  2007-05-31 22:27:40 +0000
committer  Diego Nehab <diego@tecgraf.puc-rio.br>  2007-05-31 22:27:40 +0000
commit     3074a8f56b5153f4477e662453102583d7b6f539 (patch)
tree       095eecdd7017e17115ab8387898d2f5e5f2f2323
parent     7b195164b0c8755b15e8055f1d524282847f6e13 (diff)
Before sending to Roberto.
-rw-r--r--  gem/ltn012.tex | 218
1 file changed, 105 insertions(+), 113 deletions(-)
diff --git a/gem/ltn012.tex b/gem/ltn012.tex
index 7dbc5ef..0f81b86 100644
--- a/gem/ltn012.tex
+++ b/gem/ltn012.tex
@@ -23,19 +23,17 @@ received in consecutive function calls, returning partial
 results after each invocation. Examples of operations that can be
 implemented as filters include the end-of-line normalization
 for text, Base64 and Quoted-Printable transfer content
-encodings, the breaking of text into lines, SMTP byte
-stuffing, and there are many others. Filters become even
+encodings, the breaking of text into lines, SMTP dot-stuffing,
+and there are many others. Filters become even
 more powerful when we allow them to be chained together to
 create composite filters. In this context, filters can be seen
 as the middle links in a chain of data transformations. Sources and sinks
 are the corresponding end points of these chains. A source
 is a function that produces data, chunk by chunk, and a sink
 is a function that takes data, chunk by chunk. In this
-chapter, we describe the design of an elegant interface for filters,
-sources, sinks and chaining, refine it
-until it reaches a high degree of generality. We discuss
-implementation challenges, provide practical solutions,
-and illustrate each step with concrete examples.
+article, we describe the design of an elegant interface for filters,
+sources, sinks, and chaining, and illustrate each step
+with concrete examples.
 \end{abstract}
 
 
@@ -52,7 +50,7 @@ transfer coding, and the list goes on.
 Many complex tasks require a combination of two or more such
 transformations, and therefore a general mechanism for
 promoting reuse is desirable. In the process of designing
-LuaSocket 2.0, David Burgess and I were forced to deal with
+\texttt{LuaSocket~2.0}, David Burgess and I were forced to deal with
 this problem. The solution we reached proved to be very
 general and convenient. It is based on the concepts of
 filters, sources, sinks, and pumps, which we introduce
@@ -62,18 +60,18 @@ below.
 with chunks of input, successively returning processed
 chunks of output. More importantly, the result of
 concatenating all the output chunks must be the same as the
-result of applying the filter over the concatenation of all
+result of applying the filter to the concatenation of all
 input chunks. In fancier language, filters \emph{commute}
 with the concatenation operator. As a result, chunk
 boundaries are irrelevant: filters correctly handle input
-data no matter how it was originally split.
+data no matter how it is split.
 
 A \emph{chain} transparently combines the effect of one or
-more filters. The interface of a chain must be
+more filters. The interface of a chain is
 indistinguishable from the interface of its components.
 This allows a chained filter to be used wherever an atomic
-filter is expected. In particular, chains can be chained
-themselves to create arbitrarily complex operations.
+filter is expected. In particular, chains can be
+themselves chained to create arbitrarily complex operations.
 
 Filters can be seen as internal nodes in a network through
 which data will flow, potentially being transformed many
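The commuting requirement described in the hunk above is easy to see with a toy filter. The sketch below is illustrative only (it is not part of the article or of LuaSocket): a stateless upper-casing function satisfies the filter interface and trivially commutes with concatenation.

```lua
-- Toy filter: upper-cases its input. It is stateless, so it
-- needs no context and trivially commutes with concatenation.
local function upper(chunk)
  if chunk then return string.upper(chunk) end
  return nil -- a nil chunk signals the end of data
end

-- Chunk boundaries are irrelevant: splitting the input
-- differently yields the same concatenated output.
assert(upper("hel") .. upper("lo") == upper("hello"))
```

Stateful filters, such as the end-of-line normalizer below, must satisfy the same property; that is what forces them to carry context between calls.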
@@ -93,15 +91,13 @@ anything to happen. \emph{Pumps} provide the driving force
 that pushes data through the network, from a source to a
 sink.
 
-These concepts will become less abstract with examples. In
-the following sections, we start with a simplified
-interface, which we refine several times until no obvious
-shortcomings remain. The evolution we present is not
-contrived: it recreates the steps we followed ourselves as
-we consolidated our understanding of these concepts and the
-applications that benefit from them.
+In the following sections, we start with a simplified
+interface, which we later refine. The evolution we present
+is not contrived: it recreates the steps we followed
+ourselves as we consolidated our understanding of these
+concepts within our application domain.
 
-\subsection{A concrete example}
+\subsection{A simple example}
 
 Let us use the end-of-line normalization of text as an
 example to motivate our initial filter interface.
@@ -141,23 +137,23 @@ it with a \texttt{nil} chunk. The filter responds by returning
 the final chunk of processed data.
 
 Although the interface is extremely simple, the
-implementation is not so obvious. Any filter
+implementation is not so obvious. A normalization filter
 respecting this interface needs to keep some kind of context
-between calls. This is because chunks can for example be broken
-between the CR and LF characters marking the end of a line. This
-need for contextual storage is what motivates the use of
-factories: each time the factory is called, it returns a
+between calls. This is because a chunk boundary may lie between
+the CR and LF characters marking the end of a line. This
+need for contextual storage motivates the use of
+factories: each time the factory is invoked, it returns a
 filter with its own context so that we can have several
 independent filters being used at the same time. For
 efficiency reasons, we must avoid the obvious solution of
 concatenating all the input into the context before
 producing any output.
 
-To that end, we will break the implementation in two parts:
+To that end, we break the implementation into two parts:
 a low-level filter, and a factory of high-level filters. The
-low-level filter will be implemented in C and will not carry
+low-level filter is implemented in C and does not maintain
 any context between function calls. The high-level filter
-factory, implemented in Lua, will create and return a
+factory, implemented in Lua, creates and returns a
 high-level filter that maintains whatever context the low-level
 filter needs, but isolates the user from its internal
 details. That way, we take advantage of C's efficiency to
@@ -191,22 +187,21 @@ end
 The \texttt{normalize} factory simply calls a more generic
 factory, the \texttt{cycle} factory. This factory receives a
 low-level filter, an initial context, and an extra
-parameter, and returns the corresponding high-level filter.
-Each time the high-level filer is passed a new chunk, it
-invokes the low-level filter passing it the previous
-context, the new chunk, and the extra argument. The
-low-level filter in turn produces the chunk of processed
-data and a new context. The high-level filter then updates
-its internal context, and returns the processed chunk of
-data to the user. It is the low-level filter that does all
-the work. Notice that we take advantage of Lua's lexical
+parameter, and returns a new high-level filter. Each time
+the high-level filter is passed a new chunk, it invokes the
+low-level filter with the previous context, the new chunk,
+and the extra argument. It is the low-level filter that
+does all the work, producing the chunk of processed data and
+a new context. The high-level filter then updates its
+internal context, and returns the processed chunk of data to
+the user. Notice that we take advantage of Lua's lexical
 scoping to store the context in a closure between function
 calls.
 
 Concerning the low-level filter code, we must first accept
 that there is no perfect solution to the end-of-line marker
-normalization problem itself. The difficulty comes from an
-inherent ambiguity on the definition of empty lines within
+normalization problem. The difficulty comes from an
+inherent ambiguity in the definition of empty lines within
 mixed input. However, the following solution works well for
 any consistent input, as well as for non-empty lines in
 mixed input. It also does a reasonable job with empty lines
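The `cycle` factory described in the hunk above is small enough to sketch here. This is a reconstruction for the reader's convenience, not code taken from this diff; the authoritative version ships with the LuaSocket distribution.

```lua
-- Sketch of the cycle factory: it wraps a context-free
-- low-level filter 'low', keeping the evolving context 'ctx'
-- in a closure so each returned filter is independent.
function cycle(low, ctx, extra)
  return function(chunk)
    local ret
    ret, ctx = low(ctx, chunk, extra)
    return ret
  end
end
```

Each call hands the stored context to the low-level filter and saves the new context it returns, which is exactly the division of labor the text describes.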
@@ -218,17 +213,18 @@ The idea is to consider both CR and~LF as end-of-line
 is seen alone, or followed by a different candidate. In
 other words, CR~CR~and LF~LF each issue two end-of-line
 markers, whereas CR~LF~and LF~CR issue only one marker each.
-This idea correctly handles the Unix, DOS/MIME, VMS, and Mac
-OS, as well as other more obscure conventions.
+This method correctly handles the Unix, DOS/MIME, VMS, and Mac
+OS conventions.
 
 \subsection{The C part of the filter}
 
 Our low-level filter is divided into two simple functions.
-The inner function actually does the conversion. It takes
+The inner function performs the normalization itself. It takes
 each input character in turn, deciding what to output and
 how to modify the context. The context tells if the last
-character processed was an end-of-line candidate, and if so,
-which candidate it was.
+processed character was an end-of-line candidate, and if so,
+which candidate it was. For efficiency, it uses
+Lua's auxiliary library's buffer interface:
 \begin{quote}
 \begin{C}
 @stick#
@@ -252,12 +248,10 @@ static int process(int c, int last, const char *marker,
 \end{C}
 \end{quote}
 
-The inner function makes use of Lua's auxiliary library's
-buffer interface for efficiency. The
-outer function simply interfaces with Lua. It receives the
-context and the input chunk (as well as an optional
+The outer function simply interfaces with Lua. It receives the
+context and input chunk (as well as an optional
 custom end-of-line marker), and returns the transformed
-output chunk and the new context.
+output chunk and the new context:
 \begin{quote}
 \begin{C}
 @stick#
@@ -291,33 +285,29 @@ initial state. This allows the filter to be reused many
 times.
 
 When designing your own filters, the challenging part is to
-decide what will be the context. For line breaking, for
+decide what will be in the context. For line breaking, for
 instance, it could be the number of bytes left in the
 current line. For Base64 encoding, it could be a string
 with the bytes that remain after the division of the input
-into 3-byte atoms. The MIME module in the LuaSocket
+into 3-byte atoms. The MIME module in the \texttt{LuaSocket}
 distribution has many other examples.
 
 \section{Filter chains}
 
 Chains add a lot to the power of filters. For example,
-according to the standard for Quoted-Printable encoding, the
-text must be normalized into its canonic form prior to
-encoding, as far as end-of-line markers are concerned. To
-help specifying complex transformations like these, we define a
-chain factory that creates a composite filter from one or
-more filters. A chained filter passes data through all
-its components, and can be used wherever a primitive filter
-is accepted.
+according to the standard for Quoted-Printable encoding,
+text must be normalized to a canonic end-of-line marker
+prior to encoding. To help specify complex
+transformations like this, we define a chain factory that
+creates a composite filter from one or more filters. A
+chained filter passes data through all its components, and
+can be used wherever a primitive filter is accepted.
 
-The chaining factory is very simple. All it does is return a
-function that passes data through all filters and returns
-the result to the user. The auxiliary
-function~\texttt{chainpair} can only chain two filters
-together. In the auxiliary function, special care must be
-taken if the chunk is the last. This is because the final
-\texttt{nil} chunk notification has to be pushed through both
-filters in turn:
+The chaining factory is very simple. The auxiliary
+function~\texttt{chainpair} chains two filters together,
+taking special care if the chunk is the last. This is
+because the final \texttt{nil} chunk notification has to be
+pushed through both filters in turn:
 \begin{quote}
 \begin{lua}
 @stick#
@@ -333,7 +323,7 @@ end
 @stick#
 function filter.chain(...)
   local f = arg[1]
-  for i = 2, table.getn(arg) do
+  for i = 2, @#arg do
     f = chainpair(f, arg[i])
   end
   return f
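The auxiliary `chainpair` function referenced above is not shown in this hunk. A plausible sketch, reconstructed from the description of how the final `nil` chunk must be pushed through both filters in turn (hypothetical code, not taken from the diff):

```lua
-- Chains exactly two filters. For a normal chunk, data simply
-- flows through f1 and then f2. For the final nil chunk, f2
-- must also be flushed with nil after processing whatever
-- final data f1 produced.
local function chainpair(f1, f2)
  return function(chunk)
    local ret = f2(f1(chunk))
    if chunk then return ret
    else return ret .. f2(nil) end
  end
end
```

With this helper, `filter.chain` above simply folds any number of filters into nested pairs.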
@@ -343,7 +333,7 @@ end
 \end{quote}
 
 Thanks to the chain factory, we can
-trivially define the Quoted-Printable conversion:
+define the Quoted-Printable conversion as follows:
 \begin{quote}
 \begin{lua}
 @stick#
@@ -361,7 +351,7 @@ pump.all(in, out)
 The filters we introduced so far act as the internal nodes
 in a network of transformations. Information flows from node
 to node (or rather from one filter to the next) and is
-transformed on its way out. Chaining filters together is our
+transformed along the way. Chaining filters together is our
 way to connect nodes in this network. As the starting point
 for the network, we need a source node that produces the
 data. At the end of the network, we need a sink node that
@@ -376,8 +366,8 @@ caller by returning \texttt{nil} followed by an error message.
 
 Below are two simple source factories. The \texttt{empty} source
 returns no data, possibly returning an associated error
-message. The \texttt{file} source is more usefule, and
+message. The \texttt{file} source works harder, and
 yields the contents of a file in a chunk by chunk fashion:
 \begin{quote}
 \begin{lua}
 @stick#
@@ -404,9 +394,13 @@ end
 
 \subsection{Filtered sources}
 
-It is often useful to chain a source with a filter. A
-filtered source passes its data through the
+A filtered source passes its data through the
 associated filter before returning it to the caller.
+Filtered sources are useful when working with
+functions that get their input data from a source (such as
+the pump in our first example). By chaining a source with one or
+more filters, the function can be transparently provided
+with filtered data, with no need to change its interface.
 Here is a factory that does the job:
 \begin{quote}
 \begin{lua}
@@ -425,23 +419,16 @@ end
 \end{lua}
 \end{quote}
 
-Our motivating example in the introduction chains a source
-with a filter. Filtered sources are useful when working with
-functions that get their input data from a source (such as
-the pump in the example). By chaining a source with one or
-more filters, the function can be transparently provided
-with filtered data, with no need to change its interface.
-
 \subsection{Sinks}
 
-Just as we defined an interface for sources of
-data, we can also define an interface for a
-destination for data. We call any function respecting this
+Just as we defined an interface for a data source,
+we can also define an interface for a data destination.
+We call any function respecting this
 interface a \emph{sink}. In our first example, we used a
 file sink connected to the standard output.
 
 Sinks receive consecutive chunks of data, until the end of
-data is notified with a \texttt{nil} chunk. A sink can be
+data is signaled by a \texttt{nil} chunk. A sink can be
 notified of an error with an optional extra argument that
 contains the error message, following a \texttt{nil} chunk.
 If a sink detects an error itself, and
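A minimal sink satisfying the protocol just described can be sketched as follows. This is a hypothetical table sink, similar in spirit to what the LuaSocket distribution provides, but not code from this diff:

```lua
-- A sink that accumulates chunks into the table 't'. It
-- returns a truthy value to keep receiving data, and nil plus
-- the error message if its caller reported a failure.
local function table_sink(t)
  return function(chunk, err)
    if err then return nil, err end
    if chunk then t[#t + 1] = chunk end
    return 1
  end
end
```

A `nil` chunk with no error simply ends the stream; the accumulated pieces can then be joined with `table.concat`.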
@@ -529,18 +516,21 @@ common that it deserves its own function:
 function pump.step(src, snk)
   local chunk, src_err = src()
   local ret, snk_err = snk(chunk, src_err)
-  return chunk and ret and not src_err and not snk_err,
-    src_err or snk_err
+  if chunk and ret then return 1
+  else return nil, src_err or snk_err end
 end
 %
 
 @stick#
 function pump.all(src, snk, step)
   step = step or pump.step
   while true do
     local ret, err = step(src, snk)
-    if not ret then return not err, err end
-  end
+    if not ret then
+      if err then return nil, err
+      else return 1 end
+    end
+  end
 end
 %
 \end{lua}
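To see the revised pump in action, the snippet below combines the `pump.step`/`pump.all` code from the hunk above with a toy source and sink. The source and sink here are illustrative stand-ins, not LuaSocket's API:

```lua
-- pump.step and pump.all as in the revised hunk above.
local pump = {}

function pump.step(src, snk)
  local chunk, src_err = src()
  local ret, snk_err = snk(chunk, src_err)
  if chunk and ret then return 1
  else return nil, src_err or snk_err end
end

function pump.all(src, snk, step)
  step = step or pump.step
  while true do
    local ret, err = step(src, snk)
    if not ret then
      if err then return nil, err
      else return 1 end
    end
  end
end

-- A one-shot source: yields its string once, then nil.
local function string_source(s)
  return function()
    local chunk = s
    s = nil
    return chunk
  end
end

-- A sink that collects chunks into a table.
local received = {}
local function collect(chunk)
  if chunk then received[#received + 1] = chunk end
  return 1
end

assert(pump.all(string_source("hello"), collect) == 1)
assert(table.concat(received) == "hello")
```

The pump keeps stepping until the source yields `nil`; with no pending error, `pump.all` reports success.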
@@ -571,21 +561,23 @@ The way we split the filters here is not intuitive, on
 purpose. Alternatively, we could have chained the Base64
 encode filter and the line-wrap filter together, and then
 chain the resulting filter with either the file source or
-the file sink. It doesn't really matter.
+the file sink. It doesn't really matter. The Base64 and the
+line wrapping filters are part of the \texttt{LuaSocket}
+distribution.
 
 \section{Exploding filters}
 
 Our current filter interface has one flagrant shortcoming.
 When David Burgess was writing his \texttt{gzip} filter, he
 noticed that a decompression filter can explode a small
-input chunk into a huge amount of data. To address this, we
-decided to change our filter interface to allow exploding
-filters to return large quantities of output data in a chunk
-by chunk manner.
+input chunk into a huge amount of data. To address this
+problem, we decided to change the filter interface and allow
+exploding filters to return large quantities of output data
+in a chunk by chunk manner.
 
-More specifically, after passing each chunk of input data to
-a filter and collecting the first chunk of output data, the
-user must now loop to receive data from the filter until no
+More specifically, after passing each chunk of input to
+a filter, and collecting the first chunk of output, the
+user must now loop to receive other chunks from the filter until no
 filtered data is left. Within these secondary calls, the
 caller passes an empty string to the filter. The filter
 responds with an empty string when it is ready for the next
@@ -593,7 +585,7 @@ input chunk. In the end, after the user passes a
 \texttt{nil} chunk notifying the filter that there is no
 more input data, the filter might still have to produce too
 much output data to return in a single chunk. The user has
-to loop again, this time passing \texttt{nil} each time,
+to loop again, now passing \texttt{nil} to the filter each time,
 until the filter itself returns \texttt{nil} to notify the
 user it is finally done.
 
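The call discipline just described can be captured in a small driver loop. The helper below is a hypothetical sketch, not code from the article: it feeds one input chunk to an exploding filter and drains every output chunk that input produces, passing `""` during secondary calls (or `nil`, once input is finished).

```lua
-- Drain an exploding filter 'f' for one input chunk, appending
-- every produced chunk to 'out'. For a real chunk, secondary
-- calls pass "" until the filter answers ""; for the final nil
-- chunk, they pass nil until the filter answers nil.
local function exhaust(f, chunk, out)
  local ret = f(chunk)
  while ret and ret ~= "" do
    out[#out + 1] = ret
    ret = f(chunk and "" or nil)
  end
end
```

As the text notes, this loop lives inside the chaining functions, so an ordinary user never has to write it.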
@@ -602,9 +594,9 @@ the new interface. In fact, the end-of-line translation
 filter we presented earlier already conforms to it. The
 complexity is encapsulated within the chaining functions,
 which must now include a loop. Since these functions only
-have to be written once, the user is not affected.
+have to be written once, the user is rarely affected.
 Interestingly, the modifications do not have a measurable
-negative impact in the the performance of filters that do
+negative impact on the performance of filters that do
 not need the added flexibility. On the other hand, for a
 small price in complexity, the changes make exploding
 filters practical.
@@ -617,7 +609,7 @@ and SMTP modules are especially integrated with LTN12,
 and can be used to showcase the expressive power of filters,
 sources, sinks, and pumps. Below is an example
 of how a user would proceed to define and send a
-multipart message with attachments, using \texttt{LuaSocket}:
+multipart message, with attachments, using \texttt{LuaSocket}:
 \begin{quote}
 \begin{mime}
 local smtp = require"socket.smtp"
@@ -656,8 +648,8 @@ assert(smtp.send{
 The \texttt{smtp.message} function receives a table
 describing the message, and returns a source. The
 \texttt{smtp.send} function takes this source, chains it with the
-SMTP dot-stuffing filter, creates a connects a socket sink
-to the server, and simply pumps the data. The message is never
+SMTP dot-stuffing filter, connects a socket sink
+with the server, and simply pumps the data. The message is never
 assembled in memory. Everything is produced on demand,
 transformed in small pieces, and sent to the server in chunks,
 including the file attachment that is loaded from disk and
@@ -665,14 +657,14 @@ encoded on the fly. It just works.
 
 \section{Conclusions}
 
-In this article we introduce the concepts of filters,
+In this article, we introduced the concepts of filters,
 sources, sinks, and pumps to the Lua language. These are
-useful tools for data processing in general. Sources provide
+useful tools for stream processing in general. Sources provide
 a simple abstraction for data acquisition. Sinks provide an
 abstraction for final data destinations. Filters define an
 interface for data transformations. The chaining of
 filters, sources and sinks provides an elegant way to create
-arbitrarily complex data transformation from simpler
-transformations. Pumps simply move the data through.
+arbitrarily complex data transformations from simpler
+components. Pumps simply move the data through.
 
 \end{document}