Merge branch 'recoveryresume'

author: Sergio Queiroz <sqmedeiros@gmail.com> 2016-12-13 13:53:49 -0300
committer: Sergio Queiroz <sqmedeiros@gmail.com> 2016-12-13 13:53:49 -0300
commit: 09fab0decb7df93528ab40fcfd99587e9074c64f (patch)
tree: ecd7a763c7a08712f122945bb5ce1ed7d7e5f077 /README.md
parent: d80821d79376671371c15ded562fbe1a9bebc635 (diff)
parent: 1322d612d72ac658f2aa443dca94954b819c0993 (diff)
download: lpeglabel-09fab0decb7df93528ab40fcfd99587e9074c64f.tar.gz
lpeglabel-09fab0decb7df93528ab40fcfd99587e9074c64f.tar.bz2
lpeglabel-09fab0decb7df93528ab40fcfd99587e9074c64f.zip
1 files changed, 465 insertions, 200 deletions
diff --git a/README.md b/README.md
index 1f1bdff..9484b3d 100644
--- a/README.md
+++ b/README.md
@@ -10,47 +10,46 @@ LPegLabel is a conservative extension of the
 [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg)
 library that provides an implementation of Parsing
 Expression Grammars (PEGs) with labeled failures. 
-Labels can be used to signal different kinds of erros
+Labels can be used to signal different kinds of errors
-and to specify which alternative in a labeled ordered
+and to specify which recovery pattern should handle a
-choice should handle a given label. Labels can also be
+given label. Labels can also be combined with the standard
-combined with the standard patterns of LPeg.
+patterns of LPeg.
 This document describes the new functions available
 in LpegLabel and presents some examples of usage.
-For a more detailed discussion about PEGs with labeled failures
-please see [A Parsing Machine for Parsing Expression
-Grammars with Labeled Failures](https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxzcW1lZGVpcm9zfGd4OjMzZmE3YzM0Y2E2MGM5Y2M).
 In LPegLabel, the result of an unsuccessful matching
 is a triple **nil, lab, sfail**, where **lab**
 is the label associated with the failure, and
 **sfail** is the suffix input being matched when
-**lab** was thrown. Below there is a brief summary
+**lab** was thrown. 
-of the new functions provided by LpegLabel: 
+With labeled failures it is possible to distinguish
+between a regular failure and an error. Usually, a
+regular failure is produced when the matching of a
+character fails, and it is caught by an ordered choice.
+An error, in turn, is produced by the throw operator
+and may be caught by the recovery operator. 
+ 
+Below is a brief summary of the new functions provided by LPegLabel:
 <table border="1">
 <tbody><tr><td><b>Function</b></td><td><b>Description</b></td></tr>
-<tr><td><a href="#f-t"><code>lpeglabel.T (l)</code></a></td>
+<tr><td><a href="#f-t"><code>lpeglabelrec.T (l)</code></a></td>
-  <td>Throws label <code>l</code></td></tr>
+  <td>Throws a label <code>l</code> to signal an error</td></tr>
-<tr><td><a href="#f-lc"><code>lpeglabel.Lc (p1, p2, l1, ..., ln)</code></a></td>
+<tr><td><a href="#f-rec"><code>lpeglabelrec.Rec (p1, p2, l1 [, l2, ..., ln])</code></a></td>
-  <td>Matches <code>p1</code> and tries to match <code>p2</code>
+  <td>Specifies a recovery pattern <code>p2</code> for <code>p1</code>,
-      if the matching of <code>p1</code> gives one of l<sub>1</sub>, ..., l<sub>n</sub> 
+ when the matching of <code>p1</code> gives one of the labels l1, ..., ln.</td></tr>
-      </td></tr>
-<tr><td><a href="#f-rec"><code>lpeglabel.Rec (p1, p2 [, l1, ..., ln])</code></a></td>
-  <td>Like <code>Lc</code> but does not reset the position of the parser
-      when trying <code>p2</code>. By default, it catches regular PEG failures
-      </td></tr>
 <tr><td><a href="#re-t"><code>%{l}</code></a></td>
-  <td>Syntax of <em>relabel</em> module. Equivalent to <code>lpeg.T(l)</code>
+  <td>Syntax of <em>relabelrec</em> module. Equivalent to <code>lpeglabelrec.T(l)</code>
      </td></tr>
-<tr><td><a href="#re-lc"><code>p1 /{l1, ..., ln} p2</code></a></td>
+<tr><td><a href="#re-rec"><code>p1 //{l1 [, l2, ..., ln]} p2</code></a></td>
-  <td>Syntax of <em>relabel</em> module. Equivalent to <code>lpeg.Lc(p1, p2, l1, ..., ln)</code>
+  <td>Syntax of <em>relabelrec</em> module. Equivalent to <code>lpeglabelrec.Rec(p1, p2, l1, ..., ln)</code>
      </td></tr>
-<tr><td><a href="#re-line"><code>relabel.calcline(subject, i)</code></a></td>
+<tr><td><a href="#re-line"><code>relabelrec.calcline(subject, i)</code></a></td>
 <td>Calculates line and column information regarding position <i>i</i> of the subject
      </td></tr>
-<tr><td><a href="#re-setl"><code>relabel.setlabels (tlabel)</code></a></td>
+<tr><td><a href="#re-setl"><code>relabelrec.setlabels (tlabel)</code></a></td>
 <td>Allows specifying a table with mnemonic labels.
      </td></tr>
 </tbody></table>
@@ -59,28 +58,20 @@ of the new functions provided by LpegLabel:
 ### Functions
-#### <a name="f-t"></a><code>lpeglabel.T(l)</code>
+#### <a name="f-t"></a><code>lpeglabelrec.T(l)</code>
 Returns a pattern that throws the label `l`.
-A label must be an integer between 0 and 255.
+A label must be an integer between 1 and 255.
-The label 0 is equivalent to the regular failure of PEGs.
+#### <a name="f-rec"></a><code>lpeglabelrec.Rec(p1, p2, l1, ..., ln)</code>
-#### <a name="f-lc"></a><code>lpeglabel.Lc(p1, p2, l1, ..., ln)</code>
+Returns a *recovery pattern*.
-Returns a pattern equivalent to a *labeled ordered choice*.
 If the matching of `p1` gives one of the labels `l1, ..., ln`,
-then the matching of `p2` is tried from the same position. Otherwise,
+then the matching of `p2` is tried from the failure position of `p1`.
-the result of the matching of `p1` is the pattern's result.
+Otherwise, the result of the matching of `p1` is the pattern's result.
-The labeled ordered choice `lpeg.Lc(p1, p2, 0)` is equivalent to the
-regular ordered choice `p1 / p2`.
-Although PEG's ordered choice is associative, the labeled ordered choice is not.
-When using this function, the user should take care to build a left-associative
-labeled ordered choice pattern.
 #### <a name="f-rec"></a><code>lpeglabel.Rec(p1, p2 [, l1, ..., ln])</code>
@@ -94,29 +85,26 @@ i.e. `lpeg.Rec(p1, p2)` is equivalent to `lpeg.Rec(p1, p2, 0)`.
 #### <a name="re-t"></a><code>%{l}</code>
-Syntax of *relabel* module. Equivalent to `lpeg.T(l)`.
+Syntax of *relabelrec* module. Equivalent to `lpeglabelrec.T(l)`.
-#### <a name="re-lc"></a><code>p1 /{l1, ..., ln} p2</code>
+#### <a name="re-rec"></a><code>p1 //{l1, ..., ln} p2</code>
-Syntax of *relabel* module. Equivalent to `lpeg.Lc(p1, p2, l1, ..., ln)`.
+Syntax of *relabelrec* module. Equivalent to `lpeglabelrec.Rec(p1, p2, l1, ..., ln)`.
-The `/{}` operator is left-associative. 
+The `//{}` operator is left-associative. 
-A grammar can use both choice operators (`/` and `/{}`),
-but a single choice can not mix them. That is, the parser of `relabel`
-module will not recognize a pattern as `p1 / p2 /{l1} p3`.
-#### <a name="re-line"></a><code>relabel.calcline (subject, i)</code>
+#### <a name="re-line"></a><code>relabelrec.calcline (subject, i)</code>
 Returns line and column information regarding position <i>i</i> of the subject.
-#### <a name="re-setl"></a><code>relabel.setlabels (tlabel)</code>
+#### <a name="re-setl"></a><code>relabelrec.setlabels (tlabel)</code>
 Allows specifying a table with labels. The keys of
-`tlabel` must be integers between 0 and 255,
+`tlabel` must be integers between 1 and 255,
 and the associated values should be strings.
@@ -132,16 +120,31 @@ in the *examples* directory.
 The following example defines a grammar that matches
 a list of identifiers separated by commas. A label
 is thrown when there is an error matching an identifier
-or a comma: 
+or a comma.
+We use function `newError` to store error messages in a
+table and to return the index associated with each error message.
 ```lua
-local m = require'lpeglabel'
+local m = require'lpeglabelrec'
-local re = require'relabel'
+local re = require'relabelrec'
+local terror = {}
+local function newError(s)
+  table.insert(terror, s)
+  return #terror
+end
+local errUndef = newError("undefined")
+local errId = newError("expecting an identifier")
+local errComma = newError("expecting ','")
 local g = m.P{
  "S",
  S = m.V"Id" * m.V"List",
-  List = -m.P(1) + (m.V"Comma" + m.T(2)) * (m.V"Id" + m.T(1)) * m.V"List",
+  List = -m.P(1) + (m.V"Comma" + m.T(errComma)) * (m.V"Id" + m.T(errId)) * m.V"List",
  Id = m.V"Sp" * m.R'az'^1,
  Comma = m.V"Sp" * ",",
  Sp = m.S" \n\t"^0,
@@ -151,18 +154,12 @@ function mymatch (g, s)
  local r, e, sfail = g:match(s)
  if not r then
    local line, col = re.calcline(s, #s - #sfail)
-    local msg = "Error at line " .. line .. " (col " .. col .. ")"
+    local msg = "Error at line " .. line .. " (col " .. col .. "): "
-    if e == 1 then
+    return r, msg .. terror[e] .. " before '" .. sfail .. "'"
-      return r, msg .. ": expecting an identifier before '" .. sfail .. "'"
-    elseif e == 2 then
-      return r, msg .. ": expecting ',' before '" .. sfail .. "'"
-    else
-      return r, msg
-    end
  end
  return r
 end
- 
+  
 print(mymatch(g, "one,two"))              --> 8
 print(mymatch(g, "one two"))              --> nil Error at line 1 (col 3): expecting ',' before ' two'
 print(mymatch(g, "one,\n two,\nthree,"))  --> nil Error at line 3 (col 6): expecting an identifier before ''
@@ -170,23 +167,73 @@ print(mymatch(g, "one,\n two,\nthree,"))  --> nil Error at line 3 (col 6): expec
 In this example we could think about writing rule <em>List</em> as follows:
 ```lua
-List = ((m.V"Comma" + m.T(2)) * (m.V"Id" + m.T(1)))^0,
+List = ((m.V"Comma" + m.T(errComma)) * (m.V"Id" + m.T(errId)))^0,
 ```
-but when matching this expression agains the end of input
+but when matching this expression against the end of input
-we would get a failure whose associated label would be **2**,
+we would get a failure whose associated label would be **errComma**,
 and this would cause the failure of the *whole* repetition.
- 
-##### Mnemonics instead of numbers
-In the previous example we could have created a table
-with the error messages to improve the readbility of the PEG.
-Below we rewrite the previous grammar following this approach: 
+#### Error Recovery
+By using the `Rec` function we can specify a recovery pattern that
+should be matched when a label is thrown. After matching the recovery
+pattern, and possibly recording the error, the parser will resume
+the <em>regular</em> matching. In the example below
+we expect to match rule `A`, but when a failure occurs the label 42
+is thrown and then we will try to match the recovery pattern `recp`:
 ```lua
-local m = require'lpeglabel'
+local m = require'lpeglabelrec'
-local re = require'relabel'
+local recp = m.P"oast"
+local g = m.P{
+  "S",
+  S = m.Rec(m.V"A", recp, 42) * ".",
+  A = m.P"t" * (m.P"est" + m.T(42))
+}
+print(g:match("test."))   --> 6
+print(g:match("toast."))  --> 7
+print(g:match("oast."))   --> nil  0  oast.
+print(g:match("toward."))   --> nil  0  ward.
+```
+When trying to match subject 'toast.', in rule `A` the first
+'t' is matched, then the matching of `m.P"est"` fails and label 42
+is thrown, with the associated input suffix 'oast.'. In rule
+`S` label 42 is caught and the recovery pattern matches 'oast',
+so pattern `'.'` matches the rest of the input.
+When matching subject 'oast.', pattern `m.P"t"` fails, and
+the result of the matching is <b>nil,  0, oast.</b>.
+When matching 'toward.', label 42 is thrown after matching 't',
+with the associated input suffix 'oward.'. As the matching of the
+recovery pattern fails, the result is <b>nil, 0, ward.</b>.
+Usually, the recovery pattern is an expression that does not fail.
+In the previous example, we could have used `(m.P(1) - m.P".")^0`
+as the recovery pattern.
+Below we rewrite the grammar that describes a list of identifiers
+to use a recovery strategy. Grammar `g` remains the same, but we add a
+recovery grammar `grec` that handles the labels thrown by `g`.
+In grammar `grec` we use functions `record` and `sync`.
+Function `record`, plus function `recorderror`, will help
+us to save the input position where a label was thrown,
+while function `sync` will give us a synchronization pattern,
+that consumes the input while it is not possible to match a given
+pattern `p`.   
+When the matching of an identifier fails, a default value ('NONE')
+is provided.
+```lua
+local m = require'lpeglabelrec'
+local re = require'relabelrec'
 local terror = {}
@@ -199,73 +246,88 @@ local errUndef = newError("undefined")
 local errId = newError("expecting an identifier")
 local errComma = newError("expecting ','")
+local id = m.R'az'^1
 local g = m.P{
  "S",
  S = m.V"Id" * m.V"List",
-  List = -m.P(1) + (m.V"Comma" + m.T(errComma)) * (m.V"Id" + m.T(errId)) * m.V"List",
+  List = -m.P(1) + m.V"Comma" * m.V"Id" * m.V"List",
-  Id = m.V"Sp" * m.R'az'^1,
+  Id = m.V"Sp" * id + m.T(errId),
-  Comma = m.V"Sp" * ",",
+  Comma = m.V"Sp" * "," + m.T(errComma),
  Sp = m.S" \n\t"^0,
 }
-function mymatch (g, s)
+local subject, errors
-  local r, e, sfail = g:match(s)
-  if not r then
-    local line, col = re.calcline(s, #s - #sfail)
-    local msg = "Error at line " .. line .. " (col " .. col .. "): "
-    return r, msg .. terror[e] .. " before '" .. sfail .. "'"
-  end
-  return r
-end
-  
-print(mymatch(g, "one,two"))              --> 8
-print(mymatch(g, "one two"))              --> nil Error at line 1 (col 3): expecting ',' before ' two'
-print(mymatch(g, "one,\n two,\nthree,"))  --> nil Error at line 3 (col 6): expecting an identifier before ''
-```
+function recorderror(pos, lab)
+  local line, col = re.calcline(subject, pos)
+  table.insert(errors, { line = line, col = col, msg = terror[lab] })
+end
-##### *relabel* syntax
+function record (lab)
+  return (m.Cp() * m.Cc(lab)) / recorderror
+end
-Now we rewrite the previous example using the syntax
+function sync (p)
-supported by *relabel*:
+  return (-p * m.P(1))^0
+end
-```lua
+local grec = m.P{
-local re = require 'relabel' 
+  "S",
+  S = m.Rec(m.Rec(g, m.V"ErrComma", errComma), m.V"ErrId", errId),
+  ErrComma = record(errComma) * sync(id),
+  ErrId = record(errId) * sync(m.P",")
+}
-local g = re.compile[[
-  S      <- Id List
-  List   <- !.  /  (',' /  %{2}) (Id / %{1}) List
-  Id     <- Sp [a-z]+
-  Comma  <- Sp ','
-  Sp     <- %s*
-]]
 function mymatch (g, s)
+  errors = {}
+  subject = s  
  local r, e, sfail = g:match(s)
-  if not r then
+  if #errors > 0 then
-    local line, col = re.calcline(s, #s - #sfail)
+    local out = {}
-    local msg = "Error at line " .. line .. " (col " .. col .. ")"
+    for i, err in ipairs(errors) do
-    if e == 1 then
+      local msg = "Error at line " .. err.line .. " (col " .. err.col .. "): " .. err.msg
-      return r, msg .. ": expecting an identifier before '" .. sfail .. "'"
+      table.insert(out,  msg)
-    elseif e == 2 then
-      return r, msg .. ": expecting ',' before '" .. sfail .. "'"
-    else
-      return r, msg
    end
+    return nil, table.concat(out, "\n") .. "\n"
  end
  return r
 end
+  
-print(mymatch(g, "one,two"))              --> 8
+print(mymatch(grec, "one,two"))
-print(mymatch(g, "one two"))              --> nil Error at line 1 (col 3): expecting ',' before ' two'
+-- Captures (separated by ';'): one; two; 
-print(mymatch(g, "one,\n two,\nthree,"))  --> nil Error at line 3 (col 6): expecting an identifier before ''
+-- Syntactic errors found: 0
+print(mymatch(grec, "one two three"))
+-- Captures (separated by ';'): one; two; three; 
+-- Syntactic errors found: 2
+-- Error at line 1 (col 4): expecting ','
+-- Error at line 1 (col 8): expecting ','
+print(mymatch(grec, "1,\n two, \n3,"))
+-- Captures (separated by ';'): NONE; two; NONE; NONE; 
+-- Syntactic errors found: 3
+-- Error at line 1 (col 1): expecting an identifier
+-- Error at line 2 (col 6): expecting an identifier
+-- Error at line 3 (col 2): expecting an identifier
+print(mymatch(grec, "one\n two123, \nthree,"))
+-- Captures (separated by ';'): one; two; three; NONE; 
+-- Syntactic errors found: 3
+-- Error at line 2 (col 1): expecting ','
+-- Error at line 2 (col 5): expecting ','
+-- Error at line 3 (col 6): expecting an identifier
 ```
-With the help of function *setlabels* we can also rewrite the previous example to use
+##### *relabelrec* syntax
-mnemonic labels instead of plain numbers:
+Below we describe again a grammar that matches a list of identifiers,
+now using the syntax supported by *relabelrec*, where `//{}` is the
+recovery operator, and `%{}` is the throw operator:
 ```lua
-local re = require 'relabel' 
+local re = require 'relabelrec' 
 local errinfo = {
  {"errUndef",  "undefined"},
@@ -285,59 +347,124 @@ re.setlabels(labels)
 local g = re.compile[[
  S      <- Id List
-  List   <- !.  /  (',' /  %{errComma}) (Id / %{errId}) List
+  List   <- !.  /  Comma Id List
-  Id     <- Sp [a-z]+
+  Id     <- Sp {[a-z]+} / %{errId}
-  Comma  <- Sp ','
+  Comma  <- Sp ',' / %{errComma}
  Sp     <- %s*
 ]]
+local errors
+function recorderror (subject, pos, label)
+  local line, col = re.calcline(subject, pos)
+  table.insert(errors, { line = line, col = col, msg = errmsgs[labels[label]] })
+  return true 
+end
+function sync (p)
+  return '( !(' .. p .. ') .)*'
+end
+local grec = re.compile(
+  "S         <- %g //{errComma} ErrComma //{errId} ErrId" .. "\n" ..
+  "ErrComma  <-  ('' -> 'errComma' => recorderror) " .. sync('[a-z]+') .. "\n" ..
+  "ErrId     <-  ('' -> 'errId' => recorderror) " .. sync('","') .. "-> default" 
+  , {g = g, recorderror  = recorderror, default = "NONE"}
+)
 function mymatch (g, s)
-  local r, e, sfail = g:match(s)
+  errors = {}
-  if not r then
+  subject = s  
-    local line, col = re.calcline(s, #s - #sfail)
+  io.write("Input: ", s, "\n")
-    local msg = "Error at line " .. line .. " (col " .. col .. "): "
+  local r = { g:match(s) }
-    return r, msg .. errmsgs[e] .. " before '" .. sfail .. "'"
+  io.write("Captures (separated by ';'): ")
+  for k, v in pairs(r) do
+    io.write(v .. "; ")
  end
+  io.write("\nSyntactic errors found: " .. #errors)
+  if #errors > 0 then
+    io.write("\n")
+    local out = {}
+    for i, err in ipairs(errors) do
+      local msg = "Error at line " .. err.line .. " (col " .. err.col .. "): " .. err.msg
+      table.insert(out,  msg)
+    end
+    io.write(table.concat(out, "\n"))
+  end
+  print("\n")
  return r
 end
-print(mymatch(g, "one,two"))              --> 8
+print(mymatch(grec, "one,two"))
-print(mymatch(g, "one two"))              --> nil Error at line 1 (col 3): expecting ',' before ' two'
+-- Captures (separated by ';'): one; two; 
-print(mymatch(g, "one,\n two,\nthree,"))  --> nil Error at line 3 (col 6): expecting an identifier before ''
+-- Syntactic errors found: 0
+print(mymatch(grec, "one two three"))
+-- Captures (separated by ';'): one; two; three; 
+-- Syntactic errors found: 2
+-- Error at line 1 (col 4): expecting ','
+-- Error at line 1 (col 8): expecting ','
+print(mymatch(grec, "1,\n two, \n3,"))
+-- Captures (separated by ';'): NONE; two; NONE; NONE; 
+-- Syntactic errors found: 3
+-- Error at line 1 (col 1): expecting an identifier
+-- Error at line 2 (col 6): expecting an identifier
+-- Error at line 3 (col 2): expecting an identifier
+print(mymatch(grec, "one\n two123, \nthree,"))
+-- Captures (separated by ';'): one; two; three; NONE; 
+-- Syntactic errors found: 3
+-- Error at line 2 (col 1): expecting ','
+-- Error at line 2 (col 5): expecting ','
+-- Error at line 3 (col 6): expecting an identifier
 ```
 #### Arithmetic Expressions
-Here's an example of an LPegLabel grammar that make its own function called
+Here's an example of an LPegLabel grammar that matches an expression.
-'expect', which takes a pattern and a label as parameters and throws the label
+We have used a function `expect`, that takes a pattern `patt` and a label as
-if the pattern fails to be matched. This function can be extended later on to
+parameters and builds a new pattern that throws this label when `patt`
-record all errors encountered once error recovery is implemented.
+fails.
-```lua
+When a subexpression is syntactically invalid, a default value of 1000
-local lpeg = require"lpeglabel"
+is provided by the recovery pattern, so the evaluation of an expression
+should always produce a numeric value. 
-local R, S, P, V, C, Ct, T = lpeg.R, lpeg.S, lpeg.P, lpeg.V, lpeg.C, lpeg.Ct, lpeg.T
+In this example, we can see that manually building the recovery
+grammar `grec` can be a tedious and error-prone task. In the next example
+we will show how to build the recovery grammar in a more automatic way.
+```lua
+local m = require"lpeglabelrec"
+local re = require"relabelrec"
 local labels = {
-  {"NoExp",     "no expression found"},
+  {"ExpTermFirst",  "expected an expression"},
-  {"Extra",     "extra characters found after the expression"},
+  {"ExpTermOp",   "expected a term after the operator"},
-  {"ExpTerm",   "expected a term after the operator"},
-  {"ExpExp",    "expected an expression after the parenthesis"},
  {"MisClose",  "missing a closing ')' after the expression"},
 }
-local function expect(patt, labname)
+local function labelindex(labname)
  for i, elem in ipairs(labels) do
    if elem[1] == labname then
-      return patt + T(i)
+      return i
    end
  end
  error("could not find label: " .. labname)
 end
-local num = R("09")^1 / tonumber
+local errors, subject
-local op = S("+-*/")
+local function expect(patt, labname)
+  local i = labelindex(labname)
+  return patt + m.T(i)
+end
+local num = m.R("09")^1 / tonumber
+local op = m.S("+-")
 local function compute(tokens)
  local result = tokens[1]
@@ -346,10 +473,6 @@ local function compute(tokens)
      result = result + tokens[i+1]
    elseif tokens[i] == '-' then
      result = result - tokens[i+1]
-    elseif tokens[i] == '*' then
-      result = result * tokens[i+1]
-    elseif tokens[i] == '/' then
-      result = result / tokens[i+1]
    else
      error('unknown operation: ' .. tokens[i])
    end
@@ -357,81 +480,223 @@ local function compute(tokens)
  return result
 end
-local g = P {
+local g = m.P {
  "Exp",
-  Exp = Ct(V"Term" * (C(op) * expect(V"Term", "ExpTerm"))^0) / compute;
+  Exp = m.Ct(m.V"OperandFirst" * (m.C(op) * m.V"Operand")^0) / compute,
-  Term = num + V"Group";
+  OperandFirst = expect(m.V"Term", "ExpTermFirst"),
-  Group = "(" * expect(V"Exp", "ExpExp") * expect(")", "MisClose");
+  Operand = expect(m.V"Term", "ExpTermOp"),
+  Term = num + m.V"Group",
+  Group = "(" * m.V"Exp" * expect(")", "MisClose"),
 }
-g = expect(g, "NoExp") * expect(-P(1), "Extra")
+function recorderror(pos, lab)
+  local line, col = re.calcline(subject, pos)
+  table.insert(errors, { line = line, col = col, msg = labels[lab][2] })
+end
+function record (labname)
+  return (m.Cp() * m.Cc(labelindex(labname))) / recorderror
+end
+function sync (p)
+  return (-p * m.P(1))^0
+end
+function defaultValue (p)
+  return p or m.Cc(1000) 
+end
+local grec = m.P {
+  "S",
+  S = m.Rec(m.V"A", m.V"ErrExpTermFirst", labelindex("ExpTermFirst")), 
+  A = m.Rec(m.V"Sg", m.V"ErrExpTermOp", labelindex("ExpTermOp")),
+  Sg = m.Rec(g, m.V"ErrMisClose", labelindex("MisClose")),
+  ErrExpTermFirst = record("ExpTermFirst") * sync(op + ")") * defaultValue(),
+  ErrExpTermOp = record("ExpTermOp") * sync(op + ")") * defaultValue(),
+  ErrMisClose = record("MisClose") * sync(m.P")") * defaultValue(m.P""),
+}
+               
 local function eval(input)
-  local result, label, suffix = g:match(input)
+  errors = {}
-  if result ~= nil then
+  io.write("Input: ", input, "\n")
-    return result
+  subject = input
-  else
+  local result, label, suffix = grec:match(input)
-    local pos = input:len() - suffix:len() + 1
+  io.write("Syntactic errors found: " .. #errors, "\n")
-    local msg = labels[label][2]
+  if #errors > 0 then
-    return nil, "syntax error: " .. msg .. " (at index " .. pos .. ")"
+    local out = {}
+    for i, err in ipairs(errors) do
+      local pos = err.col
+      local msg = err.msg
+      table.insert(out, "syntax error: " .. msg .. " (at index " .. pos .. ")")
+    end
+    print(table.concat(out, "\n"))
  end
+  io.write("Result = ")
+  return result  
 end
-print(eval "98-76*(54/32)")
+print(eval "90-70-(5)+3")
--> 37.125
+-- Syntactic errors found: 0
+-- Result = 18
+print(eval "15+")
+-- Syntactic errors found: 1
+-- syntax error: expected a term after the operator (at index 3)
+-- Result = 1015
+print(eval "-2")
+-- Syntactic errors found: 1
+-- syntax error: expected an expression (at index 1)
+-- Result = 998
+print(eval "1+()+")
+-- Syntactic errors found: 2
+-- syntax error: expected an expression (at index 4)
+-- syntax error: expected a term after the operator (at index 5)
+-- Result = 2001
+print(eval "1+(")
+-- Syntactic errors found: 2
+-- syntax error: expected an expression (at index 3)
+-- syntax error: missing a closing ')' after the expression (at index 3)
+-- Result = 1001
+print(eval "3)")
+-- Syntactic errors found: 0
+-- Result = 3
+```
-print(eval "(1+1-1*2/2")
+#### Automatically Building the Recovery Grammar 
--> syntax error: missing a closing ')' after the expression (at index 11)
-print(eval "(1+)-1*(2/2)")
+Below we rewrite the previous example to automatically
--> syntax error: expected a term after the operator (at index 4)
+build the recovery grammar based on information provided
+by the user for each label (error message, recovery pattern, etc). 
+In the example below we also throw an error when the grammar
+does not match the whole subject.
-print(eval "(1+1)-1*(/2)")
+```lua
--> syntax error: expected an expression after the parenthesis (at index 10)
+local m = require"lpeglabelrec"
+local re = require"relabelrec"
-print(eval "1+(1-(1*2))/2x")
+local num = m.R("09")^1 / tonumber
--> syntax error: extra chracters found after the expression (at index 14)
+local op = m.S("+-")
-print(eval "-1+(1-(1*2))/2")
+local labels = {}
--> syntax error: no expression found (at index 1)
+local nlabels = 0
-```
-#### Catching labels
+local function newError(lab, msg, psync, pcap)
+  nlabels = nlabels + 1
+  psync = psync or m.P(-1)
+  pcap = pcap or m.P""
+  labels[lab] = { id = nlabels, msg = msg, psync = psync, pcap = pcap }
+end
-When a label is thrown, the grammar itself can handle this label
+newError("ExpTermFirst", "expected an expression", op + ")", m.Cc(1000)) 
-by using the labeled ordered choice. Below we rewrite the example
+newError("ExpTermOp", "expected a term after the operator", op + ")", m.Cc(1000))
-of the list of identifiers to show this feature:
+newError("MisClose",  "missing a closing ')' after the expression",  m.P")")
+newError("Extra", "extra characters found after the expression") 
+local errors, subject
-```lua
+local function expect(patt, labname)
-local m = require'lpeglabel'
+  local i = labels[labname].id
+  return patt + m.T(i)
+end
-local terror = {}
+local function compute(tokens)
+  local result = tokens[1]
+  for i = 2, #tokens, 2 do
+    if tokens[i] == '+' then
+      result = result + tokens[i+1]
+    elseif tokens[i] == '-' then
+      result = result - tokens[i+1]
+    else
+      error('unknown operation: ' .. tokens[i])
+    end
+  end
+  return result
+end
-local function newError(s)
+local g = m.P {
-  table.insert(terror, s)
+  "Exp",
-  return #terror
+  Exp = m.Ct(m.V"OperandFirst" * (m.C(op) * m.V"Operand")^0) / compute,
+  OperandFirst = expect(m.V"Term", "ExpTermFirst"),
+  Operand = expect(m.V"Term", "ExpTermOp"),
+  Term = num + m.V"Group",
+  Group = "(" * m.V"Exp" * expect(")", "MisClose"),
+}
+function recorderror(pos, lab)
+  local line, col = re.calcline(subject, pos)
+  table.insert(errors, { line = line, col = col, msg = labels[lab].msg })
 end
-local errUndef = newError("undefined")
+function record (labname)
-local errId = newError("expecting an identifier")
+  return (m.Cp() * m.Cc(labname)) / recorderror
-local errComma = newError("expecting ','")
+end
-local g = m.P{
+function sync (p)
-  "S",
+  return (-p * m.P(1))^0
-  S = m.Lc(m.Lc(m.V"Id" * m.V"List", m.V"ErrId", errId),
+end
-           m.V"ErrComma", errComma),
-  List = -m.P(1) + (m.V"Comma" + m.T(errComma)) * (m.V"Id" + m.T(errId)) * m.V"List",
+function defaultValue (p)
-  Id = m.V"Sp" * m.R'az'^1,
+  return p or m.Cc(1000) 
-  Comma = m.V"Sp" * ",",
+end
-  Sp = m.S" \n\t"^0,
-  ErrId = m.Cc(errId) / terror,
+local grec = g * expect(m.P(-1), "Extra")
-  ErrComma = m.Cc(errComma) / terror
+for k, v in pairs(labels) do
-}
+  grec = m.Rec(grec, record(k) * sync(v.psync) * v.pcap, v.id)
+end
+local function eval(input)
+  errors = {}
+  io.write("Input: ", input, "\n")
+  subject = input
+  local result, label, suffix = grec:match(input)
+  io.write("Syntactic errors found: " .. #errors, "\n")
+  if #errors > 0 then
+    local out = {}
+    for i, err in ipairs(errors) do
+      local pos = err.col
+      local msg = err.msg
+      table.insert(out, "syntax error: " .. msg .. " (at index " .. pos .. ")")
+    end
+    print(table.concat(out, "\n"))
+  end
+  io.write("Result = ")
+  return result  
+end
-print(m.match(g, "one,two"))  --> 8
+print(eval "90-70-(5)+3")
-print(m.match(g, "one two"))  --> expecting ','
+-- Syntactic errors found: 0
-print(m.match(g, "one,\n two,\nthree,"))  --> expecting an identifier
+-- Result = 18
+print(eval "15+")
+-- Syntactic errors found: 1
+-- syntax error: expected a term after the operator (at index 3)
+-- Result = 1015
+print(eval "-2")
+-- Syntactic errors found: 1
+-- syntax error: expected an expression (at index 1)
+-- Result = 998
+print(eval "1+()+")
+-- Syntactic errors found: 2
+-- syntax error: expected an expression (at index 4)
+-- syntax error: expected a term after the operator (at index 5)
+-- Result = 2001
+print(eval "1+(")
+-- Syntactic errors found: 2
+-- syntax error: expected an expression (at index 3)
+-- syntax error: missing a closing ')' after the expression (at index 3)
+-- Result = 1001
+print(eval "3)")
+-- Syntactic errors found: 1
+-- syntax error: extra characters found after the expression (at index 2)
+-- Result = 3
 ```
 #### Error Recovery
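
The examples in this diff turn a failure position into a line and column with `re.calcline(subject, i)`. The plain-Lua sketch below illustrates that kind of computation. It is only an approximation under stated assumptions: positions are 1-based, lines are separated by `"\n"`, and `linecol` is a hypothetical name that is not part of *relabelrec*, whose actual implementation may differ.

```lua
-- Hypothetical helper (not part of relabelrec): map a 1-based position
-- in `subject` to a line and column, treating "\n" as the line separator.
local function linecol(subject, i)
  local line, lastnl = 1, 0   -- current line, position of the last newline seen
  local from = 1
  while true do
    -- plain-text search for the next newline
    local nl = subject:find("\n", from, true)
    -- stop when there is no newline left before position i
    if not nl or nl >= i then break end
    line = line + 1
    lastnl = nl
    from = nl + 1
  end
  return line, i - lastnl
end

print(linecol("one two", 3))               --> 1  3
print(linecol("one,\n two,\nthree,", 17))  --> 3  6
```

The two calls use the same subjects and positions (`#s - #sfail`) as the first list-of-identifiers example, so the results can be compared with the line and column shown in its error messages.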