Introduction
LPegLabel is an extension of the LPeg library that provides an implementation of Parsing Expression Grammars (PEGs) with labeled failures. Labels can be used to signal different kinds of erros and to specify which alternative in a labeled ordered choice should handle a given label. Labels can also be combined with the standard patterns of LPeg.
This document describes the new functions available
in LpegLabel and presents some examples of usage.
In LPegLabel, the result of an unsuccessful matching
is a triple nil, lab, sfail
, where lab
is the label associated with the failure, and
sfail
is the suffix input where
the label was thrown.
Below there is a brief summary of the new functions provided by LpegLabel:
Function | Description |
lpeglabel.T (l) |
Throws label l |
lpeglabel.Lc (p1, p2, l1, ..., ln) |
Matches p1 and tries to match p2
if the matching of p1 gives one of l1, ..., ln
|
%{l} |
Syntax of relabel module. Equivalent to lpeg.T(l)
|
p1 /{l1, ..., ln} p2 |
Syntax of relabel module. Equivalent to lpeg.Lc(p1, p2, l1, ..., ln)
|
relabel.setlabels (tlabel) |
Allows to specicify a table with mnemonic labels. |
For a more detailed and formal discussion about PEGs with labels please see Exception Handling for Error Reporting in Parsing Expression Grammars, Error Reporting in Parsing Expression Grammars, and A parsing machine for parsing expression grammars with labeled failures.
Functions
lpeglabel.T(l)
Returns a pattern that throws the label l
.
A label must be an integer between 0
and 63
.
The label 0
is equivalent to the regular failure of PEGs.
lpeglabel.Lc(p1, p2, l1, ..., ln)
Returns a pattern equivalent to a labeled ordered choice.
If the matching of p1
gives one of the labels l1, ..., ln
,
then the matching of p2
is tried from the same position. Otherwise,
the result of the matching of p1
is the pattern's result.
The labeled ordered choice lpeg.Lc(p1, p2, 0)
is equivalent to the
regular ordered choice p1 / p2
.
Although PEG's ordered choice is associative, the labeled ordered choice is not. When using this function, the user should take care to build a left-associative labeled ordered choice pattern.
%{l}
Syntax of relabel module. Equivalent to lpeg.T(l)
.
p1 /{l1, ..., ln} p2
Syntax of relabel module. Equivalent to lpeg.Lc(p1, p2, l1, ..., ln)
.
The /{}
operator is left-associative.
A grammar can use both choice operators (/
and /{}
),
but a single choice can not mix them. That is, the parser
of relabel
module will not recognize a pattern as
p1 / p2 /{l1} p3
.
relabel.setlabels (tlabel)
Allows to specicify a table with labels. They keys of
tlabel
must be integers between 0
and 63
,
and the associated values should be strings.
Some Examples
Throwing a label
The following example defines a grammar that matches a list of identifiers separated by commas. A label is thrown when there is an error matching an identifier or a comma:
local m = require'lpeglabel' local g = m.P{ "S", S = m.V"Id" * m.V"List", List = -m.P(1) + ("," + m.T(2)) * m.V"Id" * m.V"List", Id = m.R'az'^1 + m.T(1), } function mymatch (g, s) local r, e = g:match(s) if not r then if e == 1 then return "Error: expecting an identifier" elseif e == 2 then return "Error: expecting ','" else return "Error" end end return r end print(mymatch(g, "a,b")) print(mymatch(g, "a b")) print(mymatch(g, ", b"))
In this example we could think about writing rule List as follows:
List = m.P(("," + m.T(2)) * m.V"Id")^0but this would give us an expression that when matching the end of input would result in a failure whose associated label would be 2.
In the previous example we could have also created a table with the error messages to improve the readbility of the PEG. Below we rewrite the grammar following this approach:
local m = require'lpeglabel' local errUndef = 0 local errId = 1 local errComma = 2 local terror = { [errUndef] = "Error", [errId] = "Error: expecting an identifier", [errComma] = "Error: expecting ','", } local g = m.P{ "S", S = m.V"Id" * m.V"List", List = -m.P(1) + ("," + m.T(errComma)) * m.V"Id" * m.V"List", Id = m.R'az'^1 + m.T(errId), } function mymatch (g, s) local r, e = g:match(s) if not r then return terror[e] end return r end print(mymatch(g, "a,b")) print(mymatch(g, "a b")) print(mymatch(g, ", b"))
Throwing a label using the relabel module
We can also rewrite the previous example using the relabel module as follows:
local re = require 'relabel' local g = re.compile[[ S <- Id List List <- !. / (',' / %{2}) Id List Id <- [a-z] / %{1} ]] function mymatch (g, s) local r, e = g:match(s) if not r then if e == 1 then return "Error: expecting an identifier" elseif e == 2 then return "Error: expecting ','" else return "Error" end end return r end print(mymatch(g, "a,b")) print(mymatch(g, "a b")) print(mymatch(g, ", b"))
Another way to describe the previous example using the relabel module is by using a table with the description of the errors (terror) and another table that associates a name to a given label (tlabels):
local re = require 'relabel' local errUndef, errId, errComma = 0, 1, 2 local terror = { [errUndef] = "Error", [errId] = "Error: expecting an identifier", [errComma] = "Error: expecting ','", } local tlabels = { ["errUndef"] = errUndef, ["errId"] = errId, ["errComma"] = errComma } re.setlabels(tlabels) local g = re.compile[[ S <- Id List List <- !. / (',' / %{errComma}) Id List Id <- [a-z] / %{errId} ]] function mymatch (g, s) local r, e = g:match(s) if not r then return terror[e] end return r end print(mymatch(g, "a,b")) print(mymatch(g, "a b")) print(mymatch(g, ", b"))
Throwing and catching a label
When a label is thrown, the grammar itself can handle this label by using the labeled ordered choice. Below we rewrite the example of the list of identifiers to show this feature:
local m = require'lpeglabel' local errUndef, errId, errComma = 0, 1, 2 local terror = { [errUndef] = "Error", [errId] = "Error: expecting an identifier", [errComma] = "Error: expecting ','", } g = m.P{ "S", S = m.Lc(m.Lc(m.V"Id" * m.V"List", m.V"ErrId", errId), m.V"ErrComma", errComma), List = -m.P(1) + m.V"Comma" * m.V"Id" * m.V"List", Id = m.R'az'^1 + m.T(errId), Comma = "," + m.T(errComma), ErrId = m.Cc(errId) / terror, ErrComma = m.Cc(errComma) / terror } print(g:match("a,b")) print(g:match("a b")) print(g:match(",b"))
As was pointed out before, the labeled ordered
choice is not associative, so we should impose a left-associative
order when using function Lc
.
Below we use the re module to throw and catch labels.
As was pointed out before, the /{}
operator is left-associative, so we do not need to manually impose
a left-associative order as we did in the previous example that
used Lc
:
local re = require'relabel' local terror = {} local function newError(l, msg) table.insert(terror, { l = l, msg = msg } ) end newError("errId", "Error: expecting an identifier") newError("errComma", "Error: expecting ','") local labelCode = {} local labelMsg = {} for k, v in ipairs(terror) do labelCode[v.l] = k labelMsg[v.l] = v.msg end re.setlabels(labelCode) local p = re.compile([[ S <- Id List /{errId} ErrId /{errComma} ErrComma List <- !. / Comma Id List Id <- [a-z]+ / %{errId} Comma <- ',' / %{errComma} ErrId <- '' -> errId ErrComma <- '' -> errComma ]], labelMsg) print(p:match("a,b")) print(p:match("a b")) print(p:match(",b"))
Tiny Language
As a more complex example, below we have the grammar for the Tiny language, as described in this paper. The example below can also show the line where the syntactic error probably happened.
local re = require 'relabel' local terror = {} local function newError(l, msg) table.insert(terror, { l = l, msg = msg} ) end newError("errSemi", "Error: missing ';'") newError("errExpIf", "Error: expected expression after 'if'") newError("errThen", "Error: expected 'then' keyword") newError("errCmdSeq1", "Error: expected at least a command after 'then'") newError("errCmdSeq2", "Error: expected at least a command after 'else'") newError("errEnd", "Error: expected 'end' keyword") newError("errCmdSeqRep", "Error: expected at least a command after 'repeat'") newError("errUntil", "Error: expected 'until' keyword") newError("errExpRep", "Error: expected expression after 'until'") newError("errAssignOp", "Error: expected ':=' in assigment") newError("errExpAssign", "Error: expected expression after ':='") newError("errReadName", "Error: expected an identifier after 'read'") newError("errWriteExp", "Error: expected expression after 'write'") newError("errSimpExp", "Error: expected '(', ID, or number after '<' or '='") newError("errTerm", "Error: expected '(', ID, or number after '+' or '-'") newError("errFactor", "Error: expected '(', ID, or number after '*' or '/'") newError("errExpFac", "Error: expected expression after '('") newError("errClosePar", "Error: expected ')' after expression") local line local function incLine() line = line + 1 return true end local function countLine(s, i) line = 1 local p = re.compile([[ S <- (%nl -> incLine / .)* ]], { incLine = incLine}) p:match(s:sub(1, i)) return true end local labelCode = {} for k, v in ipairs(terror) do labelCode[v.l] = k end re.setlabels(labelCode) local g = re.compile([[ Tiny <- CmdSeq CmdSeq <- (Cmd (SEMICOLON / ErrSemi)) (Cmd (SEMICOLON / ErrSemi))* Cmd <- IfCmd / RepeatCmd / ReadCmd / WriteCmd / AssignCmd IfCmd <- IF (Exp / ErrExpIf) (THEN / ErrThen) (CmdSeq / ErrCmdSeq1) (ELSE (CmdSeq / ErrCmdSeq2) / '') (END / ErrEnd) RepeatCmd <- REPEAT (CmdSeq / ErrCmdSeqRep) (UNTIL / ErrUntil) (Exp / ErrExpRep) AssignCmd <- NAME (ASSIGNMENT / ErrAssignOp) (Exp / ErrExpAssign) ReadCmd <- READ (NAME / ErrReadName) WriteCmd <- WRITE (Exp / ErrWriteExp) Exp <- SimpleExp ((LESS / EQUAL) (SimpleExp / ErrSimpExp) / '') SimpleExp <- Term ((ADD / SUB) (Term / ErrTerm))* Term <- Factor ((MUL / DIV) (Factor / ErrFactor))* Factor <- OPENPAR (Exp / ErrExpFac) (CLOSEPAR / ErrClosePar) / NUMBER / NAME ErrSemi <- ErrCount %{errSemi} ErrExpIf <- ErrCount %{errExpIf} ErrThen <- ErrCount %{errThen} ErrCmdSeq1 <- ErrCount %{errCmdSeq1} ErrCmdSeq2 <- ErrCount %{errCmdSeq2} ErrEnd <- ErrCount %{errEnd} ErrCmdSeqRep <- ErrCount %{errCmdSeqRep} ErrUntil <- ErrCount %{errUntil} ErrExpRep <- ErrCount %{errExpRep} ErrAssignOp <- ErrCount %{errAssignOp} ErrExpAssign <- ErrCount %{errExpAssign} ErrReadName <- ErrCount %{errReadName} ErrWriteExp <- ErrCount %{errWriteExp} ErrSimpExp <- ErrCount %{errSimpExp} ErrTerm <- ErrCount %{errTerm} ErrFactor <- ErrCount %{errFactor} ErrExpFac <- ErrCount %{errExpFac} ErrClosePar <- ErrCount %{errClosePar} ErrCount <- '' => countLine ADD <- Sp '+' ASSIGNMENT <- Sp ':=' CLOSEPAR <- Sp ')' DIV <- Sp '/' IF <- Sp 'if' ELSE <- Sp 'else' END <- Sp 'end' EQUAL <- Sp '=' LESS <- Sp '<' MUL <- Sp '*' NAME <- Sp !RESERVED [a-z]+ NUMBER <- Sp [0-9]+ OPENPAR <- Sp '(' READ <- Sp 'read' REPEAT <- Sp 'repeat' SEMICOLON <- Sp ';' SUB <- Sp '-' THEN <- Sp 'then' UNTIL <- Sp 'until' WRITE <- Sp 'write' RESERVED <- (IF / ELSE / END / READ / REPEAT / THEN / UNTIL / WRITE) ![a-z]+ Sp <- %s* ]], { countLine = countLine })
Download
LPegLabel source code.
License
The MIT License (MIT)
Copyright (c) 2014-2015 Sérgio Medeiros
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.