-rw-r--r-- | README.md | 125 |
1 files changed, 65 insertions, 60 deletions
@@ -31,8 +31,7 @@ character fails, and it is caught by an ordered choice. | |||
31 | An error, by its turn, is produced by the throw operator | 31 | An error, by its turn, is produced by the throw operator |
32 | and may be caught by the recovery operator. | 32 | and may be caught by the recovery operator. |
33 | 33 | ||
34 | Below there is a brief summary | 34 | Below there is a brief summary of the new functions provided by LpegLabel: |
35 | of the new functions provided by LpegLabel: | ||
36 | 35 | ||
37 | <table border="1"> | 36 | <table border="1"> |
38 | <tbody><tr><td><b>Function</b></td><td><b>Description</b></td></tr> | 37 | <tbody><tr><td><b>Function</b></td><td><b>Description</b></td></tr> |
@@ -112,57 +111,11 @@ in the *examples* directory. | |||
112 | The following example defines a grammar that matches | 111 | The following example defines a grammar that matches |
113 | a list of identifiers separated by commas. A label | 112 | a list of identifiers separated by commas. A label |
114 | is thrown when there is an error matching an identifier | 113 | is thrown when there is an error matching an identifier |
115 | or a comma: | 114 | or a comma. |
116 | 115 | ||
117 | ```lua | 116 | We use function `newError` to store error messages in a |
118 | local m = require'lpeglabelrec' | 117 | table and to return the index associated with each error message. |
119 | local re = require'relabelrec' | ||
120 | |||
121 | local g = m.P{ | ||
122 | "S", | ||
123 | S = m.V"Id" * m.V"List", | ||
124 | List = -m.P(1) + (m.V"Comma" + m.T(2)) * (m.V"Id" + m.T(1)) * m.V"List", | ||
125 | Id = m.V"Sp" * m.R'az'^1, | ||
126 | Comma = m.V"Sp" * ",", | ||
127 | Sp = m.S" \n\t"^0, | ||
128 | } | ||
129 | |||
130 | function mymatch (g, s) | ||
131 | local r, e, sfail = g:match(s) | ||
132 | if not r then | ||
133 | local line, col = re.calcline(s, #s - #sfail) | ||
134 | local msg = "Error at line " .. line .. " (col " .. col .. ")" | ||
135 | if e == 1 then | ||
136 | return r, msg .. ": expecting an identifier before '" .. sfail .. "'" | ||
137 | elseif e == 2 then | ||
138 | return r, msg .. ": expecting ',' before '" .. sfail .. "'" | ||
139 | else | ||
140 | return r, msg | ||
141 | end | ||
142 | end | ||
143 | return r | ||
144 | end | ||
145 | |||
146 | print(mymatch(g, "one,two")) --> 8 | ||
147 | print(mymatch(g, "one two")) --> nil Error at line 1 (col 3): expecting ',' before ' two' | ||
148 | print(mymatch(g, "one,\n two,\nthree,")) --> nil Error at line 3 (col 6): expecting an identifier before '' | ||
149 | ``` | ||
150 | 118 | ||
151 | In this example we could think about writing rule <em>List</em> as follows: | ||
152 | ```lua | ||
153 | List = ((m.V"Comma" + m.T(2)) * (m.V"Id" + m.T(1)))^0, | ||
154 | ``` | ||
155 | |||
156 | but when matching this expression against the end of input | ||
157 | we would get a failure whose associated label would be **2**, | ||
158 | and this would cause the failure of the *whole* repetition. | ||
159 | |||
160 | |||
161 | ##### Mnemonics instead of numbers | ||
162 | |||
163 | In the previous example we could have created a table | ||
164 | with the error messages to improve the readability of the PEG. | ||
165 | Below we rewrite the previous grammar following this approach: | ||
166 | 119 | ||
167 | ```lua | 120 | ```lua |
168 | local m = require'lpeglabelrec' | 121 | local m = require'lpeglabelrec' |
@@ -203,19 +156,71 @@ print(mymatch(g, "one two")) --> nil Error at line 1 (col 3): expec | |||
203 | print(mymatch(g, "one,\n two,\nthree,")) --> nil Error at line 3 (col 6): expecting an identifier before '' | 156 | print(mymatch(g, "one,\n two,\nthree,")) --> nil Error at line 3 (col 6): expecting an identifier before '' |
204 | ``` | 157 | ``` |
205 | 158 | ||
159 | In this example we could think about writing rule <em>List</em> as follows: | ||
160 | ```lua | ||
161 | List = ((m.V"Comma" + m.T(errComma)) * (m.V"Id" + m.T(errId)))^0, | ||
162 | ``` | ||
163 | |||
164 | but when matching this expression against the end of input | ||
165 | we would get a failure whose associated label would be **errComma**, | ||
166 | and this would cause the failure of the *whole* repetition. | ||
167 | |||
168 | |||
169 | |||
206 | #### Error Recovery | 170 | #### Error Recovery |
207 | 171 | ||
208 | By using the recovery operator we can specify a recovery pattern that | 172 | By using the `Rec` function we can specify a recovery pattern that |
209 | should be matched when a label is thrown. After matching this pattern, | 173 | should be matched when a label is thrown. After matching the recovery |
210 | and possibly recording the error, the parser can continue parsing to | 174 | pattern, and possibly recording the error, the parser will resume |
211 | find more errors. | 175 | the <em>regular</em> matching. In the example below |
176 | we expect to match rule `A`, but in case label 42 is thrown | ||
177 | then we will try to match `recp`: | ||
178 | ```lua | ||
179 | local m = require'lpeglabelrec' | ||
212 | 180 | ||
213 | Below we rewrite the previous example to illustrate a recovery strategy. | 181 | local recp = m.P"oast" |
214 | Grammar `g` remains the same, but we add a recovery grammar `grec` that | ||
215 | handles the labels thrown by `g`. | ||
216 | 182 | ||
217 | arithmetic expression example and modify | 183 | local g = m.P{ |
218 | the `expect` function to use the recovery operator for error recovery: | 184 | "S", |
185 | S = m.Rec(m.V"A", recp, 42) * ".", | ||
186 | A = m.P"t" * (m.P("est") + m.T(42)) | ||
187 | } | ||
188 | |||
189 | print(g:match("test.")) --> 6 | ||
190 | |||
191 | print(g:match("toast.")) --> 7 | ||
192 | |||
193 | print(g:match("oast.")) --> nil 0 oast. | ||
194 | |||
195 | print(g:match("toward.")) --> nil 0 ward. | ||
196 | ``` | ||
197 | When trying to match 'toast.', in rule `A` the first | ||
198 | 't' is matched, and then label 42 is thrown, with the associated | ||
199 | input suffix 'oast.'. In rule `S` this label is caught | ||
200 | and the recovery pattern matches 'oast', so pattern `'.'` | ||
201 | matches the rest of the input. | ||
202 | |||
203 | When matching 'oast.', pattern `m.P"t"` fails, and | ||
204 | the result of the matching is <b>nil, 0, oast.</b>. | ||
205 | |||
206 | When matching 'toward.', label 42 is thrown, with the associated | ||
207 | input suffix 'oward.'. The matching of the recovery pattern fails too, | ||
208 | so the result of the matching is <b>nil, 0, ward.</b>. | ||
209 | |||
210 | Usually, the recovery pattern is an expression that never fails. | ||
211 | In the previous example, we could have used `(m.P(1) - m.P".")^0` | ||
212 | as the recovery pattern. | ||
213 | |||
214 | Below we rewrite the grammar that describes a list of identifiers | ||
215 | to use a recovery strategy. Grammar `g` remains the same, but we add a | ||
216 | recovery grammar `grec` that handles the labels thrown by `g`. | ||
217 | |||
218 | In grammar `grec` we use functions `record` and `sync`. | ||
219 | Function `record` gives us a pattern that captures two | ||
220 | values: the current subject position (where a label was thrown) | ||
221 | and the label itself. These values will be used to record | ||
222 | all the errors found. Function `sync` gives us a synchronization | ||
223 | pattern that matches the input | ||
219 | 224 | ||
220 | ```lua | 225 | ```lua |
221 | local m = require'lpeglabelrec' | 226 | local m = require'lpeglabelrec' |