| field | value | date |
|---|---|---|
| author | Roberto Ierusalimschy <roberto@inf.puc-rio.br> | 2019-04-17 14:08:22 -0300 |
| committer | Roberto Ierusalimschy <roberto@inf.puc-rio.br> | 2019-04-17 14:08:22 -0300 |
| commit | 24bf757183d8bd97f6f5b43d916814f3269c8347 | |
| tree | 646cd65d6e2dab57691f98f83f15f25c70685ef8 /lpeg.html | |
| parent | 3f7797419e4d7493e1364290a5b127d1cb45e3bf | |
Implementation of UTF-8 ranges
New constructor 'lpeg.utfR(from, to)' creates a pattern that matches
UTF-8 byte sequences representing code points in the range [from, to].
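
To make the intended matching semantics concrete, here is a minimal Python model of what a pattern built by 'lpeg.utfR(from, to)' is described as accepting. It is a sketch, not LPeg's implementation: the names `decode_utf8_prefix` and `match_utf_range` are hypothetical, and rejecting overlong encodings is an assumption drawn from the phrase "valid UTF-8 byte sequence". Surrogates are deliberately not rejected, in line with the documentation added by this commit.

```python
from typing import Optional, Tuple

MAX_CODE_POINT = 0x10FFFF  # the Unicode limit named in the documentation


def decode_utf8_prefix(data: bytes, pos: int = 0) -> Optional[Tuple[int, int]]:
    """Decode one UTF-8 sequence starting at data[pos].

    Returns (code_point, length_in_bytes), or None if the bytes are not a
    well-formed sequence. Surrogates (U+D800..U+DFFF) are accepted on purpose.
    """
    if pos >= len(data):
        return None
    first = data[pos]
    if first < 0x80:                      # 1 byte: 0xxxxxxx
        return first, 1
    if 0xC0 <= first < 0xE0:              # 2 bytes: 110xxxxx
        length, cp, minimum = 2, first & 0x1F, 0x80
    elif 0xE0 <= first < 0xF0:            # 3 bytes: 1110xxxx
        length, cp, minimum = 3, first & 0x0F, 0x800
    elif 0xF0 <= first < 0xF8:            # 4 bytes: 11110xxx
        length, cp, minimum = 4, first & 0x07, 0x10000
    else:                                 # stray continuation or invalid lead byte
        return None
    if pos + length > len(data):
        return None
    for byte in data[pos + 1:pos + length]:
        if byte & 0xC0 != 0x80:           # continuation bytes must be 10xxxxxx
            return None
        cp = (cp << 6) | (byte & 0x3F)
    # Assumption: overlong forms and values above 0x10FFFF are not "valid".
    if cp < minimum or cp > MAX_CODE_POINT:
        return None
    return cp, length


def match_utf_range(data: bytes, cp1: int, cp2: int, pos: int = 0) -> Optional[int]:
    """Return the 0-based offset just past the match, or None on failure."""
    decoded = decode_utf8_prefix(data, pos)
    if decoded is None:
        return None
    cp, length = decoded
    return pos + length if cp1 <= cp <= cp2 else None
```

For instance, `match_utf_range("α".encode("utf-8"), 0x370, 0x3FF)` consumes the two bytes of U+03B1 and returns 2, while the same call on `b"a"` returns `None`. Offsets here are 0-based Python offsets, not the 1-based positions Lua uses.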
Diffstat (limited to 'lpeg.html')
| -rw-r--r-- | lpeg.html | 12 |
1 files changed, 12 insertions, 0 deletions
| @@ -107,6 +107,9 @@ for creating patterns: | |||
| 107 | <td>Matches any character in <code>string</code> (Set)</td></tr> | 107 | <td>Matches any character in <code>string</code> (Set)</td></tr> |
| 108 | <tr><td><a href="#op-r"><code>lpeg.R("<em>xy</em>")</code></a></td> | 108 | <tr><td><a href="#op-r"><code>lpeg.R("<em>xy</em>")</code></a></td> |
| 109 | <td>Matches any character between <em>x</em> and <em>y</em> (Range)</td></tr> | 109 | <td>Matches any character between <em>x</em> and <em>y</em> (Range)</td></tr> |
| 110 | <tr><td><a href="#op-utfR"><code>lpeg.utfR(cp1, cp2)</code></a></td> | ||
| 111 | <td>Matches an UTF-8 code point between <code>cp1</code> and | ||
| 112 | <code>cp2</code></td></tr> | ||
| 110 | <tr><td><a href="#op-pow"><code>patt^n</code></a></td> | 113 | <tr><td><a href="#op-pow"><code>patt^n</code></a></td> |
| 111 | <td>Matches at least <code>n</code> repetitions of <code>patt</code></td></tr> | 114 | <td>Matches at least <code>n</code> repetitions of <code>patt</code></td></tr> |
| 112 | <tr><td><a href="#op-pow"><code>patt^-n</code></a></td> | 115 | <tr><td><a href="#op-pow"><code>patt^-n</code></a></td> |
| @@ -329,6 +332,15 @@ are patterns that always fail. | |||
| 329 | </p> | 332 | </p> |
| 330 | 333 | ||
| 331 | 334 | ||
| 335 | <h3><a name="op-utfR"></a><code>lpeg.utfR (cp1, cp2)</code></h3> | ||
| 336 | <p> | ||
| 337 | Returns a pattern that matches a valid UTF-8 byte sequence | ||
| 338 | representing a code point in the range <code>[cp1, cp2]</code>. | ||
| 339 | The range is limited by the natural Unicode limit of 0x10FFFF, | ||
| 340 | but may include surrogates. | ||
| 341 | </p> | ||
| 342 | |||
| 343 | |||
| 332 | <h3><a name="op-v"></a><code>lpeg.V (v)</code></h3> | 344 | <h3><a name="op-v"></a><code>lpeg.V (v)</code></h3> |
| 333 | <p> | 345 | <p> |
| 334 | This operation creates a non-terminal (a <em>variable</em>) | 346 | This operation creates a non-terminal (a <em>variable</em>) |
