Diffstat (limited to 'lpeg.html')
-rw-r--r-- lpeg.html 94
1 files changed, 52 insertions, 42 deletions
diff --git a/lpeg.html b/lpeg.html
index c9bd9f9..5271a52 100644
--- a/lpeg.html
+++ b/lpeg.html
@@ -901,8 +901,8 @@ Creates an <em>accumulator capture</em>.
 This pattern behaves similarly to a
 <a href="#cap-func">function capture</a>,
 with the following differences:
-The last captured value is added as a first argument to
-the call;
+The last captured value before <code>patt</code>
+is added as a first argument to the call;
 the return of the function is adjusted to one single value;
 that value replaces the last captured value.
 Note that the capture itself produces no values;
@@ -911,31 +911,6 @@ it only changes the value of its previous capture.
 
 <p>
 As an example,
-consider the following code fragment:
-</p>
-<pre class="example">
-local name = lpeg.C(lpeg.R("az")^1)
-local p = name * (lpeg.P("^") % string.upper)^-1
-print(p:match("count")) --&gt; count
-print(p:match("count^")) --&gt; COUNT
-</pre>
-<p>
-In the first match,
-the accumulator capture does not match,
-and so the match results in its first capture, a name.
-In the second match,
-the accumulator capture matches,
-so the function <code>string.upper</code>
-is called with the previous capture (created by <code>name</code>)
-plus the string <code>"^"</code>;
-the function ignores its second argument and returns the first argument
-changed to upper case;
-that value then becomes the first and only
-capture value created by the match.
-</p>
-
-<p>
-As another example,
 let us consider the problem of adding a list of numbers.
 </p>
 <pre class="example">
@@ -956,22 +931,56 @@ First, the initial <code>number</code> captures a number;
 that first capture will play the role of an accumulator.
 Then, each time the sequence <code>comma-number</code>
 matches inside the loop there is an accumulator capture:
-It calls <code>add</code> with the current value of the accumulator
-and the value of the new number,
-and the result of the call (their sum) replaces the value of the accumulator.
+It calls <code>add</code> with the current value of the
+accumulator&mdash;which is the last captured value, created by the
+first <code>number</code>&mdash; and the value of the new number,
+and the result of the call (the sum of the two numbers)
+replaces the value of the accumulator.
 At the end of the match,
 the accumulator with all sums is the final value.
 </p>
 
 <p>
+As another example,
+consider the following code fragment:
+</p>
+<pre class="example">
+local name = lpeg.C(lpeg.R("az")^1)
+local p = name * (lpeg.P("^") % string.upper)^-1
+print(p:match("count")) --&gt; count
+print(p:match("count^")) --&gt; COUNT
+</pre>
+<p>
+In the match against <code>"count"</code>,
+as there is no <code>"^"</code>,
+the optional accumulator capture does not match;
+so, the match results in its sole capture, a name.
+In the match against <code>"count^"</code>,
+the accumulator capture matches,
+so the function <code>string.upper</code>
+is called with the previous captured value (created by <code>name</code>)
+plus the string <code>"^"</code>;
+the function ignores its second argument and returns the first argument
+changed to upper case;
+that value then becomes the first and only
+capture value created by the match.
+</p>
+
+<p>
 Due to the nature of this capture,
 you should avoid using it in places where it is not clear
-what is its "previous" capture
-(e.g., directly nested in a <a href="#cap-string">string capture</a>
-or a <a href="#cap-num">numbered capture</a>).
-Due to implementation details,
+what is the "previous" capture,
+such as directly nested in a <a href="#cap-string">string capture</a>
+or a <a href="#cap-num">numbered capture</a>.
+(Note that these captures may not need to evaluate
+all their subcaptures to compute their results.)
+Moreover, due to implementation details,
 you should not use this capture directly nested in a
 <a href="#cap-s">substitution capture</a>.
+A simple and effective way to avoid these issues is
+to enclose the whole accumulation composition
+(including the capture that generates the initial value)
+into an anonymous <a href="#cap-g">group capture</a>.
 </p>
 
 
@@ -1056,7 +1065,8 @@ local name = lpeg.C(lpeg.alpha^1) * space
 local sep = lpeg.S(",;") * space
 local pair = name * "=" * space * name * sep^-1
 local list = lpeg.Ct("") * (pair % rawset)^0
-t = list:match("a=b, c = hi; next = pi") --&gt; { a = "b", c = "hi", next = "pi" }
+t = list:match("a=b, c = hi; next = pi")
+ --&gt; { a = "b", c = "hi", next = "pi" }
 </pre>
 <p>
 Each pair has the format <code>name = name</code> followed by
@@ -1098,7 +1108,7 @@ by <code>sep</code>.
 If the split results in too many values,
 it may overflow the maximum number of values
 that can be returned by a Lua function.
-In this case,
+To avoid this problem,
 we can collect these values in a table:
 </p>
 <pre class="example">
@@ -1134,7 +1144,7 @@ end
 </pre>
 <p>
 This grammar has a straight reading:
-it matches <code>p</code> or skips one character and tries again.
+its sole rule matches <code>p</code> or skips one character and tries again.
 </p>
 
 <p>
@@ -1143,9 +1153,9 @@ If we want to know where the pattern is in the string
 we can add position captures to the pattern:
 </p>
 <pre class="example">
-local I = lpeg.Cp()
+local Cp = lpeg.Cp()
 function anywhere (p)
- return lpeg.P{ I * p * I + 1 * lpeg.V(1) }
+ return lpeg.P{ Cp * p * Cp + 1 * lpeg.V(1) }
 end
 
 print(anywhere("world"):match("hello world!")) --&gt; 7 12
@@ -1155,15 +1165,15 @@ print(anywhere("world"):match("hello world!")) --&gt; 7 12
 Another option for the search is like this:
 </p>
 <pre class="example">
-local I = lpeg.Cp()
+local Cp = lpeg.Cp()
 function anywhere (p)
- return (1 - lpeg.P(p))^0 * I * p * I
+ return (1 - lpeg.P(p))^0 * Cp * p * Cp
 end
 </pre>
 <p>
 Again the pattern has a straight reading:
 it skips as many characters as possible while not matching <code>p</code>,
-and then matches <code>p</code> (plus appropriate captures).
+and then matches <code>p</code> plus appropriate captures.
 </p>
 
 <p>