Diffstat (limited to 'lpeg.html')
-rw-r--r-- | lpeg.html | 94 |
1 files changed, 52 insertions, 42 deletions
@@ -901,8 +901,8 @@ Creates an <em>accumulator capture</em>. | |||
901 | This pattern behaves similarly to a | 901 | This pattern behaves similarly to a |
902 | <a href="#cap-func">function capture</a>, | 902 | <a href="#cap-func">function capture</a>, |
903 | with the following differences: | 903 | with the following differences: |
904 | The last captured value is added as a first argument to | 904 | The last captured value before <code>patt</code> |
905 | the call; | 905 | is added as a first argument to the call; |
906 | the return of the function is adjusted to one single value; | 906 | the return of the function is adjusted to one single value; |
907 | that value replaces the last captured value. | 907 | that value replaces the last captured value. |
908 | Note that the capture itself produces no values; | 908 | Note that the capture itself produces no values; |
@@ -911,31 +911,6 @@ it only changes the value of its previous capture. | |||
911 | 911 | ||
912 | <p> | 912 | <p> |
913 | As an example, | 913 | As an example, |
914 | consider the following code fragment: | ||
915 | </p> | ||
916 | <pre class="example"> | ||
917 | local name = lpeg.C(lpeg.R("az")^1) | ||
918 | local p = name * (lpeg.P("^") % string.upper)^-1 | ||
919 | print(p:match("count")) --> count | ||
920 | print(p:match("count^")) --> COUNT | ||
921 | </pre> | ||
922 | <p> | ||
923 | In the first match, | ||
924 | the accumulator capture does not match, | ||
925 | and so the match results in its first capture, a name. | ||
926 | In the second match, | ||
927 | the accumulator capture matches, | ||
928 | so the function <code>string.upper</code> | ||
929 | is called with the previous capture (created by <code>name</code>) | ||
930 | plus the string <code>"^"</code>; | ||
931 | the function ignores its second argument and returns the first argument | ||
932 | changed to upper case; | ||
933 | that value then becomes the first and only | ||
934 | capture value created by the match. | ||
935 | </p> | ||
936 | |||
937 | <p> | ||
938 | As another example, | ||
939 | let us consider the problem of adding a list of numbers. | 914 | let us consider the problem of adding a list of numbers. |
940 | </p> | 915 | </p> |
941 | <pre class="example"> | 916 | <pre class="example"> |
@@ -956,22 +931,56 @@ First, the initial <code>number</code> captures a number; | |||
956 | that first capture will play the role of an accumulator. | 931 | that first capture will play the role of an accumulator. |
957 | Then, each time the sequence <code>comma-number</code> | 932 | Then, each time the sequence <code>comma-number</code> |
958 | matches inside the loop there is an accumulator capture: | 933 | matches inside the loop there is an accumulator capture: |
959 | It calls <code>add</code> with the current value of the accumulator | 934 | It calls <code>add</code> with the current value of the |
960 | and the value of the new number, | 935 | accumulator—which is the last captured value, created by the |
961 | and the result of the call (their sum) replaces the value of the accumulator. | 936 | first <code>number</code>— and the value of the new number, |
937 | and the result of the call (the sum of the two numbers) | ||
938 | replaces the value of the accumulator. | ||
962 | At the end of the match, | 939 | At the end of the match, |
963 | the accumulator with all sums is the final value. | 940 | the accumulator with all sums is the final value. |
964 | </p> | 941 | </p> |
965 | 942 | ||
966 | <p> | 943 | <p> |
944 | As another example, | ||
945 | consider the following code fragment: | ||
946 | </p> | ||
947 | <pre class="example"> | ||
948 | local name = lpeg.C(lpeg.R("az")^1) | ||
949 | local p = name * (lpeg.P("^") % string.upper)^-1 | ||
950 | print(p:match("count")) --> count | ||
951 | print(p:match("count^")) --> COUNT | ||
952 | </pre> | ||
953 | <p> | ||
954 | In the match against <code>"count"</code>, | ||
955 | as there is no <code>"^"</code>, | ||
956 | the optional accumulator capture does not match; | ||
957 | so, the match results in its sole capture, a name. | ||
958 | In the match against <code>"count^"</code>, | ||
959 | the accumulator capture matches, | ||
960 | so the function <code>string.upper</code> | ||
961 | is called with the previous captured value (created by <code>name</code>) | ||
962 | plus the string <code>"^"</code>; | ||
963 | the function ignores its second argument and returns the first argument | ||
964 | changed to upper case; | ||
965 | that value then becomes the first and only | ||
966 | capture value created by the match. | ||
967 | </p> | ||
968 | |||
969 | <p> | ||
967 | Due to the nature of this capture, | 970 | Due to the nature of this capture, |
968 | you should avoid using it in places where it is not clear | 971 | you should avoid using it in places where it is not clear |
969 | what is its "previous" capture | 972 | what is the "previous" capture, |
970 | (e.g., directly nested in a <a href="#cap-string">string capture</a> | 973 | such as directly nested in a <a href="#cap-string">string capture</a> |
971 | or a <a href="#cap-num">numbered capture</a>). | 974 | or a <a href="#cap-num">numbered capture</a>. |
972 | Due to implementation details, | 975 | (Note that these captures may not need to evaluate |
976 | all their subcaptures to compute their results.) | ||
977 | Moreover, due to implementation details, | ||
973 | you should not use this capture directly nested in a | 978 | you should not use this capture directly nested in a |
974 | <a href="#cap-s">substitution capture</a>. | 979 | <a href="#cap-s">substitution capture</a>. |
980 | A simple and effective way to avoid these issues is | ||
981 | to enclose the whole accumulation composition | ||
982 | (including the capture that generates the initial value) | ||
983 | into an anonymous <a href="#cap-g">group capture</a>. | ||
975 | </p> | 984 | </p> |
976 | 985 | ||
977 | 986 | ||
@@ -1056,7 +1065,8 @@ local name = lpeg.C(lpeg.alpha^1) * space | |||
1056 | local sep = lpeg.S(",;") * space | 1065 | local sep = lpeg.S(",;") * space |
1057 | local pair = name * "=" * space * name * sep^-1 | 1066 | local pair = name * "=" * space * name * sep^-1 |
1058 | local list = lpeg.Ct("") * (pair % rawset)^0 | 1067 | local list = lpeg.Ct("") * (pair % rawset)^0 |
1059 | t = list:match("a=b, c = hi; next = pi") --> { a = "b", c = "hi", next = "pi" } | 1068 | t = list:match("a=b, c = hi; next = pi") |
1069 | --> { a = "b", c = "hi", next = "pi" } | ||
1060 | </pre> | 1070 | </pre> |
1061 | <p> | 1071 | <p> |
1062 | Each pair has the format <code>name = name</code> followed by | 1072 | Each pair has the format <code>name = name</code> followed by |
@@ -1098,7 +1108,7 @@ by <code>sep</code>. | |||
1098 | If the split results in too many values, | 1108 | If the split results in too many values, |
1099 | it may overflow the maximum number of values | 1109 | it may overflow the maximum number of values |
1100 | that can be returned by a Lua function. | 1110 | that can be returned by a Lua function. |
1101 | In this case, | 1111 | To avoid this problem, |
1102 | we can collect these values in a table: | 1112 | we can collect these values in a table: |
1103 | </p> | 1113 | </p> |
1104 | <pre class="example"> | 1114 | <pre class="example"> |
@@ -1134,7 +1144,7 @@ end | |||
1134 | </pre> | 1144 | </pre> |
1135 | <p> | 1145 | <p> |
1136 | This grammar has a straight reading: | 1146 | This grammar has a straight reading: |
1137 | it matches <code>p</code> or skips one character and tries again. | 1147 | its sole rule matches <code>p</code> or skips one character and tries again. |
1138 | </p> | 1148 | </p> |
1139 | 1149 | ||
1140 | <p> | 1150 | <p> |
@@ -1143,9 +1153,9 @@ If we want to know where the pattern is in the string | |||
1143 | we can add position captures to the pattern: | 1153 | we can add position captures to the pattern: |
1144 | </p> | 1154 | </p> |
1145 | <pre class="example"> | 1155 | <pre class="example"> |
1146 | local I = lpeg.Cp() | 1156 | local Cp = lpeg.Cp() |
1147 | function anywhere (p) | 1157 | function anywhere (p) |
1148 | return lpeg.P{ I * p * I + 1 * lpeg.V(1) } | 1158 | return lpeg.P{ Cp * p * Cp + 1 * lpeg.V(1) } |
1149 | end | 1159 | end |
1150 | 1160 | ||
1151 | print(anywhere("world"):match("hello world!")) --> 7 12 | 1161 | print(anywhere("world"):match("hello world!")) --> 7 12 |
@@ -1155,15 +1165,15 @@ print(anywhere("world"):match("hello world!")) --> 7 12 | |||
1155 | Another option for the search is like this: | 1165 | Another option for the search is like this: |
1156 | </p> | 1166 | </p> |
1157 | <pre class="example"> | 1167 | <pre class="example"> |
1158 | local I = lpeg.Cp() | 1168 | local Cp = lpeg.Cp() |
1159 | function anywhere (p) | 1169 | function anywhere (p) |
1160 | return (1 - lpeg.P(p))^0 * I * p * I | 1170 | return (1 - lpeg.P(p))^0 * Cp * p * Cp |
1161 | end | 1171 | end |
1162 | </pre> | 1172 | </pre> |
1163 | <p> | 1173 | <p> |
1164 | Again the pattern has a straight reading: | 1174 | Again the pattern has a straight reading: |
1165 | it skips as many characters as possible while not matching <code>p</code>, | 1175 | it skips as many characters as possible while not matching <code>p</code>, |
1166 | and then matches <code>p</code> (plus appropriate captures). | 1176 | and then matches <code>p</code> plus appropriate captures. |
1167 | </p> | 1177 | </p> |
1168 | 1178 | ||
1169 | <p> | 1179 | <p> |