Commit 7ab12ee

felix9 authored and Mingun committed
Reimplement offset() and add range()
1 parent 42a1337 commit 7ab12ee

6 files changed

+99
-0
lines changed

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -63,6 +63,11 @@ Released: TBD
   especially if your plugin replaces both `generateBytecode` and `generateJs` passes.

   [@Mingun](https://github.com/peggyjs/peggy/pull/117)
+- Parsers can now use two new functions to get location information:
+  `offset()` and `range()`. Use them if you don't need the whole
+  location information, because it can be expensive to compute.
+  These two functions are always very efficient (back-ported pegjs/pegjs#528).
+  [@felix9 and @Mingun](https://github.com/peggyjs/peggy/pull/145)

 ### Bug fixes

README.md

Lines changed: 24 additions & 0 deletions
@@ -539,6 +539,12 @@ available to them.
 - `location()` returns an object with the information about the parse position.
   Refer to [the corresponding section](#locations) for the details.

+- `range()` is similar to `location()`, but returns an object with offsets only.
+  Refer to [the "Locations" section](#locations) for the details.
+
+- `offset()` returns only the start offset, i.e. `location().start.offset`.
+  Refer to [the "Locations" section](#locations) for the details.
+
 - `text()` returns the source text between `start` and `end` (which will be `""` for
   predicates). Instead of using that function as a return value for the rule consider
   using the [`$` operator](#-expression-2).
@@ -709,6 +715,24 @@ For the per-parse initializer, the location is the start of the input, i.e.
 The line number is incremented each time the parser finds an end of line
 sequence in the input.

+Line and column are somewhat expensive to compute, so if you just need the
+offset, there's also a function `offset()` that returns just the start offset,
+and a function `range()` that returns the object:
+
+```javascript
+{
+  source: options.grammarSource,
+  start: 23,
+  end: 25
+}
+```
+
+(i.e. it differs from the `location()` result only in the type of the `start` and `end`
+properties, which contain just an offset instead of the `Location` object.)
+
+All notes about the values of the `location()` object also apply to the `range()`
+and `offset()` calls.
+
 Currently, Peggy only works with the [Basic Multilingual Plane (BMP)][BMP] of
 Unicode. This means that all offsets are measured in UTF-16 code units. If you
 try to parse characters outside this Plane (for example, emoji, or any
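
Reviewer note: since `range()` skips the line and column computation, a caller who stores ranges may still want line and column for the occasional error message. The sketch below shows one way to expand a `range()`-shaped object later. The helper names `offsetToPosition` and `rangeToLocation` are hypothetical and not part of Peggy, and the sketch assumes that only `"\n"` ends a line.

```javascript
// Hypothetical helper, not part of Peggy: derive a 1-based line and column
// from a 0-based offset by scanning the input up to that offset.
// Assumption: only "\n" ends a line.
function offsetToPosition(input, offset) {
  let line = 1;
  let column = 1;
  for (let i = 0; i < offset && i < input.length; i++) {
    if (input[i] === "\n") {
      line++;
      column = 1;
    } else {
      column++;
    }
  }
  return { offset, line, column };
}

// Expand a range()-shaped object into a location()-shaped one on demand.
function rangeToLocation(input, range) {
  return {
    source: range.source,
    start: offsetToPosition(input, range.start),
    end: offsetToPosition(input, range.end)
  };
}

const input = "ab\ncd";
const loc = rangeToLocation(input, { source: undefined, start: 3, end: 5 });
console.log(loc.start); // { offset: 3, line: 2, column: 1 }
console.log(loc.end);   // { offset: 5, line: 2, column: 3 }
```

The scan is linear in the offset, which is roughly the cost that `offset()` and `range()` let an action avoid when it only needs offsets.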

docs/documentation.html

Lines changed: 24 additions & 0 deletions
@@ -657,6 +657,12 @@ <h3 id="action-execution-environment">Action Execution Environment</h3>
 <li><p><code>location()</code> returns an object with the information about the parse position.
 Refer to <a href="#locations">the corresponding section</a> for the details.</p>
 </li>
+<li><p><code>range()</code> is similar to <code>location()</code>, but returns an object with offsets only.
+Refer to <a href="#locations">the &quot;Locations&quot; section</a> for the details.</p>
+</li>
+<li><p><code>offset()</code> returns only the start offset, i.e. <code>location().start.offset</code>.
+Refer to <a href="#locations">the &quot;Locations&quot; section</a> for the details.</p>
+</li>
 <li><p><code>text()</code> returns the source text between <code>start</code> and <code>end</code> (which will be <code>&quot;&quot;</code> for
 predicates). Instead of using that function as a return value for the rule consider
 using the <a href="#-expression-2"><code>$</code> operator</a>.</p>
@@ -799,6 +805,24 @@ <h2 id="locations">Locations</h2>
 <p>The line number is incremented each time the parser finds an end of line sequence in
 the input.</p>

+<p>Line and column are somewhat expensive to compute, so if you just need the
+offset, there's also a function <code>offset()</code> that returns just the start offset,
+and a function <code>range()</code> that returns the object:</p>
+
+<pre><code class="language-javascript">
+{
+  source: options.grammarSource,
+  start: 23,
+  end: 25
+}
+</code></pre>
+
+<p>(i.e. it differs from the <code>location()</code> result only in the type of the <code>start</code> and <code>end</code>
+properties, which contain just an offset instead of the <code>Location</code> object.)</p>
+
+<p>All notes about the values of the <code>location()</code> object also apply to the <code>range()</code>
+and <code>offset()</code> calls.</p>
+
 <p>Currently, Peggy only works with the <a href="https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane">Basic Multilingual Plane (BMP)</a> of Unicode.
 This means that all offsets are measured in UTF-16 code units. If you
 try to parse characters outside this Plane (for example, emoji, or any
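
Reviewer note: the statement that offsets are measured in UTF-16 code units applies to the values returned by `offset()` and `range()` as well. The short sketch below uses only standard JavaScript string behavior, with no Peggy API involved, to show why an offset can differ from a character count once the input leaves the BMP. The variable names are illustrative.

```javascript
const bmpText = "h\u00e9llo";       // every character is inside the BMP
const astralText = "a\u{1F600}b";   // U+1F600 (an emoji) is outside the BMP

console.log(bmpText.length);          // 5 code units, 5 characters
console.log(astralText.length);       // 4 code units: the emoji takes two
console.log([...astralText].length);  // 3 code points

// An offset-style index of "b", counted in UTF-16 code units.
console.log(astralText.indexOf("b")); // 3
```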

lib/compiler/passes/generate-js.js

Lines changed: 12 additions & 0 deletions
@@ -857,6 +857,18 @@ function generateJS(ast, options) {
 "    return input.substring(peg$savedPos, peg$currPos);",
 "  }",
 "",
+"  function offset() {",
+"    return peg$savedPos;",
+"  }",
+"",
+"  function range() {",
+"    return {",
+"      source: peg$source,",
+"      start: peg$savedPos,",
+"      end: peg$currPos",
+"    };",
+"  }",
+"",
 "  function location() {",
 "    return peg$computeLocation(peg$savedPos, peg$currPos);",
 "  }",

lib/parser.js

Lines changed: 12 additions & 0 deletions
Some generated files are not rendered by default.
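
Reviewer note: because the diff of the generated `lib/parser.js` is not shown, here is a rough sketch of what the helpers emitted by the template in `generate-js.js` do at run time. The factory `makeActionHelpers` is hypothetical; in a generated parser the state lives in the variables `peg$savedPos`, `peg$currPos`, and `peg$source` rather than in function parameters.

```javascript
// Hypothetical factory standing in for the state a generated parser keeps
// while an action runs (peg$savedPos, peg$currPos, peg$source in the diff).
function makeActionHelpers(input, savedPos, currPos, source) {
  return {
    // Matched text between the saved and current positions.
    text() { return input.substring(savedPos, currPos); },
    // Start offset only: a plain variable read.
    offset() { return savedPos; },
    // Start and end offsets plus the source, with no line/column work.
    range() { return { source, start: savedPos, end: currPos }; }
  };
}

// Mirrors the input used in the new tests: 'xx' matched after seven digits.
const helpers = makeActionHelpers("0123456xx", 7, 9, undefined);
console.log(helpers.offset()); // 7
console.log(helpers.range());  // { source: undefined, start: 7, end: 9 }
console.log(helpers.text());   // xx
```

Both `offset()` and `range()` only read existing state, which is consistent with the changelog's claim that they are cheap compared with `location()`.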

test/behavior/generated-parser-behavior.spec.js

Lines changed: 22 additions & 0 deletions
@@ -584,6 +584,28 @@ describe("generated parser behavior", function() {
         end: { offset: 3, line: 2, column: 1 }
       });
     });
+
+    it("|offset| returns current start offset", function() {
+      const parser = peg.generate([
+        "start = [0-9]+ @mark",
+        "mark = 'xx' { return offset(); }"
+      ].join("\n"), options);
+
+      expect(parser).to.parse("0123456xx", 7);
+    });
+
+    it("|range| returns current range", function() {
+      const parser = peg.generate([
+        "start = [0-9]+ @mark",
+        "mark = 'xx' { return range(); }"
+      ].join("\n"), options);
+
+      expect(parser).to.parse("0123456xx", {
+        source: undefined,
+        start: 7,
+        end: 9
+      });
+    });
   });
 });
