Commit a0ab5bf

Merge pull request #140 from purescript-contrib/jbrock-docs
Documentation and new parsers rest,take,eof
2 parents 75710db + 2e23f8c commit a0ab5bf

7 files changed: +224 -55 lines changed

CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -8,10 +8,19 @@ Breaking changes:

 New features:

+- `Parser.String.rest` (#140 by @jamesdbrock)
+- `Parser.String.takeN` (#140 by @jamesdbrock)
+- `Parser.Token.eof` (#140 by @jamesdbrock)
+
 Bugfixes:

+- `Parser.String.eof` Set consumed on success so that this parser combines
+correctly with `notFollowedBy eof`. Added a test for this. (#140 by @jamesdbrock)
+
 Other improvements:

+- Documentation. (#140 by @jamesdbrock)
+
 ## [v8.1.0](https://github.com/purescript-contrib/purescript-parsing/releases/tag/v8.1.0) - 2022-01-10

 Other improvements: README Quick start monadic parsing tutorial
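
The changelog entries above name two new `String` primitives, `rest` and `takeN`. The following is a minimal sketch of how they might be combined. It assumes that both are exported from `Text.Parsing.Parser.String`, that `takeN` takes an `Int` count and returns a `String`, and that `rest` returns whatever input remains; those signatures are not shown in this diff. The module name and `prefixAndRest` are hypothetical.

```purescript
module Example.TakeRest where

import Prelude

import Data.Either (Either)
import Data.Tuple (Tuple(..))
import Text.Parsing.Parser (ParseError, Parser, runParser)
import Text.Parsing.Parser.String (rest, takeN)

-- | Hypothetical parser: split the input into a fixed-width prefix
-- | and everything that follows it.
prefixAndRest :: Parser String (Tuple String String)
prefixAndRest = do
  -- Assumption: `takeN 3` fails when fewer than three characters remain.
  prefix <- takeN 3
  -- Assumption: `rest` always succeeds, possibly with an empty string.
  remainder <- rest
  pure (Tuple prefix remainder)

-- | Run the parser on a sample input.
result :: Either ParseError (Tuple String String)
result = runParser "abcdef" prefixAndRest
```

If the assumptions hold, `result` should pair the first three characters of the input with the remainder.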

README.md

Lines changed: 41 additions & 16 deletions
@@ -6,7 +6,8 @@
 [![Maintainer: jamesdbrock](https://img.shields.io/badge/maintainer-jamesdbrock-teal.svg)](https://github.com/jamesdbrock)
 [![Maintainer: robertdp](https://img.shields.io/badge/maintainer-robertdp-teal.svg)](https://github.com/robertdp)

-A monadic parser combinator library based on Haskell's [Parsec](https://hackage.haskell.org/package/parsec).
+A monadic parser combinator library based on Haskell’s
+[Parsec](https://hackage.haskell.org/package/parsec).

 ## Installation

@@ -22,26 +23,41 @@ Here is a basic tutorial introduction to monadic parsing with this package.

 ### Parsers

-A parser turns a string into a data structure. Parsers in this library have the type `Parser s a`, where `s` is the type of the input string, and `a` is the type of the data which the parser will produce on success. `Parser s a` is a monad. It’s defined in the module `Text.Parsing.Parser`.
+A parser turns a string into a data structure. Parsers in this library have the type `Parser s a`, where `s` is the type of the input string, and `a` is the type of the data which the parser will produce on success. `Parser s` is a monad. It’s defined in the module `Text.Parsing.Parser`.

-Monads can be used to provide context for a computation, and that’s how we use them in monadic parsing. The context provided by the `Parser` monad is *the parser’s current location in the input string*. Parsing starts at the beginning of the input string.
+Monads can be used to provide context for a computation, and that’s how we use them in monadic parsing.
+The context provided by the `Parser s` monad is __the parser’s current location in the input string__.
+Parsing starts at the beginning of the input string.

-Parsing requires two more capabilities: *choice* and *failure*.
+Parsing requires two more capabilities: __alternative__ and __failure__.

-We need *choice* to be able to make decisions about what kind of thing we’re parsing depending on the input which we encouter. This is provided by the `Alt` typeclass instance of the `Parser` monad, particularly the `<|>` operator. That operator will first try the left parser and if that fails, then it will backtrack the input string and try the right parser.
+We need __alternative__ to be able to choose what kind of thing we’re parsing depending
+on the input which we encounter. This is provided by the `<|>` “alt”
+operator of the `Alt` typeclass instance of the `Parser s` monad.
+The expression `p_left <|> p_right` will first try the `p_left` parser and if that fails
+__and consumes no input__ then it will try the `p_right` parser.

-We need *failure* in case the input stream is not parseable. This is provided by the `fail` function, which calls the `throwError` function of the `MonadThrow` typeclass instance of the `Parser` monad. The result of running a parser has type `Either ParseError a`, so if the parse succeeds then the result is `Right a` and if the parse fails then the result is `Left ParseError`.
+We need __failure__ in case the input stream is not parseable. This is provided by the `fail`
+function, which calls the `throwError` function of the `MonadThrow` typeclass instance of
+the `Parser s` monad.

-
-### Running a parser
-
-To run a parser, call the function `runParser :: s -> Parser s a -> Either ParseError a` in the `Text.Parsing.Parser` module, and supply it with an input string and a parser.
+To run a parser, call the function `runParser :: s -> Parser s a -> Either ParseError a` in
+the `Text.Parsing.Parser` module, and supply it with an input string and a parser.
+If the parse succeeds then the result is `Right a` and if the parse fails then the
+result is `Left ParseError`.

 ### Primitive parsers

-Each type of input string needs primitive parsers. Primitive parsers for input string type `String` are in the `Text.Parsing.Parser.String` module. We can use these primitive parsers to write other `String` parsers.
+Each type of input string needs primitive parsers.
+Primitive parsers for input string type `String` are in the `Text.Parsing.Parser.String` module.
+We can use these primitive parsers to write other `String` parsers.

-Here is a parser `ayebee :: Parser String Boolean` which will accept only two input strings: `"ab"` or `"aB"`. It will return `true` if the `b` character is uppercase. It will return `false` if the `b` character is lowercase. It will fail with a `ParseError` if the input string is anything else. This parser is written in terms of the primitive parser `char :: Parser String Char`.
+Here is a parser `ayebee :: Parser String Boolean` which will accept only two input
+strings: `"ab"` or `"aB"`.
+It will return `true` if the `b` character is uppercase.
+It will return `false` if the `b` character is lowercase.
+It will fail with a `ParseError` if the input string is anything else.
+This parser is written in terms of the primitive parser `char :: Parser String Char`.

 ```purescript
 ayebee :: Parser String Boolean
@@ -61,24 +77,33 @@ and then the parser will succeed and return `Right true`.

 #### [✨ Run the `ayebee` parser in your browser on *Try PureScript!*](https://try.purescript.org/?github=/purescript-contrib/purescript-parsing/main/docs/examples/QuickStart.purs)

-When you write a real parser you will usually want to return a more complicated data structure than a single `Boolean`. See [*Parse, don't validate*](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/).
-
 ### More parsers

 There are other `String` parsers in the module `Text.Parsing.Parser.Token`, for example the parser `letter :: Parser String Char` which will accept any single alphabetic letter.

 ### Parser combinators

-A parser combinator is a function which takes a parser as an argument and returns a new parser. The `many` combinator, for example, will repeat a parser as many times as it can. So the parser `many letter` will have type `Parser String (Array Char)`. Parser combinators are in this package in the module `Text.Parsing.Parser.Combinators`.
+A parser combinator is a function which takes a parser as an argument and returns a new parser. The `many` combinator, for example, will repeat a parser as many times as it can. So the parser `many ayebee` will have type `Parser String (Array Boolean)`. Running that parser
+
+```purescript
+runParser "aBabaB" (many ayebee)
+```
+
+will return `Right [true, false, true]`.
+
+Parser combinators are in this package in the module `Text.Parsing.Parser.Combinators`.

 ## Further reading

-Here is the original short classic [FUNCTIONAL PEARLS *Monadic Parsing in Haskell*](https://www.cs.nott.ac.uk/~pszgmh/pearl.pdf) by Graham Hutton and Erik Meijer.
+Here is the original short classic [FUNCTIONAL PEARLS *Monadic Parsing in Haskell*](https://www.cs.nott.ac.uk/~pszgmh/pearl.pdf) by Graham Hutton and Erik Meijer.

 [*Revisiting Monadic Parsing in Haskell*](https://vaibhavsagar.com/blog/2018/02/04/revisiting-monadic-parsing-haskell/) by Vaibhav Sagar is a reflection on the Hutton, Meijer FUNCTIONAL PEARL.

 [*Parse, don't validate*](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/) by Alexis King is about what it means to “parse” something, without any mention of monads.

+[*Parsec: “try a <|> b” considered harmful*](http://blog.ezyang.com/2014/05/parsec-try-a-or-b-considered-harmful/) by Edward Z. Yang is about how to decide when to backtrack
+from a failed alternative.
+
 There are lots of other great monadic parsing tutorials on the internet.

 ## Related Packages
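
The README text in this diff describes `fail` and the `Either ParseError a` result of `runParser`. A minimal sketch of both follows. The parser `yesNo`, the function `describe`, and the module name are hypothetical. The sketch assumes that `anyChar` is exported from `Text.Parsing.Parser.String` and `parseErrorMessage` from `Text.Parsing.Parser`.

```purescript
module Example.Fail where

import Prelude

import Data.Either (Either(..))
import Text.Parsing.Parser (Parser, fail, parseErrorMessage, runParser)
import Text.Parsing.Parser.String (anyChar)

-- | Hypothetical parser: read one character and decide what it means.
-- | Anything other than 'y' or 'n' is rejected with `fail`.
yesNo :: Parser String Boolean
yesNo = do
  c <- anyChar
  case c of
    'y' -> pure true
    'n' -> pure false
    _ -> fail "expected 'y' or 'n'"

-- | Run the parser and handle both outcomes of the `Either` result.
describe :: String -> String
describe input = case runParser input yesNo of
  Right b -> "parsed: " <> show b
  Left err -> "failed: " <> parseErrorMessage err
```

Here `describe "y"` takes the `Right` branch, while `describe "q"` takes the `Left` branch and carries the message given to `fail`.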

src/Text/Parsing/Parser.purs

Lines changed: 27 additions & 3 deletions
@@ -51,6 +51,18 @@ derive instance ordParseError :: Ord ParseError

 -- | Contains the remaining input and current position.
 data ParseState s = ParseState s Position Boolean
+-- ParseState constructor has three parameters,
+-- s: the remaining input
+-- Position: the current position
+-- Boolean: the consumed flag.
+--
+-- The consumed flag is used to implement the rule for `alt` that
+-- * If the left parser fails *without consuming any input*, then backtrack and try the right parser.
+-- * If the left parser fails and consumes input, then fail immediately.
+--
+-- https://hackage.haskell.org/package/parsec/docs/Text-Parsec.html#v:try
+--
+-- http://blog.ezyang.com/2014/05/parsec-try-a-or-b-considered-harmful/

 -- | The Parser monad transformer.
 -- |
@@ -105,12 +117,25 @@ derive newtype instance monadStateParserT :: Monad m => MonadState (ParseState s
 derive newtype instance monadThrowParserT :: Monad m => MonadThrow ParseError (ParserT s m)
 derive newtype instance monadErrorParserT :: Monad m => MonadError ParseError (ParserT s m)

+-- | The alternative `Alt` instance provides the `alt` combinator `<|>`.
+-- |
+-- | The expression `p_left <|> p_right` will first try the `p_left` parser and if that fails
+-- | __and consumes no input__ then it will try the `p_right` parser.
+-- |
+-- | While we are parsing down the `p_left` branch we may reach a point where
+-- | we know this is the correct branch, but we cannot parse further. At
+-- | that point we want to fail the entire parse instead of trying the `p_right`
+-- | branch. To control the point at which we commit to the `p_left` branch
+-- | use the `try` combinator.
+-- |
+-- | The `alt` combinator works this way because it gives us good localized
+-- | error messages while also allowing an efficient implementation.
 instance altParserT :: Monad m => Alt (ParserT s m) where
   alt p1 p2 = (ParserT <<< ExceptT <<< StateT) \(s@(ParseState i p _)) -> do
-    Tuple e s'@(ParseState _ _ c') <- runStateT (runExceptT (unwrap p1)) (ParseState i p false)
+    Tuple e s'@(ParseState _ _ consumed) <- runStateT (runExceptT (unwrap p1)) (ParseState i p false)
     case e of
       Left _
-        | not c' -> runStateT (runExceptT (unwrap p2)) s
+        | not consumed -> runStateT (runExceptT (unwrap p2)) s
       _ -> pure (Tuple e s')

 instance plusParserT :: Monad m => Plus (ParserT s m) where
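
The doc comment added in this hunk says that `<|>` only tries the right parser when the left one fails without consuming input, and that `try` controls the commit point. A small sketch of that rule in use follows. The names `ab`, `ac`, `committed`, `backtracking`, and the module name are hypothetical. The sketch assumes that `char` succeeds by consuming one character, which sets the consumed flag.

```purescript
module Example.AltTry where

import Prelude

import Control.Alt ((<|>))
import Data.Either (Either)
import Text.Parsing.Parser (ParseError, Parser, runParser)
import Text.Parsing.Parser.Combinators (try)
import Text.Parsing.Parser.String (char)

-- Both branches begin with 'a', so on the input "ac" the left branch
-- consumes one character before it discovers the mismatch.
ab :: Parser String Char
ab = char 'a' *> char 'b'

ac :: Parser String Char
ac = char 'a' *> char 'c'

-- `ab` consumes 'a' and then fails on 'c'. Because input was consumed,
-- `<|>` does not try `ac`, and the whole parse should fail.
committed :: Either ParseError Char
committed = runParser "ac" (ab <|> ac)

-- `try ab` fails as though no input had been consumed, so `<|>` goes on
-- to run `ac` from the start of the input, which should succeed.
backtracking :: Either ParseError Char
backtracking = runParser "ac" (try ab <|> ac)
```

Wrapping only the ambiguous prefix in `try`, rather than the whole left branch, keeps error messages closer to the point of failure; this is the approach recommended in the Edward Z. Yang article linked in the README.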
@@ -147,4 +172,3 @@ failWithPosition message pos = throwError (ParseError message pos)
 -- | `region` as the parser backs out the call stack.
 region :: forall m s a. Monad m => (ParseError -> ParseError) -> ParserT s m a -> ParserT s m a
 region context p = catchError p $ \err -> throwError $ context err
-