Skip to content

Improve default error #57

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 9, 2020
Merged

Improve default error #57

merged 7 commits into from
Dec 9, 2020

Conversation

JordanMartinez
Copy link
Contributor

What does this pull request do?

Fixes #56
Fixes #34

@thomashoneyman
Copy link
Contributor

It's really unfortunate that ParseError is defined as a newtype over a String instead of being a pair of the error and position. For example, that's what the parsing library does:

https://github.com/purescript-contrib/purescript-parsing/blob/f8dae0507998ef25b0d3532423733af378629913/src/Text/Parsing/Parser.purs#L36

With that type, runParser still produces a ParseError:

https://github.com/purescript-contrib/purescript-parsing/blob/f8dae0507998ef25b0d3532423733af378629913/src/Text/Parsing/Parser.purs#L70-L72

and you can recover the same error as this library with:

https://github.com/purescript-contrib/purescript-parsing/blob/f8dae0507998ef25b0d3532423733af378629913/src/Text/Parsing/Parser.purs#L38-L39

or write a printParseError function that pretty-prints the error and location information as you'd have both here. I know it would be a breaking change and perhaps not worth doing for that reason, but I'm more inclined to just fix it that way. I'm a little curious what you, @JordanMartinez, as well as other contributors (@hdgarrood, @garyb, @justinwoo) think of switching to the parsing approach.


That aside, this PR would already be a breaking change because of switching from data to newtype. If we wanted to try and make the smallest possible change we'd be best off keeping the error as data and adding the new runParser' (or runParserWithPosition) function only.

@JordanMartinez
Copy link
Contributor Author

That aside, this PR would already be a breaking change because of switching from data to newtype. If we wanted to try and make the smallest possible change we'd be best off keeping the error as data and adding the new runParser' (or runParserWithPosition) function only.

To clarify, how would it be a breaking change? AFAICT, the only thing changing is the runtime code, which isn't supposed to be touched by people anyway. I didn't add a Newtype instance. The data constructor was and still is exported, so pattern matching will still work.

@JordanMartinez
Copy link
Contributor Author

I'm a little curious what you think of switching to the parsing approach.

I would prefer the parsing approach. I did it this way because it ensures this code isn't breaking. It might be worth it to make a release with this change as is, and then a breaking changes release that switches completely to the parsing approach. That way, people can still benefit from other work done here (e.g. the Monoid instance).

@thomashoneyman
Copy link
Contributor

AFAICT, the only thing changing is the runtime code, which isn't supposed to be touched by people anyway. I didn't add a Newtype instance. The data constructor was and still is exported, so pattern matching will still work.

I'm sorry, you're correct. It's not a breaking change; any pattern-matching will continue working.

@thomashoneyman
Copy link
Contributor

That way, people can still benefit from other work done here (e.g. the Monoid instance).

That can still be released separately, as it's already merged. I think it's also fine to release the newtype change before any breaking changes. I might not want to release a runParser' function only to remove it again in the next version, though, as that feels a bit abrupt. But I'm not firm on that.

@JordanMartinez
Copy link
Contributor Author

Should I submit a different PR that just does the newtype change? I included it here because it was something I noticed when working on this one.

@thomashoneyman
Copy link
Contributor

Yea, perhaps do that so that this PR can focus only on the error type discussion. I'm happy to merge the newtype change.

@JordanMartinez
Copy link
Contributor Author

I've updated this PR to remove the newtype change commit

@JordanMartinez
Copy link
Contributor Author

While I would like to hear what others say, I would argue that we make this breaking change now rather than later for two reasons. First, most people expect runParser to also show where in the String an error occurred because otherwise the error message itself is relatively useless when trying to debug the parser. Second, we'll likely be making other breaking changes, so I think it would be better to do this now rather than later.

@hdgarrood
Copy link
Contributor

I think changing ParseError to include a position as well as a message sounds sensible. That sounds like the kind of breaking change which would be worth doing, especially since we are about to have a round of breaking changes anyway for the upcoming 0.14 release (I’d suggest waiting until then to release this change).

@thomashoneyman
Copy link
Contributor

Three in favor is pretty good. Let's go ahead and move to that representation, then, and it can be released as a breaking change along with anything else for 0.14 / anything from #59.

@JordanMartinez
Copy link
Contributor Author

I've rebased on master, then made runParser' the new definition for runParser.

@thomashoneyman
Copy link
Contributor

I was actually thinking we'd make a larger change: update the ParseError type to contain the position and error message, the same way as is done in the parsing library. Then, runParser would continue to have a ParseError in its type, except it would now have more information. We would also provide a printParseError (like printJsonDecodeError) which turns that parse error into a pretty-printed error message.

@JordanMartinez
Copy link
Contributor Author

How should that be done when #48 is involved?

@thomashoneyman
Copy link
Contributor

I'm not sure. Perhaps this change could be rolled in to #59?

@hdgarrood
Copy link
Contributor

I think this change is orthogonal to #48, so the question is probably just "which one would you rather do first?"

@hdgarrood
Copy link
Contributor

I would say, though, that I'd recommend keeping the scope of #59 narrow, because code points versus code units is a subtle issue which will need a bit of care taken over it.

@thomashoneyman thomashoneyman added merge before 0.14 purs-0.14 A reminder to address this issue or merge this PR before we release PureScript v0.14.0 and removed merge before 0.14 labels Dec 3, 2020
@JordanMartinez
Copy link
Contributor Author

If no one is against this, I'll merge this one in before merging #63.

@hdgarrood
Copy link
Contributor

hdgarrood commented Dec 5, 2020

I thought we had reached a consensus for the approach where the error type is a record containing a message and a position instead?

@JordanMartinez
Copy link
Contributor Author

It's been so long, I couldn't remember. I'll have to read through the thread again, but most likely.

@thomashoneyman
Copy link
Contributor

thomashoneyman commented Dec 8, 2020

This message is a good summary. We're basically doing the same thing from parsing, except instead of using data we'll use a record.

#57 (comment)

@JordanMartinez
Copy link
Contributor Author

How's this?


-- | A parser is represented as a function which takes a pair of
-- | continuations for failure and success.
newtype Parser a = Parser (PosString -> Either { pos :: Pos, error :: ParseError } { result :: a, suffix :: PosString })
newtype Parser a = Parser (PosString -> Either ParseError { result :: a, suffix :: PosString })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could potentially create a type alias for this, too:

Suggested change
newtype Parser a = Parser (PosString -> Either ParseError { result :: a, suffix :: PosString })
type ParseResult a = { result :: a, suffix :: PosString }
newtype Parser a = Parser (PosString -> Either ParseError (ParseResult a))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same record is used in several places, so this would just reduce a bit of duplication. It's not necessary, though.


-- | A parser is represented as a function which takes a pair of
-- | continuations for failure and success.
newtype Parser a = Parser (PosString -> Either { pos :: Pos, error :: ParseError } { result :: a, suffix :: PosString })
newtype Parser a = Parser (PosString -> Either ParseError { result :: a, suffix :: PosString })

-- | Run a parser by providing success and failure continuations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These documentation comments seem a bit funky to me; they keep talking about continuations, but that's not what I interpret from this function. Am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the same thing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this can be removed.

Comment on lines 43 to 47
runParser :: forall a. Parser a -> String -> Either String a
runParser (Parser p) s = bimap printError _.result (p { str: s, pos: 0 })
where
printError :: ParseError -> String
printError rec = rec.error <> "; pos = " <> show rec.pos
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about having runParser be forall a. Parser a -> String -> Either ParseError a, and then having a separate printError function? That's a bit more flexible than the current definition (you can recover the current definition with lmap printError).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed in latest commits.

@JordanMartinez
Copy link
Contributor Author

CI now passes.


-- | A parser is represented as a function which takes a pair of
-- | continuations for failure and success.
newtype Parser a = Parser (PosString -> Either { pos :: Pos, error :: ParseError } { result :: a, suffix :: PosString })
newtype Parser a = Parser (PosString -> Either ParseError { result :: a, suffix :: PosString })

-- | Run a parser by providing success and failure continuations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this can be removed.

@JordanMartinez
Copy link
Contributor Author

How about the latest changes to the docs I've made?

@JordanMartinez JordanMartinez merged commit d3be1b0 into purescript-contrib:main Dec 9, 2020
@JordanMartinez JordanMartinez deleted the improveDefaultError branch December 9, 2020 02:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
purs-0.14 A reminder to address this issue or merge this PR before we release PureScript v0.14.0
Development

Successfully merging this pull request may close these issues.

ParseError is data when it should be newtype Position in error
3 participants