Parse error recovery and incrementality for GHC #182

Merged
merged 1 commit into from
Feb 4, 2024
48 changes: 48 additions & 0 deletions content/ideas/parse-error-recovery.md
@@ -0,0 +1,48 @@
---
title: Parse error recovery and incrementality for GHC
---

GHC is able to report multiple type errors at once, yet a single parser
error brings the whole compilation pipeline to a halt; see [this tech
proposal](https://github.com/haskellfoundation/tech-proposals/pull/63).

One significant obstacle is the parser generator
[`happy`](https://github.com/simonmar/happy/), which GHC relies on for versatile
and fast parsing: the current error-handling architecture exposed by `happy`
aborts on the first parse error without producing even a partial syntax tree.

This [draft PR](https://github.com/haskell/happy/pull/272)
improves `happy` so that it resumes parsing after reporting a parse error, but
it lacks documentation, introduces a number of breaking changes and badly needs
cleanup.
Nevertheless, it is technically complete, passes the test suite and has already
been [tried on GHC as a proof of concept](https://gitlab.haskell.org/ghc/ghc/-/merge_requests/11990).

The goal of this project is to take over the pull request to `happy` so that it
can be merged, and then use the improved `happy` to generate multiple and better
parse error messages in GHC.

There are several stretch goals:

* `happy` could further be improved to pass a closure of its
parse state to reduction actions, so as to enable incremental parsing.

Review comment (Contributor): ... and use this in GHC?

* Improve `happy` so that it provides a convenient and encapsulated way to
introspect the LALR item stack, for example to identify bracketing
productions such as `'(' expr . ')'` in order to report mismatched brackets.
There is a [hacky GHC Merge Request](https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4711)
that tries to achieve the same without buy-in from `happy`.

Review comment (Contributor): ... and use this in GHC?

* Improve `happy`'s code base, which by now is over 25 years old.

Review comment (Contributor): Might be helpful to identify a couple of concrete things: just "modernisation", or comments, or tests, or what?

Reply (Contributor Author): Indeed, it is rather hard to spell out what would make a substantial improvement. Certainly it appears that the modularisation effort has left `happy` in a rather rough state. See #183.


**Potential Mentors**:
Sebastian Graf

**Difficulty**:
Medium, given that the technical bits have been drafted out.
Still, the student would be required to familiarise themselves with the basics
of LALR parsing theory in order to contribute documentation.

**Size**:
175 hours for merging the PR and beginning to improve GHC; 350 hours can
easily be spent if the stretch goals are pursued as well, for a significant
improvement to GHC.