Skip to content

Compactableand Filterable instances #85

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

artemisSystem
Copy link

@artemisSystem artemisSystem commented Jan 16, 2020

Fixes #84

What does this pull request do?

Implements Compactableand Filterable instances for ParserT

Where should the reviewer start?

instance implementations in src/Text/Parsing/Parser.purs

How should this be manually tested?

Tests provided in test/Main.purs

Other Notes:

Maybe we could add an example showing how to use filterMap to apply functions with type a -> Maybe a to the result of a parser "for free"

@thomashoneyman thomashoneyman self-assigned this Jan 21, 2020
Copy link
Contributor

@thomashoneyman thomashoneyman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening a PR, @3ddyy! I've stepped through the code and commented on a few things which stood out to me. While I've used purescript-parsing before, I perform general maintenance on this library and am not a contributing developer myself. With that in mind I'd like to hear from another maintainer before merging.

I'd also appreciate a quick opinion from @LiamGoodacre as the author of purescript-filterable about these instances (if you have a moment, Liam!).

In addition, while I can see what this is adding to the library, I'd appreciate hearing if you have any idea what possible drawbacks these instances may have @3ddyy. For example, do these have worse error messages compared to writing a parser out if the instances didn't exist? If you aren't aware of any drawbacks that's fine, I'm just curious.

, right: do
state <- get
p1 >>= case _ of
Left r -> put state *> fail "Parse returned Left"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here (and elsewhere) the state <- get bind occurs ahead of the case statement, and then the state is only used in one branch. Is there any reason not to only perform the get in the branches where it is required? ie.

{ ...
, right: p1 >>= case _ of
    Left _ -> do
      state <- get
      put state *> fail "Parse returned Left"
}

For that matter, do you mind briefly letting me know the reason for getting and then putting the state? Is there any reason to not just call fail?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point in getting the state is that it's done before the parser p1 is run. The way you changed it makes it so that the getting and setting happen right after eachother, which would be the same as just failing right away and not doing anything else. The reason i'm getting the state before running the parser is that when the parser fails, I believe the right move is to error at the position from before the parser. Let's say you parse "34" with digit *> filter (_ == 1) digit (digit is from Test.Main and just parses a single digit 1-9). It will error at position 2, as it should. (Reason being that the character 4 is what's wrong). If we do not get the state beforehand and set it when erroring, the error would report position 3, which would be incorrect. The reason for grabbing the whole state is that if the result is filtered away, I want the parser to act as if it wasn't run at all, except for erroring of course. It might be better to just get the position though, and then fail with that position? (not altering other parts of the state.) Then it would be possible to decide whether to have it backtrack or not when using it with <|>, simply by putting try in front of it or not.

@artemisSystem
Copy link
Author

artemisSystem commented Jan 27, 2020

When it comes to error messages, they are a bit worse than what you would get otherwise. For this reason, I've provided two functions in Text.Parsing.Parser.Combinators, filterMapWithError and filterMapEither to help with this. filterMapWithError is just filterMap that lets you provide a string in case the filter fails. filterMapEither is the same, but instead of providing an extra string, it's provided in the Left constructor. I notice now when i'm looking over the two functions, that the implementations seem to have some errors in case of no failure, so i'll have another look at that. I will also have to take another look at the tests (and the combinator implementations) if we decide that only resetting the position is best in the case of a failed filter.

@artemisSystem
Copy link
Author

> filterMapWithError Just "" digit # runParser "12"
(Right 1)

> filterMapWithError Just "" digit *> digit # runParser "12"
(Right 1)

> filterMapWithError Just "" digit *> digit *> digit # runParser "12"
(Right 2)

> filterMapWithError Just "" digit *> digit *> digit *> digit # runParser "12"
(Left (ParseError "Expected \"9\"" (Position { line: 1, column: 3 })))

Yep, when filterMapWithError doesn't fail, it doesn't consume what it's supposed to, i'll fix that once we decide on whether to reset just the position or the entire state (the correct implementation would be slightly different, i think).

@artemisSystem
Copy link
Author

Just waiting on an opinion here: Is it best to reset the parser entirely when something is filtered away (act like it never ran, but just failed instantly)? Or is it better to just reset the position, and allow people to backtrack manually using try and <|>? I'm leaning towards the latter currently, as it seems more in touch with the rest of the package.

@thomashoneyman thomashoneyman removed their assignment Jun 19, 2020
@thomashoneyman thomashoneyman changed the base branch from master to main October 6, 2020 02:58
@jamesdbrock
Copy link
Member

jamesdbrock commented Oct 8, 2021

Hi, thanks for this PR @artemisSystem ! Sorry this has been open for ten months. I hope you're still interested in this.

I think I understand the problem you're trying to solve here, and it's an important problem, and this is a good solution.

The problem is that sometimes we want to fail in the parsing monad for reasons not directly related to grammar, but more related to “validation.” And this is a very good idea! This is exactly the way we should be thinking, according to Alexis King.

I tried to solve this problem myself once with these three little helper combinators. Your solution considers something which I hadn't thought about, which is that if we're going to fail as a result of parsing a thing, then we should backtrack the parsing state and report the error at the beginning of the thing.

I would like to merge this, or something like it, but I want to talk first.

Your functions filterMapWithError and filterMapEither are super useful and we definitely want something like this.

Your implementations for filterMap and partitionMap are used to define filterMapWithError and filterMapEither. And filterMap seems useful. But I can't imagine anyone would ever use partitionMap for anything other than implementing filterMapEither. It's not obvious to me what your implementations for compact, separate, partition mean. I can read them and see what they do, but semantically I don't think I would ever use them in a parser? And I don't see any other functions which require a Filterable instance which would be useful when parsing https://pursuit.purescript.org/packages/purescript-filterable/4.0.1/docs/Data.Filterable#t:Filterable .

So I wonder if these Compactable and Filterable typeclasses are actually not good abstractions here? Your filterMapWithError and filterMapEither functions can be written in terms of those typeclass instances, but they could also be written without them, right?

@jamesdbrock
Copy link
Member

Just waiting on an opinion here: Is it best to reset the parser entirely when something is filtered away (act like it never ran, but just failed instantly)? Or is it better to just reset the position, and allow people to backtrack manually using try and <|>? I'm leaning towards the latter currently, as it seems more in touch with the rest of the package.

This is such a good question and I don't know the answer. We have to think about this.

@jamesdbrock
Copy link
Member

Oh your linked issue has a lot more discussion about this, reading now.

@artemisSystem
Copy link
Author

Being purely pedantic, it's actually been 22 months, but no worries about the time since i'd completely forgotten about this PR myself 😁 (and the ball's been in my court).

I think the original reason i submitted this PR was just the fact that a lawful Filterable instance was possible, and filterMap was a useful function, so i thought "why not add the instances, for completeness' sake".

From what i remember, both of the implementations of the extra functions (filterMapWithError and filterMapEither) are broken and do wrong things in the case of an error or a success, i don't remember which. In addition to that, it doesn't look like i've implemented the changes i suggested in my comment right before yours.

I think I'm unlikely to spend more time on this in the near future, but you're welcome to work off my changes if you want to. I see that this PR is from back in the psc-package days, so it might be best if this PR is rebased and re-opened anyway

P.S. There's some more discussion in #84 in case you haven't seen that. My opinion is that it's fine to have Compactable and Filterable instances, they might come in handy every now and then, especially if the example using <?> to annotate an error (in #84) works (I'm not sure whether it does).

P.P.S.

Just waiting on an opinion here: Is it best to reset the parser entirely when something is filtered away (act like it never ran, but just failed instantly)? Or is it better to just reset the position, and allow people to backtrack manually using try and <|>? I'm leaning towards the latter currently, as it seems more in touch with the rest of the package.

This is such a good question and I don't know the answer. We have to think about this.

I think the latter makes more sense to best align with the rest of the package, which i think Liam is agreeing with via the thumbs up reaction.

@jamesdbrock
Copy link
Member

I think I'm unlikely to spend more time on this in the near future, but you're welcome to work off my changes if you want to.

Yeah I will, there are good ideas in here. Also some time in the last 22 months filterable has moved from LiamGoodacre's Github to https://github.com/purescript/purescript-filterable , so that changes things too.

@jamesdbrock
Copy link
Member

Closing this in favor of #212, see #84 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Implement Compactable and Filterable for ParserT?
3 participants