Multiple entries with intuitively equal keys in PackratReader cache #45
Comments
We were using PagedSeqReader, which does not work well with Packrat parsing. When a PagedSeqReader is constructed, source is assigned seq, but since PagedSeq is not a subclass of java.lang.CharSequence, an implicit conversion takes place via Predef#SeqCharSequence, creating a new object. This happens every time PagedSeqReader#rest is called, which breaks packrat caching entirely. Luckily, PagedSeqReader seems to be entirely replaceable by CharSequenceReader, so I think I'll submit a pull request that deprecates PagedSeqReader and makes it equivalent to CharSequenceReader. Any objections?
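To illustrate why a freshly created wrapper defeats the cache, here is a minimal, self-contained sketch. It does not use the parser-combinator library: CharsWrapper, DemoPosition and DuplicateKeysDemo are hypothetical stand-ins, and the assumption is only that the wrapper (like any class that does not override equals) is compared by reference.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a wrapper created by an implicit conversion.
// It does not override equals, so two wrappers are equal only when they
// are the very same object.
final class CharsWrapper(chars: IndexedSeq[Char]) extends CharSequence {
  def length(): Int = chars.length
  def charAt(i: Int): Char = chars(i)
  def subSequence(start: Int, end: Int): CharSequence =
    new CharsWrapper(chars.slice(start, end))
  override def toString: String = chars.mkString
}

// Simplified position. As a case class, its equals compares both
// source and offset.
final case class DemoPosition(source: CharSequence, offset: Int)

object DuplicateKeysDemo {
  private val chars: IndexedSeq[Char] = "abc".toIndexedSeq

  // Wraps the same characters in a new object on every call,
  // mimicking a reader that converts its input again on each step.
  def freshPosition(offset: Int): DemoPosition =
    DemoPosition(new CharsWrapper(chars), offset)

  // Stores a result under "the same" key twice and reports how many
  // entries the cache holds afterwards.
  def entriesAfterTwoInserts(): Int = {
    val cache = mutable.HashMap.empty[(String, DemoPosition), String]
    cache(("expr", freshPosition(1))) = "first result"
    cache(("expr", freshPosition(1))) = "second result"
    cache.size
  }
}
```

Because the two positions hold different wrapper objects, the map treats them as distinct keys and keeps both entries, even though parser name, offset and underlying characters are identical.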
Well, it seems it's not entirely replaceable after all, as SeqCharSequence calls length and forces the PagedSeq out of laziness. I submitted a different fix instead.
Could there be test coverage for this?
I can surely write a test that checks for rd.source == rd.rest.source and likewise for drop, but is that a good test? Any idea how to test this more thoroughly?
I don't know. Maybe @gourlaysama wants to weigh in? If you don't see a good way to test it, I'm not opposed to merging the change anyway.
I've given it some more thought and I could probably create a parser that fails when applied twice. It wouldn't necessarily be a more thorough test in terms of coverage -- it may not test both rest and drop methods, but it would test that it behaves as intended. I'll give it a try tomorrow.
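One way to picture such a behavioural test is to count how often the uncached parser body runs when the same position is reached twice. The sketch below is only an illustration under assumptions: ToyReader and MemoTestSketch are hypothetical, the memo table is a plain map rather than the PackratReader cache, and it counts runs instead of failing on the second application.

```scala
import scala.collection.mutable

// Toy reader, not the library's Reader. When `rewrap` is true, each
// call to `rest` hands out a brand-new source object, which mimics the
// buggy behaviour; when false, the source object is shared.
final class ToyReader(val source: AnyRef, val offset: Int, rewrap: Boolean) {
  def rest: ToyReader = {
    val nextSource = if (rewrap) new Object else source
    new ToyReader(nextSource, offset + 1, rewrap)
  }
}

object MemoTestSketch {
  // Reaches the same logical position twice through separate `rest`
  // calls and returns how many times the uncached body was executed.
  def bodyRuns(rewrap: Boolean): Int = {
    var runs = 0
    val memo = mutable.HashMap.empty[(AnyRef, Int), String]

    def memoized(r: ToyReader): String =
      memo.getOrElseUpdate((r.source, r.offset), { runs += 1; "parsed" })

    val start = new ToyReader(new Object, 0, rewrap)
    memoized(start.rest)
    memoized(start.rest)
    runs
  }
}
```

With a shared source the body runs once and the second lookup is a cache hit; with a re-wrapped source every lookup misses. A test along these lines would assert that the run count stays at one.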
This is the best I can do, I think. I hope it's good enough. :-)
@gourlaysama ? (maybe he's on a lengthy European-style vacation...)
Fixed by #65.
Even though we use PackratParsers, we've observed exponential parsing time on one of our somewhat more complicated inputs. It turned out that strange entries get into the PackratReader cache; it contained more than one entry for seemingly equal (Parser, Position) pairs.

We've fixed this with a workaround that implements the equals method in the OffsetPosition case class so that it does not compare CharSequences. Another working option, which seems to be safer, is to compare the toString representations of the CharSequences, but that introduces a quite significant performance penalty.
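The two workarounds can be sketched roughly as follows. These are hypothetical simplified classes (OffsetOnlyPosition and TextComparingPosition), not the library's OffsetPosition, and the exact form of the reporter's patch is not shown in this thread, so treat this as one possible reading.

```scala
// Workaround 1 (sketch): equality ignores the source and looks only at
// the offset. Cheap, but positions from different inputs with the same
// offset also compare equal.
final case class OffsetOnlyPosition(source: CharSequence, offset: Int) {
  override def equals(other: Any): Boolean = other match {
    case that: OffsetOnlyPosition => offset == that.offset
    case _                        => false
  }
  override def hashCode: Int = offset
}

// Workaround 2 (sketch): equality also compares the textual content of
// the sources. Stricter, but toString may materialise the whole input on
// every comparison, which is where the performance penalty comes from.
final case class TextComparingPosition(source: CharSequence, offset: Int) {
  override def equals(other: Any): Boolean = other match {
    case that: TextComparingPosition =>
      offset == that.offset && source.toString == that.source.toString
    case _ => false
  }
  // Hash on the offset only, which stays consistent with equals
  // without computing a hash of the full text.
  override def hashCode: Int = offset
}

object WorkaroundDemo {
  // java.lang.StringBuilder is a CharSequence that does not override
  // equals, so separate instances with the same text are not equal.
  def abc(): CharSequence = new java.lang.StringBuilder("abc")
  def xyz(): CharSequence = new java.lang.StringBuilder("xyz")
}
```

Both variants make two positions over separately wrapped but identical input compare equal. Only the second one still distinguishes positions whose inputs actually differ.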