Closed
Isn't pos redundant with buf now, since Iobuf has its own cursor?
But the owned variant of a Tendril doesn't.
Eventually I would like to replace BufferQueue with a rope, as discussed elsewhere.
Doesn't this mean that any long string of text (i.e. that spans multiple chunks) would have to fall back to a I guess that's a small price to pay for more efficient iteration.
Iobuf usage looks good to me (modulo comments)! I like this remix.
kmcallister added a commit to kmcallister/html5ever that referenced this pull request on Jun 10, 2015
Now #141.
kmcallister added a commit that referenced this pull request on Jun 16, 2015
kmcallister added a commit that referenced this pull request on Jun 25, 2015
This is based on #60 but with substantial changes. The biggest difference is that we only use shared buffers for the character runs found by `pop_except_from`. The majority of the remaining spans are single ASCII characters, which have their own fast path. Everything else is a `String` as before.

This branch also drops many of the micro-optimizations from #60. Unlike that PR, we leave the parser rules alone for the most part.
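To make the three kinds of span described above concrete, here is a rough sketch. It is not the code in this branch: `Span`, `next_span`, and `push_to` are hypothetical names, and `Rc<str>` stands in for the iobuf-backed shared buffer. The scanning function is only loosely modelled on what `pop_except_from` does.

```rust
use std::rc::Rc;

/// Hypothetical stand-in for the three kinds of span in the description.
#[derive(Debug, Clone, PartialEq)]
enum Span {
    /// A run of characters that shares the input chunk's buffer (no copy).
    Shared { buf: Rc<str>, start: usize, end: usize },
    /// A single ASCII character, stored inline (the fast path).
    Ascii(u8),
    /// Anything else, copied into an owned String as before.
    Owned(String),
}

impl Span {
    /// Append this span's text to `out`.
    fn push_to(&self, out: &mut String) {
        match self {
            Span::Shared { buf, start, end } => out.push_str(&buf[*start..*end]),
            Span::Ascii(b) => out.push(*b as char),
            Span::Owned(s) => out.push_str(s),
        }
    }
}

/// Take the next span from `buf` starting at byte offset `*pos`.
/// `stop` is the set of characters that end a run (assumed, e.g. '<' and '&').
fn next_span(buf: &Rc<str>, pos: &mut usize, stop: &[char]) -> Option<Span> {
    let s: &str = buf;
    let rest = &s[*pos..];
    let first = rest.chars().next()?;
    if stop.contains(&first) {
        // A stop character is returned on its own.
        *pos += first.len_utf8();
        return Some(if first.is_ascii() {
            Span::Ascii(first as u8)
        } else {
            Span::Owned(first.to_string())
        });
    }
    // Otherwise hand out the whole run up to the next stop character,
    // bumping the reference count instead of copying the bytes.
    let len = rest.find(|c: char| stop.contains(&c)).unwrap_or(rest.len());
    let start = *pos;
    *pos += len;
    Some(Span::Shared { buf: Rc::clone(buf), start, end: start + len })
}
```

In this sketch a long run costs one reference-count increment rather than an allocation, which is the effect the shared-buffer path is after.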
r? @Manishearth or @SimonSapin (general review)
r? @cgaebel (iobuf usage in `tendril.rs`)

Depending on the specific content and the I/O chunk size, this branch speeds up tokenization by up to a few percent. I did not see any significant performance regressions with sensible chunk sizes.
I have plans for further optimizations, including following up on the rustc bugs @cgaebel identified in #60.
The branch already achieves a significant drop in allocations and memory consumption:
(preliminary numbers)
Webapp spec, single page:
pre-zerocopy
zerocopy
Wikipedia (GotG from servo-static-suite)
pre-zerocopy
zerocopy