Performance - reduce urljoin/urldefrag overhead #202

Closed
wants to merge 5 commits into from

Contributor

dnephin commented Feb 28, 2015

Based on the work in #182 (I cherry-picked a few commits) for issue #158. Also happens to fix the problem described in #201 at the same time.

Rebased on master, fixed tests and style checks.

Removed base_uri since it was not really being used now that there is a stack.

I hope this is a smaller chunk to review.

ankostis and others added 3 commits February 27, 2015 19:04
…g by keeping

fragments separated from URL (and avoid redundant frag/defrag).
Conflicts:
	jsonschema/tests/test_benchmarks.py

issue python-jsonschema#158: Use try-finally to ensure resolver scopes_stack empty when
iteration breaks (no detectable performance penalty).

* Replace non-python-2.6 DefragResult with named-tuple.
* Add test-case checking scopes_stack empty.
Conflicts:
	jsonschema/tests/test_validators.py
	jsonschema/validators.py
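
To illustrate the two bullet points in the commit message above, here is a minimal sketch, not the code from these commits. It assumes Python 3; `Defragged`, `defrag`, `Resolver`, and `iter_with_scope` are hypothetical names chosen for illustration. It shows a plain named tuple holding a defragmented URL, and a generator that uses try-finally so the scope stack is emptied even when the consumer stops iterating early.

```python
from collections import namedtuple
from urllib.parse import urldefrag

# Hypothetical stand-in for the named tuple mentioned in the commit message.
Defragged = namedtuple("Defragged", ["url", "fragment"])


def defrag(uri):
    """Split a URI into its URL and fragment parts."""
    url, fragment = urldefrag(uri)
    return Defragged(url, fragment)


class Resolver(object):
    """Hypothetical resolver that only tracks a stack of scopes."""

    def __init__(self):
        self.scopes_stack = []


def iter_with_scope(resolver, scope, items):
    """Yield items while `scope` is pushed on the resolver's stack."""
    resolver.scopes_stack.append(scope)
    try:
        for item in items:
            yield item
    finally:
        # Runs on normal exhaustion and also when the generator is closed
        # early (for example after the consumer breaks out of a loop).
        resolver.scopes_stack.pop()


resolver = Resolver()
errors = iter_with_scope(resolver, "http://example.com/schema.json", ["a", "b"])
next(errors)     # the scope is on the stack while iteration is in progress
errors.close()   # stop early; the finally clause pops the scope
print(resolver.scopes_stack)  # prints []
```

Calling `close()` explicitly makes the cleanup deterministic. Without it, the finally clause runs whenever the abandoned generator is finalized, which is immediate on CPython but may be delayed on other interpreters.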
Member

Julian commented Mar 1, 2015

Awesome, definitely helps, I'll give a shot at looking this over.

I mentioned in the other ticket I would want to maintain backwards compatibility for APIs even if they're not used internally anymore. I haven't checked carefully, but I expect that to be fairly easy to add back for this subset?

A benchmark, even a simplistic one, would also be lovely for anything performance related if possible.

Contributor Author

dnephin commented Mar 1, 2015

> I mentioned in the other ticket I would want to maintain backwards compatibility for APIs even if they're not used internally anymore

Cool, I believe the only thing that was removed was RefResolver.in_scope(). It is easy enough to restore that with the stack implementation.
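
As a rough sketch of how an `in_scope()` context manager can sit on top of a scope stack (this is an illustration, not the code in this branch; `StackResolver` and its attribute names are assumptions):

```python
import contextlib


class StackResolver(object):
    """Hypothetical resolver; not the actual RefResolver implementation."""

    def __init__(self, base_uri=""):
        # The base URI is simply the bottom entry of the stack.
        self._scopes_stack = [base_uri]

    @property
    def resolution_scope(self):
        """The scope currently in effect (top of the stack)."""
        return self._scopes_stack[-1]

    def push_scope(self, scope):
        self._scopes_stack.append(scope)

    def pop_scope(self):
        self._scopes_stack.pop()

    @contextlib.contextmanager
    def in_scope(self, scope):
        """Backwards-compatible wrapper built on push_scope/pop_scope."""
        self.push_scope(scope)
        try:
            yield
        finally:
            # Restore the previous scope even if the body raised.
            self.pop_scope()
```

Callers that already write `with resolver.in_scope(uri): ...` would keep working under this arrangement, while internal code could call `push_scope` and `pop_scope` directly.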

> A benchmark, even a simplistic one, would also be lovely for anything performance related

I was using a very simple one myself to profile the changes, so it should be no problem to include it.

I actually have another branch which uses a cache around some of the slower operations and is showing another significant performance increase on top of these changes. I'll see about getting all of these changes together into a single PR.

Contributor Author

dnephin commented Mar 1, 2015

I've restored in_scope() which I believe is the only breaking API change, and added bench.py.

Using the example schemas in the bench.py docstring,

master: ~56ms
this branch: ~44ms

I'm going to experiment more with the caching branch.
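
For readers who want to reproduce this kind of measurement, a simplistic benchmark could look like the sketch below. It is not the `bench.py` from this branch: `resolve` and `best_time_ms` are hypothetical helpers, the URLs are made up, and it times only a `urljoin` followed by a `urldefrag` rather than a full schema validation. It assumes Python 3.

```python
import timeit
from urllib.parse import urldefrag, urljoin


def resolve(base, ref):
    """Join a reference against a base URI, then split off the fragment."""
    return urldefrag(urljoin(base, ref))


def best_time_ms(func, number=1000, repeat=5):
    """Return the best per-call time of `func` in milliseconds."""
    timings = timeit.repeat(func, number=number, repeat=repeat)
    # Each timing covers `number` calls; the minimum is the least noisy.
    return min(timings) / number * 1000.0


if __name__ == "__main__":
    ms = best_time_ms(
        lambda: resolve("http://example.com/schemas/root.json",
                        "#/definitions/item")
    )
    print("urljoin + urldefrag: %.6f ms per call" % ms)
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; the absolute numbers will differ between machines and interpreters.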

Contributor Author

dnephin commented Mar 1, 2015

With caching this is down to ~21ms. I think I'm going to get that branch cleaned up and replace this review with it.
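
One way to put a cache around a slow, pure operation such as `urljoin` is sketched below. This is an assumption about the general approach, not a description of what the caching branch does; `cached_urljoin` is a hypothetical name, and the sketch assumes Python 3, where `functools.lru_cache` is part of the standard library.

```python
from functools import lru_cache
from urllib.parse import urljoin


@lru_cache(maxsize=1024)
def cached_urljoin(base, ref):
    """Memoized urljoin; safe because the result depends only on the inputs."""
    return urljoin(base, ref)


# Repeated joins of the same (base, ref) pair are served from the cache.
first = cached_urljoin("http://example.com/schemas/root.json", "item.json")
second = cached_urljoin("http://example.com/schemas/root.json", "item.json")
print(first == second)              # prints True
print(cached_urljoin.cache_info())  # reports hits, misses, and cache size
```

Validation tends to resolve the same handful of references many times, which is the situation where a bounded cache like this pays off. The `maxsize` value here is arbitrary.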

Contributor

ankostis commented Mar 1, 2015

A big thank you, @dnephin, for taking this on!

Member

Julian commented Mar 1, 2015

Definitely, appreciated.

Running your benchmark on a warmed up PyPy basically yields equivalent times on my machine, which means this doesn't really speed up anything there (probably as expected). I'm generally OK with CPython-only speed improvements (which I assume is where you're seeing improvement) as long as they don't slow things down on PyPy or introduce other complexity, so it seems like we're on the right track here. I'll play a bit more with this until you've got your caching branch as well; this is small enough for me to have a shot at reviewing right now :)

Contributor Author

dnephin commented Mar 2, 2015

Fair point, I wouldn't expect to see much improvement in PyPy. I've opened #203 since I rebased over a bunch of these smaller commits.

dnephin closed this Mar 2, 2015
3 participants