
[perf] Explore re-using XPathContext objects, and compiling XPath expressions #3266

@flavorjones

Two things I want to explore doing to try to improve the performance of XPath (and, transitively, CSS) searches:

  1. re-use XPathContext objects which are a little expensive to create
  2. expose libxml2's ability to compile XPath expressions

For (1), we need to be a bit careful:

  • XPathContext is not thread-safe
  • there is some state we need to set or un-set appropriately:
    • namespaces (via XPathContext#register_namespaces)
    • variables (via XPathContext#register_variable)
  • while preserving other state:
    • the nokogiri: prefix used for dynamic function binding
    • the nokogiri-builtin: prefix used for our performance-optimized builtin functions
    • the built-in xpath functions themselves

The performance improvement could be significant, though: see this response from the current libxml2 maintainer indicating that "best practice" is to keep one XPathContext per thread and re-use it.
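To make the state-handling concerns above concrete, here is a minimal pure-Ruby sketch of the "one context per thread, reset between searches" pattern. `StubContext`, `ContextCache`, and the placeholder URIs are hypothetical stand-ins invented for illustration, not Nokogiri's classes or a proposed implementation; only the method names `register_namespaces` and `register_variable` are borrowed from the list above.

```ruby
# Hypothetical stand-in for an XPath context; NOT Nokogiri's real class.
class StubContext
  attr_reader :namespaces, :variables

  def initialize
    # State that must survive every search. The prefixes mirror the ones
    # named above; the URIs are placeholders.
    @permanent = {
      "nokogiri" => "urn:example:dynamic",
      "nokogiri-builtin" => "urn:example:builtin",
    }
    @namespaces = @permanent.dup
    @variables = {}
  end

  def register_namespaces(map)
    @namespaces.merge!(map)
  end

  def register_variable(name, value)
    @variables[name] = value
  end

  # Drop per-search state but keep the permanent prefixes.
  def reset!
    @namespaces = @permanent.dup
    @variables.clear
  end
end

module ContextCache
  KEY = :example_xpath_context

  # Yields the cached context, creating it on first use, and always clears
  # per-search state afterwards, even if the block raises.
  # Thread.current[] is fiber-local storage, so the cached object is never
  # shared between threads.
  def self.with_context(namespaces: {}, variables: {})
    ctx = Thread.current[KEY] ||= StubContext.new
    ctx.register_namespaces(namespaces)
    variables.each { |name, value| ctx.register_variable(name, value) }
    yield ctx
  ensure
    ctx&.reset!
  end
end

ContextCache.with_context(namespaces: { "x" => "urn:x" }) do |ctx|
  ctx.namespaces.key?("x") # registered only for the duration of this block
end
```

This sketch is not re-entrant: a nested `with_context` call on the same thread would clear the outer call's namespaces and variables, which a real implementation would need to handle.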

The benchmark submitted by a user in #760 indicates a 4x(!) speedup on simple expressions by avoiding re-initializing an XPathContext object. The real-world speedup will likely be smaller, since cleaning up registered namespaces and variables adds some overhead, but it still seems like it would be a pretty decent improvement.

For (2), we'll need a new Ruby class to wrap the compiled expression represented by xmlXPathCompExprPtr, and a way to pass that into #xpath, but that seems like relatively straightforward work. (Note this API won't be available in JRuby.)

I'd like to get a rough benchmark ahead of time to see how much time this will save us, for both simple and complex expressions. After a brief search I couldn't find any prior results here.
