Create benchmark suite #71
Description
Something like https://github.com/numpy/numpy/tree/main/benchmarks could be useful to guide performance-related decisions and alert us to unintended regressions.
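The numpy benchmarks linked above are built on airspeed velocity (asv). A minimal sketch of what an asv-style suite could look like here; the class name and the benchmarked graph are hypothetical placeholders.

```python
import numpy as np
import pytensor
import pytensor.tensor as pt


class ElemwiseSuite:
    # asv runs setup() before timing each time_* method
    def setup(self):
        x = pt.vector("x")
        self.fn = pytensor.function([x], x + 1)
        self.x_test = np.random.normal(size=(1000,))

    def time_elemwise_add(self):
        self.fn(self.x_test)
```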
Comments
Using pytest-benchmark in #139. There's an option to obtain graphs on a GitHub Pages site for the repo, if anyone is interested in exploring: https://github.com/benchmark-action/github-action-benchmark#charts-on-github-pages-1
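For reference, a minimal pytest-benchmark test might look like the sketch below; the test name and the benchmarked graph are hypothetical, not taken from #139.

```python
import numpy as np
import pytensor
import pytensor.tensor as pt


def test_elemwise_add(benchmark):
    # Build and compile the graph outside the timed region
    x = pt.vector("x")
    fn = pytensor.function([x], x + 1)
    x_test = np.random.normal(size=(1000,))
    # The pytest-benchmark fixture calibrates and repeats the call
    benchmark(fn, x_test)
```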
We should also add benchmark tests for compilation times, not just runtime. That may be a bit tricky because we need to clear the cache between runs?
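A rough sketch of measuring compilation time by rebuilding the graph on each run; how to reliably clear PyTensor's compilation cache between runs (so every measurement is a cold compile) is left open here.

```python
import timeit

import pytensor
import pytensor.tensor as pt


def compile_graph():
    # Rebuild the graph so construction + compilation are measured;
    # the compilation cache may still make repeated runs artificially warm.
    x = pt.vector("x")
    return pytensor.function([x], ((x + 1) * 2).sum())


print(timeit.timeit(compile_graph, number=10) / 10, "s per compile")
```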
The benchmark suite is there, but it appears to be a bit flaky.
Are there any benchmarks for actual operations compared to pure NumPy or Numba? Because the first benefit of PyTensor is listed as "execution speed optimizations" in the docs.
@ogencoglu we don't expect to be faster than pure Numba on equivalent computational graphs; in fact, one of the backends you can compile a PyTensor function to is Numba. Being faster than NumPy is more trivial, because everything in NumPy is eager and there's all the Python overhead between successive calls. In the next example PyTensor will do a single loop to add 3 to x, whereas NumPy will do 6 loops, with intermediate allocations.

```python
import numpy as np

x_test = np.random.normal(size=(1000,))
%timeit x_test + 1 + 0 + 1 + 0 + 1 + 0
# 4.23 μs ± 49.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

import pytensor
import pytensor.tensor as pt

x = pt.vector("x")
fn = pytensor.function([x], x + 1 + 0 + 1 + 0 + 1 + 0)
%timeit fn(x_test)
# 2.29 μs ± 33.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
```

Having said that, sometimes NumPy's C code is so optimized (especially at making better use of SIMD) that it's hard for our default C backend to beat. But we have other backends (including NumPy). The point of PyTensor is more to delay computation so we can reason about it before committing to a specific way of carrying it out. We'd rather leave most of the low-level work to other libraries. This laziness makes code more interoperable. Perhaps someone defined a pdf expression for a probability distribution based on a numerically stable logarithmic form and exponentiates it at the end. Another user wants to reuse this expression to compute the logpdf, so they call it and take its log. If you do this with PyTensor code, the graph rewrites can cancel the exp/log round-trip and recover the stable form.

For benchmarking it's more meaningful to look at specific use cases, because we are not trying to "beat x library"; we are trying to make it easier to generate clever code (or at least not too dumb) that can be passed along to "x library" of the user's choice.
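To illustrate that exp/log point, a small sketch (the unnormalized Gaussian expression is just a placeholder of mine): the graph rewriter is expected to cancel the log(exp(...)) pair, so the compiled function never leaves log-space.

```python
import pytensor
import pytensor.tensor as pt

x = pt.vector("x")
logpdf = -0.5 * x**2            # numerically stable log-space form (unnormalized)
pdf = pt.exp(logpdf)            # the pdf expression someone exposes
recovered_logpdf = pt.log(pdf)  # another user takes its log

fn = pytensor.function([x], recovered_logpdf)
pytensor.dprint(fn)  # the rewritten graph should show no exp/log pair
```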