Create benchmark suite #71


Closed
ricardoV94 opened this issue Dec 1, 2022 · 5 comments
@ricardoV94
Member

Description

Something like https://github.com/numpy/numpy/tree/main/benchmarks could be useful to guide performance-related decisions and alert us to unintended regressions.

@ricardoV94 ricardoV94 added help wanted Extra attention is needed performance labels Dec 1, 2022
@ricardoV94 ricardoV94 pinned this issue Dec 1, 2022
@ricardoV94
Member Author

Using pytest-benchmark in #139

There's an option to publish graphs to the repo's GitHub Pages site, if anyone is interested in exploring: https://github.com/benchmark-action/github-action-benchmark#charts-on-github-pages-1

@ricardoV94
Member Author

We should also add benchmark tests for compilation times, not just runtime. That may be a bit tricky because we need to clear the cache between runs?
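One way around the caching problem is to point every benchmark run at a throwaway cache directory, so no run benefits from a previous run's artifacts. Below is a minimal stdlib sketch of that idea; `compile_with_cache` is a hypothetical stand-in for the real compilation step (e.g. building a `pytensor.function`), not actual PyTensor API:

```python
import shutil
import tempfile
import time


def compile_with_cache(cache_dir: str) -> float:
    """Stand-in for an expensive compilation step that writes to a cache.

    Returns elapsed wall-clock seconds.
    """
    start = time.perf_counter()
    # Simulated work: in a real benchmark this would compile a function
    # with its compiledir pointed at `cache_dir`.
    with open(f"{cache_dir}/artifact.txt", "w") as f:
        f.write("compiled" * 1000)
    return time.perf_counter() - start


def benchmark_cold_compile(runs: int = 3) -> list[float]:
    """Time each run against a fresh cache directory, so every run
    measures a cold compile ("clearing the cache between runs")."""
    timings = []
    for _ in range(runs):
        cache_dir = tempfile.mkdtemp()  # fresh, empty cache per run
        try:
            timings.append(compile_with_cache(cache_dir))
        finally:
            shutil.rmtree(cache_dir)  # discard the cache afterwards
    return timings
```

The same pattern drops into a pytest-benchmark test by passing the timed body as the callable to the `benchmark` fixture.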

@michaelosthege
Member

The benchmark suite is there, but it appears to be a bit flaky.
On this recent commit that only affects PyPI package builds, there was a performance alert: a6e7722#commitcomment-98337087

@twiecki twiecki unpinned this issue Aug 29, 2023
@ogencoglu

Are there any benchmarks of actual operations against pure numpy or numba? The first listed benefit of pytensor in the docs is "execution speed optimizations".

@ricardoV94
Member Author

ricardoV94 commented Apr 10, 2025

@ogencoglu We don't expect to be faster than pure Numba on equivalent computational graphs; in fact, one of the backends you can compile a PyTensor function to is Numba.

Being faster than numpy is easier, because everything in numpy is eager and there's Python overhead between successive calls. In the example below, pytensor does a single loop to add 3 to x, whereas numpy does 6 loops, with intermediate allocations.

import numpy as np

x_test = np.random.normal(size=(1000,))
%timeit x_test + 1 + 0 + 1 + 0 + 1 + 0
# 4.23 μs ± 49.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

import pytensor
import pytensor.tensor as pt
x = pt.vector("x")
fn = pytensor.function([x], x + 1 + 0 + 1 + 0 + 1 + 0)
%timeit fn(x_test)
# 2.29 μs ± 33.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Having said that, numpy's C code is sometimes so optimized (especially its use of SIMD) that it's hard to beat with our default C backend. But we have other backends (including numpy itself). The point of PyTensor is more to delay computation, so we can reason about it before committing to a specific way of carrying it out. We would rather leave most of the low-level work to other libraries.
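The fusion point above can be made concrete without PyTensor. In the eager chain, each binary op walks the data and allocates a fresh intermediate; a fusing backend can instead constant-fold the chain into a single addition written in one pass. A sketch of the two evaluation strategies in plain NumPy:

```python
import numpy as np

x = np.random.normal(size=1000)

# Eager evaluation: each `+` allocates a fresh intermediate array,
# so this chain walks over the data six times.
eager = x + 1 + 0 + 1 + 0 + 1 + 0

# What a fusing compiler can emit instead: fold the constants to a
# single addition and write the result in one pass, one allocation.
out = np.empty_like(x)
np.add(x, 3.0, out=out)

assert np.allclose(eager, out)  # same result, far less work
```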

This laziness also makes code more interoperable. Suppose someone defines a pdf expression for a probability distribution based on a numerically stable log-space form, exponentiating it at the end. Another user wants to reuse that expression to compute the logpdf, so they call it and take its log. If you do this with PyTensor code, the log(exp(...)) pair is removed during compilation and you get back the original stable form.
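The kind of rewrite described above can be illustrated with a toy expression graph. This is not PyTensor's actual API, just a self-contained sketch of how a graph compiler can cancel log(exp(x)) before any number is computed:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A symbolic input variable."""
    name: str


@dataclass(frozen=True)
class Apply:
    """An operation applied to symbolic inputs."""
    op: str        # e.g. "log", "exp"
    inputs: tuple


def log(x):
    return Apply("log", (x,))


def exp(x):
    return Apply("exp", (x,))


def simplify(node):
    """Recursively rewrite log(exp(x)) -> x."""
    if isinstance(node, Var):
        return node
    inputs = tuple(simplify(i) for i in node.inputs)
    if node.op == "log" and isinstance(inputs[0], Apply) and inputs[0].op == "exp":
        return inputs[0].inputs[0]  # cancel the numerically unstable round-trip
    return Apply(node.op, inputs)


# User A builds pdf = exp(stable_logpdf); user B takes its log again.
stable_logpdf = Var("stable_logpdf")
pdf = exp(stable_logpdf)
recovered = simplify(log(pdf))
assert recovered == stable_logpdf  # compilation recovers the stable form
```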


For benchmarking, it's more meaningful to look at specific use cases, because we are not trying to "beat library X"; we are trying to make it easier to generate clever code (or at least not too dumb) that can be passed along to whichever library the user chooses.
