Speed up hashing for GridQubit, LineQubit, and NamedQubit #6350
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@           Coverage Diff           @@
##           master    #6350   +/-  ##
=======================================
  Coverage   97.84%   97.84%
=======================================
  Files        1110     1110
  Lines       96656    96696    +40
=======================================
+ Hits        94572    94612    +40
  Misses       2084     2084
```

☔ View full report in Codecov by Sentry.
Updated benchmarks for the optimized `NamedQubit`. On master:

```
In [2]: nq = cirq.NamedQubit("abc")
In [3]: %timeit hash(nq)
220 ns ± 14.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [4]: %timeit hash(cirq.NamedQubit("abc"))
1.78 µs ± 13.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```

On this branch:

```
In [2]: nq = cirq.NamedQubit("abc")
In [3]: %timeit hash(nq)
162 ns ± 4.14 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [4]: %timeit hash(cirq.NamedQubit("abc"))
553 ns ± 1.16 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```
```python
def __hash__(self) -> int:
    if self._hash is None:
        self._hash = hash((self._row, self._col, self._dimension))
    return self._hash
```
Should we just compute this in the constructor? It should be immutable, right? That would remove the need for an if statement.
We could, but since construction is more common than hashing, I think it is better to keep the constructor as fast as possible. I played around a bit with adding a `__new__` method so we can actually cache instances instead of allocating new ones each time the constructor is called, but that interacts with pickling in strange ways that I haven't worked out yet.
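For illustration, here is a minimal sketch of what `__new__`-based instance caching could look like, including the `__reduce__` hook needed so that unpickling routes back through the cache instead of materializing a duplicate object. All names here (`CachedQubit`, `_cache`) are hypothetical, not Cirq's actual code:

```python
import pickle


class CachedQubit:
    """Interns instances so equal qubits share one object (and one hash).

    Illustrative sketch only; not Cirq's implementation.
    """

    _cache: dict = {}

    def __new__(cls, name: str):
        qubit = cls._cache.get(name)
        if qubit is None:
            qubit = super().__new__(cls)
            qubit._name = name
            # With interning, eager hashing is cheap: it happens once per
            # distinct qubit, no matter how many times the constructor runs.
            qubit._hash = hash((cls.__name__, name))
            cls._cache[name] = qubit
        return qubit

    def __hash__(self) -> int:
        return self._hash

    def __reduce__(self):
        # Default pickling would rebuild the object from its __dict__,
        # bypassing the cache. Returning (callable, args) makes unpickling
        # call CachedQubit(name) again, which hits the cache.
        return (type(self), (self._name,))


q1 = CachedQubit("abc")
q2 = CachedQubit("abc")
restored = pickle.loads(pickle.dumps(q1))
```

With this shape, `q1 is q2` and `restored is q1` both hold, which is exactly the pickle interaction that needs care when adding `__new__` caching.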
Here's a comparison of timings with lazy hashing (this PR) vs. eager hashing, where, as you suggest, the hash is computed in the constructor and there is no conditional in `__hash__`. Eager hashing does make hashing faster, but it slows down the constructor quite a bit:
| Operation | Lazy hash [ns] | Eager hash [ns] | t_eager / t_lazy |
|---|---|---|---|
| `%timeit cirq.GridQubit(1, 2)` | 294 | 430 | 1.46x |
| `%timeit hash(cirq.GridQubit(1, 2))` | 652 | 589 | 0.90x |
| `q = cirq.GridQubit(1, 2); %timeit hash(q)` | 163 | 135 | 0.83x |
| `%timeit cirq.LineQubit(3)` | 254 | 369 | 1.45x |
| `%timeit hash(cirq.LineQubit(3))` | 560 | 506 | 0.90x |
| `q = cirq.LineQubit(3); %timeit hash(q)` | 163 | 144 | 0.88x |
| `%timeit cirq.NamedQubit("abc")` | 249 | 361 | 1.45x |
| `%timeit hash(cirq.NamedQubit("abc"))` | 568 | 502 | 0.88x |
| `q = cirq.NamedQubit("abc"); %timeit hash(q)` | 159 | 137 | 0.86x |
So eager hashing speeds up the `__hash__` calls by 10-20%, but slows down the constructors by almost 50%. If we can figure out `__new__` to cache qid instances, then I think eager hashing would be a clear win, but for now I'd suggest we stick with lazy.
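The table above was produced with IPython's `%timeit`; the same lazy-vs-eager trade-off can be reproduced with the stdlib `timeit` module. The classes below are illustrative stand-ins, not Cirq's qubit types, and the absolute numbers depend on the machine:

```python
import timeit


class LazyQubit:
    """Defers hash computation until the first __hash__ call."""

    def __init__(self, x: int) -> None:
        self._x = x
        self._hash = None  # computed lazily on first use

    def __hash__(self) -> int:
        if self._hash is None:
            self._hash = hash((LazyQubit, self._x))
        return self._hash


class EagerQubit:
    """Pays the hash cost up front, in the constructor."""

    def __init__(self, x: int) -> None:
        self._x = x
        self._hash = hash((EagerQubit, self._x))

    def __hash__(self) -> int:
        return self._hash


n = 100_000
# Per-call constructor cost, in seconds; eager does strictly more work here.
t_lazy_ctor = timeit.timeit(lambda: LazyQubit(3), number=n) / n
t_eager_ctor = timeit.timeit(lambda: EagerQubit(3), number=n) / n

# Per-call cost of repeated hashing of one instance; eager skips the branch.
q_lazy, q_eager = LazyQubit(3), EagerQubit(3)
t_lazy_hash = timeit.timeit(lambda: hash(q_lazy), number=n) / n
t_eager_hash = timeit.timeit(lambda: hash(q_eager), number=n) / n
```

The interesting outputs are the two ratios: constructor time favors lazy, repeated-hash time favors eager, matching the direction of the table above.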
Ok, thanks for the thorough analysis. This makes sense.
This speeds up hashing on `GridQubit`, `GridQid`, `LineQubit`, and `LineQid`. Instead of using `_compat.cached_method`, we store an explicit `self._hash` attribute to cache the computed hash. We also make the constructors explicit in the concrete classes, with no need to call a superclass constructor.

We did some basic benchmarks to compare performance with current master. On this branch, `GridQubit.__hash__` dropped from 1.3 µs to 488 ns (a 2.66x speedup), while repeated hashing of the same grid qubit dropped from 150 ns to 121 ns (a 1.24x speedup). Similarly, `LineQubit.__hash__` dropped from 685 ns to 453 ns (a 1.51x speedup), while repeated hashing of the same line qubit dropped from 142 ns to 124 ns (a 1.15x speedup).

On master:

and on this branch:
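The explicit `self._hash` caching this PR describes can be sketched in isolation as follows. This is an illustrative stand-in (`MiniGridQubit` is a hypothetical name), not the actual Cirq source:

```python
from typing import Optional


class MiniGridQubit:
    """Sketch of lazy hash caching via an explicit attribute."""

    def __init__(self, row: int, col: int) -> None:
        self._row = row
        self._col = col
        # Constructor stays cheap: no hashing until someone asks for it.
        self._hash: Optional[int] = None

    def __hash__(self) -> int:
        if self._hash is None:
            self._hash = hash((self._row, self._col))
        return self._hash

    def __eq__(self, other: object) -> bool:
        # Keep __eq__ consistent with __hash__: equal qubits hash equally.
        return (
            isinstance(other, MiniGridQubit)
            and (self._row, self._col) == (other._row, other._col)
        )
```

Compared with a `cached_method` decorator, a plain attribute check avoids a wrapper-function call on every `hash()` invocation, which is where the per-call savings come from.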