[stdlib] Set, Dictionary: Hash table unification, general cleanup before ABI stability #19213
@swift-ci please smoke benchmark
My plan is to land this in separate parts, once the following items are done:
ed50369 to 5bed9a8
@swift-ci please smoke benchmark
@swift-ci please smoke test
Oops, the lldb failures are to be expected.
Build comment file: Build failed before running benchmark.
f3f964f to 3d6666c
The lldb tests will fail at the moment. @swift-ci please test
@swift-ci please smoke benchmark
The latest changes revert the experimental metadata layout changes -- the SIMD tricks they employed had excellent throughput, but their latency was too high for the kind of short chains that occur in normal hash tables. In the latest version [...] Some of the SIMD-in-a-register tricks still survive in the [...] I split out some optimizations and bugfixes into separate PRs, to be submitted after this lands.
Build failed
I haven't looked at the code, I'm just curious. Was the SIMD code x86 specific or did that work on ARM too?
Build comment file: Build failed before running benchmark.
The experimental metadata layout stored 8 extra bits of the hash value for each entry in the table. These were used to reduce the cost of collisions. The "SIMD" in this case was done entirely using general-purpose registers -- to process the metadata table, I treated a regular UInt64 as a vector of 8 bytes, so that I could search hash lookup chains by looking at 8 entries at once. This was a significant boost for long chains, but unfortunately, the average size of a lookup chain is between 1 and 2.5 -- so there was just not enough data for SIMD's advantages to kick in, and [...]
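To make the "UInt64 as a vector of 8 bytes" idea concrete, here is a minimal sketch of a SIMD-within-a-register byte search. It is not the code from this PR: the function name `matchingLanes(in:tag:)` and the choice of bit trick are assumptions made for illustration only.

```swift
/// Returns the byte lanes (0 = least significant byte) of `word` whose
/// value equals `tag`. Hypothetical helper, not part of the stdlib.
func matchingLanes(in word: UInt64, tag: UInt8) -> [Int] {
    let ones: UInt64 = 0x0101_0101_0101_0101
    let low7: UInt64 = 0x7F7F_7F7F_7F7F_7F7F

    // Broadcast `tag` into all 8 lanes and XOR: matching lanes become 0x00.
    let diff = word ^ (ones &* UInt64(tag))

    // Exact zero-byte test: the top bit of a lane ends up set
    // if and only if that lane of `diff` is zero.
    var hits = ~(((diff & low7) &+ low7) | diff | low7)

    // Collect lane numbers by peeling off the lowest set bit each time.
    var lanes: [Int] = []
    while hits != 0 {
        lanes.append(hits.trailingZeroBitCount / 8)
        hits &= hits &- 1
    }
    return lanes
}
```

All 8 lanes are compared with a handful of register operations, which pays off only when a chain actually has several entries to compare.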
Build failed
Thanks for the explanation! What was the cause of slowdown in [...]
Yeah, processing 8x more metadata was much more costly than I expected. The entire point of that exercise was to allow us to increase the max load factor, but SIMD tricks weren't enough to paper over the problems with that, especially at table sizes < 2k. I also tried Robin Hood hashing etc., but that had other drawbacks. The simple algorithm the stdlib's hashed collections use seems very difficult to beat on anything other than memory use.
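For readers unfamiliar with the "simple algorithm", the following is a rough sketch of open addressing with linear probing over a power-of-two bucket array. The type `LinearProbingSet`, its `Int`-only keys, and the 3/4 growth threshold are illustrative assumptions; the stdlib's real storage is considerably more involved.

```swift
/// Toy open-addressing set with linear probing (illustrative only).
struct LinearProbingSet {
    private var slots: [Int?]
    private(set) var count = 0

    init(scale: Int = 3) {
        // Bucket count is always a power of two, so masking replaces modulo.
        slots = Array(repeating: nil, count: 1 << scale)
    }

    /// Walks the chain starting at the ideal bucket until it finds `key`
    /// or reaches an empty slot.
    private func find(_ key: Int) -> (bucket: Int, found: Bool) {
        let mask = slots.count - 1
        var bucket = key.hashValue & mask
        while let occupant = slots[bucket] {
            if occupant == key { return (bucket, true) }
            bucket = (bucket + 1) & mask
        }
        return (bucket, false)
    }

    func contains(_ key: Int) -> Bool { find(key).found }

    mutating func insert(_ key: Int) {
        // Assumed maximum load factor of 3/4; grow before exceeding it.
        if (count + 1) * 4 > slots.count * 3 { grow() }
        let (bucket, found) = find(key)
        if !found {
            slots[bucket] = key
            count += 1
        }
    }

    private mutating func grow() {
        let old = slots
        slots = Array(repeating: nil, count: old.count * 2)
        count = 0
        for case let key? in old { insert(key) }
    }
}
```

Because the load factor stays bounded, there is always an empty slot to terminate a probe, and chains stay short, which is the regime the comment above describes.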
I could reproduce the benchmark build failure; looks like the defaultValue subscript isn't exercised enough in the test suite.
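For context, the subscript in question is the public `Dictionary` subscript that takes a `default:` argument. A small usage sketch (the function `letterCounts` is made up for this example) shows the in-place mutation path it exercises:

```swift
/// Counts how often each character occurs, mutating entries in place
/// through the `default:` subscript.
func letterCounts(_ text: String) -> [Character: Int] {
    var counts: [Character: Int] = [:]
    for ch in text {
        // Inserts 0 if the key is absent, then increments in place.
        counts[ch, default: 0] += 1
    }
    return counts
}
```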
Isn't it possible that your experimental design would outperform the simple algorithm on a different architecture? I'm assuming you've tested on x86. How did it look on ARM? (Are we tracking benchmarks on that, too?)
3d6666c to 6fbe3be
@swift-ci please smoke benchmark
@swift-ci please test
Build failed
Build failed
Previously it called index(forKey:) to implement the test, but that method is O(n) for dictionaries bridged over from Cocoa.
Don’t duplicate native paths; make the Cocoa case conditionally unreachable instead.
There is no need to fetch the value corresponding to the given key. Also implement the same for Values.subscript, although the impact there is marginal.
… to do Moving the uniqueness check before the first lookup was a bad idea. Revert it.
This should reduce code size and perhaps trigger more optimizations.
Signed/unsigned integer conversions check for unrepresentable values; this wasn’t recognized as impossible, so a trap got compiled into the Dictionary lookup path. Note to self: next time just use bitwise operations.
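The difference between a checked conversion and a bitwise one can be sketched as follows. The helper names are hypothetical and this is not the lookup code from the PR; `UInt(_:)`, `UInt(bitPattern:)`, and the `&` operator are standard Swift.

```swift
/// Checked conversion: `UInt(_:)` traps at run time if `value` is negative,
/// so the compiler must emit a range check unless it can prove it away.
func checkedBits(_ value: Int) -> UInt {
    return UInt(value)
}

/// Bit-pattern conversion: reinterprets the bits and can never trap.
func rawBits(_ value: Int) -> UInt {
    return UInt(bitPattern: value)
}

/// Maps a (possibly negative) hash value to a bucket using only a mask,
/// assuming the bucket count is `1 << scale`.
func bucketIndex(hashValue: Int, scale: Int) -> Int {
    let mask = (1 << scale) - 1
    return hashValue & mask
}
```

Masking with a non-negative value always yields a non-negative result, so no signedness check is needed on that path.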
It is currently unused, but it’s causing issues with the SIL/parse-stlib.sil test.
a544153 to a3712c3
- Have the hash buffer include a reference to the original hash storage instance, along with a copy of its _HashTable, so that its lifetime can be independent from the deferred bridging object.
- Convert _BridgingHashBuffer to a ManagedBuffer so that we can easily put reference-counted properties in its header.
a3712c3 to 9ae572f
@swift-ci please test
Rebased to latest master. The latest round of commits fixes all known regressions. This is finally getting ready to land.
I think it's time to get this one out of the way! ✨ Optimizations I pulled back from the original will come back as followup PRs soon.
(Resubmitted from #18790, which was getting unwieldy.)
rdar://problem/28187150
This huge PR refactors Set and Dictionary to simplify internal structure and to unify low-level hash table operations between the two data structures. It also paves the way for optimizations in upcoming PRs.
- Use a new _HashTable struct to unify low-level hashing operations across Set and Dictionary.
- Store the capacity directly inside the storage header. This allows the maximum load factor to be controlled by non-inlinable code, and improves insertion performance.
- Reorganize storage layout. Storage instances are now laid out like this:
  - _count
  - _capacity
  - _scale [...] 1 << _scale. (NEW)
  - _seed
  - _rawKeys
  - _rawValues
  - <BITMAP>
  - <KEYS>
  - <VALUES>

  The old two-word bitmap field has been removed.
- Introduce dedicated classes for the empty singleton instances. (_EmptySetSingleton, _EmptyDictionarySingleton.)
- Refactor deferred bridging. Add _BridgingHashBuffer, a standalone flat hash buffer for use in deferred bridging. Use it to improve memory use in cases where only one of Key or Value is bridged non-verbatim.
- Deferred bridging changes also eliminate the need for the TypedNative*Storage class, as well as for _NativeSet/_NativeDictionary’s support for non-hashable keys. Remove them.
- Rename storage classes as follows:

| Old name | New name |
| --- | --- |
| _RawNativeSetStorage | _RawSetStorage |
| _RawNativeDictionaryStorage | _RawDictionaryStorage |
| _TypedNativeSetStorage | (removed) |
| _TypedNativeDictionaryStorage | (removed) |
| _HashableTypedNativeSetStorage | _SetStorage |
| _HashableTypedNativeDictionaryStorage | _DictionaryStorage |
| (new) | _EmptySetSingleton |
| (new) | _EmptyDictionarySingleton |

- Clean up the implementations of Dictionary's various generalized accessors (_modify).

An lldb data formatter update will need to land before merging this PR. The corresponding changes to lldb's data formatters are already in place: apple/swift-lldb#927.
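
As a rough illustration of the header fields listed in the description, the sketch below models how a stored capacity could be derived from the scale. The type `StorageHeaderSketch`, its helper methods, and the 3/4 maximum load factor are assumptions for the example, not the PR's actual definitions.

```swift
/// Hypothetical model of the storage header fields named in the description.
struct StorageHeaderSketch {
    var count: Int      // number of occupied buckets
    var capacity: Int   // max count before a resize, stored directly
    var scale: Int      // log2 of the bucket count
    var seed: Int       // per-instance hashing seed

    /// The bucket count is always a power of two.
    var bucketCount: Int { return 1 << scale }

    /// Capacity for a given scale, assuming a 3/4 maximum load factor.
    static func capacity(forScale scale: Int) -> Int {
        return (1 << scale) * 3 / 4
    }

    /// Smallest scale whose capacity can hold `minimumCapacity` elements.
    static func scale(forCapacity minimumCapacity: Int) -> Int {
        var scale = 0
        while capacity(forScale: scale) < minimumCapacity {
            scale += 1
        }
        return scale
    }

    init(minimumCapacity: Int, seed: Int = 0) {
        self.scale = StorageHeaderSketch.scale(forCapacity: minimumCapacity)
        self.capacity = StorageHeaderSketch.capacity(forScale: self.scale)
        self.count = 0
        self.seed = seed
    }
}
```

Because the capacity is read from the header rather than recomputed from the bucket count in inlinable code, the load factor policy can change without affecting already-compiled clients, which is the point the description makes about non-inlinable code.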