[stdlib] Set, Dictionary: Hash table unification, general cleanup before ABI stability #19213

lorentey · 2018-09-10T10:42:55Z

(Resubmitted from #18790, which was getting unwieldy.)

rdar://problem/28187150

This huge PR refactors Set and Dictionary to simplify internal structure and to unify low-level hash table operations between the two data structures. It also paves the way for optimizations in upcoming PRs.

Use a new _HashTable struct to unify low-level hashing operations across Set and Dictionary.
Store the capacity directly inside the storage header. This allows the maximum load factor to be controlled by non-inlinable code, and improves insertion performance.

Reorganize storage layout. Storage instances are now laid out like this:

Field	Description
`_count`	Count of occupied entries in the table
`_capacity`	Maximum count supported without resizing. (NEW)
`_scale`	The size class of the table. Number of buckets is `1 << _scale`. (NEW)
`_seed`	The hash seed for this instance.
`_rawKeys`	Pointer to the start of tail-allocated region used for storing keys.
`_rawValues`	Pointer to the start of tail-allocated region used for storing values.
`<BITMAP>`	Tail-allocated bitmap marking occupied entries
`<KEYS>`	Tail-allocated buffer for keys
`<VALUES>`	Tail-allocated buffer for values

The old two-word bitmap field has been removed.

Introduce dedicated classes for the empty singleton instances. (_EmptySetSingleton, _EmptyDictionarySingleton.)
Refactor deferred bridging. Add _BridgingHashBuffer, a standalone flat hash buffer for use in deferred bridging. Use it to improve memory use in cases where only one of Key or Value is bridged non-verbatim.
Deferred bridging changes also eliminate the need for the TypedNative*Storage class, as well as for _NativeSet/_NativeDictionary’s support for non-hashable keys. Remove them.

Rename storage classes as follows:

Old name	New name
`_RawNativeSetStorage`	`_RawSetStorage`
`_RawNativeDictionaryStorage`	`_RawDictionaryStorage`
`_TypedNativeSetStorage`	(removed)
`_TypedNativeDictionaryStorage`	(removed)
`_HashableTypedNativeSetStorage`	`_SetStorage`
`_HashableTypedNativeDictionaryStorage`	`_DictionaryStorage`
(new)	`_EmptySetSingleton`
(new)	`_EmptyDictionarySingleton`
The new names make it obvious at runtime that the new ivar layout is in use; this is useful for tools like lldb.

Clean up the implementations of Dictionary's various generalized accessors (_modify).

~~An lldb data formatter update will need to land before merging this PR.~~ The corresponding changes to lldb's data formatters are already in place: apple/swift-lldb#927.

lorentey · 2018-09-10T10:44:30Z

@swift-ci please smoke benchmark

lorentey · 2018-09-10T11:20:18Z

My plan is to land this in separate parts, once the following items are done:

Update Dictionary to use _HashTable
Update lldb's Set/Dictionary formatters (keeping them compatible with 4.2 for a while)
Based on benchmarks, decide whether to keep the experimental metadata updates or to revert to using a simple bitmap.

lorentey · 2018-09-13T17:02:27Z

@swift-ci please smoke benchmark

lorentey · 2018-09-13T17:02:57Z

@swift-ci please smoke test

lorentey · 2018-09-14T10:26:31Z

/home/buildnode/jenkins/workspace/swift-PR-Linux-smoke-test/branch-master/swift/test/SILOptimizer/no_traps_in_dict_loopup.swift:8:15: error: CHECK-NOT: excluded string found in input
// CHECK-NOT: llvm.trap
              ^
<stdin>:105:15: note: found here
declare void @llvm.trap() #1
              ^~~~~~~~~

Oops, Dictionary.swift accidentally uses the checked bitset lookup variants. I'm removing those in an upcoming cleanup.

The lldb failures are to be expected.

swift-ci · 2018-09-14T18:46:30Z

Build comment file:

Build failed before running benchmark.

lorentey · 2018-09-19T15:20:57Z

The lldb tests will fail at the moment.

@swift-ci please test

lorentey · 2018-09-19T15:21:39Z

@swift-ci please smoke benchmark

lorentey · 2018-09-19T15:26:48Z

The latest changes revert the experimental metadata layout changes -- the SIMD tricks they employed had excellent throughput, but their latency was too high for the kind of short chains that occur in normal hash tables. In the latest version Set and Dictionary use a simple flat bitmap data structure, like before.

Some of the SIMD-in-a-register tricks still survive in the _HashTable struct; they proved useful for iterating over bitmaps, for example.

I split out some optimizations and bugfixes into separate PRs, to be submitted after this lands.

swift-ci · 2018-09-19T15:42:53Z

Build failed
Swift Test Linux Platform
Git Sha - 3d6666cac67d59a98ca3c67a55cae2e5db70bff9

palimondo · 2018-09-19T15:44:26Z

I haven't looked at the code, I'm just curious. Was the SIMD code x86 specific or did that work on ARM too?

swift-ci · 2018-09-19T15:56:36Z

Build comment file:

Build failed before running benchmark.

lorentey · 2018-09-19T15:59:01Z

The experimental metadata layout stored 8 extra bits of the hash value for each entry in the table. These were used to reduce the cost of collisions. The "SIMD" in this case was done entirely using general-purpose registers -- to process the metadata table, I treated a regular UInt64 as a vector of 8 bytes, so that I could search hash lookup chains by looking at 8 entries at once.

This was a significant boost for long chains, but unfortunately, the average size of a lookup chain is between 1 and 2.5 -- so there was just not enough data for SIMD's advantages to kick in, and Set.contains got significantly slower.

swift-ci · 2018-09-19T16:10:32Z

Build failed
Swift Test OS X Platform
Git Sha - 3d6666cac67d59a98ca3c67a55cae2e5db70bff9

palimondo · 2018-09-19T16:15:30Z

Thanks for the explanation! What was the cause of slowdown in contains — just the change of data layout by adding the extra byte?

lorentey · 2018-09-19T16:32:33Z

Yeah, processing 8x more metadata was much more costly than I expected. The entire point of that exercise was to allow us to increase the max load factor, but SIMD tricks weren't enough to paper over the problems with that, especially at table sizes < 2k. I also tried Robin hood hashing etc., but that had other drawbacks.

The simple algorithm the stdlib's hashed collections use seems very difficult to beat on anything other than memory use.

lorentey · 2018-09-19T16:34:41Z

I could reproduce the benchmark build failure; looks like the defaultValue subscript isn't exercised enough in the test suite.

palimondo · 2018-09-19T16:40:51Z

Isn't it possible that your experimental design would outperform the simple algorithm on different architecture? I'm assuming you've tested on x86. How did it look on ARM? (Are we tracking benchmarks on that, too?)

lorentey · 2018-09-19T19:29:50Z

@swift-ci please smoke benchmark

lorentey · 2018-09-19T19:30:10Z

@swift-ci please test

swift-ci · 2018-09-19T19:54:05Z

Build failed
Swift Test Linux Platform
Git Sha - 2ee71d6e4410ca166032eb65a2851aeb624184aa

swift-ci · 2018-09-19T20:20:23Z

Build failed
Swift Test OS X Platform
Git Sha - 2ee71d6e4410ca166032eb65a2851aeb624184aa

Previously it called index(forKey:) to implement the test, but that method is O(n) for dictionaries bridged over from Cocoa.

…inable

Don’t duplicate native paths — make the cocoa case conditionally unreachable instead.

There is no need to fetch the value corresponding to the given key. Also implement the same for Values.subscript, although the impact there is marginal.

… to do Moving the uniqueness check before the first lookup was a bad idea. Revert it.

This should reduce code size and perhaps trigger more optimizations.

…builds

Signed/unsigned integer conversions check for unrepresentable values; this wasn’t recognized as impossible, so a trap got compiled into the Dictionary lookup path. Note to self: next time just use bitwise operations.

It is currently unused, but it’s causing issues with the SIL/parse-stlib.sil test.

- Have the hash buffer include a reference to the original hash storage instance, along with a copy of its _HashTable, so that its lifetime can be independent from the deferred bridging object. - Convert _BridgingHashBuffer to a ManagedBuffer so that we can easily put reference-counted properties in its header.

lorentey · 2018-09-22T12:37:29Z

@swift-ci please test

lorentey · 2018-09-22T14:14:01Z

Rebased to latest master. Latest round of commits fix all known regressions.

This is finally getting ready to land.

lorentey · 2018-09-24T10:20:42Z

I think it's time to get this one out of the way! ✨

Optimizations I pulled back from the original will come back as followup PRs soon.

lorentey changed the title ~~Hashtable unification~~ [WIP] Hashtable unification Sep 10, 2018

This comment has been minimized.

Sign in to view

lorentey force-pushed the hashtable-unification branch from ed50369 to 5bed9a8 Compare September 13, 2018 17:01

lorentey mentioned this pull request Sep 14, 2018

Use custom hash table as node lookup table swiftlang/swift-syntax#11

Merged

lorentey force-pushed the hashtable-unification branch 2 times, most recently from f3f964f to 3d6666c Compare September 19, 2018 15:20

lorentey changed the title ~~[WIP] Hashtable unification~~ [stdlib] Set, Dictionary: Hash table unification, general cleanup before ABI stability Sep 19, 2018

lorentey force-pushed the hashtable-unification branch from 3d6666c to 6fbe3be Compare September 19, 2018 19:08

lorentey added 16 commits September 22, 2018 02:04

[stdlib] Dictionary: Fix compilation on non-ObjC platforms

ea6ffd3

[stdlib] Fix performance regression in Dictionary.Keys.contains(_:)

c3540de

Previously it called index(forKey:) to implement the test, but that method is O(n) for dictionaries bridged over from Cocoa.

[stdlib] Set, Dictionary: Make _HashTableDelegate implementations inl…

c19e20b

…inable

[stdlib] Set, Dictionary: Simplify variant processing

8366edd

Don’t duplicate native paths — make the cocoa case conditionally unreachable instead.

[stdlib] Dictionary: Speed up Keys.subscript for bridged dictionaries

4421d1e

There is no need to fetch the value corresponding to the given key. Also implement the same for Values.subscript, although the impact there is marginal.

[stdlib] Set, Dictionary: Speed up insert/remove when there’s nothing…

778defa

… to do Moving the uniqueness check before the first lookup was a bad idea. Revert it.

[stdlib] Add missing calls to cocoaPath()

2094581

This should reduce code size and perhaps trigger more optimizations.

[stdlib] Set.Iterator.next(): Work around compiler crash in non-ObjC …

4e64931

…builds

[stdlib] Temporary workaround for issue fixed in swiftlang#19444

9b7ba7d

[stdlib] Dictionary.removeValue(forKey:): Fix typo.

e41a6f9

[stdlib] Bitset: Eliminate a trap that wasn’t optimized away

a84ece7

Signed/unsigned integer conversions check for unrepresentable values; this wasn’t recognized as impossible, so a trap got compiled into the Dictionary lookup path. Note to self: next time just use bitwise operations.

[test] Fix flaky test

00bbf1b

[stdlib] Remove _Bitset type for now

3c8ac88

It is currently unused, but it’s causing issues with the SIL/parse-stlib.sil test.

[test] Remove _Bitset tests

77dbc5b

[runtime] Fix layout of empty Set/Dictionary singletons

ace19fd

[stdlib] Discourage CoW copying / rehashing from getting inlined

b2d4e80

lorentey force-pushed the hashtable-unification branch from a544153 to a3712c3 Compare September 22, 2018 12:29

swiftlang deleted a comment from swift-ci Sep 22, 2018

lorentey added 3 commits September 22, 2018 13:36

[stdlib] Set, Dictionary: Sanitize @objc-ness for NSEnumerator members

ff882f4

[stdlib] Dictionary: Remove workaround for rdar://problem/44612356

9ae572f

lorentey force-pushed the hashtable-unification branch from a3712c3 to 9ae572f Compare September 22, 2018 12:36

swiftlang deleted a comment from swift-ci Sep 22, 2018

lorentey merged commit ca4898f into swiftlang:master Sep 24, 2018

lorentey deleted the hashtable-unification branch September 24, 2018 10:20

palimondo mentioned this pull request Feb 19, 2019

[benchmark] Janitor Duty: Sweep Quadratic #22673

Merged

[stdlib] Set, Dictionary: Hash table unification, general cleanup before ABI stability #19213

[stdlib] Set, Dictionary: Hash table unification, general cleanup before ABI stability #19213

Uh oh!

Conversation

lorentey commented Sep 10, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lorentey commented Sep 10, 2018

Uh oh!

lorentey commented Sep 10, 2018

Uh oh!

This comment has been minimized.

lorentey commented Sep 13, 2018

Uh oh!

lorentey commented Sep 13, 2018

Uh oh!

lorentey commented Sep 14, 2018

Uh oh!

swift-ci commented Sep 14, 2018

Build comment file:

Uh oh!

lorentey commented Sep 19, 2018

Uh oh!

lorentey commented Sep 19, 2018

Uh oh!

lorentey commented Sep 19, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

swift-ci commented Sep 19, 2018

Uh oh!

palimondo commented Sep 19, 2018

Uh oh!

swift-ci commented Sep 19, 2018

Build comment file:

Uh oh!

lorentey commented Sep 19, 2018

Uh oh!

swift-ci commented Sep 19, 2018

Uh oh!

palimondo commented Sep 19, 2018

Uh oh!

lorentey commented Sep 19, 2018

Uh oh!

lorentey commented Sep 19, 2018

Uh oh!

palimondo commented Sep 19, 2018

Uh oh!

lorentey commented Sep 19, 2018

Uh oh!

lorentey commented Sep 19, 2018

Uh oh!

swift-ci commented Sep 19, 2018

Uh oh!

swift-ci commented Sep 19, 2018

Uh oh!

lorentey commented Sep 22, 2018

Uh oh!

lorentey commented Sep 22, 2018

Uh oh!

lorentey commented Sep 24, 2018

Uh oh!

Uh oh!

lorentey commented Sep 10, 2018 •

edited

Loading

lorentey commented Sep 19, 2018 •

edited

Loading