Remove sparse set in Table storage #14928

cBournhonesque · 2024-08-26T18:03:20Z

Objective

We need to remove every SparseSet<ComponentId> in preparation for relations, because components will have very high id values once relations are added (a relation has a ComponentId in the high bits of its u64 id). A sparse set indexed directly by such ids would need a very large sparse array.
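To make the size problem concrete, here is a minimal sketch of how a pair id could be packed. The 32/32 bit split and the names `pair_id`, `relation_of`, `target_of` and `sparse_array_bytes` are assumptions for illustration only; the PR description only states that the ComponentId sits in the high bits of the u64 id.

```rust
/// Hypothetical packing: relation's component id in the high 32 bits,
/// target in the low 32 bits. The real layout is not specified in this PR.
fn pair_id(relation: u32, target: u32) -> u64 {
    ((relation as u64) << 32) | target as u64
}

/// Recovers the relation's component id from the high bits.
fn relation_of(id: u64) -> u32 {
    (id >> 32) as u32
}

/// Recovers the target from the low bits.
fn target_of(id: u64) -> u32 {
    id as u32
}

/// Bytes needed by a sparse array indexed directly by id, assuming one
/// 4-byte slot per id up to and including `max_id`.
fn sparse_array_bytes(max_id: u64) -> u128 {
    (max_id as u128 + 1) * 4
}
```

Under these assumptions even the smallest pair id, `pair_id(1, 0)`, already exceeds 2^32, so a directly indexed sparse array covering it would need roughly 16 GiB.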

Part 1 was removing the FixedBitSet in Access: #14385
Part 2 is done in this PR: removing the ImmutableSparseSet<ComponentId> in Storage::Table that lets you access the Column corresponding to a given ComponentId.

This PR replaces it with a Vec<ComponentId> and a Vec<Column> that are sorted in the same order.
To retrieve the Column for a given ComponentId, we use the ComponentIndex, which maps a ComponentId to the list of archetypes that contain the component, along with extra metadata that includes the index of the Column storing it.
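The layout described above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: `Column` is a stand-in holding plain integers, the `register` helper and the exact shape of `ArchetypeRecord` are assumptions, and only `get_column_index(id, archetype_id)` mirrors a name that appears in the diff.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct ComponentId(u64);
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct ArchetypeId(u32);

/// Stand-in for a real column; here just a list of values.
#[derive(Debug, Default, PartialEq)]
struct Column(Vec<i32>);

/// Per-archetype metadata for one component (simplified, hypothetical shape).
struct ArchetypeRecord {
    column: Option<usize>, // None when the component has no table column
}

#[derive(Default)]
struct ComponentIndex {
    records: HashMap<ComponentId, HashMap<ArchetypeId, ArchetypeRecord>>,
}

impl ComponentIndex {
    /// Two hash lookups: first by component, then by archetype.
    fn get_column_index(&self, id: ComponentId, archetype: ArchetypeId) -> Option<usize> {
        self.records.get(&id)?.get(&archetype)?.column
    }
}

struct Table {
    component_ids: Vec<ComponentId>, // same order as `columns`
    columns: Vec<Column>,
}

impl Table {
    /// Sorts by component id so both vectors share one ordering.
    fn new(mut entries: Vec<(ComponentId, Column)>) -> Self {
        entries.sort_by_key(|(id, _)| id.0);
        let (component_ids, columns): (Vec<_>, Vec<_>) = entries.into_iter().unzip();
        Table { component_ids, columns }
    }

    fn column(&self, index: usize) -> Option<&Column> {
        self.columns.get(index)
    }
}

/// Hypothetical helper: records each column position in the index.
fn register(index: &mut ComponentIndex, archetype: ArchetypeId, table: &Table) {
    for (column, id) in table.component_ids.iter().enumerate() {
        let record = ArchetypeRecord { column: Some(column) };
        index.records.entry(*id).or_default().insert(archetype, record);
    }
}
```

The table itself no longer needs any structure keyed by ComponentId; the only id-keyed lookup lives in the index, whose size depends on how many components exist rather than on how large their ids are.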

TODO:

- run benchmarks

cart · 2024-08-26T23:42:00Z

I believe the alternative to this is hierarchical sparse sets to cut down on allocation size while still preserving a lot of the benefits of bitsets.
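The comment does not spell out a design, so the following is only one plausible reading: a two-level (paged) sparse set in which pages are allocated on first use. `PagedSparseSet`, the page size and all method names are hypothetical.

```rust
const PAGE_BITS: u32 = 8;
const PAGE_SIZE: usize = 1 << PAGE_BITS;

/// Two-level sparse set: the top level is indexed by page number and each
/// page is allocated only when an id inside it is first inserted.
#[derive(Default)]
struct PagedSparseSet {
    pages: Vec<Option<Box<[Option<usize>; PAGE_SIZE]>>>,
    dense: Vec<u64>, // ids in insertion order
}

impl PagedSparseSet {
    /// Splits an id into (page number, slot within the page).
    fn split(id: u64) -> (usize, usize) {
        ((id >> PAGE_BITS) as usize, (id as usize) & (PAGE_SIZE - 1))
    }

    /// Inserts `id` if absent and returns its dense index.
    fn insert(&mut self, id: u64) -> usize {
        let (page, slot) = Self::split(id);
        if page >= self.pages.len() {
            self.pages.resize_with(page + 1, || None);
        }
        let entries = self.pages[page].get_or_insert_with(|| Box::new([None; PAGE_SIZE]));
        if let Some(existing) = entries[slot] {
            return existing;
        }
        let dense_index = self.dense.len();
        entries[slot] = Some(dense_index);
        self.dense.push(id);
        dense_index
    }

    /// Returns the dense index of `id`, if present.
    fn get(&self, id: u64) -> Option<usize> {
        let (page, slot) = Self::split(id);
        self.pages.get(page)?.as_ref()?[slot]
    }

    /// Number of pages that actually hold storage.
    fn allocated_pages(&self) -> usize {
        self.pages.iter().filter(|p| p.is_some()).count()
    }
}
```

Note that in this two-level form the top-level vector still grows with the largest page number, so ids with high bits set would likely need more levels or a map at the top.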

@SanderMertens how does Flecs solve this problem?

james-j-obrien · 2024-08-26T23:46:38Z

Sander can correct me if I'm mistaken, but this is similar to the approach flecs uses. Tables in flecs store a vector of component ids and several other vectors for mapping the type index to the column (code simplified for brevity):

```c
struct ecs_table_t {
    uint64_t id;                     /* Table id in sparse set */
    int16_t column_count;            /* Number of components (excluding tags) */
    ecs_type_t type;                 /* Vector with component ids */

    ecs_data_t data;                 /* Component storage */

    int16_t *component_map;          /* Get column for component id */
    int16_t *column_map;             /* Map type index <-> column
                                      *  - 0..count(T):        type index -> column
                                      *  - count(T)..count(C): column -> type index
                                      */
};
```

When looking up the column, flecs either uses a dense array for low component ids (a class of optimisation that we implicitly always use via sparse sets; we could also split the code paths for perf) or goes through the id index, similar to this PR:

```c
int32_t ecs_table_get_column_index(
    const ecs_world_t *world,
    const ecs_table_t *table,
    ecs_id_t id)
{
    if (id < FLECS_HI_COMPONENT_ID) {
        int16_t res = table->component_map[id];
        if (res > 0) {
            return res - 1;
        }
        return -1;
    }

    ecs_id_record_t *idr = flecs_id_record_get(world, id);
    if (!idr) {
        return -1;
    }

    ecs_table_record_t *tr = flecs_id_record_get_table(idr, table);
    if (!tr) {
        return -1;
    }

    return tr->column;
}
```

SanderMertens · 2024-08-27T04:15:30Z

@cart @james-j-obrien's post covers the gist of it. There are a few other details that impact perf:

- A low id range is reserved for components, so the chance that a regular component hits the dense component_map array is pretty high. I'm guessing this is automatically the case in Bevy, but it might change if/when component ids switch to entities.
- The element in the component_map is -type_index if it's a tag. This is different from the column index, which is the index in the array of table columns (see below).
- The component index fallback is mostly for relationship pairs, as up to 256 ids are reserved for components and not many games will go beyond that.
- The column_map and column_count members are part of flecs' ZST optimization, where the column array only contains elements for components, not tags. Those aren't really relevant for this PR.
- The id record lookup (which is the component index) has a similar branch: low component ids are looked up with a dense array (of 65k elements), while high ids (almost exclusively pairs) are looked up in a hashmap.
- The table record lookup is a hashmap.
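As a rough Rust transliteration of the two-path lookup discussed in this thread: low ids hit a dense per-table array, everything else falls back to an index of hash maps. The names `Table`, `IdIndex` and `column_index` are hypothetical, the cutoff of 256 follows the figure quoted above, and the fallback index is simplified to plain hash maps (it omits the dense 65k array for low ids).

```rust
use std::collections::HashMap;

/// Cutoff for the dense path; the thread mentions up to 256 reserved low ids.
const HI_COMPONENT_ID: u64 = 256;

struct Table {
    id: u64,
    /// Dense map for low ids:
    /// 0 = absent, n > 0 = column n - 1, n < 0 = tag (no column).
    component_map: [i16; HI_COMPONENT_ID as usize],
}

/// Fallback for high ids: component id -> (table id -> column index).
#[derive(Default)]
struct IdIndex {
    records: HashMap<u64, HashMap<u64, usize>>,
}

fn column_index(index: &IdIndex, table: &Table, id: u64) -> Option<usize> {
    if id < HI_COMPONENT_ID {
        // Fast path: a single array read, no hashing.
        let slot = table.component_map[id as usize];
        return if slot > 0 { Some(slot as usize - 1) } else { None };
    }
    // Slow path: two hash lookups, mostly hit by relationship pairs.
    index.records.get(&id)?.get(&table.id).copied()
}
```

Storing `column + 1` lets zero mean "absent", which is why the fast path subtracts one before returning.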

chescock · 2024-08-27T00:57:27Z

crates/bevy_ecs/src/query/fetch.rs

```diff
@@ -1250,6 +1313,7 @@ unsafe impl<'__w, T: Component> ReadOnlyQueryData for Ref<'__w, T> {}

 #[doc(hidden)]
 pub struct WriteFetch<'w, T> {
+    component_index: &'w ComponentIndex,
```


A lot of Fetch types include a &ComponentIndex now. Would it make sense to store it once in the QueryIter and pass it in to set_table? That would let you avoid changing the Fetch types. You'd store one extra copy for the rare query that never needs it, but save copies for the common queries that access multiple components.

(Alternately, should the Fetch types be doing the first lookup by ComponentId and storing a &'w HashMap<ArchetypeId, ArchetypeRecord>?)

cBournhonesque · 2024-08-28

I think the end goal is to have the query cache the column per archetype, so maybe what you described with:

> (Alternately, should the Fetch types be doing the first lookup by ComponentId and storing a &'w HashMap<ArchetypeId, ArchetypeRecord>?)

cc @james-j-obrien
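The alternative raised in this thread (resolve the component once, then do a single lookup per archetype) could look roughly like this. It is a sketch under simplified types: `Fetch`, `init` and `set_archetype` here are hypothetical stand-ins, not the actual bevy_ecs fetch API.

```rust
use std::collections::HashMap;

type ComponentId = u64;
type ArchetypeId = u32;

/// Simplified per-archetype record for one component.
struct ArchetypeRecord {
    column: Option<usize>,
}

struct ComponentIndex {
    records: HashMap<ComponentId, HashMap<ArchetypeId, ArchetypeRecord>>,
}

/// Hypothetical fetch state: the lookup by ComponentId happens once in
/// `init`, so each archetype change costs a single hash lookup.
struct Fetch<'w> {
    per_archetype: Option<&'w HashMap<ArchetypeId, ArchetypeRecord>>,
    column: Option<usize>,
}

impl<'w> Fetch<'w> {
    fn init(index: &'w ComponentIndex, id: ComponentId) -> Self {
        Fetch {
            per_archetype: index.records.get(&id),
            column: None,
        }
    }

    /// Caches the column index for the archetype being iterated.
    fn set_archetype(&mut self, archetype: ArchetypeId) {
        self.column = self
            .per_archetype
            .and_then(|records| records.get(&archetype))
            .and_then(|record| record.column);
    }
}
```

Compared with holding a reference to the whole ComponentIndex, this trades one stored reference for another but removes the per-archetype lookup by ComponentId.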

chescock · 2024-08-27T00:58:23Z

crates/bevy_ecs/src/query/fetch.rs

```diff
+            fetch
+                .component_index
+                .get_column_index(id, archetype_id)
+                .is_some_and(|column_index| table.has_column(column_index))
```


Can has_column ever fail? It looks like get_column_index would return None rather than an out-of-range value, so you can remove the has_column() method.

cBournhonesque · 2024-08-28

Good catch!

chescock · 2024-08-27T01:00:22Z

crates/bevy_ecs/src/world/entity_ref.rs

```diff
@@ -961,7 +961,10 @@ impl<'w> EntityWorldMut<'w> {
        // matches
        let result = unsafe {
            T::from_components(storages, &mut |storages| {
+                // TODO: how to fix borrow-checker issues here?
```


Did you already fix this?

chescock · 2024-08-27T19:04:26Z

crates/bevy_ecs/src/storage/table.rs

```diff
@@ -669,7 +673,8 @@ impl TableBuilder {
 /// [`Component`]: crate::component::Component
 /// [`World`]: crate::world::World
 pub struct Table {
-    columns: ImmutableSparseSet<ComponentId, Column>,
+    columns: Vec<Column>,
```


Why use the component index rather than replacing columns: ImmutableSparseSet<ComponentId, Column> with columns: HashMap<ComponentId, Column>? That seems like it would be a simpler change, and it would only require one hash lookup per set_table/set_archetype instead of two.

Is the idea that this uses less memory? I wouldn't have expected a HashMap<ComponentId, Column> to use all that much more space than the Vec<Column> and Vec<ComponentId> that you're storing instead.

cBournhonesque · 2024-08-28

@james-j-obrien knows more here.

I think a good first solution to fix memory issues would indeed be to just swap the SparseSet for a HashMap.
I think the idea was that HashMaps have relatively poor performance for iteration.

Btw @chescock have you joined the bevy discord? It would be great to chat more about a path to relations!

chescock · 2024-08-28

> Btw @chescock have you joined the bevy discord? It would be great to chat more about a path to relations!

Yup! I'm quartermeister there, and I've been lurking on #ecs-dev and the 🌈 working group room!

james-j-obrien · 2024-09-05

Storing a HashMap in the table is almost equivalent to storing it in the component index. However, in a post-relations future where tables are created and deleted much more frequently, it's economical to keep the setup and teardown costs low. It may also be convenient for later optimizations such as moving the sparse sets into the component index: that way we only have one code path for both storages, and we no longer look up the info and then the sparse set separately.

While benchmarks haven't been run yet, there's a decent chance we want to steal some of flecs' optimizations here and keep a low-id lookup array in both the table and the component index.

Diddykonga · 2024-12-10T02:16:56Z

Would SmallVec make sense here? I'm not sure what the average component count per table is, but I assume there is a reasonable inline capacity that could be used.

cBournhonesque added 3 commits August 26, 2024 12:38

- wip (6cf6510)
- wip (7c8a13a)
- wip (e529975)

cBournhonesque added 5 commits August 26, 2024 15:42

- wip (c07fe20)
- wip (7f06e49)
- wip (37a7f97)
- compiles (8d63745)
- test work (7fd9a4d)

cBournhonesque requested review from chescock, james-j-obrien and Trashtalk217 August 26, 2024 21:49

- fix (82ff9ab)

cBournhonesque marked this pull request as ready for review August 26, 2024 21:53

cBournhonesque added 2 commits August 26, 2024 17:55

- fmt (f8a23e0)
- opt (bbd7742)

chescock reviewed Aug 28, 2024


alice-i-cecile mentioned this pull request Jan 27, 2025

Modify bevy_ecs to support removing archetypes and unregistering components #17564

Open

7 tasks

cart mentioned this pull request Feb 8, 2025

Query by Indexed Component Value using Fragmenting Marker Components #17608

Open
