Remove sparse set in Table storage #14928
I believe the alternative to this is hierarchical sparse sets to cut down on allocation size while still preserving a lot of the benefits of bitsets. @SanderMertens how does Flecs solve this problem?
Sander can correct me if I'm mistaken, but this is similar to the approach flecs uses. Tables in flecs store a vector of component ids and several other vectors for mapping the type index to the column (code simplified for brevity):

    struct ecs_table_t {
        uint64_t id;             /* Table id in sparse set */
        int16_t column_count;    /* Number of components (excluding tags) */
        ecs_type_t type;         /* Vector with component ids */
        ecs_data_t data;         /* Component storage */
        int16_t *component_map;  /* Get column for component id */
        int16_t *column_map;     /* Map type index <-> column
                                  * - 0..count(T): type index -> column
                                  * - count(T)..count(C): column -> type index
                                  */
    };

When looking up the column, flecs either uses a dense array for low component ids (a class of optimisation that we implicitly always use via sparse sets; we could also split the code paths for perf) or goes through the id index, similar to this PR:

    int32_t ecs_table_get_column_index(
        const ecs_world_t *world,
        const ecs_table_t *table,
        ecs_id_t id)
    {
        if (id < FLECS_HI_COMPONENT_ID) {
            int16_t res = table->component_map[id];
            if (res > 0) {
                return res - 1;
            }
            return -1;
        }
        ecs_id_record_t *idr = flecs_id_record_get(world, id);
        if (!idr) {
            return -1;
        }
        ecs_table_record_t *tr = flecs_id_record_get_table(idr, table);
        if (!tr) {
            return -1;
        }
        return tr->column;
    }
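To make the two-tier lookup above concrete in Rust terms, here is a minimal sketch. It is not Bevy or flecs API: `TwoTierTable`, `HI_COMPONENT_ID`, the threshold value, and the `(7 << 32) | 3` style of relation id are all assumptions made for illustration. Low ids index a dense array that stores `column + 1` (so `0` means "absent", mirroring the `res - 1` convention in the C code), and high ids fall back to a hash map that stands in for the id index.

```rust
use std::collections::HashMap;

/// Hypothetical threshold: ids below it index a dense array directly.
const HI_COMPONENT_ID: u64 = 256;

/// Hypothetical table with a two-tier column lookup (not a Bevy or flecs type).
struct TwoTierTable {
    /// Dense map for low ids; stores `column + 1` so that 0 means "absent".
    low_id_columns: Vec<u16>,
    /// Fallback for high ids, standing in for the id index.
    high_id_columns: HashMap<u64, usize>,
}

impl TwoTierTable {
    /// Assigns columns in the order the ids are given.
    fn new(component_ids: &[u64]) -> Self {
        let mut low_id_columns = vec![0u16; HI_COMPONENT_ID as usize];
        let mut high_id_columns = HashMap::new();
        for (column, &id) in component_ids.iter().enumerate() {
            if id < HI_COMPONENT_ID {
                low_id_columns[id as usize] = column as u16 + 1;
            } else {
                high_id_columns.insert(id, column);
            }
        }
        TwoTierTable { low_id_columns, high_id_columns }
    }

    /// Returns the column for `id`, or `None` if the table lacks it.
    fn column_index(&self, id: u64) -> Option<usize> {
        if id < HI_COMPONENT_ID {
            // Fast path: one array read, no hashing.
            match self.low_id_columns[id as usize] {
                0 => None,
                stored => Some(stored as usize - 1),
            }
        } else {
            // Slow path: ids with high bits set, such as relation pairs.
            self.high_id_columns.get(&id).copied()
        }
    }
}
```

With a table built from the ids `[3, 10, (7 << 32) | 3]`, `column_index(10)` takes the array path and the relation-style id takes the hash path. The dense array costs a fixed amount of memory per table regardless of how large the high ids get, which is the property a plain sparse set lacks.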
@cart @james-j-obrien's post covers the gist of it. There are a few other details that impact perf:
@@ -1250,6 +1313,7 @@ unsafe impl<'__w, T: Component> ReadOnlyQueryData for Ref<'__w, T> {}

#[doc(hidden)]
pub struct WriteFetch<'w, T> {
    component_index: &'w ComponentIndex,
A lot of `Fetch` types include a `&ComponentIndex` now. Would it make sense to store it once in the `QueryIter` and pass it in to `set_table`? That would let you avoid changing the `Fetch` types. You'd store one extra copy for the rare query that never needs it, but save copies for the common queries that access multiple components.
(Alternately, should the `Fetch` types be doing the first lookup by `ComponentId` and storing a `&'w HashMap<ArchetypeId, ArchetypeRecord>`?)
I think the end-goal is to have the query cache the column per archetype, so maybe what you described by
> (Alternately, should the Fetch types be doing the first lookup by ComponentId and storing a &'w HashMap<ArchetypeId, ArchetypeRecord>?)
fetch
    .component_index
    .get_column_index(id, archetype_id)
    .is_some_and(|column_index| table.has_column(column_index))
Can `has_column` ever fail? It looks like `get_column_index` would return `None` rather than an out-of-range value, so you can remove the `has_column()` method.
good catch!
@@ -961,7 +961,10 @@ impl<'w> EntityWorldMut<'w> {
// matches
let result = unsafe {
    T::from_components(storages, &mut |storages| {
        // TODO: how to fix borrow-checker issues here?
Did you already fix this?
@@ -669,7 +673,8 @@ impl TableBuilder {
/// [`Component`]: crate::component::Component
/// [`World`]: crate::world::World
pub struct Table {
-    columns: ImmutableSparseSet<ComponentId, Column>,
+    columns: Vec<Column>,
Why use the component index rather than replacing `columns: ImmutableSparseSet<ComponentId, Column>` with `columns: HashMap<ComponentId, Column>`? That seems like it would be a simpler change, and it would only require one hash lookup per `set_table`/`set_archetype` instead of two.
Is the idea that this uses less memory? I wouldn't have expected a `HashMap<ComponentId, Column>` to use all that much more space than the `Vec<Column>` and `Vec<ComponentId>` that you're storing instead.
@james-j-obrien knows more here.
I think a good first solution to the memory issues would indeed be to just swap the SparseSet for a HashMap.
I think the idea was that HashMaps have relatively poor iteration performance.
Btw @chescock have you joined the Bevy Discord? It would be great to chat more about a path to relations!
> Btw @chescock have you joined the bevy discord? It would be great to chat more about a path to relations!
Yup! I'm quartermeister there, and I've been lurking on #ecs-dev and the 🌈 working group room!
Storing a HashMap in the table is almost equivalent to storing it in the component index. However, in a post-relations future where tables are created and deleted much more frequently, it's economical to keep the setup and teardown costs low. It may also be convenient for later optimizations like moving the sparse sets into the component index: that way we have only one code path for both storages, and we no longer look up the info and then the sparse set separately.
While benchmarks haven't been run yet, there's a decent chance we want to steal some of flecs' optimizations here and keep a low-id lookup array in both the table and the component index.
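For readers following the thread, the layout under discussion can be sketched roughly as below. This is a simplified model, not the PR's code: the type aliases, the `records` field, `SketchTable`, and the one-table-per-archetype simplification are assumptions for illustration. Only the names `ComponentIndex`, `ArchetypeRecord`, and `get_column_index(id, archetype_id)` come from the diff and comments above.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real id types (assumption).
type ComponentId = u64;
type ArchetypeId = u32;

/// Per-archetype metadata; only the column index is modeled here.
struct ArchetypeRecord {
    column: usize,
}

/// Maps a component id to the archetypes containing it, plus the column index.
#[derive(Default)]
struct ComponentIndex {
    records: HashMap<ComponentId, HashMap<ArchetypeId, ArchetypeRecord>>,
}

impl ComponentIndex {
    /// Two hash lookups: first by component id, then by archetype id.
    fn get_column_index(&self, id: ComponentId, archetype: ArchetypeId) -> Option<usize> {
        let per_archetype = self.records.get(&id)?;
        let record = per_archetype.get(&archetype)?;
        Some(record.column)
    }
}

/// Table holding ids and columns in parallel vectors sorted in the same order.
struct SketchTable {
    component_ids: Vec<ComponentId>,
    columns: Vec<Vec<u32>>, // stand-in for the real `Column` type
}

impl SketchTable {
    /// Builds a table for `archetype` and registers its columns in `index`.
    fn new(archetype: ArchetypeId, mut ids: Vec<ComponentId>, index: &mut ComponentIndex) -> Self {
        ids.sort_unstable();
        ids.dedup();
        for (column, &id) in ids.iter().enumerate() {
            index
                .records
                .entry(id)
                .or_default()
                .insert(archetype, ArchetypeRecord { column });
        }
        let columns = ids.iter().map(|_| Vec::new()).collect();
        SketchTable { component_ids: ids, columns }
    }
}
```

In this model the table itself owns no map at all, so creating or dropping a table only allocates two vectors. The cost moves to the shared index, which is where the two lookups per `set_table`/`set_archetype` mentioned earlier come from.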
Would |
Objective
We need to remove any `SparseSet<ComponentId>` in preparation for relations, because we will have components with very high id values once relations are added (relations have a ComponentId in the high bits of the u64 id).
Part 1 was removing the FixedBitSet in Access: #14385
Part 2 is done in this PR: removing the `ImmutableSparseSet<ComponentId>` in `Storage::Table` that lets you access the `Column` corresponding to a given `ComponentId`.
This PR replaces it with a `Vec<ComponentId>` and `Vec<Column>` which are sorted in the same order.
We use the `ComponentIndex` (which maps from a `ComponentId` to the list of archetypes that contain this component, along with extra metadata including the `Column` index containing the `ComponentId`) to retrieve the `Column` corresponding to a given `ComponentId`.
TODO: