Skip to content

Feature: merge tables#204

Open
ieQu1 wants to merge 27 commits intomainfrom
dev/merge-tables
Open

Feature: merge tables#204
ieQu1 wants to merge 27 commits intomainfrom
dev/merge-tables

Conversation

@ieQu1
Copy link
Copy Markdown
Member

@ieQu1 ieQu1 commented Apr 14, 2026

See README.md

@zmstone
Copy link
Copy Markdown
Member

zmstone commented Apr 15, 2026

question: in a simplified 3 nodes cluster
C1, C2, and R1,

  • R1 is connected to C1 for the merged table shard
  • C1 isolated from C2
  • R1 issued writes through C1
  • C1 crashes (R1 and C2 continue to work as normal)

What happens to the records written from R1 (through C1) while isolated?

@ieQu1
Copy link
Copy Markdown
Member Author

ieQu1 commented Apr 15, 2026

@zmstone The fault scenario can be further simplified by noting that in merged shards there's no difference between core and replicant roles. So we can assume that all nodes are either cores or replicants, or just "nodes". When writing to merged shard, all writes are local. Mria verifies that all records contain the node in the record

So, to paraphrase the scenario:

N3 is connected to N1 for the merged table shard
N1 isolated from N2
N3 issued writes through N1
N1 crashes (N3 and N2 continue to work as normal)

How it will be handled:
N3 will continue to write records locally, but N2 won't see them due to partition, and vice versa.
If partition heals, N3 will pull N2's records and merge them into the local table, and vice-versa.
When N1 restarts, it will pull records from both N2 and N3, and merge them. N2 and N3 will subscribe to transactions from N1.

Comment thread src/mria_membership.erl Outdated
Comment thread src/mria_rlog_replica.erl Outdated
) when State =:= ?normal;
State =:= ?local_replay ->
mria_status:notify_replicant_replayq_len(Shard, replayq:count(Q)),
%% FIXME: this should be reported per upstream
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FIXME: improve metrics

Comment thread test/mria_SUITE.erl Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces merge tables support to Mria’s RLOG mode, adding a new “merged shard” supervision topology and schema rules so that certain tables can be written locally on all nodes and later merged across the cluster.

Changes:

  • Add merge-table schema validation (merge_table + mandatory node_pattern) and cache merge-shard metadata in persistent_term.
  • Introduce merged-shard supervisors/manager and refactor shard supervisors into upstream/downstream/merged variants, using gproc and pg for addressing and membership.
  • Update replication/bootstrap/import paths and extend CT coverage with a new schema-focused test and README documentation.

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
test/mria_mnesia_test_util.erl Adds wait_tables/2 helper and removes startup shard env usage in test env.
test/mria_SUITE.erl Updates shard restart waits and adds t_merge_table_schema/1 test.
src/mria_shards_sup.erl Starts only meta shard initially; dynamically selects upstream/downstream/merged shard supervisors.
src/mria_shard_upstream_sup.erl Renames/refactors “core shard” supervisor to upstream variant.
src/mria_shard_merged_sup.erl New top supervisor for merged shards with downstream manager/supervisor.
src/mria_shard_downstream_sup.erl Renames/refactors “replicant shard” supervisor to downstream variant; adds upstream parameterization.
src/mria_schema.erl Adds merge-shard/table APIs, validation, and persistent_term cache of merge shard flags.
src/mria_rlog_sup.erl Adds pg child and separates out shards supervisor child spec.
src/mria_rlog_server.erl Updates references to upstream shard supervisor module name.
src/mria_rlog_replica.erl Adds upstream identity + gproc naming; merge-shard conditional behavior; adds where/2 and ls/1.
src/mria_rlog_merged_manager.erl New manager that ensures downstreams exist per cluster membership for merged shards.
src/mria_rlog.hrl Adds macros for pg scope and any_core upstream marker.
src/mria_rlog.erl Adds shard_writes/1 and bumps protocol version to 3.
src/mria_replica_importer_worker.erl Adds ETS-based import path for merge shards and gproc naming.
src/mria_rebalance.erl Updates agent listing to use new upstream supervisor module.
src/mria_membership.erl Adds special-case pong(_, false) behavior.
src/mria_bootstrapper.erl Adds merge-table bootstrap behavior (per-node clear + ETS-based writes + ETS select iteration).
src/mria.erl Routes transaction APIs via mria_rlog:shard_writes/1; adjusts backend write routing logic.
src/mria.app.src Adds gproc as an application dependency.
rebar.config Adds gproc dependency.
README.md Documents merge tables and their creation/config requirements.
Comments suppressed due to low confidence (3)

src/mria_schema.erl:305

  • do_create_table/2 creates the Mnesia table before verify_merge_table/1 runs (which happens inside add_entry/1). For merge-table validation failures (e.g. missing node_pattern, incompatible storage, incompatible shard), this can leave a table created locally even though the schema entry transaction aborts. Consider validating merge-table options before calling create_table/1, or add cleanup/rollback logic to drop the table when add_entry/1 aborts.
-spec do_create_table(mria:table(), list()) -> mria:t_result(ok).
do_create_table(Table, TabDef) ->
    case make_entry(Table, TabDef) of
        {ok, Entry} ->
            case create_table(Entry) of
                ok ->
                    add_entry(Entry);
                Err ->
                    {aborted, Err}
            end;

src/mria_shard_downstream_sup.erl:19

  • The module header comment says this supervisor “Runs on replicant nodes”, but in merged shards it is also started on core nodes (as downstream supervisors managed by mria_shard_merged_sup). Please update the comment so it matches the new supervision topology.
    src/mria_shard_upstream_sup.erl:19
  • The module header comment says this supervisor “Runs on core nodes”, but merge shards start mria_shard_upstream_sup on replicant nodes as well (via mria_shard_merged_sup). Please update the comment to reflect the actual runtime usage to avoid misleading future maintenance.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/mria_rlog_replica.erl Outdated
Comment thread test/mria_SUITE.erl Outdated
Comment thread src/mria_shard_merged_sup.erl
Comment thread src/mria_schema.erl Outdated
Comment thread src/mria_replica_importer_worker.erl Outdated
Comment thread README.md
Comment thread src/mria_shards_sup.erl
Comment thread src/mria_membership.erl Outdated
@ieQu1 ieQu1 force-pushed the dev/merge-tables branch from 8ce0cc4 to bc58223 Compare April 16, 2026 14:54
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread README.md
Comment on lines +75 to +86
1. All tables in the shard must have `merge_table` property = `true`.
2. `node_pattern` property is mandatory.
Its value must be an ets match pattern with one free variable: `'$1'`.
Mria verifies that this value is set to `node()` for each record.
3. `auto_clean` is an optional property that allows a downstream node to clean all records owned by an upstream node when the latter disconnects.

Unlike regular tables,
both cores and replicants use local writes to update such tables.
In a clustered setup,
the contents of the merge table consist of records from all reachable peer nodes.
Records from remote nodes are read-only.

Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README states that Mria verifies node_pattern is set to node() for each record, but the enforcement added in this PR appears to happen only for transactional ops (in mria_upstream:transactional_wrapper/3) and only for write/delete_object. Dirty operations (and transactional delete) currently bypass this check. Consider clarifying the doc to match current behavior, or extending enforcement so the statement is accurate.

Copilot uses AI. Check for mistakes.
Comment thread src/mria_upstream.erl
Comment on lines +125 to +128
verify_merge_table_update(_Op, Acc) ->
%% TODO: deletions and other operations are more tricky.
Acc.

Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verify_merge_table_update/2 only validates write and delete_object ops; transactional delete (e.g. mnesia:delete/1,3) and other ops are currently ignored, so a node can potentially delete records that don't belong to it without triggering merge_table_violation. Consider either enforcing the node check for delete/clear_table (e.g. by requiring delete_object for merge tables, or validating via a lookup) or explicitly aborting when unsupported ops are used on merge tables so the invariant holds.

Suggested change
verify_merge_table_update(_Op, Acc) ->
%% TODO: deletions and other operations are more tricky.
Acc.
verify_merge_table_update(Entry, Acc) ->
case tx_entry_table(Entry) of
{ok, Table} ->
case is_merged_table(Table) of
true ->
[Entry | Acc];
false ->
Acc
end;
error ->
Acc
end.
tx_entry_table({{Table, _Key}, _Record, _Op}) ->
{ok, Table};
tx_entry_table(_Entry) ->
error.
is_merged_table(Table) ->
case mria_schema:get_merged_table_check_spec(Table) of
{ok, _MatchSpec} -> true;
_ -> false
end.

Copilot uses AI. Check for mistakes.
Comment thread src/mria_schema.erl
Comment thread src/mria_schema.erl
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants