
Infrastructure: Change db from mariadb to postgres #711


Open · wants to merge 21 commits into master

Conversation

@ConnorNelson (Member) commented May 30, 2025

This resolves #710.

TODO:

  • Fix dojo db backup
  • Fix dojo db restore
  • Resolve all PG-JSON TODOs (JSON is handled differently by SQLAlchemy on MariaDB vs. on PostgreSQL; see the sketch below)
  • Determine and document migration story
  • Improve slow queries
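
For the PG-JSON item, here is a minimal sketch of one way to normalize the difference, assuming a SQLAlchemy TypeDecorator is an acceptable approach (PortableJSON is a hypothetical name, not something in this PR): MariaDB stores JSON as text and can hand values back as strings, while PostgreSQL's JSON type deserializes them to dicts/lists.

import json
from sqlalchemy.types import JSON, TypeDecorator

class PortableJSON(TypeDecorator):
    # Hypothetical sketch: make JSON columns return parsed objects on both
    # backends. MariaDB may return the raw JSON text; PostgreSQL's JSON
    # type already returns dicts/lists.
    impl = JSON
    cache_ok = True

    def process_result_value(self, value, dialect):
        if isinstance(value, str):
            return json.loads(value)
        return value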

@ConnorNelson (Member Author)

As part of this PR, on production, I am going to put the DB back on the main node. Both the main node and the db node sit almost entirely idle.

@ConnorNelson (Member Author) commented Jun 11, 2025

Roughly speaking, here is the plan for doing the migration.

Let's assume we merge the PR, and do something like:

dojo compose down ctfd
dojo compose down db
dojo backup

We block anyone from connecting during the migration:

iptables -I INPUT -p tcp --dport  80 -j DROP 
iptables -I INPUT -p tcp --dport 443 -j DROP
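
To confirm the rules actually took effect before proceeding (assuming plain iptables rather than an nftables frontend; -C checks for an exact rule match):

iptables -C INPUT -p tcp --dport  80 -j DROP \
  && iptables -C INPUT -p tcp --dport 443 -j DROP \
  && echo "traffic blocked"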

Then we save off the backup to /tmp:

cp /data/backups/$(ls -th /data/backups | head -n 1) /tmp/db.sql.gz
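
Before tearing anything else down, it's worth checking that the copy is a valid gzip stream; gunzip -t tests integrity without writing any output:

gunzip -t /tmp/db.sql.gz && echo "backup archive OK"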

We'll want to update /data/config.env with some sensible new defaults like:

DB_HOST=db
DB_NAME=ctfd
DB_USER=ctfd
DB_PASS=ctfd

(and also remove DB_EXTERNAL)
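
One way to script those edits, assuming each DB_* key already appears in /data/config.env (adjust if a key is missing):

sed -i \
  -e 's/^DB_HOST=.*/DB_HOST=db/' \
  -e 's/^DB_NAME=.*/DB_NAME=ctfd/' \
  -e 's/^DB_USER=.*/DB_USER=ctfd/' \
  -e 's/^DB_PASS=.*/DB_PASS=ctfd/' \
  -e '/^DB_EXTERNAL=/d' \
  /data/config.env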

And bring everything up, with the updates:

git pull
dojo sync
dojo compose up -d --build db
dojo compose up -d --build --no-deps ctfd

Create a temporary MariaDB container into which we will load all the data; the extra flags disable binary logging and relax InnoDB durability to speed up this one-shot import.

docker run -d --name mariadb-tmp --network pwncollege_default \
  -e MYSQL_ROOT_PASSWORD=mypass \
  -e MYSQL_DATABASE=importdb \
  mariadb:10.4.12 \
  --skip-log-bin \
  --innodb_flush_log_at_trx_commit=0 --sync_binlog=0 --innodb_doublewrite=0 \
  --innodb_buffer_pool_size=1G \
  --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci

We need to wait for the temp db to finish setting up:

sleep 30
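
Instead of a fixed sleep, one can poll until the server answers (mysqladmin ships inside the official mariadb image):

until docker exec mariadb-tmp mysqladmin ping -u root -pmypass --silent; do
  sleep 2
done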

Load the backup into the temp db:

gunzip -c /tmp/db.sql.gz | docker exec -i mariadb-tmp mysql -u root -pmypass importdb

Now, migrate the data into postgres. We are intentionally ignoring the comments/files tables (which contain old data and cause migration issues).

cat <<EOF > pgloader.load
LOAD DATABASE
     FROM mysql://root:mypass@mariadb-tmp/importdb
     INTO postgresql://ctfd:ctfd@db/ctfd

 WITH data only, truncate, reset sequences, prefetch rows = 1000

 EXCLUDING TABLE NAMES MATCHING ~/^comments$/, ~/^files$/

 ALTER SCHEMA 'importdb' RENAME TO 'public';
EOF

docker run --rm --network pwncollege_default \
       -v $(pwd)/pgloader.load:/tmp/pgloader.load \
       dimitri/pgloader:latest pgloader /tmp/pgloader.load
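
Before reopening traffic, a quick sanity check is to compare row counts for a few key tables on both sides. This sketch assumes users/solves/challenges (from CTFd's schema) are representative, and uses a throwaway postgres:16 container on the same network as the client:

for t in users solves challenges; do
  m=$(docker exec mariadb-tmp mysql -N -u root -pmypass importdb -e "SELECT COUNT(*) FROM $t;")
  p=$(docker run --rm --network pwncollege_default postgres:16 \
        psql postgresql://ctfd:ctfd@db/ctfd -tAc "SELECT COUNT(*) FROM $t;")
  echo "$t: mariadb=$m postgres=$p"
done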

We should be good to go now. So, we let people connect again and clean up:

iptables -D INPUT -p tcp --dport  80 -j DROP
iptables -D INPUT -p tcp --dport 443 -j DROP
docker kill mariadb-tmp
docker rm mariadb-tmp

@ConnorNelson (Member Author)

Here's some profiling.

This compares the PR branch on a mostly-idle box (postgres) against master on production (mariadb).

It is unclear how much of the difference is due to the load conditions and how much is due to the database engines, but it should be noted that production was relatively quiet at the time of testing.


Dojo Stats

# Run in an IPython flask shell session, with the dojo models (e.g. Dojos)
# already in scope.
import os

from CTFd.plugins.dojo_plugin.utils.stats import get_dojo_stats

os.environ["CACHE_WARMER"] = "true"

for dojo_id in ["computing-101", "welcome", "intro-to-cybersecurity", "cse365-s2025"]:
    print(dojo_id)
    dojo = Dojos.from_id(dojo_id).first()
    %timeit -n1 -r3 get_dojo_stats(dojo)

This PR on a mostly-idle box (postgres):

computing-101
22.7 s ± 73.7 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
welcome
8.89 s ± 23.2 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
intro-to-cybersecurity
15.9 s ± 269 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
cse365-s2025
4.4 s ± 325 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)

Master on production (mariadb):

computing-101
18.9 s ± 435 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
welcome
7.06 s ± 36.4 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
intro-to-cybersecurity
19.3 s ± 217 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
cse365-s2025
1min 32s ± 462 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)

Dojo Scoreboard

import datetime

# db, Solves, and Users come from the same flask shell context as above.
def get_scoreboard_for(model, duration):
    duration_filter = (
        Solves.date >= datetime.datetime.utcnow() - datetime.timedelta(days=duration)
        if duration else True
    )
    solves = db.func.count().label("solves")
    rank = (
        db.func.row_number()
        .over(order_by=(solves.desc(), db.func.max(Solves.id)))
        .label("rank")
    )
    user_entities = [Solves.user_id, Users.name, Users.email]
    query = (
        model.solves()
        .filter(duration_filter)
        .group_by(*user_entities)
        .order_by(rank)
        .with_entities(rank, solves, *user_entities)
    )

    row_results = query.all()
    results = [{key: getattr(item, key) for key in item.keys()} for item in row_results]
    return results

for dojo_id in ["computing-101", "welcome", "intro-to-cybersecurity", "cse365-s2025"]:
    print(dojo_id)
    dojo = Dojos.from_id(dojo_id).first()
    %timeit -n1 -r3 get_scoreboard_for(dojo, None)

This PR on a mostly-idle box (postgres):

computing-101
2.77 s ± 60.1 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
welcome
3.15 s ± 20.3 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
intro-to-cybersecurity
2.71 s ± 13.1 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
cse365-s2025
3.46 s ± 143 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)

Master on production (mariadb):

computing-101
4.19 s ± 111 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
welcome
2.26 s ± 51.6 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
intro-to-cybersecurity
4.37 s ± 55 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
cse365-s2025
17.5 s ± 296 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)

Dojo Scores

from sqlalchemy import or_

# db and the models (Dojos, DojoChallenges, Solves) come from the shell context.
def scores_query(granularity, dojo_filter):
    solve_count = db.func.count(Solves.id).label("solve_count")
    last_solve_date = db.func.max(Solves.date).label("last_solve_date")
    fields = granularity + [ Solves.user_id, solve_count, last_solve_date ]
    grouping = granularity + [ Solves.user_id ]

    dsc_query = db.session.query(*fields).where(
        Dojos.dojo_id == DojoChallenges.dojo_id, DojoChallenges.challenge_id == Solves.challenge_id,
        dojo_filter
    ).group_by(*grouping).order_by(Dojos.id, solve_count.desc(), last_solve_date)

    return dsc_query

def dojo_scores():
    dsc_query = scores_query([Dojos.id], or_(Dojos.data["type"].astext == "public", Dojos.official))

    user_ranks = { }
    user_solves = { }
    dojo_ranks = { }
    for dojo_id, user_id, solve_count, _ in dsc_query:
        dojo_ranks.setdefault(dojo_id, [ ]).append(user_id)
        user_ranks.setdefault(user_id, {})[dojo_id] = len(dojo_ranks[dojo_id])
        user_solves.setdefault(user_id, {})[dojo_id] = solve_count

    return {
        "user_ranks": user_ranks,
        "user_solves": user_solves,
        "dojo_ranks": dojo_ranks
    }

%timeit -n1 -r3 dojo_scores()

This PR on a mostly-idle box (postgres):

24.2 s ± 884 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)

Master on production (mariadb):

1min 44s ± 530 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)

Module Scores

# scores_query repeated verbatim from the Dojo Scores section above.
def scores_query(granularity, dojo_filter):
    solve_count = db.func.count(Solves.id).label("solve_count")
    last_solve_date = db.func.max(Solves.date).label("last_solve_date")
    fields = granularity + [ Solves.user_id, solve_count, last_solve_date ]
    grouping = granularity + [ Solves.user_id ]

    dsc_query = db.session.query(*fields).where(
        Dojos.dojo_id == DojoChallenges.dojo_id, DojoChallenges.challenge_id == Solves.challenge_id,
        dojo_filter
    ).group_by(*grouping).order_by(Dojos.id, solve_count.desc(), last_solve_date)

    return dsc_query

def module_scores():
    dsc_query = scores_query([Dojos.id, DojoChallenges.module_index], or_(Dojos.data["type"].astext == "public", Dojos.official))

    user_ranks = { }
    user_solves = { }
    module_ranks = { }
    for dojo_id, module_idx, user_id, solve_count, _ in dsc_query:
        module_ranks.setdefault(dojo_id, {}).setdefault(module_idx, []).append(user_id)
        user_ranks.setdefault(user_id, {}).setdefault(dojo_id, {})[module_idx] = len(module_ranks[dojo_id][module_idx])
        user_solves.setdefault(user_id, {}).setdefault(dojo_id, {})[module_idx] = solve_count

    return {
        "user_ranks": user_ranks,
        "user_solves": user_solves,
        "module_ranks": module_ranks
    }

%timeit -n1 -r3 module_scores()

This PR on a mostly-idle box (postgres):

38.8 s ± 279 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)

Master on production (mariadb):

5min 43s ± 1.13 s per loop (mean ± std. dev. of 3 runs, 1 loop each)
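
For the "Improve slow queries" TODO, one generic way to inspect a plan (a sketch, not part of this PR; literal_binds can fail for exotic parameter types) is to compile a query to SQL and run EXPLAIN ANALYZE on it:

from sqlalchemy import text
from sqlalchemy.dialects import postgresql

query = scores_query([Dojos.id], or_(Dojos.data["type"].astext == "public", Dojos.official))
sql = str(query.statement.compile(dialect=postgresql.dialect(),
                                  compile_kwargs={"literal_binds": True}))
for row in db.session.execute(text(f"EXPLAIN ANALYZE {sql}")):
    print(row[0])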

Successfully merging this pull request may close these issues.

Migrate DB to PostgreSQL