grooverdan
Member

Container prep work for new builder (and a general aarch64 rr capability)

On aarch64, g++-multilib isn't an available package. It was only required to provide 32-bit rr replay capability.

I don't think we've needed that capability, so we can just remove it.

@grooverdan
Member Author

Builds successfully on native arm64.

@grooverdan grooverdan marked this pull request as draft June 20, 2025 03:02
@RazvanLiviuVarzaru
Collaborator

Will this replace amd64-msan-clang-20-debug?

@grooverdan
Member Author

> Will this replace amd64-msan-clang-20-debug?

That was my thought provided there's infra for it. Getting an auto-built working container is useful regardless.

@RazvanLiviuVarzaru
Collaborator

> > Will this replace amd64-msan-clang-20-debug?
>
> That was my thought provided there's infra for it. Getting an auto-built working container is useful regardless.

At first sight, there's room on our arm servers from Hetzner.
How much room depends on what you want: if this builder is to be part of Branch Protection, build speed and stability become critical and I need to look more closely at resource consumption.


@grooverdan grooverdan force-pushed the MDBF-1076 branch 2 times, most recently from 425318b to 4d2a376 Compare August 27, 2025 06:43
@grooverdan grooverdan force-pushed the MDBF-1076 branch 2 times, most recently from 09ac84b to 471f00e Compare September 5, 2025 05:50
@grooverdan grooverdan changed the title from "MDBF-1076: Create MSAN Debug builder (for aarch64)" to "MDBF-1076: Create MSAN Debug builder" Sep 12, 2025
@grooverdan grooverdan force-pushed the MDBF-1076 branch 4 times, most recently from fba4350 to 0d5576f Compare September 12, 2025 08:09
@grooverdan grooverdan marked this pull request as ready for review September 12, 2025 08:13
@grooverdan
Member Author

Once this is merged, #823 can be merged too; it will rebuild the MSAN container again with the MOTD updates.

grooverdan and others added 5 commits September 17, 2025 11:54
qemu insufficient to run aarch64 compiles
Until MDEV-36723 is merged up Spider is Debug only.
Since it was defined as a sequence in master-migration,
its definition is no longer needed in `master-docker-nonstandard-2`
Tests showed 2 hours for parallel = 6 during MTR:
https://buildbot.dev.mariadb.org/#/builders/535/builds/7

Quite unacceptable.

Increasing parallel for the builder will reduce a host's capacity to build a whole push if max_worker_jobs < requested_jobs.

Current CPU/MEM utilisation over 1 month for hz-bbw8 and 9 is:
- CPU Max around 70 %
- MEM Max around 30 %

This means we can allocate more jobs than there are CPUs available.
Increased to 110 so that a host will continue to build a whole push.
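
The capacity condition from the commit message above (`max_worker_jobs < requested_jobs`) can be illustrated with a minimal sketch. The function name, the previous limit of 96 and the total of 88 jobs for the other builders are hypothetical values chosen for illustration; only the parallel values of 6 and 20 and the new limit of 110 appear in this thread, and treating a builder's parallel value as its job count is an assumption.

```python
def host_can_build_whole_push(max_worker_jobs: int, requested_jobs: int) -> bool:
    """Return True when the host's job limit covers everything one push requests."""
    return requested_jobs <= max_worker_jobs


# Hypothetical numbers: 88 jobs for all other builders on the host and an
# old limit of 96 are made up. Only 6, 20 and 110 come from the thread.
other_builders_jobs = 88
old_limit = 96

for msan_jobs, limit in [(6, old_limit), (20, old_limit), (20, 110)]:
    requested = other_builders_jobs + msan_jobs
    fits = host_can_build_whole_push(limit, requested)
    print(f"msan jobs={msan_jobs} limit={limit} requested={requested} fits={fits}")
```

With these made-up figures, raising the builder from 6 to 20 jobs no longer fits under the old limit, while a limit of 110 lets the host take the whole push again.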
@RazvanLiviuVarzaru
Collaborator

RazvanLiviuVarzaru commented Sep 17, 2025

10.11 - MTR nm 2h - not acceptable - https://buildbot.dev.mariadb.org/#/builders/535/builds/7
10.11 - after a3f2587 - https://buildbot.dev.mariadb.org/#/builders/535/builds/8
11.8 - MariaDB/server@ab6bcd8 not in 11.8 yet https://buildbot.dev.mariadb.org/#/builders/535/builds/6

@grooverdan Does running with big-test in a debug MSAN setup increase coverage compared with the already defined big-test builder https://buildbot.mariadb.org/#/builders/914 or the big-test setup on the non-debug clang-20 MSAN builder https://buildbot.mariadb.org/#/builders/867/builds/3842/steps/6/logs/stdio ?

Alternatively, which are the most time-consuming tests that bring the least value in this setup?

For now I can't increase parallel over 20, which is a lot already, and the running time is still over 45 minutes for the default MTR test suites alone.

@grooverdan
Member Author

grooverdan commented Sep 18, 2025

From the run 7:

https://docs.google.com/spreadsheets/d/1ew284jTiPIUhN2DyXLkv5VfAOpc-jEwl61bfOo-zf8s

Taking the tests that run in under 3 minutes, the ideal time for the 12 builders is 53 minutes (row 6910).

The tests over 3 minutes (57 of them) would ideally add an extra 25 minutes for the 12 builders (row 6968), but with the larger tests at ~10 minutes it would probably be more.
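
Why long tests push the real time above an even split can be sketched as follows. This is only an illustration with made-up durations; it uses a greedy longest-first assignment, which is an assumption and not necessarily how MTR hands tests to its workers.

```python
import heapq


def even_split_bound(durations, workers):
    """Lower bound on wall time: the even split, but never below the longest test."""
    return max(sum(durations) / workers, max(durations))


def greedy_makespan(durations, workers):
    """Assign tests longest-first to the least loaded worker; return the wall time."""
    loads = [0.0] * workers
    heapq.heapify(loads)
    for d in sorted(durations, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + d)
    return max(loads)


# Hypothetical test durations in minutes, not taken from the spreadsheet.
durations = [10, 3, 3, 2]
print(sum(durations) / 3)              # even split over 3 workers: 6.0
print(even_split_bound(durations, 3))  # the 10 minute test dominates: 10
print(greedy_makespan(durations, 3))   # 10.0
```

In this toy case the even split suggests 6 minutes, yet a single 10 minute test keeps one worker busy for 10 minutes regardless of how the rest is distributed.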

The spreadsheet's ideal split per worker (E6969) is 79 minutes, which scales up to 2h11min in practice, so the real-world time is about 65% more than the ideal.
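
The scale-up figure can be checked with a few lines of arithmetic, using only the two numbers quoted above (79 minutes ideal, 2h11min scaled up); the variable names are illustrative.

```python
ideal_minutes = 79            # ideal split per worker (E6969)
actual_minutes = 2 * 60 + 11  # 2h11min

# Percentage by which the scaled-up time exceeds the ideal split.
overhead_pct = (actual_minutes / ideal_minutes - 1) * 100
print(f"{overhead_pct:.1f}% over the ideal split")  # 65.8% over the ideal split
```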

Run number 8, with parallel at 20, takes about the same time as the valgrind builder, but has the coverage of bigtest plus everything else that MSAN offers.

So options:

| option | why | pro | con |
|---|---|---|---|
| get more hardware | with "MDBF-143: Add infer" #835, "MDBF-1115 create UBASAN Debug builder" #839, and the gaps of ps-protocol/cursor/view, maybe we'll need it | capacity building, lowish implementation | cost/sponsorship |
| disable perfschema in build | not so much the ~400 tests as the overhead in execution throughout the server | simple enough; exact gain unknown | drops in coverage |
| push a c,cxx_flags Og, O1, O2 or O3 into the build | it's about testing the results rather than the most resolvable stack | simple; unknown benefit | debugging from artifacts a little harder, forks another custom build |
| disable bigtest | because it's easy | easy | drops large coverage |

Masking the ~20 tests that take 5+ minutes under Debug+MSAN isn't likely to push the run time low enough on its own.

@RazvanLiviuVarzaru RazvanLiviuVarzaru merged commit 4b101fe into MariaDB:dev Sep 18, 2025
4 checks passed
@grooverdan grooverdan deleted the MDBF-1076 branch September 18, 2025 07:37