fix: stabilize //rs/dogecoin/ckdoge/minter:integration_tests #8978
Open
basvandijk wants to merge 3 commits into master
The //rs/dogecoin/ckdoge/minter:integration_tests target often fails or times out. Since the test times out so often, increase its timeout to long (15m).

In addition, we downloaded the logs of its last non-successful runs and prompted Claude Opus 4.6. It responded with the following Root Cause Analysis and accompanying fix:
fix: deflake //rs/dogecoin/ckdoge/minter:integration_tests

Fixes two independent root causes that made this test target fail or time out on nearly every CI run.
Root Cause Analysis
Analysis based on non-successful runs from the last week (logs/integration_tests/2026-02-21T09:59:02/README.md) (65 entries). Three failure modes were observed:

1. Timeout: should_cancel_and_reimburse_large_withdrawal (most impactful)

The test created 1,900 UTXOs to trigger a TooManyInputs reimbursement (limit: 1,000 inputs). Each UTXO requires multiple cross-canister calls in PocketIC during minter_update_balance() (KYT check + ledger mint), resulting in ~3,800+ inter-canister messages. This consistently exceeded the 5-minute Bazel test timeout.

The test intended to set max_num_inputs_in_transaction: Some(500) via the init args to use fewer UTXOs, but the ckBTC minter's From<InitArgs> for CkBtcMinterState implementation silently ignores this field and always uses DEFAULT_MAX_NUM_INPUTS_IN_TRANSACTION (1,000). The field is only respected during upgrades (via reinit). This meant the test needed >1,000 UTXOs to trigger the error, making it infeasibly slow.

Fix: Set max_num_inputs_in_transaction to 100 via a canister upgrade (which does respect the arg), then use only 120 UTXOs. The test exercises the exact same TooManyInputs error path with 101 inputs > 100 max. Runtime dropped from >5 minutes (timeout) to ~50 seconds.

2. Port conflict: dogecoind fails to bind (intermittent)
Port allocation in Daemon::new has a TOCTOU race: it binds port 0 to get a free port, records the number, drops the listener, then starts the daemon on that port. When 15 tests run in parallel, another process can grab the port in between, causing dogecoind to fail to bind. This caused a random test (whichever happened to set up last) to panic at Daemon::new.

Fix: Added retry logic (up to 3 attempts) to Daemon::new. On startup failure (early exit or timeout waiting for "Done loading"), it kills the process, cleans up the data directory, allocates fresh ports, and retries.

3. PocketIC server panics (external, not addressed here)
Some runs from the ic-nervous-system-wasms branch showed 14/15 tests failing with panics inside pocket_ic_server. These were caused by an unrelated broken PocketIC build on that branch and are not addressed by this PR.

Changes
rs/dogecoin/ckdoge/minter/tests/tests.rs: should_cancel_and_reimburse_large_withdrawal now uses 120 UTXOs with max_num_inputs=100 (set via upgrade) instead of 1,900 UTXOs with max_num_inputs=1,000
rs/bitcoin/adapter/test_utils/src/bitcoind.rs: retry logic in Daemon::new for resilience against transient port conflicts
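
For reviewers unfamiliar with the port race, the retry approach described for Daemon::new can be sketched roughly as below. This is a minimal illustration using only the standard library, not the code in bitcoind.rs: pick_free_port and start_with_retry are hypothetical names, and the start closure is an assumed stand-in for launching dogecoind and waiting for "Done loading" (in the real fix, a failed attempt also kills the process and cleans up the data directory).

```rust
use std::io;
use std::net::TcpListener;

/// Hypothetical helper: ask the OS for a currently free port by binding port 0.
/// The port is only reserved while the listener is alive. Once the listener is
/// dropped at the end of this function, another process may grab the port
/// before the daemon binds it (the TOCTOU window).
fn pick_free_port() -> io::Result<u16> {
    let listener = TcpListener::bind(("127.0.0.1", 0))?;
    let port = listener.local_addr()?.port();
    Ok(port)
}

/// Hypothetical helper: call `start` with a freshly allocated port, retrying
/// up to `max_attempts` times. Returns the last error if every attempt fails.
fn start_with_retry<T, F>(max_attempts: u32, mut start: F) -> io::Result<T>
where
    F: FnMut(u16) -> io::Result<T>,
{
    let mut last_err = io::Error::new(io::ErrorKind::Other, "no attempts made");
    for attempt in 1..=max_attempts {
        // Allocate a new port on every attempt instead of reusing the old one.
        let port = pick_free_port()?;
        match start(port) {
            Ok(daemon) => return Ok(daemon),
            Err(err) => {
                eprintln!("attempt {attempt}/{max_attempts} on port {port} failed: {err}");
                last_err = err;
            }
        }
    }
    Err(last_err)
}

fn main() {
    // Simulate a daemon that loses the port race twice, then starts.
    let mut calls = 0;
    let result = start_with_retry(3, |port| {
        calls += 1;
        if calls < 3 {
            Err(io::Error::new(io::ErrorKind::AddrInUse, "simulated port conflict"))
        } else {
            Ok(port)
        }
    });
    println!("result: {:?} after {calls} calls", result);
}
```

Retrying does not remove the race; it only makes it unlikely that the same test loses it three times in a row, which is why the PR describes the change as resilience against transient conflicts.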