Describe the bug
I have two Arm machines: one on GCP and one internal. This test passes on the internal machine and fails on the GCP machine. As far as I can tell, this failure is exactly what the test is designed to catch: the scheduler isn't cleaning up one of the inputs, so it hangs.
I'm not sure what other debugging I should do here.
To Reproduce
Steps to reproduce the behavior:
make test
Expected behavior
The test should pass.
Screenshots or Pasted Text
Here's the "tool errors" subtest from the log on the broken machine. Note that I've added
diff --git a/clients/drcachesim/scheduler/scheduler_impl.cpp b/clients/drcachesim/scheduler/scheduler_impl.cpp
index 3c570b6e5..d04623bc0 100644
--- a/clients/drcachesim/scheduler/scheduler_impl.cpp
+++ b/clients/drcachesim/scheduler/scheduler_impl.cpp
@@ -3445,7 +3445,7 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::mark_input_eof(input_info_t &inpu
live_input_count_.fetch_add(-1, std::memory_order_release);
assert(old_count > 0);
int live_inputs = live_input_count_.load(std::memory_order_acquire);
- VPRINT(this, 2, "input %d at eof; %d live inputs left\n", input.index, live_inputs);
+ VPRINT(this, 1, "input %d at eof; %d live inputs left\n", input.index, live_inputs);
if (options_.mapping == sched_type_t::MAP_TO_ANY_OUTPUT &&
live_inputs <=
static_cast<int>(inputs_.size() * options_.exit_if_fraction_inputs_left)) {
to make that message more verbose as I was trying to see what's going on.
----------------
Testing tool errors
[scheduler] Scheduler configuration:
[scheduler] Inputs : 5
[scheduler] Outputs : 2
[scheduler] mapping : 2
[scheduler] deps : 0
[scheduler] flags : 0x00000002
[scheduler] quantum_unit : 0
[scheduler] quantum_duration : 0
[scheduler] verbosity : 1
[scheduler] schedule_record_ostream : (nil)
[scheduler] schedule_replay_istream : (nil)
[scheduler] replay_as_traced_istream : (nil)
[scheduler] syscall_switch_threshold : 30000000
[scheduler] blocking_switch_threshold : 500
[scheduler] block_time_scale : 0.000000
[scheduler] block_time_max : 0
[scheduler] kernel_switch_trace_path :
[scheduler] kernel_switch_reader : (nil)
[scheduler] kernel_switch_reader_end : (nil)
[scheduler] single_lockstep_output : 0
[scheduler] randomize_next_input : 0
[scheduler] read_inputs_in_init : 1
[scheduler] honor_direct_switches : 1
[scheduler] time_units_per_us : 1000.000000
[scheduler] quantum_duration_us : 5000
[scheduler] quantum_duration_instrs : 10000000
[scheduler] block_time_multiplier : 0.100000
[scheduler] block_time_max_us : 2500
[scheduler] migration_threshold_us : 500
[scheduler] rebalance_period_us : 50000
[scheduler] honor_infinite_timeouts : 0
[scheduler] exit_if_fraction_inputs_left : 0.100000
[scheduler] kernel_syscall_trace_path :
[scheduler] kernel_syscall_reader : (nil)
[scheduler] kernel_syscall_reader_end : (nil)
[scheduler] Reading headers from inputs to find filetypes
[scheduler] Output 0 triggered a rebalance @0:
[analyzer] Creating 2 worker threads
[analyzer] Worker 0 starting on trace shard 0 stream is 0xaaaac49b8390
[analyzer] Worker 1 starting on trace shard 1 stream is 0xaaaac49b8638
[scheduler] input 0 at eof; 4 live inputs left
[scheduler] input 1 at eof; 3 live inputs left
[scheduler] input 2 at eof; 2 live inputs left
[scheduler] input 3 at eof; 1 live inputs left
[analyzer] Worker 0 hit shard memref error cpuid not supported on trace shard
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @500013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @1000013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @1500013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @2000013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @2500013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @3000013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @3500013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @4000013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @4500013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @5000013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @5500013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @6000013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @6500013: running #-1; 0 in queue; 0 blocked
[scheduler] Queue snapshot: inputs: 1 schedulable, 0 unscheduled, 4 eof
out #0 @28: running #-1; 0 in queue; 0 blocked
out #1 @7000013: running #-1; 0 in queue; 0 blocked
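As a sanity check on the numbers in this log, here is a minimal sketch of the early-exit comparison visible in the diff context above, using the logged values (5 inputs, exit_if_fraction_inputs_left of 0.1). The helper names early_exit_threshold and early_exit_would_trigger are hypothetical and not part of the scheduler; the sketch only mirrors the `live_inputs <= static_cast<int>(inputs_.size() * fraction)` arithmetic and leaves out the mapping check and everything else in mark_input_eof.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a scheduler API): mirrors the right-hand side of
// the comparison in the diff context,
//   static_cast<int>(inputs_.size() * options_.exit_if_fraction_inputs_left)
int
early_exit_threshold(std::size_t num_inputs, double fraction)
{
    assert(fraction >= 0.0 && fraction <= 1.0);
    // The product is a double; the cast truncates toward zero.
    return static_cast<int>(num_inputs * fraction);
}

// Hypothetical helper: true if the "few enough live inputs" half of the
// condition holds. The mapping == MAP_TO_ANY_OUTPUT half is omitted here.
bool
early_exit_would_trigger(int live_inputs, std::size_t num_inputs, double fraction)
{
    return live_inputs <= early_exit_threshold(num_inputs, fraction);
}
```

With the logged configuration, early_exit_threshold(5, 0.1) truncates 0.5 to 0, so early_exit_would_trigger(1, 5, 0.1) is false: with 1 live input remaining, this comparison alone would not end the run early, and it would only hold once the live count reaches 0. This says nothing about why the fifth input never reaches eof; it only suggests the fraction-based exit is not what is expected to rescue the run here.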
Versions
$ git log --pretty=oneline | head -n1
1cec2631a8a289e08c3070c57d568a9587a4cee8 i#7685 DrPoints: add inline counter update for AARCH64 (#7737)
$ uname -a
Linux dynamorio-ubuntu-20-arm 5.15.0-1096-gcp #105~20.04.1-Ubuntu SMP Wed Oct 22 06:50:03 UTC 2025 aarch64 aarch64 aarch64 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"