You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Kernel testing triggered this warning:
| WARNING: CPU: 0 PID: 13 at kernel/sched/core.c:1156 do_set_cpus_allowed+0x7e/0x80()
| Modules linked in:
| CPU: 0 PID: 13 Comm: migration/0 Not tainted 4.2.0-rc1-00049-g25834c7 openbmc#2
| Call Trace:
| dump_stack+0x4b/0x75
| warn_slowpath_common+0x8b/0xc0
| warn_slowpath_null+0x22/0x30
| do_set_cpus_allowed+0x7e/0x80
| cpuset_cpus_allowed_fallback+0x7c/0x170
| select_fallback_rq+0x221/0x280
| migration_call+0xe3/0x250
| notifier_call_chain+0x53/0x70
| __raw_notifier_call_chain+0x1e/0x30
| cpu_notify+0x28/0x50
| take_cpu_down+0x22/0x40
| multi_cpu_stop+0xd5/0x140
| cpu_stopper_thread+0xbc/0x170
| smpboot_thread_fn+0x174/0x2f0
| kthread+0xc4/0xe0
| ret_from_kernel_thread+0x21/0x30
As Peterz pointed out:
| So the normal rules for changing task_struct::cpus_allowed are holding
| both pi_lock and rq->lock, such that holding either stabilizes the mask.
|
| This is so that wakeup can happen without rq->lock and load-balance
| without pi_lock.
|
| From this we already get the relaxation that we can omit acquiring
| rq->lock if the task is not on the rq, because in that case
| load-balancing will not apply to it.
|
| ** these are the rules currently tested in do_set_cpus_allowed() **
|
| Now, since __set_cpus_allowed_ptr() uses task_rq_lock() which
| unconditionally acquires both locks, we could get away with holding just
| rq->lock when on_rq for modification because that'd still exclude
| __set_cpus_allowed_ptr(), it would also work against
| __kthread_bind_mask() because that assumes !on_rq.
|
| That said, this is all somewhat fragile.
|
| Now, I don't think dropping rq->lock is quite as disastrous as it
| usually is because !cpu_active at this point, which means load-balance
| will not interfere, but that too is somewhat fragile.
|
| So we end up with a choice of two fragile..
This patch fixes it by following the rules for changing
task_struct::cpus_allowed with both pi_lock and rq->lock held.
Reported-by: kernel test robot <[email protected]>
Reported-by: Sasha Levin <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
[ Modified changelog and patch. ]
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
0 commit comments