Skip to content

Commit cd1cb33

Browse files
Valentin Schneideringomolnar
authored andcommitted
sched/topology: Don't try to build empty sched domains
Turns out hotplugging CPUs that are in exclusive cpusets can lead to the cpuset code feeding empty cpumasks to the sched domain rebuild machinery. This leads to the following splat: Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 235 Comm: kworker/5:2 Not tainted 5.4.0-rc1-00005-g8d495477d62e #23 Hardware name: ARM Juno development board (r0) (DT) Workqueue: events cpuset_hotplug_workfn pstate: 60000005 (nZCv daif -PAN -UAO) pc : build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969) lr : build_sched_domains (kernel/sched/topology.c:1966) Call trace: build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969) partition_sched_domains_locked (kernel/sched/topology.c:2250) rebuild_sched_domains_locked (./include/linux/bitmap.h:370 ./include/linux/cpumask.h:538 kernel/cgroup/cpuset.c:955 kernel/cgroup/cpuset.c:978 kernel/cgroup/cpuset.c:1019) rebuild_sched_domains (kernel/cgroup/cpuset.c:1032) cpuset_hotplug_workfn (kernel/cgroup/cpuset.c:3205 (discriminator 2)) process_one_work (./arch/arm64/include/asm/jump_label.h:21 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:114 kernel/workqueue.c:2274) worker_thread (./include/linux/compiler.h:199 ./include/linux/list.h:268 kernel/workqueue.c:2416) kthread (kernel/kthread.c:255) ret_from_fork (arch/arm64/kernel/entry.S:1167) Code: f860dae2 912802d6 aa1603e1 12800000 (f8616853) The faulty line in question is: cap = arch_scale_cpu_capacity(cpumask_first(cpu_map)); and we're not checking the return value against nr_cpu_ids (we shouldn't have to!), which leads to the above. Prevent generate_sched_domains() from returning empty cpumasks, and add some assertion in build_sched_domains() to scream bloody murder if it happens again. The above splat was obtained on my Juno r0 with the following reproducer: $ cgcreate -g cpuset:asym $ cgset -r cpuset.cpus=0-3 asym $ cgset -r cpuset.mems=0 asym $ cgset -r cpuset.cpu_exclusive=1 asym $ cgcreate -g cpuset:smp $ cgset -r cpuset.cpus=4-5 smp $ cgset -r cpuset.mems=0 smp $ cgset -r cpuset.cpu_exclusive=1 smp $ cgset -r cpuset.sched_load_balance=0 . $ echo 0 > /sys/devices/system/cpu/cpu4/online $ echo 0 > /sys/devices/system/cpu/cpu5/online Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dietmar.Eggemann@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hannes@cmpxchg.org Cc: lizefan@huawei.com Cc: morten.rasmussen@arm.com Cc: qperret@google.com Cc: tj@kernel.org Cc: vincent.guittot@linaro.org Fixes: 05484e0 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection") Link: https://lkml.kernel.org/r/20191023153745.19515-2-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
1 parent 8005803 commit cd1cb33

2 files changed

Lines changed: 6 additions & 2 deletions

File tree

kernel/cgroup/cpuset.c

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -798,7 +798,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
798798
cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
799799
continue;
800800

801-
if (is_sched_load_balance(cp))
801+
if (is_sched_load_balance(cp) &&
802+
!cpumask_empty(cp->effective_cpus))
802803
csa[csn++] = cp;
803804

804805
/* skip @cp's subtree if not a partition root */

kernel/sched/topology.c

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1948,14 +1948,17 @@ static struct sched_domain_topology_level
19481948
static int
19491949
build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *attr)
19501950
{
1951-
enum s_alloc alloc_state;
1951+
enum s_alloc alloc_state = sa_none;
19521952
struct sched_domain *sd;
19531953
struct s_data d;
19541954
struct rq *rq = NULL;
19551955
int i, ret = -ENOMEM;
19561956
struct sched_domain_topology_level *tl_asym;
19571957
bool has_asym = false;
19581958

1959+
if (WARN_ON(cpumask_empty(cpu_map)))
1960+
goto error;
1961+
19591962
alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
19601963
if (alloc_state != sa_rootdomain)
19611964
goto error;

0 commit comments

Comments
 (0)