Message ID | 20240621075045.249798-1-lingyue@xiaomi.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: smp: do not allocate CPU IDs to invalid CPU nodes | expand |
On Fri, Jun 21, 2024 at 03:50:45PM +0800, Lingyue wrote: > Many modules, such as arch topology, rely on num_possible_cpus() to > allocate memory and then access the allocated space using CPU IDs. > These modules assume that there are no gaps in cpu_possible_mask. Is there any documented requirement that cpu_possible_mask has no gaps? It looks like other architectures can have gaps in their cpu_possible_mask, there's no documented requiremetns AFAICT, and there are a bunch of commits handling cpu_possible_mask having gaps, e.g. bc75e99983df1efd ("rcu: Correctly handle sparse possible cpus") 3da43104d3187184 ("ARC: Adjust cpuinfo for non-continuous cpu ids") 72917235fd5f0863 ("tracing: Fix for non-continuous cpu ids") ... so I don't think that the topology code should assume that there are no gaps in cpu_possible_mask. > However, in of_parse_and_init_cpus(), CPU IDs are still allocated > for invalid CPU nodes, leading to gaps in cpu_possible_mask and > resulting in out-of-bounds memory access. So it is crucial to avoid > allocating CPU IDs to invalid CPU nodes. AFAICT the topology code could use 'nr_cpu_ids' instead of 'nr_possible_cpus()', like the tracing commit above, or it could use a per-cpu allocation to avoid this. > This issue can be reproduced easily on QEMU with KASAN enabled, by > modifing reg property of a CPU node to 0xFFFFFFFF > > [ 0.197756] BUG: KASAN: slab-out-of-bounds in topology_normalize_cpu_scale.part.0+0x2cc/0x34c > [ 0.199518] Read of size 4 at addr ffff000007ebe924 by task swapper/0/1 > [ 0.200087] > [ 0.200739] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc4 #3 > [ 0.201647] Hardware name: linux,dummy-virt (DT) > [ 0.203067] Call trace: > [ 0.203404] dump_backtrace+0x90/0xe8 > [ 0.203974] show_stack+0x18/0x24 > [ 0.204424] dump_stack_lvl+0x78/0x90 > [ 0.205090] print_report+0x114/0x5cc > [ 0.205908] kasan_report+0xa4/0xf0 > [ 0.206488] __asan_report_load4_noabort+0x20/0x2c > [ 0.207427] topology_normalize_cpu_scale.part.0+0x2cc/0x34c > [ 0.208275] init_cpu_topology+0x254/0x430 > [ 0.209518] smp_prepare_cpus+0x20/0x25c > [ 0.210824] kernel_init_freeable+0x1dc/0x4fc > [ 0.212047] kernel_init+0x24/0x1ec > [ 0.213143] ret_from_fork+0x10/0x20 > > Signed-off-by: Lingyue <lingyue@xiaomi.com> > --- > arch/arm64/kernel/smp.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c > index 31c8b3094dd7..5b4178145920 100644 > --- a/arch/arm64/kernel/smp.c > +++ b/arch/arm64/kernel/smp.c > @@ -638,12 +638,12 @@ static void __init of_parse_and_init_cpus(void) > u64 hwid = of_get_cpu_hwid(dn, 0); > > if (hwid & ~MPIDR_HWID_BITMASK) > - goto next; > + continue; > > if (is_mpidr_duplicate(cpu_count, hwid)) { > pr_err("%pOF: duplicate cpu reg properties in the DT\n", > dn); > - goto next; > + continue; > } > > /* > @@ -656,7 +656,7 @@ static void __init of_parse_and_init_cpus(void) > if (bootcpu_valid) { > pr_err("%pOF: duplicate boot cpu reg property in DT\n", > dn); > - goto next; > + continue; > } > People get very upset when CPU numbering changes, so I'd prefer to avoid this if possible. Mark. > bootcpu_valid = true; > -- > 2.34.1 >
On Fri, Jun 21, 2024 at 11:12:06AM +0100, Mark Rutland wrote: > On Fri, Jun 21, 2024 at 03:50:45PM +0800, Lingyue wrote: > > Many modules, such as arch topology, rely on num_possible_cpus() to > > allocate memory and then access the allocated space using CPU IDs. > > These modules assume that there are no gaps in cpu_possible_mask. > > Is there any documented requirement that cpu_possible_mask has no gaps? > > It looks like other architectures can have gaps in their > cpu_possible_mask, there's no documented requiremetns AFAICT, and there > are a bunch of commits handling cpu_possible_mask having gaps, e.g. > > bc75e99983df1efd ("rcu: Correctly handle sparse possible cpus") > 3da43104d3187184 ("ARC: Adjust cpuinfo for non-continuous cpu ids") > 72917235fd5f0863 ("tracing: Fix for non-continuous cpu ids") > > ... so I don't think that the topology code should assume that there are > no gaps in cpu_possible_mask. > Yes, I also don't find any documented requirement about it. > > However, in of_parse_and_init_cpus(), CPU IDs are still allocated > > for invalid CPU nodes, leading to gaps in cpu_possible_mask and > > resulting in out-of-bounds memory access. So it is crucial to avoid > > allocating CPU IDs to invalid CPU nodes. > > AFAICT the topology code could use 'nr_cpu_ids' instead of > 'nr_possible_cpus()', like the tracing commit above, or it could use a > per-cpu allocation to avoid this. > In this case, of course we can modify the arch topology code to solve the problem. However, I propose that if we can ensure there are no gaps in the cpu_possible_mask, we can solve such misuse issues once and for all, without having to dig and fix other potential similar problems one by one. > > This issue can be reproduced easily on QEMU with KASAN enabled, by > > modifing reg property of a CPU node to 0xFFFFFFFF > > > > [ 0.197756] BUG: KASAN: slab-out-of-bounds in topology_normalize_cpu_scale.part.0+0x2cc/0x34c > > [ 0.199518] Read of size 4 at addr ffff000007ebe924 by task swapper/0/1 > > [ 0.200087] > > [ 0.200739] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc4 #3 > > [ 0.201647] Hardware name: linux,dummy-virt (DT) > > [ 0.203067] Call trace: > > [ 0.203404] dump_backtrace+0x90/0xe8 > > [ 0.203974] show_stack+0x18/0x24 > > [ 0.204424] dump_stack_lvl+0x78/0x90 > > [ 0.205090] print_report+0x114/0x5cc > > [ 0.205908] kasan_report+0xa4/0xf0 > > [ 0.206488] __asan_report_load4_noabort+0x20/0x2c > > [ 0.207427] topology_normalize_cpu_scale.part.0+0x2cc/0x34c > > [ 0.208275] init_cpu_topology+0x254/0x430 > > [ 0.209518] smp_prepare_cpus+0x20/0x25c > > [ 0.210824] kernel_init_freeable+0x1dc/0x4fc > > [ 0.212047] kernel_init+0x24/0x1ec > > [ 0.213143] ret_from_fork+0x10/0x20 > > > > Signed-off-by: Lingyue <lingyue@xiaomi.com> > > --- > > arch/arm64/kernel/smp.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c > > index 31c8b3094dd7..5b4178145920 100644 > > --- a/arch/arm64/kernel/smp.c > > +++ b/arch/arm64/kernel/smp.c > > @@ -638,12 +638,12 @@ static void __init of_parse_and_init_cpus(void) > > u64 hwid = of_get_cpu_hwid(dn, 0); > > > > if (hwid & ~MPIDR_HWID_BITMASK) > > - goto next; > > + continue; > > > > if (is_mpidr_duplicate(cpu_count, hwid)) { > > pr_err("%pOF: duplicate cpu reg properties in the DT\n", > > dn); > > - goto next; > > + continue; > > } > > > > /* > > @@ -656,7 +656,7 @@ static void __init of_parse_and_init_cpus(void) > > if (bootcpu_valid) { > > pr_err("%pOF: duplicate boot cpu reg property in DT\n", > > dn); > > - goto next; > > + continue; > > } > > > > People get very upset when CPU numbering changes, so I'd prefer to avoid > this if possible. > > Mark. > This modification will only affect CPU ID allocation if there are invalid or duplicate CPU nodes in device tree. IMO, it is not a typical use case, but please let me know if there are any other use case. Many thanks for your response. Lingyue. > > bootcpu_valid = true; > > -- > > 2.34.1 > >
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 31c8b3094dd7..5b4178145920 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -638,12 +638,12 @@ static void __init of_parse_and_init_cpus(void) u64 hwid = of_get_cpu_hwid(dn, 0); if (hwid & ~MPIDR_HWID_BITMASK) - goto next; + continue; if (is_mpidr_duplicate(cpu_count, hwid)) { pr_err("%pOF: duplicate cpu reg properties in the DT\n", dn); - goto next; + continue; } /* @@ -656,7 +656,7 @@ static void __init of_parse_and_init_cpus(void) if (bootcpu_valid) { pr_err("%pOF: duplicate boot cpu reg property in DT\n", dn); - goto next; + continue; } bootcpu_valid = true;
Many modules, such as arch topology, rely on num_possible_cpus() to allocate memory and then access the allocated space using CPU IDs. These modules assume that there are no gaps in cpu_possible_mask. However, in of_parse_and_init_cpus(), CPU IDs are still allocated for invalid CPU nodes, leading to gaps in cpu_possible_mask and resulting in out-of-bounds memory access. So it is crucial to avoid allocating CPU IDs to invalid CPU nodes. This issue can be reproduced easily on QEMU with KASAN enabled, by modifing reg property of a CPU node to 0xFFFFFFFF [ 0.197756] BUG: KASAN: slab-out-of-bounds in topology_normalize_cpu_scale.part.0+0x2cc/0x34c [ 0.199518] Read of size 4 at addr ffff000007ebe924 by task swapper/0/1 [ 0.200087] [ 0.200739] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc4 #3 [ 0.201647] Hardware name: linux,dummy-virt (DT) [ 0.203067] Call trace: [ 0.203404] dump_backtrace+0x90/0xe8 [ 0.203974] show_stack+0x18/0x24 [ 0.204424] dump_stack_lvl+0x78/0x90 [ 0.205090] print_report+0x114/0x5cc [ 0.205908] kasan_report+0xa4/0xf0 [ 0.206488] __asan_report_load4_noabort+0x20/0x2c [ 0.207427] topology_normalize_cpu_scale.part.0+0x2cc/0x34c [ 0.208275] init_cpu_topology+0x254/0x430 [ 0.209518] smp_prepare_cpus+0x20/0x25c [ 0.210824] kernel_init_freeable+0x1dc/0x4fc [ 0.212047] kernel_init+0x24/0x1ec [ 0.213143] ret_from_fork+0x10/0x20 Signed-off-by: Lingyue <lingyue@xiaomi.com> --- arch/arm64/kernel/smp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)