Message ID: 20201124155039.13804-13-will@kernel.org (mailing list archive)
State: New, archived
Series: An alternative series for asymmetric AArch32 systems
On 11/24/20 15:50, Will Deacon wrote:
> If we want to support 32-bit applications, then when we identify a CPU
> with mismatched 32-bit EL0 support we must ensure that we will always
> have an active 32-bit CPU available to us from then on. This is important
> for the scheduler, because is_cpu_allowed() will be constrained to 32-bit
> CPUs for compat tasks and forced migration due to a hotplug event will
> hang if no 32-bit CPUs are available.
>
> On detecting a mismatch, prevent offlining of either the mismatching CPU
> if it is 32-bit capable, or find the first active 32-bit capable CPU
> otherwise.
            ^^^^^

You use cpumask_any_and(). Better use cpumask_first_and()? We have a truly
random function now, cpumask_any_and_distribute(), if you'd like to pick
something 'truly' random.

I tried to clean up the cpumask_any* API mess, but ended up with too much
headache instead. Killing off cpumask_any*() might be the easier option.
But that's something different..

> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kernel/cpufeature.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 29017cbb6c8e..fe470683b43e 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1237,6 +1237,8 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope)
>
>  static int enable_mismatched_32bit_el0(unsigned int cpu)
>  {
> +	static int lucky_winner = -1;
> +
>  	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu);
>  	bool cpu_32bit = id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0);
>
> @@ -1245,6 +1247,22 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
>  		static_branch_enable_cpuslocked(&arm64_mismatched_32bit_el0);
>  	}
>
> +	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
> +		return 0;

Hmm I'm struggling to get what you're doing here. You're treating CPU0 (the
boot CPU) specially here, but I don't get why?

> +
> +	if (lucky_winner >= 0)
> +		return 0;
> +
> +	/*
> +	 * We've detected a mismatch. We need to keep one of our CPUs with
> +	 * 32-bit EL0 online so that is_cpu_allowed() doesn't end up rejecting
> +	 * every CPU in the system for a 32-bit task.
> +	 */
> +	lucky_winner = cpu_32bit ? cpu : cpumask_any_and(cpu_32bit_el0_mask,
> +							 cpu_active_mask);

cpumask_any_and() could return an error. It could be hard or even impossible
to trigger, but better check that lucky_winner is not >= nr_cpu_ids before
calling get_cpu_device(lucky_winner) to stay on the safe side and avoid a
potential splat?

We can do better, by the way, and do a smarter check in remove_cpu() to
block offlining the last aarch32 capable CPU without 'hardcoding' a specific
cpu. But I won't insist and am happy to wait for someone to come complaining
this is not good enough first.

Some vendors play games with hotplug to help with saving power. They might
want to dynamically nominate the last man standing 32bit capable CPU. Again,
we can wait for someone to complain first I guess.

Cheers

--
Qais Yousef

> +	get_cpu_device(lucky_winner)->offline_disabled = true;
> +	pr_info("Asymmetric 32-bit EL0 support detected on CPU %u; CPU hot-unplug disabled on CPU %u\n",
> +		cpu, lucky_winner);
>  	return 0;
>  }
>
> --
> 2.29.2.454.gaff20da3a2-goog
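For readers following the cpumask discussion, here is a minimal sketch of
the three helpers being compared, plus the out-of-range check Qais suggests.
The function and mask names are illustrative and assume a kernel context;
this is not code from the patch:

	#include <linux/cpumask.h>
	#include <linux/printk.h>

	/* Illustrative only: pick a CPU from the intersection of two masks. */
	static void pick_cpu_sketch(const struct cpumask *capable,
				    const struct cpumask *active)
	{
		unsigned int cpu;

		cpu = cpumask_first_and(capable, active);	/* lowest set bit */
		cpu = cpumask_any_and(capable, active);		/* "don't care"; currently same as above */
		cpu = cpumask_any_and_distribute(capable, active); /* rotates across calls */

		/* All of these report an empty intersection as >= nr_cpu_ids. */
		if (cpu >= nr_cpu_ids) {
			pr_warn("no CPU in both masks\n");
			return;
		}

		pr_info("picked CPU %u\n", cpu);
	}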
On Fri, Nov 27, 2020 at 01:41:22PM +0000, Qais Yousef wrote:
> On 11/24/20 15:50, Will Deacon wrote:
> > If we want to support 32-bit applications, then when we identify a CPU
> > with mismatched 32-bit EL0 support we must ensure that we will always
> > have an active 32-bit CPU available to us from then on. This is important
> > for the scheduler, because is_cpu_allowed() will be constrained to 32-bit
> > CPUs for compat tasks and forced migration due to a hotplug event will
> > hang if no 32-bit CPUs are available.
> >
> > On detecting a mismatch, prevent offlining of either the mismatching CPU
> > if it is 32-bit capable, or find the first active 32-bit capable CPU
> > otherwise.
>             ^^^^^
>
> You use cpumask_any_and(). Better use cpumask_first_and()? We have a truly
> random function now, cpumask_any_and_distribute(), if you'd like to pick
> something 'truly' random.

I think cpumask_any_and() is better, because it makes it clear that I don't
care about which CPU is chosen (and under the hood it ends up calling
cpumask_first_and() _anyway_). So this is purely cosmetic.

> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 29017cbb6c8e..fe470683b43e 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -1237,6 +1237,8 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope)
> >
> >  static int enable_mismatched_32bit_el0(unsigned int cpu)
> >  {
> > +	static int lucky_winner = -1;
> > +
> >  	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu);
> >  	bool cpu_32bit = id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0);
> >
> > @@ -1245,6 +1247,22 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
> >  		static_branch_enable_cpuslocked(&arm64_mismatched_32bit_el0);
> >  	}
> >
> > +	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
> > +		return 0;
>
> Hmm I'm struggling to get what you're doing here. You're treating CPU0 (the
> boot CPU) specially here, but I don't get why?

If our ability to execute 32-bit code is the same as the boot CPU then we
don't have to do anything. That way, we can postpone nominating the lucky
winner until we really need to.

> > +	if (lucky_winner >= 0)
> > +		return 0;
> > +
> > +	/*
> > +	 * We've detected a mismatch. We need to keep one of our CPUs with
> > +	 * 32-bit EL0 online so that is_cpu_allowed() doesn't end up rejecting
> > +	 * every CPU in the system for a 32-bit task.
> > +	 */
> > +	lucky_winner = cpu_32bit ? cpu : cpumask_any_and(cpu_32bit_el0_mask,
> > +							 cpu_active_mask);
>
> cpumask_any_and() could return an error. It could be hard or even
> impossible to trigger, but better check that lucky_winner is not
> >= nr_cpu_ids before calling get_cpu_device(lucky_winner) to stay on the
> safe side and avoid a potential splat?

I don't see how it can return an error here. There are two cases to
consider:

  1. The CPU being brought online is the first 32-bit-capable CPU. In which
     case, we don't use cpumask_any_and() at all.

  2. The CPU being brought online is the first 64-bit-only CPU. In which
     case, the CPU doing the onlining is 32-bit capable and will be in
     the active mask.

> We can do better, by the way, and do a smarter check in remove_cpu() to
> block offlining the last aarch32 capable CPU without 'hardcoding' a
> specific cpu. But I won't insist and am happy to wait for someone to come
> complaining this is not good enough first.

I couldn't find a satisfactory way to do this without the possibility of
subtle races, so I'd prefer to keep it simple for the moment. In particular,
I wanted to make sure that somebody iterating over the cpu_possible_mask
and calling is_cpu_allowed(p, cpu) for each CPU and a 32-bit task cannot
reach the end of the mask without ever getting a value of 'true'.

I'm open to revisiting this once some of this is merged, but right now
I don't think it's needed and it certainly adds complexity.

> Some vendors play games with hotplug to help with saving power. They might
> want to dynamically nominate the last man standing 32bit capable CPU.
> Again, we can wait for someone to complain first I guess.

The reality is that either all "big" cores or all "little" cores will be the
ones that are 32-bit capable, so I doubt it matters an awful lot which one
of the cluster is left online from a PM perspective. The real problem is
that a core has to be left online at all, but I don't think we can avoid
that.

Will
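To make Will's explanation concrete, the boot-CPU comparison can be read as
a small truth table. The following is a paraphrase of the hunk under review
with the reasoning spelled out in comments; it is not new patch code:

	/*
	 * boot32 = CPU0 (the boot CPU) is 32-bit capable;
	 * cpu32  = the CPU coming online is 32-bit capable.
	 *
	 *   boot32  cpu32   outcome
	 *   true    true    match    -> return early, nothing to do yet
	 *   false   false   match    -> return early, nothing to do yet
	 *   true    false   mismatch -> first 64-bit-only CPU: CPU0 is a valid pick
	 *   false   true    mismatch -> first 32-bit-capable CPU: pick this CPU
	 */
	bool boot32 = cpumask_test_cpu(0, cpu_32bit_el0_mask);

	if (boot32 == cpu_32bit)
		return 0;	/* same ability as the boot CPU: postpone nomination */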
On 12/01/20 22:13, Will Deacon wrote:
> On Fri, Nov 27, 2020 at 01:41:22PM +0000, Qais Yousef wrote:
> > On 11/24/20 15:50, Will Deacon wrote:
> > > If we want to support 32-bit applications, then when we identify a CPU
> > > with mismatched 32-bit EL0 support we must ensure that we will always
> > > have an active 32-bit CPU available to us from then on. This is
> > > important for the scheduler, because is_cpu_allowed() will be
> > > constrained to 32-bit CPUs for compat tasks and forced migration due
> > > to a hotplug event will hang if no 32-bit CPUs are available.
> > >
> > > On detecting a mismatch, prevent offlining of either the mismatching CPU
> > > if it is 32-bit capable, or find the first active 32-bit capable CPU
> > > otherwise.
> >             ^^^^^
> >
> > You use cpumask_any_and(). Better use cpumask_first_and()? We have a
> > truly random function now, cpumask_any_and_distribute(), if you'd like
> > to pick something 'truly' random.
>
> I think cpumask_any_and() is better, because it makes it clear that I don't
> care about which CPU is chosen (and under the hood it ends up calling
> cpumask_first_and() _anyway_). So this is purely cosmetic.
>
> > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > index 29017cbb6c8e..fe470683b43e 100644
> > > --- a/arch/arm64/kernel/cpufeature.c
> > > +++ b/arch/arm64/kernel/cpufeature.c
> > > @@ -1237,6 +1237,8 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope)
> > >
> > >  static int enable_mismatched_32bit_el0(unsigned int cpu)
> > >  {
> > > +	static int lucky_winner = -1;
> > > +
> > >  	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu);
> > >  	bool cpu_32bit = id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0);
> > >
> > > @@ -1245,6 +1247,22 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
> > >  		static_branch_enable_cpuslocked(&arm64_mismatched_32bit_el0);
> > >  	}
> > >
> > > +	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
> > > +		return 0;
> >
> > Hmm I'm struggling to get what you're doing here. You're treating CPU0
> > (the boot CPU) specially here, but I don't get why?
>
> If our ability to execute 32-bit code is the same as the boot CPU then we
> don't have to do anything. That way, we can postpone nominating the lucky
> winner until we really need to.

Okay I see what you're doing now. The '== cpu_32bit' part of the check gave
me trouble. If the first N cpus are 64bit only, we'll skip them here. Worth
a comment?

Wouldn't it be better to replace this with a check of whether
cpu_32bit_el0_mask is empty instead? That would be a lot easier to read.

> > > +	if (lucky_winner >= 0)
> > > +		return 0;
> > > +
> > > +	/*
> > > +	 * We've detected a mismatch. We need to keep one of our CPUs with
> > > +	 * 32-bit EL0 online so that is_cpu_allowed() doesn't end up rejecting
> > > +	 * every CPU in the system for a 32-bit task.
> > > +	 */
> > > +	lucky_winner = cpu_32bit ? cpu : cpumask_any_and(cpu_32bit_el0_mask,
> > > +							 cpu_active_mask);
> >
> > cpumask_any_and() could return an error. It could be hard or even
> > impossible to trigger, but better check that lucky_winner is not
> > >= nr_cpu_ids before calling get_cpu_device(lucky_winner) to stay on
> > the safe side and avoid a potential splat?
>
> I don't see how it can return an error here. There are two cases to
> consider:
>
>   1. The CPU being brought online is the first 32-bit-capable CPU. In
>      which case, we don't use cpumask_any_and() at all.
>
>   2. The CPU being brought online is the first 64-bit-only CPU. In which
>      case, the CPU doing the onlining is 32-bit capable and will be in
>      the active mask.

Now I understand the if condition above; yeah, I think this will not hit.
The condition above simply guarantees cpu_32bit_el0_mask is not empty. And
since this is the online path, there's a guarantee that the two masks share
at least one bit, since the same path must have set this shared bit.

> > We can do better, by the way, and do a smarter check in remove_cpu() to
> > block offlining the last aarch32 capable CPU without 'hardcoding' a
> > specific cpu. But I won't insist and am happy to wait for someone to
> > come complaining this is not good enough first.
>
> I couldn't find a satisfactory way to do this without the possibility of
> subtle races, so I'd prefer to keep it simple for the moment. In
> particular, I wanted to make sure that somebody iterating over the
> cpu_possible_mask and calling is_cpu_allowed(p, cpu) for each CPU and a
> 32-bit task cannot reach the end of the mask without ever getting a value
> of 'true'.
>
> I'm open to revisiting this once some of this is merged, but right now
> I don't think it's needed and it certainly adds complexity.

Agreed. I just wanted to share some awareness. Let's not make this series
more complicated than it needs to be.

> > Some vendors play games with hotplug to help with saving power. They
> > might want to dynamically nominate the last man standing 32bit capable
> > CPU. Again, we can wait for someone to complain first I guess.
>
> The reality is that either all "big" cores or all "little" cores will be
> the ones that are 32-bit capable, so I doubt it matters an awful lot which
> one of the cluster is left online from a PM perspective. The real problem
> is that a core has to be left online at all, but I don't think we can
> avoid that.

I don't have specific info, but in theory it could matter: we enforce
a specific core to be always online, rather than allow any core to be
offlined until there's a single one left in the mask. I agree we shouldn't
worry about this for now.

And yes, we can't avoid keeping the last 32bit capable cpu online, the same
way we can't offline the last cpu in a normal system.

Thanks

--
Qais Yousef
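For completeness, a purely hypothetical sketch of the remove_cpu()-style
alternative discussed above: refuse an offline only when the CPU going down
is the last active 32-bit-capable one. The callback name and the on-stack
mask are invented for illustration, and Will's concern stands: a check like
this is hard to make race-free against concurrent hotplug:

	#include <linux/cpumask.h>
	#include <linux/errno.h>

	/* Hypothetical offline veto; NOT the approach the patch takes. */
	static int mismatched_32bit_el0_can_offline(unsigned int cpu)
	{
		cpumask_t remaining;	/* on-stack masks are frowned upon; fine for a sketch */

		/* Only 32-bit-capable CPUs can be "the last man standing". */
		if (!cpumask_test_cpu(cpu, cpu_32bit_el0_mask))
			return 0;

		/* Would any other active 32-bit-capable CPU remain? */
		cpumask_and(&remaining, cpu_32bit_el0_mask, cpu_active_mask);
		cpumask_clear_cpu(cpu, &remaining);
		if (cpumask_empty(&remaining))
			return -EBUSY;	/* refuse to offline the last one */

		return 0;
	}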
On Wed, Dec 02, 2020 at 12:59:52PM +0000, Qais Yousef wrote:
> On 12/01/20 22:13, Will Deacon wrote:
> > On Fri, Nov 27, 2020 at 01:41:22PM +0000, Qais Yousef wrote:
> > > On 11/24/20 15:50, Will Deacon wrote:
> > > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > > index 29017cbb6c8e..fe470683b43e 100644
> > > > --- a/arch/arm64/kernel/cpufeature.c
> > > > +++ b/arch/arm64/kernel/cpufeature.c
> > > > @@ -1237,6 +1237,8 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope)
> > > >
> > > >  static int enable_mismatched_32bit_el0(unsigned int cpu)
> > > >  {
> > > > +	static int lucky_winner = -1;
> > > > +
> > > >  	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu);
> > > >  	bool cpu_32bit = id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0);
> > > >
> > > > @@ -1245,6 +1247,22 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
> > > >  		static_branch_enable_cpuslocked(&arm64_mismatched_32bit_el0);
> > > >  	}
> > > >
> > > > +	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
> > > > +		return 0;
> > >
> > > Hmm I'm struggling to get what you're doing here. You're treating CPU0
> > > (the boot CPU) specially here, but I don't get why?
> >
> > If our ability to execute 32-bit code is the same as the boot CPU then
> > we don't have to do anything. That way, we can postpone nominating the
> > lucky winner until we really need to.
>
> Okay I see what you're doing now. The '== cpu_32bit' part of the check
> gave me trouble. If the first N cpus are 64bit only, we'll skip them here.
> Worth a comment?
>
> Wouldn't it be better to replace this with a check of whether
> cpu_32bit_el0_mask is empty instead? That would be a lot easier to read.

Sorry, but I don't follow. What if all the CPUs are 32-bit capable? I'll
leave the code as-is, since I don't think it's particularly hard to
understand, and it does the right thing.

Will
On 12/02/20 17:42, Will Deacon wrote:
> On Wed, Dec 02, 2020 at 12:59:52PM +0000, Qais Yousef wrote:
> > On 12/01/20 22:13, Will Deacon wrote:
> > > On Fri, Nov 27, 2020 at 01:41:22PM +0000, Qais Yousef wrote:
> > > > On 11/24/20 15:50, Will Deacon wrote:
> > > > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > > > index 29017cbb6c8e..fe470683b43e 100644
> > > > > --- a/arch/arm64/kernel/cpufeature.c
> > > > > +++ b/arch/arm64/kernel/cpufeature.c
> > > > > @@ -1237,6 +1237,8 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope)
> > > > >
> > > > >  static int enable_mismatched_32bit_el0(unsigned int cpu)
> > > > >  {
> > > > > +	static int lucky_winner = -1;
> > > > > +
> > > > >  	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu);
> > > > >  	bool cpu_32bit = id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0);
> > > > >
> > > > > @@ -1245,6 +1247,22 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
> > > > >  		static_branch_enable_cpuslocked(&arm64_mismatched_32bit_el0);
> > > > >  	}
> > > > >
> > > > > +	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
> > > > > +		return 0;
> > > >
> > > > Hmm I'm struggling to get what you're doing here. You're treating
> > > > CPU0 (the boot CPU) specially here, but I don't get why?
> > >
> > > If our ability to execute 32-bit code is the same as the boot CPU then
> > > we don't have to do anything. That way, we can postpone nominating the
> > > lucky winner until we really need to.
> >
> > Okay I see what you're doing now. The '== cpu_32bit' part of the check
> > gave me trouble. If the first N cpus are 64bit only, we'll skip them
> > here. Worth a comment?
> >
> > Wouldn't it be better to replace this with a check of whether
> > cpu_32bit_el0_mask is empty instead? That would be a lot easier to read.
>
> Sorry, but I don't follow. What if all the CPUs are 32-bit capable?

You're right, I missed this case.

--
Qais Yousef
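To close the loop on this exchange, here is why the suggested emptiness
check would misfire on a fully 32-bit-capable system (a sketch, not patch
code):

	/*
	 * Suppose every CPU is 32-bit capable. CPU0 sets its bit in
	 * cpu_32bit_el0_mask when it comes up, so the mask is never empty
	 * afterwards. A hypothetical
	 *	if (cpumask_empty(cpu_32bit_el0_mask))
	 *		return 0;
	 * early-out would therefore never fire, and the very next CPU to
	 * come online would fall through and be nominated as a lucky
	 * winner, pinning a CPU online on a system with no mismatch at all.
	 *
	 * Comparing against the boot CPU's ability avoids this: on a
	 * symmetric system every CPU matches CPU0 and returns early.
	 */
	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
		return 0;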
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 29017cbb6c8e..fe470683b43e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1237,6 +1237,8 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope)
 
 static int enable_mismatched_32bit_el0(unsigned int cpu)
 {
+	static int lucky_winner = -1;
+
 	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu);
 	bool cpu_32bit = id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0);
 
@@ -1245,6 +1247,22 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
 		static_branch_enable_cpuslocked(&arm64_mismatched_32bit_el0);
 	}
 
+	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
+		return 0;
+
+	if (lucky_winner >= 0)
+		return 0;
+
+	/*
+	 * We've detected a mismatch. We need to keep one of our CPUs with
+	 * 32-bit EL0 online so that is_cpu_allowed() doesn't end up rejecting
+	 * every CPU in the system for a 32-bit task.
+	 */
+	lucky_winner = cpu_32bit ? cpu : cpumask_any_and(cpu_32bit_el0_mask,
+							 cpu_active_mask);
+	get_cpu_device(lucky_winner)->offline_disabled = true;
+	pr_info("Asymmetric 32-bit EL0 support detected on CPU %u; CPU hot-unplug disabled on CPU %u\n",
+		cpu, lucky_winner);
 	return 0;
 }
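As background on the mechanism the patch relies on: setting
dev->offline_disabled makes the driver core reject offline requests. The
following is a condensed paraphrase of the relevant check in
device_offline() in drivers/base/core.c, for illustration only (see the
kernel source for the full logic):

	#include <linux/device.h>
	#include <linux/errno.h>

	/* Condensed paraphrase of device_offline(); illustration only. */
	static int device_offline_sketch(struct device *dev)
	{
		if (dev->offline_disabled)
			return -EPERM;	/* "echo 0 > /sys/devices/system/cpu/cpuN/online" fails */

		/* The real function then checks child devices and invokes
		 * the bus type's ->offline() callback. */
		return 0;
	}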
If we want to support 32-bit applications, then when we identify a CPU
with mismatched 32-bit EL0 support we must ensure that we will always
have an active 32-bit CPU available to us from then on. This is important
for the scheduler, because is_cpu_allowed() will be constrained to 32-bit
CPUs for compat tasks and forced migration due to a hotplug event will
hang if no 32-bit CPUs are available.

On detecting a mismatch, prevent offlining of either the mismatching CPU
if it is 32-bit capable, or find the first active 32-bit capable CPU
otherwise.

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/cpufeature.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)