diff mbox series

[v2,5/6] arm64: Advertise CPUs capable of running 32-bit applications in sysfs

Message ID 20201109213023.15092-6-will@kernel.org (mailing list archive)
State New, archived
Headers show
Series An alternative series for asymmetric AArch32 systems | expand

Commit Message

Will Deacon Nov. 9, 2020, 9:30 p.m. UTC
Since 32-bit applications will be killed if they are caught trying to
execute on a 64-bit-only CPU in a mismatched system, advertise the set
of 32-bit capable CPUs to userspace in sysfs.

Signed-off-by: Will Deacon <will@kernel.org>
---
 .../ABI/testing/sysfs-devices-system-cpu      |  9 +++++++++
 arch/arm64/kernel/cpufeature.c                | 19 +++++++++++++++++++
 2 files changed, 28 insertions(+)

Comments

Greg KH Nov. 10, 2020, 7:04 a.m. UTC | #1
On Mon, Nov 09, 2020 at 09:30:21PM +0000, Will Deacon wrote:
> Since 32-bit applications will be killed if they are caught trying to
> execute on a 64-bit-only CPU in a mismatched system, advertise the set
> of 32-bit capable CPUs to userspace in sysfs.
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  .../ABI/testing/sysfs-devices-system-cpu      |  9 +++++++++
>  arch/arm64/kernel/cpufeature.c                | 19 +++++++++++++++++++
>  2 files changed, 28 insertions(+)

I still think the "kill processes that can not run on this CPU" is crazy
but that has nothing to do with this sysfs file patch, which looks good
to me:

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Catalin Marinas Nov. 10, 2020, 9:28 a.m. UTC | #2
On Tue, Nov 10, 2020 at 08:04:26AM +0100, Greg Kroah-Hartman wrote:
> On Mon, Nov 09, 2020 at 09:30:21PM +0000, Will Deacon wrote:
> > Since 32-bit applications will be killed if they are caught trying to
> > execute on a 64-bit-only CPU in a mismatched system, advertise the set
> > of 32-bit capable CPUs to userspace in sysfs.
> > 
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> >  .../ABI/testing/sysfs-devices-system-cpu      |  9 +++++++++
> >  arch/arm64/kernel/cpufeature.c                | 19 +++++++++++++++++++
> >  2 files changed, 28 insertions(+)
> 
> I still think the "kill processes that can not run on this CPU" is crazy

I agree it's crazy, though we try to keep the kernel support simple
while making it a user-space problem. The alternative is to
force-migrate such process to a more capable CPU, potentially against
the desired user cpumask. In addition, we'd have to block CPU hot-unplug
in case the last 32-bit capable CPU disappears.

The only sane thing is not to allow 32-bit processes at all on such
hardware but I think we lost that battle ;).
Greg KH Nov. 10, 2020, 9:36 a.m. UTC | #3
On Tue, Nov 10, 2020 at 09:28:43AM +0000, Catalin Marinas wrote:
> On Tue, Nov 10, 2020 at 08:04:26AM +0100, Greg Kroah-Hartman wrote:
> > On Mon, Nov 09, 2020 at 09:30:21PM +0000, Will Deacon wrote:
> > > Since 32-bit applications will be killed if they are caught trying to
> > > execute on a 64-bit-only CPU in a mismatched system, advertise the set
> > > of 32-bit capable CPUs to userspace in sysfs.
> > > 
> > > Signed-off-by: Will Deacon <will@kernel.org>
> > > ---
> > >  .../ABI/testing/sysfs-devices-system-cpu      |  9 +++++++++
> > >  arch/arm64/kernel/cpufeature.c                | 19 +++++++++++++++++++
> > >  2 files changed, 28 insertions(+)
> > 
> > I still think the "kill processes that can not run on this CPU" is crazy
> 
> I agree it's crazy, though we try to keep the kernel support simple
> while making it a user-space problem. The alternative is to
> force-migrate such process to a more capable CPU, potentially against
> the desired user cpumask. In addition, we'd have to block CPU hot-unplug
> in case the last 32-bit capable CPU disappears.

You should block CPU hot-unplug for the last 32bit capable CPU, why
would you not want that if there are any active 32bit processes running?

And how is userspace going to know that it is creating a 32bit process?
Are you now going to force all calls to exec() to be mediated somehow by
putting an ELF parser in the init process?

> The only sane thing is not to allow 32-bit processes at all on such
> hardware but I think we lost that battle ;).

That was a hardware decision that was made for some specific reason, so
supporting it in the best way seems to be our best option given that
people obviously must want this crazy type of system otherwise they
wouldn't be paying for it!

While punting the logic out to userspace is simple for the kernel, and
of course my first option, I think this isn't going to work in the
long-run and the kernel will have to "know" what type of process it is
scheduling in order to correctly deal with this nightmare as userspace
can't do that well, if at all.

thanks,

greg k-h
Marc Zyngier Nov. 10, 2020, 9:53 a.m. UTC | #4
On 2020-11-10 09:36, Greg Kroah-Hartman wrote:

[...]

> While punting the logic out to userspace is simple for the kernel, and
> of course my first option, I think this isn't going to work in the
> long-run and the kernel will have to "know" what type of process it is
> scheduling in order to correctly deal with this nightmare as userspace
> can't do that well, if at all.

For that to happen, we must first decide which part of the userspace
ABI we are prepared to compromise on. The most obvious one would be to
allow overriding the affinity on exec, but I'm pretty sure it has bad
interactions with cgroups, on top of violating the existing ABI which
mandates that the affinity is inherited across exec.

There may be other options (always make at least one 32bit-capable CPU
part of the process affinity?), but they all imply some form of 
userspace
visible regressions.

Pick your poison... :-/

         M.
Greg KH Nov. 10, 2020, 10:10 a.m. UTC | #5
On Tue, Nov 10, 2020 at 09:53:53AM +0000, Marc Zyngier wrote:
> On 2020-11-10 09:36, Greg Kroah-Hartman wrote:
> 
> [...]
> 
> > While punting the logic out to userspace is simple for the kernel, and
> > of course my first option, I think this isn't going to work in the
> > long-run and the kernel will have to "know" what type of process it is
> > scheduling in order to correctly deal with this nightmare as userspace
> > can't do that well, if at all.
> 
> For that to happen, we must first decide which part of the userspace
> ABI we are prepared to compromise on. The most obvious one would be to
> allow overriding the affinity on exec, but I'm pretty sure it has bad
> interactions with cgroups, on top of violating the existing ABI which
> mandates that the affinity is inherited across exec.

So you are saying that you have to violate this today with this patch
set?  Or would have to violate that if the scheduler got involved?

How is userspace going to "know" how to do all of this properly?  Who is
going to write that code?

> There may be other options (always make at least one 32bit-capable CPU
> part of the process affinity?), but they all imply some form of userspace
> visible regressions.
> 
> Pick your poison... :-/

What do the system designers and users of this hardware recommend?  They
are the ones that dreamed up this, and seem to somehow want this.  What
do they think the correct solution is here as obviously they must have
thought this through when designing such a beast, right?

And if they didn't think any of this through then why are they wanting
to run Linux on this thing?

thanks,

greg k-h
Marc Zyngier Nov. 10, 2020, 10:46 a.m. UTC | #6
On 2020-11-10 10:10, Greg Kroah-Hartman wrote:
> On Tue, Nov 10, 2020 at 09:53:53AM +0000, Marc Zyngier wrote:
>> On 2020-11-10 09:36, Greg Kroah-Hartman wrote:
>> 
>> [...]
>> 
>> > While punting the logic out to userspace is simple for the kernel, and
>> > of course my first option, I think this isn't going to work in the
>> > long-run and the kernel will have to "know" what type of process it is
>> > scheduling in order to correctly deal with this nightmare as userspace
>> > can't do that well, if at all.
>> 
>> For that to happen, we must first decide which part of the userspace
>> ABI we are prepared to compromise on. The most obvious one would be to
>> allow overriding the affinity on exec, but I'm pretty sure it has bad
>> interactions with cgroups, on top of violating the existing ABI which
>> mandates that the affinity is inherited across exec.
> 
> So you are saying that you have to violate this today with this patch
> set?  Or would have to violate that if the scheduler got involved?

Doing nothing (as with this series) doesn't result in an ABI breakage.
It "only" results in an unreliable system. If you start making decisions
behind userspace's back for the sake of making things reliable (which
is a commendable goal), you break the ABI.

Rock, please meet A Hard Place.

And that's the real issue: there is no good solution to this problem.
Only a different set of ugly compromises.

> How is userspace going to "know" how to do all of this properly?  Who 
> is
> going to write that code?
> 
>> There may be other options (always make at least one 32bit-capable CPU
>> part of the process affinity?), but they all imply some form of 
>> userspace
>> visible regressions.
>> 
>> Pick your poison... :-/
> 
> What do the system designers and users of this hardware recommend?  
> They
> are the ones that dreamed up this, and seem to somehow want this.  What
> do they think the correct solution is here as obviously they must have
> thought this through when designing such a beast, right?
> 
> And if they didn't think any of this through then why are they wanting
> to run Linux on this thing?

At this stage, I'll reach out for some Bombay mix (I dislike pop corn).

         M.
Catalin Marinas Nov. 10, 2020, 10:57 a.m. UTC | #7
On Tue, Nov 10, 2020 at 10:36:33AM +0100, Greg Kroah-Hartman wrote:
> On Tue, Nov 10, 2020 at 09:28:43AM +0000, Catalin Marinas wrote:
> > On Tue, Nov 10, 2020 at 08:04:26AM +0100, Greg Kroah-Hartman wrote:
> > > On Mon, Nov 09, 2020 at 09:30:21PM +0000, Will Deacon wrote:
> > > > Since 32-bit applications will be killed if they are caught trying to
> > > > execute on a 64-bit-only CPU in a mismatched system, advertise the set
> > > > of 32-bit capable CPUs to userspace in sysfs.
> > > > 
> > > > Signed-off-by: Will Deacon <will@kernel.org>
> > > > ---
> > > >  .../ABI/testing/sysfs-devices-system-cpu      |  9 +++++++++
> > > >  arch/arm64/kernel/cpufeature.c                | 19 +++++++++++++++++++
> > > >  2 files changed, 28 insertions(+)
> > > 
> > > I still think the "kill processes that can not run on this CPU" is crazy
> > 
> > I agree it's crazy, though we try to keep the kernel support simple
> > while making it a user-space problem. The alternative is to
> > force-migrate such process to a more capable CPU, potentially against
> > the desired user cpumask. In addition, we'd have to block CPU hot-unplug
> > in case the last 32-bit capable CPU disappears.
> 
> You should block CPU hot-unplug for the last 32bit capable CPU, why
> would you not want that if there are any active 32bit processes running?

That's been done in one of the versions submitted to the Android kernel
(which also handles automatic task placement, though by overriding the
user cpumask):

https://android-review.googlesource.com/c/kernel/common/+/1437100/7

I think preventing CPU offlining makes sense only together with forcing
32-bit task placement from the kernel. If we decide to go for the
SIGKILL approach with user-driven affinity, careful CPU offlining should
also be handled by user-space.
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 1a04ca8162ad..8a2e377b0dde 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -493,6 +493,15 @@  Description:	AArch64 CPU registers
 		'identification' directory exposes the CPU ID registers for
 		identifying model and revision of the CPU.
 
+What:		/sys/devices/system/cpu/aarch32_el0
+Date:		November 2020
+Contact:	Linux ARM Kernel Mailing list <linux-arm-kernel@lists.infradead.org>
+Description:	Identifies the subset of CPUs in the system that can execute
+		AArch32 (32-bit ARM) applications. If present, the same format as
+		/sys/devices/system/cpu/{offline,online,possible,present} is used.
+		If absent, then all or none of the CPUs can execute AArch32
+		applications and execve() will behave accordingly.
+
 What:		/sys/devices/system/cpu/cpu#/cpu_capacity
 Date:		December 2016
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 264998972627..c90f4a18768c 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -67,6 +67,7 @@ 
 #include <linux/crash_dump.h>
 #include <linux/sort.h>
 #include <linux/stop_machine.h>
+#include <linux/sysfs.h>
 #include <linux/types.h>
 #include <linux/mm.h>
 #include <linux/cpu.h>
@@ -1269,6 +1270,24 @@  const struct cpumask *system_32bit_el0_cpumask(void)
 	return cpu_present_mask;
 }
 
+static ssize_t aarch32_el0_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	const struct cpumask *mask = system_32bit_el0_cpumask();
+
+	return sysfs_emit(buf, "%*pbl\n", cpumask_pr_args(mask));
+}
+static const DEVICE_ATTR_RO(aarch32_el0);
+
+static int __init aarch32_el0_sysfs_init(void)
+{
+	if (!allow_mismatched_32bit_el0)
+		return 0;
+
+	return device_create_file(cpu_subsys.dev_root, &dev_attr_aarch32_el0);
+}
+device_initcall(aarch32_el0_sysfs_init);
+
 static bool has_32bit_el0(const struct arm64_cpu_capabilities *entry, int scope)
 {
 	if (!has_cpuid_feature(entry, scope))