Message ID | 20240822151113.1479789-7-joey.gouly@arm.com (mailing list archive)
---|---
State | New
Series | Permission Overlay Extension
On Thu, Aug 22, 2024 at 04:10:49PM +0100, Joey Gouly wrote:
> POR_EL0 is a register that can be modified by userspace directly, so
> it must be context switched.
>
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/cpufeature.h |  6 ++++++
>  arch/arm64/include/asm/processor.h  |  1 +
>  arch/arm64/include/asm/sysreg.h     |  3 +++
>  arch/arm64/kernel/process.c         | 28 ++++++++++++++++++++++++++++
>  4 files changed, 38 insertions(+)

[...]

> +static void permission_overlay_switch(struct task_struct *next)
> +{
> +	if (!system_supports_poe())
> +		return;
> +
> +	current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> +	if (current->thread.por_el0 != next->thread.por_el0) {
> +		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
> +		/* ISB required for kernel uaccess routines when chaning POR_EL0 */

nit: typo "chaning".

But more substantially, is this just to prevent spurious faults in the
context of a new thread using a stale value for POR_EL0?

Will
On Fri, Aug 23, 2024 at 03:45:32PM +0100, Will Deacon wrote:
> On Thu, Aug 22, 2024 at 04:10:49PM +0100, Joey Gouly wrote:
> > +static void permission_overlay_switch(struct task_struct *next)
> > +{
> > +	if (!system_supports_poe())
> > +		return;
> > +
> > +	current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> > +	if (current->thread.por_el0 != next->thread.por_el0) {
> > +		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
> > +		/* ISB required for kernel uaccess routines when chaning POR_EL0 */
>
> nit: typo "chaning".
>
> But more substantially, is this just to prevent spurious faults in the
> context of a new thread using a stale value for POR_EL0?

Not just prevent faults but enforce the permissions from the new
thread's POR_EL0. The kernel may continue with a uaccess routine from
here, we can't tell.
On Fri, Aug 23, 2024 at 05:41:06PM +0100, Catalin Marinas wrote:
> On Fri, Aug 23, 2024 at 03:45:32PM +0100, Will Deacon wrote:
> > On Thu, Aug 22, 2024 at 04:10:49PM +0100, Joey Gouly wrote:
[...]
> > nit: typo "chaning".
> >
> > But more substantially, is this just to prevent spurious faults in the
> > context of a new thread using a stale value for POR_EL0?
>
> Not just prevent faults but enforce the permissions from the new
> thread's POR_EL0. The kernel may continue with a uaccess routine from
> here, we can't tell.

Hmm, I wondered if that was the case. It's a bit weird though, because:

- There's a window between switch_mm() and switch_to() where you might
  reasonably expect to be able to execute uaccess routines

- kthread_use_mm() doesn't/can't look at this at all

- GUP obviously doesn't care

So what do we actually gain by having the uaccess routines honour this?

Will
On Fri, Aug 23, 2024 at 06:08:36PM +0100, Will Deacon wrote:
> On Fri, Aug 23, 2024 at 05:41:06PM +0100, Catalin Marinas wrote:
> > On Fri, Aug 23, 2024 at 03:45:32PM +0100, Will Deacon wrote:
> > > On Thu, Aug 22, 2024 at 04:10:49PM +0100, Joey Gouly wrote:
[...]
> > Not just prevent faults but enforce the permissions from the new
> > thread's POR_EL0. The kernel may continue with a uaccess routine from
> > here, we can't tell.
>
> Hmm, I wondered if that was the case. It's a bit weird though, because:
>
> - There's a window between switch_mm() and switch_to() where you might
>   reasonably expect to be able to execute uaccess routines

I don't think we can have any uaccess between these two switches (a
uaccess could fault, that's a pretty weird state between these two).

> - kthread_use_mm() doesn't/can't look at this at all

No, but a kthread would have its own, most permissive, POR_EL0.

> - GUP obviously doesn't care
>
> So what do we actually gain by having the uaccess routines honour this?

I guess where it matters is more like not accidentally faulting because
the previous thread had more restrictive permissions.
On Fri, Aug 23, 2024 at 07:40:52PM +0100, Catalin Marinas wrote:
> On Fri, Aug 23, 2024 at 06:08:36PM +0100, Will Deacon wrote:
> > On Fri, Aug 23, 2024 at 05:41:06PM +0100, Catalin Marinas wrote:
> > > On Fri, Aug 23, 2024 at 03:45:32PM +0100, Will Deacon wrote:
> > > > On Thu, Aug 22, 2024 at 04:10:49PM +0100, Joey Gouly wrote:
[...]
> > Hmm, I wondered if that was the case. It's a bit weird though, because:
> >
> > - There's a window between switch_mm() and switch_to() where you might
> >   reasonably expect to be able to execute uaccess routines
>
> I don't think we can have any uaccess between these two switches (a
> uaccess could fault, that's a pretty weird state between these two).
>
> > - kthread_use_mm() doesn't/can't look at this at all
>
> No, but a kthread would have its own, most permissive, POR_EL0.
>
> > - GUP obviously doesn't care
> >
> > So what do we actually gain by having the uaccess routines honour this?
>
> I guess where it matters is more like not accidentally faulting because
> the previous thread had more restrictive permissions.

That's what I wondered initially, but won't the fault handler retry in
that case?

Will
On Tue, Aug 27, 2024 at 12:38:04PM +0100, Will Deacon wrote:
> On Fri, Aug 23, 2024 at 07:40:52PM +0100, Catalin Marinas wrote:
> > On Fri, Aug 23, 2024 at 06:08:36PM +0100, Will Deacon wrote:
> > > On Fri, Aug 23, 2024 at 05:41:06PM +0100, Catalin Marinas wrote:
> > > > On Fri, Aug 23, 2024 at 03:45:32PM +0100, Will Deacon wrote:
> > > > > On Thu, Aug 22, 2024 at 04:10:49PM +0100, Joey Gouly wrote:
[...]
> > > > Not just prevent faults but enforce the permissions from the new
> > > > thread's POR_EL0. The kernel may continue with a uaccess routine from
> > > > here, we can't tell.
[...]
> > > So what do we actually gain by having the uaccess routines honour this?
> >
> > I guess where it matters is more like not accidentally faulting because
> > the previous thread had more restrictive permissions.
>
> That's what I wondered initially, but won't the fault handler retry in
> that case?

Yes, it will retry and this should be fine (I assume you are only
talking about dropping the ISB in the context switch).

For the case of running with a more permissive stale POR_EL0, arguably
it's slightly more predictable for the user but, OTOH, some syscalls
like readv() could be routed through GUP with no checks. As with MTE, we
don't guarantee uaccesses honour the user permissions.

That said, at some point we should sanitise this path anyway and have a
single ISB at the end. In the meantime, I'm fine with dropping the ISB
here.
On Mon, Sep 02, 2024 at 08:08:08PM +0100, Catalin Marinas wrote:
> On Tue, Aug 27, 2024 at 12:38:04PM +0100, Will Deacon wrote:
[...]
> > That's what I wondered initially, but won't the fault handler retry in
> > that case?
>
> Yes, it will retry and this should be fine (I assume you are only
> talking about dropping the ISB in the context switch).
>
> For the case of running with a more permissive stale POR_EL0, arguably
> it's slightly more predictable for the user but, OTOH, some syscalls
> like readv() could be routed through GUP with no checks. As with MTE, we
> don't guarantee uaccesses honour the user permissions.
>
> That said, at some point we should sanitise this path anyway and have a
> single ISB at the end. In the meantime, I'm fine with dropping the ISB
> here.

commit 3141fb86bee8d48ae47cab1594dad54f974a8899
Author: Joey Gouly <joey.gouly@arm.com>
Date:   Tue Sep 3 15:47:26 2024 +0100

    fixup! arm64: context switch POR_EL0 register

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index a3a61ecdb165..c224b0955f1a 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -515,11 +515,8 @@ static void permission_overlay_switch(struct task_struct *next)
 		return;
 
 	current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
-	if (current->thread.por_el0 != next->thread.por_el0) {
+	if (current->thread.por_el0 != next->thread.por_el0)
 		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
-		/* ISB required for kernel uaccess routines when chaning POR_EL0 */
-		isb();
-	}
 }

Will, do you want me to re-send the series with this and the permissions
diff from the other thread [1], or are you ok with applying them when
you pull it in?

Thanks,
Joey

[1] https://lore.kernel.org/all/20240903144823.GA3669886@e124191.cambridge.arm.com/
On Tue, Sep 03, 2024 at 03:54:13PM +0100, Joey Gouly wrote:
> On Mon, Sep 02, 2024 at 08:08:08PM +0100, Catalin Marinas wrote:
[...]
> > That said, at some point we should sanitise this path anyway and have a
> > single ISB at the end. In the meantime, I'm fine with dropping the ISB
> > here.
>
> commit 3141fb86bee8d48ae47cab1594dad54f974a8899
> Author: Joey Gouly <joey.gouly@arm.com>
> Date:   Tue Sep 3 15:47:26 2024 +0100
>
>     fixup! arm64: context switch POR_EL0 register
>
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index a3a61ecdb165..c224b0955f1a 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -515,11 +515,8 @@ static void permission_overlay_switch(struct task_struct *next)
>  		return;
>  
>  	current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> -	if (current->thread.por_el0 != next->thread.por_el0) {
> +	if (current->thread.por_el0 != next->thread.por_el0)
>  		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
> -		/* ISB required for kernel uaccess routines when chaning POR_EL0 */
> -		isb();
> -	}
>  }

What about the one in flush_poe()? I'm inclined to drop that as well.

> Will, do you want me to re-send the series with this and the permissions
> diff from the other thread [1], or are you ok with applying them when
> you pull it in?

I'll have a crack now, but if it fails miserably then I'll let you know.

Will
On Wed, Sep 04, 2024 at 11:22:54AM +0100, Will Deacon wrote:
> On Tue, Sep 03, 2024 at 03:54:13PM +0100, Joey Gouly wrote:
[...]
> What about the one in flush_poe()? I'm inclined to drop that as well.

Yes, I guess that one can be removed too. Catalin, any comments?

> > Will, do you want me to re-send the series with this and the permissions
> > diff from the other thread [1], or are you ok with applying them when
> > you pull it in?
>
> I'll have a crack now, but if it fails miserably then I'll let you know.

Thanks! Just to make sure, you should pick the patch up from

https://lore.kernel.org/linux-arm-kernel/20240903152937.GA3768522@e124191.cambridge.arm.com/

Not the one I linked to in [1] in my previous e-mail.

Thanks,
Joey
On Wed, Sep 04, 2024 at 11:22:54AM +0100, Will Deacon wrote:
> On Tue, Sep 03, 2024 at 03:54:13PM +0100, Joey Gouly wrote:
> > commit 3141fb86bee8d48ae47cab1594dad54f974a8899
> > Author: Joey Gouly <joey.gouly@arm.com>
> > Date:   Tue Sep 3 15:47:26 2024 +0100
> >
> >     fixup! arm64: context switch POR_EL0 register
[...]
> What about the one in flush_poe()? I'm inclined to drop that as well.

Yes, that's a similar thing.
On Wed, Sep 04, 2024 at 12:32:21PM +0100, Joey Gouly wrote:
> On Wed, Sep 04, 2024 at 11:22:54AM +0100, Will Deacon wrote:
[...]
> > What about the one in flush_poe()? I'm inclined to drop that as well.
>
> Yes, I guess that one can be removed too. Catalin, any comments?
>
> > > Will, do you want me to re-send the series with this and the permissions
> > > diff from the other thread [1], or are you ok with applying them when
> > > you pull it in?
> >
> > I'll have a crack now, but if it fails miserably then I'll let you know.
>
> Thanks! Just to make sure, you should pick the patch up from
>
> https://lore.kernel.org/linux-arm-kernel/20240903152937.GA3768522@e124191.cambridge.arm.com/
>
> Not the one I linked to in [1] in my previous e-mail.

Right, there's quite a lot I need to do:

- Uncorrupt your patches
- Fix the conflict in the kvm selftests
- Drop the unnecessary ISBs
- Fix the ESR checking
- Fix the el2_setup labels
- Reorder the patches
- Drop the patch that is already in kvmarm

Working on it...

Will
On Wed, Sep 04, 2024 at 12:43:02PM +0100, Will Deacon wrote:
> On Wed, Sep 04, 2024 at 12:32:21PM +0100, Joey Gouly wrote:
[...]
> Right, there's quite a lot I need to do:
>
> - Uncorrupt your patches
> - Fix the conflict in the kvm selftests
> - Drop the unnecessary ISBs
> - Fix the ESR checking
> - Fix the el2_setup labels
> - Reorder the patches
> - Drop the patch that is already in kvmarm
>
> Working on it...

Sorry! I'm happy to rebase onto some arm64 branch if that will help,
just let me know.
On Wed, Sep 04, 2024 at 01:55:03PM +0100, Joey Gouly wrote:
> On Wed, Sep 04, 2024 at 12:43:02PM +0100, Will Deacon wrote:
> > Right, there's quite a lot I need to do:
> >
> > - Uncorrupt your patches
> > - Fix the conflict in the kvm selftests
> > - Drop the unnecessary ISBs
> > - Fix the ESR checking
> > - Fix the el2_setup labels
> > - Reorder the patches
> > - Drop the patch that is already in kvmarm
> >
> > Working on it...
>
> Sorry! I'm happy to rebase onto some arm64 branch if that will help,
> just let me know.

Please have a look at for-next/poe (also merged into for-next/core and
for-kernelci) and let me know what I got wrong!

For Marc: I reordered the series so the KVM bits (and deps) are all at
the beginning, should you need them. The branch is based on a merge of
the shared branch you created previously.

Cheers,

Will
On Wed, 04 Sep 2024 17:17:58 +0100, Will Deacon <will@kernel.org> wrote:
> Please have a look at for-next/poe (also merged into for-next/core and
> for-kernelci) and let me know what I got wrong!
>
> For Marc: I reordered the series so the KVM bits (and deps) are all at
> the beginning, should you need them. The branch is based on a merge of
> the shared branch you created previously.

I just had a quick check, and while there is a small conflict with
kvmarm/next, it is extremely minor (a small clash in vcpu_sysreg, for
which the resolution order doesn't matter), and not worth dragging
additional patches into the shared branch.

However, if KVM's own S1PIE series [1] ends up being merged (which I'd
really like), I will definitely have to pull the prefix in, as this is
a bit more involved conflict-wise.

Thanks,

M.

[1] http://lore.kernel.org/all/20240903153834.1909472-1-maz@kernel.org
On Wed, Sep 04, 2024 at 05:17:58PM +0100, Will Deacon wrote:
> On Wed, Sep 04, 2024 at 01:55:03PM +0100, Joey Gouly wrote:
> > On Wed, Sep 04, 2024 at 12:43:02PM +0100, Will Deacon wrote:
> > > Right, there's quite a lot I need to do:
> > >
> > >   - Uncorrupt your patches
> > >   - Fix the conflict in the kvm selftests
> > >   - Drop the unnecessary ISBs
> > >   - Fix the ESR checking
> > >   - Fix the el2_setup labels
> > >   - Reorder the patches
> > >   - Drop the patch that is already in kvmarm
> > >
> > > Working on it...
> >
> > Sorry! I'm happy to rebase onto some arm64 branch if that will help,
> > just let me know.
>
> Please have a look at for-next/poe (also merged into for-next/core and
> for-kernelci) and let me know what I got wrong!

I pulled for-next/poe and ran the test, and it works fine. I also looked
at the diff of my branch against yours, and it looks fine too.

Thanks for your work to get this merged!

> For Marc: I reordered the series so the KVM bits (and deps) are all at
> the beginning, should you need them. The branch is based on a merge of
> the shared branch you created previously.
>
> Cheers,
>
> Will
On 22/08/2024 17:10, Joey Gouly wrote:
> @@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
>  		if (system_supports_tpidr2())
>  			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
>
> +		if (system_supports_poe())
> +			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);

Here we are only reloading POR_EL0's value if the target is a user
thread. However, as this series stands, POR_EL0 is also relevant to
kthreads, because any uaccess or GUP done from a kthread will also be
checked against POR_EL0. This is especially important in cases like the
io_uring kthread, which accesses the memory of the user process that
spawned it. To prevent such a kthread from inheriting a stale value of
POR_EL0, it seems that we should reload POR_EL0's value in all cases
(user and kernel thread).

Other approaches could also be considered (e.g. resetting POR_EL0 to
unrestricted when creating a kthread), see my reply on v4 [1].

Kevin

[1] https://lore.kernel.org/linux-arm-kernel/b4f8b351-4c83-43b4-bfbe-8f67f3f56fb9@arm.com/
On 9/11/24 08:01, Kevin Brodsky wrote:
> On 22/08/2024 17:10, Joey Gouly wrote:
> > @@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> >  		if (system_supports_tpidr2())
> >  			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
> >
> > +		if (system_supports_poe())
> > +			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> Here we are only reloading POR_EL0's value if the target is a user
> thread. However, as this series stands, POR_EL0 is also relevant to
> kthreads, because any uaccess or GUP done from a kthread will also be
> checked against POR_EL0. This is especially important in cases like the
> io_uring kthread, which accesses the memory of the user process that
> spawned it. To prevent such a kthread from inheriting a stale value of
> POR_EL0, it seems that we should reload POR_EL0's value in all cases
> (user and kernel thread).

The problem with this is trying to figure out which POR_EL0 to use. The
kthread could have been spawned ages ago and might have a POR_EL0 which
is very different from the current value of any of the threads in the
process right now.

There's also no great way for a kthread to reach out and grab an updated
value. It's all completely inherently racy.

> Other approaches could also be considered (e.g. resetting POR_EL0 to
> unrestricted when creating a kthread), see my reply on v4 [1].

I kinda think this is the only way to go. It's the only sensible,
predictable way. I _think_ it's what x86 will end up doing with PKRU,
but there's been enough churn there that I'd need to go double-check
what happens in practice.

Either way, it would be nice to get an io_uring test in here that
actually spawns kthreads:

	tools/testing/selftests/mm/protection_keys.c
Hi Dave,

On Wed, Sep 11, 2024 at 08:33:54AM -0700, Dave Hansen wrote:
> On 9/11/24 08:01, Kevin Brodsky wrote:
> > On 22/08/2024 17:10, Joey Gouly wrote:
> > > @@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> > >  		if (system_supports_tpidr2())
> > >  			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
> > >
> > > +		if (system_supports_poe())
> > > +			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> > Here we are only reloading POR_EL0's value if the target is a user
> > thread. However, as this series stands, POR_EL0 is also relevant to
> > kthreads, because any uaccess or GUP done from a kthread will also be
> > checked against POR_EL0. This is especially important in cases like the
> > io_uring kthread, which accesses the memory of the user process that
> > spawned it. To prevent such a kthread from inheriting a stale value of
> > POR_EL0, it seems that we should reload POR_EL0's value in all cases
> > (user and kernel thread).
>
> The problem with this is trying to figure out which POR_EL0 to use. The
> kthread could have been spawned ages ago and might have a POR_EL0 which
> is very different from the current value of any of the threads in the
> process right now.
>
> There's also no great way for a kthread to reach out and grab an updated
> value. It's all completely inherently racy.
>
> > Other approaches could also be considered (e.g. resetting POR_EL0 to
> > unrestricted when creating a kthread), see my reply on v4 [1].
>
> I kinda think this is the only way to go. It's the only sensible,
> predictable way. I _think_ it's what x86 will end up doing with PKRU,
> but there's been enough churn there that I'd need to go double-check
> what happens in practice.

I agree.

> Either way, it would be nice to get an io_uring test in here that
> actually spawns kthreads:
>
> 	tools/testing/selftests/mm/protection_keys.c

It would be good to update Documentation/core-api/protection-keys.rst
as well, since the example with read() raises more questions than it
answers!

Kevin, Joey -- I've got this series queued in arm64 as-is, so perhaps
you could send some patches on top so we can iron this out in time for
6.12? I'll also be at LPC next week if you're about.

Cheers,

Will
On Thu, Sep 12, 2024 at 11:50:18AM +0100, Will Deacon wrote:
> Hi Dave,
>
> On Wed, Sep 11, 2024 at 08:33:54AM -0700, Dave Hansen wrote:
> > On 9/11/24 08:01, Kevin Brodsky wrote:
> > > On 22/08/2024 17:10, Joey Gouly wrote:
> > > > @@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> > > >  		if (system_supports_tpidr2())
> > > >  			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
> > > >
> > > > +		if (system_supports_poe())
> > > > +			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> > > Here we are only reloading POR_EL0's value if the target is a user
> > > thread. However, as this series stands, POR_EL0 is also relevant to
> > > kthreads, because any uaccess or GUP done from a kthread will also be
> > > checked against POR_EL0. This is especially important in cases like the
> > > io_uring kthread, which accesses the memory of the user process that
> > > spawned it. To prevent such a kthread from inheriting a stale value of
> > > POR_EL0, it seems that we should reload POR_EL0's value in all cases
> > > (user and kernel thread).
> >
> > The problem with this is trying to figure out which POR_EL0 to use. The
> > kthread could have been spawned ages ago and might have a POR_EL0 which
> > is very different from the current value of any of the threads in the
> > process right now.
> >
> > There's also no great way for a kthread to reach out and grab an updated
> > value. It's all completely inherently racy.
> >
> > > Other approaches could also be considered (e.g. resetting POR_EL0 to
> > > unrestricted when creating a kthread), see my reply on v4 [1].
> >
> > I kinda think this is the only way to go. It's the only sensible,
> > predictable way. I _think_ it's what x86 will end up doing with PKRU,
> > but there's been enough churn there that I'd need to go double-check
> > what happens in practice.
>
> I agree.
>
> > Either way, it would be nice to get an io_uring test in here that
> > actually spawns kthreads:
> >
> > 	tools/testing/selftests/mm/protection_keys.c
>
> It would be good to update Documentation/core-api/protection-keys.rst
> as well, since the example with read() raises more questions than it
> answers!
>
> Kevin, Joey -- I've got this series queued in arm64 as-is, so perhaps
> you could send some patches on top so we can iron this out in time for
> 6.12? I'll also be at LPC next week if you're about.

I found the code in arch/x86 that does this, I must have missed it
previously.

arch/x86/kernel/process.c: int copy_thread()

	/* Kernel thread ? */
	if (unlikely(p->flags & PF_KTHREAD)) {
		p->thread.pkru = pkru_get_init_value();
		memset(childregs, 0, sizeof(struct pt_regs));
		kthread_frame_init(frame, args->fn, args->fn_arg);
		return 0;
	}

I can send a similar patch for arm64. I have no idea how to write
io_uring code, so I'm looking for examples I can work with to get a test
written. Might just send the arm64 fix first, if that's fine?

Thanks,

Joey
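[Editor's note: for concreteness, a rough sketch of the analogous arm64 change Joey proposes above. This is hypothetical and untested -- the exact placement inside arch/arm64/kernel/process.c:copy_thread() and the choice of POR_EL0_INIT (defined earlier in this series) as the kthread reset value are assumptions, mirroring the x86 pkru_get_init_value() approach quoted above:]

```c
/*
 * Hypothetical sketch for arch/arm64/kernel/process.c:copy_thread().
 * A kernel thread gets the unrestricted reset value POR_EL0_INIT instead
 * of inheriting whatever POR_EL0 the forking task happened to have, so a
 * later kthread_use_mm()/uaccess (e.g. an io_uring worker) is not subject
 * to a stale, possibly restrictive overlay.
 */
	if (system_supports_poe()) {
		if (unlikely(p->flags & PF_KTHREAD))
			p->thread.por_el0 = POR_EL0_INIT;
		else
			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
	}
```

The real fix might instead set por_el0 in the existing kernel-thread branch of copy_thread(); either way, the observable behaviour is that kthreads start with unrestricted overlay permissions.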
On Thu, Sep 12, 2024 at 01:48:35PM +0100, Joey Gouly wrote:
> On Thu, Sep 12, 2024 at 11:50:18AM +0100, Will Deacon wrote:
> > On Wed, Sep 11, 2024 at 08:33:54AM -0700, Dave Hansen wrote:
> > > On 9/11/24 08:01, Kevin Brodsky wrote:
> > > > On 22/08/2024 17:10, Joey Gouly wrote:
[...]
> > > I kinda think this is the only way to go. It's the only sensible,
> > > predictable way. I _think_ it's what x86 will end up doing with PKRU,
> > > but there's been enough churn there that I'd need to go double-check
> > > what happens in practice.
> >
> > I agree.
> >
> > > Either way, it would be nice to get an io_uring test in here that
> > > actually spawns kthreads:
> > >
> > > 	tools/testing/selftests/mm/protection_keys.c
> >
> > It would be good to update Documentation/core-api/protection-keys.rst
> > as well, since the example with read() raises more questions than it
> > answers!
> >
> > Kevin, Joey -- I've got this series queued in arm64 as-is, so perhaps
> > you could send some patches on top so we can iron this out in time for
> > 6.12? I'll also be at LPC next week if you're about.
>
> I found the code in arch/x86 that does this, I must have missed it
> previously.
>
> arch/x86/kernel/process.c: int copy_thread()
>
> 	/* Kernel thread ? */
> 	if (unlikely(p->flags & PF_KTHREAD)) {
> 		p->thread.pkru = pkru_get_init_value();
> 		memset(childregs, 0, sizeof(struct pt_regs));
> 		kthread_frame_init(frame, args->fn, args->fn_arg);
> 		return 0;
> 	}
>
> I can send a similar patch for arm64. I have no idea how to write
> io_uring code, so I'm looking for examples I can work with to get a test
> written. Might just send the arm64 fix first, if that's fine?

I think fix + documentation is what we need before 6.12, but you've still
got plenty of time after the merge window.

Cheers,

Will
Dave Hansen <dave.hansen@intel.com> writes:

> On 9/11/24 08:01, Kevin Brodsky wrote:
> > On 22/08/2024 17:10, Joey Gouly wrote:
[...]
> > Here we are only reloading POR_EL0's value if the target is a user
> > thread. However, as this series stands, POR_EL0 is also relevant to
> > kthreads, because any uaccess or GUP done from a kthread will also be
> > checked against POR_EL0. This is especially important in cases like the
> > io_uring kthread, which accesses the memory of the user process that
> > spawned it. To prevent such a kthread from inheriting a stale value of
> > POR_EL0, it seems that we should reload POR_EL0's value in all cases
> > (user and kernel thread).
>
> The problem with this is trying to figure out which POR_EL0 to use. The
> kthread could have been spawned ages ago and might have a POR_EL0 which
> is very different from the current value of any of the threads in the
> process right now.
>
> There's also no great way for a kthread to reach out and grab an updated
> value. It's all completely inherently racy.
>
> > Other approaches could also be considered (e.g. resetting POR_EL0 to
> > unrestricted when creating a kthread), see my reply on v4 [1].
>
> I kinda think this is the only way to go. It's the only sensible,
> predictable way. I _think_ it's what x86 will end up doing with PKRU,
> but there's been enough churn there that I'd need to go double-check
> what happens in practice.

That is also what powerpc does:

	/*
	 * usage of kthread_use_mm() should inherit the AMR value of the
	 * operating address space. But, the AMR value is thread-specific
	 * and we inherit the address space and not thread access
	 * restrictions. Because of this ignore AMR value when accessing
	 * userspace via kernel thread.
	 */
	static __always_inline u64 current_thread_amr(void)
	{
		if (current->thread.regs)
			return current->thread.regs->amr;
		return default_amr;
	}

-aneesh
diff --git arch/arm64/include/asm/cpufeature.h arch/arm64/include/asm/cpufeature.h
index 558434267271..3d261cc123c1 100644
--- arch/arm64/include/asm/cpufeature.h
+++ arch/arm64/include/asm/cpufeature.h
@@ -832,6 +832,12 @@ static inline bool system_supports_lpa2(void)
 	return cpus_have_final_cap(ARM64_HAS_LPA2);
 }
 
+static inline bool system_supports_poe(void)
+{
+	return IS_ENABLED(CONFIG_ARM64_POE) &&
+		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
+}
+
 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
 
diff --git arch/arm64/include/asm/processor.h arch/arm64/include/asm/processor.h
index f77371232d8c..e6376f979273 100644
--- arch/arm64/include/asm/processor.h
+++ arch/arm64/include/asm/processor.h
@@ -184,6 +184,7 @@ struct thread_struct {
 	u64			sctlr_user;
 	u64			svcr;
 	u64			tpidr2_el0;
+	u64			por_el0;
 };
 
 static inline unsigned int thread_get_vl(struct thread_struct *thread,
diff --git arch/arm64/include/asm/sysreg.h arch/arm64/include/asm/sysreg.h
index 4a9ea103817e..494e9efd856f 100644
--- arch/arm64/include/asm/sysreg.h
+++ arch/arm64/include/asm/sysreg.h
@@ -1077,6 +1077,9 @@
 #define POE_RXW		UL(0x7)
 #define POE_MASK	UL(0xf)
 
+/* Initial value for Permission Overlay Extension for EL0 */
+#define POR_EL0_INIT	POE_RXW
+
 #define ARM64_FEATURE_FIELD_BITS	4
 
 /* Defined for compatibility only, do not add new users. */
diff --git arch/arm64/kernel/process.c arch/arm64/kernel/process.c
index 4ae31b7af6c3..a3a61ecdb165 100644
--- arch/arm64/kernel/process.c
+++ arch/arm64/kernel/process.c
@@ -271,12 +271,23 @@ static void flush_tagged_addr_state(void)
 		clear_thread_flag(TIF_TAGGED_ADDR);
 }
 
+static void flush_poe(void)
+{
+	if (!system_supports_poe())
+		return;
+
+	write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
+	/* ISB required for kernel uaccess routines when changing POR_EL0 */
+	isb();
+}
+
 void flush_thread(void)
 {
 	fpsimd_flush_thread();
 	tls_thread_flush();
 	flush_ptrace_hw_breakpoint(current);
 	flush_tagged_addr_state();
+	flush_poe();
 }
 
 void arch_release_task_struct(struct task_struct *tsk)
@@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 		if (system_supports_tpidr2())
 			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
 
+		if (system_supports_poe())
+			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
+
 		if (stack_start) {
 			if (is_compat_thread(task_thread_info(p)))
 				childregs->compat_sp = stack_start;
@@ -495,6 +509,19 @@ static void erratum_1418040_new_exec(void)
 	preempt_enable();
 }
 
+static void permission_overlay_switch(struct task_struct *next)
+{
+	if (!system_supports_poe())
+		return;
+
+	current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
+	if (current->thread.por_el0 != next->thread.por_el0) {
+		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
+		/* ISB required for kernel uaccess routines when changing POR_EL0 */
+		isb();
+	}
+}
+
 /*
  * __switch_to() checks current->thread.sctlr_user as an optimisation. Therefore
  * this function must be called with preemption disabled and the update to
@@ -530,6 +557,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	ssbs_thread_switch(next);
 	erratum_1418040_thread_switch(next);
 	ptrauth_thread_switch_user(next);
+	permission_overlay_switch(next);
 
 	/*
 	 * Complete any pending TLB or cache maintenance on this CPU in case