arm64: kernel: memory corruptions due non-disabled PAN

Message ID 20191119221006.1021520-1-pasha.tatashin@soleen.com (mailing list archive)
State Mainlined
Commit 94bb804e1e6f0a9a77acf20d7c70ea141c6c821e
Series arm64: kernel: memory corruptions due non-disabled PAN

Commit Message

Pasha Tatashin Nov. 19, 2019, 10:10 p.m. UTC
Userland access functions (__arch_clear_user, __arch_copy_from_user,
__arch_copy_in_user, __arch_copy_to_user) enable user access (by
toggling PAN) for the duration of the copy and disable it afterwards.
However, when the copy fails for some reason, e.g. an access violation,
control is transferred to the .fixup section, where user access is not
disabled again.

The bug is a security violation as the access to userland is still
open when it should be disabled, but it also causes memory corruptions
when software emulated PAN is used: CONFIG_ARM64_SW_TTBR0_PAN=y.

I was able to reproduce memory corruption problem on Broadcom's SoC
ARMv8-A like this:

Enable software perf-events with PERF_SAMPLE_CALLCHAIN so userland's
stack is accessed and copied.

The test program performed the following on every CPU and forking many
processes:

	unsigned long *map = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
				  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	map[0] = getpid();
	sched_yield();
	if (map[0] != getpid()) {
		fprintf(stderr, "Corruption detected!");
	}
	munmap(map, PAGE_SIZE);

From time to time I was getting map[0] to contain pid for a different
process.

Fixes: 338d4f49d6f7114 ("arm64: kernel: Add support for Privileged...")

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 arch/arm64/lib/clear_user.S     | 1 +
 arch/arm64/lib/copy_from_user.S | 1 +
 arch/arm64/lib/copy_in_user.S   | 1 +
 arch/arm64/lib/copy_to_user.S   | 1 +
 4 files changed, 4 insertions(+)

Comments

Mark Rutland Nov. 20, 2019, 4:43 p.m. UTC | #1
Hi Pavel,

On Tue, Nov 19, 2019 at 05:10:06PM -0500, Pavel Tatashin wrote:
> Userland access functions (__arch_clear_user, __arch_copy_from_user,
> __arch_copy_in_user, __arch_copy_to_user) enable user access (by
> toggling PAN) for the duration of the copy and disable it afterwards.
> However, when the copy fails for some reason, e.g. an access violation,
> control is transferred to the .fixup section, where user access is not
> disabled again.

Thanks for reporting this. This is a very nasty bug.

> The bug is a security violation as the access to userland is still
> open when it should be disabled, but it also causes memory corruptions
> when software emulated PAN is used: CONFIG_ARM64_SW_TTBR0_PAN=y.

I see that with CONFIG_ARM64_SW_TTBR0_PAN=y, this means that we may
leave the stale TTBR0 value installed across a context-switch (and have
reproduced that locally), but I'm having some difficulty reproducing the
corruption that you see.

> I was able to reproduce memory corruption problem on Broadcom's SoC
> ARMv8-A like this:
> 
> Enable software perf-events with PERF_SAMPLE_CALLCHAIN so userland's
> stack is accessed and copied.

IIUC this tickles the issue by performing a faulting uaccess in IRQ
context. On the path to returning from the exception, it directly calls
into the scheduler as part of el1_preempt, erroneously passing the TTBR0
value to the next task. Note that a preemption would remove the stale
TTBR0 value as part of kernel entry.

It looks like if we're in this state, and return from an exception taken
from EL1 with SW PAN enabled, we'll also leave the stale TTBR0 value
installed. If PAN was disabled (e.g. in the middle of a uaccess region),
then we'll restore the correct TTBR0.
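
The stale-TTBR0 hand-off described above can be illustrated with a small
userspace toy model. This is only a sketch under stated assumptions, not
kernel code: struct cpu, struct task, RESERVED_TTBR0 and all function
names are hypothetical, and the model simply assumes that a context
switch leaves TTBR0 untouched because it expects the reserved value to
be installed already.

```c
#define RESERVED_TTBR0 0UL	/* models the empty "no user access" tables */

struct task {
	unsigned long ttbr0;	/* this task's user page table base */
};

struct cpu {
	unsigned long ttbr0;	/* value currently installed on the CPU */
	struct task *current;	/* task currently running on the CPU */
};

/* SW PAN model: opening user access installs the current task's tables. */
void sw_uaccess_enable(struct cpu *c)
{
	c->ttbr0 = c->current->ttbr0;
}

/* Closing user access goes back to the reserved tables. */
void sw_uaccess_disable(struct cpu *c)
{
	c->ttbr0 = RESERVED_TTBR0;
}

/* Pre-fix behaviour: the access faults and the fixup path never disables. */
void faulting_uaccess_buggy(struct cpu *c)
{
	sw_uaccess_enable(c);
	/* fault taken here; fixup returns without sw_uaccess_disable() */
}

/* Post-fix behaviour: the fixup path disables user access before returning. */
void faulting_uaccess_fixed(struct cpu *c)
{
	sw_uaccess_enable(c);
	sw_uaccess_disable(c);
}

/* Assumption of this model: the switch does not rewrite TTBR0 at all. */
void context_switch(struct cpu *c, struct task *next)
{
	c->current = next;
}

/* Nonzero if the CPU runs 'current' with another task's tables installed. */
int stale_ttbr0(const struct cpu *c)
{
	return c->ttbr0 != RESERVED_TTBR0 && c->ttbr0 != c->current->ttbr0;
}
```

In this model, a buggy faulting access followed by context_switch() makes
stale_ttbr0() return nonzero, while the fixed variant leaves the reserved
value in place across the switch.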

> The test program performed the following on every CPU and forking many
> processes:
> 
> 	unsigned long *map = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
> 				  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> 	map[0] = getpid();
> 	sched_yield();
> 	if (map[0] != getpid()) {
> 		fprintf(stderr, "Corruption detected!");
> 	}
> 	munmap(map, PAGE_SIZE);

Can you provide the whole test, please? And precisely how you're
launching it?

I've tried turning the above into a main() function, and spawning a
number of instances in parallel while perf is running, but I haven't
been able to reproduce the issue locally, and I'm concerned that I'm
missing something.
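
For reference, a minimal sketch of what such a standalone version of the
snippet might look like. This is not the author's actual test program:
the helper names check_once() and run_checkers(), the error handling and
the use of sysconf() for the page size are assumptions made for
illustration only.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * One iteration of the check from the commit message.
 * Returns 0 if map[0] still holds our pid after yielding,
 * 1 on a mismatch, and -1 if mmap() failed.
 */
int check_once(void)
{
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
	unsigned long *map = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
				  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	int bad;

	if (map == MAP_FAILED)
		return -1;

	map[0] = (unsigned long)getpid();
	sched_yield();
	bad = (map[0] != (unsigned long)getpid());
	if (bad)
		fprintf(stderr, "Corruption: pid %d map[0] %lu\n",
			(int)getpid(), map[0]);
	munmap(map, page_size);
	return bad;
}

/*
 * Fork nproc children that each run iters checks. Returns the number
 * of children that could not be forked, failed, or saw a mismatch.
 */
int run_checkers(int nproc, int iters)
{
	int failures = 0;
	int status;

	for (int i = 0; i < nproc; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			failures++;
			continue;
		}
		if (pid == 0) {
			for (int j = 0; j < iters; j++)
				if (check_once() != 0)
					_exit(1);
			_exit(0);
		}
	}

	/* Reap every child and count the ones that reported a problem. */
	while (wait(&status) > 0)
		if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
			failures++;

	return failures;
}
```

A main() would call run_checkers() with many processes per CPU while a
software perf event with PERF_SAMPLE_CALLCHAIN is active; on an unaffected
kernel it is expected to return 0.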

> From time to time I was getting map[0] to contain pid for a different
> process.

How often is "from time to time"? How many processes are you running,
across how many CPUs?

> 
> Fixes: 338d4f49d6f7114 ("arm64: kernel: Add support for Privileged...")
> 
> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> ---
>  arch/arm64/lib/clear_user.S     | 1 +
>  arch/arm64/lib/copy_from_user.S | 1 +
>  arch/arm64/lib/copy_in_user.S   | 1 +
>  arch/arm64/lib/copy_to_user.S   | 1 +
>  4 files changed, 4 insertions(+)

FWIW, the diff below looks correct to me, but we might want to fold this
into the C wrappers, so that this is consistent with the other uaccess
cases (and done in one place in the code).
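
To illustrate the idea (a userspace toy model only, not the kernel's
actual wrappers or any follow-up patch): user_access_open, buggy_copy(),
raw_copy() and wrapped_copy() are hypothetical names, and the fault_at
parameter is an assumed way to simulate a fault part-way through a copy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the PAN/TTBR0 state: true while user access is open. */
bool user_access_open;

void uaccess_enable(void)
{
	user_access_open = true;
}

void uaccess_disable(void)
{
	user_access_open = false;
}

/*
 * Models an assembly routine before the fix: access is closed only on
 * the success path. A simulated fault after fault_at bytes takes the
 * fixup path, which returns the bytes not copied and leaves access open.
 */
size_t buggy_copy(char *dst, const char *src, size_t n, size_t fault_at)
{
	uaccess_enable();
	if (fault_at < n) {
		memcpy(dst, src, fault_at);
		return n - fault_at;	/* fixup path: no uaccess_disable() */
	}
	memcpy(dst, src, n);
	uaccess_disable();
	return 0;
}

/* Raw routine that never touches the access state itself. */
size_t raw_copy(char *dst, const char *src, size_t n, size_t fault_at)
{
	size_t done = fault_at < n ? fault_at : n;

	memcpy(dst, src, done);
	return n - done;	/* bytes not copied */
}

/*
 * Wrapper: enable and disable happen in one place, so the success and
 * the fault path both leave user access closed.
 */
size_t wrapped_copy(char *dst, const char *src, size_t n, size_t fault_at)
{
	size_t left;

	uaccess_enable();
	left = raw_copy(dst, src, n, fault_at);
	uaccess_disable();
	return left;
}
```

With the wrapper, the raw routine's fixup path no longer needs to know
about the access state at all, which is the consistency argument made
above.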

Thanks,
Mark.

> 
> diff --git a/arch/arm64/lib/clear_user.S b/arch/arm64/lib/clear_user.S
> index 10415572e82f..322b55664cca 100644
> --- a/arch/arm64/lib/clear_user.S
> +++ b/arch/arm64/lib/clear_user.S
> @@ -48,5 +48,6 @@ EXPORT_SYMBOL(__arch_clear_user)
>  	.section .fixup,"ax"
>  	.align	2
>  9:	mov	x0, x2			// return the original size
> +	uaccess_disable_not_uao x2, x3
>  	ret
>  	.previous
> diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
> index 680e74409ff9..8472dc7798b3 100644
> --- a/arch/arm64/lib/copy_from_user.S
> +++ b/arch/arm64/lib/copy_from_user.S
> @@ -66,5 +66,6 @@ EXPORT_SYMBOL(__arch_copy_from_user)
>  	.section .fixup,"ax"
>  	.align	2
>  9998:	sub	x0, end, dst			// bytes not copied
> +	uaccess_disable_not_uao x3, x4
>  	ret
>  	.previous
> diff --git a/arch/arm64/lib/copy_in_user.S b/arch/arm64/lib/copy_in_user.S
> index 0bedae3f3792..8e0355c1e318 100644
> --- a/arch/arm64/lib/copy_in_user.S
> +++ b/arch/arm64/lib/copy_in_user.S
> @@ -68,5 +68,6 @@ EXPORT_SYMBOL(__arch_copy_in_user)
>  	.section .fixup,"ax"
>  	.align	2
>  9998:	sub	x0, end, dst			// bytes not copied
> +	uaccess_disable_not_uao x3, x4
>  	ret
>  	.previous
> diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
> index 2d88c736e8f2..6085214654dc 100644
> --- a/arch/arm64/lib/copy_to_user.S
> +++ b/arch/arm64/lib/copy_to_user.S
> @@ -65,5 +65,6 @@ EXPORT_SYMBOL(__arch_copy_to_user)
>  	.section .fixup,"ax"
>  	.align	2
>  9998:	sub	x0, end, dst			// bytes not copied
> +	uaccess_disable_not_uao x3, x4
>  	ret
>  	.previous
> -- 
> 2.24.0
>
Pasha Tatashin Nov. 20, 2019, 4:55 p.m. UTC | #2
On Wed, Nov 20, 2019 at 11:43 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> Hi Pavel,
>
> On Tue, Nov 19, 2019 at 05:10:06PM -0500, Pavel Tatashin wrote:
> > Userland access functions (__arch_clear_user, __arch_copy_from_user,
> > __arch_copy_in_user, __arch_copy_to_user) enable user access (by
> > toggling PAN) for the duration of the copy and disable it afterwards.
> > However, when the copy fails for some reason, e.g. an access violation,
> > control is transferred to the .fixup section, where user access is not
> > disabled again.
>
> Thanks for reporting this. This is a very nasty bug.

Indeed, it was biting us randomly, and it took me a while to understand
the root cause.

>
> > The bug is a security violation as the access to userland is still
> > open when it should be disabled, but it also causes memory corruptions
> > when software emulated PAN is used: CONFIG_ARM64_SW_TTBR0_PAN=y.
>
> I see that with CONFIG_ARM64_SW_TTBR0_PAN=y, this means that we may
> leave the stale TTBR0 value installed across a context-switch (and have
> reproduced that locally), but I'm having some difficulty reproducing the
> corruption that you see.

I will send the full test shortly. Note, I was never able to reproduce
it in QEMU, only on real hardware. Also, for some unknown reason, I
could not reproduce it after kexec, only during first boot, so it is
somewhat fragile. I am sure it can be reproduced in other cases as
well; my reproducer is just not tuned for that.

>
> > I was able to reproduce memory corruption problem on Broadcom's SoC
> > ARMv8-A like this:
> >
> > Enable software perf-events with PERF_SAMPLE_CALLCHAIN so userland's
> > stack is accessed and copied.
>
> IIUC this tickles the issue by performing a faulting uaccess in IRQ
> context. On the path to returning from the exception, it directly calls
> into the scheduler as part of el1_preempt, erroneously passing the TTBR0
> value to the next task. Note that a preemption would remove the stale
> TTBR0 value as part of kernel entry.

Correct.

>
> It looks like if we're in this state, and return from an exception taken
> from EL1 with SW PAN enabled, we'll also leave the stale TTBR0 value
> installed. If PAN was disabled (e.g. in the middle of a uaccess region),
> then we'll restore the correct TTBR0.
>
> > The test program performed the following on every CPU and forking many
> > processes:
> >
> >       unsigned long *map = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
> >                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> >       map[0] = getpid();
> >       sched_yield();
> >       if (map[0] != getpid()) {
> >               fprintf(stderr, "Corruption detected!");
> >       }
> >       munmap(map, PAGE_SIZE);
>
> Can you provide the whole test, please? And precisely how you're
> launching it?

I will shortly.

>
> I've tried turning the above into a main() function, and spawning a
> number of instances in parallel while perf is running, but I haven't
> been able to reproduce the issue locally, and I'm concerned that I'm
> missing something.
>
> > From time to time I was getting map[0] to contain pid for a different
> > process.
>
> How often is "from time to time"? How many processes are you running,
> across how many CPUs?

On an 8-CPU SoC, it takes less than a second for a process to get
access to the address space of another process.

>
> >
> > Fixes: 338d4f49d6f7114 ("arm64: kernel: Add support for Privileged...")
> >
> > Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> > ---
> >  arch/arm64/lib/clear_user.S     | 1 +
> >  arch/arm64/lib/copy_from_user.S | 1 +
> >  arch/arm64/lib/copy_in_user.S   | 1 +
> >  arch/arm64/lib/copy_to_user.S   | 1 +
> >  4 files changed, 4 insertions(+)
>
> FWIW, the diff below looks correct to me, but we might want to fold this
> into the C wrappers, so that this is consistent with the other uaccess
> cases (and done in one place in the code).

I agree, and I actually have a patch for that, but I wanted my fix to
be included into 5.4 if possible. This is why I sent it out. I will
send out a C wrapper patch soon, but that one won't need to be
backported to stable.

Pasha
Will Deacon Nov. 20, 2019, 7:16 p.m. UTC | #3
On Tue, Nov 19, 2019 at 05:10:06PM -0500, Pavel Tatashin wrote:
> Userland access functions (__arch_clear_user, __arch_copy_from_user,
> __arch_copy_in_user, __arch_copy_to_user) enable user access (by
> toggling PAN) for the duration of the copy and disable it afterwards.
> However, when the copy fails for some reason, e.g. an access violation,
> control is transferred to the .fixup section, where user access is not
> disabled again.
> 
> The bug is a security violation as the access to userland is still
> open when it should be disabled, but it also causes memory corruptions
> when software emulated PAN is used: CONFIG_ARM64_SW_TTBR0_PAN=y.
> 
> I was able to reproduce memory corruption problem on Broadcom's SoC
> ARMv8-A like this:
> 
> Enable software perf-events with PERF_SAMPLE_CALLCHAIN so userland's
> stack is accessed and copied.
> 
> The test program performed the following on every CPU and forking many
> processes:
> 
> 	unsigned long *map = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
> 				  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> 	map[0] = getpid();
> 	sched_yield();
> 	if (map[0] != getpid()) {
> 		fprintf(stderr, "Corruption detected!");
> 	}
> 	munmap(map, PAGE_SIZE);
> 
> From time to time I was getting map[0] to contain pid for a different
> process.
> 
> Fixes: 338d4f49d6f7114 ("arm64: kernel: Add support for Privileged...")
> 
> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> ---
>  arch/arm64/lib/clear_user.S     | 1 +
>  arch/arm64/lib/copy_from_user.S | 1 +
>  arch/arm64/lib/copy_in_user.S   | 1 +
>  arch/arm64/lib/copy_to_user.S   | 1 +
>  4 files changed, 4 insertions(+)

Thanks. I've pushed this and your other patch out [1], with some changes
to the commit message. I'm annoyed that I didn't spot this during review
of the initial PAN patches.

Will

[1] https://fixes.arm64.dev
Pasha Tatashin Nov. 20, 2019, 7:46 p.m. UTC | #4
> > I see that with CONFIG_ARM64_SW_TTBR0_PAN=y, this means that we may
> > leave the stale TTBR0 value installed across a context-switch (and have
> > reproduced that locally), but I'm having some difficulty reproducing the
> > corruption that you see.
>
> I will send the full test shortly. Note, I was never able to reproduce
> it in QEMU, only on real hardware. Also, for some unknown reason, I
> could not reproduce it after kexec, only during first boot, so it is
> somewhat fragile. I am sure it can be reproduced in other cases as
> well; my reproducer is just not tuned for that.
>

Attached is the test program that I used to reproduce the memory
corruption. Tested on a board with Broadcom's Stingray SoC.

Without fix:
# time /tmp/repro
Corruption: pid 1474 map[0] 1488 cpu 3
Terminated

real    0m0.088s
user    0m0.004s
sys     0m0.071s

With the fix:

# time /tmp/repro
Test passed, all good
Terminated

real    1m1.286s
user    0m0.004s
sys     0m0.970s



Pasha
Pasha Tatashin Nov. 20, 2019, 7:46 p.m. UTC | #5
>
> Thanks. I've pushed this and your other patch out [1], with some changes
> to the commit message. I'm annoyed that I didn't spot this during review
> of the initial PAN patches.
>
> Will

Great.

Thank you,
Pasha
Pasha Tatashin Nov. 20, 2019, 7:52 p.m. UTC | #6
On Wed, Nov 20, 2019 at 2:46 PM Pavel Tatashin
<pasha.tatashin@soleen.com> wrote:
>
> > > I see that with CONFIG_ARM64_SW_TTBR0_PAN=y, this means that we may
> > > leave the stale TTBR0 value installed across a context-switch (and have
> > > reproduced that locally), but I'm having some difficulty reproducing the
> > > corruption that you see.
> >
> > I will send the full test shortly. Note, I was never able to reproduce
> > it in QEMU, only on real hardware. Also, for some unknown reason, I
> > could not reproduce it after kexec, only during first boot, so it is
> > somewhat fragile. I am sure it can be reproduced in other cases as
> > well; my reproducer is just not tuned for that.
> >
>
> Attached is the test program that I used to reproduce the memory
> corruption. Tested on a board with Broadcom's Stingray SoC.

I forgot to remove some debugging leftovers from repro.c: the get_pa()
and sched_setaffinity() parts are not needed.

>
> Without fix:
> # time /tmp/repro
> Corruption: pid 1474 map[0] 1488 cpu 3
> Terminated
>
> real    0m0.088s
> user    0m0.004s
> sys     0m0.071s
>
> With the fix:
>
> # time /tmp/repro
> Test passed, all good
> Terminated
>
> real    1m1.286s
> user    0m0.004s
> sys     0m0.970s
>
>
>
> Pasha
Patch

diff --git a/arch/arm64/lib/clear_user.S b/arch/arm64/lib/clear_user.S
index 10415572e82f..322b55664cca 100644
--- a/arch/arm64/lib/clear_user.S
+++ b/arch/arm64/lib/clear_user.S
@@ -48,5 +48,6 @@  EXPORT_SYMBOL(__arch_clear_user)
 	.section .fixup,"ax"
 	.align	2
 9:	mov	x0, x2			// return the original size
+	uaccess_disable_not_uao x2, x3
 	ret
 	.previous
diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
index 680e74409ff9..8472dc7798b3 100644
--- a/arch/arm64/lib/copy_from_user.S
+++ b/arch/arm64/lib/copy_from_user.S
@@ -66,5 +66,6 @@  EXPORT_SYMBOL(__arch_copy_from_user)
 	.section .fixup,"ax"
 	.align	2
 9998:	sub	x0, end, dst			// bytes not copied
+	uaccess_disable_not_uao x3, x4
 	ret
 	.previous
diff --git a/arch/arm64/lib/copy_in_user.S b/arch/arm64/lib/copy_in_user.S
index 0bedae3f3792..8e0355c1e318 100644
--- a/arch/arm64/lib/copy_in_user.S
+++ b/arch/arm64/lib/copy_in_user.S
@@ -68,5 +68,6 @@  EXPORT_SYMBOL(__arch_copy_in_user)
 	.section .fixup,"ax"
 	.align	2
 9998:	sub	x0, end, dst			// bytes not copied
+	uaccess_disable_not_uao x3, x4
 	ret
 	.previous
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
index 2d88c736e8f2..6085214654dc 100644
--- a/arch/arm64/lib/copy_to_user.S
+++ b/arch/arm64/lib/copy_to_user.S
@@ -65,5 +65,6 @@  EXPORT_SYMBOL(__arch_copy_to_user)
 	.section .fixup,"ax"
 	.align	2
 9998:	sub	x0, end, dst			// bytes not copied
+	uaccess_disable_not_uao x3, x4
 	ret
 	.previous