diff mbox

arm kernel oops in highmem.c with 4.2

Message ID alpine.LFD.2.20.1508111558210.1515@knanqh.ubzr (mailing list archive)
State New, archived
Headers show

Commit Message

Nicolas Pitre Aug. 11, 2015, 8:20 p.m. UTC
On Tue, 11 Aug 2015, Russell King - ARM Linux wrote:

> On Tue, Aug 11, 2015 at 03:41:52PM -0400, Nicolas Pitre wrote:
> > I'd agree.  But first I'd like to know why the fedora kernel is using 
> > CONFIG_UACCESS_WITH_MEMCPY?  If it's just because it sounded cool then 
> > that's a bad reason.
> > 
> > That code was created to work around inneficiency in the Orion CPU core 
> > that didn't coalesce writes from STRT instructions, and by using plain 
> > STR and/or STM the actual throughput was more than doubled.  This was 
> > fixed in later cores. However Orion users (if they still exist) might 
> > like the added performance. I don't have Orion-based targets anymore 
> > (they took the way of the recycling facility a while ago).
> 
> Irrespective of that, what has been found is that the implementation is
> unsafe - it is taking a semaphore when page faults are disabled.  In
> other words, notwithstanding the above, it's buggy no matter if it's
> an Orion CPU core, or highmem, or what.  Any place in the kernel which
> uses pagefault_disable() and then calls into this code will be buggy.
> 
> It needs fixing or removing.

Absolutely. I'm not disputing that. I'm only asking so we can wisely 
choose between fixing or removing.  Personally I'd be inclined towards 
the later, unless the following is sufficient to fix it:


> Even if it is fixed, making it _depend_ on CPU_FEROCEON sounds like a
> good idea if non-orion stuff isn't supposed to be enabling it.  Or
> something like that.

It is not clear to me exactly which cores are affected.  That's why the 
Kconfig entry was left open to all, in case it could benefit others.
In theory it shouldn't be harmful to anyone notwithstanding the caveat
mentioned in the help text.


Nicolas

Comments

Mark Salter Aug. 11, 2015, 9:17 p.m. UTC | #1
On Tue, 2015-08-11 at 16:20 -0400, Nicolas Pitre wrote:
> On Tue, 11 Aug 2015, Russell King - ARM Linux wrote:
> 
> > On Tue, Aug 11, 2015 at 03:41:52PM -0400, Nicolas Pitre wrote:
> > > I'd agree.  But first I'd like to know why the fedora kernel is using 
> > > CONFIG_UACCESS_WITH_MEMCPY?  If it's just because it sounded cool 
> > > then 
> > > that's a bad reason.
> > > 
> > > That code was created to work around inneficiency in the Orion CPU 
> > > core 
> > > that didn't coalesce writes from STRT instructions, and by using 
> > > plain 
> > > STR and/or STM the actual throughput was more than doubled.  This was 
> > > fixed in later cores. However Orion users (if they still exist) might 
> > > like the added performance. I don't have Orion-based targets anymore 
> > > (they took the way of the recycling facility a while ago).
> > 
> > Irrespective of that, what has been found is that the implementation is
> > unsafe - it is taking a semaphore when page faults are disabled.  In
> > other words, notwithstanding the above, it's buggy no matter if it's
> > an Orion CPU core, or highmem, or what.  Any place in the kernel which
> > uses pagefault_disable() and then calls into this code will be buggy.
> > 
> > It needs fixing or removing.
> 
> Absolutely. I'm not disputing that. I'm only asking so we can wisely 
> choose between fixing or removing.  Personally I'd be inclined towards 
> the later, unless the following is sufficient to fix it:
> 
> diff --git a/arch/arm/lib/uaccess_with_memcpy.c 
> b/arch/arm/lib/uaccess_with_memcpy.c
> index 3e58d71001..4b39af2dfd 100644
> --- a/arch/arm/lib/uaccess_with_memcpy.c
> +++ b/arch/arm/lib/uaccess_with_memcpy.c
> @@ -96,7 +96,7 @@ __copy_to_user_memcpy(void __user *to, const void 
> *from, unsigned long n)
>  	}
>  
>  	/* the mmap semaphore is taken only if not in an atomic context 
> */
> -	atomic = in_atomic();
> +	atomic = faulthandler_disabled();
>  
>  	if (!atomic)
>  		down_read(&current->mm->mmap_sem);

Yeah, that fixes the problem I was seeing.

> 
> > Even if it is fixed, making it _depend_ on CPU_FEROCEON sounds like a
> > good idea if non-orion stuff isn't supposed to be enabling it.  Or
> > something like that.
> 
> It is not clear to me exactly which cores are affected.  That's why the 
> Kconfig entry was left open to all, in case it could benefit others.
> In theory it shouldn't be harmful to anyone notwithstanding the caveat
> mentioned in the help text.

This was on a calxeda highbank.

With CONFIG_UACCESS_WITH_MEMCPY, the dd copy test reported 1.8GB/s
Without CONFIG_UACCESS_WITH_MEMCPY, 2.1GB/s
Nicolas Pitre Aug. 12, 2015, 2:18 a.m. UTC | #2
On Tue, 11 Aug 2015, Mark Salter wrote:

> On Tue, 2015-08-11 at 16:20 -0400, Nicolas Pitre wrote:
> > On Tue, 11 Aug 2015, Russell King - ARM Linux wrote:
> > 
> > > It needs fixing or removing.
> > 
> > Absolutely. I'm not disputing that. I'm only asking so we can wisely 
> > choose between fixing or removing.  Personally I'd be inclined towards 
> > the later, unless the following is sufficient to fix it:
> > 
> > diff --git a/arch/arm/lib/uaccess_with_memcpy.c 
> > b/arch/arm/lib/uaccess_with_memcpy.c
> > index 3e58d71001..4b39af2dfd 100644
> > --- a/arch/arm/lib/uaccess_with_memcpy.c
> > +++ b/arch/arm/lib/uaccess_with_memcpy.c
> > @@ -96,7 +96,7 @@ __copy_to_user_memcpy(void __user *to, const void 
> > *from, unsigned long n)
> >  	}
> >  
> >  	/* the mmap semaphore is taken only if not in an atomic context 
> > */
> > -	atomic = in_atomic();
> > +	atomic = faulthandler_disabled();
> >  
> >  	if (!atomic)
> >  		down_read(&current->mm->mmap_sem);
> 
> Yeah, that fixes the problem I was seeing.

Good!  Then I'll add it to RMK's patch system.  May I add a tested-by 
tag with your name?

> This was on a calxeda highbank.

Any idea why this option was set?

> With CONFIG_UACCESS_WITH_MEMCPY, the dd copy test reported 1.8GB/s
> Without CONFIG_UACCESS_WITH_MEMCPY, 2.1GB/s

Therefore this is of no benefit to you.


Nicolas
Mark Salter Aug. 12, 2015, 1:33 p.m. UTC | #3
On Tue, 2015-08-11 at 22:18 -0400, Nicolas Pitre wrote:
> On Tue, 11 Aug 2015, Mark Salter wrote:
> 
> > On Tue, 2015-08-11 at 16:20 -0400, Nicolas Pitre wrote:
> > > On Tue, 11 Aug 2015, Russell King - ARM Linux wrote:
> > > 
> > > > It needs fixing or removing.
> > > 
> > > Absolutely. I'm not disputing that. I'm only asking so we can wisely 
> > > choose between fixing or removing.  Personally I'd be inclined 
> > > towards 
> > > the later, unless the following is sufficient to fix it:
> > > 
> > > diff --git a/arch/arm/lib/uaccess_with_memcpy.c 
> > > b/arch/arm/lib/uaccess_with_memcpy.c
> > > index 3e58d71001..4b39af2dfd 100644
> > > --- a/arch/arm/lib/uaccess_with_memcpy.c
> > > +++ b/arch/arm/lib/uaccess_with_memcpy.c
> > > @@ -96,7 +96,7 @@ __copy_to_user_memcpy(void __user *to, const void 
> > > *from, unsigned long n)
> > >  	}
> > >  
> > >  	/* the mmap semaphore is taken only if not in an atomic 
> > > context 
> > > */
> > > -	atomic = in_atomic();
> > > +	atomic = faulthandler_disabled();
> > >  
> > >  	if (!atomic)
> > >  		down_read(&current->mm->mmap_sem);
> > 
> > Yeah, that fixes the problem I was seeing.
> 
> Good!  Then I'll add it to RMK's patch system.  May I add a tested-by 
> tag with your name?

Yes

> 
> > This was on a calxeda highbank.
> 
> Any idea why this option was set?

None. It seems to have been there in the original fedora config
for v7 kernels. But nothing in the config turns on CPU_FEROCEON
so there is no real need for it.

> 
> > With CONFIG_UACCESS_WITH_MEMCPY, the dd copy test reported 1.8GB/s
> > Without CONFIG_UACCESS_WITH_MEMCPY, 2.1GB/s
> 
> Therefore this is of no benefit to you.
> 
> 
> Nicolas
Nicolas Pitre Aug. 12, 2015, 3:50 p.m. UTC | #4
On Wed, 12 Aug 2015, Mark Salter wrote:

> On Tue, 2015-08-11 at 22:18 -0400, Nicolas Pitre wrote:
> > On Tue, 11 Aug 2015, Mark Salter wrote:
> > 
> > > On Tue, 2015-08-11 at 16:20 -0400, Nicolas Pitre wrote:
> > > > diff --git a/arch/arm/lib/uaccess_with_memcpy.c 
> > > > b/arch/arm/lib/uaccess_with_memcpy.c
> > > > index 3e58d71001..4b39af2dfd 100644
> > > > --- a/arch/arm/lib/uaccess_with_memcpy.c
> > > > +++ b/arch/arm/lib/uaccess_with_memcpy.c
> > > > @@ -96,7 +96,7 @@ __copy_to_user_memcpy(void __user *to, const void 
> > > > *from, unsigned long n)
> > > >  	}
> > > >  
> > > >  	/* the mmap semaphore is taken only if not in an atomic 
> > > > context 
> > > > */
> > > > -	atomic = in_atomic();
> > > > +	atomic = faulthandler_disabled();
> > > >  
> > > >  	if (!atomic)
> > > >  		down_read(&current->mm->mmap_sem);
> > > 
> > > Yeah, that fixes the problem I was seeing.
> > 
> > Good!  Then I'll add it to RMK's patch system.  May I add a tested-by 
> > tag with your name?
> 
> Yes

Queued as patch #8414/1.


Nicolas
diff mbox

Patch

diff --git a/arch/arm/lib/uaccess_with_memcpy.c b/arch/arm/lib/uaccess_with_memcpy.c
index 3e58d71001..4b39af2dfd 100644
--- a/arch/arm/lib/uaccess_with_memcpy.c
+++ b/arch/arm/lib/uaccess_with_memcpy.c
@@ -96,7 +96,7 @@  __copy_to_user_memcpy(void __user *to, const void *from, unsigned long n)
 	}
 
 	/* the mmap semaphore is taken only if not in an atomic context */
-	atomic = in_atomic();
+	atomic = faulthandler_disabled();
 
 	if (!atomic)
 		down_read(&current->mm->mmap_sem);