diff mbox

[2/2] ARM: domains: add memory dependencies to get_domain/set_domain

Message ID E1ZaL45-0003fs-1k@rmk-PC.arm.linux.org.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Russell King Sept. 11, 2015, 9:56 a.m. UTC
We need to have memory dependencies on get_domain/set_domain to avoid
the compiler over-optimising these inline assembly instructions.

Loads/stores must not be reordered across a set_domain(), so introduce
a compiler barrier for that assembly.

The value of get_domain() must not be cached across a set_domain(), but
we still want to allow the compiler to optimise it away.  Introduce a
dependency on current_thread_info()->cpu_domain to avoid this; the new
memory clobber in set_domain() should therefore cause the compiler to
re-load this.  The other advantage of using this is we should have its
address in the register set already, or very soon after at most call
sites.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/domain.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Robert Jarzmik Sept. 11, 2015, 2:56 p.m. UTC | #1
Russell King <rmk+kernel@arm.linux.org.uk> writes:

> We need to have memory dependencies on get_domain/set_domain to avoid
> the compiler over-optimising these inline assembly instructions.
>
> Loads/stores must not be reordered across a set_domain(), so introduce
> a compiler barrier for that assembly.
>
> The value of get_domain() must not be cached across a set_domain(), but
> we still want to allow the compiler to optimise it away.  Introduce a
> dependency on current_thread_info()->cpu_domain to avoid this; the new
> memory clobber in set_domain() should therefore cause the compiler to
> re-load this.  The other advantage of using this is we should have its
> address in the register set already, or very soon after at most call
> sites.
>
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>

The test is failing without these 2 patches, while with them, an unaligned
access is fixed, and the generated code looks good :
     7ac:       ee13cf10        mrc     15, 0, ip, cr3, cr0, {0}
     7b0:       e3cc300c        bic     r3, ip, #12
     7b4:       e58dc014        str     ip, [sp, #20]
     7b8:       e3833004        orr     r3, r3, #4
     7bc:       ee033f10        mcr     15, 0, r3, cr3, cr0, {0}

Cheers.

--
Robert
Russell King - ARM Linux Sept. 11, 2015, 3:10 p.m. UTC | #2
On Fri, Sep 11, 2015 at 04:56:27PM +0200, Robert Jarzmik wrote:
> Russell King <rmk+kernel@arm.linux.org.uk> writes:
> > We need to have memory dependencies on get_domain/set_domain to avoid
> > the compiler over-optimising these inline assembly instructions.
> >
> > Loads/stores must not be reordered across a set_domain(), so introduce
> > a compiler barrier for that assembly.
> >
> > The value of get_domain() must not be cached across a set_domain(), but
> > we still want to allow the compiler to optimise it away.  Introduce a
> > dependency on current_thread_info()->cpu_domain to avoid this; the new
> > memory clobber in set_domain() should therefore cause the compiler to
> > re-load this.  The other advantage of using this is we should have its
> > address in the register set already, or very soon after at most call
> > sites.
> >
> > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
> 
> The test is failing without these 2 patches, while with them, an unaligned
> access is fixed, and the generated code looks good :
>      7ac:       ee13cf10        mrc     15, 0, ip, cr3, cr0, {0}
>      7b0:       e3cc300c        bic     r3, ip, #12
>      7b4:       e58dc014        str     ip, [sp, #20]
>      7b8:       e3833004        orr     r3, r3, #4
>      7bc:       ee033f10        mcr     15, 0, r3, cr3, cr0, {0}

Thanks Robert, I've queued these as fixes now.

If you want to put the bug alignment patch in the patch system, I'll get
that off to Linus this weekend too.
Robert Jarzmik Sept. 11, 2015, 3:40 p.m. UTC | #3
Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> If you want to put the bug alignment patch in the patch system, I'll get
> that off to Linus this weekend too.
Sure, that will be done before tomorrow evening.

Cheers.
diff mbox

Patch

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index e878129f2fee..fc8ba1663601 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -12,6 +12,7 @@ 
 
 #ifndef __ASSEMBLY__
 #include <asm/barrier.h>
+#include <asm/thread_info.h>
 #endif
 
 /*
@@ -89,7 +90,8 @@  static inline unsigned int get_domain(void)
 
 	asm(
 	"mrc	p15, 0, %0, c3, c0	@ get domain"
-	 : "=r" (domain));
+	 : "=r" (domain)
+	 : "m" (current_thread_info()->cpu_domain));
 
 	return domain;
 }
@@ -98,7 +100,7 @@  static inline void set_domain(unsigned val)
 {
 	asm volatile(
 	"mcr	p15, 0, %0, c3, c0	@ set domain"
-	  : : "r" (val));
+	  : : "r" (val) : "memory");
 	isb();
 }