Message ID | 20190415143821.GA46880@beast (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] arm64: sysreg: make mrs_s and msr_s macros work with Clang and LTO | expand |
On Mon, Apr 15, 2019 at 07:38:21AM -0700, Kees Cook wrote: > From: Alex Matveev <alxmtvv@gmail.com> > > Clang's integrated assembler does not allow assembly macros defined > in one inline asm block using the .macro directive to be used across > separate asm blocks. LLVM developers consider this a feature and not a > bug, recommending code refactoring: > > https://bugs.llvm.org/show_bug.cgi?id=19749 > > As binutils doesn't allow macros to be redefined, this change uses > UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros > in-place and workaround gcc and clang limitations on redefining macros > across different assembler blocks. > > Specifically, the current state after preprocessing looks like this: > > asm volatile(".macro mXX_s ... .endm"); > void f() > { > asm volatile("mXX_s a, b"); > } > > With GCC, it gives macro redefinition error because sysreg.h is included > in multiple source files, and assembler code for all of them is later > combined for LTO (I've seen an intermediate file with hundreds of > identical definitions). > > With clang, it gives macro undefined error because clang doesn't allow > sharing macros between inline asm statements. > > I also seem to remember catching another sort of undefined error with > GCC due to reordering of macro definition asm statement and generated > asm code for function that uses the macro. > > The solution with defining and undefining for each use, while certainly > not elegant, satisfies both GCC and clang, LTO and non-LTO. > > Signed-off-by: Alex Matveev <alxmtvv@gmail.com> > Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > Signed-off-by: Kees Cook <keescook@chromium.org> > --- > v3: split out patch as stand-alone, added more uses in irqflags, > updated commit log, based on discussion in > https://lore.kernel.org/patchwork/patch/851580/ > --- > arch/arm64/include/asm/irqflags.h | 12 +++++-- > arch/arm64/include/asm/kvm_hyp.h | 8 +++-- > arch/arm64/include/asm/sysreg.h | 55 +++++++++++++++++++++---------- > 3 files changed, 53 insertions(+), 22 deletions(-) > > diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h > index 43d8366c1e87..06d3987d1546 100644 > --- a/arch/arm64/include/asm/irqflags.h > +++ b/arch/arm64/include/asm/irqflags.h > @@ -43,7 +43,9 @@ static inline void arch_local_irq_enable(void) > asm volatile(ALTERNATIVE( > "msr daifclr, #2 // arch_local_irq_enable\n" > "nop", > + DEFINE_MSR_S > "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%0\n" > + UNDEFINE_MSR_S If we do need this, can we wrap this in a larger CPP macro that does the whole sequence of defining, using, and undefining the asm macros? It would be nice if we could simply rely on a more recent binutils these days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg definition. That would mean we could get rid of the whole msr_s/mrs_s hack by turning that into a CPP macro which built that name. It looks like binutils has been able to do that since September 2014... Are folk using toolchains older than that to compile kernels? Thanks, Mark.
On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Mon, Apr 15, 2019 at 07:38:21AM -0700, Kees Cook wrote: > > From: Alex Matveev <alxmtvv@gmail.com> > > > > Clang's integrated assembler does not allow assembly macros defined > > in one inline asm block using the .macro directive to be used across > > separate asm blocks. LLVM developers consider this a feature and not a > > bug, recommending code refactoring: > > > > https://bugs.llvm.org/show_bug.cgi?id=19749 > > > > As binutils doesn't allow macros to be redefined, this change uses > > UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros > > in-place and workaround gcc and clang limitations on redefining macros > > across different assembler blocks. > > > > Specifically, the current state after preprocessing looks like this: > > > > asm volatile(".macro mXX_s ... .endm"); > > void f() > > { > > asm volatile("mXX_s a, b"); > > } > > > > With GCC, it gives macro redefinition error because sysreg.h is included > > in multiple source files, and assembler code for all of them is later > > combined for LTO (I've seen an intermediate file with hundreds of > > identical definitions). > > > > With clang, it gives macro undefined error because clang doesn't allow > > sharing macros between inline asm statements. > > > > I also seem to remember catching another sort of undefined error with > > GCC due to reordering of macro definition asm statement and generated > > asm code for function that uses the macro. > > > > The solution with defining and undefining for each use, while certainly > > not elegant, satisfies both GCC and clang, LTO and non-LTO. > > > > Signed-off-by: Alex Matveev <alxmtvv@gmail.com> > > Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > Signed-off-by: Kees Cook <keescook@chromium.org> > > --- > > v3: split out patch as stand-alone, added more uses in irqflags, > > updated commit log, based on discussion in > > https://lore.kernel.org/patchwork/patch/851580/ > > --- > > arch/arm64/include/asm/irqflags.h | 12 +++++-- > > arch/arm64/include/asm/kvm_hyp.h | 8 +++-- > > arch/arm64/include/asm/sysreg.h | 55 +++++++++++++++++++++---------- > > 3 files changed, 53 insertions(+), 22 deletions(-) > > > > diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h > > index 43d8366c1e87..06d3987d1546 100644 > > --- a/arch/arm64/include/asm/irqflags.h > > +++ b/arch/arm64/include/asm/irqflags.h > > @@ -43,7 +43,9 @@ static inline void arch_local_irq_enable(void) > > asm volatile(ALTERNATIVE( > > "msr daifclr, #2 // arch_local_irq_enable\n" > > "nop", > > + DEFINE_MSR_S > > "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%0\n" > > + UNDEFINE_MSR_S > > If we do need this, can we wrap this in a larger CPP macro that does the > whole sequence of defining, using, and undefining the asm macros? I agree that would be cleaner. For example, __mrs_s/__msr_s can ALMOST do this (be used in arch/arm64/include/asm/irqflags.h) except for the input/output constraints. Maybe the constraints can be removed from the cpp macro definition, then the constraints be left to the macro expansion sites? That way the DEFINE_MXX_S/UNDEFINE_MXX_S can be hidden in the cpp macro itself. > > It would be nice if we could simply rely on a more recent binutils these > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg > definition. That would mean we could get rid of the whole msr_s/mrs_s > hack by turning that into a CPP macro which built that name. > > It looks like binutils has been able to do that since September 2014... > > Are folk using toolchains older than that to compile kernels? Do you have a link to a commit? If we can pinpoint the binutils version, that might help. Also, I look forward to this patch for use of Clang's integrated assembler (regardless of LTO). I remember getting frustrated trying to figure out how to resolve this for both assemblers, and I had forgotten this solution existed.
On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote: > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote: > > It would be nice if we could simply rely on a more recent binutils these > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg > > definition. That would mean we could get rid of the whole msr_s/mrs_s > > hack by turning that into a CPP macro which built that name. > > > > It looks like binutils has been able to do that since September 2014... > > > > Are folk using toolchains older than that to compile kernels? > > Do you have a link to a commit? If we can pinpoint the binutils > version, that might help. IIUC any version of binutils with commit: df7b4545b2b49572 ("[PATCH/AArch64] Generic support for all system registers using mrs and msr") ... should be sufficent. That's on gitweb at: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=df7b4545b2b49572ab63690c130df973af109615 > Also, I look forward to this patch for use of Clang's integrated > assembler (regardless of LTO). I remember getting frustrated trying > to figure out how to resolve this for both assemblers, and I had > forgotten this solution existed. Is this the only blocker for the integrated assembler? Thanks, Mark.
On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote: > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Mon, Apr 15, 2019 at 07:38:21AM -0700, Kees Cook wrote: > > > diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h > > > index 43d8366c1e87..06d3987d1546 100644 > > > --- a/arch/arm64/include/asm/irqflags.h > > > +++ b/arch/arm64/include/asm/irqflags.h > > > @@ -43,7 +43,9 @@ static inline void arch_local_irq_enable(void) > > > asm volatile(ALTERNATIVE( > > > "msr daifclr, #2 // arch_local_irq_enable\n" > > > "nop", > > > + DEFINE_MSR_S > > > "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%0\n" > > > + UNDEFINE_MSR_S > > > > If we do need this, can we wrap this in a larger CPP macro that does the > > whole sequence of defining, using, and undefining the asm macros? > > I agree that would be cleaner. For example, __mrs_s/__msr_s can > ALMOST do this (be used in arch/arm64/include/asm/irqflags.h) except > for the input/output constraints. Maybe the constraints can be > removed from the cpp macro definition, then the constraints be left to > the macro expansion sites? That way the DEFINE_MXX_S/UNDEFINE_MXX_S > can be hidden in the cpp macro itself. If those took the register template explicitly, i.e. something like: __mrs_s(SYS_ICC_PMR_EL1, "%x0") __msr_s("%[whatever]", SYS_FOO) ... that would be more generally useful and clear, as they could be used in larger asm sequences. Thanks, Mark.
On Tue, Apr 16, 2019 at 10:08 AM Mark Rutland <mark.rutland@arm.com> wrote: > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote: > > Also, I look forward to this patch for use of Clang's integrated > > assembler (regardless of LTO). I remember getting frustrated trying > > to figure out how to resolve this for both assemblers, and I had > > forgotten this solution existed. > > Is this the only blocker for the integrated assembler? I don't have a good handle on the number of issues that clang's integrated assembler has for arm64 kernels, but IIRC last time I checked I think there were under 10. It's my goal over the next quarter to investigate and get bugs filed so we know the order of magnitude of issues. This particular issue was the trickier/trickiest one. Linaro also recently has added staffing to beef up Clang's integrated assembler for arm32/arm64.
On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote: > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote: > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > It would be nice if we could simply rely on a more recent binutils these > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg > > > definition. That would mean we could get rid of the whole msr_s/mrs_s > > > hack by turning that into a CPP macro which built that name. > > > > > > It looks like binutils has been able to do that since September 2014... > > > > > > Are folk using toolchains older than that to compile kernels? > > > > Do you have a link to a commit? If we can pinpoint the binutils > > version, that might help. > > IIUC any version of binutils with commit: > > df7b4545b2b49572 ("[PATCH/AArch64] Generic support for all system registers using mrs and msr") > > ... should be sufficent. This appears to be binutils 2.25: $ git describe --match 'binutils-2_*' --contains df7b4545b2b49572 binutils-2_25~418 Documentation/process/changes.rst lists 2.20 as current minimum for binutils. Ubuntu's old LTS (Trusty, Apr 2014) has 2.24, and that release is about to exit support this month. Debian's old-stable (Apr 2015) has 2.25. It's about to exit support when Debian Buster releases. RHEL6 (2010) has 2.20. RHEL7 (2014) has 2.25. It seems not unreasonable to require 2.25 at least for arm64 builds?
On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote: > > On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote: > > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > It would be nice if we could simply rely on a more recent binutils these > > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg > > > > definition. That would mean we could get rid of the whole msr_s/mrs_s > > > > hack by turning that into a CPP macro which built that name. Mark, can you give me a test case for this? I'd like to check if clang's integrated assembler supports this or not, so we can file an issue and fix it if not. > > > > > > > > It looks like binutils has been able to do that since September 2014... > > > > > > > > Are folk using toolchains older than that to compile kernels? > > > > > > Do you have a link to a commit? If we can pinpoint the binutils > > > version, that might help. > > > > IIUC any version of binutils with commit: > > > > df7b4545b2b49572 ("[PATCH/AArch64] Generic support for all system registers using mrs and msr") > > > > ... should be sufficent. > > This appears to be binutils 2.25: > > $ git describe --match 'binutils-2_*' --contains df7b4545b2b49572 > binutils-2_25~418 > > Documentation/process/changes.rst lists 2.20 as current minimum for binutils. > > Ubuntu's old LTS (Trusty, Apr 2014) has 2.24, and that release is > about to exit support this month. > > Debian's old-stable (Apr 2015) has 2.25. It's about to exit support > when Debian Buster releases. > > RHEL6 (2010) has 2.20. RHEL7 (2014) has 2.25. > > It seems not unreasonable to require 2.25 at least for arm64 builds? + Arnd who has a really good feel for which distros ship what.
On Mon, Apr 22, 2019 at 10:44:10AM -0700, Nick Desaulniers wrote: > On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote: > > > > On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote: > > > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > It would be nice if we could simply rely on a more recent binutils these > > > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg > > > > > definition. That would mean we could get rid of the whole msr_s/mrs_s > > > > > hack by turning that into a CPP macro which built that name. > > Mark, can you give me a test case for this? I'd like to check if > clang's integrated assembler supports this or not, so we can file an > issue and fix it if not. The below is a smoke test. The entire op0:op1:cn:cm:op2 space is 14 bits, so it should be feasible to generate and run an exhautive test for all encodings. I've included SYS and SYSL here too, since I strongly suspect we'll need those going forward. Thanks Mark. ---->8---- /* * ${CC} -c test.c * * OLD_* encodings already exist, NEW_* encodings haven't yet be * allocated (per ARM DDI 0487D.a), but should assemble. All encodings * chosen arbitrarily. */ // OSDTRRX_EL1 #define OLD_DBG_REG "s2_0_c0_c0_2" #define NEW_DBG_REG "s2_7_c0_c15_0" // MIDR_EL1 #define OLD_SYS_REG "s3_0_c0_c0_0" #define NEW_SYS_REG "s3_6_c1_c0_7" #define REG(r) \ asm volatile("msr " r ", xzr\n"); \ asm volatile("mrs xzr, " r "\n") // DC IVAC, XZR #define OLD_SYS "#0, c7, c6, #1" #define NEW_SYS "#7, c15, c15, #7" #define NEW_SYSL_MIN "#0, c0, c0, #0" #define NEW_SYSL_MAX "#7, c15, c15, #7" #define SYS(s) asm volatile("sys " s ", xzr\n") #define SYSL(s) asm volatile("sysl xzr, " s "\n") void test(void) { REG(OLD_DBG_REG); REG(NEW_DBG_REG); REG(OLD_SYS_REG); REG(NEW_SYS_REG); SYS(OLD_SYS); SYS(NEW_SYS); SYSL(NEW_SYSL_MIN); SYSL(NEW_SYSL_MAX); }
On Tue, Apr 23, 2019 at 7:12 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Mon, Apr 22, 2019 at 10:44:10AM -0700, Nick Desaulniers wrote: > > On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote: > > > > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > It would be nice if we could simply rely on a more recent binutils these > > > > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg > > > > > > definition. That would mean we could get rid of the whole msr_s/mrs_s > > > > > > hack by turning that into a CPP macro which built that name. > > > > Mark, can you give me a test case for this? I'd like to check if > > clang's integrated assembler supports this or not, so we can file an > > issue and fix it if not. > > The below is a smoke test. The entire op0:op1:cn:cm:op2 space is 14 > bits, so it should be feasible to generate and run an exhautive test for > all encodings. > > I've included SYS and SYSL here too, since I strongly suspect we'll need > those going forward. > > Thanks > Mark. > > ---->8---- > /* > * ${CC} -c test.c > * > * OLD_* encodings already exist, NEW_* encodings haven't yet be > * allocated (per ARM DDI 0487D.a), but should assemble. All encodings > * chosen arbitrarily. > */ > > // OSDTRRX_EL1 > #define OLD_DBG_REG "s2_0_c0_c0_2" > #define NEW_DBG_REG "s2_7_c0_c15_0" > > // MIDR_EL1 > #define OLD_SYS_REG "s3_0_c0_c0_0" > #define NEW_SYS_REG "s3_6_c1_c0_7" > > #define REG(r) \ > asm volatile("msr " r ", xzr\n"); \ > asm volatile("mrs xzr, " r "\n") > > // DC IVAC, XZR > #define OLD_SYS "#0, c7, c6, #1" > #define NEW_SYS "#7, c15, c15, #7" > > #define NEW_SYSL_MIN "#0, c0, c0, #0" > #define NEW_SYSL_MAX "#7, c15, c15, #7" > > #define SYS(s) asm volatile("sys " s ", xzr\n") > #define SYSL(s) asm volatile("sysl xzr, " s "\n") > > void test(void) > { > REG(OLD_DBG_REG); > REG(NEW_DBG_REG); > > REG(OLD_SYS_REG); > REG(NEW_SYS_REG); > > SYS(OLD_SYS); > SYS(NEW_SYS); > > SYSL(NEW_SYSL_MIN); > SYSL(NEW_SYSL_MAX); > } Hmmm...yes looks like Clang's integrated assembler is missing support. https://godbolt.org/z/wDBpz4 https://bugs.llvm.org/show_bug.cgi?id=41573 https://github.com/ClangBuiltLinux/linux/issues/453 We'll follow up on getting that fixed. Kees, are you able to respin the patch with the above requested changes from Mark to the macros? Let me know what I can do to help.
On Tue, Apr 23, 2019 at 11:15 AM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Tue, Apr 23, 2019 at 7:12 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Mon, Apr 22, 2019 at 10:44:10AM -0700, Nick Desaulniers wrote: > > > On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > > > On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote: > > > > > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > It would be nice if we could simply rely on a more recent binutils these > > > > > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg > > > > > > > definition. That would mean we could get rid of the whole msr_s/mrs_s > > > > > > > hack by turning that into a CPP macro which built that name. > > > > > > Mark, can you give me a test case for this? I'd like to check if > > > clang's integrated assembler supports this or not, so we can file an > > > issue and fix it if not. > > > > The below is a smoke test. The entire op0:op1:cn:cm:op2 space is 14 > > bits, so it should be feasible to generate and run an exhautive test for > > all encodings. > > > > I've included SYS and SYSL here too, since I strongly suspect we'll need > > those going forward. > > > > Thanks > > Mark. > > > > ---->8---- > > /* > > * ${CC} -c test.c > > * > > * OLD_* encodings already exist, NEW_* encodings haven't yet be > > * allocated (per ARM DDI 0487D.a), but should assemble. All encodings > > * chosen arbitrarily. > > */ > > > > // OSDTRRX_EL1 > > #define OLD_DBG_REG "s2_0_c0_c0_2" > > #define NEW_DBG_REG "s2_7_c0_c15_0" > > > > // MIDR_EL1 > > #define OLD_SYS_REG "s3_0_c0_c0_0" > > #define NEW_SYS_REG "s3_6_c1_c0_7" > > > > #define REG(r) \ > > asm volatile("msr " r ", xzr\n"); \ > > asm volatile("mrs xzr, " r "\n") > > > > // DC IVAC, XZR > > #define OLD_SYS "#0, c7, c6, #1" > > #define NEW_SYS "#7, c15, c15, #7" > > > > #define NEW_SYSL_MIN "#0, c0, c0, #0" > > #define NEW_SYSL_MAX "#7, c15, c15, #7" > > > > #define SYS(s) asm volatile("sys " s ", xzr\n") > > #define SYSL(s) asm volatile("sysl xzr, " s "\n") > > > > void test(void) > > { > > REG(OLD_DBG_REG); > > REG(NEW_DBG_REG); > > > > REG(OLD_SYS_REG); > > REG(NEW_SYS_REG); > > > > SYS(OLD_SYS); > > SYS(NEW_SYS); > > > > SYSL(NEW_SYSL_MIN); > > SYSL(NEW_SYSL_MAX); > > } > > Hmmm...yes looks like Clang's integrated assembler is missing support. > https://godbolt.org/z/wDBpz4 > https://bugs.llvm.org/show_bug.cgi?id=41573 > https://github.com/ClangBuiltLinux/linux/issues/453 > We'll follow up on getting that fixed. > > Kees, are you able to respin the patch with the above requested > changes from Mark to the macros? Let me know what I can do to help. Doesn't this mean arm64 must raise their minimum binutils to 2.25? Is that acceptable? I hadn't seen that question answered yet. (Maybe I missed it?)
On Tue, Apr 23, 2019 at 8:25 PM Kees Cook <keescook@chromium.org> wrote: > On Tue, Apr 23, 2019 at 11:15 AM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Tue, Apr 23, 2019 at 7:12 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > On Mon, Apr 22, 2019 at 10:44:10AM -0700, Nick Desaulniers wrote: > > > > On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > > > > > On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > > > > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote: > > > > > > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > It would be nice if we could simply rely on a more recent binutils these > > > > > > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg > > > > > > > > definition. That would mean we could get rid of the whole msr_s/mrs_s > > > > > > > > hack by turning that into a CPP macro which built that name. > > > > > > > > Mark, can you give me a test case for this? I'd like to check if > > > > clang's integrated assembler supports this or not, so we can file an > > > > issue and fix it if not. > > > > > > The below is a smoke test. The entire op0:op1:cn:cm:op2 space is 14 > > > bits, so it should be feasible to generate and run an exhautive test for > > > all encodings. > > > > > > I've included SYS and SYSL here too, since I strongly suspect we'll need > > > those going forward. > > > > > > Thanks > > > Mark. > > > > > > ---->8---- > > > /* > > > * ${CC} -c test.c > > > * > > > * OLD_* encodings already exist, NEW_* encodings haven't yet be > > > * allocated (per ARM DDI 0487D.a), but should assemble. All encodings > > > * chosen arbitrarily. > > > */ > > > > > > // OSDTRRX_EL1 > > > #define OLD_DBG_REG "s2_0_c0_c0_2" > > > #define NEW_DBG_REG "s2_7_c0_c15_0" > > > > > > // MIDR_EL1 > > > #define OLD_SYS_REG "s3_0_c0_c0_0" > > > #define NEW_SYS_REG "s3_6_c1_c0_7" > > > > > > #define REG(r) \ > > > asm volatile("msr " r ", xzr\n"); \ > > > asm volatile("mrs xzr, " r "\n") > > > > > > // DC IVAC, XZR > > > #define OLD_SYS "#0, c7, c6, #1" > > > #define NEW_SYS "#7, c15, c15, #7" > > > > > > #define NEW_SYSL_MIN "#0, c0, c0, #0" > > > #define NEW_SYSL_MAX "#7, c15, c15, #7" > > > > > > #define SYS(s) asm volatile("sys " s ", xzr\n") > > > #define SYSL(s) asm volatile("sysl xzr, " s "\n") > > > > > > void test(void) > > > { > > > REG(OLD_DBG_REG); > > > REG(NEW_DBG_REG); > > > > > > REG(OLD_SYS_REG); > > > REG(NEW_SYS_REG); > > > > > > SYS(OLD_SYS); > > > SYS(NEW_SYS); > > > > > > SYSL(NEW_SYSL_MIN); > > > SYSL(NEW_SYSL_MAX); > > > } > > > > Hmmm...yes looks like Clang's integrated assembler is missing support. > > https://godbolt.org/z/wDBpz4 It looks like you forgot to pass --target=aarch64-linux-gnu to clang here, which produces a build failure. This is the output with the added flag: https://godbolt.org/z/HYoEH1 > > https://bugs.llvm.org/show_bug.cgi?id=41573 > > https://github.com/ClangBuiltLinux/linux/issues/453 > > We'll follow up on getting that fixed. > > > > Kees, are you able to respin the patch with the above requested > > changes from Mark to the macros? Let me know what I can do to help. > > Doesn't this mean arm64 must raise their minimum binutils to 2.25? Is > that acceptable? I hadn't seen that question answered yet. (Maybe I > missed it?) Generally I'd say that's a bit too new for a minimum version. On the other hand, most distros I could find for arm64 have at least 2.25 or later, the oldest one I found being Debian Jessie. Ubuntu Trusty (14.04) was using 2.24 but has its end of support date this month, after five years. However, there are probably still cross-compile toolchains around that use older aarch64-binutils, so it probably hurts someone. Arnd
On Mon, Apr 15, 2019 at 7:38 AM Kees Cook <keescook@chromium.org> wrote: > +#define __msr_s(r, v) \ > + DEFINE_MSR_S \ > +" msr_s " __stringify(r) ", %x0\n" \ > + UNDEFINE_MSR_S : : "rZ" (v) BTW ... is "%x0" a typo here? Shouldn't this just be "%0"?
On Wed, 24 Apr 2019 at 00:55, Kees Cook <keescook@chromium.org> wrote: > > On Mon, Apr 15, 2019 at 7:38 AM Kees Cook <keescook@chromium.org> wrote: > > +#define __msr_s(r, v) \ > > + DEFINE_MSR_S \ > > +" msr_s " __stringify(r) ", %x0\n" \ > > + UNDEFINE_MSR_S : : "rZ" (v) > > BTW ... is "%x0" a typo here? Shouldn't this just be "%0"? > No, iirc, that is required to force clang to choose a x# encoding (which is the only one supported by msr/mrs instructions). Otherwise, it will infer from the operand size which encoding to use, and attempt to use w# for 32-bit system registers.
On Tue, Apr 23, 2019 at 3:57 PM Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > > On Wed, 24 Apr 2019 at 00:55, Kees Cook <keescook@chromium.org> wrote: > > > > On Mon, Apr 15, 2019 at 7:38 AM Kees Cook <keescook@chromium.org> wrote: > > > +#define __msr_s(r, v) \ > > > + DEFINE_MSR_S \ > > > +" msr_s " __stringify(r) ", %x0\n" \ > > > + UNDEFINE_MSR_S : : "rZ" (v) > > > > BTW ... is "%x0" a typo here? Shouldn't this just be "%0"? > > > > No, iirc, that is required to force clang to choose a x# encoding > (which is the only one supported by msr/mrs instructions). Otherwise, > it will infer from the operand size which encoding to use, and attempt > to use w# for 32-bit system registers. Yes: https://github.com/llvm/llvm-project/blob/80b578c7325352187d0aff7bd2bdb5a76b51cf37/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp#L466
On Tue, Apr 23, 2019 at 4:00 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Tue, Apr 23, 2019 at 3:57 PM Ard Biesheuvel > <ard.biesheuvel@linaro.org> wrote: > > > > On Wed, 24 Apr 2019 at 00:55, Kees Cook <keescook@chromium.org> wrote: > > > > > > On Mon, Apr 15, 2019 at 7:38 AM Kees Cook <keescook@chromium.org> wrote: > > > > +#define __msr_s(r, v) \ > > > > + DEFINE_MSR_S \ > > > > +" msr_s " __stringify(r) ", %x0\n" \ > > > > + UNDEFINE_MSR_S : : "rZ" (v) > > > > > > BTW ... is "%x0" a typo here? Shouldn't this just be "%0"? > > > > > > > No, iirc, that is required to force clang to choose a x# encoding > > (which is the only one supported by msr/mrs instructions). Otherwise, > > it will infer from the operand size which encoding to use, and attempt > > to use w# for 32-bit system registers. > > Yes: https://github.com/llvm/llvm-project/blob/80b578c7325352187d0aff7bd2bdb5a76b51cf37/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp#L466 Cool, yeah, I see now. Thanks!
diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h index 43d8366c1e87..06d3987d1546 100644 --- a/arch/arm64/include/asm/irqflags.h +++ b/arch/arm64/include/asm/irqflags.h @@ -43,7 +43,9 @@ static inline void arch_local_irq_enable(void) asm volatile(ALTERNATIVE( "msr daifclr, #2 // arch_local_irq_enable\n" "nop", + DEFINE_MSR_S "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%0\n" + UNDEFINE_MSR_S "dsb sy", ARM64_HAS_IRQ_PRIO_MASKING) : @@ -55,7 +57,9 @@ static inline void arch_local_irq_disable(void) { asm volatile(ALTERNATIVE( "msr daifset, #2 // arch_local_irq_disable", - "msr_s " __stringify(SYS_ICC_PMR_EL1) ", %0", + DEFINE_MSR_S + "msr_s " __stringify(SYS_ICC_PMR_EL1) ", %0\n" + UNDEFINE_MSR_S, ARM64_HAS_IRQ_PRIO_MASKING) : : "r" ((unsigned long) GIC_PRIO_IRQOFF) @@ -85,8 +89,10 @@ static inline unsigned long arch_local_save_flags(void) asm volatile(ALTERNATIVE( "mov %0, %1\n" "nop\n" - "nop", + "nop\n", + DEFINE_MRS_S "mrs_s %0, " __stringify(SYS_ICC_PMR_EL1) "\n" + UNDEFINE_MRS_S "ands %1, %1, " __stringify(PSR_I_BIT) "\n" "csel %0, %0, %2, eq", ARM64_HAS_IRQ_PRIO_MASKING) @@ -116,7 +122,9 @@ static inline void arch_local_irq_restore(unsigned long flags) asm volatile(ALTERNATIVE( "msr daif, %0\n" "nop", + DEFINE_MSR_S "msr_s " __stringify(SYS_ICC_PMR_EL1) ", %0\n" + UNDEFINE_MSR_S "dsb sy", ARM64_HAS_IRQ_PRIO_MASKING) : "+r" (flags) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index 4da765f2cca5..9d0e7f546240 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -30,7 +30,9 @@ ({ \ u64 reg; \ asm volatile(ALTERNATIVE("mrs %0, " __stringify(r##nvh),\ - "mrs_s %0, " __stringify(r##vh),\ + DEFINE_MRS_S \ + "mrs_s %0, " __stringify(r##vh) "\n"\ + UNDEFINE_MRS_S, \ ARM64_HAS_VIRT_HOST_EXTN) \ : "=r" (reg)); \ reg; \ @@ -40,7 +42,9 @@ do { \ u64 __val = (u64)(v); \ asm volatile(ALTERNATIVE("msr " __stringify(r##nvh) ", %x0",\ - "msr_s " __stringify(r##vh) ", %x0",\ + DEFINE_MSR_S \ + "msr_s " __stringify(r##vh) ", %x0\n"\ + UNDEFINE_MSR_S, \ ARM64_HAS_VIRT_HOST_EXTN) \ : : "rZ" (__val)); \ } while (0) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 5b267dec6194..e9f64c237a7c 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -746,20 +746,39 @@ #include <linux/build_bug.h> #include <linux/types.h> -asm( -" .irp num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n" -" .equ .L__reg_num_x\\num, \\num\n" -" .endr\n" +#define __DEFINE_MRS_MSR_S_REGNUM \ +" .irp num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n" \ +" .equ .L__reg_num_x\\num, \\num\n" \ +" .endr\n" \ " .equ .L__reg_num_xzr, 31\n" -"\n" -" .macro mrs_s, rt, sreg\n" - __emit_inst(0xd5200000|(\\sreg)|(.L__reg_num_\\rt)) + +#define DEFINE_MRS_S \ + __DEFINE_MRS_MSR_S_REGNUM \ +" .macro mrs_s, rt, sreg\n" \ + __emit_inst(0xd5200000|(\\sreg)|(.L__reg_num_\\rt)) \ " .endm\n" -"\n" -" .macro msr_s, sreg, rt\n" - __emit_inst(0xd5000000|(\\sreg)|(.L__reg_num_\\rt)) + +#define DEFINE_MSR_S \ + __DEFINE_MRS_MSR_S_REGNUM \ +" .macro msr_s, sreg, rt\n" \ + __emit_inst(0xd5000000|(\\sreg)|(.L__reg_num_\\rt)) \ " .endm\n" -); + +#define UNDEFINE_MRS_S \ +" .purgem mrs_s\n" + +#define UNDEFINE_MSR_S \ +" .purgem msr_s\n" + +#define __mrs_s(r, v) \ + DEFINE_MRS_S \ +" mrs_s %0, " __stringify(r) "\n" \ + UNDEFINE_MRS_S : "=r" (v) + +#define __msr_s(r, v) \ + DEFINE_MSR_S \ +" msr_s " __stringify(r) ", %x0\n" \ + UNDEFINE_MSR_S : : "rZ" (v) /* * Unlike read_cpuid, calls to read_sysreg are never expected to be @@ -785,15 +804,15 @@ asm( * For registers without architectural names, or simply unsupported by * GAS. */ -#define read_sysreg_s(r) ({ \ - u64 __val; \ - asm volatile("mrs_s %0, " __stringify(r) : "=r" (__val)); \ - __val; \ +#define read_sysreg_s(r) ({ \ + u64 __val; \ + asm volatile(__mrs_s(r, __val)); \ + __val; \ }) -#define write_sysreg_s(v, r) do { \ - u64 __val = (u64)(v); \ - asm volatile("msr_s " __stringify(r) ", %x0" : : "rZ" (__val)); \ +#define write_sysreg_s(v, r) do { \ + u64 __val = (u64)(v); \ + asm volatile(__msr_s(r, __val)); \ } while (0) /*