diff mbox series

[v3] arm64: sysreg: make mrs_s and msr_s macros work with Clang and LTO

Message ID 20190415143821.GA46880@beast (mailing list archive)
State New, archived
Headers show
Series [v3] arm64: sysreg: make mrs_s and msr_s macros work with Clang and LTO | expand

Commit Message

Kees Cook April 15, 2019, 2:38 p.m. UTC
From: Alex Matveev <alxmtvv@gmail.com>

Clang's integrated assembler does not allow assembly macros defined
in one inline asm block using the .macro directive to be used across
separate asm blocks. LLVM developers consider this a feature and not a
bug, recommending code refactoring:

  https://bugs.llvm.org/show_bug.cgi?id=19749

As binutils doesn't allow macros to be redefined, this change uses
UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
in-place and workaround gcc and clang limitations on redefining macros
across different assembler blocks.

Specifically, the current state after preprocessing looks like this:

asm volatile(".macro mXX_s ... .endm");
void f()
{
	asm volatile("mXX_s a, b");
}

With GCC, it gives macro redefinition error because sysreg.h is included
in multiple source files, and assembler code for all of them is later
combined for LTO (I've seen an intermediate file with hundreds of
identical definitions).

With clang, it gives macro undefined error because clang doesn't allow
sharing macros between inline asm statements.

I also seem to remember catching another sort of undefined error with
GCC due to reordering of macro definition asm statement and generated
asm code for function that uses the macro.

The solution with defining and undefining for each use, while certainly
not elegant, satisfies both GCC and clang, LTO and non-LTO.

Signed-off-by: Alex Matveev <alxmtvv@gmail.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
v3: split out patch as stand-alone, added more uses in irqflags,
    updated commit log, based on discussion in
    https://lore.kernel.org/patchwork/patch/851580/
---
 arch/arm64/include/asm/irqflags.h | 12 +++++--
 arch/arm64/include/asm/kvm_hyp.h  |  8 +++--
 arch/arm64/include/asm/sysreg.h   | 55 +++++++++++++++++++++----------
 3 files changed, 53 insertions(+), 22 deletions(-)

Comments

Mark Rutland April 15, 2019, 5:06 p.m. UTC | #1
On Mon, Apr 15, 2019 at 07:38:21AM -0700, Kees Cook wrote:
> From: Alex Matveev <alxmtvv@gmail.com>
> 
> Clang's integrated assembler does not allow assembly macros defined
> in one inline asm block using the .macro directive to be used across
> separate asm blocks. LLVM developers consider this a feature and not a
> bug, recommending code refactoring:
> 
>   https://bugs.llvm.org/show_bug.cgi?id=19749
> 
> As binutils doesn't allow macros to be redefined, this change uses
> UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
> in-place and workaround gcc and clang limitations on redefining macros
> across different assembler blocks.
> 
> Specifically, the current state after preprocessing looks like this:
> 
> asm volatile(".macro mXX_s ... .endm");
> void f()
> {
> 	asm volatile("mXX_s a, b");
> }
> 
> With GCC, it gives macro redefinition error because sysreg.h is included
> in multiple source files, and assembler code for all of them is later
> combined for LTO (I've seen an intermediate file with hundreds of
> identical definitions).
> 
> With clang, it gives macro undefined error because clang doesn't allow
> sharing macros between inline asm statements.
> 
> I also seem to remember catching another sort of undefined error with
> GCC due to reordering of macro definition asm statement and generated
> asm code for function that uses the macro.
> 
> The solution with defining and undefining for each use, while certainly
> not elegant, satisfies both GCC and clang, LTO and non-LTO.
> 
> Signed-off-by: Alex Matveev <alxmtvv@gmail.com>
> Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
> v3: split out patch as stand-alone, added more uses in irqflags,
>     updated commit log, based on discussion in
>     https://lore.kernel.org/patchwork/patch/851580/
> ---
>  arch/arm64/include/asm/irqflags.h | 12 +++++--
>  arch/arm64/include/asm/kvm_hyp.h  |  8 +++--
>  arch/arm64/include/asm/sysreg.h   | 55 +++++++++++++++++++++----------
>  3 files changed, 53 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
> index 43d8366c1e87..06d3987d1546 100644
> --- a/arch/arm64/include/asm/irqflags.h
> +++ b/arch/arm64/include/asm/irqflags.h
> @@ -43,7 +43,9 @@ static inline void arch_local_irq_enable(void)
>  	asm volatile(ALTERNATIVE(
>  		"msr	daifclr, #2		// arch_local_irq_enable\n"
>  		"nop",
> +		DEFINE_MSR_S
>  		"msr_s  " __stringify(SYS_ICC_PMR_EL1) ",%0\n"
> +		UNDEFINE_MSR_S

If we do need this, can we wrap this in a larger CPP macro that does the
whole sequence of defining, using, and undefining the asm macros?

It would be nice if we could simply rely on a more recent binutils these
days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
definition. That would mean we could get rid of the whole msr_s/mrs_s
hack by turning that into a CPP macro which built that name.

It looks like binutils has been able to do that since September 2014...

Are folk using toolchains older than that to compile kernels?

Thanks,
Mark.
Nick Desaulniers April 15, 2019, 5:22 p.m. UTC | #2
On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Mon, Apr 15, 2019 at 07:38:21AM -0700, Kees Cook wrote:
> > From: Alex Matveev <alxmtvv@gmail.com>
> >
> > Clang's integrated assembler does not allow assembly macros defined
> > in one inline asm block using the .macro directive to be used across
> > separate asm blocks. LLVM developers consider this a feature and not a
> > bug, recommending code refactoring:
> >
> >   https://bugs.llvm.org/show_bug.cgi?id=19749
> >
> > As binutils doesn't allow macros to be redefined, this change uses
> > UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
> > in-place and workaround gcc and clang limitations on redefining macros
> > across different assembler blocks.
> >
> > Specifically, the current state after preprocessing looks like this:
> >
> > asm volatile(".macro mXX_s ... .endm");
> > void f()
> > {
> >       asm volatile("mXX_s a, b");
> > }
> >
> > With GCC, it gives macro redefinition error because sysreg.h is included
> > in multiple source files, and assembler code for all of them is later
> > combined for LTO (I've seen an intermediate file with hundreds of
> > identical definitions).
> >
> > With clang, it gives macro undefined error because clang doesn't allow
> > sharing macros between inline asm statements.
> >
> > I also seem to remember catching another sort of undefined error with
> > GCC due to reordering of macro definition asm statement and generated
> > asm code for function that uses the macro.
> >
> > The solution with defining and undefining for each use, while certainly
> > not elegant, satisfies both GCC and clang, LTO and non-LTO.
> >
> > Signed-off-by: Alex Matveev <alxmtvv@gmail.com>
> > Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > Signed-off-by: Kees Cook <keescook@chromium.org>
> > ---
> > v3: split out patch as stand-alone, added more uses in irqflags,
> >     updated commit log, based on discussion in
> >     https://lore.kernel.org/patchwork/patch/851580/
> > ---
> >  arch/arm64/include/asm/irqflags.h | 12 +++++--
> >  arch/arm64/include/asm/kvm_hyp.h  |  8 +++--
> >  arch/arm64/include/asm/sysreg.h   | 55 +++++++++++++++++++++----------
> >  3 files changed, 53 insertions(+), 22 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
> > index 43d8366c1e87..06d3987d1546 100644
> > --- a/arch/arm64/include/asm/irqflags.h
> > +++ b/arch/arm64/include/asm/irqflags.h
> > @@ -43,7 +43,9 @@ static inline void arch_local_irq_enable(void)
> >       asm volatile(ALTERNATIVE(
> >               "msr    daifclr, #2             // arch_local_irq_enable\n"
> >               "nop",
> > +             DEFINE_MSR_S
> >               "msr_s  " __stringify(SYS_ICC_PMR_EL1) ",%0\n"
> > +             UNDEFINE_MSR_S
>
> If we do need this, can we wrap this in a larger CPP macro that does the
> whole sequence of defining, using, and undefining the asm macros?

I agree that would be cleaner.  For example, __mrs_s/__msr_s can
ALMOST do this (be used in arch/arm64/include/asm/irqflags.h) except
for the input/output constraints.  Maybe the constraints can be
removed from the cpp macro definition, then the constraints be left to
the macro expansion sites?  That way the DEFINE_MXX_S/UNDEFINE_MXX_S
can be hidden in the cpp macro itself.

>
> It would be nice if we could simply rely on a more recent binutils these
> days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
> definition. That would mean we could get rid of the whole msr_s/mrs_s
> hack by turning that into a CPP macro which built that name.
>
> It looks like binutils has been able to do that since September 2014...
>
> Are folk using toolchains older than that to compile kernels?

Do you have a link to a commit?  If we can pinpoint the binutils
version, that might help.

Also, I look forward to this patch for use of Clang's integrated
assembler (regardless of LTO).  I remember getting frustrated trying
to figure out how to resolve this for both assemblers, and I had
forgotten this solution existed.
Mark Rutland April 16, 2019, 5:08 p.m. UTC | #3
On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > It would be nice if we could simply rely on a more recent binutils these
> > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
> > definition. That would mean we could get rid of the whole msr_s/mrs_s
> > hack by turning that into a CPP macro which built that name.
> >
> > It looks like binutils has been able to do that since September 2014...
> >
> > Are folk using toolchains older than that to compile kernels?
> 
> Do you have a link to a commit?  If we can pinpoint the binutils
> version, that might help.

IIUC any version of binutils with commit:

  df7b4545b2b49572 ("[PATCH/AArch64] Generic support for all system registers using mrs and msr")

... should be sufficent.

That's on gitweb at:

  https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=df7b4545b2b49572ab63690c130df973af109615

> Also, I look forward to this patch for use of Clang's integrated
> assembler (regardless of LTO).  I remember getting frustrated trying
> to figure out how to resolve this for both assemblers, and I had
> forgotten this solution existed.

Is this the only blocker for the integrated assembler?

Thanks,
Mark.
Mark Rutland April 16, 2019, 5:18 p.m. UTC | #4
On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > On Mon, Apr 15, 2019 at 07:38:21AM -0700, Kees Cook wrote:
> > > diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
> > > index 43d8366c1e87..06d3987d1546 100644
> > > --- a/arch/arm64/include/asm/irqflags.h
> > > +++ b/arch/arm64/include/asm/irqflags.h
> > > @@ -43,7 +43,9 @@ static inline void arch_local_irq_enable(void)
> > >       asm volatile(ALTERNATIVE(
> > >               "msr    daifclr, #2             // arch_local_irq_enable\n"
> > >               "nop",
> > > +             DEFINE_MSR_S
> > >               "msr_s  " __stringify(SYS_ICC_PMR_EL1) ",%0\n"
> > > +             UNDEFINE_MSR_S
> >
> > If we do need this, can we wrap this in a larger CPP macro that does the
> > whole sequence of defining, using, and undefining the asm macros?
> 
> I agree that would be cleaner.  For example, __mrs_s/__msr_s can
> ALMOST do this (be used in arch/arm64/include/asm/irqflags.h) except
> for the input/output constraints.  Maybe the constraints can be
> removed from the cpp macro definition, then the constraints be left to
> the macro expansion sites?  That way the DEFINE_MXX_S/UNDEFINE_MXX_S
> can be hidden in the cpp macro itself.

If those took the register template explicitly, i.e. something like:

	__mrs_s(SYS_ICC_PMR_EL1, "%x0") 

	__msr_s("%[whatever]", SYS_FOO)

... that would be more generally useful and clear, as they could be used
in larger asm sequences.

Thanks,
Mark.
Nick Desaulniers April 16, 2019, 5:34 p.m. UTC | #5
On Tue, Apr 16, 2019 at 10:08 AM Mark Rutland <mark.rutland@arm.com> wrote:
> On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> > Also, I look forward to this patch for use of Clang's integrated
> > assembler (regardless of LTO).  I remember getting frustrated trying
> > to figure out how to resolve this for both assemblers, and I had
> > forgotten this solution existed.
>
> Is this the only blocker for the integrated assembler?

I don't have a good handle on the number of issues that clang's
integrated assembler has for arm64 kernels, but IIRC last time I
checked I think there were under 10.  It's my goal over the next
quarter to investigate and get bugs filed so we know the order of
magnitude of issues.  This particular issue was the trickier/trickiest
one.  Linaro also recently has added staffing to beef up Clang's
integrated assembler for arm32/arm64.
Kees Cook April 17, 2019, 4:02 a.m. UTC | #6
On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > It would be nice if we could simply rely on a more recent binutils these
> > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
> > > definition. That would mean we could get rid of the whole msr_s/mrs_s
> > > hack by turning that into a CPP macro which built that name.
> > >
> > > It looks like binutils has been able to do that since September 2014...
> > >
> > > Are folk using toolchains older than that to compile kernels?
> >
> > Do you have a link to a commit?  If we can pinpoint the binutils
> > version, that might help.
>
> IIUC any version of binutils with commit:
>
>   df7b4545b2b49572 ("[PATCH/AArch64] Generic support for all system registers using mrs and msr")
>
> ... should be sufficent.

This appears to be binutils 2.25:

$ git describe --match 'binutils-2_*' --contains df7b4545b2b49572
binutils-2_25~418

Documentation/process/changes.rst lists 2.20 as current minimum for binutils.

Ubuntu's old LTS (Trusty, Apr 2014) has 2.24, and that release is
about to exit support this month.

Debian's old-stable (Apr 2015) has 2.25. It's about to exit support
when Debian Buster releases.

RHEL6 (2010) has 2.20. RHEL7 (2014) has 2.25.

It seems not unreasonable to require 2.25 at least for arm64 builds?
Nick Desaulniers April 22, 2019, 5:44 p.m. UTC | #7
On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > It would be nice if we could simply rely on a more recent binutils these
> > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
> > > > definition. That would mean we could get rid of the whole msr_s/mrs_s
> > > > hack by turning that into a CPP macro which built that name.

Mark, can you give me a test case for this? I'd like to check if
clang's integrated assembler supports this or not, so we can file an
issue and fix it if not.

> > > >
> > > > It looks like binutils has been able to do that since September 2014...
> > > >
> > > > Are folk using toolchains older than that to compile kernels?
> > >
> > > Do you have a link to a commit?  If we can pinpoint the binutils
> > > version, that might help.
> >
> > IIUC any version of binutils with commit:
> >
> >   df7b4545b2b49572 ("[PATCH/AArch64] Generic support for all system registers using mrs and msr")
> >
> > ... should be sufficent.
>
> This appears to be binutils 2.25:
>
> $ git describe --match 'binutils-2_*' --contains df7b4545b2b49572
> binutils-2_25~418
>
> Documentation/process/changes.rst lists 2.20 as current minimum for binutils.
>
> Ubuntu's old LTS (Trusty, Apr 2014) has 2.24, and that release is
> about to exit support this month.
>
> Debian's old-stable (Apr 2015) has 2.25. It's about to exit support
> when Debian Buster releases.
>
> RHEL6 (2010) has 2.20. RHEL7 (2014) has 2.25.
>
> It seems not unreasonable to require 2.25 at least for arm64 builds?

+ Arnd who has a really good feel for which distros ship what.
Mark Rutland April 23, 2019, 2:12 p.m. UTC | #8
On Mon, Apr 22, 2019 at 10:44:10AM -0700, Nick Desaulniers wrote:
> On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> > > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > It would be nice if we could simply rely on a more recent binutils these
> > > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
> > > > > definition. That would mean we could get rid of the whole msr_s/mrs_s
> > > > > hack by turning that into a CPP macro which built that name.
> 
> Mark, can you give me a test case for this? I'd like to check if
> clang's integrated assembler supports this or not, so we can file an
> issue and fix it if not.

The below is a smoke test. The entire op0:op1:cn:cm:op2 space is 14
bits, so it should be feasible to generate and run an exhautive test for
all encodings.

I've included SYS and SYSL here too, since I strongly suspect we'll need
those going forward.

Thanks
Mark.

---->8----
/*
 * ${CC} -c test.c
 *
 * OLD_* encodings already exist, NEW_* encodings haven't yet be
 * allocated (per ARM DDI 0487D.a), but should assemble. All encodings
 * chosen arbitrarily.
 */

// OSDTRRX_EL1
#define OLD_DBG_REG	"s2_0_c0_c0_2"
#define NEW_DBG_REG	"s2_7_c0_c15_0"

// MIDR_EL1
#define OLD_SYS_REG	"s3_0_c0_c0_0"
#define NEW_SYS_REG	"s3_6_c1_c0_7"

#define REG(r)					\
	asm volatile("msr " r ", xzr\n");	\
	asm volatile("mrs xzr, " r "\n")

// DC IVAC, XZR
#define OLD_SYS	"#0, c7, c6, #1"
#define NEW_SYS "#7, c15, c15, #7"

#define NEW_SYSL_MIN	"#0, c0, c0, #0"
#define NEW_SYSL_MAX	"#7, c15, c15, #7"

#define SYS(s)	asm volatile("sys " s ", xzr\n")
#define SYSL(s)	asm volatile("sysl xzr, " s "\n")

void test(void)
{
	REG(OLD_DBG_REG);
	REG(NEW_DBG_REG);

	REG(OLD_SYS_REG);
	REG(NEW_SYS_REG);

	SYS(OLD_SYS);
	SYS(NEW_SYS);

	SYSL(NEW_SYSL_MIN);
	SYSL(NEW_SYSL_MAX);
}
Nick Desaulniers April 23, 2019, 6:15 p.m. UTC | #9
On Tue, Apr 23, 2019 at 7:12 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Mon, Apr 22, 2019 at 10:44:10AM -0700, Nick Desaulniers wrote:
> > On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote:
> > >
> > > On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > >
> > > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> > > > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > > It would be nice if we could simply rely on a more recent binutils these
> > > > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
> > > > > > definition. That would mean we could get rid of the whole msr_s/mrs_s
> > > > > > hack by turning that into a CPP macro which built that name.
> >
> > Mark, can you give me a test case for this? I'd like to check if
> > clang's integrated assembler supports this or not, so we can file an
> > issue and fix it if not.
>
> The below is a smoke test. The entire op0:op1:cn:cm:op2 space is 14
> bits, so it should be feasible to generate and run an exhautive test for
> all encodings.
>
> I've included SYS and SYSL here too, since I strongly suspect we'll need
> those going forward.
>
> Thanks
> Mark.
>
> ---->8----
> /*
>  * ${CC} -c test.c
>  *
>  * OLD_* encodings already exist, NEW_* encodings haven't yet be
>  * allocated (per ARM DDI 0487D.a), but should assemble. All encodings
>  * chosen arbitrarily.
>  */
>
> // OSDTRRX_EL1
> #define OLD_DBG_REG     "s2_0_c0_c0_2"
> #define NEW_DBG_REG     "s2_7_c0_c15_0"
>
> // MIDR_EL1
> #define OLD_SYS_REG     "s3_0_c0_c0_0"
> #define NEW_SYS_REG     "s3_6_c1_c0_7"
>
> #define REG(r)                                  \
>         asm volatile("msr " r ", xzr\n");       \
>         asm volatile("mrs xzr, " r "\n")
>
> // DC IVAC, XZR
> #define OLD_SYS "#0, c7, c6, #1"
> #define NEW_SYS "#7, c15, c15, #7"
>
> #define NEW_SYSL_MIN    "#0, c0, c0, #0"
> #define NEW_SYSL_MAX    "#7, c15, c15, #7"
>
> #define SYS(s)  asm volatile("sys " s ", xzr\n")
> #define SYSL(s) asm volatile("sysl xzr, " s "\n")
>
> void test(void)
> {
>         REG(OLD_DBG_REG);
>         REG(NEW_DBG_REG);
>
>         REG(OLD_SYS_REG);
>         REG(NEW_SYS_REG);
>
>         SYS(OLD_SYS);
>         SYS(NEW_SYS);
>
>         SYSL(NEW_SYSL_MIN);
>         SYSL(NEW_SYSL_MAX);
> }

Hmmm...yes looks like Clang's integrated assembler is missing support.
https://godbolt.org/z/wDBpz4
https://bugs.llvm.org/show_bug.cgi?id=41573
https://github.com/ClangBuiltLinux/linux/issues/453
We'll follow up on getting that fixed.

Kees, are you able to respin the patch with the above requested
changes from Mark to the macros?  Let me know what I can do to help.
Kees Cook April 23, 2019, 6:24 p.m. UTC | #10
On Tue, Apr 23, 2019 at 11:15 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Tue, Apr 23, 2019 at 7:12 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Mon, Apr 22, 2019 at 10:44:10AM -0700, Nick Desaulniers wrote:
> > > On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote:
> > > >
> > > > On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > >
> > > > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> > > > > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > > > It would be nice if we could simply rely on a more recent binutils these
> > > > > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
> > > > > > > definition. That would mean we could get rid of the whole msr_s/mrs_s
> > > > > > > hack by turning that into a CPP macro which built that name.
> > >
> > > Mark, can you give me a test case for this? I'd like to check if
> > > clang's integrated assembler supports this or not, so we can file an
> > > issue and fix it if not.
> >
> > The below is a smoke test. The entire op0:op1:cn:cm:op2 space is 14
> > bits, so it should be feasible to generate and run an exhautive test for
> > all encodings.
> >
> > I've included SYS and SYSL here too, since I strongly suspect we'll need
> > those going forward.
> >
> > Thanks
> > Mark.
> >
> > ---->8----
> > /*
> >  * ${CC} -c test.c
> >  *
> >  * OLD_* encodings already exist, NEW_* encodings haven't yet be
> >  * allocated (per ARM DDI 0487D.a), but should assemble. All encodings
> >  * chosen arbitrarily.
> >  */
> >
> > // OSDTRRX_EL1
> > #define OLD_DBG_REG     "s2_0_c0_c0_2"
> > #define NEW_DBG_REG     "s2_7_c0_c15_0"
> >
> > // MIDR_EL1
> > #define OLD_SYS_REG     "s3_0_c0_c0_0"
> > #define NEW_SYS_REG     "s3_6_c1_c0_7"
> >
> > #define REG(r)                                  \
> >         asm volatile("msr " r ", xzr\n");       \
> >         asm volatile("mrs xzr, " r "\n")
> >
> > // DC IVAC, XZR
> > #define OLD_SYS "#0, c7, c6, #1"
> > #define NEW_SYS "#7, c15, c15, #7"
> >
> > #define NEW_SYSL_MIN    "#0, c0, c0, #0"
> > #define NEW_SYSL_MAX    "#7, c15, c15, #7"
> >
> > #define SYS(s)  asm volatile("sys " s ", xzr\n")
> > #define SYSL(s) asm volatile("sysl xzr, " s "\n")
> >
> > void test(void)
> > {
> >         REG(OLD_DBG_REG);
> >         REG(NEW_DBG_REG);
> >
> >         REG(OLD_SYS_REG);
> >         REG(NEW_SYS_REG);
> >
> >         SYS(OLD_SYS);
> >         SYS(NEW_SYS);
> >
> >         SYSL(NEW_SYSL_MIN);
> >         SYSL(NEW_SYSL_MAX);
> > }
>
> Hmmm...yes looks like Clang's integrated assembler is missing support.
> https://godbolt.org/z/wDBpz4
> https://bugs.llvm.org/show_bug.cgi?id=41573
> https://github.com/ClangBuiltLinux/linux/issues/453
> We'll follow up on getting that fixed.
>
> Kees, are you able to respin the patch with the above requested
> changes from Mark to the macros?  Let me know what I can do to help.

Doesn't this mean arm64 must raise their minimum binutils to 2.25? Is
that acceptable? I hadn't seen that question answered yet. (Maybe I
missed it?)
Arnd Bergmann April 23, 2019, 6:39 p.m. UTC | #11
On Tue, Apr 23, 2019 at 8:25 PM Kees Cook <keescook@chromium.org> wrote:
> On Tue, Apr 23, 2019 at 11:15 AM Nick Desaulniers <ndesaulniers@google.com> wrote:
> > On Tue, Apr 23, 2019 at 7:12 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > On Mon, Apr 22, 2019 at 10:44:10AM -0700, Nick Desaulniers wrote:
> > > > On Tue, Apr 16, 2019 at 9:03 PM Kees Cook <keescook@chromium.org> wrote:
> > > > >
> > > > > On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> > > > > > > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > > > > It would be nice if we could simply rely on a more recent binutils these
> > > > > > > > days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
> > > > > > > > definition. That would mean we could get rid of the whole msr_s/mrs_s
> > > > > > > > hack by turning that into a CPP macro which built that name.
> > > >
> > > > Mark, can you give me a test case for this? I'd like to check if
> > > > clang's integrated assembler supports this or not, so we can file an
> > > > issue and fix it if not.
> > >
> > > The below is a smoke test. The entire op0:op1:cn:cm:op2 space is 14
> > > bits, so it should be feasible to generate and run an exhautive test for
> > > all encodings.
> > >
> > > I've included SYS and SYSL here too, since I strongly suspect we'll need
> > > those going forward.
> > >
> > > Thanks
> > > Mark.
> > >
> > > ---->8----
> > > /*
> > >  * ${CC} -c test.c
> > >  *
> > >  * OLD_* encodings already exist, NEW_* encodings haven't yet be
> > >  * allocated (per ARM DDI 0487D.a), but should assemble. All encodings
> > >  * chosen arbitrarily.
> > >  */
> > >
> > > // OSDTRRX_EL1
> > > #define OLD_DBG_REG     "s2_0_c0_c0_2"
> > > #define NEW_DBG_REG     "s2_7_c0_c15_0"
> > >
> > > // MIDR_EL1
> > > #define OLD_SYS_REG     "s3_0_c0_c0_0"
> > > #define NEW_SYS_REG     "s3_6_c1_c0_7"
> > >
> > > #define REG(r)                                  \
> > >         asm volatile("msr " r ", xzr\n");       \
> > >         asm volatile("mrs xzr, " r "\n")
> > >
> > > // DC IVAC, XZR
> > > #define OLD_SYS "#0, c7, c6, #1"
> > > #define NEW_SYS "#7, c15, c15, #7"
> > >
> > > #define NEW_SYSL_MIN    "#0, c0, c0, #0"
> > > #define NEW_SYSL_MAX    "#7, c15, c15, #7"
> > >
> > > #define SYS(s)  asm volatile("sys " s ", xzr\n")
> > > #define SYSL(s) asm volatile("sysl xzr, " s "\n")
> > >
> > > void test(void)
> > > {
> > >         REG(OLD_DBG_REG);
> > >         REG(NEW_DBG_REG);
> > >
> > >         REG(OLD_SYS_REG);
> > >         REG(NEW_SYS_REG);
> > >
> > >         SYS(OLD_SYS);
> > >         SYS(NEW_SYS);
> > >
> > >         SYSL(NEW_SYSL_MIN);
> > >         SYSL(NEW_SYSL_MAX);
> > > }
> >
> > Hmmm...yes looks like Clang's integrated assembler is missing support.
> > https://godbolt.org/z/wDBpz4

It looks like you forgot to pass --target=aarch64-linux-gnu to clang here,
which produces a build failure. This is the output with the added flag:

https://godbolt.org/z/HYoEH1

> > https://bugs.llvm.org/show_bug.cgi?id=41573
> > https://github.com/ClangBuiltLinux/linux/issues/453
> > We'll follow up on getting that fixed.
> >
> > Kees, are you able to respin the patch with the above requested
> > changes from Mark to the macros?  Let me know what I can do to help.
>
> Doesn't this mean arm64 must raise their minimum binutils to 2.25? Is
> that acceptable? I hadn't seen that question answered yet. (Maybe I
> missed it?)

Generally I'd say that's a bit too new for a minimum version. On the other
hand, most distros I could find for arm64 have at least 2.25 or later,
the oldest
one I found being Debian Jessie. Ubuntu Trusty (14.04) was using 2.24 but
has its end of support date this month, after five years.

However, there are probably still cross-compile toolchains around that use
older aarch64-binutils, so it probably hurts someone.

       Arnd
Kees Cook April 23, 2019, 10:55 p.m. UTC | #12
On Mon, Apr 15, 2019 at 7:38 AM Kees Cook <keescook@chromium.org> wrote:
> +#define __msr_s(r, v)                                          \
> +       DEFINE_MSR_S                                            \
> +"      msr_s " __stringify(r) ", %x0\n"                        \
> +       UNDEFINE_MSR_S : : "rZ" (v)

BTW ... is "%x0" a typo here? Shouldn't this just be "%0"?
Ard Biesheuvel April 23, 2019, 10:57 p.m. UTC | #13
On Wed, 24 Apr 2019 at 00:55, Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, Apr 15, 2019 at 7:38 AM Kees Cook <keescook@chromium.org> wrote:
> > +#define __msr_s(r, v)                                          \
> > +       DEFINE_MSR_S                                            \
> > +"      msr_s " __stringify(r) ", %x0\n"                        \
> > +       UNDEFINE_MSR_S : : "rZ" (v)
>
> BTW ... is "%x0" a typo here? Shouldn't this just be "%0"?
>

No, iirc, that is required to force clang to choose a x# encoding
(which is the only one supported by msr/mrs instructions). Otherwise,
it will infer from the operand size which encoding to use, and attempt
to use w# for 32-bit system registers.
Nick Desaulniers April 23, 2019, 10:59 p.m. UTC | #14
On Tue, Apr 23, 2019 at 3:57 PM Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
>
> On Wed, 24 Apr 2019 at 00:55, Kees Cook <keescook@chromium.org> wrote:
> >
> > On Mon, Apr 15, 2019 at 7:38 AM Kees Cook <keescook@chromium.org> wrote:
> > > +#define __msr_s(r, v)                                          \
> > > +       DEFINE_MSR_S                                            \
> > > +"      msr_s " __stringify(r) ", %x0\n"                        \
> > > +       UNDEFINE_MSR_S : : "rZ" (v)
> >
> > BTW ... is "%x0" a typo here? Shouldn't this just be "%0"?
> >
>
> No, iirc, that is required to force clang to choose a x# encoding
> (which is the only one supported by msr/mrs instructions). Otherwise,
> it will infer from the operand size which encoding to use, and attempt
> to use w# for 32-bit system registers.

Yes: https://github.com/llvm/llvm-project/blob/80b578c7325352187d0aff7bd2bdb5a76b51cf37/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp#L466
Kees Cook April 23, 2019, 11:09 p.m. UTC | #15
On Tue, Apr 23, 2019 at 4:00 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Tue, Apr 23, 2019 at 3:57 PM Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
> >
> > On Wed, 24 Apr 2019 at 00:55, Kees Cook <keescook@chromium.org> wrote:
> > >
> > > On Mon, Apr 15, 2019 at 7:38 AM Kees Cook <keescook@chromium.org> wrote:
> > > > +#define __msr_s(r, v)                                          \
> > > > +       DEFINE_MSR_S                                            \
> > > > +"      msr_s " __stringify(r) ", %x0\n"                        \
> > > > +       UNDEFINE_MSR_S : : "rZ" (v)
> > >
> > > BTW ... is "%x0" a typo here? Shouldn't this just be "%0"?
> > >
> >
> > No, iirc, that is required to force clang to choose a x# encoding
> > (which is the only one supported by msr/mrs instructions). Otherwise,
> > it will infer from the operand size which encoding to use, and attempt
> > to use w# for 32-bit system registers.
>
> Yes: https://github.com/llvm/llvm-project/blob/80b578c7325352187d0aff7bd2bdb5a76b51cf37/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp#L466

Cool, yeah, I see now. Thanks!
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 43d8366c1e87..06d3987d1546 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -43,7 +43,9 @@  static inline void arch_local_irq_enable(void)
 	asm volatile(ALTERNATIVE(
 		"msr	daifclr, #2		// arch_local_irq_enable\n"
 		"nop",
+		DEFINE_MSR_S
 		"msr_s  " __stringify(SYS_ICC_PMR_EL1) ",%0\n"
+		UNDEFINE_MSR_S
 		"dsb	sy",
 		ARM64_HAS_IRQ_PRIO_MASKING)
 		:
@@ -55,7 +57,9 @@  static inline void arch_local_irq_disable(void)
 {
 	asm volatile(ALTERNATIVE(
 		"msr	daifset, #2		// arch_local_irq_disable",
-		"msr_s  " __stringify(SYS_ICC_PMR_EL1) ", %0",
+		DEFINE_MSR_S
+		"msr_s  " __stringify(SYS_ICC_PMR_EL1) ", %0\n"
+		UNDEFINE_MSR_S,
 		ARM64_HAS_IRQ_PRIO_MASKING)
 		:
 		: "r" ((unsigned long) GIC_PRIO_IRQOFF)
@@ -85,8 +89,10 @@  static inline unsigned long arch_local_save_flags(void)
 	asm volatile(ALTERNATIVE(
 			"mov	%0, %1\n"
 			"nop\n"
-			"nop",
+			"nop\n",
+			DEFINE_MRS_S
 			"mrs_s	%0, " __stringify(SYS_ICC_PMR_EL1) "\n"
+			UNDEFINE_MRS_S
 			"ands	%1, %1, " __stringify(PSR_I_BIT) "\n"
 			"csel	%0, %0, %2, eq",
 			ARM64_HAS_IRQ_PRIO_MASKING)
@@ -116,7 +122,9 @@  static inline void arch_local_irq_restore(unsigned long flags)
 	asm volatile(ALTERNATIVE(
 			"msr	daif, %0\n"
 			"nop",
+			DEFINE_MSR_S
 			"msr_s	" __stringify(SYS_ICC_PMR_EL1) ", %0\n"
+			UNDEFINE_MSR_S
 			"dsb	sy",
 			ARM64_HAS_IRQ_PRIO_MASKING)
 		: "+r" (flags)
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 4da765f2cca5..9d0e7f546240 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -30,7 +30,9 @@ 
 	({								\
 		u64 reg;						\
 		asm volatile(ALTERNATIVE("mrs %0, " __stringify(r##nvh),\
-					 "mrs_s %0, " __stringify(r##vh),\
+					 DEFINE_MRS_S			\
+					 "mrs_s %0, " __stringify(r##vh) "\n"\
+					 UNDEFINE_MRS_S,		\
 					 ARM64_HAS_VIRT_HOST_EXTN)	\
 			     : "=r" (reg));				\
 		reg;							\
@@ -40,7 +42,9 @@ 
 	do {								\
 		u64 __val = (u64)(v);					\
 		asm volatile(ALTERNATIVE("msr " __stringify(r##nvh) ", %x0",\
-					 "msr_s " __stringify(r##vh) ", %x0",\
+					 DEFINE_MSR_S			\
+					 "msr_s " __stringify(r##vh) ", %x0\n"\
+					 UNDEFINE_MSR_S,		\
 					 ARM64_HAS_VIRT_HOST_EXTN)	\
 					 : : "rZ" (__val));		\
 	} while (0)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 5b267dec6194..e9f64c237a7c 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -746,20 +746,39 @@ 
 #include <linux/build_bug.h>
 #include <linux/types.h>
 
-asm(
-"	.irp	num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n"
-"	.equ	.L__reg_num_x\\num, \\num\n"
-"	.endr\n"
+#define __DEFINE_MRS_MSR_S_REGNUM				\
+"	.irp	num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n" \
+"	.equ	.L__reg_num_x\\num, \\num\n"			\
+"	.endr\n"						\
 "	.equ	.L__reg_num_xzr, 31\n"
-"\n"
-"	.macro	mrs_s, rt, sreg\n"
-	__emit_inst(0xd5200000|(\\sreg)|(.L__reg_num_\\rt))
+
+#define DEFINE_MRS_S						\
+	__DEFINE_MRS_MSR_S_REGNUM				\
+"	.macro	mrs_s, rt, sreg\n"				\
+	__emit_inst(0xd5200000|(\\sreg)|(.L__reg_num_\\rt))	\
 "	.endm\n"
-"\n"
-"	.macro	msr_s, sreg, rt\n"
-	__emit_inst(0xd5000000|(\\sreg)|(.L__reg_num_\\rt))
+
+#define DEFINE_MSR_S						\
+	__DEFINE_MRS_MSR_S_REGNUM				\
+"	.macro	msr_s, sreg, rt\n"				\
+	__emit_inst(0xd5000000|(\\sreg)|(.L__reg_num_\\rt))	\
 "	.endm\n"
-);
+
+#define UNDEFINE_MRS_S						\
+"	.purgem	mrs_s\n"
+
+#define UNDEFINE_MSR_S						\
+"	.purgem	msr_s\n"
+
+#define __mrs_s(r, v)						\
+	DEFINE_MRS_S						\
+"	mrs_s %0, " __stringify(r) "\n"				\
+	UNDEFINE_MRS_S : "=r" (v)
+
+#define __msr_s(r, v)						\
+	DEFINE_MSR_S						\
+"	msr_s " __stringify(r) ", %x0\n"			\
+	UNDEFINE_MSR_S : : "rZ" (v)
 
 /*
  * Unlike read_cpuid, calls to read_sysreg are never expected to be
@@ -785,15 +804,15 @@  asm(
  * For registers without architectural names, or simply unsupported by
  * GAS.
  */
-#define read_sysreg_s(r) ({						\
-	u64 __val;							\
-	asm volatile("mrs_s %0, " __stringify(r) : "=r" (__val));	\
-	__val;								\
+#define read_sysreg_s(r) ({					\
+	u64 __val;						\
+	asm volatile(__mrs_s(r, __val));			\
+	__val;							\
 })
 
-#define write_sysreg_s(v, r) do {					\
-	u64 __val = (u64)(v);						\
-	asm volatile("msr_s " __stringify(r) ", %x0" : : "rZ" (__val));	\
+#define write_sysreg_s(v, r) do {				\
+	u64 __val = (u64)(v);					\
+	asm volatile(__msr_s(r, __val));			\
 } while (0)
 
 /*