[01/26] mm: asi: Make some utility functions noinstr compatible

Message ID: 20240712-asi-rfc-24-v1-1-144b319a40d8@google.com
State: New
Series: Address Space Isolation (ASI) 2024

Commit Message

Brendan Jackman July 12, 2024, 5 p.m. UTC
From: Junaid Shahid <junaids@google.com>

Some existing utility functions would need to be called from a noinstr
context in the later patches. So mark these as either noinstr or
__always_inline.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 arch/x86/include/asm/processor.h     | 2 +-
 arch/x86/include/asm/special_insns.h | 8 ++++----
 arch/x86/mm/tlb.c                    | 8 ++++----
 include/linux/compiler_types.h       | 8 ++++++++
 4 files changed, 17 insertions(+), 9 deletions(-)

Comments

Borislav Petkov Oct. 25, 2024, 11:41 a.m. UTC | #1
On Fri, Jul 12, 2024 at 05:00:19PM +0000, Brendan Jackman wrote:
> +/*
> + * Can be used for functions which themselves are not strictly noinstr, but
> + * may be called from noinstr code.
> + */
> +#define inline_or_noinstr						\

Hmm, this is confusing. So is it noinstr or is it getting inlined?

I'd expect you either always inline the small functions - as you do for some
already - or mark the others noinstr. But not something in between.

Why this?
Brendan Jackman Oct. 25, 2024, 1:21 p.m. UTC | #2
Hey Boris,

On Fri, 25 Oct 2024 at 13:41, Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Jul 12, 2024 at 05:00:19PM +0000, Brendan Jackman wrote:
> > +/*
> > + * Can be used for functions which themselves are not strictly noinstr, but
> > + * may be called from noinstr code.
> > + */
> > +#define inline_or_noinstr                                            \
>
> Hmm, this is confusing. So is it noinstr or is it getting inlined?

We don't care if it's getting inlined, which is kinda the point. This
annotation means "you may call this function from noinstr code". My
current understanding is that the normal noinstr annotation means
"this function fundamentally mustn't be instrumented".

So with inline_or_noinstr you get:

1. "Documentation" that the function itself doesn't have any problem
with getting traced etc.
2. Freedom for the compiler to inline or not.

> I'd expect you either always inline the small functions - as you do for some
> already - or mark the others noinstr. But not something in between.
>
> Why this?

Overall it's pretty likely I'm wrong about the subtlety of noinstr's
meaning. And the benefits I listed above are pretty minor. I should
have looked into this as it would have been an opportunity to reduce
the patch count of this RFC!

Maybe I'm also forgetting something more important, perhaps Junaid
will weigh in...
Junaid Shahid Oct. 29, 2024, 5:38 p.m. UTC | #3
On 10/25/24 6:21 AM, Brendan Jackman wrote:
> Hey Boris,
> 
> On Fri, 25 Oct 2024 at 13:41, Borislav Petkov <bp@alien8.de> wrote:
>>
>> On Fri, Jul 12, 2024 at 05:00:19PM +0000, Brendan Jackman wrote:
>>> +/*
>>> + * Can be used for functions which themselves are not strictly noinstr, but
>>> + * may be called from noinstr code.
>>> + */
>>> +#define inline_or_noinstr                                            \
>>
>> Hmm, this is confusing. So is it noinstr or is it getting inlined?
> 
> We don't care if it's getting inlined, which is kinda the point. This
> annotation means "you may call this function from noinstr code". My
> current understanding is that the normal noinstr annotation means
> "this function fundamentally mustn't be instrumented".
> 
> So with inline_or_noinstr you get:
> 
> 1. "Documentation" that the function itself doesn't have any problem
> with getting traced etc.
> 2. Freedom for the compiler to inline or not.
> 
>> I'd expect you either always inline the small functions - as you do for some
>> already - or mark the others noinstr. But not something in between.
>>
>> Why this?
> 
> Overall it's pretty likely I'm wrong about the subtlety of noinstr's
> meaning. And the benefits I listed above are pretty minor. I should
> have looked into this as it would have been an opportunity to reduce
> the patch count of this RFC!
> 
> Maybe I'm also forgetting something more important, perhaps Junaid
> will weigh in...

Yes, IIRC the idea was that there is no need to prohibit inlining for this class 
of functions.
Thomas Gleixner Oct. 29, 2024, 7:12 p.m. UTC | #4
On Tue, Oct 29 2024 at 10:38, Junaid Shahid wrote:
> On 10/25/24 6:21 AM, Brendan Jackman wrote:
>>> I'd expect you either always inline the small functions - as you do for some
>>> already - or mark the others noinstr. But not something in between.
>>>
>>> Why this?
>> 
>> Overall it's pretty likely I'm wrong about the subtlety of noinstr's
>> meaning. And the benefits I listed above are pretty minor. I should
>> have looked into this as it would have been an opportunity to reduce
>> the patch count of this RFC!
>> 
>> Maybe I'm also forgetting something more important, perhaps Junaid
>> will weigh in...
>
> Yes, IIRC the idea was that there is no need to prohibit inlining for this class 
> of functions.

I doubt that it works as you want it to work.

+	inline notrace __attribute((__section__(".noinstr.text")))	\

So this explicitly puts the inline into the .noinstr.text section,
which means when it is used in .text the compiler will generate an
out-of-line function in the .noinstr.text section and insert a call into
the usage site. That's independent of the size of the inline.

Thanks,

        tglx
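
One way to check the behaviour described above outside the kernel is a small user-space sketch. Everything named demo_* here is hypothetical and not part of the patch. The macro only approximates the proposed inline_or_noinstr (the section attribute plus no_instrument_function standing in for notrace and the sanitizer opt-outs), and an ELF target with GCC or clang is assumed.

```c
/*
 * User-space approximation of the proposed inline_or_noinstr macro.
 * Assumption: ELF target, GCC or clang. All demo_* names are made up.
 */
#define demo_inline_or_noinstr						\
	inline __attribute__((__section__(".noinstr.text"),		\
			      __no_instrument_function__))

/* Small helper explicitly placed in .noinstr.text. */
static demo_inline_or_noinstr unsigned int demo_kern_pcid(unsigned int asid)
{
	return asid + 1;
}

/* Ordinary caller; it has no section attribute, so it lands in .text. */
unsigned int demo_caller(unsigned int asid)
{
	return demo_kern_pcid(asid);
}
```

Compiling this with optimization enabled and disassembling demo_caller() (for example with objdump -d) shows whether the call site contains a call into .noinstr.text or the inlined body, for the compiler version at hand.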
Junaid Shahid Nov. 1, 2024, 1:44 a.m. UTC | #5
On 10/29/24 12:12 PM, Thomas Gleixner wrote:
> 
> I doubt that it works as you want it to work.
> 
> +	inline notrace __attribute((__section__(".noinstr.text")))	\
> 
> So this explicitly puts the inline into the .noinstr.text section,
> which means when it is used in .text the compiler will generate an
> out-of-line function in the .noinstr.text section and insert a call into
> the usage site. That's independent of the size of the inline.
> 

Oh, that's interesting. IIRC I had seen regular (.text) inline functions get 
inlined into .noinstr.text callers. I assume the difference is that here the 
section is marked explicitly rather than being implicit?

In any case, I guess we could just mark these functions as plain noinstr. 
(Unless there happens to be some other way to indicate to the compiler to place 
any non-inlined copy of the function in .noinstr.text but still allow inlining 
into .text if it makes sense optimization-wise.)

Thanks,
Junaid
Brendan Jackman Nov. 1, 2024, 10:06 a.m. UTC | #6
On Fri, 1 Nov 2024 at 02:44, Junaid Shahid <junaids@google.com> wrote:
> In any case, I guess we could just mark these functions as plain noinstr.

I wonder if it also would be worth having something like

/*
 * Inline this function so it can be called from noinstr,
 * but it wouldn't actually care itself about being instrumented.
 */
#define inline_for_noinstr __always_inline

Maybe there are already __always_inline functions this would apply to.
Then again, if you care about inlining them so much that you can't
just write "noinstr", then it's probably hot/small enough that
__always_inline would make sense regardless of noinstr.

Probably I'm over-thinking it at this point.
Thomas Gleixner Nov. 1, 2024, 8:27 p.m. UTC | #7
On Thu, Oct 31 2024 at 18:44, Junaid Shahid wrote:
> On 10/29/24 12:12 PM, Thomas Gleixner wrote:
>> 
>> I doubt that it works as you want it to work.
>> 
>> +	inline notrace __attribute((__section__(".noinstr.text")))	\
>> 
>> So this explicitly puts the inline into the .noinstr.text section,
>> which means when it is used in .text the compiler will generate an
>> out-of-line function in the .noinstr.text section and insert a call into
>> the usage site. That's independent of the size of the inline.
>> 
>
> Oh, that's interesting. IIRC I had seen regular (.text) inline functions get 
> inlined into .noinstr.text callers. I assume the difference is that here the 
> section is marked explicitly rather than being implicit?

Correct. Inlines without any section attribute are free to be inlined in
any section, but if the compiler decides to uninline them, then it
sticks the uninlined version into the default section ".text".

The other problem there is that an out of line version can be
instrumented if not explicitly forbidden.

That's why we mark them __always_inline, which forces the compiler to
inline it into the usage site unconditionally.

> In any case, I guess we could just mark these functions as plain
> noinstr.

No. Some of them are used in hotpath '.text'. 'noinstr' prevents them
from actually being inlined there, as I explained to you before.

> (Unless there happens to be some other way to indicate to the compiler to place 
> any non-inlined copy of the function in .noinstr.text but still allow inlining 
> into .text if it makes sense optimization-wise.)

Ideally the compilers would provide

        __attribute__(force_caller_section)

which makes them place an out of line inline into the section of the
function from which it is called. But we can't have useful things or
they are so badly documented that I can't find them ...

What actually works by some definition of "works" is:

       static __always_inline void __foo(void) { }

       static inline void foo(void)
       {
                __foo();
       }

       static inline noinstr void foo_noinstr(void)
       {
                __foo();
       }

The problem is that both GCC and clang optimize foo[_noinstr]() away and
then follow the __always_inline directive of __foo() even if I make
__foo() insanely large and have a gazillion of different functions
marked noinline invoking foo() or foo_noinstr(), unless I add -fno-inline
to the command line.

Which means it's not much different from just having '__always_inline
foo()' without the wrappers....

Compilers clearly lack a --do-what-I-mean command line option.

Now if I'm truly nasty then both compilers do what I mean even without a
magic command line option:

       static __always_inline void __foo(void) { }

       static __maybe_unused void foo(void)
       {
                __foo();
       }

       static __maybe_unused noinstr void foo_noinstr(void)
       {
                __foo();
       }

If there is a single invocation of either foo() or foo_noinstr() and
they are small enough then the compiler inlines them, unless -fno-inline
is on the command line. If there are multiple invocations and/or foo
gets big enough then both compilers out of line them. The out of line
wrappers with __foo() inlined in them always end up in the correct
section.

I actually really like the programming model as it is very clear about
the intention of usage and it allows static checkers to validate.

Thoughts?

Thanks,

        tglx
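
The wrapper scheme from the last example can be tried in user space with a sketch like the following. All demo_* names and the helper body are hypothetical. demo_noinstr only sets the section and disables -finstrument-functions instrumentation; the kernel's real noinstr carries more attributes than that, so this is an approximation under those assumptions, again for an ELF target with GCC or clang.

```c
/* User-space stand-ins for the kernel annotations; all names are made up. */
#define demo_always_inline	inline __attribute__((__always_inline__))
#define demo_maybe_unused	__attribute__((__unused__))
#define demo_noinstr							\
	__attribute__((__section__(".noinstr.text"),			\
		       __no_instrument_function__))

/* The real work lives in one helper that is always inlined. */
static demo_always_inline unsigned long
demo_build_impl(unsigned long base, unsigned long asid)
{
	return base | asid;
}

/* Wrapper for regular code; an out-of-line copy goes to .text. */
static demo_maybe_unused unsigned long
demo_build(unsigned long base, unsigned long asid)
{
	return demo_build_impl(base, asid);
}

/* Wrapper for noinstr code; an out-of-line copy goes to .noinstr.text. */
static demo_maybe_unused demo_noinstr unsigned long
demo_build_noinstr(unsigned long base, unsigned long asid)
{
	return demo_build_impl(base, asid);
}

/* Regular call site. */
unsigned long demo_text_caller(unsigned long base, unsigned long asid)
{
	return demo_build(base, asid);
}

/* Call site that itself sits in .noinstr.text. */
demo_noinstr unsigned long
demo_noinstr_caller(unsigned long base, unsigned long asid)
{
	return demo_build_noinstr(base, asid);
}
```

Each call site names the wrapper that matches its own context, which is what makes the intent visible to a reader or a static checker. Whether a given wrapper is inlined or emitted out of line still depends on the compiler and flags, so the object file needs to be inspected to see which happened.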
Junaid Shahid Nov. 5, 2024, 9:40 p.m. UTC | #8
On 11/1/24 1:27 PM, Thomas Gleixner wrote:
> On Thu, Oct 31 2024 at 18:44, Junaid Shahid wrote:
>> On 10/29/24 12:12 PM, Thomas Gleixner wrote:
>>>
>>> I doubt that it works as you want it to work.
>>>
>>> +	inline notrace __attribute((__section__(".noinstr.text")))	\
>>>
>>> So this explicitly puts the inline into the .noinstr.text section,
>>> which means when it is used in .text the compiler will generate an
>>> out-of-line function in the .noinstr.text section and insert a call into
>>> the usage site. That's independent of the size of the inline.
>>>
>>
>> Oh, that's interesting. IIRC I had seen regular (.text) inline functions get
>> inlined into .noinstr.text callers. I assume the difference is that here the
>> section is marked explicitly rather than being implicit?
> 
> Correct. Inlines without any section attribute are free to be inlined in
> any section, but if the compiler decides to uninline them, then it
> sticks the uninlined version into the default section ".text".
> 
> The other problem there is that an out of line version can be
> instrumented if not explicitly forbidden.
> 
> That's why we mark them __always_inline, which forces the compiler to
> inline it into the usage site unconditionally.
> 
>> In any case, I guess we could just mark these functions as plain
>> noinstr.
> 
> No. Some of them are used in hotpath '.text'. 'noinstr' prevents them
> from actually being inlined there, as I explained to you before.
> 
>> (Unless there happens to be some other way to indicate to the compiler to place
>> any non-inlined copy of the function in .noinstr.text but still allow inlining
>> into .text if it makes sense optimization-wise.)
> 
> Ideally the compilers would provide
> 
>          __attribute__(force_caller_section)
> 
> which makes them place an out of line inline into the section of the
> function from which it is called. But we can't have useful things or
> they are so badly documented that I can't find them ...
> 
> What actually works by some definition of "works" is:
> 
>         static __always_inline void __foo(void) { }
> 
>         static inline void foo(void)
>         {
>                  __foo();
>         }
> 
>         static inline noinstr void foo_noinstr(void)
>         {
>                  __foo();
>         }
> 
> The problem is that both GCC and clang optimize foo[_noinstr]() away and
> then follow the __always_inline directive of __foo() even if I make
> __foo() insanely large and have a gazillion of different functions
> marked noinline invoking foo() or foo_noinstr(), unless I add -fno-inline
> to the command line.
> 
> Which means it's not much different from just having '__always_inline
> foo()' without the wrappers....
> 
> Compilers clearly lack a --do-what-I-mean command line option.
> 
> Now if I'm truly nasty then both compilers do what I mean even without a
> magic command line option:
> 
>         static __always_inline void __foo(void) { }
> 
>         static __maybe_unused void foo(void)
>         {
>                  __foo();
>         }
> 
>         static __maybe_unused noinstr void foo_noinstr(void)
>         {
>                  __foo();
>         }
> 
> If there is a single invocation of either foo() or foo_noinstr() and
> they are small enough then the compiler inlines them, unless -fno-inline
> is on the command line. If there are multiple invocations and/or foo
> gets big enough then both compilers out of line them. The out of line
> wrappers with __foo() inlined in them always end up in the correct
> section.
> 
> I actually really like the programming model as it is very clear about
> the intention of usage and it allows static checkers to validate.
> 
> Thoughts?
> 

Thank you for the details. Yes, I think the last scheme that you described, with
separate wrappers for regular and noinstr usage, makes sense. IIRC the existing
static validation wouldn't catch it if someone mistakenly called the .text
version of the function from noinstr code and it just happened to get inlined.
Perhaps we should add the -fno-inline compiler option with
CONFIG_NOINSTR_VALIDATION?

Thanks,
Junaid
Patch

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 78e51b0d6433d..dc45d622eae4e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -206,7 +206,7 @@  void print_cpu_msr(struct cpuinfo_x86 *);
 /*
  * Friendlier CR3 helpers.
  */
-static inline unsigned long read_cr3_pa(void)
+static __always_inline unsigned long read_cr3_pa(void)
 {
 	return __read_cr3() & CR3_ADDR_MASK;
 }
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 2e9fc5c400cdc..c63433dc04d34 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -42,14 +42,14 @@  static __always_inline void native_write_cr2(unsigned long val)
 	asm volatile("mov %0,%%cr2": : "r" (val) : "memory");
 }
 
-static inline unsigned long __native_read_cr3(void)
+static __always_inline unsigned long __native_read_cr3(void)
 {
 	unsigned long val;
 	asm volatile("mov %%cr3,%0\n\t" : "=r" (val) : __FORCE_ORDER);
 	return val;
 }
 
-static inline void native_write_cr3(unsigned long val)
+static __always_inline void native_write_cr3(unsigned long val)
 {
 	asm volatile("mov %0,%%cr3": : "r" (val) : "memory");
 }
@@ -153,12 +153,12 @@  static __always_inline void write_cr2(unsigned long x)
  * Careful!  CR3 contains more than just an address.  You probably want
  * read_cr3_pa() instead.
  */
-static inline unsigned long __read_cr3(void)
+static __always_inline unsigned long __read_cr3(void)
 {
 	return __native_read_cr3();
 }
 
-static inline void write_cr3(unsigned long x)
+static __always_inline void write_cr3(unsigned long x)
 {
 	native_write_cr3(x);
 }
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 44ac64f3a047c..6ca18ac9058b6 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -110,7 +110,7 @@ 
 /*
  * Given @asid, compute kPCID
  */
-static inline u16 kern_pcid(u16 asid)
+static inline_or_noinstr u16 kern_pcid(u16 asid)
 {
 	VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE);
 
@@ -155,9 +155,9 @@  static inline u16 user_pcid(u16 asid)
 	return ret;
 }
 
-static inline unsigned long build_cr3(pgd_t *pgd, u16 asid, unsigned long lam)
+static inline_or_noinstr unsigned long build_cr3(pgd_t *pgd, u16 asid, unsigned long lam)
 {
-	unsigned long cr3 = __sme_pa(pgd) | lam;
+	unsigned long cr3 = __sme_pa_nodebug(pgd) | lam;
 
 	if (static_cpu_has(X86_FEATURE_PCID)) {
 		VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE);
@@ -1087,7 +1087,7 @@  void flush_tlb_kernel_range(unsigned long start, unsigned long end)
  * It's intended to be used for code like KVM that sneakily changes CR3
  * and needs to restore it.  It needs to be used very carefully.
  */
-unsigned long __get_current_cr3_fast(void)
+inline_or_noinstr unsigned long __get_current_cr3_fast(void)
 {
 	unsigned long cr3 =
 		build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd,
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 8f8236317d5b1..955497335832c 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -320,6 +320,14 @@  struct ftrace_likely_data {
  */
 #define __cpuidle __noinstr_section(".cpuidle.text")
 
+/*
+ * Can be used for functions which themselves are not strictly noinstr, but
+ * may be called from noinstr code.
+ */
+#define inline_or_noinstr						\
+	inline notrace __attribute((__section__(".noinstr.text")))	\
+	__no_kcsan __no_sanitize_address __no_sanitize_coverage
+
 #endif /* __KERNEL__ */
 
 #endif /* __ASSEMBLY__ */