diff mbox series

x86/nospec: Fix evaluate_nospec() code generation under Clang

Message ID 20220425175603.21086-1-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series x86/nospec: Fix evaluate_nospec() code generation under Clang | expand

Commit Message

Andrew Cooper April 25, 2022, 5:56 p.m. UTC
It turns out that evaluate_nospec() code generation is not safe under Clang.
Given:

  void eval_nospec_test(int x)
  {
      if ( evaluate_nospec(x) )
          asm volatile ("nop #true" ::: "memory");
      else
          asm volatile ("nop #false" ::: "memory");
  }

Clang emits:

  <eval_nospec_test>:
         0f ae e8                lfence
         85 ff                   test   %edi,%edi
         74 02                   je     <eval_nospec_test+0x9>
         90                      nop
         c3                      ret
         90                      nop
         c3                      ret

which is not safe because the lfence has been hoisted above the conditional
jump.  Clang concludes that both barrier_nospec_true()'s have identical side
effects and can safely be merged.

Clang can be persuaded that the side effects are different if there are
different comments in the asm blocks.  This is fragile, but no more fragile
that other aspects of this construct.

Introduce barrier_nospec_false() with a separate internal comment to prevent
Clang merging it with barrier_nospec_true() despite the otherwise-identical
content.  The generated code now becomes:

  <eval_nospec_test>:
         85 ff                   test   %edi,%edi
         74 05                   je     <eval_nospec_test+0x9>
         0f ae e8                lfence
         90                      nop
         c3                      ret
         0f ae e8                lfence
         90                      nop
         c3                      ret

which has the correct number of lfence's, and in the correct place.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/include/asm/nospec.h | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

Comments

Jan Beulich April 26, 2022, 6:43 a.m. UTC | #1
On 25.04.2022 19:56, Andrew Cooper wrote:
> It turns out that evaluate_nospec() code generation is not safe under Clang.
> Given:
> 
>   void eval_nospec_test(int x)
>   {
>       if ( evaluate_nospec(x) )
>           asm volatile ("nop #true" ::: "memory");
>       else
>           asm volatile ("nop #false" ::: "memory");
>   }
> 
> Clang emits:
> 
>   <eval_nospec_test>:
>          0f ae e8                lfence
>          85 ff                   test   %edi,%edi
>          74 02                   je     <eval_nospec_test+0x9>
>          90                      nop
>          c3                      ret
>          90                      nop
>          c3                      ret
> 
> which is not safe because the lfence has been hoisted above the conditional
> jump.  Clang concludes that both barrier_nospec_true()'s have identical side
> effects and can safely be merged.
> 
> Clang can be persuaded that the side effects are different if there are
> different comments in the asm blocks.  This is fragile, but no more fragile
> that other aspects of this construct.
> 
> Introduce barrier_nospec_false() with a separate internal comment to prevent
> Clang merging it with barrier_nospec_true() despite the otherwise-identical
> content.  The generated code now becomes:
> 
>   <eval_nospec_test>:
>          85 ff                   test   %edi,%edi
>          74 05                   je     <eval_nospec_test+0x9>
>          0f ae e8                lfence
>          90                      nop
>          c3                      ret
>          0f ae e8                lfence
>          90                      nop
>          c3                      ret
> 
> which has the correct number of lfence's, and in the correct place.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

I can live with us going this route, so
Reviewed-by: Jan Beulich <jbeulich@suse.com>

However, I'd like alternatives to be considered: Would two asm()s
perhaps not be candidates for merging when they have different
(perhaps fake) arguments or clobbers? If so, would this be less
fragile than relying on comments, which clearly any layer could be
viewed as free to strip off (when the same isn't true for arguments
and clobbers)?

Also you did say you'd open an issue with Clang to try to get their
view on relying on comments here. Could you please add a reference
to that issue in the description here?

Jan
Roger Pau Monné April 26, 2022, 7:18 a.m. UTC | #2
On Mon, Apr 25, 2022 at 06:56:03PM +0100, Andrew Cooper wrote:
> It turns out that evaluate_nospec() code generation is not safe under Clang.
> Given:
> 
>   void eval_nospec_test(int x)
>   {
>       if ( evaluate_nospec(x) )
>           asm volatile ("nop #true" ::: "memory");
>       else
>           asm volatile ("nop #false" ::: "memory");
>   }
> 
> Clang emits:
> 
>   <eval_nospec_test>:
>          0f ae e8                lfence
>          85 ff                   test   %edi,%edi
>          74 02                   je     <eval_nospec_test+0x9>
>          90                      nop
>          c3                      ret
>          90                      nop
>          c3                      ret
> 
> which is not safe because the lfence has been hoisted above the conditional
> jump.  Clang concludes that both barrier_nospec_true()'s have identical side
> effects and can safely be merged.
> 
> Clang can be persuaded that the side effects are different if there are
> different comments in the asm blocks.  This is fragile, but no more fragile
> that other aspects of this construct.
> 
> Introduce barrier_nospec_false() with a separate internal comment to prevent
> Clang merging it with barrier_nospec_true() despite the otherwise-identical
> content.  The generated code now becomes:
> 
>   <eval_nospec_test>:
>          85 ff                   test   %edi,%edi
>          74 05                   je     <eval_nospec_test+0x9>
>          0f ae e8                lfence
>          90                      nop
>          c3                      ret
>          0f ae e8                lfence
>          90                      nop
>          c3                      ret
> 
> which has the correct number of lfence's, and in the correct place.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Like Jan I wonder what the clang devs think of this solution.  Is
there any test in clang to assert that comments won't be stripped from
asm blocks before optimization?

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> ---
>  xen/arch/x86/include/asm/nospec.h | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/include/asm/nospec.h b/xen/arch/x86/include/asm/nospec.h
> index 5312ae4c6f31..7150e76b87fb 100644
> --- a/xen/arch/x86/include/asm/nospec.h
> +++ b/xen/arch/x86/include/asm/nospec.h
> @@ -10,15 +10,26 @@
>  static always_inline bool barrier_nospec_true(void)
>  {
>  #ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
> -    alternative("lfence", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
> +    alternative("lfence #nospec-true", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
>  #endif
>      return true;
>  }
>  
> +static always_inline bool barrier_nospec_false(void)
> +{
> +#ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
> +    alternative("lfence #nospec-false", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
> +#endif
> +    return false;
> +}
> +
>  /* Allow to protect evaluation of conditionals with respect to speculation */
>  static always_inline bool evaluate_nospec(bool condition)
>  {
> -    return condition ? barrier_nospec_true() : !barrier_nospec_true();
> +    if ( condition )
> +        return barrier_nospec_true();
> +    else
> +        return barrier_nospec_false();
>  }

Is the switch from using a ternary operator also a requirement for
clang not optimizing this? (I would assume not, but better ask)

Thanks, Roger.
diff mbox series

Patch

diff --git a/xen/arch/x86/include/asm/nospec.h b/xen/arch/x86/include/asm/nospec.h
index 5312ae4c6f31..7150e76b87fb 100644
--- a/xen/arch/x86/include/asm/nospec.h
+++ b/xen/arch/x86/include/asm/nospec.h
@@ -10,15 +10,26 @@ 
 static always_inline bool barrier_nospec_true(void)
 {
 #ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
-    alternative("lfence", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
+    alternative("lfence #nospec-true", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
 #endif
     return true;
 }
 
+static always_inline bool barrier_nospec_false(void)
+{
+#ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
+    alternative("lfence #nospec-false", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
+#endif
+    return false;
+}
+
 /* Allow to protect evaluation of conditionals with respect to speculation */
 static always_inline bool evaluate_nospec(bool condition)
 {
-    return condition ? barrier_nospec_true() : !barrier_nospec_true();
+    if ( condition )
+        return barrier_nospec_true();
+    else
+        return barrier_nospec_false();
 }
 
 /* Allow to block speculative execution in generic code */