Message ID | 20240304161041.3465897-3-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New |
Series | xen/nospec: Improvements |
On 04.03.2024 17:10, Andrew Cooper wrote:
> --- a/xen/include/xen/nospec.h
> +++ b/xen/include/xen/nospec.h
> @@ -18,6 +18,15 @@ static always_inline bool evaluate_nospec(bool cond)
>  #ifndef arch_evaluate_nospec
>  #define arch_evaluate_nospec(cond) cond
>  #endif
> +
> +    /*
> +     * If the compiler can reduce the condition to a constant, then it won't
> +     * be emitting a conditional branch, and there's nothing needing
> +     * protecting.
> +     */
> +    if ( __builtin_constant_p(cond) )
> +        return cond;
> +
>      return arch_evaluate_nospec(cond);
>  }

While for now, even after having some hours for considering, I can't
point out anything concrete that could potentially become a problem here,
I still have the gut feeling that this would better be left in the arch
logic. (There's the oddity of what the function actually expands to if
the #define in context actually takes effect, but that's merely cosmetic.)

The one thing I'm firmly unhappy with is "won't" in the comment: We can't
know what the compiler will do. I've certainly known of compilers which
didn't as you indicate here. That was nothing remotely recent, but ancient
DOS/Windows ones. Still, unlike with e.g. __{get,put}_user_bad() the
compiler doing something unexpected would go entirely silently here.

The other (minor) aspect I'm not entirely happy with is that you insert
between the fallback #define and its use. I think (if we need such a
#define in the first place) the two would better stay close together.

As to the need for the #define: To me

    static always_inline bool evaluate_nospec(bool cond)
    {
    #ifdef arch_evaluate_nospec
        return arch_evaluate_nospec(cond);
    #else
        return cond;
    #endif
    }

or even

    static always_inline bool evaluate_nospec(bool cond)
    {
    #ifdef arch_evaluate_nospec
        return arch_evaluate_nospec(cond);
    #endif
        return cond;
    }

reads no worse, but perhaps slightly better, and is then consistent with
block_speculation(). At which point the question about "insertion point"
here would hopefully also disappear, as this addition is meaningful only
ahead of the #else.

Jan
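For illustration, the first variant above with the constant check placed ahead of the #else might look roughly like the sketch below. It is a standalone approximation and not code from the series: the always_inline definition, the counting arch_evaluate_nospec() stand-in, barrier_count and the check_*/demo_checks helpers are all assumptions made so that the file builds outside the Xen tree with GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: local stand-in for Xen's always_inline. */
#define always_inline inline __attribute__((__always_inline__))

/*
 * Hypothetical arch hook: counts invocations instead of emitting a real
 * speculation barrier, so its use can be observed at run time.
 */
static unsigned int barrier_count;
#define arch_evaluate_nospec(cond) (barrier_count++, (cond))

static always_inline bool evaluate_nospec(bool cond)
{
#ifdef arch_evaluate_nospec
    /* The constant check lives with the arch path, where it matters. */
    if ( __builtin_constant_p(cond) )
        return cond;

    return arch_evaluate_nospec(cond);
#else
    return cond;
#endif
}

/* Runtime-valued condition: the arch hook has to be reached. */
bool check_runtime(volatile bool *flag)
{
    return evaluate_nospec(*flag);
}

/* Constant condition: the hook is skipped only if the compiler folds it. */
bool check_constant(void)
{
    return evaluate_nospec(true);
}

/* Small usage example; returns 0 when all checks hold. */
int demo_checks(void)
{
    volatile bool flag = true;
    unsigned int before = barrier_count;

    assert(check_runtime(&flag));
    assert(barrier_count == before + 1);
    assert(check_constant());

    return 0;
}
```

Whether check_constant() bumps the counter depends on the compiler and optimisation level, which is the "we can't know what the compiler will do" concern in a nutshell: the result is correct either way, and nothing signals which path was taken.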
diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
index a4155af08770..56cf67a44176 100644
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -18,6 +18,15 @@ static always_inline bool evaluate_nospec(bool cond)
 #ifndef arch_evaluate_nospec
 #define arch_evaluate_nospec(cond) cond
 #endif
+
+    /*
+     * If the compiler can reduce the condition to a constant, then it won't
+     * be emitting a conditional branch, and there's nothing needing
+     * protecting.
+     */
+    if ( __builtin_constant_p(cond) )
+        return cond;
+
     return arch_evaluate_nospec(cond);
 }
When the compiler can reduce the condition to a constant, it can elide the
conditional and one of the basic blocks. However, arch_evaluate_nospec() will
still insert speculation protection, despite there being nothing to protect.

Allow the speculation protection to be skipped entirely when the compiler is
removing the condition entirely.

e.g. for x86, given:

    int foo(void)
    {
        if ( evaluate_nospec(1) )
            return 2;
        else
            return 42;
    }

then before, we get:

    <foo>:
        lfence
        mov    $0x2,%eax
        retq

and afterwards, we get:

    <foo>:
        mov    $0x2,%eax
        retq

which is correct. With no conditional branch to protect, the lfence isn't
providing any relevant safety.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/include/xen/nospec.h | 9 +++++++++
 1 file changed, 9 insertions(+)
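To reproduce the before/after comparison outside the Xen tree, something along the lines of the sketch below could be compiled with `gcc -O2 -S`, once as-is and once with `-DDEMO_BEFORE_PATCH`. Everything prefixed demo_ or DEMO_ is hypothetical, as are the local always_inline definition and bar(). The single-lfence barrier is a deliberately simplified stand-in and not Xen's real x86 arch_evaluate_nospec().

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: local stand-in for Xen's always_inline. */
#define always_inline inline __attribute__((__always_inline__))

/* Simplified stand-in barrier; NOT Xen's actual x86 implementation. */
static always_inline bool demo_arch_evaluate_nospec(bool cond)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile ( "lfence" ::: "memory" );
#else
    __asm__ volatile ( "" ::: "memory" ); /* compiler barrier only */
#endif
    return cond;
}

static always_inline bool evaluate_nospec(bool cond)
{
#ifndef DEMO_BEFORE_PATCH
    /* Behaviour with the patch: skip the barrier for constant conditions. */
    if ( __builtin_constant_p(cond) )
        return cond;
#endif
    return demo_arch_evaluate_nospec(cond);
}

/* The example from the commit message: the condition is a constant. */
int foo(void)
{
    if ( evaluate_nospec(1) )
        return 2;
    else
        return 42;
}

/* Contrast case: a runtime condition keeps the barrier in both builds. */
int bar(int x)
{
    if ( evaluate_nospec(x > 0) )
        return 2;
    else
        return 42;
}
```

Both builds return the same values from foo() and bar(). On an optimising x86 build, only the generated code for foo() is expected to differ, as in the listings above.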