diff mbox series

[v2,2/5] xen/version: Calculate xen_capabilities_info once at boot

Message ID 20230113230835.29356-3-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series Fix truncation of various XENVER_* subops | expand

Commit Message

Andrew Cooper Jan. 13, 2023, 11:08 p.m. UTC
The arch_get_xen_caps() infrastructure is horribly inefficient, for something
that is constant after features have been resolved on boot.

Every instance used snprintf() to format constants into a string (which gets
shorter when %d gets resolved!), which gets double buffered on the stack.

Switch to using string literals with the "3.0" inserted - these numbers
haven't changed in 18 years (The Xen 3.0 release was Dec 5th 2005).

Use initcalls to format the data into xen_cap_info, which is deliberately not
of type xen_capabilities_info_t because a 1k array is a silly overhead for
storing a maximum of 77 chars (the x86 version) and isn't liable to need any
more space in the forseeable future.

This speeds up the the XENVER_capabilities hypercall, but the purpose of the
change is to allow us to introduce a better XENVER_* API that doesn't force us
to put a 1k buffer on the stack.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>

v2:
 * New

If Xen had strncpy(), then the hunk in do_xen_version() could read:

  if ( deny )
     memset(info, 0, sizeof(info));
  else
     strncpy(info, xen_cap_info, sizeof(info));

to avoid double processing the start of the buffer, but given the ABI (must
write 1k chars into the guest), I cannot see any way of taking info off the
stack without some kind of strncpy_to_guest() API.

Moving to __initcall() also allows new architectures to not implement this
API, and I'm going to recommend strongly that they dont.  Its a very dubious
way of signalling about 3 bits of info to the toolstack, and inefficient to
use (the toolstack has to do string parsing on the result figure out if
PV64/PV32/HVM is available).
---
 xen/arch/arm/setup.c        | 19 ++++++-------------
 xen/arch/x86/setup.c        | 31 ++++++++++---------------------
 xen/common/kernel.c         |  3 ++-
 xen/include/xen/hypercall.h |  2 --
 xen/include/xen/version.h   |  2 ++
 5 files changed, 20 insertions(+), 37 deletions(-)

Comments

Jan Beulich Jan. 16, 2023, 3:53 p.m. UTC | #1
On 14.01.2023 00:08, Andrew Cooper wrote:
> The arch_get_xen_caps() infrastructure is horribly inefficient, for something
> that is constant after features have been resolved on boot.
> 
> Every instance used snprintf() to format constants into a string (which gets
> shorter when %d gets resolved!), which gets double buffered on the stack.
> 
> Switch to using string literals with the "3.0" inserted - these numbers
> haven't changed in 18 years (The Xen 3.0 release was Dec 5th 2005).
> 
> Use initcalls to format the data into xen_cap_info, which is deliberately not
> of type xen_capabilities_info_t because a 1k array is a silly overhead for
> storing a maximum of 77 chars (the x86 version) and isn't liable to need any
> more space in the forseeable future.

So I was wondering if once we arrived at the new ABI (and hence the 3.0 one
is properly legacy) we shouldn't declare Xen 5.0 and then also mark the new
ABI's availability here by a string including "5.0" where at present we
expose (only) "3.0".

> If Xen had strncpy(), then the hunk in do_xen_version() could read:
> 
>   if ( deny )
>      memset(info, 0, sizeof(info));
>   else
>      strncpy(info, xen_cap_info, sizeof(info));
> 
> to avoid double processing the start of the buffer, but given the ABI (must
> write 1k chars into the guest), I cannot see any way of taking info off the
> stack without some kind of strncpy_to_guest() API.

How about using clear_guest() for the 1k range, then copy_to_guest() for
merely the string? Plus - are we even required to clear the buffer past
the nul terminator?

Jan
Andrew Cooper Jan. 16, 2023, 8:52 p.m. UTC | #2
On 16/01/2023 3:53 pm, Jan Beulich wrote:
> On 14.01.2023 00:08, Andrew Cooper wrote:
>> The arch_get_xen_caps() infrastructure is horribly inefficient, for something
>> that is constant after features have been resolved on boot.
>>
>> Every instance used snprintf() to format constants into a string (which gets
>> shorter when %d gets resolved!), which gets double buffered on the stack.
>>
>> Switch to using string literals with the "3.0" inserted - these numbers
>> haven't changed in 18 years (The Xen 3.0 release was Dec 5th 2005).
>>
>> Use initcalls to format the data into xen_cap_info, which is deliberately not
>> of type xen_capabilities_info_t because a 1k array is a silly overhead for
>> storing a maximum of 77 chars (the x86 version) and isn't liable to need any
>> more space in the forseeable future.
> So I was wondering if once we arrived at the new ABI (and hence the 3.0 one
> is properly legacy) we shouldn't declare Xen 5.0 and then also mark the new
> ABI's availability here by a string including "5.0" where at present we
> expose (only) "3.0".

"the new ABI" is is still two things.

The one part is changes to the in-guest ABI which does make it GPA based
(for HVM), but this does need to be broadly backwards compatible.  This
ABI string lives in the PV guest elfnotes (and is ultimately the thing
that distinguishes PV32pae vs PV64), but nowhere interesting for HVM
guests as far as I can see (furthermore, the 3 variations of hvm-3.0-
are bogus.

xen-3.0-x86_64la57 would probably be the least invasive way to extend PV
support to 5-level paging.

The other part is a stable tools API/ABI.  This can have any kind of
interface we choose, and frankly there are better interfaces than this
stringly typed one.


"xen-3.0" is even hardcoded in libelf.  I can't foresee a good reason to
bump 3 -> 5 and break all current PV guests.

>> If Xen had strncpy(), then the hunk in do_xen_version() could read:
>>
>>   if ( deny )
>>      memset(info, 0, sizeof(info));
>>   else
>>      strncpy(info, xen_cap_info, sizeof(info));
>>
>> to avoid double processing the start of the buffer, but given the ABI (must
>> write 1k chars into the guest), I cannot see any way of taking info off the
>> stack without some kind of strncpy_to_guest() API.
> How about using clear_guest() for the 1k range, then copy_to_guest() for
> merely the string? Plus - are we even required to clear the buffer past
> the nul terminator?

Well, we have previously always copied 1k bytes.  But this is always
been a NUL terminated API of a string persuasion, so I find it hard to
believe that any caller cares beyond the NUL.

Because of safe_strcpy(), xen_cap_info is guaranteed to be NUL
terminated, so if we don't care about padding the buffer with extra
zeroes, we don't even need the clear_guest().

Also, similar reasoning would apply to XENVER_cmdline which is typically
rather less than 1k in length (at least it's not on the stack, but it is
still an excessive copy).

~Andrew
Jan Beulich Jan. 17, 2023, 8:46 a.m. UTC | #3
On 16.01.2023 21:52, Andrew Cooper wrote:
> On 16/01/2023 3:53 pm, Jan Beulich wrote:
>> On 14.01.2023 00:08, Andrew Cooper wrote:
>>> The arch_get_xen_caps() infrastructure is horribly inefficient, for something
>>> that is constant after features have been resolved on boot.
>>>
>>> Every instance used snprintf() to format constants into a string (which gets
>>> shorter when %d gets resolved!), which gets double buffered on the stack.
>>>
>>> Switch to using string literals with the "3.0" inserted - these numbers
>>> haven't changed in 18 years (The Xen 3.0 release was Dec 5th 2005).
>>>
>>> Use initcalls to format the data into xen_cap_info, which is deliberately not
>>> of type xen_capabilities_info_t because a 1k array is a silly overhead for
>>> storing a maximum of 77 chars (the x86 version) and isn't liable to need any
>>> more space in the forseeable future.
>> So I was wondering if once we arrived at the new ABI (and hence the 3.0 one
>> is properly legacy) we shouldn't declare Xen 5.0 and then also mark the new
>> ABI's availability here by a string including "5.0" where at present we
>> expose (only) "3.0".
> 
> "the new ABI" is is still two things.
> 
> The one part is changes to the in-guest ABI which does make it GPA based
> (for HVM), but this does need to be broadly backwards compatible.  This
> ABI string lives in the PV guest elfnotes (and is ultimately the thing
> that distinguishes PV32pae vs PV64), but nowhere interesting for HVM
> guests as far as I can see (furthermore, the 3 variations of hvm-3.0-
> are bogus.
> 
> xen-3.0-x86_64la57 would probably be the least invasive way to extend PV
> support to 5-level paging.
> 
> The other part is a stable tools API/ABI.  This can have any kind of
> interface we choose, and frankly there are better interfaces than this
> stringly typed one.
> 
> 
> "xen-3.0" is even hardcoded in libelf.  I can't foresee a good reason to
> bump 3 -> 5 and break all current PV guests.

I didn't by any means mean to suggest to bump. I was seeing us add new
5.0 entries, with the presence of the 3.0 ones indicating the backwards
compatible ABI. An option might then later be to make the compatible
ABI compile (kconfig) time conditional.

>>> If Xen had strncpy(), then the hunk in do_xen_version() could read:
>>>
>>>   if ( deny )
>>>      memset(info, 0, sizeof(info));
>>>   else
>>>      strncpy(info, xen_cap_info, sizeof(info));
>>>
>>> to avoid double processing the start of the buffer, but given the ABI (must
>>> write 1k chars into the guest), I cannot see any way of taking info off the
>>> stack without some kind of strncpy_to_guest() API.
>> How about using clear_guest() for the 1k range, then copy_to_guest() for
>> merely the string? Plus - are we even required to clear the buffer past
>> the nul terminator?
> 
> Well, we have previously always copied 1k bytes.  But this is always
> been a NUL terminated API of a string persuasion, so I find it hard to
> believe that any caller cares beyond the NUL.
> 
> Because of safe_strcpy(), xen_cap_info is guaranteed to be NUL
> terminated, so if we don't care about padding the buffer with extra
> zeroes, we don't even need the clear_guest().

Right, that's what I was hinting at as a possible option to do right here,
or to do subsequently.

> Also, similar reasoning would apply to XENVER_cmdline which is typically
> rather less than 1k in length (at least it's not on the stack, but it is
> still an excessive copy).

Indeed, but I understood your primary concern was the on-stack copy, so my
suggestion first focused on that.

Jan
diff mbox series

Patch

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 1f26f67b90e3..b71b4bc506e0 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -1194,24 +1194,17 @@  void __init start_xen(unsigned long boot_phys_offset,
     switch_stack_and_jump(idle_vcpu[0]->arch.cpu_info, init_done);
 }
 
-void arch_get_xen_caps(xen_capabilities_info_t *info)
+static int __init init_xen_cap_info(void)
 {
-    /* Interface name is always xen-3.0-* for Xen-3.x. */
-    int major = 3, minor = 0;
-    char s[32];
-
-    (*info)[0] = '\0';
-
 #ifdef CONFIG_ARM_64
-    snprintf(s, sizeof(s), "xen-%d.%d-aarch64 ", major, minor);
-    safe_strcat(*info, s);
+    safe_strcat(xen_cap_info, "xen-3.0-aarch64 ");
 #endif
     if ( cpu_has_aarch32 )
-    {
-        snprintf(s, sizeof(s), "xen-%d.%d-armv7l ", major, minor);
-        safe_strcat(*info, s);
-    }
+        safe_strcat(xen_cap_info, "xen-3.0-armv7l ");
+
+    return 0;
 }
+__initcall(init_xen_cap_info);
 
 /*
  * Local variables:
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 566422600d94..f80821469ece 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1956,35 +1956,24 @@  void __init noreturn __start_xen(unsigned long mbi_p)
     unreachable();
 }
 
-void arch_get_xen_caps(xen_capabilities_info_t *info)
+static int __init cf_check init_xen_cap_info(void)
 {
-    /* Interface name is always xen-3.0-* for Xen-3.x. */
-    int major = 3, minor = 0;
-    char s[32];
-
-    (*info)[0] = '\0';
-
     if ( IS_ENABLED(CONFIG_PV) )
     {
-        snprintf(s, sizeof(s), "xen-%d.%d-x86_64 ", major, minor);
-        safe_strcat(*info, s);
+        safe_strcat(xen_cap_info, "xen-3.0-x86_64 ");
 
         if ( opt_pv32 )
-        {
-            snprintf(s, sizeof(s), "xen-%d.%d-x86_32p ", major, minor);
-            safe_strcat(*info, s);
-        }
+            safe_strcat(xen_cap_info, "xen-3.0-x86_32p ");
     }
     if ( hvm_enabled )
-    {
-        snprintf(s, sizeof(s), "hvm-%d.%d-x86_32 ", major, minor);
-        safe_strcat(*info, s);
-        snprintf(s, sizeof(s), "hvm-%d.%d-x86_32p ", major, minor);
-        safe_strcat(*info, s);
-        snprintf(s, sizeof(s), "hvm-%d.%d-x86_64 ", major, minor);
-        safe_strcat(*info, s);
-    }
+        safe_strcat(xen_cap_info,
+                    "hvm-3.0-x86_32 "
+                    "hvm-3.0-x86_32p "
+                    "hvm-3.0-x86_64 ");
+
+    return 0;
 }
+__initcall(init_xen_cap_info);
 
 int __hwdom_init xen_in_range(unsigned long mfn)
 {
diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index f7b1f65f373c..4fa1d6710115 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -30,6 +30,7 @@  enum system_state system_state = SYS_STATE_early_boot;
 
 xen_commandline_t saved_cmdline;
 static const char __initconst opt_builtin_cmdline[] = CONFIG_CMDLINE;
+char __ro_after_init xen_cap_info[128];
 
 static int assign_integer_param(const struct kernel_param *param, uint64_t val)
 {
@@ -509,7 +510,7 @@  long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         memset(info, 0, sizeof(info));
         if ( !deny )
-            arch_get_xen_caps(&info);
+            safe_strcpy(info, xen_cap_info);
 
         if ( copy_to_guest(arg, info, ARRAY_SIZE(info)) )
             return -EFAULT;
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index f307dfb59760..15b6be6ec818 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -56,6 +56,4 @@  common_vcpu_op(int cmd,
     struct vcpu *v,
     XEN_GUEST_HANDLE_PARAM(void) arg);
 
-void arch_get_xen_caps(xen_capabilities_info_t *info);
-
 #endif /* __XEN_HYPERCALL_H__ */
diff --git a/xen/include/xen/version.h b/xen/include/xen/version.h
index 93c58773630c..4856ad1b446d 100644
--- a/xen/include/xen/version.h
+++ b/xen/include/xen/version.h
@@ -19,6 +19,8 @@  const char *xen_deny(void);
 const char *xen_build_info(void);
 int xen_build_id(const void **p, unsigned int *len);
 
+extern char xen_cap_info[128];
+
 #ifdef BUILD_ID
 void xen_build_init(void);
 int xen_build_id_check(const Elf_Note *n, unsigned int n_sz,