Message ID | 20230113230835.29356-3-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix truncation of various XENVER_* subops | expand |
On 14.01.2023 00:08, Andrew Cooper wrote: > The arch_get_xen_caps() infrastructure is horribly inefficient, for something > that is constant after features have been resolved on boot. > > Every instance used snprintf() to format constants into a string (which gets > shorter when %d gets resolved!), which gets double buffered on the stack. > > Switch to using string literals with the "3.0" inserted - these numbers > haven't changed in 18 years (The Xen 3.0 release was Dec 5th 2005). > > Use initcalls to format the data into xen_cap_info, which is deliberately not > of type xen_capabilities_info_t because a 1k array is a silly overhead for > storing a maximum of 77 chars (the x86 version) and isn't liable to need any > more space in the forseeable future. So I was wondering if once we arrived at the new ABI (and hence the 3.0 one is properly legacy) we shouldn't declare Xen 5.0 and then also mark the new ABI's availability here by a string including "5.0" where at present we expose (only) "3.0". > If Xen had strncpy(), then the hunk in do_xen_version() could read: > > if ( deny ) > memset(info, 0, sizeof(info)); > else > strncpy(info, xen_cap_info, sizeof(info)); > > to avoid double processing the start of the buffer, but given the ABI (must > write 1k chars into the guest), I cannot see any way of taking info off the > stack without some kind of strncpy_to_guest() API. How about using clear_guest() for the 1k range, then copy_to_guest() for merely the string? Plus - are we even required to clear the buffer past the nul terminator? Jan
On 16/01/2023 3:53 pm, Jan Beulich wrote: > On 14.01.2023 00:08, Andrew Cooper wrote: >> The arch_get_xen_caps() infrastructure is horribly inefficient, for something >> that is constant after features have been resolved on boot. >> >> Every instance used snprintf() to format constants into a string (which gets >> shorter when %d gets resolved!), which gets double buffered on the stack. >> >> Switch to using string literals with the "3.0" inserted - these numbers >> haven't changed in 18 years (The Xen 3.0 release was Dec 5th 2005). >> >> Use initcalls to format the data into xen_cap_info, which is deliberately not >> of type xen_capabilities_info_t because a 1k array is a silly overhead for >> storing a maximum of 77 chars (the x86 version) and isn't liable to need any >> more space in the forseeable future. > So I was wondering if once we arrived at the new ABI (and hence the 3.0 one > is properly legacy) we shouldn't declare Xen 5.0 and then also mark the new > ABI's availability here by a string including "5.0" where at present we > expose (only) "3.0". "the new ABI" is is still two things. The one part is changes to the in-guest ABI which does make it GPA based (for HVM), but this does need to be broadly backwards compatible. This ABI string lives in the PV guest elfnotes (and is ultimately the thing that distinguishes PV32pae vs PV64), but nowhere interesting for HVM guests as far as I can see (furthermore, the 3 variations of hvm-3.0- are bogus. xen-3.0-x86_64la57 would probably be the least invasive way to extend PV support to 5-level paging. The other part is a stable tools API/ABI. This can have any kind of interface we choose, and frankly there are better interfaces than this stringly typed one. "xen-3.0" is even hardcoded in libelf. I can't foresee a good reason to bump 3 -> 5 and break all current PV guests. >> If Xen had strncpy(), then the hunk in do_xen_version() could read: >> >> if ( deny ) >> memset(info, 0, sizeof(info)); >> else >> strncpy(info, xen_cap_info, sizeof(info)); >> >> to avoid double processing the start of the buffer, but given the ABI (must >> write 1k chars into the guest), I cannot see any way of taking info off the >> stack without some kind of strncpy_to_guest() API. > How about using clear_guest() for the 1k range, then copy_to_guest() for > merely the string? Plus - are we even required to clear the buffer past > the nul terminator? Well, we have previously always copied 1k bytes. But this is always been a NUL terminated API of a string persuasion, so I find it hard to believe that any caller cares beyond the NUL. Because of safe_strcpy(), xen_cap_info is guaranteed to be NUL terminated, so if we don't care about padding the buffer with extra zeroes, we don't even need the clear_guest(). Also, similar reasoning would apply to XENVER_cmdline which is typically rather less than 1k in length (at least it's not on the stack, but it is still an excessive copy). ~Andrew
On 16.01.2023 21:52, Andrew Cooper wrote: > On 16/01/2023 3:53 pm, Jan Beulich wrote: >> On 14.01.2023 00:08, Andrew Cooper wrote: >>> The arch_get_xen_caps() infrastructure is horribly inefficient, for something >>> that is constant after features have been resolved on boot. >>> >>> Every instance used snprintf() to format constants into a string (which gets >>> shorter when %d gets resolved!), which gets double buffered on the stack. >>> >>> Switch to using string literals with the "3.0" inserted - these numbers >>> haven't changed in 18 years (The Xen 3.0 release was Dec 5th 2005). >>> >>> Use initcalls to format the data into xen_cap_info, which is deliberately not >>> of type xen_capabilities_info_t because a 1k array is a silly overhead for >>> storing a maximum of 77 chars (the x86 version) and isn't liable to need any >>> more space in the forseeable future. >> So I was wondering if once we arrived at the new ABI (and hence the 3.0 one >> is properly legacy) we shouldn't declare Xen 5.0 and then also mark the new >> ABI's availability here by a string including "5.0" where at present we >> expose (only) "3.0". > > "the new ABI" is is still two things. > > The one part is changes to the in-guest ABI which does make it GPA based > (for HVM), but this does need to be broadly backwards compatible. This > ABI string lives in the PV guest elfnotes (and is ultimately the thing > that distinguishes PV32pae vs PV64), but nowhere interesting for HVM > guests as far as I can see (furthermore, the 3 variations of hvm-3.0- > are bogus. > > xen-3.0-x86_64la57 would probably be the least invasive way to extend PV > support to 5-level paging. > > The other part is a stable tools API/ABI. This can have any kind of > interface we choose, and frankly there are better interfaces than this > stringly typed one. > > > "xen-3.0" is even hardcoded in libelf. I can't foresee a good reason to > bump 3 -> 5 and break all current PV guests. I didn't by any means mean to suggest to bump. I was seeing us add new 5.0 entries, with the presence of the 3.0 ones indicating the backwards compatible ABI. An option might then later be to make the compatible ABI compile (kconfig) time conditional. >>> If Xen had strncpy(), then the hunk in do_xen_version() could read: >>> >>> if ( deny ) >>> memset(info, 0, sizeof(info)); >>> else >>> strncpy(info, xen_cap_info, sizeof(info)); >>> >>> to avoid double processing the start of the buffer, but given the ABI (must >>> write 1k chars into the guest), I cannot see any way of taking info off the >>> stack without some kind of strncpy_to_guest() API. >> How about using clear_guest() for the 1k range, then copy_to_guest() for >> merely the string? Plus - are we even required to clear the buffer past >> the nul terminator? > > Well, we have previously always copied 1k bytes. But this is always > been a NUL terminated API of a string persuasion, so I find it hard to > believe that any caller cares beyond the NUL. > > Because of safe_strcpy(), xen_cap_info is guaranteed to be NUL > terminated, so if we don't care about padding the buffer with extra > zeroes, we don't even need the clear_guest(). Right, that's what I was hinting at as a possible option to do right here, or to do subsequently. > Also, similar reasoning would apply to XENVER_cmdline which is typically > rather less than 1k in length (at least it's not on the stack, but it is > still an excessive copy). Indeed, but I understood your primary concern was the on-stack copy, so my suggestion first focused on that. Jan
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c index 1f26f67b90e3..b71b4bc506e0 100644 --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -1194,24 +1194,17 @@ void __init start_xen(unsigned long boot_phys_offset, switch_stack_and_jump(idle_vcpu[0]->arch.cpu_info, init_done); } -void arch_get_xen_caps(xen_capabilities_info_t *info) +static int __init init_xen_cap_info(void) { - /* Interface name is always xen-3.0-* for Xen-3.x. */ - int major = 3, minor = 0; - char s[32]; - - (*info)[0] = '\0'; - #ifdef CONFIG_ARM_64 - snprintf(s, sizeof(s), "xen-%d.%d-aarch64 ", major, minor); - safe_strcat(*info, s); + safe_strcat(xen_cap_info, "xen-3.0-aarch64 "); #endif if ( cpu_has_aarch32 ) - { - snprintf(s, sizeof(s), "xen-%d.%d-armv7l ", major, minor); - safe_strcat(*info, s); - } + safe_strcat(xen_cap_info, "xen-3.0-armv7l "); + + return 0; } +__initcall(init_xen_cap_info); /* * Local variables: diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index 566422600d94..f80821469ece 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -1956,35 +1956,24 @@ void __init noreturn __start_xen(unsigned long mbi_p) unreachable(); } -void arch_get_xen_caps(xen_capabilities_info_t *info) +static int __init cf_check init_xen_cap_info(void) { - /* Interface name is always xen-3.0-* for Xen-3.x. */ - int major = 3, minor = 0; - char s[32]; - - (*info)[0] = '\0'; - if ( IS_ENABLED(CONFIG_PV) ) { - snprintf(s, sizeof(s), "xen-%d.%d-x86_64 ", major, minor); - safe_strcat(*info, s); + safe_strcat(xen_cap_info, "xen-3.0-x86_64 "); if ( opt_pv32 ) - { - snprintf(s, sizeof(s), "xen-%d.%d-x86_32p ", major, minor); - safe_strcat(*info, s); - } + safe_strcat(xen_cap_info, "xen-3.0-x86_32p "); } if ( hvm_enabled ) - { - snprintf(s, sizeof(s), "hvm-%d.%d-x86_32 ", major, minor); - safe_strcat(*info, s); - snprintf(s, sizeof(s), "hvm-%d.%d-x86_32p ", major, minor); - safe_strcat(*info, s); - snprintf(s, sizeof(s), "hvm-%d.%d-x86_64 ", major, minor); - safe_strcat(*info, s); - } + safe_strcat(xen_cap_info, + "hvm-3.0-x86_32 " + "hvm-3.0-x86_32p " + "hvm-3.0-x86_64 "); + + return 0; } +__initcall(init_xen_cap_info); int __hwdom_init xen_in_range(unsigned long mfn) { diff --git a/xen/common/kernel.c b/xen/common/kernel.c index f7b1f65f373c..4fa1d6710115 100644 --- a/xen/common/kernel.c +++ b/xen/common/kernel.c @@ -30,6 +30,7 @@ enum system_state system_state = SYS_STATE_early_boot; xen_commandline_t saved_cmdline; static const char __initconst opt_builtin_cmdline[] = CONFIG_CMDLINE; +char __ro_after_init xen_cap_info[128]; static int assign_integer_param(const struct kernel_param *param, uint64_t val) { @@ -509,7 +510,7 @@ long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) memset(info, 0, sizeof(info)); if ( !deny ) - arch_get_xen_caps(&info); + safe_strcpy(info, xen_cap_info); if ( copy_to_guest(arg, info, ARRAY_SIZE(info)) ) return -EFAULT; diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h index f307dfb59760..15b6be6ec818 100644 --- a/xen/include/xen/hypercall.h +++ b/xen/include/xen/hypercall.h @@ -56,6 +56,4 @@ common_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg); -void arch_get_xen_caps(xen_capabilities_info_t *info); - #endif /* __XEN_HYPERCALL_H__ */ diff --git a/xen/include/xen/version.h b/xen/include/xen/version.h index 93c58773630c..4856ad1b446d 100644 --- a/xen/include/xen/version.h +++ b/xen/include/xen/version.h @@ -19,6 +19,8 @@ const char *xen_deny(void); const char *xen_build_info(void); int xen_build_id(const void **p, unsigned int *len); +extern char xen_cap_info[128]; + #ifdef BUILD_ID void xen_build_init(void); int xen_build_id_check(const Elf_Note *n, unsigned int n_sz,
The arch_get_xen_caps() infrastructure is horribly inefficient, for something that is constant after features have been resolved on boot. Every instance used snprintf() to format constants into a string (which gets shorter when %d gets resolved!), which gets double buffered on the stack. Switch to using string literals with the "3.0" inserted - these numbers haven't changed in 18 years (The Xen 3.0 release was Dec 5th 2005). Use initcalls to format the data into xen_cap_info, which is deliberately not of type xen_capabilities_info_t because a 1k array is a silly overhead for storing a maximum of 77 chars (the x86 version) and isn't liable to need any more space in the forseeable future. This speeds up the the XENVER_capabilities hypercall, but the purpose of the change is to allow us to introduce a better XENVER_* API that doesn't force us to put a 1k buffer on the stack. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Julien Grall <julien@xen.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> v2: * New If Xen had strncpy(), then the hunk in do_xen_version() could read: if ( deny ) memset(info, 0, sizeof(info)); else strncpy(info, xen_cap_info, sizeof(info)); to avoid double processing the start of the buffer, but given the ABI (must write 1k chars into the guest), I cannot see any way of taking info off the stack without some kind of strncpy_to_guest() API. Moving to __initcall() also allows new architectures to not implement this API, and I'm going to recommend strongly that they dont. Its a very dubious way of signalling about 3 bits of info to the toolstack, and inefficient to use (the toolstack has to do string parsing on the result figure out if PV64/PV32/HVM is available). --- xen/arch/arm/setup.c | 19 ++++++------------- xen/arch/x86/setup.c | 31 ++++++++++--------------------- xen/common/kernel.c | 3 ++- xen/include/xen/hypercall.h | 2 -- xen/include/xen/version.h | 2 ++ 5 files changed, 20 insertions(+), 37 deletions(-)