diff mbox series

[4/4] xen/version: Introduce non-truncating XENVER_* subops

Message ID 20230103200943.5801-5-andrew.cooper3@citrix.com (mailing list archive)
State Superseded
Headers show
Series Fix truncation of various XENVER_* subops | expand

Commit Message

Andrew Cooper Jan. 3, 2023, 8:09 p.m. UTC
Recently in XenServer, we have encountered problems caused by both
XENVER_extraversion and XENVER_commandline having fixed bounds.

More than just the invariant size, the APIs/ABIs also broken by typedef-ing an
array, and using an unqualified 'char' which has implementation-specific
signed-ness.

Provide brand new ops, which are capable of expressing variable length
strings, and mark the older ops as broken.

This fixes all issues around XENVER_extraversion being longer than 15 chars.
More work is required to remove other assumptions about XENVER_commandline
being 1023 chars long.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>

Tested by forcing XENVER_extraversion to be 20 chars long, and confirming that
an untruncated version can be obtained.

API/ABI wise, XENVER_build_id could be merged into xenver_varstring_op(), but
the internal infrastructure is awkward.
---
 xen/common/kernel.c          | 69 ++++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/version.h | 56 +++++++++++++++++++++++++++++++++--
 xen/include/xlat.lst         |  1 +
 xen/xsm/flask/hooks.c        |  4 +++
 4 files changed, 128 insertions(+), 2 deletions(-)

Comments

Julien Grall Jan. 3, 2023, 8:47 p.m. UTC | #1
Hi Andrew,

On 03/01/2023 20:09, Andrew Cooper wrote:
> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
> index c8325219f648..cf2d2ef38b54 100644
> --- a/xen/include/public/version.h
> +++ b/xen/include/public/version.h
> @@ -19,12 +19,20 @@
>   /* arg == NULL; returns major:minor (16:16). */
>   #define XENVER_version      0
>   
> -/* arg == xen_extraversion_t. */
> +/*
> + * arg == xen_extraversion_t.
> + *
> + * This API/ABI is broken.  Use XENVER_extraversion2 instead.

I read this as newer tools should never try to call XENVER_extraversion. 
But I don't think this is what you intend to say, correct? If so, I 
would say that an OS should first try XENVER_extraversion2 and then 
fallback to XENVER_extraversion if it returns -ENOSYS.

Same goes for the new hypercalls.

Cheers,
Andrew Cooper Jan. 3, 2023, 9:22 p.m. UTC | #2
On 03/01/2023 8:47 pm, Julien Grall wrote:
> Hi Andrew,
>
> On 03/01/2023 20:09, Andrew Cooper wrote:
>> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
>> index c8325219f648..cf2d2ef38b54 100644
>> --- a/xen/include/public/version.h
>> +++ b/xen/include/public/version.h
>> @@ -19,12 +19,20 @@
>>   /* arg == NULL; returns major:minor (16:16). */
>>   #define XENVER_version      0
>>   -/* arg == xen_extraversion_t. */
>> +/*
>> + * arg == xen_extraversion_t.
>> + *
>> + * This API/ABI is broken.  Use XENVER_extraversion2 instead.
>
> I read this as newer tools should never try to call
> XENVER_extraversion. But I don't think this is what you intend to say,
> correct? If so, I would say that an OS should first try
> XENVER_extraversion2 and then fallback to XENVER_extraversion if it
> returns -ENOSYS.
>
> Same goes for the new hypercalls.

Right, but that's sufficiently basic that it goes without saying.

This is not a "my first introduction to writing code" tutorial.

~Andrew
Julien Grall Jan. 3, 2023, 9:40 p.m. UTC | #3
On 03/01/2023 21:22, Andrew Cooper wrote:
> On 03/01/2023 8:47 pm, Julien Grall wrote:
>> On 03/01/2023 20:09, Andrew Cooper wrote:
>>> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
>>> index c8325219f648..cf2d2ef38b54 100644
>>> --- a/xen/include/public/version.h
>>> +++ b/xen/include/public/version.h
>>> @@ -19,12 +19,20 @@
>>>    /* arg == NULL; returns major:minor (16:16). */
>>>    #define XENVER_version      0
>>>    -/* arg == xen_extraversion_t. */
>>> +/*
>>> + * arg == xen_extraversion_t.
>>> + *
>>> + * This API/ABI is broken.  Use XENVER_extraversion2 instead.
>>
>> I read this as newer tools should never try to call
>> XENVER_extraversion. But I don't think this is what you intend to say,
>> correct? If so, I would say that an OS should first try
>> XENVER_extraversion2 and then fallback to XENVER_extraversion if it
>> returns -ENOSYS.
>>
>> Same goes for the new hypercalls.
> 
> Right, but that's sufficiently basic that it goes without saying.
It never hurts to be obvious. I couldn't say the same for the other way 
around.

> 
> This is not a "my first introduction to writing code" tutorial.

That's the first introduction to Xen for an OS developer. Adding a few 
words (maybe with the first version introducing it) would likely be 
welcomed by any developer.

This avoids any misinterpretation and/or losing time playing the git 
blame game...

Cheers,
Jan Beulich Jan. 4, 2023, 5:04 p.m. UTC | #4
On 03.01.2023 21:09, Andrew Cooper wrote:
> Recently in XenServer, we have encountered problems caused by both
> XENVER_extraversion and XENVER_commandline having fixed bounds.
> 
> More than just the invariant size, the APIs/ABIs also broken by typedef-ing an
> array, and using an unqualified 'char' which has implementation-specific
> signed-ness.

Which is fine as long as only ASCII is returned. If non-ASCII can be returned,
I agree "unsigned char" is better, but then we also need to spell out what
encoding the strings use (UTF-8 presumably).

> API/ABI wise, XENVER_build_id could be merged into xenver_varstring_op(), but
> the internal infrastructure is awkward.

I guess build-id also doesn't fit the rest because of not returning strings,
but indeed an array of bytes. You also couldn't use strlen() on the array.

> @@ -469,6 +470,66 @@ static int __init cf_check param_init(void)
>  __initcall(param_init);
>  #endif
>  
> +static long xenver_varstring_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> +{
> +    const char *str = NULL;
> +    size_t sz = 0;
> +    union {
> +        xen_capabilities_info_t info;
> +    } u;
> +    struct xen_var_string user_str;
> +
> +    switch ( cmd )
> +    {
> +    case XENVER_extraversion2:
> +        str = xen_extra_version();
> +        break;
> +
> +    case XENVER_changeset2:
> +        str = xen_changeset();
> +        break;
> +
> +    case XENVER_commandline2:
> +        str = saved_cmdline;
> +        break;
> +
> +    case XENVER_capabilities2:
> +        memset(u.info, 0, sizeof(u.info));
> +        arch_get_xen_caps(&u.info);
> +        str = u.info;
> +        break;
> +
> +    default:
> +        ASSERT_UNREACHABLE();
> +        break;
> +    }
> +
> +    if ( !str ||
> +         !(sz = strlen(str)) )
> +        return -ENODATA; /* failsafe */

Is this really appropriate for e.g. an empty command line?

> +    if ( sz > INT32_MAX )
> +        return -E2BIG; /* Compat guests.  2G ought to be plenty. */

While the comment here and in the public header mention compat guests,
the check is uniform. What's the deal?

> +    if ( guest_handle_is_null(arg) ) /* Length request */
> +        return sz;
> +
> +    if ( copy_from_guest(&user_str, arg, 1) )
> +        return -EFAULT;
> +
> +    if ( user_str.len == 0 )
> +        return -EINVAL;
> +
> +    if ( sz > user_str.len )
> +        return -ENOBUFS;
> +
> +    if ( copy_to_guest_offset(arg, offsetof(struct xen_var_string, buf),
> +                              str, sz) )
> +        return -EFAULT;

Not inserting a nul terminator is going to make this slightly awkward to
use.

> @@ -103,6 +126,35 @@ struct xen_build_id {
>  };
>  typedef struct xen_build_id xen_build_id_t;
>  
> +/*
> + * Container for an arbitrary variable length string.
> + */
> +struct xen_var_string {
> +    uint32_t len;                          /* IN:  size of buf[] in bytes. */
> +    unsigned char buf[XEN_FLEX_ARRAY_DIM]; /* OUT: requested data.         */
> +};
> +typedef struct xen_var_string xen_var_string_t;
> +
> +/*
> + * arg == xenver_string_t

Nit: xen_var_string_t (also again in the following text).

> + * Equivalent to the original ops, but with a non-truncating API/ABI.
> + *
> + * Passing arg == NULL is a request for size.  The returned size does not
> + * include a NUL terminator, and has a practical upper limit of INT32_MAX for
> + * 32bit guests.  This is expected to be plenty for the purpose.

As said above, the limit applies to all guests, which the wording here
doesn't suggest.

Jan

> + * Otherwise, the input xenver_string_t provides the size of the following
> + * buffer.  Xen will fill the buffer, and return the number of bytes written
> + * (e.g. if the input buffer was longer than necessary).
> + *
> + * These hypercalls can fail, in which case they'll return -XEN_Exx.
> + */
> +#define XENVER_extraversion2 11
> +#define XENVER_capabilities2 12
> +#define XENVER_changeset2    13
> +#define XENVER_commandline2  14
> +
>  #endif /* __XEN_PUBLIC_VERSION_H__ */
>  
>  /*
Andrew Cooper Jan. 4, 2023, 6:34 p.m. UTC | #5
On 04/01/2023 5:04 pm, Jan Beulich wrote:
> On 03.01.2023 21:09, Andrew Cooper wrote:
>> Recently in XenServer, we have encountered problems caused by both
>> XENVER_extraversion and XENVER_commandline having fixed bounds.
>>
>> More than just the invariant size, the APIs/ABIs also broken by typedef-ing an
>> array, and using an unqualified 'char' which has implementation-specific
>> signed-ness.
> Which is fine as long as only ASCII is returned. If non-ASCII can be returned,
> I agree "unsigned char" is better, but then we also need to spell out what
> encoding the strings use (UTF-8 presumably).

Well, it "functions" as far as ASCII is concerned, but I wouldn't say
that was fine for an ABI.

The command line can reasonably have non-ASCII characters these days. 
(as can the compile information, but I intentionally didn't convert them
to this new format).

UTF-8 is probably the sensible thing to specify, if we're going to make
any statement.

>
>> API/ABI wise, XENVER_build_id could be merged into xenver_varstring_op(), but
>> the internal infrastructure is awkward.
> I guess build-id also doesn't fit the rest because of not returning strings,
> but indeed an array of bytes. You also couldn't use strlen() on the array.

The format is unspecified, but it is a base64 encoded ASCII string (not
NUL terminated).

>> @@ -469,6 +470,66 @@ static int __init cf_check param_init(void)
>>  __initcall(param_init);
>>  #endif
>>  
>> +static long xenver_varstring_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>> +{
>> +    const char *str = NULL;
>> +    size_t sz = 0;
>> +    union {
>> +        xen_capabilities_info_t info;
>> +    } u;
>> +    struct xen_var_string user_str;
>> +
>> +    switch ( cmd )
>> +    {
>> +    case XENVER_extraversion2:
>> +        str = xen_extra_version();
>> +        break;
>> +
>> +    case XENVER_changeset2:
>> +        str = xen_changeset();
>> +        break;
>> +
>> +    case XENVER_commandline2:
>> +        str = saved_cmdline;
>> +        break;
>> +
>> +    case XENVER_capabilities2:
>> +        memset(u.info, 0, sizeof(u.info));
>> +        arch_get_xen_caps(&u.info);
>> +        str = u.info;
>> +        break;
>> +
>> +    default:
>> +        ASSERT_UNREACHABLE();
>> +        break;
>> +    }
>> +
>> +    if ( !str ||
>> +         !(sz = strlen(str)) )
>> +        return -ENODATA; /* failsafe */
> Is this really appropriate for e.g. an empty command line?

Hmm.  Good point.  All can in principle be empty.

In which case I think I'll put the -ENODATA in the default case and have
a return 0 for the sz == 0 case.

>> +    if ( sz > INT32_MAX )
>> +        return -E2BIG; /* Compat guests.  2G ought to be plenty. */
> While the comment here and in the public header mention compat guests,
> the check is uniform. What's the deal?

Well its either this, or a (comat ? INT32_MAX : INT64_MAX) check, along
with the ifdefary and predicates required to make that compile.

But there's not a CPU today which can actually move 2G of data (which is
4G of L1d bandwidth) without suffering the watchdog (especially as we've
just read it once for strlen(), so that's 6G of bandwidth), nor do I
expect this to change in the forseeable future.

There's some boundary (probably far lower) beyond which we can't use the
algorithm here.

There wants to be some limit, and I don't feel it is necessary to make
it variable on the guest type.

>
>> +    if ( guest_handle_is_null(arg) ) /* Length request */
>> +        return sz;
>> +
>> +    if ( copy_from_guest(&user_str, arg, 1) )
>> +        return -EFAULT;
>> +
>> +    if ( user_str.len == 0 )
>> +        return -EINVAL;
>> +
>> +    if ( sz > user_str.len )
>> +        return -ENOBUFS;
>> +
>> +    if ( copy_to_guest_offset(arg, offsetof(struct xen_var_string, buf),
>> +                              str, sz) )
>> +        return -EFAULT;
> Not inserting a nul terminator is going to make this slightly awkward to
> use.

Not really.  It matches how build-id currently works, and forces the
caller to use proper tained-string discipline.

To safely printk() it, all you need is:

rc = xen_version(XENVER_$X, &str);
if ( rc < 0 )
    return rc;
printk("%*.s\n", rc, str.buf);

>> @@ -103,6 +126,35 @@ struct xen_build_id {
>>  };
>>  typedef struct xen_build_id xen_build_id_t;
>>  
>> +/*
>> + * Container for an arbitrary variable length string.
>> + */
>> +struct xen_var_string {
>> +    uint32_t len;                          /* IN:  size of buf[] in bytes. */
>> +    unsigned char buf[XEN_FLEX_ARRAY_DIM]; /* OUT: requested data.         */
>> +};
>> +typedef struct xen_var_string xen_var_string_t;
>> +
>> +/*
>> + * arg == xenver_string_t
> Nit: xen_var_string_t (also again in the following text).

Ah yes.  I originally called it xenver_string_t then found a whole load
of compat assumptions about xen prefixes on names.

Will fix.


But overall, I'm not seeing a major objection to this change?  In which
case I'll go ahead and do the tools/ cleanup too for v2.

~Andrew
Jan Beulich Jan. 5, 2023, 8:15 a.m. UTC | #6
On 04.01.2023 19:34, Andrew Cooper wrote:
> On 04/01/2023 5:04 pm, Jan Beulich wrote:
>> On 03.01.2023 21:09, Andrew Cooper wrote:
>>> API/ABI wise, XENVER_build_id could be merged into xenver_varstring_op(), but
>>> the internal infrastructure is awkward.
>> I guess build-id also doesn't fit the rest because of not returning strings,
>> but indeed an array of bytes. You also couldn't use strlen() on the array.
> 
> The format is unspecified, but it is a base64 encoded ASCII string (not
> NUL terminated).

Where are you taking this base64 encoding from? I see the build ID being pure
binary data, which could easily have an embedded nul.

>>> +    if ( sz > INT32_MAX )
>>> +        return -E2BIG; /* Compat guests.  2G ought to be plenty. */
>> While the comment here and in the public header mention compat guests,
>> the check is uniform. What's the deal?
> 
> Well its either this, or a (comat ? INT32_MAX : INT64_MAX) check, along
> with the ifdefary and predicates required to make that compile.
> 
> But there's not a CPU today which can actually move 2G of data (which is
> 4G of L1d bandwidth) without suffering the watchdog (especially as we've
> just read it once for strlen(), so that's 6G of bandwidth), nor do I
> expect this to change in the forseeable future.
> 
> There's some boundary (probably far lower) beyond which we can't use the
> algorithm here.
> 
> There wants to be some limit, and I don't feel it is necessary to make
> it variable on the guest type.

Sure. My question was merely because of the special mentioning of 32-bit /
compat guests. I'm fine with the universal limit, and I'd also be fine
with a lower (universal) bound. All I'm after is that the (to me at least)
confusing comments be adjusted.

> But overall, I'm not seeing a major objection to this change?  In which
> case I'll go ahead and do the tools/ cleanup too for v2.

Well, I'm not entirely convinced of the need for the new subops (we could
as well introduce build time checks to make sure no truncation will occur
for the existing ones), but I also won't object to their introduction. At
least for the command line I can see that we will want (need) to support
more than 1k worth of a string, considering how many options we have
accumulated.

Jan
Andrew Cooper Jan. 5, 2023, 10:28 p.m. UTC | #7
On 05/01/2023 8:15 am, Jan Beulich wrote:
> On 04.01.2023 19:34, Andrew Cooper wrote:
>> On 04/01/2023 5:04 pm, Jan Beulich wrote:
>>> On 03.01.2023 21:09, Andrew Cooper wrote:
>>>> API/ABI wise, XENVER_build_id could be merged into xenver_varstring_op(), but
>>>> the internal infrastructure is awkward.
>>> I guess build-id also doesn't fit the rest because of not returning strings,
>>> but indeed an array of bytes. You also couldn't use strlen() on the array.
>> The format is unspecified, but it is a base64 encoded ASCII string (not
>> NUL terminated).
> Where are you taking this base64 encoding from? I see the build ID being pure
> binary data, which could easily have an embedded nul.

Oh, so it is.

I'd failed to spot that libxl formats it automatically on behalf of the
caller.

>>>> +    if ( sz > INT32_MAX )
>>>> +        return -E2BIG; /* Compat guests.  2G ought to be plenty. */
>>> While the comment here and in the public header mention compat guests,
>>> the check is uniform. What's the deal?
>> Well its either this, or a (comat ? INT32_MAX : INT64_MAX) check, along
>> with the ifdefary and predicates required to make that compile.
>>
>> But there's not a CPU today which can actually move 2G of data (which is
>> 4G of L1d bandwidth) without suffering the watchdog (especially as we've
>> just read it once for strlen(), so that's 6G of bandwidth), nor do I
>> expect this to change in the forseeable future.
>>
>> There's some boundary (probably far lower) beyond which we can't use the
>> algorithm here.
>>
>> There wants to be some limit, and I don't feel it is necessary to make
>> it variable on the guest type.
> Sure. My question was merely because of the special mentioning of 32-bit /
> compat guests. I'm fine with the universal limit, and I'd also be fine
> with a lower (universal) bound. All I'm after is that the (to me at least)
> confusing comments be adjusted.

How about 16k then?

>> But overall, I'm not seeing a major objection to this change?  In which
>> case I'll go ahead and do the tools/ cleanup too for v2.
> Well, I'm not entirely convinced of the need for the new subops (we could
> as well introduce build time checks to make sure no truncation will occur
> for the existing ones), but I also won't object to their introduction. At
> least for the command line I can see that we will want (need) to support
> more than 1k worth of a string, considering how many options we have
> accumulated.

I've legitimate customer bugs caused by the cmdline limit, and real test
problems caused by the extraversion limit which I'm unwilling to "fix"
by sticking to the current API/ABI.

~Andrew
Jan Beulich Jan. 6, 2023, 7:56 a.m. UTC | #8
On 05.01.2023 23:28, Andrew Cooper wrote:
> On 05/01/2023 8:15 am, Jan Beulich wrote:
>> On 04.01.2023 19:34, Andrew Cooper wrote:
>>> On 04/01/2023 5:04 pm, Jan Beulich wrote:
>>>> On 03.01.2023 21:09, Andrew Cooper wrote:
>>>>> +    if ( sz > INT32_MAX )
>>>>> +        return -E2BIG; /* Compat guests.  2G ought to be plenty. */
>>>> While the comment here and in the public header mention compat guests,
>>>> the check is uniform. What's the deal?
>>> Well its either this, or a (comat ? INT32_MAX : INT64_MAX) check, along
>>> with the ifdefary and predicates required to make that compile.
>>>
>>> But there's not a CPU today which can actually move 2G of data (which is
>>> 4G of L1d bandwidth) without suffering the watchdog (especially as we've
>>> just read it once for strlen(), so that's 6G of bandwidth), nor do I
>>> expect this to change in the forseeable future.
>>>
>>> There's some boundary (probably far lower) beyond which we can't use the
>>> algorithm here.
>>>
>>> There wants to be some limit, and I don't feel it is necessary to make
>>> it variable on the guest type.
>> Sure. My question was merely because of the special mentioning of 32-bit /
>> compat guests. I'm fine with the universal limit, and I'd also be fine
>> with a lower (universal) bound. All I'm after is that the (to me at least)
>> confusing comments be adjusted.
> 
> How about 16k then?

Might be okay. If I was to pick a value, I'd use 64k.

Jan
diff mbox series

Patch

diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index 70e7dff87488..56bd6c6f5d9c 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -24,6 +24,7 @@ 
 CHECK_compile_info;
 CHECK_feature_info;
 CHECK_build_id;
+CHECK_var_string;
 #endif
 
 enum system_state system_state = SYS_STATE_early_boot;
@@ -469,6 +470,66 @@  static int __init cf_check param_init(void)
 __initcall(param_init);
 #endif
 
+static long xenver_varstring_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    const char *str = NULL;
+    size_t sz = 0;
+    union {
+        xen_capabilities_info_t info;
+    } u;
+    struct xen_var_string user_str;
+
+    switch ( cmd )
+    {
+    case XENVER_extraversion2:
+        str = xen_extra_version();
+        break;
+
+    case XENVER_changeset2:
+        str = xen_changeset();
+        break;
+
+    case XENVER_commandline2:
+        str = saved_cmdline;
+        break;
+
+    case XENVER_capabilities2:
+        memset(u.info, 0, sizeof(u.info));
+        arch_get_xen_caps(&u.info);
+        str = u.info;
+        break;
+
+    default:
+        ASSERT_UNREACHABLE();
+        break;
+    }
+
+    if ( !str ||
+         !(sz = strlen(str)) )
+        return -ENODATA; /* failsafe */
+
+    if ( sz > INT32_MAX )
+        return -E2BIG; /* Compat guests.  2G ought to be plenty. */
+
+    if ( guest_handle_is_null(arg) ) /* Length request */
+        return sz;
+
+    if ( copy_from_guest(&user_str, arg, 1) )
+        return -EFAULT;
+
+    if ( user_str.len == 0 )
+        return -EINVAL;
+
+    if ( sz > user_str.len )
+        return -ENOBUFS;
+
+    if ( copy_to_guest_offset(arg, offsetof(struct xen_var_string, buf),
+                              str, sz) )
+        return -EFAULT;
+
+    return sz;
+}
+
 long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     bool_t deny = !!xsm_xen_version(XSM_OTHER, cmd);
@@ -670,6 +731,14 @@  long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         return sz;
     }
+
+    case XENVER_extraversion2:
+    case XENVER_capabilities2:
+    case XENVER_changeset2:
+    case XENVER_commandline2:
+        if ( deny )
+            return -EPERM;
+        return xenver_varstring_op(cmd, arg);
     }
 
     return -ENOSYS;
diff --git a/xen/include/public/version.h b/xen/include/public/version.h
index c8325219f648..cf2d2ef38b54 100644
--- a/xen/include/public/version.h
+++ b/xen/include/public/version.h
@@ -19,12 +19,20 @@ 
 /* arg == NULL; returns major:minor (16:16). */
 #define XENVER_version      0
 
-/* arg == xen_extraversion_t. */
+/*
+ * arg == xen_extraversion_t.
+ *
+ * This API/ABI is broken.  Use XENVER_extraversion2 instead.
+ */
 #define XENVER_extraversion 1
 typedef char xen_extraversion_t[16];
 #define XEN_EXTRAVERSION_LEN (sizeof(xen_extraversion_t))
 
-/* arg == xen_compile_info_t. */
+/*
+ * arg == xen_compile_info_t.
+ *
+ * This API/ABI is broken and truncates data.
+ */
 #define XENVER_compile_info 2
 struct xen_compile_info {
     char compiler[64];
@@ -34,10 +42,20 @@  struct xen_compile_info {
 };
 typedef struct xen_compile_info xen_compile_info_t;
 
+/*
+ * arg == xen_capabilities_info_t.
+ *
+ * This API/ABI is broken.  Use XENVER_capabilities2 instead.
+ */
 #define XENVER_capabilities 3
 typedef char xen_capabilities_info_t[1024];
 #define XEN_CAPABILITIES_INFO_LEN (sizeof(xen_capabilities_info_t))
 
+/*
+ * arg == xen_changeset_info_t.
+ *
+ * This API/ABI is broken.  Use XENVER_changeset2 instead.
+ */
 #define XENVER_changeset 4
 typedef char xen_changeset_info_t[64];
 #define XEN_CHANGESET_INFO_LEN (sizeof(xen_changeset_info_t))
@@ -88,6 +106,11 @@  typedef struct xen_feature_info xen_feature_info_t;
  */
 #define XENVER_guest_handle 8
 
+/*
+ * arg == xen_commandline_t.
+ *
+ * This API/ABI is broken.  Use XENVER_commandline2 instead.
+ */
 #define XENVER_commandline 9
 typedef char xen_commandline_t[1024];
 
@@ -103,6 +126,35 @@  struct xen_build_id {
 };
 typedef struct xen_build_id xen_build_id_t;
 
+/*
+ * Container for an arbitrary variable length string.
+ */
+struct xen_var_string {
+    uint32_t len;                          /* IN:  size of buf[] in bytes. */
+    unsigned char buf[XEN_FLEX_ARRAY_DIM]; /* OUT: requested data.         */
+};
+typedef struct xen_var_string xen_var_string_t;
+
+/*
+ * arg == xenver_string_t
+ *
+ * Equivalent to the original ops, but with a non-truncating API/ABI.
+ *
+ * Passing arg == NULL is a request for size.  The returned size does not
+ * include a NUL terminator, and has a practical upper limit of INT32_MAX for
+ * 32bit guests.  This is expected to be plenty for the purpose.
+ *
+ * Otherwise, the input xenver_string_t provides the size of the following
+ * buffer.  Xen will fill the buffer, and return the number of bytes written
+ * (e.g. if the input buffer was longer than necessary).
+ *
+ * These hypercalls can fail, in which case they'll return -XEN_Exx.
+ */
+#define XENVER_extraversion2 11
+#define XENVER_capabilities2 12
+#define XENVER_changeset2    13
+#define XENVER_commandline2  14
+
 #endif /* __XEN_PUBLIC_VERSION_H__ */
 
 /*
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index f2bae220a6df..19cef4424add 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -172,6 +172,7 @@ 
 ?	compile_info                    version.h
 ?	feature_info                    version.h
 ?	build_id                        version.h
+?	var_string                      version.h
 ?	xenoprof_init			xenoprof.h
 ?	xenoprof_passive		xenoprof.h
 ?	flask_access			xsm/flask_op.h
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 78225f68c15c..a671dcd0322e 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1777,15 +1777,18 @@  static int cf_check flask_xen_version(uint32_t op)
         /* These sub-ops ignore the permission checks and return data. */
         return 0;
     case XENVER_extraversion:
+    case XENVER_extraversion2:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_EXTRAVERSION, NULL);
     case XENVER_compile_info:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_COMPILE_INFO, NULL);
     case XENVER_capabilities:
+    case XENVER_capabilities2:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_CAPABILITIES, NULL);
     case XENVER_changeset:
+    case XENVER_changeset2:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_CHANGESET, NULL);
     case XENVER_pagesize:
@@ -1795,6 +1798,7 @@  static int cf_check flask_xen_version(uint32_t op)
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_GUEST_HANDLE, NULL);
     case XENVER_commandline:
+    case XENVER_commandline2:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_COMMANDLINE, NULL);
     case XENVER_build_id: