stubdom: foreignmemory: Fix build after 0dbb4be739c5

Message ID 20210713092019.7379-1-julien@xen.org (mailing list archive)
State New

Commit Message

Julien Grall July 13, 2021, 9:20 a.m. UTC
From: Julien Grall <jgrall@amazon.com>

Commit 0dbb4be739c5 added the inclusion of xenctrl.h from private.h and
wrecked the build in an interesting way:

In file included from xen/stubdom/include/xen/domctl.h:39:0,
                 from xen/tools/include/xenctrl.h:36,
                 from private.h:4,
                 from minios.c:29:
xen/include/public/memory.h:407:5: error: expected specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
     XEN_GUEST_HANDLE_64(const_uint8) buffer;
     ^~~~~~~~~~~~~~~~~~~

This is happening because xenctrl.h defines __XEN_TOOLS__ and therefore
the public headers will start to expose the non-stable ABI. However,
xen.h has already been included by a mini-OS header beforehand. So
there is a mismatch in the way the headers are included.

For now solve it in a very simple (and gross) way by including
xenctrl.h before the mini-os headers.

Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE redefinition error")
Signed-off-by: Julien Grall <jgrall@amazon.com>

---

Cc: Andrew Cooper <andrew.cooper3@citrix.com>

I couldn't find a better way which would not result in reverting the patch
(and breaking the build on some systems) or involve a longer rework of the
headers.
---
 tools/libs/foreignmemory/minios.c | 8 ++++++++
 1 file changed, 8 insertions(+)
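
For readers unfamiliar with this class of failure, the mismatch described in
the commit message can be reproduced in a single file. This is only a sketch:
all FAKE_* names are hypothetical stand-ins (they are not Xen identifiers), and
the two guarded blocks mimic the same public header being reached twice, once
before and once after the tools macro is defined.

```c
/*
 * First inclusion of a (hypothetical) public header, as reached through a
 * mini-os style header. FAKE_XEN_TOOLS is not defined yet, so the
 * tools-only macro is not provided.
 */
#ifndef FAKE_PUBLIC_XEN_H
#define FAKE_PUBLIC_XEN_H
#ifdef FAKE_XEN_TOOLS
#define FAKE_GUEST_HANDLE_64(type) type *
#endif
#endif /* FAKE_PUBLIC_XEN_H */

/* What a xenctrl.h style header does later: define the tools macro ... */
#define FAKE_XEN_TOOLS 1

/*
 * ... and include the public header again. The include guard is already
 * set, so the body is skipped and the tools-only macro stays undefined.
 */
#ifndef FAKE_PUBLIC_XEN_H
#define FAKE_PUBLIC_XEN_H
#ifdef FAKE_XEN_TOOLS
#define FAKE_GUEST_HANDLE_64(type) type *
#endif
#endif /* FAKE_PUBLIC_XEN_H */

/* Returns 1 if the tools-only macro ended up defined, 0 otherwise. */
int fake_handle_macro_visible(void)
{
#ifdef FAKE_GUEST_HANDLE_64
    return 1;
#else
    return 0;
#endif
}
```

Moving the `#define FAKE_XEN_TOOLS 1` line above the first guarded block makes
the function return 1, which is the effect of including xenctrl.h before the
mini-os headers.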

Comments

Andrew Cooper July 13, 2021, 9:25 a.m. UTC | #1
On 13/07/2021 10:20, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
>
> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h and
> wreck the build in an interesting way:
>
> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>                  from xen/tools/include/xenctrl.h:36,
>                  from private.h:4,
>                  from minios.c:29:
> xen/include/public/memory.h:407:5: error: expected specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>      XEN_GUEST_HANDLE_64(const_uint8) buffer;
>      ^~~~~~~~~~~~~~~~~~~
>
> This is happening because xenctrl.h defines __XEN_TOOLS__ and therefore
> the public headers will start to expose the non-stable ABI. However,
> xen.h has already been included by a mini-OS header before hand. So
> there is a mismatch in the way the headers are included.
>
> For now solve it in a very simple (and gross) way by including
> xenctrl.h before the mini-os headers.
>
> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE redefinition error")
> Signed-off-by: Julien Grall <jgrall@amazon.com>
>
> ---
>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>
> I couldn't find a better way with would not result to revert the patch
> (and break build on some system) or involve a longer rework of the
> headers.
> ---
>  tools/libs/foreignmemory/minios.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/tools/libs/foreignmemory/minios.c b/tools/libs/foreignmemory/minios.c
> index c5453736d598..d7b3f0e1c823 100644
> --- a/tools/libs/foreignmemory/minios.c
> +++ b/tools/libs/foreignmemory/minios.c
> @@ -17,6 +17,14 @@
>   * Copyright 2007-2008 Samuel Thibault <samuel.thibault@eu.citrix.com>.
>   */
>  
> +/*
> + * xenctlr.h

xenctrl.h

Otherwise, Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

>  currently defines __XEN_TOOLS__ which affects what is
> + * exposed by Xen headers. As the define needs to be set consistently,
> + * we want to include xenctrl.h before the mini-os headers (they include
> + * public headers).
> + */
> +#include <xenctrl.h>
> +
>  #include <mini-os/types.h>
>  #include <mini-os/os.h>
>  #include <mini-os/mm.h>
Juergen Gross July 13, 2021, 9:27 a.m. UTC | #2
On 13.07.21 11:20, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h and
> wreck the build in an interesting way:
> 
> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>                   from xen/tools/include/xenctrl.h:36,
>                   from private.h:4,
>                   from minios.c:29:
> xen/include/public/memory.h:407:5: error: expected specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>       XEN_GUEST_HANDLE_64(const_uint8) buffer;
>       ^~~~~~~~~~~~~~~~~~~
> 
> This is happening because xenctrl.h defines __XEN_TOOLS__ and therefore
> the public headers will start to expose the non-stable ABI. However,
> xen.h has already been included by a mini-OS header before hand. So
> there is a mismatch in the way the headers are included.
> 
> For now solve it in a very simple (and gross) way by including
> xenctrl.h before the mini-os headers.
> 
> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE redefinition error")
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
> 
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> I couldn't find a better way with would not result to revert the patch
> (and break build on some system) or involve a longer rework of the
> headers.

Just adding a "#define __XEN_TOOLS__" before the #include statements
doesn't work?

> ---
>   tools/libs/foreignmemory/minios.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/tools/libs/foreignmemory/minios.c b/tools/libs/foreignmemory/minios.c
> index c5453736d598..d7b3f0e1c823 100644
> --- a/tools/libs/foreignmemory/minios.c
> +++ b/tools/libs/foreignmemory/minios.c
> @@ -17,6 +17,14 @@
>    * Copyright 2007-2008 Samuel Thibault <samuel.thibault@eu.citrix.com>.
>    */
>   
> +/*
> + * xenctlr.h currently defines __XEN_TOOLS__ which affects what is

Typo, should be xenctrl.h

> + * exposed by Xen headers. As the define needs to be set consistently,
> + * we want to include xenctrl.h before the mini-os headers (they include
> + * public headers).
> + */
> +#include <xenctrl.h>
> +
>   #include <mini-os/types.h>
>   #include <mini-os/os.h>
>   #include <mini-os/mm.h>
> 

Juergen
Julien Grall July 13, 2021, 9:31 a.m. UTC | #3
On 13/07/2021 10:27, Juergen Gross wrote:
> On 13.07.21 11:20, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h and
>> wreck the build in an interesting way:
>>
>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>                   from xen/tools/include/xenctrl.h:36,
>>                   from private.h:4,
>>                   from minios.c:29:
>> xen/include/public/memory.h:407:5: error: expected 
>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>       XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>       ^~~~~~~~~~~~~~~~~~~
>>
>> This is happening because xenctrl.h defines __XEN_TOOLS__ and therefore
>> the public headers will start to expose the non-stable ABI. However,
>> xen.h has already been included by a mini-OS header before hand. So
>> there is a mismatch in the way the headers are included.
>>
>> For now solve it in a very simple (and gross) way by including
>> xenctrl.h before the mini-os headers.
>>
>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE 
>> redefinition error")
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> ---
>>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> I couldn't find a better way with would not result to revert the patch
>> (and break build on some system) or involve a longer rework of the
>> headers.
> 
> Just adding a "#define __XEN_TOOLS__" before the #include statements
> doesn't work?
It works, but if someone decides to rework the header and drop
__XEN_TOOLS__, we would still define it in minios.c (we technically don't
need it). So I find that solution a lot worse than what I wrote.

> 
>> ---
>>   tools/libs/foreignmemory/minios.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/tools/libs/foreignmemory/minios.c 
>> b/tools/libs/foreignmemory/minios.c
>> index c5453736d598..d7b3f0e1c823 100644
>> --- a/tools/libs/foreignmemory/minios.c
>> +++ b/tools/libs/foreignmemory/minios.c
>> @@ -17,6 +17,14 @@
>>    * Copyright 2007-2008 Samuel Thibault <samuel.thibault@eu.citrix.com>.
>>    */
>> +/*
>> + * xenctlr.h currently defines __XEN_TOOLS__ which affects what is
> 
> Typo, should be xenctrl.h

I will fix it.
Andrew Cooper July 13, 2021, 9:35 a.m. UTC | #4
On 13/07/2021 10:27, Juergen Gross wrote:
> On 13.07.21 11:20, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h and
>> wreck the build in an interesting way:
>>
>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>                   from xen/tools/include/xenctrl.h:36,
>>                   from private.h:4,
>>                   from minios.c:29:
>> xen/include/public/memory.h:407:5: error: expected
>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>       XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>       ^~~~~~~~~~~~~~~~~~~
>>
>> This is happening because xenctrl.h defines __XEN_TOOLS__ and therefore
>> the public headers will start to expose the non-stable ABI. However,
>> xen.h has already been included by a mini-OS header before hand. So
>> there is a mismatch in the way the headers are included.
>>
>> For now solve it in a very simple (and gross) way by including
>> xenctrl.h before the mini-os headers.
>>
>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE
>> redefinition error")
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> ---
>>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> I couldn't find a better way with would not result to revert the patch
>> (and break build on some system) or involve a longer rework of the
>> headers.
>
> Just adding a "#define __XEN_TOOLS__" before the #include statements
> doesn't work?

Not really, no.

libxenforeignmem has nothing at all to do with any Xen unstable
interfaces.  Including xenctrl.h in the first place was wrong, because
it is an unstable library.  By extension, the use of XC_PAGE_SIZE is
also wrong.

This all needs reverting/reworking to avoid making the stable libraries
depend on unstable ones, but in the short term we also need to unbreak
the CI.

~Andrew
Juergen Gross July 13, 2021, 10:35 a.m. UTC | #5
On 13.07.21 11:31, Julien Grall wrote:
> 
> 
> On 13/07/2021 10:27, Juergen Gross wrote:
>> On 13.07.21 11:20, Julien Grall wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h and
>>> wreck the build in an interesting way:
>>>
>>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>>                   from xen/tools/include/xenctrl.h:36,
>>>                   from private.h:4,
>>>                   from minios.c:29:
>>> xen/include/public/memory.h:407:5: error: expected 
>>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>>       XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>>       ^~~~~~~~~~~~~~~~~~~
>>>
>>> This is happening because xenctrl.h defines __XEN_TOOLS__ and therefore
>>> the public headers will start to expose the non-stable ABI. However,
>>> xen.h has already been included by a mini-OS header before hand. So
>>> there is a mismatch in the way the headers are included.
>>>
>>> For now solve it in a very simple (and gross) way by including
>>> xenctrl.h before the mini-os headers.
>>>
>>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE 
>>> redefinition error")
>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>
>>> ---
>>>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>
>>> I couldn't find a better way with would not result to revert the patch
>>> (and break build on some system) or involve a longer rework of the
>>> headers.
>>
>> Just adding a "#define __XEN_TOOLS__" before the #include statements
>> doesn't work?
> It works but if someone decides to the rework the header and drop 
> __XEN_TOOLS__ we would still define in minios.c (we technically don't 
> need it). So I find the solution a lot worse than what I wrote.

Hmm, yes.


Juergen
Julien Grall July 13, 2021, 11:21 a.m. UTC | #6
Hi Andrew,

On 13/07/2021 10:35, Andrew Cooper wrote:
> On 13/07/2021 10:27, Juergen Gross wrote:
>> On 13.07.21 11:20, Julien Grall wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h and
>>> wreck the build in an interesting way:
>>>
>>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>>                    from xen/tools/include/xenctrl.h:36,
>>>                    from private.h:4,
>>>                    from minios.c:29:
>>> xen/include/public/memory.h:407:5: error: expected
>>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>>        XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>>        ^~~~~~~~~~~~~~~~~~~
>>>
>>> This is happening because xenctrl.h defines __XEN_TOOLS__ and therefore
>>> the public headers will start to expose the non-stable ABI. However,
>>> xen.h has already been included by a mini-OS header before hand. So
>>> there is a mismatch in the way the headers are included.
>>>
>>> For now solve it in a very simple (and gross) way by including
>>> xenctrl.h before the mini-os headers.
>>>
>>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE
>>> redefinition error")
>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>
>>> ---
>>>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>
>>> I couldn't find a better way with would not result to revert the patch
>>> (and break build on some system) or involve a longer rework of the
>>> headers.
>>
>> Just adding a "#define __XEN_TOOLS__" before the #include statements
>> doesn't work?
> 
> Not really, no.
> 
> libxenforeignmem has nothing at all to do with any Xen unstable
> interfaces.  Including xenctrl.h in the first place was wrong, because
> it is an unstable library.  By extension, the use of XC_PAGE_SIZE is
> also wrong.

Well... Previously we were using PAGE_SIZE which is just plain wrong on Arm.

At the moment, we don't have a way to query the page granularity of the
hypervisor. But we know it can't change because of the way the current
ABI was designed. Hence using XC_PAGE_SIZE is the best of the options we
had until we go to ABIv2.

Cheers,
Andrew Cooper July 13, 2021, 11:23 a.m. UTC | #7
On 13/07/2021 12:21, Julien Grall wrote:
> Hi Andrew,
>
> On 13/07/2021 10:35, Andrew Cooper wrote:
>> On 13/07/2021 10:27, Juergen Gross wrote:
>>> On 13.07.21 11:20, Julien Grall wrote:
>>>> From: Julien Grall <jgrall@amazon.com>
>>>>
>>>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h and
>>>> wreck the build in an interesting way:
>>>>
>>>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>>>                    from xen/tools/include/xenctrl.h:36,
>>>>                    from private.h:4,
>>>>                    from minios.c:29:
>>>> xen/include/public/memory.h:407:5: error: expected
>>>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>>>        XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>>>        ^~~~~~~~~~~~~~~~~~~
>>>>
>>>> This is happening because xenctrl.h defines __XEN_TOOLS__ and
>>>> therefore
>>>> the public headers will start to expose the non-stable ABI. However,
>>>> xen.h has already been included by a mini-OS header before hand. So
>>>> there is a mismatch in the way the headers are included.
>>>>
>>>> For now solve it in a very simple (and gross) way by including
>>>> xenctrl.h before the mini-os headers.
>>>>
>>>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE
>>>> redefinition error")
>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>
>>>> ---
>>>>
>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>
>>>> I couldn't find a better way with would not result to revert the patch
>>>> (and break build on some system) or involve a longer rework of the
>>>> headers.
>>>
>>> Just adding a "#define __XEN_TOOLS__" before the #include statements
>>> doesn't work?
>>
>> Not really, no.
>>
>> libxenforeignmem has nothing at all to do with any Xen unstable
>> interfaces.  Including xenctrl.h in the first place was wrong, because
>> it is an unstable library.  By extension, the use of XC_PAGE_SIZE is
>> also wrong.
>
> Well... Previously we were using PAGE_SIZE which is just plain wrong
> on Arm.
>
> At the moment, we don't have a way to query the page granularity of
> the hypervisor. But we know it can't change because of the way the
> current ABI was designed. Hence why using XC_PAGE_SIZE is the best of
> option we had until we go to ABIv2.

Still doesn't mean that XC_PAGE_SIZE was ok to use.

Sounds like the constant needs moving into the Xen public headers, and
the inclusions of xenctrl.h into stable libraries needs reverting.

~Andrew
Julien Grall July 13, 2021, 11:53 a.m. UTC | #8
Hi Andrew,

On 13/07/2021 12:23, Andrew Cooper wrote:
> On 13/07/2021 12:21, Julien Grall wrote:
>> Hi Andrew,
>>
>> On 13/07/2021 10:35, Andrew Cooper wrote:
>>> On 13/07/2021 10:27, Juergen Gross wrote:
>>>> On 13.07.21 11:20, Julien Grall wrote:
>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>
>>>>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h and
>>>>> wreck the build in an interesting way:
>>>>>
>>>>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>>>>                     from xen/tools/include/xenctrl.h:36,
>>>>>                     from private.h:4,
>>>>>                     from minios.c:29:
>>>>> xen/include/public/memory.h:407:5: error: expected
>>>>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>>>>         XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>>>>         ^~~~~~~~~~~~~~~~~~~
>>>>>
>>>>> This is happening because xenctrl.h defines __XEN_TOOLS__ and
>>>>> therefore
>>>>> the public headers will start to expose the non-stable ABI. However,
>>>>> xen.h has already been included by a mini-OS header before hand. So
>>>>> there is a mismatch in the way the headers are included.
>>>>>
>>>>> For now solve it in a very simple (and gross) way by including
>>>>> xenctrl.h before the mini-os headers.
>>>>>
>>>>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE
>>>>> redefinition error")
>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>
>>>>> ---
>>>>>
>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>
>>>>> I couldn't find a better way with would not result to revert the patch
>>>>> (and break build on some system) or involve a longer rework of the
>>>>> headers.
>>>>
>>>> Just adding a "#define __XEN_TOOLS__" before the #include statements
>>>> doesn't work?
>>>
>>> Not really, no.
>>>
>>> libxenforeignmem has nothing at all to do with any Xen unstable
>>> interfaces.  Including xenctrl.h in the first place was wrong, because
>>> it is an unstable library.  By extension, the use of XC_PAGE_SIZE is
>>> also wrong.
>>
>> Well... Previously we were using PAGE_SIZE which is just plain wrong
>> on Arm.
>>
>> At the moment, we don't have a way to query the page granularity of
>> the hypervisor. But we know it can't change because of the way the
>> current ABI was designed. Hence why using XC_PAGE_SIZE is the best of
>> option we had until we go to ABIv2.
> 
> Still doesn't mean that XC_PAGE_SIZE was ok to use.

Note that I wrote "best of the options". The series has been sitting for
ages with no-one answering... You could have provided your option back 
then if you thought it wasn't a good use...

> 
> Sounds like the constant needs moving into the Xen public headers, and
> the inclusions of xenctrl.h into stable libraries needs reverting.

This could work. Are you planning to work on it?

Cheers,
Andrew Cooper July 13, 2021, 12:39 p.m. UTC | #9
On 13/07/2021 12:53, Julien Grall wrote:
> Hi Andrew,
>
> On 13/07/2021 12:23, Andrew Cooper wrote:
>> On 13/07/2021 12:21, Julien Grall wrote:
>>> Hi Andrew,
>>>
>>> On 13/07/2021 10:35, Andrew Cooper wrote:
>>>> On 13/07/2021 10:27, Juergen Gross wrote:
>>>>> On 13.07.21 11:20, Julien Grall wrote:
>>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>>
>>>>>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h
>>>>>> and
>>>>>> wreck the build in an interesting way:
>>>>>>
>>>>>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>>>>>                     from xen/tools/include/xenctrl.h:36,
>>>>>>                     from private.h:4,
>>>>>>                     from minios.c:29:
>>>>>> xen/include/public/memory.h:407:5: error: expected
>>>>>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>>>>>         XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>>>>>         ^~~~~~~~~~~~~~~~~~~
>>>>>>
>>>>>> This is happening because xenctrl.h defines __XEN_TOOLS__ and
>>>>>> therefore
>>>>>> the public headers will start to expose the non-stable ABI. However,
>>>>>> xen.h has already been included by a mini-OS header before hand. So
>>>>>> there is a mismatch in the way the headers are included.
>>>>>>
>>>>>> For now solve it in a very simple (and gross) way by including
>>>>>> xenctrl.h before the mini-os headers.
>>>>>>
>>>>>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE
>>>>>> redefinition error")
>>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>>
>>>>>> I couldn't find a better way with would not result to revert the
>>>>>> patch
>>>>>> (and break build on some system) or involve a longer rework of the
>>>>>> headers.
>>>>>
>>>>> Just adding a "#define __XEN_TOOLS__" before the #include statements
>>>>> doesn't work?
>>>>
>>>> Not really, no.
>>>>
>>>> libxenforeignmem has nothing at all to do with any Xen unstable
>>>> interfaces.  Including xenctrl.h in the first place was wrong, because
>>>> it is an unstable library.  By extension, the use of XC_PAGE_SIZE is
>>>> also wrong.
>>>
>>> Well... Previously we were using PAGE_SIZE which is just plain wrong
>>> on Arm.
>>>
>>> At the moment, we don't have a way to query the page granularity of
>>> the hypervisor. But we know it can't change because of the way the
>>> current ABI was designed. Hence why using XC_PAGE_SIZE is the best of
>>> option we had until we go to ABIv2.
>>
>> Still doesn't mean that XC_PAGE_SIZE was ok to use.
>
> Note that I wrote "best of the option". The series has been sitting
> for ages with no-one answering... You could have provided your option
> back then if you thought it wasn't a good use...

On a series I wasn't even CC'd on?

And no one had even bothered to compile-test?

>
>>
>> Sounds like the constant needs moving into the Xen public headers, and
>> the inclusions of xenctrl.h into stable libraries needs reverting.
>
> This could work. Are you planning to work on it?

No.  I don't have enough time to do my own work thanks to all the CI
breakage and regressions being committed.

This needs fixing, or the original series reverting for 4.16 because the
current form (with or without this emergency build fix) isn't acceptable
to release with.

~Andrew
Julien Grall July 13, 2021, 1 p.m. UTC | #10
On 13/07/2021 13:39, Andrew Cooper wrote:
> On 13/07/2021 12:53, Julien Grall wrote:
>> Hi Andrew,
>>
>> On 13/07/2021 12:23, Andrew Cooper wrote:
>>> On 13/07/2021 12:21, Julien Grall wrote:
>>>> Hi Andrew,
>>>>
>>>> On 13/07/2021 10:35, Andrew Cooper wrote:
>>>>> On 13/07/2021 10:27, Juergen Gross wrote:
>>>>>> On 13.07.21 11:20, Julien Grall wrote:
>>>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>>>
>>>>>>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h
>>>>>>> and
>>>>>>> wreck the build in an interesting way:
>>>>>>>
>>>>>>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>>>>>>                      from xen/tools/include/xenctrl.h:36,
>>>>>>>                      from private.h:4,
>>>>>>>                      from minios.c:29:
>>>>>>> xen/include/public/memory.h:407:5: error: expected
>>>>>>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>>>>>>          XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>>>>>>          ^~~~~~~~~~~~~~~~~~~
>>>>>>>
>>>>>>> This is happening because xenctrl.h defines __XEN_TOOLS__ and
>>>>>>> therefore
>>>>>>> the public headers will start to expose the non-stable ABI. However,
>>>>>>> xen.h has already been included by a mini-OS header before hand. So
>>>>>>> there is a mismatch in the way the headers are included.
>>>>>>>
>>>>>>> For now solve it in a very simple (and gross) way by including
>>>>>>> xenctrl.h before the mini-os headers.
>>>>>>>
>>>>>>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE
>>>>>>> redefinition error")
>>>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>>>
>>>>>>> I couldn't find a better way with would not result to revert the
>>>>>>> patch
>>>>>>> (and break build on some system) or involve a longer rework of the
>>>>>>> headers.
>>>>>>
>>>>>> Just adding a "#define __XEN_TOOLS__" before the #include statements
>>>>>> doesn't work?
>>>>>
>>>>> Not really, no.
>>>>>
>>>>> libxenforeignmem has nothing at all to do with any Xen unstable
>>>>> interfaces.  Including xenctrl.h in the first place was wrong, because
>>>>> it is an unstable library.  By extension, the use of XC_PAGE_SIZE is
>>>>> also wrong.
>>>>
>>>> Well... Previously we were using PAGE_SIZE which is just plain wrong
>>>> on Arm.
>>>>
>>>> At the moment, we don't have a way to query the page granularity of
>>>> the hypervisor. But we know it can't change because of the way the
>>>> current ABI was designed. Hence why using XC_PAGE_SIZE is the best of
>>>> option we had until we go to ABIv2.
>>>
>>> Still doesn't mean that XC_PAGE_SIZE was ok to use.
>>
>> Note that I wrote "best of the option". The series has been sitting
>> for ages with no-one answering... You could have provided your option
>> back then if you thought it wasn't a good use...
> 
> On a series I wasn't even CC'd on?

You had the link on IRC because we discussed it.

> 
> And noone had even bothered to compile test?

Well, that was a mistake. At the same time, if it had compiled, the "issue"
you describe would have gone unnoticed. ;)

>>
>>>
>>> Sounds like the constant needs moving into the Xen public headers, and
>>> the inclusions of xenctrl.h into stable libraries needs reverting.
>>
>> This could work. Are you planning to work on it?
> 
> No.  I don't have enough time to do my own work thanks to all the CI
> breakage and regressions being committed.
> This needs fixing, or the original series reverting for 4.16 because the
> current form (with or without this emergency build fix) isn't acceptable
> to release with.
I disagree with this characterization. Yes, this is including a
non-stable header, but it doesn't link with a non-stable library.

In fact, reverting the series will bring back two issues:
   1) Xen tools will not build on all the distros
   2) Using PAGE_{SIZE, SHIFT} breaks the Arm tools because userspace is
not meant to rely on a given kernel page granularity.

So this doesn't look like a priority for 4.16. Although, it would be a 
nice clean-up to have so the libraries are more compliant.

Cheers,
Costin Lupu July 13, 2021, 1:46 p.m. UTC | #11
Hi guys,

On 7/13/21 4:00 PM, Julien Grall wrote:
> 
> 
> On 13/07/2021 13:39, Andrew Cooper wrote:
>> On 13/07/2021 12:53, Julien Grall wrote:
>>> Hi Andrew,
>>>
>>> On 13/07/2021 12:23, Andrew Cooper wrote:
>>>> On 13/07/2021 12:21, Julien Grall wrote:
>>>>> Hi Andrew,
>>>>>
>>>>> On 13/07/2021 10:35, Andrew Cooper wrote:
>>>>>> On 13/07/2021 10:27, Juergen Gross wrote:
>>>>>>> On 13.07.21 11:20, Julien Grall wrote:
>>>>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>>>>
>>>>>>>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h
>>>>>>>> and
>>>>>>>> wreck the build in an interesting way:
>>>>>>>>
>>>>>>>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>>>>>>>                      from xen/tools/include/xenctrl.h:36,
>>>>>>>>                      from private.h:4,
>>>>>>>>                      from minios.c:29:
>>>>>>>> xen/include/public/memory.h:407:5: error: expected
>>>>>>>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>>>>>>>          XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>>>>>>>          ^~~~~~~~~~~~~~~~~~~
>>>>>>>>
>>>>>>>> This is happening because xenctrl.h defines __XEN_TOOLS__ and
>>>>>>>> therefore
>>>>>>>> the public headers will start to expose the non-stable ABI.
>>>>>>>> However,
>>>>>>>> xen.h has already been included by a mini-OS header before hand. So
>>>>>>>> there is a mismatch in the way the headers are included.
>>>>>>>>
>>>>>>>> For now solve it in a very simple (and gross) way by including
>>>>>>>> xenctrl.h before the mini-os headers.
>>>>>>>>
>>>>>>>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE
>>>>>>>> redefinition error")
>>>>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>>>>
>>>>>>>> I couldn't find a better way with would not result to revert the
>>>>>>>> patch
>>>>>>>> (and break build on some system) or involve a longer rework of the
>>>>>>>> headers.
>>>>>>>
>>>>>>> Just adding a "#define __XEN_TOOLS__" before the #include statements
>>>>>>> doesn't work?
>>>>>>
>>>>>> Not really, no.
>>>>>>
>>>>>> libxenforeignmem has nothing at all to do with any Xen unstable
>>>>>> interfaces.  Including xenctrl.h in the first place was wrong,
>>>>>> because
>>>>>> it is an unstable library.  By extension, the use of XC_PAGE_SIZE is
>>>>>> also wrong.
>>>>>
>>>>> Well... Previously we were using PAGE_SIZE which is just plain wrong
>>>>> on Arm.
>>>>>
>>>>> At the moment, we don't have a way to query the page granularity of
>>>>> the hypervisor. But we know it can't change because of the way the
>>>>> current ABI was designed. Hence why using XC_PAGE_SIZE is the best of
>>>>> option we had until we go to ABIv2.
>>>>
>>>> Still doesn't mean that XC_PAGE_SIZE was ok to use.
>>>
>>> Note that I wrote "best of the option". The series has been sitting
>>> for ages with no-one answering... You could have provided your option
>>> back then if you thought it wasn't a good use...
>>
>> On a series I wasn't even CC'd on?
> 
> You had the link on IRC because we discussed it.
> 
>>
>> And noone had even bothered to compile test?
> 
> Well, that was a mistake. At the same time, if it compiled the "issue"
> you describe would have gone unnoticed. ;)
> 
>>>
>>>>
>>>> Sounds like the constant needs moving into the Xen public headers, and
>>>> the inclusions of xenctrl.h into stable libraries needs reverting.
>>>
>>> This could work. Are you planning to work on it?
>>
>> No.  I don't have enough time to do my own work thanks to all the CI
>> breakage and regressions being committed.
>> This needs fixing, or the original series reverting for 4.16 because the
>> current form (with or without this emergency build fix) isn't acceptable
>> to release with.
> I disagree with this caracterization. Yes, this is including a
> non-stable header but it doesn't link with non-stable library.
> 
> In fact, reverting the series will bring back two issues:
>   1) Xen tools will not build on all the distros
>   2) Using PAGE_{SIZE, SHIFT} break arm tools because the userspace is
> not meant to rely on a given kernel page granularity.
> 
> So this doesn't look like a priority for 4.16. Although, it would be a
> nice clean-up to have so the libraries are more compliant.

First of all, sorry for breaking the build.

As Jan already suggested on a different thread, we can fix this by
isolating the XC_PAGE_* definitions of the toolstack in a header of
their own. I'm open to suggestions regarding the name of the header (my
suggestion would be xenctrl_page.h) and path (I guess it should be in
tools/include, right?). Also, should we change the names of the macros
from XC_PAGE_* to something else in order to reflect that they are
toolstack-related rather than xenctrl-specific?

@Andrew: Can you please tell me why XC_PAGE_SIZE wasn't ok to use? I'm
asking this in order to fully understand the issue.


Cheers,
Costin
Juergen Gross July 13, 2021, 2 p.m. UTC | #12
On 13.07.21 15:46, Costin Lupu wrote:
> Hi guys,
> 
> On 7/13/21 4:00 PM, Julien Grall wrote:
>>
>>
>> On 13/07/2021 13:39, Andrew Cooper wrote:
>>> On 13/07/2021 12:53, Julien Grall wrote:
>>>> Hi Andrew,
>>>>
>>>> On 13/07/2021 12:23, Andrew Cooper wrote:
>>>>> On 13/07/2021 12:21, Julien Grall wrote:
>>>>>> Hi Andrew,
>>>>>>
>>>>>> On 13/07/2021 10:35, Andrew Cooper wrote:
>>>>>>> On 13/07/2021 10:27, Juergen Gross wrote:
>>>>>>>> On 13.07.21 11:20, Julien Grall wrote:
>>>>>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>>>>>
>>>>>>>>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h
>>>>>>>>> and
>>>>>>>>> wreck the build in an interesting way:
>>>>>>>>>
>>>>>>>>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>>>>>>>>                       from xen/tools/include/xenctrl.h:36,
>>>>>>>>>                       from private.h:4,
>>>>>>>>>                       from minios.c:29:
>>>>>>>>> xen/include/public/memory.h:407:5: error: expected
>>>>>>>>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>>>>>>>>           XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>>>>>>>>           ^~~~~~~~~~~~~~~~~~~
>>>>>>>>>
>>>>>>>>> This is happening because xenctrl.h defines __XEN_TOOLS__ and
>>>>>>>>> therefore
>>>>>>>>> the public headers will start to expose the non-stable ABI.
>>>>>>>>> However,
>>>>>>>>> xen.h has already been included by a mini-OS header before hand. So
>>>>>>>>> there is a mismatch in the way the headers are included.
>>>>>>>>>
>>>>>>>>> For now solve it in a very simple (and gross) way by including
>>>>>>>>> xenctrl.h before the mini-os headers.
>>>>>>>>>
>>>>>>>>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE
>>>>>>>>> redefinition error")
>>>>>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>>>>>
>>>>>>>>> I couldn't find a better way with would not result to revert the
>>>>>>>>> patch
>>>>>>>>> (and break build on some system) or involve a longer rework of the
>>>>>>>>> headers.
>>>>>>>>
>>>>>>>> Just adding a "#define __XEN_TOOLS__" before the #include statements
>>>>>>>> doesn't work?
>>>>>>>
>>>>>>> Not really, no.
>>>>>>>
>>>>>>> libxenforeignmem has nothing at all to do with any Xen unstable
>>>>>>> interfaces.  Including xenctrl.h in the first place was wrong,
>>>>>>> because
>>>>>>> it is an unstable library.  By extension, the use of XC_PAGE_SIZE is
>>>>>>> also wrong.
>>>>>>
>>>>>> Well... Previously we were using PAGE_SIZE which is just plain wrong
>>>>>> on Arm.
>>>>>>
>>>>>> At the moment, we don't have a way to query the page granularity of
>>>>>> the hypervisor. But we know it can't change because of the way the
>>>>>> current ABI was designed. Hence why using XC_PAGE_SIZE is the best of
>>>>>> option we had until we go to ABIv2.
>>>>>
>>>>> Still doesn't mean that XC_PAGE_SIZE was ok to use.
>>>>
>>>> Note that I wrote "best of the option". The series has been sitting
>>>> for ages with no-one answering... You could have provided your option
>>>> back then if you thought it wasn't a good use...
>>>
>>> On a series I wasn't even CC'd on?
>>
>> You had the link on IRC because we discussed it.
>>
>>>
>>> And noone had even bothered to compile test?
>>
>> Well, that was a mistake. At the same time, if it compiled the "issue"
>> you describe would have gone unnoticed. ;)
>>
>>>>
>>>>>
>>>>> Sounds like the constant needs moving into the Xen public headers, and
>>>>> the inclusions of xenctrl.h into stable libraries needs reverting.
>>>>
>>>> This could work. Are you planning to work on it?
>>>
>>> No.  I don't have enough time to do my own work thanks to all the CI
>>> breakage and regressions being committed.
>>> This needs fixing, or the original series reverting for 4.16 because the
>>> current form (with or without this emergency build fix) isn't acceptable
>>> to release with.
>> I disagree with this caracterization. Yes, this is including a
>> non-stable header but it doesn't link with non-stable library.
>>
>> In fact, reverting the series will bring back two issues:
>>    1) Xen tools will not build on all the distros
>>    2) Using PAGE_{SIZE, SHIFT} break arm tools because the userspace is
>> not meant to rely on a given kernel page granularity.
>>
>> So this doesn't look like a priority for 4.16. Although, it would be a
>> nice clean-up to have so the libraries are more compliant.
> 
> First of all, sorry for breaking the build.
> 
> As Jan already suggested on a different thread, we can fix this by
> isolating the XC_PAGE_* definitions of the toolstack in a header of
> their own. I'm open to suggestions regarding the name of the header (my
> suggestion would be xenctrl_page.h) and path (I guess it should be in
> tools/include, right?). Also, should we change the names of the macros
> from XC_PAGE_* to something else in order to reflect that they are
> toolstack related instead of xenctrl specific?

I would rather have that definition in xen/include/public/arch-*.h as
this is a hypervisor attribute.

And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
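
For illustration only, a minimal sketch of what such per-architecture
definitions could look like. The XEN_PAGE_* names follow the suggestion
above, the shift of 12 mirrors the 4k granularity discussed in this
thread, and the helper xen_pages_for() is a hypothetical name that does
not exist in the tree.

```c
#include <assert.h>

/* Hypothetical placement: one of xen/include/public/arch-*.h. */
#define XEN_PAGE_SHIFT 12
#define XEN_PAGE_SIZE  (1UL << XEN_PAGE_SHIFT)
#define XEN_PAGE_MASK  (~(XEN_PAGE_SIZE - 1))

/* Compile-time check that the macros describe a 4k granule. */
static_assert(XEN_PAGE_SIZE == 4096, "XEN_PAGE_SIZE must be 4k here");

/* Hypothetical helper: number of Xen pages needed to cover 'bytes'. */
static inline unsigned long xen_pages_for(unsigned long bytes)
{
    /* Whole pages, plus one more if there is a partial page left over. */
    return (bytes >> XEN_PAGE_SHIFT) + ((bytes & ~XEN_PAGE_MASK) != 0);
}
```

With definitions like these, a stable library would not need xenctrl.h
just to learn the page granularity.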


Juergen
Jan Beulich July 13, 2021, 2:14 p.m. UTC | #13
On 13.07.2021 16:00, Juergen Gross wrote:
> On 13.07.21 15:46, Costin Lupu wrote:
>> Hi guys,
>>
>> On 7/13/21 4:00 PM, Julien Grall wrote:
>>>
>>>
>>> On 13/07/2021 13:39, Andrew Cooper wrote:
>>>> On 13/07/2021 12:53, Julien Grall wrote:
>>>>> Hi Andrew,
>>>>>
>>>>> On 13/07/2021 12:23, Andrew Cooper wrote:
>>>>>> On 13/07/2021 12:21, Julien Grall wrote:
>>>>>>> Hi Andrew,
>>>>>>>
>>>>>>> On 13/07/2021 10:35, Andrew Cooper wrote:
>>>>>>>> On 13/07/2021 10:27, Juergen Gross wrote:
>>>>>>>>> On 13.07.21 11:20, Julien Grall wrote:
>>>>>>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>>>>>>
>>>>>>>>>> Commit 0dbb4be739c5 add the inclusion of xenctrl.h from private.h
>>>>>>>>>> and
>>>>>>>>>> wreck the build in an interesting way:
>>>>>>>>>>
>>>>>>>>>> In file included from xen/stubdom/include/xen/domctl.h:39:0,
>>>>>>>>>>                       from xen/tools/include/xenctrl.h:36,
>>>>>>>>>>                       from private.h:4,
>>>>>>>>>>                       from minios.c:29:
>>>>>>>>>> xen/include/public/memory.h:407:5: error: expected
>>>>>>>>>> specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
>>>>>>>>>>           XEN_GUEST_HANDLE_64(const_uint8) buffer;
>>>>>>>>>>           ^~~~~~~~~~~~~~~~~~~
>>>>>>>>>>
>>>>>>>>>> This is happening because xenctrl.h defines __XEN_TOOLS__ and
>>>>>>>>>> therefore
>>>>>>>>>> the public headers will start to expose the non-stable ABI.
>>>>>>>>>> However,
>>>>>>>>>> xen.h has already been included by a mini-OS header before hand. So
>>>>>>>>>> there is a mismatch in the way the headers are included.
>>>>>>>>>>
>>>>>>>>>> For now solve it in a very simple (and gross) way by including
>>>>>>>>>> xenctrl.h before the mini-os headers.
>>>>>>>>>>
>>>>>>>>>> Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE
>>>>>>>>>> redefinition error")
>>>>>>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>>>>>>
>>>>>>>>>> I couldn't find a better way with would not result to revert the
>>>>>>>>>> patch
>>>>>>>>>> (and break build on some system) or involve a longer rework of the
>>>>>>>>>> headers.
>>>>>>>>>
>>>>>>>>> Just adding a "#define __XEN_TOOLS__" before the #include statements
>>>>>>>>> doesn't work?
>>>>>>>>
>>>>>>>> Not really, no.
>>>>>>>>
>>>>>>>> libxenforeignmem has nothing at all to do with any Xen unstable
>>>>>>>> interfaces.  Including xenctrl.h in the first place was wrong,
>>>>>>>> because
>>>>>>>> it is an unstable library.  By extension, the use of XC_PAGE_SIZE is
>>>>>>>> also wrong.
>>>>>>>
>>>>>>> Well... Previously we were using PAGE_SIZE which is just plain wrong
>>>>>>> on Arm.
>>>>>>>
>>>>>>> At the moment, we don't have a way to query the page granularity of
>>>>>>> the hypervisor. But we know it can't change because of the way the
>>>>>>> current ABI was designed. Hence why using XC_PAGE_SIZE is the best of
>>>>>>> option we had until we go to ABIv2.
>>>>>>
>>>>>> Still doesn't mean that XC_PAGE_SIZE was ok to use.
>>>>>
>>>>> Note that I wrote "best of the option". The series has been sitting
>>>>> for ages with no-one answering... You could have provided your option
>>>>> back then if you thought it wasn't a good use...
>>>>
>>>> On a series I wasn't even CC'd on?
>>>
>>> You had the link on IRC because we discussed it.
>>>
>>>>
>>>> And noone had even bothered to compile test?
>>>
>>> Well, that was a mistake. At the same time, if it compiled the "issue"
>>> you describe would have gone unnoticed. ;)
>>>
>>>>>
>>>>>>
>>>>>> Sounds like the constant needs moving into the Xen public headers, and
>>>>>> the inclusions of xenctrl.h into stable libraries needs reverting.
>>>>>
>>>>> This could work. Are you planning to work on it?
>>>>
>>>> No.  I don't have enough time to do my own work thanks to all the CI
>>>> breakage and regressions being committed.
>>>> This needs fixing, or the original series reverting for 4.16 because the
>>>> current form (with or without this emergency build fix) isn't acceptable
>>>> to release with.
>>> I disagree with this caracterization. Yes, this is including a
>>> non-stable header but it doesn't link with non-stable library.
>>>
>>> In fact, reverting the series will bring back two issues:
>>>    1) Xen tools will not build on all the distros
>>>    2) Using PAGE_{SIZE, SHIFT} break arm tools because the userspace is
>>> not meant to rely on a given kernel page granularity.
>>>
>>> So this doesn't look like a priority for 4.16. Although, it would be a
>>> nice clean-up to have so the libraries are more compliant.
>>
>> First of all, sorry for breaking the build.
>>
>> As Jan already suggested on a different thread, we can fix this by
>> isolating the XC_PAGE_* definitions of the toolstack in a header of
>> their own. I'm open to suggestions regarding the name of the header (my
>> suggestion would be xenctrl_page.h) and path (I guess it should be in
>> tools/include, right?). Also, should we change the names of the macros
>> from XC_PAGE_* to something else in order to reflect that they are
>> toolstack related instead of xenctrl specific?
> 
> I would rather have that definition in xen/include/public/arch-*.h as
> this is a hypervisor attribute.
> 
> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.

Even that doesn't seem right to me, at least in principle. There shouldn't
be a build time setting when it may vary at runtime. IOW on Arm I think a
runtime query to the hypervisor would be needed instead. And thinking
even more generally, perhaps there could also be mixed (base) page sizes
in use at run time, so it may need to be a bit mask which gets returned.
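
To make the bit mask idea concrete, here is a rough sketch of how a
caller might decode such a value. No such query exists today; the
encoding (bit n set means a base page size of 1 << n bytes is in use)
and all names below are assumptions made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: bit n set => base page size of (1 << n) bytes. */
#define PGSZ_BIT(shift) (UINT64_C(1) << (shift))

/* Is the page size with the given shift present in the mask? */
static inline bool page_shift_supported(uint64_t mask, unsigned int shift)
{
    assert(shift < 64);
    return (mask & PGSZ_BIT(shift)) != 0;
}

/* Smallest page shift present in the mask, or -1 for an empty mask. */
static inline int smallest_page_shift(uint64_t mask)
{
    for (int shift = 0; shift < 64; shift++)
        if (mask & PGSZ_BIT(shift))
            return shift;
    return -1;
}
```

A mask of PGSZ_BIT(12) | PGSZ_BIT(14) | PGSZ_BIT(16) would then describe
a system using 4k, 16k and 64k base pages side by side.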

Jan
Julien Grall July 13, 2021, 2:19 p.m. UTC | #14
Hi Jan,

On 13/07/2021 15:14, Jan Beulich wrote:
>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
> 
> Even that doesn't seem right to me, at least in principle. There shouldn't
> be a build time setting when it may vary at runtime. IOW on Arm I think a
> runtime query to the hypervisor would be needed instead.

Yes, we want to be able to use the same userspace/OS without rebuilding
for a specific hypervisor page size.

> And thinking
> even more generally, perhaps there could also be mixed (base) page sizes
> in use at run time, so it may need to be a bit mask which gets returned.

I am not sure I understand this. Are you saying the hypervisor may use
different page sizes at the same time?

Cheers,
Jan Beulich July 13, 2021, 2:23 p.m. UTC | #15
On 13.07.2021 16:19, Julien Grall wrote:
> On 13/07/2021 15:14, Jan Beulich wrote:
>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>
>> Even that doesn't seem right to me, at least in principle. There shouldn't
>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>> runtime query to the hypervisor would be needed instead.
> 
> Yes, we want to be able to use the same userspace/OS without rebuilding 
> to a specific hypervisor page size.
> 
>> And thinking
>> even more generally, perhaps there could also be mixed (base) page sizes
>> in use at run time, so it may need to be a bit mask which gets returned.
> 
> I am not sure to understand this. Are you saying the hypervisor may use 
> at the same time different page size?

I think so, yes. And I further think the hypervisor could even allow its
guests to do so. There would be a distinction between the granularity at
which RAM gets allocated and the granularity at which page mappings (RAM
or other) can be established. Which yields an environment which I'd say
has no clear "system page size".

Jan
Juergen Gross July 13, 2021, 2:23 p.m. UTC | #16
On 13.07.21 16:19, Julien Grall wrote:
> Hi Jan,
> 
> On 13/07/2021 15:14, Jan Beulich wrote:
>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>
>> Even that doesn't seem right to me, at least in principle. There 
>> shouldn't
>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>> runtime query to the hypervisor would be needed instead.
> 
> Yes, we want to be able to use the same userspace/OS without rebuilding 
> to a specific hypervisor page size.

This define is used for accessing data of other domains. See the define
for XEN_PAGE_SIZE in xen/include/public/io/ring.h.

So it should be a constant (minimal) page size for all hypervisors and
guests of an architecture.


Juergen
Jan Beulich July 13, 2021, 2:28 p.m. UTC | #17
On 13.07.2021 16:23, Juergen Gross wrote:
> On 13.07.21 16:19, Julien Grall wrote:
>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>
>>> Even that doesn't seem right to me, at least in principle. There 
>>> shouldn't
>>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>>> runtime query to the hypervisor would be needed instead.
>>
>> Yes, we want to be able to use the same userspace/OS without rebuilding 
>> to a specific hypervisor page size.
> 
> This define is used for accessing data of other domains. See the define
> for XEN_PAGE_SIZE in xen/include/public/io/ring.h
> 
> So it should be a constant (minimal) page size for all hypervisors and
> guests of an architecture.

But that's only because of limitations baked into ring.h. For example,
a grant shouldn't be (address,attributes), but (address,order,attributes).
A frontend running in an OS with 16k page size could then still announce
a single ring "page", and a backend running in an OS with 4k page size
would still have no trouble mapping that ring. (The other way around
would of course get more interesting.)
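
A rough sketch of the (address,order,attributes) idea, purely for
illustration. The structure, the 4k base unit and both helper names are
hypothetical and do not correspond to any existing grant table
interface.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SHIFT 12 /* assumed 4k base unit for 'order' */

/* Hypothetical descriptor, not an existing Xen structure. */
struct grant_desc {
    uint64_t address;
    uint8_t  order;      /* region covers (1 << order) base pages */
    uint32_t attributes;
};

/* Size in bytes of the region described by the grant. */
static inline uint64_t grant_bytes(const struct grant_desc *g)
{
    assert(g->order < 64 - BASE_PAGE_SHIFT);
    return UINT64_C(1) << (BASE_PAGE_SHIFT + g->order);
}

/*
 * Number of backend pages needed to map the grant, or 0 when the grant
 * is smaller than one backend page (the "more interesting" direction).
 */
static inline uint64_t backend_pages_needed(const struct grant_desc *g,
                                            unsigned int backend_shift)
{
    uint64_t total = grant_bytes(g);

    if (total < (UINT64_C(1) << backend_shift))
        return 0;
    return total >> backend_shift;
}
```

A 16k frontend would announce its single ring "page" with order 2, and a
4k backend would map it as four pages.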

Jan
Juergen Gross July 13, 2021, 2:33 p.m. UTC | #18
On 13.07.21 16:28, Jan Beulich wrote:
> On 13.07.2021 16:23, Juergen Gross wrote:
>> On 13.07.21 16:19, Julien Grall wrote:
>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>>
>>>> Even that doesn't seem right to me, at least in principle. There
>>>> shouldn't
>>>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>>>> runtime query to the hypervisor would be needed instead.
>>>
>>> Yes, we want to be able to use the same userspace/OS without rebuilding
>>> to a specific hypervisor page size.
>>
>> This define is used for accessing data of other domains. See the define
>> for XEN_PAGE_SIZE in xen/include/public/io/ring.h
>>
>> So it should be a constant (minimal) page size for all hypervisors and
>> guests of an architecture.
> 
> But that's only because of limitations baked into ring.h. For example,
> a grant shouldn't be (address,attributes), but (address,order,attributes).
> A frontend running in an OS with 16k page size could then still announce
> a single ring "page", and a backend running in an OS with 4k page size
> would still have no trouble mapping that ring. (The other way around
> would of course get more interesting.)

Right. The current interfaces don't provide this ability. For those the
minimal size is "the right thing" IMO.


Juergen
Julien Grall July 13, 2021, 2:33 p.m. UTC | #19
On 13/07/2021 15:23, Jan Beulich wrote:
> On 13.07.2021 16:19, Julien Grall wrote:
>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>
>>> Even that doesn't seem right to me, at least in principle. There shouldn't
>>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>>> runtime query to the hypervisor would be needed instead.
>>
>> Yes, we want to be able to use the same userspace/OS without rebuilding
>> to a specific hypervisor page size.
>>
>>> And thinking
>>> even more generally, perhaps there could also be mixed (base) page sizes
>>> in use at run time, so it may need to be a bit mask which gets returned.
>>
>> I am not sure to understand this. Are you saying the hypervisor may use
>> at the same time different page size?
> 
> I think so, yes. And I further think the hypervisor could even allow its
> guests to do so.

This is already the case on Arm. We need to differentiate between the 
page size used by the guest and the one used by Xen for the stage-2 page 
table (what you call EPT on x86).

In this case, we are talking about the page size used by the hypervisor
to configure the stage-2 page table.

> There would be a distinction between the granularity at
> which RAM gets allocated and the granularity at which page mappings (RAM
> or other) can be established. Which yields an environment which I'd say
> has no clear "system page size".

I don't quite understand why you would allocate and establish the memory
with a different page size in the hypervisor. Can you give an example?

Cheers,
Julien Grall July 13, 2021, 2:38 p.m. UTC | #20
Hi Juergen,

On 13/07/2021 15:23, Juergen Gross wrote:
> On 13.07.21 16:19, Julien Grall wrote:
>> Hi Jan,
>>
>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>
>>> Even that doesn't seem right to me, at least in principle. There 
>>> shouldn't
>>> be a build time setting when it may vary at runtime. IOW on Arm I 
>>> think a
>>> runtime query to the hypervisor would be needed instead.
>>
>> Yes, we want to be able to use the same userspace/OS without 
>> rebuilding to a specific hypervisor page size.
> 
> This define is used for accessing data of other domains. See the define
> for XEN_PAGE_SIZE in xen/include/public/io/ring.h
> 
> So it should be a constant (minimal) page size for all hypervisors and
> guests of an architecture.

Do you mean the maximum rather than the minimum? If you use the minimum
(4KB), then you would not be able to map the page in the stage-2 if the
hypervisor is using 64KB.

Cheers,
Juergen Gross July 13, 2021, 3:09 p.m. UTC | #21
On 13.07.21 16:38, Julien Grall wrote:
> Hi Juergen,
> 
> On 13/07/2021 15:23, Juergen Gross wrote:
>> On 13.07.21 16:19, Julien Grall wrote:
>>> Hi Jan,
>>>
>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>>
>>>> Even that doesn't seem right to me, at least in principle. There 
>>>> shouldn't
>>>> be a build time setting when it may vary at runtime. IOW on Arm I 
>>>> think a
>>>> runtime query to the hypervisor would be needed instead.
>>>
>>> Yes, we want to be able to use the same userspace/OS without 
>>> rebuilding to a specific hypervisor page size.
>>
>> This define is used for accessing data of other domains. See the define
>> for XEN_PAGE_SIZE in xen/include/public/io/ring.h
>>
>> So it should be a constant (minimal) page size for all hypervisors and
>> guests of an architecture.
> 
> Do you mean the maximum rather than minimal? If you use the minimal 
> (4KB), then you would not be able to map the page in the stage-2 if the 
> hypervisor is using 64KB.

But this would mean that the current solution to use XC_PAGE_SIZE is
wrong, as this is 4k.


Juergen
Julien Grall July 13, 2021, 3:15 p.m. UTC | #22
Hi Juergen,

On 13/07/2021 16:09, Juergen Gross wrote:
> On 13.07.21 16:38, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 13/07/2021 15:23, Juergen Gross wrote:
>>> On 13.07.21 16:19, Julien Grall wrote:
>>>> Hi Jan,
>>>>
>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>> And I don't think it should be named XC_PAGE_*, but rather 
>>>>>> XEN_PAGE_*.
>>>>>
>>>>> Even that doesn't seem right to me, at least in principle. There 
>>>>> shouldn't
>>>>> be a build time setting when it may vary at runtime. IOW on Arm I 
>>>>> think a
>>>>> runtime query to the hypervisor would be needed instead.
>>>>
>>>> Yes, we want to be able to use the same userspace/OS without 
>>>> rebuilding to a specific hypervisor page size.
>>>
>>> This define is used for accessing data of other domains. See the define
>>> for XEN_PAGE_SIZE in xen/include/public/io/ring.h
>>>
>>> So it should be a constant (minimal) page size for all hypervisors and
>>> guests of an architecture.
>>
>> Do you mean the maximum rather than minimal? If you use the minimal 
>> (4KB), then you would not be able to map the page in the stage-2 if 
>> the hypervisor is using 64KB.
> 
> But this would mean that the current solution to use XC_PAGE_SIZE is
> wrong, as this is 4k.

The existing ABI is implicitly based on the hypervisor page
granularity (currently 4KB).

There is really no way we can support existing guests on a 64KB
hypervisor. But if we were going to break them, then we should consider
doing one of the following:
    1) use 64KB page granularity for ABI
    2) query the hypervisor page granularity at runtime

The ideal is 2) because it is more scalable for the future. We also need
to consider extending the PV protocol so the backend and frontend can
agree on the page size.
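
As a sketch of option 2), assuming a query interface that does not
exist today: the caller asks for the granularity and falls back to the
4KB implied by the existing ABI when the query is unavailable. The
callback type and the two fake_query_* stand-ins are hypothetical and
only there so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define ABI_V1_PAGE_SHIFT 12u /* 4KB, implied by the existing ABI */

/* Hypothetical query: returns 0 and fills *shift on success. */
typedef int (*page_shift_query_fn)(unsigned int *shift);

/* Use the queried granularity if available, else the ABI default. */
static inline unsigned int effective_page_shift(page_shift_query_fn query)
{
    unsigned int shift = 0;

    if (query != NULL && query(&shift) == 0) {
        /* Sanity check: never smaller than the existing ABI granule. */
        assert(shift >= ABI_V1_PAGE_SHIFT && shift < 32);
        return shift;
    }
    return ABI_V1_PAGE_SHIFT;
}

/* Stand-in for a hypervisor built with 64KB granularity. */
static int fake_query_64k(unsigned int *shift)
{
    *shift = 16;
    return 0;
}

/* Stand-in for a hypervisor that predates the query. */
static int fake_query_unsupported(unsigned int *shift)
{
    (void)shift;
    return -1;
}
```

The fallback keeps existing guests working on existing hypervisors,
which is the property the current ABI relies on.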

Cheers,
Juergen Gross July 13, 2021, 3:20 p.m. UTC | #23
On 13.07.21 17:15, Julien Grall wrote:
> Hi Juergen,
> 
> On 13/07/2021 16:09, Juergen Gross wrote:
>> On 13.07.21 16:38, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 13/07/2021 15:23, Juergen Gross wrote:
>>>> On 13.07.21 16:19, Julien Grall wrote:
>>>>> Hi Jan,
>>>>>
>>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>>> And I don't think it should be named XC_PAGE_*, but rather 
>>>>>>> XEN_PAGE_*.
>>>>>>
>>>>>> Even that doesn't seem right to me, at least in principle. There 
>>>>>> shouldn't
>>>>>> be a build time setting when it may vary at runtime. IOW on Arm I 
>>>>>> think a
>>>>>> runtime query to the hypervisor would be needed instead.
>>>>>
>>>>> Yes, we want to be able to use the same userspace/OS without 
>>>>> rebuilding to a specific hypervisor page size.
>>>>
>>>> This define is used for accessing data of other domains. See the define
>>>> for XEN_PAGE_SIZE in xen/include/public/io/ring.h
>>>>
>>>> So it should be a constant (minimal) page size for all hypervisors and
>>>> guests of an architecture.
>>>
>>> Do you mean the maximum rather than minimal? If you use the minimal 
>>> (4KB), then you would not be able to map the page in the stage-2 if 
>>> the hypervisor is using 64KB.
>>
>> But this would mean that the current solution to use XC_PAGE_SIZE is
>> wrong, as this is 4k.
> 
> The existing ABI is implicitely based on using the hypervisor page 
> granularity (currently 4KB).
> 
> There is really no way we can support existing guest on 64KB hypervisor. 
> But if we were going to break them, then we should consider to do one of 
> the following option:
>     1) use 64KB page granularity for ABI
>     2) query the hypervisor page granularity at runtime
> 
> The ideal is 2) because it is more scalable for the future. We also need 
> to consider to extend the PV protocol so the backend and frontend can 
> agree on the page size.

I absolutely agree, but my suggestion was to help find a proper way
to clean up the current interface mess. And this should be done the way
I suggested IMO.

A later interface extension for future guests can still be done on top
of that.


Juergen
Jan Beulich July 13, 2021, 3:52 p.m. UTC | #24
On 13.07.2021 16:33, Julien Grall wrote:
> On 13/07/2021 15:23, Jan Beulich wrote:
>> On 13.07.2021 16:19, Julien Grall wrote:
>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>>
>>>> Even that doesn't seem right to me, at least in principle. There shouldn't
>>>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>>>> runtime query to the hypervisor would be needed instead.
>>>
>>> Yes, we want to be able to use the same userspace/OS without rebuilding
>>> to a specific hypervisor page size.
>>>
>>>> And thinking
>>>> even more generally, perhaps there could also be mixed (base) page sizes
>>>> in use at run time, so it may need to be a bit mask which gets returned.
>>>
>>> I am not sure to understand this. Are you saying the hypervisor may use
>>> at the same time different page size?
>>
>> I think so, yes. And I further think the hypervisor could even allow its
>> guests to do so.
> 
> This is already the case on Arm. We need to differentiate between the 
> page size used by the guest and the one used by Xen for the stage-2 page 
> table (what you call EPT on x86).
> 
> In this case, we are talking about the page size used by the hypervisor 
> to configure the stage-2 page table
> 
>> There would be a distinction between the granularity at
>> which RAM gets allocated and the granularity at which page mappings (RAM
>> or other) can be established. Which yields an environment which I'd say
>> has no clear "system page size".
> 
> I don't quite understand why you would allocate and etablish the memory 
> with a different page size in the hypervisor. Can you give an example?

Pages may get allocated in 16k chunks, but there may be ways to map
4k MMIO regions, 4k grants, etc. Due to the 16k allocation granularity
you'd e.g. still balloon pages in and out at 16k granularity.

Jan
Julien Grall July 13, 2021, 4:15 p.m. UTC | #25
Hi Jan,

On 13/07/2021 16:52, Jan Beulich wrote:
> On 13.07.2021 16:33, Julien Grall wrote:
>> On 13/07/2021 15:23, Jan Beulich wrote:
>>> On 13.07.2021 16:19, Julien Grall wrote:
>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>>>
>>>>> Even that doesn't seem right to me, at least in principle. There shouldn't
>>>>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>>>>> runtime query to the hypervisor would be needed instead.
>>>>
>>>> Yes, we want to be able to use the same userspace/OS without rebuilding
>>>> to a specific hypervisor page size.
>>>>
>>>>> And thinking
>>>>> even more generally, perhaps there could also be mixed (base) page sizes
>>>>> in use at run time, so it may need to be a bit mask which gets returned.
>>>>
>>>> I am not sure to understand this. Are you saying the hypervisor may use
>>>> at the same time different page size?
>>>
>>> I think so, yes. And I further think the hypervisor could even allow its
>>> guests to do so.
>>
>> This is already the case on Arm. We need to differentiate between the
>> page size used by the guest and the one used by Xen for the stage-2 page
>> table (what you call EPT on x86).
>>
>> In this case, we are talking about the page size used by the hypervisor
>> to configure the stage-2 page table
>>
>>> There would be a distinction between the granularity at
>>> which RAM gets allocated and the granularity at which page mappings (RAM
>>> or other) can be established. Which yields an environment which I'd say
>>> has no clear "system page size".
>>
>> I don't quite understand why you would allocate and etablish the memory
>> with a different page size in the hypervisor. Can you give an example?
> 
> Pages may get allocated in 16k chunks, but there may be ways to map
> 4k MMIO regions, 4k grants, etc. Due to the 16k allocation granularity
> you'd e.g. still balloon pages in and out at 16k granularity.
Right, 16KB is a multiple of 4KB, so a guest could say "Please allocate
a contiguous chunk of four 4KB pages".

From my understanding, you are suggesting telling the guest that we
"support 4KB, 16KB, 64KB...". However, it should be sufficient to say
"we support 4KB and all its multiples".

For a hypervisor configured with 16KB (or 64KB) as the smallest page
granularity, we would say "we support 16KB (resp. 64KB) and all its
multiples".

So the only thing we need is a way to query the smallest page
granularity supported. This could be a shift, size, whatever...

If the guest supports a smaller page granularity, then it would need to
adapt the ballooning, grants, etc. so they are a multiple of the page
granularity supported by the hypervisor.
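
A minimal sketch of that adaptation, assuming the hypervisor granularity
is reported as a shift. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round 'bytes' up to the next multiple of the hypervisor granule. */
static inline uint64_t round_up_to_granule(uint64_t bytes,
                                           unsigned int hyp_shift)
{
    assert(hyp_shift < 63);
    uint64_t granule = UINT64_C(1) << hyp_shift;

    return (bytes + granule - 1) & ~(granule - 1);
}

/* Is 'bytes' already a multiple of the hypervisor granule? */
static inline bool is_granule_multiple(uint64_t bytes,
                                       unsigned int hyp_shift)
{
    assert(hyp_shift < 63);
    return (bytes & ((UINT64_C(1) << hyp_shift) - 1)) == 0;
}
```

A 4KB guest on a 16KB hypervisor (shift 14) would therefore have to
balloon or grant in units of 16384 bytes.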

Cheers,
Jan Beulich July 13, 2021, 4:27 p.m. UTC | #26
On 13.07.2021 18:15, Julien Grall wrote:
> On 13/07/2021 16:52, Jan Beulich wrote:
>> On 13.07.2021 16:33, Julien Grall wrote:
>>> On 13/07/2021 15:23, Jan Beulich wrote:
>>>> On 13.07.2021 16:19, Julien Grall wrote:
>>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>>>>
>>>>>> Even that doesn't seem right to me, at least in principle. There shouldn't
>>>>>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>>>>>> runtime query to the hypervisor would be needed instead.
>>>>>
>>>>> Yes, we want to be able to use the same userspace/OS without rebuilding
>>>>> to a specific hypervisor page size.
>>>>>
>>>>>> And thinking
>>>>>> even more generally, perhaps there could also be mixed (base) page sizes
>>>>>> in use at run time, so it may need to be a bit mask which gets returned.
>>>>>
>>>>> I am not sure to understand this. Are you saying the hypervisor may use
>>>>> at the same time different page size?
>>>>
>>>> I think so, yes. And I further think the hypervisor could even allow its
>>>> guests to do so.
>>>
>>> This is already the case on Arm. We need to differentiate between the
>>> page size used by the guest and the one used by Xen for the stage-2 page
>>> table (what you call EPT on x86).
>>>
>>> In this case, we are talking about the page size used by the hypervisor
>>> to configure the stage-2 page table
>>>
>>>> There would be a distinction between the granularity at
>>>> which RAM gets allocated and the granularity at which page mappings (RAM
>>>> or other) can be established. Which yields an environment which I'd say
>>>> has no clear "system page size".
>>>
>>> I don't quite understand why you would allocate and etablish the memory
>>> with a different page size in the hypervisor. Can you give an example?
>>
>> Pages may get allocated in 16k chunks, but there may be ways to map
>> 4k MMIO regions, 4k grants, etc. Due to the 16k allocation granularity
>> you'd e.g. still balloon pages in and out at 16k granularity.
> Right, 16KB is a multiple of 4KB, so a guest could say "Please allocate 
> a contiguous chunk of 4 4KB pages".
> 
>  From my understanding, you are suggesting to tell the guest that we 
> "support 4KB, 16KB, 64KB...". However, it should be sufficient to say 
> "we support 4KB and all its multiple".

No - in this case it could legitimately expect to be able to balloon
out a single 4k page. Yet that's not possible with 16k allocation
granularity.
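
To illustrate the distinction, a sketch that keeps the allocation and
mapping granularities separate. The structure and helpers are
hypothetical; the 16k/4k values are the ones from the example above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a system with two granularities. */
struct page_granularity {
    unsigned int alloc_shift; /* RAM allocation / ballooning unit */
    unsigned int map_shift;   /* smallest mapping (MMIO, grants) */
};

/* True if 'bytes' is a non-zero multiple of (1 << shift). */
static inline bool is_nonzero_multiple(uint64_t bytes, unsigned int shift)
{
    assert(shift < 63);
    return bytes != 0 && (bytes & ((UINT64_C(1) << shift) - 1)) == 0;
}

/* Ballooning is bound by the allocation granularity. */
static inline bool can_balloon(const struct page_granularity *pg,
                               uint64_t bytes)
{
    return is_nonzero_multiple(bytes, pg->alloc_shift);
}

/* Mappings are bound only by the mapping granularity. */
static inline bool can_map(const struct page_granularity *pg,
                           uint64_t bytes)
{
    return is_nonzero_multiple(bytes, pg->map_shift);
}
```

With alloc_shift = 14 and map_shift = 12, a 4k grant can be mapped but a
single 4k page cannot be ballooned out, which is why "4KB and all its
multiples" would promise too much.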

Jan

> For hypervisor configured with 16KB (or 64KB) as the smaller page 
> granularity, then we would say "we support 16KB (resp. 64KB) and all its 
> multiple".
> 
> So the only thing we need is a way to query the small page granularity 
> supported. This could be a shift, size, whatever...
> 
> If the guest is supporting a small page granularity, then the guest 
> would need to make sure to adapt the ballooning, grants... so they are at
> least a multiple of the page granularity supported by the hypervisor.
> 
> Cheers,
>
Julien Grall July 13, 2021, 4:33 p.m. UTC | #27
Hi,

On 13/07/2021 17:27, Jan Beulich wrote:
> On 13.07.2021 18:15, Julien Grall wrote:
>> On 13/07/2021 16:52, Jan Beulich wrote:
>>> On 13.07.2021 16:33, Julien Grall wrote:
>>>> On 13/07/2021 15:23, Jan Beulich wrote:
>>>>> On 13.07.2021 16:19, Julien Grall wrote:
>>>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>>>>>
>>>>>>> Even that doesn't seem right to me, at least in principle. There shouldn't
>>>>>>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>>>>>>> runtime query to the hypervisor would be needed instead.
>>>>>>
>>>>>> Yes, we want to be able to use the same userspace/OS without rebuilding
>>>>>> to a specific hypervisor page size.
>>>>>>
>>>>>>> And thinking
>>>>>>> even more generally, perhaps there could also be mixed (base) page sizes
>>>>>>> in use at run time, so it may need to be a bit mask which gets returned.
>>>>>>
>>>>>> I am not sure to understand this. Are you saying the hypervisor may use
>>>>>> at the same time different page size?
>>>>>
>>>>> I think so, yes. And I further think the hypervisor could even allow its
>>>>> guests to do so.
>>>>
>>>> This is already the case on Arm. We need to differentiate between the
>>>> page size used by the guest and the one used by Xen for the stage-2 page
>>>> table (what you call EPT on x86).
>>>>
>>>> In this case, we are talking about the page size used by the hypervisor
>>>> to configure the stage-2 page table
>>>>
>>>>> There would be a distinction between the granularity at
>>>>> which RAM gets allocated and the granularity at which page mappings (RAM
>>>>> or other) can be established. Which yields an environment which I'd say
>>>>> has no clear "system page size".
>>>>
>>>> I don't quite understand why you would allocate and establish the memory
>>>> with a different page size in the hypervisor. Can you give an example?
>>>
>>> Pages may get allocated in 16k chunks, but there may be ways to map
>>> 4k MMIO regions, 4k grants, etc. Due to the 16k allocation granularity
>>> you'd e.g. still balloon pages in and out at 16k granularity.
>> Right, 16KB is a multiple of 4KB, so a guest could say "Please allocate
>> a contiguous chunk of 4 4KB pages".
>>
>>   From my understanding, you are suggesting to tell the guest that we
>> "support 4KB, 16KB, 64KB...". However, it should be sufficient to say
>> "we support 4KB and all its multiple".
> 
> No - in this case it could legitimately expect to be able to balloon
> out a single 4k page. Yet that's not possible with 16k allocation
> granularity.

I am confused... why would you want to put such a restriction in place?
IOW, what are you trying to protect against?

Cheers,
Jan Beulich July 14, 2021, 6:11 a.m. UTC | #28
On 13.07.2021 18:33, Julien Grall wrote:
> Hi,
> 
> On 13/07/2021 17:27, Jan Beulich wrote:
>> On 13.07.2021 18:15, Julien Grall wrote:
>>> On 13/07/2021 16:52, Jan Beulich wrote:
>>>> On 13.07.2021 16:33, Julien Grall wrote:
>>>>> On 13/07/2021 15:23, Jan Beulich wrote:
>>>>>> On 13.07.2021 16:19, Julien Grall wrote:
>>>>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>>>>>>
>>>>>>>> Even that doesn't seem right to me, at least in principle. There shouldn't
>>>>>>>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>>>>>>>> runtime query to the hypervisor would be needed instead.
>>>>>>>
>>>>>>> Yes, we want to be able to use the same userspace/OS without rebuilding
>>>>>>> to a specific hypervisor page size.
>>>>>>>
>>>>>>>> And thinking
>>>>>>>> even more generally, perhaps there could also be mixed (base) page sizes
>>>>>>>> in use at run time, so it may need to be a bit mask which gets returned.
>>>>>>>
>>>>>>> I am not sure to understand this. Are you saying the hypervisor may use
>>>>>>> at the same time different page size?
>>>>>>
>>>>>> I think so, yes. And I further think the hypervisor could even allow its
>>>>>> guests to do so.
>>>>>
>>>>> This is already the case on Arm. We need to differentiate between the
>>>>> page size used by the guest and the one used by Xen for the stage-2 page
>>>>> table (what you call EPT on x86).
>>>>>
>>>>> In this case, we are talking about the page size used by the hypervisor
>>>>> to configure the stage-2 page table
>>>>>
>>>>>> There would be a distinction between the granularity at
>>>>>> which RAM gets allocated and the granularity at which page mappings (RAM
>>>>>> or other) can be established. Which yields an environment which I'd say
>>>>>> has no clear "system page size".
>>>>>
>>>>> I don't quite understand why you would allocate and establish the memory
>>>>> with a different page size in the hypervisor. Can you give an example?
>>>>
>>>> Pages may get allocated in 16k chunks, but there may be ways to map
>>>> 4k MMIO regions, 4k grants, etc. Due to the 16k allocation granularity
>>>> you'd e.g. still balloon pages in and out at 16k granularity.
>>> Right, 16KB is a multiple of 4KB, so a guest could say "Please allocate
>>> a contiguous chunk of 4 4KB pages".
>>>
>>>   From my understanding, you are suggesting to tell the guest that we
>>> "support 4KB, 16KB, 64KB...". However, it should be sufficient to say
>>> "we support 4KB and all its multiple".
>>
>> No - in this case it could legitimately expect to be able to balloon
>> out a single 4k page. Yet that's not possible with 16k allocation
>> granularity.
> 
> I am confused... why would you want to put such restriction? IOW, what 
> are you trying to protect against?

Protect? It may simply be that the most efficient page size is 16k.
Hence accounting of pages may be done at 16k granularity. IOW there
then is one struct page_info per 16k page. How would you propose a
guest would alloc/free 4k pages in such a configuration?

Jan
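
To make the 16k/4k example concrete, here is a minimal sketch of the two granularities being distinguished. All DEMO_/demo_ names and the values are hypothetical and only mirror the numbers used in this thread; they are not Xen interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values taken from the 16k/4k example in this thread. */
#define DEMO_ALLOC_GRANULE (16u * 1024u) /* one struct page_info per 16k */
#define DEMO_MAP_GRANULE   (4u * 1024u)  /* MMIO regions, grants, ... */

/* True if [addr, addr + len) consists of whole granules of the given size. */
static bool demo_aligned(uint64_t addr, uint64_t len, uint32_t granule)
{
    return len != 0 && addr % granule == 0 && len % granule == 0;
}

/* Ballooning changes RAM ownership, so it follows the allocation granule. */
static bool demo_balloon_ok(uint64_t addr, uint64_t len)
{
    return demo_aligned(addr, len, DEMO_ALLOC_GRANULE);
}

/* Mappings may be established at the finer mapping granule. */
static bool demo_map_ok(uint64_t addr, uint64_t len)
{
    return demo_aligned(addr, len, DEMO_MAP_GRANULE);
}
```

Under these assumptions, ballooning out a single 4k page is rejected, while mapping a single 4k region is accepted.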
Julien Grall July 14, 2021, 8:51 a.m. UTC | #29
On 14/07/2021 07:11, Jan Beulich wrote:
> On 13.07.2021 18:33, Julien Grall wrote:
>> Hi,
>>
>> On 13/07/2021 17:27, Jan Beulich wrote:
>>> On 13.07.2021 18:15, Julien Grall wrote:
>>>> On 13/07/2021 16:52, Jan Beulich wrote:
>>>>> On 13.07.2021 16:33, Julien Grall wrote:
>>>>>> On 13/07/2021 15:23, Jan Beulich wrote:
>>>>>>> On 13.07.2021 16:19, Julien Grall wrote:
>>>>>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>>>>>> And I don't think it should be named XC_PAGE_*, but rather XEN_PAGE_*.
>>>>>>>>>
>>>>>>>>> Even that doesn't seem right to me, at least in principle. There shouldn't
>>>>>>>>> be a build time setting when it may vary at runtime. IOW on Arm I think a
>>>>>>>>> runtime query to the hypervisor would be needed instead.
>>>>>>>>
>>>>>>>> Yes, we want to be able to use the same userspace/OS without rebuilding
>>>>>>>> to a specific hypervisor page size.
>>>>>>>>
>>>>>>>>> And thinking
>>>>>>>>> even more generally, perhaps there could also be mixed (base) page sizes
>>>>>>>>> in use at run time, so it may need to be a bit mask which gets returned.
>>>>>>>>
>>>>>>>> I am not sure to understand this. Are you saying the hypervisor may use
>>>>>>>> at the same time different page size?
>>>>>>>
>>>>>>> I think so, yes. And I further think the hypervisor could even allow its
>>>>>>> guests to do so.
>>>>>>
>>>>>> This is already the case on Arm. We need to differentiate between the
>>>>>> page size used by the guest and the one used by Xen for the stage-2 page
>>>>>> table (what you call EPT on x86).
>>>>>>
>>>>>> In this case, we are talking about the page size used by the hypervisor
>>>>>> to configure the stage-2 page table
>>>>>>
>>>>>>> There would be a distinction between the granularity at
>>>>>>> which RAM gets allocated and the granularity at which page mappings (RAM
>>>>>>> or other) can be established. Which yields an environment which I'd say
>>>>>>> has no clear "system page size".
>>>>>>
>>>>>> I don't quite understand why you would allocate and establish the memory
>>>>>> with a different page size in the hypervisor. Can you give an example?
>>>>>
>>>>> Pages may get allocated in 16k chunks, but there may be ways to map
>>>>> 4k MMIO regions, 4k grants, etc. Due to the 16k allocation granularity
>>>>> you'd e.g. still balloon pages in and out at 16k granularity.
>>>> Right, 16KB is a multiple of 4KB, so a guest could say "Please allocate
>>>> a contiguous chunk of 4 4KB pages".
>>>>
>>>>    From my understanding, you are suggesting to tell the guest that we
>>>> "support 4KB, 16KB, 64KB...". However, it should be sufficient to say
>>>> "we support 4KB and all its multiple".
>>>
>>> No - in this case it could legitimately expect to be able to balloon
>>> out a single 4k page. Yet that's not possible with 16k allocation
>>> granularity.
>>
>> I am confused... why would you want to put such restriction? IOW, what
>> are you trying to protect against?
> 
> Protect? It may simply be that the most efficient page size is 16k.
> Hence accounting of pages may be done at 16k granularity.

I am assuming you are speaking about accounting in the hypervisor. So...

> IOW there
> then is one struct page_info per 16k page. How would you propose a
> guest would alloc/free 4k pages in such a configuration?
... the hypercall interface would be using 16KB page granularity as a base.

But IIUC, you are thinking of also allowing mappings to be done with 4KB.
I think that, from the hypercall interface's point of view, this should be
considered a subpage.

I am not entirely convinced the subpage size should be exposed in a
generic hypercall query because only a subset will support it. If all
supported it, the base granularity would be the subpage granularity,
rendering the discussion moot.

Anyway, we can discuss that when there is a formal proposal on the ML.

Cheers,
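
As a rough illustration of a guest adapting to a single base granularity, the sketch below rounds a count of guest pages up to whole hypervisor granules. The function names are hypothetical, both parameters are log2 page sizes, and it assumes the guest page size is not larger than the hypervisor granule.

```c
#include <stdint.h>

/*
 * Number of guest pages covering one hypervisor granule.
 * Assumes guest_shift <= hyp_shift (both are log2 of a page size).
 */
static uint64_t demo_guest_pages_per_granule(unsigned int guest_shift,
                                             unsigned int hyp_shift)
{
    return UINT64_C(1) << (hyp_shift - guest_shift);
}

/* Round a count of guest pages up to a whole number of hypervisor granules. */
static uint64_t demo_round_guest_pages(uint64_t nr, unsigned int guest_shift,
                                       unsigned int hyp_shift)
{
    uint64_t per = demo_guest_pages_per_granule(guest_shift, hyp_shift);

    return (nr + per - 1) / per * per;
}
```

With a 4KB guest (shift 12) on a 16KB hypervisor (shift 14), a request for 5 guest pages would be rounded up to 8.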
Costin Lupu July 16, 2021, 6:28 p.m. UTC | #30
On 7/13/21 6:20 PM, Juergen Gross wrote:
> On 13.07.21 17:15, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 13/07/2021 16:09, Juergen Gross wrote:
>>> On 13.07.21 16:38, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> On 13/07/2021 15:23, Juergen Gross wrote:
>>>>> On 13.07.21 16:19, Julien Grall wrote:
>>>>>> Hi Jan,
>>>>>>
>>>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>>>> And I don't think it should be named XC_PAGE_*, but rather
>>>>>>>> XEN_PAGE_*.
>>>>>>>
>>>>>>> Even that doesn't seem right to me, at least in principle. There
>>>>>>> shouldn't
>>>>>>> be a build time setting when it may vary at runtime. IOW on Arm I
>>>>>>> think a
>>>>>>> runtime query to the hypervisor would be needed instead.
>>>>>>
>>>>>> Yes, we want to be able to use the same userspace/OS without
>>>>>> rebuilding to a specific hypervisor page size.
>>>>>
>>>>> This define is used for accessing data of other domains. See the
>>>>> define
>>>>> for XEN_PAGE_SIZE in xen/include/public/io/ring.h
>>>>>
>>>>> So it should be a constant (minimal) page size for all hypervisors and
>>>>> guests of an architecture.
>>>>
>>>> Do you mean the maximum rather than minimal? If you use the minimal
>>>> (4KB), then you would not be able to map the page in the stage-2 if
>>>> the hypervisor is using 64KB.
>>>
>>> But this would mean that the current solution to use XC_PAGE_SIZE is
>>> wrong, as this is 4k.
>>
>> The existing ABI is implicitly based on using the hypervisor page
>> granularity (currently 4KB).
>>
>> There is really no way we can support existing guest on 64KB
>> hypervisor. But if we were going to break them, then we should
>> consider to do one of the following option:
>>     1) use 64KB page granularity for ABI
>>     2) query the hypervisor page granularity at runtime
>>
>> The ideal is 2) because it is more scalable for the future. We also
>> need to consider to extend the PV protocol so the backend and frontend
>> can agree on the page size.
> 
> I absolutely agree, but my suggestion was to help finding a proper way
> to cleanup the current interface mess. And this should be done the way
> I suggested IMO.
> 
> A later interface extension for future guests can still be done on top
> of that.

Alright, let's have a little recap to see if I got it right and to agree
on the next steps. There are 2 proposed solutions, let's say a static
one and a dynamic one.

1) Static solution (proposed by Juergen)
- We define XEN_PAGE_* values in a xen/include/public/arch-*/*.h header.
- Q: Should we define a new header for that? Would page.h or page_size.h
be OK as new filenames?

Pros:
- We fix the interface mess and we can get rid of the xenctrl lib
dependency for some of the libs that need only the XEN_PAGE_* definitions.
- It's faster to implement, with fewer changes.

Cons:
- Well, it's static, it doesn't allow the hypervisor to provide
different values for different guests.


2) Dynamic solution (proposed by Jan and Julien)
We get the value(s) by calling a hypercall, probably as a query related to
some guest domain.

Pros:
- It's dynamic and scalable. We would support different values for
different guests.

Cons:
- More difficult to implement. It changes the paradigm in the toolstack
libs: every occurrence of XC_PAGE_* would have to be amended. Moreover,
we might want to make the hypercall once and save the value for later
(several toolstack structures would probably have to be extended for that).


I searched for the occurrences of XC_PAGE_* in the toolstack libs and
there are a *lot* of them. IMHO we should pick the static solution
for now, considering that it would be faster to implement. Please let me
know if this is OK or not. Any comments are appreciated.

Cheers,
Costin
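
For comparison, the two options could look roughly like this. It is only a sketch: every DEMO_/demo_ name is hypothetical, the shift values are examples, and the function pointer stands in for whatever hypercall wrapper the dynamic solution would end up using.

```c
#include <stdint.h>

/* 1) Static: a build-time constant, as a public arch header might define it. */
#define DEMO_XEN_PAGE_SHIFT 12
#define DEMO_XEN_PAGE_SIZE  (1UL << DEMO_XEN_PAGE_SHIFT)

/* 2) Dynamic: query once, then reuse the value cached in the handle. */
struct demo_handle {
    unsigned int page_shift;                /* 0 means "not queried yet" */
    unsigned int (*query_page_shift)(void); /* stand-in for the hypercall */
};

static unsigned long demo_page_size(struct demo_handle *h)
{
    if (h->page_shift == 0)
        h->page_shift = h->query_page_shift();

    return 1UL << h->page_shift;
}

/* Fake query used for illustration only: pretends Xen uses 64KB pages. */
static unsigned int demo_query_calls;

static unsigned int demo_fake_query(void)
{
    demo_query_calls++;
    return 16;
}
```

The extra handle field is the kind of change meant by "several toolstack structures should be extended": every user of the constant would need access to a handle instead.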
Andrew Cooper July 27, 2021, 1:36 p.m. UTC | #31
On 16/07/2021 19:28, Costin Lupu wrote:
> On 7/13/21 6:20 PM, Juergen Gross wrote:
>> On 13.07.21 17:15, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 13/07/2021 16:09, Juergen Gross wrote:
>>>> On 13.07.21 16:38, Julien Grall wrote:
>>>>> Hi Juergen,
>>>>>
>>>>> On 13/07/2021 15:23, Juergen Gross wrote:
>>>>>> On 13.07.21 16:19, Julien Grall wrote:
>>>>>>> Hi Jan,
>>>>>>>
>>>>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>>>>> And I don't think it should be named XC_PAGE_*, but rather
>>>>>>>>> XEN_PAGE_*.
>>>>>>>> Even that doesn't seem right to me, at least in principle. There
>>>>>>>> shouldn't
>>>>>>>> be a build time setting when it may vary at runtime. IOW on Arm I
>>>>>>>> think a
>>>>>>>> runtime query to the hypervisor would be needed instead.
>>>>>>> Yes, we want to be able to use the same userspace/OS without
>>>>>>> rebuilding to a specific hypervisor page size.
>>>>>> This define is used for accessing data of other domains. See the
>>>>>> define
>>>>>> for XEN_PAGE_SIZE in xen/include/public/io/ring.h
>>>>>>
>>>>>> So it should be a constant (minimal) page size for all hypervisors and
>>>>>> guests of an architecture.
>>>>> Do you mean the maximum rather than minimal? If you use the minimal
>>>>> (4KB), then you would not be able to map the page in the stage-2 if
>>>>> the hypervisor is using 64KB.
>>>> But this would mean that the current solution to use XC_PAGE_SIZE is
>>>> wrong, as this is 4k.
>>> The existing ABI is implicitly based on using the hypervisor page
>>> granularity (currently 4KB).
>>>
>>> There is really no way we can support existing guest on 64KB
>>> hypervisor. But if we were going to break them, then we should
>>> consider to do one of the following option:
>>>     1) use 64KB page granularity for ABI
>>>     2) query the hypervisor page granularity at runtime
>>>
>>> The ideal is 2) because it is more scalable for the future. We also
>>> need to consider to extend the PV protocol so the backend and frontend
>>> can agree on the page size.
>> I absolutely agree, but my suggestion was to help finding a proper way
>> to cleanup the current interface mess. And this should be done the way
>> I suggested IMO.
>>
>> A later interface extension for future guests can still be done on top
>> of that.
> Alright, let's have a little recap to see if I got it right and to agree
> on the next steps. There are 2 proposed solutions, let's say a static
> one and a dynamic one.
>
> 1) Static solution (proposed by Juergen)
> - We define XEN_PAGE_* values in a xen/include/public/arch-*/*.h header.
> - Q: Should we define a new header for that? page.h or page_size.h are
> ok as new filenames?
>
> Pros:
> - We fix the interfaces mess and we can get rid of xenctrl lib
> dependency for some of the libs that need only the XEN_PAGE_* definitions.
> - It's faster to implement, with fewer changes.
>
> Cons:
> - Well, it's static, it doesn't allow the hypervisor to provide
> different values for different guests.
>
>
> 2) Dynamic solution (proposed by Jan and Julien)
> We get the value(s) by calling a hypercall, probably as a query related to
> some guest domain.
>
> Pros:
> - It's dynamic and scalable. We would support different values for
> different guests.
>
> Cons:
> - More difficult to implement. It changes the paradigm in the toolstack
> libs, every occurrence of XC_PAGE_* would have to be amended. Moreover,
> we might want to make the hypercall once and save the value for later
> (probably several toolstack structures should be extended for that)
>
>
> I searched for the occurrences of XC_PAGE_* in the toolstack libs and
> it's a *lot* of them. IMHO I think we should pick the static solution
> for now, considering that it would be faster to implement. Please let me
> know if this is OK or not. Any comments are appreciated.

The immediate problem needing fixing is the stable libraries' inclusion
of unstable headers - specifically, the inclusion of <xenctrl.h>.

Juergen's proposal moves the existing constant to a more appropriate
location, and specifically, a location where its value is stable.

It does not change the ABI.  It merely demonstrates that the existing
ABI is broken, and thus is absolutely a step in the right direction.

This is the approach you should take in the short term, and needs
sorting before 4.16 ships.


The dynamic solution, while preferable in the long term, is far more
complicated than even described thus far, and is not as simple as just
having a hypercall and using that value.

Among other things, it requires coordination with the dom0 kernel as to
its pagetable setup, and with Xen's choice of pagetable size for dom0,
which may not be the same as domU's.  It is a large quantity of work,
very invasive to the existing APIs/ABIs, and stands no chance at all of
being ready for 4.16.

~Andrew
Costin Lupu July 30, 2021, 9:18 a.m. UTC | #32
On 7/27/21 4:36 PM, Andrew Cooper wrote:
> On 16/07/2021 19:28, Costin Lupu wrote:
>> On 7/13/21 6:20 PM, Juergen Gross wrote:
>>> On 13.07.21 17:15, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> On 13/07/2021 16:09, Juergen Gross wrote:
>>>>> On 13.07.21 16:38, Julien Grall wrote:
>>>>>> Hi Juergen,
>>>>>>
>>>>>> On 13/07/2021 15:23, Juergen Gross wrote:
>>>>>>> On 13.07.21 16:19, Julien Grall wrote:
>>>>>>>> Hi Jan,
>>>>>>>>
>>>>>>>> On 13/07/2021 15:14, Jan Beulich wrote:
>>>>>>>>>> And I don't think it should be named XC_PAGE_*, but rather
>>>>>>>>>> XEN_PAGE_*.
>>>>>>>>> Even that doesn't seem right to me, at least in principle. There
>>>>>>>>> shouldn't
>>>>>>>>> be a build time setting when it may vary at runtime. IOW on Arm I
>>>>>>>>> think a
>>>>>>>>> runtime query to the hypervisor would be needed instead.
>>>>>>>> Yes, we want to be able to use the same userspace/OS without
>>>>>>>> rebuilding to a specific hypervisor page size.
>>>>>>> This define is used for accessing data of other domains. See the
>>>>>>> define
>>>>>>> for XEN_PAGE_SIZE in xen/include/public/io/ring.h
>>>>>>>
>>>>>>> So it should be a constant (minimal) page size for all hypervisors and
>>>>>>> guests of an architecture.
>>>>>> Do you mean the maximum rather than minimal? If you use the minimal
>>>>>> (4KB), then you would not be able to map the page in the stage-2 if
>>>>>> the hypervisor is using 64KB.
>>>>> But this would mean that the current solution to use XC_PAGE_SIZE is
>>>>> wrong, as this is 4k.
>>>> The existing ABI is implicitly based on using the hypervisor page
>>>> granularity (currently 4KB).
>>>>
>>>> There is really no way we can support existing guest on 64KB
>>>> hypervisor. But if we were going to break them, then we should
>>>> consider to do one of the following option:
>>>>     1) use 64KB page granularity for ABI
>>>>     2) query the hypervisor page granularity at runtime
>>>>
>>>> The ideal is 2) because it is more scalable for the future. We also
>>>> need to consider to extend the PV protocol so the backend and frontend
>>>> can agree on the page size.
>>> I absolutely agree, but my suggestion was to help finding a proper way
>>> to cleanup the current interface mess. And this should be done the way
>>> I suggested IMO.
>>>
>>> A later interface extension for future guests can still be done on top
>>> of that.
>> Alright, let's have a little recap to see if I got it right and to agree
>> on the next steps. There are 2 proposed solutions, let's say a static
>> one and a dynamic one.
>>
>> 1) Static solution (proposed by Juergen)
>> - We define XEN_PAGE_* values in a xen/include/public/arch-*/*.h header.
>> - Q: Should we define a new header for that? page.h or page_size.h are
>> ok as new filenames?
>>
>> Pros:
>> - We fix the interfaces mess and we can get rid of xenctrl lib
>> dependency for some of the libs that need only the XEN_PAGE_* definitions.
>> - It's faster to implement, with fewer changes.
>>
>> Cons:
>> - Well, it's static, it doesn't allow the hypervisor to provide
>> different values for different guests.
>>
>>
>> 2) Dynamic solution (proposed by Jan and Julien)
>> We get the value(s) by calling a hypercall, probably as a query related to
>> some guest domain.
>>
>> Pros:
>> - It's dynamic and scalable. We would support different values for
>> different guests.
>>
>> Cons:
>> - More difficult to implement. It changes the paradigm in the toolstack
>> libs, every occurrence of XC_PAGE_* would have to be amended. Moreover,
>> we might want to make the hypercall once and save the value for later
>> (probably several toolstack structures should be extended for that)
>>
>>
>> I searched for the occurrences of XC_PAGE_* in the toolstack libs and
>> it's a *lot* of them. IMHO I think we should pick the static solution
>> for now, considering that it would be faster to implement. Please let me
>> know if this is OK or not. Any comments are appreciated.
> 
> The immediate problem needing fixing is the stable libraries inclusion
> of unstable headers - specifically, the inclusion of <xenctrl.h>.
> 
> Juergen's proposal moves the existing constant to a more appropriate
> location, and specifically, a location where its value is stable.
> 
> It does not change the ABI.  It merely demonstrates that the existing
> ABI is broken, and thus is absolutely a step in the right direction.
> 
> This is the approach you should take in the short term, and needs
> sorting before 4.16 ships.
> 
> 
> The dynamic solution, while preferable in the longterm, is far more
> complicated than even described thus far, and is not as simple as just
> having a hypercall and using that value.
> 
> Among other things, it requires coordination with the dom0 kernel as to
> its pagetable setup, and with Xen's choice of pagetable size for dom0,
> which may not be the same as domU's.  It is a large quantity of work,
> very invasive to the existing APIs/ABIs, and stands no chance at all of
> being ready for 4.16.

Thanks for clearing this up, Andrew. What is the deadline for the 4.16
release? Where can I find the release calendar?

Costin

Patch

diff --git a/tools/libs/foreignmemory/minios.c b/tools/libs/foreignmemory/minios.c
index c5453736d598..d7b3f0e1c823 100644
--- a/tools/libs/foreignmemory/minios.c
+++ b/tools/libs/foreignmemory/minios.c
@@ -17,6 +17,14 @@ 
  * Copyright 2007-2008 Samuel Thibault <samuel.thibault@eu.citrix.com>.
  */
 
+/*
+ * xenctrl.h currently defines __XEN_TOOLS__ which affects what is
+ * exposed by Xen headers. As the define needs to be set consistently,
+ * we want to include xenctrl.h before the mini-os headers (they include
+ * public headers).
+ */
+#include <xenctrl.h>
+
 #include <mini-os/types.h>
 #include <mini-os/os.h>
 #include <mini-os/mm.h>