[06/16] SUPPORT.md: Add scalability features

Message ID 20171113154126.13038-6-george.dunlap@citrix.com (mailing list archive)
State New, archived

Commit Message

George Dunlap Nov. 13, 2017, 3:41 p.m. UTC
Superpage support and PVHVM.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Konrad Wilk <konrad.wilk@oracle.com>
CC: Tim Deegan <tim@xen.org>
CC: Julien Grall <julien.grall@arm.com>
---
 SUPPORT.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Julien Grall Nov. 16, 2017, 3:19 p.m. UTC | #1
Hi George,

On 13/11/17 15:41, George Dunlap wrote:
> Superpage support and PVHVM.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> ---
> CC: Ian Jackson <ian.jackson@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Konrad Wilk <konrad.wilk@oracle.com>
> CC: Tim Deegan <tim@xen.org>
> CC: Julien Grall <julien.grall@arm.com>
> ---
>   SUPPORT.md | 21 +++++++++++++++++++++
>   1 file changed, 21 insertions(+)
> 
> diff --git a/SUPPORT.md b/SUPPORT.md
> index c884fac7f5..a8c56d13dd 100644
> --- a/SUPPORT.md
> +++ b/SUPPORT.md
> @@ -195,6 +195,27 @@ on embedded platforms.
>   
>   Enables NUMA aware scheduling in Xen
>   
> +## Scalability
> +
> +### 1GB/2MB super page support
> +
> +    Status, x86 HVM/PVH: : Supported
> +    Status, ARM: Supported
> +
> +NB that this refers to the ability of guests
> +to have higher-level page table entries point directly to memory,
> +improving TLB performance.
> +This is independent of the ARM "page granularity" feature (see below).

I am not entirely sure about this paragraph for Arm. I understood this
section as covering support for stage-2 page-tables (aka EPT on x86), but the
paragraph leads me to believe it is about the guest.

The size of superpages available to a guest depends on the page granularity
the guest uses and the format of its page-tables (e.g. LPAE vs short
descriptor). We have no control over that.

What we do control is the size of the mappings used for the stage-2 page-tables.

> +
> +### x86/PVHVM
> +
> +    Status: Supported
> +
> +This is a useful label for a set of hypervisor features
> +which add paravirtualized functionality to HVM guests
> +for improved performance and scalability.
> +This includes exposing event channels to HVM guests.
> +
>   # Format and definitions
>   
>   This file contains prose, and machine-readable fragments.
> 

Cheers,
George Dunlap Nov. 16, 2017, 3:30 p.m. UTC | #2
On 11/16/2017 03:19 PM, Julien Grall wrote:
> Hi George,
> 
> On 13/11/17 15:41, George Dunlap wrote:
>> Superpage support and PVHVM.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>> ---
>> CC: Ian Jackson <ian.jackson@citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Konrad Wilk <konrad.wilk@oracle.com>
>> CC: Tim Deegan <tim@xen.org>
>> CC: Julien Grall <julien.grall@arm.com>
>> ---
>>   SUPPORT.md | 21 +++++++++++++++++++++
>>   1 file changed, 21 insertions(+)
>>
>> diff --git a/SUPPORT.md b/SUPPORT.md
>> index c884fac7f5..a8c56d13dd 100644
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
>> @@ -195,6 +195,27 @@ on embedded platforms.
>>     Enables NUMA aware scheduling in Xen
>>   +## Scalability
>> +
>> +### 1GB/2MB super page support
>> +
>> +    Status, x86 HVM/PVH: : Supported
>> +    Status, ARM: Supported
>> +
>> +NB that this refers to the ability of guests
>> +to have higher-level page table entries point directly to memory,
>> +improving TLB performance.
>> +This is independent of the ARM "page granularity" feature (see below).
> 
> I am not entirely sure about this paragraph for Arm. I understood this
> section as support for stage-2 page-table (aka EPT on x86) but the
> paragraph lead me to believe to it is for guest.

Hmm, yes likely there was some confusion when this was listed.  We
probably should make separate entries for HAP / stage 2 superpage
support and guest PT superpage support.

 -George
Jan Beulich Nov. 21, 2017, 8:16 a.m. UTC | #3
>>> On 13.11.17 at 16:41, <george.dunlap@citrix.com> wrote:
> --- a/SUPPORT.md
> +++ b/SUPPORT.md
> @@ -195,6 +195,27 @@ on embedded platforms.
>  
>  Enables NUMA aware scheduling in Xen
>  
> +## Scalability
> +
> +### 1GB/2MB super page support
> +
> +    Status, x86 HVM/PVH: : Supported

On top of what you and Julien have worked out here already: Don't
we need to clarify here that this is for HAP mode, while shadow mode
doesn't support 1GB guest pages (and doesn't use 2MB host pages)?

Jan
George Dunlap Nov. 21, 2017, 4:43 p.m. UTC | #4
On 11/16/2017 03:19 PM, Julien Grall wrote:
> Hi George,
> 
> On 13/11/17 15:41, George Dunlap wrote:
>> Superpage support and PVHVM.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>> ---
>> CC: Ian Jackson <ian.jackson@citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Konrad Wilk <konrad.wilk@oracle.com>
>> CC: Tim Deegan <tim@xen.org>
>> CC: Julien Grall <julien.grall@arm.com>
>> ---
>>   SUPPORT.md | 21 +++++++++++++++++++++
>>   1 file changed, 21 insertions(+)
>>
>> diff --git a/SUPPORT.md b/SUPPORT.md
>> index c884fac7f5..a8c56d13dd 100644
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
>> @@ -195,6 +195,27 @@ on embedded platforms.
>>     Enables NUMA aware scheduling in Xen
>>   +## Scalability
>> +
>> +### 1GB/2MB super page support
>> +
>> +    Status, x86 HVM/PVH: : Supported
>> +    Status, ARM: Supported
>> +
>> +NB that this refers to the ability of guests
>> +to have higher-level page table entries point directly to memory,
>> +improving TLB performance.
>> +This is independent of the ARM "page granularity" feature (see below).
> 
> I am not entirely sure about this paragraph for Arm. I understood this
> section as support for stage-2 page-table (aka EPT on x86) but the
> paragraph lead me to believe to it is for guest.
> 
> The size of super pages of guests will depend on the page granularity
> used by itself and the format of the page-table (e.g LPAE vs short
> descriptor). We have no control on that.
> 
> What we have control is the size of mapping used for stage-2 page-table.

Stepping back from the document for a minute: would it make sense to use
"hardware assisted paging" (HAP) for Intel EPT, AMD RVI (previously
NPT), and ARM stage-2 pagetables?  HAP was already a general term used
to describe the two x86 technologies; and I think the description makes
sense, because if we didn't have hardware-assisted stage 2 pagetables
we'd need Xen-provided shadow pagetables.

Back to the question at hand, there are four different things:

1. Whether Xen itself uses superpage mappings for its virtual address
space.  (Not sure if Xen does this or not.)

2. Whether Xen uses superpage mappings for HAP.  Xen uses this on x86
when hardware support is available -- I take it Xen does this on ARM as well?

3. Whether Xen provides the *interface* for a guest to use L2 or L3
superpages (for 4k page granularity, 2MiB or 1GiB respectively) in its
own pagetables.  I *think* HAP on x86 provides the interface whenever
the underlying hardware does.  I assume it's the same for ARM?  In the
case of shadow mode, we only provide the interface for 2MiB pagetables.

4. Whether a guest using L2 or L3 superpages will actually have
superpages, or whether it's "only emulated".  As Jan said, for shadow
pagetables on x86, the underlying pagetables still only have 4k pages,
so the guest will get no benefit from using L2 superpages in its
pagetables (either in terms of reduced memory reads on a TLB miss, or in
terms of larger effectiveness of each TLB entry).
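
The difference between points 3 and 4 on x86 can be sketched in a few lines of
Python. This is a toy model of the behaviour described in this thread only,
not Xen code: the names `Paging`, `guest_can_use` and `best_case_backing` are
hypothetical, `hw_sizes` stands in for whatever superpage sizes the hardware
offers, and the HAP result is a best case that ignores whether the stage-2
mapping (point 2) is itself a superpage.

```python
from enum import Enum


class Paging(Enum):
    """Hypothetical label for the paging mode of an x86 HVM guest."""
    HAP = "hap"
    SHADOW = "shadow"


KIB4, MIB2, GIB1 = 4 << 10, 2 << 20, 1 << 30


def guest_can_use(mode, size, hw_sizes):
    """Point 3: may the guest put an entry of this size in its pagetables?"""
    if size == KIB4:
        return True
    if mode is Paging.SHADOW:
        # Per the thread: shadow mode only offers the 2MiB interface.
        return size == MIB2
    # Per the thread: HAP offers whatever the underlying hardware does.
    return size in hw_sizes


def best_case_backing(mode, size, hw_sizes):
    """Point 4: largest real mapping that can back such a guest entry."""
    if not guest_can_use(mode, size, hw_sizes):
        raise ValueError("size not available to the guest in this mode")
    if mode is Paging.SHADOW:
        # Per the thread: the shadow pagetables only contain 4k pages.
        return KIB4
    return size
```

In this model a shadow-mode guest may write a 2MiB entry but gains nothing
from it, which is why the two questions are worth listing separately.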

#3 and #4 are probably the most pertinent to users, with #2 being next
on the list, and #1 being least.

Does that make sense?

 -George
Julien Grall Nov. 21, 2017, 5:31 p.m. UTC | #5
Hi George,

On 11/21/2017 04:43 PM, George Dunlap wrote:
> On 11/16/2017 03:19 PM, Julien Grall wrote:
>> On 13/11/17 15:41, George Dunlap wrote:
>>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>>> ---
>>> CC: Ian Jackson <ian.jackson@citrix.com>
>>> CC: Wei Liu <wei.liu2@citrix.com>
>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>> CC: Jan Beulich <jbeulich@suse.com>
>>> CC: Stefano Stabellini <sstabellini@kernel.org>
>>> CC: Konrad Wilk <konrad.wilk@oracle.com>
>>> CC: Tim Deegan <tim@xen.org>
>>> CC: Julien Grall <julien.grall@arm.com>
>>> ---
>>>    SUPPORT.md | 21 +++++++++++++++++++++
>>>    1 file changed, 21 insertions(+)
>>>
>>> diff --git a/SUPPORT.md b/SUPPORT.md
>>> index c884fac7f5..a8c56d13dd 100644
>>> --- a/SUPPORT.md
>>> +++ b/SUPPORT.md
>>> @@ -195,6 +195,27 @@ on embedded platforms.
>>>      Enables NUMA aware scheduling in Xen
>>>    +## Scalability
>>> +
>>> +### 1GB/2MB super page support
>>> +
>>> +    Status, x86 HVM/PVH: : Supported
>>> +    Status, ARM: Supported
>>> +
>>> +NB that this refers to the ability of guests
>>> +to have higher-level page table entries point directly to memory,
>>> +improving TLB performance.
>>> +This is independent of the ARM "page granularity" feature (see below).
>>
>> I am not entirely sure about this paragraph for Arm. I understood this
>> section as support for stage-2 page-table (aka EPT on x86) but the
>> paragraph lead me to believe to it is for guest.
>>
>> The size of super pages of guests will depend on the page granularity
>> used by itself and the format of the page-table (e.g LPAE vs short
>> descriptor). We have no control on that.
>>
>> What we have control is the size of mapping used for stage-2 page-table.
> 
> Stepping back from the document for a minute: would it make sense to use
> "hardware assisted paging" (HAP) for Intel EPT, AMD RVI (previously
> NPT), and ARM stage-2 pagetables?  HAP was already a general term used
> to describe the two x86 technologies; and I think the description makes
> sense, because if we didn't have hardware-assisted stage 2 pagetables
> we'd need Xen-provided shadow pagetables.

I think using the term "hardware assisted paging" should be fine to
refer to the 3 technologies.

> 
> Back to the question at hand, there are four different things:
> 
> 1. Whether Xen itself uses superpage mappings for its virtual address
> space.  (Not sure if Xen does this or not.)

Xen tries to use superpage mappings for itself whenever possible.

> 
> 2. Whether Xen uses superpage mappings for HAP.  Xen uses this on x86
> when hardware support is -- I take it Xen does this on ARM as well?

The superpage sizes supported depend on the page-table format
(short-descriptor vs LPAE) and the granularity used.

Supersections (16MB) are optional for short-descriptor, but mandatory when
the processor supports LPAE. LPAE is mandatory with virtualization, so
all superpage sizes are supported.

Note that stage-2 page-tables can only use the LPAE page-table format.

I would also rather avoid mentioning any superpage size for Arm in
SUPPORT.md, as there are a lot of them.

Short-descriptor always uses 4KB granularity and supports 16MB, 1MB and 64KB
mappings.

LPAE supports 4KB, 16KB and 64KB granularities, each with different
superpage sizes.
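
To see why the list of sizes grows so quickly, here is a minimal arithmetic
sketch in Python. It assumes 8-byte page-table entries (true of LPAE and of
x86-64 long mode, but not of the short-descriptor format), so a table that
fills one granule holds granule/8 entries. The helper names `entry_coverage`
and `human` are hypothetical, and the sketch only computes how much memory an
entry would span; it does not say whether the architecture allows a block
mapping at that level.

```python
DESCRIPTOR_BYTES = 8  # assumption: 64-bit entries (LPAE, x86-64 long mode)


def entry_coverage(granule_bytes, levels_above_leaf):
    """Bytes spanned by one entry sitting this many levels above the leaf."""
    entries_per_table = granule_bytes // DESCRIPTOR_BYTES
    return granule_bytes * entries_per_table ** levels_above_leaf


def human(n):
    """Format a byte count using the largest binary unit that divides it."""
    for unit, shift in (("TiB", 40), ("GiB", 30), ("MiB", 20), ("KiB", 10)):
        if n >= 1 << shift and n % (1 << shift) == 0:
            return f"{n >> shift}{unit}"
    return f"{n}B"


for granule in (4 << 10, 16 << 10, 64 << 10):
    sizes = [human(entry_coverage(granule, k)) for k in range(3)]
    print(human(granule), "granule:", ", ".join(sizes))
```

With a 4KB granule this gives the familiar 2MiB and 1GiB sizes; the 16KB and
64KB granules give entirely different ones, which is the reason for not
naming specific sizes for Arm.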

> 
> 3. Whether Xen provides the *interface* for a guest to use L2 or L3
> superpages (for 4k page granularity, 2MiB or 1GiB respectively) in its
> own pagetables.  I *think* HAP on x86 provides the interface whenever
> the underlying hardware does.  I assume it's the same for ARM?  In the
> case of shadow mode, we only provide the interface for 2MiB pagetables.

See above. We have no way to control that in the guest.

> 
> 4. Whether a guest using L2 or L3 superpages will actually have
> superpages, or whether it's "only emulated".  As Jan said, for shadow
> pagetables on x86, the underlying pagetables still only have 4k pages,
> so the guest will get no benefit from using L2 superpages in its
> pagetables (either in terms of reduced memory reads on a tlb miss, or in
> terms of larger effectiveness of each TLB entry).
> 
> #3 and #4 are probably the most pertinent to users, with #2 being next
> on the list, and #1 being least.
> 
> Does that make sense?

Cheers,
George Dunlap Nov. 21, 2017, 5:51 p.m. UTC | #6
On 11/21/2017 05:31 PM, Julien Grall wrote:
> Hi George,
> 
> On 11/21/2017 04:43 PM, George Dunlap wrote:
>> On 11/16/2017 03:19 PM, Julien Grall wrote:
>>> On 13/11/17 15:41, George Dunlap wrote:
>>>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>>>> ---
>>>> CC: Ian Jackson <ian.jackson@citrix.com>
>>>> CC: Wei Liu <wei.liu2@citrix.com>
>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> CC: Jan Beulich <jbeulich@suse.com>
>>>> CC: Stefano Stabellini <sstabellini@kernel.org>
>>>> CC: Konrad Wilk <konrad.wilk@oracle.com>
>>>> CC: Tim Deegan <tim@xen.org>
>>>> CC: Julien Grall <julien.grall@arm.com>
>>>> ---
>>>>    SUPPORT.md | 21 +++++++++++++++++++++
>>>>    1 file changed, 21 insertions(+)
>>>>
>>>> diff --git a/SUPPORT.md b/SUPPORT.md
>>>> index c884fac7f5..a8c56d13dd 100644
>>>> --- a/SUPPORT.md
>>>> +++ b/SUPPORT.md
>>>> @@ -195,6 +195,27 @@ on embedded platforms.
>>>>      Enables NUMA aware scheduling in Xen
>>>>    +## Scalability
>>>> +
>>>> +### 1GB/2MB super page support
>>>> +
>>>> +    Status, x86 HVM/PVH: : Supported
>>>> +    Status, ARM: Supported
>>>> +
>>>> +NB that this refers to the ability of guests
>>>> +to have higher-level page table entries point directly to memory,
>>>> +improving TLB performance.
>>>> +This is independent of the ARM "page granularity" feature (see below).
>>>
>>> I am not entirely sure about this paragraph for Arm. I understood this
>>> section as support for stage-2 page-table (aka EPT on x86) but the
>>> paragraph lead me to believe to it is for guest.
>>>
>>> The size of super pages of guests will depend on the page granularity
>>> used by itself and the format of the page-table (e.g LPAE vs short
>>> descriptor). We have no control on that.
>>>
>>> What we have control is the size of mapping used for stage-2 page-table.
>>
>> Stepping back from the document for a minute: would it make sense to use
>> "hardware assisted paging" (HAP) for Intel EPT, AMD RVI (previously
>> NPT), and ARM stage-2 pagetables?  HAP was already a general term used
>> to describe the two x86 technologies; and I think the description makes
>> sense, because if we didn't have hardware-assisted stage 2 pagetables
>> we'd need Xen-provided shadow pagetables.
> 
> I think using the term "hardware assisted paging" should be fine to
> refer the 3 technologies.

OK, great.

[snip]

> Short-descriptor is always using 4KB granularity supports 16MB, 1MB, 64KB
> 
> LPAE supports 4KB, 16KB, 64KB granularities. Each of them having
> different size of superpage.

Yes, that's why I started saying "L2 and L3 superpages", to mean
"Superpage entries in L2 or L3 pagetables", instead of 2MiB or 1GiB.
(Let me know if you can think of a better way to describe that.)

>> 3. Whether Xen provides the *interface* for a guest to use L2 or L3
>> superpages (for 4k page granularity, 2MiB or 1GiB respectively) in its
>> own pagetables.  I *think* HAP on x86 provides the interface whenever
>> the underlying hardware does.  I assume it's the same for ARM?  In the
>> case of shadow mode, we only provide the interface for 2MiB pagetables.
> 
> See above. We have no way to control that in the guest.

We don't control whether the guest uses *any* features.  Should we not
mention PV disks or SMMUv2 or whatever because we don't know if the
guest will use them?

Of course not.  This document describes whether the guest *has the
features available to use*, either provided by the hardware or emulated
by Xen.

It sounds like you may not have ever thought about whether an ARM guest
has L2 or L3 superpages available, because it's always had all of them;
but it's different on x86.

[snip]

>> 2. Whether Xen uses superpage mappings for HAP.  Xen uses this on x86
>> when hardware support is -- I take it Xen does this on ARM as well?
>
> The size of superpages supported will depend on the page-table format
> (short-descriptor vs LPAE) and the granularity used.
>
> Supersection (16MB) for short-descriptor is optional but mandatory when
> the processor support LPAE. LPAE is mandatory with virtualization. So
> all size of superpages are supported.
>
> Note that stage-2 page-tables can only use LPAE page-table.
>
> I would also rather avoid to mention any superpage size for Arm in
> SUPPORT.MD as there are a lot.

So it sounds like basically everything supported on native was supported
in virtualization (and under Xen) from the start, so it's probably less
important to mention.  But since we *will* need to do that for x86, we
probably need to say *something* in case people want to know.

Let me see what I can come up with.

 -George

Patch

diff --git a/SUPPORT.md b/SUPPORT.md
index c884fac7f5..a8c56d13dd 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -195,6 +195,27 @@  on embedded platforms.
 
 Enables NUMA aware scheduling in Xen
 
+## Scalability
+
+### 1GB/2MB super page support
+
+    Status, x86 HVM/PVH: : Supported
+    Status, ARM: Supported
+
+NB that this refers to the ability of guests
+to have higher-level page table entries point directly to memory,
+improving TLB performance.
+This is independent of the ARM "page granularity" feature (see below).
+
+### x86/PVHVM
+
+    Status: Supported
+
+This is a useful label for a set of hypervisor features
+which add paravirtualized functionality to HVM guests 
+for improved performance and scalability.
+This includes exposing event channels to HVM guests.
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.