diff mbox

[v2,02/20] xen: Introduce a function to split a Linux page into Xen page

Message ID 1436474552-31789-3-git-send-email-julien.grall@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Julien Grall July 9, 2015, 8:42 p.m. UTC
The Xen interface always uses 4KB pages. This means that a Linux page
may be split across multiple Xen pages when the page granularities
differ.

This helper breaks a Linux page down into 4KB chunks and calls the
given callback on each of them.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
    Changes in v2:
        - Patch added
---
 include/xen/page.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Stefano Stabellini July 16, 2015, 2:23 p.m. UTC | #1
On Thu, 9 Jul 2015, Julien Grall wrote:
> The Xen interface is always using 4KB page. This means that a Linux page
> may be split across multiple Xen page when the page granularity is not
> the same.
> 
> This helper will break down a Linux page into 4KB chunk and call the
> helper on each of them.
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> ---
>     Changes in v2:
>         - Patch added
> ---
>  include/xen/page.h | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/include/xen/page.h b/include/xen/page.h
> index 8ebd37b..b1f7722 100644
> --- a/include/xen/page.h
> +++ b/include/xen/page.h
> @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
>  
>  extern unsigned long xen_released_pages;
>  
> +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data);
> +/* Break down the page in 4KB granularity and call fn foreach xen pfn */
> +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn,
> +				    void *data)
> +{
> +	unsigned long pfn = xen_page_to_pfn(page);
> +	int i;
> +	int ret;

please initialize ret to 0


> +	for (i = 0; i < XEN_PFN_PER_PAGE; i++, pfn++) {
> +		ret = fn(page, pfn, data);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return ret;
> +}
> +
> +
>  #endif	/* _XEN_PAGE_H */


Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Julien Grall July 16, 2015, 2:54 p.m. UTC | #2
Hi Stefano,

On 16/07/2015 16:23, Stefano Stabellini wrote:
>> diff --git a/include/xen/page.h b/include/xen/page.h
>> index 8ebd37b..b1f7722 100644
>> --- a/include/xen/page.h
>> +++ b/include/xen/page.h
>> @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
>>
>>   extern unsigned long xen_released_pages;
>>
>> +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data);
>> +/* Break down the page in 4KB granularity and call fn foreach xen pfn */
>> +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn,
>> +				    void *data)
>> +{
>> +	unsigned long pfn = xen_page_to_pfn(page);
>> +	int i;
>> +	int ret;
>
> please initialize ret to 0

Hmmm... right. I'm not sure why the compiler didn't catch it.

>
>> +	for (i = 0; i < XEN_PFN_PER_PAGE; i++, pfn++) {
>> +		ret = fn(page, pfn, data);
>> +		if (ret)
>> +			return ret;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +
>>   #endif	/* _XEN_PAGE_H */
>
>
> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Thank you,
David Vrabel July 24, 2015, 9:31 a.m. UTC | #3
On 09/07/15 21:42, Julien Grall wrote:
> The Xen interface is always using 4KB page. This means that a Linux page
> may be split across multiple Xen page when the page granularity is not
> the same.
> 
> This helper will break down a Linux page into 4KB chunk and call the
> helper on each of them.
[...]
> --- a/include/xen/page.h
> +++ b/include/xen/page.h
> @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
>  
>  extern unsigned long xen_released_pages;
>  
> +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data);
> +
> +/* Break down the page in 4KB granularity and call fn foreach xen pfn */
> +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn,
> +				    void *data)

I think this should be outlined (unless you have measurements that
support making it inlined).

Also perhaps make it

int xen_for_each_gfn(struct page *page,
                     xen_gfn_fn_t fn, void *data);

or

int xen_for_each_gfn(struct page **page, unsigned int count,
                     xen_gfn_fn_t fn, void *data);

?

David
Julien Grall July 24, 2015, 9:54 a.m. UTC | #4
On 24/07/15 10:31, David Vrabel wrote:
> On 09/07/15 21:42, Julien Grall wrote:
>> The Xen interface is always using 4KB page. This means that a Linux page
>> may be split across multiple Xen page when the page granularity is not
>> the same.
>>
>> This helper will break down a Linux page into 4KB chunk and call the
>> helper on each of them.
> [...]
>> --- a/include/xen/page.h
>> +++ b/include/xen/page.h
>> @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
>>  
>>  extern unsigned long xen_released_pages;
>>  
>> +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data);
>> +
>> +/* Break down the page in 4KB granularity and call fn foreach xen pfn */
>> +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn,
>> +				    void *data)
> 
> I think this should be outlined (unless you have measurements that
> support making it inlined).

I don't have any performance measurements. However, when Linux uses 4KB
page granularity, the loop in this helper will be optimized away by the
compiler. The code would reduce to:

unsigned long pfn = xen_page_to_pfn(page);

ret = fn(page, pfn, data);
if (ret)
  return ret;

The compiler could even inline the callback (fn), so it drops two
function calls.

> 
> Also perhaps make it
> 
> int xen_for_each_gfn(struct page *page,
>                      xen_gfn_fn_t fn, void *data);

gfn standing for Guest Frame Number right?

> or
> 
> int xen_for_each_gfn(struct page **page, unsigned int count,
>                      xen_gfn_fn_t fn, void *data);

count being the number of Linux pages or Xen pages? We have some code
(see the xlate_mmu changes in patch #19) that needs to iterate over a
specific number of Xen pages. I was thinking of introducing a separate
function for iterating over a specific number of Xen PFNs.

We don't want to introduce it for everyone, as we need to hide this
complexity from most of the callers.

In the general case, the two suggestions would not be very useful. Most
of the time we have some action to perform per Linux page (see the
balloon code for instance).

Regards,
David Vrabel July 24, 2015, 10:10 a.m. UTC | #5
On 24/07/15 10:54, Julien Grall wrote:
> On 24/07/15 10:31, David Vrabel wrote:
>> On 09/07/15 21:42, Julien Grall wrote:
>>> The Xen interface is always using 4KB page. This means that a Linux page
>>> may be split across multiple Xen page when the page granularity is not
>>> the same.
>>>
>>> This helper will break down a Linux page into 4KB chunk and call the
>>> helper on each of them.
>> [...]
>>> --- a/include/xen/page.h
>>> +++ b/include/xen/page.h
>>> @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
>>>  
>>>  extern unsigned long xen_released_pages;
>>>  
>>> +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data);
>>> +
>>> +/* Break down the page in 4KB granularity and call fn foreach xen pfn */
>>> +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn,
>>> +				    void *data)
>>
>> I think this should be outlined (unless you have measurements that
>> support making it inlined).
> 
> I don't have any performance measurements. Although, when Linux is using
> 4KB page granularity, the loop in this helper will be dropped by the
> helper. The code would look like:
> 
> unsigned long pfn = xen_page_to_pfn(page);
> 
> ret = fn(page, pfn, data);
> if (ret)
>   return ret;
> 
> The compiler could even inline the callback (fn). So it drops 2
> functions call.

Ok, keep it inlined.

>> Also perhaps make it
>>
>> int xen_for_each_gfn(struct page *page,
>>                      xen_gfn_fn_t fn, void *data);
> 
> gfn standing for Guest Frame Number right?

Yes.  This suggestion is just changing the name to make it more obvious
what it does.

David
Julien Grall July 24, 2015, 10:20 a.m. UTC | #6
On 24/07/15 11:10, David Vrabel wrote:
> On 24/07/15 10:54, Julien Grall wrote:
>> On 24/07/15 10:31, David Vrabel wrote:
>>> On 09/07/15 21:42, Julien Grall wrote:
>>>> The Xen interface is always using 4KB page. This means that a Linux page
>>>> may be split across multiple Xen page when the page granularity is not
>>>> the same.
>>>>
>>>> This helper will break down a Linux page into 4KB chunk and call the
>>>> helper on each of them.
>>> [...]
>>>> --- a/include/xen/page.h
>>>> +++ b/include/xen/page.h
>>>> @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
>>>>  
>>>>  extern unsigned long xen_released_pages;
>>>>  
>>>> +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data);
>>>> +
>>>> +/* Break down the page in 4KB granularity and call fn foreach xen pfn */
>>>> +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn,
>>>> +				    void *data)
>>>
>>> I think this should be outlined (unless you have measurements that
>>> support making it inlined).
>>
>> I don't have any performance measurements. Although, when Linux is using
>> 4KB page granularity, the loop in this helper will be dropped by the
>> helper. The code would look like:
>>
>> unsigned long pfn = xen_page_to_pfn(page);
>>
>> ret = fn(page, pfn, data);
>> if (ret)
>>   return ret;
>>
>> The compiler could even inline the callback (fn). So it drops 2
>> functions call.
> 
> Ok, keep it inlined.
> 
>>> Also perhaps make it
>>>
>>> int xen_for_each_gfn(struct page *page,
>>>                      xen_gfn_fn_t fn, void *data);
>>
>> gfn standing for Guest Frame Number right?
> 
> Yes.  This suggestion is just changing the name to make it more obvious
> what it does.

I will change the name.

Regards,
Julien Grall Aug. 5, 2015, 2:30 p.m. UTC | #7
Hi David,

On 24/07/15 11:10, David Vrabel wrote:
> On 24/07/15 10:54, Julien Grall wrote:
>> On 24/07/15 10:31, David Vrabel wrote:
>>> On 09/07/15 21:42, Julien Grall wrote:
>>>> The Xen interface is always using 4KB page. This means that a Linux page
>>>> may be split across multiple Xen page when the page granularity is not
>>>> the same.
>>>>
>>>> This helper will break down a Linux page into 4KB chunk and call the
>>>> helper on each of them.
>>> [...]
>>>> --- a/include/xen/page.h
>>>> +++ b/include/xen/page.h
>>>> @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
>>>>  
>>>>  extern unsigned long xen_released_pages;
>>>>  
>>>> +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data);
>>>> +
>>>> +/* Break down the page in 4KB granularity and call fn foreach xen pfn */
>>>> +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn,
>>>> +				    void *data)
>>>
>>> I think this should be outlined (unless you have measurements that
>>> support making it inlined).
>>
>> I don't have any performance measurements. Although, when Linux is using
>> 4KB page granularity, the loop in this helper will be dropped by the
>> helper. The code would look like:
>>
>> unsigned long pfn = xen_page_to_pfn(page);
>>
>> ret = fn(page, pfn, data);
>> if (ret)
>>   return ret;
>>
>> The compiler could even inline the callback (fn). So it drops 2
>> functions call.
> 
> Ok, keep it inlined.
> 
>>> Also perhaps make it
>>>
>>> int xen_for_each_gfn(struct page *page,
>>>                      xen_gfn_fn_t fn, void *data);
>>
>> gfn standing for Guest Frame Number right?
> 
> Yes.  This suggestion is just changing the name to make it more obvious
> what it does.

Thinking more about this suggestion: the callback (fn) receives a 4K
PFN as parameter, not a GFN.

This is because the balloon code seems to require having a 4K PFN in
hand in a few places, for instance for XENMEM_populate_physmap and
HYPERVISOR_update_va_mapping.

That said, I'm not sure I understand the difference between GMFN and
GPFN in the hypercall documentation.

Regards,
David Vrabel Aug. 5, 2015, 3:50 p.m. UTC | #8
On 05/08/15 15:30, Julien Grall wrote:
> Hi David,
> 
> On 24/07/15 11:10, David Vrabel wrote:
>> On 24/07/15 10:54, Julien Grall wrote:
>>> On 24/07/15 10:31, David Vrabel wrote:
>>>> On 09/07/15 21:42, Julien Grall wrote:
>>>>> The Xen interface is always using 4KB page. This means that a Linux page
>>>>> may be split across multiple Xen page when the page granularity is not
>>>>> the same.
>>>>>
>>>>> This helper will break down a Linux page into 4KB chunk and call the
>>>>> helper on each of them.
>>>> [...]
>>>>> --- a/include/xen/page.h
>>>>> +++ b/include/xen/page.h
>>>>> @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
>>>>>  
>>>>>  extern unsigned long xen_released_pages;
>>>>>  
>>>>> +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data);
>>>>> +
>>>>> +/* Break down the page in 4KB granularity and call fn foreach xen pfn */
>>>>> +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn,
>>>>> +				    void *data)
>>>>
>>>> I think this should be outlined (unless you have measurements that
>>>> support making it inlined).
>>>
>>> I don't have any performance measurements. Although, when Linux is using
>>> 4KB page granularity, the loop in this helper will be dropped by the
>>> helper. The code would look like:
>>>
>>> unsigned long pfn = xen_page_to_pfn(page);
>>>
>>> ret = fn(page, pfn, data);
>>> if (ret)
>>>   return ret;
>>>
>>> The compiler could even inline the callback (fn). So it drops 2
>>> functions call.
>>
>> Ok, keep it inlined.
>>
>>>> Also perhaps make it
>>>>
>>>> int xen_for_each_gfn(struct page *page,
>>>>                      xen_gfn_fn_t fn, void *data);
>>>
>>> gfn standing for Guest Frame Number right?
>>
>> Yes.  This suggestion is just changing the name to make it more obvious
>> what it does.
> 
> Thinking more about this suggestion. The callback (fn) is getting a 4K
> PFN in parameter and not a GFN.

I would like only APIs that deal with 64 KiB PFNs and 4 KiB GFNs.  I
think having a 4 KiB "PFN" is confusing.

Can you rework this xen_for_each_gfn() to pass GFNs to fn, instead?

> This is because the balloon code seems to require having a 4K PFN in
> hand in few places. For instance XENMEM_populate_physmap and
> HYPERVISOR_update_va_mapping.

Ugh. For an auto-xlate guest the frame-list needs GFNs; for a PV guest
XENMEM_populate_physmap does want PFNs (so it can fill in the M2P).

Perhaps in increase_reservation:

if (auto-xlate)
   frame_list[i] = page_to_gfn(page);
   /* Or whatever per-GFN loop you need. */
else
   frame_list[i] = page_to_pfn(page);

update_va_mapping takes VAs, e.g. __va(pfn << PAGE_SHIFT), which could
be written as page_to_virt(page).

Sorry for being so picky here, but the inconsistency of terminology and
API misuse is already confusing and I don't want to see it get worse.

David

> 
> Although, I'm not sure to understand the difference between GMFN, and
> GPFN in the hypercall doc.
> 
> Regards,
>
Julien Grall Aug. 5, 2015, 4:06 p.m. UTC | #9
On 05/08/15 16:50, David Vrabel wrote:
>>>>> Also perhaps make it
>>>>>
>>>>> int xen_for_each_gfn(struct page *page,
>>>>>                      xen_gfn_fn_t fn, void *data);
>>>>
>>>> gfn standing for Guest Frame Number right?
>>>
>>> Yes.  This suggestion is just changing the name to make it more obvious
>>> what it does.
>>
>> Thinking more about this suggestion. The callback (fn) is getting a 4K
>> PFN in parameter and not a GFN.
> 
> I would like only APIs that deal with 64 KiB PFNs and 4 KiB GFNs.  I
> think having a 4 KiB "PFN" is confusing.

I agree with that. Note that helpers splitting a Linux page into 4K
chunks, such as gnttab_for_each_grant (see patch #3) and this one, may
still use the concept of a 4K "PFN" for internal purposes.

> Can you rework this xen_for_each_gfn() to pass GFNs to fn, instead?

I will do.

> 
>> This is because the balloon code seems to require having a 4K PFN in
>> hand in few places. For instance XENMEM_populate_physmap and
>> HYPERVISOR_update_va_mapping.
> 
> Ug. For an auto-xlate guest frame-list needs GFNs, for a PV guest
> XENMEM_populate_physmap does want PFNs (so it can fill in the M2P).
> 
> Perhaps in increase_reservation:
> 
> if (auto-xlate)
>    frame_list[i] = page_to_gfn(page);
>    /* Or whatever per-GFN loop you need. */
> else
>    frame_list[i] = page_to_pfn(page);
> 
> update_va_mapping takes VAs (e.g, __va(pfn << PAGE_SHIFT) could be
> page_to_virt(page).

I thought about a similar approach, but I wanted to keep the code
generic, i.e. support the splitting even for non-auto-translated
guests. Hence the xen_apply_to_page implementation.

Anyway, I will look into doing what you suggest.

> Sorry for being so picky here, but the inconsistency of terminology and
> API misuse is already confusing and I don't want to see it get worse.

No worries, I'm happy to rework this code so we can drop the 4K PFN
concept.

Regards,
diff mbox

Patch

diff --git a/include/xen/page.h b/include/xen/page.h
index 8ebd37b..b1f7722 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -39,4 +39,24 @@  struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
 
 extern unsigned long xen_released_pages;
 
+typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data);
+
+/* Break down the page in 4KB granularity and call fn foreach xen pfn */
+static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn,
+				    void *data)
+{
+	unsigned long pfn = xen_page_to_pfn(page);
+	int i;
+	int ret;
+
+	for (i = 0; i < XEN_PFN_PER_PAGE; i++, pfn++) {
+		ret = fn(page, pfn, data);
+		if (ret)
+			return ret;
+	}
+
+	return ret;
+}
+
+
 #endif	/* _XEN_PAGE_H */