Message ID: 1438966019-19322-13-git-send-email-julien.grall@citrix.com (mailing list archive)
State: New, archived
On Fri, 7 Aug 2015, Julien Grall wrote:
> For ARM64 guests, Linux is able to support either 64K or 4K page
> granularity. However, the hypercall interface is always based on 4K
> page granularity.
>
> With 64K page granularity, a single page will be spread over multiple
> Xen frames.
>
> To avoid splitting the page into 4K frames, take advantage of the
> extent_order field to directly allocate/free chunks of the Linux page
> size.
>
> Note that PVMMU is only used for PV guests (which are x86) and the page
> granularity is always 4KB. Some BUILD_BUG_ONs have been added to ensure
> that, because the code has not been modified.
>
> Signed-off-by: Julien Grall <julien.grall@citrix.com>

This is much better than the previous version. Good idea using the
extent_order field. I only have a minor comment below.

> ---
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
>
> Changes in v3:
>     - Fix errors reported by checkpatch.pl
>     - s/mfn/gfn/ based on the new naming
>     - Rather than splitting the page into 4KB chunks, use the
>       extent_order field to allocate a Linux page size directly. This
>       avoids lots of code for no benefit.
>
> Changes in v2:
>     - Use xen_apply_to_page to split a page in 4K chunks
>     - It's not necessary to have a smaller frame list. Re-use
>       PAGE_SIZE
>     - Convert reserve_additional_memory to use the XEN_... macro
> ---
>  drivers/xen/balloon.c | 47 ++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 36 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 9734649..c8739c8 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -70,6 +70,9 @@
>  #include <xen/features.h>
>  #include <xen/page.h>
>
> +/* Use one extent per PAGE_SIZE to avoid breaking it down into multiple frames */
> +#define EXTENT_ORDER (fls(XEN_PFN_PER_PAGE) - 1)
> +
>  /*
>   * balloon_process() state:
>   *
> @@ -230,6 +233,11 @@ static enum bp_state reserve_additional_memory(long credit)
>  	nid = memory_add_physaddr_to_nid(hotplug_start_paddr);
>
>  #ifdef CONFIG_XEN_HAVE_PVMMU
> +	/* We don't support PV MMU when Linux and Xen are using
> +	 * different page granularities.
> +	 */
> +	BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
> +
>  	/*
>  	 * add_memory() will build page tables for the new memory so
>  	 * the p2m must contain invalid entries so the correct
> @@ -326,11 +334,11 @@ static enum bp_state reserve_additional_memory(long credit)
>  static enum bp_state increase_reservation(unsigned long nr_pages)
>  {
>  	int rc;
> -	unsigned long pfn, i;
> +	unsigned long i;
>  	struct page *page;
>  	struct xen_memory_reservation reservation = {
>  		.address_bits = 0,
> -		.extent_order = 0,
> +		.extent_order = EXTENT_ORDER,
>  		.domid = DOMID_SELF
>  	};
>
> @@ -352,7 +360,11 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>  			nr_pages = i;
>  			break;
>  		}
> -		frame_list[i] = page_to_pfn(page);
> +
> +		/* XENMEM_populate_physmap requires a PFN based on Xen
> +		 * granularity.
> +		 */
> +		frame_list[i] = xen_page_to_pfn(page);
>  		page = balloon_next_page(page);
>  	}
>
> @@ -366,10 +378,15 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>  		page = balloon_retrieve(false);
>  		BUG_ON(page == NULL);
>
> -		pfn = page_to_pfn(page);
> -
>  #ifdef CONFIG_XEN_HAVE_PVMMU
> +		/* We don't support PV MMU when Linux and Xen are using
> +		 * different page granularities.
> +		 */
> +		BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
> +
>  		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> +			unsigned long pfn = page_to_pfn(page);
> +
>  			set_phys_to_machine(pfn, frame_list[i]);
>
>  			/* Link back into the page tables if not highmem. */
> @@ -396,14 +413,15 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>  static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>  {
>  	enum bp_state state = BP_DONE;
> -	unsigned long pfn, i;
> +	unsigned long i;
>  	struct page *page;
>  	int ret;
>  	struct xen_memory_reservation reservation = {
>  		.address_bits = 0,
> -		.extent_order = 0,
> +		.extent_order = EXTENT_ORDER,
>  		.domid = DOMID_SELF
>  	};
> +	static struct page *pages[ARRAY_SIZE(frame_list)];

This array can be rather large: I would try to avoid it, see below.

>  #ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
>  	if (balloon_stats.hotplug_pages) {
> @@ -426,7 +444,9 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>  		}
>  		scrub_page(page);
>
> -		frame_list[i] = page_to_pfn(page);
> +		/* XENMEM_decrease_reservation requires a GFN */
> +		frame_list[i] = xen_page_to_gfn(page);
> +		pages[i] = page;
>  	}
>
>  	/*
> @@ -440,12 +460,17 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>
>  	/* Update direct mapping, invalidate P2M, and add to balloon. */
>  	for (i = 0; i < nr_pages; i++) {
> -		pfn = frame_list[i];
> -		frame_list[i] = pfn_to_gfn(pfn);
> -		page = pfn_to_page(pfn);
> +		page = pages[i];
>
>  #ifdef CONFIG_XEN_HAVE_PVMMU
> +		/* We don't support PV MMU when Linux and Xen are using
> +		 * different page granularities.
> +		 */
> +		BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
> +
>  		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> +			unsigned long pfn = page_to_pfn(page);

I would simplify and avoid introducing a new array:

    pfn = (frame_list[i] << XEN_PAGE_SHIFT) >> PAGE_SHIFT;
    page = pfn_to_page(pfn);

>  			if (!PageHighMem(page)) {
>  				ret = HYPERVISOR_update_va_mapping(
>  						(unsigned long)__va(pfn << PAGE_SHIFT),
> --
> 2.1.4
Hi Stefano,

On 10/08/15 12:18, Stefano Stabellini wrote:
>> 			/* Link back into the page tables if not highmem. */
>> @@ -396,14 +413,15 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
>>  static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>>  {
>>  	enum bp_state state = BP_DONE;
>> -	unsigned long pfn, i;
>> +	unsigned long i;
>>  	struct page *page;
>>  	int ret;
>>  	struct xen_memory_reservation reservation = {
>>  		.address_bits = 0,
>> -		.extent_order = 0,
>> +		.extent_order = EXTENT_ORDER,
>>  		.domid = DOMID_SELF
>>  	};
>> +	static struct page *pages[ARRAY_SIZE(frame_list)];
>
> This array can be rather large: I would try to avoid it, see below.

[..]

> I would simplify and avoid introducing a new array:
>
>     pfn = (frame_list[i] << XEN_PAGE_SHIFT) >> PAGE_SHIFT;
>     page = pfn_to_page(pfn);

Which won't work, because frame_list contains a gfn and not a pfn. We
would need to translate the gfn back into a pfn and then into a page.

The cost of that translation may be big, and I wanted to avoid any more
XEN_PAGE_SHIFT in the code. In general we should avoid dealing with 4KB
PFNs when it's not necessary; it makes the code more confusing to read.

If your only concern is the size of the array, we could decrease the
number of frames per batch, or allocate the variable once at boot time.

Regards,
On Mon, 10 Aug 2015, Julien Grall wrote:
> Hi Stefano,
>
> On 10/08/15 12:18, Stefano Stabellini wrote:
> >> 			/* Link back into the page tables if not highmem. */
> >> @@ -396,14 +413,15 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
> >>  static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
> >>  {
> >>  	enum bp_state state = BP_DONE;
> >> -	unsigned long pfn, i;
> >> +	unsigned long i;
> >>  	struct page *page;
> >>  	int ret;
> >>  	struct xen_memory_reservation reservation = {
> >>  		.address_bits = 0,
> >> -		.extent_order = 0,
> >> +		.extent_order = EXTENT_ORDER,
> >>  		.domid = DOMID_SELF
> >>  	};
> >> +	static struct page *pages[ARRAY_SIZE(frame_list)];
> >
> > This array can be rather large: I would try to avoid it, see below.
>
> [..]
>
> > I would simplify and avoid introducing a new array:
> >
> >     pfn = (frame_list[i] << XEN_PAGE_SHIFT) >> PAGE_SHIFT;
> >     page = pfn_to_page(pfn);
>
> Which won't work, because frame_list contains a gfn and not a pfn. We
> would need to translate the gfn back into a pfn and then into a page.
>
> The cost of that translation may be big, and I wanted to avoid any more
> XEN_PAGE_SHIFT in the code. In general we should avoid dealing with 4KB
> PFNs when it's not necessary; it makes the code more confusing to read.

That is true.

> If your only concern is the size of the array, we could decrease the
> number of frames per batch, or allocate the variable once at boot time.

Yes, that is my only concern. Allocating only nr_pages new struct page*
would be good enough I guess.
On 10/08/15 13:55, Stefano Stabellini wrote:
>> If your only concern is the size of the array, we could decrease the
>> number of frames per batch, or allocate the variable once at boot time.
>
> Yes, that is my only concern. Allocating only nr_pages new struct page*
> would be good enough I guess.

That would be even worse. We shouldn't allocate the array on every call,
but at boot time.

Note that frame_list is already a static variable that uses 64KB when
64KB pages are used. It is unlikely that we will remove that many frames
in a single batch, but I will keep this optimization for later.

Anyway, I'm wondering if we could re-use the lru field to link the pages
when allocating them, and retrieve them in the second loop, in order to
avoid the pages array.

Regards,
On 07/08/15 17:46, Julien Grall wrote:
> For ARM64 guests, Linux is able to support either 64K or 4K page
> granularity. However, the hypercall interface is always based on 4K
> page granularity.
>
> With 64K page granularity, a single page will be spread over multiple
> Xen frames.
>
> To avoid splitting the page into 4K frames, take advantage of the
> extent_order field to directly allocate/free chunks of the Linux page
> size.
>
> Note that PVMMU is only used for PV guests (which are x86) and the page
> granularity is always 4KB. Some BUILD_BUG_ONs have been added to ensure
> that, because the code has not been modified.
[...]
>  #ifdef CONFIG_XEN_HAVE_PVMMU
> +	/* We don't support PV MMU when Linux and Xen are using
> +	 * different page granularities.
> +	 */
> +	BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);

You don't need this BUILD_BUG_ON() twice.

Otherwise,

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David
On 20/08/15 10:59, David Vrabel wrote:
> On 07/08/15 17:46, Julien Grall wrote:
>> For ARM64 guests, Linux is able to support either 64K or 4K page
>> granularity. However, the hypercall interface is always based on 4K
>> page granularity.
>>
>> With 64K page granularity, a single page will be spread over multiple
>> Xen frames.
>>
>> To avoid splitting the page into 4K frames, take advantage of the
>> extent_order field to directly allocate/free chunks of the Linux page
>> size.
>>
>> Note that PVMMU is only used for PV guests (which are x86) and the page
>> granularity is always 4KB. Some BUILD_BUG_ONs have been added to ensure
>> that, because the code has not been modified.
> [...]
>>  #ifdef CONFIG_XEN_HAVE_PVMMU
>> +	/* We don't support PV MMU when Linux and Xen are using
>> +	 * different page granularities.
>> +	 */
>> +	BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
>
> You don't need this BUILD_BUG_ON() twice.

I put the BUILD_BUG_ON twice so that if we ever decide to drop one of
the #ifdef CONFIG_XEN_HAVE_PVMMU blocks, the check will still be
present. So I'd like to keep it.

> Otherwise,
>
> Reviewed-by: David Vrabel <david.vrabel@citrix.com>

Based on the discussion with Stefano, I reworked the balloon code a bit
to re-use the lru field (see [1]), so I won't retain your Reviewed-by on
the next series.

Regards,

[1] http://lists.xen.org/archives/html/xen-devel/2015-08/msg00781.html
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 9734649..c8739c8 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -70,6 +70,9 @@
 #include <xen/features.h>
 #include <xen/page.h>

+/* Use one extent per PAGE_SIZE to avoid breaking it down into multiple frames */
+#define EXTENT_ORDER (fls(XEN_PFN_PER_PAGE) - 1)
+
 /*
  * balloon_process() state:
  *
@@ -230,6 +233,11 @@ static enum bp_state reserve_additional_memory(long credit)
 	nid = memory_add_physaddr_to_nid(hotplug_start_paddr);

 #ifdef CONFIG_XEN_HAVE_PVMMU
+	/* We don't support PV MMU when Linux and Xen are using
+	 * different page granularities.
+	 */
+	BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
+
 	/*
 	 * add_memory() will build page tables for the new memory so
 	 * the p2m must contain invalid entries so the correct
@@ -326,11 +334,11 @@ static enum bp_state reserve_additional_memory(long credit)
 static enum bp_state increase_reservation(unsigned long nr_pages)
 {
 	int rc;
-	unsigned long pfn, i;
+	unsigned long i;
 	struct page *page;
 	struct xen_memory_reservation reservation = {
 		.address_bits = 0,
-		.extent_order = 0,
+		.extent_order = EXTENT_ORDER,
 		.domid = DOMID_SELF
 	};

@@ -352,7 +360,11 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 			nr_pages = i;
 			break;
 		}
-		frame_list[i] = page_to_pfn(page);
+
+		/* XENMEM_populate_physmap requires a PFN based on Xen
+		 * granularity.
+		 */
+		frame_list[i] = xen_page_to_pfn(page);
 		page = balloon_next_page(page);
 	}

@@ -366,10 +378,15 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 		page = balloon_retrieve(false);
 		BUG_ON(page == NULL);

-		pfn = page_to_pfn(page);
-
 #ifdef CONFIG_XEN_HAVE_PVMMU
+		/* We don't support PV MMU when Linux and Xen are using
+		 * different page granularities.
+		 */
+		BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
+
 		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+			unsigned long pfn = page_to_pfn(page);
+
 			set_phys_to_machine(pfn, frame_list[i]);

 			/* Link back into the page tables if not highmem. */
@@ -396,14 +413,15 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 {
 	enum bp_state state = BP_DONE;
-	unsigned long pfn, i;
+	unsigned long i;
 	struct page *page;
 	int ret;
 	struct xen_memory_reservation reservation = {
 		.address_bits = 0,
-		.extent_order = 0,
+		.extent_order = EXTENT_ORDER,
 		.domid = DOMID_SELF
 	};
+	static struct page *pages[ARRAY_SIZE(frame_list)];

 #ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
 	if (balloon_stats.hotplug_pages) {
@@ -426,7 +444,9 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 		}
 		scrub_page(page);

-		frame_list[i] = page_to_pfn(page);
+		/* XENMEM_decrease_reservation requires a GFN */
+		frame_list[i] = xen_page_to_gfn(page);
+		pages[i] = page;
 	}

 	/*
@@ -440,12 +460,17 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)

 	/* Update direct mapping, invalidate P2M, and add to balloon. */
 	for (i = 0; i < nr_pages; i++) {
-		pfn = frame_list[i];
-		frame_list[i] = pfn_to_gfn(pfn);
-		page = pfn_to_page(pfn);
+		page = pages[i];

 #ifdef CONFIG_XEN_HAVE_PVMMU
+		/* We don't support PV MMU when Linux and Xen are using
+		 * different page granularities.
+		 */
+		BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
+
 		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+			unsigned long pfn = page_to_pfn(page);
+
 			if (!PageHighMem(page)) {
 				ret = HYPERVISOR_update_va_mapping(
 						(unsigned long)__va(pfn << PAGE_SHIFT),
For ARM64 guests, Linux is able to support either 64K or 4K page
granularity. However, the hypercall interface is always based on 4K
page granularity.

With 64K page granularity, a single page will be spread over multiple
Xen frames.

To avoid splitting the page into 4K frames, take advantage of the
extent_order field to directly allocate/free chunks of the Linux page
size.

Note that PVMMU is only used for PV guests (which are x86) and the page
granularity is always 4KB. Some BUILD_BUG_ONs have been added to ensure
that, because the code has not been modified.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
---
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>

Changes in v3:
    - Fix errors reported by checkpatch.pl
    - s/mfn/gfn/ based on the new naming
    - Rather than splitting the page into 4KB chunks, use the
      extent_order field to allocate a Linux page size directly. This
      avoids lots of code for no benefit.

Changes in v2:
    - Use xen_apply_to_page to split a page in 4K chunks
    - It's not necessary to have a smaller frame list. Re-use
      PAGE_SIZE
    - Convert reserve_additional_memory to use the XEN_... macro
---
 drivers/xen/balloon.c | 47 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 36 insertions(+), 11 deletions(-)