Message ID | 20250404133459.16125-1-roger.pau@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v2] x86/xen: fix balloon target initialization for PVH dom0 | expand |
On 04.04.25 15:34, Roger Pau Monne wrote: > PVH dom0 re-uses logic from PV dom0, in which RAM ranges not assigned to > dom0 are re-used as scratch memory to map foreign and grant pages. Such > logic relies on reporting those unpopulated ranges as RAM to Linux, and > mark them as reserved. This way Linux creates the underlying page > structures required for metadata management. > > Such approach works fine on PV because the initial balloon target is > calculated using specific Xen data, that doesn't take into account the > memory type changes described above. However on HVM and PVH the initial > balloon target is calculated using get_num_physpages(), and that function > does take into account the unpopulated RAM regions used as scratch space > for remote domain mappings. > > This leads to PVH dom0 having an incorrect initial balloon target, which > causes malfunction (excessive memory freeing) of the balloon driver if the > dom0 memory target is later adjusted from the toolstack. > > Fix this by using xen_released_pages to account for any pages that are part > of the memory map, but are already unpopulated when the balloon driver is > initialized. This accounts for any regions used for scratch remote > mappings. > > Take the opportunity to unify PV with PVH/HVM guests regarding the usage of > get_num_physpages(), as that avoids having to add different logic for PV vs > PVH in both balloon_add_regions() and arch_xen_unpopulated_init(). > > Much like a6aa4eb994ee, the code in this changeset should have been part of > 38620fc4e893. > > Fixes: a6aa4eb994ee ('xen/x86: add extra pages to unpopulated-alloc if available') > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Juergen
On 4/4/25 06:34, Roger Pau Monne wrote: > Much like a6aa4eb994ee, the code in this changeset should have been part of > 38620fc4e893. > > Fixes: a6aa4eb994ee ('xen/x86: add extra pages to unpopulated-alloc if available') > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> I don't see a cc:stable@ on there. Was there a reason for that or did you just leave it off by accident?
On 04.04.25 20:20, Dave Hansen wrote: > On 4/4/25 06:34, Roger Pau Monne wrote: >> Much like a6aa4eb994ee, the code in this changeset should have been part of >> 38620fc4e893. >> >> Fixes: a6aa4eb994ee ('xen/x86: add extra pages to unpopulated-alloc if available') >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > I don't see a cc:stable@ on there. Was there a reason for that or did > you just leave it off by accident? Thanks for noticing. I'll add it when committing. Juergen
Hi Roger, kernel test robot noticed the following build errors: [auto build test ERROR on xen-tip/linux-next] [also build test ERROR on tip/x86/core linus/master v6.14 next-20250404] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Roger-Pau-Monne/x86-xen-fix-balloon-target-initialization-for-PVH-dom0/20250404-214218 base: https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git linux-next patch link: https://lore.kernel.org/r/20250404133459.16125-1-roger.pau%40citrix.com patch subject: [PATCH v2] x86/xen: fix balloon target initialization for PVH dom0 config: x86_64-buildonly-randconfig-001-20250405 (https://download.01.org/0day-ci/archive/20250405/202504052022.avvx45LI-lkp@intel.com/config) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250405/202504052022.avvx45LI-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202504052022.avvx45LI-lkp@intel.com/ All errors (new ones prefixed by >>): ld: drivers/xen/balloon.o: in function `balloon_init': >> drivers/xen/balloon.c:727: undefined reference to `xen_released_pages' vim +727 drivers/xen/balloon.c 716 717 static int __init balloon_init(void) 718 { 719 struct task_struct *task; 720 int rc; 721 722 if (!xen_domain()) 723 return -ENODEV; 724 725 pr_info("Initialising balloon driver\n"); 726 > 727 if (xen_released_pages >= get_num_physpages()) { 728 WARN(1, "Released pages underflow current target"); 729 return -ERANGE; 730 } 731 732 balloon_stats.current_pages = get_num_physpages() - xen_released_pages; 733 balloon_stats.target_pages = balloon_stats.current_pages; 734 balloon_stats.balloon_low = 0; 735 balloon_stats.balloon_high = 0; 736 balloon_stats.total_pages = balloon_stats.current_pages; 737 738 balloon_stats.schedule_delay = 1; 739 balloon_stats.max_schedule_delay = 32; 740 balloon_stats.retry_count = 1; 741 balloon_stats.max_retry_count = 4; 742
Hi Roger, kernel test robot noticed the following build errors: [auto build test ERROR on xen-tip/linux-next] [also build test ERROR on tip/x86/core linus/master v6.14 next-20250404] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Roger-Pau-Monne/x86-xen-fix-balloon-target-initialization-for-PVH-dom0/20250404-214218 base: https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git linux-next patch link: https://lore.kernel.org/r/20250404133459.16125-1-roger.pau%40citrix.com patch subject: [PATCH v2] x86/xen: fix balloon target initialization for PVH dom0 config: x86_64-buildonly-randconfig-003-20250405 (https://download.01.org/0day-ci/archive/20250405/202504052301.dMtJILzp-lkp@intel.com/config) compiler: clang version 20.1.2 (https://github.com/llvm/llvm-project 58df0ef89dd64126512e4ee27b4ac3fd8ddf6247) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250405/202504052301.dMtJILzp-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202504052301.dMtJILzp-lkp@intel.com/ All errors (new ones prefixed by >>): >> ld.lld: error: undefined symbol: xen_released_pages >>> referenced by balloon.c >>> drivers/xen/balloon.o:(balloon_init) in archive vmlinux.a >>> referenced by balloon.c >>> drivers/xen/balloon.o:(balloon_init) in archive vmlinux.a
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 43dcd8c7badc..651bb206434c 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -466,6 +466,13 @@ int __init arch_xen_unpopulated_init(struct resource **res) xen_free_unpopulated_pages(1, &pg); } + /* + * Account for the region being in the physmap but unpopulated. + * The value in xen_released_pages is used by the balloon + * driver to know how much of the physmap is unpopulated and + * set an accurate initial memory target. + */ + xen_released_pages += xen_extra_mem[i].n_pfns; /* Zero so region is not also added to the balloon driver. */ xen_extra_mem[i].n_pfns = 0; } diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 163f7f1d70f1..ee165f4f7fe6 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -675,7 +675,7 @@ void xen_free_ballooned_pages(unsigned int nr_pages, struct page **pages) } EXPORT_SYMBOL(xen_free_ballooned_pages); -static void __init balloon_add_regions(void) +static int __init balloon_add_regions(void) { unsigned long start_pfn, pages; unsigned long pfn, extra_pfn_end; @@ -698,26 +698,38 @@ static void __init balloon_add_regions(void) for (pfn = start_pfn; pfn < extra_pfn_end; pfn++) balloon_append(pfn_to_page(pfn)); - balloon_stats.total_pages += extra_pfn_end - start_pfn; + /* + * Extra regions are accounted for in the physmap, but need + * decreasing from current_pages to balloon down the initial + * allocation, because they are already accounted for in + * total_pages. + */ + if (extra_pfn_end - start_pfn >= balloon_stats.current_pages) { + WARN(1, "Extra pages underflow current target"); + return -ERANGE; + } + balloon_stats.current_pages -= extra_pfn_end - start_pfn; } + + return 0; } static int __init balloon_init(void) { struct task_struct *task; + int rc; if (!xen_domain()) return -ENODEV; pr_info("Initialising balloon driver\n"); -#ifdef CONFIG_XEN_PV - balloon_stats.current_pages = xen_pv_domain() - ? min(xen_start_info->nr_pages - xen_released_pages, max_pfn) - : get_num_physpages(); -#else - balloon_stats.current_pages = get_num_physpages(); -#endif + if (xen_released_pages >= get_num_physpages()) { + WARN(1, "Released pages underflow current target"); + return -ERANGE; + } + + balloon_stats.current_pages = get_num_physpages() - xen_released_pages; balloon_stats.target_pages = balloon_stats.current_pages; balloon_stats.balloon_low = 0; balloon_stats.balloon_high = 0; @@ -734,7 +746,9 @@ static int __init balloon_init(void) register_sysctl_init("xen/balloon", balloon_table); #endif - balloon_add_regions(); + rc = balloon_add_regions(); + if (rc) + return rc; task = kthread_run(balloon_thread, NULL, "xen-balloon"); if (IS_ERR(task)) {
PVH dom0 re-uses logic from PV dom0, in which RAM ranges not assigned to dom0 are re-used as scratch memory to map foreign and grant pages. Such logic relies on reporting those unpopulated ranges as RAM to Linux, and mark them as reserved. This way Linux creates the underlying page structures required for metadata management. Such approach works fine on PV because the initial balloon target is calculated using specific Xen data, that doesn't take into account the memory type changes described above. However on HVM and PVH the initial balloon target is calculated using get_num_physpages(), and that function does take into account the unpopulated RAM regions used as scratch space for remote domain mappings. This leads to PVH dom0 having an incorrect initial balloon target, which causes malfunction (excessive memory freeing) of the balloon driver if the dom0 memory target is later adjusted from the toolstack. Fix this by using xen_released_pages to account for any pages that are part of the memory map, but are already unpopulated when the balloon driver is initialized. This accounts for any regions used for scratch remote mappings. Take the opportunity to unify PV with PVH/HVM guests regarding the usage of get_num_physpages(), as that avoids having to add different logic for PV vs PVH in both balloon_add_regions() and arch_xen_unpopulated_init(). Much like a6aa4eb994ee, the code in this changeset should have been part of 38620fc4e893. Fixes: a6aa4eb994ee ('xen/x86: add extra pages to unpopulated-alloc if available') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- Changes since v1: - Replace BUG_ON() with a WARN and failure to initialize the balloon driver. --- arch/x86/xen/enlighten.c | 7 +++++++ drivers/xen/balloon.c | 34 ++++++++++++++++++++++++---------- 2 files changed, 31 insertions(+), 10 deletions(-)