Message ID: 20191217201550.15864-5-andrew.cooper3@citrix.com (mailing list archive)
State: New, archived
Series: Don't allocate dom->p2m_host[] for translated domains
Hi,

On 17/12/2019 21:15, Andrew Cooper wrote:
> xc_dom_p2m() and dom->p2m_host[] implement a linear transform for translated
> domains, but waste a substantial chunk of RAM doing so.
>
> ARM literally never reads dom->p2m_host[] (because of the xc_dom_translated()
> short circuit in xc_dom_p2m()). Drop it all.
>
> x86 HVM does use dom->p2m_host[] for xc_domain_populate_physmap_exact() calls
> when populating 4k pages. Reuse the same tactic from 2M/1G ranges and use an
> on-stack array instead. Drop the memory allocation.
>
> x86 PV guests do use dom->p2m_host[] as a non-identity transform. Rename the
> field to pv_p2m to make it clear it is PV-only.

Nice cleanup! This will probably make guest boot slightly faster :).

> @@ -359,7 +356,6 @@ static int populate_guest_memory(struct xc_dom_image *dom,
>  static int meminit(struct xc_dom_image *dom)
>  {
>      int i, rc;
> -    xen_pfn_t pfn;
>      uint64_t modbase;
>
>      uint64_t ramsize = (uint64_t)dom->total_pages << XC_PAGE_SHIFT;
> @@ -423,11 +419,6 @@ static int meminit(struct xc_dom_image *dom)
>      assert(ramsize == 0); /* Too much RAM is rejected above */
>
>      dom->p2m_size = p2m_size;

Do we need to keep p2m_size?

Cheers,
On 23/12/2019 18:23, Julien Grall wrote:
> Hi,
>
> On 17/12/2019 21:15, Andrew Cooper wrote:
>> xc_dom_p2m() and dom->p2m_host[] implement a linear transform for
>> translated domains, but waste a substantial chunk of RAM doing so.
>>
>> ARM literally never reads dom->p2m_host[] (because of the
>> xc_dom_translated() short circuit in xc_dom_p2m()). Drop it all.
>>
>> x86 HVM does use dom->p2m_host[] for xc_domain_populate_physmap_exact()
>> calls when populating 4k pages. Reuse the same tactic from 2M/1G ranges
>> and use an on-stack array instead. Drop the memory allocation.
>>
>> x86 PV guests do use dom->p2m_host[] as a non-identity transform.
>> Rename the field to pv_p2m to make it clear it is PV-only.
>
> Nice cleanup! This will probably make guest boot slightly faster :).
>
>> @@ -359,7 +356,6 @@ static int populate_guest_memory(struct xc_dom_image *dom,
>>  static int meminit(struct xc_dom_image *dom)
>>  {
>>      int i, rc;
>> -    xen_pfn_t pfn;
>>      uint64_t modbase;
>>
>>      uint64_t ramsize = (uint64_t)dom->total_pages << XC_PAGE_SHIFT;
>> @@ -423,11 +419,6 @@ static int meminit(struct xc_dom_image *dom)
>>      assert(ramsize == 0); /* Too much RAM is rejected above */
>>
>>      dom->p2m_size = p2m_size;
>
> Do we need to keep p2m_size?

Yes, I think so.

There are some common checks which would fail if it becomes 0, and
importantly, it is used to locate safe gfns for temporary physmap
mappings.

~Andrew
Hi Andrew,

On 02/01/2020 17:49, Andrew Cooper wrote:
> On 23/12/2019 18:23, Julien Grall wrote:
>> Hi,
>>
>> On 17/12/2019 21:15, Andrew Cooper wrote:
>>> xc_dom_p2m() and dom->p2m_host[] implement a linear transform for
>>> translated domains, but waste a substantial chunk of RAM doing so.
>>>
>>> ARM literally never reads dom->p2m_host[] (because of the
>>> xc_dom_translated() short circuit in xc_dom_p2m()). Drop it all.
>>>
>>> x86 HVM does use dom->p2m_host[] for
>>> xc_domain_populate_physmap_exact() calls when populating 4k pages.
>>> Reuse the same tactic from 2M/1G ranges and use an on-stack array
>>> instead. Drop the memory allocation.
>>>
>>> x86 PV guests do use dom->p2m_host[] as a non-identity transform.
>>> Rename the field to pv_p2m to make it clear it is PV-only.
>>
>> Nice cleanup! This will probably make guest boot slightly faster :).
>>
>>> @@ -359,7 +356,6 @@ static int populate_guest_memory(struct xc_dom_image *dom,
>>>  static int meminit(struct xc_dom_image *dom)
>>>  {
>>>      int i, rc;
>>> -    xen_pfn_t pfn;
>>>      uint64_t modbase;
>>>
>>>      uint64_t ramsize = (uint64_t)dom->total_pages << XC_PAGE_SHIFT;
>>> @@ -423,11 +419,6 @@ static int meminit(struct xc_dom_image *dom)
>>>      assert(ramsize == 0); /* Too much RAM is rejected above */
>>>
>>>      dom->p2m_size = p2m_size;
>>
>> Do we need to keep p2m_size?
>
> Yes, I think so.
>
> There are some common checks which would fail if it becomes 0, and
> importantly, it is used to locate safe gfns for temporary physmap
> mappings.

Do you mean the scratch page? If so, on Arm we use a fixed address in the
memory map rather than p2m_size.

I have looked at the usage of p2m_size in the common code and I didn't
spot any path that can be used by Arm and uses p2m_size.

Also, I don't think p2m_size is calculated correctly on Arm. It only
covers the RAM and doesn't take into account the rest of the mappings
(e.g. MMIO). So I am not sure how this could be used in common code.

Cheers,
On 17.12.2019 21:15, Andrew Cooper wrote:
> --- a/tools/libxc/include/xc_dom.h
> +++ b/tools/libxc/include/xc_dom.h
> @@ -123,16 +123,12 @@ struct xc_dom_image {
>
>      /* other state info */
>      uint32_t f_active[XENFEAT_NR_SUBMAPS];
> +
>      /*
> -     * p2m_host maps guest physical addresses an offset from
> -     * rambase_pfn (see below) into gfns.

The removal of this part of the comment and ...

> -     * For a pure PV guest this means that it maps GPFNs into MFNs for
> -     * a hybrid guest this means that it maps GPFNs to GPFNS.
> -     *
> -     * Note that the input is offset by rambase.
> +     * pv_p2m is specific to x86 PV guests, and maps GFNs to MFNs.  It is
> +     * eventually copied into guest context.
>      */
> -    xen_pfn_t *p2m_host;
> +    xen_pfn_t *pv_p2m;
>
>      /* physical memory
>       *
> @@ -433,9 +429,12 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
>  {
>      if ( xc_dom_translated(dom) )
>          return pfn;
> -    if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
> +
> +    /* x86 PV only now. */
> +    if ( pfn >= dom->total_pages )
>          return INVALID_MFN;
> -    return dom->p2m_host[pfn - dom->rambase_pfn];
> +
> +    return dom->pv_p2m[pfn];
>  }

... the dropping of all uses of rambase_pfn here makes it look like
you're doing away with that concept altogether. Is this really correct?
If so, I guess you want to
- say a word in this regard in the description, the more that you don't
  fully get rid of this (the structure field and some uses elsewhere
  remain),
- drop/adjust the respective comment fragment next to the field a little
  further down in the structure.

> @@ -1245,11 +1245,11 @@ static int meminit_pv(struct xc_dom_image *dom)
>          return -EINVAL;
>      }
>
> -    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
> -    if ( dom->p2m_host == NULL )
> +    dom->pv_p2m = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);

Since you're touching the line anyway, perhaps better
sizeof(*dom->pv_p2m)?

Jan
On 03/01/2020 14:25, Jan Beulich wrote:
> On 17.12.2019 21:15, Andrew Cooper wrote:
>> --- a/tools/libxc/include/xc_dom.h
>> +++ b/tools/libxc/include/xc_dom.h
>> @@ -123,16 +123,12 @@ struct xc_dom_image {
>>
>>      /* other state info */
>>      uint32_t f_active[XENFEAT_NR_SUBMAPS];
>> +
>>      /*
>> -     * p2m_host maps guest physical addresses an offset from
>> -     * rambase_pfn (see below) into gfns.
> The removal of this part of the comment and ...
>
>> -     * For a pure PV guest this means that it maps GPFNs into MFNs for
>> -     * a hybrid guest this means that it maps GPFNs to GPFNS.
>> -     *
>> -     * Note that the input is offset by rambase.
>> +     * pv_p2m is specific to x86 PV guests, and maps GFNs to MFNs.  It is
>> +     * eventually copied into guest context.
>>      */
>> -    xen_pfn_t *p2m_host;
>> +    xen_pfn_t *pv_p2m;
>>
>>      /* physical memory
>>       *
>> @@ -433,9 +429,12 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
>>  {
>>      if ( xc_dom_translated(dom) )
>>          return pfn;
>> -    if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
>> +
>> +    /* x86 PV only now. */
>> +    if ( pfn >= dom->total_pages )
>>          return INVALID_MFN;
>> -    return dom->p2m_host[pfn - dom->rambase_pfn];
>> +
>> +    return dom->pv_p2m[pfn];
>>  }
> ... the dropping of all uses of rambase_pfn here makes it look like
> you're doing away with that concept altogether. Is this really correct?

Well - it is effectively dead code here, because of the
xc_dom_translated() check in the context above, and rambase_pfn being 0
outside of ARM.

The (nonzero) value is used by other bits of ARM code.

> If so, I guess you want to
> - say a word in this regard in the description, the more that you don't
>   fully get rid of this (the structure field and some uses elsewhere
>   remain),
> - drop/adjust the respective comment fragment next to the field a
>   little further down in the structure.

The domain builder is made almost exclusively of bitrot, but I don't have
an ARM system to do any testing on. Given how fragile the code is, I'm
not comfortable doing speculative deletion of code I'm not certain is
unused.

>> @@ -1245,11 +1245,11 @@ static int meminit_pv(struct xc_dom_image *dom)
>>          return -EINVAL;
>>      }
>>
>> -    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
>> -    if ( dom->p2m_host == NULL )
>> +    dom->pv_p2m = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
> Since you're touching the line anyway, perhaps better
> sizeof(*dom->pv_p2m)?

Will fix.

~Andrew
On 03.01.2020 16:02, Andrew Cooper wrote:
> On 03/01/2020 14:25, Jan Beulich wrote:
>> On 17.12.2019 21:15, Andrew Cooper wrote:
>>> --- a/tools/libxc/include/xc_dom.h
>>> +++ b/tools/libxc/include/xc_dom.h
>>> @@ -123,16 +123,12 @@ struct xc_dom_image {
>>>
>>>      /* other state info */
>>>      uint32_t f_active[XENFEAT_NR_SUBMAPS];
>>> +
>>>      /*
>>> -     * p2m_host maps guest physical addresses an offset from
>>> -     * rambase_pfn (see below) into gfns.
>> The removal of this part of the comment and ...
>>
>>> -     * For a pure PV guest this means that it maps GPFNs into MFNs for
>>> -     * a hybrid guest this means that it maps GPFNs to GPFNS.
>>> -     *
>>> -     * Note that the input is offset by rambase.
>>> +     * pv_p2m is specific to x86 PV guests, and maps GFNs to MFNs.  It is
>>> +     * eventually copied into guest context.
>>>      */
>>> -    xen_pfn_t *p2m_host;
>>> +    xen_pfn_t *pv_p2m;
>>>
>>>      /* physical memory
>>>       *
>>> @@ -433,9 +429,12 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
>>>  {
>>>      if ( xc_dom_translated(dom) )
>>>          return pfn;
>>> -    if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
>>> +
>>> +    /* x86 PV only now. */
>>> +    if ( pfn >= dom->total_pages )
>>>          return INVALID_MFN;
>>> -    return dom->p2m_host[pfn - dom->rambase_pfn];
>>> +
>>> +    return dom->pv_p2m[pfn];
>>>  }
>> ... the dropping of all uses of rambase_pfn here makes it look like
>> you're doing away with that concept altogether. Is this really correct?
>
> Well - it is effectively dead code here, because of the
> xc_dom_translated() check in the context above, and rambase_pfn being 0
> outside of ARM.
>
> The (nonzero) value is used by other bits of ARM code.
>
>> If so, I guess you want to
>> - say a word in this regard in the description, the more that you don't
>>   fully get rid of this (the structure field and some uses elsewhere
>>   remain),
>> - drop/adjust the respective comment fragment next to the field a
>>   little further down in the structure.
>
> The domain builder is made almost exclusively of bitrot, but I don't
> have an ARM system to do any testing on. Given how fragile the code is,
> I'm not comfortable doing speculative deletion of code I'm not certain
> is unused.

My remark was about this (non-Arm) part of the comment:

     * An x86 PV guest has one or more blocks of physical RAM,
     * consisting of total_pages starting at rambase_pfn. The start
     * address and size of each block is controlled by vNUMA
     * structures.

I did in no way mean to ask for speculative deletion of code.

Jan
Hi,

On 03/01/2020 10:44, Julien Grall wrote:
> Hi Andrew,
>
> On 02/01/2020 17:49, Andrew Cooper wrote:
>> On 23/12/2019 18:23, Julien Grall wrote:
>>> Hi,
>>>
>>> On 17/12/2019 21:15, Andrew Cooper wrote:
>>>> xc_dom_p2m() and dom->p2m_host[] implement a linear transform for
>>>> translated domains, but waste a substantial chunk of RAM doing so.
>>>>
>>>> ARM literally never reads dom->p2m_host[] (because of the
>>>> xc_dom_translated() short circuit in xc_dom_p2m()). Drop it all.
>>>>
>>>> x86 HVM does use dom->p2m_host[] for
>>>> xc_domain_populate_physmap_exact() calls when populating 4k pages.
>>>> Reuse the same tactic from 2M/1G ranges and use an on-stack array
>>>> instead. Drop the memory allocation.
>>>>
>>>> x86 PV guests do use dom->p2m_host[] as a non-identity transform.
>>>> Rename the field to pv_p2m to make it clear it is PV-only.
>>>
>>> Nice cleanup! This will probably make guest boot slightly faster :).
>>>
>>>> @@ -359,7 +356,6 @@ static int populate_guest_memory(struct xc_dom_image *dom,
>>>>  static int meminit(struct xc_dom_image *dom)
>>>>  {
>>>>      int i, rc;
>>>> -    xen_pfn_t pfn;
>>>>      uint64_t modbase;
>>>>
>>>>      uint64_t ramsize = (uint64_t)dom->total_pages << XC_PAGE_SHIFT;
>>>> @@ -423,11 +419,6 @@ static int meminit(struct xc_dom_image *dom)
>>>>      assert(ramsize == 0); /* Too much RAM is rejected above */
>>>>
>>>>      dom->p2m_size = p2m_size;
>>>
>>> Do we need to keep p2m_size?
>>
>> Yes, I think so.
>>
>> There are some common checks which would fail if it becomes 0, and
>> importantly, it is used to locate safe gfns for temporary physmap
>> mappings.
>
> Do you mean the scratch page? If so, on Arm we use a fixed address in
> the memory map rather than p2m_size.
>
> I have looked at the usage of p2m_size in the common code and I didn't
> spot any path that can be used by Arm and uses p2m_size.
>
> Also, I don't think p2m_size is calculated correctly on Arm. It only
> covers the RAM and doesn't take into account the rest of the mappings
> (e.g. MMIO). So I am not sure how this could be used in common code.

Thinking a bit more about this: as p2m_size was IMHO already buggy, this
series should not make any difference to it. Therefore removing p2m_size
can probably be done in a follow-up patch.

So I am happy with the patch as-is:

Acked-by: Julien Grall <julien@xen.org>

Cheers,
diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 10891eabcc..0e68b969a2 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -87,17 +87,17 @@ static void do_exchange(struct xc_dom_image *dom, xen_pfn_t target_pfn, xen_pfn_
     xen_pfn_t target_mfn;
 
     for (source_pfn = 0; source_pfn < start_info.nr_pages; source_pfn++)
-        if (dom->p2m_host[source_pfn] == source_mfn)
+        if (dom->pv_p2m[source_pfn] == source_mfn)
             break;
     ASSERT(source_pfn < start_info.nr_pages);
 
-    target_mfn = dom->p2m_host[target_pfn];
+    target_mfn = dom->pv_p2m[target_pfn];
 
     /* Put target MFN at source PFN */
-    dom->p2m_host[source_pfn] = target_mfn;
+    dom->pv_p2m[source_pfn] = target_mfn;
 
     /* Put source MFN at target PFN */
-    dom->p2m_host[target_pfn] = source_mfn;
+    dom->pv_p2m[target_pfn] = source_mfn;
 }
 
 int kexec_allocate(struct xc_dom_image *dom)
@@ -110,7 +110,7 @@ int kexec_allocate(struct xc_dom_image *dom)
     pages_moved2pfns = realloc(pages_moved2pfns, new_allocated * sizeof(*pages_moved2pfns));
     for (i = allocated; i < new_allocated; i++) {
         /* Exchange old page of PFN i with a newly allocated page. */
-        xen_pfn_t old_mfn = dom->p2m_host[i];
+        xen_pfn_t old_mfn = dom->pv_p2m[i];
         xen_pfn_t new_pfn;
         xen_pfn_t new_mfn;
 
@@ -122,7 +122,7 @@ int kexec_allocate(struct xc_dom_image *dom)
         /*
          * If PFN of newly allocated page (new_pfn) is less then currently
          * requested PFN (i) then look for relevant PFN/MFN pair. In this
-         * situation dom->p2m_host[new_pfn] no longer contains proper MFN
+         * situation dom->pv_p2m[new_pfn] no longer contains proper MFN
          * because original page with new_pfn was moved earlier
          * to different location.
          */
@@ -132,10 +132,10 @@ int kexec_allocate(struct xc_dom_image *dom)
         pages_moved2pfns[i] = new_pfn;
 
         /* Put old page at new PFN */
-        dom->p2m_host[new_pfn] = old_mfn;
+        dom->pv_p2m[new_pfn] = old_mfn;
 
         /* Put new page at PFN i */
-        dom->p2m_host[i] = new_mfn;
+        dom->pv_p2m[i] = new_mfn;
     }
 
     allocated = new_allocated;
@@ -282,11 +282,11 @@ void kexec(void *kernel, long kernel_size, void *module, long module_size, char
     dom->p2m_size = dom->total_pages;
 
     /* setup initial p2m */
-    dom->p2m_host = malloc(sizeof(*dom->p2m_host) * dom->p2m_size);
+    dom->pv_p2m = malloc(sizeof(*dom->pv_p2m) * dom->p2m_size);
 
     /* Start with our current P2M */
     for (i = 0; i < dom->p2m_size; i++)
-        dom->p2m_host[i] = pfn_to_mfn(i);
+        dom->pv_p2m[i] = pfn_to_mfn(i);
 
     if ( (rc = xc_dom_build_image(dom)) != 0 ) {
         printk("xc_dom_build_image returned %d\n", rc);
@@ -373,7 +373,7 @@ void kexec(void *kernel, long kernel_size, void *module, long module_size, char
     _boot_oldpdmfn = virt_to_mfn(start_info.pt_base);
     DEBUG("boot old pd mfn %lx\n", _boot_oldpdmfn);
     DEBUG("boot pd virt %lx\n", dom->pgtables_seg.vstart);
-    _boot_pdmfn = dom->p2m_host[PHYS_PFN(dom->pgtables_seg.vstart - dom->parms.virt_base)];
+    _boot_pdmfn = dom->pv_p2m[PHYS_PFN(dom->pgtables_seg.vstart - dom->parms.virt_base)];
     DEBUG("boot pd mfn %lx\n", _boot_pdmfn);
     _boot_stack = _boot_target + PAGE_SIZE;
     DEBUG("boot stack %lx\n", _boot_stack);
@@ -384,13 +384,13 @@ void kexec(void *kernel, long kernel_size, void *module, long module_size, char
 
     /* Keep only useful entries */
     for (nr_m2p_updates = pfn = 0; pfn < start_info.nr_pages; pfn++)
-        if (dom->p2m_host[pfn] != pfn_to_mfn(pfn))
+        if (dom->pv_p2m[pfn] != pfn_to_mfn(pfn))
             nr_m2p_updates++;
 
     m2p_updates = malloc(sizeof(*m2p_updates) * nr_m2p_updates);
     for (i = pfn = 0; pfn < start_info.nr_pages; pfn++)
-        if (dom->p2m_host[pfn] != pfn_to_mfn(pfn)) {
-            m2p_updates[i].ptr = PFN_PHYS(dom->p2m_host[pfn]) | MMU_MACHPHYS_UPDATE;
+        if (dom->pv_p2m[pfn] != pfn_to_mfn(pfn)) {
+            m2p_updates[i].ptr = PFN_PHYS(dom->pv_p2m[pfn]) | MMU_MACHPHYS_UPDATE;
             m2p_updates[i].val = pfn;
             i++;
         }
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index b7d0faf7e1..d2e316f35e 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -123,16 +123,12 @@ struct xc_dom_image {
 
     /* other state info */
     uint32_t f_active[XENFEAT_NR_SUBMAPS];
+
     /*
-     * p2m_host maps guest physical addresses an offset from
-     * rambase_pfn (see below) into gfns.
-     *
-     * For a pure PV guest this means that it maps GPFNs into MFNs for
-     * a hybrid guest this means that it maps GPFNs to GPFNS.
-     *
-     * Note that the input is offset by rambase.
+     * pv_p2m is specific to x86 PV guests, and maps GFNs to MFNs.  It is
+     * eventually copied into guest context.
      */
-    xen_pfn_t *p2m_host;
+    xen_pfn_t *pv_p2m;
 
     /* physical memory
      *
@@ -433,9 +429,12 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
 {
     if ( xc_dom_translated(dom) )
         return pfn;
-    if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
+
+    /* x86 PV only now. */
+    if ( pfn >= dom->total_pages )
         return INVALID_MFN;
-    return dom->p2m_host[pfn - dom->rambase_pfn];
+
+    return dom->pv_p2m[pfn];
 }
 
 #endif /* _XC_DOM_H */
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index 7e0fb9169f..931404c222 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -348,9 +348,6 @@ static int populate_guest_memory(struct xc_dom_image *dom,
         }
     }
 
-    for ( pfn = 0; pfn < nr_pfns; pfn++ )
-        dom->p2m_host[pfn] = base_pfn + pfn;
-
  out:
     free(extents);
     return rc < 0 ? rc : 0;
@@ -359,7 +356,6 @@ static int populate_guest_memory(struct xc_dom_image *dom,
 static int meminit(struct xc_dom_image *dom)
 {
     int i, rc;
-    xen_pfn_t pfn;
     uint64_t modbase;
 
     uint64_t ramsize = (uint64_t)dom->total_pages << XC_PAGE_SHIFT;
@@ -423,11 +419,6 @@ static int meminit(struct xc_dom_image *dom)
     assert(ramsize == 0); /* Too much RAM is rejected above */
 
     dom->p2m_size = p2m_size;
-    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * p2m_size);
-    if ( dom->p2m_host == NULL )
-        return -EINVAL;
-    for ( pfn = 0; pfn < p2m_size; pfn++ )
-        dom->p2m_host[pfn] = INVALID_PFN;
 
     /* setup initial p2m and allocate guest memory */
     for ( i = 0; i < GUEST_RAM_BANKS && dom->rambank_size[i]; i++ )
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index f21662c8b9..819afcb03f 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -318,7 +318,7 @@ static xen_pfn_t move_l3_below_4G(struct xc_dom_image *dom,
     if ( !new_l3mfn )
         goto out;
 
-    p2m_guest[l3pfn] = dom->p2m_host[l3pfn] = new_l3mfn;
+    p2m_guest[l3pfn] = dom->pv_p2m[l3pfn] = new_l3mfn;
 
     if ( xc_add_mmu_update(dom->xch, mmu,
                            (((unsigned long long)new_l3mfn)
@@ -450,11 +450,11 @@ static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
     uint32_t *p2m_guest = domx86->p2m_guest;
     xen_pfn_t l3mfn, l3pfn, i;
 
-    /* Copy dom->p2m_host[] into the guest. */
+    /* Copy dom->pv_p2m[] into the guest. */
     for ( i = 0; i < dom->p2m_size; ++i )
     {
-        if ( dom->p2m_host[i] != INVALID_PFN )
-            p2m_guest[i] = dom->p2m_host[i];
+        if ( dom->pv_p2m[i] != INVALID_PFN )
+            p2m_guest[i] = dom->pv_p2m[i];
         else
             p2m_guest[i] = -1;
     }
@@ -505,11 +505,11 @@ static int setup_pgtables_x86_64(struct xc_dom_image *dom)
     uint64_t *p2m_guest = domx86->p2m_guest;
     xen_pfn_t i;
 
-    /* Copy dom->p2m_host[] into the guest. */
+    /* Copy dom->pv_p2m[] into the guest. */
     for ( i = 0; i < dom->p2m_size; ++i )
     {
-        if ( dom->p2m_host[i] != INVALID_PFN )
-            p2m_guest[i] = dom->p2m_host[i];
+        if ( dom->pv_p2m[i] != INVALID_PFN )
+            p2m_guest[i] = dom->pv_p2m[i];
         else
             p2m_guest[i] = -1;
     }
@@ -1245,11 +1245,11 @@ static int meminit_pv(struct xc_dom_image *dom)
         return -EINVAL;
     }
 
-    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
-    if ( dom->p2m_host == NULL )
+    dom->pv_p2m = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
+    if ( dom->pv_p2m == NULL )
         return -EINVAL;
     for ( pfn = 0; pfn < dom->p2m_size; pfn++ )
-        dom->p2m_host[pfn] = INVALID_PFN;
+        dom->pv_p2m[pfn] = INVALID_PFN;
 
     /* allocate guest memory */
     for ( i = 0; i < nr_vmemranges; i++ )
@@ -1269,7 +1269,7 @@ static int meminit_pv(struct xc_dom_image *dom)
         pfn_base = vmemranges[i].start >> PAGE_SHIFT;
 
         for ( pfn = pfn_base; pfn < pfn_base+pages; pfn++ )
-            dom->p2m_host[pfn] = pfn;
+            dom->pv_p2m[pfn] = pfn;
 
         pfn_base_idx = pfn_base;
         while ( super_pages ) {
@@ -1279,7 +1279,7 @@ static int meminit_pv(struct xc_dom_image *dom)
             for ( pfn = pfn_base_idx, j = 0;
                   pfn < pfn_base_idx + (count << SUPERPAGE_2MB_SHIFT);
                   pfn += SUPERPAGE_2MB_NR_PFNS, j++ )
-                extents[j] = dom->p2m_host[pfn];
+                extents[j] = dom->pv_p2m[pfn];
             rc = xc_domain_populate_physmap(dom->xch, dom->guest_domid, count,
                                             SUPERPAGE_2MB_SHIFT, memflags,
                                             extents);
@@ -1292,7 +1292,7 @@ static int meminit_pv(struct xc_dom_image *dom)
             {
                 mfn = extents[j];
                 for ( k = 0; k < SUPERPAGE_2MB_NR_PFNS; k++, pfn++ )
-                    dom->p2m_host[pfn] = mfn + k;
+                    dom->pv_p2m[pfn] = mfn + k;
             }
             pfn_base_idx = pfn;
         }
@@ -1301,7 +1301,7 @@ static int meminit_pv(struct xc_dom_image *dom)
         {
             allocsz = min_t(uint64_t, 1024 * 1024, pages - j);
             rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
-                allocsz, 0, memflags, &dom->p2m_host[pfn_base + j]);
+                allocsz, 0, memflags, &dom->pv_p2m[pfn_base + j]);
 
             if ( rc )
             {
@@ -1428,25 +1428,6 @@ static int meminit_hvm(struct xc_dom_image *dom)
     }
 
     dom->p2m_size = p2m_size;
-    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
-                                  dom->p2m_size);
-    if ( dom->p2m_host == NULL )
-    {
-        DOMPRINTF("Could not allocate p2m");
-        goto error_out;
-    }
-
-    for ( i = 0; i < p2m_size; i++ )
-        dom->p2m_host[i] = ((xen_pfn_t)-1);
-    for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
-    {
-        uint64_t pfn;
-
-        for ( pfn = vmemranges[vmemid].start >> PAGE_SHIFT;
-              pfn < vmemranges[vmemid].end >> PAGE_SHIFT;
-              pfn++ )
-            dom->p2m_host[pfn] = pfn;
-    }
 
     /*
      * Try to claim pages for early warning of insufficient memory available.
@@ -1488,14 +1469,16 @@ static int meminit_hvm(struct xc_dom_image *dom)
      * We attempt to allocate 1GB pages if possible. It falls back on 2MB
      * pages if 1GB allocation fails. 4KB pages will be used eventually if
      * both fail.
-     *
-     * Under 2MB mode, we allocate pages in batches of no more than 8MB to
-     * ensure that we can be preempted and hence dom0 remains responsive.
      */
     if ( dom->device_model )
     {
+        xen_pfn_t extents[0xa0];
+
+        for ( i = 0; i < ARRAY_SIZE(extents); ++i )
+            extents[i] = i;
+
         rc = xc_domain_populate_physmap_exact(
-            xch, domid, 0xa0, 0, memflags, &dom->p2m_host[0x00]);
+            xch, domid, 0xa0, 0, memflags, extents);
         if ( rc != 0 )
         {
             DOMPRINTF("Could not populate low memory (< 0xA0).\n");
@@ -1538,7 +1521,7 @@ static int meminit_hvm(struct xc_dom_image *dom)
         if ( count > max_pages )
             count = max_pages;
 
-        cur_pfn = dom->p2m_host[cur_pages];
+        cur_pfn = cur_pages;
 
         /* Take care the corner cases of super page tails */
         if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
@@ -1564,8 +1547,7 @@ static int meminit_hvm(struct xc_dom_image *dom)
             xen_pfn_t sp_extents[nr_extents];
 
             for ( i = 0; i < nr_extents; i++ )
-                sp_extents[i] =
-                    dom->p2m_host[cur_pages+(i<<SUPERPAGE_1GB_SHIFT)];
+                sp_extents[i] = cur_pages + (i << SUPERPAGE_1GB_SHIFT);
 
             done = xc_domain_populate_physmap(xch, domid, nr_extents,
                                               SUPERPAGE_1GB_SHIFT,
@@ -1604,8 +1586,7 @@ static int meminit_hvm(struct xc_dom_image *dom)
                 xen_pfn_t sp_extents[nr_extents];
 
                 for ( i = 0; i < nr_extents; i++ )
-                    sp_extents[i] =
-                        dom->p2m_host[cur_pages+(i<<SUPERPAGE_2MB_SHIFT)];
+                    sp_extents[i] = cur_pages + (i << SUPERPAGE_2MB_SHIFT);
 
                 done = xc_domain_populate_physmap(xch, domid, nr_extents,
                                                   SUPERPAGE_2MB_SHIFT,
@@ -1624,8 +1605,13 @@ static int meminit_hvm(struct xc_dom_image *dom)
         /* Fall back to 4kB extents. */
         if ( count != 0 )
         {
+            xen_pfn_t extents[count];
+
+            for ( i = 0; i < count; ++i )
+                extents[i] = cur_pages + i;
+
             rc = xc_domain_populate_physmap_exact(
-                xch, domid, count, 0, new_memflags, &dom->p2m_host[cur_pages]);
+                xch, domid, count, 0, new_memflags, extents);
             cur_pages += count;
             stat_normal_pages += count;
         }
xc_dom_p2m() and dom->p2m_host[] implement a linear transform for translated
domains, but waste a substantial chunk of RAM doing so.

ARM literally never reads dom->p2m_host[] (because of the xc_dom_translated()
short circuit in xc_dom_p2m()). Drop it all.

x86 HVM does use dom->p2m_host[] for xc_domain_populate_physmap_exact() calls
when populating 4k pages. Reuse the same tactic from 2M/1G ranges and use an
on-stack array instead. Drop the memory allocation.

x86 PV guests do use dom->p2m_host[] as a non-identity transform. Rename the
field to pv_p2m to make it clear it is PV-only.

No change in the constructed guests.

Reported-by: Varad Gautam <vrd@amazon.de>
Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Varad Gautam <vrd@amazon.de>
---
 stubdom/grub/kexec.c         | 28 ++++++++---------
 tools/libxc/include/xc_dom.h | 19 ++++++------
 tools/libxc/xc_dom_arm.c     |  9 ------
 tools/libxc/xc_dom_x86.c     | 72 ++++++++++++++++++--------------------------
 4 files changed, 52 insertions(+), 76 deletions(-)