
[4/4] tools/dombuilder: Don't allocate dom->p2m_host[] for translated domains

Message ID 20191217201550.15864-5-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Series Don't allocate dom->p2m_host[] for translated domains

Commit Message

Andrew Cooper Dec. 17, 2019, 8:15 p.m. UTC
xc_dom_p2m() and dom->p2m_host[] implement a linear transform for translated
domains, but waste a substantial chunk of RAM doing so.

ARM literally never reads dom->p2m_host[] (because of the xc_dom_translated()
short circuit in xc_dom_p2m()).  Drop it all.

x86 HVM does use dom->p2m_host[] for xc_domain_populate_physmap_exact() calls
when populating 4k pages.  Reuse the same tactic from 2M/1G ranges and use an
on-stack array instead.  Drop the memory allocation.

x86 PV guests do use dom->p2m_host[] as a non-identity transform.  Rename the
field to pv_p2m to make it clear it is PV-only.

No change in the constructed guests.

Reported-by: Varad Gautam <vrd@amazon.de>
Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Varad Gautam <vrd@amazon.de>
---
 stubdom/grub/kexec.c         | 28 ++++++++---------
 tools/libxc/include/xc_dom.h | 19 ++++++------
 tools/libxc/xc_dom_arm.c     |  9 ------
 tools/libxc/xc_dom_x86.c     | 72 ++++++++++++++++++--------------------------
 4 files changed, 52 insertions(+), 76 deletions(-)

Comments

Julien Grall Dec. 23, 2019, 6:23 p.m. UTC | #1
Hi,

On 17/12/2019 21:15, Andrew Cooper wrote:
> xc_dom_p2m() and dom->p2m_host[] implement a linear transform for translated
> domains, but waste a substantial chunk of RAM doing so.
> 
> ARM literally never reads dom->p2m_host[] (because of the xc_dom_translated()
> short circuit in xc_dom_p2m()).  Drop it all.
> 
> x86 HVM does use dom->p2m_host[] for xc_domain_populate_physmap_exact() calls
> when populating 4k pages.  Reuse the same tactic from 2M/1G ranges and use an
> on-stack array instead.  Drop the memory allocation.
> 
> x86 PV guests do use dom->p2m_host[] as a non-identity transform.  Rename the
> field to pv_p2m to make it clear it is PV-only.

Nice cleanup! This will probably make guest boot slightly faster :).

> @@ -359,7 +356,6 @@ static int populate_guest_memory(struct xc_dom_image *dom,
>   static int meminit(struct xc_dom_image *dom)
>   {
>       int i, rc;
> -    xen_pfn_t pfn;
>       uint64_t modbase;
>   
>       uint64_t ramsize = (uint64_t)dom->total_pages << XC_PAGE_SHIFT;
> @@ -423,11 +419,6 @@ static int meminit(struct xc_dom_image *dom)
>       assert(ramsize == 0); /* Too much RAM is rejected above */
>   
>       dom->p2m_size = p2m_size;

Do we need to keep p2m_size?

Cheers,
Andrew Cooper Jan. 2, 2020, 5:49 p.m. UTC | #2
On 23/12/2019 18:23, Julien Grall wrote:
> Hi,
>
> On 17/12/2019 21:15, Andrew Cooper wrote:
>> xc_dom_p2m() and dom->p2m_host[] implement a linear transform for
>> translated
>> domains, but waste a substantial chunk of RAM doing so.
>>
>> ARM literally never reads dom->p2m_host[] (because of the
>> xc_dom_translated()
>> short circuit in xc_dom_p2m()).  Drop it all.
>>
>> x86 HVM does use dom->p2m_host[] for
>> xc_domain_populate_physmap_exact() calls
>> when populating 4k pages.  Reuse the same tactic from 2M/1G ranges
>> and use an
>> on-stack array instead.  Drop the memory allocation.
>>
>> x86 PV guests do use dom->p2m_host[] as a non-identity transform. 
>> Rename the
>> field to pv_p2m to make it clear it is PV-only.
>
> Nice cleanup! This will probably make slightly faster guest boot :).
>
>> @@ -359,7 +356,6 @@ static int populate_guest_memory(struct
>> xc_dom_image *dom,
>>   static int meminit(struct xc_dom_image *dom)
>>   {
>>       int i, rc;
>> -    xen_pfn_t pfn;
>>       uint64_t modbase;
>>         uint64_t ramsize = (uint64_t)dom->total_pages << XC_PAGE_SHIFT;
>> @@ -423,11 +419,6 @@ static int meminit(struct xc_dom_image *dom)
>>       assert(ramsize == 0); /* Too much RAM is rejected above */
>>         dom->p2m_size = p2m_size;
>
> Do we need to keep p2m_size?

Yes, I think so.

There are some common checks which would fail if it becomes 0, and
importantly, it is used to locate safe gfns for temporary physmap mappings.

~Andrew
Julien Grall Jan. 3, 2020, 10:44 a.m. UTC | #3
Hi Andrew,

On 02/01/2020 17:49, Andrew Cooper wrote:
> On 23/12/2019 18:23, Julien Grall wrote:
>> Hi,
>>
>> On 17/12/2019 21:15, Andrew Cooper wrote:
>>> xc_dom_p2m() and dom->p2m_host[] implement a linear transform for
>>> translated
>>> domains, but waste a substantial chunk of RAM doing so.
>>>
>>> ARM literally never reads dom->p2m_host[] (because of the
>>> xc_dom_translated()
>>> short circuit in xc_dom_p2m()).  Drop it all.
>>>
>>> x86 HVM does use dom->p2m_host[] for
>>> xc_domain_populate_physmap_exact() calls
>>> when populating 4k pages.  Reuse the same tactic from 2M/1G ranges
>>> and use an
>>> on-stack array instead.  Drop the memory allocation.
>>>
>>> x86 PV guests do use dom->p2m_host[] as a non-identity transform.
>>> Rename the
>>> field to pv_p2m to make it clear it is PV-only.
>>
>> Nice cleanup! This will probably make slightly faster guest boot :).
>>
>>> @@ -359,7 +356,6 @@ static int populate_guest_memory(struct
>>> xc_dom_image *dom,
>>>    static int meminit(struct xc_dom_image *dom)
>>>    {
>>>        int i, rc;
>>> -    xen_pfn_t pfn;
>>>        uint64_t modbase;
>>>          uint64_t ramsize = (uint64_t)dom->total_pages << XC_PAGE_SHIFT;
>>> @@ -423,11 +419,6 @@ static int meminit(struct xc_dom_image *dom)
>>>        assert(ramsize == 0); /* Too much RAM is rejected above */
>>>          dom->p2m_size = p2m_size;
>>
>> Do we need to keep p2m_size?
> 
> Yes, I think so.
> 
> There are some common checks which would fail if it becomes 0, and
> importantly, it is used to locate safe gfns for temporary physmap mappings.

Do you mean the scratch page? If so, on Arm we use a fixed address in the 
memory map rather than p2m_size.

I have looked at the usage of p2m_size in the common code and I didn't 
spot any path that is used by Arm and relies on p2m_size.

Also, I don't think p2m_size is calculated correctly on Arm. It only 
covers the RAM and doesn't take into account the rest of the mappings 
(e.g. MMIO). So I am not sure how this could be used in common code.

Cheers,
Jan Beulich Jan. 3, 2020, 2:25 p.m. UTC | #4
On 17.12.2019 21:15, Andrew Cooper wrote:
> --- a/tools/libxc/include/xc_dom.h
> +++ b/tools/libxc/include/xc_dom.h
> @@ -123,16 +123,12 @@ struct xc_dom_image {
>  
>      /* other state info */
>      uint32_t f_active[XENFEAT_NR_SUBMAPS];
> +
>      /*
> -     * p2m_host maps guest physical addresses an offset from
> -     * rambase_pfn (see below) into gfns.

The removal of this part of the comment and ...

> -     * For a pure PV guest this means that it maps GPFNs into MFNs for
> -     * a hybrid guest this means that it maps GPFNs to GPFNS.
> -     *
> -     * Note that the input is offset by rambase.
> +     * pv_p2m is specific to x86 PV guests, and maps GFNs to MFNs.  It is
> +     * eventually copied into guest context.
>       */
> -    xen_pfn_t *p2m_host;
> +    xen_pfn_t *pv_p2m;
>  
>      /* physical memory
>       *
> @@ -433,9 +429,12 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
>  {
>      if ( xc_dom_translated(dom) )
>          return pfn;
> -    if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
> +
> +    /* x86 PV only now. */
> +    if ( pfn >= dom->total_pages )
>          return INVALID_MFN;
> -    return dom->p2m_host[pfn - dom->rambase_pfn];
> +
> +    return dom->pv_p2m[pfn];
>  }

... the dropping of all uses of rambase_pfn here make it look
like you're doing away with that concept altogether. Is this
really correct? If so, I guess you want to
- say a word in this regard in the description, the more that
  you don't fully get rid of this (the structure field and
  some uses elsewhere remain),
- drop/adjust the respective comment fragment next to the
  field a little further down in the structure.

> @@ -1245,11 +1245,11 @@ static int meminit_pv(struct xc_dom_image *dom)
>          return -EINVAL;
>      }
>  
> -    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
> -    if ( dom->p2m_host == NULL )
> +    dom->pv_p2m = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);

Since you're touching the line anyway, perhaps better
sizeof(*dom->pv_p2m)?

Jan
Andrew Cooper Jan. 3, 2020, 3:02 p.m. UTC | #5
On 03/01/2020 14:25, Jan Beulich wrote:
> On 17.12.2019 21:15, Andrew Cooper wrote:
>> --- a/tools/libxc/include/xc_dom.h
>> +++ b/tools/libxc/include/xc_dom.h
>> @@ -123,16 +123,12 @@ struct xc_dom_image {
>>  
>>      /* other state info */
>>      uint32_t f_active[XENFEAT_NR_SUBMAPS];
>> +
>>      /*
>> -     * p2m_host maps guest physical addresses an offset from
>> -     * rambase_pfn (see below) into gfns.
> The removal of this part of the comment and ...
>
>> -     * For a pure PV guest this means that it maps GPFNs into MFNs for
>> -     * a hybrid guest this means that it maps GPFNs to GPFNS.
>> -     *
>> -     * Note that the input is offset by rambase.
>> +     * pv_p2m is specific to x86 PV guests, and maps GFNs to MFNs.  It is
>> +     * eventually copied into guest context.
>>       */
>> -    xen_pfn_t *p2m_host;
>> +    xen_pfn_t *pv_p2m;
>>  
>>      /* physical memory
>>       *
>> @@ -433,9 +429,12 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
>>  {
>>      if ( xc_dom_translated(dom) )
>>          return pfn;
>> -    if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
>> +
>> +    /* x86 PV only now. */
>> +    if ( pfn >= dom->total_pages )
>>          return INVALID_MFN;
>> -    return dom->p2m_host[pfn - dom->rambase_pfn];
>> +
>> +    return dom->pv_p2m[pfn];
>>  }
> ... the dropping of all uses of rambase_pfn here make it look
> like you're doing away with that concept altogether. Is this
> really correct?

Well - it is effectively dead code here, because of the
xc_dom_translated() in context above, and it being 0 outside of ARM.

The (nonzero) value is used by other bits of ARM code.

>  If so, I guess you want to
> - say a word in this regard in the description, the more that
>   you don't fully get rid of this (the structure field and
>   some uses elsewhere remain),
> - drop/adjust the respective comment fragment next to the
>   field a little further down in the structure.

The domain builder is made almost exclusively of bitrot, but I don't
have an ARM system to do any testing on.  Given how fragile the code is,
I'm not comfortable doing speculative deletion of code I'm not certain
is unused.

>
>> @@ -1245,11 +1245,11 @@ static int meminit_pv(struct xc_dom_image *dom)
>>          return -EINVAL;
>>      }
>>  
>> -    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
>> -    if ( dom->p2m_host == NULL )
>> +    dom->pv_p2m = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
> Since you're touching the line anyway, perhaps better
> sizeof(*dom->pv_p2m)?

Will fix.

~Andrew
Jan Beulich Jan. 3, 2020, 3:33 p.m. UTC | #6
On 03.01.2020 16:02, Andrew Cooper wrote:
> On 03/01/2020 14:25, Jan Beulich wrote:
>> On 17.12.2019 21:15, Andrew Cooper wrote:
>>> --- a/tools/libxc/include/xc_dom.h
>>> +++ b/tools/libxc/include/xc_dom.h
>>> @@ -123,16 +123,12 @@ struct xc_dom_image {
>>>  
>>>      /* other state info */
>>>      uint32_t f_active[XENFEAT_NR_SUBMAPS];
>>> +
>>>      /*
>>> -     * p2m_host maps guest physical addresses an offset from
>>> -     * rambase_pfn (see below) into gfns.
>> The removal of this part of the comment and ...
>>
>>> -     * For a pure PV guest this means that it maps GPFNs into MFNs for
>>> -     * a hybrid guest this means that it maps GPFNs to GPFNS.
>>> -     *
>>> -     * Note that the input is offset by rambase.
>>> +     * pv_p2m is specific to x86 PV guests, and maps GFNs to MFNs.  It is
>>> +     * eventually copied into guest context.
>>>       */
>>> -    xen_pfn_t *p2m_host;
>>> +    xen_pfn_t *pv_p2m;
>>>  
>>>      /* physical memory
>>>       *
>>> @@ -433,9 +429,12 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
>>>  {
>>>      if ( xc_dom_translated(dom) )
>>>          return pfn;
>>> -    if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
>>> +
>>> +    /* x86 PV only now. */
>>> +    if ( pfn >= dom->total_pages )
>>>          return INVALID_MFN;
>>> -    return dom->p2m_host[pfn - dom->rambase_pfn];
>>> +
>>> +    return dom->pv_p2m[pfn];
>>>  }
>> ... the dropping of all uses of rambase_pfn here make it look
>> like you're doing away with that concept altogether. Is this
>> really correct?
> 
> Well - it is effectively dead code here, because of the
> xc_dom_translated() in context above, and it being 0 outside of ARM.
> 
> The (nonzero) value is used by other bits of ARM code.
> 
>>  If so, I guess you want to
>> - say a word in this regard in the description, the more that
>>   you don't fully get rid of this (the structure field and
>>   some uses elsewhere remain),
>> - drop/adjust the respective comment fragment next to the
>>   field a little further down in the structure.
> 
> The domain builder is made almost exclusively of bitrot, but I don't
> have an ARM system to do any testing on.  Given how fragile the code is,
> I'm not comfortable doing speculative deletion of code I'm not certain
> is unused.

My remark was about this (non-Arm) part of the comment:

     * An x86 PV guest has one or more blocks of physical RAM,
     * consisting of total_pages starting at rambase_pfn. The start
     * address and size of each block is controlled by vNUMA
     * structures.

I did in no way mean to ask for speculative deletion of code.

Jan
Julien Grall Jan. 8, 2020, 9:36 a.m. UTC | #7
Hi,

On 03/01/2020 10:44, Julien Grall wrote:
> Hi Andrew,
> 
> On 02/01/2020 17:49, Andrew Cooper wrote:
>> On 23/12/2019 18:23, Julien Grall wrote:
>>> Hi,
>>>
>>> On 17/12/2019 21:15, Andrew Cooper wrote:
>>>> xc_dom_p2m() and dom->p2m_host[] implement a linear transform for
>>>> translated
>>>> domains, but waste a substantial chunk of RAM doing so.
>>>>
>>>> ARM literally never reads dom->p2m_host[] (because of the
>>>> xc_dom_translated()
>>>> short circuit in xc_dom_p2m()).  Drop it all.
>>>>
>>>> x86 HVM does use dom->p2m_host[] for
>>>> xc_domain_populate_physmap_exact() calls
>>>> when populating 4k pages.  Reuse the same tactic from 2M/1G ranges
>>>> and use an
>>>> on-stack array instead.  Drop the memory allocation.
>>>>
>>>> x86 PV guests do use dom->p2m_host[] as a non-identity transform.
>>>> Rename the
>>>> field to pv_p2m to make it clear it is PV-only.
>>>
>>> Nice cleanup! This will probably make slightly faster guest boot :).
>>>
>>>> @@ -359,7 +356,6 @@ static int populate_guest_memory(struct
>>>> xc_dom_image *dom,
>>>>    static int meminit(struct xc_dom_image *dom)
>>>>    {
>>>>        int i, rc;
>>>> -    xen_pfn_t pfn;
>>>>        uint64_t modbase;
>>>>          uint64_t ramsize = (uint64_t)dom->total_pages << 
>>>> XC_PAGE_SHIFT;
>>>> @@ -423,11 +419,6 @@ static int meminit(struct xc_dom_image *dom)
>>>>        assert(ramsize == 0); /* Too much RAM is rejected above */
>>>>          dom->p2m_size = p2m_size;
>>>
>>> Do we need to keep p2m_size?
>>
>> Yes, I think so.
>>
>> There are some common checks which would fail if it becomes 0, and
>> importantly, it is used to locate safe gfns for temporary physmap 
>> mappings.
> 
> Do you mean the scratch page? If so, on Arm we use a fix address in the 
> memory map rather than p2m_size.
> 
> I have looked at the usage of the p2m_size in the common code and I 
> didn't spot any path that can be used by Arm and using p2m_size.
> 
> Also, I don't think p2m_size is calculated correctly on Arm. It only 
> englobe the RAM and doesn't take into account the rest of the mappings 
> (e.g MMIO...). So I am not sure how this could be used in common code.

Thinking a bit more about this: as p2m_size was IMHO already buggy, this 
series should not make any difference in that regard. Therefore removing 
p2m_size can probably be done in a follow-up patch.

So I am happy with the patch as-is:

Acked-by: Julien Grall <julien@xen.org>

Cheers,

Patch

diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 10891eabcc..0e68b969a2 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -87,17 +87,17 @@  static void do_exchange(struct xc_dom_image *dom, xen_pfn_t target_pfn, xen_pfn_
     xen_pfn_t target_mfn;
 
     for (source_pfn = 0; source_pfn < start_info.nr_pages; source_pfn++)
-        if (dom->p2m_host[source_pfn] == source_mfn)
+        if (dom->pv_p2m[source_pfn] == source_mfn)
             break;
     ASSERT(source_pfn < start_info.nr_pages);
 
-    target_mfn = dom->p2m_host[target_pfn];
+    target_mfn = dom->pv_p2m[target_pfn];
 
     /* Put target MFN at source PFN */
-    dom->p2m_host[source_pfn] = target_mfn;
+    dom->pv_p2m[source_pfn] = target_mfn;
 
     /* Put source MFN at target PFN */
-    dom->p2m_host[target_pfn] = source_mfn;
+    dom->pv_p2m[target_pfn] = source_mfn;
 }
 
 int kexec_allocate(struct xc_dom_image *dom)
@@ -110,7 +110,7 @@  int kexec_allocate(struct xc_dom_image *dom)
     pages_moved2pfns = realloc(pages_moved2pfns, new_allocated * sizeof(*pages_moved2pfns));
     for (i = allocated; i < new_allocated; i++) {
         /* Exchange old page of PFN i with a newly allocated page.  */
-        xen_pfn_t old_mfn = dom->p2m_host[i];
+        xen_pfn_t old_mfn = dom->pv_p2m[i];
         xen_pfn_t new_pfn;
         xen_pfn_t new_mfn;
 
@@ -122,7 +122,7 @@  int kexec_allocate(struct xc_dom_image *dom)
 	/*
 	 * If PFN of newly allocated page (new_pfn) is less then currently
 	 * requested PFN (i) then look for relevant PFN/MFN pair. In this
-	 * situation dom->p2m_host[new_pfn] no longer contains proper MFN
+	 * situation dom->pv_p2m[new_pfn] no longer contains proper MFN
 	 * because original page with new_pfn was moved earlier
 	 * to different location.
 	 */
@@ -132,10 +132,10 @@  int kexec_allocate(struct xc_dom_image *dom)
 	pages_moved2pfns[i] = new_pfn;
 
         /* Put old page at new PFN */
-        dom->p2m_host[new_pfn] = old_mfn;
+        dom->pv_p2m[new_pfn] = old_mfn;
 
         /* Put new page at PFN i */
-        dom->p2m_host[i] = new_mfn;
+        dom->pv_p2m[i] = new_mfn;
     }
 
     allocated = new_allocated;
@@ -282,11 +282,11 @@  void kexec(void *kernel, long kernel_size, void *module, long module_size, char
     dom->p2m_size = dom->total_pages;
 
     /* setup initial p2m */
-    dom->p2m_host = malloc(sizeof(*dom->p2m_host) * dom->p2m_size);
+    dom->pv_p2m = malloc(sizeof(*dom->pv_p2m) * dom->p2m_size);
 
     /* Start with our current P2M */
     for (i = 0; i < dom->p2m_size; i++)
-        dom->p2m_host[i] = pfn_to_mfn(i);
+        dom->pv_p2m[i] = pfn_to_mfn(i);
 
     if ( (rc = xc_dom_build_image(dom)) != 0 ) {
         printk("xc_dom_build_image returned %d\n", rc);
@@ -373,7 +373,7 @@  void kexec(void *kernel, long kernel_size, void *module, long module_size, char
     _boot_oldpdmfn = virt_to_mfn(start_info.pt_base);
     DEBUG("boot old pd mfn %lx\n", _boot_oldpdmfn);
     DEBUG("boot pd virt %lx\n", dom->pgtables_seg.vstart);
-    _boot_pdmfn = dom->p2m_host[PHYS_PFN(dom->pgtables_seg.vstart - dom->parms.virt_base)];
+    _boot_pdmfn = dom->pv_p2m[PHYS_PFN(dom->pgtables_seg.vstart - dom->parms.virt_base)];
     DEBUG("boot pd mfn %lx\n", _boot_pdmfn);
     _boot_stack = _boot_target + PAGE_SIZE;
     DEBUG("boot stack %lx\n", _boot_stack);
@@ -384,13 +384,13 @@  void kexec(void *kernel, long kernel_size, void *module, long module_size, char
 
     /* Keep only useful entries */
     for (nr_m2p_updates = pfn = 0; pfn < start_info.nr_pages; pfn++)
-        if (dom->p2m_host[pfn] != pfn_to_mfn(pfn))
+        if (dom->pv_p2m[pfn] != pfn_to_mfn(pfn))
             nr_m2p_updates++;
 
     m2p_updates = malloc(sizeof(*m2p_updates) * nr_m2p_updates);
     for (i = pfn = 0; pfn < start_info.nr_pages; pfn++)
-        if (dom->p2m_host[pfn] != pfn_to_mfn(pfn)) {
-            m2p_updates[i].ptr = PFN_PHYS(dom->p2m_host[pfn]) | MMU_MACHPHYS_UPDATE;
+        if (dom->pv_p2m[pfn] != pfn_to_mfn(pfn)) {
+            m2p_updates[i].ptr = PFN_PHYS(dom->pv_p2m[pfn]) | MMU_MACHPHYS_UPDATE;
             m2p_updates[i].val = pfn;
             i++;
         }
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index b7d0faf7e1..d2e316f35e 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -123,16 +123,12 @@  struct xc_dom_image {
 
     /* other state info */
     uint32_t f_active[XENFEAT_NR_SUBMAPS];
+
     /*
-     * p2m_host maps guest physical addresses an offset from
-     * rambase_pfn (see below) into gfns.
-     *
-     * For a pure PV guest this means that it maps GPFNs into MFNs for
-     * a hybrid guest this means that it maps GPFNs to GPFNS.
-     *
-     * Note that the input is offset by rambase.
+     * pv_p2m is specific to x86 PV guests, and maps GFNs to MFNs.  It is
+     * eventually copied into guest context.
      */
-    xen_pfn_t *p2m_host;
+    xen_pfn_t *pv_p2m;
 
     /* physical memory
      *
@@ -433,9 +429,12 @@  static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
 {
     if ( xc_dom_translated(dom) )
         return pfn;
-    if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
+
+    /* x86 PV only now. */
+    if ( pfn >= dom->total_pages )
         return INVALID_MFN;
-    return dom->p2m_host[pfn - dom->rambase_pfn];
+
+    return dom->pv_p2m[pfn];
 }
 
 #endif /* _XC_DOM_H */
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index 7e0fb9169f..931404c222 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -348,9 +348,6 @@  static int populate_guest_memory(struct xc_dom_image *dom,
         }
     }
 
-    for ( pfn = 0; pfn < nr_pfns; pfn++ )
-        dom->p2m_host[pfn] = base_pfn + pfn;
-
 out:
     free(extents);
     return rc < 0 ? rc : 0;
@@ -359,7 +356,6 @@  static int populate_guest_memory(struct xc_dom_image *dom,
 static int meminit(struct xc_dom_image *dom)
 {
     int i, rc;
-    xen_pfn_t pfn;
     uint64_t modbase;
 
     uint64_t ramsize = (uint64_t)dom->total_pages << XC_PAGE_SHIFT;
@@ -423,11 +419,6 @@  static int meminit(struct xc_dom_image *dom)
     assert(ramsize == 0); /* Too much RAM is rejected above */
 
     dom->p2m_size = p2m_size;
-    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * p2m_size);
-    if ( dom->p2m_host == NULL )
-        return -EINVAL;
-    for ( pfn = 0; pfn < p2m_size; pfn++ )
-        dom->p2m_host[pfn] = INVALID_PFN;
 
     /* setup initial p2m and allocate guest memory */
     for ( i = 0; i < GUEST_RAM_BANKS && dom->rambank_size[i]; i++ )
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index f21662c8b9..819afcb03f 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -318,7 +318,7 @@  static xen_pfn_t move_l3_below_4G(struct xc_dom_image *dom,
     if ( !new_l3mfn )
         goto out;
 
-    p2m_guest[l3pfn] = dom->p2m_host[l3pfn] = new_l3mfn;
+    p2m_guest[l3pfn] = dom->pv_p2m[l3pfn] = new_l3mfn;
 
     if ( xc_add_mmu_update(dom->xch, mmu,
                            (((unsigned long long)new_l3mfn)
@@ -450,11 +450,11 @@  static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
     uint32_t *p2m_guest = domx86->p2m_guest;
     xen_pfn_t l3mfn, l3pfn, i;
 
-    /* Copy dom->p2m_host[] into the guest. */
+    /* Copy dom->pv_p2m[] into the guest. */
     for ( i = 0; i < dom->p2m_size; ++i )
     {
-        if ( dom->p2m_host[i] != INVALID_PFN )
-            p2m_guest[i] = dom->p2m_host[i];
+        if ( dom->pv_p2m[i] != INVALID_PFN )
+            p2m_guest[i] = dom->pv_p2m[i];
         else
             p2m_guest[i] = -1;
     }
@@ -505,11 +505,11 @@  static int setup_pgtables_x86_64(struct xc_dom_image *dom)
     uint64_t *p2m_guest = domx86->p2m_guest;
     xen_pfn_t i;
 
-    /* Copy dom->p2m_host[] into the guest. */
+    /* Copy dom->pv_p2m[] into the guest. */
     for ( i = 0; i < dom->p2m_size; ++i )
     {
-        if ( dom->p2m_host[i] != INVALID_PFN )
-            p2m_guest[i] = dom->p2m_host[i];
+        if ( dom->pv_p2m[i] != INVALID_PFN )
+            p2m_guest[i] = dom->pv_p2m[i];
         else
             p2m_guest[i] = -1;
     }
@@ -1245,11 +1245,11 @@  static int meminit_pv(struct xc_dom_image *dom)
         return -EINVAL;
     }
 
-    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
-    if ( dom->p2m_host == NULL )
+    dom->pv_p2m = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->p2m_size);
+    if ( dom->pv_p2m == NULL )
         return -EINVAL;
     for ( pfn = 0; pfn < dom->p2m_size; pfn++ )
-        dom->p2m_host[pfn] = INVALID_PFN;
+        dom->pv_p2m[pfn] = INVALID_PFN;
 
     /* allocate guest memory */
     for ( i = 0; i < nr_vmemranges; i++ )
@@ -1269,7 +1269,7 @@  static int meminit_pv(struct xc_dom_image *dom)
         pfn_base = vmemranges[i].start >> PAGE_SHIFT;
 
         for ( pfn = pfn_base; pfn < pfn_base+pages; pfn++ )
-            dom->p2m_host[pfn] = pfn;
+            dom->pv_p2m[pfn] = pfn;
 
         pfn_base_idx = pfn_base;
         while ( super_pages ) {
@@ -1279,7 +1279,7 @@  static int meminit_pv(struct xc_dom_image *dom)
             for ( pfn = pfn_base_idx, j = 0;
                   pfn < pfn_base_idx + (count << SUPERPAGE_2MB_SHIFT);
                   pfn += SUPERPAGE_2MB_NR_PFNS, j++ )
-                extents[j] = dom->p2m_host[pfn];
+                extents[j] = dom->pv_p2m[pfn];
             rc = xc_domain_populate_physmap(dom->xch, dom->guest_domid, count,
                                             SUPERPAGE_2MB_SHIFT, memflags,
                                             extents);
@@ -1292,7 +1292,7 @@  static int meminit_pv(struct xc_dom_image *dom)
             {
                 mfn = extents[j];
                 for ( k = 0; k < SUPERPAGE_2MB_NR_PFNS; k++, pfn++ )
-                    dom->p2m_host[pfn] = mfn + k;
+                    dom->pv_p2m[pfn] = mfn + k;
             }
             pfn_base_idx = pfn;
         }
@@ -1301,7 +1301,7 @@  static int meminit_pv(struct xc_dom_image *dom)
         {
             allocsz = min_t(uint64_t, 1024 * 1024, pages - j);
             rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
-                     allocsz, 0, memflags, &dom->p2m_host[pfn_base + j]);
+                     allocsz, 0, memflags, &dom->pv_p2m[pfn_base + j]);
 
             if ( rc )
             {
@@ -1428,25 +1428,6 @@  static int meminit_hvm(struct xc_dom_image *dom)
     }
 
     dom->p2m_size = p2m_size;
-    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
-                                      dom->p2m_size);
-    if ( dom->p2m_host == NULL )
-    {
-        DOMPRINTF("Could not allocate p2m");
-        goto error_out;
-    }
-
-    for ( i = 0; i < p2m_size; i++ )
-        dom->p2m_host[i] = ((xen_pfn_t)-1);
-    for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
-    {
-        uint64_t pfn;
-
-        for ( pfn = vmemranges[vmemid].start >> PAGE_SHIFT;
-              pfn < vmemranges[vmemid].end >> PAGE_SHIFT;
-              pfn++ )
-            dom->p2m_host[pfn] = pfn;
-    }
 
     /*
      * Try to claim pages for early warning of insufficient memory available.
@@ -1488,14 +1469,16 @@  static int meminit_hvm(struct xc_dom_image *dom)
      * We attempt to allocate 1GB pages if possible. It falls back on 2MB
      * pages if 1GB allocation fails. 4KB pages will be used eventually if
      * both fail.
-     * 
-     * Under 2MB mode, we allocate pages in batches of no more than 8MB to 
-     * ensure that we can be preempted and hence dom0 remains responsive.
      */
     if ( dom->device_model )
     {
+        xen_pfn_t extents[0xa0];
+
+        for ( i = 0; i < ARRAY_SIZE(extents); ++i )
+            extents[i] = i;
+
         rc = xc_domain_populate_physmap_exact(
-            xch, domid, 0xa0, 0, memflags, &dom->p2m_host[0x00]);
+            xch, domid, 0xa0, 0, memflags, extents);
         if ( rc != 0 )
         {
             DOMPRINTF("Could not populate low memory (< 0xA0).\n");
@@ -1538,7 +1521,7 @@  static int meminit_hvm(struct xc_dom_image *dom)
             if ( count > max_pages )
                 count = max_pages;
 
-            cur_pfn = dom->p2m_host[cur_pages];
+            cur_pfn = cur_pages;
 
             /* Take care the corner cases of super page tails */
             if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
@@ -1564,8 +1547,7 @@  static int meminit_hvm(struct xc_dom_image *dom)
                 xen_pfn_t sp_extents[nr_extents];
 
                 for ( i = 0; i < nr_extents; i++ )
-                    sp_extents[i] =
-                        dom->p2m_host[cur_pages+(i<<SUPERPAGE_1GB_SHIFT)];
+                    sp_extents[i] = cur_pages + (i << SUPERPAGE_1GB_SHIFT);
 
                 done = xc_domain_populate_physmap(xch, domid, nr_extents,
                                                   SUPERPAGE_1GB_SHIFT,
@@ -1604,8 +1586,7 @@  static int meminit_hvm(struct xc_dom_image *dom)
                     xen_pfn_t sp_extents[nr_extents];
 
                     for ( i = 0; i < nr_extents; i++ )
-                        sp_extents[i] =
-                            dom->p2m_host[cur_pages+(i<<SUPERPAGE_2MB_SHIFT)];
+                        sp_extents[i] = cur_pages + (i << SUPERPAGE_2MB_SHIFT);
 
                     done = xc_domain_populate_physmap(xch, domid, nr_extents,
                                                       SUPERPAGE_2MB_SHIFT,
@@ -1624,8 +1605,13 @@  static int meminit_hvm(struct xc_dom_image *dom)
             /* Fall back to 4kB extents. */
             if ( count != 0 )
             {
+                xen_pfn_t extents[count];
+
+                for ( i = 0; i < count; ++i )
+                    extents[i] = cur_pages + i;
+
                 rc = xc_domain_populate_physmap_exact(
-                    xch, domid, count, 0, new_memflags, &dom->p2m_host[cur_pages]);
+                    xch, domid, count, 0, new_memflags, extents);
                 cur_pages += count;
                 stat_normal_pages += count;
             }