
[v7,2/7] xen/x86: populate PVHv2 Dom0 physical memory map

Message ID 20170222142459.28199-3-roger.pau@citrix.com (mailing list archive)
State New, archived

Commit Message

Roger Pau Monné Feb. 22, 2017, 2:24 p.m. UTC
Craft the Dom0 e820 memory map and populate it. Introduce a helper to remove
memory pages that are shared between Xen and a domain, and use it to remove
the low 1MB RAM regions from dom_io so that they can be assigned to a PVHv2 Dom0.

On hardware lacking support for unrestricted mode also craft the identity page
tables and the TSS used for virtual 8086 mode.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v6:
 - Rebase on top of Jan's VM86 TSS fix.
 - Use hvm_copy_to_guest_phys to zero the TSS area.
 - Request the TSS memory area to be aligned to 128 bytes.
 - Move write_32bit_pse_identmap to arch-specific mm.c file.

Changes since v5:
 - Adjust the logic to set need_paging.
 - Remove the usage of the _AC macro.
 - Subtract memory from the end of regions instead of the start.
 - Create the VM86_TSS before the identity page table, so that the page table
   is aligned to a page boundary.
 - Use MB1_PAGES in modify_identity_mmio.
 - Move and simplify the ASSERT in pvh_setup_p2m.
 - Move the creation of the PSE page tables to a separate function, and use it
   in shadow_enable also.
 - Make the map parameter of modify_identity_mmio a constant.
 - Add a comment to HVM_VM86_TSS_SIZE, although it seems this might need
   further fixing.
 - Introduce pvh_add_mem_range in order to mark the regions used by the VM86
   TSS and the identity page tables as reserved in the memory map.
 - Add a parameter to request aligned memory from pvh_steal_ram.

Changes since v4:
 - Move process_pending_softirqs to previous patch.
 - Fix off-by-one errors in some checks.
 - Make unshare_xen_page_with_guest __init.
 - Improve unshare_xen_page_with_guest by making use of the already existing
   is_xen_heap_page and put_page.
 - s/hvm/pvh/.
 - Use PAGE_ORDER_4K in pvh_setup_e820 in order to keep consistency with the
   p2m code.

Changes since v3:
 - Drop get_order_from_bytes_floor; it was only used by
   hvm_populate_memory_range.
 - Switch hvm_populate_memory_range to use frame numbers instead of full memory
   addresses.
 - Add a helper to steal the low 1MB RAM areas from dom_io and add them to Dom0
   as normal RAM.
 - Introduce unshare_xen_page_with_guest in order to remove pages from dom_io,
   so they can be assigned to other domains. This is needed in order to remove
   the low 1MB RAM regions from dom_io and assign them to the hardware_domain.
 - Simplify the loop in hvm_steal_ram.
 - Move definition of map_identity_mmio into this patch.

Changes since v2:
 - Introduce get_order_from_bytes_floor as a local function to
   domain_build.c.
 - Remove extra asserts.
 - Make hvm_populate_memory_range return an error code instead of panicking.
 - Fix comments and printks.
 - Use the ULL suffix instead of casting to uint64_t.
 - Rename hvm_setup_vmx_unrestricted_guest to
   hvm_setup_vmx_realmode_helpers.
 - Only subtract from the memory calculation the two pages that will be used
   by the MADT replacement.
 - Remove some comments.
 - Remove printing allocation information.
 - Don't stash any pages for the MADT, TSS or ident PT, those will be
   subtracted directly from RAM regions of the memory map.
 - Count the number of iterations before calling process_pending_softirqs
   when populating the memory map.
 - Move the initial call to process_pending_softirqs into construct_dom0,
   and remove the ones from construct_dom0_hvm and construct_dom0_pv.
 - Make memflags global so it can be shared between alloc_chunk and
   hvm_populate_memory_range.

Changes since RFC:
 - Use IS_ALIGNED instead of checking with PAGE_MASK.
 - Use the new %pB specifier in order to print sizes in human readable form.
 - Create a VM86 TSS for hardware that doesn't support unrestricted mode.
 - Subtract guest RAM for the identity page table and the VM86 TSS.
 - Split the creation of the unrestricted mode helper structures to a
   separate function.
 - Use preemption with paging_set_allocation.
 - Use get_order_from_bytes_floor.
---
 xen/arch/x86/domain_build.c     | 365 +++++++++++++++++++++++++++++++++++++++-
 xen/arch/x86/mm.c               |  26 +++
 xen/arch/x86/mm/shadow/common.c |   7 +-
 xen/include/asm-x86/mm.h        |   5 +
 4 files changed, 393 insertions(+), 10 deletions(-)

Comments

Jan Beulich Feb. 23, 2017, 1:39 p.m. UTC | #1
>>> On 22.02.17 at 15:24, <roger.pau@citrix.com> wrote:
> Craft the Dom0 e820 memory map and populate it. Introduce a helper to remove
> memory pages that are shared between Xen and a domain, and use it to remove
> the low 1MB RAM regions from dom_io so that they can be assigned to a PVHv2 Dom0.
> 
> On hardware lacking support for unrestricted mode also craft the identity page
> tables and the TSS used for virtual 8086 mode.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit ...

> @@ -44,6 +45,16 @@ static long __initdata dom0_min_nrpages;
>  static long __initdata dom0_max_nrpages = LONG_MAX;
>  
>  /*
> + * Have the TSS cover the ISA port range, which makes it
> + * - 104 bytes base structure
> + * - 32 bytes interrupt redirection bitmap
> + * - 128 bytes I/O bitmap
> + * - one trailing byte
> + * or a total of 265 bytes.
> + */
> +#define HVM_VM86_TSS_SIZE 265

... I'm not convinced the same rationale as used in hvmloader
applies here. Namely, all the more so without legacy devices, there
should be very little reason for such a Dom0 to do port I/O
to any ports (including the ISA ones) from real mode, nor can I see
the usefulness of invoking INT $n instructions without there
being any firmware.

Jan
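
For reference, the 265 bytes in the comment quoted above add up exactly as
described: the 32-bit TSS structure itself, one bit per software interrupt
vector (256 bits), one bit per port in the ISA range 0x0-0x3ff (1024 bits),
and the trailing all-ones byte the Intel SDM requires after an I/O bitmap.
A minimal standalone sketch of the arithmetic (plain C, not Xen code):

#include <stdio.h>

/* Byte counts from the HVM_VM86_TSS_SIZE comment quoted above. */
#define TSS_BASE_STRUCT   104   /* 32-bit TSS structure */
#define INT_REDIR_BITMAP   32   /* 256 software interrupt vectors / 8 */
#define IO_BITMAP         128   /* 1024 ISA ports / 8 */
#define TRAILING_BYTE       1   /* terminator required after the I/O bitmap */

int main(void)
{
    printf("HVM_VM86_TSS_SIZE = %d\n",
           TSS_BASE_STRUCT + INT_REDIR_BITMAP + IO_BITMAP + TRAILING_BYTE);
    return 0;   /* prints 265 */
}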
Roger Pau Monné Feb. 23, 2017, 3:27 p.m. UTC | #2
On Thu, Feb 23, 2017 at 06:39:53AM -0700, Jan Beulich wrote:
> >>> On 22.02.17 at 15:24, <roger.pau@citrix.com> wrote:
> > Craft the Dom0 e820 memory map and populate it. Introduce a helper to remove
> > memory pages that are shared between Xen and a domain, and use it to remove
> > the low 1MB RAM regions from dom_io so that they can be assigned to a PVHv2 Dom0.
> > 
> > On hardware lacking support for unrestricted mode also craft the identity page
> > tables and the TSS used for virtual 8086 mode.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> albeit ...
> 
> > @@ -44,6 +45,16 @@ static long __initdata dom0_min_nrpages;
> >  static long __initdata dom0_max_nrpages = LONG_MAX;
> >  
> >  /*
> > + * Have the TSS cover the ISA port range, which makes it
> > + * - 104 bytes base structure
> > + * - 32 bytes interrupt redirection bitmap
> > + * - 128 bytes I/O bitmap
> > + * - one trailing byte
> > + * or a total of 265 bytes.
> > + */
> > +#define HVM_VM86_TSS_SIZE 265
> 
> ... I'm not convinced the same rationale as used in hvmloader
> applies here. Namely, all the more so without legacy devices, there
> should be very little reason for such a Dom0 to do port I/O
> to any ports (including the ISA ones) from real mode, nor can I see
> the usefulness of invoking INT $n instructions without there
> being any firmware.

Right, without firmware there isn't much point in any of this. This is just
going to be used to boot the APs, and that's probably all; the AP trampoline
shouldn't attempt to write to any IO port or execute any INT instruction.

I also don't see much benefit from deviating from what HVM does, so I would
just leave it as is.

Thanks, Roger.
Jan Beulich Feb. 23, 2017, 4:09 p.m. UTC | #3
>>> On 23.02.17 at 16:27, <roger.pau@citrix.com> wrote:
> I also don't see much benefit from deviating from what HVM does, so I would
> just leave it as is.

Understood. The risk is that in a couple of years' time, understanding
why it is the way it is (and whether that's necessary) may take as
much effort as did the deciphering of the (buggy) hvmloader
setup code for this.

Jan
Andrew Cooper Feb. 23, 2017, 4:16 p.m. UTC | #4
On 23/02/17 16:09, Jan Beulich wrote:
>>>> On 23.02.17 at 16:27, <roger.pau@citrix.com> wrote:
>> I also don't see much benefit from deviating from what HVM does, so I would
>> just leave it as is.
> Understood. The risk is that in a couple of years' time, understanding
> why it is the way it is (and whether that's necessary) may take as
> much effort as did the deciphering of the (buggy) hvmloader
> setup code for this.

In which case, just leave a short comment saying something like "Copy
HVMLoader for consistency; not that we expect a PVH domain to use
this for anything other than its AP trampoline".

~Andrew
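
The TSS address and size end up packed into a single HVM param by
pvh_setup_vmx_realmode_helpers() in the patch below: the guest physical
address in the low 32 bits, the size above that, plus an "updated" flag.
A short standalone sketch of that encoding follows; the exact bit position
of VM86_TSS_UPDATED is an assumption here, the authoritative definition
lives in Xen's headers:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed flag position (hypothetical); Xen's headers are authoritative. */
#define VM86_TSS_UPDATED  (1ULL << 63)
#define HVM_VM86_TSS_SIZE 265

int main(void)
{
    uint64_t gaddr = 0xfffe0080ULL;  /* example address below 4GB, 128-byte aligned */
    /* Same packing expression as in pvh_setup_vmx_realmode_helpers(). */
    uint64_t v = VM86_TSS_UPDATED | ((uint64_t)HVM_VM86_TSS_SIZE << 32) | gaddr;

    printf("addr=%#" PRIx64 " size=%" PRIu64 " updated=%d\n",
           v & 0xffffffffULL,           /* guest physical address */
           (v >> 32) & 0x7fffffffULL,   /* TSS size in bytes */
           !!(v & VM86_TSS_UPDATED));   /* flag bit */
    return 0;
}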

Patch

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 0c8a269..adc4c00 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -21,6 +21,7 @@ 
 #include <xen/compat.h>
 #include <xen/libelf.h>
 #include <xen/pfn.h>
+#include <xen/guest_access.h>
 #include <asm/regs.h>
 #include <asm/system.h>
 #include <asm/io.h>
@@ -44,6 +45,16 @@  static long __initdata dom0_min_nrpages;
 static long __initdata dom0_max_nrpages = LONG_MAX;
 
 /*
+ * Have the TSS cover the ISA port range, which makes it
+ * - 104 bytes base structure
+ * - 32 bytes interrupt redirection bitmap
+ * - 128 bytes I/O bitmap
+ * - one trailing byte
+ * or a total of 265 bytes.
+ */
+#define HVM_VM86_TSS_SIZE 265
+
+/*
  * dom0_mem=[min:<min_amt>,][max:<max_amt>,][<amt>]
  * 
  * <min_amt>: The minimum amount of memory which should be allocated for dom0.
@@ -243,11 +254,12 @@  boolean_param("ro-hpet", ro_hpet);
 #define round_pgup(_p)    (((_p)+(PAGE_SIZE-1))&PAGE_MASK)
 #define round_pgdown(_p)  ((_p)&PAGE_MASK)
 
+static unsigned int __initdata memflags = MEMF_no_dma|MEMF_exact_node;
+
 static struct page_info * __init alloc_chunk(
     struct domain *d, unsigned long max_pages)
 {
     static unsigned int __initdata last_order = MAX_ORDER;
-    static unsigned int __initdata memflags = MEMF_no_dma|MEMF_exact_node;
     struct page_info *page;
     unsigned int order = get_order_from_pages(max_pages), free_order;
 
@@ -332,7 +344,9 @@  static unsigned long __init compute_dom0_nr_pages(
             avail -= max_pdx >> s;
     }
 
-    need_paging = opt_dom0_shadow || (is_pvh_domain(d) && !iommu_hap_pt_share);
+    need_paging = has_hvm_container_domain(d)
+                  ? !iommu_hap_pt_share || !paging_mode_hap(d)
+                  : opt_dom0_shadow;
     for ( ; ; need_paging = 0 )
     {
         nr_pages = dom0_nrpages;
@@ -364,7 +378,8 @@  static unsigned long __init compute_dom0_nr_pages(
         avail -= dom0_paging_pages(d, nr_pages);
     }
 
-    if ( (parms->p2m_base == UNSET_ADDR) && (dom0_nrpages <= 0) &&
+    if ( is_pv_domain(d) &&
+         (parms->p2m_base == UNSET_ADDR) && (dom0_nrpages <= 0) &&
          ((dom0_min_nrpages <= 0) || (nr_pages > min_pages)) )
     {
         /*
@@ -580,6 +595,7 @@  static __init void pvh_setup_e820(struct domain *d, unsigned long nr_pages)
     struct e820entry *entry, *entry_guest;
     unsigned int i;
     unsigned long pages, cur_pages = 0;
+    uint64_t start, end;
 
     /*
      * Craft the e820 memory map for Dom0 based on the hardware e820 map.
@@ -607,8 +623,22 @@  static __init void pvh_setup_e820(struct domain *d, unsigned long nr_pages)
             continue;
         }
 
-        *entry_guest = *entry;
-        pages = PFN_UP(entry_guest->size);
+        /*
+         * Make sure the start and length are aligned to PAGE_SIZE, because
+         * that's the minimum granularity of the 2nd stage translation. Since
+         * the p2m code uses PAGE_ORDER_4K internally, also use it here in
+         * order to prevent this code from getting out of sync.
+         */
+        start = ROUNDUP(entry->addr, PAGE_SIZE << PAGE_ORDER_4K);
+        end = (entry->addr + entry->size) &
+              ~((PAGE_SIZE << PAGE_ORDER_4K) - 1);
+        if ( start >= end )
+            continue;
+
+        entry_guest->type = E820_RAM;
+        entry_guest->addr = start;
+        entry_guest->size = end - start;
+        pages = PFN_DOWN(entry_guest->size);
         if ( (cur_pages + pages) > nr_pages )
         {
             /* Truncate region */
@@ -1676,15 +1706,340 @@  out:
     return rc;
 }
 
+static int __init modify_identity_mmio(struct domain *d, unsigned long pfn,
+                                       unsigned long nr_pages, const bool map)
+{
+    int rc;
+
+    for ( ; ; )
+    {
+        rc = (map ? map_mmio_regions : unmap_mmio_regions)
+             (d, _gfn(pfn), nr_pages, _mfn(pfn));
+        if ( rc == 0 )
+            break;
+        if ( rc < 0 )
+        {
+            printk(XENLOG_WARNING
+                   "Failed to identity %smap [%#lx,%#lx) for d%d: %d\n",
+                   map ? "" : "un", pfn, pfn + nr_pages, d->domain_id, rc);
+            break;
+        }
+        nr_pages -= rc;
+        pfn += rc;
+        process_pending_softirqs();
+    }
+
+    return rc;
+}
+
+/* Populate a HVM memory range using the biggest possible order. */
+static int __init pvh_populate_memory_range(struct domain *d,
+                                            unsigned long start,
+                                            unsigned long nr_pages)
+{
+    unsigned int order, i = 0;
+    struct page_info *page;
+    int rc;
+#define MAP_MAX_ITER 64
+
+    order = MAX_ORDER;
+    while ( nr_pages != 0 )
+    {
+        unsigned int range_order = get_order_from_pages(nr_pages + 1);
+
+        order = min(range_order ? range_order - 1 : 0, order);
+        page = alloc_domheap_pages(d, order, memflags);
+        if ( page == NULL )
+        {
+            if ( order == 0 && memflags )
+            {
+                /* Try again without any memflags. */
+                memflags = 0;
+                order = MAX_ORDER;
+                continue;
+            }
+            if ( order == 0 )
+            {
+                printk("Unable to allocate memory with order 0!\n");
+                return -ENOMEM;
+            }
+            order--;
+            continue;
+        }
+
+        rc = guest_physmap_add_page(d, _gfn(start), _mfn(page_to_mfn(page)),
+                                    order);
+        if ( rc != 0 )
+        {
+            printk("Failed to populate memory: [%#lx,%lx): %d\n",
+                   start, start + (1UL << order), rc);
+            return -ENOMEM;
+        }
+        start += 1UL << order;
+        nr_pages -= 1UL << order;
+        if ( (++i % MAP_MAX_ITER) == 0 )
+            process_pending_softirqs();
+    }
+
+    return 0;
+#undef MAP_MAX_ITER
+}
+
+/* Steal RAM from the end of a memory region. */
+static int __init pvh_steal_ram(struct domain *d, unsigned long size,
+                                unsigned long align, paddr_t limit,
+                                paddr_t *addr)
+{
+    unsigned int i = d->arch.nr_e820;
+
+    /*
+     * Alignment 0 should be set to 1, so it doesn't wrap around in the
+     * calculations below.
+     */
+    align = align ? : 1;
+    while ( i-- )
+    {
+        struct e820entry *entry = &d->arch.e820[i];
+
+        if ( entry->type != E820_RAM || entry->addr + entry->size > limit ||
+             entry->addr < MB(1) )
+            continue;
+
+        *addr = (entry->addr + entry->size - size) & ~(align - 1);
+        if ( *addr < entry->addr )
+            continue;
+
+        entry->size = *addr - entry->addr;
+        return 0;
+    }
+
+    return -ENOMEM;
+}
+
+/* NB: memory map must be sorted at all times for this to work correctly. */
+static int __init pvh_add_mem_range(struct domain *d, uint64_t s, uint64_t e,
+                                    unsigned int type)
+{
+    struct e820entry *map;
+    unsigned int i;
+
+    for ( i = 0; i < d->arch.nr_e820; i++ )
+    {
+        uint64_t rs = d->arch.e820[i].addr;
+        uint64_t re = rs + d->arch.e820[i].size;
+
+        if ( rs == e && d->arch.e820[i].type == type )
+        {
+            d->arch.e820[i].addr = s;
+            return 0;
+        }
+
+        if ( re == s && d->arch.e820[i].type == type &&
+             (i + 1 == d->arch.nr_e820 || d->arch.e820[i + 1].addr >= e) )
+        {
+            d->arch.e820[i].size += e - s;
+            return 0;
+        }
+
+        if ( rs >= e )
+            break;
+
+        if ( re > s )
+            return -EEXIST;
+    }
+
+    map = xzalloc_array(struct e820entry, d->arch.nr_e820 + 1);
+    if ( !map )
+    {
+        printk(XENLOG_WARNING "E820: out of memory to add region\n");
+        return -ENOMEM;
+    }
+
+    memcpy(map, d->arch.e820, i * sizeof(*d->arch.e820));
+    memcpy(map + i + 1, d->arch.e820 + i,
+           (d->arch.nr_e820 - i) * sizeof(*d->arch.e820));
+    map[i].addr = s;
+    map[i].size = e - s;
+    map[i].type = type;
+    xfree(d->arch.e820);
+    d->arch.e820 = map;
+    d->arch.nr_e820++;
+
+    return 0;
+}
+
+static int __init pvh_setup_vmx_realmode_helpers(struct domain *d)
+{
+    p2m_type_t p2mt;
+    uint32_t rc, *ident_pt;
+    mfn_t mfn;
+    paddr_t gaddr;
+    struct vcpu *v = d->vcpu[0];
+
+    /*
+     * Steal some space from the last RAM region below 4GB and use it to
+     * store the real-mode TSS. It needs to be aligned to 128 so that the
+     * TSS structure (which accounts for the first 104b) doesn't cross
+     * a page boundary.
+     */
+    if ( !pvh_steal_ram(d, HVM_VM86_TSS_SIZE, 128, GB(4), &gaddr) )
+    {
+        if ( hvm_copy_to_guest_phys(gaddr, NULL, HVM_VM86_TSS_SIZE, v) !=
+             HVMCOPY_okay )
+            printk("Unable to zero VM86 TSS area\n");
+        d->arch.hvm_domain.params[HVM_PARAM_VM86_TSS_SIZED] =
+            VM86_TSS_UPDATED | ((uint64_t)HVM_VM86_TSS_SIZE << 32) | gaddr;
+        if ( pvh_add_mem_range(d, gaddr, gaddr + HVM_VM86_TSS_SIZE,
+                               E820_RESERVED) )
+            printk("Unable to set VM86 TSS as reserved in the memory map\n");
+    }
+    else
+        printk("Unable to allocate VM86 TSS area\n");
+
+    /* Steal some more RAM for the identity page tables. */
+    if ( pvh_steal_ram(d, PAGE_SIZE, PAGE_SIZE, GB(4), &gaddr) )
+    {
+        printk("Unable to find memory to stash the identity page tables\n");
+        return -ENOMEM;
+    }
+
+    /*
+     * Identity-map page table is required for running with CR0.PG=0
+     * when using Intel EPT. Create a 32-bit non-PAE page directory of
+     * superpages.
+     */
+    ident_pt = map_domain_gfn(p2m_get_hostp2m(d), _gfn(PFN_DOWN(gaddr)),
+                              &mfn, &p2mt, 0, &rc);
+    if ( ident_pt == NULL )
+    {
+        printk("Unable to map identity page tables\n");
+        return -ENOMEM;
+    }
+    write_32bit_pse_identmap(ident_pt);
+    unmap_domain_page(ident_pt);
+    put_page(mfn_to_page(mfn_x(mfn)));
+    d->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] = gaddr;
+    if ( pvh_add_mem_range(d, gaddr, gaddr + PAGE_SIZE, E820_RESERVED) )
+            printk("Unable to set identity page tables as reserved in the memory map\n");
+
+    return 0;
+}
+
+/* Assign the low 1MB to Dom0. */
+static void __init pvh_steal_low_ram(struct domain *d, unsigned long start,
+                                     unsigned long nr_pages)
+{
+    unsigned long mfn;
+
+    ASSERT(start + nr_pages <= PFN_DOWN(MB(1)));
+
+    for ( mfn = start; mfn < start + nr_pages; mfn++ )
+    {
+        struct page_info *pg = mfn_to_page(mfn);
+        int rc;
+
+        rc = unshare_xen_page_with_guest(pg, dom_io);
+        if ( rc )
+        {
+            printk("Unable to unshare Xen mfn %#lx: %d\n", mfn, rc);
+            continue;
+        }
+
+        share_xen_page_with_guest(pg, d, XENSHARE_writable);
+        rc = guest_physmap_add_entry(d, _gfn(mfn), _mfn(mfn), 0, p2m_ram_rw);
+        if ( rc )
+            printk("Unable to add mfn %#lx to p2m: %d\n", mfn, rc);
+    }
+}
+
+static int __init pvh_setup_p2m(struct domain *d)
+{
+    struct vcpu *v = d->vcpu[0];
+    unsigned long nr_pages;
+    unsigned int i;
+    int rc;
+    bool preempted;
+#define MB1_PAGES PFN_DOWN(MB(1))
+
+    nr_pages = compute_dom0_nr_pages(d, NULL, 0);
+
+    pvh_setup_e820(d, nr_pages);
+    do {
+        preempted = false;
+        paging_set_allocation(d, dom0_paging_pages(d, nr_pages),
+                              &preempted);
+        process_pending_softirqs();
+    } while ( preempted );
+
+    /*
+     * Memory below 1MB is identity mapped.
+     * NB: this only makes sense when booted from legacy BIOS.
+     */
+    rc = modify_identity_mmio(d, 0, MB1_PAGES, true);
+    if ( rc )
+    {
+        printk("Failed to identity map low 1MB: %d\n", rc);
+        return rc;
+    }
+
+    /* Populate memory map. */
+    for ( i = 0; i < d->arch.nr_e820; i++ )
+    {
+        unsigned long addr, size;
+
+        if ( d->arch.e820[i].type != E820_RAM )
+            continue;
+
+        addr = PFN_DOWN(d->arch.e820[i].addr);
+        size = PFN_DOWN(d->arch.e820[i].size);
+
+        if ( addr >= MB1_PAGES )
+            rc = pvh_populate_memory_range(d, addr, size);
+        else
+        {
+            ASSERT(addr + size < MB1_PAGES);
+            pvh_steal_low_ram(d, addr, size);
+        }
+
+        if ( rc )
+            return rc;
+    }
+
+    if ( cpu_has_vmx && paging_mode_hap(d) && !vmx_unrestricted_guest(v) )
+    {
+        /*
+         * Since Dom0 cannot be migrated, we will only setup the
+         * unrestricted guest helpers if they are needed by the current
+         * hardware we are running on.
+         */
+        rc = pvh_setup_vmx_realmode_helpers(d);
+        if ( rc )
+            return rc;
+    }
+
+    return 0;
+#undef MB1_PAGES
+}
+
 static int __init construct_dom0_pvh(struct domain *d, const module_t *image,
                                      unsigned long image_headroom,
                                      module_t *initrd,
                                      void *(*bootstrap_map)(const module_t *),
                                      char *cmdline)
 {
+    int rc;
 
     printk("** Building a PVH Dom0 **\n");
 
+    iommu_hwdom_init(d);
+
+    rc = pvh_setup_p2m(d);
+    if ( rc )
+    {
+        printk("Failed to setup Dom0 physical memory map\n");
+        return rc;
+    }
+
     panic("Building a PVHv2 Dom0 is not yet supported.");
     return 0;
 }
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 75bdbc3..14cf652 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -474,6 +474,22 @@  void share_xen_page_with_guest(
     spin_unlock(&d->page_alloc_lock);
 }
 
+int __init unshare_xen_page_with_guest(struct page_info *page,
+                                       struct domain *d)
+{
+    if ( page_get_owner(page) != d || !is_xen_heap_page(page) )
+        return -EINVAL;
+
+    if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
+        put_page(page);
+
+    /* Remove the owner and clear the flags. */
+    page->u.inuse.type_info = 0;
+    page_set_owner(page, NULL);
+
+    return 0;
+}
+
 void share_xen_page_with_privileged_guests(
     struct page_info *page, int readonly)
 {
@@ -6595,6 +6611,16 @@  void paging_invlpg(struct vcpu *v, unsigned long va)
         hvm_funcs.invlpg(v, va);
 }
 
+/* Build a 32bit PSE page table using 4MB pages. */
+void write_32bit_pse_identmap(uint32_t *l2)
+{
+    unsigned int i;
+
+    for ( i = 0; i < PAGE_SIZE / sizeof(*l2); i++ )
+        l2[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
+                 _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 51d6bdf..560a7fd 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -3053,7 +3053,7 @@  int shadow_enable(struct domain *d, u32 mode)
     unsigned int old_pages;
     struct page_info *pg = NULL;
     uint32_t *e;
-    int i, rv = 0;
+    int rv = 0;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
     mode |= PG_SH_enable;
@@ -3109,10 +3109,7 @@  int shadow_enable(struct domain *d, u32 mode)
         /* Fill it with 32-bit, non-PAE superpage entries, each mapping 4MB
          * of virtual address space onto the same physical address range */
         e = __map_domain_page(pg);
-        for ( i = 0; i < PAGE_SIZE / sizeof(*e); i++ )
-            e[i] = ((0x400000U * i)
-                    | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER
-                    | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
+        write_32bit_pse_identmap(e);
         unmap_domain_page(e);
         pg->u.inuse.type_info = PGT_l2_page_table | 1 | PGT_validated;
     }
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index a66d5b1..d4a074a 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -275,6 +275,8 @@  struct spage_info
 #define XENSHARE_readonly 1
 extern void share_xen_page_with_guest(
     struct page_info *page, struct domain *d, int readonly);
+extern int unshare_xen_page_with_guest(struct page_info *page,
+                                       struct domain *d);
 extern void share_xen_page_with_privileged_guests(
     struct page_info *page, int readonly);
 extern void free_shared_domheap_page(struct page_info *page);
@@ -597,4 +599,7 @@  typedef struct mm_rwlock {
 
 extern const char zero_page[];
 
+/* Build a 32bit PSE page table using 4MB pages. */
+void write_32bit_pse_identmap(uint32_t *l2);
+
 #endif /* __ASM_X86_MM_H__ */
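
A final note on pvh_steal_ram() above: it carves aligned space out of the
end of a RAM region by masking the candidate address down to the requested
alignment and then truncating the e820 entry, so both the stolen area and
any sub-alignment slack at the end stop being RAM. A standalone sketch of
that arithmetic using toy types (not Xen code):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct range {
    uint64_t addr, size;   /* mirrors the e820 fields pvh_steal_ram() uses */
};

/* Steal `size` bytes, aligned to `align` (a power of two), from the end. */
static int steal_from_end(struct range *r, uint64_t size, uint64_t align,
                          uint64_t *addr)
{
    align = align ? align : 1;          /* avoid wrap-around in the mask */
    *addr = (r->addr + r->size - size) & ~(align - 1);
    if ( *addr < r->addr )
        return -1;                      /* region too small */
    r->size = *addr - r->addr;          /* [*addr, old end) is given away */
    return 0;
}

int main(void)
{
    struct range r = { 0x100000, 0x7ff00000 };   /* RAM: [1MB, 2GB) */
    uint64_t tss;

    if ( !steal_from_end(&r, 265, 128, &tss) )
        printf("TSS at %#" PRIx64 ", RAM now [%#" PRIx64 ", %#" PRIx64 ")\n",
               tss, r.addr, r.addr + r.size);
    return 0;
}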