[2/3] x86: don't maintain compat M2P when !PV32
diff mbox series

Message ID 4575f42b-a347-b34e-0032-e04668106a9b@suse.com
State Superseded
Headers show
Series
  • x86: M2P maintenance adjustments (step 1)
Related show

Commit Message

Jan Beulich Aug. 6, 2020, 9:28 a.m. UTC
Note that opt_pv32's declaration / #define need to be moved due to other
header dependencies; in particular can asm-x86/mm.h not include
asm-x86/pv/domain.h.

While touching their definitions anyway, also adjust section placement
of m2p_compat_vstart and compat_idle_pg_table_l2. Similarly, while
putting init_xen_pae_l2_slots() inside #ifdef, also move it to a PV-only
source file.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
An alternative place for opt_pv32.h would seem to be asm-x86/config.h.

Comments

Andrew Cooper Aug. 6, 2020, 7:16 p.m. UTC | #1
On 06/08/2020 10:28, Jan Beulich wrote:
> Note that opt_pv32's declaration / #define need to be moved due to other
> header dependencies; in particular can asm-x86/mm.h not include
> asm-x86/pv/domain.h.

While I do appreciate that our headers are a complete tangle, I can't
help but feel that this is making the problem worse.

mm.h isn't a better place for opt_pv32 to live.  config.h perhaps,
seeing as its effects are wider than both the domain support itself, or
the memory management support ?

> While touching their definitions anyway, also adjust section placement
> of m2p_compat_vstart and compat_idle_pg_table_l2. Similarly, while
> putting init_xen_pae_l2_slots() inside #ifdef, also move it to a PV-only
> source file.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

So interestingly, this is done out of the order which I was expecting to
do things.  Its not a problem, but I'd like to double check that we
aren't creating future problems.

The goal of this suggestion was actually for PV-Shim, to have only the
regular or compat M2P, as they're fairly large structures and adversely
affect VM density.

This of course requires the kernel elf file to be parsed earlier during
boot, but that isn't a problem.  (It also allows for a PV/PVH dom0
usability fix, whereby the Xen command line has to match the ELF image
provided, rather than auto-selecting the default when only one option is
available.)

The other aspect would be to teach Xen to run on only the compat M2P,
which is fine for any shim smaller than 16T.  (Honestly, if it weren't
an ABI with guests, Shim ought to run exclusively on the compat M2P to
reduce the memory overhead.)

Then during boot, the Shim path would chose to construct only the
regular or compat M2P, based on bitness of the provided kernel.

> ---
> An alternative place for opt_pv32.h would seem to be asm-x86/config.h.

Oh - yes please.  I think that would be better overall.

>
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -597,8 +597,10 @@ int arch_domain_create(struct domain *d,
>      }
>      d->arch.emulation_flags = emflags;
>  
> +#ifdef CONFIG_PV32
>      HYPERVISOR_COMPAT_VIRT_START(d) =
>          is_pv_domain(d) ? __HYPERVISOR_COMPAT_VIRT_START : ~0u;
> +#endif

Can we drop HYPERVISOR_COMPAT_VIRT_START() ?

Its use here as an lvalue in particular makes logic especually hard to
follow, but all it is actually doing is wrapping the shorter
d->arch.hv_compat_vstart

In particular, it would remove the need to conditionally stub
HYPERVISOR_COMPAT_VIRT_START() later.

> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -315,10 +318,10 @@ static void destroy_m2p_mapping(struct m
>   */
>  static int setup_compat_m2p_table(struct mem_hotadd_info *info)
>  {
> +    int err = 0;
>      unsigned long i, smap, emap, epfn = info->epfn;
>      mfn_t mfn;
>      unsigned int n;
> -    int err = 0;

Remnants of an earlier change?

> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -42,8 +42,12 @@
>  #define _PGT_validated    PG_shift(6)
>  #define PGT_validated     PG_mask(1, 6)
>   /* PAE only: is this an L2 page directory containing Xen-private mappings? */
> +#ifdef CONFIG_PV32
>  #define _PGT_pae_xen_l2   PG_shift(7)
>  #define PGT_pae_xen_l2    PG_mask(1, 7)
> +#else
> +#define PGT_pae_xen_l2    0
> +#endif

Hmm - this is going to irritate Coverity and Clang some more.  I still
need to figure out an effective way to make Coverity not object to this
type of short circuiting like this.

I've looked through the users and I think that they're all safe.  I do
however wonder whether is_guest_l2_slot() can be simplified and have its
is_pv_32bit_domain() clause dropped, seeing as it is expensive with its
lfences, and the logic ought to only care about PGT_pae_xen_l2 vs
PGT_l2_page_table.

>  /* Has this page been *partially* validated for use as its current type? */
>  #define _PGT_partial      PG_shift(8)
>  #define PGT_partial       PG_mask(1, 8)
> @@ -494,15 +498,39 @@ extern paddr_t mem_hotplug;
>  #define SHARED_M2P_ENTRY         (~0UL - 1UL)
>  #define SHARED_M2P(_e)           ((_e) == SHARED_M2P_ENTRY)
>  
> -#define compat_machine_to_phys_mapping ((unsigned int *)RDWR_COMPAT_MPT_VIRT_START)
> +#ifdef CONFIG_PV32
> +
> +extern int8_t opt_pv32;
> +
> +# define compat_machine_to_phys_mapping ((unsigned int *)RDWR_COMPAT_MPT_VIRT_START)
> +
> +# define set_compat_m2p(mfn, entry) \
> +    ((void)(!opt_pv32 || \
> +            (mfn) >= (RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) / 4 || \
> +            (compat_machine_to_phys_mapping[mfn] = (entry))))

I know this is extracting previous logic, but "entry" would probably be
better if it were named "val" or similar.

However, see my reply to patch 3 which I think will simplify this
substantially.

~Andrew
Jan Beulich Aug. 7, 2020, 10:12 a.m. UTC | #2
On 06.08.2020 21:16, Andrew Cooper wrote:
> On 06/08/2020 10:28, Jan Beulich wrote:
>> Note that opt_pv32's declaration / #define need to be moved due to other
>> header dependencies; in particular can asm-x86/mm.h not include
>> asm-x86/pv/domain.h.
>> 
>> While touching their definitions anyway, also adjust section placement
>> of m2p_compat_vstart and compat_idle_pg_table_l2. Similarly, while
>> putting init_xen_pae_l2_slots() inside #ifdef, also move it to a PV-only
>> source file.
>>
>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> So interestingly, this is done out of the order which I was expecting to
> do things.  Its not a problem, but I'd like to double check that we
> aren't creating future problems.

I've thought about this for quite some time, and didn't see how it
would cause problems. And the change here clearly is the more low
hanging fruit.

> The goal of this suggestion was actually for PV-Shim, to have only the
> regular or compat M2P, as they're fairly large structures and adversely
> affect VM density.

But in particular for {INVALID,SHARED}_M2P_ENTRY there'll need to
be some, well, hacks if we want to use the compat one as a
replacement for the native one. This will require some more careful
thought (at least on my side).

> This of course requires the kernel elf file to be parsed earlier during
> boot, but that isn't a problem.  (It also allows for a PV/PVH dom0
> usability fix, whereby the Xen command line has to match the ELF image
> provided, rather than auto-selecting the default when only one option is
> available.)

Hmm, no, that's not my current plan, see the cover letter. I've
already checked that there are no set_gpfn_from_mfn() uses (except
with INVALID_M2P_ENTRY) ahead of Dom0 creation. So instead of
moving the parsing earlier, I'm intending to move the setting up of
the (right) M2P later. My current take on this is that it'll mainly
involve breaking out existing code into its own functions.

> The other aspect would be to teach Xen to run on only the compat M2P,
> which is fine for any shim smaller than 16T.  (Honestly, if it weren't
> an ABI with guests, Shim ought to run exclusively on the compat M2P to
> reduce the memory overhead.)

You've covered the shim aspect above, I thought, and the ABI aspect
precludes not maintaining the native M2P when there's a 64-bit guest.
So I'm not sure what you're trying to suggest here that we may be
able to gain.

> Then during boot, the Shim path would chose to construct only the
> regular or compat M2P, based on bitness of the provided kernel.

That's the plan, yes, but covered higher up.

>> ---
>> An alternative place for opt_pv32.h would seem to be asm-x86/config.h.
> 
> Oh - yes please.  I think that would be better overall.

Done.

>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -597,8 +597,10 @@ int arch_domain_create(struct domain *d,
>>      }
>>      d->arch.emulation_flags = emflags;
>>  
>> +#ifdef CONFIG_PV32
>>      HYPERVISOR_COMPAT_VIRT_START(d) =
>>          is_pv_domain(d) ? __HYPERVISOR_COMPAT_VIRT_START : ~0u;
>> +#endif
> 
> Can we drop HYPERVISOR_COMPAT_VIRT_START() ?
> 
> Its use here as an lvalue in particular makes logic especually hard to
> follow, but all it is actually doing is wrapping the shorter
> d->arch.hv_compat_vstart
> 
> In particular, it would remove the need to conditionally stub
> HYPERVISOR_COMPAT_VIRT_START() later.

I can do this as a prereq patch, sure, but I'm not convinced as
the avoiding of the stub will mean a few new #ifdef-s afaict.
Please confirm that you're convinced this will yield the overall
better result.

>> --- a/xen/arch/x86/x86_64/mm.c
>> +++ b/xen/arch/x86/x86_64/mm.c
>> @@ -315,10 +318,10 @@ static void destroy_m2p_mapping(struct m
>>   */
>>  static int setup_compat_m2p_table(struct mem_hotadd_info *info)
>>  {
>> +    int err = 0;
>>      unsigned long i, smap, emap, epfn = info->epfn;
>>      mfn_t mfn;
>>      unsigned int n;
>> -    int err = 0;
> 
> Remnants of an earlier change?

Oops.

>> --- a/xen/include/asm-x86/mm.h
>> +++ b/xen/include/asm-x86/mm.h
>> @@ -42,8 +42,12 @@
>>  #define _PGT_validated    PG_shift(6)
>>  #define PGT_validated     PG_mask(1, 6)
>>   /* PAE only: is this an L2 page directory containing Xen-private mappings? */
>> +#ifdef CONFIG_PV32
>>  #define _PGT_pae_xen_l2   PG_shift(7)
>>  #define PGT_pae_xen_l2    PG_mask(1, 7)
>> +#else
>> +#define PGT_pae_xen_l2    0
>> +#endif
> 
> Hmm - this is going to irritate Coverity and Clang some more.  I still
> need to figure out an effective way to make Coverity not object to this
> type of short circuiting like this.

I assume this is just a remark, not implying any action on my
part?

> I've looked through the users and I think that they're all safe.

I wouldn't have dared make the change without first checking.

>  I do
> however wonder whether is_guest_l2_slot() can be simplified and have its
> is_pv_32bit_domain() clause dropped, seeing as it is expensive with its
> lfences, and the logic ought to only care about PGT_pae_xen_l2 vs
> PGT_l2_page_table.

Good idea, yes, but that'll be a separate patch.

>> @@ -494,15 +498,39 @@ extern paddr_t mem_hotplug;
>>  #define SHARED_M2P_ENTRY         (~0UL - 1UL)
>>  #define SHARED_M2P(_e)           ((_e) == SHARED_M2P_ENTRY)
>>  
>> -#define compat_machine_to_phys_mapping ((unsigned int *)RDWR_COMPAT_MPT_VIRT_START)
>> +#ifdef CONFIG_PV32
>> +
>> +extern int8_t opt_pv32;
>> +
>> +# define compat_machine_to_phys_mapping ((unsigned int *)RDWR_COMPAT_MPT_VIRT_START)
>> +
>> +# define set_compat_m2p(mfn, entry) \
>> +    ((void)(!opt_pv32 || \
>> +            (mfn) >= (RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) / 4 || \
>> +            (compat_machine_to_phys_mapping[mfn] = (entry))))
> 
> I know this is extracting previous logic, but "entry" would probably be
> better if it were named "val" or similar.

I was wondering myself, but didn't consider val or alike meaningfully
better. As it looks you do, I'll switch.

> However, see my reply to patch 3 which I think will simplify this
> substantially.

Neither my inbox nor the list archives have such a reply, so I can
only assume this is yet to be sent.

Jan
Jan Beulich Aug. 7, 2020, 10:24 a.m. UTC | #3
On 07.08.2020 12:12, Jan Beulich wrote:
> On 06.08.2020 21:16, Andrew Cooper wrote:
>> On 06/08/2020 10:28, Jan Beulich wrote:
>>> An alternative place for opt_pv32.h would seem to be asm-x86/config.h.
>>
>> Oh - yes please.  I think that would be better overall.
> 
> Done.

Now that I'm trying to build the result: There's an immediate
downside - we can't use int8_t in config.h, so I'll have to
switch to using signed char instead.

Jan
Jan Beulich Aug. 25, 2020, 1:23 p.m. UTC | #4
On 07.08.2020 12:12, Jan Beulich wrote:
> On 06.08.2020 21:16, Andrew Cooper wrote:
>> On 06/08/2020 10:28, Jan Beulich wrote:
>>> Note that opt_pv32's declaration / #define need to be moved due to other
>>> header dependencies; in particular can asm-x86/mm.h not include
>>> asm-x86/pv/domain.h.
>>>
>>> While touching their definitions anyway, also adjust section placement
>>> of m2p_compat_vstart and compat_idle_pg_table_l2. Similarly, while
>>> putting init_xen_pae_l2_slots() inside #ifdef, also move it to a PV-only
>>> source file.
>>>
>>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>
>> So interestingly, this is done out of the order which I was expecting to
>> do things.  Its not a problem, but I'd like to double check that we
>> aren't creating future problems.
> 
> I've thought about this for quite some time, and didn't see how it
> would cause problems. And the change here clearly is the more low
> hanging fruit.
> 
>> The goal of this suggestion was actually for PV-Shim, to have only the
>> regular or compat M2P, as they're fairly large structures and adversely
>> affect VM density.
> 
> But in particular for {INVALID,SHARED}_M2P_ENTRY there'll need to
> be some, well, hacks if we want to use the compat one as a
> replacement for the native one. This will require some more careful
> thought (at least on my side).

Having looked into this some more, I'm still unsure whether this is a
viable thing to do. While we do have VALID_M2P() (checking the top bit
of the entry), its use is rather limited. The most noteworthy place
(but by far not the only one) where it's _not_ used is perhaps the
handling of MMU_MACHPHYS_UPDATE. Additionally there's no similar
checking of bit 31 for 32-bit guests at all. Hence at a first
approximation both (uint32_t)INVALID_P2M_ENTRY and
(uint32_t)SHARED_P2M_ENTRY are to be considered valid GFNs (albeit
they wouldn't have been on a 32-bit Xen, but those were similarly
lacking enforcement of this restriction anyway).

Jan

Patch
diff mbox series

--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -597,8 +597,10 @@  int arch_domain_create(struct domain *d,
     }
     d->arch.emulation_flags = emflags;
 
+#ifdef CONFIG_PV32
     HYPERVISOR_COMPAT_VIRT_START(d) =
         is_pv_domain(d) ? __HYPERVISOR_COMPAT_VIRT_START : ~0u;
+#endif
 
     if ( (rc = paging_domain_init(d)) != 0 )
         goto fail;
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1427,13 +1427,6 @@  static bool pae_xen_mappings_check(const
     return true;
 }
 
-void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d)
-{
-    memcpy(&l2t[COMPAT_L2_PAGETABLE_FIRST_XEN_SLOT(d)],
-           compat_idle_pg_table_l2,
-           COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*l2t));
-}
-
 static int promote_l2_table(struct page_info *page, unsigned long type)
 {
     struct domain *d = page_get_owner(page);
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -128,6 +128,15 @@  bool pv_map_ldt_shadow_page(unsigned int
     return true;
 }
 
+#ifdef CONFIG_PV32
+void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d)
+{
+    memcpy(&l2t[COMPAT_L2_PAGETABLE_FIRST_XEN_SLOT(d)],
+           compat_idle_pg_table_l2,
+           COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*l2t));
+}
+#endif
+
 /*
  * Local variables:
  * mode: C
--- a/xen/arch/x86/x86_64/compat/mm.c
+++ b/xen/arch/x86/x86_64/compat/mm.c
@@ -103,6 +103,9 @@  int compat_arch_memory_op(unsigned long
             .max_mfn = MACH2PHYS_COMPAT_NR_ENTRIES(d) - 1
         };
 
+        if ( !opt_pv32 )
+            return -EOPNOTSUPP;
+
         if ( copy_to_guest(arg, &mapping, 1) )
             rc = -EFAULT;
 
@@ -115,6 +118,9 @@  int compat_arch_memory_op(unsigned long
         unsigned long limit;
         compat_pfn_t last_mfn;
 
+        if ( !opt_pv32 )
+            return -EOPNOTSUPP;
+
         if ( copy_from_guest(&xmml, arg, 1) )
             return -EFAULT;
 
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -40,9 +40,11 @@  EMIT_FILE;
 #include <asm/mem_sharing.h>
 #include <public/memory.h>
 
-unsigned int __read_mostly m2p_compat_vstart = __HYPERVISOR_COMPAT_VIRT_START;
+#ifdef CONFIG_PV32
+unsigned int __initdata m2p_compat_vstart = __HYPERVISOR_COMPAT_VIRT_START;
 
-l2_pgentry_t *compat_idle_pg_table_l2;
+l2_pgentry_t *__read_mostly compat_idle_pg_table_l2;
+#endif
 
 void *do_page_walk(struct vcpu *v, unsigned long addr)
 {
@@ -218,7 +220,8 @@  static void destroy_compat_m2p_mapping(s
 {
     unsigned long i, smap = info->spfn, emap = info->spfn;
 
-    if ( smap > ((RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) >> 2) )
+    if ( !opt_pv32 ||
+         smap > ((RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) >> 2) )
         return;
 
     if ( emap > ((RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) >> 2) )
@@ -315,10 +318,10 @@  static void destroy_m2p_mapping(struct m
  */
 static int setup_compat_m2p_table(struct mem_hotadd_info *info)
 {
+    int err = 0;
     unsigned long i, smap, emap, epfn = info->epfn;
     mfn_t mfn;
     unsigned int n;
-    int err = 0;
 
     smap = info->spfn & (~((1UL << (L2_PAGETABLE_SHIFT - 2)) -1));
 
@@ -326,7 +329,8 @@  static int setup_compat_m2p_table(struct
      * Notice: For hot-added memory, only range below m2p_compat_vstart
      * will be filled up (assuming memory is discontinous when booting).
      */
-    if   ((smap > ((RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) >> 2)) )
+    if ( !opt_pv32 ||
+         (smap > ((RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) >> 2)) )
         return 0;
 
     if ( epfn > ((RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) >> 2) )
@@ -368,6 +372,7 @@  static int setup_compat_m2p_table(struct
     }
 #undef CNT
 #undef MFN
+
     return err;
 }
 
@@ -609,17 +614,24 @@  void __init paging_init(void)
 #undef MFN
 
     /* Create user-accessible L2 directory to map the MPT for compat guests. */
-    if ( (l2_ro_mpt = alloc_xen_pagetable()) == NULL )
-        goto nomem;
-    compat_idle_pg_table_l2 = l2_ro_mpt;
-    clear_page(l2_ro_mpt);
-    /* Allocate and map the compatibility mode machine-to-phys table. */
-    mpt_size = (mpt_size >> 1) + (1UL << (L2_PAGETABLE_SHIFT - 1));
-    if ( mpt_size > RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START )
-        mpt_size = RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START;
-    mpt_size &= ~((1UL << L2_PAGETABLE_SHIFT) - 1UL);
-    if ( (m2p_compat_vstart + mpt_size) < MACH2PHYS_COMPAT_VIRT_END )
-        m2p_compat_vstart = MACH2PHYS_COMPAT_VIRT_END - mpt_size;
+    if ( opt_pv32 )
+    {
+        if ( (l2_ro_mpt = alloc_xen_pagetable()) == NULL )
+            goto nomem;
+        compat_idle_pg_table_l2 = l2_ro_mpt;
+        clear_page(l2_ro_mpt);
+
+        /* Allocate and map the compatibility mode machine-to-phys table. */
+        mpt_size = (mpt_size >> 1) + (1UL << (L2_PAGETABLE_SHIFT - 1));
+        if ( mpt_size > RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START )
+            mpt_size = RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START;
+        mpt_size &= ~((1UL << L2_PAGETABLE_SHIFT) - 1UL);
+        if ( (m2p_compat_vstart + mpt_size) < MACH2PHYS_COMPAT_VIRT_END )
+            m2p_compat_vstart = MACH2PHYS_COMPAT_VIRT_END - mpt_size;
+    }
+    else
+        mpt_size = 0;
+
 #define MFN(x) (((x) << L2_PAGETABLE_SHIFT) / sizeof(unsigned int))
 #define CNT ((sizeof(*frame_table) & -sizeof(*frame_table)) / \
              sizeof(*compat_machine_to_phys_mapping))
@@ -845,23 +857,24 @@  void __init subarch_init_memory(void)
                 mfn_to_page(_mfn(m2p_start_mfn + i)), SHARE_ro);
     }
 
-    for ( v  = RDWR_COMPAT_MPT_VIRT_START;
-          v != RDWR_COMPAT_MPT_VIRT_END;
-          v += 1 << L2_PAGETABLE_SHIFT )
-    {
-        l3e = l3e_from_l4e(idle_pg_table[l4_table_offset(v)],
-                           l3_table_offset(v));
-        if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) )
-            continue;
-        l2e = l2e_from_l3e(l3e, l2_table_offset(v));
-        if ( !(l2e_get_flags(l2e) & _PAGE_PRESENT) )
-            continue;
-        m2p_start_mfn = l2e_get_pfn(l2e);
+    if ( opt_pv32 )
+        for ( v  = RDWR_COMPAT_MPT_VIRT_START;
+              v != RDWR_COMPAT_MPT_VIRT_END;
+              v += 1 << L2_PAGETABLE_SHIFT )
+        {
+            l3e = l3e_from_l4e(idle_pg_table[l4_table_offset(v)],
+                               l3_table_offset(v));
+            if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) )
+                continue;
+            l2e = l2e_from_l3e(l3e, l2_table_offset(v));
+            if ( !(l2e_get_flags(l2e) & _PAGE_PRESENT) )
+                continue;
+            m2p_start_mfn = l2e_get_pfn(l2e);
 
-        for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
-            share_xen_page_with_privileged_guests(
-                mfn_to_page(_mfn(m2p_start_mfn + i)), SHARE_ro);
-    }
+            for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
+                share_xen_page_with_privileged_guests(
+                    mfn_to_page(_mfn(m2p_start_mfn + i)), SHARE_ro);
+        }
 
     /* Mark all of direct map NX if hardware supports it. */
     if ( !cpu_has_nx )
@@ -933,6 +946,9 @@  long subarch_memory_op(unsigned long cmd
         break;
 
     case XENMEM_machphys_compat_mfn_list:
+        if ( !opt_pv32 )
+            return -EOPNOTSUPP;
+
         if ( copy_from_guest(&xmml, arg, 1) )
             return -EFAULT;
 
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -142,7 +142,7 @@  extern unsigned char boot_edid_info[128]
  *  0xffff82c000000000 - 0xffff82cfffffffff [64GB,  2^36 bytes, PML4:261]
  *    vmap()/ioremap()/fixmap area.
  *  0xffff82d000000000 - 0xffff82d03fffffff [1GB,   2^30 bytes, PML4:261]
- *    Compatibility machine-to-phys translation table.
+ *    Compatibility machine-to-phys translation table (CONFIG_PV32).
  *  0xffff82d040000000 - 0xffff82d07fffffff [1GB,   2^30 bytes, PML4:261]
  *    Xen text, static data, bss.
 #ifndef CONFIG_BIGMEM
@@ -246,9 +246,18 @@  extern unsigned char boot_edid_info[128]
 
 #ifndef __ASSEMBLY__
 
+#ifdef CONFIG_PV32
+
 /* This is not a fixed value, just a lower limit. */
 #define __HYPERVISOR_COMPAT_VIRT_START 0xF5800000
 #define HYPERVISOR_COMPAT_VIRT_START(d) ((d)->arch.hv_compat_vstart)
+
+#else
+
+#define HYPERVISOR_COMPAT_VIRT_START(d) ((void)(d), 0)
+
+#endif
+
 #define MACH2PHYS_COMPAT_VIRT_START    HYPERVISOR_COMPAT_VIRT_START
 #define MACH2PHYS_COMPAT_VIRT_END      0xFFE00000
 #define MACH2PHYS_COMPAT_NR_ENTRIES(d) \
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -307,7 +307,9 @@  struct arch_domain
 {
     struct page_info *perdomain_l3_pg;
 
+#ifdef CONFIG_PV32
     unsigned int hv_compat_vstart;
+#endif
 
     /* Maximum physical-address bitwidth supported by this guest. */
     unsigned int physaddr_bitsize;
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -42,8 +42,12 @@ 
 #define _PGT_validated    PG_shift(6)
 #define PGT_validated     PG_mask(1, 6)
  /* PAE only: is this an L2 page directory containing Xen-private mappings? */
+#ifdef CONFIG_PV32
 #define _PGT_pae_xen_l2   PG_shift(7)
 #define PGT_pae_xen_l2    PG_mask(1, 7)
+#else
+#define PGT_pae_xen_l2    0
+#endif
 /* Has this page been *partially* validated for use as its current type? */
 #define _PGT_partial      PG_shift(8)
 #define PGT_partial       PG_mask(1, 8)
@@ -494,15 +498,39 @@  extern paddr_t mem_hotplug;
 #define SHARED_M2P_ENTRY         (~0UL - 1UL)
 #define SHARED_M2P(_e)           ((_e) == SHARED_M2P_ENTRY)
 
-#define compat_machine_to_phys_mapping ((unsigned int *)RDWR_COMPAT_MPT_VIRT_START)
+#ifdef CONFIG_PV32
+
+extern int8_t opt_pv32;
+
+# define compat_machine_to_phys_mapping ((unsigned int *)RDWR_COMPAT_MPT_VIRT_START)
+
+# define set_compat_m2p(mfn, entry) \
+    ((void)(!opt_pv32 || \
+            (mfn) >= (RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) / 4 || \
+            (compat_machine_to_phys_mapping[mfn] = (entry))))
+
+#else /* !CONFIG_PV32 */
+
+# define opt_pv32 false
+
+/*
+ * Declare the symbol such that (dead) code referencing it can be built
+ * without a lot of #ifdef-ary, but mark it fully const and don't define
+ * this symbol anywhere (relying on DCE by the compiler).
+ */
+extern const unsigned int *const compat_machine_to_phys_mapping;
+
+# define set_compat_m2p(mfn, entry)
+
+#endif /* CONFIG_PV32 */
+
 #define _set_gpfn_from_mfn(mfn, pfn) ({                        \
     struct domain *d = page_get_owner(mfn_to_page(_mfn(mfn))); \
     unsigned long entry = (d && (d == dom_cow)) ?              \
         SHARED_M2P_ENTRY : (pfn);                              \
-    ((void)((mfn) >= (RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START) / 4 || \
-            (compat_machine_to_phys_mapping[(mfn)] = (unsigned int)(entry))), \
-     machine_to_phys_mapping[(mfn)] = (entry));                \
-    })
+    set_compat_m2p(mfn, (unsigned int)(entry));                \
+    machine_to_phys_mapping[mfn] = (entry);                    \
+})
 
 /*
  * Disable some users of set_gpfn_from_mfn() (e.g., free_heap_pages()) until
--- a/xen/include/asm-x86/pv/domain.h
+++ b/xen/include/asm-x86/pv/domain.h
@@ -23,12 +23,6 @@ 
 
 #include <xen/sched.h>
 
-#ifdef CONFIG_PV32
-extern int8_t opt_pv32;
-#else
-# define opt_pv32 false
-#endif
-
 /*
  * PCID values for the address spaces of 64-bit pv domains:
  *