
[11/22] x86: add a boot option to enable and disable the direct map

Message ID 20221216114853.8227-12-julien@xen.org (mailing list archive)
Series: Remove the directmap

Commit Message

Julien Grall Dec. 16, 2022, 11:48 a.m. UTC
From: Hongyan Xia <hongyxia@amazon.com>

Also add a helper function to retrieve it. Change arch_mfns_in_directmap()
to check this option before returning.

This is added as a boot command line option rather than a Kconfig option,
to allow the user to experiment with the feature without rebuilding the
hypervisor.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>

----

    TODO:
        * Do we also want to provide a Kconfig option?

    Changes since Hongyan's version:
        * Reword the commit message
        * opt_directmap is only modified during boot so mark it as
          __ro_after_init
---
 docs/misc/xen-command-line.pandoc | 12 ++++++++++++
 xen/arch/arm/include/asm/mm.h     |  5 +++++
 xen/arch/x86/include/asm/mm.h     | 17 ++++++++++++++++-
 xen/arch/x86/mm.c                 |  3 +++
 xen/arch/x86/setup.c              |  2 ++
 5 files changed, 38 insertions(+), 1 deletion(-)

Comments

Jan Beulich Dec. 22, 2022, 1:24 p.m. UTC | #1
On 16.12.2022 12:48, Julien Grall wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
> 
> Also add a helper function to retrieve it. Change arch_mfns_in_directmap()
> to check this option before returning.

I think the abstract parts of this want to be generic right away. I can't
see why Arm would not suffer from the same issue that this work is trying
to address.

> This is added as a boot command line option rather than a Kconfig option,
> to allow the user to experiment with the feature without rebuilding the
> hypervisor.

I think there wants to be a (generic) Kconfig piece here, to control the
default of the option. Plus a 2nd, prompt-less element which an arch can
select to force the setting to always-on, suppressing the choice of
default. That 2nd control would then be used to compile out the
boolean_param() for Arm for the time being.
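Jan's two-symbol suggestion could be sketched roughly as follows. This is a hypothetical Kconfig fragment, not part of the patch; both symbol names are made up for illustration:

```kconfig
# Sketch only: controls the default of the "directmap=" boot option.
config DIRECTMAP_DEFAULT
	bool "Build with the directmap enabled by default" if !HAS_FORCED_DIRECTMAP
	default y
	help
	  Default value of the "directmap=" hypervisor command line option.

# Prompt-less element an architecture can select to force the setting
# always-on, suppressing both the prompt above and the boot option.
config HAS_FORCED_DIRECTMAP
	bool
```

Under this scheme Arm would select the prompt-less symbol for the time being, and the boolean_param() registration would be compiled out when it is set.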

That said, I think this change comes too early in the series, or there is
something missing. As said in reply to patch 10, while there the mapcache
is being initialized for the idle domain, I don't think it can be used
just yet. Read through mapcache_current_vcpu() to understand why I think
that way, paying particular attention to the ASSERT() near the end. In
preparation of this patch here I think the mfn_to_virt() uses have to all
disappear from map_domain_page(). Perhaps yet more strongly all
..._to_virt() (except fix_to_virt() and friends) and __va() have to
disappear up front from x86 and any code path which can be taken on x86
(which may simply mean purging all respective x86 #define-s, without
breaking the build in any way).

Jan
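The structure Jan is objecting to can be modelled in a few lines of stand-alone C. Everything below is a simplified stand-in, not Xen code: the constants, the use of plain unsigned long instead of void *, and the "mapcache" are all fakes; only the control flow is the point.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT           12
#define DIRECTMAP_VIRT_START 0xffff830000000000UL /* mirrors the x86 layout */
#define DIRECTMAP_END_MFN    0x1000UL             /* made-up directmap limit */

static bool opt_directmap = true;
static int mapcache_uses; /* counts slow-path lookups, for illustration */

static bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
{
    if ( !opt_directmap )
        return false;
    return (mfn + nr) <= DIRECTMAP_END_MFN;
}

static unsigned long mapcache_map(unsigned long mfn)
{
    mapcache_uses++;
    return 0x1000UL + (mfn << PAGE_SHIFT); /* fake per-vCPU mapping slot */
}

/*
 * The mfn_to_virt()-style shortcut below is what has to disappear (or be
 * guarded) before directmap=no can work: with no directmap, every lookup
 * must take the mapcache route, so the mapcache must be usable that early.
 */
static unsigned long model_map_domain_page(unsigned long mfn)
{
    if ( arch_mfns_in_directmap(mfn, 1) )
        return DIRECTMAP_VIRT_START + (mfn << PAGE_SHIFT);
    return mapcache_map(mfn);
}
```

With opt_directmap cleared the fast path is never taken, which is why the mapcache has to be ready before this patch can take effect.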
Stefano Stabellini Jan. 23, 2023, 9:45 p.m. UTC | #2
On Fri, 16 Dec 2022, Julien Grall wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
> 
> Also add a helper function to retrieve it. Change arch_mfns_in_directmap()
> to check this option before returning.
> 
> This is added as a boot command line option rather than a Kconfig option,
> to allow the user to experiment with the feature without rebuilding the
> hypervisor.
> 
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ----
> 
>     TODO:
>         * Do we also want to provide a Kconfig option?
> 
>     Changes since Hongyan's version:
>         * Reword the commit message
>         * opt_directmap is only modified during boot so mark it as
>           __ro_after_init
> ---
>  docs/misc/xen-command-line.pandoc | 12 ++++++++++++
>  xen/arch/arm/include/asm/mm.h     |  5 +++++
>  xen/arch/x86/include/asm/mm.h     | 17 ++++++++++++++++-
>  xen/arch/x86/mm.c                 |  3 +++
>  xen/arch/x86/setup.c              |  2 ++
>  5 files changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index b7ee97be762e..a63e4612acac 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -760,6 +760,18 @@ Specify the size of the console debug trace buffer. By specifying `cpu:`
>  additionally a trace buffer of the specified size is allocated per cpu.
>  The debug trace feature is only enabled in debugging builds of Xen.
>  
> +### directmap (x86)
> +> `= <boolean>`
> +
> +> Default: `true`
> +
> +Enable or disable the direct map region in Xen.
> +
> +By default, Xen creates the direct map region which maps physical memory
> +in that region. Setting this to no will remove the direct map, blocking
> +exploits that leak secrets via speculative memory access in the direct
> +map.
> +
>  ### dma_bits
>  > `= <integer>`
>  
> diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
> index 68adcac9fa8d..2366928d71aa 100644
> --- a/xen/arch/arm/include/asm/mm.h
> +++ b/xen/arch/arm/include/asm/mm.h
> @@ -406,6 +406,11 @@ static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t gfn)
>      } while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x );
>  }
>  
> +static inline bool arch_has_directmap(void)
> +{
> +    return true;

Shouldn't arch_has_directmap() return false for arm32?

> +}
> +
>  #endif /*  __ARCH_ARM_MM__ */
>  /*
>   * Local variables:
> diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
> index db29e3e2059f..cf8b20817c6c 100644
> --- a/xen/arch/x86/include/asm/mm.h
> +++ b/xen/arch/x86/include/asm/mm.h
> @@ -464,6 +464,8 @@ static inline int get_page_and_type(struct page_info *page,
>      ASSERT(((_p)->count_info & PGC_count_mask) != 0);          \
>      ASSERT(page_get_owner(_p) == (_d))
>  
> +extern bool opt_directmap;
> +
>  /******************************************************************************
>   * With shadow pagetables, the different kinds of address start
>   * to get get confusing.
> @@ -620,13 +622,26 @@ extern const char zero_page[];
>  /* Build a 32bit PSE page table using 4MB pages. */
>  void write_32bit_pse_identmap(uint32_t *l2);
>  
> +static inline bool arch_has_directmap(void)
> +{
> +    return opt_directmap;
> +}
> +
>  /*
>   * x86 maps part of physical memory via the directmap region.
>   * Return whether the range of MFN falls in the directmap region.
> + *
> + * When boot command line sets directmap=no, we will not have a direct map at
> + * all so this will always return false.
>   */
>  static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
>  {
> -    unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
> +    unsigned long eva;
> +
> +    if ( !arch_has_directmap() )
> +        return false;
> +
> +    eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
>  
>      return (mfn + nr) <= (virt_to_mfn(eva - 1) + 1);
>  }
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 041bd4cfde17..e76e135b96fc 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -157,6 +157,9 @@ l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
>  l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
>      l1_fixmap_x[L1_PAGETABLE_ENTRIES];
>  
> +bool __ro_after_init opt_directmap = true;
> +boolean_param("directmap", opt_directmap);
> +
>  /* Frame table size in pages. */
>  unsigned long max_page;
>  unsigned long total_pages;
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 1c2e09711eb0..2cb051c6e4e7 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1423,6 +1423,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>      if ( highmem_start )
>          xenheap_max_mfn(PFN_DOWN(highmem_start - 1));
>  
> +    printk("Booting with directmap %s\n", arch_has_directmap() ? "on" : "off");
> +
>      /*
>       * Walk every RAM region and map it in its entirety (on x86/64, at least)
>       * and notify it to the boot allocator.
> -- 
> 2.38.1
>
Julien Grall Jan. 23, 2023, 10:01 p.m. UTC | #3
Hi Stefano,

On 23/01/2023 21:45, Stefano Stabellini wrote:
>> diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
>> index 68adcac9fa8d..2366928d71aa 100644
>> --- a/xen/arch/arm/include/asm/mm.h
>> +++ b/xen/arch/arm/include/asm/mm.h
>> @@ -406,6 +406,11 @@ static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t gfn)
>>       } while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x );
>>   }
>>   
>> +static inline bool arch_has_directmap(void)
>> +{
>> +    return true;
> 
> Shouldn't arch_has_directmap() return false for arm32?

We still have a directmap on Arm32, but it only covers the xenheap.

Cheers,
Elias El Yandouzi Jan. 11, 2024, 10:47 a.m. UTC | #4
Hi,

On 22/12/2022 13:24, Jan Beulich wrote:
> That said, I think this change comes too early in the series, or there is
> something missing. 

At first, I had the same feeling but looking at the rest of the series, 
I can see that the option is needed in follow-up patches.

> As said in reply to patch 10, while there the mapcache
> is being initialized for the idle domain, I don't think it can be used
> just yet. Read through mapcache_current_vcpu() to understand why I think
> that way, paying particular attention to the ASSERT() near the end.

Would you be able to elaborate a bit more on why you think that? I haven't
been able to get your point.

> In preparation of this patch here I think the mfn_to_virt() uses have to all
> disappear from map_domain_page(). Perhaps yet more strongly all
> ..._to_virt() (except fix_to_virt() and friends) and __va() have to
> disappear up front from x86 and any code path which can be taken on x86
> (which may simply mean purging all respective x86 #define-s, without
> breaking the build in any way).

I agree with you on that one. I think it is what we're aiming for in the 
long term. However, as mentioned by Julien in the cover letter, the 
series's name is a misnomer and I am afraid we won't be able to remove 
all of them with this series. These helpers would still be used for 
xenheap pages or when the direct map is enabled.
Jan Beulich Jan. 11, 2024, 11:53 a.m. UTC | #5
On 11.01.2024 11:47, Elias El Yandouzi wrote:
> On 22/12/2022 13:24, Jan Beulich wrote:
>> That said, I think this change comes too early in the series, or there is
>> something missing. 
> 
> At first, I had the same feeling but looking at the rest of the series, 
> I can see that the option is needed in follow-up patches.
> 
>> As said in reply to patch 10, while there the mapcache
>> is being initialized for the idle domain, I don't think it can be used
>> just yet. Read through mapcache_current_vcpu() to understand why I think
>> that way, paying particular attention to the ASSERT() near the end.
> 
> Would you be able to elaborate a bit more on why you think that? I haven't
> been able to get your point.

Why exactly I referred to the ASSERT() there I can't reconstruct. The
function as a whole looks problematic though when suddenly the idle
domain also gains a mapcache. I'm sorry, too much context was lost
from over a year ago; all of this will need looking at from scratch
again whenever a new version was posted.

>> In preparation of this patch here I think the mfn_to_virt() uses have to all
>> disappear from map_domain_page(). Perhaps yet more strongly all
>> ..._to_virt() (except fix_to_virt() and friends) and __va() have to
>> disappear up front from x86 and any code path which can be taken on x86
>> (which may simply mean purging all respective x86 #define-s, without
>> breaking the build in any way).
> 
> I agree with you on that one. I think it is what we're aiming for in the 
> long term. However, as mentioned by Julien in the cover letter, the 
> series's name is a misnomer and I am afraid we won't be able to remove 
> all of them with this series. These helpers would still be used for 
> xenheap pages or when the direct map is enabled.

Leaving a hazard of certain uses not having been converted, or even
overlooked in patches going in at around the same time as this series?
I view this as pretty "adventurous".

Jan
Julien Grall Jan. 11, 2024, 12:25 p.m. UTC | #6
Hi Jan,

On 11/01/2024 11:53, Jan Beulich wrote:
> On 11.01.2024 11:47, Elias El Yandouzi wrote:
>> On 22/12/2022 13:24, Jan Beulich wrote:
>>> That said, I think this change comes too early in the series, or there is
>>> something missing.
>>
>> At first, I had the same feeling but looking at the rest of the series,
>> I can see that the option is needed in follow-up patches.
>>
>>> As said in reply to patch 10, while there the mapcache
>>> is being initialized for the idle domain, I don't think it can be used
>>> just yet. Read through mapcache_current_vcpu() to understand why I think
>>> that way, paying particular attention to the ASSERT() near the end.
>>
>> Would you be able to elaborate a bit more on why you think that? I haven't
>> been able to get your point.
> 
> Why exactly I referred to the ASSERT() there I can't reconstruct. The
> function as a whole looks problematic though when suddenly the idle
> domain also gains a mapcache. I'm sorry, too much context was lost
> from over a year ago; all of this will need looking at from scratch
> again whenever a new version was posted.
> 
>>> In preparation of this patch here I think the mfn_to_virt() uses have to all
>>> disappear from map_domain_page(). Perhaps yet more strongly all
>>> ..._to_virt() (except fix_to_virt() and friends) and __va() have to
>>> disappear up front from x86 and any code path which can be taken on x86
>>> (which may simply mean purging all respective x86 #define-s, without
>>> breaking the build in any way).
>>
>> I agree with you on that one. I think it is what we're aiming for in the
>> long term. However, as mentioned by Julien in the cover letter, the
>> series's name is a misnomer and I am afraid we won't be able to remove
>> all of them with this series. These helpers would still be used for
>> xenheap pages or when the direct map is enabled.
> 
> Leaving a hazard of certain uses not having been converted, or even
> overlooked in patches going in at around the same time as this series?
> I view this as pretty "adventurous".

Until we get rid of the directmap completely (which is not the goal of 
this series), we will need to keep mfn_to_virt().

In fact, the one you asked to remove in map_domain_page() will need to be
replaced with a function doing the same thing. The same goes for the code
that will initially prepare the directmap. This is to avoid impacting
performance when the user still wants to use the directmap.

So are you just asking to remove most of the uses and rename *_to_virt()
to something more directmap-specific (e.g. mfn_to_directmap_virt())?

Cheers,
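The rename Julien floats might look roughly like this. A stand-alone sketch: the helper name comes from his mail, addresses are plain unsigned long for simplicity, and the base address mirrors the x86 layout but is illustrative here:

```c
#include <assert.h>

#define PAGE_SHIFT           12
#define DIRECTMAP_VIRT_START 0xffff830000000000UL

/*
 * Same arithmetic as mfn_to_virt(), but under a directmap-specific name,
 * so the remaining legitimate users are self-documenting and any new
 * generic mfn_to_virt() use stands out in review.
 *
 * Only valid while the directmap exists; callers must check first.
 */
static inline unsigned long mfn_to_directmap_virt(unsigned long mfn)
{
    return DIRECTMAP_VIRT_START + (mfn << PAGE_SHIFT);
}
```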
Jan Beulich Jan. 11, 2024, 2:09 p.m. UTC | #7
On 11.01.2024 13:25, Julien Grall wrote:
> Hi Jan,
> 
> On 11/01/2024 11:53, Jan Beulich wrote:
>> On 11.01.2024 11:47, Elias El Yandouzi wrote:
>>> On 22/12/2022 13:24, Jan Beulich wrote:
>>>> That said, I think this change comes too early in the series, or there is
>>>> something missing.
>>>
>>> At first, I had the same feeling but looking at the rest of the series,
>>> I can see that the option is needed in follow-up patches.
>>>
>>>> As said in reply to patch 10, while there the mapcache
>>>> is being initialized for the idle domain, I don't think it can be used
>>>> just yet. Read through mapcache_current_vcpu() to understand why I think
>>>> that way, paying particular attention to the ASSERT() near the end.
>>>
>>> Would you be able to elaborate a bit more on why you think that? I haven't
>>> been able to get your point.
>>
>> Why exactly I referred to the ASSERT() there I can't reconstruct. The
>> function as a whole looks problematic though when suddenly the idle
>> domain also gains a mapcache. I'm sorry, too much context was lost
>> from over a year ago; all of this will need looking at from scratch
>> again whenever a new version was posted.
>>
>>>> In preparation of this patch here I think the mfn_to_virt() uses have to all
>>>> disappear from map_domain_page(). Perhaps yet more strongly all
>>>> ..._to_virt() (except fix_to_virt() and friends) and __va() have to
>>>> disappear up front from x86 and any code path which can be taken on x86
>>>> (which may simply mean purging all respective x86 #define-s, without
>>>> breaking the build in any way).
>>>
>>> I agree with you on that one. I think it is what we're aiming for in the
>>> long term. However, as mentioned by Julien in the cover letter, the
>>> series's name is a misnomer and I am afraid we won't be able to remove
>>> all of them with this series. These helpers would still be used for
>>> xenheap pages or when the direct map is enabled.
>>
>> Leaving a hazard of certain uses not having been converted, or even
>> overlooked in patches going in at around the same time as this series?
>> I view this as pretty "adventurous".
> 
> Until we get rid of the directmap completely (which is not the goal of 
> this series), we will need to keep mfn_to_virt().
> 
>> In fact, the one you asked to remove in map_domain_page() will need to be
>> replaced with a function doing the same thing. The same goes for the code
>> that will initially prepare the directmap. This is to avoid impacting
>> performance when the user still wants to use the directmap.
> 
>> So are you just asking to remove most of the uses and rename *_to_virt()
>> to something more directmap-specific (e.g. mfn_to_directmap_virt())?

Well, in a way. If done this way, mfn_to_virt() (and __va()) should have no
users by the end of the series, and it would be obvious that nothing was
missed (and by then purging the old ones we could also ensure no new uses
would appear).

Jan
Elias El Yandouzi Jan. 11, 2024, 6:25 p.m. UTC | #8
On 11/01/2024 14:09, Jan Beulich wrote:
> On 11.01.2024 13:25, Julien Grall wrote:
>> Hi Jan,
>>
>> On 11/01/2024 11:53, Jan Beulich wrote:
>>> On 11.01.2024 11:47, Elias El Yandouzi wrote:
>>>> On 22/12/2022 13:24, Jan Beulich wrote:
>>>>> That said, I think this change comes too early in the series, or there is
>>>>> something missing.
>>>>
>>>> At first, I had the same feeling but looking at the rest of the series,
>>>> I can see that the option is needed in follow-up patches.
>>>>
>>>>> As said in reply to patch 10, while there the mapcache
>>>>> is being initialized for the idle domain, I don't think it can be used
>>>>> just yet. Read through mapcache_current_vcpu() to understand why I think
>>>>> that way, paying particular attention to the ASSERT() near the end.
>>>>
>>>> Would you be able to elaborate a bit more on why you think that? I haven't
>>>> been able to get your point.
>>>
>>> Why exactly I referred to the ASSERT() there I can't reconstruct. The
>>> function as a whole looks problematic though when suddenly the idle
>>> domain also gains a mapcache. I'm sorry, too much context was lost
>>> from over a year ago; all of this will need looking at from scratch
>>> again whenever a new version was posted.
>>>
>>>>> In preparation of this patch here I think the mfn_to_virt() uses have to all
>>>>> disappear from map_domain_page(). Perhaps yet more strongly all
>>>>> ..._to_virt() (except fix_to_virt() and friends) and __va() have to
>>>>> disappear up front from x86 and any code path which can be taken on x86
>>>>> (which may simply mean purging all respective x86 #define-s, without
>>>>> breaking the build in any way).
>>>>
>>>> I agree with you on that one. I think it is what we're aiming for in the
>>>> long term. However, as mentioned by Julien in the cover letter, the
>>>> series's name is a misnomer and I am afraid we won't be able to remove
>>>> all of them with this series. These helpers would still be used for
>>>> xenheap pages or when the direct map is enabled.
>>>
>>> Leaving a hazard of certain uses not having been converted, or even
>>> overlooked in patches going in at around the same time as this series?
>>> I view this as pretty "adventurous".
>>
>> Until we get rid of the directmap completely (which is not the goal of
>> this series), we will need to keep mfn_to_virt().
>>
>> In fact, the one you asked to remove in map_domain_page() will need to be
>> replaced with a function doing the same thing. The same goes for the code
>> that will initially prepare the directmap. This is to avoid impacting
>> performance when the user still wants to use the directmap.
>>
>> So are you just asking to remove most of the uses and rename *_to_virt()
>> to something more directmap-specific (e.g. mfn_to_directmap_virt())?
> 
> Well, in a way. If done this way, mfn_to_virt() (and __va()) should have no
> users by the end of the series, and it would be obvious that nothing was
> missed (and by then purging the old ones we could also ensure no new uses
> would appear).

What about maddr_to_virt()? For instance, in the function 
xen/arch/x86/dmi_scan.c:dmi_iterate(), we need to access a very low 
machine address which isn't in the directmap range.

How would you proceed? Calling vmap() seems to be a bit overkill for 
just a temporary mapping and I don't really want to rework this function 
to use map_domain_page().

In such a case, how would you proceed? What do you suggest?

Cheers,
Jan Beulich Jan. 12, 2024, 7:47 a.m. UTC | #9
On 11.01.2024 19:25, Elias El Yandouzi wrote:
> 
> 
> On 11/01/2024 14:09, Jan Beulich wrote:
>> On 11.01.2024 13:25, Julien Grall wrote:
>>> Hi Jan,
>>>
>>> On 11/01/2024 11:53, Jan Beulich wrote:
>>>> On 11.01.2024 11:47, Elias El Yandouzi wrote:
>>>>> On 22/12/2022 13:24, Jan Beulich wrote:
>>>>>> That said, I think this change comes too early in the series, or there is
>>>>>> something missing.
>>>>>
>>>>> At first, I had the same feeling but looking at the rest of the series,
>>>>> I can see that the option is needed in follow-up patches.
>>>>>
>>>>>> As said in reply to patch 10, while there the mapcache
>>>>>> is being initialized for the idle domain, I don't think it can be used
>>>>>> just yet. Read through mapcache_current_vcpu() to understand why I think
>>>>>> that way, paying particular attention to the ASSERT() near the end.
>>>>>
>>>>> Would you be able to elaborate a bit more on why you think that? I haven't
>>>>> been able to get your point.
>>>>
>>>> Why exactly I referred to the ASSERT() there I can't reconstruct. The
>>>> function as a whole looks problematic though when suddenly the idle
>>>> domain also gains a mapcache. I'm sorry, too much context was lost
>>>> from over a year ago; all of this will need looking at from scratch
>>>> again whenever a new version was posted.
>>>>
>>>>>> In preparation of this patch here I think the mfn_to_virt() uses have to all
>>>>>> disappear from map_domain_page(). Perhaps yet more strongly all
>>>>>> ..._to_virt() (except fix_to_virt() and friends) and __va() have to
>>>>>> disappear up front from x86 and any code path which can be taken on x86
>>>>>> (which may simply mean purging all respective x86 #define-s, without
>>>>>> breaking the build in any way).
>>>>>
>>>>> I agree with you on that one. I think it is what we're aiming for in the
>>>>> long term. However, as mentioned by Julien in the cover letter, the
>>>>> series's name is a misnomer and I am afraid we won't be able to remove
>>>>> all of them with this series. These helpers would still be used for
>>>>> xenheap pages or when the direct map is enabled.
>>>>
>>>> Leaving a hazard of certain uses not having been converted, or even
>>>> overlooked in patches going in at around the same time as this series?
>>>> I view this as pretty "adventurous".
>>>
>>> Until we get rid of the directmap completely (which is not the goal of
>>> this series), we will need to keep mfn_to_virt().
>>>
>>> In fact, the one you asked to remove in map_domain_page() will need to be
>>> replaced with a function doing the same thing. The same goes for the code
>>> that will initially prepare the directmap. This is to avoid impacting
>>> performance when the user still wants to use the directmap.
>>>
>>> So are you just asking to remove most of the uses and rename *_to_virt()
>>> to something more directmap-specific (e.g. mfn_to_directmap_virt())?
>>
>> Well, in a way. If done this way, mfn_to_virt() (and __va()) should have no
>> users by the end of the series, and it would be obvious that nothing was
>> missed (and by then purging the old ones we could also ensure no new uses
>> would appear).
> 
> What about maddr_to_virt()? For instance, in the function 
> xen/arch/x86/dmi_scan.c:dmi_iterate(), we need to access a very low 
> machine address which isn't in the directmap range.

I'm afraid I don't follow: Very low addresses are always in the
direct map range, which - on x86 - always starts at 0.

> How would you proceed? Calling vmap() seems to be a bit overkill for 
> just a temporary mapping and I don't really want to rework this function 
> to use map_domain_page().
> 
> In such a case, how would you proceed? What do you suggest?

fixmap may be an option to consider, but I also don't see why you
apparently think using vmap() would be a possibility while at the
same time making use of map_domain_page() is too much effort.

Jan
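The fixmap option Jan mentions can be illustrated with a stand-alone model. This is not the real Xen fixmap API; the names, the base address, and the slot table are all made up. The point is the shape of the mechanism: a handful of compile-time-reserved virtual slots whose addresses are constants, where only the backing mfn changes:

```c
#include <assert.h>

#define PAGE_SHIFT        12
#define FIXMAP_VIRT_START 0xffffffffff000000UL /* made-up base address */
#define NR_FIXMAP_SLOTS   4

/* Stands in for the fixmap's page-table entries. */
static unsigned long fixmap_mfn[NR_FIXMAP_SLOTS];

/* Point slot idx at mfn; real code would install a page-table entry. */
static void model_set_fixmap(unsigned int idx, unsigned long mfn)
{
    fixmap_mfn[idx] = mfn;
}

/* The virtual address of a slot is a compile-time constant per index. */
static unsigned long model_fix_to_virt(unsigned int idx)
{
    return FIXMAP_VIRT_START + ((unsigned long)idx << PAGE_SHIFT);
}
```

Because slot addresses never move, such a mapping works before any allocator or mapcache is up, which is what makes it a candidate for early-boot users like dmi_iterate().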
Elias El Yandouzi Jan. 15, 2024, 2:50 p.m. UTC | #10
Hi,

On 12/01/2024 07:47, Jan Beulich wrote:
> On 11.01.2024 19:25, Elias El Yandouzi wrote:
>> On 11/01/2024 14:09, Jan Beulich wrote:
>>
>> What about maddr_to_virt()? For instance, in the function
>> xen/arch/x86/dmi_scan.c:dmi_iterate(), we need to access a very low
>> machine address which isn't in the directmap range.
> 
> I'm afraid I don't follow: Very low addresses are always in the
> direct map range, which - on x86 - always starts at 0.
> 

I reckon it was poorly phrased. IIUC, we'd like to remove every use of
*_to_virt() in the case where the directmap option is disabled.
So I meant that in this situation, the helper arch_mfns_in_directmap()
would return false.

>> How would you proceed? Calling vmap() seems to be a bit overkill for
>> just a temporary mapping and I don't really want to rework this function
>> to use map_domain_page().
>>
>> In such a case, how would you proceed? What do you suggest?
> 
> fixmap may be an option to consider, but I also don't see why you
> apparently think using vmap() would be a possibility while at the
> same time making use of map_domain_page() is too much effort.

I thought about using vmap() as it allows mapping a contiguous region
easily. It is also used in the follow-up patch 17/22, so I thought it
could be viable.

I was reluctant to use map_domain_page() for two reasons: 1) it only
allows mapping one page at a time, so I'd need to rework the function
dmi_iterate() more deeply; 2) because the mapcache wouldn't be ready to
use at that time, the mapping would end up in PMAP, which is meant to map
page tables and nothing else.
Jan Beulich Jan. 16, 2024, 8:30 a.m. UTC | #11
On 15.01.2024 15:50, Elias El Yandouzi wrote:
> On 12/01/2024 07:47, Jan Beulich wrote:
>> On 11.01.2024 19:25, Elias El Yandouzi wrote:
>>> How would you proceed? Calling vmap() seems to be a bit overkill for
>>> just a temporary mapping and I don't really want to rework this function
>>> to use map_domain_page().
>>>
>>> In such a case, how would you proceed? What do you suggest?
>>
>> fixmap may be an option to consider, but I also don't see why you
>> apparently think using vmap() would be a possibility while at the
>> same time making use of map_domain_page() is too much effort.
> 
> I thought about using vmap() as it allows mapping a contiguous region
> easily. It is also used in the follow-up patch 17/22, so I thought it
> could be viable.
> 
> I was reluctant to use map_domain_page() for two reasons: 1) it only
> allows mapping one page at a time, so I'd need to rework the function
> dmi_iterate() more deeply; 2) because the mapcache wouldn't be ready to
> use at that time, the mapping would end up in PMAP, which is meant to map
> page tables and nothing else.

Oh, right, this makes sense of course.

Jan

Patch

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index b7ee97be762e..a63e4612acac 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -760,6 +760,18 @@  Specify the size of the console debug trace buffer. By specifying `cpu:`
 additionally a trace buffer of the specified size is allocated per cpu.
 The debug trace feature is only enabled in debugging builds of Xen.
 
+### directmap (x86)
+> `= <boolean>`
+
+> Default: `true`
+
+Enable or disable the direct map region in Xen.
+
+By default, Xen creates the direct map region which maps physical memory
+in that region. Setting this to no will remove the direct map, blocking
+exploits that leak secrets via speculative memory access in the direct
+map.
+
 ### dma_bits
 > `= <integer>`
 
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 68adcac9fa8d..2366928d71aa 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -406,6 +406,11 @@  static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t gfn)
     } while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x );
 }
 
+static inline bool arch_has_directmap(void)
+{
+    return true;
+}
+
 #endif /*  __ARCH_ARM_MM__ */
 /*
  * Local variables:
diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index db29e3e2059f..cf8b20817c6c 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -464,6 +464,8 @@  static inline int get_page_and_type(struct page_info *page,
     ASSERT(((_p)->count_info & PGC_count_mask) != 0);          \
     ASSERT(page_get_owner(_p) == (_d))
 
+extern bool opt_directmap;
+
 /******************************************************************************
  * With shadow pagetables, the different kinds of address start
  * to get get confusing.
@@ -620,13 +622,26 @@  extern const char zero_page[];
 /* Build a 32bit PSE page table using 4MB pages. */
 void write_32bit_pse_identmap(uint32_t *l2);
 
+static inline bool arch_has_directmap(void)
+{
+    return opt_directmap;
+}
+
 /*
  * x86 maps part of physical memory via the directmap region.
  * Return whether the range of MFN falls in the directmap region.
+ *
+ * When boot command line sets directmap=no, we will not have a direct map at
+ * all so this will always return false.
  */
 static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
 {
-    unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
+    unsigned long eva;
+
+    if ( !arch_has_directmap() )
+        return false;
+
+    eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
 
     return (mfn + nr) <= (virt_to_mfn(eva - 1) + 1);
 }
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 041bd4cfde17..e76e135b96fc 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -157,6 +157,9 @@  l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
 l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     l1_fixmap_x[L1_PAGETABLE_ENTRIES];
 
+bool __ro_after_init opt_directmap = true;
+boolean_param("directmap", opt_directmap);
+
 /* Frame table size in pages. */
 unsigned long max_page;
 unsigned long total_pages;
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 1c2e09711eb0..2cb051c6e4e7 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1423,6 +1423,8 @@  void __init noreturn __start_xen(unsigned long mbi_p)
     if ( highmem_start )
         xenheap_max_mfn(PFN_DOWN(highmem_start - 1));
 
+    printk("Booting with directmap %s\n", arch_has_directmap() ? "on" : "off");
+
     /*
      * Walk every RAM region and map it in its entirety (on x86/64, at least)
      * and notify it to the boot allocator.
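Taken together, the user-visible semantics the patch introduces can be modelled in stand-alone C. The parser below is a crude stand-in for Xen's boolean_param()/parse_bool() machinery, and the end-of-directmap constant replaces virt_to_mfn(eva - 1) + 1; neither is Xen code:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool opt_directmap = true;  /* default, as in the patch */
#define DIRECTMAP_END_MFN 0x1000UL /* made-up directmap limit */

/* Crude stand-in for boolean_param("directmap", opt_directmap). */
static void parse_directmap(const char *s)
{
    if ( !strcmp(s, "no") || !strcmp(s, "false") || !strcmp(s, "0") )
        opt_directmap = false;
    else
        opt_directmap = true;
}

static bool arch_has_directmap(void)
{
    return opt_directmap;
}

/* With directmap=no there is no direct map at all, so always false. */
static bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
{
    if ( !arch_has_directmap() )
        return false;
    return (mfn + nr) <= DIRECTMAP_END_MFN;
}
```

Booting with `directmap=no` thus flips every arch_mfns_in_directmap() query to false, forcing all callers onto explicit mapping paths.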