diff mbox series

[v3,18/18] xen/arm64: mm: Rework switch_ttbr()

Message ID 20221212095523.52683-19-julien@xen.org (mailing list archive)
State Superseded
Headers show
Series xen/arm: Don't switch TTBR while the MMU is on | expand

Commit Message

Julien Grall Dec. 12, 2022, 9:55 a.m. UTC
From: Julien Grall <jgrall@amazon.com>

At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
still on.

Switching TTBR is like replacing existing mappings with new ones. So
we need to follow the break-before-make sequence.

In this case, it means the MMU needs to be switched off while the
TTBR is updated. In order to disable the MMU, we need to first
jump to an identity mapping.

Rename switch_ttbr() to switch_ttbr_id() and create an helper on
top to temporary map the identity mapping and call switch_ttbr()
via the identity address.

switch_ttbr_id() is now reworked to temporarily turn off the MMU
before updating the TTBR.

We also need to make sure the helper switch_ttbr() is part of the
identity mapping. So move _end_boot past it.

The arm32 code will use a different approach. So this issue is for now
only resolved on arm64.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

    Changes in v2:
        - Remove the arm32 changes. This will be addressed differently
        - Re-instate the instruct cache flush. This is not strictly
          necessary but kept it for safety.
        - Use "dsb ish"  rather than "dsb sy".

    TODO:
        * Handle the case where the runtime Xen is loaded at a different
          position for cache coloring. This will be dealt separately.
---
 xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
 xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
 xen/arch/arm/include/asm/mm.h |  2 ++
 xen/arch/arm/mm.c             | 14 +++++-----
 4 files changed, 82 insertions(+), 23 deletions(-)

Comments

Stefano Stabellini Dec. 13, 2022, 2 a.m. UTC | #1
On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
> still on.
> 
> Switching TTBR is like replacing existing mappings with new ones. So
> we need to follow the break-before-make sequence.
> 
> In this case, it means the MMU needs to be switched off while the
> TTBR is updated. In order to disable the MMU, we need to first
> jump to an identity mapping.
> 
> Rename switch_ttbr() to switch_ttbr_id() and create an helper on
> top to temporary map the identity mapping and call switch_ttbr()
> via the identity address.
> 
> switch_ttbr_id() is now reworked to temporarily turn off the MMU
> before updating the TTBR.
> 
> We also need to make sure the helper switch_ttbr() is part of the
> identity mapping. So move _end_boot past it.
> 
> The arm32 code will use a different approach. So this issue is for now
> only resolved on arm64.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

This patch looks overall good to me, aside from the few minor comments
below. I would love for someone else, maybe from ARM, reviewing steps
1-6 making sure they are the right sequence.


> ---
> 
>     Changes in v2:
>         - Remove the arm32 changes. This will be addressed differently
>         - Re-instate the instruct cache flush. This is not strictly
>           necessary but kept it for safety.
>         - Use "dsb ish"  rather than "dsb sy".
> 
>     TODO:
>         * Handle the case where the runtime Xen is loaded at a different
>           position for cache coloring. This will be dealt separately.
> ---
>  xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
>  xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
>  xen/arch/arm/include/asm/mm.h |  2 ++
>  xen/arch/arm/mm.c             | 14 +++++-----
>  4 files changed, 82 insertions(+), 23 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 663f5813b12e..1f69864492b6 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -816,30 +816,46 @@ ENDPROC(fail)
>   * Switch TTBR
>   *
>   * x0    ttbr
> - *
> - * TODO: This code does not comply with break-before-make.
>   */
> -ENTRY(switch_ttbr)
> -        dsb   sy                     /* Ensure the flushes happen before
> -                                      * continuing */
> -        isb                          /* Ensure synchronization with previous
> -                                      * changes to text */
> -        tlbi   alle2                 /* Flush hypervisor TLB */
> -        ic     iallu                 /* Flush I-cache */
> -        dsb    sy                    /* Ensure completion of TLB flush */
> +ENTRY(switch_ttbr_id)
> +        /* 1) Ensure any previous read/write have completed */
> +        dsb    ish
> +        isb
> +
> +        /* 2) Turn off MMU */
> +        mrs    x1, SCTLR_EL2
> +        bic    x1, x1, #SCTLR_Axx_ELx_M

do we need a "dsb   sy" here? we have in enable_mmu


> +        msr    SCTLR_EL2, x1
> +        isb
> +
> +        /*
> +         * 3) Flush the TLBs.
> +         * See asm/arm64/flushtlb.h for the explanation of the sequence.
> +         */
> +        dsb   nshst
> +        tlbi  alle2
> +        dsb   nsh
> +        isb
> +
> +        /* 4) Update the TTBR */
> +        msr   TTBR0_EL2, x0
>          isb
>  
> -        msr    TTBR0_EL2, x0
> +        /*
> +         * 5) Flush I-cache
> +         * This should not be necessary but it is kept for safety.
> +         */
> +        ic     iallu
> +        isb
>  
> -        isb                          /* Ensure synchronization with previous
> -                                      * changes to text */
> -        tlbi   alle2                 /* Flush hypervisor TLB */
> -        ic     iallu                 /* Flush I-cache */
> -        dsb    sy                    /* Ensure completion of TLB flush */
> +        /* 5) Turn on the MMU */

This should be 6)


> +        mrs   x1, SCTLR_EL2
> +        orr   x1, x1, #SCTLR_Axx_ELx_M  /* Enable MMU */
> +        msr   SCTLR_EL2, x1
>          isb
>  
>          ret
> -ENDPROC(switch_ttbr)
> +ENDPROC(switch_ttbr_id)
>  
>  #ifdef CONFIG_EARLY_PRINTK
>  /*
> diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
> index 9eaf545ea9dd..2ede4e75ae33 100644
> --- a/xen/arch/arm/arm64/mm.c
> +++ b/xen/arch/arm/arm64/mm.c
> @@ -31,6 +31,15 @@ static void __init prepare_boot_identity_mapping(void)
>      lpae_t pte;
>      DECLARE_OFFSETS(id_offsets, id_addr);
>  
> +    /*
> +     * We will be re-using the boot ID tables. They may not have been
> +     * zeroed but they should be unlinked. So it is fine to use
> +     * clear_page().
> +     */
> +    clear_page(boot_first_id);
> +    clear_page(boot_second_id);
> +    clear_page(boot_third_id);

Could this code be in patch #15?


>      if ( id_offsets[0] != 0 )
>          panic("Cannot handled ID mapping above 512GB\n");
>  
> @@ -111,6 +120,36 @@ void update_identity_mapping(bool enable)
>      BUG_ON(rc);
>  }
>  
> +extern void switch_ttbr_id(uint64_t ttbr);
> +
> +typedef void (switch_ttbr_fn)(uint64_t ttbr);
> +
> +void __init switch_ttbr(uint64_t ttbr)
> +{
> +    vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
> +    switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
> +    lpae_t pte;
> +
> +    /* Enable the identity mapping in the boot page tables */

See below...

> +    update_identity_mapping(true);
> +    /* Enable the identity mapping in the runtime page tables */
> +    pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
> +    pte.pt.table = 1;
> +    pte.pt.xn = 0;
> +    pte.pt.ro = 1;
> +    write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
> +
> +    /* Switch TTBR */
> +    fn(ttbr);
> +
> +    /*
> +     * Disable the identity mapping in the runtime page tables.
> +     * Note it is not necessary to disable it in the boot page tables
> +     * because they are not going to be used by this CPU anymore.
> +     */

...is update_identity_mapping acting on the boot pagetables or the
runtime pagetables? The two comments make me think that
update_identity_mapping is enabling mapping in the boot pagetables and
removing them from the runtime pagetable, which would be strangely
inconsistent.

> +    update_identity_mapping(false);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
> index 68adcac9fa8d..bff6923f3ea9 100644
> --- a/xen/arch/arm/include/asm/mm.h
> +++ b/xen/arch/arm/include/asm/mm.h
> @@ -196,6 +196,8 @@ extern unsigned long total_pages;
>  extern void setup_pagetables(unsigned long boot_phys_offset);
>  /* Map FDT in boot pagetable */
>  extern void *early_fdt_map(paddr_t fdt_paddr);
> +/* Switch to a new root page-tables */
> +extern void switch_ttbr(uint64_t ttbr);
>  /* Remove early mappings */
>  extern void remove_early_mappings(void);
>  /* Allocate and initialise pagetables for a secondary CPU. Sets init_ttbr to the
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 39e0d9e03c9c..cf23ae02d1b7 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -476,8 +476,6 @@ static void xen_pt_enforce_wnx(void)
>      flush_xen_tlb_local();
>  }
>  
> -extern void switch_ttbr(uint64_t ttbr);
> -
>  /* Clear a translation table and clean & invalidate the cache */
>  static void clear_table(void *table)
>  {
> @@ -550,13 +548,17 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
>      ttbr = (uintptr_t) cpu0_pgtable + phys_offset;
>  #endif
>  
> -    switch_ttbr(ttbr);
> -
> -    xen_pt_enforce_wnx();
> -
> +    /*
> +     * This needs to be setup first so switch_ttbr() can enable the
> +     * identity mapping.
> +     */
>  #ifdef CONFIG_ARM_32
>      per_cpu(xen_pgtable, 0) = cpu0_pgtable;
>  #endif
> +
> +    switch_ttbr(ttbr);
> +
> +    xen_pt_enforce_wnx();
>  }
>  
>  static void clear_boot_pagetables(void)
> -- 
> 2.38.1
>
Julien Grall Dec. 13, 2022, 7:08 p.m. UTC | #2
Hi Stefano,

On 13/12/2022 02:00, Stefano Stabellini wrote:
> On Mon, 12 Dec 2022, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
>> still on.
>>
>> Switching TTBR is like replacing existing mappings with new ones. So
>> we need to follow the break-before-make sequence.
>>
>> In this case, it means the MMU needs to be switched off while the
>> TTBR is updated. In order to disable the MMU, we need to first
>> jump to an identity mapping.
>>
>> Rename switch_ttbr() to switch_ttbr_id() and create an helper on
>> top to temporary map the identity mapping and call switch_ttbr()
>> via the identity address.
>>
>> switch_ttbr_id() is now reworked to temporarily turn off the MMU
>> before updating the TTBR.
>>
>> We also need to make sure the helper switch_ttbr() is part of the
>> identity mapping. So move _end_boot past it.
>>
>> The arm32 code will use a different approach. So this issue is for now
>> only resolved on arm64.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> This patch looks overall good to me, aside from the few minor comments
> below. I would love for someone else, maybe from ARM, reviewing steps
> 1-6 making sure they are the right sequence.
> 
> 
>> ---
>>
>>      Changes in v2:
>>          - Remove the arm32 changes. This will be addressed differently
>>          - Re-instate the instruct cache flush. This is not strictly
>>            necessary but kept it for safety.
>>          - Use "dsb ish"  rather than "dsb sy".
>>
>>      TODO:
>>          * Handle the case where the runtime Xen is loaded at a different
>>            position for cache coloring. This will be dealt separately.
>> ---
>>   xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
>>   xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
>>   xen/arch/arm/include/asm/mm.h |  2 ++
>>   xen/arch/arm/mm.c             | 14 +++++-----
>>   4 files changed, 82 insertions(+), 23 deletions(-)
>>
>> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
>> index 663f5813b12e..1f69864492b6 100644
>> --- a/xen/arch/arm/arm64/head.S
>> +++ b/xen/arch/arm/arm64/head.S
>> @@ -816,30 +816,46 @@ ENDPROC(fail)
>>    * Switch TTBR
>>    *
>>    * x0    ttbr
>> - *
>> - * TODO: This code does not comply with break-before-make.
>>    */
>> -ENTRY(switch_ttbr)
>> -        dsb   sy                     /* Ensure the flushes happen before
>> -                                      * continuing */
>> -        isb                          /* Ensure synchronization with previous
>> -                                      * changes to text */
>> -        tlbi   alle2                 /* Flush hypervisor TLB */
>> -        ic     iallu                 /* Flush I-cache */
>> -        dsb    sy                    /* Ensure completion of TLB flush */
>> +ENTRY(switch_ttbr_id)
>> +        /* 1) Ensure any previous read/write have completed */
>> +        dsb    ish
>> +        isb
>> +
>> +        /* 2) Turn off MMU */
>> +        mrs    x1, SCTLR_EL2
>> +        bic    x1, x1, #SCTLR_Axx_ELx_M
> 
> do we need a "dsb   sy" here? we have in enable_mmu

Hmmm... The explanation of the dsb + isb in enable_mmu makes no sense. 
The isb doesn't flush the I-cache, it just flushes the pipeline.

For the dsb, I am not convinced it is necessary because we already have 
the 'dsb nsh' above and in any case the barrier seems to be too strong.

I guess that will be another patch... (probably at a lower priority).

Now back to your question of the 'dsb' here. There is already a 'dsb 
ish' above. So memory access before turning off the MMU should be 
completed. Also...

> 
> 
>> +        msr    SCTLR_EL2, x1
>> +        isb

... this isb will ensure the completion of SCTLR before the TLBs are 
flushed before. And there should be no memory access (or than 
instructions here). So I don't think the a dsb is needed.

Would you mind to explain why you think there is one needed?

>> +
>> +        /*
>> +         * 3) Flush the TLBs.
>> +         * See asm/arm64/flushtlb.h for the explanation of the sequence.
>> +         */
>> +        dsb   nshst
>> +        tlbi  alle2
>> +        dsb   nsh
>> +        isb
>> +
>> +        /* 4) Update the TTBR */
>> +        msr   TTBR0_EL2, x0
>>           isb
>>   
>> -        msr    TTBR0_EL2, x0
>> +        /*
>> +         * 5) Flush I-cache
>> +         * This should not be necessary but it is kept for safety.
>> +         */
>> +        ic     iallu
>> +        isb
>>   
>> -        isb                          /* Ensure synchronization with previous
>> -                                      * changes to text */
>> -        tlbi   alle2                 /* Flush hypervisor TLB */
>> -        ic     iallu                 /* Flush I-cache */
>> -        dsb    sy                    /* Ensure completion of TLB flush */
>> +        /* 5) Turn on the MMU */
> 
> This should be 6)

I will update it.

> 
> 
>> +        mrs   x1, SCTLR_EL2
>> +        orr   x1, x1, #SCTLR_Axx_ELx_M  /* Enable MMU */
>> +        msr   SCTLR_EL2, x1
>>           isb
>>   
>>           ret
>> -ENDPROC(switch_ttbr)
>> +ENDPROC(switch_ttbr_id)
>>   
>>   #ifdef CONFIG_EARLY_PRINTK
>>   /*
>> diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
>> index 9eaf545ea9dd..2ede4e75ae33 100644
>> --- a/xen/arch/arm/arm64/mm.c
>> +++ b/xen/arch/arm/arm64/mm.c
>> @@ -31,6 +31,15 @@ static void __init prepare_boot_identity_mapping(void)
>>       lpae_t pte;
>>       DECLARE_OFFSETS(id_offsets, id_addr);
>>   
>> +    /*
>> +     * We will be re-using the boot ID tables. They may not have been
>> +     * zeroed but they should be unlinked. So it is fine to use
>> +     * clear_page().
>> +     */
>> +    clear_page(boot_first_id);
>> +    clear_page(boot_second_id);
>> +    clear_page(boot_third_id);
> 
> Could this code be in patch #15?

Yes, I can't remember why I decided to clear them in patch #18.

>>       if ( id_offsets[0] != 0 )
>>           panic("Cannot handled ID mapping above 512GB\n");
>>   
>> @@ -111,6 +120,36 @@ void update_identity_mapping(bool enable)
>>       BUG_ON(rc);
>>   }
>>   
>> +extern void switch_ttbr_id(uint64_t ttbr);
>> +
>> +typedef void (switch_ttbr_fn)(uint64_t ttbr);
>> +
>> +void __init switch_ttbr(uint64_t ttbr)
>> +{
>> +    vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
>> +    switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
>> +    lpae_t pte;
>> +
>> +    /* Enable the identity mapping in the boot page tables */
> 
> See below...
> 
>> +    update_identity_mapping(true);
>> +    /* Enable the identity mapping in the runtime page tables */
>> +    pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
>> +    pte.pt.table = 1;
>> +    pte.pt.xn = 0;
>> +    pte.pt.ro = 1;
>> +    write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
>> +
>> +    /* Switch TTBR */
>> +    fn(ttbr);
>> +
>> +    /*
>> +     * Disable the identity mapping in the runtime page tables.
>> +     * Note it is not necessary to disable it in the boot page tables
>> +     * because they are not going to be used by this CPU anymore.
>> +     */
> 
> ...is update_identity_mapping acting on the boot pagetables or the
> runtime pagetables? The two comments make me think that
> update_identity_mapping is enabling mapping in the boot pagetables and
> removing them from the runtime pagetable, which would be strangely
> inconsistent.

update_identity_mapping() is acting on the live page-tables (i.e. the 
one pointed by TTBR_EL2).

Before switch_ttbr(), this will be the boot page-tables. But after, this 
will be the runtime page-tables.

Would the following comment on top of the declaration of 
update_identity_mapping() clarifies it:

"Enable/disable the identity mapping in the live page-tables (i.e. the 
one pointed by TTBR_EL2)."

Cheers,
Stefano Stabellini Dec. 13, 2022, 10:56 p.m. UTC | #3
On Tue, 13 Dec 2022, Julien Grall wrote:
> Hi Stefano,
> 
> On 13/12/2022 02:00, Stefano Stabellini wrote:
> > On Mon, 12 Dec 2022, Julien Grall wrote:
> > > From: Julien Grall <jgrall@amazon.com>
> > > 
> > > At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
> > > still on.
> > > 
> > > Switching TTBR is like replacing existing mappings with new ones. So
> > > we need to follow the break-before-make sequence.
> > > 
> > > In this case, it means the MMU needs to be switched off while the
> > > TTBR is updated. In order to disable the MMU, we need to first
> > > jump to an identity mapping.
> > > 
> > > Rename switch_ttbr() to switch_ttbr_id() and create an helper on
> > > top to temporary map the identity mapping and call switch_ttbr()
> > > via the identity address.
> > > 
> > > switch_ttbr_id() is now reworked to temporarily turn off the MMU
> > > before updating the TTBR.
> > > 
> > > We also need to make sure the helper switch_ttbr() is part of the
> > > identity mapping. So move _end_boot past it.
> > > 
> > > The arm32 code will use a different approach. So this issue is for now
> > > only resolved on arm64.
> > > 
> > > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > 
> > This patch looks overall good to me, aside from the few minor comments
> > below. I would love for someone else, maybe from ARM, reviewing steps
> > 1-6 making sure they are the right sequence.
> > 
> > 
> > > ---
> > > 
> > >      Changes in v2:
> > >          - Remove the arm32 changes. This will be addressed differently
> > >          - Re-instate the instruct cache flush. This is not strictly
> > >            necessary but kept it for safety.
> > >          - Use "dsb ish"  rather than "dsb sy".
> > > 
> > >      TODO:
> > >          * Handle the case where the runtime Xen is loaded at a different
> > >            position for cache coloring. This will be dealt separately.
> > > ---
> > >   xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
> > >   xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
> > >   xen/arch/arm/include/asm/mm.h |  2 ++
> > >   xen/arch/arm/mm.c             | 14 +++++-----
> > >   4 files changed, 82 insertions(+), 23 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> > > index 663f5813b12e..1f69864492b6 100644
> > > --- a/xen/arch/arm/arm64/head.S
> > > +++ b/xen/arch/arm/arm64/head.S
> > > @@ -816,30 +816,46 @@ ENDPROC(fail)
> > >    * Switch TTBR
> > >    *
> > >    * x0    ttbr
> > > - *
> > > - * TODO: This code does not comply with break-before-make.
> > >    */
> > > -ENTRY(switch_ttbr)
> > > -        dsb   sy                     /* Ensure the flushes happen before
> > > -                                      * continuing */
> > > -        isb                          /* Ensure synchronization with
> > > previous
> > > -                                      * changes to text */
> > > -        tlbi   alle2                 /* Flush hypervisor TLB */
> > > -        ic     iallu                 /* Flush I-cache */
> > > -        dsb    sy                    /* Ensure completion of TLB flush */
> > > +ENTRY(switch_ttbr_id)
> > > +        /* 1) Ensure any previous read/write have completed */
> > > +        dsb    ish
> > > +        isb
> > > +
> > > +        /* 2) Turn off MMU */
> > > +        mrs    x1, SCTLR_EL2
> > > +        bic    x1, x1, #SCTLR_Axx_ELx_M
> > 
> > do we need a "dsb   sy" here? we have in enable_mmu
> 
> Hmmm... The explanation of the dsb + isb in enable_mmu makes no sense. The isb
> doesn't flush the I-cache, it just flushes the pipeline.
> 
> For the dsb, I am not convinced it is necessary because we already have the
> 'dsb nsh' above and in any case the barrier seems to be too strong.
> 
> I guess that will be another patch... (probably at a lower priority).
> 
> Now back to your question of the 'dsb' here. There is already a 'dsb ish'
> above. So memory access before turning off the MMU should be completed.
> Also...
> 
> > 
> > 
> > > +        msr    SCTLR_EL2, x1
> > > +        isb
> 
> ... this isb will ensure the completion of SCTLR before the TLBs are flushed
> before. And there should be no memory access (or than instructions here). So I
> don't think the a dsb is needed.
> 
> Would you mind to explain why you think there is one needed?

I am not at all sure whether it is needed or not, I was just noticing
that we have the "dsb sy" in enable_mmu and here we don't.

Thinking about it, the only reason for the additional dsb would be to
make sure that the two operations:

  mrs    x1, SCTLR_EL2
  bic    x1, x1, #SCTLR_Axx_ELx_M

are completed before disabling the MMU:

  msr    SCTLR_EL2, x1

Is that actually a requirement? I don't know.


> > > +
> > > +        /*
> > > +         * 3) Flush the TLBs.
> > > +         * See asm/arm64/flushtlb.h for the explanation of the sequence.
> > > +         */
> > > +        dsb   nshst
> > > +        tlbi  alle2
> > > +        dsb   nsh
> > > +        isb
> > > +
> > > +        /* 4) Update the TTBR */
> > > +        msr   TTBR0_EL2, x0
> > >           isb
> > >   -        msr    TTBR0_EL2, x0
> > > +        /*
> > > +         * 5) Flush I-cache
> > > +         * This should not be necessary but it is kept for safety.
> > > +         */
> > > +        ic     iallu
> > > +        isb
> > >   -        isb                          /* Ensure synchronization with
> > > previous
> > > -                                      * changes to text */
> > > -        tlbi   alle2                 /* Flush hypervisor TLB */
> > > -        ic     iallu                 /* Flush I-cache */
> > > -        dsb    sy                    /* Ensure completion of TLB flush */
> > > +        /* 5) Turn on the MMU */
> > 
> > This should be 6)
> 
> I will update it.
> 
> > 
> > 
> > > +        mrs   x1, SCTLR_EL2
> > > +        orr   x1, x1, #SCTLR_Axx_ELx_M  /* Enable MMU */
> > > +        msr   SCTLR_EL2, x1
> > >           isb
> > >             ret
> > > -ENDPROC(switch_ttbr)
> > > +ENDPROC(switch_ttbr_id)
> > >     #ifdef CONFIG_EARLY_PRINTK
> > >   /*
> > > diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
> > > index 9eaf545ea9dd..2ede4e75ae33 100644
> > > --- a/xen/arch/arm/arm64/mm.c
> > > +++ b/xen/arch/arm/arm64/mm.c
> > > @@ -31,6 +31,15 @@ static void __init prepare_boot_identity_mapping(void)
> > >       lpae_t pte;
> > >       DECLARE_OFFSETS(id_offsets, id_addr);
> > >   +    /*
> > > +     * We will be re-using the boot ID tables. They may not have been
> > > +     * zeroed but they should be unlinked. So it is fine to use
> > > +     * clear_page().
> > > +     */
> > > +    clear_page(boot_first_id);
> > > +    clear_page(boot_second_id);
> > > +    clear_page(boot_third_id);
> > 
> > Could this code be in patch #15?
> 
> Yes, I can't remember why I decided to clear them in patch #18.
> 
> > >       if ( id_offsets[0] != 0 )
> > >           panic("Cannot handled ID mapping above 512GB\n");
> > >   @@ -111,6 +120,36 @@ void update_identity_mapping(bool enable)
> > >       BUG_ON(rc);
> > >   }
> > >   +extern void switch_ttbr_id(uint64_t ttbr);
> > > +
> > > +typedef void (switch_ttbr_fn)(uint64_t ttbr);
> > > +
> > > +void __init switch_ttbr(uint64_t ttbr)
> > > +{
> > > +    vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
> > > +    switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
> > > +    lpae_t pte;
> > > +
> > > +    /* Enable the identity mapping in the boot page tables */
> > 
> > See below...
> > 
> > > +    update_identity_mapping(true);
> > > +    /* Enable the identity mapping in the runtime page tables */
> > > +    pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
> > > +    pte.pt.table = 1;
> > > +    pte.pt.xn = 0;
> > > +    pte.pt.ro = 1;
> > > +    write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
> > > +
> > > +    /* Switch TTBR */
> > > +    fn(ttbr);
> > > +
> > > +    /*
> > > +     * Disable the identity mapping in the runtime page tables.
> > > +     * Note it is not necessary to disable it in the boot page tables
> > > +     * because they are not going to be used by this CPU anymore.
> > > +     */
> > 
> > ...is update_identity_mapping acting on the boot pagetables or the
> > runtime pagetables? The two comments make me think that
> > update_identity_mapping is enabling mapping in the boot pagetables and
> > removing them from the runtime pagetable, which would be strangely
> > inconsistent.
> 
> update_identity_mapping() is acting on the live page-tables (i.e. the one
> pointed by TTBR_EL2).
> 
> Before switch_ttbr(), this will be the boot page-tables. But after, this will
> be the runtime page-tables.
> 
> Would the following comment on top of the declaration of
> update_identity_mapping() clarifies it:
> 
> "Enable/disable the identity mapping in the live page-tables (i.e. the one
> pointed by TTBR_EL2)."

Thank you!
Julien Grall Dec. 13, 2022, 11:01 p.m. UTC | #4
Hi Stefano,

On 13/12/2022 22:56, Stefano Stabellini wrote:
> On Tue, 13 Dec 2022, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 13/12/2022 02:00, Stefano Stabellini wrote:
>>> On Mon, 12 Dec 2022, Julien Grall wrote:
>>>> From: Julien Grall <jgrall@amazon.com>
>>>>
>>>> At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
>>>> still on.
>>>>
>>>> Switching TTBR is like replacing existing mappings with new ones. So
>>>> we need to follow the break-before-make sequence.
>>>>
>>>> In this case, it means the MMU needs to be switched off while the
>>>> TTBR is updated. In order to disable the MMU, we need to first
>>>> jump to an identity mapping.
>>>>
>>>> Rename switch_ttbr() to switch_ttbr_id() and create an helper on
>>>> top to temporary map the identity mapping and call switch_ttbr()
>>>> via the identity address.
>>>>
>>>> switch_ttbr_id() is now reworked to temporarily turn off the MMU
>>>> before updating the TTBR.
>>>>
>>>> We also need to make sure the helper switch_ttbr() is part of the
>>>> identity mapping. So move _end_boot past it.
>>>>
>>>> The arm32 code will use a different approach. So this issue is for now
>>>> only resolved on arm64.
>>>>
>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>
>>> This patch looks overall good to me, aside from the few minor comments
>>> below. I would love for someone else, maybe from ARM, reviewing steps
>>> 1-6 making sure they are the right sequence.
>>>
>>>
>>>> ---
>>>>
>>>>       Changes in v2:
>>>>           - Remove the arm32 changes. This will be addressed differently
>>>>           - Re-instate the instruct cache flush. This is not strictly
>>>>             necessary but kept it for safety.
>>>>           - Use "dsb ish"  rather than "dsb sy".
>>>>
>>>>       TODO:
>>>>           * Handle the case where the runtime Xen is loaded at a different
>>>>             position for cache coloring. This will be dealt separately.
>>>> ---
>>>>    xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
>>>>    xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
>>>>    xen/arch/arm/include/asm/mm.h |  2 ++
>>>>    xen/arch/arm/mm.c             | 14 +++++-----
>>>>    4 files changed, 82 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
>>>> index 663f5813b12e..1f69864492b6 100644
>>>> --- a/xen/arch/arm/arm64/head.S
>>>> +++ b/xen/arch/arm/arm64/head.S
>>>> @@ -816,30 +816,46 @@ ENDPROC(fail)
>>>>     * Switch TTBR
>>>>     *
>>>>     * x0    ttbr
>>>> - *
>>>> - * TODO: This code does not comply with break-before-make.
>>>>     */
>>>> -ENTRY(switch_ttbr)
>>>> -        dsb   sy                     /* Ensure the flushes happen before
>>>> -                                      * continuing */
>>>> -        isb                          /* Ensure synchronization with
>>>> previous
>>>> -                                      * changes to text */
>>>> -        tlbi   alle2                 /* Flush hypervisor TLB */
>>>> -        ic     iallu                 /* Flush I-cache */
>>>> -        dsb    sy                    /* Ensure completion of TLB flush */
>>>> +ENTRY(switch_ttbr_id)
>>>> +        /* 1) Ensure any previous read/write have completed */
>>>> +        dsb    ish
>>>> +        isb
>>>> +
>>>> +        /* 2) Turn off MMU */
>>>> +        mrs    x1, SCTLR_EL2
>>>> +        bic    x1, x1, #SCTLR_Axx_ELx_M
>>>
>>> do we need a "dsb   sy" here? we have in enable_mmu
>>
>> Hmmm... The explanation of the dsb + isb in enable_mmu makes no sense. The isb
>> doesn't flush the I-cache, it just flushes the pipeline.
>>
>> For the dsb, I am not convinced it is necessary because we already have the
>> 'dsb nsh' above and in any case the barrier seems to be too strong.
>>
>> I guess that will be another patch... (probably at a lower priority).
>>
>> Now back to your question of the 'dsb' here. There is already a 'dsb ish'
>> above. So memory access before turning off the MMU should be completed.
>> Also...
>>
>>>
>>>
>>>> +        msr    SCTLR_EL2, x1
>>>> +        isb
>>
>> ... this isb will ensure the completion of SCTLR before the TLBs are flushed
>> before. And there should be no memory access (or than instructions here). So I
>> don't think the a dsb is needed.
>>
>> Would you mind to explain why you think there is one needed?
> 
> I am not at all sure whether it is needed or not, I was just noticing
> that we have the "dsb sy" in enable_mmu and here we don't.
> 
> Thinking about it, the only reason for the additional dsb would be to
> make sure that the two operations:
> 
>    mrs    x1, SCTLR_EL2
>    bic    x1, x1, #SCTLR_Axx_ELx_M
> 
> are completed before disabling the MMU:

That's not what a 'dsb' is for. It is used for memory ordering there are 
are no memory access involved here.

If you want the operations to be completed, then this would be an 'isb'. 
Yet, this would not be necessary here as the next instruction cannot be 
re-ordered because of the register dependency.

Cheers,
diff mbox series

Patch

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 663f5813b12e..1f69864492b6 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -816,30 +816,46 @@  ENDPROC(fail)
  * Switch TTBR
  *
  * x0    ttbr
- *
- * TODO: This code does not comply with break-before-make.
  */
-ENTRY(switch_ttbr)
-        dsb   sy                     /* Ensure the flushes happen before
-                                      * continuing */
-        isb                          /* Ensure synchronization with previous
-                                      * changes to text */
-        tlbi   alle2                 /* Flush hypervisor TLB */
-        ic     iallu                 /* Flush I-cache */
-        dsb    sy                    /* Ensure completion of TLB flush */
+ENTRY(switch_ttbr_id)
+        /* 1) Ensure any previous read/write have completed */
+        dsb    ish
+        isb
+
+        /* 2) Turn off MMU */
+        mrs    x1, SCTLR_EL2
+        bic    x1, x1, #SCTLR_Axx_ELx_M
+        msr    SCTLR_EL2, x1
+        isb
+
+        /*
+         * 3) Flush the TLBs.
+         * See asm/arm64/flushtlb.h for the explanation of the sequence.
+         */
+        dsb   nshst
+        tlbi  alle2
+        dsb   nsh
+        isb
+
+        /* 4) Update the TTBR */
+        msr   TTBR0_EL2, x0
         isb
 
-        msr    TTBR0_EL2, x0
+        /*
+         * 5) Flush I-cache
+         * This should not be necessary but it is kept for safety.
+         */
+        ic     iallu
+        isb
 
-        isb                          /* Ensure synchronization with previous
-                                      * changes to text */
-        tlbi   alle2                 /* Flush hypervisor TLB */
-        ic     iallu                 /* Flush I-cache */
-        dsb    sy                    /* Ensure completion of TLB flush */
+        /* 5) Turn on the MMU */
+        mrs   x1, SCTLR_EL2
+        orr   x1, x1, #SCTLR_Axx_ELx_M  /* Enable MMU */
+        msr   SCTLR_EL2, x1
         isb
 
         ret
-ENDPROC(switch_ttbr)
+ENDPROC(switch_ttbr_id)
 
 #ifdef CONFIG_EARLY_PRINTK
 /*
diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
index 9eaf545ea9dd..2ede4e75ae33 100644
--- a/xen/arch/arm/arm64/mm.c
+++ b/xen/arch/arm/arm64/mm.c
@@ -31,6 +31,15 @@  static void __init prepare_boot_identity_mapping(void)
     lpae_t pte;
     DECLARE_OFFSETS(id_offsets, id_addr);
 
+    /*
+     * We will be re-using the boot ID tables. They may not have been
+     * zeroed but they should be unlinked. So it is fine to use
+     * clear_page().
+     */
+    clear_page(boot_first_id);
+    clear_page(boot_second_id);
+    clear_page(boot_third_id);
+
     if ( id_offsets[0] != 0 )
         panic("Cannot handled ID mapping above 512GB\n");
 
@@ -111,6 +120,36 @@  void update_identity_mapping(bool enable)
     BUG_ON(rc);
 }
 
+extern void switch_ttbr_id(uint64_t ttbr);
+
+typedef void (switch_ttbr_fn)(uint64_t ttbr);
+
+void __init switch_ttbr(uint64_t ttbr)
+{
+    vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
+    switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
+    lpae_t pte;
+
+    /* Enable the identity mapping in the boot page tables */
+    update_identity_mapping(true);
+    /* Enable the identity mapping in the runtime page tables */
+    pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
+    pte.pt.table = 1;
+    pte.pt.xn = 0;
+    pte.pt.ro = 1;
+    write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
+
+    /* Switch TTBR */
+    fn(ttbr);
+
+    /*
+     * Disable the identity mapping in the runtime page tables.
+     * Note it is not necessary to disable it in the boot page tables
+     * because they are not going to be used by this CPU anymore.
+     */
+    update_identity_mapping(false);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 68adcac9fa8d..bff6923f3ea9 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -196,6 +196,8 @@  extern unsigned long total_pages;
 extern void setup_pagetables(unsigned long boot_phys_offset);
 /* Map FDT in boot pagetable */
 extern void *early_fdt_map(paddr_t fdt_paddr);
+/* Switch to a new root page-tables */
+extern void switch_ttbr(uint64_t ttbr);
 /* Remove early mappings */
 extern void remove_early_mappings(void);
 /* Allocate and initialise pagetables for a secondary CPU. Sets init_ttbr to the
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 39e0d9e03c9c..cf23ae02d1b7 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -476,8 +476,6 @@  static void xen_pt_enforce_wnx(void)
     flush_xen_tlb_local();
 }
 
-extern void switch_ttbr(uint64_t ttbr);
-
 /* Clear a translation table and clean & invalidate the cache */
 static void clear_table(void *table)
 {
@@ -550,13 +548,17 @@  void __init setup_pagetables(unsigned long boot_phys_offset)
     ttbr = (uintptr_t) cpu0_pgtable + phys_offset;
 #endif
 
-    switch_ttbr(ttbr);
-
-    xen_pt_enforce_wnx();
-
+    /*
+     * This needs to be setup first so switch_ttbr() can enable the
+     * identity mapping.
+     */
 #ifdef CONFIG_ARM_32
     per_cpu(xen_pgtable, 0) = cpu0_pgtable;
 #endif
+
+    switch_ttbr(ttbr);
+
+    xen_pt_enforce_wnx();
 }
 
 static void clear_boot_pagetables(void)