mbox series

[RFC,0/8] use __create_pgd_mapping() to implement idmap and unify codes

Message ID 20210410095654.24102-1-kernelfans@gmail.com (mailing list archive)
Headers show
Series use __create_pgd_mapping() to implement idmap and unify codes | expand

Message

Pingfan Liu April 10, 2021, 9:56 a.m. UTC
Hi everyone,

Sorry to bring up this RFC in a hurry, since I paid attention to "arm64:
MMU enabled kexec relocation" too late and now it has advanced to "[PATCH
v13 00/18] arm64: MMU enabled kexec relocation".

And I think maybe that work can be based on my series.

I have raised my concern when reviewing "[PATCH v12 00/17] arm64: MMU
enabled kexec relocation"
  https://linuxlists.cc/l/1/linux-kernel/t/3923858/(patch_v12_00_17)_arm64:_mmu_enabled_kexec_relocation#post3948651
  (It seems that lore.kernel.org has not archived my reply)
  Where I wrote:
    Then the processes may be neat (I hope so):
    -1. set up identity map in machine_kexec_post_load(), instead of
    copying linear map.
    -2. Also past this temporary identity map to arm64_relocate_new_kernel()
    -3. in arm64_relocate_new_kernel(), just load identity map and
    re-enable MMU. After copying, just turn off MMU.

In a short discuss off-line, Pavel pointed to me
  https://lore.kernel.org/linux-arm-kernel/CA+CK2bC2KwWufE1DWa4szn_hQ1dbjDVHgYUu7=J4O_kvKXTrHg@mail.gmail.com/#t,
which prevent him from using idmap to implement his series.


After digging into the code, I find that if extending one more pgtable level,
the __create_pgd_mapping() routines can be re-used for idmap_pg_dir and
init_pg_dir. Besides, it can be re-used for trans_pgd_idmap_page().
That is what this series do.

As for "[PATCHv13 00/18] arm64: MMU enabled kexec relocation", here is
my two cents:
  -1. a call to create_idmap() API in machine_kexec_post_load(), to map
src + dst + arm64_relocate_new_kernel().
  -2. turn on MMU in arm64_relocate_new_kernel(), after done, turn off.

Sorry again for a hurry. It can be compiled, but far from good.

Thanks,

Pingfan

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Atish Patra <atish.patra@wdc.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Mark Brown <broonie@kernel.org>
To: linux-arm-kernel@lists.infradead.org

Pingfan Liu (8):
  arm64/mm: split out __create_pgd_mapping() routines
  arm64/mm: change __create_pgd_mapping() prototype to accept nr_entries
    and introduce create_idmap()
  arm64/mm: change __create_pgd_mapping() prototype to accept extra info
    for allocator
  arm64/mm: enable __create_pgd_mapping() to run across different
    pgtable
  arm64/mm: make trans_pgd_idmap_page() use create_idmap()
  arm64/mm: introduce pgtable allocator for head
  arm64/pgtable-prot.h: reorganize to cope with asm
  arm64/head: convert idmap_pg_dir and init_pg_dir to
    __create_pgd_mapping()

 arch/arm64/Kconfig                    |   4 +
 arch/arm64/include/asm/pgalloc.h      |  28 ++
 arch/arm64/include/asm/pgtable-prot.h |  34 ++-
 arch/arm64/kernel/head.S              | 190 ++++----------
 arch/arm64/mm/Makefile                |   2 +
 arch/arm64/mm/idmap_mmu.c             |  46 ++++
 arch/arm64/mm/mmu.c                   | 358 ++++++--------------------
 arch/arm64/mm/mmu_include.c           | 284 ++++++++++++++++++++
 arch/arm64/mm/trans_pgd.c             |  59 ++---
 9 files changed, 535 insertions(+), 470 deletions(-)
 create mode 100644 arch/arm64/mm/idmap_mmu.c
 create mode 100644 arch/arm64/mm/mmu_include.c

Comments

Pasha Tatashin April 14, 2021, 2:05 p.m. UTC | #1
On Sat, Apr 10, 2021 at 5:57 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> Hi everyone,
>
> Sorry to bring up this RFC in a hurry, since I paid attention to "arm64:
> MMU enabled kexec relocation" too late and now it has advanced to "[PATCH
> v13 00/18] arm64: MMU enabled kexec relocation".
>
> And I think maybe that work can be based on my series.
>
> I have raised my concern when reviewing "[PATCH v12 00/17] arm64: MMU
> enabled kexec relocation"
>   https://linuxlists.cc/l/1/linux-kernel/t/3923858/(patch_v12_00_17)_arm64:_mmu_enabled_kexec_relocation#post3948651
>   (It seems that lore.kernel.org has not archived my reply)
>   Where I wrote:
>     Then the processes may be neat (I hope so):
>     -1. set up identity map in machine_kexec_post_load(), instead of
>     copying linear map.
>     -2. Also past this temporary identity map to arm64_relocate_new_kernel()
>     -3. in arm64_relocate_new_kernel(), just load identity map and
>     re-enable MMU. After copying, just turn off MMU.


Hi Pingfan,

The MMU enabled kexec code has been in development for a while, and
has gone through several iterations:

1. simply reserve memory (similar to crash kernel) so no relocation is
needed. The approach was only ~50 LOC, but since this was an ARM64
specific problem I was asked to fix it in ARM64, not in generic code.
2. The second approach was to use idmap (as you are proposing now),
but James Morse explained to me that there are arm systems that have
very high starting physical addresses that they cannot cover all
physical memory via idamp. So, I cannot assume that I can idmap any
page in PA.
3. The third approach was to unify some of page table management code
with hibernations (trans_pgd), and use contiguous VA maps, so the
relocation function can be as simply as possible. However, both, Eric
Biederman and James Morse asked me to change it to a linear map
instead: to be inline with other arches, and also for easier
debugging.
4. The fourth approach is the current one, I am using a linear map,
and a lot of patches for this project have already landed into the
mainline. The last set of changes does not add any new LOC: "18 files
changed, 315 insertions(+), 330 deletions(-)", as all the preliminary
work has landed upstream.

What is the benefit of going back to approach 2, when the current
approach has already been agreed with James and Eric, and does not add
new complexity, as the net LOC change is negative?

Thank you,
Pavel


>
> In a short discuss off-line, Pavel pointed to me
>   https://lore.kernel.org/linux-arm-kernel/CA+CK2bC2KwWufE1DWa4szn_hQ1dbjDVHgYUu7=J4O_kvKXTrHg@mail.gmail.com/#t,
> which prevent him from using idmap to implement his series.
>
>
> After digging into the code, I find that if extending one more pgtable level,
> the __create_pgd_mapping() routines can be re-used for idmap_pg_dir and
> init_pg_dir. Besides, it can be re-used for trans_pgd_idmap_page().
> That is what this series do.
>
> As for "[PATCHv13 00/18] arm64: MMU enabled kexec relocation", here is
> my two cents:
>   -1. a call to create_idmap() API in machine_kexec_post_load(), to map
> src + dst + arm64_relocate_new_kernel().
>   -2. turn on MMU in arm64_relocate_new_kernel(), after done, turn off.
>
> Sorry again for a hurry. It can be compiled, but far from good.
>
> Thanks,
>
> Pingfan
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Kristina Martsenko <kristina.martsenko@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Atish Patra <atish.patra@wdc.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: Mark Brown <broonie@kernel.org>
> To: linux-arm-kernel@lists.infradead.org
>
> Pingfan Liu (8):
>   arm64/mm: split out __create_pgd_mapping() routines
>   arm64/mm: change __create_pgd_mapping() prototype to accept nr_entries
>     and introduce create_idmap()
>   arm64/mm: change __create_pgd_mapping() prototype to accept extra info
>     for allocator
>   arm64/mm: enable __create_pgd_mapping() to run across different
>     pgtable
>   arm64/mm: make trans_pgd_idmap_page() use create_idmap()
>   arm64/mm: introduce pgtable allocator for head
>   arm64/pgtable-prot.h: reorganize to cope with asm
>   arm64/head: convert idmap_pg_dir and init_pg_dir to
>     __create_pgd_mapping()
>
>  arch/arm64/Kconfig                    |   4 +
>  arch/arm64/include/asm/pgalloc.h      |  28 ++
>  arch/arm64/include/asm/pgtable-prot.h |  34 ++-
>  arch/arm64/kernel/head.S              | 190 ++++----------
>  arch/arm64/mm/Makefile                |   2 +
>  arch/arm64/mm/idmap_mmu.c             |  46 ++++
>  arch/arm64/mm/mmu.c                   | 358 ++++++--------------------
>  arch/arm64/mm/mmu_include.c           | 284 ++++++++++++++++++++
>  arch/arm64/mm/trans_pgd.c             |  59 ++---
>  9 files changed, 535 insertions(+), 470 deletions(-)
>  create mode 100644 arch/arm64/mm/idmap_mmu.c
>  create mode 100644 arch/arm64/mm/mmu_include.c
>
> --
> 2.29.2
>
Pingfan Liu April 15, 2021, 2:14 a.m. UTC | #2
Hi Pavel,

First of all, sorry to stir up this topic too late.

On Wed, Apr 14, 2021 at 10:06 PM Pavel Tatashin
<pasha.tatashin@soleen.com> wrote:
>
[...]
> Hi Pingfan,
>
> The MMU enabled kexec code has been in development for a while, and
> has gone through several iterations:
>

Yes, I have gone through the whole iterations after reviewing v12.

> 1. simply reserve memory (similar to crash kernel) so no relocation is
> needed. The approach was only ~50 LOC, but since this was an ARM64
> specific problem I was asked to fix it in ARM64, not in generic code.
> 2. The second approach was to use idmap (as you are proposing now),
> but James Morse explained to me that there are arm systems that have
> very high starting physical addresses that they cannot cover all
> physical memory via idamp. So, I cannot assume that I can idmap any
> page in PA.

I think here, the exact blocking factor is the routines in hand can
not set up idmap. But the current routines have the capability if
enhanced. And  it turns out easy to achieve the goal by redefining
CONFIG_PGTABLE_LEVEL.

> 3. The third approach was to unify some of page table management code
> with hibernations (trans_pgd), and use contiguous VA maps, so the
> relocation function can be as simply as possible. However, both, Eric
> Biederman and James Morse asked me to change it to a linear map
> instead: to be inline with other arches, and also for easier
> debugging.
> 4. The fourth approach is the current one, I am using a linear map,
> and a lot of patches for this project have already landed into the
> mainline. The last set of changes does not add any new LOC: "18 files
> changed, 315 insertions(+), 330 deletions(-)", as all the preliminary
> work has landed upstream.
>
> What is the benefit of going back to approach 2, when the current
> approach has already been agreed with James and Eric, and does not add
> new complexity, as the net LOC change is negative?
>

It takes me some time to understand the stub handling. But James is
expert on this field and sure about it.  In contrast, idmap is similar
to linear map, meanwhile free of kvm stub handling. As a result, the
code scarcely needs change against MMU-disabled code.

Anyway, the primary target of my series is to share the common code
with [5/8] and [8/8].

Thanks,
Pingfan