Message ID | 20190310011906.254635-1-yuzhao@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v3,1/3] arm64: mm: use appropriate ctors for page tables |
Hello Yu,

We had some disagreements over this series last time around after which I had
posted the following series [1] which tried to enable ARCH_ENABLE_SPLIT_PMD_PTLOCK
after doing some pgtable accounting changes. After some thoughts and deliberations
I figure that it's better not to do pgtable alloc changes on arm64 creating a brand
new semantics which ideally should be first debated and agreed upon in generic MM.

Though I still see value in a changed generic pgtable page allocation semantics
for user and kernel space, that should not stop us from enabling more granular
PMD level locks through ARCH_ENABLE_SPLIT_PMD_PTLOCK right now.

[1] https://www.spinics.net/lists/arm-kernel/msg709917.html

Having said that, this series attempts to enable ARCH_ENABLE_SPLIT_PMD_PTLOCK with
some minimal changes to existing kernel pgtable page allocation code. Hence just
trying to re-evaluate the series in that isolation.

On 03/10/2019 06:49 AM, Yu Zhao wrote:
> For pte page, use pgtable_page_ctor(); for pmd page, use
> pgtable_pmd_page_ctor(); and for the rest (pud, p4d and pgd),
> don't use any.

This is a semantics change. Hence the question is why? Should we not wait until a
generic MM agreement is in place in this regard? Can we avoid this? Is the change
really required to enable ARCH_ENABLE_SPLIT_PMD_PTLOCK for user space THP which
this series originally intended to achieve?

>
> For now, we don't select ARCH_ENABLE_SPLIT_PMD_PTLOCK and
> pgtable_pmd_page_ctor() is a nop. When we do in patch 3, we
> make sure pmd is not folded so we won't mistakenly call
> pgtable_pmd_page_ctor() on pud or p4d.

This makes sense from a code perspective but I still don't understand the need to
change kernel pgtable page allocation semantics without any real benefit or fix at
the moment. Can't we keep kernel page table page allocation unchanged for now and
just enable ARCH_ENABLE_SPLIT_PMD_PTLOCK for user space THP benefits? Do you see
any concern with that?
On Mon, Mar 11, 2019 at 01:15:55PM +0530, Anshuman Khandual wrote:
> Hello Yu,
>
> We had some disagreements over this series last time around after which I had
> posted the following series [1] which tried to enable ARCH_ENABLE_SPLIT_PMD_PTLOCK
> after doing some pgtable accounting changes. After some thoughts and deliberations
> I figure that it's better not to do pgtable alloc changes on arm64 creating a brand
> new semantics which ideally should be first debated and agreed upon in generic MM.
>
> Though I still see value in a changed generic pgtable page allocation semantics
> for user and kernel space, that should not stop us from enabling more granular
> PMD level locks through ARCH_ENABLE_SPLIT_PMD_PTLOCK right now.
>
> [1] https://www.spinics.net/lists/arm-kernel/msg709917.html
>
> Having said that, this series attempts to enable ARCH_ENABLE_SPLIT_PMD_PTLOCK with
> some minimal changes to existing kernel pgtable page allocation code. Hence just
> trying to re-evaluate the series in that isolation.
>
> On 03/10/2019 06:49 AM, Yu Zhao wrote:
> > For pte page, use pgtable_page_ctor(); for pmd page, use
> > pgtable_pmd_page_ctor(); and for the rest (pud, p4d and pgd),
> > don't use any.
>
> This is a semantics change. Hence the question is why? Should we not wait until a
> generic MM agreement is in place in this regard? Can we avoid this? Is the change
> really required to enable ARCH_ENABLE_SPLIT_PMD_PTLOCK for user space THP which
> this series originally intended to achieve?
>
> >
> > For now, we don't select ARCH_ENABLE_SPLIT_PMD_PTLOCK and
> > pgtable_pmd_page_ctor() is a nop. When we do in patch 3, we
> > make sure pmd is not folded so we won't mistakenly call
> > pgtable_pmd_page_ctor() on pud or p4d.
>
> This makes sense from a code perspective but I still don't understand the need to
> change kernel pgtable page allocation semantics without any real benefit or fix at
> the moment. Can't we keep kernel page table page allocation unchanged for now and
> just enable ARCH_ENABLE_SPLIT_PMD_PTLOCK for user space THP benefits? Do you see
> any concern with that?

This is not for kernel page tables (i.e. init_mm). This is to
accommodate pre-allocated efi_mm page tables because it uses
apply_to_page_range() which then calls pte_alloc_map_lock().
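To make the point above concrete, here is a minimal user-space sketch of why the constructor choice matters for pre-allocated tables. It is not kernel code. It assumes that the pte/pmd constructors are what initialise the per-page split lock that a later pte_alloc_map_lock() would take, and it models that with a plain flag. All `sketch_*` names are hypothetical, and the `SKETCH_*_SHIFT` values assume a 4K granule with 4 levels.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for a 4K-granule, 4-level configuration (illustrative only). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SHIFT  21
#define SKETCH_PUD_SHIFT  30

enum sketch_ctor { SKETCH_CTOR_NONE, SKETCH_CTOR_PTE, SKETCH_CTOR_PMD };

/* Toy stand-in for struct page: tracks only the split-lock state. */
struct sketch_page {
	bool ptl_initialized;
	enum sketch_ctor ctor;
};

/* Stand-in for pgtable_page_ctor(): sets up the pte-level lock. */
static bool sketch_pte_ctor(struct sketch_page *page)
{
	page->ptl_initialized = true;
	page->ctor = SKETCH_CTOR_PTE;
	return true;
}

/* Stand-in for pgtable_pmd_page_ctor(): sets up the pmd-level lock. */
static bool sketch_pmd_ctor(struct sketch_page *page)
{
	page->ptl_initialized = true;
	page->ctor = SKETCH_CTOR_PMD;
	return true;
}

/*
 * Same dispatch shape as the patched pgd_pgtable_alloc(): the caller
 * passes the shift of the level it is allocating for, and only pte and
 * pmd pages get a constructor. Higher levels are left untouched.
 */
static enum sketch_ctor sketch_run_ctor(struct sketch_page *page, int shift)
{
	bool ok = true;

	if (shift == SKETCH_PAGE_SHIFT)
		ok = sketch_pte_ctor(page);
	else if (shift == SKETCH_PMD_SHIFT)
		ok = sketch_pmd_ctor(page);

	assert(ok);
	(void)ok; /* keep NDEBUG builds warning-free */
	return page->ctor;
}

/* A later lock-taking walker is only safe on pages whose ctor has run. */
static bool sketch_can_take_ptl(const struct sketch_page *page)
{
	return page->ptl_initialized;
}
```

In this model, a page allocated with `SKETCH_PAGE_SHIFT` or `SKETCH_PMD_SHIFT` ends up with its lock initialised, while one allocated with `SKETCH_PUD_SHIFT` does not, which is the distinction the patch introduces.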
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b6f5aa52ac67..f704b291f2c5 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -98,7 +98,7 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 }
 EXPORT_SYMBOL(phys_mem_access_prot);
 
-static phys_addr_t __init early_pgtable_alloc(void)
+static phys_addr_t __init early_pgtable_alloc(int shift)
 {
 	phys_addr_t phys;
 	void *ptr;
@@ -173,7 +173,7 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
 static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
 				unsigned long end, phys_addr_t phys,
 				pgprot_t prot,
-				phys_addr_t (*pgtable_alloc)(void),
+				phys_addr_t (*pgtable_alloc)(int),
 				int flags)
 {
 	unsigned long next;
@@ -183,7 +183,7 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
 	if (pmd_none(pmd)) {
 		phys_addr_t pte_phys;
 		BUG_ON(!pgtable_alloc);
-		pte_phys = pgtable_alloc();
+		pte_phys = pgtable_alloc(PAGE_SHIFT);
 		__pmd_populate(pmdp, pte_phys, PMD_TYPE_TABLE);
 		pmd = READ_ONCE(*pmdp);
 	}
@@ -207,7 +207,7 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
 
 static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
 		     phys_addr_t phys, pgprot_t prot,
-		     phys_addr_t (*pgtable_alloc)(void), int flags)
+		     phys_addr_t (*pgtable_alloc)(int), int flags)
 {
 	unsigned long next;
 	pmd_t *pmdp;
@@ -245,7 +245,7 @@ static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
 static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
 				unsigned long end, phys_addr_t phys,
 				pgprot_t prot,
-				phys_addr_t (*pgtable_alloc)(void), int flags)
+				phys_addr_t (*pgtable_alloc)(int), int flags)
 {
 	unsigned long next;
 	pud_t pud = READ_ONCE(*pudp);
@@ -257,7 +257,7 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
 	if (pud_none(pud)) {
 		phys_addr_t pmd_phys;
 		BUG_ON(!pgtable_alloc);
-		pmd_phys = pgtable_alloc();
+		pmd_phys = pgtable_alloc(PMD_SHIFT);
 		__pud_populate(pudp, pmd_phys, PUD_TYPE_TABLE);
 		pud = READ_ONCE(*pudp);
 	}
@@ -293,7 +293,7 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next,
 
 static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
 			   phys_addr_t phys, pgprot_t prot,
-			   phys_addr_t (*pgtable_alloc)(void),
+			   phys_addr_t (*pgtable_alloc)(int),
 			   int flags)
 {
 	unsigned long next;
@@ -303,7 +303,7 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
 	if (pgd_none(pgd)) {
 		phys_addr_t pud_phys;
 		BUG_ON(!pgtable_alloc);
-		pud_phys = pgtable_alloc();
+		pud_phys = pgtable_alloc(PUD_SHIFT);
 		__pgd_populate(pgdp, pud_phys, PUD_TYPE_TABLE);
 		pgd = READ_ONCE(*pgdp);
 	}
@@ -344,7 +344,7 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
 static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
 				 unsigned long virt, phys_addr_t size,
 				 pgprot_t prot,
-				 phys_addr_t (*pgtable_alloc)(void),
+				 phys_addr_t (*pgtable_alloc)(int),
 				 int flags)
 {
 	unsigned long addr, length, end, next;
@@ -370,11 +370,23 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
 	} while (pgdp++, addr = next, addr != end);
 }
 
-static phys_addr_t pgd_pgtable_alloc(void)
+static phys_addr_t pgd_pgtable_alloc(int shift)
 {
 	void *ptr = (void *)__get_free_page(PGALLOC_GFP);
-	if (!ptr || !pgtable_page_ctor(virt_to_page(ptr)))
-		BUG();
+	BUG_ON(!ptr);
+
+	/*
+	 * Call proper page table ctor in case later we need to
+	 * call core mm functions like apply_to_page_range() on
+	 * this pre-allocated page table.
+	 *
+	 * We don't select ARCH_ENABLE_SPLIT_PMD_PTLOCK if pmd is
+	 * folded, and if so pgtable_pmd_page_ctor() becomes nop.
+	 */
+	if (shift == PAGE_SHIFT)
+		BUG_ON(!pgtable_page_ctor(virt_to_page(ptr)));
+	else if (shift == PMD_SHIFT)
+		BUG_ON(!pgtable_pmd_page_ctor(virt_to_page(ptr)));
 
 	/* Ensure the zeroed page is visible to the page table walker */
 	dsb(ishst);