diff mbox series

PATCH] fs/dax: fix compile problem on parisc and mips

Message ID fb91b40d258414b0fdce2c380752e48daa6a70d6.camel@HansenPartnership.com (mailing list archive)
State New, archived
Headers show
Series PATCH] fs/dax: fix compile problem on parisc and mips | expand

Commit Message

James Bottomley Dec. 4, 2020, 12:33 a.m. UTC
These platforms define PMD_ORDER in asm/pgtable.h

This means that as soon as dax.c included asm/pgtable.h in commit
11cf9d863dcb ("fs/dax: Deposit pagetable even when installing zero
page") we clash with PMD_ORDER introduced by cfc93c6c6c96 ("dax:
Convert dax_insert_pfn_mkwrite to XArray") and we get this problem:

/home/jejb/git/linux-build/fs/dax.c:53: warning: "PMD_ORDER" redefined
   53 | #define PMD_ORDER (PMD_SHIFT - PAGE_SHIFT)
      |
In file included from /home/jejb/git/linux-build/include/linux/pgtable.h:6,
                 from /home/jejb/git/linux-build/include/linux/mm.h:33,
                 from /home/jejb/git/linux-build/include/linux/bvec.h:14,
                 from /home/jejb/git/linux-build/include/linux/blk_types.h:10,
                 from /home/jejb/git/linux-build/include/linux/genhd.h:19,
                 from /home/jejb/git/linux-build/include/linux/blkdev.h:8,
                 from /home/jejb/git/linux-build/fs/dax.c:10:
/home/jejb/git/linux-build/arch/parisc/include/asm/pgtable.h:124: note: this is the location of the previous definition
  124 | #define PMD_ORDER 1 /* Number of pages per pmd */
      |
make[2]: *** Deleting file 'fs/dax.o'

Fix by renaming dax's PMD_ORDER to DAX_PMD_ORDER

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 fs/dax.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

Matthew Wilcox Dec. 4, 2020, 3:48 a.m. UTC | #1
On Thu, Dec 03, 2020 at 04:33:10PM -0800, James Bottomley wrote:
> These platforms define PMD_ORDER in asm/pgtable.h

I think that's the real problem, though.

#define PGD_ORDER       1 /* Number of pages per pgd */
#define PMD_ORDER       1 /* Number of pages per pmd */
#define PGD_ALLOC_ORDER (2 + 1) /* first pgd contains pmd */
#else
#define PGD_ORDER       1 /* Number of pages per pgd */
#define PGD_ALLOC_ORDER (PGD_ORDER + 1)

That should clearly be PMD_ALLOC_ORDER, not PMD_ORDER.  Or even
PAGES_PER_PMD like the comment calls it, because I really think
that doing an order-3 (8 pages) allocation for the PGD is wrong.
Helge Deller Dec. 4, 2020, 7:57 a.m. UTC | #2
On 12/4/20 4:48 AM, Matthew Wilcox wrote:
> On Thu, Dec 03, 2020 at 04:33:10PM -0800, James Bottomley wrote:
>> These platforms define PMD_ORDER in asm/pgtable.h
>
> I think that's the real problem, though.
>
> #define PGD_ORDER       1 /* Number of pages per pgd */
> #define PMD_ORDER       1 /* Number of pages per pmd */
> #define PGD_ALLOC_ORDER (2 + 1) /* first pgd contains pmd */
> #else
> #define PGD_ORDER       1 /* Number of pages per pgd */
> #define PGD_ALLOC_ORDER (PGD_ORDER + 1)
>
> That should clearly be PMD_ALLOC_ORDER, not PMD_ORDER.  Or even
> PAGES_PER_PMD like the comment calls it, because I really think
> that doing an order-3 (8 pages) allocation for the PGD is wrong.

We need a spinlock to protect parallel accesses to the PGD,
search for pgd_spinlock().
This spinlock is stored behind the memory used for the PGD, which
is why we allocate more memory (and waste 3 pages).

Helge
Matthew Wilcox Dec. 4, 2020, 12:44 p.m. UTC | #3
On Fri, Dec 04, 2020 at 08:57:37AM +0100, Helge Deller wrote:
> On 12/4/20 4:48 AM, Matthew Wilcox wrote:
> > On Thu, Dec 03, 2020 at 04:33:10PM -0800, James Bottomley wrote:
> >> These platforms define PMD_ORDER in asm/pgtable.h
> >
> > I think that's the real problem, though.
> >
> > #define PGD_ORDER       1 /* Number of pages per pgd */
> > #define PMD_ORDER       1 /* Number of pages per pmd */
> > #define PGD_ALLOC_ORDER (2 + 1) /* first pgd contains pmd */
> > #else
> > #define PGD_ORDER       1 /* Number of pages per pgd */
> > #define PGD_ALLOC_ORDER (PGD_ORDER + 1)
> >
> > That should clearly be PMD_ALLOC_ORDER, not PMD_ORDER.  Or even
> > PAGES_PER_PMD like the comment calls it, because I really think
> > that doing an order-3 (8 pages) allocation for the PGD is wrong.
> 
> We need a spinlock to protect parallel accesses to the PGD,
> search for pgd_spinlock().
> This spinlock is stored behind the memory used for the PGD, which
> is why we allocate more memory (and waste 3 pages).

There are better places to store it than that!  For example, storing it
in the struct page, like many architectures do for split ptlocks.
You'll have to skip the _pt_pad_1 so it doesn't get confused with
being a compound_head, but soemthing like this:

		struct {	/* PA-RISC PGD */
			unsigned long _pa_pad_1;	/* compound_head */
#if ALLOC_SPLIT_PA_PTLOCKS
			spinlock_t *pa_ptl;
#else
			spinlock_t pa_ptl;
#endif
		};

inside struct page (linux/mm_types.h) should do the trick.  You'll
still need to allocate them separately if various debugging options
are enabled (see the ALLOC_SPLIT_PTLOCKS for details), but usually
this will save you a lot of memory.

You could also fill in pt_mm like x86 does for its pgds, and then use
mm->page_table_lock to protect whatever the PGD lock currently protects.
Maybe page_table_lock is sometimes held when calling ptep_set_wrprotect()
and sometimes isn't; then this wouldn't work.

Also, could you fix the comments?  They don't match the code:

 #define PGD_ORDER      1 /* Number of pages per pgd */

should be

 #define PGD_ALLOC_ORDER      1 /* 2 pages (8KiB) per pgd */
John David Anglin Dec. 4, 2020, 1:28 p.m. UTC | #4
On 2020-12-04 7:44 a.m., Matthew Wilcox wrote:
> You'll
> still need to allocate them separately if various debugging options
> are enabled (see the ALLOC_SPLIT_PTLOCKS for details), but usually
> this will save you a lot of memory.
We need all we can get:
(.mlocate): page allocation failure: order:5, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
CPU: 2 PID: 28271 Comm: (.mlocate) Not tainted 5.9.11+ #1
Hardware name: 9000/800/rp3440
Backtrace:
 [<000000004018d050>] show_stack+0x50/0x70
 [<0000000040826354>] dump_stack+0xbc/0x130
 [<000000004033dc14>] warn_alloc+0x144/0x1e0
 [<000000004033e930>] __alloc_pages_slowpath.constprop.0+0xc80/0xcf8
 [<000000004033ec48>] __alloc_pages_nodemask+0x2a0/0x2f0
 [<0000000040351a2c>] cache_alloc_refill+0x6b4/0xe50
 [<000000004035416c>] __kmalloc+0x5e4/0x740
 [<00000000040ddbe8>] nfsd_reply_cache_init+0x1d0/0x360 [nfsd]
 [<00000000040d1118>] nfsd_init_net+0xb0/0x1c0 [nfsd]
 [<00000000406a5860>] ops_init+0x68/0x178
 [<00000000406a5b24>] setup_net+0x1b4/0x348
 [<00000000406a6e20>] copy_net_ns+0x1f0/0x450
 [<00000000401e6578>] create_new_namespaces+0x1b0/0x418
 [<00000000401e72ac>] unshare_nsproxy_namespaces+0x8c/0xf0
 [<00000000401b6acc>] ksys_unshare+0x1bc/0x440
 [<00000000401b6d70>] sys_unshare+0x20/0x38
 [<0000000040188018>] syscall_exit+0x0/0x14

Mem-Info:
active_anon:1209957 inactive_anon:438171 isolated_anon:0
 active_file:38971 inactive_file:21741 isolated_file:0
 unevictable:4662 dirty:144 writeback:0
 slab_reclaimable:45748 slab_unreclaimable:51548
 mapped:34940 shmem:1471859 pagetables:5429 bounce:0
 free:213676 free_pcp:317 free_cma:0
Node 0 active_anon:4839828kB inactive_anon:1753200kB active_file:155884kB inactive_file:86964kB unevictable:18648kB isolated(anon):0kB
isolated(file):0kB mapped:139760kB dirty:576kB writeback:0kB shmem:5887436kB writeback_tmp:0kB kernel_stack:6032kB all_unreclaimable? no
Normal free:853948kB min:11448kB low:19636kB high:27824kB reserved_highatomic:0KB active_anon:4839828kB inactive_anon:1753544kB
active_file:155884kB inactive_file:86964kB unevictable:18648kB writepending:576kB present:8386560kB managed:8211756kB mlocked:18648kB
pagetables:21716kB bounce:0kB free_pcp:1168kB local_pcp:352kB free_cma:0kB
lowmem_reserve[]: 0 0
Normal: 58860*4kB (UME) 48155*8kB (UME) 11414*16kB (UME) 1291*32kB (UME) 134*64kB (UME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB =
853192kB
1607717 total pagecache pages
73822 pages in swap cache
Swap cache stats: add 21252454, delete 21178083, find 5702723/6815498
Free swap  = 33742652kB
Total swap = 49758652kB
2096640 pages RAM
0 pages HighMem/MovableOnly
43701 pages reserved

Cheers,
Dave
Matthew Wilcox Dec. 4, 2020, 2:11 p.m. UTC | #5
On Fri, Dec 04, 2020 at 08:28:47AM -0500, John David Anglin wrote:
> (.mlocate): page allocation failure: order:5, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
>  [<000000004035416c>] __kmalloc+0x5e4/0x740
>  [<00000000040ddbe8>] nfsd_reply_cache_init+0x1d0/0x360 [nfsd]

Oof, order 5.  Fortunately, that one was already fixed by commit
8c38b705b4f4ca4e7f9cc116141bc38391917c30.
diff mbox series

Patch

diff --git a/fs/dax.c b/fs/dax.c
index 5b47834f2e1b..4d3b0db5c321 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -50,7 +50,7 @@  static inline unsigned int pe_order(enum page_entry_size pe_size)
 #define PG_PMD_NR	(PMD_SIZE >> PAGE_SHIFT)
 
 /* The order of a PMD entry */
-#define PMD_ORDER	(PMD_SHIFT - PAGE_SHIFT)
+#define DAX_PMD_ORDER	(PMD_SHIFT - PAGE_SHIFT)
 
 static wait_queue_head_t wait_table[DAX_WAIT_TABLE_ENTRIES];
 
@@ -98,7 +98,7 @@  static bool dax_is_locked(void *entry)
 static unsigned int dax_entry_order(void *entry)
 {
 	if (xa_to_value(entry) & DAX_PMD)
-		return PMD_ORDER;
+		return DAX_PMD_ORDER;
 	return 0;
 }
 
@@ -1471,7 +1471,7 @@  static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct address_space *mapping = vma->vm_file->f_mapping;
-	XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, PMD_ORDER);
+	XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, DAX_PMD_ORDER);
 	unsigned long pmd_addr = vmf->address & PMD_MASK;
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
 	bool sync;
@@ -1530,7 +1530,7 @@  static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	 * entry is already in the array, for instance), it will return
 	 * VM_FAULT_FALLBACK.
 	 */
-	entry = grab_mapping_entry(&xas, mapping, PMD_ORDER);
+	entry = grab_mapping_entry(&xas, mapping, DAX_PMD_ORDER);
 	if (xa_is_internal(entry)) {
 		result = xa_to_internal(entry);
 		goto fallback;
@@ -1696,7 +1696,7 @@  dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order)
 	if (order == 0)
 		ret = vmf_insert_mixed_mkwrite(vmf->vma, vmf->address, pfn);
 #ifdef CONFIG_FS_DAX_PMD
-	else if (order == PMD_ORDER)
+	else if (order == DAX_PMD_ORDER)
 		ret = vmf_insert_pfn_pmd(vmf, pfn, FAULT_FLAG_WRITE);
 #endif
 	else