
[v16,13/25] mm: pagewalk: Don't lock PTEs for walk_page_range_novma()

Message ID 20191206135316.47703-14-steven.price@arm.com
State New, archived
Series Generic page walk and ptdump

Commit Message

Steven Price Dec. 6, 2019, 1:53 p.m. UTC
walk_page_range_novma() can be used to walk page tables of the kernel or
for firmware. These page tables may contain entries that are not backed
by a struct page and so it isn't (in general) possible to take the PTE
lock for the pte_entry() callback. So update walk_pte_range() to only
take the lock when no_vma==false and add a comment explaining the
difference to walk_page_range_novma().

Signed-off-by: Steven Price <steven.price@arm.com>
---
 mm/pagewalk.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

Comments

kernel test robot Dec. 10, 2019, 11:23 a.m. UTC | #1
Hi Steven,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.5-rc1 next-20191209]
[cannot apply to arm64/for-next/core tip/x86/mm]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Steven-Price/Generic-page-walk-and-ptdump/20191208-035831
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ad910e36da4ca3a1bd436989f632d062dda0c921
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.1-101-g82dee2e-dirty
        make ARCH=x86_64 allmodconfig
        make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)

>> include/linux/spinlock.h:378:9: sparse: sparse: context imbalance in 'walk_pte_range' - unexpected unlock

vim +/walk_pte_range +378 include/linux/spinlock.h

c2f21ce2e31286 Thomas Gleixner 2009-12-02  375  
3490565b633c70 Denys Vlasenko  2015-07-13  376  static __always_inline void spin_unlock(spinlock_t *lock)
c2f21ce2e31286 Thomas Gleixner 2009-12-02  377  {
c2f21ce2e31286 Thomas Gleixner 2009-12-02 @378  	raw_spin_unlock(&lock->rlock);
c2f21ce2e31286 Thomas Gleixner 2009-12-02  379  }
c2f21ce2e31286 Thomas Gleixner 2009-12-02  380  

:::::: The code at line 378 was first introduced by commit
:::::: c2f21ce2e31286a0a32f8da0a7856e9ca1122ef3 locking: Implement new raw_spinlock

:::::: TO: Thomas Gleixner <tglx@linutronix.de>
:::::: CC: Thomas Gleixner <tglx@linutronix.de>

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation
Steven Price Dec. 11, 2019, 3:54 p.m. UTC | #2
On 10/12/2019 11:23, kbuild test robot wrote:
>>> include/linux/spinlock.h:378:9: sparse: sparse: context imbalance in 'walk_pte_range' - unexpected unlock

I believe this is a false positive (although the trace here is useless).
This patch adds a conditional lock/unlock:

pte = walk->no_vma ? pte_offset_map(pmd, addr) :
		     pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
...
if (!walk->no_vma)
	spin_unlock(ptl);
pte_unmap(pte);

I'm not sure how to make sparse happy about that. Is the only option to
have two versions of the walk_pte_range() function? One which takes the
lock and one which doesn't.

Steve

Luc Van Oostenryck Dec. 11, 2019, 5:12 p.m. UTC | #3
On Wed, Dec 11, 2019 at 03:54:06PM +0000, Steven Price wrote:
> On 10/12/2019 11:23, kbuild test robot wrote:
> >>> include/linux/spinlock.h:378:9: sparse: sparse: context imbalance in 'walk_pte_range' - unexpected unlock
> 
> I believe this is a false positive (although the trace here is useless).
> This patch adds a conditional lock/unlock:
> 
> pte = walk->no_vma ? pte_offset_map(pmd, addr) :
> 		     pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
> ...
> if (!walk->no_vma)
> 	spin_unlock(ptl);
> pte_unmap(pte);
> 
> I'm not sure how to make sparse happy about that. Is the only option to
> have two versions of the walk_pte_range() function? One which takes the
> lock and one which doesn't.

Yes.

-- Luc
Qian Cai Dec. 11, 2019, 5:19 p.m. UTC | #4
> On Dec 11, 2019, at 10:54 AM, Steven Price <Steven.Price@arm.com> wrote:
> 
> I believe this is a false positive (although the trace here is useless).
> This patch adds a conditional lock/unlock:
> 
> pte = walk->no_vma ? pte_offset_map(pmd, addr) :
>             pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
> ...
> if (!walk->no_vma)
>    spin_unlock(ptl);
> pte_unmap(pte);
> 
> I'm not sure how to make sparse happy about that. Is the only option to
> have two versions of the walk_pte_range() function? One which takes the
> lock and one which doesn't.

Or just ignore the sparse false positive without complicating the code further.
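
The two-function alternative raised in the thread could look roughly like the sketch below. This is a userspace model, not kernel code and not the patch author's implementation: every model_* name is hypothetical, the "lock" is a plain counter standing in for spinlock_t, and the page-table mapping calls are omitted. It only illustrates the shape that avoids a conditional unlock: the loop moves into a lock-free inner helper, and the wrapper picks one of two branches, each of which is balanced on its own.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the model */

/* Hypothetical stand-in for spinlock_t: only tracks whether it is held. */
struct model_lock {
	int depth;		/* 1 while held, 0 otherwise */
	int acquisitions;	/* how many times the lock was taken */
};

struct model_walk {
	int no_vma;		/* mirrors walk->no_vma from the patch */
	struct model_lock *ptl;	/* only used when no_vma == 0 */
	int (*pte_entry)(unsigned long addr, struct model_walk *walk);
	unsigned long visited;	/* bookkeeping for the example callback */
};

static void model_lock_acquire(struct model_lock *l)
{
	assert(l->depth == 0);
	l->depth++;
	l->acquisitions++;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l->depth == 1);
	l->depth--;
}

/* Lock-free loop shared by both paths; same shape as the loop in the patch. */
static int model_walk_pte_range_inner(unsigned long addr, unsigned long end,
				      struct model_walk *walk)
{
	int err = 0;

	assert(addr < end);
	for (;;) {
		err = walk->pte_entry(addr, walk);
		if (err)
			break;
		addr += MODEL_PAGE_SIZE;
		if (addr == end)
			break;
	}
	return err;
}

/* Each branch takes and drops the lock unconditionally, or never touches it. */
static int model_walk_pte_range(unsigned long addr, unsigned long end,
				struct model_walk *walk)
{
	int err;

	if (walk->no_vma) {
		err = model_walk_pte_range_inner(addr, end, walk);
	} else {
		model_lock_acquire(walk->ptl);
		err = model_walk_pte_range_inner(addr, end, walk);
		model_lock_release(walk->ptl);
	}
	return err;
}

/* Example callback: counts visited entries and never fails. */
static int model_count_entry(unsigned long addr, struct model_walk *walk)
{
	(void)addr;
	walk->visited++;
	return 0;
}
```

In this shape no single code path contains an unlock guarded by a condition separate from the matching lock. Whether the real sparse checker accepts an equivalent kernel version has to be confirmed by running it.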

Patch

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index efa464cf079b..1b9a3ba24c51 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -10,9 +10,10 @@  static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	pte_t *pte;
 	int err = 0;
 	const struct mm_walk_ops *ops = walk->ops;
-	spinlock_t *ptl;
+	spinlock_t *uninitialized_var(ptl);
 
-	pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+	pte = walk->no_vma ? pte_offset_map(pmd, addr) :
+			     pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
 	for (;;) {
 		err = ops->pte_entry(pte, addr, addr + PAGE_SIZE, walk);
 		if (err)
@@ -23,7 +24,9 @@  static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 		pte++;
 	}
 
-	pte_unmap_unlock(pte, ptl);
+	if (!walk->no_vma)
+		spin_unlock(ptl);
+	pte_unmap(pte);
 	return err;
 }
 
@@ -383,6 +386,12 @@  int walk_page_range(struct mm_struct *mm, unsigned long start,
 	return err;
 }
 
+/*
+ * Similar to walk_page_range() but can walk any page tables even if they are
+ * not backed by VMAs. Because 'unusual' entries may be walked this function
+ * will also not lock the PTEs for the pte_entry() callback. This is useful for
+ * walking the kernel page tables or page tables for firmware.
+ */
 int walk_page_range_novma(struct mm_struct *mm, unsigned long start,
 			  unsigned long end, const struct mm_walk_ops *ops,
 			  void *private)