diff mbox series

filemap: Handle error return from __filemap_get_folio()

Message ID 20230506160415.2992089-1-willy@infradead.org (mailing list archive)
State New
Headers show
Series filemap: Handle error return from __filemap_get_folio() | expand

Commit Message

Matthew Wilcox May 6, 2023, 4:04 p.m. UTC
Smatch reports that filemap_fault() was missed in the conversion of
__filemap_get_folio() error returns from NULL to ERR_PTR.

Fixes: 66dabbb65d67 ("mm: return an ERR_PTR from __filemap_get_folio")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Reported-by: syzbot+48011b86c8ea329af1b9@syzkaller.appspotmail.com
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Linus Torvalds May 6, 2023, 4:09 p.m. UTC | #1
On Sat, May 6, 2023 at 9:04 AM Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> Smatch reports that filemap_fault() was missed in the conversion of
> __filemap_get_folio() error returns from NULL to ERR_PTR.

Thanks, applied directly.

            Linus
Linus Torvalds May 6, 2023, 4:35 p.m. UTC | #2
On Sat, May 6, 2023 at 9:09 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Thanks, applied directly.

Actually, I take that back.

I applied it, and then I looked at the context some more, and decided
that I hated it.

It's not that the patch itself was wrong, but the whole original

        if (folio)
                folio_put(folio);

was wrong in that context, and shouldn't have been done that way.

Converting the conditional to use !IS_ERR() fixes a conversion error,
but doesn't fix the original problem with the code.

The only path that triggered that "we have no folio" was wrong to jump
to that "should I drop this folio" code sequence in the first place.

So the fix is to not use "out_retry" at all for the IS_ERR(folio) case.

Yes, yes, compilers can do jump threading, and in a perfect world
might realize that there are two consecutive tests for IS_ERR(folio)
and just do this kind of conversion automatically.

But the fact that compilers *might* fix out our code to do the right
thing doesn't mean that the code shouldn't have been written to just
have the proper error handling nesting in the first place.

And yes, the simplest fix for the "wrong test" would be to just add a
new "out_nofolio" error case after "out_retry", and use that.

However, even that seems wrong, because the return value for that path
is the wrong one.

So the "goto out_retry" was actually *doubly* wrong.

If the __filemap_get_folio(FGP_CREAT) failed, then we should not
return VM_FAULT_RETRY at all. We should return VM_FAULT_OOM, like it
does for the no-fpin case.

So the whole

                if (IS_ERR(folio)) {
                        if (fpin)
                                goto out_retry;

sequence seems to be completely wrong in several ways. It shouldn't go
to "out_retry" at all, because this is simply not a "retry" case.

And that in turn means that "out_retry" shouldn't be testing folio at
all (whether for IS_ERR() _or_ for NULL).

So i really think the proper fix is this (whitespace-damaged) patch instead:

    --- a/mm/filemap.c
    +++ b/mm/filemap.c
    @@ -3291,7 +3291,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
                                              vmf->gfp_mask);
                    if (IS_ERR(folio)) {
    -                       if (fpin)
    -                               goto out_retry;
                            filemap_invalidate_unlock_shared(mapping);
    +                       if (fpin)
    +                               fput(fpin);
                            return VM_FAULT_OOM;
                    }
    @@ -3379,6 +3379,5 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
             * page.
             */
    -       if (folio)
    -               folio_put(folio);
    +       folio_put(folio);
            if (mapping_locked)
                    filemap_invalidate_unlock_shared(mapping);

which fixes both problems by simply avoiding a bogus 'goto' entirely.

Hmm?

                      Linus
Linus Torvalds May 6, 2023, 5:04 p.m. UTC | #3
On Sat, May 6, 2023 at 9:35 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> And yes, the simplest fix for the "wrong test" would be to just add a
> new "out_nofolio" error case after "out_retry", and use that.
>
> However, even that seems wrong, because the return value for that path
> is the wrong one.

Actually, my suggested patch is _also_ wrong.

The problem is that we do need to return VM_FAULT_RETRY to let the
caller know that we released the mmap_lock.

And once we return VM_FAULT_RETRY, the other error bits don't even matter.

So while I think the *right* thing to do is to return VM_FAULT_OOM |
VM_FAULT_RETRY, that doesn't actually end up working, because if
VM_FAULT_RETRY is set, the caller will know that "yes, mmap_lock was
dropped", but the callers will also just ignore the other bits and
unconditionally retry.

How very very annoying.

This was introduced several years ago by commit 6b4c9f446981
("filemap: drop the mmap_sem for all blocking operations").

Looking at that, we have at least one other similar error case wrong
too: the "page_not_uptodate" case carefully checks for IO errors and
retries only if there was no error (or for the AOP_TRUNCATED_PAGE)
case.

For an actual IO error on page reading, it returns VM_FAULT_SIGBUS.

Except - again - for that "if (fpin) goto out_retry" case, which will
just return VM_FAULT_RETRY and retry the fault.

I do not believe that retrying the fault is the right thing to do when
we ran out of memory, or when we had an IO error, and I do not think
it was intentional that the error handling was changed.

But I  think this is all just a mistake from how that VM_FAULT_RETRY
works in the callers.

How very very annoying.

So scratch that patch suggestion of mine, but let's bring in some
people involved with the original fpin code, and see if we can find
some solution that honors that error case too.

               Linus
Linus Torvalds May 6, 2023, 5:10 p.m. UTC | #4
On Sat, May 6, 2023 at 10:04 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So scratch that patch suggestion of mine, but let's bring in some
> people involved with the original fpin code, and see if we can find
> some solution that honors that error case too.

.. in the meantime, I did end up applying your patch.

I still don't like the code around it, but let's fix the conversion
error and worry about the other issues later.

                  Linus
Linus Torvalds May 6, 2023, 5:34 p.m. UTC | #5
On Sat, May 6, 2023 at 10:10 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> .. in the meantime, I did end up applying your patch.

Final (?) note on this: I not only applied your patch, but went
looking for any other missed cases.

And found one in fs/nfs/dir.c, which like ext4 had grown a new use of
__filemap_get_folio().

But unlike ext4 it hadn't been caught in linux-next (or I had just
missed it) and so I hadn't caught it in my merge either.

I hope that's the last one.

I grepped for all these __filemap_get_folio() cases when I did the MM
merge that brought in that change (and did the ext4 merge fixup), but
then the nfs pull happened later and I didn't think to check for new
cases...

A current grep seems to say that it's all good. But we had all missed
the second check in filemap_fault(), so...

                   Linus
Andrew Morton May 6, 2023, 5:41 p.m. UTC | #6
On Sat, 6 May 2023 10:34:31 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Sat, May 6, 2023 at 10:10 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > .. in the meantime, I did end up applying your patch.
> 
> Final (?) note on this: I not only applied your patch, but went
> looking for any other missed cases.
> 
> And found one in fs/nfs/dir.c, which like ext4 had grown a new use of
> __filemap_get_folio().
> 
> But unlike ext4 it hadn't been caught in linux-next (or I had just
> missed it) and so I hadn't caught it in my merge either.
> 
> I hope that's the last one.
> 
> I grepped for all these __filemap_get_folio() cases when I did the MM
> merge that brought in that change (and did the ext4 merge fixup), but
> then the nfs pull happened later and I didn't think to check for new
> cases...
> 
> A current grep seems to say that it's all good. But we had all missed
> the second check in filemap_fault(), so...
> 

We have a related afs fix which I plan to send over later today:

From: Christoph Hellwig <hch@lst.de>
Subject: afs: fix the afs_dir_get_folio return value
Date: Wed, 3 May 2023 17:45:26 +0200

Keep returning NULL on failure instead of letting an ERR_PTR escape to
callers that don't expect it.

Link: https://lkml.kernel.org/r/20230503154526.1223095-2-hch@lst.de
Fixes: 66dabbb65d67 ("mm: return an ERR_PTR from __filemap_get_folio")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/afs/dir_edit.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/fs/afs/dir_edit.c~afs-fix-the-afs_dir_get_folio-return-value
+++ a/fs/afs/dir_edit.c
@@ -115,11 +115,12 @@ static struct folio *afs_dir_get_folio(s
 	folio = __filemap_get_folio(mapping, index,
 				    FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
 				    mapping->gfp_mask);
-	if (IS_ERR(folio))
+	if (IS_ERR(folio)) {
 		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
-	else if (folio && !folio_test_private(folio))
+		return NULL;
+	}
+	if (!folio_test_private(folio))
 		folio_attach_private(folio, (void *)1);
-
 	return folio;
 }
Dan Carpenter May 8, 2023, 1:56 p.m. UTC | #7
On Sat, May 06, 2023 at 10:41:22AM -0700, Andrew Morton wrote:
> --- a/fs/afs/dir_edit.c~afs-fix-the-afs_dir_get_folio-return-value
> +++ a/fs/afs/dir_edit.c
> @@ -115,11 +115,12 @@ static struct folio *afs_dir_get_folio(s
>  	folio = __filemap_get_folio(mapping, index,
>  				    FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
>  				    mapping->gfp_mask);
> -	if (IS_ERR(folio))
> +	if (IS_ERR(folio)) {
>  		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
> -	else if (folio && !folio_test_private(folio))
> +		return NULL;
> +	}
> +	if (!folio_test_private(folio))
>  		folio_attach_private(folio, (void *)1);
> -
>  	return folio;
>  }

This one is quite tricky for Smatch.  I mentioned earlier that the
existing Smatch check for error pointer dereferences has some stupid
stuff going on.  I've replaced some of the stupid and I'll testing it
tonight.

1)  There is an existing check which complains if you have "if (p) "
    where p can be an error pointer, but not NULL.  If I revert the fix,
    I get the correct warning now.

    fs/afs/dir_edit.c:242 afs_edit_dir_add()
    warn: 'folio0' is an error pointer or valid *NEW*

2) There is an existing check for dereferencing error pointers.  However,
   I don't think kmap_local_folio() actually  dereferences the folio.
   The folio_nr_pages() function does, but depending on the .config,
   it's kind of complicated and buried inside a READ_ONCE().  I've
   improved the Smatch code for this but I don't have a solution yet.

3) I've created a new warning which generally complains about passing
   error pointers.  Obviously there are functions where that's normal,
   like passing error pointers to IS_ERR() and dev_err_probe().  It
   may or may not be useful.  I'll look at the warnings tomorrow.

    fs/afs/dir_edit.c:265 afs_edit_dir_add()
    warn: passing error pointer 'folio0' to folio_nr_pages()

regards,
dan carpenter
Dan Carpenter May 9, 2023, 7:43 a.m. UTC | #8
> 1)  There is an existing check which complains if you have "if (p) "
>     where p can be an error pointer, but not NULL.  If I revert the fix,
>     I get the correct warning now.
> 
>     fs/afs/dir_edit.c:242 afs_edit_dir_add()
>     warn: 'folio0' is an error pointer or valid *NEW*

I ran the new code last night.  There was one more folio bug (but every
function in the call tree triggers a warning).

fs/nfs/dir.c:405 nfs_readdir_folio_get_locked() warn: 'folio' is an error pointer or valid
fs/nfs/dir.c:1000 nfs_readdir_folio_get_cached() warn: 'folio' is an error pointer or valid
fs/nfs/dir.c:1019 find_and_lock_cache_page() warn: 'desc->folio' is an error pointer or valid

Other new warnings.  Mostly harmless checks for NULL.
drivers/usb/host/max3421-hcd.c:1913 max3421_probe() warn: 'max3421_hcd->spi_thread' is an error pointer or valid
drivers/block/aoe/aoecmd.c:1259 aoe_ktstart() warn: 'task' is an error pointer or valid
drivers/target/target_core_fabric_configfs.c:482 target_fabric_make_np() warn: 'se_tpg_np' is an error pointer or valid
drivers/media/i2c/rdacm20.c:641 rdacm20_probe() warn: 'dev->sensor' is an error pointer or valid
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:291 test_vcap_xn_rule_creator() warn: '__right' is an error pointer or valid
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:1349 vcap_api_encode_rule_test() warn: '__right' is an error pointer or valid
fs/configfs/dir.c:1339 configfs_mkdir() warn: 'item' is an error pointer or valid
kernel/cgroup/cgroup.c:5542 css_create() warn: 'css' is an error pointer or valid
sound/soc/apple/mca.c:955 mca_pcm_new() warn: 'chan' is an error pointer or valid
sound/soc/apple/mca.c:961 mca_pcm_new() warn: 'chan' is an error pointer or valid
lib/test_kmod.c:320 try_one_request() warn: 'info->task_sync' is an error pointer or valid
lib/test_firmware.c:918 trigger_batched_requests_store() warn: 'req->task' is an error pointer or valid

False postives based on my .config:
fs/ceph/cache.c:100 ceph_fscache_register_fs() warn: 'fsc->fscache' is an error pointer or valid
fs/erofs/fscache.c:354 erofs_fscache_register_volume() warn: 'volume' is an error pointer or valid

False positive because of a bug in Smatch:
fs/overlayfs/readdir.c:906 ovl_dir_fsync() warn: 'realfile' is an error pointer or valid

regards,
dan carpenter
Linus Torvalds May 9, 2023, 5:37 p.m. UTC | #9
On Tue, May 9, 2023 at 12:43 AM Dan Carpenter <dan.carpenter@linaro.org> wrote:
>
> I ran the new code last night.  There was one more folio bug (but every
> function in the call tree triggers a warning).
>
> fs/nfs/dir.c:405 nfs_readdir_folio_get_locked() warn: 'folio' is an error pointer or valid

Thanks. I fixed that one up too.

In the process I ended up grepping for the other wrappers around
__filemap_get_folio() that also return ERR_PTR's for errors.

I might have missed some - this was just manual, after all - but while
I'm sure smatch finds those NULL pointer confusions, what I *did* find
was a lack of testing the result at all.

In fs/btrfs/extent_io.c, we have

    while (index <= end_index) {
        folio = filemap_get_folio(mapping, index);
        filemap_dirty_folio(mapping, folio);

and in fs/gfs2/lops.c we have a similar case of filemap_get_folio ->
folio_wait_locked use without checking for any errors.

I assume it's probably hard to trigger errors in those paths, but it
does look dodgy.

Added some relevant people to the participants to let them know...

I wonder if smatch can figure out automatically when error pointers do
*not* work, and need checking for.

A lot of places do not need to check for them, because they just
return the error pointer further up the call chain, but anything that
then actually dereferences it should have checked for them (same as
for NULL, of course).

               Linus
Johannes Weiner May 9, 2023, 7:19 p.m. UTC | #10
On Sat, May 06, 2023 at 10:04:48AM -0700, Linus Torvalds wrote:
> On Sat, May 6, 2023 at 9:35 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > And yes, the simplest fix for the "wrong test" would be to just add a
> > new "out_nofolio" error case after "out_retry", and use that.
> >
> > However, even that seems wrong, because the return value for that path
> > is the wrong one.
> 
> Actually, my suggested patch is _also_ wrong.
> 
> The problem is that we do need to return VM_FAULT_RETRY to let the
> caller know that we released the mmap_lock.
> 
> And once we return VM_FAULT_RETRY, the other error bits don't even matter.
> 
> So while I think the *right* thing to do is to return VM_FAULT_OOM |
> VM_FAULT_RETRY, that doesn't actually end up working, because if
> VM_FAULT_RETRY is set, the caller will know that "yes, mmap_lock was
> dropped", but the callers will also just ignore the other bits and
> unconditionally retry.
> 
> How very very annoying.
> 
> This was introduced several years ago by commit 6b4c9f446981
> ("filemap: drop the mmap_sem for all blocking operations").
> 
> Looking at that, we have at least one other similar error case wrong
> too: the "page_not_uptodate" case carefully checks for IO errors and
> retries only if there was no error (or for the AOP_TRUNCATED_PAGE)
> case.
> 
> For an actual IO error on page reading, it returns VM_FAULT_SIGBUS.
> 
> Except - again - for that "if (fpin) goto out_retry" case, which will
> just return VM_FAULT_RETRY and retry the fault.
> 
> I do not believe that retrying the fault is the right thing to do when
> we ran out of memory, or when we had an IO error, and I do not think
> it was intentional that the error handling was changed.

This is a while ago and the code has changed quite a bit since, so
bear with me.

Originally, we only ever did a maximum of two tries: one where the
lock might be dropped to kick off IO, then a synchronous one. IIRC the
thinking at the time was that events like OOMs and IO failures are
rare enough that doing the retry anyway (even if somewhat pointless)
and reacting to the issue then (if it persisted) was a tradeoff to
keep the retry logic simple.

Since 4064b9827063 ("mm: allow VM_FAULT_RETRY for multiple times") we
don't clear FAULT_FLAG_ALLOW_RETRY anymore though, and we might see
more than one loop. At least outside the page cache. So I agree it
makes sense to look at the return value more carefully and act on
errors more timely in the arch handler.

Draft patch below. It survives a boot and a will-it-scale smoke test,
but I haven't put it through the grinder yet.

One thing that gave me pause is this comment:

	/*
	 * If we need to retry the mmap_lock has already been released,
	 * and if there is a fatal signal pending there is no guarantee
	 * that we made any progress. Handle this case first.
	 */

I think it made sense when it was added in 26178ec11ef3 ("x86: mm:
consolidate VM_FAULT_RETRY handling"). But after 39678191cd89
("x86/mm: use helper fault_signal_pending()") it's in a misleading
location, since the signal handling is above it.

So I'm removing it, but please let me know if I missed something.

---
 arch/x86/mm/fault.c | 40 +++++++++++++++++++++++-----------------
 mm/filemap.c        | 18 +++++++++++-------
 2 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e4399983c50c..f1d242be723f 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1456,20 +1456,15 @@ void do_user_addr_fault(struct pt_regs *regs,
 		return;
 
 	/*
-	 * If we need to retry the mmap_lock has already been released,
-	 * and if there is a fatal signal pending there is no guarantee
-	 * that we made any progress. Handle this case first.
+	 * If we need to retry the mmap_lock has already been released.
 	 */
-	if (unlikely(fault & VM_FAULT_RETRY)) {
-		flags |= FAULT_FLAG_TRIED;
-		goto retry;
-	}
+	if (likely(!(fault & VM_FAULT_RETRY)))
+		mmap_read_unlock(mm);
 
-	mmap_read_unlock(mm);
 #ifdef CONFIG_PER_VMA_LOCK
 done:
 #endif
-	if (likely(!(fault & VM_FAULT_ERROR)))
+	if (likely(!(fault & (VM_FAULT_ERROR|VM_FAULT_RETRY))))
 		return;
 
 	if (fatal_signal_pending(current) && !user_mode(regs)) {
@@ -1493,15 +1488,26 @@ void do_user_addr_fault(struct pt_regs *regs,
 		 * oom-killed):
 		 */
 		pagefault_out_of_memory();
-	} else {
-		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
-			     VM_FAULT_HWPOISON_LARGE))
-			do_sigbus(regs, error_code, address, fault);
-		else if (fault & VM_FAULT_SIGSEGV)
-			bad_area_nosemaphore(regs, error_code, address);
-		else
-			BUG();
+		return;
+	}
+
+	if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
+		     VM_FAULT_HWPOISON_LARGE)) {
+		do_sigbus(regs, error_code, address, fault);
+		return;
 	}
+
+	if (fault & VM_FAULT_SIGSEGV) {
+		bad_area_nosemaphore(regs, error_code, address);
+		return;
+	}
+
+	if (fault & VM_FAULT_RETRY) {
+		flags |= FAULT_FLAG_TRIED;
+		goto retry;
+	}
+
+	BUG();
 }
 NOKPROBE_SYMBOL(do_user_addr_fault);
 
diff --git a/mm/filemap.c b/mm/filemap.c
index b4c9bd368b7e..f97ca5045c19 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3290,10 +3290,11 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 					  FGP_CREAT|FGP_FOR_MMAP,
 					  vmf->gfp_mask);
 		if (IS_ERR(folio)) {
+			ret = VM_FAULT_OOM;
 			if (fpin)
 				goto out_retry;
 			filemap_invalidate_unlock_shared(mapping);
-			return VM_FAULT_OOM;
+			return ret;
 		}
 	}
 
@@ -3362,15 +3363,18 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	 */
 	fpin = maybe_unlock_mmap_for_io(vmf, fpin);
 	error = filemap_read_folio(file, mapping->a_ops->read_folio, folio);
-	if (fpin)
-		goto out_retry;
 	folio_put(folio);
-
-	if (!error || error == AOP_TRUNCATED_PAGE)
+	folio = NULL;
+	if (!error || error == AOP_TRUNCATED_PAGE) {
+		if (fpin)
+			goto out_retry;
 		goto retry_find;
+	}
+	ret = VM_FAULT_SIGBUS;
+	if (fpin)
+		goto out_retry;
 	filemap_invalidate_unlock_shared(mapping);
-
-	return VM_FAULT_SIGBUS;
+	return ret;
 
 out_retry:
 	/*
Christoph Hellwig May 9, 2023, 8:49 p.m. UTC | #11
On Tue, May 09, 2023 at 10:37:12AM -0700, Linus Torvalds wrote:
> In fs/btrfs/extent_io.c, we have
> 
>     while (index <= end_index) {
>         folio = filemap_get_folio(mapping, index);
>         filemap_dirty_folio(mapping, folio);
> 
> and in fs/gfs2/lops.c we have a similar case of filemap_get_folio ->
> folio_wait_locked use without checking for any errors.
> 
> I assume it's probably hard to trigger errors in those paths, but it
> does look dodgy.

The pages are pinned at the point.  That being said this code is
absolutely for many other reasons and needs to go away.  I've been
looking into it, but it will probably take several more months to
unpile the onion to get to the hear of the btrfs writeback code
problems..
Peter Xu May 10, 2023, 8:27 p.m. UTC | #12
On Tue, May 09, 2023 at 03:19:18PM -0400, Johannes Weiner wrote:
> On Sat, May 06, 2023 at 10:04:48AM -0700, Linus Torvalds wrote:
> > On Sat, May 6, 2023 at 9:35 AM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > And yes, the simplest fix for the "wrong test" would be to just add a
> > > new "out_nofolio" error case after "out_retry", and use that.
> > >
> > > However, even that seems wrong, because the return value for that path
> > > is the wrong one.
> > 
> > Actually, my suggested patch is _also_ wrong.
> > 
> > The problem is that we do need to return VM_FAULT_RETRY to let the
> > caller know that we released the mmap_lock.
> > 
> > And once we return VM_FAULT_RETRY, the other error bits don't even matter.
> > 
> > So while I think the *right* thing to do is to return VM_FAULT_OOM |
> > VM_FAULT_RETRY, that doesn't actually end up working, because if
> > VM_FAULT_RETRY is set, the caller will know that "yes, mmap_lock was
> > dropped", but the callers will also just ignore the other bits and
> > unconditionally retry.
> > 
> > How very very annoying.
> > 
> > This was introduced several years ago by commit 6b4c9f446981
> > ("filemap: drop the mmap_sem for all blocking operations").
> > 
> > Looking at that, we have at least one other similar error case wrong
> > too: the "page_not_uptodate" case carefully checks for IO errors and
> > retries only if there was no error (or for the AOP_TRUNCATED_PAGE)
> > case.
> > 
> > For an actual IO error on page reading, it returns VM_FAULT_SIGBUS.
> > 
> > Except - again - for that "if (fpin) goto out_retry" case, which will
> > just return VM_FAULT_RETRY and retry the fault.
> > 
> > I do not believe that retrying the fault is the right thing to do when
> > we ran out of memory, or when we had an IO error, and I do not think
> > it was intentional that the error handling was changed.
> 
> This is a while ago and the code has changed quite a bit since, so
> bear with me.
> 
> Originally, we only ever did a maximum of two tries: one where the
> lock might be dropped to kick off IO, then a synchronous one. IIRC the
> thinking at the time was that events like OOMs and IO failures are
> rare enough that doing the retry anyway (even if somewhat pointless)
> and reacting to the issue then (if it persisted) was a tradeoff to
> keep the retry logic simple.
> 
> Since 4064b9827063 ("mm: allow VM_FAULT_RETRY for multiple times") we
> don't clear FAULT_FLAG_ALLOW_RETRY anymore though, and we might see
> more than one loop. At least outside the page cache. So I agree it
> makes sense to look at the return value more carefully and act on
> errors more timely in the arch handler.
> 
> Draft patch below. It survives a boot and a will-it-scale smoke test,
> but I haven't put it through the grinder yet.
> 
> One thing that gave me pause is this comment:
> 
> 	/*
> 	 * If we need to retry the mmap_lock has already been released,
> 	 * and if there is a fatal signal pending there is no guarantee
> 	 * that we made any progress. Handle this case first.
> 	 */
> 
> I think it made sense when it was added in 26178ec11ef3 ("x86: mm:
> consolidate VM_FAULT_RETRY handling"). But after 39678191cd89
> ("x86/mm: use helper fault_signal_pending()") it's in a misleading
> location, since the signal handling is above it.
> 
> So I'm removing it, but please let me know if I missed something.
> 
> ---
>  arch/x86/mm/fault.c | 40 +++++++++++++++++++++++-----------------
>  mm/filemap.c        | 18 +++++++++++-------
>  2 files changed, 34 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index e4399983c50c..f1d242be723f 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1456,20 +1456,15 @@ void do_user_addr_fault(struct pt_regs *regs,
>  		return;
>  
>  	/*
> -	 * If we need to retry the mmap_lock has already been released,
> -	 * and if there is a fatal signal pending there is no guarantee
> -	 * that we made any progress. Handle this case first.
> +	 * If we need to retry the mmap_lock has already been released.
>  	 */
> -	if (unlikely(fault & VM_FAULT_RETRY)) {
> -		flags |= FAULT_FLAG_TRIED;
> -		goto retry;
> -	}
> +	if (likely(!(fault & VM_FAULT_RETRY)))
> +		mmap_read_unlock(mm);
>  
> -	mmap_read_unlock(mm);
>  #ifdef CONFIG_PER_VMA_LOCK
>  done:
>  #endif
> -	if (likely(!(fault & VM_FAULT_ERROR)))
> +	if (likely(!(fault & (VM_FAULT_ERROR|VM_FAULT_RETRY))))
>  		return;
>  
>  	if (fatal_signal_pending(current) && !user_mode(regs)) {
> @@ -1493,15 +1488,26 @@ void do_user_addr_fault(struct pt_regs *regs,
>  		 * oom-killed):
>  		 */
>  		pagefault_out_of_memory();
> -	} else {
> -		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
> -			     VM_FAULT_HWPOISON_LARGE))
> -			do_sigbus(regs, error_code, address, fault);
> -		else if (fault & VM_FAULT_SIGSEGV)
> -			bad_area_nosemaphore(regs, error_code, address);
> -		else
> -			BUG();
> +		return;
> +	}
> +
> +	if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
> +		     VM_FAULT_HWPOISON_LARGE)) {
> +		do_sigbus(regs, error_code, address, fault);
> +		return;
>  	}
> +
> +	if (fault & VM_FAULT_SIGSEGV) {
> +		bad_area_nosemaphore(regs, error_code, address);
> +		return;
> +	}
> +
> +	if (fault & VM_FAULT_RETRY) {
> +		flags |= FAULT_FLAG_TRIED;
> +		goto retry;
> +	}
> +
> +	BUG();
>  }
>  NOKPROBE_SYMBOL(do_user_addr_fault);
>  
> diff --git a/mm/filemap.c b/mm/filemap.c
> index b4c9bd368b7e..f97ca5045c19 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3290,10 +3290,11 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
>  					  FGP_CREAT|FGP_FOR_MMAP,
>  					  vmf->gfp_mask);
>  		if (IS_ERR(folio)) {
> +			ret = VM_FAULT_OOM;
>  			if (fpin)
>  				goto out_retry;
>  			filemap_invalidate_unlock_shared(mapping);
> -			return VM_FAULT_OOM;
> +			return ret;
>  		}
>  	}
>  
> @@ -3362,15 +3363,18 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
>  	 */
>  	fpin = maybe_unlock_mmap_for_io(vmf, fpin);
>  	error = filemap_read_folio(file, mapping->a_ops->read_folio, folio);
> -	if (fpin)
> -		goto out_retry;
>  	folio_put(folio);
> -
> -	if (!error || error == AOP_TRUNCATED_PAGE)
> +	folio = NULL;
> +	if (!error || error == AOP_TRUNCATED_PAGE) {
> +		if (fpin)
> +			goto out_retry;
>  		goto retry_find;
> +	}
> +	ret = VM_FAULT_SIGBUS;
> +	if (fpin)
> +		goto out_retry;
>  	filemap_invalidate_unlock_shared(mapping);
> -
> -	return VM_FAULT_SIGBUS;
> +	return ret;
>  
>  out_retry:
>  	/*
> -- 
> 2.40.1
> 

The change looks all right to me.

Acked-by: Peter Xu <peterx@redhat.com>

For the long term maybe we want to cleanup a bit on the VM_FAULT_* entries,
e.g., here VM_FAULT_RETRY doesn't really mean "we should retry the fault"
but instead only the hint to show that we'ver released the mmap lock when
any error is set.

Meanwhile it's also not the only one to express that because now we also
have VM_FAULT_COMPLETED, so it's debatable which one we should use here
purely from the logic and definition of the retvals.

One thing we can make this slightly cleaner in the future is we can have a
flag sololy for "we have released the mmap lock", so fundamentally we can
consider renaming COMPLETE->MM_RELEASED, then we use RETRY to only hint
"whether we really want to retry" and leave "whether mmap lock released"
for the new flag.

It could look like:

  MM_RELEASED    (new) RETRY
  0              0             -> page fault resolved (old "retval=0")
  0              1             -> retry with mmap held (currently invalid, so let's ignore this)
  1              0             -> resolved and lock released (old COMPLETE)
  1              1             -> lock releaesd,need to retry (old RETRY)

Then IIUC for error cases we can return MM_RELEASED|ERROR for whatever
error (without setting RETRY).

Not sure whether it helps in any form.  Even if it would, it can definitely
be done on top.

Thanks,
Linus Torvalds May 10, 2023, 9:33 p.m. UTC | #13
On Wed, May 10, 2023 at 3:27 PM Peter Xu <peterx@redhat.com> wrote:
>
> The change looks all right to me.

I hate how this path has turned very architecture-specific again, and
this patch makes it worse.

For a short while we shared a lot of the code between architectures
thanks to the new fault helpers (ie fault_signal_pending() and
friends). Now CONFIG_PER_VMA_LOCK has made a mockery of that, and this
patch makes it worse by just fixing x86.

As is historically always the case.

And I think that patch is actually buggy. Because it does that

     ret = VM_FAULT_SIGBUS;
     if (fpin)
             goto out_retry;

in the fault path, and now returns VM_FAULT_SIGBUS | VM_FAULT_RETRY as a result.

And there are a lot of users that will be *very* unhappy about that combination.

Look at mm/gup.c, for example. It will do

        if (ret & VM_FAULT_ERROR) {
                int err = vm_fault_to_errno(ret, *flags);

                if (err)
                        return err;
       ...

and not react to VM_FAULT_RETRY being set at all - so it won't clear
the "locked" variable, for example.

And that all just makes it very clear how much of a painpoint that
conditional lock release is. It's horrendous.

That's always a huge pain in general, and it really complicates how
things get handled. It has very real and obvious historical reasons
("we never released the lock" being converted one case at a time to
"we release the lock in this situation and set the bit to tell you"),
but I wonder if we should just say "handle_mm_fault() _always_
releases the lock".

We'd still keep the RETRY bit as a "this did not complete, you need to
retry", but at least the whole secondary meaning of "oh, and if it
isn't set, you need to release the lock you took" would go away.
Because RETRY has really annoying semantics today.

Again - I know exactly _why_ it has those horrendous semantics, and
I'm not blaming anybody, but it makes it really painful to deal with
15 different architectures that then have to deal with those things.

How painful would it be to try to take baby steps and remove those
kinds of things and slowly move towards a situation where RETRY isn't
such a magically special bit?

Peter Xu, you did a lot of the helper function cleanups, and moved
some of the accounting code into generic code. It's a few years ago,
but maybe you still remember this area.. And Luto, I'm adding you to
the participants because you've touched that code more than most.

Could we make the rule be that handle_mm_fault() just *always*
releases the lock? How painful would that be?

For many of the architectures, I *think* it would mean just removing the

        mmap_read_unlock(mm);

in the fault handling path (after the RETRY test).

But there's a couple of other uses of handle_mm_fault() too, notably GUP.

Am I just dreaming? Or could we try to simplify this area first?
Because I think Johannes' patch right now is quite buggy, and only
converted one caller, when a lot of other callers will currently find
it very problematic to see both VM_FAULT_RETRY and one of the error
bits..

It's also possible that I'm misreading things. But I really think the
lock handling would be a lot easier if the rule was "it is locked on
entry, it's always unlocked on exit".

Of course, with the vma locking, the "it" above is nontrivial too.
That sure didn't simplify thinking about this all....

              Linus
Linus Torvalds May 10, 2023, 9:44 p.m. UTC | #14
On Wed, May 10, 2023 at 4:33 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> We'd still keep the RETRY bit as a "this did not complete, you need to
> retry", but at least the whole secondary meaning of "oh, and if it
> isn't set, you need to release the lock you took" would go away.

"unless VM_FAULT_COMPLETED is set, in which case everything was fine,
and you shouldn't release the lock because we already released it".

I completely forgot about that wart that came in last year.

I think that if we made handle_mm_fault() always unlock, that thing
would go away entirely, since "0" would now just mean the same thing.

Is there really any case that *wants* to keep the mmap lock held, and
couldn't just always re-take it if it needs to do another page
(possibly retry, but the retry case obviously already has that issue)?

Certainly nothing wants the vma lock, so it's only the "mmap_sem" case
that would be an issue.

              Linus
Peter Xu May 11, 2023, 4:45 a.m. UTC | #15
On Wed, May 10, 2023 at 04:44:59PM -0500, Linus Torvalds wrote:
> On Wed, May 10, 2023 at 4:33 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > We'd still keep the RETRY bit as a "this did not complete, you need to
> > retry", but at least the whole secondary meaning of "oh, and if it
> > isn't set, you need to release the lock you took" would go away.
> 
> "unless VM_FAULT_COMPLETED is set, in which case everything was fine,
> and you shouldn't release the lock because we already released it".
> 
> I completely forgot about that wart that came in last year.
> 
> I think that if we made handle_mm_fault() always unlock, that thing
> would go away entirely, since "0" would now just mean the same thing.
> 
> Is there really any case that *wants* to keep the mmap lock held, and
> couldn't just always re-take it if it needs to do another page
> (possibly retry, but the retry case obviously already has that issue)?

FAULT_FLAG_RETRY_NOWAIT?

> Certainly nothing wants the vma lock, so it's only the "mmap_sem" case
> that would be an issue.

You're definitely right that the gup path is broken which I didn't notice
when reading...  I know I shouldn't review such a still slightly involved
patch during travel. :(

I still think maybe we have chance to generalize at least the fault path,
I'd still start with something like having just 2-3 archs having a shared
routine handle only some part of the fault path (I remember there was a
previous discussion previously, but I didn't follow up much from there..).

So even if we still need duplicates over many archs, we'll start to have
something we can use as a baseline in fault path.  Does it sound a sane
thing to consider as a start, or maybe not?

The other question - considering RETRY_NOWAIT being there - do we still
want to have something like what Johannes proposed first to fix the problem
(with all arch and gup fixed)?  I'd think yes, but I could missed something.

Thanks,
Dan Carpenter May 11, 2023, 9:44 a.m. UTC | #16
On Tue, May 09, 2023 at 10:37:12AM -0700, Linus Torvalds wrote:
> In fs/btrfs/extent_io.c, we have
> 
>     while (index <= end_index) {
>         folio = filemap_get_folio(mapping, index);
>         filemap_dirty_folio(mapping, folio);
> 

I have re-worked the Smatch check for warnings about dereferencing
error pointer so now it warns about that:

fs/btrfs/extent_io.c:224 extent_range_redirty_for_io() error: 'folio' dereferencing possible ERR_PTR()

I've also made a list of functions which don't necessarily need to be
checked so it's easy to silence false positives now.
https://github.com/error27/smatch/blob/master/check_err_ptr_deref.c

regards,
dan carpenter
block/partitions/core.c:578 blk_add_partition() error: 'part' dereferencing possible ERR_PTR()
drivers/base/firmware_loader/main.c:820 fw_log_firmware_info() error: 'alg' dereferencing possible ERR_PTR()
drivers/cpufreq/imx6q-cpufreq.c:483 imx6q_cpufreq_probe() error: 'opp' dereferencing possible ERR_PTR()
drivers/cpufreq/imx6q-cpufreq.c:486 imx6q_cpufreq_probe() error: 'opp' dereferencing possible ERR_PTR()
drivers/crypto/caam/caamalg_qi2.c:1289 aead_encrypt_done() error: 'req_ctx' dereferencing possible ERR_PTR()
drivers/crypto/caam/caamalg_qi2.c:1310 aead_decrypt_done() error: 'req_ctx' dereferencing possible ERR_PTR()
drivers/crypto/caam/caamalg_qi2.c:1398 skcipher_encrypt_done() error: 'req_ctx' dereferencing possible ERR_PTR()
drivers/crypto/caam/caamalg_qi2.c:1436 skcipher_decrypt_done() error: 'req_ctx' dereferencing possible ERR_PTR()
drivers/crypto/sa2ul.c:1056 sa_aes_dma_in_callback() error: 'mdptr' dereferencing possible ERR_PTR()
drivers/crypto/sa2ul.c:1275 sa_run() error: 'mdptr' dereferencing possible ERR_PTR()
drivers/crypto/sa2ul.c:1372 sa_sha_dma_in_callback() error: 'mdptr' dereferencing possible ERR_PTR()
drivers/crypto/sa2ul.c:1711 sa_aead_dma_in_callback() error: 'mdptr' dereferencing possible ERR_PTR()
drivers/crypto/sa2ul.c:1714 sa_aead_dma_in_callback() error: 'mdptr' dereferencing possible ERR_PTR()
drivers/crypto/sa2ul.c:1721 sa_aead_dma_in_callback() error: 'mdptr' dereferencing possible ERR_PTR()
drivers/devfreq/mtk-cci-devfreq.c:388 mtk_ccifreq_probe() error: 'drv->sram_reg' dereferencing possible ERR_PTR()
drivers/gpio/gpiolib.c:332 gpio_name_to_desc() error: 'desc' dereferencing possible ERR_PTR()
drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c:265 komeda_component_get_avail_scaler() error: 'pipe_st' dereferencing possible ERR_PTR()
drivers/gpu/drm/exynos/exynos_hdmi.c:1864 hdmi_bind() error: 'crtc' dereferencing possible ERR_PTR()
drivers/gpu/drm/imx/lcdc/imx-lcdc.c:408 imx_lcdc_probe() error: 'lcdc' dereferencing possible ERR_PTR()
drivers/gpu/drm/kmb/kmb_drv.c:578 kmb_probe() error: 'kmb->kmb_dsi' dereferencing possible ERR_PTR()
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:103 a5xx_submit_in_rb() error: 'ptr' dereferencing possible ERR_PTR()
drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c:107 pingpong_tearcheck_disable() error: 'mixer' dereferencing possible ERR_PTR()
drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c:28 pingpong_tearcheck_setup() error: 'mixer' dereferencing possible ERR_PTR()
drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c:81 pingpong_tearcheck_enable() error: 'mixer' dereferencing possible ERR_PTR()
drivers/gpu/drm/sun4i/sun4i_backend.c:493 sun4i_backend_atomic_check() error: 'plane_state' dereferencing possible ERR_PTR()
drivers/gpu/drm/tegra/nvdec.c:284 nvdec_load_falcon_firmware() error: 'nvdec->falcon.firmware.virt' dereferencing possible ERR_PTR()
drivers/gpu/drm/tests/drm_client_modeset_test.c:71 drm_client_modeset_test_init() error: 'priv->drm' dereferencing possible ERR_PTR()
drivers/gpu/drm/tests/drm_managed_test.c:44 drm_test_managed_run_action() error: 'drm' dereferencing possible ERR_PTR()
drivers/gpu/drm/tests/drm_managed_test.c:47 drm_test_managed_run_action() error: 'drm' dereferencing possible ERR_PTR()
drivers/gpu/drm/tests/drm_managed_test.c:50 drm_test_managed_run_action() error: 'drm' dereferencing possible ERR_PTR()
drivers/gpu/drm/tests/drm_probe_helper_test.c:52 drm_probe_helper_test_init() error: 'priv->drm' dereferencing possible ERR_PTR()
drivers/gpu/drm/vc4/tests/vc4_mock.c:173 __mock_device() error: 'vc4' dereferencing possible ERR_PTR()
drivers/gpu/drm/vc4/tests/vc4_mock.c:174 __mock_device() error: 'vc4' dereferencing possible ERR_PTR()
drivers/gpu/drm/vc4/tests/vc4_mock.c:176 __mock_device() error: 'vc4' dereferencing possible ERR_PTR()
drivers/gpu/drm/vc4/tests/vc4_mock.c:177 __mock_device() error: 'vc4' dereferencing possible ERR_PTR()
drivers/iio/adc/ad7949.c:387 ad7949_spi_probe() error: 'ad7949_adc->vref' dereferencing possible ERR_PTR()
drivers/infiniband/core/uverbs_std_types_counters.c:110 ib_uverbs_handler_UVERBS_METHOD_COUNTERS_READ() error: 'uattr' dereferencing possible ERR_PTR()
drivers/iommu/apple-dart.c:879 apple_dart_device_group() error: 'group' dereferencing possible ERR_PTR()
drivers/iommu/arm/arm-smmu/arm-smmu-impl.c:226 arm_smmu_impl_init() error: 'smmu' dereferencing possible ERR_PTR()
drivers/md/bcache/btree.c:1510 btree_gc_rewrite_node() error: 'n' dereferencing possible ERR_PTR()
drivers/md/bcache/btree.c:1515 btree_gc_rewrite_node() error: 'n' dereferencing possible ERR_PTR()
drivers/media/i2c/tc358746.c:434 tc358746_apply_misc_config() error: 'fmt' dereferencing possible ERR_PTR()
drivers/media/i2c/tc358746.c:790 tc358746_set_fmt() error: 'fmt' dereferencing possible ERR_PTR()
drivers/media/i2c/tc358746.c:918 tc358746_link_validate() error: 'fmt' dereferencing possible ERR_PTR()
drivers/media/platform/mediatek/vcodec/vdec/vdec_h264_req_if.c:381 vdec_h264_slice_decode() error: 'dec_params' dereferencing possible ERR_PTR()
drivers/media/platform/st/stm32/stm32-dcmi.c:1035 dcmi_try_fmt() error: 'state' dereferencing possible ERR_PTR()
drivers/media/platform/st/stm32/stm32-dcmi.c:1196 dcmi_set_sensor_format() error: 'state' dereferencing possible ERR_PTR()
drivers/mmc/host/sdhci.c:4758 sdhci_setup_host() error: 'mmc->supply.vqmmc' dereferencing possible ERR_PTR()
drivers/mtd/ubi/fastmap.c:356 process_pool_aeb() error: 'av' dereferencing possible ERR_PTR()
drivers/mtd/ubi/vtbl.c:595 init_volumes() error: 'av' dereferencing possible ERR_PTR()
drivers/mtd/ubi/vtbl.c:745 check_attaching_info() error: 'av' dereferencing possible ERR_PTR()
drivers/mtd/ubi/vtbl.c:762 check_attaching_info() error: 'av' dereferencing possible ERR_PTR()
drivers/mtd/ubi/vtbl.c:765 check_attaching_info() error: 'av' dereferencing possible ERR_PTR()
drivers/mtd/ubi/vtbl.c:820 ubi_read_volume_table() error: 'av' dereferencing possible ERR_PTR()
drivers/net/ethernet/broadcom/bcmsysport.c:2343 bcm_sysport_map_queues() error: 'dp' dereferencing possible ERR_PTR()
drivers/net/ethernet/broadcom/bcmsysport.c:2395 bcm_sysport_unmap_queues() error: 'dp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c:204 cn10k_alloc_leaf_profile() error: 'rsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c:1756 otx2_get_max_mtu() error: 'rsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c:321 otx2_config_priority_flow_ctrl() error: 'rsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/otx2_dmac_flt.c:204 otx2_dmacflt_update() error: 'rsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/otx2_dmac_flt.c:31 otx2_dmacflt_do_add() error: 'rsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c:1070 otx2_set_fecparam() error: 'rsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c:333 otx2_get_pauseparam() error: 'rsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c:125 otx2_alloc_mcam_entries() error: 'rsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c:203 otx2_mcam_entry_init() error: 'rsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c:238 otx2_mcam_entry_init() error: 'frsp' dereferencing possible ERR_PTR()
drivers/net/ethernet/mediatek/mtk_eth_soc.c:3198 mtk_device_event() error: 'dp' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:1082 esw_add_fdb_peer_miss_rules() error: 'vport' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:1094 esw_add_fdb_peer_miss_rules() error: 'vport' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:1124 esw_add_fdb_peer_miss_rules() error: 'vport' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:1129 esw_add_fdb_peer_miss_rules() error: 'vport' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:1152 esw_del_fdb_peer_miss_rules() error: 'vport' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:1157 esw_del_fdb_peer_miss_rules() error: 'vport' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:2606 esw_unset_master_egress_rule() error: 'vport' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:3388 esw_offloads_devlink_ns_eq_netdev_ns() error: 'esw' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c:1683 dr_ste_v0_build_src_gvmi_qpn_tag() error: 'vport_cap' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c:2014 dr_ste_v1_build_src_gvmi_qpn_tag() error: 'vport_cap' dereferencing possible ERR_PTR()
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:8506 __mlxsw_sp_port_vlan_router_join() error: 'fid' dereferencing possible ERR_PTR()
drivers/net/ethernet/sfc/mae.c:606 efx_mae_lookup_mport() error: 'm' dereferencing possible ERR_PTR()
drivers/net/ethernet/sfc/tc.c:454 efx_tc_flower_record_encap_match() error: 'old' dereferencing possible ERR_PTR()
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:5405 stmmac_rx() error: 'skb' dereferencing possible ERR_PTR()
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:5417 stmmac_rx() error: 'skb' dereferencing possible ERR_PTR()
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:5435 stmmac_rx() error: 'skb' dereferencing possible ERR_PTR()
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1003 brcmf_chip_recognition() error: 'core' dereferencing possible ERR_PTR()
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1006 brcmf_chip_recognition() error: 'core' dereferencing possible ERR_PTR()
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1009 brcmf_chip_recognition() error: 'core' dereferencing possible ERR_PTR()
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1012 brcmf_chip_recognition() error: 'core' dereferencing possible ERR_PTR()
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1015 brcmf_chip_recognition() error: 'core' dereferencing possible ERR_PTR()
drivers/nfc/pn533/pn533.c:1422 pn533_autopoll_complete() error: 'resp' dereferencing possible ERR_PTR()
drivers/nfc/pn533/pn533.c:1519 pn533_poll_complete() error: 'resp' dereferencing possible ERR_PTR()
drivers/nfc/pn533/pn533.c:1524 pn533_poll_complete() error: 'resp' dereferencing possible ERR_PTR()
drivers/phy/sunplus/phy-sunplus-usb2.c:92 update_disc_vol() error: 'cell' dereferencing possible ERR_PTR()
drivers/phy/tegra/xusb.c:703 tegra_xusb_setup_usb_role_switch() error: 'lane' dereferencing possible ERR_PTR()
drivers/platform/surface/surface3_power.c:552 mshw0011_probe() error: 'data->poll_task' dereferencing possible ERR_PTR()
drivers/regulator/core.c:5604 regulator_register() error: 'init_data' dereferencing possible ERR_PTR()
drivers/scsi/qla4xxx/ql4_os.c:6941 qla4xxx_sess_conn_setup() error: 'ep' dereferencing possible ERR_PTR()
drivers/soc/qcom/icc-bwmon.c:691 bwmon_intr_thread() error: 'target_opp' dereferencing possible ERR_PTR()
drivers/soc/qcom/socinfo.c:750 socinfo_debugfs_init() error: 'versions' dereferencing possible ERR_PTR()
drivers/thermal/rcar_thermal.c:498 rcar_thermal_probe() error: 'priv->zone' dereferencing possible ERR_PTR()
drivers/tty/serial/max310x.c:1621 max310x_i2c_probe() error: 'port_client' dereferencing possible ERR_PTR()
drivers/usb/gadget/function/uvc_v4l2.c:124 find_format_by_pix() error: 'fmtdesc' dereferencing possible ERR_PTR()
drivers/usb/gadget/function/uvc_v4l2.c:378 uvc_v4l2_enum_format() error: 'fmtdesc' dereferencing possible ERR_PTR()
drivers/usb/host/ehci-sched.c:1358 reserve_release_iso_bandwidth() error: 'tt' dereferencing possible ERR_PTR()
drivers/usb/host/ehci-sched.c:261 reserve_release_intr_bandwidth() error: 'tt' dereferencing possible ERR_PTR()
drivers/usb/host/max3421-hcd.c:1914 max3421_probe() error: 'max3421_hcd->spi_thread' dereferencing possible ERR_PTR()
drivers/video/fbdev/omap2/omapfb/displays/connector-analog-tv.c:220 tvc_probe() error: 'ddata->in' dereferencing possible ERR_PTR()
fs/btrfs/extent_io.c:224 extent_range_redirty_for_io() error: 'folio' dereferencing possible ERR_PTR()
fs/cifs/connect.c:2725 cifs_match_super() error: 'tlink' dereferencing possible ERR_PTR()
fs/hfs/brec.c:160 hfs_brec_insert() error: 'fd->bnode' dereferencing possible ERR_PTR()
fs/hfs/brec.c:437 hfs_brec_update_parent() error: 'fd->bnode' dereferencing possible ERR_PTR()
fs/hfsplus/brec.c:160 hfsplus_brec_insert() error: 'fd->bnode' dereferencing possible ERR_PTR()
fs/hfsplus/brec.c:441 hfs_brec_update_parent() error: 'fd->bnode' dereferencing possible ERR_PTR()
fs/ksmbd/smbacl.c:1296 smb_check_perm_dacl() error: 'posix_acls' dereferencing possible ERR_PTR()
fs/ksmbd/vfs.c:1323 ksmbd_vfs_make_xattr_posix_acl() error: 'posix_acls' dereferencing possible ERR_PTR()
fs/ksmbd/vfs.c:1830 ksmbd_vfs_inherit_posix_acl() error: 'acls' dereferencing possible ERR_PTR()
fs/namei.c:2898 path_pts() error: 'path->dentry' dereferencing possible ERR_PTR()
fs/nfs/blocklayout/rpc_pipefs.c:152 nfs4blocklayout_register_sb() error: 'dir' dereferencing possible ERR_PTR()
fs/nfs/cache_lib.c:122 nfs_cache_register_sb() error: 'dir' dereferencing possible ERR_PTR()
fs/nfsd/nfs4recover.c:942 nfsd4_cld_register_sb() error: 'dir' dereferencing possible ERR_PTR()
fs/nfs/nfs4proc.c:3086 _nfs4_open_and_get_state() error: 'dentry' dereferencing possible ERR_PTR()
fs/nfs/nfs4proc.c:3096 _nfs4_open_and_get_state() error: 'dentry' dereferencing possible ERR_PTR()
fs/proc/base.c:2091 proc_fill_cache() error: 'child' dereferencing possible ERR_PTR()
ipc/shm.c:1681 do_shmat() error: 'shp' dereferencing possible ERR_PTR()
kernel/cpu.c:618 finish_cpu() error: 'idle' dereferencing possible ERR_PTR()
kernel/trace/ring_buffer.c:3971 ring_buffer_write() error: 'event' dereferencing possible ERR_PTR()
kernel/trace/trace_events.c:3959 event_test_stuff() error: 'test_thread' dereferencing possible ERR_PTR()
lib/kunit/string-stream-test.c:25 string_stream_test_not_empty_after_add() error: 'stream' dereferencing possible ERR_PTR()
lib/kunit/string-stream-test.c:35 string_stream_test_get_string() error: 'stream' dereferencing possible ERR_PTR()
net/core/filter.c:10998 ____sk_select_reuseport() error: 'selected_sk' dereferencing possible ERR_PTR()
net/smc/smc_ib.c:215 smc_ib_find_route() error: 'neigh' dereferencing possible ERR_PTR()
net/sunrpc/clnt.c:135 rpc_setup_pipedir_sb() error: 'dir' dereferencing possible ERR_PTR()
net/sunrpc/rpc_pipe.c:1312 rpc_gssd_dummy_populate() error: 'gssd_dentry' dereferencing possible ERR_PTR()
net/sunrpc/rpc_pipe.c:1327 rpc_gssd_dummy_populate() error: 'clnt_dentry' dereferencing possible ERR_PTR()
net/sunrpc/rpc_pipe.c:641 __rpc_lookup_create_exclusive() error: 'dentry' dereferencing possible ERR_PTR()
net/sunrpc/rpc_pipe.c:666 __rpc_depopulate() error: 'dentry' dereferencing possible ERR_PTR()
security/keys/trusted-keys/trusted_tee.c:112 trusted_tee_seal() error: 'reg_shm_out' dereferencing possible ERR_PTR()
security/keys/trusted-keys/trusted_tee.c:171 trusted_tee_unseal() error: 'reg_shm_out' dereferencing possible ERR_PTR()
security/keys/trusted-keys/trusted_tpm2.c:75 tpm2_key_encode() error: 'work' dereferencing possible ERR_PTR()
Peter Xu May 12, 2023, 12:14 a.m. UTC | #17
On Wed, May 10, 2023 at 09:45:23PM -0700, Peter Xu wrote:
> On Wed, May 10, 2023 at 04:44:59PM -0500, Linus Torvalds wrote:
> > On Wed, May 10, 2023 at 4:33 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > We'd still keep the RETRY bit as a "this did not complete, you need to
> > > retry", but at least the whole secondary meaning of "oh, and if it
> > > isn't set, you need to release the lock you took" would go away.
> > 
> > "unless VM_FAULT_COMPLETED is set, in which case everything was fine,
> > and you shouldn't release the lock because we already released it".
> > 
> > I completely forgot about that wart that came in last year.
> > 
> > I think that if we made handle_mm_fault() always unlock, that thing
> > would go away entirely, since "0" would now just mean the same thing.
> > 
> > Is there really any case that *wants* to keep the mmap lock held, and
> > couldn't just always re-take it if it needs to do another page
> > (possibly retry, but the retry case obviously already has that issue)?
> 
> FAULT_FLAG_RETRY_NOWAIT?

I had a slightly closer look today on "releasing mmap read lock in
handle_mm_fault()", so I think it's also fine we just keep NOWAIT special
but release the mmap locks if !NOWAIT (NOWAIT is always special, say, when
with VM_FAULT_RETRY returned), but I'm not sure whether it's good enough.

A major issue of not releasing mmap lock always is gup can fault >1 pages
per call to handle_mm_fault(), so it's ideal if it's not major fault gup
can take mmap read lock once but fault in >1 pages without releasing it.

I've got three tiny patches attached where I added a wrapper for
handle_mm_fault() (I called it handle_mm_fault_one(), meaning it should be
used where we only want to fault in 1 page, aka, real hardware faults, so
it always make sense to release mmap lock there).  I tried to convert 4
archs (x86,arm,ppc,s390) to have a feeling of what it would clean up.  With
3 patches applied, one good thing is we can easily move all ERROR handling
above either RETRY|COMPLETE handling so it's much easier to review such a
change as what Johannes proposed, but not sure whether it's something
worthwhile to have.  Please let me know if anyone thinks I should post as a
formal patch.

Even with something like that, we'll still need per-arch code changes if we
want to handle ERROR+RETRY cases anyway, because we used to handle RETRY
first.  To make it less an effort in the future, we still need to
generalize the fault path.

> 
> > Certainly nothing wants the vma lock, so it's only the "mmap_sem" case
> > that would be an issue.
> 
> You're definitely right that the gup path is broken which I didn't notice
> when reading...  I know I shouldn't review such a still slightly involved
> patch during travel. :(
> 
> I still think maybe we have chance to generalize at least the fault path,
> I'd still start with something like having just 2-3 archs having a shared
> routine handle only some part of the fault path (I remember there was a
> previous discussion previously, but I didn't follow up much from there..).
> 
> So even if we still need duplicates over many archs, we'll start to have
> something we can use as a baseline in fault path.  Does it sound a sane
> thing to consider as a start, or maybe not?
> 
> The other question - considering RETRY_NOWAIT being there - do we still
> want to have something like what Johannes proposed first to fix the problem
> (with all arch and gup fixed)?  I'd think yes, but I could missed something.

Three patches attached:
=======================

From 95e968c75f9114cc4a2ef74f0108f09157f7dd15 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Thu, 11 May 2023 12:49:32 -0700
Subject: [PATCH 1/3] mm: handle_mm_fault_one()

Introduce a wrapper for handle_mm_fault() which will only resolve one page
fault at a time.  It means if there's no further fault to take care of, we
can safely unify the unlock of either mmap lock or vma lock in the wrapper.
It may help cleanup fault paths across archs.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/mm.h | 38 ++++++++++++++++++++++++++++++++++++++
 mm/memory.c        | 11 ++++++++---
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1bd731a2972b..dfe6083c6953 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2360,6 +2360,44 @@ static inline void unmap_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen, int even_cows) { }
 #endif
 
+static inline bool
+mm_should_release_mmap(unsigned long flags, vm_fault_t retval)
+{
+	/* The caller explicitly requested to keep the mmap read lock */
+	if (flags & FAULT_FLAG_RETRY_NOWAIT)
+		return false;
+
+	/* If the mmap read lock is already released, we're all good */
+	if (fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED))
+		return false;
+
+	/* Otherwise always release it */
+	return true;
+}
+
+/*
+ * This is mostly handle_mm_fault(), but it also take care of releasing
+ * mmap or vma read lock as long as possible (e.g. when !RETRY_NOWAIT).
+ *
+ * Normally it's the case when we got a hardware page fault, where we want
+ * to release the lock right after the page fault. And it's not for case
+ * like GUP where it can fault a range of pages continuously with mmap lock
+ * being held during the process.
+ */
+static inline vm_fault_t
+handle_mm_fault_one(struct vm_area_struct *vma, unsigned long address,
+		    unsigned int flags, struct pt_regs *regs)
+{
+	vm_fault_t retval = handle_mm_fault(vma, address, flags, regs);
+
+	if (flags & FAULT_FLAG_VMA_LOCK)
+		vma_end_read(vma);
+	else if (mm_should_release_mmap(flags, retval))
+		mmap_read_unlock(mm);
+
+	return retval;
+}
+
 static inline void unmap_shared_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen)
 {
diff --git a/mm/memory.c b/mm/memory.c
index 146bb94764f8..c43e3410bf8f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5206,10 +5206,15 @@ static vm_fault_t sanitize_fault_flags(struct vm_area_struct *vma,
 }
 
 /*
- * By the time we get here, we already hold the mm semaphore
+ * By the time we get here, we already hold relevant locks to make sure vma
+ * is rock solid:
  *
- * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __folio_lock_or_retry().
+ *   - When FAULT_FLAG_VMA_LOCK, it's the vma read lock
+ *   - When !FAULT_FLAG_VMA_LOCK, it's the mmap read lock
+ *
+ * For the 2nd case, it's possible that the mmap_lock may have been
+ * released during the fault procedure depending on the return value.  See
+ * filemap_fault(), __folio_lock_or_retry(), or fault_dirty_shared_page().
  */
 vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 			   unsigned int flags, struct pt_regs *regs)
kernel test robot May 12, 2023, 3:28 a.m. UTC | #18
Hi Peter,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/mm-handle_mm_fault_one/20230512-081554
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/ZF2E6i4pqJr7m436%40x1n
patch subject: [PATCH 1/3] mm: handle_mm_fault_one()
config: x86_64-randconfig-a014 (https://download.01.org/0day-ci/archive/20230512/202305121114.JAbwVHjS-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/0a03a4870c8a62e3ba52a0f9b50b307f509acb2b
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Peter-Xu/mm-handle_mm_fault_one/20230512-081554
        git checkout 0a03a4870c8a62e3ba52a0f9b50b307f509acb2b
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 prepare

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202305121114.JAbwVHjS-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from arch/x86/kernel/asm-offsets.c:14:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
   In file included from include/linux/memcontrol.h:20:
>> include/linux/mm.h:2371:6: error: use of undeclared identifier 'fault'
           if (fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED))
               ^
>> include/linux/mm.h:2396:20: error: use of undeclared identifier 'mm'
                   mmap_read_unlock(mm);
                                    ^
   2 errors generated.
   make[2]: *** [scripts/Makefile.build:114: arch/x86/kernel/asm-offsets.s] Error 1
   make[2]: Target 'prepare' not remade because of errors.
   make[1]: *** [Makefile:1287: prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:226: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.


vim +/fault +2371 include/linux/mm.h

  2362	
  2363	static inline bool
  2364	mm_should_release_mmap(unsigned long flags, vm_fault_t retval)
  2365	{
  2366		/* The caller explicitly requested to keep the mmap read lock */
  2367		if (flags & FAULT_FLAG_RETRY_NOWAIT)
  2368			return false;
  2369	
  2370		/* If the mmap read lock is already released, we're all good */
> 2371		if (fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED))
  2372			return false;
  2373	
  2374		/* Otherwise always release it */
  2375		return true;
  2376	}
  2377	
  2378	/*
  2379	 * This is mostly handle_mm_fault(), but it also take care of releasing
  2380	 * mmap or vma read lock as long as possible (e.g. when !RETRY_NOWAIT).
  2381	 *
  2382	 * Normally it's the case when we got a hardware page fault, where we want
  2383	 * to release the lock right after the page fault. And it's not for case
  2384	 * like GUP where it can fault a range of pages continuously with mmap lock
  2385	 * being held during the process.
  2386	 */
  2387	static inline vm_fault_t
  2388	handle_mm_fault_one(struct vm_area_struct *vma, unsigned long address,
  2389			    unsigned int flags, struct pt_regs *regs)
  2390	{
  2391		vm_fault_t retval = handle_mm_fault(vma, address, flags, regs);
  2392	
  2393		if (flags & FAULT_FLAG_VMA_LOCK)
  2394			vma_end_read(vma);
  2395		else if (mm_should_release_mmap(flags, retval))
> 2396			mmap_read_unlock(mm);
  2397	
  2398		return retval;
  2399	}
  2400
kernel test robot May 12, 2023, 3:52 a.m. UTC | #19
Hi Peter,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/mm-handle_mm_fault_one/20230512-081554
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/ZF2E6i4pqJr7m436%40x1n
patch subject: [PATCH 1/3] mm: handle_mm_fault_one()
config: x86_64-randconfig-a013 (https://download.01.org/0day-ci/archive/20230512/202305121115.gTte4W7A-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/0a03a4870c8a62e3ba52a0f9b50b307f509acb2b
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Peter-Xu/mm-handle_mm_fault_one/20230512-081554
        git checkout 0a03a4870c8a62e3ba52a0f9b50b307f509acb2b
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 olddefconfig
        make W=1 O=build_dir ARCH=x86_64 prepare

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202305121115.gTte4W7A-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from include/linux/memcontrol.h:20,
                    from include/linux/swap.h:9,
                    from include/linux/suspend.h:5,
                    from arch/x86/kernel/asm-offsets.c:14:
   include/linux/mm.h: In function 'mm_should_release_mmap':
>> include/linux/mm.h:2371:13: error: 'fault' undeclared (first use in this function)
    2371 |         if (fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED))
         |             ^~~~~
   include/linux/mm.h:2371:13: note: each undeclared identifier is reported only once for each function it appears in
   include/linux/mm.h: In function 'handle_mm_fault_one':
>> include/linux/mm.h:2396:34: error: 'mm' undeclared (first use in this function); did you mean 'tm'?
    2396 |                 mmap_read_unlock(mm);
         |                                  ^~
         |                                  tm
   make[2]: *** [scripts/Makefile.build:114: arch/x86/kernel/asm-offsets.s] Error 1
   make[2]: Target 'prepare' not remade because of errors.
   make[1]: *** [Makefile:1287: prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:226: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.


vim +/fault +2371 include/linux/mm.h

  2362	
  2363	static inline bool
  2364	mm_should_release_mmap(unsigned long flags, vm_fault_t retval)
  2365	{
  2366		/* The caller explicitly requested to keep the mmap read lock */
  2367		if (flags & FAULT_FLAG_RETRY_NOWAIT)
  2368			return false;
  2369	
  2370		/* If the mmap read lock is already released, we're all good */
> 2371		if (fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED))
  2372			return false;
  2373	
  2374		/* Otherwise always release it */
  2375		return true;
  2376	}
  2377	
  2378	/*
  2379	 * This is mostly handle_mm_fault(), but it also take care of releasing
  2380	 * mmap or vma read lock as long as possible (e.g. when !RETRY_NOWAIT).
  2381	 *
  2382	 * Normally it's the case when we got a hardware page fault, where we want
  2383	 * to release the lock right after the page fault. And it's not for case
  2384	 * like GUP where it can fault a range of pages continuously with mmap lock
  2385	 * being held during the process.
  2386	 */
  2387	static inline vm_fault_t
  2388	handle_mm_fault_one(struct vm_area_struct *vma, unsigned long address,
  2389			    unsigned int flags, struct pt_regs *regs)
  2390	{
  2391		vm_fault_t retval = handle_mm_fault(vma, address, flags, regs);
  2392	
  2393		if (flags & FAULT_FLAG_VMA_LOCK)
  2394			vma_end_read(vma);
  2395		else if (mm_should_release_mmap(flags, retval))
> 2396			mmap_read_unlock(mm);
  2397	
  2398		return retval;
  2399	}
  2400
kernel test robot May 12, 2023, 3:52 a.m. UTC | #20
Hi Peter,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/mm-handle_mm_fault_one/20230512-081554
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/ZF2E6i4pqJr7m436%40x1n
patch subject: [PATCH 1/3] mm: handle_mm_fault_one()
config: x86_64-randconfig-a001 (https://download.01.org/0day-ci/archive/20230512/202305121127.9uKHOw3S-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/0a03a4870c8a62e3ba52a0f9b50b307f509acb2b
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Peter-Xu/mm-handle_mm_fault_one/20230512-081554
        git checkout 0a03a4870c8a62e3ba52a0f9b50b307f509acb2b
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 prepare

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202305121127.9uKHOw3S-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from arch/x86/kernel/asm-offsets.c:14:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
   In file included from include/linux/memcontrol.h:20:
>> include/linux/mm.h:2371:6: error: use of undeclared identifier 'fault'
           if (fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED))
               ^
>> include/linux/mm.h:2396:20: error: use of undeclared identifier 'mm'
                   mmap_read_unlock(mm);
                                    ^
   2 errors generated.
   make[2]: *** [scripts/Makefile.build:114: arch/x86/kernel/asm-offsets.s] Error 1
   make[2]: Target 'prepare' not remade because of errors.
   make[1]: *** [Makefile:1287: prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:226: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.


vim +/fault +2371 include/linux/mm.h

  2362	
  2363	static inline bool
  2364	mm_should_release_mmap(unsigned long flags, vm_fault_t retval)
  2365	{
  2366		/* The caller explicitly requested to keep the mmap read lock */
  2367		if (flags & FAULT_FLAG_RETRY_NOWAIT)
  2368			return false;
  2369	
  2370		/* If the mmap read lock is already released, we're all good */
> 2371		if (fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED))
  2372			return false;
  2373	
  2374		/* Otherwise always release it */
  2375		return true;
  2376	}
  2377	
  2378	/*
  2379	 * This is mostly handle_mm_fault(), but it also take care of releasing
  2380	 * mmap or vma read lock as long as possible (e.g. when !RETRY_NOWAIT).
  2381	 *
  2382	 * Normally it's the case when we got a hardware page fault, where we want
  2383	 * to release the lock right after the page fault. And it's not for case
  2384	 * like GUP where it can fault a range of pages continuously with mmap lock
  2385	 * being held during the process.
  2386	 */
  2387	static inline vm_fault_t
  2388	handle_mm_fault_one(struct vm_area_struct *vma, unsigned long address,
  2389			    unsigned int flags, struct pt_regs *regs)
  2390	{
  2391		vm_fault_t retval = handle_mm_fault(vma, address, flags, regs);
  2392	
  2393		if (flags & FAULT_FLAG_VMA_LOCK)
  2394			vma_end_read(vma);
  2395		else if (mm_should_release_mmap(flags, retval))
> 2396			mmap_read_unlock(mm);
  2397	
  2398		return retval;
  2399	}
  2400
kernel test robot May 12, 2023, 4:49 a.m. UTC | #21
Hi Peter,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/mm-handle_mm_fault_one/20230512-081554
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/ZF2E6i4pqJr7m436%40x1n
patch subject: [PATCH 1/3] mm: handle_mm_fault_one()
config: hexagon-randconfig-r045-20230511 (https://download.01.org/0day-ci/archive/20230512/202305121214.bxn0gOdr-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project b0fb98227c90adf2536c9ad644a74d5e92961111)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/0a03a4870c8a62e3ba52a0f9b50b307f509acb2b
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Peter-Xu/mm-handle_mm_fault_one/20230512-081554
        git checkout 0a03a4870c8a62e3ba52a0f9b50b307f509acb2b
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash drivers/net/ethernet/hisilicon/hns/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202305121214.bxn0gOdr-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:9:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
                                                     ^
   In file included from drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:9:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
                                                     ^
   In file included from drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:9:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   In file included from drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:12:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:10:
   include/linux/mm.h:2371:6: error: use of undeclared identifier 'fault'
           if (fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED))
               ^
   include/linux/mm.h:2371:6: error: use of undeclared identifier 'fault'
   include/linux/mm.h:2371:6: error: use of undeclared identifier 'fault'
   include/linux/mm.h:2396:20: error: use of undeclared identifier 'mm'
                   mmap_read_unlock(mm);
                                    ^
>> drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:200:48: warning: shift count >= width of type [-Wshift-count-overflow]
           if (!dma_set_mask_and_coherent(dsaf_dev->dev, DMA_BIT_MASK(64ULL)))
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
   include/linux/dma-mapping.h:76:54: note: expanded from macro 'DMA_BIT_MASK'
   #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
                                                        ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
>> drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:200:48: warning: shift count >= width of type [-Wshift-count-overflow]
           if (!dma_set_mask_and_coherent(dsaf_dev->dev, DMA_BIT_MASK(64ULL)))
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
   include/linux/dma-mapping.h:76:54: note: expanded from macro 'DMA_BIT_MASK'
   #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
                                                        ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:61: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                               ^~~~
>> drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:200:48: warning: shift count >= width of type [-Wshift-count-overflow]
           if (!dma_set_mask_and_coherent(dsaf_dev->dev, DMA_BIT_MASK(64ULL)))
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
   include/linux/dma-mapping.h:76:54: note: expanded from macro 'DMA_BIT_MASK'
   #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
                                                        ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:86: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                                       ~~~~~~~~~~~~~~~~~^~~~~
   include/linux/compiler.h:69:3: note: expanded from macro '__trace_if_value'
           (cond) ?                                        \
            ^~~~
   9 warnings and 4 errors generated.


vim +200 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c

8413b3be4d77dae Kejian Yan                   2016-06-03   40  
336a443bd9ddca3 YueHaibing                   2018-07-26   41  static int hns_dsaf_get_cfg(struct dsaf_device *dsaf_dev)
511e6bc071db148 huangdaode                   2015-09-17   42  {
511e6bc071db148 huangdaode                   2015-09-17   43  	int ret, i;
511e6bc071db148 huangdaode                   2015-09-17   44  	u32 desc_num;
511e6bc071db148 huangdaode                   2015-09-17   45  	u32 buf_size;
422c3107ed2cc62 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23   46) 	u32 reset_offset = 0;
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23   47) 	u32 res_idx = 0;
48189d6aaf1ed1b yankejian                    2016-01-20   48  	const char *mode_str;
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23   49) 	struct regmap *syscon;
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23   50) 	struct resource *res;
453cafbce5bd256 Peter Chen                   2016-08-01   51  	struct device_node *np = dsaf_dev->dev->of_node, *np_temp;
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23   52) 	struct platform_device *pdev = to_platform_device(dsaf_dev->dev);
511e6bc071db148 huangdaode                   2015-09-17   53  
8413b3be4d77dae Kejian Yan                   2016-06-03   54  	if (dev_of_node(dsaf_dev->dev)) {
13ac695e7ea16cb Salil                        2015-12-03   55  		if (of_device_is_compatible(np, "hisilicon,hns-dsaf-v1"))
511e6bc071db148 huangdaode                   2015-09-17   56  			dsaf_dev->dsaf_ver = AE_VERSION_1;
13ac695e7ea16cb Salil                        2015-12-03   57  		else
13ac695e7ea16cb Salil                        2015-12-03   58  			dsaf_dev->dsaf_ver = AE_VERSION_2;
8413b3be4d77dae Kejian Yan                   2016-06-03   59  	} else if (is_acpi_node(dsaf_dev->dev->fwnode)) {
8413b3be4d77dae Kejian Yan                   2016-06-03   60  		if (acpi_dev_found(hns_dsaf_acpi_match[0].id))
8413b3be4d77dae Kejian Yan                   2016-06-03   61  			dsaf_dev->dsaf_ver = AE_VERSION_1;
8413b3be4d77dae Kejian Yan                   2016-06-03   62  		else if (acpi_dev_found(hns_dsaf_acpi_match[1].id))
8413b3be4d77dae Kejian Yan                   2016-06-03   63  			dsaf_dev->dsaf_ver = AE_VERSION_2;
8413b3be4d77dae Kejian Yan                   2016-06-03   64  		else
8413b3be4d77dae Kejian Yan                   2016-06-03   65  			return -ENXIO;
8413b3be4d77dae Kejian Yan                   2016-06-03   66  	} else {
8413b3be4d77dae Kejian Yan                   2016-06-03   67  		dev_err(dsaf_dev->dev, "cannot get cfg data from of or acpi\n");
8413b3be4d77dae Kejian Yan                   2016-06-03   68  		return -ENXIO;
8413b3be4d77dae Kejian Yan                   2016-06-03   69  	}
511e6bc071db148 huangdaode                   2015-09-17   70  
6162928c76dcba2 Kejian Yan                   2016-06-03   71  	ret = device_property_read_string(dsaf_dev->dev, "mode", &mode_str);
511e6bc071db148 huangdaode                   2015-09-17   72  	if (ret) {
511e6bc071db148 huangdaode                   2015-09-17   73  		dev_err(dsaf_dev->dev, "get dsaf mode fail, ret=%d!\n", ret);
511e6bc071db148 huangdaode                   2015-09-17   74  		return ret;
511e6bc071db148 huangdaode                   2015-09-17   75  	}
511e6bc071db148 huangdaode                   2015-09-17   76  	for (i = 0; i < DSAF_MODE_MAX; i++) {
511e6bc071db148 huangdaode                   2015-09-17   77  		if (g_dsaf_mode_match[i] &&
511e6bc071db148 huangdaode                   2015-09-17   78  		    !strcmp(mode_str, g_dsaf_mode_match[i]))
511e6bc071db148 huangdaode                   2015-09-17   79  			break;
511e6bc071db148 huangdaode                   2015-09-17   80  	}
511e6bc071db148 huangdaode                   2015-09-17   81  	if (i >= DSAF_MODE_MAX ||
511e6bc071db148 huangdaode                   2015-09-17   82  	    i == DSAF_MODE_INVALID || i == DSAF_MODE_ENABLE) {
511e6bc071db148 huangdaode                   2015-09-17   83  		dev_err(dsaf_dev->dev,
511e6bc071db148 huangdaode                   2015-09-17   84  			"%s prs mode str fail!\n", dsaf_dev->ae_dev.name);
511e6bc071db148 huangdaode                   2015-09-17   85  		return -EINVAL;
511e6bc071db148 huangdaode                   2015-09-17   86  	}
511e6bc071db148 huangdaode                   2015-09-17   87  	dsaf_dev->dsaf_mode = (enum dsaf_mode)i;
511e6bc071db148 huangdaode                   2015-09-17   88  
511e6bc071db148 huangdaode                   2015-09-17   89  	if (dsaf_dev->dsaf_mode > DSAF_MODE_ENABLE)
511e6bc071db148 huangdaode                   2015-09-17   90  		dsaf_dev->dsaf_en = HRD_DSAF_NO_DSAF_MODE;
511e6bc071db148 huangdaode                   2015-09-17   91  	else
511e6bc071db148 huangdaode                   2015-09-17   92  		dsaf_dev->dsaf_en = HRD_DSAF_MODE;
511e6bc071db148 huangdaode                   2015-09-17   93  
511e6bc071db148 huangdaode                   2015-09-17   94  	if ((i == DSAF_MODE_ENABLE_16VM) ||
511e6bc071db148 huangdaode                   2015-09-17   95  	    (i == DSAF_MODE_DISABLE_2PORT_8VM) ||
511e6bc071db148 huangdaode                   2015-09-17   96  	    (i == DSAF_MODE_DISABLE_6PORT_2VM))
511e6bc071db148 huangdaode                   2015-09-17   97  		dsaf_dev->dsaf_tc_mode = HRD_DSAF_8TC_MODE;
511e6bc071db148 huangdaode                   2015-09-17   98  	else
511e6bc071db148 huangdaode                   2015-09-17   99  		dsaf_dev->dsaf_tc_mode = HRD_DSAF_4TC_MODE;
511e6bc071db148 huangdaode                   2015-09-17  100  
8413b3be4d77dae Kejian Yan                   2016-06-03  101  	if (dev_of_node(dsaf_dev->dev)) {
453cafbce5bd256 Peter Chen                   2016-08-01  102  		np_temp = of_parse_phandle(np, "subctrl-syscon", 0);
453cafbce5bd256 Peter Chen                   2016-08-01  103  		syscon = syscon_node_to_regmap(np_temp);
453cafbce5bd256 Peter Chen                   2016-08-01  104  		of_node_put(np_temp);
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  105) 		if (IS_ERR_OR_NULL(syscon)) {
8413b3be4d77dae Kejian Yan                   2016-06-03  106  			res = platform_get_resource(pdev, IORESOURCE_MEM,
8413b3be4d77dae Kejian Yan                   2016-06-03  107  						    res_idx++);
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  108) 			if (!res) {
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  109) 				dev_err(dsaf_dev->dev, "subctrl info is needed!\n");
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  110) 				return -ENOMEM;
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  111) 			}
8413b3be4d77dae Kejian Yan                   2016-06-03  112  
8413b3be4d77dae Kejian Yan                   2016-06-03  113  			dsaf_dev->sc_base = devm_ioremap_resource(&pdev->dev,
8413b3be4d77dae Kejian Yan                   2016-06-03  114  								  res);
b3dc93501e34b6e Wei Yongjun                  2016-08-23  115  			if (IS_ERR(dsaf_dev->sc_base))
96329a181bfbbac Wei Yongjun                  2016-07-05  116  				return PTR_ERR(dsaf_dev->sc_base);
511e6bc071db148 huangdaode                   2015-09-17  117  
8413b3be4d77dae Kejian Yan                   2016-06-03  118  			res = platform_get_resource(pdev, IORESOURCE_MEM,
8413b3be4d77dae Kejian Yan                   2016-06-03  119  						    res_idx++);
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  120) 			if (!res) {
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  121) 				dev_err(dsaf_dev->dev, "serdes-ctrl info is needed!\n");
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  122) 				return -ENOMEM;
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  123) 			}
8413b3be4d77dae Kejian Yan                   2016-06-03  124  
8413b3be4d77dae Kejian Yan                   2016-06-03  125  			dsaf_dev->sds_base = devm_ioremap_resource(&pdev->dev,
8413b3be4d77dae Kejian Yan                   2016-06-03  126  								   res);
b3dc93501e34b6e Wei Yongjun                  2016-08-23  127  			if (IS_ERR(dsaf_dev->sds_base))
96329a181bfbbac Wei Yongjun                  2016-07-05  128  				return PTR_ERR(dsaf_dev->sds_base);
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  129) 		} else {
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  130) 			dsaf_dev->sub_ctrl = syscon;
511e6bc071db148 huangdaode                   2015-09-17  131  		}
8413b3be4d77dae Kejian Yan                   2016-06-03  132  	}
511e6bc071db148 huangdaode                   2015-09-17  133  
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  134) 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "ppe-base");
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  135) 	if (!res) {
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  136) 		res = platform_get_resource(pdev, IORESOURCE_MEM, res_idx++);
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  137) 		if (!res) {
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  138) 			dev_err(dsaf_dev->dev, "ppe-base info is needed!\n");
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  139) 			return -ENOMEM;
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  140) 		}
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  141) 	}
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  142) 	dsaf_dev->ppe_base = devm_ioremap_resource(&pdev->dev, res);
b3dc93501e34b6e Wei Yongjun                  2016-08-23  143  	if (IS_ERR(dsaf_dev->ppe_base))
96329a181bfbbac Wei Yongjun                  2016-07-05  144  		return PTR_ERR(dsaf_dev->ppe_base);
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  145) 	dsaf_dev->ppe_paddr = res->start;
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  146) 
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  147) 	if (!HNS_DSAF_IS_DEBUG(dsaf_dev)) {
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  148) 		res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  149) 						   "dsaf-base");
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  150) 		if (!res) {
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  151) 			res = platform_get_resource(pdev, IORESOURCE_MEM,
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  152) 						    res_idx);
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  153) 			if (!res) {
511e6bc071db148 huangdaode                   2015-09-17  154  				dev_err(dsaf_dev->dev,
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  155) 					"dsaf-base info is needed!\n");
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  156) 				return -ENOMEM;
511e6bc071db148 huangdaode                   2015-09-17  157  			}
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  158) 		}
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  159) 		dsaf_dev->io_base = devm_ioremap_resource(&pdev->dev, res);
b3dc93501e34b6e Wei Yongjun                  2016-08-23  160  		if (IS_ERR(dsaf_dev->io_base))
96329a181bfbbac Wei Yongjun                  2016-07-05  161  			return PTR_ERR(dsaf_dev->io_base);
831d828bf2cc853 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  162) 	}
511e6bc071db148 huangdaode                   2015-09-17  163  
6162928c76dcba2 Kejian Yan                   2016-06-03  164  	ret = device_property_read_u32(dsaf_dev->dev, "desc-num", &desc_num);
511e6bc071db148 huangdaode                   2015-09-17  165  	if (ret < 0 || desc_num < HNS_DSAF_MIN_DESC_CNT ||
511e6bc071db148 huangdaode                   2015-09-17  166  	    desc_num > HNS_DSAF_MAX_DESC_CNT) {
511e6bc071db148 huangdaode                   2015-09-17  167  		dev_err(dsaf_dev->dev, "get desc-num(%d) fail, ret=%d!\n",
511e6bc071db148 huangdaode                   2015-09-17  168  			desc_num, ret);
f6c2df1e5b913f9 Qianqian Xie                 2016-06-21  169  		return -EINVAL;
511e6bc071db148 huangdaode                   2015-09-17  170  	}
511e6bc071db148 huangdaode                   2015-09-17  171  	dsaf_dev->desc_num = desc_num;
511e6bc071db148 huangdaode                   2015-09-17  172  
6162928c76dcba2 Kejian Yan                   2016-06-03  173  	ret = device_property_read_u32(dsaf_dev->dev, "reset-field-offset",
6162928c76dcba2 Kejian Yan                   2016-06-03  174  				       &reset_offset);
422c3107ed2cc62 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  175) 	if (ret < 0) {
422c3107ed2cc62 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  176) 		dev_dbg(dsaf_dev->dev,
422c3107ed2cc62 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  177) 			"get reset-field-offset fail, ret=%d!\r\n", ret);
422c3107ed2cc62 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  178) 	}
422c3107ed2cc62 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  179) 	dsaf_dev->reset_offset = reset_offset;
422c3107ed2cc62 Yisen.Zhuang\(Zhuangyuzeng\  2016-04-23  180) 
6162928c76dcba2 Kejian Yan                   2016-06-03  181  	ret = device_property_read_u32(dsaf_dev->dev, "buf-size", &buf_size);
511e6bc071db148 huangdaode                   2015-09-17  182  	if (ret < 0) {
511e6bc071db148 huangdaode                   2015-09-17  183  		dev_err(dsaf_dev->dev,
511e6bc071db148 huangdaode                   2015-09-17  184  			"get buf-size fail, ret=%d!\r\n", ret);
f6c2df1e5b913f9 Qianqian Xie                 2016-06-21  185  		return ret;
511e6bc071db148 huangdaode                   2015-09-17  186  	}
511e6bc071db148 huangdaode                   2015-09-17  187  	dsaf_dev->buf_size = buf_size;
511e6bc071db148 huangdaode                   2015-09-17  188  
511e6bc071db148 huangdaode                   2015-09-17  189  	dsaf_dev->buf_size_type = hns_rcb_buf_size2type(buf_size);
511e6bc071db148 huangdaode                   2015-09-17  190  	if (dsaf_dev->buf_size_type < 0) {
511e6bc071db148 huangdaode                   2015-09-17  191  		dev_err(dsaf_dev->dev,
511e6bc071db148 huangdaode                   2015-09-17  192  			"buf_size(%d) is wrong!\n", buf_size);
f6c2df1e5b913f9 Qianqian Xie                 2016-06-21  193  		return -EINVAL;
511e6bc071db148 huangdaode                   2015-09-17  194  	}
511e6bc071db148 huangdaode                   2015-09-17  195  
a24274aa5c2328a Kejian Yan                   2016-06-03  196  	dsaf_dev->misc_op = hns_misc_op_get(dsaf_dev);
a24274aa5c2328a Kejian Yan                   2016-06-03  197  	if (!dsaf_dev->misc_op)
a24274aa5c2328a Kejian Yan                   2016-06-03  198  		return -ENOMEM;
a24274aa5c2328a Kejian Yan                   2016-06-03  199  
511e6bc071db148 huangdaode                   2015-09-17 @200  	if (!dma_set_mask_and_coherent(dsaf_dev->dev, DMA_BIT_MASK(64ULL)))
511e6bc071db148 huangdaode                   2015-09-17  201  		dev_dbg(dsaf_dev->dev, "set mask to 64bit\n");
511e6bc071db148 huangdaode                   2015-09-17  202  	else
511e6bc071db148 huangdaode                   2015-09-17  203  		dev_err(dsaf_dev->dev, "set mask to 64bit fail!\n");
511e6bc071db148 huangdaode                   2015-09-17  204  
511e6bc071db148 huangdaode                   2015-09-17  205  	return 0;
511e6bc071db148 huangdaode                   2015-09-17  206  }
511e6bc071db148 huangdaode                   2015-09-17  207
diff mbox series

Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index a34abfe8c654..b4c9bd368b7e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3378,7 +3378,7 @@  vm_fault_t filemap_fault(struct vm_fault *vmf)
 	 * re-find the vma and come back and find our hopefully still populated
 	 * page.
 	 */
-	if (folio)
+	if (!IS_ERR(folio))
 		folio_put(folio);
 	if (mapping_locked)
 		filemap_invalidate_unlock_shared(mapping);