RDMA/ucma: Put a lock around every call to the rdma_cm layer

Message ID: 20200218210432.GA31966@ziepe.ca (mailing list archive)
State: Accepted
Delegated to: Jason Gunthorpe

Commit Message

Jason Gunthorpe Feb. 18, 2020, 9:04 p.m. UTC
The rdma_cm must be used single threaded.

This appears to be a bug in the design, as it does have lots of locking
that seems like it should allow concurrency. However, when it is all said
and done every single place that uses the cma_exch() scheme is broken, and
all the unlocked reads from the ucma of the cm_id data are wrong too.

syzkaller has been finding endless bugs related to this.

Fixing this in any elegant way is some enormous amount of work. Take a
very big hammer and put a mutex around everything to do with the
ucma_context at the top of every syscall.

Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Reported-by: syzbot+adb15cf8c2798e4e0db4@syzkaller.appspotmail.com
Reported-by: syzbot+e5579222b6a3edd96522@syzkaller.appspotmail.com
Reported-by: syzbot+4b628fcc748474003457@syzkaller.appspotmail.com
Reported-by: syzbot+29ee8f76017ce6cf03da@syzkaller.appspotmail.com
Reported-by: syzbot+6956235342b7317ec564@syzkaller.appspotmail.com
Reported-by: syzbot+b358909d8d01556b790b@syzkaller.appspotmail.com
Reported-by: syzbot+6b46b135602a3f3ac99e@syzkaller.appspotmail.com
Reported-by: syzbot+8458d13b13562abf6b77@syzkaller.appspotmail.com
Reported-by: syzbot+bd034f3fdc0402e942ed@syzkaller.appspotmail.com
Reported-by: syzbot+c92378b32760a4eef756@syzkaller.appspotmail.com
Reported-by: syzbot+68b44a1597636e0b342c@syzkaller.appspotmail.com
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/core/ucma.c | 50 ++++++++++++++++++++++++++++++++--
 1 file changed, 48 insertions(+), 2 deletions(-)

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-rc

Let's see if I told syzkaller about this properly.

EricB: If there are other rdma_cm related hits in syzkaller besides
these 11, let's include them as well. I wasn't able to find a way to
search for things, this list is from your past email, thanks.
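
The "very big hammer" described in the commit message can be pictured with a small user-space sketch. This is not the patch itself (the diff is not reproduced on this page); it assumes pthread mutexes as a stand-in for kernel mutexes, and every name in it (serial_ctx, serial_cm_call, serial_handler, serial_run) is hypothetical. It only shows the general shape: one mutex per context, taken around each call into a layer that is only safe when used single threaded.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a per-connection context; not the kernel struct. */
struct serial_ctx {
	pthread_mutex_t mutex;	/* serializes every call into the lower layer */
	long state;		/* state the lower layer mutates without locking */
};

/* Stand-in for a lower-layer call that is only safe single threaded. */
static void serial_cm_call(struct serial_ctx *ctx)
{
	ctx->state++;		/* plain read-modify-write, racy if unserialized */
}

/* Stand-in for one syscall handler: lock, call the lower layer, unlock. */
static void serial_handler(struct serial_ctx *ctx)
{
	pthread_mutex_lock(&ctx->mutex);
	serial_cm_call(ctx);
	pthread_mutex_unlock(&ctx->mutex);
}

#define SERIAL_THREADS 4
#define SERIAL_CALLS 10000

static void *serial_worker(void *arg)
{
	struct serial_ctx *ctx = arg;

	for (int i = 0; i < SERIAL_CALLS; i++)
		serial_handler(ctx);
	return NULL;
}

/* Runs several threads against one context and returns the final state. */
static long serial_run(void)
{
	struct serial_ctx ctx = { .state = 0 };
	pthread_t threads[SERIAL_THREADS];

	pthread_mutex_init(&ctx.mutex, NULL);
	for (int i = 0; i < SERIAL_THREADS; i++) {
		int rc = pthread_create(&threads[i], NULL, serial_worker, &ctx);

		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < SERIAL_THREADS; i++)
		pthread_join(threads[i], NULL);
	pthread_mutex_destroy(&ctx.mutex);
	return ctx.state;
}
```

Because every update happens under the context mutex, serial_run() should return SERIAL_THREADS * SERIAL_CALLS. Without the lock in serial_handler() the increments could race and the total could come up short.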

Comments

syzbot Feb. 18, 2020, 10:10 p.m. UTC | #1
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger crash:

Reported-and-tested-by: syzbot+adb15cf8c2798e4e0db4@syzkaller.appspotmail.com

Tested on:

commit:         11a48a5a Linux 5.6-rc2
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-rc
kernel config:  https://syzkaller.appspot.com/x/.config?x=3e5684f9a45838bb
dashboard link: https://syzkaller.appspot.com/bug?extid=adb15cf8c2798e4e0db4
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
patch:          https://syzkaller.appspot.com/x/patch.diff?x=14709845e00000

Note: testing is done by a robot and is best-effort only.

Eric Biggers Feb. 19, 2020, 6:07 a.m. UTC | #2
On Tue, Feb 18, 2020 at 09:04:36PM +0000, Jason Gunthorpe wrote:
> The rdma_cm must be used single threaded.
> 
> This appears to be a bug in the design, as it does have lots of locking
> that seems like it should allow concurrency. However, when it is all said
> and done every single place that uses the cma_exch() scheme is broken, and
> all the unlocked reads from the ucma of the cm_id data are wrong too.
> 
> syzkaller has been finding endless bugs related to this.
> 
> Fixing this in any elegant way is some enormous amount of work. Take a
> very big hammer and put a mutex around everything to do with the
> ucma_context at the top of every syscall.
> 
> Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
> Reported-by: syzbot+adb15cf8c2798e4e0db4@syzkaller.appspotmail.com
> Reported-by: syzbot+e5579222b6a3edd96522@syzkaller.appspotmail.com
> Reported-by: syzbot+4b628fcc748474003457@syzkaller.appspotmail.com
> Reported-by: syzbot+29ee8f76017ce6cf03da@syzkaller.appspotmail.com
> Reported-by: syzbot+6956235342b7317ec564@syzkaller.appspotmail.com
> Reported-by: syzbot+b358909d8d01556b790b@syzkaller.appspotmail.com
> Reported-by: syzbot+6b46b135602a3f3ac99e@syzkaller.appspotmail.com
> Reported-by: syzbot+8458d13b13562abf6b77@syzkaller.appspotmail.com
> Reported-by: syzbot+bd034f3fdc0402e942ed@syzkaller.appspotmail.com
> Reported-by: syzbot+c92378b32760a4eef756@syzkaller.appspotmail.com
> Reported-by: syzbot+68b44a1597636e0b342c@syzkaller.appspotmail.com
> Cc: Eric Biggers <ebiggers@kernel.org>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
>  drivers/infiniband/core/ucma.c | 50 ++++++++++++++++++++++++++++++++--
>  1 file changed, 48 insertions(+), 2 deletions(-)
> 
> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-rc
> 
> Lets see if I told syzkaller about this properly..
> 
> EricB: If there are other rdma_cm related hits in syzkaller besides
> these 11 lets include them as  well. I wasn't able to find a way to
> search for things, this list is from your past email, thanks.
> 

Unfortunately I haven't had time to work on syzkaller bugs lately, so I can't
provide an updated list until I go through the long backlog of bugs.

A comment on the patch below:

> @@ -1112,13 +1134,17 @@ static ssize_t ucma_accept(struct ucma_file *file, const char __user *inbuf,
>  	if (cmd.conn_param.valid) {
>  		ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
>  		mutex_lock(&file->mut);
> +		mutex_lock(&ctx->mutex);
>  		ret = __rdma_accept(ctx->cm_id, &conn_param, NULL);
> +		mutex_unlock(&ctx->mutex);
>  		if (!ret)
>  			ctx->uid = cmd.uid;
>  		mutex_unlock(&file->mut);

This is nesting the new ucma_context::mutex inside the existing ucma_file::mut.

> @@ -1403,6 +1443,7 @@ static ssize_t ucma_process_join(struct ucma_file *file,
>  	if (IS_ERR(ctx))
>  		return PTR_ERR(ctx);
>  
> +	mutex_lock(&ctx->mutex);
>  	mutex_lock(&file->mut);
>  	mc = ucma_alloc_multicast(ctx);
>  	if (!mc) {

... but this is doing the opposite.  So it can deadlock.

What's the intended order?

Also, are these two separate mutexes actually needed?  I.e., did you consider
using the existing ucma_file::mut, but it didn't work or it wasn't fine-grained
enough?  (It looks like one ucma_file can have multiple ucma_contexts.)

- Eric

Jason Gunthorpe Feb. 19, 2020, 8:22 p.m. UTC | #3
On Tue, Feb 18, 2020 at 10:07:01PM -0800, Eric Biggers wrote:
> > these 11 lets include them as  well. I wasn't able to find a way to
> > search for things, this list is from your past email, thanks.
> > 
> 
> Unfortunately I haven't had time to work on syzkaller bugs lately, so I can't
> provide an updated list until I go through the long backlog of bugs.

Ok

> A comment on the patch below:
> 
> > @@ -1112,13 +1134,17 @@ static ssize_t ucma_accept(struct ucma_file *file, const char __user *inbuf,
> >  	if (cmd.conn_param.valid) {
> >  		ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
> >  		mutex_lock(&file->mut);
> > +		mutex_lock(&ctx->mutex);
> >  		ret = __rdma_accept(ctx->cm_id, &conn_param, NULL);
> > +		mutex_unlock(&ctx->mutex);
> >  		if (!ret)
> >  			ctx->uid = cmd.uid;
> >  		mutex_unlock(&file->mut);
> 
> This is nesting the new ucma_context::mutex inside the existing ucma_file::mut.

Ah, indeed
 
> > @@ -1403,6 +1443,7 @@ static ssize_t ucma_process_join(struct ucma_file *file,
> >  	if (IS_ERR(ctx))
> >  		return PTR_ERR(ctx);
> >  
> > +	mutex_lock(&ctx->mutex);
> >  	mutex_lock(&file->mut);
> >  	mc = ucma_alloc_multicast(ctx);
> >  	if (!mc) {
> 
> ... but this is doing the opposite.  So it can deadlock.

Let's narrow this one to just rdma_join_multicast(), looks safe

> What's the intended order?

The code works better if mut is the exterior lock, it seems
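
As an illustration of the ordering discussed here, the following user-space sketch takes a per-file mutex as the exterior lock and a per-context mutex as the interior lock on both paths. It assumes pthread mutexes in place of kernel mutexes; the order_* names are hypothetical and do not come from ucma.c. In the join-like path the interior lock is narrowed to the single lower-layer call, in the spirit of the reply above.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the per-file and per-context structures. */
struct order_file {
	pthread_mutex_t mut;	/* exterior lock: per-file state */
	long events;
};

struct order_ctx {
	struct order_file *file;
	pthread_mutex_t mutex;	/* interior lock: per-context state */
	long calls;
};

/* Accept-like path: exterior lock first, interior lock around the call. */
static void order_accept(struct order_ctx *ctx)
{
	pthread_mutex_lock(&ctx->file->mut);
	pthread_mutex_lock(&ctx->mutex);
	ctx->calls++;			/* stand-in for the lower-layer call */
	pthread_mutex_unlock(&ctx->mutex);
	ctx->file->events++;
	pthread_mutex_unlock(&ctx->file->mut);
}

/* Join-like path: same order, interior lock narrowed to the one call. */
static void order_join(struct order_ctx *ctx)
{
	pthread_mutex_lock(&ctx->file->mut);
	ctx->file->events++;
	pthread_mutex_lock(&ctx->mutex);
	ctx->calls++;			/* stand-in for the lower-layer call */
	pthread_mutex_unlock(&ctx->mutex);
	pthread_mutex_unlock(&ctx->file->mut);
}

#define ORDER_ITERS 10000

static void *order_accept_worker(void *arg)
{
	for (int i = 0; i < ORDER_ITERS; i++)
		order_accept(arg);
	return NULL;
}

static void *order_join_worker(void *arg)
{
	for (int i = 0; i < ORDER_ITERS; i++)
		order_join(arg);
	return NULL;
}

/* Runs both paths concurrently and returns the number of guarded calls. */
static long order_run(void)
{
	struct order_file file = { .events = 0 };
	struct order_ctx ctx = { .file = &file, .calls = 0 };
	pthread_t a, b;
	int rc;

	pthread_mutex_init(&file.mut, NULL);
	pthread_mutex_init(&ctx.mutex, NULL);
	rc = pthread_create(&a, NULL, order_accept_worker, &ctx);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, order_join_worker, &ctx);
	assert(rc == 0);
	(void)rc;
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_mutex_destroy(&ctx.mutex);
	pthread_mutex_destroy(&file.mut);
	return ctx.calls;
}
```

Since neither path ever takes the interior lock before the exterior one, the two threads cannot end up each holding one lock while waiting for the other. If order_join() instead took ctx->mutex first and file->mut second, the two paths would be exposed to exactly the inversion pointed out in the review.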
 
> Also, are these two separate mutexes actually needed?  I.e., did you consider
> using the existing ucma_file::mut, but it didn't work or it wasn't fine-grained
> enough?  (It looks like one ucma_file can have multiple ucma_contexts.)

It would probably work, but it is not as fine grained as a per-ctx
lock, and some people do care about performance with this stuff.

The file->mut is protecting some global per-fd lists it seems.

Thanks,
Jason

Jason Gunthorpe Feb. 27, 2020, 8:42 p.m. UTC | #4
On Tue, Feb 18, 2020 at 09:04:36PM +0000, Jason Gunthorpe wrote:
> The rdma_cm must be used single threaded.
> 
> This appears to be a bug in the design, as it does have lots of locking
> that seems like it should allow concurrency. However, when it is all said
> and done every single place that uses the cma_exch() scheme is broken, and
> all the unlocked reads from the ucma of the cm_id data are wrong too.
> 
> syzkaller has been finding endless bugs related to this.
> 
> Fixing this in any elegant way is some enormous amount of work. Take a
> very big hammer and put a mutex around everything to do with the
> ucma_context at the top of every syscall.
> 
> Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
> Reported-by: syzbot+adb15cf8c2798e4e0db4@syzkaller.appspotmail.com
> Reported-by: syzbot+e5579222b6a3edd96522@syzkaller.appspotmail.com
> Reported-by: syzbot+4b628fcc748474003457@syzkaller.appspotmail.com
> Reported-by: syzbot+29ee8f76017ce6cf03da@syzkaller.appspotmail.com
> Reported-by: syzbot+6956235342b7317ec564@syzkaller.appspotmail.com
> Reported-by: syzbot+b358909d8d01556b790b@syzkaller.appspotmail.com
> Reported-by: syzbot+6b46b135602a3f3ac99e@syzkaller.appspotmail.com
> Reported-by: syzbot+8458d13b13562abf6b77@syzkaller.appspotmail.com
> Reported-by: syzbot+bd034f3fdc0402e942ed@syzkaller.appspotmail.com
> Reported-by: syzbot+c92378b32760a4eef756@syzkaller.appspotmail.com
> Reported-by: syzbot+68b44a1597636e0b342c@syzkaller.appspotmail.com
> Cc: Eric Biggers <ebiggers@kernel.org>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
>  drivers/infiniband/core/ucma.c | 50 ++++++++++++++++++++++++++++++++--
>  1 file changed, 48 insertions(+), 2 deletions(-)

It has had some testing on the Mellanox test suite, so applied to
for-next.

I did not put this in -rc or cc stable since it seems like it should
have more testing

Jason

Eric Biggers March 7, 2020, 8:41 p.m. UTC | #5
On Wed, Feb 19, 2020 at 08:22:25PM +0000, Jason Gunthorpe wrote:
> On Tue, Feb 18, 2020 at 10:07:01PM -0800, Eric Biggers wrote:
> > > these 11 lets include them as  well. I wasn't able to find a way to
> > > search for things, this list is from your past email, thanks.
> > > 
> > 
> > Unfortunately I haven't had time to work on syzkaller bugs lately, so I can't
> > provide an updated list until I go through the long backlog of bugs.
> 
> Ok

Here's an updated list:

--------------------------------------------------------------------------------
Title:              general protection fault in rds_ib_add_one
Last occurred:      0 days ago
Reported:           12 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=15f96d171c64999196ac7db3de107f24b9182a8e
Original thread:    https://lore.kernel.org/lkml/000000000000b9b7d4059f4e4ac7@google.com/T/#u

This bug has a C reproducer.

The original thread for this bug has received 5 replies; the last was 5 days
ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+274094e62023782eeb17@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread, which had activity only 5 days ago.  For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at https://lore.kernel.org/r/000000000000b9b7d4059f4e4ac7@google.com

--------------------------------------------------------------------------------
Title:              INFO: trying to register non-static key in xa_destroy
Last occurred:      0 days ago
Reported:           11 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=c0a75a31c5fa84e6e5d3131fd98a5b56e2141b9a
Original thread:    https://lore.kernel.org/lkml/00000000000046895c059f5cae37@google.com/T/#u

This bug has a C reproducer.

No one has replied to the original thread for this bug yet.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+2e80962bedd9559fe0b3@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread.  For the git send-email command to use, or tips on how to reply if the
thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/00000000000046895c059f5cae37@google.com

--------------------------------------------------------------------------------
Title:              general protection fault in nldev_stat_set_doit
Last occurred:      4 days ago
Reported:           11 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=1fbcb607cf49d8b5a3c8e056971f045f9bfa34f3
Original thread:    https://lore.kernel.org/lkml/0000000000004aa34d059f5caedc@google.com/T/#u

This bug has a C reproducer.

The original thread for this bug has received 1 reply, 11 days ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+bd4af81bc51ee0283445@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread, which had activity only 11 days ago.  For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at https://lore.kernel.org/r/0000000000004aa34d059f5caedc@google.com

--------------------------------------------------------------------------------
Title:              BUG: corrupted list in _cma_attach_to_dev
Last occurred:      2 days ago
Reported:           6 days ago
Branches:           Mainline
Dashboard link:     https://syzkaller.appspot.com/bug?id=067b1e60bab1b617c1208f078cd76c9087f070e0
Original thread:    https://lore.kernel.org/lkml/000000000000cfed90059fcfdccb@google.com/T/#u

This bug has a C reproducer.

No one has replied to the original thread for this bug yet.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+06b50ee4a9bd73e8b89f@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread.  For the git send-email command to use, or tips on how to reply if the
thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000cfed90059fcfdccb@google.com

--------------------------------------------------------------------------------
Title:              WARNING: kobject bug in ib_register_device
Last occurred:      1 day ago
Reported:           12 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=805ad726feb6910e35088ae7bbe61f4125e573b7
Original thread:    https://lore.kernel.org/lkml/000000000000026ac5059f4e27f3@google.com/T/#u

This bug has a C reproducer.

No one has replied to the original thread for this bug yet.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+da615ac67d4dbea32cbc@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread.  For the git send-email command to use, or tips on how to reply if the
thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000026ac5059f4e27f3@google.com

--------------------------------------------------------------------------------
Title:              BUG: corrupted list in cma_listen_on_dev
Last occurred:      4 days ago
Reported:           4 days ago
Branches:           Mainline
Dashboard link:     https://syzkaller.appspot.com/bug?id=e8fcdea4e5a443c597c94fb6eda7d6646eafe6a2
Original thread:    https://lore.kernel.org/lkml/00000000000020c5d205a001c308@google.com/T/#u

This bug has a C reproducer.

The original thread for this bug has received 1 reply, 3 days ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+2b10b240fbbed30f10fb@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread, which had activity only 3 days ago.  For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at https://lore.kernel.org/r/00000000000020c5d205a001c308@google.com

--------------------------------------------------------------------------------
Title:              KASAN: use-after-free Read in rxe_query_port
Last occurred:      0 days ago
Reported:           6 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=f00443e97b44c466dc75edc31601110bf62a6f69
Original thread:    https://lore.kernel.org/lkml/0000000000000c9e12059fc941ff@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one has replied to the original thread for this bug yet.

I'm not confident this bug is really in the net/rdma subsystem.  I also think it
might be in the net/smc subsystem.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+e11efb687f5ab7f01f3d@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread.  For the git send-email command to use, or tips on how to reply if the
thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/0000000000000c9e12059fc941ff@google.com

--------------------------------------------------------------------------------
Title:              WARNING in ib_free_port_attrs
Last occurred:      1 day ago
Reported:           6 days ago
Branches:           net and net-next
Dashboard link:     https://syzkaller.appspot.com/bug?id=4ec089798f282f2d2c3219151e420ed1ba10120d
Original thread:    https://lore.kernel.org/lkml/000000000000460717059fd83734@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

The original thread for this bug has received 1 reply, 4 days ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+e909641b84b5bc17ad8b@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread, which had activity only 4 days ago.  For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at https://lore.kernel.org/r/000000000000460717059fd83734@google.com

--------------------------------------------------------------------------------
Title:              INFO: task hung in rdma_destroy_id
Last occurred:      3 days ago
Reported:           5 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=e89b86960c3636f57dbb16bb25a829377ebdf43d
Original thread:    https://lore.kernel.org/lkml/00000000000059e701059fe3ec2f@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one has replied to the original thread for this bug yet.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+0abbad99bee187cf63d4@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread.  For the git send-email command to use, or tips on how to reply if the
thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/00000000000059e701059fe3ec2f@google.com

--------------------------------------------------------------------------------
Title:              general protection fault in kobject_get
Last occurred:      6 days ago
Reported:           6 days ago
Branches:           net-next
Dashboard link:     https://syzkaller.appspot.com/bug?id=f8e0f99b310558dd489cc7427711a640c10b93e5
Original thread:    https://lore.kernel.org/lkml/000000000000c4b371059fd83a92@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

The original thread for this bug has received 2 replies; the last was 4 days
ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+46fe08363dbba223dec5@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread, which had activity only 4 days ago.  For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at https://lore.kernel.org/r/000000000000c4b371059fd83a92@google.com

--------------------------------------------------------------------------------
Title:              WARNING: kobject bug in add_one_compat_dev
Last occurred:      8 days ago
Reported:           10 days ago
Branches:           linux-next and net-next
Dashboard link:     https://syzkaller.appspot.com/bug?id=f8880fdc3cd0ba268421672360cf79bfa7fa4272
Original thread:    https://lore.kernel.org/lkml/0000000000005f77d6059f888f2e@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one has replied to the original thread for this bug yet.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+ab4dae63f7d310641ded@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread.  For the git send-email command to use, or tips on how to reply if the
thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/0000000000005f77d6059f888f2e@google.com

--------------------------------------------------------------------------------
Title:              WARNING in srp_remove_one
Last occurred:      9 days ago
Reported:           6 days ago
Branches:           Mainline
Dashboard link:     https://syzkaller.appspot.com/bug?id=16a5827f8f6f6ef0967e6492ffb2e2ca54c8c0fb
Original thread:    https://lore.kernel.org/lkml/000000000000144d79059fc9415d@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one has replied to the original thread for this bug yet.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+687bc62a84a6a2a3555a@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread.  For the git send-email command to use, or tips on how to reply if the
thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000144d79059fc9415d@google.com

Jason Gunthorpe March 9, 2020, 7:30 p.m. UTC | #6
On Sat, Mar 07, 2020 at 12:41:53PM -0800, Eric Biggers wrote:
> On Wed, Feb 19, 2020 at 08:22:25PM +0000, Jason Gunthorpe wrote:
> > On Tue, Feb 18, 2020 at 10:07:01PM -0800, Eric Biggers wrote:
> > > > these 11 lets include them as  well. I wasn't able to find a way to
> > > > search for things, this list is from your past email, thanks.
> > > > 
> > > 
> > > Unfortunately I haven't had time to work on syzkaller bugs lately, so I can't
> > > provide an updated list until I go through the long backlog of bugs.
> > 
> > Ok
> 
> Here's an updated list:
> 
> --------------------------------------------------------------------------------
> Title:              general protection fault in rds_ib_add_one
> Last occurred:      0 days ago
> Reported:           12 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=15f96d171c64999196ac7db3de107f24b9182a8e
> Original thread:    https://lore.kernel.org/lkml/000000000000b9b7d4059f4e4ac7@google.com/T/#u
> 
> This bug has a C reproducer.

Looks like this is fixed by Hillf

> --------------------------------------------------------------------------------
> Title:              INFO: trying to register non-static key in xa_destroy
> Last occurred:      0 days ago
> Reported:           11 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=c0a75a31c5fa84e6e5d3131fd98a5b56e2141b9a
> Original thread:    https://lore.kernel.org/lkml/00000000000046895c059f5cae37@google.com/T/#u
> 
> This bug has a C reproducer.

Fixed in v5.6-rc5

> --------------------------------------------------------------------------------
> Title:              general protection fault in nldev_stat_set_doit
> Last occurred:      4 days ago
> Reported:           11 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=1fbcb607cf49d8b5a3c8e056971f045f9bfa34f3
> Original thread:    https://lore.kernel.org/lkml/0000000000004aa34d059f5caedc@google.com/T/#u
> 
> This bug has a C reproducer.

Fixed in v5.6-rc5
 
> --------------------------------------------------------------------------------
> Title:              BUG: corrupted list in _cma_attach_to_dev
> Last occurred:      2 days ago
> Reported:           6 days ago
> Branches:           Mainline
> Dashboard link:     https://syzkaller.appspot.com/bug?id=067b1e60bab1b617c1208f078cd76c9087f070e0
> Original thread:    https://lore.kernel.org/lkml/000000000000cfed90059fcfdccb@google.com/T/#u
> 
> This bug has a C reproducer.

Most likely fixed by this patch, syzkaller is re-testing

> --------------------------------------------------------------------------------
> Title:              WARNING: kobject bug in ib_register_device
> Last occurred:      1 day ago
> Reported:           12 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=805ad726feb6910e35088ae7bbe61f4125e573b7
> Original thread:    https://lore.kernel.org/lkml/000000000000026ac5059f4e27f3@google.com/T/#u
> 
> This bug has a C reproducer.

Oh, this wasn't sent to rdma, yes, obvious rdma bug, made a patch

> --------------------------------------------------------------------------------
> Title:              BUG: corrupted list in cma_listen_on_dev
> Last occurred:      4 days ago
> Reported:           4 days ago
> Branches:           Mainline
> Dashboard link:     https://syzkaller.appspot.com/bug?id=e8fcdea4e5a443c597c94fb6eda7d6646eafe6a2
> Original thread:    https://lore.kernel.org/lkml/00000000000020c5d205a001c308@google.com/T/#u
> 
> This bug has a C reproducer.

Fixed by this patch, syzkaller confirmed, now duped to another bug

> --------------------------------------------------------------------------------
> Title:              KASAN: use-after-free Read in rxe_query_port
> Last occurred:      0 days ago
> Reported:           6 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=f00443e97b44c466dc75edc31601110bf62a6f69
> Original thread:    https://lore.kernel.org/lkml/0000000000000c9e12059fc941ff@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.

Perhaps Yanjun Zhu will look at this

> --------------------------------------------------------------------------------
> Title:              WARNING in ib_free_port_attrs
> Last occurred:      1 day ago
> Reported:           6 days ago
> Branches:           net and net-next
> Dashboard link:     https://syzkaller.appspot.com/bug?id=4ec089798f282f2d2c3219151e420ed1ba10120d
> Original thread:    https://lore.kernel.org/lkml/000000000000460717059fd83734@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.

Parav and I looked at this for a while and couldn't figure out how it
is possible. Hoping for a reproducer

> --------------------------------------------------------------------------------
> Title:              INFO: task hung in rdma_destroy_id
> Last occurred:      3 days ago
> Reported:           5 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=e89b86960c3636f57dbb16bb25a829377ebdf43d
> Original thread:    https://lore.kernel.org/lkml/00000000000059e701059fe3ec2f@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.

Most likely fixed by this patch

> --------------------------------------------------------------------------------
> Title:              general protection fault in kobject_get
> Last occurred:      6 days ago
> Reported:           6 days ago
> Branches:           net-next
> Dashboard link:     https://syzkaller.appspot.com/bug?id=f8e0f99b310558dd489cc7427711a640c10b93e5
> Original thread:    https://lore.kernel.org/lkml/000000000000c4b371059fd83a92@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.

Really surprised no reproducer, this is not a race bug. I wrote a fix,
it is being tested now.

> --------------------------------------------------------------------------------
> Title:              WARNING: kobject bug in add_one_compat_dev
> Last occurred:      8 days ago
> Reported:           10 days ago
> Branches:           linux-next and net-next
> Dashboard link:     https://syzkaller.appspot.com/bug?id=f8880fdc3cd0ba268421672360cf79bfa7fa4272
> Original thread:    https://lore.kernel.org/lkml/0000000000005f77d6059f888f2e@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.

Hmm. I wonder if this is because 'dev_set_name' failed and we ignored
it? Is that possible with this log? Let's fix that at least - I have no
other idea how we could get an empty name.

> --------------------------------------------------------------------------------
> Title:              WARNING in srp_remove_one
> Last occurred:      9 days ago
> Reported:           6 days ago
> Branches:           Mainline
> Dashboard link:     https://syzkaller.appspot.com/bug?id=16a5827f8f6f6ef0967e6492ffb2e2ca54c8c0fb
> Original thread:    https://lore.kernel.org/lkml/000000000000144d79059fc9415d@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.

This looks a lot like 'WARNING in ib_free_port_attrs' - I don't have a
clear idea how these sysfs errors are possible. I wonder if there is
something strange going on in sysfs land during net ns actions?

Thanks,
Jason

Eric Biggers June 27, 2020, 10:57 p.m. UTC | #7
On Mon, Mar 09, 2020 at 04:30:12PM -0300, Jason Gunthorpe wrote:
> On Sat, Mar 07, 2020 at 12:41:53PM -0800, Eric Biggers wrote:
> > On Wed, Feb 19, 2020 at 08:22:25PM +0000, Jason Gunthorpe wrote:
> > > On Tue, Feb 18, 2020 at 10:07:01PM -0800, Eric Biggers wrote:
> > > > > these 11 lets include them as  well. I wasn't able to find a way to
> > > > > search for things, this list is from your past email, thanks.
> > > > > 
> > > > 
> > > > Unfortunately I haven't had time to work on syzkaller bugs lately, so I can't
> > > > provide an updated list until I go through the long backlog of bugs.
> > > 
> > > Ok
> > 
> > Here's an updated list:
> > 
> > --------------------------------------------------------------------------------
> > Title:              general protection fault in rds_ib_add_one
> > Last occurred:      0 days ago
> > Reported:           12 days ago
> > Branches:           Mainline and others
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=15f96d171c64999196ac7db3de107f24b9182a8e
> > Original thread:    https://lore.kernel.org/lkml/000000000000b9b7d4059f4e4ac7@google.com/T/#u
> > 
> > This bug has a C reproducer.
> 
> Looks like this is fixed by Hillf
> 
> > --------------------------------------------------------------------------------
> > Title:              INFO: trying to register non-static key in xa_destroy
> > Last occurred:      0 days ago
> > Reported:           11 days ago
> > Branches:           Mainline and others
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=c0a75a31c5fa84e6e5d3131fd98a5b56e2141b9a
> > Original thread:    https://lore.kernel.org/lkml/00000000000046895c059f5cae37@google.com/T/#u
> > 
> > This bug has a C reproducer.
> 
> Fixed in v5.6-rc5
> 
> > --------------------------------------------------------------------------------
> > Title:              general protection fault in nldev_stat_set_doit
> > Last occurred:      4 days ago
> > Reported:           11 days ago
> > Branches:           Mainline and others
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=1fbcb607cf49d8b5a3c8e056971f045f9bfa34f3
> > Original thread:    https://lore.kernel.org/lkml/0000000000004aa34d059f5caedc@google.com/T/#u
> > 
> > This bug has a C reproducer.
> 
> Fixed in v5.6-rc5
>  
> > --------------------------------------------------------------------------------
> > Title:              BUG: corrupted list in _cma_attach_to_dev
> > Last occurred:      2 days ago
> > Reported:           6 days ago
> > Branches:           Mainline
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=067b1e60bab1b617c1208f078cd76c9087f070e0
> > Original thread:    https://lore.kernel.org/lkml/000000000000cfed90059fcfdccb@google.com/T/#u
> > 
> > This bug has a C reproducer.
> 
> Most likely fixed by this patch, syzkaller is re-testing
> 
> > --------------------------------------------------------------------------------
> > Title:              WARNING: kobject bug in ib_register_device
> > Last occurred:      1 day ago
> > Reported:           12 days ago
> > Branches:           Mainline and others
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=805ad726feb6910e35088ae7bbe61f4125e573b7
> > Original thread:    https://lore.kernel.org/lkml/000000000000026ac5059f4e27f3@google.com/T/#u
> > 
> > This bug has a C reproducer.
> 
> Oh, this wasn't sent to rdma, yes, obvious rdma bug, made a patch
> 
> > --------------------------------------------------------------------------------
> > Title:              BUG: corrupted list in cma_listen_on_dev
> > Last occurred:      4 days ago
> > Reported:           4 days ago
> > Branches:           Mainline
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=e8fcdea4e5a443c597c94fb6eda7d6646eafe6a2
> > Original thread:    https://lore.kernel.org/lkml/00000000000020c5d205a001c308@google.com/T/#u
> > 
> > This bug has a C reproducer.
> 
> Fixed by this patch, syzkaller confirmed, now duped to another bug
> 
> > --------------------------------------------------------------------------------
> > Title:              KASAN: use-after-free Read in rxe_query_port
> > Last occurred:      0 days ago
> > Reported:           6 days ago
> > Branches:           Mainline and others
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=f00443e97b44c466dc75edc31601110bf62a6f69
> > Original thread:    https://lore.kernel.org/lkml/0000000000000c9e12059fc941ff@google.com/T/#u
> > 
> > Unfortunately, this bug does not have a reproducer.
> 
> Perhaps Yanjun Zhu will look at this
> 
> > --------------------------------------------------------------------------------
> > Title:              WARNING in ib_free_port_attrs
> > Last occurred:      1 day ago
> > Reported:           6 days ago
> > Branches:           net and net-next
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=4ec089798f282f2d2c3219151e420ed1ba10120d
> > Original thread:    https://lore.kernel.org/lkml/000000000000460717059fd83734@google.com/T/#u
> > 
> > Unfortunately, this bug does not have a reproducer.
> 
> Parav and I looked at this for a while and couldn't figure out how it
> is possible. Hoping for a reproducer
> 
> > --------------------------------------------------------------------------------
> > Title:              INFO: task hung in rdma_destroy_id
> > Last occurred:      3 days ago
> > Reported:           5 days ago
> > Branches:           Mainline and others
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=e89b86960c3636f57dbb16bb25a829377ebdf43d
> > Original thread:    https://lore.kernel.org/lkml/00000000000059e701059fe3ec2f@google.com/T/#u
> > 
> > Unfortunately, this bug does not have a reproducer.
> 
> Most likely fixed by this patch
> 
> > --------------------------------------------------------------------------------
> > Title:              general protection fault in kobject_get
> > Last occurred:      6 days ago
> > Reported:           6 days ago
> > Branches:           net-next
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=f8e0f99b310558dd489cc7427711a640c10b93e5
> > Original thread:    https://lore.kernel.org/lkml/000000000000c4b371059fd83a92@google.com/T/#u
> > 
> > Unfortunately, this bug does not have a reproducer.
> 
> Really surprised no reproducer, this is not a race bug. I wrote a fix,
> it is being tested now.
> 
> > --------------------------------------------------------------------------------
> > Title:              WARNING: kobject bug in add_one_compat_dev
> > Last occurred:      8 days ago
> > Reported:           10 days ago
> > Branches:           linux-next and net-next
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=f8880fdc3cd0ba268421672360cf79bfa7fa4272
> > Original thread:    https://lore.kernel.org/lkml/0000000000005f77d6059f888f2e@google.com/T/#u
> > 
> > Unfortunately, this bug does not have a reproducer.
> 
> Hmm. I wonder if this is because 'dev_set_name' failed and we ignored
> it? Is that possible with this log? Lets fix that at least - I have no
> other idea how we could get an empty name.
> 
> > --------------------------------------------------------------------------------
> > Title:              WARNING in srp_remove_one
> > Last occurred:      9 days ago
> > Reported:           6 days ago
> > Branches:           Mainline
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=16a5827f8f6f6ef0967e6492ffb2e2ca54c8c0fb
> > Original thread:    https://lore.kernel.org/lkml/000000000000144d79059fc9415d@google.com/T/#u
> > 
> > Unfortunately, this bug does not have a reproducer.
> 
> This looks a lot like 'WARNING in ib_free_port_attrs' - I don't have a
> clear idea how these sysfs errors are possible. I wonder if there is
> something strange going on in sysfs land during net ns actions?
> 
> Thanks,
> Jason

Hi Jason, here's my latest list (updated today) of bugs that are probably in
drivers/infiniband/:

--------------------------------------------------------------------------------
Title:              general protection fault in rds_ib_add_one
Last occurred:      1 day ago
Reported:           124 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=15f96d171c64999196ac7db3de107f24b9182a8e
Original thread:    https://lore.kernel.org/lkml/000000000000b9b7d4059f4e4ac7@google.com/T/#u

This bug has a C reproducer.

The original thread for this bug received 5 replies; the last was 117 days ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+274094e62023782eeb17@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000b9b7d4059f4e4ac7@google.com

--------------------------------------------------------------------------------
Title:              WARNING in srp_remove_one
Last occurred:      0 days ago
Reported:           118 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=16a5827f8f6f6ef0967e6492ffb2e2ca54c8c0fb
Original thread:    https://lore.kernel.org/lkml/000000000000144d79059fc9415d@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+687bc62a84a6a2a3555a@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000144d79059fc9415d@google.com

--------------------------------------------------------------------------------
Title:              WARNING in ib_uverbs_remove_one
Last occurred:      0 days ago
Reported:           99 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=7d092a26c44ac45dc0a59a1a0474be064db8fa66
Original thread:    https://lore.kernel.org/lkml/000000000000c3a75205a14cb8c9@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+d3f37b9458fe8281d078@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000c3a75205a14cb8c9@google.com

--------------------------------------------------------------------------------
Title:              WARNING in ib_free_port_attrs
Last occurred:      1 day ago
Reported:           117 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=4ec089798f282f2d2c3219151e420ed1ba10120d
Original thread:    https://lore.kernel.org/lkml/000000000000460717059fd83734@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

The original thread for this bug received 1 reply, 116 days ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+e909641b84b5bc17ad8b@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000460717059fd83734@google.com

--------------------------------------------------------------------------------
Title:              WARNING in ib_umad_kill_port
Last occurred:      1 day ago
Reported:           82 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=4ecc18c71d37b62b131aee8184a642ae5d2d21a6
Original thread:    https://lore.kernel.org/lkml/00000000000075245205a2997f68@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

The original thread for this bug has received 8 replies; the last was 79 days
ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+9627a92b1f9262d5d30c@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/00000000000075245205a2997f68@google.com

--------------------------------------------------------------------------------
Title:              KASAN: use-after-free Write in addr_resolve
Last occurred:      20 days ago
Reported:           17 days ago
Branches:           Mainline
Dashboard link:     https://syzkaller.appspot.com/bug?id=e0a96faaf6799220954d5f5d8ec6fa0c386f85ac
Original thread:    https://lore.kernel.org/lkml/000000000000eb293205a7bdd19a@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one has replied to the original thread for this bug yet.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+08092148130652a6faae@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000eb293205a7bdd19a@google.com

--------------------------------------------------------------------------------
Title:              KASAN: use-after-free Read in addr_handler (2)
Last occurred:      17 days ago
Reported:           17 days ago
Branches:           Mainline
Dashboard link:     https://syzkaller.appspot.com/bug?id=cfd37bf8b5d2768b6b87e7b4c3a588a06ea6284a
Original thread:    https://lore.kernel.org/lkml/000000000000107b4605a7bdce7d@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

The original thread for this bug has received 2 replies; the last was 20 hours
ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+a929647172775e335941@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread, which had activity only 20 hours ago.  For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at https://lore.kernel.org/r/000000000000107b4605a7bdce7d@google.com

--------------------------------------------------------------------------------
Title:              KASAN: use-after-free Read in ib_uverbs_remove_one
Last occurred:      33 days ago
Reported:           30 days ago
Branches:           linux-next
Dashboard link:     https://syzkaller.appspot.com/bug?id=f1a3b9d9350867a50d642b8e2cee217569b8adca
Original thread:    https://lore.kernel.org/lkml/00000000000095442505a6b63551@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

The original thread for this bug has received 2 replies; the last was 28 days
ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+478fd0d54412b8759e0d@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/00000000000095442505a6b63551@google.com

--------------------------------------------------------------------------------
Title:              WARNING in ib_unregister_device_queued
Last occurred:      51 days ago
Reported:           62 days ago
Branches:           net
Dashboard link:     https://syzkaller.appspot.com/bug?id=979c332b27ca869bd26c337574ef068908c1da3c
Original thread:    https://lore.kernel.org/lkml/000000000000aa012505a431c7d9@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

The original thread for this bug has received 2 replies; the last was 60 days
ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+4088ed905e4ae2b0e13b@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000aa012505a431c7d9@google.com

Jason Gunthorpe June 29, 2020, 2:30 p.m. UTC | #8
On Sat, Jun 27, 2020 at 03:57:34PM -0700, Eric Biggers wrote:
 
> Hi Jason, here's my latest list (updated today) of bugs that are probably in
> drivers/infiniband/:

Thanks, lets see:
 
> Title:              general protection fault in rds_ib_add_one
> Last occurred:      1 day ago
> Reported:           124 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=15f96d171c64999196ac7db3de107f24b9182a8e
> Original thread:    https://lore.kernel.org/lkml/000000000000b9b7d4059f4e4ac7@google.com/T/#u
> 
> This bug has a C reproducer.
> 
> The original thread for this bug received 5 replies; the last was 117 days ago.
> 
> If you fix this bug, please add the following tag to the commit:
>     Reported-by: syzbot+274094e62023782eeb17@syzkaller.appspotmail.com
> 
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lore.kernel.org/r/000000000000b9b7d4059f4e4ac7@google.com

This is RDS, Hillf and Santos were handling it.. I pinged the thread

 
> Title:              WARNING in srp_remove_one
> Last occurred:      0 days ago
> Reported:           118 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=16a5827f8f6f6ef0967e6492ffb2e2ca54c8c0fb
> Original thread:    https://lore.kernel.org/lkml/000000000000144d79059fc9415d@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.
> 
> No one replied to the original thread for this bug.
> 
> If you fix this bug, please add the following tag to the commit:
>     Reported-by: syzbot+687bc62a84a6a2a3555a@syzkaller.appspotmail.com
> 
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lore.kernel.org/r/000000000000144d79059fc9415d@google.com

This is *probably* the 'sysfs problem'

I have a good guess what this is, but I really need to see a
reproduction on it before trying to fix it. I can probably make the
odds of hitting this much higher with a patch, but syzkaller would
have to chew on that patch for a while to find a reproduction..

> Title:              WARNING in ib_uverbs_remove_one
> Last occurred:      0 days ago
> Reported:           99 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=7d092a26c44ac45dc0a59a1a0474be064db8fa66
> Original thread:    https://lore.kernel.org/lkml/000000000000c3a75205a14cb8c9@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.
> 
> No one replied to the original thread for this bug.
> 
> If you fix this bug, please add the following tag to the commit:
>     Reported-by: syzbot+d3f37b9458fe8281d078@syzkaller.appspotmail.com
> 
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lore.kernel.org/r/000000000000c3a75205a14cb8c9@google.com

This is *probably* the 'sysfs problem'

> Title:              WARNING in ib_free_port_attrs
> Last occurred:      1 day ago
> Reported:           117 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=4ec089798f282f2d2c3219151e420ed1ba10120d
> Original thread:    https://lore.kernel.org/lkml/000000000000460717059fd83734@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.
> 
> The original thread for this bug received 1 reply, 116 days ago.
> 
> If you fix this bug, please add the following tag to the commit:
>     Reported-by: syzbot+e909641b84b5bc17ad8b@syzkaller.appspotmail.com
> 
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lore.kernel.org/r/000000000000460717059fd83734@google.com

This is *probably* the 'sysfs problem', but not as clear

> Title:              WARNING in ib_umad_kill_port
> Last occurred:      1 day ago
> Reported:           82 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=4ecc18c71d37b62b131aee8184a642ae5d2d21a6
> Original thread:    https://lore.kernel.org/lkml/00000000000075245205a2997f68@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.
> 
> The original thread for this bug has received 8 replies; the last was 79 days
> ago.
> 
> If you fix this bug, please add the following tag to the commit:
>     Reported-by: syzbot+9627a92b1f9262d5d30c@syzkaller.appspotmail.com
> 
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lore.kernel.org/r/00000000000075245205a2997f68@google.com

This is *probably* the 'sysfs problem'

> Title:              KASAN: use-after-free Write in addr_resolve
> Last occurred:      20 days ago
> Reported:           17 days ago
> Branches:           Mainline
> Dashboard link:     https://syzkaller.appspot.com/bug?id=e0a96faaf6799220954d5f5d8ec6fa0c386f85ac
> Original thread:    https://lore.kernel.org/lkml/000000000000eb293205a7bdd19a@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.
> 
> No one has replied to the original thread for this bug yet.
> 
> If you fix this bug, please add the following tag to the commit:
>     Reported-by: syzbot+08092148130652a6faae@syzkaller.appspotmail.com
> 
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lore.kernel.org/r/000000000000eb293205a7bdd19a@google.com

I think I have a fix for this

> Title:              KASAN: use-after-free Read in addr_handler (2)
> Last occurred:      17 days ago
> Reported:           17 days ago
> Branches:           Mainline
> Dashboard link:     https://syzkaller.appspot.com/bug?id=cfd37bf8b5d2768b6b87e7b4c3a588a06ea6284a
> Original thread:    https://lore.kernel.org/lkml/000000000000107b4605a7bdce7d@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.
> 
> The original thread for this bug has received 2 replies; the last was 20 hours
> ago.
> 
> If you fix this bug, please add the following tag to the commit:
>     Reported-by: syzbot+a929647172775e335941@syzkaller.appspotmail.com
> 
> If you send any email or patch for this bug, please reply to the original
> thread, which had activity only 20 hours ago.  For the git send-email command to
> use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
> instructions" at https://lore.kernel.org/r/000000000000107b4605a7bdce7d@google.com

Probably a dup of the above

> Title:              KASAN: use-after-free Read in ib_uverbs_remove_one
> Last occurred:      33 days ago
> Reported:           30 days ago
> Branches:           linux-next
> Dashboard link:     https://syzkaller.appspot.com/bug?id=f1a3b9d9350867a50d642b8e2cee217569b8adca
> Original thread:    https://lore.kernel.org/lkml/00000000000095442505a6b63551@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.
> 
> The original thread for this bug has received 2 replies; the last was 28 days
> ago.
> 
> If you fix this bug, please add the following tag to the commit:
>     Reported-by: syzbot+478fd0d54412b8759e0d@syzkaller.appspotmail.com
> 
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lore.kernel.org/r/00000000000095442505a6b63551@google.com

Couldn't figure it out.

> Title:              WARNING in ib_unregister_device_queued
> Last occurred:      51 days ago
> Reported:           62 days ago
> Branches:           net
> Dashboard link:     https://syzkaller.appspot.com/bug?id=979c332b27ca869bd26c337574ef068908c1da3c
> Original thread:    https://lore.kernel.org/lkml/000000000000aa012505a431c7d9@google.com/T/#u
> 
> Unfortunately, this bug does not have a reproducer.
> 
> The original thread for this bug has received 2 replies; the last was 60 days
> ago.
> 
> If you fix this bug, please add the following tag to the commit:
>     Reported-by: syzbot+4088ed905e4ae2b0e13b@syzkaller.appspotmail.com
> 
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lore.kernel.org/r/000000000000aa012505a431c7d9@google.com

Fix sent

Jason

Jason Gunthorpe Nov. 16, 2020, 8:46 p.m. UTC | #9
On Sat, Jun 27, 2020 at 03:57:34PM -0700, Eric Biggers wrote:

> Hi Jason, here's my latest list (updated today) of bugs that are probably in
> drivers/infiniband/:

I'm going to apply this patch and I think it will clean out a bunch of
the non-reproducer syzkaller bugs related to sysfs:

https://patchwork.kernel.org/project/linux-rdma/patch/0-v1-dcbfc68c4b4a+d6-virtual_dev_jgg@nvidia.com/

But I have no idea how to tell if it fixes it or not??

Any advice on how best to use syzbot here?

Thanks,
Jason

Patch
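
The hunks below all follow one pattern: each ucma handler takes a per-context mutex, calls into the rdma_cm layer, and drops the mutex before releasing the context. As a rough userspace sketch of that pattern only (pthreads stand in for the kernel mutex API, and every demo_* name here is hypothetical and not part of ucma.c):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a ucma_context: one mutex per context. */
struct demo_ctx {
	pthread_mutex_t mutex;
	long state;	/* stands in for cm_id data that is unsafe to touch concurrently */
};

/* Stand-in for an rdma_cm call: an unlocked read-modify-write. */
static void demo_cm_call(struct demo_ctx *ctx)
{
	long v = ctx->state;

	ctx->state = v + 1;
}

/* Stand-in for a ucma syscall handler: lock, call down, unlock. */
static void demo_handler(struct demo_ctx *ctx)
{
	pthread_mutex_lock(&ctx->mutex);
	demo_cm_call(ctx);
	pthread_mutex_unlock(&ctx->mutex);
}

struct demo_arg {
	struct demo_ctx *ctx;
	int iterations;
};

static void *demo_worker(void *p)
{
	struct demo_arg *arg = p;

	for (int i = 0; i < arg->iterations; i++)
		demo_handler(arg->ctx);
	return NULL;
}

#define DEMO_MAX_THREADS 16

/* Run concurrent callers against one context and return the final state. */
long run_demo(int n_threads, int iterations)
{
	struct demo_ctx ctx = { .state = 0 };
	struct demo_arg arg = { .ctx = &ctx, .iterations = iterations };
	pthread_t tids[DEMO_MAX_THREADS];
	int started = 0;

	assert(n_threads > 0 && n_threads <= DEMO_MAX_THREADS);
	pthread_mutex_init(&ctx.mutex, NULL);
	for (int i = 0; i < n_threads; i++) {
		/* Stop early if a thread cannot be created; join only what started. */
		if (pthread_create(&tids[started], NULL, demo_worker, &arg) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&ctx.mutex);
	return ctx.state;
}
```

With the lock held around every call, concurrent callers on the same context are serialized, which is the "very big hammer" the commit message describes; callers on different contexts do not contend with each other.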

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 4b72a3f7c134b2..0e8846ab86b5b6 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -91,6 +91,7 @@  struct ucma_context {
 
 	struct ucma_file	*file;
 	struct rdma_cm_id	*cm_id;
+	struct mutex		mutex;
 	u64			uid;
 
 	struct list_head	list;
@@ -216,6 +217,7 @@  static struct ucma_context *ucma_alloc_ctx(struct ucma_file *file)
 	init_completion(&ctx->comp);
 	INIT_LIST_HEAD(&ctx->mc_list);
 	ctx->file = file;
+	mutex_init(&ctx->mutex);
 
 	if (xa_alloc(&ctx_table, &ctx->id, ctx, xa_limit_32b, GFP_KERNEL))
 		goto error;
@@ -589,6 +591,7 @@  static int ucma_free_ctx(struct ucma_context *ctx)
 	}
 
 	events_reported = ctx->events_reported;
+	mutex_destroy(&ctx->mutex);
 	kfree(ctx);
 	return events_reported;
 }
@@ -658,7 +661,10 @@  static ssize_t ucma_bind_ip(struct ucma_file *file, const char __user *inbuf,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	ret = rdma_bind_addr(ctx->cm_id, (struct sockaddr *) &cmd.addr);
+	mutex_unlock(&ctx->mutex);
+
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -681,7 +687,9 @@  static ssize_t ucma_bind(struct ucma_file *file, const char __user *inbuf,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	ret = rdma_bind_addr(ctx->cm_id, (struct sockaddr *) &cmd.addr);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -705,8 +713,10 @@  static ssize_t ucma_resolve_ip(struct ucma_file *file,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	ret = rdma_resolve_addr(ctx->cm_id, (struct sockaddr *) &cmd.src_addr,
 				(struct sockaddr *) &cmd.dst_addr, cmd.timeout_ms);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -731,8 +741,10 @@  static ssize_t ucma_resolve_addr(struct ucma_file *file,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	ret = rdma_resolve_addr(ctx->cm_id, (struct sockaddr *) &cmd.src_addr,
 				(struct sockaddr *) &cmd.dst_addr, cmd.timeout_ms);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -752,7 +764,9 @@  static ssize_t ucma_resolve_route(struct ucma_file *file,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	ret = rdma_resolve_route(ctx->cm_id, cmd.timeout_ms);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -841,6 +855,7 @@  static ssize_t ucma_query_route(struct ucma_file *file,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	memset(&resp, 0, sizeof resp);
 	addr = (struct sockaddr *) &ctx->cm_id->route.addr.src_addr;
 	memcpy(&resp.src_addr, addr, addr->sa_family == AF_INET ?
@@ -864,6 +879,7 @@  static ssize_t ucma_query_route(struct ucma_file *file,
 		ucma_copy_iw_route(&resp, &ctx->cm_id->route);
 
 out:
+	mutex_unlock(&ctx->mutex);
 	if (copy_to_user(u64_to_user_ptr(cmd.response),
 			 &resp, sizeof(resp)))
 		ret = -EFAULT;
@@ -1014,6 +1030,7 @@  static ssize_t ucma_query(struct ucma_file *file,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	switch (cmd.option) {
 	case RDMA_USER_CM_QUERY_ADDR:
 		ret = ucma_query_addr(ctx, response, out_len);
@@ -1028,6 +1045,7 @@  static ssize_t ucma_query(struct ucma_file *file,
 		ret = -ENOSYS;
 		break;
 	}
+	mutex_unlock(&ctx->mutex);
 
 	ucma_put_ctx(ctx);
 	return ret;
@@ -1068,7 +1086,9 @@  static ssize_t ucma_connect(struct ucma_file *file, const char __user *inbuf,
 		return PTR_ERR(ctx);
 
 	ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
+	mutex_lock(&ctx->mutex);
 	ret = rdma_connect(ctx->cm_id, &conn_param);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -1089,7 +1109,9 @@  static ssize_t ucma_listen(struct ucma_file *file, const char __user *inbuf,
 
 	ctx->backlog = cmd.backlog > 0 && cmd.backlog < max_backlog ?
 		       cmd.backlog : max_backlog;
+	mutex_lock(&ctx->mutex);
 	ret = rdma_listen(ctx->cm_id, ctx->backlog);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -1112,13 +1134,17 @@  static ssize_t ucma_accept(struct ucma_file *file, const char __user *inbuf,
 	if (cmd.conn_param.valid) {
 		ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
 		mutex_lock(&file->mut);
+		mutex_lock(&ctx->mutex);
 		ret = __rdma_accept(ctx->cm_id, &conn_param, NULL);
+		mutex_unlock(&ctx->mutex);
 		if (!ret)
 			ctx->uid = cmd.uid;
 		mutex_unlock(&file->mut);
-	} else
+	} else {
+		mutex_lock(&ctx->mutex);
 		ret = __rdma_accept(ctx->cm_id, NULL, NULL);
-
+		mutex_unlock(&ctx->mutex);
+	}
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -1137,7 +1163,9 @@  static ssize_t ucma_reject(struct ucma_file *file, const char __user *inbuf,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	ret = rdma_reject(ctx->cm_id, cmd.private_data, cmd.private_data_len);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -1156,7 +1184,9 @@  static ssize_t ucma_disconnect(struct ucma_file *file, const char __user *inbuf,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	ret = rdma_disconnect(ctx->cm_id);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -1187,7 +1217,9 @@  static ssize_t ucma_init_qp_attr(struct ucma_file *file,
 	resp.qp_attr_mask = 0;
 	memset(&qp_attr, 0, sizeof qp_attr);
 	qp_attr.qp_state = cmd.qp_state;
+	mutex_lock(&ctx->mutex);
 	ret = rdma_init_qp_attr(ctx->cm_id, &qp_attr, &resp.qp_attr_mask);
+	mutex_unlock(&ctx->mutex);
 	if (ret)
 		goto out;
 
@@ -1273,9 +1305,13 @@  static int ucma_set_ib_path(struct ucma_context *ctx,
 		struct sa_path_rec opa;
 
 		sa_convert_path_ib_to_opa(&opa, &sa_path);
+		mutex_lock(&ctx->mutex);
 		ret = rdma_set_ib_path(ctx->cm_id, &opa);
+		mutex_unlock(&ctx->mutex);
 	} else {
+		mutex_lock(&ctx->mutex);
 		ret = rdma_set_ib_path(ctx->cm_id, &sa_path);
+		mutex_unlock(&ctx->mutex);
 	}
 	if (ret)
 		return ret;
@@ -1308,7 +1344,9 @@  static int ucma_set_option_level(struct ucma_context *ctx, int level,
 
 	switch (level) {
 	case RDMA_OPTION_ID:
+		mutex_lock(&ctx->mutex);
 		ret = ucma_set_option_id(ctx, optname, optval, optlen);
+		mutex_unlock(&ctx->mutex);
 		break;
 	case RDMA_OPTION_IB:
 		ret = ucma_set_option_ib(ctx, optname, optval, optlen);
@@ -1368,8 +1406,10 @@  static ssize_t ucma_notify(struct ucma_file *file, const char __user *inbuf,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	if (ctx->cm_id->device)
 		ret = rdma_notify(ctx->cm_id, (enum ib_event_type)cmd.event);
+	mutex_unlock(&ctx->mutex);
 
 	ucma_put_ctx(ctx);
 	return ret;
@@ -1403,6 +1443,7 @@  static ssize_t ucma_process_join(struct ucma_file *file,
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	mutex_lock(&ctx->mutex);
 	mutex_lock(&file->mut);
 	mc = ucma_alloc_multicast(ctx);
 	if (!mc) {
@@ -1427,6 +1468,7 @@  static ssize_t ucma_process_join(struct ucma_file *file,
 	xa_store(&multicast_table, mc->id, mc, 0);
 
 	mutex_unlock(&file->mut);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return 0;
 
@@ -1439,6 +1481,7 @@  static ssize_t ucma_process_join(struct ucma_file *file,
 	kfree(mc);
 err1:
 	mutex_unlock(&file->mut);
+	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
 }
@@ -1513,7 +1556,10 @@  static ssize_t ucma_leave_multicast(struct ucma_file *file,
 		goto out;
 	}
 
+	mutex_lock(&mc->ctx->mutex);
 	rdma_leave_multicast(mc->ctx->cm_id, (struct sockaddr *) &mc->addr);
+	mutex_unlock(&mc->ctx->mutex);
+
 	mutex_lock(&mc->ctx->file->mut);
 	ucma_cleanup_mc_events(mc);
 	list_del(&mc->list);