[0/15,v2] loop: Fix oops and possible deadlocks

Message ID 20181010100415.26525-1-jack@suse.cz

Jan Kara Oct. 10, 2018, 10:04 a.m. UTC
Hi,

this patch series fixes oops and possible deadlocks as reported by syzbot [1]
[2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
patches are cleaning up the locking in the loop driver so that we can in the
end reasonably easily switch to rereading partitions without holding mutex
protecting the loop device.

I have lightly tested the patches by creating, deleting, and modifying loop
devices but if there's some more comprehensive loopback device testsuite, I
can try running it. Review is welcome!

Changes since v1:
* Added patch moving fput() calls in loop_change_fd() from under loop_ctl_mutex
* Fixed bug in loop_control_ioctl() where it failed to return error properly

								Honza

[1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a3
[2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889

Comments

Tetsuo Handa Oct. 10, 2018, 10:19 a.m. UTC | #1
On 2018/10/10 19:04, Jan Kara wrote:
> Hi,
> 
> this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> patches are cleaning up the locking in the loop driver so that we can in the
> end reasonably easily switch to rereading partitions without holding mutex
> protecting the loop device.
> 
> I have lightly tested the patches by creating, deleting, and modifying loop
> devices but if there's some more comprehensive loopback device testsuite, I
> can try running it. Review is welcome!

Testing on linux-next by syzbot will be the most comprehensive. ;-)

> 
> Changes since v1:
> * Added patch moving fput() calls in loop_change_fd() from under loop_ctl_mutex
> * Fixed bug in loop_control_ioctl() where it failed to return error properly
> 
> 								Honza
> 
> [1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a3
> [2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889
>
Johannes Thumshirn Oct. 10, 2018, 11:42 a.m. UTC | #2
On Wed, Oct 10, 2018 at 07:19:00PM +0900, Tetsuo Handa wrote:
> On 2018/10/10 19:04, Jan Kara wrote:
> > Hi,
> > 
> > this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> > [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> > patches are cleaning up the locking in the loop driver so that we can in the
> > end reasonably easily switch to rereading partitions without holding mutex
> > protecting the loop device.
> > 
> > I have lightly tested the patches by creating, deleting, and modifying loop
> > devices but if there's some more comprehensive loopback device testsuite, I
> > can try running it. Review is welcome!
> 
> Testing on linux-next by syzbot will be the most comprehensive. ;-)

Apart from that blktests has a loop category and I think it could also be
worthwhile to add the C reproducer from syzkaller to blktests.

Byte,
	Johannes
Jan Kara Oct. 10, 2018, 12:28 p.m. UTC | #3
On Wed 10-10-18 13:42:27, Johannes Thumshirn wrote:
> On Wed, Oct 10, 2018 at 07:19:00PM +0900, Tetsuo Handa wrote:
> > On 2018/10/10 19:04, Jan Kara wrote:
> > > Hi,
> > > 
> > > this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> > > [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> > > patches are cleaning up the locking in the loop driver so that we can in the
> > > end reasonably easily switch to rereading partitions without holding mutex
> > > protecting the loop device.
> > > 
> > > I have lightly tested the patches by creating, deleting, and modifying loop
> > > devices but if there's some more comprehensive loopback device testsuite, I
> > > can try running it. Review is welcome!
> > 
> > Testing on linux-next by syzbot will be the most comprehensive. ;-)
> 
> Apart from that blktests has a loop category and I think it could also be
> worthwhile to add the C reproducer from syzkaller to blktests.

Yeah, I did run loop tests now and they ran fine. I can try converting the
syzbot reproducers into something legible but it will take a while.

								Honza
Johannes Thumshirn Oct. 10, 2018, 12:43 p.m. UTC | #4
On Wed, Oct 10, 2018 at 02:28:09PM +0200, Jan Kara wrote:
> On Wed 10-10-18 13:42:27, Johannes Thumshirn wrote:
> > On Wed, Oct 10, 2018 at 07:19:00PM +0900, Tetsuo Handa wrote:
> > > On 2018/10/10 19:04, Jan Kara wrote:
> > > > Hi,
> > > > 
> > > > this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> > > > [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> > > > patches are cleaning up the locking in the loop driver so that we can in the
> > > > end reasonably easily switch to rereading partitions without holding mutex
> > > > protecting the loop device.
> > > > 
> > > > I have lightly tested the patches by creating, deleting, and modifying loop
> > > > devices but if there's some more comprehensive loopback device testsuite, I
> > > > can try running it. Review is welcome!
> > > 
> > > Testing on linux-next by syzbot will be the most comprehensive. ;-)
> > 
> > Apart from that blktests has a loop category and I think it could also be
> > worthwhile to add the C reproducer from syzkaller to blktests.
> 
> Yeah, I did run loop tests now and they ran fine. I can try converting the
> syzbot reproducers into something legible but it will take a while.

There is one C reproducer which can be used (it just needs minor
modifications to pass in the device instead of loop0).

See for instance blktests/src/sg/syzkaller1.c
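
A minimal sketch of the kind of modification meant here, written in Python rather than the reproducer's C: take the device path from the command line and fall back to loop0 only when none is given. The helper name `device_from_argv` is hypothetical and is not part of blktests or syzkaller.

```python
import sys

# The device the reproducer is said to use when nothing else is passed in.
DEFAULT_DEVICE = "/dev/loop0"


def device_from_argv(argv, default=DEFAULT_DEVICE):
    """Return the device path given as the first argument, else the default."""
    if len(argv) > 1 and argv[1]:
        return argv[1]
    return default


if __name__ == "__main__":
    # Print the device the test would operate on.
    print(device_from_argv(sys.argv))
```

A test harness could then pass in whichever loop device it allocated for the run instead of relying on loop0 being free.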
Jan Kara Oct. 16, 2018, 11:36 a.m. UTC | #5
On Wed 10-10-18 14:28:09, Jan Kara wrote:
> On Wed 10-10-18 13:42:27, Johannes Thumshirn wrote:
> > On Wed, Oct 10, 2018 at 07:19:00PM +0900, Tetsuo Handa wrote:
> > > On 2018/10/10 19:04, Jan Kara wrote:
> > > > Hi,
> > > > 
> > > > this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> > > > [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> > > > patches are cleaning up the locking in the loop driver so that we can in the
> > > > end reasonably easily switch to rereading partitions without holding mutex
> > > > protecting the loop device.
> > > > 
> > > > I have lightly tested the patches by creating, deleting, and modifying loop
> > > > devices but if there's some more comprehensive loopback device testsuite, I
> > > > can try running it. Review is welcome!
> > > 
> > > Testing on linux-next by syzbot will be the most comprehensive. ;-)
> > 
> > Apart from that blktests has a loop category and I think it could also be
> > worthwhile to add the C reproducer from syzkaller to blktests.
> 
> Yeah, I did run loop tests now and they ran fine. I can try converting the
> syzbot reproducers into something legible but it will take a while.

So I took a stab at this. But I hit two issues:

1) For the reproducer triggering the lockdep warning, you need a 32-bit
binary (so that it uses compat_ioctl). I don't think we want to introduce
a 32-bit devel environment dependency to blktests. With 64 bits, the problem
is also there but someone noticed and silenced lockdep (for a reason that
I consider incorrect)... I think the test is still worth it though, as
I'll remove the lockdep-fooling code in my patches and thus new breakage
will be noticed.

2) For the oops (use-after-free) issue, I was not able to reproduce it in
my test KVM in a couple of hours. The race window is rather narrow and syzbot
with KASAN and everything hit it only 11 times. So I'm not sure how useful
that test is. Any opinions?
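
Regarding point 1, a rough sketch of how a test could tell whether it is running as a 32-bit process on a 64-bit kernel, which is the case where ioctls go through the compat path. The function name and the list of 64-bit machine names are assumptions for illustration. Note that `platform.machine()` may report a 32-bit name if the process runs under a 32-bit personality.

```python
import platform
import struct

# Assumption: an illustrative, incomplete set of 64-bit machine names.
SIXTY_FOUR_BIT_MACHINES = {
    "x86_64", "aarch64", "ppc64", "ppc64le", "s390x", "riscv64",
}


def takes_compat_path(pointer_bytes, kernel_machine):
    """True when a 32-bit process runs on a kernel reporting a 64-bit machine."""
    return pointer_bytes == 4 and kernel_machine in SIXTY_FOUR_BIT_MACHINES


if __name__ == "__main__":
    # struct.calcsize("P") is the pointer size of the running interpreter.
    print(takes_compat_path(struct.calcsize("P"), platform.machine()))
```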

								Honza
Johannes Thumshirn Oct. 16, 2018, 12:04 p.m. UTC | #6
On 16/10/18 13:36, Jan Kara wrote:
[...]
> 2) For the oops (use-after-free) issue, I was not able to reproduce it in
> my test KVM in a couple of hours. The race window is rather narrow and syzbot
> with KASAN and everything hit it only 11 times. So I'm not sure how useful
> that test is. Any opinions?

Personally I think a test that has varying outcomes depending on how
often you run it (just to hit the race) isn't really suitable for a
suite like blktests.

But that's my personal opinion only, Omar what's your opinion here?
Omar Sandoval Oct. 16, 2018, 6:16 p.m. UTC | #7
On Tue, Oct 16, 2018 at 01:36:54PM +0200, Jan Kara wrote:
> On Wed 10-10-18 14:28:09, Jan Kara wrote:
> > On Wed 10-10-18 13:42:27, Johannes Thumshirn wrote:
> > > On Wed, Oct 10, 2018 at 07:19:00PM +0900, Tetsuo Handa wrote:
> > > > On 2018/10/10 19:04, Jan Kara wrote:
> > > > > Hi,
> > > > > 
> > > > > this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> > > > > [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> > > > > patches are cleaning up the locking in the loop driver so that we can in the
> > > > > end reasonably easily switch to rereading partitions without holding mutex
> > > > > protecting the loop device.
> > > > > 
> > > > > I have lightly tested the patches by creating, deleting, and modifying loop
> > > > > devices but if there's some more comprehensive loopback device testsuite, I
> > > > > can try running it. Review is welcome!
> > > > 
> > > > Testing on linux-next by syzbot will be the most comprehensive. ;-)
> > > 
> > > Apart from that blktests has a loop category and I think it could also be
> > > worthwhile to add the C reproducer from syzkaller to blktests.
> > 
> > Yeah, I did run loop tests now and they ran fine. I can try converting the
> > syzbot reproducers into something legible but it will take a while.
> 
> So I took a stab at this. But I hit two issues:
> 
> 1) For the reproducer triggering the lockdep warning, you need a 32-bit
> binary (so that it uses compat_ioctl). I don't think we want to introduce
> a 32-bit devel environment dependency to blktests. With 64 bits, the problem
> is also there but someone noticed and silenced lockdep (for a reason that
> I consider incorrect)... I think the test is still worth it though, as
> I'll remove the lockdep-fooling code in my patches and thus new breakage
> will be noticed.

Agreed, even if it doesn't trigger lockdep now, it's a good regression
test.

> 2) For the oops (use-after-free) issue, I was not able to reproduce it in
> my test KVM in a couple of hours. The race window is rather narrow and syzbot
> with KASAN and everything hit it only 11 times. So I'm not sure how useful
> that test is. Any opinions?

I'd say we should add it anyways. If anything, it's a smoke test for
changing fds on a loop device. You could add a note that the race it's
testing for is very narrow.
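
A minimal sketch of what such a smoke test could look like, assuming the ioctl numbers from <linux/loop.h> and two equally sized backing files. The function names are hypothetical. The backing files are opened read-only because, as far as I understand loop_change_fd(), LOOP_CHANGE_FD is only accepted on a read-only loop device and the new file must have the same size as the old one. This is single-threaded, so it only exercises the ioctl path and does not try to hit the race.

```python
import fcntl
import os
import sys
import tempfile

# ioctl request numbers from <linux/loop.h>
LOOP_SET_FD = 0x4C00
LOOP_CLR_FD = 0x4C01
LOOP_CHANGE_FD = 0x4C06


def make_backing_files(directory, size=1 << 20):
    """Create two sparse files of identical size and return their paths."""
    paths = []
    for name in ("a.img", "b.img"):
        path = os.path.join(directory, name)
        with open(path, "wb") as f:
            f.truncate(size)
        paths.append(path)
    return paths


def change_fd_smoke(loop_path, path_a, path_b, rounds=100):
    """Attach path_a, then swap the backing file repeatedly. Needs root."""
    loop_fd = os.open(loop_path, os.O_RDWR)
    fd_a = os.open(path_a, os.O_RDONLY)
    fd_b = os.open(path_b, os.O_RDONLY)
    try:
        fcntl.ioctl(loop_fd, LOOP_SET_FD, fd_a)
        try:
            for i in range(rounds):
                # Alternate between the two backing files.
                target = fd_b if i % 2 == 0 else fd_a
                fcntl.ioctl(loop_fd, LOOP_CHANGE_FD, target)
        finally:
            # Always detach so the device is left unbound.
            fcntl.ioctl(loop_fd, LOOP_CLR_FD, 0)
    finally:
        for fd in (fd_b, fd_a, loop_fd):
            os.close(fd)


if __name__ == "__main__":
    if len(sys.argv) == 2 and os.geteuid() == 0:
        with tempfile.TemporaryDirectory() as tmp:
            a, b = make_backing_files(tmp)
            change_fd_smoke(sys.argv[1], a, b)
        print("done")
    else:
        print("usage (as root): change_fd_smoke.py /dev/loopN")
```

The loop device passed on the command line is assumed to be unbound before the run.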
Jan Kara Oct. 17, 2018, 9:47 a.m. UTC | #8
On Tue 16-10-18 11:16:22, Omar Sandoval wrote:
> On Tue, Oct 16, 2018 at 01:36:54PM +0200, Jan Kara wrote:
> > On Wed 10-10-18 14:28:09, Jan Kara wrote:
> > > On Wed 10-10-18 13:42:27, Johannes Thumshirn wrote:
> > > > On Wed, Oct 10, 2018 at 07:19:00PM +0900, Tetsuo Handa wrote:
> > > > > On 2018/10/10 19:04, Jan Kara wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> > > > > > [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> > > > > > patches are cleaning up the locking in the loop driver so that we can in the
> > > > > > end reasonably easily switch to rereading partitions without holding mutex
> > > > > > protecting the loop device.
> > > > > > 
> > > > > > I have lightly tested the patches by creating, deleting, and modifying loop
> > > > > > devices but if there's some more comprehensive loopback device testsuite, I
> > > > > > can try running it. Review is welcome!
> > > > > 
> > > > > Testing on linux-next by syzbot will be the most comprehensive. ;-)
> > > > 
> > > > Apart from that blktests has a loop category and I think it could also be
> > > > worthwhile to add the C reproducer from syzkaller to blktests.
> > > 
> > > Yeah, I did run loop tests now and they ran fine. I can try converting the
> > > syzbot reproducers into something legible but it will take a while.
> > 
> > So I took a stab at this. But I hit two issues:
> > 
> > 1) For the reproducer triggering the lockdep warning, you need a 32-bit
> > binary (so that it uses compat_ioctl). I don't think we want to introduce
> > a 32-bit devel environment dependency to blktests. With 64 bits, the problem
> > is also there but someone noticed and silenced lockdep (for a reason that
> > I consider incorrect)... I think the test is still worth it though, as
> > I'll remove the lockdep-fooling code in my patches and thus new breakage
> > will be noticed.
> 
> Agreed, even if it doesn't trigger lockdep now, it's a good regression
> test.
> 
> > 2) For the oops (use-after-free) issue, I was not able to reproduce it in
> > my test KVM in a couple of hours. The race window is rather narrow and syzbot
> > with KASAN and everything hit it only 11 times. So I'm not sure how useful
> > that test is. Any opinions?
> 
> I'd say we should add it anyways. If anything, it's a smoke test for
> changing fds on a loop device. You could add a note that the race it's
> testing for is very narrow.

OK, I'll post the patches later today.

								Honza