mbox series

[0/16,v3] loop: Fix oops and possible deadlocks

Message ID 20181108130116.12140-1-jack@suse.cz (mailing list archive)
Headers show
Series loop: Fix oops and possible deadlocks | expand

Message

Jan Kara Nov. 8, 2018, 1:01 p.m. UTC
Hi,

this patch series fixes oops and possible deadlocks as reported by syzbot [1]
[2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
patches are cleaning up the locking in the loop driver so that we can in the
end reasonably easily switch to rereading partitions without holding mutex
protecting the loop device.

I have tested the patches by creating, deleting, modifying loop devices, and by
running loop blktests (as well as creating new ones with the load syzkaller has
used to detect the problem). Review is welcome but I think the patches are fine
to go as far as I'm concerned! Jens, can you please pick them up?

Changes since v1:
* Added patch moving fput() calls in loop_change_fd() from under loop_ctl_mutex
* Fixed bug in loop_control_ioctl() where it failed to return error properly

Changes since v2:
* Rebase on top of 4.20-rc1
* Add patch to stop fooling lockdep about loop_ctl_mutex

								Honza

[1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a3
[2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889

Comments

Jens Axboe Nov. 8, 2018, 1:21 p.m. UTC | #1
On 11/8/18 6:01 AM, Jan Kara wrote:
> Hi,
> 
> this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> patches are cleaning up the locking in the loop driver so that we can in the
> end reasonably easily switch to rereading partitions without holding mutex
> protecting the loop device.
> 
> I have tested the patches by creating, deleting, modifying loop devices, and by
> running loop blktests (as well as creating new ones with the load syzkaller has
> used to detect the problem). Review is welcome but I think the patches are fine
> to go as far as I'm concerned! Jens, can you please pick them up?

I've been waiting for this series - it's a big series for 4.20, though... It
would suck having to defer this to 4.21 since it's a long standing issue, but
the risk of a new regression is present as well.

Are you fine with this going in for 4.21?
Jan Kara Nov. 8, 2018, 1:25 p.m. UTC | #2
On Thu 08-11-18 06:21:21, Jens Axboe wrote:
> On 11/8/18 6:01 AM, Jan Kara wrote:
> > Hi,
> > 
> > this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> > [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> > patches are cleaning up the locking in the loop driver so that we can in the
> > end reasonably easily switch to rereading partitions without holding mutex
> > protecting the loop device.
> > 
> > I have tested the patches by creating, deleting, modifying loop devices, and by
> > running loop blktests (as well as creating new ones with the load syzkaller has
> > used to detect the problem). Review is welcome but I think the patches are fine
> > to go as far as I'm concerned! Jens, can you please pick them up?
> 
> I've been waiting for this series - it's a big series for 4.20, though... It
> would suck having to defer this to 4.21 since it's a long standing issue, but
> the risk of a new regression is present as well.
> 
> Are you fine with this going in for 4.21?

Yeah, I'm fine with the delay so that it has time to soak in linux-next.

								Honza
Jens Axboe Nov. 8, 2018, 1:31 p.m. UTC | #3
On 11/8/18 6:25 AM, Jan Kara wrote:
> On Thu 08-11-18 06:21:21, Jens Axboe wrote:
>> On 11/8/18 6:01 AM, Jan Kara wrote:
>>> Hi,
>>>
>>> this patch series fixes oops and possible deadlocks as reported by syzbot [1]
>>> [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
>>> patches are cleaning up the locking in the loop driver so that we can in the
>>> end reasonably easily switch to rereading partitions without holding mutex
>>> protecting the loop device.
>>>
>>> I have tested the patches by creating, deleting, modifying loop devices, and by
>>> running loop blktests (as well as creating new ones with the load syzkaller has
>>> used to detect the problem). Review is welcome but I think the patches are fine
>>> to go as far as I'm concerned! Jens, can you please pick them up?
>>
>> I've been waiting for this series - it's a big series for 4.20, though... It
>> would suck having to defer this to 4.21 since it's a long standing issue, but
>> the risk of a new regression is present as well.
>>
>> Are you fine with this going in for 4.21?
> 
> Yeah, I'm fine with the delay so that it has time to soak in linux-next.

Sounds good - thanks for getting this work done, much appreciated.
I have applied it for 4.21.
Theodore Ts'o Nov. 8, 2018, 9:28 p.m. UTC | #4
On Thu, Nov 08, 2018 at 02:01:00PM +0100, Jan Kara wrote:
> Hi,
> 
> this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> patches are cleaning up the locking in the loop driver so that we can in the
> end reasonably easily switch to rereading partitions without holding mutex
> protecting the loop device.
> 
> I have tested the patches by creating, deleting, modifying loop devices, and by
> running loop blktests (as well as creating new ones with the load syzkaller has
> used to detect the problem). Review is welcome but I think the patches are fine
> to go as far as I'm concerned! Jens, can you please pick them up?
> 
> Changes since v1:
> * Added patch moving fput() calls in loop_change_fd() from under loop_ctl_mutex
> * Fixed bug in loop_control_ioctl() where it failed to return error properly
> 
> Changes since v2:
> * Rebase on top of 4.20-rc1
> * Add patch to stop fooling lockdep about loop_ctl_mutex
> 
> 								Honza

Thanks for working on fixing up the Loop driver to fix these races!

Is it worth adding some Cc: stable@kernel.org lines?  Figuring out
which Fixes they should apply to might be tricky, and from my
experience because of some of the recent loop work, backporting to
older stable kernels is not necessarily going to be trivial.  But
since Dmitry also runs Syzkaller on stable kernels, it'd be great if
we could get them backported without relying on Sasha's AUTOSTABLE.

   	     	  	     	     	     - Ted
Jan Kara Nov. 12, 2018, 10:15 a.m. UTC | #5
On Thu 08-11-18 16:28:11, Theodore Y. Ts'o wrote:
> On Thu, Nov 08, 2018 at 02:01:00PM +0100, Jan Kara wrote:
> > Hi,
> > 
> > this patch series fixes oops and possible deadlocks as reported by syzbot [1]
> > [2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
> > patches are cleaning up the locking in the loop driver so that we can in the
> > end reasonably easily switch to rereading partitions without holding mutex
> > protecting the loop device.
> > 
> > I have tested the patches by creating, deleting, modifying loop devices, and by
> > running loop blktests (as well as creating new ones with the load syzkaller has
> > used to detect the problem). Review is welcome but I think the patches are fine
> > to go as far as I'm concerned! Jens, can you please pick them up?
> > 
> > Changes since v1:
> > * Added patch moving fput() calls in loop_change_fd() from under loop_ctl_mutex
> > * Fixed bug in loop_control_ioctl() where it failed to return error properly
> > 
> > Changes since v2:
> > * Rebase on top of 4.20-rc1
> > * Add patch to stop fooling lockdep about loop_ctl_mutex
> > 
> > 								Honza
> 
> Thanks for working on fixing up the Loop driver to fix these races!
> 
> Is it worth adding some Cc: stable@kernel.org lines?  Figuring out
> which Fixes they should apply to might be tricky, and from my
> experience because of some of the recent loop work, backporting to
> older stable kernels is not necessarily going to be trivial.  But
> since Dmitry also runs Syzkaller on stable kernels, it'd be great if
> we could get them backported without relying on Sasha's AUTOSTABLE.

That's a fair request but generally I've found this too intrusive for
stable.  The patch 2/16 should be relatively easy to backport and closes
the possible use-after-free which is the nasties of the problems (but also
so rare that I was never able to hit it in my testing and syzbot hit it
only couple of times todate). So there CC to stable might make sense.  The
rest fixes possible deadlocks and they are possible to trigger only by root
bashing reconfiguration of loop devices - IMO not worth the hassle for
stable.

								Honza