[v2,0/7] bdi: fix use-after-free for bdi device

Message ID 20200226111851.55348-1-yuyufen@huawei.com (mailing list archive)

Message

Yufen Yu Feb. 26, 2020, 11:18 a.m. UTC
Hi, all 

We have reported a use-after-free crash for the bdi device in
__blkg_prfill_rwstat() (see Patch #3). The bug is caused by printing the
device's kobj->name after the device and kobj->name have been freed by
bdi_unregister().

In fact, commit 68f23b8906 "memcg: fix a crash in wb_workfn when
a device disappears" tried to address the issue, but the code
is still somewhat racy after that commit.

In this patchset, we try to protect the device lifetime with RCU, so that
the device is not freed while others are still using it.

Another possible fix is to copy the device name into separate
memory (as discussed in [0]), but that also needs lock protection.

[0] https://lore.kernel.org/linux-block/20200219125505.GP16121@quack2.suse.cz/

V1:
  https://www.spinics.net/lists/linux-block/msg49693.html
  Add a new spinlock and copy kobj->name into the caller's buffer,
  or use synchronize_rcu() to wait until readers complete.
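
The V1 idea of copying the name into a caller buffer under a lock can be illustrated with a minimal userspace sketch. This is not the code from the patch: `struct fake_bdi`, the `fake_*` helpers, the `atomic_flag` spin loop standing in for a kernel spinlock, and the "(unknown)" fallback string are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a bdi; not the kernel structure. */
struct fake_bdi {
	atomic_flag lock;      /* stand-in for a spinlock */
	const char *dev_name;  /* NULL once the device is unregistered */
};

static void fake_lock(struct fake_bdi *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		;  /* spin until the holder clears the flag */
}

static void fake_unlock(struct fake_bdi *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/*
 * Copy the name into the caller's buffer while holding the lock.
 * After this returns, the caller only reads its own buffer and never
 * dereferences memory owned by the device.
 */
static void fake_bdi_copy_name(struct fake_bdi *b, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	fake_lock(b);
	strncpy(buf, b->dev_name ? b->dev_name : "(unknown)", len - 1);
	buf[len - 1] = '\0';
	fake_unlock(b);
}

/* Unregister clears the name under the same lock the readers take. */
static void fake_bdi_unregister(struct fake_bdi *b)
{
	fake_lock(b);
	b->dev_name = NULL;
	fake_unlock(b);
}
```

In this sketch a reader that runs after `fake_bdi_unregister()` gets the fallback string instead of a dangling pointer. The cost is a lock acquisition and a copy on every print.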

Yufen Yu (7):
  blk-wbt: use bdi_dev_name() to get device name
  fs/ceph: use bdi_dev_name() to get device name
  bdi: protect device lifetime with RCU
  bdi: create a new function bdi_get_dev_name()
  bfq: fix potential kernel crash when print dev err info
  memcg: fix crash in wb_workfn when bdi unregister
  blk-wbt: replace bdi_dev_name() with bdi_get_dev_name()

 block/bfq-iosched.c              |  7 +++--
 block/blk-cgroup.c               |  8 ++++--
 block/genhd.c                    |  4 +--
 fs/ceph/debugfs.c                |  2 +-
 fs/ext4/super.c                  |  2 +-
 fs/fs-writeback.c                |  4 ++-
 include/linux/backing-dev-defs.h |  8 +++++-
 include/linux/backing-dev.h      | 31 +++++++++++++++++++--
 include/trace/events/wbt.h       |  8 +++---
 include/trace/events/writeback.h | 38 ++++++++++++--------------
 mm/backing-dev.c                 | 59 +++++++++++++++++++++++++++++++++-------
 11 files changed, 124 insertions(+), 47 deletions(-)

Comments

Greg KH March 4, 2020, 5:29 p.m. UTC | #1
On Wed, Feb 26, 2020 at 07:18:44PM +0800, Yufen Yu wrote:
> Hi, all 
> 
> We have reported a use-after-free crash for the bdi device in
> __blkg_prfill_rwstat() (see Patch #3). The bug is caused by printing the
> device's kobj->name after the device and kobj->name have been freed by
> bdi_unregister().

How does that happen?  Who has access to a kobject without also having
the reference count incremented at the same time?  Is this through sysfs
or somewhere within the kernel itself?

> In fact, commit 68f23b8906 "memcg: fix a crash in wb_workfn when
> a device disappears" tried to address the issue, but the code
> is still somewhat racy after that commit.

That commit is really odd, and I think is just papering over the real
issue, which is shown in the changelog for that commit.

A kobject can be unregistered, like bdi_unregister() does, even if there
are active references for it.  But someone needs to also go around and
decrement those references in order for things to be properly freed.

It feels like the use of struct device (and by virtue of that, struct
kobject and really a kref) here is not being done correctly at all.

The rule should be, "whenever you pass a pointer to a device off, the
reference count is incremented".  Somehow that is not happening here and
RCU is not going to solve the issue really, it's only going to delay the
problem from showing up until much later.
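
The rule quoted above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct fake_dev` and the `fake_dev_*` helpers are hypothetical stand-ins, not the kernel's struct device, get_device() or put_device(), and `fake_dev_released` exists only so the release can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted object; not kernel code. */
struct fake_dev {
	atomic_int refs;  /* number of outstanding references */
	char *name;       /* freed only when refs drops to zero */
};

static int fake_dev_released;  /* counts releases, for demonstration only */

static struct fake_dev *fake_dev_create(const char *name)
{
	struct fake_dev *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	d->name = malloc(strlen(name) + 1);
	if (!d->name) {
		free(d);
		return NULL;
	}
	strcpy(d->name, name);
	atomic_init(&d->refs, 1);  /* the creator owns the first reference */
	return d;
}

/* Take a reference before handing the pointer to anyone else. */
static struct fake_dev *fake_dev_get(struct fake_dev *d)
{
	assert(d != NULL);
	atomic_fetch_add(&d->refs, 1);
	return d;
}

/* Drop a reference; the last put frees the name and the object. */
static void fake_dev_put(struct fake_dev *d)
{
	if (atomic_fetch_sub(&d->refs, 1) == 1) {
		free(d->name);
		free(d);
		fake_dev_released++;
	}
}
```

With this pattern, an "unregister" step that drops the owner's reference does not free the name while a reader still holds its own reference. The memory goes away only on the final `fake_dev_put()`.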

> In this patchset, we try to protect the device lifetime with RCU, so that
> the device is not freed while others are still using it.

The struct device refcount should be all that is needed, don't use RCU
just to "delay freeing this object until some later time because someone
else might have a pointer to it".  That's ripe for disaster.

> Another possible fix is to copy the device name into separate
> memory (as discussed in [0]), but that also needs lock protection.

Hah, all that is needed is the name here?  That's sad.

thanks,

greg k-h

Tejun Heo March 4, 2020, 6:57 p.m. UTC | #2
Hey, Greg.

On Wed, Mar 04, 2020 at 06:29:07PM +0100, Greg KH wrote:
> How does that happen?  Who has access to a kobject without also having
> the reference count incremented at the same time?  Is this through sysfs
> or somewhere within the kernel itself?

Hopefully, this part was addressed in the other reply.

> The struct device refcount should be all that is needed, don't use RCU
> just to "delay freeing this object until some later time because someone
> else might have a pointer to it".  That's ripe for disaster.

I think it's an idiomatic use of rcu given the circumstances. Whether
the circumstances are reasonable is totally debatable.

Thanks.

Theodore Ts'o March 4, 2020, 7:02 p.m. UTC | #3
On Wed, Mar 04, 2020 at 06:29:07PM +0100, Greg KH wrote:
> The rule should be, "whenever you pass a pointer to a device off, the
> reference count is incremented".  Somehow that is not happening here and
> RCU is not going to solve the issue really, it's only going to delay the
> problem from showing up until much later.
> ...
> The struct device refcount should be all that is needed, don't use RCU
> just to "delay freeing this object until some later time because someone
> else might have a pointer to it".  That's ripe for disaster.

I agree that this is a better fix than trying to continue to paper
over the problem.

That being said, I also think it would be better if we could *also*
send a notification to the file system (or device mapper) when a block
device has disappeared, so we can set a flag in struct super
indicating, "this is an ex-device" so that we don't have to have
potentially hundreds of I/O errors clogging up the console and/or any
error notification infrastructure we might want to add in the future,
as we attempt to send I/O to a device that is not coming back.  This would
allow us to short-circuit things like writeback, instead of letting
everything drain via pointless io_submits sending bios that will never
go anywhere useful.
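
The "ex-device" flag idea can be sketched as follows. This is a userspace illustration only, and every name in it is hypothetical: `struct fake_super`, `fake_notify_dev_gone()` and `fake_writeback()` are not existing kernel interfaces, and no such notification exists in the thread's patches.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-filesystem state; not struct super_block. */
struct fake_super {
	atomic_bool dev_gone;  /* set once by the removal notification */
	int submitted;         /* I/Os actually issued, for demonstration */
};

/* Called when the underlying block device has disappeared. */
static void fake_notify_dev_gone(struct fake_super *sb)
{
	atomic_store(&sb->dev_gone, true);
}

/*
 * Try to write back nr_pages pages. Returns how many were submitted;
 * once the flag is set the loop stops instead of issuing I/O that
 * can never complete.
 */
static int fake_writeback(struct fake_super *sb, int nr_pages)
{
	int i;

	assert(nr_pages >= 0);
	for (i = 0; i < nr_pages; i++) {
		if (atomic_load(&sb->dev_gone))
			break;  /* short-circuit: the device is not coming back */
		sb->submitted++;
	}
	return i;
}
```

The point of the sketch is that a single flag check replaces one failed submission per page, which is where the flood of I/O errors would otherwise come from.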

					- Ted

Greg KH March 4, 2020, 8:07 p.m. UTC | #4
On Wed, Mar 04, 2020 at 01:57:39PM -0500, Tejun Heo wrote:
> Hey, Greg.
> 
> On Wed, Mar 04, 2020 at 06:29:07PM +0100, Greg KH wrote:
> > How does that happen?  Who has access to a kobject without also having
> > the reference count incremented at the same time?  Is this through sysfs
> > or somewhere within the kernel itself?
> 
> Hopefully, this part was addressed in the other reply.

Yes, thanks.

> > The struct device refcount should be all that is needed, don't use RCU
> > just to "delay freeing this object until some later time because someone
> > else might have a pointer to it".  That's ripe for disaster.
> 
> I think it's an idiomatic use of rcu given the circumstances. Whether
> the circumstances are reasonable is totally debatable.

They are not reasonable :)