[GIT,PULL,02/58] lightnvm: prevent bd removal if busy

Message ID 20171013124647.32668-3-m@bjorling.me (mailing list archive)
State New, archived

Commit Message

Matias Bjørling Oct. 13, 2017, 12:45 p.m. UTC
From: Rakesh Pandit <rakesh@tuxera.com>

When a virtual block device is formatted and mounted after being created
with "nvme lnvm create... -t pblk", removing it with "nvme lnvm remove"
results in this:

[446416.309757] bdi-block not registered
[446416.309773] ------------[ cut here ]------------
[446416.309780] WARNING: CPU: 3 PID: 4319 at fs/fs-writeback.c:2159
  __mark_inode_dirty+0x268/0x340

Ideally removal should return -EBUSY, since the block device is still
mounted.  This patch addresses that by checking, before removal, whether
the whole device or any of its partitions is mounted.

The whole device is checked using the "bd_super" member of the block
device.  This member is always set once the block device has been mounted
with a filesystem.  Another member, "bd_part_count", covers the case where
any partitions are in use.  "bd_part_count" is only updated under locks
when partitions are opened or closed (first open and last release).  This
at least ensures that -EBUSY is returned if removal is attempted while
the whole block device or any partition is mounted.

Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
---
 drivers/lightnvm/core.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
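
The first-open/last-release accounting that the commit message describes
for "bd_part_count" can be modeled in userspace roughly as follows.  This
is only a sketch under stated assumptions: struct mock_disk,
mock_part_open() and mock_part_release() are hypothetical names, not
kernel interfaces, and the locking the commit message mentions is left
out for brevity.

```c
#include <stddef.h>

#define MOCK_MAX_PARTS 4	/* arbitrary size chosen for this sketch */

/* Hypothetical userspace stand-in; not the kernel's data structure. */
struct mock_disk {
	int part_openers[MOCK_MAX_PARTS];	/* open count per partition */
	int bd_part_count;			/* partitions with >= 1 opener */
};

/*
 * Open partition idx.  Only the first opener of a partition bumps
 * bd_part_count.  (The real update happens under a lock; omitted here.)
 */
static void mock_part_open(struct mock_disk *d, size_t idx)
{
	if (d->part_openers[idx]++ == 0)
		d->bd_part_count++;
}

/*
 * Release partition idx.  Only the last release of a partition drops
 * bd_part_count; releasing an unopened partition is ignored.
 */
static void mock_part_release(struct mock_disk *d, size_t idx)
{
	if (d->part_openers[idx] > 0 && --d->part_openers[idx] == 0)
		d->bd_part_count--;
}
```

With this model, opening the same partition twice leaves the counter at
1, and it only returns to 0 after both openers have released it, which
is why a non-zero value can be read as "some partition is in use".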

Comments

Christoph Hellwig Oct. 13, 2017, 2:58 p.m. UTC | #1
On Fri, Oct 13, 2017 at 02:45:51PM +0200, Matias Bjørling wrote:
> From: Rakesh Pandit <rakesh@tuxera.com>
> 
> When a virtual block device is formatted and mounted after creating
> with "nvme lnvm create... -t pblk", a removal from "nvm lnvm remove"
> would result in this:
> 
> 446416.309757] bdi-block not registered
> [446416.309773] ------------[ cut here ]------------
> [446416.309780] WARNING: CPU: 3 PID: 4319 at fs/fs-writeback.c:2159
>   __mark_inode_dirty+0x268/0x340
> 
> Ideally removal should return -EBUSY as block device is mounted after
> formatting.  This patch tries to address this checking if whole device
> or any partition of it already mounted or not before removal.

How is this different from any other block device that can be
removed even if a file system is mounted?

> 
> Whole device is checked using "bd_super" member of block device.  This
> member is always set once block device has been mounted using a
> filesystem.  Another member "bd_part_count" takes care of checking any
> if any partitions are under use.  "bd_part_count" is only updated
> under locks when partitions are opened or closed (first open and last
> release).  This at least does take care sending -EBUSY if removal is
> being attempted while whole block device or any partition is mounted.
> 

That's a massive layering violation, and a driver has no business
looking at these fields.
Rakesh Pandit Oct. 13, 2017, 3:35 p.m. UTC | #2
On Fri, Oct 13, 2017 at 07:58:09AM -0700, Christoph Hellwig wrote:
> On Fri, Oct 13, 2017 at 02:45:51PM +0200, Matias Bjørling wrote:
> > From: Rakesh Pandit <rakesh@tuxera.com>
> > 
> > When a virtual block device is formatted and mounted after creating
> > with "nvme lnvm create... -t pblk", a removal from "nvm lnvm remove"
> > would result in this:
> > 
> > 446416.309757] bdi-block not registered
> > [446416.309773] ------------[ cut here ]------------
> > [446416.309780] WARNING: CPU: 3 PID: 4319 at fs/fs-writeback.c:2159
> >   __mark_inode_dirty+0x268/0x340
> > 
> > Ideally removal should return -EBUSY as block device is mounted after
> > formatting.  This patch tries to address this checking if whole device
> > or any partition of it already mounted or not before removal.
> 
> How is this different from any other block device that can be
> removed even if a file system is mounted?

One can create many virtual block devices on top of a physical device using:
nvme lnvm create ... -t pblk

And remove them using:
nvme lnvm remove

Because the block devices are virtual and created by a program, I was
expecting removal to tell me they are busy instead of printing
"bdi-block not registered" followed by a WARNING (above).  My use case was
writing an automated test case, but I assumed this is useful in general.

> 
> > 
> > Whole device is checked using "bd_super" member of block device.  This
> > member is always set once block device has been mounted using a
> > filesystem.  Another member "bd_part_count" takes care of checking any
> > if any partitions are under use.  "bd_part_count" is only updated
> > under locks when partitions are opened or closed (first open and last
> > release).  This at least does take care sending -EBUSY if removal is
> > being attempted while whole block device or any partition is mounted.
> > 
> 
> That's a massive layering violation, and a driver has no business
> looking at these fields.

Okay, I didn't consider this earlier.  I would suggest a revert for this.
Javier González Oct. 13, 2017, 3:58 p.m. UTC | #3
> On 13 Oct 2017, at 17.35, Rakesh Pandit <rakesh@tuxera.com> wrote:
> 
>> On Fri, Oct 13, 2017 at 07:58:09AM -0700, Christoph Hellwig wrote:
>>> On Fri, Oct 13, 2017 at 02:45:51PM +0200, Matias Bjørling wrote:
>>> From: Rakesh Pandit <rakesh@tuxera.com>
>>> 
>>> When a virtual block device is formatted and mounted after creating
>>> with "nvme lnvm create... -t pblk", a removal from "nvm lnvm remove"
>>> would result in this:
>>> 
>>> 446416.309757] bdi-block not registered
>>> [446416.309773] ------------[ cut here ]------------
>>> [446416.309780] WARNING: CPU: 3 PID: 4319 at fs/fs-writeback.c:2159
>>>  __mark_inode_dirty+0x268/0x340
>>> 
>>> Ideally removal should return -EBUSY as block device is mounted after
>>> formatting.  This patch tries to address this checking if whole device
>>> or any partition of it already mounted or not before removal.
>> 
>> How is this different from any other block device that can be
>> removed even if a file system is mounted?
> 
> One can create many virtual block devices on top of physical using:
> nvme lnvm create ... -t pblk
> 
> And remove them using:
> nvme lnvm remove
> 
> Because the block devices are virtual in nature created by a program I was
> expecting removal to tell me they are busy instead of bdi-block not registered
> following by a WARNING (above).  My use case was writing automatic test case
> but I assumed this is useful in general.
> 
>> 
>>> 
>>> Whole device is checked using "bd_super" member of block device.  This
>>> member is always set once block device has been mounted using a
>>> filesystem.  Another member "bd_part_count" takes care of checking any
>>> if any partitions are under use.  "bd_part_count" is only updated
>>> under locks when partitions are opened or closed (first open and last
>>> release).  This at least does take care sending -EBUSY if removal is
>>> being attempted while whole block device or any partition is mounted.
>>> 
>> 
>> That's a massive layering violation, and a driver has no business
>> looking at these fields.
> 
> Okay, I didn't consider this earlier.  I would suggest a revert for this.

The use case is still valid, since a block device typically does not disappear under a file system - at least not because of a script suddenly removing it by mistake. 

Any suggestion on how we can do this better?

Javier.
Javier González Oct. 14, 2017, 6:04 a.m. UTC | #4
> On 13 Oct 2017, at 17.58, Javier González <javigon.napster@gmail.com> wrote:
> 
> 
>>> On 13 Oct 2017, at 17.35, Rakesh Pandit <rakesh@tuxera.com> wrote:
>>> 
>>>> On Fri, Oct 13, 2017 at 07:58:09AM -0700, Christoph Hellwig wrote:
>>>> On Fri, Oct 13, 2017 at 02:45:51PM +0200, Matias Bjørling wrote:
>>>> From: Rakesh Pandit <rakesh@tuxera.com>
>>>> 
>>>> When a virtual block device is formatted and mounted after creating
>>>> with "nvme lnvm create... -t pblk", a removal from "nvm lnvm remove"
>>>> would result in this:
>>>> 
>>>> 446416.309757] bdi-block not registered
>>>> [446416.309773] ------------[ cut here ]------------
>>>> [446416.309780] WARNING: CPU: 3 PID: 4319 at fs/fs-writeback.c:2159
>>>> __mark_inode_dirty+0x268/0x340
>>>> 
>>>> Ideally removal should return -EBUSY as block device is mounted after
>>>> formatting.  This patch tries to address this checking if whole device
>>>> or any partition of it already mounted or not before removal.
>>> 
>>> How is this different from any other block device that can be
>>> removed even if a file system is mounted?
>> 
>> One can create many virtual block devices on top of physical using:
>> nvme lnvm create ... -t pblk
>> 
>> And remove them using:
>> nvme lnvm remove
>> 
>> Because the block devices are virtual in nature created by a program I was
>> expecting removal to tell me they are busy instead of bdi-block not registered
>> following by a WARNING (above).  My use case was writing automatic test case
>> but I assumed this is useful in general.
>> 
>>> 
>>>> 
>>>> Whole device is checked using "bd_super" member of block device.  This
>>>> member is always set once block device has been mounted using a
>>>> filesystem.  Another member "bd_part_count" takes care of checking any
>>>> if any partitions are under use.  "bd_part_count" is only updated
>>>> under locks when partitions are opened or closed (first open and last
>>>> release).  This at least does take care sending -EBUSY if removal is
>>>> being attempted while whole block device or any partition is mounted.
>>>> 
>>> 
>>> That's a massive layering violation, and a driver has no business
>>> looking at these fields.
>> 
>> Okay, I didn't consider this earlier.  I would suggest a revert for this.
> 
> The use case is still valid, since a block device typically does not disappear under a file system - at least not because of a script suddenly removing it by mistake. 
> 
> Any suggestion on how we can do this better?
> 

Thinking about it, it does not seem like we have any checks now when removing a fabrics block device?

Would it make sense to have a common way to let drivers know if they are in use, at least to give a warning?

Javier
Matias Bjørling Oct. 16, 2017, 3:14 p.m. UTC | #5
On Fri, Oct 13, 2017 at 5:35 PM, Rakesh Pandit <rakesh@tuxera.com> wrote:
> On Fri, Oct 13, 2017 at 07:58:09AM -0700, Christoph Hellwig wrote:
>> On Fri, Oct 13, 2017 at 02:45:51PM +0200, Matias Bjørling wrote:
>> > From: Rakesh Pandit <rakesh@tuxera.com>
>> >
>> > When a virtual block device is formatted and mounted after creating
>> > with "nvme lnvm create... -t pblk", a removal from "nvm lnvm remove"
>> > would result in this:
>> >
>> > 446416.309757] bdi-block not registered
>> > [446416.309773] ------------[ cut here ]------------
>> > [446416.309780] WARNING: CPU: 3 PID: 4319 at fs/fs-writeback.c:2159
>> >   __mark_inode_dirty+0x268/0x340
>> >
>> > Ideally removal should return -EBUSY as block device is mounted after
>> > formatting.  This patch tries to address this checking if whole device
>> > or any partition of it already mounted or not before removal.
>>
>> How is this different from any other block device that can be
>> removed even if a file system is mounted?
>
> One can create many virtual block devices on top of physical using:
> nvme lnvm create ... -t pblk
>
> And remove them using:
> nvme lnvm remove
>
> Because the block devices are virtual in nature created by a program I was
> expecting removal to tell me they are busy instead of bdi-block not registered
> following by a WARNING (above).  My use case was writing automatic test case
> but I assumed this is useful in general.
>
>>
>> >
>> > Whole device is checked using "bd_super" member of block device.  This
>> > member is always set once block device has been mounted using a
>> > filesystem.  Another member "bd_part_count" takes care of checking any
>> > if any partitions are under use.  "bd_part_count" is only updated
>> > under locks when partitions are opened or closed (first open and last
>> > release).  This at least does take care sending -EBUSY if removal is
>> > being attempted while whole block device or any partition is mounted.
>> >
>>
>> That's a massive layering violation, and a driver has no business
>> looking at these fields.
>
> Okay, I didn't consider this earlier.  I would suggest a revert for this.

I see you've already done it. Thanks Jens.

Patch

diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
index 60e163b..c490711 100644
--- a/drivers/lightnvm/core.c
+++ b/drivers/lightnvm/core.c
@@ -373,6 +373,7 @@  static void __nvm_remove_target(struct nvm_target *t)
 static int nvm_remove_tgt(struct nvm_dev *dev, struct nvm_ioctl_remove *remove)
 {
 	struct nvm_target *t;
+	struct block_device *bdev;
 
 	mutex_lock(&dev->mlock);
 	t = nvm_find_target(dev, remove->tgtname);
@@ -380,6 +381,19 @@  static int nvm_remove_tgt(struct nvm_dev *dev, struct nvm_ioctl_remove *remove)
 		mutex_unlock(&dev->mlock);
 		return 1;
 	}
+	bdev = bdget_disk(t->disk, 0);
+	if (!bdev) {
+		pr_err("nvm: removal failed, allocating bd failed\n");
+		mutex_unlock(&dev->mlock);
+		return -ENOMEM;
+	}
+	if (bdev->bd_super || bdev->bd_part_count) {
+		pr_err("nvm: removal failed, block device busy\n");
+		bdput(bdev);
+		mutex_unlock(&dev->mlock);
+		return -EBUSY;
+	}
+	bdput(bdev);
 	__nvm_remove_target(t);
 	mutex_unlock(&dev->mlock);
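
The return values of the added hunk can be exercised outside the kernel
with a small model.  This is a sketch, not the driver code: struct
mock_bdev, mock_bdev_busy() and mock_remove_tgt() are hypothetical
stand-ins, a NULL pointer plays the role of bdget_disk() failing, and the
dev->mlock handling and the existing "target not found" path are left
out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two block_device fields the patch reads. */
struct mock_bdev {
	void *bd_super;		/* non-NULL while the whole device is mounted */
	int bd_part_count;	/* non-zero while any partition is open */
};

/* Same predicate as the patch: mounted, or at least one partition in use. */
static bool mock_bdev_busy(const struct mock_bdev *bdev)
{
	return bdev->bd_super != NULL || bdev->bd_part_count != 0;
}

/*
 * Models the outcomes of the patched removal path:
 *   -ENOMEM  no block device could be obtained (NULL here)
 *   -EBUSY   device is mounted or has an open partition
 *   0        removal proceeds; *removed is set to true
 */
static int mock_remove_tgt(const struct mock_bdev *bdev, bool *removed)
{
	*removed = false;

	if (!bdev)
		return -ENOMEM;
	if (mock_bdev_busy(bdev))
		return -EBUSY;

	*removed = true;	/* stands in for __nvm_remove_target(t) */
	return 0;
}
```

Note that this only models what the patch does; it does not address the
review comment in the thread that a driver should not be inspecting
these fields in the first place.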