diff mbox series

dm: introduce a no open flag for deferred remove

Message ID 20220124150209.22202-1-bgeffon@google.com (mailing list archive)
State Rejected, archived
Delegated to: Mike Snitzer
Headers show
Series dm: introduce a no open flag for deferred remove | expand

Commit Message

Brian Geffon Jan. 24, 2022, 3:02 p.m. UTC
When a device is being removed with deferred remove it's
still possible to open and use the device. This change
introduces a flag called DM_DEFERRED_REMOVE_NO_OPEN_FLAG
which when used with DM_DEFERRED_REMOVE will cause any
new opens to fail with -ENXIO.

If this flag is used without DM_DEFERRED_REMOVE it will
result in an -EINVAL.

Signed-off-by: Brian Geffon <bgeffon@google.com>
---
 drivers/md/dm-core.h          |  1 +
 drivers/md/dm-ioctl.c         | 39 +++++++++++++++++++++++++++--------
 drivers/md/dm.c               | 21 ++++++++++++++++---
 drivers/md/dm.h               |  9 +++++++-
 include/uapi/linux/dm-ioctl.h | 12 +++++++++--
 5 files changed, 67 insertions(+), 15 deletions(-)

Comments

Brian Geffon Jan. 24, 2022, 3:25 p.m. UTC | #1
On Mon, Jan 24, 2022 at 10:14 AM Alasdair G Kergon <agk@redhat.com> wrote:
>
> On Mon, Jan 24, 2022 at 07:02:09AM -0800, Brian Geffon wrote:
> > When a device is being removed with deferred remove it's
> > still possible to open and use the device. This change
> > introduces a flag called DM_DEFERRED_REMOVE_NO_OPEN_FLAG
> > which when used with DM_DEFERRED_REMOVE will cause any
> > new opens to fail with -ENXIO.
>
> What is the need for this?

Hi Alasdair,
Thank you for looking at this. There are a few reasons this might be
useful, the first is if you're trying to speed up a graceful teardown
of the device by informing userspace that this device is going to be
removed in the near future. Another might be on systems where it might
be worthwhile to not have users with CAP_DAC_OVERRIDE be able to open
the device. The logic on this second case is that, suppose you have a
dm-crypt block device which is backing swap, the data on this device
is ephemeral so a flow might be to setup swap followed by dmsetup
remove --deferred /dev/mapper/encrypted-swap. This will guarantee that
as soon as swap is torn down the encrypted block device is dropped,
additionally with this new flag you'll be guaranteed that there can be
no further opens on it.

> Does it break any semantics assumed by userspace?

No, this is fully backwards compatible with the current deferred
remove behavior, it's not required. Additionally, since on the actual
remove userspace would receive an -ENXIO already once the remove
process has started it seems reasonable to return -ENXIO in the
deferred remove case when this flag is enabled.

Brian

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Brian Geffon Jan. 25, 2022, 2:25 p.m. UTC | #2
On Mon, Jan 24, 2022 at 7:21 PM Alasdair G Kergon <agk@redhat.com> wrote:
>
> On Mon, Jan 24, 2022 at 10:25:47AM -0500, Brian Geffon wrote:
> > Thank you for looking at this. There are a few reasons this might be
> > useful, the first is if you're trying to speed up a graceful teardown
> > of the device by informing userspace that this device is going to be
> > removed in the near future. Another might be on systems where it might
> > be worthwhile to not have users with CAP_DAC_OVERRIDE be able to open
> > the device. The logic on this second case is that, suppose you have a
> > dm-crypt block device which is backing swap, the data on this device
> > is ephemeral so a flow might be to setup swap followed by dmsetup
> > remove --deferred /dev/mapper/encrypted-swap. This will guarantee that
> > as soon as swap is torn down the encrypted block device is dropped,
> > additionally with this new flag you'll be guaranteed that there can be
> > no further opens on it.
>
> And is that the reason you propose this?
> - You want a special exclusive 'one time open' device that
>   self-destructs when closed?

Yes, this is the primary reason I'm exploring this. I tried to find an
implementation that might be useful in other situations which is why I
landed on this approach.

>
> > No, this is fully backwards compatible with the current deferred
> > remove behavior, it's not required. Additionally, since on the actual
> > remove userspace would receive an -ENXIO already once the remove
> > process has started it seems reasonable to return -ENXIO in the
> > deferred remove case when this flag is enabled.
>
> Well I feel it does break existing semantics which is why we wrote
> the code the way we did.  The state can be long-lived, the code
> that has it open might legitimately want to open it again in
> parallel etc. - in general this seems a bad idea.

Thanks for explaining and providing that context.

>
> But if the reason for this is basically "make it harder for
> anything else to access my encrypted swap" and to deliberately
> prevent access, then let's approach the requirement from that angle.
> Are there alternative implementations with interventions at different
> points?

Absolutely, we could perhaps create a new ioctl which allows the
caller to specify the maximum open count, and when the open count hits
that value it fails any new opens with -EBUSY. Perhaps this would be
enforced in dm_get_device? This type of behavior could theoretically
mimic the patch I mailed when used in conjunction with deferred
removal. Is this an approach you think might make more sense with the
existing design? Are there any implementation points you believe might
make more sense for such a feature?

>  Are there similar requirements for devices that don't need
> deferred remove?

I cannot immediately think of a situation where you'd do this without
deferred removal.

>  If this is indeed the best place to insert this type
> of restriction, then we should label it and document it accordingly so
> people don't mistakenly use it for the 'normal' case.  (We always keep
> libdevmapper and dmsetup in sync with kernel interface extensions, so
> we'd seek a tiny patch to do that too.)

Yes, absolutely. I'll happily send patches for userspace libraries,
applications, and documentation once we converge on an acceptable
approach.

Thanks again for spending your time discussing this,
Brian

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Brian Geffon Jan. 25, 2022, 5:11 p.m. UTC | #3
On Mon, Jan 24, 2022 at 7:21 PM Alasdair G Kergon <agk@redhat.com> wrote:
>
> On Mon, Jan 24, 2022 at 10:25:47AM -0500, Brian Geffon wrote:
> > Thank you for looking at this. There are a few reasons this might be
> > useful, the first is if you're trying to speed up a graceful teardown
> > of the device by informing userspace that this device is going to be
> > removed in the near future. Another might be on systems where it might
> > be worthwhile to not have users with CAP_DAC_OVERRIDE be able to open
> > the device. The logic on this second case is that, suppose you have a
> > dm-crypt block device which is backing swap, the data on this device
> > is ephemeral so a flow might be to setup swap followed by dmsetup
> > remove --deferred /dev/mapper/encrypted-swap. This will guarantee that
> > as soon as swap is torn down the encrypted block device is dropped,
> > additionally with this new flag you'll be guaranteed that there can be
> > no further opens on it.
>
> And is that the reason you propose this?
> - You want a special exclusive 'one time open' device that
>   self-destructs when closed?
>
> > No, this is fully backwards compatible with the current deferred
> > remove behavior, it's not required. Additionally, since on the actual
> > remove userspace would receive an -ENXIO already once the remove
> > process has started it seems reasonable to return -ENXIO in the
> > deferred remove case when this flag is enabled.
>
> Well I feel it does break existing semantics which is why we wrote
> the code the way we did.  The state can be long-lived, the code
> that has it open might legitimately want to open it again in
> parallel etc. - in general this seems a bad idea.
>
> But if the reason for this is basically "make it harder for
> anything else to access my encrypted swap" and to deliberately
> prevent access, then let's approach the requirement from that angle.
> Are there alternative implementations with interventions at different
> points?

I was thinking perhaps another implementation might involve using
open_count on dm_ioctl as an in param on DM_DEV_CREATE only. Using
open_count as an in parameter on DM_DEV_CREATE could be activated by a
new flag, perhaps DM_ENFORCE_OPEN_COUNT_FLAG. This would allow the
behavior to be baked into the device from the start. We would then
enforce it in dm_blk_open. What would you think about an approach like
this?

Thanks,
Brian

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
diff mbox series

Patch

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index b855fef4f38a..b30e59deb4a8 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -139,6 +139,7 @@  struct mapped_device {
 #define DMF_SUSPENDED_INTERNALLY 7
 #define DMF_POST_SUSPENDING 8
 #define DMF_EMULATE_ZONE_APPEND 9
+#define DMF_DEFERRED_REMOVE_NO_OPEN 10
 
 void disable_discard(struct mapped_device *md);
 void disable_write_same(struct mapped_device *md);
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index 21fe8652b095..07bb679880de 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -60,7 +60,8 @@  struct vers_iter {
 static struct rb_root name_rb_tree = RB_ROOT;
 static struct rb_root uuid_rb_tree = RB_ROOT;
 
-static void dm_hash_remove_all(bool keep_open_devices, bool mark_deferred, bool only_deferred);
+static void dm_hash_remove_all(bool keep_open_devices, bool mark_deferred,
+		bool deferred_no_open, bool only_deferred);
 
 /*
  * Guards access to both hash tables.
@@ -74,7 +75,7 @@  static DEFINE_MUTEX(dm_hash_cells_mutex);
 
 static void dm_hash_exit(void)
 {
-	dm_hash_remove_all(false, false, false);
+	dm_hash_remove_all(false, false, false, false);
 }
 
 /*-----------------------------------------------------------------
@@ -315,7 +316,8 @@  static struct dm_table *__hash_remove(struct hash_cell *hc)
 	return table;
 }
 
-static void dm_hash_remove_all(bool keep_open_devices, bool mark_deferred, bool only_deferred)
+static void dm_hash_remove_all(bool keep_open_devices, bool mark_deferred,
+		bool deferred_no_open, bool only_deferred)
 {
 	int dev_skipped;
 	struct rb_node *n;
@@ -334,7 +336,8 @@  static void dm_hash_remove_all(bool keep_open_devices, bool mark_deferred, bool
 		dm_get(md);
 
 		if (keep_open_devices &&
-		    dm_lock_for_deletion(md, mark_deferred, only_deferred)) {
+		    dm_lock_for_deletion(md, mark_deferred, deferred_no_open,
+				    only_deferred)) {
 			dm_put(md);
 			dev_skipped++;
 			continue;
@@ -496,7 +499,7 @@  static struct mapped_device *dm_hash_rename(struct dm_ioctl *param,
 
 void dm_deferred_remove(void)
 {
-	dm_hash_remove_all(true, false, true);
+	dm_hash_remove_all(true, false, false, true);
 }
 
 /*-----------------------------------------------------------------
@@ -510,7 +513,13 @@  typedef int (*ioctl_fn)(struct file *filp, struct dm_ioctl *param, size_t param_
 
 static int remove_all(struct file *filp, struct dm_ioctl *param, size_t param_size)
 {
-	dm_hash_remove_all(true, !!(param->flags & DM_DEFERRED_REMOVE), false);
+	if (param->flags & DM_DEFERRED_REMOVE_NO_OPEN_FLAG &&
+			!(param->flags & DM_DEFERRED_REMOVE)) {
+		return -EINVAL;
+	}
+
+	dm_hash_remove_all(true, !!(param->flags & DM_DEFERRED_REMOVE),
+			!!(param->flags & DM_DEFERRED_REMOVE_NO_OPEN_FLAG), false);
 	param->data_size = 0;
 	return 0;
 }
@@ -811,9 +820,13 @@  static void __dev_status(struct mapped_device *md, struct dm_ioctl *param)
 	if (dm_suspended_internally_md(md))
 		param->flags |= DM_INTERNAL_SUSPEND_FLAG;
 
-	if (dm_test_deferred_remove_flag(md))
+	if (dm_test_deferred_remove_flag(md)) {
 		param->flags |= DM_DEFERRED_REMOVE;
 
+		if (dm_test_deferred_remove_no_open_flag(md))
+			param->flags |= DM_DEFERRED_REMOVE_NO_OPEN_FLAG;
+	}
+
 	param->dev = huge_encode_dev(disk_devt(disk));
 
 	/*
@@ -960,10 +973,18 @@  static int dev_remove(struct file *filp, struct dm_ioctl *param, size_t param_si
 
 	md = hc->md;
 
+	if (param->flags & DM_DEFERRED_REMOVE_NO_OPEN_FLAG &&
+			!(param->flags & DM_DEFERRED_REMOVE)) {
+		up_write(&_hash_lock);
+		dm_put(md);
+		return -EINVAL;
+	}
+
 	/*
 	 * Ensure the device is not open and nothing further can open it.
 	 */
-	r = dm_lock_for_deletion(md, !!(param->flags & DM_DEFERRED_REMOVE), false);
+	r = dm_lock_for_deletion(md, !!(param->flags & DM_DEFERRED_REMOVE),
+			!!(param->flags & DM_DEFERRED_REMOVE_NO_OPEN_FLAG), false);
 	if (r) {
 		if (r == -EBUSY && param->flags & DM_DEFERRED_REMOVE) {
 			up_write(&_hash_lock);
@@ -984,7 +1005,7 @@  static int dev_remove(struct file *filp, struct dm_ioctl *param, size_t param_si
 		dm_table_destroy(t);
 	}
 
-	param->flags &= ~DM_DEFERRED_REMOVE;
+	param->flags &= ~(DM_DEFERRED_REMOVE | DM_DEFERRED_REMOVE_NO_OPEN_FLAG);
 
 	dm_ima_measure_on_device_remove(md, false);
 
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index c0ae8087c602..90b74043e162 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -315,7 +315,10 @@  static int dm_blk_open(struct block_device *bdev, fmode_t mode)
 		goto out;
 
 	if (test_bit(DMF_FREEING, &md->flags) ||
+	    test_bit(DMF_DEFERRED_REMOVE_NO_OPEN, &md->flags) ||
 	    dm_deleting_md(md)) {
+		BUG_ON(test_bit(DMF_DEFERRED_REMOVE_NO_OPEN, &md->flags) &&
+		       !test_bit(DMF_DEFERRED_REMOVE, &md->flags));
 		md = NULL;
 		goto out;
 	}
@@ -355,7 +358,8 @@  int dm_open_count(struct mapped_device *md)
 /*
  * Guarantees nothing is using the device before it's deleted.
  */
-int dm_lock_for_deletion(struct mapped_device *md, bool mark_deferred, bool only_deferred)
+int dm_lock_for_deletion(struct mapped_device *md, bool mark_deferred,
+		bool deferred_no_open, bool only_deferred)
 {
 	int r = 0;
 
@@ -363,8 +367,12 @@  int dm_lock_for_deletion(struct mapped_device *md, bool mark_deferred, bool only
 
 	if (dm_open_count(md)) {
 		r = -EBUSY;
-		if (mark_deferred)
+		if (mark_deferred) {
 			set_bit(DMF_DEFERRED_REMOVE, &md->flags);
+
+			if (deferred_no_open)
+				set_bit(DMF_DEFERRED_REMOVE_NO_OPEN, &md->flags);
+		}
 	} else if (only_deferred && !test_bit(DMF_DEFERRED_REMOVE, &md->flags))
 		r = -EEXIST;
 	else
@@ -383,8 +391,10 @@  int dm_cancel_deferred_remove(struct mapped_device *md)
 
 	if (test_bit(DMF_DELETING, &md->flags))
 		r = -EBUSY;
-	else
+	else {
 		clear_bit(DMF_DEFERRED_REMOVE, &md->flags);
+		clear_bit(DMF_DEFERRED_REMOVE_NO_OPEN, &md->flags);
+	}
 
 	spin_unlock(&_minor_lock);
 
@@ -2718,6 +2728,11 @@  int dm_test_deferred_remove_flag(struct mapped_device *md)
 	return test_bit(DMF_DEFERRED_REMOVE, &md->flags);
 }
 
+int dm_test_deferred_remove_no_open_flag(struct mapped_device *md)
+{
+	return test_bit(DMF_DEFERRED_REMOVE_NO_OPEN, &md->flags);
+}
+
 int dm_suspended(struct dm_target *ti)
 {
 	return dm_suspended_md(ti->table->md);
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index 9013dc1a7b00..8d0d4344f882 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -158,6 +158,12 @@  void dm_internal_resume(struct mapped_device *md);
  */
 int dm_test_deferred_remove_flag(struct mapped_device *md);
 
+/*
+ * Test if the device is scheduled for deferred remove while
+ * disallowing opens.
+ */
+int dm_test_deferred_remove_no_open_flag(struct mapped_device *md);
+
 /*
  * Try to remove devices marked for deferred removal.
  */
@@ -198,7 +204,8 @@  void dm_stripe_exit(void);
 void dm_destroy(struct mapped_device *md);
 void dm_destroy_immediate(struct mapped_device *md);
 int dm_open_count(struct mapped_device *md);
-int dm_lock_for_deletion(struct mapped_device *md, bool mark_deferred, bool only_deferred);
+int dm_lock_for_deletion(struct mapped_device *md, bool mark_deferred,
+			 bool deferred_no_open, bool only_deferred);
 int dm_cancel_deferred_remove(struct mapped_device *md);
 int dm_request_based(struct mapped_device *md);
 int dm_get_table_device(struct mapped_device *md, dev_t dev, fmode_t mode,
diff --git a/include/uapi/linux/dm-ioctl.h b/include/uapi/linux/dm-ioctl.h
index c12ce30b52df..c0fee607b827 100644
--- a/include/uapi/linux/dm-ioctl.h
+++ b/include/uapi/linux/dm-ioctl.h
@@ -286,9 +286,9 @@  enum {
 #define DM_DEV_SET_GEOMETRY	_IOWR(DM_IOCTL, DM_DEV_SET_GEOMETRY_CMD, struct dm_ioctl)
 
 #define DM_VERSION_MAJOR	4
-#define DM_VERSION_MINOR	45
+#define DM_VERSION_MINOR	46
 #define DM_VERSION_PATCHLEVEL	0
-#define DM_VERSION_EXTRA	"-ioctl (2021-03-22)"
+#define DM_VERSION_EXTRA	"-ioctl (2022-01-21)"
 
 /* Status bits */
 #define DM_READONLY_FLAG	(1 << 0) /* In/Out */
@@ -382,4 +382,12 @@  enum {
  */
 #define DM_IMA_MEASUREMENT_FLAG	(1 << 19) /* In */
 
+/* If set with DF_DEFERRED_REMOVE if an immediate remove is not
+ * possible because the device is still open any new additional
+ * opens will also be rejected.
+ *
+ * It is an error to specify this flag without DM_DEFERRED_REMOVE.
+ */
+#define DM_DEFERRED_REMOVE_NO_OPEN_FLAG	(1 << 20) /* In/Out */
+
 #endif				/* _LINUX_DM_IOCTL_H */