btrfs-progs: udev: add rules for dm devices

Message ID 6ddc2c5c-42df-e7ca-daff-8848cbc3d9e9@suse.com
State Superseded

Commit Message

Jeff Mahoney May 6, 2016, 7:27 p.m. UTC
Systemd's btrfs rule runs btrfs dev ready on each device
as it's discovered.  The btrfs command is executed as a builtin
command via an IMPORT{builtin} rule, which means it gets
executed at rule evaluation time, not rule execution time.  That
means that the device mapper links haven't been set up yet and the only
nodes that can be depended upon are /dev/dm-#.  That we see
/dev/mapper/name names in /proc/mounts is only because we replace the
device name we have cached with the one passed in via mount.  If
we have a multi-device file system and the primary device is removed,
the remaining devices will show /dev/dm-#.  In addition, if the
udev rule is executed again by someone generating a change event (e.g.
partprobe), the names are also replaced by the /dev/dm-# names.

This patch adds a new rule that adds a run rule that calls btrfs dev
ready again using the device mapper links once they're created.
---
 64-btrfs-dm.rules | 10 ++++++++++
 Makefile.in       |  7 +++++++
 configure.ac      |  2 ++
 3 files changed, 19 insertions(+)
 create mode 100644 64-btrfs-dm.rules

Comments

Jeff Mahoney May 6, 2016, 7:30 p.m. UTC | #1
On 5/6/16 3:27 PM, Jeff Mahoney wrote:
> Systemd's btrfs rule runs btrfs dev ready on each device
> as it's discovered.  The btrfs command is executed as a builtin
> command via an IMPORT{builtin} rule, which means it gets
> executed at rule evaluation time, not rule execution time.  That
> means that the device mapper links haven't been set up yet and the only
> nodes that can be depended upon are /dev/dm-#.  That we see
> /dev/mapper/name names in /proc/mounts is only because we replace the
> device name we have cached with the one passed in via mount.  If
> we have a multi-device file system and the primary device is removed,
> the remaining devices will show /dev/dm-#.  In addition, if the
> udev rule is executed again by someone generating a change event (e.g.
> partprobe), the names are also replaced by the /dev/dm-# names.
> 
> This patch adds a new rule that adds a run rule that calls btrfs dev
> ready again using the device mapper links once they're created.

I did submit this initially to the systemd folks and they said it
belonged in the device-mapper package.  I'm not convinced.
Device-mapper has no business worrying about how to properly tell btrfs
about its friendly names.  We'll leave the debate of why it is that
systemd has any business implementing its own btrfs ready command and
using it to provide a udev rule but NOT this part of it for another day.

-Jeff


> ---
>  64-btrfs-dm.rules | 10 ++++++++++
>  Makefile.in       |  7 +++++++
>  configure.ac      |  2 ++
>  3 files changed, 19 insertions(+)
>  create mode 100644 64-btrfs-dm.rules
> 
> diff --git a/64-btrfs-dm.rules b/64-btrfs-dm.rules
> new file mode 100644
> index 0000000..bbe1c35
> --- /dev/null
> +++ b/64-btrfs-dm.rules
> @@ -0,0 +1,10 @@
> +SUBSYSTEM!="block", GOTO="btrfs_end"
> +KERNEL!="dm-[0-9]*", GOTO="btrfs_end"
> +ACTION!="add|change", GOTO="btrfs_end"
> +ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"
> +
> +# Once the device mapper symlink is created, tell btrfs about it
> +# so we get the friendly name in /proc/mounts (and tools that read it)
> +ENV{DM_NAME}=="?*", TEST=="/dev/mapper/$env{DM_NAME}", RUN{builtin}+="btrfs ready /dev/mapper/$env{DM_NAME}"
> +
> +LABEL="btrfs_end"
> diff --git a/Makefile.in b/Makefile.in
> index 19697ff..d555f6a 100644
> --- a/Makefile.in
> +++ b/Makefile.in
> @@ -83,11 +83,15 @@ libbtrfs_headers = send-stream.h send-utils.h send.h rbtree.h btrfs-list.h \
>  	       extent_io.h ioctl.h ctree.h btrfsck.h version.h
>  TESTS = fsck-tests.sh convert-tests.sh
>  
> +udev_rules = 64-btrfs-dm.rules
> +
>  prefix ?= @prefix@
>  exec_prefix = @exec_prefix@
>  bindir = @bindir@
>  libdir ?= @libdir@
>  incdir = @includedir@/btrfs
> +udevdir = @UDEVDIR@
> +udevruledir = ${udevdir}/rules.d
>  
>  ifeq ("$(origin V)", "command line")
>    BUILD_VERBOSE = $(V)
> @@ -377,6 +381,9 @@ install: $(libs) $(progs_install) $(INSTALLDIRS)
>  	cp -a $(lib_links) $(DESTDIR)$(libdir)
>  	$(INSTALL) -m755 -d $(DESTDIR)$(incdir)
>  	$(INSTALL) -m644 $(headers) $(DESTDIR)$(incdir)
> +ifneq ($(udevdir), "")
> +	$(INSTALL) -m644 $(udev_rules) $(DESTDIR)$(udevruledir)
> +endif
>  
>  install-static: $(progs_static) $(INSTALLDIRS)
>  	for p in $(progs_static) ; do \
> diff --git a/configure.ac b/configure.ac
> index fc343ea..4af7474 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -124,6 +124,8 @@ PKG_STATIC(UUID_LIBS_STATIC, [uuid])
>  PKG_CHECK_MODULES(ZLIB, [zlib])
>  PKG_STATIC(ZLIB_LIBS_STATIC, [zlib])
>  
> +PKG_CHECK_VAR([UDEVDIR], [udev], [udevdir])
> +
>  dnl lzo library does not provide pkg-config, let use classic way
>  AC_CHECK_LIB([lzo2], [lzo_version], [
>  	LZO2_LIBS="-llzo2"
>
Andrei Borzenkov May 8, 2016, 5 a.m. UTC | #2
06.05.2016 22:27, Jeff Mahoney wrote:
> Systemd's btrfs rule runs btrfs dev ready on each device
> as it's discovered.  The btrfs command is executed as a builtin
> command via an IMPORT{builtin} rule, which means it gets
> executed at rule evaluation time, not rule execution time.  That
> means that the device mapper links haven't been set up yet and the only
> nodes that can be depended upon are /dev/dm-#.  That we see
> /dev/mapper/name names in /proc/mounts is only because we replace the
> device name we have cached with the one passed in via mount.  If
> we have a multi-device file system and the primary device is removed,
> the remaining devices will show /dev/dm-#.  In addition, if the

And I still do not understand why it is bad while /dev/sd#n is good.

> udev rule is executed again by someone generating a change event (e.g.
> partprobe), the names are also replaced by the /dev/dm-# names.
> 
> This patch adds a new rule that adds a run rule that calls btrfs dev
> ready again using the device mapper links once they're created.
> ---
>  64-btrfs-dm.rules | 10 ++++++++++
>  Makefile.in       |  7 +++++++
>  configure.ac      |  2 ++
>  3 files changed, 19 insertions(+)
>  create mode 100644 64-btrfs-dm.rules
> 
> diff --git a/64-btrfs-dm.rules b/64-btrfs-dm.rules
> new file mode 100644
> index 0000000..bbe1c35
> --- /dev/null
> +++ b/64-btrfs-dm.rules
> @@ -0,0 +1,10 @@
> +SUBSYSTEM!="block", GOTO="btrfs_end"
> +KERNEL!="dm-[0-9]*", GOTO="btrfs_end"
> +ACTION!="add|change", GOTO="btrfs_end"
> +ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"
> +
> +# Once the device mapper symlink is created, tell btrfs about it
> +# so we get the friendly name in /proc/mounts (and tools that read it)
> +ENV{DM_NAME}=="?*", TEST=="/dev/mapper/$env{DM_NAME}", RUN{builtin}+="btrfs ready /dev/mapper/$env{DM_NAME}"
> +

That won't work for the very first event (presumably "add"). /dev/mapper
link is created only after all rules have been processed, so it will
always evaluate to false.

> +LABEL="btrfs_end"
> diff --git a/Makefile.in b/Makefile.in
> index 19697ff..d555f6a 100644
> --- a/Makefile.in
> +++ b/Makefile.in
> @@ -83,11 +83,15 @@ libbtrfs_headers = send-stream.h send-utils.h send.h rbtree.h btrfs-list.h \
>  	       extent_io.h ioctl.h ctree.h btrfsck.h version.h
>  TESTS = fsck-tests.sh convert-tests.sh
>  
> +udev_rules = 64-btrfs-dm.rules
> +
>  prefix ?= @prefix@
>  exec_prefix = @exec_prefix@
>  bindir = @bindir@
>  libdir ?= @libdir@
>  incdir = @includedir@/btrfs
> +udevdir = @UDEVDIR@
> +udevruledir = ${udevdir}/rules.d
>  
>  ifeq ("$(origin V)", "command line")
>    BUILD_VERBOSE = $(V)
> @@ -377,6 +381,9 @@ install: $(libs) $(progs_install) $(INSTALLDIRS)
>  	cp -a $(lib_links) $(DESTDIR)$(libdir)
>  	$(INSTALL) -m755 -d $(DESTDIR)$(incdir)
>  	$(INSTALL) -m644 $(headers) $(DESTDIR)$(incdir)
> +ifneq ($(udevdir), "")
> +	$(INSTALL) -m644 $(udev_rules) $(DESTDIR)$(udevruledir)
> +endif
>  
>  install-static: $(progs_static) $(INSTALLDIRS)
>  	for p in $(progs_static) ; do \
> diff --git a/configure.ac b/configure.ac
> index fc343ea..4af7474 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -124,6 +124,8 @@ PKG_STATIC(UUID_LIBS_STATIC, [uuid])
>  PKG_CHECK_MODULES(ZLIB, [zlib])
>  PKG_STATIC(ZLIB_LIBS_STATIC, [zlib])
>  
> +PKG_CHECK_VAR([UDEVDIR], [udev], [udevdir])
> +
>  dnl lzo library does not provide pkg-config, let use classic way
>  AC_CHECK_LIB([lzo2], [lzo_version], [
>  	LZO2_LIBS="-llzo2"
> 

Jeff Mahoney May 9, 2016, 1:33 a.m. UTC | #3
On 5/8/16 1:00 AM, Andrei Borzenkov wrote:
> 06.05.2016 22:27, Jeff Mahoney wrote:
>> Systemd's btrfs rule runs btrfs dev ready on each device
>> as it's discovered.  The btrfs command is executed as a builtin
>> command via an IMPORT{builtin} rule, which means it gets
>> executed at rule evaluation time, not rule execution time.  That
>> means that the device mapper links haven't been set up yet and the only
>> nodes that can be depended upon are /dev/dm-#.  That we see
>> /dev/mapper/name names in /proc/mounts is only because we replace the
>> device name we have cached with the one passed in via mount.  If
>> we have a multi-device file system and the primary device is removed,
>> the remaining devices will show /dev/dm-#.  In addition, if the
> 
> And I still do not understand why it is bad while /dev/sd#n is good.

The dm-# names are for in-kernel use because the major:minor mappings
don't change but the tables hosted upon them can be changed to a
completely different mapping without tearing down the device. This is
different behavior than sd#n, which will maintain the same mapping for
the lifetime of the device.

Both device mapper and util-linux treat these names as preferred names.
They're used by mount.  The links are created by device mapper.
Always.  These aren't "arbitrary" names as you insist on calling them.
They are the de facto official names for device mapper devices.

If you mount any file system using e.g. LVM's /dev/vg/lv, it will
appear in 'mount' or /proc/mounts as /dev/mapper/vg-lv.  If you mount
any file system using multipath names, they will similarly appear in
'mount' or /proc/mounts as /dev/mapper/wwid-part#.  It's the same for
dm-crypt, dm-raid, and literally every other device mapper mapping.
There may be other convenience links but these are the names that the
device mapper userspace, which configures everything surrounding device
mapper, uses and expects to exist.

This is different for btrfs.  Every other file system just uses the name
as passed in via the mount command.  Btrfs does too, but only for the
first device and for any devices added from the command line (as opposed
to being discovered), and only as long as there are no change events on
the block device.  Once that change event comes in, or a device is
removed to reveal one of the discovered devices, it reverts to the dm-#
names as provided by the initial udev rule.

That is a *clear* user-visible inconsistency that is easily
reproducible.  That's why it's an issue that needs to be resolved.

That you think that the dm-# names should be used everywhere is an
opinion shared far too late.  The /dev/mapper names are used everywhere
and we need to resolve this inconsistency.  The solution is *not* to fly
in the face of years of user experience and change everything /else/ to
use dm-#.

>> +# Once the device mapper symlink is created, tell btrfs about it
>> +# so we get the friendly name in /proc/mounts (and tools that read it)
>> +ENV{DM_NAME}=="?*", TEST=="/dev/mapper/$env{DM_NAME}", RUN{builtin}+="btrfs ready /dev/mapper/$env{DM_NAME}"
>> +
> 
> That won't work for the very first event (presumably "add"). /dev/mapper
> link is created only after all rules have been processed, so it will
> always evaluate to false.

Yep, I missed that in my test after adding it.  I'd focused on the
partprobe portion of it after adding that rule.  Removing the TEST==
section is safe and results in the correct behavior.  If the link
doesn't exist at that point, device mapper userspace is broken.
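
With the TEST== match dropped, the rule from the patch would read roughly
as follows.  This is only a sketch of the change described above, not the
text of a resubmitted patch.  It assumes, as stated, that the
/dev/mapper link exists by the time the RUN entry is executed.

```udev
SUBSYSTEM!="block", GOTO="btrfs_end"
KERNEL!="dm-[0-9]*", GOTO="btrfs_end"
ACTION!="add|change", GOTO="btrfs_end"
ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"

# RUN entries are queued while rules are processed and executed afterwards,
# so the symlink is not tested for here (assumption: it exists by then).
ENV{DM_NAME}=="?*", RUN{builtin}+="btrfs ready /dev/mapper/$env{DM_NAME}"

LABEL="btrfs_end"
```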

An easy test is attached.  If the before and after are different, that's
the problem.  My results, in both modes, without the rule change look
like this:

Before
/dev/mapper/testvg1-lvtest
/dev/mapper/testvg2-lvtest
After
/dev/dm-1
/dev/mapper/testvg1-lvtest

... and after the rule change:

Before
/dev/mapper/testvg1-lvtest
/dev/mapper/testvg2-lvtest
After
/dev/mapper/testvg1-lvtest
/dev/mapper/testvg2-lvtest

-Jeff

Patch

diff --git a/64-btrfs-dm.rules b/64-btrfs-dm.rules
new file mode 100644
index 0000000..bbe1c35
--- /dev/null
+++ b/64-btrfs-dm.rules
@@ -0,0 +1,10 @@ 
+SUBSYSTEM!="block", GOTO="btrfs_end"
+KERNEL!="dm-[0-9]*", GOTO="btrfs_end"
+ACTION!="add|change", GOTO="btrfs_end"
+ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"
+
+# Once the device mapper symlink is created, tell btrfs about it
+# so we get the friendly name in /proc/mounts (and tools that read it)
+ENV{DM_NAME}=="?*", TEST=="/dev/mapper/$env{DM_NAME}", RUN{builtin}+="btrfs ready /dev/mapper/$env{DM_NAME}"
+
+LABEL="btrfs_end"
diff --git a/Makefile.in b/Makefile.in
index 19697ff..d555f6a 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -83,11 +83,15 @@  libbtrfs_headers = send-stream.h send-utils.h send.h rbtree.h btrfs-list.h \
 	       extent_io.h ioctl.h ctree.h btrfsck.h version.h
 TESTS = fsck-tests.sh convert-tests.sh
 
+udev_rules = 64-btrfs-dm.rules
+
 prefix ?= @prefix@
 exec_prefix = @exec_prefix@
 bindir = @bindir@
 libdir ?= @libdir@
 incdir = @includedir@/btrfs
+udevdir = @UDEVDIR@
+udevruledir = ${udevdir}/rules.d
 
 ifeq ("$(origin V)", "command line")
   BUILD_VERBOSE = $(V)
@@ -377,6 +381,9 @@  install: $(libs) $(progs_install) $(INSTALLDIRS)
 	cp -a $(lib_links) $(DESTDIR)$(libdir)
 	$(INSTALL) -m755 -d $(DESTDIR)$(incdir)
 	$(INSTALL) -m644 $(headers) $(DESTDIR)$(incdir)
+ifneq ($(udevdir), "")
+	$(INSTALL) -m644 $(udev_rules) $(DESTDIR)$(udevruledir)
+endif
 
 install-static: $(progs_static) $(INSTALLDIRS)
 	for p in $(progs_static) ; do \
diff --git a/configure.ac b/configure.ac
index fc343ea..4af7474 100644
--- a/configure.ac
+++ b/configure.ac
@@ -124,6 +124,8 @@  PKG_STATIC(UUID_LIBS_STATIC, [uuid])
 PKG_CHECK_MODULES(ZLIB, [zlib])
 PKG_STATIC(ZLIB_LIBS_STATIC, [zlib])
 
+PKG_CHECK_VAR([UDEVDIR], [udev], [udevdir])
+
 dnl lzo library does not provide pkg-config, let use classic way
 AC_CHECK_LIB([lzo2], [lzo_version], [
 	LZO2_LIBS="-llzo2"