[GIT,PULL] Mount notifications

Message ID 1842689.1596468469@warthog.procyon.org.uk

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/mount-notifications-20200803

Message

David Howells Aug. 3, 2020, 3:27 p.m. UTC
Hi Linus,

Here's a set of patches to add notifications for mount topology events,
such as mounting, unmounting, mount expiry and mount reconfiguration.

The first patch in the series adds a hard limit on the number of watches
that any particular user can add.  The RLIMIT_NOFILE value for the process
adding a watch is used as the limit.  Even if you don't take the rest of
the series, can you at least take this one?
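
For illustration only, the accounting described above could be modelled in
userspace roughly as follows.  The user_watch_acct structure, the helper
names and the -EAGAIN return value are assumptions made for this sketch and
are not taken from the patch; only getrlimit() and RLIMIT_NOFILE are real
interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical per-user accounting record; not the kernel's structure. */
struct user_watch_acct {
	unsigned long nr_watches;	/* watches currently held by this user */
};

/* Return the calling process's soft RLIMIT_NOFILE, or 0 if unreadable. */
static rlim_t current_nofile_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return 0;
	return rl.rlim_cur;
}

/*
 * Charge one watch to the user.  The count is per user, but the limit is
 * supplied by the process that is adding the watch.  The error code is an
 * assumption of this sketch.
 */
static int watch_charge(struct user_watch_acct *acct, rlim_t limit)
{
	assert(acct != NULL);
	if (acct->nr_watches >= limit)
		return -EAGAIN;
	acct->nr_watches++;
	return 0;
}

/* Give the charge back when a watch is removed. */
static void watch_uncharge(struct user_watch_acct *acct)
{
	assert(acct != NULL && acct->nr_watches > 0);
	acct->nr_watches--;
}
```

A caller would pass current_nofile_limit() as the limit when adding a watch
and call watch_uncharge() when the watch goes away.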

An LSM hook is included for an LSM to rule on whether or not a mount watch
may be set on a particular path.

This series is intended to be taken in conjunction with the fsinfo series
which I'll post a pull request for shortly and which is dependent on it.

Karel Zak[*] has created preliminary patches that add support to libmount
and Ian Kent has started working on making systemd use them.

[*] https://github.com/karelzak/util-linux/commits/topic/fsinfo

Note that there have been some last minute changes to the patchset: you
wanted something adding and Miklós wanted some bits taking out/changing.
I've placed a tag, fsinfo-core-20200724, on the aggregate of these two
patchsets that can be compared to fsinfo-core-20200803.

To summarise the changes: I added the limiter that you wanted; removed an
unused symbol; made the mount ID fields in the notification 64-bit (the
fsinfo patchset has a change to convey the mount uniquifier instead of the
mount ID); removed the event counters from the mount notification and moved
the event counters into the fsinfo patchset.


====
WHY?
====

Why do we want mount notifications?  Whilst /proc/mounts can be polled, it
only tells you that something changed in your namespace.  To find out what
changed, you have to trawl /proc/mounts or similar and compare the mount
object attributes and mount topology.  I'm told that the proc file holding
the namespace_sem is a point of contention, especially as the process of
generating the text descriptions of the mounts/superblocks can be quite
involved.
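
To make the cost of the existing approach concrete, here is a minimal sketch
of the polling pattern.  It relies on the mount table files in /proc
signalling a change through POLLERR | POLLPRI; the helper names are made up
for this example, and the re-read step only counts lines where a real
consumer would parse each line and diff it against its previous snapshot.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stdio.h>

/*
 * Wait up to timeout_ms for the mount table behind fd (an open
 * /proc/self/mounts or similar) to change.
 * Returns 1 if a change was signalled, 0 on timeout, -1 if poll() failed.
 */
static int mount_table_changed(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	int n;

	assert(fd >= 0);
	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		return -1;
	return (n > 0 && (pfd.revents & (POLLERR | POLLPRI))) ? 1 : 0;
}

/*
 * The poll only says that something changed, so the whole table has to be
 * read again from the start.  This stand-in just counts the entries.
 */
static long count_mount_entries(FILE *table)
{
	long lines = 0;
	int c;

	assert(table != NULL);
	rewind(table);
	while ((c = getc(table)) != EOF)
		if (c == '\n')
			lines++;
	return lines;
}
```

Every wakeup costs a full re-read and re-parse of the table, which is the
work that a targeted notification is meant to avoid.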

The notification generated here directly indicates the mounts involved in
any particular event and gives an idea of what the change was.
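
As a rough sketch of how a consumer might step through a buffer of such
records: the header below is a local stand-in modelled on the general
watch_queue record header as I understand it (a type/subtype word followed
by an info word whose low 7 bits hold the record length in bytes).  That
layout should be checked against include/uapi/linux/watch_queue.h, and the
mount-specific payload is deliberately not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed length mask for the info word; verify against the uapi header. */
#define REC_INFO_LENGTH 0x7fu

/* Local stand-in for the general notification record header. */
struct rec_header {
	uint32_t type:24;	/* notification source type */
	uint32_t subtype:8;	/* type-specific event code */
	uint32_t info;		/* low bits: record length in bytes */
};

/*
 * Walk the records in buf and count those whose type is want_type.
 * Returns the count, or -1 if a record is truncated or has a bad length.
 */
static long walk_records(const unsigned char *buf, size_t len,
			 uint32_t want_type)
{
	size_t off = 0;
	long matches = 0;

	assert(buf != NULL || len == 0);
	while (off < len) {
		struct rec_header h;
		size_t rec_len;

		if (len - off < sizeof(h))
			return -1;
		memcpy(&h, buf + off, sizeof(h));	/* avoid unaligned access */
		rec_len = h.info & REC_INFO_LENGTH;
		if (rec_len < sizeof(h) || rec_len > len - off)
			return -1;
		if (h.type == want_type)
			matches++;
		off += rec_len;
	}
	return matches;
}
```

A real reader would dispatch on the type and subtype and decode the payload
that follows the header rather than just counting matches.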

This is combined with a new fsinfo() system call that, amongst other
things, can retrieve in one go an { id, change_counter } tuple for each
child of a specified mount, allowing notification buffer overruns to be
dealt with quickly.
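
One way such tuples could be used after an overrun is to compare a saved
snapshot against a fresh one.  The sketch below assumes both snapshots are
sorted by id; the mount_stamp and mount_delta types and the function name
are hypothetical and say nothing about the actual fsinfo() interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical { id, change_counter } tuple for one child mount. */
struct mount_stamp {
	uint64_t id;
	uint64_t change_counter;
};

enum mount_delta_kind { MOUNT_ADDED, MOUNT_REMOVED, MOUNT_CHANGED };

struct mount_delta {
	uint64_t id;
	enum mount_delta_kind kind;
};

/* Store a delta if there is room; always bump the running total. */
static void emit_delta(struct mount_delta *out, size_t cap, size_t *n,
		       uint64_t id, enum mount_delta_kind kind)
{
	if (*n < cap) {
		out[*n].id = id;
		out[*n].kind = kind;
	}
	(*n)++;
}

/*
 * Merge-walk two snapshots sorted by id and report what differs.
 * Returns the total number of deltas, which may exceed cap.
 */
static size_t diff_mount_stamps(const struct mount_stamp *old, size_t n_old,
				const struct mount_stamp *cur, size_t n_cur,
				struct mount_delta *out, size_t cap)
{
	size_t i = 0, j = 0, n = 0;

	assert(old != NULL || n_old == 0);
	assert(cur != NULL || n_cur == 0);
	assert(out != NULL || cap == 0);
	while (i < n_old || j < n_cur) {
		if (j == n_cur || (i < n_old && old[i].id < cur[j].id)) {
			emit_delta(out, cap, &n, old[i++].id, MOUNT_REMOVED);
		} else if (i == n_old || cur[j].id < old[i].id) {
			emit_delta(out, cap, &n, cur[j++].id, MOUNT_ADDED);
		} else {
			if (old[i].change_counter != cur[j].change_counter)
				emit_delta(out, cap, &n, cur[j].id,
					   MOUNT_CHANGED);
			i++;
			j++;
		}
	}
	return n;
}
```

Only the mounts reported as added or changed would then need a detailed
query, instead of re-reading everything.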

This is of use to systemd to improve efficiency:

	https://lore.kernel.org/linux-fsdevel/20200227151421.3u74ijhqt6ekbiss@ws.net.home/

And it's not just Red Hat that's potentially interested in this:

	https://lore.kernel.org/linux-fsdevel/293c9bd3-f530-d75e-c353-ddeabac27cf6@6wind.com/


David
---
The following changes since commit ba47d845d715a010f7b51f6f89bae32845e6acb7:

  Linux 5.8-rc6 (2020-07-19 15:41:18 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/mount-notifications-20200803

for you to fetch changes up to 841a0dfa511364fa9a8d67512e0643669f1f03e3:

  watch_queue: sample: Display mount tree change notifications (2020-08-03 12:15:38 +0100)

----------------------------------------------------------------
Mount notifications

----------------------------------------------------------------
David Howells (5):
      watch_queue: Limit the number of watches a user can hold
      watch_queue: Make watch_sizeof() check record size
      watch_queue: Add security hooks to rule on setting mount watches
      watch_queue: Implement mount topology and attribute change notifications
      watch_queue: sample: Display mount tree change notifications

 Documentation/watch_queue.rst               |  12 +-
 arch/alpha/kernel/syscalls/syscall.tbl      |   1 +
 arch/arm/tools/syscall.tbl                  |   1 +
 arch/arm64/include/asm/unistd.h             |   2 +-
 arch/arm64/include/asm/unistd32.h           |   2 +
 arch/ia64/kernel/syscalls/syscall.tbl       |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   1 +
 arch/s390/kernel/syscalls/syscall.tbl       |   1 +
 arch/sh/kernel/syscalls/syscall.tbl         |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   1 +
 fs/Kconfig                                  |   9 ++
 fs/Makefile                                 |   1 +
 fs/mount.h                                  |  18 +++
 fs/mount_notify.c                           | 222 ++++++++++++++++++++++++++++
 fs/namespace.c                              |  22 +++
 include/linux/dcache.h                      |   1 +
 include/linux/lsm_hook_defs.h               |   3 +
 include/linux/lsm_hooks.h                   |   6 +
 include/linux/sched/user.h                  |   3 +
 include/linux/security.h                    |   8 +
 include/linux/syscalls.h                    |   2 +
 include/linux/watch_queue.h                 |   7 +-
 include/uapi/asm-generic/unistd.h           |   4 +-
 include/uapi/linux/watch_queue.h            |  31 +++-
 kernel/sys_ni.c                             |   3 +
 kernel/watch_queue.c                        |   8 +
 samples/watch_queue/watch_test.c            |  41 ++++-
 security/security.c                         |   7 +
 37 files changed, 422 insertions(+), 6 deletions(-)
 create mode 100644 fs/mount_notify.c

Comments

Ian Kent Aug. 3, 2020, 10:48 p.m. UTC | #1
On Mon, 2020-08-03 at 16:27 +0100, David Howells wrote:
> Hi Linus,
> 
> Here's a set of patches to add notifications for mount topology
> events,
> such as mounting, unmounting, mount expiry, mount reconfiguration.
> 
> The first patch in the series adds a hard limit on the number of
> watches
> that any particular user can add.  The RLIMIT_NOFILE value for the
> process
> adding a watch is used as the limit.  Even if you don't take the rest
> of
> the series, can you at least take this one?
> 
> An LSM hook is included for an LSM to rule on whether or not a mount
> watch
> may be set on a particular path.
> 
> This series is intended to be taken in conjunction with the fsinfo
> series
> which I'll post a pull request for shortly and which is dependent on
> it.
> 
> Karel Zak[*] has created preliminary patches that add support to
> libmount
> and Ian Kent has started working on making systemd use them.
> 
> [*] https://github.com/karelzak/util-linux/commits/topic/fsinfo
> 
> Note that there have been some last minute changes to the patchset:
> you
> wanted something adding and Miklós wanted some bits taking
> out/changing.
> I've placed a tag, fsinfo-core-20200724 on the aggregate of these two
> patchsets that can be compared to fsinfo-core-20200803.
> 
> To summarise the changes: I added the limiter that you wanted;
> removed an
> unused symbol; made the mount ID fields in the notification 64-bit
> (the
> fsinfo patchset has a change to convey the mount uniquifier instead
> of the
> mount ID); removed the event counters from the mount notification and
> moved
> the event counters into the fsinfo patchset.

I've pushed my systemd changes to a github repo.
I haven't yet updated it with the changes above but will get to it.

They can be found at:
https://github.com/raven-au/systemd.git branch notifications-devel
