mbox series

[00/25] VFS: Introduce filesystem information query syscall [ver #14]

Message ID 156138532485.25627.7459410522109581052.stgit@warthog.procyon.org.uk (mailing list archive)
Headers show
Series VFS: Introduce filesystem information query syscall [ver #14] | expand

Message

David Howells June 24, 2019, 2:08 p.m. UTC
Hi Al,

Here are a set of patches that adds a syscall, fsinfo(), that allows
attributes of a filesystem/superblock to be queried.  Attribute values are
of four basic types:

 (1) Version dependent-length structure (size defined by type).

 (2) Variable-length string (up to PAGE_SIZE).

 (3) Array of fixed-length structures (up to INT_MAX size).

 (4) Opaque blob (up to INT_MAX size).

Attributes can have multiple values in up to two dimensions and all the
values of a particular attribute must have the same type.

Note that the attribute values *are* allowed to vary between dentries
within a single superblock, depending on the specific dentry that you're
looking at.

I've tried to make the interface as light as possible, so integer/enum
attribute selector rather than string and the core does all the allocation
and extensibility support work rather than leaving that to the filesystems.
That means that for the first two attribute types, sb->s_op->fsinfo() may
assume that the provided buffer is always present and always big enough.

Further, this removes the possibility of the filesystem gaining access to the
userspace buffer.


fsinfo() allows a variety of information to be retrieved about a filesystem
and the mount topology:

 (1) General superblock attributes:

      - The amount of space/free space in a filesystem (as statfs()).
      - Filesystem identifiers (UUID, volume label, device numbers, ...)
      - The limits on a filesystem's capabilities
      - Information on supported statx fields and attributes and IOC flags.
      - A variety single-bit flags indicating supported capabilities.
      - Timestamp resolution and range.
      - Sources (as per mount(2), but fsconfig() allows multiple sources).
      - In-filesystem filename format information.
      - Filesystem parameters ("mount -o xxx"-type things).
      - LSM parameters (again "mount -o xxx"-type things).

 (2) Filesystem-specific superblock attributes:

      - Server names and addresses.
      - Cell name.

 (3) Filesystem configuration metadata attributes:

      - Filesystem parameter type descriptions.
      - Name -> parameter mappings.
      - Simple enumeration name -> value mappings.

 (4) Mount topology:

      - General information about a mount object.
      - Mount device name(s).
      - Children of a mount object and their relative paths.

 (5) Information about what the fsinfo() syscall itself supports, including
     the number of attibutes supported and the number of capability bits
     supported.


The system is extensible:

 (1) New attributes can be added.  There is no requirement that a
     filesystem implement every attribute.  Note that the core VFS keeps a
     table of types and sizes so it can handle future extensibility rather
     than delegating this to the filesystems.

 (2) Version length-dependent structure attributes can be made larger and
     have additional information tacked on the end, provided it keeps the
     layout of the existing fields.  If an older process asks for a shorter
     structure, it will only be given the bits it asks for.  If a newer
     process asks for a longer structure on an older kernel, the extra
     space will be set to 0.  In all cases, the size of the data actually
     available is returned.

     In essence, the size of a structure is that structure's version: a
     smaller size is an earlier version and a later version includes
     everything that the earlier version did.

 (3) New single-bit capability flags can be added.  This is a structure-typed
     attribute and, as such, (2) applies.  Any bits you wanted but the kernel
     doesn't support are automatically set to 0.

If a filesystem-specific attribute is added, it should just take up the next
number in the enumeration.  Currently, I do not intend that the number space
should be subdivided between interested parties.


fsinfo() may be called like the following, for example:

	struct fsinfo_params params = {
		.at_flags	= AT_SYMLINK_NOFOLLOW,
		.request	= FSINFO_ATTR_SERVER_ADDRESS;
		.Nth		= 2;
		.Mth		= 1;
	};
	struct fsinfo_server_address address;

	len = fsinfo(AT_FDCWD, "/afs/grand.central.org/doc", &params,
		     &address, sizeof(address));

The above example would query a network filesystem, such as AFS or NFS, and
ask what the 2nd address (Mth) of the 3rd server (Nth) that the superblock is
using is.  Whereas:

	struct fsinfo_params params = {
		.at_flags	= AT_SYMLINK_NOFOLLOW,
		.request	= FSINFO_ATTR_CELL_NAME;
	};
	char cell_name[256];

	len = fsinfo(AT_FDCWD, "/afs/grand.central.org/doc", &params,
		     &cell_name, sizeof(cell_name));

would retrieve the name of an AFS cell as a string.

fsinfo() can also be used to query a context from fsopen() or fspick():

	fd = fsopen("ext4", 0);
	struct fsinfo_params params = {
		.request	= FSINFO_ATTR_PARAM_DESCRIPTION;
	};
	struct fsinfo_param_description desc;
	fsinfo(fd, NULL, &params, &desc, sizeof(desc));

even if that context doesn't currently have a superblock attached (though if
there's no superblock attached, only filesystem-specific things like parameter
descriptions can be accessed).

The patches can be found here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git

on branch:

	fsinfo


===================
SIGNIFICANT CHANGES
===================

 ver #14:

 (*) Increase to 128-bit the fields for number of blocks and files in the
     filesystem and also the max file size and max inode number fields.

 (*) Increase to 64-bit the fields for max hard links and max xattr body
     length.

 (*) Provide struct fsinfo_timestamp_one to represent the characteristics
     of a single timestamp and move the range into it.  FAT, for example,
     has different ranges for different timestamps.  Each timestamp is then
     represented by one of these structs.

 (*) Don't expose MS_* flags (such as MS_RDONLY) through this interface as
     they ought to be considered deprecated; instead anyone who wants them
     should parse FSINFO_ATTR_PARAMETERS for the string equivalents.

 (*) Add a flag, AT_FSINFO_FROM_FSOPEN, to indicate that the fd being
     accessed is from fsopen()/fspick() and that fsinfo() should look
     inside and access the filesystem referred to by the fs_context.

 (*) If the filesystem implements FSINFO_ATTR_PARAMETERS for itself, don't
     automatically include flags for the SB_* bits that are normally
     rendered by, say, /proc/mounts (such as SB_RDONLY).  Rather, a helper
     is provided that the filesystem must call with an appropriately
     wangled s_flags.

 (*) Drop the NFS fsinfo patch for now as NFS fs_context support is
     unlikely to get upstream in the upcoming merge window.

 ver #13:

 (*) Provided a "fixed-struct array" type so that the list of children of a
     mount and all their change counters can be read atomically.

 (*) Additional filesystem examples.

 (*) Documented the API.

 ver #12:

 (*) Rename ->get_fsinfo() to ->fsinfo().

 (*) Pass the path through to to ->fsinfo() as it's needed for NFS to
     retrocalculate the source name.

 (*) Indicated which is the source parameter in the param-description
     attribute.

 (*) Dropped the realm attribute.

David
---
David Howells (17):
      vfs: syscall: Add fsinfo() to query filesystem information
      fsinfo: Add syscalls to other arches
      vfs: Allow fsinfo() to query what's in an fs_context
      vfs: Allow fsinfo() to be used to query an fs parameter description
      vfs: Implement parameter value retrieval with fsinfo()
      fsinfo: Implement retrieval of LSM parameters with fsinfo()
      vfs: Introduce a non-repeating system-unique superblock ID
      vfs: Allow fsinfo() to look up a mount object by ID
      vfs: Add mount notification count
      vfs: Allow mount information to be queried by fsinfo()
      vfs: fsinfo sample: Mount listing program
      fsinfo: Add API documentation
      hugetlbfs: Add support for fsinfo()
      kernfs, cgroup: Add fsinfo support
      fsinfo: Support SELinux superblock parameter retrieval
      fsinfo: Support Smack superblock parameter retrieval
      afs: Support fsinfo()

Ian Kent (8):
      fsinfo: proc - add sb operation fsinfo()
      fsinfo: autofs - add sb operation fsinfo()
      fsinfo: shmem - add tmpfs sb operation fsinfo()
      fsinfo: devpts - add sb operation fsinfo()
      fsinfo: pstore - add sb operation fsinfo()
      fsinfo: debugfs - add sb operation fsinfo()
      fsinfo: bpf - add sb operation fsinfo()
      fsinfo: ufs - add sb operation fsinfo()


 Documentation/filesystems/fsinfo.rst        |  596 ++++++++++++++++++
 arch/alpha/kernel/syscalls/syscall.tbl      |    1 
 arch/arm/tools/syscall.tbl                  |    1 
 arch/arm64/include/asm/unistd.h             |    2 
 arch/ia64/kernel/syscalls/syscall.tbl       |    1 
 arch/m68k/kernel/syscalls/syscall.tbl       |    1 
 arch/microblaze/kernel/syscalls/syscall.tbl |    1 
 arch/mips/kernel/syscalls/syscall_n32.tbl   |    1 
 arch/mips/kernel/syscalls/syscall_n64.tbl   |    1 
 arch/mips/kernel/syscalls/syscall_o32.tbl   |    1 
 arch/parisc/kernel/syscalls/syscall.tbl     |    1 
 arch/powerpc/kernel/syscalls/syscall.tbl    |    1 
 arch/s390/kernel/syscalls/syscall.tbl       |    1 
 arch/sh/kernel/syscalls/syscall.tbl         |    1 
 arch/sparc/kernel/syscalls/syscall.tbl      |    1 
 arch/x86/entry/syscalls/syscall_32.tbl      |    1 
 arch/x86/entry/syscalls/syscall_64.tbl      |    1 
 arch/xtensa/kernel/syscalls/syscall.tbl     |    1 
 fs/Kconfig                                  |    7 
 fs/Makefile                                 |    1 
 fs/afs/internal.h                           |    1 
 fs/afs/super.c                              |  180 +++++-
 fs/autofs/inode.c                           |   64 ++
 fs/d_path.c                                 |    2 
 fs/debugfs/inode.c                          |   38 +
 fs/devpts/inode.c                           |   43 +
 fs/fsinfo.c                                 |  877 +++++++++++++++++++++++++++
 fs/hugetlbfs/inode.c                        |   57 ++
 fs/internal.h                               |   11 
 fs/kernfs/mount.c                           |   20 +
 fs/mount.h                                  |   22 +
 fs/namespace.c                              |  307 +++++++++
 fs/proc/inode.c                             |   37 +
 fs/pstore/inode.c                           |   32 +
 fs/statfs.c                                 |    2 
 fs/super.c                                  |   24 +
 fs/ufs/super.c                              |   58 ++
 include/linux/fs.h                          |    8 
 include/linux/fsinfo.h                      |   70 ++
 include/linux/kernfs.h                      |    4 
 include/linux/lsm_hooks.h                   |   13 
 include/linux/security.h                    |   11 
 include/linux/syscalls.h                    |    4 
 include/uapi/asm-generic/unistd.h           |    4 
 include/uapi/linux/fcntl.h                  |    3 
 include/uapi/linux/fsinfo.h                 |  319 ++++++++++
 kernel/bpf/inode.c                          |   25 +
 kernel/cgroup/cgroup-v1.c                   |   44 +
 kernel/cgroup/cgroup.c                      |   19 +
 kernel/sys_ni.c                             |    1 
 mm/shmem.c                                  |   72 ++
 samples/vfs/Makefile                        |    9 
 samples/vfs/test-fs-query.c                 |  138 ++++
 samples/vfs/test-fsinfo.c                   |  682 +++++++++++++++++++++
 samples/vfs/test-mntinfo.c                  |  241 +++++++
 security/security.c                         |   12 
 security/selinux/hooks.c                    |   41 +
 security/selinux/include/security.h         |    2 
 security/selinux/ss/services.c              |   49 ++
 security/smack/smack_lsm.c                  |   43 +
 60 files changed, 4202 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/filesystems/fsinfo.rst
 create mode 100644 fs/fsinfo.c
 create mode 100644 include/linux/fsinfo.h
 create mode 100644 include/uapi/linux/fsinfo.h
 create mode 100644 samples/vfs/test-fs-query.c
 create mode 100644 samples/vfs/test-fsinfo.c
 create mode 100644 samples/vfs/test-mntinfo.c

Comments

Christian Brauner June 26, 2019, 10:05 a.m. UTC | #1
On Mon, Jun 24, 2019 at 03:08:45PM +0100, David Howells wrote:
> 
> Hi Al,
> 
> Here are a set of patches that adds a syscall, fsinfo(), that allows
> attributes of a filesystem/superblock to be queried.  Attribute values are
> of four basic types:
> 
>  (1) Version dependent-length structure (size defined by type).
> 
>  (2) Variable-length string (up to PAGE_SIZE).
> 
>  (3) Array of fixed-length structures (up to INT_MAX size).
> 
>  (4) Opaque blob (up to INT_MAX size).
> 
> Attributes can have multiple values in up to two dimensions and all the
> values of a particular attribute must have the same type.
> 
> Note that the attribute values *are* allowed to vary between dentries
> within a single superblock, depending on the specific dentry that you're
> looking at.
> 
> I've tried to make the interface as light as possible, so integer/enum
> attribute selector rather than string and the core does all the allocation
> and extensibility support work rather than leaving that to the filesystems.
> That means that for the first two attribute types, sb->s_op->fsinfo() may
> assume that the provided buffer is always present and always big enough.
> 
> Further, this removes the possibility of the filesystem gaining access to the
> userspace buffer.
> 
> 
> fsinfo() allows a variety of information to be retrieved about a filesystem
> and the mount topology:
> 
>  (1) General superblock attributes:
> 
>       - The amount of space/free space in a filesystem (as statfs()).
>       - Filesystem identifiers (UUID, volume label, device numbers, ...)
>       - The limits on a filesystem's capabilities
>       - Information on supported statx fields and attributes and IOC flags.
>       - A variety single-bit flags indicating supported capabilities.
>       - Timestamp resolution and range.
>       - Sources (as per mount(2), but fsconfig() allows multiple sources).
>       - In-filesystem filename format information.
>       - Filesystem parameters ("mount -o xxx"-type things).
>       - LSM parameters (again "mount -o xxx"-type things).
> 
>  (2) Filesystem-specific superblock attributes:
> 
>       - Server names and addresses.
>       - Cell name.
> 
>  (3) Filesystem configuration metadata attributes:
> 
>       - Filesystem parameter type descriptions.
>       - Name -> parameter mappings.
>       - Simple enumeration name -> value mappings.
> 
>  (4) Mount topology:
> 
>       - General information about a mount object.
>       - Mount device name(s).
>       - Children of a mount object and their relative paths.
> 
>  (5) Information about what the fsinfo() syscall itself supports, including
>      the number of attibutes supported and the number of capability bits
>      supported.

Phew, this patchset is a lot. It's good of course but can we please cut
some of the more advanced features such as querying by mount id,
submounts etc. pp. for now?
I feel this would help with review and since your interface is
extensible it's really not a big deal if we defer fancy features to
later cycles after people had more time to review and the interface has
seen some exposure.

The mount api changes over the last months have honestly been so huge
that any chance to make the changes smaller and easier to digest we
should take. (I'm really not complaining. Good that the work is done and
it's entirely ok that it's a lot of code.)

It would also be great if after you have dropped some stuff from this
patchset and gotten an Ack we could stuff it into linux-next for some
time because it hasn't been so far...

Christian
Ian Kent June 26, 2019, 10:42 a.m. UTC | #2
On Wed, 2019-06-26 at 12:05 +0200, Christian Brauner wrote:
> On Mon, Jun 24, 2019 at 03:08:45PM +0100, David Howells wrote:
> > Hi Al,
> > 
> > Here are a set of patches that adds a syscall, fsinfo(), that allows
> > attributes of a filesystem/superblock to be queried.  Attribute values are
> > of four basic types:
> > 
> >  (1) Version dependent-length structure (size defined by type).
> > 
> >  (2) Variable-length string (up to PAGE_SIZE).
> > 
> >  (3) Array of fixed-length structures (up to INT_MAX size).
> > 
> >  (4) Opaque blob (up to INT_MAX size).
> > 
> > Attributes can have multiple values in up to two dimensions and all the
> > values of a particular attribute must have the same type.
> > 
> > Note that the attribute values *are* allowed to vary between dentries
> > within a single superblock, depending on the specific dentry that you're
> > looking at.
> > 
> > I've tried to make the interface as light as possible, so integer/enum
> > attribute selector rather than string and the core does all the allocation
> > and extensibility support work rather than leaving that to the filesystems.
> > That means that for the first two attribute types, sb->s_op->fsinfo() may
> > assume that the provided buffer is always present and always big enough.
> > 
> > Further, this removes the possibility of the filesystem gaining access to
> > the
> > userspace buffer.
> > 
> > 
> > fsinfo() allows a variety of information to be retrieved about a filesystem
> > and the mount topology:
> > 
> >  (1) General superblock attributes:
> > 
> >       - The amount of space/free space in a filesystem (as statfs()).
> >       - Filesystem identifiers (UUID, volume label, device numbers, ...)
> >       - The limits on a filesystem's capabilities
> >       - Information on supported statx fields and attributes and IOC flags.
> >       - A variety single-bit flags indicating supported capabilities.
> >       - Timestamp resolution and range.
> >       - Sources (as per mount(2), but fsconfig() allows multiple sources).
> >       - In-filesystem filename format information.
> >       - Filesystem parameters ("mount -o xxx"-type things).
> >       - LSM parameters (again "mount -o xxx"-type things).
> > 
> >  (2) Filesystem-specific superblock attributes:
> > 
> >       - Server names and addresses.
> >       - Cell name.
> > 
> >  (3) Filesystem configuration metadata attributes:
> > 
> >       - Filesystem parameter type descriptions.
> >       - Name -> parameter mappings.
> >       - Simple enumeration name -> value mappings.
> > 
> >  (4) Mount topology:
> > 
> >       - General information about a mount object.
> >       - Mount device name(s).
> >       - Children of a mount object and their relative paths.
> > 
> >  (5) Information about what the fsinfo() syscall itself supports, including
> >      the number of attibutes supported and the number of capability bits
> >      supported.
> 
> Phew, this patchset is a lot. It's good of course but can we please cut
> some of the more advanced features such as querying by mount id,
> submounts etc. pp. for now?

Did you mean the "vfs: Allow fsinfo() to look up a mount object by ID"
patch?

We would need to be very careful what was dropped.

For example, I've found that the patch above is pretty much essential
for fsinfo() to be useful from user space.

> I feel this would help with review and since your interface is
> extensible it's really not a big deal if we defer fancy features to
> later cycles after people had more time to review and the interface has
> seen some exposure.
> 
> The mount api changes over the last months have honestly been so huge
> that any chance to make the changes smaller and easier to digest we
> should take. (I'm really not complaining. Good that the work is done and
> it's entirely ok that it's a lot of code.)
> 
> It would also be great if after you have dropped some stuff from this
> patchset and gotten an Ack we could stuff it into linux-next for some
> time because it hasn't been so far...
> 
> Christian
Christian Brauner June 26, 2019, 10:47 a.m. UTC | #3
On Wed, Jun 26, 2019 at 06:42:51PM +0800, Ian Kent wrote:
> On Wed, 2019-06-26 at 12:05 +0200, Christian Brauner wrote:
> > On Mon, Jun 24, 2019 at 03:08:45PM +0100, David Howells wrote:
> > > Hi Al,
> > > 
> > > Here are a set of patches that adds a syscall, fsinfo(), that allows
> > > attributes of a filesystem/superblock to be queried.  Attribute values are
> > > of four basic types:
> > > 
> > >  (1) Version dependent-length structure (size defined by type).
> > > 
> > >  (2) Variable-length string (up to PAGE_SIZE).
> > > 
> > >  (3) Array of fixed-length structures (up to INT_MAX size).
> > > 
> > >  (4) Opaque blob (up to INT_MAX size).
> > > 
> > > Attributes can have multiple values in up to two dimensions and all the
> > > values of a particular attribute must have the same type.
> > > 
> > > Note that the attribute values *are* allowed to vary between dentries
> > > within a single superblock, depending on the specific dentry that you're
> > > looking at.
> > > 
> > > I've tried to make the interface as light as possible, so integer/enum
> > > attribute selector rather than string and the core does all the allocation
> > > and extensibility support work rather than leaving that to the filesystems.
> > > That means that for the first two attribute types, sb->s_op->fsinfo() may
> > > assume that the provided buffer is always present and always big enough.
> > > 
> > > Further, this removes the possibility of the filesystem gaining access to
> > > the
> > > userspace buffer.
> > > 
> > > 
> > > fsinfo() allows a variety of information to be retrieved about a filesystem
> > > and the mount topology:
> > > 
> > >  (1) General superblock attributes:
> > > 
> > >       - The amount of space/free space in a filesystem (as statfs()).
> > >       - Filesystem identifiers (UUID, volume label, device numbers, ...)
> > >       - The limits on a filesystem's capabilities
> > >       - Information on supported statx fields and attributes and IOC flags.
> > >       - A variety single-bit flags indicating supported capabilities.
> > >       - Timestamp resolution and range.
> > >       - Sources (as per mount(2), but fsconfig() allows multiple sources).
> > >       - In-filesystem filename format information.
> > >       - Filesystem parameters ("mount -o xxx"-type things).
> > >       - LSM parameters (again "mount -o xxx"-type things).
> > > 
> > >  (2) Filesystem-specific superblock attributes:
> > > 
> > >       - Server names and addresses.
> > >       - Cell name.
> > > 
> > >  (3) Filesystem configuration metadata attributes:
> > > 
> > >       - Filesystem parameter type descriptions.
> > >       - Name -> parameter mappings.
> > >       - Simple enumeration name -> value mappings.
> > > 
> > >  (4) Mount topology:
> > > 
> > >       - General information about a mount object.
> > >       - Mount device name(s).
> > >       - Children of a mount object and their relative paths.
> > > 
> > >  (5) Information about what the fsinfo() syscall itself supports, including
> > >      the number of attibutes supported and the number of capability bits
> > >      supported.
> > 
> > Phew, this patchset is a lot. It's good of course but can we please cut
> > some of the more advanced features such as querying by mount id,
> > submounts etc. pp. for now?
> 
> Did you mean the "vfs: Allow fsinfo() to look up a mount object by ID"
> patch?
> 
> We would need to be very careful what was dropped.

Not dropped as in never implement but rather defer it by one merge
window to give us a) more time to review and settle the interface while
b) not stalling the overall patch.

> 
> For example, I've found that the patch above is pretty much essential
> for fsinfo() to be useful from user space.

Yeah, but that interface is not clearly defined yet as can be seen from
the commit message and that's what's bothering me most.
Christian Brauner June 26, 2019, 1:19 p.m. UTC | #4
On Wed, Jun 26, 2019 at 12:05:25PM +0200, Christian Brauner wrote:
> On Mon, Jun 24, 2019 at 03:08:45PM +0100, David Howells wrote:
> > 
> > Hi Al,
> > 
> > Here are a set of patches that adds a syscall, fsinfo(), that allows
> > attributes of a filesystem/superblock to be queried.  Attribute values are
> > of four basic types:
> > 
> >  (1) Version dependent-length structure (size defined by type).
> > 
> >  (2) Variable-length string (up to PAGE_SIZE).
> > 
> >  (3) Array of fixed-length structures (up to INT_MAX size).
> > 
> >  (4) Opaque blob (up to INT_MAX size).
> > 
> > Attributes can have multiple values in up to two dimensions and all the
> > values of a particular attribute must have the same type.
> > 
> > Note that the attribute values *are* allowed to vary between dentries
> > within a single superblock, depending on the specific dentry that you're
> > looking at.
> > 
> > I've tried to make the interface as light as possible, so integer/enum
> > attribute selector rather than string and the core does all the allocation
> > and extensibility support work rather than leaving that to the filesystems.
> > That means that for the first two attribute types, sb->s_op->fsinfo() may
> > assume that the provided buffer is always present and always big enough.
> > 
> > Further, this removes the possibility of the filesystem gaining access to the
> > userspace buffer.
> > 
> > 
> > fsinfo() allows a variety of information to be retrieved about a filesystem
> > and the mount topology:
> > 
> >  (1) General superblock attributes:
> > 
> >       - The amount of space/free space in a filesystem (as statfs()).
> >       - Filesystem identifiers (UUID, volume label, device numbers, ...)
> >       - The limits on a filesystem's capabilities
> >       - Information on supported statx fields and attributes and IOC flags.
> >       - A variety single-bit flags indicating supported capabilities.
> >       - Timestamp resolution and range.
> >       - Sources (as per mount(2), but fsconfig() allows multiple sources).
> >       - In-filesystem filename format information.
> >       - Filesystem parameters ("mount -o xxx"-type things).
> >       - LSM parameters (again "mount -o xxx"-type things).
> > 
> >  (2) Filesystem-specific superblock attributes:
> > 
> >       - Server names and addresses.
> >       - Cell name.
> > 
> >  (3) Filesystem configuration metadata attributes:
> > 
> >       - Filesystem parameter type descriptions.
> >       - Name -> parameter mappings.
> >       - Simple enumeration name -> value mappings.
> > 
> >  (4) Mount topology:
> > 
> >       - General information about a mount object.
> >       - Mount device name(s).
> >       - Children of a mount object and their relative paths.
> > 
> >  (5) Information about what the fsinfo() syscall itself supports, including
> >      the number of attibutes supported and the number of capability bits
> >      supported.
> 
> Phew, this patchset is a lot. It's good of course but can we please cut
> some of the more advanced features such as querying by mount id,
> submounts etc. pp. for now?
> I feel this would help with review and since your interface is
> extensible it's really not a big deal if we defer fancy features to
> later cycles after people had more time to review and the interface has
> seen some exposure.
> 
> The mount api changes over the last months have honestly been so huge
> that any chance to make the changes smaller and easier to digest we
> should take. (I'm really not complaining. Good that the work is done and
> it's entirely ok that it's a lot of code.)
> 
> It would also be great if after you have dropped some stuff from this
> patchset and gotten an Ack we could stuff it into linux-next for some
> time because it hasn't been so far...

And I also very much recommend to remove any potential cross-dependency
between the fsinfo() and the notification patchset.
Ideally, I'd like to see fsinfo() to be completely independent to not
block it on something way more controversial.
Furthermore, I can't possibly keep the context of another huge patchset
not yet merged in the back of my mind while reviewing this patchset. :)

Christian
David Howells June 26, 2019, 2:31 p.m. UTC | #5
Christian Brauner <christian@brauner.io> wrote:

> And I also very much recommend to remove any potential cross-dependency
> between the fsinfo() and the notification patchset.

The problem with that is that to make the notification patchset useful for
mount notifications, you need some information that you would obtain through
fsinfo().

David
Christian Brauner June 26, 2019, 2:50 p.m. UTC | #6
On Wed, Jun 26, 2019 at 03:31:37PM +0100, David Howells wrote:
> Christian Brauner <christian@brauner.io> wrote:
> 
> > And I also very much recommend to remove any potential cross-dependency
> > between the fsinfo() and the notification patchset.
> 
> The problem with that is that to make the notification patchset useful for
> mount notifications, you need some information that you would obtain through
> fsinfo().

But would it really be that bad if you'd just land fsinfo() and then
focus on the notification stuff. This very much rather looks like a
timing issue than a conceptual one, i.e. you could very much just push
fsinfo() and leave the notification stuff alone until that is done.

Once fsinfo() has landed you can then go on to put additional bits you
need from or for fsinfo() for the notification patchset in there. Seems
you have at least sketched both APIs sufficiently that you know what you
need to look out for to not cause any regressions later on when you need
to expand them.

Christian
Ian Kent June 27, 2019, 12:38 a.m. UTC | #7
On Wed, 2019-06-26 at 12:47 +0200, Christian Brauner wrote:
> On Wed, Jun 26, 2019 at 06:42:51PM +0800, Ian Kent wrote:
> > On Wed, 2019-06-26 at 12:05 +0200, Christian Brauner wrote:
> > > On Mon, Jun 24, 2019 at 03:08:45PM +0100, David Howells wrote:
> > > > Hi Al,
> > > > 
> > > > Here are a set of patches that adds a syscall, fsinfo(), that allows
> > > > attributes of a filesystem/superblock to be queried.  Attribute values
> > > > are
> > > > of four basic types:
> > > > 
> > > >  (1) Version dependent-length structure (size defined by type).
> > > > 
> > > >  (2) Variable-length string (up to PAGE_SIZE).
> > > > 
> > > >  (3) Array of fixed-length structures (up to INT_MAX size).
> > > > 
> > > >  (4) Opaque blob (up to INT_MAX size).
> > > > 
> > > > Attributes can have multiple values in up to two dimensions and all the
> > > > values of a particular attribute must have the same type.
> > > > 
> > > > Note that the attribute values *are* allowed to vary between dentries
> > > > within a single superblock, depending on the specific dentry that you're
> > > > looking at.
> > > > 
> > > > I've tried to make the interface as light as possible, so integer/enum
> > > > attribute selector rather than string and the core does all the
> > > > allocation
> > > > and extensibility support work rather than leaving that to the
> > > > filesystems.
> > > > That means that for the first two attribute types, sb->s_op->fsinfo()
> > > > may
> > > > assume that the provided buffer is always present and always big enough.
> > > > 
> > > > Further, this removes the possibility of the filesystem gaining access
> > > > to
> > > > the
> > > > userspace buffer.
> > > > 
> > > > 
> > > > fsinfo() allows a variety of information to be retrieved about a
> > > > filesystem
> > > > and the mount topology:
> > > > 
> > > >  (1) General superblock attributes:
> > > > 
> > > >       - The amount of space/free space in a filesystem (as statfs()).
> > > >       - Filesystem identifiers (UUID, volume label, device numbers, ...)
> > > >       - The limits on a filesystem's capabilities
> > > >       - Information on supported statx fields and attributes and IOC
> > > > flags.
> > > >       - A variety single-bit flags indicating supported capabilities.
> > > >       - Timestamp resolution and range.
> > > >       - Sources (as per mount(2), but fsconfig() allows multiple
> > > > sources).
> > > >       - In-filesystem filename format information.
> > > >       - Filesystem parameters ("mount -o xxx"-type things).
> > > >       - LSM parameters (again "mount -o xxx"-type things).
> > > > 
> > > >  (2) Filesystem-specific superblock attributes:
> > > > 
> > > >       - Server names and addresses.
> > > >       - Cell name.
> > > > 
> > > >  (3) Filesystem configuration metadata attributes:
> > > > 
> > > >       - Filesystem parameter type descriptions.
> > > >       - Name -> parameter mappings.
> > > >       - Simple enumeration name -> value mappings.
> > > > 
> > > >  (4) Mount topology:
> > > > 
> > > >       - General information about a mount object.
> > > >       - Mount device name(s).
> > > >       - Children of a mount object and their relative paths.
> > > > 
> > > >  (5) Information about what the fsinfo() syscall itself supports,
> > > > including
> > > >      the number of attibutes supported and the number of capability bits
> > > >      supported.
> > > 
> > > Phew, this patchset is a lot. It's good of course but can we please cut
> > > some of the more advanced features such as querying by mount id,
> > > submounts etc. pp. for now?
> > 
> > Did you mean the "vfs: Allow fsinfo() to look up a mount object by ID"
> > patch?
> > 
> > We would need to be very careful what was dropped.
> 
> Not dropped as in never implement but rather defer it by one merge
> window to give us a) more time to review and settle the interface while
> b) not stalling the overall patch.

Sure, and I'm not saying something like what you recommend shouldn't
be done.

I'm working on user space mount table improvements that I want to
get done ahead of the merge.

Well, I would be but there's still mount-api conversions that need
to be done so that fsinfo() patches don't end up with endless merge
conflicts. The fsinfo() super block method will result in changes
in the same area as the mount-api changes.

The mount-api changes are proving to be a bit of a challenge.

Anyway, the plan is to use the mount table handling improvements to
try and
locate bugs and missing or not quite right functionality.

> 
> > For example, I've found that the patch above is pretty much essential
> > for fsinfo() to be useful from user space.
> 
> Yeah, but that interface is not clearly defined yet as can be seen from
> the commit message and that's what's bothering me most.

Yeah, but updating my cloned branch hasn't been difficult.

There's a certain amount of functionality that I'd like to see
retained for when I get back to the user space development.

Using the notifications changes are something I'm not likely
to get to for quite some time so breaking those out into a
separate branch (like they were not so long ago) would be
more sensible IMHO.

There may be some other bits that David can identify too.

Ian