mbox series

[v4,0/3] fs: introduce getfsxattrat and setfsxattrat syscalls

Message ID 20250321-xattrat-syscall-v4-0-3e82e6fb3264@kernel.org (mailing list archive)
Headers show
Series fs: introduce getfsxattrat and setfsxattrat syscalls | expand

Message

Andrey Albershteyn March 21, 2025, 7:48 p.m. UTC
This patchset introduced two new syscalls getfsxattrat() and
setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
except they use *at() semantics. Therefore, there's no need to open the
file to get an fd.

These syscalls allow userspace to set filesystem inode attributes on
special files. One of the usage examples is XFS quota projects.

XFS has project quotas which could be attached to a directory. All
new inodes in these directories inherit project ID set on parent
directory.

The project is created from userspace by opening and calling
FS_IOC_FSSETXATTR on each inode. This is not possible for special
files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
with empty project ID. Those inodes then are not shown in the quota
accounting but still exist in the directory. This is not critical but in
the case when special files are created in the directory with already
existing project quota, these new inodes inherit extended attributes.
This creates a mix of special files with and without attributes.
Moreover, special files with attributes don't have a possibility to
become clear or change the attributes. This, in turn, prevents userspace
from re-creating quota project on these existing files.

Christian, if this get in some mergeable state, please don't merge it
yet. Amir suggested these syscalls better to use updated struct fsxattr
with masking from Pali Rohár patchset, so, let's see how it goes.

NAME

	getfsxattrat/setfsxattrat - get/set filesystem inode attributes

SYNOPSIS

	#include <sys/syscall.h>    /* Definition of SYS_* constants */
	#include <unistd.h>

	long syscall(SYS_getfsxattrat, int dirfd, const char *pathname,
		struct fsxattr *fsx, size_t size,
		unsigned int at_flags);
	long syscall(SYS_setfsxattrat, int dirfd, const char *pathname,
		struct fsxattr *fsx, size_t size,
		unsigned int at_flags);

	Note: glibc doesn't provide for getfsxattrat()/setfsxattrat(),
	use syscall(2) instead.

DESCRIPTION

	The syscalls take fd and path to the child together with struct
	fsxattr. If path is absolute, fd is not used. If path is empty,
	inode under fd is used to get/set attributes on.

	This is an alternative to FS_IOC_FSGETXATTR/FS_IOC_FSSETXATTR
	ioctl with a difference that file don't need to be open as we
	can reference it with a path instead of fd. By having this we
	can manipulated filesystem inode attributes not only on regular
	files but also on special ones. This is not possible with
	FS_IOC_FSSETXATTR ioctl as with special files we can not call
	ioctl() directly on the filesystem inode using file descriptor.

RETURN VALUE

	On success, 0 is returned.  On error, -1 is returned, and errno
	is set to indicate the error.

ERRORS

	EINVAL		Invalid at_flag specified (only
			AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH is
			supported).

	EINVAL		Size was smaller than any known version of
			struct fsxattr.

	EINVAL		Invalid combination of parameters provided in
			fsxattr for this type of file.

	E2BIG		Size of input argument **struct fsxattr** is too
			big.

	EBADF		Invalid file descriptor was provided.

	EPERM		No permission to change this file.

	EOPNOTSUPP	Filesystem does not support setting attributes
			on this type of inode

HISTORY

	Added in Linux 6.14.

EXAMPLE

Create directory and file "mkdir ./dir && touch ./dir/foo" and then
execute the following program:

	#include <fcntl.h>
	#include <errno.h>
	#include <string.h>
	#include <linux/fs.h>
	#include <stdio.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	int
	main(int argc, char **argv) {
		int dfd;
		int error;
		struct fsxattr fsx;

		dfd = open("./dir", O_RDONLY);
		if (dfd == -1) {
			printf("can not open ./dir");
			return dfd;
		}

		error = syscall(467, dfd, "./foo", &fsx, 0);
		if (error) {
			printf("can not call 467: %s", strerror(errno));
			return error;
		}

		printf("dir/foo flags: %d\n", fsx.fsx_xflags);

		fsx.fsx_xflags |= FS_XFLAG_NODUMP;
		error = syscall(468, dfd, "./foo", &fsx, 0);
		if (error) {
			printf("can not call 468: %s", strerror(errno));
			return error;
		}

		printf("dir/foo flags: %d\n", fsx.fsx_xflags);

		return error;
	}

SEE ALSO

	ioctl(2), ioctl_iflags(2), ioctl_xfs_fsgetxattr(2)

---
Changes in v4:
- Use getname_maybe_null() for correct handling of dfd + path semantic
- Remove restriction for special files on which flags are allowed
- Utilize copy_struct_from_user() for better future compatibility
- Add draft man page to cover letter
- Convert -ENOIOCTLCMD to -EOPNOSUPP as more appropriate for syscall
- Add missing __user to header declaration of syscalls
- Link to v3: https://lore.kernel.org/r/20250211-xattrat-syscall-v3-1-a07d15f898b2@kernel.org

Changes in v3:
- Remove unnecessary "dfd is dir" check as it checked in user_path_at()
- Remove unnecessary "same filesystem" check
- Use CLASS() instead of directly calling fdget/fdput
- Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org

v1:
https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/

Previous discussion:
https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/

---
Andrey Albershteyn (3):
      lsm: introduce new hooks for setting/getting inode fsxattr
      fs: split fileattr/fsxattr converters into helpers
      fs: introduce getfsxattrat and setfsxattrat syscalls

 arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
 arch/arm/tools/syscall.tbl                  |   2 +
 arch/arm64/tools/syscall_32.tbl             |   2 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
 arch/s390/kernel/syscalls/syscall.tbl       |   2 +
 arch/sh/kernel/syscalls/syscall.tbl         |   2 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
 fs/inode.c                                  | 130 ++++++++++++++++++++++++++++
 fs/ioctl.c                                  |  39 ++++++---
 include/linux/fileattr.h                    |   2 +
 include/linux/lsm_hook_defs.h               |   4 +
 include/linux/security.h                    |  16 ++++
 include/linux/syscalls.h                    |   6 ++
 include/uapi/asm-generic/unistd.h           |   8 +-
 include/uapi/linux/fs.h                     |   3 +
 security/security.c                         |  32 +++++++
 25 files changed, 259 insertions(+), 13 deletions(-)
---
base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
change-id: 20250114-xattrat-syscall-6a1136d2db59

Best regards,

Comments

Amir Goldstein March 23, 2025, 8:45 a.m. UTC | #1
On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
>
> This patchset introduced two new syscalls getfsxattrat() and
> setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> except they use *at() semantics. Therefore, there's no need to open the
> file to get an fd.
>
> These syscalls allow userspace to set filesystem inode attributes on
> special files. One of the usage examples is XFS quota projects.
>
> XFS has project quotas which could be attached to a directory. All
> new inodes in these directories inherit project ID set on parent
> directory.
>
> The project is created from userspace by opening and calling
> FS_IOC_FSSETXATTR on each inode. This is not possible for special
> files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> with empty project ID. Those inodes then are not shown in the quota
> accounting but still exist in the directory. This is not critical but in
> the case when special files are created in the directory with already
> existing project quota, these new inodes inherit extended attributes.
> This creates a mix of special files with and without attributes.
> Moreover, special files with attributes don't have a possibility to
> become clear or change the attributes. This, in turn, prevents userspace
> from re-creating quota project on these existing files.
>
> Christian, if this get in some mergeable state, please don't merge it
> yet. Amir suggested these syscalls better to use updated struct fsxattr
> with masking from Pali Rohár patchset, so, let's see how it goes.

Andrey,

To be honest I don't think it would be fair to delay your syscalls more
than needed.

If Pali can follow through and post patches on top of your syscalls for
next merge window that would be great, but otherwise, I think the
minimum requirement is that the syscalls return EINVAL if fsx_pad
is not zero. we can take it from there later.

We can always also increase the size of struct fsxattr, but let's first
use the padding space already available.

Thanks,
Amir.

>
> NAME
>
>         getfsxattrat/setfsxattrat - get/set filesystem inode attributes
>
> SYNOPSIS
>
>         #include <sys/syscall.h>    /* Definition of SYS_* constants */
>         #include <unistd.h>
>
>         long syscall(SYS_getfsxattrat, int dirfd, const char *pathname,
>                 struct fsxattr *fsx, size_t size,
>                 unsigned int at_flags);
>         long syscall(SYS_setfsxattrat, int dirfd, const char *pathname,
>                 struct fsxattr *fsx, size_t size,
>                 unsigned int at_flags);
>
>         Note: glibc doesn't provide for getfsxattrat()/setfsxattrat(),
>         use syscall(2) instead.
>
> DESCRIPTION
>
>         The syscalls take fd and path to the child together with struct
>         fsxattr. If path is absolute, fd is not used. If path is empty,
>         inode under fd is used to get/set attributes on.
>
>         This is an alternative to FS_IOC_FSGETXATTR/FS_IOC_FSSETXATTR
>         ioctl with a difference that file don't need to be open as we
>         can reference it with a path instead of fd. By having this we
>         can manipulated filesystem inode attributes not only on regular
>         files but also on special ones. This is not possible with
>         FS_IOC_FSSETXATTR ioctl as with special files we can not call
>         ioctl() directly on the filesystem inode using file descriptor.
>
> RETURN VALUE
>
>         On success, 0 is returned.  On error, -1 is returned, and errno
>         is set to indicate the error.
>
> ERRORS
>
>         EINVAL          Invalid at_flag specified (only
>                         AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH is
>                         supported).
>
>         EINVAL          Size was smaller than any known version of
>                         struct fsxattr.
>
>         EINVAL          Invalid combination of parameters provided in
>                         fsxattr for this type of file.
>
>         E2BIG           Size of input argument **struct fsxattr** is too
>                         big.
>
>         EBADF           Invalid file descriptor was provided.
>
>         EPERM           No permission to change this file.
>
>         EOPNOTSUPP      Filesystem does not support setting attributes
>                         on this type of inode
>
> HISTORY
>
>         Added in Linux 6.14.
>
> EXAMPLE
>
> Create directory and file "mkdir ./dir && touch ./dir/foo" and then
> execute the following program:
>
>         #include <fcntl.h>
>         #include <errno.h>
>         #include <string.h>
>         #include <linux/fs.h>
>         #include <stdio.h>
>         #include <sys/syscall.h>
>         #include <unistd.h>
>
>         int
>         main(int argc, char **argv) {
>                 int dfd;
>                 int error;
>                 struct fsxattr fsx;
>
>                 dfd = open("./dir", O_RDONLY);
>                 if (dfd == -1) {
>                         printf("can not open ./dir");
>                         return dfd;
>                 }
>
>                 error = syscall(467, dfd, "./foo", &fsx, 0);
>                 if (error) {
>                         printf("can not call 467: %s", strerror(errno));
>                         return error;
>                 }
>
>                 printf("dir/foo flags: %d\n", fsx.fsx_xflags);
>
>                 fsx.fsx_xflags |= FS_XFLAG_NODUMP;
>                 error = syscall(468, dfd, "./foo", &fsx, 0);
>                 if (error) {
>                         printf("can not call 468: %s", strerror(errno));
>                         return error;
>                 }
>
>                 printf("dir/foo flags: %d\n", fsx.fsx_xflags);
>
>                 return error;
>         }
>
> SEE ALSO
>
>         ioctl(2), ioctl_iflags(2), ioctl_xfs_fsgetxattr(2)
>
> ---
> Changes in v4:
> - Use getname_maybe_null() for correct handling of dfd + path semantic
> - Remove restriction for special files on which flags are allowed
> - Utilize copy_struct_from_user() for better future compatibility
> - Add draft man page to cover letter
> - Convert -ENOIOCTLCMD to -EOPNOSUPP as more appropriate for syscall
> - Add missing __user to header declaration of syscalls
> - Link to v3: https://lore.kernel.org/r/20250211-xattrat-syscall-v3-1-a07d15f898b2@kernel.org
>
> Changes in v3:
> - Remove unnecessary "dfd is dir" check as it checked in user_path_at()
> - Remove unnecessary "same filesystem" check
> - Use CLASS() instead of directly calling fdget/fdput
> - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org
>
> v1:
> https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/
>
> Previous discussion:
> https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/
>
> ---
> Andrey Albershteyn (3):
>       lsm: introduce new hooks for setting/getting inode fsxattr
>       fs: split fileattr/fsxattr converters into helpers
>       fs: introduce getfsxattrat and setfsxattrat syscalls
>
>  arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
>  arch/arm/tools/syscall.tbl                  |   2 +
>  arch/arm64/tools/syscall_32.tbl             |   2 +
>  arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
>  arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
>  arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
>  arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
>  arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
>  arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
>  arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
>  arch/s390/kernel/syscalls/syscall.tbl       |   2 +
>  arch/sh/kernel/syscalls/syscall.tbl         |   2 +
>  arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
>  arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
>  arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
>  arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
>  fs/inode.c                                  | 130 ++++++++++++++++++++++++++++
>  fs/ioctl.c                                  |  39 ++++++---
>  include/linux/fileattr.h                    |   2 +
>  include/linux/lsm_hook_defs.h               |   4 +
>  include/linux/security.h                    |  16 ++++
>  include/linux/syscalls.h                    |   6 ++
>  include/uapi/asm-generic/unistd.h           |   8 +-
>  include/uapi/linux/fs.h                     |   3 +
>  security/security.c                         |  32 +++++++
>  25 files changed, 259 insertions(+), 13 deletions(-)
> ---
> base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
> change-id: 20250114-xattrat-syscall-6a1136d2db59
>
> Best regards,
> --
> Andrey Albershteyn <aalbersh@kernel.org>
>
>
Pali Rohár March 23, 2025, 10:32 a.m. UTC | #2
On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> >
> > This patchset introduced two new syscalls getfsxattrat() and
> > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > except they use *at() semantics. Therefore, there's no need to open the
> > file to get an fd.
> >
> > These syscalls allow userspace to set filesystem inode attributes on
> > special files. One of the usage examples is XFS quota projects.
> >
> > XFS has project quotas which could be attached to a directory. All
> > new inodes in these directories inherit project ID set on parent
> > directory.
> >
> > The project is created from userspace by opening and calling
> > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > with empty project ID. Those inodes then are not shown in the quota
> > accounting but still exist in the directory. This is not critical but in
> > the case when special files are created in the directory with already
> > existing project quota, these new inodes inherit extended attributes.
> > This creates a mix of special files with and without attributes.
> > Moreover, special files with attributes don't have a possibility to
> > become clear or change the attributes. This, in turn, prevents userspace
> > from re-creating quota project on these existing files.
> >
> > Christian, if this get in some mergeable state, please don't merge it
> > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > with masking from Pali Rohár patchset, so, let's see how it goes.
> 
> Andrey,
> 
> To be honest I don't think it would be fair to delay your syscalls more
> than needed.

I agree.

> If Pali can follow through and post patches on top of your syscalls for
> next merge window that would be great, but otherwise, I think the
> minimum requirement is that the syscalls return EINVAL if fsx_pad
> is not zero. we can take it from there later.

IMHO SYS_getfsxattrat is fine in this form.

For SYS_setfsxattrat I think there are needed some modifications
otherwise we would have problem again with backward compatibility as
is with ioctl if the syscall wants to be extended in future.

I would suggest for following modifications for SYS_setfsxattrat:

- return EINVAL if fsx_xflags contains some reserved or unsupported flag

- add some flag to completely ignore fsx_extsize, fsx_projid, and
  fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
  change fsx_xflags, and so could be used without the preceding
  SYS_getfsxattrat call.

What do you think about it?

Use cases for future without breaking backward compatibility:
- atomically / race-free do set or clear just one flag in fsx_xflags
  (so avoid getfsxattrat - modify buffer - setfsxattrat roundtrip)
- use fsx_pad[] for some new purposes 

> We can always also increase the size of struct fsxattr, but let's first
> use the padding space already available.
> 
> Thanks,
> Amir.
> 
> >
> > NAME
> >
> >         getfsxattrat/setfsxattrat - get/set filesystem inode attributes
> >
> > SYNOPSIS
> >
> >         #include <sys/syscall.h>    /* Definition of SYS_* constants */
> >         #include <unistd.h>
> >
> >         long syscall(SYS_getfsxattrat, int dirfd, const char *pathname,
> >                 struct fsxattr *fsx, size_t size,
> >                 unsigned int at_flags);
> >         long syscall(SYS_setfsxattrat, int dirfd, const char *pathname,
> >                 struct fsxattr *fsx, size_t size,
> >                 unsigned int at_flags);
> >
> >         Note: glibc doesn't provide for getfsxattrat()/setfsxattrat(),
> >         use syscall(2) instead.
> >
> > DESCRIPTION
> >
> >         The syscalls take fd and path to the child together with struct
> >         fsxattr. If path is absolute, fd is not used. If path is empty,
> >         inode under fd is used to get/set attributes on.
> >
> >         This is an alternative to FS_IOC_FSGETXATTR/FS_IOC_FSSETXATTR
> >         ioctl with a difference that file don't need to be open as we
> >         can reference it with a path instead of fd. By having this we
> >         can manipulated filesystem inode attributes not only on regular
> >         files but also on special ones. This is not possible with
> >         FS_IOC_FSSETXATTR ioctl as with special files we can not call
> >         ioctl() directly on the filesystem inode using file descriptor.
> >
> > RETURN VALUE
> >
> >         On success, 0 is returned.  On error, -1 is returned, and errno
> >         is set to indicate the error.
> >
> > ERRORS
> >
> >         EINVAL          Invalid at_flag specified (only
> >                         AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH is
> >                         supported).
> >
> >         EINVAL          Size was smaller than any known version of
> >                         struct fsxattr.
> >
> >         EINVAL          Invalid combination of parameters provided in
> >                         fsxattr for this type of file.
> >
> >         E2BIG           Size of input argument **struct fsxattr** is too
> >                         big.
> >
> >         EBADF           Invalid file descriptor was provided.
> >
> >         EPERM           No permission to change this file.
> >
> >         EOPNOTSUPP      Filesystem does not support setting attributes
> >                         on this type of inode
> >
> > HISTORY
> >
> >         Added in Linux 6.14.
> >
> > EXAMPLE
> >
> > Create directory and file "mkdir ./dir && touch ./dir/foo" and then
> > execute the following program:
> >
> >         #include <fcntl.h>
> >         #include <errno.h>
> >         #include <string.h>
> >         #include <linux/fs.h>
> >         #include <stdio.h>
> >         #include <sys/syscall.h>
> >         #include <unistd.h>
> >
> >         int
> >         main(int argc, char **argv) {
> >                 int dfd;
> >                 int error;
> >                 struct fsxattr fsx;
> >
> >                 dfd = open("./dir", O_RDONLY);
> >                 if (dfd == -1) {
> >                         printf("can not open ./dir");
> >                         return dfd;
> >                 }
> >
> >                 error = syscall(467, dfd, "./foo", &fsx, 0);
> >                 if (error) {
> >                         printf("can not call 467: %s", strerror(errno));
> >                         return error;
> >                 }
> >
> >                 printf("dir/foo flags: %d\n", fsx.fsx_xflags);
> >
> >                 fsx.fsx_xflags |= FS_XFLAG_NODUMP;
> >                 error = syscall(468, dfd, "./foo", &fsx, 0);
> >                 if (error) {
> >                         printf("can not call 468: %s", strerror(errno));
> >                         return error;
> >                 }
> >
> >                 printf("dir/foo flags: %d\n", fsx.fsx_xflags);
> >
> >                 return error;
> >         }
> >
> > SEE ALSO
> >
> >         ioctl(2), ioctl_iflags(2), ioctl_xfs_fsgetxattr(2)
> >
> > ---
> > Changes in v4:
> > - Use getname_maybe_null() for correct handling of dfd + path semantic
> > - Remove restriction for special files on which flags are allowed
> > - Utilize copy_struct_from_user() for better future compatibility
> > - Add draft man page to cover letter
> > - Convert -ENOIOCTLCMD to -EOPNOSUPP as more appropriate for syscall
> > - Add missing __user to header declaration of syscalls
> > - Link to v3: https://lore.kernel.org/r/20250211-xattrat-syscall-v3-1-a07d15f898b2@kernel.org
> >
> > Changes in v3:
> > - Remove unnecessary "dfd is dir" check as it checked in user_path_at()
> > - Remove unnecessary "same filesystem" check
> > - Use CLASS() instead of directly calling fdget/fdput
> > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org
> >
> > v1:
> > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/
> >
> > Previous discussion:
> > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/
> >
> > ---
> > Andrey Albershteyn (3):
> >       lsm: introduce new hooks for setting/getting inode fsxattr
> >       fs: split fileattr/fsxattr converters into helpers
> >       fs: introduce getfsxattrat and setfsxattrat syscalls
> >
> >  arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
> >  arch/arm/tools/syscall.tbl                  |   2 +
> >  arch/arm64/tools/syscall_32.tbl             |   2 +
> >  arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
> >  arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
> >  arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
> >  arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
> >  arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
> >  arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
> >  arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
> >  arch/s390/kernel/syscalls/syscall.tbl       |   2 +
> >  arch/sh/kernel/syscalls/syscall.tbl         |   2 +
> >  arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
> >  arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
> >  arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
> >  arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
> >  fs/inode.c                                  | 130 ++++++++++++++++++++++++++++
> >  fs/ioctl.c                                  |  39 ++++++---
> >  include/linux/fileattr.h                    |   2 +
> >  include/linux/lsm_hook_defs.h               |   4 +
> >  include/linux/security.h                    |  16 ++++
> >  include/linux/syscalls.h                    |   6 ++
> >  include/uapi/asm-generic/unistd.h           |   8 +-
> >  include/uapi/linux/fs.h                     |   3 +
> >  security/security.c                         |  32 +++++++
> >  25 files changed, 259 insertions(+), 13 deletions(-)
> > ---
> > base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
> > change-id: 20250114-xattrat-syscall-6a1136d2db59
> >
> > Best regards,
> > --
> > Andrey Albershteyn <aalbersh@kernel.org>
> >
> >
Amir Goldstein March 27, 2025, 11:47 a.m. UTC | #3
On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
>
> On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > >
> > > This patchset introduced two new syscalls getfsxattrat() and
> > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > except they use *at() semantics. Therefore, there's no need to open the
> > > file to get an fd.
> > >
> > > These syscalls allow userspace to set filesystem inode attributes on
> > > special files. One of the usage examples is XFS quota projects.
> > >
> > > XFS has project quotas which could be attached to a directory. All
> > > new inodes in these directories inherit project ID set on parent
> > > directory.
> > >
> > > The project is created from userspace by opening and calling
> > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > with empty project ID. Those inodes then are not shown in the quota
> > > accounting but still exist in the directory. This is not critical but in
> > > the case when special files are created in the directory with already
> > > existing project quota, these new inodes inherit extended attributes.
> > > This creates a mix of special files with and without attributes.
> > > Moreover, special files with attributes don't have a possibility to
> > > become clear or change the attributes. This, in turn, prevents userspace
> > > from re-creating quota project on these existing files.
> > >
> > > Christian, if this get in some mergeable state, please don't merge it
> > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > with masking from Pali Rohár patchset, so, let's see how it goes.
> >
> > Andrey,
> >
> > To be honest I don't think it would be fair to delay your syscalls more
> > than needed.
>
> I agree.
>
> > If Pali can follow through and post patches on top of your syscalls for
> > next merge window that would be great, but otherwise, I think the
> > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > is not zero. we can take it from there later.
>
> IMHO SYS_getfsxattrat is fine in this form.
>
> For SYS_setfsxattrat I think there are needed some modifications
> otherwise we would have problem again with backward compatibility as
> is with ioctl if the syscall wants to be extended in future.
>
> I would suggest for following modifications for SYS_setfsxattrat:
>
> - return EINVAL if fsx_xflags contains some reserved or unsupported flag
>
> - add some flag to completely ignore fsx_extsize, fsx_projid, and
>   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
>   change fsx_xflags, and so could be used without the preceding
>   SYS_getfsxattrat call.
>
> What do you think about it?

I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.

You can use this later to extend for the semantics of flags/fields mask
and we can have a long discussion later on what this semantics should be.

Right?

Amir.
Pali Rohár March 27, 2025, 7:26 p.m. UTC | #4
On Thursday 27 March 2025 12:47:02 Amir Goldstein wrote:
> On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
> >
> > On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > >
> > > > This patchset introduced two new syscalls getfsxattrat() and
> > > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > > except they use *at() semantics. Therefore, there's no need to open the
> > > > file to get an fd.
> > > >
> > > > These syscalls allow userspace to set filesystem inode attributes on
> > > > special files. One of the usage examples is XFS quota projects.
> > > >
> > > > XFS has project quotas which could be attached to a directory. All
> > > > new inodes in these directories inherit project ID set on parent
> > > > directory.
> > > >
> > > > The project is created from userspace by opening and calling
> > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > > with empty project ID. Those inodes then are not shown in the quota
> > > > accounting but still exist in the directory. This is not critical but in
> > > > the case when special files are created in the directory with already
> > > > existing project quota, these new inodes inherit extended attributes.
> > > > This creates a mix of special files with and without attributes.
> > > > Moreover, special files with attributes don't have a possibility to
> > > > become clear or change the attributes. This, in turn, prevents userspace
> > > > from re-creating quota project on these existing files.
> > > >
> > > > Christian, if this get in some mergeable state, please don't merge it
> > > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > > with masking from Pali Rohár patchset, so, let's see how it goes.
> > >
> > > Andrey,
> > >
> > > To be honest I don't think it would be fair to delay your syscalls more
> > > than needed.
> >
> > I agree.
> >
> > > If Pali can follow through and post patches on top of your syscalls for
> > > next merge window that would be great, but otherwise, I think the
> > > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > > is not zero. we can take it from there later.
> >
> > IMHO SYS_getfsxattrat is fine in this form.
> >
> > For SYS_setfsxattrat I think there are needed some modifications
> > otherwise we would have problem again with backward compatibility as
> > is with ioctl if the syscall wants to be extended in future.
> >
> > I would suggest for following modifications for SYS_setfsxattrat:
> >
> > - return EINVAL if fsx_xflags contains some reserved or unsupported flag
> >
> > - add some flag to completely ignore fsx_extsize, fsx_projid, and
> >   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
> >   change fsx_xflags, and so could be used without the preceding
> >   SYS_getfsxattrat call.
> >
> > What do you think about it?
> 
> I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.
> 
> You can use this later to extend for the semantics of flags/fields mask
> and we can have a long discussion later on what this semantics should be.
> 
> Right?
> 
> Amir.

It is really enough? All new extensions later would have to be added
into fsx_pad fields, and currently unused bits in fsx_xflags would be
unusable for extensions.
Amir Goldstein March 27, 2025, 8:57 p.m. UTC | #5
On Thu, Mar 27, 2025 at 8:26 PM Pali Rohár <pali@kernel.org> wrote:
>
> On Thursday 27 March 2025 12:47:02 Amir Goldstein wrote:
> > On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
> > >
> > > On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > > > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > >
> > > > > This patchset introduced two new syscalls getfsxattrat() and
> > > > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > > > except they use *at() semantics. Therefore, there's no need to open the
> > > > > file to get an fd.
> > > > >
> > > > > These syscalls allow userspace to set filesystem inode attributes on
> > > > > special files. One of the usage examples is XFS quota projects.
> > > > >
> > > > > XFS has project quotas which could be attached to a directory. All
> > > > > new inodes in these directories inherit project ID set on parent
> > > > > directory.
> > > > >
> > > > > The project is created from userspace by opening and calling
> > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > > > with empty project ID. Those inodes then are not shown in the quota
> > > > > accounting but still exist in the directory. This is not critical but in
> > > > > the case when special files are created in the directory with already
> > > > > existing project quota, these new inodes inherit extended attributes.
> > > > > This creates a mix of special files with and without attributes.
> > > > > Moreover, special files with attributes don't have a possibility to
> > > > > become clear or change the attributes. This, in turn, prevents userspace
> > > > > from re-creating quota project on these existing files.
> > > > >
> > > > > Christian, if this get in some mergeable state, please don't merge it
> > > > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > > > with masking from Pali Rohár patchset, so, let's see how it goes.
> > > >
> > > > Andrey,
> > > >
> > > > To be honest I don't think it would be fair to delay your syscalls more
> > > > than needed.
> > >
> > > I agree.
> > >
> > > > If Pali can follow through and post patches on top of your syscalls for
> > > > next merge window that would be great, but otherwise, I think the
> > > > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > > > is not zero. we can take it from there later.
> > >
> > > IMHO SYS_getfsxattrat is fine in this form.
> > >
> > > For SYS_setfsxattrat I think there are needed some modifications
> > > otherwise we would have problem again with backward compatibility as
> > > is with ioctl if the syscall wants to be extended in future.
> > >
> > > I would suggest for following modifications for SYS_setfsxattrat:
> > >
> > > - return EINVAL if fsx_xflags contains some reserved or unsupported flag
> > >
> > > - add some flag to completely ignore fsx_extsize, fsx_projid, and
> > >   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
> > >   change fsx_xflags, and so could be used without the preceding
> > >   SYS_getfsxattrat call.
> > >
> > > What do you think about it?
> >
> > I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.
> >
> > You can use this later to extend for the semantics of flags/fields mask
> > and we can have a long discussion later on what this semantics should be.
> >
> > Right?
> >
> > Amir.
>
> It is really enough?

I don't know. Let's see...

> All new extensions later would have to be added
> into fsx_pad fields, and currently unused bits in fsx_xflags would be
> unusable for extensions.

I am working under the assumption that the first extension would be
to support fsx_xflags_mask and from there, you could add filesystem
flags support checks and then new flags. Am I wrong?

Obviously, fsx_xflags_mask would be taken from fsx_pad space.
After that extension is implemented, calling SYS_setfsxattrat() with
a zero fsx_xflags_mask would be silly for programs that do not do
the legacy get+set.

So when we introduce  fsx_xflags_mask, we could say that a value
of zero means that the mask is not being checked at all and unknown
flags in set syscall are ignored (a.k.a legacy ioctl behavior).

Programs that actually want to try and set without get will have to set
a non zero fsx_xflags_mask to do something useful.

I don't think this is great.
I would rather that the first version of syscalls will require the mask
and will always enforce filesystems supported flags.

If you can get those patches (on top of current series) posted and
reviewed in time for the next merge window, including consensus
on the actual semantics, that would be the best IMO.

But I am just preparing a plan B in case you do not have time to
work on the patches or if consensus on the API extensions is not
reached on time.

I think that for plan B, the minimum is to verify zero pad field and
that is something that this syscall has to do anyway, because this
is the way that backward compact APIs work.

If you want the syscall to always return -EINVAL for setting xflags
that are currently undefined I agree that would be nice as well.

Thanks,
Amir.
Pali Rohár March 27, 2025, 9:13 p.m. UTC | #6
On Thursday 27 March 2025 21:57:34 Amir Goldstein wrote:
> On Thu, Mar 27, 2025 at 8:26 PM Pali Rohár <pali@kernel.org> wrote:
> >
> > On Thursday 27 March 2025 12:47:02 Amir Goldstein wrote:
> > > On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
> > > >
> > > > On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > > > > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > > >
> > > > > > This patchset introduced two new syscalls getfsxattrat() and
> > > > > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > > > > except they use *at() semantics. Therefore, there's no need to open the
> > > > > > file to get an fd.
> > > > > >
> > > > > > These syscalls allow userspace to set filesystem inode attributes on
> > > > > > special files. One of the usage examples is XFS quota projects.
> > > > > >
> > > > > > XFS has project quotas which could be attached to a directory. All
> > > > > > new inodes in these directories inherit project ID set on parent
> > > > > > directory.
> > > > > >
> > > > > > The project is created from userspace by opening and calling
> > > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > > > > with empty project ID. Those inodes then are not shown in the quota
> > > > > > accounting but still exist in the directory. This is not critical but in
> > > > > > the case when special files are created in the directory with already
> > > > > > existing project quota, these new inodes inherit extended attributes.
> > > > > > This creates a mix of special files with and without attributes.
> > > > > > Moreover, special files with attributes don't have a possibility to
> > > > > > become clear or change the attributes. This, in turn, prevents userspace
> > > > > > from re-creating quota project on these existing files.
> > > > > >
> > > > > > Christian, if this get in some mergeable state, please don't merge it
> > > > > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > > > > with masking from Pali Rohár patchset, so, let's see how it goes.
> > > > >
> > > > > Andrey,
> > > > >
> > > > > To be honest I don't think it would be fair to delay your syscalls more
> > > > > than needed.
> > > >
> > > > I agree.
> > > >
> > > > > If Pali can follow through and post patches on top of your syscalls for
> > > > > next merge window that would be great, but otherwise, I think the
> > > > > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > > > > is not zero. we can take it from there later.
> > > >
> > > > IMHO SYS_getfsxattrat is fine in this form.
> > > >
> > > > For SYS_setfsxattrat I think there are needed some modifications
> > > > otherwise we would have problem again with backward compatibility as
> > > > is with ioctl if the syscall wants to be extended in future.
> > > >
> > > > I would suggest for following modifications for SYS_setfsxattrat:
> > > >
> > > > - return EINVAL if fsx_xflags contains some reserved or unsupported flag
> > > >
> > > > - add some flag to completely ignore fsx_extsize, fsx_projid, and
> > > >   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
> > > >   change fsx_xflags, and so could be used without the preceding
> > > >   SYS_getfsxattrat call.
> > > >
> > > > What do you think about it?
> > >
> > > I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.
> > >
> > > You can use this later to extend for the semantics of flags/fields mask
> > > and we can have a long discussion later on what this semantics should be.
> > >
> > > Right?
> > >
> > > Amir.
> >
> > It is really enough?
> 
> I don't know. Let's see...
> 
> > All new extensions later would have to be added
> > into fsx_pad fields, and currently unused bits in fsx_xflags would be
> > unusable for extensions.
> 
> I am working under the assumption that the first extension would be
> to support fsx_xflags_mask and from there, you could add filesystem
> flags support checks and then new flags. Am I wrong?
> 
> Obviously, fsx_xflags_mask would be taken from fsx_pad space.
> After that extension is implemented, calling SYS_setfsxattrat() with
> a zero fsx_xflags_mask would be silly for programs that do not do
> the legacy get+set.
> 
> So when we introduce  fsx_xflags_mask, we could say that a value
> of zero means that the mask is not being checked at all and unknown
> flags in set syscall are ignored (a.k.a legacy ioctl behavior).
> 
> Programs that actually want to try and set without get will have to set
> a non zero fsx_xflags_mask to do something useful.

Here we need to also solve the problem that without GET call we do not
have valid values for fsx_extsize, fsx_projid, and fsx_cowextsize. So
maybe we would need some flag in fsx_pad that fsx_extsize, fsx_projid,
or fsx_cowextsize are ignored/masked.

> I don't think this is great.
> I would rather that the first version of syscalls will require the mask
> and will always enforce filesystems supported flags.

It is not great... But what about this? In a first step (part of this
syscall patch series) would be just a check that fsx_pad is zero.
Non-zero will return -EINVAL.

In next changes would added fsx_filter bit field, which for each
fsx_xflags and also for fsx_extsize, fsx_projid, and fsx_cowextsize
fields would add a new bit flag which would say (when SET) that the
particular thing has to be ignored.

So when fsx_pad is all-zeros then fsx_filter (first field in fsx_pad)
would say that nothing in fsx_xflags, fsx_extsize, fsx_projid, and
fsx_cowextsize is ignored, and hence behave like before.

And when something in fsx_pad/fsx_filter is set then it says which
fields are ignored/filtered-out.

> If you can get those patches (on top of current series) posted and
> reviewed in time for the next merge window, including consensus
> on the actual semantics, that would be the best IMO.

I think that this starting to be more complicated to rebase my patches
in a way that they do not affect IOCTL path but implement it properly
for new syscall path. It does not sounds like a trivial thing which I
would finish in merge window time and having proper review and consensus
on this.

> But I am just preparing a plan B in case you do not have time to
> work on the patches or if consensus on the API extensions is not
> reached on time.
> 
> I think that for plan B, the minimum is to verify zero pad field and
> that is something that this syscall has to do anyway, because this
> is the way that backward compact APIs work.
> 
> If you want the syscall to always return -EINVAL for setting xflags
> that are currently undefined I agree that would be nice as well.
> 
> Thanks,
> Amir.
Amir Goldstein March 28, 2025, 2:09 p.m. UTC | #7
On Thu, Mar 27, 2025 at 10:13 PM Pali Rohár <pali@kernel.org> wrote:
>
> On Thursday 27 March 2025 21:57:34 Amir Goldstein wrote:
> > On Thu, Mar 27, 2025 at 8:26 PM Pali Rohár <pali@kernel.org> wrote:
> > >
> > > On Thursday 27 March 2025 12:47:02 Amir Goldstein wrote:
> > > > On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
> > > > >
> > > > > On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > > > > > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > > > >
> > > > > > > This patchset introduced two new syscalls getfsxattrat() and
> > > > > > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > > > > > except they use *at() semantics. Therefore, there's no need to open the
> > > > > > > file to get an fd.
> > > > > > >
> > > > > > > These syscalls allow userspace to set filesystem inode attributes on
> > > > > > > special files. One of the usage examples is XFS quota projects.
> > > > > > >
> > > > > > > XFS has project quotas which could be attached to a directory. All
> > > > > > > new inodes in these directories inherit project ID set on parent
> > > > > > > directory.
> > > > > > >
> > > > > > > The project is created from userspace by opening and calling
> > > > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > > > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > > > > > with empty project ID. Those inodes then are not shown in the quota
> > > > > > > accounting but still exist in the directory. This is not critical but in
> > > > > > > the case when special files are created in the directory with already
> > > > > > > existing project quota, these new inodes inherit extended attributes.
> > > > > > > This creates a mix of special files with and without attributes.
> > > > > > > Moreover, special files with attributes don't have a possibility to
> > > > > > > become clear or change the attributes. This, in turn, prevents userspace
> > > > > > > from re-creating quota project on these existing files.
> > > > > > >
> > > > > > > Christian, if this get in some mergeable state, please don't merge it
> > > > > > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > > > > > with masking from Pali Rohár patchset, so, let's see how it goes.
> > > > > >
> > > > > > Andrey,
> > > > > >
> > > > > > To be honest I don't think it would be fair to delay your syscalls more
> > > > > > than needed.
> > > > >
> > > > > I agree.
> > > > >
> > > > > > If Pali can follow through and post patches on top of your syscalls for
> > > > > > next merge window that would be great, but otherwise, I think the
> > > > > > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > > > > > is not zero. we can take it from there later.
> > > > >
> > > > > IMHO SYS_getfsxattrat is fine in this form.
> > > > >
> > > > > For SYS_setfsxattrat I think there are needed some modifications
> > > > > otherwise we would have problem again with backward compatibility as
> > > > > is with ioctl if the syscall wants to be extended in future.
> > > > >
> > > > > I would suggest for following modifications for SYS_setfsxattrat:
> > > > >
> > > > > - return EINVAL if fsx_xflags contains some reserved or unsupported flag
> > > > >
> > > > > - add some flag to completely ignore fsx_extsize, fsx_projid, and
> > > > >   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
> > > > >   change fsx_xflags, and so could be used without the preceding
> > > > >   SYS_getfsxattrat call.
> > > > >
> > > > > What do you think about it?
> > > >
> > > > I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.
> > > >
> > > > You can use this later to extend for the semantics of flags/fields mask
> > > > and we can have a long discussion later on what this semantics should be.
> > > >
> > > > Right?
> > > >
> > > > Amir.
> > >
> > > It is really enough?
> >
> > I don't know. Let's see...
> >
> > > All new extensions later would have to be added
> > > into fsx_pad fields, and currently unused bits in fsx_xflags would be
> > > unusable for extensions.
> >
> > I am working under the assumption that the first extension would be
> > to support fsx_xflags_mask and from there, you could add filesystem
> > flags support checks and then new flags. Am I wrong?
> >
> > Obviously, fsx_xflags_mask would be taken from fsx_pad space.
> > After that extension is implemented, calling SYS_setfsxattrat() with
> > a zero fsx_xflags_mask would be silly for programs that do not do
> > the legacy get+set.
> >
> > So when we introduce  fsx_xflags_mask, we could say that a value
> > of zero means that the mask is not being checked at all and unknown
> > flags in set syscall are ignored (a.k.a legacy ioctl behavior).
> >
> > Programs that actually want to try and set without get will have to set
> > a non zero fsx_xflags_mask to do something useful.
>
> Here we need to also solve the problem that without GET call we do not
> have valid values for fsx_extsize, fsx_projid, and fsx_cowextsize. So
> maybe we would need some flag in fsx_pad that fsx_extsize, fsx_projid,
> or fsx_cowextsize are ignored/masked.
>
> > I don't think this is great.
> > I would rather that the first version of syscalls will require the mask
> > and will always enforce filesystems supported flags.
>
> It is not great... But what about this? In a first step (part of this
> syscall patch series) would be just a check that fsx_pad is zero.
> Non-zero will return -EINVAL.
>
> In next changes would added fsx_filter bit field, which for each
> fsx_xflags and also for fsx_extsize, fsx_projid, and fsx_cowextsize
> fields would add a new bit flag which would say (when SET) that the
> particular thing has to be ignored.

1. I don't like the inverse mask. statx already has the stx_mask
    and stx_attributes_mask, so I rather stick to same semantics
    because some of those attributes are exposed via statx as well
2. fsx_*extsize already have a bit that says if that the particular
    attribute is valid or not, so setting a zero fsx_cowextsize with the
    flag FS_XFLAG_COWEXTSIZE has no effect in xfs:

        /*
         * Only set the extent size hint if we've already determined that the
         * extent size hint should be set on the inode. If no extent size flags
         * are set on the inode then unconditionally clear the extent size hint.
         */
        if (ip->i_diflags & (XFS_DIFLAG_EXTSIZE | XFS_DIFLAG_EXTSZINHERIT))
                ip->i_extsize = XFS_B_TO_FSB(mp, fa->fsx_extsize);
        else
                ip->i_extsize = 0;

        if (xfs_has_v3inodes(mp)) {
                if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE)
                        ip->i_cowextsize = XFS_B_TO_FSB(mp, fa->fsx_cowextsize);
                else
                        ip->i_cowextsize = 0;
        }

I think we need to enforce this logic in fileattr_set_prepare()
and I think we need to add a flag FS_XFLAG_PROJID
that will be set in GET when fsx_projid != 0 and similarly
required when setting fsx_projid != 0.

Probably will need to add some backward compat glue for this
flag in GET ioctl to avoid breaking out of tree fs and fuse.

>
> So when fsx_pad is all-zeros then fsx_filter (first field in fsx_pad)
> would say that nothing in fsx_xflags, fsx_extsize, fsx_projid, and
> fsx_cowextsize is ignored, and hence behave like before.
>
> And when something in fsx_pad/fsx_filter is set then it says which
> fields are ignored/filtered-out.
>
> > If you can get those patches (on top of current series) posted and
> > reviewed in time for the next merge window, including consensus
> > on the actual semantics, that would be the best IMO.
>
> I think that this starting to be more complicated to rebase my patches
> in a way that they do not affect IOCTL path but implement it properly
> for new syscall path. It does not sounds like a trivial thing which I
> would finish in merge window time and having proper review and consensus
> on this.
>

Yes, it is better to separate the two efforts.

wrt erroring on unsupported SET flags, all fs other than xfs already
have some variant of fileattr_has_fsx(), so xfs is the only filesystem
that requires special care with the new syscalls.
It's easier to write a patch than it is to explain what I mean, so
I'll try to write a patch.

Thanks,
Amir.