Message ID | 20250211-xattrat-syscall-v3-1-a07d15f898b2@kernel.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v3] fs: introduce getfsxattrat and setfsxattrat syscalls | expand |
On February 11, 2025 9:22:47 AM PST, Andrey Albershteyn <aalbersh@redhat.com> wrote: >From: Andrey Albershteyn <aalbersh@redhat.com> > >Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode >extended attributes/flags. The syscalls take parent directory fd and >path to the child together with struct fsxattr. > >This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference >that file don't need to be open as we can reference it with a path >instead of fd. By having this we can manipulated inode extended >attributes not only on regular files but also on special ones. This >is not possible with FS_IOC_FSSETXATTR ioctl as with special files >we can not call ioctl() directly on the filesystem inode using fd. > >This patch adds two new syscalls which allows userspace to get/set >extended inode attributes on special files by using parent directory >and a path - *at() like syscall. > >Also, as vfs_fileattr_set() is now will be called on special files >too, let's forbid any other attributes except projid and nextents >(symlink can have an extent). > >CC: linux-api@vger.kernel.org >CC: linux-fsdevel@vger.kernel.org >CC: linux-xfs@vger.kernel.org >Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> >--- >v1: >https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > >Previous discussion: >https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > >XFS has project quotas which could be attached to a directory. All >new inodes in these directories inherit project ID set on parent >directory. > >The project is created from userspace by opening and calling >FS_IOC_FSSETXATTR on each inode. This is not possible for special >files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left >with empty project ID. Those inodes then are not shown in the quota >accounting but still exist in the directory. Moreover, in the case >when special files are created in the directory with already >existing project quota, these inode inherit extended attributes. >This than leaves them with these attributes without the possibility >to clear them out. This, in turn, prevents userspace from >re-creating quota project on these existing files. >--- >Changes in v3: >- Remove unnecessary "dfd is dir" check as it checked in user_path_at() >- Remove unnecessary "same filesystem" check >- Use CLASS() instead of directly calling fdget/fdput >- Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org >--- > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > arch/arm/tools/syscall.tbl | 2 + > arch/arm64/tools/syscall_32.tbl | 2 + > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > arch/s390/kernel/syscalls/syscall.tbl | 2 + > arch/sh/kernel/syscalls/syscall.tbl | 2 + > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > fs/inode.c | 75 +++++++++++++++++++++++++++++ > fs/ioctl.c | 16 +++++- > include/linux/fileattr.h | 1 + > include/linux/syscalls.h | 4 ++ > include/uapi/asm-generic/unistd.h | 8 ++- > 21 files changed, 133 insertions(+), 3 deletions(-) > >diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl >index c59d53d6d3f3490f976ca179ddfe02e69265ae4d..4b9e687494c16b60c6fd6ca1dc4d6564706a7e25 100644 >--- a/arch/alpha/kernel/syscalls/syscall.tbl >+++ b/arch/alpha/kernel/syscalls/syscall.tbl >@@ -506,3 +506,5 @@ > 574 common getxattrat sys_getxattrat > 575 common listxattrat sys_listxattrat > 576 common removexattrat sys_removexattrat >+577 common getfsxattrat sys_getfsxattrat >+578 common setfsxattrat sys_setfsxattrat >diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl >index 49eeb2ad8dbd8e074c6240417693f23fb328afa8..66466257f3c2debb3e2299f0b608c6740c98cab2 100644 >--- a/arch/arm/tools/syscall.tbl >+++ b/arch/arm/tools/syscall.tbl >@@ -481,3 +481,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat >diff --git a/arch/arm64/tools/syscall_32.tbl b/arch/arm64/tools/syscall_32.tbl >index 69a829912a05eb8a3e21ed701d1030e31c0148bc..9c516118b154811d8d11d5696f32817430320dbf 100644 >--- a/arch/arm64/tools/syscall_32.tbl >+++ b/arch/arm64/tools/syscall_32.tbl >@@ -478,3 +478,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat >diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl >index f5ed71f1910d09769c845c2d062d99ee0449437c..159476387f394a92ee5e29db89b118c630372db2 100644 >--- a/arch/m68k/kernel/syscalls/syscall.tbl >+++ b/arch/m68k/kernel/syscalls/syscall.tbl >@@ -466,3 +466,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat >diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl >index 680f568b77f2cbefc3eacb2517f276041f229b1e..a6d59ee740b58cacf823702003cf9bad17c0d3b7 100644 >--- a/arch/microblaze/kernel/syscalls/syscall.tbl >+++ b/arch/microblaze/kernel/syscalls/syscall.tbl >@@ -472,3 +472,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat >diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl >index 0b9b7e25b69ad592642f8533bee9ccfe95ce9626..cfe38fcebe1a0279e11751378d3e71c5ec6b6569 100644 >--- a/arch/mips/kernel/syscalls/syscall_n32.tbl >+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl >@@ -405,3 +405,5 @@ > 464 n32 getxattrat sys_getxattrat > 465 n32 listxattrat sys_listxattrat > 466 n32 removexattrat sys_removexattrat >+467 n32 getfsxattrat sys_getfsxattrat >+468 n32 setfsxattrat sys_setfsxattrat >diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl >index c844cd5cda620b2809a397cdd6f4315ab6a1bfe2..29a0c5974d1aa2f01e33edc0252d75fb97abe230 100644 >--- a/arch/mips/kernel/syscalls/syscall_n64.tbl >+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl >@@ -381,3 +381,5 @@ > 464 n64 getxattrat sys_getxattrat > 465 n64 listxattrat sys_listxattrat > 466 n64 removexattrat sys_removexattrat >+467 n64 getfsxattrat sys_getfsxattrat >+468 n64 setfsxattrat sys_setfsxattrat >diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl >index 349b8aad1159f404103bd2057a1e64e9bf309f18..6c00436807c57c492ba957fcd59af1202231cf80 100644 >--- a/arch/mips/kernel/syscalls/syscall_o32.tbl >+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl >@@ -454,3 +454,5 @@ > 464 o32 getxattrat sys_getxattrat > 465 o32 listxattrat sys_listxattrat > 466 o32 removexattrat sys_removexattrat >+467 o32 getfsxattrat sys_getfsxattrat >+468 o32 setfsxattrat sys_setfsxattrat >diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl >index d9fc94c869657fcfbd7aca1d5f5abc9fae2fb9d8..b3578fac43d6b65167787fcc97d2d09f5a9828e7 100644 >--- a/arch/parisc/kernel/syscalls/syscall.tbl >+++ b/arch/parisc/kernel/syscalls/syscall.tbl >@@ -465,3 +465,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat >diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl >index d8b4ab78bef076bd50d49b87dea5060fd8c1686a..808045d82c9465c3bfa96b15947546efe5851e9a 100644 >--- a/arch/powerpc/kernel/syscalls/syscall.tbl >+++ b/arch/powerpc/kernel/syscalls/syscall.tbl >@@ -557,3 +557,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat >diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl >index e9115b4d8b635b846e5c9ad6ce229605323723a5..78dfc2c184d4815baf8a9e61c546c9936d58a47c 100644 >--- a/arch/s390/kernel/syscalls/syscall.tbl >+++ b/arch/s390/kernel/syscalls/syscall.tbl >@@ -469,3 +469,5 @@ > 464 common getxattrat sys_getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat sys_setfsxattrat >diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl >index c8cad33bf250ea110de37bd1407f5a43ec5e38f2..d5a5c8339f0ed25ea07c4aba90351d352033c8a0 100644 >--- a/arch/sh/kernel/syscalls/syscall.tbl >+++ b/arch/sh/kernel/syscalls/syscall.tbl >@@ -470,3 +470,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat >diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl >index 727f99d333b304b3db0711953a3d91ece18a28eb..817dcd8603bcbffc47f3f59aa3b74b16486453d0 100644 >--- a/arch/sparc/kernel/syscalls/syscall.tbl >+++ b/arch/sparc/kernel/syscalls/syscall.tbl >@@ -512,3 +512,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat >diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl >index 4d0fb2fba7e208ae9455459afe11e277321d9f74..b4842c027c5d00c0236b2ba89387c5e2267447bd 100644 >--- a/arch/x86/entry/syscalls/syscall_32.tbl >+++ b/arch/x86/entry/syscalls/syscall_32.tbl >@@ -472,3 +472,5 @@ > 464 i386 getxattrat sys_getxattrat > 465 i386 listxattrat sys_listxattrat > 466 i386 removexattrat sys_removexattrat >+467 i386 getfsxattrat sys_getfsxattrat >+468 i386 setfsxattrat sys_setfsxattrat >diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl >index 5eb708bff1c791debd6cfc5322583b2ae53f6437..b6f0a7236aaee624cf9b484239a1068085a8ffe1 100644 >--- a/arch/x86/entry/syscalls/syscall_64.tbl >+++ b/arch/x86/entry/syscalls/syscall_64.tbl >@@ -390,6 +390,8 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat > > # > # Due to a historical design error, certain syscalls are numbered differently >diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl >index 37effc1b134eea061f2c350c1d68b4436b65a4dd..425d56be337d1de22f205ac503df61ff86224fee 100644 >--- a/arch/xtensa/kernel/syscalls/syscall.tbl >+++ b/arch/xtensa/kernel/syscalls/syscall.tbl >@@ -437,3 +437,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat >+467 common getfsxattrat sys_getfsxattrat >+468 common setfsxattrat sys_setfsxattrat >diff --git a/fs/inode.c b/fs/inode.c >index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 >--- a/fs/inode.c >+++ b/fs/inode.c >@@ -23,6 +23,9 @@ > #include <linux/rw_hint.h> > #include <linux/seq_file.h> > #include <linux/debugfs.h> >+#include <linux/syscalls.h> >+#include <linux/fileattr.h> >+#include <linux/namei.h> > #include <trace/events/writeback.h> > #define CREATE_TRACE_POINTS > #include <trace/events/timestamp.h> >@@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > return mode & ~S_ISGID; > } > EXPORT_SYMBOL(mode_strip_sgid); >+ >+SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, >+ struct fsxattr __user *, fsx, unsigned int, at_flags) >+{ >+ CLASS(fd, dir)(dfd); >+ struct fileattr fa; >+ struct path filepath; >+ int error; >+ unsigned int lookup_flags = 0; >+ >+ if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0) >+ return -EINVAL; >+ >+ if (at_flags & AT_SYMLINK_FOLLOW) >+ lookup_flags |= LOOKUP_FOLLOW; >+ >+ if (at_flags & AT_EMPTY_PATH) >+ lookup_flags |= LOOKUP_EMPTY; >+ >+ if (fd_empty(dir)) >+ return -EBADF; >+ >+ error = user_path_at(dfd, filename, lookup_flags, &filepath); >+ if (error) >+ return error; >+ >+ error = vfs_fileattr_get(filepath.dentry, &fa); >+ if (!error) >+ error = copy_fsxattr_to_user(&fa, fsx); >+ >+ path_put(&filepath); >+ return error; >+} >+ >+SYSCALL_DEFINE4(setfsxattrat, int, dfd, const char __user *, filename, >+ struct fsxattr __user *, fsx, unsigned int, at_flags) >+{ >+ CLASS(fd, dir)(dfd); >+ struct fileattr fa; >+ struct path filepath; >+ int error; >+ unsigned int lookup_flags = 0; >+ >+ if ((at_flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) >+ return -EINVAL; >+ >+ if (at_flags & AT_SYMLINK_FOLLOW) >+ lookup_flags |= LOOKUP_FOLLOW; >+ >+ if (at_flags & AT_EMPTY_PATH) >+ lookup_flags |= LOOKUP_EMPTY; >+ >+ if (fd_empty(dir)) >+ return -EBADF; >+ >+ if (copy_fsxattr_from_user(&fa, fsx)) >+ return -EFAULT; >+ >+ error = user_path_at(dfd, filename, lookup_flags, &filepath); >+ if (error) >+ return error; >+ >+ error = mnt_want_write(filepath.mnt); >+ if (!error) { >+ error = vfs_fileattr_set(file_mnt_idmap(fd_file(dir)), >+ filepath.dentry, &fa); >+ mnt_drop_write(filepath.mnt); >+ } >+ >+ path_put(&filepath); >+ return error; >+} >diff --git a/fs/ioctl.c b/fs/ioctl.c >index 638a36be31c14afc66a7fd6eb237d9545e8ad997..dc160c2ef145e4931d625f1f93c2a8ae7f87abf3 100644 >--- a/fs/ioctl.c >+++ b/fs/ioctl.c >@@ -558,8 +558,7 @@ int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa) > } > EXPORT_SYMBOL(copy_fsxattr_to_user); > >-static int copy_fsxattr_from_user(struct fileattr *fa, >- struct fsxattr __user *ufa) >+int copy_fsxattr_from_user(struct fileattr *fa, struct fsxattr __user *ufa) > { > struct fsxattr xfa; > >@@ -646,6 +645,19 @@ static int fileattr_set_prepare(struct inode *inode, > if (fa->fsx_cowextsize == 0) > fa->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE; > >+ /* >+ * The only use case for special files is to set project ID, forbid any >+ * other attributes >+ */ >+ if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode))) { >+ if (fa->fsx_xflags & ~FS_XFLAG_PROJINHERIT) >+ return -EINVAL; >+ if (!S_ISLNK(inode->i_mode) && fa->fsx_nextents) >+ return -EINVAL; >+ if (fa->fsx_extsize || fa->fsx_cowextsize) >+ return -EINVAL; >+ } >+ > return 0; > } > >diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h >index 47c05a9851d0600964b644c9c7218faacfd865f8..8598e94b530b8b280a2697eaf918dd60f573d6ee 100644 >--- a/include/linux/fileattr.h >+++ b/include/linux/fileattr.h >@@ -34,6 +34,7 @@ struct fileattr { > }; > > int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa); >+int copy_fsxattr_from_user(struct fileattr *fa, struct fsxattr __user *ufa); > > void fileattr_fill_xflags(struct fileattr *fa, u32 xflags); > void fileattr_fill_flags(struct fileattr *fa, u32 flags); >diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h >index c6333204d45130eb022f6db460eea34a1f6e91db..3134d463d9af64c6e78adb37bff4b91f77b5305f 100644 >--- a/include/linux/syscalls.h >+++ b/include/linux/syscalls.h >@@ -371,6 +371,10 @@ asmlinkage long sys_removexattrat(int dfd, const char __user *path, > asmlinkage long sys_lremovexattr(const char __user *path, > const char __user *name); > asmlinkage long sys_fremovexattr(int fd, const char __user *name); >+asmlinkage long sys_getfsxattrat(int dfd, const char __user *filename, >+ struct fsxattr *fsx, unsigned int at_flags); >+asmlinkage long sys_setfsxattrat(int dfd, const char __user *filename, >+ struct fsxattr *fsx, unsigned int at_flags); > asmlinkage long sys_getcwd(char __user *buf, unsigned long size); > asmlinkage long sys_eventfd2(unsigned int count, int flags); > asmlinkage long sys_epoll_create1(int flags); >diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h >index 88dc393c2bca38c0fa1b3fae579f7cfe4931223c..50be2e1007bc2779120d05c6e9512a689f86779c 100644 >--- a/include/uapi/asm-generic/unistd.h >+++ b/include/uapi/asm-generic/unistd.h >@@ -850,8 +850,14 @@ __SYSCALL(__NR_listxattrat, sys_listxattrat) > #define __NR_removexattrat 466 > __SYSCALL(__NR_removexattrat, sys_removexattrat) > >+/* fs/inode.c */ >+#define __NR_getfsxattrat 467 >+__SYSCALL(__NR_getfsxattrat, sys_getfsxattrat) >+#define __NR_setfsxattrat 468 >+__SYSCALL(__NR_setfsxattrat, sys_setfsxattrat) >+ > #undef __NR_syscalls >-#define __NR_syscalls 467 >+#define __NR_syscalls 469 > > /* > * 32 bit systems traditionally used different > >--- >base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04 >change-id: 20250114-xattrat-syscall-6a1136d2db59 > >Best regards, Could you please give a quick description of the API – even just the prototype – and, for the future, include in the cover letter?
On Tue, Feb 11, 2025, at 18:22, Andrey Albershteyn wrote: > From: Andrey Albershteyn <aalbersh@redhat.com> > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > extended attributes/flags. The syscalls take parent directory fd and > path to the child together with struct fsxattr. > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > that file don't need to be open as we can reference it with a path > instead of fd. By having this we can manipulated inode extended > attributes not only on regular files but also on special ones. This > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > we can not call ioctl() directly on the filesystem inode using fd. > > This patch adds two new syscalls which allows userspace to get/set > extended inode attributes on special files by using parent directory > and a path - *at() like syscall. > > Also, as vfs_fileattr_set() is now will be called on special files > too, let's forbid any other attributes except projid and nextents > (symlink can have an extent). > > CC: linux-api@vger.kernel.org > CC: linux-fsdevel@vger.kernel.org > CC: linux-xfs@vger.kernel.org > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> I checked the syscall.tbl additions and the ABI to ensure that it follows the usual guidelines and is portable across all architectures, this looks good. Thanks for addressing my v1 comments: Acked-by: Arnd Bergmann <arnd@arndb.de> Disclaimer: I have no idea if the new syscalls are a good idea or if they are fit for the purpose, I trust the VFS maintainers will take care of reviewing that.
Got more comments below with private mail: On 2025-02-11 18:22:47, Andrey Albershteyn wrote: > From: Andrey Albershteyn <aalbersh@redhat.com> > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > extended attributes/flags. The syscalls take parent directory fd and > path to the child together with struct fsxattr. > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > that file don't need to be open as we can reference it with a path > instead of fd. By having this we can manipulated inode extended > attributes not only on regular files but also on special ones. This > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > we can not call ioctl() directly on the filesystem inode using fd. > > This patch adds two new syscalls which allows userspace to get/set > extended inode attributes on special files by using parent directory > and a path - *at() like syscall. > > Also, as vfs_fileattr_set() is now will be called on special files > too, let's forbid any other attributes except projid and nextents > (symlink can have an extent). > > CC: linux-api@vger.kernel.org > CC: linux-fsdevel@vger.kernel.org > CC: linux-xfs@vger.kernel.org > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > --- > v1: > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > > Previous discussion: > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > > XFS has project quotas which could be attached to a directory. All > new inodes in these directories inherit project ID set on parent > directory. > > The project is created from userspace by opening and calling > FS_IOC_FSSETXATTR on each inode. This is not possible for special > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > with empty project ID. Those inodes then are not shown in the quota > accounting but still exist in the directory. Moreover, in the case > when special files are created in the directory with already > existing project quota, these inode inherit extended attributes. > This than leaves them with these attributes without the possibility > to clear them out. This, in turn, prevents userspace from > re-creating quota project on these existing files. > --- > Changes in v3: > - Remove unnecessary "dfd is dir" check as it checked in user_path_at() > - Remove unnecessary "same filesystem" check > - Use CLASS() instead of directly calling fdget/fdput > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org > --- > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > arch/arm/tools/syscall.tbl | 2 + > arch/arm64/tools/syscall_32.tbl | 2 + > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > arch/s390/kernel/syscalls/syscall.tbl | 2 + > arch/sh/kernel/syscalls/syscall.tbl | 2 + > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > fs/inode.c | 75 +++++++++++++++++++++++++++++ > fs/ioctl.c | 16 +++++- > include/linux/fileattr.h | 1 + > include/linux/syscalls.h | 4 ++ > include/uapi/asm-generic/unistd.h | 8 ++- > 21 files changed, 133 insertions(+), 3 deletions(-) > > diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl > index c59d53d6d3f3490f976ca179ddfe02e69265ae4d..4b9e687494c16b60c6fd6ca1dc4d6564706a7e25 100644 > --- a/arch/alpha/kernel/syscalls/syscall.tbl > +++ b/arch/alpha/kernel/syscalls/syscall.tbl > @@ -506,3 +506,5 @@ > 574 common getxattrat sys_getxattrat > 575 common listxattrat sys_listxattrat > 576 common removexattrat sys_removexattrat > +577 common getfsxattrat sys_getfsxattrat > +578 common setfsxattrat sys_setfsxattrat > diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl > index 49eeb2ad8dbd8e074c6240417693f23fb328afa8..66466257f3c2debb3e2299f0b608c6740c98cab2 100644 > --- a/arch/arm/tools/syscall.tbl > +++ b/arch/arm/tools/syscall.tbl > @@ -481,3 +481,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > diff --git a/arch/arm64/tools/syscall_32.tbl b/arch/arm64/tools/syscall_32.tbl > index 69a829912a05eb8a3e21ed701d1030e31c0148bc..9c516118b154811d8d11d5696f32817430320dbf 100644 > --- a/arch/arm64/tools/syscall_32.tbl > +++ b/arch/arm64/tools/syscall_32.tbl > @@ -478,3 +478,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl > index f5ed71f1910d09769c845c2d062d99ee0449437c..159476387f394a92ee5e29db89b118c630372db2 100644 > --- a/arch/m68k/kernel/syscalls/syscall.tbl > +++ b/arch/m68k/kernel/syscalls/syscall.tbl > @@ -466,3 +466,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl > index 680f568b77f2cbefc3eacb2517f276041f229b1e..a6d59ee740b58cacf823702003cf9bad17c0d3b7 100644 > --- a/arch/microblaze/kernel/syscalls/syscall.tbl > +++ b/arch/microblaze/kernel/syscalls/syscall.tbl > @@ -472,3 +472,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl > index 0b9b7e25b69ad592642f8533bee9ccfe95ce9626..cfe38fcebe1a0279e11751378d3e71c5ec6b6569 100644 > --- a/arch/mips/kernel/syscalls/syscall_n32.tbl > +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl > @@ -405,3 +405,5 @@ > 464 n32 getxattrat sys_getxattrat > 465 n32 listxattrat sys_listxattrat > 466 n32 removexattrat sys_removexattrat > +467 n32 getfsxattrat sys_getfsxattrat > +468 n32 setfsxattrat sys_setfsxattrat > diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl > index c844cd5cda620b2809a397cdd6f4315ab6a1bfe2..29a0c5974d1aa2f01e33edc0252d75fb97abe230 100644 > --- a/arch/mips/kernel/syscalls/syscall_n64.tbl > +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl > @@ -381,3 +381,5 @@ > 464 n64 getxattrat sys_getxattrat > 465 n64 listxattrat sys_listxattrat > 466 n64 removexattrat sys_removexattrat > +467 n64 getfsxattrat sys_getfsxattrat > +468 n64 setfsxattrat sys_setfsxattrat > diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl > index 349b8aad1159f404103bd2057a1e64e9bf309f18..6c00436807c57c492ba957fcd59af1202231cf80 100644 > --- a/arch/mips/kernel/syscalls/syscall_o32.tbl > +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl > @@ -454,3 +454,5 @@ > 464 o32 getxattrat sys_getxattrat > 465 o32 listxattrat sys_listxattrat > 466 o32 removexattrat sys_removexattrat > +467 o32 getfsxattrat sys_getfsxattrat > +468 o32 setfsxattrat sys_setfsxattrat > diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl > index d9fc94c869657fcfbd7aca1d5f5abc9fae2fb9d8..b3578fac43d6b65167787fcc97d2d09f5a9828e7 100644 > --- a/arch/parisc/kernel/syscalls/syscall.tbl > +++ b/arch/parisc/kernel/syscalls/syscall.tbl > @@ -465,3 +465,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl > index d8b4ab78bef076bd50d49b87dea5060fd8c1686a..808045d82c9465c3bfa96b15947546efe5851e9a 100644 > --- a/arch/powerpc/kernel/syscalls/syscall.tbl > +++ b/arch/powerpc/kernel/syscalls/syscall.tbl > @@ -557,3 +557,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl > index e9115b4d8b635b846e5c9ad6ce229605323723a5..78dfc2c184d4815baf8a9e61c546c9936d58a47c 100644 > --- a/arch/s390/kernel/syscalls/syscall.tbl > +++ b/arch/s390/kernel/syscalls/syscall.tbl > @@ -469,3 +469,5 @@ > 464 common getxattrat sys_getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat sys_setfsxattrat > diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl > index c8cad33bf250ea110de37bd1407f5a43ec5e38f2..d5a5c8339f0ed25ea07c4aba90351d352033c8a0 100644 > --- a/arch/sh/kernel/syscalls/syscall.tbl > +++ b/arch/sh/kernel/syscalls/syscall.tbl > @@ -470,3 +470,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl > index 727f99d333b304b3db0711953a3d91ece18a28eb..817dcd8603bcbffc47f3f59aa3b74b16486453d0 100644 > --- a/arch/sparc/kernel/syscalls/syscall.tbl > +++ b/arch/sparc/kernel/syscalls/syscall.tbl > @@ -512,3 +512,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl > index 4d0fb2fba7e208ae9455459afe11e277321d9f74..b4842c027c5d00c0236b2ba89387c5e2267447bd 100644 > --- a/arch/x86/entry/syscalls/syscall_32.tbl > +++ b/arch/x86/entry/syscalls/syscall_32.tbl > @@ -472,3 +472,5 @@ > 464 i386 getxattrat sys_getxattrat > 465 i386 listxattrat sys_listxattrat > 466 i386 removexattrat sys_removexattrat > +467 i386 getfsxattrat sys_getfsxattrat > +468 i386 setfsxattrat sys_setfsxattrat > diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl > index 5eb708bff1c791debd6cfc5322583b2ae53f6437..b6f0a7236aaee624cf9b484239a1068085a8ffe1 100644 > --- a/arch/x86/entry/syscalls/syscall_64.tbl > +++ b/arch/x86/entry/syscalls/syscall_64.tbl > @@ -390,6 +390,8 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > > # > # Due to a historical design error, certain syscalls are numbered differently > diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl > index 37effc1b134eea061f2c350c1d68b4436b65a4dd..425d56be337d1de22f205ac503df61ff86224fee 100644 > --- a/arch/xtensa/kernel/syscalls/syscall.tbl > +++ b/arch/xtensa/kernel/syscalls/syscall.tbl > @@ -437,3 +437,5 @@ > 464 common getxattrat sys_getxattrat > 465 common listxattrat sys_listxattrat > 466 common removexattrat sys_removexattrat > +467 common getfsxattrat sys_getfsxattrat > +468 common setfsxattrat sys_setfsxattrat > diff --git a/fs/inode.c b/fs/inode.c > index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -23,6 +23,9 @@ > #include <linux/rw_hint.h> > #include <linux/seq_file.h> > #include <linux/debugfs.h> > +#include <linux/syscalls.h> > +#include <linux/fileattr.h> > +#include <linux/namei.h> > #include <trace/events/writeback.h> > #define CREATE_TRACE_POINTS > #include <trace/events/timestamp.h> > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > return mode & ~S_ISGID; > } > EXPORT_SYMBOL(mode_strip_sgid); > + > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > + struct fsxattr __user *, fsx, unsigned int, at_flags) > +{ > + CLASS(fd, dir)(dfd); > + struct fileattr fa; > + struct path filepath; > + int error; > + unsigned int lookup_flags = 0; > + > + if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0) > + return -EINVAL; > + > + if (at_flags & AT_SYMLINK_FOLLOW) > + lookup_flags |= LOOKUP_FOLLOW; > + > + if (at_flags & AT_EMPTY_PATH) > + lookup_flags |= LOOKUP_EMPTY; > + > + if (fd_empty(dir)) > + return -EBADF; > + > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > + if (error) > + return error; > + > + error = vfs_fileattr_get(filepath.dentry, &fa); vfs_fileattr_get() returns ENOIOCTLCMD, where EOPNOTSUPP is more appropriate > + if (!error) > + error = copy_fsxattr_to_user(&fa, fsx); > + > + path_put(&filepath); > + return error; > +} > + > +SYSCALL_DEFINE4(setfsxattrat, int, dfd, const char __user *, filename, > + struct fsxattr __user *, fsx, unsigned int, at_flags) ^ can be const > +{ > + CLASS(fd, dir)(dfd); > + struct fileattr fa; > + struct path filepath; > + int error; > + unsigned int lookup_flags = 0; > + > + if ((at_flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) > + return -EINVAL; > + > + if (at_flags & AT_SYMLINK_FOLLOW) > + lookup_flags |= LOOKUP_FOLLOW; > + > + if (at_flags & AT_EMPTY_PATH) > + lookup_flags |= LOOKUP_EMPTY; > + > + if (fd_empty(dir)) > + return -EBADF; > + > + if (copy_fsxattr_from_user(&fa, fsx)) > + return -EFAULT; > + > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > + if (error) > + return error; > + > + error = mnt_want_write(filepath.mnt); > + if (!error) { > + error = vfs_fileattr_set(file_mnt_idmap(fd_file(dir)), > + filepath.dentry, &fa); same here with returned error
It looks security checks are missing. With IOCTL commands, file permissions are checked at open time, but with these syscalls the path is only resolved but no specific access seems to be checked (except inode_owner_or_capable via vfs_fileattr_set). On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: > From: Andrey Albershteyn <aalbersh@redhat.com> > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > extended attributes/flags. The syscalls take parent directory fd and > path to the child together with struct fsxattr. > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > that file don't need to be open as we can reference it with a path > instead of fd. By having this we can manipulated inode extended > attributes not only on regular files but also on special ones. This > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > we can not call ioctl() directly on the filesystem inode using fd. > > This patch adds two new syscalls which allows userspace to get/set > extended inode attributes on special files by using parent directory > and a path - *at() like syscall. > > Also, as vfs_fileattr_set() is now will be called on special files > too, let's forbid any other attributes except projid and nextents > (symlink can have an extent). > > CC: linux-api@vger.kernel.org > CC: linux-fsdevel@vger.kernel.org > CC: linux-xfs@vger.kernel.org > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > --- > v1: > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > > Previous discussion: > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > > XFS has project quotas which could be attached to a directory. All > new inodes in these directories inherit project ID set on parent > directory. > > The project is created from userspace by opening and calling > FS_IOC_FSSETXATTR on each inode. This is not possible for special > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > with empty project ID. Those inodes then are not shown in the quota > accounting but still exist in the directory. Moreover, in the case > when special files are created in the directory with already > existing project quota, these inode inherit extended attributes. > This than leaves them with these attributes without the possibility > to clear them out. This, in turn, prevents userspace from > re-creating quota project on these existing files. > --- > Changes in v3: > - Remove unnecessary "dfd is dir" check as it checked in user_path_at() > - Remove unnecessary "same filesystem" check > - Use CLASS() instead of directly calling fdget/fdput > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org > --- > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > arch/arm/tools/syscall.tbl | 2 + > arch/arm64/tools/syscall_32.tbl | 2 + > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > arch/s390/kernel/syscalls/syscall.tbl | 2 + > arch/sh/kernel/syscalls/syscall.tbl | 2 + > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > fs/inode.c | 75 +++++++++++++++++++++++++++++ > fs/ioctl.c | 16 +++++- > include/linux/fileattr.h | 1 + > include/linux/syscalls.h | 4 ++ > include/uapi/asm-generic/unistd.h | 8 ++- > 21 files changed, 133 insertions(+), 3 deletions(-) > [...] > diff --git a/fs/inode.c b/fs/inode.c > index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -23,6 +23,9 @@ > #include <linux/rw_hint.h> > #include <linux/seq_file.h> > #include <linux/debugfs.h> > +#include <linux/syscalls.h> > +#include <linux/fileattr.h> > +#include <linux/namei.h> > #include <trace/events/writeback.h> > #define CREATE_TRACE_POINTS > #include <trace/events/timestamp.h> > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > return mode & ~S_ISGID; > } > EXPORT_SYMBOL(mode_strip_sgid); > + > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > + struct fsxattr __user *, fsx, unsigned int, at_flags) > +{ > + CLASS(fd, dir)(dfd); > + struct fileattr fa; > + struct path filepath; > + int error; > + unsigned int lookup_flags = 0; > + > + if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0) > + return -EINVAL; > + > + if (at_flags & AT_SYMLINK_FOLLOW) > + lookup_flags |= LOOKUP_FOLLOW; > + > + if (at_flags & AT_EMPTY_PATH) > + lookup_flags |= LOOKUP_EMPTY; > + > + if (fd_empty(dir)) > + return -EBADF; > + > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > + if (error) > + return error; security_inode_getattr() should probably be called here. > + > + error = vfs_fileattr_get(filepath.dentry, &fa); > + if (!error) > + error = copy_fsxattr_to_user(&fa, fsx); > + > + path_put(&filepath); > + return error; > +} > + > +SYSCALL_DEFINE4(setfsxattrat, int, dfd, const char __user *, filename, > + struct fsxattr __user *, fsx, unsigned int, at_flags) > +{ > + CLASS(fd, dir)(dfd); > + struct fileattr fa; > + struct path filepath; > + int error; > + unsigned int lookup_flags = 0; > + > + if ((at_flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) > + return -EINVAL; > + > + if (at_flags & AT_SYMLINK_FOLLOW) > + lookup_flags |= LOOKUP_FOLLOW; > + > + if (at_flags & AT_EMPTY_PATH) > + lookup_flags |= LOOKUP_EMPTY; > + > + if (fd_empty(dir)) > + return -EBADF; > + > + if (copy_fsxattr_from_user(&fa, fsx)) > + return -EFAULT; > + > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > + if (error) > + return error; > + > + error = mnt_want_write(filepath.mnt); > + if (!error) { security_inode_setattr() should probably be called too. > + error = vfs_fileattr_set(file_mnt_idmap(fd_file(dir)), > + filepath.dentry, &fa); > + mnt_drop_write(filepath.mnt); > + } > + > + path_put(&filepath); > + return error; > +} > diff --git a/fs/ioctl.c b/fs/ioctl.c > index 638a36be31c14afc66a7fd6eb237d9545e8ad997..dc160c2ef145e4931d625f1f93c2a8ae7f87abf3 100644 > --- a/fs/ioctl.c > +++ b/fs/ioctl.c > @@ -558,8 +558,7 @@ int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa) > } > EXPORT_SYMBOL(copy_fsxattr_to_user); > > -static int copy_fsxattr_from_user(struct fileattr *fa, > - struct fsxattr __user *ufa) > +int copy_fsxattr_from_user(struct fileattr *fa, struct fsxattr __user *ufa) > { > struct fsxattr xfa; > > @@ -646,6 +645,19 @@ static int fileattr_set_prepare(struct inode *inode, > if (fa->fsx_cowextsize == 0) > fa->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE; > > + /* > + * The only use case for special files is to set project ID, forbid any > + * other attributes > + */ > + if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode))) { > + if (fa->fsx_xflags & ~FS_XFLAG_PROJINHERIT) > + return -EINVAL; > + if (!S_ISLNK(inode->i_mode) && fa->fsx_nextents) > + return -EINVAL; > + if (fa->fsx_extsize || fa->fsx_cowextsize) > + return -EINVAL; > + } > + > return 0; > } > > diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h [...]
On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: > From: Andrey Albershteyn <aalbersh@redhat.com> > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > extended attributes/flags. The syscalls take parent directory fd and > path to the child together with struct fsxattr. > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > that file don't need to be open as we can reference it with a path > instead of fd. By having this we can manipulated inode extended > attributes not only on regular files but also on special ones. This > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > we can not call ioctl() directly on the filesystem inode using fd. > > This patch adds two new syscalls which allows userspace to get/set > extended inode attributes on special files by using parent directory > and a path - *at() like syscall. > > Also, as vfs_fileattr_set() is now will be called on special files > too, let's forbid any other attributes except projid and nextents > (symlink can have an extent). > > CC: linux-api@vger.kernel.org > CC: linux-fsdevel@vger.kernel.org > CC: linux-xfs@vger.kernel.org > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > --- > v1: > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > > Previous discussion: > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > > XFS has project quotas which could be attached to a directory. All > new inodes in these directories inherit project ID set on parent > directory. > > The project is created from userspace by opening and calling > FS_IOC_FSSETXATTR on each inode. This is not possible for special > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > with empty project ID. Those inodes then are not shown in the quota > accounting but still exist in the directory. Moreover, in the case > when special files are created in the directory with already > existing project quota, these inode inherit extended attributes. > This than leaves them with these attributes without the possibility > to clear them out. This, in turn, prevents userspace from > re-creating quota project on these existing files. > --- > Changes in v3: > - Remove unnecessary "dfd is dir" check as it checked in user_path_at() > - Remove unnecessary "same filesystem" check > - Use CLASS() instead of directly calling fdget/fdput > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org > --- > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > arch/arm/tools/syscall.tbl | 2 + > arch/arm64/tools/syscall_32.tbl | 2 + > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > arch/s390/kernel/syscalls/syscall.tbl | 2 + > arch/sh/kernel/syscalls/syscall.tbl | 2 + > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > fs/inode.c | 75 +++++++++++++++++++++++++++++ > fs/ioctl.c | 16 +++++- > include/linux/fileattr.h | 1 + > include/linux/syscalls.h | 4 ++ > include/uapi/asm-generic/unistd.h | 8 ++- > 21 files changed, 133 insertions(+), 3 deletions(-) > <cut to the syscall definitions> > diff --git a/fs/inode.c b/fs/inode.c > index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -23,6 +23,9 @@ > #include <linux/rw_hint.h> > #include <linux/seq_file.h> > #include <linux/debugfs.h> > +#include <linux/syscalls.h> > +#include <linux/fileattr.h> > +#include <linux/namei.h> > #include <trace/events/writeback.h> > #define CREATE_TRACE_POINTS > #include <trace/events/timestamp.h> > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > return mode & ~S_ISGID; > } > EXPORT_SYMBOL(mode_strip_sgid); > + > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > + struct fsxattr __user *, fsx, unsigned int, at_flags) Should the kernel require userspace to pass the size of the fsx buffer? That way we avoid needing to rev the interface when we decide to grow the structure. --D > +{ > + CLASS(fd, dir)(dfd); > + struct fileattr fa; > + struct path filepath; > + int error; > + unsigned int lookup_flags = 0; > + > + if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0) > + return -EINVAL; > + > + if (at_flags & AT_SYMLINK_FOLLOW) > + lookup_flags |= LOOKUP_FOLLOW; > + > + if (at_flags & AT_EMPTY_PATH) > + lookup_flags |= LOOKUP_EMPTY; > + > + if (fd_empty(dir)) > + return -EBADF; > + > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > + if (error) > + return error; > + > + error = vfs_fileattr_get(filepath.dentry, &fa); > + if (!error) > + error = copy_fsxattr_to_user(&fa, fsx); > + > + path_put(&filepath); > + return error; > +} > + > +SYSCALL_DEFINE4(setfsxattrat, int, dfd, const char __user *, filename, > + struct fsxattr __user *, fsx, unsigned int, at_flags) > +{ > + CLASS(fd, dir)(dfd); > + struct fileattr fa; > + struct path filepath; > + int error; > + unsigned int lookup_flags = 0; > + > + if ((at_flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) > + return -EINVAL; > + > + if (at_flags & AT_SYMLINK_FOLLOW) > + lookup_flags |= LOOKUP_FOLLOW; > + > + if (at_flags & AT_EMPTY_PATH) > + lookup_flags |= LOOKUP_EMPTY; > + > + if (fd_empty(dir)) > + return -EBADF; > + > + if (copy_fsxattr_from_user(&fa, fsx)) > + return -EFAULT; > + > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > + if (error) > + return error; > + > + error = mnt_want_write(filepath.mnt); > + if (!error) { > + error = vfs_fileattr_set(file_mnt_idmap(fd_file(dir)), > + filepath.dentry, &fa); > + mnt_drop_write(filepath.mnt); > + } > + > + path_put(&filepath); > + return error; > +} > diff --git a/fs/ioctl.c b/fs/ioctl.c > index 638a36be31c14afc66a7fd6eb237d9545e8ad997..dc160c2ef145e4931d625f1f93c2a8ae7f87abf3 100644 > --- a/fs/ioctl.c > +++ b/fs/ioctl.c > @@ -558,8 +558,7 @@ int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa) > } > EXPORT_SYMBOL(copy_fsxattr_to_user); > > -static int copy_fsxattr_from_user(struct fileattr *fa, > - struct fsxattr __user *ufa) > +int copy_fsxattr_from_user(struct fileattr *fa, struct fsxattr __user *ufa) > { > struct fsxattr xfa; > > @@ -646,6 +645,19 @@ static int fileattr_set_prepare(struct inode *inode, > if (fa->fsx_cowextsize == 0) > fa->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE; > > + /* > + * The only use case for special files is to set project ID, forbid any > + * other attributes > + */ > + if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode))) { > + if (fa->fsx_xflags & ~FS_XFLAG_PROJINHERIT) > + return -EINVAL; > + if (!S_ISLNK(inode->i_mode) && fa->fsx_nextents) > + return -EINVAL; > + if (fa->fsx_extsize || fa->fsx_cowextsize) > + return -EINVAL; > + } > + > return 0; > } > > diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h > index 47c05a9851d0600964b644c9c7218faacfd865f8..8598e94b530b8b280a2697eaf918dd60f573d6ee 100644 > --- a/include/linux/fileattr.h > +++ b/include/linux/fileattr.h > @@ -34,6 +34,7 @@ struct fileattr { > }; > > int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa); > +int copy_fsxattr_from_user(struct fileattr *fa, struct fsxattr __user *ufa); > > void fileattr_fill_xflags(struct fileattr *fa, u32 xflags); > void fileattr_fill_flags(struct fileattr *fa, u32 flags); > diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h > index c6333204d45130eb022f6db460eea34a1f6e91db..3134d463d9af64c6e78adb37bff4b91f77b5305f 100644 > --- a/include/linux/syscalls.h > +++ b/include/linux/syscalls.h > @@ -371,6 +371,10 @@ asmlinkage long sys_removexattrat(int dfd, const char __user *path, > asmlinkage long sys_lremovexattr(const char __user *path, > const char __user *name); > asmlinkage long sys_fremovexattr(int fd, const char __user *name); > +asmlinkage long sys_getfsxattrat(int dfd, const char __user *filename, > + struct fsxattr *fsx, unsigned int at_flags); > +asmlinkage long sys_setfsxattrat(int dfd, const char __user *filename, > + struct fsxattr *fsx, unsigned int at_flags); > asmlinkage long sys_getcwd(char __user *buf, unsigned long size); > asmlinkage long sys_eventfd2(unsigned int count, int flags); > asmlinkage long sys_epoll_create1(int flags); > diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h > index 88dc393c2bca38c0fa1b3fae579f7cfe4931223c..50be2e1007bc2779120d05c6e9512a689f86779c 100644 > --- a/include/uapi/asm-generic/unistd.h > +++ b/include/uapi/asm-generic/unistd.h > @@ -850,8 +850,14 @@ __SYSCALL(__NR_listxattrat, sys_listxattrat) > #define __NR_removexattrat 466 > __SYSCALL(__NR_removexattrat, sys_removexattrat) > > +/* fs/inode.c */ > +#define __NR_getfsxattrat 467 > +__SYSCALL(__NR_getfsxattrat, sys_getfsxattrat) > +#define __NR_setfsxattrat 468 > +__SYSCALL(__NR_setfsxattrat, sys_setfsxattrat) > + > #undef __NR_syscalls > -#define __NR_syscalls 467 > +#define __NR_syscalls 469 > > /* > * 32 bit systems traditionally used different > > --- > base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04 > change-id: 20250114-xattrat-syscall-6a1136d2db59 > > Best regards, > -- > Andrey Albershteyn <aalbersh@kernel.org> > >
On Feb 21, 2025, at 11:11 AM, Darrick J. Wong <djwong@kernel.org> wrote: > > On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: >> From: Andrey Albershteyn <aalbersh@redhat.com> >> >> Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode >> extended attributes/flags. The syscalls take parent directory fd and >> path to the child together with struct fsxattr. >> >> This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference >> that file don't need to be open as we can reference it with a path >> instead of fd. By having this we can manipulated inode extended >> attributes not only on regular files but also on special ones. This >> is not possible with FS_IOC_FSSETXATTR ioctl as with special files >> we can not call ioctl() directly on the filesystem inode using fd. >> >> This patch adds two new syscalls which allows userspace to get/set >> extended inode attributes on special files by using parent directory >> and a path - *at() like syscall. >> >> Also, as vfs_fileattr_set() is now will be called on special files >> too, let's forbid any other attributes except projid and nextents >> (symlink can have an extent). >> >> CC: linux-api@vger.kernel.org >> CC: linux-fsdevel@vger.kernel.org >> CC: linux-xfs@vger.kernel.org >> Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> >> --- >> v1: >> https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ >> >> Previous discussion: >> https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ >> >> XFS has project quotas which could be attached to a directory. All >> new inodes in these directories inherit project ID set on parent >> directory. >> >> The project is created from userspace by opening and calling >> FS_IOC_FSSETXATTR on each inode. This is not possible for special >> files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left >> with empty project ID. Those inodes then are not shown in the quota >> accounting but still exist in the directory. Moreover, in the case >> when special files are created in the directory with already >> existing project quota, these inode inherit extended attributes. >> This than leaves them with these attributes without the possibility >> to clear them out. This, in turn, prevents userspace from >> re-creating quota project on these existing files. >> --- >> Changes in v3: >> - Remove unnecessary "dfd is dir" check as it checked in user_path_at() >> - Remove unnecessary "same filesystem" check >> - Use CLASS() instead of directly calling fdget/fdput >> - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org >> --- >> arch/alpha/kernel/syscalls/syscall.tbl | 2 + >> arch/arm/tools/syscall.tbl | 2 + >> arch/arm64/tools/syscall_32.tbl | 2 + >> arch/m68k/kernel/syscalls/syscall.tbl | 2 + >> arch/microblaze/kernel/syscalls/syscall.tbl | 2 + >> arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + >> arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + >> arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + >> arch/parisc/kernel/syscalls/syscall.tbl | 2 + >> arch/powerpc/kernel/syscalls/syscall.tbl | 2 + >> arch/s390/kernel/syscalls/syscall.tbl | 2 + >> arch/sh/kernel/syscalls/syscall.tbl | 2 + >> arch/sparc/kernel/syscalls/syscall.tbl | 2 + >> arch/x86/entry/syscalls/syscall_32.tbl | 2 + >> arch/x86/entry/syscalls/syscall_64.tbl | 2 + >> arch/xtensa/kernel/syscalls/syscall.tbl | 2 + >> fs/inode.c | 75 +++++++++++++++++++++++++++++ >> fs/ioctl.c | 16 +++++- >> include/linux/fileattr.h | 1 + >> include/linux/syscalls.h | 4 ++ >> include/uapi/asm-generic/unistd.h | 8 ++- >> 21 files changed, 133 insertions(+), 3 deletions(-) >> > > <cut to the syscall definitions> > >> diff --git a/fs/inode.c b/fs/inode.c >> index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 >> --- a/fs/inode.c >> +++ b/fs/inode.c >> @@ -23,6 +23,9 @@ >> #include <linux/rw_hint.h> >> #include <linux/seq_file.h> >> #include <linux/debugfs.h> >> +#include <linux/syscalls.h> >> +#include <linux/fileattr.h> >> +#include <linux/namei.h> >> #include <trace/events/writeback.h> >> #define CREATE_TRACE_POINTS >> #include <trace/events/timestamp.h> >> @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, >> return mode & ~S_ISGID; >> } >> EXPORT_SYMBOL(mode_strip_sgid); >> + >> +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, >> + struct fsxattr __user *, fsx, unsigned int, at_flags) > > Should the kernel require userspace to pass the size of the fsx buffer? > That way we avoid needing to rev the interface when we decide to grow > the structure. Definitely having some extensibility would be good, and there isn't much room left today. The struct size change would be handled automatically by the ioctl() interface, but not the new syscall interface. Another option would be to use an xflags to indicate "larger struct" and then store the size after the end of the current struct. It would also be possible to use one of the few remaining fields for this, but one is earmarked for the DOS flags and/or a bitmask of supported flags, and there isn't really any value to it until more fields are needed. #define FS_XFLAG_LARGE_STRUCT 0x40000000 struct fsxattr { __u32 fsx_xflags; /* xflags field value (get/set) */ __u32 fsx_extsize; /* extsize field value (get/set)*/ __u32 fsx_nextents; /* nextents field value (get) */ __u32 fsx_projid; /* project identifier (get/set) */ __u32 fsx_cowextsize; /* CoW extsize field value (get/set)*/ unsigned char fsx_pad[8]; __u32 fsx_fsxattr_size; /* struct size in bytes (get/set) */ : /* future fields */ : }; Not su Cheers, Andreas
On Fri, Feb 21, 2025 at 7:13 PM Darrick J. Wong <djwong@kernel.org> wrote: > > On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: > > From: Andrey Albershteyn <aalbersh@redhat.com> > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > > extended attributes/flags. The syscalls take parent directory fd and > > path to the child together with struct fsxattr. > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > > that file don't need to be open as we can reference it with a path > > instead of fd. By having this we can manipulated inode extended > > attributes not only on regular files but also on special ones. This > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > > we can not call ioctl() directly on the filesystem inode using fd. > > > > This patch adds two new syscalls which allows userspace to get/set > > extended inode attributes on special files by using parent directory > > and a path - *at() like syscall. > > > > Also, as vfs_fileattr_set() is now will be called on special files > > too, let's forbid any other attributes except projid and nextents > > (symlink can have an extent). > > > > CC: linux-api@vger.kernel.org > > CC: linux-fsdevel@vger.kernel.org > > CC: linux-xfs@vger.kernel.org > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > > --- > > v1: > > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > > > > Previous discussion: > > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > > > > XFS has project quotas which could be attached to a directory. All > > new inodes in these directories inherit project ID set on parent > > directory. > > > > The project is created from userspace by opening and calling > > FS_IOC_FSSETXATTR on each inode. This is not possible for special > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > > with empty project ID. Those inodes then are not shown in the quota > > accounting but still exist in the directory. Moreover, in the case > > when special files are created in the directory with already > > existing project quota, these inode inherit extended attributes. > > This than leaves them with these attributes without the possibility > > to clear them out. This, in turn, prevents userspace from > > re-creating quota project on these existing files. > > --- > > Changes in v3: > > - Remove unnecessary "dfd is dir" check as it checked in user_path_at() > > - Remove unnecessary "same filesystem" check > > - Use CLASS() instead of directly calling fdget/fdput > > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org > > --- > > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > > arch/arm/tools/syscall.tbl | 2 + > > arch/arm64/tools/syscall_32.tbl | 2 + > > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > > arch/s390/kernel/syscalls/syscall.tbl | 2 + > > arch/sh/kernel/syscalls/syscall.tbl | 2 + > > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > > fs/inode.c | 75 +++++++++++++++++++++++++++++ > > fs/ioctl.c | 16 +++++- > > include/linux/fileattr.h | 1 + > > include/linux/syscalls.h | 4 ++ > > include/uapi/asm-generic/unistd.h | 8 ++- > > 21 files changed, 133 insertions(+), 3 deletions(-) > > > > <cut to the syscall definitions> > > > diff --git a/fs/inode.c b/fs/inode.c > > index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 > > --- a/fs/inode.c > > +++ b/fs/inode.c > > @@ -23,6 +23,9 @@ > > #include <linux/rw_hint.h> > > #include <linux/seq_file.h> > > #include <linux/debugfs.h> > > +#include <linux/syscalls.h> > > +#include <linux/fileattr.h> > > +#include <linux/namei.h> > > #include <trace/events/writeback.h> > > #define CREATE_TRACE_POINTS > > #include <trace/events/timestamp.h> > > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > > return mode & ~S_ISGID; > > } > > EXPORT_SYMBOL(mode_strip_sgid); > > + > > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > > Should the kernel require userspace to pass the size of the fsx buffer? > That way we avoid needing to rev the interface when we decide to grow > the structure. > This makes sense to me, but I see that Andreas proposed other ways, as long as we have a plan on how to extend the struct if we need more space. Andrey, I am sorry to bring this up in v3, but I would like to request two small changes before merging this API. This patch by Pali [1] adds fsx_xflags_mask for the filesystem to report the supported set of xflags. It was argued that we can make this change with the existing ioctl, because it is not going to break xfs_io -c lsattr/chattr, which is fine, but I think that we should merge the fsx_xflags_mask change along with getfsxattrat() which is a new UAPI. The second request is related to setfsxattrat(). With current FS_IOC_FSSETXATTR, IIUC, xfs ignores unsupported fsx_xflags. I think this needs to be fixed before merging setfsxattrat(). It's ok that a program calling FS_IOC_FSSETXATTR will not know if unsupported flags will be ignored, because that's the way it is, but I think that setfsxattrat() must return -EINVAL for trying to set unsupported xflags. As I explained in [2] I think it is fine if FS_IOC_FSSETXATTR will also start returning -EINVAL for unsupported flags, but I would like setfsxattrat() to make that a guarantee. There was an open question, what does fsx_xflags_mask mean for setfsxattrat() - it is a mask like in inode_set_flags() as Andreas suggested? I think that would be a good idea. Thanks, Amir. [1] https://lore.kernel.org/linux-fsdevel/20250216164029.20673-4-pali@kernel.org/ [2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxjwQJiKAqyjEmKUnq-VihyeSsxyEy2F+J38NXwrAXurFQ@mail.gmail.com/
On Fri, Feb 21, 2025 at 10:08 AM Mickaël Salaün <mic@digikod.net> wrote: > > It looks security checks are missing. With IOCTL commands, file > permissions are checked at open time, but with these syscalls the path > is only resolved but no specific access seems to be checked (except > inode_owner_or_capable via vfs_fileattr_set). Thanks for reviewing the patch and catching this Mickaël. I agree with the hooks identified and their placement; it should be fairly straightforward with only a few lines added in each case.
On Tue 11-02-25 18:22:47, Andrey Albershteyn wrote: > From: Andrey Albershteyn <aalbersh@redhat.com> > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > extended attributes/flags. The syscalls take parent directory fd and > path to the child together with struct fsxattr. > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > that file don't need to be open as we can reference it with a path > instead of fd. By having this we can manipulated inode extended > attributes not only on regular files but also on special ones. This > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > we can not call ioctl() directly on the filesystem inode using fd. > > This patch adds two new syscalls which allows userspace to get/set > extended inode attributes on special files by using parent directory > and a path - *at() like syscall. > > Also, as vfs_fileattr_set() is now will be called on special files > too, let's forbid any other attributes except projid and nextents > (symlink can have an extent). > > CC: linux-api@vger.kernel.org > CC: linux-fsdevel@vger.kernel.org > CC: linux-xfs@vger.kernel.org > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> Some comments below: > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > + struct fsxattr __user *, fsx, unsigned int, at_flags) > +{ > + CLASS(fd, dir)(dfd); > + struct fileattr fa; > + struct path filepath; > + int error; > + unsigned int lookup_flags = 0; > + > + if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0) > + return -EINVAL; > + > + if (at_flags & AT_SYMLINK_FOLLOW) ^^ This should be !(at_flags & AT_SYMLINK_NOFOLLOW)? In the check above you verify for AT_SYMLINK_NOFOLLOW and that also matches what setxattrat() does... > + lookup_flags |= LOOKUP_FOLLOW; > + > + if (at_flags & AT_EMPTY_PATH) > + lookup_flags |= LOOKUP_EMPTY; > + > + if (fd_empty(dir)) > + return -EBADF; This check is wrong and in fact the whole dfd handling looks buggy. openat(2) manpage describes the expected behavior: The dirfd argument is used in conjunction with the pathname argument as follows: • If the pathname given in pathname is absolute, then dirfd is ig- nored. ^^^^ This is what you break. If the pathname is absolute, you're not expected to touch dirfd. • If the pathname given in pathname is relative and dirfd is the spe- cial value AT_FDCWD, then pathname is interpreted relative to the current working directory of the calling process (like open()). ^^^ Also AT_FDCWD handling would be broken by the above check. • If the pathname given in pathname is relative, then it is inter- preted relative to the directory referred to by the file descriptor dirfd (rather than relative to the current working directory of the calling process, as is done by open() for a relative pathname). In this case, dirfd must be a directory that was opened for reading (O_RDONLY) or using the O_PATH flag. If the pathname given in pathname is relative, and dirfd is not a valid file descriptor, an error (EBADF) results. (Specifying an invalid file descriptor number in dirfd can be used as a means to ensure that path- name is absolute.) > + > + error = user_path_at(dfd, filename, lookup_flags, &filepath); ^^^ And user_path_at() isn't quite what you need either because with AT_EMPTY_PATH we also want to allow for filename to be NULL (not just empty string) and user_path_at() does not support that. That's why I in my previous replies suggested you should follow what setxattrat() does and that sadly it is more painful than it should be. You need something like: name = getname_maybe_null(filename, at_flags); if (!name) { CLASS(fd, f)(dfd); if (fd_empty(f)) return -EBADF; error = vfs_fileattr_get(file_dentry(fd_file(f)), &fa); } else { error = filename_lookup(dfd, filename, lookup_flags, &filepath, NULL); if (error) goto out; error = vfs_fileattr_get(filepath.dentry, &fa); path_put(&filepath); } if (!error) error = copy_fsxattr_to_user(&fa, fsx); out: putname(name); return error; Longer term, we need to provide user_path_maybe_null_at() for this but I don't want to drag you into this cleanup :) > + if (error) > + return error; > + > + error = vfs_fileattr_get(filepath.dentry, &fa); > + if (!error) > + error = copy_fsxattr_to_user(&fa, fsx); > + > + path_put(&filepath); > + return error; > +} > + > +SYSCALL_DEFINE4(setfsxattrat, int, dfd, const char __user *, filename, > + struct fsxattr __user *, fsx, unsigned int, at_flags) > +{ > + CLASS(fd, dir)(dfd); > + struct fileattr fa; > + struct path filepath; > + int error; > + unsigned int lookup_flags = 0; > + > + if ((at_flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) > + return -EINVAL; > + > + if (at_flags & AT_SYMLINK_FOLLOW) > + lookup_flags |= LOOKUP_FOLLOW; I think using AT_SYMLINK_NOFOLLOW is actually more traditional and thus less surprising to users so I'd prefer that. Definitely this needs to be consistent with getfsxattrat(). > + > + if (at_flags & AT_EMPTY_PATH) > + lookup_flags |= LOOKUP_EMPTY; > + > + if (fd_empty(dir)) > + return -EBADF; Same comment regarding dfd handling as above. > + > + if (copy_fsxattr_from_user(&fa, fsx)) > + return -EFAULT; > + > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > + if (error) > + return error; > + > + error = mnt_want_write(filepath.mnt); > + if (!error) { > + error = vfs_fileattr_set(file_mnt_idmap(fd_file(dir)), > + filepath.dentry, &fa); > + mnt_drop_write(filepath.mnt); > + } > + > + path_put(&filepath); > + return error; > +} Otherwise the patch looks good to me. Honza
On Fri, Feb 21, 2025 at 08:15:24PM +0100, Amir Goldstein wrote: > On Fri, Feb 21, 2025 at 7:13 PM Darrick J. Wong <djwong@kernel.org> wrote: > > > > On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: > > > From: Andrey Albershteyn <aalbersh@redhat.com> > > > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > > > extended attributes/flags. The syscalls take parent directory fd and > > > path to the child together with struct fsxattr. > > > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > > > that file don't need to be open as we can reference it with a path > > > instead of fd. By having this we can manipulated inode extended > > > attributes not only on regular files but also on special ones. This > > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > > > we can not call ioctl() directly on the filesystem inode using fd. > > > > > > This patch adds two new syscalls which allows userspace to get/set > > > extended inode attributes on special files by using parent directory > > > and a path - *at() like syscall. > > > > > > Also, as vfs_fileattr_set() is now will be called on special files > > > too, let's forbid any other attributes except projid and nextents > > > (symlink can have an extent). > > > > > > CC: linux-api@vger.kernel.org > > > CC: linux-fsdevel@vger.kernel.org > > > CC: linux-xfs@vger.kernel.org > > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > > > --- > > > v1: > > > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > > > > > > Previous discussion: > > > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > > > > > > XFS has project quotas which could be attached to a directory. All > > > new inodes in these directories inherit project ID set on parent > > > directory. > > > > > > The project is created from userspace by opening and calling > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > > > with empty project ID. Those inodes then are not shown in the quota > > > accounting but still exist in the directory. Moreover, in the case > > > when special files are created in the directory with already > > > existing project quota, these inode inherit extended attributes. > > > This than leaves them with these attributes without the possibility > > > to clear them out. This, in turn, prevents userspace from > > > re-creating quota project on these existing files. > > > --- > > > Changes in v3: > > > - Remove unnecessary "dfd is dir" check as it checked in user_path_at() > > > - Remove unnecessary "same filesystem" check > > > - Use CLASS() instead of directly calling fdget/fdput > > > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org > > > --- > > > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > > > arch/arm/tools/syscall.tbl | 2 + > > > arch/arm64/tools/syscall_32.tbl | 2 + > > > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > > > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > > > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > > > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > > > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > > > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > > > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > > > arch/s390/kernel/syscalls/syscall.tbl | 2 + > > > arch/sh/kernel/syscalls/syscall.tbl | 2 + > > > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > > > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > > > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > > > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > > > fs/inode.c | 75 +++++++++++++++++++++++++++++ > > > fs/ioctl.c | 16 +++++- > > > include/linux/fileattr.h | 1 + > > > include/linux/syscalls.h | 4 ++ > > > include/uapi/asm-generic/unistd.h | 8 ++- > > > 21 files changed, 133 insertions(+), 3 deletions(-) > > > > > > > <cut to the syscall definitions> > > > > > diff --git a/fs/inode.c b/fs/inode.c > > > index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 > > > --- a/fs/inode.c > > > +++ b/fs/inode.c > > > @@ -23,6 +23,9 @@ > > > #include <linux/rw_hint.h> > > > #include <linux/seq_file.h> > > > #include <linux/debugfs.h> > > > +#include <linux/syscalls.h> > > > +#include <linux/fileattr.h> > > > +#include <linux/namei.h> > > > #include <trace/events/writeback.h> > > > #define CREATE_TRACE_POINTS > > > #include <trace/events/timestamp.h> > > > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > > > return mode & ~S_ISGID; > > > } > > > EXPORT_SYMBOL(mode_strip_sgid); > > > + > > > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > > > > Should the kernel require userspace to pass the size of the fsx buffer? > > That way we avoid needing to rev the interface when we decide to grow > > the structure. Please version the struct by size as we do for clone3(), mount_setattr(), listmount()'s struct mnt_id_req, sched_setattr(), all the new xattrat*() system calls and a host of others. So laying out the struct 64bit and passing a size alongside it. This is all handled by copy_struct_from_user() and copy_struct_to_user() so nothing to reinvent. And it's easy to copy from existing system calls.
On 2025-02-21 16:08:33, Mickaël Salaün wrote: > It looks security checks are missing. With IOCTL commands, file > permissions are checked at open time, but with these syscalls the path > is only resolved but no specific access seems to be checked (except > inode_owner_or_capable via vfs_fileattr_set). > > On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: > > From: Andrey Albershteyn <aalbersh@redhat.com> > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > > extended attributes/flags. The syscalls take parent directory fd and > > path to the child together with struct fsxattr. > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > > that file don't need to be open as we can reference it with a path > > instead of fd. By having this we can manipulated inode extended > > attributes not only on regular files but also on special ones. This > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > > we can not call ioctl() directly on the filesystem inode using fd. > > > > This patch adds two new syscalls which allows userspace to get/set > > extended inode attributes on special files by using parent directory > > and a path - *at() like syscall. > > > > Also, as vfs_fileattr_set() is now will be called on special files > > too, let's forbid any other attributes except projid and nextents > > (symlink can have an extent). > > > > CC: linux-api@vger.kernel.org > > CC: linux-fsdevel@vger.kernel.org > > CC: linux-xfs@vger.kernel.org > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > > --- > > v1: > > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > > > > Previous discussion: > > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > > > > XFS has project quotas which could be attached to a directory. All > > new inodes in these directories inherit project ID set on parent > > directory. > > > > The project is created from userspace by opening and calling > > FS_IOC_FSSETXATTR on each inode. This is not possible for special > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > > with empty project ID. Those inodes then are not shown in the quota > > accounting but still exist in the directory. Moreover, in the case > > when special files are created in the directory with already > > existing project quota, these inode inherit extended attributes. > > This than leaves them with these attributes without the possibility > > to clear them out. This, in turn, prevents userspace from > > re-creating quota project on these existing files. > > --- > > Changes in v3: > > - Remove unnecessary "dfd is dir" check as it checked in user_path_at() > > - Remove unnecessary "same filesystem" check > > - Use CLASS() instead of directly calling fdget/fdput > > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org > > --- > > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > > arch/arm/tools/syscall.tbl | 2 + > > arch/arm64/tools/syscall_32.tbl | 2 + > > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > > arch/s390/kernel/syscalls/syscall.tbl | 2 + > > arch/sh/kernel/syscalls/syscall.tbl | 2 + > > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > > fs/inode.c | 75 +++++++++++++++++++++++++++++ > > fs/ioctl.c | 16 +++++- > > include/linux/fileattr.h | 1 + > > include/linux/syscalls.h | 4 ++ > > include/uapi/asm-generic/unistd.h | 8 ++- > > 21 files changed, 133 insertions(+), 3 deletions(-) > > > > [...] > > > diff --git a/fs/inode.c b/fs/inode.c > > index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 > > --- a/fs/inode.c > > +++ b/fs/inode.c > > @@ -23,6 +23,9 @@ > > #include <linux/rw_hint.h> > > #include <linux/seq_file.h> > > #include <linux/debugfs.h> > > +#include <linux/syscalls.h> > > +#include <linux/fileattr.h> > > +#include <linux/namei.h> > > #include <trace/events/writeback.h> > > #define CREATE_TRACE_POINTS > > #include <trace/events/timestamp.h> > > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > > return mode & ~S_ISGID; > > } > > EXPORT_SYMBOL(mode_strip_sgid); > > + > > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > > +{ > > + CLASS(fd, dir)(dfd); > > + struct fileattr fa; > > + struct path filepath; > > + int error; > > + unsigned int lookup_flags = 0; > > + > > + if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0) > > + return -EINVAL; > > + > > + if (at_flags & AT_SYMLINK_FOLLOW) > > + lookup_flags |= LOOKUP_FOLLOW; > > + > > + if (at_flags & AT_EMPTY_PATH) > > + lookup_flags |= LOOKUP_EMPTY; > > + > > + if (fd_empty(dir)) > > + return -EBADF; > > + > > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > > + if (error) > > + return error; > > security_inode_getattr() should probably be called here. > > > + > > + error = vfs_fileattr_get(filepath.dentry, &fa); > > + if (!error) > > + error = copy_fsxattr_to_user(&fa, fsx); > > + > > + path_put(&filepath); > > + return error; > > +} > > + > > +SYSCALL_DEFINE4(setfsxattrat, int, dfd, const char __user *, filename, > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > > +{ > > + CLASS(fd, dir)(dfd); > > + struct fileattr fa; > > + struct path filepath; > > + int error; > > + unsigned int lookup_flags = 0; > > + > > + if ((at_flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) > > + return -EINVAL; > > + > > + if (at_flags & AT_SYMLINK_FOLLOW) > > + lookup_flags |= LOOKUP_FOLLOW; > > + > > + if (at_flags & AT_EMPTY_PATH) > > + lookup_flags |= LOOKUP_EMPTY; > > + > > + if (fd_empty(dir)) > > + return -EBADF; > > + > > + if (copy_fsxattr_from_user(&fa, fsx)) > > + return -EFAULT; > > + > > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > > + if (error) > > + return error; > > + > > + error = mnt_want_write(filepath.mnt); > > + if (!error) { > > security_inode_setattr() should probably be called too. Aren't those checks for something different - inode attributes ATTR_*? (sorry, the naming can't be more confusing) Looking into security_inode_setattr() it seems to expect struct iattr, which works with inode attributes (mode, time, uid/gid...). These new syscalls work with filesystem inode extended flags/attributes FS_XFLAG_* in fsxattr->fsx_xflags. Let me know if I missing something here
On 2025-02-24 12:32:17, Christian Brauner wrote: > On Fri, Feb 21, 2025 at 08:15:24PM +0100, Amir Goldstein wrote: > > On Fri, Feb 21, 2025 at 7:13 PM Darrick J. Wong <djwong@kernel.org> wrote: > > > > > > On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: > > > > From: Andrey Albershteyn <aalbersh@redhat.com> > > > > > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > > > > extended attributes/flags. The syscalls take parent directory fd and > > > > path to the child together with struct fsxattr. > > > > > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > > > > that file don't need to be open as we can reference it with a path > > > > instead of fd. By having this we can manipulated inode extended > > > > attributes not only on regular files but also on special ones. This > > > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > > > > we can not call ioctl() directly on the filesystem inode using fd. > > > > > > > > This patch adds two new syscalls which allows userspace to get/set > > > > extended inode attributes on special files by using parent directory > > > > and a path - *at() like syscall. > > > > > > > > Also, as vfs_fileattr_set() is now will be called on special files > > > > too, let's forbid any other attributes except projid and nextents > > > > (symlink can have an extent). > > > > > > > > CC: linux-api@vger.kernel.org > > > > CC: linux-fsdevel@vger.kernel.org > > > > CC: linux-xfs@vger.kernel.org > > > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > > > > --- > > > > v1: > > > > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > > > > > > > > Previous discussion: > > > > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > > > > > > > > XFS has project quotas which could be attached to a directory. All > > > > new inodes in these directories inherit project ID set on parent > > > > directory. > > > > > > > > The project is created from userspace by opening and calling > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special > > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > > > > with empty project ID. Those inodes then are not shown in the quota > > > > accounting but still exist in the directory. Moreover, in the case > > > > when special files are created in the directory with already > > > > existing project quota, these inode inherit extended attributes. > > > > This than leaves them with these attributes without the possibility > > > > to clear them out. This, in turn, prevents userspace from > > > > re-creating quota project on these existing files. > > > > --- > > > > Changes in v3: > > > > - Remove unnecessary "dfd is dir" check as it checked in user_path_at() > > > > - Remove unnecessary "same filesystem" check > > > > - Use CLASS() instead of directly calling fdget/fdput > > > > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org > > > > --- > > > > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > > > > arch/arm/tools/syscall.tbl | 2 + > > > > arch/arm64/tools/syscall_32.tbl | 2 + > > > > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > > > > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > > > > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > > > > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > > > > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > > > > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > > > > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > > > > arch/s390/kernel/syscalls/syscall.tbl | 2 + > > > > arch/sh/kernel/syscalls/syscall.tbl | 2 + > > > > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > > > > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > > > > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > > > > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > > > > fs/inode.c | 75 +++++++++++++++++++++++++++++ > > > > fs/ioctl.c | 16 +++++- > > > > include/linux/fileattr.h | 1 + > > > > include/linux/syscalls.h | 4 ++ > > > > include/uapi/asm-generic/unistd.h | 8 ++- > > > > 21 files changed, 133 insertions(+), 3 deletions(-) > > > > > > > > > > <cut to the syscall definitions> > > > > > > > diff --git a/fs/inode.c b/fs/inode.c > > > > index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 > > > > --- a/fs/inode.c > > > > +++ b/fs/inode.c > > > > @@ -23,6 +23,9 @@ > > > > #include <linux/rw_hint.h> > > > > #include <linux/seq_file.h> > > > > #include <linux/debugfs.h> > > > > +#include <linux/syscalls.h> > > > > +#include <linux/fileattr.h> > > > > +#include <linux/namei.h> > > > > #include <trace/events/writeback.h> > > > > #define CREATE_TRACE_POINTS > > > > #include <trace/events/timestamp.h> > > > > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > > > > return mode & ~S_ISGID; > > > > } > > > > EXPORT_SYMBOL(mode_strip_sgid); > > > > + > > > > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > > > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > > > > > > Should the kernel require userspace to pass the size of the fsx buffer? > > > That way we avoid needing to rev the interface when we decide to grow > > > the structure. > > Please version the struct by size as we do for clone3(), > mount_setattr(), listmount()'s struct mnt_id_req, sched_setattr(), all > the new xattrat*() system calls and a host of others. So laying out the > struct 64bit and passing a size alongside it. > > This is all handled by copy_struct_from_user() and copy_struct_to_user() > so nothing to reinvent. And it's easy to copy from existing system > calls. > Oh, thanks for pointing to these, will use them.
On 2025-02-24 11:54:34, Jan Kara wrote: > On Tue 11-02-25 18:22:47, Andrey Albershteyn wrote: > > From: Andrey Albershteyn <aalbersh@redhat.com> > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > > extended attributes/flags. The syscalls take parent directory fd and > > path to the child together with struct fsxattr. > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > > that file don't need to be open as we can reference it with a path > > instead of fd. By having this we can manipulated inode extended > > attributes not only on regular files but also on special ones. This > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > > we can not call ioctl() directly on the filesystem inode using fd. > > > > This patch adds two new syscalls which allows userspace to get/set > > extended inode attributes on special files by using parent directory > > and a path - *at() like syscall. > > > > Also, as vfs_fileattr_set() is now will be called on special files > > too, let's forbid any other attributes except projid and nextents > > (symlink can have an extent). > > > > CC: linux-api@vger.kernel.org > > CC: linux-fsdevel@vger.kernel.org > > CC: linux-xfs@vger.kernel.org > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > > Some comments below: > > > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > > +{ > > + CLASS(fd, dir)(dfd); > > + struct fileattr fa; > > + struct path filepath; > > + int error; > > + unsigned int lookup_flags = 0; > > + > > + if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0) > > + return -EINVAL; > > + > > + if (at_flags & AT_SYMLINK_FOLLOW) > ^^ This should be !(at_flags & AT_SYMLINK_NOFOLLOW)? > > In the check above you verify for AT_SYMLINK_NOFOLLOW and that also matches > what setxattrat() does... Right, didn't notice that this is actually opposite to setxattrat(), will change that. > > > > + lookup_flags |= LOOKUP_FOLLOW; > > + > > + if (at_flags & AT_EMPTY_PATH) > > + lookup_flags |= LOOKUP_EMPTY; > > + > > + if (fd_empty(dir)) > > + return -EBADF; > > This check is wrong and in fact the whole dfd handling looks buggy. > openat(2) manpage describes the expected behavior: > > The dirfd argument is used in conjunction with the pathname argument as > follows: > > • If the pathname given in pathname is absolute, then dirfd is ig- > nored. > ^^^^ This is what you break. If the pathname is absolute, you're > not expected to touch dirfd. > > • If the pathname given in pathname is relative and dirfd is the spe- > cial value AT_FDCWD, then pathname is interpreted relative to the > current working directory of the calling process (like open()). > ^^^ Also AT_FDCWD handling would be broken by the above check. > > • If the pathname given in pathname is relative, then it is inter- > preted relative to the directory referred to by the file descriptor > dirfd (rather than relative to the current working directory of the > calling process, as is done by open() for a relative pathname). In > this case, dirfd must be a directory that was opened for reading > (O_RDONLY) or using the O_PATH flag. > > If the pathname given in pathname is relative, and dirfd is not a valid > file descriptor, an error (EBADF) results. (Specifying an invalid file > descriptor number in dirfd can be used as a means to ensure that path- > name is absolute.) > > > + > > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > ^^^ And user_path_at() isn't quite what you need either > because with AT_EMPTY_PATH we also want to allow for filename to be NULL > (not just empty string) and user_path_at() does not support that. That's > why I in my previous replies suggested you should follow what setxattrat() > does and that sadly it is more painful than it should be. You need > something like: > > name = getname_maybe_null(filename, at_flags); > if (!name) { > CLASS(fd, f)(dfd); > > if (fd_empty(f)) > return -EBADF; > error = vfs_fileattr_get(file_dentry(fd_file(f)), &fa); > } else { > error = filename_lookup(dfd, filename, lookup_flags, &filepath, > NULL); > if (error) > goto out; > error = vfs_fileattr_get(filepath.dentry, &fa); > path_put(&filepath); > } > if (!error) > error = copy_fsxattr_to_user(&fa, fsx); > out: > putname(name); > return error; > > Longer term, we need to provide user_path_maybe_null_at() for this but I > don't want to drag you into this cleanup :) Oh, I missed that, thanks for pointing this out, I will change it as suggested.
On Mon, Feb 24, 2025 at 11:00 AM Andrey Albershteyn <aalbersh@redhat.com> wrote: > On 2025-02-21 16:08:33, Mickaël Salaün wrote: > > It looks security checks are missing. With IOCTL commands, file > > permissions are checked at open time, but with these syscalls the path > > is only resolved but no specific access seems to be checked (except > > inode_owner_or_capable via vfs_fileattr_set). ... > > On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: ... > > > +SYSCALL_DEFINE4(setfsxattrat, int, dfd, const char __user *, filename, > > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > > > +{ > > > + CLASS(fd, dir)(dfd); > > > + struct fileattr fa; > > > + struct path filepath; > > > + int error; > > > + unsigned int lookup_flags = 0; > > > + > > > + if ((at_flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) > > > + return -EINVAL; > > > + > > > + if (at_flags & AT_SYMLINK_FOLLOW) > > > + lookup_flags |= LOOKUP_FOLLOW; > > > + > > > + if (at_flags & AT_EMPTY_PATH) > > > + lookup_flags |= LOOKUP_EMPTY; > > > + > > > + if (fd_empty(dir)) > > > + return -EBADF; > > > + > > > + if (copy_fsxattr_from_user(&fa, fsx)) > > > + return -EFAULT; > > > + > > > + error = user_path_at(dfd, filename, lookup_flags, &filepath); > > > + if (error) > > > + return error; > > > + > > > + error = mnt_want_write(filepath.mnt); > > > + if (!error) { > > > > security_inode_setattr() should probably be called too. > > Aren't those checks for something different - inode attributes > ATTR_*? > (sorry, the naming can't be more confusing) > > Looking into security_inode_setattr() it seems to expect struct > iattr, which works with inode attributes (mode, time, uid/gid...). > These new syscalls work with filesystem inode extended flags/attributes > FS_XFLAG_* in fsxattr->fsx_xflags. Let me know if I missing > something here A valid point. While these are two different operations, with different structs/types, I suspect that most LSMs will consider them to be roughly equivalent from an access control perspective, which is why I felt the existing security_inode_{set,get}attr() hooks seemed appropriate. However, there likely is value in keeping the ATTR and FSX operations separate; those LSMs that wish to treat them the same can easily do so in their respective LSM callbacks. With all this in mind, I think it probably makes sense to create two new LSM hooks, security_inode_{get,set}fsxattr(). The get hook should probably be placed inside vfs_fileattr_get() just before the call to the inode's fileattr_get() method, and the set hook should probably be placed inside vfs_fileattr_set(), inside the inode lock and after a successful call to fileattr_set_prepare(). Does that sound better to everyone?
On Mon, Feb 24, 2025, at 12:32, Christian Brauner wrote: > On Fri, Feb 21, 2025 at 08:15:24PM +0100, Amir Goldstein wrote: >> On Fri, Feb 21, 2025 at 7:13 PM Darrick J. Wong <djwong@kernel.org> wrote: >> > > @@ -23,6 +23,9 @@ >> > > #include <linux/rw_hint.h> >> > > #include <linux/seq_file.h> >> > > #include <linux/debugfs.h> >> > > +#include <linux/syscalls.h> >> > > +#include <linux/fileattr.h> >> > > +#include <linux/namei.h> >> > > #include <trace/events/writeback.h> >> > > #define CREATE_TRACE_POINTS >> > > #include <trace/events/timestamp.h> >> > > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, >> > > return mode & ~S_ISGID; >> > > } >> > > EXPORT_SYMBOL(mode_strip_sgid); >> > > + >> > > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, >> > > + struct fsxattr __user *, fsx, unsigned int, at_flags) >> > >> > Should the kernel require userspace to pass the size of the fsx buffer? >> > That way we avoid needing to rev the interface when we decide to grow >> > the structure. > > Please version the struct by size as we do for clone3(), > mount_setattr(), listmount()'s struct mnt_id_req, sched_setattr(), all > the new xattrat*() system calls and a host of others. So laying out the > struct 64bit and passing a size alongside it. > > This is all handled by copy_struct_from_user() and copy_struct_to_user() > so nothing to reinvent. And it's easy to copy from existing system > calls. I don't think that works in this case, because 'struct fsxattr' is an existing structure that is defined with a fixed size of 28 bytes. If we ever need more than 8 extra bytes, then the existing ioctl commands are also broken. Replacing fsxattr with an extensible structure of the same contents would work, but I feel that just adds more complication for little gain. On the other hand, there is an open question about how unknown flags and fields in this structure. FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR treats them as optional and just ignores anything it doesn't understand, while copy_struct_from_user() would treat any unknown but set bytes as -E2BIG. The ioctl interface relies on the existing behavior, see 0a6eab8bd4e0 ("vfs: support FS_XFLAG_COWEXTSIZE and get/set of CoW extent size hint") for how it was previously extended with an optional flag/word. I think that is fine for the syscall as well, but should be properly documented since it is different from how most syscalls work. Arnd
On Tue, Feb 25, 2025 at 09:02:04AM +0100, Arnd Bergmann wrote: > On Mon, Feb 24, 2025, at 12:32, Christian Brauner wrote: > > On Fri, Feb 21, 2025 at 08:15:24PM +0100, Amir Goldstein wrote: > >> On Fri, Feb 21, 2025 at 7:13 PM Darrick J. Wong <djwong@kernel.org> wrote: > > >> > > @@ -23,6 +23,9 @@ > >> > > #include <linux/rw_hint.h> > >> > > #include <linux/seq_file.h> > >> > > #include <linux/debugfs.h> > >> > > +#include <linux/syscalls.h> > >> > > +#include <linux/fileattr.h> > >> > > +#include <linux/namei.h> > >> > > #include <trace/events/writeback.h> > >> > > #define CREATE_TRACE_POINTS > >> > > #include <trace/events/timestamp.h> > >> > > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > >> > > return mode & ~S_ISGID; > >> > > } > >> > > EXPORT_SYMBOL(mode_strip_sgid); > >> > > + > >> > > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > >> > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > >> > > >> > Should the kernel require userspace to pass the size of the fsx buffer? > >> > That way we avoid needing to rev the interface when we decide to grow > >> > the structure. > > > > Please version the struct by size as we do for clone3(), > > mount_setattr(), listmount()'s struct mnt_id_req, sched_setattr(), all > > the new xattrat*() system calls and a host of others. So laying out the > > struct 64bit and passing a size alongside it. > > > > This is all handled by copy_struct_from_user() and copy_struct_to_user() > > so nothing to reinvent. And it's easy to copy from existing system > > calls. > > I don't think that works in this case, because 'struct fsxattr' > is an existing structure that is defined with a fixed size of > 28 bytes. If we ever need more than 8 extra bytes, then the > existing ioctl commands are also broken. > > Replacing fsxattr with an extensible structure of the same contents > would work, but I feel that just adds more complication for little > gain. > > On the other hand, there is an open question about how unknown > flags and fields in this structure. FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR > treats them as optional and just ignores anything it doesn't > understand, while copy_struct_from_user() would treat any unknown > but set bytes as -E2BIG. > > The ioctl interface relies on the existing behavior, see > 0a6eab8bd4e0 ("vfs: support FS_XFLAG_COWEXTSIZE and get/set of > CoW extent size hint") for how it was previously extended > with an optional flag/word. I think that is fine for the syscall > as well, but should be properly documented since it is different > from how most syscalls work. If we're doing a new system call I see no reason to limit us to a pre-existing structure or structure layout.
On Tue, Feb 25, 2025, at 11:22, Christian Brauner wrote: > On Tue, Feb 25, 2025 at 09:02:04AM +0100, Arnd Bergmann wrote: >> On Mon, Feb 24, 2025, at 12:32, Christian Brauner wrote: >> >> The ioctl interface relies on the existing behavior, see >> 0a6eab8bd4e0 ("vfs: support FS_XFLAG_COWEXTSIZE and get/set of >> CoW extent size hint") for how it was previously extended >> with an optional flag/word. I think that is fine for the syscall >> as well, but should be properly documented since it is different >> from how most syscalls work. > > If we're doing a new system call I see no reason to limit us to a > pre-existing structure or structure layout. Obviously we could create a new structure, but I also see no reason to do so. The existing ioctl interface was added in in 2002 as part of linux-2.5.35 with 16 bytes of padding, half of which have been used so far. If this structure works for another 23 years before we run out of spare bytes, I think that's good enough. Building in an incompatible way to handle potential future contents would just make it harder to use for any userspace that wants to use the new syscalls but still needs a fallback to the ioctl version. Arnd
On Tue, Feb 25, 2025 at 11:40:51AM +0100, Arnd Bergmann wrote: > On Tue, Feb 25, 2025, at 11:22, Christian Brauner wrote: > > On Tue, Feb 25, 2025 at 09:02:04AM +0100, Arnd Bergmann wrote: > >> On Mon, Feb 24, 2025, at 12:32, Christian Brauner wrote: > >> > >> The ioctl interface relies on the existing behavior, see > >> 0a6eab8bd4e0 ("vfs: support FS_XFLAG_COWEXTSIZE and get/set of > >> CoW extent size hint") for how it was previously extended > >> with an optional flag/word. I think that is fine for the syscall > >> as well, but should be properly documented since it is different > >> from how most syscalls work. > > > > If we're doing a new system call I see no reason to limit us to a > > pre-existing structure or structure layout. > > Obviously we could create a new structure, but I also see no > reason to do so. The existing ioctl interface was added in > in 2002 as part of linux-2.5.35 with 16 bytes of padding, half > of which have been used so far. > > If this structure works for another 23 years before we run out > of spare bytes, I think that's good enough. Building in an > incompatible way to handle potential future contents would > just make it harder to use for any userspace that wants to > use the new syscalls but still needs a fallback to the > ioctl version. The fact that this structure has existed since the dawn of time doesn't mean it needs to be retained when adding a completely new system call. People won't mix both. They either switch to the new interface because they want to get around the limitations of the old interface or they keep using the old interface and the associated workarounds. In another thread they keep arguing about new extensions for Windows that are going to be added to the ioctl interface and how to make it fit into this. That just shows that it's very hard to predict from the amount of past changes how many future changes are going to happen. And if an interface is easy to extend it might well invite new changes that people didn't want to or couldn't make using the old interface.
On Tue, Feb 25, 2025 at 12:24:08PM +0100, Christian Brauner wrote: > On Tue, Feb 25, 2025 at 11:40:51AM +0100, Arnd Bergmann wrote: > > On Tue, Feb 25, 2025, at 11:22, Christian Brauner wrote: > > > On Tue, Feb 25, 2025 at 09:02:04AM +0100, Arnd Bergmann wrote: > > >> On Mon, Feb 24, 2025, at 12:32, Christian Brauner wrote: > > >> > > >> The ioctl interface relies on the existing behavior, see > > >> 0a6eab8bd4e0 ("vfs: support FS_XFLAG_COWEXTSIZE and get/set of > > >> CoW extent size hint") for how it was previously extended > > >> with an optional flag/word. I think that is fine for the syscall > > >> as well, but should be properly documented since it is different > > >> from how most syscalls work. > > > > > > If we're doing a new system call I see no reason to limit us to a > > > pre-existing structure or structure layout. > > > > Obviously we could create a new structure, but I also see no > > reason to do so. The existing ioctl interface was added in > > in 2002 as part of linux-2.5.35 with 16 bytes of padding, half > > of which have been used so far. > > > > If this structure works for another 23 years before we run out > > of spare bytes, I think that's good enough. Building in an > > incompatible way to handle potential future contents would > > just make it harder to use for any userspace that wants to > > use the new syscalls but still needs a fallback to the > > ioctl version. > > The fact that this structure has existed since the dawn of time doesn't > mean it needs to be retained when adding a completely new system call. > > People won't mix both. They either switch to the new interface because > they want to get around the limitations of the old interface or they > keep using the old interface and the associated workarounds. > > In another thread they keep arguing about new extensions for Windows > that are going to be added to the ioctl interface and how to make it fit > into this. That just shows that it's very hard to predict from the > amount of past changes how many future changes are going to happen. And > if an interface is easy to extend it might well invite new changes that > people didn't want to or couldn't make using the old interface. Agreed, I don't think it's hard to enlarge struct fsxattr in the existing ioctl interface; either we figure out how to make the kernel fill out the "missing" bytes with an internal getfsxattr call, or we make it return some errno if we would be truncating real output due to struct size limits and leave a note in the manpage that "EL3HLT means use a bigger structure definition" Then both interfaces can plod along for another 30 years. :) --D
On Tuesday 25 February 2025 07:59:26 Darrick J. Wong wrote: > On Tue, Feb 25, 2025 at 12:24:08PM +0100, Christian Brauner wrote: > > On Tue, Feb 25, 2025 at 11:40:51AM +0100, Arnd Bergmann wrote: > > > On Tue, Feb 25, 2025, at 11:22, Christian Brauner wrote: > > > > On Tue, Feb 25, 2025 at 09:02:04AM +0100, Arnd Bergmann wrote: > > > >> On Mon, Feb 24, 2025, at 12:32, Christian Brauner wrote: > > > >> > > > >> The ioctl interface relies on the existing behavior, see > > > >> 0a6eab8bd4e0 ("vfs: support FS_XFLAG_COWEXTSIZE and get/set of > > > >> CoW extent size hint") for how it was previously extended > > > >> with an optional flag/word. I think that is fine for the syscall > > > >> as well, but should be properly documented since it is different > > > >> from how most syscalls work. > > > > > > > > If we're doing a new system call I see no reason to limit us to a > > > > pre-existing structure or structure layout. > > > > > > Obviously we could create a new structure, but I also see no > > > reason to do so. The existing ioctl interface was added in > > > in 2002 as part of linux-2.5.35 with 16 bytes of padding, half > > > of which have been used so far. > > > > > > If this structure works for another 23 years before we run out > > > of spare bytes, I think that's good enough. Building in an > > > incompatible way to handle potential future contents would > > > just make it harder to use for any userspace that wants to > > > use the new syscalls but still needs a fallback to the > > > ioctl version. > > > > The fact that this structure has existed since the dawn of time doesn't > > mean it needs to be retained when adding a completely new system call. > > > > People won't mix both. They either switch to the new interface because > > they want to get around the limitations of the old interface or they > > keep using the old interface and the associated workarounds. > > > > In another thread they keep arguing about new extensions for Windows > > that are going to be added to the ioctl interface and how to make it fit > > into this. That just shows that it's very hard to predict from the > > amount of past changes how many future changes are going to happen. And > > if an interface is easy to extend it might well invite new changes that > > people didn't want to or couldn't make using the old interface. > > Agreed, I don't think it's hard to enlarge struct fsxattr in the > existing ioctl interface; either we figure out how to make the kernel > fill out the "missing" bytes with an internal getfsxattr call, or we > make it return some errno if we would be truncating real output due to > struct size limits and leave a note in the manpage that "EL3HLT means > use a bigger structure definition" > > Then both interfaces can plod along for another 30 years. :) > > --D For Windows attributes, there are for sure needed new 11 bits for attributes which can be both get and set, additional 4 bits for get-only attributes, and plus there are 9 reserved bits (which Windows can start using it and exporting over NTFS or SMB). And it is possible that Windows can reuse some bits which were previously assigned for things which today does not appear on NTFS. I think that fsx_xflags does not have enough free bits for all these attributes. So it would be really nice to design API/ABI in away which can be extended for new fields. Also another two points, for this new syscalls. I have not looked at the current changes (I was added to CC just recently), but it would be nice: 1) If syscall API allows to operate on the symlink itself. This is because NTFS and also SMB symlink also contains attributes. ioctl interface currently does not support to get/set these symlink attributes. 2) If syscall API contains ability to just change subset of attributes. And provide an error reporting to userspace if userspace application is trying to set attribute which is not supported by the filesystem. This error reporting is needed for possible "cp -a" or possible "rsync" implementation which informs when some metadata cannot be backup/restored. There are more filesystems which supports only subset of attributes, this applies also for windows attributes. For example UDF fs supports only "hidden" attribute.
On 2025-02-21 20:15:24, Amir Goldstein wrote: > On Fri, Feb 21, 2025 at 7:13 PM Darrick J. Wong <djwong@kernel.org> wrote: > > > > On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: > > > From: Andrey Albershteyn <aalbersh@redhat.com> > > > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > > > extended attributes/flags. The syscalls take parent directory fd and > > > path to the child together with struct fsxattr. > > > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > > > that file don't need to be open as we can reference it with a path > > > instead of fd. By having this we can manipulated inode extended > > > attributes not only on regular files but also on special ones. This > > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > > > we can not call ioctl() directly on the filesystem inode using fd. > > > > > > This patch adds two new syscalls which allows userspace to get/set > > > extended inode attributes on special files by using parent directory > > > and a path - *at() like syscall. > > > > > > Also, as vfs_fileattr_set() is now will be called on special files > > > too, let's forbid any other attributes except projid and nextents > > > (symlink can have an extent). > > > > > > CC: linux-api@vger.kernel.org > > > CC: linux-fsdevel@vger.kernel.org > > > CC: linux-xfs@vger.kernel.org > > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > > > --- > > > v1: > > > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > > > > > > Previous discussion: > > > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > > > > > > XFS has project quotas which could be attached to a directory. All > > > new inodes in these directories inherit project ID set on parent > > > directory. > > > > > > The project is created from userspace by opening and calling > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > > > with empty project ID. Those inodes then are not shown in the quota > > > accounting but still exist in the directory. Moreover, in the case > > > when special files are created in the directory with already > > > existing project quota, these inode inherit extended attributes. > > > This than leaves them with these attributes without the possibility > > > to clear them out. This, in turn, prevents userspace from > > > re-creating quota project on these existing files. > > > --- > > > Changes in v3: > > > - Remove unnecessary "dfd is dir" check as it checked in user_path_at() > > > - Remove unnecessary "same filesystem" check > > > - Use CLASS() instead of directly calling fdget/fdput > > > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org > > > --- > > > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > > > arch/arm/tools/syscall.tbl | 2 + > > > arch/arm64/tools/syscall_32.tbl | 2 + > > > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > > > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > > > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > > > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > > > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > > > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > > > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > > > arch/s390/kernel/syscalls/syscall.tbl | 2 + > > > arch/sh/kernel/syscalls/syscall.tbl | 2 + > > > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > > > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > > > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > > > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > > > fs/inode.c | 75 +++++++++++++++++++++++++++++ > > > fs/ioctl.c | 16 +++++- > > > include/linux/fileattr.h | 1 + > > > include/linux/syscalls.h | 4 ++ > > > include/uapi/asm-generic/unistd.h | 8 ++- > > > 21 files changed, 133 insertions(+), 3 deletions(-) > > > > > > > <cut to the syscall definitions> > > > > > diff --git a/fs/inode.c b/fs/inode.c > > > index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 > > > --- a/fs/inode.c > > > +++ b/fs/inode.c > > > @@ -23,6 +23,9 @@ > > > #include <linux/rw_hint.h> > > > #include <linux/seq_file.h> > > > #include <linux/debugfs.h> > > > +#include <linux/syscalls.h> > > > +#include <linux/fileattr.h> > > > +#include <linux/namei.h> > > > #include <trace/events/writeback.h> > > > #define CREATE_TRACE_POINTS > > > #include <trace/events/timestamp.h> > > > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > > > return mode & ~S_ISGID; > > > } > > > EXPORT_SYMBOL(mode_strip_sgid); > > > + > > > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > > > > Should the kernel require userspace to pass the size of the fsx buffer? > > That way we avoid needing to rev the interface when we decide to grow > > the structure. > > > > This makes sense to me, but I see that Andreas proposed other ways, > as long as we have a plan on how to extend the struct if we need more space. > > Andrey, I am sorry to bring this up in v3, but I would like to request > two small changes before merging this API. > > This patch by Pali [1] adds fsx_xflags_mask for the filesystem to > report the supported set of xflags. > > It was argued that we can make this change with the existing ioctl, > because it is not going to break xfs_io -c lsattr/chattr, which is fine, > but I think that we should merge the fsx_xflags_mask change along > with getfsxattrat() which is a new UAPI. > > The second request is related to setfsxattrat(). > With current FS_IOC_FSSETXATTR, IIUC, xfs ignores unsupported > fsx_xflags. I think this needs to be fixed before merging setfsxattrat(). > It's ok that a program calling FS_IOC_FSSETXATTR will not know > if unsupported flags will be ignored, because that's the way it is, > but I think that setfsxattrat() must return -EINVAL for trying to > set unsupported xflags. > > As I explained in [2] I think it is fine if FS_IOC_FSSETXATTR > will also start returning -EINVAL for unsupported flags, but I would > like setfsxattrat() to make that a guarantee. > > There was an open question, what does fsx_xflags_mask mean > for setfsxattrat() - it is a mask like in inode_set_flags() as Andreas > suggested? I think that would be a good idea. > > Thanks, > Amir. > > [1] https://lore.kernel.org/linux-fsdevel/20250216164029.20673-4-pali@kernel.org/ > [2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxjwQJiKAqyjEmKUnq-VihyeSsxyEy2F+J38NXwrAXurFQ@mail.gmail.com/ > I'm fine with making Pali's patchset a dependency for this syscall, as if vfs_fileattr_set() will start returning EINVAL on unsupported flags this syscall will pass it through (ioctls will need to ignore it). And as these syscalls use fsxattr anyway the fsx_xflags_mask field will be here.
On Friday 28 February 2025 09:30:38 Andrey Albershteyn wrote: > On 2025-02-21 20:15:24, Amir Goldstein wrote: > > On Fri, Feb 21, 2025 at 7:13 PM Darrick J. Wong <djwong@kernel.org> wrote: > > > > > > On Tue, Feb 11, 2025 at 06:22:47PM +0100, Andrey Albershteyn wrote: > > > > From: Andrey Albershteyn <aalbersh@redhat.com> > > > > > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode > > > > extended attributes/flags. The syscalls take parent directory fd and > > > > path to the child together with struct fsxattr. > > > > > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference > > > > that file don't need to be open as we can reference it with a path > > > > instead of fd. By having this we can manipulated inode extended > > > > attributes not only on regular files but also on special ones. This > > > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files > > > > we can not call ioctl() directly on the filesystem inode using fd. > > > > > > > > This patch adds two new syscalls which allows userspace to get/set > > > > extended inode attributes on special files by using parent directory > > > > and a path - *at() like syscall. > > > > > > > > Also, as vfs_fileattr_set() is now will be called on special files > > > > too, let's forbid any other attributes except projid and nextents > > > > (symlink can have an extent). > > > > > > > > CC: linux-api@vger.kernel.org > > > > CC: linux-fsdevel@vger.kernel.org > > > > CC: linux-xfs@vger.kernel.org > > > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com> > > > > --- > > > > v1: > > > > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/ > > > > > > > > Previous discussion: > > > > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/ > > > > > > > > XFS has project quotas which could be attached to a directory. All > > > > new inodes in these directories inherit project ID set on parent > > > > directory. > > > > > > > > The project is created from userspace by opening and calling > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special > > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > > > > with empty project ID. Those inodes then are not shown in the quota > > > > accounting but still exist in the directory. Moreover, in the case > > > > when special files are created in the directory with already > > > > existing project quota, these inode inherit extended attributes. > > > > This than leaves them with these attributes without the possibility > > > > to clear them out. This, in turn, prevents userspace from > > > > re-creating quota project on these existing files. > > > > --- > > > > Changes in v3: > > > > - Remove unnecessary "dfd is dir" check as it checked in user_path_at() > > > > - Remove unnecessary "same filesystem" check > > > > - Use CLASS() instead of directly calling fdget/fdput > > > > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org > > > > --- > > > > arch/alpha/kernel/syscalls/syscall.tbl | 2 + > > > > arch/arm/tools/syscall.tbl | 2 + > > > > arch/arm64/tools/syscall_32.tbl | 2 + > > > > arch/m68k/kernel/syscalls/syscall.tbl | 2 + > > > > arch/microblaze/kernel/syscalls/syscall.tbl | 2 + > > > > arch/mips/kernel/syscalls/syscall_n32.tbl | 2 + > > > > arch/mips/kernel/syscalls/syscall_n64.tbl | 2 + > > > > arch/mips/kernel/syscalls/syscall_o32.tbl | 2 + > > > > arch/parisc/kernel/syscalls/syscall.tbl | 2 + > > > > arch/powerpc/kernel/syscalls/syscall.tbl | 2 + > > > > arch/s390/kernel/syscalls/syscall.tbl | 2 + > > > > arch/sh/kernel/syscalls/syscall.tbl | 2 + > > > > arch/sparc/kernel/syscalls/syscall.tbl | 2 + > > > > arch/x86/entry/syscalls/syscall_32.tbl | 2 + > > > > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > > > > arch/xtensa/kernel/syscalls/syscall.tbl | 2 + > > > > fs/inode.c | 75 +++++++++++++++++++++++++++++ > > > > fs/ioctl.c | 16 +++++- > > > > include/linux/fileattr.h | 1 + > > > > include/linux/syscalls.h | 4 ++ > > > > include/uapi/asm-generic/unistd.h | 8 ++- > > > > 21 files changed, 133 insertions(+), 3 deletions(-) > > > > > > > > > > <cut to the syscall definitions> > > > > > > > diff --git a/fs/inode.c b/fs/inode.c > > > > index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 > > > > --- a/fs/inode.c > > > > +++ b/fs/inode.c > > > > @@ -23,6 +23,9 @@ > > > > #include <linux/rw_hint.h> > > > > #include <linux/seq_file.h> > > > > #include <linux/debugfs.h> > > > > +#include <linux/syscalls.h> > > > > +#include <linux/fileattr.h> > > > > +#include <linux/namei.h> > > > > #include <trace/events/writeback.h> > > > > #define CREATE_TRACE_POINTS > > > > #include <trace/events/timestamp.h> > > > > @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, > > > > return mode & ~S_ISGID; > > > > } > > > > EXPORT_SYMBOL(mode_strip_sgid); > > > > + > > > > +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, > > > > + struct fsxattr __user *, fsx, unsigned int, at_flags) > > > > > > Should the kernel require userspace to pass the size of the fsx buffer? > > > That way we avoid needing to rev the interface when we decide to grow > > > the structure. > > > > > > > This makes sense to me, but I see that Andreas proposed other ways, > > as long as we have a plan on how to extend the struct if we need more space. > > > > Andrey, I am sorry to bring this up in v3, but I would like to request > > two small changes before merging this API. > > > > This patch by Pali [1] adds fsx_xflags_mask for the filesystem to > > report the supported set of xflags. > > > > It was argued that we can make this change with the existing ioctl, > > because it is not going to break xfs_io -c lsattr/chattr, which is fine, > > but I think that we should merge the fsx_xflags_mask change along > > with getfsxattrat() which is a new UAPI. > > > > The second request is related to setfsxattrat(). > > With current FS_IOC_FSSETXATTR, IIUC, xfs ignores unsupported > > fsx_xflags. I think this needs to be fixed before merging setfsxattrat(). > > It's ok that a program calling FS_IOC_FSSETXATTR will not know > > if unsupported flags will be ignored, because that's the way it is, > > but I think that setfsxattrat() must return -EINVAL for trying to > > set unsupported xflags. > > > > As I explained in [2] I think it is fine if FS_IOC_FSSETXATTR > > will also start returning -EINVAL for unsupported flags, but I would > > like setfsxattrat() to make that a guarantee. > > > > There was an open question, what does fsx_xflags_mask mean > > for setfsxattrat() - it is a mask like in inode_set_flags() as Andreas > > suggested? I think that would be a good idea. > > > > Thanks, > > Amir. > > > > [1] https://lore.kernel.org/linux-fsdevel/20250216164029.20673-4-pali@kernel.org/ > > [2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxjwQJiKAqyjEmKUnq-VihyeSsxyEy2F+J38NXwrAXurFQ@mail.gmail.com/ > > > > I'm fine with making Pali's patchset a dependency for this syscall, > as if vfs_fileattr_set() will start returning EINVAL on unsupported > flags this syscall will pass it through (ioctls will need to ignore > it). And as these syscalls use fsxattr anyway the fsx_xflags_mask > field will be here. > > -- > - Andrey > Hello Andrey, if I understand correctly then it is needed for new setfsxattrat() call to return EINVAL on any unsupported flags since beginning. Then I could extend it for new flags without breaking backward or forward compatibility of the setfsxattrat() call.
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl index c59d53d6d3f3490f976ca179ddfe02e69265ae4d..4b9e687494c16b60c6fd6ca1dc4d6564706a7e25 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -506,3 +506,5 @@ 574 common getxattrat sys_getxattrat 575 common listxattrat sys_listxattrat 576 common removexattrat sys_removexattrat +577 common getfsxattrat sys_getfsxattrat +578 common setfsxattrat sys_setfsxattrat diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index 49eeb2ad8dbd8e074c6240417693f23fb328afa8..66466257f3c2debb3e2299f0b608c6740c98cab2 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -481,3 +481,5 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat diff --git a/arch/arm64/tools/syscall_32.tbl b/arch/arm64/tools/syscall_32.tbl index 69a829912a05eb8a3e21ed701d1030e31c0148bc..9c516118b154811d8d11d5696f32817430320dbf 100644 --- a/arch/arm64/tools/syscall_32.tbl +++ b/arch/arm64/tools/syscall_32.tbl @@ -478,3 +478,5 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl index f5ed71f1910d09769c845c2d062d99ee0449437c..159476387f394a92ee5e29db89b118c630372db2 100644 --- a/arch/m68k/kernel/syscalls/syscall.tbl +++ b/arch/m68k/kernel/syscalls/syscall.tbl @@ -466,3 +466,5 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl index 680f568b77f2cbefc3eacb2517f276041f229b1e..a6d59ee740b58cacf823702003cf9bad17c0d3b7 100644 --- a/arch/microblaze/kernel/syscalls/syscall.tbl +++ b/arch/microblaze/kernel/syscalls/syscall.tbl @@ -472,3 +472,5 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl index 0b9b7e25b69ad592642f8533bee9ccfe95ce9626..cfe38fcebe1a0279e11751378d3e71c5ec6b6569 100644 --- a/arch/mips/kernel/syscalls/syscall_n32.tbl +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl @@ -405,3 +405,5 @@ 464 n32 getxattrat sys_getxattrat 465 n32 listxattrat sys_listxattrat 466 n32 removexattrat sys_removexattrat +467 n32 getfsxattrat sys_getfsxattrat +468 n32 setfsxattrat sys_setfsxattrat diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl index c844cd5cda620b2809a397cdd6f4315ab6a1bfe2..29a0c5974d1aa2f01e33edc0252d75fb97abe230 100644 --- a/arch/mips/kernel/syscalls/syscall_n64.tbl +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl @@ -381,3 +381,5 @@ 464 n64 getxattrat sys_getxattrat 465 n64 listxattrat sys_listxattrat 466 n64 removexattrat sys_removexattrat +467 n64 getfsxattrat sys_getfsxattrat +468 n64 setfsxattrat sys_setfsxattrat diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl index 349b8aad1159f404103bd2057a1e64e9bf309f18..6c00436807c57c492ba957fcd59af1202231cf80 100644 --- a/arch/mips/kernel/syscalls/syscall_o32.tbl +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -454,3 +454,5 @@ 464 o32 getxattrat sys_getxattrat 465 o32 listxattrat sys_listxattrat 466 o32 removexattrat sys_removexattrat +467 o32 getfsxattrat sys_getfsxattrat +468 o32 setfsxattrat sys_setfsxattrat diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl index d9fc94c869657fcfbd7aca1d5f5abc9fae2fb9d8..b3578fac43d6b65167787fcc97d2d09f5a9828e7 100644 --- a/arch/parisc/kernel/syscalls/syscall.tbl +++ b/arch/parisc/kernel/syscalls/syscall.tbl @@ -465,3 +465,5 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl index d8b4ab78bef076bd50d49b87dea5060fd8c1686a..808045d82c9465c3bfa96b15947546efe5851e9a 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ -557,3 +557,5 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl index e9115b4d8b635b846e5c9ad6ce229605323723a5..78dfc2c184d4815baf8a9e61c546c9936d58a47c 100644 --- a/arch/s390/kernel/syscalls/syscall.tbl +++ b/arch/s390/kernel/syscalls/syscall.tbl @@ -469,3 +469,5 @@ 464 common getxattrat sys_getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat sys_setfsxattrat diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl index c8cad33bf250ea110de37bd1407f5a43ec5e38f2..d5a5c8339f0ed25ea07c4aba90351d352033c8a0 100644 --- a/arch/sh/kernel/syscalls/syscall.tbl +++ b/arch/sh/kernel/syscalls/syscall.tbl @@ -470,3 +470,5 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl index 727f99d333b304b3db0711953a3d91ece18a28eb..817dcd8603bcbffc47f3f59aa3b74b16486453d0 100644 --- a/arch/sparc/kernel/syscalls/syscall.tbl +++ b/arch/sparc/kernel/syscalls/syscall.tbl @@ -512,3 +512,5 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index 4d0fb2fba7e208ae9455459afe11e277321d9f74..b4842c027c5d00c0236b2ba89387c5e2267447bd 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -472,3 +472,5 @@ 464 i386 getxattrat sys_getxattrat 465 i386 listxattrat sys_listxattrat 466 i386 removexattrat sys_removexattrat +467 i386 getfsxattrat sys_getfsxattrat +468 i386 setfsxattrat sys_setfsxattrat diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index 5eb708bff1c791debd6cfc5322583b2ae53f6437..b6f0a7236aaee624cf9b484239a1068085a8ffe1 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -390,6 +390,8 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat # # Due to a historical design error, certain syscalls are numbered differently diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl index 37effc1b134eea061f2c350c1d68b4436b65a4dd..425d56be337d1de22f205ac503df61ff86224fee 100644 --- a/arch/xtensa/kernel/syscalls/syscall.tbl +++ b/arch/xtensa/kernel/syscalls/syscall.tbl @@ -437,3 +437,5 @@ 464 common getxattrat sys_getxattrat 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat +467 common getfsxattrat sys_getfsxattrat +468 common setfsxattrat sys_setfsxattrat diff --git a/fs/inode.c b/fs/inode.c index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..b2dddd9db4fabaf67a6cbf541a86978b290411ec 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -23,6 +23,9 @@ #include <linux/rw_hint.h> #include <linux/seq_file.h> #include <linux/debugfs.h> +#include <linux/syscalls.h> +#include <linux/fileattr.h> +#include <linux/namei.h> #include <trace/events/writeback.h> #define CREATE_TRACE_POINTS #include <trace/events/timestamp.h> @@ -2953,3 +2956,75 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap, return mode & ~S_ISGID; } EXPORT_SYMBOL(mode_strip_sgid); + +SYSCALL_DEFINE4(getfsxattrat, int, dfd, const char __user *, filename, + struct fsxattr __user *, fsx, unsigned int, at_flags) +{ + CLASS(fd, dir)(dfd); + struct fileattr fa; + struct path filepath; + int error; + unsigned int lookup_flags = 0; + + if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0) + return -EINVAL; + + if (at_flags & AT_SYMLINK_FOLLOW) + lookup_flags |= LOOKUP_FOLLOW; + + if (at_flags & AT_EMPTY_PATH) + lookup_flags |= LOOKUP_EMPTY; + + if (fd_empty(dir)) + return -EBADF; + + error = user_path_at(dfd, filename, lookup_flags, &filepath); + if (error) + return error; + + error = vfs_fileattr_get(filepath.dentry, &fa); + if (!error) + error = copy_fsxattr_to_user(&fa, fsx); + + path_put(&filepath); + return error; +} + +SYSCALL_DEFINE4(setfsxattrat, int, dfd, const char __user *, filename, + struct fsxattr __user *, fsx, unsigned int, at_flags) +{ + CLASS(fd, dir)(dfd); + struct fileattr fa; + struct path filepath; + int error; + unsigned int lookup_flags = 0; + + if ((at_flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) + return -EINVAL; + + if (at_flags & AT_SYMLINK_FOLLOW) + lookup_flags |= LOOKUP_FOLLOW; + + if (at_flags & AT_EMPTY_PATH) + lookup_flags |= LOOKUP_EMPTY; + + if (fd_empty(dir)) + return -EBADF; + + if (copy_fsxattr_from_user(&fa, fsx)) + return -EFAULT; + + error = user_path_at(dfd, filename, lookup_flags, &filepath); + if (error) + return error; + + error = mnt_want_write(filepath.mnt); + if (!error) { + error = vfs_fileattr_set(file_mnt_idmap(fd_file(dir)), + filepath.dentry, &fa); + mnt_drop_write(filepath.mnt); + } + + path_put(&filepath); + return error; +} diff --git a/fs/ioctl.c b/fs/ioctl.c index 638a36be31c14afc66a7fd6eb237d9545e8ad997..dc160c2ef145e4931d625f1f93c2a8ae7f87abf3 100644 --- a/fs/ioctl.c +++ b/fs/ioctl.c @@ -558,8 +558,7 @@ int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa) } EXPORT_SYMBOL(copy_fsxattr_to_user); -static int copy_fsxattr_from_user(struct fileattr *fa, - struct fsxattr __user *ufa) +int copy_fsxattr_from_user(struct fileattr *fa, struct fsxattr __user *ufa) { struct fsxattr xfa; @@ -646,6 +645,19 @@ static int fileattr_set_prepare(struct inode *inode, if (fa->fsx_cowextsize == 0) fa->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE; + /* + * The only use case for special files is to set project ID, forbid any + * other attributes + */ + if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode))) { + if (fa->fsx_xflags & ~FS_XFLAG_PROJINHERIT) + return -EINVAL; + if (!S_ISLNK(inode->i_mode) && fa->fsx_nextents) + return -EINVAL; + if (fa->fsx_extsize || fa->fsx_cowextsize) + return -EINVAL; + } + return 0; } diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h index 47c05a9851d0600964b644c9c7218faacfd865f8..8598e94b530b8b280a2697eaf918dd60f573d6ee 100644 --- a/include/linux/fileattr.h +++ b/include/linux/fileattr.h @@ -34,6 +34,7 @@ struct fileattr { }; int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa); +int copy_fsxattr_from_user(struct fileattr *fa, struct fsxattr __user *ufa); void fileattr_fill_xflags(struct fileattr *fa, u32 xflags); void fileattr_fill_flags(struct fileattr *fa, u32 flags); diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index c6333204d45130eb022f6db460eea34a1f6e91db..3134d463d9af64c6e78adb37bff4b91f77b5305f 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -371,6 +371,10 @@ asmlinkage long sys_removexattrat(int dfd, const char __user *path, asmlinkage long sys_lremovexattr(const char __user *path, const char __user *name); asmlinkage long sys_fremovexattr(int fd, const char __user *name); +asmlinkage long sys_getfsxattrat(int dfd, const char __user *filename, + struct fsxattr *fsx, unsigned int at_flags); +asmlinkage long sys_setfsxattrat(int dfd, const char __user *filename, + struct fsxattr *fsx, unsigned int at_flags); asmlinkage long sys_getcwd(char __user *buf, unsigned long size); asmlinkage long sys_eventfd2(unsigned int count, int flags); asmlinkage long sys_epoll_create1(int flags); diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index 88dc393c2bca38c0fa1b3fae579f7cfe4931223c..50be2e1007bc2779120d05c6e9512a689f86779c 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -850,8 +850,14 @@ __SYSCALL(__NR_listxattrat, sys_listxattrat) #define __NR_removexattrat 466 __SYSCALL(__NR_removexattrat, sys_removexattrat) +/* fs/inode.c */ +#define __NR_getfsxattrat 467 +__SYSCALL(__NR_getfsxattrat, sys_getfsxattrat) +#define __NR_setfsxattrat 468 +__SYSCALL(__NR_setfsxattrat, sys_setfsxattrat) + #undef __NR_syscalls -#define __NR_syscalls 467 +#define __NR_syscalls 469 /* * 32 bit systems traditionally used different