diff mbox series

[v8,08/14] fs: add __remove_file_privs() with flags parameter

Message ID 20220608171741.3875418-9-shr@fb.com (mailing list archive)
State New, archived
Headers show
Series io-uring/xfs: support async buffered writes | expand

Commit Message

Stefan Roesch June 8, 2022, 5:17 p.m. UTC
This adds the function __remove_file_privs, which allows the caller to
pass the kiocb flags parameter.

No intended functional changes in this patch.

Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/inode.c | 57 +++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 37 insertions(+), 20 deletions(-)

Comments

Christian Brauner June 10, 2022, 11:53 a.m. UTC | #1
On Wed, Jun 08, 2022 at 10:17:35AM -0700, Stefan Roesch wrote:
> This adds the function __remove_file_privs, which allows the caller to
> pass the kiocb flags parameter.
> 
> No intended functional changes in this patch.
> 
> Signed-off-by: Stefan Roesch <shr@fb.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Jan Kara <jack@suse.cz>
> ---

Looks good to me,
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
kernel test robot June 13, 2022, 8:50 a.m. UTC | #2
Greeting,

FYI, we noticed a -4.3% regression of phoronix-test-suite.fio.SequentialWrite.Sync.Yes.No.4KB.DefaultTestDirectory.mb_s due to commit:


commit: b6c81e63ec1e668f8df716a91c5386074411dbbe ("[PATCH v8 08/14] fs: add __remove_file_privs() with flags parameter")
url: https://github.com/intel-lab-lkp/linux/commits/Stefan-Roesch/io-uring-xfs-support-async-buffered-writes/20220609-013837
patch link: https://lore.kernel.org/io-uring/20220608171741.3875418-9-shr@fb.com

in testcase: phoronix-test-suite
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with following parameters:

	test: fio-1.14.1
	option_a: Sequential Write
	option_b: Sync
	option_c: Yes
	option_d: No
	option_e: 4KB
	option_f: Default Test Directory
	cpufreq_governor: performance
	ucode: 0x500320a

test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added.
test-url: http://www.phoronix-test-suite.com/



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@intel.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/option_d/option_e/option_f/rootfs/tbox_group/test/testcase/ucode:
  gcc-11/performance/x86_64-rhel-8.3/Sequential Write/Sync/Yes/No/4KB/Default Test Directory/debian-x86_64-phoronix/lkp-csl-2sp7/fio-1.14.1/phoronix-test-suite/0x500320a

commit: 
  003096ce91 ("fs: Add check for async buffered writes to generic_write_checks")
  b6c81e63ec ("fs: add __remove_file_privs() with flags parameter")

003096ce910675ce b6c81e63ec1e668f8df716a91c5 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    245666            -4.3%     235222        phoronix-test-suite.fio.SequentialWrite.Sync.Yes.No.4KB.DefaultTestDirectory.iops
    960.00            -4.3%     919.17        phoronix-test-suite.fio.SequentialWrite.Sync.Yes.No.4KB.DefaultTestDirectory.mb_s
  1.51e+08            -5.5%  1.427e+08        phoronix-test-suite.time.file_system_outputs
    256970 ±  9%     +44.3%     370918 ± 11%  meminfo.Dirty
    267660 ± 11%     +39.6%     373743 ± 11%  numa-meminfo.node0.Dirty
    610732            -4.5%     583080        vmstat.io.bo
     67160 ± 11%     +38.3%      92904 ± 11%  numa-vmstat.node0.nr_dirty
     79263 ±  8%     +33.0%     105418 ± 10%  numa-vmstat.node0.nr_zone_write_pending
     32.67            +2.5%      33.48        perf-stat.i.major-faults
      5211            +2.3%       5331        perf-stat.i.minor-faults
   3509036            -7.2%    3255351 ±  3%  perf-stat.i.node-stores
      5244            +2.3%       5364        perf-stat.i.page-faults
     32.44            +2.5%      33.24        perf-stat.ps.major-faults
      5173            +2.3%       5292        perf-stat.ps.minor-faults
   3477114            -7.2%    3225288 ±  3%  perf-stat.ps.node-stores
      5206            +2.3%       5324        perf-stat.ps.page-faults
  18872500            -5.5%   17841229        proc-vmstat.nr_dirtied
     65070 ±  9%     +42.8%      92916 ± 11%  proc-vmstat.nr_dirty
    827164            +3.2%     853524        proc-vmstat.nr_file_pages
    386220            +6.9%     412691 ±  2%  proc-vmstat.nr_inactive_file
     29963            +2.5%      30724        proc-vmstat.nr_slab_reclaimable
  18830707            -5.8%   17734737        proc-vmstat.nr_written
    386220            +6.9%     412691 ±  2%  proc-vmstat.nr_zone_inactive_file
     77263 ±  6%     +35.2%     104462 ±  9%  proc-vmstat.nr_zone_write_pending
  23867810 ±  3%      -8.0%   21964544        proc-vmstat.numa_hit
  22976263            -4.8%   21871403        proc-vmstat.numa_local
  22987067            -4.6%   21929083        proc-vmstat.pgalloc_normal
  22684993            -5.0%   21562034        proc-vmstat.pgfree
  75323098            -5.8%   70939215        proc-vmstat.pgpgout
      9889 ±  9%     -17.4%       8171 ± 10%  sched_debug.cfs_rq:/.min_vruntime.stddev
      9.45 ± 51%    +247.3%      32.81 ± 72%  sched_debug.cfs_rq:/.removed.load_avg.avg
     53.83 ± 25%    +159.7%     139.81 ± 54%  sched_debug.cfs_rq:/.removed.load_avg.stddev
      4.30 ± 52%    +260.1%      15.49 ± 76%  sched_debug.cfs_rq:/.removed.runnable_avg.avg
     24.46 ± 33%    +172.5%      66.65 ± 57%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
      4.30 ± 52%    +260.1%      15.49 ± 76%  sched_debug.cfs_rq:/.removed.util_avg.avg
     24.46 ± 33%    +172.5%      66.65 ± 57%  sched_debug.cfs_rq:/.removed.util_avg.stddev
    820.61 ±  7%     +18.9%     975.33 ±  7%  sched_debug.cfs_rq:/.runnable_avg.max
      9891 ±  9%     -17.4%       8171 ± 10%  sched_debug.cfs_rq:/.spread0.stddev
    818.56 ±  7%     +19.1%     975.06 ±  7%  sched_debug.cfs_rq:/.util_avg.max
     11.42 ± 27%     +92.0%      21.92 ± 33%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    427.50 ± 24%     +84.2%     787.44 ± 30%  sched_debug.cfs_rq:/.util_est_enqueued.max
     58.65 ± 22%     +79.2%     105.07 ± 29%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
      0.00            +0.7        0.74 ± 11%  perf-profile.calltrace.cycles-pp.__vfs_getxattr.cap_inode_need_killpriv.security_inode_need_killpriv.__file_remove_privs.file_modified
      0.00            +0.8        0.80 ±  9%  perf-profile.calltrace.cycles-pp.cap_inode_need_killpriv.security_inode_need_killpriv.__file_remove_privs.file_modified.ext4_buffered_write_iter
      0.00            +0.8        0.84 ± 11%  perf-profile.calltrace.cycles-pp.security_inode_need_killpriv.__file_remove_privs.file_modified.ext4_buffered_write_iter.new_sync_write
      0.00            +0.9        0.90 ±  9%  perf-profile.calltrace.cycles-pp.__file_remove_privs.file_modified.ext4_buffered_write_iter.new_sync_write.vfs_write
      0.00            +0.9        0.91 ±  9%  perf-profile.calltrace.cycles-pp.file_modified.ext4_buffered_write_iter.new_sync_write.vfs_write.ksys_write
      0.40 ± 10%      -0.1        0.33 ± 13%  perf-profile.children.cycles-pp.jbd2_journal_grab_journal_head
      0.26 ± 18%      -0.1        0.20 ± 13%  perf-profile.children.cycles-pp.file_update_time
      0.12 ±  8%      -0.0        0.08 ± 20%  perf-profile.children.cycles-pp.xa_get_order
      0.09 ± 16%      +0.0        0.12 ± 13%  perf-profile.children.cycles-pp.do_filp_open
      0.09 ± 17%      +0.0        0.12 ± 13%  perf-profile.children.cycles-pp.path_openat
      0.00            +0.1        0.08 ± 16%  perf-profile.children.cycles-pp.up_read
      0.00            +0.1        0.09 ± 30%  perf-profile.children.cycles-pp.shmem_xattr_handler_get
      0.00            +0.1        0.11 ± 31%  perf-profile.children.cycles-pp.strlen
      0.00            +0.1        0.13 ± 27%  perf-profile.children.cycles-pp.simple_xattr_get
      0.00            +0.2        0.19 ± 17%  perf-profile.children.cycles-pp.down_read
      0.00            +0.4        0.36 ±  9%  perf-profile.children.cycles-pp.xattr_resolve_name
      0.00            +0.4        0.44 ± 11%  perf-profile.children.cycles-pp.ext4_xattr_get
      0.02 ±141%      +0.9        0.91 ±  9%  perf-profile.children.cycles-pp.file_modified
      0.00            +1.1        1.14 ± 10%  perf-profile.children.cycles-pp.__vfs_getxattr
      0.00            +1.2        1.22 ± 10%  perf-profile.children.cycles-pp.cap_inode_need_killpriv
      0.00            +1.3        1.26 ± 10%  perf-profile.children.cycles-pp.security_inode_need_killpriv
      0.00            +1.4        1.37 ± 10%  perf-profile.children.cycles-pp.__file_remove_privs
      0.40 ± 11%      -0.1        0.32 ± 12%  perf-profile.self.cycles-pp.jbd2_journal_grab_journal_head
      0.16 ± 19%      -0.1        0.09 ± 26%  perf-profile.self.cycles-pp.ext4_da_write_begin
      0.22 ± 10%      -0.1        0.16 ± 14%  perf-profile.self.cycles-pp.create_empty_buffers
      0.00            +0.1        0.08 ± 16%  perf-profile.self.cycles-pp.up_read
      0.00            +0.1        0.08 ± 16%  perf-profile.self.cycles-pp.ext4_xattr_get
      0.00            +0.1        0.09 ± 15%  perf-profile.self.cycles-pp.down_read
      0.00            +0.1        0.09 ± 22%  perf-profile.self.cycles-pp.__file_remove_privs
      0.00            +0.1        0.09 ± 22%  perf-profile.self.cycles-pp.cap_inode_need_killpriv
      0.00            +0.1        0.10 ± 36%  perf-profile.self.cycles-pp.strlen
      0.00            +0.4        0.35 ±  9%  perf-profile.self.cycles-pp.xattr_resolve_name




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
diff mbox series

Patch

diff --git a/fs/inode.c b/fs/inode.c
index 9d9b422504d1..ac1cf5aa78c8 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2010,36 +2010,43 @@  static int __remove_privs(struct user_namespace *mnt_userns,
 	return notify_change(mnt_userns, dentry, &newattrs, NULL);
 }
 
-/*
- * Remove special file priviledges (suid, capabilities) when file is written
- * to or truncated.
- */
-int file_remove_privs(struct file *file)
+static int __file_remove_privs(struct file *file, unsigned int flags)
 {
 	struct dentry *dentry = file_dentry(file);
 	struct inode *inode = file_inode(file);
+	int error;
 	int kill;
-	int error = 0;
 
-	/*
-	 * Fast path for nothing security related.
-	 * As well for non-regular files, e.g. blkdev inodes.
-	 * For example, blkdev_write_iter() might get here
-	 * trying to remove privs which it is not allowed to.
-	 */
 	if (IS_NOSEC(inode) || !S_ISREG(inode->i_mode))
 		return 0;
 
 	kill = dentry_needs_remove_privs(dentry);
-	if (kill < 0)
+	if (kill <= 0)
 		return kill;
-	if (kill)
-		error = __remove_privs(file_mnt_user_ns(file), dentry, kill);
+
+	if (flags & IOCB_NOWAIT)
+		return -EAGAIN;
+
+	error = __remove_privs(file_mnt_user_ns(file), dentry, kill);
 	if (!error)
 		inode_has_no_xattr(inode);
 
 	return error;
 }
+
+/**
+ * file_remove_privs - remove special file privileges (suid, capabilities)
+ * @file: file to remove privileges from
+ *
+ * When file is modified by a write or truncation ensure that special
+ * file privileges are removed.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int file_remove_privs(struct file *file)
+{
+	return __file_remove_privs(file, 0);
+}
 EXPORT_SYMBOL(file_remove_privs);
 
 /**
@@ -2090,18 +2097,28 @@  int file_update_time(struct file *file)
 }
 EXPORT_SYMBOL(file_update_time);
 
-/* Caller must hold the file's inode lock */
+/**
+ * file_modified - handle mandated vfs changes when modifying a file
+ * @file: file that was modified
+ *
+ * When file has been modified ensure that special
+ * file privileges are removed and time settings are updated.
+ *
+ * Context: Caller must hold the file's inode lock.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
 int file_modified(struct file *file)
 {
-	int err;
+	int ret;
 
 	/*
 	 * Clear the security bits if the process is not being run by root.
 	 * This keeps people from modifying setuid and setgid binaries.
 	 */
-	err = file_remove_privs(file);
-	if (err)
-		return err;
+	ret = __file_remove_privs(file, 0);
+	if (ret)
+		return ret;
 
 	if (unlikely(file->f_mode & FMODE_NOCMTIME))
 		return 0;