Message ID: 20240513072119.2335346-1-yi.zhang@huaweicloud.com (mailing list archive)
State:      New
Series:     ext4/jbd2: drop jbd2_transaction_committed()
On Mon 13-05-24 15:21:19, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> jbd2_transaction_committed() is used to check whether a transaction with
> the given tid has already committed. It holds j_state_lock in read mode
> and checks the tid of the currently running transaction and the
> committing transaction, but holding the j_state_lock is expensive.
>
> We already store the sequence number of the most recently committed
> transaction in journal->j_commit_sequence, so we can do this check by
> comparing it with the given tid instead. If the given tid isn't greater
> than j_commit_sequence, we can ensure that the given transaction has
> been committed. That way we can drop the expensive lock and achieve
> about 10% ~ 20% performance gains in concurrent DIOs on my virtual
> machine with a 100G ramdisk.
>
>  fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
>      -bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
>      -group_reporting
>
> Before:
>  overwrite       IOPS=88.2k, BW=344MiB/s
>  read            IOPS=95.7k, BW=374MiB/s
>  rand overwrite  IOPS=98.7k, BW=386MiB/s
>  randread        IOPS=102k,  BW=397MiB/s
>
> After:
>  overwrite       IOPS=105k, BW=410MiB/s
>  read            IOPS=112k, BW=436MiB/s
>  rand overwrite  IOPS=104k, BW=404MiB/s
>  randread        IOPS=111k, BW=432MiB/s
>
> CC: Dave Chinner <david@fromorbit.com>
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Link: https://lore.kernel.org/linux-ext4/493ab4c5-505c-a351-eefa-7d2677cdf800@huaweicloud.com/T/#m6a14df5d085527a188c5a151191e87a3252dc4e2
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

I agree this is a workable solution and the performance benefits are nice.
But I have some comments regarding the implementation:

> @@ -3199,8 +3199,8 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)
>  	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
>
>  	if (journal) {
> -		if (jbd2_transaction_committed(journal,
> -					EXT4_I(inode)->i_datasync_tid))
> +		if (tid_geq(journal->j_commit_sequence,
> +			    EXT4_I(inode)->i_datasync_tid))

Please leave the helper jbd2_transaction_committed(), just make the
implementation more efficient. Also, accessing j_commit_sequence without
any lock is theoretically problematic wrt compiler optimization. You
should have READ_ONCE() there, and the places modifying j_commit_sequence
need to use WRITE_ONCE().

								Honza
On 2024/5/15 8:25, Jan Kara wrote:
> On Mon 13-05-24 15:21:19, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> jbd2_transaction_committed() is used to check whether a transaction with
>> the given tid has already committed ...
>
> I agree this is a workable solution and the performance benefits are nice.
> But I have some comments regarding the implementation:
>
>> @@ -3199,8 +3199,8 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)
>>  	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
>>
>>  	if (journal) {
>> -		if (jbd2_transaction_committed(journal,
>> -					EXT4_I(inode)->i_datasync_tid))
>> +		if (tid_geq(journal->j_commit_sequence,
>> +			    EXT4_I(inode)->i_datasync_tid))
>
> Please leave the helper jbd2_transaction_committed(), just make the
> implementation more efficient.

Sure.

> Also accessing j_commit_sequence without any lock is theoretically
> problematic wrt compiler optimization. You should have READ_ONCE() there
> and the places modifying j_commit_sequence need to use WRITE_ONCE().

Thanks for pointing this out, but I'm not sure we really need READ_ONCE()
here. IIUC, with READ_ONCE() we make sure to get the latest
j_commit_sequence; without it, there is a window (it might become larger)
in which we could get the old value, and jbd2_transaction_committed()
could return false even if the given transaction was just committed. But
that window is always there anyway, so it doesn't look like a big
problem. Is that right?

Thanks,
Yi.
On Thu 16-05-24 16:27:25, Zhang Yi wrote:
> On 2024/5/15 8:25, Jan Kara wrote:
>> Also accessing j_commit_sequence without any lock is theoretically
>> problematic wrt compiler optimization. You should have READ_ONCE() there
>> and the places modifying j_commit_sequence need to use WRITE_ONCE().
>
> Thanks for pointing this out, but I'm not sure we really need READ_ONCE()
> here. IIUC, with READ_ONCE() we make sure to get the latest
> j_commit_sequence; without it, there is a window (it might become larger)
> in which we could get the old value, and jbd2_transaction_committed()
> could return false even if the given transaction was just committed. But
> that window is always there anyway, so it doesn't look like a big
> problem. Is that right?

Well, all accesses to any memory should use READ_ONCE(), be protected by
a lock, or use types that handle atomicity on the assembly level (like
atomic_t, or atomic bit operations and similar). Otherwise the compiler
is free to assume the underlying memory cannot change and generate
potentially invalid code. In this case I don't think any compiler will
realistically do that, but it is still good practice and it also saves
us from KCSAN warnings. If you want to know more details about the
possible problems, see

  tools/memory-model/Documentation/explanation.txt

chapter "PLAIN ACCESSES AND DATA RACES".

								Honza
On 2024/5/20 16:49, Jan Kara wrote:
> On Thu 16-05-24 16:27:25, Zhang Yi wrote:
>> Thanks for pointing this out, but I'm not sure we really need
>> READ_ONCE() here ...
>
> Well, all accesses to any memory should use READ_ONCE(), be protected by
> a lock, or use types that handle atomicity on the assembly level (like
> atomic_t, or atomic bit operations and similar). Otherwise the compiler
> is free to assume the underlying memory cannot change and generate
> potentially invalid code. In this case I don't think any compiler will
> realistically do that, but it is still good practice and it also saves
> us from KCSAN warnings. If you want to know more details about the
> possible problems, see
>
>   tools/memory-model/Documentation/explanation.txt
>
> chapter "PLAIN ACCESSES AND DATA RACES".

Sure, this document is really helpful. I'll add READ_ONCE() and
WRITE_ONCE() here, thanks a lot.

Yi.
Hello,

kernel test robot noticed a 674.0% improvement of
stress-ng.fiemap.ops_per_sec on:

commit: 8a7952dd64b38f3ac2754ca550b52ac3ca921c1a ("[PATCH] ext4/jbd2: drop jbd2_transaction_committed()")
url: https://github.com/intel-lab-lkp/linux/commits/Zhang-Yi/ext4-jbd2-drop-jbd2_transaction_committed/20240513-153311
base: https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git dev
patch link: https://lore.kernel.org/all/20240513072119.2335346-1-yi.zhang@huaweicloud.com/
patch subject: [PATCH] ext4/jbd2: drop jbd2_transaction_committed()

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
  nr_threads: 100%
  disk: 1HDD
  testtime: 60s
  fs: ext4
  test: fiemap
  cpufreq_governor: performance

Details are as below:

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240527/202405271406.9b9763f6-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-13/performance/1HDD/ext4/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/fiemap/stress-ng/60s

commit:
  26770a717c ("jbd2: add prefix 'jbd2' for 'shrink_type'")
  8a7952dd64 ("ext4/jbd2: drop jbd2_transaction_committed()")

  26770a717cac5704            8a7952dd64b38f3ac2754ca550b
  ----------------            ---------------------------
        %stddev     %change         %stddev
            \          |                \
      1.13 ±  5%     +52.7%       1.73 ±  3%  iostat.cpu.user
      0.01 ± 47%      +0.0        0.04 ± 44%  mpstat.cpu.all.iowait%
      0.02 ±  4%      -0.0        0.01 ± 15%  mpstat.cpu.all.soft%
      1.15 ±  5%      +0.6        1.75 ±  3%  mpstat.cpu.all.usr%
    100427 ± 42%     -59.1%      41106 ± 66%  numa-vmstat.node0.nr_anon_pages
    149979 ± 34%    +205.2%     457720 ± 31%  numa-vmstat.node1.nr_inactive_anon
    149980 ± 34%    +205.2%     457720 ± 31%  numa-vmstat.node1.nr_zone_inactive_anon
    402376 ± 42%     -59.1%     164453 ± 66%  numa-meminfo.node0.AnonPages
    424656 ± 39%     -57.6%     180029 ± 58%  numa-meminfo.node0.AnonPages.max
    620453 ± 33%    +198.4%    1851452 ± 30%  numa-meminfo.node1.Inactive
    601997 ± 34%    +204.4%    1832749 ± 31%  numa-meminfo.node1.Inactive(anon)
      2387 ±  2%     +12.8%       2693 ±  4%  vmstat.io.bo
     10.17 ± 13%     +83.8%      18.69 ±  3%  vmstat.procs.b
    201402 ±  9%    +631.9%    1474068 ±  6%  vmstat.system.cs
    167899           +5.5%      177141        vmstat.system.in
    347.50 ± 16%     +53.4%     533.00 ± 10%  perf-c2c.DRAM.local
      4232 ± 48%    +166.1%      11262 ± 24%  perf-c2c.DRAM.remote
     23484 ± 10%    +119.8%      51621 ±  7%  perf-c2c.HITM.local
      3377 ± 56%    +179.0%       9424 ± 27%  perf-c2c.HITM.remote
     26861 ±  7%    +127.3%      61046 ±  5%  perf-c2c.HITM.total
   7612782 ±  7%    +676.3%   59099460 ±  6%  stress-ng.fiemap.ops
    124734 ±  8%    +674.0%     965466 ±  6%  stress-ng.fiemap.ops_per_sec
  12752270 ± 10%    +643.8%   94847653 ±  5%  stress-ng.time.involuntary_context_switches
     19.02 ±  6%    +124.6%      42.70 ±  5%  stress-ng.time.user_time
     62295          +17.8%      73411 ±  4%  stress-ng.time.voluntary_context_switches
    262799 ±  5%     +90.1%     499700 ± 43%  meminfo.Active
   3750501          +38.7%    5203048 ±  2%  meminfo.Cached
   3989040          +36.2%    5432649 ±  2%  meminfo.Committed_AS
   1154859 ±  3%    +105.8%    2376633 ± 12%  meminfo.Inactive
   1118790 ±  3%    +109.2%    2340758 ± 12%  meminfo.Inactive(anon)
    362491 ±  7%    +153.0%     917217 ± 19%  meminfo.Mapped
   5902699          +25.8%    7426870        meminfo.Memused
    546333 ±  3%    +265.7%    1998111 ±  5%  meminfo.Shmem
   5963280          +25.0%    7456799        meminfo.max_used_kB
      0.92 ± 16%     -21.5%       0.72 ±  8%  sched_debug.cfs_rq:/.h_nr_running.stddev
    809.29 ±  9%     -15.5%     683.50 ±  7%  sched_debug.cfs_rq:/.runnable_avg.stddev
    567.16 ± 10%    +123.7%       1268 ±  9%  sched_debug.cfs_rq:/.util_est.avg
      2015 ±  8%     +41.9%       2859 ± 11%  sched_debug.cfs_rq:/.util_est.max
      0.75 ±223%  +23377.8%     176.08 ± 20%  sched_debug.cfs_rq:/.util_est.min
    447.08 ±  8%     +19.1%     532.35 ± 12%  sched_debug.cfs_rq:/.util_est.stddev
      0.92 ± 16%     -22.1%       0.71 ±  8%  sched_debug.cpu.nr_running.stddev
    101043 ±  9%    +621.7%     729210 ±  5%  sched_debug.cpu.nr_switches.avg
    176826 ± 18%    +423.9%     926474 ±  5%  sched_debug.cpu.nr_switches.max
     17586 ± 35%    +762.4%     151656 ± 44%  sched_debug.cpu.nr_switches.min
     54968 ± 44%    +170.9%     148885 ±  5%  sched_debug.cpu.nr_switches.stddev
      5712           +5.1%       6002 ±  2%  proc-vmstat.nr_active_file
    939938          +38.6%    1302470 ±  2%  proc-vmstat.nr_file_pages
    280225 ±  3%    +108.7%     584958 ± 12%  proc-vmstat.nr_inactive_anon
     91383 ±  8%    +149.7%     228197 ± 19%  proc-vmstat.nr_mapped
    136324 ±  3%    +265.7%     498577 ±  5%  proc-vmstat.nr_shmem
     25123           +2.8%      25834        proc-vmstat.nr_slab_reclaimable
      5712           +5.1%       6002 ±  2%  proc-vmstat.nr_zone_active_file
    280225 ±  3%    +108.7%     584958 ± 12%  proc-vmstat.nr_zone_inactive_anon
     15324 ± 30%    +359.4%      70400 ±  9%  proc-vmstat.numa_hint_faults
     10098 ± 24%    +283.5%      38729 ± 19%  proc-vmstat.numa_hint_faults_local
    604637          +86.4%    1127295 ±  4%  proc-vmstat.numa_hit
    538359          +97.1%    1061202 ±  4%  proc-vmstat.numa_local
    532388          +13.4%     603511 ±  3%  proc-vmstat.numa_pte_updates
    723145          +72.1%    1244321 ±  3%  proc-vmstat.pgalloc_normal
    383685 ±  2%     +21.4%     465726 ±  2%  proc-vmstat.pgfault
    152450          +13.4%     172900 ±  4%  proc-vmstat.pgpgout
 2.951e+09 ±  6%    +571.9%  1.983e+10 ±  2%  perf-stat.i.branch-instructions
      0.64 ±  4%      -0.2        0.40 ±  2%  perf-stat.i.branch-miss-rate%
  19046341 ±  4%    +291.2%   74517781        perf-stat.i.branch-misses
   9666270 ± 31%    +318.8%   40484483 ± 19%  perf-stat.i.cache-misses
  70884859 ±  6%    +367.8%  3.316e+08 ±  2%  perf-stat.i.cache-references
    213832 ±  9%    +623.2%    1546476 ±  6%  perf-stat.i.context-switches
     15.20 ±  7%     -84.9%       2.30 ±  2%  perf-stat.i.cpi
    414.63 ±  5%     +10.9%     459.73 ±  8%  perf-stat.i.cpu-migrations
     26710 ± 31%     -77.7%       5960 ± 22%  perf-stat.i.cycles-between-cache-misses
 1.501e+10 ±  6%    +556.2%  9.849e+10 ±  2%  perf-stat.i.instructions
      0.07 ±  6%    +515.9%       0.44 ±  2%  perf-stat.i.ipc
      3.33 ±  9%    +626.6%      24.22 ±  6%  perf-stat.i.metric.K/sec
      4908 ±  3%     +26.6%       6216 ±  3%  perf-stat.i.minor-faults
      4908 ±  3%     +26.6%       6216 ±  3%  perf-stat.i.page-faults
      0.65 ±  5%      -0.3        0.38 ±  2%  perf-stat.overall.branch-miss-rate%
     15.14 ±  6%     -84.8%       2.30 ±  2%  perf-stat.overall.cpi
     25496 ± 29%     -77.2%       5824 ± 21%  perf-stat.overall.cycles-between-cache-misses
      0.07 ±  6%    +556.1%       0.44 ±  2%  perf-stat.overall.ipc
   2.9e+09 ±  6%    +572.3%   1.95e+10 ±  2%  perf-stat.ps.branch-instructions
  18850245 ±  4%    +289.0%   73330498        perf-stat.ps.branch-misses
   9571442 ± 31%    +315.9%   39805993 ± 19%  perf-stat.ps.cache-misses
  70175556 ±  6%    +365.5%  3.267e+08 ±  2%  perf-stat.ps.cache-references
    205492 ±  9%    +636.9%    1514212 ±  6%  perf-stat.ps.context-switches
    405.62 ±  5%     +10.8%     449.30 ±  8%  perf-stat.ps.cpu-migrations
 1.475e+10 ±  6%    +556.5%  9.684e+10 ±  2%  perf-stat.ps.instructions
      4823 ±  3%     +25.8%       6069 ±  4%  perf-stat.ps.minor-faults
      4823 ±  3%     +25.8%       6069 ±  4%  perf-stat.ps.page-faults
 9.213e+11 ±  6%    +555.4%  6.038e+12 ±  2%  perf-stat.total.instructions
     85.73          -85.7        0.00        perf-profile.calltrace.cycles-pp.jbd2_transaction_committed.ext4_set_iomap.ext4_iomap_begin_report.iomap_iter.iomap_fiemap
     86.11          -84.1        1.99 ±  2%  perf-profile.calltrace.cycles-pp.ext4_set_iomap.ext4_iomap_begin_report.iomap_iter.iomap_fiemap.do_vfs_ioctl
     50.51 ±  3%    -50.5        0.00        perf-profile.calltrace.cycles-pp._raw_read_lock.jbd2_transaction_committed.ext4_set_iomap.ext4_iomap_begin_report.iomap_iter
     96.50          -12.4       84.06        perf-profile.calltrace.cycles-pp.ext4_iomap_begin_report.iomap_iter.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl
     97.49           -5.6       91.94        perf-profile.calltrace.cycles-pp.iomap_iter.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
     98.08           -1.5       96.62        perf-profile.calltrace.cycles-pp.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     98.13           -1.1       96.99        perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
     98.14           -1.1       97.05        perf-profile.calltrace.cycles-pp.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
     98.18           -1.1       97.12        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
     98.19           -1.0       97.14        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.ioctl
     98.22           -0.9       97.31        perf-profile.calltrace.cycles-pp.ioctl
      0.62 ±  9%     +0.3        0.94        perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.62 ±  9%     +0.3        0.96        perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.70 ±  9%     +0.5        1.18        perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.87 ±  8%     +0.7        1.58        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.87 ±  8%     +0.7        1.59        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.00           +1.0        1.00 ±  4%  perf-profile.calltrace.cycles-pp.ext4_sb_block_valid.__check_block_validity.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter
      0.94 ±  8%     +1.0        1.98 ±  2%  perf-profile.calltrace.cycles-pp.__sched_yield
      0.00           +1.1        1.10 ±  2%  perf-profile.calltrace.cycles-pp._copy_to_user.fiemap_fill_next_extent.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl
      0.00           +1.3        1.35 ±  8%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.percpu_counter_add_batch.ext4_es_lookup_extent.ext4_map_blocks
      0.00           +1.5        1.48 ±  2%  perf-profile.calltrace.cycles-pp.__check_block_validity.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter.iomap_fiemap
      0.00           +1.6        1.58 ±  7%  perf-profile.calltrace.cycles-pp._raw_spin_lock.percpu_counter_add_batch.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin_report
      0.00           +3.2        3.22 ±  2%  perf-profile.calltrace.cycles-pp.fiemap_fill_next_extent.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
      4.96 ±  6%     +4.8        9.72 ±  2%  perf-profile.calltrace.cycles-pp._raw_read_lock.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter
      0.78 ±  7%     +5.4        6.22 ±  2%  perf-profile.calltrace.cycles-pp.iomap_iter_advance.iomap_iter.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl
      0.62 ±  6%    +50.6       51.27        perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter
      9.55 ±  7%    +66.3       75.88        perf-profile.calltrace.cycles-pp.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter.iomap_fiemap
     10.12 ±  7%    +69.9       79.98        perf-profile.calltrace.cycles-pp.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter.iomap_fiemap.do_vfs_ioctl
     85.84          -85.8        0.00        perf-profile.children.cycles-pp.jbd2_transaction_committed
     86.14          -84.0        2.10 ±  2%  perf-profile.children.cycles-pp.ext4_set_iomap
     55.59 ±  3%    -45.8        9.83 ±  2%  perf-profile.children.cycles-pp._raw_read_lock
     96.56          -12.1       84.50        perf-profile.children.cycles-pp.ext4_iomap_begin_report
     97.54           -5.3       92.26        perf-profile.children.cycles-pp.iomap_iter
     98.11           -1.3       96.84        perf-profile.children.cycles-pp.iomap_fiemap
     98.14           -1.1       97.00        perf-profile.children.cycles-pp.do_vfs_ioctl
     98.14           -1.1       97.06        perf-profile.children.cycles-pp.__x64_sys_ioctl
     98.22           -0.9       97.34        perf-profile.children.cycles-pp.ioctl
     99.42           -0.7       98.76        perf-profile.children.cycles-pp.do_syscall_64
     99.43           -0.6       98.81        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.39 ± 21%     -0.1        0.25 ±  6%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.14 ± 24%     -0.1        0.03 ±105%  perf-profile.children.cycles-pp.build_id__mark_dso_hit
      0.19 ±  5%     +0.0        0.21 ±  6%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.08 ± 13%     +0.0        0.11        perf-profile.children.cycles-pp.switch_fpu_return
      0.09 ±  7%     +0.0        0.12 ±  5%  perf-profile.children.cycles-pp.update_process_times
      0.11 ±  5%     +0.0        0.14 ±  6%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.10 ±  5%     +0.0        0.14 ±  5%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.00           +0.1        0.05        perf-profile.children.cycles-pp.__fdget
      0.00           +0.1        0.05 ±  7%  perf-profile.children.cycles-pp.__update_load_avg_se
      0.00           +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.00           +0.1        0.06 ±  6%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.00           +0.1        0.06        perf-profile.children.cycles-pp.___perf_sw_event
      0.00           +0.1        0.06        perf-profile.children.cycles-pp.__dequeue_entity
      0.00           +0.1        0.06 ±  6%  perf-profile.children.cycles-pp.__switch_to_asm
      0.00           +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.pick_eevdf
      0.00           +0.1        0.07        perf-profile.children.cycles-pp.__switch_to
      0.00           +0.1        0.08 ±  6%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.00           +0.1        0.08 ±  4%  perf-profile.children.cycles-pp.rseq_ip_fixup
      0.00           +0.1        0.09 ±  4%  perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
      0.27 ± 22%     +0.1        0.37 ±  9%  perf-profile.children.cycles-pp.perf_session__process_events
      0.27 ± 22%     +0.1        0.37 ±  9%  perf-profile.children.cycles-pp.record__finish_output
      0.26 ± 25%     +0.1        0.36 ±  9%  perf-profile.children.cycles-pp.reader__read_event
      0.00           +0.1        0.11 ±  3%  perf-profile.children.cycles-pp.set_next_entity
      0.00           +0.1        0.12 ±  4%  perf-profile.children.cycles-pp.update_load_avg
      0.08 ±  6%     +0.1        0.20        perf-profile.children.cycles-pp.do_sched_yield
      0.00           +0.1        0.12 ±  4%  perf-profile.children.cycles-pp.put_prev_entity
      0.02 ±141%     +0.1        0.15 ±  2%  perf-profile.children.cycles-pp.update_curr
      0.09 ± 36%     +0.1        0.23 ± 18%  perf-profile.children.cycles-pp.process_simple
      0.00           +0.1        0.15 ±  3%  perf-profile.children.cycles-pp.__rseq_handle_notify_resume
      0.07 ± 57%     +0.1        0.22 ± 19%  perf-profile.children.cycles-pp.queue_event
      0.00           +0.1        0.15 ±  4%  perf-profile.children.cycles-pp.clear_bhb_loop
      0.00           +0.1        0.15 ±  4%  perf-profile.children.cycles-pp.stress_fiemap
      0.07 ± 57%     +0.2        0.22 ± 19%  perf-profile.children.cycles-pp.ordered_events__queue
      0.00           +0.2        0.16 ±  2%  perf-profile.children.cycles-pp.yield_task_fair
      0.00           +0.2        0.16 ±  3%  perf-profile.children.cycles-pp.ext4_inode_block_valid
      0.20 ±  5%     +0.2        0.41        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.09 ± 10%     +0.3        0.38 ±  2%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.62 ±  9%     +0.3        0.95        perf-profile.children.cycles-pp.__schedule
      0.63 ± 10%     +0.3        0.97        perf-profile.children.cycles-pp.schedule
      0.06 ± 11%     +0.4        0.48 ±  3%  perf-profile.children.cycles-pp.iomap_to_fiemap
      0.70 ±  9%     +0.5        1.18        perf-profile.children.cycles-pp.__x64_sys_sched_yield
      0.13 ±  8%     +0.9        1.05 ±  3%  perf-profile.children.cycles-pp.ext4_sb_block_valid
      0.94 ±  8%     +1.1        2.03 ±  2%  perf-profile.children.cycles-pp.__sched_yield
      0.16 ±  7%     +1.1        1.27 ±  2%  perf-profile.children.cycles-pp._copy_to_user
      0.20 ±  8%     +1.4        1.59 ±  2%  perf-profile.children.cycles-pp.__check_block_validity
      0.20 ±  7%     +1.4        1.64 ±  7%  perf-profile.children.cycles-pp._raw_spin_lock
      0.41 ±  7%     +2.9        3.30 ±  2%  perf-profile.children.cycles-pp.fiemap_fill_next_extent
      0.80 ±  7%     +5.6        6.35 ±  2%  perf-profile.children.cycles-pp.iomap_iter_advance
      0.64 ±  7%    +50.8       51.40        perf-profile.children.cycles-pp.percpu_counter_add_batch
      9.60 ±  7%    +66.7       76.28        perf-profile.children.cycles-pp.ext4_es_lookup_extent
     10.16 ±  7%    +70.2       80.39        perf-profile.children.cycles-pp.ext4_map_blocks
     55.43 ±  3%    -45.7        9.70 ±  2%  perf-profile.self.cycles-pp._raw_read_lock
     34.41 ±  6%    -34.4        0.00        perf-profile.self.cycles-pp.jbd2_transaction_committed
      0.10 ± 10%     -0.0        0.07 ±  5%  perf-profile.self.cycles-pp.prepare_task_switch
      0.08 ± 11%     +0.0        0.12 ±  4%  perf-profile.self.cycles-pp.__schedule
      0.00           +0.1        0.05        perf-profile.self.cycles-pp.___perf_sw_event
      0.00           +0.1        0.05        perf-profile.self.cycles-pp.__update_load_avg_se
      0.00           +0.1        0.05        perf-profile.self.cycles-pp.update_curr
      0.00           +0.1        0.05 ±  7%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.00           +0.1        0.06 ±  9%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.00           +0.1        0.06 ± 13%  perf-profile.self.cycles-pp.__sched_yield
      0.00           +0.1        0.06 ±  6%  perf-profile.self.cycles-pp.__switch_to_asm
      0.00           +0.1        0.07        perf-profile.self.cycles-pp.__switch_to
      0.20 ±  7%     +0.1        0.28 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock
      0.00           +0.1        0.09 ±  4%  perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
      0.00           +0.1        0.11 ±  4%  perf-profile.self.cycles-pp.ext4_inode_block_valid
      0.00           +0.1        0.13 ±  5%  perf-profile.self.cycles-pp.stress_fiemap
      0.07 ± 57%     +0.1        0.22 ± 19%  perf-profile.self.cycles-pp.queue_event
      0.00           +0.1        0.15 ±  4%  perf-profile.self.cycles-pp.clear_bhb_loop
      0.05 ± 45%     +0.3        0.38 ±  2%  perf-profile.self.cycles-pp.__check_block_validity
      0.03 ± 70%     +0.3        0.38 ±  3%  perf-profile.self.cycles-pp.iomap_to_fiemap
      0.13 ±  7%     +0.9        1.00 ±  4%  perf-profile.self.cycles-pp.ext4_sb_block_valid
      0.13 ±  7%     +0.9        1.05 ±  2%  perf-profile.self.cycles-pp.iomap_fiemap
      0.16 ±  7%     +1.1        1.24 ±  2%  perf-profile.self.cycles-pp._copy_to_user
      0.19 ±  7%     +1.3        1.52 ±  2%  perf-profile.self.cycles-pp.iomap_iter
      0.29 ± 11%     +1.6        1.88 ±  2%  perf-profile.self.cycles-pp.ext4_set_iomap
      0.25 ±  8%     +1.8        2.04 ±  2%  perf-profile.self.cycles-pp.fiemap_fill_next_extent
      0.27 ±  8%     +1.9        2.21 ±  2%  perf-profile.self.cycles-pp.ext4_iomap_begin_report
      0.38 ± 13%     +2.3        2.70 ±  4%  perf-profile.self.cycles-pp.ext4_map_blocks
      0.78 ±  7%     +5.4        6.23 ±  2%  perf-profile.self.cycles-pp.iomap_iter_advance
      3.98 ±  8%    +11.1       15.12 ±  2%  perf-profile.self.cycles-pp.ext4_es_lookup_extent
      0.56 ±  7%    +48.9       49.48        perf-profile.self.cycles-pp.percpu_counter_add_batch

Disclaimer: Results have been estimated based on internal Intel analysis
and are provided for informational purposes only. Any difference in
system hardware or software design or configuration may affect actual
performance.
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 537803250ca9..e8e2865bf9ac 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3199,8 +3199,8 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)
 	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
 
 	if (journal) {
-		if (jbd2_transaction_committed(journal,
-					EXT4_I(inode)->i_datasync_tid))
+		if (tid_geq(journal->j_commit_sequence,
+			    EXT4_I(inode)->i_datasync_tid))
 			return false;
 		if (test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT))
 			return !list_empty(&EXT4_I(inode)->i_fc_list);
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index b6c114c11b97..73737cd1106f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -786,23 +786,6 @@ int jbd2_fc_end_commit_fallback(journal_t *journal)
 }
 EXPORT_SYMBOL(jbd2_fc_end_commit_fallback);
 
-/* Return 1 when transaction with given tid has already committed. */
-int jbd2_transaction_committed(journal_t *journal, tid_t tid)
-{
-	int ret = 1;
-
-	read_lock(&journal->j_state_lock);
-	if (journal->j_running_transaction &&
-	    journal->j_running_transaction->t_tid == tid)
-		ret = 0;
-	if (journal->j_committing_transaction &&
-	    journal->j_committing_transaction->t_tid == tid)
-		ret = 0;
-	read_unlock(&journal->j_state_lock);
-	return ret;
-}
-EXPORT_SYMBOL(jbd2_transaction_committed);
-
 /*
  * When this function returns the transaction corresponding to tid
  * will be completed.  If the transaction has currently running, start
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 971f3e826e15..e15ae324169d 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1643,7 +1643,6 @@ extern void jbd2_clear_buffer_revoked_flags(journal_t *journal);
 int jbd2_log_start_commit(journal_t *journal, tid_t tid);
 int jbd2_journal_start_commit(journal_t *journal, tid_t *tid);
 int jbd2_log_wait_commit(journal_t *journal, tid_t tid);
-int jbd2_transaction_committed(journal_t *journal, tid_t tid);
 int jbd2_complete_transaction(journal_t *journal, tid_t tid);
 int jbd2_log_do_checkpoint(journal_t *journal);
 int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid);