Message ID | 20241101035133.925251-1-zhenghaoran@buaa.edu.cn (mailing list archive) |
---|---|
State | New |
Series | btrfs: Fix data race in log_conflicting_inodes |
On 2024/11/1 14:21, Hao-ran Zheng wrote:
> The Data Race occurs when the `log_conflicting_inodes()` function is
> executed in different threads at the same time. When one thread assigns
> a value to `ctx->logging_conflict_inodes` while another thread performs
> an `if(ctx->logging_conflict_inodes)` judgment or modifies it at the
> same time, a data contention problem may arise.
>
> Further, an atomicity violation may also occur here. Consider the
> following case, when a thread A `if(ctx->logging_conflict_inodes)`
> passes the judgment, the execution switches to another thread B, at
> which time the value of `ctx->logging_conflict_inodes` has not yet
> been assigned true, which would result in multiple threads executing
> `log_conflicting_inodes()`.
>
> To address this issue, it is recommended to add locks to protect
> `logging_conflict_inodes` in the `btrfs_log_ctx` structure, and lock
> protection during assignment and judgment. This modification ensures
> that the value of `ctx->logging_conflict_inodes` does not change during
> the validation process, thereby maintaining its integrity.
>
> Signed-off-by: Hao-ran Zheng <zhenghaoran@buaa.edu.cn>
> ---
>  fs/btrfs/tree-log.c | 7 +++++++
>  fs/btrfs/tree-log.h | 1 +
>  2 files changed, 8 insertions(+)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 9637c7cdc0cf..9cdbf280ca9a 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -2854,6 +2854,7 @@ void btrfs_init_log_ctx(struct btrfs_log_ctx *ctx, struct btrfs_inode *inode)
>  	INIT_LIST_HEAD(&ctx->conflict_inodes);
>  	ctx->num_conflict_inodes = 0;
>  	ctx->logging_conflict_inodes = false;
> +	spin_lock_init(&ctx->logging_conflict_inodes_lock);
>  	ctx->scratch_eb = NULL;
>  }
>
> @@ -5779,16 +5780,20 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
>  				  struct btrfs_log_ctx *ctx)
>  {
>  	int ret = 0;
> +	unsigned long logging_conflict_inodes_flags;
>
>  	/*
>  	 * Conflicting inodes are logged by the first call to btrfs_log_inode(),
>  	 * otherwise we could have unbounded recursion of btrfs_log_inode()
>  	 * calls. This check guarantees we can have only 1 level of recursion.
>  	 */
> +	spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>  	if (ctx->logging_conflict_inodes)
> +		spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);

I'm not an expert on the log tree, but in the above case the only thing
the spinlock protects is a bool. This looks like overkill to me.

Yes, several booleans can be stored in a single byte, which can cause
problems. But in that case, why not change those booleans into an
unsigned long and use test_bit()/set_bit()/clear_bit(), so that the bit
operations are atomic and there is no need for the extra spinlock?

I haven't checked the other boolean usages, but at least for this
@logging_conflict_inodes variable, an atomic bit operation looks safe.

Thanks,
Qu

>  		return 0;
>
>  	ctx->logging_conflict_inodes = true;
> +	spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>
>  	/*
>  	 * New conflicting inodes may be found and added to the list while we
> @@ -5869,7 +5874,9 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
>  			break;
>  	}
>
> +	spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>  	ctx->logging_conflict_inodes = false;
> +	spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>  	if (ret)
>  		free_conflicting_inodes(ctx);
>
> diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
> index dc313e6bb2fa..0f862d0c80f2 100644
> --- a/fs/btrfs/tree-log.h
> +++ b/fs/btrfs/tree-log.h
> @@ -44,6 +44,7 @@ struct btrfs_log_ctx {
>  	struct list_head conflict_inodes;
>  	int num_conflict_inodes;
>  	bool logging_conflict_inodes;
> +	spinlock_t logging_conflict_inodes_lock;
>  	/*
>  	 * Used for fsyncs that need to copy items from the subvolume tree to
>  	 * the log tree (full sync flag set or copy everything flag set) to
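To make the suggestion in the reply above concrete, here is a minimal userspace sketch of the "flag bit instead of bool plus spinlock" idea. It is not kernel code: the kernel helpers test_bit()/set_bit()/clear_bit() (and test_and_set_bit(), which combines the check and the set) are emulated here with C11 atomics, and the struct, the bit number and every `sketch_`/`conflict_logging_` name are assumptions made up for illustration. The thread itself does not establish that any atomicity is needed for this field.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit number; struct btrfs_log_ctx has no such flags word today. */
#define LOG_CTX_LOGGING_CONFLICT_INODES 0

/* Hypothetical stand-in for the context, with the booleans folded into one word. */
struct log_ctx_sketch {
	atomic_ulong flags;
};

/* Userspace analogue of test_and_set_bit(): set the bit atomically, return its old value. */
static bool sketch_test_and_set_bit(unsigned int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(addr, mask) & mask) != 0;
}

/* Userspace analogue of clear_bit(). */
static void sketch_clear_bit(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_and(addr, ~(1UL << nr));
}

/* Returns true if the caller is the first to enter, false if logging is already in progress. */
static bool conflict_logging_enter(struct log_ctx_sketch *ctx)
{
	return !sketch_test_and_set_bit(LOG_CTX_LOGGING_CONFLICT_INODES, &ctx->flags);
}

/* Clears the flag once the conflicting inodes have been processed. */
static void conflict_logging_exit(struct log_ctx_sketch *ctx)
{
	sketch_clear_bit(LOG_CTX_LOGGING_CONFLICT_INODES, &ctx->flags);
}
```

Because the check and the set happen in one atomic read-modify-write, there is no window between them and no separate lock to initialize, take, or release.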
On Fri, Nov 1, 2024 at 3:52 AM Hao-ran Zheng <zhenghaoran@buaa.edu.cn> wrote:
>
> The Data Race occurs when the `log_conflicting_inodes()` function is
> executed in different threads at the same time. When one thread assigns
> a value to `ctx->logging_conflict_inodes` while another thread performs
> an `if(ctx->logging_conflict_inodes)` judgment or modifies it at the
> same time, a data contention problem may arise.

No, there's no problem at all.
A log context is thread local, it's never shared between threads.

>
> Further, an atomicity violation may also occur here. Consider the
> following case, when a thread A `if(ctx->logging_conflict_inodes)`
> passes the judgment, the execution switches to another thread B, at
> which time the value of `ctx->logging_conflict_inodes` has not yet
> been assigned true, which would result in multiple threads executing
> `log_conflicting_inodes()`.

No. When you make such claims, please provide a sequence diagram that
shows how the tasks interact, what their call stacks are, so that we
can see where the race happens.

But again, this is completely wrong because a log context (struct
btrfs_log_ctx) is never shared between threads.

Thanks.

>
> To address this issue, it is recommended to add locks to protect
> `logging_conflict_inodes` in the `btrfs_log_ctx` structure, and lock
> protection during assignment and judgment. This modification ensures
> that the value of `ctx->logging_conflict_inodes` does not change during
> the validation process, thereby maintaining its integrity.
>
> Signed-off-by: Hao-ran Zheng <zhenghaoran@buaa.edu.cn>
> ---
>  fs/btrfs/tree-log.c | 7 +++++++
>  fs/btrfs/tree-log.h | 1 +
>  2 files changed, 8 insertions(+)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 9637c7cdc0cf..9cdbf280ca9a 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -2854,6 +2854,7 @@ void btrfs_init_log_ctx(struct btrfs_log_ctx *ctx, struct btrfs_inode *inode)
>  	INIT_LIST_HEAD(&ctx->conflict_inodes);
>  	ctx->num_conflict_inodes = 0;
>  	ctx->logging_conflict_inodes = false;
> +	spin_lock_init(&ctx->logging_conflict_inodes_lock);
>  	ctx->scratch_eb = NULL;
>  }
>
> @@ -5779,16 +5780,20 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
>  				  struct btrfs_log_ctx *ctx)
>  {
>  	int ret = 0;
> +	unsigned long logging_conflict_inodes_flags;
>
>  	/*
>  	 * Conflicting inodes are logged by the first call to btrfs_log_inode(),
>  	 * otherwise we could have unbounded recursion of btrfs_log_inode()
>  	 * calls. This check guarantees we can have only 1 level of recursion.
>  	 */
> +	spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);

Even if this was remotely correct, why the irqsave? The fsync code is
never called under irq context.

>  	if (ctx->logging_conflict_inodes)
> +		spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>  		return 0;
>
>  	ctx->logging_conflict_inodes = true;
> +	spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>
>  	/*
>  	 * New conflicting inodes may be found and added to the list while we
> @@ -5869,7 +5874,9 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
>  			break;
>  	}
>
> +	spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>  	ctx->logging_conflict_inodes = false;
> +	spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>  	if (ret)
>  		free_conflicting_inodes(ctx);
>
> diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
> index dc313e6bb2fa..0f862d0c80f2 100644
> --- a/fs/btrfs/tree-log.h
> +++ b/fs/btrfs/tree-log.h
> @@ -44,6 +44,7 @@ struct btrfs_log_ctx {
>  	struct list_head conflict_inodes;
>  	int num_conflict_inodes;
>  	bool logging_conflict_inodes;
> +	spinlock_t logging_conflict_inodes_lock;
>  	/*
>  	 * Used for fsyncs that need to copy items from the subvolume tree to
>  	 * the log tree (full sync flag set or copy everything flag set) to
> --
> 2.34.1
>
>
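The objection in the reply above (the flag guards recursion within one task, not concurrency between tasks) can be pictured with a small single-threaded sketch. Everything in it is hypothetical: the struct is a trimmed-down stand-in for struct btrfs_log_ctx, the `to_discover` and `max_depth` fields exist only to drive and observe the illustration, and placing the context on the caller's stack is an assumption about how "thread local" is achieved.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct btrfs_log_ctx. */
struct log_ctx_sketch {
	bool logging_conflict_inodes;  /* recursion guard, a plain bool */
	int num_conflict_inodes;       /* conflicting inodes waiting to be logged */
	int to_discover;               /* illustration only: conflicts still to be found */
	int max_depth;                 /* illustration only: deepest nesting observed */
};

static void log_inode_sketch(struct log_ctx_sketch *ctx, int depth);

static void log_conflicting_inodes_sketch(struct log_ctx_sketch *ctx, int depth)
{
	/* Nested call from the same task: the outer call is already draining the list. */
	if (ctx->logging_conflict_inodes)
		return;

	ctx->logging_conflict_inodes = true;
	while (ctx->num_conflict_inodes > 0) {
		ctx->num_conflict_inodes--;
		log_inode_sketch(ctx, depth + 1);
	}
	ctx->logging_conflict_inodes = false;
}

static void log_inode_sketch(struct log_ctx_sketch *ctx, int depth)
{
	if (depth > ctx->max_depth)
		ctx->max_depth = depth;

	/* Logging a conflicting inode may discover yet another conflict. */
	if (depth > 0 && ctx->to_discover > 0) {
		ctx->to_discover--;
		ctx->num_conflict_inodes++;
	}
	log_conflicting_inodes_sketch(ctx, depth);
}

/* One "fsync": the context lives on this task's stack and is only passed down by pointer. */
static int fsync_sketch(int initial_conflicts, int discovered_later)
{
	struct log_ctx_sketch ctx = {
		.logging_conflict_inodes = false,
		.num_conflict_inodes = initial_conflicts,
		.to_discover = discovered_later,
		.max_depth = 0,
	};

	log_inode_sketch(&ctx, 0);
	return ctx.max_depth;
}
```

However many conflicts are discovered along the way, the nesting never goes deeper than one level, and no other task ever holds a pointer to `ctx`, so in this model a plain bool is sufficient.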
Hi Hao-ran,

kernel test robot noticed the following build errors:

[auto build test ERROR on kdave/for-next]
[also build test ERROR on linus/master v6.12-rc5 next-20241101]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Hao-ran-Zheng/btrfs-Fix-data-race-in-log_conflicting_inodes/20241101-115429
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link:    https://lore.kernel.org/r/20241101035133.925251-1-zhenghaoran%40buaa.edu.cn
patch subject: [PATCH] btrfs: Fix data race in log_conflicting_inodes
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20241102/202411021443.lsHICRJl-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021443.lsHICRJl-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021443.lsHICRJl-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/sched.h:2145,
                    from fs/btrfs/tree-log.c:6:
   fs/btrfs/tree-log.c: In function 'log_conflicting_inodes':
>> fs/btrfs/tree-log.c:5790:33: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
    5790 |         spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                 ^~~~~~~~~~~~~~~~~~~~
   include/linux/spinlock.h:244:48: note: in definition of macro 'raw_spin_lock_irqsave'
     244 |                 flags = _raw_spin_lock_irqsave(lock);   \
         |                                                ^~~~
   fs/btrfs/tree-log.c:5790:9: note: in expansion of macro 'spin_lock_irqsave'
    5790 |         spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |         ^~~~~~~~~~~~~~~~~
   fs/btrfs/tree-log.c:5792:46: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
    5792 |                 spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                              ^~~~~~~~~~~~~~~~~~~~
         |                                              conflict_inodes
>> fs/btrfs/tree-log.c:5791:9: warning: this 'if' clause does not guard... [-Wmisleading-indentation]
    5791 |         if (ctx->logging_conflict_inodes)
         |         ^~
   fs/btrfs/tree-log.c:5793:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'
    5793 |                 return 0;
         |                 ^~~~~~
   fs/btrfs/tree-log.c:5796:38: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
    5796 |         spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                      ^~~~~~~~~~~~~~~~~~~~
         |                                      conflict_inodes
   fs/btrfs/tree-log.c:5877:33: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
    5877 |         spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                 ^~~~~~~~~~~~~~~~~~~~
   include/linux/spinlock.h:244:48: note: in definition of macro 'raw_spin_lock_irqsave'
     244 |                 flags = _raw_spin_lock_irqsave(lock);   \
         |                                                ^~~~
   fs/btrfs/tree-log.c:5877:9: note: in expansion of macro 'spin_lock_irqsave'
    5877 |         spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |         ^~~~~~~~~~~~~~~~~
   fs/btrfs/tree-log.c:5879:38: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
    5879 |         spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                      ^~~~~~~~~~~~~~~~~~~~
         |                                      conflict_inodes

vim +5790 fs/btrfs/tree-log.c

  5777
  5778  static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
  5779                                    struct btrfs_root *root,
  5780                                    struct btrfs_log_ctx *ctx)
  5781  {
  5782          int ret = 0;
  5783          unsigned long logging_conflict_inodes_flags;
  5784
  5785          /*
  5786           * Conflicting inodes are logged by the first call to btrfs_log_inode(),
  5787           * otherwise we could have unbounded recursion of btrfs_log_inode()
  5788           * calls. This check guarantees we can have only 1 level of recursion.
  5789           */
> 5790          spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> 5791          if (ctx->logging_conflict_inodes)
  5792                  spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
  5793                  return 0;
  5794
  5795          ctx->logging_conflict_inodes = true;
  5796          spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
  5797
  5798          /*
  5799           * New conflicting inodes may be found and added to the list while we
  5800           * are logging a conflicting inode, so keep iterating while the list is
  5801           * not empty.
  5802           */
  5803          while (!list_empty(&ctx->conflict_inodes)) {
  5804                  struct btrfs_ino_list *curr;
  5805                  struct inode *inode;
  5806                  u64 ino;
  5807                  u64 parent;
  5808
  5809                  curr = list_first_entry(&ctx->conflict_inodes,
  5810                                          struct btrfs_ino_list, list);
  5811                  ino = curr->ino;
  5812                  parent = curr->parent;
  5813                  list_del(&curr->list);
  5814                  kfree(curr);
  5815
  5816                  inode = btrfs_iget_logging(ino, root);
  5817                  /*
  5818                   * If the other inode that had a conflicting dir entry was
  5819                   * deleted in the current transaction, we need to log its parent
  5820                   * directory. See the comment at add_conflicting_inode().
  5821                   */
  5822                  if (IS_ERR(inode)) {
  5823                          ret = PTR_ERR(inode);
  5824                          if (ret != -ENOENT)
  5825                                  break;
  5826
  5827                          inode = btrfs_iget_logging(parent, root);
  5828                          if (IS_ERR(inode)) {
  5829                                  ret = PTR_ERR(inode);
  5830                                  break;
  5831                          }
  5832
  5833                          /*
  5834                           * Always log the directory, we cannot make this
  5835                           * conditional on need_log_inode() because the directory
  5836                           * might have been logged in LOG_INODE_EXISTS mode or
  5837                           * the dir index of the conflicting inode is not in a
  5838                           * dir index key range logged for the directory. So we
  5839                           * must make sure the deletion is recorded.
  5840                           */
  5841                          ret = btrfs_log_inode(trans, BTRFS_I(inode),
  5842                                                LOG_INODE_ALL, ctx);
  5843                          btrfs_add_delayed_iput(BTRFS_I(inode));
  5844                          if (ret)
  5845                                  break;
  5846                          continue;
  5847                  }
  5848
  5849                  /*
  5850                   * Here we can use need_log_inode() because we only need to log
  5851                   * the inode in LOG_INODE_EXISTS mode and rename operations
  5852                   * update the log, so that the log ends up with the new name and
  5853                   * without the old name.
  5854                   *
  5855                   * We did this check at add_conflicting_inode(), but here we do
  5856                   * it again because if some other task logged the inode after
  5857                   * that, we can avoid doing it again.
  5858                   */
  5859                  if (!need_log_inode(trans, BTRFS_I(inode))) {
  5860                          btrfs_add_delayed_iput(BTRFS_I(inode));
  5861                          continue;
  5862                  }
  5863
  5864                  /*
  5865                   * We are safe logging the other inode without acquiring its
  5866                   * lock as long as we log with the LOG_INODE_EXISTS mode. We
  5867                   * are safe against concurrent renames of the other inode as
  5868                   * well because during a rename we pin the log and update the
  5869                   * log with the new name before we unpin it.
  5870                   */
  5871                  ret = btrfs_log_inode(trans, BTRFS_I(inode), LOG_INODE_EXISTS, ctx);
  5872                  btrfs_add_delayed_iput(BTRFS_I(inode));
  5873                  if (ret)
  5874                          break;
  5875          }
  5876
  5877          spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
  5878          ctx->logging_conflict_inodes = false;
  5879          spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
  5880          if (ret)
  5881                  free_conflicting_inodes(ctx);
  5882
  5883          return ret;
  5884  }
  5885
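The report above flags two separate problems. The errors come from a name mismatch: the patch declares and initializes `logging_conflict_inodes_lock` but locks `conflict_inodes_lock`. The -Wmisleading-indentation warning points at a logic bug: without braces only the unlock is guarded by the `if`, so `return 0` runs on every call. The following is a hedged userspace sketch of that second problem; the struct, the counting "lock" helpers and the return value 1 are assumptions used only to make the effect observable, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the context; lock_depth counts lock minus unlock calls. */
struct ctx_sketch {
	bool logging_conflict_inodes;
	int lock_depth;
};

static void lock_sketch(struct ctx_sketch *c)   { c->lock_depth++; }
static void unlock_sketch(struct ctx_sketch *c) { c->lock_depth--; }

/* The posted hunk, re-indented the way the compiler actually parses it. */
static int enter_as_parsed(struct ctx_sketch *c)
{
	lock_sketch(c);
	if (c->logging_conflict_inodes)
		unlock_sketch(c);
	return 0;   /* runs on every call, so the rest of the function is never reached */
}

/* Presumed intent: braces make both the unlock and the return conditional. */
static int enter_with_braces(struct ctx_sketch *c)
{
	lock_sketch(c);
	if (c->logging_conflict_inodes) {
		unlock_sketch(c);
		return 0;
	}
	c->logging_conflict_inodes = true;
	unlock_sketch(c);
	return 1;   /* hypothetical value meaning "caller should now do the work" */
}
```

In the as-parsed variant a first call with the flag clear returns 0 while still holding the lock and without setting the flag, which means the function would never log anything and would leak the lock. This is independent of whether a lock is warranted in the first place.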
Hi Hao-ran,

kernel test robot noticed the following build errors:

[auto build test ERROR on kdave/for-next]
[also build test ERROR on linus/master v6.12-rc5 next-20241101]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Hao-ran-Zheng/btrfs-Fix-data-race-in-log_conflicting_inodes/20241101-115429
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link:    https://lore.kernel.org/r/20241101035133.925251-1-zhenghaoran%40buaa.edu.cn
patch subject: [PATCH] btrfs: Fix data race in log_conflicting_inodes
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20241102/202411021448.6pjzV4h1-lkp@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021448.6pjzV4h1-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021448.6pjzV4h1-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   In file included from fs/btrfs/tree-log.c:8:
   In file included from include/linux/blkdev.h:9:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:8:
   In file included from include/linux/cacheflush.h:5:
   In file included from arch/x86/include/asm/cacheflush.h:5:
   In file included from include/linux/mm.h:2213:
   include/linux/vmstat.h:504:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     504 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     505 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:511:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     511 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     512 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:524:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     524 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     525 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
>> fs/btrfs/tree-log.c:5790:26: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
    5790 |         spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                 ^~~~~~~~~~~~~~~~~~~~
         |                                 conflict_inodes
   include/linux/spinlock.h:381:39: note: expanded from macro 'spin_lock_irqsave'
     381 |         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
         |                                              ^
   include/linux/spinlock.h:244:34: note: expanded from macro 'raw_spin_lock_irqsave'
     244 |                 flags = _raw_spin_lock_irqsave(lock);   \
         |                                                ^
   fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
      44 |         struct list_head conflict_inodes;
         |                          ^
   fs/btrfs/tree-log.c:5792:32: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
    5792 |                 spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                              ^~~~~~~~~~~~~~~~~~~~
         |                                              conflict_inodes
   fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
      44 |         struct list_head conflict_inodes;
         |                          ^
>> fs/btrfs/tree-log.c:5793:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation]
    5793 |                 return 0;
         |                 ^
   fs/btrfs/tree-log.c:5791:2: note: previous statement is here
    5791 |         if (ctx->logging_conflict_inodes)
         |         ^
   fs/btrfs/tree-log.c:5796:31: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
    5796 |         spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                      ^~~~~~~~~~~~~~~~~~~~
         |                                      conflict_inodes
   fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
      44 |         struct list_head conflict_inodes;
         |                          ^
   fs/btrfs/tree-log.c:5877:26: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
    5877 |         spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                 ^~~~~~~~~~~~~~~~~~~~
         |                                 conflict_inodes
   include/linux/spinlock.h:381:39: note: expanded from macro 'spin_lock_irqsave'
     381 |         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
         |                                              ^
   include/linux/spinlock.h:244:34: note: expanded from macro 'raw_spin_lock_irqsave'
     244 |                 flags = _raw_spin_lock_irqsave(lock);   \
         |                                                ^
   fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
      44 |         struct list_head conflict_inodes;
         |                          ^
   fs/btrfs/tree-log.c:5879:31: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
    5879 |         spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
         |                                      ^~~~~~~~~~~~~~~~~~~~
         |                                      conflict_inodes
   fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
      44 |         struct list_head conflict_inodes;
         |                          ^
   5 warnings and 5 errors generated.

vim +5790 fs/btrfs/tree-log.c

  5777
  5778  static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
  5779                                    struct btrfs_root *root,
  5780                                    struct btrfs_log_ctx *ctx)
  5781  {
  5782          int ret = 0;
  5783          unsigned long logging_conflict_inodes_flags;
  5784
  5785          /*
  5786           * Conflicting inodes are logged by the first call to btrfs_log_inode(),
  5787           * otherwise we could have unbounded recursion of btrfs_log_inode()
  5788           * calls. This check guarantees we can have only 1 level of recursion.
  5789           */
> 5790          spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
  5791          if (ctx->logging_conflict_inodes)
  5792                  spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> 5793                  return 0;
  5794
  5795          ctx->logging_conflict_inodes = true;
  5796          spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
  5797
  5798          /*
  5799           * New conflicting inodes may be found and added to the list while we
  5800           * are logging a conflicting inode, so keep iterating while the list is
  5801           * not empty.
  5802           */
  5803          while (!list_empty(&ctx->conflict_inodes)) {
  5804                  struct btrfs_ino_list *curr;
  5805                  struct inode *inode;
  5806                  u64 ino;
  5807                  u64 parent;
  5808
  5809                  curr = list_first_entry(&ctx->conflict_inodes,
  5810                                          struct btrfs_ino_list, list);
  5811                  ino = curr->ino;
  5812                  parent = curr->parent;
  5813                  list_del(&curr->list);
  5814                  kfree(curr);
  5815
  5816                  inode = btrfs_iget_logging(ino, root);
  5817                  /*
  5818                   * If the other inode that had a conflicting dir entry was
  5819                   * deleted in the current transaction, we need to log its parent
  5820                   * directory. See the comment at add_conflicting_inode().
  5821                   */
  5822                  if (IS_ERR(inode)) {
  5823                          ret = PTR_ERR(inode);
  5824                          if (ret != -ENOENT)
  5825                                  break;
  5826
  5827                          inode = btrfs_iget_logging(parent, root);
  5828                          if (IS_ERR(inode)) {
  5829                                  ret = PTR_ERR(inode);
  5830                                  break;
  5831                          }
  5832
  5833                          /*
  5834                           * Always log the directory, we cannot make this
  5835                           * conditional on need_log_inode() because the directory
  5836                           * might have been logged in LOG_INODE_EXISTS mode or
  5837                           * the dir index of the conflicting inode is not in a
  5838                           * dir index key range logged for the directory. So we
  5839                           * must make sure the deletion is recorded.
  5840                           */
  5841                          ret = btrfs_log_inode(trans, BTRFS_I(inode),
  5842                                                LOG_INODE_ALL, ctx);
  5843                          btrfs_add_delayed_iput(BTRFS_I(inode));
  5844                          if (ret)
  5845                                  break;
  5846                          continue;
  5847                  }
  5848
  5849                  /*
  5850                   * Here we can use need_log_inode() because we only need to log
  5851                   * the inode in LOG_INODE_EXISTS mode and rename operations
  5852                   * update the log, so that the log ends up with the new name and
  5853                   * without the old name.
  5854                   *
  5855                   * We did this check at add_conflicting_inode(), but here we do
  5856                   * it again because if some other task logged the inode after
  5857                   * that, we can avoid doing it again.
  5858                   */
  5859                  if (!need_log_inode(trans, BTRFS_I(inode))) {
  5860                          btrfs_add_delayed_iput(BTRFS_I(inode));
  5861                          continue;
  5862                  }
  5863
  5864                  /*
  5865                   * We are safe logging the other inode without acquiring its
  5866                   * lock as long as we log with the LOG_INODE_EXISTS mode. We
  5867                   * are safe against concurrent renames of the other inode as
  5868                   * well because during a rename we pin the log and update the
  5869                   * log with the new name before we unpin it.
  5870                   */
  5871                  ret = btrfs_log_inode(trans, BTRFS_I(inode), LOG_INODE_EXISTS, ctx);
  5872                  btrfs_add_delayed_iput(BTRFS_I(inode));
  5873                  if (ret)
  5874                          break;
  5875          }
  5876
  5877          spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
  5878          ctx->logging_conflict_inodes = false;
  5879          spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
  5880          if (ret)
  5881                  free_conflicting_inodes(ctx);
  5882
  5883          return ret;
  5884  }
  5885
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 9637c7cdc0cf..9cdbf280ca9a 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2854,6 +2854,7 @@ void btrfs_init_log_ctx(struct btrfs_log_ctx *ctx, struct btrfs_inode *inode)
 	INIT_LIST_HEAD(&ctx->conflict_inodes);
 	ctx->num_conflict_inodes = 0;
 	ctx->logging_conflict_inodes = false;
+	spin_lock_init(&ctx->logging_conflict_inodes_lock);
 	ctx->scratch_eb = NULL;
 }

@@ -5779,16 +5780,20 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
 				  struct btrfs_log_ctx *ctx)
 {
 	int ret = 0;
+	unsigned long logging_conflict_inodes_flags;

 	/*
 	 * Conflicting inodes are logged by the first call to btrfs_log_inode(),
 	 * otherwise we could have unbounded recursion of btrfs_log_inode()
 	 * calls. This check guarantees we can have only 1 level of recursion.
 	 */
+	spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
 	if (ctx->logging_conflict_inodes)
+		spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
 		return 0;

 	ctx->logging_conflict_inodes = true;
+	spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);

 	/*
 	 * New conflicting inodes may be found and added to the list while we
@@ -5869,7 +5874,9 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
 			break;
 	}

+	spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
 	ctx->logging_conflict_inodes = false;
+	spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
 	if (ret)
 		free_conflicting_inodes(ctx);

diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
index dc313e6bb2fa..0f862d0c80f2 100644
--- a/fs/btrfs/tree-log.h
+++ b/fs/btrfs/tree-log.h
@@ -44,6 +44,7 @@ struct btrfs_log_ctx {
 	struct list_head conflict_inodes;
 	int num_conflict_inodes;
 	bool logging_conflict_inodes;
+	spinlock_t logging_conflict_inodes_lock;
 	/*
 	 * Used for fsyncs that need to copy items from the subvolume tree to
 	 * the log tree (full sync flag set or copy everything flag set) to

The Data Race occurs when the `log_conflicting_inodes()` function is
executed in different threads at the same time. When one thread assigns
a value to `ctx->logging_conflict_inodes` while another thread performs
an `if(ctx->logging_conflict_inodes)` judgment or modifies it at the
same time, a data contention problem may arise.

Further, an atomicity violation may also occur here. Consider the
following case, when a thread A `if(ctx->logging_conflict_inodes)`
passes the judgment, the execution switches to another thread B, at
which time the value of `ctx->logging_conflict_inodes` has not yet
been assigned true, which would result in multiple threads executing
`log_conflicting_inodes()`.

To address this issue, it is recommended to add locks to protect
`logging_conflict_inodes` in the `btrfs_log_ctx` structure, and lock
protection during assignment and judgment. This modification ensures
that the value of `ctx->logging_conflict_inodes` does not change during
the validation process, thereby maintaining its integrity.

Signed-off-by: Hao-ran Zheng <zhenghaoran@buaa.edu.cn>
---
 fs/btrfs/tree-log.c | 7 +++++++
 fs/btrfs/tree-log.h | 1 +
 2 files changed, 8 insertions(+)