diff mbox series

btrfs: fix a potential racy

Message ID 20200419015907.15503-1-wu000273@umn.edu (mailing list archive)
State New, archived
Headers show
Series btrfs: fix a potential racy | expand

Commit Message

wu000273@umn.edu April 19, 2020, 1:59 a.m. UTC
From: Qiushi Wu <wu000273@umn.edu>

In function reada_find_extent and reada_extent_put, kref_get(&zone->refcnt)
are not called in a lock context. Potential racy may happen. It's possible
that thread1 decreases the kref to 0, and thread2 increases the kref to 1,
then thread1 releases the pointer.

Signed-off-by: Qiushi Wu <wu000273@umn.edu>
---
 fs/btrfs/reada.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Comments

kernel test robot April 22, 2020, 5:54 p.m. UTC | #1
Hi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on btrfs/next]
[also build test WARNING on v5.7-rc2 next-20200421]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/wu000273-umn-edu/btrfs-fix-a-potential-racy/20200421-072124
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git next
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day GCC_VERSION=9.3.0 make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   fs/btrfs/reada.c: In function 'reada_extent_put':
>> fs/btrfs/reada.c:509:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
     509 |   struct reada_zone *zone = re->zones[i];
         |   ^~~~~~

vim +509 fs/btrfs/reada.c

7414a03fbf9e75 Arne Jansen        2011-05-23  485  
7414a03fbf9e75 Arne Jansen        2011-05-23  486  static void reada_extent_put(struct btrfs_fs_info *fs_info,
7414a03fbf9e75 Arne Jansen        2011-05-23  487  			     struct reada_extent *re)
7414a03fbf9e75 Arne Jansen        2011-05-23  488  {
7414a03fbf9e75 Arne Jansen        2011-05-23  489  	int i;
09cbfeaf1a5a67 Kirill A. Shutemov 2016-04-01  490  	unsigned long index = re->logical >> PAGE_SHIFT;
7414a03fbf9e75 Arne Jansen        2011-05-23  491  
7414a03fbf9e75 Arne Jansen        2011-05-23  492  	spin_lock(&fs_info->reada_lock);
99621b44aa194e Al Viro            2012-08-29  493  	if (--re->refcnt) {
7414a03fbf9e75 Arne Jansen        2011-05-23  494  		spin_unlock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen        2011-05-23  495  		return;
7414a03fbf9e75 Arne Jansen        2011-05-23  496  	}
7414a03fbf9e75 Arne Jansen        2011-05-23  497  
7414a03fbf9e75 Arne Jansen        2011-05-23  498  	radix_tree_delete(&fs_info->reada_tree, index);
7414a03fbf9e75 Arne Jansen        2011-05-23  499  	for (i = 0; i < re->nzones; ++i) {
7414a03fbf9e75 Arne Jansen        2011-05-23  500  		struct reada_zone *zone = re->zones[i];
7414a03fbf9e75 Arne Jansen        2011-05-23  501  
7414a03fbf9e75 Arne Jansen        2011-05-23  502  		radix_tree_delete(&zone->device->reada_extents, index);
7414a03fbf9e75 Arne Jansen        2011-05-23  503  	}
7414a03fbf9e75 Arne Jansen        2011-05-23  504  
7414a03fbf9e75 Arne Jansen        2011-05-23  505  	spin_unlock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen        2011-05-23  506  
7414a03fbf9e75 Arne Jansen        2011-05-23  507  	for (i = 0; i < re->nzones; ++i) {
1255ee642bf9b0 Qiushi Wu          2020-04-18  508  		spin_lock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen        2011-05-23 @509  		struct reada_zone *zone = re->zones[i];
7414a03fbf9e75 Arne Jansen        2011-05-23  510  		kref_get(&zone->refcnt);
1255ee642bf9b0 Qiushi Wu          2020-04-18  511  		spin_unlock(&fs_info->reada_lock);
1255ee642bf9b0 Qiushi Wu          2020-04-18  512  
7414a03fbf9e75 Arne Jansen        2011-05-23  513  		spin_lock(&zone->lock);
7414a03fbf9e75 Arne Jansen        2011-05-23  514  		--zone->elems;
7414a03fbf9e75 Arne Jansen        2011-05-23  515  		if (zone->elems == 0) {
7414a03fbf9e75 Arne Jansen        2011-05-23  516  			/* no fs_info->reada_lock needed, as this can't be
7414a03fbf9e75 Arne Jansen        2011-05-23  517  			 * the last ref */
7414a03fbf9e75 Arne Jansen        2011-05-23  518  			kref_put(&zone->refcnt, reada_zone_release);
7414a03fbf9e75 Arne Jansen        2011-05-23  519  		}
7414a03fbf9e75 Arne Jansen        2011-05-23  520  		spin_unlock(&zone->lock);
7414a03fbf9e75 Arne Jansen        2011-05-23  521  
7414a03fbf9e75 Arne Jansen        2011-05-23  522  		spin_lock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen        2011-05-23  523  		kref_put(&zone->refcnt, reada_zone_release);
7414a03fbf9e75 Arne Jansen        2011-05-23  524  		spin_unlock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen        2011-05-23  525  	}
7414a03fbf9e75 Arne Jansen        2011-05-23  526  
7414a03fbf9e75 Arne Jansen        2011-05-23  527  	kfree(re);
7414a03fbf9e75 Arne Jansen        2011-05-23  528  }
7414a03fbf9e75 Arne Jansen        2011-05-23  529  

:::::: The code at line 509 was first introduced by commit
:::::: 7414a03fbf9e75fbbf2a3c16828cd862e572aa44 btrfs: initial readahead code and prototypes

:::::: TO: Arne Jansen <sensille@gmx.net>
:::::: CC: Arne Jansen <sensille@gmx.net>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
David Sterba April 22, 2020, 9:45 p.m. UTC | #2
On Sat, Apr 18, 2020 at 08:59:07PM -0500, wu000273@umn.edu wrote:
> From: Qiushi Wu <wu000273@umn.edu>
> 
> In function reada_find_extent and reada_extent_put, kref_get(&zone->refcnt)
> are not called in a lock context. Potential racy may happen. It's possible
> that thread1 decreases the kref to 0, and thread2 increases the kref to 1,
> then thread1 releases the pointer.

There are several things I don't see or understand why they woudl be a
problem.

kref_get does not need to be take under locks in case it's not the first
reference or if it's clear that eg. the caller has taken a reference and
it'll never go to 0.

The references are supposed to be lightweight so taking the locks per
kref_get does not follow a good pattern.

The kref type is a wrapper around refcount_t, that detects if there's an
increment from 0->1, similar to what you suggest that could happen. It
might be tricky to actually hit that case so I'm not saying that it's
all ok, just that we haven't seen that so far.

Your description of the race needs to be expanded as it's too terse,
where exactly the increments could happen, how would be the calls in the
two cpus interleaved, like

cpu1				cpu2

				reada_find_extent()
				   kref_get
				   ...
reada_find_extent()
   				   kref_put (last put, refs == 0)
   kref_get (inc from zero)
   ...

Then it's easer to reason about the actual races and the context where
it could happen. Eg. btrfs_reada_add adds one reference from the
beginning, so the final put does not happen in the middle of
reada_find_extent unless something would be seriously broken and that
would manifest very clearly.
diff mbox series

Patch

diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c
index 243a2e44526e..4b90d04f7a0a 100644
--- a/fs/btrfs/reada.c
+++ b/fs/btrfs/reada.c
@@ -361,13 +361,15 @@  static struct reada_extent *reada_find_extent(struct btrfs_fs_info *fs_info,
 		zone = reada_find_zone(dev, logical, bbio);
 		if (!zone)
 			continue;
-
+		spin_lock(&fs_info->reada_lock);
 		re->zones[re->nzones++] = zone;
 		spin_lock(&zone->lock);
 		if (!zone->elems)
 			kref_get(&zone->refcnt);
+		spin_unlock(&fs_info->reada_lock);
 		++zone->elems;
 		spin_unlock(&zone->lock);
+
 		spin_lock(&fs_info->reada_lock);
 		kref_put(&zone->refcnt, reada_zone_release);
 		spin_unlock(&fs_info->reada_lock);
@@ -458,8 +460,11 @@  static struct reada_extent *reada_find_extent(struct btrfs_fs_info *fs_info,
 	for (nzones = 0; nzones < re->nzones; ++nzones) {
 		struct reada_zone *zone;
 
+		spin_lock(&fs_info->reada_lock);
 		zone = re->zones[nzones];
 		kref_get(&zone->refcnt);
+		spin_unlock(&fs_info->reada_lock);
+
 		spin_lock(&zone->lock);
 		--zone->elems;
 		if (zone->elems == 0) {
@@ -502,9 +507,11 @@  static void reada_extent_put(struct btrfs_fs_info *fs_info,
 	spin_unlock(&fs_info->reada_lock);
 
 	for (i = 0; i < re->nzones; ++i) {
+		spin_lock(&fs_info->reada_lock);
 		struct reada_zone *zone = re->zones[i];
-
 		kref_get(&zone->refcnt);
+		spin_unlock(&fs_info->reada_lock);
+
 		spin_lock(&zone->lock);
 		--zone->elems;
 		if (zone->elems == 0) {