diff mbox series

[V2] fuse: Fix VM_BUG_ON_PAGE issue while accessing zero ref count page

Message ID 1598452035-3472-1-git-send-email-ppvk@codeaurora.org (mailing list archive)
State New, archived
Headers show
Series [V2] fuse: Fix VM_BUG_ON_PAGE issue while accessing zero ref count page | expand

Commit Message

Pradeep P V K Aug. 26, 2020, 2:27 p.m. UTC
There is a potential race between fuse_abort_conn() and
fuse_copy_page() as shown below, due to which VM_BUG_ON_PAGE
crash is observed.

context#1:                      context#2:
fuse_dev_do_read()              fuse_abort_conn()
->fuse_copy_args()               ->end_requests()
 ->fuse_copy_pages()              ->request_end()
   ->fuse_copy_page()               ->fuse_writepage_end)
     ->fuse_ref_page()                ->fuse_writepage_free()
                                        ->__free_page()
					   ->put_page_testzero()

     ->get_page()
     ->VM_BUG_ON_PAGE()

This results in below crash as when ->put_page_testzero() in context#2
decrease the page reference and get_page() in context#1 accessed it
with zero page reference count.

[  174.391095]  (1)[10406:Thread-6]page dumped because:
VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u))
[  174.391113]  (1)[10406:Thread-6]page allocated via order 0,
migratetype Unmovable, gfp_mask
0x620042(GFP_NOFS|__GFP_HIGHMEM|__GFP_HARDWALL), pid 261, ts
174390946312 ns

[  174.391135]  (1)[10406:Thread-6] prep_new_page+0x13c/0x210
[  174.391148]  (1)[10406:Thread-6] get_page_from_freelist+0x21ac/0x2370
[  174.391161]  (1)[10406:Thread-6] __alloc_pages_nodemask+0x244/0x14a8
[  174.391176]  (1)[10406:Thread-6] fuse_writepages_fill+0x150/0x708
[  174.391190]  (1)[10406:Thread-6] write_cache_pages+0x3d8/0x550
[  174.391202]  (1)[10406:Thread-6] fuse_writepages+0x94/0x130
[  174.391214]  (1)[10406:Thread-6] do_writepages+0x74/0x140
[  174.391228]  (1)[10406:Thread-6] __writeback_single_inode+0x168/0x788
[  174.391239]  (1)[10406:Thread-6] writeback_sb_inodes+0x56c/0xab8
[  174.391251]  (1)[10406:Thread-6] __writeback_inodes_wb+0x94/0x180
[  174.391262]  (1)[10406:Thread-6] wb_writeback+0x318/0x618
[  174.391274]  (1)[10406:Thread-6] wb_workfn+0x468/0x828
[  174.391290]  (1)[10406:Thread-6] process_one_work+0x3d0/0x720
[  174.391302]  (1)[10406:Thread-6] worker_thread+0x234/0x4c0
[  174.391314]  (1)[10406:Thread-6] kthread+0x144/0x158
[  174.391327]  (1)[10406:Thread-6] ret_from_fork+0x10/0x1c
[  174.391363]  (1)[10406:Thread-6]------------[ cut here ]------------
[  174.391371]  (1)[10406:Thread-6]kernel BUG at include/linux/mm.h:980!
[  174.391381]  (1)[10406:Thread-6]Internal error: Oops - BUG: 0 [#1]
...
[  174.486928]  (1)[10406:Thread-6]pc : fuse_copy_page+0x750/0x790
[  174.493029]  (1)[10406:Thread-6]lr : fuse_copy_page+0x750/0x790
[  174.718831]  (1)[10406:Thread-6] fuse_copy_page+0x750/0x790
[  174.718838]  (1)[10406:Thread-6] fuse_copy_args+0xb4/0x1e8
[  174.718843]  (1)[10406:Thread-6] fuse_dev_do_read+0x424/0x888
[  174.718848]  (1)[10406:Thread-6] fuse_dev_splice_read+0x94/0x200
[  174.718856]  (1)[10406:Thread-6] __arm64_sys_splice+0x874/0xb20
[  174.718864]  (1)[10406:Thread-6] el0_svc_common+0xc8/0x240
[  174.718869]  (1)[10406:Thread-6] el0_svc_handler+0x6c/0x88
[  174.718875]  (1)[10406:Thread-6] el0_svc+0x8/0xc
[  174.778853]  (1)[10406:Thread-6]Kernel panic - not syncing: Fatal

Fix this by protecting fuse_copy_pages() with fc->lock.

Changes since V1:
- Modified the logic as per kernel v5.9-rc1.
- Added Reported by tag.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
---
 fs/fuse/dev.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Matthew Wilcox Aug. 27, 2020, 12:34 p.m. UTC | #1
On Wed, Aug 26, 2020 at 07:57:15PM +0530, Pradeep P V K wrote:
> Fix this by protecting fuse_copy_pages() with fc->lock.

No.  This is a spinlock and fuse_copy_pages() can allocate memory
with GFP_KERNEL.  You need to enable more debugging on your test
system.
Pradeep P V K Aug. 28, 2020, 12:59 p.m. UTC | #2
On 2020-08-27 18:04, Matthew Wilcox wrote:
> On Wed, Aug 26, 2020 at 07:57:15PM +0530, Pradeep P V K wrote:
>> Fix this by protecting fuse_copy_pages() with fc->lock.
> 
> No.  This is a spinlock and fuse_copy_pages() can allocate memory
> with GFP_KERNEL.  You need to enable more debugging on your test
> system.

Thanks Matthew for the review and comments. I will address this in my
next patch set. BTW, can you please share your thoughts, if there is a
way to extract fuse_connection Data Structure from fuse_copy_state DS ?
This would really help to use fc->lock only on intended function 
(fuse_ref_page()).
diff mbox series

Patch

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 02b3c36..ff9f88e 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1258,9 +1258,11 @@  static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	spin_unlock(&fpq->lock);
 	cs->req = req;
 	err = fuse_copy_one(cs, &req->in.h, sizeof(req->in.h));
+	spin_lock(&fc->lock);
 	if (!err)
 		err = fuse_copy_args(cs, args->in_numargs, args->in_pages,
 				     (struct fuse_arg *) args->in_args, 0);
+	spin_unlock(&fc->lock);
 	fuse_copy_finish(cs);
 	spin_lock(&fpq->lock);
 	clear_bit(FR_LOCKED, &req->flags);
@@ -1893,8 +1895,11 @@  static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
 
 	if (oh.error)
 		err = nbytes != sizeof(oh) ? -EINVAL : 0;
-	else
+	else {
+		spin_lock(&fc->lock);
 		err = copy_out_args(cs, req->args, nbytes);
+		spin_unlock(&fc->lock);
+	}
 	fuse_copy_finish(cs);
 
 	spin_lock(&fpq->lock);