File copy to FAT FS on NVDIMM hits BUG_ON at fs/buffer.c:3305!

Message ID 20170726171101.GA15980@bombadil.infradead.org (mailing list archive)
State New, archived

Commit Message

Matthew Wilcox July 26, 2017, 5:11 p.m. UTC
On Wed, Jul 26, 2017 at 06:23:08PM +0900, OGAWA Hirofumi wrote:
> The locking of this path seems to be broken. The guy familiar to
> bdev_write_page() path will made real fix though, The following patch
> should be explaining enough what is wrong.
> 
> In short, clean_buffers() must be called before unlocking lock_page().

Thanks for that.  This should fix the problem while not leaking the
unlock_page call outside bdev_write_page.

--- 8< ---

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>

Comments

Kani, Toshi July 27, 2017, 4:12 p.m. UTC | #1
On Wed, 2017-07-26 at 10:11 -0700, Matthew Wilcox wrote:
> On Wed, Jul 26, 2017 at 06:23:08PM +0900, OGAWA Hirofumi wrote:
> > The locking of this path seems to be broken. The guy familiar to
> > bdev_write_page() path will made real fix though, The following
> > patch should be explaining enough what is wrong.
> >
> > In short, clean_buffers() must be called before unlocking
> > lock_page().
>
> Thanks for that.  This should fix the problem while not leaking the
> unlock_page call outside bdev_write_page.
>
> --- 8< ---
>
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>

Thanks Willy and Hirofumi for the quick fix!  I've tested the change,
and it works fine.

Tested-by: Toshi Kani <toshi.kani@hpe.com>


Out of curiosity, I am still wondering about the following: 
 - Why is this issue exposed with FAT FS, but not with other FSs?
 - Why did I not see this issue with BTT?

Thanks,
-Toshi

Ross Zwisler July 27, 2017, 6:03 p.m. UTC | #2
On Thu, Jul 27, 2017 at 04:12:18PM +0000, Kani, Toshimitsu wrote:
> On Wed, 2017-07-26 at 10:11 -0700, Matthew Wilcox wrote:
> > On Wed, Jul 26, 2017 at 06:23:08PM +0900, OGAWA Hirofumi wrote:
> > > The locking of this path seems to be broken. The guy familiar to
> > > bdev_write_page() path will made real fix though, The following
> > > patch should be explaining enough what is wrong.
> > > 
> > > In short, clean_buffers() must be called before unlocking
> > > lock_page().
> > 
> > Thanks for that.  This should fix the problem while not leaking the
> > unlock_page call outside bdev_write_page.
> > 
> > --- 8< ---
> > 
> > Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> 
> Thanks Willy and Hirofumi for the quick fix!  I've tested the change,
> and it works fine.
> 
> Tested-by: Toshi Kani <toshi.kani@hpe.com>
> 
> Out of curiosity, I am still wondering about the following: 
>  - Why is this issue exposed with FAT FS, but not with other FSs?
>  - Why did I not see this issue with BTT?

I also didn't see it with FAT + BRD, so I think the answer to both of the
above is just that the timings were different.

Kani, Toshi July 27, 2017, 6:20 p.m. UTC | #3
On Thu, 2017-07-27 at 12:03 -0600, Ross Zwisler wrote:
> On Thu, Jul 27, 2017 at 04:12:18PM +0000, Kani, Toshimitsu wrote:
> > On Wed, 2017-07-26 at 10:11 -0700, Matthew Wilcox wrote:
> > > On Wed, Jul 26, 2017 at 06:23:08PM +0900, OGAWA Hirofumi wrote:
> > > > The locking of this path seems to be broken. The guy familiar
> > > > to bdev_write_page() path will made real fix though, The
> > > > following patch should be explaining enough what is wrong.
> > > >
> > > > In short, clean_buffers() must be called before unlocking
> > > > lock_page().
> > >
> > > Thanks for that.  This should fix the problem while not leaking
> > > the unlock_page call outside bdev_write_page.
> > >
> > > --- 8< ---
> > >
> > > Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> >
> > Thanks Willy and Hirofumi for the quick fix!  I've tested the
> > change, and it works fine.
> >
> > Tested-by: Toshi Kani <toshi.kani@hpe.com>
> >
> > Out of curiosity, I am still wondering about the following:
> >  - Why is this issue exposed with FAT FS, but not with other FSs?
> >  - Why did I not see this issue with BTT?
>
> I also didn't see it with FAT + BRD, so I think the answer to both of
> the above is just that the timings were different.

Yeah, that may well be the case.  Timing-dependent bugs like this are
just tough to test for.

Thanks!
-Toshi

Kani, Toshi Sept. 20, 2017, 9:04 p.m. UTC | #4
On Thu, 2017-07-27 at 16:12 +0000, Kani, Toshimitsu wrote:
> On Wed, 2017-07-26 at 10:11 -0700, Matthew Wilcox wrote:
> > On Wed, Jul 26, 2017 at 06:23:08PM +0900, OGAWA Hirofumi wrote:
> > > The locking of this path seems to be broken. The guy familiar to
> > > bdev_write_page() path will made real fix though, The following
> > > patch should be explaining enough what is wrong.
> > >
> > > In short, clean_buffers() must be called before unlocking
> > > lock_page().
> >
> > Thanks for that.  This should fix the problem while not leaking the
> > unlock_page call outside bdev_write_page.
> >
> > --- 8< ---
> >
> > Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
>
> Thanks Willy and Hirofumi for the quick fix!  I've tested the change,
> and it works fine.
>
> Tested-by: Toshi Kani <toshi.kani@hpe.com>

Since removing rw_page would have to wait till 4.15, can we go with this
fix for 4.14?  I confirmed that Matthew's patch [1] applies cleanly to
4.14-rc1.  This patch can also be applied to stable kernels.

Matthew, can you resend this patch with a description?

[1] https://www.spinics.net/lists/linux-fsdevel/msg113835.html

Thanks,
-Toshi

Patch

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 9941dc8342df..3fbe75bdd257 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -716,10 +716,12 @@  int bdev_write_page(struct block_device *bdev, sector_t sector,
 
 	set_page_writeback(page);
 	result = ops->rw_page(bdev, sector + get_start_sect(bdev), page, true);
-	if (result)
+	if (result) {
 		end_page_writeback(page);
-	else
+	} else {
+		clean_page_buffers(page);
 		unlock_page(page);
+	}
 	blk_queue_exit(bdev->bd_queue);
 	return result;
 }
diff --git a/fs/mpage.c b/fs/mpage.c
index 2e4c41ccb5c9..d97b003f1607 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -468,6 +468,16 @@  static void clean_buffers(struct page *page, unsigned first_unmapped)
 		try_to_free_buffers(page);
 }
 
+/*
+ * For situations where we want to clean all buffers attached to a page.
+ * We don't need to calculate how many buffers are attached to the page,
+ * we just need to specify a number larger than the maximum number of buffers.
+ */
+void clean_page_buffers(struct page *page)
+{
+	clean_buffers(page, PAGE_SIZE);
+}
+
 static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
 		      void *data)
 {
@@ -605,10 +615,8 @@  static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
 	if (bio == NULL) {
 		if (first_unmapped == blocks_per_page) {
 			if (!bdev_write_page(bdev, blocks[0] << (blkbits - 9),
-								page, wbc)) {
-				clean_buffers(page, first_unmapped);
+								page, wbc))
 				goto out;
-			}
 		}
 		bio = mpage_alloc(bdev, blocks[0] << (blkbits - 9),
 				BIO_MAX_PAGES, GFP_NOFS|__GFP_HIGH);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index c8dae555eccf..446b24cac67d 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -232,6 +232,7 @@  int generic_write_end(struct file *, struct address_space *,
 				loff_t, unsigned, unsigned,
 				struct page *, void *);
 void page_zero_new_buffers(struct page *page, unsigned from, unsigned to);
+void clean_page_buffers(struct page *page);
 int cont_write_begin(struct file *, struct address_space *, loff_t,
 			unsigned, unsigned, struct page **, void **,
 			get_block_t *, loff_t *);
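
For readers following the thread, the ordering the patch enforces in
bdev_write_page() (clean the buffers first, then unlock the page) can be
sketched in plain userspace C.  This is only an illustration under stated
assumptions, not kernel code: every mock_* name is hypothetical, a bool
stands in for the page lock, the assert() plays the role of a BUG_ON-style
check, and 4096 stands in for PAGE_SIZE as "a number larger than the
maximum number of buffers".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct buffer_head and struct page. */
struct mock_bh {
	bool dirty;
	struct mock_bh *this_page;	/* circular list, in the spirit of b_this_page */
};

struct mock_page {
	bool locked;			/* models the page lock */
	struct mock_bh *buffers;	/* NULL when no buffers are attached */
};

/* Clear the dirty flag on at most 'limit' buffers, walking the ring once. */
static void mock_clean_buffers(struct mock_page *page, unsigned limit)
{
	unsigned n = 0;
	struct mock_bh *head = page->buffers;
	struct mock_bh *bh = head;

	assert(page->locked);		/* buffers may only be touched under the lock */
	if (!head)
		return;
	do {
		if (n++ == limit)
			break;
		bh->dirty = false;
		bh = bh->this_page;
	} while (bh != head);
}

/* Any limit above the real buffer count cleans the whole ring. */
static void mock_clean_page_buffers(struct mock_page *page)
{
	mock_clean_buffers(page, 4096);	/* assumption: stands in for PAGE_SIZE */
}

/* On success, clean first and only then drop the lock. */
static int mock_write_page(struct mock_page *page, int io_result)
{
	assert(page->locked);
	if (io_result)
		return io_result;	/* failure: page stays locked for the caller */
	mock_clean_page_buffers(page);
	page->locked = false;
	return 0;
}
```

Swapping the last two statements of the success path in mock_write_page()
would trip the assert() in mock_clean_buffers(), which is the userspace
analogue of cleaning buffers on a page that is no longer locked.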