Message ID | 20160718060215.GB16044@dastard (mailing list archive) |
---|---|
State | Accepted, archived |
On 07/17/2016 11:02 PM, Dave Chinner wrote:
> On Sun, Jul 17, 2016 at 10:00:03AM +1000, Dave Chinner wrote:
>> On Fri, Jul 15, 2016 at 05:18:02PM -0700, Calvin Owens wrote:
>>> Hello all,
>>>
>>> I've found a nasty source of slab corruption. Based on seeing similar symptoms
>>> on boxes at Facebook, I suspect it's been around since at least 3.10.
>>>
>>> It only reproduces under memory pressure so far as I can tell: the issue seems
>>> to be that XFS reclaims pages from buffers that are still in use by
>>> scsi/block. I'm not sure which side the bug lies on, but I've only observed it
>>> with XFS.
> [....]
>> But this indicates that the page is under writeback at this point,
>> so that tends to indicate that the above freeing was incorrect.
>>
>> Hmmm - it's clear we've got direct reclaim involved here, and the
>> suspicion of a dirty page that has had it's bufferheads cleared.
>> Are there any other warnings in the log from XFS prior to kasan
>> throwing the error?
>
> Can you try the patch below?

Thanks for getting this out so quickly :)

So far so good: I booted Linus' tree as of this morning and reproduced the ASAN
splat. After applying your patch I haven't triggered it.

I'm a bit wary since it was hard to trigger reliably in the first place... so I
lined up a few dozen boxes to run the test case overnight. I'll confirm in the
morning (-0700) they look good.

Thanks,
Calvin

> -Dave.

On 07/18/2016 07:05 PM, Calvin Owens wrote:
> On 07/17/2016 11:02 PM, Dave Chinner wrote:
>> On Sun, Jul 17, 2016 at 10:00:03AM +1000, Dave Chinner wrote:
>>> On Fri, Jul 15, 2016 at 05:18:02PM -0700, Calvin Owens wrote:
>>>> Hello all,
>>>>
>>>> I've found a nasty source of slab corruption. Based on seeing similar symptoms
>>>> on boxes at Facebook, I suspect it's been around since at least 3.10.
>>>>
>>>> It only reproduces under memory pressure so far as I can tell: the issue seems
>>>> to be that XFS reclaims pages from buffers that are still in use by
>>>> scsi/block. I'm not sure which side the bug lies on, but I've only observed it
>>>> with XFS.
>> [....]
>>> But this indicates that the page is under writeback at this point,
>>> so that tends to indicate that the above freeing was incorrect.
>>>
>>> Hmmm - it's clear we've got direct reclaim involved here, and the
>>> suspicion of a dirty page that has had it's bufferheads cleared.
>>> Are there any other warnings in the log from XFS prior to kasan
>>> throwing the error?
>>
>> Can you try the patch below?
>
> Thanks for getting this out so quickly :)
>
> So far so good: I booted Linus' tree as of this morning and reproduced the ASAN
> splat. After applying your patch I haven't triggered it.
>
> I'm a bit wary since it was hard to trigger reliably in the first place... so I
> lined up a few dozen boxes to run the test case overnight. I'll confirm in the
> morning (-0700) they look good.

All right, my testcase ran 2099 times overnight without triggering anything.

For the overnight tests, I booted the boxes with "mem=" to artificially limit RAM,
which makes my repro *much* more reliable (I feel silly for not thinking of that
in the first place). With that setup, I hit the ASAN splat 21 times in 98 runs on
vanilla 4.7-rc7. So I'm sold.

Tested-by: Calvin Owens <calvinowens@fb.com>

Again, really appreciate the quick response :)

Thanks,
Calvin

> Thanks,
> Calvin
>
>> -Dave.
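
As a rough check on why 2099 clean runs are persuasive, one can treat each run as an independent trial with the failure rate observed on vanilla 4.7-rc7 under the "mem=" setup (21 of 98). Independence between runs is an assumption made here for illustration, not something stated in the thread:

```latex
\hat{p} = \frac{21}{98} \approx 0.214, \qquad
P(\text{0 failures in 2099 runs}) = (1 - \hat{p})^{2099}
  = \left(\frac{77}{98}\right)^{2099} \approx 10^{-220}
```

Under that assumption, seeing no splat by chance alone with the bug still present would be vanishingly unlikely.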

On Tue, Jul 19, 2016 at 02:22:47PM -0700, Calvin Owens wrote:
> On 07/18/2016 07:05 PM, Calvin Owens wrote:
> >On 07/17/2016 11:02 PM, Dave Chinner wrote:
> >>On Sun, Jul 17, 2016 at 10:00:03AM +1000, Dave Chinner wrote:
> >>>On Fri, Jul 15, 2016 at 05:18:02PM -0700, Calvin Owens wrote:
> >>>>Hello all,
> >>>>
> >>>>I've found a nasty source of slab corruption. Based on seeing similar symptoms
> >>>>on boxes at Facebook, I suspect it's been around since at least 3.10.
> >>>>
> >>>>It only reproduces under memory pressure so far as I can tell: the issue seems
> >>>>to be that XFS reclaims pages from buffers that are still in use by
> >>>>scsi/block. I'm not sure which side the bug lies on, but I've only observed it
> >>>>with XFS.
> >>[....]
> >>>But this indicates that the page is under writeback at this point,
> >>>so that tends to indicate that the above freeing was incorrect.
> >>>
> >>>Hmmm - it's clear we've got direct reclaim involved here, and the
> >>>suspicion of a dirty page that has had it's bufferheads cleared.
> >>>Are there any other warnings in the log from XFS prior to kasan
> >>>throwing the error?
> >>
> >>Can you try the patch below?
> >
> >Thanks for getting this out so quickly :)
> >
> >So far so good: I booted Linus' tree as of this morning and reproduced the ASAN
> >splat. After applying your patch I haven't triggered it.
> >
> >I'm a bit wary since it was hard to trigger reliably in the first place... so I
> >lined up a few dozen boxes to run the test case overnight. I'll confirm in the
> >morning (-0700) they look good.
>
> All right, my testcase ran 2099 times overnight without triggering anything.
>
> For the overnight tests, I booted the boxes with "mem=" to artificially limit RAM,
> which makes my repro *much* more reliable (I feel silly for not thinking of that
> in the first place). With that setup, I hit the ASAN splat 21 times in 98 runs on
> vanilla 4.7-rc7. So I'm sold.
>
> Tested-by: Calvin Owens <calvinowens@fb.com>

Thanks for testing, Calvin. I'll update the patch and get it reviewed
and committed.

Cheers,

Dave.

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 80714eb..0cfb944 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -87,6 +87,12 @@ xfs_find_bdev_for_inode(
  * We're now finished for good with this page. Update the page state via the
  * associated buffer_heads, paying attention to the start and end offsets that
  * we need to process on the page.
+ *
+ * Landmine Warning: bh->b_end_io() will call end_page_writeback() on the last
+ * buffer in the IO. Once it does this, it is unsafe to access the bufferhead or
+ * the page at all, as we may be racing with memory reclaim and it can free both
+ * the bufferhead chain and the page as it will see the page as clean and
+ * unused.
  */
 static void
 xfs_finish_page_writeback(
@@ -95,8 +101,9 @@ xfs_finish_page_writeback(
 	int			error)
 {
 	unsigned int		end = bvec->bv_offset + bvec->bv_len - 1;
-	struct buffer_head	*head, *bh;
+	struct buffer_head	*head, *bh, *next;
 	unsigned int		off = 0;
+	unsigned int		bsize;
 
 	ASSERT(bvec->bv_offset < PAGE_SIZE);
 	ASSERT((bvec->bv_offset & ((1 << inode->i_blkbits) - 1)) == 0);
@@ -105,15 +112,17 @@ xfs_finish_page_writeback(
 
 	bh = head = page_buffers(bvec->bv_page);
 
+	bsize = bh->b_size;
 	do {
+		next = bh->b_this_page;
 		if (off < bvec->bv_offset)
 			goto next_bh;
 		if (off > end)
 			break;
 		bh->b_end_io(bh, !error);
 next_bh:
-		off += bh->b_size;
-	} while ((bh = bh->b_this_page) != head);
+		off += bsize;
+	} while ((bh = next) != head);
 }
 
 /*
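
The pattern the patch applies can be sketched outside the kernel as follows. This is only an illustration, not the XFS code: `struct node`, `complete_node()`, `finish_ring()` and `run_demo()` are hypothetical names, and the "free" performed by reclaim is simulated by poisoning the node's fields, so that any stale read after the callback would be visible. The loop caches the size and the next pointer before invoking the callback and never dereferences the node afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head: a ring linked via `next`. */
struct node {
    struct node *next;   /* plays the role of b_this_page */
    unsigned int size;   /* plays the role of b_size */
    int completed;       /* set once the completion callback has run */
};

/*
 * Hypothetical completion callback. It poisons the node to simulate the
 * node being freed by another party as soon as completion is signalled.
 */
static void complete_node(struct node *n)
{
    n->completed = 1;
    n->next = NULL;        /* poison: a stale read would follow NULL */
    n->size = 0xdeadbeefu; /* poison: a stale read would corrupt `off` */
}

/*
 * Walk the ring and complete every node whose byte offset lies in
 * [start, end] (end inclusive). Returns the number of nodes completed.
 */
static unsigned int finish_ring(struct node *head, unsigned int start,
                                unsigned int end)
{
    struct node *n = head, *next;
    unsigned int off = 0, done = 0;
    unsigned int size;

    assert(head != NULL);
    size = head->size;          /* cached before any callback runs */

    do {
        next = n->next;         /* cached before the callback */
        if (off > end)
            break;
        if (off >= start) {
            complete_node(n);   /* n must not be dereferenced after this */
            done++;
        }
        off += size;            /* uses the cached size, not n->size */
    } while ((n = next) != head);

    return done;
}

/* Build a four-node ring of 512-byte blocks and complete a byte range. */
static unsigned int run_demo(unsigned int start, unsigned int end)
{
    struct node ring[4];
    size_t i;

    for (i = 0; i < 4; i++) {
        ring[i].next = &ring[(i + 1) % 4];
        ring[i].size = 512;
        ring[i].completed = 0;
    }
    return finish_ring(&ring[0], start, end);
}
```

Calling `run_demo(0, 2047)` walks all four nodes. Replacing `off += size` with `off += n->size`, or advancing with `n = n->next`, would read the poisoned fields, which is the userspace analogue of the use-after-free the patch removes.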