Message ID | 1766807082.14812757.1575377439007.JavaMail.zimbra@redhat.com (mailing list archive) |
---|---|
State | Deferred, archived |
Series | [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later |
On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> My theory is that there's a race in iomap. There appear to be
> interleaved calls to iomap_set_range_uptodate() for same page
> with varying offset and length. Each call sees bitmap as _not_
> entirely "uptodate" and hence doesn't call SetPageUptodate().
> Even though each bit in bitmap ends up uptodate by the time
> all calls finish.

Weird.  That should be prevented by the page lock that all callers
of iomap_set_range_uptodate hold.  But in case I miss something, does
the patch below trigger?  If not it is not just a race, but might
be some weird ordering problem with the bitops, especially if it
only triggers on ppc, which is very weakly ordered.

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index d33c7bc5ee92..25e942c71590 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 	unsigned int i;
 	bool uptodate = true;
 
+	WARN_ON_ONCE(!PageLocked(page));
+
 	if (iop) {
 		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
 			if (i >= first && i <= last)
----- Original Message -----
> On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> > My theory is that there's a race in iomap. There appear to be
> > interleaved calls to iomap_set_range_uptodate() for same page
> > with varying offset and length. Each call sees bitmap as _not_
> > entirely "uptodate" and hence doesn't call SetPageUptodate().
> > Even though each bit in bitmap ends up uptodate by the time
> > all calls finish.
>
> Weird.  That should be prevented by the page lock that all callers
> of iomap_set_range_uptodate hold.  But in case I miss something, does
> the patch below trigger?  If not it is not just a race, but might
> be some weird ordering problem with the bitops, especially if it
> only triggers on ppc, which is very weakly ordered.
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index d33c7bc5ee92..25e942c71590 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
>  	unsigned int i;
>  	bool uptodate = true;
>
> +	WARN_ON_ONCE(!PageLocked(page));
> +
>  	if (iop) {
>  		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
>  			if (i >= first && i <= last)
>

Hit it pretty quick this time:

# uptime
 09:27:42 up 22 min, 2 users, load average: 0.09, 13.38, 26.18

# /mnt/testarea/ltp/testcases/bin/genbessel
Bus error (core dumped)

# dmesg | grep -i -e warn -e call
[ 0.000000] dt-cpu-ftrs: not enabling: system-call-vectored (disabled or unsupported by kernel)
[ 0.000000] random: get_random_u64 called from cache_random_seq_create+0x98/0x1e0 with crng_init=0
[ 0.000000] rcu: Offload RCU callbacks from CPUs: (none).
[ 5.312075] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[ 5.357307] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[ 5.485126] megaraid_sas 0031:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000

So, extra WARN_ON_ONCE applied on top of v5.4-8836-g81b6b96475ac
did not trigger.

Is it possible for iomap code to submit multiple bio-s for same
locked page and then receive callbacks in parallel?
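
To make the theory concrete, here is a minimal userspace sketch of the interleaving it describes, assuming two completion callbacks for the same page can run at the same time. Nothing here is kernel code: `call_state`, `call_run` and `simulate_interleaving` are hypothetical names, a plain 32-bit word stands in for the `iop->uptodate` bitmap, and the 16 blocks per page are an assumption (64k page, 4k blocks). Each call walks all blocks in one loop, setting the bits in its own range and testing the others, as the loop in the quoted diff context does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NBLOCKS 16u	/* assumption: 64k page with 4k blocks */

/* Hypothetical model of one in-flight "set range uptodate" call. */
struct call_state {
	unsigned first, last;	/* block range this call marks uptodate */
	unsigned next;		/* next block index the loop will visit */
	bool uptodate;		/* the call's local "all blocks uptodate" flag */
};

static void call_init(struct call_state *c, unsigned first, unsigned last)
{
	assert(first <= last && last < NBLOCKS);
	c->first = first;
	c->last = last;
	c->next = 0;
	c->uptodate = true;
}

/* Run at most `steps` loop iterations of the call against the shared bitmap. */
static void call_run(struct call_state *c, uint32_t *bitmap, unsigned steps)
{
	while (steps > 0 && c->next < NBLOCKS) {
		unsigned i = c->next++;

		if (i >= c->first && i <= c->last)
			*bitmap |= UINT32_C(1) << i;
		else if (!(*bitmap & (UINT32_C(1) << i)))
			c->uptodate = false;
		steps--;
	}
}

/*
 * Replay one particular schedule. Returns true if either call would have
 * marked the whole page uptodate; stores the final bitmap in *bitmap_out.
 */
static bool simulate_interleaving(uint32_t *bitmap_out)
{
	uint32_t bitmap = 0;
	struct call_state a, b;

	call_init(&a, 0, 7);		/* call A covers blocks 0-7 */
	call_init(&b, 8, 15);		/* call B covers blocks 8-15 */

	call_run(&b, &bitmap, 1);	/* B tests block 0: still clear */
	call_run(&a, &bitmap, NBLOCKS);	/* A sets 0-7, sees 8-15 clear */
	call_run(&b, &bitmap, NBLOCKS);	/* B tests 1-7, then sets 8-15 */

	*bitmap_out = bitmap;
	return a.uptodate || b.uptodate;
}
```

With this schedule every bit in the bitmap ends up set, yet neither call finishes with its local flag still true, so neither would mark the page uptodate.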
On Tue, Dec 03, 2019 at 09:35:28AM -0500, Jan Stancek wrote:
>
> ----- Original Message -----
> > On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> > > My theory is that there's a race in iomap. There appear to be
> > > interleaved calls to iomap_set_range_uptodate() for same page
> > > with varying offset and length. Each call sees bitmap as _not_
> > > entirely "uptodate" and hence doesn't call SetPageUptodate().
> > > Even though each bit in bitmap ends up uptodate by the time
> > > all calls finish.
> >
> > Weird.  That should be prevented by the page lock that all callers
> > of iomap_set_range_uptodate hold.  But in case I miss something, does
> > the patch below trigger?  If not it is not just a race, but might
> > be some weird ordering problem with the bitops, especially if it
> > only triggers on ppc, which is very weakly ordered.
> >
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index d33c7bc5ee92..25e942c71590 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
> >  	unsigned int i;
> >  	bool uptodate = true;
> >
> > +	WARN_ON_ONCE(!PageLocked(page));
> > +
> >  	if (iop) {
> >  		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
> >  			if (i >= first && i <= last)
> >
>
> Hit it pretty quick this time:
>
> # uptime
>  09:27:42 up 22 min, 2 users, load average: 0.09, 13.38, 26.18
>
> # /mnt/testarea/ltp/testcases/bin/genbessel
> Bus error (core dumped)
>
> # dmesg | grep -i -e warn -e call
> [ 0.000000] dt-cpu-ftrs: not enabling: system-call-vectored (disabled or unsupported by kernel)
> [ 0.000000] random: get_random_u64 called from cache_random_seq_create+0x98/0x1e0 with crng_init=0
> [ 0.000000] rcu: Offload RCU callbacks from CPUs: (none).
> [ 5.312075] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
> [ 5.357307] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
> [ 5.485126] megaraid_sas 0031:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000
>
> So, extra WARN_ON_ONCE applied on top of v5.4-8836-g81b6b96475ac
> did not trigger.
>
> Is it possible for iomap code to submit multiple bio-s for same
> locked page and then receive callbacks in parallel?

Yes, if (say) you have 64k pages on a 4k-block filesystem and the extent
mappings for the 16 blocks in a page aren't contiguous, then iomap will
issue separate bios for each physical fragment it finds.  iomap will call
submit_bio on those bios whenever it thinks it's done filling the bio, so
you can indeed get multiple callbacks in parallel.

--D
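
As a rough illustration of the case described above (a 64k page holding 16 blocks of 4k whose physical locations are not contiguous), the sketch below counts the physically contiguous runs in a page's block mapping. `count_fragments` is a hypothetical helper, not an iomap function, and the array of physical block numbers is made-up input. It only models "one bio per physical fragment"; anything else that may influence how bios are built is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: phys[i] is the physical block number backing
 * logical block i of the page. A new fragment starts at block 0 and
 * wherever a block does not directly follow its predecessor on disk.
 */
static unsigned count_fragments(const uint64_t *phys, size_t nblocks)
{
	unsigned frags = 0;

	assert(phys != NULL || nblocks == 0);
	for (size_t i = 0; i < nblocks; i++) {
		if (i == 0 || phys[i] != phys[i - 1] + 1)
			frags++;
	}
	return frags;
}
```

A page whose first eight blocks sit at physical blocks 100-107 and whose last eight sit at 500-507 yields two fragments under this model, so two bios whose completions may arrive independently.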
Please try the patch below:

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 512856a88106..340c15400423 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -28,6 +28,7 @@
 struct iomap_page {
 	atomic_t		read_count;
 	atomic_t		write_count;
+	spinlock_t		uptodate_lock;
 	DECLARE_BITMAP(uptodate, PAGE_SIZE / 512);
 };
 
@@ -51,6 +52,7 @@ iomap_page_create(struct inode *inode, struct page *page)
 	iop = kmalloc(sizeof(*iop), GFP_NOFS | __GFP_NOFAIL);
 	atomic_set(&iop->read_count, 0);
 	atomic_set(&iop->write_count, 0);
+	spin_lock_init(&iop->uptodate_lock);
 	bitmap_zero(iop->uptodate, PAGE_SIZE / SECTOR_SIZE);
 
 	/*
@@ -139,25 +141,38 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 }
 
 static void
-iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 {
 	struct iomap_page *iop = to_iomap_page(page);
 	struct inode *inode = page->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
-	unsigned int i;
 	bool uptodate = true;
+	unsigned long flags;
+	unsigned int i;
 
-	if (iop) {
-		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
-			if (i >= first && i <= last)
-				set_bit(i, iop->uptodate);
-			else if (!test_bit(i, iop->uptodate))
-				uptodate = false;
-		}
+	spin_lock_irqsave(&iop->uptodate_lock, flags);
+	for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
+		if (i >= first && i <= last)
+			set_bit(i, iop->uptodate);
+		else if (!test_bit(i, iop->uptodate))
+			uptodate = false;
 	}
 
-	if (uptodate && !PageError(page))
+	if (uptodate)
+		SetPageUptodate(page);
+	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
+}
+
+static void
+iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+{
+	if (PageError(page))
+		return;
+
+	if (page_has_private(page))
+		iomap_iop_set_range_uptodate(page, off, len);
+	else
 		SetPageUptodate(page);
 }
 
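
The effect of the lock in the patch above can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel code: `page_model`, `set_range_locked` and `serialized_order_ok` are hypothetical names, and the spinlock is not reproduced. Instead, each call runs to completion as one indivisible unit, which is the guarantee the lock is meant to give. With two calls covering blocks 0-7 and 8-15, that leaves only two possible orders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NBLOCKS 16u	/* assumption: 64k page with 4k blocks */

/* Hypothetical stand-in for the per-page state. */
struct page_model {
	uint32_t uptodate;	/* one bit per block */
	bool page_uptodate;	/* models the page-level uptodate flag */
};

/*
 * Body of the critical section. The caller is assumed to hold the lock,
 * so the set/test loop and the final flag update cannot interleave with
 * another call on the same page.
 */
static void set_range_locked(struct page_model *pg, unsigned first,
			     unsigned last)
{
	bool uptodate = true;

	assert(first <= last && last < NBLOCKS);
	for (unsigned i = 0; i < NBLOCKS; i++) {
		if (i >= first && i <= last)
			pg->uptodate |= UINT32_C(1) << i;
		else if (!(pg->uptodate & (UINT32_C(1) << i)))
			uptodate = false;
	}
	if (uptodate)
		pg->page_uptodate = true;
}

/* Run the two calls in one of the two serialized orders. */
static bool serialized_order_ok(bool a_first)
{
	struct page_model pg = { .uptodate = 0, .page_uptodate = false };

	if (a_first) {
		set_range_locked(&pg, 0, 7);
		set_range_locked(&pg, 8, 15);
	} else {
		set_range_locked(&pg, 8, 15);
		set_range_locked(&pg, 0, 7);
	}
	return pg.page_uptodate;
}
```

In both orders the call that runs second sees every other bit already set, so the page-level flag always ends up set once all blocks are uptodate.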
----- Original Message -----
> Please try the patch below:
I ran the reproducer for 18 hours on 2 systems where it previously reproduced;
there were no crashes / SIGBUS.
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e25901ae3ff4..abe37031c93d 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -131,7 +131,11 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
 			if (i >= first && i <= last)
 				set_bit(i, iop->uptodate);
-			else if (!test_bit(i, iop->uptodate))
+		}
+		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
+			if (i >= first && i <= last)
+				continue;
+			if (!test_bit(i, iop->uptodate))
 				uptodate = false;
 		}
 	}
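
The diff above splits the loop so that a call sets all of its own bits before it tests any others. The sketch below compares that two-pass ordering with the original single-pass loop by enumerating every interleaving of two calls in a small userspace model. All names are hypothetical, the bitmap is shrunk to 4 blocks to keep the enumeration small, and the model assumes each step takes effect immediately and in program order (sequential consistency). It says nothing about weakly ordered hardware, which is a separate question raised earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NBLOCKS 4u	/* assumption: shrunk so the enumeration stays small */

struct call { unsigned first, last, step; bool uptodate; };
typedef void (*step_fn)(struct call *, uint32_t *);

/* Original loop: one pass that sets in-range bits and tests the rest. */
static void step_single_pass(struct call *c, uint32_t *bitmap)
{
	unsigned i = c->step++;

	if (i >= c->first && i <= c->last)
		*bitmap |= UINT32_C(1) << i;
	else if (!(*bitmap & (UINT32_C(1) << i)))
		c->uptodate = false;
}

/* Split loop: steps 0..NBLOCKS-1 only set, the next NBLOCKS only test. */
static void step_two_pass(struct call *c, uint32_t *bitmap)
{
	unsigned i = c->step % NBLOCKS;
	bool in_range = i >= c->first && i <= c->last;

	if (c->step < NBLOCKS) {
		if (in_range)
			*bitmap |= UINT32_C(1) << i;
	} else if (!in_range && !(*bitmap & (UINT32_C(1) << i))) {
		c->uptodate = false;
	}
	c->step++;
}

/*
 * Count the interleavings of calls a and b in which neither call ends
 * with its local flag still true (nobody would mark the page uptodate).
 */
static unsigned count_missed(step_fn fn, unsigned nsteps, struct call a,
			     struct call b, uint32_t bitmap)
{
	unsigned missed = 0;

	assert(a.step <= nsteps && b.step <= nsteps);
	if (a.step == nsteps && b.step == nsteps)
		return !(a.uptodate || b.uptodate);
	if (a.step < nsteps) {
		struct call a2 = a;
		uint32_t bm = bitmap;

		fn(&a2, &bm);
		missed += count_missed(fn, nsteps, a2, b, bm);
	}
	if (b.step < nsteps) {
		struct call b2 = b;
		uint32_t bm = bitmap;

		fn(&b2, &bm);
		missed += count_missed(fn, nsteps, a, b2, bm);
	}
	return missed;
}

static unsigned missed_single_pass(void)
{
	struct call a = { 0, 1, 0, true }, b = { 2, 3, 0, true };

	return count_missed(step_single_pass, NBLOCKS, a, b, 0);
}

static unsigned missed_two_pass(void)
{
	struct call a = { 0, 1, 0, true }, b = { 2, 3, 0, true };

	return count_missed(step_two_pass, 2 * NBLOCKS, a, b, 0);
}
```

In this model the single-pass loop has interleavings where nobody marks the page uptodate, while the two-pass ordering has none: whichever call finishes setting its bits last goes on to see every bit set.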