diff mbox

[BUG] xfs/305 hangs 4.10-rc4 kernel

Message ID 20170130215952.GA11230@bfoster.bfoster (mailing list archive)
State Deferred, archived
Headers show

Commit Message

Brian Foster Jan. 30, 2017, 9:59 p.m. UTC
On Mon, Jan 30, 2017 at 01:12:24PM -0500, Brian Foster wrote:
> On Fri, Jan 27, 2017 at 10:52:19AM +0800, Eryu Guan wrote:
> > On Thu, Jan 26, 2017 at 01:44:28PM -0500, Brian Foster wrote:
> > > On Thu, Jan 26, 2017 at 11:29:50AM +0800, Eryu Guan wrote:
> > > > On Wed, Jan 25, 2017 at 09:52:16AM -0500, Brian Foster wrote:
> > > > > On Wed, Jan 25, 2017 at 02:39:43PM +0800, Eryu Guan wrote:
> > > > > > Hi all,
> > > > > > 
> > > > > > I hit another test hang triggered by running xfs/305 in a loop, it
> > > > > > usually reproduces within 50 iterations. It also involves disabling
> > > > > > quota, which looks similar to the generic/232 hang[1]. Perhaps they
> > > > > > share the same root cause?
> > > > > > 
> > > > > > [333853.198159] xfs_quota       D    0 13880  13441 0x00000080
> > > > > > [333853.227479] Call Trace:
> > > > > > [333853.239429]  __schedule+0x21c/0x6b0
> > > > > > [333853.255410]  schedule+0x36/0x80
> > > > > > [333853.269932]  schedule_timeout+0x1d1/0x3a0
> > > > > > [333853.288354]  ? lock_timer_base+0xa0/0xa0
> > > > > > [333853.306296]  ? xfs_qm_need_dqattach+0x80/0x80 [xfs]
> > > > > > [333853.329015]  schedule_timeout_uninterruptible+0x1f/0x30
> > > > > > [333853.353344]  xfs_qm_dquot_walk.isra.10+0x172/0x190 [xfs]
> > > > > > [333853.377747]  xfs_qm_dqpurge_all+0x5c/0x80 [xfs]
> > > > > > [333853.398433]  xfs_qm_scall_quotaoff+0x127/0x380 [xfs]
> > > > > > [333853.421062]  xfs_quota_disable+0x3d/0x50 [xfs]
> > > > > > [333853.441319]  SyS_quotactl+0x391/0x850
> > > > > > [333853.458143]  ? __audit_syscall_entry+0xaf/0x100
> > > > > > [333853.478903]  ? syscall_trace_enter+0x1d0/0x2b0
> > > > > > [333853.499262]  ? __audit_syscall_exit+0x209/0x290
> > > > > > [333853.519938]  do_syscall_64+0x67/0x180
> > > > > > [333853.536771]  entry_SYSCALL64_slow_path+0x25/0x25
> > > > > > [333853.558105] RIP: 0033:0x7f95be064fea
> > > > > > [333853.574521] RSP: 002b:00007ffc988b6348 EFLAGS: 00000206 ORIG_RAX: 00000000000000b3
> > > > > > [333853.608922] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f95be064fea
> > > > > > [333853.641478] RDX: 0000000000000000 RSI: 00000000027bb2f0 RDI: 0000000000580201
> > > > > > [333853.674091] RBP: 00000000027bb320 R08: 00007ffc988b635c R09: 0000000000000001
> > > > > > [333853.706913] R10: 00007ffc988b635c R11: 0000000000000206 R12: 00000000027bb2d0
> > > > > > [333853.744953] R13: 00000000027ba0a0 R14: 0000000000000000 R15: 00000000027ba080
> > > > > > 
> > > > > > I've seen this hang on multiple test hosts with different test configs,
> > > > > > e.g. xfs_2k_crc xfs_4k_reflink. I also tried with 4.9 kernel, but I kept
> > > > > > hitting the BUG_ON that this kernel commit[2] fixed in 4.10-rc kernel.
> > > > > > 
> > > > > 
> > > > > Interesting... I'm not sure this is necessarily locked up given the call
> > > > > to schedule_timeout(). That suggests we could be in the
> > > > > xfs_qm_dquot_walk()->delay() call and possibly spinning on skipped
> > > > > entries. It looks like the only way we skip an entry is if it is already
> > > > > being freed or has a reference count. The only other dquot free path
> > > > > looks like the shrinker, but that doesn't appear in the blocked task
> > > > > list at all. We do have a dqread call, but that path hasn't even added
> > > > > the dquot to the radix tree yet.
> > > > > 
> > > > > > I attached full sysrq-w output. If you need more info please let me
> > > > > > know.
> > > > > >
> > > > > 
> > > > > It looks like pretty much everything is hung up on log reservation,
> > > > > including writeback, which is more indicative of a potential log
> > > > > reservation problem. xfsaild appears to be alive, however (a
> > > > > schedule_timeout() in that path suggests the thread is working to push
> > > > > log items and free up log space), so maybe something else is going on.
> > > > > It might be interesting to see if we can tell if we're spinning
> > > > > somewhere by enabling tracepoints once in this state. Also, have you
> > > > > checked for any unexpected syslog messages when this occurs?
> > > > 
> > > > There's nothing interesting in dmesg with stock 4.10-rc4 kernel, but if
> > > > I test with Darrick's xfs-linux tree for-next branch (which has a few
> > > > patches on top of 4.10-rc4), I can see this message:
> > > > 
> > > > XFS (dm-5): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
> > > > 
> > > > But I can see this message in every xfs/305 iteration, and other tests
> > > > too. I attached dmesg log prior to sysrq-w output, gathered from
> > > > "for-next" kernel.
> > > > 
> > > > I also attached trace log while we're in this state (also from
> > > > "for-next" kernel):
> > > > 
> > > > cd /mnt/ext4
> > > > trace-cmd record -e xfs_*
> > > > sleep 10
> > > > trace-cmd report > log
> > > > 
> > > 
> > > Thanks Eryu. The only thing in the AIL is the XFS_LI_QUOTAOFF log item.
> > > This item is hardcoded to the locked state because it does not really
> > > exist on disk (i.e., it cannot be pushed). Rather, it looks like it is
> > > released upon commit of an accompanying quotaoff_end log item that is
> > > committed after most of the quotaoff work is completed, including the
> > > purge that you're actually stuck on. The purpose of this appears to be
> > > to allow log recovery to deal with the possibility of dquots being
> > > logged after the initial quotaoff transaction. This mechanism
> > > effectively pins the tail of the log with the quotaoff start log item to
> > > ensure log recovery sees it before any subsequent dquot modifications if
> > > we happen to crash at some point in the middle of those operations.
> > > 
> > > What this means is that there is only a fixed amount of physical log
> > > space between the time the quotaoff start is logged and until the
> > > quotaoff completes. If we consume the log space before the quotaoff
> > > completes, we basically deadlock the fs. Indeed, while I haven't been
> > > able to reproduce via xfs/305 alone, I can manufacture this problem by
> > > inserting a long enough sleep after the qutoaoff start item is logged.
> > > The parallel fsstress' then chew up the log and the fs ends up stuck.
> > > 
> > > So I think this is the state you get into, but the root cause is still
> > > not totally clear. It could just be the workload, but it seems a little
> > > crazy to wrap the log before the purge completes since it doesn't log
> > > anything itself. Then again, it looks like the xfs/305 workload can chew
> > > up the log in 30-45s on my 10g scratch dev, so probably not out of the
> > > realm of possibility.
> > > 
> > > I am wondering about the possibility of something else getting stuck
> > > that stalls the dqpurge and then causes the fs-wide deadlock via domino
> > > effect. Could you provide the xfs_info of your scratch device? Also, if
> > 
> > # xfs_info /mnt/testarea/scratch/
> > meta-data=/dev/mapper/systemvg-testlv2 isize=512    agcount=16, agsize=245696 blks
> >          =                       sectsz=512   attr=2, projid32bit=1
> >          =                       crc=1        finobt=1 spinodes=0 rmapbt=1
> >          =                       reflink=1
> > data     =                       bsize=4096   blocks=3931136, imaxpct=25
> >          =                       sunit=64     swidth=192 blks
> > naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> > log      =internal               bsize=4096   blocks=7680, version=2
> >          =                       sectsz=512   sunit=64 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> > 
> > > you could make a system available that is able to reproduce this
> > > reliably enough, that might make it easier to analyze what's going on...
> > 
> > I do have a system that could reproduce the hang reliably, but not so
> > efficiently, it usually reproduces around 30 minutes by running xfs/305
> > in a loop. It's a RH internal test host, I can provide you the address
> > and password if you think it's useful.
> > 
> 
> Thanks for the system info. I could reproduce fairly quickly. I adopted
> some of the characteristics of your scratch dev on my local machine and
> was able to reproduce there as well. What is of note is that in both
> cases I reproduce with xfs_quota stuck at a different point from your
> original report. Instead of being stuck in xfs_qm_dquot_walk(), I
> reproduce the following stack:
> 
> [190961.556198] xfs_quota       D    0  9394   8970 0x00000080
> [190961.581188] Call Trace:
> [190961.592501]  __schedule+0x21c/0x6b0
> [190961.608704]  ? radix_tree_gang_lookup+0xc8/0x110
> [190961.629892]  schedule+0x36/0x80
> [190961.644336]  xlog_grant_head_wait+0xd2/0x210 [xfs]
> [190961.666212]  xlog_grant_head_check+0xaf/0x160 [xfs]
> [190961.688687]  xfs_log_reserve+0xed/0x270 [xfs]
> [190961.708753]  ? kmem_cache_alloc+0x194/0x1a0
> [190961.727905]  xfs_trans_reserve+0x1d6/0x2c0 [xfs]
> [190961.748958]  xfs_trans_alloc+0xc1/0x140 [xfs]
> [190961.768899]  xfs_qm_log_quotaoff_end+0x3c/0x90 [xfs]
> [190961.793068]  xfs_qm_scall_quotaoff+0x128/0x3b0 [xfs]
> [190961.819012]  xfs_quota_disable+0x3d/0x50 [xfs]
> [190961.839279]  SyS_quotactl+0x391/0x850
> [190961.856210]  ? __audit_syscall_entry+0xaf/0x100
> [190961.876958]  ? syscall_trace_enter+0x1d0/0x2b0
> [190961.897212]  ? __audit_syscall_exit+0x209/0x290
> [190961.917889]  do_syscall_64+0x67/0x180
> [190961.934687]  entry_SYSCALL64_slow_path+0x25/0x25
> 
> ... which suggests that we probably need to allocate the quotaoff_end
> transaction right when we allocate the quotaoff start transaction and
> stash it for use once we've released/purged all of the dquots.
> 
> There may be multiple issues here so I'll give something like that a
> shot and see if I can still reproduce a hang.
> 

I reproduced an xfs_wait_buftarg() unmount hang once that looks like a
separate issue (occurs after the test, long after quotaoff has
completed). I haven't reproduced that one again nor the original hang in
100+ iterations so far. Care to give the following a whirl in your
environment? Thanks.

Brian

--8<--

--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eryu Guan Feb. 4, 2017, 11:47 a.m. UTC | #1
On Mon, Jan 30, 2017 at 04:59:52PM -0500, Brian Foster wrote:
> 
> I reproduced an xfs_wait_buftarg() unmount hang once that looks like a
> separate issue (occurs after the test, long after quotaoff has
> completed). I haven't reproduced that one again nor the original hang in
> 100+ iterations so far. Care to give the following a whirl in your
> environment? Thanks.

I applied your test patch on top of 4.10-rc4 stock kernel and hit
xfs/305 hang at 82nd iteration. I attached the dmesg and sysrq-w log.
You can login the same RH internal test host if that's helpful, I left
the host running in the hang state.

Thanks,
Eryu
Brian Foster Feb. 6, 2017, 5:59 p.m. UTC | #2
On Sat, Feb 04, 2017 at 07:47:00PM +0800, Eryu Guan wrote:
> On Mon, Jan 30, 2017 at 04:59:52PM -0500, Brian Foster wrote:
> > 
> > I reproduced an xfs_wait_buftarg() unmount hang once that looks like a
> > separate issue (occurs after the test, long after quotaoff has
> > completed). I haven't reproduced that one again nor the original hang in
> > 100+ iterations so far. Care to give the following a whirl in your
> > environment? Thanks.
> 
> I applied your test patch on top of 4.10-rc4 stock kernel and hit
> xfs/305 hang at 82nd iteration. I attached the dmesg and sysrq-w log.
> You can login the same RH internal test host if that's helpful, I left
> the host running in the hang state.
> 

Ok, that's not too surprising. It does look like we are in some kind of
live lock situation. xfs_quota is spinning on the dqpurge and two or
three fsstress workers are spinning on xfs_iget() retries via bulkstat.

I'm going to hard reboot this box and try to restart this test with some
customized tracepoints to try and get more data..

Brian

> Thanks,
> Eryu


--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Brian Foster Feb. 13, 2017, 4:43 p.m. UTC | #3
On Mon, Feb 06, 2017 at 12:59:08PM -0500, Brian Foster wrote:
> On Sat, Feb 04, 2017 at 07:47:00PM +0800, Eryu Guan wrote:
> > On Mon, Jan 30, 2017 at 04:59:52PM -0500, Brian Foster wrote:
> > > 
> > > I reproduced an xfs_wait_buftarg() unmount hang once that looks like a
> > > separate issue (occurs after the test, long after quotaoff has
> > > completed). I haven't reproduced that one again nor the original hang in
> > > 100+ iterations so far. Care to give the following a whirl in your
> > > environment? Thanks.
> > 
> > I applied your test patch on top of 4.10-rc4 stock kernel and hit
> > xfs/305 hang at 82nd iteration. I attached the dmesg and sysrq-w log.
> > You can login the same RH internal test host if that's helpful, I left
> > the host running in the hang state.
> > 
> 
> Ok, that's not too surprising. It does look like we are in some kind of
> live lock situation. xfs_quota is spinning on the dqpurge and two or
> three fsstress workers are spinning on xfs_iget() retries via bulkstat.
> 
> I'm going to hard reboot this box and try to restart this test with some
> customized tracepoints to try and get more data..
> 

I managed to get enough data to manufacture the problem locally. It
looks like quotaoff is racy with respect to inode allocation.

The latter can allocate an inode, populate the incore data structures,
etc. and set i_flags to XFS_INEW until the inode is fully populated. If
inode allocation races with quotaoff such that the alloc doesn't yet see
the quota off state and thus grabs a new dquot reference, it's possible
for the subsequent xfs_qm_dqrele_all_inodes() scan from the quota off
path to simply skip the associated inode because of its XFS_INEW state.
The associated ->i_[ugp]dquot is thus never released and the dqpurge
spins indefinitely.

I reproduce this locally by running quotaoff during an artificial delay
in xfs_finish_inode_setup() right before XFS_INEW is cleared. The least
intrusive way I can think of dealing with this is to account XFS_INEW
inodes as "skipped" inodes during the ag walk (for dqrele_all_inodes()
only) and restart the scan until they are completed.

Brian

> Brian
> 
> > Thanks,
> > Eryu
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eryu Guan July 26, 2017, 5:09 a.m. UTC | #4
Hi Brian,

[sorry for bringing up an old thread, but this seems relavent]

On Mon, Feb 13, 2017 at 11:43:07AM -0500, Brian Foster wrote:
> On Mon, Feb 06, 2017 at 12:59:08PM -0500, Brian Foster wrote:
> > On Sat, Feb 04, 2017 at 07:47:00PM +0800, Eryu Guan wrote:
> > > On Mon, Jan 30, 2017 at 04:59:52PM -0500, Brian Foster wrote:
> > > > 
> > > > I reproduced an xfs_wait_buftarg() unmount hang once that looks like a
> > > > separate issue (occurs after the test, long after quotaoff has
> > > > completed). I haven't reproduced that one again nor the original hang in
> > > > 100+ iterations so far. Care to give the following a whirl in your
> > > > environment? Thanks.
> > > 
> > > I applied your test patch on top of 4.10-rc4 stock kernel and hit
> > > xfs/305 hang at 82nd iteration. I attached the dmesg and sysrq-w log.
> > > You can login the same RH internal test host if that's helpful, I left
> > > the host running in the hang state.
> > > 
> > 
> > Ok, that's not too surprising. It does look like we are in some kind of
> > live lock situation. xfs_quota is spinning on the dqpurge and two or
> > three fsstress workers are spinning on xfs_iget() retries via bulkstat.
> > 
> > I'm going to hard reboot this box and try to restart this test with some
> > customized tracepoints to try and get more data..
> > 
> 
> I managed to get enough data to manufacture the problem locally. It
> looks like quotaoff is racy with respect to inode allocation.
> 
> The latter can allocate an inode, populate the incore data structures,
> etc. and set i_flags to XFS_INEW until the inode is fully populated. If
> inode allocation races with quotaoff such that the alloc doesn't yet see
> the quota off state and thus grabs a new dquot reference, it's possible
> for the subsequent xfs_qm_dqrele_all_inodes() scan from the quota off
> path to simply skip the associated inode because of its XFS_INEW state.
> The associated ->i_[ugp]dquot is thus never released and the dqpurge
> spins indefinitely.
> 
> I reproduce this locally by running quotaoff during an artificial delay
> in xfs_finish_inode_setup() right before XFS_INEW is cleared. The least
> intrusive way I can think of dealing with this is to account XFS_INEW
> inodes as "skipped" inodes during the ag walk (for dqrele_all_inodes()
> only) and restart the scan until they are completed.

Is xfs/305 supposed to pass with 4.13-rc2 kernel now? I know there're
some quota patches went in XFS recently, but I'm not sure if there's any
known leftover bug.

Anyway, I hit a test hang with 4.13-rc2 kernel with xfs/305 again, test
configuration is rmapbt XFS with 4k block size on ppc64le host. But I
haven't tried to reproduce it yet. Just want to confirm with you first.

Thanks,
Eryu

[ 9219.858228] run fstests xfs/305 at 2017-07-24 04:25:26 
[ 9220.209846] XFS (vda5): Unmounting Filesystem 
[ 9220.393077] XFS (vda5): EXPERIMENTAL reverse mapping btree feature enabled. Use at your own risk! 
[ 9220.393861] XFS (vda5): Mounting V5 Filesystem 
[ 9220.400258] XFS (vda5): Ending clean mount 
[ 9220.459884] XFS (vda5): Unmounting Filesystem 
[ 9220.663924] XFS (vda5): EXPERIMENTAL reverse mapping btree feature enabled. Use at your own risk! 
[ 9220.664542] XFS (vda5): Mounting V5 Filesystem 
[ 9220.669948] XFS (vda5): Ending clean mount 
[ 9220.670681] XFS (vda5): Quotacheck needed: Please wait. 
[ 9220.698900] XFS (vda5): Quotacheck: Done. 
[ 9245.913313] XFS (vda5): Unmounting Filesystem 
[ 9246.386313] XFS (vda5): EXPERIMENTAL reverse mapping btree feature enabled. Use at your own risk! 
[ 9246.387058] XFS (vda5): Mounting V5 Filesystem 
[ 9246.568249] XFS (vda5): Ending clean mount 
[ 9246.570422] XFS (vda5): Quotacheck needed: Please wait. 
[ 9247.212955] XFS (vda5): Quotacheck: Done. 
[ 9271.562000] XFS (vda5): Unmounting Filesystem 
[ 9271.920984] XFS (vda5): EXPERIMENTAL reverse mapping btree feature enabled. Use at your own risk! 
[ 9271.921746] XFS (vda5): Mounting V5 Filesystem 
[ 9271.975980] XFS (vda5): Ending clean mount 
[ 9271.979411] XFS (vda5): Quotacheck needed: Please wait. 
[ 9272.686058] XFS (vda5): Quotacheck: Done. 
[ 9459.464674] INFO: task kworker/u16:0:31345 blocked for more than 120 seconds. 
[ 9459.465111]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.465381] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.465789] kworker/u16:0   D    0 31345      2 0x00000880 
[ 9459.466245] Workqueue: writeback wb_workfn (flush-253:0) 
[ 9459.466519] Call Trace: 
[ 9459.466664] [c00000009a9872d0] [c0000001fc1ca200] 0xc0000001fc1ca200 (unreliable) 
[ 9459.467081] [c00000009a9874a0] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.467435] [c00000009a987500] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.467780] [c00000009a9875d0] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.468173] [c00000009a987600] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
[ 9459.468797] [c00000009a987660] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
[ 9459.469242] [c00000009a9876b0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
[ 9459.469685] [c00000009a987700] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
[ 9459.470141] [c00000009a987750] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
[ 9459.470595] [c00000009a9877b0] [d000000004f9bdf8] xfs_setfilesize_trans_alloc.isra.9+0x38/0x90 [xfs] 
[ 9459.471116] [c00000009a9877f0] [d000000004f9bfc4] xfs_submit_ioend+0x174/0x1c0 [xfs] 
[ 9459.471572] [c00000009a987830] [d000000004f9c124] xfs_vm_writepages+0x114/0x140 [xfs] 
[ 9459.471994] [c00000009a9878b0] [c0000000002624ac] do_writepages+0x4c/0x110 
[ 9459.472339] [c00000009a9878e0] [c000000000368fd4] __writeback_single_inode+0x64/0x500 
[ 9459.472756] [c00000009a987940] [c000000000369b4c] writeback_sb_inodes+0x2dc/0x5b0 
[ 9459.473173] [c00000009a987a50] [c000000000369f04] __writeback_inodes_wb+0xe4/0x150 
[ 9459.473590] [c00000009a987ab0] [c00000000036a23c] wb_writeback+0x2cc/0x420 
[ 9459.473935] [c00000009a987b80] [c00000000036aea8] wb_workfn+0x138/0x4c0 
[ 9459.474290] [c00000009a987c90] [c00000000010b828] process_one_work+0x198/0x480 
[ 9459.474696] [c00000009a987d30] [c00000000010bb98] worker_thread+0x88/0x510 
[ 9459.474970] [c00000009a987dc0] [c000000000113920] kthread+0x160/0x1a0 
[ 9459.475240] [c00000009a987e30] [c00000000000b524] ret_from_kernel_thread+0x5c/0xb8 
[ 9459.475579] INFO: task fsstress:2190 blocked for more than 120 seconds. 
[ 9459.475846]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.476054] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.476365] fsstress        D    0  2190   2188 0x00040080 
[ 9459.476577] Call Trace: 
[ 9459.476684] [c0000001ea0276e0] [0000000000000001] 0x1 (unreliable) 
[ 9459.476949] [c0000001ea0278b0] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.477210] [c0000001ea027910] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.477466] [c0000001ea0279e0] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.477779] [c0000001ea027a10] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
[ 9459.478142] [c0000001ea027a70] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
[ 9459.478529] [c0000001ea027ac0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
[ 9459.478890] [c0000001ea027b10] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
[ 9459.479247] [c0000001ea027b60] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
[ 9459.479604] [c0000001ea027bc0] [d000000004fa1be8] xfs_free_file_space+0x288/0x430 [xfs] 
[ 9459.479963] [c0000001ea027cd0] [d000000004fabef0] xfs_file_fallocate+0x190/0x3b0 [xfs] 
[ 9459.480280] [c0000001ea027d90] [c000000000321148] vfs_fallocate+0x178/0x300 
[ 9459.480539] [c0000001ea027de0] [c000000000322630] SyS_fallocate+0x60/0xe0 
[ 9459.480804] [c0000001ea027e30] [c00000000000b184] system_call+0x58/0x6c 
[ 9459.481071] INFO: task fsstress:2191 blocked for more than 120 seconds. 
[ 9459.481338]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.481548] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.481862] fsstress        D    0  2191   2188 0x00040080 
[ 9459.482076] Call Trace: 
[ 9459.482185] [c0000001ea073800] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.482455] [c0000001ea073860] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.482725] [c0000001ea073930] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.483039] [c0000001ea073960] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
[ 9459.483406] [c0000001ea0739c0] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
[ 9459.483770] [c0000001ea073a10] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
[ 9459.484130] [c0000001ea073a60] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
[ 9459.484490] [c0000001ea073ab0] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
[ 9459.484842] [c0000001ea073b10] [d000000004fbbe80] xfs_vn_update_time+0x60/0x1f0 [xfs] 
[ 9459.485155] [c0000001ea073b70] [c00000000034e530] file_update_time+0xf0/0x170 
[ 9459.485508] [c0000001ea073bc0] [d000000004fac408] xfs_file_aio_write_checks+0x1f8/0x270 [xfs] 
[ 9459.485913] [c0000001ea073c50] [d000000004fac73c] xfs_file_dio_aio_write+0xcc/0x360 [xfs] 
[ 9459.486269] [c0000001ea073cb0] [d000000004facd80] xfs_file_write_iter+0xd0/0x1d0 [xfs] 
[ 9459.486584] [c0000001ea073d00] [c000000000324210] __vfs_write+0x120/0x200 
[ 9459.486841] [c0000001ea073d90] [c000000000325c88] vfs_write+0xc8/0x240 
[ 9459.487100] [c0000001ea073de0] [c000000000327a80] SyS_write+0x60/0x110 
[ 9459.487363] [c0000001ea073e30] [c00000000000b184] system_call+0x58/0x6c 
[ 9459.487622] INFO: task fsstress:2192 blocked for more than 120 seconds. 
[ 9459.487879]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.488087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.488451] fsstress        D    0  2192   2188 0x00040080 
[ 9459.488631] Call Trace: 
[ 9459.488713] [c0000001ea07b6a0] [c0000001ea07b740] 0xc0000001ea07b740 (unreliable) 
[ 9459.488939] [c0000001ea07b870] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.489119] [c0000001ea07b8d0] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.489301] [c0000001ea07b9a0] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.489509] [c0000001ea07b9d0] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
[ 9459.489749] [c0000001ea07ba30] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
[ 9459.489984] [c0000001ea07ba80] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
[ 9459.490218] [c0000001ea07bad0] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
[ 9459.490450] [c0000001ea07bb20] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
[ 9459.490687] [c0000001ea07bb80] [d000000004fc1364] xfs_create+0x1b4/0x7d0 [xfs] 
[ 9459.490922] [c0000001ea07bcc0] [d000000004fbdb38] xfs_generic_create+0x2f8/0x440 [xfs] 
[ 9459.491132] [c0000001ea07bd50] [c000000000336fe8] vfs_mknod+0x1c8/0x260 
[ 9459.491311] [c0000001ea07bda0] [c00000000033da58] SyS_mknodat+0x2d8/0x310 
[ 9459.491489] [c0000001ea07be30] [c00000000000b184] system_call+0x58/0x6c 
[ 9459.491666] INFO: task fsstress:2193 blocked for more than 120 seconds. 
[ 9459.491840]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.491977] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.492190] fsstress        D    0  2193   2188 0x00040080 
[ 9459.492329] Call Trace: 
[ 9459.492400] [c0000001ea043740] [0000000600000007] 0x600000007 (unreliable) 
[ 9459.492576] [c0000001ea043910] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.492751] [c0000001ea043970] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.492936] [c0000001ea043a40] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.493151] [c0000001ea043a70] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
[ 9459.493386] [c0000001ea043ad0] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
[ 9459.493621] [c0000001ea043b20] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
[ 9459.493856] [c0000001ea043b70] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
[ 9459.494087] [c0000001ea043bc0] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
[ 9459.494318] [c0000001ea043c20] [d000000004fc1f6c] xfs_link+0x17c/0x370 [xfs] 
[ 9459.494560] [c0000001ea043cc0] [d000000004fbc568] xfs_vn_link+0x68/0xc0 [xfs] 
[ 9459.494770] [c0000001ea043d10] [c00000000033785c] vfs_link+0x34c/0x480 
[ 9459.494942] [c0000001ea043d70] [c00000000033e0c0] SyS_linkat+0x2a0/0x370 
[ 9459.495118] [c0000001ea043e30] [c00000000000b184] system_call+0x58/0x6c 
[ 9459.495290] INFO: task fsstress:2194 blocked for more than 120 seconds. 
[ 9459.495462]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.495597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.495801] fsstress        D    0  2194   2188 0x00040080 
[ 9459.495942] Call Trace: 
[ 9459.496011] [c0000001ea01b900] [0000000000000001] 0x1 (unreliable) 
[ 9459.496183] [c0000001ea01bad0] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.496357] [c0000001ea01bb30] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.496531] [c0000001ea01bc00] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.496709] [c0000001ea01bc30] [c0000000003670b4] wb_wait_for_completion+0x84/0xd0 
[ 9459.496919] [c0000001ea01bc90] [c000000000368530] sync_inodes_sb+0xd0/0x330 
[ 9459.497094] [c0000001ea01bd60] [c00000000036fd84] sync_inodes_one_sb+0x24/0x40 
[ 9459.497301] [c0000001ea01bd80] [c00000000032b978] iterate_supers+0x1b8/0x1f0 
[ 9459.497509] [c0000001ea01bdf0] [c0000000003702a0] sys_sync+0x50/0xe0 
[ 9459.497682] [c0000001ea01be30] [c00000000000b184] system_call+0x58/0x6c 
[ 9459.497857] INFO: task fsstress:2195 blocked for more than 120 seconds. 
[ 9459.498027]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.498165] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.498376] fsstress        D    0  2195   2188 0x00040080 
[ 9459.498514] Call Trace: 
[ 9459.498586] [c0000001ea057ad0] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.498761] [c0000001ea057b30] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.498935] [c0000001ea057c00] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.499107] [c0000001ea057c30] [c0000000003670b4] wb_wait_for_completion+0x84/0xd0 
[ 9459.499322] [c0000001ea057c90] [c000000000368530] sync_inodes_sb+0xd0/0x330 
[ 9459.499496] [c0000001ea057d60] [c00000000036fd84] sync_inodes_one_sb+0x24/0x40 
[ 9459.499703] [c0000001ea057d80] [c00000000032b978] iterate_supers+0x1b8/0x1f0 
[ 9459.499914] [c0000001ea057df0] [c0000000003702a0] sys_sync+0x50/0xe0 
[ 9459.500089] [c0000001ea057e30] [c00000000000b184] system_call+0x58/0x6c 
[ 9459.500265] INFO: task fsstress:2196 blocked for more than 120 seconds. 
[ 9459.500440]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.500577] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.500787] fsstress        D    0  2196   2188 0x00040080 
[ 9459.500930] Call Trace: 
[ 9459.501003] [c0000001ea03b8d0] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.501179] [c0000001ea03b930] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.501359] [c0000001ea03ba00] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.501576] [c0000001ea03ba30] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
[ 9459.501813] [c0000001ea03ba90] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
[ 9459.502051] [c0000001ea03bae0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
[ 9459.502288] [c0000001ea03bb30] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
[ 9459.502521] [c0000001ea03bb80] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
[ 9459.502755] [c0000001ea03bbe0] [d000000004fbbe80] xfs_vn_update_time+0x60/0x1f0 [xfs] 
[ 9459.502962] [c0000001ea03bc40] [c0000000003508dc] touch_atime+0xec/0x110 
[ 9459.503164] [c0000001ea03bc80] [d000000004fab4c8] xfs_file_dio_aio_read+0x168/0x170 [xfs] 
[ 9459.503396] [c0000001ea03bcd0] [d000000004fab954] xfs_file_read_iter+0x114/0x140 [xfs] 
[ 9459.503601] [c0000001ea03bd10] [c000000000324010] __vfs_read+0x120/0x200 
[ 9459.503776] [c0000001ea03bda0] [c000000000325adc] vfs_read+0xac/0x190 
[ 9459.503948] [c0000001ea03bde0] [c000000000327970] SyS_read+0x60/0x110 
[ 9459.504123] [c0000001ea03be30] [c00000000000b184] system_call+0x58/0x6c 
[ 9459.504298] INFO: task fsstress:2197 blocked for more than 120 seconds. 
[ 9459.504467]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.504608] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.504813] fsstress        D    0  2197   2188 0x00040080 
[ 9459.504952] Call Trace: 
[ 9459.505022] [c0000001ea00f6c0] [0000000500000002] 0x500000002 (unreliable) 
[ 9459.505195] [c0000001ea00f890] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.505373] [c0000001ea00f8f0] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.505552] [c0000001ea00f9c0] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.505756] [c0000001ea00f9f0] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
[ 9459.505991] [c0000001ea00fa50] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
[ 9459.506226] [c0000001ea00faa0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
[ 9459.506460] [c0000001ea00faf0] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
[ 9459.506698] [c0000001ea00fb40] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
[ 9459.506931] [c0000001ea00fba0] [d000000004fc1364] xfs_create+0x1b4/0x7d0 [xfs] 
[ 9459.507161] [c0000001ea00fce0] [d000000004fbdb38] xfs_generic_create+0x2f8/0x440 [xfs] 
[ 9459.507370] [c0000001ea00fd70] [c000000000337208] vfs_mkdir+0x188/0x270 
[ 9459.507544] [c0000001ea00fdc0] [c00000000033dbfc] SyS_mkdirat+0x13c/0x150 
[ 9459.507716] [c0000001ea00fe30] [c00000000000b184] system_call+0x58/0x6c 
[ 9459.507893] INFO: task fsstress:2198 blocked for more than 120 seconds. 
[ 9459.508064]       Tainted: G        W       4.13.0-rc2 #1 
[ 9459.508201] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 9459.508412] fsstress        D    0  2198   2188 0x00040080 
[ 9459.508551] Call Trace: 
[ 9459.508621] [c0000001ea02b3f0] [0000000600000002] 0x600000002 (unreliable) 
[ 9459.508796] [c0000001ea02b5c0] [c00000000001ad18] __switch_to+0x2e8/0x430 
[ 9459.508967] [c0000001ea02b620] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
[ 9459.509144] [c0000001ea02b6f0] [c00000000093b070] schedule+0x40/0xb0 
[ 9459.509357] [c0000001ea02b720] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
[ 9459.509593] [c0000001ea02b780] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
[ 9459.509830] [c0000001ea02b7d0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
[ 9459.510066] [c0000001ea02b820] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
[ 9459.510302] [c0000001ea02b870] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
[ 9459.510535] [c0000001ea02b8d0] [d000000004fc1364] xfs_create+0x1b4/0x7d0 [xfs] 
[ 9459.510768] [c0000001ea02ba10] [d000000004fbdb38] xfs_generic_create+0x2f8/0x440 [xfs] 
[ 9459.510975] [c0000001ea02baa0] [c000000000339fb4] lookup_open+0x654/0x790 
[ 9459.511151] [c0000001ea02bb60] [c00000000033b784] do_last+0x554/0x8d0 
[ 9459.511323] [c0000001ea02bc00] [c00000000033bbb4] path_openat+0xb4/0x3c0 
[ 9459.511496] [c0000001ea02bc90] [c00000000033d584] do_filp_open+0x74/0xf0 
[ 9459.511670] [c0000001ea02bdc0] [c0000000003238e0] do_sys_open+0x1b0/0x2d0 
[ 9459.511844] [c0000001ea02be30] [c00000000000b184] system_call+0x58/0x6c 
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Brian Foster July 26, 2017, 10:35 a.m. UTC | #5
On Wed, Jul 26, 2017 at 01:09:52PM +0800, Eryu Guan wrote:
> Hi Brian,
> 
> [sorry for bringing up an old thread, but this seems relavent]
> 
> On Mon, Feb 13, 2017 at 11:43:07AM -0500, Brian Foster wrote:
> > On Mon, Feb 06, 2017 at 12:59:08PM -0500, Brian Foster wrote:
> > > On Sat, Feb 04, 2017 at 07:47:00PM +0800, Eryu Guan wrote:
> > > > On Mon, Jan 30, 2017 at 04:59:52PM -0500, Brian Foster wrote:
> > > > > 
> > > > > I reproduced an xfs_wait_buftarg() unmount hang once that looks like a
> > > > > separate issue (occurs after the test, long after quotaoff has
> > > > > completed). I haven't reproduced that one again nor the original hang in
> > > > > 100+ iterations so far. Care to give the following a whirl in your
> > > > > environment? Thanks.
> > > > 
> > > > I applied your test patch on top of 4.10-rc4 stock kernel and hit
> > > > xfs/305 hang at 82nd iteration. I attached the dmesg and sysrq-w log.
> > > > You can login the same RH internal test host if that's helpful, I left
> > > > the host running in the hang state.
> > > > 
> > > 
> > > Ok, that's not too surprising. It does look like we are in some kind of
> > > live lock situation. xfs_quota is spinning on the dqpurge and two or
> > > three fsstress workers are spinning on xfs_iget() retries via bulkstat.
> > > 
> > > I'm going to hard reboot this box and try to restart this test with some
> > > customized tracepoints to try and get more data..
> > > 
> > 
> > I managed to get enough data to manufacture the problem locally. It
> > looks like quotaoff is racy with respect to inode allocation.
> > 
> > The latter can allocate an inode, populate the incore data structures,
> > etc. and set i_flags to XFS_INEW until the inode is fully populated. If
> > inode allocation races with quotaoff such that the alloc doesn't yet see
> > the quota off state and thus grabs a new dquot reference, it's possible
> > for the subsequent xfs_qm_dqrele_all_inodes() scan from the quota off
> > path to simply skip the associated inode because of its XFS_INEW state.
> > The associated ->i_[ugp]dquot is thus never released and the dqpurge
> > spins indefinitely.
> > 
> > I reproduce this locally by running quotaoff during an artificial delay
> > in xfs_finish_inode_setup() right before XFS_INEW is cleared. The least
> > intrusive way I can think of dealing with this is to account XFS_INEW
> > inodes as "skipped" inodes during the ag walk (for dqrele_all_inodes()
> > only) and restart the scan until they are completed.
> 
> Is xfs/305 supposed to pass with 4.13-rc2 kernel now? I know there're
> some quota patches went in XFS recently, but I'm not sure if there's any
> known leftover bug.
> 
> Anyway, I hit a test hang with 4.13-rc2 kernel with xfs/305 again, test
> configuration is rmapbt XFS with 4k block size on ppc64le host. But I
> haven't tried to reproduce it yet. Just want to confirm with you first.
> 

I believe it is still possible for quotaoff to deadlock the log. It is
not shown below, but I assume you have a blocked xfs_quota process
attempting to run a quotaoff..? How big is your fs/log?

I sent a patch a while ago, but it wasn't safe because it could
potentially lead to nested transactions in the quotaoff context. We
don't have a proper fix for this as of yet.

Brian

> Thanks,
> Eryu
> 
> [ 9219.858228] run fstests xfs/305 at 2017-07-24 04:25:26 
> [ 9220.209846] XFS (vda5): Unmounting Filesystem 
> [ 9220.393077] XFS (vda5): EXPERIMENTAL reverse mapping btree feature enabled. Use at your own risk! 
> [ 9220.393861] XFS (vda5): Mounting V5 Filesystem 
> [ 9220.400258] XFS (vda5): Ending clean mount 
> [ 9220.459884] XFS (vda5): Unmounting Filesystem 
> [ 9220.663924] XFS (vda5): EXPERIMENTAL reverse mapping btree feature enabled. Use at your own risk! 
> [ 9220.664542] XFS (vda5): Mounting V5 Filesystem 
> [ 9220.669948] XFS (vda5): Ending clean mount 
> [ 9220.670681] XFS (vda5): Quotacheck needed: Please wait. 
> [ 9220.698900] XFS (vda5): Quotacheck: Done. 
> [ 9245.913313] XFS (vda5): Unmounting Filesystem 
> [ 9246.386313] XFS (vda5): EXPERIMENTAL reverse mapping btree feature enabled. Use at your own risk! 
> [ 9246.387058] XFS (vda5): Mounting V5 Filesystem 
> [ 9246.568249] XFS (vda5): Ending clean mount 
> [ 9246.570422] XFS (vda5): Quotacheck needed: Please wait. 
> [ 9247.212955] XFS (vda5): Quotacheck: Done. 
> [ 9271.562000] XFS (vda5): Unmounting Filesystem 
> [ 9271.920984] XFS (vda5): EXPERIMENTAL reverse mapping btree feature enabled. Use at your own risk! 
> [ 9271.921746] XFS (vda5): Mounting V5 Filesystem 
> [ 9271.975980] XFS (vda5): Ending clean mount 
> [ 9271.979411] XFS (vda5): Quotacheck needed: Please wait. 
> [ 9272.686058] XFS (vda5): Quotacheck: Done. 
> [ 9459.464674] INFO: task kworker/u16:0:31345 blocked for more than 120 seconds. 
> [ 9459.465111]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.465381] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.465789] kworker/u16:0   D    0 31345      2 0x00000880 
> [ 9459.466245] Workqueue: writeback wb_workfn (flush-253:0) 
> [ 9459.466519] Call Trace: 
> [ 9459.466664] [c00000009a9872d0] [c0000001fc1ca200] 0xc0000001fc1ca200 (unreliable) 
> [ 9459.467081] [c00000009a9874a0] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.467435] [c00000009a987500] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.467780] [c00000009a9875d0] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.468173] [c00000009a987600] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
> [ 9459.468797] [c00000009a987660] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
> [ 9459.469242] [c00000009a9876b0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
> [ 9459.469685] [c00000009a987700] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
> [ 9459.470141] [c00000009a987750] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
> [ 9459.470595] [c00000009a9877b0] [d000000004f9bdf8] xfs_setfilesize_trans_alloc.isra.9+0x38/0x90 [xfs] 
> [ 9459.471116] [c00000009a9877f0] [d000000004f9bfc4] xfs_submit_ioend+0x174/0x1c0 [xfs] 
> [ 9459.471572] [c00000009a987830] [d000000004f9c124] xfs_vm_writepages+0x114/0x140 [xfs] 
> [ 9459.471994] [c00000009a9878b0] [c0000000002624ac] do_writepages+0x4c/0x110 
> [ 9459.472339] [c00000009a9878e0] [c000000000368fd4] __writeback_single_inode+0x64/0x500 
> [ 9459.472756] [c00000009a987940] [c000000000369b4c] writeback_sb_inodes+0x2dc/0x5b0 
> [ 9459.473173] [c00000009a987a50] [c000000000369f04] __writeback_inodes_wb+0xe4/0x150 
> [ 9459.473590] [c00000009a987ab0] [c00000000036a23c] wb_writeback+0x2cc/0x420 
> [ 9459.473935] [c00000009a987b80] [c00000000036aea8] wb_workfn+0x138/0x4c0 
> [ 9459.474290] [c00000009a987c90] [c00000000010b828] process_one_work+0x198/0x480 
> [ 9459.474696] [c00000009a987d30] [c00000000010bb98] worker_thread+0x88/0x510 
> [ 9459.474970] [c00000009a987dc0] [c000000000113920] kthread+0x160/0x1a0 
> [ 9459.475240] [c00000009a987e30] [c00000000000b524] ret_from_kernel_thread+0x5c/0xb8 
> [ 9459.475579] INFO: task fsstress:2190 blocked for more than 120 seconds. 
> [ 9459.475846]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.476054] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.476365] fsstress        D    0  2190   2188 0x00040080 
> [ 9459.476577] Call Trace: 
> [ 9459.476684] [c0000001ea0276e0] [0000000000000001] 0x1 (unreliable) 
> [ 9459.476949] [c0000001ea0278b0] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.477210] [c0000001ea027910] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.477466] [c0000001ea0279e0] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.477779] [c0000001ea027a10] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
> [ 9459.478142] [c0000001ea027a70] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
> [ 9459.478529] [c0000001ea027ac0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
> [ 9459.478890] [c0000001ea027b10] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
> [ 9459.479247] [c0000001ea027b60] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
> [ 9459.479604] [c0000001ea027bc0] [d000000004fa1be8] xfs_free_file_space+0x288/0x430 [xfs] 
> [ 9459.479963] [c0000001ea027cd0] [d000000004fabef0] xfs_file_fallocate+0x190/0x3b0 [xfs] 
> [ 9459.480280] [c0000001ea027d90] [c000000000321148] vfs_fallocate+0x178/0x300 
> [ 9459.480539] [c0000001ea027de0] [c000000000322630] SyS_fallocate+0x60/0xe0 
> [ 9459.480804] [c0000001ea027e30] [c00000000000b184] system_call+0x58/0x6c 
> [ 9459.481071] INFO: task fsstress:2191 blocked for more than 120 seconds. 
> [ 9459.481338]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.481548] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.481862] fsstress        D    0  2191   2188 0x00040080 
> [ 9459.482076] Call Trace: 
> [ 9459.482185] [c0000001ea073800] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.482455] [c0000001ea073860] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.482725] [c0000001ea073930] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.483039] [c0000001ea073960] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
> [ 9459.483406] [c0000001ea0739c0] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
> [ 9459.483770] [c0000001ea073a10] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
> [ 9459.484130] [c0000001ea073a60] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
> [ 9459.484490] [c0000001ea073ab0] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
> [ 9459.484842] [c0000001ea073b10] [d000000004fbbe80] xfs_vn_update_time+0x60/0x1f0 [xfs] 
> [ 9459.485155] [c0000001ea073b70] [c00000000034e530] file_update_time+0xf0/0x170 
> [ 9459.485508] [c0000001ea073bc0] [d000000004fac408] xfs_file_aio_write_checks+0x1f8/0x270 [xfs] 
> [ 9459.485913] [c0000001ea073c50] [d000000004fac73c] xfs_file_dio_aio_write+0xcc/0x360 [xfs] 
> [ 9459.486269] [c0000001ea073cb0] [d000000004facd80] xfs_file_write_iter+0xd0/0x1d0 [xfs] 
> [ 9459.486584] [c0000001ea073d00] [c000000000324210] __vfs_write+0x120/0x200 
> [ 9459.486841] [c0000001ea073d90] [c000000000325c88] vfs_write+0xc8/0x240 
> [ 9459.487100] [c0000001ea073de0] [c000000000327a80] SyS_write+0x60/0x110 
> [ 9459.487363] [c0000001ea073e30] [c00000000000b184] system_call+0x58/0x6c 
> [ 9459.487622] INFO: task fsstress:2192 blocked for more than 120 seconds. 
> [ 9459.487879]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.488087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.488451] fsstress        D    0  2192   2188 0x00040080 
> [ 9459.488631] Call Trace: 
> [ 9459.488713] [c0000001ea07b6a0] [c0000001ea07b740] 0xc0000001ea07b740 (unreliable) 
> [ 9459.488939] [c0000001ea07b870] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.489119] [c0000001ea07b8d0] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.489301] [c0000001ea07b9a0] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.489509] [c0000001ea07b9d0] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
> [ 9459.489749] [c0000001ea07ba30] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
> [ 9459.489984] [c0000001ea07ba80] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
> [ 9459.490218] [c0000001ea07bad0] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
> [ 9459.490450] [c0000001ea07bb20] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
> [ 9459.490687] [c0000001ea07bb80] [d000000004fc1364] xfs_create+0x1b4/0x7d0 [xfs] 
> [ 9459.490922] [c0000001ea07bcc0] [d000000004fbdb38] xfs_generic_create+0x2f8/0x440 [xfs] 
> [ 9459.491132] [c0000001ea07bd50] [c000000000336fe8] vfs_mknod+0x1c8/0x260 
> [ 9459.491311] [c0000001ea07bda0] [c00000000033da58] SyS_mknodat+0x2d8/0x310 
> [ 9459.491489] [c0000001ea07be30] [c00000000000b184] system_call+0x58/0x6c 
> [ 9459.491666] INFO: task fsstress:2193 blocked for more than 120 seconds. 
> [ 9459.491840]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.491977] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.492190] fsstress        D    0  2193   2188 0x00040080 
> [ 9459.492329] Call Trace: 
> [ 9459.492400] [c0000001ea043740] [0000000600000007] 0x600000007 (unreliable) 
> [ 9459.492576] [c0000001ea043910] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.492751] [c0000001ea043970] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.492936] [c0000001ea043a40] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.493151] [c0000001ea043a70] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
> [ 9459.493386] [c0000001ea043ad0] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
> [ 9459.493621] [c0000001ea043b20] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
> [ 9459.493856] [c0000001ea043b70] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
> [ 9459.494087] [c0000001ea043bc0] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
> [ 9459.494318] [c0000001ea043c20] [d000000004fc1f6c] xfs_link+0x17c/0x370 [xfs] 
> [ 9459.494560] [c0000001ea043cc0] [d000000004fbc568] xfs_vn_link+0x68/0xc0 [xfs] 
> [ 9459.494770] [c0000001ea043d10] [c00000000033785c] vfs_link+0x34c/0x480 
> [ 9459.494942] [c0000001ea043d70] [c00000000033e0c0] SyS_linkat+0x2a0/0x370 
> [ 9459.495118] [c0000001ea043e30] [c00000000000b184] system_call+0x58/0x6c 
> [ 9459.495290] INFO: task fsstress:2194 blocked for more than 120 seconds. 
> [ 9459.495462]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.495597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.495801] fsstress        D    0  2194   2188 0x00040080 
> [ 9459.495942] Call Trace: 
> [ 9459.496011] [c0000001ea01b900] [0000000000000001] 0x1 (unreliable) 
> [ 9459.496183] [c0000001ea01bad0] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.496357] [c0000001ea01bb30] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.496531] [c0000001ea01bc00] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.496709] [c0000001ea01bc30] [c0000000003670b4] wb_wait_for_completion+0x84/0xd0 
> [ 9459.496919] [c0000001ea01bc90] [c000000000368530] sync_inodes_sb+0xd0/0x330 
> [ 9459.497094] [c0000001ea01bd60] [c00000000036fd84] sync_inodes_one_sb+0x24/0x40 
> [ 9459.497301] [c0000001ea01bd80] [c00000000032b978] iterate_supers+0x1b8/0x1f0 
> [ 9459.497509] [c0000001ea01bdf0] [c0000000003702a0] sys_sync+0x50/0xe0 
> [ 9459.497682] [c0000001ea01be30] [c00000000000b184] system_call+0x58/0x6c 
> [ 9459.497857] INFO: task fsstress:2195 blocked for more than 120 seconds. 
> [ 9459.498027]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.498165] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.498376] fsstress        D    0  2195   2188 0x00040080 
> [ 9459.498514] Call Trace: 
> [ 9459.498586] [c0000001ea057ad0] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.498761] [c0000001ea057b30] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.498935] [c0000001ea057c00] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.499107] [c0000001ea057c30] [c0000000003670b4] wb_wait_for_completion+0x84/0xd0 
> [ 9459.499322] [c0000001ea057c90] [c000000000368530] sync_inodes_sb+0xd0/0x330 
> [ 9459.499496] [c0000001ea057d60] [c00000000036fd84] sync_inodes_one_sb+0x24/0x40 
> [ 9459.499703] [c0000001ea057d80] [c00000000032b978] iterate_supers+0x1b8/0x1f0 
> [ 9459.499914] [c0000001ea057df0] [c0000000003702a0] sys_sync+0x50/0xe0 
> [ 9459.500089] [c0000001ea057e30] [c00000000000b184] system_call+0x58/0x6c 
> [ 9459.500265] INFO: task fsstress:2196 blocked for more than 120 seconds. 
> [ 9459.500440]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.500577] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.500787] fsstress        D    0  2196   2188 0x00040080 
> [ 9459.500930] Call Trace: 
> [ 9459.501003] [c0000001ea03b8d0] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.501179] [c0000001ea03b930] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.501359] [c0000001ea03ba00] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.501576] [c0000001ea03ba30] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
> [ 9459.501813] [c0000001ea03ba90] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
> [ 9459.502051] [c0000001ea03bae0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
> [ 9459.502288] [c0000001ea03bb30] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
> [ 9459.502521] [c0000001ea03bb80] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
> [ 9459.502755] [c0000001ea03bbe0] [d000000004fbbe80] xfs_vn_update_time+0x60/0x1f0 [xfs] 
> [ 9459.502962] [c0000001ea03bc40] [c0000000003508dc] touch_atime+0xec/0x110 
> [ 9459.503164] [c0000001ea03bc80] [d000000004fab4c8] xfs_file_dio_aio_read+0x168/0x170 [xfs] 
> [ 9459.503396] [c0000001ea03bcd0] [d000000004fab954] xfs_file_read_iter+0x114/0x140 [xfs] 
> [ 9459.503601] [c0000001ea03bd10] [c000000000324010] __vfs_read+0x120/0x200 
> [ 9459.503776] [c0000001ea03bda0] [c000000000325adc] vfs_read+0xac/0x190 
> [ 9459.503948] [c0000001ea03bde0] [c000000000327970] SyS_read+0x60/0x110 
> [ 9459.504123] [c0000001ea03be30] [c00000000000b184] system_call+0x58/0x6c 
> [ 9459.504298] INFO: task fsstress:2197 blocked for more than 120 seconds. 
> [ 9459.504467]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.504608] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.504813] fsstress        D    0  2197   2188 0x00040080 
> [ 9459.504952] Call Trace: 
> [ 9459.505022] [c0000001ea00f6c0] [0000000500000002] 0x500000002 (unreliable) 
> [ 9459.505195] [c0000001ea00f890] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.505373] [c0000001ea00f8f0] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.505552] [c0000001ea00f9c0] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.505756] [c0000001ea00f9f0] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
> [ 9459.505991] [c0000001ea00fa50] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
> [ 9459.506226] [c0000001ea00faa0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
> [ 9459.506460] [c0000001ea00faf0] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
> [ 9459.506698] [c0000001ea00fb40] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
> [ 9459.506931] [c0000001ea00fba0] [d000000004fc1364] xfs_create+0x1b4/0x7d0 [xfs] 
> [ 9459.507161] [c0000001ea00fce0] [d000000004fbdb38] xfs_generic_create+0x2f8/0x440 [xfs] 
> [ 9459.507370] [c0000001ea00fd70] [c000000000337208] vfs_mkdir+0x188/0x270 
> [ 9459.507544] [c0000001ea00fdc0] [c00000000033dbfc] SyS_mkdirat+0x13c/0x150 
> [ 9459.507716] [c0000001ea00fe30] [c00000000000b184] system_call+0x58/0x6c 
> [ 9459.507893] INFO: task fsstress:2198 blocked for more than 120 seconds. 
> [ 9459.508064]       Tainted: G        W       4.13.0-rc2 #1 
> [ 9459.508201] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [ 9459.508412] fsstress        D    0  2198   2188 0x00040080 
> [ 9459.508551] Call Trace: 
> [ 9459.508621] [c0000001ea02b3f0] [0000000600000002] 0x600000002 (unreliable) 
> [ 9459.508796] [c0000001ea02b5c0] [c00000000001ad18] __switch_to+0x2e8/0x430 
> [ 9459.508967] [c0000001ea02b620] [c00000000093a8e8] __schedule+0x3a8/0xaf0 
> [ 9459.509144] [c0000001ea02b6f0] [c00000000093b070] schedule+0x40/0xb0 
> [ 9459.509357] [c0000001ea02b720] [d000000004fd4cd4] xlog_grant_head_wait+0x114/0x320 [xfs] 
> [ 9459.509593] [c0000001ea02b780] [d000000004fd5044] xlog_grant_head_check+0x164/0x1a0 [xfs] 
> [ 9459.509830] [c0000001ea02b7d0] [d000000004fd99d8] xfs_log_reserve+0xd8/0x280 [xfs] 
> [ 9459.510066] [c0000001ea02b820] [d000000004fd1a84] xfs_trans_reserve+0x244/0x2b0 [xfs] 
> [ 9459.510302] [c0000001ea02b870] [d000000004fd330c] xfs_trans_alloc+0xec/0x190 [xfs] 
> [ 9459.510535] [c0000001ea02b8d0] [d000000004fc1364] xfs_create+0x1b4/0x7d0 [xfs] 
> [ 9459.510768] [c0000001ea02ba10] [d000000004fbdb38] xfs_generic_create+0x2f8/0x440 [xfs] 
> [ 9459.510975] [c0000001ea02baa0] [c000000000339fb4] lookup_open+0x654/0x790 
> [ 9459.511151] [c0000001ea02bb60] [c00000000033b784] do_last+0x554/0x8d0 
> [ 9459.511323] [c0000001ea02bc00] [c00000000033bbb4] path_openat+0xb4/0x3c0 
> [ 9459.511496] [c0000001ea02bc90] [c00000000033d584] do_filp_open+0x74/0xf0 
> [ 9459.511670] [c0000001ea02bdc0] [c0000000003238e0] do_sys_open+0x1b0/0x2d0 
> [ 9459.511844] [c0000001ea02be30] [c00000000000b184] system_call+0x58/0x6c 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eryu Guan July 26, 2017, 10:43 a.m. UTC | #6
On Wed, Jul 26, 2017 at 06:35:03AM -0400, Brian Foster wrote:
> On Wed, Jul 26, 2017 at 01:09:52PM +0800, Eryu Guan wrote:
> > Hi Brian,
> > 
> > [sorry for bringing up an old thread, but this seems relavent]
> > 
[snip]
> > 
> > Is xfs/305 supposed to pass with 4.13-rc2 kernel now? I know there're
> > some quota patches went in XFS recently, but I'm not sure if there's any
> > known leftover bug.
> > 
> > Anyway, I hit a test hang with 4.13-rc2 kernel with xfs/305 again, test
> > configuration is rmapbt XFS with 4k block size on ppc64le host. But I
> > haven't tried to reproduce it yet. Just want to confirm with you first.
> > 
> 
> I believe it is still possible for quotaoff to deadlock the log. It is
> not shown below, but I assume you have a blocked xfs_quota process
> attempting to run a quotaoff..? How big is your fs/log?

I'm not sure if there's a quotaoff blocked too, that's all I have for
now, this was an automated test job and it didn't print any sysrq-w
output as expected. The fs is 15G but I don't know much about the log,
the host host was returned to the test pool..

> 
> I sent a patch a while ago, but it wasn't safe because it could
> potentially lead to nested transactions in the quotaoff context. We
> don't have a proper fix for this as of yet.

Ok, thanks! At least this isn't something totally surprising people,
I'll see if I can hit it again in future test runs.

Thanks,
Eryu
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c
index 475a388..9eccfc1 100644
--- a/fs/xfs/xfs_qm_syscalls.c
+++ b/fs/xfs/xfs_qm_syscalls.c
@@ -35,9 +35,11 @@ 
 #include "xfs_trace.h"
 #include "xfs_icache.h"
 
-STATIC int	xfs_qm_log_quotaoff(xfs_mount_t *, xfs_qoff_logitem_t **, uint);
-STATIC int	xfs_qm_log_quotaoff_end(xfs_mount_t *, xfs_qoff_logitem_t *,
-					uint);
+STATIC int	xfs_qm_log_quotaoff(struct xfs_mount *, struct xfs_trans **,
+				    uint);
+STATIC int	xfs_qm_log_quotaoff_end(struct xfs_mount *,
+					struct xfs_qoff_logitem *,
+					struct xfs_trans **, uint);
 
 /*
  * Turn off quota accounting and/or enforcement for all udquots and/or
@@ -56,7 +58,7 @@  xfs_qm_scall_quotaoff(
 	uint			dqtype;
 	int			error;
 	uint			inactivate_flags;
-	xfs_qoff_logitem_t	*qoffstart;
+	struct xfs_trans	*qend_tp;
 
 	/*
 	 * No file system can have quotas enabled on disk but not in core.
@@ -128,7 +130,7 @@  xfs_qm_scall_quotaoff(
 	 * and synchronously. If we fail to write, we should abort the
 	 * operation as it cannot be recovered safely if we crash.
 	 */
-	error = xfs_qm_log_quotaoff(mp, &qoffstart, flags);
+	error = xfs_qm_log_quotaoff(mp, &qend_tp, flags);
 	if (error)
 		goto out_unlock;
 
@@ -181,7 +183,7 @@  xfs_qm_scall_quotaoff(
 	 * So, we have QUOTAOFF start and end logitems; the start
 	 * logitem won't get overwritten until the end logitem appears...
 	 */
-	error = xfs_qm_log_quotaoff_end(mp, qoffstart, flags);
+	error = xfs_trans_commit(qend_tp);
 	if (error) {
 		/* We're screwed now. Shutdown is the only option. */
 		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
@@ -556,13 +558,14 @@  xfs_qm_scall_setqlim(
 
 STATIC int
 xfs_qm_log_quotaoff_end(
-	xfs_mount_t		*mp,
-	xfs_qoff_logitem_t	*startqoff,
+	struct xfs_mount	*mp,
+	struct xfs_qoff_logitem	*startqoff,
+	struct xfs_trans	**tpp,
 	uint			flags)
 {
-	xfs_trans_t		*tp;
+	struct xfs_trans	*tp;
 	int			error;
-	xfs_qoff_logitem_t	*qoffi;
+	struct xfs_qoff_logitem	*qoffi;
 
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_qm_equotaoff, 0, 0, 0, &tp);
 	if (error)
@@ -578,21 +581,22 @@  xfs_qm_log_quotaoff_end(
 	 * We don't care about quotoff's performance.
 	 */
 	xfs_trans_set_sync(tp);
-	return xfs_trans_commit(tp);
+	*tpp = tp;
+	return 0;
 }
 
 
 STATIC int
 xfs_qm_log_quotaoff(
-	xfs_mount_t	       *mp,
-	xfs_qoff_logitem_t     **qoffstartp,
-	uint		       flags)
+	struct xfs_mount	*mp,
+	struct xfs_trans	**end_tp,
+	uint			flags)
 {
-	xfs_trans_t	       *tp;
+	struct xfs_trans	*tp;
 	int			error;
-	xfs_qoff_logitem_t     *qoffi;
+	struct xfs_qoff_logitem	*qoffi;
 
-	*qoffstartp = NULL;
+	*end_tp = NULL;
 
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_qm_quotaoff, 0, 0, 0, &tp);
 	if (error)
@@ -617,7 +621,8 @@  xfs_qm_log_quotaoff(
 	if (error)
 		goto out;
 
-	*qoffstartp = qoffi;
+	error = xfs_qm_log_quotaoff_end(mp, qoffi, end_tp, flags);
+	/* TODO: shutdown on error */
 out:
 	return error;
 }