Message ID | 20200827013444.24270-3-laoar.shao@gmail.com (mailing list archive) |
---|---|
State | Superseded, archived |
Series | avoid xfs transaction reservation recursion |

On Thu, Aug 27, 2020 at 09:34:44AM +0800, Yafang Shao wrote:
> @@ -1500,9 +1500,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
>
>  	/*
>  	 * Given that we do not allow direct reclaim to call us, we should
> -	 * never be called in a recursive filesystem reclaim context.
> +	 * never be called while in a filesystem transaction.
>  	 */
> -	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
> +	if (WARN_ON_ONCE(wbc->fstrans_recursion))
>  		goto redirty;

Erm, Dave said:

> I think we should just remove
> the check completely from iomap_writepage() and move it up into
> xfs_vm_writepage() and xfs_vm_writepages().

ie everywhere you set this new bit, just check current->journal_info.

On Thu, Aug 27, 2020 at 9:58 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Aug 27, 2020 at 09:34:44AM +0800, Yafang Shao wrote:
> > @@ -1500,9 +1500,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> >
> >  	/*
> >  	 * Given that we do not allow direct reclaim to call us, we should
> > -	 * never be called in a recursive filesystem reclaim context.
> > +	 * never be called while in a filesystem transaction.
> >  	 */
> > -	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
> > +	if (WARN_ON_ONCE(wbc->fstrans_recursion))
> >  		goto redirty;
>
> Erm, Dave said:
>
> > I think we should just remove
> > the check completely from iomap_writepage() and move it up into
> > xfs_vm_writepage() and xfs_vm_writepages().
>
> ie everywhere you set this new bit, just check current->journal_info.

I don't follow you. Could you please be more specific?

I moved the check of current->journal_info into xfs_vm_writepage() and
xfs_vm_writepages(), and I think that is the easiest way to implement
it.

 	/* we abort the update if there was an IO error */
@@ -564,6 +565,9 @@ xfs_vm_writepage(
 {
 	struct xfs_writepage_ctx wpc = { };

+	if (xfs_trans_context_active())
+		wbc->fstrans_recursion = 1;    <<< set for XFS only.
+
 	return iomap_writepage(page, wbc, &wpc.ctx, &xfs_writeback_ops);
 }

--
Thanks
Yafang

On Thu, Aug 27, 2020 at 10:13:15AM +0800, Yafang Shao wrote:
> On Thu, Aug 27, 2020 at 9:58 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Thu, Aug 27, 2020 at 09:34:44AM +0800, Yafang Shao wrote:
> > > @@ -1500,9 +1500,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> > >
> > >  	/*
> > >  	 * Given that we do not allow direct reclaim to call us, we should
> > > -	 * never be called in a recursive filesystem reclaim context.
> > > +	 * never be called while in a filesystem transaction.
> > >  	 */
> > > -	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
> > > +	if (WARN_ON_ONCE(wbc->fstrans_recursion))
> > >  		goto redirty;
> >
> > Erm, Dave said:
> >
> > > I think we should just remove
> > > the check completely from iomap_writepage() and move it up into
> > > xfs_vm_writepage() and xfs_vm_writepages().
> >
> > ie everywhere you set this new bit, just check current->journal_info.
>
>
> I don't follow you. Could you please be more specific?
>
> I moved the check of current->journal_info into xfs_vm_writepage() and
> xfs_vm_writepages(), and I think that is the easiest way to implement
> it.
>
>  	/* we abort the update if there was an IO error */
> @@ -564,6 +565,9 @@ xfs_vm_writepage(
>  {
>  	struct xfs_writepage_ctx wpc = { };
>
> +	if (xfs_trans_context_active())
> +		wbc->fstrans_recursion = 1;    <<< set for XFS only.
> +
>  	return iomap_writepage(page, wbc, &wpc.ctx, &xfs_writeback_ops);

Get rid of wbc->fstrans_recursion. Just do

	if (WARN_ON_ONCE(current->journal_info))
		.....

right here in the XFS code.

Cheers,

Dave.

On Thu, Aug 27, 2020 at 10:42 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Thu, Aug 27, 2020 at 10:13:15AM +0800, Yafang Shao wrote:
> > On Thu, Aug 27, 2020 at 9:58 AM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Thu, Aug 27, 2020 at 09:34:44AM +0800, Yafang Shao wrote:
> > > > @@ -1500,9 +1500,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> > > >
> > > >  	/*
> > > >  	 * Given that we do not allow direct reclaim to call us, we should
> > > > -	 * never be called in a recursive filesystem reclaim context.
> > > > +	 * never be called while in a filesystem transaction.
> > > >  	 */
> > > > -	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
> > > > +	if (WARN_ON_ONCE(wbc->fstrans_recursion))
> > > >  		goto redirty;
> > >
> > > Erm, Dave said:
> > >
> > > > I think we should just remove
> > > > the check completely from iomap_writepage() and move it up into
> > > > xfs_vm_writepage() and xfs_vm_writepages().
> > >
> > > ie everywhere you set this new bit, just check current->journal_info.
> >
> >
> > I don't follow you. Could you please be more specific?
> >
> > I moved the check of current->journal_info into xfs_vm_writepage() and
> > xfs_vm_writepages(), and I think that is the easiest way to implement
> > it.
> >
> >  	/* we abort the update if there was an IO error */
> > @@ -564,6 +565,9 @@ xfs_vm_writepage(
> >  {
> >  	struct xfs_writepage_ctx wpc = { };
> >
> > +	if (xfs_trans_context_active())
> > +		wbc->fstrans_recursion = 1;    <<< set for XFS only.
> > +
> >  	return iomap_writepage(page, wbc, &wpc.ctx, &xfs_writeback_ops);
>
> Get rid of wbc->fstrans_recursion. Just do
>
>	if (WARN_ON_ONCE(current->journal_info))
>		.....
>
> right here in the XFS code.
>

Understood. But we would then have to implement the 'redirty' path in
the XFS code as well, which may make the implementation more complicated.

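To make the two positions above concrete, here is a minimal userspace sketch in C of what checking the transaction state in the filesystem's own writepage wrapper, including the redirty step, might look like. It is only a model under stated assumptions, not kernel code and not what was eventually merged: `struct task`, `struct page`, `current_task`, `generic_writepage()`, `redirty_and_unlock()` and `fs_writepage()` are hypothetical stand-ins for `current`, the page cache page, `iomap_writepage()`, `redirty_page_for_writepage()` plus `unlock_page()`, and `xfs_vm_writepage()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel objects; not the real definitions. */
struct task { void *journal_info; };      /* models the field in task_struct */
struct page { bool dirty; bool locked; };

static struct task current_task;          /* models `current` */

/* Models the generic iomap writeback path, which no longer checks anything. */
static int generic_writepage(struct page *page)
{
	page->dirty = false;              /* pretend the data reached disk */
	page->locked = false;
	return 0;
}

/* Models redirty_page_for_writepage() followed by unlock_page(). */
static void redirty_and_unlock(struct page *page)
{
	page->dirty = true;
	page->locked = false;
}

/* Models xfs_vm_writepage() with the transaction check done locally. */
static int fs_writepage(struct page *page)
{
	static bool warned;               /* mimics the "once" in WARN_ON_ONCE */

	assert(page->locked);             /* writepage is entered with the page locked */
	if (current_task.journal_info != NULL) {
		if (!warned) {
			fprintf(stderr, "writepage called inside a transaction\n");
			warned = true;
		}
		redirty_and_unlock(page);
		return 0;
	}
	return generic_writepage(page);
}
```

In this model the extra cost Yafang mentions is the `redirty_and_unlock()` branch: once the check lives in the filesystem wrapper, the wrapper also has to leave the page dirty and unlocked itself instead of relying on the generic code's `redirty` label.
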
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index bcfc288dba3f..12e0fc4cb825 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1500,9 +1500,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)

 	/*
 	 * Given that we do not allow direct reclaim to call us, we should
-	 * never be called in a recursive filesystem reclaim context.
+	 * never be called while in a filesystem transaction.
 	 */
-	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
+	if (WARN_ON_ONCE(wbc->fstrans_recursion))
 		goto redirty;

 	/*
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index b35611882ff9..9e2eedc9d208 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -62,7 +62,8 @@ xfs_setfilesize_trans_alloc(
 	 * We hand off the transaction to the completion thread now, so
 	 * clear the flag here.
 	 */
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	xfs_trans_context_clear(tp);
+
 	return 0;
 }

@@ -125,7 +126,7 @@ xfs_setfilesize_ioend(
 	 * thus we need to mark ourselves as being in a transaction manually.
 	 * Similarly for freeze protection.
 	 */
-	current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	xfs_trans_context_set(tp);
 	__sb_writers_acquired(VFS_I(ip)->i_sb, SB_FREEZE_FS);

 	/* we abort the update if there was an IO error */
@@ -564,6 +565,9 @@ xfs_vm_writepage(
 {
 	struct xfs_writepage_ctx wpc = { };

+	if (xfs_trans_context_active())
+		wbc->fstrans_recursion = 1;
+
 	return iomap_writepage(page, wbc, &wpc.ctx, &xfs_writeback_ops);
 }

@@ -574,6 +578,9 @@ xfs_vm_writepages(
 {
 	struct xfs_writepage_ctx wpc = { };

+	if (xfs_trans_context_active())
+		wbc->fstrans_recursion = 1;
+
 	xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
 	return iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
 }
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index ab737fed7b12..8a4f6db77e33 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -102,10 +102,6 @@ typedef __u32			xfs_nlink_t;
 #define xfs_cowb_secs		xfs_params.cowb_timer.val

 #define current_cpu()		(raw_smp_processor_id())
-#define current_set_flags_nested(sp, f)		\
-		(*(sp) = current->flags, current->flags |= (f))
-#define current_restore_flags_nested(sp, f)	\
-		(current->flags = ((current->flags & ~(f)) | (*(sp) & (f))))

 #define NBBY		8		/* number of bits per byte */

diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index ed72867b1a19..5f3a4ff51b3c 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -153,8 +153,6 @@ xfs_trans_reserve(
 	int			error = 0;
 	bool			rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;

-	/* Mark this thread as being in a transaction */
-	current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);

 	/*
 	 * Attempt to reserve the needed disk blocks by decrementing
@@ -163,10 +161,8 @@ xfs_trans_reserve(
 	 */
 	if (blocks > 0) {
 		error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
-		if (error != 0) {
-			current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+		if (error != 0)
 			return -ENOSPC;
-		}
 		tp->t_blk_res += blocks;
 	}

@@ -241,8 +237,6 @@ xfs_trans_reserve(
 		tp->t_blk_res = 0;
 	}

-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
-
 	return error;
 }

@@ -284,6 +278,8 @@ xfs_trans_alloc(
 	INIT_LIST_HEAD(&tp->t_dfops);
 	tp->t_firstblock = NULLFSBLOCK;

+	/* Mark this thread as being in a transaction */
+	xfs_trans_context_set(tp);
 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
 	if (error) {
 		xfs_trans_cancel(tp);
@@ -878,7 +874,8 @@ __xfs_trans_commit(

 	xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);

-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	if (!regrant)
+		xfs_trans_context_clear(tp);
 	xfs_trans_free(tp);

 	/*
@@ -910,7 +907,8 @@ __xfs_trans_commit(
 		xfs_log_ticket_ungrant(mp->m_log, tp->t_ticket);
 		tp->t_ticket = NULL;
 	}
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+
+	xfs_trans_context_clear(tp);
 	xfs_trans_free_items(tp, !!error);
 	xfs_trans_free(tp);

@@ -971,7 +969,7 @@ xfs_trans_cancel(
 	}

 	/* mark this thread as no longer being in a transaction */
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	xfs_trans_context_clear(tp);

 	xfs_trans_free_items(tp, dirty);
 	xfs_trans_free(tp);
@@ -1013,6 +1011,7 @@ xfs_trans_roll(
 	if (error)
 		return error;

+	xfs_trans_context_update(trans, *tpp);
 	/*
 	 * Reserve space in the log for the next transaction.
 	 * This also pushes items in the "AIL", the list of logged items,
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index b752501818d2..f84b563438f6 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -243,4 +243,34 @@ void		xfs_trans_buf_copy_type(struct xfs_buf *dst_bp,

 extern kmem_zone_t	*xfs_trans_zone;

+static inline void
+xfs_trans_context_set(struct xfs_trans *tp)
+{
+	ASSERT(!current->journal_info);
+	current->journal_info = tp;
+	tp->t_pflags = memalloc_nofs_save();
+}
+
+static inline void
+xfs_trans_context_update(struct xfs_trans *old, struct xfs_trans *new)
+{
+	ASSERT(current->journal_info == old);
+	current->journal_info = new;
+}
+
+static inline void
+xfs_trans_context_clear(struct xfs_trans *tp)
+{
+	ASSERT(current->journal_info == tp);
+	current->journal_info = NULL;
+	memalloc_nofs_restore(tp->t_pflags);
+}
+
+static inline bool
+xfs_trans_context_active(void)
+{
+	/* Use journal_info to indicate current is in a transaction */
+	return current->journal_info != NULL;
+}
+
 #endif	/* __XFS_TRANS_H__ */
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 8e5c5bb16e2d..33f6fb901a78 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -80,6 +80,9 @@ struct writeback_control {

 	unsigned punt_to_cgroup:1;	/* cgrp punting, see __REQ_CGROUP_PUNT */

+	/* To avoid filesystem transaction reservation recursion */
+	unsigned fstrans_recursion:1;
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 	struct bdi_writeback *wb;	/* wb this writeback is issued under */
 	struct inode *inode;		/* inode being written out */
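
The life cycle of the four helpers added to xfs_trans.h (set on allocation, update on roll, clear on commit or cancel) can be followed in a small userspace model. This is a hedged sketch, not the kernel implementation: `struct task`, `struct trans`, `current_task`, `TASK_NOFS` and the `trans_context_*()` names are hypothetical stand-ins for `current`, `struct xfs_trans`, `PF_MEMALLOC_NOFS` and the `xfs_trans_context_*()` helpers, and the save/restore arithmetic models `memalloc_nofs_save()` and `memalloc_nofs_restore()`. One deliberate difference is labeled in the code: the model's update step also carries the saved flag over to the new transaction, whereas the patch's helper only swaps `journal_info`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TASK_NOFS 0x1u   /* hypothetical stand-in for PF_MEMALLOC_NOFS */

/* Hypothetical stand-ins for task_struct and struct xfs_trans. */
struct task { void *journal_info; unsigned int flags; };
struct trans { unsigned int saved_flags; };   /* models tp->t_pflags */

static struct task current_task;              /* models `current` */

static void trans_context_set(struct trans *tp)
{
	assert(current_task.journal_info == NULL);
	current_task.journal_info = tp;
	/* Models memalloc_nofs_save(): remember the old bit, then set it. */
	tp->saved_flags = current_task.flags & TASK_NOFS;
	current_task.flags |= TASK_NOFS;
}

static void trans_context_update(struct trans *old, struct trans *next)
{
	assert(current_task.journal_info == old);
	current_task.journal_info = next;
	/* Assumption of this model only: hand the saved bit to the new owner. */
	next->saved_flags = old->saved_flags;
}

static void trans_context_clear(struct trans *tp)
{
	assert(current_task.journal_info == tp);
	current_task.journal_info = NULL;
	/* Models memalloc_nofs_restore(): put back the saved bit. */
	current_task.flags = (current_task.flags & ~TASK_NOFS) | tp->saved_flags;
}

static bool trans_context_active(void)
{
	/* A non-NULL journal_info means the task is inside a transaction. */
	return current_task.journal_info != NULL;
}
```

A roll in this model is `trans_context_update(&a, &b)` between a set on `a` and a clear on `b`; the task stays marked as in a transaction throughout, which is why the patch skips the clear in `__xfs_trans_commit()` when `regrant` is true.
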