Message ID | 20201005152102.15797-1-bfoster@redhat.com (mailing list archive) |
---|---|
State | New |
Series | None |
On Mon, Oct 05, 2020 at 11:21:02AM -0400, Brian Foster wrote:
> We've had reports of soft lockup warnings in the iomap ioend
> completion path due to very large bios and/or bio chains. Divert any
> ioends with 256k or more pages to process to the workqueue so
> completion occurs in non-atomic context and can reschedule to avoid
> soft lockup warnings.

Hmmmm... is there any way we can just make end_page_writeback faster?

TBH it still strikes me as odd that we'd cap ioends this way just to
cover for the fact that we have to poke each and every page.

(Also, those 'bool atomic' in the other patch make me kind of nervous --
how do we make sure (from a QA perspective) that nobody gets that
wrong?)

--D

> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>
> v2:
> - Fix type in macro.
>
>  fs/xfs/xfs_aops.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 3e061ea99922..c00cc0624986 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -30,6 +30,13 @@ XFS_WPC(struct iomap_writepage_ctx *ctx)
>  	return container_of(ctx, struct xfs_writepage_ctx, ctx);
>  }
>
> +/*
> + * Kick extra large ioends off to the workqueue. Completion will process a lot
> + * of pages for a large bio or bio chain and a non-atomic context is required to
> + * reschedule and avoid soft lockup warnings.
> + */
> +#define XFS_LARGE_IOEND	(262144ULL << PAGE_SHIFT)
> +
>  /*
>   * Fast and loose check if this write could update the on-disk inode size.
>   */
> @@ -239,7 +246,8 @@ static inline bool xfs_ioend_needs_workqueue(struct iomap_ioend *ioend)
>  {
>  	return ioend->io_private ||
>  		ioend->io_type == IOMAP_UNWRITTEN ||
> -		(ioend->io_flags & IOMAP_F_SHARED);
> +		(ioend->io_flags & IOMAP_F_SHARED) ||
> +		(ioend->io_size >= XFS_LARGE_IOEND);
>  }
>
>  STATIC void
> --
> 2.25.4
>
On Mon, Oct 05, 2020 at 08:55:37PM -0700, Darrick J. Wong wrote:
> On Mon, Oct 05, 2020 at 11:21:02AM -0400, Brian Foster wrote:
> > We've had reports of soft lockup warnings in the iomap ioend
> > completion path due to very large bios and/or bio chains. Divert any
> > ioends with 256k or more pages to process to the workqueue so
> > completion occurs in non-atomic context and can reschedule to avoid
> > soft lockup warnings.
>
> Hmmmm... is there any way we can just make end_page_writeback faster?
>

I'm not sure that would help us. It's not doing much work as it is. The
issue is just that we effectively queue so many of them to a single bio
completion due to either bio chaining or the physical page merging
implemented by multipage bvecs.

> TBH it still strikes me as odd that we'd cap ioends this way just to
> cover for the fact that we have to poke each and every page.
>

I suppose, but it's not like we don't already account for constructing
bios that must be handed off to a workqueue for completion processing.
Also FWIW this doesn't cap ioend size like my original patch does. It
just updates XFS to send them to the completion workqueue.

> (Also, those 'bool atomic' in the other patch make me kind of nervous --
> how do we make sure (from a QA perspective) that nobody gets that
> wrong?)
>

Yeah, that's a bit ugly. If somebody has a better idea on the factoring
I'm interested in hearing about it. My understanding is that in_atomic()
is not reliable and/or generally frowned upon, hence the explicit
context parameter. Also, I don't have the error handy but my development
kernel complains quite clearly if we make a call that can potentially
sleep in atomic context. I believe this is the purpose of the
__might_sleep() (CONFIG_DEBUG_ATOMIC_SLEEP) annotation.

Brian

> --D
>
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >
> > v2:
> > - Fix type in macro.
> >
> >  fs/xfs/xfs_aops.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> > index 3e061ea99922..c00cc0624986 100644
> > --- a/fs/xfs/xfs_aops.c
> > +++ b/fs/xfs/xfs_aops.c
> > @@ -30,6 +30,13 @@ XFS_WPC(struct iomap_writepage_ctx *ctx)
> >  	return container_of(ctx, struct xfs_writepage_ctx, ctx);
> >  }
> >
> > +/*
> > + * Kick extra large ioends off to the workqueue. Completion will process a lot
> > + * of pages for a large bio or bio chain and a non-atomic context is required to
> > + * reschedule and avoid soft lockup warnings.
> > + */
> > +#define XFS_LARGE_IOEND	(262144ULL << PAGE_SHIFT)
> > +
> >  /*
> >   * Fast and loose check if this write could update the on-disk inode size.
> >   */
> > @@ -239,7 +246,8 @@ static inline bool xfs_ioend_needs_workqueue(struct iomap_ioend *ioend)
> >  {
> >  	return ioend->io_private ||
> >  		ioend->io_type == IOMAP_UNWRITTEN ||
> > -		(ioend->io_flags & IOMAP_F_SHARED);
> > +		(ioend->io_flags & IOMAP_F_SHARED) ||
> > +		(ioend->io_size >= XFS_LARGE_IOEND);
> >  }
> >
> >  STATIC void
> > --
> > 2.25.4
> >
>
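For readers unfamiliar with the "bool atomic" pattern being debated above, here is a minimal sketch of the idea. The function and parameter names are illustrative only, not the code from the companion patch: the caller, which knows its own context, states whether sleeping is allowed rather than the callee guessing via in_atomic(). Because cond_resched() carries a might_sleep()-style annotation, a caller that wrongly passes atomic == false while actually in atomic context triggers a "BUG: sleeping function called from invalid context" splat on CONFIG_DEBUG_ATOMIC_SLEEP kernels, which is the debug-kernel complaint Brian refers to.

#include <linux/pagemap.h>	/* end_page_writeback() */
#include <linux/sched.h>	/* cond_resched() */

/* Hypothetical helper, for illustration only. */
static void example_end_pages_writeback(struct page **pages, unsigned int nr,
					bool atomic)
{
	unsigned int i;

	for (i = 0; i < nr; i++) {
		end_page_writeback(pages[i]);

		/* Yield periodically, but only when the caller may sleep. */
		if (!atomic && (i + 1) % 1024 == 0)
			cond_resched();
	}
}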
On Mon, Oct 05, 2020 at 08:55:37PM -0700, Darrick J. Wong wrote:
> On Mon, Oct 05, 2020 at 11:21:02AM -0400, Brian Foster wrote:
> > We've had reports of soft lockup warnings in the iomap ioend
> > completion path due to very large bios and/or bio chains. Divert any
> > ioends with 256k or more pages to process to the workqueue so
> > completion occurs in non-atomic context and can reschedule to avoid
> > soft lockup warnings.
>
> Hmmmm... is there any way we can just make end_page_writeback faster?

There are ways to make it faster. I don't know if they're a "just"
solution ...

1. We can use THPs. That will reduce the number of pages being operated
on. I hear somebody might have a patch set for that. Incidentally, this
patch set will clash with the THP patchset, so one of us is going to
have to rebase on the other's work. Not a complaint, just acknowledging
that some coordination will be needed for the 5.11 merge window.

2. We could create end_writeback_pages(struct pagevec *pvec) which
calls a new test_clear_writeback_pages(pvec). That could amortise
taking the memcg lock and finding the lruvec and taking the mapping
lock -- assuming these pages are sufficiently virtually contiguous.
It can definitely amortise all the statistics updates.

3. We can make wake_up_page(page, PG_writeback); more efficient. If you
can produce this situation on demand, I had a patch for that which
languished due to lack of interest.

https://lore.kernel.org/linux-fsdevel/20200416220130.13343-1-willy@infradead.org/
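To make option (2) concrete, here is a rough sketch of the batched interface described above. Neither end_writeback_pages() nor test_clear_writeback_pages() exists in the tree at this point; the names come from the email, and the body only shows where the amortisation would happen (wake_up_page() is an mm/filemap.c-internal helper, so a real version would live in mm code).

#include <linux/pagevec.h>	/* struct pagevec, pagevec_count() */
#include <linux/page-flags.h>	/* PG_writeback */

/*
 * Hypothetical sketch only. test_clear_writeback_pages() would clear
 * PG_writeback across the whole pagevec while taking the mapping lock,
 * the memcg lock and the lruvec once, and batching the writeback
 * statistics updates, instead of paying those costs per page as
 * test_clear_page_writeback() does today.
 */
void end_writeback_pages(struct pagevec *pvec)
{
	int i;

	/* One pass over the locks and counters for the whole batch. */
	test_clear_writeback_pages(pvec);

	/* Waking waiters still happens per page. */
	for (i = 0; i < pagevec_count(pvec); i++)
		wake_up_page(pvec->pages[i], PG_writeback);
}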
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 3e061ea99922..c00cc0624986 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -30,6 +30,13 @@ XFS_WPC(struct iomap_writepage_ctx *ctx)
 	return container_of(ctx, struct xfs_writepage_ctx, ctx);
 }
 
+/*
+ * Kick extra large ioends off to the workqueue. Completion will process a lot
+ * of pages for a large bio or bio chain and a non-atomic context is required to
+ * reschedule and avoid soft lockup warnings.
+ */
+#define XFS_LARGE_IOEND	(262144ULL << PAGE_SHIFT)
+
 /*
  * Fast and loose check if this write could update the on-disk inode size.
  */
@@ -239,7 +246,8 @@ static inline bool xfs_ioend_needs_workqueue(struct iomap_ioend *ioend)
 {
 	return ioend->io_private ||
 		ioend->io_type == IOMAP_UNWRITTEN ||
-		(ioend->io_flags & IOMAP_F_SHARED);
+		(ioend->io_flags & IOMAP_F_SHARED) ||
+		(ioend->io_size >= XFS_LARGE_IOEND);
 }
 
 STATIC void
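For context on the threshold in the hunk above: assuming the common 4 KiB page size (PAGE_SHIFT == 12; architectures with larger pages shift further and raise the byte value accordingly), the macro works out to

    XFS_LARGE_IOEND = 262144ULL << PAGE_SHIFT
                    = 2^18 pages * 2^12 bytes per page
                    = 2^30 bytes
                    = 1 GiB

Since ioend->io_size is a byte count, the new clause diverts any ioend covering 262144 (256k) pages or more -- 1 GiB of data at 4 KiB per page -- to the completion workqueue, matching the "256k or more pages" wording in the commit message. The ULL suffix (the v2 "Fix type in macro" change) presumably keeps the shift from overflowing a 32-bit int on configurations with a larger PAGE_SHIFT.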
We've had reports of soft lockup warnings in the iomap ioend
completion path due to very large bios and/or bio chains. Divert any
ioends with 256k or more pages to process to the workqueue so
completion occurs in non-atomic context and can reschedule to avoid
soft lockup warnings.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---

v2:
- Fix type in macro.

 fs/xfs/xfs_aops.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)