Message ID | 20170329204224.6412-1-dev@lynxeye.de (mailing list archive) |
---|---|
State | New, archived |
On Wed, Mar 29, 2017 at 10:42:24PM +0200, Lucas Stach wrote:
> If the AIL has been pushed up to the target LSN, there is no
> point in waking up every 50ms to check if there is more work
> to do. All functions that move the target LSN forward make sure
> to wake aild as appropriate.
>
> Keep the timeout wakeup as a watchdog in case we miss the
> wakeup from a target LSN update to guarantee forward progress,
> but extend the timeout to 10 seconds.
>
> This keeps the safety net, but also makes laptop users happy
> as it gets rid of almost all the wakeups caused by a lightly
> loaded FS.

The aild already has an idle capability that occurs when the target has been reached. See xfsaild() - it will ignore the timeout and schedule indefinitely when the AIL has been emptied and the target has not been updated during the last push. i.e. this timeout is not a watchdog, just a backoff for the next check if there is still work to be done.

Keep in mind that XFS doesn't fully empty the AIL until the log has been covered, and this takes 60-90s to occur after the last modification has occurred to the filesystem. Delaying pushes on an uncovered log risks breaking the covering state machine (it's dependent on writeback from the AIL occurring within a certain time) and so changes like this may break idling on more machines than it "fixes".

FYI, filesystems that refuse to idle are typically a sign of userspace touching the filesystem every 2-3 minutes. IIRC, the XFS ail event tracing will tell you if metadata is being dirtied regularly and so whether the AIL is staying empty or not and hence whether it should actually be idle...

Yes, there have been bugs in this code in the past, and there may be bugs now. However, just bumping the timeout up to something massive is not a solution if there are still bugs lurking here...

Cheers,

Dave.
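The idle rule described in the reply can be illustrated with a small userspace model. This is only a sketch of the decision as the reply states it, not the kernel's code: `struct ail_model`, `enum sleep_kind`, and `aild_next_sleep()` are hypothetical names invented for this example, and locking, memory barriers, and task states are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the AIL state the reply refers to. */
struct ail_model {
	bool     empty;        /* no items left in the AIL */
	uint64_t target;       /* current push target LSN */
	uint64_t target_prev;  /* target as seen by the last push */
};

enum sleep_kind {
	SLEEP_INDEFINITE,  /* idle until something explicitly wakes the thread */
	SLEEP_TIMEOUT,     /* back off for tout_ms, then check again */
	NO_SLEEP           /* push again immediately */
};

/*
 * Decide how the push thread waits before its next pass.
 * The timeout is ignored when the AIL is empty and the target did not
 * move during the last push; otherwise it acts as a backoff interval.
 */
static enum sleep_kind aild_next_sleep(const struct ail_model *ail, long tout_ms)
{
	assert(ail != NULL);

	if (ail->empty && ail->target == ail->target_prev)
		return SLEEP_INDEFINITE;

	return tout_ms > 0 ? SLEEP_TIMEOUT : NO_SLEEP;
}
```

In this model a larger `tout_ms` has no effect on the idle case; it only delays the next check while items remain in the AIL or the target has just moved, which is the situation the reply warns about for an uncovered log.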
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index d6c9c3e..1eb40dc 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -457,12 +457,21 @@ xfsaild_push(
 	if (xfs_buf_delwri_submit_nowait(&ailp->xa_buf_list))
 		ailp->xa_log_flush++;
 
-	if (!count || XFS_LSN_CMP(lsn, target) >= 0) {
+	if (!count) {
 out_done:
 		/*
-		 * We reached the target or the AIL is empty, so wait a bit
-		 * longer for I/O to complete and remove pushed items from the
-		 * AIL before we start the next scan from the start of the AIL.
+		 * If there was nothing to be pushed we can go to sleep longer,
+		 * as this is purely a watchdog timeout. If the target gets
+		 * moved forward we will get scheduled in before hitting this
+		 * timeout.
+		 */
+		tout = 10000;
+		ailp->xa_last_pushed_lsn = 0;
+	} else if (XFS_LSN_CMP(lsn, target) >= 0) {
+		/*
+		 * We reached the target, so wait a bit longer for I/O to
+		 * complete and remove pushed items from the AIL before we
+		 * start the next scan from the start of the AIL.
 		 */
 		tout = 50;
 		ailp->xa_last_pushed_lsn = 0;
If the AIL has been pushed up to the target LSN, there is no
point in waking up every 50ms to check if there is more work
to do. All functions that move the target LSN forward make sure
to wake aild as appropriate.

Keep the timeout wakeup as a watchdog in case we miss the
wakeup from a target LSN update to guarantee forward progress,
but extend the timeout to 10 seconds.

This keeps the safety net, but also makes laptop users happy
as it gets rid of almost all the wakeups caused by a lightly
loaded FS.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
---
 fs/xfs/xfs_trans_ail.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
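To make the proposed change in timeout selection concrete, the two branches touched by the patch can be modelled as pure functions. This is a sketch under assumptions: `tout_before()`, `tout_after()`, `max_wakeups_per_minute()`, and `TOUT_UNMODELLED` are hypothetical names, LSNs are compared as plain integers rather than with `XFS_LSN_CMP`, and any branches of `xfsaild_push()` not shown in the patch are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Returned for cases handled by code that the patch does not show. */
#define TOUT_UNMODELLED (-1L)

/* Before the patch: "nothing pushed" and "target reached" share one 50 ms backoff. */
static long tout_before(int count, int64_t lsn, int64_t target)
{
	if (!count || lsn >= target)
		return 50;
	return TOUT_UNMODELLED;
}

/* After the patch: "nothing pushed" sleeps 10 s; "target reached" keeps 50 ms. */
static long tout_after(int count, int64_t lsn, int64_t target)
{
	if (!count)
		return 10000;
	if (lsn >= target)
		return 50;
	return TOUT_UNMODELLED;
}

/*
 * Upper bound on timer-driven wakeups per minute, assuming the thread
 * sleeps for the full timeout every time and is never woken early.
 */
static long max_wakeups_per_minute(long tout_ms)
{
	assert(tout_ms > 0);
	return 60000 / tout_ms;
}
```

Under these assumptions a 50 ms timeout allows at most 1200 timer wakeups per minute, while a 10000 ms timeout allows at most 6. These are upper bounds only; they say nothing about how often the thread is actually in the timed-sleep state on a given system.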