Message ID | 20200218175425.20598-1-bfoster@redhat.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | [v2] xfs: fix iclog release error check race with shutdown | expand |
On Tue, Feb 18, 2020 at 12:54:25PM -0500, Brian Foster wrote: > Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with > l_icloglock held"), xlog_state_release_iclog() always performed a > locked check of the iclog error state before proceeding into the > sync state processing code. As of this commit, part of > xlog_state_release_iclog() was open-coded into > xfs_log_release_iclog() and as a result the locked error state check > was lost. > > The lockless check still exists, but this doesn't account for the > possibility of a race with a shutdown being performed by another > task causing the iclog state to change while the original task waits > on ->l_icloglock. This has reproduced very rarely via generic/475 > and manifests as an assert failure in __xlog_state_release_iclog() > due to an unexpected iclog state. > > Restore the locked error state check in xlog_state_release_iclog() > to ensure that an iclog state update via shutdown doesn't race with > the iclog release state processing code. > > Fixes: df732b29c807 ("xfs: call xlog_state_release_iclog with l_icloglock held") > Reported-by: Zorro Lang <zlang@redhat.com> > Signed-off-by: Brian Foster <bfoster@redhat.com> Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Feb 18, 2020 at 12:54:25PM -0500, Brian Foster wrote: > Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with > l_icloglock held"), xlog_state_release_iclog() always performed a > locked check of the iclog error state before proceeding into the > sync state processing code. As of this commit, part of > xlog_state_release_iclog() was open-coded into > xfs_log_release_iclog() and as a result the locked error state check > was lost. > > The lockless check still exists, but this doesn't account for the > possibility of a race with a shutdown being performed by another > task causing the iclog state to change while the original task waits > on ->l_icloglock. This has reproduced very rarely via generic/475 > and manifests as an assert failure in __xlog_state_release_iclog() > due to an unexpected iclog state. > > Restore the locked error state check in xlog_state_release_iclog() > to ensure that an iclog state update via shutdown doesn't race with > the iclog release state processing code. > > Fixes: df732b29c807 ("xfs: call xlog_state_release_iclog with l_icloglock held") > Reported-by: Zorro Lang <zlang@redhat.com> > Signed-off-by: Brian Foster <bfoster@redhat.com> > --- > > v2: > - Include Fixes tag. > - Use common error path to include shutdown call. > v1: https://lore.kernel.org/linux-xfs/20200214181528.24046-1-bfoster@redhat.com/ > > fs/xfs/xfs_log.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c > index f6006d94a581..796ff37d5bb5 100644 > --- a/fs/xfs/xfs_log.c > +++ b/fs/xfs/xfs_log.c > @@ -605,18 +605,23 @@ xfs_log_release_iclog( > struct xlog *log = mp->m_log; > bool sync; > > - if (iclog->ic_state == XLOG_STATE_IOERROR) { > - xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > - return -EIO; > - } hmmmm. xfs_force_shutdown() will does nothing if the iclog at the head of the log->iclog list is marked with XLOG_STATE_IOERROR before IO is submitted. In general, that is the case here as the head iclog is what xlog_state_get_iclog_space() returns. i.e. XLOG_STATE_IOERROR here implies the log has already been shut down because the state is IOERROR rather than ACTIVE or WANT_SYNC as was set when the iclog was obtained from xlog_state_get_iclog_space(). > + if (iclog->ic_state == XLOG_STATE_IOERROR) > + goto error; > > if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) { > + if (iclog->ic_state == XLOG_STATE_IOERROR) { > + spin_unlock(&log->l_icloglock); > + goto error; > + } > sync = __xlog_state_release_iclog(log, iclog); > spin_unlock(&log->l_icloglock); > if (sync) > xlog_sync(log, iclog); > } > return 0; > +error: > + xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > + return -EIO; Hence I suspect that this should be ASSERT(XLOG_FORCED_SHUTDOWN(log)) just like is in xfs_log_force_umount() when this pre-existing log IO error condition is tripped over... Cheers, Dave.
On Wed, Feb 19, 2020 at 08:52:43AM +1100, Dave Chinner wrote: > xfs_force_shutdown() will does nothing if the iclog at the head of > the log->iclog list is marked with XLOG_STATE_IOERROR before IO is > submitted. In general, that is the case here as the head iclog is > what xlog_state_get_iclog_space() returns. > > i.e. XLOG_STATE_IOERROR here implies the log has already been shut > down because the state is IOERROR rather than ACTIVE or WANT_SYNC as > was set when the iclog was obtained from > xlog_state_get_iclog_space(). > > > + if (iclog->ic_state == XLOG_STATE_IOERROR) > > + goto error; > > > > if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) { > > + if (iclog->ic_state == XLOG_STATE_IOERROR) { > > + spin_unlock(&log->l_icloglock); > > + goto error; > > + } > > sync = __xlog_state_release_iclog(log, iclog); > > spin_unlock(&log->l_icloglock); > > if (sync) > > xlog_sync(log, iclog); > > } > > return 0; > > +error: > > + xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > + return -EIO; > > Hence I suspect that this should be ASSERT(XLOG_FORCED_SHUTDOWN(log)) > just like is in xfs_log_force_umount() when this pre-existing log > IO error condition is tripped over... Indeed, the xfs_force_shutdown is a no-op both for the old and this new check. Now the real question, which is a bit out of scope for this patch is why we even have XLOG_STATE_IOERROR? Wouldn't it make more sense to just user the shutdown flag in the mount structure and avoid the extra state complexity and thus clean up this whole mess?
On Tue, Feb 18, 2020 at 02:36:44PM -0800, Christoph Hellwig wrote: > On Wed, Feb 19, 2020 at 08:52:43AM +1100, Dave Chinner wrote: > > xfs_force_shutdown() will does nothing if the iclog at the head of > > the log->iclog list is marked with XLOG_STATE_IOERROR before IO is > > submitted. In general, that is the case here as the head iclog is > > what xlog_state_get_iclog_space() returns. > > > > i.e. XLOG_STATE_IOERROR here implies the log has already been shut > > down because the state is IOERROR rather than ACTIVE or WANT_SYNC as > > was set when the iclog was obtained from > > xlog_state_get_iclog_space(). > > > > > + if (iclog->ic_state == XLOG_STATE_IOERROR) > > > + goto error; > > > > > > if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) { > > > + if (iclog->ic_state == XLOG_STATE_IOERROR) { > > > + spin_unlock(&log->l_icloglock); > > > + goto error; > > > + } > > > sync = __xlog_state_release_iclog(log, iclog); > > > spin_unlock(&log->l_icloglock); > > > if (sync) > > > xlog_sync(log, iclog); > > > } > > > return 0; > > > +error: > > > + xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > > + return -EIO; > > > > Hence I suspect that this should be ASSERT(XLOG_FORCED_SHUTDOWN(log)) > > just like is in xfs_log_force_umount() when this pre-existing log > > IO error condition is tripped over... > > Indeed, the xfs_force_shutdown is a no-op both for the old and this > new check. > > Now the real question, which is a bit out of scope for this patch is > why we even have XLOG_STATE_IOERROR? I _think_ it was originally intended to prevent log shutdown recursion when shutdowns trigger log IO errors and try to shut down again. > Wouldn't it make more sense > to just user the shutdown flag in the mount structure and avoid the > extra state complexity and thus clean up this whole mess? I'd suggest that XLOG_FORCED_SHUTDOWN() is more appropriate in code that has no reason to know anything about the xfs_mount state e.g. the code in xlog_state_release_iclog() has a log and iclog context and introducing a xfs-mount context to check for shutdown is a fairly significant layering violation deep inside the internal log implementation... Cheers, Dave.
On Wed, Feb 19, 2020 at 08:52:43AM +1100, Dave Chinner wrote: > On Tue, Feb 18, 2020 at 12:54:25PM -0500, Brian Foster wrote: > > Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with > > l_icloglock held"), xlog_state_release_iclog() always performed a > > locked check of the iclog error state before proceeding into the > > sync state processing code. As of this commit, part of > > xlog_state_release_iclog() was open-coded into > > xfs_log_release_iclog() and as a result the locked error state check > > was lost. > > > > The lockless check still exists, but this doesn't account for the > > possibility of a race with a shutdown being performed by another > > task causing the iclog state to change while the original task waits > > on ->l_icloglock. This has reproduced very rarely via generic/475 > > and manifests as an assert failure in __xlog_state_release_iclog() > > due to an unexpected iclog state. > > > > Restore the locked error state check in xlog_state_release_iclog() > > to ensure that an iclog state update via shutdown doesn't race with > > the iclog release state processing code. > > > > Fixes: df732b29c807 ("xfs: call xlog_state_release_iclog with l_icloglock held") > > Reported-by: Zorro Lang <zlang@redhat.com> > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > --- > > > > v2: > > - Include Fixes tag. > > - Use common error path to include shutdown call. > > v1: https://lore.kernel.org/linux-xfs/20200214181528.24046-1-bfoster@redhat.com/ > > > > fs/xfs/xfs_log.c | 13 +++++++++---- > > 1 file changed, 9 insertions(+), 4 deletions(-) > > > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c > > index f6006d94a581..796ff37d5bb5 100644 > > --- a/fs/xfs/xfs_log.c > > +++ b/fs/xfs/xfs_log.c > > @@ -605,18 +605,23 @@ xfs_log_release_iclog( > > struct xlog *log = mp->m_log; > > bool sync; > > > > - if (iclog->ic_state == XLOG_STATE_IOERROR) { > > - xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > - return -EIO; > > - } > > hmmmm. > > xfs_force_shutdown() will does nothing if the iclog at the head of > the log->iclog list is marked with XLOG_STATE_IOERROR before IO is > submitted. In general, that is the case here as the head iclog is > what xlog_state_get_iclog_space() returns. > > i.e. XLOG_STATE_IOERROR here implies the log has already been shut > down because the state is IOERROR rather than ACTIVE or WANT_SYNC as > was set when the iclog was obtained from > xlog_state_get_iclog_space(). > I'm not sure that assumption is always true. If the iclog is (was) WANT_SYNC, for example, then it's already been switched out from the head of that list. That might only happen if a checkpoint happens to align nicely with the end of an iclog, but that's certainly possible. We also need to consider that ->l_icloglock cycles here and thus somebody else could switch out the head iclog.. > > + if (iclog->ic_state == XLOG_STATE_IOERROR) > > + goto error; > > > > if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) { > > + if (iclog->ic_state == XLOG_STATE_IOERROR) { > > + spin_unlock(&log->l_icloglock); > > + goto error; > > + } > > sync = __xlog_state_release_iclog(log, iclog); > > spin_unlock(&log->l_icloglock); > > if (sync) > > xlog_sync(log, iclog); > > } > > return 0; > > +error: > > + xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > + return -EIO; > > Hence I suspect that this should be ASSERT(XLOG_FORCED_SHUTDOWN(log)) > just like is in xfs_log_force_umount() when this pre-existing log > IO error condition is tripped over... > I think this change is fundamentally correct based on the simple fact that we only ever set XLOG_STATE_IOERROR in the shutdown path. That said, the assert would be potentially racy against the shutdown path without any ordering guarantee that the release path might see the iclog state update prior to the log state update and lead to a potentially false negative assert failure. I suspect this is why the shutdown call might have been made from here in the first place (and partly why I didn't bother with it in the restored locked state check). Given the context of this patch (fixing a regression) and the practical history of this code (and the annoying process of identifying and tracking down the source of these kind of shutdown buglets), I'm inclined to have this patch preserve the historical and relatively proven behavior as much as possible and consider further cleanups separately... Brian > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com >
On Wed, Feb 19, 2020 at 02:00:40PM +1100, Dave Chinner wrote: > > Now the real question, which is a bit out of scope for this patch is > > why we even have XLOG_STATE_IOERROR? > > I _think_ it was originally intended to prevent log shutdown > recursion when shutdowns trigger log IO errors and try to shut down > again. > > > Wouldn't it make more sense > > to just user the shutdown flag in the mount structure and avoid the > > extra state complexity and thus clean up this whole mess? > > I'd suggest that XLOG_FORCED_SHUTDOWN() is more appropriate in code > that has no reason to know anything about the xfs_mount state e.g. > the code in xlog_state_release_iclog() has a log and iclog context > and introducing a xfs-mount context to check for shutdown is a > fairly significant layering violation deep inside the internal log > implementation... Yes, XLOG_FORCED_SHUTDOWN makes more sense. I did in fact hack up a quick patch for that last night, but I'm going to hold it back until the bug fix is merged.
On Tue, Feb 18, 2020 at 12:54:25PM -0500, Brian Foster wrote: > Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with > l_icloglock held"), xlog_state_release_iclog() always performed a > locked check of the iclog error state before proceeding into the > sync state processing code. As of this commit, part of > xlog_state_release_iclog() was open-coded into > xfs_log_release_iclog() and as a result the locked error state check > was lost. > > The lockless check still exists, but this doesn't account for the > possibility of a race with a shutdown being performed by another > task causing the iclog state to change while the original task waits > on ->l_icloglock. This has reproduced very rarely via generic/475 > and manifests as an assert failure in __xlog_state_release_iclog() > due to an unexpected iclog state. > > Restore the locked error state check in xlog_state_release_iclog() > to ensure that an iclog state update via shutdown doesn't race with > the iclog release state processing code. Btw, from code inspection I think we also need the same check after taking the lock in xlog_state_release_iclog.
On Wed, Feb 19, 2020 at 08:12:32AM -0500, Brian Foster wrote: > On Wed, Feb 19, 2020 at 08:52:43AM +1100, Dave Chinner wrote: > > On Tue, Feb 18, 2020 at 12:54:25PM -0500, Brian Foster wrote: > > > Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with > > > l_icloglock held"), xlog_state_release_iclog() always performed a > > > locked check of the iclog error state before proceeding into the > > > sync state processing code. As of this commit, part of > > > xlog_state_release_iclog() was open-coded into > > > xfs_log_release_iclog() and as a result the locked error state check > > > was lost. > > > > > > The lockless check still exists, but this doesn't account for the > > > possibility of a race with a shutdown being performed by another > > > task causing the iclog state to change while the original task waits > > > on ->l_icloglock. This has reproduced very rarely via generic/475 > > > and manifests as an assert failure in __xlog_state_release_iclog() > > > due to an unexpected iclog state. > > > > > > Restore the locked error state check in xlog_state_release_iclog() > > > to ensure that an iclog state update via shutdown doesn't race with > > > the iclog release state processing code. > > > > > > Fixes: df732b29c807 ("xfs: call xlog_state_release_iclog with l_icloglock held") > > > Reported-by: Zorro Lang <zlang@redhat.com> > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > > --- > > > > > > v2: > > > - Include Fixes tag. > > > - Use common error path to include shutdown call. > > > v1: https://lore.kernel.org/linux-xfs/20200214181528.24046-1-bfoster@redhat.com/ > > > > > > fs/xfs/xfs_log.c | 13 +++++++++---- > > > 1 file changed, 9 insertions(+), 4 deletions(-) > > > > > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c > > > index f6006d94a581..796ff37d5bb5 100644 > > > --- a/fs/xfs/xfs_log.c > > > +++ b/fs/xfs/xfs_log.c > > > @@ -605,18 +605,23 @@ xfs_log_release_iclog( > > > struct xlog *log = mp->m_log; > > > bool sync; > > > > > > - if (iclog->ic_state == XLOG_STATE_IOERROR) { > > > - xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > > - return -EIO; > > > - } > > > > hmmmm. > > > > xfs_force_shutdown() will does nothing if the iclog at the head of > > the log->iclog list is marked with XLOG_STATE_IOERROR before IO is > > submitted. In general, that is the case here as the head iclog is > > what xlog_state_get_iclog_space() returns. > > > > i.e. XLOG_STATE_IOERROR here implies the log has already been shut > > down because the state is IOERROR rather than ACTIVE or WANT_SYNC as > > was set when the iclog was obtained from > > xlog_state_get_iclog_space(). > > > > I'm not sure that assumption is always true. If the iclog is (was) > WANT_SYNC, for example, then it's already been switched out from the > head of that list. That might only happen if a checkpoint happens to > align nicely with the end of an iclog, but that's certainly possible. We > also need to consider that ->l_icloglock cycles here and thus somebody > else could switch out the head iclog.. > > > > + if (iclog->ic_state == XLOG_STATE_IOERROR) > > > + goto error; > > > > > > if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) { > > > + if (iclog->ic_state == XLOG_STATE_IOERROR) { > > > + spin_unlock(&log->l_icloglock); > > > + goto error; > > > + } > > > sync = __xlog_state_release_iclog(log, iclog); > > > spin_unlock(&log->l_icloglock); > > > if (sync) > > > xlog_sync(log, iclog); > > > } > > > return 0; > > > +error: > > > + xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > > + return -EIO; > > > > Hence I suspect that this should be ASSERT(XLOG_FORCED_SHUTDOWN(log)) > > just like is in xfs_log_force_umount() when this pre-existing log > > IO error condition is tripped over... > > > > I think this change is fundamentally correct based on the simple fact > that we only ever set XLOG_STATE_IOERROR in the shutdown path. That > said, the assert would be potentially racy against the shutdown path > without any ordering guarantee that the release path might see the iclog > state update prior to the log state update and lead to a potentially > false negative assert failure. I suspect this is why the shutdown call > might have been made from here in the first place (and partly why I > didn't bother with it in the restored locked state check). > > Given the context of this patch (fixing a regression) and the practical > history of this code (and the annoying process of identifying and > tracking down the source of these kind of shutdown buglets), I'm > inclined to have this patch preserve the historical and relatively > proven behavior as much as possible and consider further cleanups > separately... That sounds reasonable to me (who has at this point missed most of the discussion about this patch). Is there going to be a v3 of this patch? Or has all of the further discussion centered around further cleanups, and this one is good to go? --D > Brian > > > Cheers, > > > > Dave. > > -- > > Dave Chinner > > david@fromorbit.com > > >
On Wed, Feb 19, 2020 at 08:12:32AM -0500, Brian Foster wrote: > On Wed, Feb 19, 2020 at 08:52:43AM +1100, Dave Chinner wrote: > > On Tue, Feb 18, 2020 at 12:54:25PM -0500, Brian Foster wrote: > > > Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with > > > l_icloglock held"), xlog_state_release_iclog() always performed a > > > locked check of the iclog error state before proceeding into the > > > sync state processing code. As of this commit, part of > > > xlog_state_release_iclog() was open-coded into > > > xfs_log_release_iclog() and as a result the locked error state check > > > was lost. > > > > > > The lockless check still exists, but this doesn't account for the > > > possibility of a race with a shutdown being performed by another > > > task causing the iclog state to change while the original task waits > > > on ->l_icloglock. This has reproduced very rarely via generic/475 > > > and manifests as an assert failure in __xlog_state_release_iclog() > > > due to an unexpected iclog state. > > > > > > Restore the locked error state check in xlog_state_release_iclog() > > > to ensure that an iclog state update via shutdown doesn't race with > > > the iclog release state processing code. > > > > > > Fixes: df732b29c807 ("xfs: call xlog_state_release_iclog with l_icloglock held") > > > Reported-by: Zorro Lang <zlang@redhat.com> > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > > --- > > > > > > v2: > > > - Include Fixes tag. > > > - Use common error path to include shutdown call. > > > v1: https://lore.kernel.org/linux-xfs/20200214181528.24046-1-bfoster@redhat.com/ > > > > > > fs/xfs/xfs_log.c | 13 +++++++++---- > > > 1 file changed, 9 insertions(+), 4 deletions(-) > > > > > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c > > > index f6006d94a581..796ff37d5bb5 100644 > > > --- a/fs/xfs/xfs_log.c > > > +++ b/fs/xfs/xfs_log.c > > > @@ -605,18 +605,23 @@ xfs_log_release_iclog( > > > struct xlog *log = mp->m_log; > > > bool sync; > > > > > > - if (iclog->ic_state == XLOG_STATE_IOERROR) { > > > - xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > > - return -EIO; > > > - } > > > > hmmmm. > > > > xfs_force_shutdown() will does nothing if the iclog at the head of > > the log->iclog list is marked with XLOG_STATE_IOERROR before IO is > > submitted. In general, that is the case here as the head iclog is > > what xlog_state_get_iclog_space() returns. > > > > i.e. XLOG_STATE_IOERROR here implies the log has already been shut > > down because the state is IOERROR rather than ACTIVE or WANT_SYNC as > > was set when the iclog was obtained from > > xlog_state_get_iclog_space(). > > > > I'm not sure that assumption is always true. If the iclog is (was) > WANT_SYNC, for example, then it's already been switched out from the > head of that list. Yes, but if the iclog is in XLOG_STATE_IOERROR here, the only way that state change can happen is if a shutdown has been completely processed between getting the iclog from xlog_state_get_iclog_space() and releasing it here. It can't come from an actual IO error, because a iclog being written to can't be under IO. Hence it doesn't matter if it was XLOG_STATE_ACTIVE at the head of the log or XLOG_STATE_WANT_SYNC and ready for IO; that state has been overwritten by the shutdown that has already been completed.... > That might only happen if a checkpoint happens to > align nicely with the end of an iclog, but that's certainly possible. We > also need to consider that ->l_icloglock cycles here and thus somebody > else could switch out the head iclog.. Yup, that can happen, but the only way that we can get to the XLOG_STATE_IOERROR here is for a shutdown to have already processed a shutdown. > > > > + if (iclog->ic_state == XLOG_STATE_IOERROR) > > > + goto error; > > > > > > if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) { > > > + if (iclog->ic_state == XLOG_STATE_IOERROR) { > > > + spin_unlock(&log->l_icloglock); > > > + goto error; > > > + } > > > sync = __xlog_state_release_iclog(log, iclog); > > > spin_unlock(&log->l_icloglock); > > > if (sync) > > > xlog_sync(log, iclog); > > > } > > > return 0; > > > +error: > > > + xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > > + return -EIO; > > > > Hence I suspect that this should be ASSERT(XLOG_FORCED_SHUTDOWN(log)) > > just like is in xfs_log_force_umount() when this pre-existing log > > IO error condition is tripped over... > > > > I think this change is fundamentally correct based on the simple fact > that we only ever set XLOG_STATE_IOERROR in the shutdown path. That > said, the assert would be potentially racy against the shutdown path > without any ordering guarantee that the release path might see the iclog > state update prior to the log state update and lead to a potentially > false negative assert failure. I suspect this is why the shutdown call > might have been made from here in the first place (and partly why I > didn't bother with it in the restored locked state check). *nod* And if we change this to XLOG_FORCED_SHUTDOWN() checks, then the need for an assert check goes away, anyway... > Given the context of this patch (fixing a regression) and the practical > history of this code (and the annoying process of identifying and > tracking down the source of these kind of shutdown buglets), I'm > inclined to have this patch preserve the historical and relatively > proven behavior as much as possible and consider further cleanups > separately... If that's the case, then I don't think the shutdown call should be moved - the second XLOG_STATE_IOERROR occurs under the l_icloglock. Given that XLOG_STATE_IOERROR can only be set while holding the l_icloglock during shutdown processing, it makes no sense to recurse into shutdown here. Cheers, Dave.
On Wed, Feb 19, 2020 at 01:51:41PM -0800, Darrick J. Wong wrote: > On Wed, Feb 19, 2020 at 08:12:32AM -0500, Brian Foster wrote: > > On Wed, Feb 19, 2020 at 08:52:43AM +1100, Dave Chinner wrote: > > > On Tue, Feb 18, 2020 at 12:54:25PM -0500, Brian Foster wrote: > > > > Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with > > > > l_icloglock held"), xlog_state_release_iclog() always performed a > > > > locked check of the iclog error state before proceeding into the > > > > sync state processing code. As of this commit, part of > > > > xlog_state_release_iclog() was open-coded into > > > > xfs_log_release_iclog() and as a result the locked error state check > > > > was lost. > > > > > > > > The lockless check still exists, but this doesn't account for the > > > > possibility of a race with a shutdown being performed by another > > > > task causing the iclog state to change while the original task waits > > > > on ->l_icloglock. This has reproduced very rarely via generic/475 > > > > and manifests as an assert failure in __xlog_state_release_iclog() > > > > due to an unexpected iclog state. > > > > > > > > Restore the locked error state check in xlog_state_release_iclog() > > > > to ensure that an iclog state update via shutdown doesn't race with > > > > the iclog release state processing code. > > > > > > > > Fixes: df732b29c807 ("xfs: call xlog_state_release_iclog with l_icloglock held") > > > > Reported-by: Zorro Lang <zlang@redhat.com> > > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > > > --- > > > > > > > > v2: > > > > - Include Fixes tag. > > > > - Use common error path to include shutdown call. > > > > v1: https://lore.kernel.org/linux-xfs/20200214181528.24046-1-bfoster@redhat.com/ > > > > > > > > fs/xfs/xfs_log.c | 13 +++++++++---- > > > > 1 file changed, 9 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c > > > > index f6006d94a581..796ff37d5bb5 100644 > > > > --- a/fs/xfs/xfs_log.c > > > > +++ b/fs/xfs/xfs_log.c > > > > @@ -605,18 +605,23 @@ xfs_log_release_iclog( > > > > struct xlog *log = mp->m_log; > > > > bool sync; > > > > > > > > - if (iclog->ic_state == XLOG_STATE_IOERROR) { > > > > - xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > > > - return -EIO; > > > > - } > > > > > > hmmmm. > > > > > > xfs_force_shutdown() will does nothing if the iclog at the head of > > > the log->iclog list is marked with XLOG_STATE_IOERROR before IO is > > > submitted. In general, that is the case here as the head iclog is > > > what xlog_state_get_iclog_space() returns. > > > > > > i.e. XLOG_STATE_IOERROR here implies the log has already been shut > > > down because the state is IOERROR rather than ACTIVE or WANT_SYNC as > > > was set when the iclog was obtained from > > > xlog_state_get_iclog_space(). > > > > > > > I'm not sure that assumption is always true. If the iclog is (was) > > WANT_SYNC, for example, then it's already been switched out from the > > head of that list. That might only happen if a checkpoint happens to > > align nicely with the end of an iclog, but that's certainly possible. We > > also need to consider that ->l_icloglock cycles here and thus somebody > > else could switch out the head iclog.. > > > > > > + if (iclog->ic_state == XLOG_STATE_IOERROR) > > > > + goto error; > > > > > > > > if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) { > > > > + if (iclog->ic_state == XLOG_STATE_IOERROR) { > > > > + spin_unlock(&log->l_icloglock); > > > > + goto error; > > > > + } > > > > sync = __xlog_state_release_iclog(log, iclog); > > > > spin_unlock(&log->l_icloglock); > > > > if (sync) > > > > xlog_sync(log, iclog); > > > > } > > > > return 0; > > > > +error: > > > > + xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); > > > > + return -EIO; > > > > > > Hence I suspect that this should be ASSERT(XLOG_FORCED_SHUTDOWN(log)) > > > just like is in xfs_log_force_umount() when this pre-existing log > > > IO error condition is tripped over... > > > > > > > I think this change is fundamentally correct based on the simple fact > > that we only ever set XLOG_STATE_IOERROR in the shutdown path. That > > said, the assert would be potentially racy against the shutdown path > > without any ordering guarantee that the release path might see the iclog > > state update prior to the log state update and lead to a potentially > > false negative assert failure. I suspect this is why the shutdown call > > might have been made from here in the first place (and partly why I > > didn't bother with it in the restored locked state check). > > > > Given the context of this patch (fixing a regression) and the practical > > history of this code (and the annoying process of identifying and > > tracking down the source of these kind of shutdown buglets), I'm > > inclined to have this patch preserve the historical and relatively > > proven behavior as much as possible and consider further cleanups > > separately... > > That sounds reasonable to me (who has at this point missed most of the > discussion about this patch). Is there going to be a v3 of this patch? > Or has all of the further discussion centered around further cleanups, > and this one is good to go? > I wasn't planning on a v3. The discussion to this point has been centered around the xfs_force_shutdown() call in the associated function (which is orthogonal to the bug). v1 is technically correct, but Christoph suggested to restore historical behavior wrt to the shutdown call. v2 does that, but is a bit superfluous in that the iclog error state with the lock held implies shutdown has already occurred. This is harmless (unless we're worried about shutdown performance or something..), but I think Dave indicated he preferred v1 based on that reasoning. Functionally I don't think it matters either way and at this point I have no preference between v1 or v2. They fix the same problem. Do note that v2 does have the Fixed: tag I missed with v1 (as well as a R-b)... Brian > --D > > > Brian > > > > > Cheers, > > > > > > Dave. > > > -- > > > Dave Chinner > > > david@fromorbit.com > > > > > >
On Thu, Feb 20, 2020 at 07:41:44AM -0500, Brian Foster wrote: > I wasn't planning on a v3. The discussion to this point has been > centered around the xfs_force_shutdown() call in the associated function > (which is orthogonal to the bug). v1 is technically correct, but > Christoph suggested to restore historical behavior wrt to the shutdown > call. v2 does that, but is a bit superfluous in that the iclog error > state with the lock held implies shutdown has already occurred. This is > harmless (unless we're worried about shutdown performance or > something..), but I think Dave indicated he preferred v1 based on that > reasoning. > > Functionally I don't think it matters either way and at this point I > have no preference between v1 or v2. They fix the same problem. Do note > that v2 does have the Fixed: tag I missed with v1 (as well as a R-b)... I'm fine with v1 after all this discussion, and volunteer to clean up all the ioerr handling for the log code after this fix goes in. That being said as noted in one of my replies I think we also need to add the same check in the other caller of __xlog_state_release_iclog.
On Thu, Feb 20, 2020 at 07:43:17AM -0800, Christoph Hellwig wrote: > On Thu, Feb 20, 2020 at 07:41:44AM -0500, Brian Foster wrote: > > I wasn't planning on a v3. The discussion to this point has been > > centered around the xfs_force_shutdown() call in the associated function > > (which is orthogonal to the bug). v1 is technically correct, but > > Christoph suggested to restore historical behavior wrt to the shutdown > > call. v2 does that, but is a bit superfluous in that the iclog error > > state with the lock held implies shutdown has already occurred. This is > > harmless (unless we're worried about shutdown performance or > > something..), but I think Dave indicated he preferred v1 based on that > > reasoning. > > > > Functionally I don't think it matters either way and at this point I > > have no preference between v1 or v2. They fix the same problem. Do note > > that v2 does have the Fixed: tag I missed with v1 (as well as a R-b)... > > I'm fine with v1 after all this discussion, and volunteer to clean up > all the ioerr handling for the log code after this fix goes in. > Ok. > That being said as noted in one of my replies I think we also need to > add the same check in the other caller of __xlog_state_release_iclog. > That seems reasonable as a cleanup, but I'm not sure how critical it is otherwise. We've already handled the iclog at that point, so we're basically just changing the function to return an error code regardless of the fact that we ran xlog_sync() (which doesn't care about iclog state until we attempt to write, where it checks state again before bio submission) and that most callers have their own IOERROR state checks up the chain anyways because there is no other indication of I/O submission failure.. Brian
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c index f6006d94a581..796ff37d5bb5 100644 --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -605,18 +605,23 @@ xfs_log_release_iclog( struct xlog *log = mp->m_log; bool sync; - if (iclog->ic_state == XLOG_STATE_IOERROR) { - xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); - return -EIO; - } + if (iclog->ic_state == XLOG_STATE_IOERROR) + goto error; if (atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) { + if (iclog->ic_state == XLOG_STATE_IOERROR) { + spin_unlock(&log->l_icloglock); + goto error; + } sync = __xlog_state_release_iclog(log, iclog); spin_unlock(&log->l_icloglock); if (sync) xlog_sync(log, iclog); } return 0; +error: + xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR); + return -EIO; } /*
Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with l_icloglock held"), xlog_state_release_iclog() always performed a locked check of the iclog error state before proceeding into the sync state processing code. As of this commit, part of xlog_state_release_iclog() was open-coded into xfs_log_release_iclog() and as a result the locked error state check was lost. The lockless check still exists, but this doesn't account for the possibility of a race with a shutdown being performed by another task causing the iclog state to change while the original task waits on ->l_icloglock. This has reproduced very rarely via generic/475 and manifests as an assert failure in __xlog_state_release_iclog() due to an unexpected iclog state. Restore the locked error state check in xlog_state_release_iclog() to ensure that an iclog state update via shutdown doesn't race with the iclog release state processing code. Fixes: df732b29c807 ("xfs: call xlog_state_release_iclog with l_icloglock held") Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> --- v2: - Include Fixes tag. - Use common error path to include shutdown call. v1: https://lore.kernel.org/linux-xfs/20200214181528.24046-1-bfoster@redhat.com/ fs/xfs/xfs_log.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)