Message ID | a678551d318d5dd9835ad9800dfb41c787654dd1.1695731841.git.fdmanana@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: some fixes and cleanups around btrfs_cow_block() | expand |
On Tue, Sep 26, 2023 at 01:45:13PM +0100, fdmanana@kernel.org wrote: > From: Filipe Manana <fdmanana@suse.com> > > At btrfs_cow_block() we have these checks to verify we are not using a > stale transaction (a past transaction with an unblocked state or higher), > and the only thing we do is to trigger a WARN with a message and a stack > trace. This however is a critical problem, highly unexpected and if it > happens it's most likely due to a bug, so we should error out and turn the > fs into error state so that such issue is much more easily noticed if it's > triggered. > > The problem is critical because using such stale transaction will lead to > not persisting the extent buffer used for the COW operation, as allocating > a tree block adds the range of the respective extent buffer to the > ->dirty_pages iotree of the transaction, and a stale transaction, in the > unlocked state or higher, will not flush dirty extent buffers anymore, > therefore resulting in not persisting the tree block and resource leaks > (not cleaning the dirty_pages iotree for example). > > So do the following changes: > > 1) Return -EUCLEAN if we find a stale transaction; > > 2) Turn the fs into error state, with error -EUCLEAN, so that no > transaction can be committed, and generate a stack trace; > > 3) Combine both conditions into a single if statement, as both are related > and have the same error message; > > 4) Mark the check as unlikely, since this is not expected to ever happen. > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > --- > fs/btrfs/ctree.c | 24 ++++++++++++++++-------- > 1 file changed, 16 insertions(+), 8 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index 56d2360e597c..dff2e07ba437 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -686,14 +686,22 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, > btrfs_err(fs_info, > "COW'ing blocks on a fs root that's being dropped"); > > - if (trans->transaction != fs_info->running_transaction) > - WARN(1, KERN_CRIT "trans %llu running %llu\n", > - trans->transid, > - fs_info->running_transaction->transid); > - > - if (trans->transid != fs_info->generation) > - WARN(1, KERN_CRIT "trans %llu running %llu\n", > - trans->transid, fs_info->generation); > + /* > + * COWing must happen through a running transaction, which always > + * matches the current fs generation (it's a transaction with a state > + * less than TRANS_STATE_UNBLOCKED). If it doesn't, then turn the fs > + * into error state to prevent the commit of any transaction. > + */ > + if (unlikely(trans->transaction != fs_info->running_transaction || > + trans->transid != fs_info->generation)) { > + btrfs_handle_fs_error(fs_info, -EUCLEAN, Can this be a transaction abort? The helper btrfs_handle_fs_error() is from times before we had the abort mechanism and should not be used in new code when the abort can be done. There are cases where transaction is not available (like superblock commit), but these are exceptions. > +"unexpected transaction when attempting to COW block %llu on root %llu, transaction %llu running transaction %llu fs generation %llu", > + buf->start, btrfs_root_id(root), > + trans->transid, > + fs_info->running_transaction->transid, > + fs_info->generation); > + return -EUCLEAN; > + } > > if (!should_cow_block(trans, root, buf)) { > *cow_ret = buf; > -- > 2.40.1
On Tue, Sep 26, 2023 at 6:38 PM David Sterba <dsterba@suse.cz> wrote: > > On Tue, Sep 26, 2023 at 01:45:13PM +0100, fdmanana@kernel.org wrote: > > From: Filipe Manana <fdmanana@suse.com> > > > > At btrfs_cow_block() we have these checks to verify we are not using a > > stale transaction (a past transaction with an unblocked state or higher), > > and the only thing we do is to trigger a WARN with a message and a stack > > trace. This however is a critical problem, highly unexpected and if it > > happens it's most likely due to a bug, so we should error out and turn the > > fs into error state so that such issue is much more easily noticed if it's > > triggered. > > > > The problem is critical because using such stale transaction will lead to > > not persisting the extent buffer used for the COW operation, as allocating > > a tree block adds the range of the respective extent buffer to the > > ->dirty_pages iotree of the transaction, and a stale transaction, in the > > unlocked state or higher, will not flush dirty extent buffers anymore, > > therefore resulting in not persisting the tree block and resource leaks > > (not cleaning the dirty_pages iotree for example). > > > > So do the following changes: > > > > 1) Return -EUCLEAN if we find a stale transaction; > > > > 2) Turn the fs into error state, with error -EUCLEAN, so that no > > transaction can be committed, and generate a stack trace; > > > > 3) Combine both conditions into a single if statement, as both are related > > and have the same error message; > > > > 4) Mark the check as unlikely, since this is not expected to ever happen. > > > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > > --- > > fs/btrfs/ctree.c | 24 ++++++++++++++++-------- > > 1 file changed, 16 insertions(+), 8 deletions(-) > > > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > > index 56d2360e597c..dff2e07ba437 100644 > > --- a/fs/btrfs/ctree.c > > +++ b/fs/btrfs/ctree.c > > @@ -686,14 +686,22 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, > > btrfs_err(fs_info, > > "COW'ing blocks on a fs root that's being dropped"); > > > > - if (trans->transaction != fs_info->running_transaction) > > - WARN(1, KERN_CRIT "trans %llu running %llu\n", > > - trans->transid, > > - fs_info->running_transaction->transid); > > - > > - if (trans->transid != fs_info->generation) > > - WARN(1, KERN_CRIT "trans %llu running %llu\n", > > - trans->transid, fs_info->generation); > > + /* > > + * COWing must happen through a running transaction, which always > > + * matches the current fs generation (it's a transaction with a state > > + * less than TRANS_STATE_UNBLOCKED). If it doesn't, then turn the fs > > + * into error state to prevent the commit of any transaction. > > + */ > > + if (unlikely(trans->transaction != fs_info->running_transaction || > > + trans->transid != fs_info->generation)) { > > + btrfs_handle_fs_error(fs_info, -EUCLEAN, > > Can this be a transaction abort? The helper btrfs_handle_fs_error() is > from times before we had the abort mechanism and should not be used in > new code when the abort can be done. There are cases where transaction > is not available (like superblock commit), but these are exceptions. The handle we have here is for a stale transaction - not the one currently running (if there's any). That's why the btrfs_handle_fs_error() call instead. > > > +"unexpected transaction when attempting to COW block %llu on root %llu, transaction %llu running transaction %llu fs generation %llu", > > + buf->start, btrfs_root_id(root), > > + trans->transid, > > + fs_info->running_transaction->transid, > > + fs_info->generation); > > + return -EUCLEAN; > > + } > > > > if (!should_cow_block(trans, root, buf)) { > > *cow_ret = buf; > > -- > > 2.40.1
On Tue, Sep 26, 2023 at 6:57 PM Filipe Manana <fdmanana@kernel.org> wrote: > > On Tue, Sep 26, 2023 at 6:38 PM David Sterba <dsterba@suse.cz> wrote: > > > > On Tue, Sep 26, 2023 at 01:45:13PM +0100, fdmanana@kernel.org wrote: > > > From: Filipe Manana <fdmanana@suse.com> > > > > > > At btrfs_cow_block() we have these checks to verify we are not using a > > > stale transaction (a past transaction with an unblocked state or higher), > > > and the only thing we do is to trigger a WARN with a message and a stack > > > trace. This however is a critical problem, highly unexpected and if it > > > happens it's most likely due to a bug, so we should error out and turn the > > > fs into error state so that such issue is much more easily noticed if it's > > > triggered. > > > > > > The problem is critical because using such stale transaction will lead to > > > not persisting the extent buffer used for the COW operation, as allocating > > > a tree block adds the range of the respective extent buffer to the > > > ->dirty_pages iotree of the transaction, and a stale transaction, in the > > > unlocked state or higher, will not flush dirty extent buffers anymore, > > > therefore resulting in not persisting the tree block and resource leaks > > > (not cleaning the dirty_pages iotree for example). > > > > > > So do the following changes: > > > > > > 1) Return -EUCLEAN if we find a stale transaction; > > > > > > 2) Turn the fs into error state, with error -EUCLEAN, so that no > > > transaction can be committed, and generate a stack trace; > > > > > > 3) Combine both conditions into a single if statement, as both are related > > > and have the same error message; > > > > > > 4) Mark the check as unlikely, since this is not expected to ever happen. > > > > > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > > > --- > > > fs/btrfs/ctree.c | 24 ++++++++++++++++-------- > > > 1 file changed, 16 insertions(+), 8 deletions(-) > > > > > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > > > index 56d2360e597c..dff2e07ba437 100644 > > > --- a/fs/btrfs/ctree.c > > > +++ b/fs/btrfs/ctree.c > > > @@ -686,14 +686,22 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, > > > btrfs_err(fs_info, > > > "COW'ing blocks on a fs root that's being dropped"); > > > > > > - if (trans->transaction != fs_info->running_transaction) > > > - WARN(1, KERN_CRIT "trans %llu running %llu\n", > > > - trans->transid, > > > - fs_info->running_transaction->transid); > > > - > > > - if (trans->transid != fs_info->generation) > > > - WARN(1, KERN_CRIT "trans %llu running %llu\n", > > > - trans->transid, fs_info->generation); > > > + /* > > > + * COWing must happen through a running transaction, which always > > > + * matches the current fs generation (it's a transaction with a state > > > + * less than TRANS_STATE_UNBLOCKED). If it doesn't, then turn the fs > > > + * into error state to prevent the commit of any transaction. > > > + */ > > > + if (unlikely(trans->transaction != fs_info->running_transaction || > > > + trans->transid != fs_info->generation)) { > > > + btrfs_handle_fs_error(fs_info, -EUCLEAN, > > > > Can this be a transaction abort? The helper btrfs_handle_fs_error() is > > from times before we had the abort mechanism and should not be used in > > new code when the abort can be done. There are cases where transaction > > is not available (like superblock commit), but these are exceptions. > > The handle we have here is for a stale transaction - not the one > currently running (if there's any). > That's why the btrfs_handle_fs_error() call instead. We can actually btrfs_abort_transaction() even on a stale one. It still sets the error at fs_info->fs_error, which is what matters here. Do you want me to replace it and send a v2, or do you prefer to fixup it up yourself? Thanks > > > > > > +"unexpected transaction when attempting to COW block %llu on root %llu, transaction %llu running transaction %llu fs generation %llu", > > > + buf->start, btrfs_root_id(root), > > > + trans->transid, > > > + fs_info->running_transaction->transid, > > > + fs_info->generation); > > > + return -EUCLEAN; > > > + } > > > > > > if (!should_cow_block(trans, root, buf)) { > > > *cow_ret = buf; > > > -- > > > 2.40.1
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 56d2360e597c..dff2e07ba437 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -686,14 +686,22 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, btrfs_err(fs_info, "COW'ing blocks on a fs root that's being dropped"); - if (trans->transaction != fs_info->running_transaction) - WARN(1, KERN_CRIT "trans %llu running %llu\n", - trans->transid, - fs_info->running_transaction->transid); - - if (trans->transid != fs_info->generation) - WARN(1, KERN_CRIT "trans %llu running %llu\n", - trans->transid, fs_info->generation); + /* + * COWing must happen through a running transaction, which always + * matches the current fs generation (it's a transaction with a state + * less than TRANS_STATE_UNBLOCKED). If it doesn't, then turn the fs + * into error state to prevent the commit of any transaction. + */ + if (unlikely(trans->transaction != fs_info->running_transaction || + trans->transid != fs_info->generation)) { + btrfs_handle_fs_error(fs_info, -EUCLEAN, +"unexpected transaction when attempting to COW block %llu on root %llu, transaction %llu running transaction %llu fs generation %llu", + buf->start, btrfs_root_id(root), + trans->transid, + fs_info->running_transaction->transid, + fs_info->generation); + return -EUCLEAN; + } if (!should_cow_block(trans, root, buf)) { *cow_ret = buf;