Btrfs: do not abort transaction when failing to insert hole extent
diff mbox

Message ID 1522447918-19147-1-git-send-email-bo.liu@linux.alibaba.com
State New
Headers show

Commit Message

Liu Bo March 30, 2018, 10:11 p.m. UTC
This is running in a typical write path, not inside a critical path
where we have to abort the running transaction, so it's OK to return
errors to callers and eventually to userspace.

Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
---
 fs/btrfs/inode.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

Comments

David Sterba April 5, 2018, 4:48 p.m. UTC | #1
On Sat, Mar 31, 2018 at 06:11:55AM +0800, Liu Bo wrote:
> This is running in a typical write path, not inside a critical path
> where we have to abort the running transaction, so it's OK to return
> errors to callers and eventually to userspace.

I'm not sure this is entierly correct, several other places do not abort
after btrfs_drop_extents as there's nothing that would leave the
structres in some half-state.

> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
> ---
>  fs/btrfs/inode.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index c7b75dd..b9310f8 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -4939,16 +4939,13 @@ static int maybe_insert_hole(struct btrfs_root *root, struct inode *inode,
>  
>  	ret = btrfs_drop_extents(trans, root, inode, offset, offset + len, 1);
>  	if (ret) {
> -		btrfs_abort_transaction(trans, ret);
>  		btrfs_end_transaction(trans);
>  		return ret;
>  	}
>  
>  	ret = btrfs_insert_file_extent(trans, root, btrfs_ino(BTRFS_I(inode)),
>  			offset, 0, 0, len, 0, len, 0, 0, 0);

But here the extents have been already dropped and missing to insert the
items does not seem to lead to a consistent state.

It's possible that I'm missing something. In a call path that can be
safely rolled back even with a started transaction, we don't need to
abort in all cases. But if the rollback requires some non-trivial
modifications, I don't see options how to avoid the abort.

__btrfs_drop_extents does a lot of state changes and can itself fail
in the middle of dropping the range, aborting looks like the safest
option.

> -	if (ret)
> -		btrfs_abort_transaction(trans, ret);
> -	else
> +	if (!ret)
>  		btrfs_update_inode(trans, root, inode);
>  	btrfs_end_transaction(trans);
>  	return ret;
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo April 5, 2018, 6:58 p.m. UTC | #2
On Thu, Apr 5, 2018 at 9:48 AM, David Sterba <dsterba@suse.cz> wrote:
> On Sat, Mar 31, 2018 at 06:11:55AM +0800, Liu Bo wrote:
>> This is running in a typical write path, not inside a critical path
>> where we have to abort the running transaction, so it's OK to return
>> errors to callers and eventually to userspace.
>
> I'm not sure this is entierly correct, several other places do not abort
> after btrfs_drop_extents as there's nothing that would leave the
> structres in some half-state.
>
>> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
>> ---
>>  fs/btrfs/inode.c | 5 +----
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index c7b75dd..b9310f8 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -4939,16 +4939,13 @@ static int maybe_insert_hole(struct btrfs_root *root, struct inode *inode,
>>
>>       ret = btrfs_drop_extents(trans, root, inode, offset, offset + len, 1);
>>       if (ret) {
>> -             btrfs_abort_transaction(trans, ret);
>>               btrfs_end_transaction(trans);
>>               return ret;
>>       }
>>
>>       ret = btrfs_insert_file_extent(trans, root, btrfs_ino(BTRFS_I(inode)),
>>                       offset, 0, 0, len, 0, len, 0, 0, 0);
>
> But here the extents have been already dropped and missing to insert the
> items does not seem to lead to a consistent state.
>
> It's possible that I'm missing something. In a call path that can be
> safely rolled back even with a started transaction, we don't need to
> abort in all cases. But if the rollback requires some non-trivial
> modifications, I don't see options how to avoid the abort.
>
> __btrfs_drop_extents does a lot of state changes and can itself fail
> in the middle of dropping the range, aborting looks like the safest
> option.
>

As maybe_insert_hole is only called by btrfs_cont_expand here, which
means it's a really hole, I don't expect drop_extents would drop
anything, we can remove this drop_extents and put an assert after
btrfs_insert_file_extent for checking EEXIST.

It's different from punch hole where we need to explicitly drop an
actual extent and replace it with a hole range.

thanks,
liubo

>> -     if (ret)
>> -             btrfs_abort_transaction(trans, ret);
>> -     else
>> +     if (!ret)
>>               btrfs_update_inode(trans, root, inode);
>>       btrfs_end_transaction(trans);
>>       return ret;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba April 6, 2018, 1:21 p.m. UTC | #3
On Thu, Apr 05, 2018 at 11:58:16AM -0700, Liu Bo wrote:
> On Thu, Apr 5, 2018 at 9:48 AM, David Sterba <dsterba@suse.cz> wrote:
> > On Sat, Mar 31, 2018 at 06:11:55AM +0800, Liu Bo wrote:
> >> This is running in a typical write path, not inside a critical path
> >> where we have to abort the running transaction, so it's OK to return
> >> errors to callers and eventually to userspace.
> >
> > I'm not sure this is entierly correct, several other places do not abort
> > after btrfs_drop_extents as there's nothing that would leave the
> > structres in some half-state.
> >
> >> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
> >> ---
> >>  fs/btrfs/inode.c | 5 +----
> >>  1 file changed, 1 insertion(+), 4 deletions(-)
> >>
> >> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> >> index c7b75dd..b9310f8 100644
> >> --- a/fs/btrfs/inode.c
> >> +++ b/fs/btrfs/inode.c
> >> @@ -4939,16 +4939,13 @@ static int maybe_insert_hole(struct btrfs_root *root, struct inode *inode,
> >>
> >>       ret = btrfs_drop_extents(trans, root, inode, offset, offset + len, 1);
> >>       if (ret) {
> >> -             btrfs_abort_transaction(trans, ret);
> >>               btrfs_end_transaction(trans);
> >>               return ret;
> >>       }
> >>
> >>       ret = btrfs_insert_file_extent(trans, root, btrfs_ino(BTRFS_I(inode)),
> >>                       offset, 0, 0, len, 0, len, 0, 0, 0);
> >
> > But here the extents have been already dropped and missing to insert the
> > items does not seem to lead to a consistent state.
> >
> > It's possible that I'm missing something. In a call path that can be
> > safely rolled back even with a started transaction, we don't need to
> > abort in all cases. But if the rollback requires some non-trivial
> > modifications, I don't see options how to avoid the abort.
> >
> > __btrfs_drop_extents does a lot of state changes and can itself fail
> > in the middle of dropping the range, aborting looks like the safest
> > option.
> >
> 
> As maybe_insert_hole is only called by btrfs_cont_expand here, which
> means it's a really hole, I don't expect drop_extents would drop
> anything, we can remove this drop_extents and put an assert after
> btrfs_insert_file_extent for checking EEXIST.

Sounds good.

> It's different from punch hole where we need to explicitly drop an
> actual extent and replace it with a hole range.

Right, that's what I didn't see at first.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo April 6, 2018, 5:43 p.m. UTC | #4
On Fri, Apr 6, 2018 at 6:21 AM, David Sterba <dsterba@suse.cz> wrote:
> On Thu, Apr 05, 2018 at 11:58:16AM -0700, Liu Bo wrote:
>> On Thu, Apr 5, 2018 at 9:48 AM, David Sterba <dsterba@suse.cz> wrote:
>> > On Sat, Mar 31, 2018 at 06:11:55AM +0800, Liu Bo wrote:
>> >> This is running in a typical write path, not inside a critical path
>> >> where we have to abort the running transaction, so it's OK to return
>> >> errors to callers and eventually to userspace.
>> >
>> > I'm not sure this is entierly correct, several other places do not abort
>> > after btrfs_drop_extents as there's nothing that would leave the
>> > structres in some half-state.
>> >
>> >> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
>> >> ---
>> >>  fs/btrfs/inode.c | 5 +----
>> >>  1 file changed, 1 insertion(+), 4 deletions(-)
>> >>
>> >> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> >> index c7b75dd..b9310f8 100644
>> >> --- a/fs/btrfs/inode.c
>> >> +++ b/fs/btrfs/inode.c
>> >> @@ -4939,16 +4939,13 @@ static int maybe_insert_hole(struct btrfs_root *root, struct inode *inode,
>> >>
>> >>       ret = btrfs_drop_extents(trans, root, inode, offset, offset + len, 1);
>> >>       if (ret) {
>> >> -             btrfs_abort_transaction(trans, ret);
>> >>               btrfs_end_transaction(trans);
>> >>               return ret;
>> >>       }
>> >>
>> >>       ret = btrfs_insert_file_extent(trans, root, btrfs_ino(BTRFS_I(inode)),
>> >>                       offset, 0, 0, len, 0, len, 0, 0, 0);
>> >
>> > But here the extents have been already dropped and missing to insert the
>> > items does not seem to lead to a consistent state.
>> >
>> > It's possible that I'm missing something. In a call path that can be
>> > safely rolled back even with a started transaction, we don't need to
>> > abort in all cases. But if the rollback requires some non-trivial
>> > modifications, I don't see options how to avoid the abort.
>> >
>> > __btrfs_drop_extents does a lot of state changes and can itself fail
>> > in the middle of dropping the range, aborting looks like the safest
>> > option.
>> >
>>
>> As maybe_insert_hole is only called by btrfs_cont_expand here, which
>> means it's a really hole, I don't expect drop_extents would drop
>> anything, we can remove this drop_extents and put an assert after
>> btrfs_insert_file_extent for checking EEXIST.
>
> Sounds good.
>

Let me make a v2 and have a fstests run.

thanks,
liubo

>> It's different from punch hole where we need to explicitly drop an
>> actual extent and replace it with a hole range.
>
> Right, that's what I didn't see at first.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo April 10, 2018, 1:23 a.m. UTC | #5
On Fri, Apr 6, 2018 at 10:43 AM, Liu Bo <obuil.liubo@gmail.com> wrote:
> On Fri, Apr 6, 2018 at 6:21 AM, David Sterba <dsterba@suse.cz> wrote:
>> On Thu, Apr 05, 2018 at 11:58:16AM -0700, Liu Bo wrote:
>>> On Thu, Apr 5, 2018 at 9:48 AM, David Sterba <dsterba@suse.cz> wrote:
>>> > On Sat, Mar 31, 2018 at 06:11:55AM +0800, Liu Bo wrote:
>>> >> This is running in a typical write path, not inside a critical path
>>> >> where we have to abort the running transaction, so it's OK to return
>>> >> errors to callers and eventually to userspace.
>>> >
>>> > I'm not sure this is entierly correct, several other places do not abort
>>> > after btrfs_drop_extents as there's nothing that would leave the
>>> > structres in some half-state.
>>> >
>>> >> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
>>> >> ---
>>> >>  fs/btrfs/inode.c | 5 +----
>>> >>  1 file changed, 1 insertion(+), 4 deletions(-)
>>> >>
>>> >> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>>> >> index c7b75dd..b9310f8 100644
>>> >> --- a/fs/btrfs/inode.c
>>> >> +++ b/fs/btrfs/inode.c
>>> >> @@ -4939,16 +4939,13 @@ static int maybe_insert_hole(struct btrfs_root *root, struct inode *inode,
>>> >>
>>> >>       ret = btrfs_drop_extents(trans, root, inode, offset, offset + len, 1);
>>> >>       if (ret) {
>>> >> -             btrfs_abort_transaction(trans, ret);
>>> >>               btrfs_end_transaction(trans);
>>> >>               return ret;
>>> >>       }
>>> >>
>>> >>       ret = btrfs_insert_file_extent(trans, root, btrfs_ino(BTRFS_I(inode)),
>>> >>                       offset, 0, 0, len, 0, len, 0, 0, 0);
>>> >
>>> > But here the extents have been already dropped and missing to insert the
>>> > items does not seem to lead to a consistent state.
>>> >
>>> > It's possible that I'm missing something. In a call path that can be
>>> > safely rolled back even with a started transaction, we don't need to
>>> > abort in all cases. But if the rollback requires some non-trivial
>>> > modifications, I don't see options how to avoid the abort.
>>> >
>>> > __btrfs_drop_extents does a lot of state changes and can itself fail
>>> > in the middle of dropping the range, aborting looks like the safest
>>> > option.
>>> >
>>>
>>> As maybe_insert_hole is only called by btrfs_cont_expand here, which
>>> means it's a really hole, I don't expect drop_extents would drop
>>> anything, we can remove this drop_extents and put an assert after
>>> btrfs_insert_file_extent for checking EEXIST.
>>
>> Sounds good.
>>
>
> Let me make a v2 and have a fstests run.
>

It turns out that the btrfs_drop_extents() here is quite necessary
since fallocate(2) has a FALLOC_FL_KEEP_SIZE option, and when that
happens, a hole extent would be appended between the EOF and fallocate
range's start, then a later truncate up would have to drop these hole
extents in order to expand with a new hole...

As I don't see a way to gracefully solve this except keeping
drop_extents(), lets drop this patch instead.

thanks,
liubo

> thanks,
> liubo
>
>>> It's different from punch hole where we need to explicitly drop an
>>> actual extent and replace it with a hole range.
>>
>> Right, that's what I didn't see at first.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba April 10, 2018, 12:12 p.m. UTC | #6
On Mon, Apr 09, 2018 at 06:23:14PM -0700, Liu Bo wrote:
> >>> As maybe_insert_hole is only called by btrfs_cont_expand here, which
> >>> means it's a really hole, I don't expect drop_extents would drop
> >>> anything, we can remove this drop_extents and put an assert after
> >>> btrfs_insert_file_extent for checking EEXIST.
> >> Sounds good.
> > Let me make a v2 and have a fstests run.
> It turns out that the btrfs_drop_extents() here is quite necessary
> since fallocate(2) has a FALLOC_FL_KEEP_SIZE option, and when that
> happens, a hole extent would be appended between the EOF and fallocate
> range's start, then a later truncate up would have to drop these hole
> extents in order to expand with a new hole...

Would it make sense to split the cases where FALLOC_FL_KEEP_SIZE is and
is not used? Either passing an argument or 2 functions where one could
avoid the drop. I've looked at the code only briefly, so this may be a
nonsense in the end.

> As I don't see a way to gracefully solve this except keeping
> drop_extents(), lets drop this patch instead.

Understood.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo April 12, 2018, 1:39 a.m. UTC | #7
On Tue, Apr 10, 2018 at 5:12 AM, David Sterba <dsterba@suse.cz> wrote:
> On Mon, Apr 09, 2018 at 06:23:14PM -0700, Liu Bo wrote:
>> >>> As maybe_insert_hole is only called by btrfs_cont_expand here, which
>> >>> means it's a really hole, I don't expect drop_extents would drop
>> >>> anything, we can remove this drop_extents and put an assert after
>> >>> btrfs_insert_file_extent for checking EEXIST.
>> >> Sounds good.
>> > Let me make a v2 and have a fstests run.
>> It turns out that the btrfs_drop_extents() here is quite necessary
>> since fallocate(2) has a FALLOC_FL_KEEP_SIZE option, and when that
>> happens, a hole extent would be appended between the EOF and fallocate
>> range's start, then a later truncate up would have to drop these hole
>> extents in order to expand with a new hole...
>
> Would it make sense to split the cases where FALLOC_FL_KEEP_SIZE is and
> is not used? Either passing an argument or 2 functions where one could
> avoid the drop. I've looked at the code only briefly, so this may be a
> nonsense in the end.
>

I'm afraid that It won't work as there is no way I could think of to
know whether an inode has been fallocate with KEEP_SIZE, IOW, seems
that we have to do a search to check if there are extra extents beyond
EOF.

thanks,
liubo

>> As I don't see a way to gracefully solve this except keeping
>> drop_extents(), lets drop this patch instead.
>
> Understood.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c7b75dd..b9310f8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4939,16 +4939,13 @@  static int maybe_insert_hole(struct btrfs_root *root, struct inode *inode,
 
 	ret = btrfs_drop_extents(trans, root, inode, offset, offset + len, 1);
 	if (ret) {
-		btrfs_abort_transaction(trans, ret);
 		btrfs_end_transaction(trans);
 		return ret;
 	}
 
 	ret = btrfs_insert_file_extent(trans, root, btrfs_ino(BTRFS_I(inode)),
 			offset, 0, 0, len, 0, len, 0, 0, 0);
-	if (ret)
-		btrfs_abort_transaction(trans, ret);
-	else
+	if (!ret)
 		btrfs_update_inode(trans, root, inode);
 	btrfs_end_transaction(trans);
 	return ret;