diff mbox series

xfs: fix extent busy updating

Message ID 20230103193217.4941-1-wen.gang.wang@oracle.com (mailing list archive)
State Accepted
Headers show
Series xfs: fix extent busy updating | expand

Commit Message

Wengang Wang Jan. 3, 2023, 7:32 p.m. UTC
In xfs_extent_busy_update_extent() case 6 and 7, whenever bno is modified on
extent busy, the relavent length has to be modified accordingly.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
---
 fs/xfs/xfs_extent_busy.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Darrick J. Wong Jan. 4, 2023, 4:24 p.m. UTC | #1
On Tue, Jan 03, 2023 at 11:32:17AM -0800, Wengang Wang wrote:
> In xfs_extent_busy_update_extent() case 6 and 7, whenever bno is modified on
> extent busy, the relavent length has to be modified accordingly.
> 
> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
> ---
>  fs/xfs/xfs_extent_busy.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
> index ad22a003f959..f3d328e4a440 100644
> --- a/fs/xfs/xfs_extent_busy.c
> +++ b/fs/xfs/xfs_extent_busy.c
> @@ -236,6 +236,7 @@ xfs_extent_busy_update_extent(
>  		 *
>  		 */
>  		busyp->bno = fend;
> +		busyp->length = bend - fend;

Looks correct to me, but how did you find this?  Is there some sort of
test case we could attach to this?

--D

>  	} else if (bbno < fbno) {
>  		/*
>  		 * Case 8:
> -- 
> 2.21.0 (Apple Git-122.2)
>
Wengang Wang Jan. 4, 2023, 5:41 p.m. UTC | #2
Darrick,

> On Jan 4, 2023, at 8:24 AM, Darrick J. Wong <djwong@kernel.org> wrote:
> 
> On Tue, Jan 03, 2023 at 11:32:17AM -0800, Wengang Wang wrote:
>> In xfs_extent_busy_update_extent() case 6 and 7, whenever bno is modified on
>> extent busy, the relavent length has to be modified accordingly.
>> 
>> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
>> ---
>> fs/xfs/xfs_extent_busy.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
>> index ad22a003f959..f3d328e4a440 100644
>> --- a/fs/xfs/xfs_extent_busy.c
>> +++ b/fs/xfs/xfs_extent_busy.c
>> @@ -236,6 +236,7 @@ xfs_extent_busy_update_extent(
>> 		 *
>> 		 */
>> 		busyp->bno = fend;
>> +		busyp->length = bend - fend;
> 
> Looks correct to me, but how did you find this?

I was working with a UEK5 XFS bug where busy blocks (contained in extent busy) are allocated to regular files unexpectedly.
When I was trying to fix that problem (still reuse busy blocks for directories), the problem here is exposed.


>  Is there some sort of
> test case we could attach to this?

Hm.. I can only reproduce this with my patch. Well, the idea is that, for example,

1) we have an extent busy in the busy tree:    (bno=100, len=200)
2) allocate blocks for directories from above extent busy (multiple times)
3) after the allocations, above extent busy finally becomes  (bno=300, len=200)  though it should become (bno=300, len=0) and should be removed from the busy tree.
4) the block 300 (in that AG) is used as metadata (directory blocks containing dir entries) and then that block is freed
5) insert the new extent busy (bno=300, len=1) to the busy tree,
in function xfs_extent_busy_insert():

 61         while (*rbp) {
 62                 parent = *rbp;
 63                 busyp = rb_entry(parent, struct xfs_extent_busy, rb_node);
 64
 65                 if (new->bno < busyp->bno) {
 66                         rbp = &(*rbp)->rb_left;
 67                         ASSERT(new->bno + new->length <= busyp->bno);
 68                 } else if (new->bno > busyp->bno) {
 69                         rbp = &(*rbp)->rb_right;
 70                         ASSERT(bno >= busyp->bno + busyp->length);
 71                 } else {
 72                         ASSERT(0);
 73                 }
 74         }

Note that node (bno=300, len=200) already exists in the tree, the code hits line 72, the “else” case, and enters infinite loop.

thanks,
wengang

> 
> --D
> 
>> 	} else if (bbno < fbno) {
>> 		/*
>> 		 * Case 8:
>> -- 
>> 2.21.0 (Apple Git-122.2)
>>
Darrick J. Wong Jan. 5, 2023, 7:58 p.m. UTC | #3
On Wed, Jan 04, 2023 at 05:41:15PM +0000, Wengang Wang wrote:
> Darrick,
> 
> > On Jan 4, 2023, at 8:24 AM, Darrick J. Wong <djwong@kernel.org> wrote:
> > 
> > On Tue, Jan 03, 2023 at 11:32:17AM -0800, Wengang Wang wrote:
> >> In xfs_extent_busy_update_extent() case 6 and 7, whenever bno is modified on
> >> extent busy, the relavent length has to be modified accordingly.
> >> 
> >> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
> >> ---
> >> fs/xfs/xfs_extent_busy.c | 1 +
> >> 1 file changed, 1 insertion(+)
> >> 
> >> diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
> >> index ad22a003f959..f3d328e4a440 100644
> >> --- a/fs/xfs/xfs_extent_busy.c
> >> +++ b/fs/xfs/xfs_extent_busy.c
> >> @@ -236,6 +236,7 @@ xfs_extent_busy_update_extent(
> >> 		 *
> >> 		 */
> >> 		busyp->bno = fend;
> >> +		busyp->length = bend - fend;
> > 
> > Looks correct to me, but how did you find this?
> 
> I was working with a UEK5 XFS bug where busy blocks (contained in
> extent busy) are allocated to regular files unexpectedly.  When I was
> trying to fix that problem (still reuse busy blocks for directories),
> the problem here is exposed.
> 
> 
> >  Is there some sort of
> > test case we could attach to this?
> 
> Hm.. I can only reproduce this with my patch. Well, the idea is that,
> for example,
> 
> 1) we have an extent busy in the busy tree:    (bno=100, len=200)
> 2) allocate blocks for directories from above extent busy (multiple times)
> 3) after the allocations, above extent busy finally becomes  (bno=300, len=200)  though it should become (bno=300, len=0) and should be removed from the busy tree.
> 4) the block 300 (in that AG) is used as metadata (directory blocks containing dir entries) and then that block is freed
> 5) insert the new extent busy (bno=300, len=1) to the busy tree,
> in function xfs_extent_busy_insert():
> 
>  61         while (*rbp) {
>  62                 parent = *rbp;
>  63                 busyp = rb_entry(parent, struct xfs_extent_busy, rb_node);
>  64
>  65                 if (new->bno < busyp->bno) {
>  66                         rbp = &(*rbp)->rb_left;
>  67                         ASSERT(new->bno + new->length <= busyp->bno);
>  68                 } else if (new->bno > busyp->bno) {
>  69                         rbp = &(*rbp)->rb_right;
>  70                         ASSERT(bno >= busyp->bno + busyp->length);
>  71                 } else {
>  72                         ASSERT(0);
>  73                 }
>  74         }
> 
> Note that node (bno=300, len=200) already exists in the tree, the code
> hits line 72, the “else” case, and enters infinite loop.

Hm.  I /think/ it would be possible but fairly difficult to write an
fstest -- you'd have to create a storage with a very slow DISCARD
command, mount the fs with -o discard, fill the fs, free all the blocks,
make a large directory structure, and then rmdir the directory.  Those
last three steps would have to happen before the discards finish.

Uh... do you think that's worth the effort?  I don't think the kernel
even has a dm driver to simulate a device with only slow discards.

In the meantime,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> 
> thanks,
> wengang
> 
> > 
> > --D
> > 
> >> 	} else if (bbno < fbno) {
> >> 		/*
> >> 		 * Case 8:
> >> -- 
> >> 2.21.0 (Apple Git-122.2)
> >> 
>
diff mbox series

Patch

diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
index ad22a003f959..f3d328e4a440 100644
--- a/fs/xfs/xfs_extent_busy.c
+++ b/fs/xfs/xfs_extent_busy.c
@@ -236,6 +236,7 @@  xfs_extent_busy_update_extent(
 		 *
 		 */
 		busyp->bno = fend;
+		busyp->length = bend - fend;
 	} else if (bbno < fbno) {
 		/*
 		 * Case 8: