Message ID | ddcae9658e46880dfec2fb0aa61d01fb3353d202.1685449706.git.ojaswin@linux.ibm.com (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Headers | show |
Series | multiblock allocator improvements | expand |
On Tue, May 30, 2023 at 3:25 PM Ojaswin Mujoo <ojaswin@linux.ibm.com> wrote: > > This reverts commit 32c0869370194ae5ac9f9f501953ef693040f6a1. > > The reverted commit was intended to remove a dead check however it was observed > that this check was actually being used to exit early instead of looping > sbi->s_mb_max_to_scan times when we are able to find a free extent bigger than > the goal extent. Due to this, a my performance tests (fsmark, parallel file > writes in a highly fragmented FS) were seeing a 2x-3x regression. > > Example, the default value of the following variables is: > > sbi->s_mb_max_to_scan = 200 > sbi->s_mb_min_to_scan = 10 > > In ext4_mb_check_limits() if we find an extent smaller than goal, then we return > early and try again. This loop will go on until we have processed > sbi->s_mb_max_to_scan(=200) number of free extents at which point we exit and > just use whatever we have even if it is smaller than goal extent. > > Now, the regression comes when we find an extent bigger than goal. Earlier, in > this case we would loop only sbi->s_mb_min_to_scan(=10) times and then just use > the bigger extent. However with commit 32c08693 that check was removed and hence > we would loop sbi->s_mb_max_to_scan(=200) times even though we have a big enough > free extent to satisfy the request. The only time we would exit early would be > when the free extent is *exactly* the size of our goal, which is pretty uncommon > occurrence and so we would almost always end up looping 200 times. > > Hence, revert the commit by adding the check back to fix the regression. Also > add a comment to outline this policy. > Hi, I applied this single patch of your series v2 on top of Linux v6.4-rc4. So, if this is a regression I ask myself if this is material for Linux 6.4? Can you comment on this, please? Thanks. Regards, -Sedat- > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> > Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> > Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com> > --- > fs/ext4/mballoc.c | 16 +++++++++++++++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c > index d4b6a2c1881d..7ac6d3524f29 100644 > --- a/fs/ext4/mballoc.c > +++ b/fs/ext4/mballoc.c > @@ -2063,7 +2063,7 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac, > if (bex->fe_len < gex->fe_len) > return; > > - if (finish_group) > + if (finish_group || ac->ac_found > sbi->s_mb_min_to_scan) > ext4_mb_use_best_found(ac, e4b); > } > > @@ -2075,6 +2075,20 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac, > * in the context. Later, the best found extent will be used, if > * mballoc can't find good enough extent. > * > + * The algorithm used is roughly as follows: > + * > + * * If free extent found is exactly as big as goal, then > + * stop the scan and use it immediately > + * > + * * If free extent found is smaller than goal, then keep retrying > + * upto a max of sbi->s_mb_max_to_scan times (default 200). After > + * that stop scanning and use whatever we have. > + * > + * * If free extent found is bigger than goal, then keep retrying > + * upto a max of sbi->s_mb_min_to_scan times (default 10) before > + * stopping the scan and using the extent. > + * > + * > * FIXME: real allocation policy is to be designed yet! > */ > static void ext4_mb_measure_extent(struct ext4_allocation_context *ac, > -- > 2.31.1 >
On Tue, May 30, 2023 at 06:28:22PM +0200, Sedat Dilek wrote: > On Tue, May 30, 2023 at 3:25 PM Ojaswin Mujoo <ojaswin@linux.ibm.com> wrote: > > > > This reverts commit 32c0869370194ae5ac9f9f501953ef693040f6a1. > > > > The reverted commit was intended to remove a dead check however it was observed > > that this check was actually being used to exit early instead of looping > > sbi->s_mb_max_to_scan times when we are able to find a free extent bigger than > > the goal extent. Due to this, a my performance tests (fsmark, parallel file > > writes in a highly fragmented FS) were seeing a 2x-3x regression. > > > > Example, the default value of the following variables is: > > > > sbi->s_mb_max_to_scan = 200 > > sbi->s_mb_min_to_scan = 10 > > > > In ext4_mb_check_limits() if we find an extent smaller than goal, then we return > > early and try again. This loop will go on until we have processed > > sbi->s_mb_max_to_scan(=200) number of free extents at which point we exit and > > just use whatever we have even if it is smaller than goal extent. > > > > Now, the regression comes when we find an extent bigger than goal. Earlier, in > > this case we would loop only sbi->s_mb_min_to_scan(=10) times and then just use > > the bigger extent. However with commit 32c08693 that check was removed and hence > > we would loop sbi->s_mb_max_to_scan(=200) times even though we have a big enough > > free extent to satisfy the request. The only time we would exit early would be > > when the free extent is *exactly* the size of our goal, which is pretty uncommon > > occurrence and so we would almost always end up looping 200 times. > > > > Hence, revert the commit by adding the check back to fix the regression. Also > > add a comment to outline this policy. > > > > Hi, > > I applied this single patch of your series v2 on top of Linux v6.4-rc4. > > So, if this is a regression I ask myself if this is material for Linux 6.4? > > Can you comment on this, please? > > Thanks. > > Regards, > -Sedat- Hi Sedat, Since this patch fixes a regression I think it should ideally go in Linux 6.4 Regards, ojaswin > > > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> > > Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> > > Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com> > > --- > > fs/ext4/mballoc.c | 16 +++++++++++++++- > > 1 file changed, 15 insertions(+), 1 deletion(-) > > > > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c > > index d4b6a2c1881d..7ac6d3524f29 100644 > > --- a/fs/ext4/mballoc.c > > +++ b/fs/ext4/mballoc.c > > @@ -2063,7 +2063,7 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac, > > if (bex->fe_len < gex->fe_len) > > return; > > > > - if (finish_group) > > + if (finish_group || ac->ac_found > sbi->s_mb_min_to_scan) > > ext4_mb_use_best_found(ac, e4b); > > } > > > > @@ -2075,6 +2075,20 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac, > > * in the context. Later, the best found extent will be used, if > > * mballoc can't find good enough extent. > > * > > + * The algorithm used is roughly as follows: > > + * > > + * * If free extent found is exactly as big as goal, then > > + * stop the scan and use it immediately > > + * > > + * * If free extent found is smaller than goal, then keep retrying > > + * upto a max of sbi->s_mb_max_to_scan times (default 200). After > > + * that stop scanning and use whatever we have. > > + * > > + * * If free extent found is bigger than goal, then keep retrying > > + * upto a max of sbi->s_mb_min_to_scan times (default 10) before > > + * stopping the scan and using the extent. > > + * > > + * > > * FIXME: real allocation policy is to be designed yet! > > */ > > static void ext4_mb_measure_extent(struct ext4_allocation_context *ac, > > -- > > 2.31.1 > >
On 31.05.23 10:57, Ojaswin Mujoo wrote: > On Tue, May 30, 2023 at 06:28:22PM +0200, Sedat Dilek wrote: >> On Tue, May 30, 2023 at 3:25 PM Ojaswin Mujoo <ojaswin@linux.ibm.com> wrote: >>> >>> This reverts commit 32c0869370194ae5ac9f9f501953ef693040f6a1. >>> >>> The reverted commit was intended to remove a dead check however it was observed >>> that this check was actually being used to exit early instead of looping >>> sbi->s_mb_max_to_scan times when we are able to find a free extent bigger than >>> the goal extent. Due to this, a my performance tests (fsmark, parallel file >>> writes in a highly fragmented FS) were seeing a 2x-3x regression. >>> >>> Example, the default value of the following variables is: >>> >>> sbi->s_mb_max_to_scan = 200 >>> sbi->s_mb_min_to_scan = 10 >>> >>> In ext4_mb_check_limits() if we find an extent smaller than goal, then we return >>> early and try again. This loop will go on until we have processed >>> sbi->s_mb_max_to_scan(=200) number of free extents at which point we exit and >>> just use whatever we have even if it is smaller than goal extent. >>> >>> Now, the regression comes when we find an extent bigger than goal. Earlier, in >>> this case we would loop only sbi->s_mb_min_to_scan(=10) times and then just use >>> the bigger extent. However with commit 32c08693 that check was removed and hence >>> we would loop sbi->s_mb_max_to_scan(=200) times even though we have a big enough >>> free extent to satisfy the request. The only time we would exit early would be >>> when the free extent is *exactly* the size of our goal, which is pretty uncommon >>> occurrence and so we would almost always end up looping 200 times. >>> >>> Hence, revert the commit by adding the check back to fix the regression. Also >>> add a comment to outline this policy. >> >> I applied this single patch of your series v2 on top of Linux v6.4-rc4. >> >> So, if this is a regression I ask myself if this is material for Linux 6.4? >> >> Can you comment on this, please? > > Since this patch fixes a regression I think it should ideally go in > Linux 6.4 Ted can speak up for himself, but maybe this might speed things up: A lot of maintainers in a case like this want fixes (like this) submitted separately from other changes (like the rest of this series). /me hopes this will help and not confuse anything Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) -- Everything you wanna know about Linux kernel regression tracking: https://linux-regtracking.leemhuis.info/about/#tldr If I did something stupid, please tell me, as explained on that page.
On Fri, Jun 02, 2023 at 03:45:52PM +0200, Thorsten Leemhuis wrote: > > > > Since this patch fixes a regression I think it should ideally go in > > Linux 6.4 > > Ted can speak up for himself, but maybe this might speed things up: > > A lot of maintainers in a case like this want fixes (like this) > submitted separately from other changes (like the rest of this series). While it's nice to do that in the future (since I would have noticed this earlier, it could have gone into my regression fixes push to Linus last week), in this particular case I've already noted this particular issue, and per the discussion in the last weekly ext4 video conference chat, I will be reordering the pashes so I can send a secondary regression fix to Linus very shortly. Thanks, - Ted
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index d4b6a2c1881d..7ac6d3524f29 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2063,7 +2063,7 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac, if (bex->fe_len < gex->fe_len) return; - if (finish_group) + if (finish_group || ac->ac_found > sbi->s_mb_min_to_scan) ext4_mb_use_best_found(ac, e4b); } @@ -2075,6 +2075,20 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac, * in the context. Later, the best found extent will be used, if * mballoc can't find good enough extent. * + * The algorithm used is roughly as follows: + * + * * If free extent found is exactly as big as goal, then + * stop the scan and use it immediately + * + * * If free extent found is smaller than goal, then keep retrying + * upto a max of sbi->s_mb_max_to_scan times (default 200). After + * that stop scanning and use whatever we have. + * + * * If free extent found is bigger than goal, then keep retrying + * upto a max of sbi->s_mb_min_to_scan times (default 10) before + * stopping the scan and using the extent. + * + * * FIXME: real allocation policy is to be designed yet! */ static void ext4_mb_measure_extent(struct ext4_allocation_context *ac,