[v2,0/2] btrfs: scrub avoid use-after-free when chunk length is not 64K aligned

Message ID cover.1705449249.git.wqu@suse.com (mailing list archive)

Message

Qu Wenruo Jan. 17, 2024, 12:32 a.m. UTC
[Changelog]
v2:
- Split out the RST code change
  So that backport can happen more smoothly.
  Furthermore, the RST specific part is really just a small enhancement.
  As RST would always do the btrfs_map_block() call, even if we have a
  corrupted extent item beyond the chunk boundary it would be properly
  caught, resulting in at most false alerts; no real use-after-free can
  happen after the first patch.

- Slight update on the commit message of the first patch
  Fix a copy-and-paste error of the number used to calculate the chunk
  end.
  Remove the RST scrub part, as we won't do any RST fix (although
  it would still silently fix RST, since both RST and regular scrub
  share the same endio function).

There is a bug report about a use-after-free during scrub which crashes
the system.
It turns out to be caused by a chunk whose length is not 64K aligned.

The first patch is the proper fix and needs to be backported to all
kernels using the newer scrub interface.

The second patch is a small enhancement for RST scrub, inspired by the
above bug; it doesn't really need to be backported.
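
To show what goes wrong at the boundary, below is a minimal stand-alone
sketch of the clamping arithmetic (this is not the kernel code; the 4K
sector size and the example chunk geometry are assumptions for
illustration only):

#include <stdint.h>
#include <stdio.h>

#define STRIPE_LEN      (64 * 1024)     /* stand-in for BTRFS_STRIPE_LEN */
#define SECTORSIZE_BITS 12              /* assuming 4K sectors */

int main(void)
{
        /* Hypothetical chunk: 100K long, i.e. NOT 64K aligned. */
        uint64_t bg_start  = 1024 * 1024;
        uint64_t bg_length = 100 * 1024;
        uint64_t bg_end    = bg_start + bg_length;

        /* The second (and last) stripe of that chunk starts 64K in. */
        uint64_t logical = bg_start + 64 * 1024;

        /* Without clamping: always a full stripe worth of sectors. */
        unsigned int unclamped = STRIPE_LEN >> SECTORSIZE_BITS;

        /* Clamped to the chunk end, the way the fixed code limits it. */
        uint64_t len = bg_end - logical;

        if (len > STRIPE_LEN)
                len = STRIPE_LEN;

        printf("unclamped: %u sectors, clamped: %u sectors\n",
               unclamped, (unsigned int)(len >> SECTORSIZE_BITS));
        return 0;
}

For this geometry the unclamped count is 16 sectors while only 9 fit
before the chunk end; those surplus sectors past the boundary are what
the first patch keeps scrub from touching.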

Qu Wenruo (2):
  btrfs: scrub: avoid use-after-free when chunk length is not 64K
    aligned
  btrfs: scrub: limit RST scrub to chunk boundary

 fs/btrfs/scrub.c | 36 +++++++++++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 7 deletions(-)

Comments

Johannes Thumshirn Jan. 17, 2024, 7:54 a.m. UTC | #1
On 17.01.24 01:33, Qu Wenruo wrote:
> [Changelog]
> v2:
> - Split out the RST code change
>    So that backport can happen more smoothly.
>    Furthermore, the RST specific part is really just a small enhancement.
>    As RST would always do the btrfs_map_block(), even if we have a
>    corrupted extent item beyond chunk, it would be properly caught,
>    thus at most false alerts, no real use-after-free can happen after
>    the first patch.
> 
> - Slight update on the commit message of the first patch
>    Fix a copy-and-paste error of the number used to calculate the chunk
>    end.
>    Remove the RST scrub part, as we won't do any RST fix (although
>    it would still silently fix RST, since both RST and regular scrub
>    share the same endio function)
> 
> There is a bug report about use-after-free during scrub and crash the
> system.
> It turns out to be a chunk whose length is not 64K aligned causing the
> problem.
> 
> The first patch would be the proper fix, needs to be backported to all
> kernel using newer scrub interface.
> 
> The 2nd patch is a small enhancement for RST scrub, inspired by the
> above bug, which doesn't really need to be backported.
> 
> Qu Wenruo (2):
>    btrfs: scrub: avoid use-after-free when chunk length is not 64K
>      aligned
>    btrfs: scrub: limit RST scrub to chunk boundary
> 
>   fs/btrfs/scrub.c | 36 +++++++++++++++++++++++++++++-------
>   1 file changed, 29 insertions(+), 7 deletions(-)
> 

For the series,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

One more thing I personally would add (as a 3rd patch that doesn't need 
to get backported to stable) is this:

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 0123d2728923..046fdf8f6773 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1641,14 +1641,23 @@ static void scrub_reset_stripe(struct scrub_stripe *stripe)
        }
 }
 
+static unsigned int scrub_nr_stripe_sectors(struct scrub_stripe *stripe)
+{
+       struct btrfs_fs_info *fs_info = stripe->bg->fs_info;
+       struct btrfs_block_group *bg = stripe->bg;
+       u64 bg_end = bg->start + bg->length;
+       unsigned int nr_sectors;
+
+       nr_sectors = min(BTRFS_STRIPE_LEN, bg_end - stripe->logical);
+       return nr_sectors >> fs_info->sectorsize_bits;
+}
+
 static void scrub_submit_extent_sector_read(struct scrub_ctx *sctx,
                                            struct scrub_stripe *stripe)
 {
        struct btrfs_fs_info *fs_info = stripe->bg->fs_info;
        struct btrfs_bio *bbio = NULL;
-       unsigned int nr_sectors = min(BTRFS_STRIPE_LEN, stripe->bg->start +
-                                     stripe->bg->length - stripe->logical) >>
-                                 fs_info->sectorsize_bits;
+       unsigned int nr_sectors = scrub_nr_stripe_sectors(stripe);
        u64 stripe_len = BTRFS_STRIPE_LEN;
        int mirror = stripe->mirror_num;
        int i;
@@ -1718,9 +1727,7 @@ static void scrub_submit_initial_read(struct scrub_ctx *sctx,
 {
        struct btrfs_fs_info *fs_info = sctx->fs_info;
        struct btrfs_bio *bbio;
-       unsigned int nr_sectors = min(BTRFS_STRIPE_LEN, stripe->bg->start +
-                                     stripe->bg->length - stripe->logical) >>
-                                 fs_info->sectorsize_bits;
+       unsigned int nr_sectors = scrub_nr_stripe_sectors(stripe);
        int mirror = stripe->mirror_num;
 
        ASSERT(stripe->bg);

Sorry for the complete whitespace damage, but I think you get the point.
Qu Wenruo Jan. 17, 2024, 7:57 a.m. UTC | #2
On 2024/1/17 18:24, Johannes Thumshirn wrote:
> On 17.01.24 01:33, Qu Wenruo wrote:
>> [Changelog]
>> v2:
>> - Split out the RST code change
>>     So that backport can happen more smoothly.
>>     Furthermore, the RST specific part is really just a small enhancement.
>>     As RST would always do the btrfs_map_block(), even if we have a
>>     corrupted extent item beyond chunk, it would be properly caught,
>>     thus at most false alerts, no real use-after-free can happen after
>>     the first patch.
>>
>> - Slight update on the commit message of the first patch
>>     Fix a copy-and-paste error of the number used to calculate the chunk
>>     end.
>>     Remove the RST scrub part, as we won't do any RST fix (although
>>     it would still silently fix RST, since both RST and regular scrub
>>     share the same endio function)
>>
>> There is a bug report about use-after-free during scrub and crash the
>> system.
>> It turns out to be a chunk whose length is not 64K aligned causing the
>> problem.
>>
>> The first patch would be the proper fix, needs to be backported to all
>> kernel using newer scrub interface.
>>
>> The 2nd patch is a small enhancement for RST scrub, inspired by the
>> above bug, which doesn't really need to be backported.
>>
>> Qu Wenruo (2):
>>     btrfs: scrub: avoid use-after-free when chunk length is not 64K
>>       aligned
>>     btrfs: scrub: limit RST scrub to chunk boundary
>>
>>    fs/btrfs/scrub.c | 36 +++++++++++++++++++++++++++++-------
>>    1 file changed, 29 insertions(+), 7 deletions(-)
>>
> 
> For the series,
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> 
> One more thing I personally would add (as a 3rd patch that doesn't need
> to get backported to stable) is this:
> 
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index 0123d2728923..046fdf8f6773 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -1641,14 +1641,23 @@ static void scrub_reset_stripe(struct
> scrub_stripe *stripe)
>           }
>    }
> 
> +static unsigned int scrub_nr_stripe_sectors(struct scrub_stripe *stripe)
> +{
> +       struct btrfs_fs_info *fs_info = stripe->bg->fs_info;
> +       struct btrfs_block_group *bg = stripe->bg;
> +       u64 bg_end = bg->start + bg->length;
> +       unsigned int nr_sectors;
> +
> +       nr_sectors = min(BTRFS_STRIPE_LEN, bg_end - stripe->logical);
> +       return nr_sectors >> fs_info->sectorsize_bits;
> +}
> +
>    static void scrub_submit_extent_sector_read(struct scrub_ctx *sctx,
>                                               struct scrub_stripe *stripe)
>    {
>           struct btrfs_fs_info *fs_info = stripe->bg->fs_info;
>           struct btrfs_bio *bbio = NULL;
> -       unsigned int nr_sectors = min(BTRFS_STRIPE_LEN, stripe->bg->start +
> -                                     stripe->bg->length -
> stripe->logical) >>
> -                                 fs_info->sectorsize_bits;
> +       unsigned int nr_sectors = scrub_nr_stripe_sectors(stripe);
>           u64 stripe_len = BTRFS_STRIPE_LEN;
>           int mirror = stripe->mirror_num;
>           int i;
> @@ -1718,9 +1727,7 @@ static void scrub_submit_initial_read(struct
> scrub_ctx *sctx,
>    {
>           struct btrfs_fs_info *fs_info = sctx->fs_info;
>           struct btrfs_bio *bbio;
> -       unsigned int nr_sectors = min(BTRFS_STRIPE_LEN, stripe->bg->start +
> -                                     stripe->bg->length -
> stripe->logical) >>
> -                                 fs_info->sectorsize_bits;
> +       unsigned int nr_sectors = scrub_nr_stripe_sectors(stripe);
>           int mirror = stripe->mirror_num;
> 
>           ASSERT(stripe->bg);
> 
> Sorry for the complete whitespace damage, but I think you get the point.

That's what I did before v1, but it turned out there are just two call
sites, so I open-coded the expression in the final patch.

Just a preference thing, I'm fine either way.

Thanks,
Qu
Johannes Thumshirn Jan. 17, 2024, 8:09 a.m. UTC | #3
On 17.01.24 08:57, Qu Wenruo wrote:
> 
> 
> On 2024/1/17 18:24, Johannes Thumshirn wrote:
>> On 17.01.24 01:33, Qu Wenruo wrote:
>>> [Changelog]
>>> v2:
>>> - Split out the RST code change
>>>     So that backport can happen more smoothly.
>>>     Furthermore, the RST specific part is really just a small 
>>> enhancement.
>>>     As RST would always do the btrfs_map_block(), even if we have a
>>>     corrupted extent item beyond chunk, it would be properly caught,
>>>     thus at most false alerts, no real use-after-free can happen after
>>>     the first patch.
>>>
>>> - Slight update on the commit message of the first patch
>>>     Fix a copy-and-paste error of the number used to calculate the chunk
>>>     end.
>>>     Remove the RST scrub part, as we won't do any RST fix (although
>>>     it would still silently fix RST, since both RST and regular scrub
>>>     share the same endio function)
>>>
>>> There is a bug report about use-after-free during scrub and crash the
>>> system.
>>> It turns out to be a chunk whose length is not 64K aligned causing the
>>> problem.
>>>
>>> The first patch would be the proper fix, needs to be backported to all
>>> kernel using newer scrub interface.
>>>
>>> The 2nd patch is a small enhancement for RST scrub, inspired by the
>>> above bug, which doesn't really need to be backported.
>>>
>>> Qu Wenruo (2):
>>>     btrfs: scrub: avoid use-after-free when chunk length is not 64K
>>>       aligned
>>>     btrfs: scrub: limit RST scrub to chunk boundary
>>>
>>>    fs/btrfs/scrub.c | 36 +++++++++++++++++++++++++++++-------
>>>    1 file changed, 29 insertions(+), 7 deletions(-)
>>>
>>
>> For the series,
>> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>
>> One more thing I personally would add (as a 3rd patch that doesn't need
>> to get backported to stable) is this:
>>
>> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
>> index 0123d2728923..046fdf8f6773 100644
>> --- a/fs/btrfs/scrub.c
>> +++ b/fs/btrfs/scrub.c
>> @@ -1641,14 +1641,23 @@ static void scrub_reset_stripe(struct
>> scrub_stripe *stripe)
>>           }
>>    }
>>
>> +static unsigned int scrub_nr_stripe_sectors(struct scrub_stripe *stripe)
>> +{
>> +       struct btrfs_fs_info *fs_info = stripe->bg->fs_info;
>> +       struct btrfs_block_group *bg = stripe->bg;
>> +       u64 bg_end = bg->start + bg->length;
>> +       unsigned int nr_sectors;
>> +
>> +       nr_sectors = min(BTRFS_STRIPE_LEN, bg_end - stripe->logical);
>> +       return nr_sectors >> fs_info->sectorsize_bits;
>> +}
>> +
>>    static void scrub_submit_extent_sector_read(struct scrub_ctx *sctx,
>>                                               struct scrub_stripe 
>> *stripe)
>>    {
>>           struct btrfs_fs_info *fs_info = stripe->bg->fs_info;
>>           struct btrfs_bio *bbio = NULL;
>> -       unsigned int nr_sectors = min(BTRFS_STRIPE_LEN, 
>> stripe->bg->start +
>> -                                     stripe->bg->length -
>> stripe->logical) >>
>> -                                 fs_info->sectorsize_bits;
>> +       unsigned int nr_sectors = scrub_nr_stripe_sectors(stripe);
>>           u64 stripe_len = BTRFS_STRIPE_LEN;
>>           int mirror = stripe->mirror_num;
>>           int i;
>> @@ -1718,9 +1727,7 @@ static void scrub_submit_initial_read(struct
>> scrub_ctx *sctx,
>>    {
>>           struct btrfs_fs_info *fs_info = sctx->fs_info;
>>           struct btrfs_bio *bbio;
>> -       unsigned int nr_sectors = min(BTRFS_STRIPE_LEN, 
>> stripe->bg->start +
>> -                                     stripe->bg->length -
>> stripe->logical) >>
>> -                                 fs_info->sectorsize_bits;
>> +       unsigned int nr_sectors = scrub_nr_stripe_sectors(stripe);
>>           int mirror = stripe->mirror_num;
>>
>>           ASSERT(stripe->bg);
>>
>> Sorry for the complete whitespace damage, but I think you get the point.
> 
> That's what I did before v1, but it turned out there are just two call
> sites, so I open-coded the expression in the final patch.
> 
> Just a preference thing, I'm fine either way.
> 

Yeah of course, I just hate the overly long lines and the code duplication :D
David Sterba Jan. 17, 2024, 5:17 p.m. UTC | #4
On Wed, Jan 17, 2024 at 08:09:20AM +0000, Johannes Thumshirn wrote:
> On 17.01.24 08:57, Qu Wenruo wrote:
> > 
> > 
> > On 2024/1/17 18:24, Johannes Thumshirn wrote:
> >> On 17.01.24 01:33, Qu Wenruo wrote:
> >>> [Changelog]
> >>> v2:
> >>> - Split out the RST code change
> >>>     So that backport can happen more smoothly.
> >>>     Furthermore, the RST specific part is really just a small 
> >>> enhancement.
> >>>     As RST would always do the btrfs_map_block(), even if we have a
> >>>     corrupted extent item beyond chunk, it would be properly caught,
> >>>     thus at most false alerts, no real use-after-free can happen after
> >>>     the first patch.
> >>>
> >>> - Slight update on the commit message of the first patch
> >>>     Fix a copy-and-paste error of the number used to calculate the chunk
> >>>     end.
> >>>     Remove the RST scrub part, as we won't do any RST fix (although
> >>>     it would still silently fix RST, since both RST and regular scrub
> >>>     share the same endio function)
> >>>
> >>> There is a bug report about use-after-free during scrub and crash the
> >>> system.
> >>> It turns out to be a chunk whose length is not 64K aligned causing the
> >>> problem.
> >>>
> >>> The first patch would be the proper fix, needs to be backported to all
> >>> kernel using newer scrub interface.
> >>>
> >>> The 2nd patch is a small enhancement for RST scrub, inspired by the
> >>> above bug, which doesn't really need to be backported.
> >>>
> >>> Qu Wenruo (2):
> >>>     btrfs: scrub: avoid use-after-free when chunk length is not 64K
> >>>       aligned
> >>>     btrfs: scrub: limit RST scrub to chunk boundary
> >>>
> >>>    fs/btrfs/scrub.c | 36 +++++++++++++++++++++++++++++-------
> >>>    1 file changed, 29 insertions(+), 7 deletions(-)
> >>>
> >>
> >> For the series,
> >> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> >>
> >> One more thing I personally would add (as a 3rd patch that doesn't need
> >> to get backported to stable) is this:
> >>
> >> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> >> index 0123d2728923..046fdf8f6773 100644
> >> --- a/fs/btrfs/scrub.c
> >> +++ b/fs/btrfs/scrub.c
> >> @@ -1641,14 +1641,23 @@ static void scrub_reset_stripe(struct
> >> scrub_stripe *stripe)
> >>           }
> >>    }
> >>
> >> +static unsigned int scrub_nr_stripe_sectors(struct scrub_stripe *stripe)
> >> +{
> >> +       struct btrfs_fs_info *fs_info = stripe->bg->fs_info;
> >> +       struct btrfs_block_group *bg = stripe->bg;
> >> +       u64 bg_end = bg->start + bg->length;
> >> +       unsigned int nr_sectors;
> >> +
> >> +       nr_sectors = min(BTRFS_STRIPE_LEN, bg_end - stripe->logical);
> >> +       return nr_sectors >> fs_info->sectorsize_bits;
> >> +}
> >> +
> >>    static void scrub_submit_extent_sector_read(struct scrub_ctx *sctx,
> >>                                               struct scrub_stripe 
> >> *stripe)
> >>    {
> >>           struct btrfs_fs_info *fs_info = stripe->bg->fs_info;
> >>           struct btrfs_bio *bbio = NULL;
> >> -       unsigned int nr_sectors = min(BTRFS_STRIPE_LEN, 
> >> stripe->bg->start +
> >> -                                     stripe->bg->length -
> >> stripe->logical) >>
> >> -                                 fs_info->sectorsize_bits;
> >> +       unsigned int nr_sectors = scrub_nr_stripe_sectors(stripe);
> >>           u64 stripe_len = BTRFS_STRIPE_LEN;
> >>           int mirror = stripe->mirror_num;
> >>           int i;
> >> @@ -1718,9 +1727,7 @@ static void scrub_submit_initial_read(struct
> >> scrub_ctx *sctx,
> >>    {
> >>           struct btrfs_fs_info *fs_info = sctx->fs_info;
> >>           struct btrfs_bio *bbio;
> >> -       unsigned int nr_sectors = min(BTRFS_STRIPE_LEN, 
> >> stripe->bg->start +
> >> -                                     stripe->bg->length -
> >> stripe->logical) >>
> >> -                                 fs_info->sectorsize_bits;
> >> +       unsigned int nr_sectors = scrub_nr_stripe_sectors(stripe);
> >>           int mirror = stripe->mirror_num;
> >>
> >>           ASSERT(stripe->bg);
> >>
> >> Sorry for the complete whitespace damage, but I think you get the point.
> > 
> > That's what I did before v1, but it turned out there are just two call
> > sites, so I open-coded the expression in the final patch.
> > 
> > Just a preference thing, I'm fine either way.
> 
> Yeah of course, I just hate the overly long lines and the code duplication :D

In this case the expression is not trivial, so a helper makes sense;
using it twice is reason enough. For something even more complicated to
calculate, a helper makes sense even for one-time use. If one is not
able to grasp at a glance what the expression does, beyond something
simple like "pick the bigger of the two", a helper should be used, or at
least a comment added.
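
As a generic illustration of that guideline (hypothetical code, not from
btrfs; every name below is made up), compare an open-coded clamp with a
named helper that carries a comment:

#include <stdint.h>

#define STRIPE_LEN (64 * 1024)  /* stand-in for BTRFS_STRIPE_LEN */

/* Open-coded: the intent has to be re-derived at every call site. */
static uint32_t nr_sectors_open_coded(uint64_t bg_end, uint64_t logical,
                                      uint32_t sectorsize_bits)
{
        return (uint32_t)(((bg_end - logical) < STRIPE_LEN ?
                           (bg_end - logical) : STRIPE_LEN) >> sectorsize_bits);
}

/*
 * Named helper: number of sectors left in the current stripe, clamped
 * so we never count sectors past the end of the block group.
 */
static uint32_t nr_sectors_to_bg_end(uint64_t bg_end, uint64_t logical,
                                     uint32_t sectorsize_bits)
{
        uint64_t len = bg_end - logical;

        if (len > STRIPE_LEN)
                len = STRIPE_LEN;
        return (uint32_t)(len >> sectorsize_bits);
}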