diff mbox series

btrfs: don't access possibly stale fs_info data for printing duplicate device

Message ID 2bb63b693331e27b440768b163a84935fe01edda.1605182240.git.johannes.thumshirn@wdc.com (mailing list archive)
State New, archived
Headers show
Series btrfs: don't access possibly stale fs_info data for printing duplicate device | expand

Commit Message

Johannes Thumshirn Nov. 12, 2020, 11:59 a.m. UTC
Syzbot reported a possible use-after-free when printing a duplicate device
warning device_list_add().

At this point it can happen that a btrfs_device::fs_info is not correctly
setup yet, so we're accessing stale data, when printing the warning
message using the btrfs_printk() wrappers.

Instead of printing possibly uninitialized or already freed memory in
btrfs_printk(), use a normal pr_warn().

Reported-by: syzbot+582e66e5edf36a22c7b0@syzkaller.appspotmail.com
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---

This is an alternative and IMHO simpler aproach to what Anand proposed in
https://lore.kernel.org/linux-btrfs/20200114060920.4527-2-anand.jain@oracle.com/T/


 fs/btrfs/volumes.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Nikolay Borisov Nov. 12, 2020, 12:02 p.m. UTC | #1
On 12.11.20 г. 13:59 ч., Johannes Thumshirn wrote:
> Syzbot reported a possible use-after-free when printing a duplicate device
> warning device_list_add().
> 
> At this point it can happen that a btrfs_device::fs_info is not correctly
> setup yet, so we're accessing stale data, when printing the warning
> message using the btrfs_printk() wrappers.
> 
> Instead of printing possibly uninitialized or already freed memory in
> btrfs_printk(), use a normal pr_warn().
> 
> Reported-by: syzbot+582e66e5edf36a22c7b0@syzkaller.appspotmail.com
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com> , though see below for
a suggestion.

> ---
> 
> This is an alternative and IMHO simpler aproach to what Anand proposed in
> https://lore.kernel.org/linux-btrfs/20200114060920.4527-2-anand.jain@oracle.com/T/
> 
> 
>  fs/btrfs/volumes.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index bb1aa96e1233..eb1af5e3d596 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -940,8 +940,8 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>  			if (device->bdev != path_bdev) {
>  				bdput(path_bdev);
>  				mutex_unlock(&fs_devices->device_list_mutex);
> -				btrfs_warn_in_rcu(device->fs_info,
> -	"duplicate device %s devid %llu generation %llu scanned by %s (%d)",
> +				pr_warn(
> +	"BTRFS: duplicate device %s devid %llu generation %llu scanned by %s (%d)",
>  						  path, devid, found_transid,
>  						  current->comm,
>  						  task_pid_nr(current));
> 


Would a simple 'if()' here catch the case where fs_info is not
initialized essentially open-coding what Anand has proposed? My idea is
to be able to provide the filesystem id when we can (best effort) and
simply use pr_warn otherwise, but without having to change the internals
of btrfs_printk and instead handle the single problematic call site ?
Johannes Thumshirn Nov. 12, 2020, 12:09 p.m. UTC | #2
On 12/11/2020 13:03, Nikolay Borisov wrote:
> Would a simple 'if()' here catch the case where fs_info is not
> initialized essentially open-coding what Anand has proposed? My idea is
> to be able to provide the filesystem id when we can (best effort) and
> simply use pr_warn otherwise, but without having to change the internals
> of btrfs_printk and instead handle the single problematic call site ?
> 

Unfortunately not, I've been trying to do that but the device->fs_info pointer
exists and accessing it triggers a KASAN splat.

Actually btrfs_printk() is already checking if fs_info is NULL or not to decide
whether to print <unknown> or fs_info->sb->s_id.

Another option would be to do btrfs_warn_in_rcu(NULL but that doesn't buy us a lot
more.
Nikolay Borisov Nov. 12, 2020, 12:25 p.m. UTC | #3
On 12.11.20 г. 14:09 ч., Johannes Thumshirn wrote:
> On 12/11/2020 13:03, Nikolay Borisov wrote:
>> Would a simple 'if()' here catch the case where fs_info is not
>> initialized essentially open-coding what Anand has proposed? My idea is
>> to be able to provide the filesystem id when we can (best effort) and
>> simply use pr_warn otherwise, but without having to change the internals
>> of btrfs_printk and instead handle the single problematic call site ?
>>
> 
> Unfortunately not, I've been trying to do that but the device->fs_info pointer
> exists and accessing it triggers a KASAN splat.
> 
> Actually btrfs_printk() is already checking if fs_info is NULL or not to decide
> whether to print <unknown> or fs_info->sb->s_id.
> 
> Another option would be to do btrfs_warn_in_rcu(NULL but that doesn't buy us a lot
> more.
> 

Indeed, so I'm fine to proceed with this simpler code.
David Sterba Nov. 13, 2020, 4:53 p.m. UTC | #4
On Thu, Nov 12, 2020 at 12:09:52PM +0000, Johannes Thumshirn wrote:
> On 12/11/2020 13:03, Nikolay Borisov wrote:
> > Would a simple 'if()' here catch the case where fs_info is not
> > initialized essentially open-coding what Anand has proposed? My idea is
> > to be able to provide the filesystem id when we can (best effort) and
> > simply use pr_warn otherwise, but without having to change the internals
> > of btrfs_printk and instead handle the single problematic call site ?
> > 
> 
> Unfortunately not, I've been trying to do that but the device->fs_info pointer
> exists and accessing it triggers a KASAN splat.
> 
> Actually btrfs_printk() is already checking if fs_info is NULL or not to decide
> whether to print <unknown> or fs_info->sb->s_id.
> 
> Another option would be to do btrfs_warn_in_rcu(NULL but that doesn't buy us a lot
> more.

It does, not calling pr_warn and using the helpers that print the
standard header. I really want to avoid using the raw pr_* functions if
possible, in this case we can use the NULL parameter.

Previous discussion:
https://lore.kernel.org/linux-btrfs/20200110090555.7049-1-anand.jain@oracle.com/t/#u

We can update btrfs_printk to leave out "(device %s)" completely in
case there's no fs_info, and then switch everything to the helpers (there
are still too many pr_* left).
David Sterba Nov. 13, 2020, 4:57 p.m. UTC | #5
On Fri, Nov 13, 2020 at 05:53:05PM +0100, David Sterba wrote:
> On Thu, Nov 12, 2020 at 12:09:52PM +0000, Johannes Thumshirn wrote:
> > On 12/11/2020 13:03, Nikolay Borisov wrote:
> > > Would a simple 'if()' here catch the case where fs_info is not
> > > initialized essentially open-coding what Anand has proposed? My idea is
> > > to be able to provide the filesystem id when we can (best effort) and
> > > simply use pr_warn otherwise, but without having to change the internals
> > > of btrfs_printk and instead handle the single problematic call site ?
> > > 
> > 
> > Unfortunately not, I've been trying to do that but the device->fs_info pointer
> > exists and accessing it triggers a KASAN splat.
> > 
> > Actually btrfs_printk() is already checking if fs_info is NULL or not to decide
> > whether to print <unknown> or fs_info->sb->s_id.
> > 
> > Another option would be to do btrfs_warn_in_rcu(NULL but that doesn't buy us a lot
> > more.
> 
> It does, not calling pr_warn and using the helpers that print the
> standard header. I really want to avoid using the raw pr_* functions if
> possible, in this case we can use the NULL parameter.
> 
> Previous discussion:
> https://lore.kernel.org/linux-btrfs/20200110090555.7049-1-anand.jain@oracle.com/t/#u
> 
> We can update btrfs_printk to leave out "(device %s)" completely in
> case there's no fs_info, and then switch everything to the helpers (there
> are still too many pr_* left).

--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -240,9 +240,14 @@ void __cold btrfs_printk(const struct btrfs_fs_info *fs_info, const char *fmt, .
        vaf.fmt = fmt;
        vaf.va = &args;
 
-       if (__ratelimit(ratelimit))
-               printk("%sBTRFS %s (device %s): %pV\n", lvl, type,
-                       fs_info ? fs_info->sb->s_id : "<unknown>", &vaf);
+       if (__ratelimit(ratelimit)) {
+               if (fs_info) {
+                       printk("%sBTRFS %s (device %s): %pV\n", lvl, type,
+                                       fs_info->sb->s_id, &vaf);
+               } else {
+                       printk("%sBTRFS %s: %pV\n", lvl, type, &vaf);
+               }
+       }
 
        va_end(args);
 }
---
Johannes Thumshirn Nov. 13, 2020, 6:41 p.m. UTC | #6
On 13/11/2020 17:59, David Sterba wrote:
> 
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -240,9 +240,14 @@ void __cold btrfs_printk(const struct btrfs_fs_info *fs_info, const char *fmt, .
>         vaf.fmt = fmt;
>         vaf.va = &args;
>  
> -       if (__ratelimit(ratelimit))
> -               printk("%sBTRFS %s (device %s): %pV\n", lvl, type,
> -                       fs_info ? fs_info->sb->s_id : "<unknown>", &vaf);
> +       if (__ratelimit(ratelimit)) {
> +               if (fs_info) {
> +                       printk("%sBTRFS %s (device %s): %pV\n", lvl, type,
> +                                       fs_info->sb->s_id, &vaf);
> +               } else {
> +                       printk("%sBTRFS %s: %pV\n", lvl, type, &vaf);
> +               }
> +       }
>  
>         va_end(args);
>  }

But this does not help us at all. If fs_info is not correctly initialized but
the pointer is non-NULL we get the KASAN splat syzkaller is reporting.

Just try your patch with KASAN and the reproducer, it will crash within one 
minute. Been there, done that.
Johannes Thumshirn Nov. 13, 2020, 6:45 p.m. UTC | #7
On 13/11/2020 19:41, Johannes Thumshirn wrote:
> On 13/11/2020 17:59, David Sterba wrote:
>>
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -240,9 +240,14 @@ void __cold btrfs_printk(const struct btrfs_fs_info *fs_info, const char *fmt, .
>>         vaf.fmt = fmt;
>>         vaf.va = &args;
>>  
>> -       if (__ratelimit(ratelimit))
>> -               printk("%sBTRFS %s (device %s): %pV\n", lvl, type,
>> -                       fs_info ? fs_info->sb->s_id : "<unknown>", &vaf);
>> +       if (__ratelimit(ratelimit)) {
>> +               if (fs_info) {
>> +                       printk("%sBTRFS %s (device %s): %pV\n", lvl, type,
>> +                                       fs_info->sb->s_id, &vaf);
>> +               } else {
>> +                       printk("%sBTRFS %s: %pV\n", lvl, type, &vaf);
>> +               }
>> +       }
>>  
>>         va_end(args);
>>  }
> 
> But this does not help us at all. If fs_info is not correctly initialized but
> the pointer is non-NULL we get the KASAN splat syzkaller is reporting.
> 
> Just try your patch with KASAN and the reproducer, it will crash within one 
> minute. Been there, done that.
> 

Ah sorry now I get what you want, yes well have to call btrfs_warn() with a NULL
argument if you don't want raw pr_* calls. As long as we're not passing in 
something.

Will update the patch on monday.
diff mbox series

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bb1aa96e1233..eb1af5e3d596 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -940,8 +940,8 @@  static noinline struct btrfs_device *device_list_add(const char *path,
 			if (device->bdev != path_bdev) {
 				bdput(path_bdev);
 				mutex_unlock(&fs_devices->device_list_mutex);
-				btrfs_warn_in_rcu(device->fs_info,
-	"duplicate device %s devid %llu generation %llu scanned by %s (%d)",
+				pr_warn(
+	"BTRFS: duplicate device %s devid %llu generation %llu scanned by %s (%d)",
 						  path, devid, found_transid,
 						  current->comm,
 						  task_pid_nr(current));