diff mbox

block: note about cloned bios and bio_for_each_segment_all

Message ID 06376146-e4c3-efd6-8893-155052b14034@kernel.dk (mailing list archive)
State New, archived
Headers show

Commit Message

Jens Axboe July 14, 2017, 2:22 p.m. UTC
On 07/14/2017 07:47 AM, Ming Lei wrote:
>> @@ -156,6 +156,9 @@ static inline void *bio_data(struct bio *bio)
>>  /*
>>   * drivers should _never_ use the all version - the bio may have been split
>>   * before it got to the driver and the driver won't own all of it
>> + *
>> + * Note that cloned bios must not use this as their bi_vcnt may be invalid and
>> + * this could lead to silent corruptions.
>>   */
>>  #define bio_for_each_segment_all(bvl, bio, i)                          \
>>         for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
>> --
>> 2.13.0
>>
> 
> Maybe we can add a warning here if it is a cloned bio.

I think that's a good idea, it's easy for people to get this wrong, and
the consequences can be dire. How about something like this?

Comments

Liu Bo July 14, 2017, 8:54 p.m. UTC | #1
On Fri, Jul 14, 2017 at 08:22:31AM -0600, Jens Axboe wrote:
> On 07/14/2017 07:47 AM, Ming Lei wrote:
> >> @@ -156,6 +156,9 @@ static inline void *bio_data(struct bio *bio)
> >>  /*
> >>   * drivers should _never_ use the all version - the bio may have been split
> >>   * before it got to the driver and the driver won't own all of it
> >> + *
> >> + * Note that cloned bios must not use this as their bi_vcnt may be invalid and
> >> + * this could lead to silent corruptions.
> >>   */
> >>  #define bio_for_each_segment_all(bvl, bio, i)                          \
> >>         for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
> >> --
> >> 2.13.0
> >>
> > 
> > Maybe we can add a warning here if it is a cloned bio.
> 
> I think that's a good idea, it's easy for people to get this wrong, and
> the consequences can be dire. How about something like this?
> 
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index 7b1cf4ba0902..13b6ac6eae29 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -155,9 +155,12 @@ static inline void *bio_data(struct bio *bio)
>  
>  /*
>   * drivers should _never_ use the all version - the bio may have been split
> - * before it got to the driver and the driver won't own all of it
> + * before it got to the driver and the driver won't own all of it.
> + *
> + * Don't use this on cloned bio's.
>   */
>  #define bio_for_each_segment_all(bvl, bio, i)				\
> +	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));			\
>  	for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
>  
>  static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
> 

This patch gave me a crash, I'm double checking it..

thanks,
-liubo

[  104.140220] BUG: unable to handle kernel paging request at ffffffffa0399c1a
[  104.140675] IP: report_bug+0xc4/0x180
[  104.140916] PGD 2626067 
[  104.140917] P4D 2626067 
[  104.141089] PUD 2627063 
[  104.141259] PMD 2346aa067 
[  104.141429] PTE 800000023569c161
[  104.141610] 
[  104.141926] Oops: 0003 [#1] SMP
[  104.142137] Dumping ftrace buffer:
[  104.142366]    (ftrace buffer empty)
[  104.142602] Modules linked in: btrfs(OE) xor raid6_pq ppdev parport_pc parport serio_raw nfsd auth_rpcgss nfs_acl lockd grace sunrpc [last unloaded$
 xor]
[  104.143493] CPU: 0 PID: 144 Comm: kworker/u16:4 Tainted: G        W  OE   4.12.0+ #801
[  104.144009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
[  104.144654] Workqueue: btrfs-rmw btrfs_rmw_helper [btrfs]
[  104.145009] task: ffff8802382f8000 task.stack: ffffc90000f80000
[  104.145393] RIP: 0010:report_bug+0xc4/0x180
[  104.145668] RSP: 0018:ffffc90000f83ac8 EFLAGS: 00010002
[  104.146015] RAX: 0000000000000001 RBX: ffffffffa0356cff RCX: 0000000000000001
[  104.146474] RDX: ffffffffa0399c10 RSI: 0000000000000480 RDI: 0000000000000000
[  104.146931] RBP: ffffc90000f83ae8 R08: 0000000000000907 R09: 0000000000000000
[  104.147393] R10: 000000005170b2af R11: 000000002b881219 R12: ffffc90000f83c38
[  104.147852] R13: ffffffffa038a415 R14: 0000000000000004 R15: 0000000000000006
[  104.148315] FS:  0000000000000000(0000) GS:ffff88023a600000(0000) knlGS:0000000000000000
[  104.148836] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  104.149211] CR2: ffffffffa0399c1a CR3: 00000002244fc000 CR4: 00000000000006f0
[  104.149672] Call Trace:
[  104.149840]  fixup_bug+0x43/0x60
[  104.150060]  do_trap+0x18a/0x1f0
[  104.150276]  do_error_trap+0xdf/0x1a0
[  104.150606]  ? index_rbio_pages+0x14f/0x160 [btrfs]
[  104.150929]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[  104.151241]  do_invalid_op+0x20/0x30
[  104.151478]  invalid_op+0x1e/0x30
[  104.151787] RIP: 0010:index_rbio_pages+0x14f/0x160 [btrfs]
[  104.152148] RSP: 0018:ffffc90000f83ce8 EFLAGS: 00010002
[  104.152488] RAX: ffffffffa0421938 RBX: 0000000000000000 RCX: 0000000000000000
[  104.152947] RDX: 0000000000000003 RSI: 0000000000000001 RDI: ffffffffa0440420
[  104.153410] RBP: ffffc90000f83d18 R08: 0000000000000000 R09: 0000000000000000
[  104.153869] R10: 0000000000000001 R11: 000000002b881219 R12: ffffffffa0421960
[  104.154330] R13: 0000000000000001 R14: ffff880236e00000 R15: ffff880227718080
[  104.154882]  ? index_rbio_pages+0xc6/0x160 [btrfs]
[  104.155286]  rmw_work+0x76/0x310 [btrfs]
[  104.155633]  btrfs_scrubparity_helper+0xad/0x8e0 [btrfs]
[  104.156070]  btrfs_rmw_helper+0xe/0x10 [btrfs]
[  104.156364]  process_one_work+0x34f/0x9c0
[  104.156631]  worker_thread+0x34a/0x6b0
[  104.156879]  kthread+0x180/0x190
[  104.157095]  ? create_worker+0x230/0x230
[  104.157352]  ? kthread_create_on_node+0x70/0x70
[  104.157648]  ? kthread_create_on_node+0x70/0x70
[  104.157944]  ret_from_fork+0x2a/0x40
[  104.158183] Code: 44 89 c7 31 c0 66 83 e7 04 0f 95 c0 48 83 c0 02 48 83 04 c5 18 e6 dc 82 01 66 85 ff b8 01 00 00 00 0f 85 72 ff ff ff 41 83 c8 04 <
66> 44 89 42 0a 48 89 c8 83 e0 01 48 83 c0 02 48 83 04 c5 f0 e5 
[  104.159436] RIP: report_bug+0xc4/0x180 RSP: ffffc90000f83ac8
[  104.159803] CR2: ffffffffa0399c1a
[  104.160026] ---[ end trace 78686c1f7150bacf ]---
Ming Lei July 14, 2017, 10:35 p.m. UTC | #2
On Sat, Jul 15, 2017 at 4:54 AM, Liu Bo <bo.li.liu@oracle.com> wrote:
> On Fri, Jul 14, 2017 at 08:22:31AM -0600, Jens Axboe wrote:
>> On 07/14/2017 07:47 AM, Ming Lei wrote:
>> >> @@ -156,6 +156,9 @@ static inline void *bio_data(struct bio *bio)
>> >>  /*
>> >>   * drivers should _never_ use the all version - the bio may have been split
>> >>   * before it got to the driver and the driver won't own all of it
>> >> + *
>> >> + * Note that cloned bios must not use this as their bi_vcnt may be invalid and
>> >> + * this could lead to silent corruptions.
>> >>   */
>> >>  #define bio_for_each_segment_all(bvl, bio, i)                          \
>> >>         for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
>> >> --
>> >> 2.13.0
>> >>
>> >
>> > Maybe we can add a warning here if it is a cloned bio.
>>
>> I think that's a good idea, it's easy for people to get this wrong, and
>> the consequences can be dire. How about something like this?
>>
>> diff --git a/include/linux/bio.h b/include/linux/bio.h
>> index 7b1cf4ba0902..13b6ac6eae29 100644
>> --- a/include/linux/bio.h
>> +++ b/include/linux/bio.h
>> @@ -155,9 +155,12 @@ static inline void *bio_data(struct bio *bio)
>>
>>  /*
>>   * drivers should _never_ use the all version - the bio may have been split
>> - * before it got to the driver and the driver won't own all of it
>> + * before it got to the driver and the driver won't own all of it.
>> + *
>> + * Don't use this on cloned bio's.
>>   */
>>  #define bio_for_each_segment_all(bvl, bio, i)                                \
>> +     WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));                     \
>>       for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
>>
>>  static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
>>
>
> This patch gave me a crash, I'm double checking it..

Hi Liu Bo,

Looks one extra warning shouldn't have trigger a crash, please double
check and update
with us.

I just start a VM and run a quick test on ext4, btrfs, looks
everything is fine, and not see
any warning.
Jens Axboe July 14, 2017, 10:37 p.m. UTC | #3
On 07/14/2017 04:35 PM, Ming Lei wrote:
> On Sat, Jul 15, 2017 at 4:54 AM, Liu Bo <bo.li.liu@oracle.com> wrote:
>> On Fri, Jul 14, 2017 at 08:22:31AM -0600, Jens Axboe wrote:
>>> On 07/14/2017 07:47 AM, Ming Lei wrote:
>>>>> @@ -156,6 +156,9 @@ static inline void *bio_data(struct bio *bio)
>>>>>  /*
>>>>>   * drivers should _never_ use the all version - the bio may have been split
>>>>>   * before it got to the driver and the driver won't own all of it
>>>>> + *
>>>>> + * Note that cloned bios must not use this as their bi_vcnt may be invalid and
>>>>> + * this could lead to silent corruptions.
>>>>>   */
>>>>>  #define bio_for_each_segment_all(bvl, bio, i)                          \
>>>>>         for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
>>>>> --
>>>>> 2.13.0
>>>>>
>>>>
>>>> Maybe we can add a warning here if it is a cloned bio.
>>>
>>> I think that's a good idea, it's easy for people to get this wrong, and
>>> the consequences can be dire. How about something like this?
>>>
>>> diff --git a/include/linux/bio.h b/include/linux/bio.h
>>> index 7b1cf4ba0902..13b6ac6eae29 100644
>>> --- a/include/linux/bio.h
>>> +++ b/include/linux/bio.h
>>> @@ -155,9 +155,12 @@ static inline void *bio_data(struct bio *bio)
>>>
>>>  /*
>>>   * drivers should _never_ use the all version - the bio may have been split
>>> - * before it got to the driver and the driver won't own all of it
>>> + * before it got to the driver and the driver won't own all of it.
>>> + *
>>> + * Don't use this on cloned bio's.
>>>   */
>>>  #define bio_for_each_segment_all(bvl, bio, i)                                \
>>> +     WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));                     \
>>>       for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
>>>
>>>  static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
>>>
>>
>> This patch gave me a crash, I'm double checking it..
> 
> Hi Liu Bo,
> 
> Looks one extra warning shouldn't have trigger a crash, please double
> check and update
> with us.
> 
> I just start a VM and run a quick test on ext4, btrfs, looks
> everything is fine, and not see
> any warning.

I booted it here too, and it works fine for me. So I agree, the crash seems
to be unrelated.
Liu Bo July 14, 2017, 11:20 p.m. UTC | #4
On Sat, Jul 15, 2017 at 06:35:05AM +0800, Ming Lei wrote:
> On Sat, Jul 15, 2017 at 4:54 AM, Liu Bo <bo.li.liu@oracle.com> wrote:
> > On Fri, Jul 14, 2017 at 08:22:31AM -0600, Jens Axboe wrote:
> >> On 07/14/2017 07:47 AM, Ming Lei wrote:
> >> >> @@ -156,6 +156,9 @@ static inline void *bio_data(struct bio *bio)
> >> >>  /*
> >> >>   * drivers should _never_ use the all version - the bio may have been split
> >> >>   * before it got to the driver and the driver won't own all of it
> >> >> + *
> >> >> + * Note that cloned bios must not use this as their bi_vcnt may be invalid and
> >> >> + * this could lead to silent corruptions.
> >> >>   */
> >> >>  #define bio_for_each_segment_all(bvl, bio, i)                          \
> >> >>         for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
> >> >> --
> >> >> 2.13.0
> >> >>
> >> >
> >> > Maybe we can add a warning here if it is a cloned bio.
> >>
> >> I think that's a good idea, it's easy for people to get this wrong, and
> >> the consequences can be dire. How about something like this?
> >>
> >> diff --git a/include/linux/bio.h b/include/linux/bio.h
> >> index 7b1cf4ba0902..13b6ac6eae29 100644
> >> --- a/include/linux/bio.h
> >> +++ b/include/linux/bio.h
> >> @@ -155,9 +155,12 @@ static inline void *bio_data(struct bio *bio)
> >>
> >>  /*
> >>   * drivers should _never_ use the all version - the bio may have been split
> >> - * before it got to the driver and the driver won't own all of it
> >> + * before it got to the driver and the driver won't own all of it.
> >> + *
> >> + * Don't use this on cloned bio's.
> >>   */
> >>  #define bio_for_each_segment_all(bvl, bio, i)                                \
> >> +     WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));                     \
> >>       for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
> >>
> >>  static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
> >>
> >
> > This patch gave me a crash, I'm double checking it..
> 
> Hi Liu Bo,
> 
> Looks one extra warning shouldn't have trigger a crash, please double
> check and update
> with us.
> 
> I just start a VM and run a quick test on ext4, btrfs, looks
> everything is fine, and not see
> any warning.
> 
> -- 
> Ming Lei

I removed that WARN_ON() in btrfs's index_rbio_pages() which is simply
WARN_ON(bio_flagged(bio, BIO_CLONED));

And I still got the same crash.

The test I ran is

$ mkfs.btrfs -f -draid5 /dev/sd[cde]
$ mount /dev/sde /mnt/btrfs
$ xfs_io -f -d -c "pwrite 0 128K" /mnt/btrfs/foobar
# then kernel went to panic.

It's 4.12.0 vanilla + Jen's patch, but given that it's purely a
WARN_ON_ONCE(), I haven't figured out where the crash came from.

thanks,
-liubo

[   70.885850] BTRFS: device fsid b994b77f-a9a0-4fa2-8a66-5a36caa7c174 devid 1 transid 5 /dev/sdc
[   70.896194] BTRFS: device fsid b994b77f-a9a0-4fa2-8a66-5a36caa7c174 devid 2 transid 5 /dev/sdd
[   70.903853] BTRFS: device fsid b994b77f-a9a0-4fa2-8a66-5a36caa7c174 devid 3 transid 5 /dev/sde
[   72.991044] BTRFS info (device sde): disk space caching is enabled
[   72.991494] BTRFS info (device sde): has skinny extents
[   72.991836] BTRFS info (device sde): flagging fs with big metadata feature
[   73.015831] BTRFS info (device sde): creating UUID tree
[   82.798313] BUG: unable to handle kernel paging request at ffffffffa03bcc0e
[   82.799070] IP: report_bug+0xc4/0x180
[   82.799312] PGD 2626067 
[   82.799313] P4D 2626067 
[   82.799483] PUD 2627063 
[   82.799655] PMD 2346a3067 
[   82.799868] PTE 800000022019b161
[   82.800091] 
[   82.800471] Oops: 0003 [#1] SMP
[   82.800734] Dumping ftrace buffer:
[   82.800997]    (ftrace buffer empty)
[   82.801305] Modules linked in: btrfs(OE) xor raid6_pq ppdev parport_pc parport serio_raw nfsd auth_rpcgss nfs_acl lockd grace sunrpc
[   82.802312] CPU: 3 PID: 154 Comm: kworker/u16:5 Tainted: G           OE   4.12.0+ #802
[   82.802947] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
[   82.803690] Workqueue: btrfs-rmw btrfs_rmw_helper [btrfs]
[   82.804176] task: ffff8802383bd200 task.stack: ffffc90000fa4000
[   82.804680] RIP: 0010:report_bug+0xc4/0x180
[   82.805082] RSP: 0018:ffffc90000fa7ac8 EFLAGS: 00010002
[   82.805454] RAX: 0000000000000001 RBX: ffffffffa0379ca5 RCX: 0000000000000001
[   82.805911] RDX: ffffffffa03bcc04 RSI: 000000000000047f RDI: 0000000000000000
[   82.806373] RBP: ffffc90000fa7ae8 R08: 0000000000000907 R09: 0000000000000000
[   82.806832] R10: 00000000e3e32d2f R11: 000000000526ce2c R12: ffffc90000fa7c38
[   82.807301] R13: ffffffffa03ad415 R14: 0000000000000004 R15: 0000000000000006
[   82.807758] FS:  0000000000000000(0000) GS:ffff88023ac00000(0000) knlGS:0000000000000000
[   82.808653] CR2: ffffffffa03bcc0e CR3: 00000002203f7000 CR4: 00000000000006e0
[   82.809129] Call Trace:
[   82.809297]  fixup_bug+0x43/0x60
[   82.809512]  do_trap+0x18a/0x1f0
[   82.809727]  do_error_trap+0xdf/0x1a0
[   82.810051]  ? index_rbio_pages+0xf5/0x100 [btrfs]
[   82.810366]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[   82.810672]  do_invalid_op+0x20/0x30
[   82.810907]  invalid_op+0x1e/0x30
[   82.811205] RIP: 0010:index_rbio_pages+0xf5/0x100 [btrfs]
[   82.811554] RSP: 0018:ffffc90000fa7ce8 EFLAGS: 00010002
[   82.811891] RAX: 0000000000000002 RBX: ffffffffa0444938 RCX: 0000000000000000
[   82.812349] RDX: 0000000000000003 RSI: 0000000000000001 RDI: ffffffffa04633f8
[   82.812805] RBP: ffffc90000fa7d18 R08: 0000000000000001 R09: 0000000000000000
[   82.813280] R10: 0000000000000001 R11: 000000000526ce2c R12: 0000000000000001
[   82.813738] R13: ffff8802386fc800 R14: ffff88022fe9a200 R15: 0000000000000000
[   82.814280]  ? index_rbio_pages+0x7a/0x100 [btrfs]
[   82.814670]  rmw_work+0x76/0x310 [btrfs]
[   82.815007]  btrfs_scrubparity_helper+0xad/0x8e0 [btrfs]
[   82.815430]  btrfs_rmw_helper+0xe/0x10 [btrfs]
[   82.815723]  process_one_work+0x34f/0x9c0
[   82.815992]  worker_thread+0x34a/0x6b0
[   82.816241]  kthread+0x180/0x190
[   82.816455]  ? create_worker+0x230/0x230
[   82.816712]  ? kthread_create_on_node+0x70/0x70
[   82.817019]  ret_from_fork+0x2a/0x40
[   82.817267] Code: 44 89 c7 31 c0 66 83 e7 04 0f 95 c0 48 83 c0 02 48 83 04 c5 18 e6 dc 82 01 66 85 ff b8 01 00 00 00 0f 85 72 ff ff ff 41 83 c8 04 <
66> 44 89 42 0a 48 89 c8 83 e0 01 48 83 c0 02 48 83 04 c5 f0 e5 
[   82.818516] RIP: report_bug+0xc4/0x180 RSP: ffffc90000fa7ac8
[   82.818883] CR2: ffffffffa03bcc0e
[   82.819110] ---[ end trace 5a34df2460aff289 ]---
David Sterba July 15, 2017, 12:28 a.m. UTC | #5
On Fri, Jul 14, 2017 at 08:22:31AM -0600, Jens Axboe wrote:
>  /*
>   * drivers should _never_ use the all version - the bio may have been split
> - * before it got to the driver and the driver won't own all of it
> + * before it got to the driver and the driver won't own all of it.
> + *
> + * Don't use this on cloned bio's.
>   */
>  #define bio_for_each_segment_all(bvl, bio, i)				\
> +	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));			\
>  	for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)

This needs a multiline statement protection, otherwise

if (...)
	bio_for_each_segment_all(...) {
		...
	}

would expand to

if (...)
	WARN_ON_ONCE(...);

bio_for_each_segment_all(...) { ... }

However "do { ... } while(0)" cannot be used here, so WARN_ON_ONCE could
be moved to the initialization block of for:

#define bio_for_each_segment_all(bvl, bio, i)				\
	for (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)),		\
	     i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)

I haven't found any code using the exact broken pattern. So this does
not explain the crash Liu Bo reports.
Bart Van Assche July 18, 2017, 10:19 p.m. UTC | #6
On 07/14/17 13:54, Liu Bo wrote:
> On Fri, Jul 14, 2017 at 08:22:31AM -0600, Jens Axboe wrote:
>> On 07/14/2017 07:47 AM, Ming Lei wrote:
>>>> @@ -156,6 +156,9 @@ static inline void *bio_data(struct bio *bio)
>>>>  /*
>>>>   * drivers should _never_ use the all version - the bio may have been split
>>>>   * before it got to the driver and the driver won't own all of it
>>>> + *
>>>> + * Note that cloned bios must not use this as their bi_vcnt may be invalid and
>>>> + * this could lead to silent corruptions.
>>>>   */
>>>>  #define bio_for_each_segment_all(bvl, bio, i)                          \
>>>>         for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
>>>> --
>>>> 2.13.0
>>>>
>>>
>>> Maybe we can add a warning here if it is a cloned bio.
>>
>> I think that's a good idea, it's easy for people to get this wrong, and
>> the consequences can be dire. How about something like this?
>>
>> diff --git a/include/linux/bio.h b/include/linux/bio.h
>> index 7b1cf4ba0902..13b6ac6eae29 100644
>> --- a/include/linux/bio.h
>> +++ b/include/linux/bio.h
>> @@ -155,9 +155,12 @@ static inline void *bio_data(struct bio *bio)
>>  
>>  /*
>>   * drivers should _never_ use the all version - the bio may have been split
>> - * before it got to the driver and the driver won't own all of it
>> + * before it got to the driver and the driver won't own all of it.
>> + *
>> + * Don't use this on cloned bio's.
>>   */
>>  #define bio_for_each_segment_all(bvl, bio, i)				\
>> +	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));			\
>>  	for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
>>  
>>  static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
>>
> 
> This patch gave me a crash, I'm double checking it..
> 
> thanks,
> -liubo
> 
> [  104.140220] BUG: unable to handle kernel paging request at ffffffffa0399c1a
> [  104.140675] IP: report_bug+0xc4/0x180
> [  104.140916] PGD 2626067 
> [  104.140917] P4D 2626067 
> [  104.141089] PUD 2627063 
> [  104.141259] PMD 2346aa067 
> [  104.141429] PTE 800000023569c161
> [  104.141610] 
> [  104.141926] Oops: 0003 [#1] SMP
> [  104.142137] Dumping ftrace buffer:
> [  104.142366]    (ftrace buffer empty)
> [  104.142602] Modules linked in: btrfs(OE) xor raid6_pq ppdev parport_pc parport serio_raw nfsd auth_rpcgss nfs_acl lockd grace sunrpc [last unloaded$
>  xor]
> [  104.143493] CPU: 0 PID: 144 Comm: kworker/u16:4 Tainted: G        W  OE   4.12.0+ #801
> [  104.144009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
> [  104.144654] Workqueue: btrfs-rmw btrfs_rmw_helper [btrfs]
> [  104.145009] task: ffff8802382f8000 task.stack: ffffc90000f80000
> [  104.145393] RIP: 0010:report_bug+0xc4/0x180
> [  104.145668] RSP: 0018:ffffc90000f83ac8 EFLAGS: 00010002
> [  104.146015] RAX: 0000000000000001 RBX: ffffffffa0356cff RCX: 0000000000000001
> [  104.146474] RDX: ffffffffa0399c10 RSI: 0000000000000480 RDI: 0000000000000000
> [  104.146931] RBP: ffffc90000f83ae8 R08: 0000000000000907 R09: 0000000000000000
> [  104.147393] R10: 000000005170b2af R11: 000000002b881219 R12: ffffc90000f83c38
> [  104.147852] R13: ffffffffa038a415 R14: 0000000000000004 R15: 0000000000000006
> [  104.148315] FS:  0000000000000000(0000) GS:ffff88023a600000(0000) knlGS:0000000000000000
> [  104.148836] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  104.149211] CR2: ffffffffa0399c1a CR3: 00000002244fc000 CR4: 00000000000006f0
> [  104.149672] Call Trace:
> [  104.149840]  fixup_bug+0x43/0x60
> [  104.150060]  do_trap+0x18a/0x1f0
> [  104.150276]  do_error_trap+0xdf/0x1a0
> [  104.150606]  ? index_rbio_pages+0x14f/0x160 [btrfs]
> [  104.150929]  ? trace_hardirqs_off_thunk+0x1a/0x1c
> [  104.151241]  do_invalid_op+0x20/0x30
> [  104.151478]  invalid_op+0x1e/0x30
> [  104.151787] RIP: 0010:index_rbio_pages+0x14f/0x160 [btrfs]
> [  104.152148] RSP: 0018:ffffc90000f83ce8 EFLAGS: 00010002
> [  104.152488] RAX: ffffffffa0421938 RBX: 0000000000000000 RCX: 0000000000000000
> [  104.152947] RDX: 0000000000000003 RSI: 0000000000000001 RDI: ffffffffa0440420
> [  104.153410] RBP: ffffc90000f83d18 R08: 0000000000000000 R09: 0000000000000000
> [  104.153869] R10: 0000000000000001 R11: 000000002b881219 R12: ffffffffa0421960
> [  104.154330] R13: 0000000000000001 R14: ffff880236e00000 R15: ffff880227718080
> [  104.154882]  ? index_rbio_pages+0xc6/0x160 [btrfs]
> [  104.155286]  rmw_work+0x76/0x310 [btrfs]
> [  104.155633]  btrfs_scrubparity_helper+0xad/0x8e0 [btrfs]
> [  104.156070]  btrfs_rmw_helper+0xe/0x10 [btrfs]
> [  104.156364]  process_one_work+0x34f/0x9c0
> [  104.156631]  worker_thread+0x34a/0x6b0
> [  104.156879]  kthread+0x180/0x190
> [  104.157095]  ? create_worker+0x230/0x230
> [  104.157352]  ? kthread_create_on_node+0x70/0x70
> [  104.157648]  ? kthread_create_on_node+0x70/0x70
> [  104.157944]  ret_from_fork+0x2a/0x40
> [  104.159436] RIP: report_bug+0xc4/0x180 RSP: ffffc90000f83ac8
> [  104.159803] CR2: ffffffffa0399c1a
> [  104.160026] ---[ end trace 78686c1f7150bacf ]---

(+Peter)
 
Hello Peter,

In a test I ran myself with kernel v4.12-rc1 I also noticed that a
WARN_ON_ONCE() statement triggered an oops in report_bug() and killed the
kernel thread of the caller instead of letting the caller continue. What I
ran into is probably the same oops as in the above call trace. For the test
I ran myself the disassembly is as follows:

(gdb) list *(report_bug+0x94)
0xffffffff812ba024 is in report_bug (lib/bug.c:177).
172                                     return BUG_TRAP_TYPE_WARN;
173
174                             /*
175                              * Since this is the only store, concurrency is not an issue.
176                              */
177                             bug->flags |= BUGFLAG_DONE;
178                     }
179             }
180
181             if (warning) {

Could this be related to patch "debug: Add _ONCE() logic to report_bug()"?
Is this a known issue? If so, is a fix perhaps already available?

Thanks,

Bart.
Peter Zijlstra July 24, 2017, 4:19 p.m. UTC | #7
On Tue, Jul 18, 2017 at 03:19:00PM -0700, Bart Van Assche wrote:
> Hello Peter,

Sorry for being late, I'm trying to recover from a few weeks of leave
and the inbox is in shambles.

> In a test I ran myself with kernel v4.12-rc1 I also noticed that a
> WARN_ON_ONCE() statement triggered an oops in report_bug() and killed the
> kernel thread of the caller instead of letting the caller continue. What I
> ran into is probably the same oops as in the above call trace. For the test
> I ran myself the disassembly is as follows:
> 
> (gdb) list *(report_bug+0x94)
> 0xffffffff812ba024 is in report_bug (lib/bug.c:177).
> 172                                     return BUG_TRAP_TYPE_WARN;
> 173
> 174                             /*
> 175                              * Since this is the only store, concurrency is not an issue.
> 176                              */
> 177                             bug->flags |= BUGFLAG_DONE;
> 178                     }
> 179             }
> 180
> 181             if (warning) {
> 
> Could this be related to patch "debug: Add _ONCE() logic to report_bug()"?
> Is this a known issue? If so, is a fix perhaps already available?

Yep..  please see:

  https://marc.info/?l=linux-kernel&m=150055119925677
diff mbox

Patch

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 7b1cf4ba0902..13b6ac6eae29 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -155,9 +155,12 @@  static inline void *bio_data(struct bio *bio)
 
 /*
  * drivers should _never_ use the all version - the bio may have been split
- * before it got to the driver and the driver won't own all of it
+ * before it got to the driver and the driver won't own all of it.
+ *
+ * Don't use this on cloned bio's.
  */
 #define bio_for_each_segment_all(bvl, bio, i)				\
+	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));			\
 	for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
 
 static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,