diff mbox

dm mirror: fix crash caused by NULL-pointer dereference

Message ID 20170626090848.21585-1-zren@suse.com (mailing list archive)
State Rejected, archived
Delegated to: Mike Snitzer
Headers show

Commit Message

Zhen Ren June 26, 2017, 9:08 a.m. UTC
When the primary mirror device fails, activating a mirrored
LV will crash the kernel. It can be reproduced 100% with the
scripts below:

"""
dd if=/dev/zero of=file.raw bs=1G count=1
loopdev=$(losetup -f)
losetup $loopdev file.raw
dmsetup create pv1 --table "0 102399 linear $loopdev 0"
dmsetup create pv2 --table "0 102399 linear $loopdev 102400"
vgcreate vgtest /dev/mapper/pv1 /dev/mapper/pv2
lvcreate -l1 --type mirror -m1 -n mirror12 --mirrorlog core \
                 vgtest /dev/mapper/pv1 /dev/mapper/pv2
vgchange -an vgtest
echo 0 10000000 error | dmsetup load /dev/mapper/pv1
dmsetup resume /dev/mapper/pv1
vgchange -ay vgtest
"""

The call trace:
"""
[  287.008629] device-mapper: raid1: Unable to read primary mirror during recovery
[  287.008632] device-mapper: raid1: Primary mirror (254:10) failed while out-of-sync: Reads may fail.
...
[  287.012480] BUG: unable to handle kernel NULL pointer dereference at 0000000000000019
[  287.012515] IP: [<ffffffffa00d944f>] mirror_end_io+0x7f/0x130 [dm_mirror]
...
[  291.994645] Call Trace:
[  291.994671]  [<ffffffffa007b215>] clone_endio+0x35/0xe0 [dm_mod]
[  291.994675]  [<ffffffffa0589ced>] do_reads+0x17d/0x1d0 [dm_mirror]
[  291.994680]  [<ffffffffa058af5c>] do_mirror+0xec/0x250 [dm_mirror]
[  291.994687]  [<ffffffff810958fe>] process_one_work+0x14e/0x410
[  291.994691]  [<ffffffff81096156>] worker_thread+0x116/0x490
[  291.994694]  [<ffffffff8109b627>] kthread+0xc7/0xe0
"""

Fix it by setting "details.bi_bdev" to NULL in the error path before
calling into mirror_end_io(), which will then fail the IO properly.

Reported-by: Jerry Tang <jtang@suse.com>
Signed-off-by: Eric Ren <zren@suse.com>
---
 drivers/md/dm-raid1.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Johannes Thumshirn June 26, 2017, 9:14 a.m. UTC | #1
On Mon, Jun 26, 2017 at 05:08:48PM +0800, Eric Ren wrote:
> When the primary mirror device fails, activating a mirrored
> LV will crash the kernel. It can be reproduced 100% with the
> scripts below:
> 
> """
> dd if=/dev/zero of=file.raw bs=1G count=1
> loopdev=$(losetup -f)
> losetup $loopdev file.raw
> dmsetup create pv1 --table "0 102399 linear $loopdev 0"
> dmsetup create pv2 --table "0 102399 linear $loopdev 102400"
> vgcreate vgtest /dev/mapper/pv1 /dev/mapper/pv2
> lvcreate -l1 --type mirror -m1 -n mirror12 --mirrorlog core \
>                  vgtest /dev/mapper/pv1 /dev/mapper/pv2
> vgchange -an vgtest
> echo 0 10000000 error | dmsetup load /dev/mapper/pv1
> dmsetup resume /dev/mapper/pv1
> vgchange -ay vgtest
> """

So as you have a reproducer, can you eventually write a blktests [1]
regression test for it?


Thanks,
	Johannes

[1] https://github.com/osandov/blktests
Zdenek Kabelac June 26, 2017, 9:46 a.m. UTC | #2
On 26.6.2017 at 11:08, Eric Ren wrote:
> When the primary mirror device fails, activating a mirrored
> LV will crash the kernel. It can be reproduced 100% with the
> scripts below:
> 
> """
> dd if=/dev/zero of=file.raw bs=1G count=1
> loopdev=$(losetup -f)
> losetup $loopdev file.raw
> dmsetup create pv1 --table "0 102399 linear $loopdev 0"
> dmsetup create pv2 --table "0 102399 linear $loopdev 102400"
> vgcreate vgtest /dev/mapper/pv1 /dev/mapper/pv2
> lvcreate -l1 --type mirror -m1 -n mirror12 --mirrorlog core \
>                   vgtest /dev/mapper/pv1 /dev/mapper/pv2
> vgchange -an vgtest
> echo 0 10000000 error | dmsetup load /dev/mapper/pv1
> dmsetup resume /dev/mapper/pv1
> vgchange -ay vgtest
> " > The call trace:
> """
> [  287.008629] device-mapper: raid1: Unable to read primary mirror during recovery
> [  287.008632] device-mapper: raid1: Primary mirror (254:10) failed while out-of-sync: Reads may fail.
> ...
> [  287.012480] BUG: unable to handle kernel NULL pointer dereference at 0000000000000019
> [  287.012515] IP: [<ffffffffa00d944f>] mirror_end_io+0x7f/0x130 [dm_mirror]
> ...
> [  291.994645] Call Trace:
> [  291.994671]  [<ffffffffa007b215>] clone_endio+0x35/0xe0 [dm_mod]
> [  291.994675]  [<ffffffffa0589ced>] do_reads+0x17d/0x1d0 [dm_mirror]
> [  291.994680]  [<ffffffffa058af5c>] do_mirror+0xec/0x250 [dm_mirror]
> [  291.994687]  [<ffffffff810958fe>] process_one_work+0x14e/0x410
> [  291.994691]  [<ffffffff81096156>] worker_thread+0x116/0x490
> [  291.994694]  [<ffffffff8109b627>] kthread+0xc7/0xe0
> """
> 
> Fix it by setting "details.bi_bdev" to NULL in the error path before
> calling into mirror_end_io(), which will then fail the IO properly.


Hi

Which kernel version is this ?

I'd thought we've already fixed this BZ for old mirrors:
https://bugzilla.redhat.com/show_bug.cgi?id=1382382

There's a similar BZ for md-raid based mirrors (--type raid1):
https://bugzilla.redhat.com/show_bug.cgi?id=1416099


Regards

Zdenej

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Zhen Ren June 26, 2017, 10:42 a.m. UTC | #3
Hi,

On 06/26/2017 05:14 PM, Johannes Thumshirn wrote:
> On Mon, Jun 26, 2017 at 05:08:48PM +0800, Eric Ren wrote:
>> When the primary mirror device fails, activating a mirrored
>> LV will crash the kernel. It can be reproduced 100% with the
>> scripts below:
>>
>> """
>> dd if=/dev/zero of=file.raw bs=1G count=1
>> loopdev=$(losetup -f)
>> losetup $loopdev file.raw
>> dmsetup create pv1 --table "0 102399 linear $loopdev 0"
>> dmsetup create pv2 --table "0 102399 linear $loopdev 102400"
>> vgcreate vgtest /dev/mapper/pv1 /dev/mapper/pv2
>> lvcreate -l1 --type mirror -m1 -n mirror12 --mirrorlog core \
>>                   vgtest /dev/mapper/pv1 /dev/mapper/pv2
>> vgchange -an vgtest
>> echo 0 10000000 error | dmsetup load /dev/mapper/pv1
>> dmsetup resume /dev/mapper/pv1
>> vgchange -ay vgtest
>> """
> So as you have a reproducer, can you eventually write an blktest [1]
> regression test for it?
No problem, will do later on.

Thanks,
Eric

>
>
> Thanks,
> 	Johannes
>
> [1] https://github.com/osandov/blktests
>

Zhen Ren June 26, 2017, 10:55 a.m. UTC | #4
Hi Zdenek,


On 06/26/2017 05:46 PM, Zdenek Kabelac wrote:
> Dne 26.6.2017 v 11:08 Eric Ren napsal(a):
>>
> [... snip...]
>
> Hi
>
> Which kernel version is this ?
>
> I'd thought we've already fixed this BZ for old mirrors:
> https://bugzilla.redhat.com/show_bug.cgi?id=1382382
>
> There similar BZ for md-raid based mirrors (--type raid1)
> https://bugzilla.redhat.com/show_bug.cgi?id=1416099
My base kernel version is 4.4.68, but with these 2 latest fixes applied:

"""
Revert "dm mirror: use all available legs on multiple failures"
dm io: fix duplicate bio completion due to missing ref count
"""

Yes, I've been working on this dm-mirror crash issue for a couple of days,
but only saw the fixes above when I rebased to send my fixes out, hah.

But, with these 2 fixes, the reproducer can still crash the kernel.
Anyway, I can test with the upstream kernel again ;-P

Regards,
Eric

>
>
> Regards
>
> Zdenej
>

Zdenek Kabelac June 26, 2017, 11:49 a.m. UTC | #5
On 26.6.2017 at 12:55, Eric Ren wrote:
> Hi Zdenek,
> 
> 
> On 06/26/2017 05:46 PM, Zdenek Kabelac wrote:
>> Dne 26.6.2017 v 11:08 Eric Ren napsal(a):
>>>
>> [... snip...]
>>
>> Hi
>>
>> Which kernel version is this ?
>>
>> I'd thought we've already fixed this BZ for old mirrors:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1382382
>>
>> There similar BZ for md-raid based mirrors (--type raid1)
>> https://bugzilla.redhat.com/show_bug.cgi?id=1416099
> My base kernel version is 4.4.68, but with this 2 latest fixes applied:
> 
> """
> Revert "dm mirror: use all available legs on multiple failures"

Ohh - I have -rc6, while this 'revert' patch went into 4.12-rc7.

I'm now starting to wonder why?

It's been a real fix for a real issue - and 'revert' message states
there is no such problem ??

I'm confused....

Mike  - have you tried the sequence from BZ  ?

Zdenek

Mike Snitzer June 26, 2017, 1:43 p.m. UTC | #6
On Mon, Jun 26 2017 at  7:49am -0400,
Zdenek Kabelac <zkabelac@redhat.com> wrote:

> Dne 26.6.2017 v 12:55 Eric Ren napsal(a):
> >Hi Zdenek,
> >
> >
> >On 06/26/2017 05:46 PM, Zdenek Kabelac wrote:
> >>Dne 26.6.2017 v 11:08 Eric Ren napsal(a):
> >>>
> >>[... snip...]
> >>
> >>Hi
> >>
> >>Which kernel version is this ?
> >>
> >>I'd thought we've already fixed this BZ for old mirrors:
> >>https://bugzilla.redhat.com/show_bug.cgi?id=1382382
> >>
> >>There similar BZ for md-raid based mirrors (--type raid1)
> >>https://bugzilla.redhat.com/show_bug.cgi?id=1416099
> >My base kernel version is 4.4.68, but with this 2 latest fixes applied:
> >
> >"""
> >Revert "dm mirror: use all available legs on multiple failures"
> 
> Ohh  - I've -rc6 - while this  'revert' patch went to 4.12-rc7.
> 
> I'm now starting to wonder why?
> 
> It's been a real fix for a real issue - and 'revert' message states
> there is no such problem ??
> 
> I'm confused....
> 
> Mike  - have you tried the sequence from BZ  ?

Jon (cc'd) tested it and found that the original issue wasn't
reproducible, but that the commit in question was the source of crashes
(likely due to needing the fix from this thread).

We can revisit for 4.13 -- but I'll defer to Heinz.

Eric, please share your results from testing the latest 4.12-rc7.

Mike

Zhen Ren June 26, 2017, 1:47 p.m. UTC | #7
Hi,

On 06/26/2017 06:55 PM, Eric Ren wrote:
> Hi Zdenek,
>
>
> On 06/26/2017 05:46 PM, Zdenek Kabelac wrote:
>> Dne 26.6.2017 v 11:08 Eric Ren napsal(a):
>>>
>> [... snip...]
>>
>> Hi
>>
>> Which kernel version is this ?
>>
>> I'd thought we've already fixed this BZ for old mirrors:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1382382
>>
>> There similar BZ for md-raid based mirrors (--type raid1)
>> https://bugzilla.redhat.com/show_bug.cgi?id=1416099
> My base kernel version is 4.4.68, but with this 2 latest fixes applied:
>
> """
> Revert "dm mirror: use all available legs on multiple failures"
> dm io: fix duplicate bio completion due to missing ref count

I'm confused about this "dm io..." fix. The fix itself is good.

Without it, running mkfs.ext4 on a mirrored device whose primary mirror
has failed will crash the kernel via the discard operation from mkfs.ext4.
However, mkfs.ext4 can succeed on a healthy mirrored device. This is the
part I don't understand: whether the mirrored device is good or not, there
was always a duplicate bio completion before this fix, so write_callback()
would be called twice, and the crash would occur on the second
write_callback():
"""
static void write_callback(unsigned long error, void *context)
{
         unsigned i;
         struct bio *bio = (struct bio *) context;
         struct mirror_set *ms;
         int should_wake = 0;
         unsigned long flags;

         ms = bio_get_m(bio)->ms;        ====> NULL pointer at the duplicate completion
         bio_set_m(bio, NULL);
"""

Without this fix, I expected the DISCARD IO would always crash the kernel,
but that's not true when the mirrored device is good. I hope someone who
happens to know the reason can give some hints ;-P

Regards,
Eric

> """
>
> Yes, I've been working on dm-mirror crash issue for couple days,
> but only see this fixes above when I rebased to send my fixes out, hah.
>
> But, with this 2 fixes, the producer can still crash the kernel. 
> Anyway, I can
> test with upstream kernel again ;-P
>
> Regards,
> Eric
>
>>
>>
>> Regards
>>
>> Zdenej
>>
>
Zhen Ren June 26, 2017, 1:56 p.m. UTC | #8
Hi Mike,

On 06/26/2017 09:43 PM, Mike Snitzer wrote:
> On Mon, Jun 26 2017 at  7:49am -0400,
> Zdenek Kabelac <zkabelac@redhat.com> wrote:
>
>> [...snip...]
>> Ohh  - I've -rc6 - while this  'revert' patch went to 4.12-rc7.
>>
>> I'm now starting to wonder why?
>>
>> It's been a real fix for a real issue - and 'revert' message states
>> there is no such problem ??
>>
>> I'm confused....
>>
>> Mike  - have you tried the sequence from BZ  ?
> Jon (cc'd) tested it and found that the original issue wasn't
> reproducible.  But that the commit in question was the source of crashes
> (likely due to needing the fix from this thread).
>
> We can revisit for 4.13 -- but I'll defer to Heinz.
>
> Eric, please share your results from testing the latest 4.12-rc7.
Sure, but I will do it tomorrow, different timezone ;-)

Maybe Jon can also try my reproducer on the latest kernel, so we
can double-confirm with each other.

Thanks,
Eric

>
> Mike
>

Mike Snitzer June 26, 2017, 2:37 p.m. UTC | #9
On Mon, Jun 26 2017 at  9:47am -0400,
Eric Ren <zren@suse.com> wrote:

> Hi,
> 
> On 06/26/2017 06:55 PM, Eric Ren wrote:
> >Hi Zdenek,
> >
> >
> >On 06/26/2017 05:46 PM, Zdenek Kabelac wrote:
> >>Dne 26.6.2017 v 11:08 Eric Ren napsal(a):
> >>>
> >>[... snip...]
> >>
> >>Hi
> >>
> >>Which kernel version is this ?
> >>
> >>I'd thought we've already fixed this BZ for old mirrors:
> >>https://bugzilla.redhat.com/show_bug.cgi?id=1382382
> >>
> >>There similar BZ for md-raid based mirrors (--type raid1)
> >>https://bugzilla.redhat.com/show_bug.cgi?id=1416099
> >My base kernel version is 4.4.68, but with this 2 latest fixes applied:
> >
> >"""
> >Revert "dm mirror: use all available legs on multiple failures"
> >dm io: fix duplicate bio completion due to missing ref count
> 
> I have a confusion about this "dm io..." fix. The fix itself is good.
> 
> Without it, a mkfs.ext4 on a mirrored dev whose primary mirror dev
> has failed, will crash the kernel with the discard operation from mkfs.ext4.
> However, mkfs.ext4 can succeed on a healthy mirrored device. This
> is the thing I don't understand, because no matter the mirrored device is
> good or not, there's always a duplicate bio completion before having this
> this fix, thus write_callback() will be called twice, crashing will
> occur on the
> second write_callback():

No, there is only a duplicate bio completion if the error path is taken
(e.g. underlying device doesn't support discard).

> """
> static void write_callback(unsigned long error, void *context)
> {
>         unsigned i;
>         struct bio *bio = (struct bio *) context;
>         struct mirror_set *ms;
>         int should_wake = 0;
>         unsigned long flags;
> 
>         ms = bio_get_m(bio)->ms;        ====> NULL pointer at the
> duplicate completion
>         bio_set_m(bio, NULL);
> """
> 
> If no this fix, I expected the DISCARD IO would always crash the
> kernel, but it's not true when
> the mirrored device is good. Hope someone happen to know the reason
> can give some hints ;-P

If the mirror is healthy then only one completion is returned to
dm-mirror (via write_callback).  The problem was the error path wasn't
managing the reference count as needed, whereas dm-io's normal discard
IO path does.

Mike

Zhen Ren June 26, 2017, 3:27 p.m. UTC | #10
Hi Mike,


On 06/26/2017 10:37 PM, Mike Snitzer wrote:
> On Mon, Jun 26 2017 at  9:47am -0400,
> Eric Ren <zren@suse.com> wrote:
> [...snip...]
>>> """
>>> Revert "dm mirror: use all available legs on multiple failures"
>>> dm io: fix duplicate bio completion due to missing ref count
>> I have a confusion about this "dm io..." fix. The fix itself is good.
>>
>> Without it, a mkfs.ext4 on a mirrored dev whose primary mirror dev
>> has failed, will crash the kernel with the discard operation from mkfs.ext4.
>> However, mkfs.ext4 can succeed on a healthy mirrored device. This
>> is the thing I don't understand, because no matter the mirrored device is
>> good or not, there's always a duplicate bio completion before having this
>> this fix, thus write_callback() will be called twice, crashing will
>> occur on the
>> second write_callback():
> No, there is only a duplicate bio completion if the error path is taken
> (e.g. underlying device doesn't support discard).
Hmm, when "op == REQ_OP_DISCARD", please see comments in do_region():

"""
static void do_region(int op, int op_flags, unsigned region,
                      struct dm_io_region *where, struct dpages *dp,
                      struct io *io)
{
...
        if (op == REQ_OP_DISCARD)
                special_cmd_max_sectors = q->limits.max_discard_sectors;
...
        if ((op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES ||
             op == REQ_OP_WRITE_SAME) && special_cmd_max_sectors == 0) {
                atomic_inc(&io->count);              ===> [1]
                dec_count(io, region, -EOPNOTSUPP);  ===> [2]
                return;
        }
"""

[1] ref count fixed by the patch "dm io: ...";
[2] we won't get here if "special_cmd_max_sectors != 0", which is true
when both sides of the mirror are good.

So only when a mirror device fails is "max_discard_sectors" on its queue 0,
and thus this error path will be taken, right?

>
>> """
>> static void write_callback(unsigned long error, void *context)
>> {
>>          unsigned i;
>>          struct bio *bio = (struct bio *) context;
>>          struct mirror_set *ms;
>>          int should_wake = 0;
>>          unsigned long flags;
>>
>>          ms = bio_get_m(bio)->ms;        ====> NULL pointer at the
>> duplicate completion
>>          bio_set_m(bio, NULL);
>> """
>>
>> If no this fix, I expected the DISCARD IO would always crash the
>> kernel, but it's not true when
>> the mirrored device is good. Hope someone happen to know the reason
>> can give some hints ;-P
> If the mirror is healthy then only one completion is returned to
> dm-mirror (via write_callback).  The problem was the error patch wasn't
> managing the reference count as needed.  Whereas dm-io's normal discard
> IO path does.

Yes, I can understand this.

Thanks a lot,
Eric

>
> Mike
>
Zhen Ren June 27, 2017, 1:46 a.m. UTC | #11
Hi,

On 06/26/2017 11:27 PM, Eric Ren wrote:
> Hi Mike,
>
>
> On 06/26/2017 10:37 PM, Mike Snitzer wrote:
>> On Mon, Jun 26 2017 at  9:47am -0400,
>> Eric Ren <zren@suse.com> wrote:
>> [...snip...]
>>>> """
>>>> Revert "dm mirror: use all available legs on multiple failures"
>>>> dm io: fix duplicate bio completion due to missing ref count
>>> I have a confusion about this "dm io..." fix. The fix itself is good.
>>>
>>> Without it, a mkfs.ext4 on a mirrored dev whose primary mirror dev
>>> has failed, will crash the kernel with the discard operation from 
>>> mkfs.ext4.
>>> However, mkfs.ext4 can succeed on a healthy mirrored device. This
>>> is the thing I don't understand, because no matter the mirrored 
>>> device is
>>> good or not, there's always a duplicate bio completion before having 
>>> this
>>> this fix, thus write_callback() will be called twice, crashing will
>>> occur on the
>>> second write_callback():
>> No, there is only a duplicate bio completion if the error path is taken
>> (e.g. underlying device doesn't support discard).
> Hmm, when "op == REQ_OP_DISCARD", please see comments in do_region():
>
> """
> static void do_region(int op, int op_flags, unsigned region,
>                       struct dm_io_region *where, struct dpages *dp,
>                       struct io *io)
> {
> ...
>  if (op == REQ_OP_DISCARD)
>                 special_cmd_max_sectors = q->limits.max_discard_sectors;
> ...
>         if ((op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES ||
>              op == REQ_OP_WRITE_SAME) && special_cmd_max_sectors == 0) {
>                 atomic_inc(&io->count);        ===> [1]
>                 dec_count(io, region, -EOPNOTSUPP);     ===> [2]
>                 return;
>         }
> """
>
> [1] ref count fixed by patch "dm io: ...";
> [2] we won't come here if "special_cmd_max_sectors != 0", which is 
> true when both sides
> of the mirror is good.
>
> So only when a mirror device fails, "max_discard_sectors" on its queue 
> is 0, thus this error path
> will be taken, right?

Yes, I checked this as below:

- print info
"""
@@ -310,8 +310,10 @@ static void do_region(int op, int op_flags, unsigned region,
        /*
         * Reject unsupported discard and write same requests.
         */
-       if (op == REQ_OP_DISCARD)
+       if (op == REQ_OP_DISCARD) {
                special_cmd_max_sectors = q->limits.max_discard_sectors;
+               DMERR("do_region: op=DISCARD, q->limits.max_discard_sectors=%d\n", special_cmd_max_sectors);
+       }
"""
- log message when running mkfs on a mirror device whose primary leg has failed:
[  169.672880] device-mapper: io: do_region: op=DISCARD, q->limits.max_discard_sectors=0         ===> for the failed leg
[  169.672885] device-mapper: io: do_region: op=DISCARD, q->limits.max_discard_sectors=8388607   ===> for the good leg

Thanks,
Eric
>
>>
>>> """
>>> static void write_callback(unsigned long error, void *context)
>>> {
>>>          unsigned i;
>>>          struct bio *bio = (struct bio *) context;
>>>          struct mirror_set *ms;
>>>          int should_wake = 0;
>>>          unsigned long flags;
>>>
>>>          ms = bio_get_m(bio)->ms;        ====> NULL pointer at the
>>> duplicate completion
>>>          bio_set_m(bio, NULL);
>>> """
>>>
>>> If no this fix, I expected the DISCARD IO would always crash the
>>> kernel, but it's not true when
>>> the mirrored device is good. Hope someone happen to know the reason
>>> can give some hints ;-P
>> If the mirror is healthy then only one completion is returned to
>> dm-mirror (via write_callback).  The problem was the error patch wasn't
>> managing the reference count as needed.  Whereas dm-io's normal discard
>> IO path does.
>
> Yes, I can understand this.
>
> Thanks a lot,
> Eric
>
>>
>> Mike
>>
Zhen Ren June 27, 2017, 5:47 a.m. UTC | #12
Hi Mike,


On 06/26/2017 09:43 PM, Mike Snitzer wrote:
> On Mon, Jun 26 2017 at  7:49am -0400,
> Zdenek Kabelac <zkabelac@redhat.com> wrote:
>
> [...snip...]
> Jon (cc'd) tested it and found that the original issue wasn't
> reproducible.  But that the commit in question was the source of crashes
> (likely due to needing the fix from this thread).
>
> We can revisit for 4.13 -- but I'll defer to Heinz.
>
> Eric, please share your results from testing the latest 4.12-rc7.
Testing on 4.12-rc7 doesn't crash the kernel.

Sorry for the noise, please ignore this patch. It's still unclear to me
why backporting these fixes is problematic. I will update this thread
if I find something.

Regards,
Eric

>
> Mike
>
diff mbox

Patch

diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index 4da8858..696e308 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -568,6 +568,7 @@  static void do_reads(struct mirror_set *ms, struct bio_list *reads)
 	region_t region;
 	struct bio *bio;
 	struct mirror *m;
+	struct dm_raid1_bio_record *bio_record;
 
 	while ((bio = bio_list_pop(reads))) {
 		region = dm_rh_bio_to_region(ms->rh, bio);
@@ -583,8 +584,16 @@  static void do_reads(struct mirror_set *ms, struct bio_list *reads)
 
 		if (likely(m))
 			read_async_bio(m, bio);
-		else
+		else {
+			/*
+			 * In mirror_end_io(), we will fail the IO properly
+			 * if details.bi_bdev is NULL.
+			 */
+			bio_record = dm_per_bio_data(bio,
+					sizeof(struct dm_raid1_bio_record));
+			bio_record->details.bi_bdev = NULL;
 			bio_io_error(bio);
+		}
 	}
 }