dm-multipath test scripts

Message ID 56C662F1.8070407@ce.jp.nec.com (mailing list archive)
State Accepted, archived
Delegated to: Mike Snitzer

Commit Message

Junichi Nomura Feb. 19, 2016, 12:33 a.m. UTC
Hi Mike,

On 02/19/16 02:17, Mike Snitzer wrote:
> But unfortunately I cannot get either the scsidebug or tcmloop mode to
> run against v4.5-rc4
> 
> For tcmloop, targetcli fails with:
> "Could not create ISCSIFabricModule in configFS."

Hmm, it sounds like there's an unnecessary dependency in targetcli.

> (fixed by enabling CONFIG_ISCSI_TARGET under TARGET_CORE)

OK.

> I'm seeing all tests fail due to fio verification failure.  I'll need to
> inspect this further..
> 
> But the most problematic test is ./tests/test_03_dm_failpath -- it seems
> to actively break _any_ v4.5-rc kernel I try (with a never-ending flood
> of messages like "device-mapper: multipath: Failing path 8:192."); I
> haven't tried older kernels.

It seems the fail/recover cycle runs too fast for I/O to make any progress.
I hit a similar case and had to slow down the stress with the attached patch.
Please try this. Sorry for the inconvenience.
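
A minimal sketch of one throttled fail/reinstate cycle, for illustration only.
It uses the same "dmsetup message" calls as the patch at the end of this
page; the function name failpath_cycle and the DMSETUP and CYCLE_DELAY
variables are assumptions of this sketch and are not part of the scripts.

```sh
#!/bin/sh
# Hypothetical helper, not part of the posted scripts.
# DMSETUP and CYCLE_DELAY are assumptions introduced for this sketch.
DMSETUP=${DMSETUP:-dmsetup}
CYCLE_DELAY=${CYCLE_DELAY:-1}

# failpath_cycle MPNAME DEV...
# Fail every listed path, pause, reinstate every path, pause again.
# The pauses give I/O a window to make progress in each state.
failpath_cycle() {
	mpname=$1
	shift
	for dev in "$@"; do
		$DMSETUP message "$mpname" 0 "fail_path $dev"
	done
	sleep "$CYCLE_DELAY"
	for dev in "$@"; do
		$DMSETUP message "$mpname" 0 "reinstate_path $dev"
	done
	sleep "$CYCLE_DELAY"
}
```

Setting DMSETUP=echo prints the messages instead of sending them, which
allows a dry run without root or a multipath device.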

> What is the last kernel version that your scripts have worked on?

v4.4 worked fine. I'll check with v4.5-rc4 when I get a machine.

> Taking a step back:
> These scripts don't belong in Documentation/device-mapper/mptest/ (or
> anywhere in the kernel tree for that matter).
> 
> I'd really prefer it if we could port your scripts over to the
> device-mapper-test-suite, see:
> https://github.com/jthornber/device-mapper-test-suite

Yes, I agree such a project is a better place for this to live.

Comments

Junichi Nomura Feb. 19, 2016, 8:37 a.m. UTC | #1
On 02/19/16 09:33, Nomura Junichi wrote:
> On 02/19/16 02:17, Mike Snitzer wrote:
>> What is the last kernel version that your scripts have worked on?
> 
> v4.4 worked fine. I'll check with v4.5-rc4 when I get a machine.

v4.5-rc4 works fine, too.

So if all tests fail for you, it might be due to a difference in
environment or kernel config.
I'm running tests on RHEL7.2 and kernel config is based on RHEL7 kernel.

# rpm -q targetcli
targetcli-2.1.fb41-3.el7.noarch
# cp /boot/config-3.10.0-327.el7.x86_64 .config
# make olddefconfig
...
Mike Snitzer Feb. 19, 2016, 7:42 p.m. UTC | #2
On Fri, Feb 19 2016 at  3:37am -0500,
Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:

> On 02/19/16 09:33, Nomura Junichi wrote:
> > On 02/19/16 02:17, Mike Snitzer wrote:
> >> What is the last kernel version that your scripts have worked on?
> > 
> > v4.4 worked fine. I'll check with v4.5-rc4 when I get a machine.
> 
> v4.5-rc4 works fine, too.

Have you been running with blk-mq?
Either by setting CONFIG_DM_MQ_DEFAULT or:
echo Y > /sys/module/dm_mod/parameters/use_blk_mq
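
A minimal sketch for checking whether a kernel config file enables a given
option such as DM_MQ_DEFAULT. The function name config_enabled is an
assumption of this sketch, and it only matches the "CONFIG_FOO=y" and
"CONFIG_FOO=m" line format of a .config file.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the symbol is set to y or m in the file.
# $1: path to a kernel config file, $2: symbol name without "CONFIG_".
config_enabled() {
	grep -Eq "^CONFIG_$2=(y|m)\$" "$1"
}
```

For example, "config_enabled .config DM_MQ_DEFAULT && echo enabled" reports
whether dm-mq is the build-time default for that config.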

I'm seeing test_02_sdev_delete fail with blk-mq enabled.

(and with my latest DM mpath code I'm testing for Linux 4.6: I'm seeing
a nasty deadlock from test_01_sdev_offline... so that's "fun")

> So if all tests fail for you, it might be due to a difference in
> environment or kernel config.

After upgrading fio to latest fio.git I no longer see fio validate
errors.

> I'm running tests on RHEL7.2 and kernel config is based on RHEL7 kernel.
> 
> # rpm -q targetcli
> targetcli-2.1.fb41-3.el7.noarch
> # cp /boot/config-3.10.0-327.el7.x86_64 .config
> # make olddefconfig
> ...

Yes, I'm using a comparable setup.  But with localmodconfig.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer Feb. 20, 2016, 6:12 a.m. UTC | #3
On Fri, Feb 19 2016 at  2:42pm -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Fri, Feb 19 2016 at  3:37am -0500,
> Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:
> 
> > On 02/19/16 09:33, Nomura Junichi wrote:
> > > On 02/19/16 02:17, Mike Snitzer wrote:
> > >> What is the last kernel version that your scripts have worked on?
> > > 
> > > v4.4 worked fine. I'll check with v4.5-rc4 when I get a machine.
> > 
> > v4.5-rc4 works fine, too.
> 
> Have you been running with blk-mq?
> Either by setting CONFIG_DM_MQ_DEFAULT or:
> echo Y > /sys/module/dm_mod/parameters/use_blk_mq
> 
> I'm seeing test_02_sdev_delete fail with blk-mq enabled.

I only see failure if I stack dm-mq on top of old non-mq scsi devices with:

echo N > /sys/module/scsi_mod/parameters/use_blk_mq
echo Y > /sys/module/dm_mod/parameters/use_blk_mq

If I use scsi-mq for the underlying devices all works fine (been testing
the latest dm-4.6 branch though, I'll go back and try stock 4.5-rc4 just
to double check).
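
A minimal sketch for reporting which of the combinations above is currently
configured. It reads the two module parameters named above; the helper names
read_param and describe_stacking and the output strings are assumptions of
this sketch.

```sh
#!/bin/sh
# Print the content of a parameter file, or "unknown" if it is unreadable
# (for example when the module is not loaded).
read_param() {
	if [ -r "$1" ]; then
		cat "$1"
	else
		echo unknown
	fi
}

# describe_stacking DM_VALUE SCSI_VALUE
# Both values are expected to be "Y" or "N" as shown by sysfs.
describe_stacking() {
	case "$1/$2" in
	Y/Y) echo "dm-mq stacked on scsi-mq" ;;
	Y/N) echo "dm-mq stacked on non-mq scsi" ;;
	N/Y|N/N) echo "dm-mq disabled" ;;
	*) echo "unknown" ;;
	esac
}

dm=$(read_param /sys/module/dm_mod/parameters/use_blk_mq)
scsi=$(read_param /sys/module/scsi_mod/parameters/use_blk_mq)
describe_stacking "$dm" "$scsi"
```

The sketch only reads the parameters, so it is safe to run before and after
switching them to confirm which configuration a test run actually used.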

But this makes me think the novelty of having dm-mq support stacking on
non-blk-mq devices was misplaced.  It is a senseless config.  I'll
probably remove support for such stacking soon (next week). 

Hannes Reinecke Feb. 20, 2016, 9:42 a.m. UTC | #4
On 02/20/2016 07:12 AM, Mike Snitzer wrote:
> On Fri, Feb 19 2016 at  2:42pm -0500,
> Mike Snitzer <snitzer@redhat.com> wrote:
>
>> On Fri, Feb 19 2016 at  3:37am -0500,
>> Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:
>>
>>> On 02/19/16 09:33, Nomura Junichi wrote:
>>>> On 02/19/16 02:17, Mike Snitzer wrote:
>>>>> What is the last kernel version that your scripts have worked on?
>>>>
>>>> v4.4 worked fine. I'll check with v4.5-rc4 when I get a machine.
>>>
>>> v4.5-rc4 works fine, too.
>>
>> Have you been running with blk-mq?
>> Either by setting CONFIG_DM_MQ_DEFAULT or:
>> echo Y > /sys/module/dm_mod/parameters/use_blk_mq
>>
>> I'm seeing test_02_sdev_delete fail with blk-mq enabled.
>
> I only see failure if I stack dm-mq on top of old non-mq scsi devices with:
>
> echo N > /sys/module/scsi_mod/parameters/use_blk_mq
> echo Y > /sys/module/dm_mod/parameters/use_blk_mq
>
> If I use scsi-mq for the underlying devices all works fine (been testing
> the latest dm-4.6 branch though, I'll go back and try stock 4.5-rc4 just
> to double check).
>
> But this makes me think the novelty of having dm-mq support stacking on
> non-blk-mq devices was misplaced.  It is a senseless config.  I'll
> probably remove support for such stacking soon (next week).

Hmm. I must admit I really, really don't like these 'once-and-for-all'
parameters.

ATM the only SCSI drivers that support SCSI-mq properly are lpfc, virtio,
and fnic. None of the other drivers have been modified, and I suspect the
performance might be less than stellar.

So there will be configurations where one might want to run scsi-mq 
alongside non-mq HBAs.

I would really love to see that made more granular so that these
configurations can run efficiently.
I know Christoph is violently against it, but I don't really see any 
solution presenting itself at the moment.

Maybe a good topic for LSF ...

Cheers,

Hannes
Bart Van Assche Feb. 20, 2016, 3:13 p.m. UTC | #5
On 02/20/16 01:42, Hannes Reinecke wrote:
> ATM the only SCSI drivers that support SCSI-mq properly are lpfc, virtio,
> and fnic. None of the other drivers have been modified, and I suspect the
> performance might be less than stellar.

Hello Hannes,

Before scsi-mq support was added to the above drivers it was added
to the ib_srp driver. Apparently today there are multiple drivers
that support multiple hardware queues:

$ git grep -nH 'nr_hw_queues = [^1]' | grep -vE 'block/blk-mq|scsi/scsi_lib'
drivers/block/null_blk.c:667:		nullb->tag_set.nr_hw_queues = submit_queues;
drivers/block/virtio_blk.c:638:	vblk->tag_set.nr_hw_queues = vblk->num_vqs;
drivers/block/xen-blkfront.c:927:	info->tag_set.nr_hw_queues = info->nr_rings;
drivers/infiniband/ulp/srp/ib_srp.c:3337:	target->scsi_host->nr_hw_queues = target->ch_count;
drivers/nvme/host/pci.c:1680:		dev->tagset.nr_hw_queues = dev->online_queues - 1;
drivers/scsi/lpfc/lpfc_init.c:3318:	shost->nr_hw_queues = phba->cfg_fcp_io_channel;
drivers/scsi/virtio_scsi.c:1008:	shost->nr_hw_queues = num_queues;

Performance data is available in http://events.linuxfoundation.org/sites/events/files/slides/Vault%20-%20scsi-mq%20v2.pdf.

Bart.

Mike Snitzer Feb. 20, 2016, 4:34 p.m. UTC | #6
On Sat, Feb 20 2016 at  4:42am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 02/20/2016 07:12 AM, Mike Snitzer wrote:
> >On Fri, Feb 19 2016 at  2:42pm -0500,
> >Mike Snitzer <snitzer@redhat.com> wrote:
> >
> >>On Fri, Feb 19 2016 at  3:37am -0500,
> >>Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:
> >>
> >>>On 02/19/16 09:33, Nomura Junichi wrote:
> >>>>On 02/19/16 02:17, Mike Snitzer wrote:
> >>>>>What is the last kernel version that your scripts have worked on?
> >>>>
> >>>>v4.4 worked fine. I'll check with v4.5-rc4 when I get a machine.
> >>>
> >>>v4.5-rc4 works fine, too.
> >>
> >>Have you been running with blk-mq?
> >>Either by setting CONFIG_DM_MQ_DEFAULT or:
> >>echo Y > /sys/module/dm_mod/parameters/use_blk_mq
> >>
> >>I'm seeing test_02_sdev_delete fail with blk-mq enabled.
> >
> >I only see failure if I stack dm-mq on top of old non-mq scsi devices with:
> >
> >echo N > /sys/module/scsi_mod/parameters/use_blk_mq
> >echo Y > /sys/module/dm_mod/parameters/use_blk_mq
> >
> >If I use scsi-mq for the underlying devices all works fine (been testing
> >the latest dm-4.6 branch though, I'll go back and try stock 4.5-rc4 just
> >to double check).
> >
> >But this makes me think the novelty of having dm-mq support stacking on
> >non-blk-mq devices was misplaced.  It is a senseless config.  I'll
> >probably remove support for such stacking soon (next week).
> 
> Hmm. I must admit I really, really don't like these 'once-and-for-all'
> parameters.
> 
> ATM the only SCSI drivers that support SCSI-mq properly are lpfc,
> virtio, and fnic. None of the other drivers have been modified, and I
> suspect the performance might be less than stellar.
> 
> So there will be configurations where one might want to run scsi-mq
> alongside non-mq HBAs.

dm-mq already disallows such a mix though.  From dm_table_set_type():

        if (use_blk_mq) {
                /* verify _all_ devices in the table are blk-mq devices */
                list_for_each_entry(dd, devices, list)
                        if (!bdev_get_queue(dd->dm_dev->bdev)->mq_ops) {
                                DMERR("table load rejected: not all devices"
                                      " are blk-mq request-stackable");
                                return -EINVAL;
                        }
                t->type = DM_TYPE_MQ_REQUEST_BASED;

        }

But I was talking about removing support for dm-mq stacked on _all_ old
.request_fn devices.
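
A rough userspace analogue of the check above, as a sketch only. It assumes
that a blk-mq device exposes an "mq" directory under its sysfs block entry
and that the underlying devices are whole disks listed under the dm
device's "slaves" directory. The function name all_slaves_blk_mq and the
SYSFS_ROOT variable are assumptions of this sketch.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if every slave of the given dm device
# (e.g. "dm-0") appears to be a blk-mq device.
# SYSFS_ROOT is an assumption so the function can be pointed at a fake tree.
all_slaves_blk_mq() {
	dm=$1
	root=${SYSFS_ROOT:-/sys}
	found=0
	for slave in "$root/block/$dm/slaves"/*; do
		# An unmatched glob stays literal; skip it.
		[ -e "$slave" ] || continue
		found=1
		if [ ! -d "$slave/mq" ]; then
			echo "$(basename "$slave") is not blk-mq" >&2
			return 1
		fi
	done
	# A device with no slaves at all is reported as failure.
	[ "$found" -eq 1 ]
}
```

Running "all_slaves_blk_mq dm-0" before loading a dm-mq table gives an
early hint about whether a load would be rejected, although the kernel
check remains authoritative.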

> I would really love to see that made more granular so that these
> configurations can run efficiently.
> I know Christoph is violently against it, but I don't really see any
> solution presenting itself at the moment.

I'm with Christoph on this.  Supporting such elaborate mixing is too
fragile.  Time would be much better spent converting drivers to properly
support scsi-mq and/or fixing scsi-mq to perform better.

> Maybe a good topic for LSF ...

Unlikely.. but don't let me stop you! ;)

Mike Snitzer Feb. 24, 2016, 7:37 p.m. UTC | #7
On Thu, Feb 18 2016 at  7:33pm -0500,
Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:

> Hi Mike,
> 
> On 02/19/16 02:17, Mike Snitzer wrote:
> 
> > Taking a step back:
> > These scripts don't belong in Documentation/device-mapper/mptest/ (or
> > anywhere in the kernel tree for that matter).
> > 
> > I'd really prefer it if we could port your scripts over to the
> > device-mapper-test-suite, see:
> > https://github.com/jthornber/device-mapper-test-suite
> 
> Yes, I agree such a project is a better place for this to live.

I was going to attempt porting your scripts to device-mapper-test-suite
but I'll have to come back to that (I have more important tasks at this
time).

So I've created a GitHub repo for your scripts:
https://github.com/snitm/mptest

I'll let you know once I've ported to device-mapper-test-suite.  But in
the meantime I'll take any changes you or others have to 'mptest'.

Thanks,
Mike

Patch

diff --git a/lib/failpath_dm_message b/lib/failpath_dm_message
index 1a3bcf8..5b8f28a 100755
--- a/lib/failpath_dm_message
+++ b/lib/failpath_dm_message
@@ -30,9 +30,11 @@  start_failpath_dm_message () {
 		for m in $majs; do
 			dmsetup message $MPNAME 0 "fail_path $m"
 		done
+		sleep 1
 		for m in $majs; do
 			dmsetup message $MPNAME 0 "reinstate_path $m"
 		done
+		sleep 1
 	done &
 }