From patchwork Fri Jul 31 01:20:58 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown <neilb@suse.com>
X-Patchwork-Id: 6907791
X-Patchwork-Delegate: snitzer@redhat.com
Date: Fri, 31 Jul 2015 11:20:58 +1000
From: NeilBrown <neilb@suse.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Message-ID: <20150731112058.6e97b491@noble>
In-Reply-To: <1438262886.2229.1.camel@HansenPartnership.com>
References: <1710310402.852769.1438246982906.JavaMail.zimbra@redhat.com>
	<1438262886.2229.1.camel@HansenPartnership.com>
Cc: linux-scsi@vger.kernel.org, Jes.Sorensen@redhat.com, dm-devel@redhat.com,
	linux-raid@vger.kernel.org, xni@redhat.com, Yi Zhang <yizhan@redhat.com>
Subject: Re: [dm-devel] kernel BUG at drivers/scsi/scsi_lib.c:1101! observed
	during md5sum for one file on (RAID4->RAID0) device

On Thu, 30 Jul 2015 06:28:06 -0700 James Bottomley
<James.Bottomley@HansenPartnership.com> wrote:

> On Thu, 2015-07-30 at 05:03 -0400, Yi Zhang wrote:
> > Hi SCSI/RAID maintainer
> > 
> > During raid test with 4.2.0-rc3, I observed below kernel BUG, pls check below info for the test log/environment/test steps.
> > 
> > Log:
> > [  306.741662] md: bind<sdb1>
> > [  306.750865] md: bind<sdc1>
> > [  306.753993] md: bind<sdd1>
> > [  306.764475] md: bind<sde1>
> > [  306.786156] md: bind<sdf1>
> > [  306.789362] md: bind<sdh1>
> > [  306.792555] md: bind<sdg1>
> > [  306.868166] raid6: sse2x1   gen() 10589 MB/s
> > [  306.889143] raid6: sse2x1   xor()  8218 MB/s
> > [  306.910121] raid6: sse2x2   gen() 13453 MB/s
> > [  306.931102] raid6: sse2x2   xor()  8990 MB/s
> > [  306.952079] raid6: sse2x4   gen() 15539 MB/s
> > [  306.973063] raid6: sse2x4   xor() 10771 MB/s
> > [  306.994039] raid6: avx2x1   gen() 20582 MB/s
> > [  307.015017] raid6: avx2x2   gen() 24019 MB/s
> > [  307.035998] raid6: avx2x4   gen() 27824 MB/s
> > [  307.040755] raid6: using algorithm avx2x4 gen() 27824 MB/s
> > [  307.046869] raid6: using avx2x2 recovery algorithm
> > [  307.058793] async_tx: api initialized (async)
> > [  307.075428] xor: automatically using best checksumming function:
> > [  307.091942]    avx       : 32008.000 MB/sec
> > [  307.147662] md: raid6 personality registered for level 6
> > [  307.153584] md: raid5 personality registered for level 5
> > [  307.159505] md: raid4 personality registered for level 4
> > [  307.165698] md/raid:md0: device sdf1 operational as raid disk 4
> > [  307.172300] md/raid:md0: device sde1 operational as raid disk 3
> > [  307.178899] md/raid:md0: device sdd1 operational as raid disk 2
> > [  307.185497] md/raid:md0: device sdc1 operational as raid disk 1
> > [  307.192093] md/raid:md0: device sdb1 operational as raid disk 0
> > [  307.199052] md/raid:md0: allocated 6482kB
> > [  307.203573] md/raid:md0: raid level 4 active with 5 out of 6 devices, algorithm 0
> > [  307.211958] md0: detected capacity change from 0 to 53645148160
> > [  307.218658] md: recovery of RAID array md0
> > [  307.223226] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> > [  307.229729] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
> > [  307.240427] md: using 128k window, over a total of 10477568k.
> > [  374.670951] md: md0: recovery done.
> > [  375.722806] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> > [  447.553364] md: unbind<sdh1>
> > [  447.559905] md: export_rdev(sdh1)
> > [  447.572684] md: cannot remove active disk sdg1 from md0 ...
> > [  447.578909] md/raid:md0: Disk failure on sdg1, disabling device.
> > [  447.578909] md/raid:md0: Operation continuing on 5 devices.
> > [  447.594850] md: unbind<sdg1>
> > [  447.601834] md: export_rdev(sdg1)
> > [  447.615446] md: raid0 personality registered for level 0
> > [  447.629275] md/raid0:md0: md_size is 104775680 sectors.
> > [  447.635094] md: RAID0 configuration for md0 - 1 zone
> > [  447.640627] md: zone0=[sdb1/sdc1/sdd1/sde1/sdf1]
> > [  447.645833]       zone-offset=         0KB, device-offset=         0KB, size=  52387840KB
> > [  447.654949] 
> > [  447.739443] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> > [  447.749258] bio too big device sde1 (768 > 512)
> 
> This is the actual error.  It looks like an md problem (md list copied).

Thanks.  It certainly does look like an md problem.... ah, found it.

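level_store in drivers/md/md.c calls blk_set_stacking_limits after
calling ->takeover and before calling ->run, which resets the queue
limits to their stacking defaults.
->run should impose the limits from the underlying devices, but for
RAID0, ->takeover is doing that, so those limits are thrown away
before the array starts running.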
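
To illustrate the ordering problem, here is a toy userspace model in C. It is not kernel code: every toy_* name is hypothetical, only a single "max sectors" limit is modelled, and the 1024-sector chunk size used in the usage note is an assumption chosen so that the numbers resemble the "768 > 512" message in the log.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a queue's limits; only one field is modelled. */
struct toy_limits {
	unsigned int max_sectors;
};

/* Plays the role of the reset done between ->takeover and ->run:
 * start again from "no restriction". */
static void toy_set_stacking_limits(struct toy_limits *lim)
{
	lim->max_sectors = UINT_MAX;
}

/* Plays the role of stacking member limits: the array may accept
 * no more than its most restrictive member device. */
static void toy_stack_limits(struct toy_limits *top,
			     const unsigned int *member_max, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (member_max[i] < top->max_sectors)
			top->max_sectors = member_max[i];
}

/* Cap the array limit at the chunk size, as the "run" step does. */
static void toy_cap_to_chunk(struct toy_limits *top, unsigned int chunk)
{
	if (chunk < top->max_sectors)
		top->max_sectors = chunk;
}

/* Old order: members are stacked during "takeover", then the reset
 * discards that work, and "run" only applies the chunk cap. */
static unsigned int toy_level_change_buggy(const unsigned int *member_max,
					   size_t n, unsigned int chunk)
{
	struct toy_limits top;

	toy_set_stacking_limits(&top);
	toy_stack_limits(&top, member_max, n);	/* "takeover" */
	toy_set_stacking_limits(&top);		/* reset wipes it */
	toy_cap_to_chunk(&top, chunk);		/* "run" */
	return top.max_sectors;
}

/* New order: the reset happens first and "run" stacks the members. */
static unsigned int toy_level_change_fixed(const unsigned int *member_max,
					   size_t n, unsigned int chunk)
{
	struct toy_limits top;

	toy_set_stacking_limits(&top);		/* reset */
	toy_stack_limits(&top, member_max, n);	/* "run" */
	toy_cap_to_chunk(&top, chunk);		/* "run" */
	return top.max_sectors;
}

/* The kind of check behind "bio too big device ... (768 > 512)". */
static bool toy_bio_too_big(unsigned int bio_sectors, unsigned int member_max)
{
	return bio_sectors > member_max;
}
```

With members limited to {512, 1024, 1024} sectors and an assumed chunk of 1024 sectors, the buggy order leaves the array advertising 1024, so a 768-sector bio is accepted by the array but is too big for the 512-sector member. The fixed order advertises 512, so such a bio is never built.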

I can fix that... hopefully it will become irrelevant soon when the
immutable-bio patches go in.


This patch isn't quite right, but it should be pretty close.
Can you test and confirm?
Thanks,
NeilBrown

---
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index efb654eb5399..17804f374709 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -83,7 +83,6 @@ static int create_strip_zones(struct mddev *mddev, struct r0conf **private_conf)
 	char b[BDEVNAME_SIZE];
 	char b2[BDEVNAME_SIZE];
 	struct r0conf *conf = kzalloc(sizeof(*conf), GFP_KERNEL);
-	bool discard_supported = false;
 
 	if (!conf)
 		return -ENOMEM;
@@ -188,19 +187,12 @@ static int create_strip_zones(struct mddev *mddev, struct r0conf **private_conf)
 		}
 		dev[j] = rdev1;
 
-		if (mddev->queue)
-			disk_stack_limits(mddev->gendisk, rdev1->bdev,
-					  rdev1->data_offset << 9);
-
 		if (rdev1->bdev->bd_disk->queue->merge_bvec_fn)
 			conf->has_merge_bvec = 1;
 
 		if (!smallest || (rdev1->sectors < smallest->sectors))
 			smallest = rdev1;
 		cnt++;
-
-		if (blk_queue_discard(bdev_get_queue(rdev1->bdev)))
-			discard_supported = true;
 	}
 	if (cnt != mddev->raid_disks) {
 		printk(KERN_ERR "md/raid0:%s: too few disks (%d of %d) - "
@@ -272,17 +264,6 @@ static int create_strip_zones(struct mddev *mddev, struct r0conf **private_conf)
 		goto abort;
 	}
 
-	if (mddev->queue) {
-		blk_queue_io_min(mddev->queue, mddev->chunk_sectors << 9);
-		blk_queue_io_opt(mddev->queue,
-				 (mddev->chunk_sectors << 9) * mddev->raid_disks);
-
-		if (!discard_supported)
-			queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
-		else
-			queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
-	}
-
 	pr_debug("md/raid0:%s: done.\n", mdname(mddev));
 	*private_conf = conf;
 
@@ -433,12 +414,6 @@ static int raid0_run(struct mddev *mddev)
 	if (md_check_no_bitmap(mddev))
 		return -EINVAL;
 
-	if (mddev->queue) {
-		blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors);
-		blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
-		blk_queue_max_discard_sectors(mddev->queue, mddev->chunk_sectors);
-	}
-
 	/* if private is not null, we are here after takeover */
 	if (mddev->private == NULL) {
 		ret = create_strip_zones(mddev, &conf);
@@ -447,6 +422,29 @@ static int raid0_run(struct mddev *mddev)
 		mddev->private = conf;
 	}
 	conf = mddev->private;
+	if (mddev->queue) {
+		struct md_rdev *rdev;
+		bool discard_supported = false;
+
+		rdev_for_each(rdev, mddev) {
+			disk_stack_limits(mddev->gendisk, rdev->bdev,
+					  rdev->data_offset << 9);
+			if (blk_queue_discard(bdev_get_queue(rdev->bdev)))
+				discard_supported = true;
+		}
+		blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors);
+		blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
+		blk_queue_max_discard_sectors(mddev->queue, mddev->chunk_sectors);
+
+		blk_queue_io_min(mddev->queue, mddev->chunk_sectors << 9);
+		blk_queue_io_opt(mddev->queue,
+				 (mddev->chunk_sectors << 9) * mddev->raid_disks);
+
+		if (!discard_supported)
+			queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
+		else
+			queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
+	}
 
 	/* calculate array device size */
 	md_set_array_sectors(mddev, raid0_size(mddev, 0, 0));