diff mbox

block: Always check queue limits for cloned requests

Message ID 1448524017-130967-1-git-send-email-hare@suse.de (mailing list archive)
State New, archived
Headers show

Commit Message

Hannes Reinecke Nov. 26, 2015, 7:46 a.m. UTC
When a cloned request is retried on other queues it always needs
to be checked against the queue limits of that queue.
Otherwise the calculations for nr_phys_segments might be wrong,
leading to a crash in scsi_init_sgtable().

To clarify this the patch renames blk_rq_check_limits()
to blk_cloned_rq_check_limits() and removes the symbol
export, as the new function should only be used for
cloned requests and never exported.

Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Ewan Milne <emilne@redhat.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 block/blk-core.c       | 21 +++++++--------------
 include/linux/blkdev.h |  1 -
 2 files changed, 7 insertions(+), 15 deletions(-)

Comments

Mike Snitzer Nov. 26, 2015, 1:11 p.m. UTC | #1
On Thu, Nov 26 2015 at  2:46am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> When a cloned request is retried on other queues it always needs
> to be checked against the queue limits of that queue.
> Otherwise the calculations for nr_phys_segments might be wrong,
> leading to a crash in scsi_init_sgtable().
> 
> To clarify this the patch renames blk_rq_check_limits()
> to blk_cloned_rq_check_limits() and removes the symbol
> export, as the new function should only be used for
> cloned requests and never exported.
> 
> Cc: Mike Snitzer <snitzer@redhat.com>
> Cc: Ewan Milne <emilne@redhat.com>
> Cc: Jeff Moyer <jmoyer@redhat.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>

Patch looks good.  Thanks for getting to the bottom of this.

Jens, please add these extra tags when you pick this up:

Fixes: e2a60da74 ("block: Clean up special command handling logic")
Cc: stable@vger.kernel.org # 3.7+
Acked-by: Mike Snitzer <snitzer@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Markus Trippelsdorf Nov. 29, 2015, 11:49 a.m. UTC | #2
On 2015.11.26 at 08:11 -0500, Mike Snitzer wrote:
> On Thu, Nov 26 2015 at  2:46am -0500,
> Hannes Reinecke <hare@suse.de> wrote:
> 
> > When a cloned request is retried on other queues it always needs
> > to be checked against the queue limits of that queue.
> > Otherwise the calculations for nr_phys_segments might be wrong,
> > leading to a crash in scsi_init_sgtable().
> > 
> > To clarify this the patch renames blk_rq_check_limits()
> > to blk_cloned_rq_check_limits() and removes the symbol
> > export, as the new function should only be used for
> > cloned requests and never exported.
> > 
> > Cc: Mike Snitzer <snitzer@redhat.com>
> > Cc: Ewan Milne <emilne@redhat.com>
> > Cc: Jeff Moyer <jmoyer@redhat.com>
> > Signed-off-by: Hannes Reinecke <hare@suse.de>
> 
> Patch looks good.  Thanks for getting to the bottom of this.
> 
> Jens, please add these extra tags when you pick this up:
> 
> Fixes: e2a60da74 ("block: Clean up special command handling logic")
> Cc: stable@vger.kernel.org # 3.7+
> Acked-by: Mike Snitzer <snitzer@redhat.com>

I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
with this patch applied.

markus@x4 linux % git describe
v4.4-rc2-215-g081f3698e606
Hannes Reinecke Nov. 29, 2015, 3:43 p.m. UTC | #3
On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
> On 2015.11.26 at 08:11 -0500, Mike Snitzer wrote:
>> On Thu, Nov 26 2015 at  2:46am -0500,
>> Hannes Reinecke <hare@suse.de> wrote:
>>
>>> When a cloned request is retried on other queues it always needs
>>> to be checked against the queue limits of that queue.
>>> Otherwise the calculations for nr_phys_segments might be wrong,
>>> leading to a crash in scsi_init_sgtable().
>>>
>>> To clarify this the patch renames blk_rq_check_limits()
>>> to blk_cloned_rq_check_limits() and removes the symbol
>>> export, as the new function should only be used for
>>> cloned requests and never exported.
>>>
>>> Cc: Mike Snitzer <snitzer@redhat.com>
>>> Cc: Ewan Milne <emilne@redhat.com>
>>> Cc: Jeff Moyer <jmoyer@redhat.com>
>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>
>> Patch looks good.  Thanks for getting to the bottom of this.
>>
>> Jens, please add these extra tags when you pick this up:
>>
>> Fixes: e2a60da74 ("block: Clean up special command handling logic")
>> Cc: stable@vger.kernel.org # 3.7+
>> Acked-by: Mike Snitzer <snitzer@redhat.com>
> 
> I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
> with this patch applied.
> 
> markus@x4 linux % git describe
> v4.4-rc2-215-g081f3698e606
> 
Can you generate a crashdump?
I would need to cross-check with the other dumps I'm having to figure
out if this really is the same issue.
There have been other reports (and fixes) which show we're fighting
several distinct issues here.

Cheers,

Hannes
Markus Trippelsdorf Nov. 29, 2015, 4:15 p.m. UTC | #4
On 2015.11.29 at 16:43 +0100, Hannes Reinecke wrote:
> On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
> > On 2015.11.26 at 08:11 -0500, Mike Snitzer wrote:
> >> On Thu, Nov 26 2015 at  2:46am -0500,
> >> Hannes Reinecke <hare@suse.de> wrote:
> >>
> >>> When a cloned request is retried on other queues it always needs
> >>> to be checked against the queue limits of that queue.
> >>> Otherwise the calculations for nr_phys_segments might be wrong,
> >>> leading to a crash in scsi_init_sgtable().
> >>>
> >>> To clarify this the patch renames blk_rq_check_limits()
> >>> to blk_cloned_rq_check_limits() and removes the symbol
> >>> export, as the new function should only be used for
> >>> cloned requests and never exported.
> >>>
> >>> Cc: Mike Snitzer <snitzer@redhat.com>
> >>> Cc: Ewan Milne <emilne@redhat.com>
> >>> Cc: Jeff Moyer <jmoyer@redhat.com>
> >>> Signed-off-by: Hannes Reinecke <hare@suse.de>
> >>
> >> Patch looks good.  Thanks for getting to the bottom of this.
> >>
> >> Jens, please add these extra tags when you pick this up:
> >>
> >> Fixes: e2a60da74 ("block: Clean up special command handling logic")
> >> Cc: stable@vger.kernel.org # 3.7+
> >> Acked-by: Mike Snitzer <snitzer@redhat.com>
> > 
> > I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
> > with this patch applied.
> > 
> > markus@x4 linux % git describe
> > v4.4-rc2-215-g081f3698e606
> > 
> Can you generate a crashdump?
> I would need to cross-check with the other dumps I'm having to figure
> out if this really is the same issue.
> There have been other reports (and fixes) which show we're fighting
> several distinct issues here.

Unfortunately no. The crash happens on the disk where I store my log
files. And after it happened the magic SysRq keys don't work anymore.

The crash only happens on my spinning rust drive that uses the cfq
scheduler. The SSDs (deadline) are fine.

The BUG happens reproducibly when building http://www.sagemath.org/ on
that drive.
Mike Snitzer Nov. 29, 2015, 4:49 p.m. UTC | #5
On Sun, Nov 29 2015 at 11:15am -0500,
Markus Trippelsdorf <markus@trippelsdorf.de> wrote:

> On 2015.11.29 at 16:43 +0100, Hannes Reinecke wrote:
> > On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
> > > 
> > > I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
> > > with this patch applied.
> > > 
> > > markus@x4 linux % git describe
> > > v4.4-rc2-215-g081f3698e606
> > > 
> > Can you generate a crashdump?
> > I would need to cross-check with the other dumps I'm having to figure
> > out if this really is the same issue.
> > There have been other reports (and fixes) which show we're fighting
> > several distinct issues here.
> 
> Unfortunately no. The crash happens on the disk where I store my log
> files. And after it happened the magic SysRq keys don't work anymore.
> 
> The crash only happens on my spinning rust drive that uses the cfq
> scheduler. The SSDs (deadline) are fine.
> 
> The BUG happens reproducibly when building http://www.sagemath.org/ on
> that drive.

Are you using DM multipath?  If unsure, please let us know which
device(s) map to the "spinning rust drive", and provide output from:
lsblk
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Markus Trippelsdorf Nov. 29, 2015, 5:05 p.m. UTC | #6
On 2015.11.29 at 11:49 -0500, Mike Snitzer wrote:
> On Sun, Nov 29 2015 at 11:15am -0500,
> Markus Trippelsdorf <markus@trippelsdorf.de> wrote:
> 
> > On 2015.11.29 at 16:43 +0100, Hannes Reinecke wrote:
> > > On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
> > > > 
> > > > I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
> > > > with this patch applied.
> > > > 
> > > > markus@x4 linux % git describe
> > > > v4.4-rc2-215-g081f3698e606
> > > > 
> > > Can you generate a crashdump?
> > > I would need to cross-check with the other dumps I'm having to figure
> > > out if this really is the same issue.
> > > There have been other reports (and fixes) which show we're fighting
> > > several distinct issues here.
> > 
> > Unfortunately no. The crash happens on the disk where I store my log
> > files. And after it happened the magic SysRq keys don't work anymore.
> > 
> > The crash only happens on my spinning rust drive that uses the cfq
> > scheduler. The SSDs (deadline) are fine.
> > 
> > The BUG happens reproducibly when building http://www.sagemath.org/ on
> > that drive.
> 
> Are you using DM multipath?  If unsure, please let us know which
> device(s) map to the "spinning rust drive", and provide output from:
> lsblk

No, I'm not using DM multipath. 

/dev/sdb2 on /var type btrfs (rw,relatime,compress=lzo,noacl,space_cache)
/dev/sdb2      btrfs     1.9T  904G  944G  49% /var

scsi 1:0:0:0: Direct-Access     ATA      ST2000DM001-1CH1 CC29 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
sd 1:0:0:0: [sdb] 4096-byte physical blocks
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: Attached scsi generic sg1 type 0
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1CH164
Hannes Reinecke Nov. 30, 2015, 6:47 a.m. UTC | #7
On 11/29/2015 06:05 PM, Markus Trippelsdorf wrote:
> On 2015.11.29 at 11:49 -0500, Mike Snitzer wrote:
>> On Sun, Nov 29 2015 at 11:15am -0500,
>> Markus Trippelsdorf <markus@trippelsdorf.de> wrote:
>>
>>> On 2015.11.29 at 16:43 +0100, Hannes Reinecke wrote:
>>>> On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
>>>>>
>>>>> I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
>>>>> with this patch applied.
>>>>>
>>>>> markus@x4 linux % git describe
>>>>> v4.4-rc2-215-g081f3698e606
>>>>>
>>>> Can you generate a crashdump?
>>>> I would need to cross-check with the other dumps I'm having to figure
>>>> out if this really is the same issue.
>>>> There have been other reports (and fixes) which show we're fighting
>>>> several distinct issues here.
>>>
>>> Unfortunately no. The crash happens on the disk where I store my log
>>> files. And after it happened the magic SysRq keys don't work anymore.
>>>
>>> The crash only happens on my spinning rust drive that uses the cfq
>>> scheduler. The SSDs (deadline) are fine.
>>>
>>> The BUG happens reproducibly when building http://www.sagemath.org/ on
>>> that drive.
>>
>> Are you using DM multipath?  If unsure, please let us know which
>> device(s) map to the "spinning rust drive", and provide output from:
>> lsblk
> 
> No, I'm not using DM multipath. 
> 
> /dev/sdb2 on /var type btrfs (rw,relatime,compress=lzo,noacl,space_cache)
> /dev/sdb2      btrfs     1.9T  904G  944G  49% /var
> 
> scsi 1:0:0:0: Direct-Access     ATA      ST2000DM001-1CH1 CC29 PQ: 0 ANSI: 5
> sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
> sd 1:0:0:0: [sdb] 4096-byte physical blocks
> sd 1:0:0:0: [sdb] Write Protect is off
> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> sd 1:0:0:0: Attached scsi generic sg1 type 0
> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> 
> Model Family:     Seagate Barracuda 7200.14 (AF)
> Device Model:     ST2000DM001-1CH164
> 
As Ming Lei indicated, this is probably a different issue. My patch
is for fixing multipath-failover induced I/O errors only.
So if you're not using multipath you won't be affected, neither by
the original issue triggering the BUG_ON nor my patch attempting to
fix it.

Cheers,

Hannes
diff mbox

Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index 5131993b..a0af404 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2114,7 +2114,8 @@  blk_qc_t submit_bio(int rw, struct bio *bio)
 EXPORT_SYMBOL(submit_bio);
 
 /**
- * blk_rq_check_limits - Helper function to check a request for the queue limit
+ * blk_cloned_rq_check_limits - Helper function to check a cloned request
+ *                              for new the queue limits
  * @q:  the queue
  * @rq: the request being checked
  *
@@ -2125,20 +2126,13 @@  EXPORT_SYMBOL(submit_bio);
  *    after it is inserted to @q, it should be checked against @q before
  *    the insertion using this generic function.
  *
- *    This function should also be useful for request stacking drivers
- *    in some cases below, so export this function.
  *    Request stacking drivers like request-based dm may change the queue
- *    limits while requests are in the queue (e.g. dm's table swapping).
- *    Such request stacking drivers should check those requests against
- *    the new queue limits again when they dispatch those requests,
- *    although such checkings are also done against the old queue limits
- *    when submitting requests.
+ *    limits when retrying requests on other queues. Those requests need
+ *    to be checked against the new queue limits again during dispatch.
  */
-int blk_rq_check_limits(struct request_queue *q, struct request *rq)
+static int blk_cloned_rq_check_limits(struct request_queue *q,
+				      struct request *rq)
 {
-	if (!rq_mergeable(rq))
-		return 0;
-
 	if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) {
 		printk(KERN_ERR "%s: over max size limit.\n", __func__);
 		return -EIO;
@@ -2158,7 +2152,6 @@  int blk_rq_check_limits(struct request_queue *q, struct request *rq)
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(blk_rq_check_limits);
 
 /**
  * blk_insert_cloned_request - Helper for stacking drivers to submit a request
@@ -2170,7 +2163,7 @@  int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 	unsigned long flags;
 	int where = ELEVATOR_INSERT_BACK;
 
-	if (blk_rq_check_limits(q, rq))
+	if (blk_cloned_rq_check_limits(q, rq))
 		return -EIO;
 
 	if (rq->rq_disk &&
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3fe27f8..d14961b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -773,7 +773,6 @@  extern void blk_rq_set_block_pc(struct request *);
 extern void blk_requeue_request(struct request_queue *, struct request *);
 extern void blk_add_request_payload(struct request *rq, struct page *page,
 		unsigned int len);
-extern int blk_rq_check_limits(struct request_queue *q, struct request *rq);
 extern int blk_lld_busy(struct request_queue *q);
 extern int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
 			     struct bio_set *bs, gfp_t gfp_mask,