diff mbox series

[1/1] block: return BLK_STS_NOTSUPP if operation is not supported

Message ID 1595608402-16457-1-git-send-email-ritika.srivastava@oracle.com (mailing list archive)
State New, archived
Headers show
Series [1/1] block: return BLK_STS_NOTSUPP if operation is not supported | expand

Commit Message

Ritika Srivastava July 24, 2020, 4:33 p.m. UTC
If WRITE_ZERO/WRITE_SAME operation is not supported by the storage,
blk_cloned_rq_check_limits() will return -EIO which will cause
device-mapper to fail the paths.

Instead, if the queue limit is set to 0, return BLK_STS_NOTSUPP.
BLK_STS_NOTSUPP will be ignored by device-mapper and will not fail the
paths.

Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ritika Srivastava <ritika.srivastava@oracle.com>
---
 block/blk-core.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Comments

Christoph Hellwig July 26, 2020, 3:10 p.m. UTC | #1
On Fri, Jul 24, 2020 at 09:33:22AM -0700, Ritika Srivastava wrote:
> If WRITE_ZERO/WRITE_SAME operation is not supported by the storage,
> blk_cloned_rq_check_limits() will return -EIO which will cause
> device-mapper to fail the paths.
> 
> Instead, if the queue limit is set to 0, return BLK_STS_NOTSUPP.
> BLK_STS_NOTSUPP will be ignored by device-mapper and will not fail the
> paths.

How do we end up calling blk_cloned_rq_check_limits when the operations
aren't actually supported?  The upper layers should make sure this never
happens, so you're just fixing symptoms here and not the root cause.
Ritika Srivastava July 27, 2020, 7:11 p.m. UTC | #2
Hi Christoph,

Thank you for the review.

> On Jul 26, 2020, at 8:10 AM, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Fri, Jul 24, 2020 at 09:33:22AM -0700, Ritika Srivastava wrote:
>> If WRITE_ZERO/WRITE_SAME operation is not supported by the storage,
>> blk_cloned_rq_check_limits() will return -EIO which will cause
>> device-mapper to fail the paths.
>> 
>> Instead, if the queue limit is set to 0, return BLK_STS_NOTSUPP.
>> BLK_STS_NOTSUPP will be ignored by device-mapper and will not fail the
>> paths.
> 
> How do we end up calling blk_cloned_rq_check_limits when the operations
> aren't actually supported?  The upper layers should make sure this never
> happens, so you're just fixing symptoms here and not the root cause.
The device advertises it’s write zero value as non zero value:
cat /sys/block/sdh/queue/write_zeroes_max_bytes
33553920

Hence block layer issues write zeroes in blkdev_issue_zeroout()

In response, the storage returns the following SCSI error
Sense Key : Illegal Request [current]
Add. Sense: Invalid command operation code

Once this error is received, the write zero capability of the device is disabled and write_zeroes_max_bytes is set to 0.
DM device queue limits are also set to 0 and device-mapper fails the path.
To avoid this, if BLK_STS_NOTSUPP could be returned, then device-mapper would not fail the paths.

Once the write zero capability has been disabled, blk-lib issues zeroes via __blkdev_issue_zero_pages().

Please let me know if I missed something.
Christoph Hellwig July 28, 2020, 7:28 a.m. UTC | #3
On Mon, Jul 27, 2020 at 12:11:19PM -0700, Ritika Srivastava wrote:
> Hence block layer issues write zeroes in blkdev_issue_zeroout()
> 
> In response, the storage returns the following SCSI error
> Sense Key : Illegal Request [current]
> Add. Sense: Invalid command operation code
> 
> Once this error is received, the write zero capability of the device is disabled and write_zeroes_max_bytes is set to 0.
> DM device queue limits are also set to 0 and device-mapper fails the path.
> To avoid this, if BLK_STS_NOTSUPP could be returned, then device-mapper would not fail the paths.
> 
> Once the write zero capability has been disabled, blk-lib issues zeroes via __blkdev_issue_zero_pages().
> 
> Please let me know if I missed something.

Oh, the stupid SCSI runtime detection case again.  Can you please
document this in a comment?

Also please switch blk_cloned_rq_check_limits to just return the
blk_status_t directly.
Ritika Srivastava July 28, 2020, 5:18 p.m. UTC | #4
> On Jul 28, 2020, at 12:28 AM, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Mon, Jul 27, 2020 at 12:11:19PM -0700, Ritika Srivastava wrote:
>> Hence block layer issues write zeroes in blkdev_issue_zeroout()
>> 
>> In response, the storage returns the following SCSI error
>> Sense Key : Illegal Request [current]
>> Add. Sense: Invalid command operation code
>> 
>> Once this error is received, the write zero capability of the device is disabled and write_zeroes_max_bytes is set to 0.
>> DM device queue limits are also set to 0 and device-mapper fails the path.
>> To avoid this, if BLK_STS_NOTSUPP could be returned, then device-mapper would not fail the paths.
>> 
>> Once the write zero capability has been disabled, blk-lib issues zeroes via __blkdev_issue_zero_pages().
>> 
>> Please let me know if I missed something.
> 
> Oh, the stupid SCSI runtime detection case again.  Can you please
> document this in a comment?
> 
> Also please switch blk_cloned_rq_check_limits to just return the
> blk_status_t directly.

Sure, I will make these changes and send updated version.

Thanks,
Ritika
diff mbox series

Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index 9bfaee0..173bb04 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1223,10 +1223,13 @@  blk_qc_t submit_bio(struct bio *bio)
 static int blk_cloned_rq_check_limits(struct request_queue *q,
 				      struct request *rq)
 {
-	if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, req_op(rq))) {
+	unsigned int queue_max_sector = blk_queue_get_max_sectors(q, req_op(rq));
+
+	if (blk_rq_sectors(rq) > queue_max_sector) {
 		printk(KERN_ERR "%s: over max size limit. (%u > %u)\n",
-			__func__, blk_rq_sectors(rq),
-			blk_queue_get_max_sectors(q, req_op(rq)));
+			__func__, blk_rq_sectors(rq), queue_max_sector);
+		if (queue_max_sector == 0)
+			return -EOPNOTSUPP;
 		return -EIO;
 	}
 
@@ -1253,7 +1256,11 @@  static int blk_cloned_rq_check_limits(struct request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 {
-	if (blk_cloned_rq_check_limits(q, rq))
+	int cloned_limit_check = blk_cloned_rq_check_limits(q, rq);
+
+	if (cloned_limit_check == -EOPNOTSUPP)
+		return BLK_STS_NOTSUPP;
+	else if (cloned_limit_check)
 		return BLK_STS_IOERR;
 
 	if (rq->rq_disk &&