From patchwork Wed Apr 5 17:21:24 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 9665149
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
 [172.30.200.125])
 by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id BC4A2602B5
 for ; Wed, 5 Apr 2017 17:22:58 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AA987204C1
 for ; Wed, 5 Apr 2017 17:22:58 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
 id 9F50C2859E; Wed, 5 Apr 2017 17:22:58 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
 pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED,
 RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 28AFA204C1
 for ; Wed, 5 Apr 2017 17:22:58 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S932827AbdDERW5 (ORCPT );
 Wed, 5 Apr 2017 13:22:57 -0400
Received: from bombadil.infradead.org ([65.50.211.133]:59806 "EHLO
 bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1755956AbdDERWs (ORCPT );
 Wed, 5 Apr 2017 13:22:48 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
 d=infradead.org; s=bombadil.20170209;
 h=References:In-Reply-To:Message-Id:
 Date:Subject:Cc:To:From:Sender:Reply-To:MIME-Version:Content-Type:
 Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date:
 Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:
 List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive;
 bh=JVa1fjb0YzbRmvqr4CeToZ061NS+gVnviM9Ah1VDM/s=;
 b=apcX73UQWUjHhUpizOkeqy5fE
 0sQLLmq+z5iNOSN1VopMKSKvGTTBaXtS32KHUfuJWEst+19pqb/w+8563NbytAI5MZoPnYwcwUMDC
 C/l0YuphFuHwN2PU98xc59hRS8g6OT4N0Afr2rqqb/THCIXxX07mpkHIMpbWnt7BW5uK+gw/DuO3Y
 HldnFGkfRa6BO3FwpMOVc9UrB3nl90kKAw/fBfZIfSt2DgLOQTbVrM0C5jOlv0sTo3OIYz7jTWlei
 ReGL2Beu4Hm40BtKctbv6Dn5ZjmCh3qYpKUQihRNKiDUOySmZ3PhNXE4650HuLD7SAtD/ZMen/IQZ
 UZejStgtQ==;
Received: from clnet-p099-196.ikbnet.co.at ([83.175.99.196] helo=localhost)
 by bombadil.infradead.org with esmtpsa (Exim 4.87 #1 (Red Hat Linux))
 id 1cvodp-0005Mm-Kx; Wed, 05 Apr 2017 17:22:46 +0000
From: Christoph Hellwig
To: axboe@kernel.dk, martin.petersen@oracle.com, agk@redhat.com,
 snitzer@redhat.com, shli@kernel.org, philipp.reisner@linbit.com,
 lars.ellenberg@linbit.com
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
 drbd-dev@lists.linbit.com, dm-devel@redhat.com, linux-raid@vger.kernel.org
Subject: [PATCH 26/27] scsi: sd: Separate zeroout and discard command choices
Date: Wed, 5 Apr 2017 19:21:24 +0200
Message-Id: <20170405172125.22600-27-hch@lst.de>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170405172125.22600-1-hch@lst.de>
References: <20170405172125.22600-1-hch@lst.de>
X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org.
 See http://www.infradead.org/rpr.html
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: "Martin K. Petersen"

Now that zeroout and discards are distinct operations we need to
separate the policy of choosing the appropriate command. Create a
zeroing_mode which can be one of:

write:              Zeroout assist not present, use regular WRITE
writesame:          Allow WRITE SAME(10/16) with a zeroed payload
writesame_16_unmap: Allow WRITE SAME(16) with UNMAP
writesame_10_unmap: Allow WRITE SAME(10) with UNMAP

The last two are conditional on the device being thin provisioned with
LBPRZ=1 and LBPWS=1 or LBPWS10=1 respectively. Whether to set the UNMAP
bit or not depends on the REQ_NOUNMAP flag.
And if none of the _unmap variants are supported, regular WRITE SAME will
be used if the device supports it.

The zeroing_mode is exported in sysfs and the detected mode for a given
device can be overridden using the string constants above.

With this change in place we can now issue WRITE SAME(16) with UNMAP set
for block zeroing applications that require hard guarantees and
logical_block_size granularity. And at the same time use the UNMAP
command with the device's preferred granularity and alignment for discard
operations.

Signed-off-by: Martin K. Petersen
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
 drivers/scsi/sd.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
 drivers/scsi/sd.h |  8 ++++++++
 2 files changed, 61 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index bcb0cb020fd2..acf9d17b05d8 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -418,6 +418,46 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(provisioning_mode);
 
+static const char *zeroing_mode[] = {
+	[SD_ZERO_WRITE]		= "write",
+	[SD_ZERO_WS]		= "writesame",
+	[SD_ZERO_WS16_UNMAP]	= "writesame_16_unmap",
+	[SD_ZERO_WS10_UNMAP]	= "writesame_10_unmap",
+};
+
+static ssize_t
+zeroing_mode_show(struct device *dev, struct device_attribute *attr,
+		  char *buf)
+{
+	struct scsi_disk *sdkp = to_scsi_disk(dev);
+
+	return snprintf(buf, 20, "%s\n", zeroing_mode[sdkp->zeroing_mode]);
+}
+
+static ssize_t
+zeroing_mode_store(struct device *dev, struct device_attribute *attr,
+		   const char *buf, size_t count)
+{
+	struct scsi_disk *sdkp = to_scsi_disk(dev);
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	if (!strncmp(buf, zeroing_mode[SD_ZERO_WRITE], 20))
+		sdkp->zeroing_mode = SD_ZERO_WRITE;
+	else if (!strncmp(buf, zeroing_mode[SD_ZERO_WS], 20))
+		sdkp->zeroing_mode = SD_ZERO_WS;
+	else if (!strncmp(buf, zeroing_mode[SD_ZERO_WS16_UNMAP], 20))
+		sdkp->zeroing_mode = SD_ZERO_WS16_UNMAP;
+	else if (!strncmp(buf, zeroing_mode[SD_ZERO_WS10_UNMAP], 20))
+		sdkp->zeroing_mode = SD_ZERO_WS10_UNMAP;
+	else
+		return -EINVAL;
+
+	return count;
+}
+static DEVICE_ATTR_RW(zeroing_mode);
+
 static ssize_t
 max_medium_access_timeouts_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
@@ -496,6 +536,7 @@ static struct attribute *sd_disk_attrs[] = {
 	&dev_attr_app_tag_own.attr,
 	&dev_attr_thin_provisioning.attr,
 	&dev_attr_provisioning_mode.attr,
+	&dev_attr_zeroing_mode.attr,
 	&dev_attr_max_write_same_blocks.attr,
 	&dev_attr_max_medium_access_timeouts.attr,
 	NULL,
@@ -799,10 +840,10 @@ static int sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd)
 	u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9);
 
 	if (!(rq->cmd_flags & REQ_NOUNMAP)) {
-		switch (sdkp->provisioning_mode) {
-		case SD_LBP_WS16:
+		switch (sdkp->zeroing_mode) {
+		case SD_ZERO_WS16_UNMAP:
 			return sd_setup_write_same16_cmnd(cmd, true);
-		case SD_LBP_WS10:
+		case SD_ZERO_WS10_UNMAP:
 			return sd_setup_write_same10_cmnd(cmd, true);
 		}
 	}
@@ -840,6 +881,15 @@ static void sd_config_write_same(struct scsi_disk *sdkp)
 		sdkp->max_ws_blocks = 0;
 	}
 
+	if (sdkp->lbprz && sdkp->lbpws)
+		sdkp->zeroing_mode = SD_ZERO_WS16_UNMAP;
+	else if (sdkp->lbprz && sdkp->lbpws10)
+		sdkp->zeroing_mode = SD_ZERO_WS10_UNMAP;
+	else if (sdkp->max_ws_blocks)
+		sdkp->zeroing_mode = SD_ZERO_WS;
+	else
+		sdkp->zeroing_mode = SD_ZERO_WRITE;
+
 out:
 	blk_queue_max_write_same_sectors(q, sdkp->max_ws_blocks *
 					 (logical_block_size >> 9));
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 4dac35e96a75..a2c4b5c35379 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -59,6 +59,13 @@ enum {
 	SD_LBP_DISABLE,		/* Discard disabled due to failed cmd */
 };
 
+enum {
+	SD_ZERO_WRITE = 0,	/* Use WRITE(10/16) command */
+	SD_ZERO_WS,		/* Use WRITE SAME(10/16) command */
+	SD_ZERO_WS16_UNMAP,	/* Use WRITE SAME(16) with UNMAP */
+	SD_ZERO_WS10_UNMAP,	/* Use WRITE SAME(10) with UNMAP */
+};
+
 struct scsi_disk {
 	struct scsi_driver *driver;	/* always &sd_template */
 	struct scsi_device *device;
@@ -89,6 +96,7 @@ struct scsi_disk {
 	u8		write_prot;
 	u8		protection_type;/* Data Integrity Field */
 	u8		provisioning_mode;
+	u8		zeroing_mode;
 	unsigned	ATO : 1;	/* state of disk ATO bit */
 	unsigned	cache_override : 1; /* temp override of WCE,RCD */
 	unsigned	WCE : 1;	/* state of disk WCE bit */
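
For readers who want to reason about the detection precedence outside the kernel, the policy in the sd_config_write_same() hunk and the REQ_NOUNMAP check in sd_setup_write_zeroes_cmnd() can be sketched as plain userspace C. This is only an illustration and not part of the patch: `struct zero_caps`, `detect_zeroing_mode()` and `zero_sets_unmap()` are hypothetical names, and the enum merely mirrors the SD_ZERO_* values added to sd.h.

```c
#include <stdbool.h>

/* Mirrors the SD_ZERO_* enum added to sd.h (names shortened here). */
enum zero_mode {
	ZERO_WRITE = 0,		/* plain WRITE(10/16) */
	ZERO_WS,		/* WRITE SAME(10/16), zeroed payload */
	ZERO_WS16_UNMAP,	/* WRITE SAME(16) with UNMAP */
	ZERO_WS10_UNMAP,	/* WRITE SAME(10) with UNMAP */
};

/* Hypothetical stand-in for the scsi_disk fields the policy reads. */
struct zero_caps {
	bool lbprz;
	bool lbpws;
	bool lbpws10;
	unsigned int max_ws_blocks;
};

/* Same precedence as the hunk in sd_config_write_same(). */
static enum zero_mode detect_zeroing_mode(const struct zero_caps *c)
{
	if (c->lbprz && c->lbpws)
		return ZERO_WS16_UNMAP;
	if (c->lbprz && c->lbpws10)
		return ZERO_WS10_UNMAP;
	if (c->max_ws_blocks)
		return ZERO_WS;
	return ZERO_WRITE;
}

/*
 * The UNMAP bit is only set for the two _unmap modes, and only when
 * the request does not carry REQ_NOUNMAP.
 */
static bool zero_sets_unmap(enum zero_mode mode, bool req_nounmap)
{
	if (req_nounmap)
		return false;
	return mode == ZERO_WS16_UNMAP || mode == ZERO_WS10_UNMAP;
}
```

Note that LBPWS takes precedence over LBPWS10 when a device reports both, and that a device without LBPRZ=1 never gets an _unmap mode from detection, since unmapped blocks would not be guaranteed to read back as zeroes.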