From patchwork Thu Oct 24 06:50:03 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11208351
From: Damien Le Moal
To: linux-block@vger.kernel.org, Jens Axboe, linux-scsi@vger.kernel.org,
 "Martin K . Petersen", dm-devel@redhat.com, Mike Snitzer
Subject: [PATCH 1/4] block: Enhance blk_revalidate_disk_zones()
Date: Thu, 24 Oct 2019 15:50:03 +0900
Message-Id: <20191024065006.8684-2-damien.lemoal@wdc.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20191024065006.8684-1-damien.lemoal@wdc.com>
References: <20191024065006.8684-1-damien.lemoal@wdc.com>
X-Mailing-List: linux-block@vger.kernel.org

For ZBC and ZAC zoned devices, the scsi driver revalidation processing
implemented by sd_revalidate_disk() includes a call to
sd_zbc_read_zones() which executes a full disk zone report used to
check that all zones of the disk are the same size. This processing is
followed by a call to blk_revalidate_disk_zones(), used to initialize
the device request queue zone bitmaps (zone type and zone write lock
bitmaps). To do so, blk_revalidate_disk_zones() also executes a full
device zone report to obtain zone types. As a result, the entire zoned
block device revalidation process includes two full device zone
reports.

By moving the zone size checks into blk_revalidate_disk_zones(), this
process can be optimized to a single full device zone report, leading
to shorter device scan and revalidation times. This patch implements
this optimization, reducing the original full device zone report
implemented in sd_zbc_check_zones() to a single, small, report zones
command execution to obtain the size of the first zone of the device.
The checks that all zones of the device are the same size as the first
zone are moved to the generic blk_check_zone() function called from
blk_revalidate_disk_zones().

This optimization also has the following benefits:
1) fewer memory allocations in the scsi layer during disk revalidation
   as the potentially large buffer for zone report execution is not
   needed.
2) generic implementation of the zone checks, reducing the burden on
   device drivers, which only need to obtain the zone size and check
   that this size is a power of 2 number of LBAs. Any new type of zoned
   block device will benefit from this.

Signed-off-by: Damien Le Moal
---
 block/blk-zoned.c     |  61 +++++++++++++++++++++++-
 drivers/scsi/sd_zbc.c | 107 ++++++++----------------------------------
 2 files changed, 79 insertions(+), 89 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 4bc5f260248a..293891b7068a 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -441,6 +441,57 @@ void blk_queue_free_zone_bitmaps(struct request_queue *q)
 	q->seq_zones_wlock = NULL;
 }
 
+/**
+ * blk_check_zone - Check zone information
+ * @disk: Target disk
+ * @zone: the zone to check
+ * @sector: start sector of the zone
+ *
+ * Helper function to check zones of a zoned block device. Returns true if the
+ * zone is correct and false if a problem is detected.
+ */
+static bool blk_check_zone(struct gendisk *disk, struct blk_zone *zone,
+			   sector_t *sector)
+{
+	struct request_queue *q = disk->queue;
+	sector_t zone_sectors = blk_queue_zone_sectors(q);
+	sector_t capacity = get_capacity(disk);
+
+	/*
+	 * All zones must have the same size, with the exception of an eventual
+	 * smaller last zone.
+	 */
+	if (zone->start + zone_sectors < capacity &&
+	    zone->len != zone_sectors) {
+		pr_warn("%s: Invalid zone device with non constant zone size\n",
+			disk->disk_name);
+		return false;
+	}
+
+	/* Check for holes in the zone report */
+	if (zone->start != *sector) {
+		pr_warn("%s: Zone gap at sectors %llu..%llu\n",
+			disk->disk_name, *sector, zone->start);
+		return false;
+	}
+
+	/* Check zone type */
+	switch (zone->type) {
+	case BLK_ZONE_TYPE_CONVENTIONAL:
+	case BLK_ZONE_TYPE_SEQWRITE_REQ:
+	case BLK_ZONE_TYPE_SEQWRITE_PREF:
+		break;
+	default:
+		pr_warn("%s: Invalid zone type 0x%x at sectors %llu\n",
+			disk->disk_name, (int)zone->type, zone->start);
+		return false;
+	}
+
+	*sector += zone->len;
+
+	return true;
+}
+
 /**
  * blk_revalidate_disk_zones - (re)allocate and initialize zone bitmaps
  * @disk: Target disk
@@ -490,7 +541,10 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 	if (!seq_zones_bitmap)
 		goto out;
 
-	/* Get zone information and initialize seq_zones_bitmap */
+	/*
+	 * Get zone information to check the zones and initialize
+	 * seq_zones_bitmap.
+	 */
 	rep_nr_zones = nr_zones;
 	zones = blk_alloc_zones(&rep_nr_zones);
 	if (!zones)
@@ -504,11 +558,14 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 		if (!nrz)
 			break;
 		for (i = 0; i < nrz; i++) {
+			if (!blk_check_zone(disk, &zones[i], &sector)) {
+				ret = -ENODEV;
+				goto out;
+			}
 			if (zones[i].type != BLK_ZONE_TYPE_CONVENTIONAL)
 				set_bit(z, seq_zones_bitmap);
 			z++;
 		}
-		sector += nrz * blk_queue_zone_sectors(q);
 	}
 
 	if (WARN_ON(z != nr_zones)) {
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index de4019dc0f0b..fbec99db6124 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -344,32 +344,19 @@ static int sd_zbc_check_zoned_characteristics(struct scsi_disk *sdkp,
  * Returns the zone size in number of blocks upon success or an error code
  * upon failure.
  */
-static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks)
+static int sd_zbc_check_zones(struct scsi_disk *sdkp, unsigned char *buf,
+			      u32 *zblocks)
 {
-	size_t bufsize, buflen;
-	unsigned int noio_flag;
+	size_t buflen;
 	u64 zone_blocks = 0;
-	sector_t max_lba, block = 0;
-	unsigned char *buf;
+	sector_t max_lba;
 	unsigned char *rec;
 	int ret;
-	u8 same;
-
-	/* Do all memory allocations as if GFP_NOIO was specified */
-	noio_flag = memalloc_noio_save();
 
-	/* Get a buffer */
-	buf = sd_zbc_alloc_report_buffer(sdkp, SD_ZBC_REPORT_MAX_ZONES,
-					 &bufsize);
-	if (!buf) {
-		ret = -ENOMEM;
-		goto out;
-	}
-
-	/* Do a report zone to get max_lba and the same field */
-	ret = sd_zbc_do_report_zones(sdkp, buf, bufsize, 0, false);
+	/* Do a report zone to get max_lba and the size of the first zone */
+	ret = sd_zbc_do_report_zones(sdkp, buf, SD_BUF_SIZE, 0, false);
 	if (ret)
-		goto out_free;
+		return ret;
 
 	if (sdkp->rc_basis == 0) {
 		/* The max_lba field is the capacity of this device */
@@ -384,82 +371,28 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks)
 		}
 	}
 
-	/*
-	 * Check same field: for any value other than 0, we know that all zones
-	 * have the same size.
-	 */
-	same = buf[4] & 0x0f;
-	if (same > 0) {
-		rec = &buf[64];
-		zone_blocks = get_unaligned_be64(&rec[8]);
-		goto out;
-	}
-
-	/*
-	 * Check the size of all zones: all zones must be of
-	 * equal size, except the last zone which can be smaller
-	 * than other zones.
-	 */
-	do {
-
-		/* Parse REPORT ZONES header */
-		buflen = min_t(size_t, get_unaligned_be32(&buf[0]) + 64,
-			       bufsize);
-		rec = buf + 64;
-
-		/* Parse zone descriptors */
-		while (rec < buf + buflen) {
-			u64 this_zone_blocks = get_unaligned_be64(&rec[8]);
-
-			if (zone_blocks == 0) {
-				zone_blocks = this_zone_blocks;
-			} else if (this_zone_blocks != zone_blocks &&
-				   (block + this_zone_blocks < sdkp->capacity
-				    || this_zone_blocks > zone_blocks)) {
-				zone_blocks = 0;
-				goto out;
-			}
-			block += this_zone_blocks;
-			rec += 64;
-		}
-
-		if (block < sdkp->capacity) {
-			ret = sd_zbc_do_report_zones(sdkp, buf, bufsize, block,
-						     true);
-			if (ret)
-				goto out_free;
-		}
-
-	} while (block < sdkp->capacity);
-
-out:
-	if (!zone_blocks) {
-		if (sdkp->first_scan)
-			sd_printk(KERN_NOTICE, sdkp,
-				  "Devices with non constant zone "
-				  "size are not supported\n");
-		ret = -ENODEV;
-	} else if (!is_power_of_2(zone_blocks)) {
+	/* Parse REPORT ZONES header */
+	buflen = min_t(size_t, get_unaligned_be32(&buf[0]) + 64, SD_BUF_SIZE);
+	rec = buf + 64;
+	zone_blocks = get_unaligned_be64(&rec[8]);
+	if (!zone_blocks || !is_power_of_2(zone_blocks)) {
 		if (sdkp->first_scan)
 			sd_printk(KERN_NOTICE, sdkp,
 				  "Devices with non power of 2 zone "
 				  "size are not supported\n");
-		ret = -ENODEV;
-	} else if (logical_to_sectors(sdkp->device, zone_blocks) > UINT_MAX) {
+		return -ENODEV;
+	}
+
+	if (logical_to_sectors(sdkp->device, zone_blocks) > UINT_MAX) {
 		if (sdkp->first_scan)
 			sd_printk(KERN_NOTICE, sdkp,
 				  "Zone size too large\n");
-		ret = -EFBIG;
-	} else {
-		*zblocks = zone_blocks;
-		ret = 0;
+		return -EFBIG;
 	}
 
-out_free:
-	memalloc_noio_restore(noio_flag);
-	kvfree(buf);
+	*zblocks = zone_blocks;
 
-	return ret;
+	return 0;
 }
 
 int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
@@ -485,7 +418,7 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	 * Check zone size: only devices with a constant zone size (except
 	 * an eventual last runt zone) that is a power of 2 are supported.
 	 */
-	ret = sd_zbc_check_zones(sdkp, &zone_blocks);
+	ret = sd_zbc_check_zones(sdkp, buf, &zone_blocks);
 	if (ret != 0)
 		goto err;
 

From patchwork Thu Oct 24 06:50:04 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11208355
From: Damien Le Moal
To: linux-block@vger.kernel.org, Jens Axboe, linux-scsi@vger.kernel.org,
 "Martin K . Petersen", dm-devel@redhat.com, Mike Snitzer
Subject: [PATCH 2/4] block: Simplify report zones execution
Date: Thu, 24 Oct 2019 15:50:04 +0900
Message-Id: <20191024065006.8684-3-damien.lemoal@wdc.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20191024065006.8684-1-damien.lemoal@wdc.com>
References: <20191024065006.8684-1-damien.lemoal@wdc.com>
X-Mailing-List: linux-block@vger.kernel.org

All kernel users of blkdev_report_zones(), as well as applications
using it through ioctl(BLKZONEREPORT), expect to potentially get fewer
zone descriptors than requested. As such, the internal report zones
command execution loop implemented by blk_report_zones() is not
necessary and can even be harmful to performance by causing the
execution of inefficient small report zones commands to service the
remainder of a requested zone array.

This patch removes blk_report_zones(), simplifying the code. Also
remove a now incorrect comment in dm_blk_report_zones().
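
Note for readers: the caller-side expectation described above can be
sketched in plain userspace C. This is only an illustration under
stated assumptions, not code from this series: struct zone_desc,
mock_report_zones(), count_all_zones() and the DEV_* constants are all
hypothetical stand-ins. The point it shows is that a caller resumes
from the end of the last zone actually reported instead of assuming the
requested number of zones was filled.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for struct blk_zone. */
struct zone_desc {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone length in sectors */
};

#define DEV_ZONE_SECTORS	8ULL	/* hypothetical zone size */
#define DEV_NR_ZONES		10U	/* hypothetical device: 10 zones */
#define DEV_MAX_PER_REPORT	4U	/* hypothetical per-command limit */

/*
 * Hypothetical report function: it may fill fewer descriptors than
 * requested and updates *nr_zones to the number actually filled.
 */
static int mock_report_zones(uint64_t sector, struct zone_desc *zones,
			     unsigned int *nr_zones)
{
	unsigned int first = (unsigned int)(sector / DEV_ZONE_SECTORS);
	unsigned int n = 0;

	while (n < *nr_zones && n < DEV_MAX_PER_REPORT &&
	       first + n < DEV_NR_ZONES) {
		zones[n].start = (first + n) * DEV_ZONE_SECTORS;
		zones[n].len = DEV_ZONE_SECTORS;
		n++;
	}
	*nr_zones = n;
	return 0;
}

/*
 * Caller-side loop: walk the device, accept short reports and resume
 * after the last zone returned. Returns the number of zones seen, or
 * a non-zero error from the report function.
 */
static int count_all_zones(uint64_t capacity)
{
	struct zone_desc zones[8];
	uint64_t sector = 0;
	int total = 0;

	while (sector < capacity) {
		unsigned int n = 8;	/* more than one report returns */
		int ret = mock_report_zones(sector, zones, &n);

		if (ret)
			return ret;
		if (!n)
			break;
		/* Resume after the last zone actually reported. */
		sector = zones[n - 1].start + zones[n - 1].len;
		total += (int)n;
	}
	return total;
}
```

With the hypothetical 10-zone device, the loop issues three reports
(4, 4 and 2 zones) and never relies on a report being full.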

Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Javier González
---
 block/blk-zoned.c | 34 +++++-----------------------------
 drivers/md/dm.c   |  6 ------
 2 files changed, 5 insertions(+), 35 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 293891b7068a..43bfd1be0985 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -119,31 +119,6 @@ static bool blkdev_report_zone(struct block_device *bdev, struct blk_zone *rep)
 	return true;
 }
 
-static int blk_report_zones(struct gendisk *disk, sector_t sector,
-			    struct blk_zone *zones, unsigned int *nr_zones)
-{
-	struct request_queue *q = disk->queue;
-	unsigned int z = 0, n, nrz = *nr_zones;
-	sector_t capacity = get_capacity(disk);
-	int ret;
-
-	while (z < nrz && sector < capacity) {
-		n = nrz - z;
-		ret = disk->fops->report_zones(disk, sector, &zones[z], &n);
-		if (ret)
-			return ret;
-		if (!n)
-			break;
-		sector += blk_queue_zone_sectors(q) * n;
-		z += n;
-	}
-
-	WARN_ON(z > *nr_zones);
-	*nr_zones = z;
-
-	return 0;
-}
-
 /**
  * blkdev_report_zones - Get zones information
  * @bdev: Target block device
@@ -164,6 +139,7 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 			struct blk_zone *zones, unsigned int *nr_zones)
 {
 	struct request_queue *q = bdev_get_queue(bdev);
+	struct gendisk *disk = bdev->bd_disk;
 	unsigned int i, nrz;
 	int ret;
 
@@ -175,7 +151,7 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 	 * report_zones method. If it does not have one defined, the device
 	 * driver has a bug. So warn about that.
 	 */
-	if (WARN_ON_ONCE(!bdev->bd_disk->fops->report_zones))
+	if (WARN_ON_ONCE(!disk->fops->report_zones))
 		return -EOPNOTSUPP;
 
 	if (!*nr_zones || sector >= bdev->bd_part->nr_sects) {
@@ -185,8 +161,8 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 
 	nrz = min(*nr_zones,
 		  __blkdev_nr_zones(q, bdev->bd_part->nr_sects - sector));
-	ret = blk_report_zones(bdev->bd_disk, get_start_sect(bdev) + sector,
-			       zones, &nrz);
+	ret = disk->fops->report_zones(disk, get_start_sect(bdev) + sector,
+				       zones, &nrz);
 	if (ret)
 		return ret;
 
@@ -552,7 +528,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 
 	while (z < nr_zones) {
 		nrz = min(nr_zones - z, rep_nr_zones);
-		ret = blk_report_zones(disk, sector, zones, &nrz);
+		ret = disk->fops->report_zones(disk, sector, zones, &nrz);
 		if (ret)
 			goto out;
 		if (!nrz)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1a5e328c443a..647aa5b0233b 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -473,12 +473,6 @@ static int dm_blk_report_zones(struct gendisk *disk, sector_t sector,
 		goto out;
 	}
 
-	/*
-	 * blkdev_report_zones() will loop and call this again to cover all the
-	 * zones of the target, eventually moving on to the next target.
-	 * So there is no need to loop here trying to fill the entire array
-	 * of zones.
-	 */
 	ret = tgt->type->report_zones(tgt, sector, zones, nr_zones);
 
 out:

From patchwork Thu Oct 24 06:50:05 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11208361
From: Damien Le Moal
To: linux-block@vger.kernel.org, Jens Axboe, linux-scsi@vger.kernel.org,
 "Martin K . Petersen", dm-devel@redhat.com, Mike Snitzer
Subject: [PATCH 3/4] block: Introduce report zones queue limits
Date: Thu, 24 Oct 2019 15:50:05 +0900
Message-Id: <20191024065006.8684-4-damien.lemoal@wdc.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20191024065006.8684-1-damien.lemoal@wdc.com>
References: <20191024065006.8684-1-damien.lemoal@wdc.com>
X-Mailing-List: linux-block@vger.kernel.org

In preparation for adding a generic report zones command buffer
allocation to the block layer, introduce three new request queue limits
describing the device zone descriptor size (zone_descriptor_size
limit), the needed granularity of the report zones command buffer size
(zones_report_granularity limit) and the maximum size of a report zones
command (max_zones_report_size limit).

For scsi, set these values respectively to 64 bytes, SECTOR_SIZE and
the maximum transfer size used for regular read/write commands limited
by the maximum number of pages (segments) that the hardware can map.
This removes the need for the "magic" limit implemented with the macro
SD_ZBC_REPORT_MAX_ZONES.

For the null_blk driver and dm targets, the default value of 0 is used
for these limits, indicating that these zoned devices do not need a
buffer for the execution of report zones.
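
Note for readers: one way the three limits described above could
combine into a buffer size is sketched below in plain userspace C. It
is an illustration under stated assumptions, not the kernel code:
round_up_to() and report_buffer_size() are hypothetical helper names,
and the early return for a zero descriptor size simply mirrors the
statement that a value of 0 means no buffer is needed.

```c
#include <stddef.h>

/* Round x up to the next multiple of step (step must be non-zero). */
static size_t round_up_to(size_t x, size_t step)
{
	return ((x + step - 1) / step) * step;
}

/*
 * Hypothetical helper: size in bytes of a report zones buffer for
 * nr_zones descriptors, given the three limits. A descriptor size of
 * zero means the device needs no buffer at all.
 */
static size_t report_buffer_size(unsigned int nr_zones, size_t desc_size,
				 size_t granularity, size_t max_size)
{
	size_t bufsize;

	if (!desc_size)
		return 0;

	bufsize = (size_t)nr_zones * desc_size;
	if (granularity)
		bufsize = round_up_to(bufsize, granularity);

	/* Never exceed what a single report zones command can carry. */
	return bufsize < max_size ? bufsize : max_size;
}
```

With the scsi values from the text (64-byte descriptors, 512-byte
granularity), 100 zones need 6400 bytes, which rounds up to 6656 bytes
as long as the maximum report size is larger than that.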

Signed-off-by: Damien Le Moal
---
 block/blk-settings.c   |  3 +++
 drivers/scsi/sd_zbc.c  | 48 +++++++++++++++++++++---------------------
 include/linux/blkdev.h |  4 ++++
 3 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 5f6dcc7a47bd..674cfc428334 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -53,6 +53,9 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->discard_granularity = 0;
 	lim->discard_alignment = 0;
 	lim->discard_misaligned = 0;
+	lim->zone_descriptor_size = 0;
+	lim->zones_report_granularity = 0;
+	lim->max_zones_report_size = 0;
 	lim->logical_block_size = lim->physical_block_size = lim->io_min = 512;
 	lim->bounce_pfn = (unsigned long)(BLK_BOUNCE_ANY >> PAGE_SHIFT);
 	lim->alignment_offset = 0;
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index fbec99db6124..8dc96f4ea920 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -104,11 +104,6 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 	return 0;
 }
 
-/*
- * Maximum number of zones to get with one report zones command.
- */
-#define SD_ZBC_REPORT_MAX_ZONES		8192U
-
 /**
  * Allocate a buffer for report zones reply.
  * @sdkp: The target disk
@@ -129,21 +124,8 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
 	size_t bufsize;
 	void *buf;
 
-	/*
-	 * Report zone buffer size should be at most 64B times the number of
-	 * zones requested plus the 64B reply header, but should be at least
-	 * SECTOR_SIZE for ATA devices.
-	 * Make sure that this size does not exceed the hardware capabilities.
-	 * Furthermore, since the report zone command cannot be split, make
-	 * sure that the allocated buffer can always be mapped by limiting the
-	 * number of pages allocated to the HBA max segments limit.
-	 */
-	nr_zones = min(nr_zones, SD_ZBC_REPORT_MAX_ZONES);
-	bufsize = roundup((nr_zones + 1) * 64, 512);
-	bufsize = min_t(size_t, bufsize,
-			queue_max_hw_sectors(q) << SECTOR_SHIFT);
-	bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);
-
+	bufsize = min_t(size_t, roundup(nr_zones * 64, SECTOR_SIZE),
+			q->limits.max_zones_report_size);
 	buf = vzalloc(bufsize);
 	if (buf)
 		*buflen = bufsize;
@@ -398,6 +380,8 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, unsigned char *buf,
 int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 {
 	struct gendisk *disk = sdkp->disk;
+	struct request_queue *q = disk->queue;
+	unsigned int max_zones_report_size;
 	unsigned int nr_zones;
 	u32 zone_blocks = 0;
 	int ret;
@@ -423,13 +407,29 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 		goto err;
 
 	/* The drive satisfies the kernel restrictions: set it up */
-	blk_queue_chunk_sectors(sdkp->disk->queue,
+	blk_queue_chunk_sectors(q,
 			logical_to_sectors(sdkp->device, zone_blocks));
-	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, sdkp->disk->queue);
-	blk_queue_required_elevator_features(sdkp->disk->queue,
-					     ELEVATOR_F_ZBD_SEQ_WRITE);
+	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
+	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
 	nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
 
+	/*
+	 * Zone descriptors are 64 bytes. A report zone buffer size should be
+	 * at most 64B times the number of zones of the device plus a 64B reply
+	 * header and should be at least SECTOR_SIZE bytes for ATA devices.
+	 * Make sure that this maximum buffer size does not exceed the hardware
+	 * capabilities in terms of maximum data transfer size. Furthermore,
+	 * make sure that the allocated buffer can always be mapped by limiting
+	 * the number of pages of the buffer to the device max segments limit.
+	 */
+	q->limits.zone_descriptor_size = 64;
+	q->limits.zones_report_granularity = SECTOR_SIZE;
+	max_zones_report_size = min(roundup((nr_zones + 1) * 64, SECTOR_SIZE),
+				    queue_max_hw_sectors(q) << SECTOR_SHIFT);
+	q->limits.max_zones_report_size =
+		min(max_zones_report_size,
+		    (unsigned int)queue_max_segments(q) << PAGE_SHIFT);
+
 	/* READ16/WRITE16 is mandatory for ZBC disks */
 	sdkp->device->use_16_for_rw = 1;
 	sdkp->device->use_10_for_rw = 0;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f3ea78b0c91c..1c76d71fc232 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -338,6 +338,10 @@ struct queue_limits {
 	unsigned int		discard_granularity;
 	unsigned int		discard_alignment;
 
+	unsigned int		zone_descriptor_size;
+	unsigned int		zones_report_granularity;
+	unsigned int		max_zones_report_size;
+
 	unsigned short		logical_block_size;
 	unsigned short		max_segments;
 	unsigned short		max_integrity_segments;

From patchwork Thu Oct 24 06:50:06 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11208359
From: Damien Le Moal
To: linux-block@vger.kernel.org, Jens Axboe, linux-scsi@vger.kernel.org,
 "Martin K . Petersen", dm-devel@redhat.com, Mike Snitzer
Subject: [PATCH 4/4] block: Generically handle report zones buffer
Date: Thu, 24 Oct 2019 15:50:06 +0900
Message-Id: <20191024065006.8684-5-damien.lemoal@wdc.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20191024065006.8684-1-damien.lemoal@wdc.com>
References: <20191024065006.8684-1-damien.lemoal@wdc.com>
X-Mailing-List: linux-block@vger.kernel.org

Instead of relying on a zoned block device driver to allocate a buffer
for every execution of a report zones command, rely on the block layer
use of the device zone report queue limits to allocate a buffer and
keep it around when the device report_zones method is executed in a
loop, e.g. as in blk_revalidate_disk_zones().

This simplifies the code in the scsi sd_zbc driver and also simplifies
the addition of zone support for upcoming new zoned device drivers.

Signed-off-by: Damien Le Moal
---
 block/blk-zoned.c              | 99 ++++++++++++++++++++--------------
 drivers/block/null_blk.h       |  6 ++-
 drivers/block/null_blk_zoned.c |  3 +-
 drivers/md/dm.c                |  3 +-
 drivers/scsi/sd.h              |  3 +-
 drivers/scsi/sd_zbc.c          | 61 ++++++---------------
 include/linux/blkdev.h         |  8 +--
 7 files changed, 88 insertions(+), 95 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 43bfd1be0985..6bddaa505df0 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -97,6 +97,29 @@ unsigned int blkdev_nr_zones(struct block_device *bdev)
 }
 EXPORT_SYMBOL_GPL(blkdev_nr_zones);
 
+/*
+ * Allocate a buffer to execute report zones.
+ */ +static void *blk_alloc_report_buffer(struct request_queue *q, + unsigned int *nr_zones, size_t *buflen) +{ + unsigned int nrz = *nr_zones; + size_t bufsize = nrz * q->limits.zone_descriptor_size; + void *buf; + + if (q->limits.zones_report_granularity) + bufsize = roundup(bufsize, q->limits.zones_report_granularity); + bufsize = min_t(size_t, bufsize, q->limits.max_zones_report_size); + buf = vzalloc(bufsize); + if (buf) { + *buflen = bufsize; + *nr_zones = min_t(unsigned int, nrz, + bufsize / q->limits.zone_descriptor_size); + } + + return buf; +} + /* * Check that a zone report belongs to this partition, and if yes, fix its start * sector and write pointer and return true. Return false otherwise. @@ -140,7 +163,10 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector, { struct request_queue *q = bdev_get_queue(bdev); struct gendisk *disk = bdev->bd_disk; - unsigned int i, nrz; + unsigned int i, nrz = *nr_zones; + sector_t capacity = bdev->bd_part->nr_sects; + size_t buflen = 0; + void *buf = NULL; int ret; if (!blk_queue_is_zoned(q)) @@ -154,27 +180,33 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector, if (WARN_ON_ONCE(!disk->fops->report_zones)) return -EOPNOTSUPP; - if (!*nr_zones || sector >= bdev->bd_part->nr_sects) { + if (!nrz || sector >= capacity) { *nr_zones = 0; return 0; } - nrz = min(*nr_zones, - __blkdev_nr_zones(q, bdev->bd_part->nr_sects - sector)); - ret = disk->fops->report_zones(disk, get_start_sect(bdev) + sector, - zones, &nrz); + nrz = min(nrz, __blkdev_nr_zones(q, capacity - sector)); + if (q->limits.zone_descriptor_size) { + buf = blk_alloc_report_buffer(q, &nrz, &buflen); + if (!buf) + return -ENOMEM; + } + + ret = disk->fops->report_zones(disk, sector, zones, &nrz, buf, buflen); if (ret) - return ret; + goto out; for (i = 0; i < nrz; i++) { if (!blkdev_report_zone(bdev, zones)) break; zones++; } - *nr_zones = i; - return 0; +out: + kvfree(buf); + + return ret; } 
EXPORT_SYMBOL_GPL(blkdev_report_zones); @@ -384,31 +416,6 @@ static inline unsigned long *blk_alloc_zone_bitmap(int node, GFP_NOIO, node); } -/* - * Allocate an array of struct blk_zone to get nr_zones zone information. - * The allocated array may be smaller than nr_zones. - */ -static struct blk_zone *blk_alloc_zones(unsigned int *nr_zones) -{ - struct blk_zone *zones; - size_t nrz = min(*nr_zones, BLK_ZONED_REPORT_MAX_ZONES); - - /* - * GFP_KERNEL here is meaningless as the caller task context has - * the PF_MEMALLOC_NOIO flag set in blk_revalidate_disk_zones() - * with memalloc_noio_save(). - */ - zones = kvcalloc(nrz, sizeof(struct blk_zone), GFP_KERNEL); - if (!zones) { - *nr_zones = 0; - return NULL; - } - - *nr_zones = nrz; - - return zones; -} - void blk_queue_free_zone_bitmaps(struct request_queue *q) { kfree(q->seq_zones_bitmap); @@ -482,10 +489,12 @@ int blk_revalidate_disk_zones(struct gendisk *disk) struct request_queue *q = disk->queue; unsigned int nr_zones = __blkdev_nr_zones(q, get_capacity(disk)); unsigned long *seq_zones_wlock = NULL, *seq_zones_bitmap = NULL; - unsigned int i, rep_nr_zones = 0, z = 0, nrz; + unsigned int i, rep_nr_zones, z = 0, nrz; struct blk_zone *zones = NULL; unsigned int noio_flag; sector_t sector = 0; + size_t buflen = 0; + void *buf = NULL; int ret = 0; /* @@ -518,17 +527,28 @@ int blk_revalidate_disk_zones(struct gendisk *disk) goto out; /* - * Get zone information to check the zones and initialize - * seq_zones_bitmap. + * Allocate a report buffer for the driver execution of report zones + * and an array of zones to get the report back. */ rep_nr_zones = nr_zones; - zones = blk_alloc_zones(&rep_nr_zones); + if (q->limits.zone_descriptor_size) { + buf = blk_alloc_report_buffer(q, &rep_nr_zones, &buflen); + if (!buf) + goto out; + } + + zones = kvcalloc(rep_nr_zones, sizeof(struct blk_zone), GFP_KERNEL); if (!zones) goto out; + /* + * Get zone information to check the zones and initialize + * seq_zones_bitmap. 
+ */ while (z < nr_zones) { nrz = min(nr_zones - z, rep_nr_zones); - ret = disk->fops->report_zones(disk, sector, zones, &nrz); + ret = disk->fops->report_zones(disk, sector, zones, &nrz, + buf, buflen); if (ret) goto out; if (!nrz) @@ -565,6 +585,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk) memalloc_noio_restore(noio_flag); kvfree(zones); + kvfree(buf); kfree(seq_zones_wlock); kfree(seq_zones_bitmap); diff --git a/drivers/block/null_blk.h b/drivers/block/null_blk.h index 93c2a3d403da..6bd0482ec683 100644 --- a/drivers/block/null_blk.h +++ b/drivers/block/null_blk.h @@ -92,7 +92,8 @@ struct nullb { int null_zone_init(struct nullb_device *dev); void null_zone_exit(struct nullb_device *dev); int null_zone_report(struct gendisk *disk, sector_t sector, - struct blk_zone *zones, unsigned int *nr_zones); + struct blk_zone *zones, unsigned int *nr_zones, + void *buf, size_t buflen); blk_status_t null_handle_zoned(struct nullb_cmd *cmd, enum req_opf op, sector_t sector, sector_t nr_sectors); @@ -107,7 +108,8 @@ static inline int null_zone_init(struct nullb_device *dev) static inline void null_zone_exit(struct nullb_device *dev) {} static inline int null_zone_report(struct gendisk *disk, sector_t sector, struct blk_zone *zones, - unsigned int *nr_zones) + unsigned int *nr_zones, + void *buf, size_t buflen) { return -EOPNOTSUPP; } diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c index 4e56b17ed3ef..446e083be240 100644 --- a/drivers/block/null_blk_zoned.c +++ b/drivers/block/null_blk_zoned.c @@ -67,7 +67,8 @@ void null_zone_exit(struct nullb_device *dev) } int null_zone_report(struct gendisk *disk, sector_t sector, - struct blk_zone *zones, unsigned int *nr_zones) + struct blk_zone *zones, unsigned int *nr_zones, + void *buf, size_t buflen) { struct nullb *nullb = disk->private_data; struct nullb_device *dev = nullb->dev; diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 647aa5b0233b..5d5a297ceeb1 100644 --- a/drivers/md/dm.c +++ 
b/drivers/md/dm.c @@ -441,7 +441,8 @@ static int dm_blk_getgeo(struct block_device *bdev, struct hd_geometry *geo) } static int dm_blk_report_zones(struct gendisk *disk, sector_t sector, - struct blk_zone *zones, unsigned int *nr_zones) + struct blk_zone *zones, unsigned int *nr_zones, + void *buf, size_t buflen) { #ifdef CONFIG_BLK_DEV_ZONED struct mapped_device *md = disk->private_data; diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h index 1eab779f812b..b948656b6882 100644 --- a/drivers/scsi/sd.h +++ b/drivers/scsi/sd.h @@ -213,7 +213,8 @@ extern blk_status_t sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd, bool all); extern void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes, struct scsi_sense_hdr *sshdr); extern int sd_zbc_report_zones(struct gendisk *disk, sector_t sector, - struct blk_zone *zones, unsigned int *nr_zones); + struct blk_zone *zones, unsigned int *nr_zones, + void *buf, size_t buflen); #else /* CONFIG_BLK_DEV_ZONED */ diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c index 8dc96f4ea920..228522c4338f 100644 --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -104,35 +104,6 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf, return 0; } -/** - * Allocate a buffer for report zones reply. - * @sdkp: The target disk - * @nr_zones: Maximum number of zones to report - * @buflen: Size of the buffer allocated - * - * Try to allocate a reply buffer for the number of requested zones. - * The size of the buffer allocated may be smaller than requested to - * satify the device constraint (max_hw_sectors, max_segments, etc). - * - * Return the address of the allocated buffer and update @buflen with - * the size of the allocated buffer. 
- */ -static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp, - unsigned int nr_zones, size_t *buflen) -{ - struct request_queue *q = sdkp->disk->queue; - size_t bufsize; - void *buf; - - bufsize = min_t(size_t, roundup(nr_zones * 64, SECTOR_SIZE), - q->limits.max_zones_report_size); - buf = vzalloc(bufsize); - if (buf) - *buflen = bufsize; - - return buf; -} - /** * sd_zbc_report_zones - Disk report zones operation. * @disk: The target disk @@ -143,40 +114,40 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp, * Execute a report zones command on the target disk. */ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector, - struct blk_zone *zones, unsigned int *nr_zones) + struct blk_zone *zones, unsigned int *nr_zones, + void *buf, size_t buflen) { struct scsi_disk *sdkp = scsi_disk(disk); unsigned int i, nrz = *nr_zones; - unsigned char *buf; - size_t buflen = 0, offset = 0; - int ret = 0; + unsigned char *rep_buf = buf; + size_t offset = 0; + int ret; if (!sd_is_zoned(sdkp)) /* Not a zoned device */ return -EOPNOTSUPP; - buf = sd_zbc_alloc_report_buffer(sdkp, nrz, &buflen); - if (!buf) - return -ENOMEM; - - ret = sd_zbc_do_report_zones(sdkp, buf, buflen, + /* + * The buffer prepared by the block layer may be too large for the + * number of zones requested. Tune it here to avoid requesting + * more zones than necessary. 
+ */ + buflen = min_t(size_t, roundup((nrz + 1) * 64, SECTOR_SIZE), buflen); + ret = sd_zbc_do_report_zones(sdkp, rep_buf, buflen, sectors_to_logical(sdkp->device, sector), true); if (ret) - goto out; + return ret; - nrz = min(nrz, get_unaligned_be32(&buf[0]) / 64); + nrz = min(nrz, get_unaligned_be32(&rep_buf[0]) / 64); for (i = 0; i < nrz; i++) { offset += 64; - sd_zbc_parse_report(sdkp, buf + offset, zones); + sd_zbc_parse_report(sdkp, rep_buf + offset, zones); zones++; } *nr_zones = nrz; -out: - kvfree(buf); - - return ret; + return 0; } /** diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 1c76d71fc232..f04927a7fb40 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -355,11 +355,6 @@ struct queue_limits { #ifdef CONFIG_BLK_DEV_ZONED -/* - * Maximum number of zones to report with a single report zones command. - */ -#define BLK_ZONED_REPORT_MAX_ZONES 8192U - extern unsigned int blkdev_nr_zones(struct block_device *bdev); extern int blkdev_report_zones(struct block_device *bdev, sector_t sector, struct blk_zone *zones, @@ -1713,7 +1708,8 @@ struct block_device_operations { /* this callback is with swap_lock and sometimes page table lock held */ void (*swap_slot_free_notify) (struct block_device *, unsigned long); int (*report_zones)(struct gendisk *, sector_t sector, - struct blk_zone *zones, unsigned int *nr_zones); + struct blk_zone *zones, unsigned int *nr_zones, + void *buf, size_t buflen); struct module *owner; const struct pr_ops *pr_ops; };
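
The buffer sizing done by blk_alloc_report_buffer() in this patch can be checked outside the kernel. The following is only a userspace sketch of that arithmetic, not kernel code: struct zone_report_limits and report_buffer_size() are hypothetical names introduced for illustration, the three fields mirror the queue limits the patch reads, and the vzalloc() call itself is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three queue limits read by the patch. */
struct zone_report_limits {
	size_t zone_descriptor_size;     /* bytes per zone descriptor */
	size_t zones_report_granularity; /* 0 means no rounding is applied */
	size_t max_zones_report_size;    /* upper bound on the buffer size */
};

/*
 * Return the report buffer size for *nr_zones zones and clamp *nr_zones
 * to the number of descriptors that fit in that buffer. As in the patch,
 * the caller is assumed to have checked that zone_descriptor_size is not 0.
 */
static size_t report_buffer_size(const struct zone_report_limits *lim,
				 unsigned int *nr_zones)
{
	size_t gran = lim->zones_report_granularity;
	size_t bufsize, fit;

	assert(lim->zone_descriptor_size != 0);

	bufsize = (size_t)*nr_zones * lim->zone_descriptor_size;

	/* Round up to the reporting granularity, if the device has one. */
	if (gran)
		bufsize = ((bufsize + gran - 1) / gran) * gran;

	/* Never exceed the largest report the device can transfer. */
	if (bufsize > lim->max_zones_report_size)
		bufsize = lim->max_zones_report_size;

	/* Only as many zones as fit in the final buffer can be reported. */
	fit = bufsize / lim->zone_descriptor_size;
	if (fit < *nr_zones)
		*nr_zones = (unsigned int)fit;

	return bufsize;
}
```

With assumed limits of a 64 B descriptor, 512 B granularity and a 1 MiB maximum report size, a request for 10 zones yields a 1024 B buffer and leaves the zone count at 10, while a request for 100000 zones is capped at a 1 MiB buffer and 16384 zones.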