From patchwork Mon Jul 1 05:09:15 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11024833
X-Patchwork-Delegate: snitzer@redhat.com
From: Damien Le Moal
To: linux-scsi@vger.kernel.org, "Martin K . Petersen",
	linux-block@vger.kernel.org, Jens Axboe, dm-devel@redhat.com,
	Mike Snitzer, linux-f2fs-devel@lists.sourceforge.net, Jaegeuk Kim
Date: Mon, 1 Jul 2019 14:09:15 +0900
Message-Id: <20190701050918.27511-2-damien.lemoal@wdc.com>
In-Reply-To: <20190701050918.27511-1-damien.lemoal@wdc.com>
References: <20190701050918.27511-1-damien.lemoal@wdc.com>
Cc: Christoph Hellwig, Bart Van Assche
Subject: [dm-devel] [PATCH V6 1/4] block: Allow mapping of vmalloc-ed buffers
List-Id: device-mapper development
Sender: dm-devel-bounces@redhat.com

To allow the SCSI subsystem scsi_execute_req() function to issue
requests using large buffers that are better allocated with vmalloc()
rather than kmalloc(), modify bio_map_kern() to allow passing a buffer
allocated with vmalloc().

To do so, detect vmalloc-ed buffers using is_vmalloc_addr(). For
vmalloc-ed buffers, flush the buffer using flush_kernel_vmap_range(),
use vmalloc_to_page() instead of virt_to_page() to obtain the pages of
the buffer, and invalidate the buffer addresses with
invalidate_kernel_vmap_range() on completion of read BIOs. This last
point is executed using the function bio_invalidate_vmalloc_pages()
which is defined only if the architecture defines
ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE, that is, if the architecture
actually needs the invalidation done.

Fixes: 515ce6061312 ("scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation")
Fixes: e76239a3748c ("block: add a report_zones method")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Chaitanya Kulkarni
Reviewed-by: Ming Lei
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
---
 block/bio.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index 2050bb4aacb5..3b6e35f73fd7 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include "blk.h"
@@ -1479,8 +1480,22 @@ void bio_unmap_user(struct bio *bio)
 	bio_put(bio);
 }
 
+static void bio_invalidate_vmalloc_pages(struct bio *bio)
+{
+#ifdef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
+	if (bio->bi_private && !op_is_write(bio_op(bio))) {
+		unsigned long i, len = 0;
+
+		for (i = 0; i < bio->bi_vcnt; i++)
+			len += bio->bi_io_vec[i].bv_len;
+		invalidate_kernel_vmap_range(bio->bi_private, len);
+	}
+#endif
+}
+
 static void bio_map_kern_endio(struct bio *bio)
 {
+	bio_invalidate_vmalloc_pages(bio);
 	bio_put(bio);
 }
 
@@ -1501,6 +1516,8 @@ struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len,
 	unsigned long end = (kaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	unsigned long start = kaddr >> PAGE_SHIFT;
 	const int nr_pages = end - start;
+	bool is_vmalloc = is_vmalloc_addr(data);
+	struct page *page;
 	int offset, i;
 	struct bio *bio;
 
@@ -1508,6 +1525,11 @@ struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len,
 	if (!bio)
 		return ERR_PTR(-ENOMEM);
 
+	if (is_vmalloc) {
+		flush_kernel_vmap_range(data, len);
+		bio->bi_private = data;
+	}
+
 	offset = offset_in_page(kaddr);
 	for (i = 0; i < nr_pages; i++) {
 		unsigned int bytes = PAGE_SIZE - offset;
@@ -1518,7 +1540,11 @@ struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len,
 		if (bytes > len)
 			bytes = len;
 
-		if (bio_add_pc_page(q, bio, virt_to_page(data), bytes,
+		if (!is_vmalloc)
+			page = virt_to_page(data);
+		else
+			page = vmalloc_to_page(data);
+		if (bio_add_pc_page(q, bio, page, bytes,
 				    offset) < bytes) {
 			/* we don't support partial mappings */
 			bio_put(bio);

From patchwork Mon Jul 1 05:09:16 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11024839
X-Patchwork-Delegate: snitzer@redhat.com
From: Damien Le Moal
To: linux-scsi@vger.kernel.org, "Martin K . Petersen",
	linux-block@vger.kernel.org, Jens Axboe, dm-devel@redhat.com,
	Mike Snitzer, linux-f2fs-devel@lists.sourceforge.net, Jaegeuk Kim
Date: Mon, 1 Jul 2019 14:09:16 +0900
Message-Id: <20190701050918.27511-3-damien.lemoal@wdc.com>
In-Reply-To: <20190701050918.27511-1-damien.lemoal@wdc.com>
References: <20190701050918.27511-1-damien.lemoal@wdc.com>
Cc: Christoph Hellwig, Bart Van Assche
Subject: [dm-devel] [PATCH V6 2/4] block: Kill gfp_t argument of blkdev_report_zones()
List-Id: device-mapper development
Sender: dm-devel-bounces@redhat.com

Only GFP_KERNEL and GFP_NOIO are used with blkdev_report_zones(). In
preparation for using vmalloc() for large report buffer and zone array
allocations used by this function, remove its "gfp_t gfp_mask" argument
and rely on the caller context to use memalloc_noio_save/restore()
where necessary (block layer zone revalidation and dm-zoned I/O error
path).

Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
---
 block/blk-zoned.c              | 31 +++++++++++++++++++------------
 drivers/block/null_blk.h       |  3 +--
 drivers/block/null_blk_zoned.c |  3 +--
 drivers/md/dm-flakey.c         |  5 ++---
 drivers/md/dm-linear.c         |  5 ++---
 drivers/md/dm-zoned-metadata.c | 16 ++++++++++++----
 drivers/md/dm.c                |  6 ++----
 drivers/scsi/sd.h              |  3 +--
 drivers/scsi/sd_zbc.c          |  6 ++----
 fs/f2fs/super.c                |  4 +---
 include/linux/blkdev.h         |  5 ++---
 include/linux/device-mapper.h  |  3 +--
 12 files changed, 46 insertions(+), 44 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index ae7e91bd0618..60dfc3f22350 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 
 #include "blk.h"
 
@@ -117,8 +118,7 @@ static bool blkdev_report_zone(struct block_device *bdev, struct blk_zone *rep)
 }
 
 static int blk_report_zones(struct gendisk *disk, sector_t sector,
-			    struct blk_zone *zones, unsigned int *nr_zones,
-			    gfp_t gfp_mask)
+			    struct blk_zone *zones, unsigned int *nr_zones)
 {
 	struct request_queue *q = disk->queue;
 	unsigned int z = 0, n, nrz = *nr_zones;
@@ -127,8 +127,7 @@ static int blk_report_zones(struct gendisk *disk, sector_t sector,
 
 	while (z < nrz && sector < capacity) {
 		n = nrz - z;
-		ret = disk->fops->report_zones(disk, sector, &zones[z], &n,
-					       gfp_mask);
+		ret = disk->fops->report_zones(disk, sector, &zones[z], &n);
 		if (ret)
 			return ret;
 		if (!n)
@@ -149,17 +148,18 @@ static int blk_report_zones(struct gendisk *disk, sector_t sector,
  * @sector:	Sector from which to report zones
  * @zones:	Array of zone structures where to return the zones information
  * @nr_zones:	Number of zone structures in the zone array
- * @gfp_mask:	Memory allocation flags (for bio_alloc)
  *
  * Description:
  *    Get zone information starting from the zone containing @sector.
  *    The number of zone information reported may be less than the number
  *    requested by @nr_zones. The number of zones actually reported is
  *    returned in @nr_zones.
+ *    The caller must use memalloc_noXX_save/restore() calls to control
+ *    memory allocations done within this function (zone array and command
+ *    buffer allocation by the device driver).
  */
 int blkdev_report_zones(struct block_device *bdev, sector_t sector,
-			struct blk_zone *zones, unsigned int *nr_zones,
-			gfp_t gfp_mask)
+			struct blk_zone *zones, unsigned int *nr_zones)
 {
 	struct request_queue *q = bdev_get_queue(bdev);
 	unsigned int i, nrz;
@@ -184,7 +184,7 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 	nrz = min(*nr_zones,
 		  __blkdev_nr_zones(q, bdev->bd_part->nr_sects - sector));
 	ret = blk_report_zones(bdev->bd_disk, get_start_sect(bdev) + sector,
-			       zones, &nrz, gfp_mask);
+			       zones, &nrz);
 	if (ret)
 		return ret;
 
@@ -305,9 +305,7 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
 	if (!zones)
 		return -ENOMEM;
 
-	ret = blkdev_report_zones(bdev, rep.sector,
-				  zones, &rep.nr_zones,
-				  GFP_KERNEL);
+	ret = blkdev_report_zones(bdev, rep.sector, zones, &rep.nr_zones);
 	if (ret)
 		goto out;
 
@@ -415,6 +413,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 	unsigned long *seq_zones_wlock = NULL, *seq_zones_bitmap = NULL;
 	unsigned int i, rep_nr_zones = 0, z = 0, nrz;
 	struct blk_zone *zones = NULL;
+	unsigned int noio_flag;
 	sector_t sector = 0;
 	int ret = 0;
 
@@ -427,6 +426,12 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 		return 0;
 	}
 
+	/*
+	 * Ensure that all memory allocations in this context are done as
+	 * if GFP_NOIO was specified.
+	 */
+	noio_flag = memalloc_noio_save();
+
 	if (!blk_queue_is_zoned(q) || !nr_zones) {
 		nr_zones = 0;
 		goto update;
@@ -449,7 +454,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 
 	while (z < nr_zones) {
 		nrz = min(nr_zones - z, rep_nr_zones);
-		ret = blk_report_zones(disk, sector, zones, &nrz, GFP_NOIO);
+		ret = blk_report_zones(disk, sector, zones, &nrz);
 		if (ret)
 			goto out;
 		if (!nrz)
@@ -480,6 +485,8 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 	blk_mq_unfreeze_queue(q);
 
 out:
+	memalloc_noio_restore(noio_flag);
+
 	free_pages((unsigned long)zones,
 		   get_order(rep_nr_zones * sizeof(struct blk_zone)));
 	kfree(seq_zones_wlock);
diff --git a/drivers/block/null_blk.h b/drivers/block/null_blk.h
index 34b22d6523ba..4b9bbe3bb5a1 100644
--- a/drivers/block/null_blk.h
+++ b/drivers/block/null_blk.h
@@ -89,8 +89,7 @@ struct nullb {
 int null_zone_init(struct nullb_device *dev);
 void null_zone_exit(struct nullb_device *dev);
 int null_zone_report(struct gendisk *disk, sector_t sector,
-		     struct blk_zone *zones, unsigned int *nr_zones,
-		     gfp_t gfp_mask);
+		     struct blk_zone *zones, unsigned int *nr_zones);
 void null_zone_write(struct nullb_cmd *cmd, sector_t sector,
 			unsigned int nr_sectors);
 void null_zone_reset(struct nullb_cmd *cmd, sector_t sector);
diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
index fca0c97ff1aa..cb28d93f2bd1 100644
--- a/drivers/block/null_blk_zoned.c
+++ b/drivers/block/null_blk_zoned.c
@@ -67,8 +67,7 @@ void null_zone_exit(struct nullb_device *dev)
 }
 
 int null_zone_report(struct gendisk *disk, sector_t sector,
-		     struct blk_zone *zones, unsigned int *nr_zones,
-		     gfp_t gfp_mask)
+		     struct blk_zone *zones, unsigned int *nr_zones)
 {
 	struct nullb *nullb = disk->private_data;
 	struct nullb_device *dev = nullb->dev;
diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index a9bc518156f2..2900fbde89b3 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -461,15 +461,14 @@ static int flakey_prepare_ioctl(struct dm_target *ti, struct block_device **bdev
 
 #ifdef CONFIG_BLK_DEV_ZONED
 static int flakey_report_zones(struct dm_target *ti, sector_t sector,
-			       struct blk_zone *zones, unsigned int *nr_zones,
-			       gfp_t gfp_mask)
+			       struct blk_zone *zones, unsigned int *nr_zones)
 {
 	struct flakey_c *fc = ti->private;
 	int ret;
 
 	/* Do report and remap it */
 	ret = blkdev_report_zones(fc->dev->bdev, flakey_map_sector(ti, sector),
-				  zones, nr_zones, gfp_mask);
+				  zones, nr_zones);
 	if (ret != 0)
 		return ret;
 
diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index ad980a38fb1e..ecefe6703736 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -137,15 +137,14 @@ static int linear_prepare_ioctl(struct dm_target *ti, struct block_device **bdev
 
 #ifdef CONFIG_BLK_DEV_ZONED
 static int linear_report_zones(struct dm_target *ti, sector_t sector,
-			       struct blk_zone *zones, unsigned int *nr_zones,
-			       gfp_t gfp_mask)
+			       struct blk_zone *zones, unsigned int *nr_zones)
 {
 	struct linear_c *lc = (struct linear_c *) ti->private;
 	int ret;
 
 	/* Do report and remap it */
 	ret = blkdev_report_zones(lc->dev->bdev, linear_map_sector(ti, sector),
-				  zones, nr_zones, gfp_mask);
+				  zones, nr_zones);
 	if (ret != 0)
 		return ret;
 
diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index d8334cd45d7c..9faf3e49c7af 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -8,6 +8,7 @@
 
 #include
 #include
+#include
 
 #define DM_MSG_PREFIX		"zoned metadata"
 
@@ -1162,8 +1163,7 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 	while (sector < dev->capacity) {
 		/* Get zone information */
 		nr_blkz = DMZ_REPORT_NR_ZONES;
-		ret = blkdev_report_zones(dev->bdev, sector, blkz,
-					  &nr_blkz, GFP_KERNEL);
+		ret = blkdev_report_zones(dev->bdev, sector, blkz, &nr_blkz);
 		if (ret) {
 			dmz_dev_err(dev, "Report zones failed %d", ret);
 			goto out;
@@ -1201,12 +1201,20 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
 	unsigned int nr_blkz = 1;
+	unsigned int noio_flag;
 	struct blk_zone blkz;
 	int ret;
 
-	/* Get zone information from disk */
+	/*
+	 * Get zone information from disk. Since blkdev_report_zones() uses
+	 * GFP_KERNEL by default for memory allocations, set the per-task
+	 * PF_MEMALLOC_NOIO flag so that all allocations are done as if
+	 * GFP_NOIO was specified.
+	 */
+	noio_flag = memalloc_noio_save();
 	ret = blkdev_report_zones(zmd->dev->bdev, dmz_start_sect(zmd, zone),
-				  &blkz, &nr_blkz, GFP_NOIO);
+				  &blkz, &nr_blkz);
+	memalloc_noio_restore(noio_flag);
 	if (!nr_blkz)
 		ret = -EIO;
 	if (ret) {
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 5475081dcbd6..61f1152b74e9 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -441,8 +441,7 @@ static int dm_blk_getgeo(struct block_device *bdev, struct hd_geometry *geo)
 }
 
 static int dm_blk_report_zones(struct gendisk *disk, sector_t sector,
-			       struct blk_zone *zones, unsigned int *nr_zones,
-			       gfp_t gfp_mask)
+			       struct blk_zone *zones, unsigned int *nr_zones)
 {
 #ifdef CONFIG_BLK_DEV_ZONED
 	struct mapped_device *md = disk->private_data;
@@ -480,8 +479,7 @@ static int dm_blk_report_zones(struct gendisk *disk, sector_t sector,
 	 * So there is no need to loop here trying to fill the entire array
 	 * of zones.
 	 */
-	ret = tgt->type->report_zones(tgt, sector, zones,
-				      nr_zones, gfp_mask);
+	ret = tgt->type->report_zones(tgt, sector, zones, nr_zones);
 
 out:
 	dm_put_live_table(md, srcu_idx);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 5796ace76225..38c50946fc42 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -213,8 +213,7 @@ extern blk_status_t sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd);
 extern void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes,
 			    struct scsi_sense_hdr *sshdr);
 extern int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
-			       struct blk_zone *zones, unsigned int *nr_zones,
-			       gfp_t gfp_mask);
+			       struct blk_zone *zones, unsigned int *nr_zones);
 
 #else /* CONFIG_BLK_DEV_ZONED */
 
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 7334024b64f1..ec3764c8f3f1 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -109,13 +109,11 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
  * @sector: Start 512B sector of the report
  * @zones: Array of zone descriptors
  * @nr_zones: Number of descriptors in the array
- * @gfp_mask: Memory allocation mask
  *
  * Execute a report zones command on the target disk.
  */
 int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
-			struct blk_zone *zones, unsigned int *nr_zones,
-			gfp_t gfp_mask)
+			struct blk_zone *zones, unsigned int *nr_zones)
 {
 	struct scsi_disk *sdkp = scsi_disk(disk);
 	unsigned int i, buflen, nrz = *nr_zones;
@@ -134,7 +132,7 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
 	 */
 	buflen = min(queue_max_hw_sectors(disk->queue) << 9,
 		     roundup((nrz + 1) * 64, 512));
-	buf = kmalloc(buflen, gfp_mask);
+	buf = kmalloc(buflen, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 6b959bbb336a..4e91ba6c8a2e 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2841,9 +2841,7 @@ static int init_blkz_info(struct f2fs_sb_info *sbi, int devi)
 	while (zones && sector < nr_sectors) {
 
 		nr_zones = F2FS_REPORT_NR_ZONES;
-		err = blkdev_report_zones(bdev, sector,
-					  zones, &nr_zones,
-					  GFP_KERNEL);
+		err = blkdev_report_zones(bdev, sector, zones, &nr_zones);
 		if (err)
 			break;
 		if (!nr_zones) {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 592669bcc536..472ba74ca968 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -347,7 +347,7 @@ struct queue_limits {
 extern unsigned int blkdev_nr_zones(struct block_device *bdev);
 extern int blkdev_report_zones(struct block_device *bdev,
 			       sector_t sector, struct blk_zone *zones,
-			       unsigned int *nr_zones, gfp_t gfp_mask);
+			       unsigned int *nr_zones);
 extern int blkdev_reset_zones(struct block_device *bdev, sector_t sectors,
 			      sector_t nr_sectors, gfp_t gfp_mask);
 extern int blk_revalidate_disk_zones(struct gendisk *disk);
@@ -1684,8 +1684,7 @@ struct block_device_operations {
 	/* this callback is with swap_lock and sometimes page table lock held */
 	void (*swap_slot_free_notify) (struct block_device *, unsigned long);
 	int (*report_zones)(struct gendisk *, sector_t sector,
-			    struct blk_zone *zones, unsigned int *nr_zones,
-			    gfp_t gfp_mask);
+			    struct blk_zone *zones, unsigned int *nr_zones);
 	struct module *owner;
 	const struct pr_ops *pr_ops;
 };
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index e1f51d607cc5..3b470cb03b66 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -95,8 +95,7 @@ typedef int (*dm_prepare_ioctl_fn) (struct dm_target *ti, struct block_device **
 
 typedef int (*dm_report_zones_fn) (struct dm_target *ti, sector_t sector,
 				   struct blk_zone *zones,
-				   unsigned int *nr_zones,
-				   gfp_t gfp_mask);
+				   unsigned int *nr_zones);
 
 /*
  * These iteration functions are typically used to check (and combine)

From patchwork Mon Jul 1 05:09:17 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11024841
X-Patchwork-Delegate: snitzer@redhat.com
Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA
(256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 05B744ACDF; Mon, 1 Jul 2019 05:11:51 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.20]) by smtp.corp.redhat.com (Postfix) with ESMTPS id D86495C207; Mon, 1 Jul 2019 05:11:50 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id ACEA71806B16; Mon, 1 Jul 2019 05:11:50 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id x615B5Zn004692 for ; Mon, 1 Jul 2019 01:11:05 -0400 Received: by smtp.corp.redhat.com (Postfix) id 8E6367A4A5; Mon, 1 Jul 2019 05:11:05 +0000 (UTC) Delivered-To: dm-devel@redhat.com Received: from mx1.redhat.com (ext-mx05.extmail.prod.ext.phx2.redhat.com [10.5.110.29]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 85A467A4A9; Mon, 1 Jul 2019 05:11:05 +0000 (UTC) Received: from esa6.hgst.iphmx.com (esa6.hgst.iphmx.com [216.71.154.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D764D20260; Mon, 1 Jul 2019 05:10:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1561957853; x=1593493853; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GC4T9TT8NwGf93moEo/W7t09dhf9qp/NOqiEgDqXqhs=; b=Ykm9e6TyQ0hWKKiVWvnu6004bzazBDYsfGXyyB2+KuK/Z0PnBwAgFuIM DqNVCIijQLW2uk1Ta+3Bai6yWYAhS9aJPHh0cliK923ygvUxEHQnfYTDw PJ+hAzosiNhvbWlyEruZMmyjwtaafTFDY9Y7JCifphYEgNCCCbb5lnsfI 5HZSelRwzTteR8EC0PT6u42FOrBA5dPWiErDOVVKZZIQjv9wVUo13Zaxl r6q5/IdIfWNAFx2OLbRSQrrCT9rbBXqkDgy010py2FjQrnz3oH/bO6W0X wCziJbjhCxqzt+b+6ENTPCniSTqspU2uNI0/2tNgqYn5zXZ7Iqyxtj0dw A==; 
X-IronPort-AV: E=Sophos;i="5.63,437,1557158400"; d="scan'208";a="113544735" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 01 Jul 2019 13:09:26 +0800 IronPort-SDR: 30yVZKOwUSpeyRzfJGzAaWic1VgWw98CNIANnmKR49S06j7pK+L5UM5txQuQ2S23WRtvK0goOY 37RSQuC0KSeqefzwE+Oc2r8dN47zyZPQ4mYiUaMZ5NSLk3mRRATPCCi6MbnOP1f6SKiYzqXos8 CZ039JDJCzyZ1Rv9enNcOrPjWEHu8NpUFp0RYsGgE3cSWdwfKXZ+l1wVr9OLVgEepIhaD0k7Ep gVnsIdktjMgIOPKfqyP2apZkETwmlP3hmCCYKRxmOcwEQsdDqtoqn0NAPV/A3NniPmvgR3wCVf FSPqIPtBIkzu8yGVl7TiYqWG Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP; 30 Jun 2019 22:08:26 -0700 IronPort-SDR: +tfisdmqFeqa7VyIIE5AVfkptRM1Z3BEo9ccVM+fLXh64IvuinUYshS41rMcfri7aLZfAg6aAW ejAaW8hFpvQSHj+2koES2DhjTRGYlMrrD1Md7io3WlnlUEA7ZSAxGFKxhK0EH+hgbXG5HxS9Jj y6uLJqTDWXOKEqdIB0V00m3ip3rVaWtN2HrW++bWLVZcET5DtKs261Cf/U6ujXQCRKdwuxuR8D v31oGUaCKCIERz7rZ9ceiWUq51wWcznLzMgXkUUziPbE7xXTNzj3rnT3kesop3CHHFVogj/uOn V9w= Received: from washi.fujisawa.hgst.com ([10.149.53.254]) by uls-op-cesaip02.wdc.com with ESMTP; 30 Jun 2019 22:09:24 -0700 From: Damien Le Moal To: linux-scsi@vger.kernel.org, "Martin K . 
Petersen" , linux-block@vger.kernel.org, Jens Axboe , dm-devel@redhat.com, Mike Snitzer , linux-f2fs-devel@lists.sourceforge.net, Jaegeuk Kim Date: Mon, 1 Jul 2019 14:09:17 +0900 Message-Id: <20190701050918.27511-4-damien.lemoal@wdc.com> In-Reply-To: <20190701050918.27511-1-damien.lemoal@wdc.com> References: <20190701050918.27511-1-damien.lemoal@wdc.com> MIME-Version: 1.0 X-Greylist: Sender passed SPF test, Sender IP whitelisted by DNSRBL, ACL 216 matched, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Mon, 01 Jul 2019 05:10:52 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Mon, 01 Jul 2019 05:10:52 +0000 (UTC) for IP:'216.71.154.45' DOMAIN:'esa6.hgst.iphmx.com' HELO:'esa6.hgst.iphmx.com' FROM:'damien.lemoal@wdc.com' RCPT:'' X-RedHat-Spam-Score: -0.099 (DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, SPF_HELO_NONE) 216.71.154.45 esa6.hgst.iphmx.com 216.71.154.45 esa6.hgst.iphmx.com X-Scanned-By: MIMEDefang 2.78 on 10.5.110.29 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-loop: dm-devel@redhat.com Cc: Christoph Hellwig , Bart Van Assche Subject: [dm-devel] [PATCH V6 3/4] sd_zbc: Fix report zones buffer allocation X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Mon, 01 Jul 2019 05:11:51 +0000 (UTC) X-Virus-Scanned: ClamAV using ClamSMTP During disk scan and revalidation done with sd_revalidate(), the zones of a zoned disk are checked using the helper function blk_revalidate_disk_zones() if a configuration change is detected (change in the number of zones or zone size). 
The function blk_revalidate_disk_zones() issues very large report_zones calls, that is, calls that obtain zone information for all zones of the disk with a single command. The report zones command buffer needed for such a large request is generally smaller than the disk max_hw_sectors and KMALLOC_MAX_SIZE (4MB) limits, so its allocation succeeds at boot (no memory fragmentation) but often fails at run time (e.g. on a hot-plug event). This causes the disk revalidation to fail and the disk capacity to be changed to 0. This problem can be avoided by using vmalloc() instead of kmalloc() for the buffer allocation. To limit the amount of memory to be allocated, this patch also introduces the arbitrary SD_ZBC_REPORT_MAX_ZONES maximum number of zones to report with a single report zones command. This limit may be lowered further to satisfy the disk max_hw_sectors limit. Finally, to ensure that the vmalloc-ed buffer can always be mapped in a request, the buffer size is further limited to at most queue_max_segments() pages, allowing successful mapping of the buffer even in the worst case scenario where none of the buffer pages are contiguous. Fixes: 515ce6061312 ("scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation") Fixes: e76239a3748c ("block: add a report_zones method") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal Reviewed-by: Christoph Hellwig Acked-by: Martin K. Petersen --- drivers/scsi/sd_zbc.c | 104 ++++++++++++++++++++++++++++++------------ 1 file changed, 75 insertions(+), 29 deletions(-) diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c index ec3764c8f3f1..db16c19e05c4 100644 --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -9,6 +9,8 @@ */ #include +#include +#include #include @@ -50,7 +52,7 @@ static void sd_zbc_parse_report(struct scsi_disk *sdkp, u8 *buf, /** * sd_zbc_do_report_zones - Issue a REPORT ZONES scsi command. 
* @sdkp: The target disk - * @buf: Buffer to use for the reply + * @buf: vmalloc-ed buffer to use for the reply * @buflen: the buffer size * @lba: Start LBA of the report * @partial: Do partial report @@ -79,7 +81,6 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf, put_unaligned_be32(buflen, &cmd[10]); if (partial) cmd[14] = ZBC_REPORT_ZONE_PARTIAL; - memset(buf, 0, buflen); result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE, buf, buflen, &sshdr, @@ -103,6 +104,53 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf, return 0; } +/* + * Maximum number of zones to get with one report zones command. + */ +#define SD_ZBC_REPORT_MAX_ZONES 8192U + +/** + * Allocate a buffer for report zones reply. + * @sdkp: The target disk + * @nr_zones: Maximum number of zones to report + * @buflen: Size of the buffer allocated + * + * Try to allocate a reply buffer for the number of requested zones. + * The size of the buffer allocated may be smaller than requested to + * satisfy the device constraints (max_hw_sectors, max_segments, etc). + * + * Return the address of the allocated buffer and update @buflen with + * the size of the allocated buffer. + */ +static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp, + unsigned int nr_zones, size_t *buflen) +{ + struct request_queue *q = sdkp->disk->queue; + size_t bufsize; + void *buf; + + /* + * Report zone buffer size should be at most 64B times the number of + * zones requested plus the 64B reply header, but should be at least + * SECTOR_SIZE for ATA devices. + * Make sure that this size does not exceed the hardware capabilities. + * Furthermore, since the report zone command cannot be split, make + * sure that the allocated buffer can always be mapped by limiting the + * number of pages allocated to the HBA max segments limit. 
+ */ + nr_zones = min(nr_zones, SD_ZBC_REPORT_MAX_ZONES); + bufsize = roundup((nr_zones + 1) * 64, 512); + bufsize = min_t(size_t, bufsize, + queue_max_hw_sectors(q) << SECTOR_SHIFT); + bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT); + + buf = vzalloc(bufsize); + if (buf) + *buflen = bufsize; + + return buf; +} + /** * sd_zbc_report_zones - Disk report zones operation. * @disk: The target disk @@ -116,30 +164,23 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector, struct blk_zone *zones, unsigned int *nr_zones) { struct scsi_disk *sdkp = scsi_disk(disk); - unsigned int i, buflen, nrz = *nr_zones; + unsigned int i, nrz = *nr_zones; unsigned char *buf; - size_t offset = 0; + size_t buflen = 0, offset = 0; int ret = 0; if (!sd_is_zoned(sdkp)) /* Not a zoned device */ return -EOPNOTSUPP; - /* - * Get a reply buffer for the number of requested zones plus a header, - * without exceeding the device maximum command size. For ATA disks, - * buffers must be aligned to 512B. 
- */ - buflen = min(queue_max_hw_sectors(disk->queue) << 9, - roundup((nrz + 1) * 64, 512)); - buf = kmalloc(buflen, GFP_KERNEL); + buf = sd_zbc_alloc_report_buffer(sdkp, nrz, &buflen); if (!buf) return -ENOMEM; ret = sd_zbc_do_report_zones(sdkp, buf, buflen, sectors_to_logical(sdkp->device, sector), true); if (ret) - goto out_free_buf; + goto out; nrz = min(nrz, get_unaligned_be32(&buf[0]) / 64); for (i = 0; i < nrz; i++) { @@ -150,8 +191,8 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector, *nr_zones = nrz; -out_free_buf: - kfree(buf); +out: + kvfree(buf); return ret; } @@ -285,8 +326,6 @@ static int sd_zbc_check_zoned_characteristics(struct scsi_disk *sdkp, return 0; } -#define SD_ZBC_BUF_SIZE 131072U - /** * sd_zbc_check_zones - Check the device capacity and zone sizes * @sdkp: Target disk @@ -302,22 +341,28 @@ static int sd_zbc_check_zoned_characteristics(struct scsi_disk *sdkp, */ static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks) { + size_t bufsize, buflen; + unsigned int noio_flag; u64 zone_blocks = 0; sector_t max_lba, block = 0; unsigned char *buf; unsigned char *rec; - unsigned int buf_len; - unsigned int list_length; int ret; u8 same; + /* Do all memory allocations as if GFP_NOIO was specified */ + noio_flag = memalloc_noio_save(); + /* Get a buffer */ - buf = kmalloc(SD_ZBC_BUF_SIZE, GFP_KERNEL); - if (!buf) - return -ENOMEM; + buf = sd_zbc_alloc_report_buffer(sdkp, SD_ZBC_REPORT_MAX_ZONES, + &bufsize); + if (!buf) { + ret = -ENOMEM; + goto out; + } /* Do a report zone to get max_lba and the same field */ - ret = sd_zbc_do_report_zones(sdkp, buf, SD_ZBC_BUF_SIZE, 0, false); + ret = sd_zbc_do_report_zones(sdkp, buf, bufsize, 0, false); if (ret) goto out_free; @@ -353,12 +398,12 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks) do { /* Parse REPORT ZONES header */ - list_length = get_unaligned_be32(&buf[0]) + 64; + buflen = min_t(size_t, get_unaligned_be32(&buf[0]) + 64, + bufsize); rec = buf + 64; - 
buf_len = min(list_length, SD_ZBC_BUF_SIZE); /* Parse zone descriptors */ - while (rec < buf + buf_len) { + while (rec < buf + buflen) { u64 this_zone_blocks = get_unaligned_be64(&rec[8]); if (zone_blocks == 0) { @@ -374,8 +419,8 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks) } if (block < sdkp->capacity) { - ret = sd_zbc_do_report_zones(sdkp, buf, SD_ZBC_BUF_SIZE, - block, true); + ret = sd_zbc_do_report_zones(sdkp, buf, bufsize, block, + true); if (ret) goto out_free; } @@ -406,7 +451,8 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks) } out_free: - kfree(buf); + memalloc_noio_restore(noio_flag); + kvfree(buf); return ret; } From patchwork Mon Jul 1 05:09:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Damien Le Moal X-Patchwork-Id: 11024837 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0278B112C for ; Mon, 1 Jul 2019 05:11:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E52DD2832D for ; Mon, 1 Jul 2019 05:11:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D808728468; Mon, 1 Jul 2019 05:11:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 7498A2832D for ; Mon, 1 Jul 2019 05:11:44 +0000 (UTC) Received: from 
smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3B2B2356D2; Mon, 1 Jul 2019 05:11:43 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.20]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 199357A49A; Mon, 1 Jul 2019 05:11:43 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id DAC521806B19; Mon, 1 Jul 2019 05:11:42 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id x615BF0b004810 for ; Mon, 1 Jul 2019 01:11:15 -0400 Received: by smtp.corp.redhat.com (Postfix) id D69D37A49A; Mon, 1 Jul 2019 05:11:15 +0000 (UTC) Delivered-To: dm-devel@redhat.com Received: from mx1.redhat.com (ext-mx01.extmail.prod.ext.phx2.redhat.com [10.5.110.25]) by smtp.corp.redhat.com (Postfix) with ESMTPS id A650D7A4A9; Mon, 1 Jul 2019 05:11:11 +0000 (UTC) Received: from esa6.hgst.iphmx.com (esa6.hgst.iphmx.com [216.71.154.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9839081F01; Mon, 1 Jul 2019 05:10:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1561957858; x=1593493858; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=fiT0Q2RZAIaTc+o8PPq09fvwdcfYZE72CFvMQOosU9c=; b=mx/ZlK8XVf/WM14sZEXBc3z7zhzvUyhRRZMh5/FX+vDLKMfYiPxHpeAD 8hQI2YiQBS9EJvX4Hr5wNpsxxvKviXXcHYYkVSlDZnRpFbg3K/vN5f580 lw9AgMH5vl6pD75e0oDC8MouD3xeHjWRzf6dQFJ8qdCPeDh0+kwhli4i7 KFtIEqwI/UNmxoiBdl68Ikg3jfHSkYBa1OUnG5hnENmuCkpC3vWTHQFyn 
z9YV/6+PqD20iCDNaDVyzDrxBo9QyXer1IbvTcl0stKD3TAbekwOck47j QzVafLW1qWP4e4P70UhBu5sJWP0LcSqBzkSkLQKVuvbu5SzELw7aaCp+h g==; X-IronPort-AV: E=Sophos;i="5.63,437,1557158400"; d="scan'208";a="113544736" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 01 Jul 2019 13:09:28 +0800 IronPort-SDR: elewWRdst7hsMed6UTiTowZAbBN5/Mp8X+haZk+q4OMFEukn9ltvJ9v7kfX8clr7wtsy9OO6K6 bhLgYk25XTJk5674RG2qKxsv5XbG5uyqL+sjOAjaZqY8FySgtVfZF+qacoc/YUDIL+IeFW6mCE EhzEzFdu9axs2uKujZQ3pDR2SQOiTrHPHQeuhlWplso/0PUjzGmqAwjJ1d201/R84YleSGkhW5 0O6swomj7+jwk0ZK0yKSvVefmlJTpxwbM8TYNsCY37WGHabT/iyWTqLbElG7mLFiYhitwr/3kB FnvNwUB+TESBBijzfL5FDkh3 Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP; 30 Jun 2019 22:08:28 -0700 IronPort-SDR: nW2I57e363zZy8ypwNj+v+eT0Z8Wy9I3uEUEDqVyL2t8v0jXQ7XAx1XyWcfk1eFd9+9RFQB592 QiA4zT7SbZeEzOcFj0coze4NDQZYsLK8U0dcJw/arh2P8QvIV05+/CsXSgN5psbs8ptf6Hom4w gHpgdM3tby0LsTxJqHtCvVG29jknMcWCH2j6Li8+n7TlrHApYGDYrbSG9WzBR4OqWV3AQRHUYq LoKfVoevMDObQmmSzKKWOeAgSb/r69E0xBFJjQdSIWlshpY0ySpNrz7BT7Vr35a4TskJUcsc0Z 1Ww= Received: from washi.fujisawa.hgst.com ([10.149.53.254]) by uls-op-cesaip02.wdc.com with ESMTP; 30 Jun 2019 22:09:26 -0700 From: Damien Le Moal To: linux-scsi@vger.kernel.org, "Martin K . 
Petersen" , linux-block@vger.kernel.org, Jens Axboe , dm-devel@redhat.com, Mike Snitzer , linux-f2fs-devel@lists.sourceforge.net, Jaegeuk Kim Date: Mon, 1 Jul 2019 14:09:18 +0900 Message-Id: <20190701050918.27511-5-damien.lemoal@wdc.com> In-Reply-To: <20190701050918.27511-1-damien.lemoal@wdc.com> References: <20190701050918.27511-1-damien.lemoal@wdc.com> MIME-Version: 1.0 X-Greylist: Sender passed SPF test, Sender IP whitelisted by DNSRBL, ACL 216 matched, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Mon, 01 Jul 2019 05:10:57 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Mon, 01 Jul 2019 05:10:57 +0000 (UTC) for IP:'216.71.154.45' DOMAIN:'esa6.hgst.iphmx.com' HELO:'esa6.hgst.iphmx.com' FROM:'damien.lemoal@wdc.com' RCPT:'' X-RedHat-Spam-Score: -2.399 (DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, RCVD_IN_DNSWL_MED, SPF_HELO_NONE) 216.71.154.45 esa6.hgst.iphmx.com 216.71.154.45 esa6.hgst.iphmx.com X-Scanned-By: MIMEDefang 2.83 on 10.5.110.25 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-loop: dm-devel@redhat.com Cc: Christoph Hellwig , Bart Van Assche Subject: [dm-devel] [PATCH V6 4/4] block: Limit zone array allocation size X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Mon, 01 Jul 2019 05:11:43 +0000 (UTC) X-Virus-Scanned: ClamAV using ClamSMTP Limit the size of the struct blk_zone array used in blk_revalidate_disk_zones() to avoid memory allocation failures leading to disk revalidation failure. Also further reduce the likelihood of such failures by using kvcalloc() (which can fall back to vmalloc()) instead of allocating contiguous pages with alloc_pages(). 
Fixes: 515ce6061312 ("scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation") Fixes: e76239a3748c ("block: add a report_zones method") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal Reviewed-by: Christoph Hellwig Reviewed-by: Martin K. Petersen --- block/blk-zoned.c | 36 ++++++++++++++++++++---------------- include/linux/blkdev.h | 5 +++++ 2 files changed, 25 insertions(+), 16 deletions(-) diff --git a/block/blk-zoned.c b/block/blk-zoned.c index 60dfc3f22350..79ad269b545d 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -14,6 +14,8 @@ #include #include #include +#include +#include #include #include "blk.h" @@ -371,22 +373,25 @@ static inline unsigned long *blk_alloc_zone_bitmap(int node, * Allocate an array of struct blk_zone to get nr_zones zone information. * The allocated array may be smaller than nr_zones. */ -static struct blk_zone *blk_alloc_zones(int node, unsigned int *nr_zones) +static struct blk_zone *blk_alloc_zones(unsigned int *nr_zones) { - size_t size = *nr_zones * sizeof(struct blk_zone); - struct page *page; - int order; - - for (order = get_order(size); order >= 0; order--) { - page = alloc_pages_node(node, GFP_NOIO | __GFP_ZERO, order); - if (page) { - *nr_zones = min_t(unsigned int, *nr_zones, - (PAGE_SIZE << order) / sizeof(struct blk_zone)); - return page_address(page); - } + struct blk_zone *zones; + size_t nrz = min(*nr_zones, BLK_ZONED_REPORT_MAX_ZONES); + + /* + * GFP_KERNEL here is meaningless as the caller task context has + * the PF_MEMALLOC_NOIO flag set in blk_revalidate_disk_zones() + * with memalloc_noio_save(). 
+ */ + zones = kvcalloc(nrz, sizeof(struct blk_zone), GFP_KERNEL); + if (!zones) { + *nr_zones = 0; + return NULL; } - return NULL; + *nr_zones = nrz; + + return zones; } void blk_queue_free_zone_bitmaps(struct request_queue *q) @@ -448,7 +453,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk) /* Get zone information and initialize seq_zones_bitmap */ rep_nr_zones = nr_zones; - zones = blk_alloc_zones(q->node, &rep_nr_zones); + zones = blk_alloc_zones(&rep_nr_zones); if (!zones) goto out; @@ -487,8 +492,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk) out: memalloc_noio_restore(noio_flag); - free_pages((unsigned long)zones, - get_order(rep_nr_zones * sizeof(struct blk_zone))); + kvfree(zones); kfree(seq_zones_wlock); kfree(seq_zones_bitmap); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 472ba74ca968..5ace0bb77213 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -344,6 +344,11 @@ struct queue_limits { #ifdef CONFIG_BLK_DEV_ZONED +/* + * Maximum number of zones to report with a single report zones command. + */ +#define BLK_ZONED_REPORT_MAX_ZONES 8192U + extern unsigned int blkdev_nr_zones(struct block_device *bdev); extern int blkdev_report_zones(struct block_device *bdev, sector_t sector, struct blk_zone *zones,