From patchwork Mon Oct 24 01:02:40 2022
X-Patchwork-Submitter: Douglas Gilbert
X-Patchwork-Id: 13016480
From: Douglas Gilbert
To: linux-scsi@vger.kernel.org
Cc: martin.petersen@oracle.com, jejb@linux.vnet.ibm.com, hare@suse.de,
    bvanassche@acm.org, Jason Gunthorpe, Bodo Stroesser
Subject: [PATCH 1/5] sgl_alloc_order: remove 4 GiB limit
Date: Sun, 23 Oct 2022 21:02:40 -0400
Message-Id: <20221024010244.9522-2-dgilbert@interlog.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221024010244.9522-1-dgilbert@interlog.com>
References: <20221024010244.9522-1-dgilbert@interlog.com>

This patch fixes a check done by sgl_alloc_order() before it starts any
allocations. The comment in the original code said "Check for integer
overflow", but the right-hand side of the expression in the condition is
resolved as a u32, so it cannot exceed UINT32_MAX (4 GiB), which means
'length' cannot exceed that value either.

This function may be used to replace vmalloc(unsigned long) for a large
allocation (e.g. a ramdisk). vmalloc() has no 4 GiB limit, so it seems
unreasonable that sgl_alloc_order(), whose length argument is an
unsigned long long, should be limited to 4 GiB.

Solutions to this issue were discussed by Jason Gunthorpe and Bodo
Stroesser. This version is based on a linux-scsi post by Jason titled:
"Re: [PATCH v7 1/4] sgl_alloc_order: remove 4 GiB limit" dated 20220201.

An earlier patch fixed a memory leak in sgl_alloc_order() due to the
misuse of sgl_free(). Take the opportunity to put a one-line comment
above sgl_free()'s declaration warning that it is not suitable when
order > 0.
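As a minimal sketch of a caller that benefits from lifting the limit
(the 6 GiB size, order value and helper name below are assumed for
illustration; they are not part of the patch):

    #include <linux/scatterlist.h>

    /* Allocate a 6 GiB store as one flat array of 2^4-page elements. */
    static struct scatterlist *big_store_alloc(unsigned int *nents)
    {
            unsigned long long length = 6ULL << 30; /* > UINT32_MAX bytes */
            unsigned int order = 4;     /* element size: PAGE_SIZE << 4 */

            /* chainable=false keeps the sgl a plain array (O(1) indexing) */
            return sgl_alloc_order(length, order, false,
                                   GFP_KERNEL | __GFP_ZERO, nents);
    }

Such an allocation must be released with sgl_free_order() or
sgl_free_n_order(), not sgl_free(), as the comment added below warns.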
Cc: Jason Gunthorpe
Cc: Bodo Stroesser
Signed-off-by: Douglas Gilbert
---
 include/linux/scatterlist.h |  1 +
 lib/scatterlist.c           | 21 ++++++++++++---------
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 375a5e90d86a..0930755a756e 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -426,6 +426,7 @@ struct scatterlist *sgl_alloc(unsigned long long length, gfp_t gfp,
                              unsigned int *nent_p);
 void sgl_free_n_order(struct scatterlist *sgl, int nents, int order);
 void sgl_free_order(struct scatterlist *sgl, int order);
+/* Only use sgl_free() when order is 0 */
 void sgl_free(struct scatterlist *sgl);
 #endif /* CONFIG_SGL_ALLOC */
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index c8c3d675845c..f633e2d669fe 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -585,13 +585,16 @@ EXPORT_SYMBOL(sg_alloc_table_from_pages_segment);
 #ifdef CONFIG_SGL_ALLOC
 
 /**
- * sgl_alloc_order - allocate a scatterlist and its pages
+ * sgl_alloc_order - allocate a scatterlist with equally sized elements each
+ *                   of which has 2^@order contiguous pages
  * @length: Length in bytes of the scatterlist. Must be at least one
- * @order: Second argument for alloc_pages()
+ * @order: Second argument for alloc_pages(). Each sgl element size will
+ *         be (PAGE_SIZE*2^@order) bytes. @order must not exceed 16.
  * @chainable: Whether or not to allocate an extra element in the scatterlist
- *     for scatterlist chaining purposes
+ *             for scatterlist chaining purposes
  * @gfp: Memory allocation flags
- * @nent_p: [out] Number of entries in the scatterlist that have pages
+ * @nent_p: [out] Number of entries in the scatterlist that have pages.
+ *          Ignored if @nent_p is NULL.
  *
  * Returns: A pointer to an initialized scatterlist or %NULL upon failure.
 */
@@ -601,14 +604,14 @@ struct scatterlist *sgl_alloc_order(unsigned long long length,
 {
        struct scatterlist *sgl, *sg;
        struct page *page;
-       unsigned int nent, nalloc;
+       uint64_t nent;
+       unsigned int nalloc;
        u32 elem_len;
 
        nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
-       /* Check for integer overflow */
-       if (length > (nent << (PAGE_SHIFT + order)))
+       if (nent > UINT_MAX)
                return NULL;
-       nalloc = nent;
+       nalloc = (unsigned int)nent;
        if (chainable) {
                /* Check for integer overflow */
                if (nalloc + 1 < nalloc)
@@ -636,7 +639,7 @@ struct scatterlist *sgl_alloc_order(unsigned long long length,
        }
        WARN_ONCE(length, "length = %lld\n", length);
        if (nent_p)
-               *nent_p = nent;
+               *nent_p = (unsigned int)nent;
        return sgl;
 }
 EXPORT_SYMBOL(sgl_alloc_order);

From patchwork Mon Oct 24 01:02:41 2022
X-Patchwork-Submitter: Douglas Gilbert
X-Patchwork-Id: 13016478
From: Douglas Gilbert
To: linux-scsi@vger.kernel.org
Cc: martin.petersen@oracle.com, jejb@linux.vnet.ibm.com, hare@suse.de,
    bvanassche@acm.org, Bodo Stroesser
Subject: [PATCH 2/5] scatterlist: add sgl_copy_sgl() function
Date: Sun, 23 Oct 2022 21:02:41 -0400
Message-Id: <20221024010244.9522-3-dgilbert@interlog.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221024010244.9522-1-dgilbert@interlog.com>
References: <20221024010244.9522-1-dgilbert@interlog.com>

Both the SCSI and NVMe subsystems receive user data from the block
layer in scatterlist_s (also known as scatter gather lists (sgl_s),
which are often arrays). If drivers in those subsystems represent
storage (e.g. a ramdisk) or cache "hot" user data, then they may also
choose to use scatterlist_s.

Currently there are no sgl-to-sgl operations in the kernel. Start with
an sgl-to-sgl copy. Copying stops when the first of these is exhausted:
the requested number of bytes, the source sgl, or the destination sgl.
So the destination sgl will _not_ grow.
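As a minimal usage sketch (the sgl variables, nents counts, offsets and
byte count below are assumed for illustration; the signature is the one
this patch adds):

    #include <linux/scatterlist.h>

    /* Copy 4096 bytes from byte offset 512 of src_sgl to byte offset 0
     * of dst_sgl; returns the number of bytes actually copied, which
     * may be less if either sgl is exhausted first. */
    static size_t copy_example(struct scatterlist *dst_sgl, unsigned int dst_nents,
                               struct scatterlist *src_sgl, unsigned int src_nents)
    {
            return sgl_copy_sgl(dst_sgl, dst_nents, 0,
                                src_sgl, src_nents, 512, 4096);
    }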
Reviewed-by: Bodo Stroesser
Signed-off-by: Douglas Gilbert
---
 include/linux/scatterlist.h |  4 ++
 lib/scatterlist.c           | 74 +++++++++++++++++++++++++++++++++++++
 2 files changed, 78 insertions(+)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 0930755a756e..cea1edd246cb 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -445,6 +445,10 @@ size_t sg_pcopy_to_buffer(struct scatterlist *sgl, unsigned int nents,
 size_t sg_zero_buffer(struct scatterlist *sgl, unsigned int nents,
                      size_t buflen, off_t skip);
 
+size_t sgl_copy_sgl(struct scatterlist *d_sgl, unsigned int d_nents, off_t d_skip,
+                   struct scatterlist *s_sgl, unsigned int s_nents, off_t s_skip,
+                   size_t n_bytes);
+
 /*
  * Maximum number of entries that will be allocated in one piece, if
  * a list larger than this is required then chaining will be utilized.
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index f633e2d669fe..5d873bd0cb96 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -1088,3 +1088,77 @@ size_t sg_zero_buffer(struct scatterlist *sgl, unsigned int nents,
        return offset;
 }
 EXPORT_SYMBOL(sg_zero_buffer);
+
+/**
+ * sgl_copy_sgl - Copy over a destination sgl from a source sgl
+ * @d_sgl: Destination sgl
+ * @d_nents: Number of SG entries in destination sgl
+ * @d_skip: Number of bytes to skip in destination before starting
+ * @s_sgl: Source sgl
+ * @s_nents: Number of SG entries in source sgl
+ * @s_skip: Number of bytes to skip in source before starting
+ * @n_bytes: The (maximum) number of bytes to copy
+ *
+ * Returns:
+ *   The number of copied bytes.
+ *
+ * Notes:
+ *   Destination arguments appear before the source arguments, as with memcpy().
+ *
+ *   Stops copying if any of d_sgl, s_sgl or n_bytes is exhausted.
+ *
+ *   Since memcpy() is used, overlapping copies (where d_sgl and s_sgl belong
+ *   to the same sgl and the copy regions overlap) are not supported.
+ *
+ *   Large copies are broken into copy segments whose sizes may vary. Those
+ *   copy segment sizes are chosen by the min3() statement in the code below.
+ *   Since SG_MITER_ATOMIC is used for both sides, each copy segment is started
+ *   with kmap_atomic() [in sg_miter_next()] and completed with kunmap_atomic()
+ *   [in sg_miter_stop()]. This means preemption is inhibited for relatively
+ *   short periods, even in very large copies.
+ *
+ *   If d_skip is large, potentially spanning multiple destination elements,
+ *   then some integer arithmetic to adjust d_sgl may improve performance. For
+ *   example, if d_sgl is built using sgl_alloc_order(chainable=false) then
+ *   the sgl will be an array with equally sized segments, facilitating that
+ *   arithmetic. The suggestion applies to s_skip, s_sgl and s_nents as well.
+ *
+ **/
+size_t sgl_copy_sgl(struct scatterlist *d_sgl, unsigned int d_nents, off_t d_skip,
+                   struct scatterlist *s_sgl, unsigned int s_nents, off_t s_skip,
+                   size_t n_bytes)
+{
+       size_t len;
+       size_t offset = 0;
+       struct sg_mapping_iter d_iter, s_iter;
+
+       if (n_bytes == 0)
+               return 0;
+       sg_miter_start(&s_iter, s_sgl, s_nents, SG_MITER_ATOMIC | SG_MITER_FROM_SG);
+       sg_miter_start(&d_iter, d_sgl, d_nents, SG_MITER_ATOMIC | SG_MITER_TO_SG);
+       if (!sg_miter_skip(&s_iter, s_skip))
+               goto fini;
+       if (!sg_miter_skip(&d_iter, d_skip))
+               goto fini;
+
+       while (offset < n_bytes) {
+               if (!sg_miter_next(&s_iter))
+                       break;
+               if (!sg_miter_next(&d_iter))
+                       break;
+               len = min3(d_iter.length, s_iter.length, n_bytes - offset);
+
+               memcpy(d_iter.addr, s_iter.addr, len);
+               offset += len;
+               /* LIFO order (stop d_iter before s_iter) needed with SG_MITER_ATOMIC */
+               d_iter.consumed = len;
+               sg_miter_stop(&d_iter);
+               s_iter.consumed = len;
+               sg_miter_stop(&s_iter);
+       }
+fini:
+       sg_miter_stop(&d_iter);
+       sg_miter_stop(&s_iter);
+       return offset;
+}
+EXPORT_SYMBOL(sgl_copy_sgl);

From patchwork Mon Oct 24 01:02:42 2022
X-Patchwork-Submitter: Douglas Gilbert
X-Patchwork-Id: 13016479
From: Douglas Gilbert
To: linux-scsi@vger.kernel.org
Cc: martin.petersen@oracle.com, jejb@linux.vnet.ibm.com, hare@suse.de,
    bvanassche@acm.org, Bodo Stroesser, David Disseldorp
Subject: [PATCH 3/5] scatterlist: add sgl_equal_sgl() function
Date: Sun, 23 Oct 2022 21:02:42 -0400
Message-Id: <20221024010244.9522-4-dgilbert@interlog.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221024010244.9522-1-dgilbert@interlog.com>
References: <20221024010244.9522-1-dgilbert@interlog.com>

After enabling copies between scatter gather lists (sgl_s), another
storage-related operation is comparing two sgl_s for equality. This new
function is designed to partially implement NVMe's Compare command and
the SCSI VERIFY(BYTCHK=1) command.
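As a hedged sketch of the intended use (the names, offsets and byte
count below are assumed for illustration), a VERIFY(BYTCHK=1)-style
check could be built on the new function:

    #include <linux/scatterlist.h>

    /* Does the data-out buffer match the store region? Both sgls and
     * their nents counts are assumed to be set up elsewhere. */
    static bool verify_example(struct scatterlist *store_sgl, unsigned int store_nents,
                               off_t store_off, struct scatterlist *dout_sgl,
                               unsigned int dout_nents, size_t n_bytes)
    {
            return sgl_equal_sgl(store_sgl, store_nents, store_off,
                                 dout_sgl, dout_nents, 0, n_bytes);
    }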
Like memcmp(), this function begins scanning at the start (of each sgl)
and stops comparing at the first miscompare, returning false. The
sgl_equal_sgl_idx() function additionally yields the index (i.e. the
byte position) of the first miscompare. Its additional parameter,
miscompare_idx, is a pointer: if it is non-NULL and a miscompare is
detected (i.e. the function returns false), then the byte index of the
first miscompare is written to *miscompare_idx. Knowing the location of
the first miscompare is needed to properly implement the SCSI COMPARE
AND WRITE command.

Reviewed-by: Bodo Stroesser
Reviewed-by: David Disseldorp
Signed-off-by: Douglas Gilbert
---
 include/linux/scatterlist.h |   8 +++
 lib/scatterlist.c           | 110 ++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cea1edd246cb..e1552a3e9e13 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -449,6 +449,14 @@ size_t sgl_copy_sgl(struct scatterlist *d_sgl, unsigned int d_nents, off_t d_skip,
                    struct scatterlist *s_sgl, unsigned int s_nents, off_t s_skip,
                    size_t n_bytes);
 
+bool sgl_equal_sgl(struct scatterlist *x_sgl, unsigned int x_nents, off_t x_skip,
+                  struct scatterlist *y_sgl, unsigned int y_nents, off_t y_skip,
+                  size_t n_bytes);
+
+bool sgl_equal_sgl_idx(struct scatterlist *x_sgl, unsigned int x_nents, off_t x_skip,
+                      struct scatterlist *y_sgl, unsigned int y_nents, off_t y_skip,
+                      size_t n_bytes, size_t *miscompare_idx);
+
 /*
  * Maximum number of entries that will be allocated in one piece, if
  * a list larger than this is required then chaining will be utilized.
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 5d873bd0cb96..426d73ba464a 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -1162,3 +1162,113 @@ size_t sgl_copy_sgl(struct scatterlist *d_sgl, unsigned int d_nents, off_t d_skip,
        return offset;
 }
 EXPORT_SYMBOL(sgl_copy_sgl);
+
+/**
+ * sgl_equal_sgl_idx - check if x and y (both sgl_s) compare equal, report
+ *                     index of first unequal bytes
+ * @x_sgl: x (left) sgl
+ * @x_nents: Number of SG entries in x (left) sgl
+ * @x_skip: Number of bytes to skip in x (left) before starting
+ * @y_sgl: y (right) sgl
+ * @y_nents: Number of SG entries in y (right) sgl
+ * @y_skip: Number of bytes to skip in y (right) before starting
+ * @n_bytes: The (maximum) number of bytes to compare
+ * @miscompare_idx: if the return is false, the index of the first miscompare
+ *                  is written to this pointer (if non-NULL). The value will
+ *                  be < n_bytes
+ *
+ * Returns:
+ *   true if x and y compare equal before x, y or n_bytes is exhausted.
+ *   Otherwise, on a miscompare, returns false (and stops comparing). If the
+ *   return is false and miscompare_idx is non-NULL, then the index of the
+ *   first miscompared byte is written to *miscompare_idx.
+ *
+ * Notes:
+ *   x and y are symmetrical: they can be swapped and the result is the same.
+ *
+ *   Implementation is based on memcmp(). x and y segments may overlap.
+ *
+ *   The notes in sgl_copy_sgl() about large sgl_s apply here as well.
+ *
+ **/
+bool sgl_equal_sgl_idx(struct scatterlist *x_sgl, unsigned int x_nents, off_t x_skip,
+                      struct scatterlist *y_sgl, unsigned int y_nents, off_t y_skip,
+                      size_t n_bytes, size_t *miscompare_idx)
+{
+       bool equ = true;
+       size_t len;
+       size_t offset = 0;
+       struct sg_mapping_iter x_iter, y_iter;
+
+       if (n_bytes == 0)
+               return true;
+       sg_miter_start(&x_iter, x_sgl, x_nents, SG_MITER_ATOMIC | SG_MITER_FROM_SG);
+       sg_miter_start(&y_iter, y_sgl, y_nents, SG_MITER_ATOMIC | SG_MITER_FROM_SG);
+       if (!sg_miter_skip(&x_iter, x_skip))
+               goto fini;
+       if (!sg_miter_skip(&y_iter, y_skip))
+               goto fini;
+
+       while (offset < n_bytes) {
+               if (!sg_miter_next(&x_iter))
+                       break;
+               if (!sg_miter_next(&y_iter))
+                       break;
+               len = min3(x_iter.length, y_iter.length, n_bytes - offset);
+
+               equ = !memcmp(x_iter.addr, y_iter.addr, len);
+               if (!equ)
+                       goto fini;
+               offset += len;
+               /* LIFO order is important when SG_MITER_ATOMIC is used */
+               y_iter.consumed = len;
+               sg_miter_stop(&y_iter);
+               x_iter.consumed = len;
+               sg_miter_stop(&x_iter);
+       }
+fini:
+       if (miscompare_idx && !equ) {
+               u8 *xp = x_iter.addr;
+               u8 *yp = y_iter.addr;
+               u8 *x_endp;
+
+               for (x_endp = xp + len; xp < x_endp; ++xp, ++yp) {
+                       if (*xp != *yp)
+                               break;
+               }
+               *miscompare_idx = offset + len - (x_endp - xp);
+       }
+       sg_miter_stop(&y_iter);
+       sg_miter_stop(&x_iter);
+       return equ;
+}
+EXPORT_SYMBOL(sgl_equal_sgl_idx);
+
+/**
+ * sgl_equal_sgl - check if x and y (both sgl_s) compare equal
+ * @x_sgl: x (left) sgl
+ * @x_nents: Number of SG entries in x (left) sgl
+ * @x_skip: Number of bytes to skip in x (left) before starting
+ * @y_sgl: y (right) sgl
+ * @y_nents: Number of SG entries in y (right) sgl
+ * @y_skip: Number of bytes to skip in y (right) before starting
+ * @n_bytes: The (maximum) number of bytes to compare
+ *
+ * Returns:
+ *   true if x and y compare equal before x, y or n_bytes is exhausted.
+ *   Otherwise, on a miscompare, returns false (and stops comparing).
+ *
+ * Notes:
+ *   x and y are symmetrical: they can be swapped and the result is the same.
+ *
+ *   Implementation is based on memcmp(). x and y segments may overlap.
+ *
+ *   The notes in sgl_copy_sgl() about large sgl_s apply here as well.
+ *
+ **/
+bool sgl_equal_sgl(struct scatterlist *x_sgl, unsigned int x_nents, off_t x_skip,
+                  struct scatterlist *y_sgl, unsigned int y_nents, off_t y_skip,
+                  size_t n_bytes)
+{
+       return sgl_equal_sgl_idx(x_sgl, x_nents, x_skip, y_sgl, y_nents, y_skip,
+                                n_bytes, NULL);
+}
+EXPORT_SYMBOL(sgl_equal_sgl);

From patchwork Mon Oct 24 01:02:43 2022
X-Patchwork-Submitter: Douglas Gilbert
X-Patchwork-Id: 13016477
From: Douglas Gilbert
To: linux-scsi@vger.kernel.org
Cc: martin.petersen@oracle.com, jejb@linux.vnet.ibm.com, hare@suse.de,
    bvanassche@acm.org, Bodo Stroesser
Subject: [PATCH 4/5] scatterlist: add sgl_memset()
Date: Sun, 23 Oct 2022 21:02:43 -0400
Message-Id: <20221024010244.9522-5-dgilbert@interlog.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221024010244.9522-1-dgilbert@interlog.com>
References: <20221024010244.9522-1-dgilbert@interlog.com>

The existing sg_zero_buffer() function is a bit restrictive. For
example, protection information (PI) blocks are usually initialized to
0xff bytes. As its name suggests, sgl_memset() is modelled on memset().
One difference is the type of the val argument, which is u8 rather than
int. It also returns the number of bytes (over)written. Change the
implementation of sg_zero_buffer() to call this new function.
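As a minimal sketch of the motivating case (the sgl name and byte count
are assumed for illustration), initializing a PI area to 0xff bytes
becomes a one-liner:

    #include <linux/scatterlist.h>

    /* Set every byte of a PI area to 0xff; returns bytes written. */
    static size_t pi_init_example(struct scatterlist *pi_sgl,
                                  unsigned int pi_nents, size_t pi_bytes)
    {
            return sgl_memset(pi_sgl, pi_nents, 0, 0xff, pi_bytes);
    }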
Reviewed-by: Bodo Stroesser
Signed-off-by: Douglas Gilbert
---
 include/linux/scatterlist.h | 20 +++++++++-
 lib/scatterlist.c           | 78 ++++++++++++++++++++-----------------
 2 files changed, 61 insertions(+), 37 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index e1552a3e9e13..dbcf0f6fd8d9 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -442,8 +442,6 @@ size_t sg_pcopy_from_buffer(struct scatterlist *sgl, unsigned int nents,
                            const void *buf, size_t buflen, off_t skip);
 size_t sg_pcopy_to_buffer(struct scatterlist *sgl, unsigned int nents,
                          void *buf, size_t buflen, off_t skip);
-size_t sg_zero_buffer(struct scatterlist *sgl, unsigned int nents,
-                     size_t buflen, off_t skip);
 
 size_t sgl_copy_sgl(struct scatterlist *d_sgl, unsigned int d_nents, off_t d_skip,
                    struct scatterlist *s_sgl, unsigned int s_nents, off_t s_skip,
@@ -457,6 +455,24 @@ bool sgl_equal_sgl_idx(struct scatterlist *x_sgl, unsigned int x_nents, off_t x_skip,
                       struct scatterlist *y_sgl, unsigned int y_nents, off_t y_skip,
                       size_t n_bytes, size_t *miscompare_idx);
 
+size_t sgl_memset(struct scatterlist *sgl, unsigned int nents, off_t skip,
+                 u8 val, size_t n_bytes);
+
+/**
+ * sg_zero_buffer - Zero-out a part of a SG list
+ * @sgl: The SG list
+ * @nents: Number of SG entries
+ * @buflen: The number of bytes to zero out
+ * @skip: Number of bytes to skip before zeroing
+ *
+ * Returns the number of bytes zeroed.
+ **/
+static inline size_t sg_zero_buffer(struct scatterlist *sgl, unsigned int nents,
+                                   size_t buflen, off_t skip)
+{
+       return sgl_memset(sgl, nents, skip, 0, buflen);
+}
+
 /*
  * Maximum number of entries that will be allocated in one piece, if
  * a list larger than this is required then chaining will be utilized.
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 426d73ba464a..2694aeedf6e2 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -1054,41 +1054,6 @@ size_t sg_pcopy_to_buffer(struct scatterlist *sgl, unsigned int nents,
 }
 EXPORT_SYMBOL(sg_pcopy_to_buffer);
 
-/**
- * sg_zero_buffer - Zero-out a part of a SG list
- * @sgl: The SG list
- * @nents: Number of SG entries
- * @buflen: The number of bytes to zero out
- * @skip: Number of bytes to skip before zeroing
- *
- * Returns the number of bytes zeroed.
- **/
-size_t sg_zero_buffer(struct scatterlist *sgl, unsigned int nents,
-                     size_t buflen, off_t skip)
-{
-       unsigned int offset = 0;
-       struct sg_mapping_iter miter;
-       unsigned int sg_flags = SG_MITER_ATOMIC | SG_MITER_TO_SG;
-
-       sg_miter_start(&miter, sgl, nents, sg_flags);
-
-       if (!sg_miter_skip(&miter, skip))
-               return false;
-
-       while (offset < buflen && sg_miter_next(&miter)) {
-               unsigned int len;
-
-               len = min(miter.length, buflen - offset);
-               memset(miter.addr, 0, len);
-
-               offset += len;
-       }
-
-       sg_miter_stop(&miter);
-       return offset;
-}
-EXPORT_SYMBOL(sg_zero_buffer);
-
 /**
  * sgl_copy_sgl - Copy over a destination sgl from a source sgl
  * @d_sgl: Destination sgl
@@ -1272,3 +1237,46 @@ bool sgl_equal_sgl(struct scatterlist *x_sgl, unsigned int x_nents, off_t x_skip,
        return sgl_equal_sgl_idx(x_sgl, x_nents, x_skip, y_sgl, y_nents, y_skip, n_bytes, NULL);
 }
 EXPORT_SYMBOL(sgl_equal_sgl);
+
+/**
+ * sgl_memset - set byte 'val' up to n_bytes times on SG list
+ * @sgl: The SG list
+ * @nents: Number of SG entries in sgl
+ * @skip: Number of bytes to skip before starting
+ * @val: byte value to write to sgl
+ * @n_bytes: The (maximum) number of bytes to modify
+ *
+ * Returns:
+ *   The number of bytes written.
+ *
+ * Notes:
+ *   Stops writing if either sgl or n_bytes is exhausted. If n_bytes is set
+ *   to SIZE_MAX then val will be written to each byte until the end of sgl.
+ *
+ *   The notes in sgl_copy_sgl() about large sgl_s apply here as well.
+ *
+ **/
+size_t sgl_memset(struct scatterlist *sgl, unsigned int nents, off_t skip,
+                 u8 val, size_t n_bytes)
+{
+       size_t offset = 0;
+       size_t len;
+       struct sg_mapping_iter miter;
+
+       if (n_bytes == 0)
+               return 0;
+       sg_miter_start(&miter, sgl, nents, SG_MITER_ATOMIC | SG_MITER_TO_SG);
+       if (!sg_miter_skip(&miter, skip))
+               goto fini;
+
+       while ((offset < n_bytes) && sg_miter_next(&miter)) {
+               len = min(miter.length, n_bytes - offset);
+               memset(miter.addr, val, len);
+               offset += len;
+       }
+fini:
+       sg_miter_stop(&miter);
+       return offset;
+}
+EXPORT_SYMBOL(sgl_memset);

From patchwork Mon Oct 24 01:02:44 2022
X-Patchwork-Submitter: Douglas Gilbert
X-Patchwork-Id: 13016481
From: Douglas Gilbert
To: linux-scsi@vger.kernel.org
Cc: martin.petersen@oracle.com, jejb@linux.vnet.ibm.com, hare@suse.de,
    bvanassche@acm.org
Subject: [PATCH 5/5] scsi_debug: change store from vmalloc to sgl
Date: Sun, 23 Oct 2022 21:02:44 -0400
Message-Id: <20221024010244.9522-6-dgilbert@interlog.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221024010244.9522-1-dgilbert@interlog.com>
References: <20221024010244.9522-1-dgilbert@interlog.com>

A long time ago this driver's store was allocated by kmalloc() or
alloc_pages(). When this was switched to vmalloc(), the author noticed
slower ramdisk access times and more variability in repeated tests. So
try going back with sgl_alloc_order() to get uniformly sized
allocations in a sometimes large scatter gather _array_. That array is
the basis of maintaining O(1) access to the store.

Using sgl_alloc_order() and friends requires CONFIG_SGL_ALLOC, so add a
'select' to the Kconfig file.

Remove the kcalloc() in resp_verify() as sgl_s can now be compared
directly without forming an intermediate buffer.
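The O(1) access relies on every sgl element having the same
power-of-two size. As a hedged sketch of the index arithmetic this
patch uses repeatedly (the helper name is assumed; the fields follow
struct sdeb_store_info below; since shifting zero yields zero, this is
equivalent to the patch's "sgl_i ? (sgl_i << sip->elem_pow2) : 0" form):

    /* Map a linear byte address within the store to an sgl element
     * plus an intra-element remainder. Valid because the store was
     * built with sgl_alloc_order(chainable=false), so sip->sgl is a
     * flat array of elements of (1 << sip->elem_pow2) bytes each. */
    static struct scatterlist *store_locate(struct sdeb_store_info *sip,
                                            u64 byte_addr, u64 *rem)
    {
            u64 sgl_i = byte_addr >> sip->elem_pow2;

            *rem = byte_addr - (sgl_i << sip->elem_pow2);
            return sip->sgl + sgl_i;
    }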
This is a performance win for the SCSI VERIFY command implementation.

Make the SCSI COMPARE AND WRITE command yield the offset of the first
miscompared byte when the compare fails (as required by T10).

Signed-off-by: Douglas Gilbert
---
 drivers/scsi/Kconfig      |   3 +-
 drivers/scsi/scsi_debug.c | 478 +++++++++++++++++++++++++++-----------
 2 files changed, 341 insertions(+), 140 deletions(-)

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 03e71e3d5e5b..97edb4e17319 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -1229,13 +1229,14 @@ config SCSI_DEBUG
        tristate "SCSI debugging host and device simulator"
        depends on SCSI
        select CRC_T10DIF
+       select SGL_ALLOC
        help
          This pseudo driver simulates one or more hosts (SCSI initiators),
          each with one or more targets, each with one or more logical units.
          Defaults to one of each, creating a small RAM disk device. Many
          parameters found in the /sys/bus/pseudo/drivers/scsi_debug
          directory can be tweaked at run time.
-         See for more information.
+         See for more information.
          Mainly used for testing and best as a module. If unsure, say N.
 
 config SCSI_MESH
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 697fc57bc711..0715521b2527 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -221,6 +221,7 @@ static const char *sdebug_version_date = "20210520";
 #define SDEBUG_CANQUEUE_WORDS 3 /* a WORD is bits in a long */
 #define SDEBUG_CANQUEUE (SDEBUG_CANQUEUE_WORDS * BITS_PER_LONG)
 #define DEF_CMD_PER_LUN SDEBUG_CANQUEUE
+#define SDEB_ORDER_TOO_LARGE 4096
 
 /* UA - Unit Attention; SA - Service Action; SSU - Start Stop Unit */
 #define F_D_IN 1        /* Data-in command (e.g. READ) */
@@ -318,8 +319,11 @@ struct sdebug_host_info {
 
 /* There is an xarray of pointers to this struct's objects, one per host */
 struct sdeb_store_info {
+       unsigned int n_elem;    /* number of sgl elements */
+       unsigned int order;     /* as used by alloc_pages() */
+       unsigned int elem_pow2; /* PAGE_SHIFT + order */
        rwlock_t macc_lck;      /* for atomic media access on this store */
-       u8 *storep;             /* user data storage (ram) */
+       struct scatterlist *sgl; /* main store: n_elem array of same sized allocs */
        struct t10_pi_tuple *dif_storep; /* protection info */
        void *map_storep;       /* provisioning map */
 };
@@ -880,19 +884,6 @@ static inline bool scsi_debug_lbp(void)
                (sdebug_lbpu || sdebug_lbpws || sdebug_lbpws10);
 }
 
-static void *lba2fake_store(struct sdeb_store_info *sip,
-                           unsigned long long lba)
-{
-       struct sdeb_store_info *lsip = sip;
-
-       lba = do_div(lba, sdebug_store_sectors);
-       if (!sip || !sip->storep) {
-               WARN_ON_ONCE(true);
-               lsip = xa_load(per_store_ap, 0); /* should never be NULL */
-       }
-       return lsip->storep + lba * sdebug_sector_size;
-}
-
 static struct t10_pi_tuple *dif_store(struct sdeb_store_info *sip,
                                      sector_t sector)
 {
@@ -1001,7 +992,6 @@ static int scsi_debug_ioctl(struct scsi_device *dev, unsigned int cmd,
                            __func__, cmd);
        }
        return -EINVAL;
-       /* return -ENOTTY; // correct return but upsets fdisk */
 }
 
 static void config_cdb_len(struct scsi_device *sdev)
@@ -1221,6 +1211,55 @@ static int fetch_to_dev_buffer(struct scsi_cmnd *scp, unsigned char *arr,
        return scsi_sg_copy_to_buffer(scp, arr, arr_len);
 }
 
+static bool sdeb_sgl_cmp_buf(struct scatterlist *sgl, unsigned int nents,
+                            const void *buf, size_t buflen, off_t skip)
+{
+       bool equ = true;
+       size_t offset = 0;
+       size_t len;
+       struct sg_mapping_iter miter;
+
+       if (buflen == 0)
+               return true;
+       sg_miter_start(&miter, sgl, nents, SG_MITER_ATOMIC | SG_MITER_FROM_SG);
+       if (!sg_miter_skip(&miter, skip))
+               goto fini;
+
+       while (equ && (offset < buflen) && sg_miter_next(&miter)) {
+               len = min(miter.length, buflen - offset);
+               equ = memcmp(buf + offset, miter.addr, len) == 0;
+               offset += len;
+               miter.consumed = len;
+               sg_miter_stop(&miter);
+       }
+fini:
+       sg_miter_stop(&miter);
+       return equ;
+}
+
+static void sdeb_sgl_prefetch(struct scatterlist *sgl, unsigned int nents,
+                             off_t skip, size_t n_bytes)
+{
+       size_t offset = 0;
+       size_t len;
+       struct sg_mapping_iter miter;
+       unsigned int sg_flags = SG_MITER_FROM_SG;
+
+       if (n_bytes == 0)
+               return;
+       sg_miter_start(&miter, sgl, nents, sg_flags);
+       if (!sg_miter_skip(&miter, skip))
+               goto fini;
+
+       while ((offset < n_bytes) && sg_miter_next(&miter)) {
+               len = min(miter.length, n_bytes - offset);
+               prefetch_range(miter.addr, len);
+               offset += len;
+       }
+fini:
+       sg_miter_stop(&miter);
+}
+
 static char sdebug_inq_vendor_id[9] = "Linux   ";
 static char sdebug_inq_product_id[17] = "scsi_debug      ";
@@ -2990,6 +3029,40 @@ static inline int check_device_access_params
        return 0;
 }
 
+#if 0
+static struct scatterlist *sdeb_sgl_alloc_order(unsigned long long length, unsigned int order,
+                                               gfp_t gfp, unsigned int *nent_p)
+{
+       struct scatterlist *sgl, *sg;
+       struct page *page;
+       unsigned int nent, nalloc;
+       u32 elem_len;
+
+       nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
+       nalloc = nent;
+       sgl = kmalloc_array(nalloc, sizeof(struct scatterlist), gfp & ~GFP_DMA);
+       if (!sgl)
+               return NULL;
+
+       sg_init_table(sgl, nalloc);
+       sg = sgl;
+       while (length) {
+               elem_len = min_t(u64, length, PAGE_SIZE << order);
+               page = alloc_pages(gfp, order);
+               if (!page) {
+                       sgl_free_order(sgl, order);
+                       return NULL;
+               }
+               sg_set_page(sg, page, elem_len, 0);
+               length -= elem_len;
+               sg = sg_next(sg);
+       }
+       if (nent_p)
+               *nent_p = nent;
+       return sgl;
+}
+#endif
+
 /*
  * Note: if BUG_ON() fires it usually indicates a problem with the parser
  * tables. Perhaps a missing F_FAKE_RW or FF_MEDIA_IO flag. Response functions
@@ -3008,13 +3081,14 @@ static inline struct sdeb_store_info *devip2sip(struct sdebug_dev_info *devip,
 
 /* Returns number of bytes copied or -1 if error. */
 static int do_device_access(struct sdeb_store_info *sip, struct scsi_cmnd *scp,
-                           u32 sg_skip, u64 lba, u32 num, bool do_write)
+                           u32 data_inout_off, u64 lba, u32 n_blks, bool do_write)
 {
        int ret;
-       u64 block, rest = 0;
+       u32 lb_size = sdebug_sector_size;
+       u64 block, sgl_i, rem, lba_start, rest = 0;
        enum dma_data_direction dir;
        struct scsi_data_buffer *sdb = &scp->sdb;
-       u8 *fsp;
+       struct scatterlist *store_sgl;
 
        if (do_write) {
                dir = DMA_TO_DEVICE;
@@ -3027,25 +3101,38 @@ static int do_device_access(struct sdeb_store_info *sip, struct scsi_cmnd *scp,
                return 0;
        if (scp->sc_data_direction != dir)
                return -1;
-       fsp = sip->storep;
 
        block = do_div(lba, sdebug_store_sectors);
-       if (block + num > sdebug_store_sectors)
-               rest = block + num - sdebug_store_sectors;
+       if (block + n_blks > sdebug_store_sectors)
+               rest = block + n_blks - sdebug_store_sectors;
+       lba_start = block * lb_size;
+       sgl_i = lba_start >> sip->elem_pow2;
+       rem = lba_start - (sgl_i ? (sgl_i << sip->elem_pow2) : 0);
+       store_sgl = sip->sgl + sgl_i;   /* O(1) to each store sg element */
+
+       if (do_write)
+               ret = sgl_copy_sgl(store_sgl, sip->n_elem - sgl_i, rem,
+                                  sdb->table.sgl, sdb->table.nents, data_inout_off,
+                                  (n_blks - rest) * lb_size);
+       else
+               ret = sgl_copy_sgl(sdb->table.sgl, sdb->table.nents, data_inout_off,
+                                  store_sgl, sip->n_elem - sgl_i, rem,
+                                  (n_blks - rest) * lb_size);
 
-       ret = sg_copy_buffer(sdb->table.sgl, sdb->table.nents,
-                            fsp + (block * sdebug_sector_size),
-                            (num - rest) * sdebug_sector_size, sg_skip, do_write);
-       if (ret != (num - rest) * sdebug_sector_size)
+       if (ret != (n_blks - rest) * lb_size)
                return ret;
 
-       if (rest) {
-               ret += sg_copy_buffer(sdb->table.sgl, sdb->table.nents,
-                             fsp, rest * sdebug_sector_size,
-                             sg_skip + ((num - rest) * sdebug_sector_size),
-                             do_write);
-       }
-
+       if (rest == 0)
+               goto fini;
+       if (do_write)
+               ret += sgl_copy_sgl(sip->sgl, sip->n_elem, 0, sdb->table.sgl,
+                                   sdb->table.nents,
+                                   data_inout_off + ((n_blks - rest) * lb_size),
+                                   rest * lb_size);
+       else
+               ret += sgl_copy_sgl(sdb->table.sgl, sdb->table.nents,
+                                   data_inout_off + ((n_blks - rest) * lb_size),
+                                   sip->sgl, sip->n_elem, 0, rest * lb_size);
+fini:
        return ret;
 }
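The wrap-around handling above splits one logical access into two
sgl_copy_sgl() calls when the range passes the end of the (circular)
store. The same pattern in isolation, as a hedged sketch with assumed
names:

    /* Copy n_bytes out of a circular store of store_sz bytes starting
     * at byte address start; wraps to byte 0 when the range passes the
     * end. Returns the number of bytes copied. */
    static size_t circular_copy_out(struct scatterlist *store, unsigned int s_nents,
                                    u64 store_sz, u64 start,
                                    struct scatterlist *dst, unsigned int d_nents,
                                    size_t n_bytes)
    {
            size_t first = min_t(u64, n_bytes, store_sz - start);
            size_t done;

            done = sgl_copy_sgl(dst, d_nents, 0, store, s_nents, start, first);
            if (done == first && first < n_bytes)
                    done += sgl_copy_sgl(dst, d_nents, first,
                                         store, s_nents, 0, n_bytes - first);
            return done;
    }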
@@ -3062,37 +3149,66 @@ static int do_dout_fetch(struct scsi_cmnd *scp, u32 num, u8 *doutp)
                               num * sdebug_sector_size, 0, true);
 }
 
-/* If sip->storep+lba compares equal to arr(num), then copy top half of
- * arr into sip->storep+lba and return true. If comparison fails then
- * return false. */
+/* If sip->storep+lba compares equal to arr(num) or scp->sdb, then, if miscomp_idxp is non-NULL,
+ * copy the top half of arr into sip->storep+lba and return true. If the comparison fails then
+ * return false and write the miscompare index via miscomp_idxp. This is the COMPARE AND WRITE
+ * case. For VERIFY(BytChk=1), set arr to NULL, which causes a sgl (store) to sgl (data-out
+ * buffer) compare to be done. VERIFY(BytChk=3) sets arr to a valid address and sets
+ * miscomp_idxp to NULL.
+ */
 static bool comp_write_worker(struct sdeb_store_info *sip, u64 lba, u32 num,
-                             const u8 *arr, bool compare_only)
+                             const u8 *arr, struct scsi_cmnd *scp, size_t *miscomp_idxp)
 {
-       bool res;
-       u64 block, rest = 0;
+       bool equ;
+       u64 block, lba_start, sgl_i, rem, rest = 0;
        u32 store_blks = sdebug_store_sectors;
-       u32 lb_size = sdebug_sector_size;
-       u8 *fsp = sip->storep;
+       const u32 lb_size = sdebug_sector_size;
+       u32 top_half = num * lb_size;
+       struct scsi_data_buffer *sdb = &scp->sdb;
+       struct scatterlist *store_sgl;
 
        block = do_div(lba, store_blks);
        if (block + num > store_blks)
                rest = block + num - store_blks;
-
-       res = !memcmp(fsp + (block * lb_size), arr, (num - rest) * lb_size);
-       if (!res)
-               return res;
-       if (rest)
-               res = memcmp(fsp, arr + ((num - rest) * lb_size),
-                            rest * lb_size);
+       lba_start = block * lb_size;
+       sgl_i = lba_start >> sip->elem_pow2;
+       rem = lba_start - (sgl_i ? (sgl_i << sip->elem_pow2) : 0);
+       store_sgl = sip->sgl + sgl_i;   /* O(1) to each store sg element */
+
+       if (!arr) {     /* sgl to sgl compare */
+               equ = sgl_equal_sgl_idx(store_sgl, sip->n_elem - sgl_i, rem,
+                                       sdb->table.sgl, sdb->table.nents, 0,
+                                       (num - rest) * lb_size, miscomp_idxp);
+               if (!equ)
+                       return equ;
+               if (rest > 0)
+                       equ = sgl_equal_sgl_idx(sip->sgl, sip->n_elem, 0, sdb->table.sgl,
+                                               sdb->table.nents, (num - rest) * lb_size,
+                                               rest * lb_size, miscomp_idxp);
+       } else {
+               equ = sdeb_sgl_cmp_buf(store_sgl, sip->n_elem - sgl_i, arr,
+                                      (num - rest) * lb_size, 0);
+               if (!equ)
+                       return equ;
+               if (rest > 0)
+                       equ = sdeb_sgl_cmp_buf(sip->sgl, sip->n_elem, arr,
+                                              (num - rest) * lb_size, 0);
+       }
+       if (!equ || !miscomp_idxp)
+               return equ;
+
+       /* Copy "top half" of dout (args: 4, 5 and 6) into store sgl (args 1, 2 and 3) */
+       sgl_copy_sgl(store_sgl, sip->n_elem - sgl_i, rem,
+                    sdb->table.sgl, sdb->table.nents, top_half,
+                    (num - rest) * lb_size);
+       if (rest > 0) { /* for virtual_gb need to handle wrap-around of store */
+               u32 src_off = top_half + ((num - rest) * lb_size);
+
+               sgl_copy_sgl(sip->sgl, sip->n_elem, 0,
+                            sdb->table.sgl, sdb->table.nents, src_off, rest * lb_size);
-       if (!res)
-               return res;
-       if (compare_only)
-               return true;
-       arr += num * lb_size;
-       memcpy(fsp + (block * lb_size), arr, (num - rest) * lb_size);
-       if (rest)
-               memcpy(fsp, arr + ((num - rest) * lb_size), rest * lb_size);
-       return res;
+       }
+       return true;
 }
 
 static __be16 dif_compute_csum(const void *buf, int len)
@@ -3185,13 +3301,22 @@
 {
        int ret = 0;
        unsigned int i;
+       const u32 lb_size = sdebug_sector_size;
        sector_t sector;
+       u64 lba, lba_start, block, rem, sgl_i;
        struct sdeb_store_info *sip = devip2sip((struct sdebug_dev_info *)
                                                scp->device->hostdata, true);
        struct t10_pi_tuple *sdt;
+       struct scatterlist *store_sgl;
+       u8 *arr;
+
+       arr = kzalloc(lb_size, GFP_ATOMIC);
+       if (!arr)
+               return -1;      /* mkp, is this correct? */
 
        for (i = 0; i < sectors; i++, ei_lba++) {
                sector = start_sec + i;
+               lba = sector;
                sdt = dif_store(sip, sector);
 
                if (sdt->app_tag == cpu_to_be16(0xffff))
@@ -3205,11 +3330,19 @@
                 * have to iterate over the PI twice.
                 */
                if (scp->cmnd[1] >> 5) { /* RDPROTECT */
-                       ret = dif_verify(sdt, lba2fake_store(sip, sector),
-                                        sector, ei_lba);
+                       block = do_div(lba, sdebug_store_sectors);
+                       lba_start = block * lb_size;
+                       sgl_i = lba_start >> sip->elem_pow2;
+                       rem = lba_start - (sgl_i ? (sgl_i << sip->elem_pow2) : 0);
+                       store_sgl = sip->sgl + sgl_i;
+
+                       ret = sg_copy_buffer(store_sgl, sip->n_elem - sgl_i, arr, lb_size, rem, true);
+
+                       ret = dif_verify(sdt, arr, sector, ei_lba);
+
                        if (ret) {
                                dif_errors++;
-                               break;
+                               goto fini;
                        }
                }
        }
@@ -3217,6 +3350,8 @@
                dif_copy_prot(scp, start_sec, sectors, true);
        dix_reads++;
 
+fini:
+       kfree(arr);
        return ret;
 }
@@ -3431,6 +3566,7 @@ static int prot_verify_write(struct scsi_cmnd *SCpnt, sector_t start_sec,
                             unsigned int sectors, u32 ei_lba)
 {
        int ret;
+       const u32 lb_size = sdebug_sector_size;
        struct t10_pi_tuple *sdt;
        void *daddr;
        sector_t sector = start_sec;
@@ -3480,7 +3616,7 @@ static int prot_verify_write(struct scsi_cmnd *SCpnt, sector_t start_sec,
 
                        sector++;
                        ei_lba++;
-                       dpage_offset += sdebug_sector_size;
+                       dpage_offset += lb_size;
                }
                diter.consumed = dpage_offset;
                sg_miter_stop(&diter);
@@ -3555,8 +3691,8 @@ static void map_region(struct sdeb_store_info *sip, sector_t lba,
 static void unmap_region(struct sdeb_store_info *sip, sector_t lba,
                         unsigned int len)
 {
+       const u32 lb_size = sdebug_sector_size;
        sector_t end = lba + len;
-       u8 *fsp = sip->storep;
 
        while (lba < end) {
                unsigned long index = lba_to_map_index(lba);
@@ -3566,10 +3702,26 @@ static void unmap_region(struct sdeb_store_info *sip, sector_t lba,
                    index < map_size) {
                        clear_bit(index, sip->map_storep);
                        if (sdebug_lbprz) { /* for LBPRZ=2 return 0xff_s */
-                               memset(fsp + lba * sdebug_sector_size,
-                                      (sdebug_lbprz & 1) ? 0 : 0xff,
-                                      sdebug_sector_size *
-                                      sdebug_unmap_granularity);
+                               int val = (sdebug_lbprz & 1) ? 0 : 0xff;
+                               u32 num = sdebug_unmap_granularity;
+                               u64 lbaa = lba;
+                               u64 rest = 0;
+                               u64 block, lba_start, sgl_i, rem;
+                               struct scatterlist *store_sgl;
+
+                               block = do_div(lbaa, sdebug_store_sectors);
+                               if (block + num > sdebug_store_sectors)
+                                       rest = block + num - sdebug_store_sectors;
+                               lba_start = block * lb_size;
+                               sgl_i = lba_start >> sip->elem_pow2;
+                               rem = lba_start - (sgl_i ? (sgl_i << sip->elem_pow2) : 0);
+                               store_sgl = sip->sgl + sgl_i;
+
+                               sgl_memset(store_sgl, sip->n_elem - sgl_i, rem, val,
+                                          num * lb_size);
+                               if (rest)
+                                       sgl_memset(sip->sgl, sip->n_elem, rem, val,
+                                                  (num - rest) * lb_size);
                        }
                        if (sip->dif_storep) {
                                memset(sip->dif_storep + lba, 0xff,
@@ -3727,7 +3879,7 @@ static int resp_write_scat(struct scsi_cmnd *scp,
        u8 wrprotect;
        u16 lbdof, num_lrd, k;
        u32 num, num_by, bt_len, lbdof_blen, sg_off, cum_lb;
-       u32 lb_size = sdebug_sector_size;
+       const u32 lb_size = sdebug_sector_size;
        u32 ei_lba;
        u64 lba;
        int ret, res;
@@ -3885,13 +4037,12 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
        struct scsi_device *sdp = scp->device;
        struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdp->hostdata;
        unsigned long long i;
-       u64 block, lbaa;
-       u32 lb_size = sdebug_sector_size;
+       u64 block, lbaa, sgl_i, lba_start, rem;
+       const u32 lb_size = sdebug_sector_size;
        int ret;
        struct sdeb_store_info *sip = devip2sip((struct sdebug_dev_info *)
                                                scp->device->hostdata, true);
-       u8 *fs1p;
-       u8 *fsp;
+       struct scatterlist *store_sgl;
 
        sdeb_write_lock(sip);
 
@@ -3907,15 +4058,17 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
        }
        lbaa = lba;
        block = do_div(lbaa, sdebug_store_sectors);
-       /* if ndob then zero 1 logical block, else fetch 1 logical block */
-       fsp = sip->storep;
-       fs1p = fsp + (block * lb_size);
-       if (ndob) {
-               memset(fs1p, 0, lb_size);
-               ret = 0;
-       } else
-               ret = fetch_to_dev_buffer(scp, fs1p, lb_size);
+       /* if ndob then zero 1 logical block, else fetch 1 logical block */
+       lba_start = block * lb_size;
+       sgl_i = lba_start >> sip->elem_pow2;
+       rem = lba_start - (sgl_i ? (sgl_i << sip->elem_pow2) : 0);
+       store_sgl = sip->sgl + sgl_i;
+       ret = 0;
+       if (ndob)
+               sgl_memset(store_sgl, sip->n_elem - sgl_i, rem, 0, lb_size);
+       else
+               ret = do_device_access(sip, scp, 0, lba, 1, true);
 
        if (-1 == ret) {
                sdeb_write_unlock(sip);
                return DID_ERROR << 16;
@@ -3926,9 +4079,11 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
 
        /* Copy first sector to remaining blocks */
        for (i = 1 ; i < num ; i++) {
-               lbaa = lba + i;
-               block = do_div(lbaa, sdebug_store_sectors);
-               memmove(fsp + (block * lb_size), fs1p, lb_size);
+               ret = do_device_access(sip, scp, 0, lba + i, 1, true);
+               if (-1 == ret) {
+                       write_unlock(&sip->macc_lck);
+                       return DID_ERROR << 16;
+               }
        }
        if (scsi_debug_lbp())
                map_region(sip, lba, num);
@@ -3937,7 +4092,6 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num,
                zbc_inc_wp(devip, lba, num);
 out:
        sdeb_write_unlock(sip);
-
        return 0;
 }
@@ -4043,15 +4197,14 @@ static int resp_write_buffer(struct scsi_cmnd *scp,
        return 0;
 }
 
-static int resp_comp_write(struct scsi_cmnd *scp,
-                          struct sdebug_dev_info *devip)
+static int resp_comp_write(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 {
        u8 *cmd = scp->cmnd;
-       u8 *arr;
        struct sdeb_store_info *sip = devip2sip(devip, true);
        u64 lba;
+       size_t miscomp_idx;
        u32 dnum;
-       u32 lb_size = sdebug_sector_size;
+       const u32 lb_size = sdebug_sector_size;
        u8 num;
        int ret;
        int retval = 0;
@@ -4074,25 +4227,21 @@ static int resp_comp_write(struct scsi_cmnd *scp,
        if (ret)
                return ret;
        dnum = 2 * num;
-       arr = kcalloc(lb_size, dnum, GFP_ATOMIC);
-       if (NULL == arr) {
-               mk_sense_buffer(scp, ILLEGAL_REQUEST, INSUFF_RES_ASC,
-                               INSUFF_RES_ASCQ);
-               return check_condition_result;
-       }
 
        sdeb_write_lock(sip);
-
-       ret = do_dout_fetch(scp, dnum, arr);
-       if (ret == -1) {
-               retval = DID_ERROR << 16;
+       if (scp->sdb.length < dnum * lb_size || scp->sc_data_direction != DMA_TO_DEVICE) {
+               mk_sense_buffer(scp, ILLEGAL_REQUEST, PARAMETER_LIST_LENGTH_ERR, 0);
+               retval = check_condition_result;
+               if (sdebug_verbose)
+                       sdev_printk(KERN_INFO, scp->device,
+                                   "%s::%s: cdb indicated=%u, IO sent=%d bytes\n", my_name,
+                                   __func__, dnum * lb_size, ret);
                goto cleanup;
-       } else if (sdebug_verbose && (ret < (dnum * lb_size)))
-               sdev_printk(KERN_INFO, scp->device, "%s: compare_write: cdb "
-                           "indicated=%u, IO sent=%d bytes\n", my_name,
-                           dnum * lb_size, ret);
-       if (!comp_write_worker(sip, lba, num, arr, false)) {
+       }
+
+       if (!comp_write_worker(sip, lba, num, NULL, scp, &miscomp_idx)) {
                mk_sense_buffer(scp, MISCOMPARE, MISCOMPARE_VERIFY_ASC, 0);
+               scsi_set_sense_information(scp->sense_buffer, SCSI_SENSE_BUFFERSIZE, miscomp_idx);
                retval = check_condition_result;
                goto cleanup;
        }
@@ -4100,7 +4249,6 @@ static int resp_comp_write(struct scsi_cmnd *scp,
                map_region(sip, lba, num);
 cleanup:
        sdeb_write_unlock(sip);
-       kfree(arr);
        return retval;
 }
@@ -4246,12 +4394,12 @@ static int resp_pre_fetch(struct scsi_cmnd *scp,
                          struct sdebug_dev_info *devip)
 {
        int res = 0;
-       u64 lba;
-       u64 block, rest = 0;
+       const u32 lb_size = sdebug_sector_size;
+       u64 lba, block, sgl_i, rem, lba_start, rest = 0;
        u32 nblks;
        u8 *cmd = scp->cmnd;
        struct sdeb_store_info *sip = devip2sip(devip, true);
-       u8 *fsp = sip->storep;
+       struct scatterlist *store_sgl;
 
        if (cmd[0] == PRE_FETCH) {      /* 10 byte cdb */
                lba = get_unaligned_be32(cmd + 2);
@@ -4264,21 +4412,21 @@ static int resp_pre_fetch(struct scsi_cmnd *scp,
                mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0);
                return check_condition_result;
        }
-       if (!fsp)
-               goto fini;
        /* PRE-FETCH spec says nothing about LBP or PI so skip them */
        block = do_div(lba, sdebug_store_sectors);
        if (block + nblks > sdebug_store_sectors)
                rest = block + nblks - sdebug_store_sectors;
+       lba_start = block * lb_size;
+       sgl_i = lba_start >> sip->elem_pow2;
+       rem = lba_start - (sgl_i ? (sgl_i << sip->elem_pow2) : 0);
+       store_sgl = sip->sgl + sgl_i;   /* O(1) to each store sg element */
 
        /* Try to bring the PRE-FETCH range into CPU's cache */
        sdeb_read_lock(sip);
-       prefetch_range(fsp + (sdebug_sector_size * block),
-                      (nblks - rest) * sdebug_sector_size);
+       sdeb_sgl_prefetch(store_sgl, sip->n_elem - sgl_i, rem, (nblks - rest) * lb_size);
        if (rest)
-               prefetch_range(fsp, rest * sdebug_sector_size);
+               sdeb_sgl_prefetch(sip->sgl, sip->n_elem, 0, rest * lb_size);
        sdeb_read_unlock(sip);
-fini:
        if (cmd[1] & 0x2)
                res = SDEG_RES_IMMED_MASK;
        return res | condition_met_result;
@@ -4395,7 +4543,7 @@ static int resp_verify(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
        u32 vnum, a_num, off;
        const u32 lb_size = sdebug_sector_size;
        u64 lba;
-       u8 *arr;
+       u8 *arr = NULL;
        u8 *cmd = scp->cmnd;
        struct sdeb_store_info *sip = devip2sip(devip, true);
@@ -4429,30 +4577,31 @@ static int resp_verify(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
        if (ret)
                return ret;
 
-       arr = kcalloc(lb_size, vnum, GFP_ATOMIC);
-       if (!arr) {
-               mk_sense_buffer(scp, ILLEGAL_REQUEST, INSUFF_RES_ASC,
-                               INSUFF_RES_ASCQ);
-               return check_condition_result;
+       if (is_bytchk3) {
+               arr = kcalloc(lb_size, vnum, GFP_ATOMIC);
+               if (!arr) {
+                       mk_sense_buffer(scp, ILLEGAL_REQUEST, INSUFF_RES_ASC, INSUFF_RES_ASCQ);
+                       return check_condition_result;
+               }
        }
        /* Not changing store, so only need read access */
        sdeb_read_lock(sip);
 
-       ret = do_dout_fetch(scp, a_num, arr);
-       if (ret == -1) {
-               ret = DID_ERROR << 16;
-               goto cleanup;
-       } else if (sdebug_verbose && (ret < (a_num * lb_size))) {
-               sdev_printk(KERN_INFO, scp->device,
-                           "%s: %s: cdb indicated=%u, IO sent=%d bytes\n",
-                           my_name, __func__, a_num * lb_size, ret);
-       }
        if (is_bytchk3) {
+               ret = do_dout_fetch(scp, a_num, arr);
+               if (ret == -1) {
+                       ret = DID_ERROR << 16;
+                       goto cleanup;
+               } else if (sdebug_verbose && (ret < (a_num * lb_size))) {
+                       sdev_printk(KERN_INFO, scp->device,
+                                   "%s: %s: cdb indicated=%u, IO sent=%d bytes\n",
+                                   my_name, __func__, a_num * lb_size, ret);
+               }
                for (j = 1, off = lb_size; j < vnum; ++j, off += lb_size)
                        memcpy(arr + off, arr, lb_size);
        }
        ret = 0;
-       if (!comp_write_worker(sip, lba, vnum, arr, true)) {
+       if (!comp_write_worker(sip, lba, vnum, arr, scp, NULL)) {
                mk_sense_buffer(scp, MISCOMPARE, MISCOMPARE_VERIFY_ASC, 0);
                ret = check_condition_result;
                goto cleanup;
@@ -4831,9 +4980,16 @@ static void zbc_rwp_zone(struct sdebug_dev_info *devip,
        if (zsp->z_cond == ZC4_CLOSED)
                devip->nr_closed--;
 
-       if (zsp->z_wp > zsp->z_start)
-               memset(sip->storep + zsp->z_start * sdebug_sector_size, 0,
-                      (zsp->z_wp - zsp->z_start) * sdebug_sector_size);
+       if (zsp->z_wp > zsp->z_start) {
+               u32 lb_size = sdebug_sector_size;
+               u64 lba_start = zsp->z_start * lb_size;
+               u64 sgl_i = lba_start >> sip->elem_pow2;
+               u64 rem = lba_start - (sgl_i ? (sgl_i << sip->elem_pow2) : 0);
+               struct scatterlist *store_sgl = sip->sgl + sgl_i;
+
+               sgl_memset(store_sgl, sip->n_elem - sgl_i, rem, 0,
+                          (zsp->z_wp - zsp->z_start) * lb_size);
+       }
 
        zsp->z_non_seq_resource = false;
        zsp->z_wp = zsp->z_start;
@@ -6051,15 +6207,15 @@ static int scsi_debug_show_info(struct seq_file *m, struct Scsi_Host *host)
                                   sdhp->shost->host_no, idx);
                        ++j;
                }
-               seq_printf(m, "\nper_store array [most_recent_idx=%d]:\n",
+               seq_printf(m, "\nper_store array [most_recent_idx=%d] sgl_s:\n",
                           sdeb_most_recent_idx);
                j = 0;
                xa_for_each(per_store_ap, l_idx, sip) {
                        niu = xa_get_mark(per_store_ap, l_idx, SDEB_XA_NOT_IN_USE);
                        idx = (int)l_idx;
-                       seq_printf(m, "  %d: idx=%d%s\n", j, idx,
-                                  (niu ? " not_in_use" : ""));
+                       seq_printf(m, "  %d: idx=%d%s, n_elems=%u, elem_sz=%u\n", j, idx,
+                                  (niu ? " not_in_use" : ""), sip->n_elem, 1 << sip->elem_pow2);
                        ++j;
                }
        }
@@ -7178,7 +7334,8 @@ static void sdebug_erase_store(int idx, struct sdeb_store_info *sip)
        }
        vfree(sip->map_storep);
        vfree(sip->dif_storep);
-       vfree(sip->storep);
+       if (sip->sgl)
+               sgl_free_n_order(sip->sgl, sip->n_elem, sip->order);
        xa_erase(per_store_ap, idx);
        kfree(sip);
 }
@@ -7199,6 +7356,41 @@ static void sdebug_erase_all_stores(bool apart_from_first)
        sdeb_most_recent_idx = sdeb_first_idx;
 }
 
+/* Want uniform sg element size, the last one can be less. */
+static int sdeb_store_sgat(struct sdeb_store_info *sip, int sz_mib)
+{
+       unsigned int order;
+       unsigned long sz_b = (unsigned long)sz_mib * 1048576;
+       gfp_t mask_ap = GFP_KERNEL | __GFP_COMP | __GFP_NOWARN | __GFP_ZERO;
+
+       if (sz_mib <= 128)
+               order = get_order(max_t(unsigned int, PAGE_SIZE, 32 * 1024));
+       else if (sz_mib <= 256)
+               order = get_order(max_t(unsigned int, PAGE_SIZE, 64 * 1024));
+       else if (sz_mib <= 512)
+               order = get_order(max_t(unsigned int, PAGE_SIZE, 128 * 1024));
+       else if (sz_mib <= 1024)
+               order = get_order(max_t(unsigned int, PAGE_SIZE, 256 * 1024));
+       else if (sz_mib <= 2048)
+               order = get_order(max_t(unsigned int, PAGE_SIZE, 512 * 1024));
+       else
+               order = get_order(max_t(unsigned int, PAGE_SIZE, 1024 * 1024));
+       sip->sgl = sgl_alloc_order(sz_b, order, false, mask_ap, &sip->n_elem);
+       if (!sip->sgl && order > 0) {
+               sip->sgl = sgl_alloc_order(sz_b, --order, false, mask_ap, &sip->n_elem);
+               if (!sip->sgl && order > 0)
+                       sip->sgl = sgl_alloc_order(sz_b, --order, false, mask_ap, &sip->n_elem);
+       }
+       if (!sip->sgl) {
+               pr_info("%s: unable to obtain %d MiB, last element size: %u kiB\n", __func__,
+                       sz_mib, (1 << (PAGE_SHIFT + order)) / 1024);
+               return -ENOMEM;
+       }
+       sip->order = order;
+       sip->elem_pow2 = PAGE_SHIFT + order;
+       return 0;
+}
+
 /*
  * Returns store xarray new element index (idx) if >=0 else negated errno.
  * Limit the number of stores to 65536.
@@ -7230,13 +7422,21 @@ static int sdebug_add_store(void)
        xa_unlock_irqrestore(per_store_ap, iflags);
 
        res = -ENOMEM;
-       sip->storep = vzalloc(sz);
-       if (!sip->storep) {
-               pr_err("user data oom\n");
+       res = sdeb_store_sgat(sip, sdebug_dev_size_mb);
+       if (res) {
+               pr_err("sgat: user data oom\n");
                goto err;
        }
-       if (sdebug_num_parts > 0)
-               sdebug_build_parts(sip->storep, sz);
+       if (sdebug_num_parts > 0) {
+               const int a_len = 1024;
+               u8 *arr = kzalloc(a_len, GFP_KERNEL);
+
+               if (arr) {
+                       sdebug_build_parts(arr, sz);
+                       sg_copy_from_buffer(sip->sgl, sip->n_elem, arr, a_len);
+                       kfree(arr);
+               }
+       }
 
        /* DIF/DIX: what T10 calls Protection Information (PI) */
        if (sdebug_dix) {