From patchwork Thu Apr 15 23:15:27 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 12206277
From: Mike Snitzer
To: Christoph Hellwig, Jens Axboe
Cc: dm-devel@redhat.com, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Subject: [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set
Date: Thu, 15 Apr 2021 19:15:27 -0400
Message-Id: <20210415231530.95464-2-snitzer@redhat.com>
X-Mailer: git-send-email 2.15.0
In-Reply-To: <20210415231530.95464-1-snitzer@redhat.com>
References: <20210415231530.95464-1-snitzer@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

If the DNR bit is set we should not retry the command.
We care about the retryable vs not retryable distinction at the block
layer, so propagate the equivalent of the DNR bit by introducing
BLK_STS_DO_NOT_RETRY. Update blk_path_error() to _not_ retry if it is
set.

This change follows the suggestion made here:
https://lore.kernel.org/linux-nvme/20190813170144.GA10269@lst.de/

Suggested-by: Christoph Hellwig
Signed-off-by: Mike Snitzer
Reviewed-by: Chaitanya Kulkarni
Reviewed-by: Hannes Reinecke
---
 drivers/nvme/host/core.c  | 3 +++
 include/linux/blk_types.h | 8 ++++++++
 2 files changed, 11 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0896e21642be..540d6fd8ffef 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -237,6 +237,9 @@ static void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
 
 static blk_status_t nvme_error_status(u16 status)
 {
+	if (unlikely(status & NVME_SC_DNR))
+		return BLK_STS_DO_NOT_RETRY;
+
 	switch (status & 0x7ff) {
 	case NVME_SC_SUCCESS:
 		return BLK_STS_OK;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index db026b6ec15a..1ca724948c56 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -142,6 +142,13 @@ typedef u8 __bitwise blk_status_t;
  */
 #define BLK_STS_ZONE_ACTIVE_RESOURCE	((__force blk_status_t)16)
 
+/*
+ * BLK_STS_DO_NOT_RETRY is returned from the driver in the completion path
+ * if the device returns a status indicating that if the same command is
+ * re-submitted it is expected to fail.
+ */
+#define BLK_STS_DO_NOT_RETRY	((__force blk_status_t)17)
+
 /**
  * blk_path_error - returns true if error may be path related
  * @error: status the request was completed with
@@ -157,6 +164,7 @@ typedef u8 __bitwise blk_status_t;
 static inline bool blk_path_error(blk_status_t error)
 {
 	switch (error) {
+	case BLK_STS_DO_NOT_RETRY:
 	case BLK_STS_NOTSUPP:
 	case BLK_STS_NOSPC:
 	case BLK_STS_TARGET:

From patchwork Thu Apr 15 23:15:28 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 12206281
From: Mike Snitzer
To: Christoph Hellwig, Jens Axboe
Cc: dm-devel@redhat.com, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, Chao Leng
Subject: [PATCH v2 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set
Date: Thu, 15 Apr 2021 19:15:28 -0400
Message-Id: <20210415231530.95464-3-snitzer@redhat.com>
X-Mailer: git-send-email 2.15.0
In-Reply-To: <20210415231530.95464-1-snitzer@redhat.com>
References: <20210415231530.95464-1-snitzer@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

From: Chao Leng

REQ_FAILFAST_TRANSPORT was designed for SCSI, because the SCSI protocol
does not define a local retry mechanism. SCSI implements a fuzzy local
retry mechanism, so REQ_FAILFAST_TRANSPORT is needed to allow
higher-level multipathing software to perform failover/retry.

NVMe differs from SCSI here. It defines a local retry mechanism and
path error codes, so NVMe should retry non-path errors locally. For
path-related errors, whether and how to retry is still determined by
higher-level multipathing's failover.

Unlike SCSI, NVMe shouldn't prevent retry if REQ_FAILFAST_TRANSPORT is
set, because NVMe's local retry is needed -- as is NVMe-specific logic
to categorize whether an error is path related.

With this change, NVMe native multipath and other multipath
implementations use the same mechanism: non-path-related errors are
retried locally, and path-related errors are handled by multipath.
Signed-off-by: Chao Leng
[snitzer: edited header for grammar and clarity, also added code comment]
Signed-off-by: Mike Snitzer
---
 drivers/nvme/host/core.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 540d6fd8ffef..4134cf3c7e48 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	if (likely(nvme_req(req)->status == 0))
 		return COMPLETE;
 
-	if (blk_noretry_request(req) ||
+	/*
+	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
+	 * handles multipathing. Unlike SCSI, NVMe's error handling was
+	 * specifically designed to handle local retry for non-path errors.
+	 * As such, allow NVMe's local retry mechanism to be used for
+	 * requests marked with REQ_FAILFAST_TRANSPORT.
+	 */
+	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
 	    (nvme_req(req)->status & NVME_SC_DNR) ||
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;

From patchwork Thu Apr 15 23:15:29 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 12206283
From: Mike Snitzer
To: Christoph Hellwig, Jens Axboe
Cc: dm-devel@redhat.com, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Subject: [PATCH v2 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
Date: Thu, 15 Apr 2021 19:15:29 -0400
Message-Id: <20210415231530.95464-4-snitzer@redhat.com>
X-Mailer: git-send-email 2.15.0
In-Reply-To: <20210415231530.95464-1-snitzer@redhat.com>
References: <20210415231530.95464-1-snitzer@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

If REQ_FAILFAST_TRANSPORT is set it means the driver should not retry
IO that completed with transport errors. REQ_FAILFAST_TRANSPORT is
set by multipathing software (e.g. dm-multipath) before it issues IO.

Update NVMe to allow failover of requests marked with either
REQ_NVME_MPATH or REQ_FAILFAST_TRANSPORT. This allows such requests
to be given a disposition of either FAILOVER or FAILUP respectively.
FAILUP handling ensures a retryable error is returned up from NVMe.

Introduce nvme_failup_req() for use in nvme_complete_rq() if
nvme_decide_disposition() returns FAILUP. nvme_failup_req() ensures
the request is completed with a retryable IO error when appropriate.
__nvme_end_req() was factored out for use by both nvme_end_req() and
nvme_failup_req().
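
As an illustration only, the disposition logic described above can be
modelled in a few lines of userspace C. This is a sketch, not the driver
code: the MODEL_* flag bits, MODEL_MAX_RETRIES, struct model_req and the
model_* functions are hypothetical stand-ins, and the DNR bit position
and the path-error mask are assumptions about the NVMe status layout.

```c
#include <assert.h>	/* only needed by callers that want to check the model */
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; the real REQ_* values are defined by the kernel. */
enum {
	MODEL_FAILFAST_DEV       = 1 << 0,
	MODEL_FAILFAST_TRANSPORT = 1 << 1,
	MODEL_FAILFAST_DRIVER    = 1 << 2,
	MODEL_NVME_MPATH         = 1 << 3,
};

#define MODEL_SC_DNR      0x4000u	/* assumed position of the Do Not Retry bit */
#define MODEL_MAX_RETRIES 5		/* stand-in for nvme_max_retries */

enum model_disposition { COMPLETE, RETRY, FAILOVER, FAILUP };

/* Hypothetical, flattened view of the request state the decision reads. */
struct model_req {
	uint32_t cmd_flags;
	uint16_t status;
	int retries;
	bool queue_dying;
};

/* Assumed path-error test: status code type 3h in bits 10:8. */
static bool model_is_path_error(uint16_t status)
{
	return (status & 0x700u) == 0x300u;
}

static enum model_disposition model_decide(const struct model_req *req)
{
	if (req->status == 0)
		return COMPLETE;

	/* FAILFAST_TRANSPORT alone no longer short-circuits local retry. */
	if ((req->cmd_flags & (MODEL_FAILFAST_DEV | MODEL_FAILFAST_DRIVER)) ||
	    (req->status & MODEL_SC_DNR) ||
	    req->retries >= MODEL_MAX_RETRIES)
		return COMPLETE;

	if (req->cmd_flags & (MODEL_NVME_MPATH | MODEL_FAILFAST_TRANSPORT)) {
		/* Native multipath fails over; anything else fails up. */
		if (model_is_path_error(req->status) || req->queue_dying)
			return (req->cmd_flags & MODEL_NVME_MPATH) ?
				FAILOVER : FAILUP;
	} else {
		if (req->queue_dying)
			return COMPLETE;
	}

	return RETRY;
}
```

In this model, a request carrying only MODEL_FAILFAST_TRANSPORT that
fails with a path error is classified as FAILUP, the same failure with
MODEL_NVME_MPATH set is classified as FAILOVER, and a non-path error
without the DNR bit falls through to RETRY.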
Signed-off-by: Mike Snitzer
---
 drivers/nvme/host/core.c | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4134cf3c7e48..10375197dd53 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -299,6 +299,7 @@ enum nvme_disposition {
 	COMPLETE,
 	RETRY,
 	FAILOVER,
+	FAILUP,
 };
 
 static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
@@ -318,10 +319,11 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;
 
-	if (req->cmd_flags & REQ_NVME_MPATH) {
+	if (req->cmd_flags & (REQ_NVME_MPATH | REQ_FAILFAST_TRANSPORT)) {
 		if (nvme_is_path_error(nvme_req(req)->status) ||
 		    blk_queue_dying(req->q))
-			return FAILOVER;
+			return (req->cmd_flags & REQ_NVME_MPATH) ?
+				FAILOVER : FAILUP;
 	} else {
 		if (blk_queue_dying(req->q))
 			return COMPLETE;
@@ -330,10 +332,8 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	return RETRY;
 }
 
-static inline void nvme_end_req(struct request *req)
+static inline void __nvme_end_req(struct request *req, blk_status_t status)
 {
-	blk_status_t status = nvme_error_status(nvme_req(req)->status);
-
 	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
 	    req_op(req) == REQ_OP_ZONE_APPEND)
 		req->__sector = nvme_lba_to_sect(req->q->queuedata,
@@ -343,6 +343,24 @@ static inline void nvme_end_req(struct request *req)
 	blk_mq_end_request(req, status);
 }
 
+static inline void nvme_end_req(struct request *req)
+{
+	__nvme_end_req(req, nvme_error_status(nvme_req(req)->status));
+}
+
+static void nvme_failup_req(struct request *req)
+{
+	blk_status_t status = nvme_error_status(nvme_req(req)->status);
+
+	if (WARN_ON_ONCE(!blk_path_error(status))) {
+		pr_debug("Request meant for failover but blk_status_t (errno=%d) was not retryable.\n",
+			 blk_status_to_errno(status));
+		status = BLK_STS_IOERR;
+	}
+
+	__nvme_end_req(req, status);
+}
+
 void nvme_complete_rq(struct request *req)
 {
 	trace_nvme_complete_rq(req);
@@ -361,6 +379,9 @@ void nvme_complete_rq(struct request *req)
 	case FAILOVER:
 		nvme_failover_req(req);
 		return;
+	case FAILUP:
+		nvme_failup_req(req);
+		return;
 	}
 }
 EXPORT_SYMBOL_GPL(nvme_complete_rq);

From patchwork Thu Apr 15 23:15:30 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 12206285
From: Mike Snitzer
To: Christoph Hellwig, Jens Axboe
Cc: dm-devel@redhat.com, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Subject: [PATCH v2 4/4] nvme: decouple basic ANA log page re-read support from native multipathing
Date: Thu, 15 Apr 2021 19:15:30 -0400
Message-Id: <20210415231530.95464-5-snitzer@redhat.com>
X-Mailer: git-send-email 2.15.0
In-Reply-To: <20210415231530.95464-1-snitzer@redhat.com>
References: <20210415231530.95464-1-snitzer@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

Whether or not ANA is present is a choice of the target implementation;
the host (and whether it supports multipathing) has _zero_ influence on
this. If the target declares a path as 'inaccessible' the path _is_
inaccessible to the host. As such, ANA support should be functional
even if native multipathing is not.

Introduce the ability to always re-read the ANA log page as required
due to an ANA error, and make the current ANA state available via sysfs
-- even if native multipathing is disabled on the host (e.g.
nvme_core.multipath=N). This is achieved by factoring nvme_update_ana()
out of nvme_failover_req() and also calling it from nvme_failup_req(),
so the ANA log page is re-read for FAILOVER and FAILUP requests alike.

This affords userspace access to the current ANA state independent of
which layer might be doing multipathing. This makes 'nvme list-subsys'
show ANA state for all NVMe subsystems with multiple controllers. It
also allows userspace multipath-tools to rely on the NVMe driver for
ANA support while dm-multipath takes care of multipathing.

And as always, if embedded NVMe users do not want any performance
overhead associated with ANA or native NVMe multipathing they can
disable CONFIG_NVME_MULTIPATH.
Signed-off-by: Mike Snitzer
---
 drivers/nvme/host/core.c      |  2 ++
 drivers/nvme/host/multipath.c | 16 +++++++++++-----
 drivers/nvme/host/nvme.h      |  4 ++++
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 10375197dd53..1c6dc3a6c24d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -352,6 +352,8 @@ static void nvme_failup_req(struct request *req)
 {
 	blk_status_t status = nvme_error_status(nvme_req(req)->status);
 
+	nvme_update_ana(req);
+
 	if (WARN_ON_ONCE(!blk_path_error(status))) {
 		pr_debug("Request meant for failover but blk_status_t (errno=%d) was not retryable.\n",
 			 blk_status_to_errno(status));
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index a1d476e1ac02..7d94250264aa 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -65,23 +65,29 @@ void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 	}
 }
 
-void nvme_failover_req(struct request *req)
+void nvme_update_ana(struct request *req)
 {
 	struct nvme_ns *ns = req->q->queuedata;
 	u16 status = nvme_req(req)->status & 0x7ff;
-	unsigned long flags;
-
-	nvme_mpath_clear_current_path(ns);
 
 	/*
 	 * If we got back an ANA error, we know the controller is alive but not
-	 * ready to serve this namespace. Kick of a re-read of the ANA
+	 * ready to serve this namespace. Kick off a re-read of the ANA
 	 * information page, and just try any other available path for now.
 	 */
 	if (nvme_is_ana_error(status) && ns->ctrl->ana_log_buf) {
 		set_bit(NVME_NS_ANA_PENDING, &ns->flags);
 		queue_work(nvme_wq, &ns->ctrl->ana_work);
 	}
+}
+
+void nvme_failover_req(struct request *req)
+{
+	struct nvme_ns *ns = req->q->queuedata;
+	unsigned long flags;
+
+	nvme_mpath_clear_current_path(ns);
+	nvme_update_ana(req);
 
 	spin_lock_irqsave(&ns->head->requeue_lock, flags);
 	blk_steal_bios(&ns->head->requeue_list, req);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 07b34175c6ce..4eed8536625c 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -664,6 +664,7 @@ void nvme_mpath_start_freeze(struct nvme_subsystem *subsys);
 void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 			struct nvme_ctrl *ctrl, int *flags);
 void nvme_failover_req(struct request *req);
+void nvme_update_ana(struct request *req);
 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl);
 int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl,struct nvme_ns_head *head);
 void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id);
@@ -714,6 +715,9 @@ static inline void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 static inline void nvme_failover_req(struct request *req)
 {
 }
+static inline void nvme_update_ana(struct request *req)
+{
+}
 static inline void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
 {
 }
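
The split between nvme_update_ana() and nvme_failover_req() in the diff
above can be pictured with a small userspace model. This is a hedged
sketch rather than kernel code: struct model_ctrl, struct model_ns and
the model_* functions are hypothetical, the work queue is reduced to a
call counter, and the three ANA status values are assumptions about the
NVMe path-related status codes.

```c
#include <assert.h>	/* only needed by callers that want to check the model */
#include <stdbool.h>
#include <stdint.h>

/* Assumed ANA status codes within the path-related status type. */
#define MODEL_SC_ANA_PERSISTENT_LOSS 0x301u
#define MODEL_SC_ANA_INACCESSIBLE    0x302u
#define MODEL_SC_ANA_TRANSITION      0x303u

struct model_ctrl {
	bool has_ana_log_buf;	/* stands in for ctrl->ana_log_buf != NULL */
	int ana_work_queued;	/* counts calls that would queue ana_work */
};

struct model_ns {
	struct model_ctrl *ctrl;
	bool ana_pending;		/* stands in for NVME_NS_ANA_PENDING */
	bool current_path_cleared;	/* native multipath bookkeeping only */
};

static bool model_is_ana_error(uint16_t status)
{
	switch (status) {
	case MODEL_SC_ANA_PERSISTENT_LOSS:
	case MODEL_SC_ANA_INACCESSIBLE:
	case MODEL_SC_ANA_TRANSITION:
		return true;
	default:
		return false;
	}
}

/* Shared step: on an ANA error, flag the namespace and ask for a re-read. */
static void model_update_ana(struct model_ns *ns, uint16_t raw_status)
{
	uint16_t status = raw_status & 0x7ff;

	if (model_is_ana_error(status) && ns->ctrl->has_ana_log_buf) {
		ns->ana_pending = true;
		ns->ctrl->ana_work_queued++;
	}
}

/* FAILUP: refresh ANA state only; the upper layer picks the next path. */
static void model_failup(struct model_ns *ns, uint16_t raw_status)
{
	model_update_ana(ns, raw_status);
}

/* FAILOVER: native multipath also drops its cached current path. */
static void model_failover(struct model_ns *ns, uint16_t raw_status)
{
	ns->current_path_cleared = true;
	model_update_ana(ns, raw_status);
}
```

In this model both completion paths trigger the ANA re-read, but only
model_failover() touches the native multipath path cache, which mirrors
the intent of keeping ANA handling independent of which layer does the
multipathing.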