From patchwork Tue May 22 20:33:20 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 10419581
From: Bart Van Assche
To: "axboe@kernel.dk"
CC: "linux-block@vger.kernel.org", "israelr@mellanox.com",
 "sagi@grimberg.me", "hch@lst.de", "sebott@linux.ibm.com",
 "ming.lei@redhat.com", "jianchao.w.wang@oracle.com", "maxg@mellanox.com",
 "tj@kernel.org", "keith.busch@intel.com"
Subject: Re: [PATCH v13] blk-mq: Rework blk-mq timeout handling again
Date: Tue, 22 May 2018 20:33:20 +0000
Message-ID: <256db3889953289d75b989dd589802ce8a756553.camel@wdc.com>
References: <20180522162515.20650-1-bart.vanassche@wdc.com>
 <448d63e9-cfcd-0bc8-abf3-30296590f1d6@kernel.dk>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

On Tue, 2018-05-22 at 13:38 -0600, Jens Axboe wrote:
> On 5/22/18 1:03 PM, Jens Axboe wrote:
> > On 5/22/18 12:47 PM, Jens Axboe wrote:
> > > Ran into this, running block/014 from blktests:
> > >
> > > [ 5744.949839] run blktests block/014 at 2018-05-22 12:41:25
> > > [ 5750.723000] null: rq 00000000ff68f103 timed out
> > > [ 5750.728181] WARNING: CPU: 45 PID: 2480 at block/blk-mq.c:585 __blk_mq_complete_request+0xa6/0x0
> > > [ 5750.738187] Modules linked in: null_blk(+) configfs nvme nvme_core sb_edac x86_pkg_temp_therma]
> > > [ 5750.765509] CPU: 45 PID: 2480 Comm: kworker/45:1H Not tainted 4.17.0-rc6+ #712
> > > [ 5750.774087] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
> > > [ 5750.783369] Workqueue: kblockd blk_mq_timeout_work
> > > [ 5750.789223] RIP: 0010:__blk_mq_complete_request+0xa6/0x110
> > > [ 5750.795850] RSP: 0018:ffff883ffb417d68 EFLAGS: 00010202
> > > [ 5750.802187] RAX: 0000000000000003 RBX: ffff881ff100d800 RCX: 0000000000000000
> > > [ 5750.810649] RDX: ffff88407fd9e040 RSI: ffff88407fd956b8 RDI: ffff881ff100d800
> > > [ 5750.819119] RBP: ffffe8ffffd91800 R08: 0000000000000000 R09: ffffffff82066eb8
> > > [ 5750.827588] R10: ffff883ffa386138 R11: ffff883ffa385900 R12: 0000000000000001
> > > [ 5750.836050] R13: ffff881fe7da6000 R14: 0000000000000020 R15: 0000000000000002
> > > [ 5750.844529] FS: 0000000000000000(0000) GS:ffff88407fd80000(0000) knlGS:0000000000000000
> > > [ 5750.854482] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 5750.861397] CR2: 00007ffc92f97f68 CR3: 000000000201d005 CR4: 00000000003606e0
> > > [ 5750.869861] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [ 5750.878333] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > [ 5750.886805] Call Trace:
> > > [ 5750.890033]  bt_iter+0x42/0x50
> > > [ 5750.894000]  blk_mq_queue_tag_busy_iter+0x12b/0x220
> > > [ 5750.899941]  ? blk_mq_tag_to_rq+0x20/0x20
> > > [ 5750.904913]  ? __rcu_read_unlock+0x50/0x50
> > > [ 5750.909978]  ? blk_mq_tag_to_rq+0x20/0x20
> > > [ 5750.914948]  blk_mq_timeout_work+0x14b/0x240
> > > [ 5750.920220]  process_one_work+0x21b/0x510
> > > [ 5750.925197]  worker_thread+0x3a/0x390
> > > [ 5750.929781]  ? process_one_work+0x510/0x510
> > > [ 5750.934944]  kthread+0x11c/0x140
> > > [ 5750.939028]  ? kthread_create_worker_on_cpu+0x50/0x50
> > > [ 5750.945169]  ret_from_fork+0x1f/0x30
> > > [ 5750.949656] Code: 48 02 00 00 80 e6 80 74 29 8b 95 80 00 00 00 44 39 e2 75 3b 48 89 df ff 90 2
> > > [ 5750.972139] ---[ end trace 40065cb1764bf500 ]---
> > >
> > > which is this:
> > >
> > > WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_COMPLETE);
> >
> > That check looks wrong, since TIMED_OUT -> COMPLETE is also a valid
> > state transition. So that check should be:
> >
> > 	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_COMPLETE &&
> > 		     blk_mq_rq_state(rq) != MQ_RQ_TIMED_OUT);
>
> I guess it would be cleaner to actually do the transition, in
> blk_mq_rq_timed_out():
>
> 	case BLK_EH_HANDLED:
> 		if (blk_mq_change_rq_state(req, MQ_RQ_TIMED_OUT,
> 					MQ_RQ_COMPLETE))
> 			__blk_mq_complete_request(req);
> 		break;
>
> This works for me.

Hello Jens,

Thanks for reporting this. How about using the change below to suppress that
warning? I think it will work better than what was proposed in your last
e-mail: I'm afraid that with your change a completion that occurs while the
timeout handler is running can be ignored.

Thanks,

Bart.

diff --git a/block/blk-mq.c b/block/blk-mq.c
index bb99c03e7a34..84e55ea55baf 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -844,6 +844,7 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 
 	switch (ret) {
 	case BLK_EH_HANDLED:
+		blk_mq_change_rq_state(req, MQ_RQ_TIMED_OUT, MQ_RQ_COMPLETE);
 		__blk_mq_complete_request(req);
 		break;
 	case BLK_EH_RESET_TIMER:
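
To illustrate the concern about an ignored completion, here is a minimal
user-space sketch in C. It is not kernel code and not part of the patch: the
toy_* and handled_* names and the three-state enum are hypothetical, and the
model assumes that a completion racing with the timeout handler only flips the
state from TIMED_OUT to COMPLETE and leaves the actual completion call to the
handler. Under that assumption it contrasts the conditional variant quoted
from the earlier mail with the unconditional variant in the diff.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states, loosely named after the MQ_RQ_* values in the thread. */
enum rq_state { RQ_IN_FLIGHT, RQ_TIMED_OUT, RQ_COMPLETE };

struct toy_request {
	_Atomic int state;	/* one of enum rq_state */
	int completions;	/* how often the completion path ran */
};

static void toy_init(struct toy_request *rq, int state)
{
	atomic_init(&rq->state, state);
	rq->completions = 0;
}

/* Compare-and-swap: succeeds only if the state is still 'from'. */
static bool toy_change_state(struct toy_request *rq, int from, int to)
{
	return atomic_compare_exchange_strong(&rq->state, &from, to);
}

/* Completion path; like the WARN_ON_ONCE in the trace, it expects COMPLETE. */
static void toy_complete(struct toy_request *rq)
{
	assert(atomic_load(&rq->state) == RQ_COMPLETE);
	rq->completions++;
}

/*
 * ASSUMPTION: a completion that arrives while the timeout handler runs
 * only records TIMED_OUT -> COMPLETE and defers the work to the handler.
 */
static void toy_racing_completion(struct toy_request *rq)
{
	toy_change_state(rq, RQ_TIMED_OUT, RQ_COMPLETE);
}

/* Conditional variant: complete only if this call did the transition. */
static void handled_conditional(struct toy_request *rq)
{
	if (toy_change_state(rq, RQ_TIMED_OUT, RQ_COMPLETE))
		toy_complete(rq);
}

/* Unconditional variant: attempt the transition, then complete regardless. */
static void handled_unconditional(struct toy_request *rq)
{
	toy_change_state(rq, RQ_TIMED_OUT, RQ_COMPLETE);
	toy_complete(rq);
}
```

In this toy model, calling toy_racing_completion() and then
handled_conditional() on a request that starts in RQ_TIMED_OUT leaves
completions at 0, whereas the same sequence with handled_unconditional()
leaves it at 1. Without the racing completion both variants complete the
request exactly once.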