From patchwork Mon May 22 16:04:20 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 9740945
X-Patchwork-Delegate: bhelgaas@google.com
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id B493E6034C
	for ; Mon, 22 May 2017 16:04:40 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A6BAF280B0
	for ; Mon, 22 May 2017 16:04:40 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 9B3DF28698; Mon, 22 May 2017 16:04:40 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI
	autolearn=unavailable version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2BCC6280B0
	for ; Mon, 22 May 2017 16:04:40 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1760214AbdEVQEY (ORCPT ); Mon, 22 May 2017 12:04:24 -0400
Received: from verein.lst.de ([213.95.11.211]:49043 "EHLO newverein.lst.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1757638AbdEVQEW (ORCPT ); Mon, 22 May 2017 12:04:22 -0400
Received: by newverein.lst.de (Postfix, from userid 2407)
	id 70C8F68B03; Mon, 22 May 2017 18:04:20 +0200 (CEST)
Date: Mon, 22 May 2017 18:04:20 +0200
From: Christoph Hellwig
To: Rakesh Pandit
Cc: Christoph Hellwig , linux-nvme@lists.infradead.org,
	linux-kernel@vger.kernel.org, Keith Busch , Jens Axboe ,
	Sagi Grimberg , linux-pci@vger.kernel.org
Subject: Re: [PATCH] nvme: pci: Fix NULL dereference when resetting NVMe SSD
Message-ID: <20170522160420.GA26356@lst.de>
References:
 <20170520175952.GA11258@dhcp-216.srv.tuxera.com>
 <20170521061736.GA12287@lst.de>
 <20170522153829.GA17980@dhcp-216.srv.tuxera.com>
 <20170522160217.GA26104@lst.de>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20170522160217.GA26104@lst.de>
User-Agent: Mutt/1.5.17 (2007-11-01)
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On Mon, May 22, 2017 at 06:02:17PM +0200, Christoph Hellwig wrote:
> On Mon, May 22, 2017 at 06:38:29PM +0300, Rakesh Pandit wrote:
> > Just got to use the using the test box again and you are right that
> > nvme_remove_dead_ctrl_work is getting called just before the NULL
> > pointer dereference.
> >
> > Here call trace to nvme_timeout which results in eventually call to
> > nvme_reset when it wants to reset the controller (which races with
> > ->reset_notify from PCI layer):
>
> Does the patch below fix the issue for you?

Actually, it probably should be this one, but for you the effects
are probably the same:

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b01bd5bba8e6..b61ad77dc322 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4275,11 +4275,13 @@ int pci_reset_function(struct pci_dev *dev)
 	if (rc)
 		return rc;
 
+	pci_dev_lock(dev);
 	pci_dev_save_and_disable(dev);
 
-	rc = pci_dev_reset(dev, 0);
+	rc = __pci_dev_reset(dev, 0);
 
 	pci_dev_restore(dev);
+	pci_dev_unlock(dev);
 
 	return rc;
 }
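
For anyone following along, here is a rough user-space model of the
ordering change in the diff above. It is only a sketch and not kernel
code: struct model_dev and every model_* function are hypothetical
stand-ins, and it assumes that the driver's ->reset_notify callback is
invoked from the save/disable and restore steps. The model is
single-threaded and only records whether the lock was held when each
callback ran, which is the property the patch appears to be after.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct model_dev {
	bool locked;		/* models the lock taken by pci_dev_lock() */
	int notify_locked;	/* callbacks that ran with the lock held */
	int notify_unlocked;	/* callbacks that ran without the lock */
};

static void model_lock(struct model_dev *dev)
{
	assert(!dev->locked);	/* the model lock is not recursive */
	dev->locked = true;
}

static void model_unlock(struct model_dev *dev)
{
	assert(dev->locked);
	dev->locked = false;
}

/* Stands in for the driver's ->reset_notify callback (assumption). */
static void model_reset_notify(struct model_dev *dev)
{
	if (dev->locked)
		dev->notify_locked++;
	else
		dev->notify_unlocked++;
}

/* Old ordering: only the reset step itself runs under the lock. */
static int model_reset_old(struct model_dev *dev)
{
	model_reset_notify(dev);	/* save and disable step */
	model_lock(dev);
	/* the actual reset would happen here */
	model_unlock(dev);
	model_reset_notify(dev);	/* restore step */
	return 0;
}

/* Patched ordering: the lock covers save, reset and restore. */
static int model_reset_new(struct model_dev *dev)
{
	model_lock(dev);
	model_reset_notify(dev);	/* save and disable step */
	/* the actual reset would happen here, lock already held */
	model_reset_notify(dev);	/* restore step */
	model_unlock(dev);
	return 0;
}
```

Running model_reset_old() on a zeroed struct model_dev counts both
callbacks as unlocked, while model_reset_new() counts both as locked.
In the real code that would mean anything else serialized on the same
lock cannot run in between the callbacks and the reset.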