From patchwork Mon Jan 11 19:15:18 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Bottomley
X-Patchwork-Id: 8008681
Return-Path:
X-Original-To: patchwork-linux-scsi@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id 8AEF29F1C0
	for ; Mon, 11 Jan 2016 19:15:38 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 8253620279
	for ; Mon, 11 Jan 2016 19:15:37 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id B4F8D2024D
	for ; Mon, 11 Jan 2016 19:15:35 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934381AbcAKTPV (ORCPT ); Mon, 11 Jan 2016 14:15:21 -0500
Received: from bedivere.hansenpartnership.com ([66.63.167.143]:57480 "EHLO
	bedivere.hansenpartnership.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S934373AbcAKTPU (ORCPT );
	Mon, 11 Jan 2016 14:15:20 -0500
Received: from localhost (localhost [127.0.0.1])
	by bedivere.hansenpartnership.com (Postfix) with ESMTP id C26BF8EE1D4;
	Mon, 11 Jan 2016 11:15:19 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple;
	d=hansenpartnership.com; s=20151216; t=1452539719;
	bh=/A4pxlJTGjAiLNyrrIdALk/3tazIgwonr/65NHJJk58=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=ImVGpE0fi/a1wBeHS2J/1KueKZGyj6snbr0EJC5EoN+FlpL7s7zkNC7lfDB+2Vnq+
	SuL7P5euVv75mlYLebdf6EUwuZF7w3zc7bvY6XATZpk864IjiRxxNTpd7EuCa+HwJk
	3ZGVk0LCnFpalVG0/Q+Jo0HHCVvqcd/++HdtD6LI=
Received: from bedivere.hansenpartnership.com ([127.0.0.1])
	by localhost (bedivere.hansenpartnership.com [127.0.0.1])
	(amavisd-new, port 10024)
	with ESMTP id nz12Hhn65QrC; Mon, 11 Jan 2016 11:15:19 -0800 (PST)
Received: from [153.66.254.194] (unknown [184.11.141.41]) by
	bedivere.hansenpartnership.com (Postfix) with ESMTPSA id 4FA008EE0A4;
	Mon, 11 Jan 2016 11:15:19 -0800 (PST)
Message-ID: <1452539718.2363.12.camel@HansenPartnership.com>
Subject: Re: [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and
	simultaneously removing devices
From: James Bottomley
To: "Ewan D. Milne" , linux-kernel@vger.kernel.org,
	linux-scsi@vger.kernel.org
Cc: gregkh@linuxfoundation.org, martin.petersen@oracle.com, hare@suse.com
Date: Mon, 11 Jan 2016 11:15:18 -0800
In-Reply-To: <1452533307-30142-1-git-send-email-emilne@redhat.com>
References: <1452533307-30142-1-git-send-email-emilne@redhat.com>
X-Mailer: Evolution 3.16.5
Mime-Version: 1.0
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org
X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,T_DKIM_INVALID,UNPARSEABLE_RELAY
	autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On Mon, 2016-01-11 at 12:28 -0500, Ewan D. Milne wrote:
> From: "Ewan D. Milne"
>
> The klist traversal used by the reading of /proc/scsi/scsi is not
> interlocked against device removal.  It takes a reference on the
> containing object, but this does not prevent the device from being
> removed from the list.  Thus, we get errors and eventually panic, as
> shown in the traces below.  Fix this by keeping a klist iterator in
> the seq_file private data.
>
> The problem can be easily reproduced by repeatedly increasing
> scsi_debug's max_luns to 30 and then deleting the devices via sysfs,
> while simultaneously accessing /proc/scsi/scsi.
>
> From a patch originally developed by David Jeffery
> <djeffery@redhat.com>

OK, so it looks like this is a bug in the klist system.  When a starting
point is used, there should be a check to see if it's still active;
otherwise the whole thing is racy.
If it's fixed in klist, the fix works for everyone, not just SCSI.

How about this?  It causes the iterator to start at the beginning if the
node has been deleted.  That will produce double output during some of
your tests, but I think that's OK given that this is a rare race.

James

---

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

diff --git a/lib/klist.c b/lib/klist.c
index d74cf7a..0507fa5 100644
--- a/lib/klist.c
+++ b/lib/klist.c
@@ -282,9 +282,9 @@ void klist_iter_init_node(struct klist *k, struct klist_iter *i,
 			  struct klist_node *n)
 {
 	i->i_klist = k;
-	i->i_cur = n;
-	if (n)
-		kref_get(&n->n_ref);
+	i->i_cur = NULL;
+	if (n && kref_get_unless_zero(&n->n_ref))
+		i->i_cur = n;
 }
 EXPORT_SYMBOL_GPL(klist_iter_init_node);
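
For anyone who wants to poke at the idea outside the kernel, here is a
minimal user-space sketch of the same pattern.  It is not the kernel code:
struct node, struct iter, node_get_unless_zero(), node_put() and
iter_init_node() are hypothetical stand-ins, C11 atomics replace struct
kref, and the model assumes that a count of zero means the node is already
being torn down.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct klist_node; only the refcount is modelled. */
struct node {
	atomic_int refs;	/* assumption: 0 means the node is being removed */
};

/* Hypothetical stand-in for struct klist_iter. */
struct iter {
	struct node *cur;	/* NULL means "start from the head of the list" */
};

/*
 * Take a reference only while the count is still non-zero, in the spirit
 * of kref_get_unless_zero().  Returns true if a reference was taken.
 */
bool node_get_unless_zero(struct node *n)
{
	int old = atomic_load(&n->refs);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&n->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
bool node_put(struct node *n)
{
	int old = atomic_fetch_sub(&n->refs, 1);

	assert(old > 0);
	return old == 1;
}

/* Same shape as the patched function: fall back to NULL on a dead node. */
void iter_init_node(struct iter *i, struct node *n)
{
	i->cur = NULL;
	if (n && node_get_unless_zero(n))
		i->cur = n;
}
```

A caller has to treat cur == NULL as "walk from the head", which is where
the double output mentioned above would come from: a reader whose starting
node vanished simply restarts the traversal.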