From patchwork Thu Jun 29 20:51:48 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Luis Chamberlain <mcgrof@kernel.org>
X-Patchwork-Id: 9817919
Return-Path: <linux-fsdevel-owner@kernel.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	D9B596020A for <patchwork-linux-fsdevel@patchwork.kernel.org>;
	Thu, 29 Jun 2017 20:52:54 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CAC8B26E97
	for <patchwork-linux-fsdevel@patchwork.kernel.org>;
	Thu, 29 Jun 2017 20:52:54 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id BBC63274D2; Thu, 29 Jun 2017 20:52:54 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI
	autolearn=unavailable version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5F6ED26E97
	for <patchwork-linux-fsdevel@patchwork.kernel.org>;
	Thu, 29 Jun 2017 20:52:54 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752545AbdF2Uwk (ORCPT
	<rfc822;patchwork-linux-fsdevel@patchwork.kernel.org>);
	Thu, 29 Jun 2017 16:52:40 -0400
Received: from mail.kernel.org ([198.145.29.99]:38242 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752907AbdF2Uv4 (ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
	Thu, 29 Jun 2017 16:51:56 -0400
Received: from garbanzo.do-not-panic.com (c-73-15-241-2.hsd1.ca.comcast.net
	[73.15.241.2])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128
	bits)) (No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPSA id 64A3322BD9;
	Thu, 29 Jun 2017 20:51:54 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 64A3322BD9
Authentication-Results: mail.kernel.org;
	dmarc=none (p=none dis=none) header.from=kernel.org
Authentication-Results: mail.kernel.org;
	spf=none smtp.mailfrom=mcgrof@kernel.org
From: "Luis R. Rodriguez" <mcgrof@kernel.org>
To: gregkh@linuxfoundation.org
Cc: wagi@monom.org, yi1.li@linux.intel.com, takahiro.akashi@linaro.org,
	luto@kernel.org, ebiederm@xmission.com, dmitry.torokhov@gmail.com,
	arend.vanspriel@broadcom.com, dwmw2@infradead.org,
	rjw@rjwysocki.net, atull@kernel.org, moritz.fischer@ettus.com,
	pmladek@suse.com, johannes.berg@intel.com,
	emmanuel.grumbach@intel.com, luciano.coelho@intel.com,
	kvalo@codeaurora.org, torvalds@linux-foundation.org,
	keescook@chromium.org, dhowells@redhat.com, pjones@redhat.com,
	hdegoede@redhat.com, alan@linux.intel.com, tytso@mit.edu,
	dave@stgolabs.net, mawilcox@microsoft.com, tglx@linutronix.de,
	peterz@infradead.org, mfuzzey@parkeon.com,
	jakub.kicinski@netronome.com, nbroeking@me.com,
	jewalt@lgsinnovations.com, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, "Luis R. Rodriguez" <mcgrof@kernel.org>,
	"[4.10+]" <stable@vger.kernel.org>, Ming Lei <ming.lei@redhat.com>
Subject: [PATCH v3 1/4] firmware: fix batched requests - wake all waiters
Date: Thu, 29 Jun 2017 13:51:48 -0700
Message-Id: <20170629205151.5329-2-mcgrof@kernel.org>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170629205151.5329-1-mcgrof@kernel.org>
References: <20170629205151.5329-1-mcgrof@kernel.org>
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

The firmware cache mechanism serves two purposes, the secondary purpose is
not well documented nor understood. This fixes a regression with the secondary
purpose of the firmware cache mechanism: batched requests.

The firmware cache is used for:

1) Addressing races with file lookups during the suspend/resume cycle
   by keeping firmware in memory during the suspend/resume cycle

2) Batched requests for the same file rely only on work from the first file
   lookup, which keeps the firmware in memory until the last release_firmware()
   is called

Batched requests *only* take effect if secondary requests come in prior to the
first user calling release_firmware(). The devres name used for the internal
firmware cache is used as a hint other pending requests are ongoing, the
firmware buffer data is kept in memory until the last user of the buffer
calls release_firmware(), therefore serializing requests and delaying the
release until all requests are done.

Batched requests wait for a wakup or signal so we can rely on the first file
fetch to write to the pending secondary requests. Commit 5b029624948d
("firmware: do not use fw_lock for fw_state protection") ported the firmware
API to use swait, and in doing so failed to convert complete_all() to
swake_up_all() -- it used swake_up(), loosing the ability for *some* batched
requests to take effect.

We *could* fix this by just using swake_up_all() *but* swait is now known
to be very special use case, so its best to just move away from it. So we
just go back to using completions as before commit 5b029624948d ("firmware:
do not use fw_lock for fw_state protection") given this was using
complete_all().

Without this fix it has been reported plugging in two Intel 6260 Wifi cards
on a system will end up enumerating the two devices only 50% of the time
[0]. The ported swake_up() should have actually handled the case with two
devices, however, *if more than two cards are used* the swake_up() would
not have sufficed. This change is only part of the required fixes for
batched requests. Subsequent fixes will follow.

This particular change should fix the cases where more than three requests
with the same firmware name is used, otherwise batched requests will wait for
MAX_SCHEDULE_TIMEOUT and just timeout eventually.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=195477

CC: <stable@vger.kernel.org>    [4.10+]
Cc: Ming Lei <ming.lei@redhat.com>
Fixes: 5b029624948d ("firmware: do not use fw_lock for fw_state protection")
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 drivers/base/firmware_class.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index b9f907eedbf7..f50ec6e367bd 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -30,7 +30,6 @@
 #include <linux/syscore_ops.h>
 #include <linux/reboot.h>
 #include <linux/security.h>
-#include <linux/swait.h>
 
 #include <generated/utsrelease.h>
 
@@ -112,13 +111,13 @@ static inline long firmware_loading_timeout(void)
  * state of the firmware loading.
  */
 struct fw_state {
-	struct swait_queue_head wq;
+	struct completion completion;
 	enum fw_status status;
 };
 
 static void fw_state_init(struct fw_state *fw_st)
 {
-	init_swait_queue_head(&fw_st->wq);
+	init_completion(&fw_st->completion);
 	fw_st->status = FW_STATUS_UNKNOWN;
 }
 
@@ -131,9 +130,8 @@ static int __fw_state_wait_common(struct fw_state *fw_st, long timeout)
 {
 	long ret;
 
-	ret = swait_event_interruptible_timeout(fw_st->wq,
-				__fw_state_is_done(READ_ONCE(fw_st->status)),
-				timeout);
+	ret = wait_for_completion_interruptible_timeout(&fw_st->completion,
+							timeout);
 	if (ret != 0 && fw_st->status == FW_STATUS_ABORTED)
 		return -ENOENT;
 	if (!ret)
@@ -148,7 +146,7 @@ static void __fw_state_set(struct fw_state *fw_st,
 	WRITE_ONCE(fw_st->status, status);
 
 	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED)
-		swake_up(&fw_st->wq);
+		complete_all(&fw_st->completion);
 }
 
 #define fw_state_start(fw_st)					\