From patchwork Thu May 17 21:07:40 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Kelley <mhkelley58@gmail.com>
X-Patchwork-Id: 10407765
Return-Path: <linux-scsi-owner@kernel.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	CA1C160247 for <patchwork-linux-scsi@patchwork.kernel.org>;
	Thu, 17 May 2018 21:08:48 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BACBE28757
	for <patchwork-linux-scsi@patchwork.kernel.org>;
	Thu, 17 May 2018 21:08:48 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id AF5672875C; Thu, 17 May 2018 21:08:48 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, MAILING_LIST_MULTI,
	RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3F00E28757
	for <patchwork-linux-scsi@patchwork.kernel.org>;
	Thu, 17 May 2018 21:08:48 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752139AbeEQVIf (ORCPT
	<rfc822;patchwork-linux-scsi@patchwork.kernel.org>);
	Thu, 17 May 2018 17:08:35 -0400
Received: from mail-pf0-f193.google.com ([209.85.192.193]:35262 "EHLO
	mail-pf0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751610AbeEQVIe (ORCPT
	<rfc822; linux-scsi@vger.kernel.org>); Thu, 17 May 2018 17:08:34 -0400
Received: by mail-pf0-f193.google.com with SMTP id x9-v6so2679387pfm.2;
	Thu, 17 May 2018 14:08:33 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gmail.com; s=20161025;
	h=from:to:cc:subject:date:message-id:reply-to;
	bh=VCuYkr/WuzB8F5y9K5fnSEt9+m0/lbmjUJp9kU4t7Vs=;
	b=Xi7dFqjydH5tIkuoYuLFjdfpwbZ+OR+1QvmcAdvaFRXyPl5bTdAn1ZY7SiENlwRKSM
	oEBPxPDCh8ODkI7Ge0IQ0zXIYop6XICaN887Ysmi49DBpMs4gJqNdcUfhpMfcXjsjoQi
	ftHMC5xDV0vtioWn6h/iU7KiqUgXiejLZHaVAA6Nhf2ELL8kYiAUMY2vCw0mIz1aTazw
	Ugm3fhrwW/0l/uSqZqxzuTh49CUqzIJHvt7BHt6WaNt8FOVDaW27eih2FVRGLSlXEFbF
	34SPhdhy6lfWRn2Ouoaz/z+u3id3mk44JDConIHuis2ma/d7k5qgmcaQU+NAEmfYnkTk
	9dUw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
	bh=VCuYkr/WuzB8F5y9K5fnSEt9+m0/lbmjUJp9kU4t7Vs=;
	b=UYWS8Hia4KXfuS4Tjdy7UKQ6+PNjHfrYoczo8mRUzrd5aXQd+S9NpKbMP7cOfxODCs
	JnTSWJxPF6t8rCapVI8G1mJQACXumGxvTVCom/tKGDdHy0uN/ynvuMR129PnZLmFU+LS
	pK7xIPC4Yw7PRjSqeW5hfwZEPVKw2dEPvZqltTc0jOMfflOeo8S57OZwjkD6B9hZ8NhT
	GpyIDuqgepOsk+a8j4VAF1x9ryRa2KZ3Uflw0W6Dq+fp+F5s/mSOBI/WTR8eWTUKSQh3
	7Yvo6Oe2n8j3aYKfVA1CSw+5IW/x8tYRVoHTe/Io8vtOBGJw21Uu2RLSJr9evQSHLIKe
	zbQg==
X-Gm-Message-State: ALKqPwft4MjAihIuWGLyO2JiaMg9G617xzyemTTR5TKfNo1553UXuyAH
	m6M70duSdouOlOgZJiSzWIs=
X-Google-Smtp-Source: 
 AB8JxZrz0yk7oJfpJImINt3G0gHs7Fg5ShL2KKkSW/W32Kx9Fn3xOG/7tkPBmPkK1xNQFXkvzvEzCg==
X-Received: by 2002:a63:a84f:: with SMTP id
	i15-v6mr5280211pgp.367.1526591313611;
	Thu, 17 May 2018 14:08:33 -0700 (PDT)
Received: from nvmetest.corp.microsoft.com ([2001:4898:80e8:e::4da])
	by smtp.gmail.com with ESMTPSA id
	c87-v6sm11233947pfd.78.2018.05.17.14.08.32
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
	Thu, 17 May 2018 14:08:33 -0700 (PDT)
From: Michael Kelley <mhkelley58@gmail.com>
X-Google-Original-From: Michael Kelley <mikelley@microsoft.com>
To: kys@microsoft.com, sthemmin@microsoft.com,
	martin.petersen@oracle.com, longli@microsoft.com,
	jejb@linux.vnet.ibm.com, devel@linuxdriverproject.org,
	linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Cc: mikelley@microsoft.com
Subject: [PATCH 1/1] scsi: storvsc: Avoid allocating memory for temp cpumasks
Date: Thu, 17 May 2018 14:07:40 -0700
Message-Id: <1526591260-27162-1-git-send-email-mikelley@microsoft.com>
X-Mailer: git-send-email 2.7.4
Reply-To: mikelley@microsoft.com
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-scsi.vger.kernel.org>
X-Mailing-List: linux-scsi@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

The current code allocates 240 Kbytes (in typical configs) for
each synthetic SCSI controller to use as temporary cpumask variables.
Recode to iterate over alloced_cpus and test each CPU against the
NUMA node mask directly, so the temporary cpumasks are no longer
needed, and remove the memory allocation.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Stephen Hemminger <sthemmin@microsoft.com>
---

This patch is for the 4.18/scsi-queue branch and improves on commit
2217a47de42f from April 20, 2018.
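
For reviewers, the recoding pattern can be sketched outside the kernel
as below. This is only an illustration under stated assumptions, not
the driver code: toy_mask_t is a hypothetical 64-bit stand-in for
struct cpumask, and the toy_* helpers are made-up names that mimic what
for_each_cpu() plus cpumask_test_cpu() do in the patch. The point is
that counting and selecting the CPUs common to two masks does not
require storing the intersection anywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N. */
typedef uint64_t toy_mask_t;

#define TOY_NR_CPUS 64

/* Mimics cpumask_test_cpu(): nonzero if cpu is set in mask. */
static int toy_test_cpu(int cpu, toy_mask_t mask)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return (int)((mask >> cpu) & 1);
}

/* Count CPUs present in both masks without building the intersection. */
static int toy_count_common(toy_mask_t alloced, toy_mask_t node)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!toy_test_cpu(cpu, alloced))
			continue;
		if (toy_test_cpu(cpu, node))
			n++;
	}
	return n;
}

/* Return the slot-th (0-based) CPU set in both masks, or -1 if none. */
static int toy_pick_common(toy_mask_t alloced, toy_mask_t node, int slot)
{
	int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!toy_test_cpu(cpu, alloced) || !toy_test_cpu(cpu, node))
			continue;
		if (slot-- == 0)
			return cpu;
	}
	return -1;
}
```

The trade-off is a second membership test per iterated CPU in exchange
for no per-queue scratch mask.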

---
 drivers/scsi/storvsc_drv.c | 50 ++++++++++++++++++----------------------------
 1 file changed, 19 insertions(+), 31 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index d3897c4..a667119 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -474,13 +474,6 @@ struct storvsc_device {
 	 * Mask of CPUs bound to subchannels.
 	 */
 	struct cpumask alloced_cpus;
-	/*
-	 * Pre-allocated struct cpumask for each hardware queue.
-	 * struct cpumask is used by selecting out-going channels. It is a
-	 * big structure, default to 1024k bytes when CONFIG_MAXSMP=y.
-	 * Pre-allocate it to avoid allocation on the kernel stack.
-	 */
-	struct cpumask *cpumask_chns;
 	/* Used for vsc/vsp channel reset process */
 	struct storvsc_cmd_request init_request;
 	struct storvsc_cmd_request reset_request;
@@ -885,13 +878,6 @@ static int storvsc_channel_init(struct hv_device *device, bool is_fc)
 	if (stor_device->stor_chns == NULL)
 		return -ENOMEM;
 
-	stor_device->cpumask_chns = kcalloc(num_possible_cpus(),
-			sizeof(struct cpumask), GFP_KERNEL);
-	if (stor_device->cpumask_chns == NULL) {
-		kfree(stor_device->stor_chns);
-		return -ENOMEM;
-	}
-
 	stor_device->stor_chns[device->channel->target_cpu] = device->channel;
 	cpumask_set_cpu(device->channel->target_cpu,
 			&stor_device->alloced_cpus);
@@ -1252,7 +1238,6 @@ static int storvsc_dev_remove(struct hv_device *device)
 	vmbus_close(device->channel);
 
 	kfree(stor_device->stor_chns);
-	kfree(stor_device->cpumask_chns);
 	kfree(stor_device);
 	return 0;
 }
@@ -1262,7 +1247,7 @@ static struct vmbus_channel *get_og_chn(struct storvsc_device *stor_device,
 {
 	u16 slot = 0;
 	u16 hash_qnum;
-	struct cpumask *alloced_mask = &stor_device->cpumask_chns[q_num];
+	const struct cpumask *node_mask;
 	int num_channels, tgt_cpu;
 
 	if (stor_device->num_sc == 0)
@@ -1278,10 +1263,13 @@ static struct vmbus_channel *get_og_chn(struct storvsc_device *stor_device,
 	 * III. Mapping is persistent.
 	 */
 
-	cpumask_and(alloced_mask, &stor_device->alloced_cpus,
-		    cpumask_of_node(cpu_to_node(q_num)));
+	node_mask = cpumask_of_node(cpu_to_node(q_num));
 
-	num_channels = cpumask_weight(alloced_mask);
+	num_channels = 0;
+	for_each_cpu(tgt_cpu, &stor_device->alloced_cpus) {
+		if (cpumask_test_cpu(tgt_cpu, node_mask))
+			num_channels++;
+	}
 	if (num_channels == 0)
 		return stor_device->device->channel;
 
@@ -1289,7 +1277,9 @@ static struct vmbus_channel *get_og_chn(struct storvsc_device *stor_device,
 	while (hash_qnum >= num_channels)
 		hash_qnum -= num_channels;
 
-	for_each_cpu(tgt_cpu, alloced_mask) {
+	for_each_cpu(tgt_cpu, &stor_device->alloced_cpus) {
+		if (!cpumask_test_cpu(tgt_cpu, node_mask))
+			continue;
 		if (slot == hash_qnum)
 			break;
 		slot++;
@@ -1308,7 +1298,7 @@ static int storvsc_do_io(struct hv_device *device,
 	struct vstor_packet *vstor_packet;
 	struct vmbus_channel *outgoing_channel, *channel;
 	int ret = 0;
-	struct cpumask *alloced_mask;
+	const struct cpumask *node_mask;
 	int tgt_cpu;
 
 	vstor_packet = &request->vstor_packet;
@@ -1329,11 +1319,11 @@ static int storvsc_do_io(struct hv_device *device,
 			 * Ideally, we want to pick a different channel if
 			 * available on the same NUMA node.
 			 */
-			alloced_mask = &stor_device->cpumask_chns[q_num];
-			cpumask_and(alloced_mask, &stor_device->alloced_cpus,
-				    cpumask_of_node(cpu_to_node(q_num)));
-
-			for_each_cpu_wrap(tgt_cpu, alloced_mask, q_num + 1) {
+			node_mask = cpumask_of_node(cpu_to_node(q_num));
+			for_each_cpu_wrap(tgt_cpu,
+				 &stor_device->alloced_cpus, q_num + 1) {
+				if (!cpumask_test_cpu(tgt_cpu, node_mask))
+					continue;
 				if (tgt_cpu == q_num)
 					continue;
 				channel = stor_device->stor_chns[tgt_cpu];
@@ -1359,10 +1349,9 @@ static int storvsc_do_io(struct hv_device *device,
 			 * NUMA node are busy. Try to find a channel in
 			 * other NUMA nodes
 			 */
-			cpumask_andnot(alloced_mask, &stor_device->alloced_cpus,
-					cpumask_of_node(cpu_to_node(q_num)));
-
-			for_each_cpu(tgt_cpu, alloced_mask) {
+			for_each_cpu(tgt_cpu, &stor_device->alloced_cpus) {
+				if (cpumask_test_cpu(tgt_cpu, node_mask))
+					continue;
 				channel = stor_device->stor_chns[tgt_cpu];
 				if (hv_get_avail_to_write_percent(
 							&channel->outbound)
@@ -1911,7 +1900,6 @@ static int storvsc_probe(struct hv_device *device,
 
 err_out1:
 	kfree(stor_device->stor_chns);
-	kfree(stor_device->cpumask_chns);
 	kfree(stor_device);
 
 err_out0: