From patchwork Tue Jun 30 14:36:57 2015
From: Thomas Petazzoni
To: Vinod Koul, dmaengine@vger.kernel.org
Subject: [PATCH v3 6/6] dmaengine: mv_xor: optimize performance by using a subset of the XOR channels
Date: Tue, 30 Jun 2015 16:36:57 +0200
Message-Id: <1435675017-875-7-git-send-email-thomas.petazzoni@free-electrons.com>
X-Mailer: git-send-email 2.4.5
In-Reply-To: <1435675017-875-1-git-send-email-thomas.petazzoni@free-electrons.com>
References: <1435675017-875-1-git-send-email-thomas.petazzoni@free-electrons.com>
Cc: Lior Amsalem, Andrew Lunn, Jason Cooper, Tawfik Bayouk, Nadav Haklai,
 Gregory Clement, Maxime Ripard, Thomas Petazzoni,
 linux-arm-kernel@lists.infradead.org, Sebastian Hesselbarth

Due to how async_tx behaves internally, having more XOR channels than
CPUs actually hurts performance rather than improving it, because
memcpy requests get scheduled on a different channel
than the XOR requests, but async_tx will still wait for the completion
of the memcpy requests before scheduling the XOR requests.

It is in fact more efficient to have at most one channel per CPU, which
this patch implements by limiting the number of channels per engine and
the number of engines registered, depending on the number of available
CPUs.

Marvell platforms are currently available in one-CPU, two-CPU and
four-CPU configurations:

 - in the configurations with one CPU, only one channel from one engine
   is used.

 - in the configurations with two CPUs, only one channel from each
   engine is used (there are two XOR engines).

 - in the configurations with four CPUs, both channels of both engines
   are used.

Signed-off-by: Thomas Petazzoni
---
 drivers/dma/mv_xor.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
index 6e09d59..2cdfca7 100644
--- a/drivers/dma/mv_xor.c
+++ b/drivers/dma/mv_xor.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include <linux/cpumask.h>
 #include 
 
 #include "dmaengine.h"
@@ -1137,12 +1138,15 @@ static const struct of_device_id mv_xor_dt_ids[] = {
 };
 MODULE_DEVICE_TABLE(of, mv_xor_dt_ids);
 
+static unsigned int mv_xor_engine_count;
+
 static int mv_xor_probe(struct platform_device *pdev)
 {
         const struct mbus_dram_target_info *dram;
         struct mv_xor_device *xordev;
         struct mv_xor_platform_data *pdata = dev_get_platdata(&pdev->dev);
         struct resource *res;
+        unsigned int max_engines, max_channels;
         int i, ret;
         int op_in_desc;
 
@@ -1186,6 +1190,21 @@ static int mv_xor_probe(struct platform_device *pdev)
         if (!IS_ERR(xordev->clk))
                 clk_prepare_enable(xordev->clk);
 
+        /*
+         * We don't want to have more than one channel per CPU in
+         * order for async_tx to perform well. So we limit the number
+         * of engines and channels so that we take into account this
+         * constraint. Note that we also want to use channels from
+         * separate engines when possible.
+         */
+        max_engines = num_present_cpus();
+        max_channels = min_t(unsigned int,
+                             MV_XOR_MAX_CHANNELS,
+                             DIV_ROUND_UP(num_present_cpus(), 2));
+
+        if (mv_xor_engine_count >= max_engines)
+                return 0;
+
         if (pdev->dev.of_node) {
                 struct device_node *np;
                 int i = 0;
@@ -1199,6 +1218,9 @@ static int mv_xor_probe(struct platform_device *pdev)
                         int irq;
                         op_in_desc = (int)of_id->data;
 
+                        if (i >= max_channels)
+                                continue;
+
                         dma_cap_zero(cap_mask);
                         dma_cap_set(DMA_MEMCPY, cap_mask);
                         dma_cap_set(DMA_XOR, cap_mask);
@@ -1222,7 +1244,7 @@ static int mv_xor_probe(struct platform_device *pdev)
                         i++;
                 }
         } else if (pdata && pdata->channels) {
-                for (i = 0; i < MV_XOR_MAX_CHANNELS; i++) {
+                for (i = 0; i < max_channels; i++) {
                         struct mv_xor_channel_data *cd;
                         struct mv_xor_chan *chan;
                         int irq;
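
For reference, here is a small standalone userspace sketch (not part of
the patch) of the channel-limiting arithmetic introduced in
mv_xor_probe() above. It assumes MV_XOR_MAX_CHANNELS is 2, as defined in
drivers/dma/mv_xor.h, substitutes a plain function parameter for
num_present_cpus(), and open-codes the kernel's min_t()/DIV_ROUND_UP()
helpers:

/*
 * Standalone sketch of the mv_xor channel limits for a given CPU count.
 * MV_XOR_MAX_CHANNELS = 2 is assumed (channels per XOR engine).
 */
#include <stdio.h>

#define MV_XOR_MAX_CHANNELS 2
#define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))
#define MIN(a, b)           ((a) < (b) ? (a) : (b))

static void show_limits(unsigned int cpus)
{
        unsigned int max_engines  = cpus;
        unsigned int max_channels = MIN(MV_XOR_MAX_CHANNELS,
                                        DIV_ROUND_UP(cpus, 2));

        printf("%u CPU(s): at most %u engine(s), %u channel(s) per engine\n",
               cpus, max_engines, max_channels);
}

int main(void)
{
        /* The three configurations listed in the commit message. */
        show_limits(1);        /* one engine, one channel */
        show_limits(2);        /* two engines, one channel each */
        show_limits(4);        /* limit allows 4 engines, but only 2 exist:
                                  both channels of both engines are used */
        return 0;
}

With one CPU this evaluates to one engine with one channel, with two
CPUs to one channel on each of the two engines, and with four CPUs to
two channels per engine; the four-engine limit is moot since the SoCs
only provide two XOR engines, matching the enumeration in the commit
message.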