[RFC,net-next,v2,6/6] net: xilinx: axienet: Enable adaptive IRQ coalescing with DIM

The default RX IRQ coalescing settings of one IRQ per packet can represent
a significant CPU load. However, increasing the coalescing unilaterally
can result in undesirable latency under low load. Adaptive IRQ
coalescing with DIM offers a way to adjust the coalescing settings based
on load.

This device only supports "CQE" mode [1], where each packet resets the
timer. Therefore, an interrupt is fired either when we receive
coalesce_count_rx packets or when the interface is idle for
coalesce_usec_rx. With this in mind, consider the following scenarios:

Link saturated
    Here we want to set coalesce_count_rx to a large value, in order to
    coalesce more packets and reduce CPU load. coalesce_usec_rx should
    be set to at least the time for one packet. Otherwise the link will
    be "idle" and we will get an interrupt for each packet anyway.

Bursts of packets
    Each burst should be coalesced into a single interrupt, although it
    may be prudent to reduce coalesce_count_rx for better latency.
    coalesce_usec_rx should be set to at least the time for one packet
    so bursts are coalesced. However, additional time beyond the packet
    time will just increase latency at the end of a burst.

Sporadic packets
    Due to low load, we can set coalesce_count_rx to 1 in order to
    reduce latency to the minimum. coalesce_usec_rx does not matter in
    this case.
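
The three scenarios above can be checked against a small model of the
timer behaviour. The following is a minimal sketch, not driver code: the
function name cqe_irqs and the assumption that the idle timer restarts
exactly on each packet arrival are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the interrupts raised for packets arriving at times t[0..n-1]
 * (microseconds, non-decreasing). Hypothetical model of the behaviour
 * described above: an interrupt fires when `count` packets are pending,
 * or when `usec` elapses with no new packet (each packet restarts the
 * timer).
 */
static unsigned int cqe_irqs(const double *t, size_t n,
			     unsigned int count, double usec)
{
	unsigned int irqs = 0, pending = 0;
	size_t i;

	assert(count > 0);

	for (i = 0; i < n; i++) {
		/* The idle timer expired before this packet arrived. */
		if (pending && t[i] - t[i - 1] >= usec) {
			irqs++;
			pending = 0;
		}

		/* The packet count threshold was reached. */
		if (++pending >= count) {
			irqs++;
			pending = 0;
		}
	}

	/* Any leftover packets are flushed by the idle timer. */
	if (pending)
		irqs++;

	return irqs;
}
```

For 256 back-to-back packets spaced 12.3 us apart with count = 64, this
model gives 4 interrupts with usec = 16 but 256 interrupts with
usec = 8, which is the "interrupt for each packet anyway" case.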

Based on this analysis, I expected the CQE profiles to look something
like

	usec =  0, pkts = 1   // Low load
	usec = 16, pkts = 4
	usec = 16, pkts = 16
	usec = 16, pkts = 64
	usec = 16, pkts = 256 // High load

Here usec is set to 16, a few us more than the 12.3 us packet time of a
1500-byte MTU packet at 1 Gbit/s. However, the default CQE profile is
instead

	usec =  2, pkts = 256 // Low load
	usec =  8, pkts = 128
	usec = 16, pkts =  64
	usec = 32, pkts =  64
	usec = 64, pkts =  64 // High load

I found this very surprising. The number of coalesced packets
*decreases* as load increases. But as load increases we have more
opportunities to coalesce packets without affecting latency as much.
Additionally, the profile *increases* the usec as the load increases.
But as load increases, the gaps between packets will tend to become
smaller, making it possible to *decrease* usec for better latency at the
end of a "burst".

I consider the default CQE profile unsuitable for this NIC. Therefore,
we use the first profile outlined in this commit instead.
coalesce_usec_rx is set to 16 by default, but the user can customize it.
This may be necessary if they are using jumbo frames. I think adjusting
the profile times based on the link speed/MTU would be a good
improvement for generic DIM.
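
As a rough illustration of how the timer could follow link speed and
MTU, here is a sketch of the arithmetic. It assumes 38 bytes of
per-frame wire overhead (Ethernet header, FCS, preamble/SFD and
inter-frame gap, no VLAN tag), which reproduces the 12.3 us figure
above. The names frame_time_ns and coalesce_usec_for, and the margin
policy, are hypothetical and not part of this patch.

```c
#include <assert.h>

/*
 * Assumed per-frame overhead on the wire beyond the MTU, in bytes:
 * 14 (header) + 4 (FCS) + 8 (preamble/SFD) + 12 (inter-frame gap).
 */
#define ETH_WIRE_OVERHEAD 38

/* Wire time of one full-size frame in nanoseconds; speed in Mbit/s. */
static unsigned long long frame_time_ns(unsigned int mtu,
					unsigned int speed_mbps)
{
	unsigned long long bits = 8ULL * (mtu + ETH_WIRE_OVERHEAD);

	assert(speed_mbps > 0);

	/* bits / (speed_mbps * 10^6 bit/s), expressed in ns */
	return bits * 1000 / speed_mbps;
}

/*
 * Hypothetical policy: round the frame time up to whole microseconds
 * and add a few microseconds of margin.
 */
static unsigned int coalesce_usec_for(unsigned int mtu,
				      unsigned int speed_mbps,
				      unsigned int margin_us)
{
	unsigned long long ns = frame_time_ns(mtu, speed_mbps);

	return (unsigned int)((ns + 999) / 1000) + margin_us;
}
```

With a 1500-byte MTU at 1000 Mbit/s and a 3 us margin this yields 16,
matching the default chosen here; a 9000-byte MTU with the same margin
would yield 76.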

In addition to the above profile problems, I noticed the following
additional issues with DIM while testing:

- DIM tends to "wander" at low load, since the performance gradient is
  pretty flat. If you only have 10 packets/ms anyway, then adjusting the
  coalescing settings will not affect throughput very much.
- DIM takes a long time to adjust back to low indices when load
  decreases following a period of high load. This is because it only
  re-evaluates its settings once every 64 interrupts. However, at low
  load 64 interrupts can take several seconds.
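
To put numbers on the second point, the interval between DIM decisions
is simply 64 divided by the interrupt rate. A small sketch follows; the
macro and function names are made up for illustration, and the 64 comes
from the description above.

```c
#include <assert.h>

/* From the text above: one DIM decision per 64 interrupts. */
#define DIM_EVENTS_PER_DECISION 64	/* hypothetical name */

/* Milliseconds between DIM decisions at a steady interrupt rate. */
static unsigned long dim_decision_interval_ms(unsigned long irqs_per_sec)
{
	assert(irqs_per_sec > 0);

	return DIM_EVENTS_PER_DECISION * 1000UL / irqs_per_sec;
}
```

At an assumed 20 interrupts/sec this gives 3200 ms between decisions,
while at 316 interrupts/sec it is about 202 ms.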

Finally: performance. This patch increases receive throughput with
iperf3 from 840 Mbits/sec to 938 Mbits/sec, decreases interrupts from
69920/sec to 316/sec, and decreases CPU utilization (4x Cortex-A53) from
43% to 9%.

[1] Who names this stuff?

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---
Heng, maybe you have some comments on DIM regarding the above?

Changes in v2:
- Don't take the RTNL in axienet_rx_dim_work to avoid deadlock. Instead,
  calculate a partial cr update that axienet_update_coalesce_rx can
  perform under a spin lock.
- Use READ/WRITE_ONCE when accessing/modifying rx_irqs

 drivers/net/ethernet/xilinx/Kconfig           |  1 +
 drivers/net/ethernet/xilinx/xilinx_axienet.h  | 10 ++-
 .../net/ethernet/xilinx/xilinx_axienet_main.c | 80 +++++++++++++++++--
 3 files changed, 82 insertions(+), 9 deletions(-)

Message ID	20240909235208.1331065-7-sean.anderson@linux.dev (mailing list archive)
State	RFC
Delegated to:	Netdev Maintainers
From	Sean Anderson <sean.anderson@linux.dev>
To	"David S . Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>, netdev@vger.kernel.org
Cc	linux-arm-kernel@lists.infradead.org, Michal Simek <michal.simek@amd.com>, linux-kernel@vger.kernel.org, Sean Anderson <sean.anderson@linux.dev>, Heng Qi <hengqi@linux.alibaba.com>
Date	Mon, 9 Sep 2024 19:52:08 -0400
In-Reply-To	<20240909235208.1331065-1-sean.anderson@linux.dev>
Series	net: xilinx: axienet: Enable adaptive IRQ coalescing with DIM
	[RFC,net-next,v2,0/6] net: xilinx: axienet: Enable adaptive IRQ coalescing with DIM
	[RFC,net-next,v2,1/6] net: xilinx: axienet: Add some symbolic constants for IRQ delay timer
	[RFC,net-next,v2,2/6] net: xilinx: axienet: Report an error for bad coalesce settings
	[RFC,net-next,v2,3/6] net: xilinx: axienet: Combine CR calculation
	[RFC,net-next,v2,4/6] net: xilinx: axienet: Support adjusting coalesce settings while running
	[RFC,net-next,v2,5/6] net: xilinx: axienet: Get coalesce parameters from driver state
	[RFC,net-next,v2,6/6] net: xilinx: axienet: Enable adaptive IRQ coalescing with DIM

Context	Check	Description
netdev/tree_selection	success	Clearly marked for net-next, async
netdev/apply	fail	Patch does not apply to net-next-0
