[net,v2] virtio-net: fix possible dim status unrecoverable

Message ID	1711434338-64848-1-git-send-email-hengqi@linux.alibaba.com (mailing list archive)
State	Changes Requested
Delegated to:	Netdev Maintainers
Headers
From: Heng Qi <hengqi@linux.alibaba.com>
To: netdev@vger.kernel.org,
	virtualization@lists.linux.dev
Cc: Jason Wang <jasowang@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Subject: [PATCH net v2] virtio-net: fix possible dim status unrecoverable
Date: Tue, 26 Mar 2024 14:25:38 +0800
Message-Id: <1711434338-64848-1-git-send-email-hengqi@linux.alibaba.com>
Precedence: bulk
Series	[net,v2] virtio-net: fix possible dim status unrecoverable

Context	Check	Description
netdev/series_format	success	Single patches do not need cover letters
netdev/tree_selection	success	Clearly marked for net
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag present in non-next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 944 this patch: 944
netdev/build_tools	success	No tools touched, skip
netdev/cc_maintainers	success	CCed 9 of 9 maintainers
netdev/build_clang	success	Errors and warnings before: 955 this patch: 955
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	Fixes tag looks correct
netdev/build_allmodconfig_warn	success	Errors and warnings before: 955 this patch: 955
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0
netdev/contest	success	net-next-2024-03-27--15-00 (tests: 952)

Commit Message

Heng Qi March 26, 2024, 6:25 a.m. UTC

When the dim worker is scheduled, if it fails to acquire the lock,
dim may not be able to return to the working state later.

For example, the following single queue scenario:
  1. The dim worker of rxq0 is scheduled, and the dim status is
     changed to DIM_APPLY_NEW_PROFILE;
  2. The ethtool command is holding rtnl lock;
  3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
     to acquire the lock and exits;

Then, even if net_dim is invoked again, it cannot work because the
state is not restored to DIM_START_MEASURE.
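
The stuck state described above can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions, not the kernel code: the state names match the commit message, but model_net_dim(), model_dim_work(), model_scenario() and the work_pending flag are hypothetical stand-ins for net_dim(), virtnet_rx_dim_work() and the workqueue, and the model assumes every measurement decides on a new profile.

```c
#include <stdbool.h>

/* Simplified userspace model; only the state names follow the commit message. */
enum dim_state {
	DIM_START_MEASURE,
	DIM_MEASURE_IN_PROGRESS,
	DIM_APPLY_NEW_PROFILE,
};

struct model_dim {
	enum dim_state state;
	bool work_pending;	/* stands in for a queued work item */
};

/* Hypothetical stand-in for net_dim(), called from the NAPI poll path. */
static void model_net_dim(struct model_dim *dim)
{
	switch (dim->state) {
	case DIM_START_MEASURE:
		dim->state = DIM_MEASURE_IN_PROGRESS;
		break;
	case DIM_MEASURE_IN_PROGRESS:
		/* Assumption: every measurement picks a new profile. */
		dim->state = DIM_APPLY_NEW_PROFILE;
		dim->work_pending = true;
		break;
	case DIM_APPLY_NEW_PROFILE:
		/* Waiting for the worker; no new measurement is started. */
		break;
	}
}

/*
 * Hypothetical stand-in for the dim worker. lock_free models the result of
 * rtnl_trylock(); requeue_on_fail models the behaviour added by the patch.
 */
static void model_dim_work(struct model_dim *dim, bool lock_free,
			   bool requeue_on_fail)
{
	dim->work_pending = false;
	if (!lock_free) {
		if (requeue_on_fail)
			dim->work_pending = true;
		return;	/* state stays DIM_APPLY_NEW_PROFILE */
	}
	/* Profile applied; measurements may start again. */
	dim->state = DIM_START_MEASURE;
}

/* Replays steps 1-3 above and returns the state after one more poll. */
static enum dim_state model_scenario(bool requeue_on_fail)
{
	struct model_dim dim = { .state = DIM_START_MEASURE };

	model_net_dim(&dim);	/* DIM_MEASURE_IN_PROGRESS */
	model_net_dim(&dim);	/* DIM_APPLY_NEW_PROFILE, work queued */

	/* The worker runs while the lock is held elsewhere. */
	model_dim_work(&dim, false, requeue_on_fail);

	/* The lock is released; only a still-pending work item runs again. */
	if (dim.work_pending)
		model_dim_work(&dim, true, requeue_on_fail);

	model_net_dim(&dim);	/* a later NAPI poll */
	return dim.state;
}
```

In this model, model_scenario(false) ends in DIM_APPLY_NEW_PROFILE and stays there no matter how often model_net_dim() is called, while model_scenario(true) returns to measuring.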

Patch has been tested on a VM with 16 NICs, 128 queues per NIC
(2kq total):
With dim enabled on all queues, there are many opportunities for
contention for RTNL lock, and this patch introduces no visible hotspots.
The dim performance is also stable.

Fixes: 6208799553a8 ("virtio-net: support rx netdim")
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
v1->v2:
  - Update commit log. No functional changes.

 drivers/net/virtio_net.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Paolo Abeni March 28, 2024, 10:34 a.m. UTC | #1

On Tue, 2024-03-26 at 14:25 +0800, Heng Qi wrote:
> When the dim worker is scheduled, if it fails to acquire the lock,
> dim may not be able to return to the working state later.
> 
> For example, the following single queue scenario:
>   1. The dim worker of rxq0 is scheduled, and the dim status is
>      changed to DIM_APPLY_NEW_PROFILE;
>   2. The ethtool command is holding rtnl lock;
>   3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
>      to acquire the lock and exits;
> 
> Then, even if net_dim is invoked again, it cannot work because the
> state is not restored to DIM_START_MEASURE.
> 
> Patch has been tested on a VM with 16 NICs, 128 queues per NIC
> (2kq total):
> With dim enabled on all queues, there are many opportunities for
> contention for RTNL lock, and this patch introduces no visible hotspots.
> The dim performance is also stable.
> 
> Fixes: 6208799553a8 ("virtio-net: support rx netdim")
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> Acked-by: Jason Wang <jasowang@redhat.com>
> ---
> v1->v2:
>   - Update commit log. No functional changes.
> 
>  drivers/net/virtio_net.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index c22d111..0ebe322 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3563,8 +3563,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
>  	struct dim_cq_moder update_moder;
>  	int i, qnum, err;
>  
> -	if (!rtnl_trylock())
> +	if (!rtnl_trylock()) {
> +		schedule_work(&dim->work);
>  		return;

I'm really scared by this change. VMs are (increasingly) used to run
container orchestration, which in turn puts a lot of pressure on the
RTNL lock. Any rtnl_trylock() + reschedule may hang for a very long time.
Addressing this kind of issue later becomes _extremely_ painful, see:

https://lore.kernel.org/netdev/20231018154804.420823-1-atenart@kernel.org/

I really think a different solution is needed. What about moving
virtnet_send_command() under protection of a new mutex?

I understand it will complicate future hardening work around cvq, but
really rtnl_trylock()/<spin/retry> is bad for the whole system.

Cheers,

Paolo
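
The alternative raised in this comment, a dedicated lock around command submission instead of rtnl_trylock() plus reschedule, can be sketched in userspace with pthreads. This is a rough illustration under stated assumptions, not a proposal for the driver: model_cvq_lock, model_send_command(), model_dim_worker() and model_run_workers() are hypothetical names, and a pthread mutex stands in for whatever kernel lock would actually be chosen.

```c
#include <pthread.h>

#define MODEL_MAX_WORKERS	16
#define MODEL_CMDS_PER_WORKER	1000

/* Hypothetical dedicated lock that serializes only command submission. */
static pthread_mutex_t model_cvq_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_commands_sent;

/* Stand-in for sending one control command under the dedicated lock. */
static void model_send_command(void)
{
	pthread_mutex_lock(&model_cvq_lock);	/* sleeps instead of retrying */
	model_commands_sent++;
	pthread_mutex_unlock(&model_cvq_lock);
}

/* Stand-in for a per-queue worker: no trylock, no rescheduling. */
static void *model_dim_worker(void *arg)
{
	int i;

	(void)arg;
	for (i = 0; i < MODEL_CMDS_PER_WORKER; i++)
		model_send_command();
	return NULL;
}

/* Runs up to MODEL_MAX_WORKERS workers and returns the commands sent. */
static int model_run_workers(int nworkers)
{
	pthread_t tids[MODEL_MAX_WORKERS];
	int i, started = 0;

	if (nworkers > MODEL_MAX_WORKERS)
		nworkers = MODEL_MAX_WORKERS;

	model_commands_sent = 0;
	for (i = 0; i < nworkers; i++) {
		if (pthread_create(&tids[started], NULL,
				   model_dim_worker, NULL) == 0)
			started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	return model_commands_sent;
}
```

Each worker blocks only on the narrow lock for the duration of one command, so a long-held global lock elsewhere neither drops the work nor causes a requeue loop. Older toolchains may need -pthread to build it.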

Heng Qi March 29, 2024, 2:19 a.m. UTC | #2

On 2024/3/28 6:34 PM, Paolo Abeni wrote:
> On Tue, 2024-03-26 at 14:25 +0800, Heng Qi wrote:
>> When the dim worker is scheduled, if it fails to acquire the lock,
>> dim may not be able to return to the working state later.
>>
>> For example, the following single queue scenario:
>>    1. The dim worker of rxq0 is scheduled, and the dim status is
>>       changed to DIM_APPLY_NEW_PROFILE;
>>    2. The ethtool command is holding rtnl lock;
>>    3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
>>       to acquire the lock and exits;
>>
>> Then, even if net_dim is invoked again, it cannot work because the
>> state is not restored to DIM_START_MEASURE.
>>
>> Patch has been tested on a VM with 16 NICs, 128 queues per NIC
>> (2kq total):
>> With dim enabled on all queues, there are many opportunities for
>> contention for RTNL lock, and this patch introduces no visible hotspots.
>> The dim performance is also stable.
>>
>> Fixes: 6208799553a8 ("virtio-net: support rx netdim")
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> Acked-by: Jason Wang <jasowang@redhat.com>
>> ---
>> v1->v2:
>>    - Update commit log. No functional changes.
>>
>>   drivers/net/virtio_net.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index c22d111..0ebe322 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -3563,8 +3563,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
>>   	struct dim_cq_moder update_moder;
>>   	int i, qnum, err;
>>   
>> -	if (!rtnl_trylock())
>> +	if (!rtnl_trylock()) {
>> +		schedule_work(&dim->work);
>>   		return;
> I'm really scared by this change. VMs are (increasingly) used to run
> container orchestration, which in turn puts a lot of pressure on the
> RTNL lock. Any rtnl_trylock() + reschedule may hang for a very long time.
> Addressing this kind of issue later becomes _extremely_ painful, see:
>
> https://lore.kernel.org/netdev/20231018154804.420823-1-atenart@kernel.org/
>
> I really think a different solution is needed. What about moving
> virtnet_send_command() under protection of a new mutex?

Daniel has done additional work on this:

https://lore.kernel.org/all/20240328044715.266641-1-danielj@nvidia.com/

That work uses a spin lock to protect ctrlq access, so the rtnl lock can
be removed from rx_dim_work, which makes this problem go away.

Thanks,
Heng

>
> I understand it will complicate future hardening work around cvq, but
> really rtnl_trylock()/<spin/retry> is bad for the whole system.
>
> Cheers,
>
> Paolo

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c22d111..0ebe322 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3563,8 +3563,10 @@  static void virtnet_rx_dim_work(struct work_struct *work)
 	struct dim_cq_moder update_moder;
 	int i, qnum, err;
 
-	if (!rtnl_trylock())
+	if (!rtnl_trylock()) {
+		schedule_work(&dim->work);
 		return;
+	}
 
 	/* Each rxq's work is queued by "net_dim()->schedule_work()"
 	 * in response to NAPI traffic changes. Note that dim->profile_ix
