diff mbox series

[net] xen-netback: properly sync TX responses

Message ID 980c6c3d-e10e-4459-8565-e8fbde122f00@suse.com (mailing list archive)
State Accepted
Commit 7b55984c96ffe9e236eb9c82a2196e0b1f84990d
Delegated to: Netdev Maintainers
Headers show
Series [net] xen-netback: properly sync TX responses | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1064 this patch: 1064
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 0 of 0 maintainers
netdev/build_clang success Errors and warnings before: 1081 this patch: 1081
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1081 this patch: 1081
netdev/checkpatch warning CHECK: Alignment should match open parenthesis
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-01-31--06-00 (tests: 715)

Commit Message

Jan Beulich Jan. 29, 2024, 1:03 p.m. UTC
Invoking the make_tx_response() / push_tx_responses() pair with no lock
held would be acceptable only if all such invocations happened from the
same context (NAPI instance or dealloc thread). Since this isn't the
case, and since the interface "spec" also doesn't demand that multicast
operations may only be performed with no in-flight transmits,
MCAST_{ADD,DEL} processing also needs to acquire the response lock
around the invocations.

To prevent similar mistakes going forward, "downgrade" the present
functions to private helpers of just the two remaining ones using them
directly, with no forward declarations anymore. This involves renaming
what so far was make_tx_response(), for the new function of that name
to serve the new (wrapper) purpose.

While there,
- constify the txp parameters,
- correct xenvif_idx_release()'s status parameter's type,
- rename {,_}make_tx_response()'s status parameters for consistency with
  xenvif_idx_release()'s.

Fixes: 210c34dcd8d9 ("xen-netback: add support for multicast control")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Of course this could be split into two or even more separate changes,
but I think these transformations are best done all in one go.

It remains questionable whether push_tx_responses() really needs
invoking after every single _make_tx_response().

MCAST_{ADD,DEL} are odd also from another perspective: They're supposed
to come with "dummy requests", with the comment in the public header
leaving open what that means. Netback doesn't check dummy-ness (e.g.
size being zero). Furthermore the description in the public header
doesn't really make clear that there's a restriction of one such "extra"
per dummy request. Yet the way xenvif_get_extras() works precludes
multiple ADDs or multiple DELs in a single dummy request (only the last
one would be honored afaict). While the way xenvif_tx_build_gops() works
precludes an ADD and a DEL coming together in a single dummy request
(the DEL would be ignored).

Comments

Jakub Kicinski Feb. 1, 2024, 12:23 a.m. UTC | #1
On Mon, 29 Jan 2024 14:03:08 +0100 Jan Beulich wrote:
> Invoking the make_tx_response() / push_tx_responses() pair with no lock
> held would be acceptable only if all such invocations happened from the
> same context (NAPI instance or dealloc thread). Since this isn't the
> case, and since the interface "spec" also doesn't demand that multicast
> operations may only be performed with no in-flight transmits,
> MCAST_{ADD,DEL} processing also needs to acquire the response lock
> around the invocations.
> 
> To prevent similar mistakes going forward, "downgrade" the present
> functions to private helpers of just the two remaining ones using them
> directly, with no forward declarations anymore. This involves renaming
> what so far was make_tx_response(), for the new function of that name
> to serve the new (wrapper) purpose.
> 
> While there,
> - constify the txp parameters,
> - correct xenvif_idx_release()'s status parameter's type,
> - rename {,_}make_tx_response()'s status parameters for consistency with
>   xenvif_idx_release()'s.

Hi Paul, is this one on your TODO list to review or should 
we do our best? :)
Paul Durrant Feb. 1, 2024, 3:43 p.m. UTC | #2
On 01/02/2024 00:23, Jakub Kicinski wrote:
> On Mon, 29 Jan 2024 14:03:08 +0100 Jan Beulich wrote:
>> Invoking the make_tx_response() / push_tx_responses() pair with no lock
>> held would be acceptable only if all such invocations happened from the
>> same context (NAPI instance or dealloc thread). Since this isn't the
>> case, and since the interface "spec" also doesn't demand that multicast
>> operations may only be performed with no in-flight transmits,
>> MCAST_{ADD,DEL} processing also needs to acquire the response lock
>> around the invocations.
>>
>> To prevent similar mistakes going forward, "downgrade" the present
>> functions to private helpers of just the two remaining ones using them
>> directly, with no forward declarations anymore. This involves renaming
>> what so far was make_tx_response(), for the new function of that name
>> to serve the new (wrapper) purpose.
>>
>> While there,
>> - constify the txp parameters,
>> - correct xenvif_idx_release()'s status parameter's type,
>> - rename {,_}make_tx_response()'s status parameters for consistency with
>>    xenvif_idx_release()'s.
> 
> Hi Paul, is this one on your TODO list to review or should
> we do our best? :)

Sorry for the delay. I'll take a look at this now.

   Paul
Paul Durrant Feb. 1, 2024, 4:10 p.m. UTC | #3
On 29/01/2024 13:03, Jan Beulich wrote:
> Invoking the make_tx_response() / push_tx_responses() pair with no lock
> held would be acceptable only if all such invocations happened from the
> same context (NAPI instance or dealloc thread). Since this isn't the
> case, and since the interface "spec" also doesn't demand that multicast
> operations may only be performed with no in-flight transmits,
> MCAST_{ADD,DEL} processing also needs to acquire the response lock
> around the invocations.
> 
> To prevent similar mistakes going forward, "downgrade" the present
> functions to private helpers of just the two remaining ones using them
> directly, with no forward declarations anymore. This involves renaming
> what so far was make_tx_response(), for the new function of that name
> to serve the new (wrapper) purpose.
> 
> While there,
> - constify the txp parameters,
> - correct xenvif_idx_release()'s status parameter's type,
> - rename {,_}make_tx_response()'s status parameters for consistency with
>    xenvif_idx_release()'s.
> 
> Fixes: 210c34dcd8d9 ("xen-netback: add support for multicast control")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Of course this could be split into two or even more separate changes,
> but I think these transformations are best done all in one go.
> 
> It remains questionable whether push_tx_responses() really needs
> invoking after every single _make_tx_response().
> 
> MCAST_{ADD,DEL} are odd also from another perspective: They're supposed
> to come with "dummy requests", with the comment in the public header
> leaving open what that means.

IIRC the only reference I had at the time was Solaris code... I don't 
really there being any documentation of the feature. I think that 
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/xen/io/xnb.c 
is probably similar to the code I looked at to determine what the 
Solaris frontend expected. So I think 'dummy' means 'ignored'.

> Netback doesn't check dummy-ness (e.g.
> size being zero). Furthermore the description in the public header
> doesn't really make clear that there's a restriction of one such "extra"
> per dummy request. Yet the way xenvif_get_extras() works precludes
> multiple ADDs or multiple DELs in a single dummy request (only the last
> one would be honored afaict). While the way xenvif_tx_build_gops() works
> precludes an ADD and a DEL coming together in a single dummy request
> (the DEL would be ignored).

It appears the Solaris backend never coped with multiple extra_info so 
what the 'correct' semantic would be is unclear.

Anyway...

Reviewed-by: Paul Durrant <paul@xen.org>

> 
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -104,13 +104,12 @@ bool provides_xdp_headroom = true;
>   module_param(provides_xdp_headroom, bool, 0644);
>   
>   static void xenvif_idx_release(struct xenvif_queue *queue, u16 pending_idx,
> -			       u8 status);
> +			       s8 status);
>   
>   static void make_tx_response(struct xenvif_queue *queue,
> -			     struct xen_netif_tx_request *txp,
> +			     const struct xen_netif_tx_request *txp,
>   			     unsigned int extra_count,
> -			     s8       st);
> -static void push_tx_responses(struct xenvif_queue *queue);
> +			     s8 status);
>   
>   static void xenvif_idx_unmap(struct xenvif_queue *queue, u16 pending_idx);
>   
> @@ -208,13 +207,9 @@ static void xenvif_tx_err(struct xenvif_
>   			  unsigned int extra_count, RING_IDX end)
>   {
>   	RING_IDX cons = queue->tx.req_cons;
> -	unsigned long flags;
>   
>   	do {
> -		spin_lock_irqsave(&queue->response_lock, flags);
>   		make_tx_response(queue, txp, extra_count, XEN_NETIF_RSP_ERROR);
> -		push_tx_responses(queue);
> -		spin_unlock_irqrestore(&queue->response_lock, flags);
>   		if (cons == end)
>   			break;
>   		RING_COPY_REQUEST(&queue->tx, cons++, txp);
> @@ -465,12 +460,7 @@ static void xenvif_get_requests(struct x
>   	for (shinfo->nr_frags = 0; nr_slots > 0 && shinfo->nr_frags < MAX_SKB_FRAGS;
>   	     nr_slots--) {
>   		if (unlikely(!txp->size)) {
> -			unsigned long flags;
> -
> -			spin_lock_irqsave(&queue->response_lock, flags);
>   			make_tx_response(queue, txp, 0, XEN_NETIF_RSP_OKAY);
> -			push_tx_responses(queue);
> -			spin_unlock_irqrestore(&queue->response_lock, flags);
>   			++txp;
>   			continue;
>   		}
> @@ -496,14 +486,8 @@ static void xenvif_get_requests(struct x
>   
>   		for (shinfo->nr_frags = 0; shinfo->nr_frags < nr_slots; ++txp) {
>   			if (unlikely(!txp->size)) {
> -				unsigned long flags;
> -
> -				spin_lock_irqsave(&queue->response_lock, flags);
>   				make_tx_response(queue, txp, 0,
>   						 XEN_NETIF_RSP_OKAY);
> -				push_tx_responses(queue);
> -				spin_unlock_irqrestore(&queue->response_lock,
> -						       flags);
>   				continue;
>   			}
>   
> @@ -995,7 +979,6 @@ static void xenvif_tx_build_gops(struct
>   					 (ret == 0) ?
>   					 XEN_NETIF_RSP_OKAY :
>   					 XEN_NETIF_RSP_ERROR);
> -			push_tx_responses(queue);
>   			continue;
>   		}
>   
> @@ -1007,7 +990,6 @@ static void xenvif_tx_build_gops(struct
>   
>   			make_tx_response(queue, &txreq, extra_count,
>   					 XEN_NETIF_RSP_OKAY);
> -			push_tx_responses(queue);
>   			continue;
>   		}
>   
> @@ -1433,8 +1415,35 @@ int xenvif_tx_action(struct xenvif_queue
>   	return work_done;
>   }
>   
> +static void _make_tx_response(struct xenvif_queue *queue,
> +			     const struct xen_netif_tx_request *txp,
> +			     unsigned int extra_count,
> +			     s8 status)
> +{
> +	RING_IDX i = queue->tx.rsp_prod_pvt;
> +	struct xen_netif_tx_response *resp;
> +
> +	resp = RING_GET_RESPONSE(&queue->tx, i);
> +	resp->id     = txp->id;
> +	resp->status = status;
> +
> +	while (extra_count-- != 0)
> +		RING_GET_RESPONSE(&queue->tx, ++i)->status = XEN_NETIF_RSP_NULL;
> +
> +	queue->tx.rsp_prod_pvt = ++i;
> +}
> +
> +static void push_tx_responses(struct xenvif_queue *queue)
> +{
> +	int notify;
> +
> +	RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&queue->tx, notify);
> +	if (notify)
> +		notify_remote_via_irq(queue->tx_irq);
> +}
> +
>   static void xenvif_idx_release(struct xenvif_queue *queue, u16 pending_idx,
> -			       u8 status)
> +			       s8 status)
>   {
>   	struct pending_tx_info *pending_tx_info;
>   	pending_ring_idx_t index;
> @@ -1444,8 +1453,8 @@ static void xenvif_idx_release(struct xe
>   
>   	spin_lock_irqsave(&queue->response_lock, flags);
>   
> -	make_tx_response(queue, &pending_tx_info->req,
> -			 pending_tx_info->extra_count, status);
> +	_make_tx_response(queue, &pending_tx_info->req,
> +			  pending_tx_info->extra_count, status);
>   
>   	/* Release the pending index before pusing the Tx response so
>   	 * its available before a new Tx request is pushed by the
> @@ -1459,32 +1468,19 @@ static void xenvif_idx_release(struct xe
>   	spin_unlock_irqrestore(&queue->response_lock, flags);
>   }
>   
> -
>   static void make_tx_response(struct xenvif_queue *queue,
> -			     struct xen_netif_tx_request *txp,
> +			     const struct xen_netif_tx_request *txp,
>   			     unsigned int extra_count,
> -			     s8       st)
> +			     s8 status)
>   {
> -	RING_IDX i = queue->tx.rsp_prod_pvt;
> -	struct xen_netif_tx_response *resp;
> -
> -	resp = RING_GET_RESPONSE(&queue->tx, i);
> -	resp->id     = txp->id;
> -	resp->status = st;
> -
> -	while (extra_count-- != 0)
> -		RING_GET_RESPONSE(&queue->tx, ++i)->status = XEN_NETIF_RSP_NULL;
> +	unsigned long flags;
>   
> -	queue->tx.rsp_prod_pvt = ++i;
> -}
> +	spin_lock_irqsave(&queue->response_lock, flags);
>   
> -static void push_tx_responses(struct xenvif_queue *queue)
> -{
> -	int notify;
> +	_make_tx_response(queue, txp, extra_count, status);
> +	push_tx_responses(queue);
>   
> -	RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&queue->tx, notify);
> -	if (notify)
> -		notify_remote_via_irq(queue->tx_irq);
> +	spin_unlock_irqrestore(&queue->response_lock, flags);
>   }
>   
>   static void xenvif_idx_unmap(struct xenvif_queue *queue, u16 pending_idx)
patchwork-bot+netdevbpf@kernel.org Feb. 1, 2024, 4:40 p.m. UTC | #4
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 29 Jan 2024 14:03:08 +0100 you wrote:
> Invoking the make_tx_response() / push_tx_responses() pair with no lock
> held would be acceptable only if all such invocations happened from the
> same context (NAPI instance or dealloc thread). Since this isn't the
> case, and since the interface "spec" also doesn't demand that multicast
> operations may only be performed with no in-flight transmits,
> MCAST_{ADD,DEL} processing also needs to acquire the response lock
> around the invocations.
> 
> [...]

Here is the summary with links:
  - [net] xen-netback: properly sync TX responses
    https://git.kernel.org/netdev/net/c/7b55984c96ff

You are awesome, thank you!
diff mbox series

Patch

--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -104,13 +104,12 @@  bool provides_xdp_headroom = true;
 module_param(provides_xdp_headroom, bool, 0644);
 
 static void xenvif_idx_release(struct xenvif_queue *queue, u16 pending_idx,
-			       u8 status);
+			       s8 status);
 
 static void make_tx_response(struct xenvif_queue *queue,
-			     struct xen_netif_tx_request *txp,
+			     const struct xen_netif_tx_request *txp,
 			     unsigned int extra_count,
-			     s8       st);
-static void push_tx_responses(struct xenvif_queue *queue);
+			     s8 status);
 
 static void xenvif_idx_unmap(struct xenvif_queue *queue, u16 pending_idx);
 
@@ -208,13 +207,9 @@  static void xenvif_tx_err(struct xenvif_
 			  unsigned int extra_count, RING_IDX end)
 {
 	RING_IDX cons = queue->tx.req_cons;
-	unsigned long flags;
 
 	do {
-		spin_lock_irqsave(&queue->response_lock, flags);
 		make_tx_response(queue, txp, extra_count, XEN_NETIF_RSP_ERROR);
-		push_tx_responses(queue);
-		spin_unlock_irqrestore(&queue->response_lock, flags);
 		if (cons == end)
 			break;
 		RING_COPY_REQUEST(&queue->tx, cons++, txp);
@@ -465,12 +460,7 @@  static void xenvif_get_requests(struct x
 	for (shinfo->nr_frags = 0; nr_slots > 0 && shinfo->nr_frags < MAX_SKB_FRAGS;
 	     nr_slots--) {
 		if (unlikely(!txp->size)) {
-			unsigned long flags;
-
-			spin_lock_irqsave(&queue->response_lock, flags);
 			make_tx_response(queue, txp, 0, XEN_NETIF_RSP_OKAY);
-			push_tx_responses(queue);
-			spin_unlock_irqrestore(&queue->response_lock, flags);
 			++txp;
 			continue;
 		}
@@ -496,14 +486,8 @@  static void xenvif_get_requests(struct x
 
 		for (shinfo->nr_frags = 0; shinfo->nr_frags < nr_slots; ++txp) {
 			if (unlikely(!txp->size)) {
-				unsigned long flags;
-
-				spin_lock_irqsave(&queue->response_lock, flags);
 				make_tx_response(queue, txp, 0,
 						 XEN_NETIF_RSP_OKAY);
-				push_tx_responses(queue);
-				spin_unlock_irqrestore(&queue->response_lock,
-						       flags);
 				continue;
 			}
 
@@ -995,7 +979,6 @@  static void xenvif_tx_build_gops(struct
 					 (ret == 0) ?
 					 XEN_NETIF_RSP_OKAY :
 					 XEN_NETIF_RSP_ERROR);
-			push_tx_responses(queue);
 			continue;
 		}
 
@@ -1007,7 +990,6 @@  static void xenvif_tx_build_gops(struct
 
 			make_tx_response(queue, &txreq, extra_count,
 					 XEN_NETIF_RSP_OKAY);
-			push_tx_responses(queue);
 			continue;
 		}
 
@@ -1433,8 +1415,35 @@  int xenvif_tx_action(struct xenvif_queue
 	return work_done;
 }
 
+static void _make_tx_response(struct xenvif_queue *queue,
+			     const struct xen_netif_tx_request *txp,
+			     unsigned int extra_count,
+			     s8 status)
+{
+	RING_IDX i = queue->tx.rsp_prod_pvt;
+	struct xen_netif_tx_response *resp;
+
+	resp = RING_GET_RESPONSE(&queue->tx, i);
+	resp->id     = txp->id;
+	resp->status = status;
+
+	while (extra_count-- != 0)
+		RING_GET_RESPONSE(&queue->tx, ++i)->status = XEN_NETIF_RSP_NULL;
+
+	queue->tx.rsp_prod_pvt = ++i;
+}
+
+static void push_tx_responses(struct xenvif_queue *queue)
+{
+	int notify;
+
+	RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&queue->tx, notify);
+	if (notify)
+		notify_remote_via_irq(queue->tx_irq);
+}
+
 static void xenvif_idx_release(struct xenvif_queue *queue, u16 pending_idx,
-			       u8 status)
+			       s8 status)
 {
 	struct pending_tx_info *pending_tx_info;
 	pending_ring_idx_t index;
@@ -1444,8 +1453,8 @@  static void xenvif_idx_release(struct xe
 
 	spin_lock_irqsave(&queue->response_lock, flags);
 
-	make_tx_response(queue, &pending_tx_info->req,
-			 pending_tx_info->extra_count, status);
+	_make_tx_response(queue, &pending_tx_info->req,
+			  pending_tx_info->extra_count, status);
 
 	/* Release the pending index before pusing the Tx response so
 	 * its available before a new Tx request is pushed by the
@@ -1459,32 +1468,19 @@  static void xenvif_idx_release(struct xe
 	spin_unlock_irqrestore(&queue->response_lock, flags);
 }
 
-
 static void make_tx_response(struct xenvif_queue *queue,
-			     struct xen_netif_tx_request *txp,
+			     const struct xen_netif_tx_request *txp,
 			     unsigned int extra_count,
-			     s8       st)
+			     s8 status)
 {
-	RING_IDX i = queue->tx.rsp_prod_pvt;
-	struct xen_netif_tx_response *resp;
-
-	resp = RING_GET_RESPONSE(&queue->tx, i);
-	resp->id     = txp->id;
-	resp->status = st;
-
-	while (extra_count-- != 0)
-		RING_GET_RESPONSE(&queue->tx, ++i)->status = XEN_NETIF_RSP_NULL;
+	unsigned long flags;
 
-	queue->tx.rsp_prod_pvt = ++i;
-}
+	spin_lock_irqsave(&queue->response_lock, flags);
 
-static void push_tx_responses(struct xenvif_queue *queue)
-{
-	int notify;
+	_make_tx_response(queue, txp, extra_count, status);
+	push_tx_responses(queue);
 
-	RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&queue->tx, notify);
-	if (notify)
-		notify_remote_via_irq(queue->tx_irq);
+	spin_unlock_irqrestore(&queue->response_lock, flags);
 }
 
 static void xenvif_idx_unmap(struct xenvif_queue *queue, u16 pending_idx)