diff mbox series

[RFC,net-next] pktgen: Introducing a parameter for non-shared skb testing

Message ID 20230906103508.6789-1-liangchen.linux@gmail.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Headers show
Series [RFC,net-next] pktgen: Introducing a parameter for non-shared skb testing | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1330 this patch: 1330
netdev/cc_maintainers warning 5 maintainers not CCed: keescook@chromium.org gregkh@linuxfoundation.org djwong@kernel.org jack@suse.cz Jason@zx2c4.com
netdev/build_clang success Errors and warnings before: 1353 this patch: 1353
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1353 this patch: 1353
netdev/checkpatch warning WARNING: line length of 87 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Liang Chen Sept. 6, 2023, 10:35 a.m. UTC
Currently, skbs generated by pktgen always have their reference count
incremented before transmission, leading to two issues:
  1. Only the code paths for shared skbs can be tested.
  2. Skbs can only be released by pktgen.
To enhance testing comprehensiveness, introducing the "skb_single_user"
parameter, which allows skbs with a reference count of 1 to be
transmitted. So we can test non-shared skbs and code paths where skbs
are released within the network stack.

Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
---
 net/core/pktgen.c | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

Comments

Benjamin Poirier Sept. 6, 2023, 10:32 p.m. UTC | #1
On 2023-09-06 18:35 +0800, Liang Chen wrote:
> Currently, skbs generated by pktgen always have their reference count
> incremented before transmission, leading to two issues:
>   1. Only the code paths for shared skbs can be tested.
>   2. Skbs can only be released by pktgen.
> To enhance testing comprehensiveness, introducing the "skb_single_user"
> parameter, which allows skbs with a reference count of 1 to be
> transmitted. So we can test non-shared skbs and code paths where skbs
> are released within the network stack.

If my understanding of the code is correct, pktgen operates in the same
way with parameter clone_skb = 0 and clone_skb = 1.

clone_skb = 0 is already meant to work on devices that don't support
shared skbs (see IFF_TX_SKB_SHARING check in pktgen_if_write()). Instead
of introducing a new option for your purpose, how about changing
pktgen_xmit() to send "not shared" skbs when clone_skb == 0?

Note that for devices without IFF_TX_SKB_SHARING, it would no longer be
possible to have pktgen free skbs. Is that important?
Liang Chen Sept. 7, 2023, 3:54 a.m. UTC | #2
On Thu, Sep 7, 2023 at 6:32 AM Benjamin Poirier
<benjamin.poirier@gmail.com> wrote:
>
> On 2023-09-06 18:35 +0800, Liang Chen wrote:
> > Currently, skbs generated by pktgen always have their reference count
> > incremented before transmission, leading to two issues:
> >   1. Only the code paths for shared skbs can be tested.
> >   2. Skbs can only be released by pktgen.
> > To enhance testing comprehensiveness, introducing the "skb_single_user"
> > parameter, which allows skbs with a reference count of 1 to be
> > transmitted. So we can test non-shared skbs and code paths where skbs
> > are released within the network stack.
>
> If my understanding of the code is correct, pktgen operates in the same
> way with parameter clone_skb = 0 and clone_skb = 1.
>

Yeah. pktgen seems to treat user count of 2 as not shared, as long as
the skb is not reused for burst or clone_skb. In that case the only
thing left to do with the skb is to check if user count is
decremented.

> clone_skb = 0 is already meant to work on devices that don't support
> shared skbs (see IFF_TX_SKB_SHARING check in pktgen_if_write()). Instead
> of introducing a new option for your purpose, how about changing
> pktgen_xmit() to send "not shared" skbs when clone_skb == 0?
>

Using clone_skb = 0 to enforce non-sharing makes sense to me. However,
we are a bit concerned that such a change would affect existing users
who have been assuming the current behavior.

> Note that for devices without IFF_TX_SKB_SHARING, it would no longer be
> possible to have pktgen free skbs. Is that important?

It seems that only the "count" capability depends on that. In fact,
this patch is attempting to allow skb release within the network stack
when possible. BTW, to strictly obey the IFF_TX_SKB_SHARING flag,
perhaps the "count" capability can be implemented by supplying a
destructor to skbs.
Benjamin Poirier Sept. 7, 2023, 8:19 p.m. UTC | #3
On 2023-09-07 11:54 +0800, Liang Chen wrote:
> On Thu, Sep 7, 2023 at 6:32 AM Benjamin Poirier
> <benjamin.poirier@gmail.com> wrote:
> >
> > On 2023-09-06 18:35 +0800, Liang Chen wrote:
> > > Currently, skbs generated by pktgen always have their reference count
> > > incremented before transmission, leading to two issues:
> > >   1. Only the code paths for shared skbs can be tested.
> > >   2. Skbs can only be released by pktgen.
> > > To enhance testing comprehensiveness, introducing the "skb_single_user"
> > > parameter, which allows skbs with a reference count of 1 to be
> > > transmitted. So we can test non-shared skbs and code paths where skbs
> > > are released within the network stack.
> >
> > If my understanding of the code is correct, pktgen operates in the same
> > way with parameter clone_skb = 0 and clone_skb = 1.
> >
> 
> Yeah. pktgen seems to treat user count of 2 as not shared, as long as
> the skb is not reused for burst or clone_skb. In that case the only
> thing left to do with the skb is to check if user count is
> decremented.
> 
> > clone_skb = 0 is already meant to work on devices that don't support
> > shared skbs (see IFF_TX_SKB_SHARING check in pktgen_if_write()). Instead
> > of introducing a new option for your purpose, how about changing
> > pktgen_xmit() to send "not shared" skbs when clone_skb == 0?
> >
> 
> Using clone_skb = 0 to enforce non-sharing makes sense to me. However,
> we are a bit concerned that such a change would affect existing users
> who have been assuming the current behavior.

I looked into it more and mode netif_receive only supports clone_skb = 0
and normally reuses the same skb all the time. In order to support
shared/non-shared, I think a new parameter is needed, indeed.

Here are some comments on the rest of the patch:

> ---
>  net/core/pktgen.c | 39 ++++++++++++++++++++++++++++++++++++---
>  1 file changed, 36 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> index f56b8d697014..8f48272b9d4b 100644
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -423,6 +423,7 @@ struct pktgen_dev {
>  	__u32 skb_priority;	/* skb priority field */
>  	unsigned int burst;	/* number of duplicated packets to burst */
>  	int node;               /* Memory node */
> +	int skb_single_user;	/* allow single user skb for transmission */
>  
>  #ifdef CONFIG_XFRM
>  	__u8	ipsmode;		/* IPSEC mode (config) */
> @@ -1805,6 +1806,17 @@ static ssize_t pktgen_if_write(struct file *file,
>  		return count;
>  	}
>  
> +	if (!strcmp(name, "skb_single_user")) {
> +		len = num_arg(&user_buffer[i], 1, &value);
> +		if (len < 0)
> +			return len;
> +
> +		i += len;
> +		pkt_dev->skb_single_user = value;
> +		sprintf(pg_result, "OK: skb_single_user=%u", pkt_dev->skb_single_user);
> +		return count;
> +	}
> +

Since skb_single_user is a boolean, it seems that it should be a flag
(pkt_dev->flags), not a parameter.

Since "non shared" skbs don't really have a name, I would suggest to
avoid inventing a new name and instead call the flag "SHARED" and make
it on by default. So the user would unset the flag to enable the new
behavior.

This patch should also document the new option in 
Documentation/networking/pktgen.rst

>  	sprintf(pkt_dev->result, "No such parameter \"%s\"", name);
>  	return -EINVAL;
>  }
> @@ -3460,6 +3472,14 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>  		return;
>  	}
>  
> +	/* If clone_skb, burst, or count parameters are configured,
> +	 * it implies the need for skb reuse, hence single user skb
> +	 * transmission is not allowed.
> +	 */
> +	if (pkt_dev->skb_single_user && (pkt_dev->clone_skb ||
> +					 burst > 1 || pkt_dev->count))
> +		pkt_dev->skb_single_user = 0;
> +

count > 0 does not imply reuse. That restriction can be lifted.

Instead of silently disabling the option, how about adding these checks
to pktgen_if_write()? The "clone_skb" parameter works that way, for
example.

>  	/* If no skb or clone count exhausted then get new one */
>  	if (!pkt_dev->skb || (pkt_dev->last_ok &&
>  			      ++pkt_dev->clone_count >= pkt_dev->clone_skb)) {
> @@ -3483,7 +3503,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>  	if (pkt_dev->xmit_mode == M_NETIF_RECEIVE) {
>  		skb = pkt_dev->skb;
>  		skb->protocol = eth_type_trans(skb, skb->dev);
> -		refcount_add(burst, &skb->users);
> +		if (!pkt_dev->skb_single_user)
> +			refcount_add(burst, &skb->users);
>  		local_bh_disable();
>  		do {
>  			ret = netif_receive_skb(skb);
> @@ -3491,6 +3512,12 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>  				pkt_dev->errors++;
>  			pkt_dev->sofar++;
>  			pkt_dev->seq_num++;
> +
> +			if (pkt_dev->skb_single_user) {
> +				pkt_dev->skb = NULL;
> +				break;
> +			}
> +

The assignment can be moved out of the loop, with the other 'if' in the
previous hunk.

>  			if (refcount_read(&skb->users) != burst) {
>  				/* skb was queued by rps/rfs or taps,
>  				 * so cannot reuse this skb
> @@ -3509,7 +3536,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>  		goto out; /* Skips xmit_mode M_START_XMIT */
>  	} else if (pkt_dev->xmit_mode == M_QUEUE_XMIT) {
>  		local_bh_disable();
> -		refcount_inc(&pkt_dev->skb->users);
> +		if (!pkt_dev->skb_single_user)
> +			refcount_inc(&pkt_dev->skb->users);
>  
>  		ret = dev_queue_xmit(pkt_dev->skb);
>  		switch (ret) {
> @@ -3517,6 +3545,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>  			pkt_dev->sofar++;
>  			pkt_dev->seq_num++;
>  			pkt_dev->tx_bytes += pkt_dev->last_pkt_size;
> +			if (pkt_dev->skb_single_user)
> +				pkt_dev->skb = NULL;

>  			break;
>  		case NET_XMIT_DROP:
>  		case NET_XMIT_CN:

This code can lead to a use after free of pkt_dev->skb when
dev_queue_xmit() returns ex. NET_XMIT_DROP. The skb has been freed by
the stack but pkt_dev->skb is still set.

It can be triggered like this:
ip link add dummy0 up type dummy
tc qdisc add dev dummy0 clsact
tc filter add dev dummy0 egress matchall action drop
And then run pktgen on dummy0 with "skb_single_user 1" and "xmit_mode
queue_xmit"
Liang Chen Sept. 11, 2023, 6:25 a.m. UTC | #4
On Fri, Sep 8, 2023 at 4:19 AM Benjamin Poirier
<benjamin.poirier@gmail.com> wrote:
>
> On 2023-09-07 11:54 +0800, Liang Chen wrote:
> > On Thu, Sep 7, 2023 at 6:32 AM Benjamin Poirier
> > <benjamin.poirier@gmail.com> wrote:
> > >
> > > On 2023-09-06 18:35 +0800, Liang Chen wrote:
> > > > Currently, skbs generated by pktgen always have their reference count
> > > > incremented before transmission, leading to two issues:
> > > >   1. Only the code paths for shared skbs can be tested.
> > > >   2. Skbs can only be released by pktgen.
> > > > To enhance testing comprehensiveness, introducing the "skb_single_user"
> > > > parameter, which allows skbs with a reference count of 1 to be
> > > > transmitted. So we can test non-shared skbs and code paths where skbs
> > > > are released within the network stack.
> > >
> > > If my understanding of the code is correct, pktgen operates in the same
> > > way with parameter clone_skb = 0 and clone_skb = 1.
> > >
> >
> > Yeah. pktgen seems to treat user count of 2 as not shared, as long as
> > the skb is not reused for burst or clone_skb. In that case the only
> > thing left to do with the skb is to check if user count is
> > decremented.
> >
> > > clone_skb = 0 is already meant to work on devices that don't support
> > > shared skbs (see IFF_TX_SKB_SHARING check in pktgen_if_write()). Instead
> > > of introducing a new option for your purpose, how about changing
> > > pktgen_xmit() to send "not shared" skbs when clone_skb == 0?
> > >
> >
> > Using clone_skb = 0 to enforce non-sharing makes sense to me. However,
> > we are a bit concerned that such a change would affect existing users
> > who have been assuming the current behavior.
>
> I looked into it more and mode netif_receive only supports clone_skb = 0
> and normally reuses the same skb all the time. In order to support
> shared/non-shared, I think a new parameter is needed, indeed.
>

Sure, we will introduce a new parameter and store the value in the
flag, as you suggested below. Thanks.

> Here are some comments on the rest of the patch:
>
> > ---
> >  net/core/pktgen.c | 39 ++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 36 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> > index f56b8d697014..8f48272b9d4b 100644
> > --- a/net/core/pktgen.c
> > +++ b/net/core/pktgen.c
> > @@ -423,6 +423,7 @@ struct pktgen_dev {
> >       __u32 skb_priority;     /* skb priority field */
> >       unsigned int burst;     /* number of duplicated packets to burst */
> >       int node;               /* Memory node */
> > +     int skb_single_user;    /* allow single user skb for transmission */
> >
> >  #ifdef CONFIG_XFRM
> >       __u8    ipsmode;                /* IPSEC mode (config) */
> > @@ -1805,6 +1806,17 @@ static ssize_t pktgen_if_write(struct file *file,
> >               return count;
> >       }
> >
> > +     if (!strcmp(name, "skb_single_user")) {
> > +             len = num_arg(&user_buffer[i], 1, &value);
> > +             if (len < 0)
> > +                     return len;
> > +
> > +             i += len;
> > +             pkt_dev->skb_single_user = value;
> > +             sprintf(pg_result, "OK: skb_single_user=%u", pkt_dev->skb_single_user);
> > +             return count;
> > +     }
> > +
>
> Since skb_single_user is a boolean, it seems that it should be a flag
> (pkt_dev->flags), not a parameter.
>
> Since "non shared" skbs don't really have a name, I would suggest to
> avoid inventing a new name and instead call the flag "SHARED" and make
> it on by default. So the user would unset the flag to enable the new
> behavior.
>
> This patch should also document the new option in
> Documentation/networking/pktgen.rst
>

Sure, "SHARED" sounds good to me as well, and the relevant document
will be supplied.

> >       sprintf(pkt_dev->result, "No such parameter \"%s\"", name);
> >       return -EINVAL;
> >  }
> > @@ -3460,6 +3472,14 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> >               return;
> >       }
> >
> > +     /* If clone_skb, burst, or count parameters are configured,
> > +      * it implies the need for skb reuse, hence single user skb
> > +      * transmission is not allowed.
> > +      */
> > +     if (pkt_dev->skb_single_user && (pkt_dev->clone_skb ||
> > +                                      burst > 1 || pkt_dev->count))
> > +             pkt_dev->skb_single_user = 0;
> > +
>
> count > 0 does not imply reuse. That restriction can be lifted.
>

If 'count' is set, pktgen_wait_for_skb waits for the completion of the
last sent skb by polling the user count. So it still has an implicit
dependency on the user count.

> Instead of silently disabling the option, how about adding these checks
> to pktgen_if_write()? The "clone_skb" parameter works that way, for
> example.
>

Yes, silently disabling the option can be confusing for users. We will
make this a parameter condition check instead.

> >       /* If no skb or clone count exhausted then get new one */
> >       if (!pkt_dev->skb || (pkt_dev->last_ok &&
> >                             ++pkt_dev->clone_count >= pkt_dev->clone_skb)) {
> > @@ -3483,7 +3503,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> >       if (pkt_dev->xmit_mode == M_NETIF_RECEIVE) {
> >               skb = pkt_dev->skb;
> >               skb->protocol = eth_type_trans(skb, skb->dev);
> > -             refcount_add(burst, &skb->users);
> > +             if (!pkt_dev->skb_single_user)
> > +                     refcount_add(burst, &skb->users);
> >               local_bh_disable();
> >               do {
> >                       ret = netif_receive_skb(skb);
> > @@ -3491,6 +3512,12 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> >                               pkt_dev->errors++;
> >                       pkt_dev->sofar++;
> >                       pkt_dev->seq_num++;
> > +
> > +                     if (pkt_dev->skb_single_user) {
> > +                             pkt_dev->skb = NULL;
> > +                             break;
> > +                     }
> > +
>
> The assignment can be moved out of the loop, with the other 'if' in the
> previous hunk.
>

Sure.

> >                       if (refcount_read(&skb->users) != burst) {
> >                               /* skb was queued by rps/rfs or taps,
> >                                * so cannot reuse this skb
> > @@ -3509,7 +3536,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> >               goto out; /* Skips xmit_mode M_START_XMIT */
> >       } else if (pkt_dev->xmit_mode == M_QUEUE_XMIT) {
> >               local_bh_disable();
> > -             refcount_inc(&pkt_dev->skb->users);
> > +             if (!pkt_dev->skb_single_user)
> > +                     refcount_inc(&pkt_dev->skb->users);
> >
> >               ret = dev_queue_xmit(pkt_dev->skb);
> >               switch (ret) {
> > @@ -3517,6 +3545,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> >                       pkt_dev->sofar++;
> >                       pkt_dev->seq_num++;
> >                       pkt_dev->tx_bytes += pkt_dev->last_pkt_size;
> > +                     if (pkt_dev->skb_single_user)
> > +                             pkt_dev->skb = NULL;
>
> >                       break;
> >               case NET_XMIT_DROP:
> >               case NET_XMIT_CN:
>
> This code can lead to a use after free of pkt_dev->skb when
> dev_queue_xmit() returns ex. NET_XMIT_DROP. The skb has been freed by
> the stack but pkt_dev->skb is still set.
>
> It can be triggered like this:
> ip link add dummy0 up type dummy
> tc qdisc add dev dummy0 clsact
> tc filter add dev dummy0 egress matchall action drop
> And then run pktgen on dummy0 with "skb_single_user 1" and "xmit_mode
> queue_xmit"

Sure. Thanks for pointing out this issue! It will be fixed in v2.


Thanks,
Liang
Liang Chen Sept. 13, 2023, 4:51 a.m. UTC | #5
On Mon, Sep 11, 2023 at 2:25 PM Liang Chen <liangchen.linux@gmail.com> wrote:
>
> On Fri, Sep 8, 2023 at 4:19 AM Benjamin Poirier
> <benjamin.poirier@gmail.com> wrote:
> >
> > On 2023-09-07 11:54 +0800, Liang Chen wrote:
> > > On Thu, Sep 7, 2023 at 6:32 AM Benjamin Poirier
> > > <benjamin.poirier@gmail.com> wrote:
> > > >
> > > > On 2023-09-06 18:35 +0800, Liang Chen wrote:
> > > > > Currently, skbs generated by pktgen always have their reference count
> > > > > incremented before transmission, leading to two issues:
> > > > >   1. Only the code paths for shared skbs can be tested.
> > > > >   2. Skbs can only be released by pktgen.
> > > > > To enhance testing comprehensiveness, introducing the "skb_single_user"
> > > > > parameter, which allows skbs with a reference count of 1 to be
> > > > > transmitted. So we can test non-shared skbs and code paths where skbs
> > > > > are released within the network stack.
> > > >
> > > > If my understanding of the code is correct, pktgen operates in the same
> > > > way with parameter clone_skb = 0 and clone_skb = 1.
> > > >
> > >
> > > Yeah. pktgen seems to treat user count of 2 as not shared, as long as
> > > the skb is not reused for burst or clone_skb. In that case the only
> > > thing left to do with the skb is to check if user count is
> > > decremented.
> > >
> > > > clone_skb = 0 is already meant to work on devices that don't support
> > > > shared skbs (see IFF_TX_SKB_SHARING check in pktgen_if_write()). Instead
> > > > of introducing a new option for your purpose, how about changing
> > > > pktgen_xmit() to send "not shared" skbs when clone_skb == 0?
> > > >
> > >
> > > Using clone_skb = 0 to enforce non-sharing makes sense to me. However,
> > > we are a bit concerned that such a change would affect existing users
> > > who have been assuming the current behavior.
> >
> > I looked into it more and mode netif_receive only supports clone_skb = 0
> > and normally reuses the same skb all the time. In order to support
> > shared/non-shared, I think a new parameter is needed, indeed.
> >
>
> Sure, we will introduce a new parameter and store the value in the
> flag, as you suggested below. Thanks.
>
> > Here are some comments on the rest of the patch:
> >
> > > ---
> > >  net/core/pktgen.c | 39 ++++++++++++++++++++++++++++++++++++---
> > >  1 file changed, 36 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> > > index f56b8d697014..8f48272b9d4b 100644
> > > --- a/net/core/pktgen.c
> > > +++ b/net/core/pktgen.c
> > > @@ -423,6 +423,7 @@ struct pktgen_dev {
> > >       __u32 skb_priority;     /* skb priority field */
> > >       unsigned int burst;     /* number of duplicated packets to burst */
> > >       int node;               /* Memory node */
> > > +     int skb_single_user;    /* allow single user skb for transmission */
> > >
> > >  #ifdef CONFIG_XFRM
> > >       __u8    ipsmode;                /* IPSEC mode (config) */
> > > @@ -1805,6 +1806,17 @@ static ssize_t pktgen_if_write(struct file *file,
> > >               return count;
> > >       }
> > >
> > > +     if (!strcmp(name, "skb_single_user")) {
> > > +             len = num_arg(&user_buffer[i], 1, &value);
> > > +             if (len < 0)
> > > +                     return len;
> > > +
> > > +             i += len;
> > > +             pkt_dev->skb_single_user = value;
> > > +             sprintf(pg_result, "OK: skb_single_user=%u", pkt_dev->skb_single_user);
> > > +             return count;
> > > +     }
> > > +
> >
> > Since skb_single_user is a boolean, it seems that it should be a flag
> > (pkt_dev->flags), not a parameter.
> >
> > Since "non shared" skbs don't really have a name, I would suggest to
> > avoid inventing a new name and instead call the flag "SHARED" and make
> > it on by default. So the user would unset the flag to enable the new
> > behavior.
> >
> > This patch should also document the new option in
> > Documentation/networking/pktgen.rst
> >
>
> Sure, "SHARED" sounds good to me as well, and the relevant document
> will be supplied.
>
> > >       sprintf(pkt_dev->result, "No such parameter \"%s\"", name);
> > >       return -EINVAL;
> > >  }
> > > @@ -3460,6 +3472,14 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> > >               return;
> > >       }
> > >
> > > +     /* If clone_skb, burst, or count parameters are configured,
> > > +      * it implies the need for skb reuse, hence single user skb
> > > +      * transmission is not allowed.
> > > +      */
> > > +     if (pkt_dev->skb_single_user && (pkt_dev->clone_skb ||
> > > +                                      burst > 1 || pkt_dev->count))
> > > +             pkt_dev->skb_single_user = 0;
> > > +
> >
> > count > 0 does not imply reuse. That restriction can be lifted.
> >
>
> If 'count' is set, pktgen_wait_for_skb waits for the completion of the
> last sent skb by polling the user count. So it still has an implicit
> dependency on the user count.
>
> > Instead of silently disabling the option, how about adding these checks
> > to pktgen_if_write()? The "clone_skb" parameter works that way, for
> > example.
> >
>
> Yes, silently disabling the option can be confusing for users. We will
> make this a parameter condition check instead.
>
> > >       /* If no skb or clone count exhausted then get new one */
> > >       if (!pkt_dev->skb || (pkt_dev->last_ok &&
> > >                             ++pkt_dev->clone_count >= pkt_dev->clone_skb)) {
> > > @@ -3483,7 +3503,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> > >       if (pkt_dev->xmit_mode == M_NETIF_RECEIVE) {
> > >               skb = pkt_dev->skb;
> > >               skb->protocol = eth_type_trans(skb, skb->dev);
> > > -             refcount_add(burst, &skb->users);
> > > +             if (!pkt_dev->skb_single_user)
> > > +                     refcount_add(burst, &skb->users);
> > >               local_bh_disable();
> > >               do {
> > >                       ret = netif_receive_skb(skb);
> > > @@ -3491,6 +3512,12 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> > >                               pkt_dev->errors++;
> > >                       pkt_dev->sofar++;
> > >                       pkt_dev->seq_num++;
> > > +
> > > +                     if (pkt_dev->skb_single_user) {
> > > +                             pkt_dev->skb = NULL;
> > > +                             break;
> > > +                     }
> > > +
> >
> > The assignment can be moved out of the loop, with the other 'if' in the
> > previous hunk.
> >
>
> Sure.

The assignment needs to occur after netif_receive_skb. So moving the
assignment above the loop would require netif_receive_skb and its
associated statistics being duplicated as well. Therefore, we mark the
condition 'unlikely' to optimize the case when 'burst' is set and it
loops.

>
> > >                       if (refcount_read(&skb->users) != burst) {
> > >                               /* skb was queued by rps/rfs or taps,
> > >                                * so cannot reuse this skb
> > > @@ -3509,7 +3536,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> > >               goto out; /* Skips xmit_mode M_START_XMIT */
> > >       } else if (pkt_dev->xmit_mode == M_QUEUE_XMIT) {
> > >               local_bh_disable();
> > > -             refcount_inc(&pkt_dev->skb->users);
> > > +             if (!pkt_dev->skb_single_user)
> > > +                     refcount_inc(&pkt_dev->skb->users);
> > >
> > >               ret = dev_queue_xmit(pkt_dev->skb);
> > >               switch (ret) {
> > > @@ -3517,6 +3545,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
> > >                       pkt_dev->sofar++;
> > >                       pkt_dev->seq_num++;
> > >                       pkt_dev->tx_bytes += pkt_dev->last_pkt_size;
> > > +                     if (pkt_dev->skb_single_user)
> > > +                             pkt_dev->skb = NULL;
> >
> > >                       break;
> > >               case NET_XMIT_DROP:
> > >               case NET_XMIT_CN:
> >
> > This code can lead to a use after free of pkt_dev->skb when
> > dev_queue_xmit() returns ex. NET_XMIT_DROP. The skb has been freed by
> > the stack but pkt_dev->skb is still set.
> >
> > It can be triggered like this:
> > ip link add dummy0 up type dummy
> > tc qdisc add dev dummy0 clsact
> > tc filter add dev dummy0 egress matchall action drop
> > And then run pktgen on dummy0 with "skb_single_user 1" and "xmit_mode
> > queue_xmit"
>
> Sure. Thanks for pointing out this issue! It will be fixed in v2.
>
>
> Thanks,
> Liang
diff mbox series

Patch

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index f56b8d697014..8f48272b9d4b 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -423,6 +423,7 @@  struct pktgen_dev {
 	__u32 skb_priority;	/* skb priority field */
 	unsigned int burst;	/* number of duplicated packets to burst */
 	int node;               /* Memory node */
+	int skb_single_user;	/* allow single user skb for transmission */
 
 #ifdef CONFIG_XFRM
 	__u8	ipsmode;		/* IPSEC mode (config) */
@@ -1805,6 +1806,17 @@  static ssize_t pktgen_if_write(struct file *file,
 		return count;
 	}
 
+	if (!strcmp(name, "skb_single_user")) {
+		len = num_arg(&user_buffer[i], 1, &value);
+		if (len < 0)
+			return len;
+
+		i += len;
+		pkt_dev->skb_single_user = value;
+		sprintf(pg_result, "OK: skb_single_user=%u", pkt_dev->skb_single_user);
+		return count;
+	}
+
 	sprintf(pkt_dev->result, "No such parameter \"%s\"", name);
 	return -EINVAL;
 }
@@ -3460,6 +3472,14 @@  static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		return;
 	}
 
+	/* If clone_skb, burst, or count parameters are configured,
+	 * it implies the need for skb reuse, hence single user skb
+	 * transmission is not allowed.
+	 */
+	if (pkt_dev->skb_single_user && (pkt_dev->clone_skb ||
+					 burst > 1 || pkt_dev->count))
+		pkt_dev->skb_single_user = 0;
+
 	/* If no skb or clone count exhausted then get new one */
 	if (!pkt_dev->skb || (pkt_dev->last_ok &&
 			      ++pkt_dev->clone_count >= pkt_dev->clone_skb)) {
@@ -3483,7 +3503,8 @@  static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 	if (pkt_dev->xmit_mode == M_NETIF_RECEIVE) {
 		skb = pkt_dev->skb;
 		skb->protocol = eth_type_trans(skb, skb->dev);
-		refcount_add(burst, &skb->users);
+		if (!pkt_dev->skb_single_user)
+			refcount_add(burst, &skb->users);
 		local_bh_disable();
 		do {
 			ret = netif_receive_skb(skb);
@@ -3491,6 +3512,12 @@  static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 				pkt_dev->errors++;
 			pkt_dev->sofar++;
 			pkt_dev->seq_num++;
+
+			if (pkt_dev->skb_single_user) {
+				pkt_dev->skb = NULL;
+				break;
+			}
+
 			if (refcount_read(&skb->users) != burst) {
 				/* skb was queued by rps/rfs or taps,
 				 * so cannot reuse this skb
@@ -3509,7 +3536,8 @@  static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		goto out; /* Skips xmit_mode M_START_XMIT */
 	} else if (pkt_dev->xmit_mode == M_QUEUE_XMIT) {
 		local_bh_disable();
-		refcount_inc(&pkt_dev->skb->users);
+		if (!pkt_dev->skb_single_user)
+			refcount_inc(&pkt_dev->skb->users);
 
 		ret = dev_queue_xmit(pkt_dev->skb);
 		switch (ret) {
@@ -3517,6 +3545,8 @@  static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 			pkt_dev->sofar++;
 			pkt_dev->seq_num++;
 			pkt_dev->tx_bytes += pkt_dev->last_pkt_size;
+			if (pkt_dev->skb_single_user)
+				pkt_dev->skb = NULL;
 			break;
 		case NET_XMIT_DROP:
 		case NET_XMIT_CN:
@@ -3549,7 +3579,8 @@  static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		pkt_dev->last_ok = 0;
 		goto unlock;
 	}
-	refcount_add(burst, &pkt_dev->skb->users);
+	if (!pkt_dev->skb_single_user)
+		refcount_add(burst, &pkt_dev->skb->users);
 
 xmit_more:
 	ret = netdev_start_xmit(pkt_dev->skb, odev, txq, --burst > 0);
@@ -3560,6 +3591,8 @@  static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		pkt_dev->sofar++;
 		pkt_dev->seq_num++;
 		pkt_dev->tx_bytes += pkt_dev->last_pkt_size;
+		if (pkt_dev->skb_single_user)
+			pkt_dev->skb = NULL;
 		if (burst > 0 && !netif_xmit_frozen_or_drv_stopped(txq))
 			goto xmit_more;
 		break;