[net] net: psample: Fix the netlink skb length

Message ID 20210203031028.171318-1-cmi@nvidia.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series [net] net: psample: Fix the netlink skb length

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net
netdev/subject_prefix           success  Link
netdev/cc_maintainers           warning  5 maintainers not CCed: jhs@mojatatu.com jiri@mellanox.com davem@davemloft.net simon.horman@netronome.com kuba@kernel.org
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               warning  WARNING: line length of 87 exceeds 80 columns; WARNING: line length of 91 exceeds 80 columns
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/header_inline            success  Link
netdev/stable                   success  Stable not CCed

Commit Message

Chris Mi Feb. 3, 2021, 3:10 a.m. UTC
Currently, the netlink skb length only includes metadata and data
length. It doesn't include the psample generic netlink header length.
Fix it by adding it.

Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
CC: Yotam Gigi <yotam.gi@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Chris Mi <cmi@nvidia.com>
---
 net/psample/psample.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

Jakub Kicinski Feb. 4, 2021, 2:21 a.m. UTC | #1
On Wed,  3 Feb 2021 11:10:28 +0800 Chris Mi wrote:
> Currently, the netlink skb length only includes metadata and data
> length. It doesn't include the psample generic netlink header length.

But what's the bug? Did you see oversized messages on the socket? Did
one of the nla_put() fail?

> Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
> CC: Yotam Gigi <yotam.gi@gmail.com>
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
> Signed-off-by: Chris Mi <cmi@nvidia.com>
> ---
>  net/psample/psample.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/net/psample/psample.c b/net/psample/psample.c
> index 33e238c965bd..807d75f5a40f 100644
> --- a/net/psample/psample.c
> +++ b/net/psample/psample.c
> @@ -363,6 +363,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>  	struct ip_tunnel_info *tun_info;
>  #endif
>  	struct sk_buff *nl_skb;
> +	int header_len;
>  	int data_len;
>  	int meta_len;
>  	void *data;
> @@ -381,12 +382,13 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>  		meta_len += psample_tunnel_meta_len(tun_info);
>  #endif
>  
> +	/* psample generic netlink header size */
> +	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);

GENL_HDRLEN is already included by genlmsg_new() and fam->hdrsize is 0
/ uninitialized for psample_nl_family. What am I missing? Ido?
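
For readers following the size arithmetic, here is a minimal userspace sketch of what genlmsg_new() ends up requesting, with and without the extra header_len from the patch. The sk_* names are hypothetical helpers, not kernel API. They mirror nlmsg_total_size(), genlmsg_total_size() and genlmsg_new() as they appear, to my reading, in include/net/netlink.h and include/net/genetlink.h, and they assume a 16-byte struct nlmsghdr, a 4-byte struct genlmsghdr and 4-byte netlink alignment.

```c
#include <stddef.h>

/* Assumed sizes (hypothetical constants mirroring the kernel macros). */
#define SK_NLMSG_HDRLEN 16u /* NLMSG_ALIGN(sizeof(struct nlmsghdr)) */
#define SK_GENL_HDRLEN   4u /* NLMSG_ALIGN(sizeof(struct genlmsghdr)) */

/* Round up to the 4-byte netlink alignment. */
static inline size_t sk_align4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Mirrors nlmsg_total_size(): netlink header plus payload, aligned. */
static inline size_t sk_nlmsg_total_size(size_t payload)
{
	return sk_align4(SK_NLMSG_HDRLEN + payload);
}

/* Mirrors genlmsg_total_size(): genetlink header plus payload, aligned. */
static inline size_t sk_genlmsg_total_size(size_t payload)
{
	return sk_align4(SK_GENL_HDRLEN + payload);
}

/* Bytes that genlmsg_new(payload, ...) asks the skb allocator for:
 * both the netlink and the genetlink header are already included.
 */
static inline size_t sk_genlmsg_new_size(size_t payload)
{
	return sk_nlmsg_total_size(sk_genlmsg_total_size(payload));
}

/* Bytes requested by the patched call, which adds header_len on top. */
static inline size_t sk_patched_size(size_t payload, size_t family_hdrsize)
{
	size_t header_len = sk_nlmsg_total_size(SK_GENL_HDRLEN + family_hdrsize);

	return sk_genlmsg_new_size(header_len + payload);
}
```

Under these assumptions, a 100-byte payload gives sk_genlmsg_new_size(100) == 120, while sk_patched_size(100, 0) == 140. So the patch counts both headers a second time and adds 20 bytes of slack, rather than accounting for a header that was missing.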

>  	data_len = min(skb->len, trunc_size);
> -	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
> -		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
> +	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
> +		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
>  			    - NLA_ALIGNTO;
> -
> -	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
> +	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
>  	if (unlikely(!nl_skb))
>  		return;
>
Ido Schimmel Feb. 4, 2021, 8:47 a.m. UTC | #2
On Wed, Feb 03, 2021 at 06:21:03PM -0800, Jakub Kicinski wrote:
> On Wed,  3 Feb 2021 11:10:28 +0800 Chris Mi wrote:
> > Currently, the netlink skb length only includes metadata and data
> > length. It doesn't include the psample generic netlink header length.
> 
> But what's the bug? Did you see oversized messages on the socket? Did
> one of the nla_put() fail?

I didn't ask, but I assumed the problem was nla_put(). Agree it needs to
be noted in the commit message.

> 
> > Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
> > CC: Yotam Gigi <yotam.gi@gmail.com>
> > Reviewed-by: Ido Schimmel <idosch@nvidia.com>
> > Signed-off-by: Chris Mi <cmi@nvidia.com>
> > ---
> >  net/psample/psample.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/net/psample/psample.c b/net/psample/psample.c
> > index 33e238c965bd..807d75f5a40f 100644
> > --- a/net/psample/psample.c
> > +++ b/net/psample/psample.c
> > @@ -363,6 +363,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
> >  	struct ip_tunnel_info *tun_info;
> >  #endif
> >  	struct sk_buff *nl_skb;
> > +	int header_len;
> >  	int data_len;
> >  	int meta_len;
> >  	void *data;
> > @@ -381,12 +382,13 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
> >  		meta_len += psample_tunnel_meta_len(tun_info);
> >  #endif
> >  
> > +	/* psample generic netlink header size */
> > +	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);
> 
> GENL_HDRLEN is already included by genlmsg_new() and fam->hdrsize is 0
> / uninitialized for psample_nl_family. What am I missing? Ido?

Yea, I missed that genlmsg_new() eventually accounts for 'GENL_HDRLEN'.

Chris, assuming the problem is nla_put(), I think some other attribute
is not accounted for when calculating the size of the skb. Does it only
happen with packets that include tunnel metadata? Because I think I see
a few problems there:

diff --git a/net/psample/psample.c b/net/psample/psample.c
index 33e238c965bd..1a233cd128c7 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -311,8 +311,10 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
        int tun_opts_len = tun_info->options_len;
        int sum = 0;
 
+       sum += nla_total_size(0);       /* PSAMPLE_ATTR_TUNNEL */
+
        if (tun_key->tun_flags & TUNNEL_KEY)
-               sum += nla_total_size(sizeof(u64));
+               sum += nla_total_size_64bit(sizeof(u64));
 
        if (tun_info->mode & IP_TUNNEL_INFO_BRIDGE)
                sum += nla_total_size(0);
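
To illustrate the two hunks above, here is a small userspace sketch of how many bytes the unpatched psample_tunnel_meta_len() would under-count. The sk_* helpers are hypothetical and only model nla_total_size() and nla_total_size_64bit() as I understand them from include/net/netlink.h, assuming a 4-byte struct nlattr header and 4-byte alignment. The need_pad flag is my stand-in for kernels built without CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, where a 64-bit attribute may need an extra padding attribute. It covers only the two changes shown in this diff.

```c
#include <stddef.h>

#define SK_NLA_HDRLEN 4u /* assumed NLA_ALIGN(sizeof(struct nlattr)) */

/* Round up to the 4-byte attribute alignment. */
static inline size_t sk_nla_align(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Mirrors nla_total_size(): attribute header plus payload, aligned. */
static inline size_t sk_nla_total_size(size_t payload)
{
	return sk_nla_align(SK_NLA_HDRLEN + payload);
}

/* Mirrors nla_total_size_64bit(): same as above, plus room for one
 * empty padding attribute when need_pad is nonzero.
 */
static inline size_t sk_nla_total_size_64bit(size_t payload, int need_pad)
{
	size_t pad = need_pad ? sk_nla_total_size(0) : 0;

	return sk_nla_total_size(payload) + pad;
}

/* Bytes missing from the unpatched estimate for one tunnel nest:
 * the nest header itself, plus the 64-bit padding for the tunnel id
 * when the key flag is set.
 */
static inline size_t sk_tunnel_shortfall(int has_key, int need_pad)
{
	size_t missing = sk_nla_total_size(0); /* PSAMPLE_ATTR_TUNNEL nest */

	if (has_key)
		missing += sk_nla_total_size_64bit(8, need_pad) -
			   sk_nla_total_size(8);
	return missing;
}
```

With these assumptions the shortfall is 4 bytes for the nest header alone, and 8 bytes when the tunnel key is present and padding is needed.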

> 
> >  	data_len = min(skb->len, trunc_size);
> > -	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
> > -		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
> > +	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
> > +		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
> >  			    - NLA_ALIGNTO;
> > -
> > -	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
> > +	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
> >  	if (unlikely(!nl_skb))
> >  		return;
> >  
>
Chris Mi Feb. 4, 2021, 9:23 a.m. UTC | #3
On 2/4/2021 10:21 AM, Jakub Kicinski wrote:
> On Wed,  3 Feb 2021 11:10:28 +0800 Chris Mi wrote:
>> Currently, the netlink skb length only includes metadata and data
>> length. It doesn't include the psample generic netlink header length.
> But what's the bug? Did you see oversized messages on the socket?
Yes.
>   Did
> one of the nla_put() fail?
Yes.
>
>> Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
>> CC: Yotam Gigi <yotam.gi@gmail.com>
>> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
>> Signed-off-by: Chris Mi <cmi@nvidia.com>
>> ---
>>   net/psample/psample.c | 10 ++++++----
>>   1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/psample/psample.c b/net/psample/psample.c
>> index 33e238c965bd..807d75f5a40f 100644
>> --- a/net/psample/psample.c
>> +++ b/net/psample/psample.c
>> @@ -363,6 +363,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>>   	struct ip_tunnel_info *tun_info;
>>   #endif
>>   	struct sk_buff *nl_skb;
>> +	int header_len;
>>   	int data_len;
>>   	int meta_len;
>>   	void *data;
>> @@ -381,12 +382,13 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>>   		meta_len += psample_tunnel_meta_len(tun_info);
>>   #endif
>>   
>> +	/* psample generic netlink header size */
>> +	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);
> GENL_HDRLEN is already included by genlmsg_new() and fam->hdrsize is 0
> / uninitialized for psample_nl_family. What am I missing? Ido?
Thanks for pointing this out. If so, it seems this patch is incorrect.
>
>>   	data_len = min(skb->len, trunc_size);
>> -	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
>> -		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
>> +	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
>> +		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
>>   			    - NLA_ALIGNTO;
>> -
>> -	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
>> +	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
>>   	if (unlikely(!nl_skb))
>>   		return;
>>
Chris Mi Feb. 4, 2021, 9:32 a.m. UTC | #4
On 2/4/2021 4:47 PM, Ido Schimmel wrote:
> On Wed, Feb 03, 2021 at 06:21:03PM -0800, Jakub Kicinski wrote:
>> On Wed,  3 Feb 2021 11:10:28 +0800 Chris Mi wrote:
>>> Currently, the netlink skb length only includes metadata and data
>>> length. It doesn't include the psample generic netlink header length.
>> But what's the bug? Did you see oversized messages on the socket? Did
>> one of the nla_put() fail?
> I didn't ask, but I assumed the problem was nla_put(). Agree it needs to
> be noted in the commit message.
>
>>> Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
>>> CC: Yotam Gigi <yotam.gi@gmail.com>
>>> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
>>> Signed-off-by: Chris Mi <cmi@nvidia.com>
>>> ---
>>>   net/psample/psample.c | 10 ++++++----
>>>   1 file changed, 6 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/net/psample/psample.c b/net/psample/psample.c
>>> index 33e238c965bd..807d75f5a40f 100644
>>> --- a/net/psample/psample.c
>>> +++ b/net/psample/psample.c
>>> @@ -363,6 +363,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>>>   	struct ip_tunnel_info *tun_info;
>>>   #endif
>>>   	struct sk_buff *nl_skb;
>>> +	int header_len;
>>>   	int data_len;
>>>   	int meta_len;
>>>   	void *data;
>>> @@ -381,12 +382,13 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>>>   		meta_len += psample_tunnel_meta_len(tun_info);
>>>   #endif
>>>   
>>> +	/* psample generic netlink header size */
>>> +	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);
>> GENL_HDRLEN is already included by genlmsg_new() and fam->hdrsize is 0
>> / uninitialized for psample_nl_family. What am I missing? Ido?
> Yea, I missed that genlmsg_new() eventually accounts for 'GENL_HDRLEN'.
>
> Chris, assuming the problem is nla_put(), I think some other attribute
> is not accounted for when calculating the size of the skb. Does it only
> happen with packets that include tunnel metadata?
Yes.
>   Because I think I see
> a few problems there:
>
> diff --git a/net/psample/psample.c b/net/psample/psample.c
> index 33e238c965bd..1a233cd128c7 100644
> --- a/net/psample/psample.c
> +++ b/net/psample/psample.c
> @@ -311,8 +311,10 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
>          int tun_opts_len = tun_info->options_len;
>          int sum = 0;
>   
> +       sum += nla_total_size(0);       /* PSAMPLE_ATTR_TUNNEL */
> +
>          if (tun_key->tun_flags & TUNNEL_KEY)
> -               sum += nla_total_size(sizeof(u64));
> +               sum += nla_total_size_64bit(sizeof(u64));
>   
>          if (tun_info->mode & IP_TUNNEL_INFO_BRIDGE)
>                  sum += nla_total_size(0);
Thanks for this patch. I'll check it.

BTW, maybe I should not mention it, but if we had the psample dependency
removal patch (which was rejected), I think we could debug this psample
issue easily, because we could unload and reload psample on its own. But
when the NIC driver calls the psample API directly, we have to unload the
driver first. After reloading the NIC driver, we have to enable SR-IOV and
switchdev mode again, which is time-consuming.
>>>   	data_len = min(skb->len, trunc_size);
>>> -	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
>>> -		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
>>> +	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
>>> +		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
>>>   			    - NLA_ALIGNTO;
>>> -
>>> -	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
>>> +	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
>>>   	if (unlikely(!nl_skb))
>>>   		return;
>>>
Patch

diff --git a/net/psample/psample.c b/net/psample/psample.c
index 33e238c965bd..807d75f5a40f 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -363,6 +363,7 @@  void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 	struct ip_tunnel_info *tun_info;
 #endif
 	struct sk_buff *nl_skb;
+	int header_len;
 	int data_len;
 	int meta_len;
 	void *data;
@@ -381,12 +382,13 @@  void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 		meta_len += psample_tunnel_meta_len(tun_info);
 #endif
 
+	/* psample generic netlink header size */
+	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);
 	data_len = min(skb->len, trunc_size);
-	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
-		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
+	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
+		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
 			    - NLA_ALIGNTO;
-
-	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
+	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
 	if (unlikely(!nl_skb))
 		return;
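
For reference, the pre-patch truncation logic touched by this hunk can be sketched in userspace as follows. This is only an illustration: sk_clamp_data_len() and the SK_* constants are hypothetical names, SK_MAX_PACKET_SIZE assumes PSAMPLE_MAX_PACKET_SIZE is 0xffff, and the kernel code uses int rather than size_t.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NLA_HDRLEN       4u      /* assumed NLA_HDRLEN */
#define SK_NLA_ALIGNTO      4u      /* assumed NLA_ALIGNTO */
#define SK_MAX_PACKET_SIZE  0xffffu /* assumed PSAMPLE_MAX_PACKET_SIZE */

/* Mirrors nla_total_size(): attribute header plus payload, aligned. */
static inline size_t sk_nla_total_size(size_t payload)
{
	return (SK_NLA_HDRLEN + payload + SK_NLA_ALIGNTO - 1) &
	       ~(size_t)(SK_NLA_ALIGNTO - 1);
}

/* Pick how many packet bytes to copy into the sample so that metadata
 * plus the data attribute stay within the maximum message size.
 */
static inline size_t sk_clamp_data_len(size_t skb_len, size_t trunc_size,
				       size_t meta_len)
{
	size_t data_len = skb_len < trunc_size ? skb_len : trunc_size;

	/* The sketch uses unsigned math, so guard against underflow. */
	assert(meta_len + SK_NLA_HDRLEN + SK_NLA_ALIGNTO <= SK_MAX_PACKET_SIZE);

	if (meta_len + sk_nla_total_size(data_len) > SK_MAX_PACKET_SIZE)
		data_len = SK_MAX_PACKET_SIZE - meta_len - SK_NLA_HDRLEN -
			   SK_NLA_ALIGNTO;
	return data_len;
}
```

Subtracting NLA_ALIGNTO in addition to NLA_HDRLEN leaves room for the up-to-3 bytes of alignment padding, so the clamped attribute still fits. For example, with meta_len = 100 and a 70000-byte packet the sketch yields data_len = 65427, and 100 + sk_nla_total_size(65427) = 65532, which is below the limit. The clamp is only as good as meta_len, though: any attribute left out of meta_len is also left out of the allocation.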