Message ID | 20210203031028.171318-1-cmi@nvidia.com |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | [net] net: psample: Fix the netlink skb length |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | warning | 5 maintainers not CCed: jhs@mojatatu.com jiri@mellanox.com davem@davemloft.net simon.horman@netronome.com kuba@kernel.org |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | warning | WARNING: line length of 87 exceeds 80 columns WARNING: line length of 91 exceeds 80 columns |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
On Wed, 3 Feb 2021 11:10:28 +0800 Chris Mi wrote:
> Currently, the netlink skb length only includes metadata and data
> length. It doesn't include the psample generic netlink header length.

But what's the bug? Did you see oversized messages on the socket? Did
one of the nla_put() fail?

> Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
> CC: Yotam Gigi <yotam.gi@gmail.com>
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
> Signed-off-by: Chris Mi <cmi@nvidia.com>
> ---
>  net/psample/psample.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/net/psample/psample.c b/net/psample/psample.c
> index 33e238c965bd..807d75f5a40f 100644
> --- a/net/psample/psample.c
> +++ b/net/psample/psample.c
> @@ -363,6 +363,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>  	struct ip_tunnel_info *tun_info;
>  #endif
>  	struct sk_buff *nl_skb;
> +	int header_len;
>  	int data_len;
>  	int meta_len;
>  	void *data;
> @@ -381,12 +382,13 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>  		meta_len += psample_tunnel_meta_len(tun_info);
>  #endif
>
> +	/* psample generic netlink header size */
> +	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);

GENL_HDRLEN is already included by genlmsg_new() and fam->hdrsize is 0
/ uninitialized for psample_nl_family. What am I missing? Ido?

>  	data_len = min(skb->len, trunc_size);
> -	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
> -		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
> +	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
> +		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
>  			    - NLA_ALIGNTO;
> -
> -	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
> +	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
>  	if (unlikely(!nl_skb))
>  		return;
>
On Wed, Feb 03, 2021 at 06:21:03PM -0800, Jakub Kicinski wrote:
> On Wed, 3 Feb 2021 11:10:28 +0800 Chris Mi wrote:
> > Currently, the netlink skb length only includes metadata and data
> > length. It doesn't include the psample generic netlink header length.
>
> But what's the bug? Did you see oversized messages on the socket? Did
> one of the nla_put() fail?

I didn't ask, but I assumed the problem was nla_put(). Agree it needs to
be noted in the commit message.

>
> > Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
> > CC: Yotam Gigi <yotam.gi@gmail.com>
> > Reviewed-by: Ido Schimmel <idosch@nvidia.com>
> > Signed-off-by: Chris Mi <cmi@nvidia.com>
> > ---
> >  net/psample/psample.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/psample/psample.c b/net/psample/psample.c
> > index 33e238c965bd..807d75f5a40f 100644
> > --- a/net/psample/psample.c
> > +++ b/net/psample/psample.c
> > @@ -363,6 +363,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
> >  	struct ip_tunnel_info *tun_info;
> >  #endif
> >  	struct sk_buff *nl_skb;
> > +	int header_len;
> >  	int data_len;
> >  	int meta_len;
> >  	void *data;
> > @@ -381,12 +382,13 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
> >  		meta_len += psample_tunnel_meta_len(tun_info);
> >  #endif
> >
> > +	/* psample generic netlink header size */
> > +	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);
>
> GENL_HDRLEN is already included by genlmsg_new() and fam->hdrsize is 0
> / uninitialized for psample_nl_family. What am I missing? Ido?

Yea, I missed that genlmsg_new() eventually accounts for 'GENL_HDRLEN'.

Chris, assuming the problem is nla_put(), I think some other attribute
is not accounted for when calculating the size of the skb. Does it only
happen with packets that include tunnel metadata? Because I think I see
a few problems there:

diff --git a/net/psample/psample.c b/net/psample/psample.c
index 33e238c965bd..1a233cd128c7 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -311,8 +311,10 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
 	int tun_opts_len = tun_info->options_len;
 	int sum = 0;
 
+	sum += nla_total_size(0);	/* PSAMPLE_ATTR_TUNNEL */
+
 	if (tun_key->tun_flags & TUNNEL_KEY)
-		sum += nla_total_size(sizeof(u64));
+		sum += nla_total_size_64bit(sizeof(u64));
 
 	if (tun_info->mode & IP_TUNNEL_INFO_BRIDGE)
 		sum += nla_total_size(0);

>
> >  	data_len = min(skb->len, trunc_size);
> > -	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
> > -		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
> > +	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
> > +		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
> >  			    - NLA_ALIGNTO;
> > -
> > -	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
> > +	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
> >  	if (unlikely(!nl_skb))
> >  		return;
> >
>
On 2/4/2021 10:21 AM, Jakub Kicinski wrote:
> On Wed, 3 Feb 2021 11:10:28 +0800 Chris Mi wrote:
>> Currently, the netlink skb length only includes metadata and data
>> length. It doesn't include the psample generic netlink header length.
> But what's the bug? Did you see oversized messages on the socket?

Yes.

> Did
> one of the nla_put() fail?

Yes.

>
>> Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
>> CC: Yotam Gigi <yotam.gi@gmail.com>
>> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
>> Signed-off-by: Chris Mi <cmi@nvidia.com>
>> ---
>>  net/psample/psample.c | 10 ++++++----
>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/psample/psample.c b/net/psample/psample.c
>> index 33e238c965bd..807d75f5a40f 100644
>> --- a/net/psample/psample.c
>> +++ b/net/psample/psample.c
>> @@ -363,6 +363,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>>  	struct ip_tunnel_info *tun_info;
>>  #endif
>>  	struct sk_buff *nl_skb;
>> +	int header_len;
>>  	int data_len;
>>  	int meta_len;
>>  	void *data;
>> @@ -381,12 +382,13 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>>  		meta_len += psample_tunnel_meta_len(tun_info);
>>  #endif
>>
>> +	/* psample generic netlink header size */
>> +	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);
> GENL_HDRLEN is already included by genlmsg_new() and fam->hdrsize is 0
> / uninitialized for psample_nl_family. What am I missing? Ido?

Thanks for pointing this out. If so, it seems this patch is incorrect.

>
>>  	data_len = min(skb->len, trunc_size);
>> -	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
>> -		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
>> +	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
>> +		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
>>  			    - NLA_ALIGNTO;
>> -
>> -	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
>> +	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
>>  	if (unlikely(!nl_skb))
>>  		return;
>>
On 2/4/2021 4:47 PM, Ido Schimmel wrote:
> On Wed, Feb 03, 2021 at 06:21:03PM -0800, Jakub Kicinski wrote:
>> On Wed, 3 Feb 2021 11:10:28 +0800 Chris Mi wrote:
>>> Currently, the netlink skb length only includes metadata and data
>>> length. It doesn't include the psample generic netlink header length.
>> But what's the bug? Did you see oversized messages on the socket? Did
>> one of the nla_put() fail?
> I didn't ask, but I assumed the problem was nla_put(). Agree it needs to
> be noted in the commit message.
>
>>> Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
>>> CC: Yotam Gigi <yotam.gi@gmail.com>
>>> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
>>> Signed-off-by: Chris Mi <cmi@nvidia.com>
>>> ---
>>>  net/psample/psample.c | 10 ++++++----
>>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/net/psample/psample.c b/net/psample/psample.c
>>> index 33e238c965bd..807d75f5a40f 100644
>>> --- a/net/psample/psample.c
>>> +++ b/net/psample/psample.c
>>> @@ -363,6 +363,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>>>  	struct ip_tunnel_info *tun_info;
>>>  #endif
>>>  	struct sk_buff *nl_skb;
>>> +	int header_len;
>>>  	int data_len;
>>>  	int meta_len;
>>>  	void *data;
>>> @@ -381,12 +382,13 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>>>  		meta_len += psample_tunnel_meta_len(tun_info);
>>>  #endif
>>>
>>> +	/* psample generic netlink header size */
>>> +	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);
>> GENL_HDRLEN is already included by genlmsg_new() and fam->hdrsize is 0
>> / uninitialized for psample_nl_family. What am I missing? Ido?
> Yea, I missed that genlmsg_new() eventually accounts for 'GENL_HDRLEN'.
>
> Chris, assuming the problem is nla_put(), I think some other attribute
> is not accounted for when calculating the size of the skb. Does it only
> happen with packets that include tunnel metadata?

Yes.

> Because I think I see
> a few problems there:
>
> diff --git a/net/psample/psample.c b/net/psample/psample.c
> index 33e238c965bd..1a233cd128c7 100644
> --- a/net/psample/psample.c
> +++ b/net/psample/psample.c
> @@ -311,8 +311,10 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
>  	int tun_opts_len = tun_info->options_len;
>  	int sum = 0;
>
> +	sum += nla_total_size(0);	/* PSAMPLE_ATTR_TUNNEL */
> +
>  	if (tun_key->tun_flags & TUNNEL_KEY)
> -		sum += nla_total_size(sizeof(u64));
> +		sum += nla_total_size_64bit(sizeof(u64));
>
>  	if (tun_info->mode & IP_TUNNEL_INFO_BRIDGE)
>  		sum += nla_total_size(0);

Thanks for this patch. I'll check it.

BTW, maybe I should not mention it, if we have the psample dependency
removal patch which is rejected, I think we can debug the psample issue
easily. Because we can unload and load psample easily. But if NIC driver
calls psample api directly. We have to unload the driver first. After
loading the NIC driver, we have to enable sriov and enable switchdev
mode again which is time consuming.

>>>  	data_len = min(skb->len, trunc_size);
>>> -	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
>>> -		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
>>> +	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
>>> +		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
>>>  			    - NLA_ALIGNTO;
>>> -
>>> -	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
>>> +	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
>>>  	if (unlikely(!nl_skb))
>>>  		return;
>>>
diff --git a/net/psample/psample.c b/net/psample/psample.c
index 33e238c965bd..807d75f5a40f 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -363,6 +363,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 	struct ip_tunnel_info *tun_info;
 #endif
 	struct sk_buff *nl_skb;
+	int header_len;
 	int data_len;
 	int meta_len;
 	void *data;
@@ -381,12 +382,13 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 		meta_len += psample_tunnel_meta_len(tun_info);
 #endif
 
+	/* psample generic netlink header size */
+	header_len = nlmsg_total_size(GENL_HDRLEN + psample_nl_family.hdrsize);
 	data_len = min(skb->len, trunc_size);
-	if (meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
-		data_len = PSAMPLE_MAX_PACKET_SIZE - meta_len - NLA_HDRLEN
+	if (header_len + meta_len + nla_total_size(data_len) > PSAMPLE_MAX_PACKET_SIZE)
+		data_len = PSAMPLE_MAX_PACKET_SIZE - header_len - meta_len - NLA_HDRLEN
 			    - NLA_ALIGNTO;
-
-	nl_skb = genlmsg_new(meta_len + nla_total_size(data_len), GFP_ATOMIC);
+	nl_skb = genlmsg_new(header_len + meta_len + nla_total_size(data_len), GFP_ATOMIC);
 	if (unlikely(!nl_skb))
 		return;
 