Message ID | 20210123195916.2765481-16-jonas@norrbonn.se (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | GTP: flow based | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | fail | Series longer than 15 patches |
netdev/tree_selection | success | Guessed tree name to be net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | warning | 2 maintainers not CCed: davem@davemloft.net osmocom-net-gprs@lists.osmocom.org |
netdev/source_inline | fail | Was 0 now: 1 |
netdev/verify_signedoff | fail | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 1 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | warning | WARNING: line length of 83 exceeds 80 columns |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 1 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
Hi Pravin, So, this whole GTP metadata thing has me a bit confused. You've defined a metadata structure like this: struct gtpu_metadata { __u8 ver; __u8 flags; __u8 type; }; Here ver is the version of the metadata structure itself, which is fine. 'flags' corresponds to the 3 flag bits of GTP header's first byte: E, S, and PN. 'type' corresponds to the 'message type' field of the GTP header. The 'control header' (strange name) example below allows the flags to be set; however, setting these flags alone is insufficient because each one indicates the presence of additional fields in the header and there's nothing in the code to account for that. If E is set, extension headers would need to be added. If S is set, a sequence number field would need to be added. If PN is set, a PDU-number header would need to be added. It's not clear to me who sets up this metadata in the first place. Is that where OVS or eBPF come in? Can you give some pointers to how this works? Couple of comments below.... On 23/01/2021 20:59, Jonas Bonn wrote: > From: Pravin B Shelar <pbshelar@fb.com> > > Please explain how this patch actually works... creation of the control > header makes sense, but I don't understand how sending of a > control header is actually triggered. > > Signed-off-by: Jonas Bonn <jonas@norrbonn.se> > --- > drivers/net/gtp.c | 43 ++++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 42 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c > index 668ed8a4836e..bbce2671de2d 100644 > --- a/drivers/net/gtp.c > +++ b/drivers/net/gtp.c > @@ -683,6 +683,38 @@ static void gtp_push_header(struct sk_buff *skb, struct pdp_ctx *pctx, > } > } > > +static inline int gtp1_push_control_header(struct sk_buff *skb, I'm not enamored with the name 'control header' because it makes sound like this is some GTP-C thing. The GTP module is really only about GTP-U and the function itself just sets up a GTP-U header. > + struct pdp_ctx *pctx, > + struct gtpu_metadata *opts, > + struct net_device *dev) > +{ > + struct gtp1_header *gtp1c; > + int payload_len; > + > + if (opts->ver != GTP_METADATA_V1) > + return -ENOENT; > + > + if (opts->type == 0xFE) { > + /* for end marker ignore skb data. */ > + netdev_dbg(dev, "xmit pkt with null data"); > + pskb_trim(skb, 0); > + } > + if (skb_cow_head(skb, sizeof(*gtp1c)) < 0) > + return -ENOMEM; > + > + payload_len = skb->len; > + > + gtp1c = skb_push(skb, sizeof(*gtp1c)); > + > + gtp1c->flags = opts->flags; > + gtp1c->type = opts->type; > + gtp1c->length = htons(payload_len); > + gtp1c->tid = htonl(pctx->u.v1.o_tei); > + netdev_dbg(dev, "xmit control pkt: ver %d flags %x type %x pkt len %d tid %x", > + opts->ver, opts->flags, opts->type, skb->len, pctx->u.v1.o_tei); > + return 0; > +} There's nothing really special about that above function aside from the facts that it takes 'opts' as an argument. Can't we just merge this with the regular 'gtp_push_header' function? Do you have plans for this beyond what's here that would complicated by doing so? /Jonas > + > static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev) > { > struct gtp_dev *gtp = netdev_priv(dev); > @@ -807,7 +839,16 @@ static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev) > > skb_set_inner_protocol(skb, skb->protocol); > > - gtp_push_header(skb, pctx, &port); > + if (unlikely(opts)) { > + port = htons(GTP1U_PORT); > + r = gtp1_push_control_header(skb, pctx, opts, dev); > + if (r) { > + netdev_info(dev, "cntr pkt error %d", r); > + goto err_rt; > + } > + } else { > + gtp_push_header(skb, pctx, &port); > + } > > iph = ip_hdr(skb); > netdev_dbg(dev, "gtp -> IP src: %pI4 dst: %pI4\n", >
Hi Jonas, thanks for your detailed analysis and review of the changes. To me, they once again show that the original patch was merged too quickly, without a detailed review by people with strong GTP background. On Sun, Jan 24, 2021 at 03:21:21PM +0100, Jonas Bonn wrote: > struct gtpu_metadata { > __u8 ver; > __u8 flags; > __u8 type; > }; > > Here ver is the version of the metadata structure itself, which is fine. > 'flags' corresponds to the 3 flag bits of GTP header's first byte: E, S, > and PN. > 'type' corresponds to the 'message type' field of the GTP header. One more comment on the 'type': Of how much use is it? After all, the GTP-U kernel driver only handles a single message type at all (G-PDU / 255 - the only message type that encapsulates user IP data), while all other message types are always processed in userland via the UDP socket. Side-note: 3GPP TS 29.060 lists 5 other message types that can happen in GTP-U: * Echo Request * Echo Response * Error Indication * Supported Extension Headers Notification * End Marker It would be interesting to understand how the new flow-based tunnel would treat those, if those > The 'control header' (strange name) example below allows the flags to be > set; however, setting these flags alone is insufficient because each one > indicates the presence of additional fields in the header and there's > nothing in the code to account for that. Full ACK from my side here. Setting arbitrary bits in the GTP flags without then actually encoding the required additional bits that those flags require will produce broken packets. IMHO, the GTP driver should never do that.
On Sun, Jan 24, 2021 at 6:22 AM Jonas Bonn <jonas@norrbonn.se> wrote: > > Hi Pravin, > > So, this whole GTP metadata thing has me a bit confused. > > You've defined a metadata structure like this: > > struct gtpu_metadata { > __u8 ver; > __u8 flags; > __u8 type; > }; > > Here ver is the version of the metadata structure itself, which is fine. > 'flags' corresponds to the 3 flag bits of GTP header's first byte: E, > S, and PN. > 'type' corresponds to the 'message type' field of the GTP header. > > The 'control header' (strange name) example below allows the flags to be > set; however, setting these flags alone is insufficient because each one > indicates the presence of additional fields in the header and there's > nothing in the code to account for that. > > If E is set, extension headers would need to be added. > If S is set, a sequence number field would need to be added. > If PN is set, a PDU-number header would need to be added. > > It's not clear to me who sets up this metadata in the first place. Is > that where OVS or eBPF come in? Can you give some pointers to how this > works? > Receive path: LWT extracts tunnel metadata into tunnel-metadata struct. This object has 5-tuple info from outer header and tunnel key. When there is presence of extension header there is no way to store the info standard tunnel-metadata object. That is when the optional section of tunnel-metadata comes in the play. As you can see the packet data from GTP header onwards is still pushed to the device, so consumers of LWT can look at tunnel-metadata and make sense of the inner packet that is received on the device. OVS does exactly the same. When it receives a GTP packet with optional metadata, it looks at flags and parses the inner packet and extension header accordingly. xmit path: it is set by LWT tunnel user, OVS or eBPF code. it needs to set the metadata in tunnel dst along with the 5-tuple data and tunel ID. This is how it can steer the packet at the right tunnel using a single tunnel net device. > Couple of comments below.... > > On 23/01/2021 20:59, Jonas Bonn wrote: > > From: Pravin B Shelar <pbshelar@fb.com> > > > > Please explain how this patch actually works... creation of the control > > header makes sense, but I don't understand how sending of a > > control header is actually triggered. > > > > Signed-off-by: Jonas Bonn <jonas@norrbonn.se> > > --- > > drivers/net/gtp.c | 43 ++++++++++++++++++++++++++++++++++++++++++- > > 1 file changed, 42 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c > > index 668ed8a4836e..bbce2671de2d 100644 > > --- a/drivers/net/gtp.c > > +++ b/drivers/net/gtp.c > > @@ -683,6 +683,38 @@ static void gtp_push_header(struct sk_buff *skb, struct pdp_ctx *pctx, > > } > > } > > > > +static inline int gtp1_push_control_header(struct sk_buff *skb, > > I'm not enamored with the name 'control header' because it makes sound > like this is some GTP-C thing. The GTP module is really only about > GTP-U and the function itself just sets up a GTP-U header. > sure. lets call ext_hdr. > > > + struct pdp_ctx *pctx, > > + struct gtpu_metadata *opts, > > + struct net_device *dev) > > +{ > > + struct gtp1_header *gtp1c; > > + int payload_len; > > + > > + if (opts->ver != GTP_METADATA_V1) > > + return -ENOENT; > > + > > + if (opts->type == 0xFE) { > > + /* for end marker ignore skb data. */ > > + netdev_dbg(dev, "xmit pkt with null data"); > > + pskb_trim(skb, 0); > > + } > > + if (skb_cow_head(skb, sizeof(*gtp1c)) < 0) > > + return -ENOMEM; > > + > > + payload_len = skb->len; > > + > > + gtp1c = skb_push(skb, sizeof(*gtp1c)); > > + > > + gtp1c->flags = opts->flags; > > + gtp1c->type = opts->type; > > + gtp1c->length = htons(payload_len); > > + gtp1c->tid = htonl(pctx->u.v1.o_tei); > > + netdev_dbg(dev, "xmit control pkt: ver %d flags %x type %x pkt len %d tid %x", > > + opts->ver, opts->flags, opts->type, skb->len, pctx->u.v1.o_tei); > > + return 0; > > +} > > There's nothing really special about that above function aside from the > facts that it takes 'opts' as an argument. Can't we just merge this > with the regular 'gtp_push_header' function? Do you have plans for this > beyond what's here that would complicated by doing so? > Yes, we already have usecase for handling GTP PDU session container related extension header for 5G UPF endpoitnt. > /Jonas > > > > + > > static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev) > > { > > struct gtp_dev *gtp = netdev_priv(dev); > > @@ -807,7 +839,16 @@ static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev) > > > > skb_set_inner_protocol(skb, skb->protocol); > > > > - gtp_push_header(skb, pctx, &port); > > + if (unlikely(opts)) { > > + port = htons(GTP1U_PORT); > > + r = gtp1_push_control_header(skb, pctx, opts, dev); > > + if (r) { > > + netdev_info(dev, "cntr pkt error %d", r); > > + goto err_rt; > > + } > > + } else { > > + gtp_push_header(skb, pctx, &port); > > + } > > > > iph = ip_hdr(skb); > > netdev_dbg(dev, "gtp -> IP src: %pI4 dst: %pI4\n", > >
On Mon, Jan 25, 2021 at 9:52 AM Harald Welte <laforge@gnumonks.org> wrote: > > Hi Jonas, > > On Sun, Jan 24, 2021 at 03:21:21PM +0100, Jonas Bonn wrote: > > struct gtpu_metadata { > > __u8 ver; > > __u8 flags; > > __u8 type; > > }; > > > > Here ver is the version of the metadata structure itself, which is fine. > > 'flags' corresponds to the 3 flag bits of GTP header's first byte: E, S, > > and PN. > > 'type' corresponds to the 'message type' field of the GTP header. > > One more comment on the 'type': Of how much use is it? After all, the > GTP-U kernel driver only handles a single message type at all (G-PDU / > 255 - the only message type that encapsulates user IP data), while all > other message types are always processed in userland via the UDP socket. > There is NO userland UDP socket for the LWT tunnel. All UDP traffic is terminated in kernel space. userland only sets rules over LTW tunnel device to handle traffic. This is how OVS/eBPF handles other UDP based LWT tunnel devices. > Side-note: 3GPP TS 29.060 lists 5 other message types that can happen in > GTP-U: > * Echo Request > * Echo Response > * Error Indication > * Supported Extension Headers Notification > * End Marker > > It would be interesting to understand how the new flow-based tunnel would > treat those, if those > Current code does handle Echo Request, Response and End marker.
On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote: > > Hi Pravin, > > On 28/01/2021 22:29, Pravin Shelar wrote: > > On Sun, Jan 24, 2021 at 6:22 AM Jonas Bonn <jonas@norrbonn.se> wrote: > >> > >> Hi Pravin, > >> > >> So, this whole GTP metadata thing has me a bit confused. > >> > >> You've defined a metadata structure like this: > >> > >> struct gtpu_metadata { > >> __u8 ver; > >> __u8 flags; > >> __u8 type; > >> }; > >> > >> Here ver is the version of the metadata structure itself, which is fine. > >> 'flags' corresponds to the 3 flag bits of GTP header's first byte: E, > >> S, and PN. > >> 'type' corresponds to the 'message type' field of the GTP header. > >> > >> The 'control header' (strange name) example below allows the flags to be > >> set; however, setting these flags alone is insufficient because each one > >> indicates the presence of additional fields in the header and there's > >> nothing in the code to account for that. > >> > >> If E is set, extension headers would need to be added. > >> If S is set, a sequence number field would need to be added. > >> If PN is set, a PDU-number header would need to be added. > >> > >> It's not clear to me who sets up this metadata in the first place. Is > >> that where OVS or eBPF come in? Can you give some pointers to how this > >> works? > >> > > > > Receive path: LWT extracts tunnel metadata into tunnel-metadata > > struct. This object has 5-tuple info from outer header and tunnel key. > > When there is presence of extension header there is no way to store > > the info standard tunnel-metadata object. That is when the optional > > section of tunnel-metadata comes in the play. > > As you can see the packet data from GTP header onwards is still pushed > > to the device, so consumers of LWT can look at tunnel-metadata and > > make sense of the inner packet that is received on the device. > > OVS does exactly the same. When it receives a GTP packet with optional > > metadata, it looks at flags and parses the inner packet and extension > > header accordingly. > > Ah, ok, I see. So you are pulling _half_ of the GTP header off the > packet but leaving the optional GTP extension headers in place if they > exist. So what OVS receives is a packet with metadata indicating > whether or not it begins with these extension headers or whether it > begins with an IP header. > > So OVS might need to begin by pulling parts of the packet in order to > get to the inner IP packet. In that case, why don't you just leave the > _entire_ GTP header in place and let OVS work from that? The header > contains exactly the data you've copied to the metadata struct PLUS it > has the incoming TEID value that you really should be validating inner > IP against. > Following are the reasons for extracting the header and populating metadata. 1. That is the design used by other tunneling protocols implementations for handling optional headers. We need to have a consistent model across all tunnel devices for upper layers. 2. GTP module is parsing the UDP and GTP header. It would be wasteful to repeat the same process in upper layers. 3. TIED is part of tunnel metadata, it is already used to validating inner packets. But TIED is not alone to handle packets with extended header. I am fine with processing the entire header in GTP but in case of 'end marker' there is no data left after pulling entire GTP header. Thats why I took this path. > Also, what happens if this is used WITHOUT OVS/eBPF... just a route with > 'encap' set. In that case, nothing will be there to pull the extension > headers off the packet. > One way would be, the user can use encap to steer "special" packets to a netdev that can handle such packets. it would be a userspace process or eBPF program. > > > > xmit path: it is set by LWT tunnel user, OVS or eBPF code. it needs to > > set the metadata in tunnel dst along with the 5-tuple data and tunel > > ID. This is how it can steer the packet at the right tunnel using a > > single tunnel net device. > > Right, that part is fine. However, for setting extension headers you'd > need to set the flags in the metadata and the extensions themselves > prepended to the IP packet meaning your SKB contains a pseudo-GTP packet > before the GTP module can finish adding the header. I don't know why > you wouldn't just add the entire GTP header in one place and be done > with it... going via the GTP module gets you almost nothing at this point. > The UDP socket is owned by GTP module, so there is no other way to inject a packet in the tunnel than sending it over GTP module. plus this would looks this will break the standard model of abstracting tunnel implementation in ovs or bridge code. We will need special code for GTP packets to parse extra outer headers when OVS extracts the flow from the inner packet. I am open to handling the optional headers completely GTP module. > > > > >> Couple of comments below.... > >> > >> On 23/01/2021 20:59, Jonas Bonn wrote: > >>> From: Pravin B Shelar <pbshelar@fb.com> > >>> > >>> Please explain how this patch actually works... creation of the control > >>> header makes sense, but I don't understand how sending of a > >>> control header is actually triggered. > >>> > >>> Signed-off-by: Jonas Bonn <jonas@norrbonn.se> > >>> --- > >>> drivers/net/gtp.c | 43 ++++++++++++++++++++++++++++++++++++++++++- > >>> 1 file changed, 42 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c > >>> index 668ed8a4836e..bbce2671de2d 100644 > >>> --- a/drivers/net/gtp.c > >>> +++ b/drivers/net/gtp.c > >>> @@ -683,6 +683,38 @@ static void gtp_push_header(struct sk_buff *skb, struct pdp_ctx *pctx, > >>> } > >>> } > >>> > >>> +static inline int gtp1_push_control_header(struct sk_buff *skb, > >> > >> I'm not enamored with the name 'control header' because it makes sound > >> like this is some GTP-C thing. The GTP module is really only about > >> GTP-U and the function itself just sets up a GTP-U header. > >> > > sure. lets call ext_hdr. > > > >> > >>> + struct pdp_ctx *pctx, > >>> + struct gtpu_metadata *opts, > >>> + struct net_device *dev) > >>> +{ > >>> + struct gtp1_header *gtp1c; > >>> + int payload_len; > >>> + > >>> + if (opts->ver != GTP_METADATA_V1) > >>> + return -ENOENT; > >>> + > >>> + if (opts->type == 0xFE) { > >>> + /* for end marker ignore skb data. */ > >>> + netdev_dbg(dev, "xmit pkt with null data"); > >>> + pskb_trim(skb, 0); > >>> + } > >>> + if (skb_cow_head(skb, sizeof(*gtp1c)) < 0) > >>> + return -ENOMEM; > >>> + > >>> + payload_len = skb->len; > >>> + > >>> + gtp1c = skb_push(skb, sizeof(*gtp1c)); > >>> + > >>> + gtp1c->flags = opts->flags; > >>> + gtp1c->type = opts->type; > >>> + gtp1c->length = htons(payload_len); > >>> + gtp1c->tid = htonl(pctx->u.v1.o_tei); > >>> + netdev_dbg(dev, "xmit control pkt: ver %d flags %x type %x pkt len %d tid %x", > >>> + opts->ver, opts->flags, opts->type, skb->len, pctx->u.v1.o_tei); > >>> + return 0; > >>> +} > >> > >> There's nothing really special about that above function aside from the > >> facts that it takes 'opts' as an argument. Can't we just merge this > >> with the regular 'gtp_push_header' function? Do you have plans for this > >> beyond what's here that would complicated by doing so? > >> > > > > Yes, we already have usecase for handling GTP PDU session container > > related extension header for 5G UPF endpoitnt. > > I'll venture a guess that you're referring to QoS indications. I, too, > have been wondering whether or not these can be made to be compatible > with the GTP module... I'm not sure, at this point. > Right, It's QoS related headers. We can use the same optional metadata field to process this header. we need to extend the option data structure to pass PDU session container header values.
On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote: > On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote: > > On 28/01/2021 22:29, Pravin Shelar wrote: > > > Receive path: LWT extracts tunnel metadata into tunnel-metadata > > > struct. This object has 5-tuple info from outer header and tunnel key. > > > When there is presence of extension header there is no way to store > > > the info standard tunnel-metadata object. That is when the optional > > > section of tunnel-metadata comes in the play. > > > As you can see the packet data from GTP header onwards is still pushed > > > to the device, so consumers of LWT can look at tunnel-metadata and > > > make sense of the inner packet that is received on the device. > > > OVS does exactly the same. When it receives a GTP packet with optional > > > metadata, it looks at flags and parses the inner packet and extension > > > header accordingly. > > > > Ah, ok, I see. So you are pulling _half_ of the GTP header off the > > packet but leaving the optional GTP extension headers in place if they > > exist. So what OVS receives is a packet with metadata indicating > > whether or not it begins with these extension headers or whether it > > begins with an IP header. > > > > So OVS might need to begin by pulling parts of the packet in order to > > get to the inner IP packet. In that case, why don't you just leave the > > _entire_ GTP header in place and let OVS work from that? The header > > contains exactly the data you've copied to the metadata struct PLUS it > > has the incoming TEID value that you really should be validating inner > > IP against. > > > > Following are the reasons for extracting the header and populating metadata. > 1. That is the design used by other tunneling protocols > implementations for handling optional headers. We need to have a > consistent model across all tunnel devices for upper layers. Could you clarify with some examples? This does not match intuition, I must be missing something. > 2. GTP module is parsing the UDP and GTP header. It would be wasteful > to repeat the same process in upper layers. > 3. TIED is part of tunnel metadata, it is already used to validating > inner packets. But TIED is not alone to handle packets with extended > header. > > I am fine with processing the entire header in GTP but in case of 'end > marker' there is no data left after pulling entire GTP header. Thats > why I took this path.
On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote: > > On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote: > > On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote: > > > On 28/01/2021 22:29, Pravin Shelar wrote: > > > > Receive path: LWT extracts tunnel metadata into tunnel-metadata > > > > struct. This object has 5-tuple info from outer header and tunnel key. > > > > When there is presence of extension header there is no way to store > > > > the info standard tunnel-metadata object. That is when the optional > > > > section of tunnel-metadata comes in the play. > > > > As you can see the packet data from GTP header onwards is still pushed > > > > to the device, so consumers of LWT can look at tunnel-metadata and > > > > make sense of the inner packet that is received on the device. > > > > OVS does exactly the same. When it receives a GTP packet with optional > > > > metadata, it looks at flags and parses the inner packet and extension > > > > header accordingly. > > > > > > Ah, ok, I see. So you are pulling _half_ of the GTP header off the > > > packet but leaving the optional GTP extension headers in place if they > > > exist. So what OVS receives is a packet with metadata indicating > > > whether or not it begins with these extension headers or whether it > > > begins with an IP header. > > > > > > So OVS might need to begin by pulling parts of the packet in order to > > > get to the inner IP packet. In that case, why don't you just leave the > > > _entire_ GTP header in place and let OVS work from that? The header > > > contains exactly the data you've copied to the metadata struct PLUS it > > > has the incoming TEID value that you really should be validating inner > > > IP against. > > > > > > > Following are the reasons for extracting the header and populating metadata. > > 1. That is the design used by other tunneling protocols > > implementations for handling optional headers. We need to have a > > consistent model across all tunnel devices for upper layers. > > Could you clarify with some examples? This does not match intuition, > I must be missing something. > You can look at geneve_rx() or vxlan_rcv() that extracts optional headers in ip_tunnel_info opts.
On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote: > On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote: > > On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote: > > > On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote: > > > Following are the reasons for extracting the header and populating metadata. > > > 1. That is the design used by other tunneling protocols > > > implementations for handling optional headers. We need to have a > > > consistent model across all tunnel devices for upper layers. > > > > Could you clarify with some examples? This does not match intuition, > > I must be missing something. > > You can look at geneve_rx() or vxlan_rcv() that extracts optional > headers in ip_tunnel_info opts. Okay, I got confused what Jonas was inquiring about. I thought that the extension headers were not pulled, rather than not parsed. Copying them as-is to info->opts is right, thanks!
Hi Jakub, On 01/02/2021 21:44, Jakub Kicinski wrote: > On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote: >> On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote: >>> On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote: >>>> On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote: >>>> Following are the reasons for extracting the header and populating metadata. >>>> 1. That is the design used by other tunneling protocols >>>> implementations for handling optional headers. We need to have a >>>> consistent model across all tunnel devices for upper layers. >>> >>> Could you clarify with some examples? This does not match intuition, >>> I must be missing something. >> >> You can look at geneve_rx() or vxlan_rcv() that extracts optional >> headers in ip_tunnel_info opts. > > Okay, I got confused what Jonas was inquiring about. I thought that the > extension headers were not pulled, rather than not parsed. Copying them > as-is to info->opts is right, thanks! > No, you're not confused. The extension headers are not being pulled in the current patchset. Incoming packet: --------------------------------------------------------------------- | flags | type | len | TEID | N-PDU | SEQ | Ext | EXT.Hdr | IP | ... --------------------------------------------------------------------- <--------- GTP header ------<<Optional GTP elements>>-----><- Pkt ---> The "collect metadata" path of the patchset copies 'flags' and 'type' to info->opts, but leaves the following: ----------------------------------------- | N-PDU | SEQ | Ext | EXT.Hdr | IP | ... ----------------------------------------- <--------- GTP header -------><- Pkt ---> So it's leaving _half_ the header and making it a requirement that there be further intelligence down the line that can handle this. This is far from intuitive. /Jonas
On Mon, Feb 1, 2021 at 9:24 PM Jonas Bonn <jonas@norrbonn.se> wrote: > > Hi Jakub, > > On 01/02/2021 21:44, Jakub Kicinski wrote: > > On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote: > >> On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote: > >>> On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote: > >>>> On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote: > >>>> Following are the reasons for extracting the header and populating metadata. > >>>> 1. That is the design used by other tunneling protocols > >>>> implementations for handling optional headers. We need to have a > >>>> consistent model across all tunnel devices for upper layers. > >>> > >>> Could you clarify with some examples? This does not match intuition, > >>> I must be missing something. > >> > >> You can look at geneve_rx() or vxlan_rcv() that extracts optional > >> headers in ip_tunnel_info opts. > > > > Okay, I got confused what Jonas was inquiring about. I thought that the > > extension headers were not pulled, rather than not parsed. Copying them > > as-is to info->opts is right, thanks! > > > > No, you're not confused. The extension headers are not being pulled in > the current patchset. > > Incoming packet: > > --------------------------------------------------------------------- > | flags | type | len | TEID | N-PDU | SEQ | Ext | EXT.Hdr | IP | ... > --------------------------------------------------------------------- > <--------- GTP header ------<<Optional GTP elements>>-----><- Pkt ---> > > The "collect metadata" path of the patchset copies 'flags' and 'type' to > info->opts, but leaves the following: > > ----------------------------------------- > | N-PDU | SEQ | Ext | EXT.Hdr | IP | ... > ----------------------------------------- > <--------- GTP header -------><- Pkt ---> > > So it's leaving _half_ the header and making it a requirement that there > be further intelligence down the line that can handle this. This is far > from intuitive. > The patch supports Echo, Echo response and End marker packet. Issue with pulling the entire extension header is that it would result in zero length skb, such packets can not be passed on to the upper layer. That is the reason I kept the extension header in skb and added indication in tunnel metadata that it is not a IP packet. so that upper layer can process the packet. IP packet without an extension header would be handled in a fast path without any special handling. Obviously In case of PDU session container extension header GTP driver would need to process the entire extension header in the module. This way we can handle these user data packets in fastpath. I can make changes to use the same method for all extension headers if needed.
On 02/02/2021 07:56, Pravin Shelar wrote: > On Mon, Feb 1, 2021 at 9:24 PM Jonas Bonn <jonas@norrbonn.se> wrote: >> >> Hi Jakub, >> >> On 01/02/2021 21:44, Jakub Kicinski wrote: >>> On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote: >>>> On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote: >>>>> On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote: >>>>>> On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote: >>>>>> Following are the reasons for extracting the header and populating metadata. >>>>>> 1. That is the design used by other tunneling protocols >>>>>> implementations for handling optional headers. We need to have a >>>>>> consistent model across all tunnel devices for upper layers. >>>>> >>>>> Could you clarify with some examples? This does not match intuition, >>>>> I must be missing something. >>>> >>>> You can look at geneve_rx() or vxlan_rcv() that extracts optional >>>> headers in ip_tunnel_info opts. >>> >>> Okay, I got confused what Jonas was inquiring about. I thought that the >>> extension headers were not pulled, rather than not parsed. Copying them >>> as-is to info->opts is right, thanks! >>> >> >> No, you're not confused. The extension headers are not being pulled in >> the current patchset. >> >> Incoming packet: >> >> --------------------------------------------------------------------- >> | flags | type | len | TEID | N-PDU | SEQ | Ext | EXT.Hdr | IP | ... >> --------------------------------------------------------------------- >> <--------- GTP header ------<<Optional GTP elements>>-----><- Pkt ---> >> >> The "collect metadata" path of the patchset copies 'flags' and 'type' to >> info->opts, but leaves the following: >> >> ----------------------------------------- >> | N-PDU | SEQ | Ext | EXT.Hdr | IP | ... >> ----------------------------------------- >> <--------- GTP header -------><- Pkt ---> >> >> So it's leaving _half_ the header and making it a requirement that there >> be further intelligence down the line that can handle this. This is far >> from intuitive. >> > > The patch supports Echo, Echo response and End marker packet. > Issue with pulling the entire extension header is that it would result > in zero length skb, such packets can not be passed on to the upper > layer. That is the reason I kept the extension header in skb and added > indication in tunnel metadata that it is not a IP packet. so that > upper layer can process the packet. > IP packet without an extension header would be handled in a fast path > without any special handling. > > Obviously In case of PDU session container extension header GTP driver > would need to process the entire extension header in the module. This > way we can handle these user data packets in fastpath. > I can make changes to use the same method for all extension headers if needed. > The most disturbing bit is the fact that the upper layer needs to understand that part of the header info is in info->opts whereas the remainder is on the SKB itself. If it is going to access the SKB anyway, why not just leave the entire GTP header in place and let the upper layer just get all the information from there? What's the advantage of info->opts in this case? Normally, the gtp module extracts T-PDU's from the GTP packet and passes them on (after validating their IP address) to the network stack. For _everything else_, it just passes them along the socket for handling elsewhere. It sounds like you are trying to do exactly the same thing: extract T-PDU and inject into network stack for T-PDU's, and pass everything else to another handler. So what is different in your case from the normal case? - there's metadata on the packet... can't we detect this and set the tunnel ID from the TEID in that case? Or can't we just always have metadata on the packet? - the upper layer handler is in kernel space instead of userspace; but they are doing pretty much the same thing, right? why does the kernel space variant need something (info->opts) that userspace can get by without? It would be seriously good to see a _real_ example of how you intend to use this. Isn't the PDP context mechanism already sufficient to do all of the above? What's missing? ip route 192.168.99.0/24 encap gtp id 100 dst 172.99.0.2 dev gtp1 is roughly equivalent to: gtp-tunnel add gtp1 v1 [LOCAL_TEID] 100 172.99.0.2 [UE_IP] /Jonas
On Tue, Feb 2, 2021 at 12:03 AM Jonas Bonn <jonas@norrbonn.se> wrote: > > > > On 02/02/2021 07:56, Pravin Shelar wrote: > > On Mon, Feb 1, 2021 at 9:24 PM Jonas Bonn <jonas@norrbonn.se> wrote: > >> > >> Hi Jakub, > >> > >> On 01/02/2021 21:44, Jakub Kicinski wrote: > >>> On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote: > >>>> On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote: > >>>>> On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote: > >>>>>> On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote: > >>>>>> Following are the reasons for extracting the header and populating metadata. > >>>>>> 1. That is the design used by other tunneling protocols > >>>>>> implementations for handling optional headers. We need to have a > >>>>>> consistent model across all tunnel devices for upper layers. > >>>>> > >>>>> Could you clarify with some examples? This does not match intuition, > >>>>> I must be missing something. > >>>> > >>>> You can look at geneve_rx() or vxlan_rcv() that extracts optional > >>>> headers in ip_tunnel_info opts. > >>> > >>> Okay, I got confused what Jonas was inquiring about. I thought that the > >>> extension headers were not pulled, rather than not parsed. Copying them > >>> as-is to info->opts is right, thanks! > >>> > >> > >> No, you're not confused. The extension headers are not being pulled in > >> the current patchset. > >> > >> Incoming packet: > >> > >> --------------------------------------------------------------------- > >> | flags | type | len | TEID | N-PDU | SEQ | Ext | EXT.Hdr | IP | ... > >> --------------------------------------------------------------------- > >> <--------- GTP header ------<<Optional GTP elements>>-----><- Pkt ---> > >> > >> The "collect metadata" path of the patchset copies 'flags' and 'type' to > >> info->opts, but leaves the following: > >> > >> ----------------------------------------- > >> | N-PDU | SEQ | Ext | EXT.Hdr | IP | ... > >> ----------------------------------------- > >> <--------- GTP header -------><- Pkt ---> > >> > >> So it's leaving _half_ the header and making it a requirement that there > >> be further intelligence down the line that can handle this. This is far > >> from intuitive. > >> > > > > The patch supports Echo, Echo response and End marker packet. > > Issue with pulling the entire extension header is that it would result > > in zero length skb, such packets can not be passed on to the upper > > layer. That is the reason I kept the extension header in skb and added > > indication in tunnel metadata that it is not a IP packet. so that > > upper layer can process the packet. > > IP packet without an extension header would be handled in a fast path > > without any special handling. > > > > Obviously In case of PDU session container extension header GTP driver > > would need to process the entire extension header in the module. This > > way we can handle these user data packets in fastpath. > > I can make changes to use the same method for all extension headers if needed. > > > > The most disturbing bit is the fact that the upper layer needs to > understand that part of the header info is in info->opts whereas the > remainder is on the SKB itself. If it is going to access the SKB > anyway, why not just leave the entire GTP header in place and let the > upper layer just get all the information from there? What's the > advantage of info->opts in this case? > > Normally, the gtp module extracts T-PDU's from the GTP packet and passes > them on (after validating their IP address) to the network stack. For > _everything else_, it just passes them along the socket for handling > elsewhere. > > It sounds like you are trying to do exactly the same thing: extract > T-PDU and inject into network stack for T-PDU's, and pass everything > else to another handler. > > So what is different in your case from the normal case? > - there's metadata on the packet... can't we detect this and set the > tunnel ID from the TEID in that case? Or can't we just always have > metadata on the packet? > - the upper layer handler is in kernel space instead of userspace; but > they are doing pretty much the same thing, right? why does the kernel > space variant need something (info->opts) that userspace can get by without? > This is how OVS/eBPF abstracts tunneling implementation. Tunnel context is represented in tunnel metadata. I am trying to integrate GTP tunnel with LTW framework. This way we can make use of GTP tunnel as any other LWT device. I am fine with the GTP extension header handling that passes GTP packet as is in case extension header. Can you make those changes in this series and post Non RFC Patch Set.
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 668ed8a4836e..bbce2671de2d 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -683,6 +683,38 @@ static void gtp_push_header(struct sk_buff *skb, struct pdp_ctx *pctx, } } +static inline int gtp1_push_control_header(struct sk_buff *skb, + struct pdp_ctx *pctx, + struct gtpu_metadata *opts, + struct net_device *dev) +{ + struct gtp1_header *gtp1c; + int payload_len; + + if (opts->ver != GTP_METADATA_V1) + return -ENOENT; + + if (opts->type == 0xFE) { + /* for end marker ignore skb data. */ + netdev_dbg(dev, "xmit pkt with null data"); + pskb_trim(skb, 0); + } + if (skb_cow_head(skb, sizeof(*gtp1c)) < 0) + return -ENOMEM; + + payload_len = skb->len; + + gtp1c = skb_push(skb, sizeof(*gtp1c)); + + gtp1c->flags = opts->flags; + gtp1c->type = opts->type; + gtp1c->length = htons(payload_len); + gtp1c->tid = htonl(pctx->u.v1.o_tei); + netdev_dbg(dev, "xmit control pkt: ver %d flags %x type %x pkt len %d tid %x", + opts->ver, opts->flags, opts->type, skb->len, pctx->u.v1.o_tei); + return 0; +} + static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev) { struct gtp_dev *gtp = netdev_priv(dev); @@ -807,7 +839,16 @@ static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev) skb_set_inner_protocol(skb, skb->protocol); - gtp_push_header(skb, pctx, &port); + if (unlikely(opts)) { + port = htons(GTP1U_PORT); + r = gtp1_push_control_header(skb, pctx, opts, dev); + if (r) { + netdev_info(dev, "cntr pkt error %d", r); + goto err_rt; + } + } else { + gtp_push_header(skb, pctx, &port); + } iph = ip_hdr(skb); netdev_dbg(dev, "gtp -> IP src: %pI4 dst: %pI4\n",