diff mbox series

[RFC,15/16] gtp: add ability to send GTP controls headers

Message ID 20210123195916.2765481-16-jonas@norrbonn.se (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Headers show
Series GTP: flow based | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count fail Series longer than 15 patches
netdev/tree_selection success Guessed tree name to be net-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 2 maintainers not CCed: davem@davemloft.net osmocom-net-gprs@lists.osmocom.org
netdev/source_inline fail Was 0 now: 1
netdev/verify_signedoff fail Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 1 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch warning WARNING: line length of 83 exceeds 80 columns
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 0
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Jonas Bonn Jan. 23, 2021, 7:59 p.m. UTC
From: Pravin B Shelar <pbshelar@fb.com>

Please explain how this patch actually works... creation of the control
header makes sense, but I don't understand how sending of a
control header is actually triggered.

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
---
 drivers/net/gtp.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

Comments

Jonas Bonn Jan. 24, 2021, 2:21 p.m. UTC | #1
Hi Pravin,

So, this whole GTP metadata thing has me a bit confused.

You've defined a metadata structure like this:

struct gtpu_metadata {
         __u8    ver;
         __u8    flags;
         __u8    type;
};

Here ver is the version of the metadata structure itself, which is fine.
'flags' corresponds to the 3 flag bits of GTP header's first byte:  E, 
S, and PN.
'type' corresponds to the 'message type' field of the GTP header.

The 'control header' (strange name) example below allows the flags to be 
set; however, setting these flags alone is insufficient because each one 
indicates the presence of additional fields in the header and there's 
nothing in the code to account for that.

If E is set, extension headers would need to be added.
If S is set, a sequence number field would need to be added.
If PN is set, a PDU-number header would need to be added.

It's not clear to me who sets up this metadata in the first place.  Is 
that where OVS or eBPF come in?  Can you give some pointers to how this 
works?

Couple of comments below....

On 23/01/2021 20:59, Jonas Bonn wrote:
> From: Pravin B Shelar <pbshelar@fb.com>
> 
> Please explain how this patch actually works... creation of the control
> header makes sense, but I don't understand how sending of a
> control header is actually triggered.
> 
> Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
> ---
>   drivers/net/gtp.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
> index 668ed8a4836e..bbce2671de2d 100644
> --- a/drivers/net/gtp.c
> +++ b/drivers/net/gtp.c
> @@ -683,6 +683,38 @@ static void gtp_push_header(struct sk_buff *skb, struct pdp_ctx *pctx,
>   	}
>   }
>   
> +static inline int gtp1_push_control_header(struct sk_buff *skb,

I'm not enamored with the name 'control header' because it makes sound 
like this is some GTP-C thing.  The GTP module is really only about 
GTP-U and the function itself just sets up a GTP-U header.


> +					   struct pdp_ctx *pctx,
> +					   struct gtpu_metadata *opts,
> +					   struct net_device *dev)
> +{
> +	struct gtp1_header *gtp1c;
> +	int payload_len;
> +
> +	if (opts->ver != GTP_METADATA_V1)
> +		return -ENOENT;
> +
> +	if (opts->type == 0xFE) {
> +		/* for end marker ignore skb data. */
> +		netdev_dbg(dev, "xmit pkt with null data");
> +		pskb_trim(skb, 0);
> +	}
> +	if (skb_cow_head(skb, sizeof(*gtp1c)) < 0)
> +		return -ENOMEM;
> +
> +	payload_len = skb->len;
> +
> +	gtp1c = skb_push(skb, sizeof(*gtp1c));
> +
> +	gtp1c->flags	= opts->flags;
> +	gtp1c->type	= opts->type;
> +	gtp1c->length	= htons(payload_len);
> +	gtp1c->tid	= htonl(pctx->u.v1.o_tei);
> +	netdev_dbg(dev, "xmit control pkt: ver %d flags %x type %x pkt len %d tid %x",
> +		   opts->ver, opts->flags, opts->type, skb->len, pctx->u.v1.o_tei);
> +	return 0;
> +}

There's nothing really special about that above function aside from the 
facts that it takes 'opts' as an argument.  Can't we just merge this 
with the regular 'gtp_push_header' function?  Do you have plans for this 
beyond what's here that would complicated by doing so?

/Jonas


> +
>   static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev)
>   {
>   	struct gtp_dev *gtp = netdev_priv(dev);
> @@ -807,7 +839,16 @@ static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev)
>   
>   	skb_set_inner_protocol(skb, skb->protocol);
>   
> -	gtp_push_header(skb, pctx, &port);
> +	if (unlikely(opts)) {
> +		port = htons(GTP1U_PORT);
> +		r = gtp1_push_control_header(skb, pctx, opts, dev);
> +		if (r) {
> +			netdev_info(dev, "cntr pkt error %d", r);
> +			goto err_rt;
> +		}
> +	} else {
> +		gtp_push_header(skb, pctx, &port);
> +	}
>   
>   	iph = ip_hdr(skb);
>   	netdev_dbg(dev, "gtp -> IP src: %pI4 dst: %pI4\n",
>
Harald Welte Jan. 25, 2021, 5:41 p.m. UTC | #2
Hi Jonas,

thanks for your detailed analysis and review of the changes.  To me, they
once again show that the original patch was merged too quickly, without
a detailed review by people with strong GTP background.

On Sun, Jan 24, 2021 at 03:21:21PM +0100, Jonas Bonn wrote:
> struct gtpu_metadata {
>         __u8    ver;
>         __u8    flags;
>         __u8    type;
> };
> 
> Here ver is the version of the metadata structure itself, which is fine.
> 'flags' corresponds to the 3 flag bits of GTP header's first byte:  E, S,
> and PN.
> 'type' corresponds to the 'message type' field of the GTP header.

One more comment on the 'type': Of how much use is it?  After all, the
GTP-U kernel driver only handles a single message type at all (G-PDU /
255 - the only message type that encapsulates user IP data), while all
other message types are always processed in userland via the UDP socket.

Side-note: 3GPP TS 29.060 lists 5 other message types that can happen in
GTP-U:
* Echo Request
* Echo Response
* Error Indication
* Supported Extension Headers Notification
* End Marker

It would be interesting to understand how the new flow-based tunnel would
treat those, if those 

> The 'control header' (strange name) example below allows the flags to be
> set; however, setting these flags alone is insufficient because each one
> indicates the presence of additional fields in the header and there's
> nothing in the code to account for that.

Full ACK from my side here.  Setting arbitrary bits in the GTP flags without
then actually encoding the required additional bits that those flags require
will produce broken packets.  IMHO, the GTP driver should never do that.
Pravin Shelar Jan. 28, 2021, 9:29 p.m. UTC | #3
On Sun, Jan 24, 2021 at 6:22 AM Jonas Bonn <jonas@norrbonn.se> wrote:
>
> Hi Pravin,
>
> So, this whole GTP metadata thing has me a bit confused.
>
> You've defined a metadata structure like this:
>
> struct gtpu_metadata {
>          __u8    ver;
>          __u8    flags;
>          __u8    type;
> };
>
> Here ver is the version of the metadata structure itself, which is fine.
> 'flags' corresponds to the 3 flag bits of GTP header's first byte:  E,
> S, and PN.
> 'type' corresponds to the 'message type' field of the GTP header.
>
> The 'control header' (strange name) example below allows the flags to be
> set; however, setting these flags alone is insufficient because each one
> indicates the presence of additional fields in the header and there's
> nothing in the code to account for that.
>
> If E is set, extension headers would need to be added.
> If S is set, a sequence number field would need to be added.
> If PN is set, a PDU-number header would need to be added.
>
> It's not clear to me who sets up this metadata in the first place.  Is
> that where OVS or eBPF come in?  Can you give some pointers to how this
> works?
>

Receive path: LWT extracts tunnel metadata into tunnel-metadata
struct. This object has 5-tuple info from outer header and tunnel key.
When there is presence of extension header there is no way to store
the info standard tunnel-metadata object. That is when the optional
section of tunnel-metadata comes in the play.
As you can see the packet data from GTP header onwards is still pushed
to the device, so consumers of LWT can look at tunnel-metadata and
make sense of the inner packet that is received on the device.
OVS does exactly the same. When it receives a GTP packet with optional
metadata, it looks at flags and parses the inner packet and extension
header accordingly.

xmit path: it is set by LWT tunnel user, OVS or eBPF code. it needs to
set the metadata in tunnel dst along with the 5-tuple data and tunel
ID. This is how it can steer the packet at the right tunnel using a
single tunnel net device.

> Couple of comments below....
>
> On 23/01/2021 20:59, Jonas Bonn wrote:
> > From: Pravin B Shelar <pbshelar@fb.com>
> >
> > Please explain how this patch actually works... creation of the control
> > header makes sense, but I don't understand how sending of a
> > control header is actually triggered.
> >
> > Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
> > ---
> >   drivers/net/gtp.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
> >   1 file changed, 42 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
> > index 668ed8a4836e..bbce2671de2d 100644
> > --- a/drivers/net/gtp.c
> > +++ b/drivers/net/gtp.c
> > @@ -683,6 +683,38 @@ static void gtp_push_header(struct sk_buff *skb, struct pdp_ctx *pctx,
> >       }
> >   }
> >
> > +static inline int gtp1_push_control_header(struct sk_buff *skb,
>
> I'm not enamored with the name 'control header' because it makes sound
> like this is some GTP-C thing.  The GTP module is really only about
> GTP-U and the function itself just sets up a GTP-U header.
>
sure. lets call ext_hdr.

>
> > +                                        struct pdp_ctx *pctx,
> > +                                        struct gtpu_metadata *opts,
> > +                                        struct net_device *dev)
> > +{
> > +     struct gtp1_header *gtp1c;
> > +     int payload_len;
> > +
> > +     if (opts->ver != GTP_METADATA_V1)
> > +             return -ENOENT;
> > +
> > +     if (opts->type == 0xFE) {
> > +             /* for end marker ignore skb data. */
> > +             netdev_dbg(dev, "xmit pkt with null data");
> > +             pskb_trim(skb, 0);
> > +     }
> > +     if (skb_cow_head(skb, sizeof(*gtp1c)) < 0)
> > +             return -ENOMEM;
> > +
> > +     payload_len = skb->len;
> > +
> > +     gtp1c = skb_push(skb, sizeof(*gtp1c));
> > +
> > +     gtp1c->flags    = opts->flags;
> > +     gtp1c->type     = opts->type;
> > +     gtp1c->length   = htons(payload_len);
> > +     gtp1c->tid      = htonl(pctx->u.v1.o_tei);
> > +     netdev_dbg(dev, "xmit control pkt: ver %d flags %x type %x pkt len %d tid %x",
> > +                opts->ver, opts->flags, opts->type, skb->len, pctx->u.v1.o_tei);
> > +     return 0;
> > +}
>
> There's nothing really special about that above function aside from the
> facts that it takes 'opts' as an argument.  Can't we just merge this
> with the regular 'gtp_push_header' function?  Do you have plans for this
> beyond what's here that would complicated by doing so?
>

Yes, we already have usecase for handling GTP PDU session container
related extension header for 5G UPF endpoitnt.



> /Jonas
>
>
> > +
> >   static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev)
> >   {
> >       struct gtp_dev *gtp = netdev_priv(dev);
> > @@ -807,7 +839,16 @@ static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev)
> >
> >       skb_set_inner_protocol(skb, skb->protocol);
> >
> > -     gtp_push_header(skb, pctx, &port);
> > +     if (unlikely(opts)) {
> > +             port = htons(GTP1U_PORT);
> > +             r = gtp1_push_control_header(skb, pctx, opts, dev);
> > +             if (r) {
> > +                     netdev_info(dev, "cntr pkt error %d", r);
> > +                     goto err_rt;
> > +             }
> > +     } else {
> > +             gtp_push_header(skb, pctx, &port);
> > +     }
> >
> >       iph = ip_hdr(skb);
> >       netdev_dbg(dev, "gtp -> IP src: %pI4 dst: %pI4\n",
> >
Pravin Shelar Jan. 28, 2021, 9:29 p.m. UTC | #4
On Mon, Jan 25, 2021 at 9:52 AM Harald Welte <laforge@gnumonks.org> wrote:
>
> Hi Jonas,
>

> On Sun, Jan 24, 2021 at 03:21:21PM +0100, Jonas Bonn wrote:
> > struct gtpu_metadata {
> >         __u8    ver;
> >         __u8    flags;
> >         __u8    type;
> > };
> >
> > Here ver is the version of the metadata structure itself, which is fine.
> > 'flags' corresponds to the 3 flag bits of GTP header's first byte:  E, S,
> > and PN.
> > 'type' corresponds to the 'message type' field of the GTP header.
>
> One more comment on the 'type': Of how much use is it?  After all, the
> GTP-U kernel driver only handles a single message type at all (G-PDU /
> 255 - the only message type that encapsulates user IP data), while all
> other message types are always processed in userland via the UDP socket.
>
There is NO userland UDP socket for the LWT tunnel. All UDP traffic is
terminated in kernel space. userland only sets rules over LTW tunnel
device to handle traffic. This is how OVS/eBPF handles other UDP based
LWT tunnel devices.


> Side-note: 3GPP TS 29.060 lists 5 other message types that can happen in
> GTP-U:
> * Echo Request
> * Echo Response
> * Error Indication
> * Supported Extension Headers Notification
> * End Marker
>
> It would be interesting to understand how the new flow-based tunnel would
> treat those, if those
>
Current code does handle Echo Request, Response and End marker.
Pravin Shelar Jan. 30, 2021, 6:59 a.m. UTC | #5
On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote:
>
> Hi Pravin,
>
> On 28/01/2021 22:29, Pravin Shelar wrote:
> > On Sun, Jan 24, 2021 at 6:22 AM Jonas Bonn <jonas@norrbonn.se> wrote:
> >>
> >> Hi Pravin,
> >>
> >> So, this whole GTP metadata thing has me a bit confused.
> >>
> >> You've defined a metadata structure like this:
> >>
> >> struct gtpu_metadata {
> >>           __u8    ver;
> >>           __u8    flags;
> >>           __u8    type;
> >> };
> >>
> >> Here ver is the version of the metadata structure itself, which is fine.
> >> 'flags' corresponds to the 3 flag bits of GTP header's first byte:  E,
> >> S, and PN.
> >> 'type' corresponds to the 'message type' field of the GTP header.
> >>
> >> The 'control header' (strange name) example below allows the flags to be
> >> set; however, setting these flags alone is insufficient because each one
> >> indicates the presence of additional fields in the header and there's
> >> nothing in the code to account for that.
> >>
> >> If E is set, extension headers would need to be added.
> >> If S is set, a sequence number field would need to be added.
> >> If PN is set, a PDU-number header would need to be added.
> >>
> >> It's not clear to me who sets up this metadata in the first place.  Is
> >> that where OVS or eBPF come in?  Can you give some pointers to how this
> >> works?
> >>
> >
> > Receive path: LWT extracts tunnel metadata into tunnel-metadata
> > struct. This object has 5-tuple info from outer header and tunnel key.
> > When there is presence of extension header there is no way to store
> > the info standard tunnel-metadata object. That is when the optional
> > section of tunnel-metadata comes in the play.
> > As you can see the packet data from GTP header onwards is still pushed
> > to the device, so consumers of LWT can look at tunnel-metadata and
> > make sense of the inner packet that is received on the device.
> > OVS does exactly the same. When it receives a GTP packet with optional
> > metadata, it looks at flags and parses the inner packet and extension
> > header accordingly.
>
> Ah, ok, I see.  So you are pulling _half_ of the GTP header off the
> packet but leaving the optional GTP extension headers in place if they
> exist.  So what OVS receives is a packet with metadata indicating
> whether or not it begins with these extension headers or whether it
> begins with an IP header.
>
> So OVS might need to begin by pulling parts of the packet in order to
> get to the inner IP packet.  In that case, why don't you just leave the
> _entire_ GTP header in place and let OVS work from that?  The header
> contains exactly the data you've copied to the metadata struct PLUS it
> has the incoming TEID value that you really should be validating inner
> IP against.
>

Following are the reasons for extracting the header and populating metadata.
1. That is the design used by other tunneling protocols
implementations for handling optional headers. We need to have a
consistent model across all tunnel devices for upper layers.
2. GTP module is parsing the UDP and GTP header. It would be wasteful
to repeat the same process in upper layers.
3. TIED is part of tunnel metadata, it is already used to validating
inner packets. But TIED is not alone to handle packets with extended
header.

I am fine with processing the entire header in GTP but in case of 'end
marker' there is no data left after pulling entire GTP header. Thats
why I took this path.

> Also, what happens if this is used WITHOUT OVS/eBPF... just a route with
> 'encap' set.  In that case, nothing will be there to pull the extension
> headers off the packet.
>

One way would be, the user can use encap to steer "special" packets to
a netdev that can handle such packets. it would be a userspace process
or eBPF program.

> >
> > xmit path: it is set by LWT tunnel user, OVS or eBPF code. it needs to
> > set the metadata in tunnel dst along with the 5-tuple data and tunel
> > ID. This is how it can steer the packet at the right tunnel using a
> > single tunnel net device.
>
> Right, that part is fine.  However, for setting extension headers you'd
> need to set the flags in the metadata and the extensions themselves
> prepended to the IP packet meaning your SKB contains a pseudo-GTP packet
> before the GTP module can finish adding the header.  I don't know why
> you wouldn't just add the entire GTP header in one place and be done
> with it... going via the GTP module gets you almost nothing at this point.
>
The UDP socket is owned by GTP module, so there is no other way to
inject a packet in the tunnel than sending it over GTP module.
plus this would looks this will break the standard model of
abstracting tunnel implementation in ovs or bridge code. We will need
special code for GTP packets to parse extra outer headers when OVS
extracts the flow from the inner packet.
I am open to handling  the optional headers completely GTP module.

>
> >
> >> Couple of comments below....
> >>
> >> On 23/01/2021 20:59, Jonas Bonn wrote:
> >>> From: Pravin B Shelar <pbshelar@fb.com>
> >>>
> >>> Please explain how this patch actually works... creation of the control
> >>> header makes sense, but I don't understand how sending of a
> >>> control header is actually triggered.
> >>>
> >>> Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
> >>> ---
> >>>    drivers/net/gtp.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
> >>>    1 file changed, 42 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
> >>> index 668ed8a4836e..bbce2671de2d 100644
> >>> --- a/drivers/net/gtp.c
> >>> +++ b/drivers/net/gtp.c
> >>> @@ -683,6 +683,38 @@ static void gtp_push_header(struct sk_buff *skb, struct pdp_ctx *pctx,
> >>>        }
> >>>    }
> >>>
> >>> +static inline int gtp1_push_control_header(struct sk_buff *skb,
> >>
> >> I'm not enamored with the name 'control header' because it makes sound
> >> like this is some GTP-C thing.  The GTP module is really only about
> >> GTP-U and the function itself just sets up a GTP-U header.
> >>
> > sure. lets call ext_hdr.
> >
> >>
> >>> +                                        struct pdp_ctx *pctx,
> >>> +                                        struct gtpu_metadata *opts,
> >>> +                                        struct net_device *dev)
> >>> +{
> >>> +     struct gtp1_header *gtp1c;
> >>> +     int payload_len;
> >>> +
> >>> +     if (opts->ver != GTP_METADATA_V1)
> >>> +             return -ENOENT;
> >>> +
> >>> +     if (opts->type == 0xFE) {
> >>> +             /* for end marker ignore skb data. */
> >>> +             netdev_dbg(dev, "xmit pkt with null data");
> >>> +             pskb_trim(skb, 0);
> >>> +     }
> >>> +     if (skb_cow_head(skb, sizeof(*gtp1c)) < 0)
> >>> +             return -ENOMEM;
> >>> +
> >>> +     payload_len = skb->len;
> >>> +
> >>> +     gtp1c = skb_push(skb, sizeof(*gtp1c));
> >>> +
> >>> +     gtp1c->flags    = opts->flags;
> >>> +     gtp1c->type     = opts->type;
> >>> +     gtp1c->length   = htons(payload_len);
> >>> +     gtp1c->tid      = htonl(pctx->u.v1.o_tei);
> >>> +     netdev_dbg(dev, "xmit control pkt: ver %d flags %x type %x pkt len %d tid %x",
> >>> +                opts->ver, opts->flags, opts->type, skb->len, pctx->u.v1.o_tei);
> >>> +     return 0;
> >>> +}
> >>
> >> There's nothing really special about that above function aside from the
> >> facts that it takes 'opts' as an argument.  Can't we just merge this
> >> with the regular 'gtp_push_header' function?  Do you have plans for this
> >> beyond what's here that would complicated by doing so?
> >>
> >
> > Yes, we already have usecase for handling GTP PDU session container
> > related extension header for 5G UPF endpoitnt.
>
> I'll venture a guess that you're referring to QoS indications.  I, too,
> have been wondering whether or not these can be made to be compatible
> with the GTP module... I'm not sure, at this point.
>
Right, It's QoS related headers.
We can use the same optional metadata field to process this header. we
need to extend the option data structure to pass PDU session container
header values.
Jakub Kicinski Jan. 30, 2021, 6:44 p.m. UTC | #6
On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote:
> On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote:
> > On 28/01/2021 22:29, Pravin Shelar wrote:  
> > > Receive path: LWT extracts tunnel metadata into tunnel-metadata
> > > struct. This object has 5-tuple info from outer header and tunnel key.
> > > When there is presence of extension header there is no way to store
> > > the info standard tunnel-metadata object. That is when the optional
> > > section of tunnel-metadata comes in the play.
> > > As you can see the packet data from GTP header onwards is still pushed
> > > to the device, so consumers of LWT can look at tunnel-metadata and
> > > make sense of the inner packet that is received on the device.
> > > OVS does exactly the same. When it receives a GTP packet with optional
> > > metadata, it looks at flags and parses the inner packet and extension
> > > header accordingly.  
> >
> > Ah, ok, I see.  So you are pulling _half_ of the GTP header off the
> > packet but leaving the optional GTP extension headers in place if they
> > exist.  So what OVS receives is a packet with metadata indicating
> > whether or not it begins with these extension headers or whether it
> > begins with an IP header.
> >
> > So OVS might need to begin by pulling parts of the packet in order to
> > get to the inner IP packet.  In that case, why don't you just leave the
> > _entire_ GTP header in place and let OVS work from that?  The header
> > contains exactly the data you've copied to the metadata struct PLUS it
> > has the incoming TEID value that you really should be validating inner
> > IP against.
> >  
> 
> Following are the reasons for extracting the header and populating metadata.
> 1. That is the design used by other tunneling protocols
> implementations for handling optional headers. We need to have a
> consistent model across all tunnel devices for upper layers.

Could you clarify with some examples? This does not match intuition, 
I must be missing something.

> 2. GTP module is parsing the UDP and GTP header. It would be wasteful
> to repeat the same process in upper layers.
> 3. TIED is part of tunnel metadata, it is already used to validating
> inner packets. But TIED is not alone to handle packets with extended
> header.
> 
> I am fine with processing the entire header in GTP but in case of 'end
> marker' there is no data left after pulling entire GTP header. Thats
> why I took this path.
Pravin Shelar Jan. 30, 2021, 8:05 p.m. UTC | #7
On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote:
> > On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote:
> > > On 28/01/2021 22:29, Pravin Shelar wrote:
> > > > Receive path: LWT extracts tunnel metadata into tunnel-metadata
> > > > struct. This object has 5-tuple info from outer header and tunnel key.
> > > > When there is presence of extension header there is no way to store
> > > > the info standard tunnel-metadata object. That is when the optional
> > > > section of tunnel-metadata comes in the play.
> > > > As you can see the packet data from GTP header onwards is still pushed
> > > > to the device, so consumers of LWT can look at tunnel-metadata and
> > > > make sense of the inner packet that is received on the device.
> > > > OVS does exactly the same. When it receives a GTP packet with optional
> > > > metadata, it looks at flags and parses the inner packet and extension
> > > > header accordingly.
> > >
> > > Ah, ok, I see.  So you are pulling _half_ of the GTP header off the
> > > packet but leaving the optional GTP extension headers in place if they
> > > exist.  So what OVS receives is a packet with metadata indicating
> > > whether or not it begins with these extension headers or whether it
> > > begins with an IP header.
> > >
> > > So OVS might need to begin by pulling parts of the packet in order to
> > > get to the inner IP packet.  In that case, why don't you just leave the
> > > _entire_ GTP header in place and let OVS work from that?  The header
> > > contains exactly the data you've copied to the metadata struct PLUS it
> > > has the incoming TEID value that you really should be validating inner
> > > IP against.
> > >
> >
> > Following are the reasons for extracting the header and populating metadata.
> > 1. That is the design used by other tunneling protocols
> > implementations for handling optional headers. We need to have a
> > consistent model across all tunnel devices for upper layers.
>
> Could you clarify with some examples? This does not match intuition,
> I must be missing something.
>

You can look at geneve_rx() or vxlan_rcv() that extracts optional
headers in ip_tunnel_info opts.
Jakub Kicinski Feb. 1, 2021, 8:44 p.m. UTC | #8
On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote:
> On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote:  
> > > On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote:  
> > > Following are the reasons for extracting the header and populating metadata.
> > > 1. That is the design used by other tunneling protocols
> > > implementations for handling optional headers. We need to have a
> > > consistent model across all tunnel devices for upper layers.  
> >
> > Could you clarify with some examples? This does not match intuition,
> > I must be missing something.
> 
> You can look at geneve_rx() or vxlan_rcv() that extracts optional
> headers in ip_tunnel_info opts.

Okay, I got confused what Jonas was inquiring about. I thought that the
extension headers were not pulled, rather than not parsed. Copying them
as-is to info->opts is right, thanks!
Jonas Bonn Feb. 2, 2021, 5:24 a.m. UTC | #9
Hi Jakub,

On 01/02/2021 21:44, Jakub Kicinski wrote:
> On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote:
>> On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote:
>>> On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote:
>>>> On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote:
>>>> Following are the reasons for extracting the header and populating metadata.
>>>> 1. That is the design used by other tunneling protocols
>>>> implementations for handling optional headers. We need to have a
>>>> consistent model across all tunnel devices for upper layers.
>>>
>>> Could you clarify with some examples? This does not match intuition,
>>> I must be missing something.
>>
>> You can look at geneve_rx() or vxlan_rcv() that extracts optional
>> headers in ip_tunnel_info opts.
> 
> Okay, I got confused what Jonas was inquiring about. I thought that the
> extension headers were not pulled, rather than not parsed. Copying them
> as-is to info->opts is right, thanks!
> 

No, you're not confused.  The extension headers are not being pulled in 
the current patchset.

Incoming packet:

---------------------------------------------------------------------
| flags | type | len | TEID | N-PDU | SEQ | Ext | EXT.Hdr | IP | ...
---------------------------------------------------------------------
<--------- GTP header ------<<Optional GTP elements>>-----><- Pkt --->

The "collect metadata" path of the patchset copies 'flags' and 'type' to 
info->opts, but leaves the following:

-----------------------------------------
| N-PDU | SEQ | Ext | EXT.Hdr | IP | ...
-----------------------------------------
<--------- GTP header -------><- Pkt --->

So it's leaving _half_ the header and making it a requirement that there 
be further intelligence down the line that can handle this.  This is far 
from intuitive.

/Jonas
Pravin Shelar Feb. 2, 2021, 6:56 a.m. UTC | #10
On Mon, Feb 1, 2021 at 9:24 PM Jonas Bonn <jonas@norrbonn.se> wrote:
>
> Hi Jakub,
>
> On 01/02/2021 21:44, Jakub Kicinski wrote:
> > On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote:
> >> On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >>> On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote:
> >>>> On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote:
> >>>> Following are the reasons for extracting the header and populating metadata.
> >>>> 1. That is the design used by other tunneling protocols
> >>>> implementations for handling optional headers. We need to have a
> >>>> consistent model across all tunnel devices for upper layers.
> >>>
> >>> Could you clarify with some examples? This does not match intuition,
> >>> I must be missing something.
> >>
> >> You can look at geneve_rx() or vxlan_rcv() that extracts optional
> >> headers in ip_tunnel_info opts.
> >
> > Okay, I got confused what Jonas was inquiring about. I thought that the
> > extension headers were not pulled, rather than not parsed. Copying them
> > as-is to info->opts is right, thanks!
> >
>
> No, you're not confused.  The extension headers are not being pulled in
> the current patchset.
>
> Incoming packet:
>
> ---------------------------------------------------------------------
> | flags | type | len | TEID | N-PDU | SEQ | Ext | EXT.Hdr | IP | ...
> ---------------------------------------------------------------------
> <--------- GTP header ------<<Optional GTP elements>>-----><- Pkt --->
>
> The "collect metadata" path of the patchset copies 'flags' and 'type' to
> info->opts, but leaves the following:
>
> -----------------------------------------
> | N-PDU | SEQ | Ext | EXT.Hdr | IP | ...
> -----------------------------------------
> <--------- GTP header -------><- Pkt --->
>
> So it's leaving _half_ the header and making it a requirement that there
> be further intelligence down the line that can handle this.  This is far
> from intuitive.
>

The patch supports Echo, Echo response and End marker packet.
Issue with pulling the entire extension header is that it would result
in zero length skb, such packets can not be passed on to the upper
layer. That is the reason I kept the extension header in skb and added
indication in tunnel metadata that it is not a IP packet. so that
upper layer can process the packet.
IP packet without an extension header would be handled in a fast path
without any special handling.

Obviously In case of PDU session container extension header GTP driver
would need to process the entire extension header in the module. This
way we can handle these user data packets in fastpath.
I can make changes to use the same method for all extension headers if needed.
Jonas Bonn Feb. 2, 2021, 8:03 a.m. UTC | #11
On 02/02/2021 07:56, Pravin Shelar wrote:
> On Mon, Feb 1, 2021 at 9:24 PM Jonas Bonn <jonas@norrbonn.se> wrote:
>>
>> Hi Jakub,
>>
>> On 01/02/2021 21:44, Jakub Kicinski wrote:
>>> On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote:
>>>> On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote:
>>>>> On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote:
>>>>>> On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote:
>>>>>> Following are the reasons for extracting the header and populating metadata.
>>>>>> 1. That is the design used by other tunneling protocols
>>>>>> implementations for handling optional headers. We need to have a
>>>>>> consistent model across all tunnel devices for upper layers.
>>>>>
>>>>> Could you clarify with some examples? This does not match intuition,
>>>>> I must be missing something.
>>>>
>>>> You can look at geneve_rx() or vxlan_rcv() that extracts optional
>>>> headers in ip_tunnel_info opts.
>>>
>>> Okay, I got confused what Jonas was inquiring about. I thought that the
>>> extension headers were not pulled, rather than not parsed. Copying them
>>> as-is to info->opts is right, thanks!
>>>
>>
>> No, you're not confused.  The extension headers are not being pulled in
>> the current patchset.
>>
>> Incoming packet:
>>
>> ---------------------------------------------------------------------
>> | flags | type | len | TEID | N-PDU | SEQ | Ext | EXT.Hdr | IP | ...
>> ---------------------------------------------------------------------
>> <--------- GTP header ------<<Optional GTP elements>>-----><- Pkt --->
>>
>> The "collect metadata" path of the patchset copies 'flags' and 'type' to
>> info->opts, but leaves the following:
>>
>> -----------------------------------------
>> | N-PDU | SEQ | Ext | EXT.Hdr | IP | ...
>> -----------------------------------------
>> <--------- GTP header -------><- Pkt --->
>>
>> So it's leaving _half_ the header and making it a requirement that there
>> be further intelligence down the line that can handle this.  This is far
>> from intuitive.
>>
> 
> The patch supports Echo, Echo response and End marker packet.
> Issue with pulling the entire extension header is that it would result
> in zero length skb, such packets can not be passed on to the upper
> layer. That is the reason I kept the extension header in skb and added
> indication in tunnel metadata that it is not a IP packet. so that
> upper layer can process the packet.
> IP packet without an extension header would be handled in a fast path
> without any special handling.
> 
> Obviously In case of PDU session container extension header GTP driver
> would need to process the entire extension header in the module. This
> way we can handle these user data packets in fastpath.
> I can make changes to use the same method for all extension headers if needed.
> 

The most disturbing bit is the fact that the upper layer needs to 
understand that part of the header info is in info->opts whereas the 
remainder is on the SKB itself.  If it is going to access the SKB 
anyway, why not just leave the entire GTP header in place and let the 
upper layer just get all the information from there?  What's the 
advantage of info->opts in this case?

Normally, the gtp module extracts T-PDU's from the GTP packet and passes 
them on (after validating their IP address) to the network stack.  For 
_everything else_, it just passes them along the socket for handling 
elsewhere.

It sounds like you are trying to do exactly the same thing:  extract 
T-PDU and inject into network stack for T-PDU's, and pass everything
else to another handler.

So what is different in your case from the normal case?
- there's metadata on the packet... can't we detect this and set the 
tunnel ID from the TEID in that case?  Or can't we just always have 
metadata on the packet?
- the upper layer handler is in kernel space instead of userspace; but 
they are doing pretty much the same thing, right?  why does the kernel 
space variant need something (info->opts) that userspace can get by without?

It would be seriously good to see a _real_ example of how you intend to 
use this.  Isn't the PDP context mechanism already sufficient to do all 
of the above?  What's missing?

ip route 192.168.99.0/24 encap gtp id 100 dst 172.99.0.2 dev gtp1

is roughly equivalent to:

gtp-tunnel add gtp1 v1 [LOCAL_TEID] 100 172.99.0.2 [UE_IP]

/Jonas
Pravin Shelar Feb. 2, 2021, 10:54 p.m. UTC | #12
On Tue, Feb 2, 2021 at 12:03 AM Jonas Bonn <jonas@norrbonn.se> wrote:
>
>
>
> On 02/02/2021 07:56, Pravin Shelar wrote:
> > On Mon, Feb 1, 2021 at 9:24 PM Jonas Bonn <jonas@norrbonn.se> wrote:
> >>
> >> Hi Jakub,
> >>
> >> On 01/02/2021 21:44, Jakub Kicinski wrote:
> >>> On Sat, 30 Jan 2021 12:05:40 -0800 Pravin Shelar wrote:
> >>>> On Sat, Jan 30, 2021 at 10:44 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >>>>> On Fri, 29 Jan 2021 22:59:06 -0800 Pravin Shelar wrote:
> >>>>>> On Fri, Jan 29, 2021 at 6:08 AM Jonas Bonn <jonas@norrbonn.se> wrote:
> >>>>>> Following are the reasons for extracting the header and populating metadata.
> >>>>>> 1. That is the design used by other tunneling protocols
> >>>>>> implementations for handling optional headers. We need to have a
> >>>>>> consistent model across all tunnel devices for upper layers.
> >>>>>
> >>>>> Could you clarify with some examples? This does not match intuition,
> >>>>> I must be missing something.
> >>>>
> >>>> You can look at geneve_rx() or vxlan_rcv() that extracts optional
> >>>> headers in ip_tunnel_info opts.
> >>>
> >>> Okay, I got confused what Jonas was inquiring about. I thought that the
> >>> extension headers were not pulled, rather than not parsed. Copying them
> >>> as-is to info->opts is right, thanks!
> >>>
> >>
> >> No, you're not confused.  The extension headers are not being pulled in
> >> the current patchset.
> >>
> >> Incoming packet:
> >>
> >> ---------------------------------------------------------------------
> >> | flags | type | len | TEID | N-PDU | SEQ | Ext | EXT.Hdr | IP | ...
> >> ---------------------------------------------------------------------
> >> <--------- GTP header ------<<Optional GTP elements>>-----><- Pkt --->
> >>
> >> The "collect metadata" path of the patchset copies 'flags' and 'type' to
> >> info->opts, but leaves the following:
> >>
> >> -----------------------------------------
> >> | N-PDU | SEQ | Ext | EXT.Hdr | IP | ...
> >> -----------------------------------------
> >> <--------- GTP header -------><- Pkt --->
> >>
> >> So it's leaving _half_ the header and making it a requirement that there
> >> be further intelligence down the line that can handle this.  This is far
> >> from intuitive.
> >>
> >
> > The patch supports Echo, Echo response and End marker packet.
> > Issue with pulling the entire extension header is that it would result
> > in zero length skb, such packets can not be passed on to the upper
> > layer. That is the reason I kept the extension header in skb and added
> > indication in tunnel metadata that it is not a IP packet. so that
> > upper layer can process the packet.
> > IP packet without an extension header would be handled in a fast path
> > without any special handling.
> >
> > Obviously In case of PDU session container extension header GTP driver
> > would need to process the entire extension header in the module. This
> > way we can handle these user data packets in fastpath.
> > I can make changes to use the same method for all extension headers if needed.
> >
>
> The most disturbing bit is the fact that the upper layer needs to
> understand that part of the header info is in info->opts whereas the
> remainder is on the SKB itself.  If it is going to access the SKB
> anyway, why not just leave the entire GTP header in place and let the
> upper layer just get all the information from there?  What's the
> advantage of info->opts in this case?
>
> Normally, the gtp module extracts T-PDU's from the GTP packet and passes
> them on (after validating their IP address) to the network stack.  For
> _everything else_, it just passes them along the socket for handling
> elsewhere.
>
> It sounds like you are trying to do exactly the same thing:  extract
> T-PDU and inject into network stack for T-PDU's, and pass everything
> else to another handler.
>
> So what is different in your case from the normal case?
> - there's metadata on the packet... can't we detect this and set the
> tunnel ID from the TEID in that case?  Or can't we just always have
> metadata on the packet?
> - the upper layer handler is in kernel space instead of userspace; but
> they are doing pretty much the same thing, right?  why does the kernel
> space variant need something (info->opts) that userspace can get by without?
>
This is how OVS/eBPF abstracts tunneling implementation. Tunnel
context is represented in tunnel metadata. I am trying to integrate
GTP tunnel with LTW framework. This way we can make use of GTP tunnel
as any other LWT device.

I am fine with the GTP extension header handling that passes GTP
packet as is in case extension header. Can you make those changes in
this series and post Non RFC Patch Set.
diff mbox series

Patch

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 668ed8a4836e..bbce2671de2d 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -683,6 +683,38 @@  static void gtp_push_header(struct sk_buff *skb, struct pdp_ctx *pctx,
 	}
 }
 
+static inline int gtp1_push_control_header(struct sk_buff *skb,
+					   struct pdp_ctx *pctx,
+					   struct gtpu_metadata *opts,
+					   struct net_device *dev)
+{
+	struct gtp1_header *gtp1c;
+	int payload_len;
+
+	if (opts->ver != GTP_METADATA_V1)
+		return -ENOENT;
+
+	if (opts->type == 0xFE) {
+		/* for end marker ignore skb data. */
+		netdev_dbg(dev, "xmit pkt with null data");
+		pskb_trim(skb, 0);
+	}
+	if (skb_cow_head(skb, sizeof(*gtp1c)) < 0)
+		return -ENOMEM;
+
+	payload_len = skb->len;
+
+	gtp1c = skb_push(skb, sizeof(*gtp1c));
+
+	gtp1c->flags	= opts->flags;
+	gtp1c->type	= opts->type;
+	gtp1c->length	= htons(payload_len);
+	gtp1c->tid	= htonl(pctx->u.v1.o_tei);
+	netdev_dbg(dev, "xmit control pkt: ver %d flags %x type %x pkt len %d tid %x",
+		   opts->ver, opts->flags, opts->type, skb->len, pctx->u.v1.o_tei);
+	return 0;
+}
+
 static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev)
 {
 	struct gtp_dev *gtp = netdev_priv(dev);
@@ -807,7 +839,16 @@  static int gtp_xmit_ip4(struct sk_buff *skb, struct net_device *dev)
 
 	skb_set_inner_protocol(skb, skb->protocol);
 
-	gtp_push_header(skb, pctx, &port);
+	if (unlikely(opts)) {
+		port = htons(GTP1U_PORT);
+		r = gtp1_push_control_header(skb, pctx, opts, dev);
+		if (r) {
+			netdev_info(dev, "cntr pkt error %d", r);
+			goto err_rt;
+		}
+	} else {
+		gtp_push_header(skb, pctx, &port);
+	}
 
 	iph = ip_hdr(skb);
 	netdev_dbg(dev, "gtp -> IP src: %pI4 dst: %pI4\n",