[RFC,net-next,0/3] support "flow-based" datapath in l2tp

Message ID 20210929094514.15048-1-tparkin@katalix.com (mailing list archive)

Message

Tom Parkin Sept. 29, 2021, 9:45 a.m. UTC
The traditional l2tp datapath in the kernel allocates a netdev for
each l2tp session.  For larger session populations this limits
scalability.

Other protocols (such as geneve) support a mode whereby a single virtual
netdev is used to manage packet flows for multiple logical sessions: a
much more scalable solution.

This RFC patch series extends l2tp to support this mode of operation:

    * On creation of a tunnel instance a new tunnel virtual device is
      created (in this patch series it is named according to its ID for
      ease of testing, but this is potentially racy: alternatives are
      mentioned in the code comments).

    * For l2tp encapsulation, tc rules can be added to redirect traffic
      to the virtual tunnel device, e.g.

            tc qdisc add dev eth0 handle ffff: ingress
            tc filter add dev eth0 \
                    parent ffff: \
                    matchall \
                    action tunnel_key set \
                            src_ip 0.0.0.1 \
                            dst_ip 0.0.0.1 \
                            id 1 \
                    action mirred egress redirect dev l2tpt1

      This series uses the 'id' parameter to refer to the session ID
      within the tunnel; the src_ip/dst_ip parameters are ignored.

    * For l2tp decapsulation, a new session data path is implemented.

      On receipt of an l2tp data packet on the tunnel socket, the l2tp
      headers are removed as normal, and the session ID of the target
      session is associated with the skb using ip tunnel dst metadata.

      The skb is then redirected to the tunnel virtual netdev: tc rules
      can then be added to match traffic based on the session ID and
      redirect it to the correct interface:

            tc qdisc add dev l2tpt1 handle ffff: ingress
            tc filter add dev l2tpt1 \
                    parent ffff: \
                    flower enc_key_id 1 \
                    action mirred egress redirect dev eth0

      In the case that no tc rule matches an incoming packet, the tunnel
      virtual device implements an rx handler which swallows the packet
      in order to prevent it continuing through the network stack.
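
As an illustration of how the decapsulation rules scale with session
count, the following sketch prints one redirect rule per session ID on a
single tunnel device. It is a dry run: it only prints the tc commands
and does not execute them. The helper name emit_decap_rules, the device
names and the session IDs are assumptions for illustration, reusing the
names from the examples above.

```shell
#!/bin/sh
# Dry-run sketch: prints tc commands instead of executing them.
# emit_decap_rules TUNNEL_DEV OUT_DEV SESSION_ID...
#   TUNNEL_DEV: tunnel virtual netdev (e.g. l2tpt1, as in the example above)
#   OUT_DEV:    interface to redirect decapsulated traffic to (assumed)
emit_decap_rules() {
    tdev=$1
    odev=$2
    shift 2
    # One ingress qdisc on the tunnel device, shared by all sessions.
    echo "tc qdisc add dev $tdev handle ffff: ingress"
    # One flower rule per session, keyed on the tunnel key ID.
    for sid in "$@"; do
        echo "tc filter add dev $tdev parent ffff: flower enc_key_id $sid action mirred egress redirect dev $odev"
    done
}

# Example: three sessions on tunnel 1, all redirected to eth0.
emit_decap_rules l2tpt1 eth0 1 2 3
```

The point of the sketch is that adding a session costs one more filter
on the existing tunnel device rather than one more netdev.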

I welcome any comments on:

    1. Whether this RFC represents a good approach for improving
       the l2tp datapath?

    2. Architectural/design feedback on this implementation.

The code here isn't production-ready by any means, although any comments
on bugs or other issues with the series as it stands are also welcome.

Tom Parkin (3):
  net/l2tp: add virtual tunnel device
  net/l2tp: add flow-based session create API
  net/l2tp: add netlink attribute to enable flow-based session creation

 include/uapi/linux/l2tp.h |   1 +
 net/l2tp/l2tp_core.c      | 208 ++++++++++++++++++++++++++++++++++++++
 net/l2tp/l2tp_core.h      |   9 ++
 net/l2tp/l2tp_netlink.c   |  36 ++++---
 4 files changed, 241 insertions(+), 13 deletions(-)

Comments

Eyal Birger Sept. 29, 2021, 10:03 a.m. UTC | #1
Hi Tom,

On 29/09/2021 12:45, Tom Parkin wrote:
...
>        The skb is then redirected to the tunnel virtual netdev: tc rules
>        can then be added to match traffic based on the session ID and
>        redirect it to the correct interface:
> 
>              tc qdisc add dev l2tpt1 handle ffff: ingress
>              tc filter add dev l2tpt1 \
>                      parent ffff: \
>                      flower enc_key_id 1 \
>                      action mirred egress redirect dev eth0
> 
>        In the case that no tc rule matches an incoming packet, the tunnel
>        virtual device implements an rx handler which swallows the packet
>        in order to prevent it continuing through the network stack.

There are other ways to utilize the tunnel key on rx, e.g. in ip rules.

IMHO it'd be nicer if the decision to drop would be an administrator 
decision which they can implement using a designated tc drop rule.

Eyal.

Tom Parkin Oct. 1, 2021, 8:40 a.m. UTC | #2
On Wed, Sep 29, 2021 at 13:03:21 +0300, Eyal Birger wrote:
> Hi Tom,
> 
> On 29/09/2021 12:45, Tom Parkin wrote:
> ...
> >        The skb is then redirected to the tunnel virtual netdev: tc rules
> >        can then be added to match traffic based on the session ID and
> >        redirect it to the correct interface:
> > 
> >              tc qdisc add dev l2tpt1 handle ffff: ingress
> >              tc filter add dev l2tpt1 \
> >                      parent ffff: \
> >                      flower enc_key_id 1 \
> >                      action mirred egress redirect dev eth0
> > 
> >        In the case that no tc rule matches an incoming packet, the tunnel
> >        virtual device implements an rx handler which swallows the packet
> >        in order to prevent it continuing through the network stack.
> 
> There are other ways to utilize the tunnel key on rx, e.g. in ip rules.
> 
> IMHO it'd be nicer if the decision to drop would be an administrator
> decision which they can implement using a designated tc drop rule.

Good point, and one I hadn't considered.

My concern with letting the packet enter the stack is that it could
possibly cause issues, but maybe it's better to allow the admin to
make that call.
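
For reference, a designated drop rule of the kind suggested above might
be sketched as follows. This is a dry run that only prints the tc
command. It assumes the tunnel device's rx handler no longer swallows
unmatched packets, that the per-session redirect rules are installed at
numerically lower prio values (which tc evaluates first), and that a
prio of 65535 is therefore suitable for a catch-all; the helper name
emit_default_drop and the default prio are illustrative choices, not
part of the series.

```shell
#!/bin/sh
# Dry-run sketch: prints the catch-all drop rule instead of installing it.
# emit_default_drop TUNNEL_DEV [PRIO]
#   TUNNEL_DEV: tunnel virtual netdev (e.g. l2tpt1)
#   PRIO:       filter priority; defaults to 65535 so the rule is
#               consulted after lower-numbered session rules (assumed)
emit_default_drop() {
    tdev=$1
    prio=${2:-65535}
    echo "tc filter add dev $tdev parent ffff: prio $prio matchall action drop"
}

# Example: drop anything on l2tpt1 that no session rule redirected.
emit_default_drop l2tpt1
```

An administrator who prefers to let unmatched packets continue into the
stack (for example to match the tunnel key in ip rules) would simply not
install this rule.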