mbox series

[RFC,v1,net-next,00/12] bridge-fastpath and related improvements

Message ID 20241013185509.4430-1-ericwouds@gmail.com (mailing list archive)
Headers show
Series bridge-fastpath and related improvements | expand

Message

Eric Woudstra Oct. 13, 2024, 6:54 p.m. UTC
This patchset makes it possible to set up a (hardware offloaded) fastpath
for bridged interfaces.

To set up the fastpath with offloading, add this extra flowtable:

table bridge filter {
        flowtable fb {
                hook ingress priority filter
                devices = { lan0, lan1, lan2, lan3, lan4, wlan0, wlan1 }
                flags offload
        }
        chain forward {
                type filter hook forward priority filter; policy accept;
		ct state established flow add @fb
        }
}

Creating a separate fastpath for bridges.

         forward fastpath bypass
 .----------------------------------------.
/                                          \
|                        IP - forwarding    |
|                       /                \  v
|                      /                  wan ...
|                     /
|                     |
|                     |
|                   brlan.1
|                     |
|    +-------------------------------+
|    |           vlan 1              |
|    |                               |
|    |     brlan (vlan-filtering)    |
|    +---------------+               |
|    |  DSA-SWITCH   |               |
|    |               |    vlan 1     |
|    |               |      to       |
|    |   vlan 1      |   untagged    |
|    +---------------+---------------+
.         /                   \
 ------>lan0                 wlan1
        .  ^                 ^
        .  |                 |
        .  \_________________/
        .  bridge fastpath bypass
        .
        ^
     vlan 1 tagged packets

To have the ability to handle xmit direct with outgoing encaps in the
bridge fastpass bypass, we need to be able to handle them without going
through vlan/pppoe devices. So I've applied, amended and squashed wenxu's
patchset. This patch also makes it possible to egress from vlan-filtering
brlan to lan0 with vlan tagged packets, if the bridge master port is doing
the vlan tagging, instead of the vlan-device. Without this patch, this is
not possible in the bridge-fastpath and also not in the forward-fastpath,
as seen in the figure above.

There are also some more fixes for filling in the forward path. These
fixes also apply to for the forward-fastpath. They include handling
DEV_PATH_MTK_WDMA in nft_dev_path_info() and avoiding
DEV_PATH_BR_VLAN_UNTAG_HW for bridges with ports that use dsa.

Conntrack bridge only tracks untagged and 802.1q. To make the bridge
fastpath experience more similar to the forward fastpath experience,
I've added double vlan, pppoe and pppoe-in-q tagged packets to bridge
conntrack and to bridge filter chain.

Eric Woudstra (12):
  netfilter: nf_flow_table_offload: Add nf_flow_encap_push() for xmit
    direct
  netfilter: bridge: Add conntrack double vlan and pppoe
  netfilter: nft_chain_filter: Add bridge double vlan and pppoe
  bridge: br_vlan_fill_forward_path_pvid: Add port to port
  bridge: br_fill_forward_path add port to port
  net: core: dev: Add dev_fill_bridge_path()
  netfilter :nf_flow_table_offload: Add nf_flow_rule_bridge()
  netfilter: nf_flow_table_inet: Add nf_flowtable_type flowtable_bridge
  netfilter: nft_flow_offload: Add NFPROTO_BRIDGE to validate
  netfilter: nft_flow_offload: Add DEV_PATH_MTK_WDMA to
    nft_dev_path_info()
  bridge: br_vlan_fill_forward_path_mode no _UNTAG_HW for dsa
  netfilter: nft_flow_offload: Add bridgeflow to nft_flow_offload_eval()

 include/linux/netdevice.h                  |   2 +
 include/net/netfilter/nf_flow_table.h      |   3 +
 net/bridge/br_device.c                     |  20 ++-
 net/bridge/br_private.h                    |   2 +
 net/bridge/br_vlan.c                       |  24 +++-
 net/bridge/netfilter/nf_conntrack_bridge.c |  86 ++++++++++--
 net/core/dev.c                             |  77 +++++++++--
 net/netfilter/nf_flow_table_inet.c         |  13 ++
 net/netfilter/nf_flow_table_ip.c           |  96 ++++++++++++-
 net/netfilter/nf_flow_table_offload.c      |  13 ++
 net/netfilter/nft_chain_filter.c           |  20 ++-
 net/netfilter/nft_flow_offload.c           | 154 +++++++++++++++++++--
 12 files changed, 463 insertions(+), 47 deletions(-)

Comments

Nikolay Aleksandrov Oct. 14, 2024, 6:35 a.m. UTC | #1
On 13/10/2024 21:54, Eric Woudstra wrote:
> This patchset makes it possible to set up a (hardware offloaded) fastpath
> for bridged interfaces.
> 

The subject and this sentence are misleading, you're talking about netfilter bridge
fastpath offload, please mention it in both places. When you just say bridge fast
path, I think of the software fast path.

> To set up the fastpath with offloading, add this extra flowtable:
> 
> table bridge filter {
>         flowtable fb {
>                 hook ingress priority filter
>                 devices = { lan0, lan1, lan2, lan3, lan4, wlan0, wlan1 }
>                 flags offload
>         }
>         chain forward {
>                 type filter hook forward priority filter; policy accept;
> 		ct state established flow add @fb
>         }
> }
> 
> Creating a separate fastpath for bridges.
> 
>          forward fastpath bypass
>  .----------------------------------------.
> /                                          \
> |                        IP - forwarding    |
> |                       /                \  v
> |                      /                  wan ...
> |                     /
> |                     |
> |                     |
> |                   brlan.1
> |                     |
> |    +-------------------------------+
> |    |           vlan 1              |
> |    |                               |
> |    |     brlan (vlan-filtering)    |
> |    +---------------+               |
> |    |  DSA-SWITCH   |               |
> |    |               |    vlan 1     |
> |    |               |      to       |
> |    |   vlan 1      |   untagged    |
> |    +---------------+---------------+
> .         /                   \
>  ------>lan0                 wlan1
>         .  ^                 ^
>         .  |                 |
>         .  \_________________/
>         .  bridge fastpath bypass
>         .
>         ^
>      vlan 1 tagged packets
> 
> To have the ability to handle xmit direct with outgoing encaps in the
> bridge fastpass bypass, we need to be able to handle them without going
> through vlan/pppoe devices. So I've applied, amended and squashed wenxu's
> patchset. This patch also makes it possible to egress from vlan-filtering
> brlan to lan0 with vlan tagged packets, if the bridge master port is doing
> the vlan tagging, instead of the vlan-device. Without this patch, this is
> not possible in the bridge-fastpath and also not in the forward-fastpath,
> as seen in the figure above.
> 
> There are also some more fixes for filling in the forward path. These
> fixes also apply to for the forward-fastpath. They include handling
> DEV_PATH_MTK_WDMA in nft_dev_path_info() and avoiding
> DEV_PATH_BR_VLAN_UNTAG_HW for bridges with ports that use dsa.
> 
> Conntrack bridge only tracks untagged and 802.1q. To make the bridge
> fastpath experience more similar to the forward fastpath experience,
> I've added double vlan, pppoe and pppoe-in-q tagged packets to bridge
> conntrack and to bridge filter chain.
> 
> Eric Woudstra (12):
>   netfilter: nf_flow_table_offload: Add nf_flow_encap_push() for xmit
>     direct
>   netfilter: bridge: Add conntrack double vlan and pppoe
>   netfilter: nft_chain_filter: Add bridge double vlan and pppoe
>   bridge: br_vlan_fill_forward_path_pvid: Add port to port
>   bridge: br_fill_forward_path add port to port
>   net: core: dev: Add dev_fill_bridge_path()
>   netfilter :nf_flow_table_offload: Add nf_flow_rule_bridge()
>   netfilter: nf_flow_table_inet: Add nf_flowtable_type flowtable_bridge
>   netfilter: nft_flow_offload: Add NFPROTO_BRIDGE to validate
>   netfilter: nft_flow_offload: Add DEV_PATH_MTK_WDMA to
>     nft_dev_path_info()
>   bridge: br_vlan_fill_forward_path_mode no _UNTAG_HW for dsa
>   netfilter: nft_flow_offload: Add bridgeflow to nft_flow_offload_eval()
> 
>  include/linux/netdevice.h                  |   2 +
>  include/net/netfilter/nf_flow_table.h      |   3 +
>  net/bridge/br_device.c                     |  20 ++-
>  net/bridge/br_private.h                    |   2 +
>  net/bridge/br_vlan.c                       |  24 +++-
>  net/bridge/netfilter/nf_conntrack_bridge.c |  86 ++++++++++--
>  net/core/dev.c                             |  77 +++++++++--
>  net/netfilter/nf_flow_table_inet.c         |  13 ++
>  net/netfilter/nf_flow_table_ip.c           |  96 ++++++++++++-
>  net/netfilter/nf_flow_table_offload.c      |  13 ++
>  net/netfilter/nft_chain_filter.c           |  20 ++-
>  net/netfilter/nft_flow_offload.c           | 154 +++++++++++++++++++--
>  12 files changed, 463 insertions(+), 47 deletions(-)
>
Eric Woudstra Oct. 14, 2024, 6:29 p.m. UTC | #2
On 10/14/24 8:35 AM, Nikolay Aleksandrov wrote:
> On 13/10/2024 21:54, Eric Woudstra wrote:
>> This patchset makes it possible to set up a (hardware offloaded) fastpath
>> for bridged interfaces.
>>
> 
> The subject and this sentence are misleading, you're talking about netfilter bridge
> fastpath offload, please mention it in both places. When you just say bridge fast
> path, I think of the software fast path.
> 

Hello Nikolay,

It would be no problem for me to change the subject and body, if you
think that is better.

The thing is, these patches actually make it possible to set up a fully
functional software fastpath between bridged interfaces. Only after the
software fastpath is set up and functional, it can be offloaded, which
happens to by my personal motivation to write this patch-set.

If the offload flag is set in the flowtable, the software fastpath will
be offloaded. But in this patch-set, there is nothing that changes
anything there, the existing code is used unchanged.

>> To set up the fastpath with offloading, add this extra flowtable:
>>
>> table bridge filter {
>>         flowtable fb {
>>                 hook ingress priority filter
>>                 devices = { lan0, lan1, lan2, lan3, lan4, wlan0, wlan1 }
>>                 flags offload
>>         }
>>         chain forward {
>>                 type filter hook forward priority filter; policy accept;
>> 		ct state established flow add @fb
>>         }
>> }
>>
>> Creating a separate fastpath for bridges.
>>
>>          forward fastpath bypass
>>  .----------------------------------------.
>> /                                          \
>> |                        IP - forwarding    |
>> |                       /                \  v
>> |                      /                  wan ...
>> |                     /
>> |                     |
>> |                     |
>> |                   brlan.1
>> |                     |
>> |    +-------------------------------+
>> |    |           vlan 1              |
>> |    |                               |
>> |    |     brlan (vlan-filtering)    |
>> |    +---------------+               |
>> |    |  DSA-SWITCH   |               |
>> |    |               |    vlan 1     |
>> |    |               |      to       |
>> |    |   vlan 1      |   untagged    |
>> |    +---------------+---------------+
>> .         /                   \
>>  ------>lan0                 wlan1
>>         .  ^                 ^
>>         .  |                 |
>>         .  \_________________/
>>         .  bridge fastpath bypass
>>         .
>>         ^
>>      vlan 1 tagged packets
>>
>> To have the ability to handle xmit direct with outgoing encaps in the
>> bridge fastpass bypass, we need to be able to handle them without going
>> through vlan/pppoe devices. So I've applied, amended and squashed wenxu's
>> patchset. This patch also makes it possible to egress from vlan-filtering
>> brlan to lan0 with vlan tagged packets, if the bridge master port is doing
>> the vlan tagging, instead of the vlan-device. Without this patch, this is
>> not possible in the bridge-fastpath and also not in the forward-fastpath,
>> as seen in the figure above.
>>
>> There are also some more fixes for filling in the forward path. These
>> fixes also apply to for the forward-fastpath. They include handling
>> DEV_PATH_MTK_WDMA in nft_dev_path_info() and avoiding
>> DEV_PATH_BR_VLAN_UNTAG_HW for bridges with ports that use dsa.
>>
>> Conntrack bridge only tracks untagged and 802.1q. To make the bridge
>> fastpath experience more similar to the forward fastpath experience,
>> I've added double vlan, pppoe and pppoe-in-q tagged packets to bridge
>> conntrack and to bridge filter chain.
>>
>> Eric Woudstra (12):
>>   netfilter: nf_flow_table_offload: Add nf_flow_encap_push() for xmit
>>     direct
>>   netfilter: bridge: Add conntrack double vlan and pppoe
>>   netfilter: nft_chain_filter: Add bridge double vlan and pppoe
>>   bridge: br_vlan_fill_forward_path_pvid: Add port to port
>>   bridge: br_fill_forward_path add port to port
>>   net: core: dev: Add dev_fill_bridge_path()
>>   netfilter :nf_flow_table_offload: Add nf_flow_rule_bridge()
>>   netfilter: nf_flow_table_inet: Add nf_flowtable_type flowtable_bridge
>>   netfilter: nft_flow_offload: Add NFPROTO_BRIDGE to validate
>>   netfilter: nft_flow_offload: Add DEV_PATH_MTK_WDMA to
>>     nft_dev_path_info()
>>   bridge: br_vlan_fill_forward_path_mode no _UNTAG_HW for dsa
>>   netfilter: nft_flow_offload: Add bridgeflow to nft_flow_offload_eval()
>>
>>  include/linux/netdevice.h                  |   2 +
>>  include/net/netfilter/nf_flow_table.h      |   3 +
>>  net/bridge/br_device.c                     |  20 ++-
>>  net/bridge/br_private.h                    |   2 +
>>  net/bridge/br_vlan.c                       |  24 +++-
>>  net/bridge/netfilter/nf_conntrack_bridge.c |  86 ++++++++++--
>>  net/core/dev.c                             |  77 +++++++++--
>>  net/netfilter/nf_flow_table_inet.c         |  13 ++
>>  net/netfilter/nf_flow_table_ip.c           |  96 ++++++++++++-
>>  net/netfilter/nf_flow_table_offload.c      |  13 ++
>>  net/netfilter/nft_chain_filter.c           |  20 ++-
>>  net/netfilter/nft_flow_offload.c           | 154 +++++++++++++++++++--
>>  12 files changed, 463 insertions(+), 47 deletions(-)
>>
>
Felix Fietkau Oct. 15, 2024, 12:16 p.m. UTC | #3
Hi Eric,

On 14.10.24 20:29, Eric Woudstra wrote:
> It would be no problem for me to change the subject and body, if you
> think that is better.
> 
> The thing is, these patches actually make it possible to set up a fully
> functional software fastpath between bridged interfaces. Only after the
> software fastpath is set up and functional, it can be offloaded, which
> happens to by my personal motivation to write this patch-set.
> 
> If the offload flag is set in the flowtable, the software fastpath will
> be offloaded. But in this patch-set, there is nothing that changes
> anything there, the existing code is used unchanged.

FWIW, a while back, I also wanted to add a software fast path for the 
bridge layer to the kernel, also with the intention of using it for 
hardware offload. It wasn't accepted back then, because (if I remember 
correctly) people didn't want any extra complexity in the network stack 
to make the bridge layer faster.

Because of that, I created this piece of software:
https://github.com/nbd168/bridger

It uses an eBPF TC classifier for discovering flows and handling the 
software fast path, and also creates hardware offload rules for flows.
With that, hardware offloading for bridged LAN->WLAN flows is fully 
supported on MediaTek hardware with upstream kernels.

- Felix
Eric Woudstra Oct. 15, 2024, 1:32 p.m. UTC | #4
On 10/15/24 2:16 PM, Felix Fietkau wrote:
> Hi Eric,
> 
> On 14.10.24 20:29, Eric Woudstra wrote:
>> It would be no problem for me to change the subject and body, if you
>> think that is better.
>>
>> The thing is, these patches actually make it possible to set up a fully
>> functional software fastpath between bridged interfaces. Only after the
>> software fastpath is set up and functional, it can be offloaded, which
>> happens to by my personal motivation to write this patch-set.
>>
>> If the offload flag is set in the flowtable, the software fastpath will
>> be offloaded. But in this patch-set, there is nothing that changes
>> anything there, the existing code is used unchanged.
> 
> FWIW, a while back, I also wanted to add a software fast path for the
> bridge layer to the kernel, also with the intention of using it for
> hardware offload. It wasn't accepted back then, because (if I remember
> correctly) people didn't want any extra complexity in the network stack
> to make the bridge layer faster.

Hello Felix,

I think this patch-set is a clear showcase it is not very complex at
all. The core of making it possible only consists a few patches. Half of
this patch-set involves improvements that also apply to the
forward-fastpath.

> Because of that, I created this piece of software:
> https://github.com/nbd168/bridger
> 
> It uses an eBPF TC classifier for discovering flows and handling the
> software fast path, and also creates hardware offload rules for flows.
> With that, hardware offloading for bridged LAN->WLAN flows is fully
> supported on MediaTek hardware with upstream kernels.
> 
> - Felix

Thanks, I've seen that already. Nice piece of software, but I'm not
running openwrt. I would like to see a solution implemented in the
kernel, so any operating system can use it.
Felix Fietkau Oct. 15, 2024, 7:44 p.m. UTC | #5
On 15.10.24 15:32, Eric Woudstra wrote:
> 
> 
> On 10/15/24 2:16 PM, Felix Fietkau wrote:
>> Hi Eric,
>> 
>> On 14.10.24 20:29, Eric Woudstra wrote:
>>> It would be no problem for me to change the subject and body, if you
>>> think that is better.
>>>
>>> The thing is, these patches actually make it possible to set up a fully
>>> functional software fastpath between bridged interfaces. Only after the
>>> software fastpath is set up and functional, it can be offloaded, which
>>> happens to by my personal motivation to write this patch-set.
>>>
>>> If the offload flag is set in the flowtable, the software fastpath will
>>> be offloaded. But in this patch-set, there is nothing that changes
>>> anything there, the existing code is used unchanged.
>> 
>> FWIW, a while back, I also wanted to add a software fast path for the
>> bridge layer to the kernel, also with the intention of using it for
>> hardware offload. It wasn't accepted back then, because (if I remember
>> correctly) people didn't want any extra complexity in the network stack
>> to make the bridge layer faster.
> 
> Hello Felix,
> 
> I think this patch-set is a clear showcase it is not very complex at
> all. The core of making it possible only consists a few patches. Half of
> this patch-set involves improvements that also apply to the
> forward-fastpath.

It's definitely an interesting approach. How does it deal with devices 
roaming from one bridge port to another? I couldn't find that in the code.

>> Because of that, I created this piece of software:
>> https://github.com/nbd168/bridger
>> 
>> It uses an eBPF TC classifier for discovering flows and handling the
>> software fast path, and also creates hardware offload rules for flows.
>> With that, hardware offloading for bridged LAN->WLAN flows is fully
>> supported on MediaTek hardware with upstream kernels.
>> 
>> - Felix
> 
> Thanks, I've seen that already. Nice piece of software, but I'm not
> running openwrt. I would like to see a solution implemented in the
> kernel, so any operating system can use it.

Makes sense. By the way, bridger can easily be built for non-OpenWrt 
systems too. The only library that's actually needed is libubox - that 
one is small and can be linked in statically. ubus support is fully 
optional and not necessary for standard cases.

- Felix
Eric Woudstra Oct. 16, 2024, 3:59 p.m. UTC | #6
On 10/15/24 9:44 PM, Felix Fietkau wrote:
> On 15.10.24 15:32, Eric Woudstra wrote:
>>
>>
>> On 10/15/24 2:16 PM, Felix Fietkau wrote:
>>> Hi Eric,
>>>
>>> On 14.10.24 20:29, Eric Woudstra wrote:
>>>> It would be no problem for me to change the subject and body, if you
>>>> think that is better.
>>>>
>>>> The thing is, these patches actually make it possible to set up a fully
>>>> functional software fastpath between bridged interfaces. Only after the
>>>> software fastpath is set up and functional, it can be offloaded, which
>>>> happens to by my personal motivation to write this patch-set.
>>>>
>>>> If the offload flag is set in the flowtable, the software fastpath will
>>>> be offloaded. But in this patch-set, there is nothing that changes
>>>> anything there, the existing code is used unchanged.
>>>
>>> FWIW, a while back, I also wanted to add a software fast path for the
>>> bridge layer to the kernel, also with the intention of using it for
>>> hardware offload. It wasn't accepted back then, because (if I remember
>>> correctly) people didn't want any extra complexity in the network stack
>>> to make the bridge layer faster.
>>
>> Hello Felix,
>>
>> I think this patch-set is a clear showcase it is not very complex at
>> all. The core of making it possible only consists a few patches. Half of
>> this patch-set involves improvements that also apply to the
>> forward-fastpath.
> 
> It's definitely an interesting approach. How does it deal with devices
> roaming from one bridge port to another? I couldn't find that in the code.

It is handled in the same manner when dealing with the forward-fastpath,
with the aid of conntrack. If roaming is problematic, then it would be
for both the forward-fastpath and the bridge-fastpath. I have a topic on
the banana-pi forum about this patch-set, so I think long discussions
about additional details we could have there, keeping the mailing list
more clean.

>>> Because of that, I created this piece of software:
>>> https://github.com/nbd168/bridger
>>>
>>> It uses an eBPF TC classifier for discovering flows and handling the
>>> software fast path, and also creates hardware offload rules for flows.
>>> With that, hardware offloading for bridged LAN->WLAN flows is fully
>>> supported on MediaTek hardware with upstream kernels.
>>>
>>> - Felix
>>
>> Thanks, I've seen that already. Nice piece of software, but I'm not
>> running openwrt. I would like to see a solution implemented in the
>> kernel, so any operating system can use it.
> 
> Makes sense. By the way, bridger can easily be built for non-OpenWrt
> systems too. The only library that's actually needed is libubox - that
> one is small and can be linked in statically. ubus support is fully
> optional and not necessary for standard cases.
> 
> - Felix
Felix Fietkau Oct. 17, 2024, 9:17 a.m. UTC | #7
On 16.10.24 17:59, Eric Woudstra wrote:
> 
> 
> On 10/15/24 9:44 PM, Felix Fietkau wrote:
>> On 15.10.24 15:32, Eric Woudstra wrote:
>>>
>>>
>>> On 10/15/24 2:16 PM, Felix Fietkau wrote:
>>>> Hi Eric,
>>>>
>>>> On 14.10.24 20:29, Eric Woudstra wrote:
>>>>> It would be no problem for me to change the subject and body, if you
>>>>> think that is better.
>>>>>
>>>>> The thing is, these patches actually make it possible to set up a fully
>>>>> functional software fastpath between bridged interfaces. Only after the
>>>>> software fastpath is set up and functional, it can be offloaded, which
>>>>> happens to by my personal motivation to write this patch-set.
>>>>>
>>>>> If the offload flag is set in the flowtable, the software fastpath will
>>>>> be offloaded. But in this patch-set, there is nothing that changes
>>>>> anything there, the existing code is used unchanged.
>>>>
>>>> FWIW, a while back, I also wanted to add a software fast path for the
>>>> bridge layer to the kernel, also with the intention of using it for
>>>> hardware offload. It wasn't accepted back then, because (if I remember
>>>> correctly) people didn't want any extra complexity in the network stack
>>>> to make the bridge layer faster.
>>>
>>> Hello Felix,
>>>
>>> I think this patch-set is a clear showcase it is not very complex at
>>> all. The core of making it possible only consists a few patches. Half of
>>> this patch-set involves improvements that also apply to the
>>> forward-fastpath.
>> 
>> It's definitely an interesting approach. How does it deal with devices
>> roaming from one bridge port to another? I couldn't find that in the code.
> 
> It is handled in the same manner when dealing with the forward-fastpath,
> with the aid of conntrack. If roaming is problematic, then it would be
> for both the forward-fastpath and the bridge-fastpath. I have a topic on
> the banana-pi forum about this patch-set, so I think long discussions
> about additional details we could have there, keeping the mailing list
> more clean.

You forgot to include a link to the forum topic :)

By the way, based on some reports that I received, I do believe that the 
existing forwarding fastpath also doesn't handle roaming properly.
I just didn't have the time to properly look into that yet.

- Felix
Pablo Neira Ayuso Oct. 17, 2024, 12:39 p.m. UTC | #8
On Thu, Oct 17, 2024 at 11:17:09AM +0200, Felix Fietkau wrote:
[...]
> By the way, based on some reports that I received, I do believe that the
> existing forwarding fastpath also doesn't handle roaming properly.
> I just didn't have the time to properly look into that yet.

I think it should work for the existing forwarding fastpath.

- If computer roams from different port, packets follow classic path,
  then new flow entry is created. The flow old entry expires after 30
  seconds.
- If route is stale, flow entry is also removed.

Maybe I am missing another possible scenario?

Thanks.
Felix Fietkau Oct. 17, 2024, 5:06 p.m. UTC | #9
On 17.10.24 14:39, Pablo Neira Ayuso wrote:
> On Thu, Oct 17, 2024 at 11:17:09AM +0200, Felix Fietkau wrote:
> [...]
>> By the way, based on some reports that I received, I do believe that the
>> existing forwarding fastpath also doesn't handle roaming properly.
>> I just didn't have the time to properly look into that yet.
> 
> I think it should work for the existing forwarding fastpath.
> 
> - If computer roams from different port, packets follow classic path,
>    then new flow entry is created. The flow old entry expires after 30
>    seconds.
> - If route is stale, flow entry is also removed.
> 
> Maybe I am missing another possible scenario?

I'm mainly talking about the scenario where a computer moves to a 
different switch port on L2 only, so all routes remain the same.

I haven't fully analyzed the issue, but I did find a few potential 
issues with what you're describing.

1. Since one direction remains the same when a computer roams, a new 
flow entry would probably fail to be added because of an existing entry 
in the flow hash table.

2. Even with that out of the way, the MTK hardware offload currently 
does not support matching the incoming switch/ethernet port.
So even if we manage to add an updated entry, the old entry could still 
be kept alive by the hardware.

The issues I found probably wouldn't cause connection hangs in pure L3 
software flow offload, since it will use the bridge device for xmit 
instead of its members. But since hardware offload needs to redirect 
traffic to individual bridge ports, it could cause connection hangs with 
stale flow entries.

There might be other issues as well, but this is what I could come up 
with on short notice. I think in order to properly address this, we 
should probably monitor for FDB / neigh entry changes somehow and clear 
affected flows.
Routes do not become stale in my scenario, so something else is needed 
to trigger flow entry removal.

- Felix
Pablo Neira Ayuso Oct. 17, 2024, 6:09 p.m. UTC | #10
On Thu, Oct 17, 2024 at 07:06:51PM +0200, Felix Fietkau wrote:
> On 17.10.24 14:39, Pablo Neira Ayuso wrote:
> > On Thu, Oct 17, 2024 at 11:17:09AM +0200, Felix Fietkau wrote:
> > [...]
> > > By the way, based on some reports that I received, I do believe that the
> > > existing forwarding fastpath also doesn't handle roaming properly.
> > > I just didn't have the time to properly look into that yet.
> > 
> > I think it should work for the existing forwarding fastpath.
> > 
> > - If computer roams from different port, packets follow classic path,
> >    then new flow entry is created. The flow old entry expires after 30
> >    seconds.
> > - If route is stale, flow entry is also removed.
> > 
> > Maybe I am missing another possible scenario?
> 
> I'm mainly talking about the scenario where a computer moves to a different
> switch port on L2 only, so all routes remain the same.
> 
> I haven't fully analyzed the issue, but I did find a few potential issues
> with what you're describing.
> 
> 1. Since one direction remains the same when a computer roams, a new flow
> entry would probably fail to be added because of an existing entry in the
> flow hash table.

I don't think so, hash includes iifidx.

> 2. Even with that out of the way, the MTK hardware offload currently does
> not support matching the incoming switch/ethernet port.
> So even if we manage to add an updated entry, the old entry could still be
> kept alive by the hardware.

OK, that means probably driver needs to address the lack of iifidx in
the matching by dealling with more than one single flow entry to point
to one single hardware entry (refcounting?).

> The issues I found probably wouldn't cause connection hangs in pure L3
> software flow offload, since it will use the bridge device for xmit instead
> of its members. But since hardware offload needs to redirect traffic to
> individual bridge ports, it could cause connection hangs with stale flow
> entries.

I would not expect a hang, packets will just flow over classic path
for a little while for the computer that is roaming until the new flow
entry is added.

> There might be other issues as well, but this is what I could come up with
> on short notice. I think in order to properly address this, we should
> probably monitor for FDB / neigh entry changes somehow and clear affected
> flows.
>
> Routes do not become stale in my scenario, so something else is needed to
> trigger flow entry removal.

Yes. In case letting expire stale flow entries with old iifidx is not enough
some other mechanism could be required.
Felix Fietkau Oct. 17, 2024, 6:39 p.m. UTC | #11
On 17.10.24 20:09, Pablo Neira Ayuso wrote:
> On Thu, Oct 17, 2024 at 07:06:51PM +0200, Felix Fietkau wrote:
>> On 17.10.24 14:39, Pablo Neira Ayuso wrote:
>> > On Thu, Oct 17, 2024 at 11:17:09AM +0200, Felix Fietkau wrote:
>> > [...]
>> > > By the way, based on some reports that I received, I do believe that the
>> > > existing forwarding fastpath also doesn't handle roaming properly.
>> > > I just didn't have the time to properly look into that yet.
>> > 
>> > I think it should work for the existing forwarding fastpath.
>> > 
>> > - If computer roams from different port, packets follow classic path,
>> >    then new flow entry is created. The flow old entry expires after 30
>> >    seconds.
>> > - If route is stale, flow entry is also removed.
>> > 
>> > Maybe I am missing another possible scenario?
>> 
>> I'm mainly talking about the scenario where a computer moves to a different
>> switch port on L2 only, so all routes remain the same.
>> 
>> I haven't fully analyzed the issue, but I did find a few potential issues
>> with what you're describing.
>> 
>> 1. Since one direction remains the same when a computer roams, a new flow
>> entry would probably fail to be added because of an existing entry in the
>> flow hash table.
> 
> I don't think so, hash includes iifidx.

I'm talking about the side where the input ifindex remains the same, but 
the output interface doesn't.

>> 2. Even with that out of the way, the MTK hardware offload currently does
>> not support matching the incoming switch/ethernet port.
>> So even if we manage to add an updated entry, the old entry could still be
>> kept alive by the hardware.
> 
> OK, that means probably driver needs to address the lack of iifidx in
> the matching by dealling with more than one single flow entry to point
> to one single hardware entry (refcounting?).

If we have multiple colliding entries, I think a more reasonable 
behavior would be allowing the newer flow to override the older one.

>> The issues I found probably wouldn't cause connection hangs in pure L3
>> software flow offload, since it will use the bridge device for xmit instead
>> of its members. But since hardware offload needs to redirect traffic to
>> individual bridge ports, it could cause connection hangs with stale flow
>> entries.
> 
> I would not expect a hang, packets will just flow over classic path
> for a little while for the computer that is roaming until the new flow
> entry is added.

If the hardware still handles traffic, but redirects it to the wrong 
destination port, the connection will hang.

- Felix