[iwl-next,v6,00/12] ice: switchdev bridge offload

Message ID 20230712110337.8030-1-wojciech.drewek@intel.com (mailing list archive)

Message

Wojciech Drewek July 12, 2023, 11:03 a.m. UTC
The Linux bridge can learn the MAC addresses and VLANs seen on its
ports. Learning creates FDB (forwarding database) entries, which can be
offloaded to the HW. By adding the VFs' port representors to the bridge
together with the uplink netdev, we can learn the VFs' and the link
partner's MAC addresses. This is done on the slow/exception path, where
packets that do not match any filter (FDB entries in this case) are
sent to the bridge ports.

The driver keeps track of the netdevs added to the bridge by listening
for the NETDEV_CHANGEUPPER event. We distinguish two types of bridge
ports: the uplink port and a VF's representor port. The Linux bridge
always learns the src MAC of a packet on the rx path. With the current
slow-path implementation, this means that we learn a VF's MAC on its
port repr (when the VF transmits a packet) and the link partner's MAC
on the uplink (when we receive a packet on the uplink from the LAN).
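
A toy Python model of the bookkeeping described above, not driver code.
The names (BridgeModel, changeupper, rx) and the netdev names are made up
for illustration. It only shows that ports are tracked per netdev and
that a source MAC is learned on the port where the packet was received.

```python
class BridgeModel:
    """Toy model: tracks bridge ports and learns source MACs on rx."""

    def __init__(self):
        self.ports = {}  # netdev name -> "uplink" or "vf_repr"
        self.fdb = {}    # MAC -> netdev name it was learned on

    def changeupper(self, netdev, port_type, linking):
        # Stand-in for a NETDEV_CHANGEUPPER event: linking=True means the
        # netdev was added to the bridge, False means it was removed.
        if linking:
            self.ports[netdev] = port_type
        else:
            self.ports.pop(netdev, None)
            # In this model, entries learned on a removed port are dropped.
            self.fdb = {m: p for m, p in self.fdb.items() if p != netdev}

    def rx(self, netdev, src_mac):
        # The bridge learns the source MAC on the receiving port.
        if netdev in self.ports:
            self.fdb[src_mac] = netdev


br = BridgeModel()
br.changeupper("uplink0", "uplink", True)
br.changeupper("vf0_repr", "vf_repr", True)
br.rx("vf0_repr", "aa:aa:aa:aa:aa:01")  # VF transmits: learned on its repr
br.rx("uplink0", "bb:bb:bb:bb:bb:01")   # LAN frame: learned on the uplink
```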

The driver is notified about a learned MAC/VLAN by
SWITCHDEV_FDB_{ADD|DEL}_TO_DEVICE events. A HW filter is then created.
The direction of the filter depends on the port type (uplink or VF
repr). For the uplink, the rule forwards packets to the LAN (matching
on the link partner's MAC). When the notification is received on a VF
repr, the rule forwards packets to the associated VF (matching on the
VF's MAC).
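
As a rough illustration of how the port type picks the rule direction,
here is a minimal Python sketch. ForwardRule and rule_for_fdb_add are
hypothetical names, not ice driver symbols, and matching on the
destination MAC is my reading of "matching on ... MAC" above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForwardRule:
    match_dst_mac: str  # assumption: the rule matches the destination MAC
    target: str         # "LAN" or the name of a VF


def rule_for_fdb_add(port_type, mac, vf_name=None):
    """Pick the forward rule for an FDB entry learned on the given port."""
    if port_type == "uplink":
        # Learned on the uplink: the MAC lives on the LAN side.
        return ForwardRule(match_dst_mac=mac, target="LAN")
    if port_type == "vf_repr":
        if vf_name is None:
            raise ValueError("a VF repr rule needs the associated VF")
        # Learned on a VF repr: the MAC belongs to that VF.
        return ForwardRule(match_dst_mac=mac, target=vf_name)
    raise ValueError("unknown port type: " + port_type)


uplink_rule = rule_for_fdb_add("uplink", "bb:bb:bb:bb:bb:01")
vf_rule = rule_for_fdb_add("vf_repr", "aa:aa:aa:aa:aa:01", vf_name="vf0")
```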

This approach would not work on its own, however. If one of the
directions is offloaded, the bridge is no longer able to learn the
other one. If the egress rule is added (learned on uplink), the
response from the VF is sent directly to the LAN. The packet does not
go through the slow path, so it is never seen on the VF's port repr.
Because of that, the bridge would not learn the VF's MAC.

This is solved by introducing a guard rule. It prevents the forward
rule from working until the opposite direction is offloaded.
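
One way to picture the guard rule is the toy Python model below. It is an
interpretation under stated assumptions, not the driver's implementation:
it assumes a packet may take the offloaded path only when its source MAC
is already offloaded too, so traffic from a not-yet-learned sender still
reaches the bridge. OffloadModel and its methods are hypothetical names.

```python
class OffloadModel:
    """Toy model of forward rules gated by a guard on the source MAC."""

    def __init__(self):
        self.offloaded = {}  # MAC -> forward target ("LAN" or a VF name)

    def fdb_add(self, mac, target):
        # Stand-in for handling an FDB add notification.
        self.offloaded[mac] = target

    def classify(self, src_mac, dst_mac):
        # Guard (assumption): an unknown source always goes to the slow
        # path, so the bridge gets a chance to learn it.
        if src_mac not in self.offloaded:
            return "slow-path"
        if dst_mac in self.offloaded:
            return "fast-path to " + self.offloaded[dst_mac]
        return "slow-path"


VF = "aa:aa:aa:aa:aa:01"
LP = "bb:bb:bb:bb:bb:01"

hw = OffloadModel()
hw.fdb_add(LP, "LAN")             # link partner learned on the uplink
first = hw.classify(VF, LP)       # VF's reply is still trapped
hw.fdb_add(VF, "vf0")             # bridge learned the VF on its port repr
vf_to_lan = hw.classify(VF, LP)   # both directions offloaded now
lan_to_vf = hw.classify(LP, VF)
```

In this model the first reply from the VF still takes the slow path even
though a forward rule for the link partner exists, which is the behaviour
the guard rule is meant to provide.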

Aging is not fully supported yet; the aging time is static for now.
Follow-up submissions will introduce counters that will allow us to
track whether a rule is actually being used.
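
A small Python sketch of what static aging means in practice, assuming
entries are dropped a fixed time after they were learned, with no
per-rule hit counters to extend their life. StaticAgingFdb, the 300
second timeout and the explicit `now` argument are all illustrative
choices, not values taken from the driver.

```python
class StaticAgingFdb:
    """Toy FDB whose entries expire a fixed time after being learned."""

    def __init__(self, ageing_time):
        self.ageing_time = ageing_time  # seconds, hypothetical value
        self.last_learned = {}          # MAC -> time it was last learned

    def learn(self, mac, now):
        # Learning (or re-learning) a MAC refreshes its timestamp.
        self.last_learned[mac] = now

    def sweep(self, now):
        # No usage counters: age is judged only by the learn timestamp.
        expired = [m for m, t in self.last_learned.items()
                   if now - t > self.ageing_time]
        for mac in expired:
            del self.last_learned[mac]
        return expired


fdb = StaticAgingFdb(ageing_time=300)
fdb.learn("aa:aa:aa:aa:aa:01", now=0)
fdb.learn("bb:bb:bb:bb:bb:01", now=200)
expired = fdb.sweep(now=301)
```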

A few fixes/changes are needed for this feature to work with the ice
driver. These are introduced in the first 5 patches.
---
v2: two patches were dropped from the series:
    - "ice: Remove exclusion code for RDMA+SRIOV" was sent as separate
      patch: https://lore.kernel.org/netdev/20230516113055.7336-1-wojciech.drewek@intel.com/
    - "ice: Ethtool fdb_cnt stats" was dropped because of the comments
      suggesting that ethtool is not a good option for such a statistic.
      An alternative will be sent as a separate patch.
v3: small changes in patches 5, 7 and 8 including kdoc, style fixes.
v4: split 1st patch in the series into 4 as Paul suggested
v5: drop "ice: Accept LAG netdevs in bridge offloads" patch,
    it will go with the LAG patchset. I kept dev_hold and dev_put since
    the discussion was not resolved
v6: resolve Vlad's comments: delete FDB entries associated with
    deleted vlan, add missing vlan_ops calls when clearing pvid

Marcin Szycik (2):
  ice: Add guard rule when creating FDB in switchdev
  ice: Add VLAN FDB support in switchdev mode

Michal Swiatkowski (2):
  ice: implement bridge port vlan
  ice: implement static version of ageing

Pawel Chmielewski (1):
  ice: add tracepoints for the switchdev bridge

Wojciech Drewek (7):
  ice: Skip adv rules removal upon switchdev release
  ice: Prohibit rx mode change in switchdev mode
  ice: Don't tx before switchdev is fully configured
  ice: Disable vlan pruning for uplink VSI
  ice: Unset src prune on uplink VSI
  ice: Implement basic eswitch bridge setup
  ice: Switchdev FDB events support

 drivers/net/ethernet/intel/ice/Makefile       |    2 +-
 drivers/net/ethernet/intel/ice/ice.h          |    5 +-
 drivers/net/ethernet/intel/ice/ice_eswitch.c  |   46 +-
 .../net/ethernet/intel/ice/ice_eswitch_br.c   | 1308 +++++++++++++++++
 .../net/ethernet/intel/ice/ice_eswitch_br.h   |  120 ++
 drivers/net/ethernet/intel/ice/ice_lib.c      |   25 +
 drivers/net/ethernet/intel/ice/ice_lib.h      |    1 +
 drivers/net/ethernet/intel/ice/ice_main.c     |    4 +-
 drivers/net/ethernet/intel/ice/ice_repr.c     |    2 +-
 drivers/net/ethernet/intel/ice/ice_repr.h     |    3 +-
 drivers/net/ethernet/intel/ice/ice_switch.c   |  150 +-
 drivers/net/ethernet/intel/ice/ice_switch.h   |    6 +-
 drivers/net/ethernet/intel/ice/ice_trace.h    |   90 ++
 drivers/net/ethernet/intel/ice/ice_type.h     |    1 +
 .../ethernet/intel/ice/ice_vf_vsi_vlan_ops.c  |  186 +--
 .../ethernet/intel/ice/ice_vf_vsi_vlan_ops.h  |    4 +
 .../net/ethernet/intel/ice/ice_vsi_vlan_lib.c |   84 +-
 .../net/ethernet/intel/ice/ice_vsi_vlan_lib.h |    8 +
 .../net/ethernet/intel/ice/ice_vsi_vlan_ops.h |    1 +
 19 files changed, 1860 insertions(+), 186 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_eswitch_br.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_eswitch_br.h

Comments

Vlad Buslov July 12, 2023, 4:47 p.m. UTC | #1

Reviewed-by: Vlad Buslov <vladbu@nvidia.com>