Message ID | 20210323102326.3677940-1-tobias@waldekranz.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] net: dsa: mv88e6xxx: Allow dynamic reconfiguration of tag protocol | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | warning | 1 maintainers not CCed: linux@armlinux.org.uk |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 76 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
On Tue, Mar 23, 2021 at 11:23:26AM +0100, Tobias Waldekranz wrote: > All devices are capable of using regular DSA tags. Support for > Ethertyped DSA tags sort into three categories: > > 1. No support. Older chips fall into this category. > > 2. Full support. Datasheet explicitly supports configuring the CPU > port to receive FORWARDs with a DSA tag. > > 3. Undocumented support. Datasheet lists the configuration from > category 2 as "reserved for future use", but does empirically > behave like a category 2 device. > > Because there are ethernet controllers that do not handle regular DSA > tags in all cases, it is sometimes preferable to rely on the > undocumented behavior, as the alternative is a very crippled > system. But, in those cases, make sure to log the fact that an > undocumented feature has been enabled. > > Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> > --- > > In a system using an NXP T1023 SoC connected to a 6390X switch, we > noticed that TO_CPU frames where not reaching the CPU. This only > happened on hardware port 8. Looking at the DSA master interface > (dpaa-ethernet) we could see that an Rx error counter was bumped at > the same rate. The logs indicated a parser error. > > It just so happens that a TO_CPU coming in on device 0, port 8, will > result in the first two bytes of the DSA tag being one of: > > 00 40 > 00 44 > 00 46 > > My guess is that since these values look like 802.3 length fields, the > controller's parser will signal an error if the frame length does not > match what is in the header. Interesting assumption. Could you please try this patch out, just for my amusement? It is only compile-tested. -----------------------------[ cut here ]----------------------------- >From ab75b63d1bfeccc3032060e6e6dbd2ea8f1d31ed Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <vladimir.oltean@nxp.com> Date: Tue, 23 Mar 2021 13:03:34 +0200 Subject: [PATCH] fsl/fman: ignore RX parse errors on ports that are DSA masters Tobias reports that when an FMan port receives a Marvell DSA tagged frame (normal/legacy tag, not EtherType tag) which is a TO_CPU frame coming in from device 0, port 8, that frame will be dropped. It appears that the first two bytes of this particular DSA tag (which overlap with what the FMan parser interprets as an EtherType/Length field) look like one of the possible values below: 00 40 00 44 00 46 Reported-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> --- .../net/ethernet/freescale/dpaa/dpaa_eth.c | 65 ++++++++++--------- .../net/ethernet/freescale/fman/fman_port.c | 8 ++- .../net/ethernet/freescale/fman/fman_port.h | 2 +- 3 files changed, 42 insertions(+), 33 deletions(-) diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c index 720dc99bd1fc..069d38cd63c5 100644 --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -55,6 +55,7 @@ #include <linux/phy_fixed.h> #include <linux/bpf.h> #include <linux/bpf_trace.h> +#include <net/dsa.h> #include <soc/fsl/bman.h> #include <soc/fsl/qman.h> #include "fman.h" @@ -2447,34 +2448,6 @@ static inline int dpaa_eth_napi_schedule(struct dpaa_percpu_priv *percpu_priv, return 0; } -static enum qman_cb_dqrr_result rx_error_dqrr(struct qman_portal *portal, - struct qman_fq *fq, - const struct qm_dqrr_entry *dq, - bool sched_napi) -{ - struct dpaa_fq *dpaa_fq = container_of(fq, struct dpaa_fq, fq_base); - struct dpaa_percpu_priv *percpu_priv; - struct net_device *net_dev; - struct dpaa_bp *dpaa_bp; - struct dpaa_priv *priv; - - net_dev = dpaa_fq->net_dev; - priv = netdev_priv(net_dev); - dpaa_bp = dpaa_bpid2pool(dq->fd.bpid); - if (!dpaa_bp) - return qman_cb_dqrr_consume; - - percpu_priv = this_cpu_ptr(priv->percpu_priv); - - if (dpaa_eth_napi_schedule(percpu_priv, portal, sched_napi)) - return qman_cb_dqrr_stop; - - dpaa_eth_refill_bpools(priv); - dpaa_rx_error(net_dev, priv, percpu_priv, &dq->fd, fq->fqid); - - return qman_cb_dqrr_consume; -} - static int dpaa_xdp_xmit_frame(struct net_device *net_dev, struct xdp_frame *xdpf) { @@ -2699,7 +2672,7 @@ static enum qman_cb_dqrr_result rx_default_dqrr(struct qman_portal *portal, return qman_cb_dqrr_consume; } - if (unlikely(fd_status & FM_FD_STAT_RX_ERRORS) != 0) { + if (!netdev_uses_dsa(net_dev) && unlikely(fd_status & FM_FD_STAT_RX_ERRORS) != 0) { if (net_ratelimit()) netif_warn(priv, hw, net_dev, "FD status = 0x%08x\n", fd_status & FM_FD_STAT_RX_ERRORS); @@ -2802,6 +2775,37 @@ static enum qman_cb_dqrr_result rx_default_dqrr(struct qman_portal *portal, return qman_cb_dqrr_consume; } +static enum qman_cb_dqrr_result rx_error_dqrr(struct qman_portal *portal, + struct qman_fq *fq, + const struct qm_dqrr_entry *dq, + bool sched_napi) +{ + struct dpaa_fq *dpaa_fq = container_of(fq, struct dpaa_fq, fq_base); + struct dpaa_percpu_priv *percpu_priv; + struct net_device *net_dev; + struct dpaa_bp *dpaa_bp; + struct dpaa_priv *priv; + + net_dev = dpaa_fq->net_dev; + if (netdev_uses_dsa(net_dev)) + return rx_default_dqrr(portal, fq, dq, sched_napi); + + priv = netdev_priv(net_dev); + dpaa_bp = dpaa_bpid2pool(dq->fd.bpid); + if (!dpaa_bp) + return qman_cb_dqrr_consume; + + percpu_priv = this_cpu_ptr(priv->percpu_priv); + + if (dpaa_eth_napi_schedule(percpu_priv, portal, sched_napi)) + return qman_cb_dqrr_stop; + + dpaa_eth_refill_bpools(priv); + dpaa_rx_error(net_dev, priv, percpu_priv, &dq->fd, fq->fqid); + + return qman_cb_dqrr_consume; +} + static enum qman_cb_dqrr_result conf_error_dqrr(struct qman_portal *portal, struct qman_fq *fq, const struct qm_dqrr_entry *dq, @@ -2955,6 +2959,7 @@ static int dpaa_phy_init(struct net_device *net_dev) static int dpaa_open(struct net_device *net_dev) { + bool ignore_errors = netdev_uses_dsa(net_dev); struct mac_device *mac_dev; struct dpaa_priv *priv; int err, i; @@ -2968,7 +2973,7 @@ static int dpaa_open(struct net_device *net_dev) goto phy_init_failed; for (i = 0; i < ARRAY_SIZE(mac_dev->port); i++) { - err = fman_port_enable(mac_dev->port[i]); + err = fman_port_enable(mac_dev->port[i], ignore_errors); if (err) goto mac_start_failed; } diff --git a/drivers/net/ethernet/freescale/fman/fman_port.c b/drivers/net/ethernet/freescale/fman/fman_port.c index d9baac0dbc7d..763faec11f5c 100644 --- a/drivers/net/ethernet/freescale/fman/fman_port.c +++ b/drivers/net/ethernet/freescale/fman/fman_port.c @@ -106,6 +106,7 @@ #define BMI_EBD_EN 0x80000000 #define BMI_PORT_CFG_EN 0x80000000 +#define BMI_PORT_CFG_FDOVR 0x02000000 #define BMI_PORT_STATUS_BSY 0x80000000 @@ -1629,7 +1630,7 @@ int fman_port_disable(struct fman_port *port) } /* Disable BMI */ - tmp = ioread32be(bmi_cfg_reg) & ~BMI_PORT_CFG_EN; + tmp = ioread32be(bmi_cfg_reg) & ~(BMI_PORT_CFG_EN | BMI_PORT_CFG_FDOVR); iowrite32be(tmp, bmi_cfg_reg); /* Wait for graceful stop end */ @@ -1655,6 +1656,7 @@ EXPORT_SYMBOL(fman_port_disable); /** * fman_port_enable * @port: A pointer to a FM Port module. + * @ignore_errors: If set, do not discard frames received with errors. * * A runtime routine provided to allow disable/enable of port. * @@ -1662,7 +1664,7 @@ EXPORT_SYMBOL(fman_port_disable); * * Return: 0 on success; Error code otherwise. */ -int fman_port_enable(struct fman_port *port) +int fman_port_enable(struct fman_port *port, bool ignore_errors) { u32 __iomem *bmi_cfg_reg; u32 tmp; @@ -1692,6 +1694,8 @@ int fman_port_enable(struct fman_port *port) /* Enable BMI */ tmp = ioread32be(bmi_cfg_reg) | BMI_PORT_CFG_EN; + if (ignore_errors) + tmp |= BMI_PORT_CFG_FDOVR; iowrite32be(tmp, bmi_cfg_reg); return 0; diff --git a/drivers/net/ethernet/freescale/fman/fman_port.h b/drivers/net/ethernet/freescale/fman/fman_port.h index 82f12661a46d..0928361b0e73 100644 --- a/drivers/net/ethernet/freescale/fman/fman_port.h +++ b/drivers/net/ethernet/freescale/fman/fman_port.h @@ -147,7 +147,7 @@ int fman_port_cfg_buf_prefix_content(struct fman_port *port, int fman_port_disable(struct fman_port *port); -int fman_port_enable(struct fman_port *port); +int fman_port_enable(struct fman_port *port, bool ignore_errors); u32 fman_port_get_qman_channel_id(struct fman_port *port); -----------------------------[ cut here ]----------------------------- The netdev_uses_dsa thing is a bit trashy, I think that a more polished version should rather set NETIF_F_RXALL for the DSA master, and have the dpaa driver act upon that. But first I'm curious if it works. > > As a workaround, switching to EDSA (thereby always having a proper > EtherType in the frame) solves the issue. So basically every user needs to change the tag protocol manually to be able to receive from port 8? Not sure if that's too friendly. > drivers/net/dsa/mv88e6xxx/chip.c | 41 +++++++++++++++++++++++++++++--- > drivers/net/dsa/mv88e6xxx/chip.h | 3 +++ > 2 files changed, 41 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > index 95f07fcd4f85..e7ec883d5f6b 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.c > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > @@ -2531,10 +2531,10 @@ static int mv88e6xxx_setup_port_mode(struct mv88e6xxx_chip *chip, int port) > return mv88e6xxx_set_port_mode_normal(chip, port); > > /* Setup CPU port mode depending on its supported tag format */ > - if (chip->info->tag_protocol == DSA_TAG_PROTO_DSA) > + if (chip->tag_protocol == DSA_TAG_PROTO_DSA) > return mv88e6xxx_set_port_mode_dsa(chip, port); > > - if (chip->info->tag_protocol == DSA_TAG_PROTO_EDSA) > + if (chip->tag_protocol == DSA_TAG_PROTO_EDSA) > return mv88e6xxx_set_port_mode_edsa(chip, port); > > return -EINVAL; > @@ -5564,7 +5564,39 @@ static enum dsa_tag_protocol mv88e6xxx_get_tag_protocol(struct dsa_switch *ds, > { > struct mv88e6xxx_chip *chip = ds->priv; > > - return chip->info->tag_protocol; > + return chip->tag_protocol; > +} > + > +static int mv88e6xxx_change_tag_protocol(struct dsa_switch *ds, int port, > + enum dsa_tag_protocol proto) > +{ > + struct mv88e6xxx_chip *chip = ds->priv; > + enum dsa_tag_protocol old_protocol; > + int err; > + > + switch (proto) { > + case DSA_TAG_PROTO_EDSA: > + if (chip->info->tag_protocol != DSA_TAG_PROTO_EDSA) > + dev_warn(chip->dev, "Relying on undocumented EDSA tagging behavior\n"); > + > + break; > + case DSA_TAG_PROTO_DSA: > + break; > + default: > + return -EPROTONOSUPPORT; > + } > + > + old_protocol = chip->tag_protocol; > + chip->tag_protocol = proto; > + > + mv88e6xxx_reg_lock(chip); > + err = mv88e6xxx_setup_port_mode(chip, port); > + mv88e6xxx_reg_unlock(chip); > + > + if (err) > + chip->tag_protocol = old_protocol; > + > + return err; > } > > static int mv88e6xxx_port_mdb_add(struct dsa_switch *ds, int port, > @@ -6029,6 +6061,7 @@ static int mv88e6xxx_crosschip_lag_leave(struct dsa_switch *ds, int sw_index, > > static const struct dsa_switch_ops mv88e6xxx_switch_ops = { > .get_tag_protocol = mv88e6xxx_get_tag_protocol, > + .change_tag_protocol = mv88e6xxx_change_tag_protocol, > .setup = mv88e6xxx_setup, > .teardown = mv88e6xxx_teardown, > .phylink_validate = mv88e6xxx_validate, > @@ -6209,6 +6242,8 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev) > if (err) > goto out; > > + chip->tag_protocol = chip->info->tag_protocol; > + > mv88e6xxx_phy_init(chip); > > if (chip->info->ops->get_eeprom) { > diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h > index bce6e0dc8535..96b775f3fda2 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.h > +++ b/drivers/net/dsa/mv88e6xxx/chip.h > @@ -261,6 +261,9 @@ struct mv88e6xxx_region_priv { > struct mv88e6xxx_chip { > const struct mv88e6xxx_info *info; > > + /* Currently configured tagging protocol */ > + enum dsa_tag_protocol tag_protocol; > + > /* The dsa_switch this private structure is related to */ > struct dsa_switch *ds; > > -- > 2.25.1 >
On Tue, Mar 23, 2021 at 01:35:22PM +0200, Vladimir Oltean wrote: > On Tue, Mar 23, 2021 at 11:23:26AM +0100, Tobias Waldekranz wrote: > > All devices are capable of using regular DSA tags. Support for > > Ethertyped DSA tags sort into three categories: > > > > 1. No support. Older chips fall into this category. > > > > 2. Full support. Datasheet explicitly supports configuring the CPU > > port to receive FORWARDs with a DSA tag. > > > > 3. Undocumented support. Datasheet lists the configuration from > > category 2 as "reserved for future use", but does empirically > > behave like a category 2 device. > > > > Because there are ethernet controllers that do not handle regular DSA > > tags in all cases, it is sometimes preferable to rely on the > > undocumented behavior, as the alternative is a very crippled > > system. But, in those cases, make sure to log the fact that an > > undocumented feature has been enabled. > > > > Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> > > --- > > > > In a system using an NXP T1023 SoC connected to a 6390X switch, we > > noticed that TO_CPU frames where not reaching the CPU. This only > > happened on hardware port 8. Looking at the DSA master interface > > (dpaa-ethernet) we could see that an Rx error counter was bumped at > > the same rate. The logs indicated a parser error. > > > > It just so happens that a TO_CPU coming in on device 0, port 8, will > > result in the first two bytes of the DSA tag being one of: > > > > 00 40 > > 00 44 > > 00 46 > > > > My guess is that since these values look like 802.3 length fields, the > > controller's parser will signal an error if the frame length does not > > match what is in the header. > > Interesting assumption. > Could you please try this patch out, just for my amusement? It is only > compile-tested. Another thing you could try, just for amusement, is change the Ethertype in EDSA to 00400044 and see if it also causes problems. It might not, since the last two bytes are not set as required. Andrew
On Tue, Mar 23, 2021 at 11:23:26AM +0100, Tobias Waldekranz wrote: > All devices are capable of using regular DSA tags. Support for > Ethertyped DSA tags sort into three categories: > > 1. No support. Older chips fall into this category. > > 2. Full support. Datasheet explicitly supports configuring the CPU > port to receive FORWARDs with a DSA tag. > > 3. Undocumented support. Datasheet lists the configuration from > category 2 as "reserved for future use", but does empirically > behave like a category 2 device. > > Because there are ethernet controllers that do not handle regular DSA > tags in all cases, it is sometimes preferable to rely on the > undocumented behavior, as the alternative is a very crippled > system. But, in those cases, make sure to log the fact that an > undocumented feature has been enabled. Hi Tobias I wonder if dynamic reconfiguration is the correct solution here. By default it will be wrong for this board, and you need user space to flip it. Maybe a DT property would be better. Extend dsa_switch_parse_of() to look for the optional property dsa,tag-protocol, a string containing the name of the tag ops to be used. Andrew
On Tue, Mar 23, 2021 at 13:35, Vladimir Oltean <olteanv@gmail.com> wrote: > On Tue, Mar 23, 2021 at 11:23:26AM +0100, Tobias Waldekranz wrote: >> All devices are capable of using regular DSA tags. Support for >> Ethertyped DSA tags sort into three categories: >> >> 1. No support. Older chips fall into this category. >> >> 2. Full support. Datasheet explicitly supports configuring the CPU >> port to receive FORWARDs with a DSA tag. >> >> 3. Undocumented support. Datasheet lists the configuration from >> category 2 as "reserved for future use", but does empirically >> behave like a category 2 device. >> >> Because there are ethernet controllers that do not handle regular DSA >> tags in all cases, it is sometimes preferable to rely on the >> undocumented behavior, as the alternative is a very crippled >> system. But, in those cases, make sure to log the fact that an >> undocumented feature has been enabled. >> >> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> >> --- >> >> In a system using an NXP T1023 SoC connected to a 6390X switch, we >> noticed that TO_CPU frames where not reaching the CPU. This only >> happened on hardware port 8. Looking at the DSA master interface >> (dpaa-ethernet) we could see that an Rx error counter was bumped at >> the same rate. The logs indicated a parser error. >> >> It just so happens that a TO_CPU coming in on device 0, port 8, will >> result in the first two bytes of the DSA tag being one of: >> >> 00 40 >> 00 44 >> 00 46 >> >> My guess is that since these values look like 802.3 length fields, the >> controller's parser will signal an error if the frame length does not >> match what is in the header. > > Interesting assumption. > Could you please try this patch out, just for my amusement? It is only > compile-tested. > > -----------------------------[ cut here ]----------------------------- > From ab75b63d1bfeccc3032060e6e6dbd2ea8f1d31ed Mon Sep 17 00:00:00 2001 > From: Vladimir Oltean <vladimir.oltean@nxp.com> > Date: Tue, 23 Mar 2021 13:03:34 +0200 > Subject: [PATCH] fsl/fman: ignore RX parse errors on ports that are DSA > masters > > Tobias reports that when an FMan port receives a Marvell DSA tagged > frame (normal/legacy tag, not EtherType tag) which is a TO_CPU frame > coming in from device 0, port 8, that frame will be dropped. > > It appears that the first two bytes of this particular DSA tag (which > overlap with what the FMan parser interprets as an EtherType/Length > field) look like one of the possible values below: > > 00 40 > 00 44 > 00 46 > > Reported-by: Tobias Waldekranz <tobias@waldekranz.com> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> > --- > .../net/ethernet/freescale/dpaa/dpaa_eth.c | 65 ++++++++++--------- > .../net/ethernet/freescale/fman/fman_port.c | 8 ++- > .../net/ethernet/freescale/fman/fman_port.h | 2 +- > 3 files changed, 42 insertions(+), 33 deletions(-) > > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c > index 720dc99bd1fc..069d38cd63c5 100644 > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c > @@ -55,6 +55,7 @@ > #include <linux/phy_fixed.h> > #include <linux/bpf.h> > #include <linux/bpf_trace.h> > +#include <net/dsa.h> > #include <soc/fsl/bman.h> > #include <soc/fsl/qman.h> > #include "fman.h" > @@ -2447,34 +2448,6 @@ static inline int dpaa_eth_napi_schedule(struct dpaa_percpu_priv *percpu_priv, > return 0; > } > > -static enum qman_cb_dqrr_result rx_error_dqrr(struct qman_portal *portal, > - struct qman_fq *fq, > - const struct qm_dqrr_entry *dq, > - bool sched_napi) > -{ > - struct dpaa_fq *dpaa_fq = container_of(fq, struct dpaa_fq, fq_base); > - struct dpaa_percpu_priv *percpu_priv; > - struct net_device *net_dev; > - struct dpaa_bp *dpaa_bp; > - struct dpaa_priv *priv; > - > - net_dev = dpaa_fq->net_dev; > - priv = netdev_priv(net_dev); > - dpaa_bp = dpaa_bpid2pool(dq->fd.bpid); > - if (!dpaa_bp) > - return qman_cb_dqrr_consume; > - > - percpu_priv = this_cpu_ptr(priv->percpu_priv); > - > - if (dpaa_eth_napi_schedule(percpu_priv, portal, sched_napi)) > - return qman_cb_dqrr_stop; > - > - dpaa_eth_refill_bpools(priv); > - dpaa_rx_error(net_dev, priv, percpu_priv, &dq->fd, fq->fqid); > - > - return qman_cb_dqrr_consume; > -} > - > static int dpaa_xdp_xmit_frame(struct net_device *net_dev, > struct xdp_frame *xdpf) > { > @@ -2699,7 +2672,7 @@ static enum qman_cb_dqrr_result rx_default_dqrr(struct qman_portal *portal, > return qman_cb_dqrr_consume; > } > > - if (unlikely(fd_status & FM_FD_STAT_RX_ERRORS) != 0) { > + if (!netdev_uses_dsa(net_dev) && unlikely(fd_status & FM_FD_STAT_RX_ERRORS) != 0) { > if (net_ratelimit()) > netif_warn(priv, hw, net_dev, "FD status = 0x%08x\n", > fd_status & FM_FD_STAT_RX_ERRORS); > @@ -2802,6 +2775,37 @@ static enum qman_cb_dqrr_result rx_default_dqrr(struct qman_portal *portal, > return qman_cb_dqrr_consume; > } > > +static enum qman_cb_dqrr_result rx_error_dqrr(struct qman_portal *portal, > + struct qman_fq *fq, > + const struct qm_dqrr_entry *dq, > + bool sched_napi) > +{ > + struct dpaa_fq *dpaa_fq = container_of(fq, struct dpaa_fq, fq_base); > + struct dpaa_percpu_priv *percpu_priv; > + struct net_device *net_dev; > + struct dpaa_bp *dpaa_bp; > + struct dpaa_priv *priv; > + > + net_dev = dpaa_fq->net_dev; > + if (netdev_uses_dsa(net_dev)) > + return rx_default_dqrr(portal, fq, dq, sched_napi); > + > + priv = netdev_priv(net_dev); > + dpaa_bp = dpaa_bpid2pool(dq->fd.bpid); > + if (!dpaa_bp) > + return qman_cb_dqrr_consume; > + > + percpu_priv = this_cpu_ptr(priv->percpu_priv); > + > + if (dpaa_eth_napi_schedule(percpu_priv, portal, sched_napi)) > + return qman_cb_dqrr_stop; > + > + dpaa_eth_refill_bpools(priv); > + dpaa_rx_error(net_dev, priv, percpu_priv, &dq->fd, fq->fqid); > + > + return qman_cb_dqrr_consume; > +} > + > static enum qman_cb_dqrr_result conf_error_dqrr(struct qman_portal *portal, > struct qman_fq *fq, > const struct qm_dqrr_entry *dq, > @@ -2955,6 +2959,7 @@ static int dpaa_phy_init(struct net_device *net_dev) > > static int dpaa_open(struct net_device *net_dev) > { > + bool ignore_errors = netdev_uses_dsa(net_dev); > struct mac_device *mac_dev; > struct dpaa_priv *priv; > int err, i; > @@ -2968,7 +2973,7 @@ static int dpaa_open(struct net_device *net_dev) > goto phy_init_failed; > > for (i = 0; i < ARRAY_SIZE(mac_dev->port); i++) { > - err = fman_port_enable(mac_dev->port[i]); > + err = fman_port_enable(mac_dev->port[i], ignore_errors); > if (err) > goto mac_start_failed; > } > diff --git a/drivers/net/ethernet/freescale/fman/fman_port.c b/drivers/net/ethernet/freescale/fman/fman_port.c > index d9baac0dbc7d..763faec11f5c 100644 > --- a/drivers/net/ethernet/freescale/fman/fman_port.c > +++ b/drivers/net/ethernet/freescale/fman/fman_port.c > @@ -106,6 +106,7 @@ > #define BMI_EBD_EN 0x80000000 > > #define BMI_PORT_CFG_EN 0x80000000 > +#define BMI_PORT_CFG_FDOVR 0x02000000 > > #define BMI_PORT_STATUS_BSY 0x80000000 > > @@ -1629,7 +1630,7 @@ int fman_port_disable(struct fman_port *port) > } > > /* Disable BMI */ > - tmp = ioread32be(bmi_cfg_reg) & ~BMI_PORT_CFG_EN; > + tmp = ioread32be(bmi_cfg_reg) & ~(BMI_PORT_CFG_EN | BMI_PORT_CFG_FDOVR); > iowrite32be(tmp, bmi_cfg_reg); > > /* Wait for graceful stop end */ > @@ -1655,6 +1656,7 @@ EXPORT_SYMBOL(fman_port_disable); > /** > * fman_port_enable > * @port: A pointer to a FM Port module. > + * @ignore_errors: If set, do not discard frames received with errors. > * > * A runtime routine provided to allow disable/enable of port. > * > @@ -1662,7 +1664,7 @@ EXPORT_SYMBOL(fman_port_disable); > * > * Return: 0 on success; Error code otherwise. > */ > -int fman_port_enable(struct fman_port *port) > +int fman_port_enable(struct fman_port *port, bool ignore_errors) > { > u32 __iomem *bmi_cfg_reg; > u32 tmp; > @@ -1692,6 +1694,8 @@ int fman_port_enable(struct fman_port *port) > > /* Enable BMI */ > tmp = ioread32be(bmi_cfg_reg) | BMI_PORT_CFG_EN; > + if (ignore_errors) > + tmp |= BMI_PORT_CFG_FDOVR; > iowrite32be(tmp, bmi_cfg_reg); > > return 0; > diff --git a/drivers/net/ethernet/freescale/fman/fman_port.h b/drivers/net/ethernet/freescale/fman/fman_port.h > index 82f12661a46d..0928361b0e73 100644 > --- a/drivers/net/ethernet/freescale/fman/fman_port.h > +++ b/drivers/net/ethernet/freescale/fman/fman_port.h > @@ -147,7 +147,7 @@ int fman_port_cfg_buf_prefix_content(struct fman_port *port, > > int fman_port_disable(struct fman_port *port); > > -int fman_port_enable(struct fman_port *port); > +int fman_port_enable(struct fman_port *port, bool ignore_errors); > > u32 fman_port_get_qman_channel_id(struct fman_port *port); > > -----------------------------[ cut here ]----------------------------- > > The netdev_uses_dsa thing is a bit trashy, I think that a more polished > version should rather set NETIF_F_RXALL for the DSA master, and have the > dpaa driver act upon that. But first I'm curious if it works. It does work. Thank you! Does setting RXALL mean that the master would accept frames with a bad FCS as well? If so, would that mean that we would have to verify it in software? >> >> As a workaround, switching to EDSA (thereby always having a proper >> EtherType in the frame) solves the issue. > > So basically every user needs to change the tag protocol manually to be > able to receive from port 8? Not sure if that's too friendly. No it is not friendly at all. My goal was to add it as a device-tree property, but for reasons I will detail in my answer to Andrew, I did not manage to figure out a good way to do that. Happy to take suggestions. >> drivers/net/dsa/mv88e6xxx/chip.c | 41 +++++++++++++++++++++++++++++--- >> drivers/net/dsa/mv88e6xxx/chip.h | 3 +++ >> 2 files changed, 41 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c >> index 95f07fcd4f85..e7ec883d5f6b 100644 >> --- a/drivers/net/dsa/mv88e6xxx/chip.c >> +++ b/drivers/net/dsa/mv88e6xxx/chip.c >> @@ -2531,10 +2531,10 @@ static int mv88e6xxx_setup_port_mode(struct mv88e6xxx_chip *chip, int port) >> return mv88e6xxx_set_port_mode_normal(chip, port); >> >> /* Setup CPU port mode depending on its supported tag format */ >> - if (chip->info->tag_protocol == DSA_TAG_PROTO_DSA) >> + if (chip->tag_protocol == DSA_TAG_PROTO_DSA) >> return mv88e6xxx_set_port_mode_dsa(chip, port); >> >> - if (chip->info->tag_protocol == DSA_TAG_PROTO_EDSA) >> + if (chip->tag_protocol == DSA_TAG_PROTO_EDSA) >> return mv88e6xxx_set_port_mode_edsa(chip, port); >> >> return -EINVAL; >> @@ -5564,7 +5564,39 @@ static enum dsa_tag_protocol mv88e6xxx_get_tag_protocol(struct dsa_switch *ds, >> { >> struct mv88e6xxx_chip *chip = ds->priv; >> >> - return chip->info->tag_protocol; >> + return chip->tag_protocol; >> +} >> + >> +static int mv88e6xxx_change_tag_protocol(struct dsa_switch *ds, int port, >> + enum dsa_tag_protocol proto) >> +{ >> + struct mv88e6xxx_chip *chip = ds->priv; >> + enum dsa_tag_protocol old_protocol; >> + int err; >> + >> + switch (proto) { >> + case DSA_TAG_PROTO_EDSA: >> + if (chip->info->tag_protocol != DSA_TAG_PROTO_EDSA) >> + dev_warn(chip->dev, "Relying on undocumented EDSA tagging behavior\n"); >> + >> + break; >> + case DSA_TAG_PROTO_DSA: >> + break; >> + default: >> + return -EPROTONOSUPPORT; >> + } >> + >> + old_protocol = chip->tag_protocol; >> + chip->tag_protocol = proto; >> + >> + mv88e6xxx_reg_lock(chip); >> + err = mv88e6xxx_setup_port_mode(chip, port); >> + mv88e6xxx_reg_unlock(chip); >> + >> + if (err) >> + chip->tag_protocol = old_protocol; >> + >> + return err; >> } >> >> static int mv88e6xxx_port_mdb_add(struct dsa_switch *ds, int port, >> @@ -6029,6 +6061,7 @@ static int mv88e6xxx_crosschip_lag_leave(struct dsa_switch *ds, int sw_index, >> >> static const struct dsa_switch_ops mv88e6xxx_switch_ops = { >> .get_tag_protocol = mv88e6xxx_get_tag_protocol, >> + .change_tag_protocol = mv88e6xxx_change_tag_protocol, >> .setup = mv88e6xxx_setup, >> .teardown = mv88e6xxx_teardown, >> .phylink_validate = mv88e6xxx_validate, >> @@ -6209,6 +6242,8 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev) >> if (err) >> goto out; >> >> + chip->tag_protocol = chip->info->tag_protocol; >> + >> mv88e6xxx_phy_init(chip); >> >> if (chip->info->ops->get_eeprom) { >> diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h >> index bce6e0dc8535..96b775f3fda2 100644 >> --- a/drivers/net/dsa/mv88e6xxx/chip.h >> +++ b/drivers/net/dsa/mv88e6xxx/chip.h >> @@ -261,6 +261,9 @@ struct mv88e6xxx_region_priv { >> struct mv88e6xxx_chip { >> const struct mv88e6xxx_info *info; >> >> + /* Currently configured tagging protocol */ >> + enum dsa_tag_protocol tag_protocol; >> + >> /* The dsa_switch this private structure is related to */ >> struct dsa_switch *ds; >> >> -- >> 2.25.1 >>
On Tue, Mar 23, 2021 at 13:41, Andrew Lunn <andrew@lunn.ch> wrote: > On Tue, Mar 23, 2021 at 11:23:26AM +0100, Tobias Waldekranz wrote: >> All devices are capable of using regular DSA tags. Support for >> Ethertyped DSA tags sort into three categories: >> >> 1. No support. Older chips fall into this category. >> >> 2. Full support. Datasheet explicitly supports configuring the CPU >> port to receive FORWARDs with a DSA tag. >> >> 3. Undocumented support. Datasheet lists the configuration from >> category 2 as "reserved for future use", but does empirically >> behave like a category 2 device. >> >> Because there are ethernet controllers that do not handle regular DSA >> tags in all cases, it is sometimes preferable to rely on the >> undocumented behavior, as the alternative is a very crippled >> system. But, in those cases, make sure to log the fact that an >> undocumented feature has been enabled. > > Hi Tobias > > I wonder if dynamic reconfiguration is the correct solution here. By > default it will be wrong for this board, and you need user space to > flip it. > > Maybe a DT property would be better. Extend dsa_switch_parse_of() to > look for the optional property dsa,tag-protocol, a string containing > the name of the tag ops to be used. This was my initial approach. It gets quite messy though. Since taggers can be modules, there is no way of knowing if a supplied protocol name is garbage ("asdf"), or just part of a module in an initrd that is not loaded yet when you are probing the tree. Even when the tagger is available, there is no way to verify if the driver is compatible with it. So I think we would have to: - Keep the list of protcol names compiled in with the DSA module, such that "edsa" can be resolved to DSA_TAG_PROTO_EDSA without having the tagger module loaded. - Add (yet) another op so that we can ask the driver if the given protocol is acceptable. Calling .change_tag_protocol will not work as drivers will assume that the driver's .setup has already executed before it is called. - Have each driver check (during .setup?) if it should configure the device to use its preferred protocol or if the user has specified something else. That felt like a lot to take on board just to solve a corner case like this. I am happy to be told that there is a much easier way to do it, or that the above would be acceptable if there isn't one.
On 3/23/2021 7:48 AM, Tobias Waldekranz wrote: > On Tue, Mar 23, 2021 at 13:35, Vladimir Oltean <olteanv@gmail.com> wrote: >> On Tue, Mar 23, 2021 at 11:23:26AM +0100, Tobias Waldekranz wrote: >>> All devices are capable of using regular DSA tags. Support for >>> Ethertyped DSA tags sort into three categories: >>> >>> 1. No support. Older chips fall into this category. >>> >>> 2. Full support. Datasheet explicitly supports configuring the CPU >>> port to receive FORWARDs with a DSA tag. >>> >>> 3. Undocumented support. Datasheet lists the configuration from >>> category 2 as "reserved for future use", but does empirically >>> behave like a category 2 device. >>> >>> Because there are ethernet controllers that do not handle regular DSA >>> tags in all cases, it is sometimes preferable to rely on the >>> undocumented behavior, as the alternative is a very crippled >>> system. But, in those cases, make sure to log the fact that an >>> undocumented feature has been enabled. >>> >>> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> >>> --- >>> >>> In a system using an NXP T1023 SoC connected to a 6390X switch, we >>> noticed that TO_CPU frames where not reaching the CPU. This only >>> happened on hardware port 8. Looking at the DSA master interface >>> (dpaa-ethernet) we could see that an Rx error counter was bumped at >>> the same rate. The logs indicated a parser error. >>> >>> It just so happens that a TO_CPU coming in on device 0, port 8, will >>> result in the first two bytes of the DSA tag being one of: >>> >>> 00 40 >>> 00 44 >>> 00 46 >>> >>> My guess is that since these values look like 802.3 length fields, the >>> controller's parser will signal an error if the frame length does not >>> match what is in the header. >> >> Interesting assumption. >> Could you please try this patch out, just for my amusement? It is only >> compile-tested. >> >> -----------------------------[ cut here ]----------------------------- >> From ab75b63d1bfeccc3032060e6e6dbd2ea8f1d31ed Mon Sep 17 00:00:00 2001 >> From: Vladimir Oltean <vladimir.oltean@nxp.com> >> Date: Tue, 23 Mar 2021 13:03:34 +0200 >> Subject: [PATCH] fsl/fman: ignore RX parse errors on ports that are DSA >> masters >> >> Tobias reports that when an FMan port receives a Marvell DSA tagged >> frame (normal/legacy tag, not EtherType tag) which is a TO_CPU frame >> coming in from device 0, port 8, that frame will be dropped. >> >> It appears that the first two bytes of this particular DSA tag (which >> overlap with what the FMan parser interprets as an EtherType/Length >> field) look like one of the possible values below: >> >> 00 40 >> 00 44 >> 00 46 >> >> Reported-by: Tobias Waldekranz <tobias@waldekranz.com> >> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> >> --- >> .../net/ethernet/freescale/dpaa/dpaa_eth.c | 65 ++++++++++--------- >> .../net/ethernet/freescale/fman/fman_port.c | 8 ++- >> .../net/ethernet/freescale/fman/fman_port.h | 2 +- >> 3 files changed, 42 insertions(+), 33 deletions(-) >> >> diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c >> index 720dc99bd1fc..069d38cd63c5 100644 >> --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c >> +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c >> @@ -55,6 +55,7 @@ >> #include <linux/phy_fixed.h> >> #include <linux/bpf.h> >> #include <linux/bpf_trace.h> >> +#include <net/dsa.h> >> #include <soc/fsl/bman.h> >> #include <soc/fsl/qman.h> >> #include "fman.h" >> @@ -2447,34 +2448,6 @@ static inline int dpaa_eth_napi_schedule(struct dpaa_percpu_priv *percpu_priv, >> return 0; >> } >> >> -static enum qman_cb_dqrr_result rx_error_dqrr(struct qman_portal *portal, >> - struct qman_fq *fq, >> - const struct qm_dqrr_entry *dq, >> - bool sched_napi) >> -{ >> - struct dpaa_fq *dpaa_fq = container_of(fq, struct dpaa_fq, fq_base); >> - struct dpaa_percpu_priv *percpu_priv; >> - struct net_device *net_dev; >> - struct dpaa_bp *dpaa_bp; >> - struct dpaa_priv *priv; >> - >> - net_dev = dpaa_fq->net_dev; >> - priv = netdev_priv(net_dev); >> - dpaa_bp = dpaa_bpid2pool(dq->fd.bpid); >> - if (!dpaa_bp) >> - return qman_cb_dqrr_consume; >> - >> - percpu_priv = this_cpu_ptr(priv->percpu_priv); >> - >> - if (dpaa_eth_napi_schedule(percpu_priv, portal, sched_napi)) >> - return qman_cb_dqrr_stop; >> - >> - dpaa_eth_refill_bpools(priv); >> - dpaa_rx_error(net_dev, priv, percpu_priv, &dq->fd, fq->fqid); >> - >> - return qman_cb_dqrr_consume; >> -} >> - >> static int dpaa_xdp_xmit_frame(struct net_device *net_dev, >> struct xdp_frame *xdpf) >> { >> @@ -2699,7 +2672,7 @@ static enum qman_cb_dqrr_result rx_default_dqrr(struct qman_portal *portal, >> return qman_cb_dqrr_consume; >> } >> >> - if (unlikely(fd_status & FM_FD_STAT_RX_ERRORS) != 0) { >> + if (!netdev_uses_dsa(net_dev) && unlikely(fd_status & FM_FD_STAT_RX_ERRORS) != 0) { >> if (net_ratelimit()) >> netif_warn(priv, hw, net_dev, "FD status = 0x%08x\n", >> fd_status & FM_FD_STAT_RX_ERRORS); >> @@ -2802,6 +2775,37 @@ static enum qman_cb_dqrr_result rx_default_dqrr(struct qman_portal *portal, >> return qman_cb_dqrr_consume; >> } >> >> +static enum qman_cb_dqrr_result rx_error_dqrr(struct qman_portal *portal, >> + struct qman_fq *fq, >> + const struct qm_dqrr_entry *dq, >> + bool sched_napi) >> +{ >> + struct dpaa_fq *dpaa_fq = container_of(fq, struct dpaa_fq, fq_base); >> + struct dpaa_percpu_priv *percpu_priv; >> + struct net_device *net_dev; >> + struct dpaa_bp *dpaa_bp; >> + struct dpaa_priv *priv; >> + >> + net_dev = dpaa_fq->net_dev; >> + if (netdev_uses_dsa(net_dev)) >> + return rx_default_dqrr(portal, fq, dq, sched_napi); >> + >> + priv = netdev_priv(net_dev); >> + dpaa_bp = dpaa_bpid2pool(dq->fd.bpid); >> + if (!dpaa_bp) >> + return qman_cb_dqrr_consume; >> + >> + percpu_priv = this_cpu_ptr(priv->percpu_priv); >> + >> + if (dpaa_eth_napi_schedule(percpu_priv, portal, sched_napi)) >> + return qman_cb_dqrr_stop; >> + >> + dpaa_eth_refill_bpools(priv); >> + dpaa_rx_error(net_dev, priv, percpu_priv, &dq->fd, fq->fqid); >> + >> + return qman_cb_dqrr_consume; >> +} >> + >> static enum qman_cb_dqrr_result conf_error_dqrr(struct qman_portal *portal, >> struct qman_fq *fq, >> const struct qm_dqrr_entry *dq, >> @@ -2955,6 +2959,7 @@ static int dpaa_phy_init(struct net_device *net_dev) >> >> static int dpaa_open(struct net_device *net_dev) >> { >> + bool ignore_errors = netdev_uses_dsa(net_dev); >> struct mac_device *mac_dev; >> struct dpaa_priv *priv; >> int err, i; >> @@ -2968,7 +2973,7 @@ static int dpaa_open(struct net_device *net_dev) >> goto phy_init_failed; >> >> for (i = 0; i < ARRAY_SIZE(mac_dev->port); i++) { >> - err = fman_port_enable(mac_dev->port[i]); >> + err = fman_port_enable(mac_dev->port[i], ignore_errors); >> if (err) >> goto mac_start_failed; >> } >> diff --git a/drivers/net/ethernet/freescale/fman/fman_port.c b/drivers/net/ethernet/freescale/fman/fman_port.c >> index d9baac0dbc7d..763faec11f5c 100644 >> --- a/drivers/net/ethernet/freescale/fman/fman_port.c >> +++ b/drivers/net/ethernet/freescale/fman/fman_port.c >> @@ -106,6 +106,7 @@ >> #define BMI_EBD_EN 0x80000000 >> >> #define BMI_PORT_CFG_EN 0x80000000 >> +#define BMI_PORT_CFG_FDOVR 0x02000000 >> >> #define BMI_PORT_STATUS_BSY 0x80000000 >> >> @@ -1629,7 +1630,7 @@ int fman_port_disable(struct fman_port *port) >> } >> >> /* Disable BMI */ >> - tmp = ioread32be(bmi_cfg_reg) & ~BMI_PORT_CFG_EN; >> + tmp = ioread32be(bmi_cfg_reg) & ~(BMI_PORT_CFG_EN | BMI_PORT_CFG_FDOVR); >> iowrite32be(tmp, bmi_cfg_reg); >> >> /* Wait for graceful stop end */ >> @@ -1655,6 +1656,7 @@ EXPORT_SYMBOL(fman_port_disable); >> /** >> * fman_port_enable >> * @port: A pointer to a FM Port module. >> + * @ignore_errors: If set, do not discard frames received with errors. >> * >> * A runtime routine provided to allow disable/enable of port. >> * >> @@ -1662,7 +1664,7 @@ EXPORT_SYMBOL(fman_port_disable); >> * >> * Return: 0 on success; Error code otherwise. >> */ >> -int fman_port_enable(struct fman_port *port) >> +int fman_port_enable(struct fman_port *port, bool ignore_errors) >> { >> u32 __iomem *bmi_cfg_reg; >> u32 tmp; >> @@ -1692,6 +1694,8 @@ int fman_port_enable(struct fman_port *port) >> >> /* Enable BMI */ >> tmp = ioread32be(bmi_cfg_reg) | BMI_PORT_CFG_EN; >> + if (ignore_errors) >> + tmp |= BMI_PORT_CFG_FDOVR; >> iowrite32be(tmp, bmi_cfg_reg); >> >> return 0; >> diff --git a/drivers/net/ethernet/freescale/fman/fman_port.h b/drivers/net/ethernet/freescale/fman/fman_port.h >> index 82f12661a46d..0928361b0e73 100644 >> --- a/drivers/net/ethernet/freescale/fman/fman_port.h >> +++ b/drivers/net/ethernet/freescale/fman/fman_port.h >> @@ -147,7 +147,7 @@ int fman_port_cfg_buf_prefix_content(struct fman_port *port, >> >> int fman_port_disable(struct fman_port *port); >> >> -int fman_port_enable(struct fman_port *port); >> +int fman_port_enable(struct fman_port *port, bool ignore_errors); >> >> u32 fman_port_get_qman_channel_id(struct fman_port *port); >> >> -----------------------------[ cut here ]----------------------------- >> >> The netdev_uses_dsa thing is a bit trashy, I think that a more polished >> version should rather set NETIF_F_RXALL for the DSA master, and have the >> dpaa driver act upon that. But first I'm curious if it works. > > It does work. Thank you! This is unfortunately not new, the stmmac choked on me with Broadcom tags as well: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8cad443eacf661796a740903a75cb8944c675b4e
On 3/23/2021 7:49 AM, Tobias Waldekranz wrote: > On Tue, Mar 23, 2021 at 13:41, Andrew Lunn <andrew@lunn.ch> wrote: >> On Tue, Mar 23, 2021 at 11:23:26AM +0100, Tobias Waldekranz wrote: >>> All devices are capable of using regular DSA tags. Support for >>> Ethertyped DSA tags sort into three categories: >>> >>> 1. No support. Older chips fall into this category. >>> >>> 2. Full support. Datasheet explicitly supports configuring the CPU >>> port to receive FORWARDs with a DSA tag. >>> >>> 3. Undocumented support. Datasheet lists the configuration from >>> category 2 as "reserved for future use", but does empirically >>> behave like a category 2 device. >>> >>> Because there are ethernet controllers that do not handle regular DSA >>> tags in all cases, it is sometimes preferable to rely on the >>> undocumented behavior, as the alternative is a very crippled >>> system. But, in those cases, make sure to log the fact that an >>> undocumented feature has been enabled. >> >> Hi Tobias >> >> I wonder if dynamic reconfiguration is the correct solution here. By >> default it will be wrong for this board, and you need user space to >> flip it. >> >> Maybe a DT property would be better. Extend dsa_switch_parse_of() to >> look for the optional property dsa,tag-protocol, a string containing >> the name of the tag ops to be used. > > This was my initial approach. It gets quite messy though. Since taggers > can be modules, there is no way of knowing if a supplied protocol name > is garbage ("asdf"), or just part of a module in an initrd that is not > loaded yet when you are probing the tree. Even when the tagger is > available, there is no way to verify if the driver is compatible with > it. So I think we would have to: > > - Keep the list of protcol names compiled in with the DSA module, such > that "edsa" can be resolved to DSA_TAG_PROTO_EDSA without having the > tagger module loaded. > > - Add (yet) another op so that we can ask the driver if the given > protocol is acceptable. Calling .change_tag_protocol will not work as > drivers will assume that the driver's .setup has already executed > before it is called. > > - Have each driver check (during .setup?) if it should configure the > device to use its preferred protocol or if the user has specified > something else. > > That felt like a lot to take on board just to solve a corner case like > this. I am happy to be told that there is a much easier way to do it, or > that the above would be acceptable if there isn't one. > The other problem with specifying the tag within the Device Tree is that you are half way between providing a policy (which tag to use) and describing how the hardware works (which tag is actually supported). FWIW, the b53/bcm_sf2 binding allows one to specify whether an internal port should have Broadcom tags enabled because the accelerator behind that port would require it (brcm,use-bcm-hdr).
On Tue, Mar 23, 2021 at 03:48:51PM +0100, Tobias Waldekranz wrote: > On Tue, Mar 23, 2021 at 13:35, Vladimir Oltean <olteanv@gmail.com> wrote: > > The netdev_uses_dsa thing is a bit trashy, I think that a more polished > > version should rather set NETIF_F_RXALL for the DSA master, and have the > > dpaa driver act upon that. But first I'm curious if it works. > > It does work. Thank you! Happy to hear that. > Does setting RXALL mean that the master would accept frames with a bad > FCS as well? Do you mean from the perspective of the network stack, or of the hardware? As far as the hardware is concerned, here is what the manual has to say: Frame reception from the network may encounter certain error conditions. Such errors are reported by the Ethernet MAC when the frame is transferred to the Buffer Manager Interface (BMI). The action taken per error case is described below. Besides the interrupts, the BMI is capable of recognizing several conditions and setting a corresponding flag in FD status field for Host usage. These conditions are as follows: * Physical Error. One of the following events were detected by the Ethernet MAC: Rx FIFO overflow, FCS error, code error, running disparity error (in applicable modes), FIFO parity error, PHY Sequence error, PHY error control character detected, CRC error. The BMI discards the frame, or enqueue directly to EFQID if FMBM_RCFG[FDOVR] is set [ editor's note: this is what my patch does ]. FPE bit is set in the FD status. * Frame size error. The Ethernet MAC detected a frame that its length exceeds the maximum allowed as configured in the MAC registers. The frame is truncated by the MAC to the maximum allowed, and it is marked as truncated. The BMI sets FSE in the FD status and forwards the frame to next module in the FMan as usual. * Some other network error may result in the frame being discarded by the MAC and not shown to the BMI. However, the MAC is responsible for counting such errors in its own statistics counters. So yes, packets with bad FCS are accepted with FMBM_RCFG[FDOVR] set. But it would be interesting to see what is the value of "fd_status" in rx_default_dqrr() for bad packets. You know, in the DPAA world, the correct approach to solve this problem would be to create a configuration to describe a "soft examination sequence" for the programmable hardware "soft parser", which identifies the DSA tag and skips over a programmable number of octets. This allows you to be able to continue to do things such as flow steering based on IP headers located after the DSA tag, etc. This is not supported in the upstream FMan driver however, neither the soft parser itself nor an abstraction for making DSA masters DSA-aware. I think it would also require more work than it took me to hack up this patch. But at least, if I understand correctly, with a soft parser in place, the MAC error counters should at least stop incrementing, if that is of any importance to you. > If so, would that mean that we would have to verify it in software? I don't see any place in the network stack that recalculates the FCS if NETIF_F_RXALL is set. Additionally, without NETIF_F_RXFCS, I don't even know how could the stack even tell a packet with bad FCS apart from one with good FCS. If NETIF_F_RXALL is set, then once a packet is received, it's taken for granted as good. There is a separate hardware bit to include the FCS in the RX buffer, I don't think this is what you want/need. > >> > >> As a workaround, switching to EDSA (thereby always having a proper > >> EtherType in the frame) solves the issue. > > > > So basically every user needs to change the tag protocol manually to be > > able to receive from port 8? Not sure if that's too friendly. > > No it is not friendly at all. My goal was to add it as a device-tree > property, but for reasons I will detail in my answer to Andrew, I did > not manage to figure out a good way to do that. Happy to take > suggestions. My two cents here are that you should think for the long term. If you need it due to a limitation which you have today but might no longer have tomorrow, don't put it in the device tree unless you want to support it even when you don't need it anymore.
On Tue, Mar 23, 2021 at 09:53, Florian Fainelli <f.fainelli@gmail.com> wrote: > On 3/23/2021 7:49 AM, Tobias Waldekranz wrote: >> On Tue, Mar 23, 2021 at 13:41, Andrew Lunn <andrew@lunn.ch> wrote: >>> On Tue, Mar 23, 2021 at 11:23:26AM +0100, Tobias Waldekranz wrote: >>>> All devices are capable of using regular DSA tags. Support for >>>> Ethertyped DSA tags sort into three categories: >>>> >>>> 1. No support. Older chips fall into this category. >>>> >>>> 2. Full support. Datasheet explicitly supports configuring the CPU >>>> port to receive FORWARDs with a DSA tag. >>>> >>>> 3. Undocumented support. Datasheet lists the configuration from >>>> category 2 as "reserved for future use", but does empirically >>>> behave like a category 2 device. >>>> >>>> Because there are ethernet controllers that do not handle regular DSA >>>> tags in all cases, it is sometimes preferable to rely on the >>>> undocumented behavior, as the alternative is a very crippled >>>> system. But, in those cases, make sure to log the fact that an >>>> undocumented feature has been enabled. >>> >>> Hi Tobias >>> >>> I wonder if dynamic reconfiguration is the correct solution here. By >>> default it will be wrong for this board, and you need user space to >>> flip it. >>> >>> Maybe a DT property would be better. Extend dsa_switch_parse_of() to >>> look for the optional property dsa,tag-protocol, a string containing >>> the name of the tag ops to be used. >> >> This was my initial approach. It gets quite messy though. Since taggers >> can be modules, there is no way of knowing if a supplied protocol name >> is garbage ("asdf"), or just part of a module in an initrd that is not >> loaded yet when you are probing the tree. Even when the tagger is >> available, there is no way to verify if the driver is compatible with >> it. So I think we would have to: >> >> - Keep the list of protcol names compiled in with the DSA module, such >> that "edsa" can be resolved to DSA_TAG_PROTO_EDSA without having the >> tagger module loaded. >> >> - Add (yet) another op so that we can ask the driver if the given >> protocol is acceptable. Calling .change_tag_protocol will not work as >> drivers will assume that the driver's .setup has already executed >> before it is called. >> >> - Have each driver check (during .setup?) if it should configure the >> device to use its preferred protocol or if the user has specified >> something else. >> >> That felt like a lot to take on board just to solve a corner case like >> this. I am happy to be told that there is a much easier way to do it, or >> that the above would be acceptable if there isn't one. >> > > The other problem with specifying the tag within the Device Tree is that > you are half way between providing a policy (which tag to use) and > describing how the hardware works (which tag is actually supported). Yeah it is a grey area for sure. Still, I think of it more as a hint from the OEM. You could argue that the "label" property is policy, you could also see it as a recommendation that will make the product easier to use. We can of course keep modifying drivers as incompatible ones are discovered, hoping that we will never hit one that does not allow you to disable the block responsible for dropping the frames. I guess one argument for applying this change anyway is that it gives you an easy way to test if your controller behaves better if all frames are guaranteed to honor Ethernet II. Especially for users that might not be comfortable with building their own kernels. > FWIW, the b53/bcm_sf2 binding allows one to specify whether an internal > port should have Broadcom tags enabled because the accelerator behind > that port would require it (brcm,use-bcm-hdr).
On Tue, Mar 23, 2021 at 21:03, Vladimir Oltean <olteanv@gmail.com> wrote: > On Tue, Mar 23, 2021 at 03:48:51PM +0100, Tobias Waldekranz wrote: >> On Tue, Mar 23, 2021 at 13:35, Vladimir Oltean <olteanv@gmail.com> wrote: >> > The netdev_uses_dsa thing is a bit trashy, I think that a more polished >> > version should rather set NETIF_F_RXALL for the DSA master, and have the >> > dpaa driver act upon that. But first I'm curious if it works. >> >> It does work. Thank you! > > Happy to hear that. > >> Does setting RXALL mean that the master would accept frames with a bad >> FCS as well? > > Do you mean from the perspective of the network stack, or of the hardware? > > As far as the hardware is concerned, here is what the manual has to say: > > Frame reception from the network may encounter certain error conditions. > Such errors are reported by the Ethernet MAC when the frame is transferred > to the Buffer Manager Interface (BMI). The action taken per error case > is described below. Besides the interrupts, the BMI is capable of > recognizing several conditions and setting a corresponding flag in FD > status field for Host usage. These conditions are as follows: > > * Physical Error. One of the following events were detected by the > Ethernet MAC: Rx FIFO overflow, FCS error, code error, running > disparity error (in applicable modes), FIFO parity error, PHY Sequence > error, PHY error control character detected, CRC error. The BMI > discards the frame, or enqueue directly to EFQID if FMBM_RCFG[FDOVR] > is set [ editor's note: this is what my patch does ]. FPE bit is set > in the FD status. > * Frame size error. The Ethernet MAC detected a frame that its length > exceeds the maximum allowed as configured in the MAC registers. The > frame is truncated by the MAC to the maximum allowed, and it is marked > as truncated. The BMI sets FSE in the FD status and forwards the frame > to next module in the FMan as usual. > * Some other network error may result in the frame being discarded by > the MAC and not shown to the BMI. However, the MAC is responsible for > counting such errors in its own statistics counters. > > So yes, packets with bad FCS are accepted with FMBM_RCFG[FDOVR] set. > But it would be interesting to see what is the value of "fd_status" in > rx_default_dqrr() for bad packets. You know, in the DPAA world, the > correct approach to solve this problem would be to create a > configuration to describe a "soft examination sequence" for the > programmable hardware "soft parser", which identifies the DSA tag and Yeah I know you can do that. It is a very flexible chip that can do all kinds of fancy stuff... > skips over a programmable number of octets. This allows you to be able > to continue to do things such as flow steering based on IP headers > located after the DSA tag, etc. This is not supported in the upstream > FMan driver however, neither the soft parser itself nor an abstraction > for making DSA masters DSA-aware. I think it would also require more ...but this is the problem. These accelerators are always guarded by NDAs and proprietary code. If NXP could transpile XDP to dpaa/dpaa2 in the kernel like how Netronome does it, we would never even talk to another SoC-vendor. > work than it took me to hack up this patch. But at least, if I > understand correctly, with a soft parser in place, the MAC error > counters should at least stop incrementing, if that is of any importance > to you. This is the tragedy: I know for a fact that a DSA soft parser exists, but because of the aforementioned maze of NDAs and license agreements we, the community, cannot have nice things. >> If so, would that mean that we would have to verify it in software? > > I don't see any place in the network stack that recalculates the FCS if > NETIF_F_RXALL is set. Additionally, without NETIF_F_RXFCS, I don't even > know how could the stack even tell a packet with bad FCS apart from one > with good FCS. If NETIF_F_RXALL is set, then once a packet is received, > it's taken for granted as good. Right, but there is a difference between a user explicitly enabling it on a device and us enabling it because we need it internally in the kernel. In the first scenario, the user can hardly complain as they have explicitly requested to see all packets on that device. That would not be true in the second one because there would be no way for the user to turn it off. It feels like you would end up in a similar situation as with the user- vs. kernel- promiscuous setting. It seems to me if we enable it, we are responsible for not letting crap through to the port netdevs. > There is a separate hardware bit to include the FCS in the RX buffer, I > don't think this is what you want/need. > >> >> >> >> As a workaround, switching to EDSA (thereby always having a proper >> >> EtherType in the frame) solves the issue. >> > >> > So basically every user needs to change the tag protocol manually to be >> > able to receive from port 8? Not sure if that's too friendly. >> >> No it is not friendly at all. My goal was to add it as a device-tree >> property, but for reasons I will detail in my answer to Andrew, I did >> not manage to figure out a good way to do that. Happy to take >> suggestions. > > My two cents here are that you should think for the long term. If you > need it due to a limitation which you have today but might no longer > have tomorrow, don't put it in the device tree unless you want to > support it even when you don't need it anymore.
On Tue, Mar 23, 2021 at 10:17:30PM +0100, Tobias Waldekranz wrote: > On Tue, Mar 23, 2021 at 21:03, Vladimir Oltean <olteanv@gmail.com> wrote: > > On Tue, Mar 23, 2021 at 03:48:51PM +0100, Tobias Waldekranz wrote: > >> On Tue, Mar 23, 2021 at 13:35, Vladimir Oltean <olteanv@gmail.com> wrote: > >> > The netdev_uses_dsa thing is a bit trashy, I think that a more polished > >> > version should rather set NETIF_F_RXALL for the DSA master, and have the > >> > dpaa driver act upon that. But first I'm curious if it works. > >> > >> It does work. Thank you! > > > > Happy to hear that. > > > >> Does setting RXALL mean that the master would accept frames with a bad > >> FCS as well? > > > > Do you mean from the perspective of the network stack, or of the hardware? > > > > As far as the hardware is concerned, here is what the manual has to say: > > > > Frame reception from the network may encounter certain error conditions. > > Such errors are reported by the Ethernet MAC when the frame is transferred > > to the Buffer Manager Interface (BMI). The action taken per error case > > is described below. Besides the interrupts, the BMI is capable of > > recognizing several conditions and setting a corresponding flag in FD > > status field for Host usage. These conditions are as follows: > > > > * Physical Error. One of the following events were detected by the > > Ethernet MAC: Rx FIFO overflow, FCS error, code error, running > > disparity error (in applicable modes), FIFO parity error, PHY Sequence > > error, PHY error control character detected, CRC error. The BMI > > discards the frame, or enqueue directly to EFQID if FMBM_RCFG[FDOVR] > > is set [ editor's note: this is what my patch does ]. FPE bit is set > > in the FD status. > > * Frame size error. The Ethernet MAC detected a frame that its length > > exceeds the maximum allowed as configured in the MAC registers. The > > frame is truncated by the MAC to the maximum allowed, and it is marked > > as truncated. The BMI sets FSE in the FD status and forwards the frame > > to next module in the FMan as usual. > > * Some other network error may result in the frame being discarded by > > the MAC and not shown to the BMI. However, the MAC is responsible for > > counting such errors in its own statistics counters. > > > > So yes, packets with bad FCS are accepted with FMBM_RCFG[FDOVR] set. > > But it would be interesting to see what is the value of "fd_status" in > > rx_default_dqrr() for bad packets. You know, in the DPAA world, the > > correct approach to solve this problem would be to create a > > configuration to describe a "soft examination sequence" for the > > programmable hardware "soft parser", which identifies the DSA tag and > > Yeah I know you can do that. It is a very flexible chip that can do all > kinds of fancy stuff... > > > skips over a programmable number of octets. This allows you to be able > > to continue to do things such as flow steering based on IP headers > > located after the DSA tag, etc. This is not supported in the upstream > > FMan driver however, neither the soft parser itself nor an abstraction > > for making DSA masters DSA-aware. I think it would also require more > > ...but this is the problem. These accelerators are always guarded by > NDAs and proprietary code. Hey, that is simply not true. [ this is even more hilarous, given that you are criticizing NXP about openness in a thread about Marvell hardware ] I just created an account on nxp.com using my gmail email address, then I typed "T1023" in the search bar, clicked on "T1024", clicked the "Documentation" tab, went to the "Reference Manual" section, hit "More", selected "T1024DPAArm, QorIQ T1024 Data Path Acceleration Architecture (DPAA) Reference Manual", and it downloaded T1042DPAARM.pdf right away. And that has 1373 pages of 'all you can eat' about the DPAA hardware. And this user manual is _specifically_ for the network subsystem, the SoC has a separate user manual (T1024RM.pdf) which has another 2121 pages about the other peripherals. > If NXP could transpile XDP to dpaa/dpaa2 in the kernel like how > Netronome does it, we would never even talk to another SoC-vendor. 'More complicated than it's worth' is, I believe, what the verdict on that was. DPAA is an unbelievably complicated architecture living in a universe of its own. Also, it was designed at the beginning of the multi-core/multi-queue era, when it wasn't at all clear that Linux would become the dominant operating system for embedded networking, but merely one of the many contenders. So you'll find things like a queuing and buffer management system optimized for dispatching towards many packet processing engines arranged in composable pipelines, and with seemingly exotic features such as order preservation for parallel processing of a single network flow. Whereas in Linux, the dominant model [ coming from x86 ] is that where 1 queue + 1 buffer pool are compressed into 1 ring which goes to 1 CPU, which is a much more efficient data structure for single-core, CPU-intensive processing, so that model won out. The way the Linux drivers for DPAA make use of the hardware features is so poor not because of lack of willpower, but due to an explosive combination of dated hardware design and overengineering. And DPAA2 is basically the response to the criticism that with DPAA1 it was stupidly complicated to send a packet. With DPAA2 it's just as complicated, except that now, most of the hardware intricacies are managed by a firmware. The idea behind this was that it was supposed to make the integration with operating systems easier, and many people smarter than me say it's for the better. Of course, all of these look like mistakes when seen through the lens of hindsight, but I don't think I can really judge, just observe. But many open source frameworks for packet acceleration with DPAA were tried, rest assured. If you scan the "external/qoriq" project from https://source.codeaurora.org/, I'm sure you'll find more code than you can digest. On the other hand, there _is_ support in the mainline kernel for both dpaa and dpaa2 for native XDP, which is a modern framework that gives you decent enough throughput for CPU-based packet processing. > > work than it took me to hack up this patch. But at least, if I > > understand correctly, with a soft parser in place, the MAC error > > counters should at least stop incrementing, if that is of any importance > > to you. > > This is the tragedy: I know for a fact that a DSA soft parser exists, > but because of the aforementioned maze of NDAs and license agreements > we, the community, cannot have nice things. Oh yeah? You can even create your own, if you have nerves of steel and a thick enough skin to learn to use the "fmc" (Frame Manager Configuration Tool) program, which is fully open source if you search for it on CAF (and if you can actually make something out of the source code). And this PDF (hidden so well behind the maze of NDAs, that I just had to google for it, and you don't even need to register to read it): https://www.nxp.com/docs/en/user-guide/LSDKUG_Rev20.12.pdf is chock full of information on what you can do with it, see chapters 8.2.5 and 8.2.6. Personally, I am not ashamed to admit that I'm too stupid to use it. But to blame the mainline unavailability of these features on lack of openness is unfair. Poor integration with Linux networking concepts? Lack of popularity or demand from Linux users? Maybe, but that would imply a certain circularity, and that would paint things in a not-so-black-and-white tone, which you want to avoid. If you want to do any sort of development around that area, I'm sure you'd be raised a statue and sung odes to. [ enough about DPAA ] > >> If so, would that mean that we would have to verify it in software? > > > > I don't see any place in the network stack that recalculates the FCS if > > NETIF_F_RXALL is set. Additionally, without NETIF_F_RXFCS, I don't even > > know how could the stack even tell a packet with bad FCS apart from one > > with good FCS. If NETIF_F_RXALL is set, then once a packet is received, > > it's taken for granted as good. > > Right, but there is a difference between a user explicitly enabling it > on a device and us enabling it because we need it internally in the > kernel. > > In the first scenario, the user can hardly complain as they have > explicitly requested to see all packets on that device. That would not > be true in the second one because there would be no way for the user to > turn it off. It feels like you would end up in a similar situation as > with the user- vs. kernel- promiscuous setting. > > It seems to me if we enable it, we are responsible for not letting crap > through to the port netdevs. Yes, the advantage of NETIF_F_RXALL is that you treat error packets as normal, the disadvantage is that you treat error packets as normal. So I think all the options were laid out: - Make the driver use EDSA by default when it can, because that has better compatibility with masters, then users who care about performance can dynamically switch [ back ] to DSA. Pro: is simple, con: may affect somebody relying on the default behavior - Make your board parse a custom device tree binding which tells it what initial tagging protocol to use. Pro: addresses the con of the above. Con: kinda hacky. - Make the DSA master ignore parser errors. Pro: see above. Con: see above. - Teach the DSA master to have a minimal understanding of the DSA header. Pro: is the right thing to do, is compatible and better for more advanced use cases. Con: is more complicated than the alternatives.
> This was my initial approach. It gets quite messy though. Since taggers > can be modules, there is no way of knowing if a supplied protocol name > is garbage ("asdf"), or just part of a module in an initrd that is not > loaded yet when you are probing the tree. Hi Tobias I don't think that is an issue. We currently lookup the tagger in dsa_port_parse_cpu(). If it does not exist, we return -EPROBE_DEFER. Either it eventually gets loaded, or the driver core gives up. I don't see why the same cannot be done for a DT property. If dsa_find_tagger_by_name() does not find the tagger return -EPROBE_DEFER. Garbage will result in the switch never loading, and the DT writer will go find their typo. > Even when the tagger is available, there is no way to verify if the > driver is compatible with it. I would of though, calling the switch drivers change_tag_protocol() op will that for you. If it comes back with -EINVAL, or -EOPNOTSUPP, you know it is not compatible. So i guess i would keep all the code you are adding here to allow dynamic setting of the protocol. And add more code in dsa_switch_parse_of() to parse the optional tagging protocol name, error out -EPROBE_DEFER if it is not known yet, otherwise store it away in something like dst->tag_ops_name. And then probably in dsa_switch_setup(), if dst->tag_ops_name is not NULL, invoke the dynamic change code to perform the actual change. Andrew
On Wed, Mar 24, 2021 at 01:15, Vladimir Oltean <olteanv@gmail.com> wrote: > On Tue, Mar 23, 2021 at 10:17:30PM +0100, Tobias Waldekranz wrote: >> On Tue, Mar 23, 2021 at 21:03, Vladimir Oltean <olteanv@gmail.com> wrote: >> > On Tue, Mar 23, 2021 at 03:48:51PM +0100, Tobias Waldekranz wrote: >> >> On Tue, Mar 23, 2021 at 13:35, Vladimir Oltean <olteanv@gmail.com> wrote: >> >> > The netdev_uses_dsa thing is a bit trashy, I think that a more polished >> >> > version should rather set NETIF_F_RXALL for the DSA master, and have the >> >> > dpaa driver act upon that. But first I'm curious if it works. >> >> >> >> It does work. Thank you! >> > >> > Happy to hear that. >> > >> >> Does setting RXALL mean that the master would accept frames with a bad >> >> FCS as well? >> > >> > Do you mean from the perspective of the network stack, or of the hardware? >> > >> > As far as the hardware is concerned, here is what the manual has to say: >> > >> > Frame reception from the network may encounter certain error conditions. >> > Such errors are reported by the Ethernet MAC when the frame is transferred >> > to the Buffer Manager Interface (BMI). The action taken per error case >> > is described below. Besides the interrupts, the BMI is capable of >> > recognizing several conditions and setting a corresponding flag in FD >> > status field for Host usage. These conditions are as follows: >> > >> > * Physical Error. One of the following events were detected by the >> > Ethernet MAC: Rx FIFO overflow, FCS error, code error, running >> > disparity error (in applicable modes), FIFO parity error, PHY Sequence >> > error, PHY error control character detected, CRC error. The BMI >> > discards the frame, or enqueue directly to EFQID if FMBM_RCFG[FDOVR] >> > is set [ editor's note: this is what my patch does ]. FPE bit is set >> > in the FD status. >> > * Frame size error. The Ethernet MAC detected a frame that its length >> > exceeds the maximum allowed as configured in the MAC registers. The >> > frame is truncated by the MAC to the maximum allowed, and it is marked >> > as truncated. The BMI sets FSE in the FD status and forwards the frame >> > to next module in the FMan as usual. >> > * Some other network error may result in the frame being discarded by >> > the MAC and not shown to the BMI. However, the MAC is responsible for >> > counting such errors in its own statistics counters. >> > >> > So yes, packets with bad FCS are accepted with FMBM_RCFG[FDOVR] set. >> > But it would be interesting to see what is the value of "fd_status" in >> > rx_default_dqrr() for bad packets. You know, in the DPAA world, the >> > correct approach to solve this problem would be to create a >> > configuration to describe a "soft examination sequence" for the >> > programmable hardware "soft parser", which identifies the DSA tag and >> >> Yeah I know you can do that. It is a very flexible chip that can do all >> kinds of fancy stuff... >> >> > skips over a programmable number of octets. This allows you to be able >> > to continue to do things such as flow steering based on IP headers >> > located after the DSA tag, etc. This is not supported in the upstream >> > FMan driver however, neither the soft parser itself nor an abstraction >> > for making DSA masters DSA-aware. I think it would also require more >> >> ...but this is the problem. These accelerators are always guarded by >> NDAs and proprietary code. > > Hey, that is simply not true. [ this is even more hilarous, given that > you are criticizing NXP about openness in a thread about Marvell hardware ] NXP is much better than most in this area. That is a big reason why you will find NXP/Freescale silicon in a majority of Westermo products. > I just created an account on nxp.com using my gmail email address, then > I typed "T1023" in the search bar, clicked on "T1024", clicked the > "Documentation" tab, went to the "Reference Manual" section, hit "More", > selected "T1024DPAArm, QorIQ T1024 Data Path Acceleration Architecture > (DPAA) Reference Manual", and it downloaded T1042DPAARM.pdf right away. > And that has 1373 pages of 'all you can eat' about the DPAA hardware. > And this user manual is _specifically_ for the network subsystem, the > SoC has a separate user manual (T1024RM.pdf) which has another 2121 > pages about the other peripherals. Yes I know all this. >> If NXP could transpile XDP to dpaa/dpaa2 in the kernel like how >> Netronome does it, we would never even talk to another SoC-vendor. > > 'More complicated than it's worth' is, I believe, what the verdict on > that was. DPAA is an unbelievably complicated architecture living in a > universe of its own. Also, it was designed at the beginning of the > multi-core/multi-queue era, when it wasn't at all clear that Linux would > become the dominant operating system for embedded networking, but merely > one of the many contenders. So you'll find things like a queuing and > buffer management system optimized for dispatching towards many packet > processing engines arranged in composable pipelines, and with seemingly > exotic features such as order preservation for parallel processing of a > single network flow. Whereas in Linux, the dominant model [ coming from x86 ] > is that where 1 queue + 1 buffer pool are compressed into 1 ring which > goes to 1 CPU, which is a much more efficient data structure for > single-core, CPU-intensive processing, so that model won out. The way > the Linux drivers for DPAA make use of the hardware features is so poor > not because of lack of willpower, but due to an explosive combination of > dated hardware design and overengineering. > > And DPAA2 is basically the response to the criticism that with DPAA1 it was > stupidly complicated to send a packet. With DPAA2 it's just as complicated, > except that now, most of the hardware intricacies are managed by a > firmware. The idea behind this was that it was supposed to make the > integration with operating systems easier, and many people smarter than > me say it's for the better. > > Of course, all of these look like mistakes when seen through the lens of > hindsight, but I don't think I can really judge, just observe. I do not mean to judge either. I am sure that all these desicions were the right ones at the time they were taken. It is just surprising to me that no SoC vendor has tried on a sub-50 Gbps device. > But many open source frameworks for packet acceleration with DPAA were > tried, rest assured. If you scan the "external/qoriq" project from > https://source.codeaurora.org/, I'm sure you'll find more code than you > can digest. > > On the other hand, there _is_ support in the mainline kernel for both > dpaa and dpaa2 for native XDP, which is a modern framework that gives > you decent enough throughput for CPU-based packet processing. Yes, that is very much appreciated. >> > work than it took me to hack up this patch. But at least, if I >> > understand correctly, with a soft parser in place, the MAC error >> > counters should at least stop incrementing, if that is of any importance >> > to you. >> >> This is the tragedy: I know for a fact that a DSA soft parser exists, >> but because of the aforementioned maze of NDAs and license agreements >> we, the community, cannot have nice things. > > Oh yeah? You can even create your own, if you have nerves of steel and a > thick enough skin to learn to use the "fmc" (Frame Manager Configuration > Tool) program, which is fully open source if you search for it on CAF > (and if you can actually make something out of the source code). Yes, this is what a colleague of mine has done. Which is how I know that one exists :) > And this PDF (hidden so well behind the maze of NDAs, that I just had to > google for it, and you don't even need to register to read it): > https://www.nxp.com/docs/en/user-guide/LSDKUG_Rev20.12.pdf > is chock full of information on what you can do with it, see chapters 8.2.5 and 8.2.6. Right, but this is where it ends. Using the wealth of information you have laid out so far you can use DPAA to do amazing things using open components. ...unless you have to do something so incredibly advanced and exotic as a masked update of a field. At this point you have two options: 1. Buy the firmware toolchain, which requires signing an NDA. 2. Buy a single-drop firmware binary for lots of $$$ without any possibility of getting further updates because "you should really be using DPAA2". > Personally, I am not ashamed to admit that I'm too stupid to use it. Neither am I, I am very thankful to have people around me who can do amazing things that are out of reach for me. > But to blame the mainline unavailability of these features on lack of > openness is unfair. Poor integration with Linux networking concepts? > Lack of popularity or demand from Linux users? Maybe, but that would > imply a certain circularity, and that would paint things in a > not-so-black-and-white tone, which you want to avoid. Objectively, it is certainly not black-and-white. My guess is that there many many happy DPAA users out there. I was describing my subjective experience. Perhaps some residual pain from old wounds led me to use a harsher tone than necessary. > If you want to do any sort of development around that area, I'm sure > you'd be raised a statue and sung odes to. We could maybe open up our soft-parser without having to use a custom firmware. But I do not know enough about DPAA to judge how hard that would be to integrate with what is in the open driver. I will consult with the people who do. > [ enough about DPAA ] > >> >> If so, would that mean that we would have to verify it in software? >> > >> > I don't see any place in the network stack that recalculates the FCS if >> > NETIF_F_RXALL is set. Additionally, without NETIF_F_RXFCS, I don't even >> > know how could the stack even tell a packet with bad FCS apart from one >> > with good FCS. If NETIF_F_RXALL is set, then once a packet is received, >> > it's taken for granted as good. >> >> Right, but there is a difference between a user explicitly enabling it >> on a device and us enabling it because we need it internally in the >> kernel. >> >> In the first scenario, the user can hardly complain as they have >> explicitly requested to see all packets on that device. That would not >> be true in the second one because there would be no way for the user to >> turn it off. It feels like you would end up in a similar situation as >> with the user- vs. kernel- promiscuous setting. >> >> It seems to me if we enable it, we are responsible for not letting crap >> through to the port netdevs. > > Yes, the advantage of NETIF_F_RXALL is that you treat error packets as > normal, the disadvantage is that you treat error packets as normal. > > So I think all the options were laid out: > - Make the driver use EDSA by default when it can, because that has > better compatibility with masters, then users who care about > performance can dynamically switch [ back ] to DSA. Pro: is simple, > con: may affect somebody relying on the default behavior I think this is pretty much what is done today for "category 2" devices in my original message. The problem is with the devices from "category 3" where we would prefer to only use documented features of the device as long as the master can cope with it. But if the controller refuses to cooperate, the scales might tip in favor of using undocumented features. > - Make your board parse a custom device tree binding which tells it what > initial tagging protocol to use. Pro: addresses the con of the above. > Con: kinda hacky. Kinda hacky, yes. OTOH, it is a workaround and not something that you have to be aware of unless you actually need it. > - Make the DSA master ignore parser errors. Pro: see above. Con: see above. Yeah I think the performace impact of having to checksum every frame makes this unattractive. > - Teach the DSA master to have a minimal understanding of the DSA header. > Pro: is the right thing to do, is compatible and better for more > advanced use cases. Con: is more complicated than the alternatives. In cases where this is possible, absolutely!
On Wed, Mar 24, 2021 at 11:52:49AM +0100, Tobias Waldekranz wrote: > >> This is the tragedy: I know for a fact that a DSA soft parser exists, > >> but because of the aforementioned maze of NDAs and license agreements > >> we, the community, cannot have nice things. > > > > Oh yeah? You can even create your own, if you have nerves of steel and a > > thick enough skin to learn to use the "fmc" (Frame Manager Configuration > > Tool) program, which is fully open source if you search for it on CAF > > (and if you can actually make something out of the source code). > > Yes, this is what a colleague of mine has done. Which is how I know that > one exists :) > > > And this PDF (hidden so well behind the maze of NDAs, that I just had to > > google for it, and you don't even need to register to read it): > > https://www.nxp.com/docs/en/user-guide/LSDKUG_Rev20.12.pdf > > is chock full of information on what you can do with it, see chapters 8.2.5 and 8.2.6. > > Right, but this is where it ends. Using the wealth of information you > have laid out so far you can use DPAA to do amazing things using open > components. > > ...unless you have to do something so incredibly advanced and exotic as > a masked update of a field. At this point you have two options: > > 1. Buy the firmware toolchain, which requires signing an NDA. > 2. Buy a single-drop firmware binary for lots of $$$ without any > possibility of getting further updates because "you should really be > using DPAA2". Uhm, what? By "firmware" I assume you mean "FMan microcode"? To my knowledge, the standard FMan microcode distributed _freely_ with the LSDK has support for Header Manipulation, you just need to create a Header Manipulation Command Descriptor (HMCD) and pass it to the microcode through an O/H port. I believe that: (a) the Header Manipulation descriptors allow you to perform raw mask based field updates too, not just for standard protocols (b) fmc already has some support for sending Header Manipulation descriptors to the microcode And by "firmware toolchain" you mean the FMan microcode SDK? https://www.nxp.com/design/software/embedded-software/linux-software-and-development-tools/dpaa-fman-microcode-sdk-source-code-software-kit:DPAA-FMAN-SDK In the description for that product it says: For MOST of NXP communications customers, the microcode that is freely accessible via the NXP LSDK or SDK for QorIQ or Layerscape processors will handle any communications offload task you could throw at the DPAA. So why on earth would you need that? And does it really surprise you that it costs money, especially considering the fact that you're going to need heaps of support for it anyway? Seriously, what is your point? You're complaining about having the option to write your own microcode for the RISC cores inside the network controller, when the standard one already comes with a lot of features? What would you prefer, not having that option? This is a strawman. None of the features we talked about in this thread, soft parser for DSA tags or masked header manipulation, should require custom microcode.
On Wed, Mar 24, 2021 at 01:44, Andrew Lunn <andrew@lunn.ch> wrote: >> This was my initial approach. It gets quite messy though. Since taggers >> can be modules, there is no way of knowing if a supplied protocol name >> is garbage ("asdf"), or just part of a module in an initrd that is not >> loaded yet when you are probing the tree. > > Hi Tobias > > I don't think that is an issue. We currently lookup the tagger in > dsa_port_parse_cpu(). If it does not exist, we return > -EPROBE_DEFER. Either it eventually gets loaded, or the driver core > gives up. I don't see why the same cannot be done for a DT > property. If dsa_find_tagger_by_name() does not find the tagger return > -EPROBE_DEFER. Garbage will result in the switch never loading, and > the DT writer will go find their typo. > >> Even when the tagger is available, there is no way to verify if the >> driver is compatible with it. > > I would of though, calling the switch drivers change_tag_protocol() op > will that for you. If it comes back with -EINVAL, or -EOPNOTSUPP, you > know it is not compatible. > > So i guess i would keep all the code you are adding here to allow > dynamic setting of the protocol. And add more code in > dsa_switch_parse_of() to parse the optional tagging protocol name, > error out -EPROBE_DEFER if it is not known yet, otherwise store it > away in something like dst->tag_ops_name. And then probably in > dsa_switch_setup(), if dst->tag_ops_name is not NULL, invoke the > dynamic change code to perform the actual change. Sounds like a plan. I will try it out and get back with a v2. Thanks.
On Wed, Mar 24, 2021 at 13:34, Vladimir Oltean <olteanv@gmail.com> wrote: > On Wed, Mar 24, 2021 at 11:52:49AM +0100, Tobias Waldekranz wrote: >> >> This is the tragedy: I know for a fact that a DSA soft parser exists, >> >> but because of the aforementioned maze of NDAs and license agreements >> >> we, the community, cannot have nice things. >> > >> > Oh yeah? You can even create your own, if you have nerves of steel and a >> > thick enough skin to learn to use the "fmc" (Frame Manager Configuration >> > Tool) program, which is fully open source if you search for it on CAF >> > (and if you can actually make something out of the source code). >> >> Yes, this is what a colleague of mine has done. Which is how I know that >> one exists :) >> >> > And this PDF (hidden so well behind the maze of NDAs, that I just had to >> > google for it, and you don't even need to register to read it): >> > https://www.nxp.com/docs/en/user-guide/LSDKUG_Rev20.12.pdf >> > is chock full of information on what you can do with it, see chapters 8.2.5 and 8.2.6. >> >> Right, but this is where it ends. Using the wealth of information you >> have laid out so far you can use DPAA to do amazing things using open >> components. >> >> ...unless you have to do something so incredibly advanced and exotic as >> a masked update of a field. At this point you have two options: >> >> 1. Buy the firmware toolchain, which requires signing an NDA. >> 2. Buy a single-drop firmware binary for lots of $$$ without any >> possibility of getting further updates because "you should really be >> using DPAA2". > > Uhm, what? > By "firmware" I assume you mean "FMan microcode"? > > To my knowledge, the standard FMan microcode distributed _freely_ with > the LSDK has support for Header Manipulation, you just need to create a > Header Manipulation Command Descriptor (HMCD) and pass it to the > microcode through an O/H port. I believe that: > (a) the Header Manipulation descriptors allow you to perform raw mask > based field updates too, not just for standard protocols This is not the story we were told. > (b) fmc already has some support for sending Header Manipulation > descriptors to the microcode > > And by "firmware toolchain" you mean the FMan microcode SDK? > https://www.nxp.com/design/software/embedded-software/linux-software-and-development-tools/dpaa-fman-microcode-sdk-source-code-software-kit:DPAA-FMAN-SDK > > In the description for that product it says: > > For MOST of NXP communications customers, the microcode that is freely > accessible via the NXP LSDK or SDK for QorIQ or Layerscape processors > will handle any communications offload task you could throw at the DPAA. > > So why on earth would you need that? And does it really surprise you Because NXP said we needed it. > that it costs money, especially considering the fact that you're going > to need heaps of support for it anyway? No, it surprised me that we had to pay for a solution to a problem that we were promised would be solvable using the stock firmware. > Seriously, what is your point? You're complaining about having the > option to write your own microcode for the RISC cores inside the network > controller, when the standard one already comes with a lot of features? > What would you prefer, not having that option? > > This is a strawman. None of the features we talked about in this thread, > soft parser for DSA tags or masked header manipulation, should require > custom microcode. I never made that claim. I was describing our experience with DPAA on the whole.
On Wed, Mar 24, 2021 at 02:01:14PM +0100, Tobias Waldekranz wrote: > On Wed, Mar 24, 2021 at 13:34, Vladimir Oltean <olteanv@gmail.com> wrote: > > On Wed, Mar 24, 2021 at 11:52:49AM +0100, Tobias Waldekranz wrote: > >> >> This is the tragedy: I know for a fact that a DSA soft parser exists, > >> >> but because of the aforementioned maze of NDAs and license agreements > >> >> we, the community, cannot have nice things. > >> > > >> > Oh yeah? You can even create your own, if you have nerves of steel and a > >> > thick enough skin to learn to use the "fmc" (Frame Manager Configuration > >> > Tool) program, which is fully open source if you search for it on CAF > >> > (and if you can actually make something out of the source code). > >> > >> Yes, this is what a colleague of mine has done. Which is how I know that > >> one exists :) > >> > >> > And this PDF (hidden so well behind the maze of NDAs, that I just had to > >> > google for it, and you don't even need to register to read it): > >> > https://www.nxp.com/docs/en/user-guide/LSDKUG_Rev20.12.pdf > >> > is chock full of information on what you can do with it, see chapters 8.2.5 and 8.2.6. > >> > >> Right, but this is where it ends. Using the wealth of information you > >> have laid out so far you can use DPAA to do amazing things using open > >> components. > >> > >> ...unless you have to do something so incredibly advanced and exotic as > >> a masked update of a field. At this point you have two options: > >> > >> 1. Buy the firmware toolchain, which requires signing an NDA. > >> 2. Buy a single-drop firmware binary for lots of $$$ without any > >> possibility of getting further updates because "you should really be > >> using DPAA2". > > > > Uhm, what? > > By "firmware" I assume you mean "FMan microcode"? > > > > To my knowledge, the standard FMan microcode distributed _freely_ with > > the LSDK has support for Header Manipulation, you just need to create a > > Header Manipulation Command Descriptor (HMCD) and pass it to the > > microcode through an O/H port. I believe that: > > (a) the Header Manipulation descriptors allow you to perform raw mask > > based field updates too, not just for standard protocols > > This is not the story we were told. Wait, aren't we talking about HdrMan OPCODE 0x19 ("Replace Field in Header")? > > (b) fmc already has some support for sending Header Manipulation > > descriptors to the microcode > > > > And by "firmware toolchain" you mean the FMan microcode SDK? > > https://www.nxp.com/design/software/embedded-software/linux-software-and-development-tools/dpaa-fman-microcode-sdk-source-code-software-kit:DPAA-FMAN-SDK > > > > In the description for that product it says: > > > > For MOST of NXP communications customers, the microcode that is freely > > accessible via the NXP LSDK or SDK for QorIQ or Layerscape processors > > will handle any communications offload task you could throw at the DPAA. > > > > So why on earth would you need that? And does it really surprise you > > Because NXP said we needed it. > > > that it costs money, especially considering the fact that you're going > > to need heaps of support for it anyway? > > No, it surprised me that we had to pay for a solution to a problem that > we were promised would be solvable using the stock firmware. Maybe the FMan version of your particular device does not support that HM command, or maybe you needed a slightly different behavior compared to what HM opcode 0x19 does, and there was a misunderstanding on either ends resulting in the impression that what you need could be achievable through that type of descriptor? Either way, the way you phrased things: | unless you have to do something so incredibly advanced and exotic as | a masked update of a field is very unfair, oversimplifying and misleading. > > Seriously, what is your point? You're complaining about having the > > option to write your own microcode for the RISC cores inside the network > > controller, when the standard one already comes with a lot of features? > > What would you prefer, not having that option? > > > > This is a strawman. None of the features we talked about in this thread, > > soft parser for DSA tags or masked header manipulation, should require > > custom microcode. > > I never made that claim. I was describing our experience with DPAA on > the whole. I fail to see how we ended up talking about custom FMan microcode then. I did not bring it up, and it is completely irrelevant to the discussion about soft parser for DSA.
On Tue, Mar 23, 2021 at 10:17:30PM +0100, Tobias Waldekranz wrote: > > I don't see any place in the network stack that recalculates the FCS if > > NETIF_F_RXALL is set. Additionally, without NETIF_F_RXFCS, I don't even > > know how could the stack even tell a packet with bad FCS apart from one > > with good FCS. If NETIF_F_RXALL is set, then once a packet is received, > > it's taken for granted as good. > > Right, but there is a difference between a user explicitly enabling it > on a device and us enabling it because we need it internally in the > kernel. > > In the first scenario, the user can hardly complain as they have > explicitly requested to see all packets on that device. That would not > be true in the second one because there would be no way for the user to > turn it off. It feels like you would end up in a similar situation as > with the user- vs. kernel- promiscuous setting. > > It seems to me if we enable it, we are responsible for not letting crap > through to the port netdevs. I think there exists an intermediate approach between processing the frames on the RX queue and installing a soft parser. The BMI of FMan RX ports has a configurable pipeline through Next Invoked Actions (NIA). Through the FMBM_RFNE register (Rx Frame Next Engine), it is possible to change the Next Invoked Action from the default value (which is the hardware parser). You can choose to make the Buffer Manager Interface enqueue the packet directly to the Queue Manager Interface (QMI). This will effectively bypass the hardware parser, so DSA frames will never be sent to the error queue if they have an invalid EtherType/Length field. Additionally, frames with a bad FCS should still be discarded, as that is done by the MAC (an earlier stage compared to the BMI).
On Wed, Mar 24, 2021 at 04:03:17PM +0200, Vladimir Oltean wrote: > I think there exists an intermediate approach between processing the > frames on the RX queue and installing a soft parser. I meant "RX error queue", sorry for the confusion.
On Wed, Mar 24, 2021 at 16:03, Vladimir Oltean <olteanv@gmail.com> wrote: > On Tue, Mar 23, 2021 at 10:17:30PM +0100, Tobias Waldekranz wrote: >> > I don't see any place in the network stack that recalculates the FCS if >> > NETIF_F_RXALL is set. Additionally, without NETIF_F_RXFCS, I don't even >> > know how could the stack even tell a packet with bad FCS apart from one >> > with good FCS. If NETIF_F_RXALL is set, then once a packet is received, >> > it's taken for granted as good. >> >> Right, but there is a difference between a user explicitly enabling it >> on a device and us enabling it because we need it internally in the >> kernel. >> >> In the first scenario, the user can hardly complain as they have >> explicitly requested to see all packets on that device. That would not >> be true in the second one because there would be no way for the user to >> turn it off. It feels like you would end up in a similar situation as >> with the user- vs. kernel- promiscuous setting. >> >> It seems to me if we enable it, we are responsible for not letting crap >> through to the port netdevs. > > I think there exists an intermediate approach between processing the > frames on the RX queue and installing a soft parser. > > The BMI of FMan RX ports has a configurable pipeline through Next > Invoked Actions (NIA). Through the FMBM_RFNE register (Rx Frame Next > Engine), it is possible to change the Next Invoked Action from the > default value (which is the hardware parser). You can choose to make the > Buffer Manager Interface enqueue the packet directly to the Queue > Manager Interface (QMI). This will effectively bypass the hardware > parser, so DSA frames will never be sent to the error queue if they have > an invalid EtherType/Length field. > > Additionally, frames with a bad FCS should still be discarded, as that > is done by the MAC (an earlier stage compared to the BMI). Yeah this sounds like the perfect middle ground. I guess that would then be activated with an `if (netdev_uses_dsa(dev))`-guard in the driver, like how Florian solved it for stmmac? Since it is not quite "rx-all".
On Wed, Mar 24, 2021 at 04:02:52PM +0100, Tobias Waldekranz wrote: > On Wed, Mar 24, 2021 at 16:03, Vladimir Oltean <olteanv@gmail.com> wrote: > > On Tue, Mar 23, 2021 at 10:17:30PM +0100, Tobias Waldekranz wrote: > >> > I don't see any place in the network stack that recalculates the FCS if > >> > NETIF_F_RXALL is set. Additionally, without NETIF_F_RXFCS, I don't even > >> > know how could the stack even tell a packet with bad FCS apart from one > >> > with good FCS. If NETIF_F_RXALL is set, then once a packet is received, > >> > it's taken for granted as good. > >> > >> Right, but there is a difference between a user explicitly enabling it > >> on a device and us enabling it because we need it internally in the > >> kernel. > >> > >> In the first scenario, the user can hardly complain as they have > >> explicitly requested to see all packets on that device. That would not > >> be true in the second one because there would be no way for the user to > >> turn it off. It feels like you would end up in a similar situation as > >> with the user- vs. kernel- promiscuous setting. > >> > >> It seems to me if we enable it, we are responsible for not letting crap > >> through to the port netdevs. > > > > I think there exists an intermediate approach between processing the > > frames on the RX queue and installing a soft parser. > > > > The BMI of FMan RX ports has a configurable pipeline through Next > > Invoked Actions (NIA). Through the FMBM_RFNE register (Rx Frame Next > > Engine), it is possible to change the Next Invoked Action from the > > default value (which is the hardware parser). You can choose to make the > > Buffer Manager Interface enqueue the packet directly to the Queue > > Manager Interface (QMI). This will effectively bypass the hardware > > parser, so DSA frames will never be sent to the error queue if they have > > an invalid EtherType/Length field. > > > > Additionally, frames with a bad FCS should still be discarded, as that > > is done by the MAC (an earlier stage compared to the BMI). > > Yeah this sounds like the perfect middle ground. I guess that would then > be activated with an `if (netdev_uses_dsa(dev))`-guard in the driver, > like how Florian solved it for stmmac? Since it is not quite "rx-all". I think this would have to be guarded by netdev_uses_dsa for now, yes. Also, it is far from being a "perfect" middle ground, because if you disable the hardware parser, you also lose the ability to do frame classification and hashing/flow steering to multiple RX queues on that port, I think.
On Wed, Mar 24, 2021 at 17:08, Vladimir Oltean <olteanv@gmail.com> wrote: > On Wed, Mar 24, 2021 at 04:02:52PM +0100, Tobias Waldekranz wrote: >> On Wed, Mar 24, 2021 at 16:03, Vladimir Oltean <olteanv@gmail.com> wrote: >> > On Tue, Mar 23, 2021 at 10:17:30PM +0100, Tobias Waldekranz wrote: >> >> > I don't see any place in the network stack that recalculates the FCS if >> >> > NETIF_F_RXALL is set. Additionally, without NETIF_F_RXFCS, I don't even >> >> > know how could the stack even tell a packet with bad FCS apart from one >> >> > with good FCS. If NETIF_F_RXALL is set, then once a packet is received, >> >> > it's taken for granted as good. >> >> >> >> Right, but there is a difference between a user explicitly enabling it >> >> on a device and us enabling it because we need it internally in the >> >> kernel. >> >> >> >> In the first scenario, the user can hardly complain as they have >> >> explicitly requested to see all packets on that device. That would not >> >> be true in the second one because there would be no way for the user to >> >> turn it off. It feels like you would end up in a similar situation as >> >> with the user- vs. kernel- promiscuous setting. >> >> >> >> It seems to me if we enable it, we are responsible for not letting crap >> >> through to the port netdevs. >> > >> > I think there exists an intermediate approach between processing the >> > frames on the RX queue and installing a soft parser. >> > >> > The BMI of FMan RX ports has a configurable pipeline through Next >> > Invoked Actions (NIA). Through the FMBM_RFNE register (Rx Frame Next >> > Engine), it is possible to change the Next Invoked Action from the >> > default value (which is the hardware parser). You can choose to make the >> > Buffer Manager Interface enqueue the packet directly to the Queue >> > Manager Interface (QMI). This will effectively bypass the hardware >> > parser, so DSA frames will never be sent to the error queue if they have >> > an invalid EtherType/Length field. >> > >> > Additionally, frames with a bad FCS should still be discarded, as that >> > is done by the MAC (an earlier stage compared to the BMI). >> >> Yeah this sounds like the perfect middle ground. I guess that would then >> be activated with an `if (netdev_uses_dsa(dev))`-guard in the driver, >> like how Florian solved it for stmmac? Since it is not quite "rx-all". > > I think this would have to be guarded by netdev_uses_dsa for now, yes. > Also, it is far from being a "perfect" middle ground, because if you > disable the hardware parser, you also lose the ability to do frame > classification and hashing/flow steering to multiple RX queues on that > port, I think. But even if the parser was enabled, it would never get anywhere since the Ethertype would look like random garbage. Unless we have the soft parser, but then it is not the middle ground anymore :) I suppose you would like to test for netdev_uses_dsa_and_violates_8023, that way you could still do RSS on DSA devices using regular 1Q-tags for example. Do we want to add this property to the taggers so that we do not degrade performance for any existing users?
On Wed, Mar 24, 2021 at 05:07:09PM +0100, Tobias Waldekranz wrote: > But even if the parser was enabled, it would never get anywhere since > the Ethertype would look like random garbage. Unless we have the soft > parser, but then it is not the middle ground anymore :) Garbage, true, but garbage with enough entropy to allow for some sort of RFS (ideally you can get the source port field from the DSA tag into the area covered by the n-tuple on which the master performs hashing). This is the way in which the switches inside NXP LS1028A and T1040 work. > I suppose you would like to test for netdev_uses_dsa_and_violates_8023, > that way you could still do RSS on DSA devices using regular 1Q-tags for > example. Do we want to add this property to the taggers so that we do > not degrade performance for any existing users? Yes, so T1040 is one such example of device that would be negatively affected by this change. There isn't a good solution to solve all problems: there will be some Marvell switches which can't operate in EDSA mode, and there will be some DSA masters that can't parse Marvell DSA tags. Eventually all possible combinations of workarounds will have to be implemented. But for now, I think I prefer to see the simplest one, which has just become the one based on device tree.
On Thu, Mar 25, 2021 at 03:34, Vladimir Oltean <olteanv@gmail.com> wrote: > On Wed, Mar 24, 2021 at 05:07:09PM +0100, Tobias Waldekranz wrote: >> But even if the parser was enabled, it would never get anywhere since >> the Ethertype would look like random garbage. Unless we have the soft >> parser, but then it is not the middle ground anymore :) > > Garbage, true, but garbage with enough entropy to allow for some sort of > RFS (ideally you can get the source port field from the DSA tag into the > area covered by the n-tuple on which the master performs hashing). This > is the way in which the switches inside NXP LS1028A and T1040 work. I see what you are saying. Any given flow would still have the same not-really-an-Ethertype. >> I suppose you would like to test for netdev_uses_dsa_and_violates_8023, >> that way you could still do RSS on DSA devices using regular 1Q-tags for >> example. Do we want to add this property to the taggers so that we do >> not degrade performance for any existing users? > > Yes, so T1040 is one such example of device that would be negatively > affected by this change. There isn't a good solution to solve all > problems: there will be some Marvell switches which can't operate in > EDSA mode, and there will be some DSA masters that can't parse Marvell > DSA tags. Eventually all possible combinations of workarounds will have > to be implemented. But for now, I think I prefer to see the simplest > one, which has just become the one based on device tree. Alright, it seems like everyone agrees then. I will look into it. Just to avoid a DenverCoder9 situation; I tried changing the NIA in FMBM_RFNE like you suggested: 8< --- diff --git a/drivers/net/ethernet/freescale/fman/fman_port.c b/drivers/net/ethernet/freescale/fman/fman_port.c index d9baac0dbc7d..5aa5b4068f2d 100644 --- a/drivers/net/ethernet/freescale/fman/fman_port.c +++ b/drivers/net/ethernet/freescale/fman/fman_port.c @@ -543,7 +543,7 @@ static int init_bmi_rx(struct fman_port *port) /* NIA */ tmp = (u32)cfg->rx_fd_bits << BMI_NEXT_ENG_FD_BITS_SHIFT; - tmp |= NIA_ENG_HWP; + tmp |= NIA_ENG_BMI | NIA_BMI_AC_ENQ_FRAME; iowrite32be(tmp, ®s->fmbm_rfne); /* Parser Next Engine NIA */ 8< --- From what I can tell, this works as expected. TO_CPUs from port 8 can ingress the device with this in place.
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 95f07fcd4f85..e7ec883d5f6b 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -2531,10 +2531,10 @@ static int mv88e6xxx_setup_port_mode(struct mv88e6xxx_chip *chip, int port) return mv88e6xxx_set_port_mode_normal(chip, port); /* Setup CPU port mode depending on its supported tag format */ - if (chip->info->tag_protocol == DSA_TAG_PROTO_DSA) + if (chip->tag_protocol == DSA_TAG_PROTO_DSA) return mv88e6xxx_set_port_mode_dsa(chip, port); - if (chip->info->tag_protocol == DSA_TAG_PROTO_EDSA) + if (chip->tag_protocol == DSA_TAG_PROTO_EDSA) return mv88e6xxx_set_port_mode_edsa(chip, port); return -EINVAL; @@ -5564,7 +5564,39 @@ static enum dsa_tag_protocol mv88e6xxx_get_tag_protocol(struct dsa_switch *ds, { struct mv88e6xxx_chip *chip = ds->priv; - return chip->info->tag_protocol; + return chip->tag_protocol; +} + +static int mv88e6xxx_change_tag_protocol(struct dsa_switch *ds, int port, + enum dsa_tag_protocol proto) +{ + struct mv88e6xxx_chip *chip = ds->priv; + enum dsa_tag_protocol old_protocol; + int err; + + switch (proto) { + case DSA_TAG_PROTO_EDSA: + if (chip->info->tag_protocol != DSA_TAG_PROTO_EDSA) + dev_warn(chip->dev, "Relying on undocumented EDSA tagging behavior\n"); + + break; + case DSA_TAG_PROTO_DSA: + break; + default: + return -EPROTONOSUPPORT; + } + + old_protocol = chip->tag_protocol; + chip->tag_protocol = proto; + + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_setup_port_mode(chip, port); + mv88e6xxx_reg_unlock(chip); + + if (err) + chip->tag_protocol = old_protocol; + + return err; } static int mv88e6xxx_port_mdb_add(struct dsa_switch *ds, int port, @@ -6029,6 +6061,7 @@ static int mv88e6xxx_crosschip_lag_leave(struct dsa_switch *ds, int sw_index, static const struct dsa_switch_ops mv88e6xxx_switch_ops = { .get_tag_protocol = mv88e6xxx_get_tag_protocol, + .change_tag_protocol = mv88e6xxx_change_tag_protocol, .setup = mv88e6xxx_setup, .teardown = mv88e6xxx_teardown, .phylink_validate = mv88e6xxx_validate, @@ -6209,6 +6242,8 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev) if (err) goto out; + chip->tag_protocol = chip->info->tag_protocol; + mv88e6xxx_phy_init(chip); if (chip->info->ops->get_eeprom) { diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index bce6e0dc8535..96b775f3fda2 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -261,6 +261,9 @@ struct mv88e6xxx_region_priv { struct mv88e6xxx_chip { const struct mv88e6xxx_info *info; + /* Currently configured tagging protocol */ + enum dsa_tag_protocol tag_protocol; + /* The dsa_switch this private structure is related to */ struct dsa_switch *ds;
All devices are capable of using regular DSA tags. Support for Ethertyped DSA tags sort into three categories: 1. No support. Older chips fall into this category. 2. Full support. Datasheet explicitly supports configuring the CPU port to receive FORWARDs with a DSA tag. 3. Undocumented support. Datasheet lists the configuration from category 2 as "reserved for future use", but does empirically behave like a category 2 device. Because there are ethernet controllers that do not handle regular DSA tags in all cases, it is sometimes preferable to rely on the undocumented behavior, as the alternative is a very crippled system. But, in those cases, make sure to log the fact that an undocumented feature has been enabled. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> --- In a system using an NXP T1023 SoC connected to a 6390X switch, we noticed that TO_CPU frames where not reaching the CPU. This only happened on hardware port 8. Looking at the DSA master interface (dpaa-ethernet) we could see that an Rx error counter was bumped at the same rate. The logs indicated a parser error. It just so happens that a TO_CPU coming in on device 0, port 8, will result in the first two bytes of the DSA tag being one of: 00 40 00 44 00 46 My guess is that since these values look like 802.3 length fields, the controller's parser will signal an error if the frame length does not match what is in the header. As a workaround, switching to EDSA (thereby always having a proper EtherType in the frame) solves the issue. drivers/net/dsa/mv88e6xxx/chip.c | 41 +++++++++++++++++++++++++++++--- drivers/net/dsa/mv88e6xxx/chip.h | 3 +++ 2 files changed, 41 insertions(+), 3 deletions(-)