Message ID | 20231118123205.266819-2-arinc.unal@arinc9.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | MT7530 DSA subdriver improvements | expand |
On Sat, Nov 18, 2023 at 03:31:51PM +0300, Arınç ÜNAL wrote: > + /* Set the CPU port to trap frames to for MT7530. Trapped frames will be > + * forwarded to the numerically smallest CPU port which the DSA conduit > + * interface its affine to is up. > + */ > + if (priv->id != ID_MT7530 && priv->id != ID_MT7621) > + return; > + > + if (operational) > + priv->active_cpu_ports |= BIT(cpu_dp->index); > + else > + priv->active_cpu_ports &= ~BIT(cpu_dp->index); > + > + if (priv->active_cpu_ports) > + mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, CPU_EN | > + CPU_PORT(__ffs(priv->active_cpu_ports))); I would be tempted to write this as: mask = BIT(cpu_dp->index); if (operational) priv->active_cpu_ports |= mask; else priv->active_cpu_ports &= ~mask; Now, what happens when active_cpu_ports is zero? Doesn't that mean there is no active CPU port? In which case, wouldn't disabling the CPU port direction be appropriate, such as: if (priv->active_cpu_ports) val = CPU_EN | CPU_PORT(__ffs(priv->active_cpu_ports)); else val = 0; mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, val); ? > struct mt7530_priv { > struct device *dev; > @@ -786,6 +787,7 @@ struct mt7530_priv { > struct irq_domain *irq_domain; > u32 irq_enable; > int (*create_sgmii)(struct mt7530_priv *priv, bool dual_sgmii); > + unsigned long active_cpu_ports; So this will be 32 or 64 bit in size. Presumably you know how many CPU ports there can be, which looking at this code must be less than 8 as CPU_PORT_MASK is only 3 bits in size. So, maybe use a u8, and check that cpu_dp->index <= 7 ? I would also suggest moving irq_enable after create_sgmii, to avoid holes in the struct.
On Sat, Nov 18, 2023 at 03:31:51PM +0300, Arınç ÜNAL wrote: > +static void > +mt753x_conduit_state_change(struct dsa_switch *ds, > + const struct net_device *conduit, > + bool operational) > +{ > + struct mt7530_priv *priv = ds->priv; > + struct dsa_port *cpu_dp = conduit->dsa_ptr; nitpick: reverse Christmas tree variable ordering > + > + /* Set the CPU port to trap frames to for MT7530. Trapped frames will be > + * forwarded to the numerically smallest CPU port which the DSA conduit > + * interface its affine to is up. > + */ > + if (priv->id != ID_MT7530 && priv->id != ID_MT7621) > + return; > + > + if (operational) > + priv->active_cpu_ports |= BIT(cpu_dp->index); > + else > + priv->active_cpu_ports &= ~BIT(cpu_dp->index); > + > + if (priv->active_cpu_ports) > + mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, CPU_EN | > + CPU_PORT(__ffs(priv->active_cpu_ports))); > +}
On 18.11.2023 17:34, Russell King (Oracle) wrote: > On Sat, Nov 18, 2023 at 03:31:51PM +0300, Arınç ÜNAL wrote: >> + /* Set the CPU port to trap frames to for MT7530. Trapped frames will be >> + * forwarded to the numerically smallest CPU port which the DSA conduit >> + * interface its affine to is up. >> + */ >> + if (priv->id != ID_MT7530 && priv->id != ID_MT7621) >> + return; >> + >> + if (operational) >> + priv->active_cpu_ports |= BIT(cpu_dp->index); >> + else >> + priv->active_cpu_ports &= ~BIT(cpu_dp->index); >> + >> + if (priv->active_cpu_ports) >> + mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, CPU_EN | >> + CPU_PORT(__ffs(priv->active_cpu_ports))); > > I would be tempted to write this as: > > mask = BIT(cpu_dp->index); > > if (operational) > priv->active_cpu_ports |= mask; > else > priv->active_cpu_ports &= ~mask; > > Now, what happens when active_cpu_ports is zero? Doesn't that mean there > is no active CPU port? In which case, wouldn't disabling the CPU port > direction be appropriate, such as: > > if (priv->active_cpu_ports) > val = CPU_EN | CPU_PORT(__ffs(priv->active_cpu_ports)); > else > val = 0; > > mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, val); > > ? In practice, it doesn't seem to matter. The CPU_EN bit enables the CPU port defined on CPU_PORT_MASK which is used for trapping frames. No active CPU ports would mean that all the DSA conduits are down. In that case, all the user ports will be down also. So there won't be any traffic. But disabling it is of course more appropriate here. > >> struct mt7530_priv { >> struct device *dev; >> @@ -786,6 +787,7 @@ struct mt7530_priv { >> struct irq_domain *irq_domain; >> u32 irq_enable; >> int (*create_sgmii)(struct mt7530_priv *priv, bool dual_sgmii); >> + unsigned long active_cpu_ports; > > So this will be 32 or 64 bit in size. Presumably you know how many CPU > ports there can be, which looking at this code must be less than 8 as > CPU_PORT_MASK is only 3 bits in size. So, maybe use a u8, and check > that cpu_dp->index <= 7 ? Aren't there other mechanisms to check that cpu_dp->index is a valid port? At least with phylink_get_caps(), only ports lower than 7 will have proper interface modes allowed. Here's the code after you and Vladimir's review: static void mt753x_conduit_state_change(struct dsa_switch *ds, const struct net_device *conduit, bool operational) { struct dsa_port *cpu_dp = conduit->dsa_ptr; struct mt7530_priv *priv = ds->priv; u8 mask; int val; /* Set the CPU port to trap frames to for MT7530. Trapped frames will be * forwarded to the numerically smallest CPU port which the DSA conduit * interface its affine to is up. */ if (priv->id != ID_MT7530 && priv->id != ID_MT7621) return; mask = BIT(cpu_dp->index); if (operational) priv->active_cpu_ports |= mask; else priv->active_cpu_ports &= ~mask; if (priv->active_cpu_ports) val = CPU_EN | CPU_PORT(__ffs(priv->active_cpu_ports)); else val = 0; mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, val); } struct mt7530_priv { [...] u8 active_cpu_ports; }; > > I would also suggest moving irq_enable after create_sgmii, to avoid > holes in the struct. Sorry, I've got no idea about this. Could you explain why would there possibly be holes in the struct with the current ordering of the members of the mt7530_priv structure? Arınç
On Sat, Dec 02, 2023 at 11:29:18AM +0300, Arınç ÜNAL wrote: > > I would be tempted to write this as: > > > > mask = BIT(cpu_dp->index); > > > > if (operational) > > priv->active_cpu_ports |= mask; > > else > > priv->active_cpu_ports &= ~mask; > > > > Now, what happens when active_cpu_ports is zero? Doesn't that mean there > > is no active CPU port? In which case, wouldn't disabling the CPU port > > direction be appropriate, such as: > > > > if (priv->active_cpu_ports) > > val = CPU_EN | CPU_PORT(__ffs(priv->active_cpu_ports)); > > else > > val = 0; > > > > mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, val); > > > > ? > > In practice, it doesn't seem to matter. The CPU_EN bit enables the CPU port > defined on CPU_PORT_MASK which is used for trapping frames. No active CPU > ports would mean that all the DSA conduits are down. In that case, all the > user ports will be down also. So there won't be any traffic. But disabling > it is of course more appropriate here. Ack, DSA takes down the user ports which are affine to a certain conduit interface when that goes down. See the NETDEV_GOING_DOWN handling in net/dsa/user.c. > > > > > struct mt7530_priv { > > > struct device *dev; > > > @@ -786,6 +787,7 @@ struct mt7530_priv { > > > struct irq_domain *irq_domain; > > > u32 irq_enable; > > > int (*create_sgmii)(struct mt7530_priv *priv, bool dual_sgmii); > > > + unsigned long active_cpu_ports; > > > > So this will be 32 or 64 bit in size. Presumably you know how many CPU > > ports there can be, which looking at this code must be less than 8 as > > CPU_PORT_MASK is only 3 bits in size. So, maybe use a u8, and check > > that cpu_dp->index <= 7 ? We picked "unsigned long" as storage because that's also the size of the argument that __ffs() takes. But admittedly, we could have also stored a smaller variable and promote it to unsigned long when we pass it to __ffs(). > Aren't there other mechanisms to check that cpu_dp->index is a valid port? cpu_dp->index is guaranteed by DSA to be valid (according to the "reg" value from the device tree and smaller than ds->num_ports). It's just a question of balancing this kind of optimization with the possibility that a future switch appears which has more than MT7530_NUM_PORTS (7) ports. > At least with phylink_get_caps(), only ports lower than 7 will have proper > interface modes allowed. > > Here's the code after you and Vladimir's review: > > static void > mt753x_conduit_state_change(struct dsa_switch *ds, > const struct net_device *conduit, > bool operational) > { > struct dsa_port *cpu_dp = conduit->dsa_ptr; > struct mt7530_priv *priv = ds->priv; > u8 mask; > int val; > > /* Set the CPU port to trap frames to for MT7530. Trapped frames will be > * forwarded to the numerically smallest CPU port which the DSA conduit > * interface its affine to is up. > */ > if (priv->id != ID_MT7530 && priv->id != ID_MT7621) > return; > > mask = BIT(cpu_dp->index); > > if (operational) > priv->active_cpu_ports |= mask; > else > priv->active_cpu_ports &= ~mask; > > if (priv->active_cpu_ports) > val = CPU_EN | CPU_PORT(__ffs(priv->active_cpu_ports)); > else > val = 0; You could initialize "val" with 0 at declaration time and you wouldn't need the "else" branch. > > mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, val); > } > > struct mt7530_priv { > [...] > u8 active_cpu_ports; > }; Actually, looking at the code now, I don't understand why we even keep track of the active_cpu_ports mask in the driver. We could read the MT7530_MFC register in mt753x_conduit_state_change(), flip the bit corresponding just to cpu_dp->index (rather than rmw all of CPU_PORT_MASK), and write back the result. And to address Russell's concern, we could test whether the resulting CPU_PORT_MASK portion of what we're going to write back is all-zeroes or not, and if it is, clear the CPU_EN bit, otherwise set it. > > > > > I would also suggest moving irq_enable after create_sgmii, to avoid > > holes in the struct. > > Sorry, I've got no idea about this. Could you explain why would there > possibly be holes in the struct with the current ordering of the members of > the mt7530_priv structure? > > Arınç FWIW: $ pahole -C mt7530_priv $KBUILD_OUTPUT/drivers/net/dsa/mt7530.o struct mt7530_priv { struct device * dev; /* 0 8 */ struct dsa_switch * ds; /* 8 8 */ struct mii_bus * bus; /* 16 8 */ struct regmap * regmap; /* 24 8 */ struct reset_control * rstc; /* 32 8 */ struct regulator * core_pwr; /* 40 8 */ struct regulator * io_pwr; /* 48 8 */ struct gpio_desc * reset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ const struct mt753x_info * info; /* 64 8 */ unsigned int id; /* 72 4 */ bool mcm; /* 76 1 */ /* XXX 3 bytes hole, try to pack */ phy_interface_t p6_interface; /* 80 4 */ phy_interface_t p5_interface; /* 84 4 */ unsigned int p5_intf_sel; /* 88 4 */ u8 mirror_rx; /* 92 1 */ u8 mirror_tx; /* 93 1 */ /* XXX 2 bytes hole, try to pack */ struct mt7530_port ports[7]; /* 96 168 */ /* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */ struct mt753x_pcs pcs[7]; /* 264 280 */ /* --- cacheline 8 boundary (512 bytes) was 32 bytes ago --- */ struct mutex reg_mutex; /* 544 32 */ /* --- cacheline 9 boundary (576 bytes) --- */ int irq; /* 576 4 */ /* XXX 4 bytes hole, try to pack */ struct irq_domain * irq_domain; /* 584 8 */ u32 irq_enable; /* 592 4 */ /* XXX 4 bytes hole, try to pack */ int (*create_sgmii)(struct mt7530_priv *, bool); /* 600 8 */ unsigned long active_cpu_ports; /* 608 8 */ /* size: 616, cachelines: 10, members: 24 */ /* sum members: 603, holes: 4, sum holes: 13 */ /* last cacheline: 40 bytes */ }; It's not like this makes any practical difference, as struct mt7530_priv isn't used from hot paths, but tidying it up is a good sign of clean, careful development, and of understanding memory alignment.
On 7.12.2023 20:48, Vladimir Oltean wrote: > On Sat, Dec 02, 2023 at 11:29:18AM +0300, Arınç ÜNAL wrote: >>>> struct mt7530_priv { >>>> struct device *dev; >>>> @@ -786,6 +787,7 @@ struct mt7530_priv { >>>> struct irq_domain *irq_domain; >>>> u32 irq_enable; >>>> int (*create_sgmii)(struct mt7530_priv *priv, bool dual_sgmii); >>>> + unsigned long active_cpu_ports; >>> >>> So this will be 32 or 64 bit in size. Presumably you know how many CPU >>> ports there can be, which looking at this code must be less than 8 as >>> CPU_PORT_MASK is only 3 bits in size. So, maybe use a u8, and check >>> that cpu_dp->index <= 7 ? > > We picked "unsigned long" as storage because that's also the size of the > argument that __ffs() takes. But admittedly, we could have also stored a > smaller variable and promote it to unsigned long when we pass it to __ffs(). > >> Aren't there other mechanisms to check that cpu_dp->index is a valid port? > > cpu_dp->index is guaranteed by DSA to be valid (according to the "reg" > value from the device tree and smaller than ds->num_ports). It's just a > question of balancing this kind of optimization with the possibility > that a future switch appears which has more than MT7530_NUM_PORTS (7) ports. If this was to happen - as unlikely as I find it - I would suggest adding a new MTXXXX_NUM_PORTS and set priv->ds->num_ports to it after checking priv->id. I'm a maintainer here so I'll keep an eye out. > >> >> mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, val); >> } >> >> struct mt7530_priv { >> [...] >> u8 active_cpu_ports; >> }; > > Actually, looking at the code now, I don't understand why we even keep > track of the active_cpu_ports mask in the driver. We could read the > MT7530_MFC register in mt753x_conduit_state_change(), flip the bit > corresponding just to cpu_dp->index (rather than rmw all of CPU_PORT_MASK), > and write back the result. And to address Russell's concern, we could test > whether the resulting CPU_PORT_MASK portion of what we're going to write > back is all-zeroes or not, and if it is, clear the CPU_EN bit, otherwise > set it. Correct me if I'm wrong, we introduced priv->active_cpu_ports because CPU_PORT_MASK from the MT7530_MFC register is a 3-bit mask, meant to keep a single port number ranging from 0 to 7. There's no single bit corresponding to cpu_dp->index on the MT7530_MFC register. So I don't see how what you suggest here would work. This is what I've got right now: static void mt753x_conduit_state_change(struct dsa_switch *ds, const struct net_device *conduit, bool operational) { struct dsa_port *cpu_dp = conduit->dsa_ptr; struct mt7530_priv *priv = ds->priv; u8 mask; int val = 0; /* Set the CPU port to trap frames to for MT7530. Trapped frames will be * forwarded to the numerically smallest CPU port which the DSA conduit * interface its affine to is up. */ if (priv->id != ID_MT7530 && priv->id != ID_MT7621) return; mask = BIT(cpu_dp->index); if (operational) priv->active_cpu_ports |= mask; else priv->active_cpu_ports &= ~mask; if (priv->active_cpu_ports) val = CPU_EN | CPU_PORT(__ffs((unsigned long)priv->active_cpu_ports)); mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, val); } > >> >>> >>> I would also suggest moving irq_enable after create_sgmii, to avoid >>> holes in the struct. >> >> Sorry, I've got no idea about this. Could you explain why would there >> possibly be holes in the struct with the current ordering of the members of >> the mt7530_priv structure? >> >> Arınç > > FWIW: > > $ pahole -C mt7530_priv $KBUILD_OUTPUT/drivers/net/dsa/mt7530.o > struct mt7530_priv { > struct device * dev; /* 0 8 */ > struct dsa_switch * ds; /* 8 8 */ > struct mii_bus * bus; /* 16 8 */ > struct regmap * regmap; /* 24 8 */ > struct reset_control * rstc; /* 32 8 */ > struct regulator * core_pwr; /* 40 8 */ > struct regulator * io_pwr; /* 48 8 */ > struct gpio_desc * reset; /* 56 8 */ > /* --- cacheline 1 boundary (64 bytes) --- */ > const struct mt753x_info * info; /* 64 8 */ > unsigned int id; /* 72 4 */ > bool mcm; /* 76 1 */ > > /* XXX 3 bytes hole, try to pack */ > > phy_interface_t p6_interface; /* 80 4 */ > phy_interface_t p5_interface; /* 84 4 */ > unsigned int p5_intf_sel; /* 88 4 */ > u8 mirror_rx; /* 92 1 */ > u8 mirror_tx; /* 93 1 */ > > /* XXX 2 bytes hole, try to pack */ > > struct mt7530_port ports[7]; /* 96 168 */ > /* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */ > struct mt753x_pcs pcs[7]; /* 264 280 */ > /* --- cacheline 8 boundary (512 bytes) was 32 bytes ago --- */ > struct mutex reg_mutex; /* 544 32 */ > /* --- cacheline 9 boundary (576 bytes) --- */ > int irq; /* 576 4 */ > > /* XXX 4 bytes hole, try to pack */ > > struct irq_domain * irq_domain; /* 584 8 */ > u32 irq_enable; /* 592 4 */ > > /* XXX 4 bytes hole, try to pack */ > > int (*create_sgmii)(struct mt7530_priv *, bool); /* 600 8 */ > unsigned long active_cpu_ports; /* 608 8 */ > > /* size: 616, cachelines: 10, members: 24 */ > /* sum members: 603, holes: 4, sum holes: 13 */ > /* last cacheline: 40 bytes */ > }; > > It's not like this makes any practical difference, as struct mt7530_priv > isn't used from hot paths, but tidying it up is a good sign of clean, > careful development, and of understanding memory alignment. Got it, thanks. I've got a few patches that introduce changes to the mt7530_priv structure. I'll make sure that, in the end, mt7530_priv won't have any holes. Arınç
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c index d27c6b70a2f6..442492d62670 100644 --- a/drivers/net/dsa/mt7530.c +++ b/drivers/net/dsa/mt7530.c @@ -1035,10 +1035,6 @@ mt753x_cpu_port_enable(struct dsa_switch *ds, int port) mt7530_set(priv, MT7530_MFC, BC_FFP(BIT(port)) | UNM_FFP(BIT(port)) | UNU_FFP(BIT(port))); - /* Set CPU port number */ - if (priv->id == ID_MT7530 || priv->id == ID_MT7621) - mt7530_rmw(priv, MT7530_MFC, CPU_MASK, CPU_EN | CPU_PORT(port)); - /* Add the CPU port to the CPU port bitmap for MT7531 and the switch on * the MT7988 SoC. Trapped frames will be forwarded to the CPU port that * is affine to the inbound user port. @@ -3075,6 +3071,31 @@ static int mt753x_set_mac_eee(struct dsa_switch *ds, int port, return 0; } +static void +mt753x_conduit_state_change(struct dsa_switch *ds, + const struct net_device *conduit, + bool operational) +{ + struct mt7530_priv *priv = ds->priv; + struct dsa_port *cpu_dp = conduit->dsa_ptr; + + /* Set the CPU port to trap frames to for MT7530. Trapped frames will be + * forwarded to the numerically smallest CPU port which the DSA conduit + * interface its affine to is up. + */ + if (priv->id != ID_MT7530 && priv->id != ID_MT7621) + return; + + if (operational) + priv->active_cpu_ports |= BIT(cpu_dp->index); + else + priv->active_cpu_ports &= ~BIT(cpu_dp->index); + + if (priv->active_cpu_ports) + mt7530_rmw(priv, MT7530_MFC, CPU_EN | CPU_PORT_MASK, CPU_EN | + CPU_PORT(__ffs(priv->active_cpu_ports))); +} + static int mt7988_pad_setup(struct dsa_switch *ds, phy_interface_t interface) { return 0; @@ -3130,6 +3151,7 @@ const struct dsa_switch_ops mt7530_switch_ops = { .phylink_mac_link_up = mt753x_phylink_mac_link_up, .get_mac_eee = mt753x_get_mac_eee, .set_mac_eee = mt753x_set_mac_eee, + .conduit_state_change = mt753x_conduit_state_change, }; EXPORT_SYMBOL_GPL(mt7530_switch_ops); diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h index 17e42d30fff4..96d610f5bcf9 100644 --- a/drivers/net/dsa/mt7530.h +++ b/drivers/net/dsa/mt7530.h @@ -41,8 +41,8 @@ enum mt753x_id { #define UNU_FFP(x) (((x) & 0xff) << 8) #define UNU_FFP_MASK UNU_FFP(~0) #define CPU_EN BIT(7) -#define CPU_PORT(x) ((x) << 4) -#define CPU_MASK (0xf << 4) +#define CPU_PORT_MASK GENMASK(6, 4) +#define CPU_PORT(x) FIELD_PREP(CPU_PORT_MASK, x) #define MIRROR_EN BIT(3) #define MIRROR_PORT(x) ((x) & 0x7) #define MIRROR_MASK 0x7 @@ -760,6 +760,7 @@ struct mt753x_info { * @irq_domain: IRQ domain of the switch irq_chip * @irq_enable: IRQ enable bits, synced to SYS_INT_EN * @create_sgmii: Pointer to function creating SGMII PCS instance(s) + * @active_cpu_ports: Holding the active CPU ports */ struct mt7530_priv { struct device *dev; @@ -786,6 +787,7 @@ struct mt7530_priv { struct irq_domain *irq_domain; u32 irq_enable; int (*create_sgmii)(struct mt7530_priv *priv, bool dual_sgmii); + unsigned long active_cpu_ports; }; struct mt7530_hw_vlan_entry {
On the MT7530 switch, the CPU_PORT field indicates which CPU port to trap frames to, regardless of the affinity of the inbound user port. When multiple CPU ports are in use, if the DSA conduit interface is down, trapped frames won't be passed to the conduit interface. To make trapping frames work including this case, implement ds->ops->master_state_change() on this subdriver and set the CPU_PORT field to the numerically smallest CPU port which the DSA conduit interface its affine to is up. Introduce the active_cpu_ports field to store the information of the active CPU ports. Correct the macros, CPU_PORT is bits 4 through 6 of the register. Add a comment to explain frame trapping for this switch. Currently, the driver doesn't support the use of multiple CPU ports so this is not necessarily a bug fix. Suggested-by: Vladimir Oltean <olteanv@gmail.com> Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk> Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> --- drivers/net/dsa/mt7530.c | 30 ++++++++++++++++++++++++++---- drivers/net/dsa/mt7530.h | 6 ++++-- 2 files changed, 30 insertions(+), 6 deletions(-)