mbox series

[00/12] Add support for PSE port priority

Message ID 20241002-feature_poe_port_prio-v1-0-eb067b78d6cf@bootlin.com (mailing list archive)
Headers show
Series Add support for PSE port priority | expand

Message

Kory Maincent Oct. 2, 2024, 4:14 p.m. UTC
From: Kory Maincent (Dent Project) <kory.maincent@bootlin.com>

This series brings support for port priority in the PSE subsystem.
PSE controllers can set priorities to decide which ports should be
turned off in case of special events like over-current.

This series also adds support for the devm_pse_irq_helper() helper,
similarly to devm_regulator_irq_helper(), to report events and errors.
Wrappers are used to avoid regulator naming in PSE drivers to prevent
confusion.

Patches 1-3: Cosmetics.
Patch 4: Adds support for last supported features in the TPS23881 drivers.
Patches 5-7: Add support for port priority in PSE core and ethtool.
Patches 8-9: Add support for port priority in PD692x0 and TPS23881 drivers.
Patches 10-11: Add support for devm_pse_irq_helper() helper in PSE core and
               ethtool.
Patch 12: Adds support for interrupt and event report in TPS23881 driver.

This patch series is based on the fix sent recently:
https://lore.kernel.org/netdev/20241002121706.246143-1-kory.maincent@bootlin.com/T/#u

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
---
Kory Maincent (12):
      net: pse-pd: Remove unused pse_ethtool_get_pw_limit function declaration
      net: pse-pd: tps23881: Correct boolean evaluation for bitmask checks
      net: pse-pd: tps23881: Simplify function returns by removing redundant checks
      net: pse-pd: tps23881: Add support for power limit and measurement features
      net: pse-pd: Add support for getting and setting port priority
      net: ethtool: Add PSE new port priority support feature
      netlink: specs: Expand the PSE netlink command with C33 prio attributes
      net: pse-pd: pd692x0: Add support for PSE PI priority feature
      net: pse-pd: tps23881: Add support for PSE PI priority feature
      net: pse-pd: Register regulator even for undescribed PSE PIs
      net: pse-pd: Add support for event reporting using devm_regulator_irq_helper
      net: pse-pd: tps23881: Add support for PSE events and interrupts

 Documentation/netlink/specs/ethtool.yaml     |  11 +
 Documentation/networking/ethtool-netlink.rst |  16 +
 drivers/net/pse-pd/pd692x0.c                 |  23 ++
 drivers/net/pse-pd/pse_core.c                |  66 +++-
 drivers/net/pse-pd/tps23881.c                | 532 +++++++++++++++++++++++++--
 include/linux/pse-pd/pse.h                   |  43 ++-
 include/uapi/linux/ethtool_netlink.h         |   2 +
 net/ethtool/pse-pd.c                         |  18 +
 8 files changed, 674 insertions(+), 37 deletions(-)
---
base-commit: 8052e7ff851b33e77f23800f8d15bafae9f97d17
change-id: 20240913-feature_poe_port_prio-a51aed7332ec

Best regards,

Comments

Kory Maincent Oct. 2, 2024, 4:17 p.m. UTC | #1
On Wed, 02 Oct 2024 18:14:11 +0200
Kory Maincent <kory.maincent@bootlin.com> wrote:

> From: Kory Maincent (Dent Project) <kory.maincent@bootlin.com>
> 
> This series brings support for port priority in the PSE subsystem.
> PSE controllers can set priorities to decide which ports should be
> turned off in case of special events like over-current.
> 
> This series also adds support for the devm_pse_irq_helper() helper,
> similarly to devm_regulator_irq_helper(), to report events and errors.
> Wrappers are used to avoid regulator naming in PSE drivers to prevent
> confusion.

Dohh missing the net-next prefix. Sorry !!!

Regards,
Kory Maincent Oct. 2, 2024, 4:23 p.m. UTC | #2
On Wed, 02 Oct 2024 18:14:11 +0200
Kory Maincent <kory.maincent@bootlin.com> wrote:

> From: Kory Maincent (Dent Project) <kory.maincent@bootlin.com>
> 
> This series brings support for port priority in the PSE subsystem.
> PSE controllers can set priorities to decide which ports should be
> turned off in case of special events like over-current.
> 
> This series also adds support for the devm_pse_irq_helper() helper,
> similarly to devm_regulator_irq_helper(), to report events and errors.
> Wrappers are used to avoid regulator naming in PSE drivers to prevent
> confusion.

pw-bot: cr
Kyle Swenson Oct. 9, 2024, 12:20 a.m. UTC | #3
Hello Kory,

On Wed, Oct 02, 2024 at 06:14:11PM +0200, Kory Maincent wrote:
> From: Kory Maincent (Dent Project) <kory.maincent@bootlin.com>
> 
> This series brings support for port priority in the PSE subsystem.
> PSE controllers can set priorities to decide which ports should be
> turned off in case of special events like over-current.

First off, great work here.  I've read through the patches in the series and
have a pretty good idea of what you're trying to achieve- use the PSE
controller's idea of "port priority" and expose this to userspace via ethtool.

I think this is probably sufficient but I wanted to share my experience
supporting a system level PSE power budget with PSE port priorities across
different PSE controllers through the same userspace interface such that
userspace doesn't know or care about the underlying PSE controller.

Out of the three PSE controllers I'm aware of (Microchip's PD692x0, TI's
TPS2388x, and LTC's LT4266), the PD692x0 definitely has the most advanced
configuration, supporting concepts like a system (well, manager) level budget
and powering off lower priority ports in the event that the port power
consumption is greater than the system budget.

When we experimented with this feature in our routers, we found it to be using
the dynamic power consumed by a particular port- literally, the summation of
port current * port voltage across all the ports.  While this behavior
technically saves the system from resetting or worse, it causes a bit of a
problem with lower priority ports getting powered off depending on the behavior
(power consumption) of unrelated devices.  

As an example, let's say we've got 4 devices, all powered, and we're close to
the power budget.  One of the devices starts consuming more power (perhaps it's
modem just powered on), but not more than it's class limit.  Say this device
consumes enough power to exceed the configured power budget, causing the lowest
priority device to be powered off.  This is the documented and intended
behavior of the PD692x0 chipset, but causes an unpleasant user experience
because it's not really clear why some device was powered down all the sudden.
Was it because someone unplugged it? Or because the modem on the high priority
device turned on?  Or maybe that device had an overcurrent?  It'd be impossible
to tell, and even worse, by the time someone is able to physically look at the
switch, the low priority device might be back online (perhaps the modem on
the high priority device powered off).

This behavior is unique to the PD692x0- I'm much less familiar with the
TPS2388x's idea of port priority but it is very different from the PD692x0.
Frankly the behavior of the OSS pin is confusing and since we don't use the PSE
controllers' idea of port priority, it was safe to ignore it. Finally, the
LTC4266 has a "masked shutdown" ability where a predetermined set of ports are
shutdown when a specific pin (MSD) is driven low.  Like the TPS2388x's OSS pin,
We ignore this feature on the LTC4266.

If the end-goal here is to have a device-independent idea of "port priority" I
think we need to add a level of indirection between the port priority concept and the
actual PSE hardware.  The indirection would enable a system with multiple
(possibly heterogeneous even) PSE chips to have a unified idea of port
priority.  The way we've implemented this in our routers is by putting the PSE
controllers in "semi-auto" mode, where they continually detect and classify PDs
(powered device), but do not power them until instructed to do so.  The
mechanism that decides to power a particular port or not (for lack of a better
term, "budgeting logic") uses the available system power budget (configured
from userspace), the relative port priorities (also configured from userspace)
and the class of a detected PD.  The classification result is used to determine
the _maximum_ power a particular PD might draw, and that is the value that is
subtracted from the power budget.

Using the PD's classification and then allocating it the maximum power for that
class enables a non-technical installer to plug in all the PDs at the switch,
and observe if all the PDs are powered (or not).  But the important part is
(unless the port priorities or power budget are changed from userspace) the
devices that are powered won't change due to dynamic power consumption of the
other devices.

I'm not sure what the right path is for the kernel, and I'm not sure how this
would look with the regulator integration, nor am I sure what the userspace API
should look like (we used sysfs, but that's probably not ideal for upstream).
It's also not clear how much of the budgeting logic should be in the kernel, if
any. Despite that, hopefully sharing our experience is insightful and/or
helpful.  If not, feel free to ignore it.  In any case, you've got my

Reviewed-by: Kyle Swenson <kyle.swenson@est.tech>

for all the patches in the series.

Thanks,
Kyle Swenson
Kyle Swenson Oct. 9, 2024, 1:49 p.m. UTC | #4
On Wed, Oct 09, 2024 at 12:20:21AM +0000, Kyle Swenson wrote:
> Hello Kory,
> 
> On Wed, Oct 02, 2024 at 06:14:11PM +0200, Kory Maincent wrote:
> > From: Kory Maincent (Dent Project) <kory.maincent@bootlin.com>
> > 
> > This series brings support for port priority in the PSE subsystem.
> > PSE controllers can set priorities to decide which ports should be
> > turned off in case of special events like over-current.
> 
> First off, great work here.  I've read through the patches in the series and
> have a pretty good idea of what you're trying to achieve- use the PSE
> controller's idea of "port priority" and expose this to userspace via ethtool.
> 
> I think this is probably sufficient but I wanted to share my experience
> supporting a system level PSE power budget with PSE port priorities across
> different PSE controllers through the same userspace interface such that
> userspace doesn't know or care about the underlying PSE controller.
> 
> Out of the three PSE controllers I'm aware of (Microchip's PD692x0, TI's
> TPS2388x, and LTC's LT4266), the PD692x0 definitely has the most advanced
> configuration, supporting concepts like a system (well, manager) level budget
> and powering off lower priority ports in the event that the port power
> consumption is greater than the system budget.
> 
> When we experimented with this feature in our routers, we found it to be using
> the dynamic power consumed by a particular port- literally, the summation of
> port current * port voltage across all the ports.  While this behavior
> technically saves the system from resetting or worse, it causes a bit of a
> problem with lower priority ports getting powered off depending on the behavior
> (power consumption) of unrelated devices.  
> 
> As an example, let's say we've got 4 devices, all powered, and we're close to
> the power budget.  One of the devices starts consuming more power (perhaps it's
> modem just powered on), but not more than it's class limit.  Say this device
> consumes enough power to exceed the configured power budget, causing the lowest
> priority device to be powered off.  This is the documented and intended
> behavior of the PD692x0 chipset, but causes an unpleasant user experience
> because it's not really clear why some device was powered down all the sudden.
> Was it because someone unplugged it? Or because the modem on the high priority
> device turned on?  Or maybe that device had an overcurrent?  It'd be impossible
> to tell, and even worse, by the time someone is able to physically look at the
> switch, the low priority device might be back online (perhaps the modem on
> the high priority device powered off).
> 
> This behavior is unique to the PD692x0- I'm much less familiar with the
> TPS2388x's idea of port priority but it is very different from the PD692x0.
> Frankly the behavior of the OSS pin is confusing and since we don't use the PSE
> controllers' idea of port priority, it was safe to ignore it. Finally, the
> LTC4266 has a "masked shutdown" ability where a predetermined set of ports are
> shutdown when a specific pin (MSD) is driven low.  Like the TPS2388x's OSS pin,
> We ignore this feature on the LTC4266.
> 
> If the end-goal here is to have a device-independent idea of "port priority" I
> think we need to add a level of indirection between the port priority concept and the
> actual PSE hardware.  The indirection would enable a system with multiple
> (possibly heterogeneous even) PSE chips to have a unified idea of port
> priority.  The way we've implemented this in our routers is by putting the PSE
> controllers in "semi-auto" mode, where they continually detect and classify PDs
> (powered device), but do not power them until instructed to do so.  The
> mechanism that decides to power a particular port or not (for lack of a better
> term, "budgeting logic") uses the available system power budget (configured
> from userspace), the relative port priorities (also configured from userspace)
> and the class of a detected PD.  The classification result is used to determine
> the _maximum_ power a particular PD might draw, and that is the value that is
> subtracted from the power budget.
> 
> Using the PD's classification and then allocating it the maximum power for that
> class enables a non-technical installer to plug in all the PDs at the switch,
> and observe if all the PDs are powered (or not).  But the important part is
> (unless the port priorities or power budget are changed from userspace) the
> devices that are powered won't change due to dynamic power consumption of the
> other devices.
> 
> I'm not sure what the right path is for the kernel, and I'm not sure how this
> would look with the regulator integration, nor am I sure what the userspace API
> should look like (we used sysfs, but that's probably not ideal for upstream).
> It's also not clear how much of the budgeting logic should be in the kernel, if
> any. Despite that, hopefully sharing our experience is insightful and/or
> helpful.  If not, feel free to ignore it.  In any case, you've got my
> 
> Reviewed-by: Kyle Swenson <kyle.swenson@est.tech>
> 
> for all the patches in the series.
> 
> Thanks,
> Kyle Swenson

Somehow I've managed to post this to the wrong thread.  Please
disregard this message (don't reply to it), I'll send the same text
above to the correct thread shortly.

Sorry,
Kyle