diff mbox series

[net-next,1/6] ethtool: Add ability to control transceiver modules' power mode

Message ID 20211003073219.1631064-2-idosch@idosch.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series ethtool: Add ability to control transceiver modules' power mode | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 14 maintainers not CCed: vladyslavt@nvidia.com zhengyongjun3@huawei.com akpm@linux-foundation.org shenjian15@huawei.com petr.vorel@gmail.com corbet@lwn.net moyufeng@huawei.com huangguangbin2@huawei.com hkallweit1@gmail.com yajun.deng@linux.dev danieller@nvidia.com yangbo.lu@nxp.com linux-doc@vger.kernel.org arnd@arndb.de
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 2043 this patch: 2043
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? WARNING: line length of 84 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns WARNING: line length of 91 exceeds 80 columns WARNING: line length of 95 exceeds 80 columns
netdev/build_allmodconfig_warn success Errors and warnings before: 2037 this patch: 2037
netdev/header_inline success Link

Commit Message

Ido Schimmel Oct. 3, 2021, 7:32 a.m. UTC
From: Ido Schimmel <idosch@nvidia.com>

Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and
'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver
modules parameters and retrieve their status.

The first parameter to control is the power mode of the module. It is
only relevant for paged memory modules, as flat memory modules always
operate in low power mode.

When a paged memory module is in low power mode, its power consumption
is reduced to the minimum, the management interface towards the host is
available and the data path is deactivated.

User space can choose to put modules that are not currently in use in
low power mode and transition them to high power mode before putting the
associated ports administratively up. This is useful for user space that
favors reduced power consumption and lower temperatures over reduced
link up times. In QSFP-DD modules the transition from low power mode to
high power mode can take a few seconds and this transition is only
expected to get longer with future / more complex modules.

User space can control the power mode of the module via the power mode
policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible
values:

* high: Module is always in high power mode.

* auto: Module is transitioned by the host to high power mode when the
  first port using it is put administratively up and to low power mode
  when the last port using it is put administratively down.

The operational power mode of the module is available to user space via
the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not
reported to user space when a module is not plugged-in.

The user API is designed to be generic enough so that it could be used
for modules with different memory maps (e.g., SFF-8636, CMIS).

The only implementation of the device driver API in this series is for a
MAC driver (mlxsw) where the module is controlled by the device's
firmware, but it is designed to be generic enough so that it could also
be used by implementations where the module is controlled by the CPU.

CMIS testing

============

 # ethtool -m swp11
 Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
 ...
 Module State                              : 0x03 (ModuleReady)
 LowPwrAllowRequestHW                      : Off
 LowPwrRequestSW                           : Off

The module is not in low power mode, as it is not forced by hardware
(LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off).

The power mode can be queried from the kernel. In case
LowPwrAllowRequestHW was on, the kernel would need to take into account
the state of the LowPwrRequestHW signal, which is not visible to user
space.

 $ ethtool --show-module swp11
 Module parameters for swp11:
 power-mode-policy high
 power-mode high

Change the power mode policy to 'auto':

 # ethtool --set-module swp11 power-mode-policy auto

Query the power mode again:

 $ ethtool --show-module swp11
 Module parameters for swp11:
 power-mode-policy auto
 power-mode low

Verify with the data read from the EEPROM:

 # ethtool -m swp11
 Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
 ...
 Module State                              : 0x01 (ModuleLowPwr)
 LowPwrAllowRequestHW                      : Off
 LowPwrRequestSW                           : On

Put the associated port administratively up which will instruct the host
to transition the module to high power mode:

 # ip link set dev swp11 up

Query the power mode again:

 $ ethtool --show-module swp11
 Module parameters for swp11:
 power-mode-policy auto
 power-mode high

Verify with the data read from the EEPROM:

 # ethtool -m swp11
 Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
 ...
 Module State                              : 0x03 (ModuleReady)
 LowPwrAllowRequestHW                      : Off
 LowPwrRequestSW                           : Off

Put the associated port administratively down which will instruct the
host to transition the module to low power mode:

 # ip link set dev swp11 down

Query the power mode again:

 $ ethtool --show-module swp11
 Module parameters for swp11:
 power-mode-policy auto
 power-mode low

Verify with the data read from the EEPROM:

 # ethtool -m swp11
 Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
 ...
 Module State                              : 0x01 (ModuleLowPwr)
 LowPwrAllowRequestHW                      : Off
 LowPwrRequestSW                           : On

SFF-8636 testing
================

 # ethtool -m swp13
 Identifier                                : 0x11 (QSFP28)
 ...
 Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) enabled
 Power set                                 : Off
 Power override                            : On
 ...
 Transmit avg optical power (Channel 1)    : 0.7733 mW / -1.12 dBm
 Transmit avg optical power (Channel 2)    : 0.7649 mW / -1.16 dBm
 Transmit avg optical power (Channel 3)    : 0.7790 mW / -1.08 dBm
 Transmit avg optical power (Channel 4)    : 0.7837 mW / -1.06 dBm
 Rcvr signal avg optical power(Channel 1)  : 0.9302 mW / -0.31 dBm
 Rcvr signal avg optical power(Channel 2)  : 0.9079 mW / -0.42 dBm
 Rcvr signal avg optical power(Channel 3)  : 0.8993 mW / -0.46 dBm
 Rcvr signal avg optical power(Channel 4)  : 0.8778 mW / -0.57 dBm

The module is not in low power mode, as it is not forced by hardware
(Power override is on) or by software (Power set is off).

The power mode can be queried from the kernel. In case Power override
was off, the kernel would need to take into account the state of the
LPMode signal, which is not visible to user space.

 $ ethtool --show-module swp13
 Module parameters for swp13:
 power-mode-policy high
 power-mode high

Change the power mode policy to 'auto':

 # ethtool --set-module swp13 power-mode-policy auto

Query the power mode again:

 $ ethtool --show-module swp13
 Module parameters for swp13:
 power-mode-policy auto
 power-mode low

Verify with the data read from the EEPROM:

 # ethtool -m swp13
 Identifier                                : 0x11 (QSFP28)
 Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) not enabled
 Power set                                 : On
 Power override                            : On
 ...
 Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
 Transmit avg optical power (Channel 2)    : 0.0000 mW / -inf dBm
 Transmit avg optical power (Channel 3)    : 0.0000 mW / -inf dBm
 Transmit avg optical power (Channel 4)    : 0.0000 mW / -inf dBm
 Rcvr signal avg optical power(Channel 1)  : 0.0000 mW / -inf dBm
 Rcvr signal avg optical power(Channel 2)  : 0.0000 mW / -inf dBm
 Rcvr signal avg optical power(Channel 3)  : 0.0000 mW / -inf dBm
 Rcvr signal avg optical power(Channel 4)  : 0.0000 mW / -inf dBm

Put the associated port administratively up which will instruct the host
to transition the module to high power mode:

 # ip link set dev swp13 up

Query the power mode again:

 $ ethtool --show-module swp13
 Module parameters for swp13:
 power-mode-policy auto
 power-mode high

Verify with the data read from the EEPROM:

 # ethtool -m swp13
 Identifier                                : 0x11 (QSFP28)
 ...
 Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) enabled
 Power set                                 : Off
 Power override                            : On
 ...
 Transmit avg optical power (Channel 1)    : 0.7934 mW / -1.01 dBm
 Transmit avg optical power (Channel 2)    : 0.7859 mW / -1.05 dBm
 Transmit avg optical power (Channel 3)    : 0.7885 mW / -1.03 dBm
 Transmit avg optical power (Channel 4)    : 0.7985 mW / -0.98 dBm
 Rcvr signal avg optical power(Channel 1)  : 0.9325 mW / -0.30 dBm
 Rcvr signal avg optical power(Channel 2)  : 0.9034 mW / -0.44 dBm
 Rcvr signal avg optical power(Channel 3)  : 0.9086 mW / -0.42 dBm
 Rcvr signal avg optical power(Channel 4)  : 0.8885 mW / -0.51 dBm

Put the associated port administratively down which will instruct the
host to transition the module to low power mode:

 # ip link set dev swp13 down

Query the power mode again:

 $ ethtool --show-module swp13
 Module parameters for swp13:
 power-mode-policy auto
 power-mode low

Verify with the data read from the EEPROM:

 # ethtool -m swp13
 Identifier                                : 0x11 (QSFP28)
 ...
 Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) not enabled
 Power set                                 : On
 Power override                            : On
 ...
 Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
 Transmit avg optical power (Channel 2)    : 0.0000 mW / -inf dBm
 Transmit avg optical power (Channel 3)    : 0.0000 mW / -inf dBm
 Transmit avg optical power (Channel 4)    : 0.0000 mW / -inf dBm
 Rcvr signal avg optical power(Channel 1)  : 0.0000 mW / -inf dBm
 Rcvr signal avg optical power(Channel 2)  : 0.0000 mW / -inf dBm
 Rcvr signal avg optical power(Channel 3)  : 0.0000 mW / -inf dBm
 Rcvr signal avg optical power(Channel 4)  : 0.0000 mW / -inf dBm

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 Documentation/networking/ethtool-netlink.rst |  71 ++++++-
 include/linux/ethtool.h                      |  22 +++
 include/uapi/linux/ethtool.h                 |  23 +++
 include/uapi/linux/ethtool_netlink.h         |  17 ++
 net/ethtool/Makefile                         |   2 +-
 net/ethtool/module.c                         | 184 +++++++++++++++++++
 net/ethtool/netlink.c                        |  19 ++
 net/ethtool/netlink.h                        |   4 +
 8 files changed, 339 insertions(+), 3 deletions(-)
 create mode 100644 net/ethtool/module.c

Comments

Jakub Kicinski Oct. 5, 2021, 1:01 a.m. UTC | #1
On Sun,  3 Oct 2021 10:32:14 +0300 Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and
> 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver
> modules parameters and retrieve their status.

Acked-by: Jakub Kicinski <kuba@kernel.org>

Couple of take it or leave it comments again, if you prefer to leave as
is that's fine.

> +enum ethtool_module_power_mode_policy {
> +	ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH,
> +	ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO,
> +};

I see you left this starting from 0, and we still need a valid bit,
granted just internal to the core.

Would we not need a driver-facing valid bit later on when we extend 
the module API to control more params?  Can't there be drivers which
implement power but don't support the mode policy?

> +static int module_set_power_mode(struct net_device *dev, struct nlattr **tb,
> +				 bool *p_mod, struct netlink_ext_ack *extack)
> +{
> +	struct ethtool_module_power_mode_params power = {};
> +	struct ethtool_module_power_mode_params power_new;
> +	const struct ethtool_ops *ops = dev->ethtool_ops;
> +	int ret;
> +
> +	if (!tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY])
> +		return 0;

Feels a little old school to allow set with no attrs, now that we 
do strict validation on attrs across netlink.  What's the reason?

> +	if (!ops->get_module_power_mode || !ops->set_module_power_mode) {
> +		NL_SET_ERR_MSG_ATTR(extack,
> +				    tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY],
> +				    "Setting power mode policy is not supported by this device");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	power_new.policy = nla_get_u8(tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY]);
> +	ret = ops->get_module_power_mode(dev, &power, extack);
> +	if (ret < 0)
> +		return ret;
> +	*p_mod = power_new.policy != power.policy;
> +
> +	return ops->set_module_power_mode(dev, &power_new, extack);

Why still call set if *p_mod == false?
Ido Schimmel Oct. 5, 2021, 6:57 a.m. UTC | #2
On Mon, Oct 04, 2021 at 06:01:35PM -0700, Jakub Kicinski wrote:
> On Sun,  3 Oct 2021 10:32:14 +0300 Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and
> > 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver
> > modules parameters and retrieve their status.
> 
> Acked-by: Jakub Kicinski <kuba@kernel.org>

Thanks!

> 
> Couple of take it or leave it comments again, if you prefer to leave as
> is that's fine.

I'll make whatever changes we conclude are necessary. See below.

> 
> > +enum ethtool_module_power_mode_policy {
> > +	ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH,
> > +	ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO,
> > +};
> 
> I see you left this starting from 0, and we still need a valid bit,
> granted just internal to the core.

I was under the impression that we were only talking about the power
mode itself (which can be invalid if a module is unplugged), but not the
policy [1]. I did change that like we discussed.

[1] https://lore.kernel.org/netdev/YSYmFEDWJIu6eDvR@shredder/

> 
> Would we not need a driver-facing valid bit later on when we extend 
> the module API to control more params?

To keep the driver-facing API simple I want to have different ethtool
operations for different parameters (like rtnetlink and ndos). So, if a
driver does not support a parameter, the operation will not be
implemented and the attributes will not be dumped.

> Can't there be drivers which implement power but don't support the
> mode policy?

I don't really see how. The policy is a host attribute (not module)
determining how the host configures the power mode of the module. It
always exists, but can be fixed.

Do you still think we should make the change below?

diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 1b126e8b5269..a2223b685451 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -721,7 +721,7 @@ enum ethtool_stringset {
  *     administratively down.
  */
 enum ethtool_module_power_mode_policy {
-       ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH,
+       ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH = 1,
        ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO,
 };

> 
> > +static int module_set_power_mode(struct net_device *dev, struct nlattr **tb,
> > +				 bool *p_mod, struct netlink_ext_ack *extack)
> > +{
> > +	struct ethtool_module_power_mode_params power = {};
> > +	struct ethtool_module_power_mode_params power_new;
> > +	const struct ethtool_ops *ops = dev->ethtool_ops;
> > +	int ret;
> > +
> > +	if (!tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY])
> > +		return 0;
> 
> Feels a little old school to allow set with no attrs, now that we 
> do strict validation on attrs across netlink.  What's the reason?

The power mode policy is the first parameter that can be set via
MODULE_SET, but in the future there can be more and it is valid for user
space to only want to change a subset. In which case, we will skip over
attributes that were not specified.

> 
> > +	if (!ops->get_module_power_mode || !ops->set_module_power_mode) {
> > +		NL_SET_ERR_MSG_ATTR(extack,
> > +				    tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY],
> > +				    "Setting power mode policy is not supported by this device");
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	power_new.policy = nla_get_u8(tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY]);
> > +	ret = ops->get_module_power_mode(dev, &power, extack);
> > +	if (ret < 0)
> > +		return ret;
> > +	*p_mod = power_new.policy != power.policy;
> > +
> > +	return ops->set_module_power_mode(dev, &power_new, extack);
> 
> Why still call set if *p_mod == false?

Good question...

Thinking about this again, this seems better:

diff --git a/net/ethtool/module.c b/net/ethtool/module.c
index 254ac84f9728..a6eefae906eb 100644
--- a/net/ethtool/module.c
+++ b/net/ethtool/module.c
@@ -141,7 +141,10 @@ static int module_set_power_mode(struct net_device *dev, struct nlattr **tb,
        ret = ops->get_module_power_mode(dev, &power, extack);
        if (ret < 0)
                return ret;
-       *p_mod = power_new.policy != power.policy;
+
+       if (power_new.policy == power.policy)
+               return 0;
+       *p_mod = true;
 
        return ops->set_module_power_mode(dev, &power_new, extack);
 }

That way we avoid setting 'mod' to 'false' if it was already 'true'
because of other parameters that were changed in ethnl_set_module(). We
don't have any other parameters right now, but this can change.

Thanks for looking into this
Ido Schimmel Oct. 5, 2021, 12:04 p.m. UTC | #3
On Tue, Oct 05, 2021 at 09:57:25AM +0300, Ido Schimmel wrote:
> On Mon, Oct 04, 2021 at 06:01:35PM -0700, Jakub Kicinski wrote:
> > On Sun,  3 Oct 2021 10:32:14 +0300 Ido Schimmel wrote:
> > > From: Ido Schimmel <idosch@nvidia.com>
> > > 
> > > Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and
> > > 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver
> > > modules parameters and retrieve their status.
> > 
> > Acked-by: Jakub Kicinski <kuba@kernel.org>
> 
> Thanks!
> 
> > 
> > Couple of take it or leave it comments again, if you prefer to leave as
> > is that's fine.
> 
> I'll make whatever changes we conclude are necessary. See below.
> 
> > 
> > > +enum ethtool_module_power_mode_policy {
> > > +	ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH,
> > > +	ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO,
> > > +};
> > 
> > I see you left this starting from 0, and we still need a valid bit,
> > granted just internal to the core.
> 
> I was under the impression that we were only talking about the power
> mode itself (which can be invalid if a module is unplugged), but not the
> policy [1]. I did change that like we discussed.
> 
> [1] https://lore.kernel.org/netdev/YSYmFEDWJIu6eDvR@shredder/
> 
> > 
> > Would we not need a driver-facing valid bit later on when we extend 
> > the module API to control more params?
> 
> To keep the driver-facing API simple I want to have different ethtool
> operations for different parameters (like rtnetlink and ndos). So, if a
> driver does not support a parameter, the operation will not be
> implemented and the attributes will not be dumped.
> 
> > Can't there be drivers which implement power but don't support the
> > mode policy?
> 
> I don't really see how. The policy is a host attribute (not module)
> determining how the host configures the power mode of the module. It
> always exists, but can be fixed.
> 
> Do you still think we should make the change below?
> 
> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
> index 1b126e8b5269..a2223b685451 100644
> --- a/include/uapi/linux/ethtool.h
> +++ b/include/uapi/linux/ethtool.h
> @@ -721,7 +721,7 @@ enum ethtool_stringset {
>   *     administratively down.
>   */
>  enum ethtool_module_power_mode_policy {
> -       ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH,
> +       ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH = 1,
>         ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO,
>  };

I read your reply again about "still need a valid bit, granted just
internal to the core". My confusion was that I thought only the valid
bit in the driver-facing API bothered you, but you actually wanted me to
remove all of them.

How about the below (compile tested)?

diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 1b126e8b5269..a2223b685451 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -721,7 +721,7 @@ enum ethtool_stringset {
  *	administratively down.
  */
 enum ethtool_module_power_mode_policy {
-	ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH,
+	ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH = 1,
 	ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO,
 };
 
diff --git a/net/ethtool/module.c b/net/ethtool/module.c
index 254ac84f9728..5d430b2bb20a 100644
--- a/net/ethtool/module.c
+++ b/net/ethtool/module.c
@@ -13,7 +13,6 @@ struct module_req_info {
 struct module_reply_data {
 	struct ethnl_reply_data	base;
 	struct ethtool_module_power_mode_params power;
-	u8 power_valid:1;
 };
 
 #define MODULE_REPDATA(__reply_base) \
@@ -30,18 +29,11 @@ static int module_get_power_mode(struct net_device *dev,
 				 struct netlink_ext_ack *extack)
 {
 	const struct ethtool_ops *ops = dev->ethtool_ops;
-	int ret;
 
 	if (!ops->get_module_power_mode)
 		return 0;
 
-	ret = ops->get_module_power_mode(dev, &data->power, extack);
-	if (ret < 0)
-		return ret;
-
-	data->power_valid = true;
-
-	return 0;
+	return ops->get_module_power_mode(dev, &data->power, extack);
 }
 
 static int module_prepare_data(const struct ethnl_req_info *req_base,
@@ -72,10 +64,10 @@ static int module_reply_size(const struct ethnl_req_info *req_base,
 	struct module_reply_data *data = MODULE_REPDATA(reply_base);
 	int len = 0;
 
-	if (data->power_valid)
+	if (data->power.policy)
 		len += nla_total_size(sizeof(u8));	/* _MODULE_POWER_MODE_POLICY */
 
-	if (data->power_valid && data->power.mode)
+	if (data->power.mode)
 		len += nla_total_size(sizeof(u8));	/* _MODULE_POWER_MODE */
 
 	return len;
@@ -87,12 +79,12 @@ static int module_fill_reply(struct sk_buff *skb,
 {
 	const struct module_reply_data *data = MODULE_REPDATA(reply_base);
 
-	if (data->power_valid &&
+	if (data->power.policy &&
 	    nla_put_u8(skb, ETHTOOL_A_MODULE_POWER_MODE_POLICY,
 		       data->power.policy))
 		return -EMSGSIZE;
 
-	if (data->power_valid && data->power.mode &&
+	if (data->power.mode &&
 	    nla_put_u8(skb, ETHTOOL_A_MODULE_POWER_MODE, data->power.mode))
 		return -EMSGSIZE;
 
@@ -116,7 +108,8 @@ const struct ethnl_request_ops ethnl_module_request_ops = {
 const struct nla_policy ethnl_module_set_policy[ETHTOOL_A_MODULE_POWER_MODE_POLICY + 1] = {
 	[ETHTOOL_A_MODULE_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy),
 	[ETHTOOL_A_MODULE_POWER_MODE_POLICY] =
-		NLA_POLICY_MAX(NLA_U8, ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO),
+		NLA_POLICY_RANGE(NLA_U8, ETHTOOL_MODULE_POWER_MODE_HIGH,
+				 ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO),
 };
 
 static int module_set_power_mode(struct net_device *dev, struct nlattr **tb,
Jakub Kicinski Oct. 5, 2021, 2:07 p.m. UTC | #4
On Tue, 5 Oct 2021 15:04:02 +0300 Ido Schimmel wrote:
> > > Can't there be drivers which implement power but don't support the
> > > mode policy?  
> > 
> > I don't really see how. The policy is a host attribute (not module)
> > determining how the host configures the power mode of the module. It
> > always exists, but can be fixed.
> > 
> > Do you still think we should make the change below?
> > 
> > diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
> > index 1b126e8b5269..a2223b685451 100644
> > --- a/include/uapi/linux/ethtool.h
> > +++ b/include/uapi/linux/ethtool.h
> > @@ -721,7 +721,7 @@ enum ethtool_stringset {
> >   *     administratively down.
> >   */
> >  enum ethtool_module_power_mode_policy {
> > -       ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH,
> > +       ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH = 1,
> >         ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO,
> >  };  
> 
> I read your reply again about "still need a valid bit, granted just
> internal to the core". My confusion was that I thought only the valid
> bit in the driver-facing API bothered you, but you actually wanted me to
> remove all of them.
> 
> How about the below (compile tested)?

Yup, exactly!
Jakub Kicinski Oct. 5, 2021, 2:15 p.m. UTC | #5
On Tue, 5 Oct 2021 09:57:20 +0300 Ido Schimmel wrote:
> > > +static int module_set_power_mode(struct net_device *dev, struct nlattr **tb,
> > > +				 bool *p_mod, struct netlink_ext_ack *extack)
> > > +{
> > > +	struct ethtool_module_power_mode_params power = {};
> > > +	struct ethtool_module_power_mode_params power_new;
> > > +	const struct ethtool_ops *ops = dev->ethtool_ops;
> > > +	int ret;
> > > +
> > > +	if (!tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY])
> > > +		return 0;  
> > 
> > Feels a little old school to allow set with no attrs, now that we 
> > do strict validation on attrs across netlink.  What's the reason?  
> 
> The power mode policy is the first parameter that can be set via
> MODULE_SET, but in the future there can be more and it is valid for user
> space to only want to change a subset. In which case, we will skip over
> attributes that were not specified.

Ack, I guess catching the "no parameter specified" case may be more
effort than it's worth. Nothing is going to break if we don't do it.

> > > +	if (!ops->get_module_power_mode || !ops->set_module_power_mode) {
> > > +		NL_SET_ERR_MSG_ATTR(extack,
> > > +				    tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY],
> > > +				    "Setting power mode policy is not supported by this device");
> > > +		return -EOPNOTSUPP;
> > > +	}
> > > +
> > > +	power_new.policy = nla_get_u8(tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY]);
> > > +	ret = ops->get_module_power_mode(dev, &power, extack);
> > > +	if (ret < 0)
> > > +		return ret;
> > > +	*p_mod = power_new.policy != power.policy;
> > > +
> > > +	return ops->set_module_power_mode(dev, &power_new, extack);  
> > 
> > Why still call set if *p_mod == false?  
> 
> Good question...
> 
> Thinking about this again, this seems better:
> 
> diff --git a/net/ethtool/module.c b/net/ethtool/module.c
> index 254ac84f9728..a6eefae906eb 100644
> --- a/net/ethtool/module.c
> +++ b/net/ethtool/module.c
> @@ -141,7 +141,10 @@ static int module_set_power_mode(struct net_device *dev, struct nlattr **tb,
>         ret = ops->get_module_power_mode(dev, &power, extack);
>         if (ret < 0)
>                 return ret;
> -       *p_mod = power_new.policy != power.policy;
> +
> +       if (power_new.policy == power.policy)
> +               return 0;
> +       *p_mod = true;
>  
>         return ops->set_module_power_mode(dev, &power_new, extack);
>  }
> 
> That way we avoid setting 'mod' to 'false' if it was already 'true'
> because of other parameters that were changed in ethnl_set_module(). We
> don't have any other parameters right now, but this can change.
> 
> Thanks for looking into this


diff mbox series

Patch

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index d9b55b7a1a4d..d6fd4b2e243c 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -41,6 +41,11 @@  In the message structure descriptions below, if an attribute name is suffixed
 with "+", parent nest can contain multiple attributes of the same type. This
 implements an array of entries.
 
+Attributes that need to be filled-in by device drivers and that are dumped to
+user space based on whether they are valid or not should not use zero as a
+valid value. This avoids the need to explicitly signal the validity of the
+attribute in the device driver API.
+
 
 Request header
 ==============
@@ -179,7 +184,7 @@  according to message purpose:
 
 Userspace to kernel:
 
-  ===================================== ================================
+  ===================================== =================================
   ``ETHTOOL_MSG_STRSET_GET``            get string set
   ``ETHTOOL_MSG_LINKINFO_GET``          get link settings
   ``ETHTOOL_MSG_LINKINFO_SET``          set link settings
@@ -213,7 +218,9 @@  Userspace to kernel:
   ``ETHTOOL_MSG_MODULE_EEPROM_GET``     read SFP module EEPROM
   ``ETHTOOL_MSG_STATS_GET``             get standard statistics
   ``ETHTOOL_MSG_PHC_VCLOCKS_GET``       get PHC virtual clocks info
-  ===================================== ================================
+  ``ETHTOOL_MSG_MODULE_SET``            set transceiver module parameters
+  ``ETHTOOL_MSG_MODULE_GET``            get transceiver module parameters
+  ===================================== =================================
 
 Kernel to userspace:
 
@@ -252,6 +259,7 @@  Kernel to userspace:
   ``ETHTOOL_MSG_MODULE_EEPROM_GET_REPLY``  read SFP module EEPROM
   ``ETHTOOL_MSG_STATS_GET_REPLY``          standard statistics
   ``ETHTOOL_MSG_PHC_VCLOCKS_GET_REPLY``    PHC virtual clocks info
+  ``ETHTOOL_MSG_MODULE_GET_REPLY``         transceiver module parameters
   ======================================== =================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
@@ -1521,6 +1529,63 @@  Kernel response contents:
   ``ETHTOOL_A_PHC_VCLOCKS_INDEX``       s32     PHC index array
   ====================================  ======  ==========================
 
+MODULE_GET
+==========
+
+Gets transceiver module parameters.
+
+Request contents:
+
+  =====================================  ======  ==========================
+  ``ETHTOOL_A_MODULE_HEADER``            nested  request header
+  =====================================  ======  ==========================
+
+Kernel response contents:
+
+  ======================================  ======  ==========================
+  ``ETHTOOL_A_MODULE_HEADER``             nested  reply header
+  ``ETHTOOL_A_MODULE_POWER_MODE_POLICY``  u8      power mode policy
+  ``ETHTOOL_A_MODULE_POWER_MODE``         u8      operational power mode
+  ======================================  ======  ==========================
+
+The optional ``ETHTOOL_A_MODULE_POWER_MODE_POLICY`` attribute encodes the
+transceiver module power mode policy enforced by the host. The default policy
+is driver-dependent, but "auto" is the recommended default and it should be
+implemented by new drivers and drivers where conformance to a legacy behavior
+is not critical.
+
+The optional ``ETHTHOOL_A_MODULE_POWER_MODE`` attribute encodes the operational
+power mode policy of the transceiver module. It is only reported when a module
+is plugged-in. Possible values are:
+
+.. kernel-doc:: include/uapi/linux/ethtool.h
+    :identifiers: ethtool_module_power_mode
+
+MODULE_SET
+==========
+
+Sets transceiver module parameters.
+
+Request contents:
+
+  ======================================  ======  ==========================
+  ``ETHTOOL_A_MODULE_HEADER``             nested  request header
+  ``ETHTOOL_A_MODULE_POWER_MODE_POLICY``  u8      power mode policy
+  ======================================  ======  ==========================
+
+When set, the optional ``ETHTOOL_A_MODULE_POWER_MODE_POLICY`` attribute is used
+to set the transceiver module power policy enforced by the host. Possible
+values are:
+
+.. kernel-doc:: include/uapi/linux/ethtool.h
+    :identifiers: ethtool_module_power_mode_policy
+
+For SFF-8636 modules, low power mode is forced by the host according to table
+6-10 in revision 2.10a of the specification.
+
+For CMIS modules, low power mode is forced by the host according to table 6-12
+in revision 5.0 of the specification.
+
 Request translation
 ===================
 
@@ -1620,4 +1685,6 @@  are netlink only.
   n/a                                 ``ETHTOOL_MSG_CABLE_TEST_TDR_ACT``
   n/a                                 ``ETHTOOL_MSG_TUNNEL_INFO_GET``
   n/a                                 ``ETHTOOL_MSG_PHC_VCLOCKS_GET``
+  n/a                                 ``ETHTOOL_MSG_MODULE_GET``
+  n/a                                 ``ETHTOOL_MSG_MODULE_SET``
   =================================== =====================================
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 849524b55d89..9adf8d2c3144 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -415,6 +415,17 @@  struct ethtool_module_eeprom {
 	u8	*data;
 };
 
+/**
+ * struct ethtool_module_power_mode_params - module power mode parameters
+ * @policy: The power mode policy enforced by the host for the plug-in module.
+ * @mode: The operational power mode of the plug-in module. Should be filled by
+ *	device drivers on get operations.
+ */
+struct ethtool_module_power_mode_params {
+	enum ethtool_module_power_mode_policy policy;
+	enum ethtool_module_power_mode mode;
+};
+
 /**
  * struct ethtool_ops - optional netdev operations
  * @cap_link_lanes_supported: indicates if the driver supports lanes
@@ -580,6 +591,11 @@  struct ethtool_module_eeprom {
  * @get_eth_ctrl_stats: Query some of the IEEE 802.3 MAC Ctrl statistics.
  * @get_rmon_stats: Query some of the RMON (RFC 2819) statistics.
  *	Set %ranges to a pointer to zero-terminated array of byte ranges.
+ * @get_module_power_mode: Get the power mode policy for the plug-in module
+ *	used by the network device and its operational power mode, if
+ *	plugged-in.
+ * @set_module_power_mode: Set the power mode policy for the plug-in module
+ *	used by the network device.
  *
  * All operations are optional (i.e. the function pointer may be set
  * to %NULL) and callers must take this into account.  Callers must
@@ -705,6 +721,12 @@  struct ethtool_ops {
 	void	(*get_rmon_stats)(struct net_device *dev,
 				  struct ethtool_rmon_stats *rmon_stats,
 				  const struct ethtool_rmon_hist_range **ranges);
+	int	(*get_module_power_mode)(struct net_device *dev,
+					 struct ethtool_module_power_mode_params *params,
+					 struct netlink_ext_ack *extack);
+	int	(*set_module_power_mode)(struct net_device *dev,
+					 const struct ethtool_module_power_mode_params *params,
+					 struct netlink_ext_ack *extack);
 };
 
 int ethtool_check_ops(const struct ethtool_ops *ops);
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index b6db6590baf0..a3e763ad1666 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -706,6 +706,29 @@  enum ethtool_stringset {
 	ETH_SS_COUNT
 };
 
+/**
+ * enum ethtool_module_power_mode_policy - plug-in module power mode policy
+ * @ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH: Module is always in high power mode.
+ * @ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO: Module is transitioned by the host
+ *	to high power mode when the first port using it is put administratively
+ *	up and to low power mode when the last port using it is put
+ *	administratively down.
+ */
+enum ethtool_module_power_mode_policy {
+	ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH,
+	ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO,
+};
+
+/**
+ * enum ethtool_module_power_mode - plug-in module power mode
+ * @ETHTOOL_MODULE_POWER_MODE_LOW: Module is in low power mode.
+ * @ETHTOOL_MODULE_POWER_MODE_HIGH: Module is in high power mode.
+ */
+enum ethtool_module_power_mode {
+	ETHTOOL_MODULE_POWER_MODE_LOW = 1,
+	ETHTOOL_MODULE_POWER_MODE_HIGH,
+};
+
 /**
  * struct ethtool_gstrings - string set for data tagging
  * @cmd: Command number = %ETHTOOL_GSTRINGS
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 5545f1ca9237..ca5fbb59fa42 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -47,6 +47,8 @@  enum {
 	ETHTOOL_MSG_MODULE_EEPROM_GET,
 	ETHTOOL_MSG_STATS_GET,
 	ETHTOOL_MSG_PHC_VCLOCKS_GET,
+	ETHTOOL_MSG_MODULE_GET,
+	ETHTOOL_MSG_MODULE_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -90,6 +92,8 @@  enum {
 	ETHTOOL_MSG_MODULE_EEPROM_GET_REPLY,
 	ETHTOOL_MSG_STATS_GET_REPLY,
 	ETHTOOL_MSG_PHC_VCLOCKS_GET_REPLY,
+	ETHTOOL_MSG_MODULE_GET_REPLY,
+	ETHTOOL_MSG_MODULE_NTF,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -833,6 +837,19 @@  enum {
 	ETHTOOL_A_STATS_RMON_MAX = (__ETHTOOL_A_STATS_RMON_CNT - 1)
 };
 
+/* MODULE */
+
+enum {
+	ETHTOOL_A_MODULE_UNSPEC,
+	ETHTOOL_A_MODULE_HEADER,		/* nest - _A_HEADER_* */
+	ETHTOOL_A_MODULE_POWER_MODE_POLICY,	/* u8 */
+	ETHTOOL_A_MODULE_POWER_MODE,		/* u8 */
+
+	/* add new constants above here */
+	__ETHTOOL_A_MODULE_CNT,
+	ETHTOOL_A_MODULE_MAX = (__ETHTOOL_A_MODULE_CNT - 1)
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index 0a19470efbfb..b76432e70e6b 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -7,4 +7,4 @@  obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o linkmodes.o \
 		   linkstate.o debug.o wol.o features.o privflags.o rings.o \
 		   channels.o coalesce.o pause.o eee.o tsinfo.o cabletest.o \
-		   tunnels.o fec.o eeprom.o stats.o phc_vclocks.o
+		   tunnels.o fec.o eeprom.o stats.o phc_vclocks.o module.o
diff --git a/net/ethtool/module.c b/net/ethtool/module.c
new file mode 100644
index 000000000000..254ac84f9728
--- /dev/null
+++ b/net/ethtool/module.c
@@ -0,0 +1,184 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/ethtool.h>
+
+#include "netlink.h"
+#include "common.h"
+#include "bitset.h"
+
+struct module_req_info {
+	struct ethnl_req_info base;
+};
+
+struct module_reply_data {
+	struct ethnl_reply_data	base;
+	struct ethtool_module_power_mode_params power;
+	u8 power_valid:1;
+};
+
+#define MODULE_REPDATA(__reply_base) \
+	container_of(__reply_base, struct module_reply_data, base)
+
+/* MODULE_GET */
+
+const struct nla_policy ethnl_module_get_policy[ETHTOOL_A_MODULE_HEADER + 1] = {
+	[ETHTOOL_A_MODULE_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy),
+};
+
+static int module_get_power_mode(struct net_device *dev,
+				 struct module_reply_data *data,
+				 struct netlink_ext_ack *extack)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	int ret;
+
+	if (!ops->get_module_power_mode)
+		return 0;
+
+	ret = ops->get_module_power_mode(dev, &data->power, extack);
+	if (ret < 0)
+		return ret;
+
+	data->power_valid = true;
+
+	return 0;
+}
+
+static int module_prepare_data(const struct ethnl_req_info *req_base,
+			       struct ethnl_reply_data *reply_base,
+			       struct genl_info *info)
+{
+	struct module_reply_data *data = MODULE_REPDATA(reply_base);
+	struct netlink_ext_ack *extack = info ? info->extack : NULL;
+	struct net_device *dev = reply_base->dev;
+	int ret;
+
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		return ret;
+
+	ret = module_get_power_mode(dev, data, extack);
+	if (ret < 0)
+		goto out_complete;
+
+out_complete:
+	ethnl_ops_complete(dev);
+	return ret;
+}
+
+static int module_reply_size(const struct ethnl_req_info *req_base,
+			     const struct ethnl_reply_data *reply_base)
+{
+	struct module_reply_data *data = MODULE_REPDATA(reply_base);
+	int len = 0;
+
+	if (data->power_valid)
+		len += nla_total_size(sizeof(u8));	/* _MODULE_POWER_MODE_POLICY */
+
+	if (data->power_valid && data->power.mode)
+		len += nla_total_size(sizeof(u8));	/* _MODULE_POWER_MODE */
+
+	return len;
+}
+
+static int module_fill_reply(struct sk_buff *skb,
+			     const struct ethnl_req_info *req_base,
+			     const struct ethnl_reply_data *reply_base)
+{
+	const struct module_reply_data *data = MODULE_REPDATA(reply_base);
+
+	if (data->power_valid &&
+	    nla_put_u8(skb, ETHTOOL_A_MODULE_POWER_MODE_POLICY,
+		       data->power.policy))
+		return -EMSGSIZE;
+
+	if (data->power_valid && data->power.mode &&
+	    nla_put_u8(skb, ETHTOOL_A_MODULE_POWER_MODE, data->power.mode))
+		return -EMSGSIZE;
+
+	return 0;
+}
+
+const struct ethnl_request_ops ethnl_module_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_MODULE_GET,
+	.reply_cmd		= ETHTOOL_MSG_MODULE_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_MODULE_HEADER,
+	.req_info_size		= sizeof(struct module_req_info),
+	.reply_data_size	= sizeof(struct module_reply_data),
+
+	.prepare_data		= module_prepare_data,
+	.reply_size		= module_reply_size,
+	.fill_reply		= module_fill_reply,
+};
+
+/* MODULE_SET */
+
+const struct nla_policy ethnl_module_set_policy[ETHTOOL_A_MODULE_POWER_MODE_POLICY + 1] = {
+	[ETHTOOL_A_MODULE_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy),
+	[ETHTOOL_A_MODULE_POWER_MODE_POLICY] =
+		NLA_POLICY_MAX(NLA_U8, ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO),
+};
+
+static int module_set_power_mode(struct net_device *dev, struct nlattr **tb,
+				 bool *p_mod, struct netlink_ext_ack *extack)
+{
+	struct ethtool_module_power_mode_params power = {};
+	struct ethtool_module_power_mode_params power_new;
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	int ret;
+
+	if (!tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY])
+		return 0;
+
+	if (!ops->get_module_power_mode || !ops->set_module_power_mode) {
+		NL_SET_ERR_MSG_ATTR(extack,
+				    tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY],
+				    "Setting power mode policy is not supported by this device");
+		return -EOPNOTSUPP;
+	}
+
+	power_new.policy = nla_get_u8(tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY]);
+	ret = ops->get_module_power_mode(dev, &power, extack);
+	if (ret < 0)
+		return ret;
+	*p_mod = power_new.policy != power.policy;
+
+	return ops->set_module_power_mode(dev, &power_new, extack);
+}
+
+int ethnl_set_module(struct sk_buff *skb, struct genl_info *info)
+{
+	struct ethnl_req_info req_info = {};
+	struct nlattr **tb = info->attrs;
+	struct net_device *dev;
+	bool mod = false;
+	int ret;
+
+	ret = ethnl_parse_header_dev_get(&req_info, tb[ETHTOOL_A_MODULE_HEADER],
+					 genl_info_net(info), info->extack,
+					 true);
+	if (ret < 0)
+		return ret;
+	dev = req_info.dev;
+
+	rtnl_lock();
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		goto out_rtnl;
+
+	ret = module_set_power_mode(dev, tb, &mod, info->extack);
+	if (ret < 0)
+		goto out_ops;
+
+	if (!mod)
+		goto out_ops;
+
+	ethtool_notify(dev, ETHTOOL_MSG_MODULE_NTF, NULL);
+
+out_ops:
+	ethnl_ops_complete(dev);
+out_rtnl:
+	rtnl_unlock();
+	dev_put(dev);
+	return ret;
+}
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 1797a0a90019..38b44c0291b1 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -282,6 +282,7 @@  ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_MODULE_EEPROM_GET]	= &ethnl_module_eeprom_request_ops,
 	[ETHTOOL_MSG_STATS_GET]		= &ethnl_stats_request_ops,
 	[ETHTOOL_MSG_PHC_VCLOCKS_GET]	= &ethnl_phc_vclocks_request_ops,
+	[ETHTOOL_MSG_MODULE_GET]	= &ethnl_module_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -593,6 +594,7 @@  ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = {
 	[ETHTOOL_MSG_PAUSE_NTF]		= &ethnl_pause_request_ops,
 	[ETHTOOL_MSG_EEE_NTF]		= &ethnl_eee_request_ops,
 	[ETHTOOL_MSG_FEC_NTF]		= &ethnl_fec_request_ops,
+	[ETHTOOL_MSG_MODULE_NTF]	= &ethnl_module_request_ops,
 };
 
 /* default notification handler */
@@ -686,6 +688,7 @@  static const ethnl_notify_handler_t ethnl_notify_handlers[] = {
 	[ETHTOOL_MSG_PAUSE_NTF]		= ethnl_default_notify,
 	[ETHTOOL_MSG_EEE_NTF]		= ethnl_default_notify,
 	[ETHTOOL_MSG_FEC_NTF]		= ethnl_default_notify,
+	[ETHTOOL_MSG_MODULE_NTF]	= ethnl_default_notify,
 };
 
 void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data)
@@ -999,6 +1002,22 @@  static const struct genl_ops ethtool_genl_ops[] = {
 		.policy = ethnl_phc_vclocks_get_policy,
 		.maxattr = ARRAY_SIZE(ethnl_phc_vclocks_get_policy) - 1,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_MODULE_GET,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+		.policy = ethnl_module_get_policy,
+		.maxattr = ARRAY_SIZE(ethnl_module_get_policy) - 1,
+	},
+	{
+		.cmd	= ETHTOOL_MSG_MODULE_SET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_set_module,
+		.policy = ethnl_module_set_policy,
+		.maxattr = ARRAY_SIZE(ethnl_module_set_policy) - 1,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index e8987e28036f..836ee7157848 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -337,6 +337,7 @@  extern const struct ethnl_request_ops ethnl_fec_request_ops;
 extern const struct ethnl_request_ops ethnl_module_eeprom_request_ops;
 extern const struct ethnl_request_ops ethnl_stats_request_ops;
 extern const struct ethnl_request_ops ethnl_phc_vclocks_request_ops;
+extern const struct ethnl_request_ops ethnl_module_request_ops;
 
 extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1];
 extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1];
@@ -373,6 +374,8 @@  extern const struct nla_policy ethnl_fec_set_policy[ETHTOOL_A_FEC_AUTO + 1];
 extern const struct nla_policy ethnl_module_eeprom_get_policy[ETHTOOL_A_MODULE_EEPROM_I2C_ADDRESS + 1];
 extern const struct nla_policy ethnl_stats_get_policy[ETHTOOL_A_STATS_GROUPS + 1];
 extern const struct nla_policy ethnl_phc_vclocks_get_policy[ETHTOOL_A_PHC_VCLOCKS_HEADER + 1];
+extern const struct nla_policy ethnl_module_get_policy[ETHTOOL_A_MODULE_HEADER + 1];
+extern const struct nla_policy ethnl_module_set_policy[ETHTOOL_A_MODULE_POWER_MODE_POLICY + 1];
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
@@ -391,6 +394,7 @@  int ethnl_tunnel_info_doit(struct sk_buff *skb, struct genl_info *info);
 int ethnl_tunnel_info_start(struct netlink_callback *cb);
 int ethnl_tunnel_info_dumpit(struct sk_buff *skb, struct netlink_callback *cb);
 int ethnl_set_fec(struct sk_buff *skb, struct genl_info *info);
+int ethnl_set_module(struct sk_buff *skb, struct genl_info *info);
 
 extern const char stats_std_names[__ETHTOOL_STATS_CNT][ETH_GSTRING_LEN];
 extern const char stats_eth_phy_names[__ETHTOOL_A_STATS_ETH_PHY_CNT][ETH_GSTRING_LEN];