Message ID | 3d1e2d945904a0fb55258559eb7322d7e11066b6.1715199358.git.pabeni@redhat.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC] net: introduce HW Rate Limiting Driver API | expand |
Hi Paolo > + * struct net_shaper_info - represents a shaping node on the NIC H/W > + * @metric: Specify if the bw limits refers to PPS or BPS > + * @bw_min: Minimum guaranteed rate for this shaper > + * @bw_max: Maximum peak bw allowed for this shaper > + * @burst: Maximum burst for the peek rate of this shaper > + * @priority: Scheduling priority for this shaper > + * @weight: Scheduling weight for this shaper > + */ > +struct net_shaper_info { > + enum net_shaper_metric metric; > + u64 bw_min; /* minimum guaranteed bandwidth, according to metric */ > + u64 bw_max; /* maximum allowed bandwidth */ > + u32 burst; /* maximum burst in bytes for bw_max */ > + u32 priority; /* scheduling strict priority */ Above it say priority. Here is strict priority? Is there a difference? > + u32 weight; /* scheduling WRR weight*/ > +}; Are there any special semantics for weight? Looking at the hardware i have, which has 8 queues for a switch port, i can either set it to strict priority, meaning queue 7 needs to be empty before it look at queue 6, and it will only look at queue 5 when 6 is empty etc. Or i can set weights per queue. How would i expect strict priority? Shaping itself is not performed on the queues, but the port. So it seems like i should create 8 net_shaper_info and set the weight in each, and everything else to 0? And then create a queue group shaper, put the 8 queue shapers into it, and then configure bw_max for the group? Everything else is 0, because that is all i can configure. Does this sound correct? > + * NET_SHAPER_SCOPE_NETDEV, NET_SHAPER_SCOPE_QUEUE_GROUP and > + * NET_SHAPER_SCOPE_QUEUE are available on both PFs and VFs devices. Are they also available on plain boring devices which don't have PFs or VFs? Would i be correct in assuming my driver should just create these shapers. There will be some netlink calls to allow user space to enumerate them, and display the relationships between them? And netlink calls to set values in a shaper? Will there be a way to say which fields are actually settable, since i doubt most hardware will have everything? In my case, only one field appears to be relevant in each shaper, and maybe we want to give a hint about that to userspace? Andrew
On Wed, 2024-05-08 at 23:47 +0200, Andrew Lunn wrote: > > + * struct net_shaper_info - represents a shaping node on the NIC H/W > > + * @metric: Specify if the bw limits refers to PPS or BPS > > + * @bw_min: Minimum guaranteed rate for this shaper > > + * @bw_max: Maximum peak bw allowed for this shaper > > + * @burst: Maximum burst for the peek rate of this shaper > > + * @priority: Scheduling priority for this shaper > > + * @weight: Scheduling weight for this shaper > > + */ > > +struct net_shaper_info { > > + enum net_shaper_metric metric; > > + u64 bw_min; /* minimum guaranteed bandwidth, according to metric */ > > + u64 bw_max; /* maximum allowed bandwidth */ > > + u32 burst; /* maximum burst in bytes for bw_max */ > > + u32 priority; /* scheduling strict priority */ > > Above it say priority. Here is strict priority? Is there a difference? the 'priority' field is the strict priority for scheduling. We will clarify the first comment. > > > + u32 weight; /* scheduling WRR weight*/ > > +}; > > Are there any special semantics for weight? Looking at the hardware i > have, which has 8 queues for a switch port, i can either set it to > strict priority, meaning queue 7 needs to be empty before it look at > queue 6, and it will only look at queue 5 when 6 is empty etc. Or i > can set weights per queue. How would i expect strict priority? The expected behaviour is that schedulers at the same level with the same priority will be served according to their weight. I'm unsure if I read your question correctly. My understanding is that the your H/W don't allow strict priority and WRR simultaneously. In such case, the set/create operations should reject calls setting both non zero weight and priority values. > Shaping itself is not performed on the queues, but the port. So it > seems like i should create 8 net_shaper_info and set the weight in > each, and everything else to 0? And then create a queue group shaper, > put the 8 queue shapers into it, and then configure bw_max for the > group? Everything else is 0, because that is all i can configure. > > Does this sound correct? It sounds correct. You can avoid creating the queue group and set the rate at the NETDEV scope, or possibly not even set/create such shaper: the transmission is expected to be limited by the line rate. > > + * NET_SHAPER_SCOPE_NETDEV, NET_SHAPER_SCOPE_QUEUE_GROUP and > > + * NET_SHAPER_SCOPE_QUEUE are available on both PFs and VFs devices. > > Are they also available on plain boring devices which don't have PFs > or VFs? Yes, a driver is free to implement/support an arbitrary subset of the available feature. Operations attempting to set an unsupported state should fail with a relevant extended ack. > Would i be correct in assuming my driver should just create these > shapers. There will be some netlink calls to allow user space to > enumerate them, and display the relationships between them? Yes, the idea/plan/roadmap is to implement yml-based netlink API to fully expose the feature to the user-space. > And > netlink calls to set values in a shaper? Will there be a way to say > which fields are actually settable, since i doubt most hardware will > have everything? This was not planned yet, even because some (most?) H/W can have non- trivial constraints to expose. e.g. WRR is available at the queue scope only, and only if there is at most one SP shaper somewhere else. > In my case, only one field appears to be relevant in > each shaper, and maybe we want to give a hint about that to userspace? Any suggestion on how to expose that in reasonable way more then welcome! Thanks, Paolo
On Thu, May 09, 2024 at 04:19:52PM +0200, Paolo Abeni wrote: > On Wed, 2024-05-08 at 23:47 +0200, Andrew Lunn wrote: > > > + * struct net_shaper_info - represents a shaping node on the NIC H/W > > > + * @metric: Specify if the bw limits refers to PPS or BPS > > > + * @bw_min: Minimum guaranteed rate for this shaper > > > + * @bw_max: Maximum peak bw allowed for this shaper > > > + * @burst: Maximum burst for the peek rate of this shaper > > > + * @priority: Scheduling priority for this shaper > > > + * @weight: Scheduling weight for this shaper > > > + */ > > > +struct net_shaper_info { > > > + enum net_shaper_metric metric; > > > + u64 bw_min; /* minimum guaranteed bandwidth, according to metric */ > > > + u64 bw_max; /* maximum allowed bandwidth */ > > > + u32 burst; /* maximum burst in bytes for bw_max */ > > > + u32 priority; /* scheduling strict priority */ > > > > Above it say priority. Here is strict priority? Is there a difference? > > the 'priority' field is the strict priority for scheduling. We will > clarify the first comment. > > > > > > + u32 weight; /* scheduling WRR weight*/ > > > +}; > > > > Are there any special semantics for weight? Looking at the hardware i > > have, which has 8 queues for a switch port, i can either set it to > > strict priority, meaning queue 7 needs to be empty before it look at > > queue 6, and it will only look at queue 5 when 6 is empty etc. Or i > > can set weights per queue. How would i expect strict priority? > > The expected behaviour is that schedulers at the same level with the > same priority will be served according to their weight. > > I'm unsure if I read your question correctly. My understanding is that > the your H/W don't allow strict priority and WRR simultaneously. Correct. There is one bit to select between the two. If WRR is enabled, it then becomes possible for some generations of the hardware to configure the weights, for others the weights are fixed in silicon. > In > such case, the set/create operations should reject calls setting both > non zero weight and priority values. So in order to set strict priority, i need to set the priority field to 7, 6, 5, 4, 3, 2, 1, 0, for my 8 queues, and weight to 0. For WRR, i need to set the priority fields to 0, and the weight values, either to what is hard coded in the silicon, or valid weights when it is configurable. Now the question is, how do i get between these two states? It is not possible to mix WRR and strict priority. Any kAPI which only modifies one queue at once will go straight into an invalid state, and the driver will need to return -EOPNOTSUPP. So it seems like there needs to be an atomic set N queue configuration at once, so i can cleanly go from strict priority across 8 queues to WRR across 8 queues. Is that foreseen? > It sounds correct. You can avoid creating the queue group and set the > rate at the NETDEV scope, or possibly not even set/create such shaper: > the transmission is expected to be limited by the line rate. Well, the hardware exists, i probably should just create the shapers to match the hardware. However, if i set the hardware equivalent of bw_max to 0, it then uses line rate. Is this something we want to document? You can disable a shaper from shaping by setting the bandwidth to 0? Or do we want a separate enable/disable bit in the structure? > > > > + * NET_SHAPER_SCOPE_NETDEV, NET_SHAPER_SCOPE_QUEUE_GROUP and > > > + * NET_SHAPER_SCOPE_QUEUE are available on both PFs and VFs devices. > > > > Are they also available on plain boring devices which don't have PFs > > or VFs? > > Yes, a driver is free to implement/support an arbitrary subset of the > available feature. Operations attempting to set an unsupported state > should fail with a relevant extended ack. Great. Please update the comment. > > In my case, only one field appears to be relevant in > > each shaper, and maybe we want to give a hint about that to userspace? > > Any suggestion on how to expose that in reasonable way more then > welcome! We might first want to gather knowledge on what these shapers actually look like in different hardwares. There are going to be some which are very limited fixed functions as in the hardware i have, probably most SOHO switches are like this. And then we have TOR and smart NICs which might be able to create and destroy shapers on the fly, and are a lot less restrictive. Andrew
> +/** > + * enum net_shaper_metric - the metric of the shaper > + * @NET_SHAPER_METRIC_PPS: Shaper operates on a packets per second basis > + * @NET_SHAPER_METRIC_BPS: Shaper operates on a bits per second basis > + */ > +enum net_shaper_metric { > + NET_SHAPER_METRIC_PPS, > + NET_SHAPER_METRIC_BPS > +}; I think we need a third value. In the hardware i'm talking about, the 8 queues themselves don't do any shaping. All i'm configuring is WRR vs strict priority order. So ideally i want something to indicate this shaper does not actually shape. Andrew
On Thu, 2024-05-09 at 17:00 +0200, Andrew Lunn wrote: > On Thu, May 09, 2024 at 04:19:52PM +0200, Paolo Abeni wrote: > > On Wed, 2024-05-08 at 23:47 +0200, Andrew Lunn wrote: > > > > + * struct net_shaper_info - represents a shaping node on the NIC H/W > > > > + * @metric: Specify if the bw limits refers to PPS or BPS > > > > + * @bw_min: Minimum guaranteed rate for this shaper > > > > + * @bw_max: Maximum peak bw allowed for this shaper > > > > + * @burst: Maximum burst for the peek rate of this shaper > > > > + * @priority: Scheduling priority for this shaper > > > > + * @weight: Scheduling weight for this shaper > > > > + */ > > > > +struct net_shaper_info { > > > > + enum net_shaper_metric metric; > > > > + u64 bw_min; /* minimum guaranteed bandwidth, according to metric */ > > > > + u64 bw_max; /* maximum allowed bandwidth */ > > > > + u32 burst; /* maximum burst in bytes for bw_max */ > > > > + u32 priority; /* scheduling strict priority */ > > > > > > Above it say priority. Here is strict priority? Is there a difference? > > > > the 'priority' field is the strict priority for scheduling. We will > > clarify the first comment. > > > > > > > > > + u32 weight; /* scheduling WRR weight*/ > > > > +}; > > > > > > Are there any special semantics for weight? Looking at the hardware i > > > have, which has 8 queues for a switch port, i can either set it to > > > strict priority, meaning queue 7 needs to be empty before it look at > > > queue 6, and it will only look at queue 5 when 6 is empty etc. Or i > > > can set weights per queue. How would i expect strict priority? > > > > The expected behaviour is that schedulers at the same level with the > > same priority will be served according to their weight. > > > > I'm unsure if I read your question correctly. My understanding is that > > the your H/W don't allow strict priority and WRR simultaneously. > > Correct. There is one bit to select between the two. If WRR is > enabled, it then becomes possible for some generations of the hardware > to configure the weights, for others the weights are fixed in silicon. > > > In > > such case, the set/create operations should reject calls setting both > > non zero weight and priority values. > > So in order to set strict priority, i need to set the priority field > to 7, 6, 5, 4, 3, 2, 1, 0, for my 8 queues, and weight to 0. For WRR, > i need to set the priority fields to 0, and the weight values, either > to what is hard coded in the silicon, or valid weights when it is > configurable. > > Now the question is, how do i get between these two states? It is not > possible to mix WRR and strict priority. Any kAPI which only modifies > one queue at once will go straight into an invalid state, and the > driver will need to return -EOPNOTSUPP. So it seems like there needs > to be an atomic set N queue configuration at once, so i can cleanly go > from strict priority across 8 queues to WRR across 8 queues. Is that > foreseen? You could delete all the WRR shapers and then create/add SP based ones. > > It sounds correct. You can avoid creating the queue group and set the > > rate at the NETDEV scope, or possibly not even set/create such shaper: > > the transmission is expected to be limited by the line rate. > > Well, the hardware exists, i probably should just create the shapers > to match the hardware. However, if i set the hardware equivalent of > bw_max to 0, it then uses line rate. Is this something we want to > document? You can disable a shaper from shaping by setting the > bandwidth to 0? Or do we want a separate enable/disable bit in the > structure? I documenting that '0' means 'unlimited' for bw_max fields should be good enough. If the NIC can't support any kind of shaping it will reject with an appropriate extended ack any configuration change with bw != 0. I guess we need another comment update here :) > > > > + * NET_SHAPER_SCOPE_NETDEV, NET_SHAPER_SCOPE_QUEUE_GROUP and > > > > + * NET_SHAPER_SCOPE_QUEUE are available on both PFs and VFs devices. > > > > > > Are they also available on plain boring devices which don't have PFs > > > or VFs? > > > > Yes, a driver is free to implement/support an arbitrary subset of the > > available feature. Operations attempting to set an unsupported state > > should fail with a relevant extended ack. > > Great. Please update the comment. Sure, will do. > > > In my case, only one field appears to be relevant in > > > each shaper, and maybe we want to give a hint about that to userspace? > > > > Any suggestion on how to expose that in reasonable way more then > > welcome! > > We might first want to gather knowledge on what these shapers actually > look like in different hardwares. There are going to be some which are > very limited fixed functions as in the hardware i have, probably most > SOHO switches are like this. And then we have TOR and smart NICs which > might be able to create and destroy shapers on the fly, and are a lot > less restrictive. As a reference the initial configuration status has been discussed here: https://lore.kernel.org/netdev/20240410075745.4637c537@kernel.org/ (grep for 'Transforming arbitrary pre-existing driver hierarchy') The conclusion was (to my understanding) to reduce the core complexity enforcing an uniform views of the NIC defaults. The 'create' op is just an abstraction to tell the NIC to switch from the default configuration to the specified one. The default configuration means 'no restrictions/shaping beyond the link rate'. If the NIC has buildin shapers it can't disable, it must initialize them to match the above. In both case the kernel view of the default will be 'no shapers added' The 'delete' op will restore the default for the specified scope location. If the NIC has a persistent shaper there, it will update the setting to the default, or actually delete the H/W shaper if possible. In both case the kernel view of such scope location with return to be 'no shapers added' My understading/guess/hope is that is possible to implement the above default with any H/W. Does the above solve your doubt/answer your question? Thanks! Paolo
> > Now the question is, how do i get between these two states? It is not > > possible to mix WRR and strict priority. Any kAPI which only modifies > > one queue at once will go straight into an invalid state, and the > > driver will need to return -EOPNOTSUPP. So it seems like there needs > > to be an atomic set N queue configuration at once, so i can cleanly go > > from strict priority across 8 queues to WRR across 8 queues. Is that > > foreseen? > > You could delete all the WRR shapers and then create/add SP based ones. But that does not match the hardware. I cannot delete the hardware. It will either do strict priority or WRR. If i delete the software representation of the shaper, the hardware shaper will keep on doing what it was doing. So i don't see this as a good model. I think the driver will create shapers to represent the hardware, and you are not allowed to delete them or add more of them, because that is what the hardware is. All you can do is configure the shapers that exist. > The 'create' op is just an abstraction to tell the NIC to switch from > the default configuration to the specified one. Well, the hardware default is i think WRR for the queues, and line rate. That will be what the software representation of the shapers will be set to when the driver probes and creates the shapers representors. Andrew
Hi Paolo, > -----Original Message----- > From: Paolo Abeni <pabeni@redhat.com> > Sent: Thursday, May 9, 2024 1:51 AM > To: netdev@vger.kernel.org > Cc: Jakub Kicinski <kuba@kernel.org>; Jiri Pirko <jiri@resnulli.us>; Madhu > Chittim <madhu.chittim@intel.com>; Sridhar Samudrala > <sridhar.samudrala@intel.com>; Simon Horman <horms@kernel.org>; John > Fastabend <john.fastabend@gmail.com>; Sunil Kovvuri Goutham > <sgoutham@marvell.com>; Jamal Hadi Salim <jhs@mojatatu.com> > Subject: [RFC PATCH] net: introduce HW Rate Limiting Driver API > > This is the first incarnation in a formal (pre-RFC) patch of the HW TX Rate > Limiting Driver API proposal shared here[1]. > > The goal is to outline the proposed APIs before pushing the actual implementation. > > The network devices gain a new ops struct to directly manipulate the H/W > shapers implemented by the NIC. > > The shapers can be attached to a pre-defined set of 'domains' - port, vf, etc. - and > the overall shapers configuration pushed to the H/W is maintained by the kernel. > > Each shaper is identified by an unique integer id based on the domain and > additional domain-specific information - e.g. for the VF domain, the virtual function > number/identifier. > > [1] https://urldefense.proofpoint.com/v2/url?u=https- > 3A__lore.kernel.org_netdev_20240405102313.GA310894- > 40kernel.org_&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=TwreqwV6mQ8K9wI > pqwFO8yjikO_w1jUOe2MzChg4Rmg&m=g3KFBILzzqT9znnKDO0ePHOi5tc0Zzy6r > G9xL3GdGKS_9IkdC3q- > okhCTptMrIKN&s=hVQbF8IF41zg6I197jY8r96uQq0oQbgTfe4-4HF6c3M&e= > > Co-developed-by: Simon Horman <horms@kernel.org> > Signed-off-by: Simon Horman <horms@kernel.org> > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > --- > include/linux/netdevice.h | 15 +++ > include/net/net_shaper.h | 206 > ++++++++++++++++++++++++++++++++++++++ > net/Kconfig | 3 + > 3 files changed, 224 insertions(+) > create mode 100644 include/net/net_shaper.h > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index > cf261fb89d73..39f66af014be 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -79,6 +79,8 @@ struct xdp_buff; > struct xdp_frame; > struct xdp_metadata_ops; > struct xdp_md; > +struct net_shaper_ops; > +struct net_shaper_data; > > typedef u32 xdp_features_t; > > @@ -1596,6 +1598,13 @@ struct net_device_ops { > int (*ndo_hwtstamp_set)(struct net_device *dev, > struct kernel_hwtstamp_config > *kernel_config, > struct netlink_ext_ack > *extack); > + > +#if IS_ENABLED(CONFIG_NET_SHAPER) > + /** @net_shaper_ops: Device shaping offload operations > + * see include/net/net_shapers.h > + */ > + const struct net_shaper_ops *net_shaper_ops; #endif > }; > > /** > @@ -2403,6 +2412,12 @@ struct net_device { > /** @page_pools: page pools created for this netdevice */ > struct hlist_head page_pools; > #endif > +#if IS_ENABLED(CONFIG_NET_SHAPER) > + /** @net_shaper_data: data tracking the current shaper status > + * see include/net/net_shapers.h > + */ > + struct net_shaper_data *net_shaper_data; #endif > }; > #define to_net_dev(d) container_of(d, struct net_device, dev) > > diff --git a/include/net/net_shaper.h b/include/net/net_shaper.h new file mode > 100644 index 000000000000..a4fbadd99870 > --- /dev/null > +++ b/include/net/net_shaper.h > @@ -0,0 +1,206 @@ > +/* SPDX-License-Identifier: GPL-2.0-or-later */ > + > +#ifndef _NET_SHAPER_H_ > +#define _NET_SHAPER_H_ > + > +#include <linux/types.h> > +#include <linux/netdevice.h> > +#include <linux/netlink.h> > + > +/** > + * enum net_shaper_metric - the metric of the shaper > + * @NET_SHAPER_METRIC_PPS: Shaper operates on a packets per second > +basis > + * @NET_SHAPER_METRIC_BPS: Shaper operates on a bits per second basis > +*/ enum net_shaper_metric { > + NET_SHAPER_METRIC_PPS, > + NET_SHAPER_METRIC_BPS > +}; > + > +/** > + * struct net_shaper_info - represents a shaping node on the NIC H/W > + * @metric: Specify if the bw limits refers to PPS or BPS > + * @bw_min: Minimum guaranteed rate for this shaper > + * @bw_max: Maximum peak bw allowed for this shaper > + * @burst: Maximum burst for the peek rate of this shaper > + * @priority: Scheduling priority for this shaper > + * @weight: Scheduling weight for this shaper */ struct > +net_shaper_info { > + enum net_shaper_metric metric; > + u64 bw_min; /* minimum guaranteed bandwidth, according to metric */ > + u64 bw_max; /* maximum allowed bandwidth */ > + u32 burst; /* maximum burst in bytes for bw_max */ > + u32 priority; /* scheduling strict priority */ > + u32 weight; /* scheduling WRR weight*/ > +}; > + > +/** > + * enum net_shaper_scope - the different scopes where a shaper could be > attached > + * @NET_SHAPER_SCOPE_PORT: The root shaper for the whole H/W. > + * @NET_SHAPER_SCOPE_NETDEV: The main shaper for the given network > device. > + * @NET_SHAPER_SCOPE_VF: The shaper is attached to the given virtual > + * function. > + * @NET_SHAPER_SCOPE_QUEUE_GROUP: The shaper groups multiple > queues > +under the > + * same device. > + * @NET_SHAPER_SCOPE_QUEUE: The shaper is attached to the given > device queue. > + * > + * NET_SHAPER_SCOPE_PORT and NET_SHAPER_SCOPE_VF are only > available on > + * PF devices, usually inside the host/hypervisor. What is the difference between NET_SHAPER_SCOPE_VF and NET_SHAPER_SCOPE_NETDEV from a VF traffic shaping perspective? Is NET_SHAPER_SCOPE_VF used to configure a shaper for a VF which is in a separate domain that is not a NETDEV? Thanks, Naveen > + * NET_SHAPER_SCOPE_NETDEV, NET_SHAPER_SCOPE_QUEUE_GROUP > and > + * NET_SHAPER_SCOPE_QUEUE are available on both PFs and VFs devices. > + */ > +enum net_shaper_scope { > + NET_SHAPER_SCOPE_PORT, > + NET_SHAPER_SCOPE_NETDEV, > + NET_SHAPER_SCOPE_VF, > + NET_SHAPER_SCOPE_QUEUE_GROUP, > + NET_SHAPER_SCOPE_QUEUE, > +}; > + > +/** > + * struct net_shaper_ops - Operations on device H/W shapers > + * @add: Creates a new shaper in the specified scope. > + * @set: Modify the existing shaper. > + * @delete: Delete the specified shaper. > + * @move: Move an existing shaper under a different parent. > + * > + * The initial shaping configuration ad device initialization is empty/ > + * a no-op/does not constraint the b/w in any way. > + * The network core keeps track of the applied user-configuration in > + * per device storage. > + * > + * Each shaper is uniquely identified within the device with an > +'handle', > + * dependent on the shaper scope and other data, see > +@shaper_make_handle() */ struct net_shaper_ops { > + /** add - Add a shaper inside the shaper hierarchy > + * @dev: netdevice to operate on > + * @handle: the shaper indetifier > + * @shaper: configuration of shaper > + * @extack: Netlink extended ACK for reporting errors. > + * > + * Return: > + * * 0 on success > + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, > + * or core for any reason. @extack should be set to > + * text describing the reason. > + * * Other negative error values on failure. > + * > + * Examples or reasons this operation may fail include: > + * * H/W resources limits. > + * * Can’t respect the requested bw limits. > + */ > + int (*add)(struct net_device *dev, u32 handle, > + const struct net_shaper_info *shaper, > + struct netlink_ext_ack *extack); > + > + /** set - Update the specified shaper, if it exists > + * @dev: Netdevice to operate on. > + * @handle: the shaper identifier > + * @shaper: Configuration of shaper. > + * @extack: Netlink extended ACK for reporting errors. > + * > + * Return: > + * * %0 - Success > + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, > + * or core for any reason. @extack should be set to > + * text describing the reason. > + * * Other negative error values on failure. > + */ > + int (*set)(struct net_device *dev, u32 handle, > + const struct net_shaper_info *shaper, > + struct netlink_ext_ack *extack); > + > + /** delete - Removes a shaper from the NIC > + * @dev: netdevice to operate on. > + * @handle: the shaper identifier > + * @extack: Netlink extended ACK for reporting errors. > + * > + * Return: > + * * %0 - Success > + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, > + * or core for any reason. @extack should be set to > + * text describing the reason. > + * * Other negative error value on failure. > + */ > + int (*delete)(struct net_device *dev, u32 handle, > + struct netlink_ext_ack *extack); > + > + /** Move - change the parent id of the specified shaper > + * @dev: netdevice to operate on. > + * @handle: unique identifier for the shaper > + * @new_parent_id: identifier of the new parent for this shaper > + * @extack: Netlink extended ACK for reporting errors. > + * > + * Move the specified shaper in the hierarchy replacing its > + * current parent shaper with @new_parent_id > + * > + * Return: > + * * %0 - Success > + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, > + * or core for any reason. @extack should be set to > + * text describing the reason. > + * * Other negative error values on failure. > + */ > + int (*move)(struct net_device *dev, u32 handle, > + u32 new_parent_handle, struct netlink_ext_ack *extack); }; > + > +/** > + * net_shaper_make_handle - creates an unique shaper identifier > + * @scope: the shaper scope > + * @vf: virtual function number > + * @id: queue group or queue id > + * > + * Return: an unique identifier for the shaper > + * > + * Combines the specified arguments to create an unique identifier for > + * the shaper. > + * The virtual function number is only used within > +@NET_SHAPER_SCOPE_VF, > + * @NET_SHAPER_SCOPE_QUEUE_GROUP and > @NET_SHAPER_SCOPE_QUEUE. > + * The @id number is only used for @NET_SHAPER_SCOPE_QUEUE_GROUP > and > + * @NET_SHAPER_SCOPE_QUEUE, and must be, respectively, the queue > group > + * identifier or the queue number. > + */ > +u32 net_shaper_make_handle(enum net_shaper_scope scope, int vf, int > +id); > + > +/* > + * Examples: > + * - set shaping on a given queue > + * struct shaper_info info = { }; // fill this > + * u32 handle = shaper_make_handle(NET_SHAPER_SCOPE_QUEUE, 0, > queue_id); > + * dev->shaper_ops->add(dev, handle, &info, NULL); > + * > + * - create a queue group with a queue group shaping limits. > + * Assuming the following topology already exists: > + * < netdev shaper > > + * / \ > + * <queue 0 shaper> . . . <queue N shaper> > + * > + * struct shaper_info ginfo = { }; // fill this > + * u32 ghandle = > shaper_make_handle(NET_SHAPER_SCOPE_QUEUE_GROUP, 0, 0); > + * dev->shaper_ops->add(dev, ghandle, &ginfo); > + * > + * // now topology is: > + * // < netdev shaper > > + * // / | \ > + * // / | < newly created shaper > > + * // / | > + * // <queue 0 shaper> . . . <queue N shaper> > + * > + * // move a shapers for queues 3..n out of such queue group > + * for (i = 0; i <= 2; ++i) { > + * u32 qhandle = net_shaper_make_handle(NET_SHAPER_SCOPE_QUEUE, > 0, i); > + * dev->netshaper_ops->move(dev, qhandle, ghandle, NULL); > + * } > + * > + * // now the topology is: > + * // < netdev shaper > > + * // / \ > + * // < newly created shaper> <queue 3 shaper> .. <queue n shaper> > + * // / \ > + * // <queue 0 shaper> . . . <queue 2 shaper> > + */ > +#endif > + > diff --git a/net/Kconfig b/net/Kconfig > index f0a8692496ff..29c6fec54711 100644 > --- a/net/Kconfig > +++ b/net/Kconfig > @@ -66,6 +66,9 @@ config SKB_DECRYPTED > config SKB_EXTENSIONS > bool > > +config NET_SHAPER > + bool > + > menu "Networking options" > > source "net/packet/Kconfig" > -- > 2.43.2 >
On Fri, 2024-05-10 at 07:10 +0000, Naveen Mamindlapalli wrote: > On Thursday, May 9, 2024 1:51 AM Paolo Abeni <pabeni@redhat.com> wrote: [...] > > +/** > > + * enum net_shaper_scope - the different scopes where a shaper could be > > attached > > + * @NET_SHAPER_SCOPE_PORT: The root shaper for the whole H/W. > > + * @NET_SHAPER_SCOPE_NETDEV: The main shaper for the given network > > device. > > + * @NET_SHAPER_SCOPE_VF: The shaper is attached to the given virtual > > + * function. > > + * @NET_SHAPER_SCOPE_QUEUE_GROUP: The shaper groups multiple > > queues > > +under the > > + * same device. > > + * @NET_SHAPER_SCOPE_QUEUE: The shaper is attached to the given > > device queue. > > + * > > + * NET_SHAPER_SCOPE_PORT and NET_SHAPER_SCOPE_VF are only > > available on > > + * PF devices, usually inside the host/hypervisor. > > What is the difference between NET_SHAPER_SCOPE_VF and NET_SHAPER_SCOPE_NETDEV > from a VF traffic shaping perspective? With NET_SHAPER_SCOPE_VF, the user can control VF-level shapers from within the hypervisor. With NET_SHAPER_SCOPE_NETDEV, the user will control whatever is really under the gear of the given 'dev' struct: the PF-level shaper if the user itself is within the hypervisor, or the relevant VF if the user is within the VM. Cheers, Paolo
On Thu, 2024-05-09 at 18:17 +0200, Andrew Lunn wrote: > > > Now the question is, how do i get between these two states? It is not > > > possible to mix WRR and strict priority. Any kAPI which only modifies > > > one queue at once will go straight into an invalid state, and the > > > driver will need to return -EOPNOTSUPP. So it seems like there needs > > > to be an atomic set N queue configuration at once, so i can cleanly go > > > from strict priority across 8 queues to WRR across 8 queues. Is that > > > foreseen? > > > > You could delete all the WRR shapers and then create/add SP based ones. > > But that does not match the hardware. I cannot delete the hardware. It > will either do strict priority or WRR. If i delete the software > representation of the shaper, the hardware shaper will keep on doing > what it was doing. So i don't see this as a good model. I think the > driver will create shapers to represent the hardware, and you are not > allowed to delete them or add more of them, because that is what the > hardware is. All you can do is configure the shapers that exist. > > > The 'create' op is just an abstraction to tell the NIC to switch from > > the default configuration to the specified one. > > Well, the hardware default is i think WRR for the queues, and line > rate. That will be what the software representation of the shapers > will be set to when the driver probes and creates the shapers > representors. If I read correctly, allowing each NIC to expose it's own different starting configuration still will not solve the problem for this H/W to switch from WRR to SP (and vice versa). AFAICS, what would be needed there is an atomic set of operations: 'set_many' (and e.v. 'delete_many', 'create_many') that will allow changing all the shapers at once. With such operations, that H/W could still fit the expected 'no-op' default, as WRR on the queue shapers is what we expect. I agree with Jakub, handling the complexity of arbitrary starting configuration would pose a lot of trouble to the user/admin. If all the above stands together, I think we have a few options (in random order): - add both set of operations: the ones operating on a single shaper and the ones operating on multiple shapers - use only the multiple shapers ops. And the latter looks IMHO the simple/better. At that point I would probably drop the 'add' op and would rename 'delete' as 'reset': int (*set)(struct net_device *dev, int how_many, const u32 *handles, const struct net_shaper_info *shapers, struct netlink_ext_ack *extack); int (*reset)(struct net_device *dev, int how_many, const u32 *handles, struct netlink_ext_ack *extack); int (*move)(struct net_device *dev, int how_many, const u32 *handles, const u32 *new_parent_handles, struct netlink_ext_ack *extack); An NIC with 'static' shapers can implement a dummy move always returning EOPNOTSUPP and eventually filling a detailed extack. NIC without any constraints on mixing and matching different kind of shapers could implement the above as a loop over whatever they will do for the corresponding 'single shaper op' NIC with constrains alike the one you pointed out could validate the final state before atomically applying the specified operation. After a successful 'reset' operation, the kernel could drop any data it retains/caches for the relevant shapers - the current idea is to keep a copy of all successfully configured shaper_info in a xarray, using the 'handle' as the index. Side note: the move() operation could look like a complex artifact, but it's the simplest instrument I could think of to support scenarios where the user creates/configures/sets a queue group and 'move' some queue under the newly created group WDYT? Thanks, Paolo
On Fri, May 10, 2024 at 01:05:41PM +0200, Paolo Abeni wrote: > On Thu, 2024-05-09 at 18:17 +0200, Andrew Lunn wrote: > > > > Now the question is, how do i get between these two states? It is not > > > > possible to mix WRR and strict priority. Any kAPI which only modifies > > > > one queue at once will go straight into an invalid state, and the > > > > driver will need to return -EOPNOTSUPP. So it seems like there needs > > > > to be an atomic set N queue configuration at once, so i can cleanly go > > > > from strict priority across 8 queues to WRR across 8 queues. Is that > > > > foreseen? > > > > > > You could delete all the WRR shapers and then create/add SP based ones. > > > > But that does not match the hardware. I cannot delete the hardware. It > > will either do strict priority or WRR. If i delete the software > > representation of the shaper, the hardware shaper will keep on doing > > what it was doing. So i don't see this as a good model. I think the > > driver will create shapers to represent the hardware, and you are not > > allowed to delete them or add more of them, because that is what the > > hardware is. All you can do is configure the shapers that exist. > > > > > The 'create' op is just an abstraction to tell the NIC to switch from > > > the default configuration to the specified one. > > > > Well, the hardware default is i think WRR for the queues, and line > > rate. That will be what the software representation of the shapers > > will be set to when the driver probes and creates the shapers > > representors. > > If I read correctly, allowing each NIC to expose it's own different > starting configuration still will not solve the problem for this H/W to > switch from WRR to SP (and vice versa). > > AFAICS, what would be needed there is an atomic set of operations: > 'set_many' (and e.v. 'delete_many', 'create_many') that will allow > changing all the shapers at once. > > With such operations, that H/W could still fit the expected 'no-op' > default, as WRR on the queue shapers is what we expect. I agree with > Jakub, handling the complexity of arbitrary starting configuration > would pose a lot of trouble to the user/admin. > > If all the above stands together, I think we have a few options (in > random order): > > - add both set of operations: the ones operating on a single shaper and > the ones operating on multiple shapers > - use only the multiple shapers ops. > > And the latter looks IMHO the simple/better. At that point I would > probably drop the 'add' op and would rename 'delete' as > 'reset': > > int (*set)(struct net_device *dev, int how_many, const u32 *handles, > const struct net_shaper_info *shapers, > struct netlink_ext_ack *extack); > int (*reset)(struct net_device *dev, int how_many, const u32 *handles, > struct netlink_ext_ack *extack); > int (*move)(struct net_device *dev, int how_many, const u32 *handles, > const u32 *new_parent_handles, > struct netlink_ext_ack *extack); > > An NIC with 'static' shapers can implement a dummy move always > returning EOPNOTSUPP and eventually filling a detailed extack. > > NIC without any constraints on mixing and matching different kind of > shapers could implement the above as a loop over whatever they will do > for the corresponding 'single shaper op' > > NIC with constrains alike the one you pointed out could validate the > final state before atomically applying the specified operation. > > After a successful 'reset' operation, the kernel could drop any data > it retains/caches for the relevant shapers - the current idea is to > keep a copy of all successfully configured shaper_info in a xarray, > using the 'handle' as the index. > > Side note: the move() operation could look like a complex artifact, but > it's the simplest instrument I could think of to support scenarios > where the user creates/configures/sets a queue group and 'move' some > queue under the newly created group > > WDYT? Hi Andrew, Picking up this discussion, I'm wondering if Paolo's proposal addresses your concerns.
> > If I read correctly, allowing each NIC to expose it's own different > > starting configuration still will not solve the problem for this H/W to > > switch from WRR to SP (and vice versa). I also suspect this is not unique to this hardware. I've not looked at other SOHO switches, but it is reasonably common to have different queues for different priority classes, and then one shaper for the overall port rate. > > AFAICS, what would be needed there is an atomic set of operations: > > 'set_many' (and e.v. 'delete_many', 'create_many') that will allow > > changing all the shapers at once. Yep. > > With such operations, that H/W could still fit the expected 'no-op' > > default, as WRR on the queue shapers is what we expect. I agree with > > Jakub, handling the complexity of arbitrary starting configuration > > would pose a lot of trouble to the user/admin. > > > > If all the above stands together, I think we have a few options (in > > random order): > > > > - add both set of operations: the ones operating on a single shaper and > > the ones operating on multiple shapers > > - use only the multiple shapers ops. > > > > And the latter looks IMHO the simple/better. I would agree, start with only multiple shaper opps. If we find that many implementation end up just iterating the list and dealing with them individually, would could pull that iterator into the core, and expand the ops to either/or, multiple or single. > > int (*set)(struct net_device *dev, int how_many, const u32 *handles, > > const struct net_shaper_info *shapers, > > struct netlink_ext_ack *extack); > > int (*reset)(struct net_device *dev, int how_many, const u32 *handles, > > struct netlink_ext_ack *extack); > > int (*move)(struct net_device *dev, int how_many, const u32 *handles, > > const u32 *new_parent_handles, > > struct netlink_ext_ack *extack); > > > > An NIC with 'static' shapers can implement a dummy move always > > returning EOPNOTSUPP and eventually filling a detailed extack. The extack is going to be important here, we are going to need meaningful error messages. Overall, i think this can be made to work with the hardware i have. Andrew
> + * struct net_shaper_info - represents a shaping node on the NIC H/W > + * @metric: Specify if the bw limits refers to PPS or BPS > + * @bw_min: Minimum guaranteed rate for this shaper > + * @bw_max: Maximum peak bw allowed for this shaper > + * @burst: Maximum burst for the peek rate of this shaper > + * @priority: Scheduling priority for this shaper > + * @weight: Scheduling weight for this shaper > + */ > +struct net_shaper_info { > + enum net_shaper_metric metric; > + u64 bw_min; /* minimum guaranteed bandwidth, according to metric */ > + u64 bw_max; /* maximum allowed bandwidth */ > + u32 burst; /* maximum burst in bytes for bw_max */ > + u32 priority; /* scheduling strict priority */ > + u32 weight; /* scheduling WRR weight*/ > +}; ... > + /** set - Update the specified shaper, if it exists > + * @dev: Netdevice to operate on. > + * @handle: the shaper identifier > + * @shaper: Configuration of shaper. > + * @extack: Netlink extended ACK for reporting errors. > + * > + * Return: > + * * %0 - Success > + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, > + * or core for any reason. @extack should be set to > + * text describing the reason. > + * * Other negative error values on failure. > + */ > + int (*set)(struct net_device *dev, u32 handle, > + const struct net_shaper_info *shaper, > + struct netlink_ext_ack *extack); > + * net_shaper_make_handle - creates an unique shaper identifier > + * @scope: the shaper scope > + * @vf: virtual function number > + * @id: queue group or queue id > + * > + * Return: an unique identifier for the shaper > + * > + * Combines the specified arguments to create an unique identifier for > + * the shaper. > + * The virtual function number is only used within @NET_SHAPER_SCOPE_VF, > + * @NET_SHAPER_SCOPE_QUEUE_GROUP and @NET_SHAPER_SCOPE_QUEUE. > + * The @id number is only used for @NET_SHAPER_SCOPE_QUEUE_GROUP and > + * @NET_SHAPER_SCOPE_QUEUE, and must be, respectively, the queue group > + * identifier or the queue number. > + */ > +u32 net_shaper_make_handle(enum net_shaper_scope scope, int vf, int id); One thing i'm missing here is a function which does the opposite of net_shaper_make_handle(). Given a handle, it returns the scope, vf and the id. When the set() op is called, i somehow need to find the software instance representing the hardware block. If i know the id, it is just an array access. Otherwise i need additional bookkeeping, maybe a linked list of handles and pointers to structures etc. Or net_shaper_make_handle() could maybe take an addition void * priv, and provide a function void * net_shape_priv(u32 handle); Andrew
On Wed, May 15, 2024 at 04:41:09PM +0200, Andrew Lunn wrote: > > + * struct net_shaper_info - represents a shaping node on the NIC H/W > > + * @metric: Specify if the bw limits refers to PPS or BPS > > + * @bw_min: Minimum guaranteed rate for this shaper > > + * @bw_max: Maximum peak bw allowed for this shaper > > + * @burst: Maximum burst for the peek rate of this shaper > > + * @priority: Scheduling priority for this shaper > > + * @weight: Scheduling weight for this shaper > > + */ > > +struct net_shaper_info { > > + enum net_shaper_metric metric; > > + u64 bw_min; /* minimum guaranteed bandwidth, according to metric */ > > + u64 bw_max; /* maximum allowed bandwidth */ > > + u32 burst; /* maximum burst in bytes for bw_max */ > > + u32 priority; /* scheduling strict priority */ > > + u32 weight; /* scheduling WRR weight*/ > > +}; > > ... > > > + /** set - Update the specified shaper, if it exists > > + * @dev: Netdevice to operate on. > > + * @handle: the shaper identifier > > + * @shaper: Configuration of shaper. > > + * @extack: Netlink extended ACK for reporting errors. > > + * > > + * Return: > > + * * %0 - Success > > + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, > > + * or core for any reason. @extack should be set to > > + * text describing the reason. > > + * * Other negative error values on failure. > > + */ > > + int (*set)(struct net_device *dev, u32 handle, > > + const struct net_shaper_info *shaper, > > + struct netlink_ext_ack *extack); > > > + * net_shaper_make_handle - creates an unique shaper identifier > > + * @scope: the shaper scope > > + * @vf: virtual function number > > + * @id: queue group or queue id > > + * > > + * Return: an unique identifier for the shaper > > + * > > + * Combines the specified arguments to create an unique identifier for > > + * the shaper. > > + * The virtual function number is only used within @NET_SHAPER_SCOPE_VF, > > + * @NET_SHAPER_SCOPE_QUEUE_GROUP and @NET_SHAPER_SCOPE_QUEUE. > > + * The @id number is only used for @NET_SHAPER_SCOPE_QUEUE_GROUP and > > + * @NET_SHAPER_SCOPE_QUEUE, and must be, respectively, the queue group > > + * identifier or the queue number. > > + */ > > +u32 net_shaper_make_handle(enum net_shaper_scope scope, int vf, int id); > > One thing i'm missing here is a function which does the opposite of > net_shaper_make_handle(). Given a handle, it returns the scope, vf and > the id. > > When the set() op is called, i somehow need to find the software > instance representing the hardware block. If i know the id, it is just > an array access. Otherwise i need additional bookkeeping, maybe a > linked list of handles and pointers to structures etc. > > Or net_shaper_make_handle() could maybe take an addition void * priv, > and provide a function void * net_shape_priv(u32 handle); Hi Andrew, I think that, initially at least, the implementation of net_shaper_make_handle() can be fairly simple, involving packing fields into an integer in a static manner. As such I think implementing a helper or helpers to an to extract fields should be trivial. In other words, yes, I think that can be added.
On Wed, May 15, 2024 at 04:19:57PM +0200, Andrew Lunn wrote: > > > If I read correctly, allowing each NIC to expose it's own different > > > starting configuration still will not solve the problem for this H/W to > > > switch from WRR to SP (and vice versa). > > I also suspect this is not unique to this hardware. I've not looked at > other SOHO switches, but it is reasonably common to have different > queues for different priority classes, and then one shaper for the > overall port rate. Yes, understood. It's about creating a sufficiently general solution. And the HW you have in mind has lead us to see some shortcomings of the proposed API in that area. Because it drew a bit too much on understanding of a different category of HW. > > > AFAICS, what would be needed there is an atomic set of operations: > > > 'set_many' (and e.v. 'delete_many', 'create_many') that will allow > > > changing all the shapers at once. > > Yep. > > > > With such operations, that H/W could still fit the expected 'no-op' > > > default, as WRR on the queue shapers is what we expect. I agree with > > > Jakub, handling the complexity of arbitrary starting configuration > > > would pose a lot of trouble to the user/admin. > > > > > > If all the above stands together, I think we have a few options (in > > > random order): > > > > > > - add both set of operations: the ones operating on a single shaper and > > > the ones operating on multiple shapers > > > - use only the multiple shapers ops. > > > > > > And the latter looks IMHO the simple/better. > > I would agree, start with only multiple shaper opps. If we find that > many implementation end up just iterating the list and dealing with > them individually, would could pull that iterator into the core, and > expand the ops to either/or, multiple or single. FWIIW, this was my thinking too. > > > int (*set)(struct net_device *dev, int how_many, const u32 *handles, > > > const struct net_shaper_info *shapers, > > > struct netlink_ext_ack *extack); > > > int (*reset)(struct net_device *dev, int how_many, const u32 *handles, > > > struct netlink_ext_ack *extack); > > > int (*move)(struct net_device *dev, int how_many, const u32 *handles, > > > const u32 *new_parent_handles, > > > struct netlink_ext_ack *extack); > > > > > > An NIC with 'static' shapers can implement a dummy move always > > > returning EOPNOTSUPP and eventually filling a detailed extack. > > The extack is going to be important here, we are going to need > meaningful error messages. Always :) > Overall, i think this can be made to work with the hardware i have. Great, I think the next step is for us to propose a revised API with multiple shaper ops in place of single shaper ops.
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index cf261fb89d73..39f66af014be 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -79,6 +79,8 @@ struct xdp_buff; struct xdp_frame; struct xdp_metadata_ops; struct xdp_md; +struct net_shaper_ops; +struct net_shaper_data; typedef u32 xdp_features_t; @@ -1596,6 +1598,13 @@ struct net_device_ops { int (*ndo_hwtstamp_set)(struct net_device *dev, struct kernel_hwtstamp_config *kernel_config, struct netlink_ext_ack *extack); + +#if IS_ENABLED(CONFIG_NET_SHAPER) + /** @net_shaper_ops: Device shaping offload operations + * see include/net/net_shapers.h + */ + const struct net_shaper_ops *net_shaper_ops; +#endif }; /** @@ -2403,6 +2412,12 @@ struct net_device { /** @page_pools: page pools created for this netdevice */ struct hlist_head page_pools; #endif +#if IS_ENABLED(CONFIG_NET_SHAPER) + /** @net_shaper_data: data tracking the current shaper status + * see include/net/net_shapers.h + */ + struct net_shaper_data *net_shaper_data; +#endif }; #define to_net_dev(d) container_of(d, struct net_device, dev) diff --git a/include/net/net_shaper.h b/include/net/net_shaper.h new file mode 100644 index 000000000000..a4fbadd99870 --- /dev/null +++ b/include/net/net_shaper.h @@ -0,0 +1,206 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ + +#ifndef _NET_SHAPER_H_ +#define _NET_SHAPER_H_ + +#include <linux/types.h> +#include <linux/netdevice.h> +#include <linux/netlink.h> + +/** + * enum net_shaper_metric - the metric of the shaper + * @NET_SHAPER_METRIC_PPS: Shaper operates on a packets per second basis + * @NET_SHAPER_METRIC_BPS: Shaper operates on a bits per second basis + */ +enum net_shaper_metric { + NET_SHAPER_METRIC_PPS, + NET_SHAPER_METRIC_BPS +}; + +/** + * struct net_shaper_info - represents a shaping node on the NIC H/W + * @metric: Specify if the bw limits refers to PPS or BPS + * @bw_min: Minimum guaranteed rate for this shaper + * @bw_max: Maximum peak bw allowed for this shaper + * @burst: Maximum burst for the peek rate of this shaper + * @priority: Scheduling priority for this shaper + * @weight: Scheduling weight for this shaper + */ +struct net_shaper_info { + enum net_shaper_metric metric; + u64 bw_min; /* minimum guaranteed bandwidth, according to metric */ + u64 bw_max; /* maximum allowed bandwidth */ + u32 burst; /* maximum burst in bytes for bw_max */ + u32 priority; /* scheduling strict priority */ + u32 weight; /* scheduling WRR weight*/ +}; + +/** + * enum net_shaper_scope - the different scopes where a shaper could be attached + * @NET_SHAPER_SCOPE_PORT: The root shaper for the whole H/W. + * @NET_SHAPER_SCOPE_NETDEV: The main shaper for the given network device. + * @NET_SHAPER_SCOPE_VF: The shaper is attached to the given virtual + * function. + * @NET_SHAPER_SCOPE_QUEUE_GROUP: The shaper groups multiple queues under the + * same device. + * @NET_SHAPER_SCOPE_QUEUE: The shaper is attached to the given device queue. + * + * NET_SHAPER_SCOPE_PORT and NET_SHAPER_SCOPE_VF are only available on + * PF devices, usually inside the host/hypervisor. + * NET_SHAPER_SCOPE_NETDEV, NET_SHAPER_SCOPE_QUEUE_GROUP and + * NET_SHAPER_SCOPE_QUEUE are available on both PFs and VFs devices. + */ +enum net_shaper_scope { + NET_SHAPER_SCOPE_PORT, + NET_SHAPER_SCOPE_NETDEV, + NET_SHAPER_SCOPE_VF, + NET_SHAPER_SCOPE_QUEUE_GROUP, + NET_SHAPER_SCOPE_QUEUE, +}; + +/** + * struct net_shaper_ops - Operations on device H/W shapers + * @add: Creates a new shaper in the specified scope. + * @set: Modify the existing shaper. + * @delete: Delete the specified shaper. + * @move: Move an existing shaper under a different parent. + * + * The initial shaping configuration ad device initialization is empty/ + * a no-op/does not constraint the b/w in any way. + * The network core keeps track of the applied user-configuration in + * per device storage. + * + * Each shaper is uniquely identified within the device with an 'handle', + * dependent on the shaper scope and other data, see @shaper_make_handle() + */ +struct net_shaper_ops { + /** add - Add a shaper inside the shaper hierarchy + * @dev: netdevice to operate on + * @handle: the shaper indetifier + * @shaper: configuration of shaper + * @extack: Netlink extended ACK for reporting errors. + * + * Return: + * * 0 on success + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, + * or core for any reason. @extack should be set to + * text describing the reason. + * * Other negative error values on failure. + * + * Examples or reasons this operation may fail include: + * * H/W resources limits. + * * Can’t respect the requested bw limits. + */ + int (*add)(struct net_device *dev, u32 handle, + const struct net_shaper_info *shaper, + struct netlink_ext_ack *extack); + + /** set - Update the specified shaper, if it exists + * @dev: Netdevice to operate on. + * @handle: the shaper identifier + * @shaper: Configuration of shaper. + * @extack: Netlink extended ACK for reporting errors. + * + * Return: + * * %0 - Success + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, + * or core for any reason. @extack should be set to + * text describing the reason. + * * Other negative error values on failure. + */ + int (*set)(struct net_device *dev, u32 handle, + const struct net_shaper_info *shaper, + struct netlink_ext_ack *extack); + + /** delete - Removes a shaper from the NIC + * @dev: netdevice to operate on. + * @handle: the shaper identifier + * @extack: Netlink extended ACK for reporting errors. + * + * Return: + * * %0 - Success + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, + * or core for any reason. @extack should be set to + * text describing the reason. + * * Other negative error value on failure. + */ + int (*delete)(struct net_device *dev, u32 handle, + struct netlink_ext_ack *extack); + + /** Move - change the parent id of the specified shaper + * @dev: netdevice to operate on. + * @handle: unique identifier for the shaper + * @new_parent_id: identifier of the new parent for this shaper + * @extack: Netlink extended ACK for reporting errors. + * + * Move the specified shaper in the hierarchy replacing its + * current parent shaper with @new_parent_id + * + * Return: + * * %0 - Success + * * %-EOPNOTSUPP - Operation is not supported by hardware, driver, + * or core for any reason. @extack should be set to + * text describing the reason. + * * Other negative error values on failure. + */ + int (*move)(struct net_device *dev, u32 handle, + u32 new_parent_handle, struct netlink_ext_ack *extack); +}; + +/** + * net_shaper_make_handle - creates an unique shaper identifier + * @scope: the shaper scope + * @vf: virtual function number + * @id: queue group or queue id + * + * Return: an unique identifier for the shaper + * + * Combines the specified arguments to create an unique identifier for + * the shaper. + * The virtual function number is only used within @NET_SHAPER_SCOPE_VF, + * @NET_SHAPER_SCOPE_QUEUE_GROUP and @NET_SHAPER_SCOPE_QUEUE. + * The @id number is only used for @NET_SHAPER_SCOPE_QUEUE_GROUP and + * @NET_SHAPER_SCOPE_QUEUE, and must be, respectively, the queue group + * identifier or the queue number. + */ +u32 net_shaper_make_handle(enum net_shaper_scope scope, int vf, int id); + +/* + * Examples: + * - set shaping on a given queue + * struct shaper_info info = { }; // fill this + * u32 handle = shaper_make_handle(NET_SHAPER_SCOPE_QUEUE, 0, queue_id); + * dev->shaper_ops->add(dev, handle, &info, NULL); + * + * - create a queue group with a queue group shaping limits. + * Assuming the following topology already exists: + * < netdev shaper > + * / \ + * <queue 0 shaper> . . . <queue N shaper> + * + * struct shaper_info ginfo = { }; // fill this + * u32 ghandle = shaper_make_handle(NET_SHAPER_SCOPE_QUEUE_GROUP, 0, 0); + * dev->shaper_ops->add(dev, ghandle, &ginfo); + * + * // now topology is: + * // < netdev shaper > + * // / | \ + * // / | < newly created shaper > + * // / | + * // <queue 0 shaper> . . . <queue N shaper> + * + * // move a shapers for queues 3..n out of such queue group + * for (i = 0; i <= 2; ++i) { + * u32 qhandle = net_shaper_make_handle(NET_SHAPER_SCOPE_QUEUE, 0, i); + * dev->netshaper_ops->move(dev, qhandle, ghandle, NULL); + * } + * + * // now the topology is: + * // < netdev shaper > + * // / \ + * // < newly created shaper> <queue 3 shaper> .. <queue n shaper> + * // / \ + * // <queue 0 shaper> . . . <queue 2 shaper> + */ +#endif + diff --git a/net/Kconfig b/net/Kconfig index f0a8692496ff..29c6fec54711 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -66,6 +66,9 @@ config SKB_DECRYPTED config SKB_EXTENSIONS bool +config NET_SHAPER + bool + menu "Networking options" source "net/packet/Kconfig"