Message ID | 20240422203913.225151-7-anthony.l.nguyen@intel.com
---|---
State | Accepted
Commit | 9afff0de30db149a1bf440db26a3ddd6a4f260d8
Delegated to | Netdev Maintainers
Series | ice: Support 5 layer Tx scheduler topology
On Mon, Apr 22, 2024 at 01:39:11PM -0700, Tony Nguyen wrote:
> +       The default 9-layer tree topology was deemed best for most workloads,
> +       as it gives an optimal ratio of performance to configurability. However,
> +       for some specific cases, this 9-layer topology might not be desired.
> +       One example would be sending traffic to queues that are not a multiple
> +       of 8. Because the maximum radix is limited to 8 in 9-layer topology,
> +       the 9th queue has a different parent than the rest, and it's given
> +       more bandwidth credits. This causes a problem when the system is
> +       sending traffic to 9 queues:
> +
> +       | tx_queue_0_packets: 24163396
> +       | tx_queue_1_packets: 24164623
> +       | tx_queue_2_packets: 24163188
> +       | tx_queue_3_packets: 24163701
> +       | tx_queue_4_packets: 24163683
> +       | tx_queue_5_packets: 24164668
> +       | tx_queue_6_packets: 23327200
> +       | tx_queue_7_packets: 24163853
> +       | tx_queue_8_packets: 91101417 < Too much traffic is sent from 9th
> +
> <snipped>...
> +       To verify that value has been set:
> +       $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers
>

For consistency with other code blocks, format above as such:

---- >8 ----
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
index 830c04354222f8..0039ca45782400 100644
--- a/Documentation/networking/devlink/ice.rst
+++ b/Documentation/networking/devlink/ice.rst
@@ -41,15 +41,17 @@ Parameters
        more bandwidth credits. This causes a problem when the system is
        sending traffic to 9 queues:
 
-       | tx_queue_0_packets: 24163396
-       | tx_queue_1_packets: 24164623
-       | tx_queue_2_packets: 24163188
-       | tx_queue_3_packets: 24163701
-       | tx_queue_4_packets: 24163683
-       | tx_queue_5_packets: 24164668
-       | tx_queue_6_packets: 23327200
-       | tx_queue_7_packets: 24163853
-       | tx_queue_8_packets: 91101417 < Too much traffic is sent from 9th
+       .. code-block:: shell
+
+         tx_queue_0_packets: 24163396
+         tx_queue_1_packets: 24164623
+         tx_queue_2_packets: 24163188
+         tx_queue_3_packets: 24163701
+         tx_queue_4_packets: 24163683
+         tx_queue_5_packets: 24164668
+         tx_queue_6_packets: 23327200
+         tx_queue_7_packets: 24163853
+         tx_queue_8_packets: 91101417 < Too much traffic is sent from 9th
 
        To address this need, you can switch to a 5-layer topology, which
        changes the maximum topology radix to 512. With this enhancement,
@@ -67,7 +69,10 @@ Parameters
        You must do PCI slot powercycle for the selected topology to take effect.
 
        To verify that value has been set:
-       $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers
+
+       .. code-block:: shell
+
+         $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers
 
 Info versions
 =============

Thanks.
On 4/23/2024 2:37 PM, Bagas Sanjaya wrote:
> On Mon, Apr 22, 2024 at 01:39:11PM -0700, Tony Nguyen wrote:
>> <snipped>...
>
> For consistency with other code blocks, format above as such:
>
> <snipped>...
>
> Thanks.

Thank You for reporting that. I will verify this issue soon.
On 4/24/24 16:54, Mateusz Polchlopek wrote:
> On 4/23/2024 2:37 PM, Bagas Sanjaya wrote:
>> On Mon, Apr 22, 2024 at 01:39:11PM -0700, Tony Nguyen wrote:
>>> <snipped>...
>>
>> For consistency with other code blocks, format above as such:
>>
>> <snipped>...
>>
>> Thanks.
>
> Thank You for reporting that. I will verify this issue soon.

OK, thanks!
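A quick way to confirm that the proposed ``.. code-block::`` directives render as intended is to rebuild only the networking documentation with Sphinx. This is just a sketch of one possible check, assuming a kernel tree with the Sphinx toolchain set up as described in Documentation/doc-guide/sphinx.rst; SPHINXDIRS restricts the build to the networking docs:

  # Build only Documentation/networking/ and then inspect the generated
  # devlink/ice.html under the Sphinx output directory (Documentation/output
  # by default).
  $ make SPHINXDIRS="networking" htmldocs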
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
index 7f30ebd5debb..830c04354222 100644
--- a/Documentation/networking/devlink/ice.rst
+++ b/Documentation/networking/devlink/ice.rst
@@ -21,6 +21,53 @@ Parameters
    * - ``enable_iwarp``
      - runtime
      - mutually exclusive with ``enable_roce``
+   * - ``tx_scheduling_layers``
+     - permanent
+     - The ice hardware uses hierarchical scheduling for Tx with a fixed
+       number of layers in the scheduling tree. Each of them are decision
+       points. Root node represents a port, while all the leaves represent
+       the queues. This way of configuring the Tx scheduler allows features
+       like DCB or devlink-rate (documented below) to configure how much
+       bandwidth is given to any given queue or group of queues, enabling
+       fine-grained control because scheduling parameters can be configured
+       at any given layer of the tree.
+
+       The default 9-layer tree topology was deemed best for most workloads,
+       as it gives an optimal ratio of performance to configurability. However,
+       for some specific cases, this 9-layer topology might not be desired.
+       One example would be sending traffic to queues that are not a multiple
+       of 8. Because the maximum radix is limited to 8 in 9-layer topology,
+       the 9th queue has a different parent than the rest, and it's given
+       more bandwidth credits. This causes a problem when the system is
+       sending traffic to 9 queues:
+
+       | tx_queue_0_packets: 24163396
+       | tx_queue_1_packets: 24164623
+       | tx_queue_2_packets: 24163188
+       | tx_queue_3_packets: 24163701
+       | tx_queue_4_packets: 24163683
+       | tx_queue_5_packets: 24164668
+       | tx_queue_6_packets: 23327200
+       | tx_queue_7_packets: 24163853
+       | tx_queue_8_packets: 91101417 < Too much traffic is sent from 9th
+
+       To address this need, you can switch to a 5-layer topology, which
+       changes the maximum topology radix to 512. With this enhancement,
+       the performance characteristic is equal as all queues can be assigned
+       to the same parent in the tree. The obvious drawback of this solution
+       is a lower configuration depth of the tree.
+
+       Use the ``tx_scheduling_layer`` parameter with the devlink command
+       to change the transmit scheduler topology. To use 5-layer topology,
+       use a value of 5. For example:
+       $ devlink dev param set pci/0000:16:00.0 name tx_scheduling_layers
+       value 5 cmode permanent
+       Use a value of 9 to set it back to the default value.
+
+       You must do PCI slot powercycle for the selected topology to take effect.
+
+       To verify that value has been set:
+       $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers
 
 Info versions
 =============
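For reference, the end-to-end switch to the 5-layer topology that the documentation above describes could look roughly like the sequence below. This is only a sketch, not part of the patch: the PCI address 0000:16:00.0 is taken from the documentation's own example, while the slot number used for the power cycle ("5" here) is purely illustrative and has to be looked up under /sys/bus/pci/slots/ on the target system.

  # Request the 5-layer Tx scheduler topology; the value is stored persistently.
  $ devlink dev param set pci/0000:16:00.0 name tx_scheduling_layers value 5 cmode permanent

  # Power cycle the PCI slot so the device loads the new topology.
  # Slot "5" is an assumption - check /sys/bus/pci/slots/ for the real slot.
  $ echo 0 > /sys/bus/pci/slots/5/power
  $ echo 1 > /sys/bus/pci/slots/5/power

  # Verify that the value has been set.
  $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers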