Message ID | 20240112-opp_support-v6-5-77bbf7d0cc37@quicinc.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Manivannan Sadhasivam |
Series | PCI: qcom: Add support for OPP |
On Fri, Jan 12, 2024 at 07:52:04PM +0530, Krishna chaitanya chundru wrote:
> PCIe needs to choose the appropriate performance state of RPMH power
> domain and interconnect bandwidth based on the PCIe gen speed.
>
> Add the OPP table support to specify RPMH performance states and
> interconnect peak bandwidth.
>
> Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
> ---
>  arch/arm64/boot/dts/qcom/sm8450.dtsi | 74 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 74 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> index 6b1d2e0d9d14..eab85ecaeff0 100644
> --- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> @@ -1827,7 +1827,32 @@ pcie0: pcie@1c00000 {
>  			pinctrl-names = "default";
>  			pinctrl-0 = <&pcie0_default_state>;
>
> +			operating-points-v2 = <&pcie0_opp_table>;
> +
>  			status = "disabled";
> +
> +			pcie0_opp_table: opp-table {
> +				compatible = "operating-points-v2";
> +
> +				opp-2500000 {
> +					opp-hz = /bits/ 64 <2500000>;
> +					required-opps = <&rpmhpd_opp_low_svs>;
> +					opp-peak-kBps = <250000 250000>;

This is a question for Viresh: We already have macros in the driver to derive
the bandwidth based on link speed. So if OPP core exposes a callback to allow
the consumers to set the bw on its own, we can get rid of this entry.

Similar to config_clks()/config_regulators(). Is that feasible?

- Mani
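For reference, the kind of derivation Mani mentions can be sketched as below. This is only an illustration, not the driver's actual macros: the function names are hypothetical, and the numbers come from the standard PCIe per-lane rates (2.5, 5, 8 and 16 GT/s) and line encodings (8b/10b for Gen 1/2, 128b/130b for Gen 3/4). The exact Gen 3 figure (984615 kB/s per lane) differs slightly from the rounded 984500 that the patch uses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the qcom driver): payload bandwidth
 * of a single PCIe lane in kB/s, derived from the raw transfer rate and
 * the line encoding of each generation. Returns 0 for unknown generations.
 */
uint64_t pcie_lane_peak_kbps(unsigned int gen)
{
	uint64_t mts;      /* raw rate in MT/s */
	uint64_t num, den; /* encoding efficiency as a fraction */

	switch (gen) {
	case 1: mts = 2500;  num = 8;   den = 10;  break; /* 8b/10b */
	case 2: mts = 5000;  num = 8;   den = 10;  break; /* 8b/10b */
	case 3: mts = 8000;  num = 128; den = 130; break; /* 128b/130b */
	case 4: mts = 16000; num = 128; den = 130; break; /* 128b/130b */
	default:
		return 0;
	}

	/* MT/s -> kbit/s is x1000, bits -> bytes is /8, hence the factor 125. */
	return mts * 125 * num / den;
}

/* Hypothetical helper: bandwidth of a whole link with 'width' lanes. */
uint64_t pcie_link_peak_kbps(unsigned int gen, unsigned int width)
{
	assert(width > 0);
	return pcie_lane_peak_kbps(gen) * width;
}
```

With a helper of this shape the driver could compute the value for any gen/width pair, which is why the DT entry looks redundant from the driver's point of view.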
On 29-01-24, 21:34, Manivannan Sadhasivam wrote:
> On Fri, Jan 12, 2024 at 07:52:04PM +0530, Krishna chaitanya chundru wrote:
> > PCIe needs to choose the appropriate performance state of RPMH power
> > domain and interconnect bandwidth based on the PCIe gen speed.
> >
> > Add the OPP table support to specify RPMH performance states and
> > interconnect peak bandwidth.
> >
> > Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
> > ---
> >  arch/arm64/boot/dts/qcom/sm8450.dtsi | 74 ++++++++++++++++++++++++++++++++++++
> >  1 file changed, 74 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > index 6b1d2e0d9d14..eab85ecaeff0 100644
> > --- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > @@ -1827,7 +1827,32 @@ pcie0: pcie@1c00000 {
> >  			pinctrl-names = "default";
> >  			pinctrl-0 = <&pcie0_default_state>;
> >
> > +			operating-points-v2 = <&pcie0_opp_table>;
> > +
> >  			status = "disabled";
> > +
> > +			pcie0_opp_table: opp-table {
> > +				compatible = "operating-points-v2";
> > +
> > +				opp-2500000 {
> > +					opp-hz = /bits/ 64 <2500000>;
> > +					required-opps = <&rpmhpd_opp_low_svs>;
> > +					opp-peak-kBps = <250000 250000>;
>
> This is a question for Viresh: We already have macros in the driver to derive
> the bandwidth based on link speed. So if OPP core exposes a callback to allow
> the consumers to set the bw on its own, we can get rid of this entry.
>
> Similar to config_clks()/config_regulators(). Is that feasible?

I don't have any issues with a new callback for bw. But, AFAIU, the DT
is required to represent the hardware irrespective of what any OS
would do with it. So DT should ideally have these values here, right ?

Also, the driver has already moved away from using those macros now
and depend on the OPP core to do the right thing. It only uses the
macro for the cases where the DT OPP table isn't available. And as
said by few others as well already, the driver really should try to
add OPPs dynamically in that case to avoid multiple code paths and
stick to a single OPP based solution.
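The fallback Viresh describes (populating OPPs at runtime when the DT table is absent, so that only one OPP-based code path remains) might look roughly like the user-space model below. It deliberately avoids kernel APIs: `model_opp_add()` and `model_populate_fallback()` are hypothetical stand-ins for whatever dynamic-OPP helper a driver would actually call, and the three entries simply mirror the pcie0 values from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_OPPS 8

/* One entry of the modelled table (names are illustrative only). */
struct model_opp {
	uint64_t freq;      /* table key, same units as opp-hz in the patch */
	uint32_t peak_kbps; /* first opp-peak-kBps cell */
};

struct model_opp_table {
	struct model_opp opps[MODEL_MAX_OPPS];
	size_t count;
};

/*
 * Insert an entry, keeping the table sorted by ascending key.
 * Returns 0 on success, -1 if the table is full or the key already exists.
 */
int model_opp_add(struct model_opp_table *t, uint64_t freq, uint32_t peak_kbps)
{
	size_t pos, i;

	assert(t != NULL);
	if (t->count >= MODEL_MAX_OPPS)
		return -1;

	/* Find the first slot whose key is not smaller than the new one. */
	for (pos = 0; pos < t->count && t->opps[pos].freq < freq; pos++)
		;
	if (pos < t->count && t->opps[pos].freq == freq)
		return -1;

	/* Shift the tail up by one and place the new entry. */
	for (i = t->count; i > pos; i--)
		t->opps[i] = t->opps[i - 1];
	t->opps[pos].freq = freq;
	t->opps[pos].peak_kbps = peak_kbps;
	t->count++;
	return 0;
}

/* Fallback used only when no DT table exists: fill in driver defaults. */
int model_populate_fallback(struct model_opp_table *t)
{
	static const struct model_opp defaults[] = {
		{ 8000000, 984500 },
		{ 2500000, 250000 },
		{ 5000000, 500000 },
	};
	size_t i;

	for (i = 0; i < sizeof(defaults) / sizeof(defaults[0]); i++)
		if (model_opp_add(t, defaults[i].freq, defaults[i].peak_kbps))
			return -1;
	return 0;
}
```

Once the table is populated either way, the rest of the driver only ever looks up entries, which is the single code path being argued for.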
On Tue, Jan 30, 2024 at 11:41:11AM +0530, Viresh Kumar wrote:
> On 29-01-24, 21:34, Manivannan Sadhasivam wrote:
> > On Fri, Jan 12, 2024 at 07:52:04PM +0530, Krishna chaitanya chundru wrote:
> > > PCIe needs to choose the appropriate performance state of RPMH power
> > > domain and interconnect bandwidth based on the PCIe gen speed.
> > >
> > > Add the OPP table support to specify RPMH performance states and
> > > interconnect peak bandwidth.
> > >
> > > Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
> > > ---
> > >  arch/arm64/boot/dts/qcom/sm8450.dtsi | 74 ++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 74 insertions(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > > index 6b1d2e0d9d14..eab85ecaeff0 100644
> > > --- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > > @@ -1827,7 +1827,32 @@ pcie0: pcie@1c00000 {
> > >  			pinctrl-names = "default";
> > >  			pinctrl-0 = <&pcie0_default_state>;
> > >
> > > +			operating-points-v2 = <&pcie0_opp_table>;
> > > +
> > >  			status = "disabled";
> > > +
> > > +			pcie0_opp_table: opp-table {
> > > +				compatible = "operating-points-v2";
> > > +
> > > +				opp-2500000 {
> > > +					opp-hz = /bits/ 64 <2500000>;
> > > +					required-opps = <&rpmhpd_opp_low_svs>;
> > > +					opp-peak-kBps = <250000 250000>;
> >
> > This is a question for Viresh: We already have macros in the driver to derive
> > the bandwidth based on link speed. So if OPP core exposes a callback to allow
> > the consumers to set the bw on its own, we can get rid of this entry.
> >
> > Similar to config_clks()/config_regulators(). Is that feasible?
>
> I don't have any issues with a new callback for bw. But, AFAIU, the DT
> is required to represent the hardware irrespective of what any OS
> would do with it. So DT should ideally have these values here, right ?
>

Not necessarily. Because, right now the bandwidth values of the all peripherals
are encoded within the drivers. Only OPP has the requirement to define the
values in DT.

> Also, the driver has already moved away from using those macros now
> and depend on the OPP core to do the right thing. It only uses the
> macro for the cases where the DT OPP table isn't available. And as
> said by few others as well already, the driver really should try to
> add OPPs dynamically in that case to avoid multiple code paths and
> stick to a single OPP based solution.
>

Still I prefer to use OPP for bandwidth control because both the voltage and
bandwidth values need to be updated at the same time. My only point here is, if
OPP exposes a callback for bw, then we can keep the DT behavior consistent.

- Mani

> --
> viresh
On 30-01-24, 12:44, Manivannan Sadhasivam wrote:
> On Tue, Jan 30, 2024 at 11:41:11AM +0530, Viresh Kumar wrote:
> > I don't have any issues with a new callback for bw. But, AFAIU, the DT
> > is required to represent the hardware irrespective of what any OS
> > would do with it. So DT should ideally have these values here, right ?
> >
>
> Not necessarily. Because, right now the bandwidth values of the all peripherals
> are encoded within the drivers. Only OPP has the requirement to define the
> values in DT.

I have a bit different argument here. I am saying that it doesn't
matter if we have OPP framework or something else using these values.
The hardware must be represented properly by the DT, so Linux or any
other firmware/OS can program the device. So DT should have bandwidth
values anyway. And that's the way we have designed things in Linux
now.

> > Also, the driver has already moved away from using those macros now
> > and depend on the OPP core to do the right thing. It only uses the
> > macro for the cases where the DT OPP table isn't available. And as
> > said by few others as well already, the driver really should try to
> > add OPPs dynamically in that case to avoid multiple code paths and
> > stick to a single OPP based solution.
> >
>
> Still I prefer to use OPP for bandwidth control because both the voltage and
> bandwidth values need to be updated at the same time. My only point here is, if
> OPP exposes a callback for bw, then we can keep the DT behavior consistent.

Feels like we are going a bit backward on this. The current view, as
per me, is that driver shouldn't need to micromanage all these
configurations and the OPP core should be able to handle them. That's
why we want to handle all configurations from there.

This also means that the DT needs to contain all this information and
drivers shouldn't use special math functions to calculate these
values. Drivers need to move away from them, instead of getting more
of those.

I don't see how a callback would be helpful here, if the driver relies
on DT values only. Or am I confusing things here ??
On Tue, Jan 30, 2024 at 02:06:19PM +0530, Viresh Kumar wrote:
> On 30-01-24, 12:44, Manivannan Sadhasivam wrote:
> > On Tue, Jan 30, 2024 at 11:41:11AM +0530, Viresh Kumar wrote:
> > > I don't have any issues with a new callback for bw. But, AFAIU, the DT
> > > is required to represent the hardware irrespective of what any OS
> > > would do with it. So DT should ideally have these values here, right ?
> > >
> >
> > Not necessarily. Because, right now the bandwidth values of the all peripherals
> > are encoded within the drivers. Only OPP has the requirement to define the
> > values in DT.
>
> I have a bit different argument here. I am saying that it doesn't
> matter if we have OPP framework or something else using these values.
> The hardware must be represented properly by the DT, so Linux or any
> other firmware/OS can program the device. So DT should have bandwidth
> values anyway. And that's the way we have designed things in Linux
> now.
>

So you are saying that the ICC core itself should get the bw values from DT
instead of hardcoding in the driver? If so, I'd like to get the opinion from
Georgi/Bjorn.

> > > Also, the driver has already moved away from using those macros now
> > > and depend on the OPP core to do the right thing. It only uses the
> > > macro for the cases where the DT OPP table isn't available. And as
> > > said by few others as well already, the driver really should try to
> > > add OPPs dynamically in that case to avoid multiple code paths and
> > > stick to a single OPP based solution.
> > >
> >
> > Still I prefer to use OPP for bandwidth control because both the voltage and
> > bandwidth values need to be updated at the same time. My only point here is, if
> > OPP exposes a callback for bw, then we can keep the DT behavior consistent.
>
> Feels like we are going a bit backward on this. The current view, as
> per me, is that driver shouldn't need to micromanage all these
> configurations and the OPP core should be able to handle them. That's
> why we want to handle all configurations from there.
>
> This also means that the DT needs to contain all this information and
> drivers shouldn't use special math functions to calculate these
> values. Drivers need to move away from them, instead of getting more
> of those.
>
> I don't see how a callback would be helpful here, if the driver relies
> on DT values only. Or am I confusing things here ??
>

No, there is no confusion here, but a difference in perspective. Let's get the
thoughts of Georgi/Bjorn on this. I just want to avoid the confusion in DT since
some peripherals with OPP support will have bw defined in DT, while rest of the
peripherals will have them in drivers.

- Mani

> --
> viresh
On 30-01-24, 15:18, Manivannan Sadhasivam wrote:
> So you are saying that the ICC core itself should get the bw values from DT
> instead of hardcoding in the driver? If so, I'd like to get the opinion from
> Georgi/Bjorn.

Not really. The drivers or the ICC core doesn't need to do anything I
guess. Since the values are coming via the OPP, we must just use it to
hide all these details.

Why is the ICC core required to get into this here ? ICC core should
be ready to get the information from DT (may or may not via the OPP
core), or from driver.
On Tue, Jan 30, 2024 at 03:25:08PM +0530, Viresh Kumar wrote:
> On 30-01-24, 15:18, Manivannan Sadhasivam wrote:
> > So you are saying that the ICC core itself should get the bw values from DT
> > instead of hardcoding in the driver? If so, I'd like to get the opinion from
> > Georgi/Bjorn.
>
> Not really. The drivers or the ICC core doesn't need to do anything I
> guess. Since the values are coming via the OPP, we must just use it to
> hide all these details.
>
> Why is the ICC core required to get into this here ? ICC core should
> be ready to get the information from DT (may or may not via the OPP
> core), or from driver.
>

Agree. But what I'm saying is, right now there is no DT property in the
interconnect consumer nodes to specify the bw requirements. This is all
hardcoded in the respective ICC consumer drivers.

But when we use OPP to control bw, the bw requirements come from DT. This is
what I see as a difference. Because, only nodes making use of OPP will specify
bw in DT and other nodes making use of just ICC will not.

Maybe I'm worrying too much about these details... But it looks like
inconsistency to me.

- Mani
On 30-01-24, 18:46, Manivannan Sadhasivam wrote:
> Agree. But what I'm saying is, right now there is no DT property in the
> interconnect consumer nodes to specify the bw requirements. This is all
> hardcoded in the respective ICC consumer drivers.

I thought there are a lot of users already in there..

$ git grep -i opp.*bps arch/arm64/boot/dts/ | wc -l
864

> But when we use OPP to control bw, the bw requirements come from DT. This is
> what I see as a difference. Because, only nodes making use of OPP will specify
> bw in DT and other nodes making use of just ICC will not.
>
> Maybe I'm worrying too much about these details... But it looks like
> inconsistency to me.

Right. So is there inconsistency right now ? Yes, there is.

The important question we need to answer is where do we want to see
all these drivers (specially new ones) in the future. What's the right
thing to do eventually ? Hardcode stuff ? Or Move it to DT ?

The answer is DT for me, so the code can be generic enough to be
reused. This is just one step in the right direction I guess.
Eventually the drivers must get simplified, which they are I guess.
On Wed, Jan 31, 2024 at 10:53:35AM +0530, Viresh Kumar wrote:
> On 30-01-24, 18:46, Manivannan Sadhasivam wrote:
> > Agree. But what I'm saying is, right now there is no DT property in the
> > interconnect consumer nodes to specify the bw requirements. This is all
> > hardcoded in the respective ICC consumer drivers.
>
> I thought there are a lot of users already in there..
>
> $ git grep -i opp.*bps arch/arm64/boot/dts/ | wc -l
> 864

Most of the hits are from CPU nodes... For some reasons, peripheral drivers are
sticking to hardcoded values.

>
> > But when we use OPP to control bw, the bw requirements come from DT. This is
> > what I see as a difference. Because, only nodes making use of OPP will specify
> > bw in DT and other nodes making use of just ICC will not.
> >
> > Maybe I'm worrying too much about these details... But it looks like
> > inconsistency to me.
>
> Right. So is there inconsistency right now ? Yes, there is.
>
> The important question we need to answer is where do we want to see
> all these drivers (specially new ones) in the future. What's the right
> thing to do eventually ? Hardcode stuff ? Or Move it to DT ?
>
> The answer is DT for me, so the code can be generic enough to be
> reused. This is just one step in the right direction I guess.
> Eventually the drivers must get simplified, which they are I guess.
>

I completely agree that hardcoding the bw values is not the right thing, but was
worried about the inconsistency. But anyway, I hope either ICC will also move
towards DT for bw or we will convert all the drivers to use OPP in the future.

Thanks for the discussion so far! It clarified.

- Mani
On 31-01-24, 14:16, Manivannan Sadhasivam wrote:
> Most of the hits are from CPU nodes... For some reasons, peripheral drivers are
> sticking to hardcoded values.

I guess the reason for this is that the OPP core wasn't used for
non-CPU devices until recently. And we are in a transition phase where
few of the drivers will migrate to using it and so will have DT based
bw values.
On 31.01.2024 06:23, Viresh Kumar wrote:
> On 30-01-24, 18:46, Manivannan Sadhasivam wrote:
>> Agree. But what I'm saying is, right now there is no DT property in the
>> interconnect consumer nodes to specify the bw requirements. This is all
>> hardcoded in the respective ICC consumer drivers.
>
> I thought there are a lot of users already in there..
>
> $ git grep -i opp.*bps arch/arm64/boot/dts/ | wc -l
> 864
>
>> But when we use OPP to control bw, the bw requirements come from DT. This is
>> what I see as a difference. Because, only nodes making use of OPP will specify
>> bw in DT and other nodes making use of just ICC will not.
>>
>> Maybe I'm worrying too much about these details... But it looks like
>> inconsistency to me.
>
> Right. So is there inconsistency right now ? Yes, there is.
>
> The important question we need to answer is where do we want to see
> all these drivers (specially new ones) in the future. What's the right
> thing to do eventually ? Hardcode stuff ? Or Move it to DT ?
>
> The answer is DT for me, so the code can be generic enough to be
> reused. This is just one step in the right direction I guess.
> Eventually the drivers must get simplified, which they are I guess.

I'm lukewarm on this.

A *lot* of hardware has more complex requirements than "x MBps at y MHz",
especially when performance counters come into the picture for dynamic
bw management.

OPP tables can't really handle this properly.

Konrad
On 01-02-24, 15:45, Konrad Dybcio wrote:
> I'm lukewarm on this.
>
> A *lot* of hardware has more complex requirements than "x MBps at y MHz",
> especially when performance counters come into the picture for dynamic
> bw management.
>
> OPP tables can't really handle this properly.

There was a similar concern for voltages earlier on and we added the capability
of adjusting the voltage for OPPs in the OPP core. Maybe something similar can
be done here ?
On 2.02.2024 08:33, Viresh Kumar wrote:
> On 01-02-24, 15:45, Konrad Dybcio wrote:
>> I'm lukewarm on this.
>>
>> A *lot* of hardware has more complex requirements than "x MBps at y MHz",
>> especially when performance counters come into the picture for dynamic
>> bw management.
>>
>> OPP tables can't really handle this properly.
>
> There was a similar concern for voltages earlier on and we added the capability
> of adjusting the voltage for OPPs in the OPP core. Maybe something similar can
> be done here ?
>

I really don't think it's fitting.. At any moment the device may require any
bandwidth value between 0 and MAX_BW_PER_LINK_GEN * LINK_WIDTH..

Konrad
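To make the concern concrete: a table only offers discrete steps, so an arbitrary bandwidth request has to be rounded up to the next available entry. The sketch below illustrates that rounding, assuming a hypothetical `bw_find_ceil()` helper; it is not the OPP core's implementation, and the table values in the usage note are the first `opp-peak-kBps` cell of each pcie1 entry in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative entry: table key plus the peak bandwidth it provides. */
struct bw_entry {
	uint64_t freq;      /* same units as opp-hz in the patch */
	uint32_t peak_kbps;
};

/*
 * Hypothetical helper: return the entry with the smallest peak bandwidth
 * that still satisfies 'wanted_kbps', or NULL if the request exceeds
 * every entry. The table does not need to be sorted.
 */
const struct bw_entry *bw_find_ceil(const struct bw_entry *tbl, size_t n,
				    uint32_t wanted_kbps)
{
	const struct bw_entry *best = NULL;
	size_t i;

	assert(tbl != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (tbl[i].peak_kbps < wanted_kbps)
			continue; /* too small for this request */
		if (best == NULL || tbl[i].peak_kbps < best->peak_kbps)
			best = &tbl[i];
	}
	return best;
}
```

With the six pcie1 entries, a request for 600000 kB/s would resolve to the 984500 kB/s entry (key 8000000), while anything above 3938000 kB/s has no match at all.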
On 2/10/2024 2:44 AM, Konrad Dybcio wrote:
> On 2.02.2024 08:33, Viresh Kumar wrote:
>> On 01-02-24, 15:45, Konrad Dybcio wrote:
>>> I'm lukewarm on this.
>>>
>>> A *lot* of hardware has more complex requirements than "x MBps at y MHz",
>>> especially when performance counters come into the picture for dynamic
>>> bw management.
>>>
>>> OPP tables can't really handle this properly.
>>
>> There was a similar concern for voltages earlier on and we added the capability
>> of adjusting the voltage for OPPs in the OPP core. Maybe something similar can
>> be done here ?
>>
> I really don't think it's fitting.. At any moment the device may require any
> bandwidth value between 0 and MAX_BW_PER_LINK_GEN * LINK_WIDTH..
>
> Konrad

Viresh & Konrad, can you both come to a conclusion on this?

- Krishna Chaitanya.
On 09-02-24, 22:14, Konrad Dybcio wrote:
> On 2.02.2024 08:33, Viresh Kumar wrote:
> > On 01-02-24, 15:45, Konrad Dybcio wrote:
> >> I'm lukewarm on this.
> >>
> >> A *lot* of hardware has more complex requirements than "x MBps at y MHz",
> >> especially when performance counters come into the picture for dynamic
> >> bw management.
> >>
> >> OPP tables can't really handle this properly.
> >
> > There was a similar concern for voltages earlier on and we added the capability
> > of adjusting the voltage for OPPs in the OPP core. Maybe something similar can
> > be done here ?
> >
> I really don't think it's fitting.. At any moment the device may require any
> bandwidth value between 0 and MAX_BW_PER_LINK_GEN * LINK_WIDTH..

Okay, I leave it up to you guys to decide on how you want to do it. I still
believe getting the information via DT is the right thing, but maybe I still
don't understand the problem fully.

Thanks.
On Mon, Feb 19, 2024 at 03:58:34PM +0530, Viresh Kumar wrote:
> On 09-02-24, 22:14, Konrad Dybcio wrote:
> > On 2.02.2024 08:33, Viresh Kumar wrote:
> > > On 01-02-24, 15:45, Konrad Dybcio wrote:
> > >> I'm lukewarm on this.
> > >>
> > >> A *lot* of hardware has more complex requirements than "x MBps at y MHz",
> > >> especially when performance counters come into the picture for dynamic
> > >> bw management.
> > >>
> > >> OPP tables can't really handle this properly.
> > >
> > > There was a similar concern for voltages earlier on and we added the capability
> > > of adjusting the voltage for OPPs in the OPP core. Maybe something similar can
> > > be done here ?
> > >
> > I really don't think it's fitting.. At any moment the device may require any
> > bandwidth value between 0 and MAX_BW_PER_LINK_GEN * LINK_WIDTH..
>
> Okay, I leave it up to you guys to decide on how you want to do it. I still
> believe getting the information via DT is the right thing, but maybe I still
> don't understand the problem fully.
>

I argued for a different issue, but what Konrad pointed out is not a valid
concern to me. The driver may only require _fixed_ bandwidth between 0 and
(MAX_BW_PER_LINK_GEN * LINK_WIDTH) and DT can pass those bandwidth values.
Chaitanya pointed out that this may end up with long entries in DT once the
PCIe Gen versions start to increase (current Qcom platforms support up to Gen 4
only). But that shouldn't be a real concern if we look at what DT has to
provide.

- Mani
diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index 6b1d2e0d9d14..eab85ecaeff0 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -1827,7 +1827,32 @@ pcie0: pcie@1c00000 {
 			pinctrl-names = "default";
 			pinctrl-0 = <&pcie0_default_state>;
 
+			operating-points-v2 = <&pcie0_opp_table>;
+
 			status = "disabled";
+
+			pcie0_opp_table: opp-table {
+				compatible = "operating-points-v2";
+
+				opp-2500000 {
+					opp-hz = /bits/ 64 <2500000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <250000 250000>;
+				};
+
+				opp-5000000 {
+					opp-hz = /bits/ 64 <5000000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <500000 250000>;
+				};
+
+				opp-8000000 {
+					opp-hz = /bits/ 64 <8000000>;
+					required-opps = <&rpmhpd_opp_nom>;
+					opp-peak-kBps = <984500 250000>;
+				};
+			};
+
 		};
 
 		pcie0_phy: phy@1c06000 {
@@ -1938,7 +1963,56 @@ pcie1: pcie@1c08000 {
 			pinctrl-names = "default";
 			pinctrl-0 = <&pcie1_default_state>;
 
+			operating-points-v2 = <&pcie1_opp_table>;
+
 			status = "disabled";
+
+			pcie1_opp_table: opp-table {
+				compatible = "operating-points-v2";
+
+				/* GEN 1x1 */
+				opp-2500000 {
+					opp-hz = /bits/ 64 <2500000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <250000 250000>;
+				};
+
+				/* GEN 1x2 GEN 2x1 */
+				opp-5000000 {
+					opp-hz = /bits/ 64 <5000000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <500000 250000>;
+				};
+
+				/* GEN 2x2 */
+				opp-10000000 {
+					opp-hz = /bits/ 64 <10000000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <1000000 250000>;
+				};
+
+				/* GEN 3x1 */
+				opp-8000000 {
+					opp-hz = /bits/ 64 <8000000>;
+					required-opps = <&rpmhpd_opp_nom>;
+					opp-peak-kBps = <984500 250000>;
+				};
+
+				/* GEN 3x2 GEN 4x1 */
+				opp-16000000 {
+					opp-hz = /bits/ 64 <16000000>;
+					required-opps = <&rpmhpd_opp_nom>;
+					opp-peak-kBps = <1969000 250000>;
+				};
+
+				/* GEN 4x2 */
+				opp-32000000 {
+					opp-hz = /bits/ 64 <32000000>;
+					required-opps = <&rpmhpd_opp_nom>;
+					opp-peak-kBps = <3938000 250000>;
+				};
+			};
+
 		};
 
 		pcie1_phy: phy@1c0e000 {
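The comments in the pcie1 table suggest that each entry is keyed by the per-lane rate multiplied by the lane count, which is why two link configurations can share one entry (for example Gen 1 x2 and Gen 2 x1). A small sketch under that assumption follows; the helper name `pcie_opp_key()` is hypothetical and the rates use the same units as the `opp-hz` values in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the table key for a link configuration,
 * assuming key = per-lane rate x lane count as the patch comments imply.
 * Returns 0 for generations outside 1..4.
 */
uint64_t pcie_opp_key(unsigned int gen, unsigned int width)
{
	/* Index 0 is unused so that the generation number indexes directly. */
	static const uint64_t lane_rate[] = {
		0, 2500000, 5000000, 8000000, 16000000,
	};

	assert(width > 0);
	if (gen < 1 || gen > 4)
		return 0;

	return lane_rate[gen] * width;
}
```

Under this scheme Gen 3 x2 and Gen 4 x1 both map to 16000000, matching the shared `opp-16000000` entry, so eight gen/width combinations need only six table entries.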
PCIe needs to choose the appropriate performance state of RPMH power
domain and interconnect bandwidth based on the PCIe gen speed.

Add the OPP table support to specify RPMH performance states and
interconnect peak bandwidth.

Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
---
 arch/arm64/boot/dts/qcom/sm8450.dtsi | 74 ++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)