Message ID | 20200701174506.1.Icfdcee14649fc0a6c38e87477b28523d4e60bab3@changeid (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | spi: spi-geni-qcom: Avoid a bunch of per-transfer overhead | expand |
On 7/2/2020 6:15 AM, Douglas Anderson wrote: > Every SPI transfer could have a different clock rate. The > spi-geni-qcom controller code to deal with this was never very well > optimized and has always had a lot of code plus some calls into the > clk framework which, at the very least, would grab a mutex. However, > until recently, the overhead wasn't _too_ much. That changed with > commit 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect support") > we're now calling geni_icc_set_bw(), which leads to a bunch of math > plus: > geni_icc_set_bw() > icc_set_bw() > apply_constraints() > qcom_icc_set() > qcom_icc_bcm_voter_commit() > rpmh_invalidate() > rpmh_write_batch() > ...and those rpmh commands can be a bit beefy if you call them too > often. Reviewed-by: Akash Asthana<akashast@codeaurora.org>
On Wed, Jul 01, 2020 at 05:45:07PM -0700, Douglas Anderson wrote: > Every SPI transfer could have a different clock rate. The > spi-geni-qcom controller code to deal with this was never very well > optimized and has always had a lot of code plus some calls into the This doesn't apply against current code, please check and resend.
Hi Mark, On Tue, Jul 7, 2020 at 5:08 AM Mark Brown <broonie@kernel.org> wrote: > > On Wed, Jul 01, 2020 at 05:45:07PM -0700, Douglas Anderson wrote: > > Every SPI transfer could have a different clock rate. The > > spi-geni-qcom controller code to deal with this was never very well > > optimized and has always had a lot of code plus some calls into the > > This doesn't apply against current code, please check and resend. As mentioned in the cover letter, I posted this series against the Qualcomm tree. The commit that it is fixing landed there with your Ack so I was hoping this series could land in the Qualcomm tree with your Ack as well. Would that be OK? -Doug
On Tue, Jul 07, 2020 at 05:53:01AM -0700, Doug Anderson wrote: > On Tue, Jul 7, 2020 at 5:08 AM Mark Brown <broonie@kernel.org> wrote: > > This doesn't apply against current code, please check and resend. > As mentioned in the cover letter, I posted this series against the > Qualcomm tree. The commit that it is fixing landed there with your > Ack so I was hoping this series could land in the Qualcomm tree with > your Ack as well. Would that be OK? So I didn't see this until after the patch I applied was queued... it's looking like it would be good to have a cross-tree merge with the Qualcomm tree if there's stuff like this - is this on a branch which makes that practical? Otherwise I guess...
On Wed, Jul 01, 2020 at 05:45:07PM -0700, Douglas Anderson wrote: > Every SPI transfer could have a different clock rate. The > spi-geni-qcom controller code to deal with this was never very well > optimized and has always had a lot of code plus some calls into the Acked-by: Mark Brown <broonie@kernel.org>
Hi, On Wed, Jul 8, 2020 at 3:01 AM Mark Brown <broonie@kernel.org> wrote: > > On Tue, Jul 07, 2020 at 05:53:01AM -0700, Doug Anderson wrote: > > On Tue, Jul 7, 2020 at 5:08 AM Mark Brown <broonie@kernel.org> wrote: > > > > This doesn't apply against current code, please check and resend. > > > As mentioned in the cover letter, I posted this series against the > > Qualcomm tree. The commit that it is fixing landed there with your > > Ack so I was hoping this series could land in the Qualcomm tree with > > your Ack as well. Would that be OK? > > So I didn't see this until after the patch I applied was queued... it's > looking like it would be good to have a cross-tree merge with the > Qualcomm tree if there's stuff like this - is this on a branch which > makes that practical? Otherwise I guess... It's not too bad. Of the 5 patches I've sent out (3 for geni SPI, 2 for quad SPI) you've landed just one. Here's the summary: a) geni SPI 1/3 (Avoid clock setting): Has your Ack. b) geni SPI 2/3 (autosuspend delay): Landed in SPI tree c) geni SPI 3/3 (overhead in prepare_message): Has your Ack. d) quad SPI 1/2 (Avoid clock setting): Needs your Ack. e) quad SPI 2/2 (autosuspend delay): Needs your Ack. Since b) has already landed in your tree, let's just leave it there. There'll be a bit of a performance hit in the Qualcomm tree, but it'll still be usable. Since the rest haven't landed, it would be nice to just land them in the Qualcomm tree. I think there's still more work to make the Geni SPI driver more optimized, but I don't think it'll be as urgent as those patches and I feel like any more major work could wait a cycle. -Doug
On Wed, Jul 08, 2020 at 08:22:05AM -0700, Doug Anderson wrote: > Since the rest haven't landed, it would be nice to just land them in > the Qualcomm tree. I guess. > I think there's still more work to make the Geni SPI driver more > optimized, but I don't think it'll be as urgent as those patches and I > feel like any more major work could wait a cycle. It feels like there's more than what's already landed in flight at the minute, though some of that might just be the multiple rounds of reviews there were for the bandwidth stuff.
diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c index e01c782ef7d0..bb4cdda2dec8 100644 --- a/drivers/spi/spi-geni-qcom.c +++ b/drivers/spi/spi-geni-qcom.c @@ -201,6 +201,9 @@ static int geni_spi_set_clock_and_bw(struct spi_geni_master *mas, struct geni_se *se = &mas->se; int ret; + if (clk_hz == mas->cur_speed_hz) + return 0; + ret = get_spi_clk_cfg(clk_hz, mas, &idx, &div); if (ret) { dev_err(mas->dev, "Err setting clk to %lu: %d\n", clk_hz, ret); @@ -339,11 +342,9 @@ static void setup_fifo_xfer(struct spi_transfer *xfer, } /* Speed and bits per word can be overridden per transfer */ - if (xfer->speed_hz != mas->cur_speed_hz) { - ret = geni_spi_set_clock_and_bw(mas, xfer->speed_hz); - if (ret) - return; - } + ret = geni_spi_set_clock_and_bw(mas, xfer->speed_hz); + if (ret) + return; mas->tx_rem_bytes = 0; mas->rx_rem_bytes = 0;
Every SPI transfer could have a different clock rate. The spi-geni-qcom controller code to deal with this was never very well optimized and has always had a lot of code plus some calls into the clk framework which, at the very least, would grab a mutex. However, until recently, the overhead wasn't _too_ much. That changed with commit 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect support") we're now calling geni_icc_set_bw(), which leads to a bunch of math plus: geni_icc_set_bw() icc_set_bw() apply_constraints() qcom_icc_set() qcom_icc_bcm_voter_commit() rpmh_invalidate() rpmh_write_batch() ...and those rpmh commands can be a bit beefy if you call them too often. We already know what speed we were running at before, so if we see that nothing has changed let's avoid the whole pile of code. On my hardware, this made spi_geni_prepare_message() drop down from ~145 us down to ~14 us. NOTE: Potentially it might also make sense to add some code into the interconnect framework to avoid executing so much code when bandwidth isn't changing, but even if we did that we still want to short circuit here to save the extra math / clock calls. Fixes: 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect support") Signed-off-by: Douglas Anderson <dianders@chromium.org> --- drivers/spi/spi-geni-qcom.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)