[1/3] spi: spi-geni-qcom: Avoid clock setting if not needed
diff mbox series

Message ID 20200701174506.1.Icfdcee14649fc0a6c38e87477b28523d4e60bab3@changeid
State New
Headers show
Series
  • spi: spi-geni-qcom: Avoid a bunch of per-transfer overhead
Related show

Commit Message

Doug Anderson July 2, 2020, 12:45 a.m. UTC
Every SPI transfer could have a different clock rate.  The
spi-geni-qcom controller code to deal with this was never very well
optimized and has always had a lot of code plus some calls into the
clk framework which, at the very least, would grab a mutex.  However,
until recently, the overhead wasn't _too_ much.  That changed with
commit 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect support")
we're now calling geni_icc_set_bw(), which leads to a bunch of math
plus:
  geni_icc_set_bw()
    icc_set_bw()
      apply_constraints()
        qcom_icc_set()
          qcom_icc_bcm_voter_commit()
            rpmh_invalidate()
            rpmh_write_batch()
...and those rpmh commands can be a bit beefy if you call them too
often.

We already know what speed we were running at before, so if we see
that nothing has changed let's avoid the whole pile of code.

On my hardware, this made spi_geni_prepare_message() drop down from
~145 us down to ~14 us.

NOTE: Potentially it might also make sense to add some code into the
interconnect framework to avoid executing so much code when bandwidth
isn't changing, but even if we did that we still want to short circuit
here to save the extra math / clock calls.

Fixes: 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect support")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/spi/spi-geni-qcom.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

Akash Asthana July 7, 2020, 10:16 a.m. UTC | #1
On 7/2/2020 6:15 AM, Douglas Anderson wrote:
> Every SPI transfer could have a different clock rate.  The
> spi-geni-qcom controller code to deal with this was never very well
> optimized and has always had a lot of code plus some calls into the
> clk framework which, at the very least, would grab a mutex.  However,
> until recently, the overhead wasn't _too_ much.  That changed with
> commit 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect support")
> we're now calling geni_icc_set_bw(), which leads to a bunch of math
> plus:
>    geni_icc_set_bw()
>      icc_set_bw()
>        apply_constraints()
>          qcom_icc_set()
>            qcom_icc_bcm_voter_commit()
>              rpmh_invalidate()
>              rpmh_write_batch()
> ...and those rpmh commands can be a bit beefy if you call them too
> often.

Reviewed-by: Akash Asthana<akashast@codeaurora.org>
Mark Brown July 7, 2020, 12:08 p.m. UTC | #2
On Wed, Jul 01, 2020 at 05:45:07PM -0700, Douglas Anderson wrote:
> Every SPI transfer could have a different clock rate.  The
> spi-geni-qcom controller code to deal with this was never very well
> optimized and has always had a lot of code plus some calls into the

This doesn't apply against current code, please check and resend.
Doug Anderson July 7, 2020, 12:53 p.m. UTC | #3
Hi Mark,

On Tue, Jul 7, 2020 at 5:08 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, Jul 01, 2020 at 05:45:07PM -0700, Douglas Anderson wrote:
> > Every SPI transfer could have a different clock rate.  The
> > spi-geni-qcom controller code to deal with this was never very well
> > optimized and has always had a lot of code plus some calls into the
>
> This doesn't apply against current code, please check and resend.

As mentioned in the cover letter, I posted this series against the
Qualcomm tree.  The commit that it is fixing landed there with your
Ack so I was hoping this series could land in the Qualcomm tree with
your Ack as well.  Would that be OK?

-Doug
Mark Brown July 8, 2020, 10:01 a.m. UTC | #4
On Tue, Jul 07, 2020 at 05:53:01AM -0700, Doug Anderson wrote:
> On Tue, Jul 7, 2020 at 5:08 AM Mark Brown <broonie@kernel.org> wrote:

> > This doesn't apply against current code, please check and resend.

> As mentioned in the cover letter, I posted this series against the
> Qualcomm tree.  The commit that it is fixing landed there with your
> Ack so I was hoping this series could land in the Qualcomm tree with
> your Ack as well.  Would that be OK?

So I didn't see this until after the patch I applied was queued...  it's
looking like it would be good to have a cross-tree merge with the
Qualcomm tree if there's stuff like this - is this on a branch which
makes that practical?  Otherwise I guess...
Mark Brown July 8, 2020, 12:48 p.m. UTC | #5
On Wed, Jul 01, 2020 at 05:45:07PM -0700, Douglas Anderson wrote:
> Every SPI transfer could have a different clock rate.  The
> spi-geni-qcom controller code to deal with this was never very well
> optimized and has always had a lot of code plus some calls into the

Acked-by: Mark Brown <broonie@kernel.org>
Doug Anderson July 8, 2020, 3:22 p.m. UTC | #6
Hi,

On Wed, Jul 8, 2020 at 3:01 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Tue, Jul 07, 2020 at 05:53:01AM -0700, Doug Anderson wrote:
> > On Tue, Jul 7, 2020 at 5:08 AM Mark Brown <broonie@kernel.org> wrote:
>
> > > This doesn't apply against current code, please check and resend.
>
> > As mentioned in the cover letter, I posted this series against the
> > Qualcomm tree.  The commit that it is fixing landed there with your
> > Ack so I was hoping this series could land in the Qualcomm tree with
> > your Ack as well.  Would that be OK?
>
> So I didn't see this until after the patch I applied was queued...  it's
> looking like it would be good to have a cross-tree merge with the
> Qualcomm tree if there's stuff like this - is this on a branch which
> makes that practical?  Otherwise I guess...

It's not too bad.  Of the 5 patches I've sent out (3 for geni SPI, 2
for quad SPI) you've landed just one.  Here's the summary:

a) geni SPI 1/3 (Avoid clock setting): Has your Ack.
b) geni SPI 2/3 (autosuspend delay): Landed in SPI tree
c) geni SPI 3/3 (overhead in prepare_message): Has your Ack.

d) quad SPI 1/2 (Avoid clock setting): Needs your Ack.
e) quad SPI 2/2 (autosuspend delay): Needs your Ack.

Since b) has already landed in your tree, let's just leave it there.
There'll be a bit of a performance hit in the Qualcomm tree, but it'll
still be usable.

Since the rest haven't landed, it would be nice to just land them in
the Qualcomm tree.


I think there's still more work to make the Geni SPI driver more
optimized, but I don't think it'll be as urgent as those patches and I
feel like any more major work could wait a cycle.


-Doug
Mark Brown July 8, 2020, 5:02 p.m. UTC | #7
On Wed, Jul 08, 2020 at 08:22:05AM -0700, Doug Anderson wrote:

> Since the rest haven't landed, it would be nice to just land them in
> the Qualcomm tree.

I guess.

> I think there's still more work to make the Geni SPI driver more
> optimized, but I don't think it'll be as urgent as those patches and I
> feel like any more major work could wait a cycle.

It feels like there's more than what's already landed in flight at the
minute, though some of that might just be the multiple rounds of reviews
there were for the bandwidth stuff.

Patch
diff mbox series

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index e01c782ef7d0..bb4cdda2dec8 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -201,6 +201,9 @@  static int geni_spi_set_clock_and_bw(struct spi_geni_master *mas,
 	struct geni_se *se = &mas->se;
 	int ret;
 
+	if (clk_hz == mas->cur_speed_hz)
+		return 0;
+
 	ret = get_spi_clk_cfg(clk_hz, mas, &idx, &div);
 	if (ret) {
 		dev_err(mas->dev, "Err setting clk to %lu: %d\n", clk_hz, ret);
@@ -339,11 +342,9 @@  static void setup_fifo_xfer(struct spi_transfer *xfer,
 	}
 
 	/* Speed and bits per word can be overridden per transfer */
-	if (xfer->speed_hz != mas->cur_speed_hz) {
-		ret = geni_spi_set_clock_and_bw(mas, xfer->speed_hz);
-		if (ret)
-			return;
-	}
+	ret = geni_spi_set_clock_and_bw(mas, xfer->speed_hz);
+	if (ret)
+		return;
 
 	mas->tx_rem_bytes = 0;
 	mas->rx_rem_bytes = 0;