[RFC,0/4] spi: spi-fsl-spi: try to make cpu-mode transfers faster

Message ID 20190327143040.16013-1-rasmus.villemoes@prevas.dk

Message

Rasmus Villemoes March 27, 2019, 2:30 p.m. UTC
I doubt patches 3 and 4 are acceptable, but I'd still like to get
comments and/or alternative suggestions for making large transfers
faster.

The patches have been tested on an MPC8309 with a Cypress S25FL032P
spi-nor slave, and make various operations between 50% and 73%
faster.

We have not observed any problems, but to completely rule out the
possibility of "glitches on SPI CLK" mentioned in patch 3 would of
course require testing on a much wider set of hardware combinations.

Rasmus Villemoes (4):
  spi: spi-fsl-spi: remove always-true conditional in fsl_spi_do_one_msg
  spi: spi-fsl-spi: relax message sanity checking a little
  spi: spi-fsl-spi: allow changing bits_per_word while CS is still
    active
  spi: spi-fsl-spi: automatically adapt bits-per-word in cpu mode

 drivers/spi/spi-fsl-spi.c | 41 +++++++++++++++++++++++++++------------
 1 file changed, 29 insertions(+), 12 deletions(-)
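
For context, the core idea behind patches 3 and 4 is roughly the
following. This is an illustrative sketch only, not code from the
series: the helper name and the alignment checks are made up, and it
just assumes that an 8-bit byte stream can be widened to 32-bit words
once changing bits_per_word while CS is active (patch 3) is allowed.

#include <linux/types.h>
#include <linux/spi/spi.h>

/*
 * In cpu mode the controller interrupts once per word, so moving 32
 * bits per word instead of 8 cuts the interrupt count to a quarter.
 */
static u32 fsl_spi_pick_bits_per_word(const struct spi_transfer *t)
{
	/* Only widen plain byte streams that are 32-bit sized and aligned. */
	if (t->bits_per_word == 8 && !(t->len % 4) &&
	    !((unsigned long)t->tx_buf % 4) && !((unsigned long)t->rx_buf % 4))
		return 32;

	return t->bits_per_word;
}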

Comments

Mark Brown April 1, 2019, 7:34 a.m. UTC | #1
On Wed, Mar 27, 2019 at 02:30:48PM +0000, Rasmus Villemoes wrote:
> I doubt patches 3 and 4 are acceptable, but I'd still like to get
> comments and/or alternative suggestions for making large transfers
> faster.

I see no problem with this from a framework point of view, FWIW; it's
going to be a question of whether there are any glitches, as you say.  I'm not
sure how we can get wider testing/review unless the patches actually get
merged though...  I'll leave them for a bit longer but unless someone
sees a problem I'll probably go ahead and apply them.
Rasmus Villemoes April 2, 2019, 8:43 a.m. UTC | #2
On 01/04/2019 09.34, Mark Brown wrote:
> On Wed, Mar 27, 2019 at 02:30:48PM +0000, Rasmus Villemoes wrote:
>> I doubt patches 3 and 4 are acceptable, but I'd still like to get
>> comments and/or alternative suggestions for making large transfers
>> faster.
> 
> I see no problem with this from a framework point of view, FWIW; it's
> going to be a question of whether there are any glitches, as you say.  I'm not
> sure how we can get wider testing/review unless the patches actually get
> merged though...  I'll leave them for a bit longer but unless someone
> sees a problem I'll probably go ahead and apply them.
> 

Thanks! There's one other option I can think of: don't do the interrupts
at all, but just busy-wait for the completion of each word transfer (in
a cpu_relax() loop). That could be guarded by something like
1000000*bits_per_word < hz (roughly, the word transfer takes less than 1
us). At least on -rt, having the interrupt thread scheduled in and out
again easily takes more than 1us of cpu time, and AFAIU we'd still be
preemptible throughout - and/or one can throw in a cond_resched() every
nnn words. But this might be a bit -rt specific, and the 1us threshold
is rather arbitrary.
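
A rough sketch of that guard and busy-wait is below. poll_word_done()
is a placeholder for polling the controller's event register, and the
mpc8xxx_spi/completion plumbing is only assumed to match what the
driver already has:

#include <linux/types.h>
#include <linux/completion.h>
#include <linux/processor.h>	/* cpu_relax() */
#include <linux/spi/spi.h>
#include "spi-fsl-lib.h"	/* struct mpc8xxx_spi (assumed to provide ->done) */

/* Placeholder: poll the controller's "word done" event bit. */
static bool poll_word_done(struct mpc8xxx_spi *mspi);

/* One word should finish in under ~1 us at this word size and clock. */
static bool fsl_spi_can_busy_wait(const struct spi_transfer *t)
{
	return (u64)1000000 * t->bits_per_word < t->speed_hz;
}

static void fsl_spi_wait_word(struct mpc8xxx_spi *mspi,
			      const struct spi_transfer *t)
{
	if (fsl_spi_can_busy_wait(t)) {
		while (!poll_word_done(mspi))
			cpu_relax();
	} else {
		/* Existing interrupt-driven path. */
		wait_for_completion(&mspi->done);
	}
}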

Rasmus
Mark Brown April 2, 2019, 9:10 a.m. UTC | #3
On Tue, Apr 02, 2019 at 08:43:51AM +0000, Rasmus Villemoes wrote:

> Thanks! There's one other option I can think of: don't do the interrupts
> at all, but just busy-wait for the completion of each word transfer (in
> a cpu_relax() loop). That could be guarded by something like
> 1000000*bits_per_word < hz (roughly, the word transfer takes less than 1
> us). At least on -rt, having the interrupt thread scheduled in and out
> again easily takes more than 1us of cpu time, and AFAIU we'd still be
> preemptible throughout - and/or one can throw in a cond_resched() every
> nnn words. But this might be a bit -rt specific, and the 1us threshold
> is rather arbitrary.

Yeah, that's definitely worth exploring as a mitigation but obviously
with things like flash I/O that gets a bit rude.  Hopefully what's there
at the minute turns out to be robust enough.