[0/3] spi: bcm2835: Interrupt-handling optimisations

Message ID cover.1592261248.git.robin.murphy@arm.com (mailing list archive)

Message

Robin Murphy June 16, 2020, 12:09 a.m. UTC
Hi all,

Although Florian was concerned about a trivial inline check to deal with
shared IRQs adding overhead, the reality is that it would be so small as
to not be worth even thinking about unless the driver was already tuned
to squeeze out every last cycle. And a brief look over the code shows
that that clearly isn't the case.
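
To give a sense of scale, the kind of shared-IRQ check under discussion
can be sketched as below. This is a userspace model and not the driver
code: struct fake_spi, fake_spi_interrupt(), the enum values and the bit
positions are all names made up for illustration. The point is only that
the handler already has the status value in hand, so the check costs one
AND and one branch.

```c
#include <stdint.h>

/* Hypothetical status bits, for illustration only. */
#define CS_DONE (1u << 16)
#define CS_RXR  (1u << 19)

/* Stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED return values. */
enum irq_result { IRQ_NOT_OURS = 0, IRQ_SERVICED = 1 };

/* Stand-in for a device with a memory-mapped status register. */
struct fake_spi {
	volatile uint32_t cs;
	unsigned int serviced;
};

static enum irq_result fake_spi_interrupt(struct fake_spi *dev)
{
	uint32_t cs = dev->cs;	/* single read of the status register */

	/* Shared-IRQ check: bail out if this device raised nothing. */
	if (!(cs & (CS_DONE | CS_RXR)))
		return IRQ_NOT_OURS;

	/* Real work (FIFO handling etc.) would go here. */
	dev->serviced++;
	return IRQ_SERVICED;
}
```

With no status bits set the handler returns straight away; with either
bit set it falls through to the normal path.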

This is an example of some of the easy low-hanging fruit that jumps out
just from code inspection. Based on disassembly and ARM1176 cycle
timings, patch #2 should save the equivalent of 2-3 shared interrupt
checks off the critical path in all cases, and patch #3 possibly up to
about 100x more. I don't have any means to test these patches, let alone
measure performance, so they're only backed by the principle that less
code - and in particular fewer memory accesses - is almost always
better.

There is almost certainly a *lot* more to be had from careful use of
relaxed I/O accessors, not doing a read-modify-write of CS at every
reset, tweaking the loops further to avoid unnecessary writebacks to
variables, and so on. However, since I'm not invested in this personally,
I'm not going to pursue it any further. I'm throwing these patches out
as more of a demonstration to back up my original drive-by review
comments, so if anyone wants to pick them up and run with them, then
please do so.
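
As a rough illustration of the "unnecessary writebacks" point, here is a
userspace sketch of a FIFO fill loop written two ways. It is not the
driver's code: struct fake_xfer, fake_fifo_write() and both fill
functions are hypothetical names, and the FIFO is modelled as a single
volatile word. The first version updates the transfer state in memory on
every iteration as written; the second works on local copies and stores
the state back once at the end. Whether a compiler can do that
transformation by itself depends on aliasing rules and build flags, so
this is only a sketch of the idea.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transfer state, for illustration only. */
struct fake_xfer {
	const uint8_t *tx_buf;
	size_t tx_len;
};

/* Stand-in for a write to the FIFO data register. */
static void fake_fifo_write(volatile uint32_t *fifo, uint8_t byte)
{
	*fifo = byte;
}

/* As written, tx_buf and tx_len are updated through x every iteration. */
static size_t fill_fifo_writeback(struct fake_xfer *x,
				  volatile uint32_t *fifo, size_t count)
{
	size_t n = 0;

	while (n < count && x->tx_len) {
		fake_fifo_write(fifo, *x->tx_buf++);
		x->tx_len--;
		n++;
	}
	return n;
}

/* Same behaviour, but the state is loaded once and stored back once. */
static size_t fill_fifo_local(struct fake_xfer *x,
			      volatile uint32_t *fifo, size_t count)
{
	const uint8_t *buf = x->tx_buf;
	size_t len = x->tx_len;
	size_t n = count < len ? count : len;
	size_t i;

	for (i = 0; i < n; i++)
		fake_fifo_write(fifo, buf[i]);

	x->tx_buf = buf + n;
	x->tx_len = len - n;
	return n;
}
```

Both functions push the same bytes and leave the transfer state
identical; only the number of stores to the struct inside the loop
differs.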

Robin.


Robin Murphy (3):
  spi: bcm3835: Tidy up bcm2835_spi_reset_hw()
  spi: bcm2835: Micro-optimise IRQ handler
  spi: bcm2835: Micro-optimise FIFO loops

 drivers/spi/spi-bcm2835.c | 45 +++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 23 deletions(-)

Comments

Mark Brown July 1, 2020, 10:24 p.m. UTC | #1
On Tue, 16 Jun 2020 01:09:26 +0100, Robin Murphy wrote:
> Although Florian was concerned about a trivial inline check to deal with
> shared IRQs adding overhead, the reality is that it would be so small as
> to not be worth even thinking about unless the driver was already tuned
> to squeeze out every last cycle. And a brief look over the code shows
> that that clearly isn't the case.
> 
> This is an example of some of the easy low-hanging fruit that jumps out
> just from code inspection. Based on disassembly and ARM1176 cycle
> timings, patch #2 should save the equivalent of 2-3 shared interrupt
> checks off the critical path in all cases, and patch #3 possibly up to
> about 100x more. I don't have any means to test these patches, let alone
> measure performance, so they're only backed by the principle that less
> code - and in particular fewer memory accesses - is almost always
> better.
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/3] spi: bcm3835: Tidy up bcm2835_spi_reset_hw()
      commit: ac4648b5d866f98feef4525ae8734972359e4edd
[2/3] spi: bcm2835: Micro-optimise IRQ handler
      commit: afe7e36360f4c981fc03ef07a81cb4ce3d567325
[3/3] spi: bcm2835: Micro-optimise FIFO loops
      commit: 26751de25d255eab7132a8024a893609456996e6

All being well, this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix). However, if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree. Please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes, they
should be sent as incremental updates against current git; existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark