diff mbox

[RFC,RESEND] spi, spidev: Add support for long SPI transfers

Message ID 1416853893-1993-1-git-send-email-andrew.smirnov@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Andrey Smirnov Nov. 24, 2014, 6:31 p.m. UTC
SPI controllers found on modern SoCs have rather large SPI FIFOs and
allow for uninterrupted SPI transaction that are more then 255 bits
long. This commit adds necessary plumbing for such SPI transfers.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
---

Changes:
 - Fixed commit message
 - Ditched the cover letter
 - Used correct address for linux-kernel mailing list

Hi,

This patch is a very first, very preliminary version of the feature I
am implementing for niche usecase of SPI in a system I am working
on. It by no means contains production code and is purely RFC to "test
the water" and see if there would be any interest in including this in
mainline tree.

The system I am working with implements a lion's share of its
functionality in an extrenal FPGA connected to an ARM SoC via SPI
bus. In my use-case SPI bus used both for initial bitfile programming
of the FPGA and sending/fetching data to/from FPGA as well. Both use
cases, can benefit greatly from ability to send very long bit streams
over SPI as a single transaction during which CS line stays low the
whole time. The use-case of programming the FPGA is the most important
for me, since it is mandatory to program the FPGA before the rest of
the system can proceed to a functioning state and therefore my
"Applied power -> Functional State" time is greatly influenced by that
operation.

As things are right now both SPI subsystem and SPIDEV driver are
limited by their APIs to SPI transactions that are no longer than 255
bits and that problem is exacerbated by the fact that transction
length validity verification code does not have provisions for
anything bigger than 32 bits.

The code in this RFC deals only with the first problem, as I am not
sure what would best way to tackle the second one.

So I guess my two maing questions for this RFC are the following:

 - Is this work seem like a good fit to be included in the mainline
   kernel, or should I just keep doing it as my system specifica hack
   and not bother with upsteaming?

 - If the idea is good for upstreaming what would be a good way to
   improve length checking code? I was thinking of ditching the
   bitmask in favour of driver provided function that would take
   transfer length as an argument, are there any better options?

Thank you.

Andrey Smirnov


 drivers/spi/spidev.c            | 7 ++++++-
 include/linux/spi/spi.h         | 4 ++--
 include/uapi/linux/spi/spidev.h | 2 +-
 3 files changed, 9 insertions(+), 4 deletions(-)

Comments

Mark Brown Nov. 25, 2014, 12:42 p.m. UTC | #1
On Mon, Nov 24, 2014 at 10:31:33AM -0800, Andrey Smirnov wrote:

> SPI controllers found on modern SoCs have rather large SPI FIFOs and
> allow for uninterrupted SPI transaction that are more then 255 bits
> long. This commit adds necessary plumbing for such SPI transfers.

I really don't follow how this change is connected to the changelog...

> As things are right now both SPI subsystem and SPIDEV driver are
> limited by their APIs to SPI transactions that are no longer than 255
> bits and that problem is exacerbated by the fact that transction
> length validity verification code does not have provisions for
> anything bigger than 32 bits.

No, that's not the case at all.  A spi_transfer can have a length that's
an unsigned integer number of bytes which is much larger than 255 bits.
What is the actual problem you're trying to solve here?  I suspect the
driver you are using is just badly implemented...

> +		if (!u_tmp->bits_per_word && u_tmp->bits_per_burst)
> +			k_tmp->bits_per_word = u_tmp->bits_per_burst;
> +		else
> +			k_tmp->bits_per_word = u_tmp->bits_per_word;

This is setting the number of bits per word which is nothing to do with
FIFOs or the lengths of transfers but instead concerns the formatting of
data onto the bus.
Andrey Smirnov Nov. 25, 2014, 1:30 p.m. UTC | #2
>> As things are right now both SPI subsystem and SPIDEV driver are
>> limited by their APIs to SPI transactions that are no longer than 255
>> bits and that problem is exacerbated by the fact that transction
>> length validity verification code does not have provisions for
>> anything bigger than 32 bits.
>
> No, that's not the case at all.  A spi_transfer can have a length that's
> an unsigned integer number of bytes which is much larger than 255 bits.
> What is the actual problem you're trying to solve here?  I suspect the
> driver you are using is just badly implemented...

Yes, and you're absolutely right about spi_transfer, however I wasn't
talking about spi_transfer at all. My point was about SPI transaction
which is all the bits shifted out on the bus(and shifted in as well)
during the duration of the CS/SS signal assertion.

>
>> +             if (!u_tmp->bits_per_word && u_tmp->bits_per_burst)
>> +                     k_tmp->bits_per_word = u_tmp->bits_per_burst;
>> +             else
>> +                     k_tmp->bits_per_word = u_tmp->bits_per_word;
>
> This is setting the number of bits per word which is nothing to do with
> FIFOs or the lengths of transfers but instead concerns the formatting of
> data onto the bus.

I don't believe I said that this commit had anything to do with FIFO.
I did acknowledge that fact that modern SoC have large FIFO and this
is relevant because often times the size of that FIFO is what
determines the size of a single SPI transaction. I tried to be careful
in my wording of the summary and not use the word "transfer" anywhere,
but it looks like I slipped twice and that might have contributed to
this confusion. I apologize for this and hope that my comments
clarified my meaning/intent

Andrey
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Brown Nov. 25, 2014, 1:40 p.m. UTC | #3
On Tue, Nov 25, 2014 at 05:30:04AM -0800, Andrey Smirnov wrote:

> > No, that's not the case at all.  A spi_transfer can have a length that's
> > an unsigned integer number of bytes which is much larger than 255 bits.
> > What is the actual problem you're trying to solve here?  I suspect the
> > driver you are using is just badly implemented...

> Yes, and you're absolutely right about spi_transfer, however I wasn't
> talking about spi_transfer at all. My point was about SPI transaction
> which is all the bits shifted out on the bus(and shifted in as well)
> during the duration of the CS/SS signal assertion.

A spi_message (which is the thing that covers the entire /CS assert
modulo non-standard fiddling) is composed of multiple spi_transfers
so the above still applies...

> >> +             if (!u_tmp->bits_per_word && u_tmp->bits_per_burst)
> >> +                     k_tmp->bits_per_word = u_tmp->bits_per_burst;
> >> +             else
> >> +                     k_tmp->bits_per_word = u_tmp->bits_per_word;

> > This is setting the number of bits per word which is nothing to do with
> > FIFOs or the lengths of transfers but instead concerns the formatting of
> > data onto the bus.

> I don't believe I said that this commit had anything to do with FIFO.
> I did acknowledge that fact that modern SoC have large FIFO and this
> is relevant because often times the size of that FIFO is what
> determines the size of a single SPI transaction. I tried to be careful

No it doesn't, chip select can frequently be manipulated independently of
data transfers and almost all controllers can support transfers much
longer than their FIFOs either by doing that or by keeping the FIFO
topped up during transfer.

> in my wording of the summary and not use the word "transfer" anywhere,
> but it looks like I slipped twice and that might have contributed to
> this confusion. I apologize for this and hope that my comments
> clarified my meaning/intent

Sorry it's still not at all clear, please re-read what I wrote about
what bits_per_word means since I really don't think you've understood
what it's for.
Andrey Smirnov Nov. 25, 2014, 9:18 p.m. UTC | #4
>
> A spi_message (which is the thing that covers the entire /CS assert
> modulo non-standard fiddling) is composed of multiple spi_transfers
> so the above still applies...
>

You are correct, I think I was confused because the driver I was
working with didn't seem to implement it this way for hardware CS
case.

>
> No it doesn't, chip select can frequently be manipulated independently of
> data transfers and almost all controllers can support transfers much
> longer than their FIFOs either by doing that or by keeping the FIFO
> topped up during transfer.

You are correct, I reread the datasheet and even from my niche use
case FIFO depth is irrelevant. I apologize.

>
>> in my wording of the summary and not use the word "transfer" anywhere,
>> but it looks like I slipped twice and that might have contributed to
>> this confusion. I apologize for this and hope that my comments
>> clarified my meaning/intent
>
> Sorry it's still not at all clear, please re-read what I wrote about
> what bits_per_word means since I really don't think you've understood
> what it's for.

OK, I think I use too generic terms and descriptions and maybe that's
why it is not very clear what I mean. Just to give more context I am
working with i.MX6 SPI peripheral, and AFAIK I have two options of
using it:

- Use a generic GPIO as SS/CS. This way everything works exactly as
you describe, but unfortunately using GPIO leads to a significant
overhead for SS/CS assertion/deassertion.

- Use hardware controlled SS/CS in which case peripheral toggles the
line every "burst" it transmits, "burst" is n bits where n can be from
1 to 2^12
 For 1 <= n <= 32, and software SS/CS the concept of a "burst" matches
the meaning of "bits_per_word" field and this is exactly how it is
setup in the driver.

So with all of this in mind what I am trying to achive is to have
longest transaction with minimal CS/SS switching overhead. The change
in my patch allows me to set the length of a burst to its limit.

Andrey
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Brown Nov. 25, 2014, 10:25 p.m. UTC | #5
On Tue, Nov 25, 2014 at 01:18:45PM -0800, Andrey Smirnov wrote:

> >> in my wording of the summary and not use the word "transfer" anywhere,
> >> but it looks like I slipped twice and that might have contributed to
> >> this confusion. I apologize for this and hope that my comments
> >> clarified my meaning/intent

> > Sorry it's still not at all clear, please re-read what I wrote about
> > what bits_per_word means since I really don't think you've understood
> > what it's for.

> OK, I think I use too generic terms and descriptions and maybe that's
> why it is not very clear what I mean. Just to give more context I am
> working with i.MX6 SPI peripheral, and AFAIK I have two options of
> using it:

Ah, i.MX...  When I've worked with it I managed the chip select with
GPIO since it was flipping the chip select per word.

> - Use a generic GPIO as SS/CS. This way everything works exactly as
> you describe, but unfortunately using GPIO leads to a significant
> overhead for SS/CS assertion/deassertion.

It's really surprising that it's expensive to use a GPIO, I've not seen
other users complaining about that.  Do we understand why (I'm guessing
you have some extremely performance sensitive application here so you're
just more picky than most)?

> - Use hardware controlled SS/CS in which case peripheral toggles the
> line every "burst" it transmits, "burst" is n bits where n can be from
> 1 to 2^12
>  For 1 <= n <= 32, and software SS/CS the concept of a "burst" matches
> the meaning of "bits_per_word" field and this is exactly how it is
> setup in the driver.

But if we go over 32 bits it doesn't do the byte swapping?  Hardware
designers are wonderful, creative people sometimes.  In general this
isn't going to be sufficient for all transfers even with the non swapped
case since it means we can't do more than one transfer per message
unless we start coalescing in the core and putting in dummies for cases
where transfers change (we should do these things, they're a win in
general since we can just set the message up and let the hardware do the
transfer, but it's not completely trivial).

> So with all of this in mind what I am trying to achive is to have
> longest transaction with minimal CS/SS switching overhead. The change
> in my patch allows me to set the length of a burst to its limit.

The driver should probably just be doing that where it can, it seems
like the burst size should be based on the transfer length not the word
size - we might need to just manually mangle the data for short
transfers.

The *easiest* thing would be to manage as a GPIO, though - it's normally
fast enough.  Some more information about your application might help -
what sort of transfers is it doing for example?
Andrey Smirnov Nov. 26, 2014, 2:37 p.m. UTC | #6
>> OK, I think I use too generic terms and descriptions and maybe that's
>> why it is not very clear what I mean. Just to give more context I am
>> working with i.MX6 SPI peripheral, and AFAIK I have two options of
>> using it:
>
> Ah, i.MX...  When I've worked with it I managed the chip select with
> GPIO since it was flipping the chip select per word.

Yep, that's pretty much what I am observing, except now the size of
the "word" is configurable.

>
>> - Use a generic GPIO as SS/CS. This way everything works exactly as
>> you describe, but unfortunately using GPIO leads to a significant
>> overhead for SS/CS assertion/deassertion.
>
> It's really surprising that it's expensive to use a GPIO, I've not seen
> other users complaining about that.  Do we understand why (I'm guessing
> you have some extremely performance sensitive application here so you're
> just more picky than most)?

Yeah, I agree, I have a niche use case and just happen to be more
picky because of that. I have a rather slow clock (8Mhz out of
potential 20Mhz) that is out of my control; rather verbose protocol
which has 32bits+ of overhead per 32bit of data with fixed transaction
length of 64bits(so I get a lot of small equally sized transactions);
sizeable chunk of data to transfer 4K of 32bit words. On top of that
the system needs to do those 4K transfers as fast as possible so my
timing requirements are very severe as well.

>
>> - Use hardware controlled SS/CS in which case peripheral toggles the
>> line every "burst" it transmits, "burst" is n bits where n can be from
>> 1 to 2^12
>>  For 1 <= n <= 32, and software SS/CS the concept of a "burst" matches
>> the meaning of "bits_per_word" field and this is exactly how it is
>> setup in the driver.
>
> But if we go over 32 bits it doesn't do the byte swapping?

*Sigh* It's FIFO is little endian as well so it does do byte swapping.

> Hardware designers are wonderful, creative people sometimes.  In general this
> isn't going to be sufficient for all transfers even with the non swapped
> case since it means we can't do more than one transfer per message
> unless we start coalescing in the core and putting in dummies for cases
> where transfers change (we should do these things, they're a win in
> general since we can just set the message up and let the hardware do the
> transfer, but it's not completely trivial).
>
>> So with all of this in mind what I am trying to achive is to have
>> longest transaction with minimal CS/SS switching overhead. The change
>> in my patch allows me to set the length of a burst to its limit.
>
> The driver should probably just be doing that where it can, it seems
> like the burst size should be based on the transfer length not the word
> size - we might need to just manually mangle the data for short
> transfers.

Yeah, I thought about this idea yesterday as well. I agree this is
probably the right way to solve this. I'll have my hack as a temporary
kludge and will try to improve the driver in the mean time.

Thank you for you input and please ignore the patch that I sent.

Andrey
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c
index d7c6e36..45d8905 100644
--- a/drivers/spi/spidev.c
+++ b/drivers/spi/spidev.c
@@ -265,7 +265,12 @@  static int spidev_message(struct spidev_data *spidev,
 		buf += k_tmp->len;
 
 		k_tmp->cs_change = !!u_tmp->cs_change;
-		k_tmp->bits_per_word = u_tmp->bits_per_word;
+
+		if (!u_tmp->bits_per_word && u_tmp->bits_per_burst)
+			k_tmp->bits_per_word = u_tmp->bits_per_burst;
+		else
+			k_tmp->bits_per_word = u_tmp->bits_per_word;
+
 		k_tmp->delay_usecs = u_tmp->delay_usecs;
 		k_tmp->speed_hz = u_tmp->speed_hz;
 #ifdef VERBOSE
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index 4203c66..a0c34c1 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -75,7 +75,7 @@  struct spi_device {
 	struct spi_master	*master;
 	u32			max_speed_hz;
 	u8			chip_select;
-	u8			bits_per_word;
+	u32			bits_per_word;
 	u16			mode;
 #define	SPI_CPHA	0x01			/* clock phase */
 #define	SPI_CPOL	0x02			/* clock polarity */
@@ -586,7 +586,7 @@  struct spi_transfer {
 #define	SPI_NBITS_SINGLE	0x01 /* 1bit transfer */
 #define	SPI_NBITS_DUAL		0x02 /* 2bits transfer */
 #define	SPI_NBITS_QUAD		0x04 /* 4bits transfer */
-	u8		bits_per_word;
+	u32		bits_per_word;
 	u16		delay_usecs;
 	u32		speed_hz;
 
diff --git a/include/uapi/linux/spi/spidev.h b/include/uapi/linux/spi/spidev.h
index 52d9ed0..b70f3d4 100644
--- a/include/uapi/linux/spi/spidev.h
+++ b/include/uapi/linux/spi/spidev.h
@@ -92,7 +92,7 @@  struct spi_ioc_transfer {
 	__u16		delay_usecs;
 	__u8		bits_per_word;
 	__u8		cs_change;
-	__u32		pad;
+	__u32		bits_per_burst;
 
 	/* If the contents of 'struct spi_ioc_transfer' ever change
 	 * incompatibly, then the ioctl number (currently 0) must change;