diff mbox

[v2] spi: orion.c: Add direct access mode

Message ID 1458663893-13766-1-git-send-email-sr@denx.de (mailing list archive)
State New, archived
Headers show

Commit Message

Stefan Roese March 22, 2016, 4:24 p.m. UTC
This patch adds support for the direct access mode to the Orion SPI
driver which is used on the Marvell Armada based SoCs. In this direct
mode, all data written to (or read from) a specifically mapped MBus
window (linked to one SPI chip-select on one of the SPI controllers)
will be transferred directly to the SPI bus. Without the need to control
the SPI registers in between. This can improve the SPI transfer rate in
such cases.

Both, direct-read and -write mode are supported. But only the write
mode has been tested. This mode especially benefits from the SPI direct
mode, as the data bytes are written head-to-head to the SPI bus,
without any additional addresses.

One use-case for this direct write mode is, programming a FPGA bitstream
image into the FPGA connected to the SPI bus at maximum speed.

This mode is described in chapter "22.5.2 Direct Write to SPI" in the
Marvell Armada XP Functional Spec Datasheet.

Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Nadav Haklai <nadavh@marvell.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Mark Brown <broonie@kernel.org>
---
Mark, sorry for the huge delay for v2 of this direct-access patch.
I was busy with other tasks in the meantime. And only found now
the time to address (hopefully all) of your comments.

v2:
- Use one MBus window for each SPI controller instead of one for
  for each SPI device. This MBus window is re-assigned, once the
  CS changes.
- Assert the CS over the entire message
- Don't restrict the direct access mode to only direct write mode
- Add check for max size before using memcpy()
- Remove spidev from DT bindings documentation

 .../devicetree/bindings/spi/spi-orion.txt          |  21 +++
 drivers/spi/spi-orion.c                            | 142 ++++++++++++++++++++-
 2 files changed, 161 insertions(+), 2 deletions(-)

Comments

Thomas Petazzoni March 22, 2016, 4:35 p.m. UTC | #1
Hello,

On Tue, 22 Mar 2016 17:24:53 +0100, Stefan Roese wrote:
> This patch adds support for the direct access mode to the Orion SPI
> driver which is used on the Marvell Armada based SoCs. In this direct
> mode, all data written to (or read from) a specifically mapped MBus
> window (linked to one SPI chip-select on one of the SPI controllers)
> will be transferred directly to the SPI bus. Without the need to control
> the SPI registers in between. This can improve the SPI transfer rate in
> such cases.
> 
> Both, direct-read and -write mode are supported. But only the write
> mode has been tested. This mode especially benefits from the SPI direct
> mode, as the data bytes are written head-to-head to the SPI bus,
> without any additional addresses.
> 
> One use-case for this direct write mode is, programming a FPGA bitstream
> image into the FPGA connected to the SPI bus at maximum speed.
> 
> This mode is described in chapter "22.5.2 Direct Write to SPI" in the
> Marvell Armada XP Functional Spec Datasheet.
> 
> Signed-off-by: Stefan Roese <sr@denx.de>
> Cc: Nadav Haklai <nadavh@marvell.com>
> Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
> Cc: Mark Brown <broonie@kernel.org>
> ---
> Mark, sorry for the huge delay for v2 of this direct-access patch.
> I was busy with other tasks in the meantime. And only found now
> the time to address (hopefully all) of your comments.

Thanks for this new version! To be honest, I don't remember all the
discussions that took place on the v1, so maybe I'll just be re-asking
the same question.

Has there been any discussion on whether dynamically adding the MBus
window is a good idea, as opposed to statically defining it in the
board .dts ?

So far, the only driver that was using dynamically allocated MBus is
the PCIe controller driver, because there is no way in advanced to know
the number and memory window requirements of the PCIe devices that will
be connected to the system.

For all other windows (BootROM, crypto SRAM and more recently network
related SRAM), we are using statically allocated windows.

So I'm wondering if we should add this additional DT binding that
describes the necessary information to allow the driver to dynamically
allocate a window, or if we shouldn't rely on a statically allocated
window.

This is really an open discussion, I don't have a very well-defined
opinion on the matter.

Let's Cc: Arnd Bergmann on this question, he has followed the whole
MBus story and might have some interesting insights.

Best regards,

Thomas
Stefan Roese March 22, 2016, 4:44 p.m. UTC | #2
Hi Thomas,

On 22.03.2016 17:35, Thomas Petazzoni wrote:
> On Tue, 22 Mar 2016 17:24:53 +0100, Stefan Roese wrote:
>> This patch adds support for the direct access mode to the Orion SPI
>> driver which is used on the Marvell Armada based SoCs. In this direct
>> mode, all data written to (or read from) a specifically mapped MBus
>> window (linked to one SPI chip-select on one of the SPI controllers)
>> will be transferred directly to the SPI bus. Without the need to control
>> the SPI registers in between. This can improve the SPI transfer rate in
>> such cases.
>>
>> Both, direct-read and -write mode are supported. But only the write
>> mode has been tested. This mode especially benefits from the SPI direct
>> mode, as the data bytes are written head-to-head to the SPI bus,
>> without any additional addresses.
>>
>> One use-case for this direct write mode is, programming a FPGA bitstream
>> image into the FPGA connected to the SPI bus at maximum speed.
>>
>> This mode is described in chapter "22.5.2 Direct Write to SPI" in the
>> Marvell Armada XP Functional Spec Datasheet.
>>
>> Signed-off-by: Stefan Roese <sr@denx.de>
>> Cc: Nadav Haklai <nadavh@marvell.com>
>> Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
>> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
>> Cc: Mark Brown <broonie@kernel.org>
>> ---
>> Mark, sorry for the huge delay for v2 of this direct-access patch.
>> I was busy with other tasks in the meantime. And only found now
>> the time to address (hopefully all) of your comments.
>
> Thanks for this new version! To be honest, I don't remember all the
> discussions that took place on the v1, so maybe I'll just be re-asking
> the same question.

No problem. Thanks for looking into this.

> Has there been any discussion on whether dynamically adding the MBus
> window is a good idea, as opposed to statically defining it in the
> board .dts ?

Yes. My 1st patch version (still RFC) used fixed MBus windows instead:

http://www.spinics.net/lists/linux-spi/msg06536.html

Mark suggested to use dynamic windows, so that one area could be
used for all SPI devices by switching (re-configuring) the MBus
window:

http://www.spinics.net/lists/linux-spi/msg06537.html

> So far, the only driver that was using dynamically allocated MBus is
> the PCIe controller driver, because there is no way in advanced to know
> the number and memory window requirements of the PCIe devices that will
> be connected to the system.
>
> For all other windows (BootROM, crypto SRAM and more recently network
> related SRAM), we are using statically allocated windows.
>
> So I'm wondering if we should add this additional DT binding that
> describes the necessary information to allow the driver to dynamically
> allocate a window, or if we shouldn't rely on a statically allocated
> window.
>
> This is really an open discussion, I don't have a very well-defined
> opinion on the matter.

I also have no real preference here.

> Let's Cc: Arnd Bergmann on this question, he has followed the whole
> MBus story and might have some interesting insights.

Good idea.

Thanks,
Stefan
Mark Brown March 22, 2016, 5:39 p.m. UTC | #3
On Tue, Mar 22, 2016 at 05:35:46PM +0100, Thomas Petazzoni wrote:

> Has there been any discussion on whether dynamically adding the MBus
> window is a good idea, as opposed to statically defining it in the
> board .dts ?

> So far, the only driver that was using dynamically allocated MBus is
> the PCIe controller driver, because there is no way in advanced to know
> the number and memory window requirements of the PCIe devices that will
> be connected to the system.

> For all other windows (BootROM, crypto SRAM and more recently network
> related SRAM), we are using statically allocated windows.

Is this a case where we used to use static assignment because we didn't
have a mechanism for dynamic assignment but now have a mechanism for
dynamic assignment?  In other words is there any advantage to doing the
assignments statically given that we can do them dynamically?
Mark Brown March 23, 2016, 11:33 a.m. UTC | #4
On Tue, Mar 22, 2016 at 05:44:52PM +0100, Stefan Roese wrote:
> On 22.03.2016 17:35, Thomas Petazzoni wrote:

> >Has there been any discussion on whether dynamically adding the MBus
> >window is a good idea, as opposed to statically defining it in the
> >board .dts ?

> Yes. My 1st patch version (still RFC) used fixed MBus windows instead:

> http://www.spinics.net/lists/linux-spi/msg06536.html

> Mark suggested to use dynamic windows, so that one area could be
> used for all SPI devices by switching (re-configuring) the MBus
> window:

> http://www.spinics.net/lists/linux-spi/msg06537.html

No, there's two separate things here.  The big problem with what you
originally sent was that you were defining a window per SPI device but
this is a part of the SPI controller so having to change windows per
device is just going to make the code more complex.  There's also the
separate issue of where any MBus windows we get come up with.
Stefan Roese March 23, 2016, 11:59 a.m. UTC | #5
Hi Mark,

On 23.03.2016 12:33, Mark Brown wrote:
> On Tue, Mar 22, 2016 at 05:44:52PM +0100, Stefan Roese wrote:
>> On 22.03.2016 17:35, Thomas Petazzoni wrote:
> 
>>> Has there been any discussion on whether dynamically adding the MBus
>>> window is a good idea, as opposed to statically defining it in the
>>> board .dts ?
> 
>> Yes. My 1st patch version (still RFC) used fixed MBus windows instead:
> 
>> http://www.spinics.net/lists/linux-spi/msg06536.html
> 
>> Mark suggested to use dynamic windows, so that one area could be
>> used for all SPI devices by switching (re-configuring) the MBus
>> window:
> 
>> http://www.spinics.net/lists/linux-spi/msg06537.html
> 
> No, there's two separate things here.  The big problem with what you
> originally sent was that you were defining a window per SPI device but
> this is a part of the SPI controller so having to change windows per
> device is just going to make the code more complex.

I'm sorry, but this sentence is not totally clear to me: Do you mean
that the "more complex" code in v2 with the MBus window reconfiguration
is not good? Or do you mean that in the RFC version, the switching
between the SPI devices was too complex?

To sum it up, in the RFC version, I had one MBus window per SPI
device (using this direct access mode). Resulting in multiple MBus
windows and more resource "wastage" than necessary. In v2, I now
have one MBus window per SPI controller. Which needs to get
re-configured upon CS change.

>  There's also the
> separate issue of where any MBus windows we get come up with.

Could you also please explain this in more detail? What do you mean
with that? Do you mean the physical address, that is defined in the
DT for this MBus window?

Thanks,
Stefan
Mark Brown March 23, 2016, 12:54 p.m. UTC | #6
On Wed, Mar 23, 2016 at 12:59:40PM +0100, Stefan Roese wrote:
> On 23.03.2016 12:33, Mark Brown wrote:

> > No, there's two separate things here.  The big problem with what you
> > originally sent was that you were defining a window per SPI device but
> > this is a part of the SPI controller so having to change windows per
> > device is just going to make the code more complex.

> I'm sorry, but this sentence is not totally clear to me: Do you mean
> that the "more complex" code in v2 with the MBus window reconfiguration
> is not good? Or do you mean that in the RFC version, the switching
> between the SPI devices was too complex?

I haven't looked at your new code at all.  What I'm saying is that
specifying a per-device MBus window seems like pointless complexity.

> >  There's also the
> > separate issue of where any MBus windows we get come up with.

> Could you also please explain this in more detail? What do you mean
> with that? Do you mean the physical address, that is defined in the
> DT for this MBus window?

I had thought that this was what this subthread was about but you
appeared to be confused.
Stefan Roese March 23, 2016, 1:10 p.m. UTC | #7
Hi Mark,

On 23.03.2016 13:54, Mark Brown wrote:
> On Wed, Mar 23, 2016 at 12:59:40PM +0100, Stefan Roese wrote:
>> On 23.03.2016 12:33, Mark Brown wrote:
> 
>>> No, there's two separate things here.  The big problem with what you
>>> originally sent was that you were defining a window per SPI device but
>>> this is a part of the SPI controller so having to change windows per
>>> device is just going to make the code more complex.
> 
>> I'm sorry, but this sentence is not totally clear to me: Do you mean
>> that the "more complex" code in v2 with the MBus window reconfiguration
>> is not good? Or do you mean that in the RFC version, the switching
>> between the SPI devices was too complex?
> 
> I haven't looked at your new code at all.  What I'm saying is that
> specifying a per-device MBus window seems like pointless complexity.

I don't necessarily share this opinions. Code-wise, its less complex
that re-configuring (removing the old and creating the new) the MBus
window. But I have no strong feeling here. Whatever is decided that
should be used, I can go with.

Thomas, Arnd (or anyone else?), do you have any comments or preferences
which way to go here?

Thanks,
Stefan
Andrew Lunn March 23, 2016, 1:26 p.m. UTC | #8
On Wed, Mar 23, 2016 at 02:10:57PM +0100, Stefan Roese wrote:
> Hi Mark,
> 
> On 23.03.2016 13:54, Mark Brown wrote:
> > On Wed, Mar 23, 2016 at 12:59:40PM +0100, Stefan Roese wrote:
> >> On 23.03.2016 12:33, Mark Brown wrote:
> > 
> >>> No, there's two separate things here.  The big problem with what you
> >>> originally sent was that you were defining a window per SPI device but
> >>> this is a part of the SPI controller so having to change windows per
> >>> device is just going to make the code more complex.
> > 
> >> I'm sorry, but this sentence is not totally clear to me: Do you mean
> >> that the "more complex" code in v2 with the MBus window reconfiguration
> >> is not good? Or do you mean that in the RFC version, the switching
> >> between the SPI devices was too complex?
> > 
> > I haven't looked at your new code at all.  What I'm saying is that
> > specifying a per-device MBus window seems like pointless complexity.
> 
> I don't necessarily share this opinions. Code-wise, its less complex
> that re-configuring (removing the old and creating the new) the MBus
> window. But I have no strong feeling here. Whatever is decided that
> should be used, I can go with.
> 
> Thomas, Arnd (or anyone else?), do you have any comments or preferences
> which way to go here?

Hi Stefan

The number of windows is limited. On i think the Armada XP, if you put
a PCIe device on every available PCIe bus, you can run out of
windows. This is why Thomas implemented dynamic allocation of the
Windows, so that only those that are needed are used.

So i would not statically and globally allocate as many windows as
possible SPI devices.

The fact that SPI can fall back to another mechanism if there are no
available windows, were as PCIe cannot, suggests that SPI should
dynamically allocate a window, and be prepared for it to fail.

Since only one SPI device can be active at once on a SPI bus, one
window per bus makes sense, and keeps the required number of windows
to a minimum.

   Andrew
Mark Brown March 23, 2016, 1:27 p.m. UTC | #9
On Wed, Mar 23, 2016 at 02:10:57PM +0100, Stefan Roese wrote:
> On 23.03.2016 13:54, Mark Brown wrote:

> > I haven't looked at your new code at all.  What I'm saying is that
> > specifying a per-device MBus window seems like pointless complexity.

> I don't necessarily share this opinions. Code-wise, its less complex
> that re-configuring (removing the old and creating the new) the MBus
> window. But I have no strong feeling here. Whatever is decided that
> should be used, I can go with.

No, really - it's just unhelpful.  Putting this in the ABI means that
every single system integrator who cares about performance is going to
need to go and manually figure out how to configure this in DT and
manually select values.  That's not doing a good job for users, it's
making their lives harder for no gain.  If there are no physical
constraints then how we allocate space in the MBus should be a runtime
thing, it shouldn't be part of the ABI.
Mark Brown March 23, 2016, 1:36 p.m. UTC | #10
On Wed, Mar 23, 2016 at 02:26:37PM +0100, Andrew Lunn wrote:

> The number of windows is limited. On i think the Armada XP, if you put
> a PCIe device on every available PCIe bus, you can run out of
> windows. This is why Thomas implemented dynamic allocation of the
> Windows, so that only those that are needed are used.

> So i would not statically and globally allocate as many windows as
> possible SPI devices.

> The fact that SPI can fall back to another mechanism if there are no
> available windows, were as PCIe cannot, suggests that SPI should
> dynamically allocate a window, and be prepared for it to fail.

> Since only one SPI device can be active at once on a SPI bus, one
> window per bus makes sense, and keeps the required number of windows
> to a minimum.

If we're under pressure for windows I'd go further and say that we
should be dynamically allocating the windows only when they're actually
in use (and modifying other drivers to do the same if that makes sense
for them), unless it's somehow expensive to allocate windows that means
that we should reduce the overall pressure.
Andrew Lunn March 23, 2016, 1:56 p.m. UTC | #11
On Wed, Mar 23, 2016 at 01:36:12PM +0000, Mark Brown wrote:
> On Wed, Mar 23, 2016 at 02:26:37PM +0100, Andrew Lunn wrote:
> 
> > The number of windows is limited. On i think the Armada XP, if you put
> > a PCIe device on every available PCIe bus, you can run out of
> > windows. This is why Thomas implemented dynamic allocation of the
> > Windows, so that only those that are needed are used.
> 
> > So i would not statically and globally allocate as many windows as
> > possible SPI devices.
> 
> > The fact that SPI can fall back to another mechanism if there are no
> > available windows, were as PCIe cannot, suggests that SPI should
> > dynamically allocate a window, and be prepared for it to fail.
> 
> > Since only one SPI device can be active at once on a SPI bus, one
> > window per bus makes sense, and keeps the required number of windows
> > to a minimum.
> 
> If we're under pressure for windows I'd go further and say that we
> should be dynamically allocating the windows only when they're actually
> in use (and modifying other drivers to do the same if that makes sense
> for them), unless it's somehow expensive to allocate windows that means
> that we should reduce the overall pressure.

Hi Mark

We are only under pressure in the extremes, i.e 10 PCIe busses in
use. Only allocating PCIe Windows when needed means in practice, we
have not had issues. We need to be mindful, and don't waste them, but
we don't need to consider them a scarce resource.

I don't see it being an issue for the SPI driver to allocate one on
probe and keeping it until release. I probably would object to it
allocating one per chip select line.

   Andrew
Stefan Roese March 23, 2016, 5:25 p.m. UTC | #12
On 23.03.2016 14:27, Mark Brown wrote:
> On Wed, Mar 23, 2016 at 02:10:57PM +0100, Stefan Roese wrote:
>> On 23.03.2016 13:54, Mark Brown wrote:
>
>>> I haven't looked at your new code at all.  What I'm saying is that
>>> specifying a per-device MBus window seems like pointless complexity.
>
>> I don't necessarily share this opinions. Code-wise, its less complex
>> that re-configuring (removing the old and creating the new) the MBus
>> window. But I have no strong feeling here. Whatever is decided that
>> should be used, I can go with.
>
> No, really - it's just unhelpful.  Putting this in the ABI means that
> every single system integrator who cares about performance is going to
> need to go and manually figure out how to configure this in DT and
> manually select values.  That's not doing a good job for users, it's
> making their lives harder for no gain.  If there are no physical
> constraints then how we allocate space in the MBus should be a runtime
> thing, it shouldn't be part of the ABI.

Do I understand you correctly, that you want the complete MBus
allocation and configuration to be included in the SPI driver
instead of using the DT to describe the window (target, attribute,
area)? If yes, then the SPI driver would be the first driver
to do this "internally". All other drivers using MBus resources
parse these from the DT (AFAIK) as this is highly SoC specific.

Sidenote:
Please note that this direct access mode is really not suitable
for "normal" SPI devices like SPI NOR chips. As the direct read
mode (as described in chapter 22.5.1 of the Armada XP datasheet)
always generates:

a) a write command to the SPI bus (opcode configurable via register)

and

b) writes 1-4 address bytes to the SPI bus

before the data is read from the SPI bus.

So its definitely nothing that should be enabled per default for
all SPI devices. In my case of the FPGA bitstream loading, the
direct write mode is perfect, as it provides the fastest way to
stream the bitstream back-to-back onto the SPI bus.

Thanks,
Stefan
Mark Brown March 23, 2016, 6:29 p.m. UTC | #13
On Wed, Mar 23, 2016 at 06:25:46PM +0100, Stefan Roese wrote:
> On 23.03.2016 14:27, Mark Brown wrote:

> >No, really - it's just unhelpful.  Putting this in the ABI means that
> >every single system integrator who cares about performance is going to
> >need to go and manually figure out how to configure this in DT and
> >manually select values.  That's not doing a good job for users, it's
> >making their lives harder for no gain.  If there are no physical
> >constraints then how we allocate space in the MBus should be a runtime
> >thing, it shouldn't be part of the ABI.

> Do I understand you correctly, that you want the complete MBus
> allocation and configuration to be included in the SPI driver
> instead of using the DT to describe the window (target, attribute,
> area)? If yes, then the SPI driver would be the first driver
> to do this "internally". All other drivers using MBus resources
> parse these from the DT (AFAIK) as this is highly SoC specific.

That seems pretty surprising, especially if there is a possibility that
we might run out of resources on the MBus even if just in some unusual
situation.  Why would this be manually configured in DT if there's no
hardware restriction?  Based on what people are saying I'd expect the
SPI controller to have a reference to the attached bus which it uses to
find the bus and then ask the bus to allocate it whatever it needs from
the space available.  I mean, I guess if that's what you're forcing
users to do that's what you're forcing users to do but it seems like
it's making people do manual work that should be automated.

It's possible I'm missing something about how this hardware works...
Andrew Lunn March 23, 2016, 6:39 p.m. UTC | #14
> Please note that this direct access mode is really not suitable
> for "normal" SPI devices like SPI NOR chips. As the direct read
> mode (as described in chapter 22.5.1 of the Armada XP datasheet)
> always generates:
> 
> a) a write command to the SPI bus (opcode configurable via register)
> 
> and
> 
> b) writes 1-4 address bytes to the SPI bus
> 
> before the data is read from the SPI bus.
> 
> So its definitely nothing that should be enabled per default for
> all SPI devices.

Looking at the Kirkwood datasheet, it suggests it can be used with SPI
flash:

	If using SPI flash, set the <Attr> field in the CPU address
	decoding windows to match the SPI interface (see Section 2,
	Address Map, on page 34 for more details). Any CPU read to
	this address space is converted by the SPI controller to a SPI
	flash read transaction, composed of an address phase, followed
	by a data phase.

	The actual sequence that is driven on the SPI interface is:

	1. Assert SPI_CSn. Write command to the SPI. The command is
	either Read or Fast_Read, based on the configuration of the
	<DirectRdCommand> field.

	2. Write the address. The Address is driven in 1 to 4 phases
	based on the configuration of the <DirectAddrLen>> field. The order of
	the bytes is MSB to LSB.

	3. In Fast_Read mode, add a one-byte dummy write.

	4. Read the data, according to the length given by the request.

	5. De-assert the SPI_CSn signal.

There is also something interesting in the Direct Write to SPI
section:

	4. Address Phase.

	This is a 1-4 byte field that is taken from the address of the
	request.  The size of the address is configured via the
	<DirectAddrLen> field in the Serial Memory Interface
	Configuration Register (Table 629 p. 686).  The entire Address
	phase is omitted when <Direct Wr Addr Enable> field in the
	Serial Memory Direct Write Configuration Register (Table 634
	p. 688) is cleared.

Turning off the address phase may mean it is possible to do normal SPI
writes using the direct mode.

Quite a few Kirkwood boards have SPI flash, so if this brings better
performance, it would be nice to be able to use it.

	     Andrew
Arnd Bergmann March 23, 2016, 7:51 p.m. UTC | #15
On Wednesday 23 March 2016 14:56:06 Andrew Lunn wrote:
> On Wed, Mar 23, 2016 at 01:36:12PM +0000, Mark Brown wrote:
> > On Wed, Mar 23, 2016 at 02:26:37PM +0100, Andrew Lunn wrote:
> > 
> > > The number of windows is limited. On i think the Armada XP, if you put
> > > a PCIe device on every available PCIe bus, you can run out of
> > > windows. This is why Thomas implemented dynamic allocation of the
> > > Windows, so that only those that are needed are used.
> > 
> > > So i would not statically and globally allocate as many windows as
> > > possible SPI devices.
> > 
> > > The fact that SPI can fall back to another mechanism if there are no
> > > available windows, were as PCIe cannot, suggests that SPI should
> > > dynamically allocate a window, and be prepared for it to fail.
> > 
> > > Since only one SPI device can be active at once on a SPI bus, one
> > > window per bus makes sense, and keeps the required number of windows
> > > to a minimum.
> > 
> > If we're under pressure for windows I'd go further and say that we
> > should be dynamically allocating the windows only when they're actually
> > in use (and modifying other drivers to do the same if that makes sense
> > for them), unless it's somehow expensive to allocate windows that means
> > that we should reduce the overall pressure.
> 
> Hi Mark
> 
> We are only under pressure in the extremes, i.e 10 PCIe busses in
> use. Only allocating PCIe Windows when needed means in practice, we
> have not had issues. We need to be mindful, and don't waste them, but
> we don't need to consider them a scarce resource.
> 
> I don't see it being an issue for the SPI driver to allocate one on
> probe and keeping it until release. I probably would object to it
> allocating one per chip select line.

I agree. We had a very long debate about this when we first added
the support for MBus, and not much has changed. As far as I'm concerned,
this is where we're at:

The binding defines a reasonable set of default settings for each
device that uses MBus, and the OS can use them, or define its own.
Coming up with an algorithm to do this right however is very hard
and not really worth it, as defining the defaults works well enough.

Ideally we'd let the boot loader set up the windows statically and
have the DT describe what they are, but unfortunately that is not
what boot loaders in the field do.

The method that was picked for MBus is similar to how we typically
handle PCI host bridges that have the same problem: you can set up
translation windows between CPU addresses and the various PCI
address spaces (I/O, mem, prefetchable, config, ...) in all sorts
of ways, with various tradeoffs, but instead we just pick a
reasonable default and describe it in the DT, because the kernel
has no good generic way of picking a particular setting over another.

	Arnd
Stefan Roese March 24, 2016, 5:45 a.m. UTC | #16
On 23.03.2016 19:39, Andrew Lunn wrote:
>> Please note that this direct access mode is really not suitable
>> for "normal" SPI devices like SPI NOR chips. As the direct read
>> mode (as described in chapter 22.5.1 of the Armada XP datasheet)
>> always generates:
>>
>> a) a write command to the SPI bus (opcode configurable via register)
>>
>> and
>>
>> b) writes 1-4 address bytes to the SPI bus
>>
>> before the data is read from the SPI bus.
>>
>> So its definitely nothing that should be enabled per default for
>> all SPI devices.
> 
> Looking at the Kirkwood datasheet, it suggests it can be used with SPI
> flash:
> 
> 	If using SPI flash, set the <Attr> field in the CPU address
> 	decoding windows to match the SPI interface (see Section 2,
> 	Address Map, on page 34 for more details). Any CPU read to
> 	this address space is converted by the SPI controller to a SPI
> 	flash read transaction, composed of an address phase, followed
> 	by a data phase.
> 
> 	The actual sequence that is driven on the SPI interface is:
> 
> 	1. Assert SPI_CSn. Write command to the SPI. The command is
> 	either Read or Fast_Read, based on the configuration of the
> 	<DirectRdCommand> field.
> 
> 	2. Write the address. The Address is driven in 1 to 4 phases
> 	based on the configuration of the <DirectAddrLen>> field. The order of
> 	the bytes is MSB to LSB.
> 
> 	3. In Fast_Read mode, add a one-byte dummy write.
> 
> 	4. Read the data, according to the length given by the request.
> 
> 	5. De-assert the SPI_CSn signal.

Right. It is of course possible to use this direct mode to access
SPI flash. Its just not the way how the SPI MTD driver uses the
SPI controller. All information that is written to the SPI flash
(e.g. commands, addresses) are passed as plain tx-data in the SPI
message to the SPI controller. And it would be not easy to detect
commands / addresses vs. "real data" in this orion SPI driver
to put these values into the specific registers.
 
> There is also something interesting in the Direct Write to SPI
> section:
> 
> 	4. Address Phase.
> 
> 	This is a 1-4 byte field that is taken from the address of the
> 	request.  The size of the address is configured via the
> 	<DirectAddrLen> field in the Serial Memory Interface
> 	Configuration Register (Table 629 p. 686).  The entire Address
> 	phase is omitted when <Direct Wr Addr Enable> field in the
> 	Serial Memory Direct Write Configuration Register (Table 634
> 	p. 688) is cleared.
> 
> Turning off the address phase may mean it is possible to do normal SPI
> writes using the direct mode.

Correct. And this is exactly why I'm using this direct write mode
for the FPGAs connected via SPI. To implement normal, optimized SPI
writes using this direct write mode.

> Quite a few Kirkwood boards have SPI flash, so if this brings better
> performance, it would be nice to be able to use it.

Yes. But I hope my explanation above makes it clear that this is not
easily implemented. So I suggest to just use the implementation I've
added with this patch for these special SPI devices like FPGAs
for fast bitstream downloading. And for this it is necessary, to make
this direct access mode optionally configurable on a per-SPI-device
basis.

Thanks,
Stefan
Stefan Roese March 24, 2016, 7:22 a.m. UTC | #17
On 23.03.2016 20:51, Arnd Bergmann wrote:
> On Wednesday 23 March 2016 14:56:06 Andrew Lunn wrote:
>> On Wed, Mar 23, 2016 at 01:36:12PM +0000, Mark Brown wrote:
>>> On Wed, Mar 23, 2016 at 02:26:37PM +0100, Andrew Lunn wrote:
>>>
>>>> The number of windows is limited. On i think the Armada XP, if you put
>>>> a PCIe device on every available PCIe bus, you can run out of
>>>> windows. This is why Thomas implemented dynamic allocation of the
>>>> Windows, so that only those that are needed are used.
>>>
>>>> So i would not statically and globally allocate as many windows as
>>>> possible SPI devices.
>>>
>>>> The fact that SPI can fall back to another mechanism if there are no
>>>> available windows, were as PCIe cannot, suggests that SPI should
>>>> dynamically allocate a window, and be prepared for it to fail.
>>>
>>>> Since only one SPI device can be active at once on a SPI bus, one
>>>> window per bus makes sense, and keeps the required number of windows
>>>> to a minimum.
>>>
>>> If we're under pressure for windows I'd go further and say that we
>>> should be dynamically allocating the windows only when they're actually
>>> in use (and modifying other drivers to do the same if that makes sense
>>> for them), unless it's somehow expensive to allocate windows that means
>>> that we should reduce the overall pressure.
>>
>> Hi Mark
>>
>> We are only under pressure in the extremes, i.e 10 PCIe busses in
>> use. Only allocating PCIe Windows when needed means in practice, we
>> have not had issues. We need to be mindful, and don't waste them, but
>> we don't need to consider them a scarce resource.
>>
>> I don't see it being an issue for the SPI driver to allocate one on
>> probe and keeping it until release. I probably would object to it
>> allocating one per chip select line.
> 
> I agree. We had a very long debate about this when we first added
> the support for MBus, and not much has changed. As far as I'm concerned,
> this is where we're at:
> 
> The binding defines a reasonable set of default settings for each
> device that uses MBus, and the OS can use them, or define its own.
> Coming up with an algorithm to do this right however is very hard
> and not really worth it, as defining the defaults works well enough.
> 
> Ideally we'd let the boot loader set up the windows statically and
> have the DT describe what they are, but unfortunately that is not
> what boot loaders in the field do.
> 
> The method that was picked for MBus is similar to how we typically
> handle PCI host bridges that have the same problem: you can set up
> translation windows between CPU addresses and the various PCI
> address spaces (I/O, mem, prefetchable, config, ...) in all sorts
> of ways, with various tradeoffs, but instead we just pick a
> reasonable default and describe it in the DT, because the kernel
> has no good generic way of picking a particular setting over another.

Arnd, thanks for your comments.

So we seem to agree, that one MBus window per SPI controller is the
way to go. Only how should this be described in the DT? I've come up
with these new DT properties in v2:

+- da-reg : The physical memory area that will be used for the direct
+           access mode, if enabled in one of the SPI devices.
+
+Per SPI device:
+- da-target-attribute : The target and attribute value for a specific
+                        SPI-controller / SPI-device combination.
+			E.g. <0x01 0x5e>: SPI0-CS1 target and attribute

...

+Example with direct access mode:
+	spi@10600 {
+		compatible = "marvell,orion-spi";
+		status = "okay";
+		da-reg = <0xf2000000 0x100000>;
+
+		spi-fpga@1 {
+			compatible = "altera,stratix-v";
+			reg = <1>;
+			/* 0x01 0x5e: SPI0-CS1 target and attribute */
+			da-target-attribute = <0x01 0x5e>;
+		};
+	};

I've added the "da-*" (Direct Access) properties to enable the
SPI driver to dynamically allocate the MBus windows. Do you find
these new bindings reasonable? Or do you have better suggestions for
this per-SPI-device dynamic MBus allocation, perhaps by using
MBUS_ID somehow?

Thanks,
Stefan
Mark Brown March 24, 2016, 11:23 a.m. UTC | #18
On Thu, Mar 24, 2016 at 06:45:35AM +0100, Stefan Roese wrote:

> Right. It is of course possible to use this direct mode to access
> SPI flash. Its just not the way how the SPI MTD driver uses the
> SPI controller. All information that is written to the SPI flash
> (e.g. commands, addresses) are passed as plain tx-data in the SPI
> message to the SPI controller. And it would be not easy to detect
> commands / addresses vs. "real data" in this orion SPI driver
> to put these values into the specific registers.

We now have extensions for passing commands through directly (see
spi_flash_read()).
Stefan Roese March 24, 2016, 12:05 p.m. UTC | #19
On 24.03.2016 12:23, Mark Brown wrote:
> On Thu, Mar 24, 2016 at 06:45:35AM +0100, Stefan Roese wrote:
> 
>> Right. It is of course possible to use this direct mode to access
>> SPI flash. Its just not the way how the SPI MTD driver uses the
>> SPI controller. All information that is written to the SPI flash
>> (e.g. commands, addresses) are passed as plain tx-data in the SPI
>> message to the SPI controller. And it would be not easy to detect
>> commands / addresses vs. "real data" in this orion SPI driver
>> to put these values into the specific registers.
> 
> We now have extensions for passing commands through directly (see
> spi_flash_read()).

Interesting. So this direct access mode can be used to interface
with SPI flash using these extensions. Great.

But I hope that its not a requirement that I add this SPI flash
support (via direct access mode) to the orion SPI driver to get
this patch accepted. As it really was not the scope of my project.
I can try to get this added later, but I can't make any promises
here. Perhaps someone else finds the time to implement this?

Thanks,
Stefan
Arnd Bergmann March 24, 2016, 12:42 p.m. UTC | #20
On Thursday 24 March 2016 08:22:36 Stefan Roese wrote:
> 
> Arnd, thanks for your comments.
> 
> So we seem to agree, that one MBus window per SPI controller is the
> way to go. Only how should this be described in the DT? I've come up
> with these new DT properties in v2:
> 
> +- da-reg : The physical memory area that will be used for the direct
> +           access mode, if enabled in one of the SPI devices.
> +
> +Per SPI device:
> +- da-target-attribute : The target and attribute value for a specific
> +                        SPI-controller / SPI-device combination.
> +                       E.g. <0x01 0x5e>: SPI0-CS1 target and attribute
> 
> ...
> 
> +Example with direct access mode:
> +       spi@10600 {
> +               compatible = "marvell,orion-spi";
> +               status = "okay";
> +               da-reg = <0xf2000000 0x100000>;
> +
> +               spi-fpga@1 {
> +                       compatible = "altera,stratix-v";
> +                       reg = <1>;
> +                       /* 0x01 0x5e: SPI0-CS1 target and attribute */
> +                       da-target-attribute = <0x01 0x5e>;
> +               };
> +       };
> 
> I've added the "da-*" (Direct Access) properties to enable the
> SPI driver to dynamically allocate the MBus windows. Do you find
> these new bindings reasonable? Or do you have better suggestions for
> this per-SPI-device dynamic MBus allocation, perhaps by using
> MBUS_ID somehow?

I was thinking it would be statically set up, but then we have a
problem with how it uses both internal-regs and and its own mbus
based reg, so we probably have to move the spi node outside of
the internal-regs node to achieve that, similar to how we handle
the devbus devices:


	soc@ {
		spi0 {
                       compatible = "marvell,armada-370-spi",
                                    "marvell,orion-spi";
			reg = <MBUS_ID(0xf0, 0x01) 0x10600 0x28>,
			      <MBUS_ID(0x01, 0x5e) 0 0x100000>;
                       #address-cells = <1>;
                       #size-cells = <0>;
                       pinctrl-0 = <&spi0_pins1>;
                       pinctrl-names = "default";
                       cell-index = <0>;
                       interrupts = <30>;
                       clocks = <&coreclk 0>;
                       status = "disabled";
		};
	};

	Arnd
Stefan Roese March 24, 2016, 4:15 p.m. UTC | #21
On 24.03.2016 13:42, Arnd Bergmann wrote:
> On Thursday 24 March 2016 08:22:36 Stefan Roese wrote:
>>
>> Arnd, thanks for your comments.
>>
>> So we seem to agree, that one MBus window per SPI controller is the
>> way to go. Only how should this be described in the DT? I've come up
>> with these new DT properties in v2:
>>
>> +- da-reg : The physical memory area that will be used for the direct
>> +           access mode, if enabled in one of the SPI devices.
>> +
>> +Per SPI device:
>> +- da-target-attribute : The target and attribute value for a specific
>> +                        SPI-controller / SPI-device combination.
>> +                       E.g. <0x01 0x5e>: SPI0-CS1 target and attribute
>>
>> ...
>>
>> +Example with direct access mode:
>> +       spi@10600 {
>> +               compatible = "marvell,orion-spi";
>> +               status = "okay";
>> +               da-reg = <0xf2000000 0x100000>;
>> +
>> +               spi-fpga@1 {
>> +                       compatible = "altera,stratix-v";
>> +                       reg = <1>;
>> +                       /* 0x01 0x5e: SPI0-CS1 target and attribute */
>> +                       da-target-attribute = <0x01 0x5e>;
>> +               };
>> +       };
>>
>> I've added the "da-*" (Direct Access) properties to enable the
>> SPI driver to dynamically allocate the MBus windows. Do you find
>> these new bindings reasonable? Or do you have better suggestions for
>> this per-SPI-device dynamic MBus allocation, perhaps by using
>> MBUS_ID somehow?
>
> I was thinking it would be statically set up,

Really statically? Please see below.

> but then we have a
> problem with how it uses both internal-regs and and its own mbus
> based reg, so we probably have to move the spi node outside of
> the internal-regs node to achieve that, similar to how we handle
> the devbus devices:
>
>
> 	soc@ {
> 		spi0 {
>                         compatible = "marvell,armada-370-spi",
>                                      "marvell,orion-spi";
> 			reg = <MBUS_ID(0xf0, 0x01) 0x10600 0x28>,
> 			      <MBUS_ID(0x01, 0x5e) 0 0x100000>;
>                         #address-cells = <1>;
>                         #size-cells = <0>;
>                         pinctrl-0 = <&spi0_pins1>;
>                         pinctrl-names = "default";
>                         cell-index = <0>;
>                         interrupts = <30>;
>                         clocks = <&coreclk 0>;
>                         status = "disabled";
> 		};
> 	};

Do I understand this correctly, that you suggest to list all MBus
windows here, that the SoC supports (e.g. 8 for the Armada XP).
And let the SPI driver then extract and dynamically enable (map)
the one that is currently used?

We also need a per-SPI-device DT property to enable this direct
access mode for this specific SPI device. As not all SPI devices
support this mode - at least not yet. How about this one:

		flash0: flash@0 {
			compatible = "m25p128";
			reg = <0>;
			direct-access-enable;
			...
		};

?

Thanks,
Stefan
Arnd Bergmann March 24, 2016, 4:42 p.m. UTC | #22
On Thursday 24 March 2016 17:15:49 Stefan Roese wrote:
> > but then we have a
> > problem with how it uses both internal-regs and and its own mbus
> > based reg, so we probably have to move the spi node outside of
> > the internal-regs node to achieve that, similar to how we handle
> > the devbus devices:
> >
> >
> >       soc@ {
> >               spi0 {
> >                         compatible = "marvell,armada-370-spi",
> >                                      "marvell,orion-spi";
> >                       reg = <MBUS_ID(0xf0, 0x01) 0x10600 0x28>,
> >                             <MBUS_ID(0x01, 0x5e) 0 0x100000>;
> >                         #address-cells = <1>;
> >                         #size-cells = <0>;
> >                         pinctrl-0 = <&spi0_pins1>;
> >                         pinctrl-names = "default";
> >                         cell-index = <0>;
> >                         interrupts = <30>;
> >                         clocks = <&coreclk 0>;
> >                         status = "disabled";
> >               };
> >       };
> 
> Do I understand this correctly, that you suggest to list all MBus
> windows here, that the SoC supports (e.g. 8 for the Armada XP).
> And let the SPI driver then extract and dynamically enable (map)
> the one that is currently used?

I had not realize that there is more than one per controller.
Is it one per chip-select, or how do you pick the right ones?

	Arnd
Arnd Bergmann March 24, 2016, 4:48 p.m. UTC | #23
On Thursday 24 March 2016 17:15:49 Stefan Roese wrote:
> >
> >       soc@ {
> >               spi0 {
> >                         compatible = "marvell,armada-370-spi",
> >                                      "marvell,orion-spi";
> >                       reg = <MBUS_ID(0xf0, 0x01) 0x10600 0x28>,
> >                             <MBUS_ID(0x01, 0x5e) 0 0x100000>;
> >                         #address-cells = <1>;
> >                         #size-cells = <0>;
> >                         pinctrl-0 = <&spi0_pins1>;
> >                         pinctrl-names = "default";
> >                         cell-index = <0>;
> >                         interrupts = <30>;
> >                         clocks = <&coreclk 0>;
> >                         status = "disabled";
> >               };
> >       };
> 
> Do I understand this correctly, that you suggest to list all MBus
> windows here, that the SoC supports (e.g. 8 for the Armada XP).
> And let the SPI driver then extract and dynamically enable (map)
> the one that is currently used?

Actually the syntax above doesn't imply any mapping at all, it
just describes how the hardware is wired up, so that really
needs to list all windows, and then the ranges property in the
soc node can statically map the ones that are used on the
particular machine.

> We also need a per-SPI-device DT property to enable this direct
> access mode for this specific SPI device. As not all SPI devices
> support this mode - at least not yet. How about this one:
> 
>                 flash0: flash@0 {
>                         compatible = "m25p128";
>                         reg = <0>;
>                         direct-access-enable;
>                         ...
>                 };
> 
> ?

Maybe the spi driver can just check whether the window is mapped
or not, and then have no ranges entry for the devices that don't
support it?

Alternatively, we might encode this in the 'reg' property in
some way?

How do we know whether a device supports the mode or not?

	Arnd
Stefan Roese March 24, 2016, 5:30 p.m. UTC | #24
On 24.03.2016 17:42, Arnd Bergmann wrote:
> On Thursday 24 March 2016 17:15:49 Stefan Roese wrote:
>>> but then we have a
>>> problem with how it uses both internal-regs and and its own mbus
>>> based reg, so we probably have to move the spi node outside of
>>> the internal-regs node to achieve that, similar to how we handle
>>> the devbus devices:
>>>
>>>
>>>        soc@ {
>>>                spi0 {
>>>                          compatible = "marvell,armada-370-spi",
>>>                                       "marvell,orion-spi";
>>>                        reg = <MBUS_ID(0xf0, 0x01) 0x10600 0x28>,
>>>                              <MBUS_ID(0x01, 0x5e) 0 0x100000>;
>>>                          #address-cells = <1>;
>>>                          #size-cells = <0>;
>>>                          pinctrl-0 = <&spi0_pins1>;
>>>                          pinctrl-names = "default";
>>>                          cell-index = <0>;
>>>                          interrupts = <30>;
>>>                          clocks = <&coreclk 0>;
>>>                          status = "disabled";
>>>                };
>>>        };
>>
>> Do I understand this correctly, that you suggest to list all MBus
>> windows here, that the SoC supports (e.g. 8 for the Armada XP).
>> And let the SPI driver then extract and dynamically enable (map)
>> the one that is currently used?
>
> I had not realize that there is more than one per controller.
> Is it one per chip-select, or how do you pick the right ones?

Yes, its one per chip-select (SPI device).

Thanks,
Stefan
Stefan Roese March 24, 2016, 5:51 p.m. UTC | #25
On 24.03.2016 17:48, Arnd Bergmann wrote:
> On Thursday 24 March 2016 17:15:49 Stefan Roese wrote:
>>>
>>>        soc@ {
>>>                spi0 {
>>>                          compatible = "marvell,armada-370-spi",
>>>                                       "marvell,orion-spi";
>>>                        reg = <MBUS_ID(0xf0, 0x01) 0x10600 0x28>,
>>>                              <MBUS_ID(0x01, 0x5e) 0 0x100000>;
>>>                          #address-cells = <1>;
>>>                          #size-cells = <0>;
>>>                          pinctrl-0 = <&spi0_pins1>;
>>>                          pinctrl-names = "default";
>>>                          cell-index = <0>;
>>>                          interrupts = <30>;
>>>                          clocks = <&coreclk 0>;
>>>                          status = "disabled";
>>>                };
>>>        };
>>
>> Do I understand this correctly, that you suggest to list all MBus
>> windows here, that the SoC supports (e.g. 8 for the Armada XP).
>> And let the SPI driver then extract and dynamically enable (map)
>> the one that is currently used?
>
> Actually the syntax above doesn't imply any mapping at all, it
> just describes how the hardware is wired up, so that really
> needs to list all windows, and then the ranges property in the
> soc node can statically map the ones that are used on the
> particular machine.

I see. But with this we are back to statically mapping the MBus
windows that are used. Resulting in potentially multiple windows
per SPI controller, which is not necessary (Andrew and Mark
objected against that).

>> We also need a per-SPI-device DT property to enable this direct
>> access mode for this specific SPI device. As not all SPI devices
>> support this mode - at least not yet. How about this one:
>>
>>                  flash0: flash@0 {
>>                          compatible = "m25p128";
>>                          reg = <0>;
>>                          direct-access-enable;
>>                          ...
>>                  };
>>
>> ?
>
> Maybe the spi driver can just check whether the window is mapped
> or not, and then have no ranges entry for the devices that don't
> support it?

Sure, if its statically mapped (via 'ranges' in the soc node),
then this is easy to implement in the driver. Its similar to what
the RFC version of the patch did.

> Alternatively, we might encode this in the 'reg' property in
> some way?
>
> How do we know whether a device supports the mode or not?

Currently, only a very limited number of SPI devices will support
it. As its really only useful right now for full-speed tx-transfer.
In my case its downloading a bitstream into a FPGA (only SPI
writes). In the future, SPI NOR flash devices can be supported.
But this needs additional work. So SPI devices should not use
this direct access mode as default for now. It needs to get
opted-in via some DT property.

Thanks,
Stefan

PS: I'm off into Easter vacation in a few hours. So I will follow-up
on this earliest in week 14.
Arnd Bergmann March 24, 2016, 8:07 p.m. UTC | #26
On Thursday 24 March 2016 18:51:53 Stefan Roese wrote:
> On 24.03.2016 17:48, Arnd Bergmann wrote:
> > On Thursday 24 March 2016 17:15:49 Stefan Roese wrote:
> >>>
> >>>        soc@ {
> >>>                spi0 {
> >>>                          compatible = "marvell,armada-370-spi",
> >>>                                       "marvell,orion-spi";
> >>>                        reg = <MBUS_ID(0xf0, 0x01) 0x10600 0x28>,
> >>>                              <MBUS_ID(0x01, 0x5e) 0 0x100000>;
> >>>                          #address-cells = <1>;
> >>>                          #size-cells = <0>;
> >>>                          pinctrl-0 = <&spi0_pins1>;
> >>>                          pinctrl-names = "default";
> >>>                          cell-index = <0>;
> >>>                          interrupts = <30>;
> >>>                          clocks = <&coreclk 0>;
> >>>                          status = "disabled";
> >>>                };
> >>>        };
> >>
> >> Do I understand this correctly, that you suggest to list all MBus
> >> windows here, that the SoC supports (e.g. 8 for the Armada XP).
> >> And let the SPI driver then extract and dynamically enable (map)
> >> the one that is currently used?
> >
> > Actually the syntax above doesn't imply any mapping at all, it
> > just describes how the hardware is wired up, so that really
> > needs to list all windows, and then the ranges property in the
> > soc node can statically map the ones that are used on the
> > particular machine.
> 
> I see. But with this we are back to statically mapping the MBus
> windows that are used. Resulting in potentially multiple windows
> per SPI controller, which is not necessary (Andrew and Mark
> objected against that).

I'm not following here. Do you mean we should set up and tear
down the windows in the runtime PM callbacks so they are only
present when we actually access a device instead?

If a controller has multiple high-speed devices connected to
it, and we want them all to use the direct mapping, it doesn't
seem overly wasteful to have one mbus window per device, but
doing the dynamic switching doesn't seem wrong either.

Among the dts files we ship with the kernel, how many would
actually use more than one mapping in practice if we decide to
do the static ranges property? I had a quick look and could not
even find one that has more than one chip-select connected.

> > Alternatively, we might encode this in the 'reg' property in
> > some way?
> >
> > How do we know whether a device supports the mode or not?
> 
> Currently, only a very limited number of SPI devices will support
> it. As its really only useful right now for full-speed tx-transfer.
> In my case its downloading a bitstream into a FPGA (only SPI
> writes). In the future, SPI NOR flash devices can be supported.
> But this needs additional work. So SPI devices should not use
> this direct access mode as default for now. It needs to get
> opted-in via some DT property.
 
Ok, so with the static mapping it could be done very easily, or
we need a more complex solution for the dynamic mapping.

	Arnd
Mark Brown March 25, 2016, 10:32 a.m. UTC | #27
On Thu, Mar 24, 2016 at 09:07:32PM +0100, Arnd Bergmann wrote:
> On Thursday 24 March 2016 18:51:53 Stefan Roese wrote:

> > I see. But with this we are back to statically mapping the MBus
> > windows that are used. Resulting in potentially multiple windows
> > per SPI controller, which is not necessary (Andrew and Mark
> > objected against that).

> I'm not following here. Do you mean we should set up and tear
> down the windows in the runtime PM callbacks so they are only
> present when we actually access a device instead?

What we're trying to avoid is allocating a window to every single client
device attached to SPI, the controller can only talk to one device at
once anyway so having one per device is wasteful when this is a limited
resource.  Looking at this from the point of view of a SPI driver it
just seems obviously bad, in order to get the best performance with this
controller we need to do a special per-device magic binding.  That's not
good for usability, it's not ideal having to do it per controller but at
least then it's done for the SoC.

We also already apparently have some dynamic code for PCIe. 

> Among the dts files we ship with the kernel, how many would
> actually use more than one mapping in practice if we decide to
> do the static ranges property? I had a quick look and could not
> even find one that has more than one chip-select connected.

Do any of these boards have things like expansion connectors?

> Ok, so with the static mapping it could be done very easily, or
> we need a more complex solution for the dynamic mapping.

Part of what I personally don't understand is why this is complicated?
Arnd Bergmann March 25, 2016, 3:11 p.m. UTC | #28
On Friday 25 March 2016 10:32:53 Mark Brown wrote:
> On Thu, Mar 24, 2016 at 09:07:32PM +0100, Arnd Bergmann wrote:
> > On Thursday 24 March 2016 18:51:53 Stefan Roese wrote:
> 
> > > I see. But with this we are back to statically mapping the MBus
> > > windows that are used. Resulting in potentially multiple windows
> > > per SPI controller, which is not necessary (Andrew and Mark
> > > objected against that).
> 
> > I'm not following here. Do you mean we should set up and tear
> > down the windows in the runtime PM callbacks so they are only
> > present when we actually access a device instead?
> 
> What we're trying to avoid is allocating a window to every single client
> device attached to SPI, the controller can only talk to one device at
> once anyway so having one per device is wasteful when this is a limited
> resource.  Looking at this from the point of view of a SPI driver it
> just seems obviously bad, in order to get the best performance with this
> controller we need to do a special per-device magic binding.  That's not
> good for usability, it's not ideal having to do it per controller but at
> least then it's done for the SoC.
> 
> We also already apparently have some dynamic code for PCIe. 

The PCI implementation is a bit of a layering violation, as the mbus
binding has to know about the special reserved area that is used for
dynamic PCI mappings.

There is nothing magic in the binding if we just do the same thing
the flash driver does, and describe the memory range that is associated
with a chipselect.

> > Among the dts files we ship with the kernel, how many would
> > actually use more than one mapping in practice if we decide to
> > do the static ranges property? I had a quick look and could not
> > even find one that has more than one chip-select connected.
> 
> Do any of these boards have things like expansion connectors?

I think they are basically all consumer products, e.g. NAS systems
or routers that just have a single SPI-NOR flash.

> > Ok, so with the static mapping it could be done very easily, or
> > we need a more complex solution for the dynamic mapping.
> 
> Part of what I personally don't understand is why this is complicated?

I think we'd need to add another special case in the bus driver
for it, which otherwise should be able to handle this in a generic
way: if we just use the existing binding, the spi host driver can
simply call devm_ioremap_resource() to see if there is a map
for a given chipselect and otherwise fall back to the current
mode.

	Arnd
Mark Brown March 25, 2016, 3:50 p.m. UTC | #29
On Fri, Mar 25, 2016 at 04:11:05PM +0100, Arnd Bergmann wrote:

> There is nothing magic in the binding if we just do the same thing
> the flash driver does, and describe the memory range that is associated
> with a chipselect.

Why are we even doing that though?  Like I say it just seems pointlessly
unhelpful for users.  Are we really saying that every single system has
to go through and manually modify their DT so that they can use this
entirely in SoC feature?  That doesn't seem like winning...

> > > Ok, so with the static mapping it could be done very easily, or
> > > we need a more complex solution for the dynamic mapping.

> > Part of what I personally don't understand is why this is complicated?

> I think we'd need to add another special case in the bus driver
> for it, which otherwise should be able to handle this in a generic
> way: if we just use the existing binding, the spi host driver can
> simply call devm_ioremap_resource() to see if there is a map
> for a given chipselect and otherwise fall back to the current
> mode.

Why does the binding have to be in the client device to do that?
Arnd Bergmann March 25, 2016, 8:58 p.m. UTC | #30
On Friday 25 March 2016 15:50:32 Mark Brown wrote:
> On Fri, Mar 25, 2016 at 04:11:05PM +0100, Arnd Bergmann wrote:
> 
> > There is nothing magic in the binding if we just do the same thing
> > the flash driver does, and describe the memory range that is associated
> > with a chipselect.
> 
> Why are we even doing that though?  Like I say it just seems pointlessly
> unhelpful for users.  Are we really saying that every single system has
> to go through and manually modify their DT so that they can use this
> entirely in SoC feature?  That doesn't seem like winning...

I don't remember the exact details, but I think we had come up with
a generic constraint solving algorithm to pick appropriate windows for
each device with a (close to) minimal use of physical address space,
and we defined the DT binding for MBus in a way that would let us
implement that algorithm at a later point if we ever found we needed
it, but then decided against implementing it because it puts a lot
of complexity at the very early boot: it needs to do a deep walk
of all enabled child device nodes to find the size and alignment of
every reg property that translates into an mbus address, and figure out
a set of mbus translations that convert those into CPU phys addresses
without using too many mbus windows or too much address space.

> > > > Ok, so with the static mapping it could be done very easily, or
> > > > we need a more complex solution for the dynamic mapping.
> 
> > > Part of what I personally don't understand is why this is complicated?
> 
> > I think we'd need to add another special case in the bus driver
> > for it, which otherwise should be able to handle this in a generic
> > way: if we just use the existing binding, the spi host driver can
> > simply call devm_ioremap_resource() to see if there is a map
> > for a given chipselect and otherwise fall back to the current
> > mode.
> 
> Why does the binding have to be in the client device to do that?

I don't understand the question. The client device here is the SPI
controller, right? That doesn't need to have a special binding, it
just needs to list the registers relative to the mbus address space
like any other device, and that is independent of how we implement
it.

It looks like Stefan's patch tricks here by creating an incompatible
binding for mbus that bypasses the address mapping.

	Arnd
Mark Brown March 25, 2016, 10:39 p.m. UTC | #31
On Fri, Mar 25, 2016 at 09:58:40PM +0100, Arnd Bergmann wrote:
> On Friday 25 March 2016 15:50:32 Mark Brown wrote:

> > Why are we even doing that though?  Like I say it just seems pointlessly
> > unhelpful for users.  Are we really saying that every single system has
> > to go through and manually modify their DT so that they can use this
> > entirely in SoC feature?  That doesn't seem like winning...

> I don't remember the exact details, but I think we had come up with
> a generic constraint solving algorithm to pick appropriate windows for
> each device with a (close to) minimal use of physical address space,

So there are some complicated constraints that make it hard to allocate
from the bus space?

> > > way: if we just use the existing binding, the spi host driver can
> > > simply call devm_ioremap_resource() to see if there is a map
> > > for a given chipselect and otherwise fall back to the current
> > > mode.

> > Why does the binding have to be in the client device to do that?

> I don't understand the question. The client device here is the SPI
> controller, right? That doesn't need to have a special binding, it
> just needs to list the registers relative to the mbus address space
> like any other device, and that is independent of how we implement
> it.

If you're saying we're doing a binding per chip select presumably that
binding appears in the SPI device even if it's the controller driver
that parses it.  This was what the original patch did, I pushed back
since it's a fairly common pattern for poor quality bindings where
people for some reason want to add per device properties for whatever
new feature they want to support instead of doing it at the controller
level (and ideally just doing it without requring any configuration).

> It looks like Stefan's patch tricks here by creating an incompatible
> binding for mbus that bypasses the address mapping.

I've not looked at the patch and honestly nobody has been talking about
any generic binding, all I've seen is the per device property in the
original patch.
Arnd Bergmann March 29, 2016, 12:39 p.m. UTC | #32
On Friday 25 March 2016 22:39:22 Mark Brown wrote:
> On Fri, Mar 25, 2016 at 09:58:40PM +0100, Arnd Bergmann wrote:
> > On Friday 25 March 2016 15:50:32 Mark Brown wrote:
> 
> > > Why are we even doing that though?  Like I say it just seems pointlessly
> > > unhelpful for users.  Are we really saying that every single system has
> > > to go through and manually modify their DT so that they can use this
> > > entirely in SoC feature?  That doesn't seem like winning...
> 
> > I don't remember the exact details, but I think we had come up with
> > a generic constraint solving algorithm to pick appropriate windows for
> > each device with a (close to) minimal use of physical address space,
> 
> So there are some complicated constraints that make it hard to allocate
> from the bus space?

Yes, it's basically an optimization problem: the closer you can squeeze
everything together, the more address space you have left for RAM and/or
PCI.

From the Armada XP data sheet, I can also see that it's possible to map
multiple SPI slaves into the MMIO space using a single MBus window, which
can be useful if  you have plenty of address space but a shortage of
mbus window registers.

> > > > way: if we just use the existing binding, the spi host driver can
> > > > simply call devm_ioremap_resource() to see if there is a map
> > > > for a given chipselect and otherwise fall back to the current
> > > > mode.
> 
> > > Why does the binding have to be in the client device to do that?
> 
> > I don't understand the question. The client device here is the SPI
> > controller, right? That doesn't need to have a special binding, it
> > just needs to list the registers relative to the mbus address space
> > like any other device, and that is independent of how we implement
> > it.
> 
> If you're saying we're doing a binding per chip select presumably that
> binding appears in the SPI device even if it's the controller driver
> that parses it.  This was what the original patch did, I pushed back
> since it's a fairly common pattern for poor quality bindings where
> people for some reason want to add per device properties for whatever
> new feature they want to support instead of doing it at the controller
> level (and ideally just doing it without requring any configuration).
>
> > It looks like Stefan's patch tricks here by creating an incompatible
> > binding for mbus that bypasses the address mapping.
> 
> I've not looked at the patch and honestly nobody has been talking about
> any generic binding, all I've seen is the per device property in the
> original patch.

I've looked up the original patch now, and I agree with your concern.
It's not at all what I suggest here though. We should not need any
properties in the spi slave node, and I think all added properties in
the spi master node should be static, and not refer to the configuration
at all.

In the DT node for the spi master, we already have a register range
for the SPI's control registers, and this is an address in the
domain of the MBus controller, which we treat as a 64-bit flat address
space in DT, specifically it is part of the 'internal-regs' bus that
has a 20-bit section out of that 64-bit bus:

        soc {
                #address-cells = <2>;
                #size-cells = <1>;
                internal-regs {
                        compatible = "simple-bus";
                        #address-cells = <1>;
                        #size-cells = <1>;
                        ranges = <0 MBUS_ID(0xf0, 0x01) 0 0x100000>;

                        spi0: spi@10600 {
				compatible = "marvell,armada-370-spi";
                                reg = <0x10600 0x28>;
                                #address-cells = <1>;
                                #size-cells = <0>;
                        };
		};
	};

What we really have though is additional registers that belong to
the SPI controller and that are not part of internal-regs, so
we need to move it up one level out of the internal-regs node in order
to add more registers:

        soc {
                #address-cells = <2>;
                #size-cells = <1>;
               spi0: spi@10600 {
			compatible = "marvell,armada-370-spi";
                       reg = <MBUS_ID(0xf0, 0x01) 0x10600 0x28>, /* control */
			       <MBUS_ID(0x01, 0x1e) 0 0x100000>; /* CS0 */
			       <MBUS_ID(0x01, 0x5e) 0 0x100000>; /* CS1 */
			       <MBUS_ID(0x01, 0x9e) 0 0x100000>; /* CS2 */
			       <MBUS_ID(0x01, 0xee) 0 0x100000>; /* CS3 */
			       <MBUS_ID(0x01, 0x1f) 0 0x100000>; /* CS4 */
			       <MBUS_ID(0x01, 0x5f) 0 0x100000>; /* CS5 */
			       <MBUS_ID(0x01, 0x9f) 0 0x100000>; /* CS6 */
			       <MBUS_ID(0x01, 0xef) 0 0x100000>; /* CS7 */
                       #address-cells = <1>;
                       #size-cells = <0>;
		};
	};

This lists all the addresses as seen from the MBus, but doesn't put them
into the CPU address space, which is then done by adding an entry in the
'ranges' property of the soc node:

	soc {
		ranges = <MBUS_ID(0x01, 0x1e) 0  0xe0000000  0x20000>; /* SPI 0 CS0 128kb */
		...
	};

I think it makes a lot of sense to define a separate window for each CS
at boot time, just so you don't have to reprogram the windows manually.
After all, the entire point of the direct mode is to avoid having to
do any of the setup work, and have the SPI master set the right CS
itself based on the MBus ID that is used for accessing the slave mmio
window. This is also required if we want to enable things like
XIP (DaX) mappings for file systems on a SPI-NOR flash.

	Arnd
Mark Brown March 29, 2016, 4:47 p.m. UTC | #33
On Tue, Mar 29, 2016 at 02:39:10PM +0200, Arnd Bergmann wrote:
> On Friday 25 March 2016 22:39:22 Mark Brown wrote:

> > So there are some complicated constraints that make it hard to allocate
> > from the bus space?

> Yes, it's basically an optimization problem: the closer you can squeeze
> everything together, the more address space you have left for RAM and/or
> PCI.

But there are allocation constraints that make that hard?

> What we really have though is additional registers that belong to
> the SPI controller and that are not part of internal-regs, so
> we need to move it up one level out of the internal-regs node in order
> to add more registers:

That seems fine, it's part of the controller binding.

> I think it makes a lot of sense to define a separate window for each CS
> at boot time, just so you don't have to reprogram the windows manually.
> After all, the entire point of the direct mode is to avoid having to
> do any of the setup work, and have the SPI master set the right CS
> itself based on the MBus ID that is used for accessing the slave mmio
> window. This is also required if we want to enable things like
> XIP (DaX) mappings for file systems on a SPI-NOR flash.

Well, in the cases where we have one device on the bus then it's not a
big deal since we can check what the last thing we set was.  The direct
access stuff is going to have trouble if we have multiple devices on the
bus since we try to mix it with non-MMIO access we run the risk of
conflicting simultaneous use unless we continue to route everything
through the SPI subsystem (like we do with the current flash read
support).

It really only makes a difference if the reprogramming process happens a
lot and is expensive relative to the transfers and that doesn't seem
like something I'd expect.
Arnd Bergmann March 29, 2016, 7:49 p.m. UTC | #34
On Tuesday 29 March 2016 09:47:58 Mark Brown wrote:
> 
> > I think it makes a lot of sense to define a separate window for each CS
> > at boot time, just so you don't have to reprogram the windows manually.
> > After all, the entire point of the direct mode is to avoid having to
> > do any of the setup work, and have the SPI master set the right CS
> > itself based on the MBus ID that is used for accessing the slave mmio
> > window. This is also required if we want to enable things like
> > XIP (DaX) mappings for file systems on a SPI-NOR flash.
> 
> Well, in the cases where we have one device on the bus then it's not a
> big deal since we can check what the last thing we set was.  The direct
> access stuff is going to have trouble if we have multiple devices on the
> bus since we try to mix it with non-MMIO access we run the risk of
> conflicting simultaneous use unless we continue to route everything
> through the SPI subsystem (like we do with the current flash read
> support).
> 
> It really only makes a difference if the reprogramming process happens a
> lot and is expensive relative to the transfers and that doesn't seem
> like something I'd expect.

Maybe we can avoid that if we enforce at the driver level that we
use the same mode for all slaves? The way I read the manual, I think
that's how it is intended at least.

Also, as mentioned we don't have any machine with more than one SPI
slave so far, so we don't really need to overengineer it and can
go for the simplest implementation in the SPI master driver.

	Arnd
Mark Brown March 29, 2016, 7:52 p.m. UTC | #35
On Tue, Mar 29, 2016 at 09:49:09PM +0200, Arnd Bergmann wrote:
> On Tuesday 29 March 2016 09:47:58 Mark Brown wrote:

> > Well, in the cases where we have one device on the bus then it's not a
> > big deal since we can check what the last thing we set was.  The direct
> > access stuff is going to have trouble if we have multiple devices on the
> > bus since we try to mix it with non-MMIO access we run the risk of
> > conflicting simultaneous use unless we continue to route everything
> > through the SPI subsystem (like we do with the current flash read
> > support).

> Maybe we can avoid that if we enforce at the driver level that we
> use the same mode for all slaves? The way I read the manual, I think
> that's how it is intended at least.

Well, we currently don't have any XIP or whatever support at all, it's
still only in the SPI operations so it's academic - that'd be a future
issue if we did start doing things that accessed the memory map outside
of the SPI flow.

> Also, as mentioned we don't have any machine with more than one SPI
> slave so far, so we don't really need to overengineer it and can
> go for the simplest implementation in the SPI master driver.

For me the simplest thing seems like just using one window for all the
devices, it seems more likely to deliver useful results without the
system integrator having to think about how to configure this for
optimisation.
Arnd Bergmann March 29, 2016, 8:04 p.m. UTC | #36
On Tuesday 29 March 2016 12:52:30 Mark Brown wrote:
> On Tue, Mar 29, 2016 at 09:49:09PM +0200, Arnd Bergmann wrote:
> > On Tuesday 29 March 2016 09:47:58 Mark Brown wrote:
> 
> > > Well, in the cases where we have one device on the bus then it's not a
> > > big deal since we can check what the last thing we set was.  The direct
> > > access stuff is going to have trouble if we have multiple devices on the
> > > bus since we try to mix it with non-MMIO access we run the risk of
> > > conflicting simultaneous use unless we continue to route everything
> > > through the SPI subsystem (like we do with the current flash read
> > > support).
> 
> > Maybe we can avoid that if we enforce at the driver level that we
> > use the same mode for all slaves? The way I read the manual, I think
> > that's how it is intended at least.
> 
> Well, we currently don't have any XIP or whatever support at all, it's
> still only in the SPI operations so it's academic - that'd be a future
> issue if we did start doing things that accessed the memory map outside
> of the SPI flow.

Isn't this just about implementing .point()/.unpoint() in the spi-nor
driver?

> > Also, as mentioned we don't have any machine with more than one SPI
> > slave so far, so we don't really need to overengineer it and can
> > go for the simplest implementation in the SPI master driver.
> 
> For me the simplest thing seems like just using one window for all the
> devices, it seems more likely to deliver useful results without the
> system integrator having to think about how to configure this for
> optimisation.

Do you mean single-window mode? I don't see the difference to the other
modes, but that's certainly fine with me if it helps.

	Arnd
Mark Brown March 29, 2016, 9 p.m. UTC | #37
On Tue, Mar 29, 2016 at 10:04:59PM +0200, Arnd Bergmann wrote:
> On Tuesday 29 March 2016 12:52:30 Mark Brown wrote:

> > Well, we currently don't have any XIP or whatever support at all, it's
> > still only in the SPI operations so it's academic - that'd be a future
> > issue if we did start doing things that accessed the memory map outside
> > of the SPI flow.

> Isn't this just about implementing .point()/.unpoint() in the spi-nor
> driver?

Implementing that would require the SPI core to export something
providing access to the memory regions.  We don't currently do that,
it'd require us to deal with the interaction with access as a SPI device
(this hardware often seems to only accelerate reads so we still need to
use regular SPI access for writes).

> > For me the simplest thing seems like just using one window for all the
> > devices, it seems more likely to deliver useful results without the
> > system integrator having to think about how to configure this for
> > optimisation.

> Do you mean single-window mode? I don't see the difference to the other
> modes, but that's certainly fine with me if it helps.

I don't like having to assign windows per device, like I say it seems
like more work.
Arnd Bergmann March 29, 2016, 9:08 p.m. UTC | #38
On Tuesday 29 March 2016 14:00:18 Mark Brown wrote:
> On Tue, Mar 29, 2016 at 10:04:59PM +0200, Arnd Bergmann wrote:
> > On Tuesday 29 March 2016 12:52:30 Mark Brown wrote:
> 
> > > Well, we currently don't have any XIP or whatever support at all, it's
> > > still only in the SPI operations so it's academic - that'd be a future
> > > issue if we did start doing things that accessed the memory map outside
> > > of the SPI flow.
> 
> > Isn't this just about implementing .point()/.unpoint() in the spi-nor
> > driver?
> 
> Implementing that would require the SPI core to export something
> providing access to the memory regions.  We don't currently do that,
> it'd require us to deal with the interaction with access as a SPI device
> (this hardware often seems to only accelerate reads so we still need to
> use regular SPI access for writes).

The manual lists both read and write accesses as handled through
direct mode.

> > > For me the simplest thing seems like just using one window for all the
> > > devices, it seems more likely to deliver useful results without the
> > > system integrator having to think about how to configure this for
> > > optimisation.
> 
> > Do you mean single-window mode? I don't see the difference to the other
> > modes, but that's certainly fine with me if it helps.
> 
> I don't like having to assign windows per device, like I say it seems
> like more work.

Ok, but that's something else that can be implemented independently
as I explained earlier: We currently rely on the windows to be assigned
in the DT, but it's also possible to extend the mbus driver in a way
that lets it pick its own windows in one way or another. It should
not impact the binding for the SPI host controller in any way, the
binding just describes how the internal addressing is wired up to
the mbus in hardware, and the mbus driver takes care of mapping those
into CPU addresses.

	Arnd
Mark Brown March 29, 2016, 9:28 p.m. UTC | #39
On Tue, Mar 29, 2016 at 11:08:55PM +0200, Arnd Bergmann wrote:
> On Tuesday 29 March 2016 14:00:18 Mark Brown wrote:

> > Implementing that would require the SPI core to export something
> > providing access to the memory regions.  We don't currently do that,
> > it'd require us to deal with the interaction with access as a SPI device
> > (this hardware often seems to only accelerate reads so we still need to
> > use regular SPI access for writes).

> The manual lists both read and write accesses as handled through
> direct mode.

That is not in general the case for this sort of hardware, and there's
also erase support to consider here.

> > I don't like having to assign windows per device, like I say it seems
> > like more work.

> Ok, but that's something else that can be implemented independently
> as I explained earlier: We currently rely on the windows to be assigned
> in the DT, but it's also possible to extend the mbus driver in a way
> that lets it pick its own windows in one way or another. It should

That's what I've been asking for but people keep saying it's not
possible...

> not impact the binding for the SPI host controller in any way, the
> binding just describes how the internal addressing is wired up to
> the mbus in hardware, and the mbus driver takes care of mapping those
> into CPU addresses.

If we're able to do it automatically without baking this into the DT and
there's enough space to cope with this and PCI together then that's more
reasonable.
Arnd Bergmann March 29, 2016, 10:04 p.m. UTC | #40
On Tuesday 29 March 2016 14:28:42 Mark Brown wrote:
> On Tue, Mar 29, 2016 at 11:08:55PM +0200, Arnd Bergmann wrote:
> > On Tuesday 29 March 2016 14:00:18 Mark Brown wrote:
> 
> > > Implementing that would require the SPI core to export something
> > > providing access to the memory regions.  We don't currently do that,
> > > it'd require us to deal with the interaction with access as a SPI device
> > > (this hardware often seems to only accelerate reads so we still need to
> > > use regular SPI access for writes).
> 
> > The manual lists both read and write accesses as handled through
> > direct mode.
> 
> That is not in general the case for this sort of hardware, and there's
> also erase support to consider here.

Sorry for being unclear, I was referring to spi writes here, not flash
writes.

The MTD infrastructure handles _point internally, and I think it is
only ever used on jffs2 these days. I had assumed that _point was
used by mtd_block to implement direct_access for read-only block
based file systems as well, but it's not.

I don't think there is any device that abstracts a SPI-NOR flash
to look like RAM.

> > > I don't like having to assign windows per device, like I say it seems
> > > like more work.
> 
> > Ok, but that's something else that can be implemented independently
> > as I explained earlier: We currently rely on the windows to be assigned
> > in the DT, but it's also possible to extend the mbus driver in a way
> > that lets it pick its own windows in one way or another. It should
> 
> That's what I've been asking for but people keep saying it's not
> possible...

It's not impossible, it's just that IIRC we decided that the effort
to do this well was not reasonable compared to the benefit back
when the mbus driver got added.

I don't think anything has changed here.

> > not impact the binding for the SPI host controller in any way, the
> > binding just describes how the internal addressing is wired up to
> > the mbus in hardware, and the mbus driver takes care of mapping those
> > into CPU addresses.
> 
> If we're able to do it automatically without baking this into the DT and
> there's enough space to cope with this and PCI together then that's more
> reasonable.

Ok, good. Then let's just do the SPI host driver like all the other
mbus slaves using whatever mbus provides, and implement dynamic remapping
of mbus windows whenever someone gets too annoyed with having to 
come up with a mapping for their board.

At least by doing SPI like all the other devices, we don't make it
harder to come up with a generic algorithm for assigning the windows
at boot time.

	Arnd
Stefan Roese April 5, 2016, 7:11 a.m. UTC | #41
On 30.03.2016 00:04, Arnd Bergmann wrote:
> On Tuesday 29 March 2016 14:28:42 Mark Brown wrote:
>> On Tue, Mar 29, 2016 at 11:08:55PM +0200, Arnd Bergmann wrote:
>>> On Tuesday 29 March 2016 14:00:18 Mark Brown wrote:

<snip>

>>> not impact the binding for the SPI host controller in any way, the
>>> binding just describes how the internal addressing is wired up to
>>> the mbus in hardware, and the mbus driver takes care of mapping those
>>> into CPU addresses.
>>
>> If we're able to do it automatically without baking this into the DT and
>> there's enough space to cope with this and PCI together then that's more
>> reasonable.
> 
> Ok, good. Then let's just do the SPI host driver like all the other
> mbus slaves using whatever mbus provides, and implement dynamic remapping
> of mbus windows whenever someone gets too annoyed with having to
> come up with a mapping for their board.
> 
> At least by doing SPI like all the other devices, we don't make it
> harder to come up with a generic algorithm for assigning the windows
> at boot time.

Okay, I'll implement the static mapping as suggested by Arnd in the
'reg' property of the SPI controller (additionally to the SPI controller
register range). Which will get configured and enabled in the 'ranges'
property of the per-board 'soc' node.

This will make the SPI driver patch very simple but will generate quite
a big Armada dts change, moving all the SPI controllers from the
'internal-regs' node to the 'soc' node though.

Please let me know if this is not the consensus.

Thanks,
Stefan
Andrew Lunn April 5, 2016, 1:15 p.m. UTC | #42
> This will make the SPI driver patch very simple but will generate quite
> a big Armada dts change, moving all the SPI controllers from the
> 'internal-regs' node to the 'soc' node though.

Hi Stefan

There was been talk about making use of the labels in dts files,
rather than keeping with the tree structure. So while you are moving
the spi nodes, made you can also change the style.

    Andrew
Stefan Roese April 5, 2016, 1:20 p.m. UTC | #43
Hi Andrew,

On 05.04.2016 15:15, Andrew Lunn wrote:
>> This will make the SPI driver patch very simple but will generate quite
>> a big Armada dts change, moving all the SPI controllers from the
>> 'internal-regs' node to the 'soc' node though.
>
> Hi Stefan
>
> There was been talk about making use of the labels in dts files,
> rather than keeping with the tree structure. So while you are moving
> the spi nodes, made you can also change the style.

Yes, I wanted to suggest this change as well. I hope its
acceptable that I only change the SPI nodes to using the
"labels style" in this attempt though?

Thanks,
Stefan
Andrew Lunn April 5, 2016, 1:31 p.m. UTC | #44
> Yes, I wanted to suggest this change as well. I hope its
> acceptable that I only change the SPI nodes to using the
> "labels style" in this attempt though?

Gregory and Thomas should give their O.K. for armada-* files. If you
need to change kirkwood-*, dove-* or orion5-* files, im happy with
label style.

      Andrew
diff mbox

Patch

diff --git a/Documentation/devicetree/bindings/spi/spi-orion.txt b/Documentation/devicetree/bindings/spi/spi-orion.txt
index 98bc698..b46ea0f 100644
--- a/Documentation/devicetree/bindings/spi/spi-orion.txt
+++ b/Documentation/devicetree/bindings/spi/spi-orion.txt
@@ -12,6 +12,13 @@  Required properties:
 - cell-index : Which of multiple SPI controllers is this.
 Optional properties:
 - interrupts : Is currently not used.
+- da-reg : The physical memory area that will be used for the direct
+           access mode, if enabled in one of the SPI devices.
+
+Per SPI device:
+- da-target-attribute : The target and attribute value for a specific
+                        SPI-controller / SPI-device combination.
+			E.g. <0x01 0x5e>: SPI0-CS1 target and attribute
 
 Example:
        spi@10600 {
@@ -23,3 +30,17 @@  Example:
 	       interrupts = <23>;
 	       status = "disabled";
        };
+
+Example with direct access mode:
+	spi@10600 {
+		compatible = "marvell,orion-spi";
+		status = "okay";
+		da-reg = <0xf2000000 0x100000>;
+
+		spi-fpga@1 {
+			compatible = "altera,stratix-v";
+			reg = <1>;
+			/* 0x01 0x5e: SPI0-CS1 target and attribute */
+			da-target-attribute = <0x01 0x5e>;
+		};
+	};
diff --git a/drivers/spi/spi-orion.c b/drivers/spi/spi-orion.c
index a87cfd4..8b23198 100644
--- a/drivers/spi/spi-orion.c
+++ b/drivers/spi/spi-orion.c
@@ -15,9 +15,11 @@ 
 #include <linux/err.h>
 #include <linux/io.h>
 #include <linux/spi/spi.h>
+#include <linux/mbus.h>
 #include <linux/module.h>
 #include <linux/pm_runtime.h>
 #include <linux/of.h>
+#include <linux/of_address.h>
 #include <linux/of_device.h>
 #include <linux/clk.h>
 #include <linux/sizes.h>
@@ -43,6 +45,9 @@ 
 #define ORION_SPI_INT_CAUSE_REG		0x10
 #define ORION_SPI_TIMING_PARAMS_REG	0x18
 
+/* Register for the "Direct Mode" */
+#define SPI_DIRECT_WRITE_CONFIG_REG	0x20
+
 #define ORION_SPI_TMISO_SAMPLE_MASK	(0x3 << 6)
 #define ORION_SPI_TMISO_SAMPLE_1	(1 << 6)
 #define ORION_SPI_TMISO_SAMPLE_2	(2 << 6)
@@ -78,11 +83,24 @@  struct orion_spi_dev {
 	bool			is_errata_50mhz_ac;
 };
 
+struct target_attribute {
+	bool			enabled;
+	u32			tar;
+	u32			attr;
+};
+
 struct orion_spi {
 	struct spi_master	*master;
 	void __iomem		*base;
 	struct clk              *clk;
 	const struct orion_spi_dev *devdata;
+
+	/* Direct access */
+	int			da_active_cs;
+	phys_addr_t		da_base;
+	u32			da_size;
+	void __iomem		*da_vaddr;
+	struct target_attribute	da_tarattr[ORION_NUM_CHIPSELECTS];
 };
 
 static inline void __iomem *spi_reg(struct orion_spi *orion_spi, u32 reg)
@@ -277,18 +295,44 @@  orion_spi_setup_transfer(struct spi_device *spi, struct spi_transfer *t)
 static void orion_spi_set_cs(struct spi_device *spi, bool enable)
 {
 	struct orion_spi *orion_spi;
+	int cs = spi->chip_select;
 
 	orion_spi = spi_master_get_devdata(spi->master);
 
 	orion_spi_clrbits(orion_spi, ORION_SPI_IF_CTRL_REG, ORION_SPI_CS_MASK);
-	orion_spi_setbits(orion_spi, ORION_SPI_IF_CTRL_REG,
-				ORION_SPI_CS(spi->chip_select));
+	orion_spi_setbits(orion_spi, ORION_SPI_IF_CTRL_REG, ORION_SPI_CS(cs));
 
 	/* Chip select logic is inverted from spi_set_cs */
 	if (!enable)
 		orion_spi_setbits(orion_spi, ORION_SPI_IF_CTRL_REG, 0x1);
 	else
 		orion_spi_clrbits(orion_spi, ORION_SPI_IF_CTRL_REG, 0x1);
+
+	/* Remap the MBus window for direct access if enabled */
+	if (orion_spi->da_tarattr[cs].enabled) {
+		int rc;
+
+		/* Nothing to do if the cs is not changed */
+		if (orion_spi->da_active_cs == cs)
+			return;
+
+		/* Map MBus window for the new CS */
+		mvebu_mbus_del_window(orion_spi->da_base, orion_spi->da_size);
+		rc = mvebu_mbus_add_window_by_id(orion_spi->da_tarattr[cs].tar,
+						 orion_spi->da_tarattr[cs].attr,
+						 orion_spi->da_base,
+						 orion_spi->da_size);
+		if (rc) {
+			dev_err(&spi->dev,
+				"unable to map direct access MBus window (%d)\n",
+				rc);
+			/* Disable direct-access if mapping failed */
+			orion_spi->da_tarattr[cs].enabled = false;
+			return;
+		}
+
+		orion_spi->da_active_cs = cs;
+	}
 }
 
 static inline int orion_spi_wait_till_ready(struct orion_spi *orion_spi)
@@ -372,10 +416,41 @@  orion_spi_write_read(struct spi_device *spi, struct spi_transfer *xfer)
 {
 	unsigned int count;
 	int word_len;
+	struct orion_spi *orion_spi;
 
 	word_len = spi->bits_per_word;
 	count = xfer->len;
 
+	orion_spi = spi_master_get_devdata(spi->master);
+
+	/* Use SPI direct access mode if enabled via DT */
+	if (orion_spi->da_tarattr[spi->chip_select].enabled) {
+		if (count > orion_spi->da_size) {
+			dev_err(&spi->dev,
+				"max transfer size exceeded (%d > %d)\n",
+				count, orion_spi->da_size);
+			return 0;
+		}
+
+		if (xfer->tx_buf) {
+			/*
+			 * Send the tx-data to the SPI device via the direct
+			 * mapped address window
+			 */
+			memcpy(orion_spi->da_vaddr, xfer->tx_buf, count);
+		}
+
+		if (xfer->rx_buf) {
+			/*
+			 * Read the rx-data from the SPI device via the direct
+			 * mapped address window
+			 */
+			memcpy(xfer->rx_buf, orion_spi->da_vaddr, count);
+		}
+
+		return count;
+	}
+
 	if (word_len == 8) {
 		const u8 *tx = xfer->tx_buf;
 		u8 *rx = xfer->rx_buf;
@@ -425,6 +500,10 @@  static int orion_spi_reset(struct orion_spi *orion_spi)
 {
 	/* Verify that the CS is deasserted */
 	orion_spi_clrbits(orion_spi, ORION_SPI_IF_CTRL_REG, 0x1);
+
+	/* Don't deassert CS between the direct mapped SPI transfers */
+	writel(0, spi_reg(orion_spi, SPI_DIRECT_WRITE_CONFIG_REG));
+
 	return 0;
 }
 
@@ -504,6 +583,9 @@  static int orion_spi_probe(struct platform_device *pdev)
 	struct resource *r;
 	unsigned long tclk_hz;
 	int status = 0;
+	struct device_node *np;
+	const __be32 *ranges;
+	int rlen;
 
 	master = spi_alloc_master(&pdev->dev, sizeof(*spi));
 	if (master == NULL) {
@@ -576,6 +658,62 @@  static int orion_spi_probe(struct platform_device *pdev)
 		goto out_rel_clk;
 	}
 
+	/* Check if direct-access is enabled for this controller */
+	ranges = of_get_property(pdev->dev.of_node, "da-reg", &rlen);
+	if (ranges) {
+		rlen /= sizeof(*ranges);
+
+		if (rlen != 2) {
+			dev_err(&pdev->dev, "invalid da-reg property!\n");
+			goto out_rel_clk;
+		}
+
+		/* Store the address to use it later for the direct access */
+		spi->da_base = be32_to_cpup(&ranges[0]);
+		spi->da_size = be32_to_cpup(&ranges[1]);
+		spi->da_vaddr = devm_ioremap(&pdev->dev,
+					     spi->da_base, spi->da_size);
+		spi->da_active_cs = -1;
+
+		/*
+		 * Scan all SPI devices of this controller for direct mapped
+		 * devices. We need to save the target and attribute for
+		 * each CS as they map the SPI-controller/SPI-device to a
+		 * specific MBus window. For example on the Armada XP,
+		 * target 0x01 and attribute 0x5e map this window to SPI
+		 * controller #0 and CS #1.
+		 */
+		for_each_available_child_of_node(pdev->dev.of_node, np) {
+			u32 reg[2];
+			int rc;
+
+			/* Get "da-target-attribute" property */
+			rc = of_property_read_u32_array(np,
+							"da-target-attribute",
+							reg, ARRAY_SIZE(reg));
+			if (!rc) {
+				u32 cs;
+				int rc;
+
+				/*
+				 * Get chip-select number from the "reg"
+				 * property
+				 */
+				rc = of_property_read_u32(np, "reg", &cs);
+				if (rc) {
+					dev_err(&pdev->dev,
+						"%s has no valid 'reg' property (%d)\n",
+						np->full_name, rc);
+					continue;
+				}
+
+				spi->da_tarattr[cs].enabled = true;
+				spi->da_tarattr[cs].tar = reg[0];
+				spi->da_tarattr[cs].attr = reg[1];
+			}
+		}
+	}
+
 	pm_runtime_set_active(&pdev->dev);
 	pm_runtime_use_autosuspend(&pdev->dev);
 	pm_runtime_set_autosuspend_delay(&pdev->dev, SPI_AUTOSUSPEND_TIMEOUT);