diff mbox

[2/3] spi: mediatek: Add spi bus for Mediatek MT8173

Message ID 1433844321.16178.6.camel@mtksdaap41 (mailing list archive)
State New, archived
Headers show

Commit Message

Eddie Huang (黃智傑) June 9, 2015, 10:05 a.m. UTC
Hi Mark,

On Mon, 2015-06-08 at 18:59 +0100, Mark Brown wrote:
> On Mon, Jun 08, 2015 at 06:15:46PM +0800, Eddie Huang wrote:
> > On Fri, 2015-05-15 at 17:25 +0800, Mark Brown wrote:
> 
> > > That's how a very large proportion of devices that work with DMA are
> > > done - why would this be complicated?  All can_dma() does is report if
> > > DMA is possible.
> 
> > In include/linux/spi/spi.h, it describes if can_dma() exists and returns
> > true, dma_tx and dma_rx must be set.But Medaitek SPI controller has its
> > own dma hardware, which means this dma resides in the same base address
> > range with SPI controller, and only used by SPI, so we don't implement
> > generic DMA driver, such that can't provide dma channel and assign to
> > dmx_tx, dmx_rx parameter. We think it's strange to implement generic dma
> > driver for dma that only used by specific hardware.Can we just provide
> > can_dma() function and return false ? But I think it's a little odd that
> > there actually has dma. So can we just skip can_dma() function let it be
> > NULL ?
> 
> If it's simply the unavailbility of a struct dma_chan we must be able to
> get a better solution than just open coding all the DMA mapping and
> unmapping in the driver.  The only thing we actually use the channel for
> is to get the device we need to use to do the mapping and unmapping,
> either we need a way for devices to provide their own channels easily or
> a way for SPI drivers to specify a device here instead of a channel.
> The latter seems easier if a bit clunky (with having to work with both).

I list two ways you mention.
Pesudo code of your first suggestion:

static int mtk_spi_probe(struct platform_device *pdev) {
  struct dma_chan *tx_chan;
  struct dma_device *tx_dma;

  tx_chan = devm_kzalloc(&pdev->dev, sizeof(*tx_chan), GFP_KERNEL);
  tx_dma = devm_kzalloc(&pdev->dev, sizeof(*tx_dma), GFP_KERNEL);

  tx_dma->dev = &pdev->dev;
  tx_chan->device = tx_dma;
  master->dma_tx = tx_chan;
  ...
}

Modification of your second suggestion:


Is this what you want ? Actually, I don't like first one at all.


Eddie
Thanks

Comments

Mark Brown June 9, 2015, 10:39 a.m. UTC | #1
On Tue, Jun 09, 2015 at 06:05:21PM +0800, Eddie Huang wrote:

> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -539,8 +539,8 @@ static int __spi_map_msg(struct spi_master *master,
> struct spi_message *msg)
>         if (!master->can_dma)
>                 return 0;

> -       tx_dev = master->dma_tx->device->dev;
> -       rx_dev = master->dma_rx->device->dev;
> +       tx_dev = master->dma_tx ? master->dma_tx->device->dev :
> master->dev;
> +       rx_dev = master->dma_rx ? master->dma_rx->device->dev : master-

> Is this what you want ? Actually, I don't like first one at all.

Not quite what I'd been thinking of - we can't just pick the device in
the core safely, the device might be a MFD or have some other
restriction that needs us to use a separate struct device.  However most
of those cases are likely to point towards implementing a dmaengine
device so probably the above will work for most cases and is fine.  Can
you send a proper patch please?

Please don't use the ternery operator, though.
Eddie Huang (黃智傑) June 10, 2015, 8:06 a.m. UTC | #2
Hi Mark,

On Tue, 2015-06-09 at 11:39 +0100, Mark Brown wrote:
> On Tue, Jun 09, 2015 at 06:05:21PM +0800, Eddie Huang wrote:
> 
> > --- a/drivers/spi/spi.c
> > +++ b/drivers/spi/spi.c
> > @@ -539,8 +539,8 @@ static int __spi_map_msg(struct spi_master *master,
> > struct spi_message *msg)
> >         if (!master->can_dma)
> >                 return 0;
> 
> > -       tx_dev = master->dma_tx->device->dev;
> > -       rx_dev = master->dma_rx->device->dev;
> > +       tx_dev = master->dma_tx ? master->dma_tx->device->dev :
> > master->dev;
> > +       rx_dev = master->dma_rx ? master->dma_rx->device->dev : master-
> 
> > Is this what you want ? Actually, I don't like first one at all.
> 
> Not quite what I'd been thinking of - we can't just pick the device in
> the core safely, the device might be a MFD or have some other
> restriction that needs us to use a separate struct device.  However most
> of those cases are likely to point towards implementing a dmaengine
> device so probably the above will work for most cases and is fine.  Can
> you send a proper patch please?

Sure, we will send this with new MTK SPI driver, such that can verify
it.

> 
> Please don't use the ternery operator, though.
OK, will fix it

Eddie
Thanks
Eddie Huang (黃智傑) June 17, 2015, 9:08 a.m. UTC | #3
Hi Mark,

On Wed, 2015-06-10 at 16:06 +0800, Eddie Huang wrote:
> On Tue, 2015-06-09 at 11:39 +0100, Mark Brown wrote:
> > On Tue, Jun 09, 2015 at 06:05:21PM +0800, Eddie Huang wrote:
> > 
> > > --- a/drivers/spi/spi.c
> > > +++ b/drivers/spi/spi.c
> > > @@ -539,8 +539,8 @@ static int __spi_map_msg(struct spi_master *master,
> > > struct spi_message *msg)
> > >         if (!master->can_dma)
> > >                 return 0;
> > 
> > > -       tx_dev = master->dma_tx->device->dev;
> > > -       rx_dev = master->dma_rx->device->dev;
> > > +       tx_dev = master->dma_tx ? master->dma_tx->device->dev :
> > > master->dev;
> > > +       rx_dev = master->dma_rx ? master->dma_rx->device->dev : master-
> > 
> > > Is this what you want ? Actually, I don't like first one at all.
> > 
> > Not quite what I'd been thinking of - we can't just pick the device in
> > the core safely, the device might be a MFD or have some other
> > restriction that needs us to use a separate struct device.  However most
> > of those cases are likely to point towards implementing a dmaengine
> > device so probably the above will work for most cases and is fine.  Can
> > you send a proper patch please?
> 

After further investigate, we found problem when use can_dma() to
implement spi driver.

If can_dma() return true in __spi_map_msg() function, it will map data
buffer to sg_table, then pass this sg_table to transfer function. In
transfer function, spi hardware can do tx and rx at the same time (full
duplex), and the data size must be the same.

Here comes the problem, although total length of tx, rx is the same,
each entry in rx and tx scatterlist may not be the same (in the case
data buffer allocate from vmalloc). Other vendor have dmaengine driver
to send entry-by-entry automatically, so it is ok due to total length is
the same.But mediatek hw can only send tx and rx scatterlist entry one
by on, which means in full duplex, mediatek SPI hardware need send each
entry separately, will cause full duplex fail because tx/rx entry size
or entry number may not be the same.

The problem is not dma map discuss earlier, it is mediatek spi hardware
limitation that can't support scatterlist dma transfer in full-duplex
mode. We can only support both tx and rx has the same size continuous
memory data in full-duplex mode. I don't know whether should modify
spi.c to support mediatek spi, or we just return can_dma() false and do
transfer one continuous data in transfer function.

By the way, I think maybe it is better to change can_dma() to
can_dma_sg().

Eddie
Thanks
Mark Brown June 17, 2015, 12:47 p.m. UTC | #4
On Wed, Jun 17, 2015 at 05:08:03PM +0800, Eddie Huang wrote:

> Here comes the problem, although total length of tx, rx is the same,
> each entry in rx and tx scatterlist may not be the same (in the case
> data buffer allocate from vmalloc). Other vendor have dmaengine driver
> to send entry-by-entry automatically, so it is ok due to total length is
> the same.But mediatek hw can only send tx and rx scatterlist entry one
> by on, which means in full duplex, mediatek SPI hardware need send each
> entry separately, will cause full duplex fail because tx/rx entry size
> or entry number may not be the same.

I don't see why this is a problem - your driver is going to have to do
more work to overcome the limitations of the hardware but surely it's
just a question of how you parse the scatterlists (or rewriting them if
that's appropriate)?

> The problem is not dma map discuss earlier, it is mediatek spi hardware
> limitation that can't support scatterlist dma transfer in full-duplex
> mode. We can only support both tx and rx has the same size continuous
> memory data in full-duplex mode. I don't know whether should modify
> spi.c to support mediatek spi, or we just return can_dma() false and do
> transfer one continuous data in transfer function.

> By the way, I think maybe it is better to change can_dma() to
> can_dma_sg().

Don't you just need to handle the scatterlists in page sized chunks
here?  There's nothing about passing information about the memory that
was mapped around in a scatterlist that means you have to pass the whole
list to the hardware at once - at heart the scatterlist is just a
convenient structure for passing around where the data to be transferred
is.
Eddie Huang (黃智傑) June 17, 2015, 2:10 p.m. UTC | #5
On Wed, 2015-06-17 at 13:47 +0100, Mark Brown wrote:
> On Wed, Jun 17, 2015 at 05:08:03PM +0800, Eddie Huang wrote:
> 
> > Here comes the problem, although total length of tx, rx is the same,
> > each entry in rx and tx scatterlist may not be the same (in the case
> > data buffer allocate from vmalloc). Other vendor have dmaengine driver
> > to send entry-by-entry automatically, so it is ok due to total length is
> > the same.But mediatek hw can only send tx and rx scatterlist entry one
> > by on, which means in full duplex, mediatek SPI hardware need send each
> > entry separately, will cause full duplex fail because tx/rx entry size
> > or entry number may not be the same.
> 
> I don't see why this is a problem - your driver is going to have to do
> more work to overcome the limitations of the hardware but surely it's
> just a question of how you parse the scatterlists (or rewriting them if
> that's appropriate)?
> 
> > The problem is not dma map discuss earlier, it is mediatek spi hardware
> > limitation that can't support scatterlist dma transfer in full-duplex
> > mode. We can only support both tx and rx has the same size continuous
> > memory data in full-duplex mode. I don't know whether should modify
> > spi.c to support mediatek spi, or we just return can_dma() false and do
> > transfer one continuous data in transfer function.
> 
> > By the way, I think maybe it is better to change can_dma() to So 
> > can_dma_sg().
> 
> Don't you just need to handle the scatterlists in page sized chunks
> here?  There's nothing about passing information about the memory that
> was mapped around in a scatterlist that means you have to pass the whole
> list to the hardware at once - at heart the scatterlist is just a
> convenient structure for passing around where the data to be transferred
> is.

Our hardware limitation is: we don't have separate dma tx, rx channel
with transfer finish interrupt, only have spi trigger operation.So the
mediatek SPI dma full duplex operation steps are:
1. Set TX DMA address.
2. Set RX DMA address.
3. Set length (this step assume TX, RX are the same size).
4. Set TX DMA enable, RX DMA enable bit in spi config register. (not
trigger DMA, just told spi use dma)
5. Trigger spi operations.
6. Wait spi operations finish interrupt.

If tx scatterlist per list data size are 128, 4096, 256. rx scatterlist
per list data size are 128, 4096, 256. So we need to go through above
steps three times. If tx scatterlists per list data size are 128, 4096,
256. rx scatterlists per list data size are 256, 4096, 128. If we start
sending first entry, tx size is 128, rx size is 256, this will cause
hardware malfunction because tx, rx data length are not the same.

The solution I think is copy scatterlist data into one single buffer in
mediatek spi transfer function, but I think this is odd because
__spi_map_msg() map single buffer into scatterlist, then our driver map
scatterlist into single buffer again. I hope this explaination is more
clear than before.

Eddie
Thanks
Mark Brown June 17, 2015, 4:35 p.m. UTC | #6
On Wed, Jun 17, 2015 at 10:10:51PM +0800, Eddie Huang wrote:

> Our hardware limitation is: we don't have separate dma tx, rx channel
> with transfer finish interrupt, only have spi trigger operation.So the
> mediatek SPI dma full duplex operation steps are:
> 1. Set TX DMA address.
> 2. Set RX DMA address.
> 3. Set length (this step assume TX, RX are the same size).
> 4. Set TX DMA enable, RX DMA enable bit in spi config register. (not
> trigger DMA, just told spi use dma)
> 5. Trigger spi operations.
> 6. Wait spi operations finish interrupt.

Sure, that's what I understood.

> If tx scatterlist per list data size are 128, 4096, 256. rx scatterlist
> per list data size are 128, 4096, 256. So we need to go through above
> steps three times. If tx scatterlists per list data size are 128, 4096,
> 256. rx scatterlists per list data size are 256, 4096, 128. If we start
> sending first entry, tx size is 128, rx size is 256, this will cause
> hardware malfunction because tx, rx data length are not the same.

> The solution I think is copy scatterlist data into one single buffer in
> mediatek spi transfer function, but I think this is odd because
> __spi_map_msg() map single buffer into scatterlist, then our driver map
> scatterlist into single buffer again. I hope this explaination is more
> clear than before.

To repeat what I said in my last mail: there's no need to use the
scatterlists as-is, your driver can do whatever set of DMA transfers it
likes to keep the lengths of each transfer the same.  Attempting to
linearise the transfers in memory isn't going to work unless you
allocate physically contiguous memory (which could get painful) and will
add substantial overhead.

For example with your above example you could split the transfers up to
be 128, 128, 3968, 128, 128.
Eddie Huang (黃智傑) June 18, 2015, 8:11 a.m. UTC | #7
On Wed, 2015-06-17 at 17:35 +0100, Mark Brown wrote:
> On Wed, Jun 17, 2015 at 10:10:51PM +0800, Eddie Huang wrote:
> 
> > Our hardware limitation is: we don't have separate dma tx, rx channel
> > with transfer finish interrupt, only have spi trigger operation.So the
> > mediatek SPI dma full duplex operation steps are:
> > 1. Set TX DMA address.
> > 2. Set RX DMA address.
> > 3. Set length (this step assume TX, RX are the same size).
> > 4. Set TX DMA enable, RX DMA enable bit in spi config register. (not
> > trigger DMA, just told spi use dma)
> > 5. Trigger spi operations.
> > 6. Wait spi operations finish interrupt.
> 
> Sure, that's what I understood.
> 
> > If tx scatterlist per list data size are 128, 4096, 256. rx scatterlist
> > per list data size are 128, 4096, 256. So we need to go through above
> > steps three times. If tx scatterlists per list data size are 128, 4096,
> > 256. rx scatterlists per list data size are 256, 4096, 128. If we start
> > sending first entry, tx size is 128, rx size is 256, this will cause
> > hardware malfunction because tx, rx data length are not the same.
> 
> > The solution I think is copy scatterlist data into one single buffer in
> > mediatek spi transfer function, but I think this is odd because
> > __spi_map_msg() map single buffer into scatterlist, then our driver map
> > scatterlist into single buffer again. I hope this explaination is more
> > clear than before.
> 
> To repeat what I said in my last mail: there's no need to use the
> scatterlists as-is, your driver can do whatever set of DMA transfers it
> likes to keep the lengths of each transfer the same.  Attempting to
> linearise the transfers in memory isn't going to work unless you
> allocate physically contiguous memory (which could get painful) and will
> add substantial overhead.
> 
> For example with your above example you could split the transfers up to
> be 128, 128, 3968, 128, 128.

This is a workable way.
Thanks your suggestion.We will try to implement this,

Eddie
Thanks
diff mbox

Patch

--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -539,8 +539,8 @@  static int __spi_map_msg(struct spi_master *master,
struct spi_message *msg)
        if (!master->can_dma)
                return 0;

-       tx_dev = master->dma_tx->device->dev;
-       rx_dev = master->dma_rx->device->dev;
+       tx_dev = master->dma_tx ? master->dma_tx->device->dev :
master->dev;
+       rx_dev = master->dma_rx ? master->dma_rx->device->dev : master-