[v2,3/6] dmaengine: dw: Set DMA device max segment size parameter

Message ID 20200508105304.14065-4-Sergey.Semin@baikalelectronics.ru (mailing list archive)
State Superseded
Series dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account

Commit Message

Serge Semin May 8, 2020, 10:53 a.m. UTC
Maximum block size DW DMAC configuration corresponds to the max segment
size DMA parameter in the DMA core subsystem notation. Let's set it with a
value specific to the probed DW DMA controller. It shall help the DMA
clients to create size-optimized SG-list items for the controller. This in
turn will cause fewer dw_desc allocations, fewer LLP reinitializations, and
better DMA device performance.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v2:
- This is a new patch created in place of the dropped one:
  "dmaengine: dw: Add LLP and block size config accessors".
---
 drivers/dma/dw/core.c | 17 +++++++++++++++++
 drivers/dma/dw/regs.h | 18 ++++++++++--------
 2 files changed, 27 insertions(+), 8 deletions(-)

Comments

Andy Shevchenko May 8, 2020, 11:21 a.m. UTC | #1
+Cc (Vineet, for information you probably know)

On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> Maximum block size DW DMAC configuration corresponds to the max segment
> size DMA parameter in the DMA core subsystem notation. Lets set it with a
> value specific to the probed DW DMA controller. It shall help the DMA
> clients to create size-optimized SG-list items for the controller. This in
> turn will cause less dw_desc allocations, less LLP reinitializations,
> better DMA device performance.

Thank you for the patch.
My comments below.

...

> +		/*
> +		 * Find maximum block size to be set as the DMA device maximum
> +		 * segment size. By doing so we'll have size optimized SG-list
> +		 * items for the channels with biggest block size. This won't
> +		 * be a problem for the rest of the channels, since they will
> +		 * still be able to split the requests up by allocating
> +		 * multiple DW DMA LLP descriptors, which they would have done
> +		 * anyway.
> +		 */
> +		if (dwc->block_size > block_size)
> +			block_size = dwc->block_size;
>  	}
>  
>  	/* Clear all interrupts on all channels. */
> @@ -1220,6 +1233,10 @@ int do_dma_probe(struct dw_dma_chip *chip)
>  			     BIT(DMA_MEM_TO_MEM);
>  	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
>  
> +	/* Block size corresponds to the maximum sg size */
> +	dw->dma.dev->dma_parms = &dw->dma_parms;
> +	dma_set_max_seg_size(dw->dma.dev, block_size);
> +
>  	err = dma_async_device_register(&dw->dma);
>  	if (err)
>  		goto err_dma_register;

Yeah, I have something like this locally and I didn't dare to upstream it
because there is an issue. We have this information per DMA controller, while
we actually need it on a per DMA channel basis.

The above will work only for a synthesized DMA with all channels having the
same block size. That's why the above conditional is not needed anyway.

OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
that Intel Medfield has interesting settings, but I don't remember if DMA
channels are different inside the same controller).

Vineet, do you have any information that Synopsys customers synthesized DMA
controllers with different channel characteristics inside one DMA IP?

...

>  #include <linux/bitops.h>

> +#include <linux/device.h>

Isn't it enough to supply

struct device;

?

>  #include <linux/interrupt.h>
>  #include <linux/dmaengine.h>

Also this change needs a separate patch I suppose.

...

> -	struct dma_device	dma;
> -	char			name[20];
> -	void __iomem		*regs;
> -	struct dma_pool		*desc_pool;
> -	struct tasklet_struct	tasklet;
> +	struct dma_device		dma;
> +	struct device_dma_parameters	dma_parms;
> +	char				name[20];
> +	void __iomem			*regs;
> +	struct dma_pool			*desc_pool;
> +	struct tasklet_struct		tasklet;
>  
>  	/* channels */
> -	struct dw_dma_chan	*chan;
> -	u8			all_chan_mask;
> -	u8			in_use;
> +	struct dw_dma_chan		*chan;
> +	u8				all_chan_mask;
> +	u8				in_use;

Please split formatting fixes into a separate patch.
Vineet Gupta May 8, 2020, 6:49 p.m. UTC | #2
On 5/8/20 4:21 AM, Andy Shevchenko wrote:
> Yeah, I have locally something like this and I didn't dare to upstream because
> there is an issue. We have this information per DMA controller, while we
> actually need this on per DMA channel basis.
>
> Above will work only for synthesized DMA with all channels having same block
> size. That's why above conditional is not needed anyway.
>
> OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> that Intel Medfield has interesting settings, but I don't remember if DMA
> channels are different inside the same controller).
>
> Vineet, do you have any information that Synopsys customers synthesized DMA
> controllers with different channel characteristics inside one DMA IP?

The IP drivers are done by different teams, but I can try and ask around.
Serge Semin May 11, 2020, 9:16 p.m. UTC | #3
On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> +Cc (Vineet, for information you probably know)
> 
> On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> > Maximum block size DW DMAC configuration corresponds to the max segment
> > size DMA parameter in the DMA core subsystem notation. Lets set it with a
> > value specific to the probed DW DMA controller. It shall help the DMA
> > clients to create size-optimized SG-list items for the controller. This in
> > turn will cause less dw_desc allocations, less LLP reinitializations,
> > better DMA device performance.
> 
> Thank you for the patch.
> My comments below.
> 
> ...
> 
> > +		/*
> > +		 * Find maximum block size to be set as the DMA device maximum
> > +		 * segment size. By doing so we'll have size optimized SG-list
> > +		 * items for the channels with biggest block size. This won't
> > +		 * be a problem for the rest of the channels, since they will
> > +		 * still be able to split the requests up by allocating
> > +		 * multiple DW DMA LLP descriptors, which they would have done
> > +		 * anyway.
> > +		 */
> > +		if (dwc->block_size > block_size)
> > +			block_size = dwc->block_size;
> >  	}
> >  
> >  	/* Clear all interrupts on all channels. */
> > @@ -1220,6 +1233,10 @@ int do_dma_probe(struct dw_dma_chip *chip)
> >  			     BIT(DMA_MEM_TO_MEM);
> >  	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
> >  
> > +	/* Block size corresponds to the maximum sg size */
> > +	dw->dma.dev->dma_parms = &dw->dma_parms;
> > +	dma_set_max_seg_size(dw->dma.dev, block_size);
> > +
> >  	err = dma_async_device_register(&dw->dma);
> >  	if (err)
> >  		goto err_dma_register;
> 
> Yeah, I have locally something like this and I didn't dare to upstream because
> there is an issue. We have this information per DMA controller, while we
> actually need this on per DMA channel basis.
> 
> Above will work only for synthesized DMA with all channels having same block
> size. That's why above conditional is not needed anyway.

Hm, I don't really see why the conditional isn't needed and why this won't work.
As you can see in the loop above, I first find the maximum of all the channels'
maximum block sizes and then use it as the max segment size parameter for the
whole device. If the DW DMA controller has the same max block size for all
channels, then that value will be found. If the channels have been synthesized
with different block sizes, then the optimization will work for the ones with
the greatest block size. The SG list entries of the channels with a smaller max
block size will be split up by the DW DMAC driver, which would have been done
anyway without max_segment_size being set. Here we at least provide the
optimization for the channels with the greatest max block size.
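
The argument above can be sketched as a small user-space model. It is only an
illustration, not driver code: the helper names max_block_size() and
nr_pieces() are hypothetical, and block sizes are treated as plain byte counts
for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical user-space model, not driver code. Block sizes are treated
 * as plain byte counts for simplicity.
 */

/* Largest per-channel block size, mirroring the conditional in the probe loop. */
unsigned int max_block_size(const unsigned int *chan_block, size_t nr_chan)
{
	unsigned int max = 0;
	size_t i;

	for (i = 0; i < nr_chan; i++)
		if (chan_block[i] > max)
			max = chan_block[i];

	return max;
}

/* Number of pieces of at most 'limit' bytes needed to cover 'len' bytes. */
unsigned int nr_pieces(unsigned int len, unsigned int limit)
{
	assert(limit > 0);
	return len / limit + (len % limit != 0);
}
```

In this model, with channel block sizes of 4095 and 1023 the device-wide limit
is 4095. A client then splits an 8190-byte buffer into nr_pieces(8190, 4095)
SG entries; the first channel covers each entry with a single block, while the
second one still needs nr_pieces(4095, 1023) blocks per entry.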

I do understand that it would be good to have this parameter set up on a
per-channel basis in the generic DMA channel descriptor. But the DMA core and
the device descriptor don't provide such a facility, so setting at least some
justified value is a good idea.

> 
> OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> that Intel Medfield has interesting settings, but I don't remember if DMA
> channels are different inside the same controller).
> 
> Vineet, do you have any information that Synopsys customers synthesized DMA
> controllers with different channel characteristics inside one DMA IP?

AFAICS the DW DMAC channels can be synthesized with different max block sizes.
The IP core supports such a configuration. So we can't assume that such a DMAC
release can't be found in real hardware just because we've never seen one,
no matter what Vineet has to say in response to your question.

> 
> ...
> 
> >  #include <linux/bitops.h>
> 
> > +#include <linux/device.h>
> 
> Isn't enough to supply
> 
> struct device;
> 
> ?

It's "struct device_dma_parameters" and I'd prefer to include the header file.

> 
> >  #include <linux/interrupt.h>
> >  #include <linux/dmaengine.h>
> 
> Also this change needs a separate patch I suppose.

Ah, I've just discovered there is no need to add the dma_parms here, because
since commit 7c8978c0837d ("driver core: platform: Initialize dma_parms for
platform devices") the dma_parms pointer is already initialized. The same thing
is done for the PCI devices too.
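
Why the pointer matters can be shown with a simplified user-space model. All
names below (model_device, model_dma_parms, model_set_max_seg_size) are
hypothetical stand-ins; the sketch assumes the kernel helper of that time
stored the value only when the device had dma_parms storage attached and
returned -EIO otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_dma_parameters. */
struct model_dma_parms {
	unsigned int max_segment_size;
};

/* Hypothetical stand-in for struct device, reduced to the relevant field. */
struct model_device {
	struct model_dma_parms *dma_parms;
};

/*
 * Simplified model of setting the max segment size: the value can only be
 * stored if someone has already attached dma_parms storage to the device.
 */
int model_set_max_seg_size(struct model_device *dev, unsigned int size)
{
	assert(dev != NULL);

	if (dev->dma_parms) {
		dev->dma_parms->max_segment_size = size;
		return 0;
	}

	return -EIO;
}
```

Under this model, a device whose bus code has already attached the storage can
call the setter directly, so the driver needs neither its own
device_dma_parameters member nor the pointer assignment.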

-Sergey

> 
> ...
> 
> > -	struct dma_device	dma;
> > -	char			name[20];
> > -	void __iomem		*regs;
> > -	struct dma_pool		*desc_pool;
> > -	struct tasklet_struct	tasklet;
> > +	struct dma_device		dma;
> > +	struct device_dma_parameters	dma_parms;
> > +	char				name[20];
> > +	void __iomem			*regs;
> > +	struct dma_pool			*desc_pool;
> > +	struct tasklet_struct		tasklet;
> >  
> >  	/* channels */
> > -	struct dw_dma_chan	*chan;
> > -	u8			all_chan_mask;
> > -	u8			in_use;
> > +	struct dw_dma_chan		*chan;
> > +	u8				all_chan_mask;
> > +	u8				in_use;
> 
> Please split formatting fixes into a separate patch.
> 
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
>
Andy Shevchenko May 12, 2020, 12:35 p.m. UTC | #4
On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> > > Maximum block size DW DMAC configuration corresponds to the max segment
> > > size DMA parameter in the DMA core subsystem notation. Lets set it with a
> > > value specific to the probed DW DMA controller. It shall help the DMA
> > > clients to create size-optimized SG-list items for the controller. This in
> > > turn will cause less dw_desc allocations, less LLP reinitializations,
> > > better DMA device performance.

> > Yeah, I have locally something like this and I didn't dare to upstream because
> > there is an issue. We have this information per DMA controller, while we
> > actually need this on per DMA channel basis.
> > 
> > Above will work only for synthesized DMA with all channels having same block
> > size. That's why above conditional is not needed anyway.
> 
> Hm, I don't really see why the conditional isn't needed and this won't work. As
> you can see in the loop above Initially I find a maximum of all channels maximum
> block sizes and use it then as a max segment size parameter for the whole device.
> If the DW DMA controller has the same max block size of all channels, then it
> will be found. If the channels've been synthesized with different block sizes,
> then the optimization will work for the one with greatest block size. The SG
> list entries of the channels with lesser max block size will be split up
> by the DW DMAC driver, which would have been done anyway without
> max_segment_size being set. Here we at least provide the optimization for the
> channels with greatest max block size.
> 
> I do understand that it would be good to have this parameter setup on per generic
> DMA channel descriptor basis. But DMA core and device descriptor doesn't provide
> such facility, so setting at least some justified value is a good idea.
> 
> > 
> > OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> > that Intel Medfield has interesting settings, but I don't remember if DMA
> > channels are different inside the same controller).
> > 
> > Vineet, do you have any information that Synopsys customers synthesized DMA
> > controllers with different channel characteristics inside one DMA IP?
> 
> AFAICS the DW DMAC channels can be synthesized with different max block size.
> The IP core supports such configuration. So we can't assume that such DMAC
> release can't be found in a real hardware just because we've never seen one.
> No matter what Vineet will have to say in response to your question.

My point here is that we can probably avoid complications till we have real
hardware where it's different. As I said, I don't remember any such hardware,
except *maybe* Intel Medfield, which is quite outdated and not supported for a
wider audience anyway.
Serge Semin May 12, 2020, 5:01 p.m. UTC | #5
On Tue, May 12, 2020 at 03:35:51PM +0300, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> > On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> > > > Maximum block size DW DMAC configuration corresponds to the max segment
> > > > size DMA parameter in the DMA core subsystem notation. Lets set it with a
> > > > value specific to the probed DW DMA controller. It shall help the DMA
> > > > clients to create size-optimized SG-list items for the controller. This in
> > > > turn will cause less dw_desc allocations, less LLP reinitializations,
> > > > better DMA device performance.
> 
> > > Yeah, I have locally something like this and I didn't dare to upstream because
> > > there is an issue. We have this information per DMA controller, while we
> > > actually need this on per DMA channel basis.
> > > 
> > > Above will work only for synthesized DMA with all channels having same block
> > > size. That's why above conditional is not needed anyway.
> > 
> > Hm, I don't really see why the conditional isn't needed and this won't work. As
> > you can see in the loop above Initially I find a maximum of all channels maximum
> > block sizes and use it then as a max segment size parameter for the whole device.
> > If the DW DMA controller has the same max block size of all channels, then it
> > will be found. If the channels've been synthesized with different block sizes,
> > then the optimization will work for the one with greatest block size. The SG
> > list entries of the channels with lesser max block size will be split up
> > by the DW DMAC driver, which would have been done anyway without
> > max_segment_size being set. Here we at least provide the optimization for the
> > channels with greatest max block size.
> > 
> > I do understand that it would be good to have this parameter setup on per generic
> > DMA channel descriptor basis. But DMA core and device descriptor doesn't provide
> > such facility, so setting at least some justified value is a good idea.
> > 
> > > 
> > > OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> > > that Intel Medfield has interesting settings, but I don't remember if DMA
> > > channels are different inside the same controller).
> > > 
> > > Vineet, do you have any information that Synopsys customers synthesized DMA
> > > controllers with different channel characteristics inside one DMA IP?
> > 
> > AFAICS the DW DMAC channels can be synthesized with different max block size.
> > The IP core supports such configuration. So we can't assume that such DMAC
> > release can't be found in a real hardware just because we've never seen one.
> > No matter what Vineet will have to say in response to your question.
> 
> My point here that we probably can avoid complications till we have real
> hardware where it's different. As I said I don't remember a such, except
> *maybe* Intel Medfield, which is quite outdated and not supported for wider
> audience anyway.

I see your point. My position is different in this matter and explained in the
previous emails. Let's see what Viresh and Vinod think of it.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
>
Vinod Koul May 15, 2020, 6:16 a.m. UTC | #6
On 12-05-20, 15:35, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> > On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> > > > Maximum block size DW DMAC configuration corresponds to the max segment
> > > > size DMA parameter in the DMA core subsystem notation. Lets set it with a
> > > > value specific to the probed DW DMA controller. It shall help the DMA
> > > > clients to create size-optimized SG-list items for the controller. This in
> > > > turn will cause less dw_desc allocations, less LLP reinitializations,
> > > > better DMA device performance.
> 
> > > Yeah, I have locally something like this and I didn't dare to upstream because
> > > there is an issue. We have this information per DMA controller, while we
> > > actually need this on per DMA channel basis.
> > > 
> > > Above will work only for synthesized DMA with all channels having same block
> > > size. That's why above conditional is not needed anyway.
> > 
> > Hm, I don't really see why the conditional isn't needed and this won't work. As
> > you can see in the loop above Initially I find a maximum of all channels maximum
> > block sizes and use it then as a max segment size parameter for the whole device.
> > If the DW DMA controller has the same max block size of all channels, then it
> > will be found. If the channels've been synthesized with different block sizes,
> > then the optimization will work for the one with greatest block size. The SG
> > list entries of the channels with lesser max block size will be split up
> > by the DW DMAC driver, which would have been done anyway without
> > max_segment_size being set. Here we at least provide the optimization for the
> > channels with greatest max block size.
> > 
> > I do understand that it would be good to have this parameter setup on per generic
> > DMA channel descriptor basis. But DMA core and device descriptor doesn't provide
> > such facility, so setting at least some justified value is a good idea.
> > 
> > > 
> > > OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> > > that Intel Medfield has interesting settings, but I don't remember if DMA
> > > channels are different inside the same controller).
> > > 
> > > Vineet, do you have any information that Synopsys customers synthesized DMA
> > > controllers with different channel characteristics inside one DMA IP?
> > 
> > AFAICS the DW DMAC channels can be synthesized with different max block size.
> > The IP core supports such configuration. So we can't assume that such DMAC
> > release can't be found in a real hardware just because we've never seen one.
> > No matter what Vineet will have to say in response to your question.
> 
> My point here that we probably can avoid complications till we have real
> hardware where it's different. As I said I don't remember a such, except
> *maybe* Intel Medfield, which is quite outdated and not supported for wider
> audience anyway.

IIRC Intel Medfield has a couple of DMA controller instances, each one with
different parameters, *but* each instance has the same channel configuration.

I do not recall seeing synthesis parameters on a per-channel basis... But I
may be wrong, it's been a while.
Andy Shevchenko May 15, 2020, 10:53 a.m. UTC | #7
On Fri, May 15, 2020 at 11:46:01AM +0530, Vinod Koul wrote:
> On 12-05-20, 15:35, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> > > On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > > > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:

...

> > My point here that we probably can avoid complications till we have real
> > hardware where it's different. As I said I don't remember a such, except
> > *maybe* Intel Medfield, which is quite outdated and not supported for wider
> > audience anyway.
> 
> IIRC Intel Medfield has couple of dma controller instances each one with
> different parameters *but* each instance has same channel configuration.

That's my memory too.

> I do not recall seeing that we have synthesis parameters per channel
> basis... But I maybe wrong, it's been a while.

Exactly, that's why I think we'd better simplify things till we have a real
issue with it. I.o.w. no need to solve a problem which doesn't exist.
Serge Semin May 17, 2020, 6:22 p.m. UTC | #8
On Fri, May 15, 2020 at 01:53:13PM +0300, Andy Shevchenko wrote:
> On Fri, May 15, 2020 at 11:46:01AM +0530, Vinod Koul wrote:
> > On 12-05-20, 15:35, Andy Shevchenko wrote:
> > > On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> > > > On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > > > > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> 
> ...
> 
> > > My point here that we probably can avoid complications till we have real
> > > hardware where it's different. As I said I don't remember a such, except
> > > *maybe* Intel Medfield, which is quite outdated and not supported for wider
> > > audience anyway.
> > 
> > IIRC Intel Medfield has couple of dma controller instances each one with
> > different parameters *but* each instance has same channel configuration.
> 
> That's my memory too.
> 
> > I do not recall seeing that we have synthesis parameters per channel
> > basis... But I maybe wrong, it's been a while.
> 
> Exactly, that's why I think we better simplify things till we will have real
> issue with it. I.o.w. no need to solve the problem which doesn't exist.

Ok then. My hardware is also synthesized with a uniform max block size
parameter. I'll remove the maximum-of-maximums search and use the block
size found for the very first channel to set the maximum segment size parameter.
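
A minimal user-space sketch of that simplification follows. It is not the next
revision of the patch: both helper names are hypothetical, and block sizes are
again treated as plain numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if every channel reports the same block size (hypothetical helper). */
bool block_sizes_uniform(const unsigned int *chan_block, size_t nr_chan)
{
	size_t i;

	for (i = 1; i < nr_chan; i++)
		if (chan_block[i] != chan_block[0])
			return false;

	return true;
}

/* Segment size taken from the very first channel only (hypothetical helper). */
unsigned int seg_size_from_first_chan(const unsigned int *chan_block, size_t nr_chan)
{
	assert(nr_chan > 0);
	return chan_block[0];
}
```

On uniformly synthesized controllers the first channel's value equals the
maximum, so nothing is lost. On a controller with mixed block sizes the result
would simply depend on channel 0, which is the trade-off accepted above.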

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
>
Patch

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 21cb2a58dbd2..8bcd82c64478 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1054,6 +1054,7 @@  int do_dma_probe(struct dw_dma_chip *chip)
 	struct dw_dma *dw = chip->dw;
 	struct dw_dma_platform_data *pdata;
 	bool			autocfg = false;
+	unsigned int		block_size = 0;
 	unsigned int		dw_params;
 	unsigned int		i;
 	int			err;
@@ -1184,6 +1185,18 @@  int do_dma_probe(struct dw_dma_chip *chip)
 			dwc->block_size = pdata->block_size;
 			dwc->nollp = !pdata->multi_block[i];
 		}
+
+		/*
+		 * Find maximum block size to be set as the DMA device maximum
+		 * segment size. By doing so we'll have size optimized SG-list
+		 * items for the channels with biggest block size. This won't
+		 * be a problem for the rest of the channels, since they will
+		 * still be able to split the requests up by allocating
+		 * multiple DW DMA LLP descriptors, which they would have done
+		 * anyway.
+		 */
+		if (dwc->block_size > block_size)
+			block_size = dwc->block_size;
 	}
 
 	/* Clear all interrupts on all channels. */
@@ -1220,6 +1233,10 @@  int do_dma_probe(struct dw_dma_chip *chip)
 			     BIT(DMA_MEM_TO_MEM);
 	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
 
+	/* Block size corresponds to the maximum sg size */
+	dw->dma.dev->dma_parms = &dw->dma_parms;
+	dma_set_max_seg_size(dw->dma.dev, block_size);
+
 	err = dma_async_device_register(&dw->dma);
 	if (err)
 		goto err_dma_register;
diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
index 3fce66ecee7a..20037d64f961 100644
--- a/drivers/dma/dw/regs.h
+++ b/drivers/dma/dw/regs.h
@@ -8,6 +8,7 @@ 
  */
 
 #include <linux/bitops.h>
+#include <linux/device.h>
 #include <linux/interrupt.h>
 #include <linux/dmaengine.h>
 
@@ -308,16 +309,17 @@  static inline struct dw_dma_chan *to_dw_dma_chan(struct dma_chan *chan)
 }
 
 struct dw_dma {
-	struct dma_device	dma;
-	char			name[20];
-	void __iomem		*regs;
-	struct dma_pool		*desc_pool;
-	struct tasklet_struct	tasklet;
+	struct dma_device		dma;
+	struct device_dma_parameters	dma_parms;
+	char				name[20];
+	void __iomem			*regs;
+	struct dma_pool			*desc_pool;
+	struct tasklet_struct		tasklet;
 
 	/* channels */
-	struct dw_dma_chan	*chan;
-	u8			all_chan_mask;
-	u8			in_use;
+	struct dw_dma_chan		*chan;
+	u8				all_chan_mask;
+	u8				in_use;
 
 	/* Channel operations */
 	void	(*initialize_chan)(struct dw_dma_chan *dwc);