diff mbox

[4/4] ARM: keystone: Use dma-ranges for dma_pfn_offset configuration

Message ID 1393275235-1087-5-git-send-email-santosh.shilimkar@ti.com (mailing list archive)
State New, archived
Headers show

Commit Message

Santosh Shilimkar Feb. 24, 2014, 8:53 p.m. UTC
From: Grygorii Strashko <grygorii.strashko@ti.com>

Keystone platforms have their physical memory mapped at an address
outside the 32-bit physical range.  A Keystone machine with 16G of RAM
would find its memory at 0x0800000000 - 0x0bffffffff.
The system interconnect allows to perform DMA transfers from first 2G of
physical memory (0x08 0000 0000 to 08 7FFF FFFF) which aliased in
hardware to the 32-bit addressable space (0x80000000 - 0xffffffff),
because DMA HW supports only 32-bits addressing.

Get DMA range configuration from DT and use it to configure per
device dma_pfn_offset. DMA addresses translation will be performed
as following:
 _dev->archdata.dma_pfn_offset = PFN_DOWN(<DMA addr> - <CPU addr>);
 DMA->PFN:
     __bus_to_pfn(addr) + dev->archdata.dma_pfn_offset
 PFN->DMA:
     __pfn_to_bus(pfn - dev->archdata.dma_pfn_offset)

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Olof Johansson <olof@lixom.net>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
 arch/arm/mach-keystone/keystone.c |   76 +++++++++++++++++++++++++++++++++++--
 1 file changed, 73 insertions(+), 3 deletions(-)

Comments

Arnd Bergmann Feb. 24, 2014, 9:11 p.m. UTC | #1
On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote:
> From: Grygorii Strashko <grygorii.strashko@ti.com>
>  }
>  
> +static unsigned long get_dma_pfn_offset(struct device *dev)
> +{
> +	struct device_node *node = of_node_get(dev->of_node);
> +	const u32 *ranges = NULL;
> +	int len, naddr, nsize, pna;
> +	dma_addr_t dma_addr;
> +	phys_addr_t cpu_addr, size;
> +	unsigned long dma_pfn_offset = 0;
> +
> +	if (!node)
> +		return 0;

Hmm, isn't this function the same as of_translate_dma_address()?

I think we should have the implementation in common code, not hidden
in the keystone platform, to avoid duplication. If of_translate_dma_address
doesn't work, what is the problem, and can you fix it there?

>  static int keystone_platform_notifier(struct notifier_block *nb,
>  				      unsigned long event, void *dev)
>  {
> +	struct device *_dev = dev;
> +
>  	if (event != BUS_NOTIFY_ADD_DEVICE)
>  		return NOTIFY_DONE;

Style: it would be nicer to name the local variable 'dev' and the
argument something else like 'p' or 'data'.

I also wonder if this shouldn't be in ARM architecture wide code
rather than platform code. Unfortunately it can't be in drivers/base
since the offset is stored in an ARM specific location.

	Arnd
Santosh Shilimkar Feb. 24, 2014, 9:38 p.m. UTC | #2
On Monday 24 February 2014 04:11 PM, Arnd Bergmann wrote:
> On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote:
>> From: Grygorii Strashko <grygorii.strashko@ti.com>
>>  }
>>  
>> +static unsigned long get_dma_pfn_offset(struct device *dev)
>> +{
>> +	struct device_node *node = of_node_get(dev->of_node);
>> +	const u32 *ranges = NULL;
>> +	int len, naddr, nsize, pna;
>> +	dma_addr_t dma_addr;
>> +	phys_addr_t cpu_addr, size;
>> +	unsigned long dma_pfn_offset = 0;
>> +
>> +	if (!node)
>> +		return 0;
> 
> Hmm, isn't this function the same as of_translate_dma_address()?
> 
> I think we should have the implementation in common code, not hidden
> in the keystone platform, to avoid duplication. If of_translate_dma_address
> doesn't work, what is the problem, and can you fix it there?
> 
Will have a look at it and get back. At least it makes sense to
use/update the common function.

>>  static int keystone_platform_notifier(struct notifier_block *nb,
>>  				      unsigned long event, void *dev)
>>  {
>> +	struct device *_dev = dev;
>> +
>>  	if (event != BUS_NOTIFY_ADD_DEVICE)
>>  		return NOTIFY_DONE;
> 
> Style: it would be nicer to name the local variable 'dev' and the
> argument something else like 'p' or 'data'.
>
Agree. Will update that.
 
> I also wonder if this shouldn't be in ARM architecture wide code
> rather than platform code. Unfortunately it can't be in drivers/base
> since the offset is stored in an ARM specific location.
> 
Notifier callback is in mach code mainly because platform's might have to
do custom things here. Like setting up the coherent masters,
populating the dma_pfn_offset or populating custom dma_ops etc.
As such the in these callbacks is not much and I see only
couple of machines using it.

Regards,
Santosh
Arnd Bergmann Feb. 25, 2014, 8:35 a.m. UTC | #3
On Monday 24 February 2014 16:38:00 Santosh Shilimkar wrote:
> > I also wonder if this shouldn't be in ARM architecture wide code
> > rather than platform code. Unfortunately it can't be in drivers/base
> > since the offset is stored in an ARM specific location.
> > 
> Notifier callback is in mach code mainly because platform's might have to
> do custom things here. Like setting up the coherent masters,
> populating the dma_pfn_offset or populating custom dma_ops etc.
> As such the in these callbacks is not much and I see only
> couple of machines using it.

Yes, but since "dma-ranges" is a standard property, I think we should
provide a generic implementation so other platforms don't have to
copy one. Most platforms today are never cache-coherent and have
a trivial mapping, so they don't need this function, but I suppose
the case you have is getting more popular. I think shmobile is in
exactly the same position with their new r-car parts for instance.

	Arnd
Arnd Bergmann Feb. 25, 2014, 1:37 p.m. UTC | #4
On Tuesday 25 February 2014, Grygorii Strashko wrote:
> On 02/24/2014 11:38 PM, Santosh Shilimkar wrote:
> > On Monday 24 February 2014 04:11 PM, Arnd Bergmann wrote:
> >> On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote:
> >>> From: Grygorii Strashko <grygorii.strashko@ti.com>
>
> >> Hmm, isn't this function the same as of_translate_dma_address()?
> >>
> >> I think we should have the implementation in common code, not hidden
> >> in the keystone platform, to avoid duplication. If of_translate_dma_address
> >> doesn't work, what is the problem, and can you fix it there?
> >>
> > Will have a look at it and get back. At least it makes sense to
> > use/update the common function.
> 
> This function isn't the same as of_translate_dma_address(), more over
> it calls of_translate_dma_address() to get CPU address.
> 
> To calculate DMA bus offset we need to know the first DMA address explicitly,
> so then we can translate it to the corresponding CPU address.

Ah, got it. Sorry for being sloppy in my review. You are absolutely right.

> get_dma_pfn_offset() function does:
> - traverse DT and look for the first "dma-ranges" prop;
> - reads the first DMA address of the range;
> - translates it in CPU address using of_translate_dma_address();
> - calculates DMA bus offset as PFN_DOWN(cpu_addr - dma_addr);

I was confused about the first step here and thought you were actually
doing the translation there. After a more careful reading, I only
have comments on details:

> +static unsigned long get_dma_pfn_offset(struct device *dev)
> +{
> +       while (1) {
> +               naddr = of_n_addr_cells(node);
> +               nsize = of_n_size_cells(node);
> +               node = of_get_next_parent(node);
> +               if (!node)
> +                       break;
> +
> +               ranges = of_get_property(node, "dma-ranges", &len);
> +
> +               /* Ignore empty ranges, they imply no translation required */
> +               if (ranges && len > 0)
> +                       break;
> +       }

I think we should require an empty "dma-ranges" property to be
present to signify "no translation required" and interpret the
absence of the property as "no dma possible".

A special case is having no "dma-ranges" at all, which we have
on all existing systems, and I  would interpret that as "all
devices can do 32-bit DMA, no more but no less".


> +       /*
> +        * dma-ranges format:
> +        * DMA addr     : naddr cells
> +        * CPU addr     : pna cells
> +        * size         : nsize cells
> +        */
> +       len /= sizeof(u32);
> +       pna = of_n_addr_cells(node);
> +       dma_addr = of_read_number(ranges, nsize);

This should probably use naddr, not nsize.

	Arnd
Grygorii Strashko Feb. 25, 2014, 2:14 p.m. UTC | #5
Hi Arnd,

On 02/24/2014 11:38 PM, Santosh Shilimkar wrote:
> On Monday 24 February 2014 04:11 PM, Arnd Bergmann wrote:
>> On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote:
>>> From: Grygorii Strashko <grygorii.strashko@ti.com>
>>>   }
>>>   
>>> +static unsigned long get_dma_pfn_offset(struct device *dev)
>>> +{
>>> +	struct device_node *node = of_node_get(dev->of_node);
>>> +	const u32 *ranges = NULL;
>>> +	int len, naddr, nsize, pna;
>>> +	dma_addr_t dma_addr;
>>> +	phys_addr_t cpu_addr, size;
>>> +	unsigned long dma_pfn_offset = 0;
>>> +
>>> +	if (!node)
>>> +		return 0;
>>
>> Hmm, isn't this function the same as of_translate_dma_address()?
>>
>> I think we should have the implementation in common code, not hidden
>> in the keystone platform, to avoid duplication. If of_translate_dma_address
>> doesn't work, what is the problem, and can you fix it there?
>>
> Will have a look at it and get back. At least it makes sense to
> use/update the common function.

This function isn't the same as of_translate_dma_address(), more over
it calls of_translate_dma_address() to get CPU address.

To calculate DMA bus offset we need to know the first DMA address explicitly,
so then we can translate it to the corresponding CPU address.

get_dma_pfn_offset() function does:
- traverse DT and look for the first "dma-ranges" prop;
- reads the first DMA address of the range;
- translates it in CPU address using of_translate_dma_address();
- calculates DMA bus offset as PFN_DOWN(cpu_addr - dma_addr);

Also, I've investigated all places in Kernel were "dma-ranges" is used:
- cell_iommu_get_fixed_address() - arch/powerpc/platforms/cell/iommu.c
- ppc4xx_parse_dma_ranges() - arch/powerpc/sysdev/ppc4xx_pci.c

As result, I see no reasons to modify of_translate_dma_address()
(which actually is just wrapper for common __of_translate_address()).

Of course, possibly I've not fully got your point. :)

> 
>>>   static int keystone_platform_notifier(struct notifier_block *nb,
>>>   				      unsigned long event, void *dev)
>>>   {
>>> +	struct device *_dev = dev;
>>> +
>>>   	if (event != BUS_NOTIFY_ADD_DEVICE)
>>>   		return NOTIFY_DONE;
>>
>> Style: it would be nicer to name the local variable 'dev' and the
>> argument something else like 'p' or 'data'.
>>
> Agree. Will update that.
>   
>> I also wonder if this shouldn't be in ARM architecture wide code
>> rather than platform code. Unfortunately it can't be in drivers/base
>> since the offset is stored in an ARM specific location.
>>
> Notifier callback is in mach code mainly because platform's might have to
> do custom things here. Like setting up the coherent masters,
> populating the dma_pfn_offset or populating custom dma_ops etc.
> As such the in these callbacks is not much and I see only
> couple of machines using it.
> 

Thanks for your comments.

Regards,
-grygorii
Arnd Bergmann Feb. 25, 2014, 2:37 p.m. UTC | #6
On Tuesday 25 February 2014 17:19:59 Grygorii Strashko wrote:
> The Keystone can work in two modes:
> - LPAE disabled: In this case Platform bus notifier will not be called at all
> and DMA range == MEM range (32 bits mode)

The hardware setting doesn't change here, just the available memory, right?

> - LPAE enabled: In this case, I'll update code to treat absence of "dam-ranges"
> property as "no dma possible" and clean up DMA mask. Is it ok?
> In this mode "dam-ranges" prop has to be defined always to enable DMA.

Ok, sounds good.
 
> Empty "dma-ranges" can't be treated as "no translation required" because it is used
> to move one level up while traversing DT to find the valid "dma-ranges" prop.
> Used to handle child devices like:
> 
>  bus {
>         dma-ranges = <...>;
>         
>         Dev A {
>                 dma-ranges;
> 
>                 child_dev_A1 { }
>                 child_dev_A2 { }
> 
> This is standard behavior for "[dma-]ranges".

That is what I meant: "no translation required at this level". You still
have to go up to the root in order to know the full translation.

	Arnd
Grygorii Strashko Feb. 25, 2014, 3:19 p.m. UTC | #7
On 02/25/2014 03:37 PM, Arnd Bergmann wrote:
> On Tuesday 25 February 2014, Grygorii Strashko wrote:
>> On 02/24/2014 11:38 PM, Santosh Shilimkar wrote:
>>> On Monday 24 February 2014 04:11 PM, Arnd Bergmann wrote:
>>>> On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote:
>>>>> From: Grygorii Strashko <grygorii.strashko@ti.com>
>>
>>>> Hmm, isn't this function the same as of_translate_dma_address()?
>>>>
>>>> I think we should have the implementation in common code, not hidden
>>>> in the keystone platform, to avoid duplication. If of_translate_dma_address
>>>> doesn't work, what is the problem, and can you fix it there?
>>>>
>>> Will have a look at it and get back. At least it makes sense to
>>> use/update the common function.
>>
>> This function isn't the same as of_translate_dma_address(), more over
>> it calls of_translate_dma_address() to get CPU address.
>>
>> To calculate DMA bus offset we need to know the first DMA address explicitly,
>> so then we can translate it to the corresponding CPU address.
> 
> Ah, got it. Sorry for being sloppy in my review. You are absolutely right.
> 
>> get_dma_pfn_offset() function does:
>> - traverse DT and look for the first "dma-ranges" prop;
>> - reads the first DMA address of the range;
>> - translates it in CPU address using of_translate_dma_address();
>> - calculates DMA bus offset as PFN_DOWN(cpu_addr - dma_addr);
> 
> I was confused about the first step here and thought you were actually
> doing the translation there. After a more careful reading, I only
> have comments on details:
> 
>> +static unsigned long get_dma_pfn_offset(struct device *dev)
>> +{
>> +       while (1) {
>> +               naddr = of_n_addr_cells(node);
>> +               nsize = of_n_size_cells(node);
>> +               node = of_get_next_parent(node);
>> +               if (!node)
>> +                       break;
>> +
>> +               ranges = of_get_property(node, "dma-ranges", &len);
>> +
>> +               /* Ignore empty ranges, they imply no translation required */
>> +               if (ranges && len > 0)
>> +                       break;
>> +       }
> 
> I think we should require an empty "dma-ranges" property to be
> present to signify "no translation required" and interpret the
> absence of the property as "no dma possible".
> 
> A special case is having no "dma-ranges" at all, which we have
> on all existing systems, and I  would interpret that as "all
> devices can do 32-bit DMA, no more but no less".

The Keystone can work in two modes:
- LPAE disabled: In this case Platform bus notifier will not be called at all
and DMA range == MEM range (32 bits mode)
- LPAE enabled: In this case, I'll update code to treat absence of "dam-ranges"
property as "no dma possible" and clean up DMA mask. Is it ok?
In this mode "dam-ranges" prop has to be defined always to enable DMA.

Empty "dma-ranges" can't be treated as "no translation required" because it is used
to move one level up while traversing DT to find the valid "dma-ranges" prop.
Used to handle child devices like:

 bus {
 	dma-ranges = <...>;
	
	Dev A {
		dma-ranges;

		child_dev_A1 { }
		child_dev_A2 { }

This is standard behavior for "[dma-]ranges".

> 
> 
>> +       /*
>> +        * dma-ranges format:
>> +        * DMA addr     : naddr cells
>> +        * CPU addr     : pna cells
>> +        * size         : nsize cells
>> +        */
>> +       len /= sizeof(u32);
>> +       pna = of_n_addr_cells(node);
>> +       dma_addr = of_read_number(ranges, nsize);
> 
> This should probably use naddr, not nsize.

agree. 

Regards,
-grygorii
Grygorii Strashko Feb. 25, 2014, 4:05 p.m. UTC | #8
On 02/25/2014 04:37 PM, Arnd Bergmann wrote:
> On Tuesday 25 February 2014 17:19:59 Grygorii Strashko wrote:
>> The Keystone can work in two modes:
>> - LPAE disabled: In this case Platform bus notifier will not be called at all
>> and DMA range == MEM range (32 bits mode)
>
> The hardware setting doesn't change here, just the available memory, right?

Yes. The accessible RAM will be 0x8000 0000 - 0xFFFF FFFF -
the same as DMA range. But coherent DMA will not be supported.

So, no DMA bus offset required and actions in Platform bus notifier.

>
>> - LPAE enabled: In this case, I'll update code to treat absence of "dam-ranges"
>> property as "no dma possible" and clean up DMA mask. Is it ok?
>> In this mode "dam-ranges" prop has to be defined always to enable DMA.
>
> Ok, sounds good.
>
>> Empty "dma-ranges" can't be treated as "no translation required" because it is used
>> to move one level up while traversing DT to find the valid "dma-ranges" prop.
>> Used to handle child devices like:
>>
>>   bus {
>>          dma-ranges = <...>;
>>
>>          Dev A {
>>                  dma-ranges;
>>
>>                  child_dev_A1 { }
>>                  child_dev_A2 { }
>>
>> This is standard behavior for "[dma-]ranges".
>
> That is what I meant: "no translation required at this level". You still
> have to go up to the root in order to know the full translation.
>

Regards,
-grygorii
diff mbox

Patch

diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index e735555..ebe0a66 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -33,6 +33,7 @@ 
 
 static void __iomem *keystone_rstctrl;
 static struct notifier_block platform_nb;
+static unsigned long keystone_dma_pfn_offset __read_mostly;
 
 static bool is_coherent(struct device *dev)
 {
@@ -48,15 +49,82 @@  static bool is_coherent(struct device *dev)
 	return false;
 }
 
+static unsigned long get_dma_pfn_offset(struct device *dev)
+{
+	struct device_node *node = of_node_get(dev->of_node);
+	const u32 *ranges = NULL;
+	int len, naddr, nsize, pna;
+	dma_addr_t dma_addr;
+	phys_addr_t cpu_addr, size;
+	unsigned long dma_pfn_offset = 0;
+
+	if (!node)
+		return 0;
+
+	while (1) {
+		naddr = of_n_addr_cells(node);
+		nsize = of_n_size_cells(node);
+		node = of_get_next_parent(node);
+		if (!node)
+			break;
+
+		ranges = of_get_property(node, "dma-ranges", &len);
+
+		/* Ignore empty ranges, they imply no translation required */
+		if (ranges && len > 0)
+			break;
+	}
+
+	if (!ranges) {
+		dev_dbg(dev, "no dma-ranges found\n");
+		goto out;
+	}
+
+	/*
+	 * dma-ranges format:
+	 * DMA addr	: naddr cells
+	 * CPU addr	: pna cells
+	 * size		: nsize cells
+	 */
+	len /= sizeof(u32);
+	pna = of_n_addr_cells(node);
+	dma_addr = of_read_number(ranges, nsize);
+	cpu_addr = of_translate_dma_address(dev->of_node, ranges);
+	size = of_read_number(ranges + naddr + pna, nsize);
+	dma_pfn_offset = PFN_DOWN(cpu_addr - dma_addr);
+
+	dev_dbg(dev, "%s: naddr-%u nsize-%u pna-%u len-%u\n",
+				node->name, naddr, nsize, pna, len);
+	dev_dbg(dev, "dma_addr-%08x cpu_addr-%pa size-%pa dma_pfn_offset-%08lx\n",
+				dma_addr, &cpu_addr, &size, dma_pfn_offset);
+
+out:
+	of_node_put(node);
+	return dma_pfn_offset;
+}
+
 static int keystone_platform_notifier(struct notifier_block *nb,
 				      unsigned long event, void *dev)
 {
+	struct device *_dev = dev;
+
 	if (event != BUS_NOTIFY_ADD_DEVICE)
 		return NOTIFY_DONE;
 
-	if (is_coherent(dev)) {
-		set_dma_ops(dev, &arm_coherent_dma_ops);
-		pr_info("\t\t%s: keystone device is coherent\n", dev_name(dev));
+	if (!_dev)
+		return NOTIFY_BAD;
+
+	if (!_dev->of_node)
+		_dev->archdata.dma_pfn_offset = keystone_dma_pfn_offset;
+	else
+		_dev->archdata.dma_pfn_offset = get_dma_pfn_offset(_dev);
+
+	dev_dbg(_dev, "set dma_pfn_offset %08lx\n",
+				_dev->archdata.dma_pfn_offset);
+
+	if (is_coherent(_dev)) {
+		set_dma_ops(_dev, &arm_coherent_dma_ops);
+		dev_info(_dev, "keystone device is coherent\n");
 	}
 
 	return NOTIFY_OK;
@@ -120,6 +188,8 @@  static void __init keystone_init_meminfo(void)
 
 	/* Populate the arch idmap hook */
 	arch_virt_to_idmap = keystone_virt_to_idmap;
+	keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
+						KEYSTONE_LOW_PHYS_START);
 
 	platform_nb.notifier_call = keystone_platform_notifier;