Message ID | 1393275235-1087-5-git-send-email-santosh.shilimkar@ti.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote: > From: Grygorii Strashko <grygorii.strashko@ti.com> > } > > +static unsigned long get_dma_pfn_offset(struct device *dev) > +{ > + struct device_node *node = of_node_get(dev->of_node); > + const u32 *ranges = NULL; > + int len, naddr, nsize, pna; > + dma_addr_t dma_addr; > + phys_addr_t cpu_addr, size; > + unsigned long dma_pfn_offset = 0; > + > + if (!node) > + return 0; Hmm, isn't this function the same as of_translate_dma_address()? I think we should have the implementation in common code, not hidden in the keystone platform, to avoid duplication. If of_translate_dma_address doesn't work, what is the problem, and can you fix it there? > static int keystone_platform_notifier(struct notifier_block *nb, > unsigned long event, void *dev) > { > + struct device *_dev = dev; > + > if (event != BUS_NOTIFY_ADD_DEVICE) > return NOTIFY_DONE; Style: it would be nicer to name the local variable 'dev' and the argument something else like 'p' or 'data'. I also wonder if this shouldn't be in ARM architecture wide code rather than platform code. Unfortunately it can't be in drivers/base since the offset is stored in an ARM specific location. Arnd
On Monday 24 February 2014 04:11 PM, Arnd Bergmann wrote: > On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote: >> From: Grygorii Strashko <grygorii.strashko@ti.com> >> } >> >> +static unsigned long get_dma_pfn_offset(struct device *dev) >> +{ >> + struct device_node *node = of_node_get(dev->of_node); >> + const u32 *ranges = NULL; >> + int len, naddr, nsize, pna; >> + dma_addr_t dma_addr; >> + phys_addr_t cpu_addr, size; >> + unsigned long dma_pfn_offset = 0; >> + >> + if (!node) >> + return 0; > > Hmm, isn't this function the same as of_translate_dma_address()? > > I think we should have the implementation in common code, not hidden > in the keystone platform, to avoid duplication. If of_translate_dma_address > doesn't work, what is the problem, and can you fix it there? > Will have a look at it and get back. At least it makes sense to use/update the common function. >> static int keystone_platform_notifier(struct notifier_block *nb, >> unsigned long event, void *dev) >> { >> + struct device *_dev = dev; >> + >> if (event != BUS_NOTIFY_ADD_DEVICE) >> return NOTIFY_DONE; > > Style: it would be nicer to name the local variable 'dev' and the > argument something else like 'p' or 'data'. > Agree. Will update that. > I also wonder if this shouldn't be in ARM architecture wide code > rather than platform code. Unfortunately it can't be in drivers/base > since the offset is stored in an ARM specific location. > Notifier callback is in mach code mainly because platform's might have to do custom things here. Like setting up the coherent masters, populating the dma_pfn_offset or populating custom dma_ops etc. As such the in these callbacks is not much and I see only couple of machines using it. Regards, Santosh
On Monday 24 February 2014 16:38:00 Santosh Shilimkar wrote: > > I also wonder if this shouldn't be in ARM architecture wide code > > rather than platform code. Unfortunately it can't be in drivers/base > > since the offset is stored in an ARM specific location. > > > Notifier callback is in mach code mainly because platform's might have to > do custom things here. Like setting up the coherent masters, > populating the dma_pfn_offset or populating custom dma_ops etc. > As such the in these callbacks is not much and I see only > couple of machines using it. Yes, but since "dma-ranges" is a standard property, I think we should provide a generic implementation so other platforms don't have to copy one. Most platforms today are never cache-coherent and have a trivial mapping, so they don't need this function, but I suppose the case you have is getting more popular. I think shmobile is in exactly the same position with their new r-car parts for instance. Arnd
On Tuesday 25 February 2014, Grygorii Strashko wrote: > On 02/24/2014 11:38 PM, Santosh Shilimkar wrote: > > On Monday 24 February 2014 04:11 PM, Arnd Bergmann wrote: > >> On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote: > >>> From: Grygorii Strashko <grygorii.strashko@ti.com> > > >> Hmm, isn't this function the same as of_translate_dma_address()? > >> > >> I think we should have the implementation in common code, not hidden > >> in the keystone platform, to avoid duplication. If of_translate_dma_address > >> doesn't work, what is the problem, and can you fix it there? > >> > > Will have a look at it and get back. At least it makes sense to > > use/update the common function. > > This function isn't the same as of_translate_dma_address(), more over > it calls of_translate_dma_address() to get CPU address. > > To calculate DMA bus offset we need to know the first DMA address explicitly, > so then we can translate it to the corresponding CPU address. Ah, got it. Sorry for being sloppy in my review. You are absolutely right. > get_dma_pfn_offset() function does: > - traverse DT and look for the first "dma-ranges" prop; > - reads the first DMA address of the range; > - translates it in CPU address using of_translate_dma_address(); > - calculates DMA bus offset as PFN_DOWN(cpu_addr - dma_addr); I was confused about the first step here and thought you were actually doing the translation there. After a more careful reading, I only have comments on details: > +static unsigned long get_dma_pfn_offset(struct device *dev) > +{ > + while (1) { > + naddr = of_n_addr_cells(node); > + nsize = of_n_size_cells(node); > + node = of_get_next_parent(node); > + if (!node) > + break; > + > + ranges = of_get_property(node, "dma-ranges", &len); > + > + /* Ignore empty ranges, they imply no translation required */ > + if (ranges && len > 0) > + break; > + } I think we should require an empty "dma-ranges" property to be present to signify "no translation required" and interpret the absence of the property as "no dma possible". A special case is having no "dma-ranges" at all, which we have on all existing systems, and I would interpret that as "all devices can do 32-bit DMA, no more but no less". > + /* > + * dma-ranges format: > + * DMA addr : naddr cells > + * CPU addr : pna cells > + * size : nsize cells > + */ > + len /= sizeof(u32); > + pna = of_n_addr_cells(node); > + dma_addr = of_read_number(ranges, nsize); This should probably use naddr, not nsize. Arnd
Hi Arnd, On 02/24/2014 11:38 PM, Santosh Shilimkar wrote: > On Monday 24 February 2014 04:11 PM, Arnd Bergmann wrote: >> On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote: >>> From: Grygorii Strashko <grygorii.strashko@ti.com> >>> } >>> >>> +static unsigned long get_dma_pfn_offset(struct device *dev) >>> +{ >>> + struct device_node *node = of_node_get(dev->of_node); >>> + const u32 *ranges = NULL; >>> + int len, naddr, nsize, pna; >>> + dma_addr_t dma_addr; >>> + phys_addr_t cpu_addr, size; >>> + unsigned long dma_pfn_offset = 0; >>> + >>> + if (!node) >>> + return 0; >> >> Hmm, isn't this function the same as of_translate_dma_address()? >> >> I think we should have the implementation in common code, not hidden >> in the keystone platform, to avoid duplication. If of_translate_dma_address >> doesn't work, what is the problem, and can you fix it there? >> > Will have a look at it and get back. At least it makes sense to > use/update the common function. This function isn't the same as of_translate_dma_address(), more over it calls of_translate_dma_address() to get CPU address. To calculate DMA bus offset we need to know the first DMA address explicitly, so then we can translate it to the corresponding CPU address. get_dma_pfn_offset() function does: - traverse DT and look for the first "dma-ranges" prop; - reads the first DMA address of the range; - translates it in CPU address using of_translate_dma_address(); - calculates DMA bus offset as PFN_DOWN(cpu_addr - dma_addr); Also, I've investigated all places in Kernel were "dma-ranges" is used: - cell_iommu_get_fixed_address() - arch/powerpc/platforms/cell/iommu.c - ppc4xx_parse_dma_ranges() - arch/powerpc/sysdev/ppc4xx_pci.c As result, I see no reasons to modify of_translate_dma_address() (which actually is just wrapper for common __of_translate_address()). Of course, possibly I've not fully got your point. :) > >>> static int keystone_platform_notifier(struct notifier_block *nb, >>> unsigned long event, void *dev) >>> { >>> + struct device *_dev = dev; >>> + >>> if (event != BUS_NOTIFY_ADD_DEVICE) >>> return NOTIFY_DONE; >> >> Style: it would be nicer to name the local variable 'dev' and the >> argument something else like 'p' or 'data'. >> > Agree. Will update that. > >> I also wonder if this shouldn't be in ARM architecture wide code >> rather than platform code. Unfortunately it can't be in drivers/base >> since the offset is stored in an ARM specific location. >> > Notifier callback is in mach code mainly because platform's might have to > do custom things here. Like setting up the coherent masters, > populating the dma_pfn_offset or populating custom dma_ops etc. > As such the in these callbacks is not much and I see only > couple of machines using it. > Thanks for your comments. Regards, -grygorii
On Tuesday 25 February 2014 17:19:59 Grygorii Strashko wrote: > The Keystone can work in two modes: > - LPAE disabled: In this case Platform bus notifier will not be called at all > and DMA range == MEM range (32 bits mode) The hardware setting doesn't change here, just the available memory, right? > - LPAE enabled: In this case, I'll update code to treat absence of "dam-ranges" > property as "no dma possible" and clean up DMA mask. Is it ok? > In this mode "dam-ranges" prop has to be defined always to enable DMA. Ok, sounds good. > Empty "dma-ranges" can't be treated as "no translation required" because it is used > to move one level up while traversing DT to find the valid "dma-ranges" prop. > Used to handle child devices like: > > bus { > dma-ranges = <...>; > > Dev A { > dma-ranges; > > child_dev_A1 { } > child_dev_A2 { } > > This is standard behavior for "[dma-]ranges". That is what I meant: "no translation required at this level". You still have to go up to the root in order to know the full translation. Arnd
On 02/25/2014 03:37 PM, Arnd Bergmann wrote: > On Tuesday 25 February 2014, Grygorii Strashko wrote: >> On 02/24/2014 11:38 PM, Santosh Shilimkar wrote: >>> On Monday 24 February 2014 04:11 PM, Arnd Bergmann wrote: >>>> On Monday 24 February 2014 15:53:55 Santosh Shilimkar wrote: >>>>> From: Grygorii Strashko <grygorii.strashko@ti.com> >> >>>> Hmm, isn't this function the same as of_translate_dma_address()? >>>> >>>> I think we should have the implementation in common code, not hidden >>>> in the keystone platform, to avoid duplication. If of_translate_dma_address >>>> doesn't work, what is the problem, and can you fix it there? >>>> >>> Will have a look at it and get back. At least it makes sense to >>> use/update the common function. >> >> This function isn't the same as of_translate_dma_address(), more over >> it calls of_translate_dma_address() to get CPU address. >> >> To calculate DMA bus offset we need to know the first DMA address explicitly, >> so then we can translate it to the corresponding CPU address. > > Ah, got it. Sorry for being sloppy in my review. You are absolutely right. > >> get_dma_pfn_offset() function does: >> - traverse DT and look for the first "dma-ranges" prop; >> - reads the first DMA address of the range; >> - translates it in CPU address using of_translate_dma_address(); >> - calculates DMA bus offset as PFN_DOWN(cpu_addr - dma_addr); > > I was confused about the first step here and thought you were actually > doing the translation there. After a more careful reading, I only > have comments on details: > >> +static unsigned long get_dma_pfn_offset(struct device *dev) >> +{ >> + while (1) { >> + naddr = of_n_addr_cells(node); >> + nsize = of_n_size_cells(node); >> + node = of_get_next_parent(node); >> + if (!node) >> + break; >> + >> + ranges = of_get_property(node, "dma-ranges", &len); >> + >> + /* Ignore empty ranges, they imply no translation required */ >> + if (ranges && len > 0) >> + break; >> + } > > I think we should require an empty "dma-ranges" property to be > present to signify "no translation required" and interpret the > absence of the property as "no dma possible". > > A special case is having no "dma-ranges" at all, which we have > on all existing systems, and I would interpret that as "all > devices can do 32-bit DMA, no more but no less". The Keystone can work in two modes: - LPAE disabled: In this case Platform bus notifier will not be called at all and DMA range == MEM range (32 bits mode) - LPAE enabled: In this case, I'll update code to treat absence of "dam-ranges" property as "no dma possible" and clean up DMA mask. Is it ok? In this mode "dam-ranges" prop has to be defined always to enable DMA. Empty "dma-ranges" can't be treated as "no translation required" because it is used to move one level up while traversing DT to find the valid "dma-ranges" prop. Used to handle child devices like: bus { dma-ranges = <...>; Dev A { dma-ranges; child_dev_A1 { } child_dev_A2 { } This is standard behavior for "[dma-]ranges". > > >> + /* >> + * dma-ranges format: >> + * DMA addr : naddr cells >> + * CPU addr : pna cells >> + * size : nsize cells >> + */ >> + len /= sizeof(u32); >> + pna = of_n_addr_cells(node); >> + dma_addr = of_read_number(ranges, nsize); > > This should probably use naddr, not nsize. agree. Regards, -grygorii
On 02/25/2014 04:37 PM, Arnd Bergmann wrote: > On Tuesday 25 February 2014 17:19:59 Grygorii Strashko wrote: >> The Keystone can work in two modes: >> - LPAE disabled: In this case Platform bus notifier will not be called at all >> and DMA range == MEM range (32 bits mode) > > The hardware setting doesn't change here, just the available memory, right? Yes. The accessible RAM will be 0x8000 0000 - 0xFFFF FFFF - the same as DMA range. But coherent DMA will not be supported. So, no DMA bus offset required and actions in Platform bus notifier. > >> - LPAE enabled: In this case, I'll update code to treat absence of "dam-ranges" >> property as "no dma possible" and clean up DMA mask. Is it ok? >> In this mode "dam-ranges" prop has to be defined always to enable DMA. > > Ok, sounds good. > >> Empty "dma-ranges" can't be treated as "no translation required" because it is used >> to move one level up while traversing DT to find the valid "dma-ranges" prop. >> Used to handle child devices like: >> >> bus { >> dma-ranges = <...>; >> >> Dev A { >> dma-ranges; >> >> child_dev_A1 { } >> child_dev_A2 { } >> >> This is standard behavior for "[dma-]ranges". > > That is what I meant: "no translation required at this level". You still > have to go up to the root in order to know the full translation. > Regards, -grygorii
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c index e735555..ebe0a66 100644 --- a/arch/arm/mach-keystone/keystone.c +++ b/arch/arm/mach-keystone/keystone.c @@ -33,6 +33,7 @@ static void __iomem *keystone_rstctrl; static struct notifier_block platform_nb; +static unsigned long keystone_dma_pfn_offset __read_mostly; static bool is_coherent(struct device *dev) { @@ -48,15 +49,82 @@ static bool is_coherent(struct device *dev) return false; } +static unsigned long get_dma_pfn_offset(struct device *dev) +{ + struct device_node *node = of_node_get(dev->of_node); + const u32 *ranges = NULL; + int len, naddr, nsize, pna; + dma_addr_t dma_addr; + phys_addr_t cpu_addr, size; + unsigned long dma_pfn_offset = 0; + + if (!node) + return 0; + + while (1) { + naddr = of_n_addr_cells(node); + nsize = of_n_size_cells(node); + node = of_get_next_parent(node); + if (!node) + break; + + ranges = of_get_property(node, "dma-ranges", &len); + + /* Ignore empty ranges, they imply no translation required */ + if (ranges && len > 0) + break; + } + + if (!ranges) { + dev_dbg(dev, "no dma-ranges found\n"); + goto out; + } + + /* + * dma-ranges format: + * DMA addr : naddr cells + * CPU addr : pna cells + * size : nsize cells + */ + len /= sizeof(u32); + pna = of_n_addr_cells(node); + dma_addr = of_read_number(ranges, nsize); + cpu_addr = of_translate_dma_address(dev->of_node, ranges); + size = of_read_number(ranges + naddr + pna, nsize); + dma_pfn_offset = PFN_DOWN(cpu_addr - dma_addr); + + dev_dbg(dev, "%s: naddr-%u nsize-%u pna-%u len-%u\n", + node->name, naddr, nsize, pna, len); + dev_dbg(dev, "dma_addr-%08x cpu_addr-%pa size-%pa dma_pfn_offset-%08lx\n", + dma_addr, &cpu_addr, &size, dma_pfn_offset); + +out: + of_node_put(node); + return dma_pfn_offset; +} + static int keystone_platform_notifier(struct notifier_block *nb, unsigned long event, void *dev) { + struct device *_dev = dev; + if (event != BUS_NOTIFY_ADD_DEVICE) return NOTIFY_DONE; - if (is_coherent(dev)) { - set_dma_ops(dev, &arm_coherent_dma_ops); - pr_info("\t\t%s: keystone device is coherent\n", dev_name(dev)); + if (!_dev) + return NOTIFY_BAD; + + if (!_dev->of_node) + _dev->archdata.dma_pfn_offset = keystone_dma_pfn_offset; + else + _dev->archdata.dma_pfn_offset = get_dma_pfn_offset(_dev); + + dev_dbg(_dev, "set dma_pfn_offset %08lx\n", + _dev->archdata.dma_pfn_offset); + + if (is_coherent(_dev)) { + set_dma_ops(_dev, &arm_coherent_dma_ops); + dev_info(_dev, "keystone device is coherent\n"); } return NOTIFY_OK; @@ -120,6 +188,8 @@ static void __init keystone_init_meminfo(void) /* Populate the arch idmap hook */ arch_virt_to_idmap = keystone_virt_to_idmap; + keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START - + KEYSTONE_LOW_PHYS_START); platform_nb.notifier_call = keystone_platform_notifier;