Message ID | 1418266726-12004-2-git-send-email-a.kesavan@samsung.com (mailing list archive) |
---|---|
State | New, archived |

Hi Abhilash,

On Thursday, 11.12.2014, 08:28 +0530, Abhilash Kesavan wrote:
> Currently, the SRAM allocator returns device memory via ioremap.
> This causes issues on ARM64 when the internal SoC SRAM allocated by
> the generic sram driver is used for audio playback. The destination
> buffer address (which is ioremapped SRAM) is not 64-bit aligned for
> certain streams (e.g. 44.1k sampling rate). In such cases we get
> unhandled alignment faults. Use ioremap_wc in place of ioremap which
> gives us normal non-cacheable memory instead of device memory.

Could this break the omap_bus_sync() implementation in
arch/arm/mach-omap2/omap4-common.c?

void omap_bus_sync(void)
{
	if (dram_sync && sram_sync) {
		writel_relaxed(readl_relaxed(dram_sync), dram_sync);
		writel_relaxed(readl_relaxed(sram_sync), sram_sync);
		isb();
	}
}

It is used in wmb() and omap_do_wfi() to drain interconnect write
buffers on omap4/5. If sram_sync is mapped with write-combining, could
the last write to sram_sync stay stuck in the write-combining buffer
until after the function returns?

regards
Philipp

> Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
> ---
> This is based on the discussion about the crash here:
> http://www.spinics.net/lists/arm-kernel/msg384647.html
>
>  drivers/misc/sram.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
> index 21181fa..15b4d4e 100644
> --- a/drivers/misc/sram.c
> +++ b/drivers/misc/sram.c
> @@ -69,12 +69,23 @@ static int sram_probe(struct platform_device *pdev)
>  	INIT_LIST_HEAD(&reserve_list);
>
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> -	virt_base = devm_ioremap_resource(&pdev->dev, res);
> -	if (IS_ERR(virt_base))
> -		return PTR_ERR(virt_base);
> +	if (!res) {
> +		dev_err(&pdev->dev, "found no memory resource\n");
> +		return -EINVAL;
> +	}
>
>  	size = resource_size(res);
>
> +	if (!devm_request_mem_region(&pdev->dev,
> +			res->start, size, pdev->name)) {
> +		dev_err(&pdev->dev, "could not request region for resource\n");
> +		return -EBUSY;
> +	}
> +
> +	virt_base = devm_ioremap_wc(&pdev->dev, res->start, size);
> +	if (IS_ERR(virt_base))
> +		return PTR_ERR(virt_base);
> +
>  	sram = devm_kzalloc(&pdev->dev, sizeof(*sram), GFP_KERNEL);
>  	if (!sram)
>  		return -ENOMEM;

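For readers unfamiliar with omap_bus_sync(): the function quoted above reads each sync location and writes the same value straight back, so the accesses matter only for their effect on the interconnect, not for the data they carry. The following is a user-space sketch of just that access pattern. The names `readl_relaxed_mock`, `writel_relaxed_mock` and `bus_sync_mock` are hypothetical stand-ins, the kernel's `isb()` is left out, and the sketch cannot model write buffering at all, which is the actual question raised in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the kernel's relaxed accessors. */
static inline uint32_t readl_relaxed_mock(const volatile uint32_t *addr)
{
	return *addr;
}

static inline void writel_relaxed_mock(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/*
 * Same shape as the quoted omap_bus_sync(): one read plus one write-back
 * per sync location. The pointers are parameters here (an assumption made
 * so the sketch is self-contained); the kernel uses file-scope variables.
 * The isb() barrier from the original is omitted because it has no
 * portable user-space equivalent.
 */
static void bus_sync_mock(volatile uint32_t *dram_sync,
			  volatile uint32_t *sram_sync)
{
	if (dram_sync && sram_sync) {
		writel_relaxed_mock(readl_relaxed_mock(dram_sync), dram_sync);
		writel_relaxed_mock(readl_relaxed_mock(sram_sync), sram_sync);
	}
}
```

Because each location is rewritten with its own contents, calling the function never changes what is stored there; whether the second write has actually left the CPU's buffers by the time the function returns depends on the mapping type, which this sketch does not capture.
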
On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
> Hi Abhilash,
>
> On Thursday, 11.12.2014, 08:28 +0530, Abhilash Kesavan wrote:
> > Currently, the SRAM allocator returns device memory via ioremap.
> > This causes issues on ARM64 when the internal SoC SRAM allocated by
> > the generic sram driver is used for audio playback. The destination
> > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
> > certain streams (e.g. 44.1k sampling rate). In such cases we get
> > unhandled alignment faults. Use ioremap_wc in place of ioremap which
> > gives us normal non-cacheable memory instead of device memory.
>
> Could this break the omap_bus_sync() implementation in
> arch/arm/mach-omap2/omap4-common.c?
>
> void omap_bus_sync(void)
> {
> 	if (dram_sync && sram_sync) {
> 		writel_relaxed(readl_relaxed(dram_sync), dram_sync);
> 		writel_relaxed(readl_relaxed(sram_sync), sram_sync);
> 		isb();
> 	}
> }
>
> It is used in wmb() and omap_do_wfi() to drain interconnect write
> buffers on omap4/5. If sram_sync is mapped with write-combining, could
> the last write to sram_sync stay stuck in the write-combining buffer
> until after the function returns?

I think you have that issue anyway, since you can get an early write
response even if you use ioremap. Does the write to sram_sync have
side-effects that we need to wait for?

Will

Hi Will,

On Thursday, 11.12.2014, 10:39 +0000, Will Deacon wrote:
> On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
> > Hi Abhilash,
> >
> > On Thursday, 11.12.2014, 08:28 +0530, Abhilash Kesavan wrote:
> > > Currently, the SRAM allocator returns device memory via ioremap.
> > > This causes issues on ARM64 when the internal SoC SRAM allocated by
> > > the generic sram driver is used for audio playback. The destination
> > > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
> > > certain streams (e.g. 44.1k sampling rate). In such cases we get
> > > unhandled alignment faults. Use ioremap_wc in place of ioremap which
> > > gives us normal non-cacheable memory instead of device memory.
> >
> > Could this break the omap_bus_sync() implementation in
> > arch/arm/mach-omap2/omap4-common.c?
> >
> > void omap_bus_sync(void)
> > {
> > 	if (dram_sync && sram_sync) {
> > 		writel_relaxed(readl_relaxed(dram_sync), dram_sync);
> > 		writel_relaxed(readl_relaxed(sram_sync), sram_sync);
> > 		isb();
> > 	}
> > }
> >
> > It is used in wmb() and omap_do_wfi() to drain interconnect write
> > buffers on omap4/5. If sram_sync is mapped with write-combining, could
> > the last write to sram_sync stay stuck in the write-combining buffer
> > until after the function returns?
>
> I think you have that issue anyway, since you can get an early write
> response even if you use ioremap. Does the write to sram_sync have
> side-effects that we need to wait for?

[Added Tony Lindgren and Santosh Shilimkar to Cc:]

I don't know.

regards
Philipp

On Thu, Dec 11, 2014 at 11:40:46AM +0000, Philipp Zabel wrote:
> Hi Will,
>
> On Thursday, 11.12.2014, 10:39 +0000, Will Deacon wrote:
> > On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
> > > Hi Abhilash,
> > >
> > > On Thursday, 11.12.2014, 08:28 +0530, Abhilash Kesavan wrote:
> > > > Currently, the SRAM allocator returns device memory via ioremap.
> > > > This causes issues on ARM64 when the internal SoC SRAM allocated by
> > > > the generic sram driver is used for audio playback. The destination
> > > > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
> > > > certain streams (e.g. 44.1k sampling rate). In such cases we get
> > > > unhandled alignment faults. Use ioremap_wc in place of ioremap which
> > > > gives us normal non-cacheable memory instead of device memory.
> > >
> > > Could this break the omap_bus_sync() implementation in
> > > arch/arm/mach-omap2/omap4-common.c?
> > >
> > > void omap_bus_sync(void)
> > > {
> > > 	if (dram_sync && sram_sync) {
> > > 		writel_relaxed(readl_relaxed(dram_sync), dram_sync);
> > > 		writel_relaxed(readl_relaxed(sram_sync), sram_sync);
> > > 		isb();
> > > 	}
> > > }
> > >
> > > It is used in wmb() and omap_do_wfi() to drain interconnect write
> > > buffers on omap4/5. If sram_sync is mapped with write-combining, could
> > > the last write to sram_sync stay stuck in the write-combining buffer
> > > until after the function returns?
> >
> > I think you have that issue anyway, since you can get an early write
> > response even if you use ioremap. Does the write to sram_sync have
> > side-effects that we need to wait for?
>
> [Added Tony Lindgren and Santosh Shilimkar to Cc:]
> I don't know.

In addition to Will's question, do you care about the access size?
ioremap() returns Device memory which is bufferable (early
acknowledgement) but it guarantees the access size. With write
combining, you may get a different access size than requested.

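The access-size point can be illustrated with a deliberately simplified toy model. This is only a sketch under stated assumptions: `struct wc_model`, `wc_write`, `wc_flush` and `device_write` are hypothetical names, the model holds a single pending write and merges only exactly adjacent stores, and real hardware is free to merge (or not merge) in ways this does not capture.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of what the bus sees; not how any particular CPU works. */
struct wc_model {
	uint64_t addr;       /* start address of the pending write */
	unsigned size;       /* pending bytes, 0 means nothing pending */
	unsigned emitted;    /* number of bus transactions sent so far */
	unsigned last_size;  /* size in bytes of the last transaction */
};

/* Send whatever is pending as one transaction. */
static void wc_flush(struct wc_model *m)
{
	if (m->size == 0)
		return;
	m->emitted++;
	m->last_size = m->size;
	m->size = 0;
}

/* Write-combining: a store adjacent to the pending one is merged into it. */
static void wc_write(struct wc_model *m, uint64_t addr, unsigned size)
{
	assert(size > 0);
	if (m->size != 0 && addr == m->addr + m->size) {
		m->size += size;
		return;
	}
	wc_flush(m);
	m->addr = addr;
	m->size = size;
}

/* Device-like: every store goes out on its own, at the size requested. */
static void device_write(struct wc_model *m, uint64_t addr, unsigned size)
{
	(void)addr;
	assert(size > 0);
	wc_flush(m);
	m->emitted++;
	m->last_size = size;
}
```

In this model, two 4-byte stores to adjacent addresses reach the bus as two 4-byte transactions through `device_write`, but as a single 8-byte transaction through `wc_write` followed by `wc_flush`. A peripheral that only accepts 32-bit accesses would care about that difference; plain SRAM normally would not.
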
Hi,

On Thu, Dec 11, 2014 at 8:28 PM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Thu, Dec 11, 2014 at 11:40:46AM +0000, Philipp Zabel wrote:
>> Hi Will,
>>
>> On Thursday, 11.12.2014, 10:39 +0000, Will Deacon wrote:
>> > On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
>> > > Hi Abhilash,
>> > >
>> > > On Thursday, 11.12.2014, 08:28 +0530, Abhilash Kesavan wrote:
>> > > > Currently, the SRAM allocator returns device memory via ioremap.
>> > > > This causes issues on ARM64 when the internal SoC SRAM allocated by
>> > > > the generic sram driver is used for audio playback. The destination
>> > > > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
>> > > > certain streams (e.g. 44.1k sampling rate). In such cases we get
>> > > > unhandled alignment faults. Use ioremap_wc in place of ioremap which
>> > > > gives us normal non-cacheable memory instead of device memory.
>> > >
>> > > Could this break the omap_bus_sync() implementation in
>> > > arch/arm/mach-omap2/omap4-common.c?
>> > >
>> > > void omap_bus_sync(void)
>> > > {
>> > > 	if (dram_sync && sram_sync) {
>> > > 		writel_relaxed(readl_relaxed(dram_sync), dram_sync);
>> > > 		writel_relaxed(readl_relaxed(sram_sync), sram_sync);
>> > > 		isb();
>> > > 	}
>> > > }
>> > >
>> > > It is used in wmb() and omap_do_wfi() to drain interconnect write
>> > > buffers on omap4/5. If sram_sync is mapped with write-combining, could
>> > > the last write to sram_sync stay stuck in the write-combining buffer
>> > > until after the function returns?
>> >
>> > I think you have that issue anyway, since you can get an early write
>> > response even if you use ioremap. Does the write to sram_sync have
>> > side-effects that we need to wait for?
>>
>> [Added Tony Lindgren and Santosh Shilimkar to Cc:]
>> I don't know.
>
> In addition to Will's question, do you care about the access size?
> ioremap() returns Device memory which is bufferable (early
> acknowledgement) but it guarantees the access size. With write
> combining, you may get a different access size than requested.

From the existing dts files, omap, imx, rockchip and exynos seem to be
the only users of the sram allocator code. I have tested this on
Exynos5420, Exynos5800 and Exynos7; there is no change in behavior
seen on these boards. Tested-by for other SoCs would be appreciated.

Regards,
Abhilash

>
> --
> Catalin

* Abhilash Kesavan <kesavan.abhilash@gmail.com> [141217 04:37]:
> Hi,
>
> On Thu, Dec 11, 2014 at 8:28 PM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > On Thu, Dec 11, 2014 at 11:40:46AM +0000, Philipp Zabel wrote:
> >> Hi Will,
> >>
> >> On Thursday, 11.12.2014, 10:39 +0000, Will Deacon wrote:
> >> > On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
> >> > > Hi Abhilash,
> >> > >
> >> > > On Thursday, 11.12.2014, 08:28 +0530, Abhilash Kesavan wrote:
> >> > > > Currently, the SRAM allocator returns device memory via ioremap.
> >> > > > This causes issues on ARM64 when the internal SoC SRAM allocated by
> >> > > > the generic sram driver is used for audio playback. The destination
> >> > > > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
> >> > > > certain streams (e.g. 44.1k sampling rate). In such cases we get
> >> > > > unhandled alignment faults. Use ioremap_wc in place of ioremap which
> >> > > > gives us normal non-cacheable memory instead of device memory.
> >> > >
> >> > > Could this break the omap_bus_sync() implementation in
> >> > > arch/arm/mach-omap2/omap4-common.c?
> >> > >
> >> > > void omap_bus_sync(void)
> >> > > {
> >> > > 	if (dram_sync && sram_sync) {
> >> > > 		writel_relaxed(readl_relaxed(dram_sync), dram_sync);
> >> > > 		writel_relaxed(readl_relaxed(sram_sync), sram_sync);
> >> > > 		isb();
> >> > > 	}
> >> > > }
> >> > >
> >> > > It is used in wmb() and omap_do_wfi() to drain interconnect write
> >> > > buffers on omap4/5. If sram_sync is mapped with write-combining, could
> >> > > the last write to sram_sync stay stuck in the write-combining buffer
> >> > > until after the function returns?
> >> >
> >> > I think you have that issue anyway, since you can get an early write
> >> > response even if you use ioremap. Does the write to sram_sync have
> >> > side-effects that we need to wait for?
> >>
> >> [Added Tony Lindgren and Santosh Shilimkar to Cc:]
> >> I don't know.
> >
> > In addition to Will's question, do you care about the access size?
> > ioremap() returns Device memory which is bufferable (early
> > acknowledgement) but it guarantees the access size. With write
> > combining, you may get a different access size than requested.
>
> From the existing dts files, omap, imx, rockchip and exynos seem to be
> the only users of the sram allocator code. I have tested this on
> Exynos5420, Exynos5800 and Exynos7; there is no change in behavior
> seen on these boards. Tested-by for other SoCs would be appreciated.

Sorry for the delay, this seems to boot OK on omap4, so from that
point of view:

Tested-by: Tony Lindgren <tony@atomide.com>

Hi Tony,

On Mon, Jan 5, 2015 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Abhilash Kesavan <kesavan.abhilash@gmail.com> [141217 04:37]:
>> Hi,
>>
>> On Thu, Dec 11, 2014 at 8:28 PM, Catalin Marinas
>> <catalin.marinas@arm.com> wrote:
>> > On Thu, Dec 11, 2014 at 11:40:46AM +0000, Philipp Zabel wrote:
>> >> Hi Will,
>> >>
>> >> On Thursday, 11.12.2014, 10:39 +0000, Will Deacon wrote:
>> >> > On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
>> >> > > Hi Abhilash,
>> >> > >
>> >> > > On Thursday, 11.12.2014, 08:28 +0530, Abhilash Kesavan wrote:
>> >> > > > Currently, the SRAM allocator returns device memory via ioremap.
>> >> > > > This causes issues on ARM64 when the internal SoC SRAM allocated by
>> >> > > > the generic sram driver is used for audio playback. The destination
>> >> > > > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
>> >> > > > certain streams (e.g. 44.1k sampling rate). In such cases we get
>> >> > > > unhandled alignment faults. Use ioremap_wc in place of ioremap which
>> >> > > > gives us normal non-cacheable memory instead of device memory.
>> >> > >
>> >> > > Could this break the omap_bus_sync() implementation in
>> >> > > arch/arm/mach-omap2/omap4-common.c?
>> >> > >
>> >> > > void omap_bus_sync(void)
>> >> > > {
>> >> > > 	if (dram_sync && sram_sync) {
>> >> > > 		writel_relaxed(readl_relaxed(dram_sync), dram_sync);
>> >> > > 		writel_relaxed(readl_relaxed(sram_sync), sram_sync);
>> >> > > 		isb();
>> >> > > 	}
>> >> > > }
>> >> > >
>> >> > > It is used in wmb() and omap_do_wfi() to drain interconnect write
>> >> > > buffers on omap4/5. If sram_sync is mapped with write-combining, could
>> >> > > the last write to sram_sync stay stuck in the write-combining buffer
>> >> > > until after the function returns?
>> >> >
>> >> > I think you have that issue anyway, since you can get an early write
>> >> > response even if you use ioremap. Does the write to sram_sync have
>> >> > side-effects that we need to wait for?
>> >>
>> >> [Added Tony Lindgren and Santosh Shilimkar to Cc:]
>> >> I don't know.
>> >
>> > In addition to Will's question, do you care about the access size?
>> > ioremap() returns Device memory which is bufferable (early
>> > acknowledgement) but it guarantees the access size. With write
>> > combining, you may get a different access size than requested.
>>
>> From the existing dts files, omap, imx, rockchip and exynos seem to be
>> the only users of the sram allocator code. I have tested this on
>> Exynos5420, Exynos5800 and Exynos7; there is no change in behavior
>> seen on these boards. Tested-by for other SoCs would be appreciated.
>
> Sorry for the delay, this seems to boot OK on omap4, so from that
> point of view:
>
> Tested-by: Tony Lindgren <tony@atomide.com>

Thanks a lot for testing this. If someone with imx and rockchip boards
could help test this out, then we could look to get this in.

Regards,
Abhilash

Hi Abhilash,

On Tuesday, 06.01.2015, 19:57 +0530, Abhilash Kesavan wrote:
> >> From the existing dts files, omap, imx, rockchip and exynos seem to be
> >> the only users of the sram allocator code. I have tested this on
> >> Exynos5420, Exynos5800 and Exynos7; there is no change in behavior
> >> seen on these boards. Tested-by for other SoCs would be appreciated.
> >
> > Sorry for the delay, this seems to boot OK on omap4, so from that
> > point of view:
> >
> > Tested-by: Tony Lindgren <tony@atomide.com>
>
> Thanks a lot for testing this. If someone with imx and rockchip boards
> could help test this out, then we could look to get this in.

This shouldn't be a problem on i.MX, the coda driver doesn't access SRAM
from the CPU at all.

regards
Philipp

On Tue, Jan 6, 2015 at 10:54 AM, Philipp Zabel <p.zabel@pengutronix.de> wrote:
> Hi Abhilash,
>
> On Tuesday, 06.01.2015, 19:57 +0530, Abhilash Kesavan wrote:
>> >> From the existing dts files, omap, imx, rockchip and exynos seem to be
>> >> the only users of the sram allocator code. I have tested this on
>> >> Exynos5420, Exynos5800 and Exynos7; there is no change in behavior
>> >> seen on these boards. Tested-by for other SoCs would be appreciated.
>> >
>> > Sorry for the delay, this seems to boot OK on omap4, so from that
>> > point of view:
>> >
>> > Tested-by: Tony Lindgren <tony@atomide.com>
>>
>> Thanks a lot for testing this. If someone with imx and rockchip boards
>> could help test this out, then we could look to get this in.
>
> This shouldn't be a problem on i.MX, the coda driver doesn't access SRAM
> from the CPU at all.

Audio buffers are typically (perhaps not in mainline) in SRAM on i.MX
chips which are accessed by CPU and probably mmap'ed to userspace.
That could cause a mismatch in mappings although I would not expect
both the kernel and user space to touch the buffer. That being said, I
don't think this change should cause problems for i.MX (from what I
can remember).

Rob

Hi Rob and Philipp,

On Tue, Jan 6, 2015 at 10:57 PM, Rob Herring <robherring2@gmail.com> wrote:
> On Tue, Jan 6, 2015 at 10:54 AM, Philipp Zabel <p.zabel@pengutronix.de> wrote:
>> Hi Abhilash,
>>
>> On Tuesday, 06.01.2015, 19:57 +0530, Abhilash Kesavan wrote:
>>> >> From the existing dts files, omap, imx, rockchip and exynos seem to be
>>> >> the only users of the sram allocator code. I have tested this on
>>> >> Exynos5420, Exynos5800 and Exynos7; there is no change in behavior
>>> >> seen on these boards. Tested-by for other SoCs would be appreciated.
>>> >
>>> > Sorry for the delay, this seems to boot OK on omap4, so from that
>>> > point of view:
>>> >
>>> > Tested-by: Tony Lindgren <tony@atomide.com>
>>>
>>> Thanks a lot for testing this. If someone with imx and rockchip boards
>>> could help test this out, then we could look to get this in.
>>
>> This shouldn't be a problem on i.MX, the coda driver doesn't access SRAM
>> from the CPU at all.
>
> Audio buffers are typically (perhaps not in mainline) in SRAM on i.MX
> chips which are accessed by CPU and probably mmap'ed to userspace.
> That could cause a mismatch in mappings although I would not expect
> both the kernel and user space to touch the buffer. That being said, I
> don't think this change should cause problems for i.MX (from what I
> can remember).

Thanks for the confirmation regarding the i.MX chips. That leaves
rockchip, can someone help with it please?

Regards,
Abhilash

>
> Rob

Hi Abhilash,

On Thursday, 8 January 2015, 21:00:57, Abhilash Kesavan wrote:
> Thanks for the confirmation regarding the i.MX chips. That leaves
> rockchip, can someone help with it please?

Sorry for being late to this.

The one current sram use-case on Rockchip boards is the SMP bringup, but that
uses a reserved sram area. Nevertheless I tested your 2 patches on a rk3288
board and everything still works as expected, including smp bringup, so on
Rockchip:

Tested-by: Heiko Stuebner <heiko@sntech.de>

Heiko

diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
index 21181fa..15b4d4e 100644
--- a/drivers/misc/sram.c
+++ b/drivers/misc/sram.c
@@ -69,12 +69,23 @@ static int sram_probe(struct platform_device *pdev)
 	INIT_LIST_HEAD(&reserve_list);
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	virt_base = devm_ioremap_resource(&pdev->dev, res);
-	if (IS_ERR(virt_base))
-		return PTR_ERR(virt_base);
+	if (!res) {
+		dev_err(&pdev->dev, "found no memory resource\n");
+		return -EINVAL;
+	}
 
 	size = resource_size(res);
 
+	if (!devm_request_mem_region(&pdev->dev,
+			res->start, size, pdev->name)) {
+		dev_err(&pdev->dev, "could not request region for resource\n");
+		return -EBUSY;
+	}
+
+	virt_base = devm_ioremap_wc(&pdev->dev, res->start, size);
+	if (IS_ERR(virt_base))
+		return PTR_ERR(virt_base);
+
 	sram = devm_kzalloc(&pdev->dev, sizeof(*sram), GFP_KERNEL);
 	if (!sram)
 		return -ENOMEM;

Currently, the SRAM allocator returns device memory via ioremap.
This causes issues on ARM64 when the internal SoC SRAM allocated by
the generic sram driver is used for audio playback. The destination
buffer address (which is ioremapped SRAM) is not 64-bit aligned for
certain streams (e.g. 44.1k sampling rate). In such cases we get
unhandled alignment faults. Use ioremap_wc in place of ioremap which
gives us normal non-cacheable memory instead of device memory.

Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
---
This is based on the discussion about the crash here:
http://www.spinics.net/lists/arm-kernel/msg384647.html

 drivers/misc/sram.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)
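
As background to the commit message: on ARM64, an unaligned access to Device memory raises an alignment fault, whereas Normal non-cacheable memory tolerates it, which is why switching the mapping type makes the fault go away. A different way to stay safe on a Device mapping would be to make sure every store to the destination is naturally aligned. The following user-space sketch shows that idea only; `copy_aligned_dst` is a hypothetical helper, it is not what this patch does, and it is not a kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy n bytes to dst using only naturally aligned
 * stores on the destination side (single bytes, or 64-bit words at
 * 8-byte-aligned addresses).
 */
static void copy_aligned_dst(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* Head: byte stores until the destination is 8-byte aligned. */
	while (n > 0 && ((uintptr_t)d & 7) != 0) {
		*d++ = *s++;
		n--;
	}

	/*
	 * Body: 64-bit stores to the now-aligned destination. The word is
	 * assembled with memcpy, so the source may have any alignment.
	 */
	while (n >= 8) {
		uint64_t w;

		memcpy(&w, s, sizeof(w));
		*(volatile uint64_t *)d = w;
		d += 8;
		s += 8;
		n -= 8;
	}

	/* Tail: remaining bytes, one at a time. */
	while (n > 0) {
		*d++ = *s++;
		n--;
	}
}
```

For a destination that starts 3 bytes past an 8-byte boundary and a 20-byte payload, this issues 5 byte stores, one 64-bit store and 7 more byte stores. The trade-off against the patch's approach is that every writer of the buffer has to use such a helper, while remapping the SRAM as Normal non-cacheable memory lets ordinary copies work.
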