Message ID | 20161001140939.GA31220@vaio-ubuntu (mailing list archive) |
---|---|
State | New, archived |
Hi,

On Sat, Oct 01, 2016 at 04:09:39PM +0200, Paweł Jarosz wrote:
> For some reason accessing memory region above 0xfe000000 freezes
> system on rk3066. There is similar bug on later rockchip soc (rk3288)
> solved same way.
>
> Signed-off-by: Paweł Jarosz <paweljarosz3691@gmail.com>
> ---
>  arch/arm/boot/dts/rk3066a.dtsi | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/arch/arm/boot/dts/rk3066a.dtsi b/arch/arm/boot/dts/rk3066a.dtsi
> index 0d0dae3..44c8956 100644
> --- a/arch/arm/boot/dts/rk3066a.dtsi
> +++ b/arch/arm/boot/dts/rk3066a.dtsi
> @@ -93,6 +93,19 @@
>  		};
>  	};
>
> +	reserved-memory {
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges;
> +		/*
> +		 * The rk3066 cannot use the memory area above 0x9F000000
> +		 * for some unknown reason.
> +		 */
> +		unusable@9F000000 {
> +			reg = <0x9F000000 0x1000000>;
> +		};

I don't think this is a sane workaround, but it is at best difficult to
tell, given there's no reason given for why this memory is unusable.

For instance, if bus accesses to this address hang, then this patch only
makes the hang less likely, since the kernel will still map the region (and
therefore the CPU can perform speculative accesses).

Are issues with this memory consistently seen in practice?

Can you enable CONFIG_MEMTEST and pass 'memtest' to the kernel, to determine
if the memory is returning erroneous values?

Thanks,
Mark.
On Saturday, 1 October 2016, 19:17:11 CEST, Mark Rutland wrote:
> Hi,
>
> On Sat, Oct 01, 2016 at 04:09:39PM +0200, Paweł Jarosz wrote:
> > For some reason accessing memory region above 0xfe000000 freezes
> > system on rk3066. There is similar bug on later rockchip soc (rk3288)
> > solved same way.
> >
> > Signed-off-by: Paweł Jarosz <paweljarosz3691@gmail.com>
> > ---
> >  arch/arm/boot/dts/rk3066a.dtsi | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/arch/arm/boot/dts/rk3066a.dtsi b/arch/arm/boot/dts/rk3066a.dtsi
> > index 0d0dae3..44c8956 100644
> > --- a/arch/arm/boot/dts/rk3066a.dtsi
> > +++ b/arch/arm/boot/dts/rk3066a.dtsi
> > @@ -93,6 +93,19 @@
> >  		};
> >  	};
> >
> > +	reserved-memory {
> > +		#address-cells = <1>;
> > +		#size-cells = <1>;
> > +		ranges;
> > +		/*
> > +		 * The rk3066 cannot use the memory area above 0x9F000000
> > +		 * for some unknown reason.
> > +		 */
> > +		unusable@9F000000 {
> > +			reg = <0x9F000000 0x1000000>;
> > +		};
>
> I don't think this is a sane workaround, but it is at best difficult to
> tell, given there's no reason given for why this memory is unusable.
>
> For instance, if bus accesses to this address hang, then this patch only
> makes the hang less likely, since the kernel will still map the region (and
> therefore the CPU can perform speculative accesses).
>
> Are issues with this memory consistently seen in practice?
>
> Can you enable CONFIG_MEMTEST and pass 'memtest' to the kernel, to determine
> if the memory is returning erroneous values?

just for the sake of completeness, on the rk3288 the issue was the dma not
being able to access the specific memory region (interestingly also the last
16MB but of the 4GB area supported on the rk3288). So memory itself was ok,
just dma access to it failed.

We didn't find any other sane solution to limit the dma access in a general way
at the time, so opted for just blocking the memory region (as it was similarly
only

In the patch above, the newly blocked area is in the middle of the two 1gb
memory areas (0x60000000-0xa0000000-1, 0xa0000000-0xe0000000-1).

Pavel, apart from Mark's CONFIG_MEMTEST request above could you also specify
what type of error you see please?

Thanks
Heiko
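The address arithmetic behind the remark about the blocked area can be checked with a short sketch. The helper names below are hypothetical; only the addresses come from the thread (the `reg = <0x9F000000 0x1000000>` entry and the two 1gb areas listed above).

```python
# Sketch only: helper names are hypothetical, addresses are taken from the thread.
MIB = 1024 * 1024

# The two 1 GiB RAM areas listed above, as half-open [start, end) ranges.
BANKS = [(0x60000000, 0xA0000000), (0xA0000000, 0xE0000000)]


def region_end(base, size):
    """Return the first address past a region described as <base size>."""
    return base + size


def containing_bank(base, size, banks=BANKS):
    """Return the index of the bank that fully contains the region, or None."""
    end = region_end(base, size)
    for index, (start, stop) in enumerate(banks):
        if start <= base and end <= stop:
            return index
    return None


if __name__ == "__main__":
    base, size = 0x9F000000, 0x1000000  # reg = <0x9F000000 0x1000000>
    print(hex(region_end(base, size)))   # end of the reserved region
    print(size // MIB, "MiB")            # size of the reserved region
    print(containing_bank(base, size))   # which 1 GiB area it sits in
```

Under these assumptions the region is 16 MiB long and ends exactly at 0xA0000000, so it is the last 16 MiB of the first area, right at the boundary between the two.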
Hi,

The main symptom is a complete system freeze.

With CONFIG_MEMTEST enabled and "memtest" passed to the kernel, all tests run ok.
But when I run a command such as "memtester 800M" or a simple "apt update",
the freeze happens.

And when I reserve this region in the dts, the board is stable again.

Thanks,
Pawel

On 01.10.2016 at 21:18, Heiko Stuebner wrote:
> On Saturday, 1 October 2016, 19:17:11 CEST, Mark Rutland wrote:
>> Hi,
>>
>> On Sat, Oct 01, 2016 at 04:09:39PM +0200, Paweł Jarosz wrote:
>>> For some reason accessing memory region above 0xfe000000 freezes
>>> system on rk3066. There is similar bug on later rockchip soc (rk3288)
>>> solved same way.
>>>
>>> Signed-off-by: Paweł Jarosz <paweljarosz3691@gmail.com>
>>> ---
>>>  arch/arm/boot/dts/rk3066a.dtsi | 13 +++++++++++++
>>>  1 file changed, 13 insertions(+)
>>>
>>> diff --git a/arch/arm/boot/dts/rk3066a.dtsi b/arch/arm/boot/dts/rk3066a.dtsi
>>> index 0d0dae3..44c8956 100644
>>> --- a/arch/arm/boot/dts/rk3066a.dtsi
>>> +++ b/arch/arm/boot/dts/rk3066a.dtsi
>>> @@ -93,6 +93,19 @@
>>>  		};
>>>  	};
>>>
>>> +	reserved-memory {
>>> +		#address-cells = <1>;
>>> +		#size-cells = <1>;
>>> +		ranges;
>>> +		/*
>>> +		 * The rk3066 cannot use the memory area above 0x9F000000
>>> +		 * for some unknown reason.
>>> +		 */
>>> +		unusable@9F000000 {
>>> +			reg = <0x9F000000 0x1000000>;
>>> +		};
>> I don't think this is a sane workaround, but it is at best difficult to
>> tell, given there's no reason given for why this memory is unusable.
>>
>> For instance, if bus accesses to this address hang, then this patch only
>> makes the hang less likely, since the kernel will still map the region (and
>> therefore the CPU can perform speculative accesses).
>>
>> Are issues with this memory consistently seen in practice?
>>
>> Can you enable CONFIG_MEMTEST and pass 'memtest' to the kernel, to determine
>> if the memory is returning erroneous values?
> just for the sake of completeness, on the rk3288 the issue was the dma not
> being able to access the specific memory region (interestingly also the last
> 16MB but of the 4GB area supported on the rk3288). So memory itself was ok,
> just dma access to it failed.
>
> We didn't find any other sane solution to limit the dma access in a general way
> at the time, so opted for just blocking the memory region (as it was similarly
> only
>
> In the patch above, the newly blocked area is in the middle of the two 1gb
> memory areas (0x60000000-0xa0000000-1, 0xa0000000-0xe0000000-1).
>
> Pavel, apart from Mark's CONFIG_MEMTEST request above could you also specify
> what type of error you see please?
>
>
> Thanks
> Heiko
On Sat, Oct 01, 2016 at 09:18:15PM +0200, Heiko Stuebner wrote:
> On Saturday, 1 October 2016, 19:17:11 CEST, Mark Rutland wrote:
> > On Sat, Oct 01, 2016 at 04:09:39PM +0200, Paweł Jarosz wrote:
> > > For some reason accessing memory region above 0xfe000000 freezes
> > > system on rk3066. There is similar bug on later rockchip soc (rk3288)
> > > solved same way.

[...]

> > > +	reserved-memory {
> > > +		#address-cells = <1>;
> > > +		#size-cells = <1>;
> > > +		ranges;
> > > +		/*
> > > +		 * The rk3066 cannot use the memory area above 0x9F000000
> > > +		 * for some unknown reason.
> > > +		 */
> > > +		unusable@9F000000 {
> > > +			reg = <0x9F000000 0x1000000>;
> > > +		};
> >
> > I don't think this is a sane workaround, but it is at best difficult to
> > tell, given there's no reason given for why this memory is unusable.
> >
> > For instance, if bus accesses to this address hang, then this patch only
> > makes the hang less likely, since the kernel will still map the region (and
> > therefore the CPU can perform speculative accesses).
> >
> > Are issues with this memory consistently seen in practice?
> >
> > Can you enable CONFIG_MEMTEST and pass 'memtest' to the kernel, to determine
> > if the memory is returning erroneous values?
>
> just for the sake of completeness, on the rk3288 the issue was the dma not
> being able to access the specific memory region (interestingly also the last
> 16MB but of the 4GB area supported on the rk3288). So memory itself was ok,
> just dma access to it failed.

How odd.

> We didn't find any other sane solution to limit the dma access in a general way
> at the time, so opted for just blocking the memory region (as it was similarly
> only

I was under the impression that dma-ranges could describe this kind of
DMA addressing limitation. Was there some problem with that? Perhaps the
driver is not acquiring/configuring its mask correctly?

Thanks,
Mark.
On Monday, 3 October 2016, 11:20:54 CEST, Mark Rutland wrote:
> On Sat, Oct 01, 2016 at 09:18:15PM +0200, Heiko Stuebner wrote:
> > On Saturday, 1 October 2016, 19:17:11 CEST, Mark Rutland wrote:
> > > On Sat, Oct 01, 2016 at 04:09:39PM +0200, Paweł Jarosz wrote:
> > > > For some reason accessing memory region above 0xfe000000 freezes
> > > > system on rk3066. There is similar bug on later rockchip soc (rk3288)
> > > > solved same way.
>
> [...]
>
> > > > +	reserved-memory {
> > > > +		#address-cells = <1>;
> > > > +		#size-cells = <1>;
> > > > +		ranges;
> > > > +		/*
> > > > +		 * The rk3066 cannot use the memory area above 0x9F000000
> > > > +		 * for some unknown reason.
> > > > +		 */
> > > > +		unusable@9F000000 {
> > > > +			reg = <0x9F000000 0x1000000>;
> > > > +		};
> > >
> > > I don't think this is a sane workaround, but it is at best difficult to
> > > tell, given there's no reason given for why this memory is unusable.
> > >
> > > For instance, if bus accesses to this address hang, then this patch only
> > > makes the hang less likely, since the kernel will still map the region
> > > (and therefore the CPU can perform speculative accesses).
> > >
> > > Are issues with this memory consistently seen in practice?
> > >
> > > Can you enable CONFIG_MEMTEST and pass 'memtest' to the kernel, to
> > > determine if the memory is returning erroneous values?
> >
> > just for the sake of completeness, on the rk3288 the issue was the dma not
> > being able to access the specific memory region (interestingly also the
> > last 16MB but of the 4GB area supported on the rk3288). So memory itself
> > was ok, just dma access to it failed.
>
> How odd.
>
> > We didn't find any other sane solution to limit the dma access in a
> > general way at the time, so opted for just blocking the memory region (as
> > it was similarly only
>
> I was under the impression that dma-ranges could describe this kind of
> DMA addressing limitation. Was there some problem with that? Perhaps the
> driver is not acquiring/configuring its mask correctly?

I remember looking at (and trying) different options back then.

dma-mask wanted power-of-2 values (so it's either 4GB or 2GB (or lower)),
zone-dma was a 32bit (and non-dt) thing and dma-ranges seem to simply also
calculate a dma-mask from the value, so you're down to 2GB again.

So just blocking off those 16MB at the end for 4GB devices somehow sounded
nicer than limiting dma access to only half the memory.

I may be overlooking something but that was what I came up with last year.


Heiko
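The "either 4GB or 2GB" point can be illustrated with a small calculation. This is only a sketch of the arithmetic, assuming (as described above) that a DMA mask has to have the form 2**n - 1; it is not the kernel's implementation, and `largest_pow2_mask` is a hypothetical name.

```python
# Sketch only: illustrates the power-of-2 constraint described in the mail above.
# largest_pow2_mask is a hypothetical helper, not a kernel function.
GIB = 1 << 30
MIB = 1 << 20


def largest_pow2_mask(addressable_bytes):
    """Largest mask of the form 2**n - 1 that stays below addressable_bytes."""
    if addressable_bytes < 1:
        raise ValueError("need at least one addressable byte")
    # Largest n such that 2**n <= addressable_bytes.
    n = addressable_bytes.bit_length() - 1
    return (1 << n) - 1


if __name__ == "__main__":
    full = largest_pow2_mask(4 * GIB)                # whole 4GB reachable by DMA
    limited = largest_pow2_mask(4 * GIB - 16 * MIB)  # last 16MB not reachable
    print(hex(full), hex(limited))
    print((limited + 1) // GIB, "GiB reachable with the reduced mask")
```

Under that assumption, excluding only the last 16MB already forces the mask down to 0x7fffffff, which is the "limiting dma access to only half the memory" trade-off mentioned above.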
>>>> I don't think this is a sane workaround, but it is at best difficult to
>>>> tell, given there's no reason given for why this memory is unusable.
>>>>
>>>> For instance, if bus accesses to this address hang, then this patch only
>>>> makes the hang less likely, since the kernel will still map the region
>>>> (and therefore the CPU can perform speculative accesses).
>>>>
>>>> Are issues with this memory consistently seen in practice?
>>>>
>>>> Can you enable CONFIG_MEMTEST and pass 'memtest' to the kernel, to
>>>> determine if the memory is returning erroneous values?
>>> just for the sake of completeness, on the rk3288 the issue was the dma not
>>> being able to access the specific memory region (interestingly also the
>>> last 16MB but of the 4GB area supported on the rk3288). So memory itself
>>> was ok, just dma access to it failed.
>> How odd.
>>
>>> We didn't find any other sane solution to limit the dma access in a
>>> general way at the time, so opted for just blocking the memory region (as
>>> it was similarly only
>> I was under the impression that dma-ranges could describe this kind of
>> DMA addressing limitation. Was there some problem with that? Perhaps the
>> driver is not acquiring/configuring its mask correctly?
> I remember looking at (and trying) different options back then.
>
> dma-mask wanted power-of-2 values (so it's either 4GB or 2GB (or lower)),
> zone-dma was a 32bit (and non-dt) thing and dma-ranges seem to simply also
> calculate a dma-mask from the value, so you're down to 2GB again.
>
> So just blocking of those 16MB at the end for 4GB devices somehow sounded
> nicer than limiting dma access to only half the memory.
>
> I may be overlooking something but that was what I came up with last year.
>
>
> Heiko

Is there a chance to accept this patch?

I know it's not the best solution to this problem, but I don't know
a better one.
Hi Paweł,

On Tuesday, 4 October 2016, 13:56:07, Paweł Jarosz wrote:
> >>>> I don't think this is a sane workaround, but it is at best difficult to
> >>>> tell, given there's no reason given for why this memory is unusable.
> >>>>
> >>>> For instance, if bus accesses to this address hang, then this patch
> >>>> only makes the hang less likely, since the kernel will still map the
> >>>> region (and therefore the CPU can perform speculative accesses).
> >>>>
> >>>> Are issues with this memory consistently seen in practice?
> >>>>
> >>>> Can you enable CONFIG_MEMTEST and pass 'memtest' to the kernel, to
> >>>> determine if the memory is returning erroneous values?
> >>>
> >>> just for the sake of completeness, on the rk3288 the issue was the dma
> >>> not being able to access the specific memory region (interestingly also
> >>> the last 16MB but of the 4GB area supported on the rk3288). So memory
> >>> itself was ok, just dma access to it failed.
> >>
> >> How odd.
> >>
> >>> We didn't find any other sane solution to limit the dma access in a
> >>> general way at the time, so opted for just blocking the memory region
> >>> (as it was similarly only
> >>
> >> I was under the impression that dma-ranges could describe this kind of
> >> DMA addressing limitation. Was there some problem with that? Perhaps the
> >> driver is not acquiring/configuring its mask correctly?
> >
> > I remember looking at (and trying) different options back then.
> >
> > dma-mask wanted power-of-2 values (so it's either 4GB or 2GB (or lower)),
> > zone-dma was a 32bit (and non-dt) thing and dma-ranges seem to simply also
> > calculate a dma-mask from the value, so you're down to 2GB again.
> >
> > So just blocking of those 16MB at the end for 4GB devices somehow sounded
> > nicer than limiting dma access to only half the memory.
> >
> > I may be overlooking something but that was what I came up with last year.
> >
> >
> > Heiko
>
> Is there a chance to accept this patch?
>
> I know it's not the best solution to this problem, but i don't know
> a better one.

there is always a "chance". But with changes like these, we always try to find
a real cause first, before resorting to solutions like this.

So it's definitely not off the table, but I'd like to investigate further first,
so that we don't accumulate unnecessary hacks over time.

It is especially strange that your region seems to be in the middle of the
designated ram area.

Could you please tell which board you're using (and how much memory it has)?


Thanks
Heiko
Hi, Paweł:

On 2016-10-01 22:09, Paweł Jarosz wrote:
> For some reason accessing memory region above 0xfe000000 freezes
> system on rk3066. There is similar bug on later rockchip soc (rk3288)

RK3066 only supports 2GB of memory, from 0x60000000 to 0xE0000000; it cannot
access above 0xfe000000. I think you mean 0x9F000000?

> solved same way.
>
> Signed-off-by: Paweł Jarosz <paweljarosz3691@gmail.com>
> ---
>  arch/arm/boot/dts/rk3066a.dtsi | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/arch/arm/boot/dts/rk3066a.dtsi b/arch/arm/boot/dts/rk3066a.dtsi
> index 0d0dae3..44c8956 100644
> --- a/arch/arm/boot/dts/rk3066a.dtsi
> +++ b/arch/arm/boot/dts/rk3066a.dtsi
> @@ -93,6 +93,19 @@
>  		};
>  	};
>
> +	reserved-memory {
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges;
> +		/*
> +		 * The rk3066 cannot use the memory area above 0x9F000000
> +		 * for some unknown reason.
> +		 */

I don't remember RK3066 having such a limit. I will double check with our IC
design team.
Do you know which master cannot access this memory area
[0x9F000000~0xA0000000)?

> +		unusable@9F000000 {
> +			reg = <0x9F000000 0x1000000>;
> +		};
> +	};
> +
>  	i2s0: i2s@10118000 {
>  		compatible = "rockchip,rk3066-i2s";
>  		reg = <0x10118000 0x2000>;
Hi

On 05.10.2016 at 04:27, Huang, Tao wrote:
> Hi, Paweł:
> On 2016-10-01 22:09, Paweł Jarosz wrote:
>> For some reason accessing memory region above 0xfe000000 freezes
>> system on rk3066. There is similar bug on later rockchip soc (rk3288)
> RK3066 only support 2GB memory from 0x60000000 to 0xE0000000, can not access
> above 0xfe000000. I think you mean 0x9F000000?

Yes, I meant 0x9F000000. Sorry for that.

> I don't remember RK3066 has such limit. I will double check with our IC
> design team.
> Do you know which master can not access this memory area
> [0x9F000000~0xA0000000)?

I don't.

> Could you please tell which board you're using (and how much memory it has)

Rikomagic MK808, 1GB RAM

Thanks
Pawel
Hi Paweł:

On 2016-10-05 14:09, Paweł Jarosz wrote:
> On 05.10.2016 at 04:27, Huang, Tao wrote:
>> I don't remember RK3066 has such limit. I will double check with our IC
>> design team.
>> Do you know which master can not access this memory area
>> [0x9F000000~0xA0000000)?
> I don't.
>> Could you please tell which board you're using (and how much memory it has)
> Rikomagic MK808 1GB RAM
>

Our IC guys need us to tell them which master cannot access this area: DMA,
the EMMC controller, the GPU, etc.? Could you tell me how to reproduce the
issue?
We can confirm that the CPU core can access this memory through /dev/mem, and
our test board has 1GB too. Personally, I don't think RK3066 has such a limit,
because we did not find one when we verified this chip.

Thanks,
Huang, Tao
Hi

On 10.10.2016 at 09:18, Huang, Tao wrote:
> Our IC guy need us tell them which master can not access such area, DMA
> or EMMC Controller or GPU, etc? Could you tell me how to reproduce such
> issue?
> And we can confirm CPU core can access this memory through /dev/mem and
> the test board is 1GB too. Personally, I don't think RK3066 has such
> limit because when we verify this chip, we don't found such limit at all.
>
> Thanks,
> Huang, Tao

I'm getting this on Ubuntu 16.04 with mainline kernel.
My board always freezes when I type: "memtester 800M"
Hi, Paweł:

On 2016-10-10 17:11, Paweł Jarosz wrote:
> On 10.10.2016 at 09:18, Huang, Tao wrote:
>> Our IC guy need us tell them which master can not access such area, DMA
>> or EMMC Controller or GPU, etc? Could you tell me how to reproduce such
>> issue?
>> And we can confirm CPU core can access this memory through /dev/mem and
>> the test board is 1GB too. Personally, I don't think RK3066 has such
>> limit because when we verify this chip, we don't found such limit at all.
>>
>> Thanks,
>> Huang, Tao
> I'm getting this on Ubuntu 16.04 with mainline kernel.
> My board always freezes when i type: "memtester 800M"
>

We tried running memtester 800M with Linux kernel 4.8, and it was killed by OOM.
But if we run:

# memtester -p 0x9F000000 16K 1
memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 0MB (16384 bytes)
Loop 1/1:
  Stuck Address       : ok
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : ok
  Bit Spread          : ok
  Bit Flip            : ok
  Walking Ones        : ok
  Walking Zeroes      : ok
  8-bit Writes        : ok
  16-bit Writes       : ok

So this memory should be fine for the CPU core. Maybe your system just
freezes because it is out of memory.
Hi

On 13.10.2016 at 09:12, Huang, Tao wrote:
> Hi, Paweł:
> On 2016-10-10 17:11, Paweł Jarosz wrote:
>> On 10.10.2016 at 09:18, Huang, Tao wrote:
>>> Our IC guy need us tell them which master can not access such area, DMA
>>> or EMMC Controller or GPU, etc? Could you tell me how to reproduce such
>>> issue?
>>> And we can confirm CPU core can access this memory through /dev/mem and
>>> the test board is 1GB too. Personally, I don't think RK3066 has such
>>> limit because when we verify this chip, we don't found such limit at all.
>>>
>>> Thanks,
>>> Huang, Tao
>> I'm getting this on Ubuntu 16.04 with mainline kernel.
>> My board always freezes when i type: "memtester 800M"
>>
> We try run memtest 800M with Linux kernel 4.8, which killed by OOM.
> But if we run:
> # memtester -p 0x9F000000 16K 1
> memtester version 4.3.0 (32-bit)
> Copyright (C) 2001-2012 Charles Cazabon.
> Licensed under the GNU General Public License version 2 (only).
>
> pagesize is 4096
> pagesizemask is 0xfffff000
> want 0MB (16384 bytes)
> Loop 1/1:
>   Stuck Address       : ok
>   Random Value        : ok
>   Compare XOR         : ok
>   Compare SUB         : ok
>   Compare MUL         : ok
>   Compare DIV         : ok
>   Compare OR          : ok
>   Compare AND         : ok
>   Sequential Increment: ok
>   Solid Bits          : ok
>   Block Sequential    : ok
>   Checkerboard        : ok
>   Bit Spread          : ok
>   Bit Flip            : ok
>   Walking Ones        : ok
>   Walking Zeroes      : ok
>   8-bit Writes        : ok
>   16-bit Writes       : ok
>
> So these memory should be fine to CPU core. Maybe your system just
> freeze because out of memory.
>

weird ...

memtester -p 0x9F000000 16K 1

gave me the same result.

Could you try one last command:

memtester 400M

Values > 200M cause the freeze. If this doesn't do it, then maybe there is
something wrong with my board.

Thanks for your time.
diff --git a/arch/arm/boot/dts/rk3066a.dtsi b/arch/arm/boot/dts/rk3066a.dtsi
index 0d0dae3..44c8956 100644
--- a/arch/arm/boot/dts/rk3066a.dtsi
+++ b/arch/arm/boot/dts/rk3066a.dtsi
@@ -93,6 +93,19 @@
 		};
 	};
 
+	reserved-memory {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges;
+		/*
+		 * The rk3066 cannot use the memory area above 0x9F000000
+		 * for some unknown reason.
+		 */
+		unusable@9F000000 {
+			reg = <0x9F000000 0x1000000>;
+		};
+	};
+
 	i2s0: i2s@10118000 {
 		compatible = "rockchip,rk3066-i2s";
 		reg = <0x10118000 0x2000>;
For some reason accessing memory region above 0xfe000000 freezes
system on rk3066. There is similar bug on later rockchip soc (rk3288)
solved same way.

Signed-off-by: Paweł Jarosz <paweljarosz3691@gmail.com>
---
 arch/arm/boot/dts/rk3066a.dtsi | 13 +++++++++++++
 1 file changed, 13 insertions(+)