n900 in next-20170901

Message ID 20170905233241.GA19231@js1304-P5Q-DELUXE (mailing list archive)
State New, archived

Commit Message

Joonsoo Kim Sept. 5, 2017, 11:32 p.m. UTC
On Tue, Sep 05, 2017 at 01:13:15PM -0700, Tony Lindgren wrote:
> * Pavel Machek <pavel@ucw.cz> [170903 13:38]:
> > Hi!
> >
> > It compiles ok, but it hangs on boot; black screen, so sometime before
> > display is initialized.
> 
> Thanks for reporting it. Based on git bisect, the commit causing the
> regression is 9caf25f996e8 ("mm/cma: manage the memory of the CMA area
> by using the ZONE_MOVABLE"). With this patch applied, booting hangs
> with an error in omap3_save_secure_ram_context() after a call to
> _omap_save_secure_sram() that calls the related assembly code
> save_secure_ram_context.
> 
> However, it looks like some other commit is also causing an issue.
> 
> Just reverting 9caf25f996e8 on Linux next causes the oops below.
> 
> Anybody got ideas why this now happens?
> 
> Regards,
> 
> Tony
> 
> 8< --------------------
> Unable to handle kernel paging request at virtual address ce800000
> pgd = c0004000 [ce800000] *pgd=00000000
> Internal error: Oops: 805 [#1] SMP ARM
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-next-20170905+ #662
> Hardware name: Nokia RX-51 board
> task: ce0a0040 task.stack: ce0a4000
> PC is at __memzero+0x24/0x7c
> LR is at 0x0
> pc : [<c084fa84>]    lr : [<00000000>]    psr: 20000013
> sp : ce0a5e84  ip : 00000000  fp : c0c005a8
> r10: 00040000  r9 : cfc95000  r8 : 00000247
> r7 : ce0a5ef4  r6 : 00000000  r5 : 00000001  r4 : ce800000
> r3 : 00000000  r2 : 00000000  r1 : 0003ffc0  r0 : ce800000
> Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: 80004019  DAC: 00000051
> Process swapper/0 (pid: 1, stack limit = 0xce0a4218)
> Stack: (0xce0a5e84 to 0xce0a6000)
> 5e80:          c0116718 00000040 00000006 c01a1470 00040000 00000001 00000040
> 5ea0: ce0a5ef4 00000247 cfc95000 00000000 c0c005a8 c0116844 c0dce000 c01a1484
> 5ec0: 00000247 c0d0e2e0 c0dce29c c0b5cbbc c0dce000 00000003 c0c5389c c0c06c54
> 5ee0: c0c06bd8 00000001 00000000 014000c0 00000000 c0b5cbbc ffffe000 c0c06bd8
> 5f00: 00000000 c0101ef8 000000aa 00000000 cfdfcbdb cfdfcbe7 c0b5d904 000000aa
> 5f20: 000000aa c015cf54 c0b5cbbc 00000000 00000002 00000002 cfdfcbe7 cfdfcbec
> 5f40: c0c6c59c 00000002 c0dce000 c0c53880 c0c6c66c c0dce000 c0c53884 c0dce000
> 5f60: 00000003 c0c00ecc 00000002 00000002 00000000 c0c005a8 c0864f3c 000000aa
> 5f80: 00000000 00000000 c0864f3c 00000000 00000000 00000000 00000000 00000000
> 5fa0: 00000000 c0864f44 00000000 c0107e98 00000000 00000000 00000000 00000000
> 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff ffffffff
> [<c084fa84>] (__memzero) from [<c0116718>] (__dma_clear_buffer+0x140/0x154)
> [<c0116718>] (__dma_clear_buffer) from [<c0116844>] (__alloc_from_contiguous+0x50/0xdc)
> [<c0116844>] (__alloc_from_contiguous) from [<c0c06c54>] (atomic_pool_init+0x7c/0x178)
> [<c0c06c54>] (atomic_pool_init) from [<c0101ef8>] (do_one_initcall+0x3c/0x170)
> [<c0101ef8>] (do_one_initcall) from [<c0c00ecc>] (kernel_init_freeable+0x1fc/0x2c4)
> [<c0c00ecc>] (kernel_init_freeable) from [<c0864f44>] (kernel_init+0x8/0x110)
> [<c0864f44>] (kernel_init) from [<c0107e98>] (ret_from_fork+0x14/0x3c)
> Code: e52de004 e1a0c002 e1a0e002 e2511040 (a8a0500c)

Hello,

I think I made a mistake in handling configurations with
CONFIG_HIGHMEM=y and CONFIG_HAVE_MEMBLOCK_NODE_MAP=y. In this case,
ZONE_MOVABLE can be *!highmem*. Could you check whether your
configuration has the above options set?

Also, could you check whether the following patch works for you?

Thanks.

------------>8-----------------

Comments

Tony Lindgren Sept. 6, 2017, 1:30 p.m. UTC | #1
Hi,

* Joonsoo Kim <iamjoonsoo.kim@lge.com> [170905 16:32]:
> I think that I made a mistake for configuration CONFIG_HIGHMEM=y and
> CONFIG_HAVE_MEMBLOCK_NODE_MAP=y. In this case, the MOVABLE_ZONE can
> be *!highmem*. Could you check that your configuration have above
> options?

CONFIG_HIGHMEM is set yeah.

> And, could you check that following patch works for you?

Does not seem to help. I tried it against next with just 9caf25f996e8
reverted and also with 9caf25f996e8 applied.

Regards,

Tony


> ------------>8-----------------
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 38f0fde..4c39c92 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -518,7 +518,7 @@ void __init dma_contiguous_remap(void)
>                  * considered as highmem even if it's physical address belong
>                  * to lowmem. Therefore, re-mapping isn't required.
>                  */
> -               if (!IS_ENABLED(CONFIG_HIGHMEM))
> +               if (!is_highmem_idx(ZONE_MOVABLE))
>                         iotable_init(&map, 1);
>         }
>  }
>
Joonsoo Kim Sept. 7, 2017, 7:30 a.m. UTC | #2
On Wed, Sep 06, 2017 at 06:30:57AM -0700, Tony Lindgren wrote:
> Hi,
> 
> * Joonsoo Kim <iamjoonsoo.kim@lge.com> [170905 16:32]:
> > I think that I made a mistake for configuration CONFIG_HIGHMEM=y and
> > CONFIG_HAVE_MEMBLOCK_NODE_MAP=y. In this case, the MOVABLE_ZONE can
> > be *!highmem*. Could you check that your configuration have above
> > options?
> 
> CONFIG_HIGHMEM is set yeah.
> 
> > And, could you check that following patch works for you?
> 
> Does not seem to help, tried against next with just 9caf25f996e8
> revert and also with 9caf25f996e8.

Oops. I misunderstood your problem. Could you test with
CONFIG_DEBUG_VIRTUAL?

After commit 9caf25f996e8, users of CMA memory have to check
PageHighmem in order to get the proper virtual address of a page. If
some user doesn't check it, it can end up using a wrong virtual
address, which in turn causes a wrong physical address to be used.
CONFIG_DEBUG_VIRTUAL would catch this case.

If it doesn't help, is there a way to test the n900 configuration in QEMU?

Thanks.
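
The rule described in the message above (check PageHighmem before
deriving a virtual address) can be illustrated with a small userspace
model. This is only a sketch under assumptions: struct model_page,
model_linear_virt(), model_page_virt() and the two offset constants are
hypothetical stand-ins, not kernel definitions, and the offsets are
merely chosen to resemble a 32-bit ARM layout with RAM starting at
0x80000000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model values, not taken from any kernel header. */
#define MODEL_PHYS_OFFSET 0x80000000u	/* assumed start of RAM */
#define MODEL_PAGE_OFFSET 0xc0000000u	/* assumed start of the linear map */

struct model_page {
	uint32_t phys;	/* physical address of the page */
	bool highmem;	/* stands in for PageHighmem(page) */
};

/* Linear-map translation; only meaningful for pages that are not highmem. */
static uint32_t model_linear_virt(const struct model_page *page)
{
	return page->phys - MODEL_PHYS_OFFSET + MODEL_PAGE_OFFSET;
}

/*
 * Return the linear-map address for a lowmem page, or 0 to signal that
 * the page is treated as highmem and needs a temporary mapping first.
 */
static uint32_t model_page_virt(const struct model_page *page)
{
	if (page->highmem)
		return 0;
	return model_linear_virt(page);
}
```

A caller that skips the check and uses model_linear_virt() directly
still gets an address for a page flagged as highmem, which is the kind
of mistake the message describes. In the kernel, the highmem case would
typically go through a temporary mapping such as kmap() instead.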
Tony Lindgren Sept. 7, 2017, 4:16 p.m. UTC | #3
* Joonsoo Kim <iamjoonsoo.kim@lge.com> [170907 00:30]:
> On Wed, Sep 06, 2017 at 06:30:57AM -0700, Tony Lindgren wrote:
> > Hi,
> > 
> > * Joonsoo Kim <iamjoonsoo.kim@lge.com> [170905 16:32]:
> > > I think that I made a mistake for configuration CONFIG_HIGHMEM=y and
> > > CONFIG_HAVE_MEMBLOCK_NODE_MAP=y. In this case, the MOVABLE_ZONE can
> > > be *!highmem*. Could you check that your configuration have above
> > > options?
> > 
> > CONFIG_HIGHMEM is set yeah.
> > 
> > > And, could you check that following patch works for you?
> > 
> > Does not seem to help, tried against next with just 9caf25f996e8
> > revert and also with 9caf25f996e8.
> 
> Oops. I misunderstood your problem. Could you test with
> CONFIG_DEBUG_VIRTUAL?

Sure.

> After commit 9caf25f996e8, user for CMA memory should use to check
> PageHighmem in order to get proper virtual address of the page. If
> someone doesn't use it, it is possible to use wrong virtual address
> and it then causes the use of wrong physical address.
> CONFIG_DEBUG_VIRTUAL would catch this case.

OK, no extra output of current next with CONFIG_DEBUG_VIRTUAL=y.
Booting of n900 hangs with just the same error:

save_secure_sram() returns 0000ff02

> If it doesn't help, is there a way to test n900 configuration in QEMU?

I doubt that QEMU n900 boots in secure mode; it probably shows
the SoC as a general purpose SoC instead. If so, you'd have to patch the
omap3_save_secure_ram_context() to attempt to save secure RAM
context in all cases. If that works then debugging with any
omap3 board like beagleboard in QEMU should work.

Regards,

Tony
Joonsoo Kim Sept. 13, 2017, 7:55 a.m. UTC | #4
On Thu, Sep 07, 2017 at 09:16:51AM -0700, Tony Lindgren wrote:
> * Joonsoo Kim <iamjoonsoo.kim@lge.com> [170907 00:30]:
> > On Wed, Sep 06, 2017 at 06:30:57AM -0700, Tony Lindgren wrote:
> > > Hi,
> > > 
> > > * Joonsoo Kim <iamjoonsoo.kim@lge.com> [170905 16:32]:
> > > > I think that I made a mistake for configuration CONFIG_HIGHMEM=y and
> > > > CONFIG_HAVE_MEMBLOCK_NODE_MAP=y. In this case, the MOVABLE_ZONE can
> > > > be *!highmem*. Could you check that your configuration have above
> > > > options?
> > > 
> > > CONFIG_HIGHMEM is set yeah.
> > > 
> > > > And, could you check that following patch works for you?
> > > 
> > > Does not seem to help, tried against next with just 9caf25f996e8
> > > revert and also with 9caf25f996e8.
> > 
> > Oops. I misunderstood your problem. Could you test with
> > CONFIG_DEBUG_VIRTUAL?
> 
> Sure.
> 
> > After commit 9caf25f996e8, user for CMA memory should use to check
> > PageHighmem in order to get proper virtual address of the page. If
> > someone doesn't use it, it is possible to use wrong virtual address
> > and it then causes the use of wrong physical address.
> > CONFIG_DEBUG_VIRTUAL would catch this case.
> 
> OK, no extra output of current next with CONFIG_DEBUG_VIRTUAL=y.
> Booting of n900 hangs with just the same error:
> 
> save_secure_sram() returns 0000ff02
> 
> > If it doesn't help, is there a way to test n900 configuration in QEMU?
> 
> I doubt that QEMU n900 boots in secure mode but instead shows
> the SoC as general purpose SoC. If so, you'd have to patch the
> omap3_save_secure_ram_context() to attempt to save secure RAM
> context in all cases. If that works then debugging with any
> omap3 board like beagleboard in QEMU should work.

Sorry for the late response.

I tried to emulate a beagle board with QEMU, and I have now found a
way that works. However, it doesn't call omap3_save_secure_ram_context()
because omap_type() is different there. And even if I call it forcibly,
the system dies with a prefetch abort regardless of commit 9caf25f996e8.

Could you let me know a better way to test your situation?

Anyway, could you test linux-next with CONFIG_HIGHMEM=n?
I'd like to know whether the issue is related to the change that
makes all CMA memory be managed like highmem.

Thanks.
Joonsoo Kim Sept. 18, 2017, 2:01 a.m. UTC | #5
On Fri, Sep 15, 2017 at 03:18:18PM +0200, Pavel Machek wrote:
> Hi!
> 
> > > After commit 9caf25f996e8, user for CMA memory should use to check
> > > PageHighmem in order to get proper virtual address of the page. If
> > > someone doesn't use it, it is possible to use wrong virtual address
> > > and it then causes the use of wrong physical address.
> > > CONFIG_DEBUG_VIRTUAL would catch this case.
> > 
> > OK, no extra output of current next with CONFIG_DEBUG_VIRTUAL=y.
> > Booting of n900 hangs with just the same error:
> > 
> > save_secure_sram() returns 0000ff02
> > 
> > > If it doesn't help, is there a way to test n900 configuration in QEMU?
> > 
> > I doubt that QEMU n900 boots in secure mode but instead shows
> > the SoC as general purpose SoC. If so, you'd have to patch the
> > omap3_save_secure_ram_context() to attempt to save secure RAM
> > context in all cases. If that works then debugging with any
> > omap3 board like beagleboard in QEMU should work.
> 
> Okay, linux-next from today still does not boot on n900. Is it
> something new, or was this still not fixed in -next?

Hello,

Still not fixed in -next, since I cannot reproduce the error.

Thanks.
Joonsoo Kim Sept. 18, 2017, 2:07 a.m. UTC | #6
Hello,

On Fri, Sep 15, 2017 at 03:28:44PM +0200, Pali Rohár wrote:
> On Thursday 07 September 2017 16:30:38 Joonsoo Kim wrote:
> > If it doesn't help, is there a way to test n900 configuration in QEMU?
> 
> Hi Joonsoo, linaro version of QEMU has support for n900 machine. For
> more information how to prepare & run kernel image see this email:
> https://lists.denx.de/pipermail/u-boot/2015-January/200171.html
> (instead u-boot.bin you would supply kernel's zImage)

I searched for the required tools to download but cannot find qflasher.
It looks like you have qflasher and the other *.bin files for the n900
image. Could you share them with me, please?

> But QEMU does not support HS mode, so there is probably no secure ram.
> IIRC smc instructions should not be used in normal GP mode.

Okay. Thanks for the information. I'm not sure that I can reproduce the
error with n900 QEMU emulation, but that is my best option, so I will
try.

Thanks.
Pavel Machek Sept. 18, 2017, 8:11 a.m. UTC | #7
Hi!

> > > > After commit 9caf25f996e8, user for CMA memory should use to check
> > > > PageHighmem in order to get proper virtual address of the page. If
> > > > someone doesn't use it, it is possible to use wrong virtual address
> > > > and it then causes the use of wrong physical address.
> > > > CONFIG_DEBUG_VIRTUAL would catch this case.
> > > 
> > > OK, no extra output of current next with CONFIG_DEBUG_VIRTUAL=y.
> > > Booting of n900 hangs with just the same error:
> > > 
> > > save_secure_sram() returns 0000ff02
> > > 
> > > > If it doesn't help, is there a way to test n900 configuration in QEMU?
> > > 
> > > I doubt that QEMU n900 boots in secure mode but instead shows
> > > the SoC as general purpose SoC. If so, you'd have to patch the
> > > omap3_save_secure_ram_context() to attempt to save secure RAM
> > > context in all cases. If that works then debugging with any
> > > omap3 board like beagleboard in QEMU should work.
> > 
> > Okay, linux-next from today still does not boot on n900. Is it
> > something new, or was this still not fixed in -next?
> 
> Hello,
> 
> Still not fixed in -next since I cannot regenerate the error.

Unfortunately, the rest of the world can reproduce the error, and it
means linux-next is useless for us.

I'd expect you to drop the relevant tree from linux-next when the
error was reported. Clearly, those patches are unsuitable for 4.15, as
they are broken, so they should not be in linux-next.

Thanks,
									Pavel
Stephen Rothwell Sept. 18, 2017, 10 p.m. UTC | #8
Hi Pavel,

On Mon, 18 Sep 2017 10:11:09 +0200 Pavel Machek <pavel@ucw.cz> wrote:
>
> Unfortunately, rest of the world can reproduce the error, and it means
> linux-next is useless for us.
> 
> I'd expect you to drop the relevant tree from linux-next when the
> error was reported. Clearly, those patches are unsuitable for 4.15, as
> they are broken, so they should not be in linux-next.

Andrew has asked me to drop the patches from linux-next today and I
have done so.
Pavel Machek Sept. 18, 2017, 10:16 p.m. UTC | #9
On Tue 2017-09-19 08:00:10, Stephen Rothwell wrote:
> Hi Pavel,
> 
> On Mon, 18 Sep 2017 10:11:09 +0200 Pavel Machek <pavel@ucw.cz> wrote:
> >
> > Unfortunately, rest of the world can reproduce the error, and it means
> > linux-next is useless for us.
> > 
> > I'd expect you to drop the relevant tree from linux-next when the
> > error was reported. Clearly, those patches are unsuitable for 4.15, as
> > they are broken, so they should not be in linux-next.
> 
> Andrew has asked me to drop the patches from linux-next today and I
> have done so.

Thanks!
									Pavel

Patch

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 38f0fde..4c39c92 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -518,7 +518,7 @@  void __init dma_contiguous_remap(void)
                 * considered as highmem even if it's physical address belong
                 * to lowmem. Therefore, re-mapping isn't required.
                 */
-               if (!IS_ENABLED(CONFIG_HIGHMEM))
+               if (!is_highmem_idx(ZONE_MOVABLE))
                        iotable_init(&map, 1);
        }
 }
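
To make the difference between the two conditions in the patch
concrete, here is a small userspace model. It is a sketch under
assumptions: every model_* name is hypothetical, and
model_is_highmem_idx() only approximates how the kernel helper is
understood to behave (ZONE_MOVABLE counts as highmem only when highmem
is enabled and the movable zone is backed by the highmem zone).

```c
#include <stdbool.h>

/* Hypothetical zone identifiers for the model, not the kernel's enum. */
enum model_zone {
	MODEL_ZONE_NORMAL,
	MODEL_ZONE_HIGHMEM,
	MODEL_ZONE_MOVABLE,
};

struct model_config {
	bool highmem_enabled;		/* stands in for CONFIG_HIGHMEM */
	enum model_zone movable_zone;	/* zone backing ZONE_MOVABLE */
};

/* Approximation of the zone check used by the new condition. */
static bool model_is_highmem_idx(const struct model_config *cfg,
				 enum model_zone idx)
{
	if (!cfg->highmem_enabled)
		return false;
	return idx == MODEL_ZONE_HIGHMEM ||
	       (idx == MODEL_ZONE_MOVABLE &&
		cfg->movable_zone == MODEL_ZONE_HIGHMEM);
}

/* Old condition: remap only when highmem support is compiled out. */
static bool model_old_needs_remap(const struct model_config *cfg)
{
	return !cfg->highmem_enabled;
}

/* New condition: remap whenever the movable zone is not highmem. */
static bool model_new_needs_remap(const struct model_config *cfg)
{
	return !model_is_highmem_idx(cfg, MODEL_ZONE_MOVABLE);
}
```

The two conditions only disagree when highmem is enabled but the
movable zone is backed by the normal zone: the old check skips the
remap there, while the new one performs it. That is the configuration
the original message is concerned about.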