[v5sub1,7/8] arm64: move kernel image to base of vmalloc area

Message ID 56C31D1D.50708@virtuozzo.com (mailing list archive)
State New, archived
Commit Message

Andrey Ryabinin Feb. 16, 2016, 12:59 p.m. UTC
On 02/15/2016 09:59 PM, Catalin Marinas wrote:
> On Mon, Feb 15, 2016 at 05:28:02PM +0300, Andrey Ryabinin wrote:
>> On 02/12/2016 07:06 PM, Catalin Marinas wrote:
>>> So far, we have:
>>>
>>> KASAN+for-next/kernmap goes wrong
>>> KASAN+UBSAN goes wrong
>>>
>>> Enabled individually, KASAN, UBSAN and for-next/kernmap seem fine. I may
>>> have to trim for-next/core down until we figure out where the problem
>>> is.
>>>
>>> BUG: KASAN: stack-out-of-bounds in find_busiest_group+0x164/0x16a0 at addr ffffffc93665bc8c
>>
>> Can it be related to TLB conflicts, which supposed to be fixed in
>> "arm64: kasan: avoid TLB conflicts" patch from "arm64: mm: rework page
>> table creation" series ?
> 
> I can very easily reproduce this with a vanilla 4.5-rc1 series by
> enabling inline instrumentation (maybe Mark's theory is true w.r.t.
> image size).
> 
> Some information, maybe you can shed some light on this. It seems to
> happen only for secondary CPUs on the swapper stack (I think allocated
> via fork_idle()). The code generated looks sane to me, so KASAN should
> not complain but maybe there is some uninitialised shadow, hence the
> error.
> 
> The report:
>

Actually, the first report is a bit more useful. It shows that shadow memory was corrupted:

  ffffffc93665bc00: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f1 f1
> ffffffc93665bc80: f1 f1 00 00 00 00 f3 f3 00 f4 f4 f4 f3 f3 f3 f3
                      ^
F1 - left redzone, it indicates start of stack frame
F3 - right redzone, it should be the end of stack frame.

But here we have the second set of F1s without F3s which should close the first set of F1s.
Also those two F3s in the middle cannot be right.
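
For reading the dump above, a minimal sketch of how the caret position follows from the faulting address. It assumes the row label is a memory address and that each shadow byte covers 8 bytes, which matches the 0x80 step between the printed rows. The helper names are made up for illustration and are not kernel functions; only the F1..F4 stack markers come from the report format.

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_SCALE_SHIFT 3	/* assumption: one shadow byte covers 8 bytes */

/* Stack redzone markers as they appear in the report. */
enum {
	STACK_LEFT    = 0xF1,
	STACK_MID     = 0xF2,
	STACK_RIGHT   = 0xF3,
	STACK_PARTIAL = 0xF4,
};

/* Index, within one printed row, of the shadow byte that covers addr. */
static size_t shadow_index_in_row(uint64_t row_base, uint64_t addr)
{
	return (size_t)((addr - row_base) >> SHADOW_SCALE_SHIFT);
}

/* Human-readable name for a shadow byte (hypothetical helper). */
static const char *shadow_name(uint8_t b)
{
	switch (b) {
	case 0x00:          return "accessible";
	case STACK_LEFT:    return "stack left redzone";
	case STACK_MID:     return "stack mid redzone";
	case STACK_RIGHT:   return "stack right redzone";
	case STACK_PARTIAL: return "stack partial redzone";
	default:            return b < 8 ? "partially accessible" : "other poison";
	}
}
```

With row base ffffffc93665bc80 and faulting address ffffffc93665bc8c the index is 1, i.e. the second byte of the marked row, which is F1 in this report.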

So shadow is corrupted.
Some hypotheses:

1) We share a stack between several tasks (e.g. stack overflow, or a somehow corrupted SP).
    But this should probably cause a kernel crash later, after the kasan reports.

2) Shadow memory wasn't cleared. GCC poisons memory on function entry and unpoisons it before return.
     If we use some tricky way to exit from a function, this could cause false positives like that,
     e.g. some hand-written assembly return code.

3) Screwed shadow mapping. I think the patch below should uncover such a problem.
It is boot-tested on qemu and didn't show any problems.


---
 arch/arm64/mm/kasan_init.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

Comments

Mark Rutland Feb. 16, 2016, 2:12 p.m. UTC | #1
On Tue, Feb 16, 2016 at 03:59:09PM +0300, Andrey Ryabinin wrote:
> 
> On 02/15/2016 09:59 PM, Catalin Marinas wrote:
> > On Mon, Feb 15, 2016 at 05:28:02PM +0300, Andrey Ryabinin wrote:
> >> On 02/12/2016 07:06 PM, Catalin Marinas wrote:
> >>> So far, we have:
> >>>
> >>> KASAN+for-next/kernmap goes wrong
> >>> KASAN+UBSAN goes wrong
> >>>
> >>> Enabled individually, KASAN, UBSAN and for-next/kernmap seem fine. I may
> >>> have to trim for-next/core down until we figure out where the problem
> >>> is.
> >>>
> >>> BUG: KASAN: stack-out-of-bounds in find_busiest_group+0x164/0x16a0 at addr ffffffc93665bc8c
> >>
> >> Can it be related to TLB conflicts, which supposed to be fixed in
> >> "arm64: kasan: avoid TLB conflicts" patch from "arm64: mm: rework page
> >> table creation" series ?
> > 
> > I can very easily reproduce this with a vanilla 4.5-rc1 series by
> > enabling inline instrumentation (maybe Mark's theory is true w.r.t.
> > image size).
> > 
> > Some information, maybe you can shed some light on this. It seems to
> > happen only for secondary CPUs on the swapper stack (I think allocated
> > via fork_idle()). The code generated looks sane to me, so KASAN should
> > not complain but maybe there is some uninitialised shadow, hence the
> > error.
> > 
> > The report:
> >
> 
> Actually, the first report is a bit more useful. It shows that shadow memory was corrupted:
> 
>   ffffffc93665bc00: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f1 f1
> > ffffffc93665bc80: f1 f1 00 00 00 00 f3 f3 00 f4 f4 f4 f3 f3 f3 f3
>                       ^
> F1 - left redzone, it indicates start of stack frame
> F3 - right redzone, it should be the end of stack frame.
> 
> But here we have the second set of F1s without F3s which should close the first set of F1s.
> Also those two F3s in the middle cannot be right.
> 
> So shadow is corrupted.
> Some hypotheses:
> 
> 1) We share stack between several tasks (e.g. stack overflow, somehow corrupted SP).
>     But this probably should cause kernel crash later, after kasan reports.
> 
> 2) Shadow memory wasn't cleared. GCC poison memory on function entrance and unpoisons it before return.
>      If we use some tricky way to exit from function this could cause false-positives like that.
>      E.g. some hand-written assembly return code.
> 
> 3) Screwed shadow mapping. I think the patch below should uncover such problem.
> It boot-tested on qemu and didn't show any problem

With that patch applied I get:

[    0.000000] kasan: screwed shadow mapping 62184, 62182
[    0.000000] kasan: KernelAddressSanitizer initialized

I'm using v4.5-rc1 with KASAN_INLINE, and a random collection of debug options
to bloat the kernel, per the prior theory that the text size had something to do
with the issue.

Later in the boot process I see lots of failures like:

[   13.292190] ==================================================================
[   13.299543] BUG: KASAN: stack-out-of-bounds in find_busiest_group+0x1950/0x19b8 at addr ffffffc936ad3c8c
[   13.309090] Read of size 4 by task swapper/3/0
[   13.313575] page:ffffffbde6dab4c0 count:0 mapcount:0 mapping:          (null) index:0x0
[   13.321657] flags: 0x4000000000000000()
[   13.325539] page dumped because: kasan: bad access detected
[   13.331150] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.5.0-rc1+ #19
[   13.337528] Hardware name: ARM Juno development board (r1) (DT)
[   13.343471] Call trace:
[   13.345978] [<ffffffc000091400>] dump_backtrace+0x0/0x3c0
[   13.351416] [<ffffffc0000917e4>] show_stack+0x24/0x30
[   13.356507] [<ffffffc0008c3a64>] dump_stack+0xc4/0x150
[   13.361685] [<ffffffc0004032bc>] kasan_report_error+0x52c/0x558
[   13.367640] [<ffffffc0004033fc>] __asan_report_load4_noabort+0x54/0x60
[   13.374200] [<ffffffc0001a46e8>] find_busiest_group+0x1950/0x19b8
[   13.380327] [<ffffffc0001a49ec>] load_balance+0x29c/0x19e0
[   13.385851] [<ffffffc0001a67c0>] pick_next_task_fair+0x690/0xd88
[   13.391896] [<ffffffc001213cf4>] __schedule+0x85c/0x13c8
[   13.397248] [<ffffffc001214d7c>] schedule+0xe4/0x228
[   13.402256] [<ffffffc00121549c>] schedule_preempt_disabled+0x24/0xb8
[   13.408642] [<ffffffc0001b97f8>] cpu_startup_entry+0x188/0x738
[   13.414511] [<ffffffc00009bcfc>] secondary_start_kernel+0x244/0x2b8
[   13.420806] [<0000000080082efc>] 0x80082efc
[   13.425023] Memory state around the buggy address:
[   13.429854]  ffffffc936ad3b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   13.437153]  ffffffc936ad3c00: 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 00 00 f3 f3
[   13.444451] >ffffffc936ad3c80: f3 f3 00 00 00 00 00 00 00 f4 f4 f4 f3 f3 f3 f3
[   13.451742]                       ^
[   13.455274]  ffffffc936ad3d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   13.462572]  ffffffc936ad3d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
[   13.469863] ==================================================================

I guess the memory layout has something to do with this. FWIW on this board my
memory map comes from EFI:

[    0.000000] Processing EFI memory map:
[    0.000000]   0x000008000000-0x00000bffffff [Memory Mapped I/O  |RUN|  |XP|  |  |  |   |  |  |  |UC]
[    0.000000]   0x00001c170000-0x00001c170fff [Memory Mapped I/O  |RUN|  |XP|  |  |  |   |  |  |  |UC]
[    0.000000]   0x000080000000-0x00008000ffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x000080010000-0x00008007ffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x000080080000-0x000081dbffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x000081dc0000-0x00009fdfffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x00009fe00000-0x00009fe0ffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x00009fe10000-0x0000dfffffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000e00f0000-0x0000f5a58fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000f5a59000-0x0000f7793fff [Loader Code        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000f7794000-0x0000f9431fff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000f9432000-0x0000f944ffff [Loader Code        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000f9450000-0x0000f945ffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9460000-0x0000f94dffff [ACPI Reclaim Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f94e0000-0x0000f94effff [ACPI Memory NVS    |   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f94f0000-0x0000f94fffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9500000-0x0000f950ffff [Runtime Code       |RUN|  |  |  |  |RO|   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9510000-0x0000f953ffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9540000-0x0000f954ffff [Runtime Code       |RUN|  |  |  |  |RO|   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9550000-0x0000f956ffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9570000-0x0000f958ffff [ACPI Reclaim Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9590000-0x0000f960ffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9610000-0x0000f961ffff [Runtime Code       |RUN|  |  |  |  |RO|   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9620000-0x0000f96effff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f96f0000-0x0000f96fffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9700000-0x0000f970ffff [Runtime Code       |RUN|  |  |  |  |RO|   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9710000-0x0000f974ffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9750000-0x0000f975ffff [Runtime Code       |RUN|  |  |  |  |RO|   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9760000-0x0000f97cffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f97d0000-0x0000f97dffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f97e0000-0x0000f97effff [Runtime Code       |RUN|  |  |  |  |RO|   |WB|WT|WC|UC]*
[    0.000000]   0x0000f97f0000-0x0000f981ffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f9820000-0x0000f9820fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000f9821000-0x0000f9827fff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000f9828000-0x0000f982bfff [Reserved           |   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000f982c000-0x0000fdaedfff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fdaee000-0x0000fdfbefff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fdfbf000-0x0000fdfbffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fdfc0000-0x0000fdffbfff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fdffc000-0x0000fe018fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe019000-0x0000fe020fff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe021000-0x0000fe022fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe023000-0x0000fe02bfff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe02c000-0x0000fe03afff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe03b000-0x0000fe03dfff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe03e000-0x0000fe04efff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe04f000-0x0000fe057fff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe058000-0x0000fe073fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe074000-0x0000fe074fff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe075000-0x0000fe078fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe079000-0x0000fe07bfff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe07c000-0x0000fe07dfff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe07e000-0x0000fe085fff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe086000-0x0000fe087fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe088000-0x0000fe171fff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe172000-0x0000fe198fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe199000-0x0000fe65ffff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe660000-0x0000fe6a2fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe6a3000-0x0000fe7effff [Boot Code          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe7f0000-0x0000fe7fffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000fe800000-0x0000fe80ffff [Runtime Code       |RUN|  |  |  |  |RO|   |WB|WT|WC|UC]*
[    0.000000]   0x0000fe810000-0x0000fe82ffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000fe830000-0x0000fe83ffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe840000-0x0000fe88ffff [Runtime Data       |RUN|  |XP|  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000fe890000-0x0000fe891fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fe892000-0x0000feffffff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x000880000000-0x00099bffffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x00099c000000-0x0009ffffffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]

Thanks,
Mark.
Mark Rutland Feb. 16, 2016, 2:29 p.m. UTC | #2
On Tue, Feb 16, 2016 at 02:12:59PM +0000, Mark Rutland wrote:
> On Tue, Feb 16, 2016 at 03:59:09PM +0300, Andrey Ryabinin wrote:
> > So shadow is corrupted.
> > Some hypotheses:
> > 
> > 1) We share stack between several tasks (e.g. stack overflow, somehow corrupted SP).
> >     But this probably should cause kernel crash later, after kasan reports.
> > 
> > 2) Shadow memory wasn't cleared. GCC poison memory on function entrance and unpoisons it before return.
> >      If we use some tricky way to exit from function this could cause false-positives like that.
> >      E.g. some hand-written assembly return code.
> > 
> > 3) Screwed shadow mapping. I think the patch below should uncover such problem.
> > It boot-tested on qemu and didn't show any problem
> 
> With that patch applied I get:
> 
> [    0.000000] kasan: screwed shadow mapping 62184, 62182
> [    0.000000] kasan: KernelAddressSanitizer initialized
> 
> I'm using v4.5-rc1 with KASAN_INLINE, and a random collection of debug options
> to bloat the kernel, per the prior theory that the text size had something to do
> with the issue.

I hacked kasan_init to dump info as it created each shadow region:

[    0.000000] kasan_init shadowing [ffffffc000000000-ffffffc060000000] @ [ffffff8800000000-ffffff880c000001] nid 0
[    0.000000] kasan_init shadowing [ffffffc0600f0000-ffffffc079450000] @ [ffffff880c01e000-ffffff880f28a001] nid 0
[    0.000000] kasan_init shadowing [ffffffc079450000-ffffffc079820000] @ [ffffff880f28a000-ffffff880f304001] nid 0
[    0.000000] kasan_init shadowing [ffffffc079820000-ffffffc079821000] @ [ffffff880f304000-ffffff880f304201] nid 0
[    0.000000] kasan_init shadowing [ffffffc079821000-ffffffc079822000] @ [ffffff880f304200-ffffff880f304401] nid 0
[    0.000000] kasan_init shadowing [ffffffc079822000-ffffffc079828000] @ [ffffff880f304400-ffffff880f305001] nid 0
[    0.000000] kasan_init shadowing [ffffffc079828000-ffffffc07982c000] @ [ffffff880f305000-ffffff880f305801] nid 0
[    0.000000] kasan_init shadowing [ffffffc07982c000-ffffffc07e7f0000] @ [ffffff880f305800-ffffff880fcfe001] nid 0
[    0.000000] kasan_init shadowing [ffffffc07e7f0000-ffffffc07e830000] @ [ffffff880fcfe000-ffffff880fd06001] nid 0
[    0.000000] kasan_init shadowing [ffffffc07e830000-ffffffc07e840000] @ [ffffff880fd06000-ffffff880fd08001] nid 0
[    0.000000] kasan_init shadowing [ffffffc07e840000-ffffffc07e890000] @ [ffffff880fd08000-ffffff880fd12001] nid 0
[    0.000000] kasan_init shadowing [ffffffc07e890000-ffffffc07f000000] @ [ffffff880fd12000-ffffff880fe00001] nid 0
[    0.000000] kasan_init shadowing [ffffffc800000000-ffffffc980000000] @ [ffffff8900000000-ffffff8930000001] nid 0
[    0.000000] kasan: screwed shadow mapping 62184, 62182
[    0.000000] kasan: KernelAddressSanitizer initialized

I note that the end of each shadow region overlaps the beginning of the next due
to the intentional end+1...

Other than the waste of memory (and the TLB conflict that gets solved by my
pgtable rework), I'm not sure that's a problem, though.
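
To make the end+1 effect concrete, here is a rough sketch of the address arithmetic behind the dump above. The shadow offset is not taken from the kernel headers; it is inferred from the logged pairs (ffffffc000000000 maps to ffffff8800000000), and the page-rounding helpers are assumptions about how the populate step treats its range.

```c
#include <stdint.h>

#define PAGE_SIZE	4096ULL
/* Inferred from the logged (memory, shadow) pairs, not from kernel headers. */
#define SHADOW_OFFSET	0xdfffff9000000000ULL

/* Shadow address of a linear-map address: one shadow byte per 8 bytes. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> 3) + SHADOW_OFFSET;
}

static uint64_t page_down(uint64_t a)
{
	return a & ~(PAGE_SIZE - 1);
}

static uint64_t page_up(uint64_t a)
{
	return (a + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/*
 * Number of pages needed to back the shadow range [start, end), assuming
 * the range is rounded outward to page boundaries.
 */
static uint64_t shadow_pages(uint64_t shadow_start, uint64_t shadow_end)
{
	return (page_up(shadow_end) - page_down(shadow_start)) / PAGE_SIZE;
}
```

Under these assumptions, passing end+1 costs an extra page only when the shadow end is already page aligned (as for the first region, which ends at ffffff880c000000). Otherwise the extra byte lands in a page the region already touches, which is also the page the next adjacent region starts in.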

Mark.

Ard Biesheuvel Feb. 16, 2016, 3:17 p.m. UTC | #3
On 16 February 2016 at 13:59, Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:
>
>
> On 02/15/2016 09:59 PM, Catalin Marinas wrote:
>> On Mon, Feb 15, 2016 at 05:28:02PM +0300, Andrey Ryabinin wrote:
>>> On 02/12/2016 07:06 PM, Catalin Marinas wrote:
>>>> So far, we have:
>>>>
>>>> KASAN+for-next/kernmap goes wrong
>>>> KASAN+UBSAN goes wrong
>>>>
>>>> Enabled individually, KASAN, UBSAN and for-next/kernmap seem fine. I may
>>>> have to trim for-next/core down until we figure out where the problem
>>>> is.
>>>>
>>>> BUG: KASAN: stack-out-of-bounds in find_busiest_group+0x164/0x16a0 at addr ffffffc93665bc8c
>>>
>>> Can it be related to TLB conflicts, which supposed to be fixed in
>>> "arm64: kasan: avoid TLB conflicts" patch from "arm64: mm: rework page
>>> table creation" series ?
>>
>> I can very easily reproduce this with a vanilla 4.5-rc1 series by
>> enabling inline instrumentation (maybe Mark's theory is true w.r.t.
>> image size).
>>
>> Some information, maybe you can shed some light on this. It seems to
>> happen only for secondary CPUs on the swapper stack (I think allocated
>> via fork_idle()). The code generated looks sane to me, so KASAN should
>> not complain but maybe there is some uninitialised shadow, hence the
>> error.
>>
>> The report:
>>
>
> Actually, the first report is a bit more useful. It shows that shadow memory was corrupted:
>
>   ffffffc93665bc00: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f1 f1
>> ffffffc93665bc80: f1 f1 00 00 00 00 f3 f3 00 f4 f4 f4 f3 f3 f3 f3
>                       ^
> F1 - left redzone, it indicates start of stack frame
> F3 - right redzone, it should be the end of stack frame.
>
> But here we have the second set of F1s without F3s which should close the first set of F1s.
> Also those two F3s in the middle cannot be right.
>
> So shadow is corrupted.
> Some hypotheses:
>
> 1) We share stack between several tasks (e.g. stack overflow, somehow corrupted SP).
>     But this probably should cause kernel crash later, after kasan reports.
>
> 2) Shadow memory wasn't cleared. GCC poison memory on function entrance and unpoisons it before return.
>      If we use some tricky way to exit from function this could cause false-positives like that.
>      E.g. some hand-written assembly return code.
>
> 3) Screwed shadow mapping. I think the patch below should uncover such problem.
> It boot-tested on qemu and didn't show any problem
>

I think this patch gives false positive warnings in some cases:

>
> ---
>  arch/arm64/mm/kasan_init.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 55 insertions(+)
>
> diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
> index cf038c7..25d685c 100644
> --- a/arch/arm64/mm/kasan_init.c
> +++ b/arch/arm64/mm/kasan_init.c
> @@ -117,6 +117,59 @@ static void __init cpu_set_ttbr1(unsigned long ttbr1)
>         : "r" (ttbr1));
>  }
>
> +static void verify_shadow(void)
> +{
> +       struct memblock_region *reg;
> +       int i = 0;
> +
> +       for_each_memblock(memory, reg) {
> +               void *start = (void *)__phys_to_virt(reg->base);
> +               void *end = (void *)__phys_to_virt(reg->base + reg->size);
> +               int *shadow_start, *shadow_end;
> +
> +               if (start >= end)
> +                       break;
> +               shadow_start = (int *)((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
> +               shadow_end =  (int *)kasan_mem_to_shadow(end);

shadow_start and shadow_end can refer to the same page as in the
previous iteration. For instance, I have these two regions

  0x00006e090000-0x00006e0adfff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
  0x00006e0ae000-0x00006e0affff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]

which are covered by different memblocks since the second one is
marked as MEMBLOCK_NOMAP, due to the fact that it contains the UEFI
memory map.

I get the following output

kasan: screwed shadow mapping 23575, 23573

which I think is simply a result of the fact that shadow_start refers
to the same page as in the previous iteration(s).
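
A small userspace model of the two passes shows the effect with exactly these two regions. It is only a sketch: the shadow offset and the linear-map offset are dropped on the assumption that both are page aligned (so they do not change which addresses share a shadow page), 4K pages are assumed, and model_verify_shadow() and the tag array are stand-ins for the real shadow pages, not kernel code.

```c
#include <stdint.h>

#define PAGE_SIZE	4096ULL
#define MAX_PAGES	16	/* enough for the small examples used here */

struct region {
	uint64_t base;
	uint64_t size;
};

/* Simplified shadow address: constant offsets omitted (assumed page aligned). */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return addr >> 3;
}

/* One int per shadow page, standing in for the first word of that page. */
static int tag[MAX_PAGES];

static int *tag_of(uint64_t shadow_page, uint64_t first_page)
{
	return &tag[(shadow_page - first_page) / PAGE_SIZE];
}

/*
 * Mirrors the write pass and the check pass of verify_shadow() for regions
 * sorted by ascending base. Returns 0 if every page holds the expected
 * counter, or -1 on the first mismatch, with *found and *expected set.
 */
static int model_verify_shadow(const struct region *r, int n,
			       int *found, int *expected)
{
	uint64_t first = mem_to_shadow(r[0].base) & ~(PAGE_SIZE - 1);
	int i = 0;

	for (int k = 0; k < n; k++) {
		uint64_t s = mem_to_shadow(r[k].base) & ~(PAGE_SIZE - 1);
		uint64_t e = mem_to_shadow(r[k].base + r[k].size);

		for (; s < e; s += PAGE_SIZE)
			*tag_of(s, first) = i++;
	}

	i = 0;
	for (int k = 0; k < n; k++) {
		uint64_t s = mem_to_shadow(r[k].base) & ~(PAGE_SIZE - 1);
		uint64_t e = mem_to_shadow(r[k].base + r[k].size);

		for (; s < e; s += PAGE_SIZE) {
			if (*tag_of(s, first) != i) {
				*found = *tag_of(s, first);
				*expected = i;
				return -1;
			}
			i++;
		}
	}
	return 0;
}
```

In this model the last shadow page of the first region is written again when the second region is processed, so the check pass reports a mismatch even though every address maps to exactly one shadow page. The found value is larger than the expected one, the same pattern as in the messages above.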


> +               for (; shadow_start < shadow_end; shadow_start += PAGE_SIZE/sizeof(int)) {
> +                       *shadow_start = i;
> +                       i++;
> +               }
> +       }
> +
> +       i = 0;
> +       for_each_memblock(memory, reg) {
> +               void *start = (void *)__phys_to_virt(reg->base);
> +               void *end = (void *)__phys_to_virt(reg->base + reg->size);
> +               int *shadow_start, *shadow_end;
> +
> +               if (start >= end)
> +                       break;
> +               shadow_start = (int *)((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
> +               shadow_end =  (int *)kasan_mem_to_shadow(end);
> +               for (; shadow_start < shadow_end; shadow_start += PAGE_SIZE/sizeof(int)) {
> +                       if (*shadow_start != i) {
> +                               pr_err("screwed shadow mapping %d, %d\n", *shadow_start, i);
> +                               goto clear;
> +                       }
> +                       i++;
> +               }
> +       }
> +clear:
> +       for_each_memblock(memory, reg) {
> +               void *start = (void *)__phys_to_virt(reg->base);
> +               void *end = (void *)__phys_to_virt(reg->base + reg->size);
> +               unsigned long shadow_start, shadow_end;
> +
> +               if (start >= end)
> +                       break;
> +               shadow_start =  ((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
> +               shadow_end =  (unsigned long)kasan_mem_to_shadow(end);
> +               memset((void *)shadow_start, 0, shadow_end - shadow_start);
> +       }
> +
> +}
> +
>  void __init kasan_init(void)
>  {
>         struct memblock_region *reg;
> @@ -159,6 +212,8 @@ void __init kasan_init(void)
>         cpu_set_ttbr1(__pa(swapper_pg_dir));
>         flush_tlb_all();
>
> +       verify_shadow();
> +
>         /* At this point kasan is fully initialized. Enable error messages */
>         init_task.kasan_depth = 0;
>         pr_info("KernelAddressSanitizer initialized\n");
> --
Mark Rutland Feb. 17, 2016, 2:39 p.m. UTC | #4
On Tue, Feb 16, 2016 at 03:59:09PM +0300, Andrey Ryabinin wrote:
> Actually, the first report is a bit more useful. It shows that shadow memory was corrupted:
> 
>   ffffffc93665bc00: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f1 f1
> > ffffffc93665bc80: f1 f1 00 00 00 00 f3 f3 00 f4 f4 f4 f3 f3 f3 f3
>                       ^
> F1 - left redzone, it indicates start of stack frame
> F3 - right redzone, it should be the end of stack frame.
> 
> But here we have the second set of F1s without F3s which should close the first set of F1s.
> Also those two F3s in the middle cannot be right.
> 
> So shadow is corrupted.
> Some hypotheses:

> 2) Shadow memory wasn't cleared. GCC poisons memory on function entry and unpoisons it before return.
>      If we exit a function in some tricky way, this could cause false positives like that.
>      E.g. some hand-written assembly return code.
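
The redzone argument quoted above can be checked mechanically. What follows is a minimal sketch, not kernel code: it walks a run of shadow bytes and counts places where a left redzone (F1) starts while an earlier frame is still open, or where a right redzone (F3) appears with no frame open. The names count_redzone_anomalies and kasan_dump are made up for illustration; the byte values and the 32 shadow bytes are copied from the report. It assumes that frames sit next to each other in shadow and never nest, which is a simplification.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_LEFT  0xF1  /* left redzone: start of a stack frame */
#define STACK_RIGHT 0xF3  /* right redzone: end of a stack frame */

/* The two shadow rows from the report, in address order. */
static const uint8_t kasan_dump[32] = {
	0xf1, 0xf1, 0xf1, 0xf1, 0x00, 0xf4, 0xf4, 0xf4,
	0xf2, 0xf2, 0xf2, 0xf2, 0x00, 0x00, 0xf1, 0xf1,
	0xf1, 0xf1, 0x00, 0x00, 0x00, 0x00, 0xf3, 0xf3,
	0x00, 0xf4, 0xf4, 0xf4, 0xf3, 0xf3, 0xf3, 0xf3,
};

/*
 * Hypothetical helper: count unbalanced redzone runs.
 * A run is a maximal sequence of identical shadow bytes.
 */
static int count_redzone_anomalies(const uint8_t *shadow, size_t len)
{
	int anomalies = 0;
	int frame_open = 0;
	uint8_t prev = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t cur = shadow[i];

		if (cur != prev) {		/* a new run starts here */
			if (cur == STACK_LEFT) {
				if (frame_open)	/* F1 without a closing F3 before it */
					anomalies++;
				frame_open = 1;
			} else if (cur == STACK_RIGHT) {
				if (!frame_open)	/* F3 with nothing to close */
					anomalies++;
				frame_open = 0;
			}
		}
		prev = cur;
	}
	return anomalies;
}
```

Under those assumptions, the 32 bytes from the report give two anomalies, while a well-formed F1 ... F3 F1 ... F3 sequence gives none.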

I think this is what's happening, at least for the idle case.

A second attempt at bisecting led me to commit e679660dbb8347f2 ("ARM:
8481/2: drivers: psci: replace psci firmware calls"). Reverting that
makes v4.5-rc1 boot without KASAN splats.

That patch turned __invoke_psci_fn_{smc,hvc} into (ASAN-instrumented) C
functions. Prior to that commit, __invoke_psci_fn_{smc,hvc} were
pure assembly functions which used no stack.

When we go down for idle, in __cpu_suspend_enter we stash some context
to the stack (in assembly). The CPU may return from a cold state via
cpu_resume, where we restore context from the stack.

However, after storing the context we call psci_suspend_finisher, which
calls psci_cpu_suspend, which calls invoke_psci_fn_*. As
psci_cpu_suspend and invoke_psci_fn_* are instrumented, they poison
memory on function entrance, but we never perform the unpoisoning.

That was always the case for psci_suspend_finisher, so there was a
latent issue that we were somehow avoiding. Perhaps we got lucky with
stack layout and never hit the poison.

I'm not sure how we fix that, as invoke_psci_fn_* may or may not return
for arbitrary reasons (e.g. a CPU_SUSPEND_CALL may or may not return
depending on whether an interrupt comes in at the right time).

Perhaps the simplest option is to not instrument invoke_psci_fn_* and
psci_suspend_finisher. Do we have a per-function annotation to avoid
KASAN instrumentation, like notrace? I need to investigate, but we may
also need notrace for similar reasons.
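
To make the failure mode described above concrete, here is a toy model in plain C. It is only a sketch under stated assumptions: the shadow is a small array, the instrumented prologue and epilogue are written by hand, and every name (toy_shadow, toy_frame_enter, toy_frame_exit, toy_suspend_and_resume) is hypothetical rather than taken from the kernel or from GCC.

```c
#include <stdint.h>
#include <string.h>

#define TOY_GRANULES 16
#define STACK_LEFT   0xF1
#define STACK_RIGHT  0xF3

/* One shadow byte per stack granule; 0 means "fully accessible". */
static uint8_t toy_shadow[TOY_GRANULES];

/* What an instrumented prologue does: poison redzones around the locals. */
static void toy_frame_enter(int base)
{
	toy_shadow[base] = STACK_LEFT;
	toy_shadow[base + 2] = STACK_RIGHT;
}

/* What an instrumented epilogue does: clear the poison again. */
static void toy_frame_exit(int base)
{
	toy_shadow[base] = 0;
	toy_shadow[base + 2] = 0;
}

/*
 * Model one suspend attempt. If the firmware call returns, the epilogue
 * runs. If the CPU instead comes back through a resume path, the epilogue
 * is skipped and the poison stays behind.
 * Returns 1 if a later, legitimate access to the same granule is clean.
 */
static int toy_suspend_and_resume(int call_returns)
{
	memset(toy_shadow, 0, sizeof(toy_shadow));

	toy_frame_enter(4);
	if (call_returns)
		toy_frame_exit(4);

	/* A later frame reuses granule 4 as an ordinary local variable. */
	return toy_shadow[4] == 0;
}
```

In this model toy_suspend_and_resume(1) leaves a clean shadow, and toy_suspend_and_resume(0) leaves stale poison that a later access trips over, which matches the false positive described above.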

Andrey, on a tangential note, what do we do around hotplug? I assume
that we must unpoison the shadow region for the stack of a dead CPU,
but I wasn't able to figure out where we do that. Hopefully we're not
just getting lucky?

Thanks,
Mark.
Andrey Ryabinin Feb. 17, 2016, 4:31 p.m. UTC | #5
On 02/17/2016 05:39 PM, Mark Rutland wrote:
> On Tue, Feb 16, 2016 at 03:59:09PM +0300, Andrey Ryabinin wrote:
>> Actually, the first report is a bit more useful. It shows that shadow memory was corrupted:
>>
>>   ffffffc93665bc00: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f1 f1
>>> ffffffc93665bc80: f1 f1 00 00 00 00 f3 f3 00 f4 f4 f4 f3 f3 f3 f3
>>                       ^
>> F1 - left redzone, it indicates start of stack frame
>> F3 - right redzone, it should be the end of stack frame.
>>
>> But here we have the second set of F1s without F3s which should close the first set of F1s.
>> Also those two F3s in the middle cannot be right.
>>
>> So shadow is corrupted.
>> Some hypotheses:
> 
>> 2) Shadow memory wasn't cleared. GCC poisons memory on function entry and unpoisons it before return.
>>      If we exit a function in some tricky way, this could cause false positives like that.
>>      E.g. some hand-written assembly return code.
> 
> I think this is what's happening, at least for the idle case.
> 
> A second attempt at bisecting led me to commit e679660dbb8347f2 ("ARM:
> 8481/2: drivers: psci: replace psci firmware calls"). Reverting that
> makes v4.5-rc1 boot without KASAN splats.
> 
> That patch turned __invoke_psci_fn_{smc,hvc} into (ASAN-instrumented) C
> functions. Prior to that commit, __invoke_psci_fn_{smc,hvc} were
> pure assembly functions which used no stack.
> 
> When we go down for idle, in __cpu_suspend_enter we stash some context
> to the stack (in assembly). The CPU may return from a cold state via
> cpu_resume, where we restore context from the stack.
> 
> However, after storing the context we call psci_suspend_finisher, which
> calls psci_cpu_suspend, which calls invoke_psci_fn_*. As
> psci_cpu_suspend and invoke_psci_fn_* are instrumented, they poison
> memory on function entrance, but we never perform the unpoisoning.
> 
> That was always the case for psci_suspend_finisher, so there was a
> latent issue that we were somehow avoiding. Perhaps we got lucky with
> stack layout and never hit the poison.
> 
> I'm not sure how we fix that, as invoke_psci_fn_* may or may not return
> for arbitrary reasons (e.g. a CPU_SUSPEND_CALL may or may not return
> depending on whether an interrupt comes in at the right time).
> 
> Perhaps the simplest option is to not instrument invoke_psci_fn_* and
> psci_suspend_finisher. Do we have a per-function annotation to avoid
> KASAN instrumentation, like notrace? I need to investigate, but we may
> also need notrace for similar reasons.

include/linux/compiler-gcc.h:
/*
* Tell the compiler that address safety instrumentation (KASAN)
* should not be applied to that function.
* Conflicts with inlining: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
*/
#define __no_sanitize_address __attribute__((no_sanitize_address))
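
As an illustration only, the annotation could be applied along these lines. The function toy_invoke_fn below is a hypothetical stand-in, not the real PSCI code, and the noinline attribute is an assumption added here because of the inlining conflict mentioned in the comment above.

```c
/* Same definition as in include/linux/compiler-gcc.h, repeated so this builds standalone. */
#define __no_sanitize_address __attribute__((no_sanitize_address))

/* Assumption: keep the function out of line so it is not merged into an instrumented caller. */
#define toy_noinline __attribute__((noinline))

/*
 * Hypothetical stand-in for a firmware call wrapper. The local array would
 * normally get stack redzones; with the attribute, the compiler is asked
 * not to instrument this function.
 */
static toy_noinline __no_sanitize_address
unsigned long toy_invoke_fn(unsigned long function_id, unsigned long arg0)
{
	unsigned long scratch[2] = { function_id, arg0 };

	return scratch[0] ^ scratch[1];
}
```

The attribute changes only the instrumentation, not the result, so toy_invoke_fn() behaves the same with or without -fsanitize=kernel-address.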

> 
> Andrey, on a tangential note, what do we do around hotplug? I assume
> that we must unpoison the shadow region for the stack of a dead CPU,
> but I wasn't able to figure out where we do that. Hopefully we're not
> just getting lucky?
> 

We do nothing about it. AFAIU we need to clear swapper's stack, somewhere in secondary_start_kernel() perhaps.



> Thanks,
> Mark.
>
Mark Rutland Feb. 17, 2016, 7:35 p.m. UTC | #6
On Wed, Feb 17, 2016 at 07:31:43PM +0300, Andrey Ryabinin wrote:
> On 02/17/2016 05:39 PM, Mark Rutland wrote:
> > Andrey, on a tangential note, what do we do around hotplug? I assume
> > that we must unpoison the shadow region for the stack of a dead CPU,
> > but I wasn't able to figure out where we do that. Hopefully we're not
> > just getting lucky?
> 
> We do nothing about it. AFAIU we need to clear swapper's stack,
> somewhere in secondary_start_kernel() perhaps.

Oh, joy...

Surely other architectures (e.g. x86) will need to do something similar?

Do they do anything currently? I can't see that they do...

Mark.

Patch

diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index cf038c7..25d685c 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -117,6 +117,59 @@  static void __init cpu_set_ttbr1(unsigned long ttbr1)
 	: "r" (ttbr1));
 }
 
+static void verify_shadow(void)
+{
+	struct memblock_region *reg;
+	int i = 0;
+
+	for_each_memblock(memory, reg) {
+		void *start = (void *)__phys_to_virt(reg->base);
+		void *end = (void *)__phys_to_virt(reg->base + reg->size);
+		int *shadow_start, *shadow_end;
+
+		if (start >= end)
+			break;
+		shadow_start = (int *)((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
+		shadow_end =  (int *)kasan_mem_to_shadow(end);
+		for (; shadow_start < shadow_end; shadow_start += PAGE_SIZE/sizeof(int)) {
+			*shadow_start = i;
+			i++;
+		}
+	}
+
+	i = 0;
+	for_each_memblock(memory, reg) {
+		void *start = (void *)__phys_to_virt(reg->base);
+		void *end = (void *)__phys_to_virt(reg->base + reg->size);
+		int *shadow_start, *shadow_end;
+
+		if (start >= end)
+			break;
+		shadow_start = (int *)((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
+		shadow_end =  (int *)kasan_mem_to_shadow(end);
+		for (; shadow_start < shadow_end; shadow_start += PAGE_SIZE/sizeof(int)) {
+			if (*shadow_start != i) {
+				pr_err("screwed shadow mapping %d, %d\n", *shadow_start, i);
+				goto clear;
+			}
+			i++;
+		}
+	}
+clear:
+	for_each_memblock(memory, reg) {
+		void *start = (void *)__phys_to_virt(reg->base);
+		void *end = (void *)__phys_to_virt(reg->base + reg->size);
+		unsigned long shadow_start, shadow_end;
+
+		if (start >= end)
+			break;
+		shadow_start =  ((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
+		shadow_end =  (unsigned long)kasan_mem_to_shadow(end);
+		memset((void *)shadow_start, 0, shadow_end - shadow_start);
+	}
+
+}
+
 void __init kasan_init(void)
 {
 	struct memblock_region *reg;
@@ -159,6 +212,8 @@  void __init kasan_init(void)
 	cpu_set_ttbr1(__pa(swapper_pg_dir));
 	flush_tlb_all();
 
+	verify_shadow();
+
 	/* At this point kasan is fully initialized. Enable error messages */
 	init_task.kasan_depth = 0;
 	pr_info("KernelAddressSanitizer initialized\n");
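
The address arithmetic behind verify_shadow() in the patch above can be sketched in user space as follows. This is a rough model under assumptions: a scale shift of 3 (one shadow byte per 8 bytes of memory), a 4 KiB page size, and a made-up TOY_SHADOW_OFFSET. The real offset and page size depend on the kernel configuration, and all toy_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_SCALE_SHIFT   3			/* one shadow byte per 8 bytes */
#define TOY_PAGE_SIZE     4096ULL		/* assumed 4 KiB pages */
#define TOY_SHADOW_OFFSET 0x1000000000ULL	/* made-up value, illustration only */

/* Model of kasan_mem_to_shadow(): scale the address down, then add the offset. */
static uint64_t toy_mem_to_shadow(uint64_t addr)
{
	return (addr >> TOY_SCALE_SHIFT) + TOY_SHADOW_OFFSET;
}

/* Model of the "& ~(PAGE_SIZE - 1)" rounding used for shadow_start. */
static uint64_t toy_page_align_down(uint64_t addr)
{
	return addr & ~(TOY_PAGE_SIZE - 1);
}

/*
 * The loops step an int pointer by PAGE_SIZE/sizeof(int) elements,
 * which is exactly one page in bytes.
 */
static uint64_t toy_stride_bytes(void)
{
	return (TOY_PAGE_SIZE / sizeof(int)) * sizeof(int);
}
```

With these assumptions, one shadow page covers 32 KiB of memory, so the verify loops touch one int per 32 KiB of each memblock region.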