[v2] mm/ksm: Fix NULL pointer dereference when KSM zero page is enabled

Message ID 20200414075622.69822-1-songmuchun@bytedance.com (mailing list archive)
State New, archived

Commit Message

Muchun Song April 14, 2020, 7:56 a.m. UTC
find_mergeable_vma() can return NULL. In this case, it leads
to crash when we access vma->vm_mm(its offset is 0x40) later in
write_protect_page. And this case did happen on our server. The
following calltrace is captured in kernel 4.19 with KSM zero page
enabled. So add a vma check to fix it.

--------------------------------------------------------------------------
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP NOPTI
  CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
  Hardware name: FOXCONN R-5111/GROOT, BIOS IC1B111F 08/17/2019
  RIP: 0010:try_to_merge_one_page+0xc7/0x760
  Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
        60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
        8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
  RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
  RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
  RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
  RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
  R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
  R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
  FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
    ? follow_page_pte+0x36d/0x5e0
    ksm_scan_thread+0x115e/0x1960
    ? remove_wait_queue+0x60/0x60
    kthread+0xf5/0x130
    ? try_to_merge_with_ksm_page+0x90/0x90
    ? kthread_create_worker_on_cpu+0x70/0x70
    ret_from_fork+0x1f/0x30
--------------------------------------------------------------------------

Fixes: e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Xiongchun Duan <duanxiongchun@bytedance.com>
---
Change in v2:
    Update commit message and patch subject.

 mm/ksm.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
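
The oops above is consistent with the commit message: the fault address
(CR2) is 0x40, the faulting instruction bytes "<49> 8b 44 24 40" decode to
a load from 0x40(%r12), and R12 is zero. The following is a minimal
user-space sketch of that arithmetic, not kernel code. The struct
sketch_vma layout and the helper sketch_fault_address() are hypothetical;
the 0x40 offset is taken from the commit message and depends on the kernel
version and configuration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct vm_area_struct. Only the position of
 * vm_mm matters here; the 0x40 bytes of padding are an assumption chosen
 * to match the offset quoted in the commit message, not the real layout.
 */
struct sketch_vma {
	unsigned char pad[0x40];
	void *vm_mm;
};

/* The sketch only makes sense if the field really sits at 0x40. */
_Static_assert(offsetof(struct sketch_vma, vm_mm) == 0x40,
	       "vm_mm is expected at offset 0x40 in this sketch");

/*
 * Address that a load of vma->vm_mm touches for a given vma pointer value.
 * Computed arithmetically so that nothing is actually dereferenced.
 */
static uintptr_t sketch_fault_address(uintptr_t vma)
{
	return vma + offsetof(struct sketch_vma, vm_mm);
}
```

With a vma value of 0 the helper returns 0x40, which matches both CR2 and
the "NULL pointer dereference at 0000000000000040" line in the trace.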

Comments

Markus Elfring April 14, 2020, 9:17 a.m. UTC | #1
> to crash when we access vma->vm_mm(its offset is 0x40) later in

Will another fine-tuning become relevant also for this wording?


> following calltrace is captured in kernel 4.19 with KSM zero page

Can the mentioned Linux version trigger any special software
development concerns?

Will any other tags become helpful in such a case?

Regards,
Markus

Muchun Song April 14, 2020, 9:56 a.m. UTC | #2
On Tue, Apr 14, 2020 at 5:17 PM Markus Elfring <Markus.Elfring@web.de> wrote:
>
> > to crash when we access vma->vm_mm(its offset is 0x40) later in
>
> Will another fine-tuning become relevant also for this wording?
>

Sorry, I don't understand what this means because of my poor English.
Could you explain it again. Thanks.

>
> > following calltrace is captured in kernel 4.19 with KSM zero page
>
> Can the mentioned Linux version trigger any special software
> development concerns?
>
> Will any other tags become helpful in such a case?

How about changing
    "following calltrace is captured in kernel 4.19 with KSM zero page"
to
   "The following calltrace is captured with the following patch applied:
       e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
    "
?

Markus Elfring April 14, 2020, 2:17 p.m. UTC | #3
>>> to crash when we access vma->vm_mm(its offset is 0x40) later in
>>
>> Will another fine-tuning become relevant also for this wording?
>
> Sorry, I don't understand what this means because of my poor English.

Our language knowledge can evolve over time.


> Could you explain it again.

You integrated a few of my suggestions into your message selection. - Thanks.
I wonder why you did not like the following small adjustment possibilities
so far.

  to a crash … vm_mm (its …


>> Will any other tags become helpful in such a case?
>
> How about changing
>     "following calltrace is captured in kernel 4.19 with KSM zero page"
> to
>    "The following calltrace is captured with the following patch applied:
>        e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
>     "
> ?

I find it unlikely that such a wording alternative would be more appropriate
while I became just curious for related development consequences around
the usage of a longterm kernel version.

Would you like to reuse the term “call trace”?

Regards,
Markus

Muchun Song April 14, 2020, 2:39 p.m. UTC | #4
On Tue, Apr 14, 2020 at 10:17 PM Markus Elfring <Markus.Elfring@web.de> wrote:
>
> >>> to crash when we access vma->vm_mm(its offset is 0x40) later in
> >>
> >> Will another fine-tuning become relevant also for this wording?
> >
> > Sorry, I don't understand what this means because of my poor English.
>
> Our language knowledge can evolve over time.
>
>
> > Could you explain it again.
>
> You integrated a few of my suggestions into your message selection. - Thanks.
> I wonder why you did not like the following small adjustment possibilities
> so far.
>
>   to a crash … vm_mm (its …
>

Thanks a lot. I will fix it.

>
> >> Will any other tags become helpful in such a case?
> >
> > How about changing
> >     "following calltrace is captured in kernel 4.19 with KSM zero page"
> > to
> >    "The following calltrace is captured with the following patch applied:
> >        e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
> >     "
> > ?
>
> I find it unlikely that such a wording alternative would be more appropriate
> while I became just curious for related development consequences around
> the usage of a longterm kernel version.
>
> Would you like to reuse the term “call trace”?
>

OK, I will reuse the “call trace”. Thanks again.

Does anyone else have any suggestions? If not, I will post a v4
to fix the parts of the commit message that Markus mentioned.

Patch

diff --git a/mm/ksm.c b/mm/ksm.c
index a558da9e71770..69b2f85e22d5b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2112,8 +2112,11 @@  static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
 
 		down_read(&mm->mmap_sem);
 		vma = find_mergeable_vma(mm, rmap_item->address);
-		err = try_to_merge_one_page(vma, page,
-					    ZERO_PAGE(rmap_item->address));
+		if (vma)
+			err = try_to_merge_one_page(vma, page,
+					ZERO_PAGE(rmap_item->address));
+		else
+			err = -EFAULT;
 		up_read(&mm->mmap_sem);
 		/*
 		 * In case of failure, the page was not really empty, so we
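
The control flow of the hunk above can be sketched in user space as
follows. This is an illustration under stated assumptions, not the kernel
implementation: guard_mm, guard_vma, guard_find_vma(), guard_merge() and
guard_merge_checked() are hypothetical stand-ins for the mm, the VMA,
find_mergeable_vma(), try_to_merge_one_page() and the patched call site.
Locking (mmap_sem) is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel types. */
struct guard_mm {
	int id;
};

struct guard_vma {
	struct guard_mm *vm_mm;
};

/*
 * Stand-in for find_mergeable_vma(): returns NULL when the lookup fails,
 * modelled here by the caller passing mergeable == 0.
 */
static struct guard_vma *guard_find_vma(struct guard_vma *candidate,
					int mergeable)
{
	return mergeable ? candidate : NULL;
}

/*
 * Stand-in for try_to_merge_one_page(): it dereferences vma without
 * checking it, so it must never be called with NULL.
 */
static int guard_merge(struct guard_vma *vma)
{
	return vma->vm_mm ? 0 : -EFAULT;
}

/* Mirrors the patched hunk: only call the merge helper for a valid vma. */
static int guard_merge_checked(struct guard_vma *candidate, int mergeable)
{
	struct guard_vma *vma = guard_find_vma(candidate, mergeable);
	int err;

	if (vma)
		err = guard_merge(vma);
	else
		err = -EFAULT;

	return err;
}
```

When the lookup fails, guard_merge_checked() reports -EFAULT instead of
passing NULL on to the merge helper, so the caller takes its ordinary
failure path.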