From patchwork Fri Sep 4 23:35:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11758673
Date: Fri, 04 Sep 2020 16:35:43 -0700
From: Andrew Morton
To:
 akpm@linux-foundation.org, chris@chris-wilson.co.uk, jroedel@suse.de,
 linux-mm@kvack.org, mm-commits@vger.kernel.org, pavel@ucw.cz,
 sfr@canb.auug.org.au, stable@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 07/19] mm: track page table modifications in __apply_to_page_range()
Message-ID: <20200904233543._QZjc1czy%akpm@linux-foundation.org>
In-Reply-To: <20200904163454.4db0e6ce0c4584d2653678a3@linux-foundation.org>
User-Agent: s-nail v14.8.16
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Joerg Roedel
Subject: mm: track page table modifications in __apply_to_page_range()

__apply_to_page_range() is also used to change and/or allocate page-table
pages in the vmalloc area of the address space.  Make sure these changes
get synchronized to other page-tables in the system by calling
arch_sync_kernel_mappings() when necessary.

The impact appears limited to x86-32, where apply_to_page_range may miss
updating the PMD.
That leads to explosions in drivers like

[   24.227844] BUG: unable to handle page fault for address: fe036000
[   24.228076] #PF: supervisor write access in kernel mode
[   24.228294] #PF: error_code(0x0002) - not-present page
[   24.228494] *pde = 00000000
[   24.228640] Oops: 0002 [#1] SMP
[   24.228788] CPU: 3 PID: 1300 Comm: gem_concurrent_ Not tainted 5.9.0-rc1+ #16
[   24.228957] Hardware name:  /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015
[   24.229297] EIP: __execlists_context_alloc+0x132/0x2d0 [i915]
[   24.229462] Code: 31 d2 89 f0 e8 2f 55 02 00 89 45 e8 3d 00 f0 ff ff 0f 87 11 01 00 00 8b 4d e8 03 4b 30 b8 5a 5a 5a 5a ba 01 00 00 00 8d 79 04 01 5a 5a 5a 5a c7 81 fc 0f 00 00 5a 5a 5a 5a 83 e7 fc 29 f9 81
[   24.229759] EAX: 5a5a5a5a EBX: f60ca000 ECX: fe036000 EDX: 00000001
[   24.229915] ESI: f43b7340 EDI: fe036004 EBP: f6389cb8 ESP: f6389c9c
[   24.230072] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010286
[   24.230229] CR0: 80050033 CR2: fe036000 CR3: 2d361000 CR4: 001506d0
[   24.230385] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[   24.230539] DR6: fffe0ff0 DR7: 00000400
[   24.230675] Call Trace:
[   24.230957]  execlists_context_alloc+0x10/0x20 [i915]
[   24.231266]  intel_context_alloc_state+0x3f/0x70 [i915]
[   24.231547]  __intel_context_do_pin+0x117/0x170 [i915]
[   24.231850]  i915_gem_do_execbuffer+0xcc7/0x2500 [i915]
[   24.232024]  ? __kmalloc_track_caller+0x54/0x230
[   24.232181]  ? ktime_get+0x3e/0x120
[   24.232333]  ? dma_fence_signal+0x34/0x50
[   24.232617]  i915_gem_execbuffer2_ioctl+0xcd/0x1f0 [i915]
[   24.232912]  ? i915_gem_execbuffer_ioctl+0x2e0/0x2e0 [i915]
[   24.233084]  drm_ioctl_kernel+0x8f/0xd0
[   24.233236]  drm_ioctl+0x223/0x3d0
[   24.233505]  ? i915_gem_execbuffer_ioctl+0x2e0/0x2e0 [i915]
[   24.233684]  ? pick_next_task_fair+0x1b5/0x3d0
[   24.233873]  ? __switch_to_asm+0x36/0x50
[   24.234021]  ? drm_ioctl_kernel+0xd0/0xd0
[   24.234167]  __ia32_sys_ioctl+0x1ab/0x760
[   24.234313]  ? exit_to_user_mode_prepare+0xe5/0x110
[   24.234453]  ? syscall_exit_to_user_mode+0x23/0x130
[   24.234601]  __do_fast_syscall_32+0x3f/0x70
[   24.234744]  do_fast_syscall_32+0x29/0x60
[   24.234885]  do_SYSENTER_32+0x15/0x20
[   24.235021]  entry_SYSENTER_32+0x9f/0xf2
[   24.235157] EIP: 0xb7f28559
[   24.235288] Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
[   24.235576] EAX: ffffffda EBX: 00000005 ECX: c0406469 EDX: bf95556c
[   24.235722] ESI: b7e68000 EDI: c0406469 EBP: 00000005 ESP: bf9554d8
[   24.235869] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
[   24.236018] Modules linked in: i915 x86_pkg_temp_thermal intel_powerclamp crc32_pclmul crc32c_intel intel_cstate intel_uncore intel_gtt drm_kms_helper intel_pch_thermal video button autofs4 i2c_i801 i2c_smbus fan
[   24.236336] CR2: 00000000fe036000

It looks like kasan, xen and i915 are vulnerable.

Actual impact is "on thinkpad X60 in 5.9-rc1, screen starts blinking
after 30-or-so minutes, and machine is unusable"

[sfr@canb.auug.org.au: ARCH_PAGE_TABLE_SYNC_MASK needs vmalloc.h]
  Link: https://lkml.kernel.org/r/20200825172508.16800a4f@canb.auug.org.au
[chris@chris-wilson.co.uk: changelog addition]
[pavel@ucw.cz: changelog addition]
Link: https://lkml.kernel.org/r/20200821123746.16904-1-joro@8bytes.org
Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
Fixes: 86cf69f1d893 ("x86/mm/32: implement arch_sync_kernel_mappings()")
Signed-off-by: Joerg Roedel
Signed-off-by: Stephen Rothwell
Tested-by: Chris Wilson	[x86-32]
Acked-by: Linus Torvalds
Tested-by: Pavel Machek
Cc: [5.8+]
Signed-off-by: Andrew Morton
---

 mm/memory.c |   37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

--- a/mm/memory.c~mm-track-page-table-modifications-in-__apply_to_page_range
+++ a/mm/memory.c
@@ -73,6 +73,7 @@
 #include
 #include
 #include
+#include

 #include

@@ -83,6 +84,7 @@
 #include
 #include

+#include "pgalloc-track.h"
 #include "internal.h"

 #if defined(LAST_CPUPID_NOT_IN_PAGE_FLAGS) && !defined(CONFIG_COMPILE_TEST)
@@ -2206,7 +2208,8 @@ EXPORT_SYMBOL(vm_iomap_memory);

 static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 				     unsigned long addr, unsigned long end,
-				     pte_fn_t fn, void *data, bool create)
+				     pte_fn_t fn, void *data, bool create,
+				     pgtbl_mod_mask *mask)
 {
 	pte_t *pte;
 	int err = 0;
@@ -2214,7 +2217,7 @@ static int apply_to_pte_range(struct mm_

 	if (create) {
 		pte = (mm == &init_mm) ?
-			pte_alloc_kernel(pmd, addr) :
+			pte_alloc_kernel_track(pmd, addr, mask) :
 			pte_alloc_map_lock(mm, pmd, addr, &ptl);
 		if (!pte)
 			return -ENOMEM;
@@ -2235,6 +2238,7 @@ static int apply_to_pte_range(struct mm_
 				break;
 		}
 	} while (addr += PAGE_SIZE, addr != end);
+	*mask |= PGTBL_PTE_MODIFIED;

 	arch_leave_lazy_mmu_mode();

@@ -2245,7 +2249,8 @@ static int apply_to_pte_range(struct mm_

 static int apply_to_pmd_range(struct mm_struct *mm, pud_t *pud,
 				     unsigned long addr, unsigned long end,
-				     pte_fn_t fn, void *data, bool create)
+				     pte_fn_t fn, void *data, bool create,
+				     pgtbl_mod_mask *mask)
 {
 	pmd_t *pmd;
 	unsigned long next;
@@ -2254,7 +2259,7 @@ static int apply_to_pmd_range(struct mm_
 	BUG_ON(pud_huge(*pud));

 	if (create) {
-		pmd = pmd_alloc(mm, pud, addr);
+		pmd = pmd_alloc_track(mm, pud, addr, mask);
 		if (!pmd)
 			return -ENOMEM;
 	} else {
@@ -2264,7 +2269,7 @@ static int apply_to_pmd_range(struct mm_
 		next = pmd_addr_end(addr, end);
 		if (create || !pmd_none_or_clear_bad(pmd)) {
 			err = apply_to_pte_range(mm, pmd, addr, next, fn, data,
-						 create);
+						 create, mask);
 			if (err)
 				break;
 		}
@@ -2274,14 +2279,15 @@ static int apply_to_pmd_range(struct mm_

 static int apply_to_pud_range(struct mm_struct *mm, p4d_t *p4d,
 				     unsigned long addr, unsigned long end,
-				     pte_fn_t fn, void *data, bool create)
+				     pte_fn_t fn, void *data, bool create,
+				     pgtbl_mod_mask *mask)
 {
 	pud_t *pud;
 	unsigned long next;
 	int err = 0;

 	if (create) {
-		pud = pud_alloc(mm, p4d, addr);
+		pud = pud_alloc_track(mm, p4d, addr, mask);
 		if (!pud)
 			return -ENOMEM;
 	} else {
@@ -2291,7 +2297,7 @@ static int apply_to_pud_range(struct mm_
 		next = pud_addr_end(addr, end);
 		if (create || !pud_none_or_clear_bad(pud)) {
 			err = apply_to_pmd_range(mm, pud, addr, next, fn, data,
-						 create);
+						 create, mask);
 			if (err)
 				break;
 		}
@@ -2301,14 +2307,15 @@ static int apply_to_pud_range(struct mm_

 static int apply_to_p4d_range(struct mm_struct *mm, pgd_t *pgd,
 				     unsigned long addr, unsigned long end,
-				     pte_fn_t fn, void *data, bool create)
+				     pte_fn_t fn, void *data, bool create,
+				     pgtbl_mod_mask *mask)
 {
 	p4d_t *p4d;
 	unsigned long next;
 	int err = 0;

 	if (create) {
-		p4d = p4d_alloc(mm, pgd, addr);
+		p4d = p4d_alloc_track(mm, pgd, addr, mask);
 		if (!p4d)
 			return -ENOMEM;
 	} else {
@@ -2318,7 +2325,7 @@ static int apply_to_p4d_range(struct mm_
 		next = p4d_addr_end(addr, end);
 		if (create || !p4d_none_or_clear_bad(p4d)) {
 			err = apply_to_pud_range(mm, p4d, addr, next, fn, data,
-						 create);
+						 create, mask);
 			if (err)
 				break;
 		}
@@ -2331,8 +2338,9 @@ static int __apply_to_page_range(struct
 				 void *data, bool create)
 {
 	pgd_t *pgd;
-	unsigned long next;
+	unsigned long start = addr, next;
 	unsigned long end = addr + size;
+	pgtbl_mod_mask mask = 0;
 	int err = 0;

 	if (WARN_ON(addr >= end))
@@ -2343,11 +2351,14 @@ static int __apply_to_page_range(struct
 		next = pgd_addr_end(addr, end);
 		if (!create && pgd_none_or_clear_bad(pgd))
 			continue;
-		err = apply_to_p4d_range(mm, pgd, addr, next, fn, data, create);
+		err = apply_to_p4d_range(mm, pgd, addr, next, fn, data, create, &mask);
 		if (err)
 			break;
 	} while (pgd++, addr = next, addr != end);

+	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
+		arch_sync_kernel_mappings(start, start + size);
+
 	return err;
 }
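
Illustration only, not part of the patch: the tracking pattern the diff
threads through the page-table walk can be sketched in plain userspace C.
Every name below (toy_table, toy_apply_range, MOD_PTE, MOD_PMD, SYNC_MASK,
toy_sync) is hypothetical and merely stands in for the kernel's
pgtbl_mod_mask, the PGTBL_*_MODIFIED bits, ARCH_PAGE_TABLE_SYNC_MASK and
arch_sync_kernel_mappings(). The sketch assumes a two-level table and an
"architecture" that only needs a sync when a PMD-level entry is populated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for pgtbl_mod_mask and its per-level bits. */
typedef unsigned int mod_mask_t;
#define MOD_PTE (1u << 0)
#define MOD_PMD (1u << 1)

/* Assumption: only PMD-level changes must be propagated elsewhere. */
#define SYNC_MASK MOD_PMD

#define PTES_PER_PMD 4
#define NR_PMDS      4

/* A toy two-level table: NR_PMDS upper entries, each covering 4 slots. */
struct toy_table {
	bool pmd_present[NR_PMDS];
	int pte[NR_PMDS][PTES_PER_PMD];
	unsigned int sync_calls;	/* counts invocations of toy_sync() */
};

/* Stand-in for arch_sync_kernel_mappings(): here it only counts calls. */
static void toy_sync(struct toy_table *t)
{
	t->sync_calls++;
}

/* Lower level: populate the PMD if needed and record what was modified. */
static void toy_apply_pte_range(struct toy_table *t, unsigned int pmd_idx,
				unsigned int first, unsigned int last,
				int value, mod_mask_t *mask)
{
	unsigned int i;

	if (!t->pmd_present[pmd_idx]) {
		t->pmd_present[pmd_idx] = true;
		*mask |= MOD_PMD;	/* an upper-level entry was created */
	}
	for (i = first; i < last; i++)
		t->pte[pmd_idx][i] = value;
	*mask |= MOD_PTE;
}

/* Top level: walk [start, end), accumulate the mask, sync once at the end. */
static void toy_apply_range(struct toy_table *t, unsigned int start,
			    unsigned int end, int value)
{
	mod_mask_t mask = 0;
	unsigned int addr = start;

	assert(start < end && end <= NR_PMDS * PTES_PER_PMD);

	while (addr < end) {
		unsigned int pmd_idx = addr / PTES_PER_PMD;
		unsigned int next = (pmd_idx + 1) * PTES_PER_PMD;
		unsigned int first = addr % PTES_PER_PMD;

		if (next > end)
			next = end;
		toy_apply_pte_range(t, pmd_idx, first, first + (next - addr),
				    value, &mask);
		addr = next;
	}

	/* Only pay for the sync when a level that needs it was touched. */
	if (mask & SYNC_MASK)
		toy_sync(t);
}
```

In this toy model the first write to a range populates the upper-level
entries and triggers one sync; later writes to the same range touch only
the lowest level and skip it. That mirrors the intent of checking the
accumulated mask once at the end of the walk rather than syncing at every
level.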