From patchwork Tue Aug 13 20:25:20 2024
X-Patchwork-Submitter: Jann Horn
X-Patchwork-Id: 13762449
From: Jann Horn
Subject: [PATCH v2 0/2] userfaultfd: fix races around pmd_trans_huge() check
Date: Tue, 13 Aug 2024 22:25:20 +0200
Message-Id: <20240813-uffd-thp-flip-fix-v2-0-5efa61078a41@google.com>
To: Andrew Morton, Pavel Emelianov, Andrea Arcangeli, Hugh Dickins
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, David Hildenbrand,
    Qi Zheng, Jann Horn, stable@vger.kernel.org

The pmd_trans_huge() code in mfill_atomic() is wrong in three different
ways depending
on kernel version:

1. The pmd_trans_huge() check is racy and can lead to a BUG_ON() (if you hit
   the right two race windows) - I've tested this in a kernel build with
   some extra mdelay() calls. See the commit message for a description
   of the race scenario.
   On older kernels (before 6.5), I think the same bug can even
   theoretically lead to accessing transhuge page contents as a page table
   if you hit the right 5 narrow race windows (I haven't tested this case).
2. As pointed out by Qi Zheng, pmd_trans_huge() is not sufficient for
   detecting PMDs that don't point to page tables.
   On older kernels (before 6.5), you'd just have to win a single fairly
   wide race to hit this.
   I've tested this on 6.1 stable by racing migration (with a mdelay()
   patched into try_to_migrate()) against UFFDIO_ZEROPAGE - on my x86
   VM, that causes a kernel oops in ptlock_ptr().
3. On newer kernels (>=6.5), for shmem mappings, khugepaged is allowed
   to yank page tables out from under us (though I haven't tested that),
   so I think the BUG_ON() checks in mfill_atomic() are just wrong.

I decided to write two separate fixes for these (one fix for bugs 1+2,
one fix for bug 3), so that the first fix can be backported to kernels
affected by bugs 1+2.

Signed-off-by: Jann Horn
---
Changes in v2:
- in patch 1/2:
  - change title
  - get rid of redundant early pmd_trans_huge() check
  - also check for swap PMDs and devmap PMDs (Qi Zheng)
- Link to v1: https://lore.kernel.org/r/20240812-uffd-thp-flip-fix-v1-0-4fc1db7ccdd0@google.com

---
Jann Horn (2):
      userfaultfd: Fix checks for huge PMDs
      userfaultfd: Don't BUG_ON() if khugepaged yanks our page table

 mm/userfaultfd.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)
---
base-commit: d4560686726f7a357922f300fc81f5964be8df04
change-id: 20240812-uffd-thp-flip-fix-20f91f1151b9
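
For readers who want a concrete picture of bugs 1 and 2 from the cover letter above, here is a rough userspace model. It is not kernel code and not the patch itself: the enum, the function names and the use of a C11 atomic to stand in for a PMD slot are all made up for illustration. The idea it sketches is that a check which only rejects transhuge PMDs still lets swap and devmap entries through, and that the decision should be made on one snapshot of the entry rather than on repeated reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of what a PMD slot can hold; not kernel types. */
enum pmd_kind {
    PMD_KIND_NONE,       /* empty entry */
    PMD_KIND_TABLE,      /* points to a page table */
    PMD_KIND_TRANS_HUGE, /* present transparent huge page */
    PMD_KIND_DEVMAP,     /* huge device mapping */
    PMD_KIND_SWAP,       /* non-present huge entry, e.g. during migration */
    PMD_KIND_COUNT
};

/* Models a check that only rejects empty and present transhuge entries. */
static bool old_check_allows_pte_walk(enum pmd_kind k)
{
    assert(k < PMD_KIND_COUNT);
    return k != PMD_KIND_NONE && k != PMD_KIND_TRANS_HUGE;
}

/* Models a stricter check: only a real page table may be walked. */
static bool strict_check_allows_pte_walk(enum pmd_kind k)
{
    assert(k < PMD_KIND_COUNT);
    return k == PMD_KIND_TABLE;
}

/*
 * Load the slot exactly once and decide on that snapshot, so that a
 * concurrent writer cannot change the entry between two checks.
 */
static bool can_walk_ptes(_Atomic int *pmdp)
{
    enum pmd_kind snap = (enum pmd_kind)atomic_load(pmdp);

    return strict_check_allows_pte_walk(snap);
}
```

In this model, old_check_allows_pte_walk() returns true for PMD_KIND_SWAP and PMD_KIND_DEVMAP, which corresponds to the situation described in bug 2, while can_walk_ptes() only returns true when the single snapshot is PMD_KIND_TABLE.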