From patchwork Wed Apr 5 15:51:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13202121
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Axel Rasmussen, Nadav Amit, David Hildenbrand, Andrew Morton,
 peterx@redhat.com, Andrea Arcangeli, Mike Rapoport, Yang Shi, linux-stable
Subject: [PATCH] mm/khugepaged: Check again on anon uffd-wp during isolation
Date: Wed, 5 Apr 2023 11:51:20 -0400
Message-Id: <20230405155120.3608140-1-peterx@redhat.com>
X-Mailer: git-send-email 2.39.1

Khugepaged collapses an anonymous THP in two rounds of scans. The second
round is done in __collapse_huge_page_isolate(), after
hpage_collapse_scan_pmd() has done the first. Between the two rounds all
the locks are released temporarily, which means the pgtable can change
before the second round starts.

It is logically possible that some ptes get wr-protected during this
window, so that we erroneously collapse a THP without noticing that some
ptes are wr-protected by userfault. e1e267c7928f wanted to avoid this, but
it only added the check to the first round, not the second.

Since __collapse_huge_page_isolate() happens after a round of small page
swap-ins, we don't need to worry about any !present ptes: if one exists,
khugepaged will already bail out. So we only need to check present ptes
with the uffd-wp bit set there.

This is something I only noticed and never had a reproducer for. I thought
it was what caused a bug in Muhammad's recent work on the new pagemap
ioctl, but that turned out to be a userspace bug instead. However, this
still seems to be a real bug, even with a very small race window, so it is
worth fixing and copying stable.

Cc: linux-stable
Fixes: e1e267c7928f ("khugepaged: skip collapse if uffd-wp detected")
Signed-off-by: Peter Xu
Reviewed-by: David Hildenbrand
---
 mm/khugepaged.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a19aa140fd52..42ac93b4bd87 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -575,6 +575,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 			result = SCAN_PTE_NON_PRESENT;
 			goto out;
 		}
+		if (pte_uffd_wp(pteval)) {
+			result = SCAN_PTE_UFFD_WP;
+			goto out;
+		}
 		page = vm_normal_page(vma, address, pteval);
 		if (unlikely(!page) || unlikely(is_zone_device_page(page))) {
 			result = SCAN_PAGE_NULL;