From patchwork Wed Sep 12 06:49:21 2018
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10596889
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Andrew Morton, Mel Gorman, Khalid Aziz,
    Thomas Gleixner, "David S. Miller", Greg Kroah-Hartman, Andi Kleen,
    Henry Willard, Anshuman Khandual, Andrea Arcangeli,
    "Kirill A. Shutemov", Jerome Glisse, Zi Yan, linux-mm@kvack.org
Subject: [PATCH v2] mm: mprotect: check page dirty when change ptes
Date: Wed, 12 Sep 2018 14:49:21 +0800
Message-Id: <20180912064921.31015-1-peterx@redhat.com>

Add an extra check on the page dirty bit in change_pte_range(), since
there are cases where a page has been dirtied but its PTE dirty bit is
unset. One example is when a huge PMD is split after being written to:
the dirty bit will be set on the compound page, but it won't be set on
each of the small-page PTEs.
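To see where the dirty bit ends up, here is a simplified sketch of the
split path. This paraphrases the 4.19-era __split_huge_pmd_locked() in
mm/huge_memory.c from memory (not a verbatim quote), with declarations
trimmed and locking, refcounting and migration details elided:

	struct page *page = pmd_page(old_pmd);
	bool write = pmd_write(old_pmd);
	bool young = pmd_young(old_pmd);
	unsigned long addr;
	int i;

	/* The PMD-level dirty bit is transferred to the compound page... */
	if (pmd_dirty(old_pmd))
		SetPageDirty(page);

	for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
		pte_t entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));

		entry = maybe_mkwrite(entry, vma);
		if (!write)
			entry = pte_wrprotect(entry);
		if (!young)
			entry = pte_mkold(entry);
		/*
		 * ...but there is no pte_mkdirty() here: the small PTEs
		 * start out clean, so pte_dirty() alone under-reports
		 * dirtiness right after a split.
		 */
		/* in the real code, pte is pte_offset_map()ed per iteration */
		set_pte_at(mm, addr, pte, entry);
	}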
I noticed this when debugging a customized kernel that implemented
userfaultfd write-protect. In that case the dirty bit is critical,
since it is what userspace needs in order to handle the write-protect
page fault (otherwise the task gets a SIGBUS after a loop of page
faults). However, the check should benefit upstream Linux too: it lets
the dirty-bit optimization path below cover more scenarios, so we avoid
extra page faults on the small pages when the previous huge page has
already been written to. (A rough sketch of the userspace side of the
write-protect flow follows after the diff.)

CC: Andrew Morton
CC: Mel Gorman
CC: Khalid Aziz
CC: Thomas Gleixner
CC: "David S. Miller"
CC: Greg Kroah-Hartman
CC: Andi Kleen
CC: Henry Willard
CC: Anshuman Khandual
CC: Andrea Arcangeli
CC: Kirill A. Shutemov
CC: Jerome Glisse
CC: Zi Yan
CC: linux-mm@kvack.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Peter Xu
Signed-off-by: Jérôme Glisse
---
v2:
- check the dirty bit when changing PTE entries rather than fixing up
  the dirty bit when splitting the huge-page PMD
- rebase to 4.19-rc3

Instead of keeping this in my local tree, I'm giving it another shot to
see whether it could be acceptable for upstream, since IMHO it should
still benefit upstream.

Thanks,
---
 mm/mprotect.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 6d331620b9e5..5fe752515161 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -115,6 +115,17 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			if (preserve_write)
 				ptent = pte_mk_savedwrite(ptent);
 
+			/*
+			 * The extra PageDirty() check will make sure
+			 * we'll capture the dirty page even if the PTE
+			 * dirty bit is unset. One case is when the
+			 * PTE is split from a huge PMD; in that
+			 * case the dirty flag might only be set on the
+			 * compound page instead of this PTE.
+			 */
+			if (PageDirty(pte_page(ptent)))
+				ptent = pte_mkdirty(ptent);
+
 			/* Avoid taking write faults for known dirty pages */
 			if (dirty_accountable && pte_dirty(ptent) &&
 					(pte_soft_dirty(ptent) ||
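As promised above, here is a rough sketch of the userspace side of the
write-protect flow. It is illustrative only: it assumes the uffd-wp
uapi names used by the out-of-tree write-protect series
(UFFDIO_WRITEPROTECT, UFFDIO_WRITEPROTECT_MODE_WP,
UFFD_PAGEFAULT_FLAG_WP), which vanilla 4.19 does not provide; the
helpers wp_range() and handle_wp_fault() are hypothetical names, and
registration (UFFDIO_API/UFFDIO_REGISTER) and most error handling are
omitted:

/*
 * Illustrative sketch only: assumes the uffd-wp uapi of the
 * out-of-tree write-protect series (not part of vanilla 4.19);
 * wp_range() and handle_wp_fault() are hypothetical helper names.
 */
#include <linux/userfaultfd.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Write-protect (protect != 0) or un-protect [start, start + len). */
static int wp_range(int uffd, unsigned long start, unsigned long len,
		    int protect)
{
	struct uffdio_writeprotect wp = {
		.range = { .start = start, .len = len },
		.mode = protect ? UFFDIO_WRITEPROTECT_MODE_WP : 0,
	};

	return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}

/* Read one event from the uffd and resolve it if it is a WP fault. */
static void handle_wp_fault(int uffd)
{
	unsigned long pgsz = sysconf(_SC_PAGESIZE);
	struct uffd_msg msg;

	if (read(uffd, &msg, sizeof(msg)) != sizeof(msg))
		return;

	if (msg.event == UFFD_EVENT_PAGEFAULT &&
	    (msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP)) {
		/*
		 * Un-protecting the faulting page (which also wakes the
		 * faulting thread) is how userspace resolves the fault.
		 * This only works if the kernel tracks dirtiness
		 * correctly across mprotect(); losing the dirty bit on
		 * a split THP leads to the fault loop / SIGBUS
		 * mentioned in the changelog.
		 */
		unsigned long addr = msg.arg.pagefault.address & ~(pgsz - 1);

		if (wp_range(uffd, addr, pgsz, 0))
			perror("UFFDIO_WRITEPROTECT");
	}
}

The monitor side above only stays correct if pte_dirty() does not
under-report writes after a THP split, which is the dependency the
changelog describes.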