From patchwork Mon Jun 12 16:04:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13276937
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Andrew Morton, peterx@redhat.com, Muchun Song, Mike Kravetz, Naoya Horiguchi
Subject: [PATCH] mm/hugetlb: Fix pgtable lock on pmd sharing
Date: Mon, 12 Jun 2023 12:04:20 -0400
Message-Id: <20230612160420.809818-1-peterx@redhat.com>
X-Mailer: git-send-email 2.40.1

Huge pmd sharing operates on the PUD, not the PMD, so huge_pte_lock() is
not suitable here: it is meant for last-level pte changes only, while pmd
sharing always happens one level higher.

Meanwhile, the lock being taken here is the spte pgtable lock, which is
not even a lock of the current mm but of someone else's.

Operating on that lock also looks racy: after put_page() on the spte
pgtable page the page can logically be released, so at the very least the
spin_unlock() would need to be done before the put_page().

I am not aware of any report. I am not even sure whether taking the spte
pmd lock might just happen to work: while we hold the i_mmap read lock the
vma interval tree is probably frozen, so every pte allocator over this pud
entry would always find the same svma and spte page, and they might
therefore serialize on this spte page lock. Even so, that does not seem to
be intended. It looks like an accident of cb900f412154.

Fix it with the proper pud lock (which is the mm's page_table_lock).

Cc: Mike Kravetz
Cc: Naoya Horiguchi
Fixes: cb900f412154 ("mm, hugetlb: convert hugetlbfs to use split pmd lock")
Signed-off-by: Peter Xu
Reviewed-by: Mike Kravetz
---
 mm/hugetlb.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index dfa412d8cb30..270ec0ecd5a1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7133,7 +7133,6 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	unsigned long saddr;
 	pte_t *spte = NULL;
 	pte_t *pte;
-	spinlock_t *ptl;
 
 	i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
@@ -7154,7 +7153,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (!spte)
 		goto out;
 
-	ptl = huge_pte_lock(hstate_vma(vma), mm, spte);
+	spin_lock(&mm->page_table_lock);
 	if (pud_none(*pud)) {
 		pud_populate(mm, pud,
 				(pmd_t *)((unsigned long)spte & PAGE_MASK));
@@ -7162,7 +7161,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	} else {
 		put_page(virt_to_page(spte));
 	}
-	spin_unlock(ptl);
+	spin_unlock(&mm->page_table_lock);
 out:
 	pte = (pte_t *)pmd_alloc(mm, pud, addr);
 	i_mmap_unlock_read(mapping);
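
A note on why the lifetime of the lock matters. What follows is only a minimal userspace sketch of the locking order the patch ends up with, not kernel code. Every sketch_* name is hypothetical: the atomic_flag spinlock stands in for spinlock_t, the slot field plays the role of the pud entry, and the owner plays the role of the mm. It tries to show that when the lock guarding the slot lives in the owner, dropping a reference on the table inside the critical section cannot free the lock that is about to be released.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Minimal userspace spinlock; a stand-in for the kernel's spinlock_t. */
struct sketch_lock {
	atomic_flag held;
};

static void sketch_lock_init(struct sketch_lock *l)
{
	atomic_flag_clear(&l->held);
}

static void sketch_lock_acquire(struct sketch_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void sketch_lock_release(struct sketch_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical shared table: refcounted, freed on the last put. */
struct sketch_table {
	atomic_int refcount;
};

/* Hypothetical owner (the "mm"): its lock outlives any single table. */
struct sketch_owner {
	struct sketch_lock page_table_lock;
	struct sketch_table *slot;	/* plays the role of the pud entry */
};

static struct sketch_table *sketch_table_new(void)
{
	struct sketch_table *t = malloc(sizeof(*t));

	if (t)
		atomic_init(&t->refcount, 1);
	return t;
}

static void sketch_table_get(struct sketch_table *t)
{
	atomic_fetch_add(&t->refcount, 1);
}

static void sketch_table_put(struct sketch_table *t)
{
	/* atomic_fetch_sub returns the old value: 1 means this was the last ref */
	if (atomic_fetch_sub(&t->refcount, 1) == 1)
		free(t);
}

/*
 * Install a table on which the caller already holds a reference.
 * Returns true if the slot was empty and now points at the table,
 * false if the slot was already populated and the reference was dropped.
 */
static bool sketch_share(struct sketch_owner *o, struct sketch_table *t)
{
	bool installed = false;

	assert(o && t);
	sketch_lock_acquire(&o->page_table_lock);
	if (!o->slot) {
		o->slot = t;		/* the slot keeps the caller's reference */
		installed = true;
	} else {
		sketch_table_put(t);	/* slot already set: drop our reference */
	}
	/* Safe even if the put freed t: the lock belongs to o, not to t. */
	sketch_lock_release(&o->page_table_lock);
	return installed;
}
```

If the lock were embedded in struct sketch_table instead, the put in the else branch could free the memory holding the lock before the release touches it, which is the ordering concern raised in the commit message.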