From patchwork Tue Jun 27 04:23:14 2023
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13293971
Date: Mon, 26 Jun 2023 21:23:14 -0700
In-Reply-To: <20230627042321.1763765-1-surenb@google.com>
References: <20230627042321.1763765-1-surenb@google.com>
Message-ID: <20230627042321.1763765-2-surenb@google.com>
Subject: [PATCH v3 1/8] swap: remove remnants of polling from read_swap_cache_async
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: willy@infradead.org, hannes@cmpxchg.org, mhocko@suse.com,
 josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
 laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com,
 jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, hdanton@sina.com,
 apopple@nvidia.com, peterx@redhat.com, ying.huang@intel.com,
 david@redhat.com, yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org, pasha.tatashin@soleen.com,
 surenb@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com, Christoph Hellwig
X-Mailing-List: linux-fsdevel@vger.kernel.org

Commit [1] introduced IO polling support during swapin to reduce swap read
latency for block devices that can be polled. However, a later commit [2]
removed polling support. Therefore it seems safe to remove the do_poll
parameter from read_swap_cache_async and always call swap_readpage with
synchronous=false, waiting for IO completion in folio_lock_or_retry.

[1] commit 23955622ff8d ("swap: add block io poll in swapin path")
[2] commit 9650b453a3d4 ("block: ignore RWF_HIPRI hint for sync dio")

Suggested-by: "Huang, Ying"
Signed-off-by: Suren Baghdasaryan
Reviewed-by: "Huang, Ying"
Reviewed-by: Christoph Hellwig
---
 mm/madvise.c    |  4 ++--
 mm/swap.h       |  1 -
 mm/swap_state.c | 12 +++++-------
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index b5ffbaf616f5..b1e8adf1234e 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -215,7 +215,7 @@ static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start,
 			continue;
 
 		page = read_swap_cache_async(entry, GFP_HIGHUSER_MOVABLE,
-					     vma, index, false, &splug);
+					     vma, index, &splug);
 		if (page)
 			put_page(page);
 	}
@@ -252,7 +252,7 @@ static void force_shm_swapin_readahead(struct vm_area_struct *vma,
 		rcu_read_unlock();
 
 		page = read_swap_cache_async(swap, GFP_HIGHUSER_MOVABLE,
-					     NULL, 0, false, &splug);
+					     NULL, 0, &splug);
 		if (page)
 			put_page(page);
 
diff --git a/mm/swap.h b/mm/swap.h
index 7c033d793f15..8a3c7a0ace4f 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -46,7 +46,6 @@ struct folio *filemap_get_incore_folio(struct address_space *mapping,
 struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 				   struct vm_area_struct *vma,
 				   unsigned long addr,
-				   bool do_poll,
 				   struct swap_iocb **plug);
 struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 				     struct vm_area_struct *vma,
diff --git a/mm/swap_state.c b/mm/swap_state.c
index b76a65ac28b3..a3839de71f3f 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -517,15 +517,14 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
  */
 struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 				   struct vm_area_struct *vma,
-				   unsigned long addr, bool do_poll,
-				   struct swap_iocb **plug)
+				   unsigned long addr, struct swap_iocb **plug)
 {
 	bool page_was_allocated;
 	struct page *retpage = __read_swap_cache_async(entry, gfp_mask,
 			vma, addr, &page_was_allocated);
 
 	if (page_was_allocated)
-		swap_readpage(retpage, do_poll, plug);
+		swap_readpage(retpage, false, plug);
 
 	return retpage;
 }
@@ -620,7 +619,7 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	struct swap_info_struct *si = swp_swap_info(entry);
 	struct blk_plug plug;
 	struct swap_iocb *splug = NULL;
-	bool do_poll = true, page_allocated;
+	bool page_allocated;
 	struct vm_area_struct *vma = vmf->vma;
 	unsigned long addr = vmf->address;
 
@@ -628,7 +627,6 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	if (!mask)
 		goto skip;
 
-	do_poll = false;
 	/* Read a page_cluster sized and aligned cluster around offset.
*/
 	start_offset = offset & ~mask;
 	end_offset = offset | mask;
@@ -660,7 +658,7 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	lru_add_drain(); /* Push any new pages onto the LRU now */
 skip:
 	/* The page was likely read above, so no need for plugging here */
-	return read_swap_cache_async(entry, gfp_mask, vma, addr, do_poll, NULL);
+	return read_swap_cache_async(entry, gfp_mask, vma, addr, NULL);
 }
 
 int init_swap_address_space(unsigned int type, unsigned long nr_pages)
@@ -825,7 +823,7 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 skip:
 	/* The page was likely read above, so no need for plugging here */
 	return read_swap_cache_async(fentry, gfp_mask, vma, vmf->address,
-				     ra_info.win == 1, NULL);
+				     NULL);
 }
 
 /**

From patchwork Tue Jun 27 04:23:15 2023
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13293972
Date: Mon, 26 Jun 2023 21:23:15 -0700
In-Reply-To: <20230627042321.1763765-1-surenb@google.com>
References: <20230627042321.1763765-1-surenb@google.com>
Message-ID: <20230627042321.1763765-3-surenb@google.com>
Subject: [PATCH v3 2/8] mm: add missing VM_FAULT_RESULT_TRACE name for VM_FAULT_COMPLETED
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: willy@infradead.org, hannes@cmpxchg.org, mhocko@suse.com,
 josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
 laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com,
 jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, hdanton@sina.com,
 apopple@nvidia.com, peterx@redhat.com, ying.huang@intel.com,
 david@redhat.com, yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org, pasha.tatashin@soleen.com,
 surenb@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com
X-Mailing-List: linux-fsdevel@vger.kernel.org

VM_FAULT_RESULT_TRACE should contain an element for every vm_fault_reason
to be used as flag_array inside trace_print_flags_seq(). The element for
VM_FAULT_COMPLETED is missing; add it.

Signed-off-by: Suren Baghdasaryan
Reviewed-by: Peter Xu
---
 include/linux/mm_types.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 306a3d1a0fa6..79765e3dd8f3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1070,7 +1070,8 @@ enum vm_fault_reason {
 	{ VM_FAULT_RETRY, "RETRY" }, \
 	{ VM_FAULT_FALLBACK, "FALLBACK" }, \
 	{ VM_FAULT_DONE_COW, "DONE_COW" }, \
-	{ VM_FAULT_NEEDDSYNC, "NEEDDSYNC" }
+	{ VM_FAULT_NEEDDSYNC, "NEEDDSYNC" }, \
+	{ VM_FAULT_COMPLETED, "COMPLETED" }
 
 struct vm_special_mapping {
 	const char *name; /* The name, e.g. "[vdso]".
*/

From patchwork Tue Jun 27 04:23:16 2023
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13293973
Date: Mon, 26 Jun 2023 21:23:16 -0700
In-Reply-To: <20230627042321.1763765-1-surenb@google.com>
References: <20230627042321.1763765-1-surenb@google.com>
Message-ID: <20230627042321.1763765-4-surenb@google.com>
Subject: [PATCH v3 3/8] mm: drop per-VMA lock in handle_mm_fault if retrying or when finished
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: willy@infradead.org, hannes@cmpxchg.org, mhocko@suse.com,
 josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
 laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com,
 jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, hdanton@sina.com,
 apopple@nvidia.com, peterx@redhat.com, ying.huang@intel.com,
 david@redhat.com, yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org, pasha.tatashin@soleen.com,
 surenb@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com
X-Mailing-List: linux-fsdevel@vger.kernel.org

handle_mm_fault returning VM_FAULT_RETRY or VM_FAULT_COMPLETED means
mmap_lock has been released. However, with per-VMA locks the behavior is
different and the caller should still release the per-VMA lock. To make the
rules consistent for the caller, drop the per-VMA lock before returning
from handle_mm_fault when the page fault should be retried or is completed.

Signed-off-by: Suren Baghdasaryan
---
 arch/arm64/mm/fault.c   |  3 ++-
 arch/powerpc/mm/fault.c |  3 ++-
 arch/s390/mm/fault.c    |  3 ++-
 arch/x86/mm/fault.c     |  3 ++-
 mm/memory.c             | 12 +++++++++++-
 5 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 6045a5117ac1..89f84e9ea1ff 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -601,7 +601,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 		goto lock_mmap;
 	}
 	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
-	vma_end_read(vma);
+	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
+		vma_end_read(vma);
 
 	if (!(fault & VM_FAULT_RETRY)) {
 		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 531177a4ee08..4697c5dca31c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -494,7 +494,8 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 	}
 
 	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
-	vma_end_read(vma);
+	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
+		vma_end_read(vma);
 
 	if (!(fault & VM_FAULT_RETRY)) {
 		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index b65144c392b0..cccefe41038b 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -418,7 +418,8 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
 		goto lock_mmap;
 	}
 	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
-	vma_end_read(vma);
+	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
+		vma_end_read(vma);
 	if (!(fault & VM_FAULT_RETRY)) {
 		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
 		goto out;
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e4399983c50c..d69c85c1c04e 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1347,7 +1347,8 @@ void do_user_addr_fault(struct pt_regs *regs,
 		goto lock_mmap;
 	}
 	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
-	vma_end_read(vma);
+	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
+		vma_end_read(vma);
 
 	if (!(fault & VM_FAULT_RETRY)) {
 		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
diff --git a/mm/memory.c b/mm/memory.c
index f69fbc251198..9011ad63c41b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5086,7 +5086,17 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 		}
 	}
 
-	return handle_pte_fault(&vmf);
+	ret = handle_pte_fault(&vmf);
+	if (ret & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)) {
+		/*
+		 * In case of VM_FAULT_RETRY or VM_FAULT_COMPLETED we might
+		 * be still holding per-VMA lock to keep the vma stable as long
+		 * as possible. Drop it before returning.
+		 */
+		if (vmf.flags & FAULT_FLAG_VMA_LOCK)
+			vma_end_read(vma);
+	}
+	return ret;
 }
 
 /**

From patchwork Tue Jun 27 04:23:17 2023
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13293974
Date: Mon, 26 Jun 2023 21:23:17 -0700
In-Reply-To: <20230627042321.1763765-1-surenb@google.com>
References: <20230627042321.1763765-1-surenb@google.com>
Message-ID: <20230627042321.1763765-5-surenb@google.com>
Subject: [PATCH v3 4/8] mm: replace folio_lock_or_retry with folio_lock_fault
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: willy@infradead.org, hannes@cmpxchg.org, mhocko@suse.com,
 josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
 laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com,
 jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, hdanton@sina.com,
 apopple@nvidia.com, peterx@redhat.com, ying.huang@intel.com,
 david@redhat.com, yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org, pasha.tatashin@soleen.com,
 surenb@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com
X-Mailing-List: linux-fsdevel@vger.kernel.org

Rename folio_lock_or_retry to folio_lock_fault and change it to accept a
vm_fault struct and return vm_fault_t directly. This will be used later to
return additional information about the state of the mmap_lock upon return
from this function.

Suggested-by: Matthew Wilcox
Signed-off-by: Suren Baghdasaryan
---
 include/linux/pagemap.h | 13 ++++++-------
 mm/filemap.c            | 29 +++++++++++++++--------------
 mm/memory.c             | 14 ++++++--------
 3 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index a56308a9d1a4..0bc206c6f62c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -896,8 +896,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 
 void __folio_lock(struct folio *folio);
 int __folio_lock_killable(struct folio *folio);
-bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
-				unsigned int flags);
+vm_fault_t __folio_lock_fault(struct folio *folio, struct vm_fault *vmf);
 void unlock_page(struct page *page);
 void folio_unlock(struct folio *folio);
 
@@ -995,17 +994,17 @@ static inline int folio_lock_killable(struct folio *folio)
 }
 
 /*
- * folio_lock_or_retry - Lock the folio, unless this would block and the
+ * folio_lock_fault - Lock the folio, unless this would block and the
  * caller indicated that it can handle a retry.
  *
  * Return value and mmap_lock implications depend on flags; see
- * __folio_lock_or_retry().
+ * __folio_lock_fault().
  */
-static inline bool folio_lock_or_retry(struct folio *folio,
-		struct mm_struct *mm, unsigned int flags)
+static inline vm_fault_t folio_lock_fault(struct folio *folio,
+					  struct vm_fault *vmf)
 {
 	might_sleep();
-	return folio_trylock(folio) || __folio_lock_or_retry(folio, mm, flags);
+	return folio_trylock(folio) ? 0 : __folio_lock_fault(folio, vmf);
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 00f01d8ead47..87b335a93530 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1701,46 +1701,47 @@ static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
 
 /*
  * Return values:
- * true - folio is locked; mmap_lock is still held.
- * false - folio is not locked.
+ * 0 - folio is locked.
+ * VM_FAULT_RETRY - folio is not locked.
  *     mmap_lock has been released (mmap_read_unlock(), unless flags had both
  *     FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT set, in
  *     which case mmap_lock is still held.
  *
- * If neither ALLOW_RETRY nor KILLABLE are set, will always return true
+ * If neither ALLOW_RETRY nor KILLABLE are set, will always return 0
  * with the folio locked and the mmap_lock unperturbed.
  */
-bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
-			 unsigned int flags)
+vm_fault_t __folio_lock_fault(struct folio *folio, struct vm_fault *vmf)
 {
-	if (fault_flag_allow_retry_first(flags)) {
+	struct mm_struct *mm = vmf->vma->vm_mm;
+
+	if (fault_flag_allow_retry_first(vmf->flags)) {
 		/*
 		 * CAUTION! In this case, mmap_lock is not released
-		 * even though return 0.
+		 * even though return VM_FAULT_RETRY.
*/ - if (flags & FAULT_FLAG_RETRY_NOWAIT) - return false; + if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT) + return VM_FAULT_RETRY; mmap_read_unlock(mm); - if (flags & FAULT_FLAG_KILLABLE) + if (vmf->flags & FAULT_FLAG_KILLABLE) folio_wait_locked_killable(folio); else folio_wait_locked(folio); - return false; + return VM_FAULT_RETRY; } - if (flags & FAULT_FLAG_KILLABLE) { + if (vmf->flags & FAULT_FLAG_KILLABLE) { bool ret; ret = __folio_lock_killable(folio); if (ret) { mmap_read_unlock(mm); - return false; + return VM_FAULT_RETRY; } } else { __folio_lock(folio); } - return true; + return 0; } /** diff --git a/mm/memory.c b/mm/memory.c index 9011ad63c41b..3c2acafcd7b6 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3568,6 +3568,7 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf) struct folio *folio = page_folio(vmf->page); struct vm_area_struct *vma = vmf->vma; struct mmu_notifier_range range; + vm_fault_t ret; /* * We need a reference to lock the folio because we don't hold @@ -3580,9 +3581,10 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf) if (!folio_try_get(folio)) return 0; - if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags)) { + ret = folio_lock_fault(folio, vmf); + if (ret) { folio_put(folio); - return VM_FAULT_RETRY; + return ret; } mmu_notifier_range_init_owner(&range, MMU_NOTIFY_EXCLUSIVE, 0, vma->vm_mm, vmf->address & PAGE_MASK, @@ -3704,7 +3706,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) bool exclusive = false; swp_entry_t entry; pte_t pte; - int locked; vm_fault_t ret = 0; void *shadow = NULL; @@ -3825,12 +3826,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) goto out_release; } - locked = folio_lock_or_retry(folio, vma->vm_mm, vmf->flags); - - if (!locked) { - ret |= VM_FAULT_RETRY; + ret |= folio_lock_fault(folio, vmf); + if (ret & VM_FAULT_RETRY) goto out_release; - } if (swapcache) { /* From patchwork Tue Jun 27 04:23:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 13293975 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BBD8CEB64DC for ; Tue, 27 Jun 2023 04:23:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230361AbjF0EXx (ORCPT ); Tue, 27 Jun 2023 00:23:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56348 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230175AbjF0EXl (ORCPT ); Tue, 27 Jun 2023 00:23:41 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A91D172A for ; Mon, 26 Jun 2023 21:23:36 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-56938733c13so52688747b3.1 for ; Mon, 26 Jun 2023 21:23:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1687839816; x=1690431816; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=/bqEyfDb2JHBciJ5X1JfWPHzjf7SmvaClBSwV4xMi6U=; b=OvvulEsWmawhmtqOLDyG7NX+8MMLCKyrPheIcVIJBGuloUk350HzepAZTTg9GSNFTq Zxyei2vg/f2B5jkrZYoGKPbRxvuLjH35sLV4l1wOBTJafVD0HTbDUzbP6jPQfOdU+KaG 3gIkRMxtJ9DgS5SW0Kt3bWfeu+sXaplrrhq1B7dNLrmsE2meL4KQdekBOG7sWz0zAe0N lkYsG+/en/JT0DT8eW/ip3lW+AyM8tArkN2trbjAuP+aG5gijB434ikFW3ucpom8cFWS 1HC6TfIWomgbHMNnYthALNU+Ilov/TeO7Y284bIljOOAHiqgvbBCXiM9wMHeW6Wjn6nD V8ng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687839816; x=1690431816; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; 
bh=/bqEyfDb2JHBciJ5X1JfWPHzjf7SmvaClBSwV4xMi6U=; b=B2CBOc+Atiw2/VxurwafPxxYllYYzKjvOdAFqzak236O3pjzplFttDB6gBe8YzW5QF tlolO9CIhkg4WMVApAckegekFGn7LXU1Hiz6KoODWzxQob9R8hPu+9cW4B2qCK2pY0uf u5cLUz8i5x0mO+EKpmdDhmyWBlESwQoMwg5tgXb7Sb0A+V9NULIjmItqwdNB2OBDvGTO 3uNTh2Hw1ZwtS+8LxFMcASOAvtytmi4auo+032umTuuCa09wjXGMmf+Wc15nXYClktHM wVRWEcoS9tLgUnIimJQyRrs29f9lKdOpyXAGDRqgPHCw6SOUTtucmoPvjq8MWTasU89C Gbbw== X-Gm-Message-State: AC+VfDy4H5OrqfYxR13nYxKV7jpAjd0RqQy4quqg3mLWzBwuGEU8bkQD bBqMSkYXeEUHCqaF1IlJqYj7Si6WP0I= X-Google-Smtp-Source: ACHHUZ6bk8wsxPi2CO6Ry//dDv3J6w7VSpDN0W+ynAgt0XYXbCETpv309cOV7KiQRlvOMURiMuow8VTWjRU= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:201:5075:f38d:ce2f:eb1b]) (user=surenb job=sendgmr) by 2002:a25:d3c8:0:b0:bac:adb8:a605 with SMTP id e191-20020a25d3c8000000b00bacadb8a605mr6458407ybf.2.1687839815785; Mon, 26 Jun 2023 21:23:35 -0700 (PDT) Date: Mon, 26 Jun 2023 21:23:18 -0700 In-Reply-To: <20230627042321.1763765-1-surenb@google.com> Mime-Version: 1.0 References: <20230627042321.1763765-1-surenb@google.com> X-Mailer: git-send-email 2.41.0.162.gfafddb0af9-goog Message-ID: <20230627042321.1763765-6-surenb@google.com> Subject: [PATCH v3 5/8] mm: make folio_lock_fault indicate the state of mmap_lock upon return From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: willy@infradead.org, hannes@cmpxchg.org, mhocko@suse.com, josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com, jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net, punit.agrawal@bytedance.com, lstoakes@gmail.com, hdanton@sina.com, apopple@nvidia.com, peterx@redhat.com, ying.huang@intel.com, david@redhat.com, yuzhao@google.com, dhowells@redhat.com, hughd@google.com, viro@zeniv.linux.org.uk, brauner@kernel.org, pasha.tatashin@soleen.com, surenb@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com 
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

folio_lock_fault might drop mmap_lock before returning and to extend it
to work with per-VMA locks, the callers will need to know whether the
lock was dropped or is still held. Introduce new fault_flag to indicate
whether the lock got dropped and store it inside vm_fault flags.

Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm_types.h | 1 +
 mm/filemap.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 79765e3dd8f3..6f0dbef7aa1f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1169,6 +1169,7 @@ enum fault_flag {
 	FAULT_FLAG_UNSHARE = 1 << 10,
 	FAULT_FLAG_ORIG_PTE_VALID = 1 << 11,
 	FAULT_FLAG_VMA_LOCK = 1 << 12,
+	FAULT_FLAG_LOCK_DROPPED = 1 << 13,
 };
 
 typedef unsigned int __bitwise zap_flags_t;
diff --git a/mm/filemap.c b/mm/filemap.c
index 87b335a93530..8ad06d69895b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1723,6 +1723,7 @@ vm_fault_t __folio_lock_fault(struct folio *folio, struct vm_fault *vmf)
 			return VM_FAULT_RETRY;
 
 		mmap_read_unlock(mm);
+		vmf->flags |= FAULT_FLAG_LOCK_DROPPED;
 		if (vmf->flags & FAULT_FLAG_KILLABLE)
 			folio_wait_locked_killable(folio);
 		else
@@ -1735,6 +1736,7 @@ vm_fault_t __folio_lock_fault(struct folio *folio, struct vm_fault *vmf)
 		ret = __folio_lock_killable(folio);
 		if (ret) {
 			mmap_read_unlock(mm);
+			vmf->flags |= FAULT_FLAG_LOCK_DROPPED;
 			return VM_FAULT_RETRY;
 		}
 	} else {

From patchwork Tue Jun 27 04:23:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13293976
Date: Mon, 26 Jun 2023 21:23:19 -0700
In-Reply-To: <20230627042321.1763765-1-surenb@google.com>
Mime-Version: 1.0
References: <20230627042321.1763765-1-surenb@google.com>
X-Mailer: git-send-email 2.41.0.162.gfafddb0af9-goog
Message-ID: <20230627042321.1763765-7-surenb@google.com>
Subject: [PATCH v3 6/8] mm: handle swap page faults under per-VMA lock
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: willy@infradead.org, hannes@cmpxchg.org, mhocko@suse.com,
 josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
 laurent.dufour@fr.ibm.com, michel@lespinasse.org,
 liam.howlett@oracle.com, jglisse@google.com, vbabka@suse.cz,
 minchan@google.com, dave@stgolabs.net, punit.agrawal@bytedance.com,
 lstoakes@gmail.com, hdanton@sina.com, apopple@nvidia.com,
 peterx@redhat.com, ying.huang@intel.com, david@redhat.com,
 yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org,
 pasha.tatashin@soleen.com, surenb@google.com, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 kernel-team@android.com
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

When page fault is handled under per-VMA lock protection, all swap page
faults are retried with mmap_lock because folio_lock_fault (formerly
known as folio_lock_or_retry) had to drop and reacquire mmap_lock
if folio could not be immediately locked.
Follow the same pattern as mmap_lock to drop per-VMA lock when waiting
for folio in folio_lock_fault and retrying once folio is available.
With this obstacle removed, enable do_swap_page to operate under
per-VMA lock protection. Drivers implementing ops->migrate_to_ram might
still rely on mmap_lock, therefore we have to fall back to mmap_lock in
that particular case.
Note that the only time do_swap_page calls synchronous swap_readpage
is when SWP_SYNCHRONOUS_IO is set, which is only set for
QUEUE_FLAG_SYNCHRONOUS devices: brd, zram and nvdimms (both btt and
pmem). Therefore we don't sleep in this path, and there's no need to
drop the mmap or per-VMA lock.

Signed-off-by: Suren Baghdasaryan
---
 mm/filemap.c | 24 ++++++++++++++++--------
 mm/memory.c | 21 ++++++++++++++-------
 2 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 8ad06d69895b..683f11f244cd 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1703,12 +1703,14 @@ static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
  * Return values:
  * 0 - folio is locked.
  * VM_FAULT_RETRY - folio is not locked.
- * mmap_lock has been released (mmap_read_unlock(), unless flags had both
- * FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT set, in
- * which case mmap_lock is still held.
+ * FAULT_FLAG_LOCK_DROPPED bit in vmf flags will be set if mmap_lock or
+ * per-VMA lock got dropped. mmap_lock/per-VMA lock is dropped when
+ * function fails to lock the folio, unless flags had both
+ * FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT set, in which case
+ * the lock is still held.
  *
  * If neither ALLOW_RETRY nor KILLABLE are set, will always return 0
- * with the folio locked and the mmap_lock unperturbed.
+ * with the folio locked and the mmap_lock/per-VMA lock unperturbed.
  */
 vm_fault_t __folio_lock_fault(struct folio *folio, struct vm_fault *vmf)
 {
@@ -1716,13 +1718,16 @@ vm_fault_t __folio_lock_fault(struct folio *folio, struct vm_fault *vmf)
 
 	if (fault_flag_allow_retry_first(vmf->flags)) {
 		/*
-		 * CAUTION! In this case, mmap_lock is not released
-		 * even though return VM_FAULT_RETRY.
+		 * CAUTION! In this case, mmap_lock/per-VMA lock is not
+		 * released even though returning VM_FAULT_RETRY.
 		 */
 		if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
 			return VM_FAULT_RETRY;
 
-		mmap_read_unlock(mm);
+		if (vmf->flags & FAULT_FLAG_VMA_LOCK)
+			vma_end_read(vmf->vma);
+		else
+			mmap_read_unlock(mm);
 		vmf->flags |= FAULT_FLAG_LOCK_DROPPED;
 		if (vmf->flags & FAULT_FLAG_KILLABLE)
 			folio_wait_locked_killable(folio);
@@ -1735,7 +1740,10 @@ vm_fault_t __folio_lock_fault(struct folio *folio, struct vm_fault *vmf)
 
 		ret = __folio_lock_killable(folio);
 		if (ret) {
-			mmap_read_unlock(mm);
+			if (vmf->flags & FAULT_FLAG_VMA_LOCK)
+				vma_end_read(vmf->vma);
+			else
+				mmap_read_unlock(mm);
 			vmf->flags |= FAULT_FLAG_LOCK_DROPPED;
 			return VM_FAULT_RETRY;
 		}
diff --git a/mm/memory.c b/mm/memory.c
index 3c2acafcd7b6..5caaa4c66ea2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3712,11 +3712,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (!pte_unmap_same(vmf))
 		goto out;
 
-	if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
-		ret = VM_FAULT_RETRY;
-		goto out;
-	}
-
 	entry = pte_to_swp_entry(vmf->orig_pte);
 	if (unlikely(non_swap_entry(entry))) {
 		if (is_migration_entry(entry)) {
@@ -3726,6 +3721,15 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 			vmf->page = pfn_swap_entry_to_page(entry);
 			ret = remove_device_exclusive_entry(vmf);
 		} else if (is_device_private_entry(entry)) {
+			if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
+				/*
+				 * migrate_to_ram is not yet ready to operate
+				 * under VMA lock.
+				 */
+				ret |= VM_FAULT_RETRY;
+				goto out;
+			}
+
 			vmf->page = pfn_swap_entry_to_page(entry);
 			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
 					vmf->address, &vmf->ptl);
@@ -5089,9 +5093,12 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 		/*
 		 * In case of VM_FAULT_RETRY or VM_FAULT_COMPLETED we might
 		 * be still holding per-VMA lock to keep the vma stable as long
-		 * as possible. Drop it before returning.
+		 * as possible. In this situation vmf.flags has
+		 * FAULT_FLAG_VMA_LOCK set and FAULT_FLAG_LOCK_DROPPED unset.
+		 * Drop the lock before returning when this happens.
 		 */
-		if (vmf.flags & FAULT_FLAG_VMA_LOCK)
+		if ((vmf.flags & (FAULT_FLAG_VMA_LOCK | FAULT_FLAG_LOCK_DROPPED)) ==
+		    FAULT_FLAG_VMA_LOCK)
 			vma_end_read(vma);
 	}
 	return ret;

From patchwork Tue Jun 27 04:23:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13293977
Date: Mon, 26 Jun 2023 21:23:20 -0700
In-Reply-To: <20230627042321.1763765-1-surenb@google.com>
Mime-Version: 1.0
References: <20230627042321.1763765-1-surenb@google.com>
X-Mailer: git-send-email 2.41.0.162.gfafddb0af9-goog
Message-ID: <20230627042321.1763765-8-surenb@google.com>
Subject: [PATCH v3 7/8] mm: drop VMA lock before waiting for migration
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: willy@infradead.org, hannes@cmpxchg.org, mhocko@suse.com,
 josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
 laurent.dufour@fr.ibm.com, michel@lespinasse.org,
 liam.howlett@oracle.com, jglisse@google.com, vbabka@suse.cz,
 minchan@google.com, dave@stgolabs.net, punit.agrawal@bytedance.com,
 lstoakes@gmail.com, hdanton@sina.com, apopple@nvidia.com,
 peterx@redhat.com, ying.huang@intel.com, david@redhat.com,
 yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org,
 pasha.tatashin@soleen.com, surenb@google.com, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 kernel-team@android.com
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

migration_entry_wait does not need VMA lock, therefore it can be
dropped before waiting.

Signed-off-by: Suren Baghdasaryan
---
 mm/memory.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 5caaa4c66ea2..bdf46fdc58d6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3715,8 +3715,18 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	entry = pte_to_swp_entry(vmf->orig_pte);
 	if (unlikely(non_swap_entry(entry))) {
 		if (is_migration_entry(entry)) {
-			migration_entry_wait(vma->vm_mm, vmf->pmd,
-					     vmf->address);
+			/* Save mm in case VMA lock is dropped */
+			struct mm_struct *mm = vma->vm_mm;
+
+			if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
+				/*
+				 * No need to hold VMA lock for migration.
+				 * WARNING: vma can't be used after this!
+				 */
+				vma_end_read(vma);
+				ret |= VM_FAULT_COMPLETED;
+			}
+			migration_entry_wait(mm, vmf->pmd, vmf->address);
 		} else if (is_device_exclusive_entry(entry)) {
 			vmf->page = pfn_swap_entry_to_page(entry);
 			ret = remove_device_exclusive_entry(vmf);

From patchwork Tue Jun 27 04:23:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13293978
Date: Mon, 26 Jun 2023 21:23:21 -0700
In-Reply-To: <20230627042321.1763765-1-surenb@google.com>
Mime-Version: 1.0
References: <20230627042321.1763765-1-surenb@google.com>
X-Mailer: git-send-email 2.41.0.162.gfafddb0af9-goog
Message-ID: <20230627042321.1763765-9-surenb@google.com>
Subject: [PATCH v3 8/8] mm: handle userfaults under VMA lock
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: willy@infradead.org, hannes@cmpxchg.org, mhocko@suse.com,
 josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
 laurent.dufour@fr.ibm.com, michel@lespinasse.org,
 liam.howlett@oracle.com, jglisse@google.com, vbabka@suse.cz,
 minchan@google.com, dave@stgolabs.net, punit.agrawal@bytedance.com,
 lstoakes@gmail.com, hdanton@sina.com, apopple@nvidia.com,
 peterx@redhat.com, ying.huang@intel.com, david@redhat.com,
 yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org,
 pasha.tatashin@soleen.com, surenb@google.com, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 kernel-team@android.com
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Enable handle_userfault to operate under VMA lock by releasing VMA lock
instead of mmap_lock and retrying.

Signed-off-by: Suren Baghdasaryan
---
 fs/userfaultfd.c | 42 ++++++++++++++++++++++--------------------
 mm/memory.c | 9 ---------
 2 files changed, 22 insertions(+), 29 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 4e800bb7d2ab..b88632c404b6 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -277,17 +277,17 @@ static inline struct uffd_msg userfault_msg(unsigned long address,
  * hugepmd ranges.
  */
 static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
-					 struct vm_area_struct *vma,
-					 unsigned long address,
-					 unsigned long flags,
-					 unsigned long reason)
+					      struct vm_fault *vmf,
+					      unsigned long reason)
 {
+	struct vm_area_struct *vma = vmf->vma;
 	pte_t *ptep, pte;
 	bool ret = true;
 
-	mmap_assert_locked(ctx->mm);
+	if (!(vmf->flags & FAULT_FLAG_VMA_LOCK))
+		mmap_assert_locked(ctx->mm);
 
-	ptep = hugetlb_walk(vma, address, vma_mmu_pagesize(vma));
+	ptep = hugetlb_walk(vma, vmf->address, vma_mmu_pagesize(vma));
 	if (!ptep)
 		goto out;
 
@@ -308,10 +308,8 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
 }
 #else
 static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
-					 struct vm_area_struct *vma,
-					 unsigned long address,
-					 unsigned long flags,
-					 unsigned long reason)
+					      struct vm_fault *vmf,
+					      unsigned long reason)
 {
 	return false; /* should never get here */
 }
@@ -325,11 +323,11 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
  * threads.
  */
 static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
-					 unsigned long address,
-					 unsigned long flags,
+					 struct vm_fault *vmf,
 					 unsigned long reason)
 {
 	struct mm_struct *mm = ctx->mm;
+	unsigned long address = vmf->address;
 	pgd_t *pgd;
 	p4d_t *p4d;
 	pud_t *pud;
@@ -337,7 +335,8 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	pte_t *pte;
 	bool ret = true;
 
-	mmap_assert_locked(mm);
+	if (!(vmf->flags & FAULT_FLAG_VMA_LOCK))
+		mmap_assert_locked(mm);
 
 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -445,7 +444,8 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	 * Coredumping runs without mmap_lock so we can only check that
 	 * the mmap_lock is held, if PF_DUMPCORE was not set.
 	 */
-	mmap_assert_locked(mm);
+	if (!(vmf->flags & FAULT_FLAG_VMA_LOCK))
+		mmap_assert_locked(mm);
 
 	ctx = vma->vm_userfaultfd_ctx.ctx;
 	if (!ctx)
@@ -561,15 +561,17 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	spin_unlock_irq(&ctx->fault_pending_wqh.lock);
 
 	if (!is_vm_hugetlb_page(vma))
-		must_wait = userfaultfd_must_wait(ctx, vmf->address, vmf->flags,
-						  reason);
+		must_wait = userfaultfd_must_wait(ctx, vmf, reason);
 	else
-		must_wait = userfaultfd_huge_must_wait(ctx, vma,
-						       vmf->address,
-						       vmf->flags, reason);
+		must_wait = userfaultfd_huge_must_wait(ctx, vmf, reason);
 	if (is_vm_hugetlb_page(vma))
 		hugetlb_vma_unlock_read(vma);
-	mmap_read_unlock(mm);
+	if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
+		/* WARNING: VMA can't be used after this */
+		vma_end_read(vma);
+	} else
+		mmap_read_unlock(mm);
+	vmf->flags |= FAULT_FLAG_LOCK_DROPPED;
 
 	if (likely(must_wait && !READ_ONCE(ctx->released))) {
 		wake_up_poll(&ctx->fd_wqh, EPOLLIN);
diff --git a/mm/memory.c b/mm/memory.c
index bdf46fdc58d6..923c1576bd14 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5316,15 +5316,6 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 	if (!vma_start_read(vma))
 		goto inval;
 
-	/*
-	 * Due to the possibility of userfault handler dropping mmap_lock, avoid
-	 * it for now and fall back to page fault handling under mmap_lock.
-	 */
-	if (userfaultfd_armed(vma)) {
-		vma_end_read(vma);
-		goto inval;
-	}
-
 	/* Check since vm_start/vm_end might change before we lock the VMA */
 	if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
 		vma_end_read(vma);