From patchwork Mon Jan 31 16:29:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12730863 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C495C433F5 for ; Mon, 31 Jan 2022 16:32:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AF31C8D0006; Mon, 31 Jan 2022 11:32:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A78C78D0001; Mon, 31 Jan 2022 11:32:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8F29A8D0006; Mon, 31 Jan 2022 11:32:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0184.hostedemail.com [216.40.44.184]) by kanga.kvack.org (Postfix) with ESMTP id 7DC498D0001 for ; Mon, 31 Jan 2022 11:32:19 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 3BEB3181E7264 for ; Mon, 31 Jan 2022 16:32:19 +0000 (UTC) X-FDA: 79091124756.12.E800521 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf22.hostedemail.com (Postfix) with ESMTP id B8B95C000B for ; Mon, 31 Jan 2022 16:32:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1643646738; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hGw4fY8fS9EqJwPyYv8TPZpwFN0h2hNHz6qrwbdGAa0=; b=JZGkgZ7grqSg8b3lJoq6pd6epcC6kJBZ+nEKkz/LdSc1/7FC10iQIb66Yqz3p0Qll7ru3c cj+dR5GRk/v03EjDECDiEWik6/O6QxgBzbS0fz2lXoLn8qwLXrTUjEeU6dAkEaW2KyrAOa X0eHyY/NSNKsYVy3a5KuBgQwM9y3aAY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-460-cr6LuM7uNkCo8D2BcGhQqA-1; Mon, 31 Jan 2022 11:32:14 -0500 X-MC-Unique: cr6LuM7uNkCo8D2BcGhQqA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id AE87C84DA70; Mon, 31 Jan 2022 16:32:11 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.115]) by smtp.corp.redhat.com (Postfix) with ESMTP id 28A2E798D1; Mon, 31 Jan 2022 16:31:06 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , linux-mm@kvack.org, David Hildenbrand , Nadav Amit Subject: [PATCH v3 1/9] mm: optimize do_wp_page() for exclusive pages in the swapcache Date: Mon, 31 Jan 2022 17:29:31 +0100 Message-Id: <20220131162940.210846-2-david@redhat.com> In-Reply-To: <20220131162940.210846-1-david@redhat.com> References: <20220131162940.210846-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Stat-Signature: rn8oo9kzcnfmjihgp6odth39pkyyey4g X-Rspam-User: nil Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=JZGkgZ7g; spf=none (imf22.hostedemail.com: domain of david@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: B8B95C000B X-HE-Tag: 1643646738-959500 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Liang Zhang reported [1] that the current COW logic in do_wp_page() is sub-optimal when it comes to swap+read fault+write fault of anonymous pages that have a single user, visible via a performance degradation in the redis benchmark. Something similar was previously reported [2] by Nadav with a simple reproducer. After we put an anon page into the swapcache and unmapped it from a single process, that process might read that page again and refault it read-only. If that process then writes to that page, the process is actually the exclusive user of the page, however, the COW logic in do_co_page() won't be able to reuse it due to the additional reference from the swapcache. Let's optimize for pages that have been added to the swapcache but only have an exclusive user. Try removing the swapcache reference if there is hope that we're the exclusive user. We will fail removing the swapcache reference in two scenarios: (1) There are additional swap entries referencing the page: copying instead of reusing is the right thing to do. (2) The page is under writeback: theoretically we might be able to reuse in some cases, however, we cannot remove the additional reference and will have to copy. Note that we'll only try removing the page from the swapcache when it's highly likely that we'll be the exclusive owner after removing the page from the swapache. As we're about to map that page writable and redirty it, that should not affect reclaim but is rather the right thing to do. Further, we might have additional references from the LRU pagevecs, which will force us to copy instead of being able to reuse. We'll try handling such references for some scenarios next. Concurrent writeback cannot be handled easily and we'll always have to copy. While at it, remove the superfluous page_mapcount() check: it's implicitly covered by the page_count() for ordinary anon pages. [1] https://lkml.kernel.org/r/20220113140318.11117-1-zhangliang5@huawei.com [2] https://lkml.kernel.org/r/0480D692-D9B2-429A-9A88-9BBA1331AC3A@gmail.com Reported-by: Liang Zhang Reported-by: Nadav Amit Reviewed-by: Matthew Wilcox (Oracle) Acked-by: Vlastimil Babka Signed-off-by: David Hildenbrand --- mm/memory.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index c125c4969913..bcd3b7c50891 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3291,19 +3291,27 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) if (PageAnon(vmf->page)) { struct page *page = vmf->page; - /* PageKsm() doesn't necessarily raise the page refcount */ - if (PageKsm(page) || page_count(page) != 1) + /* + * We have to verify under page lock: these early checks are + * just an optimization to avoid locking the page and freeing + * the swapcache if there is little hope that we can reuse. + * + * PageKsm() doesn't necessarily raise the page refcount. + */ + if (PageKsm(page) || page_count(page) > 1 + PageSwapCache(page)) goto copy; if (!trylock_page(page)) goto copy; - if (PageKsm(page) || page_mapcount(page) != 1 || page_count(page) != 1) { + if (PageSwapCache(page)) + try_to_free_swap(page); + if (PageKsm(page) || page_count(page) != 1) { unlock_page(page); goto copy; } /* - * Ok, we've got the only map reference, and the only - * page count reference, and the page is locked, - * it's dark out, and we're wearing sunglasses. Hit it. + * Ok, we've got the only page reference from our mapping + * and the page is locked, it's dark out, and we're wearing + * sunglasses. Hit it. */ unlock_page(page); wp_page_reuse(vmf);