From patchwork Wed Sep 11 17:38:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 13800937
From: Shakeel Butt
To: Andrew Morton
Cc: Matthew Wilcox, Johannes Weiner, Omar Sandoval, Chris Mason,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Meta kernel team,
 linux-fsdevel@vger.kernel.org
Subject: [PATCH 1/2] mm: optimize truncation of shadow entries
Date: Wed, 11 Sep 2024 10:38:00 -0700
Message-ID: <20240911173801.4025422-2-shakeel.butt@linux.dev>
In-Reply-To: <20240911173801.4025422-1-shakeel.butt@linux.dev>
References: <20240911173801.4025422-1-shakeel.butt@linux.dev>

The kernel truncates the page cache in batches of PAGEVEC_SIZE. For each
batch, it traverses the page cache tree and collects the entries (folio
and shadow entries) in the struct folio_batch. For the shadow entries
present in the folio_batch, it has to traverse the page cache tree for
each individual entry to remove them. This patch optimizes this by
removing them in a single tree traversal.

On large machines in our production which run workloads manipulating
large amounts of data, we have observed that a large amount of CPU time
is spent on truncation of very large files (file sizes in the 100s of
GiB).
More specifically, most of the time was spent on shadow entries cleanup,
so optimizing the shadow entries cleanup, even a little bit, has a good
impact.

To evaluate the changes, we created a 200GiB file on a fuse fs and in a
memcg. We created the shadow entries by triggering reclaim through
memory.reclaim in that specific memcg and measured the simple truncation
operation.

 # time truncate -s 0 file

              time (sec)
Without       5.164 +- 0.059
With-patch    4.21 +- 0.066 (18.47% decrease)

Signed-off-by: Shakeel Butt
Acked-by: Johannes Weiner
---
 mm/truncate.c | 50 +++++++++++++++++++++++---------------------------
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/mm/truncate.c b/mm/truncate.c
index 0668cd340a46..c7c19c816c2e 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -72,50 +72,46 @@ static void clear_shadow_entries(struct address_space *mapping,
 static void truncate_folio_batch_exceptionals(struct address_space *mapping,
 				struct folio_batch *fbatch, pgoff_t *indices)
 {
+	XA_STATE(xas, &mapping->i_pages, indices[0]);
+	int nr = folio_batch_count(fbatch);
+	struct folio *folio;
 	int i, j;
-	bool dax;
 
 	/* Handled by shmem itself */
 	if (shmem_mapping(mapping))
 		return;
 
-	for (j = 0; j < folio_batch_count(fbatch); j++)
+	for (j = 0; j < nr; j++)
 		if (xa_is_value(fbatch->folios[j]))
 			break;
 
-	if (j == folio_batch_count(fbatch))
+	if (j == nr)
 		return;
 
-	dax = dax_mapping(mapping);
-	if (!dax) {
-		spin_lock(&mapping->host->i_lock);
-		xa_lock_irq(&mapping->i_pages);
+	if (dax_mapping(mapping)) {
+		for (i = j; i < nr; i++) {
+			if (xa_is_value(fbatch->folios[i]))
+				dax_delete_mapping_entry(mapping, indices[i]);
+		}
+		goto out;
 	}
 
-	for (i = j; i < folio_batch_count(fbatch); i++) {
-		struct folio *folio = fbatch->folios[i];
-		pgoff_t index = indices[i];
-
-		if (!xa_is_value(folio)) {
-			fbatch->folios[j++] = folio;
-			continue;
-		}
+	xas_set_update(&xas, workingset_update_node);
 
-		if (unlikely(dax)) {
-			dax_delete_mapping_entry(mapping, index);
-			continue;
-		}
+	spin_lock(&mapping->host->i_lock);
+	xas_lock_irq(&xas);
 
-		__clear_shadow_entry(mapping, index, folio);
+	xas_for_each(&xas, folio, indices[nr-1]) {
+		if (xa_is_value(folio))
+			xas_store(&xas, NULL);
 	}
 
-	if (!dax) {
-		xa_unlock_irq(&mapping->i_pages);
-		if (mapping_shrinkable(mapping))
-			inode_add_lru(mapping->host);
-		spin_unlock(&mapping->host->i_lock);
-	}
-	fbatch->nr = j;
+	xas_unlock_irq(&xas);
+	if (mapping_shrinkable(mapping))
+		inode_add_lru(mapping->host);
+	spin_unlock(&mapping->host->i_lock);
+out:
+	folio_batch_remove_exceptionals(fbatch);
 }
 
 /**
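
Illustration (not part of the patch): the difference between removing
shadow entries one lookup at a time and removing them in a single range
walk can be sketched in userspace C. This is a toy model only, not the
XArray. The two-level layout, the SLOTS, LEAVES and BATCH sizes, and
every function name below are assumptions made up for the example; the
only idea borrowed from the patch is "descend once, then step through
the range up to indices[nr-1]".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOTS  64	/* slots per leaf node (hypothetical) */
#define LEAVES 4	/* leaves in this toy tree (hypothetical) */
#define BATCH  31	/* entries per batch (hypothetical) */

struct leaf { void *slot[SLOTS]; };
struct tree {
	struct leaf *leaf[LEAVES];
	unsigned long descents;		/* root-to-leaf walks performed */
};

/* Shadow entries are tagged values (low bit set); folios are real pointers. */
static void *mk_value(uintptr_t v) { return (void *)((v << 1) | 1); }
static int is_value(const void *p) { return ((uintptr_t)p & 1) != 0; }

/* Model one root-to-leaf descent and return the slot for @index. */
static void **descend(struct tree *t, unsigned long index)
{
	t->descents++;
	return &t->leaf[index / SLOTS]->slot[index % SLOTS];
}

/* Per-entry shape: one descent for every shadow entry in the batch. */
static void clear_each(struct tree *t, void **batch,
		       const unsigned long *indices, int nr)
{
	for (int i = 0; i < nr; i++)
		if (is_value(batch[i]))
			*descend(t, indices[i]) = NULL;
}

/* Single-walk shape: descend once, then step through neighbouring slots. */
static void clear_range(struct tree *t, const unsigned long *indices, int nr)
{
	unsigned long index = indices[0];
	void **slot = descend(t, index);

	for (;;) {
		if (is_value(*slot))
			*slot = NULL;
		if (index == indices[nr - 1])
			break;
		index++;
		/* Re-descend only when the walk crosses into the next leaf. */
		slot = (index % SLOTS) ? slot + 1 : descend(t, index);
	}
}

/* Fill a batch, clear it with one strategy, and return the descent count. */
static unsigned long demo_descents(int single_walk)
{
	static struct leaf leaves[LEAVES];
	static int fake_folio[BATCH];	/* stand-ins for real folio pointers */
	struct tree t = { .descents = 0 };
	unsigned long indices[BATCH];
	void *batch[BATCH];

	memset(leaves, 0, sizeof(leaves));
	for (int i = 0; i < LEAVES; i++)
		t.leaf[i] = &leaves[i];

	for (int i = 0; i < BATCH; i++) {
		indices[i] = 50 + i;	/* 50..80 crosses the leaf boundary at 64 */
		batch[i] = (i % 4 == 3) ? (void *)&fake_folio[i] : mk_value(i);
		*descend(&t, indices[i]) = batch[i];
	}
	t.descents = 0;		/* count only the clearing phase */

	if (single_walk)
		clear_range(&t, indices, BATCH);
	else
		clear_each(&t, batch, indices, BATCH);

	for (int i = 0; i < BATCH; i++) {
		void *entry = leaves[indices[i] / SLOTS].slot[indices[i] % SLOTS];
		/* Shadow entries are gone; folio stand-ins are untouched. */
		assert(is_value(batch[i]) ? entry == NULL : entry == batch[i]);
	}
	return t.descents;
}
```

With these made-up sizes, demo_descents(0) performs 24 descents (one
per shadow entry) and demo_descents(1) performs 2 (the initial descent
plus one at the leaf boundary). The real saving depends on tree depth
and on how the entries are spread across nodes, so the toy numbers say
nothing about the measured timings above.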