From patchwork Mon Feb 26 09:49:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571746 Received: from mout-p-103.mailbox.org (mout-p-103.mailbox.org [80.241.56.161]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 90461634F3; Mon, 26 Feb 2024 09:49:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.161 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708940998; cv=none; b=bQVawrXTRA/c8vZVw/mvd9435sT4tJSeEC+Pfh6ozgVg8IgzXhn297nLEJcDh4upLhTBZF1UHzdVuVRW0daaEtcRxlXGusPfjNYUW/Z1ca5efWz4jsWMuqvjzNVLkz4Fr8JIawfon36E4CRqp8GhoRuHyNZalEtsVMPypkUfdQs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708940998; c=relaxed/simple; bh=BY0+lJrnKbexBUzETPtFYZ+OJuge9ZodiUfVOjc2hK8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RyISfggWyD+U1hNDMvF8KZK4r82AmGAAAHoZglpA6OwnAXsgvyVxP3xd42fZ89fhUuBg2R3C3VN9WznUMWfipqodrssO5mL2pkRGDfwH0w0ffxd027lkdxV2Je32SoOjEtqqA7CSMUBJT1MTLuLrRjuoKnJS3RXLLLnOcRyc9iU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com; spf=pass smtp.mailfrom=pankajraghav.com; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b=fFQkG18o; arc=none smtp.client-ip=80.241.56.161 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b="fFQkG18o" Received: from smtp202.mailbox.org (smtp202.mailbox.org [10.196.197.202]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-103.mailbox.org (Postfix) with ESMTPS id 4Tjwnk5kDgz9sT6; Mon, 26 Feb 2024 10:49:46 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1708940986; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=O82T73AIK6qGhlYorGzOupTzei7wEk6+FPA8o3URJUE=; b=fFQkG18oqQYxyzAvUU/y6Xu/XNWxZbMy3Hto7NjxLMdpH4n7PnYGdCLyHr4oXKUWbXSdbt 3kIMgsOe+HNJi2CzUuOzIN8/o3Hwj48BF9JSpk1ELMgKLrxDpkLn7dQHexpKbAy6k3on3h tAY9LeXQaVX3Z0icnnr5KaXGnxzwdyT+lKKy4CYh251ECjLnS+cXrDxGEVm36OGNJwQwRd 47PB4v+rzOClutu21cDUGzMEc1J6st9TU1ML2CZo9yDdBjMu2qdfyN76b5tzCIv54RiQR8 MMZn9aLw+WeROCUn9RtgqMYpFq+RNdlieT4wk3ZhHKWPj+rdmiFDG5WMdnHFZg== From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org Subject: [PATCH 01/13] mm: Support order-1 folios in the page cache Date: Mon, 26 Feb 2024 10:49:24 +0100 Message-ID: <20240226094936.2677493-2-kernel@pankajraghav.com> In-Reply-To: <20240226094936.2677493-1-kernel@pankajraghav.com> References: 
<20240226094936.2677493-1-kernel@pankajraghav.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: "Matthew Wilcox (Oracle)" Folios of order 1 have no space to store the deferred list. This is not a problem for the page cache as file-backed folios are never placed on the deferred list. All we need to do is prevent the core MM from touching the deferred list for order 1 folios and remove the code which prevented us from allocating order 1 folios. Link: https://lore.kernel.org/linux-mm/90344ea7-4eec-47ee-5996-0c22f42d6a6a@google.com/ Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/huge_mm.h | 7 +++++-- mm/filemap.c | 2 -- mm/huge_memory.c | 23 ++++++++++++++++++----- mm/internal.h | 4 +--- mm/readahead.c | 3 --- 5 files changed, 24 insertions(+), 15 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 5adb86af35fc..916a2a539517 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -263,7 +263,7 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma, unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags); -void folio_prep_large_rmappable(struct folio *folio); +struct folio *folio_prep_large_rmappable(struct folio *folio); bool can_split_folio(struct folio *folio, int *pextra_pins); int split_huge_page_to_list(struct page *page, struct list_head *list); static inline int split_huge_page(struct page *page) @@ -410,7 +410,10 @@ static inline unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma, return 0; } -static inline void folio_prep_large_rmappable(struct folio *folio) {} +static inline struct folio *folio_prep_large_rmappable(struct folio *folio) +{ + return folio; +} #define transparent_hugepage_flags 0UL diff --git a/mm/filemap.c b/mm/filemap.c index 750e779c23db..2b00442b9d19 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1912,8 +1912,6 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index, gfp_t alloc_gfp = gfp; err = -ENOMEM; - if (order == 1) - order = 0; if (order > 0) alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN; folio = filemap_alloc_folio(alloc_gfp, order); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 94c958f7ebb5..81fd1ba57088 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -788,11 +788,15 @@ struct deferred_split *get_deferred_split_queue(struct folio *folio) } #endif -void folio_prep_large_rmappable(struct folio *folio) +struct folio *folio_prep_large_rmappable(struct folio *folio) { - VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio); - INIT_LIST_HEAD(&folio->_deferred_list); + if (!folio || !folio_test_large(folio)) + return folio; + if (folio_order(folio) > 1) + INIT_LIST_HEAD(&folio->_deferred_list); folio_set_large_rmappable(folio); + + return folio; } static inline bool is_transparent_hugepage(struct folio *folio) @@ -3082,7 +3086,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) /* Prevent deferred_split_scan() touching ->_refcount */ spin_lock(&ds_queue->split_queue_lock); if (folio_ref_freeze(folio, 1 + extra_pins)) { - if (!list_empty(&folio->_deferred_list)) { + if (folio_order(folio) > 1 && + !list_empty(&folio->_deferred_list)) { ds_queue->split_queue_len--; list_del(&folio->_deferred_list); } @@ -3133,6 +3138,9 @@ void folio_undo_large_rmappable(struct folio *folio) struct deferred_split *ds_queue; unsigned long flags; + if (folio_order(folio) <= 1) + 
return; + /* * At this point, there is no one trying to add the folio to * deferred_list. If folio is not in deferred_list, it's safe @@ -3158,7 +3166,12 @@ void deferred_split_folio(struct folio *folio) #endif unsigned long flags; - VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio); + /* + * Order 1 folios have no space for a deferred list, but we also + * won't waste much memory by not adding them to the deferred list. + */ + if (folio_order(folio) <= 1) + return; /* * The try_to_unmap() in page reclaim path might reach here too, diff --git a/mm/internal.h b/mm/internal.h index f309a010d50f..5174b5b0c344 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -419,9 +419,7 @@ static inline struct folio *page_rmappable_folio(struct page *page) { struct folio *folio = (struct folio *)page; - if (folio && folio_order(folio) > 1) - folio_prep_large_rmappable(folio); - return folio; + return folio_prep_large_rmappable(folio); } static inline void prep_compound_head(struct page *page, unsigned int order) diff --git a/mm/readahead.c b/mm/readahead.c index 2648ec4f0494..369c70e2be42 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -516,9 +516,6 @@ void page_cache_ra_order(struct readahead_control *ractl, /* Don't allocate pages past EOF */ while (index + (1UL << order) - 1 > limit) order--; - /* THP machinery does not support order-1 */ - if (order == 1) - order = 0; err = ra_alloc_folio(ractl, index, mark, order, gfp); if (err) break; From patchwork Mon Feb 26 09:49:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571745 Received: from mout-p-101.mailbox.org (mout-p-101.mailbox.org [80.241.56.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5BAD96341C; Mon, 26 Feb 2024 09:49:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.151 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708940997; cv=none; b=YWMaEsndge+gQjfTAx+qJAeGczMwwP6dugha2kJ6V0uzivUJyl4xMNUAP87Y9tCtqXPsPRaLEazPSX9AnU7/lymUqk2J2pMWze0PYPm28+GT8puYgNmhzrzcBNTsOHEC/cVLIBY7wo9PwQZDfU7AuTtk6gQsiZS7AHxjbh87M3g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708940997; c=relaxed/simple; bh=FOON1o0mymLuX2IKmggEuVP3QW7GlR7Lcz6UZhHnvAM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=eah+IR0oAAR8V+YxG8akeT/dwen4/WaXDvgfLPAnbrTFDEM5sARZDfv1Dpj+UBOE5ZLUC5lnMHWZyJjAPqkRTIUOxNLJMmCCXlUMh7oBz6vFgKM14yi/Xb0izO0hliOvU/A+jA1qwP6yfh9MKn5w8kcEaKXv57ks5dRQ9z3XsEI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com; spf=pass smtp.mailfrom=pankajraghav.com; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b=eHLA+ySb; arc=none smtp.client-ip=80.241.56.151 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b="eHLA+ySb" Received: from smtp102.mailbox.org (smtp102.mailbox.org [IPv6:2001:67c:2050:b231:465::102]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 
bits) server-digest SHA256) (No client certificate requested) by mout-p-101.mailbox.org (Postfix) with ESMTPS id 4Tjwnp4YR9z9sT0; Mon, 26 Feb 2024 10:49:50 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1708940990; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1cELEozXOYCd1Af2ZXfddDuv3Odm1CWuSBqyYDa15DQ=; b=eHLA+ySb1gvVydfMakrzGixCW9UqE/O4JF8zTriGRt4EDHU0TwUfMWELwyC5HYIs9xEiwj PSpYl8DwDjWmjrhrpuSj6HhJyy76xEQLrXC5Rsk0BbNYX+RmkQI9K44Olivj0jne+3sQac 4nuD6fjSNjzNBIVfAOfRRhfuF9BFJzUIfwivK9VD2XPDve+6gVvsj+1n1e4WQHZZQs1W1l COmR49AyYJbAs9aj9ATG4yeZ2R7yePNoEOh43ZOKUc+W6jmytZn1KldvUs2mLDOOS/9DzP DTgbs5SMLFY/quRvVtUH8K1aeNtyJEUaGgmz646rVVQ6nQfrat517xMfy1yhAQ== From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 02/13] fs: Allow fine-grained control of folio sizes Date: Mon, 26 Feb 2024 10:49:25 +0100 Message-ID: <20240226094936.2677493-3-kernel@pankajraghav.com> In-Reply-To: <20240226094936.2677493-1-kernel@pankajraghav.com> References: <20240226094936.2677493-1-kernel@pankajraghav.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 4Tjwnp4YR9z9sT0 From: "Matthew Wilcox (Oracle)" Some filesystems want to be able to ensure that folios that are added to the page cache are at least a certain size. Add mapping_set_folio_min_order() to allow this level of control. Signed-off-by: Matthew Wilcox (Oracle) Co-developed-by: Pankaj Raghav Signed-off-by: Pankaj Raghav Signed-off-by: Luis Chamberlain --- include/linux/pagemap.h | 100 ++++++++++++++++++++++++++++++++-------- 1 file changed, 80 insertions(+), 20 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 2df35e65557d..fc8eb9c94e9c 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -202,13 +202,18 @@ enum mapping_flags { AS_EXITING = 4, /* final truncate in progress */ /* writeback related tags are not used */ AS_NO_WRITEBACK_TAGS = 5, - AS_LARGE_FOLIO_SUPPORT = 6, - AS_RELEASE_ALWAYS, /* Call ->release_folio(), even if no private data */ - AS_STABLE_WRITES, /* must wait for writeback before modifying + AS_RELEASE_ALWAYS = 6, /* Call ->release_folio(), even if no private data */ + AS_STABLE_WRITES = 7, /* must wait for writeback before modifying folio contents */ - AS_UNMOVABLE, /* The mapping cannot be moved, ever */ + AS_FOLIO_ORDER_MIN = 8, + AS_FOLIO_ORDER_MAX = 13, /* Bit 8-17 are used for FOLIO_ORDER */ + AS_UNMOVABLE = 18, /* The mapping cannot be moved, ever */ }; +#define AS_FOLIO_ORDER_MIN_MASK 0x00001f00 +#define AS_FOLIO_ORDER_MAX_MASK 0x0003e000 +#define AS_FOLIO_ORDER_MASK (AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK) + /** * mapping_set_error - record a writeback error in the address_space * @mapping: the mapping in which an error should be set @@ -344,9 +349,47 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask) m->gfp_mask = mask; } +/* + * There are some parts of the kernel which assume that PMD entries + * are exactly HPAGE_PMD_ORDER. 
Those should be fixed, but until then, + * limit the maximum allocation order to PMD size. I'm not aware of any + * assumptions about maximum order if THP are disabled, but 8 seems like + * a good order (that's 1MB if you're using 4kB pages) + */ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +#define MAX_PAGECACHE_ORDER HPAGE_PMD_ORDER +#else +#define MAX_PAGECACHE_ORDER 8 +#endif + +/* + * mapping_set_folio_min_order() - Set the minimum folio order + * @mapping: The address_space. + * @min: Minimum folio order (between 0-MAX_PAGECACHE_ORDER inclusive). + * + * The filesystem should call this function in its inode constructor to + * indicate which base size of folio the VFS can use to cache the contents + * of the file. This should only be used if the filesystem needs special + * handling of folio sizes (ie there is something the core cannot know). + * Do not tune it based on, eg, i_size. + * + * Context: This should not be called while the inode is active as it + * is non-atomic. + */ +static inline void mapping_set_folio_min_order(struct address_space *mapping, + unsigned int min) +{ + if (min > MAX_PAGECACHE_ORDER) + min = MAX_PAGECACHE_ORDER; + + mapping->flags = (mapping->flags & ~AS_FOLIO_ORDER_MASK) | + (min << AS_FOLIO_ORDER_MIN) | + (MAX_PAGECACHE_ORDER << AS_FOLIO_ORDER_MAX); +} + /** * mapping_set_large_folios() - Indicate the file supports large folios. - * @mapping: The file. + * @mapping: The address_space. * * The filesystem should call this function in its inode constructor to * indicate that the VFS can use large folios to cache the contents of @@ -357,7 +400,37 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask) */ static inline void mapping_set_large_folios(struct address_space *mapping) { - __set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags); + mapping_set_folio_min_order(mapping, 0); +} + +static inline unsigned int mapping_max_folio_order(struct address_space *mapping) +{ + return (mapping->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX; +} + +static inline unsigned int mapping_min_folio_order(struct address_space *mapping) +{ + return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN; +} + +static inline unsigned long mapping_min_folio_nrpages(struct address_space *mapping) +{ + return 1UL << mapping_min_folio_order(mapping); +} + +/** + * mapping_align_start_index() - Align starting index based on the min + * folio order of the page cache. + * @mapping: The address_space. + * + * Ensure the index used is aligned to the minimum folio order when adding + * new folios to the page cache by rounding down to the nearest minimum + * folio number of pages. + */ +static inline pgoff_t mapping_align_start_index(struct address_space *mapping, + pgoff_t index) +{ + return round_down(index, mapping_min_folio_nrpages(mapping)); } /* @@ -367,7 +440,7 @@ static inline void mapping_set_large_folios(struct address_space *mapping) static inline bool mapping_large_folio_support(struct address_space *mapping) { return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && - test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags); + (mapping_max_folio_order(mapping) > 0); } static inline int filemap_nr_thps(struct address_space *mapping) @@ -528,19 +601,6 @@ static inline void *detach_page_private(struct page *page) return folio_detach_private(page_folio(page)); } -/* - * There are some parts of the kernel which assume that PMD entries - * are exactly HPAGE_PMD_ORDER. Those should be fixed, but until then, - * limit the maximum allocation order to PMD size. 
I'm not aware of any - * assumptions about maximum order if THP are disabled, but 8 seems like - * a good order (that's 1MB if you're using 4kB pages) - */ -#ifdef CONFIG_TRANSPARENT_HUGEPAGE -#define MAX_PAGECACHE_ORDER HPAGE_PMD_ORDER -#else -#define MAX_PAGECACHE_ORDER 8 -#endif - #ifdef CONFIG_NUMA struct folio *filemap_alloc_folio(gfp_t gfp, unsigned int order); #else From patchwork Mon Feb 26 09:49:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571747 Received: from mout-p-101.mailbox.org (mout-p-101.mailbox.org [80.241.56.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B427063517; Mon, 26 Feb 2024 09:49:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.151 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708940999; cv=none; b=W6PbpF4Q9NToLleOraylGa617nRdcwL7bbsphggN8Xgu9S8pyz5f9RfHR1BwvKfgAkpATZXeKjkXo5vEoNDVQAuXJLvqwDDP1/GHy21KNweO27ZzCKLIyhp7TIgshuZeIxZnw74aMGLcFWgpMvajC91qJlhMKdy+0ebcTt1IBFo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708940999; c=relaxed/simple; bh=jlHABLKNapSS9lIteDzKyffRmTxjtJQluocjrZHWOJc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JyNCLFoMa84qcF9lvtzl//2FYFKu1pTno/zTGyvXqfh5yFAxu/mlS0EOz4OQTtMS0mKYSjA/4nML2IJGTzQcUQFVs2Pue7l9CLdEKBLr4whGiYtEtIcuKuD+zdrLnd/rleLYdc1ggNVANQzyYjewbt9vfa6KAJNzryG9pgM8Zys= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com; spf=pass smtp.mailfrom=pankajraghav.com; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b=bRv8Wx5c; arc=none smtp.client-ip=80.241.56.151 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b="bRv8Wx5c" Received: from smtp202.mailbox.org (smtp202.mailbox.org [IPv6:2001:67c:2050:b231:465::202]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-101.mailbox.org (Postfix) with ESMTPS id 4Tjwnt1tBzz9sTM; Mon, 26 Feb 2024 10:49:54 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1708940994; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UE/+ngzUW37ueqbuDmtoyKLgw47SUrAWwA7DBQ7fmXA=; b=bRv8Wx5cvk1xd9wVeRlOtaJZztEuuuk3hMBOtenVjKOEhVioQUed98THwPiUUu1f9r9DH7 wVd6ZXgT96T6EnmbVf914Uc1jQ24dL54EHczA3owM2IwVEsCOd+na63QUB+HW4d9qUx1k4 bp9FGTWx3r3Ga4N411mN/gefo51B/cN3YxgI8XUjBXEjt1A0bW5GTLmIo6bAvCkwJmSYZr yCwDty8PFIfD+eTzomhQEytKQGuLbLueIt3c2sw5+totLREcGF7IRnsOPc6FjNyCRSOSsq lnNJkHsVkyCXiILF76ixMcjR7kuA4Zp24E1MDjwHFSTRTXs+gDEU8TW1r1YJXQ== From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, 
akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 03/13] filemap: align the index to mapping_min_order in the page cache Date: Mon, 26 Feb 2024 10:49:26 +0100 Message-ID: <20240226094936.2677493-4-kernel@pankajraghav.com> In-Reply-To: <20240226094936.2677493-1-kernel@pankajraghav.com> References: <20240226094936.2677493-1-kernel@pankajraghav.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 4Tjwnt1tBzz9sTM From: Luis Chamberlain Supporting mapping_min_order implies that we guarantee each folio in the page cache has at least an order of mapping_min_order. So when adding new folios to the page cache we must ensure the index used is aligned to the mapping_min_order as the page cache requires the index to be aligned to the order of the folio. A higher order folio than min_order by definition is a multiple of the min_order. If an index is aligned to an order higher than a min_order, it will also be aligned to the min order. This effectively introduces no new functional changes when min order is not set other than a few rounding computations that should result in the same value. Signed-off-by: Luis Chamberlain Signed-off-by: Pankaj Raghav --- include/linux/pagemap.h | 8 ++++++++ mm/filemap.c | 22 +++++++++++++--------- 2 files changed, 21 insertions(+), 9 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index fc8eb9c94e9c..fe8e1fbb667d 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -1328,6 +1328,14 @@ struct readahead_control { ._index = i, \ } +#define DEFINE_READAHEAD_ALIGNED(ractl, f, r, m, i) \ + struct readahead_control ractl = { \ + .file = f, \ + .mapping = m, \ + .ra = r, \ + ._index = mapping_align_start_index(m, i), \ + } + #define VM_READAHEAD_PAGES (SZ_128K / PAGE_SIZE) void page_cache_ra_unbounded(struct readahead_control *, diff --git a/mm/filemap.c b/mm/filemap.c index 2b00442b9d19..bdf4f65f597c 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2478,11 +2478,11 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count, struct file *filp = iocb->ki_filp; struct address_space *mapping = filp->f_mapping; struct file_ra_state *ra = &filp->f_ra; - pgoff_t index = iocb->ki_pos >> PAGE_SHIFT; - pgoff_t last_index; + pgoff_t index, last_index; struct folio *folio; int err = 0; + index = mapping_align_start_index(mapping, iocb->ki_pos >> PAGE_SHIFT); /* "last_index" is the index of the page beyond the end of the read */ last_index = DIV_ROUND_UP(iocb->ki_pos + count, PAGE_SIZE); retry: @@ -2500,8 +2500,7 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count, if (!folio_batch_count(fbatch)) { if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ)) return -EAGAIN; - err = filemap_create_folio(filp, mapping, - iocb->ki_pos >> PAGE_SHIFT, fbatch); + err = filemap_create_folio(filp, mapping, index, fbatch); if (err == AOP_TRUNCATED_PAGE) goto retry; return err; @@ -3093,7 +3092,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf) struct file *file = vmf->vma->vm_file; struct file_ra_state *ra = &file->f_ra; struct address_space *mapping = file->f_mapping; - DEFINE_READAHEAD(ractl, file, ra, mapping, vmf->pgoff); + DEFINE_READAHEAD_ALIGNED(ractl, file, ra, mapping, vmf->pgoff); struct file *fpin = NULL; unsigned long vm_flags = vmf->vma->vm_flags; unsigned int mmap_miss; @@ -3147,7 
+3146,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf) ra->start = max_t(long, 0, vmf->pgoff - ra->ra_pages / 2); ra->size = ra->ra_pages; ra->async_size = ra->ra_pages / 4; - ractl._index = ra->start; + ractl._index = mapping_align_start_index(mapping, ra->start); page_cache_ra_order(&ractl, ra, 0); return fpin; } @@ -3162,7 +3161,7 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf, { struct file *file = vmf->vma->vm_file; struct file_ra_state *ra = &file->f_ra; - DEFINE_READAHEAD(ractl, file, ra, file->f_mapping, vmf->pgoff); + DEFINE_READAHEAD_ALIGNED(ractl, file, ra, file->f_mapping, vmf->pgoff); struct file *fpin = NULL; unsigned int mmap_miss; @@ -3211,11 +3210,12 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) struct file *fpin = NULL; struct address_space *mapping = file->f_mapping; struct inode *inode = mapping->host; - pgoff_t max_idx, index = vmf->pgoff; + pgoff_t max_idx, index; struct folio *folio; vm_fault_t ret = 0; bool mapping_locked = false; + index = mapping_align_start_index(mapping, vmf->pgoff); max_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); if (unlikely(index >= max_idx)) return VM_FAULT_SIGBUS; @@ -3321,7 +3321,10 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) return VM_FAULT_SIGBUS; } - vmf->page = folio_file_page(folio, index); + VM_BUG_ON_FOLIO(folio_order(folio) < mapping_min_folio_order(mapping), + folio); + + vmf->page = folio_file_page(folio, vmf->pgoff); return ret | VM_FAULT_LOCKED; page_not_uptodate: @@ -3657,6 +3660,7 @@ static struct folio *do_read_cache_folio(struct address_space *mapping, struct folio *folio; int err; + index = mapping_align_start_index(mapping, index); if (!filler) filler = mapping->a_ops->read_folio; repeat: From patchwork Mon Feb 26 09:49:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571748 Received: from mout-p-101.mailbox.org (mout-p-101.mailbox.org [80.241.56.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8B93065BD6; Mon, 26 Feb 2024 09:50:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.151 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941003; cv=none; b=UoH7m6rIihYMropx1FClloLL4Nx7rf3NALAa1Rw+/BAP0Ypwng2uptMjlUez9Cp5FcEl2BtTk/ZjD9OaU8v3n1PuJbWy0j1g9Rf+9+QeiYC0n/ygEhuReLSkcCB7A/pBEZe83nC4VhLC9wIoMq0pwk/8C6RFYEbT/aYnN0LiCfA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941003; c=relaxed/simple; bh=B8LAjeHNvIxjm7OqhHXSoxFDHydjeXl3PmY9Dn+JxsE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=cCqtdarlm5M+mocXYz6pkCGgopws+0eJSK8oXGpl8s+LXuL98xoOwGE+YXeUEvLRUgJPTIGz4YL8ECiI/tjzZYxWjGGaDmGJkbIaO4JE+59I0bTwOHre3+2V0t68BUO00m/FPcYAW79ZTqT/2B05y/tzu5fcKE9VSWlKCtWVR9c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com; spf=pass smtp.mailfrom=pankajraghav.com; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b=k+/Se9RL; arc=none smtp.client-ip=80.241.56.151 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass 
(2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b="k+/Se9RL" Received: from smtp2.mailbox.org (smtp2.mailbox.org [IPv6:2001:67c:2050:b231:465::2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-101.mailbox.org (Postfix) with ESMTPS id 4Tjwny230Zz9sd7; Mon, 26 Feb 2024 10:49:58 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1708940998; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dxmi9nVYwim1cNNfe2Xlh2MStfm/gxTJRd9QPZsgFwc=; b=k+/Se9RLgvLTB8sb1uVay3Y3/zq/KKPrHWmX8ufvZwefrZwIlUZfRJ0xhs2kSLt6KkAjZ0 9MhFKa3OZzoGhJhjkXWfmOyOmF6woVxT8c4RkkyYzVoI+NsAWx9L+rFJyTV/+HVlrL7F7J Syh0kvGX8dYH758lZpESl95+fW81cGnQCMLeqVuZ5ygto6rxyq+UTXhfeGE8R8KTkQyjIg mJAkC9sTnrgntU6ZIPmJk5y6f+LDnq6nvSS1raDz9Jbn5PyL/vN7dzypafSByUv5sFm/tl 4mvQEppUs15HBHAzDg/EKxrm8u1fJ2KPfgzmxH28w64LHM1jHLhVdQWK6O39yg== From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 04/13] filemap: use mapping_min_order while allocating folios Date: Mon, 26 Feb 2024 10:49:27 +0100 Message-ID: <20240226094936.2677493-5-kernel@pankajraghav.com> In-Reply-To: <20240226094936.2677493-1-kernel@pankajraghav.com> References: <20240226094936.2677493-1-kernel@pankajraghav.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 4Tjwny230Zz9sd7 From: Pankaj Raghav filemap_create_folio() and do_read_cache_folio() were always allocating folio of order 0. __filemap_get_folio was trying to allocate higher order folios when fgp_flags had higher order hint set but it will default to order 0 folio if higher order memory allocation fails. As we bring the notion of mapping_min_order, make sure these functions allocate at least folio of mapping_min_order as we need to guarantee it in the page cache. Add some additional VM_BUG_ON() in page_cache_delete[batch] and __filemap_add_folio to catch errors where we delete or add folios that has order less than min_order. Signed-off-by: Pankaj Raghav Signed-off-by: Luis Chamberlain Reviewed-by: Hannes Reinecke Acked-by: Darrick J. 
Wong --- mm/filemap.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index bdf4f65f597c..4b144479c4cb 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -135,6 +135,8 @@ static void page_cache_delete(struct address_space *mapping, xas_set_order(&xas, folio->index, folio_order(folio)); nr = folio_nr_pages(folio); + VM_BUG_ON_FOLIO(folio_order(folio) < mapping_min_folio_order(mapping), + folio); VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); xas_store(&xas, shadow); @@ -305,6 +307,8 @@ static void page_cache_delete_batch(struct address_space *mapping, WARN_ON_ONCE(!folio_test_locked(folio)); + VM_BUG_ON_FOLIO(folio_order(folio) < mapping_min_folio_order(mapping), + folio); folio->mapping = NULL; /* Leave folio->index set: truncation lookup relies on it */ @@ -896,6 +900,8 @@ noinline int __filemap_add_folio(struct address_space *mapping, } } + VM_BUG_ON_FOLIO(folio_order(folio) < mapping_min_folio_order(mapping), + folio); xas_store(&xas, folio); if (xas_error(&xas)) goto unlock; @@ -1847,6 +1853,9 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index, fgf_t fgp_flags, gfp_t gfp) { struct folio *folio; + unsigned int min_order = mapping_min_folio_order(mapping); + + index = mapping_align_start_index(mapping, index); repeat: folio = filemap_get_entry(mapping, index); @@ -1886,7 +1895,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index, folio_wait_stable(folio); no_page: if (!folio && (fgp_flags & FGP_CREAT)) { - unsigned order = FGF_GET_ORDER(fgp_flags); + unsigned int order = max(min_order, FGF_GET_ORDER(fgp_flags)); int err; if ((fgp_flags & FGP_WRITE) && mapping_can_writeback(mapping)) @@ -1912,8 +1921,13 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index, gfp_t alloc_gfp = gfp; err = -ENOMEM; + if (order < min_order) + order = min_order; if (order > 0) alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN; + + VM_BUG_ON(index & ((1UL << order) - 1)); + folio = filemap_alloc_folio(alloc_gfp, order); if (!folio) continue; @@ -1927,7 +1941,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index, break; folio_put(folio); folio = NULL; - } while (order-- > 0); + } while (order-- > min_order); if (err == -EEXIST) goto repeat; @@ -2422,7 +2436,8 @@ static int filemap_create_folio(struct file *file, struct folio *folio; int error; - folio = filemap_alloc_folio(mapping_gfp_mask(mapping), 0); + folio = filemap_alloc_folio(mapping_gfp_mask(mapping), + mapping_min_folio_order(mapping)); if (!folio) return -ENOMEM; @@ -3666,7 +3681,8 @@ static struct folio *do_read_cache_folio(struct address_space *mapping, repeat: folio = filemap_get_folio(mapping, index); if (IS_ERR(folio)) { - folio = filemap_alloc_folio(gfp, 0); + folio = filemap_alloc_folio(gfp, + mapping_min_folio_order(mapping)); if (!folio) return ERR_PTR(-ENOMEM); err = filemap_add_folio(mapping, folio, index, gfp); From patchwork Mon Feb 26 09:49:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571749 Received: from mout-p-101.mailbox.org (mout-p-101.mailbox.org [80.241.56.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CD61F67E79; Mon, 26 Feb 2024 09:50:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none 
smtp.client-ip=80.241.56.151 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941007; cv=none; b=sHXCN9ojiCJvPN+/yCONfkPT23/WX+T/82RyTRoO/+u7zWxTvrhBE9kr0+/Q0gv3/NR9sNXmVZrdcedHHROILHbeUl28TSZfA6etVXSwx3TItk5BRuJ7G5TmpUf5uJ1c+LtWihPD7RhS/CIr9dQEM0u7kIRcMLgp08HdsshUZR4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941007; c=relaxed/simple; bh=8T+7Tu933UIoxtI2pqlAuNX1DDPj7iQH+uJWQ1Gysvw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Mo3NIsTmeC5uNVj6NQynaW7UQ8Nr9kyCRSlEobSnRU9zDY9iS6mOj6oE6zwxJbDAcMUdWtGhaocOjMEwjveycaTySiXSyMcucy9ACN5liAaSsR4j97aeOAf6vnlQnGKPdvCwQDm0HbzKCjq3XqZ122qpzMtL9vgt1LHfSt40Im8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com; spf=pass smtp.mailfrom=pankajraghav.com; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b=UvS4naGN; arc=none smtp.client-ip=80.241.56.151 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b="UvS4naGN" Received: from smtp2.mailbox.org (smtp2.mailbox.org [IPv6:2001:67c:2050:b231:465::2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-101.mailbox.org (Postfix) with ESMTPS id 4Tjwp22M3Bz9sTN; Mon, 26 Feb 2024 10:50:02 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1708941002; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1njdQFWaR2BtfAMszTyy2iO3Y5kDTGCSewwBty7+2Kw=; b=UvS4naGN4B56NJsI1HXSdud2G5jomcFJ6pR8o51SW+VBhDFliKwBLh4YZFda4FACselOUk sr4+d8VOS/b+sN0XHZQ6vaUSG3xl6nmuAanTZxrifjgawftiv6ksfHNzhwGDNJf6w42bDq lmZGaWCeBHrP8LsXhFaYvtWRjZpr6UlNDGjI7mlb47aYk7yS02GDIhAfcwoWIGEpMwbP2y TD3mOqA99RywbbiYZMURMmzzCB8mckCp+XBDQmQZmvggaMnvwZEK0aKR1RuwYYdM4qnyBj WJ8s1ztuAJLE9Fm/QlnPXMTzGrmXUvfuxF5DfPZIOnX717VcvRq0XaL7/vXmgA== From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 05/13] readahead: set file_ra_state->ra_pages to be at least mapping_min_order Date: Mon, 26 Feb 2024 10:49:28 +0100 Message-ID: <20240226094936.2677493-6-kernel@pankajraghav.com> In-Reply-To: <20240226094936.2677493-1-kernel@pankajraghav.com> References: <20240226094936.2677493-1-kernel@pankajraghav.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 4Tjwp22M3Bz9sTN From: Luis Chamberlain Set the file_ra_state->ra_pages in file_ra_state_init() to be at least mapping_min_order of pages if the bdi->ra_pages is less than that. 
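As a worked illustration (an editorial sketch, not part of the patch): suppose a filesystem with a 16 KiB block size on a 4 KiB PAGE_SIZE kernel opts in through the helper added in patch 02, and the backing device advertises a readahead window smaller than one minimum-order folio. The filesystem-side call below is a hypothetical call site; the clamp mirrors the change this patch makes to file_ra_state_init().

	/* In the filesystem's inode constructor (hypothetical call site): */
	mapping_set_folio_min_order(inode->i_mapping, 2);	/* 1 << 2 pages == 16 KiB */

	/* In file_ra_state_init(), as changed by this patch: */
	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);	/* 1 << 2 == 4 */

	ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;	/* e.g. 2 */
	if (ra->ra_pages < min_nrpages)
		ra->ra_pages = min_nrpages;	/* readahead now covers at least one min-order folio */
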
Signed-off-by: Luis Chamberlain Signed-off-by: Pankaj Raghav --- mm/readahead.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/mm/readahead.c b/mm/readahead.c index 369c70e2be42..8a610b78d94b 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -138,7 +138,11 @@ void file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping) { + unsigned int min_nrpages = mapping_min_folio_nrpages(mapping); + ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages; + if (ra->ra_pages < min_nrpages) + ra->ra_pages = min_nrpages; ra->prev_pos = -1; } EXPORT_SYMBOL_GPL(file_ra_state_init); From patchwork Mon Feb 26 09:49:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571751 Received: from mout-p-102.mailbox.org (mout-p-102.mailbox.org [80.241.56.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DB18069306; Mon, 26 Feb 2024 09:50:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.152 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941018; cv=none; b=qzCDis6ZyBgs0pHg8Jofcehbc/V7Bs+7SMFlRILoBNFMmvZvXCQ2y6Ta55IaKnnF9mLi1fEgwXB8FQ1ASLCz8bXj8TSlcQeJtnRoztEyZ1feSNUPWf+v1U/Xdzh4n1Juz1eoRmKZglXqFhXE8bB0biPeaphXSrmaPYeulSefJ4E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941018; c=relaxed/simple; bh=gvzV2WcaeiNG86dtZhRmvW0cIHxaUjujdXE11HpU8fs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=vApBerklmE07S+WGx0qw+nD4zeoeM/N/mNxLub0SwA/Y6FG9DxlQsWbP51OBiilU7HGZkcp37kLlQ8W60/KH2/gIR/KkDmy/PIwbGBRfi1A2H5K43u5sZWnESpm5Y2kwZk7tClaL5zmGDGyc0EoNWXwn2A4bGrNVfAkapk1ra08= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com; spf=pass smtp.mailfrom=pankajraghav.com; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b=LK4xL/zq; arc=none smtp.client-ip=80.241.56.152 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b="LK4xL/zq" Received: from smtp2.mailbox.org (smtp2.mailbox.org [IPv6:2001:67c:2050:b231:465::2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-102.mailbox.org (Postfix) with ESMTPS id 4Tjwp748Vjz9sp7; Mon, 26 Feb 2024 10:50:07 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1708941007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/iWtDtcsIIVhWdsgAPg59nbWoOxmbwnUSiPp0I9sy7c=; b=LK4xL/zqaOcK2C5cLdqWnyrsWrnOSrjd9JTjUKcDU36GVSLc7/PAhmIrSGgJHHPVt8jyaA CgAzmkvPxMIBSBLcmcHaWd47UtkpWFQwiVv9/6AbPdnMxLve7raB9+Dk0fDTKnmm9LKncR ofaxnSbT+9OLB0DXuaRe+tj1SBO/0HueXcK5qRpFENHtJ8B/WVgvHro2xe4XGQ1OM6o+cT 9QBmPtxoMmp7aubJY+9oya4KZl1IXYuPch7eyw8l+I6AeRy0I0d//IH87DD6EgY948jKob 
VEkkuCR9FEEG3hCVfOsWiirFMc8BLEn1q7pwYRWY0o9sYRQnlRp1e5YWUsfuDg== From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 06/13] readahead: align index to mapping_min_order in ondemand_ra and force_ra Date: Mon, 26 Feb 2024 10:49:29 +0100 Message-ID: <20240226094936.2677493-7-kernel@pankajraghav.com> In-Reply-To: <20240226094936.2677493-1-kernel@pankajraghav.com> References: <20240226094936.2677493-1-kernel@pankajraghav.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 4Tjwp748Vjz9sp7 From: Luis Chamberlain Align the ra->start and ra->size to mapping_min_order in ondemand_readahead(), and align the index to mapping_min_order in force_page_cache_ra(). This will ensure that the folios allocated for readahead that are added to the page cache are aligned to mapping_min_order. Signed-off-by: Luis Chamberlain Signed-off-by: Pankaj Raghav --- mm/readahead.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/mm/readahead.c b/mm/readahead.c index 8a610b78d94b..325a25e4ee3a 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -313,7 +313,9 @@ void force_page_cache_ra(struct readahead_control *ractl, struct address_space *mapping = ractl->mapping; struct file_ra_state *ra = ractl->ra; struct backing_dev_info *bdi = inode_to_bdi(mapping->host); - unsigned long max_pages, index; + unsigned long max_pages; + pgoff_t index, new_index; + unsigned long min_nrpages = mapping_min_folio_nrpages(mapping); if (unlikely(!mapping->a_ops->read_folio && !mapping->a_ops->readahead)) return; @@ -323,7 +325,14 @@ void force_page_cache_ra(struct readahead_control *ractl, * be up to the optimal hardware IO size */ index = readahead_index(ractl); + new_index = mapping_align_start_index(mapping, index); + if (new_index != index) { + nr_to_read += index - new_index; + index = new_index; + } + max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages); + max_pages = max_t(unsigned long, max_pages, min_nrpages); nr_to_read = min_t(unsigned long, nr_to_read, max_pages); while (nr_to_read) { unsigned long this_chunk = (2 * 1024 * 1024) / PAGE_SIZE; @@ -331,6 +340,7 @@ void force_page_cache_ra(struct readahead_control *ractl, if (this_chunk > nr_to_read) this_chunk = nr_to_read; ractl->_index = index; + VM_BUG_ON(!IS_ALIGNED(index, min_nrpages)); do_page_cache_ra(ractl, this_chunk, 0); index += this_chunk; @@ -557,8 +567,11 @@ static void ondemand_readahead(struct readahead_control *ractl, unsigned long add_pages; pgoff_t index = readahead_index(ractl); pgoff_t expected, prev_index; - unsigned int order = folio ? folio_order(folio) : 0; + unsigned int min_order = mapping_min_folio_order(ractl->mapping); + unsigned int min_nrpages = mapping_min_folio_nrpages(ractl->mapping); + unsigned int order = folio ? 
folio_order(folio) : min_order; + VM_BUG_ON(!IS_ALIGNED(index, min_nrpages)); /* * If the request exceeds the readahead window, allow the read to * be up to the optimal hardware IO size @@ -580,7 +593,7 @@ static void ondemand_readahead(struct readahead_control *ractl, 1UL << order); if (index == expected || index == (ra->start + ra->size)) { ra->start += ra->size; - ra->size = get_next_ra_size(ra, max_pages); + ra->size = max(get_next_ra_size(ra, max_pages), min_nrpages); ra->async_size = ra->size; goto readit; } @@ -605,7 +618,7 @@ static void ondemand_readahead(struct readahead_control *ractl, ra->start = start; ra->size = start - index; /* old async_size */ ra->size += req_size; - ra->size = get_next_ra_size(ra, max_pages); + ra->size = max(get_next_ra_size(ra, max_pages), min_nrpages); ra->async_size = ra->size; goto readit; } @@ -642,7 +655,7 @@ static void ondemand_readahead(struct readahead_control *ractl, initial_readahead: ra->start = index; - ra->size = get_init_ra_size(req_size, max_pages); + ra->size = max(min_nrpages, get_init_ra_size(req_size, max_pages)); ra->async_size = ra->size > req_size ? ra->size - req_size : ra->size; readit: @@ -653,7 +666,7 @@ static void ondemand_readahead(struct readahead_control *ractl, * Take care of maximum IO pages as above. */ if (index == ra->start && ra->size == ra->async_size) { - add_pages = get_next_ra_size(ra, max_pages); + add_pages = max(get_next_ra_size(ra, max_pages), min_nrpages); if (ra->size + add_pages <= max_pages) { ra->async_size = add_pages; ra->size += add_pages; @@ -663,7 +676,7 @@ static void ondemand_readahead(struct readahead_control *ractl, } } - ractl->_index = ra->start; + ractl->_index = mapping_align_start_index(ractl->mapping, ra->start); page_cache_ra_order(ractl, ra, order); } From patchwork Mon Feb 26 09:49:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571750 Received: from mout-p-202.mailbox.org (mout-p-202.mailbox.org [80.241.56.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0BB6467E79; Mon, 26 Feb 2024 09:50:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941017; cv=none; b=dAYx5EDivjHCU3dC53xXENCu+wBO+y/fyUDzhey/Z1PsiYIuOViQD+juyFJ0OsJU2ChGZ87hRLlcIatvHFCNMWzxZFBIpDYCysWPW7kFifeG6FiSj2K19+VbNPu5jj5LeQh64lKhF5b3DKIEVRZ8ZJlyWZ1v3BdIRJNh+99zzZI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941017; c=relaxed/simple; bh=4ejCWPI5/fRYMUcY9eJYWQ4JohLb3zMfCtxtjMK5oW8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ixrAQKVwMXSRNNwXha3A48lmQ3UjpoIoZspHguMfKNmntxesNkoDeBfLT5U/39A5XP4eFC6/2D0t0FtZ9BM7Rw97Tr5+vvDX9AQROnw+yzMyEHomnRfWoTVcj/wuHud2c9LRfMwvxYbR8ctSKEIUsy97KEZ83j4dGxWewQbm1bA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com; spf=pass smtp.mailfrom=pankajraghav.com; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b=s1FvJLOV; arc=none smtp.client-ip=80.241.56.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pankajraghav.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b="s1FvJLOV" Received: from smtp2.mailbox.org (smtp2.mailbox.org [10.196.197.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-202.mailbox.org (Postfix) with ESMTPS id 4TjwpD4yqgz9sQ6; Mon, 26 Feb 2024 10:50:12 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1708941012; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=i9yxkr7ZYzLuGR2ramAYSlLdnRgu3ddqhwDDXgb/dQg=; b=s1FvJLOV6O9BDoEZy9rfAWAdOmTdsY69l/3r70AQ4MWdtRAAjRw8jrLMD2N1T8oH7bb3FE Zo5P5AF31/WRErUzGHULK/N/M18PaNHpCSVQ/5UjkSyvUBhL6ARcSVNtDNa7LFFihtGyuX zwLRbtKGFial2tfZRA/t6m9w4WUIikd/uSVFERTvKPvVNwLbAywF7r57ufmFDElaWknTj8 dQx12H3rSonr9IpI4qrobmgSjtp6J/VscSXIGgYelBUaxhicKdv9lyM5eovuOQEre1YR3i LWU2h+G8ukjtVd6mKiPDMISh406FjzorZrXGDipZ54XIyY/I6PB1osHX3sl0NA== From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 07/13] readahead: rework loop in page_cache_ra_unbounded() Date: Mon, 26 Feb 2024 10:49:30 +0100 Message-ID: <20240226094936.2677493-8-kernel@pankajraghav.com> In-Reply-To: <20240226094936.2677493-1-kernel@pankajraghav.com> References: <20240226094936.2677493-1-kernel@pankajraghav.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Hannes Reinecke Rework the loop in page_cache_ra_unbounded() to advance with the number of pages in a folio instead of just one page at a time. Signed-off-by: Hannes Reinecke Co-developed-by: Pankaj Raghav Signed-off-by: Pankaj Raghav Acked-by: Darrick J. Wong --- mm/readahead.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/mm/readahead.c b/mm/readahead.c index 325a25e4ee3a..ef0004147952 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -212,7 +212,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl, struct address_space *mapping = ractl->mapping; unsigned long index = readahead_index(ractl); gfp_t gfp_mask = readahead_gfp_mask(mapping); - unsigned long i; + unsigned long i = 0; /* * Partway through the readahead operation, we will have added @@ -230,7 +230,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl, /* * Preallocate as many pages as we will need. */ - for (i = 0; i < nr_to_read; i++) { + while (i < nr_to_read) { struct folio *folio = xa_load(&mapping->i_pages, index + i); if (folio && !xa_is_value(folio)) { @@ -243,8 +243,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl, * not worth getting one just for that. 
*/ read_pages(ractl); - ractl->_index++; - i = ractl->_index + ractl->_nr_pages - index - 1; + ractl->_index += folio_nr_pages(folio); + i = ractl->_index + ractl->_nr_pages - index; continue; } @@ -256,13 +256,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl, folio_put(folio); read_pages(ractl); ractl->_index++; - i = ractl->_index + ractl->_nr_pages - index - 1; + i = ractl->_index + ractl->_nr_pages - index; continue; } if (i == nr_to_read - lookahead_size) folio_set_readahead(folio); ractl->_workingset |= folio_test_workingset(folio); - ractl->_nr_pages++; + ractl->_nr_pages += folio_nr_pages(folio); + i += folio_nr_pages(folio); } /* From patchwork Mon Feb 26 09:49:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571753 Received: from mout-p-201.mailbox.org (mout-p-201.mailbox.org [80.241.56.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 87BBD6AF99; Mon, 26 Feb 2024 09:50:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941029; cv=none; b=DLIjrDVOf6ua7iT635cNmqhs8sV7h9mqStBF1v0h1BS9T5frf4MwYjXXdpOS0JUYD6Egn8CNRw8FehJWcTwHfZN0UdvmNu3+z+Axp9uIuEdBLZQ9CpyH3f28Hz202FGPcv8gaT1BcpxTxAZGz6QBUHs8L84ampTyjVa+52EIJPQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708941029; c=relaxed/simple; bh=OkZ5Z/dMHgJgyvo1QDUn1hTxnoRLA1+g9AD18CeopuA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QeNx/Qt7kdIHG6V7WXz/Fte/Zp8tlokRbyReQ7pNf5Wd7aYRn+CaquV0ABCXabSdSwjEe2rFhqJbyOtv1Veqz+fBqHoWjqc3hrW0T6SibeV3U6E9v0lxiLb3iyyyJ/TUGSM/dpiTb9qMgfSzciUGIcCjGtcdI3M+fcNYefZN6dU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com; spf=pass smtp.mailfrom=pankajraghav.com; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b=KD2XCj9V; arc=none smtp.client-ip=80.241.56.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b="KD2XCj9V" Received: from smtp2.mailbox.org (smtp2.mailbox.org [10.196.197.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-201.mailbox.org (Postfix) with ESMTPS id 4TjwpK6z92z9sZG; Mon, 26 Feb 2024 10:50:17 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1708941018; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iQD+U8fQrLibnL7zMC26FszgKpFB3C91GewFjdSXc68=; b=KD2XCj9VwH1Vmpc+ALJcr0/VQKCCQPTIil6owvpnHMNdIMgpDQ5W7a/HRkd3U4cOUjMCX0 I4cJnFFlwHSCV4Gbv2el17sDjgFvxM7cdzOZPo6QJ8vgcX2taBay3wltwWNxNcF3U5fNAJ L8WPmXdMxUgfXsQGWdlIcNLiYx0iBLpwGhhhEwubhYtDui7L8T/F+UPO5HJSx9SAcTtUtj 
FTQyNmAv5jqbNtdJ8Jzm5aGVJPAxwFkT36Nxj00z87HVHNTkqX+nYD5rBwtUg1Tiwon+Z2 YDfHCR6EtjB97Z3JdcRcZz++fzTQZzvFdjIDjH+KLPVowAREtTyhjqfTTaSEJA== From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 08/13] readahead: allocate folios with mapping_min_order in ra_(unbounded|order) Date: Mon, 26 Feb 2024 10:49:31 +0100 Message-ID: <20240226094936.2677493-9-kernel@pankajraghav.com> In-Reply-To: <20240226094936.2677493-1-kernel@pankajraghav.com> References: <20240226094936.2677493-1-kernel@pankajraghav.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Pankaj Raghav Allocate folios with at least mapping_min_order in page_cache_ra_unbounded() and page_cache_ra_order() as we need to guarantee a minimum order in the page cache. Signed-off-by: Pankaj Raghav Signed-off-by: Luis Chamberlain Acked-by: Darrick J. Wong Reviewed-by: Hannes Reinecke --- mm/readahead.c | 29 +++++++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/mm/readahead.c b/mm/readahead.c index ef0004147952..73aef3f080ba 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -213,6 +213,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl, unsigned long index = readahead_index(ractl); gfp_t gfp_mask = readahead_gfp_mask(mapping); unsigned long i = 0; + unsigned int min_nrpages = mapping_min_folio_nrpages(mapping); /* * Partway through the readahead operation, we will have added @@ -234,6 +235,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl, struct folio *folio = xa_load(&mapping->i_pages, index + i); if (folio && !xa_is_value(folio)) { + long nr_pages = folio_nr_pages(folio); + /* * Page already present? Kick off the current batch * of contiguous pages before continuing with the @@ -243,19 +246,31 @@ void page_cache_ra_unbounded(struct readahead_control *ractl, * not worth getting one just for that. */ read_pages(ractl); - ractl->_index += folio_nr_pages(folio); + + /* + * Move the ractl->_index by at least min_pages + * if the folio got truncated to respect the + * alignment constraint in the page cache. 
+ * + */ + if (mapping != folio->mapping) + nr_pages = min_nrpages; + + VM_BUG_ON_FOLIO(nr_pages < min_nrpages, folio); + ractl->_index += nr_pages; i = ractl->_index + ractl->_nr_pages - index; continue; } - folio = filemap_alloc_folio(gfp_mask, 0); + folio = filemap_alloc_folio(gfp_mask, + mapping_min_folio_order(mapping)); if (!folio) break; if (filemap_add_folio(mapping, folio, index + i, gfp_mask) < 0) { folio_put(folio); read_pages(ractl); - ractl->_index++; + ractl->_index += min_nrpages; i = ractl->_index + ractl->_nr_pages - index; continue; } @@ -503,6 +518,7 @@ void page_cache_ra_order(struct readahead_control *ractl, { struct address_space *mapping = ractl->mapping; pgoff_t index = readahead_index(ractl); + unsigned int min_order = mapping_min_folio_order(mapping); pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT; pgoff_t mark = index + ra->size - ra->async_size; int err = 0; @@ -529,8 +545,13 @@ void page_cache_ra_order(struct readahead_control *ractl, if (index & ((1UL << order) - 1)) order = __ffs(index); /* Don't allocate pages past EOF */ - while (index + (1UL << order) - 1 > limit) + while (order > min_order && index + (1UL << order) - 1 > limit) order--; + + if (order < min_order) + order = min_order; + + VM_BUG_ON(index & ((1UL << order) - 1)); err = ra_alloc_folio(ractl, index, mark, order, gfp); if (err) break;
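The order selection added to page_cache_ra_order() above is plain integer arithmetic and can be modelled outside the kernel. Below is a minimal userspace sketch of that clamping, assuming a 4k page size and an order-2 (16k) minimum folio; __builtin_ctzl() stands in for the kernel's __ffs(), and pick_order() is an illustrative name, not a kernel function.

#include <assert.h>
#include <stdio.h>

/* Toy model of the readahead order clamping: never allocate past
 * 'limit', but never drop below the mapping's minimum order either. */
static unsigned int pick_order(unsigned long index, unsigned long limit,
			       unsigned int order, unsigned int min_order)
{
	if (index & ((1UL << order) - 1))
		order = __builtin_ctzl(index);	/* stand-in for __ffs() */

	/* don't allocate pages past EOF, but stop at min_order */
	while (order > min_order && index + (1UL << order) - 1 > limit)
		order--;

	if (order < min_order)
		order = min_order;

	/* the index must stay naturally aligned to the chosen order */
	assert((index & ((1UL << order) - 1)) == 0);
	return order;
}

int main(void)
{
	/* 16k minimum folio (order 2 on 4k pages), EOF at page index 21 */
	printf("near EOF:     order %u\n", pick_order(16, 21, 4, 2));
	printf("far from EOF: order %u\n", pick_order(0, 1UL << 20, 4, 2));
	return 0;
}

Near EOF the loop stops at the minimum order instead of shrinking to order 0, which is exactly the guarantee the patch needs for LBS mappings.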
From patchwork Mon Feb 26 09:49:32 2024 X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571752 From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 09/13] mm: do not split a folio if it has minimum folio order requirement Date: Mon, 26 Feb 2024 10:49:32 +0100 Message-ID: <20240226094936.2677493-10-kernel@pankajraghav.com> From: Pankaj Raghav As we don't have a way to split a folio to any given lower folio order yet, avoid splitting the folio in split_huge_page_to_list() if it has a minimum folio order requirement. Signed-off-by: Pankaj Raghav Signed-off-by: Luis Chamberlain Reviewed-by: Hannes Reinecke --- mm/huge_memory.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 81fd1ba57088..6ec3417638a1 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3030,6 +3030,19 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) goto out; } + /* + * Do not split if mapping has minimum folio order + * requirement. + * + * XXX: Once we have support for splitting to any lower + * folio order, then it could be split based on the + * min_folio_order.
+ */ + if (mapping_min_folio_order(mapping)) { + ret = -EAGAIN; + goto out; + } + gfp = current_gfp_context(mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK);
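The effect of this guard is easy to see with a small userspace model; the structures below are simplified stand-ins for the kernel's, and errno's EAGAIN is borrowed to simulate the return value, so this only illustrates the control flow, not the real split path.

#include <errno.h>
#include <stdio.h>

/* simplified stand-ins for the kernel structures involved */
struct address_space { unsigned int min_folio_order; };
struct folio { struct address_space *mapping; };

/* mirrors the new check: a file-backed folio whose mapping requires a
 * minimum folio order cannot be split down to order 0 yet, so the
 * caller is asked to retry later instead */
static int try_split(const struct folio *folio)
{
	if (folio->mapping && folio->mapping->min_folio_order)
		return -EAGAIN;
	/* ... the real function would go on to freeze and split ... */
	return 0;
}

int main(void)
{
	struct address_space lbs = { .min_folio_order = 2 };	/* e.g. 16k blocks */
	struct address_space plain = { .min_folio_order = 0 };
	struct folio on_lbs = { .mapping = &lbs };
	struct folio on_plain = { .mapping = &plain };

	printf("LBS mapping:     %d\n", try_split(&on_lbs));	/* -EAGAIN */
	printf("regular mapping: %d\n", try_split(&on_plain));	/* 0 */
	return 0;
}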
From patchwork Mon Feb 26 09:49:33 2024 X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571754 From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 10/13] iomap: fix iomap_dio_zero() for fs bs > system page size Date: Mon, 26 Feb 2024 10:49:33 +0100 Message-ID: <20240226094936.2677493-11-kernel@pankajraghav.com> From: Pankaj Raghav iomap_dio_zero() will pad a fs block with zeroes if the direct IO size < fs block size. iomap_dio_zero() has an implicit assumption that fs block size < page_size. This is true for most filesystems at the moment. If the block size > page size, this will send the contents of the page next to the zero page (as len > PAGE_SIZE) to the underlying block device, causing FS corruption. iomap is a generic infrastructure and it should not make any assumptions about the fs block size and the page size of the system. Signed-off-by: Pankaj Raghav Reviewed-by: Darrick J. Wong --- fs/iomap/direct-io.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index bcd3f8cf5ea4..04f6c5548136 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -239,14 +239,23 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio, struct page *page = ZERO_PAGE(0); struct bio *bio; - bio = iomap_dio_alloc_bio(iter, dio, 1, REQ_OP_WRITE | REQ_SYNC | REQ_IDLE); + WARN_ON_ONCE(len > (BIO_MAX_VECS * PAGE_SIZE)); + + bio = iomap_dio_alloc_bio(iter, dio, BIO_MAX_VECS, + REQ_OP_WRITE | REQ_SYNC | REQ_IDLE); fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits, GFP_KERNEL); + bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos); bio->bi_private = dio; bio->bi_end_io = iomap_dio_bio_end_io; - __bio_add_page(bio, page, len, 0); + while (len) { + unsigned int io_len = min_t(unsigned int, len, PAGE_SIZE); + + __bio_add_page(bio, page, io_len, 0); + len -= io_len; + } iomap_dio_submit_bio(iter, dio, bio, pos); }
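The fix boils down to chopping the zeroing length into page-sized chunks so that only the zero page is ever queued. A rough userspace model of that loop, assuming a 4k page size (add_zero_vecs() is an illustrative name, not the iomap API):

#include <stdio.h>

#define PAGE_SIZE 4096u

/* model of the new loop in iomap_dio_zero(): instead of one bvec of
 * 'len' bytes (which runs past the single zero page when the fs block
 * is larger than a page), the zero page is queued page by page */
static unsigned int add_zero_vecs(unsigned int len)
{
	unsigned int nvecs = 0;

	while (len) {
		unsigned int io_len = len < PAGE_SIZE ? len : PAGE_SIZE;

		printf("  bvec %u: zero page, %u bytes\n", nvecs, io_len);
		len -= io_len;
		nvecs++;
	}
	return nvecs;
}

int main(void)
{
	/* a 16k block on a 4k page system needs four zero-page bvecs */
	printf("total bvecs: %u\n", add_zero_vecs(16384));
	return 0;
}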
From patchwork Mon Feb 26 09:49:34 2024 X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571755 From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Dave Chinner Subject: [PATCH 11/13] xfs: expose block size in stat Date: Mon, 26 Feb 2024 10:49:34 +0100 Message-ID: <20240226094936.2677493-12-kernel@pankajraghav.com> From: Dave Chinner For block size larger than page size, the unit of efficient IO is the block size, not the page size. Leaving stat() to report PAGE_SIZE as the block size causes test programs like fsx to issue illegal ranges for operations that require block size alignment (e.g. fallocate() insert range). Hence update the preferred IO size to reflect the block size in this case.
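From userspace the change is visible through stat(2): st_blksize on an LBS XFS filesystem now reports the filesystem block size instead of the page size, and programs that need block-aligned ranges can align to it. A small example of reading and using that value (the choice of offset 10000 is purely for illustration):

#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
	struct stat st;
	off_t off = 10000, aligned;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	if (stat(argv[1], &st) != 0) {
		perror("stat");
		return 1;
	}

	/* round down to the preferred I/O granularity reported by stat() */
	aligned = off - (off % st.st_blksize);

	printf("st_blksize = %ld, %ld rounds down to %ld\n",
	       (long)st.st_blksize, (long)off, (long)aligned);
	return 0;
}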
Signed-off-by: Dave Chinner [rebased on top of commit dd2d535e3fb29d ("xfs: cleanup calculating the stat optimal I/O size")] Signed-off-by: Luis Chamberlain --- fs/xfs/xfs_iops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index a0d77f5f512e..1b4edfad464f 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -543,7 +543,7 @@ xfs_stat_blksize( return 1U << mp->m_allocsize_log; } - return PAGE_SIZE; + return max_t(unsigned long, PAGE_SIZE, mp->m_sb.sb_blocksize); } STATIC int From patchwork Mon Feb 26 09:49:35 2024 X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571756 From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org,
linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 12/13] xfs: make the calculation generic in xfs_sb_validate_fsb_count() Date: Mon, 26 Feb 2024 10:49:35 +0100 Message-ID: <20240226094936.2677493-13-kernel@pankajraghav.com> From: Pankaj Raghav Instead of assuming that PAGE_SHIFT is always higher than the blocklog, make the calculation generic so that page cache count can be calculated correctly for LBS. Signed-off-by: Pankaj Raghav --- fs/xfs/xfs_mount.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index aabb25dc3efa..69af3b06be99 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -133,9 +133,15 @@ xfs_sb_validate_fsb_count( { ASSERT(PAGE_SHIFT >= sbp->sb_blocklog); ASSERT(sbp->sb_blocklog >= BBSHIFT); + uint64_t mapping_count; + uint64_t bytes; + if (check_mul_overflow(nblocks, (1 << sbp->sb_blocklog), &bytes)) + return -EFBIG; + + mapping_count = bytes >> PAGE_SHIFT; /* Limited by ULONG_MAX of page cache index */ - if (nblocks >> (PAGE_SHIFT - sbp->sb_blocklog) > ULONG_MAX) + if (mapping_count > ULONG_MAX) return -EFBIG; return 0; }
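The reworked check can be exercised in isolation: convert blocks to bytes with an explicit overflow test and then to page cache pages, rather than shifting by (PAGE_SHIFT - blocklog), which goes negative once the block size exceeds the page size. A userspace sketch follows, with __builtin_mul_overflow() standing in for the kernel's check_mul_overflow(); on a 64-bit build the ULONG_MAX comparison only matters conceptually, since the page cache index there is also 64 bits wide.

#include <limits.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12

static int validate_fsb_count(uint64_t nblocks, unsigned int blocklog)
{
	uint64_t bytes;

	/* bytes = nblocks << blocklog, rejecting 64-bit overflow */
	if (__builtin_mul_overflow(nblocks, UINT64_C(1) << blocklog, &bytes))
		return -1;			/* stand-in for -EFBIG */

	/* pages needed by the mapping must fit the page cache index */
	if ((bytes >> PAGE_SHIFT) > ULONG_MAX)
		return -1;

	return 0;
}

int main(void)
{
	/* 64k blocks (blocklog 16): a sane count passes, an absurd one
	 * that would overflow the byte count is rejected */
	printf("%d\n", validate_fsb_count(UINT64_C(1) << 30, 16));
	printf("%d\n", validate_fsb_count(UINT64_MAX / 2, 16));
	return 0;
}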
From patchwork Mon Feb 26 09:49:36 2024 X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13571757 From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org, mcgrof@kernel.org, ziy@nvidia.com, hare@suse.de, djwong@kernel.org, gost.dev@samsung.com, linux-mm@kvack.org, willy@infradead.org, Pankaj Raghav Subject: [PATCH 13/13] xfs: enable block size larger than page size support Date: Mon, 26 Feb 2024 10:49:36 +0100 Message-ID: <20240226094936.2677493-14-kernel@pankajraghav.com> From: Pankaj Raghav Page cache now has the ability to have a minimum order when allocating a folio, which is a prerequisite to add support for block size > page size. Signed-off-by: Pankaj Raghav Signed-off-by: Luis Chamberlain --- fs/xfs/libxfs/xfs_ialloc.c | 5 +++++ fs/xfs/libxfs/xfs_shared.h | 3 +++ fs/xfs/xfs_icache.c | 6 ++++-- fs/xfs/xfs_mount.c | 1 - fs/xfs/xfs_super.c | 10 ++-------- 5 files changed, 14 insertions(+), 11 deletions(-) diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c index 2361a22035b0..c040bd6271fd 100644 --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -2892,6 +2892,11 @@ xfs_ialloc_setup_geometry( igeo->ialloc_align = mp->m_dalign; else igeo->ialloc_align = 0; + + if (mp->m_sb.sb_blocksize > PAGE_SIZE) + igeo->min_folio_order = mp->m_sb.sb_blocklog - PAGE_SHIFT; + else + igeo->min_folio_order = 0; } /* Compute the location of the root directory inode that is laid out by mkfs. */ diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h index 4220d3584c1b..67ed406e7a81 100644 --- a/fs/xfs/libxfs/xfs_shared.h +++ b/fs/xfs/libxfs/xfs_shared.h @@ -188,6 +188,9 @@ struct xfs_ino_geometry { /* precomputed value for di_flags2 */ uint64_t new_diflags2; + /* minimum folio order of a page cache allocation */ + unsigned int min_folio_order; + }; #endif /* __XFS_SHARED_H__ */ diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index dba514a2c84d..a1857000e2cd 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -88,7 +88,8 @@ xfs_inode_alloc( /* VFS doesn't initialise i_mode or i_state!
*/ VFS_I(ip)->i_mode = 0; VFS_I(ip)->i_state = 0; - mapping_set_large_folios(VFS_I(ip)->i_mapping); + mapping_set_folio_min_order(VFS_I(ip)->i_mapping, + M_IGEO(mp)->min_folio_order); XFS_STATS_INC(mp, vn_active); ASSERT(atomic_read(&ip->i_pincount) == 0); @@ -323,7 +324,8 @@ xfs_reinit_inode( inode->i_rdev = dev; inode->i_uid = uid; inode->i_gid = gid; - mapping_set_large_folios(inode->i_mapping); + mapping_set_folio_min_order(inode->i_mapping, + M_IGEO(mp)->min_folio_order); return error; } diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 69af3b06be99..c7df1857195c 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -131,7 +131,6 @@ xfs_sb_validate_fsb_count( xfs_sb_t *sbp, uint64_t nblocks) { - ASSERT(PAGE_SHIFT >= sbp->sb_blocklog); ASSERT(sbp->sb_blocklog >= BBSHIFT); uint64_t mapping_count; uint64_t bytes; diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 5a2512d20bd0..685ce7bf7324 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1625,16 +1625,10 @@ xfs_fs_fill_super( goto out_free_sb; } - /* - * Until this is fixed only page-sized or smaller data blocks work. - */ if (mp->m_sb.sb_blocksize > PAGE_SIZE) { xfs_warn(mp, - "File system with blocksize %d bytes. " - "Only pagesize (%ld) or less will currently work.", - mp->m_sb.sb_blocksize, PAGE_SIZE); - error = -ENOSYS; - goto out_free_sb; +"EXPERIMENTAL: Filesystem with Large Block Size (%d bytes) enabled.", + mp->m_sb.sb_blocksize); } /* Ensure this filesystem fits in the page cache limits */