From patchwork Tue Feb 13 09:37:00 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13554805
From: "Pankaj Raghav (Samsung)"
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
    kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
    p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
    willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com
Subject: [RFC v2 01/14] fs: Allow fine-grained control of folio sizes
Date: Tue, 13 Feb 2024 10:37:00 +0100
Message-ID: <20240213093713.1753368-2-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>
References: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: "Matthew Wilcox (Oracle)"

Some filesystems want to be able to limit the maximum size of folios,
and some want to be able to ensure that folios are at least a certain
size. Add mapping_set_folio_orders() to allow this level of control.
For now the max folio order parameter is ignored and the maximum is
always set internally to MAX_PAGECACHE_ORDER.

Signed-off-by: Matthew Wilcox (Oracle)
Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
---
 include/linux/pagemap.h | 92 ++++++++++++++++++++++++++++++++---------
 1 file changed, 73 insertions(+), 19 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 2df35e65557d..5618f762187b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -202,13 +202,18 @@ enum mapping_flags {
 	AS_EXITING	= 4, 	/* final truncate in progress */
 	/* writeback related tags are not used */
 	AS_NO_WRITEBACK_TAGS = 5,
-	AS_LARGE_FOLIO_SUPPORT = 6,
-	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
-	AS_STABLE_WRITES,	/* must wait for writeback before modifying
+	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
+	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
 				   folio contents */
-	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
+	AS_FOLIO_ORDER_MIN = 8,
+	AS_FOLIO_ORDER_MAX = 13,	/* Bit 8-17 are used for FOLIO_ORDER */
+	AS_UNMOVABLE = 18,	/* The mapping cannot be moved, ever */
 };
 
+#define AS_FOLIO_ORDER_MIN_MASK 0x00001f00
+#define AS_FOLIO_ORDER_MAX_MASK 0x0003e000
+#define AS_FOLIO_ORDER_MASK (AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK)
+
 /**
  * mapping_set_error - record a writeback error in the address_space
  * @mapping: the mapping in which an error should be set
@@ -344,6 +349,53 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
 	m->gfp_mask = mask;
 }
 
+/*
+ * There are some parts of the kernel which assume that PMD entries
+ * are exactly HPAGE_PMD_ORDER. Those should be fixed, but until then,
+ * limit the maximum allocation order to PMD size. I'm not aware of any
+ * assumptions about maximum order if THP are disabled, but 8 seems like
+ * a good order (that's 1MB if you're using 4kB pages)
+ */
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define MAX_PAGECACHE_ORDER	HPAGE_PMD_ORDER
+#else
+#define MAX_PAGECACHE_ORDER	8
+#endif
+
+/*
+ * mapping_set_folio_orders() - Set the range of folio sizes supported.
+ * @mapping: The file.
+ * @min: Minimum folio order (between 0-MAX_PAGECACHE_ORDER inclusive).
+ * @max: Maximum folio order (between 0-MAX_PAGECACHE_ORDER inclusive).
+ *
+ * The filesystem should call this function in its inode constructor to
+ * indicate which sizes of folio the VFS can use to cache the contents
+ * of the file. This should only be used if the filesystem needs special
+ * handling of folio sizes (ie there is something the core cannot know).
+ * Do not tune it based on, eg, i_size.
+ *
+ * Context: This should not be called while the inode is active as it
+ * is non-atomic.
+ */
+static inline void mapping_set_folio_orders(struct address_space *mapping,
+					    unsigned int min, unsigned int max)
+{
+	if (min == 1)
+		min = 2;
+	if (max < min)
+		max = min;
+	if (max > MAX_PAGECACHE_ORDER)
+		max = MAX_PAGECACHE_ORDER;
+
+	/*
+	 * XXX: max is ignored as only minimum folio order is supported
+	 * currently.
+	 */
+	mapping->flags = (mapping->flags & ~AS_FOLIO_ORDER_MASK) |
+			 (min << AS_FOLIO_ORDER_MIN) |
+			 (MAX_PAGECACHE_ORDER << AS_FOLIO_ORDER_MAX);
+}
+
 /**
  * mapping_set_large_folios() - Indicate the file supports large folios.
  * @mapping: The file.
@@ -357,7 +409,22 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
  */
 static inline void mapping_set_large_folios(struct address_space *mapping)
 {
-	__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	mapping_set_folio_orders(mapping, 0, MAX_PAGECACHE_ORDER);
+}
+
+static inline unsigned int mapping_max_folio_order(struct address_space *mapping)
+{
+	return (mapping->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX;
+}
+
+static inline unsigned int mapping_min_folio_order(struct address_space *mapping)
+{
+	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
+}
+
+static inline unsigned int mapping_min_folio_nrpages(struct address_space *mapping)
+{
+	return 1U << mapping_min_folio_order(mapping);
 }
 
 /*
@@ -367,7 +434,7 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
 static inline bool mapping_large_folio_support(struct address_space *mapping)
 {
 	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+		(mapping_max_folio_order(mapping) > 0);
 }
 
 static inline int filemap_nr_thps(struct address_space *mapping)
@@ -528,19 +595,6 @@ static inline void *detach_page_private(struct page *page)
 	return folio_detach_private(page_folio(page));
 }
 
-/*
- * There are some parts of the kernel which assume that PMD entries
- * are exactly HPAGE_PMD_ORDER. Those should be fixed, but until then,
- * limit the maximum allocation order to PMD size. I'm not aware of any
- * assumptions about maximum order if THP are disabled, but 8 seems like
- * a good order (that's 1MB if you're using 4kB pages)
- */
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define MAX_PAGECACHE_ORDER	HPAGE_PMD_ORDER
-#else
-#define MAX_PAGECACHE_ORDER	8
-#endif
-
 #ifdef CONFIG_NUMA
 struct folio *filemap_alloc_folio(gfp_t gfp, unsigned int order);
 #else
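To make the new bit layout concrete: the minimum and maximum folio orders
are packed into five bits each, at bit 8 and bit 13 of mapping->flags.
The following is a minimal userspace sketch of the encode/decode round
trip, not kernel code; the AS_FOLIO_ORDER_* constants are copied from the
diff above, while main() and the sample orders are purely illustrative.

#include <stdio.h>

/* Constants copied from the patch; everything else is illustrative. */
#define AS_FOLIO_ORDER_MIN	8
#define AS_FOLIO_ORDER_MAX	13
#define AS_FOLIO_ORDER_MIN_MASK	0x00001f00UL
#define AS_FOLIO_ORDER_MAX_MASK	0x0003e000UL
#define AS_FOLIO_ORDER_MASK	(AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK)

int main(void)
{
	unsigned long flags = 0;
	unsigned long min = 3, max = 5;	/* sample orders */

	/* Pack, as mapping_set_folio_orders() does (minus the clamping). */
	flags = (flags & ~AS_FOLIO_ORDER_MASK) |
		(min << AS_FOLIO_ORDER_MIN) |
		(max << AS_FOLIO_ORDER_MAX);

	/* Unpack, as mapping_min_folio_order()/mapping_max_folio_order() do. */
	printf("min order: %lu\n",
	       (flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN);
	printf("max order: %lu\n",
	       (flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX);
	return 0;
}

Note that with this patch applied the stored maximum is always
MAX_PAGECACHE_ORDER, regardless of the max argument.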
From patchwork Tue Feb 13 09:37:01 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13554807
From: "Pankaj Raghav (Samsung)"
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
    kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
    p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
    willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com
Subject: [RFC v2 02/14] filemap: align the index to mapping_min_order in the page cache
Date: Tue, 13 Feb 2024 10:37:01 +0100
Message-ID: <20240213093713.1753368-3-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>
References: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Luis Chamberlain

Supporting mapping_min_order implies that we guarantee each folio in the
page cache has at least an order of mapping_min_order. So when adding new
folios to the page cache, we must ensure the index used is aligned to
mapping_min_order, as the page cache requires the index to be aligned to
the order of the folio.

A folio with an order higher than min_order is, by definition, a multiple
of the min_order in size, so an index aligned to a higher order is also
aligned to the min order. This effectively introduces no functional change
when min order is not set, other than a few rounding computations that
should produce the same values as before.
Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
---
 mm/filemap.c | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 750e779c23db..323a8e169581 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2479,14 +2479,16 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
 {
 	struct file *filp = iocb->ki_filp;
 	struct address_space *mapping = filp->f_mapping;
+	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
 	struct file_ra_state *ra = &filp->f_ra;
-	pgoff_t index = iocb->ki_pos >> PAGE_SHIFT;
+	pgoff_t index = round_down(iocb->ki_pos >> PAGE_SHIFT, min_nrpages);
 	pgoff_t last_index;
 	struct folio *folio;
 	int err = 0;
 
 	/* "last_index" is the index of the page beyond the end of the read */
 	last_index = DIV_ROUND_UP(iocb->ki_pos + count, PAGE_SIZE);
+	last_index = round_up(last_index, min_nrpages);
 retry:
 	if (fatal_signal_pending(current))
 		return -EINTR;
@@ -2502,8 +2504,7 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
 	if (!folio_batch_count(fbatch)) {
 		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ))
 			return -EAGAIN;
-		err = filemap_create_folio(filp, mapping,
-				iocb->ki_pos >> PAGE_SHIFT, fbatch);
+		err = filemap_create_folio(filp, mapping, index, fbatch);
 		if (err == AOP_TRUNCATED_PAGE)
 			goto retry;
 		return err;
@@ -3095,7 +3096,10 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 	struct file *file = vmf->vma->vm_file;
 	struct file_ra_state *ra = &file->f_ra;
 	struct address_space *mapping = file->f_mapping;
-	DEFINE_READAHEAD(ractl, file, ra, mapping, vmf->pgoff);
+	unsigned int min_order = mapping_min_folio_order(mapping);
+	unsigned int min_nrpages = mapping_min_folio_nrpages(file->f_mapping);
+	pgoff_t index = round_down(vmf->pgoff, min_nrpages);
+	DEFINE_READAHEAD(ractl, file, ra, mapping, index);
 	struct file *fpin = NULL;
 	unsigned long vm_flags = vmf->vma->vm_flags;
 	unsigned int mmap_miss;
@@ -3147,10 +3151,11 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 	 */
 	fpin = maybe_unlock_mmap_for_io(vmf, fpin);
 	ra->start = max_t(long, 0, vmf->pgoff - ra->ra_pages / 2);
+	ra->start = round_down(ra->start, min_nrpages);
 	ra->size = ra->ra_pages;
 	ra->async_size = ra->ra_pages / 4;
 	ractl._index = ra->start;
-	page_cache_ra_order(&ractl, ra, 0);
+	page_cache_ra_order(&ractl, ra, min_order);
 	return fpin;
 }
 
@@ -3164,7 +3169,9 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
 {
 	struct file *file = vmf->vma->vm_file;
 	struct file_ra_state *ra = &file->f_ra;
-	DEFINE_READAHEAD(ractl, file, ra, file->f_mapping, vmf->pgoff);
+	unsigned int min_nrpages = mapping_min_folio_nrpages(file->f_mapping);
+	pgoff_t index = round_down(vmf->pgoff, min_nrpages);
+	DEFINE_READAHEAD(ractl, file, ra, file->f_mapping, index);
 	struct file *fpin = NULL;
 	unsigned int mmap_miss;
@@ -3212,13 +3219,17 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	struct file *file = vmf->vma->vm_file;
 	struct file *fpin = NULL;
 	struct address_space *mapping = file->f_mapping;
+	unsigned int min_order = mapping_min_folio_order(mapping);
+	unsigned int nrpages = 1UL << min_order;
 	struct inode *inode = mapping->host;
-	pgoff_t max_idx, index = vmf->pgoff;
+	pgoff_t max_idx, index = round_down(vmf->pgoff, nrpages);
 	struct folio *folio;
 	vm_fault_t ret = 0;
 	bool mapping_locked = false;
 
 	max_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+	max_idx = round_up(max_idx, nrpages);
+
 	if (unlikely(index >= max_idx))
 		return VM_FAULT_SIGBUS;
@@ -3317,13 +3328,17 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	 * We must recheck i_size under page lock.
 	 */
 	max_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+	max_idx = round_up(max_idx, nrpages);
+
 	if (unlikely(index >= max_idx)) {
 		folio_unlock(folio);
 		folio_put(folio);
 		return VM_FAULT_SIGBUS;
 	}
 
-	vmf->page = folio_file_page(folio, index);
+	VM_BUG_ON_FOLIO(folio_order(folio) < min_order, folio);
+
+	vmf->page = folio_file_page(folio, vmf->pgoff);
 	return ret | VM_FAULT_LOCKED;
 
 page_not_uptodate:
@@ -3658,6 +3673,9 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
 {
 	struct folio *folio;
 	int err;
+	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
+
+	index = round_down(index, min_nrpages);
 
 	if (!filler)
 		filler = mapping->a_ops->read_folio;
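The rounding used throughout this patch is plain power-of-two arithmetic.
Below is a small userspace sketch of the index computation now done in
filemap_get_pages(); the macros are simplified stand-ins for the kernel
helpers and all the sample values are invented.

#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Simplified stand-ins; y must be a power of two for the rounding macros. */
#define round_down(x, y)   ((x) & ~((y) - 1))
#define round_up(x, y)     (((x) + (y) - 1) & ~((y) - 1))
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
	unsigned long ki_pos = 5 * PAGE_SIZE;	/* read starts at page 5 */
	unsigned long count = 3 * PAGE_SIZE;
	unsigned long min_nrpages = 4;		/* min folio order 2 */

	unsigned long index = round_down(ki_pos >> PAGE_SHIFT, min_nrpages);
	unsigned long last_index =
		round_up(DIV_ROUND_UP(ki_pos + count, PAGE_SIZE), min_nrpages);

	/* Prints index=4 last_index=8: both ends snap to 4-page boundaries. */
	printf("index=%lu last_index=%lu\n", index, last_index);
	return 0;
}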
From patchwork Tue Feb 13 09:37:02 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13554809
From: "Pankaj Raghav (Samsung)"
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
    kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
    p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
    willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com
Subject: [RFC v2 03/14] filemap: use mapping_min_order while allocating folios
Date: Tue, 13 Feb 2024 10:37:02 +0100
Message-ID: <20240213093713.1753368-4-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>
References: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Pankaj Raghav

filemap_create_folio() and do_read_cache_folio() always allocated folios
of order 0. __filemap_get_folio() tried to allocate higher-order folios
when fgp_flags carried a higher-order hint, but fell back to an order-0
folio if the higher-order allocation failed. Now that we have the notion
of mapping_min_order, make sure these functions allocate folios of at
least mapping_min_order, as we need to guarantee that in the page cache.

Add some additional VM_BUG_ON() checks in page_cache_delete[batch]() and
__filemap_add_folio() to catch cases where we delete or add folios that
have an order less than min_order.

Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
Acked-by: Darrick J. Wong
---
 mm/filemap.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 323a8e169581..7a6e15c47150 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -127,6 +127,7 @@ static void page_cache_delete(struct address_space *mapping,
 					struct folio *folio, void *shadow)
 {
+	unsigned int min_order = mapping_min_folio_order(mapping);
 	XA_STATE(xas, &mapping->i_pages, folio->index);
 	long nr = 1;
 
@@ -135,6 +136,7 @@ static void page_cache_delete(struct address_space *mapping,
 		xas_set_order(&xas, folio->index, folio_order(folio));
 		nr = folio_nr_pages(folio);
 
+	VM_BUG_ON_FOLIO(folio_order(folio) < min_order, folio);
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
 	xas_store(&xas, shadow);
@@ -277,6 +279,7 @@ void filemap_remove_folio(struct folio *folio)
 static void page_cache_delete_batch(struct address_space *mapping,
 			     struct folio_batch *fbatch)
 {
+	unsigned int min_order = mapping_min_folio_order(mapping);
 	XA_STATE(xas, &mapping->i_pages, fbatch->folios[0]->index);
 	long total_pages = 0;
 	int i = 0;
@@ -305,6 +308,7 @@ static void page_cache_delete_batch(struct address_space *mapping,
 
 		WARN_ON_ONCE(!folio_test_locked(folio));
 
+		VM_BUG_ON_FOLIO(folio_order(folio) < min_order, folio);
 		folio->mapping = NULL;
 		/* Leave folio->index set: truncation lookup relies on it */
 
@@ -846,6 +850,7 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 	int huge = folio_test_hugetlb(folio);
 	bool charged = false;
 	long nr = 1;
+	unsigned int min_order = mapping_min_folio_order(mapping);
 
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_swapbacked(folio), folio);
@@ -896,6 +901,7 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 		}
 	}
 
+	VM_BUG_ON_FOLIO(folio_order(folio) < min_order, folio);
 	xas_store(&xas, folio);
 	if (xas_error(&xas))
 		goto unlock;
@@ -1847,6 +1853,10 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		fgf_t fgp_flags, gfp_t gfp)
 {
 	struct folio *folio;
+	unsigned int min_order = mapping_min_folio_order(mapping);
+	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
+
+	index = round_down(index, min_nrpages);
 
 repeat:
 	folio = filemap_get_entry(mapping, index);
@@ -1886,7 +1896,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		folio_wait_stable(folio);
 no_page:
 	if (!folio && (fgp_flags & FGP_CREAT)) {
-		unsigned order = FGF_GET_ORDER(fgp_flags);
+		unsigned int order = max(min_order, FGF_GET_ORDER(fgp_flags));
 		int err;
 
 		if ((fgp_flags & FGP_WRITE) && mapping_can_writeback(mapping))
@@ -1914,8 +1924,13 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 			err = -ENOMEM;
 			if (order == 1)
 				order = 0;
+			if (order < min_order)
+				order = min_order;
 			if (order > 0)
 				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
+
+			VM_BUG_ON(index & ((1UL << order) - 1));
+
 			folio = filemap_alloc_folio(alloc_gfp, order);
 			if (!folio)
 				continue;
@@ -1929,7 +1944,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 				break;
 			folio_put(folio);
 			folio = NULL;
-		} while (order-- > 0);
+		} while (order-- > min_order);
 
 		if (err == -EEXIST)
 			goto repeat;
@@ -2424,7 +2439,8 @@ static int filemap_create_folio(struct file *file,
 	struct folio *folio;
 	int error;
 
-	folio = filemap_alloc_folio(mapping_gfp_mask(mapping), 0);
+	folio = filemap_alloc_folio(mapping_gfp_mask(mapping),
+				    mapping_min_folio_order(mapping));
 	if (!folio)
 		return -ENOMEM;
 
@@ -3682,7 +3698,8 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
 repeat:
 	folio = filemap_get_folio(mapping, index);
 	if (IS_ERR(folio)) {
-		folio = filemap_alloc_folio(gfp, 0);
+		folio = filemap_alloc_folio(gfp,
+				mapping_min_folio_order(mapping));
 		if (!folio)
 			return ERR_PTR(-ENOMEM);
 		err = filemap_add_folio(mapping, folio, index, gfp);
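The behavioural core of the __filemap_get_folio() change is the floor on
the fallback loop: allocation retries step down through smaller orders but
never below min_order. A toy model of just that loop follows; try_alloc()
is a hypothetical stand-in for filemap_alloc_folio(), with failures
invented so the fallback is visible.

#include <stdio.h>

/* Hypothetical allocator: pretend orders above 3 always fail. */
static int try_alloc(unsigned int order)
{
	return order <= 3;
}

int main(void)
{
	unsigned int min_order = 2;
	unsigned int fgf_order = 5;	/* hint from FGF_GET_ORDER() */
	unsigned int order = fgf_order > min_order ? fgf_order : min_order;

	/* Mirrors the patched loop: retry smaller orders, floor at min_order. */
	do {
		printf("trying order %u\n", order);
		if (try_alloc(order)) {
			printf("allocated order %u\n", order);
			break;
		}
	} while (order-- > min_order);
	return 0;
}

Before the patch the loop bottomed out at order 0; now an allocation
either succeeds at some order >= min_order or fails outright.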
From patchwork Tue Feb 13 09:37:03 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13554808
From: "Pankaj Raghav (Samsung)"
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
    kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
    p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
    willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com
Subject: [RFC v2 04/14] readahead: set file_ra_state->ra_pages to be at least mapping_min_order
Date: Tue, 13 Feb 2024 10:37:03 +0100
Message-ID: <20240213093713.1753368-5-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>
References: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Luis Chamberlain

Set file_ra_state->ra_pages in file_ra_state_init() to be at least the
number of pages implied by mapping_min_order if bdi->ra_pages is less
than that.

Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
Acked-by: Darrick J. Wong
---
 mm/readahead.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/readahead.c b/mm/readahead.c
index 2648ec4f0494..4fa7d0e65706 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -138,7 +138,12 @@ void
 file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
 {
+	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
+	unsigned int max_pages = inode_to_bdi(mapping->host)->io_pages;
+
 	ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
+	if (ra->ra_pages < min_nrpages && min_nrpages < max_pages)
+		ra->ra_pages = min_nrpages;
 	ra->prev_pos = -1;
 }
 EXPORT_SYMBOL_GPL(file_ra_state_init);
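The new check only bumps ra_pages when the minimum folio size demands it
and the bdi's io_pages cap still allows it. A trivial userspace check of
that condition, with invented numbers:

#include <stdio.h>

int main(void)
{
	unsigned int ra_pages = 8;	/* bdi->ra_pages */
	unsigned int io_pages = 256;	/* bdi->io_pages */
	unsigned int min_nrpages = 16;	/* mapping_min_folio_nrpages() */

	/* Mirrors the check added to file_ra_state_init(). */
	if (ra_pages < min_nrpages && min_nrpages < io_pages)
		ra_pages = min_nrpages;

	printf("ra_pages=%u\n", ra_pages);	/* prints ra_pages=16 */
	return 0;
}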
From patchwork Tue Feb 13 09:37:04 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13554810
From: "Pankaj Raghav (Samsung)"
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
    kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
    p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
    willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com
Subject: [RFC v2 05/14] readahead: align index to mapping_min_order in ondemand_ra and force_ra
Date: Tue, 13 Feb 2024 10:37:04 +0100
Message-ID: <20240213093713.1753368-6-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>
References: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Luis Chamberlain

Align ra->start and ra->size to mapping_min_order in
ondemand_readahead(), and align the index to mapping_min_order in
force_page_cache_ra(). This ensures that the folios allocated for
readahead and added to the page cache are aligned to mapping_min_order.

Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
Acked-by: Darrick J. Wong
---
 mm/readahead.c | 48 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 4fa7d0e65706..5e1ec7705c78 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -315,6 +315,7 @@ void force_page_cache_ra(struct readahead_control *ractl,
 	struct file_ra_state *ra = ractl->ra;
 	struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
 	unsigned long max_pages, index;
+	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
 
 	if (unlikely(!mapping->a_ops->read_folio && !mapping->a_ops->readahead))
 		return;
@@ -324,6 +325,13 @@ void force_page_cache_ra(struct readahead_control *ractl,
 	 * be up to the optimal hardware IO size
 	 */
 	index = readahead_index(ractl);
+	if (!IS_ALIGNED(index, min_nrpages)) {
+		unsigned long old_index = index;
+
+		index = round_down(index, min_nrpages);
+		nr_to_read += (old_index - index);
+	}
+
 	max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
 	nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
 	while (nr_to_read) {
@@ -332,6 +340,7 @@ void force_page_cache_ra(struct readahead_control *ractl,
 		if (this_chunk > nr_to_read)
 			this_chunk = nr_to_read;
 		ractl->_index = index;
+		VM_BUG_ON(!IS_ALIGNED(index, min_nrpages));
 		do_page_cache_ra(ractl, this_chunk, 0);
 
 		index += this_chunk;
@@ -344,11 +353,20 @@ void force_page_cache_ra(struct readahead_control *ractl,
  * for small size, x 4 for medium, and x 2 for large
  * for 128k (32 page) max ra
  * 1-2 page = 16k, 3-4 page 32k, 5-8 page = 64k, > 8 page = 128k initial
+ *
+ * For higher order address space requirements we ensure no initial reads
+ * are ever less than the min number of pages required.
+ *
+ * We *always* cap the max io size allowed by the device.
  */
-static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
+static unsigned long get_init_ra_size(unsigned long size,
+				      unsigned int min_nrpages,
+				      unsigned long max)
 {
 	unsigned long newsize = roundup_pow_of_two(size);
 
+	newsize = max_t(unsigned long, newsize, min_nrpages);
+
 	if (newsize <= max / 32)
 		newsize = newsize * 4;
 	else if (newsize <= max / 4)
@@ -356,6 +374,8 @@ static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
 	else
 		newsize = max;
 
+	VM_BUG_ON(newsize & (min_nrpages - 1));
+
 	return newsize;
 }
 
@@ -364,14 +384,16 @@ static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
  * return it as the new window size.
 */
 static unsigned long get_next_ra_size(struct file_ra_state *ra,
+				      unsigned int min_nrpages,
 				      unsigned long max)
 {
-	unsigned long cur = ra->size;
+	unsigned long cur = max(ra->size, min_nrpages);
 
 	if (cur < max / 16)
 		return 4 * cur;
 	if (cur <= max / 2)
 		return 2 * cur;
+
 	return max;
 }
 
@@ -561,7 +583,11 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	unsigned long add_pages;
 	pgoff_t index = readahead_index(ractl);
 	pgoff_t expected, prev_index;
-	unsigned int order = folio ? folio_order(folio) : 0;
+	unsigned int min_order = mapping_min_folio_order(ractl->mapping);
+	unsigned int min_nrpages = mapping_min_folio_nrpages(ractl->mapping);
+	unsigned int order = folio ? folio_order(folio) : min_order;
+
+	VM_BUG_ON(!IS_ALIGNED(ractl->_index, min_nrpages));
 
 	/*
 	 * If the request exceeds the readahead window, allow the read to
@@ -583,8 +609,8 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	expected = round_down(ra->start + ra->size - ra->async_size,
 			1UL << order);
 	if (index == expected || index == (ra->start + ra->size)) {
-		ra->start += ra->size;
-		ra->size = get_next_ra_size(ra, max_pages);
+		ra->start += round_down(ra->size, min_nrpages);
+		ra->size = get_next_ra_size(ra, min_nrpages, max_pages);
 		ra->async_size = ra->size;
 		goto readit;
 	}
@@ -603,13 +629,18 @@ static void ondemand_readahead(struct readahead_control *ractl,
 				max_pages);
 		rcu_read_unlock();
 
+		start = round_down(start, min_nrpages);
+
+		VM_BUG_ON(folio->index & (folio_nr_pages(folio) - 1));
+
 		if (!start || start - index > max_pages)
 			return;
 
 		ra->start = start;
 		ra->size = start - index;	/* old async_size */
+
 		ra->size += req_size;
-		ra->size = get_next_ra_size(ra, max_pages);
+		ra->size = get_next_ra_size(ra, min_nrpages, max_pages);
 		ra->async_size = ra->size;
 		goto readit;
 	}
@@ -646,7 +677,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 
 initial_readahead:
 	ra->start = index;
-	ra->size = get_init_ra_size(req_size, max_pages);
+	ra->size = get_init_ra_size(req_size, min_nrpages, max_pages);
 	ra->async_size = ra->size > req_size ? ra->size - req_size : ra->size;
 
 readit:
@@ -657,7 +688,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	 * Take care of maximum IO pages as above.
 	 */
 	if (index == ra->start && ra->size == ra->async_size) {
-		add_pages = get_next_ra_size(ra, max_pages);
+		add_pages = get_next_ra_size(ra, min_nrpages, max_pages);
 		if (ra->size + add_pages <= max_pages) {
 			ra->async_size = add_pages;
 			ra->size += add_pages;
@@ -668,6 +699,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	}
 
 	ractl->_index = ra->start;
+	VM_BUG_ON(!IS_ALIGNED(ractl->_index, min_nrpages));
 	page_cache_ra_order(ractl, ra, order);
 }
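To see what threading min_nrpages through the window sizing does, here is
a userspace copy of the patched get_init_ra_size() logic.
roundup_pow_of_two() is reimplemented locally and the inputs are examples,
so this is a sketch rather than the kernel function itself.

#include <stdio.h>

/* Local stand-in for the kernel's roundup_pow_of_two(). */
static unsigned long roundup_pow_of_two(unsigned long x)
{
	unsigned long n = 1;

	while (n < x)
		n <<= 1;
	return n;
}

/* Userspace copy of the patched sizing logic. */
static unsigned long get_init_ra_size(unsigned long size,
				      unsigned int min_nrpages,
				      unsigned long max)
{
	unsigned long newsize = roundup_pow_of_two(size);

	if (newsize < min_nrpages)	/* the max_t() in the patch */
		newsize = min_nrpages;

	if (newsize <= max / 32)
		newsize = newsize * 4;
	else if (newsize <= max / 4)
		newsize = newsize * 2;
	else
		newsize = max;
	return newsize;
}

int main(void)
{
	/* A 1-page read with min order 2 now opens a 16-page window... */
	printf("%lu\n", get_init_ra_size(1, 4, 128));	/* 16 */
	/* ...whereas with no minimum it would have been 4 pages. */
	printf("%lu\n", get_init_ra_size(1, 1, 128));	/* 4 */
	return 0;
}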
From patchwork Tue Feb 13 09:37:05 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13554811
From: "Pankaj Raghav (Samsung)"
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
    kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
    p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
    willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com
Subject: [RFC v2 06/14] readahead: rework loop in page_cache_ra_unbounded()
Date: Tue, 13 Feb 2024 10:37:05 +0100
Message-ID: <20240213093713.1753368-7-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>
References: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Hannes Reinecke

Rework the loop in page_cache_ra_unbounded() to advance with the number
of pages in a folio instead of just one page at a time.

Signed-off-by: Hannes Reinecke
Co-developed-by: Pankaj Raghav
Signed-off-by: Pankaj Raghav
Acked-by: Darrick J. Wong
---
 mm/readahead.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 5e1ec7705c78..13b62cbd3b79 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -213,7 +213,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
-	unsigned long i;
+	unsigned long i = 0;
 
 	/*
 	 * Partway through the readahead operation, we will have added
@@ -231,7 +231,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	/*
 	 * Preallocate as many pages as we will need.
 	 */
-	for (i = 0; i < nr_to_read; i++) {
+	while (i < nr_to_read) {
 		struct folio *folio = xa_load(&mapping->i_pages, index + i);
 
 		if (folio && !xa_is_value(folio)) {
@@ -244,8 +244,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			 * not worth getting one just for that.
 			 */
 			read_pages(ractl);
-			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			ractl->_index += folio_nr_pages(folio);
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 
@@ -257,13 +257,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			folio_put(folio);
 			read_pages(ractl);
 			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 		if (i == nr_to_read - lookahead_size)
 			folio_set_readahead(folio);
 		ractl->_workingset |= folio_test_workingset(folio);
-		ractl->_nr_pages++;
+		ractl->_nr_pages += folio_nr_pages(folio);
+		i += folio_nr_pages(folio);
 	}
 
 	/*
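The rework changes the iteration stride, not the outcome: the loop index
now jumps by the size of whatever folio it finds. A simplified model of
that stride behaviour follows; the slots[] array is an invented stand-in
for the xa_load() results, holding each folio's page count (0 for an
empty slot), and the batching via read_pages() is omitted.

#include <stdio.h>

int main(void)
{
	/* Slot 0 holds a 4-page folio, slot 5 a 2-page one, slot 7 a page. */
	unsigned int slots[16] = { 4, 0, 0, 0, 0, 2, 0, 1 };
	unsigned long nr_to_read = 8;
	unsigned long i = 0;

	while (i < nr_to_read) {
		if (slots[i]) {
			/* Present folio: skip all of its pages at once. */
			printf("folio of %u pages at %lu, skipping\n",
			       slots[i], i);
			i += slots[i];
			continue;
		}
		printf("would allocate at %lu\n", i);
		i++;
	}
	return 0;
}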
From patchwork Tue Feb 13 09:37:06 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13554812
From: "Pankaj Raghav (Samsung)"
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
    kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
    p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
    willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com
Subject: [RFC v2 07/14] readahead: allocate folios with mapping_min_order in ra_(unbounded|order)
Date: Tue, 13 Feb 2024 10:37:06 +0100
Message-ID: <20240213093713.1753368-8-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>
References: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Pankaj Raghav

Allocate folios with at least mapping_min_order in
page_cache_ra_unbounded() and page_cache_ra_order(), as we need to
guarantee a minimum order in the page cache.

Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
Acked-by: Darrick J. Wong
---
 mm/readahead.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 13b62cbd3b79..a361fba18674 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -214,6 +214,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 	unsigned long i = 0;
+	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
 
 	/*
 	 * Partway through the readahead operation, we will have added
@@ -235,6 +236,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 		struct folio *folio = xa_load(&mapping->i_pages, index + i);
 
 		if (folio && !xa_is_value(folio)) {
+			long nr_pages = folio_nr_pages(folio);
+
 			/*
 			 * Page already present? Kick off the current batch
 			 * of contiguous pages before continuing with the
@@ -244,19 +247,31 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			 * not worth getting one just for that.
 			 */
 			read_pages(ractl);
-			ractl->_index += folio_nr_pages(folio);
+
+			/*
+			 * Move the ractl->_index by at least min_pages
+			 * if the folio got truncated to respect the
+			 * alignment constraint in the page cache.
+			 *
+			 */
+			if (mapping != folio->mapping)
+				nr_pages = min_nrpages;
+
+			VM_BUG_ON_FOLIO(nr_pages < min_nrpages, folio);
+			ractl->_index += nr_pages;
 			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 
-		folio = filemap_alloc_folio(gfp_mask, 0);
+		folio = filemap_alloc_folio(gfp_mask,
+					    mapping_min_folio_order(mapping));
 		if (!folio)
 			break;
 		if (filemap_add_folio(mapping, folio, index + i,
 					gfp_mask) < 0) {
 			folio_put(folio);
 			read_pages(ractl);
-			ractl->_index++;
+			ractl->_index += min_nrpages;
 			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
@@ -516,6 +531,7 @@ void page_cache_ra_order(struct readahead_control *ractl,
 {
 	struct address_space *mapping = ractl->mapping;
 	pgoff_t index = readahead_index(ractl);
+	unsigned int min_order = mapping_min_folio_order(mapping);
 	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
 	pgoff_t mark = index + ra->size - ra->async_size;
 	int err = 0;
@@ -542,11 +558,17 @@ void page_cache_ra_order(struct readahead_control *ractl,
 		if (index & ((1UL << order) - 1))
 			order = __ffs(index);
 		/* Don't allocate pages past EOF */
-		while (index + (1UL << order) - 1 > limit)
+		while (order > min_order && index + (1UL << order) - 1 > limit)
 			order--;
 		/* THP machinery does not support order-1 */
 		if (order == 1)
 			order = 0;
+
+		if (order < min_order)
+			order = min_order;
+
+		VM_BUG_ON(index & ((1UL << order) - 1));
+
 		err = ra_alloc_folio(ractl, index, mark, order, gfp);
 		if (err)
 			break;
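The clamping added to page_cache_ra_order() is worth exercising in
isolation. This sketch copies that small block with invented index and
limit values, and shows min_order winning over the EOF cap: the order
stops shrinking at min_order even though the folio would still cross the
end of the file.

#include <stdio.h>

int main(void)
{
	unsigned int min_order = 2;
	unsigned int order = 4;
	unsigned long index = 48;
	unsigned long limit = 50;	/* last valid page index */

	/* Don't allocate pages past EOF, but never go below min_order. */
	while (order > min_order && index + (1UL << order) - 1 > limit)
		order--;
	/* THP machinery does not support order-1 */
	if (order == 1)
		order = 0;
	if (order < min_order)
		order = min_order;

	printf("order=%u\n", order);	/* 2, although 48 + 4 - 1 > 50 */
	return 0;
}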
From patchwork Tue Feb 13 09:37:07 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13554813
From: "Pankaj Raghav (Samsung)"
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
    kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
    p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
    willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com
Subject: [RFC v2 08/14] mm: do not split a folio if it has minimum folio order requirement
Date: Tue, 13 Feb 2024 10:37:07 +0100
Message-ID: <20240213093713.1753368-9-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>
References: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Pankaj Raghav

As we don't have a way to split a folio to an arbitrary lower folio order
yet, avoid splitting the folio in split_huge_page_to_list() if its
mapping has a minimum folio order requirement.

Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
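Before the diff, a small numeric illustration of why the split has to be
refused for now: split_huge_page_to_list() can only produce order-0
folios, which would violate any non-zero minimum, while the future
arbitrary-order split mentioned in the XXX comment below would produce
2^(order - min_order) folios of min_order. The values here are invented.

#include <stdio.h>

int main(void)
{
	unsigned int folio_order = 4;	/* folio being split */
	unsigned int min_order = 2;	/* mapping_min_folio_order() */

	if (min_order) {
		/* A min-order-aware split would yield 2^(4-2) = 4 folios
		 * of order 2; until that exists, return -EAGAIN instead. */
		unsigned int pieces = 1U << (folio_order - min_order);

		printf("order-0 split rejected; min-order split would give "
		       "%u order-%u folios\n", pieces, min_order);
	}
	return 0;
}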
+ */ + if (mapping_min_folio_order(mapping)) { + ret = -EAGAIN; + goto out; + } + gfp = current_gfp_context(mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK); From patchwork Tue Feb 13 09:37:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Pankaj Raghav (Samsung)" X-Patchwork-Id: 13554814 Received: from mout-p-103.mailbox.org (mout-p-103.mailbox.org [80.241.56.161]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 013A0524AC; Tue, 13 Feb 2024 09:37:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.161 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707817078; cv=none; b=ahN2zpvqVeJEi8jJjPv8IKzUVmPmEhXIVQZh7A9uyZNt2lkLTzIzrUSxXmkIvvkSP7BsaU7T7kA4nccgk6DTGZrhxQZXx0k9aUNCYklAas/vOTNOesYp8cTSkxko9mTM8hNk9tT+BwV67y1uTB4LsRAj3VPOTotmH/JakSKpXjA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707817078; c=relaxed/simple; bh=t+xAGesPvg8PRQJY0BLXZ/TBi6+F/tBh4Y2eOTFzprs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=B+CE8qF/QSHedNjHpYh/M2jYi0XJ2HtHXc3jS9GYFWAf78dx+bv4l2VsvrPDlUVgJDNxX2LvNm1MSexgU2UBjN9Prp9Lf97ZcOQJ2q2LXfJOB0McA0PzGi+qsBFlJf4z17CgjxhXSgMOYnsSxdAb6VYd3TFg8UIwwa84PueX+nE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com; spf=pass smtp.mailfrom=pankajraghav.com; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b=vk8rNVHr; arc=none smtp.client-ip=80.241.56.161 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pankajraghav.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pankajraghav.com header.i=@pankajraghav.com header.b="vk8rNVHr" Received: from smtp102.mailbox.org (smtp102.mailbox.org [10.196.197.102]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-103.mailbox.org (Postfix) with ESMTPS id 4TYx805XGYz9t7w; Tue, 13 Feb 2024 10:37:52 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1707817072; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WTxYFw2wNX7P5ZclG9FyVzbP0PebvoXLD7s4qqN4OLQ=; b=vk8rNVHrH9GnDjyMJds3YOI6FHkGLa5jobWuxpBSfDc8eim75cFZeal2zSOqMPmtRTP7Lt jiU1XTqkGbKSf4FyAvu8y5Kv3XT7peTK4JTd40qH112YpZ5MhTZqagUFznvuEPJGLeRzDO 1nvCvtEaI6Vb2cmva8byqNu3rPyCcR7y8jkug7J5YUuPwYCi6zhpTo6lN4rddW3W0Zq4xX 9jeP4tpx214SFXzzjp5l3ltv5glok+YytXEQGtoCIdhdlUkbWSpyoosppXs3wr6iTUM87n zVspg8aNshEzNRnan6HsNc0cCrqVf/Ti1mskfH4RqObvWYjl2B4tNvGSyEDuuA== From: "Pankaj Raghav (Samsung)" To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org, kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com, p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de, willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com Subject: [RFC v2 09/14] mm: Support order-1 folios in the page cache Date: Tue, 
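The effect on callers can be illustrated with a small sketch (purely illustrative; these are not the kernel types or APIs, just a model of the new control flow): any attempt to split a file-backed folio whose mapping declares a minimum order now fails with -EAGAIN, so callers such as reclaim keep the folio whole and retry later.

#include <errno.h>
#include <stdio.h>

struct mapping { unsigned int min_folio_order; };

/* Model of the new early-out in split_huge_page_to_list() */
static int try_split(const struct mapping *m)
{
        if (m->min_folio_order)
                return -EAGAIN; /* cannot split below the mapping's minimum */
        return 0;               /* split proceeds as before */
}

int main(void)
{
        struct mapping lbs = { .min_folio_order = 2 };   /* e.g. a 16k block fs */
        struct mapping normal = { .min_folio_order = 0 };

        printf("lbs: %d, normal: %d\n", try_split(&lbs), try_split(&normal));
        return 0;
}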
From patchwork Tue Feb 13 09:37:08 2024
X-Patchwork-Id: 13554814
From: "Pankaj Raghav (Samsung)"
Subject: [RFC v2 09/14] mm: Support order-1 folios in the page cache
Date: Tue, 13 Feb 2024 10:37:08 +0100
Message-ID: <20240213093713.1753368-10-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: "Matthew Wilcox (Oracle)"

Folios of order 1 have no space to store the deferred list. This is not
a problem for the page cache as file-backed folios are never placed on
the deferred list. All we need to do is prevent the core MM from
touching the deferred list for order 1 folios and remove the code which
prevented us from allocating order 1 folios.

Link: https://lore.kernel.org/linux-mm/90344ea7-4eec-47ee-5996-0c22f42d6a6a@google.com/
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Hannes Reinecke
---
 include/linux/huge_mm.h |  7 +++++--
 mm/filemap.c            |  2 --
 mm/huge_memory.c        | 23 ++++++++++++++++++-----
 mm/internal.h           |  4 +---
 mm/readahead.c          |  3 ---
 5 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 5adb86af35fc..916a2a539517 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -263,7 +263,7 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
 unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags);
 
-void folio_prep_large_rmappable(struct folio *folio);
+struct folio *folio_prep_large_rmappable(struct folio *folio);
 bool can_split_folio(struct folio *folio, int *pextra_pins);
 int split_huge_page_to_list(struct page *page, struct list_head *list);
 static inline int split_huge_page(struct page *page)
@@ -410,7 +410,10 @@ static inline unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
 	return 0;
 }
 
-static inline void folio_prep_large_rmappable(struct folio *folio) {}
+static inline struct folio *folio_prep_large_rmappable(struct folio *folio)
+{
+	return folio;
+}
 
 #define transparent_hugepage_flags 0UL

diff --git a/mm/filemap.c b/mm/filemap.c
index 7a6e15c47150..c8205a534532 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1922,8 +1922,6 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 			gfp_t alloc_gfp = gfp;
 
 			err = -ENOMEM;
-			if (order == 1)
-				order = 0;
 			if (order < min_order)
 				order = min_order;
 			if (order > 0)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d897efc51025..6ec3417638a1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -788,11 +788,15 @@ struct deferred_split *get_deferred_split_queue(struct folio *folio)
 }
 #endif
 
-void folio_prep_large_rmappable(struct folio *folio)
+struct folio *folio_prep_large_rmappable(struct folio *folio)
 {
-	VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
-	INIT_LIST_HEAD(&folio->_deferred_list);
+	if (!folio || !folio_test_large(folio))
+		return folio;
+	if (folio_order(folio) > 1)
+		INIT_LIST_HEAD(&folio->_deferred_list);
 	folio_set_large_rmappable(folio);
+
+	return folio;
 }
 
 static inline bool is_transparent_hugepage(struct folio *folio)
@@ -3095,7 +3099,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	/* Prevent deferred_split_scan() touching ->_refcount */
 	spin_lock(&ds_queue->split_queue_lock);
 	if (folio_ref_freeze(folio, 1 + extra_pins)) {
-		if (!list_empty(&folio->_deferred_list)) {
+		if (folio_order(folio) > 1 &&
+		    !list_empty(&folio->_deferred_list)) {
 			ds_queue->split_queue_len--;
 			list_del(&folio->_deferred_list);
 		}
@@ -3146,6 +3151,9 @@ void folio_undo_large_rmappable(struct folio *folio)
 	struct deferred_split *ds_queue;
 	unsigned long flags;
 
+	if (folio_order(folio) <= 1)
+		return;
+
 	/*
 	 * At this point, there is no one trying to add the folio to
 	 * deferred_list. If folio is not in deferred_list, it's safe
@@ -3171,7 +3179,12 @@ void deferred_split_folio(struct folio *folio)
 #endif
 	unsigned long flags;
 
-	VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
+	/*
+	 * Order 1 folios have no space for a deferred list, but we also
+	 * won't waste much memory by not adding them to the deferred list.
+	 */
+	if (folio_order(folio) <= 1)
+		return;
 
 	/*
 	 * The try_to_unmap() in page reclaim path might reach here too,

diff --git a/mm/internal.h b/mm/internal.h
index f309a010d50f..5174b5b0c344 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -419,9 +419,7 @@ static inline struct folio *page_rmappable_folio(struct page *page)
 {
 	struct folio *folio = (struct folio *)page;
 
-	if (folio && folio_order(folio) > 1)
-		folio_prep_large_rmappable(folio);
-	return folio;
+	return folio_prep_large_rmappable(folio);
 }
 
 static inline void prep_compound_head(struct page *page, unsigned int order)

diff --git a/mm/readahead.c b/mm/readahead.c
index a361fba18674..7d5f6a8792a8 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -560,9 +560,6 @@ void page_cache_ra_order(struct readahead_control *ractl,
 		/* Don't allocate pages past EOF */
 		while (order > min_order && index + (1UL << order) - 1 > limit)
 			order--;
-		/* THP machinery does not support order-1 */
-		if (order == 1)
-			order = 0;
 
 		if (order < min_order)
 			order = min_order;
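The space constraint is easy to see from the folio size itself: the deferred-split list_head lives in the folio's second tail page (the third page overall), and an order-1 folio is only two pages, so there is simply nowhere to put it. A quick sketch of the arithmetic (plain C, nothing kernel-specific; the "page index 2" placement is the layout assumed by this series):

#include <stdio.h>

int main(void)
{
        for (unsigned int order = 0; order <= 2; order++) {
                unsigned long pages = 1UL << order;

                /* the deferred list needs page index 2 to exist */
                printf("order %u: %lu pages, deferred list %s\n",
                       order, pages, pages > 2 ? "fits" : "does not fit");
        }
        return 0;
}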
From patchwork Tue Feb 13 09:37:09 2024
X-Patchwork-Id: 13554815
From: "Pankaj Raghav (Samsung)"
Subject: [RFC v2 10/14] iomap: fix iomap_dio_zero() for fs bs > system page size
Date: Tue, 13 Feb 2024 10:37:09 +0100
Message-ID: <20240213093713.1753368-11-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Pankaj Raghav

iomap_dio_zero() will pad an fs block with zeroes if the direct IO size
is smaller than the fs block size. It has an implicit assumption that
the fs block size is no larger than the page size, which is true for
most filesystems at the moment. If the block size is larger than the
page size, this will send the contents of the page next to the zero
page (as len > PAGE_SIZE) to the underlying block device, causing FS
corruption.

iomap is generic infrastructure and it should not make any assumptions
about the fs block size and the page size of the system.

Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
Reviewed-by: Darrick J. Wong
---
 fs/iomap/direct-io.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index bcd3f8cf5ea4..04f6c5548136 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -239,14 +239,23 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
 	struct page *page = ZERO_PAGE(0);
 	struct bio *bio;
 
-	bio = iomap_dio_alloc_bio(iter, dio, 1, REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
+	WARN_ON_ONCE(len > (BIO_MAX_VECS * PAGE_SIZE));
+
+	bio = iomap_dio_alloc_bio(iter, dio, BIO_MAX_VECS,
+				  REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
 	fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
 				  GFP_KERNEL);
+
 	bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos);
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
-	__bio_add_page(bio, page, len, 0);
+	while (len) {
+		unsigned int io_len = min_t(unsigned int, len, PAGE_SIZE);
+
+		__bio_add_page(bio, page, io_len, 0);
+		len -= io_len;
+	}
 	iomap_dio_submit_bio(iter, dio, bio, pos);
 }
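As a sanity check on the sizes involved (a stand-alone sketch, not kernel code; the chunking of ZERO_PAGE is modelled with plain integers): with 4 KiB pages, one 64 KiB fs block needs 16 bio vectors, comfortably below the BIO_MAX_VECS bound of 256 that the new WARN_ON_ONCE() enforces.

#include <stdio.h>

#define PAGE_SIZE       4096u
#define BIO_MAX_VECS    256u    /* upper bound on vectors per bio */

int main(void)
{
        unsigned int len = 65536;       /* e.g. one 64 KiB fs block */
        unsigned int vecs = 0;

        while (len) {                   /* mirrors the loop in the patch */
                unsigned int io_len = len < PAGE_SIZE ? len : PAGE_SIZE;

                vecs++;
                len -= io_len;
        }
        printf("vectors used: %u (max %u)\n", vecs, BIO_MAX_VECS);
        return 0;
}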
From patchwork Tue Feb 13 09:37:10 2024
X-Patchwork-Id: 13554816
From: "Pankaj Raghav (Samsung)"
Subject: [RFC v2 11/14] xfs: expose block size in stat
Date: Tue, 13 Feb 2024 10:37:10 +0100
Message-ID: <20240213093713.1753368-12-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Dave Chinner

For block size larger than page size, the unit of efficient IO is the
block size, not the page size. Leaving stat() to report PAGE_SIZE as
the block size causes test programs like fsx to issue illegal ranges
for operations that require block size alignment (e.g. fallocate()
insert range). Hence update the preferred IO size to reflect the block
size in this case.

Signed-off-by: Dave Chinner
[mcgrof: forward rebase in consideration for commit dd2d535e3fb29d
("xfs: cleanup calculating the stat optimal I/O size")]
Signed-off-by: Luis Chamberlain
---
 fs/xfs/xfs_iops.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index a0d77f5f512e..8791a9d80897 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -515,6 +515,8 @@ xfs_stat_blksize(
 	struct xfs_inode	*ip)
 {
 	struct xfs_mount	*mp = ip->i_mount;
+	unsigned long		default_size = max_t(unsigned long, PAGE_SIZE,
+						     mp->m_sb.sb_blocksize);
 
 	/*
 	 * If the file blocks are being allocated from a realtime volume, then
@@ -543,7 +545,7 @@ xfs_stat_blksize(
 		return 1U << mp->m_allocsize_log;
 	}
 
-	return PAGE_SIZE;
+	return default_size;
 }
 
 STATIC int
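The user-visible effect is in stat(2)'s st_blksize. A quick check with ordinary POSIX code (the path is a placeholder, and this assumes none of the earlier returns in xfs_stat_blksize() fire, e.g. no realtime volume or allocsize hints): on a 16 KiB block size filesystem running on a 4 KiB page system, this would now print 16384 instead of 4096.

#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
        struct stat st;

        if (stat("/mnt/test/file", &st) == 0)
                printf("st_blksize = %ld\n", (long)st.st_blksize);
        return 0;
}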
From patchwork Tue Feb 13 09:37:11 2024
X-Patchwork-Id: 13554817
From: "Pankaj Raghav (Samsung)"
Subject: [RFC v2 12/14] xfs: make the calculation generic in xfs_sb_validate_fsb_count()
Date: Tue, 13 Feb 2024 10:37:11 +0100
Message-ID: <20240213093713.1753368-13-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Pankaj Raghav

Instead of assuming that PAGE_SHIFT is always higher than the blocklog,
make the calculation generic so that the page cache count can be
calculated correctly for LBS.

Signed-off-by: Pankaj Raghav
---
 fs/xfs/xfs_mount.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index aabb25dc3efa..bfbaaecaf668 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -133,9 +133,13 @@ xfs_sb_validate_fsb_count(
 {
 	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
 	ASSERT(sbp->sb_blocklog >= BBSHIFT);
+	unsigned long mapping_count;
+	uint64_t bytes = nblocks << sbp->sb_blocklog;
+
+	mapping_count = bytes >> PAGE_SHIFT;
 
 	/* Limited by ULONG_MAX of page cache index */
-	if (nblocks >> (PAGE_SHIFT - sbp->sb_blocklog) > ULONG_MAX)
+	if (mapping_count > ULONG_MAX)
 		return -EFBIG;
 	return 0;
 }
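The point of the rework is that the old expression shifts nblocks by (PAGE_SHIFT - sb_blocklog), which goes negative once the blocklog exceeds PAGE_SHIFT, and a negative shift count is undefined in C. Computing the byte count first keeps the arithmetic valid in both directions. A stand-alone sketch with illustrative values (each 64 KiB block spans 16 page cache pages on a 4 KiB page system):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12   /* 4 KiB pages */

int main(void)
{
        uint64_t nblocks = 1000000;
        unsigned int blocklog = 16;     /* 64 KiB blocks > page size */
        uint64_t bytes = nblocks << blocklog;
        unsigned long mapping_count = bytes >> PAGE_SHIFT;

        /* 1000000 blocks * 16 pages/block = 16000000 page cache entries */
        printf("mapping_count = %lu\n", mapping_count);
        return 0;
}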
From patchwork Tue Feb 13 09:37:12 2024
X-Patchwork-Id: 13554818
From: "Pankaj Raghav (Samsung)"
Subject: [RFC v2 13/14] xfs: add an experimental CONFIG_XFS_LBS option
Date: Tue, 13 Feb 2024 10:37:12 +0100
Message-ID: <20240213093713.1753368-14-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Pankaj Raghav

Add an experimental CONFIG_XFS_LBS option to enable LBS support in XFS.
Retain the ASSERT for PAGE_SHIFT if CONFIG_XFS_LBS is not enabled.

Signed-off-by: Pankaj Raghav
Reviewed-by: Darrick J. Wong
---
 fs/xfs/Kconfig     | 11 +++++++++++
 fs/xfs/xfs_mount.c |  4 +++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index 567fb37274d3..6b0db2f7dc13 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -216,3 +216,14 @@ config XFS_ASSERT_FATAL
 	  result in warnings.
 
 	  This behavior can be modified at runtime via sysfs.
+
+config XFS_LBS
+	bool "XFS large block size support (EXPERIMENTAL)"
+	depends on XFS_FS
+	help
+	  Set Y to enable support for filesystem block size > system's
+	  base page size.
+
+	  This feature is considered EXPERIMENTAL. Use with caution!
+
+	  If unsure, say N.

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index bfbaaecaf668..596aa2cdefbc 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -131,11 +131,13 @@ xfs_sb_validate_fsb_count(
 	xfs_sb_t	*sbp,
 	uint64_t	nblocks)
 {
-	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
 	ASSERT(sbp->sb_blocklog >= BBSHIFT);
 	unsigned long mapping_count;
 	uint64_t bytes = nblocks << sbp->sb_blocklog;
 
+	if (!IS_ENABLED(CONFIG_XFS_LBS))
+		ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
+
 	mapping_count = bytes >> PAGE_SHIFT;
 
 	/* Limited by ULONG_MAX of page cache index */
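IS_ENABLED() evaluates to a compile-time constant, so when CONFIG_XFS_LBS is off the compiler still sees the legacy ASSERT, and when it is on the whole branch is discarded. A rough model of that behaviour (an assumed simplification; the real kernel macro does tokenized matching against the generated autoconf.h rather than a plain expansion):

#include <assert.h>

#define CONFIG_XFS_LBS  0       /* flip to 1 to compile the assertion away */
#define IS_ENABLED(x)   (x)     /* simplified stand-in for the kernel macro */

static void validate(unsigned int page_shift, unsigned int blocklog)
{
        if (!IS_ENABLED(CONFIG_XFS_LBS))
                assert(page_shift >= blocklog); /* legacy invariant */
}

int main(void)
{
        validate(12, 12);       /* 4 KiB pages, 4 KiB blocks: passes */
        return 0;
}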
From patchwork Tue Feb 13 09:37:13 2024
X-Patchwork-Id: 13554819
From: "Pankaj Raghav (Samsung)"
Subject: [RFC v2 14/14] xfs: enable block size larger than page size support
Date: Tue, 13 Feb 2024 10:37:13 +0100
Message-ID: <20240213093713.1753368-15-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: Pankaj Raghav

The page cache now has the ability to enforce a minimum order when
allocating a folio, which is a prerequisite for supporting block
size > page size. Enable it in XFS under CONFIG_XFS_LBS.

Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
---
 fs/xfs/xfs_icache.c | 8 ++++++--
 fs/xfs/xfs_super.c  | 8 +++-----
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index dba514a2c84d..9de81caf7ad4 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -73,6 +73,7 @@ xfs_inode_alloc(
 	xfs_ino_t		ino)
 {
 	struct xfs_inode	*ip;
+	int			min_order = 0;
 
 	/*
 	 * XXX: If this didn't occur in transactions, we could drop GFP_NOFAIL
@@ -88,7 +89,8 @@ xfs_inode_alloc(
 	/* VFS doesn't initialise i_mode or i_state! */
 	VFS_I(ip)->i_mode = 0;
 	VFS_I(ip)->i_state = 0;
-	mapping_set_large_folios(VFS_I(ip)->i_mapping);
+	min_order = max(min_order, ilog2(mp->m_sb.sb_blocksize) - PAGE_SHIFT);
+	mapping_set_folio_orders(VFS_I(ip)->i_mapping, min_order, MAX_PAGECACHE_ORDER);
 
 	XFS_STATS_INC(mp, vn_active);
 	ASSERT(atomic_read(&ip->i_pincount) == 0);
@@ -313,6 +315,7 @@ xfs_reinit_inode(
 	dev_t			dev = inode->i_rdev;
 	kuid_t			uid = inode->i_uid;
 	kgid_t			gid = inode->i_gid;
+	int			min_order = 0;
 
 	error = inode_init_always(mp->m_super, inode);
 
@@ -323,7 +326,8 @@ xfs_reinit_inode(
 	inode->i_rdev = dev;
 	inode->i_uid = uid;
 	inode->i_gid = gid;
-	mapping_set_large_folios(inode->i_mapping);
+	min_order = max(min_order, ilog2(mp->m_sb.sb_blocksize) - PAGE_SHIFT);
+	mapping_set_folio_orders(inode->i_mapping, min_order, MAX_PAGECACHE_ORDER);
 
 	return error;
 }

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 5a2512d20bd0..6a3f0f6727eb 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1625,13 +1625,11 @@ xfs_fs_fill_super(
 		goto out_free_sb;
 	}
 
-	/*
-	 * Until this is fixed only page-sized or smaller data blocks work.
-	 */
-	if (mp->m_sb.sb_blocksize > PAGE_SIZE) {
+	if (!IS_ENABLED(CONFIG_XFS_LBS) && mp->m_sb.sb_blocksize > PAGE_SIZE) {
 		xfs_warn(mp,
 "File system with blocksize %d bytes. "
-"Only pagesize (%ld) or less will currently work.",
+"Only pagesize (%ld) or less will currently work. "
+"Enable Experimental CONFIG_XFS_LBS for this support",
 			mp->m_sb.sb_blocksize, PAGE_SIZE);
 		error = -ENOSYS;
 		goto out_free_sb;
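The minimum order set here follows directly from the block size: with 4 KiB pages, a 16 KiB block gives ilog2(16384) - 12 = 2 and a 64 KiB block gives 4, while block sizes at or below the page size stay at order 0. A stand-alone sketch of the computation (illustrative only; min_folio_order() is not a kernel helper):

#include <stdio.h>

#define PAGE_SHIFT 12   /* 4 KiB pages */

static int min_folio_order(unsigned int blocksize)
{
        /* 63 - clzll is ilog2 for a power-of-two block size */
        int order = 63 - __builtin_clzll(blocksize) - PAGE_SHIFT;

        return order > 0 ? order : 0;   /* block size <= page size: order 0 */
}

int main(void)
{
        printf("16k blocks -> order %d\n", min_folio_order(16384)); /* 2 */
        printf("64k blocks -> order %d\n", min_folio_order(65536)); /* 4 */
        printf("4k blocks  -> order %d\n", min_folio_order(4096));  /* 0 */
        return 0;
}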