diff mbox series

[v7,2/3] mtd: rawnand: Enable monolithic read when reading subpages

Message ID 20240430-loongson1-nand-v7-2-60787c314fa4@gmail.com (mailing list archive)
State Handled Elsewhere
Headers show
Series Add support for Loongson-1 NAND | expand

Commit Message

Keguang Zhang via B4 Relay April 30, 2024, 11:11 a.m. UTC
From: Keguang Zhang <keguang.zhang@gmail.com>

nand_read_subpage() reads data and ECC data by two separate
operations.
This patch allows the NAND controllers who support
monolithic page read to do subpage read by a single operation,
which is more effective than nand_read_subpage().
---
Changes in v7:
- None

Changes in v6:
- A newly added patch

Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com>
---
 drivers/mtd/nand/raw/nand_base.c | 5 +++--
 include/linux/mtd/rawnand.h      | 5 +++++
 2 files changed, 8 insertions(+), 2 deletions(-)

Comments

Miquel Raynal May 6, 2024, 7:17 a.m. UTC | #1
Hi,

devnull+keguang.zhang.gmail.com@kernel.org wrote on Tue, 30 Apr 2024
19:11:11 +0800:

> From: Keguang Zhang <keguang.zhang@gmail.com>
> 
> nand_read_subpage() reads data and ECC data by two separate
> operations.
> This patch allows the NAND controllers who support
> monolithic page read to do subpage read by a single operation,
> which is more effective than nand_read_subpage().

I am a bit puzzled by this change. Usually nand_read_subpage is used
for optimizations (when less data than a full page must be retrieved).
I know it may be used in other cases (because it's easier for the core
in order to support a wide range of controllers). Can you please show a
speed test showing the results before I consider merging this patch?

The monolithic thing was not supposed to improve throughput but to help
with very limited controllers.

Thanks,
Miquèl
Keguang Zhang May 20, 2024, 10:42 a.m. UTC | #2
On Mon, May 6, 2024 at 3:17 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi,
>
> devnull+keguang.zhang.gmail.com@kernel.org wrote on Tue, 30 Apr 2024
> 19:11:11 +0800:
>
> > From: Keguang Zhang <keguang.zhang@gmail.com>
> >
> > nand_read_subpage() reads data and ECC data by two separate
> > operations.
> > This patch allows the NAND controllers who support
> > monolithic page read to do subpage read by a single operation,
> > which is more effective than nand_read_subpage().
>
> I am a bit puzzled by this change. Usually nand_read_subpage is used
> for optimizations (when less data than a full page must be retrieved).
> I know it may be used in other cases (because it's easier for the core
> in order to support a wide range of controllers). Can you please show a
> speed test showing the results before I consider merging this patch?
>
With this patch:
# flash_speed -c 128 -d /dev/mtd1
scanning for bad eraseblocks
scanned 128 eraseblocks, 0 are bad
testing eraseblock write speed
eraseblock write speed is 2112 KiB/s
testing eraseblock read speed
eraseblock read speed is 3454 KiB/s
testing page write speed
page write speed is 1915 KiB/s
testing page read speed
page read speed is 2999 KiB/s
testing 2 page write speed
2 page write speed is 2000 KiB/s
testing 2 page read speed
2 page read speed is 3207 KiB/s
Testing erase speed
erase speed is 72495 KiB/s
Testing 2x multi-block erase speed
2x multi-block erase speed is 74135 KiB/s
Testing 4x multi-block erase speed
4x multi-block erase speed is 74812 KiB/s
Testing 8x multi-block erase speed
8x multi-block erase speed is 75502 KiB/s
Testing 16x multi-block erase speed
16x multi-block erase speed is 75851 KiB/s
Testing 32x multi-block erase speed
32x multi-block erase speed is 75851 KiB/s
Testing 64x multi-block erase speed
64x multi-block erase speed is 76204 KiB/s
finished

Without this patch:
# flash_speed -c 128 -d /dev/mtd1
scanning for bad eraseblocks
scanned 128 eraseblocks, 0 are bad
testing eraseblock write speed
eraseblock write speed is 2074 KiB/s
testing eraseblock read speed
eraseblock read speed is 2895 KiB/s
testing page write speed
page write speed is 998 KiB/s
testing page read speed
page read speed is 1499 KiB/s
testing 2 page write speed
2 page write speed is 1002 KiB/s
testing 2 page read speed
2 page read speed is 1554 KiB/s
Testing erase speed
erase speed is 76560 KiB/s
Testing 2x multi-block erase speed
2x multi-block erase speed is 74019 KiB/s
Testing 4x multi-block erase speed
4x multi-block erase speed is 74769 KiB/s
Testing 8x multi-block erase speed
8x multi-block erase speed is 75149 KiB/s
Testing 16x multi-block erase speed
16x multi-block erase speed is 75921 KiB/s
Testing 32x multi-block erase speed
32x multi-block erase speed is 75921 KiB/s
Testing 64x multi-block erase speed
64x multi-block erase speed is 75921 KiB/s
finished

The throughput of the former is twice that of the latter.

> The monolithic thing was not supposed to improve throughput but to help
> with very limited controllers.
>
> Thanks,
> Miquèl
Miquel Raynal May 20, 2024, 3:33 p.m. UTC | #3
Hi Keguang,

keguang.zhang@gmail.com wrote on Mon, 20 May 2024 18:42:30 +0800:

> On Mon, May 6, 2024 at 3:17 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > Hi,
> >
> > devnull+keguang.zhang.gmail.com@kernel.org wrote on Tue, 30 Apr 2024
> > 19:11:11 +0800:
> >  
> > > From: Keguang Zhang <keguang.zhang@gmail.com>
> > >
> > > nand_read_subpage() reads data and ECC data by two separate
> > > operations.
> > > This patch allows the NAND controllers who support
> > > monolithic page read to do subpage read by a single operation,
> > > which is more effective than nand_read_subpage().  
> >
> > I am a bit puzzled by this change. Usually nand_read_subpage is used
> > for optimizations (when less data than a full page must be retrieved).
> > I know it may be used in other cases (because it's easier for the core
> > in order to support a wide range of controllers). Can you please show a
> > speed test showing the results before I consider merging this patch?
> >  
> With this patch:
> # flash_speed -c 128 -d /dev/mtd1
> scanning for bad eraseblocks
> scanned 128 eraseblocks, 0 are bad
> testing eraseblock write speed
> eraseblock write speed is 2112 KiB/s
> testing eraseblock read speed
> eraseblock read speed is 3454 KiB/s
> testing page write speed
> page write speed is 1915 KiB/s
> testing page read speed
> page read speed is 2999 KiB/s
> testing 2 page write speed
> 2 page write speed is 2000 KiB/s
> testing 2 page read speed
> 2 page read speed is 3207 KiB/s
> Testing erase speed
> erase speed is 72495 KiB/s
> Testing 2x multi-block erase speed
> 2x multi-block erase speed is 74135 KiB/s
> Testing 4x multi-block erase speed
> 4x multi-block erase speed is 74812 KiB/s
> Testing 8x multi-block erase speed
> 8x multi-block erase speed is 75502 KiB/s
> Testing 16x multi-block erase speed
> 16x multi-block erase speed is 75851 KiB/s
> Testing 32x multi-block erase speed
> 32x multi-block erase speed is 75851 KiB/s
> Testing 64x multi-block erase speed
> 64x multi-block erase speed is 76204 KiB/s
> finished
> 
> Without this patch:
> # flash_speed -c 128 -d /dev/mtd1
> scanning for bad eraseblocks
> scanned 128 eraseblocks, 0 are bad
> testing eraseblock write speed
> eraseblock write speed is 2074 KiB/s
> testing eraseblock read speed
> eraseblock read speed is 2895 KiB/s
> testing page write speed
> page write speed is 998 KiB/s
> testing page read speed
> page read speed is 1499 KiB/s
> testing 2 page write speed
> 2 page write speed is 1002 KiB/s
> testing 2 page read speed
> 2 page read speed is 1554 KiB/s
> Testing erase speed
> erase speed is 76560 KiB/s
> Testing 2x multi-block erase speed
> 2x multi-block erase speed is 74019 KiB/s
> Testing 4x multi-block erase speed
> 4x multi-block erase speed is 74769 KiB/s
> Testing 8x multi-block erase speed
> 8x multi-block erase speed is 75149 KiB/s
> Testing 16x multi-block erase speed
> 16x multi-block erase speed is 75921 KiB/s
> Testing 32x multi-block erase speed
> 32x multi-block erase speed is 75921 KiB/s
> Testing 64x multi-block erase speed
> 64x multi-block erase speed is 75921 KiB/s
> finished
> 
> The throughput of the former is twice that of the latter.

And what is your NAND controller driver?

subpage reads are used when you only want to read a subset of a NAND
page.

Otherwise the core may use the RNDOUT command to change the pointer in
the chip's SRAM to read from a different location, but I don't see what
is impacting so much, unless if the driver implementation is really
sub-optimized.

Thanks,
Miquèl
diff mbox series

Patch

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index d7dbbd469b89..eeb654c6b4fc 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -3630,7 +3630,7 @@  static int nand_do_read_ops(struct nand_chip *chip, loff_t from,
 							      oob_required,
 							      page);
 			else if (!aligned && NAND_HAS_SUBPAGE_READ(chip) &&
-				 !oob)
+				 !NAND_HAS_MONOLITHIC_READ(chip) && !oob)
 				ret = chip->ecc.read_subpage(chip, col, bytes,
 							     bufpoi, page);
 			else
@@ -3648,7 +3648,8 @@  static int nand_do_read_ops(struct nand_chip *chip, loff_t from,
 			 * partial pages or when a bounce buffer is required.
 			 */
 			if (use_bounce_buf) {
-				if (!NAND_HAS_SUBPAGE_READ(chip) && !oob &&
+				if ((!NAND_HAS_SUBPAGE_READ(chip) ||
+				     NAND_HAS_MONOLITHIC_READ(chip)) && !oob &&
 				    !(mtd->ecc_stats.failed - ecc_stats.failed) &&
 				    (ops->mode != MTD_OPS_RAW)) {
 					chip->pagecache.page = realpage;
diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h
index e84522e31301..92d3ab491c9c 100644
--- a/include/linux/mtd/rawnand.h
+++ b/include/linux/mtd/rawnand.h
@@ -150,6 +150,11 @@  struct gpio_desc;
 /* Device needs 3rd row address cycle */
 #define NAND_ROW_ADDR_3		BIT(14)
 
+/* Device supports monolithic reads */
+#define NAND_MONOLITHIC_READ	BIT(15)
+/* Macros to identify the above */
+#define NAND_HAS_MONOLITHIC_READ(chip) ((chip->options & NAND_MONOLITHIC_READ))
+
 /* Non chip related options */
 /* This option skips the bbt scan during initialization. */
 #define NAND_SKIP_BBTSCAN	BIT(16)