Message ID: 20230818022817.3341-1-Sharp.Xia@mediatek.com (mailing list archive)
State: New, archived
Series: [1/1] mmc: Set optimal I/O size when mmc_setup_queue
On Fri, 18 Aug 2023 at 04:45, <Sharp.Xia@mediatek.com> wrote:
>
> From: Sharp Xia <Sharp.Xia@mediatek.com>
>
> MMC does not set readahead and uses the default VM_READAHEAD_PAGES
> resulting in slower reading speed.
> Use the max_req_size reported by host driver to set the optimal
> I/O size to improve performance.

This seems reasonable to me. However, it would be nice if you could
share some performance numbers too - comparing before and after
$subject patch.

Kind regards
Uffe

>
> Signed-off-by: Sharp Xia <Sharp.Xia@mediatek.com>
> ---
>  drivers/mmc/core/queue.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index b396e3900717..fc83c4917360 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -359,6 +359,7 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
>  	blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_HIGH);
>  	blk_queue_max_hw_sectors(mq->queue,
>  				 min(host->max_blk_count, host->max_req_size / 512));
> +	blk_queue_io_opt(mq->queue, host->max_req_size);
>  	if (host->can_dma_map_merge)
>  		WARN(!blk_queue_can_use_dma_map_merging(mq->queue,
>  							mmc_dev(host)),
> --
> 2.18.0
>
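[Editor's note] For context on why a single blk_queue_io_opt() call changes the read_ahead_kb values reported later in this thread: recent kernels derive the default readahead window from the queue's optimal I/O size. The sketch below is a simplified model of that relationship, not a quote of the kernel source; 4 KiB pages and the max(io_opt * 2, default) rule are assumptions.

```python
# Rough model of how the block layer turns io_opt into a readahead window
# (mirrors the max(io_opt * 2 / PAGE_SIZE, VM_READAHEAD_PAGES) logic found
# in recent kernels; treat the exact formula as an assumption, not a quote).
PAGE_SIZE = 4096                                # assumed 4 KiB pages
VM_READAHEAD_PAGES = 128 * 1024 // PAGE_SIZE    # default 128 KiB window

def read_ahead_kb(io_opt_bytes: int) -> int:
    """Expected /sys/block/<dev>/queue/read_ahead_kb for a given io_opt."""
    pages = max(io_opt_bytes * 2 // PAGE_SIZE, VM_READAHEAD_PAGES)
    return pages * PAGE_SIZE // 1024

print(read_ahead_kb(0))           # no hint: default 128
print(read_ahead_kb(512 * 1024))  # 512 KiB max_req_size: 1024
```

Under this model, a host reporting a 512 KiB max_req_size would yield exactly the jump from 128 to 1024 that the testers observe.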
On Thu, 2023-08-24 at 12:55 +0200, Ulf Hansson wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>
> On Fri, 18 Aug 2023 at 04:45, <Sharp.Xia@mediatek.com> wrote:
> >
> > From: Sharp Xia <Sharp.Xia@mediatek.com>
> >
> > MMC does not set readahead and uses the default VM_READAHEAD_PAGES
> > resulting in slower reading speed.
> > Use the max_req_size reported by host driver to set the optimal
> > I/O size to improve performance.
>
> This seems reasonable to me. However, it would be nice if you could
> share some performance numbers too - comparing before and after
> $subject patch.
>
> Kind regards
> Uffe
>
> >
> > Signed-off-by: Sharp Xia <Sharp.Xia@mediatek.com>
> > ---
> >  drivers/mmc/core/queue.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> > index b396e3900717..fc83c4917360 100644
> > --- a/drivers/mmc/core/queue.c
> > +++ b/drivers/mmc/core/queue.c
> > @@ -359,6 +359,7 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
> >  	blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_HIGH);
> >  	blk_queue_max_hw_sectors(mq->queue,
> >  				 min(host->max_blk_count, host->max_req_size / 512));
> > +	blk_queue_io_opt(mq->queue, host->max_req_size);
> >  	if (host->can_dma_map_merge)
> >  		WARN(!blk_queue_can_use_dma_map_merging(mq->queue,
> >  							mmc_dev(host)),
> > --
> > 2.18.0
> >

I tested this patch on an internal platform (kernel-5.15).

Before:
console:/ # echo 3 > /proc/sys/vm/drop_caches
console:/ # dd if=/mnt/media_rw/8031-130D/super.img of=/dev/null
4485393+1 records in
4485393+1 records out
2296521564 bytes (2.1 G) copied, 37.124446 s, 59 M/s
console:/ # cat /sys/block/mmcblk0/queue/read_ahead_kb
128

After:
console:/ # echo 3 > /proc/sys/vm/drop_caches
console:/ # dd if=/mnt/media_rw/8031-130D/super.img of=/dev/null
4485393+1 records in
4485393+1 records out
2296521564 bytes (2.1 G) copied, 28.956049 s, 76 M/s
console:/ # cat /sys/block/mmcblk0/queue/read_ahead_kb
1024
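[Editor's note] The drop_caches + dd procedure used in these measurements can be wrapped in a small helper for repeatable runs. A minimal sketch (the function name is made up for illustration; dropping caches needs root, so it is skipped when not permitted):

```shell
# seq_read_bench: time a cold(ish) sequential read of a file, mimicking the
# drop_caches + dd procedure used in this thread. Prints dd's summary line.
seq_read_bench() {
    file="$1"
    if [ -w /proc/sys/vm/drop_caches ]; then
        sync
        echo 3 > /proc/sys/vm/drop_caches   # cold-cache read (root only)
    fi
    dd if="$file" of=/dev/null 2>&1 | tail -n 1
}
```

Run it against the same image before and after changing read_ahead_kb, e.g. `seq_read_bench /mnt/media_rw/8031-130D/super.img`, and compare the reported throughput.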
Hi Sharp,

On 2023/8/25 15:10, Sharp Xia (夏宇彬) wrote:
> On Thu, 2023-08-24 at 12:55 +0200, Ulf Hansson wrote:
>>
>> External email : Please do not click links or open attachments until
>> you have verified the sender or the content.
>>
>> On Fri, 18 Aug 2023 at 04:45, <Sharp.Xia@mediatek.com> wrote:
>>>
>>> From: Sharp Xia <Sharp.Xia@mediatek.com>
>>>
>>> MMC does not set readahead and uses the default VM_READAHEAD_PAGES
>>> resulting in slower reading speed.
>>> Use the max_req_size reported by host driver to set the optimal
>>> I/O size to improve performance.
>>
>> This seems reasonable to me. However, it would be nice if you could
>> share some performance numbers too - comparing before and after
>> $subject patch.
>>
>> Kind regards
>> Uffe
>>
>>>
>>> Signed-off-by: Sharp Xia <Sharp.Xia@mediatek.com>
>>> ---
>>>  drivers/mmc/core/queue.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
>>> index b396e3900717..fc83c4917360 100644
>>> --- a/drivers/mmc/core/queue.c
>>> +++ b/drivers/mmc/core/queue.c
>>> @@ -359,6 +359,7 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
>>>  	blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_HIGH);
>>>  	blk_queue_max_hw_sectors(mq->queue,
>>>  				 min(host->max_blk_count, host->max_req_size / 512));
>>> +	blk_queue_io_opt(mq->queue, host->max_req_size);
>>>  	if (host->can_dma_map_merge)
>>>  		WARN(!blk_queue_can_use_dma_map_merging(mq->queue,
>>>  							mmc_dev(host)),
>>> --
>>> 2.18.0
>>>
>
> I test this patch on internal platform(kernel-5.15).

I patched this one and the test shows me a stable 11% performance drop.

Before:
echo 3 > proc/sys/vm/drop_caches && dd if=/data/1GB.img of=/dev/null
2048000+0 records in
2048000+0 records out
1048576000 bytes (0.9 G) copied, 3.912249 s, 256 M/s

After:
echo 3 > proc/sys/vm/drop_caches && dd if=/data/1GB.img of=/dev/null
2048000+0 records in
2048000+0 records out
1048576000 bytes (0.9 G) copied, 4.436271 s, 225 M/s

> Before:
> console:/ # echo 3 > /proc/sys/vm/drop_caches
> console:/ # dd if=/mnt/media_rw/8031-130D/super.img of=/dev/null
> 4485393+1 records in
> 4485393+1 records out
> 2296521564 bytes (2.1 G) copied, 37.124446 s, 59 M/s
> console:/ # cat /sys/block/mmcblk0/queue/read_ahead_kb
> 128
>
> After:
> console:/ # echo 3 > /proc/sys/vm/drop_caches
> console:/ # dd if=/mnt/media_rw/8031-130D/super.img of=/dev/null
> 4485393+1 records in
> 4485393+1 records out
> 2296521564 bytes (2.1 G) copied, 28.956049 s, 76 M/s
> console:/ # cat /sys/block/mmcblk0/queue/read_ahead_kb
> 1024
>
On Fri, 2023-08-25 at 16:11 +0800, Shawn Lin wrote:
>
> Hi Sharp,
>
> On 2023/8/25 15:10, Sharp Xia (夏宇彬) wrote:
> > On Thu, 2023-08-24 at 12:55 +0200, Ulf Hansson wrote:
> >>
> >> External email : Please do not click links or open attachments until
> >> you have verified the sender or the content.
> >>
> >> On Fri, 18 Aug 2023 at 04:45, <Sharp.Xia@mediatek.com> wrote:
> >>>
> >>> From: Sharp Xia <Sharp.Xia@mediatek.com>
> >>>
> >>> MMC does not set readahead and uses the default VM_READAHEAD_PAGES
> >>> resulting in slower reading speed.
> >>> Use the max_req_size reported by host driver to set the optimal
> >>> I/O size to improve performance.
> >>
> >> This seems reasonable to me. However, it would be nice if you could
> >> share some performance numbers too - comparing before and after
> >> $subject patch.
> >>
> >> Kind regards
> >> Uffe
> >>
> >>>
> >>> Signed-off-by: Sharp Xia <Sharp.Xia@mediatek.com>
> >>> ---
> >>>  drivers/mmc/core/queue.c | 1 +
> >>>  1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> >>> index b396e3900717..fc83c4917360 100644
> >>> --- a/drivers/mmc/core/queue.c
> >>> +++ b/drivers/mmc/core/queue.c
> >>> @@ -359,6 +359,7 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
> >>>  	blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_HIGH);
> >>>  	blk_queue_max_hw_sectors(mq->queue,
> >>>  				 min(host->max_blk_count, host->max_req_size / 512));
> >>> +	blk_queue_io_opt(mq->queue, host->max_req_size);
> >>>  	if (host->can_dma_map_merge)
> >>>  		WARN(!blk_queue_can_use_dma_map_merging(mq->queue,
> >>>  							mmc_dev(host)),
> >>> --
> >>> 2.18.0
> >>>
> >
> > I test this patch on internal platform(kernel-5.15).
>
> I patched this one and the test shows me a stable 11% performance drop.
>
> Before:
> echo 3 > proc/sys/vm/drop_caches && dd if=/data/1GB.img of=/dev/null
> 2048000+0 records in
> 2048000+0 records out
> 1048576000 bytes (0.9 G) copied, 3.912249 s, 256 M/s
>
> After:
> echo 3 > proc/sys/vm/drop_caches && dd if=/data/1GB.img of=/dev/null
> 2048000+0 records in
> 2048000+0 records out
> 1048576000 bytes (0.9 G) copied, 4.436271 s, 225 M/s
>
> > Before:
> > console:/ # echo 3 > /proc/sys/vm/drop_caches
> > console:/ # dd if=/mnt/media_rw/8031-130D/super.img of=/dev/null
> > 4485393+1 records in
> > 4485393+1 records out
> > 2296521564 bytes (2.1 G) copied, 37.124446 s, 59 M/s
> > console:/ # cat /sys/block/mmcblk0/queue/read_ahead_kb
> > 128
> >
> > After:
> > console:/ # echo 3 > /proc/sys/vm/drop_caches
> > console:/ # dd if=/mnt/media_rw/8031-130D/super.img of=/dev/null
> > 4485393+1 records in
> > 4485393+1 records out
> > 2296521564 bytes (2.1 G) copied, 28.956049 s, 76 M/s
> > console:/ # cat /sys/block/mmcblk0/queue/read_ahead_kb
> > 1024
> >

Hi Shawn,

What is your readahead value before and after applying this patch?
On 2023/8/25 16:39, Sharp.Xia@mediatek.com wrote:
> On Fri, 2023-08-25 at 16:11 +0800, Shawn Lin wrote:
>>
>> Hi Sharp,

...

>>> 1024
>>>
> Hi Shawn,
>
> What is your readahead value before and after applying this patch?
>

The original readahead is 128, and after applying the patch it is 1024.

cat /d/mmc0/ios
clock:          200000000 Hz
actual clock:   200000000 Hz
vdd:            18 (3.0 ~ 3.1 V)
bus mode:       2 (push-pull)
chip select:    0 (don't care)
power mode:     2 (on)
bus width:      3 (8 bits)
timing spec:    10 (mmc HS400 enhanced strobe)
signal voltage: 1 (1.80 V)
driver type:    0 (driver type B)

The driver I used is sdhci-of-dwcmshc.c with a KLMBG2JETDB041 eMMC chip.
On Fri, Aug 25, 2023 at 7:43 PM <Sharp.Xia@mediatek.com> wrote:
>
> On Fri, 2023-08-25 at 16:11 +0800, Shawn Lin wrote:
> >
> > Hi Sharp,
> >
> > On 2023/8/25 15:10, Sharp Xia (夏宇彬) wrote:
> > > On Thu, 2023-08-24 at 12:55 +0200, Ulf Hansson wrote:
> > >>
> > >> External email : Please do not click links or open attachments until
> > >> you have verified the sender or the content.
> > >>
> > >> On Fri, 18 Aug 2023 at 04:45, <Sharp.Xia@mediatek.com> wrote:
> > >>>
> > >>> From: Sharp Xia <Sharp.Xia@mediatek.com>
> > >>>
> > >>> MMC does not set readahead and uses the default VM_READAHEAD_PAGES
> > >>> resulting in slower reading speed.
> > >>> Use the max_req_size reported by host driver to set the optimal
> > >>> I/O size to improve performance.
> > >>
> > >> This seems reasonable to me. However, it would be nice if you could
> > >> share some performance numbers too - comparing before and after
> > >> $subject patch.
> > >>
> > >> Kind regards
> > >> Uffe
> > >>
> > >>>
> > >>> Signed-off-by: Sharp Xia <Sharp.Xia@mediatek.com>
> > >>> ---
> > >>>  drivers/mmc/core/queue.c | 1 +
> > >>>  1 file changed, 1 insertion(+)
> > >>>
> > >>> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> > >>> index b396e3900717..fc83c4917360 100644
> > >>> --- a/drivers/mmc/core/queue.c
> > >>> +++ b/drivers/mmc/core/queue.c
> > >>> @@ -359,6 +359,7 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
> > >>>  	blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_HIGH);
> > >>>  	blk_queue_max_hw_sectors(mq->queue,
> > >>>  				 min(host->max_blk_count, host->max_req_size / 512));
> > >>> +	blk_queue_io_opt(mq->queue, host->max_req_size);
> > >>>  	if (host->can_dma_map_merge)
> > >>>  		WARN(!blk_queue_can_use_dma_map_merging(mq->queue,
> > >>>  							mmc_dev(host)),
> > >>> --
> > >>> 2.18.0
> > >>>
> > >
> > > I test this patch on internal platform(kernel-5.15).
> >
> > I patched this one and the test shows me a stable 11% performance drop.
> >
> > Before:
> > echo 3 > proc/sys/vm/drop_caches && dd if=/data/1GB.img of=/dev/null
> > 2048000+0 records in
> > 2048000+0 records out
> > 1048576000 bytes (0.9 G) copied, 3.912249 s, 256 M/s
> >
> > After:
> > echo 3 > proc/sys/vm/drop_caches && dd if=/data/1GB.img of=/dev/null
> > 2048000+0 records in
> > 2048000+0 records out
> > 1048576000 bytes (0.9 G) copied, 4.436271 s, 225 M/s
> >
> > > Before:
> > > console:/ # echo 3 > /proc/sys/vm/drop_caches
> > > console:/ # dd if=/mnt/media_rw/8031-130D/super.img of=/dev/null
> > > 4485393+1 records in
> > > 4485393+1 records out
> > > 2296521564 bytes (2.1 G) copied, 37.124446 s, 59 M/s
> > > console:/ # cat /sys/block/mmcblk0/queue/read_ahead_kb
> > > 128
> > >
> > > After:
> > > console:/ # echo 3 > /proc/sys/vm/drop_caches
> > > console:/ # dd if=/mnt/media_rw/8031-130D/super.img of=/dev/null
> > > 4485393+1 records in
> > > 4485393+1 records out
> > > 2296521564 bytes (2.1 G) copied, 28.956049 s, 76 M/s
> > > console:/ # cat /sys/block/mmcblk0/queue/read_ahead_kb
> > > 1024
> > >
>
> Hi Shawn,
>
> What is your readahead value before and after applying this patch?
>

Hi Sharp

Use "echo 1024 > /sys/block/mmcblk0/queue/read_ahead_kb" instead of
"blk_queue_io_opt(mq->queue, host->max_req_size);"?
On Fri, 2023-08-25 at 17:17 +0800, Shawn Lin wrote:
>
> On 2023/8/25 16:39, Sharp.Xia@mediatek.com wrote:
> > On Fri, 2023-08-25 at 16:11 +0800, Shawn Lin wrote:
> >>
> >> Hi Sharp,
> >
> > ...
> >
> >>> 1024
> >>>
> > Hi Shawn,
> >
> > What is your readahead value before and after applying this patch?
> >
>
> The original readahead is 128, and after applying the patch is 1024
>
> cat /d/mmc0/ios
> clock:          200000000 Hz
> actual clock:   200000000 Hz
> vdd:            18 (3.0 ~ 3.1 V)
> bus mode:       2 (push-pull)
> chip select:    0 (don't care)
> power mode:     2 (on)
> bus width:      3 (8 bits)
> timing spec:    10 (mmc HS400 enhanced strobe)
> signal voltage: 1 (1.80 V)
> driver type:    0 (driver type B)
>
> The driver I used is sdhci-of-dwcmshc.c with a KLMBG2JETDB041 eMMC
> chip.

I tested with RK3568 and the sdhci-of-dwcmshc.c driver; the performance improved by 2~3%.

Before:
root@OpenWrt:/mnt/mmcblk0p3# time dd if=test.img of=/dev/null
2097152+0 records in
2097152+0 records out
real	0m 6.01s
user	0m 0.84s
sys	0m 2.89s
root@OpenWrt:/mnt/mmcblk0p3# cat /sys/block/mmcblk0/queue/read_ahead_kb
128

After:
root@OpenWrt:/mnt/mmcblk0p3# echo 3 > /proc/sys/vm/drop_caches
root@OpenWrt:/mnt/mmcblk0p3# time dd if=test.img of=/dev/null
2097152+0 records in
2097152+0 records out
real	0m 5.86s
user	0m 1.04s
sys	0m 3.18s
root@OpenWrt:/mnt/mmcblk0p3# cat /sys/block/mmcblk0/queue/read_ahead_kb
1024

root@OpenWrt:/sys/kernel/debug/mmc0# cat ios
clock:          200000000 Hz
actual clock:   200000000 Hz
vdd:            18 (3.0 ~ 3.1 V)
bus mode:       2 (push-pull)
chip select:    0 (don't care)
power mode:     2 (on)
bus width:      3 (8 bits)
timing spec:    9 (mmc HS200)
signal voltage: 1 (1.80 V)
driver type:    0 (driver type B)
On Fri, 2023-08-25 at 20:23 +0800, Wenchao Chen wrote:
>
> Hi Sharp
>
> Use "echo 1024 > /sys/block/mmcblk0/queue/read_ahead_kb" instead of
> "blk_queue_io_opt(mq->queue, host->max_req_size);"?

Hi Wenchao,

User space does not know the max_req_size of each mmc host. And when
the SD card is hot-inserted, it is complicated for user space to
modify this value.
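[Editor's note] One middle ground for the hot-insert case Sharp raises is a udev rule or hotplug script that applies a per-board readahead when the block device appears, so user space never needs to know max_req_size. A sketch (the helper name, the value 1024, and the SYSFS_ROOT override used for testing outside a real system are all illustrative assumptions):

```shell
# set_mmc_readahead: write read_ahead_kb for an mmc block device. A udev rule
# along the lines of
#   ACTION=="add", SUBSYSTEM=="block", KERNEL=="mmcblk[0-9]", \
#     ATTR{queue/read_ahead_kb}="1024"
# achieves the same on hotplug without a script (rule text is illustrative).
set_mmc_readahead() {
    dev="$1"
    kb="$2"
    ra="${SYSFS_ROOT:-/sys}/block/$dev/queue/read_ahead_kb"
    [ -w "$ra" ] || { echo "cannot write $ra" >&2; return 1; }
    echo "$kb" > "$ra"
}
```

For example, `set_mmc_readahead mmcblk0 1024` run from a hotplug hook would match the value the patch produces on a 512 KiB max_req_size host.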
Hi Sharp

On 2023/8/27 0:26, Sharp.Xia@mediatek.com wrote:
> On Fri, 2023-08-25 at 17:17 +0800, Shawn Lin wrote:
>>

After more testing, most of my platforms, which run in HS400/HS200 mode,
show nearly no difference with the readahead ranging from 128 to 1024.
Yet one board still shows a performance drop. I highly suspect it
depends on the eMMC chip. I would recommend leaving it to the BSP guys
to decide which readahead value is best for their usage.

> I tested with RK3568 and sdhci-of-dwcmshc.c driver, the performance improved by 2~3%.
>
> Before:
> root@OpenWrt:/mnt/mmcblk0p3# time dd if=test.img of=/dev/null
> 2097152+0 records in
> 2097152+0 records out
> real	0m 6.01s
> user	0m 0.84s
> sys	0m 2.89s
> root@OpenWrt:/mnt/mmcblk0p3# cat /sys/block/mmcblk0/queue/read_ahead_kb
> 128
>
> After:
> root@OpenWrt:/mnt/mmcblk0p3# echo 3 > /proc/sys/vm/drop_caches
> root@OpenWrt:/mnt/mmcblk0p3# time dd if=test.img of=/dev/null
> 2097152+0 records in
> 2097152+0 records out
> real	0m 5.86s
> user	0m 1.04s
> sys	0m 3.18s
> root@OpenWrt:/mnt/mmcblk0p3# cat /sys/block/mmcblk0/queue/read_ahead_kb
> 1024
>
> root@OpenWrt:/sys/kernel/debug/mmc0# cat ios
> clock:          200000000 Hz
> actual clock:   200000000 Hz
> vdd:            18 (3.0 ~ 3.1 V)
> bus mode:       2 (push-pull)
> chip select:    0 (don't care)
> power mode:     2 (on)
> bus width:      3 (8 bits)
> timing spec:    9 (mmc HS200)
> signal voltage: 1 (1.80 V)
> driver type:    0 (driver type B)
>
On Mon, 28 Aug 2023 at 04:28, Shawn Lin <shawn.lin@rock-chips.com> wrote:
>
> Hi Sharp
>
> On 2023/8/27 0:26, Sharp.Xia@mediatek.com wrote:
> > On Fri, 2023-08-25 at 17:17 +0800, Shawn Lin wrote:
> >>
> >>
>
> After more testing, most of my platforms which runs at HS400/HS200 mode
> shows nearly no differences with the readahead ranging from 128 to 1024.
> Yet just a board shows a performance drop now. Highly suspect it's eMMC
> chip depends. I would recommand leave it to the BSP guys to decide which
> readahead value is best for their usage.

That's a very good point. The SD/eMMC card certainly behaves
differently, depending on the request-size.

Another thing we could consider doing could be to combine the
information about the request-size from the mmc host with some relevant
information from the registers in the card (not sure exactly what,
though).

> > I tested with RK3568 and sdhci-of-dwcmshc.c driver, the performance improved by 2~3%.
> >
> > Before:
> > root@OpenWrt:/mnt/mmcblk0p3# time dd if=test.img of=/dev/null
> > 2097152+0 records in
> > 2097152+0 records out
> > real	0m 6.01s
> > user	0m 0.84s
> > sys	0m 2.89s
> > root@OpenWrt:/mnt/mmcblk0p3# cat /sys/block/mmcblk0/queue/read_ahead_kb
> > 128
> >
> > After:
> > root@OpenWrt:/mnt/mmcblk0p3# echo 3 > /proc/sys/vm/drop_caches
> > root@OpenWrt:/mnt/mmcblk0p3# time dd if=test.img of=/dev/null
> > 2097152+0 records in
> > 2097152+0 records out
> > real	0m 5.86s
> > user	0m 1.04s
> > sys	0m 3.18s
> > root@OpenWrt:/mnt/mmcblk0p3# cat /sys/block/mmcblk0/queue/read_ahead_kb
> > 1024
> >
> > root@OpenWrt:/sys/kernel/debug/mmc0# cat ios
> > clock:          200000000 Hz
> > actual clock:   200000000 Hz
> > vdd:            18 (3.0 ~ 3.1 V)
> > bus mode:       2 (push-pull)
> > chip select:    0 (don't care)
> > power mode:     2 (on)
> > bus width:      3 (8 bits)
> > timing spec:    9 (mmc HS200)
> > signal voltage: 1 (1.80 V)
> > driver type:    0 (driver type B)
> >

Thanks for testing and sharing the data, both of you!

Kind regards
Uffe
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index b396e3900717..fc83c4917360 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -359,6 +359,7 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 	blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_HIGH);
 	blk_queue_max_hw_sectors(mq->queue,
 				 min(host->max_blk_count, host->max_req_size / 512));
+	blk_queue_io_opt(mq->queue, host->max_req_size);
 	if (host->can_dma_map_merge)
 		WARN(!blk_queue_can_use_dma_map_merging(mq->queue,
 							mmc_dev(host)),