mbox series

[v6,0/8] Support limits below the page size

Message ID 20230612203314.17820-1-bvanassche@acm.org (mailing list archive)
Headers show
Series Support limits below the page size | expand

Message

Bart Van Assche June 12, 2023, 8:33 p.m. UTC
Hi Jens,

We want to improve Android performance by increasing the page size from 4 KiB
to 16 KiB. However, some of the storage controllers we care about do not support
DMA segments larger than 4 KiB. Hence the need support for DMA segments that are
smaller than the size of one virtual memory page. This patch series implements
that support. Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v5:
- Rebased the entire series on top of the block layer for-next branch.
- Dropped patch "block: Add support for small segments in blk_rq_map_user_iov()"
  because that patch prepares for a patch that has already been dropped.
- Modified a source code comment in patch 3/9 such that it fits in 80 columns.

Changes compared to v4:
- Fixed the debugfs patch such that the behavior for creating the block
  debugfs directory is retained.
- Made the description of patch "Support configuring limits below the page
  size" more detailed. Split that patch into two patches.
- Added patch "Use pr_info() instead of printk(KERN_INFO ...)".

Changes compared to v3:
- Removed CONFIG_BLK_SUB_PAGE_SEGMENTS and QUEUE_FLAG_SUB_PAGE_SEGMENTS.
  Replaced these by a new member in struct queue_limits and a static branch.
- The static branch that controls whether or not sub-page limits are enabled
  is set by the block layer core instead of by block drivers.
- Dropped the patches that are no longer needed (SCSI core and UFS Exynos
  driver).

Changes compared to v2:
- For SCSI drivers, only set flag QUEUE_FLAG_SUB_PAGE_SEGMENTS if necessary.
- In the scsi_debug patch, sorted kernel module parameters alphabetically.
  Only set flag QUEUE_FLAG_SUB_PAGE_SEGMENTS if necessary.
- Added a patch for the UFS Exynos driver that enables
  CONFIG_BLK_SUB_PAGE_SEGMENTS if the page size exceeds 4 KiB.

Changes compared to v1:
- Added a CONFIG variable that controls whether or not small segment support
  is enabled.
- Improved patch descriptions.

Bart Van Assche (8):
  block: Use pr_info() instead of printk(KERN_INFO ...)
  block: Prepare for supporting sub-page limits
  block: Support configuring limits below the page size
  block: Make sub_page_limit_queues available in debugfs
  block: Support submitting passthrough requests with small segments
  block: Add support for filesystem requests and small segments
  scsi_debug: Support configuring the maximum segment size
  null_blk: Support configuring the maximum segment size

 block/blk-core.c                  |  4 ++
 block/blk-map.c                   |  2 +-
 block/blk-merge.c                 |  8 ++-
 block/blk-mq-debugfs.c            |  9 +++
 block/blk-mq-debugfs.h            |  6 ++
 block/blk-mq.c                    |  2 +
 block/blk-settings.c              | 91 +++++++++++++++++++++++++++----
 block/blk.h                       | 39 +++++++++++--
 drivers/block/null_blk/main.c     | 19 ++++++-
 drivers/block/null_blk/null_blk.h |  1 +
 drivers/scsi/scsi_debug.c         |  4 ++
 include/linux/blkdev.h            |  2 +
 12 files changed, 163 insertions(+), 24 deletions(-)

Comments

Bart Van Assche June 15, 2023, 2:01 a.m. UTC | #1
On 6/12/23 13:33, Bart Van Assche wrote:
> We want to improve Android performance by increasing the page size from 4 KiB
> to 16 KiB. However, some of the storage controllers we care about do not support
> DMA segments larger than 4 KiB. Hence the need support for DMA segments that are
> smaller than the size of one virtual memory page. This patch series implements
> that support. Please consider this patch series for the next merge window.

(replying to my own email)

Hi Jens,

Can you please take a look at this patch series? I think it is ready to 
be merged.

Thanks,

Bart.
Jens Axboe June 15, 2023, 2:22 a.m. UTC | #2
On 6/12/23 2:33?PM, Bart Van Assche wrote:
> Hi Jens,
> 
> We want to improve Android performance by increasing the page size
> from 4 KiB to 16 KiB. However, some of the storage controllers we care
> about do not support DMA segments larger than 4 KiB. Hence the need
> support for DMA segments that are smaller than the size of one virtual
> memory page. This patch series implements that support. Please
> consider this patch series for the next merge window.

I'm usually a fan of putting code in the core so we don't have to in
drivers, that's how most of the block layer is designed. But this seems
niche enough that perhaps it's worth considering just remapping these in
the driver? It's peppering changes all over delicate parts of the core
for cases that 99.9% don't need to worry about or should worry about.
I will say that I do think the patches do look better than they did in
earlier versions, however.

Maybe we've already discussed this before, but let's please have the
discussion again. Because I'd really love to avoid this code, if at all
possible.
Christoph Hellwig June 15, 2023, 4:15 a.m. UTC | #3
On Wed, Jun 14, 2023 at 08:22:31PM -0600, Jens Axboe wrote:
> I'm usually a fan of putting code in the core so we don't have to in
> drivers, that's how most of the block layer is designed. But this seems
> niche enough that perhaps it's worth considering just remapping these in
> the driver? It's peppering changes all over delicate parts of the core
> for cases that 99.9% don't need to worry about or should worry about.
> I will say that I do think the patches do look better than they did in
> earlier versions, however.
> 
> Maybe we've already discussed this before, but let's please have the
> discussion again. Because I'd really love to avoid this code, if at all
> possible.

I really hate having this core complexity, but I suspect trying to driver
hacks would be even worse than that, especially as this goes through
the SCSI midlayer.  I think the answer is simply that if Google keeps
buying broken hardware for their products from Samsung they just need
to stick to a 4k page size instead of moving to a larger one.
Bart Van Assche June 15, 2023, 1:55 p.m. UTC | #4
On 6/14/23 21:15, Christoph Hellwig wrote:
> I really hate having this core complexity, but I suspect trying to driver
> hacks would be even worse than that, especially as this goes through
> the SCSI midlayer.  I think the answer is simply that if Google keeps
> buying broken hardware for their products from Samsung they just need
> to stick to a 4k page size instead of moving to a larger one.

Although I do not like it that the Exynos UFS controller does not follow 
the UFS standard, this UFS controller is used much more widely than only 
in devices produced by my employer. See e.g. the output of the following 
grep command:

$ git grep -nH '\.compatible' */*/ufs-exynos.c
drivers/ufs/host/ufs-exynos.c:1739:	{ .compatible = "samsung,exynos7-ufs",
drivers/ufs/host/ufs-exynos.c:1741:	{ .compatible = 
"samsung,exynosautov9-ufs",
drivers/ufs/host/ufs-exynos.c:1743:	{ .compatible = 
"samsung,exynosautov9-ufs-vh",
drivers/ufs/host/ufs-exynos.c:1745:	{ .compatible = "tesla,fsd-ufs",

Thanks,

Bart.
Bart Van Assche June 15, 2023, 2:16 p.m. UTC | #5
On 6/14/23 19:22, Jens Axboe wrote:
> On 6/12/23 2:33?PM, Bart Van Assche wrote:
>> We want to improve Android performance by increasing the page size
>> from 4 KiB to 16 KiB. However, some of the storage controllers we care
>> about do not support DMA segments larger than 4 KiB. Hence the need
>> support for DMA segments that are smaller than the size of one virtual
>> memory page. This patch series implements that support. Please
>> consider this patch series for the next merge window.
> 
> I'm usually a fan of putting code in the core so we don't have to in
> drivers, that's how most of the block layer is designed. But this seems
> niche enough that perhaps it's worth considering just remapping these in
> the driver? It's peppering changes all over delicate parts of the core
> for cases that 99.9% don't need to worry about or should worry about.
> I will say that I do think the patches do look better than they did in
> earlier versions, however.
> 
> Maybe we've already discussed this before, but let's please have the
> discussion again. Because I'd really love to avoid this code, if at all
> possible.

Hi Jens,

These are my arguments in favor of having this functionality in the 
block layer core instead of in the UFS driver:
* This functionality is useful for multiple block drivers. It is also
   useful for  block drivers with a max_segment_size limit less than 64
   KiB on systems with a 64 KiB page size. E.g. the sbp2 driver and
   several ATA and MMC drivers set the max_segment_size limit to a value
   less than 64 KiB.
* The UFS 3.1 devices in my test setup support read bandwidths up to 2
   GiB/s and more than 100K IOPS. UFSHCI 4.0 controllers support a link
   bandwidth that is the double of UFSHCI 3.x controllers and also
   support higher queue depths (up to 512 instead of 32). In other words,
   performance matters for UFS devices. Having the SCSI core build an SG
   list and making the UFS driver rework that SG list probably would
   affect performance negatively.
* The MMC driver is more complicated than needed because the block layer
   core does not yet support the limits of MMC devices. I think that this
   patch series will allow to simplify the MMC driver. From
   drivers/mmc/block.c:

	/*
	 * The block layer doesn't support all sector count
	 * restrictions, so we need to be prepared for too big
	 * requests.
	 */

* Care has been taken not to affect performance or maintainability
   of the block layer core in a negative way.

Thanks,

Bart.
Jens Axboe June 15, 2023, 2:16 p.m. UTC | #6
On 6/14/23 10:15?PM, Christoph Hellwig wrote:
> On Wed, Jun 14, 2023 at 08:22:31PM -0600, Jens Axboe wrote:
>> I'm usually a fan of putting code in the core so we don't have to in
>> drivers, that's how most of the block layer is designed. But this seems
>> niche enough that perhaps it's worth considering just remapping these in
>> the driver? It's peppering changes all over delicate parts of the core
>> for cases that 99.9% don't need to worry about or should worry about.
>> I will say that I do think the patches do look better than they did in
>> earlier versions, however.
>>
>> Maybe we've already discussed this before, but let's please have the
>> discussion again. Because I'd really love to avoid this code, if at all
>> possible.
> 
> I really hate having this core complexity, but I suspect trying to driver
> hacks would be even worse than that, especially as this goes through
> the SCSI midlayer.  I think the answer is simply that if Google keeps
> buying broken hardware for their products from Samsung they just need
> to stick to a 4k page size instead of moving to a larger one.

I would tend to agree with that. Vendors buy cheaper things all the time
to cut cost, and then have to deal with the fallout of that. I see quite
a bit of that on the storage front.
Christoph Hellwig June 16, 2023, 7:02 a.m. UTC | #7
On Thu, Jun 15, 2023 at 06:55:36AM -0700, Bart Van Assche wrote:
> On 6/14/23 21:15, Christoph Hellwig wrote:
>> I really hate having this core complexity, but I suspect trying to driver
>> hacks would be even worse than that, especially as this goes through
>> the SCSI midlayer.  I think the answer is simply that if Google keeps
>> buying broken hardware for their products from Samsung they just need
>> to stick to a 4k page size instead of moving to a larger one.
>
> Although I do not like it that the Exynos UFS controller does not follow 
> the UFS standard, this UFS controller is used much more widely than only in 
> devices produced by my employer. See e.g. the output of the following grep 
> command:

But it seems like no one is insisting on using it with larger than 4k
page sizes.  I think we should just prohibit using the driver for those
kernel configs and be done with it.
Bart Van Assche June 16, 2023, 8:26 p.m. UTC | #8
On 6/16/23 00:02, Christoph Hellwig wrote:
> But it seems like no one is insisting on using it with larger than 4k
> page sizes.

The Android common kernel (ACK) team is working on bringing up 16K page 
size support. This involves kernel changes and also changes in user 
space code. Once 16K page size support is ready, I expect that more 
users will ask for 16K page size support in Android and also that more 
users will ask for small segment size support.

Thanks,

Bart.
Jens Axboe June 16, 2023, 9:48 p.m. UTC | #9
On 6/16/23 2:26?PM, Bart Van Assche wrote:
> On 6/16/23 00:02, Christoph Hellwig wrote:
>> But it seems like no one is insisting on using it with larger than 4k
>> page sizes.
> 
> The Android common kernel (ACK) team is working on bringing up 16K
> page size support. This involves kernel changes and also changes in
> user space code. Once 16K page size support is ready, I expect that
> more users will ask for 16K page size support in Android and also that
> more users will ask for small segment size support.

Like Christoph said in a previous email, gate the 16K page sizes on
hardware that can sanely support it. If it can't, then it runs 4K
kernels. Nudge the vendors to ensure what they deliver comply with that,
I believe Google has quite some pull in terms of that...
Bart Van Assche June 16, 2023, 10:28 p.m. UTC | #10
On 6/16/23 14:48, Jens Axboe wrote:
> Nudge the vendors to ensure what they deliver comply with that,
> I believe Google has quite some pull in terms of that...

It would be great if vendors of Android devices would ask the Google 
Android team for its opinion before selecting hardware components. 
However, that's not how it works. I think that it's more likely that 
Android vendors will put Google under pressure to support the hardware 
they have selected instead of Android vendors asking Google for its 
opinion about which hardware components to select.

Additionally, bring-up of 16K page size support for Android happens with 
existing Android hardware. This patch series helps 16K page size support 
bring-up effort because 16K page size support is being tested on Android 
devices with an Exynos UFS host controller.

Thanks,

Bart.
Juan Yescas Aug. 4, 2023, 11:11 p.m. UTC | #11
On Fri, Jun 16, 2023 at 12:02 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Jun 15, 2023 at 06:55:36AM -0700, Bart Van Assche wrote:
> > On 6/14/23 21:15, Christoph Hellwig wrote:
> >> I really hate having this core complexity, but I suspect trying to driver
> >> hacks would be even worse than that, especially as this goes through
> >> the SCSI midlayer.  I think the answer is simply that if Google keeps
> >> buying broken hardware for their products from Samsung they just need
> >> to stick to a 4k page size instead of moving to a larger one.
> >
> > Although I do not like it that the Exynos UFS controller does not follow
> > the UFS standard, this UFS controller is used much more widely than only in
> > devices produced by my employer. See e.g. the output of the following grep
> > command:
>
> But it seems like no one is insisting on using it with larger than 4k
> page sizes.  I think we should just prohibit using the driver for those
> kernel configs and be done with it.

In addition to Google, Samsung and MediaTek and other vendors have devices
that want to take advantage of 16k page size support and they use the same
Exynos UFS host controller.

For example, these phones could potentially support 16k page sizes:

Samsung Galaxy A54 5G, Exynos 1380
Samsung Galaxy A14 5G, Exynos 1330

See https://semiconductor.samsung.com/us/processor/showcase/smartphone/