Message ID | 1558520319-16452-4-git-send-email-yoshihiro.shimoda.uh@renesas.com (mailing list archive) |
---|---|
State | New, archived |
Series | mmc: renesas_sdhi: improve performance by changing max_segs |
On Wed, May 22, 2019 at 07:18:39PM +0900, Yoshihiro Shimoda wrote:
> In IOMMU environment, since it's possible to merge scatter gather
> buffers of memory requests to one iova, this patch changes the max_segs
> value when init_card of mmc_host timing to improve the transfer
> performance on renesas_sdhi_internal_dmac.
>
> Notes that an sdio card may be possible to use scatter gather buffers
> with non page aligned size, so that this driver will not use multiple
> segments to avoid any trouble. Also, on renesas_sdhi_sys_dmac,
> the max_segs value will change from 32 to 512, but the sys_dmac
> can handle 512 segments, so that this init_card ops is added on
> "TMIO_MMC_MIN_RCAR2" environment.
>
> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

Awesome, Shimoda-san. I think you nailed it, this is nicely readable code!

Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
On Wed, May 22, 2019 at 07:18:39PM +0900, Yoshihiro Shimoda wrote:
> In IOMMU environment, since it's possible to merge scatter gather
> buffers of memory requests to one iova, this patch changes the max_segs
> value when init_card of mmc_host timing to improve the transfer
> performance on renesas_sdhi_internal_dmac.

Well, you can't merge everything with an IOMMU. For one, not every
IOMMU can merge multiple scatterlist segments; second, even if it can merge
segments, the segments need to be aligned to the IOMMU page size. And
then of course we might have an upper limit on the total mapping.

> +	if (host->pdata->max_segs < SDHI_MAX_SEGS_IN_IOMMU &&
> +	    host->pdev->dev.iommu_group &&
> +	    (mmc_card_mmc(card) || mmc_card_sd(card)))
> +		host->mmc->max_segs = SDHI_MAX_SEGS_IN_IOMMU;

This is way too magic. We'll need a proper DMA layer API to expose
this information, and preferably a block layer helper to increase
max_segs instead of hacking that up in the driver.
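The alignment constraint raised in this review can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct seg_model` and `segs_mergeable()` are hypothetical names invented for illustration, the IOMMU page size is assumed to be a power of two, and the model ignores both IOMMUs that cannot merge at all and any upper limit on the total mapping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for one scatterlist segment. */
struct seg_model {
	uint64_t addr;	/* start address of the segment */
	uint64_t len;	/* length in bytes */
};

/*
 * Return true if the segments could be mapped back to back into one
 * contiguous IOVA range: every boundary between two neighbouring
 * segments must fall on an IOMMU page boundary.
 * Assumption: iommu_page_size is a power of two.
 */
static bool segs_mergeable(const struct seg_model *segs, size_t nsegs,
			   uint64_t iommu_page_size)
{
	uint64_t mask = iommu_page_size - 1;
	size_t i;

	for (i = 0; i < nsegs; i++) {
		/* Every segment but the first must start on a page boundary. */
		if (i > 0 && (segs[i].addr & mask))
			return false;
		/* Every segment but the last must end on a page boundary. */
		if (i + 1 < nsegs && ((segs[i].addr + segs[i].len) & mask))
			return false;
	}
	return true;
}
```

With a 4 KiB page size, a segment list whose first entry ends in the middle of a page fails this check, while an unaligned start of the first segment or an unaligned end of the last one does not matter.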
Hi Christoph,

Thank you for your review!

> From: Christoph Hellwig, Sent: Wednesday, May 22, 2019 9:29 PM
>
> On Wed, May 22, 2019 at 07:18:39PM +0900, Yoshihiro Shimoda wrote:
> > In IOMMU environment, since it's possible to merge scatter gather
> > buffers of memory requests to one iova, this patch changes the max_segs
> > value when init_card of mmc_host timing to improve the transfer
> > performance on renesas_sdhi_internal_dmac.
>
> Well, you can't merge everything with an IOMMU. For one, not every
> IOMMU can merge multiple scatterlist segments;

I didn't know such IOMMU exists. But, since R-Car Gen3 IOMMU device
(handled by ipmmu-vmsa.c) can merge multiple scatterlist segments,
should this mmc driver check whether the IOMMU device is used or not somehow?

> second, even if it can merge
> segments, the segments need to be aligned to the IOMMU page size.

If this driver checks whether the segments are aligned to the IOMMU page size
before DMA API is called every time, is it acceptable?
If one of the segments is not aligned, this driver should not use the DMAC.

> And
> then of course we might have an upper limit on the total mapping.

IIUC, if such a case, DMA API will fail. What do you think?

> > +	if (host->pdata->max_segs < SDHI_MAX_SEGS_IN_IOMMU &&
> > +	    host->pdev->dev.iommu_group &&
> > +	    (mmc_card_mmc(card) || mmc_card_sd(card)))
> > +		host->mmc->max_segs = SDHI_MAX_SEGS_IN_IOMMU;
>
> This is way too magic. We'll need a proper DMA layer API to expose
> this information, and preferably a block layer helper to increase
> max_segs instead of hacking that up in the driver.

I think I should have described the detail somewhere.
This can expose this information to a block layer by using
blk_queue_max_segments() that mmc_setup_queue() calls.
In other words, this init_card() ops is called before a block device is created.
Is this acceptable if such a comment is described here?

Best regards,
Yoshihiro Shimoda
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 5e9e36e..2f86975 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -46,6 +46,8 @@
 #define SDHI_VER_GEN3_SD	0xcc10
 #define SDHI_VER_GEN3_SDMMC	0xcd10
 
+#define SDHI_MAX_SEGS_IN_IOMMU	512
+
 struct renesas_sdhi_quirks {
 	bool hs400_disabled;
 	bool hs400_4taps;
@@ -203,6 +205,27 @@ static void renesas_sdhi_clk_disable(struct tmio_mmc_host *host)
 	clk_disable_unprepare(priv->clk_cd);
 }
 
+static void renesas_sdhi_init_card(struct mmc_host *mmc, struct mmc_card *card)
+{
+	struct tmio_mmc_host *host = mmc_priv(mmc);
+
+	/*
+	 * In IOMMU environment, it's possible to merge scatter gather buffers
+	 * of memory requests to one iova so that this code changes
+	 * the max_segs when init_card of mmc_host timing. Notes that an sdio
+	 * card may be possible to use scatter gather buffers with non page
+	 * aligned size, so that this driver will not use multiple segments
+	 * to avoid any trouble even if IOMMU environment.
+	 */
+	if (host->pdata->max_segs < SDHI_MAX_SEGS_IN_IOMMU &&
+	    host->pdev->dev.iommu_group &&
+	    (mmc_card_mmc(card) || mmc_card_sd(card)))
+		host->mmc->max_segs = SDHI_MAX_SEGS_IN_IOMMU;
+	else
+		host->mmc->max_segs = host->pdata->max_segs ? :
+				      TMIO_DEFAULT_MAX_SEGS;
+}
+
 static int renesas_sdhi_card_busy(struct mmc_host *mmc)
 {
 	struct tmio_mmc_host *host = mmc_priv(mmc);
@@ -726,6 +749,8 @@ int renesas_sdhi_probe(struct platform_device *pdev,
 	/* SDR speeds are only available on Gen2+ */
 	if (mmc_data->flags & TMIO_MMC_MIN_RCAR2) {
+		host->ops.init_card = renesas_sdhi_init_card;
+
 		/* card_busy caused issues on r8a73a4 (pre-Gen2) CD-less SDHI */
 		host->ops.card_busy = renesas_sdhi_card_busy;
 		host->ops.start_signal_voltage_switch =
In IOMMU environment, since it's possible to merge scatter gather
buffers of memory requests to one iova, this patch changes the max_segs
value when init_card of mmc_host timing to improve the transfer
performance on renesas_sdhi_internal_dmac.

Notes that an sdio card may be possible to use scatter gather buffers
with non page aligned size, so that this driver will not use multiple
segments to avoid any trouble. Also, on renesas_sdhi_sys_dmac,
the max_segs value will change from 32 to 512, but the sys_dmac
can handle 512 segments, so that this init_card ops is added on
"TMIO_MMC_MIN_RCAR2" environment.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 drivers/mmc/host/renesas_sdhi_core.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
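
For readers who want to trace the selection rule without the kernel headers, the decision made in renesas_sdhi_init_card() can be modelled in plain C roughly as follows. This is a sketch under stated assumptions: `enum card_kind`, `pick_max_segs()` and `MODEL_DEFAULT_MAX_SEGS` are hypothetical names, the `has_iommu` flag stands in for the `dev.iommu_group` check, and the default of 32 is inferred from the "from 32 to 512" remark in the commit message rather than taken from the TMIO headers.

```c
#include <stdbool.h>

#define SDHI_MAX_SEGS_IN_IOMMU	512
/* Assumption: stands in for TMIO_DEFAULT_MAX_SEGS (32 per the commit message). */
#define MODEL_DEFAULT_MAX_SEGS	32

/* Hypothetical card classification, replacing mmc_card_mmc()/mmc_card_sd(). */
enum card_kind { CARD_MMC, CARD_SD, CARD_SDIO };

/*
 * Model of the max_segs choice: raise the limit to 512 only when the
 * platform limit is lower, an IOMMU is present, and the card is MMC or SD.
 * Otherwise fall back to the platform value, or the default if it is 0.
 */
static unsigned int pick_max_segs(unsigned int pdata_max_segs, bool has_iommu,
				  enum card_kind kind)
{
	if (pdata_max_segs < SDHI_MAX_SEGS_IN_IOMMU && has_iommu &&
	    (kind == CARD_MMC || kind == CARD_SD))
		return SDHI_MAX_SEGS_IN_IOMMU;

	return pdata_max_segs ? pdata_max_segs : MODEL_DEFAULT_MAX_SEGS;
}
```

In this model an SDIO card always keeps the platform limit, which mirrors the commit message's caution about scatter gather buffers with non page aligned sizes.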