Message ID | 1494506343-28572-4-git-send-email-ulf.hansson@linaro.org (mailing list archive) |
---|---|
State | New, archived |
On 11/05/17 15:39, Ulf Hansson wrote:
> The current mmc block device implementation is tricky when it comes to
> claim and release of the host, while processing I/O requests. In principle
> we need to claim the host at the first request entering the queue and then
> we need to release the host, as soon as the queue becomes empty. This
> complexity relates to the asynchronous request mechanism that the mmc block
> device driver implements.
>
> For the legacy block interface that we currently implement, the above
> issue can be addressed, as we can find out when the queue really becomes
> empty.
>
> However, to find out whether the queue is empty, isn't really an applicable
> method when using the new blk-mq interface, as requests are instead pushed
> to us via the struct blk_mq_ops and its function pointers.

That is not entirely true. We can pull requests by running the queue, i.e.
blk_mq_run_hw_queues(q, false), returning BLK_MQ_RQ_QUEUE_BUSY and stopping /
starting the queue as needed.

But, as I have written before, we could start with the most trivial
implementation. ->queue_rq() puts the requests in a list and then the
thread removes them from the list.

That would be a good start because it would avoid having to deal with other
issues at the same time.

>
> Being able to support the asynchronous request method using the blk-mq
> interface, means we have to allow the mmc block device driver to re-claim
> the host from different tasks/contexts, as we may have > 1 request to
> operate upon.
>
> Therefore, let's extend the mmc_claim_host() API to support reference
> counting for the mmc block device.

Aren't you overlooking the possibility that there are many block devices per
host, i.e. one per eMMC internal partition?

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
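The "most trivial implementation" suggested above (->queue_rq() puts requests in a list, the thread removes them) can be pictured with a small userspace model. This is only a sketch under stated assumptions: `model_request`, `model_queue`, `model_queue_rq()` and `model_fetch()` are hypothetical names, not kernel or blk-mq API, and the locking and thread wake-up a real driver would need are left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a block request; only what the sketch needs. */
struct model_request {
	int tag;
	struct model_request *next;
};

/* FIFO shared between the ->queue_rq() side and the worker thread side. */
struct model_queue {
	struct model_request *head;
	struct model_request *tail;
};

/*
 * ->queue_rq() side: append the request and return immediately.
 * A real driver would take a lock here and wake the worker thread.
 */
void model_queue_rq(struct model_queue *q, struct model_request *rq)
{
	rq->next = NULL;
	if (q->tail)
		q->tail->next = rq;
	else
		q->head = rq;
	q->tail = rq;
}

/* Thread side: remove the oldest request, or return NULL if the list is empty. */
struct model_request *model_fetch(struct model_queue *q)
{
	struct model_request *rq = q->head;

	if (!rq)
		return NULL;
	q->head = rq->next;
	if (!q->head)
		q->tail = NULL;
	rq->next = NULL;
	return rq;
}
```

In this model the submitter never blocks on the hardware, and the list being empty is exactly the condition under which the thread could release the host.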
On 12 May 2017 at 10:36, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 11/05/17 15:39, Ulf Hansson wrote:
>> The current mmc block device implementation is tricky when it comes to
>> claim and release of the host, while processing I/O requests. In principle
>> we need to claim the host at the first request entering the queue and then
>> we need to release the host, as soon as the queue becomes empty. This
>> complexity relates to the asynchronous request mechanism that the mmc block
>> device driver implements.
>>
>> For the legacy block interface that we currently implement, the above
>> issue can be addressed, as we can find out when the queue really becomes
>> empty.
>>
>> However, to find out whether the queue is empty, isn't really an applicable
>> method when using the new blk-mq interface, as requests are instead pushed
>> to us via the struct blk_mq_ops and its function pointers.
>
> That is not entirely true. We can pull requests by running the queue i.e.
> blk_mq_run_hw_queues(q, false), returning BLK_MQ_RQ_QUEUE_BUSY and stopping
> / starting the queue as needed.

I am not sure how that would work. It doesn't sound very effective to
me, but I may be wrong.

>
> But, as I have written before, we could start with the most trivial
> implementation. ->queue_rq() puts the requests in a list and then the
> thread removes them from the list.

That would work...

>
> That would be a good start because it would avoid having to deal with other
> issues at the same time.

...however this doesn't seem like a step in the direction we want to
take when porting to blkmq.

There will be an extra context switch for each and every request, won't there?

My point is, to be able to convert to blkmq, we must also avoid
performance regression - at least as long as possible.

>
>>
>> Being able to support the asynchronous request method using the blk-mq
>> interface, means we have to allow the mmc block device driver to re-claim
>> the host from different tasks/contexts, as we may have > 1 request to
>> operate upon.
>>
>> Therefore, let's extend the mmc_claim_host() API to support reference
>> counting for the mmc block device.
>
> Aren't you overlooking the possibility that there are many block devices per
> host. i.e. one per eMMC internal partition.

Right now, yes you are right. I hope soon not. :-)

These internal eMMC partitions are today suffering from problems similar
to those we have for mmc ioctls. That means requests are being I/O
scheduled separately for each internal partition, so requests for
one partition could starve requests for another.

I really hope we can fix this in some way or the other. Probably
building upon Linus Walleij's series for fixing the problems for mmc
ioctls [1] is the way to go.

Then, when we have managed to fix these issues, I think my approach
for extending the mmc_claim_host() API could be a possible
intermediate step when trying to complete the blkmq port. Then we can
continue to try to remove/re-work the claim host lock altogether as an
optimization task.

Kind regards
Uffe

[1]
https://www.spinics.net/lists/linux-block/msg12677.html
On 15/05/17 17:05, Ulf Hansson wrote:
> On 12 May 2017 at 10:36, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 11/05/17 15:39, Ulf Hansson wrote:
>>> The current mmc block device implementation is tricky when it comes to
>>> claim and release of the host, while processing I/O requests. In principle
>>> we need to claim the host at the first request entering the queue and then
>>> we need to release the host, as soon as the queue becomes empty. This
>>> complexity relates to the asynchronous request mechanism that the mmc block
>>> device driver implements.
>>>
>>> For the legacy block interface that we currently implement, the above
>>> issue can be addressed, as we can find out when the queue really becomes
>>> empty.
>>>
>>> However, to find out whether the queue is empty, isn't really an applicable
>>> method when using the new blk-mq interface, as requests are instead pushed
>>> to us via the struct blk_mq_ops and its function pointers.
>>
>> That is not entirely true. We can pull requests by running the queue i.e.
>> blk_mq_run_hw_queues(q, false), returning BLK_MQ_RQ_QUEUE_BUSY and stopping
>> / starting the queue as needed.
>
> I am not sure how that would work. It doesn't sound very effective to
> me, but I may be wrong.

The queue depth is not the arbiter of whether we can issue a request. That
means there will certainly be times when we have to return
BLK_MQ_RQ_QUEUE_BUSY from ->queue_rq() and perhaps stop the queue as well.

We could start with ->queue_rq() feeding every request to the existing
thread and work towards having it submit requests immediately when possible.
Currently mmc core cannot submit mmc_requests without waiting, but the
command queue implementation can for read/write requests when the host
controller and card are runtime resumed and the card is switched to the
correct internal partition (and we are not currently discarding, flushing or
recovering), so it might be simpler to start with cmdq ;-)

>
>>
>> But, as I have written before, we could start with the most trivial
>> implementation. ->queue_rq() puts the requests in a list and then the
>> thread removes them from the list.
>
> That would work...
>
>>
>> That would be a good start because it would avoid having to deal with other
>> issues at the same time.
>
> ...however this doesn't seem like a step in the direction we want to
> take when porting to blkmq.
>
> There will be an extra context switch for each and every request, won't there?

No, for synchronous requests, it would be the same as now. ->queue_rq()
would be called in the context of the submitter and would wake the thread
just like ->request_fn() does now.

>
> My point is, to be able to convert to blkmq, we must also avoid
> performance regression - at least as long as possible.

It would still be better to start simple, and measure the performance, than
to guess where the bottlenecks are.

>
>>
>>>
>>> Being able to support the asynchronous request method using the blk-mq
>>> interface, means we have to allow the mmc block device driver to re-claim
>>> the host from different tasks/contexts, as we may have > 1 request to
>>> operate upon.
>>>
>>> Therefore, let's extend the mmc_claim_host() API to support reference
>>> counting for the mmc block device.
>>
>> Aren't you overlooking the possibility that there are many block devices per
>> host. i.e. one per eMMC internal partition.
>
> Right now, yes you are right. I hope soon not. :-)
>
> These internal eMMC partitions are today suffering from problems similar
> to those we have for mmc ioctls. That means requests are being I/O
> scheduled separately for each internal partition, so requests for
> one partition could starve requests for another.
>
> I really hope we can fix this in some way or the other. Probably
> building upon Linus Walleij's series for fixing the problems for mmc
> ioctls [1] is the way to go.
>
> Then, when we have managed to fix these issues, I think my approach
> for extending the mmc_claim_host() API could be a possible
> intermediate step when trying to complete the blkmq port. Then we can
> continue to try to remove/re-work the claim host lock altogether as an
> optimization task.
>
> Kind regards
> Uffe
>
> [1]
> https://www.spinics.net/lists/linux-block/msg12677.html
>
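The conditions listed in the reply above for when the command queue path could submit a read/write request without waiting can be written down as a simple predicate. The sketch below is a userspace model only: `model_cmdq_state`, `model_queue_status` and `model_queue_rq_status()` are hypothetical names, and returning a "busy" status when a condition fails is an assumption made for illustration (a driver could equally hand the request to the thread instead).

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state named in the reply; names are illustrative. */
struct model_cmdq_state {
	bool host_runtime_resumed;
	bool card_runtime_resumed;
	int current_part;	/* internal partition the card is switched to */
	bool discarding;
	bool flushing;
	bool recovering;
};

/* Stand-ins for "accepted" and "busy" results from ->queue_rq(). */
enum model_queue_status { MODEL_QUEUE_OK, MODEL_QUEUE_BUSY };

/* Decide whether a request for partition req_part could be issued right away. */
enum model_queue_status
model_queue_rq_status(const struct model_cmdq_state *s, int req_part)
{
	/* Host controller and card must both be runtime resumed. */
	if (!s->host_runtime_resumed || !s->card_runtime_resumed)
		return MODEL_QUEUE_BUSY;
	/* The card must already be switched to the right internal partition. */
	if (s->current_part != req_part)
		return MODEL_QUEUE_BUSY;
	/* No discard, flush or recovery may be in progress. */
	if (s->discarding || s->flushing || s->recovering)
		return MODEL_QUEUE_BUSY;
	return MODEL_QUEUE_OK;
}
```

None of these checks depend on queue depth, which is the point being made: a free slot in the queue does not by itself mean a request can be issued.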
On 16 May 2017 at 15:24, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 15/05/17 17:05, Ulf Hansson wrote:
>> On 12 May 2017 at 10:36, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> On 11/05/17 15:39, Ulf Hansson wrote:
>>>> The current mmc block device implementation is tricky when it comes to
>>>> claim and release of the host, while processing I/O requests. In principle
>>>> we need to claim the host at the first request entering the queue and then
>>>> we need to release the host, as soon as the queue becomes empty. This
>>>> complexity relates to the asynchronous request mechanism that the mmc block
>>>> device driver implements.
>>>>
>>>> For the legacy block interface that we currently implement, the above
>>>> issue can be addressed, as we can find out when the queue really becomes
>>>> empty.
>>>>
>>>> However, to find out whether the queue is empty, isn't really an applicable
>>>> method when using the new blk-mq interface, as requests are instead pushed
>>>> to us via the struct blk_mq_ops and its function pointers.
>>>
>>> That is not entirely true. We can pull requests by running the queue i.e.
>>> blk_mq_run_hw_queues(q, false), returning BLK_MQ_RQ_QUEUE_BUSY and stopping
>>> / starting the queue as needed.
>>
>> I am not sure how that would work. It doesn't sound very effective to
>> me, but I may be wrong.
>
> The queue depth is not the arbiter of whether we can issue a request. That
> means there will certainly be times when we have to return
> BLK_MQ_RQ_QUEUE_BUSY from ->queue_rq() and perhaps stop the queue as well.
>
> We could start with ->queue_rq() feeding every request to the existing
> thread and work towards having it submit requests immediately when possible.
> Currently mmc core cannot submit mmc_requests without waiting, but the
> command queue implementation can for read/write requests when the host
> controller and card are runtime resumed and the card is switched to the
> correct internal partition (and we are not currently discarding, flushing or
> recovering), so it might be simpler to start with cmdq ;-)

In the end I guess the only thing to do is to compare the patchsets to
see what the result would look like. :-)

My current observation is that our current implementation of the mmc
block device and corresponding mmc queue is still rather messy, even
if you and Linus have recently worked hard to improve the situation.
Moreover, it looks quite different from other block device drivers,
and as we strive to make it more robust and maintainable, that's not
good.

Therefore, I am not really comfortable with replacing one mmc hack for
block device management with yet another, as that seems to be what
your approach would do - unless I am mistaken of course.

Instead I would like us to move into a more generic blk device
approach. Whatever that means. :-)

>
>>
>>>
>>> But, as I have written before, we could start with the most trivial
>>> implementation. ->queue_rq() puts the requests in a list and then the
>>> thread removes them from the list.
>>
>> That would work...
>>
>>>
>>> That would be a good start because it would avoid having to deal with other
>>> issues at the same time.
>>
>> ...however this doesn't seem like a step in the direction we want to
>> take when porting to blkmq.
>>
>> There will be an extra context switch for each and every request, won't there?
>
> No, for synchronous requests, it would be the same as now. ->queue_rq()
> would be called in the context of the submitter and would wake the thread
> just like ->request_fn() does now.

You are right!

However, in my comparison I was thinking of how it *can* work if we
were able to submit/prepare requests in the context of the caller.

>
>>
>> My point is, to be able to convert to blkmq, we must also avoid
>> performance regression - at least as long as possible.
>
> It would still be better to start simple, and measure the performance, than
> to guess where the bottlenecks are.

Yes, starting simple is always good!

Although, even if simple, we need to stop adding more mmc specific
hacks into the mmc block device layer.

[...]

Kind regards
Uffe
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 0701e30..3633699 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -1019,12 +1019,12 @@ unsigned int mmc_align_data_size(struct mmc_card *card, unsigned int sz)
 EXPORT_SYMBOL(mmc_align_data_size);
 
 /**
- *	mmc_claim_host - exclusively claim a host
+ *	__mmc_claim_host - exclusively claim a host
  *	@host: mmc host to claim
  *
  *	Claim a host for a set of operations.
  */
-void mmc_claim_host(struct mmc_host *host)
+void __mmc_claim_host(struct mmc_host *host, bool is_blkdev)
 {
 	DECLARE_WAITQUEUE(wait, current);
 	unsigned long flags;
@@ -1036,7 +1036,11 @@ void mmc_claim_host(struct mmc_host *host)
 	spin_lock_irqsave(&host->lock, flags);
 	while (1) {
 		set_current_state(TASK_UNINTERRUPTIBLE);
-		if (!host->claimed || host->claimer == current)
+		if (!host->claimed)
+			break;
+		if (host->claimer_is_blkdev && is_blkdev)
+			break;
+		if (host->claimer == current)
 			break;
 		spin_unlock_irqrestore(&host->lock, flags);
 		schedule();
@@ -1045,6 +1049,7 @@ void mmc_claim_host(struct mmc_host *host)
 	set_current_state(TASK_RUNNING);
 	host->claimed = 1;
 	host->claimer = current;
+	host->claimer_is_blkdev = is_blkdev;
 	host->claim_cnt += 1;
 	if (host->claim_cnt == 1)
 		pm = true;
@@ -1054,7 +1059,7 @@ void mmc_claim_host(struct mmc_host *host)
 	if (pm)
 		pm_runtime_get_sync(mmc_dev(host));
 }
-EXPORT_SYMBOL(mmc_claim_host);
+EXPORT_SYMBOL(__mmc_claim_host);
 
 /**
  *	mmc_release_host - release a host
@@ -1076,6 +1081,7 @@ void mmc_release_host(struct mmc_host *host)
 	} else {
 		host->claimed = 0;
 		host->claimer = NULL;
+		host->claimer_is_blkdev = 0;
 		spin_unlock_irqrestore(&host->lock, flags);
 		wake_up(&host->wq);
 		pm_runtime_mark_last_busy(mmc_dev(host));
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index b247b1f..1598a37 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -122,9 +122,14 @@ int mmc_set_blocklen(struct mmc_card *card, unsigned int blocklen);
 int mmc_set_blockcount(struct mmc_card *card, unsigned int blockcount,
 			bool is_rel_write);
 
-void mmc_claim_host(struct mmc_host *host);
+void __mmc_claim_host(struct mmc_host *host, bool is_blkdev);
 void mmc_release_host(struct mmc_host *host);
 void mmc_get_card(struct mmc_card *card);
 void mmc_put_card(struct mmc_card *card);
 
+static inline void mmc_claim_host(struct mmc_host *host)
+{
+	__mmc_claim_host(host, 0);
+}
+
 #endif
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 8a4131f..7199817 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -347,6 +347,7 @@ struct mmc_host {
 
 	wait_queue_head_t	wq;
 	struct task_struct	*claimer;	/* task that has host claimed */
+	bool			claimer_is_blkdev; /* claimer is blkdev */
 	int			claim_cnt;	/* "claim" nesting count */
 
 	struct delayed_work	detect;
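To make the claim rule in the diff above easier to follow, here is a minimal userspace model of the three break conditions and the nesting count. It is a sketch, not the kernel code: `model_host`, `model_try_claim()` and `model_release()` are hypothetical names, tasks are represented by integer ids instead of `current`, and where the kernel would sleep on the wait queue under `host->lock` the model simply returns false.

```c
#include <stdbool.h>

/* Hypothetical userspace mirror of the claim-related fields in struct mmc_host. */
struct model_host {
	bool claimed;
	int claimer;		/* task id standing in for "current"; -1 if none */
	bool claimer_is_blkdev;
	int claim_cnt;		/* nesting count */
};

/*
 * Mirrors the patched loop: the claim succeeds if the host is unclaimed,
 * if both the holder and the caller are block-device claimers, or if the
 * caller already holds the claim. Returns false where the kernel would wait.
 */
bool model_try_claim(struct model_host *h, int task, bool is_blkdev)
{
	bool may_claim = !h->claimed ||
			 (h->claimer_is_blkdev && is_blkdev) ||
			 h->claimer == task;

	if (!may_claim)
		return false;
	h->claimed = true;
	h->claimer = task;	/* as in the patch, the latest claimer is recorded */
	h->claimer_is_blkdev = is_blkdev;
	h->claim_cnt += 1;
	return true;
}

/* Drop one nesting level; the host becomes free only when the count hits zero. */
void model_release(struct model_host *h)
{
	if (--h->claim_cnt > 0)
		return;
	h->claimed = false;
	h->claimer = -1;
	h->claimer_is_blkdev = false;
}
```

In this model, two different tasks that both pass `is_blkdev = true` share one claim through the count, while a non-blkdev task is kept out until every nested claim has been released.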
The current mmc block device implementation is tricky when it comes to
claiming and releasing the host while processing I/O requests. In principle,
we need to claim the host when the first request enters the queue and
release it as soon as the queue becomes empty. This complexity stems from
the asynchronous request mechanism that the mmc block device driver
implements.

For the legacy block interface that we currently implement, the above
issue can be addressed, as we can find out when the queue really becomes
empty.

However, checking whether the queue is empty isn't really an applicable
method when using the new blk-mq interface, as requests are instead pushed
to us via struct blk_mq_ops and its function pointers.

Supporting the asynchronous request method with the blk-mq interface means
we have to allow the mmc block device driver to re-claim the host from
different tasks/contexts, as we may have more than one request to operate
upon.

Therefore, let's extend the mmc_claim_host() API to support reference
counting for the mmc block device.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/mmc/core/core.c  | 14 ++++++++++----
 drivers/mmc/core/core.h  |  7 ++++++-
 include/linux/mmc/host.h |  1 +
 3 files changed, 17 insertions(+), 5 deletions(-)