Message ID | 20210401021927.343727-12-ming.lei@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | block: support bio based io polling | expand |
On Thu, Apr 01, 2021 at 10:19:26AM +0800, Ming Lei wrote: > From: Jeffle Xu <jefflexu@linux.alibaba.com> > > This method can be used to check if bio-based device supports IO polling > or not. For mq devices, checking for hw queue in polling mode is > adequate, while the sanity check shall be implementation specific for > bio-based devices. For example, dm device needs to check if all > underlying devices are capable of IO polling. > > Though bio-based device may have done the sanity check during the > device initialization phase, cacheing the result of this sanity check > (such as by cacheing in the queue_flags) may not work. Because for dm > devices, users could change the state of the underlying devices through > '/sys/block/<dev>/io_poll', bypassing the dm device above. In this case, > the cached result of the very beginning sanity check could be > out-of-date. Thus the sanity check needs to be done every time 'io_poll' > is to be modified. I really don't think thi should be a method, and I really do dislike how we have all this "if (is_mq)" junk. Why can't we have a flag on the gendisk that signals if the device can support polling that is autoamtically set for blk-mq and as-needed by bio based drivers? And please move everything that significantly hanges things for the mq based path to separate prep patches early in th series.
On 4/12/21 5:38 PM, Christoph Hellwig wrote: > On Thu, Apr 01, 2021 at 10:19:26AM +0800, Ming Lei wrote: >> From: Jeffle Xu <jefflexu@linux.alibaba.com> >> >> This method can be used to check if bio-based device supports IO polling >> or not. For mq devices, checking for hw queue in polling mode is >> adequate, while the sanity check shall be implementation specific for >> bio-based devices. For example, dm device needs to check if all >> underlying devices are capable of IO polling. >> >> Though bio-based device may have done the sanity check during the >> device initialization phase, cacheing the result of this sanity check >> (such as by cacheing in the queue_flags) may not work. Because for dm >> devices, users could change the state of the underlying devices through >> '/sys/block/<dev>/io_poll', bypassing the dm device above. In this case, >> the cached result of the very beginning sanity check could be >> out-of-date. Thus the sanity check needs to be done every time 'io_poll' >> is to be modified. > > I really don't think thi should be a method, and I really do dislike > how we have all this "if (is_mq)" junk. Why can't we have a flag on > the gendisk that signals if the device can support polling that > is autoamtically set for blk-mq and as-needed by bio based drivers? That would consume one more bit of queue->queue_flags. Besides, DM/MD is somehow special here that when one of the underlying devices is disabled polling through '/sys/block/<dev>/io_poll', currently there's no mechanism notifying the above MD/DM to clear the previously set queue_flags. Thus the outdated queue_flags still indicates this DM/MD is capable of polling, while in fact one of the underlying device has been disabled for polling. Mike had ever suggested that we can trust the queue_flag, and clear the outdated queue_flags when later the IO submission or polling routine finally finds that the device is not capable of polling. Currently submit_bio_checks() will silently clear the REQ_HIPRI flag and still submit the bio when the device is actually not capable of polling. To fix the issue, could we break the submission and return an error code in submit_bio_checks() if the device is not capable of polling when submitting HIPRI bio? > And please move everything that significantly hanges things for the > mq based path to separate prep patches early in th series. >
On Wed, Apr 14, 2021 at 04:38:25PM +0800, JeffleXu wrote: > > > On 4/12/21 5:38 PM, Christoph Hellwig wrote: > > On Thu, Apr 01, 2021 at 10:19:26AM +0800, Ming Lei wrote: > >> From: Jeffle Xu <jefflexu@linux.alibaba.com> > >> > >> This method can be used to check if bio-based device supports IO polling > >> or not. For mq devices, checking for hw queue in polling mode is > >> adequate, while the sanity check shall be implementation specific for > >> bio-based devices. For example, dm device needs to check if all > >> underlying devices are capable of IO polling. > >> > >> Though bio-based device may have done the sanity check during the > >> device initialization phase, cacheing the result of this sanity check > >> (such as by cacheing in the queue_flags) may not work. Because for dm > >> devices, users could change the state of the underlying devices through > >> '/sys/block/<dev>/io_poll', bypassing the dm device above. In this case, > >> the cached result of the very beginning sanity check could be > >> out-of-date. Thus the sanity check needs to be done every time 'io_poll' > >> is to be modified. > > > > I really don't think thi should be a method, and I really do dislike > > how we have all this "if (is_mq)" junk. Why can't we have a flag on > > the gendisk that signals if the device can support polling that > > is autoamtically set for blk-mq and as-needed by bio based drivers? > > That would consume one more bit of queue->queue_flags. > > Besides, DM/MD is somehow special here that when one of the underlying > devices is disabled polling through '/sys/block/<dev>/io_poll', > currently there's no mechanism notifying the above MD/DM to clear the > previously set queue_flags. Thus the outdated queue_flags still > indicates this DM/MD is capable of polling, while in fact one of the > underlying device has been disabled for polling. Right, just like there isn't queue limit progagation. Another blocker could be that bio based queue doesn't support queue freezing. > > Mike had ever suggested that we can trust the queue_flag, and clear the > outdated queue_flags when later the IO submission or polling routine > finally finds that the device is not capable of polling. Currently > submit_bio_checks() will silently clear the REQ_HIPRI flag and still > submit the bio when the device is actually not capable of polling. To > fix the issue, could we break the submission and return an error code in > submit_bio_checks() if the device is not capable of polling when > submitting HIPRI bio? I think we may just leave it alone, if underlying queue becomes not pollable, the bio still can be submitted & completed via IRQ, just not efficient enough. Thanks, Ming
On 4/14/21 7:24 PM, Ming Lei wrote: > On Wed, Apr 14, 2021 at 04:38:25PM +0800, JeffleXu wrote: >> >> >> On 4/12/21 5:38 PM, Christoph Hellwig wrote: >>> On Thu, Apr 01, 2021 at 10:19:26AM +0800, Ming Lei wrote: >>>> From: Jeffle Xu <jefflexu@linux.alibaba.com> >>>> >>>> This method can be used to check if bio-based device supports IO polling >>>> or not. For mq devices, checking for hw queue in polling mode is >>>> adequate, while the sanity check shall be implementation specific for >>>> bio-based devices. For example, dm device needs to check if all >>>> underlying devices are capable of IO polling. >>>> >>>> Though bio-based device may have done the sanity check during the >>>> device initialization phase, cacheing the result of this sanity check >>>> (such as by cacheing in the queue_flags) may not work. Because for dm >>>> devices, users could change the state of the underlying devices through >>>> '/sys/block/<dev>/io_poll', bypassing the dm device above. In this case, >>>> the cached result of the very beginning sanity check could be >>>> out-of-date. Thus the sanity check needs to be done every time 'io_poll' >>>> is to be modified. >>> >>> I really don't think thi should be a method, and I really do dislike >>> how we have all this "if (is_mq)" junk. Why can't we have a flag on >>> the gendisk that signals if the device can support polling that >>> is autoamtically set for blk-mq and as-needed by bio based drivers? >> >> That would consume one more bit of queue->queue_flags. >> >> Besides, DM/MD is somehow special here that when one of the underlying >> devices is disabled polling through '/sys/block/<dev>/io_poll', >> currently there's no mechanism notifying the above MD/DM to clear the >> previously set queue_flags. Thus the outdated queue_flags still >> indicates this DM/MD is capable of polling, while in fact one of the >> underlying device has been disabled for polling. > > Right, just like there isn't queue limit progagation. > > Another blocker could be that bio based queue doesn't support queue > freezing. Do you mean the queue freezing is called in the following code snippet? ``` static ssize_t queue_poll_store(struct request_queue *q, const char *page, size_t count) { ... if (poll_on) { blk_queue_flag_set(QUEUE_FLAG_POLL, q); } else { blk_mq_freeze_queue(q); blk_queue_flag_clear(QUEUE_FLAG_POLL, q); blk_mq_unfreeze_queue(q); } ``` And I can't understand how bio-based queue doesn't support queue freezing. ``` submit_bio_noacct __submit_bio_noacct bio_queue_enter ``` Every time submitting a bio, bio_queue_enter() will be called, and once the queue has been frozen, bio_queue_enter() will wait there until the queue is unfrozen. > >> >> Mike had ever suggested that we can trust the queue_flag, and clear the >> outdated queue_flags when later the IO submission or polling routine >> finally finds that the device is not capable of polling. Currently >> submit_bio_checks() will silently clear the REQ_HIPRI flag and still >> submit the bio when the device is actually not capable of polling. To >> fix the issue, could we break the submission and return an error code in >> submit_bio_checks() if the device is not capable of polling when >> submitting HIPRI bio? > > I think we may just leave it alone, if underlying queue becomes not pollable, > the bio still can be submitted & completed via IRQ, just not efficient enough. Yes it still works. I agree if there's no better solution... And what about the issue Christoph originally concerned? Do we use one more flag bit indicating if the queue capable of polling, or the poll_capable() method way?
On Thu, Apr 15, 2021 at 09:34:36AM +0800, JeffleXu wrote: > > > On 4/14/21 7:24 PM, Ming Lei wrote: > > On Wed, Apr 14, 2021 at 04:38:25PM +0800, JeffleXu wrote: > >> > >> > >> On 4/12/21 5:38 PM, Christoph Hellwig wrote: > >>> On Thu, Apr 01, 2021 at 10:19:26AM +0800, Ming Lei wrote: > >>>> From: Jeffle Xu <jefflexu@linux.alibaba.com> > >>>> > >>>> This method can be used to check if bio-based device supports IO polling > >>>> or not. For mq devices, checking for hw queue in polling mode is > >>>> adequate, while the sanity check shall be implementation specific for > >>>> bio-based devices. For example, dm device needs to check if all > >>>> underlying devices are capable of IO polling. > >>>> > >>>> Though bio-based device may have done the sanity check during the > >>>> device initialization phase, cacheing the result of this sanity check > >>>> (such as by cacheing in the queue_flags) may not work. Because for dm > >>>> devices, users could change the state of the underlying devices through > >>>> '/sys/block/<dev>/io_poll', bypassing the dm device above. In this case, > >>>> the cached result of the very beginning sanity check could be > >>>> out-of-date. Thus the sanity check needs to be done every time 'io_poll' > >>>> is to be modified. > >>> > >>> I really don't think thi should be a method, and I really do dislike > >>> how we have all this "if (is_mq)" junk. Why can't we have a flag on > >>> the gendisk that signals if the device can support polling that > >>> is autoamtically set for blk-mq and as-needed by bio based drivers? > >> > >> That would consume one more bit of queue->queue_flags. > >> > >> Besides, DM/MD is somehow special here that when one of the underlying > >> devices is disabled polling through '/sys/block/<dev>/io_poll', > >> currently there's no mechanism notifying the above MD/DM to clear the > >> previously set queue_flags. Thus the outdated queue_flags still > >> indicates this DM/MD is capable of polling, while in fact one of the > >> underlying device has been disabled for polling. > > > > Right, just like there isn't queue limit progagation. > > > > Another blocker could be that bio based queue doesn't support queue > > freezing. > > Do you mean the queue freezing is called in the following code snippet? > > ``` > static ssize_t queue_poll_store(struct request_queue *q, const char > *page, size_t count) > { > ... > if (poll_on) { > blk_queue_flag_set(QUEUE_FLAG_POLL, q); > } else { > blk_mq_freeze_queue(q); > blk_queue_flag_clear(QUEUE_FLAG_POLL, q); > blk_mq_unfreeze_queue(q); > } > ``` Yes, if it is a bio based queue. Or bio queued queue(DM, MD or others) may use freeze_queue to do similar thing. > > And I can't understand how bio-based queue doesn't support queue freezing. > > ``` > submit_bio_noacct > __submit_bio_noacct > bio_queue_enter > ``` > > Every time submitting a bio, bio_queue_enter() will be called, and once > the queue has been frozen, bio_queue_enter() will wait there until the > queue is unfrozen. Not like blk-mq, the refcount is just grabbed during submission for bio based queue. I will research a bit and see if we can extend freeze queue for covering bio based queue. One trouble is that bio is ended before freeing request. > > > > >> > >> Mike had ever suggested that we can trust the queue_flag, and clear the > >> outdated queue_flags when later the IO submission or polling routine > >> finally finds that the device is not capable of polling. Currently > >> submit_bio_checks() will silently clear the REQ_HIPRI flag and still > >> submit the bio when the device is actually not capable of polling. To > >> fix the issue, could we break the submission and return an error code in > >> submit_bio_checks() if the device is not capable of polling when > >> submitting HIPRI bio? > > > > I think we may just leave it alone, if underlying queue becomes not pollable, > > the bio still can be submitted & completed via IRQ, just not efficient enough. > > Yes it still works. I agree if there's no better solution... > > And what about the issue Christoph originally concerned? Do we use one > more flag bit indicating if the queue capable of polling, or the > poll_capable() method way? Just wondering why we can't use QUEUE_FLAG_POLL simply? If user wants to enable it, let's do it for them. And bio driver can start with default poll state by checking underlying queues. Thanks, Ming
On 4/15/21 3:43 PM, Ming Lei wrote: > On Thu, Apr 15, 2021 at 09:34:36AM +0800, JeffleXu wrote: >> >> >> On 4/14/21 7:24 PM, Ming Lei wrote: >>> On Wed, Apr 14, 2021 at 04:38:25PM +0800, JeffleXu wrote: >>>> >>>> >>>> On 4/12/21 5:38 PM, Christoph Hellwig wrote: >>>>> On Thu, Apr 01, 2021 at 10:19:26AM +0800, Ming Lei wrote: >>>>>> From: Jeffle Xu <jefflexu@linux.alibaba.com> >>>>>> >>>>>> This method can be used to check if bio-based device supports IO polling >>>>>> or not. For mq devices, checking for hw queue in polling mode is >>>>>> adequate, while the sanity check shall be implementation specific for >>>>>> bio-based devices. For example, dm device needs to check if all >>>>>> underlying devices are capable of IO polling. >>>>>> >>>>>> Though bio-based device may have done the sanity check during the >>>>>> device initialization phase, cacheing the result of this sanity check >>>>>> (such as by cacheing in the queue_flags) may not work. Because for dm >>>>>> devices, users could change the state of the underlying devices through >>>>>> '/sys/block/<dev>/io_poll', bypassing the dm device above. In this case, >>>>>> the cached result of the very beginning sanity check could be >>>>>> out-of-date. Thus the sanity check needs to be done every time 'io_poll' >>>>>> is to be modified. >>>>> >>>>> I really don't think thi should be a method, and I really do dislike >>>>> how we have all this "if (is_mq)" junk. Why can't we have a flag on >>>>> the gendisk that signals if the device can support polling that >>>>> is autoamtically set for blk-mq and as-needed by bio based drivers? >>>> >>>> That would consume one more bit of queue->queue_flags. >>>> >>>> Besides, DM/MD is somehow special here that when one of the underlying >>>> devices is disabled polling through '/sys/block/<dev>/io_poll', >>>> currently there's no mechanism notifying the above MD/DM to clear the >>>> previously set queue_flags. Thus the outdated queue_flags still >>>> indicates this DM/MD is capable of polling, while in fact one of the >>>> underlying device has been disabled for polling. >>> >>> Right, just like there isn't queue limit progagation. >>> >>> Another blocker could be that bio based queue doesn't support queue >>> freezing. >> >> Do you mean the queue freezing is called in the following code snippet? >> >> ``` >> static ssize_t queue_poll_store(struct request_queue *q, const char >> *page, size_t count) >> { >> ... >> if (poll_on) { >> blk_queue_flag_set(QUEUE_FLAG_POLL, q); >> } else { >> blk_mq_freeze_queue(q); >> blk_queue_flag_clear(QUEUE_FLAG_POLL, q); >> blk_mq_unfreeze_queue(q); >> } >> ``` > > Yes, if it is a bio based queue. Or bio queued queue(DM, MD or others) may > use freeze_queue to do similar thing. > >> >> And I can't understand how bio-based queue doesn't support queue freezing. >> >> ``` >> submit_bio_noacct >> __submit_bio_noacct >> bio_queue_enter >> ``` >> >> Every time submitting a bio, bio_queue_enter() will be called, and once >> the queue has been frozen, bio_queue_enter() will wait there until the >> queue is unfrozen. > > Not like blk-mq, the refcount is just grabbed during submission for bio based > queue. Could you please explain it more detailed .... I will research a bit and see if we can extend freeze queue for > covering bio based queue. One trouble is that bio is ended before > freeing request. > >> >>> >>>> >>>> Mike had ever suggested that we can trust the queue_flag, and clear the >>>> outdated queue_flags when later the IO submission or polling routine >>>> finally finds that the device is not capable of polling. Currently >>>> submit_bio_checks() will silently clear the REQ_HIPRI flag and still >>>> submit the bio when the device is actually not capable of polling. To >>>> fix the issue, could we break the submission and return an error code in >>>> submit_bio_checks() if the device is not capable of polling when >>>> submitting HIPRI bio? >>> >>> I think we may just leave it alone, if underlying queue becomes not pollable, >>> the bio still can be submitted & completed via IRQ, just not efficient enough. >> >> Yes it still works. I agree if there's no better solution... >> >> And what about the issue Christoph originally concerned? Do we use one >> more flag bit indicating if the queue capable of polling, or the >> poll_capable() method way? > > Just wondering why we can't use QUEUE_FLAG_POLL simply? If user wants to > enable it, let's do it for them. And bio driver can start with default poll > state by checking underlying queues. > Consider the following scenario: QUEUE_FLAG_POLL is set after initialization, indicating the device capable of polling; then polling is turned off by '/sys/block/<dev>/io_poll', thus QUEUE_FLAG_POLL is cleared. So we have to implement two semantics: 1. a flag indicating whether the device is **capable** of polling or not; (adding another flag bit, or calling poll_capable() every time polling is to be turned on) 2. a flag indicating whether polling is currently **turned on** or not (i.e., QUEUE_FLAG_POLL)
On Thu, Apr 15, 2021 at 05:21:56PM +0800, JeffleXu wrote: > > > On 4/15/21 3:43 PM, Ming Lei wrote: > > On Thu, Apr 15, 2021 at 09:34:36AM +0800, JeffleXu wrote: > >> > >> > >> On 4/14/21 7:24 PM, Ming Lei wrote: > >>> On Wed, Apr 14, 2021 at 04:38:25PM +0800, JeffleXu wrote: > >>>> > >>>> > >>>> On 4/12/21 5:38 PM, Christoph Hellwig wrote: > >>>>> On Thu, Apr 01, 2021 at 10:19:26AM +0800, Ming Lei wrote: > >>>>>> From: Jeffle Xu <jefflexu@linux.alibaba.com> > >>>>>> > >>>>>> This method can be used to check if bio-based device supports IO polling > >>>>>> or not. For mq devices, checking for hw queue in polling mode is > >>>>>> adequate, while the sanity check shall be implementation specific for > >>>>>> bio-based devices. For example, dm device needs to check if all > >>>>>> underlying devices are capable of IO polling. > >>>>>> > >>>>>> Though bio-based device may have done the sanity check during the > >>>>>> device initialization phase, cacheing the result of this sanity check > >>>>>> (such as by cacheing in the queue_flags) may not work. Because for dm > >>>>>> devices, users could change the state of the underlying devices through > >>>>>> '/sys/block/<dev>/io_poll', bypassing the dm device above. In this case, > >>>>>> the cached result of the very beginning sanity check could be > >>>>>> out-of-date. Thus the sanity check needs to be done every time 'io_poll' > >>>>>> is to be modified. > >>>>> > >>>>> I really don't think thi should be a method, and I really do dislike > >>>>> how we have all this "if (is_mq)" junk. Why can't we have a flag on > >>>>> the gendisk that signals if the device can support polling that > >>>>> is autoamtically set for blk-mq and as-needed by bio based drivers? > >>>> > >>>> That would consume one more bit of queue->queue_flags. > >>>> > >>>> Besides, DM/MD is somehow special here that when one of the underlying > >>>> devices is disabled polling through '/sys/block/<dev>/io_poll', > >>>> currently there's no mechanism notifying the above MD/DM to clear the > >>>> previously set queue_flags. Thus the outdated queue_flags still > >>>> indicates this DM/MD is capable of polling, while in fact one of the > >>>> underlying device has been disabled for polling. > >>> > >>> Right, just like there isn't queue limit progagation. > >>> > >>> Another blocker could be that bio based queue doesn't support queue > >>> freezing. > >> > >> Do you mean the queue freezing is called in the following code snippet? > >> > >> ``` > >> static ssize_t queue_poll_store(struct request_queue *q, const char > >> *page, size_t count) > >> { > >> ... > >> if (poll_on) { > >> blk_queue_flag_set(QUEUE_FLAG_POLL, q); > >> } else { > >> blk_mq_freeze_queue(q); > >> blk_queue_flag_clear(QUEUE_FLAG_POLL, q); > >> blk_mq_unfreeze_queue(q); > >> } > >> ``` > > > > Yes, if it is a bio based queue. Or bio queued queue(DM, MD or others) may > > use freeze_queue to do similar thing. > > > >> > >> And I can't understand how bio-based queue doesn't support queue freezing. > >> > >> ``` > >> submit_bio_noacct > >> __submit_bio_noacct > >> bio_queue_enter > >> ``` > >> > >> Every time submitting a bio, bio_queue_enter() will be called, and once > >> the queue has been frozen, bio_queue_enter() will wait there until the > >> queue is unfrozen. > > > > Not like blk-mq, the refcount is just grabbed during submission for bio based > > queue. > > Could you please explain it more detailed .... Please see __submit_bio(), in which the queue ref is dropped. > > > I will research a bit and see if we can extend freeze queue for > > covering bio based queue. One trouble is that bio is ended before > > freeing request. > > > >> > >>> > >>>> > >>>> Mike had ever suggested that we can trust the queue_flag, and clear the > >>>> outdated queue_flags when later the IO submission or polling routine > >>>> finally finds that the device is not capable of polling. Currently > >>>> submit_bio_checks() will silently clear the REQ_HIPRI flag and still > >>>> submit the bio when the device is actually not capable of polling. To > >>>> fix the issue, could we break the submission and return an error code in > >>>> submit_bio_checks() if the device is not capable of polling when > >>>> submitting HIPRI bio? > >>> > >>> I think we may just leave it alone, if underlying queue becomes not pollable, > >>> the bio still can be submitted & completed via IRQ, just not efficient enough. > >> > >> Yes it still works. I agree if there's no better solution... > >> > >> And what about the issue Christoph originally concerned? Do we use one > >> more flag bit indicating if the queue capable of polling, or the > >> poll_capable() method way? > > > > Just wondering why we can't use QUEUE_FLAG_POLL simply? If user wants to > > enable it, let's do it for them. And bio driver can start with default poll > > state by checking underlying queues. > > > > Consider the following scenario: QUEUE_FLAG_POLL is set after > initialization, indicating the device capable of polling; then polling > is turned off by '/sys/block/<dev>/io_poll', thus QUEUE_FLAG_POLL is > cleared. If the flag is cleared, the bio will be submitted to irq queue, what is the problem? Thanks, Ming
On 4/15/21 6:06 PM, Ming Lei wrote: > On Thu, Apr 15, 2021 at 05:21:56PM +0800, JeffleXu wrote: >> >> >> On 4/15/21 3:43 PM, Ming Lei wrote: >>> On Thu, Apr 15, 2021 at 09:34:36AM +0800, JeffleXu wrote: >>>> >>>> >>>> On 4/14/21 7:24 PM, Ming Lei wrote: >>>>> On Wed, Apr 14, 2021 at 04:38:25PM +0800, JeffleXu wrote: >>>>>> >>>>>> >>>>>> On 4/12/21 5:38 PM, Christoph Hellwig wrote: >>>>>>> On Thu, Apr 01, 2021 at 10:19:26AM +0800, Ming Lei wrote: >>>>>>>> From: Jeffle Xu <jefflexu@linux.alibaba.com> >>>>>>>> >>>>>>>> This method can be used to check if bio-based device supports IO polling >>>>>>>> or not. For mq devices, checking for hw queue in polling mode is >>>>>>>> adequate, while the sanity check shall be implementation specific for >>>>>>>> bio-based devices. For example, dm device needs to check if all >>>>>>>> underlying devices are capable of IO polling. >>>>>>>> >>>>>>>> Though bio-based device may have done the sanity check during the >>>>>>>> device initialization phase, cacheing the result of this sanity check >>>>>>>> (such as by cacheing in the queue_flags) may not work. Because for dm >>>>>>>> devices, users could change the state of the underlying devices through >>>>>>>> '/sys/block/<dev>/io_poll', bypassing the dm device above. In this case, >>>>>>>> the cached result of the very beginning sanity check could be >>>>>>>> out-of-date. Thus the sanity check needs to be done every time 'io_poll' >>>>>>>> is to be modified. >>>>>>> >>>>>>> I really don't think thi should be a method, and I really do dislike >>>>>>> how we have all this "if (is_mq)" junk. Why can't we have a flag on >>>>>>> the gendisk that signals if the device can support polling that >>>>>>> is autoamtically set for blk-mq and as-needed by bio based drivers? >>>>>> >>>>>> That would consume one more bit of queue->queue_flags. >>>>>> >>>>>> Besides, DM/MD is somehow special here that when one of the underlying >>>>>> devices is disabled polling through '/sys/block/<dev>/io_poll', >>>>>> currently there's no mechanism notifying the above MD/DM to clear the >>>>>> previously set queue_flags. Thus the outdated queue_flags still >>>>>> indicates this DM/MD is capable of polling, while in fact one of the >>>>>> underlying device has been disabled for polling. >>>>> >>>>> Right, just like there isn't queue limit progagation. >>>>> >>>>> Another blocker could be that bio based queue doesn't support queue >>>>> freezing. >>>> >>>> Do you mean the queue freezing is called in the following code snippet? >>>> >>>> ``` >>>> static ssize_t queue_poll_store(struct request_queue *q, const char >>>> *page, size_t count) >>>> { >>>> ... >>>> if (poll_on) { >>>> blk_queue_flag_set(QUEUE_FLAG_POLL, q); >>>> } else { >>>> blk_mq_freeze_queue(q); >>>> blk_queue_flag_clear(QUEUE_FLAG_POLL, q); >>>> blk_mq_unfreeze_queue(q); >>>> } >>>> ``` >>> >>> Yes, if it is a bio based queue. Or bio queued queue(DM, MD or others) may >>> use freeze_queue to do similar thing. >>> >>>> >>>> And I can't understand how bio-based queue doesn't support queue freezing. >>>> >>>> ``` >>>> submit_bio_noacct >>>> __submit_bio_noacct >>>> bio_queue_enter >>>> ``` >>>> >>>> Every time submitting a bio, bio_queue_enter() will be called, and once >>>> the queue has been frozen, bio_queue_enter() will wait there until the >>>> queue is unfrozen. >>> >>> Not like blk-mq, the refcount is just grabbed during submission for bio based >>> queue. >> >> Could you please explain it more detailed .... > > Please see __submit_bio(), in which the queue ref is dropped. > >> >> >> I will research a bit and see if we can extend freeze queue for >>> covering bio based queue. One trouble is that bio is ended before >>> freeing request. >>> >>>> >>>>> >>>>>> >>>>>> Mike had ever suggested that we can trust the queue_flag, and clear the >>>>>> outdated queue_flags when later the IO submission or polling routine >>>>>> finally finds that the device is not capable of polling. Currently >>>>>> submit_bio_checks() will silently clear the REQ_HIPRI flag and still >>>>>> submit the bio when the device is actually not capable of polling. To >>>>>> fix the issue, could we break the submission and return an error code in >>>>>> submit_bio_checks() if the device is not capable of polling when >>>>>> submitting HIPRI bio? >>>>> >>>>> I think we may just leave it alone, if underlying queue becomes not pollable, >>>>> the bio still can be submitted & completed via IRQ, just not efficient enough. >>>> >>>> Yes it still works. I agree if there's no better solution... >>>> >>>> And what about the issue Christoph originally concerned? Do we use one >>>> more flag bit indicating if the queue capable of polling, or the >>>> poll_capable() method way? >>> >>> Just wondering why we can't use QUEUE_FLAG_POLL simply? If user wants to >>> enable it, let's do it for them. And bio driver can start with default poll >>> state by checking underlying queues. >>> >> >> Consider the following scenario: QUEUE_FLAG_POLL is set after >> initialization, indicating the device capable of polling; then polling >> is turned off by '/sys/block/<dev>/io_poll', thus QUEUE_FLAG_POLL is >> cleared. > > If the flag is cleared, the bio will be submitted to irq queue, what is > the problem? > The IO path has no problem. It is the control path. If you want to turn on polling then, you have to check if the device capable of polling, while QUEUE_FLAG_POLL has been cleared in this case. IOW you can't rely on QUEUE_FLAG_POLL to see if the device has the **ability** of polling. QUEUE_FLAG_POLL flag only indicates if polling is turned on or off currently.
On Thu, Apr 15, 2021 at 07:21:52PM +0800, JeffleXu wrote: > > > On 4/15/21 6:06 PM, Ming Lei wrote: > > On Thu, Apr 15, 2021 at 05:21:56PM +0800, JeffleXu wrote: > >> > >> > >> On 4/15/21 3:43 PM, Ming Lei wrote: > >>> On Thu, Apr 15, 2021 at 09:34:36AM +0800, JeffleXu wrote: > >>>> > >>>> > >>>> On 4/14/21 7:24 PM, Ming Lei wrote: > >>>>> On Wed, Apr 14, 2021 at 04:38:25PM +0800, JeffleXu wrote: > >>>>>> > >>>>>> > >>>>>> On 4/12/21 5:38 PM, Christoph Hellwig wrote: > >>>>>>> On Thu, Apr 01, 2021 at 10:19:26AM +0800, Ming Lei wrote: > >>>>>>>> From: Jeffle Xu <jefflexu@linux.alibaba.com> > >>>>>>>> > >>>>>>>> This method can be used to check if bio-based device supports IO polling > >>>>>>>> or not. For mq devices, checking for hw queue in polling mode is > >>>>>>>> adequate, while the sanity check shall be implementation specific for > >>>>>>>> bio-based devices. For example, dm device needs to check if all > >>>>>>>> underlying devices are capable of IO polling. > >>>>>>>> > >>>>>>>> Though bio-based device may have done the sanity check during the > >>>>>>>> device initialization phase, cacheing the result of this sanity check > >>>>>>>> (such as by cacheing in the queue_flags) may not work. Because for dm > >>>>>>>> devices, users could change the state of the underlying devices through > >>>>>>>> '/sys/block/<dev>/io_poll', bypassing the dm device above. In this case, > >>>>>>>> the cached result of the very beginning sanity check could be > >>>>>>>> out-of-date. Thus the sanity check needs to be done every time 'io_poll' > >>>>>>>> is to be modified. > >>>>>>> > >>>>>>> I really don't think thi should be a method, and I really do dislike > >>>>>>> how we have all this "if (is_mq)" junk. Why can't we have a flag on > >>>>>>> the gendisk that signals if the device can support polling that > >>>>>>> is autoamtically set for blk-mq and as-needed by bio based drivers? > >>>>>> > >>>>>> That would consume one more bit of queue->queue_flags. > >>>>>> > >>>>>> Besides, DM/MD is somehow special here that when one of the underlying > >>>>>> devices is disabled polling through '/sys/block/<dev>/io_poll', > >>>>>> currently there's no mechanism notifying the above MD/DM to clear the > >>>>>> previously set queue_flags. Thus the outdated queue_flags still > >>>>>> indicates this DM/MD is capable of polling, while in fact one of the > >>>>>> underlying device has been disabled for polling. > >>>>> > >>>>> Right, just like there isn't queue limit progagation. > >>>>> > >>>>> Another blocker could be that bio based queue doesn't support queue > >>>>> freezing. > >>>> > >>>> Do you mean the queue freezing is called in the following code snippet? > >>>> > >>>> ``` > >>>> static ssize_t queue_poll_store(struct request_queue *q, const char > >>>> *page, size_t count) > >>>> { > >>>> ... > >>>> if (poll_on) { > >>>> blk_queue_flag_set(QUEUE_FLAG_POLL, q); > >>>> } else { > >>>> blk_mq_freeze_queue(q); > >>>> blk_queue_flag_clear(QUEUE_FLAG_POLL, q); > >>>> blk_mq_unfreeze_queue(q); > >>>> } > >>>> ``` > >>> > >>> Yes, if it is a bio based queue. Or bio queued queue(DM, MD or others) may > >>> use freeze_queue to do similar thing. > >>> > >>>> > >>>> And I can't understand how bio-based queue doesn't support queue freezing. > >>>> > >>>> ``` > >>>> submit_bio_noacct > >>>> __submit_bio_noacct > >>>> bio_queue_enter > >>>> ``` > >>>> > >>>> Every time submitting a bio, bio_queue_enter() will be called, and once > >>>> the queue has been frozen, bio_queue_enter() will wait there until the > >>>> queue is unfrozen. > >>> > >>> Not like blk-mq, the refcount is just grabbed during submission for bio based > >>> queue. > >> > >> Could you please explain it more detailed .... > > > > Please see __submit_bio(), in which the queue ref is dropped. > > > >> > >> > >> I will research a bit and see if we can extend freeze queue for > >>> covering bio based queue. One trouble is that bio is ended before > >>> freeing request. > >>> > >>>> > >>>>> > >>>>>> > >>>>>> Mike had ever suggested that we can trust the queue_flag, and clear the > >>>>>> outdated queue_flags when later the IO submission or polling routine > >>>>>> finally finds that the device is not capable of polling. Currently > >>>>>> submit_bio_checks() will silently clear the REQ_HIPRI flag and still > >>>>>> submit the bio when the device is actually not capable of polling. To > >>>>>> fix the issue, could we break the submission and return an error code in > >>>>>> submit_bio_checks() if the device is not capable of polling when > >>>>>> submitting HIPRI bio? > >>>>> > >>>>> I think we may just leave it alone, if underlying queue becomes not pollable, > >>>>> the bio still can be submitted & completed via IRQ, just not efficient enough. > >>>> > >>>> Yes it still works. I agree if there's no better solution... > >>>> > >>>> And what about the issue Christoph originally concerned? Do we use one > >>>> more flag bit indicating if the queue capable of polling, or the > >>>> poll_capable() method way? > >>> > >>> Just wondering why we can't use QUEUE_FLAG_POLL simply? If user wants to > >>> enable it, let's do it for them. And bio driver can start with default poll > >>> state by checking underlying queues. > >>> > >> > >> Consider the following scenario: QUEUE_FLAG_POLL is set after > >> initialization, indicating the device capable of polling; then polling > >> is turned off by '/sys/block/<dev>/io_poll', thus QUEUE_FLAG_POLL is > >> cleared. > > > > If the flag is cleared, the bio will be submitted to irq queue, what is > > the problem? > > > > The IO path has no problem. It is the control path. If you want to turn Can you explain a bit what the control path is? > on polling then, you have to check if the device capable of polling, > while QUEUE_FLAG_POLL has been cleared in this case. IOW you can't rely > on QUEUE_FLAG_POLL to see if the device has the **ability** of polling. > QUEUE_FLAG_POLL flag only indicates if polling is turned on or off > currently. For bio based driver, I'd suggest to start with do polling simply if QUEUE_FLAG_POLL is set in bio request queue flag. The flag can be enabled/disabled during initialization, or via sysfs. That said we can start with always thinking the bio queue is capable of io polling. Thanks, Ming
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index db3268d41274..c8e7e4af66cb 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -426,9 +426,17 @@ static ssize_t queue_poll_store(struct request_queue *q, const char *page, unsigned long poll_on; ssize_t ret; - if (!q->tag_set || q->tag_set->nr_maps <= HCTX_TYPE_POLL || - !q->tag_set->map[HCTX_TYPE_POLL].nr_queues) - return -EINVAL; + if (queue_is_mq(q)) { + if (!q->tag_set || q->tag_set->nr_maps <= HCTX_TYPE_POLL || + !q->tag_set->map[HCTX_TYPE_POLL].nr_queues) + return -EINVAL; + } else { + struct gendisk *disk = queue_to_disk(q); + + if (!disk->fops->poll_capable || + !disk->fops->poll_capable(disk)) + return -EINVAL; + } ret = queue_var_store(&poll_on, page, count); if (ret < 0) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index bfab74b45f15..a46f975f2a2f 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1881,6 +1881,7 @@ struct block_device_operations { int (*report_zones)(struct gendisk *, sector_t sector, unsigned int nr_zones, report_zones_cb cb, void *data); char *(*devnode)(struct gendisk *disk, umode_t *mode); + bool (*poll_capable)(struct gendisk *disk); struct module *owner; const struct pr_ops *pr_ops; };