Message ID | 20171227032257.8182-2-snitzer@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, Dec 26, 2017 at 10:22:53PM -0500, Mike Snitzer wrote:
> All requests allocated from a request_queue with this callback set can
> failover their requests during completion.
>
> This callback is expected to use the blk_steal_bios() interface to
> transfer a request's bios back to an upper-layer bio-based
> request_queue.
>
> This will be used by both NVMe multipath and DM multipath.  Without it
> DM multipath cannot get access to NVMe-specific error handling that NVMe
> core provides in nvme_complete_rq().

And the whole point is that it should not get any such access.

The reason why we did nvme multipathing differently is because the
design of dm-multipath inflicts so much pain on users that we absolutely
want to avoid it this time around.
On Fri, Dec 29 2017 at 5:10am -0500,
Christoph Hellwig <hch@lst.de> wrote:

> On Tue, Dec 26, 2017 at 10:22:53PM -0500, Mike Snitzer wrote:
> > All requests allocated from a request_queue with this callback set can
> > failover their requests during completion.
> >
> > This callback is expected to use the blk_steal_bios() interface to
> > transfer a request's bios back to an upper-layer bio-based
> > request_queue.
> >
> > This will be used by both NVMe multipath and DM multipath.  Without it
> > DM multipath cannot get access to NVMe-specific error handling that NVMe
> > core provides in nvme_complete_rq().
>
> And the whole point is that it should not get any such access.

No, the whole point is that you hijacked multipathing for little to no gain.

> The reason why we did nvme multipathing differently is because the
> design of dm-multipath inflicts so much pain on users that we absolutely
> want to avoid it this time around.

Is that the royal "we"?  _You_ are the one subjecting users to pain.
There is no reason users should need to have multiple management domains
for multipathing unless they opt in.  Linux _is_ about choice, yet you're
working overtime to limit that choice.  You are blatantly
ignoring/rejecting both Hannes [1] and me.

Your attempt to impose _how_ NVMe multipathing must be done is
unacceptable.  Hopefully Jens can see through your senseless position and
will accept patches 1 - 3 for 4.16.  They offer a very minimal change that
enables users to decide which multipathing they'd prefer to use with NVMe.

Just wish you could stop with this petty bullshit and actually collaborate
with people.  I've shown how easy it is to enable NVMe multipathing in
terms of DM multipath (yet preserve your native NVMe multipathing).
Please stop being so dogmatic.  Are you scared of being proven wrong
about what the market wants?

If you'd allow progress toward native NVMe and DM multipathing
coexisting, we'd let the users decide what they prefer.  I don't need to
impose one way or the other, but I _do_ need to preserve DM multipath
compatibility given the extensive use of DM multipath in the enterprise
and the increased tooling that builds upon it.

[1] http://lists.infradead.org/pipermail/linux-nvme/2017-October/013719.html
On Fri, Dec 29, 2017 at 03:19:04PM -0500, Mike Snitzer wrote:
> On Fri, Dec 29 2017 at 5:10am -0500,
> Christoph Hellwig <hch@lst.de> wrote:
>
> > On Tue, Dec 26, 2017 at 10:22:53PM -0500, Mike Snitzer wrote:
> > > All requests allocated from a request_queue with this callback set can
> > > failover their requests during completion.
> > >
> > > This callback is expected to use the blk_steal_bios() interface to
> > > transfer a request's bios back to an upper-layer bio-based
> > > request_queue.
> > >
> > > This will be used by both NVMe multipath and DM multipath.  Without it
> > > DM multipath cannot get access to NVMe-specific error handling that NVMe
> > > core provides in nvme_complete_rq().
> >
> > And the whole point is that it should not get any such access.
>
> No, the whole point is that you hijacked multipathing for little to no gain.

That is your idea.  In the end there have been a lot of complaints about
dm-multipath, and there was a lot of discussion about how to do things
better, with broad agreement on this approach, up to the point where
Hannes has started considering doing something similar for SCSI.

And to be honest, if this is the tone you'd like to set for technical
discussions I'm not really interested.  Please calm down and stick to a
technical discussion.
On Thu, Jan 04 2018 at 5:28am -0500,
Christoph Hellwig <hch@lst.de> wrote:

> On Fri, Dec 29, 2017 at 03:19:04PM -0500, Mike Snitzer wrote:
> > On Fri, Dec 29 2017 at 5:10am -0500,
> > Christoph Hellwig <hch@lst.de> wrote:
> >
> > > On Tue, Dec 26, 2017 at 10:22:53PM -0500, Mike Snitzer wrote:
> > > > All requests allocated from a request_queue with this callback set can
> > > > failover their requests during completion.
> > > >
> > > > This callback is expected to use the blk_steal_bios() interface to
> > > > transfer a request's bios back to an upper-layer bio-based
> > > > request_queue.
> > > >
> > > > This will be used by both NVMe multipath and DM multipath.  Without it
> > > > DM multipath cannot get access to NVMe-specific error handling that NVMe
> > > > core provides in nvme_complete_rq().
> > >
> > > And the whole point is that it should not get any such access.
> >
> > No, the whole point is that you hijacked multipathing for little to no gain.
>
> That is your idea.  In the end there have been a lot of complaints about
> dm-multipath, and there was a lot of discussion about how to do things
> better, with broad agreement on this approach, up to the point where
> Hannes has started considering doing something similar for SCSI.

All the "DM multipath" complaints I heard at LSF were fixable and pretty
superficial.  Some were less so, but Hannes had a vision for addressing
various SCSI issues (which really complicated DM multipath).

But I'd really rather not dwell on all the history of NVMe native
multipathing's evolution.  It isn't productive (other than to acknowledge
that there are far more efficient and productive ways to coordinate such
a change).

> And to be honest, if this is the tone you'd like to set for technical
> discussions I'm not really interested.  Please calm down and stick to a
> technical discussion.

I think you'd probably agree that you've repeatedly derailed or avoided
technical discussion if it got into "DM multipath".  But again, I'm not
looking to dwell on how dysfunctional this has been.

I really do appreciate your technical expertise.  Sadly, I cannot say I
feel you think similarly of me.  I will say that I'm human; as such I
have limits on what I'm willing to accept.  You leveraged your position
to the point where it has started to feel like you were lording over me.
Tough to accept that.  It makes my job _really_ feel like "work".

All I've ever been trying to do (since accepting the reality of "NVMe
native multipathing") is bridge the gap from the old solution to the new
solution.  I'm not opposed to the new solution; it just needs to mature
without being the _only_ way to provide the feature (NVMe multipathing).

Hopefully we can have productive exchanges moving forward.  There are
certainly some challenges associated with trying to allow a kernel to
support both NVMe native multipathing and DM multipathing.  E.g. would
an NVMe device scan multipath blacklist be doable/acceptable?

I'd also like to understand whether your vision for NVMe's ANA support
will model something like scsi_dh.  Meaning: ANA is a capability that,
when attached, augments the behavior of the NVMe device, but it is
otherwise internal to the device, and upper layers get the benefit of
the ANA handler being attached.  Also, I'm curious to know if you see
that as needing to be tightly coupled to multipathing.  If so, that is
the next interface-point hurdle.

In the end I really think that DM multipath can help make NVMe native
multipath very robust (and vice versa).

Mike
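[Editor's note: to make the opt-in/opt-out question above concrete, here is one purely hypothetical shape such a knob could take: a module parameter on the NVMe core that, when cleared, would keep native multipathing from claiming namespaces so the per-path devices stay available for DM multipath. The parameter name, permissions, and helper are assumptions for illustration only, not something proposed in this series.]

```c
#include <linux/module.h>
#include <linux/moduleparam.h>

/*
 * Hypothetical opt-out knob (illustration only): when false, the NVMe
 * core would skip setting up native multipath devices and leave the
 * individual per-path namespaces exposed for an upper layer such as
 * DM multipath to manage.
 */
static bool native_multipath = true;
module_param(native_multipath, bool, 0444);
MODULE_PARM_DESC(native_multipath,
	"enable native NVMe multipathing (disable to defer to DM multipath)");

static inline bool nvme_prefer_native_multipath(void)
{
	return native_multipath;
}
```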
```diff
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8089ca17db9a..f45f5925e100 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -278,6 +278,7 @@ typedef int (lld_busy_fn) (struct request_queue *q);
 typedef int (bsg_job_fn) (struct bsg_job *);
 typedef int (init_rq_fn)(struct request_queue *, struct request *, gfp_t);
 typedef void (exit_rq_fn)(struct request_queue *, struct request *);
+typedef void (failover_rq_fn)(struct request *);
 
 enum blk_eh_timer_return {
 	BLK_EH_NOT_HANDLED,
@@ -423,6 +424,11 @@ struct request_queue {
 	exit_rq_fn		*exit_rq_fn;
 
 	/* Called from inside blk_get_request() */
 	void (*initialize_rq_fn)(struct request *rq);
 
+	/*
+	 * Callback to failover request's bios back to upper layer
+	 * bio-based request_queue using blk_steal_bios().
+	 */
+	failover_rq_fn		*failover_rq_fn;
 	const struct blk_mq_ops	*mq_ops;
```
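[Editor's note: for context, a minimal sketch of what a low-level driver's implementation of this hook might look like, loosely modeled on the existing nvme_failover_req() in drivers/nvme/host/multipath.c. The example_mpath structure, its fields, and example_failover_rq() are placeholders for illustration; they are not part of this patch.]

```c
#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/blk-mq.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

/*
 * Placeholder state owned by the upper-layer bio-based multipath device.
 * The structure and its name are illustrative only.
 */
struct example_mpath {
	spinlock_t		requeue_lock;
	struct bio_list		requeue_list;
	struct work_struct	requeue_work;
	struct gendisk		*disk;	/* the bio-based multipath device */
};

/*
 * Sketch of a failover_rq_fn implementation.  The per-path
 * request_queue's queuedata is assumed to point at the shared
 * example_mpath state.
 */
static void example_failover_rq(struct request *rq)
{
	struct example_mpath *mpath = rq->q->queuedata;
	unsigned long flags;

	/* Move the request's bios back onto the upper layer's requeue list. */
	spin_lock_irqsave(&mpath->requeue_lock, flags);
	blk_steal_bios(&mpath->requeue_list, rq);
	spin_unlock_irqrestore(&mpath->requeue_lock, flags);

	/* The (now bio-less) request completes on this path. */
	blk_mq_end_request(rq, BLK_STS_OK);

	/* Have the upper layer reissue the stolen bios on another path. */
	kblockd_schedule_work(&mpath->requeue_work);
}

/* At queue setup time a low-level driver would then install the hook: */
/*	q->failover_rq_fn = example_failover_rq; */
```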
All requests allocated from a request_queue with this callback set can
failover their requests during completion.

This callback is expected to use the blk_steal_bios() interface to
transfer a request's bios back to an upper-layer bio-based
request_queue.

This will be used by both NVMe multipath and DM multipath.  Without it
DM multipath cannot get access to NVMe-specific error handling that NVMe
core provides in nvme_complete_rq().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 include/linux/blkdev.h | 6 ++++++
 1 file changed, 6 insertions(+)
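[Editor's note: for completeness, a sketch of the upper-layer side that would consume the stolen bios, loosely modeled on NVMe's nvme_requeue_work(). It reuses the placeholder example_mpath structure from the sketch above; none of these names are part of the patch.]

```c
/*
 * Drain the bios that example_failover_rq() stole from completed
 * requests and reissue them through the bio-based multipath device so
 * that path selection runs again and another path can be tried.
 */
static void example_requeue_work(struct work_struct *work)
{
	struct example_mpath *mpath =
		container_of(work, struct example_mpath, requeue_work);
	struct bio_list bios;
	struct bio *bio;

	bio_list_init(&bios);

	/* Splice the pending bios onto a local list under the lock. */
	spin_lock_irq(&mpath->requeue_lock);
	bio_list_merge(&bios, &mpath->requeue_list);
	bio_list_init(&mpath->requeue_list);
	spin_unlock_irq(&mpath->requeue_lock);

	while ((bio = bio_list_pop(&bios))) {
		/* Redirect each bio back at the bio-based multipath device. */
		bio->bi_disk = mpath->disk;
		generic_make_request(bio);
	}
}
```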