Message ID | 20210207181324.11429-1-smalin@marvell.com (mailing list archive) |
---|---|
Series | NVMeTCP Offload ULP and QEDN Device Driver |
On Sun, Feb 07, 2021 at 08:13:13PM +0200, Shai Malin wrote:
> Queue Initialization:
> =====================
> The nvme-tcp-offload ULP module shall register with the existing
> nvmf_transport_ops (.name = "tcp_offload"), nvme_ctrl_ops and blk_mq_ops.
> The nvme-tcp-offload vendor driver shall register to nvme-tcp-offload ULP
> with the following ops:
> - claim_dev() - in order to resolve the route to the target according to
>   the net_dev.
> - create_queue() - in order to create offloaded nvme-tcp queue.
>
> The nvme-tcp-offload ULP module shall manage all the controller level
> functionalities, call claim_dev and based on the return values shall call
> the relevant module create_queue in order to create the admin queue and
> the IO queues.

Hi Shai,

How well does this claim_dev approach work with multipathing? Is it
expected that providing HOST_TRADDR is sufficient control over which
offload device will be used with multiple valid paths to the controller?

- Chris
On Fri, 12 Feb 2021 at 20:06, Chris Leech wrote:
>
> On Sun, Feb 07, 2021 at 08:13:13PM +0200, Shai Malin wrote:
> > Queue Initialization:
> > =====================
> > The nvme-tcp-offload ULP module shall register with the existing
> > nvmf_transport_ops (.name = "tcp_offload"), nvme_ctrl_ops and blk_mq_ops.
> > The nvme-tcp-offload vendor driver shall register to nvme-tcp-offload ULP
> > with the following ops:
> > - claim_dev() - in order to resolve the route to the target according to
> >   the net_dev.
> > - create_queue() - in order to create offloaded nvme-tcp queue.
> >
> > The nvme-tcp-offload ULP module shall manage all the controller level
> > functionalities, call claim_dev and based on the return values shall call
> > the relevant module create_queue in order to create the admin queue and
> > the IO queues.
>
> Hi Shai,
>
> How well does this claim_dev approach work with multipathing? Is it
> expected that providing HOST_TRADDR is sufficient control over which
> offload device will be used with multiple valid paths to the controller?
>
> - Chris
>

Hi Chris,

The nvme-tcp-offload multipath behaves the same as the non-offloaded
nvme-tcp. The HOST_TRADDR is sufficient to control which offload device
will be used with multiple valid paths.

- Shai
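For readers following the claim_dev() / create_queue() discussion, the flow described above can be sketched in plain userspace C. This is only an illustration, not code from the patch series: the struct layouts, `offload_register_dev()`, `offload_connect()` and the `demo_*` callbacks are hypothetical names, and matching a device by comparing a local address string against the host traddr is a simplifying assumption that stands in for the real route resolution against the net_dev.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OFFLOAD_DEVS 4

struct offload_dev;

/* Hypothetical vendor ops, modelled on the two callbacks named in the thread. */
struct offload_ops {
	/* Return nonzero if this device can serve a connection from host_traddr. */
	int (*claim_dev)(struct offload_dev *dev, const char *host_traddr);
	/* Create queue qid on this device; return 0 on success. */
	int (*create_queue)(struct offload_dev *dev, int qid);
};

struct offload_dev {
	const char *name;
	const char *local_addr;        /* assumption: one address per device */
	const struct offload_ops *ops;
	int queues_created;
};

static struct offload_dev *registered[MAX_OFFLOAD_DEVS];
static size_t nr_registered;

/* Vendor driver side: register a device with the (mock) ULP layer. */
int offload_register_dev(struct offload_dev *dev)
{
	assert(dev && dev->ops && dev->ops->claim_dev && dev->ops->create_queue);
	if (nr_registered == MAX_OFFLOAD_DEVS)
		return -1;
	registered[nr_registered++] = dev;
	return 0;
}

/*
 * ULP side: ask each registered device to claim the connection, then create
 * the admin queue (qid 0) and nr_io_queues IO queues on the first claimant.
 * Returns the chosen device, or NULL if no device claims or a queue fails.
 */
struct offload_dev *offload_connect(const char *host_traddr, int nr_io_queues)
{
	for (size_t i = 0; i < nr_registered; i++) {
		struct offload_dev *dev = registered[i];

		if (!dev->ops->claim_dev(dev, host_traddr))
			continue;
		for (int qid = 0; qid <= nr_io_queues; qid++)
			if (dev->ops->create_queue(dev, qid) != 0)
				return NULL;
		return dev;
	}
	return NULL;
}

/* Demo vendor callbacks: claim only when host_traddr equals the local address. */
static int demo_claim_dev(struct offload_dev *dev, const char *host_traddr)
{
	return strcmp(dev->local_addr, host_traddr) == 0;
}

static int demo_create_queue(struct offload_dev *dev, int qid)
{
	(void)qid;
	dev->queues_created++;
	return 0;
}

static const struct offload_ops demo_ops = {
	.claim_dev = demo_claim_dev,
	.create_queue = demo_create_queue,
};
```

With two devices registered under different local addresses, calling `offload_connect()` with the second device's address skips the first device and creates all queues on the second, which is the kind of HOST_TRADDR-driven selection discussed above.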
>
> With the goal of enabling a generic infrastructure that allows NVMe/TCP
> offload devices like NICs to seamlessly plug into the NVMe-oF stack, this
> patch series introduces the nvme-tcp-offload ULP host layer, which will be a
> new transport type called "tcp-offload" and will serve as an abstraction layer
> to work with vendor specific nvme-tcp offload drivers.
>
> NVMeTCP offload is a full offload of the NVMeTCP protocol, this includes
> both the TCP level and the NVMeTCP level.
>
> The nvme-tcp-offload transport can co-exist with the existing tcp and other
> transports. The tcp offload was designed so that stack changes are kept to a
> bare minimum: only registering new transports.
> All other APIs, ops etc. are identical to the regular tcp transport.
> Representing the TCP offload as a new transport allows clear and
> manageable differentiation between the connections which should use the
> offload path and those that are not offloaded (even on the same device).
>

Sagi, Christoph, Jens, Keith,

So, as there are no more comments / questions, we understand the direction is
acceptable and will proceed to the full series.
On Thu, Feb 18, 2021 at 06:38:07PM +0000, Shai Malin wrote:
> So, as there are no more comments / questions, we understand the direction
> is acceptable and will proceed to the full series.

I do not think we should support offloads at all, and certainly not ones
requiring extra drivers. Those drivers have caused unbelievable pain for
iSCSI and we should not repeat that mistake.
> On Thu, Feb 18, 2021 at 06:38:07PM +0000, Shai Malin wrote:
> > So, as there are no more comments / questions, we understand the
> > direction is acceptable and will proceed to the full series.
>
> I do not think we should support offloads at all, and certainly not ones
> requiring extra drivers. Those drivers have caused unbelievable pain for iSCSI
> and we should not repeat that mistake.

Hi Christoph,

We are fully aware of the challenges the iSCSI offload faced - I was there too
(in bnx2i and qedi). In our mind the heart of that hardship was the iSCSI uio
design, essentially a thin alternative networking stack, which led to no end of
compatibility challenges.

But we were also there for RoCE and iWARP (TCP based) RDMA offloads where a
different approach was used, working with the networking stack instead of
around it. We feel this is a much better approach, and this is what we are
attempting to implement here.

For this reason exactly we designed this offload to be completely seamless.
There is no alternate user stack - we plug directly into the networking stack
and there are zero changes to the regular nvme-tcp. We are just adding a new
transport alongside it, which interacts with the networking stack when needed,
and leaves it alone most of the time. Our intention is to completely own the
maintenance of the new transport, including any compatibility requirements, and
have purposefully designed it to be streamlined in this aspect.

Protocol offload is at the core of our technology, and our device offloads
RoCE, iWARP, iSCSI and FCoE, all already in upstream drivers (qedr, qedi and
qedf respectively). We are especially excited about NVMeTCP offload as it
brings huge benefits: RDMA-like latency, tremendous CPU utilization reduction
and the reliability of TCP.

We would be more than happy to incorporate any feedback you may have on the
design, in how to make it more robust and correct. We are aware of other work
being done in creating special types of offloaded queue, and could model our
design similarly, although our thinking was that this would be more intrusive
to regular nvme over tcp.

In our original submission of the RFC we were not adding a ULP driver, only our
own vendor driver, but Sagi pointed us in the direction of a vendor agnostic
ULP layer, which made a lot of sense to us and we think is a good approach.

Thanks,
Ariel