Message ID | 20210406210125.241-2-shiraz.saleem@intel.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Add Intel Ethernet Protocol Driver for RDMA (irdma) | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote: > +/* Following APIs are implemented by core PCI driver */ > +struct iidc_core_ops { > + /* APIs to allocate resources such as VEB, VSI, Doorbell queues, > + * completion queues, Tx/Rx queues, etc... > + */ > + int (*alloc_res)(struct iidc_core_dev_info *cdev_info, > + struct iidc_res *res, > + int partial_acceptable); > + int (*free_res)(struct iidc_core_dev_info *cdev_info, > + struct iidc_res *res); > + > + int (*request_reset)(struct iidc_core_dev_info *cdev_info, > + enum iidc_reset_type reset_type); > + > + int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info, > + u16 vport_id, bool enable); > + int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg, > + u16 len); > +}; What is this? There is only one implementation: static const struct iidc_core_ops ops = { .alloc_res = ice_cdev_info_alloc_res, .free_res = ice_cdev_info_free_res, .request_reset = ice_cdev_info_request_reset, .update_vport_filter = ice_cdev_info_update_vsi_filter, .vc_send = ice_cdev_info_vc_send, }; So export and call the functions directly. We just had this very same discussion with Broadcom. I notice there is no module dependency between irdma and the ethernet driver because the above ops are avoiding it. This entire idea was already NAK'd once: https://lore.kernel.org/linux-rdma/20180522203831.20624-1-jeffrey.t.kirsher@intel.com/ So please remove it. Jason
On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote: > +struct iidc_res_base { > + /* Union for future provision e.g. other res_type */ > + union { > + struct iidc_rdma_qset_params qsets; > + } res; Use an anonymous union? There is alot of confusiong provisioning for future types, do you have concrete plans here? I'm a bit confused why this is so different from how mlx5 ended up when it already has multiple types. > +}; > + > +struct iidc_res { > + /* Type of resource. */ > + enum iidc_res_type res_type; > + /* Count requested */ > + u16 cnt_req; > + > + /* Number of resources allocated. Filled in by callee. > + * Based on this value, caller to fill up "resources" > + */ > + u16 res_allocated; > + > + /* Unique handle to resources allocated. Zero if call fails. > + * Allocated by callee and for now used by caller for internal > + * tracking purpose. > + */ > + u32 res_handle; > + > + /* Peer driver has to allocate sufficient memory, to accommodate > + * cnt_requested before calling this function. Calling what function? > + * Memory has to be zero initialized. It is input/output param. > + * As a result of alloc_res API, this structures will be populated. > + */ > + struct iidc_res_base res[1]; So it is a wrongly defined flex array? Confused The usages are all using this as some super-complicated function argument: struct iidc_res rdma_qset_res = {}; rdma_qset_res.res_allocated = 1; rdma_qset_res.res_type = IIDC_RDMA_QSETS_TXSCHED; rdma_qset_res.res[0].res.qsets.vport_id = vsi->vsi_idx; rdma_qset_res.res[0].res.qsets.teid = tc_node->l2_sched_node_id; rdma_qset_res.res[0].res.qsets.qs_handle = tc_node->qs_handle; if (cdev_info->ops->free_res(cdev_info, &rdma_qset_res)) So the answer here is to make your function calls sane and well architected. If you have to pass a union to call a function then something is very wrong with the design. You aren't trying to achieve ABI decoupling of the rdma/ethernet modules with an obfuscated complicated function pointer indirection, are you? Please use sane function calls > +/* Following APIs are implemented by auxiliary drivers and invoked by core PCI > + * driver > + */ > +struct iidc_auxiliary_ops { > + /* This event_handler is meant to be a blocking call. For instance, > + * when a BEFORE_MTU_CHANGE event comes in, the event_handler will not > + * return until the auxiliary driver is ready for the MTU change to > + * happen. > + */ > + void (*event_handler)(struct iidc_core_dev_info *cdev_info, > + struct iidc_event *event); > + > + int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id, > + u8 *msg, u16 len); > +}; This is not the normal pattern: > +struct iidc_auxiliary_drv { > + struct auxiliary_driver adrv; > + struct iidc_auxiliary_ops *ops; > +}; Just put the two functions above in the drv directly: struct iidc_auxiliary_drv { struct auxilary_driver adrv; void (*event_handler)(struct iidc_core_dev_info *cdev_info, *cdev_info, struct iidc_event *event); int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg, u16 len); } > + > +#define IIDC_RDMA_NAME "intel_rdma" > +#define IIDC_RDMA_ID 0x00000010 > +#define IIDC_MAX_NUM_AUX 4 > + > +/* The const struct that instantiates cdev_info_id needs to be initialized > + * in the .c with the macro ASSIGN_IIDC_INFO. > + * For example: > + * static const struct cdev_info_id cdev_info_ids[] = ASSIGN_IIDC_INFO; > + */ > +struct cdev_info_id { > + char *name; > + int id; > +}; > + > +#define IIDC_RDMA_INFO { .name = IIDC_RDMA_NAME, .id = IIDC_RDMA_ID }, > + > +#define ASSIGN_IIDC_INFO \ > +{ \ > + IIDC_RDMA_INFO \ > +} I tried to figure out what all this was for and came up short. There is only one user and all this seems unnecessary in this series, add it later when you need it. > + > +#define iidc_priv(x) ((x)->auxiliary_priv) Use a static inline function Jason
> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote: > > > +/* Following APIs are implemented by core PCI driver */ struct > > +iidc_core_ops { > > + /* APIs to allocate resources such as VEB, VSI, Doorbell queues, > > + * completion queues, Tx/Rx queues, etc... > > + */ > > + int (*alloc_res)(struct iidc_core_dev_info *cdev_info, > > + struct iidc_res *res, > > + int partial_acceptable); > > + int (*free_res)(struct iidc_core_dev_info *cdev_info, > > + struct iidc_res *res); > > + > > + int (*request_reset)(struct iidc_core_dev_info *cdev_info, > > + enum iidc_reset_type reset_type); > > + > > + int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info, > > + u16 vport_id, bool enable); > > + int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg, > > + u16 len); > > +}; > > What is this? There is only one implementation: > > static const struct iidc_core_ops ops = { > .alloc_res = ice_cdev_info_alloc_res, > .free_res = ice_cdev_info_free_res, > .request_reset = ice_cdev_info_request_reset, > .update_vport_filter = ice_cdev_info_update_vsi_filter, > .vc_send = ice_cdev_info_vc_send, > }; > > So export and call the functions directly. No. Then we end up requiring ice to be loaded even when just want to use irdma with x722 [whose ethernet driver is "i40e"]. This will not allow us to be a unified RDMA driver. > > We just had this very same discussion with Broadcom. Yes, but they only have a single ethernet driver today I presume. Here we have a single RDMA driver and 2 ethernet drivers, i40e and ice. Shiraz
On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote: > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote: > > > > > +/* Following APIs are implemented by core PCI driver */ struct > > > +iidc_core_ops { > > > + /* APIs to allocate resources such as VEB, VSI, Doorbell queues, > > > + * completion queues, Tx/Rx queues, etc... > > > + */ > > > + int (*alloc_res)(struct iidc_core_dev_info *cdev_info, > > > + struct iidc_res *res, > > > + int partial_acceptable); > > > + int (*free_res)(struct iidc_core_dev_info *cdev_info, > > > + struct iidc_res *res); > > > + > > > + int (*request_reset)(struct iidc_core_dev_info *cdev_info, > > > + enum iidc_reset_type reset_type); > > > + > > > + int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info, > > > + u16 vport_id, bool enable); > > > + int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg, > > > + u16 len); > > > +}; > > > > What is this? There is only one implementation: > > > > static const struct iidc_core_ops ops = { > > .alloc_res = ice_cdev_info_alloc_res, > > .free_res = ice_cdev_info_free_res, > > .request_reset = ice_cdev_info_request_reset, > > .update_vport_filter = ice_cdev_info_update_vsi_filter, > > .vc_send = ice_cdev_info_vc_send, > > }; > > > > So export and call the functions directly. > > No. Then we end up requiring ice to be loaded even when just want to > use irdma with x722 [whose ethernet driver is "i40e"]. So what? What does it matter to load a few extra kb of modules? If you had both drivers use the same ops interface you'd have an argument. Jason
On Wed, Apr 07, 2021 at 07:43:24PM -0300, Jason Gunthorpe wrote: > On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote: > > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > > > > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote: > > > > > > > +/* Following APIs are implemented by core PCI driver */ struct > > > > +iidc_core_ops { > > > > + /* APIs to allocate resources such as VEB, VSI, Doorbell queues, > > > > + * completion queues, Tx/Rx queues, etc... > > > > + */ > > > > + int (*alloc_res)(struct iidc_core_dev_info *cdev_info, > > > > + struct iidc_res *res, > > > > + int partial_acceptable); > > > > + int (*free_res)(struct iidc_core_dev_info *cdev_info, > > > > + struct iidc_res *res); > > > > + > > > > + int (*request_reset)(struct iidc_core_dev_info *cdev_info, > > > > + enum iidc_reset_type reset_type); > > > > + > > > > + int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info, > > > > + u16 vport_id, bool enable); > > > > + int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg, > > > > + u16 len); > > > > +}; > > > > > > What is this? There is only one implementation: > > > > > > static const struct iidc_core_ops ops = { > > > .alloc_res = ice_cdev_info_alloc_res, > > > .free_res = ice_cdev_info_free_res, > > > .request_reset = ice_cdev_info_request_reset, > > > .update_vport_filter = ice_cdev_info_update_vsi_filter, > > > .vc_send = ice_cdev_info_vc_send, > > > }; > > > > > > So export and call the functions directly. > > > > No. Then we end up requiring ice to be loaded even when just want to > > use irdma with x722 [whose ethernet driver is "i40e"]. > > So what? What does it matter to load a few extra kb of modules? And if user cares about it, he will blacklist that module anyway. Thanks
> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > On Wed, Apr 07, 2021 at 07:43:24PM -0300, Jason Gunthorpe wrote: > > On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote: > > > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > > > > > > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote: > > > > > > > > > +/* Following APIs are implemented by core PCI driver */ struct > > > > > +iidc_core_ops { > > > > > + /* APIs to allocate resources such as VEB, VSI, Doorbell queues, > > > > > + * completion queues, Tx/Rx queues, etc... > > > > > + */ > > > > > + int (*alloc_res)(struct iidc_core_dev_info *cdev_info, > > > > > + struct iidc_res *res, > > > > > + int partial_acceptable); > > > > > + int (*free_res)(struct iidc_core_dev_info *cdev_info, > > > > > + struct iidc_res *res); > > > > > + > > > > > + int (*request_reset)(struct iidc_core_dev_info *cdev_info, > > > > > + enum iidc_reset_type reset_type); > > > > > + > > > > > + int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info, > > > > > + u16 vport_id, bool enable); > > > > > + int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 > *msg, > > > > > + u16 len); > > > > > +}; > > > > > > > > What is this? There is only one implementation: > > > > > > > > static const struct iidc_core_ops ops = { > > > > .alloc_res = ice_cdev_info_alloc_res, > > > > .free_res = ice_cdev_info_free_res, > > > > .request_reset = ice_cdev_info_request_reset, > > > > .update_vport_filter = ice_cdev_info_update_vsi_filter, > > > > .vc_send = ice_cdev_info_vc_send, > > > > }; > > > > > > > > So export and call the functions directly. > > > > > > No. Then we end up requiring ice to be loaded even when just want to > > > use irdma with x722 [whose ethernet driver is "i40e"]. > > > > So what? What does it matter to load a few extra kb of modules? > > And if user cares about it, he will blacklist that module anyway. > blacklist ice when you just have an x722 card? How does that solve anything? You wont be able to load irdma then. Shiraz
On Fri, Apr 09, 2021 at 01:38:37AM +0000, Saleem, Shiraz wrote: > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > > > On Wed, Apr 07, 2021 at 07:43:24PM -0300, Jason Gunthorpe wrote: > > > On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote: > > > > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > > > > > > > > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote: > > > > > > > > > > > +/* Following APIs are implemented by core PCI driver */ struct > > > > > > +iidc_core_ops { > > > > > > + /* APIs to allocate resources such as VEB, VSI, Doorbell queues, > > > > > > + * completion queues, Tx/Rx queues, etc... > > > > > > + */ > > > > > > + int (*alloc_res)(struct iidc_core_dev_info *cdev_info, > > > > > > + struct iidc_res *res, > > > > > > + int partial_acceptable); > > > > > > + int (*free_res)(struct iidc_core_dev_info *cdev_info, > > > > > > + struct iidc_res *res); > > > > > > + > > > > > > + int (*request_reset)(struct iidc_core_dev_info *cdev_info, > > > > > > + enum iidc_reset_type reset_type); > > > > > > + > > > > > > + int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info, > > > > > > + u16 vport_id, bool enable); > > > > > > + int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 > > *msg, > > > > > > + u16 len); > > > > > > +}; > > > > > > > > > > What is this? There is only one implementation: > > > > > > > > > > static const struct iidc_core_ops ops = { > > > > > .alloc_res = ice_cdev_info_alloc_res, > > > > > .free_res = ice_cdev_info_free_res, > > > > > .request_reset = ice_cdev_info_request_reset, > > > > > .update_vport_filter = ice_cdev_info_update_vsi_filter, > > > > > .vc_send = ice_cdev_info_vc_send, > > > > > }; > > > > > > > > > > So export and call the functions directly. > > > > > > > > No. Then we end up requiring ice to be loaded even when just want to > > > > use irdma with x722 [whose ethernet driver is "i40e"]. > > > > > > So what? What does it matter to load a few extra kb of modules? > > > > And if user cares about it, he will blacklist that module anyway. > > > blacklist ice when you just have an x722 card? How does that solve anything? You wont be able to load irdma then. You will blacklist i40e if you want solely irdma functionality. Thanks > > Shiraz
> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > On Wed, Apr 07, 2021 at 08:58:49PM +0000, Saleem, Shiraz wrote: > > > Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > > > > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote: > > > > > > > +/* Following APIs are implemented by core PCI driver */ struct > > > > +iidc_core_ops { > > > > + /* APIs to allocate resources such as VEB, VSI, Doorbell queues, > > > > + * completion queues, Tx/Rx queues, etc... > > > > + */ > > > > + int (*alloc_res)(struct iidc_core_dev_info *cdev_info, > > > > + struct iidc_res *res, > > > > + int partial_acceptable); > > > > + int (*free_res)(struct iidc_core_dev_info *cdev_info, > > > > + struct iidc_res *res); > > > > + > > > > + int (*request_reset)(struct iidc_core_dev_info *cdev_info, > > > > + enum iidc_reset_type reset_type); > > > > + > > > > + int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info, > > > > + u16 vport_id, bool enable); > > > > + int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg, > > > > + u16 len); > > > > +}; > > > > > > What is this? There is only one implementation: > > > > > > static const struct iidc_core_ops ops = { > > > .alloc_res = ice_cdev_info_alloc_res, > > > .free_res = ice_cdev_info_free_res, > > > .request_reset = ice_cdev_info_request_reset, > > > .update_vport_filter = ice_cdev_info_update_vsi_filter, > > > .vc_send = ice_cdev_info_vc_send, > > > }; > > > > > > So export and call the functions directly. > > > > No. Then we end up requiring ice to be loaded even when just want to > > use irdma with x722 [whose ethernet driver is "i40e"]. > > So what? What does it matter to load a few extra kb of modules? Because it is an unnecessary thing to force a user to build/load drivers for which they don't have the HW for? The problem gets compounded if we have to do it for all future HW Intel PCI drivers, i.e. depends on ICE && .... IIDC is Intel's converged and generic RDMA <--> PCI driver channel interface; which we intend to use moving forward. And these .ops callbacks are part of this interface which will have different implementations by each HW generation PCI core driver. It is extensible with new ops added to the table for new HW and where implementations of the certain ops on some HW will be NULL. There is a near-term Intel ethernet VF driver which will use IIDC to provide RDMA in the VF, and implement some of these .ops callbacks. There is also intent to move i40e to IIDC. And yes, it allows to load a unified irdma driver without having all the mulit-gen PCI core drivers to be built/loaded as a pre-requisite which is solving a pain-point to the user and does not unnecessarily add a memory foot-print. In the past, with i40e <-> i40iw, I acknowledge such a dependency was decoupled for the wrong reasons [1] and understand where your frustration is coming from. But in a unified irdma driver model connecting to multiple PCI gen drivers, I do think it serves a reason. This has also been voiced over the years in some of our discussions [2] leading to the auxiliary bus and been part of our submissions from the get go. In fact, use of such domain specific .ops from the parent device is an assumption baked into the design when the auxiliary bus was conceived and in the documentation [3] (See Example Usage). [1] https://lore.kernel.org/linux-rdma/20180522205612.GD7502@mellanox.com/ [2] https://lore.kernel.org/linux-rdma/2B0E3F215D1AB84DA946C8BEE234CCC97B2E1D29@ORSMSX101.amr.corp.intel.com/ [3] https://www.kernel.org/doc/html/latest/driver-api/auxiliary_bus.html Shiraz
> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > On Tue, Apr 06, 2021 at 04:01:03PM -0500, Shiraz Saleem wrote: > > > +struct iidc_res_base { > > + /* Union for future provision e.g. other res_type */ > > + union { > > + struct iidc_rdma_qset_params qsets; > > + } res; > > Use an anonymous union? > > There is alot of confusiong provisioning for future types, do you have concrete > plans here? I'm a bit confused why this is so different from how mlx5 ended up > when it already has multiple types. It was initially designed to be extensible for more resource types. But at this point, there is no concrete plan and hence it doesn't need to be a union. > > > +}; > > + > > +struct iidc_res { > > + /* Type of resource. */ > > + enum iidc_res_type res_type; > > + /* Count requested */ > > + u16 cnt_req; > > + > > + /* Number of resources allocated. Filled in by callee. > > + * Based on this value, caller to fill up "resources" > > + */ > > + u16 res_allocated; > > + > > + /* Unique handle to resources allocated. Zero if call fails. > > + * Allocated by callee and for now used by caller for internal > > + * tracking purpose. > > + */ > > + u32 res_handle; > > + > > + /* Peer driver has to allocate sufficient memory, to accommodate > > + * cnt_requested before calling this function. > > Calling what function? Left over cruft from the re-write of IIDC in v2. > > > + * Memory has to be zero initialized. It is input/output param. > > + * As a result of alloc_res API, this structures will be populated. > > + */ > > + struct iidc_res_base res[1]; > > So it is a wrongly defined flex array? Confused Needs fixing. > > The usages are all using this as some super-complicated function argument: > > struct iidc_res rdma_qset_res = {}; > > rdma_qset_res.res_allocated = 1; > rdma_qset_res.res_type = IIDC_RDMA_QSETS_TXSCHED; > rdma_qset_res.res[0].res.qsets.vport_id = vsi->vsi_idx; > rdma_qset_res.res[0].res.qsets.teid = tc_node->l2_sched_node_id; > rdma_qset_res.res[0].res.qsets.qs_handle = tc_node->qs_handle; > > if (cdev_info->ops->free_res(cdev_info, &rdma_qset_res)) > > So the answer here is to make your function calls sane and well architected. If you > have to pass a union to call a function then something is very wrong with the > design. > Based on previous comment, the union will be removed. > You aren't trying to achieve ABI decoupling of the rdma/ethernet modules with an > obfuscated complicated function pointer indirection, are you? As discussed in other thread, this is part of the IIDC interface exporting the core device .ops callbacks. > > > +/* Following APIs are implemented by auxiliary drivers and invoked by > > +core PCI > > + * driver > > + */ > > +struct iidc_auxiliary_ops { > > + /* This event_handler is meant to be a blocking call. For instance, > > + * when a BEFORE_MTU_CHANGE event comes in, the event_handler will > not > > + * return until the auxiliary driver is ready for the MTU change to > > + * happen. > > + */ > > + void (*event_handler)(struct iidc_core_dev_info *cdev_info, > > + struct iidc_event *event); > > + > > + int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id, > > + u8 *msg, u16 len); > > +}; > > This is not the normal pattern: > > > +struct iidc_auxiliary_drv { > > + struct auxiliary_driver adrv; > > + struct iidc_auxiliary_ops *ops; > > +}; > > Just put the two functions above in the drv directly: Ok. > > struct iidc_auxiliary_drv { > struct auxilary_driver adrv; > void (*event_handler)(struct iidc_core_dev_info *cdev_info, *cdev_info, > struct iidc_event *event); > > int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id, > u8 *msg, u16 len); > } > > > + > > +#define IIDC_RDMA_NAME "intel_rdma" > > +#define IIDC_RDMA_ID 0x00000010 > > +#define IIDC_MAX_NUM_AUX 4 > > + > > +/* The const struct that instantiates cdev_info_id needs to be > > +initialized > > + * in the .c with the macro ASSIGN_IIDC_INFO. > > + * For example: > > + * static const struct cdev_info_id cdev_info_ids[] = > > +ASSIGN_IIDC_INFO; */ struct cdev_info_id { > > + char *name; > > + int id; > > +}; > > + > > +#define IIDC_RDMA_INFO { .name = IIDC_RDMA_NAME, .id = > IIDC_RDMA_ID }, > > + > > +#define ASSIGN_IIDC_INFO \ > > +{ \ > > + IIDC_RDMA_INFO \ > > +} > > I tried to figure out what all this was for and came up short. There is only one user > and all this seems unnecessary in this series, add it later when you need it. No plan for new user, so this should go. > > > + > > +#define iidc_priv(x) ((x)->auxiliary_priv) > > Use a static inline function > Ok
On Mon, Apr 12, 2021 at 02:50:43PM +0000, Saleem, Shiraz wrote: > Because it is an unnecessary thing to force a user to build/load drivers for > which they don't have the HW for? Happens automatically in all distros, so I don't agree with this. > The problem gets compounded if we have to do it for all future HW > Intel PCI drivers, i.e. depends on ICE && .... Then someday build a proper pluggable abstraction and put all your ethernet drivers under it. Today you haven't done that and all we have a set of ops that only work with one eth driver and a totally different set of functions that only work with a different driver. It is all just dead code until it gets finished and process is to not merge dead code. > There is a near-term Intel ethernet VF driver which will use IIDC to > provide RDMA in the VF, and implement some of these .ops > callbacks. There is also intent to move i40e to IIDC. "near-term" We are now on year three of Intel talking about this driver! Get the bulk of the thing merged and deal with the rest in followup patches. > But in a unified irdma driver model connecting to multiple PCI gen > drivers, I do think it serves a reason. It is fine as a high level idea, but the implementation has to meet kernel standards. Jason
> Subject: Re: [PATCH v4 01/23] iidc: Introduce iidc.h > > On Mon, Apr 12, 2021 at 02:50:43PM +0000, Saleem, Shiraz wrote: > [....] > > There is a near-term Intel ethernet VF driver which will use IIDC to > > provide RDMA in the VF, and implement some of these .ops callbacks. > > There is also intent to move i40e to IIDC. > > "near-term" We are now on year three of Intel talking about this driver! > > Get the bulk of the thing merged and deal with the rest in followup patches. We will submit with symbols exported from ice and direct calls from irdma. Shiraz
diff --git a/MAINTAINERS b/MAINTAINERS index a1e3502..27e424e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8955,6 +8955,7 @@ F: Documentation/networking/device_drivers/ethernet/intel/ F: drivers/net/ethernet/intel/ F: drivers/net/ethernet/intel/*/ F: include/linux/avf/virtchnl.h +F: include/linux/net/intel/iidc.h INTEL FRAMEBUFFER DRIVER (excluding 810 and 815) M: Maik Broemme <mbroemme@libmpq.org> diff --git a/include/linux/net/intel/iidc.h b/include/linux/net/intel/iidc.h new file mode 100644 index 0000000..2c360c4 --- /dev/null +++ b/include/linux/net/intel/iidc.h @@ -0,0 +1,242 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (C) 2021, Intel Corporation. */ + +#ifndef _IIDC_H_ +#define _IIDC_H_ + +#include <linux/auxiliary_bus.h> +#include <linux/dcbnl.h> +#include <linux/device.h> +#include <linux/if_ether.h> +#include <linux/kernel.h> +#include <linux/netdevice.h> + +enum iidc_event_type { + IIDC_EVENT_BEFORE_MTU_CHANGE, + IIDC_EVENT_AFTER_MTU_CHANGE, + IIDC_EVENT_BEFORE_TC_CHANGE, + IIDC_EVENT_AFTER_TC_CHANGE, + IIDC_EVENT_DSCP_MAP_CHANGE, + IIDC_EVENT_CRIT_ERR, + IIDC_EVENT_NBITS /* must be last */ +}; + +enum iidc_res_type { + IIDC_INVAL_RES, + IIDC_VSI, + IIDC_VEB, + IIDC_EVENT_Q, + IIDC_EGRESS_CMPL_Q, + IIDC_CMPL_EVENT_Q, + IIDC_ASYNC_EVENT_Q, + IIDC_DOORBELL_Q, + IIDC_RDMA_QSETS_TXSCHED, +}; + +enum iidc_reset_type { + IIDC_PFR, + IIDC_CORER, + IIDC_GLOBR, +}; + +enum iidc_rdma_protocol { + IIDC_RDMA_PROTOCOL_IWARP, + IIDC_RDMA_PROTOCOL_ROCEV2, +}; + +/* Struct to hold per DCB APP info */ +struct iidc_dcb_app_info { + u8 priority; + u8 selector; + u16 prot_id; +}; + +struct iidc_core_dev_info; + +#define IIDC_MAX_USER_PRIORITY 8 +#define IIDC_MAX_APPS 72 +#define IIDC_MAX_DSCP_MAPPING 64 + +/* Struct to hold per RDMA Qset info */ +struct iidc_rdma_qset_params { + u32 teid; /* Qset TEID */ + u16 qs_handle; /* RDMA driver provides this */ + u16 vport_id; /* VSI index */ + u8 tc; /* TC branch the Qset should belong to */ + u8 reserved[3]; +}; + +struct iidc_res_base { + /* Union for future provision e.g. other res_type */ + union { + struct iidc_rdma_qset_params qsets; + } res; +}; + +struct iidc_res { + /* Type of resource. */ + enum iidc_res_type res_type; + /* Count requested */ + u16 cnt_req; + + /* Number of resources allocated. Filled in by callee. + * Based on this value, caller to fill up "resources" + */ + u16 res_allocated; + + /* Unique handle to resources allocated. Zero if call fails. + * Allocated by callee and for now used by caller for internal + * tracking purpose. + */ + u32 res_handle; + + /* Peer driver has to allocate sufficient memory, to accommodate + * cnt_requested before calling this function. + * Memory has to be zero initialized. It is input/output param. + * As a result of alloc_res API, this structures will be populated. + */ + struct iidc_res_base res[1]; +}; + +struct iidc_qos_info { + u64 tc_ctx; + u8 rel_bw; + u8 prio_type; + u8 egress_virt_up; + u8 ingress_virt_up; +}; + +struct iidc_dscp_map { + u16 dscp_val; + u8 tc; +}; + +/* Struct to hold QoS info */ +struct iidc_qos_params { + struct iidc_qos_info tc_info[IEEE_8021QAZ_MAX_TCS]; + u8 up2tc[IIDC_MAX_USER_PRIORITY]; + u8 vport_relative_bw; + u8 vport_priority_type; + u32 num_apps; + struct iidc_dcb_app_info apps[IIDC_MAX_APPS]; + struct iidc_dscp_map dscp_maps[IIDC_MAX_DSCP_MAPPING]; + u8 num_tc; +}; + +union iidc_event_info { + /* IIDC_EVENT_AFTER_TC_CHANGE */ + struct iidc_qos_params port_qos; + /* IIDC_EVENT_CRIT_ERR */ + u32 reg; +}; + +struct iidc_event { + DECLARE_BITMAP(type, IIDC_EVENT_NBITS); + union iidc_event_info info; +}; + +/* Following APIs are implemented by core PCI driver */ +struct iidc_core_ops { + /* APIs to allocate resources such as VEB, VSI, Doorbell queues, + * completion queues, Tx/Rx queues, etc... + */ + int (*alloc_res)(struct iidc_core_dev_info *cdev_info, + struct iidc_res *res, + int partial_acceptable); + int (*free_res)(struct iidc_core_dev_info *cdev_info, + struct iidc_res *res); + + int (*request_reset)(struct iidc_core_dev_info *cdev_info, + enum iidc_reset_type reset_type); + + int (*update_vport_filter)(struct iidc_core_dev_info *cdev_info, + u16 vport_id, bool enable); + int (*vc_send)(struct iidc_core_dev_info *cdev_info, u32 vf_id, u8 *msg, + u16 len); +}; + +/* Following APIs are implemented by auxiliary drivers and invoked by core PCI + * driver + */ +struct iidc_auxiliary_ops { + /* This event_handler is meant to be a blocking call. For instance, + * when a BEFORE_MTU_CHANGE event comes in, the event_handler will not + * return until the auxiliary driver is ready for the MTU change to + * happen. + */ + void (*event_handler)(struct iidc_core_dev_info *cdev_info, + struct iidc_event *event); + + int (*vc_receive)(struct iidc_core_dev_info *cdev_info, u32 vf_id, + u8 *msg, u16 len); +}; + +#define IIDC_RDMA_NAME "intel_rdma" +#define IIDC_RDMA_ID 0x00000010 +#define IIDC_MAX_NUM_AUX 4 + +/* The const struct that instantiates cdev_info_id needs to be initialized + * in the .c with the macro ASSIGN_IIDC_INFO. + * For example: + * static const struct cdev_info_id cdev_info_ids[] = ASSIGN_IIDC_INFO; + */ +struct cdev_info_id { + char *name; + int id; +}; + +#define IIDC_RDMA_INFO { .name = IIDC_RDMA_NAME, .id = IIDC_RDMA_ID }, + +#define ASSIGN_IIDC_INFO \ +{ \ + IIDC_RDMA_INFO \ +} + +#define iidc_priv(x) ((x)->auxiliary_priv) + +/* Structure representing auxiliary driver tailored information about the core + * PCI dev, each auxiliary driver using the IIDC interface will have an + * instance of this struct dedicated to it. + */ +struct iidc_core_dev_info { + struct pci_dev *pdev; /* PCI device of corresponding to main function */ + struct auxiliary_device *adev; + /* KVA / Linear address corresponding to BAR0 of underlying + * pci_device. + */ + u8 __iomem *hw_addr; + int cdev_info_id; + + u8 ftype; /* PF(false) or VF (true) */ + + u16 vport_id; + enum iidc_rdma_protocol rdma_protocol; + + struct iidc_qos_params qos_info; + struct net_device *netdev; + + struct msix_entry *msix_entries; + u16 msix_count; /* How many vectors are reserved for this device */ + + /* Following struct contains function pointers to be initialized + * by core PCI driver and called by auxiliary driver + */ + const struct iidc_core_ops *ops; +}; + +struct iidc_auxiliary_dev { + struct auxiliary_device adev; + struct iidc_core_dev_info *cdev_info; +}; + +/* structure representing the auxiliary driver. This struct is to be + * allocated and populated by the auxiliary driver's owner. The core PCI + * driver will access these ops by performing a container_of on the + * auxiliary_device->dev.driver. + */ +struct iidc_auxiliary_drv { + struct auxiliary_driver adrv; + struct iidc_auxiliary_ops *ops; +}; + +#endif /* _IIDC_H_*/