Message ID | 792b6447-1efb-a977-5d9b-22b4351c5bcb@suse.de (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Leon Romanovsky |
On Wed, Apr 04, 2018 at 10:26:47AM +0200, Nicolas Morey-Chaisemartin wrote:
> ib_umad is required to get ibstat working.
> Auto-load it for RoCE hardware so it works out of the box
>
> $ ibstat
> ibwarn: [4638] umad_init: can't read ABI version from /sys/class/infiniband_mad/abi_version (No such file or directory): is ib_umad module loaded?
> ibpanic: [4638] main: can't init UMAD library: No such file or directory
> $ modprobe ib_umad
> $ ibstat
> CA 'bnxt_re0'
>         CA type: Broadcom NetXtreme-C/E RoCE Driver HCA
>         Number of ports: 1
>         Firmware version: 20.8.29.0
>         Hardware version: 0x14e4
>         Node GUID: 0x9edc71fffeb69930
>         System image GUID: 0x9edc71fffeb69930
>         Port 1:
>                 State: Down
>                 Physical state: Disabled
>                 Rate: 100
>                 Base lid: 0
>                 LMC: 0
>                 SM lid: 0
>                 Capability mask: 0x041d0000
>                 Port GUID: 0x9edc71fffeb69930
>                 Link layer: Ethernet
>
> Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
> ---
>  kernel-boot/modules/roce.conf | 3 +++
>  1 file changed, 3 insertions(+)

Is this just a userspace bug in ibstat or do roce ports actually implement umad?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
> On Wed, Apr 04, 2018 at 10:26:47AM +0200, Nicolas Morey-Chaisemartin wrote:
>> ib_umad is required to get ibstat working.
>> Auto-load it for RoCE hardware so it works out of the box
>>
>> $ ibstat
>> ibwarn: [4638] umad_init: can't read ABI version from /sys/class/infiniband_mad/abi_version (No such file or directory): is ib_umad module loaded?
>> ibpanic: [4638] main: can't init UMAD library: No such file or directory
>> $ modprobe ib_umad
>> $ ibstat
>> CA 'bnxt_re0'
>>         CA type: Broadcom NetXtreme-C/E RoCE Driver HCA
>>         Number of ports: 1
>>         Firmware version: 20.8.29.0
>>         Hardware version: 0x14e4
>>         Node GUID: 0x9edc71fffeb69930
>>         System image GUID: 0x9edc71fffeb69930
>>         Port 1:
>>                 State: Down
>>                 Physical state: Disabled
>>                 Rate: 100
>>                 Base lid: 0
>>                 LMC: 0
>>                 SM lid: 0
>>                 Capability mask: 0x041d0000
>>                 Port GUID: 0x9edc71fffeb69930
>>                 Link layer: Ethernet
>>
>> Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
>> ---
>>  kernel-boot/modules/roce.conf | 3 +++
>>  1 file changed, 3 insertions(+)
> Is this just a userspace bug in ibstat or do roce ports actually implement umad?
>
> Jason

It seems like they don't. It's just ibstat doing a umad_init() before making a few calls to umad_get_cas_names() and a few others that seem to work through simple reads in sysfs (not umad related).

Not sure what the clean way to fix this is. Removing the call to umad_init feels dirty but it's simple enough. Any ideas?

It might be worth updating the man pages to flag which umad functions do not actually require the umad module and are safe to call without calling umad_init() first.

Nicolas

On Thu, Apr 05, 2018 at 08:35:26AM +0200, Nicolas Morey-Chaisemartin wrote:
>
> On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
> > On Wed, Apr 04, 2018 at 10:26:47AM +0200, Nicolas Morey-Chaisemartin wrote:
> >> ib_umad is required to get ibstat working.
> >> Auto-load it for RoCE hardware so it works out of the box
> >>
> >> $ ibstat
> >> ibwarn: [4638] umad_init: can't read ABI version from /sys/class/infiniband_mad/abi_version (No such file or directory): is ib_umad module loaded?
> >> ibpanic: [4638] main: can't init UMAD library: No such file or directory
> >> $ modprobe ib_umad
> >> $ ibstat
> >> CA 'bnxt_re0'
> >>         CA type: Broadcom NetXtreme-C/E RoCE Driver HCA
> >>         Number of ports: 1
> >>         Firmware version: 20.8.29.0
> >>         Hardware version: 0x14e4
> >>         Node GUID: 0x9edc71fffeb69930
> >>         System image GUID: 0x9edc71fffeb69930
> >>         Port 1:
> >>                 State: Down
> >>                 Physical state: Disabled
> >>                 Rate: 100
> >>                 Base lid: 0
> >>                 LMC: 0
> >>                 SM lid: 0
> >>                 Capability mask: 0x041d0000
> >>                 Port GUID: 0x9edc71fffeb69930
> >>                 Link layer: Ethernet
> >>
> >> Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
> >>  kernel-boot/modules/roce.conf | 3 +++
> >>  1 file changed, 3 insertions(+)
> > Is this just a userspace bug in ibstat or do roce ports actually implement umad?
> >
> > Jason
>
> It seems like they don't. It's just ibstat doing a umad_init()
> before making a few calls to umad_get_cas_names() and a few others that
> seem to work through simple reads in sysfs (not umad related).
>
> Not sure what the clean way to fix this is. Removing the call to
> umad_init feels dirty but it's simple enough. Any ideas?

Maybe we should make umad_init not open the umad device if the link
layer is ethernet, iwarp, etc?

Hal?

Jason

On 04/05/2018 05:20 PM, Jason Gunthorpe wrote:
> On Thu, Apr 05, 2018 at 08:35:26AM +0200, Nicolas Morey-Chaisemartin wrote:
>>
>> On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
>>>
>>> Is this just a userspace bug in ibstat or do roce ports actually implement umad?
>>>
>>> Jason
>> It seems like they don't. It's just ibstat doing a umad_init()
>> before making a few calls to umad_get_cas_names() and a few others that
>> seem to work through simple reads in sysfs (not umad related).
>>
>> Not sure what the clean way to fix this is. Removing the call to
>> umad_init feels dirty but it's simple enough. Any ideas?
> Maybe we should make umad_init not open the umad device if the link
> layer is ethernet, iwarp, etc?
>
> Hal?
>
> Jason

As there may be multiple device types in the same host, I don't think it'd be that easy.

Right now the only thing umad_init() is doing is checking the version against the kernel ABI (or something like that).
This is not required for a lot of the umad API as they only list stuff in sysfs.

Nicolas

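The distinction drawn in this message, between listing devices from sysfs and checking the umad ABI version, can be illustrated with a minimal Python sketch. This is not libibumad code. It assumes that RDMA devices appear as entries under /sys/class/infiniband and that the ABI file is the /sys/class/infiniband_mad/abi_version path quoted in the ibwarn output; the function names and the `sysfs_root` parameter are hypothetical and exist only for illustration.

```python
import os


def list_ca_names(sysfs_root="/sys"):
    """List RDMA device names by reading sysfs only (ib_umad not needed)."""
    path = os.path.join(sysfs_root, "class", "infiniband")
    try:
        return sorted(os.listdir(path))
    except FileNotFoundError:
        # No RDMA devices registered (or not a Linux sysfs tree).
        return []


def read_umad_abi_version(sysfs_root="/sys"):
    """Return the umad ABI version, or None if the file is missing.

    The file is absent when ib_umad is not loaded, which is the
    condition the ibwarn message in the original report describes.
    """
    path = os.path.join(sysfs_root, "class", "infiniband_mad", "abi_version")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return None
```

On a RoCE-only host without ib_umad loaded, one would expect `list_ca_names()` to still return the device names while `read_umad_abi_version()` returns `None`, which matches the behaviour described in the thread.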
On Thu, Apr 05, 2018 at 05:40:46PM +0200, Nicolas Morey-Chaisemartin wrote:
>
> On 04/05/2018 05:20 PM, Jason Gunthorpe wrote:
> > On Thu, Apr 05, 2018 at 08:35:26AM +0200, Nicolas Morey-Chaisemartin wrote:
> >>
> >> On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
> >>>
> >>> Is this just a userspace bug in ibstat or do roce ports actually implement umad?
> >>>
> >>> Jason
> >> It seems like they don't. It's just ibstat doing a umad_init()
> >> before making a few calls to umad_get_cas_names() and a few others that
> >> seem to work through simple reads in sysfs (not umad related).
> >>
> >> Not sure what the clean way to fix this is. Removing the call to
> >> umad_init feels dirty but it's simple enough. Any ideas?
> > Maybe we should make umad_init not open the umad device if the link
> > layer is ethernet, iwarp, etc?
> >
> > Hal?
> >
> > Jason
>
> As there may be multiple device types in the same host, I don't think it'd be that easy.
>
> Right now the only thing umad_init() is doing is checking the version against the kernel ABI (or something like that).
> This is not required for a lot of the umad API as they only list stuff in sysfs.

It is not umad_init we need to remove, but umad_open_port should not
be called if the link layer is ethernet.

Jason

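The per-port check suggested in this message could look roughly like the following Python sketch. It is only an illustration of the idea, not how libibumad or ibstat is implemented. It assumes the kernel exposes each port's link layer at /sys/class/infiniband/<device>/ports/<port>/link_layer with values such as "InfiniBand" or "Ethernet"; the function names, the `sysfs_root` parameter, and the policy itself are hypothetical.

```python
import os


def port_link_layer(ca_name, port, sysfs_root="/sys"):
    """Read a port's link layer from sysfs, e.g. "InfiniBand" or "Ethernet".

    Returns None if the attribute cannot be found.
    """
    path = os.path.join(sysfs_root, "class", "infiniband", ca_name,
                        "ports", str(port), "link_layer")
    try:
        with open(path) as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def should_open_umad_port(ca_name, port, sysfs_root="/sys"):
    """Hypothetical policy: open a umad port only on an InfiniBand link layer.

    A missing link_layer attribute is treated as "do not open" here;
    that choice is an assumption of this sketch.
    """
    return port_link_layer(ca_name, port, sysfs_root) == "InfiniBand"
```

Because the decision is made per port rather than once per process, a check of this shape would also cope with hosts that mix InfiniBand and RoCE devices.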
On 04/05/2018 06:05 PM, Jason Gunthorpe wrote:
> On Thu, Apr 05, 2018 at 05:40:46PM +0200, Nicolas Morey-Chaisemartin wrote:
>>
>> On 04/05/2018 05:20 PM, Jason Gunthorpe wrote:
>>> On Thu, Apr 05, 2018 at 08:35:26AM +0200, Nicolas Morey-Chaisemartin wrote:
>>>> On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
>>>>> Is this just a userspace bug in ibstat or do roce ports actually implement umad?
>>>>>
>>>>> Jason
>>>> It seems like they don't. It's just ibstat doing a umad_init()
>>>> before making a few calls to umad_get_cas_names() and a few others that
>>>> seem to work through simple reads in sysfs (not umad related).
>>>>
>>>> Not sure what the clean way to fix this is. Removing the call to
>>>> umad_init feels dirty but it's simple enough. Any ideas?
>>> Maybe we should make umad_init not open the umad device if the link
>>> layer is ethernet, iwarp, etc?
>>>
>>> Hal?
>>>
>>> Jason
>> As there may be multiple device types in the same host, I don't think it'd be that easy.
>>
>> Right now the only thing umad_init() is doing is checking the version against the kernel ABI (or something like that).
>> This is not required for a lot of the umad API as they only list stuff in sysfs.
> It is not umad_init we need to remove, but umad_open_port should not
> be called if the link layer is ethernet.
>
> Jason

For ibstat, I don't think it is calling umad_open_port.
Its only calls into umad are umad_init, umad_get_cas_names, umad_get_ca and umad_get_ca_portguid, and it doesn't seem like any of these call umad_open_port.

The failure from ibstat is due to umad_init, which fails because there is no /sys/class/infiniband_mad/abi_version, as we only have RoCE hw and the ib_umad module is not automatically loaded.

Nicolas

diff --git a/kernel-boot/modules/roce.conf b/kernel-boot/modules/roce.conf
index 8e4927ce26f0..982b929f429a 100644
--- a/kernel-boot/modules/roce.conf
+++ b/kernel-boot/modules/roce.conf
@@ -1,2 +1,5 @@
 # These modules are loaded by the system if any RDMA over Converged Ethernet
 # device is installed
+
+# Access to fabric management SMPs and GMPs from userspace.
+ib_umad

ib_umad is required to get ibstat working.
Auto-load it for RoCE hardware so it works out of the box

$ ibstat
ibwarn: [4638] umad_init: can't read ABI version from /sys/class/infiniband_mad/abi_version (No such file or directory): is ib_umad module loaded?
ibpanic: [4638] main: can't init UMAD library: No such file or directory
$ modprobe ib_umad
$ ibstat
CA 'bnxt_re0'
        CA type: Broadcom NetXtreme-C/E RoCE Driver HCA
        Number of ports: 1
        Firmware version: 20.8.29.0
        Hardware version: 0x14e4
        Node GUID: 0x9edc71fffeb69930
        System image GUID: 0x9edc71fffeb69930
        Port 1:
                State: Down
                Physical state: Disabled
                Rate: 100
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x041d0000
                Port GUID: 0x9edc71fffeb69930
                Link layer: Ethernet

Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
---
 kernel-boot/modules/roce.conf | 3 +++
 1 file changed, 3 insertions(+)