diff mbox

[infiniband-diags] ibstat no longer works with ibsim with latest libibumad in rdma-core

Message ID b1658c5d-0cbc-b805-dc19-012fb09e6c7c@dev.mellanox.co.il (mailing list archive)
State Rejected
Headers show

Commit Message

Hal Rosenstock May 16, 2018, 6:40 p.m. UTC
If ibstat is run with ibsim, it no longer detects the local ports
with the latest libibumad from rdma-core with the following patch:

commit abf72057c27750275d0668375d30c4971911d041
Author: Jason Gunthorpe <jgg@mellanox.com>
Date:   Thu Apr 5 11:04:14 2018 -0600

    Now that we don't load the umad module if the HW doesn't use it (eg
    for roce only hardware) umad_init is failing to read the ABI version
    from the kernel.

    Applications still want to use some libibumad services that are not
    related to the char device, so move the version check to umad_open_port
    instead.

    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

umad_get_cas_names no longer finds any IB devices/ports
when running with ibsim as sysfs is no longer initialized with
the latest libibumad where this is not done as part of umad_init
but moved to umad_open_port.

Fix this by adding dummy umad_open_port call (to IB device and port
that is not possible) which completes the initialization needed after
umad_init in main.

Signed-off-by: Hal Rosenstock <hal@mellanox.com>
---
 src/ibstat.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jason Gunthorpe May 16, 2018, 7:11 p.m. UTC | #1
On Wed, May 16, 2018 at 02:40:58PM -0400, Hal Rosenstock wrote:
> 
> If ibstat is run with ibsim, it no longer detects the local ports
> with the latest libibumad from rdma-core with the following patch:
> 
> commit abf72057c27750275d0668375d30c4971911d041
> Author: Jason Gunthorpe <jgg@mellanox.com>
> Date:   Thu Apr 5 11:04:14 2018 -0600
> 
>     Now that we don't load the umad module if the HW doesn't use it (eg
>     for roce only hardware) umad_init is failing to read the ABI version
>     from the kernel.
> 
>     Applications still want to use some libibumad services that are not
>     related to the char device, so move the version check to umad_open_port
>     instead.
> 
>     Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> 
> umad_get_cas_names no longer finds any IB devices/ports
> when running with ibsim as sysfs is no longer initialized with
> the latest libibumad where this is not done as part of umad_init
> but moved to umad_open_port.

Why is this not getting fixed in ibsim?

IIRC the entire point of the above patch was to make ibstat work on
rocee, so this seems to undo that effort.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hal Rosenstock May 16, 2018, 7:23 p.m. UTC | #2
On 5/16/2018 3:11 PM, Jason Gunthorpe wrote:
> On Wed, May 16, 2018 at 02:40:58PM -0400, Hal Rosenstock wrote:
>>
>> If ibstat is run with ibsim, it no longer detects the local ports
>> with the latest libibumad from rdma-core with the following patch:
>>
>> commit abf72057c27750275d0668375d30c4971911d041
>> Author: Jason Gunthorpe <jgg@mellanox.com>
>> Date:   Thu Apr 5 11:04:14 2018 -0600
>>
>>     Now that we don't load the umad module if the HW doesn't use it (eg
>>     for roce only hardware) umad_init is failing to read the ABI version
>>     from the kernel.
>>
>>     Applications still want to use some libibumad services that are not
>>     related to the char device, so move the version check to umad_open_port
>>     instead.
>>
>>     Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
>>
>> umad_get_cas_names no longer finds any IB devices/ports
>> when running with ibsim as sysfs is no longer initialized with
>> the latest libibumad where this is not done as part of umad_init
>> but moved to umad_open_port.
> 
> Why is this not getting fixed in ibsim?

Unlike real hardware where sysfs is always /sys/class/..., with ibsim
moves that down a level with ./sys-nnnn/sys/class where the nnnn is per
client. It was the original umad_init in the client application and not
in libumad2sim that made this happen. Maybe I'm missing something but I
don't see way to move this into ibsim itself.

> IIRC the entire point of the above patch was to make ibstat work on
> rocee, so this seems to undo that effort.

How does it undo that ? The dummy umad_open_port call always fails
silently rather than treated as an error and ibstat continues on. On
RoCE hardware, this call does nothing. On IB hardware, it causes sysfs
to be properly populated. Without it, no IB devices are detected in
simulation but it works with real IB device.

I did not test this with RoCE as I don't have hardware but it should
work but I did test this with old and new libibumad (before and after
your patch) as well as IB hardware and IB simulation.

-- Hal
> Jason
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe May 16, 2018, 7:35 p.m. UTC | #3
On Wed, May 16, 2018 at 03:23:35PM -0400, Hal Rosenstock wrote:
> On 5/16/2018 3:11 PM, Jason Gunthorpe wrote:
> > On Wed, May 16, 2018 at 02:40:58PM -0400, Hal Rosenstock wrote:
> >>
> >> If ibstat is run with ibsim, it no longer detects the local ports
> >> with the latest libibumad from rdma-core with the following patch:
> >>
> >> commit abf72057c27750275d0668375d30c4971911d041
> >> Author: Jason Gunthorpe <jgg@mellanox.com>
> >> Date:   Thu Apr 5 11:04:14 2018 -0600
> >>
> >>     Now that we don't load the umad module if the HW doesn't use it (eg
> >>     for roce only hardware) umad_init is failing to read the ABI version
> >>     from the kernel.
> >>
> >>     Applications still want to use some libibumad services that are not
> >>     related to the char device, so move the version check to umad_open_port
> >>     instead.
> >>
> >>     Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> >>
> >> umad_get_cas_names no longer finds any IB devices/ports
> >> when running with ibsim as sysfs is no longer initialized with
> >> the latest libibumad where this is not done as part of umad_init
> >> but moved to umad_open_port.
> > 
> > Why is this not getting fixed in ibsim?
> 
> Unlike real hardware where sysfs is always /sys/class/..., with ibsim
> moves that down a level with ./sys-nnnn/sys/class where the nnnn is per
> client. It was the original umad_init in the client application and not
> in libumad2sim that made this happen. Maybe I'm missing something but I
> don't see way to move this into ibsim itself.

Have scandir() in umad2sim.c also call umad2sim_init()

?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hal Rosenstock May 16, 2018, 9:02 p.m. UTC | #4
On 5/16/2018 3:35 PM, Jason Gunthorpe wrote:
> Have scandir() in umad2sim.c also call umad2sim_init()
> 
> ?

That looks promising. I'll try this approach shortly.

Thanks.

-- Hal
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hal Rosenstock May 17, 2018, 12:22 a.m. UTC | #5
On 5/16/2018 5:02 PM, Hal Rosenstock wrote:
> On 5/16/2018 3:35 PM, Jason Gunthorpe wrote:
>> Have scandir() in umad2sim.c also call umad2sim_init()
>>
>> ?
> 
> That looks promising. I'll try this approach shortly.

It worked and is a better approach. Please ignore this patch.

Thanks again.

-- Hal

> Thanks.
> 
> -- Hal
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/src/ibstat.c b/src/ibstat.c
index bad4c99..77b6dc0 100644
--- a/src/ibstat.c
+++ b/src/ibstat.c
@@ -307,6 +307,8 @@  int main(int argc, char *argv[])
 	if (umad_init() < 0)
 		IBPANIC("can't init UMAD library");
 
+	umad_open_port("", 255);
+
 	if ((n = umad_get_cas_names(names, UMAD_MAX_DEVICES)) < 0)
 		IBPANIC("can't list IB device names");