Message ID | 20181218121552.24829-1-leon@kernel.org (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Jason Gunthorpe |
Headers | show |
Series | [rdma-next] IB/cm: Properly create sysfs layout of RDMA devices with InfiniBand link layer | expand |
On Tue, Dec 18, 2018 at 02:15:52PM +0200, Leon Romanovsky wrote: > From: Leon Romanovsky <leonro@mellanox.com> > > The sysfs layout is created by CM incorrectly presented RDMA devices > with InfiniBand link layer. Layout of such devices represents device > tree of connections. It means that the following issues are fixed by > the this change: > * Symlink name - It used device name instead of specific identifier. > * Target location - It was supposed to point to PCI-ID/infiniband_cm/ > instead of PCI-ID/infiniband/ > * Target name - It created extra device file under already existing > device folder, e.g. mlx5_0/mlx5_0 > > Fix it by changing sysfs device CM creation logic. > > Before the change: > [leonro@server ~]$ ls -l /sys/class/infiniband_cm/ > lrwxrwxrwx 1 root root 0 Dec 14 10:31 mlx5_0 -> ../../devices/pci0000:00/0000:00:0c.0/infiniband/mlx5_0/mlx5_0 > > After the change: > [leonro@server ~]$ ls -l /sys/class/infiniband_cm/ > lrwxrwxrwx 1 root root 0 Dec 14 10:47 cm0 -> ../../devices/pci0000:00/0000:00:0c.0/infiniband_cm/cm0 > > At the end, we will have same logic and representation as for uverbs, mad and issm sysfs classes. These other ones actually tie to chardevs, this does not. > Fixes: 110cf374a809 ("infiniband: make cm_device use a struct device and not a kobject.") > Signed-off-by: Leon Romanovsky <leonro@mellanox.com> > I think that it should be forwarded to stable@ too. > drivers/infiniband/core/cm.c | 17 +++++++++++------ > 1 file changed, 11 insertions(+), 6 deletions(-) > > diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c > index edb2cb758be7..e3bee04fb641 100644 > +++ b/drivers/infiniband/core/cm.c > @@ -93,6 +93,8 @@ static const char * const ibcm_rej_reason_strs[] = { > [IB_CM_REJ_INVALID_ALT_FLOW_LABEL] = "invalid alt flow label", > }; > > +static DEFINE_IDA(cm_ida); > + > const char *__attribute_const__ ibcm_reject_msg(int reason) > { > size_t index = reason; > @@ -220,6 +222,7 @@ struct cm_device { > struct list_head list; > struct ib_device *ib_device; > struct device *device; > + int devnum; > u8 ack_delay; > int going_down; > struct cm_port *port[0]; > @@ -4368,13 +4371,12 @@ static void cm_add_one(struct ib_device *ib_device) > cm_dev->ib_device = ib_device; > cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; > cm_dev->going_down = 0; > - cm_dev->device = device_create(&cm_class, &ib_device->dev, > + cm_dev->devnum = ida_alloc(&cm_ida, GFP_KERNEL); > + cm_dev->device = device_create(&cm_class, ib_device->dev.parent, > MKDEV(0, 0), NULL, Yikes, this is horrifying, why is it creating a struct device at all like this?? Probably the CM counter group debug sysfs should just be added directly to the actual ib_dev's sysfs?? Jason
> -----Original Message----- > From: linux-rdma-owner@vger.kernel.org <linux-rdma- > owner@vger.kernel.org> On Behalf Of Jason Gunthorpe > Sent: Tuesday, December 18, 2018 9:49 AM > To: Leon Romanovsky <leon@kernel.org> > Cc: Doug Ledford <dledford@redhat.com>; Leon Romanovsky > <leonro@mellanox.com>; RDMA mailing list <linux-rdma@vger.kernel.org>; > Greg Kroah-Hartman <gregkh@suse.de> > Subject: Re: [PATCH rdma-next] IB/cm: Properly create sysfs layout of RDMA > devices with InfiniBand link layer > > On Tue, Dec 18, 2018 at 02:15:52PM +0200, Leon Romanovsky wrote: > > From: Leon Romanovsky <leonro@mellanox.com> > > > > The sysfs layout is created by CM incorrectly presented RDMA devices > > with InfiniBand link layer. Layout of such devices represents device > > tree of connections. It means that the following issues are fixed by > > the this change: > > * Symlink name - It used device name instead of specific identifier. > > * Target location - It was supposed to point to PCI-ID/infiniband_cm/ > > instead of PCI-ID/infiniband/ > > * Target name - It created extra device file under already existing > > device folder, e.g. mlx5_0/mlx5_0 > > > > Fix it by changing sysfs device CM creation logic. > > > > Before the change: > > [leonro@server ~]$ ls -l /sys/class/infiniband_cm/ lrwxrwxrwx 1 root > > root 0 Dec 14 10:31 mlx5_0 -> > > ../../devices/pci0000:00/0000:00:0c.0/infiniband/mlx5_0/mlx5_0 > > > > After the change: > > [leonro@server ~]$ ls -l /sys/class/infiniband_cm/ lrwxrwxrwx 1 root > > root 0 Dec 14 10:47 cm0 -> > > ../../devices/pci0000:00/0000:00:0c.0/infiniband_cm/cm0 > > > > At the end, we will have same logic and representation as for uverbs, mad > and issm sysfs classes. > > These other ones actually tie to chardevs, this does not. > > > Fixes: 110cf374a809 ("infiniband: make cm_device use a struct device > > and not a kobject.") > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com> I think that it > > should be forwarded to stable@ too. > > drivers/infiniband/core/cm.c | 17 +++++++++++------ > > 1 file changed, 11 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/infiniband/core/cm.c > > b/drivers/infiniband/core/cm.c index edb2cb758be7..e3bee04fb641 100644 > > +++ b/drivers/infiniband/core/cm.c > > @@ -93,6 +93,8 @@ static const char * const ibcm_rej_reason_strs[] = { > > [IB_CM_REJ_INVALID_ALT_FLOW_LABEL] = "invalid alt flow > label", > > }; > > > > +static DEFINE_IDA(cm_ida); > > + > > const char *__attribute_const__ ibcm_reject_msg(int reason) { > > size_t index = reason; > > @@ -220,6 +222,7 @@ struct cm_device { > > struct list_head list; > > struct ib_device *ib_device; > > struct device *device; > > + int devnum; > > u8 ack_delay; > > int going_down; > > struct cm_port *port[0]; > > @@ -4368,13 +4371,12 @@ static void cm_add_one(struct ib_device > *ib_device) > > cm_dev->ib_device = ib_device; > > cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; > > cm_dev->going_down = 0; > > - cm_dev->device = device_create(&cm_class, &ib_device->dev, > > + cm_dev->devnum = ida_alloc(&cm_ida, GFP_KERNEL); > > + cm_dev->device = device_create(&cm_class, ib_device->dev.parent, > > MKDEV(0, 0), NULL, > > Yikes, this is horrifying, why is it creating a struct device at all like this?? > > Probably the CM counter group debug sysfs should just be added directly to > the actual ib_dev's sysfs?? > Yes. anchoring to ibdev's sysfs looks right way but that breaks the ABI/scripts who are accustomed to use it from infiniband_cm. I planned to do it differently to avoid breaking such users by keeping it in infiniband_cm, but since such change doesn't break any functionality, its better we make it right now to anchor at ib_dev' port's kobject. If there is agreement, I can probably do post these 3 series, as this will likely have code conflicts with those 3 series in works. > Jason
On Tue, Dec 18, 2018 at 09:32:16AM -0700, Parav Pandit wrote: > > > @@ -4368,13 +4371,12 @@ static void cm_add_one(struct ib_device > > *ib_device) > > > cm_dev->ib_device = ib_device; > > > cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; > > > cm_dev->going_down = 0; > > > - cm_dev->device = device_create(&cm_class, &ib_device->dev, > > > + cm_dev->devnum = ida_alloc(&cm_ida, GFP_KERNEL); > > > + cm_dev->device = device_create(&cm_class, ib_device->dev.parent, > > > MKDEV(0, 0), NULL, > > > > Yikes, this is horrifying, why is it creating a struct device at all like this?? > > > > Probably the CM counter group debug sysfs should just be added directly to > > the actual ib_dev's sysfs?? > > Yes. anchoring to ibdev's sysfs looks right way but that breaks the > ABI/scripts who are accustomed to use it from infiniband_cm. But doesn't this patch rename those paths already? What paths do you think people are using? Jason
> -----Original Message----- > From: Jason Gunthorpe > Sent: Tuesday, December 18, 2018 10:35 AM > To: Parav Pandit <parav@mellanox.com> > Cc: Leon Romanovsky <leon@kernel.org>; Doug Ledford > <dledford@redhat.com>; Leon Romanovsky <leonro@mellanox.com>; RDMA > mailing list <linux-rdma@vger.kernel.org>; Greg Kroah-Hartman > <gregkh@suse.de> > Subject: Re: [PATCH rdma-next] IB/cm: Properly create sysfs layout of RDMA > devices with InfiniBand link layer > > On Tue, Dec 18, 2018 at 09:32:16AM -0700, Parav Pandit wrote: > > > > > @@ -4368,13 +4371,12 @@ static void cm_add_one(struct ib_device > > > *ib_device) > > > > cm_dev->ib_device = ib_device; > > > > cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; > > > > cm_dev->going_down = 0; > > > > - cm_dev->device = device_create(&cm_class, &ib_device->dev, > > > > + cm_dev->devnum = ida_alloc(&cm_ida, GFP_KERNEL); > > > > + cm_dev->device = device_create(&cm_class, ib_device->dev.parent, > > > > MKDEV(0, 0), NULL, > > > > > > Yikes, this is horrifying, why is it creating a struct device at all like this?? > > > > > > Probably the CM counter group debug sysfs should just be added > > > directly to the actual ib_dev's sysfs?? > > > > Yes. anchoring to ibdev's sysfs looks right way but that breaks the > > ABI/scripts who are accustomed to use it from infiniband_cm. > > But doesn't this patch rename those paths already? > Yes, it breaks. I did raise this to Leon yesterday. But this patch solves some other problem... > What paths do you think people are using? > cat /sys/class/infiniband_cm/mlx5_0/1/cm_tx_retries/req
On Tue, Dec 18, 2018 at 09:50:44AM -0700, Parav Pandit wrote: > > > > From: Jason Gunthorpe > > Sent: Tuesday, December 18, 2018 10:35 AM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Leon Romanovsky <leon@kernel.org>; Doug Ledford > > <dledford@redhat.com>; Leon Romanovsky <leonro@mellanox.com>; RDMA > > mailing list <linux-rdma@vger.kernel.org>; Greg Kroah-Hartman > > <gregkh@suse.de> > > Subject: Re: [PATCH rdma-next] IB/cm: Properly create sysfs layout of RDMA > > devices with InfiniBand link layer > > > > On Tue, Dec 18, 2018 at 09:32:16AM -0700, Parav Pandit wrote: > > > > > > > @@ -4368,13 +4371,12 @@ static void cm_add_one(struct ib_device > > > > *ib_device) > > > > > cm_dev->ib_device = ib_device; > > > > > cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; > > > > > cm_dev->going_down = 0; > > > > > - cm_dev->device = device_create(&cm_class, &ib_device->dev, > > > > > + cm_dev->devnum = ida_alloc(&cm_ida, GFP_KERNEL); > > > > > + cm_dev->device = device_create(&cm_class, ib_device->dev.parent, > > > > > MKDEV(0, 0), NULL, > > > > > > > > Yikes, this is horrifying, why is it creating a struct device at all like this?? > > > > > > > > Probably the CM counter group debug sysfs should just be added > > > > directly to the actual ib_dev's sysfs?? > > > > > > Yes. anchoring to ibdev's sysfs looks right way but that breaks the > > > ABI/scripts who are accustomed to use it from infiniband_cm. > > > > But doesn't this patch rename those paths already? > > Yes, it breaks. I did raise this to Leon yesterday. But this patch > solves some other problem... Well, if we decide to break it then lets break it and end up in a sane spot.. > > What paths do you think people are using? > > > cat /sys/class/infiniband_cm/mlx5_0/1/cm_tx_retries/req Could we put a symlink here perhaps - one that tracks rename maybe? Jason
> -----Original Message----- > From: Jason Gunthorpe > Sent: Tuesday, December 18, 2018 10:56 AM > To: Parav Pandit <parav@mellanox.com> > Cc: Leon Romanovsky <leon@kernel.org>; Doug Ledford > <dledford@redhat.com>; Leon Romanovsky <leonro@mellanox.com>; RDMA > mailing list <linux-rdma@vger.kernel.org>; Greg Kroah-Hartman > <gregkh@suse.de> > Subject: Re: [PATCH rdma-next] IB/cm: Properly create sysfs layout of RDMA > devices with InfiniBand link layer > > On Tue, Dec 18, 2018 at 09:50:44AM -0700, Parav Pandit wrote: > > > > > > > From: Jason Gunthorpe > > > Sent: Tuesday, December 18, 2018 10:35 AM > > > To: Parav Pandit <parav@mellanox.com> > > > Cc: Leon Romanovsky <leon@kernel.org>; Doug Ledford > > > <dledford@redhat.com>; Leon Romanovsky <leonro@mellanox.com>; > RDMA > > > mailing list <linux-rdma@vger.kernel.org>; Greg Kroah-Hartman > > > <gregkh@suse.de> > > > Subject: Re: [PATCH rdma-next] IB/cm: Properly create sysfs layout > > > of RDMA devices with InfiniBand link layer > > > > > > On Tue, Dec 18, 2018 at 09:32:16AM -0700, Parav Pandit wrote: > > > > > > > > > @@ -4368,13 +4371,12 @@ static void cm_add_one(struct > > > > > > ib_device > > > > > *ib_device) > > > > > > cm_dev->ib_device = ib_device; > > > > > > cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; > > > > > > cm_dev->going_down = 0; > > > > > > - cm_dev->device = device_create(&cm_class, &ib_device- > >dev, > > > > > > + cm_dev->devnum = ida_alloc(&cm_ida, GFP_KERNEL); > > > > > > + cm_dev->device = device_create(&cm_class, > > > > > > +ib_device->dev.parent, > > > > > > MKDEV(0, 0), NULL, > > > > > > > > > > Yikes, this is horrifying, why is it creating a struct device at all like > this?? > > > > > > > > > > Probably the CM counter group debug sysfs should just be added > > > > > directly to the actual ib_dev's sysfs?? > > > > > > > > Yes. anchoring to ibdev's sysfs looks right way but that breaks > > > > the ABI/scripts who are accustomed to use it from infiniband_cm. > > > > > > But doesn't this patch rename those paths already? > > > > Yes, it breaks. I did raise this to Leon yesterday. But this patch > > solves some other problem... > > Well, if we decide to break it then lets break it and end up in a sane spot.. > Yes, I also think so. If we anchor at ibdev's sysfs, it handles rename also cleanly. > > > What paths do you think people are using? > > > > > cat /sys/class/infiniband_cm/mlx5_0/1/cm_tx_retries/req > > Could we put a symlink here perhaps - one that tracks rename maybe? If we put under, cat /sys/class/infiniband/mlx5_0/1/cm_tx_retries/req compare to previously, cat /sys/class/infiniband_cm/mlx5_0/1/cm_tx_retries/req It is not that bad. In current patch as end user, I need to figure out, which pci dev cm0 maps to. Then, I need to go through all ibdevices and find out matching pci dev, and map cm0 to that device. Of course, a script can be written... But /sys/class/infiniband/mlx5_0/1/cm_tx_retries/req seems a cleaner approach that doesn't require special scripts, and works with renaming nicely.
On Tue, Dec 18, 2018 at 05:40:37PM +0000, Parav Pandit wrote: > > > > -----Original Message----- > > From: Jason Gunthorpe > > Sent: Tuesday, December 18, 2018 10:56 AM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Leon Romanovsky <leon@kernel.org>; Doug Ledford > > <dledford@redhat.com>; Leon Romanovsky <leonro@mellanox.com>; RDMA > > mailing list <linux-rdma@vger.kernel.org>; Greg Kroah-Hartman > > <gregkh@suse.de> > > Subject: Re: [PATCH rdma-next] IB/cm: Properly create sysfs layout of RDMA > > devices with InfiniBand link layer > > > > On Tue, Dec 18, 2018 at 09:50:44AM -0700, Parav Pandit wrote: > > > > > > > > > > From: Jason Gunthorpe > > > > Sent: Tuesday, December 18, 2018 10:35 AM > > > > To: Parav Pandit <parav@mellanox.com> > > > > Cc: Leon Romanovsky <leon@kernel.org>; Doug Ledford > > > > <dledford@redhat.com>; Leon Romanovsky <leonro@mellanox.com>; > > RDMA > > > > mailing list <linux-rdma@vger.kernel.org>; Greg Kroah-Hartman > > > > <gregkh@suse.de> > > > > Subject: Re: [PATCH rdma-next] IB/cm: Properly create sysfs layout > > > > of RDMA devices with InfiniBand link layer > > > > > > > > On Tue, Dec 18, 2018 at 09:32:16AM -0700, Parav Pandit wrote: > > > > > > > > > > > @@ -4368,13 +4371,12 @@ static void cm_add_one(struct > > > > > > > ib_device > > > > > > *ib_device) > > > > > > > cm_dev->ib_device = ib_device; > > > > > > > cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; > > > > > > > cm_dev->going_down = 0; > > > > > > > - cm_dev->device = device_create(&cm_class, &ib_device- > > >dev, > > > > > > > + cm_dev->devnum = ida_alloc(&cm_ida, GFP_KERNEL); > > > > > > > + cm_dev->device = device_create(&cm_class, > > > > > > > +ib_device->dev.parent, > > > > > > > MKDEV(0, 0), NULL, > > > > > > > > > > > > Yikes, this is horrifying, why is it creating a struct device at all like > > this?? > > > > > > > > > > > > Probably the CM counter group debug sysfs should just be added > > > > > > directly to the actual ib_dev's sysfs?? > > > > > > > > > > Yes. anchoring to ibdev's sysfs looks right way but that breaks > > > > > the ABI/scripts who are accustomed to use it from infiniband_cm. > > > > > > > > But doesn't this patch rename those paths already? > > > > > > Yes, it breaks. I did raise this to Leon yesterday. But this patch > > > solves some other problem... > > > > Well, if we decide to break it then lets break it and end up in a sane spot.. > > > Yes, I also think so. If we anchor at ibdev's sysfs, it handles rename also cleanly. > > > > > What paths do you think people are using? > > > > > > > cat /sys/class/infiniband_cm/mlx5_0/1/cm_tx_retries/req > > > > Could we put a symlink here perhaps - one that tracks rename maybe? > If we put under, > cat /sys/class/infiniband/mlx5_0/1/cm_tx_retries/req > compare to previously, > cat /sys/class/infiniband_cm/mlx5_0/1/cm_tx_retries/req > > It is not that bad. > In current patch as end user, I need to figure out, which pci dev cm0 maps to. > Then, I need to go through all ibdevices and find out matching pci dev, and map cm0 to that device. > Of course, a script can be written... > But /sys/class/infiniband/mlx5_0/1/cm_tx_retries/req seems a cleaner approach that doesn't require special scripts, and works with renaming nicely. It is pretty easy to see to which cm your ibdev is connected: [leonro@server rdma-core]$ ls -al /sys/class/infiniband_cm/cm0/device/infiniband total 0 drwxr-xr-x 3 root root 0 Dec 16 12:38 . drwxr-xr-x 9 root root 0 Dec 16 17:17 .. drwxr-xr-x 4 root root 0 Dec 16 13:01 hfi1 Thanks
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index edb2cb758be7..e3bee04fb641 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -93,6 +93,8 @@ static const char * const ibcm_rej_reason_strs[] = { [IB_CM_REJ_INVALID_ALT_FLOW_LABEL] = "invalid alt flow label", }; +static DEFINE_IDA(cm_ida); + const char *__attribute_const__ ibcm_reject_msg(int reason) { size_t index = reason; @@ -220,6 +222,7 @@ struct cm_device { struct list_head list; struct ib_device *ib_device; struct device *device; + int devnum; u8 ack_delay; int going_down; struct cm_port *port[0]; @@ -4368,13 +4371,12 @@ static void cm_add_one(struct ib_device *ib_device) cm_dev->ib_device = ib_device; cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; cm_dev->going_down = 0; - cm_dev->device = device_create(&cm_class, &ib_device->dev, + cm_dev->devnum = ida_alloc(&cm_ida, GFP_KERNEL); + cm_dev->device = device_create(&cm_class, ib_device->dev.parent, MKDEV(0, 0), NULL, - "%s", dev_name(&ib_device->dev)); - if (IS_ERR(cm_dev->device)) { - kfree(cm_dev); - return; - } + "cm%d", cm_dev->devnum); + if (IS_ERR(cm_dev->device)) + goto devreg; set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask); for (i = 1; i <= ib_device->phys_port_cnt; i++) { @@ -4442,6 +4444,8 @@ static void cm_add_one(struct ib_device *ib_device) } free: device_unregister(cm_dev->device); +devreg: + ida_free(&cm_ida, cm_dev->devnum); kfree(cm_dev); } @@ -4496,6 +4500,7 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data) } device_unregister(cm_dev->device); + ida_free(&cm_ida, cm_dev->devnum); kfree(cm_dev); }