Message ID | 20190117204102.30826.18387.stgit@scvm10.sc.intel.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Jason Gunthorpe |
Headers | show |
Series | hfi1 and qib patches for rc | expand |
On Thu, Jan 17, 2019 at 12:41:12PM -0800, Dennis Dalessandro wrote: > From: Michael J. Ruhl <michael.j.ruhl@intel.com> > > Commit 896de0090a85 ("RDMA/core: Use dev_name instead of > ibdev->name"), changes the API for setting the IB device name. > > HFI, QIB, and RDMAVT make assumptions with regards to this API (i.e. > as to when the device name string is available). This assumption is > no longer valid. > > Update the drivers to reflect the commit by storing the specific > device name in device local information. Create a calldown for the > RVT driver to access the drivers name information. > > Rework the logging macros to use the best device name. > > Fixes: 11f0e89710eb ("IB/{hfi1, qib}: Fix a concurrency issue with device name in logging") > Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> > Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> > Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> > Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> > --- > drivers/infiniband/hw/hfi1/affinity.c | 6 +++-- > drivers/infiniband/hw/hfi1/driver.c | 8 +++++++ > drivers/infiniband/hw/hfi1/hfi.h | 22 ++++++++----------- > drivers/infiniband/hw/hfi1/init.c | 2 +- > drivers/infiniband/hw/hfi1/verbs.c | 1 + > drivers/infiniband/hw/qib/qib.h | 12 +++++----- > drivers/infiniband/hw/qib/qib_driver.c | 8 +++++++ > drivers/infiniband/hw/qib/qib_init.c | 10 ++++++-- > drivers/infiniband/hw/qib/qib_verbs.c | 1 + > drivers/infiniband/sw/rdmavt/trace.h | 6 +++-- > drivers/infiniband/sw/rdmavt/vt.c | 2 +- > drivers/infiniband/sw/rdmavt/vt.h | 10 ++++++-- > include/rdma/rdma_vt.h | 38 ++++++-------------------------- > 13 files changed, 64 insertions(+), 62 deletions(-) This is too ugly. Drivers should not be defining their own printing macros, or dealing with this name stuff. They shouldn't be randomly choosing to print to the PCI struct device vs the ib struct device and all sorts of other inconsistent things we have. If drivers need to print prior to registration, then we need to copy what netdev has been doing and provide our own printing functions accepting a struct ib_device* that go through various gyrations to compute a suitable identification string across the entire life cycle - ie netdev will use a PCI BDF and other options if the ethX name is not yet assigned. I stopped short of doing this when I injected dev_name, because it didn't seem necessary, but if you say hfi has a problem, then this is how to fix it. > --- a/drivers/infiniband/hw/hfi1/init.c > +++ b/drivers/infiniband/hw/hfi1/init.c > @@ -1313,7 +1313,7 @@ void hfi1_free_devdata(struct hfi1_devdata *dd) > "Could not allocate unit ID: error %d\n", -ret); > goto bail; > } > - rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit); > + kobject_set_name(&dd->kobj, "%s_%d", class_name(), dd->unit); Woah, this driver has a kobj and a struct ib_device in the same structure? Betcha the refcounting is done wrong :( This kobject is completely redundant with the kobject that is already in the struct ib_device, you should delete it, not use it for more stuff. > /* > * Initialize all locks for the device. This needs to be as early as > diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c > index ec582d8..6546cba 100644 > --- a/drivers/infiniband/hw/hfi1/verbs.c > +++ b/drivers/infiniband/hw/hfi1/verbs.c > @@ -1680,6 +1680,7 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd) > * Fill in rvt info object. > */ > dd->verbs_dev.rdi.driver_f.port_callback = hfi1_create_port_files; > + dd->verbs_dev.rdi.driver_f.get_device_name = hfi1_get_device_name; > dd->verbs_dev.rdi.driver_f.get_pci_dev = get_pci_dev; > dd->verbs_dev.rdi.driver_f.check_ah = hfi1_check_ah; > dd->verbs_dev.rdi.driver_f.notify_new_ah = hfi1_notify_new_ah; > diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h > index 83d2349..3681701 100644 > --- a/drivers/infiniband/hw/qib/qib.h > +++ b/drivers/infiniband/hw/qib/qib.h > @@ -1075,6 +1075,8 @@ struct qib_devdata { > struct tasklet_struct error_tasklet; > > int assigned_node_id; /* NUMA node closest to HCA */ > + /* device name */ > + char *name; > }; Don't store your own names :( > -/** > - * rvt_set_ibdev_name - Craft an IB device name from client info > - * @rdi: pointer to the client rvt_dev_info structure > - * @name: client specific name > - * @unit: client specific unit number. > - */ > -static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi, > - const char *fmt, const char *name, > - const int unit) > -{ > - /* > - * FIXME: rvt and its users want to touch the ibdev before > - * registration and have things like the name work. We don't have the > - * infrastructure in the core to support this directly today, hack it > - * to work by setting the name manually here. > - */ > - dev_set_name(&rdi->ibdev.dev, fmt, name, unit); > - strlcpy(rdi->ibdev.name, dev_name(&rdi->ibdev.dev), IB_DEVICE_NAME_MAX); So why wasn't this hack good enough? What is the issue? Jason
On 1/17/2019 4:04 PM, Jason Gunthorpe wrote: > On Thu, Jan 17, 2019 at 12:41:12PM -0800, Dennis Dalessandro wrote: >> From: Michael J. Ruhl <michael.j.ruhl@intel.com> >> >> Commit 896de0090a85 ("RDMA/core: Use dev_name instead of >> ibdev->name"), changes the API for setting the IB device name. >> >> HFI, QIB, and RDMAVT make assumptions with regards to this API (i.e. >> as to when the device name string is available). This assumption is >> no longer valid. >> >> Update the drivers to reflect the commit by storing the specific >> device name in device local information. Create a calldown for the >> RVT driver to access the drivers name information. >> >> Rework the logging macros to use the best device name. >> >> Fixes: 11f0e89710eb ("IB/{hfi1, qib}: Fix a concurrency issue with device name in logging") >> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> >> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> >> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> >> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> >> --- >> drivers/infiniband/hw/hfi1/affinity.c | 6 +++-- >> drivers/infiniband/hw/hfi1/driver.c | 8 +++++++ >> drivers/infiniband/hw/hfi1/hfi.h | 22 ++++++++----------- >> drivers/infiniband/hw/hfi1/init.c | 2 +- >> drivers/infiniband/hw/hfi1/verbs.c | 1 + >> drivers/infiniband/hw/qib/qib.h | 12 +++++----- >> drivers/infiniband/hw/qib/qib_driver.c | 8 +++++++ >> drivers/infiniband/hw/qib/qib_init.c | 10 ++++++-- >> drivers/infiniband/hw/qib/qib_verbs.c | 1 + >> drivers/infiniband/sw/rdmavt/trace.h | 6 +++-- >> drivers/infiniband/sw/rdmavt/vt.c | 2 +- >> drivers/infiniband/sw/rdmavt/vt.h | 10 ++++++-- >> include/rdma/rdma_vt.h | 38 ++++++-------------------------- >> 13 files changed, 64 insertions(+), 62 deletions(-) > > This is too ugly. Drivers should not be defining their own printing > macros, or dealing with this name stuff. They shouldn't be randomly > choosing to print to the PCI struct device vs the ib struct device and > all sorts of other inconsistent things we have. Why is it ugly? This isn't new. There is nothing random about what is chosen to print. The PCI is used before we have an IB dev. That doesn't change before/after this patch. Are you referring to inconsistent things in hfi1/rdmavt/qib or from an IB core point of view? > If drivers need to print prior to registration, then we need to copy > what netdev has been doing and provide our own printing functions > accepting a struct ib_device* that go through various gyrations to > compute a suitable identification string across the entire life cycle > - ie netdev will use a PCI BDF and other options if the ethX name is > not yet assigned. Which is what we do here, use the PCI BDF right? Did I miss something? Or are you saying it should be added to the core rather than the driver. I'd be OK with that as well. > I stopped short of doing this when I injected dev_name, because it > didn't seem necessary, but if you say hfi has a problem, then this is > how to fix it. > >> --- a/drivers/infiniband/hw/hfi1/init.c >> +++ b/drivers/infiniband/hw/hfi1/init.c >> @@ -1313,7 +1313,7 @@ void hfi1_free_devdata(struct hfi1_devdata *dd) >> "Could not allocate unit ID: error %d\n", -ret); >> goto bail; >> } >> - rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit); >> + kobject_set_name(&dd->kobj, "%s_%d", class_name(), dd->unit); > > Woah, this driver has a kobj and a struct ib_device in the same > structure? Betcha the refcounting is done wrong :( Been like that for quite a while. Mike has looked at this and I trust him when he says the refcounting is correct. Do you see something wrong here? If so please point it out. > This kobject is completely redundant with the kobject that is already > in the struct ib_device, you should delete it, not use it for more > stuff. The kobject in the ib_device is just that, for ib_device stuff. The one here is for hfi1 stuff. Why muddle them together? Is there an advantage to that? >> /* >> * Initialize all locks for the device. This needs to be as early as >> diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c >> index ec582d8..6546cba 100644 >> --- a/drivers/infiniband/hw/hfi1/verbs.c >> +++ b/drivers/infiniband/hw/hfi1/verbs.c >> @@ -1680,6 +1680,7 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd) >> * Fill in rvt info object. >> */ >> dd->verbs_dev.rdi.driver_f.port_callback = hfi1_create_port_files; >> + dd->verbs_dev.rdi.driver_f.get_device_name = hfi1_get_device_name; >> dd->verbs_dev.rdi.driver_f.get_pci_dev = get_pci_dev; >> dd->verbs_dev.rdi.driver_f.check_ah = hfi1_check_ah; >> dd->verbs_dev.rdi.driver_f.notify_new_ah = hfi1_notify_new_ah; >> diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h >> index 83d2349..3681701 100644 >> --- a/drivers/infiniband/hw/qib/qib.h >> +++ b/drivers/infiniband/hw/qib/qib.h >> @@ -1075,6 +1075,8 @@ struct qib_devdata { >> struct tasklet_struct error_tasklet; >> >> int assigned_node_id; /* NUMA node closest to HCA */ >> + /* device name */ >> + char *name; >> }; > > Don't store your own names :( Can you explain the issue here? > >> -/** >> - * rvt_set_ibdev_name - Craft an IB device name from client info >> - * @rdi: pointer to the client rvt_dev_info structure >> - * @name: client specific name >> - * @unit: client specific unit number. >> - */ >> -static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi, >> - const char *fmt, const char *name, >> - const int unit) >> -{ >> - /* >> - * FIXME: rvt and its users want to touch the ibdev before >> - * registration and have things like the name work. We don't have the >> - * infrastructure in the core to support this directly today, hack it >> - * to work by setting the name manually here. >> - */ >> - dev_set_name(&rdi->ibdev.dev, fmt, name, unit); >> - strlcpy(rdi->ibdev.name, dev_name(&rdi->ibdev.dev), IB_DEVICE_NAME_MAX); > > So why wasn't this hack good enough? What is the issue? You literally wrote FIXME, so Mike took that to mean he should fix it. -Denny
>-----Original Message----- >From: Dalessandro, Dennis >Sent: Friday, February 8, 2019 8:41 AM >To: Jason Gunthorpe <jgg@ziepe.ca> >Cc: dledford@redhat.com; linux-rdma@vger.kernel.org; Ruhl, Michael J ><michael.j.ruhl@intel.com>; Marciniszyn, Mike ><mike.marciniszyn@intel.com> >Subject: Re: [PATCH for-rc 1/7] IB/{hfi1, qib, rdmavt}: Do not depend on IB >Verbs name for driver logging > >On 1/17/2019 4:04 PM, Jason Gunthorpe wrote: >> On Thu, Jan 17, 2019 at 12:41:12PM -0800, Dennis Dalessandro wrote: >>> From: Michael J. Ruhl <michael.j.ruhl@intel.com> >>> >>> Commit 896de0090a85 ("RDMA/core: Use dev_name instead of >>> ibdev->name"), changes the API for setting the IB device name. >>> >>> HFI, QIB, and RDMAVT make assumptions with regards to this API (i.e. >>> as to when the device name string is available). This assumption is >>> no longer valid. >>> >>> Update the drivers to reflect the commit by storing the specific >>> device name in device local information. Create a calldown for the >>> RVT driver to access the drivers name information. >>> >>> Rework the logging macros to use the best device name. >>> >>> Fixes: 11f0e89710eb ("IB/{hfi1, qib}: Fix a concurrency issue with device >name in logging") >>> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> >>> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> >>> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> >>> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> >>> --- >>> drivers/infiniband/hw/hfi1/affinity.c | 6 +++-- >>> drivers/infiniband/hw/hfi1/driver.c | 8 +++++++ >>> drivers/infiniband/hw/hfi1/hfi.h | 22 ++++++++----------- >>> drivers/infiniband/hw/hfi1/init.c | 2 +- >>> drivers/infiniband/hw/hfi1/verbs.c | 1 + >>> drivers/infiniband/hw/qib/qib.h | 12 +++++----- >>> drivers/infiniband/hw/qib/qib_driver.c | 8 +++++++ >>> drivers/infiniband/hw/qib/qib_init.c | 10 ++++++-- >>> drivers/infiniband/hw/qib/qib_verbs.c | 1 + >>> drivers/infiniband/sw/rdmavt/trace.h | 6 +++-- >>> drivers/infiniband/sw/rdmavt/vt.c | 2 +- >>> drivers/infiniband/sw/rdmavt/vt.h | 10 ++++++-- >>> include/rdma/rdma_vt.h | 38 ++++++-------------------------- >>> 13 files changed, 64 insertions(+), 62 deletions(-) >> >> This is too ugly. Drivers should not be defining their own printing >> macros, or dealing with this name stuff. They shouldn't be randomly >> choosing to print to the PCI struct device vs the ib struct device and >> all sorts of other inconsistent things we have. > >Why is it ugly? This isn't new. There is nothing random about what is >chosen to print. The PCI is used before we have an IB dev. That doesn't >change before/after this patch. > >Are you referring to inconsistent things in hfi1/rdmavt/qib or from an >IB core point of view? > >> If drivers need to print prior to registration, then we need to copy >> what netdev has been doing and provide our own printing functions >> accepting a struct ib_device* that go through various gyrations to >> compute a suitable identification string across the entire life cycle >> - ie netdev will use a PCI BDF and other options if the ethX name is >> not yet assigned. > >Which is what we do here, use the PCI BDF right? Did I miss something? >Or are you saying it should be added to the core rather than the driver. >I'd be OK with that as well. > >> I stopped short of doing this when I injected dev_name, because it >> didn't seem necessary, but if you say hfi has a problem, then this is >> how to fix it. >> >>> --- a/drivers/infiniband/hw/hfi1/init.c >>> +++ b/drivers/infiniband/hw/hfi1/init.c >>> @@ -1313,7 +1313,7 @@ void hfi1_free_devdata(struct hfi1_devdata *dd) >>> "Could not allocate unit ID: error %d\n", -ret); >>> goto bail; >>> } >>> - rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), >dd->unit); >>> + kobject_set_name(&dd->kobj, "%s_%d", class_name(), dd->unit); >> >> Woah, this driver has a kobj and a struct ib_device in the same >> structure? Betcha the refcounting is done wrong :( > >Been like that for quite a while. Mike has looked at this and I trust >him when he says the refcounting is correct. Do you see something wrong >here? If so please point it out. > >> This kobject is completely redundant with the kobject that is already >> in the struct ib_device, you should delete it, not use it for more >> stuff. > >The kobject in the ib_device is just that, for ib_device stuff. The one >here is for hfi1 stuff. Why muddle them together? Is there an advantage >to that? > >>> /* >>> * Initialize all locks for the device. This needs to be as early as >>> diff --git a/drivers/infiniband/hw/hfi1/verbs.c >b/drivers/infiniband/hw/hfi1/verbs.c >>> index ec582d8..6546cba 100644 >>> --- a/drivers/infiniband/hw/hfi1/verbs.c >>> +++ b/drivers/infiniband/hw/hfi1/verbs.c >>> @@ -1680,6 +1680,7 @@ int hfi1_register_ib_device(struct hfi1_devdata >*dd) >>> * Fill in rvt info object. >>> */ >>> dd->verbs_dev.rdi.driver_f.port_callback = hfi1_create_port_files; >>> + dd->verbs_dev.rdi.driver_f.get_device_name = >hfi1_get_device_name; >>> dd->verbs_dev.rdi.driver_f.get_pci_dev = get_pci_dev; >>> dd->verbs_dev.rdi.driver_f.check_ah = hfi1_check_ah; >>> dd->verbs_dev.rdi.driver_f.notify_new_ah = hfi1_notify_new_ah; >>> diff --git a/drivers/infiniband/hw/qib/qib.h >b/drivers/infiniband/hw/qib/qib.h >>> index 83d2349..3681701 100644 >>> --- a/drivers/infiniband/hw/qib/qib.h >>> +++ b/drivers/infiniband/hw/qib/qib.h >>> @@ -1075,6 +1075,8 @@ struct qib_devdata { >>> struct tasklet_struct error_tasklet; >>> >>> int assigned_node_id; /* NUMA node closest to HCA */ >>> + /* device name */ >>> + char *name; >>> }; >> >> Don't store your own names :( > >Can you explain the issue here? > >> >>> -/** >>> - * rvt_set_ibdev_name - Craft an IB device name from client info >>> - * @rdi: pointer to the client rvt_dev_info structure >>> - * @name: client specific name >>> - * @unit: client specific unit number. >>> - */ >>> -static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi, >>> - const char *fmt, const char *name, >>> - const int unit) >>> -{ >>> - /* >>> - * FIXME: rvt and its users want to touch the ibdev before >>> - * registration and have things like the name work. We don't have the >>> - * infrastructure in the core to support this directly today, hack it >>> - * to work by setting the name manually here. >>> - */ >>> - dev_set_name(&rdi->ibdev.dev, fmt, name, unit); >>> - strlcpy(rdi->ibdev.name, dev_name(&rdi->ibdev.dev), >IB_DEVICE_NAME_MAX); >> >> So why wasn't this hack good enough? What is the issue? > >You literally wrote FIXME, so Mike took that to mean he should fix it. I have an alternate patch that uses: 1) a device name private to the rdmavt instance (for rdmavt logging) (adds the name as a char) 2) doesn't use the kobject from HFI 3) doesn't use the char in QIB 4) uses hardwired text rather than a string in the logging function.: i.e : "hfi1_%d", dd->unit. Would it be useful to submit that? M >-Denny > >
On Fri, Feb 08, 2019 at 08:41:28AM -0500, Dennis Dalessandro wrote: > On 1/17/2019 4:04 PM, Jason Gunthorpe wrote: > > On Thu, Jan 17, 2019 at 12:41:12PM -0800, Dennis Dalessandro wrote: > > > From: Michael J. Ruhl <michael.j.ruhl@intel.com> > > > > > > Commit 896de0090a85 ("RDMA/core: Use dev_name instead of > > > ibdev->name"), changes the API for setting the IB device name. > > > > > > HFI, QIB, and RDMAVT make assumptions with regards to this API (i.e. > > > as to when the device name string is available). This assumption is > > > no longer valid. > > > > > > Update the drivers to reflect the commit by storing the specific > > > device name in device local information. Create a calldown for the > > > RVT driver to access the drivers name information. > > > > > > Rework the logging macros to use the best device name. > > > > > > Fixes: 11f0e89710eb ("IB/{hfi1, qib}: Fix a concurrency issue with device name in logging") > > > Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> > > > Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> > > > Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> > > > Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> > > > drivers/infiniband/hw/hfi1/affinity.c | 6 +++-- > > > drivers/infiniband/hw/hfi1/driver.c | 8 +++++++ > > > drivers/infiniband/hw/hfi1/hfi.h | 22 ++++++++----------- > > > drivers/infiniband/hw/hfi1/init.c | 2 +- > > > drivers/infiniband/hw/hfi1/verbs.c | 1 + > > > drivers/infiniband/hw/qib/qib.h | 12 +++++----- > > > drivers/infiniband/hw/qib/qib_driver.c | 8 +++++++ > > > drivers/infiniband/hw/qib/qib_init.c | 10 ++++++-- > > > drivers/infiniband/hw/qib/qib_verbs.c | 1 + > > > drivers/infiniband/sw/rdmavt/trace.h | 6 +++-- > > > drivers/infiniband/sw/rdmavt/vt.c | 2 +- > > > drivers/infiniband/sw/rdmavt/vt.h | 10 ++++++-- > > > include/rdma/rdma_vt.h | 38 ++++++-------------------------- > > > 13 files changed, 64 insertions(+), 62 deletions(-) > > > > This is too ugly. Drivers should not be defining their own printing > > macros, or dealing with this name stuff. They shouldn't be randomly > > choosing to print to the PCI struct device vs the ib struct device and > > all sorts of other inconsistent things we have. > > Why is it ugly? This isn't new. There is nothing random about what is chosen > to print. The PCI is used before we have an IB dev. That doesn't change > before/after this patch. How will this private name get updated when device rename happens? > Are you referring to inconsistent things in hfi1/rdmavt/qib or from an IB > core point of view? Yes, it is inconsistent from an IB core point of view. > > If drivers need to print prior to registration, then we need to copy > > what netdev has been doing and provide our own printing functions > > accepting a struct ib_device* that go through various gyrations to > > compute a suitable identification string across the entire life cycle > > - ie netdev will use a PCI BDF and other options if the ethX name is > > not yet assigned. > > Which is what we do here, use the PCI BDF right? Did I miss something? Or > are you saying it should be added to the core rather than the driver. I'd be > OK with that as well. Yes > > I stopped short of doing this when I injected dev_name, because it > > didn't seem necessary, but if you say hfi has a problem, then this is > > how to fix it. > > > > > +++ b/drivers/infiniband/hw/hfi1/init.c > > > @@ -1313,7 +1313,7 @@ void hfi1_free_devdata(struct hfi1_devdata *dd) > > > "Could not allocate unit ID: error %d\n", -ret); > > > goto bail; > > > } > > > - rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit); > > > + kobject_set_name(&dd->kobj, "%s_%d", class_name(), dd->unit); > > > > Woah, this driver has a kobj and a struct ib_device in the same > > structure? Betcha the refcounting is done wrong :( > > Been like that for quite a while. Mike has looked at this and I trust him > when he says the refcounting is correct. Do you see something wrong here? If > so please point it out. > > > This kobject is completely redundant with the kobject that is already > > in the struct ib_device, you should delete it, not use it for more > > stuff. > > The kobject in the ib_device is just that, for ib_device stuff. The one here > is for hfi1 stuff. Why muddle them together? Is there an advantage to that? The kobject literally *doesn't do anything* On kobject/kref per memory allocation except in bizzaro situations. > > > /* > > > * Initialize all locks for the device. This needs to be as early as > > > diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c > > > index ec582d8..6546cba 100644 > > > +++ b/drivers/infiniband/hw/hfi1/verbs.c > > > @@ -1680,6 +1680,7 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd) > > > * Fill in rvt info object. > > > */ > > > dd->verbs_dev.rdi.driver_f.port_callback = hfi1_create_port_files; > > > + dd->verbs_dev.rdi.driver_f.get_device_name = hfi1_get_device_name; > > > dd->verbs_dev.rdi.driver_f.get_pci_dev = get_pci_dev; > > > dd->verbs_dev.rdi.driver_f.check_ah = hfi1_check_ah; > > > dd->verbs_dev.rdi.driver_f.notify_new_ah = hfi1_notify_new_ah; > > > diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h > > > index 83d2349..3681701 100644 > > > +++ b/drivers/infiniband/hw/qib/qib.h > > > @@ -1075,6 +1075,8 @@ struct qib_devdata { > > > struct tasklet_struct error_tasklet; > > > int assigned_node_id; /* NUMA node closest to HCA */ > > > + /* device name */ > > > + char *name; > > > }; > > > > Don't store your own names :( > > Can you explain the issue here? How will rename work? > > > > > -/** > > > - * rvt_set_ibdev_name - Craft an IB device name from client info > > > - * @rdi: pointer to the client rvt_dev_info structure > > > - * @name: client specific name > > > - * @unit: client specific unit number. > > > - */ > > > -static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi, > > > - const char *fmt, const char *name, > > > - const int unit) > > > -{ > > > - /* > > > - * FIXME: rvt and its users want to touch the ibdev before > > > - * registration and have things like the name work. We don't have the > > > - * infrastructure in the core to support this directly today, hack it > > > - * to work by setting the name manually here. > > > - */ > > > - dev_set_name(&rdi->ibdev.dev, fmt, name, unit); > > > - strlcpy(rdi->ibdev.name, dev_name(&rdi->ibdev.dev), IB_DEVICE_NAME_MAX); > > > > So why wasn't this hack good enough? What is the issue? > > You literally wrote FIXME, so Mike took that to mean he should fix it. Yes, the comment says "we don't have the infrastructure in the core to support this" so the implication is to add more infrastructure in the core code to support an early name assignment, or more infrastructure to avoid this need in the driver (ie logging helpers like netdev). Not more ugly driver hackery. Jason
On Fri, Feb 08, 2019 at 03:17:08PM +0000, Ruhl, Michael J wrote: > I have an alternate patch that uses: > > 1) a device name private to the rdmavt instance (for rdmavt logging) > (adds the name as a char) And how does rename work? > 2) doesn't use the kobject from HFI Delete it :( > 3) doesn't use the char in QIB > 4) uses hardwired text rather than a string in the logging > function.: i.e : "hfi1_%d", dd->unit. Would rather see this loggin as a core function like netdev does. Why don't you fix the core code to allow early name assignment? In fact my series here: https://github.com/jgunthorpe/linux/commits/device_locking_cleanup Does pretty much just that already, it needs two more lines and some EXPORT_SYMBOLs, but otherwise does the job. Jason
>-----Original Message----- >From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma- >owner@vger.kernel.org] On Behalf Of Jason Gunthorpe >Sent: Friday, February 8, 2019 10:57 AM >To: Ruhl, Michael J <michael.j.ruhl@intel.com> >Cc: Dalessandro, Dennis <dennis.dalessandro@intel.com>; >dledford@redhat.com; linux-rdma@vger.kernel.org; Marciniszyn, Mike ><mike.marciniszyn@intel.com> >Subject: Re: [PATCH for-rc 1/7] IB/{hfi1, qib, rdmavt}: Do not depend on IB >Verbs name for driver logging > >On Fri, Feb 08, 2019 at 03:17:08PM +0000, Ruhl, Michael J wrote: >> I have an alternate patch that uses: >> >> 1) a device name private to the rdmavt instance (for rdmavt logging) >> (adds the name as a char) > >And how does rename work? > >> 2) doesn't use the kobject from HFI > >Delete it :( > >> 3) doesn't use the char in QIB > >> 4) uses hardwired text rather than a string in the logging >> function.: i.e : "hfi1_%d", dd->unit. > >Would rather see this loggin as a core function like netdev does. > >Why don't you fix the core code to allow early name assignment? > >In fact my series here: > >https://github.com/jgunthorpe/linux/commits/device_locking_cleanup > >Does pretty much just that already, it needs two more lines and some >EXPORT_SYMBOLs, but otherwise does the job. Ok, I will take a look at this path. Mike >Jason
diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c index 2baf38c..b85a096 100644 --- a/drivers/infiniband/hw/hfi1/affinity.c +++ b/drivers/infiniband/hw/hfi1/affinity.c @@ -425,7 +425,7 @@ static void _dev_comp_vect_mappings_destroy(struct hfi1_devdata *dd) dd->comp_vect_mappings[i] = -1; hfi1_cdbg(AFFINITY, "[%s] Release CPU %d from completion vector %d", - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), cpu, i); + kobject_name(&dd->kobj), cpu, i); } kfree(dd->comp_vect_mappings); @@ -475,7 +475,7 @@ static int _dev_comp_vect_mappings_create(struct hfi1_devdata *dd, dd->comp_vect_mappings[i] = cpu; hfi1_cdbg(AFFINITY, "[%s] Completion Vector %d -> CPU %d", - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), i, cpu); + kobject_name(&dd->kobj), i, cpu); } return 0; @@ -578,7 +578,7 @@ static int _dev_comp_vect_cpu_mask_init(struct hfi1_devdata *dd, hfi1_cdbg(AFFINITY, "[%s] Completion vector affinity CPU set(s) %*pbl", - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), + kobject_name(&dd->kobj), cpumask_pr_args(dev_comp_vect_mask)); return 0; diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c index a8ad707..8aaa4a6 100644 --- a/drivers/infiniband/hw/hfi1/driver.c +++ b/drivers/infiniband/hw/hfi1/driver.c @@ -160,6 +160,14 @@ static int hfi1_caps_get(char *buffer, const struct kernel_param *kp) return scnprintf(buffer, PAGE_SIZE, "0x%lx", cap_mask); } +const char *hfi1_get_device_name(struct rvt_dev_info *rdi) +{ + struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev, rdi); + struct hfi1_devdata *dd = container_of(ibdev, + struct hfi1_devdata, verbs_dev); + return kobject_name(&dd->kobj); +} + struct pci_dev *get_pci_dev(struct rvt_dev_info *rdi) { struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev, rdi); diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h index 6db2276..dc8ba43 100644 --- a/drivers/infiniband/hw/hfi1/hfi.h +++ b/drivers/infiniband/hw/hfi1/hfi.h @@ -2012,6 +2012,7 @@ int get_platform_config_field(struct hfi1_devdata *dd, table_type, int table_index, int field_index, u32 *data, u32 len); +const char *hfi1_get_device_name(struct rvt_dev_info *rdi); struct pci_dev *get_pci_dev(struct rvt_dev_info *rdi); /* @@ -2148,42 +2149,39 @@ static inline u64 hfi1_pkt_base_sdma_integrity(struct hfi1_devdata *dd) #define dd_dev_emerg(dd, fmt, ...) \ dev_emerg(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__) + kobject_name(&(dd)->kobj), ##__VA_ARGS__) #define dd_dev_err(dd, fmt, ...) \ dev_err(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__) + kobject_name(&(dd)->kobj), ##__VA_ARGS__) #define dd_dev_err_ratelimited(dd, fmt, ...) \ dev_err_ratelimited(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), \ - ##__VA_ARGS__) + kobject_name(&(dd)->kobj), ##__VA_ARGS__) #define dd_dev_warn(dd, fmt, ...) \ dev_warn(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__) + kobject_name(&(dd)->kobj), ##__VA_ARGS__) #define dd_dev_warn_ratelimited(dd, fmt, ...) \ dev_warn_ratelimited(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), \ - ##__VA_ARGS__) + kobject_name(&(dd)->kobj), ##__VA_ARGS__) #define dd_dev_info(dd, fmt, ...) \ dev_info(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__) + kobject_name(&(dd)->kobj), ##__VA_ARGS__) #define dd_dev_info_ratelimited(dd, fmt, ...) \ dev_info_ratelimited(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), \ - ##__VA_ARGS__) + kobject_name(&(dd)->kobj), ##__VA_ARGS__) #define dd_dev_dbg(dd, fmt, ...) \ dev_dbg(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__) + kobject_name(&(dd)->kobj), ##__VA_ARGS__) #define hfi1_dev_porterr(dd, port, fmt, ...) \ dev_err(&(dd)->pcidev->dev, "%s: port %u: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), (port), ##__VA_ARGS__) + kobject_name(&(dd)->kobj), (port), ##__VA_ARGS__) /* * this is used for formatting hw error messages... diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c index 0904490..044d9a8 100644 --- a/drivers/infiniband/hw/hfi1/init.c +++ b/drivers/infiniband/hw/hfi1/init.c @@ -1313,7 +1313,7 @@ void hfi1_free_devdata(struct hfi1_devdata *dd) "Could not allocate unit ID: error %d\n", -ret); goto bail; } - rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit); + kobject_set_name(&dd->kobj, "%s_%d", class_name(), dd->unit); /* * Initialize all locks for the device. This needs to be as early as diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c index ec582d8..6546cba 100644 --- a/drivers/infiniband/hw/hfi1/verbs.c +++ b/drivers/infiniband/hw/hfi1/verbs.c @@ -1680,6 +1680,7 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd) * Fill in rvt info object. */ dd->verbs_dev.rdi.driver_f.port_callback = hfi1_create_port_files; + dd->verbs_dev.rdi.driver_f.get_device_name = hfi1_get_device_name; dd->verbs_dev.rdi.driver_f.get_pci_dev = get_pci_dev; dd->verbs_dev.rdi.driver_f.check_ah = hfi1_check_ah; dd->verbs_dev.rdi.driver_f.notify_new_ah = hfi1_notify_new_ah; diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h index 83d2349..3681701 100644 --- a/drivers/infiniband/hw/qib/qib.h +++ b/drivers/infiniband/hw/qib/qib.h @@ -1075,6 +1075,8 @@ struct qib_devdata { struct tasklet_struct error_tasklet; int assigned_node_id; /* NUMA node closest to HCA */ + /* device name */ + char *name; }; /* hol_state values */ @@ -1425,6 +1427,7 @@ int qib_pcie_ddinit(struct qib_devdata *, struct pci_dev *, * dma_addr wrappers - all 0's invalid for hw */ int qib_map_page(struct pci_dev *d, struct page *p, dma_addr_t *daddr); +const char *qib_get_device_name(struct rvt_dev_info *rdi); struct pci_dev *qib_get_pci_dev(struct rvt_dev_info *rdi); /* @@ -1482,17 +1485,14 @@ static inline void qib_flush_wc(void) dev_err(dev, fmt, ##__VA_ARGS__) #define qib_dev_err(dd, fmt, ...) \ - dev_err(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__) + dev_err(&(dd)->pcidev->dev, "%s: " fmt, (dd)->name, ##__VA_ARGS__) #define qib_dev_warn(dd, fmt, ...) \ - dev_warn(&(dd)->pcidev->dev, "%s: " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), ##__VA_ARGS__) + dev_warn(&(dd)->pcidev->dev, "%s: " fmt, (dd)->name, ##__VA_ARGS__) #define qib_dev_porterr(dd, port, fmt, ...) \ dev_err(&(dd)->pcidev->dev, "%s: IB%u:%u " fmt, \ - rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), (dd)->unit, (port), \ - ##__VA_ARGS__) + (dd)->name, (dd)->unit, (port), ##__VA_ARGS__) #define qib_devinfo(pcidev, fmt, ...) \ dev_info(&(pcidev)->dev, fmt, ##__VA_ARGS__) diff --git a/drivers/infiniband/hw/qib/qib_driver.c b/drivers/infiniband/hw/qib/qib_driver.c index 3117cc5..f66d7e3 100644 --- a/drivers/infiniband/hw/qib/qib_driver.c +++ b/drivers/infiniband/hw/qib/qib_driver.c @@ -81,6 +81,14 @@ struct qlogic_ib_stats qib_stats; +const char *qib_get_device_name(struct rvt_dev_info *rdi) +{ + struct qib_ibdev *ibdev = container_of(rdi, struct qib_ibdev, rdi); + struct qib_devdata *dd = container_of(ibdev, + struct qib_devdata, verbs_dev); + return dd->name; +} + struct pci_dev *qib_get_pci_dev(struct rvt_dev_info *rdi) { struct qib_ibdev *ibdev = container_of(rdi, struct qib_ibdev, rdi); diff --git a/drivers/infiniband/hw/qib/qib_init.c b/drivers/infiniband/hw/qib/qib_init.c index 9fd6990..9896eae 100644 --- a/drivers/infiniband/hw/qib/qib_init.c +++ b/drivers/infiniband/hw/qib/qib_init.c @@ -1055,6 +1055,7 @@ void qib_free_devdata(struct qib_devdata *dd) qib_dbg_ibdev_exit(&dd->verbs_dev); #endif free_percpu(dd->int_counter); + kfree(dd->name); rvt_dealloc_device(&dd->verbs_dev.rdi); } @@ -1122,13 +1123,16 @@ struct qib_devdata *qib_alloc_devdata(struct pci_dev *pdev, size_t extra) "Could not allocate unit ID: error %d\n", -ret); goto bail; } - rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s%d", "qib", dd->unit); + dd->name = kasprintf(GFP_KERNEL, "qib%d", dd->unit); + if (!dd->name) { + qib_early_err(&pdev->dev, "Failed to allocate device name\n"); + goto bail; + } dd->int_counter = alloc_percpu(u64); if (!dd->int_counter) { ret = -ENOMEM; - qib_early_err(&pdev->dev, - "Could not allocate per-cpu int_counter\n"); + qib_dev_err(dd, "Could not allocate per-cpu int_counter\n"); goto bail; } diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c index 276304f..b116c6d 100644 --- a/drivers/infiniband/hw/qib/qib_verbs.c +++ b/drivers/infiniband/hw/qib/qib_verbs.c @@ -1568,6 +1568,7 @@ int qib_register_ib_device(struct qib_devdata *dd) * Fill in rvt info object. */ dd->verbs_dev.rdi.driver_f.port_callback = qib_create_port_files; + dd->verbs_dev.rdi.driver_f.get_device_name = qib_get_device_name; dd->verbs_dev.rdi.driver_f.get_pci_dev = qib_get_pci_dev; dd->verbs_dev.rdi.driver_f.check_ah = qib_check_ah; dd->verbs_dev.rdi.driver_f.setup_wqe = qib_check_send_wqe; diff --git a/drivers/infiniband/sw/rdmavt/trace.h b/drivers/infiniband/sw/rdmavt/trace.h index 36ddbd2..74eac77 100644 --- a/drivers/infiniband/sw/rdmavt/trace.h +++ b/drivers/infiniband/sw/rdmavt/trace.h @@ -45,8 +45,10 @@ * */ -#define RDI_DEV_ENTRY(rdi) __string(dev, rvt_get_ibdev_name(rdi)) -#define RDI_DEV_ASSIGN(rdi) __assign_str(dev, rvt_get_ibdev_name(rdi)) +#include "vt.h" + +#define RDI_DEV_ENTRY(rdi) __string(dev, get_device_name(rdi)) +#define RDI_DEV_ASSIGN(rdi) __assign_str(dev, get_device_name(rdi)) #include "trace_rvt.h" #include "trace_qp.h" diff --git a/drivers/infiniband/sw/rdmavt/vt.c b/drivers/infiniband/sw/rdmavt/vt.c index aef3aa3..91b5c53 100644 --- a/drivers/infiniband/sw/rdmavt/vt.c +++ b/drivers/infiniband/sw/rdmavt/vt.c @@ -644,7 +644,7 @@ int rvt_register_device(struct rvt_dev_info *rdi, u32 driver_id) rdi->ibdev.driver_id = driver_id; /* We are now good to announce we exist */ - ret = ib_register_device(&rdi->ibdev, dev_name(&rdi->ibdev.dev), + ret = ib_register_device(&rdi->ibdev, get_device_name(rdi), rdi->driver_f.port_callback); if (ret) { rvt_pr_err(rdi, "Failed to register driver with ib core.\n"); diff --git a/drivers/infiniband/sw/rdmavt/vt.h b/drivers/infiniband/sw/rdmavt/vt.h index 0675ea6..513dd5f 100644 --- a/drivers/infiniband/sw/rdmavt/vt.h +++ b/drivers/infiniband/sw/rdmavt/vt.h @@ -62,19 +62,19 @@ #define rvt_pr_info(rdi, fmt, ...) \ __rvt_pr_info(rdi->driver_f.get_pci_dev(rdi), \ - rvt_get_ibdev_name(rdi), \ + get_device_name(rdi), \ fmt, \ ##__VA_ARGS__) #define rvt_pr_warn(rdi, fmt, ...) \ __rvt_pr_warn(rdi->driver_f.get_pci_dev(rdi), \ - rvt_get_ibdev_name(rdi), \ + get_device_name(rdi), \ fmt, \ ##__VA_ARGS__) #define rvt_pr_err(rdi, fmt, ...) \ __rvt_pr_err(rdi->driver_f.get_pci_dev(rdi), \ - rvt_get_ibdev_name(rdi), \ + get_device_name(rdi), \ fmt, \ ##__VA_ARGS__) @@ -99,4 +99,8 @@ static inline int ibport_num_to_idx(struct ib_device *ibdev, u8 port_num) return port_index; } +static inline const char *get_device_name(struct rvt_dev_info *rdi) +{ + return rdi->driver_f.get_device_name(rdi); +} #endif /* DEF_RDMAVT_H */ diff --git a/include/rdma/rdma_vt.h b/include/rdma/rdma_vt.h index dd0ed80..c038552 100644 --- a/include/rdma/rdma_vt.h +++ b/include/rdma/rdma_vt.h @@ -254,6 +254,13 @@ struct rvt_driver_provided { int (*port_callback)(struct ib_device *, u8, struct kobject *); /* + * Returns a string to represent the device for which is being + * registered. This is primarily used for error and debug messages on + * the console. + */ + const char * (*get_device_name)(struct rvt_dev_info *rdi); + + /* * Returns a pointer to the undelying hardware's PCI device. This is * used to display information as to what hardware is being referenced * in an output message @@ -452,37 +459,6 @@ struct rvt_dev_info { struct rvt_wss *wss; }; -/** - * rvt_set_ibdev_name - Craft an IB device name from client info - * @rdi: pointer to the client rvt_dev_info structure - * @name: client specific name - * @unit: client specific unit number. - */ -static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi, - const char *fmt, const char *name, - const int unit) -{ - /* - * FIXME: rvt and its users want to touch the ibdev before - * registration and have things like the name work. We don't have the - * infrastructure in the core to support this directly today, hack it - * to work by setting the name manually here. - */ - dev_set_name(&rdi->ibdev.dev, fmt, name, unit); - strlcpy(rdi->ibdev.name, dev_name(&rdi->ibdev.dev), IB_DEVICE_NAME_MAX); -} - -/** - * rvt_get_ibdev_name - return the IB name - * @rdi: rdmavt device - * - * Return the registered name of the device. - */ -static inline const char *rvt_get_ibdev_name(const struct rvt_dev_info *rdi) -{ - return dev_name(&rdi->ibdev.dev); -} - static inline struct rvt_pd *ibpd_to_rvtpd(struct ib_pd *ibpd) { return container_of(ibpd, struct rvt_pd, ibpd);