[00/13] Reorganize sysfs file creation for struct ib_devices

Message ID 0-v1-34c90fa45f1c+3c7b0-port_sysfs_jgg@nvidia.com (mailing list archive)

Jason Gunthorpe May 17, 2021, 4:47 p.m. UTC
IB has a complex sysfs with a deep nesting of attributes. Nathan and Kees
recently noticed this was not even slightly sane with how it was handling
attributes and a deeper inspection shows the whole thing is a pretty
"ick" coding style.

Further review shows the ick extends outward from the ib_port sysfs and
basically everything is pretty crazy.

Simplify all of it:

 - Organize the ib_port and gid_attr's kobj's to have clear setup/destroy
   function pairings that work only on their own kobjs.

 - All memory allocated in service of a kobject's attributes is freed as
   part of the kobj release function. Thus all the error handling defers
   the memory frees to a put.

 - Build up lists of groups for every kobject and add the entire group
   list as a one-shot operation as the last thing in the setup function.

 - Remove essentially all the error cleanup. The final kobject_put() will
   always free any memory allocated or do an internal kobject_del() if
   required. The new ordering eliminates all the other cleanup cases.

 - Make all attributes use proper typing for the kobj they are attached
   to. Split device and port hw_stats handling.

 - Create an ib_port_attribute type and change hfi1, qib and the CM code to
   work with attribute lists of ib_port_attribute type instead of building
   their own kobject madness.
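
The cleanup rule in the list above (release frees everything, so every
error path is a single put) can be sketched in userspace C. This is only an
illustration under stated assumptions, not code from the series: toy_kobj,
toy_port, toy_port_setup() and the fail_step parameter are hypothetical
stand-ins for the kernel's kobject machinery, and the group names are
placeholders.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for a kobject: a refcount plus a release callback. */
struct toy_kobj {
	int refcount;
	void (*release)(struct toy_kobj *kobj);
};

/* Hypothetical port object; everything it owns is freed in release. */
struct toy_port {
	struct toy_kobj kobj;	/* kept first so the cast in release is valid */
	int *counters;
	const char **groups;	/* NULL-terminated list of group names */
};

static int toy_ports_released;	/* instrumentation for this example only */

static void toy_kobj_init(struct toy_kobj *kobj,
			  void (*release)(struct toy_kobj *kobj))
{
	kobj->refcount = 1;
	kobj->release = release;
}

static void toy_kobj_put(struct toy_kobj *kobj)
{
	assert(kobj->refcount > 0);
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}

/* The only place memory is freed; safe on a partially set up port. */
static void toy_port_release(struct toy_kobj *kobj)
{
	struct toy_port *port = (struct toy_port *)kobj;

	free(port->groups);	/* free(NULL) is a no-op */
	free(port->counters);
	free(port);
	toy_ports_released++;
}

/* Returns the port, or NULL on failure. fail_step injects a failure. */
static struct toy_port *toy_port_setup(int fail_step)
{
	struct toy_port *port = calloc(1, sizeof(*port));

	if (!port)
		return NULL;
	toy_kobj_init(&port->kobj, toy_port_release);

	port->counters = calloc(4, sizeof(*port->counters));
	if (!port->counters || fail_step == 1)
		goto err_put;

	port->groups = calloc(3, sizeof(*port->groups));
	if (!port->groups || fail_step == 2)
		goto err_put;

	/* Last step: fill in the whole group list in one shot. */
	port->groups[0] = "gids";
	port->groups[1] = "pkeys";
	return port;

err_put:
	/* One error label: the put runs release, which frees what exists. */
	toy_kobj_put(&port->kobj);
	return NULL;
}
```

Because release tolerates NULL members, the setup function needs no
per-allocation unwind labels, which is the property the list above relies on.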

This is sort of RFCy in that the qib and hfi1 stuff is complex enough it needs
Dennis to look at it, and the core stuff has only passed basic testing at this
moment. Nathan confirmed an earlier version solves the CFI warning.
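
As a rough userspace illustration of the "proper typing" point, the sketch
below pairs one attribute type with one kobject type, so the show callback
is always invoked through a pointer of its real type. The cover letter does
not spell out the failing call, so treat this as an assumption about the
general pattern rather than the actual fix; all demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Local equivalent of the kernel's container_of(), for this sketch only. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_attr { const char *name; };
struct demo_kobj { int unused; };

struct demo_port {
	struct demo_kobj kobj;
	unsigned int port_num;
};

/* Attribute type that only ever hangs off a demo_port kobject. */
struct demo_port_attr {
	struct demo_attr attr;
	int (*show)(struct demo_port *port, struct demo_port_attr *attr,
		    char *buf, size_t len);
};

static int num_show(struct demo_port *port, struct demo_port_attr *attr,
		    char *buf, size_t len)
{
	(void)attr;
	return snprintf(buf, len, "%u\n", port->port_num);
}

/*
 * Generic entry point: recover the typed objects from the embedded
 * members, then call show() with exactly the signature it was declared
 * with. No function pointer is cast to a different type along the way.
 */
static int demo_port_attr_show(struct demo_kobj *kobj, struct demo_attr *attr,
			       char *buf, size_t len)
{
	struct demo_port *port =
		demo_container_of(kobj, struct demo_port, kobj);
	struct demo_port_attr *pattr =
		demo_container_of(attr, struct demo_port_attr, attr);

	if (!pattr->show)
		return -1;
	return pattr->show(port, pattr, buf, len);
}
```

The design choice is that a dispatcher for one kobject type never sees an
attribute of another type, so the container_of() conversions are always
valid.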

Jason Gunthorpe (13):
  RDMA: Split the alloc_hw_stats() ops to port and device variants
  RDMA/core: Replace the ib_port_data hw_stats pointers with a ib_port
    pointer
  RDMA/core: Split port and device counter sysfs attributes
  RDMA/core: Split gid_attrs related sysfs from add_port()
  RDMA/core: Simplify how the gid_attrs sysfs is created
  RDMA/core: Simplify how the port sysfs is created
  RDMA/core: Create the device hw_counters through the normal groups
    mechanism
  RDMA/core: Remove the kobject_uevent() NOP
  RDMA/core: Expose the ib port sysfs attribute machinery
  RDMA/cm: Use an attribute_group on the ib_port_attribute intead of
    kobj's
  RDMA/qib: Use attributes for the port sysfs
  RDMA/hfi1: Use attributes for the port sysfs
  RDMA: Change ops->init_port to ops->get_port_groups

 drivers/infiniband/core/cm.c                |  227 ++--
 drivers/infiniband/core/core_priv.h         |   13 +-
 drivers/infiniband/core/counters.c          |    4 +-
 drivers/infiniband/core/device.c            |   18 +-
 drivers/infiniband/core/nldev.c             |   10 +-
 drivers/infiniband/core/sysfs.c             | 1100 +++++++++----------
 drivers/infiniband/hw/bnxt_re/hw_counters.c |    7 +-
 drivers/infiniband/hw/bnxt_re/hw_counters.h |    4 +-
 drivers/infiniband/hw/bnxt_re/main.c        |    2 +-
 drivers/infiniband/hw/cxgb4/provider.c      |    9 +-
 drivers/infiniband/hw/efa/efa.h             |    3 +-
 drivers/infiniband/hw/efa/efa_main.c        |    3 +-
 drivers/infiniband/hw/efa/efa_verbs.c       |   11 +-
 drivers/infiniband/hw/hfi1/hfi.h            |    8 +-
 drivers/infiniband/hw/hfi1/sysfs.c          |  531 ++++-----
 drivers/infiniband/hw/hfi1/verbs.c          |   88 +-
 drivers/infiniband/hw/i40iw/i40iw_verbs.c   |   19 +-
 drivers/infiniband/hw/mlx4/main.c           |   25 +-
 drivers/infiniband/hw/mlx5/counters.c       |   42 +-
 drivers/infiniband/hw/qib/qib.h             |   10 +-
 drivers/infiniband/hw/qib/qib_sysfs.c       |  608 +++++-----
 drivers/infiniband/hw/qib/qib_verbs.c       |    4 +-
 drivers/infiniband/sw/rdmavt/vt.c           |    2 +-
 drivers/infiniband/sw/rxe/rxe_hw_counters.c |    7 +-
 drivers/infiniband/sw/rxe/rxe_hw_counters.h |    4 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c       |    2 +-
 include/rdma/ib_sysfs.h                     |   37 +
 include/rdma/ib_verbs.h                     |   34 +-
 28 files changed, 1307 insertions(+), 1525 deletions(-)
 create mode 100644 include/rdma/ib_sysfs.h

Comments

Nathan Chancellor May 18, 2021, 11:07 p.m. UTC | #1
Hi Jason,

On 5/17/2021 9:47 AM, Jason Gunthorpe wrote:
> IB has a complex sysfs with a deep nesting of attributes. Nathan and Kees
> recently noticed this was not even slightly sane with how it was handling
> attributes and a deeper inspection shows the whole thing is a pretty
> "ick" coding style.
> 
> Further review shows the ick extends outward from the ib_port sysfs and
> basically everything is pretty crazy.
> 
> Simplify all of it:
> 
>   - Organize the ib_port and gid_attr's kobj's to have clear setup/destroy
>     function pairings that work only on their own kobjs.
> 
>   - All memory allocated in service of a kobject's attributes is freed as
>     part of the kobj release function. Thus all the error handling defers
>     the memory frees to a put.
> 
>   - Build up lists of groups for every kobject and add the entire group
>     list as a one-shot operation as the last thing in setup function.
> 
>   - Remove essentially all the error cleanup. The final kobject_put() will
>     always free any memory allocated or do an internal kobject_del() if
>     required. The new ordering eliminates all the other cleanup cases.
> 
>   - Make all attributes use proper typing for the kobj they are attached
>     to. Split device and port hw_stats handling.
> 
>   - Create a ib_port_attribute type and change hfi1, qib and the CM code to
>     work with attribute lists of ib_port_attribute type instead of building
>     their own kobject madness
> 
> This is sort of RFCy in that I qib and hfi1 stuff is complex enough it needs
> Dennis to look at it, and the core stuff has only passed basic testing at this
> moment. Nathan confirmed an earlier version solves the CFI warning.

This series still passes my basic testing of LTP's read_all test case on 
/sys with CFI in enforcing mode. If there is any more in-depth testing, 
I can put it through, let me know. I'll continue testing the series and 
when it is in a mergeable state, I can provide you with a Tested-by tag.

Cheers,
Nathan

Jason Gunthorpe May 19, 2021, 1:46 p.m. UTC | #2
On Tue, May 18, 2021 at 04:07:49PM -0700, Nathan Chancellor wrote:
> Hi Jason,
> 
> On 5/17/2021 9:47 AM, Jason Gunthorpe wrote:
> > IB has a complex sysfs with a deep nesting of attributes. Nathan and Kees
> > recently noticed this was not even slightly sane with how it was handling
> > attributes and a deeper inspection shows the whole thing is a pretty
> > "ick" coding style.
> > 
> > Further review shows the ick extends outward from the ib_port sysfs and
> > basically everything is pretty crazy.
> > 
> > Simplify all of it:
> > 
> >   - Organize the ib_port and gid_attr's kobj's to have clear setup/destroy
> >     function pairings that work only on their own kobjs.
> > 
> >   - All memory allocated in service of a kobject's attributes is freed as
> >     part of the kobj release function. Thus all the error handling defers
> >     the memory frees to a put.
> > 
> >   - Build up lists of groups for every kobject and add the entire group
> >     list as a one-shot operation as the last thing in setup function.
> > 
> >   - Remove essentially all the error cleanup. The final kobject_put() will
> >     always free any memory allocated or do an internal kobject_del() if
> >     required. The new ordering eliminates all the other cleanup cases.
> > 
> >   - Make all attributes use proper typing for the kobj they are attached
> >     to. Split device and port hw_stats handling.
> > 
> >   - Create a ib_port_attribute type and change hfi1, qib and the CM code to
> >     work with attribute lists of ib_port_attribute type instead of building
> >     their own kobject madness
> > 
> > This is sort of RFCy in that I qib and hfi1 stuff is complex enough it needs
> > Dennis to look at it, and the core stuff has only passed basic testing at this
> > moment. Nathan confirmed an earlier version solves the CFI warning.
> 
> This series still passes my basic testing of LTP's read_all test case on
> /sys with CFI in enforcing mode. If there is any more in-depth testing, I
> can put it through, let me know. I'll continue testing the series and when
> it is in a mergeable state, I can provide you with a Tested-by tag.

Thanks. I think you can probably ignore the following versions;
confirmation that the approach and root cause are correct is much
appreciated.

Jason