mbox series

[RDMA,for-next,V3,0/6] Add MAD stack trace points

Message ID 1545875731-22189-1-git-send-email-ira.weiny@intel.com (mailing list archive)
Headers show
Series Add MAD stack trace points | expand

Message

Ira Weiny Dec. 27, 2018, 1:55 a.m. UTC
From: Ira Weiny <ira.weiny@intel.com>

A while ago I wrote these patches for MAD stack tracing.  At the time I was
proposing to remove the snoop interface.[1]

For this submission I'd like to propose adding the tracing and leave the snoop
interface in.  While I still don't see a need for the snoop interface, I'm no
longer advocating getting rid of it as the functionality of these patches is
different.

In addition I wrote a sample eBPF which shows how one can further filter at the
tracepoints to distill the information being traced.

Changes for V3:
	Rebased on current RDMA for next
	Change license text to SPDX tag
	Change dev_name to string from array
	Reorder fields for more compact ring buffer utilization
	Use a defined roce address type for safer memcpy/memset
	Adjust BPF to new umad format

Changes for v2:
        Update MAINTAINERS as indicated from Doug
	Now CC'ing the Tracing maintainers so they are aware of the additions

[1] https://www.spinics.net/lists/linux-rdma/msg29109.html


Ira Weiny (6):
  IB/MAD: Add send path trace points
  IB/MAD: Add recv path trace point
  IB/MAD: Add agent trace points
  IB/UMAD: Add umad trace points
  IB/MAD: Add SMP details to MAD tracing
  BPF: Add sample code for new ib_umad tracepoint

 MAINTAINERS                        |   4 +
 drivers/infiniband/core/mad.c      |  95 +++++++++-
 drivers/infiniband/core/user_mad.c |   7 +
 include/trace/events/ib_mad.h      | 362 +++++++++++++++++++++++++++++++++++++
 include/trace/events/ib_umad.h     | 112 ++++++++++++
 samples/bpf/Makefile               |   3 +
 samples/bpf/ibumad_kern.c          | 123 +++++++++++++
 samples/bpf/ibumad_user.c          | 120 ++++++++++++
 8 files changed, 825 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/ib_mad.h
 create mode 100644 include/trace/events/ib_umad.h
 create mode 100644 samples/bpf/ibumad_kern.c
 create mode 100644 samples/bpf/ibumad_user.c

Comments

Leon Romanovsky Dec. 27, 2018, 5:43 a.m. UTC | #1
On Wed, Dec 26, 2018 at 08:55:25PM -0500, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
>
> A while ago I wrote these patches for MAD stack tracing.  At the time I was
> proposing to remove the snoop interface.[1]
>
> For this submission I'd like to propose adding the tracing and leave the snoop
> interface in.  While I still don't see a need for the snoop interface, I'm no
> longer advocating getting rid of it as the functionality of these patches is
> different.
>
> In addition I wrote a sample eBPF which shows how one can further filter at the
> tracepoints to distill the information being traced.

I would like to hear definitive answer from RDMA maintainers - yes/no.
Will those tracepoints be declared as user space ABI?

Thanks
Ira Weiny Dec. 27, 2018, 5:45 p.m. UTC | #2
> >
> > In addition I wrote a sample eBPF which shows how one can further
> > filter at the tracepoints to distill the information being traced.
> 
> I would like to hear definitive answer from RDMA maintainers - yes/no.
> Will those tracepoints be declared as user space ABI?
> 

To be clear I am not proposing any ABI here.  And, I _think_, that is pretty well documented in the eBPF tracepoint documentation.  The patch I'm submitting is a _sample_ of how one can (with a particular version of the kernel) alter the data reported from, or change the logic around, the tracepoints.

Ira
Leon Romanovsky Dec. 27, 2018, 8:21 p.m. UTC | #3
On Thu, Dec 27, 2018 at 05:45:15PM +0000, Weiny, Ira wrote:
> > >
> > > In addition I wrote a sample eBPF which shows how one can further
> > > filter at the tracepoints to distill the information being traced.
> >
> > I would like to hear definitive answer from RDMA maintainers - yes/no.
> > Will those tracepoints be declared as user space ABI?
> >
>
> To be clear I am not proposing any ABI here.  And, I _think_, that is pretty well documented in the eBPF tracepoint documentation.  The patch I'm submitting is a _sample_ of how one can (with a particular version of the kernel) alter the data reported from, or change the logic around, the tracepoints.

Ira,

I'm totally with you and would like to see more tracepoints used in our
subsystem, but latest shit storm related to changes hfi1 debugfs showed
that promises, documentation and agreement to change debugfs in any given
time are not worth a dime, once anyone from the crowd throws an "ABI" word.

Thanks

>
> Ira
>
Ira Weiny Dec. 27, 2018, 9:05 p.m. UTC | #4
> > > >
> > > > In addition I wrote a sample eBPF which shows how one can further
> > > > filter at the tracepoints to distill the information being traced.
> > >
> > > I would like to hear definitive answer from RDMA maintainers - yes/no.
> > > Will those tracepoints be declared as user space ABI?
> > >
> >
> > To be clear I am not proposing any ABI here.  And, I _think_, that is pretty
> well documented in the eBPF tracepoint documentation.  The patch I'm
> submitting is a _sample_ of how one can (with a particular version of the
> kernel) alter the data reported from, or change the logic around, the
> tracepoints.
> 
> Ira,
> 
> I'm totally with you and would like to see more tracepoints used in our
> subsystem, but latest shit storm related to changes hfi1 debugfs showed that
> promises, documentation and agreement to change debugfs in any given
> time are not worth a dime, once anyone from the crowd throws an "ABI"
> word.

+ Denny

I think this is a bit unfair.  The HFI debugfs worked differently from the other devices and caused some confusion about what debugfs names would be presented and if there was a fundamental breakage of debugfs.  That is hardly a "shit storm".  I agree that having the name be consistent with names used elsewhere in the RDMA namespace (and/or some mapping to easily resolve the names) should be a goal going forward.

Regardless, and more importantly, this is a totally different thing.  eBPF sample _code_ is a long way from a tool which is dependent on this "ABI".

Ira
Jason Gunthorpe Jan. 2, 2019, 5:54 p.m. UTC | #5
On Thu, Dec 27, 2018 at 09:05:50PM +0000, Weiny, Ira wrote:

> Regardless, and more importantly, this is a totally different thing.
> eBPF sample _code_ is a long way from a tool which is dependent on
> this "ABI".

There has been some general debate in the kernel community if
tracepoints are ABI or not ABI (like debugfs) and I don't know where
it ended up..

Core tracepoints are probably more likely to be ABI ones, I guess..

Jason
Leon Romanovsky Jan. 2, 2019, 6:32 p.m. UTC | #6
On Wed, Jan 02, 2019 at 10:54:15AM -0700, Jason Gunthorpe wrote:
> On Thu, Dec 27, 2018 at 09:05:50PM +0000, Weiny, Ira wrote:
>
> > Regardless, and more importantly, this is a totally different thing.
> > eBPF sample _code_ is a long way from a tool which is dependent on
> > this "ABI".
>
> There has been some general debate in the kernel community if
> tracepoints are ABI or not ABI (like debugfs) and I don't know where
> it ended up..
>
> Core tracepoints are probably more likely to be ABI ones, I guess..

Tracepoints are valuable debug tool and declaring them as ABI will
create strange situation where refactoring/improvements can be blocked
simply because of those debug aids.

ABI thing will effectively kill all additions of tracepoints to RDMA/core.

Thanks

>
> Jason
Ira Weiny Jan. 3, 2019, 3:36 p.m. UTC | #7
> 
> On Wed, Jan 02, 2019 at 10:54:15AM -0700, Jason Gunthorpe wrote:
> > On Thu, Dec 27, 2018 at 09:05:50PM +0000, Weiny, Ira wrote:
> >
> > > Regardless, and more importantly, this is a totally different thing.
> > > eBPF sample _code_ is a long way from a tool which is dependent on
> > > this "ABI".
> >
> > There has been some general debate in the kernel community if
> > tracepoints are ABI or not ABI (like debugfs) and I don't know where
> > it ended up..
> >
> > Core tracepoints are probably more likely to be ABI ones, I guess..

I agree they will be "more likely" to be ABI but as we are starting out here we need to leave some flexibility...

> 
> Tracepoints are valuable debug tool and declaring them as ABI will create
> strange situation where refactoring/improvements can be blocked simply
> because of those debug aids.
> 
> ABI thing will effectively kill all additions of tracepoints to RDMA/core.

I agree.  Right now these tracepoints were good enough for the OPA debugging I have done over the last few years.  But they have not been generally used because building a kernel with them was too much trouble for many users.  It will not be until we get some feed back and usage when we will know if the amount of information in the trace is too much or too little.

Ira

> 
> Thanks
> 
> >
> > Jason
Jason Gunthorpe Jan. 18, 2019, 9:45 p.m. UTC | #8
On Wed, Dec 26, 2018 at 08:55:25PM -0500, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> A while ago I wrote these patches for MAD stack tracing.  At the time I was
> proposing to remove the snoop interface.[1]
> 
> For this submission I'd like to propose adding the tracing and leave the snoop
> interface in.  While I still don't see a need for the snoop interface, I'm no
> longer advocating getting rid of it as the functionality of these patches is
> different.
> 
> In addition I wrote a sample eBPF which shows how one can further filter at the
> tracepoints to distill the information being traced.
> 
> Changes for V3:
> 	Rebased on current RDMA for next
> 	Change license text to SPDX tag
> 	Change dev_name to string from array
> 	Reorder fields for more compact ring buffer utilization
> 	Use a defined roce address type for safer memcpy/memset
> 	Adjust BPF to new umad format
> 
> Changes for v2:
>         Update MAINTAINERS as indicated from Doug
> 	Now CC'ing the Tracing maintainers so they are aware of the additions
> 
> [1] https://www.spinics.net/lists/linux-rdma/msg29109.html
> 
> 
> Ira Weiny (6):
>   IB/MAD: Add send path trace points
>   IB/MAD: Add recv path trace point
>   IB/MAD: Add agent trace points
>   IB/UMAD: Add umad trace points
>   IB/MAD: Add SMP details to MAD tracing
>   BPF: Add sample code for new ib_umad tracepoint
> 
>  MAINTAINERS                        |   4 +
>  drivers/infiniband/core/mad.c      |  95 +++++++++-
>  drivers/infiniband/core/user_mad.c |   7 +
>  include/trace/events/ib_mad.h      | 362 +++++++++++++++++++++++++++++++++++++
>  include/trace/events/ib_umad.h     | 112 ++++++++++++
>  samples/bpf/Makefile               |   3 +
>  samples/bpf/ibumad_kern.c          | 123 +++++++++++++
>  samples/bpf/ibumad_user.c          | 120 ++++++++++++

Can someone from the BPF side Ack that this is OK?

https://patchwork.kernel.org/project/linux-rdma/list/?series=60629

Thanks,
Jason
Alexei Starovoitov Jan. 23, 2019, 11:34 p.m. UTC | #9
On Fri, Jan 18, 2019 at 02:45:44PM -0700, Jason Gunthorpe wrote:
> On Wed, Dec 26, 2018 at 08:55:25PM -0500, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > A while ago I wrote these patches for MAD stack tracing.  At the time I was
> > proposing to remove the snoop interface.[1]
> > 
> > For this submission I'd like to propose adding the tracing and leave the snoop
> > interface in.  While I still don't see a need for the snoop interface, I'm no
> > longer advocating getting rid of it as the functionality of these patches is
> > different.
> > 
> > In addition I wrote a sample eBPF which shows how one can further filter at the
> > tracepoints to distill the information being traced.
> > 
> > Changes for V3:
> > 	Rebased on current RDMA for next
> > 	Change license text to SPDX tag
> > 	Change dev_name to string from array
> > 	Reorder fields for more compact ring buffer utilization
> > 	Use a defined roce address type for safer memcpy/memset
> > 	Adjust BPF to new umad format
> > 
> > Changes for v2:
> >         Update MAINTAINERS as indicated from Doug
> > 	Now CC'ing the Tracing maintainers so they are aware of the additions
> > 
> > [1] https://www.spinics.net/lists/linux-rdma/msg29109.html
> > 
> > 
> > Ira Weiny (6):
> >   IB/MAD: Add send path trace points
> >   IB/MAD: Add recv path trace point
> >   IB/MAD: Add agent trace points
> >   IB/UMAD: Add umad trace points
> >   IB/MAD: Add SMP details to MAD tracing
> >   BPF: Add sample code for new ib_umad tracepoint
> > 
> >  MAINTAINERS                        |   4 +
> >  drivers/infiniband/core/mad.c      |  95 +++++++++-
> >  drivers/infiniband/core/user_mad.c |   7 +
> >  include/trace/events/ib_mad.h      | 362 +++++++++++++++++++++++++++++++++++++
> >  include/trace/events/ib_umad.h     | 112 ++++++++++++
> >  samples/bpf/Makefile               |   3 +
> >  samples/bpf/ibumad_kern.c          | 123 +++++++++++++
> >  samples/bpf/ibumad_user.c          | 120 ++++++++++++
> 
> Can someone from the BPF side Ack that this is OK?
> 
> https://patchwork.kernel.org/project/linux-rdma/list/?series=60629

Looks ok-ish. TP_STRUCT__entry() is huge, but as long as TP_ARGS has
only two arguments the build will not complain.