Message ID | 1545074124-18185-1-git-send-email-ira.weiny@intel.com (mailing list archive) |
---|---|
Headers | show |
Series | Add MAD stack trace points | expand |
On Mon, 2018-12-17 at 14:15 -0500, ira.weiny@intel.com wrote: > From: Ira Weiny <ira.weiny@intel.com> > > A while ago I wrote these patches for MAD stack tracing. At the time I was > proposing to remove the snoop interface.[1] > > For this submission I'd like to propose adding the tracing and leave the snoop > interface in. While I still don't see a need for the snoop interface, I'm no > longer advocating getting rid of it as the functionality of these patches is > different. > > In addition I wrote a sample eBPF which shows how one can further filter at the > tracepoints to distill the information being traced. I don't know if this > should be submitted through another tree or if it is ok to take though > linux-rdma? I think it needs to go through the tracing subsystem tree, maintained by Ingo Molnar and Steven Rostedt. In addition, you need a MAINTAINERS file update to go along with this. The include/trace/events/ib*mad.h files should be under the INFINIBAND subsystem to patches to them get Cc:ed here. Same for the samples/bpf/ibumad_*.c files. > Ira > > [1] https://www.spinics.net/lists/linux-rdma/msg29109.html > > > Ira Weiny (6): > IB/MAD: Add send path trace points > IB/MAD: Add recv path trace point > IB/MAD: Add agent trace points > IB/UMAD: Add umad trace points > IB/MAD: Add SMP details to MAD tracing > BPF: Add sample code for new ib_umad tracepoint > > drivers/infiniband/core/mad.c | 95 ++++++++- > drivers/infiniband/core/user_mad.c | 7 + > include/trace/events/ib_mad.h | 391 +++++++++++++++++++++++++++++++++++++ > include/trace/events/ib_umad.h | 140 +++++++++++++ > samples/bpf/Makefile | 3 + > samples/bpf/ibumad_kern.c | 123 ++++++++++++ > samples/bpf/ibumad_user.c | 120 ++++++++++++ > 7 files changed, 878 insertions(+), 1 deletion(-) > create mode 100644 include/trace/events/ib_mad.h > create mode 100644 include/trace/events/ib_umad.h > create mode 100644 samples/bpf/ibumad_kern.c > create mode 100644 samples/bpf/ibumad_user.c >
> > > > In addition I wrote a sample eBPF which shows how one can further > > filter at the tracepoints to distill the information being traced. I > > don't know if this should be submitted through another tree or if it > > is ok to take though linux-rdma? > > I think it needs to go through the tracing subsystem tree, maintained by Ingo > Molnar and Steven Rostedt. In addition, you need a MAINTAINERS file > update to go along with this. The include/trace/events/ib*mad.h files should > be under the INFINIBAND subsystem to patches to them get Cc:ed here. > Same for the samples/bpf/ibumad_*.c files. ^^^^^ I don't understand. The include/trace/events/ib*mad.h changes are the _only_ ones to the tracing subsystem. You're saying they should go under the IB subsystem. Which I where I posted them... So what should go through the tracing subsystem? Regarding the samples/bpf/ibumad_*.c files, you said they should be the "same"? So the only change left is the samples/bpf/Makefile and the changes to the IB MAD files in the IB system... Finally, do you mean the MAINTAINERS should be updated in the tracing subsystem or the IB subsystem? Ira > > > Ira > > > > [1] https://www.spinics.net/lists/linux-rdma/msg29109.html > > > > > > Ira Weiny (6): > > IB/MAD: Add send path trace points > > IB/MAD: Add recv path trace point > > IB/MAD: Add agent trace points > > IB/UMAD: Add umad trace points > > IB/MAD: Add SMP details to MAD tracing > > BPF: Add sample code for new ib_umad tracepoint > > > > drivers/infiniband/core/mad.c | 95 ++++++++- > > drivers/infiniband/core/user_mad.c | 7 + > > include/trace/events/ib_mad.h | 391 > +++++++++++++++++++++++++++++++++++++ > > include/trace/events/ib_umad.h | 140 +++++++++++++ > > samples/bpf/Makefile | 3 + > > samples/bpf/ibumad_kern.c | 123 ++++++++++++ > > samples/bpf/ibumad_user.c | 120 ++++++++++++ > > 7 files changed, 878 insertions(+), 1 deletion(-) create mode 100644 > > include/trace/events/ib_mad.h create mode 100644 > > include/trace/events/ib_umad.h create mode 100644 > > samples/bpf/ibumad_kern.c create mode 100644 > > samples/bpf/ibumad_user.c > > > > -- > Doug Ledford <dledford@redhat.com> > GPG KeyID: B826A3330E572FDD > Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
On Wed, 2018-12-19 at 18:24 +0000, Weiny, Ira wrote: > > > In addition I wrote a sample eBPF which shows how one can further > > > filter at the tracepoints to distill the information being traced. I > > > don't know if this should be submitted through another tree or if it > > > is ok to take though linux-rdma? > > > > I think it needs to go through the tracing subsystem tree, maintained by Ingo > > Molnar and Steven Rostedt. In addition, you need a MAINTAINERS file > > update to go along with this. The include/trace/events/ib*mad.h files should > > be under the INFINIBAND subsystem to patches to them get Cc:ed here. > > Same for the samples/bpf/ibumad_*.c files. > ^^^^^ > > I don't understand. > > The include/trace/events/ib*mad.h changes are the _only_ ones to the tracing subsystem. You're saying they should go under the IB subsystem. Which I where I posted them... No, I'm saying they should go through the tracing subsystem so they get that level of review and exposure, but that the INFINIBAND substem MAINTAINERS entry should get a new line indicating that our subsystem should be included on changes to those files. It doesn't mean that the files won't also flag as tracing files when someone runs the query on the MAINTAINERS file, it means both the tracing maintainers and linux- rdma will get Cc:ed on changes to these files. > > So what should go through the tracing subsystem? Send the whole series through there, just add the changes to the MAINTAINERS file I listed. > Regarding the samples/bpf/ibumad_*.c files, you said they should be the "same"? So the only change left is the samples/bpf/Makefile and the changes to the IB MAD files in the IB system... Yes, tag them as files under the INFINIBAND section of the MAINTAINERS file. Again, linux-rdma will get Cc:ed on patches to these files, but so will the main BPF maintianers. > Finally, do you mean the MAINTAINERS should be updated in the tracing subsystem or the IB subsystem? I mean what I said above: Add lines to the INFINIBAND subsystem section of the maintainers file so that we get Cc:ed on changes to these tracing system files. If you grep include/trace on the MAINTAINERS file, you will see this is a very common construct for the tracing subsystem. > Ira > > > > Ira > > > > > > [1] https://www.spinics.net/lists/linux-rdma/msg29109.html > > > > > > > > > Ira Weiny (6): > > > IB/MAD: Add send path trace points > > > IB/MAD: Add recv path trace point > > > IB/MAD: Add agent trace points > > > IB/UMAD: Add umad trace points > > > IB/MAD: Add SMP details to MAD tracing > > > BPF: Add sample code for new ib_umad tracepoint > > > > > > drivers/infiniband/core/mad.c | 95 ++++++++- > > > drivers/infiniband/core/user_mad.c | 7 + > > > include/trace/events/ib_mad.h | 391 > > +++++++++++++++++++++++++++++++++++++ > > > include/trace/events/ib_umad.h | 140 +++++++++++++ > > > samples/bpf/Makefile | 3 + > > > samples/bpf/ibumad_kern.c | 123 ++++++++++++ > > > samples/bpf/ibumad_user.c | 120 ++++++++++++ > > > 7 files changed, 878 insertions(+), 1 deletion(-) create mode 100644 > > > include/trace/events/ib_mad.h create mode 100644 > > > include/trace/events/ib_umad.h create mode 100644 > > > samples/bpf/ibumad_kern.c create mode 100644 > > > samples/bpf/ibumad_user.c > > > > > > > -- > > Doug Ledford <dledford@redhat.com> > > GPG KeyID: B826A3330E572FDD > > Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
On Wed, Dec 19, 2018 at 01:39:12PM -0500, Doug Ledford wrote: > On Wed, 2018-12-19 at 18:24 +0000, Weiny, Ira wrote: > > > > In addition I wrote a sample eBPF which shows how one can further > > > > filter at the tracepoints to distill the information being traced. I > > > > don't know if this should be submitted through another tree or if it > > > > is ok to take though linux-rdma? > > > > > > I think it needs to go through the tracing subsystem tree, maintained by Ingo > > > Molnar and Steven Rostedt. In addition, you need a MAINTAINERS file > > > update to go along with this. The include/trace/events/ib*mad.h files should > > > be under the INFINIBAND subsystem to patches to them get Cc:ed here. > > > Same for the samples/bpf/ibumad_*.c files. > > ^^^^^ > > > > I don't understand. > > > > The include/trace/events/ib*mad.h changes are the _only_ ones to the tracing subsystem. You're saying they should go under the IB subsystem. Which I where I posted them... > > No, I'm saying they should go through the tracing subsystem so they get > that level of review and exposure, but that the INFINIBAND substem > MAINTAINERS entry should get a new line indicating that our subsystem > should be included on changes to those files. It doesn't mean that the > files won't also flag as tracing files when someone runs the query on > the MAINTAINERS file, it means both the tracing maintainers and linux- > rdma will get Cc:ed on changes to these files. > > > So what should go through the tracing subsystem? > > Send the whole series through there, just add the changes to the > MAINTAINERS file I listed. It looks like the convention is to have a MAINTAINERS entry for the owning subsystem with a F: include/trace/events/.. Type thing in it. I think we should merge the patches through RDMA, but with the Ack of tracing folks.. > > > > drivers/infiniband/core/mad.c | 95 ++++++++- > > > > drivers/infiniband/core/user_mad.c | 7 + Otherwise these will might make merge conflicts.. Jason
On Wed, 2018-12-19 at 11:48 -0700, Jason Gunthorpe wrote: > I think we should merge the patches through RDMA, but with the Ack of > tracing folks.. I'm fine with that too. Just need their ack and the MAINTAINERS changes.
> > On Wed, 2018-12-19 at 11:48 -0700, Jason Gunthorpe wrote: > > I think we should merge the patches through RDMA, but with the Ack of > > tracing folks.. > > I'm fine with that too. Just need their ack and the MAINTAINERS changes. > Ok I'll update the MAINTAINERS and resend with the appropriate CC. Ira
Hi Ira, On 12/17/2018 2:15 PM, ira.weiny@intel.com wrote: > From: Ira Weiny <ira.weiny@intel.com> > > A while ago I wrote these patches for MAD stack tracing. At the time I was > proposing to remove the snoop interface.[1] > > For this submission I'd like to propose adding the tracing and leave the snoop > interface in. While I still don't see a need for the snoop interface, I'm no > longer advocating getting rid of it as the functionality of these patches is > different. I have a few questions on this patch series: Should the agent, umad, and SMP trace points also be checked for being enabled as the send and receive ones are ? What is the overhead of the trace enabled check when the trace points are off ? Are there MAD performance numbers with tracing off before and after these patches ? Does this work with RMPP ? Was this tested with RMPP responses such as SA GetTableResp ? Should GSI MAD trace points include the GRH if it's present ? Thanks. -- Hal > In addition I wrote a sample eBPF which shows how one can further filter at the > tracepoints to distill the information being traced. I don't know if this > should be submitted through another tree or if it is ok to take though > linux-rdma? > > Ira > > [1] https://www.spinics.net/lists/linux-rdma/msg29109.html > > > Ira Weiny (6): > IB/MAD: Add send path trace points > IB/MAD: Add recv path trace point > IB/MAD: Add agent trace points > IB/UMAD: Add umad trace points > IB/MAD: Add SMP details to MAD tracing > BPF: Add sample code for new ib_umad tracepoint > > drivers/infiniband/core/mad.c | 95 ++++++++- > drivers/infiniband/core/user_mad.c | 7 + > include/trace/events/ib_mad.h | 391 +++++++++++++++++++++++++++++++++++++ > include/trace/events/ib_umad.h | 140 +++++++++++++ > samples/bpf/Makefile | 3 + > samples/bpf/ibumad_kern.c | 123 ++++++++++++ > samples/bpf/ibumad_user.c | 120 ++++++++++++ > 7 files changed, 878 insertions(+), 1 deletion(-) > create mode 100644 include/trace/events/ib_mad.h > create mode 100644 include/trace/events/ib_umad.h > create mode 100644 samples/bpf/ibumad_kern.c > create mode 100644 samples/bpf/ibumad_user.c >
On Wed, Dec 19, 2018 at 09:59:56PM -0500, Hal Rosenstock wrote: > Hi Ira, > > On 12/17/2018 2:15 PM, ira.weiny@intel.com wrote: > > From: Ira Weiny <ira.weiny@intel.com> > > > > A while ago I wrote these patches for MAD stack tracing. At the time I was > > proposing to remove the snoop interface.[1] > > > > For this submission I'd like to propose adding the tracing and leave the snoop > > interface in. While I still don't see a need for the snoop interface, I'm no > > longer advocating getting rid of it as the functionality of these patches is > > different. > > I have a few questions on this patch series: > > Should the agent, umad, and SMP trace points also be checked for being > enabled as the send and receive ones are ? The enable is only used in the send/recv because they need to call trace_create_mad_addr to convert the address based on AH_TYPE. The other trace points can use the "built in" trace enable/disable. For example the umad trace points show up with the following enable files. /sys/kernel/debug/tracing/events/ib_umad/ib_umad_write/enable /sys/kernel/debug/tracing/events/ib_umad/ib_umad_read/enable /sys/kernel/debug/tracing/events/ib_umad/enable The top level enable above will enable both write and read in this case. > > What is the overhead of the trace enabled check when the trace points > are off ? Trace points are practically a no-op when disabled. We have used them in the HFI1 driver fast path without detriment. > Are there MAD performance numbers with tracing off before and > after these patches ? I don't have numbers for this. AFAIK there are no common benchmarks for MAD stack performance. I have written some in the past but I don't know where that code is at the moment. > > Does this work with RMPP ? Was this tested with RMPP responses such as > SA GetTableResp ? Define "work"? ;-) RMPP is not affected but is also not reported. And the traces at the lower levels are showing the individual packet "MADs" where as at the umad level they are showing the RMPP "MAD", if RMPP is done within the kernel. Additional RMPP features should probably be added at some point (like decoding the rmpp header if present) but I just never got around to it because it requires more processing of the packets (something I wanted to avoid). This does not break anything it just may not be as much information as we would like WRT RMPP. But having this is better than nothing IMO. > > Should GSI MAD trace points include the GRH if it's present ? That could be added for sure. As OPA does not use GRH a lot it was not high on my list of priorities. We will probably want to add lots of things in the future, especially to other areas of the core; SA, rdmacm, or verbs? This is one of the reasons I've backed away from removing the snoop feature. Lets leave snoop in for now and get these in as a start. Sound ok? Ira > > Thanks. > > -- Hal > > > In addition I wrote a sample eBPF which shows how one can further filter at the > > tracepoints to distill the information being traced. I don't know if this > > should be submitted through another tree or if it is ok to take though > > linux-rdma? > > > > Ira > > > > [1] https://www.spinics.net/lists/linux-rdma/msg29109.html > > > > > > Ira Weiny (6): > > IB/MAD: Add send path trace points > > IB/MAD: Add recv path trace point > > IB/MAD: Add agent trace points > > IB/UMAD: Add umad trace points > > IB/MAD: Add SMP details to MAD tracing > > BPF: Add sample code for new ib_umad tracepoint > > > > drivers/infiniband/core/mad.c | 95 ++++++++- > > drivers/infiniband/core/user_mad.c | 7 + > > include/trace/events/ib_mad.h | 391 +++++++++++++++++++++++++++++++++++++ > > include/trace/events/ib_umad.h | 140 +++++++++++++ > > samples/bpf/Makefile | 3 + > > samples/bpf/ibumad_kern.c | 123 ++++++++++++ > > samples/bpf/ibumad_user.c | 120 ++++++++++++ > > 7 files changed, 878 insertions(+), 1 deletion(-) > > create mode 100644 include/trace/events/ib_mad.h > > create mode 100644 include/trace/events/ib_umad.h > > create mode 100644 samples/bpf/ibumad_kern.c > > create mode 100644 samples/bpf/ibumad_user.c > >
On 12/20/2018 12:02 AM, Ira Weiny wrote: > On Wed, Dec 19, 2018 at 09:59:56PM -0500, Hal Rosenstock wrote: >> Hi Ira, >> >> On 12/17/2018 2:15 PM, ira.weiny@intel.com wrote: >>> From: Ira Weiny <ira.weiny@intel.com> >>> >>> A while ago I wrote these patches for MAD stack tracing. At the time I was >>> proposing to remove the snoop interface.[1] >>> >>> For this submission I'd like to propose adding the tracing and leave the snoop >>> interface in. While I still don't see a need for the snoop interface, I'm no >>> longer advocating getting rid of it as the functionality of these patches is >>> different. >> >> I have a few questions on this patch series: >> >> Should the agent, umad, and SMP trace points also be checked for being >> enabled as the send and receive ones are ? > > The enable is only used in the send/recv because they need to call > trace_create_mad_addr to convert the address based on AH_TYPE. The other trace > points can use the "built in" trace enable/disable. For example the umad trace > points show up with the following enable files. > > /sys/kernel/debug/tracing/events/ib_umad/ib_umad_write/enable > /sys/kernel/debug/tracing/events/ib_umad/ib_umad_read/enable > /sys/kernel/debug/tracing/events/ib_umad/enable > > The top level enable above will enable both write and read in this case. > >> >> What is the overhead of the trace enabled check when the trace points >> are off ? > > Trace points are practically a no-op when disabled. We have used them in the > HFI1 driver fast path without detriment. > >> Are there MAD performance numbers with tracing off before and >> after these patches ? > > I don't have numbers for this. AFAIK there are no common benchmarks for MAD > stack performance. I have written some in the past but I don't know where that > code is at the moment. > >> >> Does this work with RMPP ? Was this tested with RMPP responses such as >> SA GetTableResp ? > > Define "work"? ;-) What I meant was how RMPP looks when trace is on. In addition to not displaying the RMPP fields, I guess impact is that TID would be replicated in the MAD RMPP sequence. > RMPP is not affected but is also not reported. And the traces at the lower > levels are showing the individual packet "MADs" where as at the umad level they > are showing the RMPP "MAD", if RMPP is done within the kernel. > > Additional RMPP features should probably be added at some point (like decoding > the rmpp header if present) but I just never got around to it because it > requires more processing of the packets (something I wanted to avoid). > > This does not break anything it just may not be as much information as we would > like WRT RMPP. But having this is better than nothing IMO. > >> >> Should GSI MAD trace points include the GRH if it's present ? > > That could be added for sure. As OPA does not use GRH a lot it was not high on > my list of priorities. > > We will probably want to add lots of things in the future, especially to other > areas of the core; SA, rdmacm, or verbs? > > This is one of the reasons I've backed away from removing the snoop feature. > Lets leave snoop in for now and get these in as a start. Sound ok? This seems reasonable to me. -- Hal > Ira > >> >> Thanks. >> >> -- Hal >> >>> In addition I wrote a sample eBPF which shows how one can further filter at the >>> tracepoints to distill the information being traced. I don't know if this >>> should be submitted through another tree or if it is ok to take though >>> linux-rdma? >>> >>> Ira >>> >>> [1] https://www.spinics.net/lists/linux-rdma/msg29109.html >>> >>> >>> Ira Weiny (6): >>> IB/MAD: Add send path trace points >>> IB/MAD: Add recv path trace point >>> IB/MAD: Add agent trace points >>> IB/UMAD: Add umad trace points >>> IB/MAD: Add SMP details to MAD tracing >>> BPF: Add sample code for new ib_umad tracepoint >>> >>> drivers/infiniband/core/mad.c | 95 ++++++++- >>> drivers/infiniband/core/user_mad.c | 7 + >>> include/trace/events/ib_mad.h | 391 +++++++++++++++++++++++++++++++++++++ >>> include/trace/events/ib_umad.h | 140 +++++++++++++ >>> samples/bpf/Makefile | 3 + >>> samples/bpf/ibumad_kern.c | 123 ++++++++++++ >>> samples/bpf/ibumad_user.c | 120 ++++++++++++ >>> 7 files changed, 878 insertions(+), 1 deletion(-) >>> create mode 100644 include/trace/events/ib_mad.h >>> create mode 100644 include/trace/events/ib_umad.h >>> create mode 100644 samples/bpf/ibumad_kern.c >>> create mode 100644 samples/bpf/ibumad_user.c >>> >
From: Ira Weiny <ira.weiny@intel.com>
I believe with this change I have addressed all the comments.
Again we are leaving the snoop interface in and there is a sample eBPF which
shows how one can further filter at the tracepoints to distill the information
being traced.
Changes for V5:
checkpatch cleanup
Remove if statements and use calls in TP_fast_assign for
cleaner trace code
Changes for V4:
Change dev_name to dev_index
Changes for V3:
Rebased on current RDMA for next
Change license text to SPDX tag
Change dev_name to string from array
Reorder fields for more compact ring buffer utilization
Use a defined roce address type for safer memcpy/memset
Adjust BPF to new umad format
Changes for v2:
Update MAINTAINERS as indicated from Doug
Now CC'ing the Tracing maintainers so they are aware of the additions
[1] https://www.spinics.net/lists/linux-rdma/msg29109.html
Ira Weiny (6):
IB/MAD: Add send path trace points
IB/MAD: Add recv path trace point
IB/MAD: Add agent trace points
IB/UMAD: Add umad trace points
IB/MAD: Add SMP details to MAD tracing
BPF: Add sample code for new ib_umad tracepoint
MAINTAINERS | 4 +
drivers/infiniband/core/mad.c | 57 ++++-
drivers/infiniband/core/user_mad.c | 7 +
include/trace/events/ib_mad.h | 386 +++++++++++++++++++++++++++++
include/trace/events/ib_umad.h | 122 +++++++++
samples/bpf/Makefile | 3 +
samples/bpf/ibumad_kern.c | 125 ++++++++++
samples/bpf/ibumad_user.c | 122 +++++++++
8 files changed, 825 insertions(+), 1 deletion(-)
create mode 100644 include/trace/events/ib_mad.h
create mode 100644 include/trace/events/ib_umad.h
create mode 100644 samples/bpf/ibumad_kern.c
create mode 100644 samples/bpf/ibumad_user.c
From: Ira Weiny <ira.weiny@intel.com> A while ago I wrote these patches for MAD stack tracing. At the time I was proposing to remove the snoop interface.[1] For this submission I'd like to propose adding the tracing and leave the snoop interface in. While I still don't see a need for the snoop interface, I'm no longer advocating getting rid of it as the functionality of these patches is different. In addition I wrote a sample eBPF which shows how one can further filter at the tracepoints to distill the information being traced. I don't know if this should be submitted through another tree or if it is ok to take though linux-rdma? Ira [1] https://www.spinics.net/lists/linux-rdma/msg29109.html Ira Weiny (6): IB/MAD: Add send path trace points IB/MAD: Add recv path trace point IB/MAD: Add agent trace points IB/UMAD: Add umad trace points IB/MAD: Add SMP details to MAD tracing BPF: Add sample code for new ib_umad tracepoint drivers/infiniband/core/mad.c | 95 ++++++++- drivers/infiniband/core/user_mad.c | 7 + include/trace/events/ib_mad.h | 391 +++++++++++++++++++++++++++++++++++++ include/trace/events/ib_umad.h | 140 +++++++++++++ samples/bpf/Makefile | 3 + samples/bpf/ibumad_kern.c | 123 ++++++++++++ samples/bpf/ibumad_user.c | 120 ++++++++++++ 7 files changed, 878 insertions(+), 1 deletion(-) create mode 100644 include/trace/events/ib_mad.h create mode 100644 include/trace/events/ib_umad.h create mode 100644 samples/bpf/ibumad_kern.c create mode 100644 samples/bpf/ibumad_user.c