Message ID | 20250116144931.649593-1-tglozar@redhat.com (mailing list archive) |
---|---|
Series | rtla/timerlat: Stop on signal properly when overloaded |

On Thu, 16 Jan 2025 15:49:26 +0100
Tomas Glozar <tglozar@redhat.com> wrote:

> In the future, two more patchsets will be sent: one to display how many
> events/samples were dropped (either left in tracefs buffer or by buffer
> overflow), one to improve sample processing performance to be on par with
> cyclictest (ideally) so that samples are not dropped in the cases mentioned
> in the beginning of the email.

Hmm, I wonder if timerlat can handle per cpu data, then you could kick off
a thread per CPU (or a set of CPUs) where the thread is responsible for
handling the data.

    CPU_ZERO_S(cpu_size, cpusetp);
    CPU_SET_S(cpu, cpu_size, cpusetp);
    retval = tracefs_iterate_raw_events(trace->tep,
                                        trace->inst,
                                        cpusetp,
                                        cpu_size,
                                        collect_registered_events,
                                        trace);

And then that iteration will only read over a subset of CPUs. Each thread
can do a different subset and then it should be able to keep up.

-- Steve
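
The per-thread CPU subsets suggested above could be set up roughly as follows. This is only a sketch: `struct cpu_worker`, `cpu_worker_init()`, `cpu_worker_main()`, `run_cpu_workers()` and `free_cpu_workers()` are hypothetical names that do not exist in rtla, the CPUs are assumed to be spread round-robin over the workers, and the actual `tracefs_iterate_raw_events()` call is left as a comment so the example builds without libtracefs.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical per-thread state; none of these names come from rtla. */
struct cpu_worker {
	pthread_t thread;
	cpu_set_t *cpus;	/* subset of CPUs this worker reads */
	size_t cpu_size;	/* size of the set in bytes */
	int nr_seen;		/* filled in by the worker thread */
};

/* Give worker 'idx' every nr_workers-th CPU, starting at CPU 'idx'. */
static int cpu_worker_init(struct cpu_worker *w, int idx, int nr_workers,
			   int nr_cpus)
{
	int cpu;

	w->cpus = CPU_ALLOC(nr_cpus);
	if (!w->cpus)
		return -1;
	w->cpu_size = CPU_ALLOC_SIZE(nr_cpus);
	w->nr_seen = 0;

	CPU_ZERO_S(w->cpu_size, w->cpus);
	for (cpu = idx; cpu < nr_cpus; cpu += nr_workers)
		CPU_SET_S(cpu, w->cpu_size, w->cpus);
	return 0;
}

static void *cpu_worker_main(void *arg)
{
	struct cpu_worker *w = arg;

	/*
	 * A real worker would call tracefs_iterate_raw_events() here,
	 * passing w->cpus and w->cpu_size. As a stand-in, only count
	 * the CPUs this thread is responsible for.
	 */
	w->nr_seen = CPU_COUNT_S(w->cpu_size, w->cpus);
	return NULL;
}

/* Start one thread per subset and wait for all of them to finish. */
static int run_cpu_workers(struct cpu_worker *workers, int nr_workers,
			   int nr_cpus)
{
	int i, started = 0, ret = 0;

	for (i = 0; i < nr_workers; i++)
		workers[i].cpus = NULL;

	for (i = 0; i < nr_workers; i++) {
		if (cpu_worker_init(&workers[i], i, nr_workers, nr_cpus) ||
		    pthread_create(&workers[i].thread, NULL,
				   cpu_worker_main, &workers[i])) {
			ret = -1;
			break;
		}
		started++;
	}

	for (i = 0; i < started; i++)
		pthread_join(workers[i].thread, NULL);

	return ret;
}

/* Release the CPU sets; safe to call after a partial failure. */
static void free_cpu_workers(struct cpu_worker *workers, int nr_workers)
{
	int i;

	for (i = 0; i < nr_workers; i++) {
		CPU_FREE(workers[i].cpus);
		workers[i].cpus = NULL;
	}
}
```

With 8 CPUs and 3 workers, worker 0 would get CPUs 0, 3 and 6, worker 1 would get 1, 4 and 7, and worker 2 would get 2 and 5. Any shared histogram state touched from the event callback would still need per-CPU storage or locking, which this sketch does not cover.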

On Fri, 17 Jan 2025 at 1:46, Steven Rostedt <rostedt@goodmis.org> wrote:

> Hmm, I wonder if timerlat can handle per cpu data, then you could kick off
> a thread per CPU (or a set of CPUs) where the thread is responsible for
> handling the data.
>
>     CPU_ZERO_S(cpu_size, cpusetp);
>     CPU_SET_S(cpu, cpu_size, cpusetp);
>     retval = tracefs_iterate_raw_events(trace->tep,
>                                         trace->inst,
>                                         cpusetp,
>                                         cpu_size,
>                                         collect_registered_events,
>                                         trace);
>
> And then that iteration will only read over a subset of CPUs. Each thread
> can do a different subset and then it should be able to keep up.

That's a good idea, I didn't think of that. But it doesn't help much in a
scenario where rtla is pinned to a few housekeeping CPUs with -H, which is
used for testing isolated-CPU-based setups.

I was thinking of turning timerlat_hist_handler/timerlat_top_handler
into a BPF program and having it executed right after the sample is
created, e.g. by using the BPF perf interface to hook it to a
tracepoint event. The histogram/counter would be stored in BPF maps,
which would be merely copied over in the main loop. This is
essentially how cyclictest does it, except in userspace. I expect this
solution to have good performance, but the obvious downside is that it
requires BPF. This is not a problem for us, but might be for other
rtla users and we'd likely have to keep both implementations of sample
processing in the code.

Also, before even starting with that, it would be likely necessary to
remove the duplicate code throughout timerlat/osnoise and test it
properly, so we don't have to do the same code changes twice or four
times.

Tomas
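
To illustrate the split described above (a constant-time counter update when the sample is created, and a main loop that only copies counters), here is a plain userspace C stand-in. It is not a BPF program and not rtla code: `struct lat_hist`, `lat_hist_record()`, `lat_hist_snapshot()` and the three constants are hypothetical, and the two-dimensional array merely plays the role a per-CPU BPF array map might play.

```c
#include <stdint.h>
#include <string.h>

#define NR_CPUS_MAX	4	/* illustrative value only */
#define NR_BUCKETS	8	/* the last bucket collects overflow */
#define BUCKET_SIZE_US	10	/* width of one bucket, in microseconds */

/* Stand-in for a per-CPU map: one row of counters per CPU. */
struct lat_hist {
	uint64_t count[NR_CPUS_MAX][NR_BUCKETS];
};

/*
 * What would run when a sample is created: pick a bucket and bump
 * one counter. No event is copied out to userspace.
 */
static void lat_hist_record(struct lat_hist *h, int cpu, uint64_t latency_us)
{
	uint64_t bucket = latency_us / BUCKET_SIZE_US;

	if (cpu < 0 || cpu >= NR_CPUS_MAX)
		return;
	if (bucket >= NR_BUCKETS)
		bucket = NR_BUCKETS - 1;

	h->count[cpu][bucket]++;
}

/*
 * What the main loop would do: read the counters and sum them over
 * CPUs. The cost depends on the number of buckets and CPUs, not on
 * how many samples were recorded.
 */
static void lat_hist_snapshot(const struct lat_hist *h,
			      uint64_t total[NR_BUCKETS])
{
	int cpu, b;

	memset(total, 0, NR_BUCKETS * sizeof(total[0]));
	for (cpu = 0; cpu < NR_CPUS_MAX; cpu++)
		for (b = 0; b < NR_BUCKETS; b++)
			total[b] += h->count[cpu][b];
}
```

The point of the sketch is the cost model: the userspace side no longer has to keep up with the sample rate, because it reads a fixed-size table instead of iterating over every event in the trace buffer.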

On Fri, 17 Jan 2025 13:04:07 +0100
Tomas Glozar <tglozar@redhat.com> wrote:

> I was thinking of turning timerlat_hist_handler/timerlat_top_handler
> into a BPF program and having it executed right after the sample is
> created, e.g. by using the BPF perf interface to hook it to a
> tracepoint event. The histogram/counter would be stored in BPF maps,
> which would be merely copied over in the main loop. This is
> essentially how cyclictest does it, except in userspace. I expect this
> solution to have good performance, but the obvious downside is that it
> requires BPF. This is not a problem for us, but might be for other
> rtla users and we'd likely have to keep both implementations of sample
> processing in the code.
>
> Also, before even starting with that, it would be likely necessary to
> remove the duplicate code throughout timerlat/osnoise and test it
> properly, so we don't have to do the same code changes twice or four
> times.

We could also add kernel helpers to the code if it would help.

Hmm, the timerlat event could probably get access to a trigger to allow it
to do the work in the kernel like what the 'hist' triggers do. We can
extend on that.

The reason I haven't written a BPF program yet, is because when I feel
there's a useful operation that can be done, I just extend ftrace to do it
;-)

-- Steve

On Fri, 17 Jan 2025 at 16:29, Steven Rostedt <rostedt@goodmis.org> wrote:

> Hmm, the timerlat event could probably get access to a trigger to allow it
> to do the work in the kernel like what the 'hist' triggers do. We can
> extend on that.

If we only need to count the numbers, that would be a perfect
solution: use the hist trigger in place of the BPF program (no need to
reinvent the wheel!). I'll have to look at exactly what we need.

> The reason I haven't written a BPF program yet, is because when I feel
> there's a useful operation that can be done, I just extend ftrace to do it
> ;-)

I used to work with BPF, so I usually think of a BPF solution first,
then I think about whether I can replace the BPF part :D

Tomas