[v28,0/8] Timestamp synchronization of host - guest tracing session

Message ID	20210208061743.510964-1-tz.stoyanov@gmail.com (mailing list archive)
Headers	show Return-Path: <linux-trace-devel-owner@kernel.org> From: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com> To: rostedt@goodmis.org Cc: linux-trace-devel@vger.kernel.org Subject: [PATCH v28 0/8] Timestamp synchronization of host - guest tracing session Date: Mon, 8 Feb 2021 08:17:35 +0200 Message-Id: <20210208061743.510964-1-tz.stoyanov@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Timestamp synchronization of host - guest tracing session \| expand [v28,0/8] Timestamp synchronization of host - guest tracing session [v28,1/8] trace-cmd: Save command lines in VM agent [v28,2/8] trace-cmd: Fix bug in getting tracing dir in trace-cmd agent [v28,3/8] trace-cmd: Removed unused s64 define [v28,4/8] trace-cmd: Add timestamp synchronization per vCPU [v28,5/8] trace-cmd: Add dummy function to initialize timestamp sync logic [v28,6/8] trace-cmd: [POC] PTP-like algorithm for host - guest timestamp synchronization [v28,7/8] trace-cmd: Debug scripts for PTP-like algorithm for host - guest timestamp synchronization [v28,8/8] trace-cmd [POC]: Add KVM timestamp synchronization plugin

Tzvetomir Stoyanov (VMware) Feb. 8, 2021, 6:17 a.m. UTC

Basic infrastructure for host - guest timestamp synchronization and a
PoC implementation of PTP-like and KVM algorithms.

The KVM plugin is with higher weight than PTP, it is preferred in case both
are active. KVM is used only if x86-tsc clock is selected for the current
trace, PTP is used for all other cases.

These changes depend on commits:
"trace-cmd: Fix plist allocation check in trace-msg.c"
https://patchwork.kernel.org/project/linux-trace-devel/patch/20210203165431.792152a4@gandalf.local.home/
"trace-cmd: Fix SEGFAULT when no tsync_protos are supplied"
https://patchwork.kernel.org/project/linux-trace-devel/patch/20210203171018.5b82a0a7@gandalf.local.home/

v28 changes:
- Fix bugs, introduced by previous trace-cmd commits, related to tracefs
library usage.
- Removed unused type defines for trace packets.
v27 changes:
- Fixed broken compilation due to usage of old libtracefs APIs.
- Rewrote complex macros to follow the Linux kernel requirements for macros.
- Removed pathces from the set, that were already merged upstream.
v26 changes:
- Rebased on top of latest master.
- Abstracted Time sync protocol names in a structure, to avoid using
triple char pointers.
- Removed a patch from the set, as it is already merged:
"trace-cmd: Add new libtrasefs API to get the current trace clock".
- Minor coding style changes, for better code readability.
v25 changes:
- Addressed Steven's comments.
- Reorded the patches in the plugin - put the PTP and KVM PoC patches
last and all other fixes and improvements before them.
- Implemented a new libtrasefs API to get the current trace clock.
- Replaced IDs of trace timestamps synchronization protocols with strings.
- Implemented an "accuracy" parameter for timestamps synchronization
protocols.
- Added "scaling ratio" correction, calculated by the timestamp
synchronization protocols.
- Added flags, declared by the timestamps synchronization protocols,
and stored in the trace.dat file, together with the calculated
time offsets.
- Implemented timestamp offset calculation per virtual CPU.
v24 changes:
- trace-cmd internal refactoring, needed for KVM timestamp synchronization
plugin.
- Added KVM timestamp synchronization plugin.

v23 changes:
- Added flags to PTP start message to control the behaviour of the
algorithm. This is for development purposes mostly, to compare and
evaluate how these changes affect the accuracy.
PTP_FLAG_FASTEST_RESPONSE: Consider only the probe with fastest
response time, otherwise make a histogram
from all probes.
PTP_FLAG_USE_MARKER: Use trace marker to get the clock,
otherwise use directly the system clock

v22 changes, addressed Steven's comments:
- Fixed error flow in read_qemu_guests_pids(), free allocated resources
in case of an error.
- Fixed initialisation of reallocated memmory in set_vcpu_pid_mapping().
- Fixed bitmask logic in tracecmd_tsync_proto_select().
- Fixed usage of wrong structure in make_trace_resp().

v21 changes:
- Rebased on top of latest master.
- Remove these two patches from the set, as they are already merged:
trace-cmd: Add new library API for local CPU count
trace-cmd: Add support for negative time offsets in trace.dat file
- Added more descriptive warning message when fails to extract Time Shift
information from the trace.dat file.
- Added a warning message when fails to obtain vcpu - pid mappings.
- Handle the case with sparse VCPU numbers in VCPU - PID mapping
array.
- Fixed broken recording over FIFOs.
v20 changes:
- Rebased on top of latest master.
- Removed the VCPUS_MAX hardcoded limit, reimplemented the cpu_pid[] array
to be dynamically allocated.
- Added a warning when reading of TRACECMD_OPTION_TIME_SHIFT option
fails, due to unexpected option's size.
- Improved loading of TRACECMD_OPTION_GUEST option data, as Steven
suggested.
v19 changes:
- Rebased on top of latest master. The librtacefs is already merged,
this allows to drop few patches from the set and use librtacefs APIs
instead.
- Reimplemented all new trace.dat options to be in binary format,
instead of text. Leverage the new "trace-cmd dump" sub command to
track what is written in the file.
- Addressed Steven's comments.

v18 changes: addressed Steven Rostdet comments:
- Replaced semaphores with pthread mutexes.
- Made bitmask with time sync protocols unlimited, so we can support
more than 32 protocols. That required to redesign the trace request
packet format.
- A lot of small fixes.

v17 changes:
- Implemented new PTP logic for calculating the clocks offset, using
histogram of all PTP samples. It gives better results than the logic
with the fastest response time, so set the new one as default in the
POC patch.

v16 changes:
- Fixed compilation in case no VSOCK is available (Thanks to Slavomir Kaslev)
- Fixed a typo in trace-cmd-record.1.txt (Thanks to Slavomir Kaslev)
- Added forgotten file in the patch "trace-cmd: Add new library APIs for
ftrace instances." - trace-instance.c
- Fixed few compilation warnings related to TSYNC_DEBUG code.
- Removed a blank line at the end of "tsync_readme" file.

v15 changes:
- Removed the patch for "--proc-map" from the series, as it should not
be part of it.

v14 changes:
- Bring back the PTP-like algorithm and removed the ftrace event based logic.
- Reimplemented the PTP-like algorithm to use raw ftrace markers, instead of clock_gettime() API.
- Refactored the logic to be algorithm independent and plugin friendly.
- Implemented continuous timestamps synchronization, while the trace is running.
- Moved logic from trace-cmd application to libtracecmd, as new library APIs.
- Implemented new trace id functionality.
- Implemented new guest section in host trace.dat file.

v13 changes:
- Remove few patches from the set, as they were merged.
- Rebased to the latest master, Slavomirs patchest "Add VM kernel tracing over
vsockets and FIFOs" got merged!

v12 changes:
- Rebased on top of Slavomir's v13 "Add VM kernel tracing over vsockets and FIFOs"

v11 changes:
- Rebased on top of Slavomir's v10 "Add VM kernel tracing over vsockets and FIFOs"
- Addressed Slavomir's commnents from version 10 of the patch series.

v10 changes:
- Fixed broken compilation, call to timestamp_correction_calc() in timestamp_correct
was smashed.
- Replaced deprecated tep_data_event_from_type() API with tep_find_event().
- Fixed a warning on assignment const to non const.

v9 changes:
- Fixed implementation of binary search algorithm in timestamp_correct()

v8 changes:
- Added rmdir() call in tracecmd_remove_instance(), to completely remove the instance.
However, there is an issue with deleting the instances using rmdir(), which is investigated.
- Few changes in read_qemu_guests_pids(), timestamp_correct(), tsync_offset_load()
tracecmd_clock_context_new() and find_raw_events() suggested by Slavomir.

v7 changes:
- Added warning messages in case time synchronization cannot be negotiated or fails.
- Few optimizations and checks in read_qemu_guests_pids(), tsync_offset_load(),
and find_raw_events(), suggested by Slavomir Kaslev.
- Reworked timestamp_correct() to not use static variables.
- Check TRACECMD_OPTION_TIME_SHIFT before reading time sync samples from the trace.dat file

v6 changes:
- Refactored tracecmd_msg_snd_time_sync() and tracecmd_msg_rcv_time_sync() functions:
removed any time sync calculations logic as separate functions in trace-timesync.c file
- Defined TSYNC_PROBE, TSYNC_REQ and TSYNC_RESP messages, in order to make the time sync
protocol comprehensible.
- Addressed Steven Rostedt comments.
- Addressed Slavomir Kaslev commnets.

v5 changes:
- Rebased to Slavomir's v8 "Add VM kernel tracing over vsockets and FIFOs"
patch series.
- Implemented an algorithm for time drift correction.
- Addressed Slavomir's commnets.
- Refactored the code: moved all time sync specific implementation in trace-timesync.c
- Isolated all hardcoded event specific stuff in a structure, so it could be easily
moved to external plugins.
- Added a check for VSOCK support: do not perform vsock dependent time synchronisation
in case there is no VSOCK support.

v4 changes:
- Removed the implementation of PTP-like algorithm. The current
logic relies on matching time stamps of kvm_exit/virtio_transport_recv_pkt
events on host to virtio_transport_alloc_pkt/vp_notify events on guest.
- Rebased to Slavomir's v7 "Add VM kernel tracing over vsockets and FIFOs"
patch series.
- Decreased the time synch probes from 5000 to 300.
- Addressed Steven Rostedt comments.
- Code cleanup.

v3 changes:
- Removed any magic constants, used in the PTP-like algorithm,
as Slavomir Kaslev suggested.
- Implemented new algorithm, based on mapping kvm_exit events
in host context to vsock_send events in guest context,
suggested by Steven Rostedt.

v2 changes:
- Addressed Steven Rostedt comments.
- Modified PTP-like timestamps sync algorithm to gain more accuracy, with the
help of Yordan Karadzhov and Slavomir Kaslev.
Tzvetomir Stoyanov (VMware) (8):
trace-cmd: Save command lines in VM agent
trace-cmd: Fix bug in getting tracing dir in trace-cmd agent
trace-cmd: Removed unused s64 define
trace-cmd: Add timestamp synchronization per vCPU
trace-cmd: Add dummy function to initialize timestamp sync logic
trace-cmd: [POC] PTP-like algorithm for host - guest timestamp
synchronization
trace-cmd: Debug scripts for PTP-like algorithm for host - guest
timestamp synchronization
trace-cmd [POC]: Add KVM timestamp synchronization plugin

lib/trace-cmd/Makefile | 2 +
.../include/private/trace-cmd-private.h | 5 +-
lib/trace-cmd/include/trace-tsync-local.h | 23 +-
lib/trace-cmd/trace-input.c | 188 +++--
lib/trace-cmd/trace-msg.c | 1 -
lib/trace-cmd/trace-output.c | 4 +-
lib/trace-cmd/trace-timesync-kvm.c | 460 +++++++++++
lib/trace-cmd/trace-timesync-ptp.c | 715 ++++++++++++++++++
lib/trace-cmd/trace-timesync.c | 222 ++++--
scripts/debug/tsync_hist.py | 57 ++
scripts/debug/tsync_readme | 12 +
scripts/debug/tsync_res.py | 46 ++
tracecmd/trace-agent.c | 2 +
tracecmd/trace-dump.c | 52 +-
tracecmd/trace-record.c | 26 +-
tracecmd/trace-tsync.c | 67 +-
16 files changed, 1698 insertions(+), 184 deletions(-)
create mode 100644 lib/trace-cmd/trace-timesync-kvm.c
create mode 100644 lib/trace-cmd/trace-timesync-ptp.c
create mode 100644 scripts/debug/tsync_hist.py
create mode 100644 scripts/debug/tsync_readme
create mode 100644 scripts/debug/tsync_res.py

Dario Faggioli Feb. 9, 2021, 5:32 a.m. UTC | #1

Hello,

On Mon, 2021-02-08 at 08:17 +0200, Tzvetomir Stoyanov (VMware) wrote:
> Basic infrastructure for host - guest timestamp synchronization and a
> PoC implementation of PTP-like and KVM algorithms. 
> 
I'm trying to play with trace-cmd agent & clock syncronization, so I
applied this series.

However, when I try to connect to a VM where I have the agent running,
I'm seeing this (on the host):

# ./tracecmd/trace-cmd record --debug -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:8888 -e all sleep 1
trace-cmd: No such file or directory
  Failed to receive trace response -107

I tried to do a strace, still on the host, and this is what I see (the
last part of the output):

rt_sigaction(SIGINT, {sa_handler=0x4116a5, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f29f1b55b70}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
openat(AT_FDCWD, "/var/lib/trace-cmd/virt/(null)/trace-pipe-cpu0.out", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory)
socket(AF_VSOCK, SOCK_STREAM, 0)        = 4
connect(4, {sa_family=AF_VSOCK, sa_data="\0\0\270\"\0\0\3\0\0\0\0\0\0\0"}, 16) = 0
openat(AT_FDCWD, "/sys/kernel/debug/tracing/trace_clock", O_RDONLY) = 6
read(6, "local global counter uptime perf"..., 8192) = 62
read(6, "", 8192)                       = 0
close(6)                                = 0
write(4, "\0\0\0<\0\0\0\6\0\0\0\20\0\0\0\0\0\0\0\0\347\366+n\204\271\35\334", 28) = 28
write(4, "\0\0\0\0\0\0\0\v\0\0\0\2-e\0all\0\0\0\0\1\0\0\0\5ptp\0\0", 32) = 32
read(4, "", 12)                         = 0
dup(2)                                  = 6
fcntl(6, F_GETFL)                       = 0x402 (flags O_RDWR|O_APPEND)
fstat(6, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x9), ...}) = 0
write(6, "trace-cmd: No such file or direc"..., 37trace-cmd: No such file or directory
) = 37
close(6)                                = 0
write(2, "  ", 2  )                       = 2
write(2, "Failed to receive trace response"..., 37Failed to receive trace response -107) = 37
write(2, "\n", 1
)                       = 1
exit_group(2)                           = ?
+++ exited with 2 +++

Finally, if I use `--debug` both on the host and in the VM, here's what
I get.

Guest:
# ./tracecmd/trace-cmd agent -p 8888 --debug
listening on @3:8888
msg received: 6 (TRACE_REQ) [60]
Segmentation fault

Host:
# ./tracecmd/trace-cmd record --debug -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:8888 -e all sleep 1
msg send: 6 (TRACE_REQ) [60]
trace-cmd: No such file or directory
  Failed to receive trace response -107

Am I doing something wrong? :-)

If necessary, I can provide more info about my setup, or do more tests.

Thanks and Regards

Tzvetomir Stoyanov (VMware) Feb. 9, 2021, 6:03 a.m. UTC | #2

Hi Dario,

On Tue, Feb 9, 2021 at 7:32 AM Dario Faggioli <dfaggioli@suse.com> wrote:
>
> Hello,
>
> On Mon, 2021-02-08 at 08:17 +0200, Tzvetomir Stoyanov (VMware) wrote:
> > Basic infrastructure for host - guest timestamp synchronization and a
> > PoC implementation of PTP-like and KVM algorithms.
> >
> I'm trying to play with trace-cmd agent & clock syncronization, so I
> applied this series.
>
> However, when I try to connect to a VM where I have the agent running,
> I'm seeing this (on the host):
>
> # ./tracecmd/trace-cmd record --debug -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:8888 -e all sleep 1
> trace-cmd: No such file or directory
>   Failed to receive trace response -107
>
> I tried to do a strace, still on the host, and this is what I see (the
> last part of the output):
>
> rt_sigaction(SIGINT, {sa_handler=0x4116a5, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f29f1b55b70}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
> openat(AT_FDCWD, "/var/lib/trace-cmd/virt/(null)/trace-pipe-cpu0.out", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory)
> socket(AF_VSOCK, SOCK_STREAM, 0)        = 4
> connect(4, {sa_family=AF_VSOCK, sa_data="\0\0\270\"\0\0\3\0\0\0\0\0\0\0"}, 16) = 0
> openat(AT_FDCWD, "/sys/kernel/debug/tracing/trace_clock", O_RDONLY) = 6
> read(6, "local global counter uptime perf"..., 8192) = 62
> read(6, "", 8192)                       = 0
> close(6)                                = 0
> write(4, "\0\0\0<\0\0\0\6\0\0\0\20\0\0\0\0\0\0\0\0\347\366+n\204\271\35\334", 28) = 28
> write(4, "\0\0\0\0\0\0\0\v\0\0\0\2-e\0all\0\0\0\0\1\0\0\0\5ptp\0\0", 32) = 32
> read(4, "", 12)                         = 0
> dup(2)                                  = 6
> fcntl(6, F_GETFL)                       = 0x402 (flags O_RDWR|O_APPEND)
> fstat(6, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x9), ...}) = 0
> write(6, "trace-cmd: No such file or direc"..., 37trace-cmd: No such file or directory
> ) = 37
> close(6)                                = 0
> write(2, "  ", 2  )                       = 2
> write(2, "Failed to receive trace response"..., 37Failed to receive trace response -107) = 37
> write(2, "\n", 1
> )                       = 1
> exit_group(2)                           = ?
> +++ exited with 2 +++
>
> Finally, if I use `--debug` both on the host and in the VM, here's what
> I get.
>
> Guest:
> # ./tracecmd/trace-cmd agent -p 8888 --debug
> listening on @3:8888
> msg received: 6 (TRACE_REQ) [60]
> Segmentation fault
>
> Host:
> # ./tracecmd/trace-cmd record --debug -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:8888 -e all sleep 1
> msg send: 6 (TRACE_REQ) [60]
> trace-cmd: No such file or directory
>   Failed to receive trace response -107
>
> Am I doing something wrong? :-)
>
> If necessary, I can provide more info about my setup, or do more tests.

There are a few limitations that we are working to address:
 1. Ensure that vhost_vsock module is loaded on the host.
 2. You should add a vsock device manually to your VM config, I use
"virsh edit <vm name>" to add:
   <vsock model='virtio'>
      <cid auto='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09'
function='0x0'/>
    </vsock>
somewhere in the <devices> </devices> section.

>
> Thanks and Regards
> --
> Dario Faggioli, Ph.D
> http://about.me/dario.faggioli
> Virtualization Software Engineer
> SUSE Labs, SUSE https://www.suse.com/
> -------------------------------------------------------------------
> <<This happens because _I_ choose it to happen!>> (Raistlin Majere)

Tzvetomir Stoyanov (VMware) Feb. 9, 2021, 6:08 a.m. UTC | #3

On Tue, Feb 9, 2021 at 8:03 AM Tzvetomir Stoyanov <tz.stoyanov@gmail.com> wrote:
>
> Hi Dario,
>
> On Tue, Feb 9, 2021 at 7:32 AM Dario Faggioli <dfaggioli@suse.com> wrote:
> >
> > Hello,
> >
> > On Mon, 2021-02-08 at 08:17 +0200, Tzvetomir Stoyanov (VMware) wrote:
> > > Basic infrastructure for host - guest timestamp synchronization and a
> > > PoC implementation of PTP-like and KVM algorithms.
> > >
> > I'm trying to play with trace-cmd agent & clock syncronization, so I
> > applied this series.
> >
> > However, when I try to connect to a VM where I have the agent running,
> > I'm seeing this (on the host):
> >
> > # ./tracecmd/trace-cmd record --debug -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:8888 -e all sleep 1
> > trace-cmd: No such file or directory
> >   Failed to receive trace response -107
> >
> > I tried to do a strace, still on the host, and this is what I see (the
> > last part of the output):
> >
> > rt_sigaction(SIGINT, {sa_handler=0x4116a5, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f29f1b55b70}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
> > openat(AT_FDCWD, "/var/lib/trace-cmd/virt/(null)/trace-pipe-cpu0.out", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory)
> > socket(AF_VSOCK, SOCK_STREAM, 0)        = 4
> > connect(4, {sa_family=AF_VSOCK, sa_data="\0\0\270\"\0\0\3\0\0\0\0\0\0\0"}, 16) = 0
> > openat(AT_FDCWD, "/sys/kernel/debug/tracing/trace_clock", O_RDONLY) = 6
> > read(6, "local global counter uptime perf"..., 8192) = 62
> > read(6, "", 8192)                       = 0
> > close(6)                                = 0
> > write(4, "\0\0\0<\0\0\0\6\0\0\0\20\0\0\0\0\0\0\0\0\347\366+n\204\271\35\334", 28) = 28
> > write(4, "\0\0\0\0\0\0\0\v\0\0\0\2-e\0all\0\0\0\0\1\0\0\0\5ptp\0\0", 32) = 32
> > read(4, "", 12)                         = 0
> > dup(2)                                  = 6
> > fcntl(6, F_GETFL)                       = 0x402 (flags O_RDWR|O_APPEND)
> > fstat(6, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x9), ...}) = 0
> > write(6, "trace-cmd: No such file or direc"..., 37trace-cmd: No such file or directory
> > ) = 37
> > close(6)                                = 0
> > write(2, "  ", 2  )                       = 2
> > write(2, "Failed to receive trace response"..., 37Failed to receive trace response -107) = 37
> > write(2, "\n", 1
> > )                       = 1
> > exit_group(2)                           = ?
> > +++ exited with 2 +++
> >
> > Finally, if I use `--debug` both on the host and in the VM, here's what
> > I get.
> >
> > Guest:
> > # ./tracecmd/trace-cmd agent -p 8888 --debug
> > listening on @3:8888
> > msg received: 6 (TRACE_REQ) [60]
> > Segmentation fault
> >
> > Host:
> > # ./tracecmd/trace-cmd record --debug -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:8888 -e all sleep 1
> > msg send: 6 (TRACE_REQ) [60]
> > trace-cmd: No such file or directory
> >   Failed to receive trace response -107
> >
> > Am I doing something wrong? :-)
> >
> > If necessary, I can provide more info about my setup, or do more tests.
>
> There are a few limitations that we are working to address:
>  1. Ensure that vhost_vsock module is loaded on the host.
>  2. You should add a vsock device manually to your VM config, I use
> "virsh edit <vm name>" to add:
>    <vsock model='virtio'>
>       <cid auto='yes'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x09'
> function='0x0'/>
>     </vsock>
> somewhere in the <devices> </devices> section.
>
Also, please note that there are two patches that should be applied also,
not part of v28 version of the patchset:
  "trace-cmd: Fix plist allocation check in trace-msg.c"
   https://patchwork.kernel.org/project/linux-trace-devel/patch/20210203165431.792152a4@gandalf.local.home/
 "trace-cmd: Fix SEGFAULT when no tsync_protos are supplied"
   https://patchwork.kernel.org/project/linux-trace-devel/patch/20210203171018.5b82a0a7@gandalf.local.home/


> >
> > Thanks and Regards
> > --
> > Dario Faggioli, Ph.D
> > http://about.me/dario.faggioli
> > Virtualization Software Engineer
> > SUSE Labs, SUSE https://www.suse.com/
> > -------------------------------------------------------------------
> > <<This happens because _I_ choose it to happen!>> (Raistlin Majere)
>
>
>
> --
> Tzvetomir (Ceco) Stoyanov
> VMware Open Source Technology Center

Dario Faggioli Feb. 9, 2021, 7:18 a.m. UTC | #4

On Tue, 2021-02-09 at 08:03 +0200, Tzvetomir Stoyanov wrote:
> Hi Dario,
> 
Hi!

> On Tue, Feb 9, 2021 at 7:32 AM Dario Faggioli <dfaggioli@suse.com>
> wrote:
> > 
> > If necessary, I can provide more info about my setup, or do more
> > tests.
> 
> There are a few limitations that we are working to address:
>  1. Ensure that vhost_vsock module is loaded on the host.
>  2. You should add a vsock device manually to your VM config, I use
> "virsh edit <vm name>" to add:
>    <vsock model='virtio'>
>       <cid auto='yes'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x09'
> function='0x0'/>
>     </vsock>
> somewhere in the <devices> </devices> section.
> 
Yep, these are both fine. In fact, I tested host<-->guest communication
via vsock independently, and it works, so it does not seem a vsock
issue...

I also try without the series applied (but keeping patch 2) and it
still fails in the same way.

FTR, in the past, I've used trace-cmd from git with an earlier version
of this series applied (the "KVM clock" wasn't there yet) on the same
environment I'm using now, and it worked.

Regards

Dario Faggioli Feb. 9, 2021, 7:28 a.m. UTC | #5

On Tue, 2021-02-09 at 08:08 +0200, Tzvetomir Stoyanov wrote:
> 
> Also, please note that there are two patches that should be applied
> also,
> not part of v28 version of the patchset:
>
Aha!

>   "trace-cmd: Fix plist allocation check in trace-msg.c"
>     
> https://patchwork.kernel.org/project/linux-trace-devel/patch/20210203165431.792152a4@gandalf.local.home/
>  "trace-cmd: Fix SEGFAULT when no tsync_protos are supplied"
>     
> https://patchwork.kernel.org/project/linux-trace-devel/patch/20210203171018.5b82a0a7@gandalf.local.home/
> 
I just applied both, and it now seems that things work!

Or at least, host and guest talk, and I see this: "Negotiated kvm time
sync protocol with guest (null)" (is that "(null)" ok/intentional? I
guess I can check the code... :-))

Thanks a lot. :-)

Tzvetomir Stoyanov (VMware) Feb. 9, 2021, 11:44 a.m. UTC | #6

Hi Dario,

On Tue, Feb 9, 2021 at 9:28 AM Dario Faggioli <dfaggioli@suse.com> wrote:
>
> On Tue, 2021-02-09 at 08:08 +0200, Tzvetomir Stoyanov wrote:
> >
> > Also, please note that there are two patches that should be applied
> > also,
> > not part of v28 version of the patchset:
> >
> Aha!
>
> >   "trace-cmd: Fix plist allocation check in trace-msg.c"
> >
> > https://patchwork.kernel.org/project/linux-trace-devel/patch/20210203165431.792152a4@gandalf.local.home/
> >  "trace-cmd: Fix SEGFAULT when no tsync_protos are supplied"
> >
> > https://patchwork.kernel.org/project/linux-trace-devel/patch/20210203171018.5b82a0a7@gandalf.local.home/
> >
> I just applied both, and it now seems that things work!
>
> Or at least, host and guest talk, and I see this: "Negotiated kvm time
> sync protocol with guest (null)" (is that "(null)" ok/intentional? I
> guess I can check the code... :-))

That null looks like an error, the name of the guest VM should be printed there.
Thanks for testing these patches, I hope the code will be stabilized
and merged soon.

>
> Thanks a lot. :-)
> --
> Dario Faggioli, Ph.D
> http://about.me/dario.faggioli
> Virtualization Software Engineer
> SUSE Labs, SUSE https://www.suse.com/
> -------------------------------------------------------------------
> <<This happens because _I_ choose it to happen!>> (Raistlin Majere)

Dario Faggioli Feb. 9, 2021, 12:24 p.m. UTC | #7

On Tue, 2021-02-09 at 13:44 +0200, Tzvetomir Stoyanov wrote:
> Hi Dario,
> 
Hi again,

> On Tue, Feb 9, 2021 at 9:28 AM Dario Faggioli <dfaggioli@suse.com>
> wrote:
> > 
> > Or at least, host and guest talk, and I see this: "Negotiated kvm
> > time
> > sync protocol with guest (null)" (is that "(null)" ok/intentional?
> > I
> > guess I can check the code... :-))
> 
> That null looks like an error, the name of the guest VM should be
> printed there.
>
Yep, in fact, as I said, I tired this before and --now that I think
about it-- I do remember that the guest "name" was what was there.

BTW, if I can pester you a little bit more, now that it works I'm
trying to use this, and here's what I'm doing.

== VM:
# echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
# echo x86-tsc > /sys/kernel/tracing/trace_clock
# trace-cmd agent

== Host:
# echo x86-tsc > /sys/kernel/tracing/trace_clock
# trace-cmd record -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:823 -e sched:* -e syscalls:*_clock_nanosleep -e timer:hrtimer* sleep 1

And I do end up with the two files: trace.dat for the host and trace-
(null).dat (oh, well :-D) for the guest.

Now, just `trace-cmd report`-ing them, I get this (just some small
fragments, of course).

== Host (trace.dat):
           <...>-24791 [001]308219686938467: sched_switch:         prev_comm=sleep prev_pid=24791 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120
          <idle>-0     [001]308219691836595: hrtimer_cancel:       hrtimer=0xffff96694faec710
          <idle>-0     [001]308219691837544: hrtimer_expire_entry: hrtimer=0xffff96694faec710 function=apic_timer_fn now=76856318840891
          <idle>-0     [001]308219691840271: sched_waking:         comm=CPU 0/KVM pid=9196 prio=120 target_cpu=001
          <idle>-0     [001]308219691849474: sched_wakeup:         comm=CPU 0/KVM pid=9196 prio=120 target_cpu=001
          <idle>-0     [001]308219691850914: hrtimer_expire_exit:  hrtimer=0xffff96694faec710
          <idle>-0     [001]308219691856548: sched_switch:         prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=CPU 0/KVM next_pid=9196 next_prio=120
       CPU 0/KVM-9196  [001]308219691876512: kvm_vcpu_wakeup:      wait time 2352265 ns, polling valid
       CPU 0/KVM-9196  [001]308219691880170: kvm_apic_accept_irq:  apicid 0 vec 236 (Fixed|edge)
       CPU 0/KVM-9196  [001]308219691884210: kvm_pv_tlb_flush:     vcpu 0 need_flush_tlb false


 == Guest (trace-(null).dat):
           <...>-12068 [000]118613084443085: sched_stat_runtime:   comm=trace-cmd pid=12068 runtime=97403 [ns] vruntime=5101562006274 [ns]
           <...>-12068 [000]118613084451479: sched_switch:         prev_comm=trace-cmd prev_pid=12068 prev_prio=120 prev_state=S ==> next_comm=trace-cmd next_pid=12070 next_prio=120
           <...>-12070 [000]118613084662730: sched_stat_runtime:   comm=trace-cmd pid=12070 runtime=55912 [ns] vruntime=5101564964781 [ns]
           <...>-12070 [000]118613084665219: sched_switch:         prev_comm=trace-cmd prev_pid=12070 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
          <idle>-0     [000]118613085002791: sched_waking:         comm=kworker/0:0 pid=10458 prio=120 target_cpu=000
          <idle>-0     [000]118613085006567: sched_wakeup:         comm=kworker/0:0 pid=10458 prio=120 target_cpu=000
          <idle>-0     [000]118613085011740: sched_switch:         prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:0 next_pid=10458 next_prio=120
     kworker/0:0-10458 [000]118613085023034: sched_waking:         comm=trace-cmd pid=12070 prio=120 target_cpu=000
     kworker/0:0-10458 [000]118613085023774: sched_stat_runtime:   comm=kworker/0:0 pid=10458 runtime=4848 [ns] vruntime=5101561969629 [ns]
     kworker/0:0-10458 [000]118613085025356: sched_wakeup:         comm=trace-cmd pid=12070 prio=120 target_cpu=000

So, the timestamps seems still a bit apart to me. Is this fine, and
hence there is some more post-processing that I should do (and, if yes,
what?). Or should they be already in sync?

I've also tried to open trace.dat in kernelshark2, and then append
trace-(null).dat but from the few that I see (as said in my other
email, kernelshark2 is not fully functional for me) the events from the
two traces are still completely separated.

I also tried with the following, but it's still the same:
# echo kvm-clock > /sys/devices/[...]/current_clocksource in the guest

Then, I've removed the last patch of the series, the one for "KVM
clock", after which, I was correctly (I think) seeing:

"Negotiated ptp time sync protocol with guest (null)"

But then again:

== Host:
       CPU 0/KVM-9196  [001]249509851092735: kvm_exit:             vcpu 0 reason PREEMPTION_TIMER rip 0x7f0da050b357 info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
       CPU 0/KVM-9196  [001]249509851096316: kvm_apic_accept_irq:  apicid 0 vec 236 (Fixed|edge)
       CPU 0/KVM-9196  [001]249509851112525: kvm_exit:             vcpu 0 reason MSR_WRITE rip 0xffffffffac475164 info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
       CPU 0/KVM-9196  [001]249509851114034: kvm_hv_timer_state:   vcpu_id 0 hv_timer 1
       CPU 0/KVM-9196  [001]249509851115032: kvm_msr:              msr_write 6e0 = 0x367b512478b0
           sleep-26457 [000]249509851765258: sched_process_exec:   filename=/usr/bin/sleep pid=26457 old_pid=26457
       CPU 0/KVM-9196  [001]249509853247850: kvm_exit:             vcpu 0 reason EXTERNAL_INTERRUPT rip 0xffffffffad00015d info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x800000ec error_code 0x00000000
       CPU 0/KVM-9196  [001]249509853250399: hrtimer_cancel:       hrtimer=0xffff96709fc61980

== Guest:
           <...>-12891 [000]125778006088605: sched_stat_runtime:   comm=trace-cmd pid=12891 runtime=84358 [ns] vruntime=5110265291610 [ns]
           <...>-12891 [000]125778006095698: sched_switch:         prev_comm=trace-cmd prev_pid=12891 prev_prio=120 prev_state=S ==> next_comm=trace-cmd next_pid=12893 next_prio=120
           <...>-12893 [000]125778006320003: sched_stat_runtime:   comm=trace-cmd pid=12893 runtime=58448 [ns] vruntime=5110268246764 [ns]
           <...>-12893 [000]125778006322729: sched_switch:         prev_comm=trace-cmd prev_pid=12893 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
          <idle>-0     [000]125778021774505: hrtimer_cancel:       hrtimer=0xffff99d67bc21980
          <idle>-0     [000]125778021776481: hrtimer_expire_entry: hrtimer=0xffff99d67bc21980 function=tick_sched_timer now=31370890644808
          <idle>-0     [000]125778021782895: hrtimer_expire_exit:  hrtimer=0xffff99d67bc21980

So, what am I missing? :-)

And, of course, thanks a lot again!

Tzvetomir Stoyanov (VMware) Feb. 9, 2021, 1 p.m. UTC | #8

On Tue, Feb 9, 2021 at 2:24 PM Dario Faggioli <dfaggioli@suse.com> wrote:
>
> On Tue, 2021-02-09 at 13:44 +0200, Tzvetomir Stoyanov wrote:
> > Hi Dario,
> >
> Hi again,
>
> > On Tue, Feb 9, 2021 at 9:28 AM Dario Faggioli <dfaggioli@suse.com>
> > wrote:
> > >
> > > Or at least, host and guest talk, and I see this: "Negotiated kvm
> > > time
> > > sync protocol with guest (null)" (is that "(null)" ok/intentional?
> > > I
> > > guess I can check the code... :-))
> >
> > That null looks like an error, the name of the guest VM should be
> > printed there.
> >
> Yep, in fact, as I said, I tired this before and --now that I think
> about it-- I do remember that the guest "name" was what was there.
>
> BTW, if I can pester you a little bit more, now that it works I'm
> trying to use this, and here's what I'm doing.
>
> == VM:
> # echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
> # echo x86-tsc > /sys/kernel/tracing/trace_clock
> # trace-cmd agent
>
> == Host:
> # echo x86-tsc > /sys/kernel/tracing/trace_clock
> # trace-cmd record -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:823 -e sched:* -e syscalls:*_clock_nanosleep -e timer:hrtimer* sleep 1
>
> And I do end up with the two files: trace.dat for the host and trace-
> (null).dat (oh, well :-D) for the guest.
>
> Now, just `trace-cmd report`-ing them, I get this (just some small
> fragments, of course).
>
> == Host (trace.dat):
>            <...>-24791 [001]308219686938467: sched_switch:         prev_comm=sleep prev_pid=24791 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120
>           <idle>-0     [001]308219691836595: hrtimer_cancel:       hrtimer=0xffff96694faec710
>           <idle>-0     [001]308219691837544: hrtimer_expire_entry: hrtimer=0xffff96694faec710 function=apic_timer_fn now=76856318840891
>           <idle>-0     [001]308219691840271: sched_waking:         comm=CPU 0/KVM pid=9196 prio=120 target_cpu=001
>           <idle>-0     [001]308219691849474: sched_wakeup:         comm=CPU 0/KVM pid=9196 prio=120 target_cpu=001
>           <idle>-0     [001]308219691850914: hrtimer_expire_exit:  hrtimer=0xffff96694faec710
>           <idle>-0     [001]308219691856548: sched_switch:         prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=CPU 0/KVM next_pid=9196 next_prio=120
>        CPU 0/KVM-9196  [001]308219691876512: kvm_vcpu_wakeup:      wait time 2352265 ns, polling valid
>        CPU 0/KVM-9196  [001]308219691880170: kvm_apic_accept_irq:  apicid 0 vec 236 (Fixed|edge)
>        CPU 0/KVM-9196  [001]308219691884210: kvm_pv_tlb_flush:     vcpu 0 need_flush_tlb false
>
>
>  == Guest (trace-(null).dat):
>            <...>-12068 [000]118613084443085: sched_stat_runtime:   comm=trace-cmd pid=12068 runtime=97403 [ns] vruntime=5101562006274 [ns]
>            <...>-12068 [000]118613084451479: sched_switch:         prev_comm=trace-cmd prev_pid=12068 prev_prio=120 prev_state=S ==> next_comm=trace-cmd next_pid=12070 next_prio=120
>            <...>-12070 [000]118613084662730: sched_stat_runtime:   comm=trace-cmd pid=12070 runtime=55912 [ns] vruntime=5101564964781 [ns]
>            <...>-12070 [000]118613084665219: sched_switch:         prev_comm=trace-cmd prev_pid=12070 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>           <idle>-0     [000]118613085002791: sched_waking:         comm=kworker/0:0 pid=10458 prio=120 target_cpu=000
>           <idle>-0     [000]118613085006567: sched_wakeup:         comm=kworker/0:0 pid=10458 prio=120 target_cpu=000
>           <idle>-0     [000]118613085011740: sched_switch:         prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:0 next_pid=10458 next_prio=120
>      kworker/0:0-10458 [000]118613085023034: sched_waking:         comm=trace-cmd pid=12070 prio=120 target_cpu=000
>      kworker/0:0-10458 [000]118613085023774: sched_stat_runtime:   comm=kworker/0:0 pid=10458 runtime=4848 [ns] vruntime=5101561969629 [ns]
>      kworker/0:0-10458 [000]118613085025356: sched_wakeup:         comm=trace-cmd pid=12070 prio=120 target_cpu=000
>
> So, the timestamps seems still a bit apart to me. Is this fine, and
> hence there is some more post-processing that I should do (and, if yes,
> what?). Or should they be already in sync?

The timestamps, recorded in both files are not synchronised. That's
why you are seeing the original timestamps. The offsets between both
clocks are recorded in the metadata of the guest trace file. You can verify
this using "trace-cmd dump --options trace-guest.dat" and
"trace-cmd dump --options trace.dat"
When both files are opened together, then that metadata is used to align
guest's timestamps to the host time. Currently trace-cmd report does not
have support for opening more trace files as a session, yet. This is in our
short todo list.

>
> I've also tried to open trace.dat in kernelshark2, and then append
> trace-(null).dat but from the few that I see (as said in my other
> email, kernelshark2 is not fully functional for me) the events from the
> two traces are still completely separated.

Currently using KernelShark2-beta is the only way to display host and
guests tracing files, as a session. There is logic for synchronising timestamps,
using metadata from the files. What repo are you using for getting the
KernelShark2-beta ?
Yordan can give more details on that.

>
> I also tried with the following, but it's still the same:
> # echo kvm-clock > /sys/devices/[...]/current_clocksource in the guest
>
> Then, I've removed the last patch of the series, the one for "KVM
> clock", after which, I was correctly (I think) seeing:
>
> "Negotiated ptp time sync protocol with guest (null)"

That "null" really bothers me, as I cannot reproduce it. The logic for
getting the
guest name, cid and port uses "/proc/<qemu pid>/cmdline". Please, can you send
me that string, from /proc/... or just "ps afx | grep qemu", to see
what is wrong with
our parsing logic.
Also, if you have any ideas how to get name, cid and port of running
libvirt VMs -
please share. The current approach parsing the qemu command line is a work
around, not a final solution.

Thanks!

>
> But then again:
>
> == Host:
>        CPU 0/KVM-9196  [001]249509851092735: kvm_exit:             vcpu 0 reason PREEMPTION_TIMER rip 0x7f0da050b357 info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
>        CPU 0/KVM-9196  [001]249509851096316: kvm_apic_accept_irq:  apicid 0 vec 236 (Fixed|edge)
>        CPU 0/KVM-9196  [001]249509851112525: kvm_exit:             vcpu 0 reason MSR_WRITE rip 0xffffffffac475164 info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
>        CPU 0/KVM-9196  [001]249509851114034: kvm_hv_timer_state:   vcpu_id 0 hv_timer 1
>        CPU 0/KVM-9196  [001]249509851115032: kvm_msr:              msr_write 6e0 = 0x367b512478b0
>            sleep-26457 [000]249509851765258: sched_process_exec:   filename=/usr/bin/sleep pid=26457 old_pid=26457
>        CPU 0/KVM-9196  [001]249509853247850: kvm_exit:             vcpu 0 reason EXTERNAL_INTERRUPT rip 0xffffffffad00015d info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x800000ec error_code 0x00000000
>        CPU 0/KVM-9196  [001]249509853250399: hrtimer_cancel:       hrtimer=0xffff96709fc61980
>
> == Guest:
>            <...>-12891 [000]125778006088605: sched_stat_runtime:   comm=trace-cmd pid=12891 runtime=84358 [ns] vruntime=5110265291610 [ns]
>            <...>-12891 [000]125778006095698: sched_switch:         prev_comm=trace-cmd prev_pid=12891 prev_prio=120 prev_state=S ==> next_comm=trace-cmd next_pid=12893 next_prio=120
>            <...>-12893 [000]125778006320003: sched_stat_runtime:   comm=trace-cmd pid=12893 runtime=58448 [ns] vruntime=5110268246764 [ns]
>            <...>-12893 [000]125778006322729: sched_switch:         prev_comm=trace-cmd prev_pid=12893 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>           <idle>-0     [000]125778021774505: hrtimer_cancel:       hrtimer=0xffff99d67bc21980
>           <idle>-0     [000]125778021776481: hrtimer_expire_entry: hrtimer=0xffff99d67bc21980 function=tick_sched_timer now=31370890644808
>           <idle>-0     [000]125778021782895: hrtimer_expire_exit:  hrtimer=0xffff99d67bc21980
>
> So, what am I missing? :-)
>
> And, of course, thanks a lot again!
> --
> Dario Faggioli, Ph.D
> http://about.me/dario.faggioli
> Virtualization Software Engineer
> SUSE Labs, SUSE https://www.suse.com/
> -------------------------------------------------------------------
> <<This happens because _I_ choose it to happen!>> (Raistlin Majere)

Dario Faggioli Feb. 9, 2021, 3:28 p.m. UTC | #9

On Tue, 2021-02-09 at 15:00 +0200, Tzvetomir Stoyanov wrote:
> On Tue, Feb 9, 2021 at 2:24 PM Dario Faggioli <dfaggioli@suse.com>
> wrote:
> > 
> > So, the timestamps seems still a bit apart to me. Is this fine, and
> > hence there is some more post-processing that I should do (and, if
> > yes,
> > what?). Or should they be already in sync?
> 
> The timestamps, recorded in both files are not synchronised. That's
> why you are seeing the original timestamps. The offsets between both
> clocks are recorded in the metadata of the guest trace file. 
>
Ah, sure! So I was right about me being missing something: I was
missing this. :-)

> You can verify
> this using "trace-cmd dump --options trace-guest.dat" and
> "trace-cmd dump --options trace.dat"
> When both files are opened together, then that metadata is used to
> align
> guest's timestamps to the host time.
>
Right.

>  Currently trace-cmd report does not
> have support for opening more trace files as a session, yet. This is
> in our
> short todo list.
> 
Ok, is there a WiP already, or someone working on it? I'm asking
because maybe I can help.

In fact, I am helping some University students with their theses, and
I'm planning to have them doing something in the area of virtualization
tracing. In fact, I will have quite a few other questions in the
upcoming days (hoping you guys don't mind)... But, IAC, something like
this could be a place from where to start.

> 
> Currently using KernelShark2-beta is the only way to display host and
> guests tracing files, as a session. There is logic for synchronising
> timestamps,
> using metadata from the files. What repo are you using for getting
> the
> KernelShark2-beta ?
> Yordan can give more details on that.
> 
Yes, I was already using the right repo (https://github.com/yordan-
karadzhov/kernel-shark-v2.beta ) but missed a rebase/update. And in
fact, Yordan pointed me to it, and know things are a lot better (still
not perfect, but that's for the other thread).

> > "Negotiated ptp time sync protocol with guest (null)"
> 
> That "null" really bothers me, as I cannot reproduce it. The logic
> for
> getting the
> guest name, cid and port uses "/proc/<qemu pid>/cmdline". Please, can
> you send
> me that string, from /proc/... 
>
Sure. Brace yourself, though. It's a qemu command line, and you know
that qemu command line are, well, a bit long ;-P

/usr/bin/qemu-system-x86_64-nameguest=tumbleweed-jeos,debug-threads=on-
S-
objectsecret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain
-1-tumbleweed-jeos/master-key.aes-machinepc-q35-
5.2,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram-
cpuSkylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-
adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-
capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-
ssbd=on,pschange-mc-no=on,pku=off-m4096-objectmemory-backend-
ram,id=pc.ram,size=4294967296-overcommitmem-lock=off-
smp1,sockets=1,cores=1,threads=1-uuid57a61f7b-506e-49d8-a8d7-
63b5baae8534-no-user-config-nodefaults-
chardevsocket,id=charmonitor,fd=29,server,nowait-
monchardev=charmonitor,id=monitor,mode=control-
rtcbase=utc,driftfix=slew-globalkvm-pit.lost_tick_policy=delay-no-hpet-
no-shutdown-globalICH9-LPC.disable_s3=1-globalICH9-LPC.disable_s4=1-
bootstrict=on-devicepcie-root-
port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2-
devicepcie-root-
port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1-devicepcie-
root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2-
devicepcie-root-
port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3-devicepcie-
root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4-
devicepcie-root-
port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5-devicepcie-
root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6-
deviceqemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0-devicevirtio-
serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0-
blockdev{"driver":"file","filename":"/var/lib/libvirt/images/openSUSE-
Tumbleweed-JeOS.x86_64-15.1.0-kvm-and-xen-
Snapshot20210205.qcow2","node-name":"libvirt-1-storage","auto-read-
only":true,"discard":"unmap"}-blockdev{"node-name":"libvirt-1-
format","read-only":false,"driver":"qcow2","file":"libvirt-1-
storage","backing":null}-devicevirtio-blk-
pci,bus=pci.4,addr=0x0,drive=libvirt-1-format,id=virtio-
disk0,bootindex=1-netdevtap,fd=31,id=hostnet0,vhost=on,vhostfd=32-
devicevirtio-net-
pci,netdev=hostnet0,id=net0,mac=52:54:00:77:d6:ea,bus=pci.1,addr=0x0-
chardevpty,id=charserial0-deviceisa-
serial,chardev=charserial0,id=serial0-
chardevsocket,id=charchannel0,fd=33,server,nowait-
devicevirtserialport,bus=virtio-
serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_age
nt.0-deviceusb-tablet,id=input0,bus=usb.0,port=1-
spiceport=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on-
deviceqxl-
vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vga
mem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1-devicevirtio-balloon-
pci,id=balloon0,bus=pci.5,addr=0x0-objectrng-
random,id=objrng0,filename=/dev/urandom-devicevirtio-rng-
pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0-
sandboxon,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontr
ol=deny-devicevhost-vsock-pci,id=vsock0,guest-
cid=3,vhostfd=26,bus=pci.7,addr=0x0-msgtimestamp=ontoolbox-dario

> or just "ps afx | grep qemu", 
>
Also a mess. But here you go:

# ps afx | grep qemu
12952 ?        S      0:00  |   |   \_ ssh -l root -T -e none --
192.168.0.31 sh -c 'which virt-ssh-helper 1>/dev/null 2>&1; if test $?
= 0; then     virt-ssh-helper 'qemu:///system'; else    if 'nc' -q 2>&1
| grep "requires an argument" >/dev/null 2>&1; then ARG=-q0;else
ARG=;fi;'nc' $ARG -U /var/run/libvirt/libvirt-sock; fi'
 8758 pts/9    S+     0:00  |   |   |               \_ grep --
color=auto qemu
 9168 ?        Sl   224:55  |   \_ /usr/bin/qemu-system-x86_64 -name
guest=tumbleweed-jeos,debug-threads=on -S -object
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-
tumbleweed-jeos/master-key.aes -machine pc-q35-
5.2,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram -cpu
Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-
adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-
capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-
ssbd=on,pschange-mc-no=on,pku=off -m 4096 -object memory-backend-
ram,id=pc.ram,size=4294967296 -overcommit mem-lock=off -smp
1,sockets=1,cores=1,threads=1 -uuid 57a61f7b-506e-49d8-a8d7-
63b5baae8534 -no-user-config -nodefaults -chardev
socket,id=charmonitor,fd=29,server,nowait -mon
chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew
-global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global
ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -
device pcie-root-
port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2
-device pcie-root-
port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-
root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device
pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -
device pcie-root-
port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-
root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device
pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -
device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-
serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev
{"driver":"file","filename":"/var/lib/libvirt/images/openSUSE-
Tumbleweed-JeOS.x86_64-15.1.0-kvm-and-xen-
Snapshot20210205.qcow2","node-name":"libvirt-1-storage","auto-read-
only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-
format","read-only":false,"driver":"qcow2","file":"libvirt-1-
storage","backing":null} -device virtio-blk-
pci,bus=pci.4,addr=0x0,drive=libvirt-1-format,id=virtio-
disk0,bootindex=1 -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=32 -
device virtio-net-
pci,netdev=hostnet0,id=net0,mac=52:54:00:77:d6:ea,bus=pci.1,addr=0x0 -
chardev pty,id=charserial0 -device isa-
serial,chardev=charserial0,id=serial0 -chardev
socket,id=charchannel0,fd=33,server,nowait -device
virtserialport,bus=virtio-
serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_age
nt.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice
port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device
qxl-
vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vga
mem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device virtio-balloon-
pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-
random,id=objrng0,filename=/dev/urandom -device virtio-rng-
pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox
on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny
-device vhost-vsock-pci,id=vsock0,guest-
cid=3,vhostfd=26,bus=pci.7,addr=0x0 -msg timestamp=on

It's a bit of an unusual setup, as the VM (and hence QEMU too) is
running inside a podman container. But it seems the info you're after
is indeed present on the commandline.

> to see
> what is wrong with
> our parsing logic.
> Also, if you have any ideas how to get name, cid and port of running
> libvirt VMs -
> please share. The current approach parsing the qemu command line is a
> work
> around, not a final solution.
> 
Mmm... not sure. I'll have a look at the code where you go and fetch
it, and see if anything comes to mind.

Thanks and Regards

Tzvetomir Stoyanov (VMware) Feb. 9, 2021, 5:01 p.m. UTC | #10

On Tue, Feb 9, 2021 at 5:28 PM Dario Faggioli <dfaggioli@suse.com> wrote:
>
> On Tue, 2021-02-09 at 15:00 +0200, Tzvetomir Stoyanov wrote:
> > On Tue, Feb 9, 2021 at 2:24 PM Dario Faggioli <dfaggioli@suse.com>
> > wrote:
> > >
> > > So, the timestamps seems still a bit apart to me. Is this fine, and
> > > hence there is some more post-processing that I should do (and, if
> > > yes,
> > > what?). Or should they be already in sync?
> >
> > The timestamps, recorded in both files are not synchronised. That's
> > why you are seeing the original timestamps. The offsets between both
> > clocks are recorded in the metadata of the guest trace file.
> >
> Ah, sure! So I was right about me being missing something: I was
> missing this. :-)
>
> > You can verify
> > this using "trace-cmd dump --options trace-guest.dat" and
> > "trace-cmd dump --options trace.dat"
> > When both files are opened together, then that metadata is used to
> > align
> > guest's timestamps to the host time.
> >
> Right.
>
> >  Currently trace-cmd report does not
> > have support for opening more trace files as a session, yet. This is
> > in our
> > short todo list.
> >
> Ok, is there a WiP already, or someone working on it? I'm asking
> because maybe I can help.

None is working on that yet, any help is highly appreciated :) I just
created an issue in Bugzilla  for that:
https://bugzilla.kernel.org/show_bug.cgi?id=211657

>
> In fact, I am helping some University students with their theses, and
> I'm planning to have them doing something in the area of virtualization
> tracing. In fact, I will have quite a few other questions in the
> upcoming days (hoping you guys don't mind)... But, IAC, something like
> this could be a place from where to start.
>
> >
> > Currently using KernelShark2-beta is the only way to display host and
> > guests tracing files, as a session. There is logic for synchronising
> > timestamps,
> > using metadata from the files. What repo are you using for getting
> > the
> > KernelShark2-beta ?
> > Yordan can give more details on that.
> >
> Yes, I was already using the right repo (https://github.com/yordan-
> karadzhov/kernel-shark-v2.beta ) but missed a rebase/update. And in
> fact, Yordan pointed me to it, and know things are a lot better (still
> not perfect, but that's for the other thread).
>
> > > "Negotiated ptp time sync protocol with guest (null)"
> >
> > That "null" really bothers me, as I cannot reproduce it. The logic
> > for
> > getting the
> > guest name, cid and port uses "/proc/<qemu pid>/cmdline". Please, can
> > you send
> > me that string, from /proc/...
> >
> Sure. Brace yourself, though. It's a qemu command line, and you know
> that qemu command line are, well, a bit long ;-P
>
> /usr/bin/qemu-system-x86_64-nameguest=tumbleweed-jeos,debug-threads=on-

That "-nameguest=tumbleweed-jeos" is the thing that we are looking for
while trying
to figure out the name of the quest, but looks like we should improve
our parser or
better find a more reliable way for getting that information.

> S-
> objectsecret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain
> -1-tumbleweed-jeos/master-key.aes-machinepc-q35-
> 5.2,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram-
> cpuSkylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-
> adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-
> capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-
> ssbd=on,pschange-mc-no=on,pku=off-m4096-objectmemory-backend-
> ram,id=pc.ram,size=4294967296-overcommitmem-lock=off-
> smp1,sockets=1,cores=1,threads=1-uuid57a61f7b-506e-49d8-a8d7-
> 63b5baae8534-no-user-config-nodefaults-
> chardevsocket,id=charmonitor,fd=29,server,nowait-
> monchardev=charmonitor,id=monitor,mode=control-
> rtcbase=utc,driftfix=slew-globalkvm-pit.lost_tick_policy=delay-no-hpet-
> no-shutdown-globalICH9-LPC.disable_s3=1-globalICH9-LPC.disable_s4=1-
> bootstrict=on-devicepcie-root-
> port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2-
> devicepcie-root-
> port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1-devicepcie-
> root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2-
> devicepcie-root-
> port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3-devicepcie-
> root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4-
> devicepcie-root-
> port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5-devicepcie-
> root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6-
> deviceqemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0-devicevirtio-
> serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0-
> blockdev{"driver":"file","filename":"/var/lib/libvirt/images/openSUSE-
> Tumbleweed-JeOS.x86_64-15.1.0-kvm-and-xen-
> Snapshot20210205.qcow2","node-name":"libvirt-1-storage","auto-read-
> only":true,"discard":"unmap"}-blockdev{"node-name":"libvirt-1-
> format","read-only":false,"driver":"qcow2","file":"libvirt-1-
> storage","backing":null}-devicevirtio-blk-
> pci,bus=pci.4,addr=0x0,drive=libvirt-1-format,id=virtio-
> disk0,bootindex=1-netdevtap,fd=31,id=hostnet0,vhost=on,vhostfd=32-
> devicevirtio-net-
> pci,netdev=hostnet0,id=net0,mac=52:54:00:77:d6:ea,bus=pci.1,addr=0x0-
> chardevpty,id=charserial0-deviceisa-
> serial,chardev=charserial0,id=serial0-
> chardevsocket,id=charchannel0,fd=33,server,nowait-
> devicevirtserialport,bus=virtio-
> serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_age
> nt.0-deviceusb-tablet,id=input0,bus=usb.0,port=1-
> spiceport=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on-
> deviceqxl-
> vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vga
> mem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1-devicevirtio-balloon-
> pci,id=balloon0,bus=pci.5,addr=0x0-objectrng-
> random,id=objrng0,filename=/dev/urandom-devicevirtio-rng-
> pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0-
> sandboxon,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontr
> ol=deny-devicevhost-vsock-pci,id=vsock0,guest-
> cid=3,vhostfd=26,bus=pci.7,addr=0x0-msgtimestamp=ontoolbox-dario
>
> > or just "ps afx | grep qemu",
> >
> Also a mess. But here you go:
>
> # ps afx | grep qemu
> 12952 ?        S      0:00  |   |   \_ ssh -l root -T -e none --
> 192.168.0.31 sh -c 'which virt-ssh-helper 1>/dev/null 2>&1; if test $?
> = 0; then     virt-ssh-helper 'qemu:///system'; else    if 'nc' -q 2>&1
> | grep "requires an argument" >/dev/null 2>&1; then ARG=-q0;else
> ARG=;fi;'nc' $ARG -U /var/run/libvirt/libvirt-sock; fi'
>  8758 pts/9    S+     0:00  |   |   |               \_ grep --
> color=auto qemu
>  9168 ?        Sl   224:55  |   \_ /usr/bin/qemu-system-x86_64 -name
> guest=tumbleweed-jeos,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-
> tumbleweed-jeos/master-key.aes -machine pc-q35-
> 5.2,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram -cpu
> Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-
> adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-
> capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-
> ssbd=on,pschange-mc-no=on,pku=off -m 4096 -object memory-backend-
> ram,id=pc.ram,size=4294967296 -overcommit mem-lock=off -smp
> 1,sockets=1,cores=1,threads=1 -uuid 57a61f7b-506e-49d8-a8d7-
> 63b5baae8534 -no-user-config -nodefaults -chardev
> socket,id=charmonitor,fd=29,server,nowait -mon
> chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew
> -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global
> ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -
> device pcie-root-
> port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2
> -device pcie-root-
> port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-
> root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device
> pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -
> device pcie-root-
> port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-
> root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device
> pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -
> device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virti876714o-
> serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev
> {"driver":"file","filename":"/var/lib/libvirt/images/openSUSE-
> Tumbleweed-JeOS.x86_64-15.1.0-kvm-and-xen-
> Snapshot20210205.qcow2","node-name":"libvirt-1-storage","auto-read-
> only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-
> format","read-only":false,"driver":"qcow2","file":"libvirt-1-
> storage","backing":null} -device virtio-blk-
> pci,bus=pci.4,addr=0x0,drive=libvirt-1-format,id=virtio-
> disk0,bootindex=1 -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=32 -
> device virtio-net-
> pci,netdev=hostnet0,id=net0,mac=52:54:00:77:d6:ea,bus=pci.1,addr=0x0 -
> chardev pty,id=charserial0 -device isa-
> serial,chardev=charserial0,id=serial0 -chardev
> socket,id=charchannel0,fd=33,server,nowait -device
> virtserialport,bus=virtio-
> serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_age
> nt.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice
> port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device
> qxl-
> vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vga
> mem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device virtio-balloon-
> pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-
> random,id=objrng0,filename=/dev/urandom -device virtio-rng-
> pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox
> on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny
> -device vhost-vsock-pci,id=vsock0,guest-
> cid=3,vhostfd=26,bus=pci.7,addr=0x0 -msg timestamp=on
>
> It's a bit of an unusual setup, as the VM (and hence QEMU too) is
> running inside a podman container. But it seems the info you're after
> is indeed present on the commandline.
>
> > to see
> > what is wrong with
> > our parsing logic.
> > Also, if you have any ideas how to get name, cid and port of running
> > libvirt VMs -
> > please share. The current approach parsing the qemu command line is a
> > work
> > around, not a final solution.
> >
> Mmm... not sure. I'll have a look at the code where you go and fetch
> it, and see if anything comes to mind.

Thanks a lot, Dario!

>
> Thanks and Regards
> --
> Dario Faggioli, Ph.D
> http://about.me/dario.faggioli
> Virtualization Software Engineer
> SUSE Labs, SUSE https://www.suse.com/
> -------------------------------------------------------------------
> <<This happens because _I_ choose it to happen!>> (Raistlin Majere)

Steven Rostedt Feb. 10, 2021, 9:59 p.m. UTC | #11

On Tue, 09 Feb 2021 13:24:23 +0100
Dario Faggioli <dfaggioli@suse.com> wrote:
> 
> BTW, if I can pester you a little bit more, now that it works I'm
> trying to use this, and here's what I'm doing.
> 
> == VM:
> # echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
> # echo x86-tsc > /sys/kernel/tracing/trace_clock
> # trace-cmd agent
> 
> == Host:
> # echo x86-tsc > /sys/kernel/tracing/trace_clock
> # trace-cmd record -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:823 -e sched:* -e syscalls:*_clock_nanosleep -e timer:hrtimer* sleep 1
>

Note you can use -C x86-tsc on the command line which would handle the
clocks for you. I believe it may update the agent as well, but if not, you
can add it to the agent as well. Also "-p nop" shouldn't be needed, as that
would happen automatically if "-p" is left off.

 trace-cmd record -e sched -e kvm -e 'hrtimer*' -C x86-tsc \
   -A tumbleweed-jeos:823 -e sched -e '*_clock_nanosleep' -e 'hrtimer*' \
   sleep 1

should be equivalent. You may need to add '-C x86-tsc' after the -A, but I
think we made it match the host if a clock was supplied.

> And I do end up with the two files: trace.dat for the host and trace-
> (null).dat (oh, well :-D) for the guest.

I believe my latest push should fix that ;-)

-- Steve

Steven Rostedt Feb. 10, 2021, 10:02 p.m. UTC | #12

On Tue, 9 Feb 2021 19:01:04 +0200
Tzvetomir Stoyanov <tz.stoyanov@gmail.com> wrote:

> > Ok, is there a WiP already, or someone working on it? I'm asking
> > because maybe I can help.  
> 
> None is working on that yet, any help is highly appreciated :) I just
> created an issue in Bugzilla  for that:
> https://bugzilla.kernel.org/show_bug.cgi?id=211657

I had the below patch locally for some time now. It's a somewhat a hack, but
should work. But this should be done properly. The hack below works if you
include only two files, the host first followed by the guest:

 trace-cmd report -i trace-host.dat -i trace-guest.dat

Also, the following patch should be applied as well:

 https://patchwork.kernel.org/project/linux-trace-devel/patch/20210210164543.5ecd9dbd@gandalf.local.home/

-- Steve

Index: trace-cmd.git/tracecmd/trace-read.c
===================================================================
--- trace-cmd.git.orig/tracecmd/trace-read.c
+++ trace-cmd.git/tracecmd/trace-read.c
@@ -339,6 +339,9 @@ static void add_input(const char *file)
 	last_input_file = item;
 }
 
+static struct tracecmd_input *host;
+static struct tracecmd_input *guest;
+
 static void add_handle(struct tracecmd_input *handle, const char *file)
 {
 	struct handle_list *item;
@@ -358,6 +361,12 @@ static void add_handle(struct tracecmd_i
 			max_file_size = strlen(item->file);
 	}
 	list_add_tail(&item->list, &handle_list);
+	if (handle_list.next != &item->list) {
+		item = container_of(handle_list.next,
+				    struct handle_list, list);
+		host = item->handle;
+		guest = handle;
+	}
 }
 
 static void free_inputs(void)
@@ -1283,6 +1292,8 @@ static void read_data_info(struct list_h
 		}
 	}
 
+	tracecmd_pair_peer(guest, host);
+
 	if (otype != OUTPUT_NORMAL)
 		return;

Dario Faggioli Feb. 10, 2021, 11:33 p.m. UTC | #13

On Wed, 2021-02-10 at 16:59 -0500, Steven Rostedt wrote:
> On Tue, 09 Feb 2021 13:24:23 +0100
> Dario Faggioli <dfaggioli@suse.com> wrote:
> > 
> > == VM:
> > # echo tsc >
> > /sys/devices/system/clocksource/clocksource0/current_clocksource
> > # echo x86-tsc > /sys/kernel/tracing/trace_clock
> > # trace-cmd agent
> > 
> > == Host:
> > # echo x86-tsc > /sys/kernel/tracing/trace_clock
> > # trace-cmd record -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A
> > tumbleweed-jeos:823 -e sched:* -e syscalls:*_clock_nanosleep -e
> > timer:hrtimer* sleep 1
> > 
> 
> Note you can use -C x86-tsc on the command line which would handle
> the
> clocks for you. I believe it may update the agent as well, but if
> not, you
> can add it to the agent as well. Also "-p nop" shouldn't be needed,
> as that
> would happen automatically if "-p" is left off.
> 
>  trace-cmd record -e sched -e kvm -e 'hrtimer*' -C x86-tsc \
>    -A tumbleweed-jeos:823 -e sched -e '*_clock_nanosleep' -e
> 'hrtimer*' \
>    sleep 1
> 
Right. That's better indeed.

> > And I do end up with the two files: trace.dat for the host and
> > trace-
> > (null).dat (oh, well :-D) for the guest.
> 
> I believe my latest push should fix that ;-)
> 
Yes, 27c464a74f8f54d0 (trace-cmd: Use instance->name for guests and not
tracefs instance name) fixes this for me.

I was also trying to do some debugging, out of curiosity. And in fact I
was not understanding where the name was being copied in the place
where tracefs_instance_get_name() looks for it. :-)

Thanks!

Dario Faggioli Feb. 10, 2021, 11:35 p.m. UTC | #14

On Wed, 2021-02-10 at 17:02 -0500, Steven Rostedt wrote:
> On Tue, 9 Feb 2021 19:01:04 +0200
> Tzvetomir Stoyanov <tz.stoyanov@gmail.com> wrote:
> > 
> > None is working on that yet, any help is highly appreciated :) I
> > just
> > created an issue in Bugzilla  for that:
> > https://bugzilla.kernel.org/show_bug.cgi?id=211657
> 
> I had the below patch locally for some time now. It's a somewhat a
> hack, but
> should work. But this should be done properly. The hack below works
> if you
> include only two files, the host first followed by the guest:
> 
>  trace-cmd report -i trace-host.dat -i trace-guest.dat
> 
Ah, very useful indeed, if only as a starting point!

Thanks for sharing this. Much appreciated. :-)

Regards

[v28,0/8] Timestamp synchronization of host - guest tracing session

Message

Comments