Message ID | 20210315061857.168570-1-tz.stoyanov@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Timestamp synchronization of host - guest tracing session | expand |
Hello, On 3/15/21 7:18 AM, Tzvetomir Stoyanov (VMware) wrote: > Basic infrastructure for host - guest timestamp synchronization and a > PoC implementation of PTP-like and KVM algorithms. I'm trying to make some analysis on traces generated by host and guest, so I applied this series and the related dependencies. However, when the KVM algorithm is used, the merging process of the two traces doesn't seem to happen properly. The output of `trace-cmd report -i trace.dat -i trace-tumbleweed.dat` is: trace.dat: cpus=4 trace-tumbleweed.dat: cpus=1 trace.dat: sleep-23948 [002]61124758463654: write_msr: c0000100, value 7f35092875c0 trace.dat: <idle>-0 [000]61124760841508: write_msr: c0000100, value 7f6136b75640 trace.dat: CPU 0/KVM-1640 [000]61124760851620: write_msr: c0011020, value 40000000000000 trace.dat: CPU 0/KVM-1640 [000]61124760861778: read_msr: c0000101, value ffff9b1b76c00000 trace.dat: CPU 0/KVM-1640 [000]61124760862832: read_msr: c0000081, value 23001000000000 trace.dat: CPU 0/KVM-1640 [000]61124760864347: read_msr: c0000082, value ffffffffb5400010 trace.dat: CPU 0/KVM-1640 [000]61124760865225: read_msr: c0000083, value ffffffffb5401670 . . . trace-tumbleweed.dat: IPC I/O Parent-4313 [000]8446232460209009653: write_msr: c0000100, value 7f94535c7640 trace-tumbleweed.dat: IPC I/O Child-4503 [000]8446348845232796661: write_msr: c001011f, value 4 trace-tumbleweed.dat: IPC I/O Child-4503 [000]8447532508154803189: write_msr: c0000100, value 7f945e8d5780 trace-tumbleweed.dat: <idle>-0 [000]8452336523564659701: write_msr: c001011f, value 0 trace-tumbleweed.dat: <idle>-0 [000]8452515224268944373: write_msr: 6e0, value 731478a3bf62 trace-tumbleweed.dat: <idle>-0 [000]8452610851715789813: write_msr: 6e0, value 731478968f37 trace-tumbleweed.dat: <idle>-0 [000]8555757214338495477: write_msr: 6e0, value 731478853ec1 trace-tumbleweed.dat: <idle>-0 [000]8555904475882173429: write_msr: 6e0, value 7314778a60e8 trace-tumbleweed.dat: kworker/0:1-7351 [000]8556355589182174197: write_msr: c0000100, value 7f8373573740 The TimeShift information is inside the guest trace file, so I think this is not a transfer problem. I also tried to remove the last patch, the one related to the KVM clock, in order to use PTP and the two traces are merged correctly. The commands used to record are: Host: # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e msr:* -C x86-tsc sleep 1 Guest: # echo x86-tsc > /sys/kernel/tracing/trace_clock # trace-cmd agent Am I doing something wrong? If necessary, I can provide more info about my setup, or do more tests. Thanks and Regards, Stefano
Hi Stefano, On Fri, Mar 19, 2021 at 12:08 PM Stefano De Venuto <stefano.devenuto99@gmail.com> wrote: > > Hello, > > On 3/15/21 7:18 AM, Tzvetomir Stoyanov (VMware) wrote: > > Basic infrastructure for host - guest timestamp synchronization and a > > PoC implementation of PTP-like and KVM algorithms. > I'm trying to make some analysis on traces generated by host and guest, > so I applied this series and the related dependencies. > > However, when the KVM algorithm is used, the merging process > of the two traces doesn't seem to happen properly. > > The output of `trace-cmd report -i trace.dat -i trace-tumbleweed.dat` is: > > trace.dat: cpus=4 > trace-tumbleweed.dat: cpus=1 > trace.dat: sleep-23948 [002]61124758463654: > write_msr: c0000100, value 7f35092875c0 > trace.dat: <idle>-0 [000]61124760841508: > write_msr: c0000100, value 7f6136b75640 > trace.dat: CPU 0/KVM-1640 [000]61124760851620: > write_msr: c0011020, value 40000000000000 > trace.dat: CPU 0/KVM-1640 [000]61124760861778: > read_msr: c0000101, value ffff9b1b76c00000 > trace.dat: CPU 0/KVM-1640 [000]61124760862832: > read_msr: c0000081, value 23001000000000 > trace.dat: CPU 0/KVM-1640 [000]61124760864347: > read_msr: c0000082, value ffffffffb5400010 > trace.dat: CPU 0/KVM-1640 [000]61124760865225: > read_msr: c0000083, value ffffffffb5401670 > . > . > . > trace-tumbleweed.dat: IPC I/O Parent-4313 [000]8446232460209009653: > write_msr: c0000100, value 7f94535c7640 > trace-tumbleweed.dat: IPC I/O Child-4503 [000]8446348845232796661: > write_msr: c001011f, value 4 > trace-tumbleweed.dat: IPC I/O Child-4503 [000]8447532508154803189: > write_msr: c0000100, value 7f945e8d5780 > trace-tumbleweed.dat: <idle>-0 [000]8452336523564659701: > write_msr: c001011f, value 0 > trace-tumbleweed.dat: <idle>-0 [000]8452515224268944373: > write_msr: 6e0, value 731478a3bf62 > trace-tumbleweed.dat: <idle>-0 [000]8452610851715789813: > write_msr: 6e0, value 731478968f37 > trace-tumbleweed.dat: <idle>-0 [000]8555757214338495477: > write_msr: 6e0, value 731478853ec1 > trace-tumbleweed.dat: <idle>-0 [000]8555904475882173429: > write_msr: 6e0, value 7314778a60e8 > trace-tumbleweed.dat: kworker/0:1-7351 [000]8556355589182174197: > write_msr: c0000100, value 7f8373573740 > > The TimeShift information is inside the guest trace file, > so I think this is not a transfer problem. > > I also tried to remove the last patch, the one related to the KVM clock, > in order to use PTP and the two traces are merged correctly. > > The commands used to record are: > > Host: > # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e > msr:* -C x86-tsc sleep 1 The guest trace clock is set automatically as the host, so this command should be enough: # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e msr:* sleep 1 > > Guest: > # echo x86-tsc > /sys/kernel/tracing/trace_clock There is no need to set manually the guest clock, it will be overwritten by trace-cmd agent. > # trace-cmd agent > > Am I doing something wrong? All your commands and workflow look good, nothing wrong. > > If necessary, I can provide more info about my setup, or do more tests. Yes, please can you send me both host and guest trace files ? Also, it will be useful to send me the content of the KVM debug files: /sys/kernel/debug/kvm/<guest ID>/vcpu<*>/tsc-offset > > > Thanks and Regards, > > Stefano Thanks for testing this code!
On 3/19/21 12:55 PM, Tzvetomir Stoyanov wrote: > Hi Stefano, > > On Fri, Mar 19, 2021 at 12:08 PM Stefano De Venuto > <stefano.devenuto99@gmail.com> wrote: Hi! >> The commands used to record are: >> >> Host: >> # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e >> msr:* -C x86-tsc sleep 1 > The guest trace clock is set automatically as the host, so this > command should be enough: > # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e > msr:* sleep 1 > >> Guest: >> # echo x86-tsc > /sys/kernel/tracing/trace_clock > There is no need to set manually the guest clock, it will be > overwritten by trace-cmd agent. > Thanks so much for the proper way to do it, really appreciated. >> If necessary, I can provide more info about my setup, or do more tests. > Yes, please can you send me both host and guest trace files ? Here are the trace files, host and guest respectively: - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace.dat - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace-tumbleweed.dat > Also, it will be useful to send me the content of the KVM debug files: > /sys/kernel/debug/kvm/<guest ID>/vcpu<*>/tsc-offset The guest has one vcpu (vcpu0) and the content of the file is: 255647917761327 >> >> Thanks and Regards, >> >> Stefano > Thanks for testing this code! > Thanks for your time, Stefano
Hi Stefano, On Fri, Mar 19, 2021 at 7:44 PM Stefano De Venuto <stefano.devenuto99@gmail.com> wrote: > > > > On 3/19/21 12:55 PM, Tzvetomir Stoyanov wrote: > > Hi Stefano, > > > > On Fri, Mar 19, 2021 at 12:08 PM Stefano De Venuto > > <stefano.devenuto99@gmail.com> wrote: > Hi! > >> The commands used to record are: > >> > >> Host: > >> # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e > >> msr:* -C x86-tsc sleep 1 > > The guest trace clock is set automatically as the host, so this > > command should be enough: > > # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e > > msr:* sleep 1 > > > >> Guest: > >> # echo x86-tsc > /sys/kernel/tracing/trace_clock > > There is no need to set manually the guest clock, it will be > > overwritten by trace-cmd agent. > > > Thanks so much for the proper way to do it, really appreciated. > >> If necessary, I can provide more info about my setup, or do more tests. > > Yes, please can you send me both host and guest trace files ? > Here are the trace files, host and guest respectively: > > - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace.dat > - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace-tumbleweed.dat > > > Also, it will be useful to send me the content of the KVM debug files: > > /sys/kernel/debug/kvm/<guest ID>/vcpu<*>/tsc-offset > The guest has one vcpu (vcpu0) and the content of the file is: > > 255647917761327 Looks like there is a scaling between host and guest clocks in your setup, not just a simple offset. We did not test yet our implementation with scaling, although both offset and scaling are part of the calculations. That makes your use case very valuable for us, as we have an opportunity to test it now. And yes, looks like we have a bug here. Please, when you have time, can you repeat again the tracing session and send again both trace files + the content of the KVM debug files: /sys/kernel/debug/kvm/<guest ID>/vcpu0/tsc-offset /sys/kernel/debug/kvm/<guest ID>/vcpu0/tsc-scaling-ratio I'm asking to do a new trace, as most probably these offset and scaling could be different now. Thanks! > >> > >> Thanks and Regards, > >> > >> Stefano > > Thanks for testing this code! > > > Thanks for your time, > > Stefano
On Sat, Mar 20, 2021 at 6:17 PM Stefano De Venuto <stefano.devenuto99@gmail.com> wrote: > > > > On 3/20/21 7:25 AM, Tzvetomir Stoyanov wrote: > > Hi Stefano, > > > > On Fri, Mar 19, 2021 at 7:44 PM Stefano De Venuto > > <stefano.devenuto99@gmail.com> wrote: > >> > >> > >> On 3/19/21 12:55 PM, Tzvetomir Stoyanov wrote: > >>> Hi Stefano, > >>> > >>> On Fri, Mar 19, 2021 at 12:08 PM Stefano De Venuto > >>> <stefano.devenuto99@gmail.com> wrote: > >> Hi! > >>>> The commands used to record are: > >>>> > >>>> Host: > >>>> # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e > >>>> msr:* -C x86-tsc sleep 1 > >>> The guest trace clock is set automatically as the host, so this > >>> command should be enough: > >>> # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e > >>> msr:* sleep 1 > >>> > >>>> Guest: > >>>> # echo x86-tsc > /sys/kernel/tracing/trace_clock > >>> There is no need to set manually the guest clock, it will be > >>> overwritten by trace-cmd agent. > >>> > >> Thanks so much for the proper way to do it, really appreciated. > >>>> If necessary, I can provide more info about my setup, or do more tests. > >>> Yes, please can you send me both host and guest trace files ? > >> Here are the trace files, host and guest respectively: > >> > >> - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace.dat > >> - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace-tumbleweed.dat > >> > >>> Also, it will be useful to send me the content of the KVM debug files: > >>> /sys/kernel/debug/kvm/<guest ID>/vcpu<*>/tsc-offset > >> The guest has one vcpu (vcpu0) and the content of the file is: > >> > >> 255647917761327 > > Looks like there is a scaling between host and guest clocks in your > > setup, not just a simple offset. We did not test yet our > > implementation with scaling, although both offset and scaling are part > > of the calculations. That makes your use case very valuable for us, as > > we have an opportunity to test it now. And yes, looks like we have a > > bug here. > > Please, when you have time, can you repeat again the tracing session > > and send again both trace files + the content of the KVM debug files: > > /sys/kernel/debug/kvm/<guest ID>/vcpu0/tsc-offset > > /sys/kernel/debug/kvm/<guest ID>/vcpu0/tsc-scaling-ratio > > I'm asking to do a new trace, as most probably these offset and > > scaling could be different now. > Yes, the trace files are attached to this mail. > > The content of tsc-offset is: > 453568564244284 > > The content of tsc-scaling-ratio is: > 4294967296 > Confirmed, the problem is in the trace-cmd logic that works with the KVM scaling. Looks like tsc-scaling-ratio is the default KVM scaling on your machine, (1 << 32), and in that case the scaling should be ignored in our calculations. I still have no solution, going to analyze how the KVM scaling works. Thanks, Stafano! > > Thanks and Regards, > > Stefano > > > > > Thanks! > > > >>>> Thanks and Regards, > >>>> > >>>> Stefano > >>> Thanks for testing this code! > >>> > >> Thanks for your time, > >> > >> Stefano > > > > >
On Mon, 22 Mar 2021 12:13:09 +0200 Tzvetomir Stoyanov <tz.stoyanov@gmail.com> wrote: > Confirmed, the problem is in the trace-cmd logic that works with the > KVM scaling. Looks like tsc-scaling-ratio is the default KVM scaling > on your machine, (1 << 32), and in that case the scaling should be > ignored in our calculations. I still have no solution, going to > analyze how the KVM scaling works. I'm guessing that this was run on a AMD CPU, as the default scaling (the one to ignore) is 1 << 32 for AMD and for Intel it's 1 << 48. -- Steve
Hi Stefano, On Sat, Mar 20, 2021 at 6:17 PM Stefano De Venuto <stefano.devenuto99@gmail.com> wrote: > > > > On 3/20/21 7:25 AM, Tzvetomir Stoyanov wrote: > > Hi Stefano, > > > > On Fri, Mar 19, 2021 at 7:44 PM Stefano De Venuto > > <stefano.devenuto99@gmail.com> wrote: > >> > >> > >> On 3/19/21 12:55 PM, Tzvetomir Stoyanov wrote: > >>> Hi Stefano, > >>> > >>> On Fri, Mar 19, 2021 at 12:08 PM Stefano De Venuto > >>> <stefano.devenuto99@gmail.com> wrote: > >> Hi! > >>>> The commands used to record are: > >>>> > >>>> Host: > >>>> # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e > >>>> msr:* -C x86-tsc sleep 1 > >>> The guest trace clock is set automatically as the host, so this > >>> command should be enough: > >>> # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e > >>> msr:* sleep 1 > >>> > >>>> Guest: > >>>> # echo x86-tsc > /sys/kernel/tracing/trace_clock > >>> There is no need to set manually the guest clock, it will be > >>> overwritten by trace-cmd agent. > >>> > >> Thanks so much for the proper way to do it, really appreciated. > >>>> If necessary, I can provide more info about my setup, or do more tests. > >>> Yes, please can you send me both host and guest trace files ? > >> Here are the trace files, host and guest respectively: > >> > >> - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace.dat > >> - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace-tumbleweed.dat > >> > >>> Also, it will be useful to send me the content of the KVM debug files: > >>> /sys/kernel/debug/kvm/<guest ID>/vcpu<*>/tsc-offset > >> The guest has one vcpu (vcpu0) and the content of the file is: > >> > >> 255647917761327 > > Looks like there is a scaling between host and guest clocks in your > > setup, not just a simple offset. We did not test yet our > > implementation with scaling, although both offset and scaling are part > > of the calculations. That makes your use case very valuable for us, as > > we have an opportunity to test it now. And yes, looks like we have a > > bug here. > > Please, when you have time, can you repeat again the tracing session > > and send again both trace files + the content of the KVM debug files: > > /sys/kernel/debug/kvm/<guest ID>/vcpu0/tsc-offset > > /sys/kernel/debug/kvm/<guest ID>/vcpu0/tsc-scaling-ratio > > I'm asking to do a new trace, as most probably these offset and > > scaling could be different now. > Yes, the trace files are attached to this mail. > > The content of tsc-offset is: > 453568564244284 > > The content of tsc-scaling-ratio is: > 4294967296 > Just submitted the v33 of the patch set, added a check for a non-default KVM scaling. Now it should work on your setup, as the KVM scaling there looks to be the default. I didn't test it, as we have no machine which supports KVM scaling. This is just a workaround, we have to implement support to KVM scaling in our algorithm. Please, can you test if it is OK ? Note, you have to apply these patch sets also, as there are dependencies: v2 Refactoring and improvements of time sync logic v4 TSC trace clock to nanosecond conversion Thanks for testing this code! > > Thanks and Regards, > > Stefano > > > > > Thanks! > > > >>>> Thanks and Regards, > >>>> > >>>> Stefano > >>> Thanks for testing this code! > >>> > >> Thanks for your time, > >> > >> Stefano > > > > >
On 3/25/21 8:20 AM, Tzvetomir Stoyanov wrote: > Hi Stefano, > > On Sat, Mar 20, 2021 at 6:17 PM Stefano De Venuto > <stefano.devenuto99@gmail.com> wrote: >> >> >> On 3/20/21 7:25 AM, Tzvetomir Stoyanov wrote: >>> Hi Stefano, >>> >>> On Fri, Mar 19, 2021 at 7:44 PM Stefano De Venuto >>> <stefano.devenuto99@gmail.com> wrote: >>>> >>>> On 3/19/21 12:55 PM, Tzvetomir Stoyanov wrote: >>>>> Hi Stefano, >>>>> >>>>> On Fri, Mar 19, 2021 at 12:08 PM Stefano De Venuto >>>>> <stefano.devenuto99@gmail.com> wrote: >>>> Hi! >>>>>> The commands used to record are: >>>>>> >>>>>> Host: >>>>>> # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e >>>>>> msr:* -C x86-tsc sleep 1 >>>>> The guest trace clock is set automatically as the host, so this >>>>> command should be enough: >>>>> # trace-cmd record -C x86-tsc -e kvm:* -e msr:* -A tumbleweed:823 -e >>>>> msr:* sleep 1 >>>>> >>>>>> Guest: >>>>>> # echo x86-tsc > /sys/kernel/tracing/trace_clock >>>>> There is no need to set manually the guest clock, it will be >>>>> overwritten by trace-cmd agent. >>>>> >>>> Thanks so much for the proper way to do it, really appreciated. >>>>>> If necessary, I can provide more info about my setup, or do more tests. >>>>> Yes, please can you send me both host and guest trace files ? >>>> Here are the trace files, host and guest respectively: >>>> >>>> - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace.dat >>>> - http://xenbits.xen.org/people/dariof/tracing-examples/kvm/sync-kvmclock/trace-tumbleweed.dat >>>> >>>>> Also, it will be useful to send me the content of the KVM debug files: >>>>> /sys/kernel/debug/kvm/<guest ID>/vcpu<*>/tsc-offset >>>> The guest has one vcpu (vcpu0) and the content of the file is: >>>> >>>> 255647917761327 >>> Looks like there is a scaling between host and guest clocks in your >>> setup, not just a simple offset. We did not test yet our >>> implementation with scaling, although both offset and scaling are part >>> of the calculations. That makes your use case very valuable for us, as >>> we have an opportunity to test it now. And yes, looks like we have a >>> bug here. >>> Please, when you have time, can you repeat again the tracing session >>> and send again both trace files + the content of the KVM debug files: >>> /sys/kernel/debug/kvm/<guest ID>/vcpu0/tsc-offset >>> /sys/kernel/debug/kvm/<guest ID>/vcpu0/tsc-scaling-ratio >>> I'm asking to do a new trace, as most probably these offset and >>> scaling could be different now. >> Yes, the trace files are attached to this mail. >> >> The content of tsc-offset is: >> 453568564244284 >> >> The content of tsc-scaling-ratio is: >> 4294967296 >> > Just submitted the v33 of the patch set, added a check for a > non-default KVM scaling. Now it should work on your setup, as the KVM > scaling there looks to be the default. I didn't test it, as we have no > machine which supports KVM scaling. This is just a workaround, we have > to implement support to KVM scaling in our algorithm. > Please, can you test if it is OK ? Note, you have to apply these > patch sets also, as there are dependencies: > v2 Refactoring and improvements of time sync logic > v4 TSC trace clock to nanosecond conversion > > Thanks for testing this code! I tested the patch series and now works properly on my setup! Thanks, Stefano > >> Thanks and Regards, >> >> Stefano >> >>> Thanks! >>> >>>>>> Thanks and Regards, >>>>>> >>>>>> Stefano >>>>> Thanks for testing this code! >>>>> >>>> Thanks for your time, >>>> >>>> Stefano >>> >
On Mon, 2021-03-22 at 10:56 -0400, Steven Rostedt wrote: > On Mon, 22 Mar 2021 12:13:09 +0200 > Tzvetomir Stoyanov <tz.stoyanov@gmail.com> wrote: > > > Confirmed, the problem is in the trace-cmd logic that works with > > the > > KVM scaling. Looks like tsc-scaling-ratio is the default KVM > > scaling > > on your machine, (1 << 32), and in that case the scaling should be > > ignored in our calculations. I still have no solution, going to > > analyze how the KVM scaling works. > > I'm guessing that this was run on a AMD CPU, as the default scaling > (the > one to ignore) is 1 << 32 for AMD and for Intel it's 1 << 48. > BTW, yes. Stefano could give us details (lscpu or anything) if necessary, but this indeed is AMD. Regards