diff mbox series

scripts/simpletrace: Mark output with unstable timestamp as WARN

Message ID 20240508043229.3433128-1-zhao1.liu@intel.com (mailing list archive)
State New
Headers show
Series scripts/simpletrace: Mark output with unstable timestamp as WARN | expand

Commit Message

Zhao Liu May 8, 2024, 4:32 a.m. UTC
In some trace log, there're unstable timestamp breaking temporal
ordering of trace records. For example:

kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6
kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60

Negative delta intervals tend to get drowned in the massive trace logs,
and an unstable timestamp can corrupt the calculation of intervals
between two events adjacent to it.

Therefore, mark the outputs with unstable timestamps as WARN like:

WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6
WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60

This would help to identify unusual events.

And skip them without updating Formatter2.last_timestamp_ns to avoid
time back.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 scripts/simpletrace.py | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Philippe Mathieu-Daudé May 8, 2024, 1:09 p.m. UTC | #1
On 8/5/24 06:32, Zhao Liu wrote:
> In some trace log, there're unstable timestamp breaking temporal
> ordering of trace records. For example:
> 
> kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6
> kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> 
> Negative delta intervals tend to get drowned in the massive trace logs,
> and an unstable timestamp can corrupt the calculation of intervals
> between two events adjacent to it.
> 
> Therefore, mark the outputs with unstable timestamps as WARN like:
> 
> WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6
> WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> 
> This would help to identify unusual events.
> 
> And skip them without updating Formatter2.last_timestamp_ns to avoid
> time back.

Can't we reorder them instead?

> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   scripts/simpletrace.py | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
> index cef81b0707f0..23911dddb8a6 100755
> --- a/scripts/simpletrace.py
> +++ b/scripts/simpletrace.py
> @@ -343,6 +343,17 @@ def __init__(self):
>           def catchall(self, *rec_args, event, timestamp_ns, pid, event_id):
>               if self.last_timestamp_ns is None:
>                   self.last_timestamp_ns = timestamp_ns
> +
> +            if timestamp_ns < self.last_timestamp_ns:
> +                fields = [
> +                    f'{name}={r}' if is_string(type) else f'{name}=0x{r:x}'
> +                    for r, (type, name) in zip(rec_args, event.args)
> +                ]
> +                print(f'WARN: skip unstable timestamp: {event.name} '
> +                      f'cur({timestamp_ns})-pre({self.last_timestamp_ns}) {pid=} ' +
> +                      f' '.join(fields))
> +                return
> +
>               delta_ns = timestamp_ns - self.last_timestamp_ns
>               self.last_timestamp_ns = timestamp_ns
>
Zhao Liu May 8, 2024, 1:55 p.m. UTC | #2
Hi Philippe,

On Wed, May 08, 2024 at 03:09:39PM +0200, Philippe Mathieu-Daudé wrote:
> Date: Wed, 8 May 2024 15:09:39 +0200
> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable
>  timestamp as WARN
> 
> On 8/5/24 06:32, Zhao Liu wrote:
> > In some trace log, there're unstable timestamp breaking temporal
> > ordering of trace records. For example:
> > 
> > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6
> > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > 
> > Negative delta intervals tend to get drowned in the massive trace logs,
> > and an unstable timestamp can corrupt the calculation of intervals
> > between two events adjacent to it.
> > 
> > Therefore, mark the outputs with unstable timestamps as WARN like:
> > 
> > WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6
> > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > 
> > This would help to identify unusual events.
> > 
> > And skip them without updating Formatter2.last_timestamp_ns to avoid
> > time back.
> 
> Can't we reorder them instead?

I think so...IIUC, when the current event with "wrong" timestamp is
detected, its previous events records have already been output and
cannot be reordered.

Regards,
Zhao
Philippe Mathieu-Daudé May 8, 2024, 2:23 p.m. UTC | #3
On 8/5/24 15:55, Zhao Liu wrote:
> Hi Philippe,
> 
> On Wed, May 08, 2024 at 03:09:39PM +0200, Philippe Mathieu-Daudé wrote:
>> Date: Wed, 8 May 2024 15:09:39 +0200
>> From: Philippe Mathieu-Daudé <philmd@linaro.org>
>> Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable
>>   timestamp as WARN
>>
>> On 8/5/24 06:32, Zhao Liu wrote:
>>> In some trace log, there're unstable timestamp breaking temporal
>>> ordering of trace records. For example:
>>>
>>> kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6
>>> kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
>>> kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
>>>
>>> Negative delta intervals tend to get drowned in the massive trace logs,
>>> and an unstable timestamp can corrupt the calculation of intervals
>>> between two events adjacent to it.
>>>
>>> Therefore, mark the outputs with unstable timestamps as WARN like:
>>>
>>> WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6
>>> WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
>>> WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
>>>
>>> This would help to identify unusual events.
>>>
>>> And skip them without updating Formatter2.last_timestamp_ns to avoid
>>> time back.
>>
>> Can't we reorder them instead?
> 
> I think so...IIUC, when the current event with "wrong" timestamp is
> detected, its previous events records have already been output and
> cannot be reordered.

Well, this certainly need rework to cleverly handle out of sync
events, reordering them to display them correctly.

Skipping events really sounds a bad idea to me. Even more out
of sync ones.

Some simple approach could be to queue up to N events, then sort them
and unqueue N/2, while processing all events.

Regards,

Phil.
Stefan Hajnoczi May 8, 2024, 6:05 p.m. UTC | #4
On Wed, 8 May 2024 at 00:19, Zhao Liu <zhao1.liu@intel.com> wrote:
>
> In some trace log, there're unstable timestamp breaking temporal
> ordering of trace records. For example:
>
> kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6
> kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
>
> Negative delta intervals tend to get drowned in the massive trace logs,
> and an unstable timestamp can corrupt the calculation of intervals
> between two events adjacent to it.
>
> Therefore, mark the outputs with unstable timestamps as WARN like:

Why are the timestamps non-monotonic?

In a situation like that maybe not only the negative timestamps are
useless but even some positive timestamps are incorrect. I think it's
worth understanding the nature of the instability before merging a
fix.

Stefan

>
> WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6
> WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
>
> This would help to identify unusual events.
>
> And skip them without updating Formatter2.last_timestamp_ns to avoid
> time back.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>  scripts/simpletrace.py | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
> index cef81b0707f0..23911dddb8a6 100755
> --- a/scripts/simpletrace.py
> +++ b/scripts/simpletrace.py
> @@ -343,6 +343,17 @@ def __init__(self):
>          def catchall(self, *rec_args, event, timestamp_ns, pid, event_id):
>              if self.last_timestamp_ns is None:
>                  self.last_timestamp_ns = timestamp_ns
> +
> +            if timestamp_ns < self.last_timestamp_ns:
> +                fields = [
> +                    f'{name}={r}' if is_string(type) else f'{name}=0x{r:x}'
> +                    for r, (type, name) in zip(rec_args, event.args)
> +                ]
> +                print(f'WARN: skip unstable timestamp: {event.name} '
> +                      f'cur({timestamp_ns})-pre({self.last_timestamp_ns}) {pid=} ' +
> +                      f' '.join(fields))
> +                return
> +
>              delta_ns = timestamp_ns - self.last_timestamp_ns
>              self.last_timestamp_ns = timestamp_ns
>
> --
> 2.34.1
>
>
Zhao Liu May 9, 2024, 3:59 a.m. UTC | #5
On Wed, May 08, 2024 at 02:05:04PM -0400, Stefan Hajnoczi wrote:
> Date: Wed, 8 May 2024 14:05:04 -0400
> From: Stefan Hajnoczi <stefanha@gmail.com>
> Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable
>  timestamp as WARN
> 
> On Wed, 8 May 2024 at 00:19, Zhao Liu <zhao1.liu@intel.com> wrote:
> >
> > In some trace log, there're unstable timestamp breaking temporal
> > ordering of trace records. For example:
> >
> > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6
> > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> >
> > Negative delta intervals tend to get drowned in the massive trace logs,
> > and an unstable timestamp can corrupt the calculation of intervals
> > between two events adjacent to it.
> >
> > Therefore, mark the outputs with unstable timestamps as WARN like:
> 
> Why are the timestamps non-monotonic?
> 
> In a situation like that maybe not only the negative timestamps are
> useless but even some positive timestamps are incorrect. I think it's
> worth understanding the nature of the instability before merging a
> fix.

I grabbed more traces (by -trace "*" to cover as many events as possible
) and have a couple observations:

* It's not an issue with kvm's ioctl, and that qemu_mutex_lock/
  object_dynamic_cast_assert accounted for the vast majority of all
  exception timestamps.
* The total exception timestamp occurrence probability was roughly 0.013%
  (608 / 4,616,938) in a 398M trace file.
* And the intervals between the "wrong" timestamp and its pre event are
  almost all within 50ns, even more concentrated within 20ns (there are
  even quite a few 1~10ns).

Just a guess:

Would it be possible that a trace event which is too short of an interval,
and happen to meet a delay in signaling to send to writeout_thread?

It seems that this short interval like a lack of real-time guarantees in
the underlying mechanism...

If it's QEMU's own issue, I feel like the intervals should be randomized,
not just within 50ns...

May I ask what you think? Any suggestions for researching this situation
;-)

Thanks,
Zhao
Zhao Liu May 9, 2024, 4:14 a.m. UTC | #6
On Wed, May 08, 2024 at 04:23:00PM +0200, Philippe Mathieu-Daudé wrote:
> Date: Wed, 8 May 2024 16:23:00 +0200
> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable
>  timestamp as WARN
> 
> On 8/5/24 15:55, Zhao Liu wrote:
> > Hi Philippe,
> > 
> > On Wed, May 08, 2024 at 03:09:39PM +0200, Philippe Mathieu-Daudé wrote:
> > > Date: Wed, 8 May 2024 15:09:39 +0200
> > > From: Philippe Mathieu-Daudé <philmd@linaro.org>
> > > Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable
> > >   timestamp as WARN
> > > 
> > > On 8/5/24 06:32, Zhao Liu wrote:
> > > > In some trace log, there're unstable timestamp breaking temporal
> > > > ordering of trace records. For example:
> > > > 
> > > > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6
> > > > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > > > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > > > 
> > > > Negative delta intervals tend to get drowned in the massive trace logs,
> > > > and an unstable timestamp can corrupt the calculation of intervals
> > > > between two events adjacent to it.
> > > > 
> > > > Therefore, mark the outputs with unstable timestamps as WARN like:
> > > > 
> > > > WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6
> > > > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > > > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > > > 
> > > > This would help to identify unusual events.
> > > > 
> > > > And skip them without updating Formatter2.last_timestamp_ns to avoid
> > > > time back.
> > > 
> > > Can't we reorder them instead?
> > 
> > I think so...IIUC, when the current event with "wrong" timestamp is
> > detected, its previous events records have already been output and
> > cannot be reordered.
> 
> Well, this certainly need rework to cleverly handle out of sync
> events, reordering them to display them correctly.
> 
> Skipping events really sounds a bad idea to me. Even more out
> of sync ones.
> 
> Some simple approach could be to queue up to N events, then sort them
> and unqueue N/2, while processing all events.

Reordering might be possible, but as Stefan replied, some psitive value
might also be problematic too. 

Maybe my negative value check is not enough in itself...

Thanks,
Zhao
Stefan Hajnoczi May 9, 2024, 1:47 p.m. UTC | #7
On Thu, May 09, 2024 at 11:59:10AM +0800, Zhao Liu wrote:
> On Wed, May 08, 2024 at 02:05:04PM -0400, Stefan Hajnoczi wrote:
> > Date: Wed, 8 May 2024 14:05:04 -0400
> > From: Stefan Hajnoczi <stefanha@gmail.com>
> > Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable
> >  timestamp as WARN
> > 
> > On Wed, 8 May 2024 at 00:19, Zhao Liu <zhao1.liu@intel.com> wrote:
> > >
> > > In some trace log, there're unstable timestamp breaking temporal
> > > ordering of trace records. For example:
> > >
> > > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6
> > > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
> > >
> > > Negative delta intervals tend to get drowned in the massive trace logs,
> > > and an unstable timestamp can corrupt the calculation of intervals
> > > between two events adjacent to it.
> > >
> > > Therefore, mark the outputs with unstable timestamps as WARN like:
> > 
> > Why are the timestamps non-monotonic?
> > 
> > In a situation like that maybe not only the negative timestamps are
> > useless but even some positive timestamps are incorrect. I think it's
> > worth understanding the nature of the instability before merging a
> > fix.
> 
> I grabbed more traces (by -trace "*" to cover as many events as possible
> ) and have a couple observations:
> 
> * It's not an issue with kvm's ioctl, and that qemu_mutex_lock/
>   object_dynamic_cast_assert accounted for the vast majority of all
>   exception timestamps.
> * The total exception timestamp occurrence probability was roughly 0.013%
>   (608 / 4,616,938) in a 398M trace file.
> * And the intervals between the "wrong" timestamp and its pre event are
>   almost all within 50ns, even more concentrated within 20ns (there are
>   even quite a few 1~10ns).
> 
> Just a guess:
> 
> Would it be possible that a trace event which is too short of an interval,
> and happen to meet a delay in signaling to send to writeout_thread?
> 
> It seems that this short interval like a lack of real-time guarantees in
> the underlying mechanism...
> 
> If it's QEMU's own issue, I feel like the intervals should be randomized,
> not just within 50ns...
> 
> May I ask what you think? Any suggestions for researching this situation
> ;-)

QEMU uses clock_gettime(CLOCK_MONOTONIC) on Linux hosts. The man page
says:

  All CLOCK_MONOTONIC variants guarantee that the time returned by
  consecutive  calls  will  not go backwards, but successive calls
  may—depending  on  the  architecture—return  identical  (not-in‐
  creased) time values.

trace_record_start() calls clock_gettime(CLOCK_MONOTONIC) so trace events
should have monotonically increasing timestamps.

I don't see a scenario where trace record A's timestamp is greater than
trace record B's timestamp unless the clock is non-monotonic.

Which host CPU architecture and operating system are you running?

Please attach to the QEMU process with gdb and print out the value of
the use_rt_clock variable or add a printf in init_get_clock(). The value
should be 1.

Stefan
Mads Ynddal May 13, 2024, 6:54 a.m. UTC | #8
> Why are the timestamps non-monotonic?
> 
> In a situation like that maybe not only the negative timestamps are
> useless but even some positive timestamps are incorrect. I think it's
> worth understanding the nature of the instability before merging a
> fix.

I agree with Stefan on this. We'll need to find the underlying cause first.
Zhao Liu May 14, 2024, 8:12 a.m. UTC | #9
Hi Stefan,

> QEMU uses clock_gettime(CLOCK_MONOTONIC) on Linux hosts. The man page
> says:
> 
>   All CLOCK_MONOTONIC variants guarantee that the time returned by
>   consecutive  calls  will  not go backwards, but successive calls
>   may—depending  on  the  architecture—return  identical  (not-in‐
>   creased) time values.
> 
> trace_record_start() calls clock_gettime(CLOCK_MONOTONIC) so trace events
> should have monotonically increasing timestamps.
> 
> I don't see a scenario where trace record A's timestamp is greater than
> trace record B's timestamp unless the clock is non-monotonic.
> 
> Which host CPU architecture and operating system are you running?

I tested on these 2 machines:
* CML (intel 10th) with Ubuntu 22.04 + kernel v6.5.0-28
* MTL (intel 14th) with Ubuntu 22.04.2 + kernel v6.9.0

> Please attach to the QEMU process with gdb and print out the value of
> the use_rt_clock variable or add a printf in init_get_clock(). The value
> should be 1.

Thanks, on both above machines, use_rt_clock is 1 and there're both
timestamp reversal issues with the following debug print:

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 9a366e551fb3..7657785c27dc 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -831,10 +831,17 @@ extern int use_rt_clock;

 static inline int64_t get_clock(void)
 {
+    static int64_t clock = 0;
     if (use_rt_clock) {
         struct timespec ts;
         clock_gettime(CLOCK_MONOTONIC, &ts);
-        return ts.tv_sec * 1000000000LL + ts.tv_nsec;
+        int64_t tmp = ts.tv_sec * 1000000000LL + ts.tv_nsec;
+        if (tmp <= clock) {
+            printf("get_clock: strange, clock: %ld, tmp: %ld\n", clock, tmp);
+        }
+        assert(tmp > clock);
+        clock = tmp;
+        return clock;
     } else {
         /* XXX: using gettimeofday leads to problems if the date
            changes, so it should be avoided. */
diff --git a/util/qemu-timer-common.c b/util/qemu-timer-common.c
index cc1326f72646..3bf06eb4a4ce 100644
--- a/util/qemu-timer-common.c
+++ b/util/qemu-timer-common.c
@@ -59,5 +59,6 @@ static void __attribute__((constructor)) init_get_clock(void)
         use_rt_clock = 1;
     }
     clock_start = get_clock();
+    printf("init_get_clock: use_rt_clock: %d\n", use_rt_clock);
 }
 #endif

---
The timestamp interval is very small, for example:
get_clock: strange, clock: 3302130503505, tmp: 3302130503503

or

get_clock: strange, clock: 2761577819846455, tmp: 2761577819846395

I also tried to use CLOCK_MONOTONIC_RAW, but there's still the reversal
issue.

Thanks,
Zhao
Stefan Hajnoczi May 14, 2024, 12:56 p.m. UTC | #10
On Tue, May 14, 2024, 03:57 Zhao Liu <zhao1.liu@intel.com> wrote:

> Hi Stefan,
>
> > QEMU uses clock_gettime(CLOCK_MONOTONIC) on Linux hosts. The man page
> > says:
> >
> >   All CLOCK_MONOTONIC variants guarantee that the time returned by
> >   consecutive  calls  will  not go backwards, but successive calls
> >   may—depending  on  the  architecture—return  identical  (not-in‐
> >   creased) time values.
> >
> > trace_record_start() calls clock_gettime(CLOCK_MONOTONIC) so trace events
> > should have monotonically increasing timestamps.
> >
> > I don't see a scenario where trace record A's timestamp is greater than
> > trace record B's timestamp unless the clock is non-monotonic.
> >
> > Which host CPU architecture and operating system are you running?
>
> I tested on these 2 machines:
> * CML (intel 10th) with Ubuntu 22.04 + kernel v6.5.0-28
> * MTL (intel 14th) with Ubuntu 22.04.2 + kernel v6.9.0
>
> > Please attach to the QEMU process with gdb and print out the value of
> > the use_rt_clock variable or add a printf in init_get_clock(). The value
> > should be 1.
>
> Thanks, on both above machines, use_rt_clock is 1 and there're both
> timestamp reversal issues with the following debug print:
>
> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> index 9a366e551fb3..7657785c27dc 100644
> --- a/include/qemu/timer.h
> +++ b/include/qemu/timer.h
> @@ -831,10 +831,17 @@ extern int use_rt_clock;
>
>  static inline int64_t get_clock(void)
>  {
> +    static int64_t clock = 0;
>

Please try with a thread local variable (__thread) to check whether this
happens within a single thread.

If it only happens with a global variable then we'd need to look more
closely at race conditions in the patch below. I don't think the patch is a
reliable way to detect non-monotonic timestamps in a multi-threaded program.

     if (use_rt_clock) {
>          struct timespec ts;
>          clock_gettime(CLOCK_MONOTONIC, &ts);
> -        return ts.tv_sec * 1000000000LL + ts.tv_nsec;
> +        int64_t tmp = ts.tv_sec * 1000000000LL + ts.tv_nsec;
> +        if (tmp <= clock) {
> +            printf("get_clock: strange, clock: %ld, tmp: %ld\n", clock,
> tmp);
> +        }
> +        assert(tmp > clock);
> +        clock = tmp;
> +        return clock;
>      } else {
>          /* XXX: using gettimeofday leads to problems if the date
>             changes, so it should be avoided. */
> diff --git a/util/qemu-timer-common.c b/util/qemu-timer-common.c
> index cc1326f72646..3bf06eb4a4ce 100644
> --- a/util/qemu-timer-common.c
> +++ b/util/qemu-timer-common.c
> @@ -59,5 +59,6 @@ static void __attribute__((constructor))
> init_get_clock(void)
>          use_rt_clock = 1;
>      }
>      clock_start = get_clock();
> +    printf("init_get_clock: use_rt_clock: %d\n", use_rt_clock);
>  }
>  #endif
>
> ---
> The timestamp interval is very small, for example:
> get_clock: strange, clock: 3302130503505, tmp: 3302130503503
>
> or
>
> get_clock: strange, clock: 2761577819846455, tmp: 2761577819846395
>
> I also tried to use CLOCK_MONOTONIC_RAW, but there's still the reversal
> issue.
>
> Thanks,
> Zhao
>
>
diff mbox series

Patch

diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
index cef81b0707f0..23911dddb8a6 100755
--- a/scripts/simpletrace.py
+++ b/scripts/simpletrace.py
@@ -343,6 +343,17 @@  def __init__(self):
         def catchall(self, *rec_args, event, timestamp_ns, pid, event_id):
             if self.last_timestamp_ns is None:
                 self.last_timestamp_ns = timestamp_ns
+
+            if timestamp_ns < self.last_timestamp_ns:
+                fields = [
+                    f'{name}={r}' if is_string(type) else f'{name}=0x{r:x}'
+                    for r, (type, name) in zip(rec_args, event.args)
+                ]
+                print(f'WARN: skip unstable timestamp: {event.name} '
+                      f'cur({timestamp_ns})-pre({self.last_timestamp_ns}) {pid=} ' +
+                      f' '.join(fields))
+                return
+
             delta_ns = timestamp_ns - self.last_timestamp_ns
             self.last_timestamp_ns = timestamp_ns