CPU stuck due to the taprio hrtimer

Message ID	20240627055338.2186255-1-luyun@kylinos.cn (mailing list archive)
State	Changes Requested
Delegated to:	Netdev Maintainers
Headers	show Received: from mailgw.kylinos.cn (mailgw.kylinos.cn [124.126.103.232]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 509A0DF6C; Thu, 27 Jun 2024 05:53:49 +0000 (UTC) From: Yun Lu <luyun@kylinos.cn> To: vinicius.gomes@intel.com, jhs@mojatatu.com, xiyou.wangcong@gmail.com, jiri@resnulli.us Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: CPU stuck due to the taprio hrtimer Date: Thu, 27 Jun 2024 13:53:38 +0800 Message-Id: <20240627055338.2186255-1-luyun@kylinos.cn> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable
Series	CPU stuck due to the taprio hrtimer \| expand CPU stuck due to the taprio hrtimer

Message ID

20240627055338.2186255-1-luyun@kylinos.cn (mailing list archive)

State

Changes Requested

Delegated to:

Netdev Maintainers

Headers

From: Yun Lu <luyun@kylinos.cn>
To: vinicius.gomes@intel.com,
	jhs@mojatatu.com,
	xiyou.wangcong@gmail.com,
	jiri@resnulli.us
Cc: netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: CPU stuck due to the taprio hrtimer
Date: Thu, 27 Jun 2024 13:53:38 +0800
Message-Id: <20240627055338.2186255-1-luyun@kylinos.cn>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

Series

CPU stuck due to the taprio hrtimer | expand

Context	Check	Description
netdev/series_format	warning	Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection	success	Guessed tree name to be net-next
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 842 this patch: 842
netdev/build_tools	success	No tools touched, skip
netdev/cc_maintainers	warning	3 maintainers not CCed: edumazet@google.com kuba@kernel.org pabeni@redhat.com
netdev/build_clang	success	Errors and warnings before: 849 this patch: 849
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 849 this patch: 849
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 54 lines checked
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0
netdev/contest	success	net-next-2024-06-27--09-00 (tests: 664)

Context

Check

Description

netdev/series_format

warning

Single patches do not need cover letters; Target tree name not specified in the subject

netdev/tree_selection

success

Guessed tree name to be net-next

netdev/ynl

success

Generated files up to date; no warnings/errors; no diff in generated;

netdev/fixes_present

success

Fixes tag not required for -next series

netdev/header_inline

success

No static functions without inline keyword in header files

netdev/build_32bit

success

Errors and warnings before: 842 this patch: 842

netdev/build_tools

success

No tools touched, skip

netdev/cc_maintainers

warning

3 maintainers not CCed: edumazet@google.com kuba@kernel.org pabeni@redhat.com

netdev/build_clang

success

Errors and warnings before: 849 this patch: 849

netdev/verify_signedoff

success

Signed-off-by tag matches author and committer

netdev/deprecated_api

success

None detected

netdev/check_selftest

success

No net selftest shell script

netdev/verify_fixes

success

No Fixes tag

netdev/build_allmodconfig_warn

success

Errors and warnings before: 849 this patch: 849

netdev/checkpatch

success

total: 0 errors, 0 warnings, 0 checks, 54 lines checked

netdev/build_clang_rust

success

No Rust files in patch. Skipping build

netdev/kdoc

success

Errors and warnings before: 0 this patch: 0

netdev/source_inline

success

Was 0 now: 0

netdev/contest

success

net-next-2024-06-27--09-00 (tests: 664)

Commit Message

luyun June 27, 2024, 5:53 a.m. UTC

Hello,

When I run a taprio test program on the latest kernel(v6.10-rc4), CPU stuck
is detected immediately, and the stack shows that CPU is stuck on taprio
hrtimer.

The reproducer program link:
https://github.com/xyyluyun/taprio_test/blob/main/taprio_test.c
gcc taprio_test.c -static -o taprio_test

In this program, start the taprio hrtimer which clockid is set to REALTIME, and
then adjust the system time by a significant value backwards. Thus, CPU will enter
an infinite loop in the__hrtimer_run_queues function, getting stuck and unable to
exit or respond to any interrupts.

I have tried to avoid this problem by apllying the following patch, and it does work.
But I am not sure if this can be the final solution?

Thanks.

Signed-off-by: Yun Lu <luyun@kylinos.cn>
---
 net/sched/sch_taprio.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

Comments

Vinicius Costa Gomes June 27, 2024, 11:30 p.m. UTC | #1

Yun Lu <luyun@kylinos.cn> writes:

> Hello,
>
> When I run a taprio test program on the latest kernel(v6.10-rc4), CPU stuck
> is detected immediately, and the stack shows that CPU is stuck on taprio
> hrtimer.
>
> The reproducer program link:
> https://github.com/xyyluyun/taprio_test/blob/main/taprio_test.c
> gcc taprio_test.c -static -o taprio_test
>
> In this program, start the taprio hrtimer which clockid is set to REALTIME, and
> then adjust the system time by a significant value backwards. Thus, CPU will enter
> an infinite loop in the__hrtimer_run_queues function, getting stuck and unable to
> exit or respond to any interrupts.
>
> I have tried to avoid this problem by apllying the following patch, and it does work.
> But I am not sure if this can be the final solution?
>
> Thanks.
>
> Signed-off-by: Yun Lu <luyun@kylinos.cn>
> ---
>  net/sched/sch_taprio.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index a0d54b422186..2ff8d34bdbac 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -104,6 +104,7 @@ struct taprio_sched {
>  	u32 max_sdu[TC_MAX_QUEUE]; /* save info from the user */
>  	u32 fp[TC_QOPT_MAX_QUEUE]; /* only for dump and offloading */
>  	u32 txtime_delay;
> +	ktime_t offset;
>  };
>  
>  struct __tc_taprio_qopt_offload {
> @@ -170,6 +171,19 @@ static ktime_t sched_base_time(const struct sched_gate_list *sched)
>  	return ns_to_ktime(sched->base_time);
>  }
>  
> +static ktime_t taprio_get_offset(const struct taprio_sched *q)
> +{
> +	enum tk_offsets tk_offset = READ_ONCE(q->tk_offset);
> +	ktime_t time = ktime_get();
> +
> +	switch (tk_offset) {
> +	case TK_OFFS_MAX:
> +		return 0;
> +	default:
> +		return ktime_sub_ns(ktime_mono_to_any(time, tk_offset), time);
> +	}
> +}
> +
>  static ktime_t taprio_mono_to_any(const struct taprio_sched *q, ktime_t mono)
>  {
>  	/* This pairs with WRITE_ONCE() in taprio_parse_clockid() */
> @@ -918,6 +932,7 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer)
>  	int num_tc = netdev_get_num_tc(dev);
>  	struct sched_entry *entry, *next;
>  	struct Qdisc *sch = q->root;
> +	ktime_t now_offset = taprio_get_offset(q);
>  	ktime_t end_time;
>  	int tc;
>  
> @@ -957,6 +972,14 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer)
>  	end_time = ktime_add_ns(entry->end_time, next->interval);
>  	end_time = min_t(ktime_t, end_time, oper->cycle_end_time);
>  
> +	if (q->offset != now_offset) {
> +		ktime_t diff = ktime_sub_ns(now_offset, q->offset);
> +
> +		end_time = ktime_add_ns(end_time, diff);
> +		oper->cycle_end_time = ktime_add_ns(oper->cycle_end_time, diff);
> +		q->offset = now_offset;
> +	}
> +

I think what we should do here is a bit different. Let me try to explain
what I have in mind with some context.

A bit of context: The idea of taprio is to enforce "TSN" traffic
schedules, these schedules require time synchronization, for example via
PTP, and in those cases, time jumps are not expected or a sign that
something is wrong.

In my mind, a time jump, specially a big one, kind of invalidates the
schedule, as the schedule is based on an absolute time value (the
base_time), and when time jumps that reference in time is lost.

BUT making the user's system unresponsive is a bug, a big one, as if
this happens in the real world, the user will be unable to investigate
what made the system have so big a time correction.

So my idea is to warn the user that the time jumped, say that the user
needs to reconfigure the schedule, as it is now invalid, and disable the
schedule.

Does this make sense?

Ah, and thanks for the report.

>  	for (tc = 0; tc < num_tc; tc++) {
>  		if (next->gate_duration[tc] == oper->cycle_time)
>  			next->gate_close_time[tc] = KTIME_MAX;
> @@ -1210,6 +1233,7 @@ static int taprio_get_start_time(struct Qdisc *sch,
>  
>  	base = sched_base_time(sched);
>  	now = taprio_get_time(q);
> +	q->offset = taprio_get_offset(q);
>  
>  	if (ktime_after(base, now)) {
>  		*start = base;
> -- 
> 2.34.1
>


Cheers,

luyun June 28, 2024, 2:48 a.m. UTC | #2

在 2024/6/28 07:30, Vinicius Costa Gomes 写道:
> Yun Lu <luyun@kylinos.cn> writes:
>
>> Hello,
>>
>> When I run a taprio test program on the latest kernel(v6.10-rc4), CPU stuck
>> is detected immediately, and the stack shows that CPU is stuck on taprio
>> hrtimer.
>>
>> The reproducer program link:
>> https://github.com/xyyluyun/taprio_test/blob/main/taprio_test.c
>> gcc taprio_test.c -static -o taprio_test
>>
>> In this program, start the taprio hrtimer which clockid is set to REALTIME, and
>> then adjust the system time by a significant value backwards. Thus, CPU will enter
>> an infinite loop in the__hrtimer_run_queues function, getting stuck and unable to
>> exit or respond to any interrupts.
>>
>> I have tried to avoid this problem by apllying the following patch, and it does work.
>> But I am not sure if this can be the final solution?
>>
>> Thanks.
>>
>> Signed-off-by: Yun Lu <luyun@kylinos.cn>
>> ---
>>   net/sched/sch_taprio.c | 24 ++++++++++++++++++++++++
>>   1 file changed, 24 insertions(+)
>>
>> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
>> index a0d54b422186..2ff8d34bdbac 100644
>> --- a/net/sched/sch_taprio.c
>> +++ b/net/sched/sch_taprio.c
>> @@ -104,6 +104,7 @@ struct taprio_sched {
>>   	u32 max_sdu[TC_MAX_QUEUE]; /* save info from the user */
>>   	u32 fp[TC_QOPT_MAX_QUEUE]; /* only for dump and offloading */
>>   	u32 txtime_delay;
>> +	ktime_t offset;
>>   };
>>   
>>   struct __tc_taprio_qopt_offload {
>> @@ -170,6 +171,19 @@ static ktime_t sched_base_time(const struct sched_gate_list *sched)
>>   	return ns_to_ktime(sched->base_time);
>>   }
>>   
>> +static ktime_t taprio_get_offset(const struct taprio_sched *q)
>> +{
>> +	enum tk_offsets tk_offset = READ_ONCE(q->tk_offset);
>> +	ktime_t time = ktime_get();
>> +
>> +	switch (tk_offset) {
>> +	case TK_OFFS_MAX:
>> +		return 0;
>> +	default:
>> +		return ktime_sub_ns(ktime_mono_to_any(time, tk_offset), time);
>> +	}
>> +}
>> +
>>   static ktime_t taprio_mono_to_any(const struct taprio_sched *q, ktime_t mono)
>>   {
>>   	/* This pairs with WRITE_ONCE() in taprio_parse_clockid() */
>> @@ -918,6 +932,7 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer)
>>   	int num_tc = netdev_get_num_tc(dev);
>>   	struct sched_entry *entry, *next;
>>   	struct Qdisc *sch = q->root;
>> +	ktime_t now_offset = taprio_get_offset(q);
>>   	ktime_t end_time;
>>   	int tc;
>>   
>> @@ -957,6 +972,14 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer)
>>   	end_time = ktime_add_ns(entry->end_time, next->interval);
>>   	end_time = min_t(ktime_t, end_time, oper->cycle_end_time);
>>   
>> +	if (q->offset != now_offset) {
>> +		ktime_t diff = ktime_sub_ns(now_offset, q->offset);
>> +
>> +		end_time = ktime_add_ns(end_time, diff);
>> +		oper->cycle_end_time = ktime_add_ns(oper->cycle_end_time, diff);
>> +		q->offset = now_offset;
>> +	}
>> +
> I think what we should do here is a bit different. Let me try to explain
> what I have in mind with some context.
>
> A bit of context: The idea of taprio is to enforce "TSN" traffic
> schedules, these schedules require time synchronization, for example via
> PTP, and in those cases, time jumps are not expected or a sign that
> something is wrong.
>
> In my mind, a time jump, specially a big one, kind of invalidates the
> schedule, as the schedule is based on an absolute time value (the
> base_time), and when time jumps that reference in time is lost.
>
> BUT making the user's system unresponsive is a bug, a big one, as if
> this happens in the real world, the user will be unable to investigate
> what made the system have so big a time correction.
>
> So my idea is to warn the user that the time jumped, say that the user
> needs to reconfigure the schedule, as it is now invalid, and disable the
> schedule.
>
> Does this make sense?
>
> Ah, and thanks for the report.
>
Yeah, I understand what you mean,  and your idea is more reasonable.

Thank you for confirming this bug, my patch is only for temporary 
testing to avoid it.

BRs.

diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index a0d54b422186..2ff8d34bdbac 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -104,6 +104,7 @@  struct taprio_sched {
 	u32 max_sdu[TC_MAX_QUEUE]; /* save info from the user */
 	u32 fp[TC_QOPT_MAX_QUEUE]; /* only for dump and offloading */
 	u32 txtime_delay;
+	ktime_t offset;
 };
 
 struct __tc_taprio_qopt_offload {
@@ -170,6 +171,19 @@  static ktime_t sched_base_time(const struct sched_gate_list *sched)
 	return ns_to_ktime(sched->base_time);
 }
 
+static ktime_t taprio_get_offset(const struct taprio_sched *q)
+{
+	enum tk_offsets tk_offset = READ_ONCE(q->tk_offset);
+	ktime_t time = ktime_get();
+
+	switch (tk_offset) {
+	case TK_OFFS_MAX:
+		return 0;
+	default:
+		return ktime_sub_ns(ktime_mono_to_any(time, tk_offset), time);
+	}
+}
+
 static ktime_t taprio_mono_to_any(const struct taprio_sched *q, ktime_t mono)
 {
 	/* This pairs with WRITE_ONCE() in taprio_parse_clockid() */
@@ -918,6 +932,7 @@  static enum hrtimer_restart advance_sched(struct hrtimer *timer)
 	int num_tc = netdev_get_num_tc(dev);
 	struct sched_entry *entry, *next;
 	struct Qdisc *sch = q->root;
+	ktime_t now_offset = taprio_get_offset(q);
 	ktime_t end_time;
 	int tc;
 
@@ -957,6 +972,14 @@  static enum hrtimer_restart advance_sched(struct hrtimer *timer)
 	end_time = ktime_add_ns(entry->end_time, next->interval);
 	end_time = min_t(ktime_t, end_time, oper->cycle_end_time);
 
+	if (q->offset != now_offset) {
+		ktime_t diff = ktime_sub_ns(now_offset, q->offset);
+
+		end_time = ktime_add_ns(end_time, diff);
+		oper->cycle_end_time = ktime_add_ns(oper->cycle_end_time, diff);
+		q->offset = now_offset;
+	}
+
 	for (tc = 0; tc < num_tc; tc++) {
 		if (next->gate_duration[tc] == oper->cycle_time)
 			next->gate_close_time[tc] = KTIME_MAX;
@@ -1210,6 +1233,7 @@  static int taprio_get_start_time(struct Qdisc *sch,
 
 	base = sched_base_time(sched);
 	now = taprio_get_time(q);
+	q->offset = taprio_get_offset(q);
 
 	if (ktime_after(base, now)) {
 		*start = base;

CPU stuck due to the taprio hrtimer

Checks

Commit Message

Comments

Patch