Message ID | 20220824153901.488576-17-irogers@google.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | Mutex wrapper, locking and memory leak fixes | expand |
Context | Check | Description |
---|---|---|
bpf/vmtest-bpf-PR | fail | merge-conflict |
netdev/tree_selection | success | Not a local patch |
On 24/08/22 18:38, Ian Rogers wrote: > Add annotations to describe lock behavior. Add unlocks so that mutexes > aren't conditionally held on exit from perf_sched__replay. Add an exit > variable so that thread_func can terminate, rather than leaving the > threads blocked on mutexes. > > Signed-off-by: Ian Rogers <irogers@google.com> > --- > tools/perf/builtin-sched.c | 46 ++++++++++++++++++++++++-------------- > 1 file changed, 29 insertions(+), 17 deletions(-) > > diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c > index 7e4006d6b8bc..b483ff0d432e 100644 > --- a/tools/perf/builtin-sched.c > +++ b/tools/perf/builtin-sched.c > @@ -246,6 +246,7 @@ struct perf_sched { > const char *time_str; > struct perf_time_interval ptime; > struct perf_time_interval hist_time; > + volatile bool thread_funcs_exit; > }; > > /* per thread run time data */ > @@ -633,31 +634,34 @@ static void *thread_func(void *ctx) > prctl(PR_SET_NAME, comm2); > if (fd < 0) > return NULL; > -again: > - ret = sem_post(&this_task->ready_for_work); > - BUG_ON(ret); > - mutex_lock(&sched->start_work_mutex); > - mutex_unlock(&sched->start_work_mutex); > > - cpu_usage_0 = get_cpu_usage_nsec_self(fd); > + while (!sched->thread_funcs_exit) { > + ret = sem_post(&this_task->ready_for_work); > + BUG_ON(ret); > + mutex_lock(&sched->start_work_mutex); > + mutex_unlock(&sched->start_work_mutex); > > - for (i = 0; i < this_task->nr_events; i++) { > - this_task->curr_event = i; > - perf_sched__process_event(sched, this_task->atoms[i]); > - } > + cpu_usage_0 = get_cpu_usage_nsec_self(fd); > > - cpu_usage_1 = get_cpu_usage_nsec_self(fd); > - this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; > - ret = sem_post(&this_task->work_done_sem); > - BUG_ON(ret); > + for (i = 0; i < this_task->nr_events; i++) { > + this_task->curr_event = i; > + perf_sched__process_event(sched, this_task->atoms[i]); > + } > > - mutex_lock(&sched->work_done_wait_mutex); > - mutex_unlock(&sched->work_done_wait_mutex); > + 
cpu_usage_1 = get_cpu_usage_nsec_self(fd); > + this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; > + ret = sem_post(&this_task->work_done_sem); > + BUG_ON(ret); > > - goto again; > + mutex_lock(&sched->work_done_wait_mutex); > + mutex_unlock(&sched->work_done_wait_mutex); > + } > + return NULL; > } > > static void create_tasks(struct perf_sched *sched) > + EXCLUSIVE_LOCK_FUNCTION(sched->start_work_mutex) > + EXCLUSIVE_LOCK_FUNCTION(sched->work_done_wait_mutex) > { > struct task_desc *task; > pthread_attr_t attr; > @@ -687,6 +691,8 @@ static void create_tasks(struct perf_sched *sched) > } > > static void wait_for_tasks(struct perf_sched *sched) > + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) > + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) > { > u64 cpu_usage_0, cpu_usage_1; > struct task_desc *task; > @@ -738,6 +744,8 @@ static void wait_for_tasks(struct perf_sched *sched) > } > > static void run_one_test(struct perf_sched *sched) > + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) > + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) > { > u64 T0, T1, delta, avg_delta, fluct; > > @@ -3309,11 +3317,15 @@ static int perf_sched__replay(struct perf_sched *sched) > print_task_traces(sched); > add_cross_task_wakeups(sched); > > + sched->thread_funcs_exit = false; > create_tasks(sched); > printf("------------------------------------------------------------\n"); > for (i = 0; i < sched->replay_repeat; i++) > run_one_test(sched); > > + sched->thread_funcs_exit = true; > + mutex_unlock(&sched->start_work_mutex); > + mutex_unlock(&sched->work_done_wait_mutex); I think you still need to wait for the threads to exit before destroying the mutexes. > return 0; > } >
On Fri, Aug 26, 2022 at 5:12 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 24/08/22 18:38, Ian Rogers wrote: > > Add annotations to describe lock behavior. Add unlocks so that mutexes > > aren't conditionally held on exit from perf_sched__replay. Add an exit > > variable so that thread_func can terminate, rather than leaving the > > threads blocked on mutexes. > > > > Signed-off-by: Ian Rogers <irogers@google.com> > > --- > > tools/perf/builtin-sched.c | 46 ++++++++++++++++++++++++-------------- > > 1 file changed, 29 insertions(+), 17 deletions(-) > > > > diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c > > index 7e4006d6b8bc..b483ff0d432e 100644 > > --- a/tools/perf/builtin-sched.c > > +++ b/tools/perf/builtin-sched.c > > @@ -246,6 +246,7 @@ struct perf_sched { > > const char *time_str; > > struct perf_time_interval ptime; > > struct perf_time_interval hist_time; > > + volatile bool thread_funcs_exit; > > }; > > > > /* per thread run time data */ > > @@ -633,31 +634,34 @@ static void *thread_func(void *ctx) > > prctl(PR_SET_NAME, comm2); > > if (fd < 0) > > return NULL; > > -again: > > - ret = sem_post(&this_task->ready_for_work); > > - BUG_ON(ret); > > - mutex_lock(&sched->start_work_mutex); > > - mutex_unlock(&sched->start_work_mutex); > > > > - cpu_usage_0 = get_cpu_usage_nsec_self(fd); > > + while (!sched->thread_funcs_exit) { > > + ret = sem_post(&this_task->ready_for_work); > > + BUG_ON(ret); > > + mutex_lock(&sched->start_work_mutex); > > + mutex_unlock(&sched->start_work_mutex); > > > > - for (i = 0; i < this_task->nr_events; i++) { > > - this_task->curr_event = i; > > - perf_sched__process_event(sched, this_task->atoms[i]); > > - } > > + cpu_usage_0 = get_cpu_usage_nsec_self(fd); > > > > - cpu_usage_1 = get_cpu_usage_nsec_self(fd); > > - this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; > > - ret = sem_post(&this_task->work_done_sem); > > - BUG_ON(ret); > > + for (i = 0; i < this_task->nr_events; i++) { > > + 
this_task->curr_event = i; > > + perf_sched__process_event(sched, this_task->atoms[i]); > > + } > > > > - mutex_lock(&sched->work_done_wait_mutex); > > - mutex_unlock(&sched->work_done_wait_mutex); > > + cpu_usage_1 = get_cpu_usage_nsec_self(fd); > > + this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; > > + ret = sem_post(&this_task->work_done_sem); > > + BUG_ON(ret); > > > > - goto again; > > + mutex_lock(&sched->work_done_wait_mutex); > > + mutex_unlock(&sched->work_done_wait_mutex); > > + } > > + return NULL; > > } > > > > static void create_tasks(struct perf_sched *sched) > > + EXCLUSIVE_LOCK_FUNCTION(sched->start_work_mutex) > > + EXCLUSIVE_LOCK_FUNCTION(sched->work_done_wait_mutex) > > { > > struct task_desc *task; > > pthread_attr_t attr; > > @@ -687,6 +691,8 @@ static void create_tasks(struct perf_sched *sched) > > } > > > > static void wait_for_tasks(struct perf_sched *sched) > > + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) > > + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) > > { > > u64 cpu_usage_0, cpu_usage_1; > > struct task_desc *task; > > @@ -738,6 +744,8 @@ static void wait_for_tasks(struct perf_sched *sched) > > } > > > > static void run_one_test(struct perf_sched *sched) > > + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) > > + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) > > { > > u64 T0, T1, delta, avg_delta, fluct; > > > > @@ -3309,11 +3317,15 @@ static int perf_sched__replay(struct perf_sched *sched) > > print_task_traces(sched); > > add_cross_task_wakeups(sched); > > > > + sched->thread_funcs_exit = false; > > create_tasks(sched); > > printf("------------------------------------------------------------\n"); > > for (i = 0; i < sched->replay_repeat; i++) > > run_one_test(sched); > > > > + sched->thread_funcs_exit = true; > > + mutex_unlock(&sched->start_work_mutex); > > + mutex_unlock(&sched->work_done_wait_mutex); > > I think you still need to wait for the threads to exit before > destroying the mutexes. 
This is a pre-existing issue and beyond the scope of this patch set. Thanks, Ian > > return 0; > > } > > >
On 26/08/22 19:06, Ian Rogers wrote: > On Fri, Aug 26, 2022 at 5:12 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >> >> On 24/08/22 18:38, Ian Rogers wrote: >>> Add annotations to describe lock behavior. Add unlocks so that mutexes >>> aren't conditionally held on exit from perf_sched__replay. Add an exit >>> variable so that thread_func can terminate, rather than leaving the >>> threads blocked on mutexes. >>> >>> Signed-off-by: Ian Rogers <irogers@google.com> >>> --- >>> tools/perf/builtin-sched.c | 46 ++++++++++++++++++++++++-------------- >>> 1 file changed, 29 insertions(+), 17 deletions(-) >>> >>> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c >>> index 7e4006d6b8bc..b483ff0d432e 100644 >>> --- a/tools/perf/builtin-sched.c >>> +++ b/tools/perf/builtin-sched.c >>> @@ -246,6 +246,7 @@ struct perf_sched { >>> const char *time_str; >>> struct perf_time_interval ptime; >>> struct perf_time_interval hist_time; >>> + volatile bool thread_funcs_exit; >>> }; >>> >>> /* per thread run time data */ >>> @@ -633,31 +634,34 @@ static void *thread_func(void *ctx) >>> prctl(PR_SET_NAME, comm2); >>> if (fd < 0) >>> return NULL; >>> -again: >>> - ret = sem_post(&this_task->ready_for_work); >>> - BUG_ON(ret); >>> - mutex_lock(&sched->start_work_mutex); >>> - mutex_unlock(&sched->start_work_mutex); >>> >>> - cpu_usage_0 = get_cpu_usage_nsec_self(fd); >>> + while (!sched->thread_funcs_exit) { >>> + ret = sem_post(&this_task->ready_for_work); >>> + BUG_ON(ret); >>> + mutex_lock(&sched->start_work_mutex); >>> + mutex_unlock(&sched->start_work_mutex); >>> >>> - for (i = 0; i < this_task->nr_events; i++) { >>> - this_task->curr_event = i; >>> - perf_sched__process_event(sched, this_task->atoms[i]); >>> - } >>> + cpu_usage_0 = get_cpu_usage_nsec_self(fd); >>> >>> - cpu_usage_1 = get_cpu_usage_nsec_self(fd); >>> - this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; >>> - ret = sem_post(&this_task->work_done_sem); >>> - BUG_ON(ret); >>> + for (i = 0; i < 
this_task->nr_events; i++) { >>> + this_task->curr_event = i; >>> + perf_sched__process_event(sched, this_task->atoms[i]); >>> + } >>> >>> - mutex_lock(&sched->work_done_wait_mutex); >>> - mutex_unlock(&sched->work_done_wait_mutex); >>> + cpu_usage_1 = get_cpu_usage_nsec_self(fd); >>> + this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; >>> + ret = sem_post(&this_task->work_done_sem); >>> + BUG_ON(ret); >>> >>> - goto again; >>> + mutex_lock(&sched->work_done_wait_mutex); >>> + mutex_unlock(&sched->work_done_wait_mutex); >>> + } >>> + return NULL; >>> } >>> >>> static void create_tasks(struct perf_sched *sched) >>> + EXCLUSIVE_LOCK_FUNCTION(sched->start_work_mutex) >>> + EXCLUSIVE_LOCK_FUNCTION(sched->work_done_wait_mutex) >>> { >>> struct task_desc *task; >>> pthread_attr_t attr; >>> @@ -687,6 +691,8 @@ static void create_tasks(struct perf_sched *sched) >>> } >>> >>> static void wait_for_tasks(struct perf_sched *sched) >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) >>> { >>> u64 cpu_usage_0, cpu_usage_1; >>> struct task_desc *task; >>> @@ -738,6 +744,8 @@ static void wait_for_tasks(struct perf_sched *sched) >>> } >>> >>> static void run_one_test(struct perf_sched *sched) >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) >>> { >>> u64 T0, T1, delta, avg_delta, fluct; >>> >>> @@ -3309,11 +3317,15 @@ static int perf_sched__replay(struct perf_sched *sched) >>> print_task_traces(sched); >>> add_cross_task_wakeups(sched); >>> >>> + sched->thread_funcs_exit = false; >>> create_tasks(sched); >>> printf("------------------------------------------------------------\n"); >>> for (i = 0; i < sched->replay_repeat; i++) >>> run_one_test(sched); >>> >>> + sched->thread_funcs_exit = true; >>> + mutex_unlock(&sched->start_work_mutex); >>> + mutex_unlock(&sched->work_done_wait_mutex); >> >> I think you still need to wait for the threads to exit 
before >> destroying the mutexes. > > This is a pre-existing issue and beyond the scope of this patch set. You added the mutex_destroy functions in patch 8, so it is still fallout from that. > > Thanks, > Ian > >>> return 0; >>> } >>> >>
On Fri, Aug 26, 2022 at 10:41 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 26/08/22 19:06, Ian Rogers wrote: > > On Fri, Aug 26, 2022 at 5:12 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > >> > >> On 24/08/22 18:38, Ian Rogers wrote: > >>> Add annotations to describe lock behavior. Add unlocks so that mutexes > >>> aren't conditionally held on exit from perf_sched__replay. Add an exit > >>> variable so that thread_func can terminate, rather than leaving the > >>> threads blocked on mutexes. > >>> > >>> Signed-off-by: Ian Rogers <irogers@google.com> > >>> --- > >>> tools/perf/builtin-sched.c | 46 ++++++++++++++++++++++++-------------- > >>> 1 file changed, 29 insertions(+), 17 deletions(-) > >>> > >>> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c > >>> index 7e4006d6b8bc..b483ff0d432e 100644 > >>> --- a/tools/perf/builtin-sched.c > >>> +++ b/tools/perf/builtin-sched.c > >>> @@ -246,6 +246,7 @@ struct perf_sched { > >>> const char *time_str; > >>> struct perf_time_interval ptime; > >>> struct perf_time_interval hist_time; > >>> + volatile bool thread_funcs_exit; > >>> }; > >>> > >>> /* per thread run time data */ > >>> @@ -633,31 +634,34 @@ static void *thread_func(void *ctx) > >>> prctl(PR_SET_NAME, comm2); > >>> if (fd < 0) > >>> return NULL; > >>> -again: > >>> - ret = sem_post(&this_task->ready_for_work); > >>> - BUG_ON(ret); > >>> - mutex_lock(&sched->start_work_mutex); > >>> - mutex_unlock(&sched->start_work_mutex); > >>> > >>> - cpu_usage_0 = get_cpu_usage_nsec_self(fd); > >>> + while (!sched->thread_funcs_exit) { > >>> + ret = sem_post(&this_task->ready_for_work); > >>> + BUG_ON(ret); > >>> + mutex_lock(&sched->start_work_mutex); > >>> + mutex_unlock(&sched->start_work_mutex); > >>> > >>> - for (i = 0; i < this_task->nr_events; i++) { > >>> - this_task->curr_event = i; > >>> - perf_sched__process_event(sched, this_task->atoms[i]); > >>> - } > >>> + cpu_usage_0 = get_cpu_usage_nsec_self(fd); > >>> > >>> - cpu_usage_1 = 
get_cpu_usage_nsec_self(fd); > >>> - this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; > >>> - ret = sem_post(&this_task->work_done_sem); > >>> - BUG_ON(ret); > >>> + for (i = 0; i < this_task->nr_events; i++) { > >>> + this_task->curr_event = i; > >>> + perf_sched__process_event(sched, this_task->atoms[i]); > >>> + } > >>> > >>> - mutex_lock(&sched->work_done_wait_mutex); > >>> - mutex_unlock(&sched->work_done_wait_mutex); > >>> + cpu_usage_1 = get_cpu_usage_nsec_self(fd); > >>> + this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; > >>> + ret = sem_post(&this_task->work_done_sem); > >>> + BUG_ON(ret); > >>> > >>> - goto again; > >>> + mutex_lock(&sched->work_done_wait_mutex); > >>> + mutex_unlock(&sched->work_done_wait_mutex); > >>> + } > >>> + return NULL; > >>> } > >>> > >>> static void create_tasks(struct perf_sched *sched) > >>> + EXCLUSIVE_LOCK_FUNCTION(sched->start_work_mutex) > >>> + EXCLUSIVE_LOCK_FUNCTION(sched->work_done_wait_mutex) > >>> { > >>> struct task_desc *task; > >>> pthread_attr_t attr; > >>> @@ -687,6 +691,8 @@ static void create_tasks(struct perf_sched *sched) > >>> } > >>> > >>> static void wait_for_tasks(struct perf_sched *sched) > >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) > >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) > >>> { > >>> u64 cpu_usage_0, cpu_usage_1; > >>> struct task_desc *task; > >>> @@ -738,6 +744,8 @@ static void wait_for_tasks(struct perf_sched *sched) > >>> } > >>> > >>> static void run_one_test(struct perf_sched *sched) > >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) > >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) > >>> { > >>> u64 T0, T1, delta, avg_delta, fluct; > >>> > >>> @@ -3309,11 +3317,15 @@ static int perf_sched__replay(struct perf_sched *sched) > >>> print_task_traces(sched); > >>> add_cross_task_wakeups(sched); > >>> > >>> + sched->thread_funcs_exit = false; > >>> create_tasks(sched); > >>> 
printf("------------------------------------------------------------\n"); > >>> for (i = 0; i < sched->replay_repeat; i++) > >>> run_one_test(sched); > >>> > >>> + sched->thread_funcs_exit = true; > >>> + mutex_unlock(&sched->start_work_mutex); > >>> + mutex_unlock(&sched->work_done_wait_mutex); > >> > >> I think you still need to wait for the threads to exit before > >> destroying the mutexes. > > > > This is a pre-existing issue and beyond the scope of this patch set. > > You added the mutex_destroy functions in patch 8, so it is still > fallout from that. In the previous code the threads were blocked on mutexes that were stack allocated and the stack memory went away. You are correct to say that to those locks I added an init and destroy call. The lifetime of the mutex was wrong previously and remains wrong in this change. Thanks, Ian > > > > Thanks, > > Ian > > > >>> return 0; > >>> } > >>> > >> >
On Fri, Aug 26, 2022 at 10:48 AM Ian Rogers <irogers@google.com> wrote: > > On Fri, Aug 26, 2022 at 10:41 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > > > On 26/08/22 19:06, Ian Rogers wrote: > > > On Fri, Aug 26, 2022 at 5:12 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > >> > > >> On 24/08/22 18:38, Ian Rogers wrote: > > >>> Add annotations to describe lock behavior. Add unlocks so that mutexes > > >>> aren't conditionally held on exit from perf_sched__replay. Add an exit > > >>> variable so that thread_func can terminate, rather than leaving the > > >>> threads blocked on mutexes. > > >>> > > >>> Signed-off-by: Ian Rogers <irogers@google.com> > > >>> --- > > >>> tools/perf/builtin-sched.c | 46 ++++++++++++++++++++++++-------------- > > >>> 1 file changed, 29 insertions(+), 17 deletions(-) > > >>> > > >>> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c > > >>> index 7e4006d6b8bc..b483ff0d432e 100644 > > >>> --- a/tools/perf/builtin-sched.c > > >>> +++ b/tools/perf/builtin-sched.c > > >>> @@ -246,6 +246,7 @@ struct perf_sched { > > >>> const char *time_str; > > >>> struct perf_time_interval ptime; > > >>> struct perf_time_interval hist_time; > > >>> + volatile bool thread_funcs_exit; > > >>> }; > > >>> > > >>> /* per thread run time data */ > > >>> @@ -633,31 +634,34 @@ static void *thread_func(void *ctx) > > >>> prctl(PR_SET_NAME, comm2); > > >>> if (fd < 0) > > >>> return NULL; > > >>> -again: > > >>> - ret = sem_post(&this_task->ready_for_work); > > >>> - BUG_ON(ret); > > >>> - mutex_lock(&sched->start_work_mutex); > > >>> - mutex_unlock(&sched->start_work_mutex); > > >>> > > >>> - cpu_usage_0 = get_cpu_usage_nsec_self(fd); > > >>> + while (!sched->thread_funcs_exit) { > > >>> + ret = sem_post(&this_task->ready_for_work); > > >>> + BUG_ON(ret); > > >>> + mutex_lock(&sched->start_work_mutex); > > >>> + mutex_unlock(&sched->start_work_mutex); > > >>> > > >>> - for (i = 0; i < this_task->nr_events; i++) { > > >>> - 
this_task->curr_event = i; > > >>> - perf_sched__process_event(sched, this_task->atoms[i]); > > >>> - } > > >>> + cpu_usage_0 = get_cpu_usage_nsec_self(fd); > > >>> > > >>> - cpu_usage_1 = get_cpu_usage_nsec_self(fd); > > >>> - this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; > > >>> - ret = sem_post(&this_task->work_done_sem); > > >>> - BUG_ON(ret); > > >>> + for (i = 0; i < this_task->nr_events; i++) { > > >>> + this_task->curr_event = i; > > >>> + perf_sched__process_event(sched, this_task->atoms[i]); > > >>> + } > > >>> > > >>> - mutex_lock(&sched->work_done_wait_mutex); > > >>> - mutex_unlock(&sched->work_done_wait_mutex); > > >>> + cpu_usage_1 = get_cpu_usage_nsec_self(fd); > > >>> + this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; > > >>> + ret = sem_post(&this_task->work_done_sem); > > >>> + BUG_ON(ret); > > >>> > > >>> - goto again; > > >>> + mutex_lock(&sched->work_done_wait_mutex); > > >>> + mutex_unlock(&sched->work_done_wait_mutex); > > >>> + } > > >>> + return NULL; > > >>> } > > >>> > > >>> static void create_tasks(struct perf_sched *sched) > > >>> + EXCLUSIVE_LOCK_FUNCTION(sched->start_work_mutex) > > >>> + EXCLUSIVE_LOCK_FUNCTION(sched->work_done_wait_mutex) > > >>> { > > >>> struct task_desc *task; > > >>> pthread_attr_t attr; > > >>> @@ -687,6 +691,8 @@ static void create_tasks(struct perf_sched *sched) > > >>> } > > >>> > > >>> static void wait_for_tasks(struct perf_sched *sched) > > >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) > > >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) > > >>> { > > >>> u64 cpu_usage_0, cpu_usage_1; > > >>> struct task_desc *task; > > >>> @@ -738,6 +744,8 @@ static void wait_for_tasks(struct perf_sched *sched) > > >>> } > > >>> > > >>> static void run_one_test(struct perf_sched *sched) > > >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) > > >>> + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) > > >>> { > > >>> u64 T0, T1, delta, avg_delta, fluct; > > >>> > > >>> @@ -3309,11 
+3317,15 @@ static int perf_sched__replay(struct perf_sched *sched) > > >>> print_task_traces(sched); > > >>> add_cross_task_wakeups(sched); > > >>> > > >>> + sched->thread_funcs_exit = false; > > >>> create_tasks(sched); > > >>> printf("------------------------------------------------------------\n"); > > >>> for (i = 0; i < sched->replay_repeat; i++) > > >>> run_one_test(sched); > > >>> > > >>> + sched->thread_funcs_exit = true; > > >>> + mutex_unlock(&sched->start_work_mutex); > > >>> + mutex_unlock(&sched->work_done_wait_mutex); > > >> > > >> I think you still need to wait for the threads to exit before > > >> destroying the mutexes. > > > > > > This is a pre-existing issue and beyond the scope of this patch set. > > > > You added the mutex_destroy functions in patch 8, so it is still > > fallout from that. > > In the previous code the threads were blocked on mutexes that were > stack allocated and the stack memory went away. You are correct to say > that to those locks I added an init and destroy call. The lifetime of > the mutex was wrong previously and remains wrong in this change. I think you fixed the lifetime issue with sched->thread_funcs_exit here. All you need to do is call pthread_join() after the mutex_unlock, no? Thanks, Namhyung
On 26/08/22 21:26, Namhyung Kim wrote: > On Fri, Aug 26, 2022 at 10:48 AM Ian Rogers <irogers@google.com> wrote: >> >> On Fri, Aug 26, 2022 at 10:41 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >>> >>> On 26/08/22 19:06, Ian Rogers wrote: >>>> On Fri, Aug 26, 2022 at 5:12 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >>>>> >>>>> On 24/08/22 18:38, Ian Rogers wrote: >>>>>> Add annotations to describe lock behavior. Add unlocks so that mutexes >>>>>> aren't conditionally held on exit from perf_sched__replay. Add an exit >>>>>> variable so that thread_func can terminate, rather than leaving the >>>>>> threads blocked on mutexes. >>>>>> >>>>>> Signed-off-by: Ian Rogers <irogers@google.com> >>>>>> --- >>>>>> tools/perf/builtin-sched.c | 46 ++++++++++++++++++++++++-------------- >>>>>> 1 file changed, 29 insertions(+), 17 deletions(-) >>>>>> >>>>>> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c >>>>>> index 7e4006d6b8bc..b483ff0d432e 100644 >>>>>> --- a/tools/perf/builtin-sched.c >>>>>> +++ b/tools/perf/builtin-sched.c >>>>>> @@ -246,6 +246,7 @@ struct perf_sched { >>>>>> const char *time_str; >>>>>> struct perf_time_interval ptime; >>>>>> struct perf_time_interval hist_time; >>>>>> + volatile bool thread_funcs_exit; >>>>>> }; >>>>>> >>>>>> /* per thread run time data */ >>>>>> @@ -633,31 +634,34 @@ static void *thread_func(void *ctx) >>>>>> prctl(PR_SET_NAME, comm2); >>>>>> if (fd < 0) >>>>>> return NULL; >>>>>> -again: >>>>>> - ret = sem_post(&this_task->ready_for_work); >>>>>> - BUG_ON(ret); >>>>>> - mutex_lock(&sched->start_work_mutex); >>>>>> - mutex_unlock(&sched->start_work_mutex); >>>>>> >>>>>> - cpu_usage_0 = get_cpu_usage_nsec_self(fd); >>>>>> + while (!sched->thread_funcs_exit) { >>>>>> + ret = sem_post(&this_task->ready_for_work); >>>>>> + BUG_ON(ret); >>>>>> + mutex_lock(&sched->start_work_mutex); >>>>>> + mutex_unlock(&sched->start_work_mutex); >>>>>> >>>>>> - for (i = 0; i < this_task->nr_events; i++) { >>>>>> - 
this_task->curr_event = i; >>>>>> - perf_sched__process_event(sched, this_task->atoms[i]); >>>>>> - } >>>>>> + cpu_usage_0 = get_cpu_usage_nsec_self(fd); >>>>>> >>>>>> - cpu_usage_1 = get_cpu_usage_nsec_self(fd); >>>>>> - this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; >>>>>> - ret = sem_post(&this_task->work_done_sem); >>>>>> - BUG_ON(ret); >>>>>> + for (i = 0; i < this_task->nr_events; i++) { >>>>>> + this_task->curr_event = i; >>>>>> + perf_sched__process_event(sched, this_task->atoms[i]); >>>>>> + } >>>>>> >>>>>> - mutex_lock(&sched->work_done_wait_mutex); >>>>>> - mutex_unlock(&sched->work_done_wait_mutex); >>>>>> + cpu_usage_1 = get_cpu_usage_nsec_self(fd); >>>>>> + this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; >>>>>> + ret = sem_post(&this_task->work_done_sem); >>>>>> + BUG_ON(ret); >>>>>> >>>>>> - goto again; >>>>>> + mutex_lock(&sched->work_done_wait_mutex); >>>>>> + mutex_unlock(&sched->work_done_wait_mutex); >>>>>> + } >>>>>> + return NULL; >>>>>> } >>>>>> >>>>>> static void create_tasks(struct perf_sched *sched) >>>>>> + EXCLUSIVE_LOCK_FUNCTION(sched->start_work_mutex) >>>>>> + EXCLUSIVE_LOCK_FUNCTION(sched->work_done_wait_mutex) >>>>>> { >>>>>> struct task_desc *task; >>>>>> pthread_attr_t attr; >>>>>> @@ -687,6 +691,8 @@ static void create_tasks(struct perf_sched *sched) >>>>>> } >>>>>> >>>>>> static void wait_for_tasks(struct perf_sched *sched) >>>>>> + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) >>>>>> + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) >>>>>> { >>>>>> u64 cpu_usage_0, cpu_usage_1; >>>>>> struct task_desc *task; >>>>>> @@ -738,6 +744,8 @@ static void wait_for_tasks(struct perf_sched *sched) >>>>>> } >>>>>> >>>>>> static void run_one_test(struct perf_sched *sched) >>>>>> + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) >>>>>> + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) >>>>>> { >>>>>> u64 T0, T1, delta, avg_delta, fluct; >>>>>> >>>>>> @@ -3309,11 +3317,15 @@ static int perf_sched__replay(struct 
perf_sched *sched) >>>>>> print_task_traces(sched); >>>>>> add_cross_task_wakeups(sched); >>>>>> >>>>>> + sched->thread_funcs_exit = false; >>>>>> create_tasks(sched); >>>>>> printf("------------------------------------------------------------\n"); >>>>>> for (i = 0; i < sched->replay_repeat; i++) >>>>>> run_one_test(sched); >>>>>> >>>>>> + sched->thread_funcs_exit = true; >>>>>> + mutex_unlock(&sched->start_work_mutex); >>>>>> + mutex_unlock(&sched->work_done_wait_mutex); >>>>> >>>>> I think you still need to wait for the threads to exit before >>>>> destroying the mutexes. >>>> >>>> This is a pre-existing issue and beyond the scope of this patch set. >>> >>> You added the mutex_destroy functions in patch 8, so it is still >>> fallout from that. >> >> In the previous code the threads were blocked on mutexes that were >> stack allocated and the stack memory went away. You are correct to say >> that to those locks I added an init and destroy call. The lifetime of >> the mutex was wrong previously and remains wrong in this change. > > I think you fixed the lifetime issue with sched->thread_funcs_exit here. > All you need to do is calling pthread_join() after the mutex_unlock, no? Like this maybe: diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c index b483ff0d432e..8090c1218855 100644 --- a/tools/perf/builtin-sched.c +++ b/tools/perf/builtin-sched.c @@ -3326,6 +3326,13 @@ static int perf_sched__replay(struct perf_sched *sched) sched->thread_funcs_exit = true; mutex_unlock(&sched->start_work_mutex); mutex_unlock(&sched->work_done_wait_mutex); + /* Get rid of threads so they won't be upset by mutex destruction */ + for (i = 0; i < sched->nr_tasks; i++) { + int err = pthread_join(sched->tasks[i]->thread, NULL); + + if (err) + pr_err("pthread_join() failed for task nr %lu, error %d\n", i, err); + } return 0; }
On Fri, Aug 26, 2022 at 12:36 PM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 26/08/22 21:26, Namhyung Kim wrote: > > On Fri, Aug 26, 2022 at 10:48 AM Ian Rogers <irogers@google.com> wrote: > >> In the previous code the threads were blocked on mutexes that were > >> stack allocated and the stack memory went away. You are correct to say > >> that to those locks I added an init and destroy call. The lifetime of > >> the mutex was wrong previously and remains wrong in this change. > > > > I think you fixed the lifetime issue with sched->thread_funcs_exit here. > > All you need to do is calling pthread_join() after the mutex_unlock, no? > > Like this maybe: Yeah, but at this point we might want to factor it out as a function like destroy_tasks(). Thanks, Namhyung > > diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c > index b483ff0d432e..8090c1218855 100644 > --- a/tools/perf/builtin-sched.c > +++ b/tools/perf/builtin-sched.c > @@ -3326,6 +3326,13 @@ static int perf_sched__replay(struct perf_sched *sched) > sched->thread_funcs_exit = true; > mutex_unlock(&sched->start_work_mutex); > mutex_unlock(&sched->work_done_wait_mutex); > + /* Get rid of threads so they won't be upset by mutex destruction */ > + for (i = 0; i < sched->nr_tasks; i++) { > + int err = pthread_join(sched->tasks[i]->thread, NULL); > + > + if (err) > + pr_err("pthread_join() failed for task nr %lu, error %d\n", i, err); > + } > return 0; > }
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c index 7e4006d6b8bc..b483ff0d432e 100644 --- a/tools/perf/builtin-sched.c +++ b/tools/perf/builtin-sched.c @@ -246,6 +246,7 @@ struct perf_sched { const char *time_str; struct perf_time_interval ptime; struct perf_time_interval hist_time; + volatile bool thread_funcs_exit; }; /* per thread run time data */ @@ -633,31 +634,34 @@ static void *thread_func(void *ctx) prctl(PR_SET_NAME, comm2); if (fd < 0) return NULL; -again: - ret = sem_post(&this_task->ready_for_work); - BUG_ON(ret); - mutex_lock(&sched->start_work_mutex); - mutex_unlock(&sched->start_work_mutex); - cpu_usage_0 = get_cpu_usage_nsec_self(fd); + while (!sched->thread_funcs_exit) { + ret = sem_post(&this_task->ready_for_work); + BUG_ON(ret); + mutex_lock(&sched->start_work_mutex); + mutex_unlock(&sched->start_work_mutex); - for (i = 0; i < this_task->nr_events; i++) { - this_task->curr_event = i; - perf_sched__process_event(sched, this_task->atoms[i]); - } + cpu_usage_0 = get_cpu_usage_nsec_self(fd); - cpu_usage_1 = get_cpu_usage_nsec_self(fd); - this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; - ret = sem_post(&this_task->work_done_sem); - BUG_ON(ret); + for (i = 0; i < this_task->nr_events; i++) { + this_task->curr_event = i; + perf_sched__process_event(sched, this_task->atoms[i]); + } - mutex_lock(&sched->work_done_wait_mutex); - mutex_unlock(&sched->work_done_wait_mutex); + cpu_usage_1 = get_cpu_usage_nsec_self(fd); + this_task->cpu_usage = cpu_usage_1 - cpu_usage_0; + ret = sem_post(&this_task->work_done_sem); + BUG_ON(ret); - goto again; + mutex_lock(&sched->work_done_wait_mutex); + mutex_unlock(&sched->work_done_wait_mutex); + } + return NULL; } static void create_tasks(struct perf_sched *sched) + EXCLUSIVE_LOCK_FUNCTION(sched->start_work_mutex) + EXCLUSIVE_LOCK_FUNCTION(sched->work_done_wait_mutex) { struct task_desc *task; pthread_attr_t attr; @@ -687,6 +691,8 @@ static void create_tasks(struct perf_sched *sched) } static void 
wait_for_tasks(struct perf_sched *sched) + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) { u64 cpu_usage_0, cpu_usage_1; struct task_desc *task; @@ -738,6 +744,8 @@ static void wait_for_tasks(struct perf_sched *sched) } static void run_one_test(struct perf_sched *sched) + EXCLUSIVE_LOCKS_REQUIRED(sched->work_done_wait_mutex) + EXCLUSIVE_LOCKS_REQUIRED(sched->start_work_mutex) { u64 T0, T1, delta, avg_delta, fluct; @@ -3309,11 +3317,15 @@ static int perf_sched__replay(struct perf_sched *sched) print_task_traces(sched); add_cross_task_wakeups(sched); + sched->thread_funcs_exit = false; create_tasks(sched); printf("------------------------------------------------------------\n"); for (i = 0; i < sched->replay_repeat; i++) run_one_test(sched); + sched->thread_funcs_exit = true; + mutex_unlock(&sched->start_work_mutex); + mutex_unlock(&sched->work_done_wait_mutex); return 0; }
Add annotations to describe lock behavior. Add unlocks so that mutexes aren't conditionally held on exit from perf_sched__replay. Add an exit variable so that thread_func can terminate, rather than leaving the threads blocked on mutexes. Signed-off-by: Ian Rogers <irogers@google.com> --- tools/perf/builtin-sched.c | 46 ++++++++++++++++++++++++-------------- 1 file changed, 29 insertions(+), 17 deletions(-)