[v5,03/21] gpu: host1x: Show number of pending waiters in debugfs

Message ID 20210111130019.3515669-4-mperttunen@nvidia.com (mailing list archive)
State New, archived
Series Host1x/TegraDRM UAPI

Commit Message

Mikko Perttunen Jan. 11, 2021, 1 p.m. UTC
Show the number of pending waiters in the debugfs status file.
This is useful for testing to verify that waiters do not leak
or accumulate incorrectly.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/debug.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Comments

Thierry Reding March 23, 2021, 10:16 a.m. UTC | #1
On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
> Show the number of pending waiters in the debugfs status file.
> This is useful for testing to verify that waiters do not leak
> or accumulate incorrectly.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
>  drivers/gpu/host1x/debug.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
> index 1b4997bda1c7..8a14880c61bb 100644
> --- a/drivers/gpu/host1x/debug.c
> +++ b/drivers/gpu/host1x/debug.c
> @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
>  
>  static void show_syncpts(struct host1x *m, struct output *o)
>  {
> +	struct list_head *pos;
>  	unsigned int i;
>  
>  	host1x_debug_output(o, "---- syncpts ----\n");
> @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
>  	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
>  		u32 max = host1x_syncpt_read_max(m->syncpt + i);
>  		u32 min = host1x_syncpt_load(m->syncpt + i);
> +		unsigned int waiters = 0;
>  
> -		if (!min && !max)
> +		spin_lock(&m->syncpt[i].intr.lock);
> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
> +			waiters++;
> +		spin_unlock(&m->syncpt[i].intr.lock);

Would it make sense to keep a running count so that we don't have to
compute it here?
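
For reference, a minimal sketch of the running-count variant suggested
here - the struct layout is abbreviated and the helper names are
hypothetical, not the actual host1x code (waiters are really added and
removed in drivers/gpu/host1x/intr.c):

struct host1x_syncpt_intr {
	spinlock_t lock;
	struct list_head wait_head;
	unsigned int waiter_count;	/* hypothetical new field */
	/* ... */
};

static void queue_waiter(struct host1x_syncpt_intr *intr,
			 struct host1x_waitlist *waiter)
{
	spin_lock(&intr->lock);
	list_add_tail(&waiter->list, &intr->wait_head);
	intr->waiter_count++;		/* kept in sync under intr->lock */
	spin_unlock(&intr->lock);
}

static void dequeue_waiter(struct host1x_syncpt_intr *intr,
			   struct host1x_waitlist *waiter)
{
	spin_lock(&intr->lock);
	list_del(&waiter->list);
	intr->waiter_count--;
	spin_unlock(&intr->lock);
}

debugfs could then read waiter_count as a single load under the lock
instead of walking wait_head.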

> +
> +		if (!min && !max && !waiters)
>  			continue;
>  
> -		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
> -				    i, m->syncpt[i].name, min, max);
> +		host1x_debug_output(o,
> +				    "id %u (%s) min %d max %d (%d waiters)\n",
> +				    i, m->syncpt[i].name, min, max, waiters);

Or alternatively, would it be useful to collect a bit more information
about waiters so that when they leak we get a better understanding of
which ones leak?

It doesn't look like we currently have much information in struct
host1x_waitlist to identify waiters, but perhaps that can be extended?
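
To illustrate that idea, a hypothetical extension of struct
host1x_waitlist that records the creating task, so leaked waiters could
be attributed - the comm/pid fields and the print format below are
invented for this sketch, not part of the driver:

#include <linux/sched.h>	/* TASK_COMM_LEN, get_task_comm() */

struct host1x_waitlist {
	struct list_head list;
	/* ... existing fields ... */
	char comm[TASK_COMM_LEN];	/* hypothetical: creator's comm */
	pid_t pid;			/* hypothetical: creator's pid */
};

	/* at waiter-creation time (process context): */
	get_task_comm(waiter->comm, current);
	waiter->pid = task_pid_nr(current);

	/* and in the debugfs walk, for each entry w on wait_head: */
	host1x_debug_output(o, "  waiter: %s (%d)\n", w->comm, w->pid);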

Thierry
Mikko Perttunen March 26, 2021, 2:34 p.m. UTC | #2
On 3/23/21 12:16 PM, Thierry Reding wrote:
> On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
>> Show the number of pending waiters in the debugfs status file.
>> This is useful for testing to verify that waiters do not leak
>> or accumulate incorrectly.
>>
>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
>> ---
>>   drivers/gpu/host1x/debug.c | 14 +++++++++++---
>>   1 file changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
>> index 1b4997bda1c7..8a14880c61bb 100644
>> --- a/drivers/gpu/host1x/debug.c
>> +++ b/drivers/gpu/host1x/debug.c
>> @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
>>   
>>   static void show_syncpts(struct host1x *m, struct output *o)
>>   {
>> +	struct list_head *pos;
>>   	unsigned int i;
>>   
>>   	host1x_debug_output(o, "---- syncpts ----\n");
>> @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
>>   	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
>>   		u32 max = host1x_syncpt_read_max(m->syncpt + i);
>>   		u32 min = host1x_syncpt_load(m->syncpt + i);
>> +		unsigned int waiters = 0;
>>   
>> -		if (!min && !max)
>> +		spin_lock(&m->syncpt[i].intr.lock);
>> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
>> +			waiters++;
>> +		spin_unlock(&m->syncpt[i].intr.lock);
> 
> Would it make sense to keep a running count so that we don't have to
> compute it here?

Considering this is just a debug facility, I think I prefer not adding a 
new field just for it.

> 
>> +
>> +		if (!min && !max && !waiters)
>>   			continue;
>>   
>> -		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
>> -				    i, m->syncpt[i].name, min, max);
>> +		host1x_debug_output(o,
>> +				    "id %u (%s) min %d max %d (%d waiters)\n",
>> +				    i, m->syncpt[i].name, min, max, waiters);
> 
> Or alternatively, would it be useful to collect a bit more information
> about waiters so that when they leak we get a better understanding of
> which ones leak?
> 
> It doesn't look like we currently have much information in struct
> host1x_waitlist to identify waiters, but perhaps that can be extended?

I added this patch mainly for use with integration tests, so they can 
verify that no waiters leaked in negative tests. Let's put off adding 
other information until there's some need for it.

Mikko

> 
> Thierry
>
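
As a sketch of how such an integration test might consume this output,
a userspace helper that sums the "(N waiters)" counts from the status
file - the debugfs path is the usual host1x location but is an
assumption here, and the parsing relies only on the format string added
by this patch:

#include <stdio.h>
#include <string.h>

static int total_waiters(void)
{
	FILE *f = fopen("/sys/kernel/debug/tegra-host1x/status", "r");
	char line[256];
	int total = 0, n;

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		/* match the trailing "(%d waiters)" group on each line */
		char *p = strrchr(line, '(');

		if (p && sscanf(p, "(%d waiters)", &n) == 1)
			total += n;
	}

	fclose(f);
	return total;
}

A negative test can then assert that total_waiters() returns to its
pre-test value once the failed submission has been cleaned up.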
Michał Mirosław April 1, 2021, 9:19 p.m. UTC | #3
On Fri, Mar 26, 2021 at 04:34:13PM +0200, Mikko Perttunen wrote:
> On 3/23/21 12:16 PM, Thierry Reding wrote:
> > On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
> > > Show the number of pending waiters in the debugfs status file.
> > > This is useful for testing to verify that waiters do not leak
> > > or accumulate incorrectly.
> > > 
> > > Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> > > ---
> > >   drivers/gpu/host1x/debug.c | 14 +++++++++++---
> > >   1 file changed, 11 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
> > > index 1b4997bda1c7..8a14880c61bb 100644
> > > --- a/drivers/gpu/host1x/debug.c
> > > +++ b/drivers/gpu/host1x/debug.c
> > > @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
> > >   static void show_syncpts(struct host1x *m, struct output *o)
> > >   {
> > > +	struct list_head *pos;
> > >   	unsigned int i;
> > >   	host1x_debug_output(o, "---- syncpts ----\n");
> > > @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
> > >   	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
> > >   		u32 max = host1x_syncpt_read_max(m->syncpt + i);
> > >   		u32 min = host1x_syncpt_load(m->syncpt + i);
> > > +		unsigned int waiters = 0;
> > > -		if (!min && !max)
> > > +		spin_lock(&m->syncpt[i].intr.lock);
> > > +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
> > > +			waiters++;
> > > +		spin_unlock(&m->syncpt[i].intr.lock);
> > 
> > Would it make sense to keep a running count so that we don't have to
> > compute it here?
> 
> Considering this is just a debug facility, I think I prefer not adding a new
> field just for it.

This looks like an IRQ-disabled region, so unless only root can trigger
this code, maybe the additional field could save a potential headache?
How many waiters can there be in the worst case?

Best Regards
Michał Mirosław
Dmitry Osipenko April 2, 2021, 4:02 p.m. UTC | #4
02.04.2021 00:19, Michał Mirosław wrote:
> On Fri, Mar 26, 2021 at 04:34:13PM +0200, Mikko Perttunen wrote:
>> On 3/23/21 12:16 PM, Thierry Reding wrote:
>>> On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
>>>> Show the number of pending waiters in the debugfs status file.
>>>> This is useful for testing to verify that waiters do not leak
>>>> or accumulate incorrectly.
>>>>
>>>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
>>>> ---
>>>>   drivers/gpu/host1x/debug.c | 14 +++++++++++---
>>>>   1 file changed, 11 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
>>>> index 1b4997bda1c7..8a14880c61bb 100644
>>>> --- a/drivers/gpu/host1x/debug.c
>>>> +++ b/drivers/gpu/host1x/debug.c
>>>> @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
>>>>   static void show_syncpts(struct host1x *m, struct output *o)
>>>>   {
>>>> +	struct list_head *pos;
>>>>   	unsigned int i;
>>>>   	host1x_debug_output(o, "---- syncpts ----\n");
>>>> @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
>>>>   	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
>>>>   		u32 max = host1x_syncpt_read_max(m->syncpt + i);
>>>>   		u32 min = host1x_syncpt_load(m->syncpt + i);
>>>> +		unsigned int waiters = 0;
>>>> -		if (!min && !max)
>>>> +		spin_lock(&m->syncpt[i].intr.lock);
>>>> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
>>>> +			waiters++;
>>>> +		spin_unlock(&m->syncpt[i].intr.lock);
>>>
>>> Would it make sense to keep a running count so that we don't have to
>>> compute it here?
>>
>> Considering this is just a debug facility, I think I prefer not adding a new
>> field just for it.
> 
> This looks like an IRQ-disabled region, so unless only root can trigger
> this code, maybe the additional field could save a potential headache?
> How many waiters can there be in the worst case?

The host1x's IRQ handler runs in a workqueue, so it should be okay.
Michał Mirosław April 8, 2021, 4:13 a.m. UTC | #5
On Fri, Apr 02, 2021 at 07:02:32PM +0300, Dmitry Osipenko wrote:
> 02.04.2021 00:19, Michał Mirosław wrote:
> > On Fri, Mar 26, 2021 at 04:34:13PM +0200, Mikko Perttunen wrote:
> >> On 3/23/21 12:16 PM, Thierry Reding wrote:
> >>> On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
> >>>> Show the number of pending waiters in the debugfs status file.
> >>>> This is useful for testing to verify that waiters do not leak
> >>>> or accumulate incorrectly.
> >>>>
> >>>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> >>>> ---
> >>>>   drivers/gpu/host1x/debug.c | 14 +++++++++++---
> >>>>   1 file changed, 11 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
> >>>> index 1b4997bda1c7..8a14880c61bb 100644
> >>>> --- a/drivers/gpu/host1x/debug.c
> >>>> +++ b/drivers/gpu/host1x/debug.c
> >>>> @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
> >>>>   static void show_syncpts(struct host1x *m, struct output *o)
> >>>>   {
> >>>> +	struct list_head *pos;
> >>>>   	unsigned int i;
> >>>>   	host1x_debug_output(o, "---- syncpts ----\n");
> >>>> @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
> >>>>   	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
> >>>>   		u32 max = host1x_syncpt_read_max(m->syncpt + i);
> >>>>   		u32 min = host1x_syncpt_load(m->syncpt + i);
> >>>> +		unsigned int waiters = 0;
> >>>> -		if (!min && !max)
> >>>> +		spin_lock(&m->syncpt[i].intr.lock);
> >>>> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
> >>>> +			waiters++;
> >>>> +		spin_unlock(&m->syncpt[i].intr.lock);
> >>>
> >>> Would it make sense to keep a running count so that we don't have to
> >>> compute it here?
> >>
> >> Considering this is just a debug facility, I think I prefer not adding a new
> >> field just for it.
> > 
> > This looks like an IRQ-disabled region, so unless only root can trigger
> > this code, maybe the additional field could save a potential headache?
> > How many waiters can there be in the worst case?
> 
> The host1x's IRQ handler runs in a workqueue, so it should be okay.

Why, then, does this use a spinlock (and why does it have 'intr' in its name)?

Best Regards
Michał Mirosław
Michał Mirosław April 8, 2021, 4:25 a.m. UTC | #6
On Thu, Apr 08, 2021 at 06:13:44AM +0200, Michał Mirosław wrote:
> On Fri, Apr 02, 2021 at 07:02:32PM +0300, Dmitry Osipenko wrote:
> > 02.04.2021 00:19, Michał Mirosław wrote:
> > > On Fri, Mar 26, 2021 at 04:34:13PM +0200, Mikko Perttunen wrote:
> > >> On 3/23/21 12:16 PM, Thierry Reding wrote:
> > >>> On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
> > >>>> Show the number of pending waiters in the debugfs status file.
> > >>>> This is useful for testing to verify that waiters do not leak
> > >>>> or accumulate incorrectly.
> > >>>>
> > >>>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> > >>>> ---
> > >>>>   drivers/gpu/host1x/debug.c | 14 +++++++++++---
> > >>>>   1 file changed, 11 insertions(+), 3 deletions(-)
> > >>>>
> > >>>> diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
> > >>>> index 1b4997bda1c7..8a14880c61bb 100644
> > >>>> --- a/drivers/gpu/host1x/debug.c
> > >>>> +++ b/drivers/gpu/host1x/debug.c
> > >>>> @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
> > >>>>   static void show_syncpts(struct host1x *m, struct output *o)
> > >>>>   {
> > >>>> +	struct list_head *pos;
> > >>>>   	unsigned int i;
> > >>>>   	host1x_debug_output(o, "---- syncpts ----\n");
> > >>>> @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
> > >>>>   	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
> > >>>>   		u32 max = host1x_syncpt_read_max(m->syncpt + i);
> > >>>>   		u32 min = host1x_syncpt_load(m->syncpt + i);
> > >>>> +		unsigned int waiters = 0;
> > >>>> -		if (!min && !max)
> > >>>> +		spin_lock(&m->syncpt[i].intr.lock);
> > >>>> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
> > >>>> +			waiters++;
> > >>>> +		spin_unlock(&m->syncpt[i].intr.lock);
> > >>>
> > >>> Would it make sense to keep a running count so that we don't have to
> > >>> compute it here?
> > >>
> > >> Considering this is just a debug facility, I think I prefer not adding a new
> > >> field just for it.
> > > 
> > > This looks like an IRQ-disabled region, so unless only root can trigger
> > > this code, maybe the additional field could save a potential headache?
> > > How many waiters can there be in the worst case?
> > 
> > The host1x's IRQ handler runs in a workqueue, so it should be okay.
> 
> Why, then, does this use a spinlock (and why does it have 'intr' in its name)?

The critical sections are already O(n) in the number of waiters, so this
patch doesn't make things worse, as I previously thought it would. The
questions remain: What is the expected number and upper bound of waiters?
Shouldn't this be a mutex instead?

Best Regards
Michał Mirosław
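
For concreteness, the mutex variant being asked about would make the
debugfs walk look like the sketch below - assuming intr.lock were
converted from a spinlock_t to a struct mutex, which the driver as
merged does not do; this is only valid if every path that takes the
lock runs in sleepable (process) context:

	/* hypothetical: intr.lock as a struct mutex */
	mutex_lock(&m->syncpt[i].intr.lock);
	list_for_each(pos, &m->syncpt[i].intr.wait_head)
		waiters++;
	mutex_unlock(&m->syncpt[i].intr.lock);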
Mikko Perttunen April 8, 2021, 11:58 a.m. UTC | #7
On 4/8/21 7:25 AM, Michał Mirosław wrote:
> On Thu, Apr 08, 2021 at 06:13:44AM +0200, Michał Mirosław wrote:
>> On Fri, Apr 02, 2021 at 07:02:32PM +0300, Dmitry Osipenko wrote:
>>> 02.04.2021 00:19, Michał Mirosław wrote:
>>>> On Fri, Mar 26, 2021 at 04:34:13PM +0200, Mikko Perttunen wrote:
>>>>> On 3/23/21 12:16 PM, Thierry Reding wrote:
>>>>>> On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
>>>>>>> Show the number of pending waiters in the debugfs status file.
>>>>>>> This is useful for testing to verify that waiters do not leak
>>>>>>> or accumulate incorrectly.
>>>>>>>
>>>>>>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
>>>>>>> ---
>>>>>>>    drivers/gpu/host1x/debug.c | 14 +++++++++++---
>>>>>>>    1 file changed, 11 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
>>>>>>> index 1b4997bda1c7..8a14880c61bb 100644
>>>>>>> --- a/drivers/gpu/host1x/debug.c
>>>>>>> +++ b/drivers/gpu/host1x/debug.c
>>>>>>> @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
>>>>>>>    static void show_syncpts(struct host1x *m, struct output *o)
>>>>>>>    {
>>>>>>> +	struct list_head *pos;
>>>>>>>    	unsigned int i;
>>>>>>>    	host1x_debug_output(o, "---- syncpts ----\n");
>>>>>>> @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
>>>>>>>    	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
>>>>>>>    		u32 max = host1x_syncpt_read_max(m->syncpt + i);
>>>>>>>    		u32 min = host1x_syncpt_load(m->syncpt + i);
>>>>>>> +		unsigned int waiters = 0;
>>>>>>> -		if (!min && !max)
>>>>>>> +		spin_lock(&m->syncpt[i].intr.lock);
>>>>>>> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
>>>>>>> +			waiters++;
>>>>>>> +		spin_unlock(&m->syncpt[i].intr.lock);
>>>>>>
>>>>>> Would it make sense to keep a running count so that we don't have to
>>>>>> compute it here?
>>>>>
>>>>> Considering this is just a debug facility, I think I prefer not adding a new
>>>>> field just for it.
>>>>
>>>> This looks like an IRQ-disabled region, so unless only root can trigger
>>>> this code, maybe the additional field could save a potential headache?
>>>> How many waiters can there be in the worst case?
>>>
>>> The host1x's IRQ handler runs in a workqueue, so it should be okay.
>>
>> Why, then, does this use a spinlock (and why does it have 'intr' in its name)?
> 
> The critical sections are already O(n) in the number of waiters, so this
> patch doesn't make things worse, as I previously thought it would. The
> questions remain: What is the expected number and upper bound of waiters?
> Shouldn't this be a mutex instead?

Everything is primarily for historical reasons. The name 'intr' is 
because this is in the part of the host1x driver that handles syncpoint 
threshold interrupts - just some of it runs in interrupt context and 
some doesn't.

In any case, this code is scheduled for a complete redesign once we get 
the UAPI changes done. I'll take this into account at that point.

Cheers,
Mikko

> 
> Best Regards
> Michał Mirosław
>

Patch

diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
index 1b4997bda1c7..8a14880c61bb 100644
--- a/drivers/gpu/host1x/debug.c
+++ b/drivers/gpu/host1x/debug.c
@@ -69,6 +69,7 @@  static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
 
 static void show_syncpts(struct host1x *m, struct output *o)
 {
+	struct list_head *pos;
 	unsigned int i;
 
 	host1x_debug_output(o, "---- syncpts ----\n");
@@ -76,12 +77,19 @@  static void show_syncpts(struct host1x *m, struct output *o)
 	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
 		u32 max = host1x_syncpt_read_max(m->syncpt + i);
 		u32 min = host1x_syncpt_load(m->syncpt + i);
+		unsigned int waiters = 0;
 
-		if (!min && !max)
+		spin_lock(&m->syncpt[i].intr.lock);
+		list_for_each(pos, &m->syncpt[i].intr.wait_head)
+			waiters++;
+		spin_unlock(&m->syncpt[i].intr.lock);
+
+		if (!min && !max && !waiters)
 			continue;
 
-		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
-				    i, m->syncpt[i].name, min, max);
+		host1x_debug_output(o,
+				    "id %u (%s) min %d max %d (%d waiters)\n",
+				    i, m->syncpt[i].name, min, max, waiters);
 	}
 
 	for (i = 0; i < host1x_syncpt_nb_bases(m); i++) {
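
With the patch applied, each active syncpoint line in the debugfs
status file carries the waiter count. A hypothetical sample line,
following the format string above (the syncpoint name and values are
invented for illustration):

	id 26 (vblank0) min 1024 max 1024 (2 waiters)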