Message ID: 20210111130019.3515669-4-mperttunen@nvidia.com
State: New, archived
Series: Host1x/TegraDRM UAPI
On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
> Show the number of pending waiters in the debugfs status file.
> This is useful for testing to verify that waiters do not leak
> or accumulate incorrectly.
>
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
>  drivers/gpu/host1x/debug.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
> index 1b4997bda1c7..8a14880c61bb 100644
> --- a/drivers/gpu/host1x/debug.c
> +++ b/drivers/gpu/host1x/debug.c
> @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
>
>  static void show_syncpts(struct host1x *m, struct output *o)
>  {
> +	struct list_head *pos;
>  	unsigned int i;
>
>  	host1x_debug_output(o, "---- syncpts ----\n");
> @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
>  	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
>  		u32 max = host1x_syncpt_read_max(m->syncpt + i);
>  		u32 min = host1x_syncpt_load(m->syncpt + i);
> +		unsigned int waiters = 0;
>
> -		if (!min && !max)
> +		spin_lock(&m->syncpt[i].intr.lock);
> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
> +			waiters++;
> +		spin_unlock(&m->syncpt[i].intr.lock);

Would it make sense to keep a running count so that we don't have to
compute it here?

> +
> +		if (!min && !max && !waiters)
>  			continue;
>
> -		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
> -			   i, m->syncpt[i].name, min, max);
> +		host1x_debug_output(o,
> +			"id %u (%s) min %d max %d (%d waiters)\n",
> +			i, m->syncpt[i].name, min, max, waiters);

Or alternatively, would it be useful to collect a bit more information
about waiters so that when they leak we get a better understanding of
which ones leak?

It doesn't look like we currently have much information in struct
host1x_waitlist to identify waiters, but perhaps that can be extended?

Thierry
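To make the running-count idea above concrete, here is a minimal sketch of
what it could look like. This is not part of the submitted patch or the
host1x driver: the waiter_count field and the *_locked helper names are
invented for illustration, and the real add/remove paths in
drivers/gpu/host1x/intr.c would have to be updated in the same way.

#include <linux/list.h>
#include <linux/spinlock.h>

/* Sketch only: hypothetical running counter kept next to the wait list. */
struct host1x_syncpt_intr {
	spinlock_t lock;
	struct list_head wait_head;
	unsigned int waiter_count;	/* hypothetical new field */
};

struct host1x_waitlist {
	struct list_head list;
	/* existing members elided */
};

/* Callers hold intr->lock in both helpers. */
static void host1x_intr_add_waiter_locked(struct host1x_syncpt_intr *intr,
					  struct host1x_waitlist *waiter)
{
	list_add_tail(&waiter->list, &intr->wait_head);
	intr->waiter_count++;
}

static void host1x_intr_del_waiter_locked(struct host1x_syncpt_intr *intr,
					  struct host1x_waitlist *waiter)
{
	list_del(&waiter->list);
	intr->waiter_count--;
}

With something like this in place, the debugfs path would only read
intr->waiter_count under the lock instead of walking the whole list.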
On 3/23/21 12:16 PM, Thierry Reding wrote:
> On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
>> [...]
>> -		if (!min && !max)
>> +		spin_lock(&m->syncpt[i].intr.lock);
>> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
>> +			waiters++;
>> +		spin_unlock(&m->syncpt[i].intr.lock);
>
> Would it make sense to keep a running count so that we don't have to
> compute it here?

Considering this is just a debug facility, I think I prefer not adding a
new field just for it.

> Or alternatively, would it be useful to collect a bit more information
> about waiters so that when they leak we get a better understanding of
> which ones leak?
>
> It doesn't look like we currently have much information in struct
> host1x_waitlist to identify waiters, but perhaps that can be extended?

I added this patch mainly for use with integration tests, so they can
verify no waiters leaked in negative tests. I think let's put off adding
other information until there's some need for it.

Mikko
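For completeness, a sketch of the "identify the waiters" idea that is being
deferred here. None of this exists in the driver; the _dbg structure, its
fields, and the tagging helper are assumptions made only to illustrate what
recording the creating task could look like.

#include <linux/sched.h>

/* Sketch only: record which task registered a waiter so that a leaked
 * waiter can be traced back in the debugfs output. */
struct host1x_waitlist_dbg {
	char comm[TASK_COMM_LEN];	/* creator's task name */
	pid_t pid;			/* creator's pid */
};

/* Call at waiter-creation time, in process context. */
static void host1x_waitlist_tag(struct host1x_waitlist_dbg *dbg)
{
	get_task_comm(dbg->comm, current);
	dbg->pid = task_pid_nr(current);
}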
On Fri, Mar 26, 2021 at 04:34:13PM +0200, Mikko Perttunen wrote:
> On 3/23/21 12:16 PM, Thierry Reding wrote:
> > On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
> > > [...]
> > > +		spin_lock(&m->syncpt[i].intr.lock);
> > > +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
> > > +			waiters++;
> > > +		spin_unlock(&m->syncpt[i].intr.lock);
> >
> > Would it make sense to keep a running count so that we don't have to
> > compute it here?
>
> Considering this is just a debug facility, I think I prefer not adding
> a new field just for it.

This looks like an IRQ-disabled region, so unless only root can trigger
this code, maybe the additional field could save a potential headache?
How many waiters can there be in the worst case?

Best Regards
Michał Mirosław
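A short aside on the locking terminology in this exchange, since it matters
for the question above: plain spin_lock() does not disable interrupts, while
spin_lock_irqsave() does, so whether the debugfs walk really sits in an
"IRQ-disabled region" depends on which variant is involved (the patch itself
uses plain spin_lock()). A generic illustration, not host1x code:

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_lock);

static void example(void)
{
	unsigned long flags;

	/* Interrupts stay enabled; safe only if the lock is never taken
	 * from hard-IRQ context. */
	spin_lock(&example_lock);
	spin_unlock(&example_lock);

	/* Local interrupts are disabled while the lock is held -- this is
	 * what creates an IRQ-disabled region. */
	spin_lock_irqsave(&example_lock, flags);
	spin_unlock_irqrestore(&example_lock, flags);
}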
02.04.2021 00:19, Michał Mirosław wrote:
> On Fri, Mar 26, 2021 at 04:34:13PM +0200, Mikko Perttunen wrote:
>> On 3/23/21 12:16 PM, Thierry Reding wrote:
>>> On Mon, Jan 11, 2021 at 03:00:01PM +0200, Mikko Perttunen wrote:
>>>> [...]
>>>> +		spin_lock(&m->syncpt[i].intr.lock);
>>>> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
>>>> +			waiters++;
>>>> +		spin_unlock(&m->syncpt[i].intr.lock);
>>>
>>> Would it make sense to keep a running count so that we don't have to
>>> compute it here?
>>
>> Considering this is just a debug facility, I think I prefer not adding
>> a new field just for it.
>
> This looks like an IRQ-disabled region, so unless only root can trigger
> this code, maybe the additional field could save a potential headache?
> How many waiters can there be in the worst case?

The host1x IRQ handler runs in a workqueue, so it should be okay.
On Fri, Apr 02, 2021 at 07:02:32PM +0300, Dmitry Osipenko wrote:
> 02.04.2021 00:19, Michał Mirosław wrote:
> > [...]
> > This looks like an IRQ-disabled region, so unless only root can trigger
> > this code, maybe the additional field could save a potential headache?
> > How many waiters can there be in the worst case?
>
> The host1x IRQ handler runs in a workqueue, so it should be okay.

Why, then, does this use a spinlock (and have 'intr' in its name)?

Best Regards
Michał Mirosław
On Thu, Apr 08, 2021 at 06:13:44AM +0200, Michał Mirosław wrote:
> On Fri, Apr 02, 2021 at 07:02:32PM +0300, Dmitry Osipenko wrote:
> > 02.04.2021 00:19, Michał Mirosław wrote:
> > > [...]
> > > How many waiters can there be in the worst case?
> >
> > The host1x IRQ handler runs in a workqueue, so it should be okay.
>
> Why, then, does this use a spinlock (and have 'intr' in its name)?

The critical sections are already O(n) in the number of waiters, so this
patch doesn't make things worse, as I previously thought it did. The
questions remain: what is the expected number and upper bound of waiters?
Shouldn't this be a mutex instead?

Best Regards
Michał Mirosław
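To make the mutex question concrete, the hunk under discussion would become
something like the following sketch. The wait_lock mutex is hypothetical and
does not exist in the driver, and the conversion only works if every other
user of the wait list is allowed to sleep, which is exactly what the thread
has not established:

/* Hypothetical sleeping-lock variant of the debugfs walk (sketch only). */
mutex_lock(&m->syncpt[i].intr.wait_lock);
list_for_each(pos, &m->syncpt[i].intr.wait_head)
	waiters++;
mutex_unlock(&m->syncpt[i].intr.wait_lock);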
On 4/8/21 7:25 AM, Michał Mirosław wrote:
> On Thu, Apr 08, 2021 at 06:13:44AM +0200, Michał Mirosław wrote:
>> [...]
>> Why, then, does this use a spinlock (and have 'intr' in its name)?
>
> The critical sections are already O(n) in the number of waiters, so this
> patch doesn't make things worse, as I previously thought it did. The
> questions remain: what is the expected number and upper bound of waiters?
> Shouldn't this be a mutex instead?

Everything is primarily for historical reasons. The name 'intr' is because
this is in the part of the host1x driver that handles syncpoint threshold
interrupts - just some of it runs in interrupt context and some does not.

In any case, this code is scheduled for a complete redesign once we get the
UAPI changes done. I'll take this into account at that point.

Cheers,
Mikko
diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
index 1b4997bda1c7..8a14880c61bb 100644
--- a/drivers/gpu/host1x/debug.c
+++ b/drivers/gpu/host1x/debug.c
@@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
 
 static void show_syncpts(struct host1x *m, struct output *o)
 {
+	struct list_head *pos;
 	unsigned int i;
 
 	host1x_debug_output(o, "---- syncpts ----\n");
@@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
 	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
 		u32 max = host1x_syncpt_read_max(m->syncpt + i);
 		u32 min = host1x_syncpt_load(m->syncpt + i);
+		unsigned int waiters = 0;
 
-		if (!min && !max)
+		spin_lock(&m->syncpt[i].intr.lock);
+		list_for_each(pos, &m->syncpt[i].intr.wait_head)
+			waiters++;
+		spin_unlock(&m->syncpt[i].intr.lock);
+
+		if (!min && !max && !waiters)
 			continue;
 
-		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
-			   i, m->syncpt[i].name, min, max);
+		host1x_debug_output(o,
+			"id %u (%s) min %d max %d (%d waiters)\n",
+			i, m->syncpt[i].name, min, max, waiters);
 	}
 
 	for (i = 0; i < host1x_syncpt_nb_bases(m); i++) {
Show the number of pending waiters in the debugfs status file.
This is useful for testing to verify that waiters do not leak
or accumulate incorrectly.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/debug.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)
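For reference, with this patch applied each active syncpoint appears in the
debugfs status file as one line in the format of the new host1x_debug_output()
call. The line below is illustrative only; the syncpoint name and the numbers
are made up:

id 10 (example-client) min 5 max 7 (1 waiters)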