Message ID | 20210111130019.3515669-5-mperttunen@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Series | Host1x/TegraDRM UAPI |
11.01.2021 16:00, Mikko Perttunen wrote:
> -void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
> +			 bool flush)
>  {
>  	struct host1x_waitlist *waiter = ref;
>  	struct host1x_syncpt *syncpt;
>
> -	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
> -	       WLS_REMOVED)
> -		schedule();
> +	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>
>  	syncpt = host->syncpt + id;
> -	(void)process_wait_list(host, syncpt,
> -				host1x_syncpt_load(host->syncpt + id));
> +
> +	spin_lock(&syncpt->intr.lock);
> +	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
> +	    WLS_CANCELLED) {
> +		list_del(&waiter->list);
> +		kref_put(&waiter->refcount, waiter_release);
> +	}
> +	spin_unlock(&syncpt->intr.lock);
> +
> +	if (flush) {
> +		/* Wait until any concurrently executing handler has finished. */
> +		while (atomic_read(&waiter->state) != WLS_HANDLED)
> +			cpu_relax();
> +	}

A busy-loop shouldn't be used in kernel unless there is a very good
reason. The wait_event() should be used instead.

But please don't hurry to update this patch, we may need or want to
retire the host1x-waiter and then these all waiter-related patches won't
be needed.
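For illustration only, the review comment above can be modelled in user space. This is not kernel code and not the driver's implementation: it assumes a pthread condition variable as a stand-in for a kernel wait queue with wait_event()/wake_up(). The type `struct waiter_model` and the helpers `waiter_model_init`, `waiter_model_set_state`, `waiter_model_flush`, `handler_thread` and `flush_demo` are hypothetical names; only the WLS_* state names are taken from the patch, and their numeric values here are arbitrary.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* State names mirror the patch; the values are illustrative only. */
enum wls_state { WLS_PENDING, WLS_REMOVED, WLS_CANCELLED, WLS_HANDLED };

/* Hypothetical user-space model of a waiter, not the driver's struct. */
struct waiter_model {
	pthread_mutex_t lock;
	pthread_cond_t changed;	/* stands in for a kernel wait queue */
	enum wls_state state;
};

static void waiter_model_init(struct waiter_model *w, enum wls_state initial)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->changed, NULL);
	w->state = initial;
}

/* Handler side: publish the new state and wake any sleeper. */
static void waiter_model_set_state(struct waiter_model *w, enum wls_state s)
{
	pthread_mutex_lock(&w->lock);
	w->state = s;
	pthread_cond_broadcast(&w->changed);
	pthread_mutex_unlock(&w->lock);
}

/* Flush side: sleep until the state is HANDLED instead of spinning. */
static void waiter_model_flush(struct waiter_model *w)
{
	pthread_mutex_lock(&w->lock);
	while (w->state != WLS_HANDLED)
		pthread_cond_wait(&w->changed, &w->lock);
	pthread_mutex_unlock(&w->lock);
}

/* Models a handler that finishes concurrently with the flush. */
static void *handler_thread(void *arg)
{
	waiter_model_set_state(arg, WLS_HANDLED);
	return NULL;
}

/* Returns true once a flush racing with the handler thread has completed. */
static bool flush_demo(void)
{
	struct waiter_model w;
	pthread_t t;
	int rc;

	/* Assumption: REMOVED models "handler claimed the waiter, still running". */
	waiter_model_init(&w, WLS_REMOVED);
	rc = pthread_create(&t, NULL, handler_thread, &w);
	assert(rc == 0);

	waiter_model_flush(&w);
	pthread_join(t, NULL);
	return w.state == WLS_HANDLED;
}
```

The predicate is re-checked in a loop under the lock, so the flush side cannot miss a state change that happens before it starts sleeping. Calling `flush_demo()` from a program linked with pthread support exercises both sides.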
On 1/13/21 12:07 AM, Dmitry Osipenko wrote:
> 11.01.2021 16:00, Mikko Perttunen wrote:
>> -void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
>> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
>> +			 bool flush)
>>  {
>>  	struct host1x_waitlist *waiter = ref;
>>  	struct host1x_syncpt *syncpt;
>>
>> -	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
>> -	       WLS_REMOVED)
>> -		schedule();
>> +	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>>
>>  	syncpt = host->syncpt + id;
>> -	(void)process_wait_list(host, syncpt,
>> -				host1x_syncpt_load(host->syncpt + id));
>> +
>> +	spin_lock(&syncpt->intr.lock);
>> +	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
>> +	    WLS_CANCELLED) {
>> +		list_del(&waiter->list);
>> +		kref_put(&waiter->refcount, waiter_release);
>> +	}
>> +	spin_unlock(&syncpt->intr.lock);
>> +
>> +	if (flush) {
>> +		/* Wait until any concurrently executing handler has finished. */
>> +		while (atomic_read(&waiter->state) != WLS_HANDLED)
>> +			cpu_relax();
>> +	}
>
> A busy-loop shouldn't be used in kernel unless there is a very good
> reason. The wait_event() should be used instead.
>
> But please don't hurry to update this patch, we may need or want to
> retire the host1x-waiter and then these all waiter-related patches won't
> be needed.
>

Yes, we should improve the intr code to remove all this complexity. But
let's merge this first to get a functional baseline and do larger design
changes in follow-up patches.

It is cumbersome for me to develop further series (of which I have
several under work and planning) with this baseline series not being
merged. The uncertainty on the approval of the UAPI design also makes it
hard to know whether it makes sense for me to work on top of this code
or not, so I'd like to focus on what's needed to get this merged instead
of further redesign of the driver at this time.

Mikko
13.01.2021 01:20, Mikko Perttunen wrote:
> On 1/13/21 12:07 AM, Dmitry Osipenko wrote:
>> 11.01.2021 16:00, Mikko Perttunen wrote:
>>> -void host1x_intr_put_ref(struct host1x *host, unsigned int id, void
>>> *ref)
>>> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void
>>> *ref,
>>> +			 bool flush)
>>>  {
>>>  	struct host1x_waitlist *waiter = ref;
>>>  	struct host1x_syncpt *syncpt;
>>> -	while (atomic_cmpxchg(&waiter->state, WLS_PENDING,
>>> WLS_CANCELLED) ==
>>> -	       WLS_REMOVED)
>>> -		schedule();
>>> +	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>>>  	syncpt = host->syncpt + id;
>>> -	(void)process_wait_list(host, syncpt,
>>> -				host1x_syncpt_load(host->syncpt + id));
>>> +
>>> +	spin_lock(&syncpt->intr.lock);
>>> +	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
>>> +	    WLS_CANCELLED) {
>>> +		list_del(&waiter->list);
>>> +		kref_put(&waiter->refcount, waiter_release);
>>> +	}
>>> +	spin_unlock(&syncpt->intr.lock);
>>> +
>>> +	if (flush) {
>>> +		/* Wait until any concurrently executing handler has
>>> finished. */
>>> +		while (atomic_read(&waiter->state) != WLS_HANDLED)
>>> +			cpu_relax();
>>> +	}
>>
>> A busy-loop shouldn't be used in kernel unless there is a very good
>> reason. The wait_event() should be used instead.
>>
>> But please don't hurry to update this patch, we may need or want to
>> retire the host1x-waiter and then these all waiter-related patches won't
>> be needed.
>>
>
> Yes, we should improve the intr code to remove all this complexity. But
> let's merge this first to get a functional baseline and do larger design
> changes in follow-up patches.
>
> It is cumbersome for me to develop further series (of which I have
> several under work and planning) with this baseline series not being
> merged. The uncertainty on the approval of the UAPI design also makes it
> hard to know whether it makes sense for me to work on top of this code
> or not, so I'd like to focus on what's needed to get this merged instead
> of further redesign of the driver at this time.

Is this patch (and some others) necessary for the new UAPI? If not,
could we please narrow down the patches to the minimum that is needed
for trying out the new UAPI?
On 1/13/21 6:29 PM, Dmitry Osipenko wrote:
> 13.01.2021 01:20, Mikko Perttunen wrote:
>> On 1/13/21 12:07 AM, Dmitry Osipenko wrote:
>>> 11.01.2021 16:00, Mikko Perttunen wrote:
>>>> -void host1x_intr_put_ref(struct host1x *host, unsigned int id, void
>>>> *ref)
>>>> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void
>>>> *ref,
>>>> +			 bool flush)
>>>>  {
>>>>  	struct host1x_waitlist *waiter = ref;
>>>>  	struct host1x_syncpt *syncpt;
>>>> -	while (atomic_cmpxchg(&waiter->state, WLS_PENDING,
>>>> WLS_CANCELLED) ==
>>>> -	       WLS_REMOVED)
>>>> -		schedule();
>>>> +	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>>>>  	syncpt = host->syncpt + id;
>>>> -	(void)process_wait_list(host, syncpt,
>>>> -				host1x_syncpt_load(host->syncpt + id));
>>>> +
>>>> +	spin_lock(&syncpt->intr.lock);
>>>> +	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
>>>> +	    WLS_CANCELLED) {
>>>> +		list_del(&waiter->list);
>>>> +		kref_put(&waiter->refcount, waiter_release);
>>>> +	}
>>>> +	spin_unlock(&syncpt->intr.lock);
>>>> +
>>>> +	if (flush) {
>>>> +		/* Wait until any concurrently executing handler has
>>>> finished. */
>>>> +		while (atomic_read(&waiter->state) != WLS_HANDLED)
>>>> +			cpu_relax();
>>>> +	}
>>>
>>> A busy-loop shouldn't be used in kernel unless there is a very good
>>> reason. The wait_event() should be used instead.
>>>
>>> But please don't hurry to update this patch, we may need or want to
>>> retire the host1x-waiter and then these all waiter-related patches won't
>>> be needed.
>>>
>>
>> Yes, we should improve the intr code to remove all this complexity. But
>> let's merge this first to get a functional baseline and do larger design
>> changes in follow-up patches.
>>
>> It is cumbersome for me to develop further series (of which I have
>> several under work and planning) with this baseline series not being
>> merged. The uncertainty on the approval of the UAPI design also makes it
>> hard to know whether it makes sense for me to work on top of this code
>> or not, so I'd like to focus on what's needed to get this merged instead
>> of further redesign of the driver at this time.
>
> Is this patch (and some others) necessary for the new UAPI? If not,
> could we please narrow down the patches to the minimum that is needed
> for trying out the new UAPI?
>

Yes, it is necessary. I tried to revert it and half the tests in the
test suite start failing. I think patches 01, 03, 14 and 17 are not
strictly required, though reverting 03 will cause one of the syncpoint
tests to fail.

Mikko
On Wed, Jan 13, 2021 at 12:20:38AM +0200, Mikko Perttunen wrote:
> On 1/13/21 12:07 AM, Dmitry Osipenko wrote:
> > 11.01.2021 16:00, Mikko Perttunen wrote:
> > > -void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
> > > +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
> > > +			 bool flush)
> > >  {
> > >  	struct host1x_waitlist *waiter = ref;
> > >  	struct host1x_syncpt *syncpt;
> > > -	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
> > > -	       WLS_REMOVED)
> > > -		schedule();
> > > +	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
> > >  	syncpt = host->syncpt + id;
> > > -	(void)process_wait_list(host, syncpt,
> > > -				host1x_syncpt_load(host->syncpt + id));
> > > +
> > > +	spin_lock(&syncpt->intr.lock);
> > > +	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
> > > +	    WLS_CANCELLED) {
> > > +		list_del(&waiter->list);
> > > +		kref_put(&waiter->refcount, waiter_release);
> > > +	}
> > > +	spin_unlock(&syncpt->intr.lock);
> > > +
> > > +	if (flush) {
> > > +		/* Wait until any concurrently executing handler has finished. */
> > > +		while (atomic_read(&waiter->state) != WLS_HANDLED)
> > > +			cpu_relax();
> > > +	}
> >
> > A busy-loop shouldn't be used in kernel unless there is a very good
> > reason. The wait_event() should be used instead.
> >
> > But please don't hurry to update this patch, we may need or want to
> > retire the host1x-waiter and then these all waiter-related patches won't
> > be needed.
> >
>
> Yes, we should improve the intr code to remove all this complexity. But
> let's merge this first to get a functional baseline and do larger design
> changes in follow-up patches.

I agree, there's no reason to hold back any interim improvements.

But I do agree with Dmitry's argument about the busy loop. Prior to
this, the code used schedule() to let the CPU run other code while
waiting for the waiter to change state. Is there any reason why we
can't keep schedule() here?

Thierry
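
As a rough user-space analogue of the question above, the sketch below polls the state but gives up the CPU between checks. It is an assumption-laden model, not the driver's code: `sched_yield()` stands in for the kernel's schedule(), C11 atomics stand in for the kernel's atomic_t helpers, and `flush_yielding`, `finish_handler` and `yield_demo` are hypothetical names. Only the WLS_* state names come from the patch; their values here are arbitrary.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* State names mirror the patch; the values are illustrative only. */
enum { WLS_PENDING, WLS_REMOVED, WLS_CANCELLED, WLS_HANDLED };

/*
 * Poll until the state is HANDLED, yielding the CPU between checks.
 * sched_yield() is a user-space stand-in for the kernel's schedule().
 */
static void flush_yielding(atomic_int *state)
{
	while (atomic_load(state) != WLS_HANDLED)
		sched_yield();
}

/* Models a concurrently running handler that marks the waiter handled. */
static void *finish_handler(void *arg)
{
	atomic_store((atomic_int *)arg, WLS_HANDLED);
	return NULL;
}

/* Returns true once the yielding flush has observed the handler finishing. */
static bool yield_demo(void)
{
	/* Assumption: REMOVED models "handler claimed the waiter, still running". */
	atomic_int state = WLS_REMOVED;
	pthread_t t;
	int rc = pthread_create(&t, NULL, finish_handler, &state);

	assert(rc == 0);
	flush_yielding(&state);
	pthread_join(t, NULL);
	return atomic_load(&state) == WLS_HANDLED;
}
```

Unlike a pure spin, the loop lets other runnable threads make progress while it waits, but it still polls rather than sleeping until woken.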
diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
index 9245add23b5d..70e1096a4fe9 100644
--- a/drivers/gpu/host1x/intr.c
+++ b/drivers/gpu/host1x/intr.c
@@ -242,18 +242,29 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
 	return 0;
 }
 
-void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
+void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
+			 bool flush)
 {
 	struct host1x_waitlist *waiter = ref;
 	struct host1x_syncpt *syncpt;
 
-	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
-	       WLS_REMOVED)
-		schedule();
+	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
 
 	syncpt = host->syncpt + id;
-	(void)process_wait_list(host, syncpt,
-				host1x_syncpt_load(host->syncpt + id));
+
+	spin_lock(&syncpt->intr.lock);
+	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
+	    WLS_CANCELLED) {
+		list_del(&waiter->list);
+		kref_put(&waiter->refcount, waiter_release);
+	}
+	spin_unlock(&syncpt->intr.lock);
+
+	if (flush) {
+		/* Wait until any concurrently executing handler has finished. */
+		while (atomic_read(&waiter->state) != WLS_HANDLED)
+			cpu_relax();
+	}
 
 	kref_put(&waiter->refcount, waiter_release);
 }
diff --git a/drivers/gpu/host1x/intr.h b/drivers/gpu/host1x/intr.h
index aac38194398f..6ea55e615e3a 100644
--- a/drivers/gpu/host1x/intr.h
+++ b/drivers/gpu/host1x/intr.h
@@ -74,8 +74,10 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
  * Unreference an action submitted to host1x_intr_add_action().
  * You must call this if you passed non-NULL as ref.
  * @ref the ref returned from host1x_intr_add_action()
+ * @flush wait until any pending handlers have completed before returning.
  */
-void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref);
+void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
+			 bool flush);
 
 /* Initialize host1x sync point interrupt */
 int host1x_intr_init(struct host1x *host, unsigned int irq_sync);
diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 5982fdf64e1c..e48b4595cf53 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -293,7 +293,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 		}
 	}
 
-	host1x_intr_put_ref(sp->host, sp->id, ref);
+	host1x_intr_put_ref(sp->host, sp->id, ref, true);
 
 done:
 	return err;
Before this patch, cancelled waiters would only be cleaned up
once their threshold value was reached. Make host1x_intr_put_ref
process the cancellation immediately to fix this.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v5:
* Add parameter to flush, i.e. wait for all pending waiters to
  complete before returning. The reason this is not always true
  is that the pending waiter might be the place that is calling
  the put_ref.
---
 drivers/gpu/host1x/intr.c   | 23 +++++++++++++++++------
 drivers/gpu/host1x/intr.h   |  4 +++-
 drivers/gpu/host1x/syncpt.c |  2 +-
 3 files changed, 21 insertions(+), 8 deletions(-)
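
The immediate cancellation described in the commit message comes down to two compare-and-swap steps on the waiter state. The sketch below models only those transitions in user space with C11 atomics. It is a simplification under stated assumptions: `cancel_waiter_model` is a hypothetical name, the WLS_* values are arbitrary, and the locking (`syncpt->intr.lock`), the `list_del()` and the reference drop that the patch performs around the second step are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* State names mirror the patch; the values are illustrative only. */
enum { WLS_PENDING, WLS_REMOVED, WLS_CANCELLED, WLS_HANDLED };

/*
 * Hypothetical model of the two cmpxchg steps in the patched put_ref.
 * Returns true if this call claimed the waiter, i.e. the caller would be
 * the one to unlink it and drop the list's reference.
 *
 * Note: the kernel's atomic_cmpxchg() returns the old value, while
 * C11's atomic_compare_exchange_strong() returns whether the swap happened.
 */
static bool cancel_waiter_model(atomic_int *state)
{
	int expected = WLS_PENDING;

	assert(state != NULL);

	/* Step 1: PENDING -> CANCELLED; any other state is left untouched. */
	atomic_compare_exchange_strong(state, &expected, WLS_CANCELLED);

	/* Step 2: CANCELLED -> HANDLED; success means we own the cleanup. */
	expected = WLS_CANCELLED;
	return atomic_compare_exchange_strong(state, &expected, WLS_HANDLED);
}
```

In this model a waiter that is still pending ends up in WLS_HANDLED right away, while a waiter in any other state, for example one a handler has already moved to WLS_REMOVED, is left alone. That second case is where the `flush` wait in the patch applies.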