diff mbox

[RFC,v1,08/16] drm/radeon: use common fence implementation for fences

Message ID 5374BB4A.6070102@canonical.com (mailing list archive)
State New, archived
Headers show

Commit Message

Maarten Lankhorst May 15, 2014, 1:04 p.m. UTC
op 15-05-14 11:42, Christian König schreef:
> Am 15.05.2014 11:38, schrieb Maarten Lankhorst:
>> op 15-05-14 11:21, Christian König schreef:
>>> Am 15.05.2014 03:06, schrieb Maarten Lankhorst:
>>>> op 14-05-14 17:29, Christian König schreef:
>>>>>> +    /* did fence get signaled after we enabled the sw irq? */
>>>>>> +    if (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= fence->seq) {
>>>>>> +        radeon_irq_kms_sw_irq_put(fence->rdev, fence->ring);
>>>>>> +        return false;
>>>>>> +    }
>>>>>> +
>>>>>> +    fence->fence_wake.flags = 0;
>>>>>> +    fence->fence_wake.private = NULL;
>>>>>> +    fence->fence_wake.func = radeon_fence_check_signaled;
>>>>>> + __add_wait_queue(&fence->rdev->fence_queue, &fence->fence_wake);
>>>>>> +    fence_get(f);
>>>>> That looks like a race condition to me. The fence needs to be added to the wait queue before the check, not after.
>>>>>
>>>>> Apart from that the whole approach looks like a really bad idea to me. How for example is lockup detection supposed to happen with this? 
>>>> It's not a race condition because fence_queue.lock is held when this function is called.
>>> Ah, I see. That's also the reason why you moved the wake_up_all out of the processing function.
>> Correct. :-)
>>>> Lockup's a bit of a weird problem, the changes wouldn't allow core ttm code to handle the lockup any more,
>>>> but any driver specific wait code would still handle this. I did this by design, because in future patches the wait
>>>> function may be called from outside of the radeon driver. The official wait function takes a timeout parameter,
>>>> so lockups wouldn't be fatal if the timeout is set to something like 30*HZ for example, it would still return
>>>> and report that the function timed out.
>>> Timeouts help with the detection of the lockup, but not at all with the handling of them.
>>>
>>> What we essentially need is a wait callback into the driver that is called in non atomic context without any locks held.
>>>
>>> This way we can block for the fence to become signaled with a timeout and can then also initiate the reset handling if necessary.
>>>
>>> The way you designed the interface now means that the driver never gets a chance to wait for the hardware to become idle and so never has the opportunity to the reset the whole thing.
>> You could set up a hangcheck timer like intel does, and end up with a reliable hangcheck detection that doesn't depend on cpu waits. :-) Or override the default wait function and restore the old behavior.
>
> Overriding the default wait function sounds better, please implement it this way.
>
> Thanks,
> Christian. 

Does this modification look sane?

Comments

Christian König May 15, 2014, 1:19 p.m. UTC | #1
Am 15.05.2014 15:04, schrieb Maarten Lankhorst:
> op 15-05-14 11:42, Christian König schreef:
>> Am 15.05.2014 11:38, schrieb Maarten Lankhorst:
>>> op 15-05-14 11:21, Christian König schreef:
>>>> Am 15.05.2014 03:06, schrieb Maarten Lankhorst:
>>>>> op 14-05-14 17:29, Christian König schreef:
>>>>>>> +    /* did fence get signaled after we enabled the sw irq? */
>>>>>>> +    if 
>>>>>>> (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= 
>>>>>>> fence->seq) {
>>>>>>> +        radeon_irq_kms_sw_irq_put(fence->rdev, fence->ring);
>>>>>>> +        return false;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    fence->fence_wake.flags = 0;
>>>>>>> +    fence->fence_wake.private = NULL;
>>>>>>> +    fence->fence_wake.func = radeon_fence_check_signaled;
>>>>>>> + __add_wait_queue(&fence->rdev->fence_queue, &fence->fence_wake);
>>>>>>> +    fence_get(f);
>>>>>> That looks like a race condition to me. The fence needs to be 
>>>>>> added to the wait queue before the check, not after.
>>>>>>
>>>>>> Apart from that the whole approach looks like a really bad idea 
>>>>>> to me. How for example is lockup detection supposed to happen 
>>>>>> with this? 
>>>>> It's not a race condition because fence_queue.lock is held when 
>>>>> this function is called.
>>>> Ah, I see. That's also the reason why you moved the wake_up_all out 
>>>> of the processing function.
>>> Correct. :-)
>>>>> Lockup's a bit of a weird problem, the changes wouldn't allow core 
>>>>> ttm code to handle the lockup any more,
>>>>> but any driver specific wait code would still handle this. I did 
>>>>> this by design, because in future patches the wait
>>>>> function may be called from outside of the radeon driver. The 
>>>>> official wait function takes a timeout parameter,
>>>>> so lockups wouldn't be fatal if the timeout is set to something 
>>>>> like 30*HZ for example, it would still return
>>>>> and report that the function timed out.
>>>> Timeouts help with the detection of the lockup, but not at all with 
>>>> the handling of them.
>>>>
>>>> What we essentially need is a wait callback into the driver that is 
>>>> called in non atomic context without any locks held.
>>>>
>>>> This way we can block for the fence to become signaled with a 
>>>> timeout and can then also initiate the reset handling if necessary.
>>>>
>>>> The way you designed the interface now means that the driver never 
>>>> gets a chance to wait for the hardware to become idle and so never 
>>>> has the opportunity to the reset the whole thing.
>>> You could set up a hangcheck timer like intel does, and end up with 
>>> a reliable hangcheck detection that doesn't depend on cpu waits. :-) 
>>> Or override the default wait function and restore the old behavior.
>>
>> Overriding the default wait function sounds better, please implement 
>> it this way.
>>
>> Thanks,
>> Christian. 
>
> Does this modification look sane?
Adding the timeout is on my todo list for quite some time as well, so 
this part makes sense.

> +static long __radeon_fence_wait(struct fence *f, bool intr, long 
> timeout)
> +{
> +    struct radeon_fence *fence = to_radeon_fence(f);
> +    u64 target_seq[RADEON_NUM_RINGS] = {};
> +
> +    target_seq[fence->ring] = fence->seq;
> +    return radeon_fence_wait_seq_timeout(fence->rdev, target_seq, 
> intr, timeout);
> +}
When this call is comming from outside the radeon driver you need to 
lock rdev->exclusive_lock here to make sure not to interfere with a 
possible reset.

>      .get_timeline_name = radeon_fence_get_timeline_name,
>      .enable_signaling = radeon_fence_enable_signaling,
>      .signaled = __radeon_fence_signaled,
Do we still need those callback when we implemented the wait callback?

Christian.

>
> diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
> b/drivers/gpu/drm/radeon/radeon_fence.c
> index bc844f300d3f..2d415eb2834a 100644
> --- a/drivers/gpu/drm/radeon/radeon_fence.c
> +++ b/drivers/gpu/drm/radeon/radeon_fence.c
> @@ -361,28 +361,35 @@ static bool radeon_fence_any_seq_signaled(struct 
> radeon_device *rdev, u64 *seq)
>  }
>
>  /**
> - * radeon_fence_wait_seq - wait for a specific sequence numbers
> + * radeon_fence_wait_seq_timeout - wait for a specific sequence numbers
>   *
>   * @rdev: radeon device pointer
>   * @target_seq: sequence number(s) we want to wait for
>   * @intr: use interruptable sleep
> + * @timeout: maximum time to wait, or MAX_SCHEDULE_TIMEOUT for 
> infinite wait
>   *
>   * Wait for the requested sequence number(s) to be written by any ring
>   * (all asics).  Sequnce number array is indexed by ring id.
>   * @intr selects whether to use interruptable (true) or 
> non-interruptable
>   * (false) sleep when waiting for the sequence number.  Helper function
>   * for radeon_fence_wait_*().
> - * Returns 0 if the sequence number has passed, error for all other 
> cases.
> + * Returns remaining time if the sequence number has passed, 0 when
> + * the wait timeout, or an error for all other cases.
>   * -EDEADLK is returned when a GPU lockup has been detected.
>   */
> -static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 
> *target_seq,
> -                 bool intr)
> +static int radeon_fence_wait_seq_timeout(struct radeon_device *rdev,
> +                     u64 *target_seq, bool intr,
> +                     long timeout)
>  {
>      uint64_t last_seq[RADEON_NUM_RINGS];
>      bool signaled;
> -    int i, r;
> +    int i;
>
>      while (!radeon_fence_any_seq_signaled(rdev, target_seq)) {
> +        long r, waited = timeout;
> +
> +        waited = timeout < RADEON_FENCE_JIFFIES_TIMEOUT ?
> +             timeout : RADEON_FENCE_JIFFIES_TIMEOUT;
>
>          /* Save current sequence values, used to check for GPU 
> lockups */
>          for (i = 0; i < RADEON_NUM_RINGS; ++i) {
> @@ -397,13 +404,15 @@ static int radeon_fence_wait_seq(struct 
> radeon_device *rdev, u64 *target_seq,
>          if (intr) {
>              r = wait_event_interruptible_timeout(rdev->fence_queue, (
>                  (signaled = radeon_fence_any_seq_signaled(rdev, 
> target_seq))
> -                 || rdev->needs_reset), RADEON_FENCE_JIFFIES_TIMEOUT);
> +                 || rdev->needs_reset), waited);
>          } else {
>              r = wait_event_timeout(rdev->fence_queue, (
>                  (signaled = radeon_fence_any_seq_signaled(rdev, 
> target_seq))
> -                 || rdev->needs_reset), RADEON_FENCE_JIFFIES_TIMEOUT);
> +                 || rdev->needs_reset), waited);
>          }
>
> +        timeout -= waited - r;
> +
>          for (i = 0; i < RADEON_NUM_RINGS; ++i) {
>              if (!target_seq[i])
>                  continue;
> @@ -415,6 +424,12 @@ static int radeon_fence_wait_seq(struct 
> radeon_device *rdev, u64 *target_seq,
>          if (unlikely(r < 0))
>              return r;
>
> +        /*
> +         * If this is a timed wait and the wait completely timed out 
> just return.
> +         */
> +        if (!timeout)
> +            break;
> +
>          if (unlikely(!signaled)) {
>              if (rdev->needs_reset)
>                  return -EDEADLK;
> @@ -457,7 +472,7 @@ static int radeon_fence_wait_seq(struct 
> radeon_device *rdev, u64 *target_seq,
>              }
>          }
>      }
> -    return 0;
> +    return timeout;
>  }
>
>  /**
> @@ -480,8 +495,8 @@ int radeon_fence_wait(struct radeon_fence *fence, 
> bool intr)
>          return 0;
>
>      seq[fence->ring] = fence->seq;
> -    r = radeon_fence_wait_seq(fence->rdev, seq, intr);
> -    if (r) {
> +    r = radeon_fence_wait_seq_timeout(fence->rdev, seq, intr, 
> MAX_SCHEDULE_TIMEOUT);
> +    if (r < 0) {
>          return r;
>      }
>      r = fence_signal(&fence->base);
> @@ -509,7 +524,7 @@ int radeon_fence_wait_any(struct radeon_device *rdev,
>  {
>      uint64_t seq[RADEON_NUM_RINGS];
>      unsigned i, num_rings = 0;
> -    int r;
> +    long r;
>
>      for (i = 0; i < RADEON_NUM_RINGS; ++i) {
>          seq[i] = 0;
> @@ -531,8 +546,8 @@ int radeon_fence_wait_any(struct radeon_device *rdev,
>      if (num_rings == 0)
>          return -ENOENT;
>
> -    r = radeon_fence_wait_seq(rdev, seq, intr);
> -    if (r) {
> +    r = radeon_fence_wait_seq_timeout(rdev, seq, intr, 
> MAX_SCHEDULE_TIMEOUT);
> +    if (r < 0) {
>          return r;
>      }
>      return 0;
> @@ -551,6 +566,7 @@ int radeon_fence_wait_any(struct radeon_device *rdev,
>  int radeon_fence_wait_next(struct radeon_device *rdev, int ring)
>  {
>      uint64_t seq[RADEON_NUM_RINGS] = {};
> +    long r;
>
>      seq[ring] = atomic64_read(&rdev->fence_drv[ring].last_seq) + 1ULL;
>      if (seq[ring] >= rdev->fence_drv[ring].sync_seq[ring]) {
> @@ -558,7 +574,10 @@ int radeon_fence_wait_next(struct radeon_device 
> *rdev, int ring)
>             already the last emited fence */
>          return -ENOENT;
>      }
> -    return radeon_fence_wait_seq(rdev, seq, false);
> +    r = radeon_fence_wait_seq_timeout(rdev, seq, false, 
> MAX_SCHEDULE_TIMEOUT);
> +    if (r < 0)
> +        return r;
> +    return 0;
>  }
>
>  /**
> @@ -580,8 +599,8 @@ int radeon_fence_wait_empty(struct radeon_device 
> *rdev, int ring)
>      if (!seq[ring])
>          return 0;
>
> -    r = radeon_fence_wait_seq(rdev, seq, false);
> -    if (r) {
> +    r = radeon_fence_wait_seq_timeout(rdev, seq, false, 
> MAX_SCHEDULE_TIMEOUT);
> +    if (r < 0) {
>          if (r == -EDEADLK)
>              return -EDEADLK;
>
> @@ -908,6 +927,15 @@ int radeon_debugfs_fence_init(struct 
> radeon_device *rdev)
>  #endif
>  }
>
> +static long __radeon_fence_wait(struct fence *f, bool intr, long 
> timeout)
> +{
> +    struct radeon_fence *fence = to_radeon_fence(f);
> +    u64 target_seq[RADEON_NUM_RINGS] = {};
> +
> +    target_seq[fence->ring] = fence->seq;
> +    return radeon_fence_wait_seq_timeout(fence->rdev, target_seq, 
> intr, timeout);
> +}
> +
>  static const char *radeon_fence_get_driver_name(struct fence *fence)
>  {
>      return "radeon";
> @@ -932,6 +960,6 @@ static const struct fence_ops radeon_fence_ops = {
>      .get_timeline_name = radeon_fence_get_timeline_name,
>      .enable_signaling = radeon_fence_enable_signaling,
>      .signaled = __radeon_fence_signaled,
> -    .wait = fence_default_wait,
> +    .wait = __radeon_fence_wait,
>      .release = NULL,
>  };
>
Maarten Lankhorst May 15, 2014, 2:18 p.m. UTC | #2
op 15-05-14 15:19, Christian König schreef:
> Am 15.05.2014 15:04, schrieb Maarten Lankhorst:
>> op 15-05-14 11:42, Christian König schreef:
>>> Am 15.05.2014 11:38, schrieb Maarten Lankhorst:
>>>> op 15-05-14 11:21, Christian König schreef:
>>>>> Am 15.05.2014 03:06, schrieb Maarten Lankhorst:
>>>>>> op 14-05-14 17:29, Christian König schreef:
>>>>>>>> +    /* did fence get signaled after we enabled the sw irq? */
>>>>>>>> +    if (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= fence->seq) {
>>>>>>>> +        radeon_irq_kms_sw_irq_put(fence->rdev, fence->ring);
>>>>>>>> +        return false;
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +    fence->fence_wake.flags = 0;
>>>>>>>> +    fence->fence_wake.private = NULL;
>>>>>>>> +    fence->fence_wake.func = radeon_fence_check_signaled;
>>>>>>>> + __add_wait_queue(&fence->rdev->fence_queue, &fence->fence_wake);
>>>>>>>> +    fence_get(f);
>>>>>>> That looks like a race condition to me. The fence needs to be added to the wait queue before the check, not after.
>>>>>>>
>>>>>>> Apart from that the whole approach looks like a really bad idea to me. How for example is lockup detection supposed to happen with this? 
>>>>>> It's not a race condition because fence_queue.lock is held when this function is called.
>>>>> Ah, I see. That's also the reason why you moved the wake_up_all out of the processing function.
>>>> Correct. :-)
>>>>>> Lockup's a bit of a weird problem, the changes wouldn't allow core ttm code to handle the lockup any more,
>>>>>> but any driver specific wait code would still handle this. I did this by design, because in future patches the wait
>>>>>> function may be called from outside of the radeon driver. The official wait function takes a timeout parameter,
>>>>>> so lockups wouldn't be fatal if the timeout is set to something like 30*HZ for example, it would still return
>>>>>> and report that the function timed out.
>>>>> Timeouts help with the detection of the lockup, but not at all with the handling of them.
>>>>>
>>>>> What we essentially need is a wait callback into the driver that is called in non atomic context without any locks held.
>>>>>
>>>>> This way we can block for the fence to become signaled with a timeout and can then also initiate the reset handling if necessary.
>>>>>
>>>>> The way you designed the interface now means that the driver never gets a chance to wait for the hardware to become idle and so never has the opportunity to the reset the whole thing.
>>>> You could set up a hangcheck timer like intel does, and end up with a reliable hangcheck detection that doesn't depend on cpu waits. :-) Or override the default wait function and restore the old behavior.
>>>
>>> Overriding the default wait function sounds better, please implement it this way.
>>>
>>> Thanks,
>>> Christian. 
>>
>> Does this modification look sane?
> Adding the timeout is on my todo list for quite some time as well, so this part makes sense.
>
>> +static long __radeon_fence_wait(struct fence *f, bool intr, long timeout)
>> +{
>> +    struct radeon_fence *fence = to_radeon_fence(f);
>> +    u64 target_seq[RADEON_NUM_RINGS] = {};
>> +
>> +    target_seq[fence->ring] = fence->seq;
>> +    return radeon_fence_wait_seq_timeout(fence->rdev, target_seq, intr, timeout);
>> +}
> When this call is comming from outside the radeon driver you need to lock rdev->exclusive_lock here to make sure not to interfere with a possible reset.
Ah thanks, I'll add that.

>>      .get_timeline_name = radeon_fence_get_timeline_name,
>>      .enable_signaling = radeon_fence_enable_signaling,
>>      .signaled = __radeon_fence_signaled,
> Do we still need those callback when we implemented the wait callback?
.get_timeline_name is used for debugging (trace events).
.signaled is the non-blocking call to check if the fence is signaled or not.
.enable_signaling is used for adding callbacks upon fence completion, the default 'fence_default_wait' uses it, so
when it works no separate implementation is needed unless you want to do more than just waiting.
It's also used when fence_add_callback is called. i915 can be patched to use it. ;-)

~Maarten
Christian König May 15, 2014, 3:48 p.m. UTC | #3
Am 15.05.2014 16:18, schrieb Maarten Lankhorst:
> op 15-05-14 15:19, Christian König schreef:
>> Am 15.05.2014 15:04, schrieb Maarten Lankhorst:
>>> op 15-05-14 11:42, Christian König schreef:
>>>> Am 15.05.2014 11:38, schrieb Maarten Lankhorst:
>>>>> op 15-05-14 11:21, Christian König schreef:
>>>>>> Am 15.05.2014 03:06, schrieb Maarten Lankhorst:
>>>>>>> op 14-05-14 17:29, Christian König schreef:
>>>>>>>>> +    /* did fence get signaled after we enabled the sw irq? */
>>>>>>>>> +    if 
>>>>>>>>> (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) 
>>>>>>>>> >= fence->seq) {
>>>>>>>>> +        radeon_irq_kms_sw_irq_put(fence->rdev, fence->ring);
>>>>>>>>> +        return false;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    fence->fence_wake.flags = 0;
>>>>>>>>> +    fence->fence_wake.private = NULL;
>>>>>>>>> +    fence->fence_wake.func = radeon_fence_check_signaled;
>>>>>>>>> + __add_wait_queue(&fence->rdev->fence_queue, 
>>>>>>>>> &fence->fence_wake);
>>>>>>>>> +    fence_get(f);
>>>>>>>> That looks like a race condition to me. The fence needs to be 
>>>>>>>> added to the wait queue before the check, not after.
>>>>>>>>
>>>>>>>> Apart from that the whole approach looks like a really bad idea 
>>>>>>>> to me. How for example is lockup detection supposed to happen 
>>>>>>>> with this? 
>>>>>>> It's not a race condition because fence_queue.lock is held when 
>>>>>>> this function is called.
>>>>>> Ah, I see. That's also the reason why you moved the wake_up_all 
>>>>>> out of the processing function.
>>>>> Correct. :-)
>>>>>>> Lockup's a bit of a weird problem, the changes wouldn't allow 
>>>>>>> core ttm code to handle the lockup any more,
>>>>>>> but any driver specific wait code would still handle this. I did 
>>>>>>> this by design, because in future patches the wait
>>>>>>> function may be called from outside of the radeon driver. The 
>>>>>>> official wait function takes a timeout parameter,
>>>>>>> so lockups wouldn't be fatal if the timeout is set to something 
>>>>>>> like 30*HZ for example, it would still return
>>>>>>> and report that the function timed out.
>>>>>> Timeouts help with the detection of the lockup, but not at all 
>>>>>> with the handling of them.
>>>>>>
>>>>>> What we essentially need is a wait callback into the driver that 
>>>>>> is called in non atomic context without any locks held.
>>>>>>
>>>>>> This way we can block for the fence to become signaled with a 
>>>>>> timeout and can then also initiate the reset handling if necessary.
>>>>>>
>>>>>> The way you designed the interface now means that the driver 
>>>>>> never gets a chance to wait for the hardware to become idle and 
>>>>>> so never has the opportunity to the reset the whole thing.
>>>>> You could set up a hangcheck timer like intel does, and end up 
>>>>> with a reliable hangcheck detection that doesn't depend on cpu 
>>>>> waits. :-) Or override the default wait function and restore the 
>>>>> old behavior.
>>>>
>>>> Overriding the default wait function sounds better, please 
>>>> implement it this way.
>>>>
>>>> Thanks,
>>>> Christian. 
>>>
>>> Does this modification look sane?
>> Adding the timeout is on my todo list for quite some time as well, so 
>> this part makes sense.
>>
>>> +static long __radeon_fence_wait(struct fence *f, bool intr, long 
>>> timeout)
>>> +{
>>> +    struct radeon_fence *fence = to_radeon_fence(f);
>>> +    u64 target_seq[RADEON_NUM_RINGS] = {};
>>> +
>>> +    target_seq[fence->ring] = fence->seq;
>>> +    return radeon_fence_wait_seq_timeout(fence->rdev, target_seq, 
>>> intr, timeout);
>>> +}
>> When this call is comming from outside the radeon driver you need to 
>> lock rdev->exclusive_lock here to make sure not to interfere with a 
>> possible reset.
> Ah thanks, I'll add that.
>
>>>      .get_timeline_name = radeon_fence_get_timeline_name,
>>>      .enable_signaling = radeon_fence_enable_signaling,
>>>      .signaled = __radeon_fence_signaled,
>> Do we still need those callback when we implemented the wait callback?
> .get_timeline_name is used for debugging (trace events).
> .signaled is the non-blocking call to check if the fence is signaled 
> or not.
> .enable_signaling is used for adding callbacks upon fence completion, 
> the default 'fence_default_wait' uses it, so
> when it works no separate implementation is needed unless you want to 
> do more than just waiting.
> It's also used when fence_add_callback is called. i915 can be patched 
> to use it. ;-)

I just meant enable_signaling, the other ones are fine with me. The 
problem with enable_signaling is that it's called with a spin lock held, 
so we can't sleep.

While resetting the GPU could be moved out into a timer the problem here 
is that I can't lock rdev->exclusive_lock in such situations.

This means when i915 would call into radeon to enable signaling for a 
fence we can't make sure that there is not GPU reset running on another 
CPU. And touching the IRQ registers while a reset is going on is a 
really good recipe to lockup the whole system.

Christian.

>
> ~Maarten
Maarten Lankhorst May 15, 2014, 3:58 p.m. UTC | #4
op 15-05-14 17:48, Christian König schreef:
> Am 15.05.2014 16:18, schrieb Maarten Lankhorst:
>> op 15-05-14 15:19, Christian König schreef:
>>> Am 15.05.2014 15:04, schrieb Maarten Lankhorst:
>>>> op 15-05-14 11:42, Christian König schreef:
>>>>> Am 15.05.2014 11:38, schrieb Maarten Lankhorst:
>>>>>> op 15-05-14 11:21, Christian König schreef:
>>>>>>> Am 15.05.2014 03:06, schrieb Maarten Lankhorst:
>>>>>>>> op 14-05-14 17:29, Christian König schreef:
>>>>>>>>>> +    /* did fence get signaled after we enabled the sw irq? */
>>>>>>>>>> +    if (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= fence->seq) {
>>>>>>>>>> + radeon_irq_kms_sw_irq_put(fence->rdev, fence->ring);
>>>>>>>>>> +        return false;
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>> +    fence->fence_wake.flags = 0;
>>>>>>>>>> +    fence->fence_wake.private = NULL;
>>>>>>>>>> +    fence->fence_wake.func = radeon_fence_check_signaled;
>>>>>>>>>> + __add_wait_queue(&fence->rdev->fence_queue, &fence->fence_wake);
>>>>>>>>>> +    fence_get(f);
>>>>>>>>> That looks like a race condition to me. The fence needs to be added to the wait queue before the check, not after.
>>>>>>>>>
>>>>>>>>> Apart from that the whole approach looks like a really bad idea to me. How for example is lockup detection supposed to happen with this? 
>>>>>>>> It's not a race condition because fence_queue.lock is held when this function is called.
>>>>>>> Ah, I see. That's also the reason why you moved the wake_up_all out of the processing function.
>>>>>> Correct. :-)
>>>>>>>> Lockup's a bit of a weird problem, the changes wouldn't allow core ttm code to handle the lockup any more,
>>>>>>>> but any driver specific wait code would still handle this. I did this by design, because in future patches the wait
>>>>>>>> function may be called from outside of the radeon driver. The official wait function takes a timeout parameter,
>>>>>>>> so lockups wouldn't be fatal if the timeout is set to something like 30*HZ for example, it would still return
>>>>>>>> and report that the function timed out.
>>>>>>> Timeouts help with the detection of the lockup, but not at all with the handling of them.
>>>>>>>
>>>>>>> What we essentially need is a wait callback into the driver that is called in non atomic context without any locks held.
>>>>>>>
>>>>>>> This way we can block for the fence to become signaled with a timeout and can then also initiate the reset handling if necessary.
>>>>>>>
>>>>>>> The way you designed the interface now means that the driver never gets a chance to wait for the hardware to become idle and so never has the opportunity to the reset the whole thing.
>>>>>> You could set up a hangcheck timer like intel does, and end up with a reliable hangcheck detection that doesn't depend on cpu waits. :-) Or override the default wait function and restore the old behavior.
>>>>>
>>>>> Overriding the default wait function sounds better, please implement it this way.
>>>>>
>>>>> Thanks,
>>>>> Christian. 
>>>>
>>>> Does this modification look sane?
>>> Adding the timeout is on my todo list for quite some time as well, so this part makes sense.
>>>
>>>> +static long __radeon_fence_wait(struct fence *f, bool intr, long timeout)
>>>> +{
>>>> +    struct radeon_fence *fence = to_radeon_fence(f);
>>>> +    u64 target_seq[RADEON_NUM_RINGS] = {};
>>>> +
>>>> +    target_seq[fence->ring] = fence->seq;
>>>> +    return radeon_fence_wait_seq_timeout(fence->rdev, target_seq, intr, timeout);
>>>> +}
>>> When this call is comming from outside the radeon driver you need to lock rdev->exclusive_lock here to make sure not to interfere with a possible reset.
>> Ah thanks, I'll add that.
>>
>>>>      .get_timeline_name = radeon_fence_get_timeline_name,
>>>>      .enable_signaling = radeon_fence_enable_signaling,
>>>>      .signaled = __radeon_fence_signaled,
>>> Do we still need those callback when we implemented the wait callback?
>> .get_timeline_name is used for debugging (trace events).
>> .signaled is the non-blocking call to check if the fence is signaled or not.
>> .enable_signaling is used for adding callbacks upon fence completion, the default 'fence_default_wait' uses it, so
>> when it works no separate implementation is needed unless you want to do more than just waiting.
>> It's also used when fence_add_callback is called. i915 can be patched to use it. ;-)
>
> I just meant enable_signaling, the other ones are fine with me. The problem with enable_signaling is that it's called with a spin lock held, so we can't sleep.
>
> While resetting the GPU could be moved out into a timer the problem here is that I can't lock rdev->exclusive_lock in such situations.
>
> This means when i915 would call into radeon to enable signaling for a fence we can't make sure that there is not GPU reset running on another CPU. And touching the IRQ registers while a reset is going on is a really good recipe to lockup the whole system.
If you increase the irq counter on all rings before doing a gpu reset, adjust the state and call sw_irq_put when done this race could never happen. Or am I missing something?

~Maarten
Christian König May 15, 2014, 4:13 p.m. UTC | #5
Am 15.05.2014 17:58, schrieb Maarten Lankhorst:
> op 15-05-14 17:48, Christian König schreef:
>> Am 15.05.2014 16:18, schrieb Maarten Lankhorst:
>>> op 15-05-14 15:19, Christian König schreef:
>>>> Am 15.05.2014 15:04, schrieb Maarten Lankhorst:
>>>>> op 15-05-14 11:42, Christian König schreef:
>>>>>> Am 15.05.2014 11:38, schrieb Maarten Lankhorst:
>>>>>>> op 15-05-14 11:21, Christian König schreef:
>>>>>>>> Am 15.05.2014 03:06, schrieb Maarten Lankhorst:
>>>>>>>>> op 14-05-14 17:29, Christian König schreef:
>>>>>>>>>>> +    /* did fence get signaled after we enabled the sw irq? */
>>>>>>>>>>> +    if 
>>>>>>>>>>> (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= 
>>>>>>>>>>> fence->seq) {
>>>>>>>>>>> + radeon_irq_kms_sw_irq_put(fence->rdev, fence->ring);
>>>>>>>>>>> +        return false;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    fence->fence_wake.flags = 0;
>>>>>>>>>>> +    fence->fence_wake.private = NULL;
>>>>>>>>>>> +    fence->fence_wake.func = radeon_fence_check_signaled;
>>>>>>>>>>> + __add_wait_queue(&fence->rdev->fence_queue, 
>>>>>>>>>>> &fence->fence_wake);
>>>>>>>>>>> +    fence_get(f);
>>>>>>>>>> That looks like a race condition to me. The fence needs to be 
>>>>>>>>>> added to the wait queue before the check, not after.
>>>>>>>>>>
>>>>>>>>>> Apart from that the whole approach looks like a really bad 
>>>>>>>>>> idea to me. How for example is lockup detection supposed to 
>>>>>>>>>> happen with this? 
>>>>>>>>> It's not a race condition because fence_queue.lock is held 
>>>>>>>>> when this function is called.
>>>>>>>> Ah, I see. That's also the reason why you moved the wake_up_all 
>>>>>>>> out of the processing function.
>>>>>>> Correct. :-)
>>>>>>>>> Lockup's a bit of a weird problem, the changes wouldn't allow 
>>>>>>>>> core ttm code to handle the lockup any more,
>>>>>>>>> but any driver specific wait code would still handle this. I 
>>>>>>>>> did this by design, because in future patches the wait
>>>>>>>>> function may be called from outside of the radeon driver. The 
>>>>>>>>> official wait function takes a timeout parameter,
>>>>>>>>> so lockups wouldn't be fatal if the timeout is set to 
>>>>>>>>> something like 30*HZ for example, it would still return
>>>>>>>>> and report that the function timed out.
>>>>>>>> Timeouts help with the detection of the lockup, but not at all 
>>>>>>>> with the handling of them.
>>>>>>>>
>>>>>>>> What we essentially need is a wait callback into the driver 
>>>>>>>> that is called in non atomic context without any locks held.
>>>>>>>>
>>>>>>>> This way we can block for the fence to become signaled with a 
>>>>>>>> timeout and can then also initiate the reset handling if 
>>>>>>>> necessary.
>>>>>>>>
>>>>>>>> The way you designed the interface now means that the driver 
>>>>>>>> never gets a chance to wait for the hardware to become idle and 
>>>>>>>> so never has the opportunity to the reset the whole thing.
>>>>>>> You could set up a hangcheck timer like intel does, and end up 
>>>>>>> with a reliable hangcheck detection that doesn't depend on cpu 
>>>>>>> waits. :-) Or override the default wait function and restore the 
>>>>>>> old behavior.
>>>>>>
>>>>>> Overriding the default wait function sounds better, please 
>>>>>> implement it this way.
>>>>>>
>>>>>> Thanks,
>>>>>> Christian. 
>>>>>
>>>>> Does this modification look sane?
>>>> Adding the timeout is on my todo list for quite some time as well, 
>>>> so this part makes sense.
>>>>
>>>>> +static long __radeon_fence_wait(struct fence *f, bool intr, long 
>>>>> timeout)
>>>>> +{
>>>>> +    struct radeon_fence *fence = to_radeon_fence(f);
>>>>> +    u64 target_seq[RADEON_NUM_RINGS] = {};
>>>>> +
>>>>> +    target_seq[fence->ring] = fence->seq;
>>>>> +    return radeon_fence_wait_seq_timeout(fence->rdev, target_seq, 
>>>>> intr, timeout);
>>>>> +}
>>>> When this call is comming from outside the radeon driver you need 
>>>> to lock rdev->exclusive_lock here to make sure not to interfere 
>>>> with a possible reset.
>>> Ah thanks, I'll add that.
>>>
>>>>>      .get_timeline_name = radeon_fence_get_timeline_name,
>>>>>      .enable_signaling = radeon_fence_enable_signaling,
>>>>>      .signaled = __radeon_fence_signaled,
>>>> Do we still need those callback when we implemented the wait callback?
>>> .get_timeline_name is used for debugging (trace events).
>>> .signaled is the non-blocking call to check if the fence is signaled 
>>> or not.
>>> .enable_signaling is used for adding callbacks upon fence 
>>> completion, the default 'fence_default_wait' uses it, so
>>> when it works no separate implementation is needed unless you want 
>>> to do more than just waiting.
>>> It's also used when fence_add_callback is called. i915 can be 
>>> patched to use it. ;-)
>>
>> I just meant enable_signaling, the other ones are fine with me. The 
>> problem with enable_signaling is that it's called with a spin lock 
>> held, so we can't sleep.
>>
>> While resetting the GPU could be moved out into a timer the problem 
>> here is that I can't lock rdev->exclusive_lock in such situations.
>>
>> This means when i915 would call into radeon to enable signaling for a 
>> fence we can't make sure that there is not GPU reset running on 
>> another CPU. And touching the IRQ registers while a reset is going on 
>> is a really good recipe to lockup the whole system.
> If you increase the irq counter on all rings before doing a gpu reset, 
> adjust the state and call sw_irq_put when done this race could never 
> happen. Or am I missing something?
>
Beside that's being extremely ugly in the case of a hard PCI reset even 
touching any register or just accessing VRAM in this moment can crash 
the box. Just working around the enable/disable of the interrupt here 
won't help us much.

Adding another spin lock won't work so well either, because the reset 
function itself wants to sleep as well.

The only solution I see off hand is making the critical reset code path 
work in atomic context as well, but that's not really doable cause AFAIK 
we need to work with functions from the PCI subsystem and spinning on a 
lock for up to a second is not really desirable also.

Christian.

> ~Maarten
>
Maarten Lankhorst May 19, 2014, 8 a.m. UTC | #6
op 15-05-14 18:13, Christian König schreef:
> Am 15.05.2014 17:58, schrieb Maarten Lankhorst:
>> op 15-05-14 17:48, Christian König schreef:
>>> Am 15.05.2014 16:18, schrieb Maarten Lankhorst:
>>>> op 15-05-14 15:19, Christian König schreef:
>>>>> Am 15.05.2014 15:04, schrieb Maarten Lankhorst:
>>>>>> op 15-05-14 11:42, Christian König schreef:
>>>>>>> Am 15.05.2014 11:38, schrieb Maarten Lankhorst:
>>>>>>>> op 15-05-14 11:21, Christian König schreef:
>>>>>>>>> Am 15.05.2014 03:06, schrieb Maarten Lankhorst:
>>>>>>>>>> op 14-05-14 17:29, Christian König schreef:
>>>>>>>>>>>> +    /* did fence get signaled after we enabled the sw irq? */
>>>>>>>>>>>> +    if (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= fence->seq) {
>>>>>>>>>>>> + radeon_irq_kms_sw_irq_put(fence->rdev, fence->ring);
>>>>>>>>>>>> +        return false;
>>>>>>>>>>>> +    }
>>>>>>>>>>>> +
>>>>>>>>>>>> +    fence->fence_wake.flags = 0;
>>>>>>>>>>>> +    fence->fence_wake.private = NULL;
>>>>>>>>>>>> +    fence->fence_wake.func = radeon_fence_check_signaled;
>>>>>>>>>>>> + __add_wait_queue(&fence->rdev->fence_queue, &fence->fence_wake);
>>>>>>>>>>>> +    fence_get(f);
>>>>>>>>>>> That looks like a race condition to me. The fence needs to be added to the wait queue before the check, not after.
>>>>>>>>>>>
>>>>>>>>>>> Apart from that the whole approach looks like a really bad idea to me. How for example is lockup detection supposed to happen with this? 
>>>>>>>>>> It's not a race condition because fence_queue.lock is held when this function is called.
>>>>>>>>> Ah, I see. That's also the reason why you moved the wake_up_all out of the processing function.
>>>>>>>> Correct. :-)
>>>>>>>>>> Lockup's a bit of a weird problem, the changes wouldn't allow core ttm code to handle the lockup any more,
>>>>>>>>>> but any driver specific wait code would still handle this. I did this by design, because in future patches the wait
>>>>>>>>>> function may be called from outside of the radeon driver. The official wait function takes a timeout parameter,
>>>>>>>>>> so lockups wouldn't be fatal if the timeout is set to something like 30*HZ for example, it would still return
>>>>>>>>>> and report that the function timed out.
>>>>>>>>> Timeouts help with the detection of the lockup, but not at all with the handling of them.
>>>>>>>>>
>>>>>>>>> What we essentially need is a wait callback into the driver that is called in non atomic context without any locks held.
>>>>>>>>>
>>>>>>>>> This way we can block for the fence to become signaled with a timeout and can then also initiate the reset handling if necessary.
>>>>>>>>>
>>>>>>>>> The way you designed the interface now means that the driver never gets a chance to wait for the hardware to become idle and so never has the opportunity to the reset the whole thing.
>>>>>>>> You could set up a hangcheck timer like intel does, and end up with a reliable hangcheck detection that doesn't depend on cpu waits. :-) Or override the default wait function and restore the old behavior.
>>>>>>>
>>>>>>> Overriding the default wait function sounds better, please implement it this way.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Christian. 
>>>>>>
>>>>>> Does this modification look sane?
>>>>> Adding the timeout is on my todo list for quite some time as well, so this part makes sense.
>>>>>
>>>>>> +static long __radeon_fence_wait(struct fence *f, bool intr, long timeout)
>>>>>> +{
>>>>>> +    struct radeon_fence *fence = to_radeon_fence(f);
>>>>>> +    u64 target_seq[RADEON_NUM_RINGS] = {};
>>>>>> +
>>>>>> +    target_seq[fence->ring] = fence->seq;
>>>>>> +    return radeon_fence_wait_seq_timeout(fence->rdev, target_seq, intr, timeout);
>>>>>> +}
>>>>> When this call is comming from outside the radeon driver you need to lock rdev->exclusive_lock here to make sure not to interfere with a possible reset.
>>>> Ah thanks, I'll add that.
>>>>
>>>>>>      .get_timeline_name = radeon_fence_get_timeline_name,
>>>>>>      .enable_signaling = radeon_fence_enable_signaling,
>>>>>>      .signaled = __radeon_fence_signaled,
>>>>> Do we still need those callback when we implemented the wait callback?
>>>> .get_timeline_name is used for debugging (trace events).
>>>> .signaled is the non-blocking call to check if the fence is signaled or not.
>>>> .enable_signaling is used for adding callbacks upon fence completion, the default 'fence_default_wait' uses it, so
>>>> when it works no separate implementation is needed unless you want to do more than just waiting.
>>>> It's also used when fence_add_callback is called. i915 can be patched to use it. ;-)
>>>
>>> I just meant enable_signaling, the other ones are fine with me. The problem with enable_signaling is that it's called with a spin lock held, so we can't sleep.
>>>
>>> While resetting the GPU could be moved out into a timer the problem here is that I can't lock rdev->exclusive_lock in such situations.
>>>
>>> This means when i915 would call into radeon to enable signaling for a fence we can't make sure that there is not GPU reset running on another CPU. And touching the IRQ registers while a reset is going on is a really good recipe to lockup the whole system.
>> If you increase the irq counter on all rings before doing a gpu reset, adjust the state and call sw_irq_put when done this race could never happen. Or am I missing something?
>>
> Beside that's being extremely ugly in the case of a hard PCI reset even touching any register or just accessing VRAM in this moment can crash the box. Just working around the enable/disable of the interrupt here won't help us much.
>
> Adding another spin lock won't work so well either, because the reset function itself wants to sleep as well.
>
> The only solution I see off hand is making the critical reset code path work in atomic context as well, but that's not really doable cause AFAIK we need to work with functions from the PCI subsystem and spinning on a lock for up to a second is not really desirable also.
I've checked the code a little but that can be the case now as well. the new implementation's __radeon_fence_wait will be protected by the exclusive_lock,, but enable_signaling is only protected by the fence_queue.lock and is_signaled is not protected. But this is not a change from the current situation, so it would only become a problem if the gpu hangs in a cross-device situation.

I think adding 1 to the irq refcount in the reset sequence and adding a down_read_trylock on the exclusive lock would help. If the trylock fails we could perform only the safe actions without touching any of the gpu registers or vram, adding the refcount is needed to ensure enable_signaling works as intended.

~Maarten
Christian König May 19, 2014, 8:27 a.m. UTC | #7
Am 19.05.2014 10:00, schrieb Maarten Lankhorst:
> op 15-05-14 18:13, Christian König schreef:
>> Am 15.05.2014 17:58, schrieb Maarten Lankhorst:
>>> op 15-05-14 17:48, Christian König schreef:
>>>> Am 15.05.2014 16:18, schrieb Maarten Lankhorst:
>>>>> op 15-05-14 15:19, Christian König schreef:
>>>>>> Am 15.05.2014 15:04, schrieb Maarten Lankhorst:
>>>>>>> op 15-05-14 11:42, Christian König schreef:
>>>>>>>> Am 15.05.2014 11:38, schrieb Maarten Lankhorst:
>>>>>>>>> op 15-05-14 11:21, Christian König schreef:
>>>>>>>>>> Am 15.05.2014 03:06, schrieb Maarten Lankhorst:
>>>>>>>>>>> op 14-05-14 17:29, Christian König schreef:
>>>>>>>>>>>>> +    /* did fence get signaled after we enabled the sw 
>>>>>>>>>>>>> irq? */
>>>>>>>>>>>>> +    if 
>>>>>>>>>>>>> (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) 
>>>>>>>>>>>>> >= fence->seq) {
>>>>>>>>>>>>> + radeon_irq_kms_sw_irq_put(fence->rdev, fence->ring);
>>>>>>>>>>>>> +        return false;
>>>>>>>>>>>>> +    }
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +    fence->fence_wake.flags = 0;
>>>>>>>>>>>>> +    fence->fence_wake.private = NULL;
>>>>>>>>>>>>> +    fence->fence_wake.func = radeon_fence_check_signaled;
>>>>>>>>>>>>> + __add_wait_queue(&fence->rdev->fence_queue, 
>>>>>>>>>>>>> &fence->fence_wake);
>>>>>>>>>>>>> +    fence_get(f);
>>>>>>>>>>>> That looks like a race condition to me. The fence needs to 
>>>>>>>>>>>> be added to the wait queue before the check, not after.
>>>>>>>>>>>>
>>>>>>>>>>>> Apart from that the whole approach looks like a really bad 
>>>>>>>>>>>> idea to me. How for example is lockup detection supposed to 
>>>>>>>>>>>> happen with this? 
>>>>>>>>>>> It's not a race condition because fence_queue.lock is held 
>>>>>>>>>>> when this function is called.
>>>>>>>>>> Ah, I see. That's also the reason why you moved the 
>>>>>>>>>> wake_up_all out of the processing function.
>>>>>>>>> Correct. :-)
>>>>>>>>>>> Lockup's a bit of a weird problem, the changes wouldn't 
>>>>>>>>>>> allow core ttm code to handle the lockup any more,
>>>>>>>>>>> but any driver specific wait code would still handle this. I 
>>>>>>>>>>> did this by design, because in future patches the wait
>>>>>>>>>>> function may be called from outside of the radeon driver. 
>>>>>>>>>>> The official wait function takes a timeout parameter,
>>>>>>>>>>> so lockups wouldn't be fatal if the timeout is set to 
>>>>>>>>>>> something like 30*HZ for example, it would still return
>>>>>>>>>>> and report that the function timed out.
>>>>>>>>>> Timeouts help with the detection of the lockup, but not at 
>>>>>>>>>> all with the handling of them.
>>>>>>>>>>
>>>>>>>>>> What we essentially need is a wait callback into the driver 
>>>>>>>>>> that is called in non atomic context without any locks held.
>>>>>>>>>>
>>>>>>>>>> This way we can block for the fence to become signaled with a 
>>>>>>>>>> timeout and can then also initiate the reset handling if 
>>>>>>>>>> necessary.
>>>>>>>>>>
>>>>>>>>>> The way you designed the interface now means that the driver 
>>>>>>>>>> never gets a chance to wait for the hardware to become idle 
>>>>>>>>>> and so never has the opportunity to the reset the whole thing.
>>>>>>>>> You could set up a hangcheck timer like intel does, and end up 
>>>>>>>>> with a reliable hangcheck detection that doesn't depend on cpu 
>>>>>>>>> waits. :-) Or override the default wait function and restore 
>>>>>>>>> the old behavior.
>>>>>>>>
>>>>>>>> Overriding the default wait function sounds better, please 
>>>>>>>> implement it this way.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Christian. 
>>>>>>>
>>>>>>> Does this modification look sane?
>>>>>> Adding the timeout is on my todo list for quite some time as 
>>>>>> well, so this part makes sense.
>>>>>>
>>>>>>> +static long __radeon_fence_wait(struct fence *f, bool intr, 
>>>>>>> long timeout)
>>>>>>> +{
>>>>>>> +    struct radeon_fence *fence = to_radeon_fence(f);
>>>>>>> +    u64 target_seq[RADEON_NUM_RINGS] = {};
>>>>>>> +
>>>>>>> +    target_seq[fence->ring] = fence->seq;
>>>>>>> +    return radeon_fence_wait_seq_timeout(fence->rdev, 
>>>>>>> target_seq, intr, timeout);
>>>>>>> +}
>>>>>> When this call is comming from outside the radeon driver you need 
>>>>>> to lock rdev->exclusive_lock here to make sure not to interfere 
>>>>>> with a possible reset.
>>>>> Ah thanks, I'll add that.
>>>>>
>>>>>>>      .get_timeline_name = radeon_fence_get_timeline_name,
>>>>>>>      .enable_signaling = radeon_fence_enable_signaling,
>>>>>>>      .signaled = __radeon_fence_signaled,
>>>>>> Do we still need those callback when we implemented the wait 
>>>>>> callback?
>>>>> .get_timeline_name is used for debugging (trace events).
>>>>> .signaled is the non-blocking call to check if the fence is 
>>>>> signaled or not.
>>>>> .enable_signaling is used for adding callbacks upon fence 
>>>>> completion, the default 'fence_default_wait' uses it, so
>>>>> when it works no separate implementation is needed unless you want 
>>>>> to do more than just waiting.
>>>>> It's also used when fence_add_callback is called. i915 can be 
>>>>> patched to use it. ;-)
>>>>
>>>> I just meant enable_signaling, the other ones are fine with me. The 
>>>> problem with enable_signaling is that it's called with a spin lock 
>>>> held, so we can't sleep.
>>>>
>>>> While resetting the GPU could be moved out into a timer the problem 
>>>> here is that I can't lock rdev->exclusive_lock in such situations.
>>>>
>>>> This means when i915 would call into radeon to enable signaling for 
>>>> a fence we can't make sure that there is not GPU reset running on 
>>>> another CPU. And touching the IRQ registers while a reset is going 
>>>> on is a really good recipe to lockup the whole system.
>>> If you increase the irq counter on all rings before doing a gpu 
>>> reset, adjust the state and call sw_irq_put when done this race 
>>> could never happen. Or am I missing something?
>>>
>> Beside that's being extremely ugly in the case of a hard PCI reset 
>> even touching any register or just accessing VRAM in this moment can 
>> crash the box. Just working around the enable/disable of the 
>> interrupt here won't help us much.
>>
>> Adding another spin lock won't work so well either, because the reset 
>> function itself wants to sleep as well.
>>
>> The only solution I see off hand is making the critical reset code 
>> path work in atomic context as well, but that's not really doable 
>> cause AFAIK we need to work with functions from the PCI subsystem and 
>> spinning on a lock for up to a second is not really desirable also.
> I've checked the code a little but that can be the case now as well. 
> the new implementation's __radeon_fence_wait will be protected by the 
> exclusive_lock,, but enable_signaling is only protected by the 
> fence_queue.lock and is_signaled is not protected. But this is not a 
> change from the current situation, so it would only become a problem 
> if the gpu hangs in a cross-device situation.
>
> I think adding 1 to the irq refcount in the reset sequence and adding 
> a down_read_trylock on the exclusive lock would help. If the trylock 
> fails we could perform only the safe actions without touching any of 
> the gpu registers or vram, adding the refcount is needed to ensure 
> enable_signaling works as intended.

The problem here is that the whole approach collides with the way we do 
reset handling from a conceptual point of view. Every IOCTL or other 
call chain into the driver is protected by the read side of the 
exclusive_lock semaphore. So in the case of a GPU lockup we can take the 
write side of the semaphore and so make sure that we have nobody else 
accessing the hardware or internal driver structures only changed at 
init time.

Leaking a drivers IRQ context into another driver as well as calling 
into a driver in atomic context is just a quite uncommon approach and 
should be considered very carefully.

I would rather vote for a completely synchronous interface only allowing 
blocking waits and checks if a fence is signaled from not atomic context.

If a driver needs to avoid blocking it should just use a workqueue and 
checking a fence outside your own driver is probably be better done in a 
bottom halve handler anyway.

Regards,
Christian.
diff mbox

Patch

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index bc844f300d3f..2d415eb2834a 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -361,28 +361,35 @@  static bool radeon_fence_any_seq_signaled(struct radeon_device *rdev, u64 *seq)
  }
  
  /**
- * radeon_fence_wait_seq - wait for a specific sequence numbers
+ * radeon_fence_wait_seq_timeout - wait for a specific sequence numbers
   *
   * @rdev: radeon device pointer
   * @target_seq: sequence number(s) we want to wait for
   * @intr: use interruptable sleep
+ * @timeout: maximum time to wait, or MAX_SCHEDULE_TIMEOUT for infinite wait
   *
   * Wait for the requested sequence number(s) to be written by any ring
   * (all asics).  Sequnce number array is indexed by ring id.
   * @intr selects whether to use interruptable (true) or non-interruptable
   * (false) sleep when waiting for the sequence number.  Helper function
   * for radeon_fence_wait_*().
- * Returns 0 if the sequence number has passed, error for all other cases.
+ * Returns remaining time if the sequence number has passed, 0 when
+ * the wait timeout, or an error for all other cases.
   * -EDEADLK is returned when a GPU lockup has been detected.
   */
-static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 *target_seq,
-				 bool intr)
+static int radeon_fence_wait_seq_timeout(struct radeon_device *rdev,
+					 u64 *target_seq, bool intr,
+					 long timeout)
  {
  	uint64_t last_seq[RADEON_NUM_RINGS];
  	bool signaled;
-	int i, r;
+	int i;
  
  	while (!radeon_fence_any_seq_signaled(rdev, target_seq)) {
+		long r, waited = timeout;
+
+		waited = timeout < RADEON_FENCE_JIFFIES_TIMEOUT ?
+			 timeout : RADEON_FENCE_JIFFIES_TIMEOUT;
  
  		/* Save current sequence values, used to check for GPU lockups */
  		for (i = 0; i < RADEON_NUM_RINGS; ++i) {
@@ -397,13 +404,15 @@  static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 *target_seq,
  		if (intr) {
  			r = wait_event_interruptible_timeout(rdev->fence_queue, (
  				(signaled = radeon_fence_any_seq_signaled(rdev, target_seq))
-				 || rdev->needs_reset), RADEON_FENCE_JIFFIES_TIMEOUT);
+				 || rdev->needs_reset), waited);
  		} else {
  			r = wait_event_timeout(rdev->fence_queue, (
  				(signaled = radeon_fence_any_seq_signaled(rdev, target_seq))
-				 || rdev->needs_reset), RADEON_FENCE_JIFFIES_TIMEOUT);
+				 || rdev->needs_reset), waited);
  		}
  
+		timeout -= waited - r;
+
  		for (i = 0; i < RADEON_NUM_RINGS; ++i) {
  			if (!target_seq[i])
  				continue;
@@ -415,6 +424,12 @@  static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 *target_seq,
  		if (unlikely(r < 0))
  			return r;
  
+		/*
+		 * If this is a timed wait and the wait completely timed out just return.
+		 */
+		if (!timeout)
+			break;
+
  		if (unlikely(!signaled)) {
  			if (rdev->needs_reset)
  				return -EDEADLK;
@@ -457,7 +472,7 @@  static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 *target_seq,
  			}
  		}
  	}
-	return 0;
+	return timeout;
  }
  
  /**
@@ -480,8 +495,8 @@  int radeon_fence_wait(struct radeon_fence *fence, bool intr)
  		return 0;
  
  	seq[fence->ring] = fence->seq;
-	r = radeon_fence_wait_seq(fence->rdev, seq, intr);
-	if (r) {
+	r = radeon_fence_wait_seq_timeout(fence->rdev, seq, intr, MAX_SCHEDULE_TIMEOUT);
+	if (r < 0) {
  		return r;
  	}
  	r = fence_signal(&fence->base);
@@ -509,7 +524,7 @@  int radeon_fence_wait_any(struct radeon_device *rdev,
  {
  	uint64_t seq[RADEON_NUM_RINGS];
  	unsigned i, num_rings = 0;
-	int r;
+	long r;
  
  	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
  		seq[i] = 0;
@@ -531,8 +546,8 @@  int radeon_fence_wait_any(struct radeon_device *rdev,
  	if (num_rings == 0)
  		return -ENOENT;
  
-	r = radeon_fence_wait_seq(rdev, seq, intr);
-	if (r) {
+	r = radeon_fence_wait_seq_timeout(rdev, seq, intr, MAX_SCHEDULE_TIMEOUT);
+	if (r < 0) {
  		return r;
  	}
  	return 0;
@@ -551,6 +566,7 @@  int radeon_fence_wait_any(struct radeon_device *rdev,
  int radeon_fence_wait_next(struct radeon_device *rdev, int ring)
  {
  	uint64_t seq[RADEON_NUM_RINGS] = {};
+	long r;
  
  	seq[ring] = atomic64_read(&rdev->fence_drv[ring].last_seq) + 1ULL;
  	if (seq[ring] >= rdev->fence_drv[ring].sync_seq[ring]) {
@@ -558,7 +574,10 @@  int radeon_fence_wait_next(struct radeon_device *rdev, int ring)
  		   already the last emited fence */
  		return -ENOENT;
  	}
-	return radeon_fence_wait_seq(rdev, seq, false);
+	r = radeon_fence_wait_seq_timeout(rdev, seq, false, MAX_SCHEDULE_TIMEOUT);
+	if (r < 0)
+		return r;
+	return 0;
  }
  
  /**
@@ -580,8 +599,8 @@  int radeon_fence_wait_empty(struct radeon_device *rdev, int ring)
  	if (!seq[ring])
  		return 0;
  
-	r = radeon_fence_wait_seq(rdev, seq, false);
-	if (r) {
+	r = radeon_fence_wait_seq_timeout(rdev, seq, false, MAX_SCHEDULE_TIMEOUT);
+	if (r < 0) {
  		if (r == -EDEADLK)
  			return -EDEADLK;
  
@@ -908,6 +927,15 @@  int radeon_debugfs_fence_init(struct radeon_device *rdev)
  #endif
  }
  
+static long __radeon_fence_wait(struct fence *f, bool intr, long timeout)
+{
+	struct radeon_fence *fence = to_radeon_fence(f);
+	u64 target_seq[RADEON_NUM_RINGS] = {};
+
+	target_seq[fence->ring] = fence->seq;
+	return radeon_fence_wait_seq_timeout(fence->rdev, target_seq, intr, timeout);
+}
+
  static const char *radeon_fence_get_driver_name(struct fence *fence)
  {
  	return "radeon";
@@ -932,6 +960,6 @@  static const struct fence_ops radeon_fence_ops = {
  	.get_timeline_name = radeon_fence_get_timeline_name,
  	.enable_signaling = radeon_fence_enable_signaling,
  	.signaled = __radeon_fence_signaled,
-	.wait = fence_default_wait,
+	.wait = __radeon_fence_wait,
  	.release = NULL,
  };