[v6,63/64] drm/i915: Move gt_revoke() slightly

Message ID	20210105153558.134272-64-maarten.lankhorst@linux.intel.com (mailing list archive)
State	New, archived
Headers	Return-Path: <SRS0=Zd9y=GI=lists.freedesktop.org=intel-gfx-bounces@kernel.org>
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 613AB22BF3
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: intel-gfx@lists.freedesktop.org
Date: Tue, 5 Jan 2021 16:35:57 +0100
Message-Id: <20210105153558.134272-64-maarten.lankhorst@linux.intel.com>
In-Reply-To: <20210105153558.134272-1-maarten.lankhorst@linux.intel.com>
References: <20210105153558.134272-1-maarten.lankhorst@linux.intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH v6 63/64] drm/i915: Move gt_revoke() slightly
Precedence: list
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
Series	drm/i915: Remove obj->mm.lock!

Maarten Lankhorst Jan. 5, 2021, 3:35 p.m. UTC

We get a lockdep splat when the reset mutex is held, because it can be
taken from fence_wait. This conflicts with the mmu notifier we have,
because we recurse between reset mutex and mmap lock -> mmu notifier.

Remove this recursion by calling revoke_mmaps before taking the lock.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Thomas Hellstrom Jan. 18, 2021, 11:11 a.m. UTC | #1

On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
> We get a lockdep splat when the reset mutex is held, because it can be
> taken from fence_wait. This conflicts with the mmu notifier we have,
> because we recurse between reset mutex and mmap lock -> mmu notifier.
>
> Remove this recursion by calling revoke_mmaps before taking the lock.

Hmm. Is the mmap_sem taken from gt_revoke()?

If so, isn't the real problem that the mmap_sem is taken in the 
dma_fence critical path (where the reset code sits)?

/Thomas


>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_reset.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 9d177297db79..3c0807d9a86e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -975,8 +975,6 @@ static int do_reset(struct intel_gt *gt, intel_engine_mask_t stalled_mask)
>   {
>   	int err, i;
>   
> -	gt_revoke(gt);
> -
>   	err = __intel_gt_reset(gt, ALL_ENGINES);
>   	for (i = 0; err && i < RESET_MAX_RETRIES; i++) {
>   		msleep(10 * (i + 1));
> @@ -1031,6 +1029,9 @@ void intel_gt_reset(struct intel_gt *gt,
>   
>   	might_sleep();
>   	GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &gt->reset.flags));
> +
> +	gt_revoke(gt);
> +
>   	mutex_lock(&gt->reset.mutex);
>   
>   	/* Clear any previous failed attempts at recovery. Time to try again. */
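
To make the reordering in the diff above concrete, here is a minimal userspace sketch. It is not the i915 code: struct fake_gt, fake_gt_revoke(), reset_before_patch() and reset_after_patch() are hypothetical stand-ins, and the reset mutex is modelled as a plain flag. It assumes, as the commit message implies, that do_reset() runs with the reset mutex held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct intel_gt; only tracks what we care about. */
struct fake_gt {
	bool reset_mutex_held;    /* models gt->reset.mutex being locked */
	bool revoked;             /* set once the revoke step has run */
	bool revoked_under_mutex; /* set if revoke ran while the mutex was held */
};

/* Stand-in for gt_revoke(): records whether it ran under the reset mutex. */
static void fake_gt_revoke(struct fake_gt *gt)
{
	gt->revoked = true;
	if (gt->reset_mutex_held)
		gt->revoked_under_mutex = true;
}

/* Old ordering: the revoke happens inside the locked region (via do_reset()). */
static void reset_before_patch(struct fake_gt *gt)
{
	gt->reset_mutex_held = true;   /* mutex_lock(&gt->reset.mutex) */
	fake_gt_revoke(gt);            /* first step of do_reset() */
	gt->reset_mutex_held = false;  /* mutex_unlock(&gt->reset.mutex) */
}

/* New ordering: the revoke happens before the mutex is taken. */
static void reset_after_patch(struct fake_gt *gt)
{
	fake_gt_revoke(gt);
	gt->reset_mutex_held = true;   /* mutex_lock(&gt->reset.mutex) */
	/* do_reset() no longer revokes anything here */
	gt->reset_mutex_held = false;  /* mutex_unlock(&gt->reset.mutex) */
}
```

Running both variants on zero-initialised structs leaves revoked_under_mutex set only for the old ordering, which is the nesting the lockdep splat complains about.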

Maarten Lankhorst Jan. 18, 2021, 12:01 p.m. UTC | #2

On 18-01-2021 at 12:11, Thomas Hellström wrote:
>
> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>> We get a lockdep splat when the reset mutex is held, because it can be
>> taken from fence_wait. This conflicts with the mmu notifier we have,
>> because we recurse between reset mutex and mmap lock -> mmu notifier.
>>
>> Remove this recursion by calling revoke_mmaps before taking the lock.
>
> Hmm. Is the mmap_sem taken from gt_revoke()?
>
> If so, isn't the real problem that the mmap_sem is taken in the dma_fence critical path (where the reset code sits)? 

Hey,

The gpu reset code specifically needs to revoke all gtt mappings, and the fault handler uses intel_gt_reset_trylock(), so this change should be ok since all those mappings are invalidated correctly and completed before this point.

The reset mutex isn't actually taken inside fence code, but used for lockdep validation, so this should be ok.

~Maarten

Thomas Hellstrom Jan. 18, 2021, 1:22 p.m. UTC | #3

On 1/18/21 1:01 PM, Maarten Lankhorst wrote:
> On 18-01-2021 at 12:11, Thomas Hellström wrote:
>> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>>> We get a lockdep splat when the reset mutex is held, because it can be
>>> taken from fence_wait. This conflicts with the mmu notifier we have,
>>> because we recurse between reset mutex and mmap lock -> mmu notifier.
>>>
>>> Remove this recursion by calling revoke_mmaps before taking the lock.
>> Hmm. Is the mmap_sem taken from gt_revoke()?
>>
>> If so, isn't the real problem that the mmap_sem is taken in the dma_fence critical path (where the reset code sits)?
> Hey,
>
> The gpu reset code specifically needs to revoke all gtt mappings, and the fault handler uses intel_gt_reset_trylock(),
>
> so this change should be ok since all those mappings are invalidated correctly and completed before this point.
>
> The reset mutex isn't actually taken inside fence code, but used for lockdep validation, so this should be ok.
>
> ~Maarten

Hmm, OK but then we still have the following established locking order.

lock(fence_signaling)
lock(i_mmap_lock)

But in the notifier

lock(i_mmap_lock)
fence_signaling(within notifier)

So gt_revoke() is violating dma-fence rules.

BTW it looks to me like the reset mutex annotation is actually doing much 
the same as the dma-fence annotations. While we can move gt_revoke() out 
of the reset mutex, that only gives us false hope, since it moves it out 
of the equivalent dma-fence annotation. I figure the reason this was not 
seen before the new code is that the reset mutex lockdep annotation isn't 
taken when waiting for active, only when waiting for dma-fence, but IMO 
the root problem is pre-existing.

/Thomas
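
The inversion described above can be sketched as a tiny lock-order checker, loosely in the spirit of what lockdep does. This is an illustration under stated assumptions, not kernel code: the lock classes are just the two names used in this mail, and struct order_graph, graph_init() and record_order() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the ones in the discussion. */
enum lock_class { FENCE_SIGNALING, I_MMAP_LOCK, NUM_CLASSES };

/* after[a][b] is true once "b acquired while a is held" has been recorded. */
struct order_graph {
	bool after[NUM_CLASSES][NUM_CLASSES];
};

static void graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is there a recorded ordering path from -> to? */
static bool reaches(const struct order_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NUM_CLASSES; next++)
		if (g->after[from][next] && !seen[next] &&
		    reaches(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "inner is taken while outer is held".
 * Returns false, without recording, if that would close a cycle,
 * i.e. if the opposite ordering has already been established.
 */
static bool record_order(struct order_graph *g, int outer, int inner)
{
	bool seen[NUM_CLASSES] = { false };

	if (reaches(g, inner, outer, seen))
		return false;
	g->after[outer][inner] = true;
	return true;
}
```

Recording FENCE_SIGNALING then I_MMAP_LOCK (the reset path) succeeds; recording I_MMAP_LOCK then FENCE_SIGNALING afterwards (the notifier path) is rejected as an inversion.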

Thomas Hellstrom Jan. 18, 2021, 1:28 p.m. UTC | #4

On 1/18/21 2:22 PM, Thomas Hellström wrote:
>
> On 1/18/21 1:01 PM, Maarten Lankhorst wrote:
>> On 18-01-2021 at 12:11, Thomas Hellström wrote:
>>> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>>>> We get a lockdep splat when the reset mutex is held, because it can be
>>>> taken from fence_wait. This conflicts with the mmu notifier we have,
>>>> because we recurse between reset mutex and mmap lock -> mmu notifier.
>>>>
>>>> Remove this recursion by calling revoke_mmaps before taking the lock.
>>> Hmm. Is the mmap_sem taken from gt_revoke()?
>>>
>>> If so, isn't the real problem that the mmap_sem is taken in the 
>>> dma_fence critical path (where the reset code sits)?
>> Hey,
>>
>> The gpu reset code specifically needs to revoke all gtt mappings, and 
>> the fault handler uses intel_gt_reset_trylock(),
>>
>> so this change should be ok since all those mappings are invalidated 
>> correctly and completed before this point.
>>
>> The reset mutex isn't actually taken inside fence code, but used for 
>> lockdep validation, so this should be ok.
>>
>> ~Maarten
>
> Hmm, OK but then we still have the following established locking order.
>
> lock(fence_signaling)
> lock(i_mmap_lock)
>
> But in the notifier
>
> lock(i_mmap_lock)
> fence_signaling(within notifier)
>
> So gt_revoke() is violating dma-fence rules.
>
> BTW it looks to me like the reset mutex annotation is actually doing 
> much the same as the dma-fence annotations. While we can move 
> gt_revoke() out of the reset mutex, that only gives us false hope, 
> since it moves it out of the equivalent dma-fence annotation. I figure 
> the reason this was not seen before the new code is that the reset 
> mutex lockdep annotation isn't taken when waiting for active, only 
> when waiting for dma-fence, but IMO the root problem is pre-existing.
>
> /Thomas
>
>
The interesting scenario is

thread 1:
take i_mmap_lock()
enter_mmu_notifier()
wait_fence()

thread 2:
need_to_reset_gpu_for_the_above_fence();
take i_mmap_lock()

Deadlock.

/Thomas
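
The scenario above can be modelled as a wait-for chain between the two threads. The following is a minimal sketch, not driver code: struct wait_state, scenario() and is_deadlocked() are hypothetical names, and the reset_takes_i_mmap_lock parameter is an assumption added only to contrast the case where the reset path does not need i_mmap_lock.

```c
#include <assert.h>
#include <stdbool.h>

enum task { THREAD_1, THREAD_2, NUM_TASKS };
#define NOBODY (-1)

/* waits_on[t] is the task that must make progress before t can continue. */
struct wait_state {
	int waits_on[NUM_TASKS];
};

/* Build the two-thread scenario from the mail above. */
static struct wait_state scenario(bool reset_takes_i_mmap_lock)
{
	struct wait_state s;

	/*
	 * Thread 1 holds i_mmap_lock, is inside the mmu notifier and waits
	 * for a fence that only the GPU reset (thread 2) can complete.
	 */
	s.waits_on[THREAD_1] = THREAD_2;

	/*
	 * Thread 2 must reset the GPU. If the reset path takes i_mmap_lock,
	 * it blocks on thread 1; otherwise it can run to completion.
	 */
	s.waits_on[THREAD_2] = reset_takes_i_mmap_lock ? THREAD_1 : NOBODY;

	return s;
}

/* Follow the wait-for chain from start; a return to start is a deadlock. */
static bool is_deadlocked(const struct wait_state *s, int start)
{
	int t = s->waits_on[start];

	for (int hops = 0; hops < NUM_TASKS && t != NOBODY; hops++) {
		if (t == start)
			return true;
		t = s->waits_on[t];
	}
	return false;
}
```

With reset_takes_i_mmap_lock set, the chain from thread 1 loops back to thread 1 and is_deadlocked() returns true; with it cleared, thread 2 waits on nobody and the chain terminates.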

Maarten Lankhorst Jan. 18, 2021, 2:46 p.m. UTC | #5

On 18-01-2021 at 14:28, Thomas Hellström wrote:
>
> On 1/18/21 2:22 PM, Thomas Hellström wrote:
>>
>> On 1/18/21 1:01 PM, Maarten Lankhorst wrote:
>>> On 18-01-2021 at 12:11, Thomas Hellström wrote:
>>>> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>>>>> We get a lockdep splat when the reset mutex is held, because it can be
>>>>> taken from fence_wait. This conflicts with the mmu notifier we have,
>>>>> because we recurse between reset mutex and mmap lock -> mmu notifier.
>>>>>
>>>>> Remove this recursion by calling revoke_mmaps before taking the lock.
>>>> Hmm. Is the mmap_sem taken from gt_revoke()?
>>>>
>>>> If so, isn't the real problem that the mmap_sem is taken in the dma_fence critical path (where the reset code sits)?
>>> Hey,
>>>
>>> The gpu reset code specifically needs to revoke all gtt mappings, and the fault handler uses intel_gt_reset_trylock(),
>>>
>>> so this change should be ok since all those mappings are invalidated correctly and completed before this point.
>>>
>>> The reset mutex isn't actually taken inside fence code, but used for lockdep validation, so this should be ok.
>>>
>>> ~Maarten
>>
>> Hmm, OK but then we still have the following established locking order.
>>
>> lock(fence_signaling)
>> lock(i_mmap_lock)
>>
>> But in the notifier
>>
>> lock(i_mmap_lock)
>> fence_signaling(within notifier)
>>
>> So gt_revoke() is violating dma-fence rules.
>>
>> BTW it looks to me like the reset mutex annotation is actually doing much the same as the dma-fence annotations. While we can move gt_revoke() out of the reset mutex, that only gives us false hope, since it moves it out of the equivalent dma-fence annotation. I figure the reason this was not seen before the new code is that the reset mutex lockdep annotation isn't taken when waiting for active, only when waiting for dma-fence, but IMO the root problem is pre-existing.
>>
>> /Thomas
>>
>>
> The interesting scenario is
>
> thread 1:
> take i_mmap_lock()
> enter_mmu_notifier()
> wait_fence()
>
> thread 2:
> need_to_reset_gpu_for_the_above_fence();
> take i_mmap_lock()
>
> Deadlock.
>
> /Thomas
>
>
Yeah, I think gpu reset isn't completely following lockdep rules yet. Thread 1 isn't doing anything wrong; gpu reset probably should stop revoking gt bindings and allow some garbage during reset. I don't see another way out. :-/

Thomas Hellstrom Jan. 18, 2021, 3:05 p.m. UTC | #6

On 1/18/21 3:46 PM, Maarten Lankhorst wrote:
> On 18-01-2021 at 14:28, Thomas Hellström wrote:
>> On 1/18/21 2:22 PM, Thomas Hellström wrote:
>>> On 1/18/21 1:01 PM, Maarten Lankhorst wrote:
>>>> On 18-01-2021 at 12:11, Thomas Hellström wrote:
>>>>> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>>>>>> We get a lockdep splat when the reset mutex is held, because it can be
>>>>>> taken from fence_wait. This conflicts with the mmu notifier we have,
>>>>>> because we recurse between reset mutex and mmap lock -> mmu notifier.
>>>>>>
>>>>>> Remove this recursion by calling revoke_mmaps before taking the lock.
>>>>> Hmm. Is the mmap_sem taken from gt_revoke()?
>>>>>
>>>>> If so, isn't the real problem that the mmap_sem is taken in the dma_fence critical path (where the reset code sits)?
>>>> Hey,
>>>>
>>>> The gpu reset code specifically needs to revoke all gtt mappings, and the fault handler uses intel_gt_reset_trylock(),
>>>>
>>>> so this change should be ok since all those mappings are invalidated correctly and completed before this point.
>>>>
>>>> The reset mutex isn't actually taken inside fence code, but used for lockdep validation, so this should be ok.
>>>>
>>>> ~Maarten
>>> Hmm, OK but then we still have the following established locking order.
>>>
>>> lock(fence_signaling)
>>> lock(i_mmap_lock)
>>>
>>> But in the notifier
>>>
>>> lock(i_mmap_lock)
>>> fence_signaling(within notifier)
>>>
>>> So gt_revoke() is violating dma-fence rules.
>>>
>>> BTW it looks to me like the reset mutex annotation is actually doing much the same as the dma-fence annotations. While we can move gt_revoke() out of the reset mutex, that only gives us false hope, since it moves it out of the equivalent dma-fence annotation. I figure the reason this was not seen before the new code is that the reset mutex lockdep annotation isn't taken when waiting for active, only when waiting for dma-fence, but IMO the root problem is pre-existing.
>>>
>>> /Thomas
>>>
>>>
>> The interesting scenario is
>>
>> thread 1:
>> take i_mmap_lock()
>> enter_mmu_notifier()
>> wait_fence()
>>
>> thread 2:
>> need_to_reset_gpu_for_the_above_fence();
>> take i_mmap_lock()
>>
>> Deadlock.
>>
>> /Thomas
>>
>>
> Yeah, I think gpu reset isn't completely following lockdep rules yet. Thread 1 isn't doing anything wrong; gpu reset probably should stop revoking gt bindings and allow some garbage during reset. I don't see another way out. :-/

Me neither.

But to silence lockdep until dma_fence annotation is widely added:

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Maarten Lankhorst Jan. 18, 2021, 3:32 p.m. UTC | #7

On 18-01-2021 at 16:05, Thomas Hellström wrote:
>
> On 1/18/21 3:46 PM, Maarten Lankhorst wrote:
>> On 18-01-2021 at 14:28, Thomas Hellström wrote:
>>> On 1/18/21 2:22 PM, Thomas Hellström wrote:
>>>> On 1/18/21 1:01 PM, Maarten Lankhorst wrote:
>>>>> On 18-01-2021 at 12:11, Thomas Hellström wrote:
>>>>>> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>>>>>>> We get a lockdep splat when the reset mutex is held, because it can be
>>>>>>> taken from fence_wait. This conflicts with the mmu notifier we have,
>>>>>>> because we recurse between reset mutex and mmap lock -> mmu notifier.
>>>>>>>
>>>>>>> Remove this recursion by calling revoke_mmaps before taking the lock.
>>>>>> Hmm. Is the mmap_sem taken from gt_revoke()?
>>>>>>
>>>>>> If so, isn't the real problem that the mmap_sem is taken in the dma_fence critical path (where the reset code sits)?
>>>>> Hey,
>>>>>
>>>>> The gpu reset code specifically needs to revoke all gtt mappings, and the fault handler uses intel_gt_reset_trylock(),
>>>>>
>>>>> so this change should be ok since all those mappings are invalidated correctly and completed before this point.
>>>>>
>>>>> The reset mutex isn't actually taken inside fence code, but used for lockdep validation, so this should be ok.
>>>>>
>>>>> ~Maarten
>>>> Hmm, OK but then we still have the following established locking order.
>>>>
>>>> lock(fence_signaling)
>>>> lock(i_mmap_lock)
>>>>
>>>> But in the notifier
>>>>
>>>> lock(i_mmap_lock)
>>>> fence_signaling(within notifier)
>>>>
>>>> So gt_revoke() is violating dma-fence rules.
>>>>
>>>> BTW it looks to me like the reset mutex annotation is actually doing much the same as the dma-fence annotations. While we can move gt_revoke() out of the reset mutex, that only gives us false hope, since it moves it out of the equivalent dma-fence annotation. I figure the reason this was not seen before the new code is that the reset mutex lockdep annotation isn't taken when waiting for active, only when waiting for dma-fence, but IMO the root problem is pre-existing.
>>>>
>>>> /Thomas
>>>>
>>>>
>>> The interesting scenario is
>>>
>>> thread 1:
>>> take i_mmap_lock()
>>> enter_mmu_notifier()
>>> wait_fence()
>>>
>>> thread 2:
>>> need_to_reset_gpu_for_the_above_fence();
>>> take i_mmap_lock()
>>>
>>> Deadlock.
>>>
>>> /Thomas
>>>
>>>
>> Yeah, I think gpu reset isn't completely following lockdep rules yet. Thread 1 isn't doing anything wrong; gpu reset probably should stop revoking gt bindings and allow some garbage during reset. I don't see another way out. :-/
>
> Me neither.
>
> But to silence lockdep until dma_fence annotation is widely added:
>
> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>
>
Ideally we'll add fence signaling annotations to gpu reset, to detect exactly this kind of thing. Hopefully in the future. :)
