diff mbox series

[v2] drm/i915: Fix the race between the GEM close and debugfs

Message ID 20200915034118.32256-1-nikunj.dadhania@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series [v2] drm/i915: Fix the race between the GEM close and debugfs | expand

Commit Message

Nikunj A. Dadhania Sept. 15, 2020, 3:41 a.m. UTC
As we close GEM object and set file_priv to -EBADF which is protected
by ctx->mutex, populating the GEM debugfs info is not protected
and results in the crash shown below.

Make sure to protect the access to file_priv using ctx->mutex to avoid
race.

BUG: unable to handle page fault for address: ffffffffffffffff
RIP: 0010:i915_gem_object_info+0x26b/0x3eb
Code: 89 44 24 48 48 89 44 24 40 48 89 44 24 38 48 89 44 24 30 48 89 44 24 28 48 89 44 24 20 49 8b 46 f0 48 89 44 24 20 49 8b 46 a0 <48> 8b 58 08 b9 0a 00 00 00 48 b8 aa aa aa aa aa aa aa aa 48 8d bc
RSP: 0018:ffffac81c14cfc30 EFLAGS: 00010246
RAX: fffffffffffffff7 RBX: ffff95094429c218 RCX: ffff95096756c740
RDX: 0000000000000000 RSI: ffffffff919b93ee RDI: ffff95094429c218
RBP: ffffac81c14cfd58 R08: ffff9509746fab80 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff9509753f8e80
R13: ffffac81c14cfc98 R14: ffff95094429c268 R15: ffffac81c14cfc88
FS:  00007a1bdcd52900(0000) GS:ffff950977e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffff CR3: 000000026b4e0000 CR4: 0000000000340ef0
Call Trace:
 seq_read+0x162/0x3ca
 full_proxy_read+0x5b/0x8d
 __vfs_read+0x45/0x1b9
 vfs_read+0xc9/0x15e
 ksys_read+0x7e/0xde
 do_syscall_64+0x54/0x7e
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7a1bdd34cf03

Signed-off-by: Nikunj A. Dadhania <nikunj.dadhania@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_debugfs.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Sasha Levin Sept. 17, 2020, 3:53 p.m. UTC | #1
Hi

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.8.9, v5.4.65, v4.19.145, v4.14.198, v4.9.236, v4.4.236.

v5.8.9: Build OK!
v5.4.65: Failed to apply! Possible dependencies:
    061489c65ff5 ("drm/i915/dsb: single register write function for DSB.")
    11988e393813 ("drm/i915/execlists: Try rearranging breadcrumb flush")
    2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
    5a90606df7cb ("drm/i915: Replace obj->pin_global with obj->frontbuffer")
    67f3b58f3bac ("drm/i915/dsb: DSB context creation.")
    8a9a982767b7 ("drm/i915: use a separate context for gpu relocs")
    a4e7ccdac38e ("drm/i915: Move context management under GEM")
    b27a96ad72fd ("drm/i915/dsb: Indexed register write function for DSB.")
    bb120e1171a9 ("drm/i915: Show the logical context ring state on dumping")
    c210e85b8f33 ("drm/i915/tgl: Extend MI_SEMAPHORE_WAIT")
    d19d71fc2b15 ("drm/i915: Mark i915_request.timeline as a volatile, rcu pointer")
    e8f6b4952ec5 ("drm/i915/execlists: Flush the post-sync breadcrumb write harder")

v4.19.145: Failed to apply! Possible dependencies:
    0258404f9d38 ("drm/i915: start moving runtime device info to a separate struct")
    026844460743 ("drm/i915: Remove intel_context.active_link")
    07d805721938 ("drm/i915: Introduce intel_runtime_pm_disable to pair intel_runtime_pm_enable")
    13f1bfd3b332 ("drm/i915: Make object/vma allocation caches global")
    1c71bc565cdb ("drm/i915/perf: simplify configure all context function")
    2cc8376fd350 ("drm/i915: rename dev_priv info to __info to avoid usage")
    2cd9a689e97b ("drm/i915: Refactor intel_display_set_init_power() logic")
    37d7c9cc2eb6 ("drm/i915: Check engine->default_state mapping on module load")
    55ac5a1614f9 ("drm/i915: Attach the pci match data to the device upon creation")
    666424abfb86 ("drm/i915/execlists: Use coherent writes into the context image")
    6dfc4a8f134f ("drm/i915: Verify power domains after enabling them")
    722f3de39e03 ("i915/oa: Simplify updating contexts")
    900ccf30f9e1 ("drm/i915: Only force GGTT coherency w/a on required chipsets")
    c4d52feb2c46 ("drm/i915: Move over to intel_context_lookup()")
    f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs")
    fa9f668141f4 ("drm/i915: Export intel_context_instance()")

v4.14.198: Failed to apply! Possible dependencies:
    3bd4073524fa ("drm/i915: Consolidate get_fence with pin_fence")
    465c403cb508 ("drm/i915: introduce simple gemfs")
    66df1014efba ("drm/i915: Keep a small stash of preallocated WC pages")
    67b48040255b ("drm/i915: Assert that the handle->vma lut is empty on object close")
    73ebd503034c ("drm/i915: make mappable struct resource centric")
    7789422665f5 ("drm/i915: make dsm struct resource centric")
    82ad6443a55e ("drm/i915/gtt: Rename i915_hw_ppgtt base member")
    969b0950a188 ("drm/i915: Add interface to reserve fence registers for vGPU")
    a65adaf8a834 ("drm/i915: Track user GTT faulting per-vma")
    b4563f595ed4 ("drm/i915: Pin fence for iomap")
    e91ef99b9543 ("drm/i915/selftests: Remember to create the fake preempt context")
    f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs")
    f773568b6ff8 ("drm/i915: nuke the duplicated stolen discovery")

v4.9.236: Failed to apply! Possible dependencies:
    0e70447605f4 ("drm/i915: Move common code out of i915_gpu_error.c")
    1b36595ffb35 ("drm/i915: Show RING registers through debugfs")
    28a60dee2ce6 ("drm/i915/gvt: vGPU HW resource management")
    3b3f1650b1ca ("drm/i915: Allocate intel_engine_cs structure only for the enabled engines")
    82ad6443a55e ("drm/i915/gtt: Rename i915_hw_ppgtt base member")
    85fd4f58d7ef ("drm/i915: Mark all non-vma being inserted into the address spaces")
    9c870d03674f ("drm/i915: Use RPM as the barrier for controlling user mmap access")
    bb6dc8d96b68 ("drm/i915: Implement pread without struct-mutex")
    d636951ec01b ("drm/i915: Cleanup instdone collection")
    e007b19d7ba7 ("drm/i915: Use the MRU stack search after evicting")
    f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs")
    f9e613728090 ("drm/i915: Try to print INSTDONE bits for all slice/subslice")

v4.4.236: Failed to apply! Possible dependencies:
    1b683729e7ac ("drm/i915: Remove redundant check in i915_gem_obj_to_vma")
    1c7f4bca5a6f ("drm/i915: Rename vma->*_list to *_link for consistency")
    3272db53136f ("drm/i915: Combine all i915_vma bitfields into a single set of flags")
    596c5923197b ("drm/i915: Reduce the pointer dance of i915_is_ggtt()")
    c1a415e261aa ("drm/i915: Disable shrinker for non-swapped backed objects")
    d0710abbcd88 ("drm/i915: Set the map-and-fenceable flag for preallocated objects")
    f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 784219962193..ea469168cd44 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -326,6 +326,7 @@  static void print_context_stats(struct seq_file *m,
 		}
 		i915_gem_context_unlock_engines(ctx);
 
+		mutex_lock(&ctx->mutex);
 		if (!IS_ERR_OR_NULL(ctx->file_priv)) {
 			struct file_stats stats = {
 				.vm = rcu_access_pointer(ctx->vm),
@@ -346,6 +347,7 @@  static void print_context_stats(struct seq_file *m,
 
 			print_file_stats(m, name, stats);
 		}
+		mutex_unlock(&ctx->mutex);
 
 		spin_lock(&i915->gem.contexts.lock);
 		list_safe_reset_next(ctx, cn, link);