Message ID | 57208E78.4070800@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Dave, thank you for the patch, but I'm using longtime kernel 3.14.66 and so the patch does not fit. Do you have any suggestions for that kernel? Andreas 2016-04-27 12:03 GMT+02:00 Dave Gordon <david.s.gordon@intel.com>: > On 27/04/16 10:19, Andreas Lampersperger wrote: > >> Hello, >> >> has anyone here a hint for me, what can cause this error. >> The error occures highly sporadic on different machines with intel hd >> graphics (ivb_gt1). >> I did also some kernel coredumps and found out, that the failed >> paging request came from drm_i915_gem_request->list.prev or ->list.next. >> >> Thank you >> Andreas >> > > Try this patch. > > .Dave. > > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx > >
> 2016-04-27 12:03 GMT+02:00 Dave Gordon <david.s.gordon@intel.com>: > >> On 27/04/16 10:19, Andreas Lampersperger wrote: >> >>> Hello, >>> >>> has anyone here a hint for me, what can cause this error. >>> The error occures highly sporadic on different machines with intel hd >>> graphics (ivb_gt1). >>> I did also some kernel coredumps and found out, that the failed >>> paging request came from drm_i915_gem_request->list.prev or ->list.next. >>> >>> Thank you >>> Andreas >> >> Try this patch. >> >> .Dave. > On 27/04/16 13:20, Andreas Lampersperger wrote: > Hi Dave, > > thank you for the patch, but I'm using longtime kernel 3.14.66 and so the > patch does not fit. > Do you have any suggestions for that kernel? > > Andreas I have no idea what the code would look like in that version; do you have any way to get the git commit hash or tag of the i915 driver? But essentially, if you have the kernel sources for that version, you could look for a line that allocates a "struct drm_i915_gem_request". It might be req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL); or in older versions it might be request = kzalloc(sizeof(*request), GFP_KERNEL); or some other variation. It should of course be followed by a test for successful allocation, e.g. if (request == NULL) return -ENOMEM; If you can find that line (or lines, at one time there were TWO such places), then you just have to add INIT_LIST_HEAD(&request->list); after the test-for-NULL, or anywhere inline thereafter. For example: request = kzalloc(sizeof(*request), GFP_KERNEL); if (request == NULL) return -ENOMEM; ret = i915_gem_get_seqno(ring->dev, &request->seqno); if (ret) { kfree(request); return ret; } + INIT_LIST_HEAD(&request->list); kref_init(&request->ref); request->ring = ring; request->uniq = dev_private->request_uniq++; HTH, .Dave.
Hi Dave, thank you again. I searched, where the memory for the request came from, it is set in line 2158 of i915_gem.c <https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/gpu/drm/i915/i915_gem.c?id=refs/tags/v3.14.67#n2158> and the kmalloc is in intel_ring_alloc_seqno() in intel_ringbuffer.c <https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/gpu/drm/i915/intel_ringbuffer.c?id=refs/tags/v3.14.67#n1600>, right? So I will test the following patch for intel_ringbuffer.c: intel_ring_alloc_seqno(struct intel_ring_buffer *ring) { if (ring->outstanding_lazy_seqno) return 0; if (ring->preallocated_lazy_request == NULL) { struct drm_i915_gem_request *request; request = kmalloc(sizeof(*request), GFP_KERNEL); if (request == NULL) return -ENOMEM; + + INIT_LIST_HEAD(&request->list); ring->preallocated_lazy_request = request; } return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno); } : 2016-04-27 20:28 GMT+02:00 Dave Gordon <david.s.gordon@intel.com>: > 2016-04-27 12:03 GMT+02:00 Dave Gordon <david.s.gordon@intel.com>: >> >> On 27/04/16 10:19, Andreas Lampersperger wrote: >>> >>> Hello, >>>> >>>> has anyone here a hint for me, what can cause this error. >>>> The error occures highly sporadic on different machines with intel hd >>>> graphics (ivb_gt1). >>>> I did also some kernel coredumps and found out, that the failed >>>> paging request came from drm_i915_gem_request->list.prev or ->list.next. >>>> >>>> Thank you >>>> Andreas >>>> >>> >>> Try this patch. >>> >>> .Dave. >>> >> >> On 27/04/16 13:20, Andreas Lampersperger wrote: > >> Hi Dave, >> >> thank you for the patch, but I'm using longtime kernel 3.14.66 and so the >> patch does not fit. >> Do you have any suggestions for that kernel? >> >> Andreas >> > > I have no idea what the code would look like in that version; do you have > any way to get the git commit hash or tag of the i915 driver? > > But essentially, if you have the kernel sources for that version, you > could look for a line that allocates a "struct drm_i915_gem_request". It > might be > > req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL); > > or in older versions it might be > > request = kzalloc(sizeof(*request), GFP_KERNEL); > > or some other variation. It should of course be followed by a test for > successful allocation, e.g. > > if (request == NULL) > return -ENOMEM; > > If you can find that line (or lines, at one time there were TWO such > places), then you just have to add > > INIT_LIST_HEAD(&request->list); > > after the test-for-NULL, or anywhere inline thereafter. For example: > > request = kzalloc(sizeof(*request), GFP_KERNEL); > if (request == NULL) > return -ENOMEM; > > ret = i915_gem_get_seqno(ring->dev, &request->seqno); > if (ret) { > kfree(request); > return ret; > } > > + INIT_LIST_HEAD(&request->list); > kref_init(&request->ref); > request->ring = ring; > request->uniq = dev_private->request_uniq++; > > HTH, > .Dave. >
On 28/04/16 09:58, Andreas Lampersperger wrote: > Hi Dave, > > thank you again. > > I searched, where the memory for the request came from, it is set in line 2158 > of i915_gem.c > <https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/gpu/drm/i915/i915_gem.c?id=refs/tags/v3.14.67#n2158> > and the kmalloc is in intel_ring_alloc_seqno() in intel_ringbuffer.c > <https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/gpu/drm/i915/intel_ringbuffer.c?id=refs/tags/v3.14.67#n1600>, > right? > > So I will test the following patch for intel_ringbuffer.c: > > > intel_ring_alloc_seqno(struct intel_ring_buffer *ring) > { > if (ring->outstanding_lazy_seqno) > return 0; > > if (ring->preallocated_lazy_request == NULL) { > struct drm_i915_gem_request *request; > > request = kmalloc(sizeof(*request), GFP_KERNEL); > if (request == NULL) > return -ENOMEM; > + > + INIT_LIST_HEAD(&request->list); > ring->preallocated_lazy_request = request; > } > > return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno); > } > : Yes, that looks sensible. Interestingly, that version uses kmalloc() rather than k*z*alloc, so the uninitialised list head will be random rather than zeroed. Anything could happen! .Dave.
From 6dde6b8b07239c68f32315faffa9d80b6e2d0aec Mon Sep 17 00:00:00 2001 From: Dave Gordon <david.s.gordon@intel.com> Date: Fri, 22 Apr 2016 19:33:53 +0100 Subject: [PATCH] Request constructor should initialise list head Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Users of the Linux kernel linked-list macros should remember that a zeroed "struct list_head" is not a valid list and in particular is not the same as an empty list. ALL list heads MUST be initialised! Without this, any attempt to follow this link (e.g. during teardown) before the request has been put on the list of pending requests (or if is not put on the list, e.g. is cancelled instead) will result in a NULL pointer dereference. Note: the diff below claims this code goes in i915_gem_request_free(), but that's just git's failure to parse the function headers correctly. It obviously belongs in __i915_gem_request_alloc(); it seems to apply correctly despite the incorrect function tag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907 ? Signed-off-by: Dave Gordon <david.s.gordon@intel.com> --- drivers/gpu/drm/i915/i915_gem.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d493e79..8ce7d4d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2751,6 +2751,7 @@ void i915_gem_request_free(struct kref *req_ref) if (ret) goto err; + INIT_LIST_HEAD(&req->list); kref_init(&req->ref); req->i915 = dev_priv; req->engine = engine; -- 1.9.1