diff mbox

[1/2] drm/udl: fix a NULL pointer reference in udl_gem_free_object().

Message ID 1472593821-38429-1-git-send-email-hshi@chromium.org (mailing list archive)
State New, archived
Headers show

Commit Message

Haixia Shi Aug. 30, 2016, 9:50 p.m. UTC
Previously this function had a NULL pointer check for gem->map_list.map, but
that line was refactored after commit 0de23977cfeb5b357ec884ba15417ae118ff9e9bb
("drm/gem: convert to new unified vma manager").

After the refactor it is still necessasry to check that the vma manager is not
NULL because udl_gem_free_object() may come in after the vma manager is destroyed.

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
---
 drivers/gpu/drm/udl/udl_gem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Daniel Vetter Aug. 31, 2016, 7:55 a.m. UTC | #1
On Tue, Aug 30, 2016 at 02:50:20PM -0700, Haixia Shi wrote:
> Previously this function had a NULL pointer check for gem->map_list.map, but
> that line was refactored after commit 0de23977cfeb5b357ec884ba15417ae118ff9e9bb
> ("drm/gem: convert to new unified vma manager").
> 
> After the refactor it is still necessasry to check that the vma manager is not
> NULL because udl_gem_free_object() may come in after the vma manager is destroyed.

When/how does this happen? Backtrace? Destroying the vma manager before
the objects are all gone sounds a bit fishy.
-Daniel

> 
> Signed-off-by: Haixia Shi <hshi@chromium.org>
> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
> ---
>  drivers/gpu/drm/udl/udl_gem.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/udl/udl_gem.c b/drivers/gpu/drm/udl/udl_gem.c
> index 818e707..72e1bd4 100644
> --- a/drivers/gpu/drm/udl/udl_gem.c
> +++ b/drivers/gpu/drm/udl/udl_gem.c
> @@ -204,7 +204,8 @@ void udl_gem_free_object(struct drm_gem_object *gem_obj)
>  	if (obj->pages)
>  		udl_gem_put_pages(obj);
>  
> -	drm_gem_free_mmap_offset(gem_obj);
> +	if (gem_obj->dev->vma_offset_manager)
> +		drm_gem_free_mmap_offset(gem_obj);
>  }
>  
>  /* the dumb interface doesn't work with the GEM straight MMAP
> -- 
> 2.8.0.rc3.226.g39d4020
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Haixia Shi Aug. 31, 2016, 8:45 p.m. UTC | #2
For details see https://bugs.chromium.org/p/chromium/issues/detail?id=468050

So drm_mode_config_cleanup() is called from udl_driver_unload() in
which we found there's still a framebuffer left, hence the WARN in
drm_crtc.c:5495. This also forcefully releases all the buffers.

A bit later the actual drm_buf_release comes in which attempts to
release the buffer again.

On Wed, Aug 31, 2016 at 12:55 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Aug 30, 2016 at 02:50:20PM -0700, Haixia Shi wrote:
> > Previously this function had a NULL pointer check for gem->map_list.map, but
> > that line was refactored after commit 0de23977cfeb5b357ec884ba15417ae118ff9e9bb
> > ("drm/gem: convert to new unified vma manager").
> >
> > After the refactor it is still necessasry to check that the vma manager is not
> > NULL because udl_gem_free_object() may come in after the vma manager is destroyed.
>
> When/how does this happen? Backtrace? Destroying the vma manager before
> the objects are all gone sounds a bit fishy.
> -Daniel
>
> >
> > Signed-off-by: Haixia Shi <hshi@chromium.org>
> > Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
> > ---
> >  drivers/gpu/drm/udl/udl_gem.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/udl/udl_gem.c b/drivers/gpu/drm/udl/udl_gem.c
> > index 818e707..72e1bd4 100644
> > --- a/drivers/gpu/drm/udl/udl_gem.c
> > +++ b/drivers/gpu/drm/udl/udl_gem.c
> > @@ -204,7 +204,8 @@ void udl_gem_free_object(struct drm_gem_object *gem_obj)
> >       if (obj->pages)
> >               udl_gem_put_pages(obj);
> >
> > -     drm_gem_free_mmap_offset(gem_obj);
> > +     if (gem_obj->dev->vma_offset_manager)
> > +             drm_gem_free_mmap_offset(gem_obj);
> >  }
> >
> >  /* the dumb interface doesn't work with the GEM straight MMAP
> > --
> > 2.8.0.rc3.226.g39d4020
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Daniel Vetter Aug. 31, 2016, 9:05 p.m. UTC | #3
On Wed, Aug 31, 2016 at 10:45 PM, Haixia Shi <hshi@chromium.org> wrote:
> For details see https://bugs.chromium.org/p/chromium/issues/detail?id=468050
>
> So drm_mode_config_cleanup() is called from udl_driver_unload() in
> which we found there's still a framebuffer left, hence the WARN in
> drm_crtc.c:5495. This also forcefully releases all the buffers.
>
> A bit later the actual drm_buf_release comes in which attempts to
> release the buffer again.

Leaving a drm_framebuffer behind on unload is definitely a bug, but
not fixed with this patch here I think.

The dma-buf lifetime issue is far worse, since we simply don't
handling those leaking past the lifetime of the exporting drm_device
at all. The dma-buf has references to a lot more than just the vma
manager. What we probably need is that every exported dma-buf holds a
ref on the underlying drm_device, but that means untangling the
refcounting&cleanup of that vs unplugging it.
-Daniel
Daniel Vetter Aug. 31, 2016, 9:17 p.m. UTC | #4
On Wed, Aug 31, 2016 at 11:05 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Wed, Aug 31, 2016 at 10:45 PM, Haixia Shi <hshi@chromium.org> wrote:
>> For details see https://bugs.chromium.org/p/chromium/issues/detail?id=468050
>>
>> So drm_mode_config_cleanup() is called from udl_driver_unload() in
>> which we found there's still a framebuffer left, hence the WARN in
>> drm_crtc.c:5495. This also forcefully releases all the buffers.
>>
>> A bit later the actual drm_buf_release comes in which attempts to
>> release the buffer again.
>
> Leaving a drm_framebuffer behind on unload is definitely a bug, but
> not fixed with this patch here I think.
>
> The dma-buf lifetime issue is far worse, since we simply don't
> handling those leaking past the lifetime of the exporting drm_device
> at all. The dma-buf has references to a lot more than just the vma
> manager. What we probably need is that every exported dma-buf holds a
> ref on the underlying drm_device, but that means untangling the
> refcounting&cleanup of that vs unplugging it.

Just noticed that these problems started only when dma-buf export
support was added (by you) to udl. That dma-buf support is a bit
hack-ish (e.g. it leaks sg mappings since udl_unmap_dma_buf isn't
implemented. Rough sketch of a fix:

- fix up udl_dmabuf.c. We could/should probably put most of these into
the core as a set of helpers for drivers which use plain shmem gem
objects.

- fix udl load/unload to no longer be midlayered, i.e. get rid of the
->load/unload hooks. There's tons of examples and drivers out there
for templates.

- fix up the unplug hook to correctly unregister everything. With the
fixed load/unload all the unplugged tracking is probably no longer
neeeded. Also with this you can drop the drm_global_mutex dance from
drm_unplug_dev. Also the sequence in drm_unplug_dev is wrong like the
midlayered ->unload. First it should do all the unregistering, then
release internal resources. Atm it's the other way round.

- make sure _all_ public objects (open files, counted by
dev->open_count, dma-bufs, ...) hold a full reference onto the
drm_device. For the dma-buf case this probably needs changes in
drm_prime.c.

- Fix that little ordering issue with the leaking fb. It's probably
not getting cleanup up as it should in udl_fbdev_unplug.

tada, bug fixed for real!

Cheers, Daniel
Haixia Shi Sept. 1, 2016, 1:22 a.m. UTC | #5
Daniel

Thanks! I agree the PATCH 1/2 needs some more work.

What do you think about the PATCH 2/2 (suspend/resume) -- would it
make sense to review it as a single standalone patch?

Regards, Haixia

On Wed, Aug 31, 2016 at 2:17 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Wed, Aug 31, 2016 at 11:05 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Wed, Aug 31, 2016 at 10:45 PM, Haixia Shi <hshi@chromium.org> wrote:
>>> For details see https://bugs.chromium.org/p/chromium/issues/detail?id=468050
>>>
>>> So drm_mode_config_cleanup() is called from udl_driver_unload() in
>>> which we found there's still a framebuffer left, hence the WARN in
>>> drm_crtc.c:5495. This also forcefully releases all the buffers.
>>>
>>> A bit later the actual drm_buf_release comes in which attempts to
>>> release the buffer again.
>>
>> Leaving a drm_framebuffer behind on unload is definitely a bug, but
>> not fixed with this patch here I think.
>>
>> The dma-buf lifetime issue is far worse, since we simply don't
>> handling those leaking past the lifetime of the exporting drm_device
>> at all. The dma-buf has references to a lot more than just the vma
>> manager. What we probably need is that every exported dma-buf holds a
>> ref on the underlying drm_device, but that means untangling the
>> refcounting&cleanup of that vs unplugging it.
>
> Just noticed that these problems started only when dma-buf export
> support was added (by you) to udl. That dma-buf support is a bit
> hack-ish (e.g. it leaks sg mappings since udl_unmap_dma_buf isn't
> implemented. Rough sketch of a fix:
>
> - fix up udl_dmabuf.c. We could/should probably put most of these into
> the core as a set of helpers for drivers which use plain shmem gem
> objects.
>
> - fix udl load/unload to no longer be midlayered, i.e. get rid of the
> ->load/unload hooks. There's tons of examples and drivers out there
> for templates.
>
> - fix up the unplug hook to correctly unregister everything. With the
> fixed load/unload all the unplugged tracking is probably no longer
> neeeded. Also with this you can drop the drm_global_mutex dance from
> drm_unplug_dev. Also the sequence in drm_unplug_dev is wrong like the
> midlayered ->unload. First it should do all the unregistering, then
> release internal resources. Atm it's the other way round.
>
> - make sure _all_ public objects (open files, counted by
> dev->open_count, dma-bufs, ...) hold a full reference onto the
> drm_device. For the dma-buf case this probably needs changes in
> drm_prime.c.
>
> - Fix that little ordering issue with the leaking fb. It's probably
> not getting cleanup up as it should in udl_fbdev_unplug.
>
> tada, bug fixed for real!
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Daniel Vetter Sept. 1, 2016, 6:28 a.m. UTC | #6
On Thu, Sep 1, 2016 at 3:22 AM, Haixia Shi <hshi@chromium.org> wrote:
> Daniel
>
> Thanks! I agree the PATCH 1/2 needs some more work.
>
> What do you think about the PATCH 2/2 (suspend/resume) -- would it
> make sense to review it as a single standalone patch?

Sure, but I have no clue about usb or udl-specific issues (I only
analyzed the generic lifetime issues in patch 1/2), I think you need
someone else for this. Sean Paul could probably help you out here.
-Daniel

>
> Regards, Haixia
>
> On Wed, Aug 31, 2016 at 2:17 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Wed, Aug 31, 2016 at 11:05 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>> On Wed, Aug 31, 2016 at 10:45 PM, Haixia Shi <hshi@chromium.org> wrote:
>>>> For details see https://bugs.chromium.org/p/chromium/issues/detail?id=468050
>>>>
>>>> So drm_mode_config_cleanup() is called from udl_driver_unload() in
>>>> which we found there's still a framebuffer left, hence the WARN in
>>>> drm_crtc.c:5495. This also forcefully releases all the buffers.
>>>>
>>>> A bit later the actual drm_buf_release comes in which attempts to
>>>> release the buffer again.
>>>
>>> Leaving a drm_framebuffer behind on unload is definitely a bug, but
>>> not fixed with this patch here I think.
>>>
>>> The dma-buf lifetime issue is far worse, since we simply don't
>>> handling those leaking past the lifetime of the exporting drm_device
>>> at all. The dma-buf has references to a lot more than just the vma
>>> manager. What we probably need is that every exported dma-buf holds a
>>> ref on the underlying drm_device, but that means untangling the
>>> refcounting&cleanup of that vs unplugging it.
>>
>> Just noticed that these problems started only when dma-buf export
>> support was added (by you) to udl. That dma-buf support is a bit
>> hack-ish (e.g. it leaks sg mappings since udl_unmap_dma_buf isn't
>> implemented. Rough sketch of a fix:
>>
>> - fix up udl_dmabuf.c. We could/should probably put most of these into
>> the core as a set of helpers for drivers which use plain shmem gem
>> objects.
>>
>> - fix udl load/unload to no longer be midlayered, i.e. get rid of the
>> ->load/unload hooks. There's tons of examples and drivers out there
>> for templates.
>>
>> - fix up the unplug hook to correctly unregister everything. With the
>> fixed load/unload all the unplugged tracking is probably no longer
>> neeeded. Also with this you can drop the drm_global_mutex dance from
>> drm_unplug_dev. Also the sequence in drm_unplug_dev is wrong like the
>> midlayered ->unload. First it should do all the unregistering, then
>> release internal resources. Atm it's the other way round.
>>
>> - make sure _all_ public objects (open files, counted by
>> dev->open_count, dma-bufs, ...) hold a full reference onto the
>> drm_device. For the dma-buf case this probably needs changes in
>> drm_prime.c.
>>
>> - Fix that little ordering issue with the leaking fb. It's probably
>> not getting cleanup up as it should in udl_fbdev_unplug.
>>
>> tada, bug fixed for real!
>>
>> Cheers, Daniel
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
diff mbox

Patch

diff --git a/drivers/gpu/drm/udl/udl_gem.c b/drivers/gpu/drm/udl/udl_gem.c
index 818e707..72e1bd4 100644
--- a/drivers/gpu/drm/udl/udl_gem.c
+++ b/drivers/gpu/drm/udl/udl_gem.c
@@ -204,7 +204,8 @@  void udl_gem_free_object(struct drm_gem_object *gem_obj)
 	if (obj->pages)
 		udl_gem_put_pages(obj);
 
-	drm_gem_free_mmap_offset(gem_obj);
+	if (gem_obj->dev->vma_offset_manager)
+		drm_gem_free_mmap_offset(gem_obj);
 }
 
 /* the dumb interface doesn't work with the GEM straight MMAP