diff mbox

[RFC,2/8] DRM: Armada: Add Armada DRM driver

Message ID 20130613130339.GD18614@n2100.arm.linux.org.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Russell King - ARM Linux June 13, 2013, 1:03 p.m. UTC
On Thu, Jun 13, 2013 at 12:50:16PM +0100, Russell King - ARM Linux wrote:
> On Thu, Jun 13, 2013 at 12:19:03PM +0100, Russell King - ARM Linux wrote:
> > The deeper I look, the more bugs there seem to be in this DRM stuff,
> > and I'm continuing to look because I'm chasing a framebuffer refcount
> > bug.
> 
> So, this refcount bug - I think I've just found it.  This is the flow of
> references to the new fb on mode set:
> 
> drm_mode_setcrtc():
>                         fb = drm_framebuffer_lookup(dev, crtc_req->fb_id);
>         set.fb = fb;
>         ret = drm_mode_set_config_internal(&set);
> drm_mode_set_config_internal():
>         fb = set->fb;
>         ret = crtc->funcs->set_config(set);
> drm_crtc_helper_set_config():
>                         old_fb = set->crtc->fb;
>                         set->crtc->fb = set->fb;
>                         if (!drm_crtc_helper_set_mode(set->crtc, set->mode,
>                                                       set->x, set->y,
>                                                       old_fb)) {
>                 drm_helper_disable_unused_functions(dev);
> drm_helper_disable_unused_functions():
>         list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>                 crtc->enabled = drm_helper_crtc_in_use(crtc);
>                 if (!crtc->enabled) {
>                         crtc->fb = NULL;
> 		}
> 	}
> back to drm_mode_set_config_internal():
>         if (ret == 0) {
>                 if (fb)
>                         drm_framebuffer_reference(fb);
> back to drm_mode_setcrtc():
>         if (fb)
>                 drm_framebuffer_unreference(fb);
> 
> Assuming success all the way through, what happens when a CRTC is unused
> is:
> 
> 1. We obtain a reference in drm_mode_setcrtc() via the lookup.
> 2. We set the mode
> 3. In trying to set the mode, we discover that all connectors for the CRTC
>    are in the disconnected state, and so we disable the CRTC
> 4. We set crtc->fb to NULL
> 5. back in drm_mode_set_config_internal(), we take a reference on the
>    framebuffer irrespective of this.
> 6. back in drm_mode_setcrtc(), we drop the original reference caused by
>    the lookup.
> 
> We now have a framebuffer with a reference count incremented by one but
> no actual reference to it - the CRTC's reference is completely lost by
> the action of drm_helper_disable_unused_functions().
> 
> You could argue that it's something the driver should deal with - fine,
> but what if it only implements the DPMS method?  Should it drop a
> reference to the framebuffer when DPMS instructs it to turn off?  Surely
> not, because that means when DPMS turns stuff back on you're missing a
> refcount.
> 
> Are drivers required to implement a disable function and cater for the
> imbalance in the upper layers of code?  If so, this is not a clean
> design.

There's a bigger issue here - if it's possible for drm_crtc_helper_set_config()
to be called with set->fb set but set->mode NULL, then we overwrite
set->fb to NULL.  Again, that results in a lost reference.

For the time being, I'm using this patch, which solves my dropped
refcount problem, and marks the other possible dropped reference.
Either that check needs to be removed or it needs to properly drop
the refcount on the fb before 'losing' the reference to it.

Comments

Daniel Vetter June 14, 2013, 2:23 p.m. UTC | #1
On Thu, Jun 13, 2013 at 3:03 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Jun 13, 2013 at 12:50:16PM +0100, Russell King - ARM Linux wrote:
>> On Thu, Jun 13, 2013 at 12:19:03PM +0100, Russell King - ARM Linux wrote:
>> > The deeper I look, the more bugs there seem to be in this DRM stuff,
>> > and I'm continuing to look because I'm chasing a framebuffer refcount
>> > bug.
>>
>> So, this refcount bug - I think I've just found it.  This is the flow of
>> references to the new fb on mode set:
>>
>> drm_mode_setcrtc():
>>                         fb = drm_framebuffer_lookup(dev, crtc_req->fb_id);
>>         set.fb = fb;
>>         ret = drm_mode_set_config_internal(&set);
>> drm_mode_set_config_internal():
>>         fb = set->fb;
>>         ret = crtc->funcs->set_config(set);
>> drm_crtc_helper_set_config():
>>                         old_fb = set->crtc->fb;
>>                         set->crtc->fb = set->fb;
>>                         if (!drm_crtc_helper_set_mode(set->crtc, set->mode,
>>                                                       set->x, set->y,
>>                                                       old_fb)) {
>>                 drm_helper_disable_unused_functions(dev);
>> drm_helper_disable_unused_functions():
>>         list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>>                 crtc->enabled = drm_helper_crtc_in_use(crtc);
>>                 if (!crtc->enabled) {
>>                         crtc->fb = NULL;
>>               }
>>       }
>> back to drm_mode_set_config_internal():
>>         if (ret == 0) {
>>                 if (fb)
>>                         drm_framebuffer_reference(fb);
>> back to drm_mode_setcrtc():
>>         if (fb)
>>                 drm_framebuffer_unreference(fb);
>>
>> Assuming success all the way through, what happens when a CRTC is unused
>> is:
>>
>> 1. We obtain a reference in drm_mode_setcrtc() via the lookup.
>> 2. We set the mode
>> 3. In trying to set the mode, we discover that all connectors for the CRTC
>>    are in the disconnected state, and so we disable the CRTC
>> 4. We set crtc->fb to NULL
>> 5. back in drm_mode_set_config_internal(), we take a reference on the
>>    framebuffer irrespective of this.
>> 6. back in drm_mode_setcrtc(), we drop the original reference caused by
>>    the lookup.
>>
>> We now have a framebuffer with a reference count incremented by one but
>> no actual reference to it - the CRTC's reference is completely lost by
>> the action of drm_helper_disable_unused_functions().
>>
>> You could argue that it's something the driver should deal with - fine,
>> but what if it only implements the DPMS method?  Should it drop a
>> reference to the framebuffer when DPMS instructs it to turn off?  Surely
>> not, because that means when DPMS turns stuff back on you're missing a
>> refcount.
>>
>> Are drivers required to implement a disable function and cater for the
>> imbalance in the upper layers of code?  If so, this is not a clean
>> design.
>
> There's a bigger issue here - if it's possible for drm_crtc_helper_set_config()
> to be called with set->fb set but set->mode NULL, then we overwrite
> set->fb to NULL.  Again, that results in a lost reference.
>
> For the time being, I'm using this patch, which solves my dropped
> refcount problem, and marks the other possible dropped reference.
> Either that check needs to be removed or it needs to properly drop
> the refcount on the fb before 'losing' the reference to it.

Scrap my other mail, I see now where the leaking happens. One of them
is interface abuse which is now fixed (and i915 has a bunch of BUG_ONs
to enforce them). The other one is indeed a real case that eluded me
when I've done the refcountification for drm_framebuffers. I'll hack
up some patches, since this seems to be a good excuse to port some of
the i915 modeset improvements back to the crtc helpers.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Russell King - ARM Linux June 14, 2013, 2:42 p.m. UTC | #2
On Fri, Jun 14, 2013 at 04:23:22PM +0200, Daniel Vetter wrote:
> On Thu, Jun 13, 2013 at 3:03 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > There's a bigger issue here - if it's possible for drm_crtc_helper_set_config()
> > to be called with set->fb set but set->mode NULL, then we overwrite
> > set->fb to NULL.  Again, that results in a lost reference.
> >
> > For the time being, I'm using this patch, which solves my dropped
> > refcount problem, and marks the other possible dropped reference.
> > Either that check needs to be removed or it needs to properly drop
> > the refcount on the fb before 'losing' the reference to it.
> 
> Scrap my other mail, I see now where the leaking happens. One of them
> is interface abuse which is now fixed (and i915 has a bunch of BUG_ONs
> to enforce them). The other one is indeed a real case that eluded me
> when I've done the refcountification for drm_framebuffers. I'll hack
> up some patches, since this seems to be a good excuse to port some of
> the i915 modeset improvements back to the crtc helpers.

If you're happy with the patch I supplied, that's probably the minimal fix
which should go to stable kernels (I'm using 3.9 here) - this also counts
as a "user visible bug".  It's something I've tripped over which causes
exhausts memory and can prevent the X server from starting up.

If you want me to package the patch up with a commit message and sign-off...
Daniel Vetter June 14, 2013, 7:50 p.m. UTC | #3
On Fri, Jun 14, 2013 at 4:42 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Jun 14, 2013 at 04:23:22PM +0200, Daniel Vetter wrote:
>> On Thu, Jun 13, 2013 at 3:03 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > There's a bigger issue here - if it's possible for drm_crtc_helper_set_config()
>> > to be called with set->fb set but set->mode NULL, then we overwrite
>> > set->fb to NULL.  Again, that results in a lost reference.
>> >
>> > For the time being, I'm using this patch, which solves my dropped
>> > refcount problem, and marks the other possible dropped reference.
>> > Either that check needs to be removed or it needs to properly drop
>> > the refcount on the fb before 'losing' the reference to it.
>>
>> Scrap my other mail, I see now where the leaking happens. One of them
>> is interface abuse which is now fixed (and i915 has a bunch of BUG_ONs
>> to enforce them). The other one is indeed a real case that eluded me
>> when I've done the refcountification for drm_framebuffers. I'll hack
>> up some patches, since this seems to be a good excuse to port some of
>> the i915 modeset improvements back to the crtc helpers.
>
> If you're happy with the patch I supplied, that's probably the minimal fix
> which should go to stable kernels (I'm using 3.9 here) - this also counts
> as a "user visible bug".  It's something I've tripped over which causes
> exhausts memory and can prevent the X server from starting up.
>
> If you want me to package the patch up with a commit message and sign-off..

Your patch doesn't fix drm/i915 (since we don't use the crtc helpers
any more). And I don't think it's good to have the refcounting
partially in the drm core and partially in drivers. I've also thrown a
few more things on top just to port a few of the i915 cleanups to the
crtc helper. I'll submit my patches asap (need to test them a bit more
first).
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Russell King - ARM Linux June 14, 2013, 10:15 p.m. UTC | #4
On Fri, Jun 14, 2013 at 09:50:22PM +0200, Daniel Vetter wrote:
> On Fri, Jun 14, 2013 at 4:42 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > If you're happy with the patch I supplied, that's probably the minimal fix
> > which should go to stable kernels (I'm using 3.9 here) - this also counts
> > as a "user visible bug".  It's something I've tripped over which causes
> > exhausts memory and can prevent the X server from starting up.
> >
> > If you want me to package the patch up with a commit message and sign-off..
> 
> Your patch doesn't fix drm/i915 (since we don't use the crtc helpers
> any more). And I don't think it's good to have the refcounting
> partially in the drm core and partially in drivers.

Let me check what you mean by that.  I hope you mean that the drm core,
drm helpers and drivers are individually responsible for dropping any
refcount that they obtain on any object.

In other words, if the drm core takes a refcount on object X, then the
DRM core must be the code base which drops that refcount.
Daniel Vetter June 14, 2013, 10:36 p.m. UTC | #5
On Sat, Jun 15, 2013 at 12:15 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Jun 14, 2013 at 09:50:22PM +0200, Daniel Vetter wrote:
>> On Fri, Jun 14, 2013 at 4:42 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > If you're happy with the patch I supplied, that's probably the minimal fix
>> > which should go to stable kernels (I'm using 3.9 here) - this also counts
>> > as a "user visible bug".  It's something I've tripped over which causes
>> > exhausts memory and can prevent the X server from starting up.
>> >
>> > If you want me to package the patch up with a commit message and sign-off..
>>
>> Your patch doesn't fix drm/i915 (since we don't use the crtc helpers
>> any more). And I don't think it's good to have the refcounting
>> partially in the drm core and partially in drivers.
>
> Let me check what you mean by that.  I hope you mean that the drm core,
> drm helpers and drivers are individually responsible for dropping any
> refcount that they obtain on any object.
>
> In other words, if the drm core takes a refcount on object X, then the
> DRM core must be the code base which drops that refcount.

Yeah, that's the idea. But since with the currently quirky drm core
->set_config callback the drivers need to drop the reference if any
other crtc loses all its connectors. My patch also drops the ref for
that fb in the drm core functions. The reason for that is that we want
to switch to a global ->set_config function anyway (usually called
atomic modeset) to push the configuration for all the crtcs into the
hw in one step (or fail it completely ofc). The current approach leads
to too much flickering and some funny hacks in drivers ...

Of course such gradual transitions over mutliple kernel release (we're
shuffling backend code in drm/i915 since about 3.7 for this) leaves a
few ugly intermediat states. But I prefer to move into the overall
right direction with a bit of ugliness than doing such big transitions
in one big swoop.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
diff mbox

Patch

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index dd64a06..774d7a6 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -2014,13 +2014,16 @@  int drm_mode_set_config_internal(struct drm_mode_set *set)
 
 	old_fb = crtc->fb;
 	fb = set->fb;
+	if (fb)
+		drm_framebuffer_reference(fb);
 
 	ret = crtc->funcs->set_config(set);
 	if (ret == 0) {
 		if (old_fb)
 			drm_framebuffer_unreference(old_fb);
+	} else {
 		if (fb)
-			drm_framebuffer_reference(fb);
+			drm_framebuffer_unreference(fb);
 	}
 
 	return ret;
diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 7b2d378..0d18fb2 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -299,6 +299,8 @@  void drm_helper_disable_unused_functions(struct drm_device *dev)
 				(*crtc_funcs->disable)(crtc);
 			else
 				(*crtc_funcs->dpms)(crtc, DRM_MODE_DPMS_OFF);
+			if (crtc->fb)
+				drm_framebuffer_unreference(crtc->fb);
 			crtc->fb = NULL;
 		}
 	}
@@ -573,6 +575,7 @@  int drm_crtc_helper_set_config(struct drm_mode_set *set)
 
 	crtc_funcs = set->crtc->helper_private;
 
+	/* FIXME: this loses a refcount on the fb */
 	if (!set->mode)
 		set->fb = NULL;