Message ID | 20130613130339.GD18614@n2100.arm.linux.org.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Jun 13, 2013 at 3:03 PM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Thu, Jun 13, 2013 at 12:50:16PM +0100, Russell King - ARM Linux wrote: >> On Thu, Jun 13, 2013 at 12:19:03PM +0100, Russell King - ARM Linux wrote: >> > The deeper I look, the more bugs there seem to be in this DRM stuff, >> > and I'm continuing to look because I'm chasing a framebuffer refcount >> > bug. >> >> So, this refcount bug - I think I've just found it. This is the flow of >> references to the new fb on mode set: >> >> drm_mode_setcrtc(): >> fb = drm_framebuffer_lookup(dev, crtc_req->fb_id); >> set.fb = fb; >> ret = drm_mode_set_config_internal(&set); >> drm_mode_set_config_internal(): >> fb = set->fb; >> ret = crtc->funcs->set_config(set); >> drm_crtc_helper_set_config(): >> old_fb = set->crtc->fb; >> set->crtc->fb = set->fb; >> if (!drm_crtc_helper_set_mode(set->crtc, set->mode, >> set->x, set->y, >> old_fb)) { >> drm_helper_disable_unused_functions(dev); >> drm_helper_disable_unused_functions(): >> list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) { >> crtc->enabled = drm_helper_crtc_in_use(crtc); >> if (!crtc->enabled) { >> crtc->fb = NULL; >> } >> } >> back to drm_mode_set_config_internal(): >> if (ret == 0) { >> if (fb) >> drm_framebuffer_reference(fb); >> back to drm_mode_setcrtc(): >> if (fb) >> drm_framebuffer_unreference(fb); >> >> Assuming success all the way through, what happens when a CRTC is unused >> is: >> >> 1. We obtain a reference in drm_mode_setcrtc() via the lookup. >> 2. We set the mode >> 3. In trying to set the mode, we discover that all connectors for the CRTC >> are in the disconnected state, and so we disable the CRTC >> 4. We set crtc->fb to NULL >> 5. back in drm_mode_set_config_internal(), we take a reference on the >> framebuffer irrespective of this. >> 6. back in drm_mode_setcrtc(), we drop the original reference caused by >> the lookup. >> >> We now have a framebuffer with a reference count incremented by one but >> no actual reference to it - the CRTC's reference is completely lost by >> the action of drm_helper_disable_unused_functions(). >> >> You could argue that it's something the driver should deal with - fine, >> but what if it only implements the DPMS method? Should it drop a >> reference to the framebuffer when DPMS instructs it to turn off? Surely >> not, because that means when DPMS turns stuff back on you're missing a >> refcount. >> >> Are drivers required to implement a disable function and cater for the >> imbalance in the upper layers of code? If so, this is not a clean >> design. > > There's a bigger issue here - if it's possible for drm_crtc_helper_set_config() > to be called with set->fb set but set->mode NULL, then we overwrite > set->fb to NULL. Again, that results in a lost reference. > > For the time being, I'm using this patch, which solves my dropped > refcount problem, and marks the other possible dropped reference. > Either that check needs to be removed or it needs to properly drop > the refcount on the fb before 'losing' the reference to it. Scrap my other mail, I see now where the leaking happens. One of them is interface abuse which is now fixed (and i915 has a bunch of BUG_ONs to enforce them). The other one is indeed a real case that eluded me when I've done the refcountification for drm_framebuffers. I'll hack up some patches, since this seems to be a good excuse to port some of the i915 modeset improvements back to the crtc helpers. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Fri, Jun 14, 2013 at 04:23:22PM +0200, Daniel Vetter wrote: > On Thu, Jun 13, 2013 at 3:03 PM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > There's a bigger issue here - if it's possible for drm_crtc_helper_set_config() > > to be called with set->fb set but set->mode NULL, then we overwrite > > set->fb to NULL. Again, that results in a lost reference. > > > > For the time being, I'm using this patch, which solves my dropped > > refcount problem, and marks the other possible dropped reference. > > Either that check needs to be removed or it needs to properly drop > > the refcount on the fb before 'losing' the reference to it. > > Scrap my other mail, I see now where the leaking happens. One of them > is interface abuse which is now fixed (and i915 has a bunch of BUG_ONs > to enforce them). The other one is indeed a real case that eluded me > when I've done the refcountification for drm_framebuffers. I'll hack > up some patches, since this seems to be a good excuse to port some of > the i915 modeset improvements back to the crtc helpers. If you're happy with the patch I supplied, that's probably the minimal fix which should go to stable kernels (I'm using 3.9 here) - this also counts as a "user visible bug". It's something I've tripped over which causes exhausts memory and can prevent the X server from starting up. If you want me to package the patch up with a commit message and sign-off...
On Fri, Jun 14, 2013 at 4:42 PM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Fri, Jun 14, 2013 at 04:23:22PM +0200, Daniel Vetter wrote: >> On Thu, Jun 13, 2013 at 3:03 PM, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >> > There's a bigger issue here - if it's possible for drm_crtc_helper_set_config() >> > to be called with set->fb set but set->mode NULL, then we overwrite >> > set->fb to NULL. Again, that results in a lost reference. >> > >> > For the time being, I'm using this patch, which solves my dropped >> > refcount problem, and marks the other possible dropped reference. >> > Either that check needs to be removed or it needs to properly drop >> > the refcount on the fb before 'losing' the reference to it. >> >> Scrap my other mail, I see now where the leaking happens. One of them >> is interface abuse which is now fixed (and i915 has a bunch of BUG_ONs >> to enforce them). The other one is indeed a real case that eluded me >> when I've done the refcountification for drm_framebuffers. I'll hack >> up some patches, since this seems to be a good excuse to port some of >> the i915 modeset improvements back to the crtc helpers. > > If you're happy with the patch I supplied, that's probably the minimal fix > which should go to stable kernels (I'm using 3.9 here) - this also counts > as a "user visible bug". It's something I've tripped over which causes > exhausts memory and can prevent the X server from starting up. > > If you want me to package the patch up with a commit message and sign-off.. Your patch doesn't fix drm/i915 (since we don't use the crtc helpers any more). And I don't think it's good to have the refcounting partially in the drm core and partially in drivers. I've also thrown a few more things on top just to port a few of the i915 cleanups to the crtc helper. I'll submit my patches asap (need to test them a bit more first). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Fri, Jun 14, 2013 at 09:50:22PM +0200, Daniel Vetter wrote: > On Fri, Jun 14, 2013 at 4:42 PM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > If you're happy with the patch I supplied, that's probably the minimal fix > > which should go to stable kernels (I'm using 3.9 here) - this also counts > > as a "user visible bug". It's something I've tripped over which causes > > exhausts memory and can prevent the X server from starting up. > > > > If you want me to package the patch up with a commit message and sign-off.. > > Your patch doesn't fix drm/i915 (since we don't use the crtc helpers > any more). And I don't think it's good to have the refcounting > partially in the drm core and partially in drivers. Let me check what you mean by that. I hope you mean that the drm core, drm helpers and drivers are individually responsible for dropping any refcount that they obtain on any object. In other words, if the drm core takes a refcount on object X, then the DRM core must be the code base which drops that refcount.
On Sat, Jun 15, 2013 at 12:15 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Fri, Jun 14, 2013 at 09:50:22PM +0200, Daniel Vetter wrote: >> On Fri, Jun 14, 2013 at 4:42 PM, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >> > If you're happy with the patch I supplied, that's probably the minimal fix >> > which should go to stable kernels (I'm using 3.9 here) - this also counts >> > as a "user visible bug". It's something I've tripped over which causes >> > exhausts memory and can prevent the X server from starting up. >> > >> > If you want me to package the patch up with a commit message and sign-off.. >> >> Your patch doesn't fix drm/i915 (since we don't use the crtc helpers >> any more). And I don't think it's good to have the refcounting >> partially in the drm core and partially in drivers. > > Let me check what you mean by that. I hope you mean that the drm core, > drm helpers and drivers are individually responsible for dropping any > refcount that they obtain on any object. > > In other words, if the drm core takes a refcount on object X, then the > DRM core must be the code base which drops that refcount. Yeah, that's the idea. But since with the currently quirky drm core ->set_config callback the drivers need to drop the reference if any other crtc loses all its connectors. My patch also drops the ref for that fb in the drm core functions. The reason for that is that we want to switch to a global ->set_config function anyway (usually called atomic modeset) to push the configuration for all the crtcs into the hw in one step (or fail it completely ofc). The current approach leads to too much flickering and some funny hacks in drivers ... Of course such gradual transitions over mutliple kernel release (we're shuffling backend code in drm/i915 since about 3.7 for this) leaves a few ugly intermediat states. But I prefer to move into the overall right direction with a bit of ugliness than doing such big transitions in one big swoop. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index dd64a06..774d7a6 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -2014,13 +2014,16 @@ int drm_mode_set_config_internal(struct drm_mode_set *set) old_fb = crtc->fb; fb = set->fb; + if (fb) + drm_framebuffer_reference(fb); ret = crtc->funcs->set_config(set); if (ret == 0) { if (old_fb) drm_framebuffer_unreference(old_fb); + } else { if (fb) - drm_framebuffer_reference(fb); + drm_framebuffer_unreference(fb); } return ret; diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c index 7b2d378..0d18fb2 100644 --- a/drivers/gpu/drm/drm_crtc_helper.c +++ b/drivers/gpu/drm/drm_crtc_helper.c @@ -299,6 +299,8 @@ void drm_helper_disable_unused_functions(struct drm_device *dev) (*crtc_funcs->disable)(crtc); else (*crtc_funcs->dpms)(crtc, DRM_MODE_DPMS_OFF); + if (crtc->fb) + drm_framebuffer_unreference(crtc->fb); crtc->fb = NULL; } } @@ -573,6 +575,7 @@ int drm_crtc_helper_set_config(struct drm_mode_set *set) crtc_funcs = set->crtc->helper_private; + /* FIXME: this loses a refcount on the fb */ if (!set->mode) set->fb = NULL;