diff mbox

drm/crtc: fix connector reference counting mismatch in drm_crtc_helper_set_config

Message ID 1464883332-28371-1-git-send-email-p.zabel@pengutronix.de (mailing list archive)
State New, archived
Headers show

Commit Message

Philipp Zabel June 2, 2016, 4:02 p.m. UTC
Since commit 0955c1250e96 ("drm/crtc: take references to connectors used
in a modeset. (v2)"), the reference counts of all connectors in the
drm_mode_set given to drm_crtc_helper_set_config are incremented, and then
the reference counts of all connectors are decremented on success, but in a
temporary copy of the connector structure. This leads to the following
error after the first modeset on imx-drm:

    Unable to handle kernel NULL pointer dereference at virtual address 00000004
    pgd = ad8c4000
    [00000004] *pgd=3d9c5831, *pte=00000000, *ppte=00000000
    Internal error: Oops: 817 [#1] PREEMPT SMP ARM
    Modules linked in:
    CPU: 1 PID: 190 Comm: kmsfb-manage Not tainted 4.7.0-rc1+ #657
    Hardware name: Freescale i.MX6 Quad/DualLit: [<80506098>]    lr : [<80252e94>]    psr: 200c0013
    sp : adca7ca8  ip : adca7b90  fp : adca7cd4
    r10: 00000000  r9 : 00000100  r8 : 00000200
    r7 : af3c9800  r6 : aded7848  r5 : aded7800  r4 : 00000000
    r3 : af3ca058  r2 : 00000200  r1 : af3ca058  r0 : 00000000
    Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
    Control: 10c5387d  Table: 3d8c404a  DAC: 00000051
    Process kmsfb-manage (pid: 190, stack limit = 0xadca6210)
    Stack: (0xadca7ca8 to 0xadca8000)
    7ca0:                   805190e0 aded7800 aded7820 80501a88 8155a290 af3c9c6c
    7cc0: adca7ddc 0000000f adca7cec adca7cd8 80519104 80506044 805190e0 aded7800
    7ce0: adca7d04 adca7cf0 80501ac0 805190ec aded7820 aded7814 adca7d24 adca7d08
    7d00: 804fdb80 80501a94 aded7800 af3ca010 aded7afc af3c9c60 adca7d94 adca7d28
    7d20: 804e3518 804fdb20 00000000 af3c9b1c adca7d50 81506f44 00000000 8093c500
    7d40: af3c9c6c ae4f2ca8 ae4f2c18 00000000 00000000 ae637f00 00000000 aded7800
    7d60: 00000001 af3c9800 af23c300 ae77fcc0 ae4f2c18 00000001 af3c9800 8155a290
    7d80: af1af700 adca6000 adca7db4 adca7d98 804fea6c 804e2de4 adca7e50 adb3d940
    7da0: 00000001 af3c9800 adca7e24 adca7db8 8050440c 804fea0c ae77fcc0 00000003
    7dc0: adca7e24 adb3d940 af1af700 ae77fcc0 ae77fccc ae4f2c18 8083d44c ae77fcc0
    7de0: ae4002 80d03040 adca7e64 adca7e40 adca7e50 80503f08
    7e40: 7ebd5630 adca7e50 00000068 c06864a2 7ebd5be8 00000000 00000001 00000018
    7e60: 00000026 00000000 00000000 00000000 00000001 000115bc 05010500 05a0059f
    7e80: 03200000 03360321 00000337 0000003c 00000000 00000040 30383231 30303878
    7ea0: 00000000 00000000 00000000 00000000 00000000 00000000 80173058 80172e30
    7ec0: 80d77d32 00004000 adf7d900 00000003 00000000 7ebd5630 af342bb0 adfe3b80
    7ee0: 80272f50 00000003 adca6000 00000000 adca7f7c adca7f00 802725ec 804f52cc
    7f00: 802809cc 80178450 00000000 00000000 80280880 80145904 adb3d8c0 adf7d990
    7f20: ffffffff 00000003 00004000 01614c10 c06864a2 00000003 adca6000 00000000
    7f40: adca7f6c adca7f50 80280b04 8028088c 000115bc adfe3b81 7ebd5630 adfe3b80
    7f60: c06864a2 00000003 adca6000 00000000 adca7fa4 adca7f80 80272f50 80272548
    7f80: 000115bc 00017050 00000001 01614c10 00000036 801089e4 00000000 adca7fa8
    7fa0: 80108840 80272f18 00017050 00000001 00000003 c06864a2 7ebd5630 000115bc
    7fc0: 00017050 00000001 01614c10 00000036 00000003 00000000 00000026 00000018
    7fe0: 00016f38 7ebd562c 0000b5e9 76ef31e6 400c0030 00000003 ff5f37db bfe7dd4d
    Backtrace:
    [<80506038>] (drm_connector_cleanup) from [<80519104>] (dw_hdmi_connector_destroy+0x24/0x28)
     r10:0000000f r9:adca7ddc r8:af3c9c6c r7:8155a290 r6:80501a88 r5:aded7820
     r4:aded7800 r3:805190e0
    [<805190e0>] (dw_hdmi_connector_destroy) from [<80501ac0>] (drm_connector_free+0x38/0x3c)
     r4:aded7800 nreference) from [<804e3518>] (drm_crtc_helper_set_config+0x740/0xbf4)
     r6:af3c9c60 r5:aded7afc r4:af3ca010 r3:aded7800
    [<804e2dd8>] (drm_crtc_helper_set_config) from [<804fea6c>] (drm_mode_set_config_internal+0x6c/0xf4)
     r10:adca6000 r9:af1af700 r8:8155a290 r7:af3c9800 r6:00000001 r5:ae4f2c18
     r4:ae77fcc0
    [<804fea00>] (drm_mode_set_config_internal) from [<8050440c>] (drm_mode_setcrtc+0x504/0x57c)
     r7:af3c9800 r6:00000001 r5:adb3d940 r4:adca7e50
    [<80503f08>] (drm_mode_setcrtc) from [<804f5404>] (drm_ioctl+0x144/0x4dc)
     r10:ada2e000 r9:000000a2 r8:af3c9800 r7:8155a290 r6:809320b4 r5:00000051
     r4:adca7e50
    [<804f52c0>] (drm_ioctl) from [<802725ec>] (do_vfs_ioctl+0xb0/0x9d0)
     r10:00000000 r9:adca6000 r8:00000003 r7:80272f50 r6:adfe3b80 r5:af342bb0
     r4:7ebd5630
    [<8027253c>] (do_vfs_ioctl) from [<80272f50>] (SyS_ioctl+0x44/0x6c)
     r10:00000000 r9:adca6000 r8:00000003 r7:c06864a2 r6:adfe3b80 r5:7ebd5630
     r4:adfe3b81
    [<80272f0c>] (SyS_ioctl) from [<80108840>] (ret_fast_syscall+0x0/0x1c)
     r8:801089e4 r7:00000036 r6:01614c10 r5:00000001 r4:00017050 r3:000115bc
    Code: 0a00000c e5932004 e1a01003 e1a0a004 (e5842004)
    ---[ end trace 9a7257572ccacb16 ]---

Only the reference count of connectors that weren't previously bound to
an encoder should be incremented after a call to drm_crtc_helper_set_config.
And only the reference count of connectors that were previously bound to
an encoder and are unbound afterwards should ever be decremented.
The reference counts of the temporary copies in the save_connectors
should not be touched at all.

This patch fixes the above error by only incrementing the reference count
of those connectors in the set that are initially not bound to any encoder,
and also by restoring the reference count of only those connectors in the
set in the failure case.

Fixes: 0955c1250e96 ("drm/crtc: take references to connectors used in a modeset. (v2)")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
---
 drivers/gpu/drm/drm_crtc_helper.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

Comments

Daniel Vetter June 2, 2016, 4:21 p.m. UTC | #1
On Thu, Jun 02, 2016 at 06:02:12PM +0200, Philipp Zabel wrote:
> Since commit 0955c1250e96 ("drm/crtc: take references to connectors used
> in a modeset. (v2)"), the reference counts of all connectors in the
> drm_mode_set given to drm_crtc_helper_set_config are incremented, and then
> the reference counts of all connectors are decremented on success, but in a
> temporary copy of the connector structure. This leads to the following
> error after the first modeset on imx-drm:
> 
>     Unable to handle kernel NULL pointer dereference at virtual address 00000004
>     pgd = ad8c4000
>     [00000004] *pgd=3d9c5831, *pte=00000000, *ppte=00000000
>     Internal error: Oops: 817 [#1] PREEMPT SMP ARM
>     Modules linked in:
>     CPU: 1 PID: 190 Comm: kmsfb-manage Not tainted 4.7.0-rc1+ #657
>     Hardware name: Freescale i.MX6 Quad/DualLit: [<80506098>]    lr : [<80252e94>]    psr: 200c0013
>     sp : adca7ca8  ip : adca7b90  fp : adca7cd4
>     r10: 00000000  r9 : 00000100  r8 : 00000200
>     r7 : af3c9800  r6 : aded7848  r5 : aded7800  r4 : 00000000
>     r3 : af3ca058  r2 : 00000200  r1 : af3ca058  r0 : 00000000
>     Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>     Control: 10c5387d  Table: 3d8c404a  DAC: 00000051
>     Process kmsfb-manage (pid: 190, stack limit = 0xadca6210)
>     Stack: (0xadca7ca8 to 0xadca8000)
>     7ca0:                   805190e0 aded7800 aded7820 80501a88 8155a290 af3c9c6c
>     7cc0: adca7ddc 0000000f adca7cec adca7cd8 80519104 80506044 805190e0 aded7800
>     7ce0: adca7d04 adca7cf0 80501ac0 805190ec aded7820 aded7814 adca7d24 adca7d08
>     7d00: 804fdb80 80501a94 aded7800 af3ca010 aded7afc af3c9c60 adca7d94 adca7d28
>     7d20: 804e3518 804fdb20 00000000 af3c9b1c adca7d50 81506f44 00000000 8093c500
>     7d40: af3c9c6c ae4f2ca8 ae4f2c18 00000000 00000000 ae637f00 00000000 aded7800
>     7d60: 00000001 af3c9800 af23c300 ae77fcc0 ae4f2c18 00000001 af3c9800 8155a290
>     7d80: af1af700 adca6000 adca7db4 adca7d98 804fea6c 804e2de4 adca7e50 adb3d940
>     7da0: 00000001 af3c9800 adca7e24 adca7db8 8050440c 804fea0c ae77fcc0 00000003
>     7dc0: adca7e24 adb3d940 af1af700 ae77fcc0 ae77fccc ae4f2c18 8083d44c ae77fcc0
>     7de0: ae4002 80d03040 adca7e64 adca7e40 adca7e50 80503f08
>     7e40: 7ebd5630 adca7e50 00000068 c06864a2 7ebd5be8 00000000 00000001 00000018
>     7e60: 00000026 00000000 00000000 00000000 00000001 000115bc 05010500 05a0059f
>     7e80: 03200000 03360321 00000337 0000003c 00000000 00000040 30383231 30303878
>     7ea0: 00000000 00000000 00000000 00000000 00000000 00000000 80173058 80172e30
>     7ec0: 80d77d32 00004000 adf7d900 00000003 00000000 7ebd5630 af342bb0 adfe3b80
>     7ee0: 80272f50 00000003 adca6000 00000000 adca7f7c adca7f00 802725ec 804f52cc
>     7f00: 802809cc 80178450 00000000 00000000 80280880 80145904 adb3d8c0 adf7d990
>     7f20: ffffffff 00000003 00004000 01614c10 c06864a2 00000003 adca6000 00000000
>     7f40: adca7f6c adca7f50 80280b04 8028088c 000115bc adfe3b81 7ebd5630 adfe3b80
>     7f60: c06864a2 00000003 adca6000 00000000 adca7fa4 adca7f80 80272f50 80272548
>     7f80: 000115bc 00017050 00000001 01614c10 00000036 801089e4 00000000 adca7fa8
>     7fa0: 80108840 80272f18 00017050 00000001 00000003 c06864a2 7ebd5630 000115bc
>     7fc0: 00017050 00000001 01614c10 00000036 00000003 00000000 00000026 00000018
>     7fe0: 00016f38 7ebd562c 0000b5e9 76ef31e6 400c0030 00000003 ff5f37db bfe7dd4d
>     Backtrace:
>     [<80506038>] (drm_connector_cleanup) from [<80519104>] (dw_hdmi_connector_destroy+0x24/0x28)
>      r10:0000000f r9:adca7ddc r8:af3c9c6c r7:8155a290 r6:80501a88 r5:aded7820
>      r4:aded7800 r3:805190e0
>     [<805190e0>] (dw_hdmi_connector_destroy) from [<80501ac0>] (drm_connector_free+0x38/0x3c)
>      r4:aded7800 nreference) from [<804e3518>] (drm_crtc_helper_set_config+0x740/0xbf4)
>      r6:af3c9c60 r5:aded7afc r4:af3ca010 r3:aded7800
>     [<804e2dd8>] (drm_crtc_helper_set_config) from [<804fea6c>] (drm_mode_set_config_internal+0x6c/0xf4)
>      r10:adca6000 r9:af1af700 r8:8155a290 r7:af3c9800 r6:00000001 r5:ae4f2c18
>      r4:ae77fcc0
>     [<804fea00>] (drm_mode_set_config_internal) from [<8050440c>] (drm_mode_setcrtc+0x504/0x57c)
>      r7:af3c9800 r6:00000001 r5:adb3d940 r4:adca7e50
>     [<80503f08>] (drm_mode_setcrtc) from [<804f5404>] (drm_ioctl+0x144/0x4dc)
>      r10:ada2e000 r9:000000a2 r8:af3c9800 r7:8155a290 r6:809320b4 r5:00000051
>      r4:adca7e50
>     [<804f52c0>] (drm_ioctl) from [<802725ec>] (do_vfs_ioctl+0xb0/0x9d0)
>      r10:00000000 r9:adca6000 r8:00000003 r7:80272f50 r6:adfe3b80 r5:af342bb0
>      r4:7ebd5630
>     [<8027253c>] (do_vfs_ioctl) from [<80272f50>] (SyS_ioctl+0x44/0x6c)
>      r10:00000000 r9:adca6000 r8:00000003 r7:c06864a2 r6:adfe3b80 r5:7ebd5630
>      r4:adfe3b81
>     [<80272f0c>] (SyS_ioctl) from [<80108840>] (ret_fast_syscall+0x0/0x1c)
>      r8:801089e4 r7:00000036 r6:01614c10 r5:00000001 r4:00017050 r3:000115bc
>     Code: 0a00000c e5932004 e1a01003 e1a0a004 (e5842004)
>     ---[ end trace 9a7257572ccacb16 ]---
> 
> Only the reference count of connectors that weren't previously bound to
> an encoder should be incremented after a call to drm_crtc_helper_set_config.
> And only the reference count of connectors that were previously bound to
> an encoder and are unbound afterwards should ever be decremented.
> The reference counts of the temporary copies in the save_connectors
> should not be touched at all.
> 
> This patch fixes the above error by only incrementing the reference count
> of those connectors in the set that are initially not bound to any encoder,
> and also by restoring the reference count of only those connectors in the
> set in the failure case.
> 
> Fixes: 0955c1250e96 ("drm/crtc: take references to connectors used in a modeset. (v2)")
> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

I think this is worse. We can't save/restore the connector when there's a
kref inside of it. But there's really only 2 things we need to save
restore:
- connector->encoder pointer, for each connector.
- encoder->crtc pointer, for each encoder.

For a proper fix I think we need to rectify that first, then apply your
bugfix on top. The refcount logic itself seems sound, but I've screwed
that up way too often already.

Also I'm somewhat confused about how you manage to blow things up. The
refcount of each connector we're seeing should be 1 already: Once from the
connector list, and a 2nd one from the lookup in the setcrtc ioctl code.

Which means when we drop that 1 reference in the saved connector (which is
entirely bogus code, I agree) it should only drop to 1, not 0. And youre
cleanup code should not be called.

The other thing is that radeon/nouveau also uses this code, and no one
complained yet. Together I think that's some good evidence that suggests
there's also something strange going on in imx itself. At least under
normal circumstances it should be impossible that the offending
unreference function drops the refcount to 0.

Cheers, Daniel

> ---
>  drivers/gpu/drm/drm_crtc_helper.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
> index a6e4243..1c4d674 100644
> --- a/drivers/gpu/drm/drm_crtc_helper.c
> +++ b/drivers/gpu/drm/drm_crtc_helper.c
> @@ -631,8 +631,12 @@ int drm_crtc_helper_set_config(struct drm_mode_set *set)
>  		mode_changed = true;
>  	}
>  
> -	/* take a reference on all connectors in set */
> +	/* take a reference on all unbound connectors in set, reuse the
> +	 * already taken reference for bound connectors
> +	 */
>  	for (ro = 0; ro < set->num_connectors; ro++) {
> +		if (set->connectors[ro]->encoder)
> +			continue;
>  		drm_connector_reference(set->connectors[ro]);
>  	}
>  
> @@ -754,12 +758,6 @@ int drm_crtc_helper_set_config(struct drm_mode_set *set)
>  		}
>  	}
>  
> -	/* after fail drop reference on all connectors in save set */
> -	count = 0;
> -	drm_for_each_connector(connector, dev) {
> -		drm_connector_unreference(&save_connectors[count++]);
> -	}
> -
>  	kfree(save_connectors);
>  	kfree(save_encoders);
>  	return 0;
> @@ -776,8 +774,12 @@ fail:
>  		*connector = save_connectors[count++];
>  	}
>  
> -	/* after fail drop reference on all connectors in set */
> +	/* after fail drop reference on all unbound connectors in set, let
> +	 * bound connectors keep their reference
> +	 */
>  	for (ro = 0; ro < set->num_connectors; ro++) {
> +		if (set->connectors[ro]->encoder)
> +			continue;
>  		drm_connector_unreference(set->connectors[ro]);
>  	}
>  
> -- 
> 2.8.1
>
Philipp Zabel June 2, 2016, 5:28 p.m. UTC | #2
Hi Daniel,

Am Donnerstag, den 02.06.2016, 18:21 +0200 schrieb Daniel Vetter:
[...]
> > Only the reference count of connectors that weren't previously bound to
> > an encoder should be incremented after a call to drm_crtc_helper_set_config.
> > And only the reference count of connectors that were previously bound to
> > an encoder and are unbound afterwards should ever be decremented.
> > The reference counts of the temporary copies in the save_connectors
> > should not be touched at all.
> > 
> > This patch fixes the above error by only incrementing the reference count
> > of those connectors in the set that are initially not bound to any encoder,
> > and also by restoring the reference count of only those connectors in the
> > set in the failure case.
> > 
> > Fixes: 0955c1250e96 ("drm/crtc: take references to connectors used in a modeset. (v2)")
> > Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
> 
> I think this is worse. We can't save/restore the connector when there's a
> kref inside of it. But there's really only 2 things we need to save
> restore:
> - connector->encoder pointer, for each connector.
> - encoder->crtc pointer, for each encoder.
> 
> For a proper fix I think we need to rectify that first, then apply your
> bugfix on top. The refcount logic itself seems sound, but I've screwed
> that up way too often already.

Ok. Do we have to do it first, though? If we first get rid of the
    drm_connector_unreference(&save_connectors[count++]);
part first, we can don't have to fix it up when the connector array goes
away.

> Also I'm somewhat confused about how you manage to blow things up. The
> refcount of each connector we're seeing should be 1 already: Once from the
> connector list, and a 2nd one from the lookup in the setcrtc ioctl code.

All I saw is that inside drm_crtc_helper_set_config all connectors in
the set have their refcount incremented and then *all* connectors have
their refcount decremented. Isn't the lookup refcount increment only
done for connectors in the set?
I see it look up connector 36, then reference it once more as part of
the set. At the end it tries to unreference connector 34 from
save_connectors, followed by the error:

    [drm:drm_ioctl] pid=205, dev=0xe201, auth=1, DRM_IOCTL_MODE_SETCRTC
    [drm:drm_mode_setcrtc] [CRTC:24:crtc-0]
    [drm:drm_mode_setcrtc] [CONNECTOR:36:LVDS-1]
    [drm:drm_crtc_helper_set_config] 
    [drm:drm_crtc_helper_set_config] [CRTC:24:c 13.982686] [drm:drm_mode_debug_printmodeline] Modeline 0:"" 0 0 0 0 0 0 0 0 0 0 0x0 0x0
    [drm:drm_mode_debug_printmodeline] Modeline 41:"1280x800" 60 71100 1280 1281 1439 1440 800 801 822 823 0x40 0x0
    [drm:drm_mode_object_reference] OBJ ID: 36 (2)
    [drm:drm_crtc_helper_set_config] encoder changed, full mode switch
    [drm:drm_crtc_helper_set_config] crtc changed, full mode switch
    [drm:drm_crtc_helper_set_config] [CONNECTOR:36:LVDS-1] to [CRtc-0]
    [drm:drm_crtc_helper_set_mode] [ENCODER:35:LVDS-35] set [MODE:41:1280x800]
    [drm:drm_calc_timestamping_constants] crtc 24: hwmode: htotal 1440, vtotal 823, vdisplay 800
    [drm:drm_calc_timestamping_constants] crtc 24: clock 71100 kHz framedur 16668354 linedur 20253
    [drm:drm_crtc_helper_set_config] Setting connector DPMS state to on
    [drm:drm_crtc_helper_set_config] 	[CONNECTOR:36:LVDS-1] set DPMS on
    [drm:drm_mode_object_unreference] OBJ ID: 34 (1)

> Which means when we drop that 1 reference in the saved connector (which is
> entirely bogus code, I agree) it should only drop to 1, not 0. And youre
> cleanup code should not be called.
>
> The other thing is that radeon/nouveau also uses this code, and no one
> complained yet. Together I think that's some good evidence that suggests
> there's also something strange going on in imx itself.

That's what I wonder, too. Right now I can't see it though.

regards
Philipp
Daniel Vetter June 2, 2016, 8:25 p.m. UTC | #3
On Thu, Jun 02, 2016 at 07:28:30PM +0200, Philipp Zabel wrote:
> Hi Daniel,
> 
> Am Donnerstag, den 02.06.2016, 18:21 +0200 schrieb Daniel Vetter:
> [...]
> > > Only the reference count of connectors that weren't previously bound to
> > > an encoder should be incremented after a call to drm_crtc_helper_set_config.
> > > And only the reference count of connectors that were previously bound to
> > > an encoder and are unbound afterwards should ever be decremented.
> > > The reference counts of the temporary copies in the save_connectors
> > > should not be touched at all.
> > > 
> > > This patch fixes the above error by only incrementing the reference count
> > > of those connectors in the set that are initially not bound to any encoder,
> > > and also by restoring the reference count of only those connectors in the
> > > set in the failure case.
> > > 
> > > Fixes: 0955c1250e96 ("drm/crtc: take references to connectors used in a modeset. (v2)")
> > > Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
> > 
> > I think this is worse. We can't save/restore the connector when there's a
> > kref inside of it. But there's really only 2 things we need to save
> > restore:
> > - connector->encoder pointer, for each connector.
> > - encoder->crtc pointer, for each encoder.
> > 
> > For a proper fix I think we need to rectify that first, then apply your
> > bugfix on top. The refcount logic itself seems sound, but I've screwed
> > that up way too often already.
> 
> Ok. Do we have to do it first, though? If we first get rid of the
>     drm_connector_unreference(&save_connectors[count++]);
> part first, we can don't have to fix it up when the connector array goes
> away.

It's just a bugfix for the failure case. Doing a memcpy over a live kref
is definitely not something we should ever do, and will break the
refcounting. We want both patches in 4.7.

> > Also I'm somewhat confused about how you manage to blow things up. The
> > refcount of each connector we're seeing should be 1 already: Once from the
> > connector list, and a 2nd one from the lookup in the setcrtc ioctl code.
> 
> All I saw is that inside drm_crtc_helper_set_config all connectors in
> the set have their refcount incremented and then *all* connectors have
> their refcount decremented. Isn't the lookup refcount increment only
> done for connectors in the set?
> I see it look up connector 36, then reference it once more as part of
> the set. At the end it tries to unreference connector 34 from
> save_connectors, followed by the error:
> 
>     [drm:drm_ioctl] pid=205, dev=0xe201, auth=1, DRM_IOCTL_MODE_SETCRTC
>     [drm:drm_mode_setcrtc] [CRTC:24:crtc-0]
>     [drm:drm_mode_setcrtc] [CONNECTOR:36:LVDS-1]
>     [drm:drm_crtc_helper_set_config] 
>     [drm:drm_crtc_helper_set_config] [CRTC:24:c 13.982686] [drm:drm_mode_debug_printmodeline] Modeline 0:"" 0 0 0 0 0 0 0 0 0 0 0x0 0x0
>     [drm:drm_mode_debug_printmodeline] Modeline 41:"1280x800" 60 71100 1280 1281 1439 1440 800 801 822 823 0x40 0x0
>     [drm:drm_mode_object_reference] OBJ ID: 36 (2)
>     [drm:drm_crtc_helper_set_config] encoder changed, full mode switch
>     [drm:drm_crtc_helper_set_config] crtc changed, full mode switch
>     [drm:drm_crtc_helper_set_config] [CONNECTOR:36:LVDS-1] to [CRtc-0]
>     [drm:drm_crtc_helper_set_mode] [ENCODER:35:LVDS-35] set [MODE:41:1280x800]
>     [drm:drm_calc_timestamping_constants] crtc 24: hwmode: htotal 1440, vtotal 823, vdisplay 800
>     [drm:drm_calc_timestamping_constants] crtc 24: clock 71100 kHz framedur 16668354 linedur 20253
>     [drm:drm_crtc_helper_set_config] Setting connector DPMS state to on
>     [drm:drm_crtc_helper_set_config] 	[CONNECTOR:36:LVDS-1] set DPMS on
>     [drm:drm_mode_object_unreference] OBJ ID: 34 (1)

The additional get/put I'm talking off happens in drm_mode_setcrtc (the
get is hidden in drm_connector_lookup). Might be interesting to trace the
refcount for each connector after each get and before each put.

> > Which means when we drop that 1 reference in the saved connector (which is
> > entirely bogus code, I agree) it should only drop to 1, not 0. And youre
> > cleanup code should not be called.
> >
> > The other thing is that radeon/nouveau also uses this code, and no one
> > complained yet. Together I think that's some good evidence that suggests
> > there's also something strange going on in imx itself.
> 
> That's what I wonder, too. Right now I can't see it though.

Yep, something funny is definitely going on here.
-Daniel
Philipp Zabel June 3, 2016, 8:27 a.m. UTC | #4
Am Donnerstag, den 02.06.2016, 22:25 +0200 schrieb Daniel Vetter:
[...]
> >     [drm:drm_ioctl] pid=205, dev=0xe201, auth=1, DRM_IOCTL_MODE_SETCRTC
> >     [drm:drm_mode_setcrtc] [CRTC:24:crtc-0]
> >     [drm:drm_mode_setcrtc] [CONNECTOR:36:LVDS-1]

^ This is printed after each drm_connector_lookup call. The crtc_req
only contains connector 36, and only that one is looked up and gets its
refcount incremented to 2. The unrelated, unbound connector 34 is not
part of the request, doesn't get added to the drm_mode_set, and stays at
reference count 1.

> >     [drm:drm_crtc_helper_set_config] 
> >     [drm:drm_crtc_helper_set_config] [CRTC:24:c 13.982686] [drm:drm_mode_debug_printmodeline] Modeline 0:"" 0 0 0 0 0 0 0 0 0 0 0x0 0x0
> >     [drm:drm_mode_debug_printmodeline] Modeline 41:"1280x800" 60 71100 1280 1281 1439 1440 800 801 822 823 0x40 0x0
> >     [drm:drm_mode_object_reference] OBJ ID: 36 (2)
> >     [drm:drm_crtc_helper_set_config] encoder changed, full mode switch
> >     [drm:drm_crtc_helper_set_config] crtc changed, full mode switch
> >     [drm:drm_crtc_helper_set_config] [CONNECTOR:36:LVDS-1] to [CRtc-0]
> >     [drm:drm_crtc_helper_set_mode] [ENCODER:35:LVDS-35] set [MODE:41:1280x800]
> >     [drm:drm_calc_timestamping_constants] crtc 24: hwmode: htotal 1440, vtotal 823, vdisplay 800
> >     [drm:drm_calc_timestamping_constants] crtc 24: clock 71100 kHz framedur 16668354 linedur 20253
> >     [drm:drm_crtc_helper_set_config] Setting connector DPMS state to on
> >     [drm:drm_crtc_helper_set_config] 	[CONNECTOR:36:LVDS-1] set DPMS on
> >     [drm:drm_mode_object_unreference] OBJ ID: 34 (1)

^ And here the temporary copy of connector 34 gets its reference count
decremented to 0.

> The additional get/put I'm talking off happens in drm_mode_setcrtc (the
> get is hidden in drm_connector_lookup). Might be interesting to trace the
> refcount for each connector after each get and before each put.

drm_connector_lookup is not called for connector 34 if userspace just
requests to enable a single output on connector 36.

regards
Philipp
Thierry Reding June 3, 2016, 4:41 p.m. UTC | #5
On Thu, Jun 02, 2016 at 10:25:46PM +0200, Daniel Vetter wrote:
> On Thu, Jun 02, 2016 at 07:28:30PM +0200, Philipp Zabel wrote:
> > Am Donnerstag, den 02.06.2016, 18:21 +0200 schrieb Daniel Vetter:
[...]
> > > Which means when we drop that 1 reference in the saved connector (which is
> > > entirely bogus code, I agree) it should only drop to 1, not 0. And youre
> > > cleanup code should not be called.
> > >
> > > The other thing is that radeon/nouveau also uses this code, and no one
> > > complained yet. Together I think that's some good evidence that suggests
> > > there's also something strange going on in imx itself.
> > 
> > That's what I wonder, too. Right now I can't see it though.
> 
> Yep, something funny is definitely going on here.

Is this perhaps something similar to what Jon fixed a couple of weeks
ago in the Tegra driver? See commit 280dc0e145b2 ("drm/tegra: Fix crash
caused by reference count imbalance").

The root cause of that was that Tegra's DSI driver is subclassing the
connector state and didn't make proper use of the helpers to share the
core state code, ultimately causing the connector reference count to
become unbalanced. dw-hdmi doesn't subclass state, so it seems unlikely
that it's the same root cause here. The kernel version string indicates
that this isn't a vanilla v4.7-rc1 but that you have local changes, so
perhaps there's some significant change there?

Thierry
Daniel Vetter June 3, 2016, 7:07 p.m. UTC | #6
On Fri, Jun 03, 2016 at 10:27:11AM +0200, Philipp Zabel wrote:
> Am Donnerstag, den 02.06.2016, 22:25 +0200 schrieb Daniel Vetter:
> [...]
> > >     [drm:drm_ioctl] pid=205, dev=0xe201, auth=1, DRM_IOCTL_MODE_SETCRTC
> > >     [drm:drm_mode_setcrtc] [CRTC:24:crtc-0]
> > >     [drm:drm_mode_setcrtc] [CONNECTOR:36:LVDS-1]
> 
> ^ This is printed after each drm_connector_lookup call. The crtc_req
> only contains connector 36, and only that one is looked up and gets its
> refcount incremented to 2. The unrelated, unbound connector 34 is not
> part of the request, doesn't get added to the drm_mode_set, and stays at
> reference count 1.
> 
> > >     [drm:drm_crtc_helper_set_config] 
> > >     [drm:drm_crtc_helper_set_config] [CRTC:24:c 13.982686] [drm:drm_mode_debug_printmodeline] Modeline 0:"" 0 0 0 0 0 0 0 0 0 0 0x0 0x0
> > >     [drm:drm_mode_debug_printmodeline] Modeline 41:"1280x800" 60 71100 1280 1281 1439 1440 800 801 822 823 0x40 0x0
> > >     [drm:drm_mode_object_reference] OBJ ID: 36 (2)
> > >     [drm:drm_crtc_helper_set_config] encoder changed, full mode switch
> > >     [drm:drm_crtc_helper_set_config] crtc changed, full mode switch
> > >     [drm:drm_crtc_helper_set_config] [CONNECTOR:36:LVDS-1] to [CRtc-0]
> > >     [drm:drm_crtc_helper_set_mode] [ENCODER:35:LVDS-35] set [MODE:41:1280x800]
> > >     [drm:drm_calc_timestamping_constants] crtc 24: hwmode: htotal 1440, vtotal 823, vdisplay 800
> > >     [drm:drm_calc_timestamping_constants] crtc 24: clock 71100 kHz framedur 16668354 linedur 20253
> > >     [drm:drm_crtc_helper_set_config] Setting connector DPMS state to on
> > >     [drm:drm_crtc_helper_set_config] 	[CONNECTOR:36:LVDS-1] set DPMS on
> > >     [drm:drm_mode_object_unreference] OBJ ID: 34 (1)
> 
> ^ And here the temporary copy of connector 34 gets its reference count
> decremented to 0.
> 
> > The additional get/put I'm talking off happens in drm_mode_setcrtc (the
> > get is hidden in drm_connector_lookup). Might be interesting to trace the
> > refcount for each connector after each get and before each put.
> 
> drm_connector_lookup is not called for connector 34 if userspace just
> requests to enable a single output on connector 36.

Hm indeed ...

But again we're back to why isn't this an immediate fireworks show on all
the other legacy kms drivers. I have no idea ...
-Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index a6e4243..1c4d674 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -631,8 +631,12 @@  int drm_crtc_helper_set_config(struct drm_mode_set *set)
 		mode_changed = true;
 	}
 
-	/* take a reference on all connectors in set */
+	/* take a reference on all unbound connectors in set, reuse the
+	 * already taken reference for bound connectors
+	 */
 	for (ro = 0; ro < set->num_connectors; ro++) {
+		if (set->connectors[ro]->encoder)
+			continue;
 		drm_connector_reference(set->connectors[ro]);
 	}
 
@@ -754,12 +758,6 @@  int drm_crtc_helper_set_config(struct drm_mode_set *set)
 		}
 	}
 
-	/* after fail drop reference on all connectors in save set */
-	count = 0;
-	drm_for_each_connector(connector, dev) {
-		drm_connector_unreference(&save_connectors[count++]);
-	}
-
 	kfree(save_connectors);
 	kfree(save_encoders);
 	return 0;
@@ -776,8 +774,12 @@  fail:
 		*connector = save_connectors[count++];
 	}
 
-	/* after fail drop reference on all connectors in set */
+	/* after fail drop reference on all unbound connectors in set, let
+	 * bound connectors keep their reference
+	 */
 	for (ro = 0; ro < set->num_connectors; ro++) {
+		if (set->connectors[ro]->encoder)
+			continue;
 		drm_connector_unreference(set->connectors[ro]);
 	}