Message ID | 20190523212433.9058-1-imre.deak@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/mst: Fix MST sideband up-reply failure handling | expand |
Patch mostly looks good to me, one comment below On Fri, 2019-05-24 at 00:24 +0300, Imre Deak wrote: > Fix the breakage resulting in the stacktrace below, due to tx queue > being full when trying to send an up-reply. txmsg->seqno is -1 in this > case leading to a corruption of the mstb object by > > txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > in process_single_up_tx_qlock(). > > [ +0,005162] [drm:process_single_tx_qlock [drm_kms_helper]] > set_hdr_from_dst_qlock: failed to find slot > [ +0,000015] [drm:drm_dp_send_up_ack_reply.constprop.19 [drm_kms_helper]] > failed to send msg in q -11 > [ +0,000939] BUG: kernel NULL pointer dereference, address: > 00000000000005a0 > [ +0,006982] #PF: supervisor write access in kernel mode > [ +0,005223] #PF: error_code(0x0002) - not-present page > [ +0,005135] PGD 0 P4D 0 > [ +0,002581] Oops: 0002 [#1] PREEMPT SMP NOPTI > [ +0,004359] CPU: 1 PID: 1200 Comm: kworker/u16:3 Tainted: > G U 5.2.0-rc1+ #410 > [ +0,008433] Hardware name: Intel Corporation Ice Lake Client > Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3175.A00.1904261428 > 04/26/2019 > [ +0,013323] Workqueue: i915-dp i915_digport_work_func [i915] > [ +0,005676] RIP: 0010:queue_work_on+0x19/0x70 > [ +0,004372] Code: ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 41 56 > 49 89 f6 41 55 41 89 fd 41 54 55 53 48 89 d3 9c 5d fa e8 e7 81 0c 00 <f0> 48 > 0f ba 2b 00 73 31 45 31 e4 f7 c5 00 02 00 00 74 13 e8 cf 7f > [ +0,018750] RSP: 0018:ffffc900007dfc50 EFLAGS: 00010006 > [ +0,005222] RAX: 0000000000000046 RBX: 00000000000005a0 RCX: > 0000000000000001 > [ +0,007133] RDX: 000000000001b608 RSI: 0000000000000000 RDI: > ffffffff82121972 > [ +0,007129] RBP: 0000000000000202 R08: 0000000000000000 R09: > 0000000000000001 > [ +0,007129] R10: 0000000000000000 R11: 0000000000000000 R12: > ffff88847bfa5096 > [ +0,007131] R13: 0000000000000010 R14: ffff88849c08f3f8 R15: > 0000000000000000 > [ +0,007128] FS: 0000000000000000(0000) GS:ffff88849dc80000(0000) > knlGS:0000000000000000 > [ +0,008083] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ +0,005749] CR2: 00000000000005a0 CR3: 0000000005210006 CR4: > 0000000000760ee0 > [ +0,007128] PKRU: 55555554 > [ +0,002722] Call Trace: > [ +0,002458] drm_dp_mst_handle_up_req+0x517/0x540 [drm_kms_helper] > [ +0,006197] ? drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] > [ +0,005764] drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] > [ +0,005623] ? intel_dp_hpd_pulse+0x205/0x370 [i915] > [ +0,005018] intel_dp_hpd_pulse+0x205/0x370 [i915] > [ +0,004836] i915_digport_work_func+0xbb/0x140 [i915] > [ +0,005108] process_one_work+0x245/0x610 > [ +0,004027] worker_thread+0x37/0x380 > [ +0,003684] ? process_one_work+0x610/0x610 > [ +0,004184] kthread+0x119/0x130 > [ +0,003240] ? kthread_park+0x80/0x80 > [ +0,003668] ret_from_fork+0x24/0x50 > > Cc: Lyude Paul <lyude@redhat.com> > Cc: Dave Airlie <airlied@redhat.com> > Signed-off-by: Imre Deak <imre.deak@intel.com> > --- > drivers/gpu/drm/drm_dp_mst_topology.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c > b/drivers/gpu/drm/drm_dp_mst_topology.c > index da1abca1b9e9..24c325f4a616 100644 > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > @@ -1996,7 +1996,11 @@ static void process_single_up_tx_qlock(struct > drm_dp_mst_topology_mgr *mgr, > if (ret != 1) > DRM_DEBUG_KMS("failed to send msg in q %d\n", ret); > > - txmsg->dst->tx_slots[txmsg->seqno] = NULL; > + if (txmsg->seqno != -1) { > + WARN_ON((unsigned)txmsg->seqno > > + ARRAY_SIZE(txmsg->dst->tx_slots)); Not 100% sure on this nitpick myself but, if we know that txmsg->seqno is about to go out of bounds shouldn't we also try to take action to stop it? like if (!WARN_ON((unsigned)txmsg->seqno > ARRAY_SIZE(txmsg->dst->tx_slots))) txmsg->dst->tx_slots[txmsg->seqno] = NULL; > + txmsg->dst->tx_slots[txmsg->seqno] = NULL; > + } > } > > static void drm_dp_queue_down_tx(struct drm_dp_mst_topology_mgr *mgr,
On Thu, May 23, 2019 at 06:09:56PM -0400, Lyude Paul wrote: > Patch mostly looks good to me, one comment below > > On Fri, 2019-05-24 at 00:24 +0300, Imre Deak wrote: > > Fix the breakage resulting in the stacktrace below, due to tx queue > > being full when trying to send an up-reply. txmsg->seqno is -1 in this > > case leading to a corruption of the mstb object by > > > > txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > > in process_single_up_tx_qlock(). > > > > [ +0,005162] [drm:process_single_tx_qlock [drm_kms_helper]] > > set_hdr_from_dst_qlock: failed to find slot > > [ +0,000015] [drm:drm_dp_send_up_ack_reply.constprop.19 [drm_kms_helper]] > > failed to send msg in q -11 > > [ +0,000939] BUG: kernel NULL pointer dereference, address: > > 00000000000005a0 > > [ +0,006982] #PF: supervisor write access in kernel mode > > [ +0,005223] #PF: error_code(0x0002) - not-present page > > [ +0,005135] PGD 0 P4D 0 > > [ +0,002581] Oops: 0002 [#1] PREEMPT SMP NOPTI > > [ +0,004359] CPU: 1 PID: 1200 Comm: kworker/u16:3 Tainted: > > G U 5.2.0-rc1+ #410 > > [ +0,008433] Hardware name: Intel Corporation Ice Lake Client > > Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3175.A00.1904261428 > > 04/26/2019 > > [ +0,013323] Workqueue: i915-dp i915_digport_work_func [i915] > > [ +0,005676] RIP: 0010:queue_work_on+0x19/0x70 > > [ +0,004372] Code: ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 41 56 > > 49 89 f6 41 55 41 89 fd 41 54 55 53 48 89 d3 9c 5d fa e8 e7 81 0c 00 <f0> 48 > > 0f ba 2b 00 73 31 45 31 e4 f7 c5 00 02 00 00 74 13 e8 cf 7f > > [ +0,018750] RSP: 0018:ffffc900007dfc50 EFLAGS: 00010006 > > [ +0,005222] RAX: 0000000000000046 RBX: 00000000000005a0 RCX: > > 0000000000000001 > > [ +0,007133] RDX: 000000000001b608 RSI: 0000000000000000 RDI: > > ffffffff82121972 > > [ +0,007129] RBP: 0000000000000202 R08: 0000000000000000 R09: > > 0000000000000001 > > [ +0,007129] R10: 0000000000000000 R11: 0000000000000000 R12: > > ffff88847bfa5096 > > [ +0,007131] R13: 0000000000000010 R14: ffff88849c08f3f8 R15: > > 0000000000000000 > > [ +0,007128] FS: 0000000000000000(0000) GS:ffff88849dc80000(0000) > > knlGS:0000000000000000 > > [ +0,008083] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ +0,005749] CR2: 00000000000005a0 CR3: 0000000005210006 CR4: > > 0000000000760ee0 > > [ +0,007128] PKRU: 55555554 > > [ +0,002722] Call Trace: > > [ +0,002458] drm_dp_mst_handle_up_req+0x517/0x540 [drm_kms_helper] > > [ +0,006197] ? drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] > > [ +0,005764] drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] > > [ +0,005623] ? intel_dp_hpd_pulse+0x205/0x370 [i915] > > [ +0,005018] intel_dp_hpd_pulse+0x205/0x370 [i915] > > [ +0,004836] i915_digport_work_func+0xbb/0x140 [i915] > > [ +0,005108] process_one_work+0x245/0x610 > > [ +0,004027] worker_thread+0x37/0x380 > > [ +0,003684] ? process_one_work+0x610/0x610 > > [ +0,004184] kthread+0x119/0x130 > > [ +0,003240] ? kthread_park+0x80/0x80 > > [ +0,003668] ret_from_fork+0x24/0x50 > > > > Cc: Lyude Paul <lyude@redhat.com> > > Cc: Dave Airlie <airlied@redhat.com> > > Signed-off-by: Imre Deak <imre.deak@intel.com> > > --- > > drivers/gpu/drm/drm_dp_mst_topology.c | 6 +++++- > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c > > b/drivers/gpu/drm/drm_dp_mst_topology.c > > index da1abca1b9e9..24c325f4a616 100644 > > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > > @@ -1996,7 +1996,11 @@ static void process_single_up_tx_qlock(struct > > drm_dp_mst_topology_mgr *mgr, > > if (ret != 1) > > DRM_DEBUG_KMS("failed to send msg in q %d\n", ret); > > > > - txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > + if (txmsg->seqno != -1) { > > + WARN_ON((unsigned)txmsg->seqno > > > + ARRAY_SIZE(txmsg->dst->tx_slots)); > > Not 100% sure on this nitpick myself but, if we know that txmsg->seqno is > about to go out of bounds shouldn't we also try to take action to stop it? > like Imo, it's worth keeping thins simple. If the WARN triggers we need to fix the original issue in any case (txmsg->seqno never should be set to anything else than -1 or a valid idx) so making the assignment here conditional wouldn't have a real purpose. > > if (!WARN_ON((unsigned)txmsg->seqno > ARRAY_SIZE(txmsg->dst->tx_slots))) > txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > > > + txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > + } > > } > > > > static void drm_dp_queue_down_tx(struct drm_dp_mst_topology_mgr *mgr, > -- > Cheers, > Lyude Paul >
On Fri, 2019-05-24 at 01:28 +0300, Imre Deak wrote: > On Thu, May 23, 2019 at 06:09:56PM -0400, Lyude Paul wrote: > > Patch mostly looks good to me, one comment below > > > > On Fri, 2019-05-24 at 00:24 +0300, Imre Deak wrote: > > > Fix the breakage resulting in the stacktrace below, due to tx queue > > > being full when trying to send an up-reply. txmsg->seqno is -1 in this > > > case leading to a corruption of the mstb object by > > > > > > txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > > > > in process_single_up_tx_qlock(). > > > > > > [ +0,005162] [drm:process_single_tx_qlock [drm_kms_helper]] > > > set_hdr_from_dst_qlock: failed to find slot > > > [ +0,000015] [drm:drm_dp_send_up_ack_reply.constprop.19 > > > [drm_kms_helper]] > > > failed to send msg in q -11 > > > [ +0,000939] BUG: kernel NULL pointer dereference, address: > > > 00000000000005a0 > > > [ +0,006982] #PF: supervisor write access in kernel mode > > > [ +0,005223] #PF: error_code(0x0002) - not-present page > > > [ +0,005135] PGD 0 P4D 0 > > > [ +0,002581] Oops: 0002 [#1] PREEMPT SMP NOPTI > > > [ +0,004359] CPU: 1 PID: 1200 Comm: kworker/u16:3 Tainted: > > > G U 5.2.0-rc1+ #410 > > > [ +0,008433] Hardware name: Intel Corporation Ice Lake Client > > > Platform/IceLake U DDR4 SODIMM PD RVP, BIOS > > > ICLSFWR1.R00.3175.A00.1904261428 > > > 04/26/2019 > > > [ +0,013323] Workqueue: i915-dp i915_digport_work_func [i915] > > > [ +0,005676] RIP: 0010:queue_work_on+0x19/0x70 > > > [ +0,004372] Code: ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 > > > 41 56 > > > 49 89 f6 41 55 41 89 fd 41 54 55 53 48 89 d3 9c 5d fa e8 e7 81 0c 00 > > > <f0> 48 > > > 0f ba 2b 00 73 31 45 31 e4 f7 c5 00 02 00 00 74 13 e8 cf 7f > > > [ +0,018750] RSP: 0018:ffffc900007dfc50 EFLAGS: 00010006 > > > [ +0,005222] RAX: 0000000000000046 RBX: 00000000000005a0 RCX: > > > 0000000000000001 > > > [ +0,007133] RDX: 000000000001b608 RSI: 0000000000000000 RDI: > > > ffffffff82121972 > > > [ +0,007129] RBP: 0000000000000202 R08: 0000000000000000 R09: > > > 0000000000000001 > > > [ +0,007129] R10: 0000000000000000 R11: 0000000000000000 R12: > > > ffff88847bfa5096 > > > [ +0,007131] R13: 0000000000000010 R14: ffff88849c08f3f8 R15: > > > 0000000000000000 > > > [ +0,007128] FS: 0000000000000000(0000) GS:ffff88849dc80000(0000) > > > knlGS:0000000000000000 > > > [ +0,008083] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > [ +0,005749] CR2: 00000000000005a0 CR3: 0000000005210006 CR4: > > > 0000000000760ee0 > > > [ +0,007128] PKRU: 55555554 > > > [ +0,002722] Call Trace: > > > [ +0,002458] drm_dp_mst_handle_up_req+0x517/0x540 [drm_kms_helper] > > > [ +0,006197] ? drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] > > > [ +0,005764] drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] > > > [ +0,005623] ? intel_dp_hpd_pulse+0x205/0x370 [i915] > > > [ +0,005018] intel_dp_hpd_pulse+0x205/0x370 [i915] > > > [ +0,004836] i915_digport_work_func+0xbb/0x140 [i915] > > > [ +0,005108] process_one_work+0x245/0x610 > > > [ +0,004027] worker_thread+0x37/0x380 > > > [ +0,003684] ? process_one_work+0x610/0x610 > > > [ +0,004184] kthread+0x119/0x130 > > > [ +0,003240] ? kthread_park+0x80/0x80 > > > [ +0,003668] ret_from_fork+0x24/0x50 > > > > > > Cc: Lyude Paul <lyude@redhat.com> > > > Cc: Dave Airlie <airlied@redhat.com> > > > Signed-off-by: Imre Deak <imre.deak@intel.com> > > > --- > > > drivers/gpu/drm/drm_dp_mst_topology.c | 6 +++++- > > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c > > > b/drivers/gpu/drm/drm_dp_mst_topology.c > > > index da1abca1b9e9..24c325f4a616 100644 > > > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > > > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > > > @@ -1996,7 +1996,11 @@ static void process_single_up_tx_qlock(struct > > > drm_dp_mst_topology_mgr *mgr, > > > if (ret != 1) > > > DRM_DEBUG_KMS("failed to send msg in q %d\n", ret); > > > > > > - txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > + if (txmsg->seqno != -1) { > > > + WARN_ON((unsigned)txmsg->seqno > > > > + ARRAY_SIZE(txmsg->dst->tx_slots)); > > > > Not 100% sure on this nitpick myself but, if we know that txmsg->seqno is > > about to go out of bounds shouldn't we also try to take action to stop it? > > like > > Imo, it's worth keeping thins simple. If the WARN triggers we need to > fix the original issue in any case (txmsg->seqno never should be set to > anything else than -1 or a valid idx) so making the assignment here > conditional wouldn't have a real purpose. mm, fair enough. Then: Reviewed-by: Lyude Paul <lyude@redhat.com> Thanks for the fix! > > > if (!WARN_ON((unsigned)txmsg->seqno > ARRAY_SIZE(txmsg->dst->tx_slots))) > > txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > > > > > > > + txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > + } > > > } > > > > > > static void drm_dp_queue_down_tx(struct drm_dp_mst_topology_mgr *mgr, > > -- > > Cheers, > > Lyude Paul > >
On Thu, May 23, 2019 at 06:31:15PM -0400, Lyude Paul wrote: > On Fri, 2019-05-24 at 01:28 +0300, Imre Deak wrote: > > On Thu, May 23, 2019 at 06:09:56PM -0400, Lyude Paul wrote: > > > Patch mostly looks good to me, one comment below > > > > > > On Fri, 2019-05-24 at 00:24 +0300, Imre Deak wrote: > > > > Fix the breakage resulting in the stacktrace below, due to tx queue > > > > being full when trying to send an up-reply. txmsg->seqno is -1 in this > > > > case leading to a corruption of the mstb object by > > > > > > > > txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > > > > > > in process_single_up_tx_qlock(). > > > > > > > > [ +0,005162] [drm:process_single_tx_qlock [drm_kms_helper]] > > > > set_hdr_from_dst_qlock: failed to find slot > > > > [ +0,000015] [drm:drm_dp_send_up_ack_reply.constprop.19 > > > > [drm_kms_helper]] > > > > failed to send msg in q -11 > > > > [ +0,000939] BUG: kernel NULL pointer dereference, address: > > > > 00000000000005a0 > > > > [ +0,006982] #PF: supervisor write access in kernel mode > > > > [ +0,005223] #PF: error_code(0x0002) - not-present page > > > > [ +0,005135] PGD 0 P4D 0 > > > > [ +0,002581] Oops: 0002 [#1] PREEMPT SMP NOPTI > > > > [ +0,004359] CPU: 1 PID: 1200 Comm: kworker/u16:3 Tainted: > > > > G U 5.2.0-rc1+ #410 > > > > [ +0,008433] Hardware name: Intel Corporation Ice Lake Client > > > > Platform/IceLake U DDR4 SODIMM PD RVP, BIOS > > > > ICLSFWR1.R00.3175.A00.1904261428 > > > > 04/26/2019 > > > > [ +0,013323] Workqueue: i915-dp i915_digport_work_func [i915] > > > > [ +0,005676] RIP: 0010:queue_work_on+0x19/0x70 > > > > [ +0,004372] Code: ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 > > > > 41 56 > > > > 49 89 f6 41 55 41 89 fd 41 54 55 53 48 89 d3 9c 5d fa e8 e7 81 0c 00 > > > > <f0> 48 > > > > 0f ba 2b 00 73 31 45 31 e4 f7 c5 00 02 00 00 74 13 e8 cf 7f > > > > [ +0,018750] RSP: 0018:ffffc900007dfc50 EFLAGS: 00010006 > > > > [ +0,005222] RAX: 0000000000000046 RBX: 00000000000005a0 RCX: > > > > 0000000000000001 > > > > [ +0,007133] RDX: 000000000001b608 RSI: 0000000000000000 RDI: > > > > ffffffff82121972 > > > > [ +0,007129] RBP: 0000000000000202 R08: 0000000000000000 R09: > > > > 0000000000000001 > > > > [ +0,007129] R10: 0000000000000000 R11: 0000000000000000 R12: > > > > ffff88847bfa5096 > > > > [ +0,007131] R13: 0000000000000010 R14: ffff88849c08f3f8 R15: > > > > 0000000000000000 > > > > [ +0,007128] FS: 0000000000000000(0000) GS:ffff88849dc80000(0000) > > > > knlGS:0000000000000000 > > > > [ +0,008083] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > [ +0,005749] CR2: 00000000000005a0 CR3: 0000000005210006 CR4: > > > > 0000000000760ee0 > > > > [ +0,007128] PKRU: 55555554 > > > > [ +0,002722] Call Trace: > > > > [ +0,002458] drm_dp_mst_handle_up_req+0x517/0x540 [drm_kms_helper] > > > > [ +0,006197] ? drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] > > > > [ +0,005764] drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] > > > > [ +0,005623] ? intel_dp_hpd_pulse+0x205/0x370 [i915] > > > > [ +0,005018] intel_dp_hpd_pulse+0x205/0x370 [i915] > > > > [ +0,004836] i915_digport_work_func+0xbb/0x140 [i915] > > > > [ +0,005108] process_one_work+0x245/0x610 > > > > [ +0,004027] worker_thread+0x37/0x380 > > > > [ +0,003684] ? process_one_work+0x610/0x610 > > > > [ +0,004184] kthread+0x119/0x130 > > > > [ +0,003240] ? kthread_park+0x80/0x80 > > > > [ +0,003668] ret_from_fork+0x24/0x50 > > > > > > > > Cc: Lyude Paul <lyude@redhat.com> > > > > Cc: Dave Airlie <airlied@redhat.com> > > > > Signed-off-by: Imre Deak <imre.deak@intel.com> > > > > --- > > > > drivers/gpu/drm/drm_dp_mst_topology.c | 6 +++++- > > > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c > > > > b/drivers/gpu/drm/drm_dp_mst_topology.c > > > > index da1abca1b9e9..24c325f4a616 100644 > > > > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > > > > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > > > > @@ -1996,7 +1996,11 @@ static void process_single_up_tx_qlock(struct > > > > drm_dp_mst_topology_mgr *mgr, > > > > if (ret != 1) > > > > DRM_DEBUG_KMS("failed to send msg in q %d\n", ret); > > > > > > > > - txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > > + if (txmsg->seqno != -1) { > > > > + WARN_ON((unsigned)txmsg->seqno > > > > > + ARRAY_SIZE(txmsg->dst->tx_slots)); > > > > > > Not 100% sure on this nitpick myself but, if we know that txmsg->seqno is > > > about to go out of bounds shouldn't we also try to take action to stop it? > > > like > > > > Imo, it's worth keeping thins simple. If the WARN triggers we need to > > fix the original issue in any case (txmsg->seqno never should be set to > > anything else than -1 or a valid idx) so making the assignment here > > conditional wouldn't have a real purpose. > > mm, fair enough. Then: > > Reviewed-by: Lyude Paul <lyude@redhat.com> > > Thanks for the fix! Thanks for the review. Got now the permission and so pushed the change to drm-misc-next. I wonder if there's a CI for this branch similar we have in place for the intel-gfx branches. Maybe I should've CC'd the patch to the intel-gfx ML for a pre-merge CI report? > > > > > > if (!WARN_ON((unsigned)txmsg->seqno > ARRAY_SIZE(txmsg->dst->tx_slots))) > > > txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > > > > > > > > > > > + txmsg->dst->tx_slots[txmsg->seqno] = NULL; > > > > + } > > > > } > > > > > > > > static void drm_dp_queue_down_tx(struct drm_dp_mst_topology_mgr *mgr, > > > -- > > > Cheers, > > > Lyude Paul > > > > -- > Cheers, > Lyude Paul >
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index da1abca1b9e9..24c325f4a616 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -1996,7 +1996,11 @@ static void process_single_up_tx_qlock(struct drm_dp_mst_topology_mgr *mgr, if (ret != 1) DRM_DEBUG_KMS("failed to send msg in q %d\n", ret); - txmsg->dst->tx_slots[txmsg->seqno] = NULL; + if (txmsg->seqno != -1) { + WARN_ON((unsigned)txmsg->seqno > + ARRAY_SIZE(txmsg->dst->tx_slots)); + txmsg->dst->tx_slots[txmsg->seqno] = NULL; + } } static void drm_dp_queue_down_tx(struct drm_dp_mst_topology_mgr *mgr,
Fix the breakage resulting in the stacktrace below, due to tx queue being full when trying to send an up-reply. txmsg->seqno is -1 in this case leading to a corruption of the mstb object by txmsg->dst->tx_slots[txmsg->seqno] = NULL; in process_single_up_tx_qlock(). [ +0,005162] [drm:process_single_tx_qlock [drm_kms_helper]] set_hdr_from_dst_qlock: failed to find slot [ +0,000015] [drm:drm_dp_send_up_ack_reply.constprop.19 [drm_kms_helper]] failed to send msg in q -11 [ +0,000939] BUG: kernel NULL pointer dereference, address: 00000000000005a0 [ +0,006982] #PF: supervisor write access in kernel mode [ +0,005223] #PF: error_code(0x0002) - not-present page [ +0,005135] PGD 0 P4D 0 [ +0,002581] Oops: 0002 [#1] PREEMPT SMP NOPTI [ +0,004359] CPU: 1 PID: 1200 Comm: kworker/u16:3 Tainted: G U 5.2.0-rc1+ #410 [ +0,008433] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3175.A00.1904261428 04/26/2019 [ +0,013323] Workqueue: i915-dp i915_digport_work_func [i915] [ +0,005676] RIP: 0010:queue_work_on+0x19/0x70 [ +0,004372] Code: ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 41 56 49 89 f6 41 55 41 89 fd 41 54 55 53 48 89 d3 9c 5d fa e8 e7 81 0c 00 <f0> 48 0f ba 2b 00 73 31 45 31 e4 f7 c5 00 02 00 00 74 13 e8 cf 7f [ +0,018750] RSP: 0018:ffffc900007dfc50 EFLAGS: 00010006 [ +0,005222] RAX: 0000000000000046 RBX: 00000000000005a0 RCX: 0000000000000001 [ +0,007133] RDX: 000000000001b608 RSI: 0000000000000000 RDI: ffffffff82121972 [ +0,007129] RBP: 0000000000000202 R08: 0000000000000000 R09: 0000000000000001 [ +0,007129] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88847bfa5096 [ +0,007131] R13: 0000000000000010 R14: ffff88849c08f3f8 R15: 0000000000000000 [ +0,007128] FS: 0000000000000000(0000) GS:ffff88849dc80000(0000) knlGS:0000000000000000 [ +0,008083] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0,005749] CR2: 00000000000005a0 CR3: 0000000005210006 CR4: 0000000000760ee0 [ +0,007128] PKRU: 55555554 [ +0,002722] Call Trace: [ +0,002458] drm_dp_mst_handle_up_req+0x517/0x540 [drm_kms_helper] [ +0,006197] ? drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] [ +0,005764] drm_dp_mst_hpd_irq+0x5b/0x9c0 [drm_kms_helper] [ +0,005623] ? intel_dp_hpd_pulse+0x205/0x370 [i915] [ +0,005018] intel_dp_hpd_pulse+0x205/0x370 [i915] [ +0,004836] i915_digport_work_func+0xbb/0x140 [i915] [ +0,005108] process_one_work+0x245/0x610 [ +0,004027] worker_thread+0x37/0x380 [ +0,003684] ? process_one_work+0x610/0x610 [ +0,004184] kthread+0x119/0x130 [ +0,003240] ? kthread_park+0x80/0x80 [ +0,003668] ret_from_fork+0x24/0x50 Cc: Lyude Paul <lyude@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com> --- drivers/gpu/drm/drm_dp_mst_topology.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)