diff mbox series

[v2] Fix the possible watermark miswriting for skl+

Message ID 20181113143138.9536-1-stanislav.lisovskiy@intel.com (mailing list archive)
State New, archived
Headers show
Series [v2] Fix the possible watermark miswriting for skl+ | expand

Commit Message

Stanislav Lisovskiy Nov. 13, 2018, 2:31 p.m. UTC
Currently whenever we attempt to recalculate
watermarks, we assign dirty_pipes to zero,
then compare current wm results to the recalculated
one and if they changed we set correspondent dirty_pipes
bit again.
This can lead to situation, when we are clearing dirty_pipes,
same wm results twice in a row and not setting dirty_pipes
=> so that watermarks are not actually updated, which then might
lead to fifo underruns, crc mismatch and other issues.

Instead, whenever we detect that wm results are changed,
need to set correspondent dirty_pipes bit and clear it
only once the change is written, but not clear it everytime
we attempt to recalculate those in skl_compute_wm.

v2: Fixed commit message

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Chris Wilson Nov. 13, 2018, 4:40 p.m. UTC | #1
Quoting Stanislav Lisovskiy (2018-11-13 14:31:38)
> Currently whenever we attempt to recalculate
> watermarks, we assign dirty_pipes to zero,
> then compare current wm results to the recalculated
> one and if they changed we set correspondent dirty_pipes
> bit again.
> This can lead to situation, when we are clearing dirty_pipes,
> same wm results twice in a row and not setting dirty_pipes
> => so that watermarks are not actually updated, which then might
> lead to fifo underruns, crc mismatch and other issues.
> 
> Instead, whenever we detect that wm results are changed,
> need to set correspondent dirty_pipes bit and clear it
> only once the change is written, but not clear it everytime
> we attempt to recalculate those in skl_compute_wm.

Ok, but are not dirty_pipes being recomputed for each commit wrt to the
current HW state. Should dirty_pipes not be 0 at the start of compute_wm
naturally due to it not being used before in the atomic commit sequence?
-Chris
Stanislav Lisovskiy Nov. 14, 2018, 8:19 a.m. UTC | #2
On Tue, 2018-11-13 at 16:40 +0000, Chris Wilson wrote:
> Quoting Stanislav Lisovskiy (2018-11-13 14:31:38)
> > Currently whenever we attempt to recalculate
> > watermarks, we assign dirty_pipes to zero,
> > then compare current wm results to the recalculated
> > one and if they changed we set correspondent dirty_pipes
> > bit again.
> > This can lead to situation, when we are clearing dirty_pipes,
> > same wm results twice in a row and not setting dirty_pipes
> > => so that watermarks are not actually updated, which then might
> > lead to fifo underruns, crc mismatch and other issues.
> > 
> > Instead, whenever we detect that wm results are changed,
> > need to set correspondent dirty_pipes bit and clear it
> > only once the change is written, but not clear it everytime
> > we attempt to recalculate those in skl_compute_wm.
> 
> Ok, but are not dirty_pipes being recomputed for each commit wrt to
> the
> current HW state. Should dirty_pipes not be 0 at the start of
> compute_wm
> naturally due to it not being used before in the atomic commit
> sequence?
> -Chris

We discussed this yesterday with Ville, as I understand the whole
drm_atomic_state is allocated each time the commit starts and it is 
already naturally 0. So anyway assigning here dirty_pipes to 0 looked
redundant at least to me and Ville.
However the problem I mean was that if skl_compute_wm is invoked 
twice and getting the same wm results, then we are not going to set
correspondent dirty_pipes bit and because it was zeroed in the
beginning of skl_compute_wm, we are not going to update the watermarks
at all, even though we should as it had changed.

During our discussion with Ville, we agreed that theoretically this
shouldn't happen because skl_compute_wm(called via
compute_global_watermarks hook) should be called only once per commit
in intel_atomic_check and then results are written, when crtc is
updated and this potentially can be caught using some WARN_ON.

What I did is I added a WARN_ON here in skl_compute_wm and removed
assigning dirty_pipes to zero(so that we can see that it might happen
that correspondent bit is already set, but we are getting here again):

@@ -5441,9 +5441,6 @@ skl_compute_wm(struct drm_atomic_state *state)
        bool changed = false;
        int ret, i;
 
-       /* Clear all dirty flags */
-       results->dirty_pipes = 0;
-
@@ -5471,6 +5471,8 @@ skl_compute_wm(struct drm_atomic_state *state)
		ret = skl_update_pipe_wm(cstate, old_pipe_wm, pipe_wm,
		
		 &results->ddb, &changed);
                if (ret)
                        return ret;
 
+               WARN_ON(!changed && (results->dirty_pipes &
drm_crtc_mask(crtc)));
+
                if (changed)
                        results->dirty_pipes |= drm_crtc_mask(crtc);

If skl_compute_wm is called only once we should never get it, as
we would otherwise never have correpondent crtc bit already set here
(and we would already had zeroed dirty_pipes here as we do now, the 
update would be lost).

However this WARN really catches the issue:

[   35.681909] ------------[ cut here ]------------
[   35.681917] WARN_ON(!changed && (results->dirty_pipes &
drm_crtc_mask(crtc)))
[   35.681998] WARNING: CPU: 3 PID: 1499 at
drivers/gpu/drm/i915/intel_pm.c:5474 skl_compute_wm+0x709/0x800 [i915]
[   35.682002] Modules linked in: cdc_ether usbnet snd_hda_codec_hdmi
snd_hda_codec_realtek snd_hda_codec_generic r8152 mii snd_hda_intel
snd_hda_codec x86_pkg_temp_thermal snd_hwdep coretemp snd_hda_core
snd_pcm pl2303 usbserial btusb btrtl btbcm btintel bluetooth
ecdh_generic mei_me mei dm_crypt crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel i915 prime_numbers i2c_hid pinctrl_sunrisepoint
pinctrl_intel
[   35.682044] CPU: 3 PID: 1499 Comm: Xorg Tainted:
G        W         4.20.0-rc1-CI-CI_DRM_5076+ #123
[   35.682047] Hardware name: LENOVO 81A8/LNVNB161216, BIOS 5SCN34WW
09/01/2017
[   35.682093] RIP: 0010:skl_compute_wm+0x709/0x800 [i915]
[   35.682097] Code: c1 e8 11 e9 40 fe ff ff d3 e0 41 85 86 98 04 00 00
0f 84 85 fe ff ff 48 c7 c6 10 48 1c a0 48 c7 c7 2c 13 1b a0 e8 07 49 03
e1 <0f> 0b 41 8b 86 98 04 00 00 e9 4f fe ff ff 44 89 ca e9 a5 f9 ff ff
[   35.682101] RSP: 0018:ffffc90000c53a08 EFLAGS: 00010282
[   35.682106] RAX: 0000000000000000 RBX: 00000000007784bf RCX:
0000000000000006
[   35.682109] RDX: 0000000000000006 RSI: ffffffff8212886a RDI:
ffffffff820d6d57
[   35.682112] RBP: ffff88029ef40000 R08: 000000001d5e3b8b R09:
0000000000000000
[   35.682116] R10: ffff88029be2ea54 R11: 0000000000000000 R12:
ffff880299db6fc8
[   35.682119] R13: ffff88029be2e678 R14: ffff88026000c138 R15:
0000000000000001
[   35.682123] FS:  00007f3bbca90600(0000) GS:ffff8802b9d80000(0000)
knlGS:0000000000000000
[   35.682126] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   35.682130] CR2: 00007f838a7b1158 CR3: 000000026cbe2003 CR4:
00000000003606e0
[   35.682133] Call Trace:
[   35.682148]  ? __mutex_unlock_slowpath+0x46/0x2b0
[   35.682215]  intel_atomic_check+0x4d6/0x1230 [i915]
[   35.682233]  drm_atomic_check_only+0x477/0x710
[   35.682245]  drm_atomic_commit+0xe/0x50
[   35.682253]  drm_atomic_helper_set_config+0x7b/0x90
[   35.682260]  drm_mode_setcrtc+0x195/0x6b0
[   35.682289]  ? drm_mode_getcrtc+0x180/0x180
[   35.682295]  drm_ioctl_kernel+0x81/0xf0
[   35.682305]  drm_ioctl+0x2de/0x390
[   35.682312]  ? drm_mode_getcrtc+0x180/0x180
[   35.682327]  ? lock_acquire+0xa6/0x1c0
[   35.682336]  do_vfs_ioctl+0xa0/0x6e0
[   35.682345]  ? __fget+0xfc/0x1e0
[   35.682353]  ksys_ioctl+0x35/0x60
[   35.682361]  __x64_sys_ioctl+0x11/0x20
[   35.682369]  do_syscall_64+0x55/0x190
[   35.682376]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   35.682380] RIP: 0033:0x7f3bb9ea85d7
[   35.682385] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00
48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
[   35.682388] RSP: 002b:00007fff41c6e8d8 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[   35.682393] RAX: ffffffffffffffda RBX: 000055fca1377680 RCX:
00007f3bb9ea85d7
[   35.682396] RDX: 00007fff41c6e9c0 RSI: 00000000c06864a2 RDI:
000000000000000d
[   35.682400] RBP: 00007fff41c6e9c0 R08: 0000000000000001 R09:
00007f3bbc9f4000
[   35.682403] R10: 0000000000000001 R11: 0000000000000246 R12:
00000000c06864a2
[   35.682406] R13: 000000000000000d R14: 000000000000000a R15:
000055fca137fa50
[   35.682424] irq event stamp: 847224
[   35.682429] hardirqs last  enabled at (847223): [<ffffffff810fba24>]
vprintk_emit+0x124/0x320
[   35.682434] hardirqs last disabled at (847224): [<ffffffff810019a0>]
trace_hardirqs_off_thunk+0x1a/0x1c
[   35.682438] softirqs last  enabled at (847182): [<ffffffff81c0033a>]
__do_softirq+0x33a/0x4b9
[   35.682444] softirqs last disabled at (847175): [<ffffffff8108def9>]
irq_exit+0xa9/0xc0
[   35.682484] WARNING: CPU: 3 PID: 1499 at
drivers/gpu/drm/i915/intel_pm.c:5474 skl_compute_wm+0x709/0x800 [i915]
[   35.682487] ---[ end trace 7b4397aa7cd7d405 ]---

Which means that we are losing some watermark updates, in case if we
assign dirty_pipes to zero and then not setting it back.


> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Ville Syrjala Nov. 14, 2018, 11:09 a.m. UTC | #3
On Wed, Nov 14, 2018 at 08:19:26AM +0000, Lisovskiy, Stanislav wrote:
> On Tue, 2018-11-13 at 16:40 +0000, Chris Wilson wrote:
> > Quoting Stanislav Lisovskiy (2018-11-13 14:31:38)
> > > Currently whenever we attempt to recalculate
> > > watermarks, we assign dirty_pipes to zero,
> > > then compare current wm results to the recalculated
> > > one and if they changed we set correspondent dirty_pipes
> > > bit again.
> > > This can lead to situation, when we are clearing dirty_pipes,
> > > same wm results twice in a row and not setting dirty_pipes
> > > => so that watermarks are not actually updated, which then might
> > > lead to fifo underruns, crc mismatch and other issues.
> > > 
> > > Instead, whenever we detect that wm results are changed,
> > > need to set correspondent dirty_pipes bit and clear it
> > > only once the change is written, but not clear it everytime
> > > we attempt to recalculate those in skl_compute_wm.
> > 
> > Ok, but are not dirty_pipes being recomputed for each commit wrt to
> > the
> > current HW state. Should dirty_pipes not be 0 at the start of
> > compute_wm
> > naturally due to it not being used before in the atomic commit
> > sequence?
> > -Chris
> 
> We discussed this yesterday with Ville, as I understand the whole
> drm_atomic_state is allocated each time the commit starts and it is 
> already naturally 0. So anyway assigning here dirty_pipes to 0 looked
> redundant at least to me and Ville.
> However the problem I mean was that if skl_compute_wm is invoked 
> twice and getting the same wm results, then we are not going to set
> correspondent dirty_pipes bit and because it was zeroed in the
> beginning of skl_compute_wm, we are not going to update the watermarks
> at all, even though we should as it had changed.
> 
> During our discussion with Ville, we agreed that theoretically this
> shouldn't happen because skl_compute_wm(called via
> compute_global_watermarks hook) should be called only once per commit
> in intel_atomic_check and then results are written, when crtc is
> updated and this potentially can be caught using some WARN_ON.
> 
> What I did is I added a WARN_ON here in skl_compute_wm and removed
> assigning dirty_pipes to zero(so that we can see that it might happen
> that correspondent bit is already set, but we are getting here again):
> 
> @@ -5441,9 +5441,6 @@ skl_compute_wm(struct drm_atomic_state *state)
>         bool changed = false;
>         int ret, i;
>  
> -       /* Clear all dirty flags */
> -       results->dirty_pipes = 0;
> -
> @@ -5471,6 +5471,8 @@ skl_compute_wm(struct drm_atomic_state *state)
> 		ret = skl_update_pipe_wm(cstate, old_pipe_wm, pipe_wm,
> 		
> 		 &results->ddb, &changed);
>                 if (ret)
>                         return ret;
>  
> +               WARN_ON(!changed && (results->dirty_pipes &
> drm_crtc_mask(crtc)));
> +
>                 if (changed)
>                         results->dirty_pipes |= drm_crtc_mask(crtc);
> 
> If skl_compute_wm is called only once we should never get it, as
> we would otherwise never have correpondent crtc bit already set here
> (and we would already had zeroed dirty_pipes here as we do now, the 
> update would be lost).
> 
> However this WARN really catches the issue:

Probably due to skl_ddb_add_affected_pipes(), which looks safe enough
to me.

Not sure why it sets dirty_pipes though. Either the ddb allocations
change or they don't, so I don't really see why we need to mark
every pipe as dirty here. They will be marked naturally by
skl_ddb_add_affected_planes() anyway.

So looks to me we can just drop the dirty_pipes frobbing from
skl_ddb_add_affected_pipes().

> 
> [   35.681909] ------------[ cut here ]------------
> [   35.681917] WARN_ON(!changed && (results->dirty_pipes &
> drm_crtc_mask(crtc)))
> [   35.681998] WARNING: CPU: 3 PID: 1499 at
> drivers/gpu/drm/i915/intel_pm.c:5474 skl_compute_wm+0x709/0x800 [i915]
> [   35.682002] Modules linked in: cdc_ether usbnet snd_hda_codec_hdmi
> snd_hda_codec_realtek snd_hda_codec_generic r8152 mii snd_hda_intel
> snd_hda_codec x86_pkg_temp_thermal snd_hwdep coretemp snd_hda_core
> snd_pcm pl2303 usbserial btusb btrtl btbcm btintel bluetooth
> ecdh_generic mei_me mei dm_crypt crct10dif_pclmul crc32_pclmul
> ghash_clmulni_intel i915 prime_numbers i2c_hid pinctrl_sunrisepoint
> pinctrl_intel
> [   35.682044] CPU: 3 PID: 1499 Comm: Xorg Tainted:
> G        W         4.20.0-rc1-CI-CI_DRM_5076+ #123
> [   35.682047] Hardware name: LENOVO 81A8/LNVNB161216, BIOS 5SCN34WW
> 09/01/2017
> [   35.682093] RIP: 0010:skl_compute_wm+0x709/0x800 [i915]
> [   35.682097] Code: c1 e8 11 e9 40 fe ff ff d3 e0 41 85 86 98 04 00 00
> 0f 84 85 fe ff ff 48 c7 c6 10 48 1c a0 48 c7 c7 2c 13 1b a0 e8 07 49 03
> e1 <0f> 0b 41 8b 86 98 04 00 00 e9 4f fe ff ff 44 89 ca e9 a5 f9 ff ff
> [   35.682101] RSP: 0018:ffffc90000c53a08 EFLAGS: 00010282
> [   35.682106] RAX: 0000000000000000 RBX: 00000000007784bf RCX:
> 0000000000000006
> [   35.682109] RDX: 0000000000000006 RSI: ffffffff8212886a RDI:
> ffffffff820d6d57
> [   35.682112] RBP: ffff88029ef40000 R08: 000000001d5e3b8b R09:
> 0000000000000000
> [   35.682116] R10: ffff88029be2ea54 R11: 0000000000000000 R12:
> ffff880299db6fc8
> [   35.682119] R13: ffff88029be2e678 R14: ffff88026000c138 R15:
> 0000000000000001
> [   35.682123] FS:  00007f3bbca90600(0000) GS:ffff8802b9d80000(0000)
> knlGS:0000000000000000
> [   35.682126] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   35.682130] CR2: 00007f838a7b1158 CR3: 000000026cbe2003 CR4:
> 00000000003606e0
> [   35.682133] Call Trace:
> [   35.682148]  ? __mutex_unlock_slowpath+0x46/0x2b0
> [   35.682215]  intel_atomic_check+0x4d6/0x1230 [i915]
> [   35.682233]  drm_atomic_check_only+0x477/0x710
> [   35.682245]  drm_atomic_commit+0xe/0x50
> [   35.682253]  drm_atomic_helper_set_config+0x7b/0x90
> [   35.682260]  drm_mode_setcrtc+0x195/0x6b0
> [   35.682289]  ? drm_mode_getcrtc+0x180/0x180
> [   35.682295]  drm_ioctl_kernel+0x81/0xf0
> [   35.682305]  drm_ioctl+0x2de/0x390
> [   35.682312]  ? drm_mode_getcrtc+0x180/0x180
> [   35.682327]  ? lock_acquire+0xa6/0x1c0
> [   35.682336]  do_vfs_ioctl+0xa0/0x6e0
> [   35.682345]  ? __fget+0xfc/0x1e0
> [   35.682353]  ksys_ioctl+0x35/0x60
> [   35.682361]  __x64_sys_ioctl+0x11/0x20
> [   35.682369]  do_syscall_64+0x55/0x190
> [   35.682376]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [   35.682380] RIP: 0033:0x7f3bb9ea85d7
> [   35.682385] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00
> 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f
> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
> [   35.682388] RSP: 002b:00007fff41c6e8d8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [   35.682393] RAX: ffffffffffffffda RBX: 000055fca1377680 RCX:
> 00007f3bb9ea85d7
> [   35.682396] RDX: 00007fff41c6e9c0 RSI: 00000000c06864a2 RDI:
> 000000000000000d
> [   35.682400] RBP: 00007fff41c6e9c0 R08: 0000000000000001 R09:
> 00007f3bbc9f4000
> [   35.682403] R10: 0000000000000001 R11: 0000000000000246 R12:
> 00000000c06864a2
> [   35.682406] R13: 000000000000000d R14: 000000000000000a R15:
> 000055fca137fa50
> [   35.682424] irq event stamp: 847224
> [   35.682429] hardirqs last  enabled at (847223): [<ffffffff810fba24>]
> vprintk_emit+0x124/0x320
> [   35.682434] hardirqs last disabled at (847224): [<ffffffff810019a0>]
> trace_hardirqs_off_thunk+0x1a/0x1c
> [   35.682438] softirqs last  enabled at (847182): [<ffffffff81c0033a>]
> __do_softirq+0x33a/0x4b9
> [   35.682444] softirqs last disabled at (847175): [<ffffffff8108def9>]
> irq_exit+0xa9/0xc0
> [   35.682484] WARNING: CPU: 3 PID: 1499 at
> drivers/gpu/drm/i915/intel_pm.c:5474 skl_compute_wm+0x709/0x800 [i915]
> [   35.682487] ---[ end trace 7b4397aa7cd7d405 ]---
> 
> Which means that we are losing some watermark updates, in case if we
> assign dirty_pipes to zero and then not setting it back.
> 
> 
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> -- 
> Best Regards,
> 
> Lisovskiy Stanislav
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index dc034617febb..f7fbc4bc0d43 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5441,9 +5441,6 @@  skl_compute_wm(struct drm_atomic_state *state)
 	bool changed = false;
 	int ret, i;
 
-	/* Clear all dirty flags */
-	results->dirty_pipes = 0;
-
 	ret = skl_ddb_add_affected_pipes(state, &changed);
 	if (ret || !changed)
 		return ret;
@@ -5496,6 +5493,7 @@  static void skl_atomic_update_crtc_wm(struct intel_atomic_state *state,
 	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
 	struct skl_pipe_wm *pipe_wm = &cstate->wm.skl.optimal;
 	const struct skl_ddb_allocation *ddb = &state->wm_results.ddb;
+	struct skl_ddb_values *results = &state->wm_results;
 	enum pipe pipe = crtc->pipe;
 	enum plane_id plane_id;
 
@@ -5512,6 +5510,10 @@  static void skl_atomic_update_crtc_wm(struct intel_atomic_state *state,
 			skl_write_cursor_wm(crtc, &pipe_wm->planes[plane_id],
 					    ddb);
 	}
+
+	/* Clear correspondent dirty bit */
+	results->dirty_pipes &= ~drm_crtc_mask(&crtc->base);
+
 }
 
 static void skl_initial_wm(struct intel_atomic_state *state,