Message ID | CAPM=9txyMmW1DWhS--SuYQu4qDK1GPzgHJwxbAfhHT=hUsPODA@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [git,pull] drm next pull for 5.10-rc1 |
Hi all,

On Thu, Oct 15, 2020 at 11:33:08AM +1000, Dave Airlie wrote:
> Peilin Ye (1):
>       drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl()

This patch is already in mainline:

commit 543e8669ed9bfb30545fd52bc0e047ca4df7fb31
Author: Peilin Ye <yepeilin.cs@gmail.com>
Date:   Tue Jul 28 15:29:24 2020 -0400

    drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl()

It has been applied to linux-next twice, for unknown reasons. Thank you!

Peilin Ye
On Thu, Oct 15, 2020 at 9:59 AM Peilin Ye <yepeilin.cs@gmail.com> wrote:
>
> Hi all,
>
> On Thu, Oct 15, 2020 at 11:33:08AM +1000, Dave Airlie wrote:
> > Peilin Ye (1):
> >       drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl()
>
> This patch is already in mainline:
>
> commit 543e8669ed9bfb30545fd52bc0e047ca4df7fb31
> Author: Peilin Ye <yepeilin.cs@gmail.com>
> Date:   Tue Jul 28 15:29:24 2020 -0400
>
>     drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl()
>
> It has been applied to linux-next twice, for unknown reasons. Thank you!

The patch was already in drm-next, but since it was a bug fix it was
applied to 5.9 as well.

Alex
On Thu, Oct 15, 2020 at 10:53:28AM -0400, Alex Deucher wrote:
> On Thu, Oct 15, 2020 at 9:59 AM Peilin Ye <yepeilin.cs@gmail.com> wrote:
> > It has been applied to linux-next twice, for unknown reasons. Thank you!
>
> The patch was already in drm-next, but since it was a bug fix it was
> applied to 5.9 as well.

Ah, I see. Thank you for the explanation!

Peilin Ye
On Wed, Oct 14, 2020 at 6:33 PM Dave Airlie <airlied@gmail.com> wrote:
>
> There are a bunch of conflicts but none of them seemed overly scary,
> and sfr has provided resolutions for them all. I've put a tree up with
> my merge results, so you can tell me I did it wrong here:
> https://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next-5.10-merged

Thanks, looks good to me, and I missed the same msm_iommu.c patch you did,
so kudos for pointing that one out too.

Linus
The pull request you sent on Thu, 15 Oct 2020 11:33:08 +1000:
> git://anongit.freedesktop.org/drm/drm tags/drm-next-2020-10-15
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/93b694d096cc10994c817730d4d50288f9ae3d66
Thank you!
On Thu, Oct 15, 2020 at 10:51 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Thanks, looks good to me [..]

Uhhuh. I already pushed things out, but my clang build (which I don't
do between each merge) shows a problem:

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:650:8:
warning: logical not is only applied to the left hand side of this
comparison [-Wlogical-not-parentheses]
                && !params[i].clock_force_enable == DSC_CLK_FORCE_DEFAULT) {
                   ^                             ~~

and I think clang is entirely right to complain about that code.

Yes, the code may be correct, but even if it's correct, that's a
really odd way to write things.

Anyway, what it does is:

    !params[i].clock_force_enable

turns 0 into 1, and anything that isn't 0 into 0.

And DSC_CLK_FORCE_DEFAULT has a value of 0, so what that line actually means is

    (params[i].clock_force_enable == 0) == 0

which obviously is

    params[i].clock_force_enable != 0

which in this case is the same as

    params[i].clock_force_enable != DSC_CLK_FORCE_DEFAULT

which may be what the code actually meant to do.

So I suspect it does what it meant to do, but only because
DSC_CLK_FORCE_DEFAULT also happens to be 0, which also acts as a
boolean 'false'.

But it's also possible that the '!' is a left-over, and the code
actually _meant_ to do the exact reverse of that. I have no idea.

This odd code was introduced by commit 0749ddeb7d6c ("drm/amd/display:
Add DSC force disable to dsc_clock_en debugfs entry"), and can we
please agree to not write this kind of obfuscated C code?

Linus
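[Editor's note: for reference, here is a minimal standalone C program, not
from the thread, that reproduces the precedence issue Linus describes. The
DSC_CLK_FORCE_DEFAULT and clock_force_enable names mirror the driver code;
everything else is made up for illustration.]

#include <stdio.h>

#define DSC_CLK_FORCE_DEFAULT 0   /* same value the driver uses */

int main(void)
{
	int clock_force_enable = 2;   /* any non-zero "force" setting */

	/*
	 * '!' binds tighter than '==', so this parses as
	 * (!clock_force_enable) == DSC_CLK_FORCE_DEFAULT, i.e.
	 * (clock_force_enable == 0) == 0, which is just
	 * clock_force_enable != 0.  clang flags it with
	 * -Wlogical-not-parentheses.
	 */
	if (!clock_force_enable == DSC_CLK_FORCE_DEFAULT)
		printf("taken, but only because DSC_CLK_FORCE_DEFAULT is 0\n");

	/* The unambiguous spelling of what the code presumably meant: */
	if (clock_force_enable != DSC_CLK_FORCE_DEFAULT)
		printf("taken, explicit comparison, no warning\n");

	return 0;
}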
On Fri, 16 Oct 2020 at 04:42, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Thu, Oct 15, 2020 at 10:51 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Thanks, looks good to me [..]
>
> Uhhuh. I already pushed things out, but my clang build (which I don't
> do between each merge) shows a problem:
>
> drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:650:8:
> warning: logical not is only applied to the left hand side of this
> comparison [-Wlogical-not-parentheses]
>                 && !params[i].clock_force_enable == DSC_CLK_FORCE_DEFAULT) {
>                    ^                             ~~
>
> and I think clang is entirely right to complain about that code.
>
> Yes, the code may be correct, but even if it's correct, that's a
> really odd way to write things.
>
> Anyway, what it does is:
>
>     !params[i].clock_force_enable
>
> turns 0 into 1, and anything that isn't 0 into 0.
>
> And DSC_CLK_FORCE_DEFAULT has a value of 0, so what that line actually means is
>
>     (params[i].clock_force_enable == 0) == 0
>
> which obviously is
>
>     params[i].clock_force_enable != 0
>
> which in this case is the same as
>
>     params[i].clock_force_enable != DSC_CLK_FORCE_DEFAULT
>
> which may be what the code actually meant to do.
>
> So I suspect it does what it meant to do, but only because
> DSC_CLK_FORCE_DEFAULT also happens to be 0, which also acts as a
> boolean 'false'.
>
> But it's also possible that the '!' is a left-over, and the code
> actually _meant_ to do the exact reverse of that. I have no idea.
>
> This odd code was introduced by commit 0749ddeb7d6c ("drm/amd/display:
> Add DSC force disable to dsc_clock_en debugfs entry"), and can we
> please agree to not write this kind of obfuscated C code?

I've asked Alex to send any fix directly to you to apply, once he's had a
chance to validate it.

Dave.
On Thu, Oct 15, 2020 at 11:33:08AM +1000, Dave Airlie wrote:
> drm/nouveau/kms: Search for encoders' connectors properly
This commit (09838c4efe9a) broke boot for me. These two hunks in
particular:
@@ -2066,7 +2120,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
outp->clr.mask, outp->set.mask);
if (outp->clr.mask) {
- help->disable(encoder);
+ help->atomic_disable(encoder, state);
interlock[NV50_DISP_INTERLOCK_CORE] |= 1;
if (outp->flush_disable) {
nv50_disp_atomic_commit_wndw(state, interlock);
@@ -2105,7 +2159,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
outp->set.mask, outp->clr.mask);
if (outp->set.mask) {
- help->enable(encoder);
+ help->atomic_enable(encoder, state);
interlock[NV50_DISP_INTERLOCK_CORE] = 1;
}
I hacked up a patch to use help->disable/help->enable if the atomic_
versions are NULL. It worked.

In my setup I hit nv50_msto_help->atomic_enable == NULL, but there are two
more drm_encoder_helper_funcs in dispnv50/disp.c that don't have
atomic_enable/disable set: nv50_dac_help and nv50_pior_help.
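[Editor's note: a sketch of the fallback Kirill describes, applied to the two
hunks quoted above. This is an illustration only, not Kirill's actual patch
(which isn't shown in the thread) and not necessarily what was merged: it
prefers the atomic hook and falls back to the legacy one when a helper table
such as nv50_dac_help or nv50_pior_help never gained the atomic callbacks.]

	/* first hunk of nv50_disp_atomic_commit_tail() quoted above */
	if (outp->clr.mask) {
		if (help->atomic_disable)
			help->atomic_disable(encoder, state);
		else
			help->disable(encoder);
		interlock[NV50_DISP_INTERLOCK_CORE] |= 1;
		if (outp->flush_disable) {
			nv50_disp_atomic_commit_wndw(state, interlock);
			...
		}
	}

	/* second hunk quoted above */
	if (outp->set.mask) {
		if (help->atomic_enable)
			help->atomic_enable(encoder, state);
		else
			help->enable(encoder);
		interlock[NV50_DISP_INTERLOCK_CORE] = 1;
	}

[The alternative Linus suggests later in the thread is to convert the
remaining helper tables to the atomic callbacks instead of adding fallbacks.]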
ACK, I will send out a patch for this asap

On Wed, 2020-11-04 at 01:20 +0300, Kirill A. Shutemov wrote:
> On Thu, Oct 15, 2020 at 11:33:08AM +1000, Dave Airlie wrote:
> > drm/nouveau/kms: Search for encoders' connectors properly
>
> This commit (09838c4efe9a) broke boot for me. These two hunks in
> particular:
>
> @@ -2066,7 +2120,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
> 		outp->clr.mask, outp->set.mask);
>
> 	if (outp->clr.mask) {
> -		help->disable(encoder);
> +		help->atomic_disable(encoder, state);
> 		interlock[NV50_DISP_INTERLOCK_CORE] |= 1;
> 		if (outp->flush_disable) {
> 			nv50_disp_atomic_commit_wndw(state, interlock);
> @@ -2105,7 +2159,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
> 		outp->set.mask, outp->clr.mask);
>
> 	if (outp->set.mask) {
> -		help->enable(encoder);
> +		help->atomic_enable(encoder, state);
> 		interlock[NV50_DISP_INTERLOCK_CORE] = 1;
> 	}
>
> I hacked up a patch to use help->disable/help->enable if the atomic_
> versions are NULL. It worked.
>
> In my setup I hit nv50_msto_help->atomic_enable == NULL, but there are two
> more drm_encoder_helper_funcs in dispnv50/disp.c that don't have
> atomic_enable/disable set: nv50_dac_help and nv50_pior_help.
On Wed, Nov 04, 2020 at 04:58:14PM -0500, Lyude Paul wrote:
> ACK, I will send out a patch for this asap
Any update? AFAICS, v5.10-rc3 is still buggy.
Looking at the patches you sent is on my TODO list for this week at least as
a priority, although I really would have hoped that someone from Intel would
have looked by now since it's a regression on their end.

Gentle ping to Vsyrjala and Imre

On Mon, 2020-11-09 at 14:52 +0300, Kirill A. Shutemov wrote:
> On Wed, Nov 04, 2020 at 04:58:14PM -0500, Lyude Paul wrote:
> > ACK, I will send out a patch for this asap
>
> Any update? AFAICS, v5.10-rc3 is still buggy.
On Mon, Nov 09, 2020 at 12:50:48PM -0500, Lyude Paul wrote:
> Looking at the patches you sent is on my TODO list for this week at least as
> a priority, although I really would have hoped that someone from Intel would
> have looked by now since it's a regression on their end.

What regression are you talking about?

> Gentle ping to Vsyrjala and Imre
>
> On Mon, 2020-11-09 at 14:52 +0300, Kirill A. Shutemov wrote:
> > On Wed, Nov 04, 2020 at 04:58:14PM -0500, Lyude Paul wrote:
> > > ACK, I will send out a patch for this asap
> >
> > Any update? AFAICS, v5.10-rc3 is still buggy.
>
> --
> Cheers,
>  Lyude Paul (she/her)
>  Software Engineer at Red Hat
whoops, you can ignore this actually - I got this mixed up with an Intel
issue I was looking at. This is actually a nouveau issue and you guys don't
need to look at this :)

Kirill - I'll get to this asap, but I've got some other stuff on my plate to
go through first. Could you open up a bug on gitlab in the meantime?

On Mon, 2020-11-09 at 21:08 +0200, Ville Syrjälä wrote:
> On Mon, Nov 09, 2020 at 12:50:48PM -0500, Lyude Paul wrote:
> > Looking at the patches you sent is on my TODO list for this week at least
> > as a priority, although I really would have hoped that someone from Intel
> > would have looked by now since it's a regression on their end.
>
> What regression are you talking about?
>
> > Gentle ping to Vsyrjala and Imre
> >
> > On Mon, 2020-11-09 at 14:52 +0300, Kirill A. Shutemov wrote:
> > > On Wed, Nov 04, 2020 at 04:58:14PM -0500, Lyude Paul wrote:
> > > > ACK, I will send out a patch for this asap
> > >
> > > Any update? AFAICS, v5.10-rc3 is still buggy.
> >
> > --
> > Cheers,
> >  Lyude Paul (she/her)
> >  Software Engineer at Red Hat
On 2020/10/15 9:33, Dave Airlie wrote:
> drm/vram-helper: stop using TTM placement flags
This commit (7053e0eab473) produces the call trace below for me:
[ 64.782340] WARNING: CPU: 51 PID: 1964 at drivers/gpu/drm/drm_gem_vram_helper.c:284 drm_gem_vram_offset+0x35/0x40 [drm_vram_helper]
[ 64.782411] CPU: 51 PID: 1964 Comm: Xorg Not tainted 5.10.0-rc3 #12
[ 64.782413] Hardware name: To be filled.
[ 64.782419] RIP: 0010:drm_gem_vram_offset+0x35/0x40 [drm_vram_helper]
[ 64.782424] Code: 00 48 89 e5 85 c0 74 17 48 83 bf 78 01 00 00 00 74 18 48 8b 87 80 01 00 00 5d 48 c1 e0 0c c3 0f 0b 48 c7 c0 ed ff ff ff 5d c3 <0f> 0b 31 c0 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 87 18 06
[ 64.782427] RSP: 0018:ffffa9128909fa68 EFLAGS: 00010246
[ 64.782431] RAX: 0000000000000002 RBX: ffff95a5c25e1ec0 RCX: ffffffffc02b6600
[ 64.782433] RDX: ffff959e49824000 RSI: ffff95a5c25e0b40 RDI: ffff959e4b1c2c00
[ 64.782434] RBP: ffffa9128909fa68 R08: 0000000000000040 R09: ffff95a9c5dcb688
[ 64.782436] R10: 0000000000000000 R11: 0000000000000001 R12: ffff959e49824000
[ 64.782437] R13: 0000000000000000 R14: 0000000000000000 R15: ffff95a5c5c56f00
[ 64.782440] FS: 00007f485d466a80(0000) GS:ffff95a9afcc0000(0000) knlGS:0000000000000000
[ 64.782442] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 64.782444] CR2: 00007f485e202000 CR3: 0000000c82a0e000 CR4: 00000000003506e0
[ 64.782446] Call Trace:
[ 64.782455] ast_cursor_page_flip+0x22/0x100 [ast]
[ 64.782460] ast_cursor_plane_helper_atomic_update+0x46/0x70 [ast]
[ 64.782477] drm_atomic_helper_commit_planes+0xbd/0x220 [drm_kms_helper]
[ 64.782493] drm_atomic_helper_commit_tail_rpm+0x3a/0x70 [drm_kms_helper]
[ 64.782507] commit_tail+0x99/0x130 [drm_kms_helper]
[ 64.782521] drm_atomic_helper_commit+0x123/0x150 [drm_kms_helper]
[ 64.782551] drm_atomic_commit+0x4a/0x50 [drm]
[ 64.782565] drm_atomic_helper_update_plane+0xe7/0x140 [drm_kms_helper]
[ 64.782592] __setplane_atomic+0xcc/0x110 [drm]
[ 64.782619] drm_mode_cursor_universal+0x13e/0x260 [drm]
[ 64.782647] drm_mode_cursor_common+0xef/0x220 [drm]
[ 64.782654] ? tomoyo_path_number_perm+0x6f/0x200
[ 64.782680] ? drm_mode_cursor_ioctl+0x60/0x60 [drm]
[ 64.782706] drm_mode_cursor2_ioctl+0xe/0x10 [drm]
[ 64.782727] drm_ioctl_kernel+0xae/0xf0 [drm]
[ 64.782749] drm_ioctl+0x241/0x3f0 [drm]
[ 64.782774] ? drm_mode_cursor_ioctl+0x60/0x60 [drm]
[ 64.782781] ? tomoyo_file_ioctl+0x19/0x20
[ 64.782787] __x64_sys_ioctl+0x91/0xc0
[ 64.782792] do_syscall_64+0x38/0x90
[ 64.782797] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 64.782800] RIP: 0033:0x7f485d7c637b
[ 64.782804] Code: 0f 1e fa 48 8b 05 15 3b 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e5 3a 0d 00 f7 d8 64 89 01 48
[ 64.782805] RSP: 002b:00007fff78682a28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 64.782808] RAX: ffffffffffffffda RBX: 00007fff78682a60 RCX: 00007f485d7c637b
[ 64.782810] RDX: 00007fff78682a60 RSI: 00000000c02464bb RDI: 000000000000000c
[ 64.782811] RBP: 00000000c02464bb R08: 0000000000000040 R09: 0000000000000004
[ 64.782813] R10: 0000000000000002 R11: 0000000000000246 R12: 0000558647745e40
[ 64.782814] R13: 000000000000000c R14: 0000000000000002 R15: 00000000000002af
[ 64.782820] CPU: 51 PID: 1964 Comm: Xorg Not tainted 5.10.0-rc3 #12
[ 64.782821] Hardware name: To be filled.
[ 64.782822] Call Trace:
[ 64.782828] dump_stack+0x74/0x92
[ 64.782832] ? drm_gem_vram_offset+0x35/0x40 [drm_vram_helper]
[ 64.782836] __warn.cold+0x24/0x3f
[ 64.782840] ? drm_gem_vram_offset+0x35/0x40 [drm_vram_helper]
[ 64.782844] report_bug+0xd6/0x100
[ 64.782847] handle_bug+0x39/0x80
[ 64.782850] exc_invalid_op+0x19/0x70
[ 64.782853] asm_exc_invalid_op+0x12/0x20
......
I hacked up a patch and found that this hunk in particular introduced the call trace:
@@ -135,20 +135,23 @@ static void ttm_buffer_object_destroy(struct ttm_buffer_object *bo)
......
+ if (pl_flag & DRM_GEM_VRAM_PL_FLAG_TOPDOWN)
+ pl_flag = TTM_PL_FLAG_TOPDOWN;
It seems that these two lines cause gbo->placements[c].mem_type to be
forcibly set to TTM_PL_SYSTEM in the following hunks, which causes the
problem, even though pl_flag is DRM_GEM_VRAM_PL_FLAG_VRAM &
DRM_GEM_VRAM_PL_FLAG_TOPDOWN.
If I comment out these two lines, there is no call trace anymore.
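[Editor's note: in other words, overwriting pl_flag discards the
DRM_GEM_VRAM_PL_FLAG_VRAM bit, so the subsequent checks only ever set up the
system-memory placement even when VRAM placement was requested. Below is a
sketch of one way to keep the top-down request without clobbering pl_flag.
The surrounding placement logic, including the DRM_GEM_VRAM_PL_FLAG_SYSTEM
branch, is paraphrased from the report rather than copied from the driver,
and this is not necessarily the fix that was merged.]

	/*
	 * Inside the helper that fills gbo->placements[]; 'c' counts the
	 * entries.  Keep the TTM flag in a separate local instead of
	 * overwriting pl_flag, so the DRM_GEM_VRAM_PL_FLAG_VRAM bit survives
	 * the checks below.  Illustrative sketch only.
	 */
	u32 extra_flags = 0;

	if (pl_flag & DRM_GEM_VRAM_PL_FLAG_TOPDOWN)
		extra_flags = TTM_PL_FLAG_TOPDOWN;

	if (pl_flag & DRM_GEM_VRAM_PL_FLAG_VRAM) {
		gbo->placements[c].mem_type = TTM_PL_VRAM;
		gbo->placements[c++].flags = extra_flags;
	}

	/* system-memory fallback branch, assumed from the report's description */
	if ((pl_flag & DRM_GEM_VRAM_PL_FLAG_SYSTEM) || !c) {
		gbo->placements[c].mem_type = TTM_PL_SYSTEM;
		gbo->placements[c++].flags = extra_flags;
	}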
JFYI - looking at this today

On Mon, 2020-11-09 at 14:52 +0300, Kirill A. Shutemov wrote:
> On Wed, Nov 04, 2020 at 04:58:14PM -0500, Lyude Paul wrote:
> > ACK, I will send out a patch for this asap
>
> Any update? AFAICS, v5.10-rc3 is still buggy.
On Tue, Nov 3, 2020 at 2:20 PM Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Thu, Oct 15, 2020 at 11:33:08AM +1000, Dave Airlie wrote:
> > drm/nouveau/kms: Search for encoders' connectors properly
>
> This commit (09838c4efe9a) broke boot for me. These two hunks in
> particular:

Christ. It's been two weeks. I'm doing -rc4 today, and I still don't
have the fix.

The problem seems entirely obvious, as reported by Kirill: the nv50
code unconditionally calls the "atomic_{dis,en}able()" functions, even
when not everybody was converted.

The fix seems to be to either just do the conversion of the remaining
cases (which looks like just adding an argument to the remaining
functions, and using that for the "atomic" callback), or the trivial
suggestion by Kirill from two weeks ago:

> I hacked up a patch to use help->disable/help->enable if the atomic_
> versions are NULL. It worked.

Kirill, since the nouveau people aren't fixing this, can you just send
me your tested patch?

Lyude/Ben - let me just say that I think this is all a huge disgrace.

You had a problem report with a bisected commit, a suggested fix, and
two weeks later there's absolutely _nothing_.

Linus
On Mon, 16 Nov 2020 at 03:57, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Tue, Nov 3, 2020 at 2:20 PM Kirill A. Shutemov <kirill@shutemov.name> wrote:
> >
> > On Thu, Oct 15, 2020 at 11:33:08AM +1000, Dave Airlie wrote:
> > > drm/nouveau/kms: Search for encoders' connectors properly
> >
> > This commit (09838c4efe9a) broke boot for me. These two hunks in
> > particular:
>
> Christ. It's been two weeks. I'm doing -rc4 today, and I still don't
> have the fix.
>
> The problem seems entirely obvious, as reported by Kirill: the nv50
> code unconditionally calls the "atomic_{dis,en}able()" functions, even
> when not everybody was converted.
>
> The fix seems to be to either just do the conversion of the remaining
> cases (which looks like just adding an argument to the remaining
> functions, and using that for the "atomic" callback), or the trivial
> suggestion by Kirill from two weeks ago:
>
> > I hacked up a patch to use help->disable/help->enable if the atomic_
> > versions are NULL. It worked.
>
> Kirill, since the nouveau people aren't fixing this, can you just send
> me your tested patch?
>
> Lyude/Ben - let me just say that I think this is all a huge disgrace.
>
> You had a problem report with a bisected commit, a suggested fix, and
> two weeks later there's absolutely _nothing_.

I do have a fixes pull from Ben from Saturday in my inbox, I can send it on
this morning.

Dave.
>
> Christ. It's been two weeks. I'm doing -rc4 today, and I still don't
> have the fix.
>
> The problem seems entirely obvious, as reported by Kirill: the nv50
> code unconditionally calls the "atomic_{dis,en}able()" functions, even
> when not everybody was converted.
>
> The fix seems to be to either just do the conversion of the remaining
> cases (which looks like just adding an argument to the remaining
> functions, and using that for the "atomic" callback), or the trivial
> suggestion by Kirill from two weeks ago:
>
> > I hacked up a patch to use help->disable/help->enable if the atomic_
> > versions are NULL. It worked.
>
> Kirill, since the nouveau people aren't fixing this, can you just send
> me your tested patch?
>
> Lyude/Ben - let me just say that I think this is all a huge disgrace.
>
> You had a problem report with a bisected commit, a suggested fix, and
> two weeks later there's absolutely _nothing_.

I would like to say that when you sent this, there were patches on the
mailing lists with Kirill cc'ed, and a pull request from Ben outstanding to
me on the mailing list, with the patches in it reviewed. Maybe you weren't
cc'ed on it, but stuff has certainly happened in that timeframe, and I was
keeping track of it so it didn't fall down a hole. _nothing_ is a lot more a
reflection on your research than on the ongoing process. There were some
delays here, and maybe we need to communicate better when we are flat out
dealing with other, more urgent tasks that pay the actual wages.

Dave.