Oops (NULL pointer dereference) in radeon_fence_ref in 3.14.63

Message ID 56DE3EF2.7070609@amd.com (mailing list archive)
State New, archived

Commit Message

Nicolai Hähnle March 8, 2016, 2:54 a.m. UTC
Hi,

On 05.03.2016 16:24, Christian König wrote:
> just an educated guess, but I think the problem is simply that kernel
> 3.14 doesn't yet contain the code that lets radeon_fence_get() be safely
> called with a NULL pointer.
>
> So the backport of Nicolai's patch needs an extra check for the case
> when the fence is NULL.

Oops indeed. Only the ref call should need the guard; the unref has
always had a NULL pointer test, as far as I can see.
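
To illustrate the kind of guard discussed above, here is a minimal userspace sketch. It is not the attached patch and not the radeon code: struct fence_model, fence_ref(), fence_ref_guarded() and collect_fences() are hypothetical stand-ins, and the per-slot loop is only an assumption about how a caller might hold one optional fence per slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted fence; only the count matters here. */
struct fence_model {
    int refcount;
};

/*
 * Unguarded ref: it dereferences its argument unconditionally.
 * The assert marks the spot where a NULL argument would fault.
 */
static struct fence_model *fence_ref(struct fence_model *fence)
{
    assert(fence != NULL);
    fence->refcount++;
    return fence;
}

/* Caller-side guard: only take a reference when a fence is present. */
static struct fence_model *fence_ref_guarded(struct fence_model *fence)
{
    return fence ? fence_ref(fence) : NULL;
}

/*
 * Take a reference on every non-NULL fence in 'in', storing the result
 * in 'out'. Empty slots stay NULL. Returns the number of references taken.
 */
static size_t collect_fences(struct fence_model *in[],
                             struct fence_model *out[], size_t n)
{
    size_t held = 0;

    for (size_t i = 0; i < n; i++) {
        out[i] = fence_ref_guarded(in[i]);
        if (out[i])
            held++;
    }
    return held;
}
```

With the guard in place, an empty slot is simply skipped instead of being passed to the ref function, which is the case the oops below appears to hit (RDI is 0 at the trapping instruction).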

Lutz, could you please test whether the attached patch on top of 3.14.63 
fixes the problem?

Thanks,
Nicolai

> Regards,
> Christian.
>
> Am 05.03.2016 um 18:16 schrieb Lutz Euler:
>> Hi,
>>
>> after upgrading from kernel 3.14.62 to 3.14.63, while surfing, the
>> screen suddenly got black and the mouse cursor froze. I had to reset
>> the machine and found an oops followed by repeated messages
>> "BUG: scheduling while atomic: Xorg/3757/0x00000002" in the logs.
>> I have copied the oops and the first of these messages below.
>>
>> This was repeatable: After the reboot, when the browser restored its
>> tabs, the oops occurred again. I then rebooted into 3.14.62 and the
>> problem didn't occur again.
>>
>> Just guessing: Of the commits regarding radeon between these two
>> kernel versions might the following be involved
>>
>> Nicolai Hähnle (1):
>>        drm/radeon: hold reference to fences in radeon_sa_bo_new
>>
>> as it mentions fences and the stack trace starts with radeon_sa_bo_new?
>>
>> Thanks and Regards,
>>
>> Lutz
>>
>> From lspci -v:
>>
>> 05:00.0 VGA compatible controller: ATI Technologies Inc NI Caicos [AMD
>> RADEON HD 6450] (prog-if 00 [VGA controller])
>>     Subsystem: PC Partner Limited Device e164
>>     Flags: bus master, fast devsel, latency 0, IRQ 53
>>     Memory at d0000000 (64-bit, prefetchable) [size=256M]
>>     Memory at fe9e0000 (64-bit, non-prefetchable) [size=128K]
>>     I/O ports at e000 [size=256]
>>     Expansion ROM at fe9c0000 [disabled] [size=128K]
>>     Capabilities: [50] Power Management version 3
>>     Capabilities: [58] Express Legacy Endpoint, MSI 00
>>     Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>     Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1
>> Len=010 <?>
>>     Capabilities: [150] Advanced Error Reporting
>>     Kernel driver in use: radeon
>>     Kernel modules: radeon
>>
>> X.Org version: 1.10.4
>>
>> decodecode of the oops:
>>
>> Mar 5 15:04:58 lutz kernel: [ 6995.216776] Code: c7 c6 d8 3e 36 a0 31
>> c0 45 31 e4 e8 7b 70 15 e1 eb b5 66 0f 1f 84 00 00 00 00 00 55 48 89
>> f8 ba 01 00 00 00 48 89 e5 48 83 ec 10 <f0> 0f c1 57 08 ff c2 ff ca 7e
>> 02 c9 c3 80 3d 10 ae 10 00 01 74
>> All code
>> ========
>>     0:    c7 c6 d8 3e 36 a0        mov    $0xa0363ed8,%esi
>>     6:    31 c0                    xor    %eax,%eax
>>     8:    45 31 e4                 xor    %r12d,%r12d
>>     b:    e8 7b 70 15 e1           callq  0xffffffffe115708b
>>    10:    eb b5                    jmp    0xffffffffffffffc7
>>    12:    66 0f 1f 84 00 00 00     nopw   0x0(%rax,%rax,1)
>>    19:    00 00
>>    1b:    55                       push   %rbp
>>    1c:    48 89 f8                 mov    %rdi,%rax
>>    1f:    ba 01 00 00 00           mov    $0x1,%edx
>>    24:    48 89 e5                 mov    %rsp,%rbp
>>    27:    48 83 ec 10              sub    $0x10,%rsp
>>    2b:*    f0 0f c1 57 08           lock xadd %edx,0x8(%rdi)        <-- trapping instruction
>>    30:    ff c2                    inc    %edx
>>    32:    ff ca                    dec    %edx
>>    34:    7e 02                    jle    0x38
>>    36:    c9                       leaveq
>>    37:    c3                       retq
>>    38:    80 3d 10 ae 10 00 01     cmpb   $0x1,0x10ae10(%rip)        #
>> 0x10ae4f
>>    3f:    74                       .byte 0x74
>>
>> Code starting with the faulting instruction
>> ===========================================
>>     0:    f0 0f c1 57 08           lock xadd %edx,0x8(%rdi)
>>     5:    ff c2                    inc    %edx
>>     7:    ff ca                    dec    %edx
>>     9:    7e 02                    jle    0xd
>>     b:    c9                       leaveq
>>     c:    c3                       retq
>>     d:    80 3d 10 ae 10 00 01     cmpb   $0x1,0x10ae10(%rip)        #
>> 0x10ae24
>>    14:    74                       .byte 0x74
>>
>> Mar  5 15:04:58 lutz kernel: [ 6995.192330] BUG: unable to handle
>> kernel NULL pointer dereference at 0000000000000008
>> Mar  5 15:04:58 lutz kernel: [ 6995.192375] IP: [<ffffffffa02776c0>]
>> radeon_fence_ref+0x10/0x50 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.192441] PGD 22a86a067 PUD
>> 22d8e8067 PMD 0
>> Mar  5 15:04:58 lutz kernel: [ 6995.192463] Oops: 0002 [#1] PREEMPT SMP
>> Mar  5 15:04:58 lutz kernel: [ 6995.192484] Modules linked in:
>> binfmt_misc parport_pc ppdev snd_hda_codec_hdmi snd_opl3_synth
>> snd_seq_midi_emul snd_hda_intel snd_hda_codec snd_es1938 gameport
>> snd_pcm_oss snd_mixer_oss snd_seq_dummy snd_pcm snd_seq_oss
>> snd_opl3_lib snd_hwdep snd_mpu401_uart snd_seq_midi snd_rawmidi
>> snd_seq_midi_event snd_seq psmouse fbcon tileblit edac_core font
>> serio_raw i2c_piix4 bitblit softcursor radeon snd_seq_device snd_timer
>> hwmon_vid ttm drm_kms_helper asus_atk0110 drm i2c_algo_bit snd
>> soundcore lp parport usbhid btrfs raid6_pq zlib_deflate xor r8169 mii
>> xhci_hcd libcrc32c
>> Mar  5 15:04:58 lutz kernel: [ 6995.192741] CPU: 5 PID: 3757 Comm:
>> Xorg Not tainted 3.14.63 #1
>> Mar  5 15:04:58 lutz kernel: [ 6995.192765] Hardware name: System
>> manufacturer System Product Name/M4A87TD/USB3, BIOS 1202    02/17/2011
>> Mar  5 15:04:58 lutz kernel: [ 6995.192803] task: ffff88022c1d5f80 ti:
>> ffff8800c42a6000 task.ti: ffff8800c42a6000
>> Mar  5 15:04:58 lutz kernel: [ 6995.192833] RIP:
>> 0010:[<ffffffffa02776c0>]  [<ffffffffa02776c0>]
>> radeon_fence_ref+0x10/0x50 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.192881] RSP:
>> 0018:ffff8800c42a7a68  EFLAGS: 00010282
>> Mar  5 15:04:58 lutz kernel: [ 6995.192903] RAX: 0000000000000000 RBX:
>> ffff8800c5d61328 RCX: 0000000000000000
>> Mar  5 15:04:58 lutz kernel: [ 6995.192931] RDX: 0000000000000001 RSI:
>> 0000000000000000 RDI: 0000000000000000
>> Mar  5 15:04:58 lutz kernel: [ 6995.192960] RBP: ffff8800c42a7a78 R08:
>> ffff8800c5d60000 R09: 0000000000000000
>> Mar  5 15:04:58 lutz kernel: [ 6995.192988] R10: 0000000000000000 R11:
>> 0000000000000000 R12: 0000000000000100
>> Mar  5 15:04:58 lutz kernel: [ 6995.193016] R13: ffff8800c5d613b0 R14:
>> ffff8800c42a7ae8 R15: ffff8800c42a7b08
>> Mar  5 15:04:58 lutz kernel: [ 6995.193045] FS:
>> 00007f7ebe91d8a0(0000) GS:ffff880237d40000(0000) knlGS:0000000000000000
>> Mar  5 15:04:58 lutz kernel: [ 6995.193076] CS:  0010 DS: 0000 ES:
>> 0000 CR0: 0000000080050033
>> Mar  5 15:04:58 lutz kernel: [ 6995.193100] CR2: 0000000000000008 CR3:
>> 000000022de3e000 CR4: 00000000000007e0
>> Mar  5 15:04:58 lutz kernel: [ 6995.193128] Stack:
>> Mar  5 15:04:58 lutz kernel: [ 6995.193137]  ffff880159dcef00
>> ffff8800c5d61328 ffff8800c42a7b88 ffffffffa02d9f22
>> Mar  5 15:04:58 lutz kernel: [ 6995.193170]  01ff8800c42a7a98
>> ffff8800c5d60000 ffff8800c42a7b08 ffff8800c42a7c88
>> Mar  5 15:04:58 lutz kernel: [ 6995.193203]  0000f00455ee5200
>> ffff8800c5d613b0 00000100c42a7b08 0006e97800200000
>> Mar  5 15:04:58 lutz kernel: [ 6995.193237] Call Trace:
>> Mar  5 15:04:58 lutz kernel: [ 6995.194709]  [<ffffffffa02d9f22>]
>> radeon_sa_bo_new+0x302/0x590 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.196175]  [<ffffffff811551e6>] ?
>> __kmalloc+0x66/0x1d0
>> Mar  5 15:04:58 lutz kernel: [ 6995.197659]  [<ffffffffa028f849>] ?
>> copy_from_user+0x9/0x10 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.199160]  [<ffffffffa028dcb3>]
>> radeon_ib_get+0x43/0xe0 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.200655]  [<ffffffffa0290111>]
>> radeon_cs_ioctl+0x201/0xac0 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.202135]  [<ffffffff812ffe52>] ?
>> idr_mark_full+0x52/0x70
>> Mar  5 15:04:58 lutz kernel: [ 6995.203620]  [<ffffffffa01b6b31>]
>> drm_ioctl+0x401/0x580 [drm]
>> Mar  5 15:04:58 lutz kernel: [ 6995.205094]  [<ffffffffa028ff10>] ?
>> radeon_cs_parser_init+0x470/0x470 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.206564]  [<ffffffff816572dc>] ?
>> __do_page_fault+0x1cc/0x560
>> Mar  5 15:04:58 lutz kernel: [ 6995.208033]  [<ffffffffa025e349>]
>> radeon_drm_ioctl+0x9/0x10 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.209510]  [<ffffffff8116fe21>]
>> do_vfs_ioctl+0x81/0x4d0
>> Mar  5 15:04:58 lutz kernel: [ 6995.210975]  [<ffffffff81137f29>] ?
>> SyS_mmap_pgoff+0xd9/0x210
>> Mar  5 15:04:58 lutz kernel: [ 6995.212432]  [<ffffffff81170301>]
>> SyS_ioctl+0x91/0xa0
>> Mar  5 15:04:58 lutz kernel: [ 6995.213874]  [<ffffffff8165767c>] ?
>> do_page_fault+0xc/0x10
>> Mar  5 15:04:58 lutz kernel: [ 6995.215323]  [<ffffffff8165b542>]
>> system_call_fastpath+0x16/0x1b
>> Mar  5 15:04:58 lutz kernel: [ 6995.216776] Code: c7 c6 d8 3e 36 a0 31
>> c0 45 31 e4 e8 7b 70 15 e1 eb b5 66 0f 1f 84 00 00 00 00 00 55 48 89
>> f8 ba 01 00 00 00 48 89 e5 48 83 ec 10 <f0> 0f c1 57 08 ff c2 ff ca 7e
>> 02 c9 c3 80 3d 10 ae 10 00 01 74
>> Mar  5 15:04:58 lutz kernel: [ 6995.218402] RIP  [<ffffffffa02776c0>]
>> radeon_fence_ref+0x10/0x50 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.219987]  RSP <ffff8800c42a7a68>
>> Mar  5 15:04:58 lutz kernel: [ 6995.221568] CR2: 0000000000000008
>> Mar  5 15:04:58 lutz kernel: [ 6995.234168] ---[ end trace
>> 37c0545f39a2b3e5 ]---
>>
>> Mar  5 15:04:58 lutz kernel: [ 6995.234170] note: Xorg[3757] exited
>> with preempt_count 1
>> Mar  5 15:04:58 lutz kernel: [ 6995.239926] BUG: scheduling while
>> atomic: Xorg/3757/0x00000002
>> Mar  5 15:04:58 lutz kernel: [ 6995.239931] Modules linked in:
>> binfmt_misc parport_pc ppdev snd_hda_codec_hdmi snd_opl3_synth
>> snd_seq_midi_emul snd_hda_intel snd_hda_codec snd_es1938 gameport
>> snd_pcm_oss snd_mixer_oss snd_seq_dummy snd_pcm snd_seq_oss
>> snd_opl3_lib snd_hwdep snd_mpu401_uart snd_seq_midi snd_rawmidi
>> snd_seq_midi_event snd_seq psmouse fbcon tileblit edac_core font
>> serio_raw i2c_piix4 bitblit softcursor radeon snd_seq_device snd_timer
>> hwmon_vid ttm drm_kms_helper asus_atk0110 drm i2c_algo_bit snd
>> soundcore lp parport usbhid btrfs raid6_pq zlib_deflate xor r8169 mii
>> xhci_hcd libcrc32c
>> Mar  5 15:04:58 lutz kernel: [ 6995.239960] CPU: 5 PID: 3757 Comm:
>> Xorg Tainted: G      D      3.14.63 #1
>> Mar  5 15:04:58 lutz kernel: [ 6995.239961] Hardware name: System
>> manufacturer System Product Name/M4A87TD/USB3, BIOS 1202    02/17/2011
>> Mar  5 15:04:58 lutz kernel: [ 6995.239963]  0000000000000000
>> ffff8800c42a7078 ffffffff8164f89c ffff880237d518c0
>> Mar  5 15:04:58 lutz kernel: [ 6995.239965]  0000000000000005
>> ffff8800c42a7088 ffffffff81075d27 ffff8800c42a70f8
>> Mar  5 15:04:58 lutz kernel: [ 6995.239967]  ffffffff81650680
>> ffff8800c42a7184 ffff88022c1d5f80 ffff8800c42a6000
>> Mar  5 15:04:58 lutz kernel: [ 6995.239969] Call Trace:
>> Mar  5 15:04:58 lutz kernel: [ 6995.239976]  [<ffffffff8164f89c>]
>> dump_stack+0x4f/0x6b
>> Mar  5 15:04:58 lutz kernel: [ 6995.239980]  [<ffffffff81075d27>]
>> __schedule_bug+0x47/0x60
>> Mar  5 15:04:58 lutz kernel: [ 6995.239982]  [<ffffffff81650680>]
>> __schedule+0x600/0x810
>> Mar  5 15:04:58 lutz kernel: [ 6995.239984]  [<ffffffff81650984>]
>> schedule+0x24/0x70
>> Mar  5 15:04:58 lutz kernel: [ 6995.239986]  [<ffffffff8164fa11>]
>> schedule_timeout+0x151/0x320
>> Mar  5 15:04:58 lutz kernel: [ 6995.239989]  [<ffffffff81056b60>] ?
>> call_timer_fn+0x1a0/0x1a0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240028]  [<ffffffffa027713a>]
>> radeon_fence_wait_seq+0x3ca/0x530 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240043]  [<ffffffffa02c1740>] ?
>> evergreen_program_watermarks+0x150/0x670 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240046]  [<ffffffff8108a210>] ?
>> bit_waitqueue+0xd0/0xd0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240062]  [<ffffffffa02c004b>] ?
>> evergreen_line_buffer_adjust.clone.5+0xcb/0x1a0 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240073]  [<ffffffffa0277680>]
>> radeon_fence_wait_empty_locked+0xb0/0xe0 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240088]  [<ffffffffa02bb83d>]
>> radeon_pm_compute_clocks+0x59d/0x840 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240100]  [<ffffffffa026a69f>]
>> atombios_crtc_dpms+0x6f/0x100 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240111]  [<ffffffffa026bc08>]
>> atombios_crtc_disable+0x28/0x2f0 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240126]  [<ffffffffa02d8844>] ?
>> radeon_atom_encoder_dpms+0xf4/0x230 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240132]  [<ffffffffa021b305>]
>> drm_helper_disable_unused_functions+0x105/0x160 [drm_kms_helper]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240135]  [<ffffffffa021c610>]
>> drm_crtc_helper_set_config+0x130/0x970 [drm_kms_helper]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240138]  [<ffffffff81155962>] ?
>> __slab_free+0x302/0x480
>> Mar  5 15:04:58 lutz kernel: [ 6995.240151]  [<ffffffffa0285061>]
>> radeon_crtc_set_config+0x31/0xb0 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240172]  [<ffffffffa01c16ac>]
>> drm_mode_set_config_internal+0x5c/0xe0 [drm]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240180]  [<ffffffffa01c30bd>]
>> drm_framebuffer_remove+0x10d/0x160 [drm]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240188]  [<ffffffffa01c6605>]
>> drm_fb_release+0xa5/0xd0 [drm]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240195]  [<ffffffffa01b7e18>]
>> drm_release+0x548/0x5c0 [drm]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240198]  [<ffffffff8115f5a4>]
>> __fput+0xa4/0x220
>> Mar  5 15:04:58 lutz kernel: [ 6995.240199]  [<ffffffff8115f769>]
>> ____fput+0x9/0x10
>> Mar  5 15:04:58 lutz kernel: [ 6995.240202]  [<ffffffff8106a80c>]
>> task_work_run+0xac/0xd0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240205]  [<ffffffff8104dc30>]
>> do_exit+0x2c0/0xac0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240207]  [<ffffffff8164f77b>] ?
>> printk+0x48/0x4a
>> Mar  5 15:04:58 lutz kernel: [ 6995.240209]  [<ffffffff8109bc29>] ?
>> kmsg_dump+0xb9/0xd0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240211]  [<ffffffff8165553a>]
>> oops_end+0x9a/0xe0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240214]  [<ffffffff8103ee32>]
>> no_context+0x132/0x2e0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240216]  [<ffffffff8103f0f5>]
>> __bad_area_nosemaphore+0x115/0x210
>> Mar  5 15:04:58 lutz kernel: [ 6995.240218]  [<ffffffff811136d1>] ?
>> __alloc_pages_nodemask+0x141/0x910
>> Mar  5 15:04:58 lutz kernel: [ 6995.240220]  [<ffffffff8103f1fe>]
>> bad_area_nosemaphore+0xe/0x10
>> Mar  5 15:04:58 lutz kernel: [ 6995.240222]  [<ffffffff81657442>]
>> __do_page_fault+0x332/0x560
>> Mar  5 15:04:58 lutz kernel: [ 6995.240230]  [<ffffffffa022b8bb>] ?
>> ttm_mem_global_alloc_zone.clone.1+0x12b/0x150 [ttm]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240233]  [<ffffffffa022bc12>] ?
>> ttm_mem_global_alloc_page+0x42/0x50 [ttm]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240237]  [<ffffffff8114dd3e>] ?
>> alloc_pages_current+0xae/0x170
>> Mar  5 15:04:58 lutz kernel: [ 6995.240238]  [<ffffffff8165767c>]
>> do_page_fault+0xc/0x10
>> Mar  5 15:04:58 lutz kernel: [ 6995.240240]  [<ffffffff816549d2>]
>> page_fault+0x22/0x30
>> Mar  5 15:04:58 lutz kernel: [ 6995.240252]  [<ffffffffa02776c0>] ?
>> radeon_fence_ref+0x10/0x50 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240268]  [<ffffffffa02d9f22>]
>> radeon_sa_bo_new+0x302/0x590 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240270]  [<ffffffff811551e6>] ?
>> __kmalloc+0x66/0x1d0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240283]  [<ffffffffa028f849>] ?
>> copy_from_user+0x9/0x10 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240296]  [<ffffffffa028dcb3>]
>> radeon_ib_get+0x43/0xe0 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240309]  [<ffffffffa0290111>]
>> radeon_cs_ioctl+0x201/0xac0 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240312]  [<ffffffff812ffe52>] ?
>> idr_mark_full+0x52/0x70
>> Mar  5 15:04:58 lutz kernel: [ 6995.240320]  [<ffffffffa01b6b31>]
>> drm_ioctl+0x401/0x580 [drm]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240333]  [<ffffffffa028ff10>] ?
>> radeon_cs_parser_init+0x470/0x470 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240335]  [<ffffffff816572dc>] ?
>> __do_page_fault+0x1cc/0x560
>> Mar  5 15:04:58 lutz kernel: [ 6995.240346]  [<ffffffffa025e349>]
>> radeon_drm_ioctl+0x9/0x10 [radeon]
>> Mar  5 15:04:58 lutz kernel: [ 6995.240348]  [<ffffffff8116fe21>]
>> do_vfs_ioctl+0x81/0x4d0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240351]  [<ffffffff81137f29>] ?
>> SyS_mmap_pgoff+0xd9/0x210
>> Mar  5 15:04:58 lutz kernel: [ 6995.240353]  [<ffffffff81170301>]
>> SyS_ioctl+0x91/0xa0
>> Mar  5 15:04:58 lutz kernel: [ 6995.240355]  [<ffffffff8165767c>] ?
>> do_page_fault+0xc/0x10
>> Mar  5 15:04:58 lutz kernel: [ 6995.240357]  [<ffffffff8165b542>]
>> system_call_fastpath+0x16/0x1b
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Comments

Christian König March 8, 2016, 8:50 a.m. UTC | #1
Am 08.03.2016 um 03:54 schrieb Nicolai Hähnle:
> Hi,
>
> On 05.03.2016 16:24, Christian König wrote:
>> just an educated guess, but I think the problem is simply that kernel
>> 3.14 doesn't yet contain the code that lets radeon_fence_get() be safely
>> called with a NULL pointer.
>>
>> So the backport of Nicolai's patch needs an extra check for the case
>> when the fence is NULL.
>
> Oops indeed. Only the ref call should need the guard; the unref has
> always had a NULL pointer test, as far as I can see.
>
> Lutz, could you please test whether the attached patch on top of 
> 3.14.63 fixes the problem?

Patch is Reviewed-by: Christian König <christian.koenig@amd.com>.

Regards,
Christian.

>
> Thanks,
> Nicolai
>
>> Regards,
>> Christian.
>>
>> Am 05.03.2016 um 18:16 schrieb Lutz Euler:
>>> Hi,
>>>
>>> after upgrading from kernel 3.14.62 to 3.14.63, while surfing, the
>>> screen suddenly got black and the mouse cursor froze. I had to reset
>>> the machine and found an oops followed by repeated messages
>>> "BUG: scheduling while atomic: Xorg/3757/0x00000002" in the logs.
>>> I have copied the oops and the first of these messages below.
>>>
>>> This was repeatable: After the reboot, when the browser restored its
>>> tabs, the oops occurred again. I then rebooted into 3.14.62 and the
>>> problem didn't occur again.
>>>
>>> Just guessing: Of the commits regarding radeon between these two
>>> kernel versions might the following be involved
>>>
>>> Nicolai Hähnle (1):
>>>        drm/radeon: hold reference to fences in radeon_sa_bo_new
>>>
>>> as it mentions fences and the stack trace starts with radeon_sa_bo_new?
>>>
>>> Thanks and Regards,
>>>
>>> Lutz
>>>
>>> From lspci -v:
>>>
>>> 05:00.0 VGA compatible controller: ATI Technologies Inc NI Caicos [AMD
>>> RADEON HD 6450] (prog-if 00 [VGA controller])
>>>     Subsystem: PC Partner Limited Device e164
>>>     Flags: bus master, fast devsel, latency 0, IRQ 53
>>>     Memory at d0000000 (64-bit, prefetchable) [size=256M]
>>>     Memory at fe9e0000 (64-bit, non-prefetchable) [size=128K]
>>>     I/O ports at e000 [size=256]
>>>     Expansion ROM at fe9c0000 [disabled] [size=128K]
>>>     Capabilities: [50] Power Management version 3
>>>     Capabilities: [58] Express Legacy Endpoint, MSI 00
>>>     Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>>     Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1
>>> Len=010 <?>
>>>     Capabilities: [150] Advanced Error Reporting
>>>     Kernel driver in use: radeon
>>>     Kernel modules: radeon
>>>
>>> X.Org version: 1.10.4
>>>
>>> decodecode of the oops:
>>>
>>> Mar 5 15:04:58 lutz kernel: [ 6995.216776] Code: c7 c6 d8 3e 36 a0 31
>>> c0 45 31 e4 e8 7b 70 15 e1 eb b5 66 0f 1f 84 00 00 00 00 00 55 48 89
>>> f8 ba 01 00 00 00 48 89 e5 48 83 ec 10 <f0> 0f c1 57 08 ff c2 ff ca 7e
>>> 02 c9 c3 80 3d 10 ae 10 00 01 74
>>> All code
>>> ========
>>>     0:    c7 c6 d8 3e 36 a0        mov    $0xa0363ed8,%esi
>>>     6:    31 c0                    xor    %eax,%eax
>>>     8:    45 31 e4                 xor    %r12d,%r12d
>>>     b:    e8 7b 70 15 e1           callq  0xffffffffe115708b
>>>    10:    eb b5                    jmp    0xffffffffffffffc7
>>>    12:    66 0f 1f 84 00 00 00     nopw   0x0(%rax,%rax,1)
>>>    19:    00 00
>>>    1b:    55                       push   %rbp
>>>    1c:    48 89 f8                 mov    %rdi,%rax
>>>    1f:    ba 01 00 00 00           mov    $0x1,%edx
>>>    24:    48 89 e5                 mov    %rsp,%rbp
>>>    27:    48 83 ec 10              sub    $0x10,%rsp
>>>    2b:*    f0 0f c1 57 08           lock xadd %edx,0x8(%rdi)        <-- trapping instruction
>>>    30:    ff c2                    inc    %edx
>>>    32:    ff ca                    dec    %edx
>>>    34:    7e 02                    jle    0x38
>>>    36:    c9                       leaveq
>>>    37:    c3                       retq
>>>    38:    80 3d 10 ae 10 00 01     cmpb $0x1,0x10ae10(%rip)        #
>>> 0x10ae4f
>>>    3f:    74                       .byte 0x74
>>>
>>> Code starting with the faulting instruction
>>> ===========================================
>>>     0:    f0 0f c1 57 08           lock xadd %edx,0x8(%rdi)
>>>     5:    ff c2                    inc    %edx
>>>     7:    ff ca                    dec    %edx
>>>     9:    7e 02                    jle    0xd
>>>     b:    c9                       leaveq
>>>     c:    c3                       retq
>>>     d:    80 3d 10 ae 10 00 01     cmpb $0x1,0x10ae10(%rip)        #
>>> 0x10ae24
>>>    14:    74                       .byte 0x74
>>>
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192330] BUG: unable to handle
>>> kernel NULL pointer dereference at 0000000000000008
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192375] IP: [<ffffffffa02776c0>]
>>> radeon_fence_ref+0x10/0x50 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192441] PGD 22a86a067 PUD
>>> 22d8e8067 PMD 0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192463] Oops: 0002 [#1] PREEMPT SMP
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192484] Modules linked in:
>>> binfmt_misc parport_pc ppdev snd_hda_codec_hdmi snd_opl3_synth
>>> snd_seq_midi_emul snd_hda_intel snd_hda_codec snd_es1938 gameport
>>> snd_pcm_oss snd_mixer_oss snd_seq_dummy snd_pcm snd_seq_oss
>>> snd_opl3_lib snd_hwdep snd_mpu401_uart snd_seq_midi snd_rawmidi
>>> snd_seq_midi_event snd_seq psmouse fbcon tileblit edac_core font
>>> serio_raw i2c_piix4 bitblit softcursor radeon snd_seq_device snd_timer
>>> hwmon_vid ttm drm_kms_helper asus_atk0110 drm i2c_algo_bit snd
>>> soundcore lp parport usbhid btrfs raid6_pq zlib_deflate xor r8169 mii
>>> xhci_hcd libcrc32c
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192741] CPU: 5 PID: 3757 Comm:
>>> Xorg Not tainted 3.14.63 #1
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192765] Hardware name: System
>>> manufacturer System Product Name/M4A87TD/USB3, BIOS 1202 02/17/2011
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192803] task: ffff88022c1d5f80 ti:
>>> ffff8800c42a6000 task.ti: ffff8800c42a6000
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192833] RIP:
>>> 0010:[<ffffffffa02776c0>]  [<ffffffffa02776c0>]
>>> radeon_fence_ref+0x10/0x50 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192881] RSP:
>>> 0018:ffff8800c42a7a68  EFLAGS: 00010282
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192903] RAX: 0000000000000000 RBX:
>>> ffff8800c5d61328 RCX: 0000000000000000
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192931] RDX: 0000000000000001 RSI:
>>> 0000000000000000 RDI: 0000000000000000
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192960] RBP: ffff8800c42a7a78 R08:
>>> ffff8800c5d60000 R09: 0000000000000000
>>> Mar  5 15:04:58 lutz kernel: [ 6995.192988] R10: 0000000000000000 R11:
>>> 0000000000000000 R12: 0000000000000100
>>> Mar  5 15:04:58 lutz kernel: [ 6995.193016] R13: ffff8800c5d613b0 R14:
>>> ffff8800c42a7ae8 R15: ffff8800c42a7b08
>>> Mar  5 15:04:58 lutz kernel: [ 6995.193045] FS:
>>> 00007f7ebe91d8a0(0000) GS:ffff880237d40000(0000) knlGS:0000000000000000
>>> Mar  5 15:04:58 lutz kernel: [ 6995.193076] CS:  0010 DS: 0000 ES:
>>> 0000 CR0: 0000000080050033
>>> Mar  5 15:04:58 lutz kernel: [ 6995.193100] CR2: 0000000000000008 CR3:
>>> 000000022de3e000 CR4: 00000000000007e0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.193128] Stack:
>>> Mar  5 15:04:58 lutz kernel: [ 6995.193137]  ffff880159dcef00
>>> ffff8800c5d61328 ffff8800c42a7b88 ffffffffa02d9f22
>>> Mar  5 15:04:58 lutz kernel: [ 6995.193170]  01ff8800c42a7a98
>>> ffff8800c5d60000 ffff8800c42a7b08 ffff8800c42a7c88
>>> Mar  5 15:04:58 lutz kernel: [ 6995.193203]  0000f00455ee5200
>>> ffff8800c5d613b0 00000100c42a7b08 0006e97800200000
>>> Mar  5 15:04:58 lutz kernel: [ 6995.193237] Call Trace:
>>> Mar  5 15:04:58 lutz kernel: [ 6995.194709] [<ffffffffa02d9f22>]
>>> radeon_sa_bo_new+0x302/0x590 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.196175] [<ffffffff811551e6>] ?
>>> __kmalloc+0x66/0x1d0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.197659] [<ffffffffa028f849>] ?
>>> copy_from_user+0x9/0x10 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.199160] [<ffffffffa028dcb3>]
>>> radeon_ib_get+0x43/0xe0 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.200655] [<ffffffffa0290111>]
>>> radeon_cs_ioctl+0x201/0xac0 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.202135] [<ffffffff812ffe52>] ?
>>> idr_mark_full+0x52/0x70
>>> Mar  5 15:04:58 lutz kernel: [ 6995.203620] [<ffffffffa01b6b31>]
>>> drm_ioctl+0x401/0x580 [drm]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.205094] [<ffffffffa028ff10>] ?
>>> radeon_cs_parser_init+0x470/0x470 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.206564] [<ffffffff816572dc>] ?
>>> __do_page_fault+0x1cc/0x560
>>> Mar  5 15:04:58 lutz kernel: [ 6995.208033] [<ffffffffa025e349>]
>>> radeon_drm_ioctl+0x9/0x10 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.209510] [<ffffffff8116fe21>]
>>> do_vfs_ioctl+0x81/0x4d0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.210975] [<ffffffff81137f29>] ?
>>> SyS_mmap_pgoff+0xd9/0x210
>>> Mar  5 15:04:58 lutz kernel: [ 6995.212432] [<ffffffff81170301>]
>>> SyS_ioctl+0x91/0xa0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.213874] [<ffffffff8165767c>] ?
>>> do_page_fault+0xc/0x10
>>> Mar  5 15:04:58 lutz kernel: [ 6995.215323] [<ffffffff8165b542>]
>>> system_call_fastpath+0x16/0x1b
>>> Mar  5 15:04:58 lutz kernel: [ 6995.216776] Code: c7 c6 d8 3e 36 a0 31
>>> c0 45 31 e4 e8 7b 70 15 e1 eb b5 66 0f 1f 84 00 00 00 00 00 55 48 89
>>> f8 ba 01 00 00 00 48 89 e5 48 83 ec 10 <f0> 0f c1 57 08 ff c2 ff ca 7e
>>> 02 c9 c3 80 3d 10 ae 10 00 01 74
>>> Mar  5 15:04:58 lutz kernel: [ 6995.218402] RIP [<ffffffffa02776c0>]
>>> radeon_fence_ref+0x10/0x50 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.219987]  RSP <ffff8800c42a7a68>
>>> Mar  5 15:04:58 lutz kernel: [ 6995.221568] CR2: 0000000000000008
>>> Mar  5 15:04:58 lutz kernel: [ 6995.234168] ---[ end trace
>>> 37c0545f39a2b3e5 ]---
>>>
>>> Mar  5 15:04:58 lutz kernel: [ 6995.234170] note: Xorg[3757] exited
>>> with preempt_count 1
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239926] BUG: scheduling while
>>> atomic: Xorg/3757/0x00000002
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239931] Modules linked in:
>>> binfmt_misc parport_pc ppdev snd_hda_codec_hdmi snd_opl3_synth
>>> snd_seq_midi_emul snd_hda_intel snd_hda_codec snd_es1938 gameport
>>> snd_pcm_oss snd_mixer_oss snd_seq_dummy snd_pcm snd_seq_oss
>>> snd_opl3_lib snd_hwdep snd_mpu401_uart snd_seq_midi snd_rawmidi
>>> snd_seq_midi_event snd_seq psmouse fbcon tileblit edac_core font
>>> serio_raw i2c_piix4 bitblit softcursor radeon snd_seq_device snd_timer
>>> hwmon_vid ttm drm_kms_helper asus_atk0110 drm i2c_algo_bit snd
>>> soundcore lp parport usbhid btrfs raid6_pq zlib_deflate xor r8169 mii
>>> xhci_hcd libcrc32c
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239960] CPU: 5 PID: 3757 Comm:
>>> Xorg Tainted: G      D      3.14.63 #1
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239961] Hardware name: System
>>> manufacturer System Product Name/M4A87TD/USB3, BIOS 1202 02/17/2011
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239963]  0000000000000000
>>> ffff8800c42a7078 ffffffff8164f89c ffff880237d518c0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239965]  0000000000000005
>>> ffff8800c42a7088 ffffffff81075d27 ffff8800c42a70f8
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239967]  ffffffff81650680
>>> ffff8800c42a7184 ffff88022c1d5f80 ffff8800c42a6000
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239969] Call Trace:
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239976] [<ffffffff8164f89c>]
>>> dump_stack+0x4f/0x6b
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239980] [<ffffffff81075d27>]
>>> __schedule_bug+0x47/0x60
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239982] [<ffffffff81650680>]
>>> __schedule+0x600/0x810
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239984] [<ffffffff81650984>]
>>> schedule+0x24/0x70
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239986] [<ffffffff8164fa11>]
>>> schedule_timeout+0x151/0x320
>>> Mar  5 15:04:58 lutz kernel: [ 6995.239989] [<ffffffff81056b60>] ?
>>> call_timer_fn+0x1a0/0x1a0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240028] [<ffffffffa027713a>]
>>> radeon_fence_wait_seq+0x3ca/0x530 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240043] [<ffffffffa02c1740>] ?
>>> evergreen_program_watermarks+0x150/0x670 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240046] [<ffffffff8108a210>] ?
>>> bit_waitqueue+0xd0/0xd0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240062] [<ffffffffa02c004b>] ?
>>> evergreen_line_buffer_adjust.clone.5+0xcb/0x1a0 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240073] [<ffffffffa0277680>]
>>> radeon_fence_wait_empty_locked+0xb0/0xe0 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240088] [<ffffffffa02bb83d>]
>>> radeon_pm_compute_clocks+0x59d/0x840 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240100] [<ffffffffa026a69f>]
>>> atombios_crtc_dpms+0x6f/0x100 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240111] [<ffffffffa026bc08>]
>>> atombios_crtc_disable+0x28/0x2f0 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240126] [<ffffffffa02d8844>] ?
>>> radeon_atom_encoder_dpms+0xf4/0x230 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240132] [<ffffffffa021b305>]
>>> drm_helper_disable_unused_functions+0x105/0x160 [drm_kms_helper]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240135] [<ffffffffa021c610>]
>>> drm_crtc_helper_set_config+0x130/0x970 [drm_kms_helper]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240138] [<ffffffff81155962>] ?
>>> __slab_free+0x302/0x480
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240151] [<ffffffffa0285061>]
>>> radeon_crtc_set_config+0x31/0xb0 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240172] [<ffffffffa01c16ac>]
>>> drm_mode_set_config_internal+0x5c/0xe0 [drm]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240180] [<ffffffffa01c30bd>]
>>> drm_framebuffer_remove+0x10d/0x160 [drm]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240188] [<ffffffffa01c6605>]
>>> drm_fb_release+0xa5/0xd0 [drm]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240195] [<ffffffffa01b7e18>]
>>> drm_release+0x548/0x5c0 [drm]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240198] [<ffffffff8115f5a4>]
>>> __fput+0xa4/0x220
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240199] [<ffffffff8115f769>]
>>> ____fput+0x9/0x10
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240202] [<ffffffff8106a80c>]
>>> task_work_run+0xac/0xd0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240205] [<ffffffff8104dc30>]
>>> do_exit+0x2c0/0xac0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240207] [<ffffffff8164f77b>] ?
>>> printk+0x48/0x4a
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240209] [<ffffffff8109bc29>] ?
>>> kmsg_dump+0xb9/0xd0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240211] [<ffffffff8165553a>]
>>> oops_end+0x9a/0xe0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240214] [<ffffffff8103ee32>]
>>> no_context+0x132/0x2e0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240216] [<ffffffff8103f0f5>]
>>> __bad_area_nosemaphore+0x115/0x210
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240218] [<ffffffff811136d1>] ?
>>> __alloc_pages_nodemask+0x141/0x910
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240220] [<ffffffff8103f1fe>]
>>> bad_area_nosemaphore+0xe/0x10
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240222] [<ffffffff81657442>]
>>> __do_page_fault+0x332/0x560
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240230] [<ffffffffa022b8bb>] ?
>>> ttm_mem_global_alloc_zone.clone.1+0x12b/0x150 [ttm]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240233] [<ffffffffa022bc12>] ?
>>> ttm_mem_global_alloc_page+0x42/0x50 [ttm]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240237] [<ffffffff8114dd3e>] ?
>>> alloc_pages_current+0xae/0x170
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240238] [<ffffffff8165767c>]
>>> do_page_fault+0xc/0x10
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240240] [<ffffffff816549d2>]
>>> page_fault+0x22/0x30
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240252] [<ffffffffa02776c0>] ?
>>> radeon_fence_ref+0x10/0x50 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240268] [<ffffffffa02d9f22>]
>>> radeon_sa_bo_new+0x302/0x590 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240270] [<ffffffff811551e6>] ?
>>> __kmalloc+0x66/0x1d0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240283] [<ffffffffa028f849>] ?
>>> copy_from_user+0x9/0x10 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240296] [<ffffffffa028dcb3>]
>>> radeon_ib_get+0x43/0xe0 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240309] [<ffffffffa0290111>]
>>> radeon_cs_ioctl+0x201/0xac0 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240312] [<ffffffff812ffe52>] ?
>>> idr_mark_full+0x52/0x70
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240320] [<ffffffffa01b6b31>]
>>> drm_ioctl+0x401/0x580 [drm]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240333] [<ffffffffa028ff10>] ?
>>> radeon_cs_parser_init+0x470/0x470 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240335] [<ffffffff816572dc>] ?
>>> __do_page_fault+0x1cc/0x560
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240346] [<ffffffffa025e349>]
>>> radeon_drm_ioctl+0x9/0x10 [radeon]
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240348] [<ffffffff8116fe21>]
>>> do_vfs_ioctl+0x81/0x4d0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240351] [<ffffffff81137f29>] ?
>>> SyS_mmap_pgoff+0xd9/0x210
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240353] [<ffffffff81170301>]
>>> SyS_ioctl+0x91/0xa0
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240355] [<ffffffff8165767c>] ?
>>> do_page_fault+0xc/0x10
>>> Mar  5 15:04:58 lutz kernel: [ 6995.240357] [<ffffffff8165b542>]
>>> system_call_fastpath+0x16/0x1b
>>
Lutz Euler March 11, 2016, 6:06 p.m. UTC | #2
Nicolai Hähnle wrote:

> Lutz, could you please test whether the attached patch on top of 3.14.63 
> fixes the problem?

I'd tend to say it does. I have been running the patched 3.14.63 since I
got the patch, and so far no oops has occurred, whereas without the patch
it took less than 2 hours until it oopsed. So:

Tested-by: Lutz Euler <lutz.euler@freenet.de>

Thanks for providing the patch so quickly!

Regards,

Lutz
Michel Dänzer March 15, 2016, 6:45 a.m. UTC | #3
On 08.03.2016 11:54, Nicolai Hähnle wrote:
> On 05.03.2016 16:24, Christian König wrote:
>> just an educated guess, but I think the problem is simply that kernel
>> 3.14 doesn't yet contain the code so that radeon_fence_get() can safely
>> be called with a NULL pointer.
>>
>> So the backport of Nicolai's patch needs an extra check for the case
>> when the fence is NULL.
> 
> Oops indeed. Only the ref call should need the guard, the unref has
> always had a NULL pointer test as far as I can see.
> 
> Lutz, could you please test whether the attached patch on top of 3.14.63
> fixes the problem?

Nicolai, if you haven't already, please send this patch to
stable@vger.kernel.org with an explicit explanation of which stable
branches need it.

Patch

From 85d028178d9772f2a07e4ed156820d95c4e0ad18 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nicolai=20H=C3=A4hnle?= <nicolai.haehnle@amd.com>
Date: Mon, 7 Mar 2016 23:41:52 -0300
Subject: [PATCH] drm/radeon: guard call to radeon_fence_ref against NULL
 pointers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Candidate fix for a kernel oops that was introduced by the backport of
commit 954605ca3 "drm/radeon: hold reference to fences in radeon_sa_bo_new"
to kernels where radeon does not use the common fence implementation for
fences.

Reported-by: Lutz Euler <lutz.euler@freenet.de>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
---
 drivers/gpu/drm/radeon/radeon_sa.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index 197b157..7d11901 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -349,8 +349,10 @@  int radeon_sa_bo_new(struct radeon_device *rdev,
 			/* see if we can skip over some allocations */
 		} while (radeon_sa_bo_next_hole(sa_manager, fences, tries));
 
-		for (i = 0; i < RADEON_NUM_RINGS; ++i)
-			radeon_fence_ref(fences[i]);
+		for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+			if (fences[i])
+				radeon_fence_ref(fences[i]);
+		}
 
 		spin_unlock(&sa_manager->wq.lock);
 		r = radeon_fence_wait_any(rdev, fences, false);
-- 
2.5.0
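
For readers who want to see the guard pattern outside the driver, here is a minimal userspace sketch. It is not the radeon code: struct toy_fence, toy_fence_ref(), toy_fence_unref(), toy_ref_all(), toy_unref_all() and NUM_RINGS are hypothetical stand-ins invented for illustration. The only properties taken from the thread are that the ref call must not be handed a NULL pointer on this kernel, while the unref side already tolerates NULL.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_RINGS 4 /* hypothetical stand-in for RADEON_NUM_RINGS */

/* Hypothetical, simplified fence: nothing but a reference count. */
struct toy_fence {
	int refcount;
};

/*
 * Mirrors the behaviour described in the thread: the ref helper
 * dereferences its argument, so passing NULL would crash.
 */
static struct toy_fence *toy_fence_ref(struct toy_fence *fence)
{
	fence->refcount++;
	return fence;
}

/* The unref helper tests for NULL itself, so callers need no guard. */
static void toy_fence_unref(struct toy_fence **fence)
{
	struct toy_fence *tmp = *fence;

	*fence = NULL;
	if (tmp) {
		assert(tmp->refcount > 0);
		tmp->refcount--;
	}
}

/*
 * Take a reference on every populated slot, skipping empty (NULL)
 * slots. Returns the number of references taken.
 */
static int toy_ref_all(struct toy_fence *fences[NUM_RINGS])
{
	int i, taken = 0;

	for (i = 0; i < NUM_RINGS; ++i) {
		if (fences[i]) {
			toy_fence_ref(fences[i]);
			taken++;
		}
	}
	return taken;
}

/* Drop all references; safe for empty slots because unref checks NULL. */
static void toy_unref_all(struct toy_fence *fences[NUM_RINGS])
{
	int i;

	for (i = 0; i < NUM_RINGS; ++i)
		toy_fence_unref(&fences[i]);
}
```

With an array such as { &a, NULL, &b, NULL }, toy_ref_all() touches only the two populated slots, which is the case the unguarded loop in the backport did not handle.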