From patchwork Tue Jan 29 18:39:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lyude Paul X-Patchwork-Id: 10786859 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 44D3313B5 for ; Tue, 29 Jan 2019 18:39:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3341B2931E for ; Tue, 29 Jan 2019 18:39:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 275762D2B8; Tue, 29 Jan 2019 18:39:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8E5D92931E for ; Tue, 29 Jan 2019 18:39:41 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BEC2C6E923; Tue, 29 Jan 2019 18:39:40 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by gabe.freedesktop.org (Postfix) with ESMTPS id 94A366E924; Tue, 29 Jan 2019 18:39:39 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 268EF51EF3; Tue, 29 Jan 2019 18:39:39 +0000 (UTC) Received: from malachite.bss.redhat.com (dhcp-10-20-1-11.bss.redhat.com [10.20.1.11]) by smtp.corp.redhat.com (Postfix) with ESMTP id 067F51713D; Tue, 29 Jan 2019 18:39:32 +0000 (UTC) From: Lyude Paul To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Subject: [PATCH 1/3] drm/dp_mst: Fix unbalanced malloc ref in drm_dp_mst_deallocate_vcpi() Date: Tue, 29 Jan 2019 13:39:25 -0500 Message-Id: <20190129183928.26779-2-lyude@redhat.com> In-Reply-To: <20190129183928.26779-1-lyude@redhat.com> References: <20190129183928.26779-1-lyude@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Tue, 29 Jan 2019 18:39:39 +0000 (UTC) X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Maxime Ripard , linux-kernel@vger.kernel.org, David Airlie , Sean Paul Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP In drm_dp_mst_deallocate_vcpi(), we currently unconditionally call drm_dp_mst_put_port_malloc() on the port that's passed to us, even if we never successfully allocated VCPI to it. This is contrary to what we do in drm_dp_mst_allocate_vcpi(), where we only call drm_dp_mst_get_port_malloc() on the passed port if we successfully allocated VCPI to it. As a result, if drm_dp_mst_allocate_vcpi() fails during a modeset and another successive modeset calls drm_dp_mst_deallocate_vcpi() we will end up dropping someone else's malloc reference to the port. Example: [ 962.309260] ================================================================== [ 962.309290] BUG: KASAN: use-after-free in drm_dp_mst_put_port_malloc+0x72/0x180 [drm_kms_helper] [ 962.309296] Read of size 4 at addr ffff888416c30004 by task kworker/0:1H/500 [ 962.309308] CPU: 0 PID: 500 Comm: kworker/0:1H Tainted: G W O 5.0.0-rc2Lyude-Test+ #1 [ 962.309313] Hardware name: LENOVO 20L8S2N800/20L8S2N800, BIOS N22ET35W (1.12 ) 04/09/2018 [ 962.309428] Workqueue: events_highpri intel_atomic_cleanup_work [i915] [ 962.309434] Call Trace: [ 962.309452] dump_stack+0xad/0x150 [ 962.309462] ? dump_stack_print_info.cold.0+0x1b/0x1b [ 962.309472] ? kmsg_dump_rewind_nolock+0xd9/0xd9 [ 962.309504] ? drm_dp_mst_put_port_malloc+0x72/0x180 [drm_kms_helper] [ 962.309515] print_address_description+0x6c/0x23c [ 962.309542] ? drm_dp_mst_put_port_malloc+0x72/0x180 [drm_kms_helper] [ 962.309568] ? drm_dp_mst_put_port_malloc+0x72/0x180 [drm_kms_helper] [ 962.309577] kasan_report.cold.3+0x1a/0x32 [ 962.309605] ? drm_dp_mst_put_port_malloc+0x72/0x180 [drm_kms_helper] [ 962.309631] drm_dp_mst_put_port_malloc+0x72/0x180 [drm_kms_helper] [ 962.309658] ? drm_dp_mst_put_mstb_malloc+0x180/0x180 [drm_kms_helper] [ 962.309687] drm_dp_mst_destroy_state+0xcd/0x120 [drm_kms_helper] [ 962.309745] drm_atomic_state_default_clear+0x6ee/0xcc0 [drm] [ 962.309864] intel_atomic_state_clear+0xe/0x80 [i915] [ 962.309928] __drm_atomic_state_free+0x35/0xd0 [drm] [ 962.310044] intel_atomic_cleanup_work+0x56/0x70 [i915] [ 962.310057] process_one_work+0x884/0x1400 [ 962.310067] ? drain_workqueue+0x5a0/0x5a0 [ 962.310075] ? __schedule+0x87f/0x1e80 [ 962.310086] ? __sched_text_start+0x8/0x8 [ 962.310095] ? run_rebalance_domains+0x400/0x400 [ 962.310110] ? deref_stack_reg+0xb4/0x120 [ 962.310117] ? __read_once_size_nocheck.constprop.7+0x10/0x10 [ 962.310124] ? worker_enter_idle+0x47f/0x6a0 [ 962.310134] ? schedule+0xd7/0x2e0 [ 962.310141] ? __schedule+0x1e80/0x1e80 [ 962.310148] ? _raw_spin_lock_irq+0x9f/0x130 [ 962.310155] ? _raw_write_unlock_irqrestore+0x110/0x110 [ 962.310164] worker_thread+0x196/0x11e0 [ 962.310175] ? set_load_weight+0x2e0/0x2e0 [ 962.310181] ? __switch_to_asm+0x34/0x70 [ 962.310187] ? __switch_to_asm+0x40/0x70 [ 962.310194] ? process_one_work+0x1400/0x1400 [ 962.310199] ? __switch_to_asm+0x40/0x70 [ 962.310205] ? __switch_to_asm+0x34/0x70 [ 962.310211] ? __switch_to_asm+0x34/0x70 [ 962.310216] ? __switch_to_asm+0x40/0x70 [ 962.310221] ? __switch_to_asm+0x34/0x70 [ 962.310226] ? __switch_to_asm+0x40/0x70 [ 962.310231] ? __switch_to_asm+0x34/0x70 [ 962.310236] ? __switch_to_asm+0x40/0x70 [ 962.310242] ? syscall_return_via_sysret+0xf/0x7f [ 962.310248] ? __switch_to_asm+0x34/0x70 [ 962.310253] ? __switch_to_asm+0x40/0x70 [ 962.310258] ? __switch_to_asm+0x34/0x70 [ 962.310263] ? __switch_to_asm+0x40/0x70 [ 962.310268] ? __switch_to_asm+0x34/0x70 [ 962.310273] ? __switch_to_asm+0x40/0x70 [ 962.310281] ? __schedule+0x87f/0x1e80 [ 962.310292] ? __sched_text_start+0x8/0x8 [ 962.310300] ? save_stack+0x8c/0xb0 [ 962.310308] ? __kasan_kmalloc.constprop.6+0xc6/0xd0 [ 962.310313] ? kthread+0x98/0x3a0 [ 962.310318] ? ret_from_fork+0x35/0x40 [ 962.310334] ? __wake_up_common+0x178/0x6f0 [ 962.310343] ? _raw_spin_lock_irqsave+0xa4/0x140 [ 962.310349] ? __lock_text_start+0x8/0x8 [ 962.310355] ? _raw_write_lock_irqsave+0x70/0x130 [ 962.310360] ? __lock_text_start+0x8/0x8 [ 962.310371] ? process_one_work+0x1400/0x1400 [ 962.310376] kthread+0x2e2/0x3a0 [ 962.310383] ? kthread_create_on_node+0xc0/0xc0 [ 962.310389] ret_from_fork+0x35/0x40 [ 962.310401] Allocated by task 1462: [ 962.310410] __kasan_kmalloc.constprop.6+0xc6/0xd0 [ 962.310437] drm_dp_add_port+0xd60/0x1960 [drm_kms_helper] [ 962.310464] drm_dp_send_link_address+0x4b0/0x770 [drm_kms_helper] [ 962.310491] drm_dp_check_and_send_link_address+0x197/0x1f0 [drm_kms_helper] [ 962.310515] drm_dp_mst_link_probe_work+0x2b6/0x330 [drm_kms_helper] [ 962.310522] process_one_work+0x884/0x1400 [ 962.310529] worker_thread+0x196/0x11e0 [ 962.310533] kthread+0x2e2/0x3a0 [ 962.310538] ret_from_fork+0x35/0x40 [ 962.310543] Freed by task 500: [ 962.310550] __kasan_slab_free+0x133/0x180 [ 962.310555] kfree+0x92/0x1a0 [ 962.310581] drm_dp_mst_put_port_malloc+0x14d/0x180 [drm_kms_helper] [ 962.310693] intel_connector_destroy+0xb2/0xe0 [i915] [ 962.310747] drm_mode_object_put.part.0+0x12b/0x1a0 [drm] [ 962.310802] drm_atomic_state_default_clear+0x1f2/0xcc0 [drm] [ 962.310916] intel_atomic_state_clear+0xe/0x80 [i915] [ 962.310972] __drm_atomic_state_free+0x35/0xd0 [drm] [ 962.311083] intel_atomic_cleanup_work+0x56/0x70 [i915] [ 962.311092] process_one_work+0x884/0x1400 [ 962.311098] worker_thread+0x196/0x11e0 [ 962.311103] kthread+0x2e2/0x3a0 [ 962.311108] ret_from_fork+0x35/0x40 [ 962.311116] The buggy address belongs to the object at ffff888416c30000 which belongs to the cache kmalloc-2k of size 2048 [ 962.311122] The buggy address is located 4 bytes inside of 2048-byte region [ffff888416c30000, ffff888416c30800) [ 962.311124] The buggy address belongs to the page: [ 962.311132] page:ffffea00105b0c00 count:1 mapcount:0 mapping:ffff88841d003040 index:0x0 compound_mapcount: 0 [ 962.311142] flags: 0x8000000000010200(slab|head) [ 962.311152] raw: 8000000000010200 dead000000000100 dead000000000200 ffff88841d003040 [ 962.311159] raw: 0000000000000000 00000000000f000f 00000001ffffffff 0000000000000000 [ 962.311162] page dumped because: kasan: bad access detected So, bail early if drm_dp_mst_deallocate_vcpi() is called on a port with no VCPI allocation. Additionally, clean up the surrounding kerneldoc while we're at it since the port is assumed to be kept around because the DRM driver is expected to hold a malloc reference to it, not just us. Signed-off-by: Lyude Paul Fixes: eceae1472467 ("drm/dp_mst: Start tracking per-port VCPI allocations") Cc: Daniel Vetter Reviewed-by: Daniel Vetter --- drivers/gpu/drm/drm_dp_mst_topology.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index 2552a27362a0..8ba0e5b368da 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -3246,15 +3246,13 @@ EXPORT_SYMBOL(drm_dp_mst_reset_vcpi_slots); /** * drm_dp_mst_deallocate_vcpi() - deallocate a VCPI * @mgr: manager for this port - * @port: unverified port to deallocate vcpi for + * @port: port to deallocate vcpi for */ void drm_dp_mst_deallocate_vcpi(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port) { - /* - * A port with VCPI will remain allocated until it's VCPI is - * released, no verified ref needed - */ + if (!port->vcpi.vcpi) + return; drm_dp_mst_put_payload_id(mgr, port->vcpi.vcpi); port->vcpi.num_slots = 0; From patchwork Tue Jan 29 18:39:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lyude Paul X-Patchwork-Id: 10786865 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0F0811399 for ; Tue, 29 Jan 2019 18:39:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F13842931E for ; Tue, 29 Jan 2019 18:39:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E50352D2B8; Tue, 29 Jan 2019 18:39:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=unavailable version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1B3872931E for ; Tue, 29 Jan 2019 18:39:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 166196E92B; Tue, 29 Jan 2019 18:39:43 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5F0576E925; Tue, 29 Jan 2019 18:39:41 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C0A6891D3D; Tue, 29 Jan 2019 18:39:40 +0000 (UTC) Received: from malachite.bss.redhat.com (dhcp-10-20-1-11.bss.redhat.com [10.20.1.11]) by smtp.corp.redhat.com (Postfix) with ESMTP id 941F15D6A6; Tue, 29 Jan 2019 18:39:39 +0000 (UTC) From: Lyude Paul To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Subject: [PATCH 2/3] drm/atomic: Add drm_atomic_state->suspend_or_resume Date: Tue, 29 Jan 2019 13:39:26 -0500 Message-Id: <20190129183928.26779-3-lyude@redhat.com> In-Reply-To: <20190129183928.26779-1-lyude@redhat.com> References: <20190129183928.26779-1-lyude@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Tue, 29 Jan 2019 18:39:40 +0000 (UTC) X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Maxime Ripard , linux-kernel@vger.kernel.org, David Airlie , Rodrigo Vivi , Sean Paul Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Since commit 39b50c603878 ("drm/atomic_helper: Stop modesets on unregistered connectors harder") We've been failing atomic checks if they try to enable new displays on unregistered connectors. This is fine except for the one situation that breaks atomic assumptions: suspend/resume. If a connector is unregistered before we attempt to restore the atomic state, something we end up failing the atomic check that happens when trying to restore the state during resume. Normally this would be OK: we try our best to make sure that the atomic state pre-suspend can be restored post-suspend, but failures at that point usually don't cause problems. That is of course, until we introduced the new atomic MST VCPI helpers: [drm:drm_atomic_helper_check_modeset [drm_kms_helper]] [CRTC:65:pipe B] active changed [drm:drm_atomic_helper_check_modeset [drm_kms_helper]] Updating routing for [CONNECTOR:123:DP-5] [drm:drm_atomic_helper_check_modeset [drm_kms_helper]] Disabling [CONNECTOR:123:DP-5] [drm:drm_atomic_get_private_obj_state [drm]] Added new private object 0000000025844636 state 000000009fd2899a to 000000003a13d7b8 WARNING: CPU: 6 PID: 1070 at drivers/gpu/drm/drm_dp_mst_topology.c:3153 drm_dp_atomic_release_vcpi_slots+0xb9/0x200 [drm_kms_helper] Modules linked in: fuse vfat fat snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic joydev iTCO_wdt i915(O) wmi_bmof intel_rapl btusb btrtl x86_pkg_temp_thermal btbcm btintel coretemp i2c_algo_bit drm_kms_helper(O) crc32_pclmul snd_hda_intel syscopyarea sysfillrect snd_hda_codec sysimgblt snd_hda_core bluetooth fb_sys_fops snd_pcm pcspkr drm(O) psmouse snd_timer mei_me ecdh_generic i2c_i801 mei i2c_core ucsi_acpi typec_ucsi typec wmi thinkpad_acpi ledtrig_audio snd soundcore tpm_tis rfkill tpm_tis_core video tpm acpi_pad pcc_cpufreq uas usb_storage crc32c_intel nvme serio_raw xhci_pci nvme_core xhci_hcd CPU: 6 PID: 1070 Comm: gnome-shell Tainted: G W O 5.0.0-rc2Lyude-Test+ #1 Hardware name: LENOVO 20L8S2N800/20L8S2N800, BIOS N22ET35W (1.12 ) 04/09/2018 RIP: 0010:drm_dp_atomic_release_vcpi_slots+0xb9/0x200 [drm_kms_helper] Code: 00 4c 39 6d f0 74 49 48 8d 7b 10 48 89 f9 48 c1 e9 03 42 80 3c 21 00 0f 85 d2 00 00 00 48 8b 6b 10 48 8d 5d f0 49 39 ee 75 c5 <0f> 0b 48 c7 c7 c0 78 b3 a0 48 89 c2 4c 89 ee e8 03 6c aa ff b8 ea RSP: 0018:ffff88841235f268 EFLAGS: 00010246 RAX: ffff88841bf12ab0 RBX: ffff88841bf12aa8 RCX: 1ffff110837e2557 RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffffed108246bde0 RBP: ffff88841bf12ab8 R08: ffffed1083db3c93 R09: ffffed1083db3c92 R10: ffffed1083db3c92 R11: ffff88841ed9e497 R12: ffff888419555d80 R13: ffff8883bc499100 R14: ffff88841bf12ab8 R15: 0000000000000000 FS: 00007f16fbd4cd00(0000) GS:ffff88841ed80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1687c9f000 CR3: 00000003ba3cc003 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: drm_atomic_helper_check_modeset+0xf21/0x2f50 [drm_kms_helper] ? drm_atomic_helper_commit_modeset_enables+0xa90/0xa90 [drm_kms_helper] ? __printk_safe_exit+0x10/0x10 ? save_stack+0x8c/0xb0 ? vprintk_func+0x96/0x1bf ? __printk_safe_exit+0x10/0x10 intel_atomic_check+0x234/0x4750 [i915] ? printk+0x9f/0xc5 ? kmsg_dump_rewind_nolock+0xd9/0xd9 ? _raw_spin_lock_irqsave+0xa4/0x140 ? drm_atomic_check_only+0xb1/0x28b0 [drm] ? drm_dbg+0x186/0x1b0 [drm] ? drm_dev_dbg+0x200/0x200 [drm] ? intel_link_compute_m_n+0xb0/0xb0 [i915] ? drm_mode_put_tile_group+0x20/0x20 [drm] ? skl_plane_format_mod_supported+0x17f/0x1b0 [i915] ? drm_plane_check_pixel_format+0x14a/0x310 [drm] drm_atomic_check_only+0x13c4/0x28b0 [drm] ? drm_state_info+0x220/0x220 [drm] ? drm_atomic_helper_disable_plane+0x1d0/0x1d0 [drm_kms_helper] ? pick_single_encoder_for_connector+0xe0/0xe0 [drm_kms_helper] ? kasan_unpoison_shadow+0x35/0x40 drm_atomic_commit+0x3b/0x100 [drm] drm_atomic_helper_set_config+0xd5/0x100 [drm_kms_helper] drm_mode_setcrtc+0x636/0x1660 [drm] ? vprintk_func+0x96/0x1bf ? drm_dev_dbg+0x200/0x200 [drm] ? drm_mode_getcrtc+0x790/0x790 [drm] ? printk+0x9f/0xc5 ? mutex_unlock+0x1d/0x40 ? drm_mode_addfb2+0x2e9/0x3a0 [drm] ? rcu_sync_dtor+0x2e0/0x2e0 ? drm_dbg+0x186/0x1b0 [drm] ? set_page_dirty+0x271/0x4d0 drm_ioctl_kernel+0x203/0x290 [drm] ? drm_mode_getcrtc+0x790/0x790 [drm] ? drm_setversion+0x7f0/0x7f0 [drm] ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x34/0x70 drm_ioctl+0x445/0x950 [drm] ? drm_mode_getcrtc+0x790/0x790 [drm] ? drm_getunique+0x220/0x220 [drm] ? expand_files.part.10+0x920/0x920 do_vfs_ioctl+0x1a1/0x13d0 ? ioctl_preallocate+0x2b0/0x2b0 ? __fget_light+0x2d6/0x390 ? schedule+0xd7/0x2e0 ? fget_raw+0x10/0x10 ? apic_timer_interrupt+0xa/0x20 ? apic_timer_interrupt+0xa/0x20 ? rcu_cleanup_dead_rnp+0x2c0/0x2c0 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x6f/0xb0 do_syscall_64+0x136/0x440 ? syscall_return_slowpath+0x2d0/0x2d0 ? do_page_fault+0x89/0x330 ? __do_page_fault+0x9c0/0x9c0 ? prepare_exit_to_usermode+0x188/0x200 ? perf_trace_sys_enter+0x1090/0x1090 ? __x64_sys_sigaltstack+0x280/0x280 ? __put_user_4+0x1c/0x30 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f16ff89a09b Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007fff001232b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fff001232f0 RCX: 00007f16ff89a09b RDX: 00007fff001232f0 RSI: 00000000c06864a2 RDI: 000000000000000b RBP: 00007fff001232f0 R08: 0000000000000000 R09: 000055a79d484460 R10: 000055a79d44e770 R11: 0000000000000246 R12: 00000000c06864a2 R13: 000000000000000b R14: 0000000000000000 R15: 000055a79d44e770 WARNING: CPU: 6 PID: 1070 at drivers/gpu/drm/drm_dp_mst_topology.c:3153 drm_dp_atomic_release_vcpi_slots+0xb9/0x200 [drm_kms_helper] ---[ end trace d536c05c13c83be2 ]--- [drm:drm_dp_atomic_release_vcpi_slots [drm_kms_helper]] *ERROR* no VCPI for [MST PORT:00000000f9e2b143] found in mst state 000000009fd2899a This appears to be happening because we destroy the VCPI allocations when disabling all connected displays while suspending, and those VCPI allocations don't get restored on resume due to failing to restore the atomic state. So, fix this by introducing the suspending option to drm_atomic_helper_duplicate_state() and use that to indicate in the atomic state that it's being used for suspending or resuming the system, and thus needs to be fixed up by the driver. We can then use the new state->suspend_or_resume hook to fixup the atomic VCPI helpers so they merely save and restore the VCPI allocations for such states instead of performing error checking on them. This fixes the warnings during suspend/resume when a topology is removed from the system, and allows us to once again restore the atomic state on resume without any issues. Signed-off-by: Lyude Paul Fixes: eceae1472467 ("drm/dp_mst: Start tracking per-port VCPI allocations") Cc: Daniel Vetter --- drivers/gpu/drm/drm_atomic_helper.c | 16 +++++++++--- drivers/gpu/drm/drm_dp_mst_topology.c | 36 ++++++++++++++++++++++++--- drivers/gpu/drm/i915/intel_display.c | 4 +-- include/drm/drm_atomic.h | 11 ++++++++ include/drm/drm_atomic_helper.h | 3 ++- 5 files changed, 61 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 6fe2303fccd9..7d23dafdffbc 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -332,8 +332,15 @@ update_connector_routing(struct drm_atomic_state *state, * about is ensuring that userspace can't do anything but shut off the * display on a connector that was destroyed after its been notified, * not before. + * + * Additionally, we also want to ignore connector registration when + * we're trying to restore an atomic state during system resume since + * there's a chance the connector may have been destroyed during the + * process, but it's better to ignore that then cause + * drm_atomic_helper_resume() to fail. */ - if (drm_connector_is_unregistered(connector) && crtc_state->active) { + if (!state->suspend_or_resume && + drm_connector_is_unregistered(connector) && crtc_state->active) { DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] is not registered\n", connector->base.id, connector->name); return -EINVAL; @@ -3144,6 +3151,7 @@ EXPORT_SYMBOL(drm_atomic_helper_shutdown); * drm_atomic_helper_duplicate_state - duplicate an atomic state object * @dev: DRM device * @ctx: lock acquisition context + * @suspending: If this state is being duplicated as part of system suspend * * Makes a copy of the current atomic state by looping over all objects and * duplicating their respective states. This is used for example by suspend/ @@ -3166,7 +3174,8 @@ EXPORT_SYMBOL(drm_atomic_helper_shutdown); */ struct drm_atomic_state * drm_atomic_helper_duplicate_state(struct drm_device *dev, - struct drm_modeset_acquire_ctx *ctx) + struct drm_modeset_acquire_ctx *ctx, + bool suspending) { struct drm_atomic_state *state; struct drm_connector *conn; @@ -3180,6 +3189,7 @@ drm_atomic_helper_duplicate_state(struct drm_device *dev, return ERR_PTR(-ENOMEM); state->acquire_ctx = ctx; + state->suspend_or_resume = suspending; drm_for_each_crtc(crtc, dev) { struct drm_crtc_state *crtc_state; @@ -3263,7 +3273,7 @@ struct drm_atomic_state *drm_atomic_helper_suspend(struct drm_device *dev) DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, 0, err); - state = drm_atomic_helper_duplicate_state(dev, &ctx); + state = drm_atomic_helper_duplicate_state(dev, &ctx, true); if (IS_ERR(state)) goto unlock; diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index 8ba0e5b368da..371010f8d42e 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -3032,6 +3032,10 @@ static int drm_dp_init_vcpi(struct drm_dp_mst_topology_mgr *mgr, * @port as needed. It is not OK however, to call this function and * drm_dp_atomic_release_vcpi_slots() in the same atomic check phase. * + * When &drm_atomic_state.suspend_or_resume is set to %true%, this function + * will not perform any error checking and will instead simply retrieve the + * previously recorded VCPI allocation that was present before system suspend. + * * See also: * drm_dp_atomic_release_vcpi_slots() * drm_dp_mst_atomic_check() @@ -3052,9 +3056,11 @@ int drm_dp_atomic_find_vcpi_slots(struct drm_atomic_state *state, if (IS_ERR(topology_state)) return PTR_ERR(topology_state); - port = drm_dp_mst_topology_get_port_validated(mgr, port); - if (port == NULL) - return -EINVAL; + if (!state->suspend_or_resume) { + port = drm_dp_mst_topology_get_port_validated(mgr, port); + if (!port) + return -EINVAL; + } /* Find the current allocation for this port, if any */ list_for_each_entry(pos, &topology_state->vcpis, next) { @@ -3062,6 +3068,19 @@ int drm_dp_atomic_find_vcpi_slots(struct drm_atomic_state *state, vcpi = pos; prev_slots = vcpi->vcpi; + /* + * When resuming, we just want to restore the previous + * VCPI without doing error checking + */ + if (state->suspend_or_resume) { + DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] [MST PORT:%p] restoring VCPI of %d\n", + port->connector->base.id, + port->connector->name, + port, prev_slots); + + return prev_slots; + } + /* * This should never happen, unless the driver tries * releasing and allocating the same VCPI allocation, @@ -3124,6 +3143,11 @@ EXPORT_SYMBOL(drm_dp_atomic_find_vcpi_slots); * drm_dp_atomic_find_vcpi_slots() on the same @port in a single atomic check * phase. * + * When &drm_atomic_state.suspend_or_resume is set, this function will not + * modify the VCPI allocations in &drm_dp_mst_topology_state.vcpis, so that + * those VCPI allocations may be restored as-is during system resume. In this + * scenario, this function will always return 0. + * * See also: * drm_dp_atomic_find_vcpi_slots() * drm_dp_mst_atomic_check() @@ -3140,6 +3164,12 @@ int drm_dp_atomic_release_vcpi_slots(struct drm_atomic_state *state, struct drm_dp_vcpi_allocation *pos; bool found = false; + /* During suspend, just keep our VCPI allocations as-is in the atomic + * state so they can be restored as-is at resume + */ + if (state->suspend_or_resume) + return 0; + topology_state = drm_atomic_get_mst_topology_state(state, mgr); if (IS_ERR(topology_state)) return PTR_ERR(topology_state); diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 4a282532af81..4c70f1531119 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -3824,7 +3824,7 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv) * Disabling the crtcs gracefully seems nicer. Also the * g33 docs say we should at least disable all the planes. */ - state = drm_atomic_helper_duplicate_state(dev, ctx); + state = drm_atomic_helper_duplicate_state(dev, ctx, false); if (IS_ERR(state)) { ret = PTR_ERR(state); DRM_ERROR("Duplicating state failed with %i\n", ret); @@ -15043,7 +15043,7 @@ static void sanitize_watermarks(struct drm_device *dev) goto fail; } - state = drm_atomic_helper_duplicate_state(dev, &ctx); + state = drm_atomic_helper_duplicate_state(dev, &ctx, false); if (WARN_ON(IS_ERR(state))) goto fail; diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h index 811b4a92568f..0897f51dcd62 100644 --- a/include/drm/drm_atomic.h +++ b/include/drm/drm_atomic.h @@ -329,6 +329,17 @@ struct drm_atomic_state { bool allow_modeset : 1; bool legacy_cursor_update : 1; bool async_update : 1; + /** + * @suspend_or_resume: + * + * Indicates whether or not this atomic state is being saved or + * committed as part of suspending or resuming the system + * respectively. Drivers and atomic helpers hould use this to fixup + * normal state inconsistencies that might arise between system + * suspend and resume, so as to ensure that restoring the atomic state + * at resume never fails. + */ + bool suspend_or_resume : 1; struct __drm_planes_state *planes; struct __drm_crtcs_state *crtcs; int num_connector; diff --git a/include/drm/drm_atomic_helper.h b/include/drm/drm_atomic_helper.h index 58214be3bf3d..359f557b6898 100644 --- a/include/drm/drm_atomic_helper.h +++ b/include/drm/drm_atomic_helper.h @@ -129,7 +129,8 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, void drm_atomic_helper_shutdown(struct drm_device *dev); struct drm_atomic_state * drm_atomic_helper_duplicate_state(struct drm_device *dev, - struct drm_modeset_acquire_ctx *ctx); + struct drm_modeset_acquire_ctx *ctx, + bool suspending); struct drm_atomic_state *drm_atomic_helper_suspend(struct drm_device *dev); int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, struct drm_modeset_acquire_ctx *ctx); From patchwork Tue Jan 29 18:39:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lyude Paul X-Patchwork-Id: 10786867 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 528341399 for ; Tue, 29 Jan 2019 18:39:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 41FB62931E for ; Tue, 29 Jan 2019 18:39:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 360A22D2B8; Tue, 29 Jan 2019 18:39:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id ADDB42931E for ; Tue, 29 Jan 2019 18:39:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 45CE36E926; Tue, 29 Jan 2019 18:39:45 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by gabe.freedesktop.org (Postfix) with ESMTPS id 82F986E925; Tue, 29 Jan 2019 18:39:42 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id F384C7BDC6; Tue, 29 Jan 2019 18:39:41 +0000 (UTC) Received: from malachite.bss.redhat.com (dhcp-10-20-1-11.bss.redhat.com [10.20.1.11]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0F7ED5D6A6; Tue, 29 Jan 2019 18:39:40 +0000 (UTC) From: Lyude Paul To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Subject: [PATCH 3/3] drm/i915: Always allocate VCPI during system resume Date: Tue, 29 Jan 2019 13:39:27 -0500 Message-Id: <20190129183928.26779-4-lyude@redhat.com> In-Reply-To: <20190129183928.26779-1-lyude@redhat.com> References: <20190129183928.26779-1-lyude@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Tue, 29 Jan 2019 18:39:42 +0000 (UTC) X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: David Airlie , linux-kernel@vger.kernel.org, Rodrigo Vivi Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Since there's a chance MST topologies can be removed while the system is in suspend, there's also a chance that the connectors in the atomic state which we try to restore on system resume will have been unregistered if they were part of an MST topology that was removed mid-suspend. In such situations, we'll currently skip recalculating the VCPI. Normally this would be safe since we don't expect to be allowed to enable displays on unregistered connections, but since we need to try our best to restore even partially invalid atomic states on system resume we end up introducing the chance that we try enabling a display on a now-unregistered connector. This of course results on a blow up during system resume: [ 1894.177788] [drm:pipe_config_err [i915]] *ERROR* mismatch in dp_m_n (expected tu 0 gmch 1730150/8388608 link 288358/1048576, or tu 0 gmch 0/0 link 0/0, found tu 64, gmch 1730150/8388608 link 288358/1048576) [ 1894.177795] ------------[ cut here ]------------ [ 1894.177799] pipe state doesn't match! [ 1894.177905] WARNING: CPU: 2 PID: 1894 at drivers/gpu/drm/i915/intel_display.c:12292 intel_atomic_commit_tail+0xd5e/0xdd0 [i915] [ 1894.177909] Modules linked in: i915(O) drm_kms_helper(O) drm(O) vfat fat joydev iTCO_wdt wmi_bmof btusb btrtl btbcm intel_rapl btintel i2c_algo_bit x86_pkg_temp_thermal bluetooth syscopyarea coretemp sysfillrect sysimgblt crc32_pclmul fb_sys_fops ucsi_acpi psmouse ecdh_generic pcspkr typec_ucsi i2c_i801 typec mei_me mei i2c_core wmi thinkpad_acpi ledtrig_audio snd soundcore rfkill tpm_tis tpm_tis_core video pcc_cpufreq tpm acpi_pad uas usb_storage crc32c_intel nvme serio_raw nvme_core xhci_pci xhci_hcd [last unloaded: drm] [ 1894.177990] CPU: 2 PID: 1894 Comm: kworker/u16:8 Tainted: G O 5.0.0-rc4Lyude-Test+ #9 [ 1894.177994] Hardware name: LENOVO 20L8S2N800/20L8S2N800, BIOS N22ET35W (1.12 ) 04/09/2018 [ 1894.178005] Workqueue: events_unbound async_run_entry_fn [ 1894.178079] RIP: 0010:intel_atomic_commit_tail+0xd5e/0xdd0 [i915] [ 1894.178085] Code: ba 01 00 00 00 4c 89 4c 24 10 e8 3d 46 01 00 4c 8b 4c 24 10 e9 a4 fb ff ff e8 b2 18 bd e0 0f 0b e9 a5 fd ff ff e8 a6 18 bd e0 <0f> 0b e9 f1 f9 ff ff e8 9a 18 bd e0 0f 0b 0f b6 44 24 18 e9 23 f9 [ 1894.178090] RSP: 0000:ffffc90000553c08 EFLAGS: 00010286 [ 1894.178096] RAX: 0000000000000000 RBX: ffff888459edc000 RCX: 0000000000000006 [ 1894.178101] RDX: 0000000000000007 RSI: ffff8884591e2f90 RDI: ffff88845e295690 [ 1894.178105] RBP: ffff888454e27000 R08: 0000000000000000 R09: 0000000000000000 [ 1894.178108] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888454e22000 [ 1894.178112] R13: ffff888454e21000 R14: ffff8883ca680750 R15: ffff8883ca680758 [ 1894.178118] FS: 0000000000000000(0000) GS:ffff88845e280000(0000) knlGS:0000000000000000 [ 1894.178123] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1894.178127] CR2: 0000000000000000 CR3: 0000000002214001 CR4: 00000000003606e0 [ 1894.178131] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1894.178135] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1894.178139] Call Trace: [ 1894.178224] intel_atomic_commit+0x240/0x320 [i915] [ 1894.178251] drm_atomic_helper_commit_duplicated_state+0xc9/0xe0 [drm_kms_helper] [ 1894.178321] __intel_display_resume+0x82/0xd0 [i915] [ 1894.178391] intel_display_resume+0xcf/0x120 [i915] [ 1894.178453] i915_drm_resume+0xb1/0x100 [i915] [ 1894.178465] ? pci_pm_suspend_late+0x30/0x30 [ 1894.178473] dpm_run_callback+0x70/0x210 [ 1894.178484] device_resume+0xae/0x1f0 [ 1894.178493] async_resume+0x19/0x30 [ 1894.178502] async_run_entry_fn+0x39/0x160 [ 1894.178513] process_one_work+0x23c/0x590 [ 1894.178529] worker_thread+0x30/0x380 [ 1894.178539] ? rescuer_thread+0x340/0x340 [ 1894.178545] kthread+0x112/0x130 [ 1894.178552] ? kthread_create_on_node+0x60/0x60 [ 1894.178563] ret_from_fork+0x3a/0x50 [ 1894.178582] irq event stamp: 76782 [ 1894.178591] hardirqs last enabled at (76781): [] vprintk_emit+0x2c5/0x2d0 [ 1894.178600] hardirqs last disabled at (76782): [] trace_hardirqs_off_thunk+0x1a/0x1c [ 1894.178609] softirqs last enabled at (76734): [] __do_softirq+0x35d/0x412 [ 1894.178618] softirqs last disabled at (76727): [] irq_exit+0xe0/0xf0 [ 1894.178687] WARNING: CPU: 2 PID: 1894 at drivers/gpu/drm/i915/intel_display.c:12292 intel_atomic_commit_tail+0xd5e/0xdd0 [i915] [ 1894.178692] ---[ end trace 0df08c0b9a67376e ]--- So, fix this by using the new drm_atomic_state.suspend_or_resume hook in order to force us to retrieve a VCPI allocation when restoring a pipe's atomic state. This is safe, since drm_dp_atomic_find_vcpi_slots() will just return the VCPI allocation we had pre-suspend. Signed-off-by: Lyude Paul Cc: Daniel Vetter Fixes: eceae1472467 ("drm/dp_mst: Start tracking per-port VCPI allocations") --- drivers/gpu/drm/i915/intel_dp_mst.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c b/drivers/gpu/drm/i915/intel_dp_mst.c index cdb83d294cdd..04c9a16fdc39 100644 --- a/drivers/gpu/drm/i915/intel_dp_mst.c +++ b/drivers/gpu/drm/i915/intel_dp_mst.c @@ -80,8 +80,13 @@ static int intel_dp_mst_compute_config(struct intel_encoder *encoder, mst_pbn = drm_dp_calc_pbn_mode(adjusted_mode->crtc_clock, bpp); pipe_config->pbn = mst_pbn; - /* Zombie connectors can't have VCPI slots */ - if (!drm_connector_is_unregistered(connector)) { + /* + * Zombie connectors can't have VCPI slots and won't be turned on, + * except during resume where we must make sure we restore the + * previous VCPI allocations + */ + if (!drm_connector_is_unregistered(connector) || + state->suspend_or_resume) { slots = drm_dp_atomic_find_vcpi_slots(state, &intel_dp->mst_mgr, port,