From patchwork Fri Mar 24 18:12:54 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 9643611
Date: Fri, 24 Mar 2017 13:12:54 -0500
From: Josh Poimboeuf
To: Paul Menzel
Cc: "Rafael J . Wysocki" , Len Brown , linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org, Steven Rostedt , Ingo Molnar
Subject: Re: [PATCH] acpi: fix incompatibility with mcount-based function graph tracing
Message-ID: <20170324181254.gouyrbmppukrrbb6@treble>
References: <6559f36c6c6cdc2552b0bccf31de967367aa790d.1489672478.git.jpoimboe@redhat.com>
X-Mailing-List: linux-acpi@vger.kernel.org

On Tue, Mar 21, 2017 at 09:44:03PM +0100, Paul Menzel wrote:
> I checked out Linux 4.9.16, applied your patch on top, and copied the Debian
> 4.9 Linux kernel configuration, did `make menuconfig`, disabled building
> debugging symbols, and executed `ARCH=i386 make -j40 deb-pkg`.
>
> I installed that package on the Lenovo X60, and the result with tracing
> enabled has improved. The system suspends without a crash. Unfortunately,
> instead of resuming when pressing the power button, it starts from scratch.
> Suspend and resume without tracing enabled works though.
>
> I’ll try to collect logs, but I don’t know, if there will be any, if the
> system just resets.
>
> Maybe, this can be reproduced in QEMU?

So I was able to recreate this issue in qemu, and after some hours of
debugging I managed to figure it out.
It's rebooting during the resume because of a triple fault in
prepare_ftrace_return().

  acpi wakeup for secondary cpu
    startup_32_smp()
      load_ucode_ap()
        prepare_ftrace_return()
          ftrace_graph_is_dead()
            dereferences virtual address (kill_ftrace_graph) in real mode <-- BOOM

I tried fixing it by changing load_ucode_ap() to notrace, but that
function calls some other functions which also have mcount hooks, which
call other functions, etc.

Instead I was able to "fix" it by ignoring ftrace calls in real mode:

-----

index 8f3d9cf..5c0d0c6 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -983,6 +983,9 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 	unsigned long return_hooker = (unsigned long)
 				&return_to_handler;
 
+	if (__builtin_return_address(0) < TASK_SIZE_MAX)
+		return;
+
 	if (unlikely(ftrace_graph_is_dead()))
 		return;
 
---------------

I'm not sure what the best fix should really be.  A few ideas off the
top of my head:

- A real mode check similar to the above (except it should probably be
  more precise)

- Make tracing_graph_pause a percpu variable so that it can be read from
  prepare_ftrace_return()

- pause_graph_tracing() from ftrace_suspend_notifier_call()

Steven, thoughts?
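
For anyone following along, here is a rough userspace sketch of the idea behind the return-address check above. It is not kernel code. It assumes a default 32-bit x86 layout where kernel text is linked at or above 0xC0000000 and TASK_SIZE_MAX equals that boundary; the names SKETCH_TASK_SIZE_MAX, caller_runs_unmapped() and phys_alias() are made up for illustration. The point is only that a caller running before paging is enabled executes from the low physical alias of the kernel, so its return address falls below the boundary.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 3G/1G user/kernel split, so kernel virtual addresses
 * start at 0xC0000000. Hypothetical constant, not the kernel macro.
 */
#define SKETCH_TASK_SIZE_MAX UINT32_C(0xC0000000)

/*
 * Hypothetical helper: true if the caller's return address lies below
 * the kernel's virtual range, which in this model means the caller is
 * still executing from physical addresses (paging not yet enabled).
 */
static bool caller_runs_unmapped(uint32_t return_addr)
{
	return return_addr < SKETCH_TASK_SIZE_MAX;
}

/*
 * Hypothetical helper: the low physical alias of a kernel virtual
 * address under the assumed linear mapping.
 */
static uint32_t phys_alias(uint32_t virt_addr)
{
	return virt_addr - SKETCH_TASK_SIZE_MAX;
}
```

With these assumptions, caller_runs_unmapped(0xC1000123) is false (normal kernel caller, tracing proceeds), while caller_runs_unmapped(phys_alias(0xC1000123)) is true (wakeup path, tracing would be skipped). As noted above, a real fix would probably want a more precise test than this.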