From patchwork Mon Jul 1 13:43:18 2019
X-Patchwork-Submitter: Stephen Douthit
X-Patchwork-Id: 11026043
X-Patchwork-Delegate: lenb@kernel.org
From: Stephen Douthit
To: Jacob Pan, Len Brown, Bjorn Helgaas
CC: Stephen Douthit, "linux-pm@vger.kernel.org",
 "linux-kernel@vger.kernel.org", "linux-pci@vger.kernel.org"
Subject: [PATCH] intel_idle: prevent SKX boot failure when C6 & SERIRQ enabled
Date: Mon, 1 Jul 2019 13:43:18 +0000
Message-ID: <20190701134255.25959-1-stephend@silicom-usa.com>
X-Mailing-List: linux-pm@vger.kernel.org

Interrupts are getting misrouted and/or dropped on SKYLAKE_X based
D-2100s when C6 and SERIRQ are enabled. I've only seen this issue on
systems using SERIRQs (in my case for an LPC based UART providing the
serial console for a headless server).

One failure mode is "do_IRQ: 8.33 No irq handler for vector" getting
printed in the kernel logs. The core getting the unhandled irq is
typically the one handling the UART SERIRQ. I've seen it on other
cores, but I haven't confirmed whether that's because the UART irq
handler was moved to another core at some point. The vector varies
from 33-36, but it's most often 33.

The other failure mode is the system hanging. Sometimes forcing some
non-SERIRQ interrupt to fire (by plugging/unplugging a network/USB
cable) can get the system out of this state. Generating more SERIRQs
via the UART will not unstick the system.

Both failures seemed to occur when transitioning to a low load state,
which is why I started playing around with power management options
and found that booting with "intel_idle.max_cstate=2" fixed the issue.

This patch only disables C6 if it's able to determine that SERIRQs are
enabled by checking the enable bit in the LPC controller's PCI config
space.

Signed-off-by: Stephen Douthit
---
 drivers/idle/intel_idle.c | 35 ++++++++++++++++++++++++++++++++++-
 include/linux/pci_ids.h   |  1 +
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b8647b5c3d4d..353f6a9b1818 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -61,12 +61,13 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
 #include
 
-#define INTEL_IDLE_VERSION "0.4.1"
+#define INTEL_IDLE_VERSION "0.4.2"
 
 static struct cpuidle_driver intel_idle_driver = {
 	.name = "intel_idle",
@@ -1306,6 +1307,35 @@ static void sklh_idle_state_table_update(void)
 	skl_cstates[5].disabled = 1;	/* C8-SKL */
 	skl_cstates[6].disabled = 1;	/* C9-SKL */
 }
+/*
+ * skx_idle_state_table_update()
+ *
+ * On SKX (model 0x55) SoCs disable C6 if SERIRQ is enabled
+ */
+static void skx_idle_state_table_update(void)
+{
+#define SCNT_OFF	0x64
+#define SCNT_EN	(1 << 7)
+	struct pci_dev *pdev = pci_get_device(PCI_VENDOR_ID_INTEL,
+					      PCI_DEVICE_ID_INTEL_SKX_LPC,
+					      NULL);
+	u8 reg;
+
+	/*
+	 * Check bit 7 of the Serial IRQ Control (SCNT) register (0x64) in the
+	 * LPC controller.  If it's set serial IRQs are enabled, and we need to
+	 * disable C6 to prevent hangs.
+	 */
+	if (!pdev)
+		return;
+	if (pci_read_config_byte(pdev, SCNT_OFF, &reg))
+		return;
+	if (!(reg & SCNT_EN))
+		return;
+
+	pr_debug("SERIRQ enabled on SKX, disabling C6 to avoid hangs\n");
+	skx_cstates[2].disabled = 1;	/* C6-SKX */
+}
 /*
  * intel_idle_state_table_update()
  *
@@ -1326,6 +1356,9 @@ static void intel_idle_state_table_update(void)
 	case INTEL_FAM6_SKYLAKE_DESKTOP:
 		sklh_idle_state_table_update();
 		break;
+	case INTEL_FAM6_SKYLAKE_X:
+		skx_idle_state_table_update();
+		break;
 	}
 }
 
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 70e86148cb1e..02bac8de03fd 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2997,6 +2997,7 @@
 #define PCI_DEVICE_ID_INTEL_84460GX	0x84ea
 #define PCI_DEVICE_ID_INTEL_IXP4XX	0x8500
 #define PCI_DEVICE_ID_INTEL_IXP2800	0x9004
+#define PCI_DEVICE_ID_INTEL_SKX_LPC	0xa1c8
 #define PCI_DEVICE_ID_INTEL_S21152BB	0xb152
 
 #define PCI_VENDOR_ID_SCALEMP		0x8686
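
For anyone who wants to reason about the decision logic without booting
a kernel, here is a rough userspace sketch of the check the patch makes:
test bit 7 of the SCNT value and, only if it is set, mark the C6 entry
as disabled. This is not the driver code. SCNT_OFF, SCNT_EN and the
index 2 for C6 are taken from the patch; struct demo_cstate,
demo_skx_cstates, serirq_enabled() and demo_table_update() are
hypothetical names made up for illustration, and the table contents are
an assumption. The SCNT byte is passed in as a plain argument instead
of being read with pci_read_config_byte().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCNT_OFF 0x64       /* Serial IRQ Control register offset (from the patch) */
#define SCNT_EN  (1u << 7)  /* SERIRQ enable bit (from the patch) */
#define DEMO_C6_INDEX 2     /* index of C6 in the table, mirroring skx_cstates[2] */

/* Hypothetical stand-in for one idle-state table entry. */
struct demo_cstate {
	const char *name;
	bool disabled;
};

/* Hypothetical table; the real skx_cstates has more fields per entry. */
static struct demo_cstate demo_skx_cstates[] = {
	{ "C1",  false },
	{ "C1E", false },
	{ "C6",  false },
};

#define DEMO_NSTATES (sizeof(demo_skx_cstates) / sizeof(demo_skx_cstates[0]))

/* Returns true when the given SCNT register value has SERIRQ enabled. */
static bool serirq_enabled(uint8_t scnt)
{
	return (scnt & SCNT_EN) != 0;
}

/*
 * Mirror of the patch's decision: leave the table alone unless SERIRQ
 * is enabled, in which case C6 is marked disabled.
 */
static void demo_table_update(uint8_t scnt)
{
	assert(DEMO_C6_INDEX < DEMO_NSTATES);

	if (!serirq_enabled(scnt))
		return;

	demo_skx_cstates[DEMO_C6_INDEX].disabled = true;
}
```

Calling demo_table_update(0x80) marks the C6 entry disabled, while any
value with bit 7 clear (0x7f, for example) leaves the table untouched,
which matches the "only disable C6 when SERIRQ is enabled" intent
described in the commit message.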