From patchwork Tue Jul 18 06:09:09 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Airlie
X-Patchwork-Id: 9846965
From: Dave Airlie
To: Peter Jones, Bartlomiej Zolnierkiewicz,
	linux-fbdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: luto@kernel.org, hpa@zytor.com, torvalds@linux-foundation.org,
	Dave Airlie
Subject: [PATCH] efifb: allow user to disable write combined mapping.
Date: Tue, 18 Jul 2017 16:09:09 +1000
Message-Id: <20170718060909.5280-1-airlied@redhat.com>
X-Mailing-List: linux-fbdev@vger.kernel.org

This patch allows the user to disable write combined mapping
of the efifb framebuffer console using a nowc option.

A customer noticed major slowdowns on other tasks running on the same
CPU while logging to the console with write combining enabled (a 10x
or greater slowdown on all other cores of the CPU that is doing the
logging).

I reproduced this on a machine with dual CPUs:
Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz (6 core)

I wrote a test that just mmaps the PCI BAR and writes to it in a loop.
While this was running in the background on a single core
(taskset -c 1), building a kernel up to init/version.o
(taskset -c 8) went from 13s to 133s or so. I've yet to explain why
this occurs or what is going wrong; I haven't managed to find a perf
command that gives any insight into this.
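
The test program itself is not included in this mail. As a rough
illustration only, a userspace writer of the kind described (mmap a
region, then store to it in a loop) could be sketched as below. The
function name hammer_mapping() and its parameters are hypothetical,
and which file exposes the framebuffer BAR on a given machine is an
assumption left to the caller; any regular file of at least len bytes
works for a dry run, although that will not exercise write combining.

```c
/* Sketch only: hammer a memory-mapped region with 32-bit stores. */
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map len bytes of path (hypothetical: whatever exposes the BAR) and
 * fill the mapping passes times. Returns 0 on success, -1 on failure.
 */
int hammer_mapping(const char *path, size_t len, unsigned long passes)
{
	volatile uint32_t *p;
	unsigned long pass;
	void *map;
	size_t i;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (map == MAP_FAILED)
		return -1;

	/* volatile keeps the compiler from dropping the repeated stores */
	p = map;
	for (pass = 0; pass < passes; pass++)
		for (i = 0; i < len / sizeof(*p); i++)
			p[i] = (uint32_t)(pass + i);

	munmap(map, len);
	return 0;
}
```

A small main() that calls this with a large pass count, pinned with
taskset -c 1 as above, would play the role of the background writer.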

    11,885,070,715      instructions      #    1.39  insns per cycle
vs
    12,082,592,342      instructions      #    0.13  insns per cycle

is the only thing I've spotted of interest. I've tried at least:
dTLB-stores,dTLB-store-misses,L1-dcache-stores,LLC-store,LLC-store-misses,LLC-load-misses,LLC-loads,\
mem-loads,mem-stores,iTLB-loads,iTLB-load-misses,cache-references,cache-misses

For now it seems at least a good idea to allow a user to disable write
combining if they see this until we can figure it out.

Note also most users get a real framebuffer driver loaded when kms
kicks in; it just happens that on these machines the kernel didn't
support the gpu specific driver.

Signed-off-by: Dave Airlie
Acked-By: Peter Jones
---
 Documentation/fb/efifb.txt  | 6 ++++++
 drivers/video/fbdev/efifb.c | 8 +++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/Documentation/fb/efifb.txt b/Documentation/fb/efifb.txt
index a59916c..1a85c1b 100644
--- a/Documentation/fb/efifb.txt
+++ b/Documentation/fb/efifb.txt
@@ -27,5 +27,11 @@ You have to add the following kernel parameters in your elilo.conf:
 	Macbook Pro 17", iMac 20" :
 		video=efifb:i20
 
+Accepted options:
+
+nowc	Don't map the framebuffer write combined. This can be used
+	to workaround side-effects and slowdowns on other CPU cores
+	when large amounts of console data are written.
+
 --
 Edgar Hucek
diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c
index b827a81..a568fe0 100644
--- a/drivers/video/fbdev/efifb.c
+++ b/drivers/video/fbdev/efifb.c
@@ -17,6 +17,7 @@
 #include
 
 static bool request_mem_succeeded = false;
+static bool nowc = false;
 
 static struct fb_var_screeninfo efifb_defined = {
 	.activate = FB_ACTIVATE_NOW,
@@ -99,6 +100,8 @@ static int efifb_setup(char *options)
 				screen_info.lfb_height = simple_strtoul(this_opt+7, NULL, 0);
 			else if (!strncmp(this_opt, "width:", 6))
 				screen_info.lfb_width = simple_strtoul(this_opt+6, NULL, 0);
+			else if (!strcmp(this_opt, "nowc"))
+				nowc = true;
 		}
 	}
 
@@ -255,7 +258,10 @@ static int efifb_probe(struct platform_device *dev)
 	info->apertures->ranges[0].base = efifb_fix.smem_start;
 	info->apertures->ranges[0].size = size_remap;
 
-	info->screen_base = ioremap_wc(efifb_fix.smem_start, efifb_fix.smem_len);
+	if (nowc)
+		info->screen_base = ioremap(efifb_fix.smem_start, efifb_fix.smem_len);
+	else
+		info->screen_base = ioremap_wc(efifb_fix.smem_start, efifb_fix.smem_len);
 	if (!info->screen_base) {
 		pr_err("efifb: abort, cannot ioremap video memory 0x%x @ 0x%lx\n",
 			efifb_fix.smem_len, efifb_fix.smem_start);
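
Note that the new check in efifb_setup() uses strcmp(), so "nowc" has
to match a whole comma-separated token rather than a prefix. A
userspace approximation of that matching, for illustration only, is
sketched below. options_have_nowc() is a hypothetical helper, not
kernel code, and it uses strcspn() instead of the driver's own
tokenising loop.

```c
/* Sketch only: exact-token match for "nowc" in an option string. */
#include <stdbool.h>
#include <string.h>

/*
 * Walk a comma-separated option string and report whether one token
 * is exactly "nowc". A NULL or empty string yields false.
 */
bool options_have_nowc(const char *options)
{
	const char *tok = options;

	while (tok && *tok) {
		size_t n = strcspn(tok, ",");	/* length of this token */

		if (n == 4 && !strncmp(tok, "nowc", 4))
			return true;
		tok += n;
		if (*tok == ',')
			tok++;		/* step over the separator */
	}
	return false;
}
```

With this, an option string such as "width:1024,nowc" matches, while
"nowcx" does not.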