From patchwork Fri Feb 27 19:43:27 2015
X-Patchwork-Submitter: Peter Crosthwaite
X-Patchwork-Id: 5903091
In-Reply-To: <20150225132437.GD12377@arm.com>
References: <1424819257-22664-1-git-send-email-peter.crosthwaite@xilinx.com>
 <20150225132437.GD12377@arm.com>
Date: Fri, 27 Feb 2015 11:43:27 -0800
Subject: Re: [RFC PATCH] arm64: Implement cpu_relax as yield
From: Peter Crosthwaite
To: Will Deacon
Cc: Catalin Marinas, "michals@xilinx.com",
 "linux-kernel@vger.kernel.org",
 "linux-arm-kernel@lists.infradead.org"

On Wed, Feb 25, 2015 at 5:24 AM, Will Deacon wrote:
> On Tue, Feb 24, 2015 at 11:07:37PM +0000, Peter Crosthwaite wrote:
>> ARM64 has the yield nop hint which has the intended semantics of
>> cpu_relax. Implement.
>>
>> The immediate application is ARM CPU emulators. An emulator can take
>> advantage of the yield hint to de-prioritise an emulated CPU in favor
>> of other emulation tasks.
>> QEMU A64 SMP emulation has yield awareness,
>> and sees a significant boot time performance increase with this change.
>
> Could you elaborate on the QEMU SMP boot case please? Usually SMP pens
> for booting make use of wfe/sev to minimise the spinning overhead.
>

So I did some follow-up experiments. With my patch applied I started
trapping instances of cpu_relax (now yielding) in gdb. I then commented
out the cpu_relax calls one by one.

This one seems to be the key (see the csd_lock_wait() diff at the end of
this mail).

I found three cpu_relax calls that each seem to do some spinning in the
boot. There probably will be more, but I stopped after finding the above
issue. Full gdb traces below:

Breakpoint 1, multi_cpu_stop (data=0x0) at kernel/stop_machine.c:191
191		cpu_relax();
(gdb) bt
#0  multi_cpu_stop (data=0x0) at kernel/stop_machine.c:191
#1  0xffffffc00011e750 in cpu_stopper_thread (cpu=)
    at kernel/stop_machine.c:473
#2  0xffffffc0000ce118 in smpboot_thread_fn (data=0xffffffc00581b980)
    at kernel/smpboot.c:161
#3  0xffffffc0000cae1c in kthread (_create=0xffffffc00581bb00)
    at kernel/kthread.c:207
#4  0xffffffc000085bf0 in ret_from_fork () at arch/arm64/kernel/entry.S:635
#5  0xffffffc000085bf0 in ret_from_fork () at arch/arm64/kernel/entry.S:635

Breakpoint 1, ktime_get_update_offsets_tick (offs_real=0xffffffc005987b70,
    offs_boot=0xffffffc005987b68, offs_tai=0xffffffc005987b60)
    at kernel/time/timekeeping.c:1785
1785		seq = read_seqcount_begin(&tk_core.seq);
(gdb) bt
#0  ktime_get_update_offsets_tick (offs_real=0xffffffc005987b70,
    offs_boot=0xffffffc005987b68, offs_tai=0xffffffc005987b60)
    at kernel/time/timekeeping.c:1785
#1  0xffffffc0000fbe0c in hrtimer_get_softirq_time (base=)
    at kernel/time/hrtimer.c:122
#2  hrtimer_run_queues () at kernel/time/hrtimer.c:1448
#3  0xffffffc0000faaac in run_local_timers () at kernel/time/timer.c:1411
#4  update_process_times (user_tick=0) at kernel/time/timer.c:1383
#5  0xffffffc000106dfc in tick_periodic (cpu=)
    at kernel/time/tick-common.c:91
#6  0xffffffc000107048 in tick_handle_periodic (dev=0xffffffc006fcff40)
    at kernel/time/tick-common.c:103
#7  0xffffffc000469e30 in timer_handler (evt=, access=)
    at drivers/clocksource/arm_arch_timer.c:148
#8  arch_timer_handler_virt (irq=, dev_id=)
    at drivers/clocksource/arm_arch_timer.c:159
#9  0xffffffc0000eef7c in handle_percpu_devid_irq (irq=3,
    desc=0xffffffc005804800) at kernel/irq/chip.c:714
#10 0xffffffc0000eb0d4 in generic_handle_irq_desc (desc=, irq=)
    at include/linux/irqdesc.h:128
#11 generic_handle_irq (irq=3) at kernel/irq/irqdesc.c:351
#12 0xffffffc0000eb3e4 in __handle_domain_irq (domain=, hwirq=,
    lookup=true, regs=) at kernel/irq/irqdesc.c:388
#13 0xffffffc00008241c in handle_domain_irq (regs=, hwirq=, domain=)
    at include/linux/irqdesc.h:146
#14 gic_handle_irq (regs=0xffffffc005987cc0)
    at drivers/irqchip/irq-gic.c:276

which I think is this:

static inline unsigned __read_seqcount_begin(const seqcount_t *s)
{
	unsigned ret;

repeat:
	ret = READ_ONCE(s->sequence);
	if (unlikely(ret & 1)) {
		cpu_relax();
		goto repeat;
	}
	return ret;
}

Breakpoint 1, csd_lock_wait (csd=) at kernel/smp.c:111
111		cpu_relax();
(gdb) bt
#0  csd_lock_wait (csd=) at kernel/smp.c:111
#1  smp_call_function_many (mask=, func=0xffffffc000083800 , info=0x0,
    wait=true) at kernel/smp.c:449
#2  0xffffffc00010ddc0 in smp_call_function (func=, info=, wait=)
    at kernel/smp.c:473
#3  0xffffffc00010de20 in on_each_cpu (func=0xffffffc000083800 , info=0x0,
    wait=) at kernel/smp.c:579
#4  0xffffffc00008385c in debug_monitors_init ()
    at arch/arm64/kernel/debug-monitors.c:151
#5  0xffffffc0000828c8 in do_one_initcall (fn=0xffffffc00008383c )
    at init/main.c:785
#6  0xffffffc00073baf0 in do_initcall_level (level=) at init/main.c:850
#7  do_initcalls () at init/main.c:858
#8  do_basic_setup () at init/main.c:877
#9  kernel_init_freeable () at init/main.c:998
#10 0xffffffc00054be04 in kernel_init (unused=) at init/main.c:928
#11 0xffffffc000085bf0 in ret_from_fork () at arch/arm64/kernel/entry.S:635
#12 0xffffffc000085bf0 in ret_from_fork () at arch/arm64/kernel/entry.S:635
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Kernel baseline revision is 1d97b73f from linux-next tree.

Regards,
Peter

> Will

diff --git a/kernel/smp.c b/kernel/smp.c
index f38a1e6..1c692be 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -108,7 +108,8 @@ void __init call_function_init(void)
 static void csd_lock_wait(struct call_single_data *csd)
 {
 	while (csd->flags & CSD_FLAG_LOCK)
-		cpu_relax();
+		;
+		//cpu_relax();
 }

Hack above causes boot time delay even with my patch applied, so this
is causing my issue it seems.