From patchwork Wed May 3 18:59:19 2017
X-Patchwork-Submitter: pan xinhui
X-Patchwork-Id: 9710297
From: pan xinhui
To: Yury Norov, Will Deacon, Peter Zijlstra,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 0/3] arm64: queued spinlocks and rw-locks
Date: Wed, 3 May 2017 18:59:19 +0000
References: <20170503145141.4966-1-ynorov@caviumnetworks.com>
In-Reply-To: <20170503145141.4966-1-ynorov@caviumnetworks.com>
Cc: Mark Rutland, Andrew Pinski, Pan Xinhui, Arnd Bergmann, Catalin Marinas,
 Adam Wallis, Ingo Molnar, Jan Glauber

On 2017/5/3 22:51, Yury Norov wrote:
> Patch 3 adds the implementation of queue-based locking for
> ARM64, and the kernel config option to enable it. Patches
> 1 and 2 fix some mess in header files so that patch 3 applies smoothly.
>
> Tested on QDF2400 by Adam Wallis, with huge improvements on
> the torture tests with these patches.
>
> Tested on ThunderX by Andrew Pinski:
> 120 threads (30 cores, 4 threads/core), CN99xx (single socket):
>
> benchmark            Units   qspinlocks vs ticket locks
> sched/messaging      s        73.91%
> sched/pipe           ops/s   104.18%
> futex/hash           ops/s   103.87%
> futex/wake           ms       71.04%
> futex/wake-parallel  ms       93.88%
> futex/requeue        ms       96.47%
> futex/lock-pi        ops/s   118.33%
>
> Note that there is a queued-lock implementation for PowerPC, introduced
> by Pan Xinhui. He tested it extensively and also found a significant
> performance gain. The arch part is very similar to this patch, though:
> https://lwn.net/Articles/701137/

Hi, Yury

Glad to know you will join locking development :) I have left IBM, but I
still care about the queued spinlock anyway.

> RFC: https://www.spinics.net/lists/arm-kernel/msg575575.html

I notice you raised a question there about the performance degradation
when acquiring the rw-lock for read on qemu. That is strange indeed; I
once enabled qrwlock on ppc too. I paste your test results below. Is this
a comparison of qspinlock + qrwlock vs. qspinlock + normal rwlock, or of
qspinlock + qrwlock vs. normal spinlock + normal rwlock?

                        Before      After
  spin_lock-torture:  38957034   37076367     -4.83
  rw_lock-torture W:   5369471   18971957    253.33
  rw_lock-torture R:   6413179    3668160    -42.80

I am not sure how that could happen. I have made one RFC patch below (not
based on the latest kernel); you could apply it to check whether there is
any performance improvement. The idea is this: in
queued_write_lock_slowpath(), we do not unlock ->wait_lock, because while
the writer holds the rwlock all readers are still waiting anyway. Also,
in queued_read_lock_slowpath(), calling rspin_until_writer_unlock() looks
like it introduces a little overhead, namely spinning on the rwlock. In
the end queued_read_lock_slowpath() is heavy compared with the normal
rwlock, so maybe such a result is somehow reasonable?
thanks
xinhui

> v1:
>  - queued_spin_unlock_wait() and queued_spin_is_locked() are
>    re-implemented in arch part to add additional memory barriers;
>  - queued locks are made optional, ticket locks are enabled by default.
>
> Jan Glauber (1):
>   arm64/locking: qspinlocks and qrwlocks support
>
> Yury Norov (2):
>   kernel/locking: #include <asm/spinlock.h> in qrwlock.c
>   asm-generic: don't #include <linux/atomic.h> in qspinlock_types.h
>
>  arch/arm64/Kconfig                      | 24 +++++++++++++++++++
>  arch/arm64/include/asm/qrwlock.h        |  7 ++++++
>  arch/arm64/include/asm/qspinlock.h      | 42 +++++++++++++++++++++++++++++++++
>  arch/arm64/include/asm/spinlock.h       | 12 ++++++++++
>  arch/arm64/include/asm/spinlock_types.h | 14 ++++++++---
>  arch/arm64/kernel/Makefile              |  1 +
>  arch/arm64/kernel/qspinlock.c           | 34 ++++++++++++++++++++++++++
>  include/asm-generic/qspinlock.h         |  1 +
>  include/asm-generic/qspinlock_types.h   |  8 -------
>  kernel/locking/qrwlock.c                |  1 +
>  10 files changed, 133 insertions(+), 11 deletions(-)
>  create mode 100644 arch/arm64/include/asm/qrwlock.h
>  create mode 100644 arch/arm64/include/asm/qspinlock.h
>  create mode 100644 arch/arm64/kernel/qspinlock.c

diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 54a8e65..28ee01d 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -28,8 +28,9 @@
  * Writer states & reader shift and bias
  */
 #define _QW_WAITING	1		/* A writer is waiting */
-#define _QW_LOCKED	0xff		/* A writer holds the lock */
-#define _QW_WMASK	0xff		/* Writer mask */
+#define _QW_KICK	0x80		/* need to unlock the spinlock */
+#define _QW_LOCKED	0x7f		/* A writer holds the lock */
+#define _QW_WMASK	0x7f		/* Writer mask */
 #define _QR_SHIFT	8		/* Reader count shift */
 #define _QR_BIAS	(1U << _QR_SHIFT)

@@ -139,7 +140,10 @@ static inline void queued_read_unlock(struct qrwlock *lock)
  */
 static inline void queued_write_unlock(struct qrwlock *lock)
 {
-	smp_store_release((u8 *)&lock->cnts, 0);
+	u32 v = atomic_read(&lock->cnts) & (_QW_WMASK | _QW_KICK);
+	if (v & _QW_KICK)
+		arch_spin_unlock(&lock->wait_lock);
+	(void)atomic_sub_return_release(v, &lock->cnts);
 }

 /*
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index fec0823..1f0ea02 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -116,7 +116,7 @@ void queued_write_lock_slowpath(struct qrwlock *lock)

 	/* Try to acquire the lock directly if no reader is present */
 	if (!atomic_read(&lock->cnts) &&
-	    (atomic_cmpxchg_acquire(&lock->cnts, 0, _QW_LOCKED) == 0))
+	    (atomic_cmpxchg_acquire(&lock->cnts, 0, _QW_LOCKED|_QW_KICK) == 0))
 		goto unlock;

@@ -138,12 +138,13 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 		cnts = atomic_read(&lock->cnts);
 		if ((cnts == _QW_WAITING) &&
 		    (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
-					    _QW_LOCKED) == _QW_WAITING))
+					    _QW_LOCKED|_QW_KICK) == _QW_WAITING))
 			break;

 		cpu_relax_lowlatency();
 	}
 unlock:
-	arch_spin_unlock(&lock->wait_lock);
+	return;
 }
 EXPORT_SYMBOL(queued_write_lock_slowpath);
-- 
2.4.11