From patchwork Thu Feb 20 12:02:31 2020
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 11393901
From: Roger Pau Monne
Date: Thu, 20 Feb 2020 13:02:31 +0100
Message-ID: <20200220120231.86907-1-roger.pau@citrix.com>
X-Mailer: git-send-email 2.25.0
Subject: [Xen-devel] [PATCH] rwlock: allow recursive read locking when
 already locked in write mode
List-Id: Xen developer discussion
Cc: Jürgen Groß, Stefano Stabellini, Julien Grall, Wei Liu,
 Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
 Jan Beulich, Roger Pau Monne

Allow a CPU already holding the lock in write mode to also lock it in
read mode. There's no harm in allowing read locking a rwlock that's
already owned by the caller (i.e. the CPU) in write mode. Allowing such
accesses is required at least for the CPU maps use-case.

In order to do this, reserve 12 bits of the lock to store the ID of the
CPU holding it in write mode, which allows supporting up to 4096 CPUs.
Also reduce the write lock mask to 2 bits: one to signal there are
pending writers waiting on the lock and the other to signal the lock is
owned in write mode.

This reduces the maximum number of concurrent readers from 16777216 to
262144. I think this should still be enough; otherwise the lock field
can be expanded from 32 to 64 bits if all architectures support atomic
operations on 64-bit integers.

Fixes: 5872c83b42c608 ('smp: convert the cpu maps lock into a rw lock')
Reported-by: Jan Beulich
Reported-by: Jürgen Groß
Signed-off-by: Roger Pau Monné
---
I've done some testing and at least the CPU down case is fixed now.
Posting early in order to get feedback on the approach taken.
---
 xen/common/rwlock.c      |  4 ++--
 xen/include/xen/rwlock.h | 47 ++++++++++++++++++++++++++--------------
 2 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/xen/common/rwlock.c b/xen/common/rwlock.c
index d568bbf6de..dadab372b5 100644
--- a/xen/common/rwlock.c
+++ b/xen/common/rwlock.c
@@ -69,7 +69,7 @@ void queue_write_lock_slowpath(rwlock_t *lock)
 
     /* Try to acquire the lock directly if no reader is present. */
     if ( !atomic_read(&lock->cnts) &&
-         (atomic_cmpxchg(&lock->cnts, 0, _QW_LOCKED) == 0) )
+         (atomic_cmpxchg(&lock->cnts, 0, _write_lock_val()) == 0) )
         goto unlock;
 
     /*
@@ -93,7 +93,7 @@ void queue_write_lock_slowpath(rwlock_t *lock)
         cnts = atomic_read(&lock->cnts);
         if ( (cnts == _QW_WAITING) &&
              (atomic_cmpxchg(&lock->cnts, _QW_WAITING,
-                             _QW_LOCKED) == _QW_WAITING) )
+                             _write_lock_val()) == _QW_WAITING) )
             break;
 
         cpu_relax();
diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
index 3dfea1ac2a..b430ebd846 100644
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -20,21 +20,30 @@ typedef struct {
 #define DEFINE_RWLOCK(l) rwlock_t l = RW_LOCK_UNLOCKED
 #define rwlock_init(l) (*(l) = (rwlock_t)RW_LOCK_UNLOCKED)
 
-/*
- * Writer states & reader shift and bias.
- *
- * Writer field is 8 bit to allow for potential optimisation, see
- * _write_unlock().
- */
-#define    _QW_WAITING  1               /* A writer is waiting */
-#define    _QW_LOCKED   0xff            /* A writer holds the lock */
-#define    _QW_WMASK    0xff            /* Writer mask.*/
-#define    _QR_SHIFT    8               /* Reader count shift */
+/* Writer states & reader shift and bias. */
+#define    _QW_WAITING  1               /* A writer is waiting */
+#define    _QW_LOCKED   3               /* A writer holds the lock */
+#define    _QW_WMASK    3               /* Writer mask */
+#define    _QW_CPUSHIFT 2               /* Writer CPU shift */
+#define    _QW_CPUMASK  0x3ffc          /* Writer CPU mask */
+#define    _QR_SHIFT    14              /* Reader count shift */
 #define    _QR_BIAS     (1U << _QR_SHIFT)
 
 void queue_read_lock_slowpath(rwlock_t *lock);
 void queue_write_lock_slowpath(rwlock_t *lock);
 
+static inline bool _is_write_locked_by_me(uint32_t cnts)
+{
+    BUILD_BUG_ON((_QW_CPUMASK >> _QW_CPUSHIFT) < NR_CPUS);
+    return (cnts & _QW_WMASK) == _QW_LOCKED &&
+           MASK_EXTR(cnts, _QW_CPUMASK) == smp_processor_id();
+}
+
+static inline bool _can_read_lock(uint32_t cnts)
+{
+    return !(cnts & _QW_WMASK) || _is_write_locked_by_me(cnts);
+}
+
 /*
  * _read_trylock - try to acquire read lock of a queue rwlock.
  * @lock : Pointer to queue rwlock structure.
@@ -45,10 +54,10 @@ static inline int _read_trylock(rwlock_t *lock)
     u32 cnts;
 
     cnts = atomic_read(&lock->cnts);
-    if ( likely(!(cnts & _QW_WMASK)) )
+    if ( likely(_can_read_lock(cnts)) )
     {
         cnts = (u32)atomic_add_return(_QR_BIAS, &lock->cnts);
-        if ( likely(!(cnts & _QW_WMASK)) )
+        if ( likely(_can_read_lock(cnts)) )
             return 1;
         atomic_sub(_QR_BIAS, &lock->cnts);
     }
@@ -64,7 +73,7 @@ static inline void _read_lock(rwlock_t *lock)
     u32 cnts;
 
     cnts = atomic_add_return(_QR_BIAS, &lock->cnts);
-    if ( likely(!(cnts & _QW_WMASK)) )
+    if ( likely(_can_read_lock(cnts)) )
         return;
 
     /* The slowpath will decrement the reader count, if necessary. */
@@ -115,6 +124,11 @@ static inline int _rw_is_locked(rwlock_t *lock)
     return atomic_read(&lock->cnts);
 }
 
+static inline uint32_t _write_lock_val(void)
+{
+    return _QW_LOCKED | MASK_INSR(smp_processor_id(), _QW_CPUMASK);
+}
+
 /*
  * queue_write_lock - acquire write lock of a queue rwlock.
  * @lock : Pointer to queue rwlock structure.
@@ -122,7 +136,7 @@ static inline int _rw_is_locked(rwlock_t *lock)
 static inline void _write_lock(rwlock_t *lock)
 {
     /* Optimize for the unfair lock case where the fair flag is 0. */
-    if ( atomic_cmpxchg(&lock->cnts, 0, _QW_LOCKED) == 0 )
+    if ( atomic_cmpxchg(&lock->cnts, 0, _write_lock_val()) == 0 )
         return;
 
     queue_write_lock_slowpath(lock);
@@ -157,7 +171,7 @@ static inline int _write_trylock(rwlock_t *lock)
     if ( unlikely(cnts) )
         return 0;
 
-    return likely(atomic_cmpxchg(&lock->cnts, 0, _QW_LOCKED) == 0);
+    return likely(atomic_cmpxchg(&lock->cnts, 0, _write_lock_val()) == 0);
 }
 
 static inline void _write_unlock(rwlock_t *lock)
@@ -166,7 +180,8 @@ static inline void _write_unlock(rwlock_t *lock)
      * If the writer field is atomic, it can be cleared directly.
      * Otherwise, an atomic subtraction will be used to clear it.
      */
-    atomic_sub(_QW_LOCKED, &lock->cnts);
+    ASSERT(_is_write_locked_by_me(atomic_read(&lock->cnts)));
+    atomic_sub(_write_lock_val(), &lock->cnts);
 }
 
 static inline void _write_unlock_irq(rwlock_t *lock)