From patchwork Tue Jul 16 07:39:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 11045423 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B6153746 for ; Tue, 16 Jul 2019 07:49:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A29E02857E for ; Tue, 16 Jul 2019 07:49:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 93BFF285A9; Tue, 16 Jul 2019 07:49:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED,UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C7F3E2857E for ; Tue, 16 Jul 2019 07:49:40 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1hnIBn-0002s2-8v; Tue, 16 Jul 2019 07:47:55 +0000 Received: from all-amaz-eas1.inumbo.com ([34.197.232.57] helo=us1-amaz-eas2.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1hnIBl-0002ru-Rs for xen-devel@lists.xenproject.org; Tue, 16 Jul 2019 07:47:53 +0000 X-Inumbo-ID: 00c564e0-a79e-11e9-a42a-23f5b4669daa Received: from m9a0002g.houston.softwaregrp.com (unknown [15.124.64.67]) by us1-amaz-eas2.inumbo.com (Halon) with ESMTPS id 00c564e0-a79e-11e9-a42a-23f5b4669daa; Tue, 16 Jul 2019 07:47:48 +0000 (UTC) Received: FROM m9a0002g.houston.softwaregrp.com (15.121.0.190) BY m9a0002g.houston.softwaregrp.com WITH ESMTP; Tue, 16 Jul 2019 07:47:47 +0000 Received: from M4W0334.microfocus.com (2002:f78:1192::f78:1192) by M9W0067.microfocus.com (2002:f79:be::f79:be) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1591.10; Tue, 16 Jul 2019 07:39:59 +0000 Received: from NAM01-BY2-obe.outbound.protection.outlook.com (15.124.8.14) by M4W0334.microfocus.com (15.120.17.146) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1591.10 via Frontend Transport; Tue, 16 Jul 2019 07:39:59 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ehKHYUWMRUvAe+2jkxB2SiZhFGpBazz++hNcM3qAbe/RADrLOpyt4OkKciEj5ZVMNYxxX15vskHh0/xfQHCF1XuHykfIry8sZC/dDmqlatlzI17i8bRj5XttDebjHWbKljmNV18rTMXO4ZHWIvsotB54eEozPwAR9/jD1GuRrv+PY9H0Ra4T5CaBTr32U9nVAVUDYb9r6mhJbVdBmyzZtNZSnnLlgoZdgMBI0G/vofK0Fwus56r3HvgcBszE2JmkcCyTqELMY+Qt0fVT1XTrv2Dbk2XsgY9nFtH1V2Q5KWPu/kSDXDoXhC+8nKgXt0CtMHahkzOmlRrCgyayR8GZ5Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=8OSDl/1ltzuH4FKjkG4mGUSd0Iy2sK7prqgf1+74uNU=; b=Xan1cPO/wb/RzX4TgJrKmdDZ/oBhTFwY/sshi0zUCXCwLxX69FfmtgkZYGvW1Ltq+ErAzBjtGnBrRwWz4005IUXjL+TAMToocx+KPjazDvFfqgl1VAQTTt0zV86dvrc/Rp9hIzBXLG4HK4OjEvgC0ShQgOVjaq8ornQ4UUn9oQGrWDIhc9IP+KHK5UY5JkGCJxM16SGU5AcA7o3RtySLRWccm1t4jdY8HO51Pyl6CzfV5P+tP8uRPF00lKy+ueeFwwKEWAogbYgt9l4UsqFyx1gA4CoycEnpo3Vev1yonK7ihsELOY2RVYipD1JEHusUCE2mDtYJR8H1nwhTx65YKw== ARC-Authentication-Results: i=1; mx.microsoft.com 1;spf=pass smtp.mailfrom=suse.com;dmarc=pass action=none header.from=suse.com;dkim=pass header.d=suse.com;arc=none Received: from DM6PR18MB3401.namprd18.prod.outlook.com (10.255.174.218) by DM6PR18MB2667.namprd18.prod.outlook.com (20.179.107.10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2073.14; Tue, 16 Jul 2019 07:39:58 +0000 Received: from DM6PR18MB3401.namprd18.prod.outlook.com ([fe80::1fe:35f6:faf3:78c7]) by DM6PR18MB3401.namprd18.prod.outlook.com ([fe80::1fe:35f6:faf3:78c7%7]) with mapi id 15.20.2073.012; Tue, 16 Jul 2019 07:39:58 +0000 From: Jan Beulich To: "xen-devel@lists.xenproject.org" Thread-Topic: [PATCH v4 05/13] x86/IRQ: fix locking around vector management Thread-Index: AQHVO6mrVR2llFRiTk2C0NteVXbnZA== Date: Tue, 16 Jul 2019 07:39:58 +0000 Message-ID: <590e074c-f3b2-91ea-acaa-5a0b0409d749@suse.com> References: <5cda711a-b417-76e9-d113-ea838463f225@suse.com> In-Reply-To: <5cda711a-b417-76e9-d113-ea838463f225@suse.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-clientproxiedby: DB6PR01CA0070.eurprd01.prod.exchangelabs.com (2603:10a6:6:46::47) To DM6PR18MB3401.namprd18.prod.outlook.com (2603:10b6:5:1cc::26) authentication-results: spf=none (sender IP is ) smtp.mailfrom=JBeulich@suse.com; x-ms-exchange-messagesentrepresentingtype: 1 x-originating-ip: [87.234.252.170] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: d96ec161-3de4-4523-dc01-08d709c0cd9b x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(7020095)(4652040)(8989299)(4534185)(4627221)(201703031133081)(201702281549075)(8990200)(5600148)(711020)(4605104)(1401327)(2017052603328)(7193020); SRVR:DM6PR18MB2667; x-ms-traffictypediagnostic: DM6PR18MB2667: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-forefront-prvs: 0100732B76 x-forefront-antispam-report: SFV:NSPM; SFS:(10019020)(4636009)(39860400002)(346002)(366004)(376002)(396003)(136003)(43544003)(189003)(199004)(6436002)(102836004)(5640700003)(8676002)(53936002)(6506007)(4326008)(386003)(6512007)(7736002)(186003)(6116002)(6486002)(305945005)(52116002)(2906002)(66066001)(11346002)(81166006)(2616005)(8936002)(316002)(476003)(3846002)(66476007)(446003)(76176011)(2351001)(99286004)(26005)(81156014)(31686004)(68736007)(6916009)(256004)(14444005)(14454004)(486006)(71190400001)(478600001)(71200400001)(31696002)(5660300002)(86362001)(80792005)(2501003)(54906003)(36756003)(25786009)(64756008)(66556008)(66946007)(66446008); DIR:OUT; SFP:1102; SCL:1; SRVR:DM6PR18MB2667; H:DM6PR18MB3401.namprd18.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; MX:1; A:1; received-spf: None (protection.outlook.com: suse.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 x-microsoft-antispam-message-info: mHHMVUY4qKgorqQVrrG/o+cw/Sy6SEZgo/4U/ldHU8IkIILSrhucvGI8aDAzGedcQEPsw7odf+1Ttdp5Jh61BrAOjgC4S9Rn8pUBdBSHQBc3XBHmmoy/LCHPcLqPM2laxLM5SYDehdtOYm/h9U9ClM05bP4/1V4oZ/aJJD90CDmXwxcw3K2ozW6jRUm9WY9G68km4JhICFkyufH8MA/ItCDwR1Afav9kejpKPXs3lGy5hiIUTul2/l720TnlmcidHipqZamnT9+bRUIduMX3iSTAUswX18fgDdeR+GRko5sTKMmFfDGwyauwGAEw3b5Shy/Em5RAok9a4dJj5DDauW/xw98hxtRNnd5yPnVIp3u5ZoX2FmE/4qzwFzojNvsEuy4EcOSgw4F8XU48XiNsq+f+4CvEB+L+QqaegQRYg8g= Content-ID: MIME-Version: 1.0 X-MS-Exchange-CrossTenant-Network-Message-Id: d96ec161-3de4-4523-dc01-08d709c0cd9b X-MS-Exchange-CrossTenant-originalarrivaltime: 16 Jul 2019 07:39:58.5268 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 856b813c-16e5-49a5-85ec-6f081e13b527 X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: JBeulich@suse.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR18MB2667 X-OriginatorOrg: suse.com Subject: [Xen-devel] [PATCH v4 05/13] x86/IRQ: fix locking around vector management X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Cc: Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?= Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" X-Virus-Scanned: ClamAV using ClamSMTP All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc fields, and hence ought to be called with the descriptor lock held in addition to vector_lock. This is currently the case for only set_desc_affinity() (in the common case) and destroy_irq(), which also clarifies what the nesting behavior between the locks has to be. Reflect the new expectation by having these functions all take a descriptor as parameter instead of an interrupt number. Also take care of the two special cases of calls to set_desc_affinity(): set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called directly as well, and in these cases the descriptor locks hadn't got acquired till now. For set_ioapic_affinity_irq() this means acquiring / releasing of the IO-APIC lock can be plain spin_{,un}lock() then. Drop one of the two leading underscores from all three functions at the same time. There's one case left where descriptors get manipulated with just vector_lock held: setup_vector_irq() assumes its caller to acquire vector_lock, and hence can't itself acquire the descriptor locks (wrong lock order). I don't currently see how to address this. Signed-off-by: Jan Beulich Reviewed-by: Kevin Tian [VT-d] Reviewed-by: Roger Pau Monné Acked-by: Andrew Cooper --- v4: Adjust comment ahead of setup_vector_irq(). v3: Also drop one leading underscore from a comment. Re-base. v2: Also adjust set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity(). --- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -550,14 +550,14 @@ static void clear_IO_APIC (void) static void set_ioapic_affinity_irq(struct irq_desc *desc, const cpumask_t *mask) { - unsigned long flags; unsigned int dest; int pin, irq; struct irq_pin_list *entry; irq = desc->irq; - spin_lock_irqsave(&ioapic_lock, flags); + spin_lock(&ioapic_lock); + dest = set_desc_affinity(desc, mask); if (dest != BAD_APICID) { if ( !x2apic_enabled ) @@ -580,8 +580,8 @@ set_ioapic_affinity_irq(struct irq_desc entry = irq_2_pin + entry->next; } } - spin_unlock_irqrestore(&ioapic_lock, flags); + spin_unlock(&ioapic_lock); } /* @@ -674,16 +674,19 @@ void /*__init*/ setup_ioapic_dest(void) for (ioapic = 0; ioapic < nr_ioapics; ioapic++) { for (pin = 0; pin < nr_ioapic_entries[ioapic]; pin++) { struct irq_desc *desc; + unsigned long flags; irq_entry = find_irq_entry(ioapic, pin, mp_INT); if (irq_entry == -1) continue; irq = pin_2_irq(irq_entry, ioapic, pin); desc = irq_to_desc(irq); + + spin_lock_irqsave(&desc->lock, flags); BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map)); set_ioapic_affinity_irq(desc, desc->arch.cpu_mask); + spin_unlock_irqrestore(&desc->lock, flags); } - } } --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -27,6 +27,7 @@ #include static int parse_irq_vector_map_param(const char *s); +static void _clear_irq_vector(struct irq_desc *desc); /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. */ bool __read_mostly opt_noirqbalance; @@ -143,13 +144,12 @@ static void trace_irq_mask(uint32_t even _trace_irq_mask(event, irq, vector, mask); } -static int __init __bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask) +static int __init _bind_irq_vector(struct irq_desc *desc, int vector, + const cpumask_t *cpu_mask) { cpumask_t online_mask; int cpu; - struct irq_desc *desc = irq_to_desc(irq); - BUG_ON((unsigned)irq >= nr_irqs); BUG_ON((unsigned)vector >= NR_VECTORS); cpumask_and(&online_mask, cpu_mask, &cpu_online_map); @@ -160,9 +160,9 @@ static int __init __bind_irq_vector(int return 0; if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED ) return -EBUSY; - trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, irq, vector, &online_mask); + trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, &online_mask); for_each_cpu(cpu, &online_mask) - per_cpu(vector_irq, cpu)[vector] = irq; + per_cpu(vector_irq, cpu)[vector] = desc->irq; desc->arch.vector = vector; cpumask_copy(desc->arch.cpu_mask, &online_mask); if ( desc->arch.used_vectors ) @@ -176,12 +176,18 @@ static int __init __bind_irq_vector(int int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask) { + struct irq_desc *desc = irq_to_desc(irq); unsigned long flags; int ret; - spin_lock_irqsave(&vector_lock, flags); - ret = __bind_irq_vector(irq, vector, cpu_mask); - spin_unlock_irqrestore(&vector_lock, flags); + BUG_ON((unsigned)irq >= nr_irqs); + + spin_lock_irqsave(&desc->lock, flags); + spin_lock(&vector_lock); + ret = _bind_irq_vector(desc, vector, cpu_mask); + spin_unlock(&vector_lock); + spin_unlock_irqrestore(&desc->lock, flags); + return ret; } @@ -266,18 +272,20 @@ void destroy_irq(unsigned int irq) spin_lock_irqsave(&desc->lock, flags); desc->handler = &no_irq_type; - clear_irq_vector(irq); + spin_lock(&vector_lock); + _clear_irq_vector(desc); + spin_unlock(&vector_lock); desc->arch.used_vectors = NULL; spin_unlock_irqrestore(&desc->lock, flags); xfree(action); } -static void __clear_irq_vector(int irq) +static void _clear_irq_vector(struct irq_desc *desc) { - int cpu, vector, old_vector; + unsigned int cpu; + int vector, old_vector, irq = desc->irq; cpumask_t tmp_mask; - struct irq_desc *desc = irq_to_desc(irq); BUG_ON(!desc->arch.vector); @@ -323,11 +331,14 @@ static void __clear_irq_vector(int irq) void clear_irq_vector(int irq) { + struct irq_desc *desc = irq_to_desc(irq); unsigned long flags; - spin_lock_irqsave(&vector_lock, flags); - __clear_irq_vector(irq); - spin_unlock_irqrestore(&vector_lock, flags); + spin_lock_irqsave(&desc->lock, flags); + spin_lock(&vector_lock); + _clear_irq_vector(desc); + spin_unlock(&vector_lock); + spin_unlock_irqrestore(&desc->lock, flags); } int irq_to_vector(int irq) @@ -462,8 +473,7 @@ static vmask_t *irq_get_used_vector_mask return ret; } -static int __assign_irq_vector( - int irq, struct irq_desc *desc, const cpumask_t *mask) +static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask) { /* * NOTE! The local APIC isn't very good at handling @@ -477,7 +487,8 @@ static int __assign_irq_vector( * 0x80, because int 0x80 is hm, kind of importantish. ;) */ static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0; - int cpu, err, old_vector; + unsigned int cpu; + int err, old_vector, irq = desc->irq; vmask_t *irq_used_vectors = NULL; old_vector = irq_to_vector(irq); @@ -590,8 +601,12 @@ int assign_irq_vector(int irq, const cpu BUG_ON(irq >= nr_irqs || irq <0); - spin_lock_irqsave(&vector_lock, flags); - ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS); + spin_lock_irqsave(&desc->lock, flags); + + spin_lock(&vector_lock); + ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS); + spin_unlock(&vector_lock); + if ( !ret ) { ret = desc->arch.vector; @@ -600,14 +615,16 @@ int assign_irq_vector(int irq, const cpu else cpumask_setall(desc->affinity); } - spin_unlock_irqrestore(&vector_lock, flags); + + spin_unlock_irqrestore(&desc->lock, flags); return ret; } /* * Initialize vector_irq on a new cpu. This function must be called - * with vector_lock held. + * with vector_lock held. For this reason it may not itself acquire + * the IRQ descriptor locks, as lock nesting is the other way around. */ void setup_vector_irq(unsigned int cpu) { @@ -775,7 +792,6 @@ void irq_complete_move(struct irq_desc * unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask) { - unsigned int irq; int ret; unsigned long flags; cpumask_t dest_mask; @@ -783,10 +799,8 @@ unsigned int set_desc_affinity(struct ir if (!cpumask_intersects(mask, &cpu_online_map)) return BAD_APICID; - irq = desc->irq; - spin_lock_irqsave(&vector_lock, flags); - ret = __assign_irq_vector(irq, desc, mask); + ret = _assign_irq_vector(desc, mask); spin_unlock_irqrestore(&vector_lock, flags); if (ret < 0) @@ -2453,7 +2467,7 @@ void fixup_irqs(const cpumask_t *mask, b /* * In order for the affinity adjustment below to be successful, we - * need __assign_irq_vector() to succeed. This in particular means + * need _assign_irq_vector() to succeed. This in particular means * clearing desc->arch.move_in_progress if this would otherwise * prevent the function from succeeding. Since there's no way for the * flag to get cleared anymore when there's no possible destination --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain) : NUMA_NO_NODE; const cpumask_t *cpumask = &cpu_online_map; + struct irq_desc *desc; if ( node < MAX_NUMNODES && node_online(node) && cpumask_intersects(&node_to_cpumask(node), cpumask) ) cpumask = &node_to_cpumask(node); - dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask); + + desc = irq_to_desc(drhd->iommu->msi.irq); + spin_lock_irq(&desc->lock); + dma_msi_set_affinity(desc, cpumask); + spin_unlock_irq(&desc->lock); } static int adjust_vtd_irq_affinities(void)