From patchwork Mon Jun 24 12:38:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shivaprasad G Bhat
X-Patchwork-Id: 13709428
Subject: [PATCH v4 1/6] powerpc/iommu: Move pSeries specific functions to pseries/iommu.c
From: Shivaprasad G Bhat
To: mpe@ellerman.id.au, tpearson@raptorengineering.com, alex.williamson@redhat.com, linuxppc-dev@lists.ozlabs.org, aik@amd.com
Cc: npiggin@gmail.com, christophe.leroy@csgroup.eu, aneesh.kumar@kernel.org, naveen.n.rao@linux.ibm.com, gbatra@linux.vnet.ibm.com, brking@linux.vnet.ibm.com, sbhat@linux.ibm.com, aik@ozlabs.ru, jgg@ziepe.ca, ruscur@russell.cc, robh@kernel.org, sanastasio@raptorengineering.com, linux-kernel@vger.kernel.org, joel@jms.id.au, kvm@vger.kernel.org, msuchanek@suse.de, oohall@gmail.com, mahesh@linux.ibm.com, jroedel@suse.de, vaibhav@linux.ibm.com, svaidy@linux.ibm.com
Date: Mon, 24 Jun 2024 12:38:21 +0000
Message-ID: <171923269701.1397.15758640002786937132.stgit@linux.ibm.com>
In-Reply-To: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com>
References: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com>
User-Agent: StGit/1.5
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

The PowerNV-specific table_group_ops are defined in powernv/pci-ioda.c,
while the pSeries-specific table_group_ops sit in the generic powerpc
file. Move them to where they actually belong (pseries/iommu.c).

The functions are currently also built for CONFIG_PPC_POWERNV, where
they are unused.

Only code movement, no functional changes intended.

Signed-off-by: Shivaprasad G Bhat
---
 arch/powerpc/include/asm/iommu.h       |    4 +
 arch/powerpc/kernel/iommu.c            |  149 --------------------------------
 arch/powerpc/platforms/pseries/iommu.c |  144 +++++++++++++++++++++++++++++++
 3 files changed, 149 insertions(+), 148 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 026695943550..744cc5fc22d3 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -156,6 +156,9 @@ extern int iommu_tce_table_put(struct iommu_table *tbl);
 extern struct iommu_table *iommu_init_table(struct iommu_table *tbl,
 		int nid, unsigned long res_start, unsigned long res_end);
 bool iommu_table_in_use(struct iommu_table *tbl);
+extern void iommu_table_reserve_pages(struct iommu_table *tbl,
+		unsigned long res_start, unsigned long res_end);
+extern void iommu_table_clear(struct iommu_table *tbl);
 
 #define IOMMU_TABLE_GROUP_MAX_TABLES	2
 
@@ -218,7 +221,6 @@ extern long iommu_tce_xchg_no_kill(struct mm_struct *mm,
 extern void iommu_tce_kill(struct iommu_table *tbl,
 		unsigned long entry, unsigned long pages);
 
-extern struct iommu_table_group_ops spapr_tce_table_group_ops;
 #else
 static inline void iommu_register_group(struct iommu_table_group *table_group,
 					int pci_domain_number,
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index b70b4f93561f..b5febc6c7a5e 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -643,7 +643,7 @@ void ppc_iommu_unmap_sg(struct iommu_table *tbl, struct scatterlist *sglist,
 		tbl->it_ops->flush(tbl);
 }
 
-static void iommu_table_clear(struct iommu_table *tbl)
+void iommu_table_clear(struct iommu_table *tbl)
 {
 	/*
 	 * In case of firmware assisted dump system goes through clean
@@ -684,7 +684,7 @@ static void iommu_table_clear(struct iommu_table *tbl)
 #endif
 }
 
-static void iommu_table_reserve_pages(struct iommu_table *tbl,
+void iommu_table_reserve_pages(struct iommu_table *tbl,
 		unsigned long res_start, unsigned long res_end)
 {
 	int i;
@@ -1102,59 +1102,6 @@ void iommu_tce_kill(struct iommu_table *tbl,
 }
 EXPORT_SYMBOL_GPL(iommu_tce_kill);
 
-#if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_POWERNV)
-static int iommu_take_ownership(struct iommu_table *tbl)
-{
-	unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;
-	int ret = 0;
-
-	/*
-	 * VFIO does not control TCE entries allocation and the guest
-	 * can write new TCEs on top of existing ones so iommu_tce_build()
-	 * must be able to release old pages. This functionality
-	 * requires exchange() callback defined so if it is not
-	 * implemented, we disallow taking ownership over the table.
-	 */
-	if (!tbl->it_ops->xchg_no_kill)
-		return -EINVAL;
-
-	spin_lock_irqsave(&tbl->large_pool.lock, flags);
-	for (i = 0; i < tbl->nr_pools; i++)
-		spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);
-
-	if (iommu_table_in_use(tbl)) {
-		pr_err("iommu_tce: it_map is not empty");
-		ret = -EBUSY;
-	} else {
-		memset(tbl->it_map, 0xff, sz);
-	}
-
-	for (i = 0; i < tbl->nr_pools; i++)
-		spin_unlock(&tbl->pools[i].lock);
-	spin_unlock_irqrestore(&tbl->large_pool.lock, flags);
-
-	return ret;
-}
-
-static void iommu_release_ownership(struct iommu_table *tbl)
-{
-	unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;
-
-	spin_lock_irqsave(&tbl->large_pool.lock, flags);
-	for (i = 0; i < tbl->nr_pools; i++)
-		spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);
-
-	memset(tbl->it_map, 0, sz);
-
-	iommu_table_reserve_pages(tbl, tbl->it_reserved_start,
-			tbl->it_reserved_end);
-
-	for (i = 0; i < tbl->nr_pools; i++)
-		spin_unlock(&tbl->pools[i].lock);
-	spin_unlock_irqrestore(&tbl->large_pool.lock, flags);
-}
-#endif
-
 int iommu_add_device(struct iommu_table_group *table_group, struct device *dev)
 {
 	/*
@@ -1186,98 +1133,6 @@ int iommu_add_device(struct iommu_table_group *table_group, struct device *dev)
 EXPORT_SYMBOL_GPL(iommu_add_device);
 
 #if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_POWERNV)
-/*
- * A simple iommu_table_group_ops which only allows reusing the existing
- * iommu_table. This handles VFIO for POWER7 or the nested KVM.
- * The ops does not allow creating windows and only allows reusing the existing
- * one if it matches table_group->tce32_start/tce32_size/page_shift.
- */
-static unsigned long spapr_tce_get_table_size(__u32 page_shift,
-		__u64 window_size, __u32 levels)
-{
-	unsigned long size;
-
-	if (levels > 1)
-		return ~0U;
-	size = window_size >> (page_shift - 3);
-	return size;
-}
-
-static long spapr_tce_create_table(struct iommu_table_group *table_group, int num,
-		__u32 page_shift, __u64 window_size, __u32 levels,
-		struct iommu_table **ptbl)
-{
-	struct iommu_table *tbl = table_group->tables[0];
-
-	if (num > 0)
-		return -EPERM;
-
-	if (tbl->it_page_shift != page_shift ||
-	    tbl->it_size != (window_size >> page_shift) ||
-	    tbl->it_indirect_levels != levels - 1)
-		return -EINVAL;
-
-	*ptbl = iommu_tce_table_get(tbl);
-	return 0;
-}
-
-static long spapr_tce_set_window(struct iommu_table_group *table_group,
-		int num, struct iommu_table *tbl)
-{
-	return tbl == table_group->tables[num] ? 0 : -EPERM;
-}
-
-static long spapr_tce_unset_window(struct iommu_table_group *table_group, int num)
-{
-	return 0;
-}
-
-static long spapr_tce_take_ownership(struct iommu_table_group *table_group)
-{
-	int i, j, rc = 0;
-
-	for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
-		struct iommu_table *tbl = table_group->tables[i];
-
-		if (!tbl || !tbl->it_map)
-			continue;
-
-		rc = iommu_take_ownership(tbl);
-		if (!rc)
-			continue;
-
-		for (j = 0; j < i; ++j)
-			iommu_release_ownership(table_group->tables[j]);
-		return rc;
-	}
-	return 0;
-}
-
-static void spapr_tce_release_ownership(struct iommu_table_group *table_group)
-{
-	int i;
-
-	for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
-		struct iommu_table *tbl = table_group->tables[i];
-
-		if (!tbl)
-			continue;
-
-		iommu_table_clear(tbl);
-		if (tbl->it_map)
-			iommu_release_ownership(tbl);
-	}
-}
-
-struct iommu_table_group_ops spapr_tce_table_group_ops = {
-	.get_table_size = spapr_tce_get_table_size,
-	.create_table = spapr_tce_create_table,
-	.set_window = spapr_tce_set_window,
-	.unset_window = spapr_tce_unset_window,
-	.take_ownership = spapr_tce_take_ownership,
-	.release_ownership = spapr_tce_release_ownership,
-};
-
 /*
  * A simple iommu_ops to allow less cruft in generic VFIO code.
  */
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index b1e6d275cda9..bbe7eaacd829 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -54,6 +54,57 @@ enum {
 	DDW_EXT_QUERY_OUT_SIZE = 2
 };
 
+static int iommu_take_ownership(struct iommu_table *tbl)
+{
+	unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;
+	int ret = 0;
+
+	/*
+	 * VFIO does not control TCE entries allocation and the guest
+	 * can write new TCEs on top of existing ones so iommu_tce_build()
+	 * must be able to release old pages. This functionality
+	 * requires exchange() callback defined so if it is not
+	 * implemented, we disallow taking ownership over the table.
+	 */
+	if (!tbl->it_ops->xchg_no_kill)
+		return -EINVAL;
+
+	spin_lock_irqsave(&tbl->large_pool.lock, flags);
+	for (i = 0; i < tbl->nr_pools; i++)
+		spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);
+
+	if (iommu_table_in_use(tbl)) {
+		pr_err("iommu_tce: it_map is not empty");
+		ret = -EBUSY;
+	} else {
+		memset(tbl->it_map, 0xff, sz);
+	}
+
+	for (i = 0; i < tbl->nr_pools; i++)
+		spin_unlock(&tbl->pools[i].lock);
+	spin_unlock_irqrestore(&tbl->large_pool.lock, flags);
+
+	return ret;
+}
+
+static void iommu_release_ownership(struct iommu_table *tbl)
+{
+	unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;
+
+	spin_lock_irqsave(&tbl->large_pool.lock, flags);
+	for (i = 0; i < tbl->nr_pools; i++)
+		spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);
+
+	memset(tbl->it_map, 0, sz);
+
+	iommu_table_reserve_pages(tbl, tbl->it_reserved_start,
+			tbl->it_reserved_end);
+
+	for (i = 0; i < tbl->nr_pools; i++)
+		spin_unlock(&tbl->pools[i].lock);
+	spin_unlock_irqrestore(&tbl->large_pool.lock, flags);
+}
+
 static struct iommu_table *iommu_pseries_alloc_table(int node)
 {
 	struct iommu_table *tbl;
@@ -67,6 +118,8 @@ static struct iommu_table *iommu_pseries_alloc_table(int node)
 	return tbl;
 }
 
+static struct iommu_table_group_ops spapr_tce_table_group_ops;
+
 static struct iommu_table_group *iommu_pseries_alloc_group(int node)
 {
 	struct iommu_table_group *table_group;
@@ -1651,6 +1704,97 @@ static bool iommu_bypass_supported_pSeriesLP(struct pci_dev *pdev, u64 dma_mask)
 	return false;
 }
 
+/*
+ * A simple iommu_table_group_ops which only allows reusing the existing
+ * iommu_table. This handles VFIO for POWER7 or the nested KVM.
+ * The ops does not allow creating windows and only allows reusing the existing
+ * one if it matches table_group->tce32_start/tce32_size/page_shift.
+ */
+static unsigned long spapr_tce_get_table_size(__u32 page_shift,
+		__u64 window_size, __u32 levels)
+{
+	unsigned long size;
+
+	if (levels > 1)
+		return ~0U;
+	size = window_size >> (page_shift - 3);
+	return size;
+}
+
+static long spapr_tce_create_table(struct iommu_table_group *table_group, int num,
+		__u32 page_shift, __u64 window_size, __u32 levels,
+		struct iommu_table **ptbl)
+{
+	struct iommu_table *tbl = table_group->tables[0];
+
+	if (num > 0)
+		return -EPERM;
+
+	if (tbl->it_page_shift != page_shift ||
+	    tbl->it_size != (window_size >> page_shift) ||
+	    tbl->it_indirect_levels != levels - 1)
+		return -EINVAL;
+
+	*ptbl = iommu_tce_table_get(tbl);
+	return 0;
+}
+
+static long spapr_tce_set_window(struct iommu_table_group *table_group,
+		int num, struct iommu_table *tbl)
+{
+	return tbl == table_group->tables[num] ? 0 : -EPERM;
+}
+
+static long spapr_tce_unset_window(struct iommu_table_group *table_group, int num)
+{
+	return 0;
+}
+
+static long spapr_tce_take_ownership(struct iommu_table_group *table_group)
+{
+	int i, j, rc = 0;
+
+	for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
+		struct iommu_table *tbl = table_group->tables[i];
+
+		if (!tbl || !tbl->it_map)
+			continue;
+
+		rc = iommu_take_ownership(tbl);
+		if (!rc)
+			continue;
+
+		for (j = 0; j < i; ++j)
+			iommu_release_ownership(table_group->tables[j]);
+		return rc;
+	}
+	return 0;
+}
+
+static void spapr_tce_release_ownership(struct iommu_table_group *table_group)
+{
+	int i;
+
+	for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
+		struct iommu_table *tbl = table_group->tables[i];
+
+		if (!tbl)
+			continue;
+
+		if (tbl->it_map)
+			iommu_release_ownership(tbl);
+	}
+}
+
+static struct iommu_table_group_ops spapr_tce_table_group_ops = {
+	.get_table_size = spapr_tce_get_table_size,
+	.create_table = spapr_tce_create_table,
+	.set_window = spapr_tce_set_window,
+	.unset_window = spapr_tce_unset_window,
+	.take_ownership = spapr_tce_take_ownership,
+	.release_ownership = spapr_tce_release_ownership,
+};
+
 static int iommu_mem_notifier(struct notifier_block *nb, unsigned long action,
 		void *data)
 {

From patchwork Mon Jun 24 12:38:34 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shivaprasad G Bhat
X-Patchwork-Id: 13709429
Subject: [PATCH v4 2/6] powerpc/pseries/iommu: Fix the VFIO_IOMMU_SPAPR_TCE_GET_INFO ioctl output
From: Shivaprasad G Bhat
To: mpe@ellerman.id.au, tpearson@raptorengineering.com, alex.williamson@redhat.com, linuxppc-dev@lists.ozlabs.org, aik@amd.com
Cc: npiggin@gmail.com, christophe.leroy@csgroup.eu, aneesh.kumar@kernel.org, naveen.n.rao@linux.ibm.com, gbatra@linux.vnet.ibm.com, brking@linux.vnet.ibm.com, sbhat@linux.ibm.com, aik@ozlabs.ru, jgg@ziepe.ca, ruscur@russell.cc, robh@kernel.org, sanastasio@raptorengineering.com, linux-kernel@vger.kernel.org, joel@jms.id.au, kvm@vger.kernel.org, msuchanek@suse.de, oohall@gmail.com, mahesh@linux.ibm.com, jroedel@suse.de, vaibhav@linux.ibm.com, svaidy@linux.ibm.com
Date: Mon, 24 Jun 2024 12:38:34 +0000
Message-ID: <171923271138.1397.7908302630061814623.stgit@linux.ibm.com>
In-Reply-To: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com>
References: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com>
User-Agent: StGit/1.5
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

The VFIO_IOMMU_SPAPR_TCE_GET_INFO ioctl does not report the actual
values for the platform, because not all of the details are correctly
collected into the iommu_table_group during the platform probe/scan.

Collect the information at device setup time instead, since the DMA
window property is already looked up on the parent nodes there anyway.
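
For illustration only (not part of the patch): a minimal user-space
sketch of how a page-size bitmap, such as the page_size field of the DDW
query response, could be turned into a mask of byte sizes like the one
stored in pgsizes. The bit order follows the table used by
query_page_size_to_mask() in this patch. The names ddw_page_sizes and
page_size_bits_to_mask are hypothetical, and the 64-bit accumulator is
an assumption of this sketch so that the 16G entry fits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Byte sizes indexed by query bit, in the order used by the patch:
 * 4K, 64K, 16M, 32M, 64M, 128M, 256M, 16G, 2M.
 */
static const uint64_t ddw_page_sizes[] = {
	1ULL << 12, 1ULL << 16, 1ULL << 24,
	1ULL << 25, 1ULL << 26, 1ULL << 27,
	1ULL << 28, 1ULL << 34, 1ULL << 21,
};

/* Hypothetical helper: OR together the size of every bit that is set. */
static uint64_t page_size_bits_to_mask(uint32_t query_bits)
{
	uint64_t mask = 0;	/* 64-bit so that the 16G entry is not truncated */
	size_t i;

	for (i = 0; i < sizeof(ddw_page_sizes) / sizeof(ddw_page_sizes[0]); i++) {
		if (query_bits & (1u << i))
			mask |= ddw_page_sizes[i];
	}

	return mask;
}
```

With bits 0 and 1 set (4K and 64K), the helper returns 0x11000.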
Signed-off-by: Shivaprasad G Bhat --- arch/powerpc/platforms/pseries/iommu.c | 81 ++++++++++++++++++++++++++------ 1 file changed, 67 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c index bbe7eaacd829..97b9a4e6bf8a 100644 --- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -865,13 +865,6 @@ static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus) be32_to_cpu(prop.tce_shift), NULL, &iommu_table_lpar_multi_ops); - /* Only for normal boot with default window. Doesn't matter even - * if we set these with DDW which is 64bit during kdump, since - * these will not be used during kdump. - */ - ppci->table_group->tce32_start = be64_to_cpu(prop.dma_base); - ppci->table_group->tce32_size = 1 << be32_to_cpu(prop.window_shift); - if (!iommu_init_table(tbl, ppci->phb->node, 0, 0)) panic("Failed to initialize iommu table"); @@ -1623,6 +1616,71 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) return direct_mapping; } +static __u64 query_page_size_to_mask(u32 query_page_size) +{ + const long shift[] = { + (SZ_4K), (SZ_64K), (SZ_16M), + (SZ_32M), (SZ_64M), (SZ_128M), + (SZ_256M), (SZ_16G), (SZ_2M) + }; + int i, ret = 0; + + for (i = 0; i < ARRAY_SIZE(shift); i++) { + if (query_page_size & (1 << i)) + ret |= shift[i]; + } + + return ret; +} + +static void spapr_tce_init_table_group(struct pci_dev *pdev, + struct device_node *pdn, + struct dynamic_dma_window_prop prop) +{ + struct iommu_table_group *table_group = PCI_DN(pdn)->table_group; + u32 ddw_avail[DDW_APPLICABLE_SIZE]; + + struct ddw_query_response query; + int ret; + + /* Only for normal boot with default window. Doesn't matter during + * kdump, since these will not be used during kdump. 
+ */ + if (is_kdump_kernel()) + return; + + if (table_group->max_dynamic_windows_supported != 0) + return; /* already initialized */ + + table_group->tce32_start = be64_to_cpu(prop.dma_base); + table_group->tce32_size = 1 << be32_to_cpu(prop.window_shift); + + if (!of_find_property(pdn, "ibm,dma-window", NULL)) + dev_err(&pdev->dev, "default dma window missing!\n"); + + ret = of_property_read_u32_array(pdn, "ibm,ddw-applicable", + &ddw_avail[0], DDW_APPLICABLE_SIZE); + if (ret) { + table_group->max_dynamic_windows_supported = -1; + return; + } + + ret = query_ddw(pdev, ddw_avail, &query, pdn); + if (ret) { + dev_err(&pdev->dev, "%s: query_ddw failed\n", __func__); + table_group->max_dynamic_windows_supported = -1; + return; + } + + if (query.windows_available == 0) + table_group->max_dynamic_windows_supported = 1; + else + table_group->max_dynamic_windows_supported = IOMMU_TABLE_GROUP_MAX_TABLES; + + table_group->max_levels = 1; + table_group->pgsizes |= query_page_size_to_mask(query.page_size); +} + static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev) { struct device_node *pdn, *dn; @@ -1662,13 +1720,6 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev) be32_to_cpu(prop.tce_shift), NULL, &iommu_table_lpar_multi_ops); - /* Only for normal boot with default window. Doesn't matter even - * if we set these with DDW which is 64bit during kdump, since - * these will not be used during kdump. 
- */ - pci->table_group->tce32_start = be64_to_cpu(prop.dma_base); - pci->table_group->tce32_size = 1 << be32_to_cpu(prop.window_shift); - iommu_init_table(tbl, pci->phb->node, 0, 0); iommu_register_group(pci->table_group, pci_domain_nr(pci->phb->bus), 0); @@ -1677,6 +1728,8 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev) pr_debug(" found DMA window, table: %p\n", pci->table_group); } + spapr_tce_init_table_group(dev, pdn, prop); + set_iommu_table_base(&dev->dev, pci->table_group->tables[0]); iommu_add_device(pci->table_group, &dev->dev); } From patchwork Mon Jun 24 12:38:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shivaprasad G Bhat X-Patchwork-Id: 13709430 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0CAB313D24C; Mon, 24 Jun 2024 12:39:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.156.1 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719232758; cv=none; b=hjjmlAOop0WICQFcked74lOTGmQyoA5TnqpG5xH+fnUoVTzWgEUobLNKRDST8hXWwII5QCsoBhBfR8rXFYitQXRdmSdpDiBwKf0IINNHHfQCrPSM4U4QaKKvYqUT7X0FHfvTzgyQnbpxB93NPWcMEherh4om7knZuJxHxjoOkZg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719232758; c=relaxed/simple; bh=csIOidFbK9g0l4zoRHI5iNtf/AxhpMIfdrawPnTEYDk=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=lexVdcDPx5pupPgNpx+AUgzvFwKDw74WVoTC88kmmlBEtbftktgk8+n6EFK8tw6D9U2bBmagdpHiqJfzQzi7+vUv7fmXJsYbFhxqpb2G0No2YwU6nfDXXSq+9j4UPU0N6jlFmnZ4kWVKdazcJxdU7S4PYErimlafyuX4cTaVeXg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; spf=pass smtp.mailfrom=linux.ibm.com; dkim=pass 
(2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=XL9lThaf; arc=none smtp.client-ip=148.163.156.1 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="XL9lThaf" Received: from pps.filterd (m0353726.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45OBwbsJ032681; Mon, 24 Jun 2024 12:38:57 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h= subject:from:to:cc:date:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=pp1; bh= nqsZiY+nwGHdS7f7Tvj9s9qjRvXJEGZNaAFiWBYtykA=; b=XL9lThafpmvOxWwI HJOBEg5YPMdyEB0J2EIeXEU3QBhaeFP1LFREbSXYhklt0RMkqqspQ5F8i80MXGSF mPoy8NZbgq6GC552HJU4bX48aHdDh2Zmon5G/fcVfA8/nkIZympO3CBjOC8yPynx +VQhI56xzKRUAUUr9LBmUkyUBSwiBxYgxiRzArHsdnirLSGOZFxMJkv4ETd/ovg4 KOOhiEMSRh5IabNDMjNfiPVpdFbIGjXC9oAG7tcM00Pekw4iDklQz7qdolvYRV92 g88/OCtsEwr+gFR2HK8mF5t12IWE3dMDTuwMVVF06EMx9e857RtHbiXkU+EL5+UA 3fcWSw== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yy8d1r3k5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:38:57 +0000 (GMT) Received: from m0353726.ppops.net (m0353726.ppops.net [127.0.0.1]) by pps.reinject (8.18.0.8/8.18.0.8) with ESMTP id 45OCcux1011518; Mon, 24 Jun 2024 12:38:56 GMT Received: from ppma11.dal12v.mail.ibm.com (db.9e.1632.ip4.static.sl-reverse.com [50.22.158.219]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yy8d1r3jy-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:38:56 +0000 (GMT) Received: from pps.filterd (ppma11.dal12v.mail.ibm.com [127.0.0.1]) by ppma11.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 
45OBJY8c000402; Mon, 24 Jun 2024 12:38:55 GMT Received: from smtprelay06.fra02v.mail.ibm.com ([9.218.2.230]) by ppma11.dal12v.mail.ibm.com (PPS) with ESMTPS id 3yxbn30aqb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:38:55 +0000 Received: from smtpav05.fra02v.mail.ibm.com (smtpav05.fra02v.mail.ibm.com [10.20.54.104]) by smtprelay06.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 45OCcoNm22217148 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 24 Jun 2024 12:38:52 GMT Received: from smtpav05.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 33DBA20043; Mon, 24 Jun 2024 12:38:50 +0000 (GMT) Received: from smtpav05.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id BA15E20040; Mon, 24 Jun 2024 12:38:46 +0000 (GMT) Received: from [172.17.0.2] (unknown [9.3.101.175]) by smtpav05.fra02v.mail.ibm.com (Postfix) with ESMTP; Mon, 24 Jun 2024 12:38:46 +0000 (GMT) Subject: [PATCH v4 3/6] powerpc/pseries/iommu: Use the iommu table[0] for IOV VF's DDW From: Shivaprasad G Bhat To: mpe@ellerman.id.au, tpearson@raptorengineering.com, alex.williamson@redhat.com, linuxppc-dev@lists.ozlabs.org, aik@amd.com Cc: npiggin@gmail.com, christophe.leroy@csgroup.eu, aneesh.kumar@kernel.org, naveen.n.rao@linux.ibm.com, gbatra@linux.vnet.ibm.com, brking@linux.vnet.ibm.com, sbhat@linux.ibm.com, aik@ozlabs.ru, jgg@ziepe.ca, ruscur@russell.cc, robh@kernel.org, sanastasio@raptorengineering.com, linux-kernel@vger.kernel.org, joel@jms.id.au, kvm@vger.kernel.org, msuchanek@suse.de, oohall@gmail.com, mahesh@linux.ibm.com, jroedel@suse.de, vaibhav@linux.ibm.com, svaidy@linux.ibm.com Date: Mon, 24 Jun 2024 12:38:46 +0000 Message-ID: <171923272328.1397.1817843961216868850.stgit@linux.ibm.com> In-Reply-To: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com> References: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com> User-Agent: StGit/1.5 
Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 8ZDvxGqDFe7tSfPQjmVzxf2E6LzVreo3 X-Proofpoint-GUID: virIjsEmh2ekjJyw7d7Kk7Wodf8KGzpU X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-24_09,2024-06-24_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 priorityscore=1501 spamscore=0 malwarescore=0 suspectscore=0 bulkscore=0 adultscore=0 lowpriorityscore=0 phishscore=0 mlxscore=0 impostorscore=0 mlxlogscore=898 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2406140001 definitions=main-2406240099 This patch brings consistency with the PowerNV approach of using the first freely available iommu table when the default window is removed. The pSeries iommu code convention has been that table[0] is for the default 32-bit DMA window and table[1] is for the 64-bit DDW. With VFs having only 1 DMA window, the default has to be removed to create the larger DMA window. The existing code uses table[1] for that, while marking table[0] as NULL. This is fine as long as the host driver itself uses the device. For the VFIO user, on pSeries there is no way to skip table[0], as the VFIO subdriver uses the first freely available table. Window 0, when created as a 64-bit DDW in that context, would still be on table[0], as the maximum number of windows is 1. This won't have any impact on the host driver, as the table is fetched from the device's iommu_table_base.
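For illustration only (not part of the patch): a minimal userspace sketch of the slot selection described above. The struct and function names (sketch_table, sketch_group, ddw_slot, first_free_slot, install_ddw) are hypothetical stand-ins and not kernel APIs; the only thing taken from the patch is the rule that the DDW table goes to index 0 when the default window was removed, and to index 1 otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TABLES 2 /* stands in for IOMMU_TABLE_GROUP_MAX_TABLES */

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_table { int bits; };
struct sketch_group { struct sketch_table *tables[MAX_TABLES]; };

/* Slot the new DDW table lands in: 0 if the default window was removed. */
static int ddw_slot(bool default_win_removed)
{
	return default_win_removed ? 0 : 1;
}

/* What a "first freely available table" consumer would pick. */
static int first_free_slot(const struct sketch_group *g)
{
	for (int i = 0; i < MAX_TABLES; i++)
		if (!g->tables[i])
			return i;
	return -1; /* no free slot */
}

/*
 * Install the DDW table. When the default window was removed, its
 * table is dropped from slot 0 first, so the DDW table reuses slot 0.
 */
static int install_ddw(struct sketch_group *g, struct sketch_table *ddw,
		       bool default_win_removed)
{
	int slot = ddw_slot(default_win_removed);

	if (default_win_removed)
		g->tables[0] = NULL;
	g->tables[slot] = ddw;
	return slot;
}
```

With this rule, a single-window VF ends up with its only table at index 0, which is also the slot a first-free-table consumer would have chosen on an empty group.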
Signed-off-by: Shivaprasad G Bhat --- arch/powerpc/platforms/pseries/iommu.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c index 97b9a4e6bf8a..d2ac6c19cf9b 100644 --- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -155,7 +155,7 @@ static void iommu_pseries_free_group(struct iommu_table_group *table_group, #endif /* Default DMA window table is at index 0, while DDW at 1. SR-IOV - * adapters only have table on index 1. + * adapters only have table on index 0(if not direct mapped). */ if (table_group->tables[0]) iommu_tce_table_put(table_group->tables[0]); @@ -1527,6 +1527,11 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) clean_dma_window(pdn, win64->value); goto out_del_list; } + if (default_win_removed) { + iommu_tce_table_put(pci->table_group->tables[0]); + pci->table_group->tables[0] = NULL; + set_iommu_table_base(&dev->dev, NULL); + } } else { struct iommu_table *newtbl; int i; @@ -1556,15 +1561,12 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) 1UL << len, page_shift, NULL, &iommu_table_lpar_multi_ops); iommu_init_table(newtbl, pci->phb->node, start, end); - pci->table_group->tables[1] = newtbl; + pci->table_group->tables[default_win_removed ? 
0 : 1] = newtbl; set_iommu_table_base(&dev->dev, newtbl); } if (default_win_removed) { - iommu_tce_table_put(pci->table_group->tables[0]); - pci->table_group->tables[0] = NULL; - /* default_win is valid here because default_win_removed == true */ of_remove_property(pdn, default_win); dev_info(&dev->dev, "Removed default DMA window for %pOF\n", pdn); From patchwork Mon Jun 24 12:38:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shivaprasad G Bhat X-Patchwork-Id: 13709431 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 10E1213D24C; Mon, 24 Jun 2024 12:39:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.156.1 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719232767; cv=none; b=QLz0Mk9fuftqibnRd+OoVaGlb9bBuPTBHipgt6hgl75kcfmsR76eY+FWvj5bzmCG/WojO86+Q5LDNXPJvsVt0mmqQPmLLcrFz2ZMsKEUPlUAwbvwBh1hoZO5JQpg32z6VwQ46f6lsVlMTbrQSlVn8A+H89XkVuIvzwUq/EKJH+M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719232767; c=relaxed/simple; bh=US6PgPuYBkBHBzzJOfK04Hwg/abY8CZ7Jk3Z2bpKCqs=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Do0mRdsZEb1T9nX06msehBZXcmWRoj5o5RG2XY4sDeljK4zuIONVCvoG0p+fXQqm5JSTzDwppO4upPeu8mi/0aLEqAKOIw6+u6veKnnw3imHbuH/2xnQjYimXy6Kko4phWeMAH/SK0vQPsnMtLn/YcZF11ZRF9yYEGfR9wpX064= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; spf=pass smtp.mailfrom=linux.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=slFTjN1f; arc=none smtp.client-ip=148.163.156.1 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="slFTjN1f" Received: from pps.filterd (m0356517.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45OCSt4g015666; Mon, 24 Jun 2024 12:39:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h= subject:from:to:cc:date:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=pp1; bh= yBhQkttHnEyL5cv2acTqp19DVPXSx5tGvkMc8RzjqEA=; b=slFTjN1fr9GLLu+d WpRSJCIC1gNtoxpxhFqzuTssQZPbGYj4FFN9bh9sfPpYhvtduK6bWuMhcG3JUb4V /Mu1twG7HPINIuArqre3jeDtT9OQ5X5bk3TiM3PkICx5TVsREBodcJE1ZpVnNDQZ ubw++fwJ9yP3Bc0nMEkGMh8mMieht+y1bxogzG9ZkCK2m5g7V08NjGVj0vpGFv9t BzyXFebjM40Zx7vNPyALDynPsoAaU5vA8KykmIPD2zMPZwlkBzwL491VdpG+oI+J IFmaLp5WObroMCwdhnYotgOUpSZNGYOgOmRntfX6VKAxIB0uvEU43wf4PhULIZek EsswXA== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yy8u4r0v0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:39:10 +0000 (GMT) Received: from m0356517.ppops.net (m0356517.ppops.net [127.0.0.1]) by pps.reinject (8.18.0.8/8.18.0.8) with ESMTP id 45OCd92Q001415; Mon, 24 Jun 2024 12:39:09 GMT Received: from ppma23.wdc07v.mail.ibm.com (5d.69.3da9.ip4.static.sl-reverse.com [169.61.105.93]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yy8u4r0up-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:39:09 +0000 (GMT) Received: from pps.filterd (ppma23.wdc07v.mail.ibm.com [127.0.0.1]) by ppma23.wdc07v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 45OAMlw8000575; Mon, 24 Jun 2024 12:39:08 GMT Received: from smtprelay02.fra02v.mail.ibm.com ([9.218.2.226]) by ppma23.wdc07v.mail.ibm.com (PPS) with ESMTPS id 3yxaemrm8k-1 (version=TLSv1.2 
cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:39:08 +0000 Received: from smtpav07.fra02v.mail.ibm.com (smtpav07.fra02v.mail.ibm.com [10.20.54.106]) by smtprelay02.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 45OCd2to49348952 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 24 Jun 2024 12:39:04 GMT Received: from smtpav07.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5192720043; Mon, 24 Jun 2024 12:39:02 +0000 (GMT) Received: from smtpav07.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B6B0520040; Mon, 24 Jun 2024 12:38:58 +0000 (GMT) Received: from [172.17.0.2] (unknown [9.3.101.175]) by smtpav07.fra02v.mail.ibm.com (Postfix) with ESMTP; Mon, 24 Jun 2024 12:38:58 +0000 (GMT) Subject: [PATCH v4 4/6] vfio/spapr: Always clear TCEs before unsetting the window From: Shivaprasad G Bhat To: mpe@ellerman.id.au, tpearson@raptorengineering.com, alex.williamson@redhat.com, linuxppc-dev@lists.ozlabs.org, aik@amd.com Cc: npiggin@gmail.com, christophe.leroy@csgroup.eu, aneesh.kumar@kernel.org, naveen.n.rao@linux.ibm.com, gbatra@linux.vnet.ibm.com, brking@linux.vnet.ibm.com, sbhat@linux.ibm.com, aik@ozlabs.ru, jgg@ziepe.ca, ruscur@russell.cc, robh@kernel.org, sanastasio@raptorengineering.com, linux-kernel@vger.kernel.org, joel@jms.id.au, kvm@vger.kernel.org, msuchanek@suse.de, oohall@gmail.com, mahesh@linux.ibm.com, jroedel@suse.de, vaibhav@linux.ibm.com, svaidy@linux.ibm.com Date: Mon, 24 Jun 2024 12:38:58 +0000 Message-ID: <171923273535.1397.1236742071894414895.stgit@linux.ibm.com> In-Reply-To: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com> References: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com> User-Agent: StGit/1.5 Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: E8BcRtYOtbYXq--juKL9VUPfyo9J-iGt 
X-Proofpoint-ORIG-GUID: XS3PfcpElw7PNvaZnxWcIkjAfMpPPmQo X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-24_09,2024-06-24_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 clxscore=1015 mlxscore=0 malwarescore=0 mlxlogscore=838 bulkscore=0 spamscore=0 phishscore=0 priorityscore=1501 impostorscore=0 adultscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2406140001 definitions=main-2406240099 The PAPR expects the TCE table to have no entries at the time of unset window(i.e. remove-pe). The TCE clear right now is done before freeing the iommu table. On pSeries, the unset window makes those entries inaccessible to the OS and the H_PUT/GET calls fail on them with H_CONSTRAINED. On PowerNV, this has no side effect as the TCE clear can be done before the DMA window removal as well. Signed-off-by: Shivaprasad G Bhat --- drivers/vfio/vfio_iommu_spapr_tce.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c index a94ec6225d31..5f9e7e477078 100644 --- a/drivers/vfio/vfio_iommu_spapr_tce.c +++ b/drivers/vfio/vfio_iommu_spapr_tce.c @@ -364,7 +364,6 @@ static void tce_iommu_release(void *iommu_data) if (!tbl) continue; - tce_iommu_clear(container, tbl, tbl->it_offset, tbl->it_size); tce_iommu_free_table(container, tbl); } @@ -720,6 +719,8 @@ static long tce_iommu_remove_window(struct tce_container *container, BUG_ON(!tbl->it_size); + tce_iommu_clear(container, tbl, tbl->it_offset, tbl->it_size); + /* Detach groups from IOMMUs */ list_for_each_entry(tcegrp, &container->group_list, next) { table_group = iommu_group_get_iommudata(tcegrp->grp); @@ -738,7 +739,6 @@ static long tce_iommu_remove_window(struct tce_container *container, } /* Free table */ - tce_iommu_clear(container, tbl, tbl->it_offset, 
tbl->it_size); tce_iommu_free_table(container, tbl); container->tables[num] = NULL; @@ -1197,9 +1197,14 @@ static void tce_iommu_release_ownership(struct tce_container *container, return; } - for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) - if (container->tables[i]) + for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) { + if (container->tables[i]) { + tce_iommu_clear(container, container->tables[i], + container->tables[i]->it_offset, + container->tables[i]->it_size); table_group->ops->unset_window(table_group, i); + } + } } static long tce_iommu_take_ownership(struct tce_container *container, From patchwork Mon Jun 24 12:39:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shivaprasad G Bhat X-Patchwork-Id: 13709432 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C67C413C9D5; Mon, 24 Jun 2024 12:39:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.156.1 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719232778; cv=none; b=jp8a1llwFKRpifYY7pUuOyHuiRpr/Vk57G2eb29Lq9tdLVVb+gP9hSWGvi95KLZwMhHPWT9dbMAIqsOr4BdNaAthzWFHj8zOULLYG/t0oINrfvlDWBbRfKf2ufSQQI/XL2eKPsaV2db0jy1Ep/ofEJhAXPEB/uxilwPcxIynD78= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719232778; c=relaxed/simple; bh=Qj+EjbdWIxvH4b3fRncMjvaBMLevK8K6CGzbyp4nwys=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=tW5YVHNjQGr+a871i6HtyT1OrSrwMbLQxO7EzmyvwYxlr1eTiCxvLVouSLQcfmLzDDEEwhLAYrr3NCFrUjnFggkfgHwyCVBzdtfV7RMX7yObTFcpOowy3e67Gm8jurXuD6u7HuuPK5P0KztY1gzZtjM7FanUrZ+JfOrT1qTuytA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; 
spf=pass smtp.mailfrom=linux.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=YqjnrMyd; arc=none smtp.client-ip=148.163.156.1 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="YqjnrMyd" Received: from pps.filterd (m0353726.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45OBwbsP032681; Mon, 24 Jun 2024 12:39:22 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h= subject:from:to:cc:date:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=pp1; bh= lr6e1+vfNrPfM49ecKxEgIZrNV9dBE4qOg6iv/LnnZI=; b=YqjnrMydEz96VdO8 JR2jCtaMzjLHzbNtWiP1W1ltER6BmWB6oTSk6ZmW7iLVfcvfDxlr7aIvTkWScF6Y prHaeGww+c1SrSL2ixNWBT2Kap1EhU2EyCmUI+/pbc09RUeO2xlul38y09OgKvyU /DKSE9JDvmjWNPG3NhUoOWfLHcGQpPELM4uZpG5tVT1oanwNeyxNiTNvqweJnwlf qdEJPrxY1Rp23kRX0nFD9UUiO4F1P/7c8pmsj4PXAh5KnssY3JAKCgVfch/lJ9WT MrWL8J0dqoYLgFz7OyibiQi6qJpTWeUeVLr+QEZIGUEFZkuJzPmtyRHDu6GgZuF9 KrxXrg== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yy8d1r3nx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:39:21 +0000 (GMT) Received: from m0353726.ppops.net (m0353726.ppops.net [127.0.0.1]) by pps.reinject (8.18.0.8/8.18.0.8) with ESMTP id 45OCdLF1013018; Mon, 24 Jun 2024 12:39:21 GMT Received: from ppma13.dal12v.mail.ibm.com (dd.9e.1632.ip4.static.sl-reverse.com [50.22.158.221]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yy8d1r3np-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:39:21 +0000 (GMT) Received: from pps.filterd (ppma13.dal12v.mail.ibm.com [127.0.0.1]) by ppma13.dal12v.mail.ibm.com 
(8.17.1.19/8.17.1.19) with ESMTP id 45OAkMb4020074; Mon, 24 Jun 2024 12:39:20 GMT Received: from smtprelay04.fra02v.mail.ibm.com ([9.218.2.228]) by ppma13.dal12v.mail.ibm.com (PPS) with ESMTPS id 3yxb5m8eyq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:39:19 +0000 Received: from smtpav02.fra02v.mail.ibm.com (smtpav02.fra02v.mail.ibm.com [10.20.54.101]) by smtprelay04.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 45OCdEjj26542668 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 24 Jun 2024 12:39:16 GMT Received: from smtpav02.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 69B2820043; Mon, 24 Jun 2024 12:39:14 +0000 (GMT) Received: from smtpav02.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E775820040; Mon, 24 Jun 2024 12:39:10 +0000 (GMT) Received: from [172.17.0.2] (unknown [9.3.101.175]) by smtpav02.fra02v.mail.ibm.com (Postfix) with ESMTP; Mon, 24 Jun 2024 12:39:10 +0000 (GMT) Subject: [PATCH v4 5/6] powerpc/iommu: Move dev_has_iommu_table() to iommu.c From: Shivaprasad G Bhat To: mpe@ellerman.id.au, tpearson@raptorengineering.com, alex.williamson@redhat.com, linuxppc-dev@lists.ozlabs.org, aik@amd.com Cc: npiggin@gmail.com, christophe.leroy@csgroup.eu, aneesh.kumar@kernel.org, naveen.n.rao@linux.ibm.com, gbatra@linux.vnet.ibm.com, brking@linux.vnet.ibm.com, sbhat@linux.ibm.com, aik@ozlabs.ru, jgg@ziepe.ca, ruscur@russell.cc, robh@kernel.org, sanastasio@raptorengineering.com, linux-kernel@vger.kernel.org, joel@jms.id.au, kvm@vger.kernel.org, msuchanek@suse.de, oohall@gmail.com, mahesh@linux.ibm.com, jroedel@suse.de, vaibhav@linux.ibm.com, svaidy@linux.ibm.com Date: Mon, 24 Jun 2024 12:39:10 +0000 Message-ID: <171923274748.1397.6274953248403106679.stgit@linux.ibm.com> In-Reply-To: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com> References: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com> 
User-Agent: StGit/1.5 Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 9JEEWZzPW53U3nD_tLiZYnN8dE_qvGz6 X-Proofpoint-GUID: nsfg4u1k5BCy2KW3yklqppQJEs4TpS_H X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-24_09,2024-06-24_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 priorityscore=1501 spamscore=0 malwarescore=0 suspectscore=0 bulkscore=0 adultscore=0 lowpriorityscore=0 phishscore=0 mlxscore=0 impostorscore=0 mlxlogscore=976 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2406140001 definitions=main-2406240099 Move function dev_has_iommu_table() to powerpc/kernel/iommu.c as it is going to be used by machine specific iommu code as well in subsequent patches. Signed-off-by: Shivaprasad G Bhat --- arch/powerpc/include/asm/iommu.h | 6 ++++++ arch/powerpc/kernel/eeh.c | 16 ---------------- arch/powerpc/kernel/iommu.c | 17 +++++++++++++++++ 3 files changed, 23 insertions(+), 16 deletions(-) diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h index 744cc5fc22d3..b2fe6e7f81d6 100644 --- a/arch/powerpc/include/asm/iommu.h +++ b/arch/powerpc/include/asm/iommu.h @@ -220,6 +220,7 @@ extern long iommu_tce_xchg_no_kill(struct mm_struct *mm, enum dma_data_direction *direction); extern void iommu_tce_kill(struct iommu_table *tbl, unsigned long entry, unsigned long pages); +int dev_has_iommu_table(struct device *dev, void *data); #else static inline void iommu_register_group(struct iommu_table_group *table_group, @@ -233,6 +234,11 @@ static inline int iommu_add_device(struct iommu_table_group *table_group, { return 0; } + +static inline int dev_has_iommu_table(struct device *dev, void *data) +{ + return 0; +} #endif /* !CONFIG_IOMMU_API */ u64 dma_iommu_get_required_mask(struct device *dev); 
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index 6670063a7a6c..d03f17987fca 100644 --- a/arch/powerpc/kernel/eeh.c +++ b/arch/powerpc/kernel/eeh.c @@ -1273,22 +1273,6 @@ EXPORT_SYMBOL(eeh_dev_release); #ifdef CONFIG_IOMMU_API -static int dev_has_iommu_table(struct device *dev, void *data) -{ - struct pci_dev *pdev = to_pci_dev(dev); - struct pci_dev **ppdev = data; - - if (!dev) - return 0; - - if (device_iommu_mapped(dev)) { - *ppdev = pdev; - return 1; - } - - return 0; -} - /** * eeh_iommu_group_to_pe - Convert IOMMU group to EEH PE * @group: IOMMU group diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index b5febc6c7a5e..ed8204cfa319 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -988,6 +988,23 @@ unsigned long iommu_direction_to_tce_perm(enum dma_data_direction dir) EXPORT_SYMBOL_GPL(iommu_direction_to_tce_perm); #ifdef CONFIG_IOMMU_API + +int dev_has_iommu_table(struct device *dev, void *data) +{ + struct pci_dev *pdev = to_pci_dev(dev); + struct pci_dev **ppdev = data; + + if (!dev) + return 0; + + if (device_iommu_mapped(dev)) { + *ppdev = pdev; + return 1; + } + + return 0; +} + /* * SPAPR TCE API */ From patchwork Mon Jun 24 12:39:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shivaprasad G Bhat X-Patchwork-Id: 13709433 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 54C0E13CF9F; Mon, 24 Jun 2024 12:39:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.156.1 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719232796; cv=none; 
b=eyL1pXwIa5nqZMmDw+pu5R5HNPMA5XGMijUZ9816theXZVAg8jcRnzrgvsa0YEe30TiFFLjXff10XVV44wGF2pRf2AQNVu89xS/egZ4Rm61GMVRtIFvBRGSuHbNWhlf8f0m8cos0Oetaa684BX/q1DSMVGmG0L3BiKL6iocmlp0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719232796; c=relaxed/simple; bh=U/+MnkKLb8sZNc49byUfWkBk7hIl9OgLBTmtnjw+x/Q=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=tDJAZ5OmG4iWvm5heEcPx7YRYveOyLFMYFiS022rAu43SP3WaEDdVL7OYR3w3m2iFi0duTC/JR0nH03BBAuQqdhWGwPvu1ZOjNTMCsqz91ifVsjkOPuT09i0KZm9yJt1Et/f9+UP+A8wF3wC3o0bOvpE38Hu0eAMJB1HkDaeSTI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; spf=pass smtp.mailfrom=linux.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=Drn92VYJ; arc=none smtp.client-ip=148.163.156.1 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="Drn92VYJ" Received: from pps.filterd (m0353729.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45OCSW8H008962; Mon, 24 Jun 2024 12:39:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h= subject:from:to:cc:date:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=pp1; bh= jAepbTZ11AAakUdCxyZ/o8GKqXpYX6jmM2MNSZdO3ok=; b=Drn92VYJ2h29Y+am SF1lvXTttBjLpV2XCeDEp2a3CWTtBUP7nwARxYpR3H6ss6za3KvPVRyBPSgi9YnK 98PYeLGWU+FruJf+eAD8CCUohUdQRocHRkJA0/ZAg53vIuqMXFDL5dM/SB+8gTjl A8NdhWdk5Dlisekbu+mtBSo1UC75wXRQMEe+l7eA83xdUh3LHMAC5rE+kfnLBKiZ eznkSk3Ee7ZcGxU4RV3F6O1a6yFKAiUxs9gnLwl9rWuS6nI/l1vZV/ohU/zhvcny ktpTdNSfNdZuFWUMU3bJp3YBq7EOCP5srms5KaQVH/Js43oas8QosdLfFfw6T+TK DDSFAA== Received: from pps.reinject (localhost 
[127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yy8u5015h-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:39:35 +0000 (GMT) Received: from m0353729.ppops.net (m0353729.ppops.net [127.0.0.1]) by pps.reinject (8.18.0.8/8.18.0.8) with ESMTP id 45OCdZh9029641; Mon, 24 Jun 2024 12:39:35 GMT Received: from ppma22.wdc07v.mail.ibm.com (5c.69.3da9.ip4.static.sl-reverse.com [169.61.105.92]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yy8u5015a-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:39:35 +0000 (GMT) Received: from pps.filterd (ppma22.wdc07v.mail.ibm.com [127.0.0.1]) by ppma22.wdc07v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 45OCKXiK008152; Mon, 24 Jun 2024 12:39:33 GMT Received: from smtprelay06.fra02v.mail.ibm.com ([9.218.2.230]) by ppma22.wdc07v.mail.ibm.com (PPS) with ESMTPS id 3yx9b0gw9e-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Jun 2024 12:39:33 +0000 Received: from smtpav04.fra02v.mail.ibm.com (smtpav04.fra02v.mail.ibm.com [10.20.54.103]) by smtprelay06.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 45OCdRlW22413678 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 24 Jun 2024 12:39:29 GMT Received: from smtpav04.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A0AC12004D; Mon, 24 Jun 2024 12:39:27 +0000 (GMT) Received: from smtpav04.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C84F320063; Mon, 24 Jun 2024 12:39:23 +0000 (GMT) Received: from [172.17.0.2] (unknown [9.3.101.175]) by smtpav04.fra02v.mail.ibm.com (Postfix) with ESMTP; Mon, 24 Jun 2024 12:39:23 +0000 (GMT) Subject: [PATCH v4 6/6] powerpc/iommu: Reimplement the iommu_table_group_ops for pSeries From: Shivaprasad G Bhat To: mpe@ellerman.id.au, tpearson@raptorengineering.com, alex.williamson@redhat.com, linuxppc-dev@lists.ozlabs.org, 
aik@amd.com Cc: npiggin@gmail.com, christophe.leroy@csgroup.eu, aneesh.kumar@kernel.org, naveen.n.rao@linux.ibm.com, gbatra@linux.vnet.ibm.com, brking@linux.vnet.ibm.com, sbhat@linux.ibm.com, aik@ozlabs.ru, jgg@ziepe.ca, ruscur@russell.cc, robh@kernel.org, sanastasio@raptorengineering.com, linux-kernel@vger.kernel.org, joel@jms.id.au, kvm@vger.kernel.org, msuchanek@suse.de, oohall@gmail.com, mahesh@linux.ibm.com, jroedel@suse.de, vaibhav@linux.ibm.com, svaidy@linux.ibm.com Date: Mon, 24 Jun 2024 12:39:23 +0000 Message-ID: <171923275958.1397.907964437142542242.stgit@linux.ibm.com> In-Reply-To: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com> References: <171923268781.1397.8871195514893204050.stgit@linux.ibm.com> User-Agent: StGit/1.5 Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: Mzr9KY4fg9Aq0hLzD2PS3hzlohO4HNou X-Proofpoint-GUID: y4oYtkVlKRqPhPctzjjDvnTFN5ihfMV3 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-24_09,2024-06-24_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 clxscore=1015 bulkscore=0 mlxscore=0 mlxlogscore=999 spamscore=0 impostorscore=0 phishscore=0 priorityscore=1501 adultscore=0 lowpriorityscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2406140001 definitions=main-2406240099 The PPC64 IOMMU API defines iommu_table_group_ops, which handles DMA windows for PEs: their ownership transfer and the create/set/unset of the TCE tables for the Dynamic DMA windows (DDW). VFIO uses these APIs for support on POWER. Commit 9d67c9433509 ("powerpc/iommu: Add "borrowing" iommu_table_group_ops") implemented partial support for this API with a "borrow" mechanism wherein the DMA windows, if already created by the host driver, would be available for VFIO to use.
Also, it didn't have the support to control/modify the window size or the IO page size.

The current patch implements all the necessary iommu_table_group_ops APIs, thereby avoiding the "borrowing". So, just the way it is on the PowerNV platform, with this patch the iommu table group ownership is transferred to the VFIO PPC subdriver, and the iommu table and DMA window creation/deletion are all driven through the APIs.

The pSeries uses the query-pe-dma-window, create-pe-dma-window and reset-pe-dma-window RTAS calls for DMA window creation, deletion and reset to default. The RTAS calls do show some minor differences in the way things are to be handled on the pSeries, which are listed below.

* On pSeries, the default DMA window size is "fixed" and cannot be custom sized as requested by the user. For non-SRIOV VFs, it is fixed at 2GB, and for SRIOV VFs, it is variable sized based on the capacity assigned to it during the VF assignment to the LPAR. So, for the default DMA window alone, if the requested size is less than tce32_size, the smaller size is enforced using the iommu table->it_size.

* The DMA start address for the 32-bit window is 0, and for the 64-bit window in case of PowerNV it is hardcoded to TVE select (bit 59) at 512PiB offset. This address is returned at the time of the create_table() API call (even before the window is created); the subsequent set_window() call actually opens the DMA window. On pSeries, the DMA start address for the 32-bit window is known from the 'ibm,dma-window' DT property. However, the 64-bit window start address is not known until the create-pe-dma RTAS call is made. So, the create_table() which returns the DMA window start address actually opens the DMA window and returns the DMA start address as returned by the Hypervisor for the create-pe-dma RTAS call.

* The reset-pe-dma RTAS call resets the DMA windows and restores the default DMA window; however, it does not clear the TCE table entries if there are any.
In case of ownership transfer from a platform domain which used direct mapping, the patch chooses remove-pe-dma instead of reset-pe for the 64-bit window intentionally so that the clear_dma_window() is called. Other than the DMA window management changes mentioned above, the patch also brings back the userspace view for the single level TCE as it existed before commit 090bad39b237a ("powerpc/powernv: Add indirect levels to it_userspace") along with the relevant refactoring. Signed-off-by: Shivaprasad G Bhat --- arch/powerpc/include/asm/iommu.h | 4 arch/powerpc/kernel/iommu.c | 4 arch/powerpc/platforms/powernv/pci-ioda.c | 6 arch/powerpc/platforms/pseries/iommu.c | 627 +++++++++++++++++++++++++---- 4 files changed, 543 insertions(+), 98 deletions(-) diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h index b2fe6e7f81d6..6ce13ef3e37d 100644 --- a/arch/powerpc/include/asm/iommu.h +++ b/arch/powerpc/include/asm/iommu.h @@ -181,9 +181,9 @@ struct iommu_table_group_ops { long (*unset_window)(struct iommu_table_group *table_group, int num); /* Switch ownership from platform code to external user (e.g. VFIO) */ - long (*take_ownership)(struct iommu_table_group *table_group); + long (*take_ownership)(struct iommu_table_group *table_group, struct device *dev); /* Switch ownership from external user (e.g. VFIO) back to core */ - void (*release_ownership)(struct iommu_table_group *table_group); + void (*release_ownership)(struct iommu_table_group *table_group, struct device *dev); }; struct iommu_table_group_link { diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index ed8204cfa319..76381e14e800 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -1171,7 +1171,7 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain, * The domain being set to PLATFORM from earlier * BLOCKED. The table_group ownership has to be released.
 	 */
-	table_group->ops->release_ownership(table_group);
+	table_group->ops->release_ownership(table_group, dev);
 	iommu_group_put(grp);
 
 	return 0;
@@ -1199,7 +1199,7 @@ spapr_tce_blocked_iommu_attach_dev(struct iommu_domain *platform_domain,
 	 * also sets the dma_api ops
 	 */
 	table_group = iommu_group_get_iommudata(grp);
-	ret = table_group->ops->take_ownership(table_group);
+	ret = table_group->ops->take_ownership(table_group, dev);
 	iommu_group_put(grp);
 
 	return ret;
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 23f5b5093ec1..b0a14e48175c 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1537,7 +1537,8 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
 	}
 }
 
-static long pnv_ioda2_take_ownership(struct iommu_table_group *table_group)
+static long pnv_ioda2_take_ownership(struct iommu_table_group *table_group,
+				     struct device *dev __maybe_unused)
 {
 	struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
 						table_group);
@@ -1562,7 +1563,8 @@ static long pnv_ioda2_take_ownership(struct iommu_table_group *table_group)
 	return 0;
 }
 
-static void pnv_ioda2_release_ownership(struct iommu_table_group *table_group)
+static void pnv_ioda2_release_ownership(struct iommu_table_group *table_group,
+					struct device *dev __maybe_unused)
 {
 	struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
 						table_group);
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index d2ac6c19cf9b..e226e43c8762 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -54,57 +55,6 @@ enum {
 	DDW_EXT_QUERY_OUT_SIZE = 2
 };
 
-static int iommu_take_ownership(struct iommu_table *tbl)
-{
-	unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;
-	int ret = 0;
-
-	/*
-	 * VFIO does not control TCE entries allocation and the guest
-	 * can write new TCEs on top of existing ones so iommu_tce_build()
-	 * must be able to release old pages. This functionality
-	 * requires exchange() callback defined so if it is not
-	 * implemented, we disallow taking ownership over the table.
-	 */
-	if (!tbl->it_ops->xchg_no_kill)
-		return -EINVAL;
-
-	spin_lock_irqsave(&tbl->large_pool.lock, flags);
-	for (i = 0; i < tbl->nr_pools; i++)
-		spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);
-
-	if (iommu_table_in_use(tbl)) {
-		pr_err("iommu_tce: it_map is not empty");
-		ret = -EBUSY;
-	} else {
-		memset(tbl->it_map, 0xff, sz);
-	}
-
-	for (i = 0; i < tbl->nr_pools; i++)
-		spin_unlock(&tbl->pools[i].lock);
-	spin_unlock_irqrestore(&tbl->large_pool.lock, flags);
-
-	return ret;
-}
-
-static void iommu_release_ownership(struct iommu_table *tbl)
-{
-	unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;
-
-	spin_lock_irqsave(&tbl->large_pool.lock, flags);
-	for (i = 0; i < tbl->nr_pools; i++)
-		spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);
-
-	memset(tbl->it_map, 0, sz);
-
-	iommu_table_reserve_pages(tbl, tbl->it_reserved_start,
-			tbl->it_reserved_end);
-
-	for (i = 0; i < tbl->nr_pools; i++)
-		spin_unlock(&tbl->pools[i].lock);
-	spin_unlock_irqrestore(&tbl->large_pool.lock, flags);
-}
-
 static struct iommu_table *iommu_pseries_alloc_table(int node)
 {
 	struct iommu_table *tbl;
@@ -196,7 +146,7 @@ static int tce_build_pSeries(struct iommu_table *tbl, long index,
 }
 
 
-static void tce_free_pSeries(struct iommu_table *tbl, long index, long npages)
+static void tce_clear_pSeries(struct iommu_table *tbl, long index, long npages)
 {
 	__be64 *tcep;
 
@@ -215,6 +165,37 @@ static unsigned long tce_get_pseries(struct iommu_table *tbl, long index)
 	return be64_to_cpu(*tcep);
 }
 
+static long pseries_tce_iommu_userspace_view_alloc(struct iommu_table *tbl)
+{
+	unsigned long cb = ALIGN(sizeof(tbl->it_userspace[0]) * tbl->it_size, PAGE_SIZE);
+	unsigned long *uas;
+
+	if (tbl->it_indirect_levels) /* Impossible */
+		return -EPERM;
+
+	WARN_ON(tbl->it_userspace);
+
+	uas = vzalloc(cb);
+	if (!uas)
+		return -ENOMEM;
+
+	tbl->it_userspace = (__be64 *) uas;
+
+	return 0;
+}
+
+static void tce_iommu_userspace_view_free(struct iommu_table *tbl)
+{
+	vfree(tbl->it_userspace);
+	tbl->it_userspace = NULL;
+}
+
+static void tce_free_pSeries(struct iommu_table *tbl)
+{
+	if (!tbl->it_userspace)
+		tce_iommu_userspace_view_free(tbl);
+}
+
 static void tce_free_pSeriesLP(unsigned long liobn, long, long, long);
 static void tce_freemulti_pSeriesLP(struct iommu_table*, long, long);
 
@@ -629,7 +610,7 @@ struct iommu_table_ops iommu_table_lpar_multi_ops;
 
 struct iommu_table_ops iommu_table_pseries_ops = {
 	.set = tce_build_pSeries,
-	.clear = tce_free_pSeries,
+	.clear = tce_clear_pSeries,
 	.get = tce_get_pseries
 };
 
@@ -738,17 +719,45 @@ static int tce_exchange_pseries(struct iommu_table *tbl, long index, unsigned
 
 	return rc;
 }
+
+static __be64 *tce_useraddr_pSeriesLP(struct iommu_table *tbl, long index,
+				      bool __always_unused alloc)
+{
+	return tbl->it_userspace ? &tbl->it_userspace[index - tbl->it_offset] : NULL;
+}
 #endif
 
 struct iommu_table_ops iommu_table_lpar_multi_ops = {
 	.set = tce_buildmulti_pSeriesLP,
 #ifdef CONFIG_IOMMU_API
 	.xchg_no_kill = tce_exchange_pseries,
+	.useraddrptr = tce_useraddr_pSeriesLP,
 #endif
 	.clear = tce_freemulti_pSeriesLP,
-	.get = tce_get_pSeriesLP
+	.get = tce_get_pSeriesLP,
+	.free = tce_free_pSeries
 };
 
+/*
+ * When the DMA window properties might have been removed,
+ * the parent node has the table_group setup on it.
+ */
+static struct device_node *pci_dma_find_parent_node(struct pci_dev *dev,
+						     struct iommu_table_group *table_group)
+{
+	struct device_node *dn = pci_device_to_OF_node(dev);
+	struct pci_dn *rpdn;
+
+	for (; dn && PCI_DN(dn); dn = dn->parent) {
+		rpdn = PCI_DN(dn);
+
+		if (table_group == rpdn->table_group)
+			return dn;
+	}
+
+	return NULL;
+}
+
 /*
  * Find nearest ibm,dma-window (default DMA window) or direct DMA window or
  * dynamic 64bit DMA window, walking up the device tree.
@@ -963,7 +972,7 @@ static void __remove_dma_window(struct device_node *np, u32 *ddw_avail, u64 liob
 }
 
 static void remove_dma_window(struct device_node *np, u32 *ddw_avail,
-			      struct property *win)
+			      struct property *win, bool cleanup)
 {
 	struct dynamic_dma_window_prop *dwp;
 	u64 liobn;
@@ -971,11 +980,44 @@ static void remove_dma_window(struct device_node *np, u32 *ddw_avail,
 	dwp = win->value;
 	liobn = (u64)be32_to_cpu(dwp->liobn);
 
-	clean_dma_window(np, dwp);
+	if (cleanup)
+		clean_dma_window(np, dwp);
 	__remove_dma_window(np, ddw_avail, liobn);
 }
 
-static int remove_ddw(struct device_node *np, bool remove_prop, const char *win_name)
+static void copy_property(struct device_node *pdn, const char *from, const char *to)
+{
+	struct property *src, *dst;
+
+	src = of_find_property(pdn, from, NULL);
+	if (!src)
+		return;
+
+	dst = kzalloc(sizeof(*dst), GFP_KERNEL);
+	if (!dst)
+		return;
+
+	dst->name = kstrdup(to, GFP_KERNEL);
+	dst->value = kmemdup(src->value, src->length, GFP_KERNEL);
+	dst->length = src->length;
+	if (!dst->name || !dst->value)
+		return;
+
+	if (of_add_property(pdn, dst)) {
+		pr_err("Unable to add DMA window property for %pOF", pdn);
+		goto free_prop;
+	}
+
+	return;
+
+free_prop:
+	kfree(dst->name);
+	kfree(dst->value);
+	kfree(dst);
+}
+
+static int remove_dma_window_named(struct device_node *np, bool remove_prop, const char *win_name,
+				   bool cleanup)
 {
 	struct property *win;
 	u32 ddw_avail[DDW_APPLICABLE_SIZE];
@@ -990,13 +1032,20 @@ static int remove_ddw(struct device_node *np, bool remove_prop, const char *win_
 	if (ret)
 		return 0;
 
-	if (win->length >= sizeof(struct dynamic_dma_window_prop))
-		remove_dma_window(np, ddw_avail, win);
+	remove_dma_window(np, ddw_avail, win, cleanup);
 
 	if (!remove_prop)
 		return 0;
 
+	/* Default window property if removed is lost as reset-pe doesn't restore it.
+	 * Though FDT has a copy of it, the DLPAR hotplugged devices will not have a
+	 * node on FDT until next reboot. So, back it up.
+	 */
+	if ((strcmp(win_name, "ibm,dma-window") == 0) &&
+	    !of_find_property(np, "ibm,dma-window-saved", NULL))
+		copy_property(np, win_name, "ibm,dma-window-saved");
+
 	ret = of_remove_property(np, win);
 	if (ret)
 		pr_warn("%pOF: failed to remove DMA window property: %d\n",
@@ -1054,7 +1103,7 @@ static void find_existing_ddw_windows_named(const char *name)
 	for_each_node_with_property(pdn, name) {
 		dma64 = of_get_property(pdn, name, &len);
 		if (!dma64 || len < sizeof(*dma64)) {
-			remove_ddw(pdn, true, name);
+			remove_dma_window_named(pdn, true, name, true);
 			continue;
 		}
 
@@ -1431,7 +1480,7 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
 			if (reset_win_ext)
 				goto out_failed;
 
-			remove_dma_window(pdn, ddw_avail, default_win);
+			remove_dma_window(pdn, ddw_avail, default_win, true);
 			default_win_removed = true;
 
 			/* Query again, to check if the window is available */
@@ -1568,6 +1617,8 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
 
 	if (default_win_removed) {
 		/* default_win is valid here because default_win_removed == true */
+		if (!of_find_property(pdn, "ibm,dma-window-saved", NULL))
+			copy_property(pdn, "ibm,dma-window", "ibm,dma-window-saved");
 		of_remove_property(pdn, default_win);
 		dev_info(&dev->dev, "Removed default DMA window for %pOF\n", pdn);
 	}
@@ -1776,24 +1827,326 @@ static unsigned long spapr_tce_get_table_size(__u32 page_shift,
 	return size;
 }
 
+static struct pci_dev *iommu_group_get_first_pci_dev(struct iommu_group *group)
+{
+	struct pci_dev *pdev = NULL;
+	int ret;
+
+	/* No IOMMU group ? */
+	if (!group)
+		return NULL;
+
+	ret = iommu_group_for_each_dev(group, &pdev, dev_has_iommu_table);
+	if (!ret || !pdev)
+		return NULL;
+	return pdev;
+}
+
+static void restore_default_dma_window(struct pci_dev *pdev, struct device_node *pdn)
+{
+	reset_dma_window(pdev, pdn);
+	copy_property(pdn, "ibm,dma-window-saved", "ibm,dma-window");
+}
+
+static long remove_dynamic_dma_windows(struct pci_dev *pdev, struct device_node *pdn)
+{
+	struct pci_dn *pci = PCI_DN(pdn);
+	struct dma_win *window;
+	bool direct_mapping;
+	int len;
+
+	if (find_existing_ddw(pdn, &pdev->dev.archdata.dma_offset, &len, &direct_mapping)) {
+		remove_dma_window_named(pdn, true, direct_mapping ?
+						   DIRECT64_PROPNAME : DMA64_PROPNAME, true);
+		if (!direct_mapping) {
+			WARN_ON(!pci->table_group->tables[0] && !pci->table_group->tables[1]);
+
+			if (pci->table_group->tables[1]) {
+				iommu_tce_table_put(pci->table_group->tables[1]);
+				pci->table_group->tables[1] = NULL;
+			} else if (pci->table_group->tables[0]) {
+				/* Default window was removed and only the DDW exists */
+				iommu_tce_table_put(pci->table_group->tables[0]);
+				pci->table_group->tables[0] = NULL;
+			}
+		}
+		spin_lock(&dma_win_list_lock);
+		list_for_each_entry(window, &dma_win_list, list) {
+			if (window->device == pdn) {
+				list_del(&window->list);
+				kfree(window);
+				break;
+			}
+		}
+		spin_unlock(&dma_win_list_lock);
+	}
+
+	return 0;
+}
+
+static long pseries_setup_default_iommu_config(struct iommu_table_group *table_group,
+					       struct device *dev)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	const __be32 *default_prop;
+	long liobn, offset, size;
+	struct device_node *pdn;
+	struct iommu_table *tbl;
+	struct pci_dn *pci;
+
+	pdn = pci_dma_find_parent_node(pdev, table_group);
+	if (!pdn || !PCI_DN(pdn)) {
+		dev_warn(&pdev->dev, "No table_group configured for the node %pOF\n", pdn);
+		return -1;
+	}
+	pci = PCI_DN(pdn);
+
+	/* The default window is restored if not present already on removal of DDW.
+	 * However, if used by VFIO SPAPR sub driver, the user's order of removal of
+	 * windows might have been different to not leading to auto restoration,
+	 * suppose the DDW was removed first followed by the default one.
+	 * So, restore the default window with reset-pe-dma call explicitly.
+	 */
+	restore_default_dma_window(pdev, pdn);
+
+	default_prop = of_get_property(pdn, "ibm,dma-window", NULL);
+	of_parse_dma_window(pdn, default_prop, &liobn, &offset, &size);
+	tbl = iommu_pseries_alloc_table(pci->phb->node);
+	if (!tbl) {
+		dev_err(&pdev->dev, "couldn't create new IOMMU table\n");
+		return -1;
+	}
+
+	iommu_table_setparms_common(tbl, pci->phb->bus->number, liobn, offset,
+				    size, IOMMU_PAGE_SHIFT_4K, NULL,
+				    &iommu_table_lpar_multi_ops);
+	iommu_init_table(tbl, pci->phb->node, 0, 0);
+
+	pci->table_group->tables[0] = tbl;
+	set_iommu_table_base(&pdev->dev, tbl);
+
+	return 0;
+}
+
+static bool is_default_window_request(struct iommu_table_group *table_group, __u32 page_shift,
+				      __u64 window_size)
+{
+	if ((window_size <= table_group->tce32_size) &&
+	    (page_shift == IOMMU_PAGE_SHIFT_4K))
+		return true;
+
+	return false;
+}
+
 static long spapr_tce_create_table(struct iommu_table_group *table_group, int num,
 				   __u32 page_shift, __u64 window_size, __u32 levels,
 				   struct iommu_table **ptbl)
 {
-	struct iommu_table *tbl = table_group->tables[0];
-
-	if (num > 0)
-		return -EPERM;
+	struct pci_dev *pdev = iommu_group_get_first_pci_dev(table_group->group);
+	u32 ddw_avail[DDW_APPLICABLE_SIZE];
+	struct ddw_create_response create;
+	unsigned long liobn, offset, size;
+	unsigned long start = 0, end = 0;
+	struct ddw_query_response query;
+	const __be32 *default_prop;
+	struct failed_ddw_pdn *fpdn;
+	unsigned int window_shift;
+	struct device_node *pdn;
+	struct iommu_table *tbl;
+	struct dma_win *window;
+	struct property *win64;
+	struct pci_dn *pci;
+	u64 win_addr;
+	int len, i;
+	long ret;
 
-	if (tbl->it_page_shift != page_shift ||
-	    tbl->it_size != (window_size >> page_shift) ||
-	    tbl->it_indirect_levels != levels - 1)
+	if (!is_power_of_2(window_size) || levels > 1)
 		return -EINVAL;
 
+	window_shift = order_base_2(window_size);
+
+	mutex_lock(&dma_win_init_mutex);
+
+	ret = -ENODEV;
+
+	pdn = pci_dma_find_parent_node(pdev, table_group);
+	if (!pdn || !PCI_DN(pdn)) { /* Neither of 32s|64-bit exist! */
+		dev_warn(&pdev->dev, "No dma-windows exist for the node %pOF\n", pdn);
+		goto out_failed;
+	}
+	pci = PCI_DN(pdn);
+
+	/* If the enable DDW failed for the pdn, don't retry! */
+	list_for_each_entry(fpdn, &failed_ddw_pdn_list, list) {
+		if (fpdn->pdn == pdn) {
+			dev_info(&pdev->dev, "%pOF in failed DDW device list\n", pdn);
+			goto out_unlock;
+		}
+	}
+
+	tbl = iommu_pseries_alloc_table(pci->phb->node);
+	if (!tbl) {
+		dev_dbg(&pdev->dev, "couldn't create new IOMMU table\n");
+		goto out_unlock;
+	}
+
+	if (num == 0) {
+		bool direct_mapping;
+		/* The request is not for default window? Ensure there is no DDW window already */
+		if (!is_default_window_request(table_group, page_shift, window_size)) {
+			if (find_existing_ddw(pdn, &pdev->dev.archdata.dma_offset, &len,
+					      &direct_mapping)) {
+				dev_warn(&pdev->dev, "%pOF: 64-bit window already present.", pdn);
+				ret = -EPERM;
+				goto out_unlock;
+			}
+		} else {
+			/* Request is for Default window, ensure there is no DDW if there is a
+			 * need to reset. reset-pe otherwise removes the DDW also
+			 */
+			default_prop = of_get_property(pdn, "ibm,dma-window", NULL);
+			if (!default_prop) {
+				if (find_existing_ddw(pdn, &pdev->dev.archdata.dma_offset, &len,
+						      &direct_mapping)) {
+					dev_warn(&pdev->dev, "%pOF: Attempt to create window#0 when 64-bit window is present. Preventing the attempt as that would destroy the 64-bit window",
+						 pdn);
+					ret = -EPERM;
+					goto out_unlock;
+				}
+
+				restore_default_dma_window(pdev, pdn);
+
+				default_prop = of_get_property(pdn, "ibm,dma-window", NULL);
+				of_parse_dma_window(pdn, default_prop, &liobn, &offset, &size);
+				/* Limit the default window size to window_size */
+				iommu_table_setparms_common(tbl, pci->phb->bus->number, liobn,
+							    offset, 1UL << window_shift,
+							    IOMMU_PAGE_SHIFT_4K, NULL,
+							    &iommu_table_lpar_multi_ops);
+				iommu_init_table(tbl, pci->phb->node, start, end);
+
+				table_group->tables[0] = tbl;
+
+				mutex_unlock(&dma_win_init_mutex);
+
+				goto exit;
+			}
+		}
+	}
+
+	ret = of_property_read_u32_array(pdn, "ibm,ddw-applicable",
+					 &ddw_avail[0], DDW_APPLICABLE_SIZE);
+	if (ret) {
+		dev_info(&pdev->dev, "ibm,ddw-applicable not found\n");
+		goto out_failed;
+	}
+	ret = -ENODEV;
+
+	pr_err("%s: Calling query %pOF\n", __func__, pdn);
+	ret = query_ddw(pdev, ddw_avail, &query, pdn);
+	if (ret)
+		goto out_failed;
+	ret = -ENODEV;
+
+	len = window_shift;
+	if (query.largest_available_block < (1ULL << (len - page_shift))) {
+		dev_dbg(&pdev->dev, "can't map window 0x%llx with %llu %llu-sized pages\n",
+			1ULL << len, query.largest_available_block,
+			1ULL << page_shift);
+		ret = -EINVAL; /* Retry with smaller window size */
+		goto out_unlock;
+	}
+
+	if (create_ddw(pdev, ddw_avail, &create, page_shift, len)) {
+		pr_err("%s: Create ddw failed %pOF\n", __func__, pdn);
+		goto out_failed;
+	}
+
+	win_addr = ((u64)create.addr_hi << 32) | create.addr_lo;
+	win64 = ddw_property_create(DMA64_PROPNAME, create.liobn, win_addr, page_shift, len);
+	if (!win64)
+		goto remove_window;
+
+	ret = of_add_property(pdn, win64);
+	if (ret) {
+		dev_err(&pdev->dev, "unable to add DMA window property for %pOF: %ld", pdn, ret);
+		goto free_property;
+	}
+	ret = -ENODEV;
+
+	window = ddw_list_new_entry(pdn, win64->value);
+	if (!window)
+		goto remove_property;
+
+	window->direct = false;
+
+	for (i = 0; i < ARRAY_SIZE(pci->phb->mem_resources); i++) {
+		const unsigned long mask = IORESOURCE_MEM_64 | IORESOURCE_MEM;
+
+		/* Look for MMIO32 */
+		if ((pci->phb->mem_resources[i].flags & mask) == IORESOURCE_MEM) {
+			start = pci->phb->mem_resources[i].start;
+			end = pci->phb->mem_resources[i].end;
+			break;
+		}
+	}
+
+	/* New table for using DDW instead of the default DMA window */
+	iommu_table_setparms_common(tbl, pci->phb->bus->number, create.liobn, win_addr,
+				    1UL << len, page_shift, NULL, &iommu_table_lpar_multi_ops);
+	iommu_init_table(tbl, pci->phb->node, start, end);
+
+	pci->table_group->tables[num] = tbl;
+	set_iommu_table_base(&pdev->dev, tbl);
+	pdev->dev.archdata.dma_offset = win_addr;
+
+	spin_lock(&dma_win_list_lock);
+	list_add(&window->list, &dma_win_list);
+	spin_unlock(&dma_win_list_lock);
+
+	mutex_unlock(&dma_win_init_mutex);
+
+	goto exit;
+
+remove_property:
+	of_remove_property(pdn, win64);
+free_property:
+	kfree(win64->name);
+	kfree(win64->value);
+	kfree(win64);
+remove_window:
+	__remove_dma_window(pdn, ddw_avail, create.liobn);
+
+out_failed:
+	fpdn = kzalloc(sizeof(*fpdn), GFP_KERNEL);
+	if (!fpdn)
+		goto out_unlock;
+	fpdn->pdn = pdn;
+	list_add(&fpdn->list, &failed_ddw_pdn_list);
+
+out_unlock:
+	mutex_unlock(&dma_win_init_mutex);
+
+	return ret;
+exit:
+	/* Allocate the userspace view */
+	pseries_tce_iommu_userspace_view_alloc(tbl);
+	tbl->it_allocated_size = spapr_tce_get_table_size(page_shift, window_size, levels);
+
 	*ptbl = iommu_tce_table_get(tbl);
+
 	return 0;
 }
 
+static bool is_default_window_table(struct iommu_table_group *table_group, struct iommu_table *tbl)
+{
+	if (((tbl->it_size << tbl->it_page_shift) <= table_group->tce32_size) &&
+	    (tbl->it_page_shift == IOMMU_PAGE_SHIFT_4K))
+		return true;
+
+	return false;
+}
+
 static long spapr_tce_set_window(struct iommu_table_group *table_group,
 				 int num, struct iommu_table *tbl)
 {
@@ -1802,43 +2155,133 @@ static long spapr_tce_set_window(struct iommu_table_group *table_group,
 
 static long spapr_tce_unset_window(struct iommu_table_group *table_group, int num)
 {
-	return 0;
+	struct pci_dev *pdev = iommu_group_get_first_pci_dev(table_group->group);
+	struct device_node *dn = pci_device_to_OF_node(pdev), *pdn;
+	struct iommu_table *tbl = table_group->tables[num];
+	struct failed_ddw_pdn *fpdn;
+	struct dma_win *window;
+	const char *win_name;
+	int ret = -ENODEV;
+
+	mutex_lock(&dma_win_init_mutex);
+
+	if ((num == 0) && is_default_window_table(table_group, tbl))
+		win_name = "ibm,dma-window";
+	else
+		win_name = DMA64_PROPNAME;
+
+	pdn = pci_dma_find(dn, NULL);
+	if (!pdn || !PCI_DN(pdn)) { /* Neither of 32s|64-bit exist! */
+		dev_warn(&pdev->dev, "No dma-windows exist for the node %pOF\n", pdn);
+		goto out_failed;
+	}
+
+	/* Don't clear the TCEs, User should have done it */
+	if (remove_dma_window_named(pdn, true, win_name, false)) {
+		pr_err("%s: The existing DDW removal failed for node %pOF\n", __func__, pdn);
+		goto out_failed; /* Could not remove it either! */
+	}
+
+	if (strcmp(win_name, DMA64_PROPNAME) == 0) {
+		spin_lock(&dma_win_list_lock);
+		list_for_each_entry(window, &dma_win_list, list) {
+			if (window->device == pdn) {
+				list_del(&window->list);
+				kfree(window);
+				break;
+			}
+		}
+		spin_unlock(&dma_win_list_lock);
+	}
+
+	iommu_tce_table_put(table_group->tables[num]);
+	table_group->tables[num] = NULL;
+
+	ret = 0;
+
+	goto out_unlock;
+
+out_failed:
+	fpdn = kzalloc(sizeof(*fpdn), GFP_KERNEL);
+	if (!fpdn)
+		goto out_unlock;
+	fpdn->pdn = pdn;
+	list_add(&fpdn->list, &failed_ddw_pdn_list);
+
+out_unlock:
+	mutex_unlock(&dma_win_init_mutex);
+
+	return ret;
 }
 
-static long spapr_tce_take_ownership(struct iommu_table_group *table_group)
+static long spapr_tce_take_ownership(struct iommu_table_group *table_group, struct device *dev)
 {
-	int i, j, rc = 0;
+	struct iommu_table *tbl = table_group->tables[0];
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct device_node *dn = pci_device_to_OF_node(pdev);
+	struct device_node *pdn;
 
-	for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
-		struct iommu_table *tbl = table_group->tables[i];
+	/* SRIOV VFs using direct map by the host driver OR multifunction devices
+	 * where the ownership was taken on the attempt by the first function
+	 */
+	if (!tbl && (table_group->max_dynamic_windows_supported != 1))
+		return 0;
 
-		if (!tbl || !tbl->it_map)
-			continue;
+	mutex_lock(&dma_win_init_mutex);
 
-		rc = iommu_take_ownership(tbl);
-		if (!rc)
-			continue;
+	pdn = pci_dma_find(dn, NULL);
+	if (!pdn || !PCI_DN(pdn)) { /* Neither of 32s|64-bit exist! */
+		dev_warn(&pdev->dev, "No dma-windows exist for the node %pOF\n", pdn);
+		mutex_unlock(&dma_win_init_mutex);
+		return -1;
+	}
 
-		for (j = 0; j < i; ++j)
-			iommu_release_ownership(table_group->tables[j]);
-		return rc;
+	/*
+	 * Though rtas call reset-pe removes the DDW, it doesn't clear the entries on the table
+	 * if there are any. In case of direct map, the entries will be left over, which
+	 * is fine for PEs with 2 DMA windows where the second window is created with create-pe
+	 * at which point the table is cleared. However, on VFs having only one DMA window, the
+	 * default window would end up seeing the entries left over from the direct map done
+	 * on the second window. So, remove the ddw explicitly so that clean_dma_window()
+	 * cleans up the entries if any.
+	 */
+	if (remove_dynamic_dma_windows(pdev, pdn)) {
+		dev_warn(&pdev->dev, "The existing DDW removal failed for node %pOF\n", pdn);
+		mutex_unlock(&dma_win_init_mutex);
+		return -1;
 	}
+
+	/* The table_group->tables[0] is not null now, it must be the default window
+	 * Remove it, let the userspace create it as it needs.
+	 */
+	if (table_group->tables[0]) {
+		remove_dma_window_named(pdn, true, "ibm,dma-window", true);
+		iommu_tce_table_put(tbl);
+		table_group->tables[0] = NULL;
+	}
+	set_iommu_table_base(dev, NULL);
+
+	mutex_unlock(&dma_win_init_mutex);
+
 	return 0;
 }
 
-static void spapr_tce_release_ownership(struct iommu_table_group *table_group)
+static void spapr_tce_release_ownership(struct iommu_table_group *table_group, struct device *dev)
 {
-	int i;
+	struct iommu_table *tbl = table_group->tables[0];
 
-	for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
-		struct iommu_table *tbl = table_group->tables[i];
+	if (tbl) { /* Default window already restored */
+		return;
+	}
 
-		if (!tbl)
-			continue;
+	mutex_lock(&dma_win_init_mutex);
 
-		if (tbl->it_map)
-			iommu_release_ownership(tbl);
-	}
+	/* Restore the default window */
+	pseries_setup_default_iommu_config(table_group, dev);
+
+	mutex_unlock(&dma_win_init_mutex);
+
+	return;
 }
 
 static struct iommu_table_group_ops spapr_tce_table_group_ops = {
@@ -1911,8 +2354,8 @@ static int iommu_reconfig_notifier(struct notifier_block *nb, unsigned long acti
 		 * we have to remove the property when releasing
 		 * the device node.
 		 */
-		if (remove_ddw(np, false, DIRECT64_PROPNAME))
-			remove_ddw(np, false, DMA64_PROPNAME);
+		if (remove_dma_window_named(np, false, DIRECT64_PROPNAME, true))
+			remove_dma_window_named(np, false, DMA64_PROPNAME, true);
 
 		if (pci && pci->table_group)
 			iommu_pseries_free_group(pci->table_group,