From patchwork Mon Jun 26 13:38:46 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Leizhen (ThunderTown)" X-Patchwork-Id: 9809725 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 86BBE60329 for ; Mon, 26 Jun 2017 13:41:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 899722846F for ; Mon, 26 Jun 2017 13:41:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7AF872858D; Mon, 26 Jun 2017 13:41:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [65.50.211.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E4C312846F for ; Mon, 26 Jun 2017 13:41:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=DV9wI7sJi3SoUzMoqLD0fQ/VkBlCMlamg2WXjyBGUXw=; b=XtP1convmQSudl yIf2eMhBhTiYt6FkPkzhNeZFKlTwnf5M2XiLZM5u2B5ikzeT9WDIUyKAukPFYEM0fXRf+1EHZEUcC eGKZU/F7gV7GIGetkl2GJuDiIcTNsz0Qbq5zqELo4RIdAiPnMb3TwJ4GBHOx5MDoU7a8YLYbdVlUj vz6/dTnnG4W1IaJ/gJM5pSh27rg33wNCw8N0l0d3zEmgOMbvRAd3PhD0ctyYiuwNpelI7GCIACBxu bDgMMTAuZZl2kbo4JM3dJi1IjvVH2PDHrfumtuayYIPIEVQGozSYymd99pOdq346+vah3wrKa1R5V /ur0VLXo2wDPsDmYINZA==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux)) id 1dPUGO-00010b-EF; Mon, 26 Jun 2017 13:41:12 +0000 Received: from szxga02-in.huawei.com ([45.249.212.188]) by bombadil.infradead.org with esmtps (Exim 4.87 #1 (Red Hat Linux)) id 1dPUGH-0000Yv-FI for linux-arm-kernel@lists.infradead.org; Mon, 26 Jun 2017 13:41:10 +0000 Received: from 172.30.72.53 (EHLO DGGEML401-HUB.china.huawei.com) ([172.30.72.53]) by dggrg02-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AQA47793; Mon, 26 Jun 2017 21:39:26 +0800 (CST) Received: from localhost (10.177.23.164) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Mon, 26 Jun 2017 21:39:17 +0800 From: Zhen Lei To: Will Deacon , Joerg Roedel , linux-arm-kernel , iommu , Robin Murphy , linux-kernel Subject: [PATCH 1/5] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction Date: Mon, 26 Jun 2017 21:38:46 +0800 Message-ID: <1498484330-10840-2-git-send-email-thunder.leizhen@huawei.com> X-Mailer: git-send-email 1.9.5.msysgit.0 In-Reply-To: <1498484330-10840-1-git-send-email-thunder.leizhen@huawei.com> References: <1498484330-10840-1-git-send-email-thunder.leizhen@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.23.164] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020204.59510E8E.00A2, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: a4d792d883189da3eefb60cb94f23d25 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20170626_064105_987062_40F1810E X-CRM114-Status: GOOD ( 15.87 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: John Garry , Hanjun Guo , Xinwei Hu , Zefan Li , Tianhong Ding , Zhen Lei Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP Because all TLBI commands should be followed by a SYNC command, to make sure that it has been completely finished. So we can just add the TLBI commands into the queue, and put off the execution until meet SYNC or other commands. To prevent the followed SYNC command waiting for a long time because of too many commands have been delayed, restrict the max delayed number. According to my test, I got the same performance data as I replaced writel with writel_relaxed in queue_inc_prod. Signed-off-by: Zhen Lei --- drivers/iommu/arm-smmu-v3.c | 42 +++++++++++++++++++++++++++++++++++++----- 1 file changed, 37 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 291da5f..4481123 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -337,6 +337,7 @@ /* Command queue */ #define CMDQ_ENT_DWORDS 2 #define CMDQ_MAX_SZ_SHIFT 8 +#define CMDQ_MAX_DELAYED 32 #define CMDQ_ERR_SHIFT 24 #define CMDQ_ERR_MASK 0x7f @@ -472,6 +473,7 @@ struct arm_smmu_cmdq_ent { }; } cfgi; + #define CMDQ_OP_TLBI_NH_ALL 0x10 #define CMDQ_OP_TLBI_NH_ASID 0x11 #define CMDQ_OP_TLBI_NH_VA 0x12 #define CMDQ_OP_TLBI_EL2_ALL 0x20 @@ -499,6 +501,7 @@ struct arm_smmu_cmdq_ent { struct arm_smmu_queue { int irq; /* Wired interrupt */ + u32 nr_delay; __le64 *base; dma_addr_t base_dma; @@ -722,11 +725,16 @@ static int queue_sync_prod(struct arm_smmu_queue *q) return ret; } -static void queue_inc_prod(struct arm_smmu_queue *q) +static void queue_inc_swprod(struct arm_smmu_queue *q) { - u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1; + u32 prod = q->prod + 1; q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod); +} + +static void queue_inc_prod(struct arm_smmu_queue *q) +{ + queue_inc_swprod(q); writel(q->prod, q->prod_reg); } @@ -761,13 +769,24 @@ static void queue_write(__le64 *dst, u64 *src, size_t n_dwords) *dst++ = cpu_to_le64(*src++); } -static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent) +static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent, int optimize) { if (queue_full(q)) return -ENOSPC; queue_write(Q_ENT(q, q->prod), ent, q->ent_dwords); - queue_inc_prod(q); + + /* + * We don't want too many commands to be delayed, this may lead the + * followed sync command to wait for a long time. + */ + if (optimize && (++q->nr_delay < CMDQ_MAX_DELAYED)) { + queue_inc_swprod(q); + } else { + queue_inc_prod(q); + q->nr_delay = 0; + } + return 0; } @@ -909,6 +928,7 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu) static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu, struct arm_smmu_cmdq_ent *ent) { + int optimize = 0; u64 cmd[CMDQ_ENT_DWORDS]; unsigned long flags; bool wfe = !!(smmu->features & ARM_SMMU_FEAT_SEV); @@ -920,8 +940,17 @@ static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu, return; } + /* + * All TLBI commands should be followed by a sync command later. + * The CFGI commands is the same, but they are rarely executed. + * So just optimize TLBI commands now, to reduce the "if" judgement. + */ + if ((ent->opcode >= CMDQ_OP_TLBI_NH_ALL) && + (ent->opcode <= CMDQ_OP_TLBI_NSNH_ALL)) + optimize = 1; + spin_lock_irqsave(&smmu->cmdq.lock, flags); - while (queue_insert_raw(q, cmd) == -ENOSPC) { + while (queue_insert_raw(q, cmd, optimize) == -ENOSPC) { if (queue_poll_cons(q, false, wfe)) dev_err_ratelimited(smmu->dev, "CMDQ timeout\n"); } @@ -1953,6 +1982,8 @@ static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu, << Q_BASE_LOG2SIZE_SHIFT; q->prod = q->cons = 0; + q->nr_delay = 0; + return 0; } @@ -2512,6 +2543,7 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) dev_err(smmu->dev, "unit-length command queue not supported\n"); return -ENXIO; } + BUILD_BUG_ON(CMDQ_MAX_DELAYED >= (1 << CMDQ_MAX_SZ_SHIFT)); smmu->evtq.q.max_n_shift = min((u32)EVTQ_MAX_SZ_SHIFT, reg >> IDR1_EVTQ_SHIFT & IDR1_EVTQ_MASK);