From patchwork Sun Aug 19 07:51:10 2018
X-Patchwork-Submitter: Zhen Lei
X-Patchwork-Id: 10569675
From: Zhen Lei
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu, linux-kernel
Subject: [PATCH v4 1/2] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
Date: Sun, 19 Aug 2018 15:51:10 +0800
Message-ID: <1534665071-7976-2-git-send-email-thunder.leizhen@huawei.com>
In-Reply-To: <1534665071-7976-1-git-send-email-thunder.leizhen@huawei.com>
References: <1534665071-7976-1-git-send-email-thunder.leizhen@huawei.com>
Cc: John Garry, Hanjun Guo, LinuxArm, Libin, Zhen Lei

The condition "(int)(VAL - sync_idx) >= 0", which breaks the polling loop in
__arm_smmu_sync_poll_msi(), requires that sync_idx increase monotonically,
following the order in which the CMDs are inserted into the cmdq. But
".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not protected by the
cmdq spinlock, so the following scenario can occur:

	cpu0			cpu1
	msidata=0
				msidata=1
				insert cmd1
	insert cmd0
				SMMU executes cmd1
	SMMU executes cmd0
				poll times out, because msidata=1 is
				overwritten by cmd0, i.e. VAL=0, sync_idx=1

This is not a functional problem; it only makes the caller wait until the
timeout expires. It rarely happens, because any other CMD_SYNC issued during
the waiting period will break the stall.

Fix this by allocating msidata while holding the cmdq spinlock (a plain u32
counter is then sufficient), and split the building of the SIG_IRQ-mode
CMD_SYNC into its own helper, for better performance and good coding style.

Signed-off-by: Zhen Lei
---
 drivers/iommu/arm-smmu-v3.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)
--
1.8.3

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 1d64710..ac6d6df 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -566,7 +566,7 @@ struct arm_smmu_device {
 	int				gerr_irq;
 	int				combined_irq;
-	atomic_t			sync_nr;
+	u32				sync_nr;
 
 	unsigned long			ias; /* IPA */
 	unsigned long			oas; /* PA */
@@ -775,6 +775,17 @@ static int queue_remove_raw(struct arm_smmu_queue *q, u64 *ent)
 	return 0;
 }
 
+static inline
+void arm_smmu_cmdq_build_sync_msi_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
+{
+	cmd[0]  = FIELD_PREP(CMDQ_0_OP, ent->opcode);
+	cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_IRQ);
+	cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
+	cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
+	cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
+	cmd[1]  = ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
+}
+
 /* High-level queue accessors */
 static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 {
@@ -830,14 +841,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 		cmd[1] |= FIELD_PREP(CMDQ_PRI_1_RESP, ent->pri.resp);
 		break;
 	case CMDQ_OP_CMD_SYNC:
-		if (ent->sync.msiaddr)
-			cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_IRQ);
-		else
-			cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
+		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
 		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
 		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
-		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
-		cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
 		break;
 	default:
 		return -ENOENT;
@@ -947,14 +953,13 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
 	struct arm_smmu_cmdq_ent ent = {
 		.opcode = CMDQ_OP_CMD_SYNC,
 		.sync	= {
-			.msidata = atomic_inc_return_relaxed(&smmu->sync_nr),
 			.msiaddr = virt_to_phys(&smmu->sync_count),
 		},
 	};
 
-	arm_smmu_cmdq_build_cmd(cmd, &ent);
-
 	spin_lock_irqsave(&smmu->cmdq.lock, flags);
+	ent.sync.msidata = ++smmu->sync_nr;
+	arm_smmu_cmdq_build_sync_msi_cmd(cmd, &ent);
 	arm_smmu_cmdq_insert_cmd(smmu, cmd);
 	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
 
@@ -2179,7 +2184,6 @@ static int arm_smmu_init_structures(struct arm_smmu_device *smmu)
 {
 	int ret;
 
-	atomic_set(&smmu->sync_nr, 0);
 	ret = arm_smmu_init_queues(smmu);
 	if (ret)
 		return ret;
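To illustrate the stall described in the commit message above, here is a
minimal, self-contained userspace sketch of the wrap-safe poll condition (a
simplified model only; the function and variable names are made up and this is
not the driver code). "val" stands for the memory word the SMMU writes its MSI
data to (smmu->sync_count) and "sync_idx" for the msidata of the CMD_SYNC being
waited on:

#include <stdint.h>
#include <stdio.h>

/*
 * Same break condition as in __arm_smmu_sync_poll_msi(): wrap-safe, but
 * only meaningful if msidata values reach the queue in increasing order.
 */
static int sync_done(uint32_t val, uint32_t sync_idx)
{
	return (int32_t)(val - sync_idx) >= 0;
}

int main(void)
{
	/* In-order case: the waiter for msidata=1 eventually sees val=1. */
	printf("in-order : %d\n", sync_done(1, 1));

	/*
	 * Reordered case from the commit message: cmd0 (msidata=0) is
	 * executed after cmd1 (msidata=1), so the final MSI write leaves
	 * val=0 while the waiter polls for sync_idx=1. The waiter keeps
	 * spinning until some later CMD_SYNC overwrites val, or until
	 * the timeout expires.
	 */
	printf("reordered: %d\n", sync_done(0, 1));
	return 0;
}

With any C99 compiler this prints 1 for the in-order case and 0 for the
reordered one, which is why issuing msidata out of insertion order can leave
the caller spinning until TIMEOUT.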
From patchwork Sun Aug 19 07:51:11 2018
X-Patchwork-Submitter: Zhen Lei
X-Patchwork-Id: 10569677
From: Zhen Lei
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu, linux-kernel
Subject: [PATCH v4 2/2] iommu/arm-smmu-v3: avoid redundant CMD_SYNCs if possible
Date: Sun, 19 Aug 2018 15:51:11 +0800
Message-ID: <1534665071-7976-3-git-send-email-thunder.leizhen@huawei.com>
In-Reply-To: <1534665071-7976-1-git-send-email-thunder.leizhen@huawei.com>
References: <1534665071-7976-1-git-send-email-thunder.leizhen@huawei.com>
Cc: John Garry, Hanjun Guo, LinuxArm, Libin, Zhen Lei

Two or more CMD_SYNCs may be adjacent in the command queue, and the first one
already does everything the later ones would do. Dropping the redundant
CMD_SYNCs can improve I/O performance, especially under heavy load.

Statistics gathered in my test environment show that the number of CMD_SYNCs
can be reduced by about one third:
CMD_SYNCs reduced:  19542181
CMD_SYNCs total:    58098548 (including the reduced ones)
CMDs total:        116197099 (TLBI:SYNC roughly 1:1)

Signed-off-by: Zhen Lei
---
 drivers/iommu/arm-smmu-v3.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)
--
1.8.3

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index ac6d6df..f3a56e1 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -567,6 +567,7 @@ struct arm_smmu_device {
 	int				gerr_irq;
 	int				combined_irq;
 	u32				sync_nr;
+	u8				prev_cmd_opcode;
 
 	unsigned long			ias; /* IPA */
 	unsigned long			oas; /* PA */
@@ -786,6 +787,11 @@ void arm_smmu_cmdq_build_sync_msi_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 	cmd[1] = ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
 }
 
+static inline u8 arm_smmu_cmd_opcode_get(u64 *cmd)
+{
+	return cmd[0] & CMDQ_0_OP;
+}
+
 /* High-level queue accessors */
 static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 {
@@ -906,6 +912,8 @@ static void arm_smmu_cmdq_insert_cmd(struct arm_smmu_device *smmu, u64 *cmd)
 	struct arm_smmu_queue *q = &smmu->cmdq.q;
 	bool wfe = !!(smmu->features & ARM_SMMU_FEAT_SEV);
 
+	smmu->prev_cmd_opcode = arm_smmu_cmd_opcode_get(cmd);
+
 	while (queue_insert_raw(q, cmd) == -ENOSPC) {
 		if (queue_poll_cons(q, false, wfe))
 			dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
@@ -958,9 +966,17 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
 	};
 
 	spin_lock_irqsave(&smmu->cmdq.lock, flags);
-	ent.sync.msidata = ++smmu->sync_nr;
-	arm_smmu_cmdq_build_sync_msi_cmd(cmd, &ent);
-	arm_smmu_cmdq_insert_cmd(smmu, cmd);
+	if (smmu->prev_cmd_opcode == CMDQ_OP_CMD_SYNC) {
+		/*
+		 * The previous command is also a CMD_SYNC, so there is no
+		 * need to add one more. Just poll it.
+		 */
+		ent.sync.msidata = smmu->sync_nr;
+	} else {
+		ent.sync.msidata = ++smmu->sync_nr;
+		arm_smmu_cmdq_build_sync_msi_cmd(cmd, &ent);
+		arm_smmu_cmdq_insert_cmd(smmu, cmd);
+	}
 	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
 
 	return __arm_smmu_sync_poll_msi(smmu, ent.sync.msidata);
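To make the elision logic above concrete, here is a minimal, self-contained
sketch of the same idea outside the driver (the opcodes, struct and function
names below are made up for illustration; this is not the driver code): the
producer remembers the opcode of the last command it queued, and a sync request
that immediately follows another CMD_SYNC simply reuses that command's sequence
number instead of queueing a new one.

#include <stdint.h>
#include <stdio.h>

enum { OP_TLBI = 0x12, OP_SYNC = 0x46 };	/* illustrative opcodes */

struct cmdq_model {
	uint32_t sync_nr;	/* last msidata handed out */
	uint8_t  prev_opcode;	/* opcode of the last queued command */
	unsigned queued_syncs;	/* CMD_SYNCs that actually hit the queue */
};

static void insert_cmd(struct cmdq_model *q, uint8_t opcode)
{
	q->prev_opcode = opcode;
	if (opcode == OP_SYNC)
		q->queued_syncs++;
}

/* Returns the sequence number the caller should poll for. */
static uint32_t issue_sync(struct cmdq_model *q)
{
	if (q->prev_opcode == OP_SYNC)
		return q->sync_nr;	/* previous CMD_SYNC covers us */

	q->sync_nr++;			/* msidata for the new CMD_SYNC */
	insert_cmd(q, OP_SYNC);
	return q->sync_nr;
}

int main(void)
{
	struct cmdq_model q = { 0 };
	uint32_t a, b;

	insert_cmd(&q, OP_TLBI);	/* e.g. a TLB invalidation */
	a = issue_sync(&q);		/* queues CMD_SYNC #1 */
	b = issue_sync(&q);		/* elided: reuses #1 */

	printf("poll %u and %u, CMD_SYNCs queued: %u\n", a, b, q.queued_syncs);
	return 0;
}

This prints "poll 1 and 1, CMD_SYNCs queued: 1": both callers poll the same
msidata, yet only one CMD_SYNC enters the queue. The patch makes the same
decision under the cmdq spinlock using prev_cmd_opcode, so checking only the
most recently inserted command is sufficient.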