From patchwork Tue Dec 19 16:34:46 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Robin Murphy X-Patchwork-Id: 10123731 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 7B570602CB for ; Tue, 19 Dec 2017 16:35:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 74E402022B for ; Tue, 19 Dec 2017 16:35:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6945B28691; Tue, 19 Dec 2017 16:35:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 12E352022B for ; Tue, 19 Dec 2017 16:35:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751600AbdLSQev (ORCPT ); Tue, 19 Dec 2017 11:34:51 -0500 Received: from foss.arm.com ([217.140.101.70]:40038 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750784AbdLSQeu (ORCPT ); Tue, 19 Dec 2017 11:34:50 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EDB1B80D; Tue, 19 Dec 2017 08:34:49 -0800 (PST) Received: from [10.1.210.88] (e110467-lin.cambridge.arm.com [10.1.210.88]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id B2DAA3F487; Tue, 19 Dec 2017 08:34:47 -0800 (PST) Subject: Re: [PATCH V1 0/1] Fix kernel panic caused by device ID duplication presented to the IOMMU To: Tomasz Nowicki , joro@8bytes.org, will.deacon@arm.com, lorenzo.pieralisi@arm.com, bhelgaas@google.com Cc: Jayachandran.Nair@cavium.com, Ganapatrao.Kulkarni@cavium.com, linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-arm-kernel@lists.infradead.org, linux-acpi@vger.kernel.org, linux-pci@vger.kernel.org, mw@semihalf.com, stable@vger.kernel.org References: <1513696436-31834-1-git-send-email-tomasz.nowicki@caviumnetworks.com> From: Robin Murphy Message-ID: <7f7fca76-88f8-6ee7-c402-fe4300c62253@arm.com> Date: Tue, 19 Dec 2017 16:34:46 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0 MIME-Version: 1.0 In-Reply-To: <1513696436-31834-1-git-send-email-tomasz.nowicki@caviumnetworks.com> Content-Language: en-GB Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Hi Tomasz, On 19/12/17 15:13, Tomasz Nowicki wrote: > Here is my lspci output of ThunderX2 for which I am observing kernel panic coming from > SMMUv3 driver -> arm_smmu_write_strtab_ent() -> BUG_ON(ste_live): > > # lspci -vt > -[0000:00]-+-00.0-[01-1f]--+ [...] > + [...] > \-00.0-[1e-1f]----00.0-[1f]----00.0 ASPEED Technology, Inc. ASPEED Graphics Family > > ASP device -> 1f:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family > PCI-Express to PCI/PCI-X Bridge -> 1e:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge > > While setting up ASP device SID in IORT dirver: > iort_iommu_configure() -> pci_for_each_dma_alias() > we need to walk up and iterate over each device which alias transaction from > downstream devices. > > AST device (1f:00.0) gets BDF=0x1f00 and corresponding SID=0x1f00 from IORT. > Bridge (1e:00.0) is the first alias. Following PCI Express to PCI/PCI-X Bridge > spec: PCIe-to-PCI/X bridges alias transactions from downstream devices using > the subordinate bus number. For bridge (1e:00.0), the subordinate is equal > to 0x1f. This gives BDF=0x1f00 and SID=1f00 which is the same as downstream > device. So it is possible to have two identical SIDs. The question is what we > should do about such case. Presented patch prevents from registering the same > ID so that SMMUv3 is not complaining later on. Ooh, subtle :( There is logic in arm_smmu_attach_device() to tolerate grouped devices aliasing to the same ID, but I guess I overlooked the distinction of a device sharing an alias ID with itself. I'm not sure I really like trying to work around this in generic code, since fwspec->ids is essentially opaque data in a driver-specific format - in theory a driver is free to encode a single logical ID into multiple fwspec elements (I think I did that in an early draft of SMMUv2 SMR support), at which point this approach might corrupt things massively. Does the (untested) diff below suffice? Robin. ----->8-----diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index f122071688fd..d8a730d83401 100644 @@ -1739,6 +1739,13 @@ static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec) u32 sid = fwspec->ids[i]; __le64 *step = arm_smmu_get_step_for_sid(smmu, sid); + /* Bridged PCI devices may end up with duplicated IDs */ + for (j = 0; j < i; j++) + if (fwspec->ids[j] == sid) + break; + if (j < i) + continue; + arm_smmu_write_strtab_ent(smmu, sid, step, &master->ste); } } Tested-by: Tomasz Nowicki Tested-by: Jayachandran C. --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -1731,7 +1731,7 @@ static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid) static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec) { - int i; + int i, j; struct arm_smmu_master_data *master = fwspec->iommu_priv; struct arm_smmu_device *smmu = master->smmu;