diff mbox series

PCI: revert "Enable PASID only when ACS RR & UF enabled on upstream path"

Message ID 20230111085745.401710-1-christian.koenig@amd.com (mailing list archive)
State Superseded
Headers show
Series PCI: revert "Enable PASID only when ACS RR & UF enabled on upstream path" | expand

Commit Message

Christian König Jan. 11, 2023, 8:57 a.m. UTC
This reverts commit 201007ef707a8bb5592cd07dd46fc9222c48e0b9.

It's correct that the PCIe fabric routes Memory Requests based on the
TLP address, but enabling the PASID mapping doesn't necessary mean that
Memory Requests will have a PASID associated with them.

The alternative is ATS which lets the device resolve the PASID+addr pair
before a memory request is made into a routeable TLB address through the
TA. Those resolved addresses are then cached on the device instead of
in the IOMMU TLB.

So the assumption that you mandatory need ACS to enabled PASID handling
on a device is simply not correct, we need to take ATS into account as
well.

The patch caused failures with AMDs integrated GPUs because some of them
only enable ATS but not ACS.

For now just revert the patch until this is completely solved.

CC: Jason Gunthorpe <jgg@nvidia.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Lu Baolu <baolu.lu@linux.intel.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Tony Zhu <tony.zhu@intel.com>
CC: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Christian König <christian.koenig@amd.com>
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216865
---
 drivers/pci/ats.c | 3 ---
 1 file changed, 3 deletions(-)

Comments

Christian König Jan. 11, 2023, 9:15 a.m. UTC | #1
Forgot to add Felix as CC as well.

Am 11.01.23 um 09:57 schrieb Christian König:
> This reverts commit 201007ef707a8bb5592cd07dd46fc9222c48e0b9.
>
> It's correct that the PCIe fabric routes Memory Requests based on the
> TLP address, but enabling the PASID mapping doesn't necessary mean that
> Memory Requests will have a PASID associated with them.
>
> The alternative is ATS which lets the device resolve the PASID+addr pair
> before a memory request is made into a routeable TLB address through the
> TA. Those resolved addresses are then cached on the device instead of
> in the IOMMU TLB.
>
> So the assumption that you mandatory need ACS to enabled PASID handling
> on a device is simply not correct, we need to take ATS into account as
> well.
>
> The patch caused failures with AMDs integrated GPUs because some of them
> only enable ATS but not ACS.
>
> For now just revert the patch until this is completely solved.
>
> CC: Jason Gunthorpe <jgg@nvidia.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Lu Baolu <baolu.lu@linux.intel.com>
> CC: Bjorn Helgaas <bhelgaas@google.com>
> CC: Tony Zhu <tony.zhu@intel.com>
> CC: Joerg Roedel <jroedel@suse.de>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216865
> ---
>   drivers/pci/ats.c | 3 ---
>   1 file changed, 3 deletions(-)
>
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index f9cc2e10b676..c967ad6e2626 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -382,9 +382,6 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
>   	if (!pasid)
>   		return -EINVAL;
>   
> -	if (!pci_acs_path_enabled(pdev, NULL, PCI_ACS_RR | PCI_ACS_UF))
> -		return -EINVAL;
> -
>   	pci_read_config_word(pdev, pasid + PCI_PASID_CAP, &supported);
>   	supported &= PCI_PASID_CAP_EXEC | PCI_PASID_CAP_PRIV;
>
Thorsten Leemhuis Jan. 11, 2023, 10:04 a.m. UTC | #2
On 11.01.23 09:57, Christian König wrote:
> This reverts commit 201007ef707a8bb5592cd07dd46fc9222c48e0b9.
> 
> It's correct that the PCIe fabric routes Memory Requests based on the
> TLP address, but enabling the PASID mapping doesn't necessary mean that
> Memory Requests will have a PASID associated with them.
> 
> The alternative is ATS which lets the device resolve the PASID+addr pair
> before a memory request is made into a routeable TLB address through the
> TA. Those resolved addresses are then cached on the device instead of
> in the IOMMU TLB.
> 
> So the assumption that you mandatory need ACS to enabled PASID handling
> on a device is simply not correct, we need to take ATS into account as
> well.
> 
> The patch caused failures with AMDs integrated GPUs because some of them
> only enable ATS but not ACS.
> 
> For now just revert the patch until this is completely solved.
> 
> CC: Jason Gunthorpe <jgg@nvidia.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Lu Baolu <baolu.lu@linux.intel.com>
> CC: Bjorn Helgaas <bhelgaas@google.com>
> CC: Tony Zhu <tony.zhu@intel.com>
> CC: Joerg Roedel <jroedel@suse.de>
> Signed-off-by: Christian König <christian.koenig@amd.com>

One small thing to improve:

> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216865

s/Bug:/Link:/ here, otherwise you might get mails from Linus like these:
https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@mail.gmail.com/
https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/
https://lore.kernel.org/all/CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com/

This usage is also explained in
Documentation/process/submitting-patches.rst
(http://docs.kernel.org/process/submitting-patches.html) and
Documentation/process/5.Posting.rst
(https://docs.kernel.org/process/5.Posting.html)

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

P.S.: let me tell regzbot to monitor this thread:

#regzbot ^backmonitor: https://bugzilla.kernel.org/show_bug.cgi?id=216865
Jason Gunthorpe Jan. 11, 2023, 1:07 p.m. UTC | #3
On Wed, Jan 11, 2023 at 09:57:45AM +0100, Christian König wrote:
> This reverts commit 201007ef707a8bb5592cd07dd46fc9222c48e0b9.
> 
> It's correct that the PCIe fabric routes Memory Requests based on the
> TLP address, but enabling the PASID mapping doesn't necessary mean that
> Memory Requests will have a PASID associated with them.

It is true, the routine assumes the device will use untranslated
requests with the PASID.

> The alternative is ATS which lets the device resolve the PASID+addr pair
> before a memory request is made into a routeable TLB address through the
> TA. Those resolved addresses are then cached on the device instead of
> in the IOMMU TLB.

We should pass in a flag "device always sets the translated bit for
PASID" and skip the ACS check in that case.

The ACS check is not wrong, and it is definately necessary for devices
that do not guarentee ATS and PASID are used together, we should not
be removing it.

Given adding the flag is trivial we should just fix it, not revert this.

Jason
Christian König Jan. 11, 2023, 1:38 p.m. UTC | #4
Am 11.01.23 um 14:07 schrieb Jason Gunthorpe:
> On Wed, Jan 11, 2023 at 09:57:45AM +0100, Christian König wrote:
>> This reverts commit 201007ef707a8bb5592cd07dd46fc9222c48e0b9.
>>
>> It's correct that the PCIe fabric routes Memory Requests based on the
>> TLP address, but enabling the PASID mapping doesn't necessary mean that
>> Memory Requests will have a PASID associated with them.
> It is true, the routine assumes the device will use untranslated
> requests with the PASID.
>
>> The alternative is ATS which lets the device resolve the PASID+addr pair
>> before a memory request is made into a routeable TLB address through the
>> TA. Those resolved addresses are then cached on the device instead of
>> in the IOMMU TLB.
> We should pass in a flag "device always sets the translated bit for
> PASID" and skip the ACS check in that case.
>
> The ACS check is not wrong, and it is definately necessary for devices
> that do not guarentee ATS and PASID are used together, we should not
> be removing it.
>
> Given adding the flag is trivial we should just fix it, not revert this.

Well exactly that's the point, adding this flag is absolutely not 
trivial as far as I can see. We need to go through multiple layers of 
abstraction since this is the low level function and nothing high level.

Additional to that the check doesn't seem to make much sense to me. 
pci_enable_pasid() is called by three functions:

pdev_pri_ats_enable() in the AMD IOMMU driver while enabling ATS. As far 
as I can see we absolutely don't need the ACS check here because ATS is 
a must have.

iommu_enable_pci_caps() in the Intel IOMMU driver. This also does some 
handling for ATS, so here we could check the info->ats_supported flag if 
ACS needs to be checked or not.

arm_smmu_enable_pasid() in the ARM IOMMU driver code. No idea what this 
one does with ATS. Here is the only place where the ACS check might make 
sense.

So even if we have some need for this check this seems to be the wrong 
place for the check since not all necessary information from the higher 
level is available.

Regards,
Christian.

>
> Jason
Jason Gunthorpe Jan. 11, 2023, 1:44 p.m. UTC | #5
On Wed, Jan 11, 2023 at 02:38:34PM +0100, Christian König wrote:
> Am 11.01.23 um 14:07 schrieb Jason Gunthorpe:
> > On Wed, Jan 11, 2023 at 09:57:45AM +0100, Christian König wrote:
> > > This reverts commit 201007ef707a8bb5592cd07dd46fc9222c48e0b9.
> > > 
> > > It's correct that the PCIe fabric routes Memory Requests based on the
> > > TLP address, but enabling the PASID mapping doesn't necessary mean that
> > > Memory Requests will have a PASID associated with them.
> > It is true, the routine assumes the device will use untranslated
> > requests with the PASID.
> > 
> > > The alternative is ATS which lets the device resolve the PASID+addr pair
> > > before a memory request is made into a routeable TLB address through the
> > > TA. Those resolved addresses are then cached on the device instead of
> > > in the IOMMU TLB.
> > We should pass in a flag "device always sets the translated bit for
> > PASID" and skip the ACS check in that case.
> > 
> > The ACS check is not wrong, and it is definately necessary for devices
> > that do not guarentee ATS and PASID are used together, we should not
> > be removing it.
> > 
> > Given adding the flag is trivial we should just fix it, not revert this.
> 
> Well exactly that's the point, adding this flag is absolutely not trivial as
> far as I can see. We need to go through multiple layers of abstraction since
> this is the low level function and nothing high level.
> 
> Additional to that the check doesn't seem to make much sense to me.

AFAICT it is the only solution that makes any sense..

> pci_enable_pasid() is called by three functions:
> 
> pdev_pri_ats_enable() in the AMD IOMMU driver while enabling ATS. As far as
> I can see we absolutely don't need the ACS check here because ATS is a must
> have.

Enabling ATS does not mean every PASID TLP will use ATS. It just means
that some transactions might.

We cannot do this properly without driver sourced device-specific
knowledge.

> iommu_enable_pci_caps() in the Intel IOMMU driver. This also does some
> handling for ATS, so here we could check the info->ats_supported flag if ACS
> needs to be checked or not.

*groan* this is seems wrong :( Lu why are we doing this inside iommu
drivers instead of in the device drivers to declare they want to use
PASID?

> arm_smmu_enable_pasid() in the ARM IOMMU driver code. No idea what this one
> does with ATS. Here is the only place where the ACS check might make sense.
> 
> So even if we have some need for this check this seems to be the wrong place
> for the check since not all necessary information from the higher level is
> available.

IIRC we only have 1 driver using the standard pasid support, so we
could just move these things to IDXD.

The AMD sidechannel thing is only use for AMDGPU so it can just assume
things until it gets fixed to use the standard API.

Jason
Baolu Lu Jan. 11, 2023, 1:54 p.m. UTC | #6
On 2023/1/11 21:44, Jason Gunthorpe wrote:
>> iommu_enable_pci_caps() in the Intel IOMMU driver. This also does some
>> handling for ATS, so here we could check the info->ats_supported flag if ACS
>> needs to be checked or not.
> *groan*  this is seems wrong 
Jason Gunthorpe Jan. 11, 2023, 2:14 p.m. UTC | #7
On Wed, Jan 11, 2023 at 09:54:03PM +0800, Baolu Lu wrote:
> On 2023/1/11 21:44, Jason Gunthorpe wrote:
> > > iommu_enable_pci_caps() in the Intel IOMMU driver. This also does some
> > > handling for ATS, so here we could check the info->ats_supported flag if ACS
> > > needs to be checked or not.
> > *groan*  this is seems wrong 
Christian König Jan. 11, 2023, 2:17 p.m. UTC | #8
Am 11.01.23 um 15:14 schrieb Jason Gunthorpe:
> On Wed, Jan 11, 2023 at 09:54:03PM +0800, Baolu Lu wrote:
>> On 2023/1/11 21:44, Jason Gunthorpe wrote:
>>>> iommu_enable_pci_caps() in the Intel IOMMU driver. This also does some
>>>> handling for ATS, so here we could check the info->ats_supported flag if ACS
>>>> needs to be checked or not.
>>> *groan*  this is seems wrong 
Jason Gunthorpe Jan. 11, 2023, 2:20 p.m. UTC | #9
On Wed, Jan 11, 2023 at 03:17:03PM +0100, Christian König wrote:
> Am 11.01.23 um 15:14 schrieb Jason Gunthorpe:
> > On Wed, Jan 11, 2023 at 09:54:03PM +0800, Baolu Lu wrote:
> > > On 2023/1/11 21:44, Jason Gunthorpe wrote:
> > > > > iommu_enable_pci_caps() in the Intel IOMMU driver. This also does some
> > > > > handling for ATS, so here we could check the info->ats_supported flag if ACS
> > > > > needs to be checked or not.
> > > > *groan*  this is seems wrong 
Christian König Jan. 11, 2023, 2:24 p.m. UTC | #10
Am 11.01.23 um 15:20 schrieb Jason Gunthorpe:
> On Wed, Jan 11, 2023 at 03:17:03PM +0100, Christian König wrote:
>> Am 11.01.23 um 15:14 schrieb Jason Gunthorpe:
>>> On Wed, Jan 11, 2023 at 09:54:03PM +0800, Baolu Lu wrote:
>>>> On 2023/1/11 21:44, Jason Gunthorpe wrote:
>>>>>> iommu_enable_pci_caps() in the Intel IOMMU driver. This also does some
>>>>>> handling for ATS, so here we could check the info->ats_supported flag if ACS
>>>>>> needs to be checked or not.
>>>>> *groan*  this is seems wrong 
Jason Gunthorpe Jan. 11, 2023, 2:36 p.m. UTC | #11
On Wed, Jan 11, 2023 at 03:07:36PM +0100, Christian König wrote:

>    Well no, we can perfectly fine enable this as we have done in the past.
>    What we can't do is rejecting it without driver specific knowledge,
>    because the hardware might still work correctly.

It is an interesting point that any device using PASID for SVA must
necessary also be doing ATS/PRI and thus must always be setting the
translated bit in their Mem TLPs

So, at least at this instant in the kernel we have no need for the ACS
check as everything is SVA.

This is forward looking where we are going to have non SVA uses of
PASIDs and we cannot guarantee translated TLPs.

Keep in mind this is pretty much an integrity problem, eg if I allow
iommufd to assign page tables to a PASID without PRI we can get bad
behaviors if the HW does not route properly.

So we are justified to be conservative here to prevent data corruption
in bad cases.

Jason
Baolu Lu Jan. 12, 2023, 8:59 a.m. UTC | #12
On 2023/1/11 22:17, Christian König wrote:
> Am 11.01.23 um 15:14 schrieb Jason Gunthorpe:
>> On Wed, Jan 11, 2023 at 09:54:03PM +0800, Baolu Lu wrote:
>>> On 2023/1/11 21:44, Jason Gunthorpe wrote:
>>>>> iommu_enable_pci_caps() in the Intel IOMMU driver. This also does some
>>>>> handling for ATS, so here we could check the info->ats_supported 
>>>>> flag if ACS
>>>>> needs to be checked or not.
>>>> *groan*  this is seems wrong 
diff mbox series

Patch

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index f9cc2e10b676..c967ad6e2626 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -382,9 +382,6 @@  int pci_enable_pasid(struct pci_dev *pdev, int features)
 	if (!pasid)
 		return -EINVAL;
 
-	if (!pci_acs_path_enabled(pdev, NULL, PCI_ACS_RR | PCI_ACS_UF))
-		return -EINVAL;
-
 	pci_read_config_word(pdev, pasid + PCI_PASID_CAP, &supported);
 	supported &= PCI_PASID_CAP_EXEC | PCI_PASID_CAP_PRIV;