diff mbox series

[1/2] dt-bindings: arm-smmu: Add qcom,last-ctx-bank-reserved

Message ID 20240814-smmu-v1-1-3d6c27027d5b@freebox.fr (mailing list archive)
State New, archived
Headers show
Series Work around reserved SMMU context bank on msm8998 | expand

Commit Message

Marc Gonzalez Aug. 14, 2024, 1:59 p.m. UTC
On qcom msm8998, writing to the last context bank of lpass_q6_smmu
(base address 0x05100000) produces a system freeze & reboot.

Specifically, here:

	qsmmu->bypass_cbndx = smmu->num_context_banks - 1;
	arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0);

and here:

	arm_smmu_write_context_bank(smmu, i);
	arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_CB_FSR_FAULT);

It is likely that FW reserves the last context bank for its own use,
thus a simple work-around would be: DON'T USE IT in Linux.

Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr>
---
 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Rob Herring (Arm) Aug. 18, 2024, 3:25 p.m. UTC | #1
On Wed, Aug 14, 2024 at 03:59:55PM +0200, Marc Gonzalez wrote:
> On qcom msm8998, writing to the last context bank of lpass_q6_smmu
> (base address 0x05100000) produces a system freeze & reboot.
> 
> Specifically, here:
> 
> 	qsmmu->bypass_cbndx = smmu->num_context_banks - 1;
> 	arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0);
> 
> and here:
> 
> 	arm_smmu_write_context_bank(smmu, i);
> 	arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_CB_FSR_FAULT);
> 
> It is likely that FW reserves the last context bank for its own use,
> thus a simple work-around would be: DON'T USE IT in Linux.
> 
> Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr>
> ---
>  Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index 280b4e49f2191..f9b23aef351b0 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -204,6 +204,12 @@ properties:
>        access to SMMU configuration registers. In this case non-secure aliases of
>        secure registers have to be used during SMMU configuration.
>  
> +  qcom,last-ctx-bank-reserved:
> +    type: boolean
> +    description:
> +      FW reserves the last context bank of this SMMU for its own use.
> +      If Linux tries to use it, Linux gets nuked.

How is this Qualcomm specific? Presumably any implementation could do 
this if there's no way to properly partition things. Robin?

Also, this property isn't very flexible. What happens when it is not the 
last bank or more than 1 bank reserved? This should probably be a mask 
instead.

Rob
Marc Gonzalez Aug. 19, 2024, 11:37 a.m. UTC | #2
On 18/08/2024 17:25, Rob Herring wrote:

> On Wed, Aug 14, 2024 at 03:59:55PM +0200, Marc Gonzalez wrote:
>
>> On qcom msm8998, writing to the last context bank of lpass_q6_smmu
>> (base address 0x05100000) produces a system freeze & reboot.
>>
>> Specifically, here:
>>
>> 	qsmmu->bypass_cbndx = smmu->num_context_banks - 1;
>> 	arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0);
>>
>> and here:
>>
>> 	arm_smmu_write_context_bank(smmu, i);
>> 	arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_CB_FSR_FAULT);
>>
>> It is likely that FW reserves the last context bank for its own use,
>> thus a simple work-around would be: DON'T USE IT in Linux.
>>
>> Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr>
>> ---
>>  Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> index 280b4e49f2191..f9b23aef351b0 100644
>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>> @@ -204,6 +204,12 @@ properties:
>>        access to SMMU configuration registers. In this case non-secure aliases of
>>        secure registers have to be used during SMMU configuration.
>>  
>> +  qcom,last-ctx-bank-reserved:
>> +    type: boolean
>> +    description:
>> +      FW reserves the last context bank of this SMMU for its own use.
>> +      If Linux tries to use it, Linux gets nuked.
> 
> How is this Qualcomm specific? Presumably any implementation could do 
> this if there's no way to properly partition things. Robin?

Obviously, there is nothing Qualcomm specific about reserving   
an SMMU context bank for the FW / hypervisor, other than it
appears that qcom is the first to do it; or at least the
LPASS SMMU on qcom msm8998 is the first known SMMU where such
a work-around is required.

What is the correct nomenclature?

Can we just drop the vendor prefix if a property is generic
across vendors? But does it require a subsystem prefix like
"iommu" in order to not clash with generic props in other subsystems?

> Also, this property isn't very flexible. What happens when it is not the 
> last bank or more than 1 bank reserved? This should probably be a mask 
> instead.

OK, I'm getting conflicting requests here.

Bjorn has recommended dropping the property altogether:

> It also seems, as the different SMMUs in this platform behave
> differently it might be worth giving them further specific compatibles,
> in which case we could just check if it's the qcom,msm8998-lpass-smmu,
> instead of inventing a property for this quirk.


I'll send a patch series in line with Bjorn's request.

Regards
Robin Murphy Aug. 19, 2024, 12:57 p.m. UTC | #3
On 19/08/2024 12:37 pm, Marc Gonzalez wrote:
> On 18/08/2024 17:25, Rob Herring wrote:
> 
>> On Wed, Aug 14, 2024 at 03:59:55PM +0200, Marc Gonzalez wrote:
>>
>>> On qcom msm8998, writing to the last context bank of lpass_q6_smmu
>>> (base address 0x05100000) produces a system freeze & reboot.
>>>
>>> Specifically, here:
>>>
>>> 	qsmmu->bypass_cbndx = smmu->num_context_banks - 1;
>>> 	arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0);
>>>
>>> and here:
>>>
>>> 	arm_smmu_write_context_bank(smmu, i);
>>> 	arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_CB_FSR_FAULT);
>>>
>>> It is likely that FW reserves the last context bank for its own use,
>>> thus a simple work-around would be: DON'T USE IT in Linux.
>>>
>>> Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr>
>>> ---
>>>   Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++++++
>>>   1 file changed, 6 insertions(+)
>>>
>>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>>> index 280b4e49f2191..f9b23aef351b0 100644
>>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
>>> @@ -204,6 +204,12 @@ properties:
>>>         access to SMMU configuration registers. In this case non-secure aliases of
>>>         secure registers have to be used during SMMU configuration.
>>>   
>>> +  qcom,last-ctx-bank-reserved:
>>> +    type: boolean
>>> +    description:
>>> +      FW reserves the last context bank of this SMMU for its own use.
>>> +      If Linux tries to use it, Linux gets nuked.
>>
>> How is this Qualcomm specific? Presumably any implementation could do
>> this if there's no way to properly partition things. Robin?
> 
> Obviously, there is nothing Qualcomm specific about reserving
> an SMMU context bank for the FW / hypervisor, other than it
> appears that qcom is the first to do it; or at least the
> LPASS SMMU on qcom msm8998 is the first known SMMU where such
> a work-around is required.

Yes, the Qualcomm-specific aspect is that it's Qualcomm's hypervisor 
which is broken and reporting a larger number in its emulated 
SMMU_IDR1.NUMCB than the number of context banks it's actually willing 
to emulate.

> What is the correct nomenclature?
> 
> Can we just drop the vendor prefix if a property is generic
> across vendors? But does it require a subsystem prefix like
> "iommu" in order to not clash with generic props in other subsystems?

I guess if we *were* to consider a generic property to endorse violating 
the SMMU architecture, then it would logically be vendored to Arm as the 
owner of the SMMU architecture. However I am strongly against that idea, 
not only because I obviously don't want to normalise hypervisors 
emulating non-architectural behaviour which every DT-consuming OS will 
have to understand how to work around, but it's also less than great for 
the user to have a workaround that's not compatible with existing DTBs.

Luckily, in this case it seems straightforward enough to be able to see 
that if we have a "qcom,msm8996-smmu-v2" with 13 context banks then we 
should just treat it as if it has 12 - it's also notable that it only 
reports NUMSMRG=12, so we couldn't use more than that many S1 context 
banks at once anyway.

Thanks,
Robin.

>> Also, this property isn't very flexible. What happens when it is not the
>> last bank or more than 1 bank reserved? This should probably be a mask
>> instead.
> 
> OK, I'm getting conflicting requests here.
> 
> Bjorn has recommended dropping the property altogether:
> 
>> It also seems, as the different SMMUs in this platform behave
>> differently it might be worth giving them further specific compatibles,
>> in which case we could just check if it's the qcom,msm8998-lpass-smmu,
>> instead of inventing a property for this quirk.
> 
> 
> I'll send a patch series in line with Bjorn's request.
> 
> Regards
>
Marc Gonzalez Aug. 19, 2024, 3:02 p.m. UTC | #4
On 19/08/2024 14:57, Robin Murphy wrote:

> Luckily, in this case it seems straightforward enough to be able to see 
> that if we have a "qcom,msm8996-smmu-v2" with 13 context banks then we 
> should just treat it as if it has 12 - it's also notable that it only 
> reports NUMSMRG=12, so we couldn't use more than that many S1 context 
> banks at once anyway.

This is what the hypervisor reports:

[    2.550974] arm-smmu 5100000.iommu: probing hardware configuration...
[    2.557309] arm-smmu 5100000.iommu: SMMUv2 with:
[    2.563815] arm-smmu 5100000.iommu:  stage 1 translation
[    2.568494] arm-smmu 5100000.iommu:  address translation ops
[    2.573791] arm-smmu 5100000.iommu:  non-coherent table walk
[    2.579434] arm-smmu 5100000.iommu:  (IDR0.CTTW overridden by FW configuration)
[    2.585088] arm-smmu 5100000.iommu:  stream matching with 12 register groups
[    2.592132] arm-smmu 5100000.iommu:  13 context banks (0 stage-2 only)
[    2.619316] arm-smmu 5100000.iommu:  Supported page sizes: 0x63315000
[    2.626225] arm-smmu 5100000.iommu:  Stage-1: 36-bit VA -> 36-bit IPA
[    2.632645] arm-smmu 5100000.iommu:  preserved 0 boot mappings


smmu->num_mapping_groups = 12
smmu->num_context_banks  = 13


Are you saying that

	smmu->num_context_banks > smmu->num_mapping_groups

does not make sense?


Would a well-placed

	if (smmu->num_context_banks > smmu->num_mapping_groups)
		smmu->num_context_banks = smmu->num_mapping_groups;

be a proper work-around?

(Probably in qcom_smmu_cfg_probe() so as to not interfere with other platforms.)


Maybe to limit the side effects even more:

	if (of_device_is_compatible(smmu->dev->of_node, "qcom,msm8998-smmu-v2") &&
		smmu->num_context_banks > smmu->num_mapping_groups))
		smmu->num_context_banks = smmu->num_mapping_groups;


Neither work-around would require changing the binding.

Is either work-around acceptable, Robin?

Regards
Bjorn Andersson Aug. 19, 2024, 10:04 p.m. UTC | #5
On Mon, Aug 19, 2024 at 05:02:16PM +0200, Marc Gonzalez wrote:
> On 19/08/2024 14:57, Robin Murphy wrote:
> 
> > Luckily, in this case it seems straightforward enough to be able to see 
> > that if we have a "qcom,msm8996-smmu-v2" with 13 context banks then we 
> > should just treat it as if it has 12 - it's also notable that it only 
> > reports NUMSMRG=12, so we couldn't use more than that many S1 context 
> > banks at once anyway.
> 
> This is what the hypervisor reports:
> 
> [    2.550974] arm-smmu 5100000.iommu: probing hardware configuration...
> [    2.557309] arm-smmu 5100000.iommu: SMMUv2 with:
> [    2.563815] arm-smmu 5100000.iommu:  stage 1 translation
> [    2.568494] arm-smmu 5100000.iommu:  address translation ops
> [    2.573791] arm-smmu 5100000.iommu:  non-coherent table walk
> [    2.579434] arm-smmu 5100000.iommu:  (IDR0.CTTW overridden by FW configuration)
> [    2.585088] arm-smmu 5100000.iommu:  stream matching with 12 register groups
> [    2.592132] arm-smmu 5100000.iommu:  13 context banks (0 stage-2 only)
> [    2.619316] arm-smmu 5100000.iommu:  Supported page sizes: 0x63315000
> [    2.626225] arm-smmu 5100000.iommu:  Stage-1: 36-bit VA -> 36-bit IPA
> [    2.632645] arm-smmu 5100000.iommu:  preserved 0 boot mappings
> 
> 
> smmu->num_mapping_groups = 12

Ignore num_mapping_groups, they are used to define which streams should
be mapped to which context bank. But there's no relationship between
these numbers.

> smmu->num_context_banks  = 13
> 
> 
> Are you saying that
> 
> 	smmu->num_context_banks > smmu->num_mapping_groups
> 
> does not make sense?
> 
> 
> Would a well-placed
> 
> 	if (smmu->num_context_banks > smmu->num_mapping_groups)
> 		smmu->num_context_banks = smmu->num_mapping_groups;
> 
> be a proper work-around?

No, something like this would apply your quirk to other targets (and
specifically it would be wrong, per above).

> 
> (Probably in qcom_smmu_cfg_probe() so as to not interfere with other platforms.)
> 
> 
> Maybe to limit the side effects even more:
> 
> 	if (of_device_is_compatible(smmu->dev->of_node, "qcom,msm8998-smmu-v2") &&
> 		smmu->num_context_banks > smmu->num_mapping_groups))
> 		smmu->num_context_banks = smmu->num_mapping_groups;

If we don't want to introduce a more specific compatible for this SMMU
instance, then let's add this to qcom_smmu_cfg_probe():

	/* MSM8998 LPASS SMMU reports 13 context banks, but only 12 are accessible */
 	if (of_device_is_compatible(smmu->dev->of_node, "qcom,msm8998-smmu-v2") && smmu->num_context_banks == 13)
		smmu->num_context_banks = 12;


Regards,
Bjorn

> 
> 
> Neither work-around would require changing the binding.
> 
> Is either work-around acceptable, Robin?
> 
> Regards
> 
>
diff mbox series

Patch

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index 280b4e49f2191..f9b23aef351b0 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -204,6 +204,12 @@  properties:
       access to SMMU configuration registers. In this case non-secure aliases of
       secure registers have to be used during SMMU configuration.
 
+  qcom,last-ctx-bank-reserved:
+    type: boolean
+    description:
+      FW reserves the last context bank of this SMMU for its own use.
+      If Linux tries to use it, Linux gets nuked.
+
   stream-match-mask:
     $ref: /schemas/types.yaml#/definitions/uint32
     description: |