[v2,0/2] iommu/arm-smmu-v3: make sure the kdump kernel can work well when smmu is enabled

Message ID: 20190318131243.20716-1-thunder.leizhen@huawei.com

Message

Zhen Lei March 18, 2019, 1:12 p.m. UTC
v1 --> v2:
1. Drop part 2. We now rely only on the SMMUv3 hardware feature STE.config=0b000
(report abort to device, no event recorded) to suppress the event messages
caused by unexpected devices.
2. Rewrite the patch description.

v1:
This patch series includes two parts:
1. Patches 1-2 use dummy STE tables with the "ste abort" hardware feature to abort
   accesses from unexpected devices. For more details, see the description in patch 2.
2. If the "ste abort" feature is not supported, force the unexpected devices in the
   secondary kernel to use the memory maps they used in the first kernel. For more
   details, see patch 5.

Zhen Lei (2):
  iommu/arm-smmu-v3: make sure the stale caching of L1STD are invalid
  iommu/arm-smmu-v3: to make smmu can be enabled in the kdump kernel

 drivers/iommu/arm-smmu-v3.c | 88 +++++++++++++++++++++++++++++++++------------
 1 file changed, 65 insertions(+), 23 deletions(-)

Comments

Will Deacon April 4, 2019, 3:30 p.m. UTC | #1
Hi Zhen Lei,

On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
> v1 --> v2:
> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
> (Report abort to device, no event recorded) to suppress the event messages
> caused by the unexpected devices.
> 2. rewrite the patch description.

This issue came up a while back:

https://lore.kernel.org/linux-pci/20180302103032.GB19323@arm.com/

and I'd still prefer to solve it using the disable_bypass logic which we
already have. Something along the lines of the diff below?

We're relying on the DMA API not subsequently requesting a passthrough
domain, but it should only do that if you've configured your crashkernel
to do so.

Will

--->8

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d3880010c6cf..91b8f3b2ee25 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 	/* Clear CR0 and sync (disables SMMU and queue processing) */
 	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
 	if (reg & CR0_SMMUEN) {
-		if (is_kdump_kernel()) {
-			arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
-			arm_smmu_device_disable(smmu);
-			return -EBUSY;
-		}
-
 		dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
+		WARN_ON(is_kdump_kernel() && !disable_bypass);
+		arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
 	}
 
 	ret = arm_smmu_device_disable(smmu);
Zhen Lei April 8, 2019, 2:31 a.m. UTC | #2
Hi Will,

On 2019/4/4 23:30, Will Deacon wrote:
> Hi Zhen Lei,
> 
> On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
>> v1 --> v2:
>> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
>> (Report abort to device, no event recorded) to suppress the event messages
>> caused by the unexpected devices.
>> 2. rewrite the patch description.
> 
> This issue came up a while back:
> 
> https://lore.kernel.org/linux-pci/20180302103032.GB19323@arm.com/
> 
> and I'd still prefer to solve it using the disable_bypass logic which we
> already have. Something along the lines of the diff below?

Yes, my patches also use disable_bypass=1 (set ste.config=0b000). If
SMMU_IDR0.ST_LEVEL=0 (linear Stream table supported), then all STEs are
allocated and initialized (ste.config=0b000). But if SMMU_IDR0.ST_LEVEL=1
(2-level Stream table), we only allocate and initialize the first-level table
and leave the level 2 tables to be allocated dynamically. That means
C_BAD_STREAMID (eventid=0x2) will be reported if an unexpected device that has
not been reinitialized in the kdump kernel accesses memory. So my patches
allocate a dummy level 2 table (STE table) and make all level 1 table entries
point to it in advance, so that memory accesses from all unexpected devices
are aborted through this dummy STE table. When an expected device (one that is
needed in the kdump kernel) is attached, we allocate a new level 2 table (STE
table) for it, while the other entries keep pointing to the dummy STE table.


> 
> We're relying on the DMA API not subsequently requesting a passthrough
> domain, but it should only do that if you've configured your crashkernel
> to do so.
> 
> Will
> 
> --->8
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d3880010c6cf..91b8f3b2ee25 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>  	/* Clear CR0 and sync (disables SMMU and queue processing) */
>  	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
>  	if (reg & CR0_SMMUEN) {
> -		if (is_kdump_kernel()) {
> -			arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> -			arm_smmu_device_disable(smmu);
> -			return -EBUSY;
> -		}
> -
>  		dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
> +		WARN_ON(is_kdump_kernel() && !disable_bypass);
> +		arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>  	}
>  
>  	ret = arm_smmu_device_disable(smmu);
> 
> .
>
Will Deacon April 16, 2019, 9:14 a.m. UTC | #3
On Mon, Apr 08, 2019 at 10:31:47AM +0800, Leizhen (ThunderTown) wrote:
> On 2019/4/4 23:30, Will Deacon wrote:
> > On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
> >> v1 --> v2:
> >> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
> >> (Report abort to device, no event recorded) to suppress the event messages
> >> caused by the unexpected devices.
> >> 2. rewrite the patch description.
> > 
> > This issue came up a while back:
> > 
> > https://lore.kernel.org/linux-pci/20180302103032.GB19323@arm.com/
> > 
> > and I'd still prefer to solve it using the disable_bypass logic which we
> > already have. Something along the lines of the diff below?
> 
> Yes, my patches also use disable_bypass=1(set ste.config=0b000). If
> SMMU_IDR0.ST_LEVEL=0(Linear Stream table supported), then all STE entries
> are allocated and initialized(set ste.config=0b000). But if SMMU_IDR0.ST_LEVEL=1
> (2-level Stream Table), we only allocated and initialized the first level tables,
> but leave level 2 tables dynamic allocated. That means, C_BAD_STREAMID(eventid=0x2)
> will be reported, if an unexpeted device access memory without reinitialized in
> kdump kernel.

So is your problem just that the C_BAD_STREAMID events are noisy? If so,
perhaps we should be disabling fault reporting entirely in the kdump kernel.

How about the updated diff below? I'm keen to have this as simple as
possible, so we don't end up introducing rarely tested, complex code on
the crash path.

Will

--->8

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d3880010c6cf..d8b73da6447d 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 	/* Clear CR0 and sync (disables SMMU and queue processing) */
 	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
 	if (reg & CR0_SMMUEN) {
-		if (is_kdump_kernel()) {
-			arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
-			arm_smmu_device_disable(smmu);
-			return -EBUSY;
-		}
-
 		dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
+		WARN_ON(is_kdump_kernel() && !disable_bypass);
+		arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
 	}
 
 	ret = arm_smmu_device_disable(smmu);
@@ -2553,6 +2549,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 		return ret;
 	}
 
+	if (is_kdump_kernel())
+		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
 
 	/* Enable the SMMU interface, or ensure bypass */
 	if (!bypass || disable_bypass) {
Zhen Lei April 17, 2019, 1:39 a.m. UTC | #4
On 2019/4/16 17:14, Will Deacon wrote:
> On Mon, Apr 08, 2019 at 10:31:47AM +0800, Leizhen (ThunderTown) wrote:
>> On 2019/4/4 23:30, Will Deacon wrote:
>>> On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
>>>> v1 --> v2:
>>>> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
>>>> (Report abort to device, no event recorded) to suppress the event messages
>>>> caused by the unexpected devices.
>>>> 2. rewrite the patch description.
>>>
>>> This issue came up a while back:
>>>
>>> https://lore.kernel.org/linux-pci/20180302103032.GB19323@arm.com/
>>>
>>> and I'd still prefer to solve it using the disable_bypass logic which we
>>> already have. Something along the lines of the diff below?
>>
>> Yes, my patches also use disable_bypass=1(set ste.config=0b000). If
>> SMMU_IDR0.ST_LEVEL=0(Linear Stream table supported), then all STE entries
>> are allocated and initialized(set ste.config=0b000). But if SMMU_IDR0.ST_LEVEL=1
>> (2-level Stream Table), we only allocated and initialized the first level tables,
>> but leave level 2 tables dynamic allocated. That means, C_BAD_STREAMID(eventid=0x2)
>> will be reported, if an unexpeted device access memory without reinitialized in
>> kdump kernel.
> 
> So is your problem just that the C_BAD_STREAMID events are noisy? If so,
> perhaps we should be disabling fault reporting entirely in the kdump kernel.
> 
> How about the update diff below? I'm keen to have this as simple as
> possible, so we don't end up introducing rarely tested, complex code on
> the crash path.
In theory, it can solve the problem. Let me test it.

However, the patch below will also disable fault reporting for the
expected devices that are used in the kdump kernel. In fact, my patches have been
merged into our internal version for more than 2 months, and no bugs have been found yet.

That said, my patches do not support the case where the hardware does not support the
"STE bypass" feature; I think your patch can also resolve that.

> 
> Will
> 
> --->8
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d3880010c6cf..d8b73da6447d 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>  	/* Clear CR0 and sync (disables SMMU and queue processing) */
>  	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
>  	if (reg & CR0_SMMUEN) {
> -		if (is_kdump_kernel()) {
> -			arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> -			arm_smmu_device_disable(smmu);
> -			return -EBUSY;
> -		}
> -
>  		dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
> +		WARN_ON(is_kdump_kernel() && !disable_bypass);
> +		arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>  	}
>  
>  	ret = arm_smmu_device_disable(smmu);
> @@ -2553,6 +2549,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>  		return ret;
>  	}
>  
> +	if (is_kdump_kernel())
> +		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
>  
>  	/* Enable the SMMU interface, or ensure bypass */
>  	if (!bypass || disable_bypass) {
> 
> .
>
Zhen Lei April 19, 2019, 1:48 p.m. UTC | #5
On 2019/4/17 9:39, Leizhen (ThunderTown) wrote:
> 
> 
> On 2019/4/16 17:14, Will Deacon wrote:
>> On Mon, Apr 08, 2019 at 10:31:47AM +0800, Leizhen (ThunderTown) wrote:
>>> On 2019/4/4 23:30, Will Deacon wrote:
>>>> On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
>>>>> v1 --> v2:
>>>>> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
>>>>> (Report abort to device, no event recorded) to suppress the event messages
>>>>> caused by the unexpected devices.
>>>>> 2. rewrite the patch description.
>>>>
>>>> This issue came up a while back:
>>>>
>>>> https://lore.kernel.org/linux-pci/20180302103032.GB19323@arm.com/
>>>>
>>>> and I'd still prefer to solve it using the disable_bypass logic which we
>>>> already have. Something along the lines of the diff below?
>>>
>>> Yes, my patches also use disable_bypass=1(set ste.config=0b000). If
>>> SMMU_IDR0.ST_LEVEL=0(Linear Stream table supported), then all STE entries
>>> are allocated and initialized(set ste.config=0b000). But if SMMU_IDR0.ST_LEVEL=1
>>> (2-level Stream Table), we only allocated and initialized the first level tables,
>>> but leave level 2 tables dynamic allocated. That means, C_BAD_STREAMID(eventid=0x2)
>>> will be reported, if an unexpeted device access memory without reinitialized in
>>> kdump kernel.
>>
>> So is your problem just that the C_BAD_STREAMID events are noisy? If so,
>> perhaps we should be disabling fault reporting entirely in the kdump kernel.
>>
>> How about the update diff below? I'm keen to have this as simple as
>> possible, so we don't end up introducing rarely tested, complex code on
>> the crash path.
> In theory, it can solve the problem, let me test it.
Hi Will,
  I have tested your patch on my board today. It works well.

> 
> But then again, below patch will also disable the fault reporting come from the
> expected devices which are used in the kdump kernel. In fact, my patches have been
> merged into our interval version more than 2 months, no bug have been found yet.
> 
> However, my patches do not support the case that the hardware does not support the
> "STE bypass" feature, I think your patch can also resolve it.
> 
>>
>> Will
>>
>> --->8
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index d3880010c6cf..d8b73da6447d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>>  	/* Clear CR0 and sync (disables SMMU and queue processing) */
>>  	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
>>  	if (reg & CR0_SMMUEN) {
>> -		if (is_kdump_kernel()) {
>> -			arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>> -			arm_smmu_device_disable(smmu);
>> -			return -EBUSY;
>> -		}
>> -
>>  		dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
>> +		WARN_ON(is_kdump_kernel() && !disable_bypass);
>> +		arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>>  	}
>>  
>>  	ret = arm_smmu_device_disable(smmu);
>> @@ -2553,6 +2549,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>>  		return ret;
>>  	}
>>  
>> +	if (is_kdump_kernel())
>> +		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
>>  
>>  	/* Enable the SMMU interface, or ensure bypass */
>>  	if (!bypass || disable_bypass) {
>>
>> .
>>
>
Bhupesh Sharma April 22, 2019, 12:33 p.m. UTC | #6
Hi Will,

On 04/16/2019 02:44 PM, Will Deacon wrote:
> On Mon, Apr 08, 2019 at 10:31:47AM +0800, Leizhen (ThunderTown) wrote:
>> On 2019/4/4 23:30, Will Deacon wrote:
>>> On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
>>>> v1 --> v2:
>>>> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
>>>> (Report abort to device, no event recorded) to suppress the event messages
>>>> caused by the unexpected devices.
>>>> 2. rewrite the patch description.
>>>
>>> This issue came up a while back:
>>>
>>> https://lore.kernel.org/linux-pci/20180302103032.GB19323@arm.com/
>>>
>>> and I'd still prefer to solve it using the disable_bypass logic which we
>>> already have. Something along the lines of the diff below?
>>
>> Yes, my patches also use disable_bypass=1(set ste.config=0b000). If
>> SMMU_IDR0.ST_LEVEL=0(Linear Stream table supported), then all STE entries
>> are allocated and initialized(set ste.config=0b000). But if SMMU_IDR0.ST_LEVEL=1
>> (2-level Stream Table), we only allocated and initialized the first level tables,
>> but leave level 2 tables dynamic allocated. That means, C_BAD_STREAMID(eventid=0x2)
>> will be reported, if an unexpeted device access memory without reinitialized in
>> kdump kernel.
> 
> So is your problem just that the C_BAD_STREAMID events are noisy? If so,
> perhaps we should be disabling fault reporting entirely in the kdump kernel.
> 
> How about the update diff below? I'm keen to have this as simple as
> possible, so we don't end up introducing rarely tested, complex code on
> the crash path.
> 
> Will
> 
> --->8
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d3880010c6cf..d8b73da6447d 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>   	/* Clear CR0 and sync (disables SMMU and queue processing) */
>   	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
>   	if (reg & CR0_SMMUEN) {
> -		if (is_kdump_kernel()) {
> -			arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> -			arm_smmu_device_disable(smmu);
> -			return -EBUSY;
> -		}
> -
>   		dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
> +		WARN_ON(is_kdump_kernel() && !disable_bypass);
> +		arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>   	}
>   
>   	ret = arm_smmu_device_disable(smmu);
> @@ -2553,6 +2549,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>   		return ret;
>   	}
>   
> +	if (is_kdump_kernel())
> +		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
>   
>   	/* Enable the SMMU interface, or ensure bypass */
>   	if (!bypass || disable_bypass) {
> 

Thanks for the fix.

I can confirm that with this fix the kdump kernel boots well for me on Huawei
boards, so feel free to add:

Tested-by: Bhupesh Sharma <bhsharma@redhat.com>

Here are the kdump kernel logs without this fix:

[    4.514181] arm-smmu-v3 arm-smmu-v3.1.auto: EVTQ overflow detected -- events lost

.. And then repeating messages like the following ..

[    4.521654] arm-smmu-v3 arm-smmu-v3.1.auto: event 0x02 received:
[    4.527654] arm-smmu-v3 arm-smmu-v3.1.auto:  0x00007d0200000002
[    4.533567] arm-smmu-v3 arm-smmu-v3.1.auto:  0x000000010000017e
[    4.539478] arm-smmu-v3 arm-smmu-v3.1.auto:  0x00000000ff6de000
[    4.545390] arm-smmu-v3 arm-smmu-v3.1.auto:  0x000000000eee03e8

And with the fix applied, kdump kernel logs can be seen below:

[ 9136.361094] Starting crashdump kernel...
[ 9136.365007] Bye!
[    0.000000] Booting Linux on physical CPU 0x0000070002 [0x480fd010]
[    0.000000] Linux version 5.1.0-rc6+

<..snip..>

[    3.424103] arm-smmu-v3 arm-smmu-v3.0.auto: option mask 0x0
[    3.429674] arm-smmu-v3 arm-smmu-v3.0.auto: ias 48-bit, oas 48-bit (features 0x00000fef)
[    3.437780] arm-smmu-v3 arm-smmu-v3.0.auto: SMMU currently enabled! Resetting...
[    3.445431] arm-smmu-v3 arm-smmu-v3.1.auto: option mask 0x0


<..snip..>

Thanks,
Bhupesh
Matthias Brugger April 24, 2019, 4:22 p.m. UTC | #7
On 16/04/2019 11:14, Will Deacon wrote:
> On Mon, Apr 08, 2019 at 10:31:47AM +0800, Leizhen (ThunderTown) wrote:
>> On 2019/4/4 23:30, Will Deacon wrote:
>>> On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
>>>> v1 --> v2:
>>>> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
>>>> (Report abort to device, no event recorded) to suppress the event messages
>>>> caused by the unexpected devices.
>>>> 2. rewrite the patch description.
>>>
>>> This issue came up a while back:
>>>
>>> https://lore.kernel.org/linux-pci/20180302103032.GB19323@arm.com/
>>>
>>> and I'd still prefer to solve it using the disable_bypass logic which we
>>> already have. Something along the lines of the diff below?
>>
>> Yes, my patches also use disable_bypass=1(set ste.config=0b000). If
>> SMMU_IDR0.ST_LEVEL=0(Linear Stream table supported), then all STE entries
>> are allocated and initialized(set ste.config=0b000). But if SMMU_IDR0.ST_LEVEL=1
>> (2-level Stream Table), we only allocated and initialized the first level tables,
>> but leave level 2 tables dynamic allocated. That means, C_BAD_STREAMID(eventid=0x2)
>> will be reported, if an unexpeted device access memory without reinitialized in
>> kdump kernel.
> 
> So is your problem just that the C_BAD_STREAMID events are noisy? If so,
> perhaps we should be disabling fault reporting entirely in the kdump kernel.
> 
> How about the update diff below? I'm keen to have this as simple as
> possible, so we don't end up introducing rarely tested, complex code on
> the crash path.
> 
> Will
> 
> --->8
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d3880010c6cf..d8b73da6447d 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>  	/* Clear CR0 and sync (disables SMMU and queue processing) */
>  	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
>  	if (reg & CR0_SMMUEN) {
> -		if (is_kdump_kernel()) {
> -			arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> -			arm_smmu_device_disable(smmu);
> -			return -EBUSY;
> -		}
> -
>  		dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
> +		WARN_ON(is_kdump_kernel() && !disable_bypass);
> +		arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>  	}
>  
>  	ret = arm_smmu_device_disable(smmu);
> @@ -2553,6 +2549,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>  		return ret;
>  	}
>  
> +	if (is_kdump_kernel())
> +		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
>  
>  	/* Enable the SMMU interface, or ensure bypass */
>  	if (!bypass || disable_bypass) {
> 

Same here: I tested the patch and it works for me.

Feel free to add:
Tested-by: Matthias Brugger <mbrugger@suse.com>

Regards,
Matthias