diff mbox series

[1/2] drm/amdgpu: make sure to init common IP before gmc

Message ID 20220908040821.5786-1-alexander.deucher@amd.com (mailing list archive)
State Handled Elsewhere
Series [1/2] drm/amdgpu: make sure to init common IP before gmc

Commit Message

Alex Deucher Sept. 8, 2022, 4:08 a.m. UTC
The common block mainly handles golden register settings and HDP
register remapping; it shouldn't allocate any GPU memory.  Make sure
common init happens before gmc so that the HDP registers are
remapped before gmc attempts to access them.

This fixes the Unsupported Request error reported through
AER during driver load. The error occurs because a write is issued
to the remap offset before the actual remapping is done.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error went unnoticed before and became visible because of the commit
referenced below. This patch doesn't fix anything in that commit; rather,
it fixes the issue in amdgpu that the commit exposed. The reference is only
there to associate this commit with that one so that both go together.

Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)
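
The ordering problem described above can be illustrated with a small standalone model. This is only a sketch, not driver code: every `toy_*` name is hypothetical, the "remap" is reduced to a boolean flag, and it assumes the common block appears before gmc in the block list. It shows why running common's hw init in the early pass, before gmc, avoids the premature write to the remap offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's IP block types. */
enum toy_ip_type { TOY_IP_COMMON, TOY_IP_GMC, TOY_IP_IH };

struct toy_device {
	bool hdp_remapped;      /* set once "common" hw init has run */
	int unsupported_reqs;   /* writes that reached the remap offset too early */
};

struct toy_ip_block {
	enum toy_ip_type type;
	bool hw;                /* plays the role of status.hw in the patch */
};

/* Toy hw init: common performs the remap, gmc depends on it. */
static void toy_hw_init(struct toy_device *dev, struct toy_ip_block *blk)
{
	if (blk->type == TOY_IP_COMMON)
		dev->hdp_remapped = true;
	else if (blk->type == TOY_IP_GMC && !dev->hdp_remapped)
		dev->unsupported_reqs++;   /* gmc touched HDP before the remap */
	blk->hw = true;
}

/*
 * Walk the block list in order. With common_early == false only gmc is
 * initialised in the early pass (the behaviour before the patch); with
 * common_early == true common is initialised as soon as it is seen.
 * Returns the number of premature writes recorded on the device.
 */
static int toy_ip_init(struct toy_device *dev, struct toy_ip_block *blocks,
		       size_t n, bool common_early)
{
	size_t i;

	/* Early pass: bring up only the blocks that must be ready first. */
	for (i = 0; i < n; i++) {
		if (common_early && blocks[i].type == TOY_IP_COMMON)
			toy_hw_init(dev, &blocks[i]);
		else if (blocks[i].type == TOY_IP_GMC)
			toy_hw_init(dev, &blocks[i]);
	}

	/* Later pass: everything the early pass skipped. */
	for (i = 0; i < n; i++) {
		if (!blocks[i].hw)
			toy_hw_init(dev, &blocks[i]);
	}

	return dev->unsupported_reqs;
}
```

Calling `toy_ip_init()` on a list of common, gmc, ih with `common_early` set to false records one premature write, because gmc runs while `hdp_remapped` is still false. With `common_early` set to true the same list records none.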

Comments

Lazar, Lijo Sept. 8, 2022, 5:11 a.m. UTC | #1
On 9/8/2022 9:38 AM, Alex Deucher wrote:
> Common is mainly golden register setting and HDP register
> remapping, it shouldn't allocate any GPU memory.  Make sure
> common happens before gmc so that the HDP registers are
> remapped before gmc attempts to access them.
> 
> This fixes the Unsupported Request error reported through
> AER during driver load. The error happens as a write happens
> to the remap offset before real remapping is done.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
> 
> The error was unnoticed before and got visible because of the commit
> referenced below. This doesn't fix anything in the commit below, rather
> fixes the issue in amdgpu exposed by the commit. The reference is only
> to associate this commit with below one so that both go together.
> 
> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
> 
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Series is:
	Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>

Thanks,
Lijo

Alex Deucher Sept. 8, 2022, 2:21 p.m. UTC | #2
On Thu, Sep 8, 2022 at 1:11 AM Lazar, Lijo <lijo.lazar@amd.com> wrote:
>
> Series is:
>         Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>


@tseewald@gmail.com it would be good if you could verify that this
patch fixes the issue for you as well.

Thanks,

Alex


Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 899564ea8b4b..4da85ce9e3b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2375,8 +2375,16 @@  static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 		}
 		adev->ip_blocks[i].status.sw = true;
 
-		/* need to do gmc hw init early so we can allocate gpu mem */
-		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) {
+		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON) {
+			/* need to do common hw init early so everything is set up for gmc */
+			r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
+			if (r) {
+				DRM_ERROR("hw_init %d failed %d\n", i, r);
+				goto init_failed;
+			}
+			adev->ip_blocks[i].status.hw = true;
+		} else if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) {
+			/* need to do gmc hw init early so we can allocate gpu mem */
 			/* Try to reserve bad pages early */
 			if (amdgpu_sriov_vf(adev))
 				amdgpu_virt_exchange_data(adev);
@@ -3062,8 +3070,8 @@  static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev)
 	int i, r;
 
 	static enum amd_ip_block_type ip_order[] = {
-		AMD_IP_BLOCK_TYPE_GMC,
 		AMD_IP_BLOCK_TYPE_COMMON,
+		AMD_IP_BLOCK_TYPE_GMC,
 		AMD_IP_BLOCK_TYPE_PSP,
 		AMD_IP_BLOCK_TYPE_IH,
 	};