
drm/nouveau/core/object: fix double free on error in nvkm_ioctl_new()

Message ID YMcyzyVyI4N6anBo@mwanda (mailing list archive)
State New, archived

Commit Message

Dan Carpenter June 14, 2021, 10:43 a.m. UTC
If nvkm_object_init() fails then we should not call nvkm_object_fini()
because it results in calling object->func->fini(object, suspend) twice.
Once inside the nvkm_object_init() function and once inside the
nvkm_object_fini() function.

Fixes: fbd58ebda9c8 ("drm/nouveau/object: merge with handle")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
---
This is something that I spotted while looking for reference counting
bugs.  I have tried running it, but it does not fix my crashes.  My
system is basically unusable.  It's something to do with the new version
of Firefox which triggers the refcount_t underflow, but switching to
Epiphany doesn't solve the issue either.

 drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dan Carpenter June 14, 2021, 11:05 a.m. UTC | #1
On Mon, Jun 14, 2021 at 01:43:27PM +0300, Dan Carpenter wrote:
> If nvkm_object_init() fails then we should not call nvkm_object_fini()
> because it results in calling object->func->fini(object, suspend) twice.
> Once inside the nvkm_object_init() function and once inside the
> nvkm_object_fini() function.
> 
> Fixes: fbd58ebda9c8 ("drm/nouveau/object: merge with handle")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> This is something that I spotted while looking for reference counting
> bugs.  I have tried running it, but it does not fix my crashes.  My
> system is basically unusable.  It's something to do with the new version
> of Firefox which triggers the refcount_t underflow, but switching to
> Epiphany doesn't solve the issue either.
> 
>  drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> index d777df5a64e6..87c761fb475a 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> @@ -134,8 +134,8 @@ nvkm_ioctl_new(struct nvkm_client *client,
>  				return 0;
>  			}
>  			ret = -EEXIST;
> +			nvkm_object_fini(object, false);
>  		}
> -		nvkm_object_fini(object, false);
>  	}

Hm...  There is probably another bug here.

drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
	} while (oclass.base.oclass != args->v0.oclass);

	if (oclass.engine) {
		oclass.engine = nvkm_engine_ref(oclass.engine);
		if (IS_ERR(oclass.engine))
			return PTR_ERR(oclass.engine);
	}

	ret = oclass.ctor(&oclass, data, size, &object);
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	nvkm_engine_unref(&oclass.engine);
	if (ret == 0) {
		ret = nvkm_object_init(object);
		if (ret == 0) {
			list_add(&object->head, &parent->tree);
			if (nvkm_object_insert(object)) {
				client->data = object;
				return 0;
			}
			ret = -EEXIST;
			nvkm_object_fini(object, false);
		}
	}

	nvkm_object_del(&object);

This calls .dtor() whether or not .ctor() succeeded.

This error handling is written somewhat unconventionally: it checks for
success and nests the success blocks.  It's more normal to use gotos and
keep the success path at indent level 1.

	a = alloc();
	if (fail)
		return ret;

	b = alloc();
	if (fail)
		goto free_a;

	c = alloc();
	if (fail)
		goto free_b;

	return 0;

free_b:
	free(b);
free_a:
	free(a);

	return ret;

It looks sort of like this if you highlight the success path vs the
fail path:

	success
	success
		fail
	success
	success
		fail
		fail
	success

Then if you want to read what the function does in the ideal case,
you just read the code at one tab of indentation.  With the nested
style used here, the fix would mean adding yet another indent level.
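
As a concrete version of the goto pattern above, here is a minimal
userspace sketch.  It is not nouveau code: struct bufs, bufs_alloc(),
bufs_free() and the fail_at parameter are names made up for this
illustration, and fail_at exists only so the error paths can be exercised.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical container, used only to illustrate the unwinding order. */
struct bufs {
	char *a;
	char *b;
	char *c;
};

/*
 * Allocate three buffers.  fail_at (1..3) forces that step to fail;
 * 0 means no failure is forced.  The success path stays at one tab of
 * indentation and each failure jumps to the label that frees
 * everything allocated so far, in reverse order.
 */
static int bufs_alloc(struct bufs *p, int fail_at)
{
	int ret = -ENOMEM;

	*p = (struct bufs){0};

	p->a = (fail_at == 1) ? NULL : malloc(16);
	if (!p->a)
		return ret;

	p->b = (fail_at == 2) ? NULL : malloc(16);
	if (!p->b)
		goto free_a;

	p->c = (fail_at == 3) ? NULL : malloc(16);
	if (!p->c)
		goto free_b;

	return 0;

free_b:
	free(p->b);
	p->b = NULL;
free_a:
	free(p->a);
	p->a = NULL;

	return ret;
}

/* Release a fully constructed set of buffers. */
static void bufs_free(struct bufs *p)
{
	free(p->c);
	free(p->b);
	free(p->a);
	*p = (struct bufs){0};
}
```

Calling bufs_alloc(&p, 2) returns -ENOMEM and leaves every pointer NULL,
so the caller has nothing to clean up after a failure.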

No, wait, we can actually just move the nvkm_object_del() into the curly
braces...  I'll test that out and send the patch if it works.  Probably
I'll send the patch regardless of whether it works or not because I
think it's the right thing to do.  ;)

regards,
dan carpenter

Dan Carpenter June 17, 2021, 11:32 a.m. UTC | #2
On Mon, Jun 14, 2021 at 01:43:27PM +0300, Dan Carpenter wrote:
> If nvkm_object_init() fails then we should not call nvkm_object_fini()
> because it results in calling object->func->fini(object, suspend) twice.
> Once inside the nvkm_object_init() function and once inside the
> nvkm_object_fini() function.
> 
> Fixes: fbd58ebda9c8 ("drm/nouveau/object: merge with handle")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> This is something that I spotted while looking for reference counting
> bugs.  I have tried running it, but it does not fix my crashes.  My
> system is basically unusable.  It's something to do with the new version
> of Firefox which triggers the refcount_t underflow, but switching to
> Epiphany doesn't solve the issue either.
> 
>  drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> index d777df5a64e6..87c761fb475a 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> @@ -134,8 +134,8 @@ nvkm_ioctl_new(struct nvkm_client *client,
>  				return 0;
>  			}
>  			ret = -EEXIST;
> +			nvkm_object_fini(object, false);
>  		}
> -		nvkm_object_fini(object, false);

Actually calling nvkm_object_fini() is probably fine.  It just screws
around with the registers and it's probably fine if we do that twice.

Calling .dtor() when .ctor() fails is actually required because .ctor()
doesn't clean up after itself.
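
To illustrate that contract, here is a toy userspace model.  It is not
the nvkm code: toy_object, toy_ctor(), toy_del(), toy_new() and the
toy_live counter are all made up for this sketch.  The constructor hands
the object to the caller before it can fail and does not unwind, so the
caller has to run the destructor on every failure.

```c
#include <errno.h>
#include <stdlib.h>

struct toy_object {
	int *regs;	/* may still be NULL after a failed constructor */
};

static int toy_live;	/* outstanding allocations, for leak checking */

static void *toy_zalloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		toy_live++;
	return p;
}

static void toy_free(void *p)
{
	if (p) {
		toy_live--;
		free(p);
	}
}

/* Publishes *pobject early and does NOT clean up if it fails later. */
static int toy_ctor(int fail_late, struct toy_object **pobject)
{
	struct toy_object *obj = toy_zalloc(sizeof(*obj));

	if (!obj)
		return -ENOMEM;
	*pobject = obj;		/* the caller owns it from here on */

	if (fail_late)
		return -EINVAL;	/* obj is left allocated on purpose */

	obj->regs = toy_zalloc(4 * sizeof(*obj->regs));
	return obj->regs ? 0 : -ENOMEM;
}

/* Must cope with NULL and with partially constructed objects. */
static void toy_del(struct toy_object **pobject)
{
	struct toy_object *obj = *pobject;

	if (!obj)
		return;
	toy_free(obj->regs);
	toy_free(obj);
	*pobject = NULL;
}

/* Caller: runs toy_del() on failure whether or not toy_ctor() succeeded. */
static int toy_new(int fail_late, struct toy_object **out)
{
	struct toy_object *obj = NULL;
	int ret = toy_ctor(fail_late, &obj);

	if (ret == 0) {
		*out = obj;
		return 0;
	}
	toy_del(&obj);
	return ret;
}
```

If toy_new() skipped toy_del() when toy_ctor() failed, the object
allocated at the top of toy_ctor() would leak.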

So this patch is not required.  The other patch is required.
https://lore.kernel.org/nouveau/YMinJwpIei9n1Pn1@mwanda/T/

In the end, I had to give up on fixing the hang and downgrade to
Debian's long term support version of Firefox.

regards,
dan carpenter

Patch

diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
index d777df5a64e6..87c761fb475a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
+++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
@@ -134,8 +134,8 @@  nvkm_ioctl_new(struct nvkm_client *client,
 				return 0;
 			}
 			ret = -EEXIST;
+			nvkm_object_fini(object, false);
 		}
-		nvkm_object_fini(object, false);
 	}
 
 	nvkm_object_del(&object);