next/master boot bisection: Oops in nouveau driver on jetson-tk1

Message ID 89fa50d0-945f-d1ed-ae55-ee947187f209@collabora.com (mailing list archive)
State New, archived

Commit Message

Guillaume Tucker Dec. 7, 2018, 11:31 p.m. UTC
Please find below an automated bisection report for a kernel Oops
seen during the initialisation of the nouveau GPU driver on
jetson-tk1.


All the LAVA test jobs for this bisection can be found here:

  http://lava.baylibre.com:10080/scheduler/alljobs?length=25&search=lava-bisect-staging-7366#table


Here's the beginning of the Oops stack trace:

[    7.485361] [00000064] *pgd=f9e7b835
[    7.485372] Internal error: Oops: 17 [#1] SMP ARM
[    7.485376] Modules linked in: snd_soc_tegra_rt5640(+) snd_soc_tegra_utils snd_soc_rt5640(+) snd_soc_rl6231 snd_soc_tegra30_ahub snd_hda_tegra snd_soc_core snd_hda_codec snd_hda_core ac97_bus snd_pcm_dmaengine snd_pcm xhci_tegra(+) snd_timer snd soundcore nouveau(+) ttm tegra_devfreq tegra_wdt
[    7.542227] CPU: 1 PID: 128 Comm: udevd Not tainted 4.20.0-rc5-next-20181206 #44
[    7.549603] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[    7.555859] PC is at drm_plane_register_all+0x18/0x50
[    7.560899] LR is at drm_modeset_register_all+0xc/0x6c
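
For anyone triaging a similar report, here is a minimal sketch of pulling the key fields out of an ARM Oops excerpt like the one above. It assumes the usual "[address] *pgd=" and "PC is at symbol+offset/size" line shapes; the helper name summarize_oops is hypothetical and not part of any kernel or KernelCI tooling.

```python
import re

# Excerpt copied from the report above (Modules/CPU/Hardware lines omitted).
OOPS = """\
[    7.485361] [00000064] *pgd=f9e7b835
[    7.485372] Internal error: Oops: 17 [#1] SMP ARM
[    7.555859] PC is at drm_plane_register_all+0x18/0x50
[    7.560899] LR is at drm_modeset_register_all+0xc/0x6c
"""

# "<PC|LR> is at <symbol>+<offset>/<function size>"
REG_RE = re.compile(r"\b(PC|LR) is at (\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)")
# "[<8 hex digits>] *pgd=..." carries the faulting virtual address.
FAULT_RE = re.compile(r"\[([0-9a-f]{8})\] \*pgd=")


def summarize_oops(text):
    """Return the fault address and the PC/LR symbol locations found in text."""
    summary = {}
    fault = FAULT_RE.search(text)
    if fault:
        summary["fault_addr"] = int(fault.group(1), 16)
    for reg, symbol, offset, size in REG_RE.findall(text):
        summary[reg] = (symbol, int(offset, 16), int(size, 16))
    return summary


if __name__ == "__main__":
    for key, value in summarize_oops(OOPS).items():
        print(key, value)
```

A fault address as small as 0x64 usually suggests a structure member being read through a NULL pointer, although the excerpt alone does not prove that.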


Full log:

  http://lava.baylibre.com:10080/scheduler/job/68628#L816


The bisection was run from next-20181206, as this is where the
issue was discovered on kernelci.org, but the commit it found has
already been merged into mainline.

Hope this helps!

Guillaume


-----------8<------------------------8<-----------


* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* This automated bisection report was sent to you on the basis  *
* that you may be involved with the breaking commit it has      *
* found.  No manual investigation has been done to verify it,   *
* and the root cause of the problem may be somewhere else.      *
* Hope this helps!                                              *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Bisection result for next/master (next-20181206) on jetson-tk1

  Good:       84df9525b0c2 Linux 4.19
  Bad:        4c92b7b3080d Add linux-next specific files for 20181206
  Found:      cfea88a4d866 drm/nouveau: Start using new drm_dev initialization helpers

Checks:
  revert:     PASS
  verify:     PASS

Parameters:
  Tree:       next
  URL:        None
  Branch:     master
  Target:     jetson-tk1
  Lab:        lab-baylibre
  Config:     multi_v7_defconfig
  Plan:       dmesg-nouveau

Breaking commit found:

-------------------------------------------------------------------------------
commit cfea88a4d86632f28cf80be97079f131645b7869
Author: Lyude Paul <lyude@redhat.com>
Date:   Wed Aug 22 21:40:07 2018 -0400

    drm/nouveau: Start using new drm_dev initialization helpers
    
    Per the documentation in drm_get_pci_dev(), this function is deprecated
    and shouldn't be used anymore. As it turns out, we're going to need to
    stop using drm_get_pci_dev() anyway in order to allow us to turn off the
    card before full system shutdowns, otherwise we'll hit race conditions
    with userspace while trying to tear down the card on shutdown.
    
    So, start using drm_dev_get() and drm_dev_put(), and just turn our
    load/unload callbacks into open coded init/fini() functions.
    
    Signed-off-by: Lyude Paul <lyude@redhat.com>
    Cc: Karol Herbst <kherbst@redhat.com>
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

-------------------------------------------------------------------------------


Git bisection log:

-------------------------------------------------------------------------------
git bisect start
# good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
git bisect good 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
# bad: [4c92b7b3080d8281941ae81c51cd62bb49bdc3d4] Add linux-next specific files for 20181206
git bisect bad 4c92b7b3080d8281941ae81c51cd62bb49bdc3d4
# bad: [c38239b4be1ac7e4bcf5bbd971353bae51525b8f] Merge branch 'parisc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
git bisect bad c38239b4be1ac7e4bcf5bbd971353bae51525b8f
# good: [d49f8a52b15bf35db778035340d8a673149f9f93] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
git bisect good d49f8a52b15bf35db778035340d8a673149f9f93
# good: [ac747c0715f29c2be3848b719a1b7e65b07f7b21] Merge tag 'kbuild-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
git bisect good ac747c0715f29c2be3848b719a1b7e65b07f7b21
# bad: [46972c03ab667dc298cad0c9db517fb9b1521b5f] Merge tag 'drm-misc-next-fixes-2018-10-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
git bisect bad 46972c03ab667dc298cad0c9db517fb9b1521b5f
# good: [6ac99a328ee16d3f8cc253f1df62623cee3e9ea5] drm/exynos: mixer: Make plane alpha configurable
git bisect good 6ac99a328ee16d3f8cc253f1df62623cee3e9ea5
# good: [0957dc7097a3f462f6cedb45cf9b9785cc29e5bb] drm/amdgpu: revert "stop using gart_start as offset for the GTT domain"
git bisect good 0957dc7097a3f462f6cedb45cf9b9785cc29e5bb
# good: [2de0b0a158bf423208c3898522c8fa1c1078df48] Merge tag 'drm/tegra/for-4.20-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
git bisect good 2de0b0a158bf423208c3898522c8fa1c1078df48
# good: [04b96b63c5640a305e30611def7a9c5fcd7a72cf] drm/msm/dpu: Remove unneeded checks in dpu_crtc.c
git bisect good 04b96b63c5640a305e30611def7a9c5fcd7a72cf
# good: [6952e3a1dffcb931cf8625aa01642b9afac2af61] Merge branch 'for-upstream/mali-dp' of git://linux-arm.org/linux-ld into drm-next
git bisect good 6952e3a1dffcb931cf8625aa01642b9afac2af61
# good: [62e681f7dcab746412dce22d4b75b32c5ea38cdb] Merge tag 'drm-msm-fixes-2018-10-09' of git://people.freedesktop.org/~robclark/linux into drm-next
git bisect good 62e681f7dcab746412dce22d4b75b32c5ea38cdb
# bad: [7e6191d4360a2df6cf2a2613dcb79680cb943df8] Merge branch 'linux-4.20' of git://github.com/skeggsb/linux into drm-next
git bisect bad 7e6191d4360a2df6cf2a2613dcb79680cb943df8
# good: [c4cee69a4497d9c6ad8868d63568b30e50cac9e9] drm/nouveau: Fix potential memory leak in nouveau_drm_load()
git bisect good c4cee69a4497d9c6ad8868d63568b30e50cac9e9
# bad: [a971558c298755d2c07bc5508c65d689471763c8] drm/nouveau/disp: keep track of high-speed state, program into clock
git bisect bad a971558c298755d2c07bc5508c65d689471763c8
# bad: [4126b99e744b7a29746e201e2be6644d2edf3c56] drm/nouveau/disp: add a way to configure scrambling/tmds for hdmi 2.0
git bisect bad 4126b99e744b7a29746e201e2be6644d2edf3c56
# bad: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
git bisect bad cfea88a4d86632f28cf80be97079f131645b7869
# first bad commit: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
-------------------------------------------------------------------------------
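
The log above is in the format written by "git bisect log", so it should be possible to feed it back to git with "git bisect replay". For a quick look without a kernel tree, here is a minimal sketch that extracts the verdicts and the first bad commit from such a log. The function name parse_bisect_log is hypothetical, and the embedded excerpt is shortened from the full log above.

```python
import re

# Shortened excerpt of the bisection log above; the full log has more steps.
BISECT_LOG = """\
git bisect start
# good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
git bisect good 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
# bad: [4c92b7b3080d8281941ae81c51cd62bb49bdc3d4] Add linux-next specific files for 20181206
git bisect bad 4c92b7b3080d8281941ae81c51cd62bb49bdc3d4
# good: [c4cee69a4497d9c6ad8868d63568b30e50cac9e9] drm/nouveau: Fix potential memory leak in nouveau_drm_load()
git bisect good c4cee69a4497d9c6ad8868d63568b30e50cac9e9
# bad: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
git bisect bad cfea88a4d86632f28cf80be97079f131645b7869
# first bad commit: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
"""

STEP_RE = re.compile(r"^# (good|bad): \[([0-9a-f]{40})\] (.*)$")
FIRST_BAD_RE = re.compile(r"^# first bad commit: \[([0-9a-f]{40})\] (.*)$")


def parse_bisect_log(text):
    """Return (steps, first_bad) parsed from the comment lines of a bisect log.

    steps is a list of (verdict, sha, subject) tuples in log order;
    first_bad is a (sha, subject) tuple, or None if the log is incomplete.
    """
    steps = []
    first_bad = None
    for line in text.splitlines():
        step = STEP_RE.match(line)
        if step:
            steps.append(step.groups())
            continue
        found = FIRST_BAD_RE.match(line)
        if found:
            first_bad = found.groups()
    return steps, first_bad


if __name__ == "__main__":
    steps, first_bad = parse_bisect_log(BISECT_LOG)
    for verdict, sha, subject in steps:
        print(f"{verdict:4} {sha[:12]} {subject}")
    print("first bad:", first_bad)
```

Run against the full log, this lists the initial good/bad pair followed by 15 tested commits, each of which roughly halves the remaining candidate range.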

Comments

Lyude Paul Dec. 8, 2018, 12:08 a.m. UTC | #1
uhhhhhhhhhhhhh
didn't we fix this weeks ago? with "drm/nouveau: tegra: Call
nouveau_drm_device_init()"


On Fri, 2018-12-07 at 23:31 +0000, Guillaume Tucker wrote:
> Please find below an automated bisection report for a kernel Oops
> seen during the initialisation of the nouveau GPU driver on
> jetson-tk1.
> 
> 
> All the LAVA test jobs for this bisection can be found here:
> 
>   http://lava.baylibre.com:10080/scheduler/alljobs?length=25&search=lava-bisect-staging-7366#table
> 
> 
> Here's the beginning of the Oops stack trace:
> 
> [    7.485361] [00000064] *pgd=f9e7b835
> [    7.485372] Internal error: Oops: 17 [#1] SMP ARM
> [    7.485376] Modules linked in: snd_soc_tegra_rt5640(+) snd_soc_tegra_utils snd_soc_rt5640(+) snd_soc_rl6231 snd_soc_tegra30_ahub snd_hda_tegra snd_soc_core snd_hda_codec snd_hda_core ac97_bus snd_pcm_dmaengine snd_pcm xhci_tegra(+) snd_timer snd soundcore nouveau(+) ttm tegra_devfreq tegra_wdt
> [    7.542227] CPU: 1 PID: 128 Comm: udevd Not tainted 4.20.0-rc5-next-20181206 #44
> [    7.549603] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> [    7.555859] PC is at drm_plane_register_all+0x18/0x50
> [    7.560899] LR is at drm_modeset_register_all+0xc/0x6c
> 
> 
> Full log:
> 
>   http://lava.baylibre.com:10080/scheduler/job/68628#L816
> 
> 
> The bisection was run from next-20181206 as this is where the
> issue was discovered on kernelci.org but the patch it found has
> already been merged in mainline.
> 
> Hope this helps!
> 
> Guillaume
> 
> 
> -----------8<------------------------8<-----------
> 
> 
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis  *
> * that you may be involved with the breaking commit it has      *
> * found.  No manual investigation has been done to verify it,   *
> * and the root cause of the problem may be somewhere else.      *
> * Hope this helps!                                              *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> 
> Bisection result for next/master (next-20181206) on jetson-tk1
> 
>   Good:       84df9525b0c2 Linux 4.19
>   Bad:        4c92b7b3080d Add linux-next specific files for 20181206
>   Found:      cfea88a4d866 drm/nouveau: Start using new drm_dev initialization helpers
> 
> Checks:
>   revert:     PASS
>   verify:     PASS
> 
> Parameters:
>   Tree:       next
>   URL:        None
>   Branch:     master
>   Target:     jetson-tk1
>   Lab:        lab-baylibre
>   Config:     multi_v7_defconfig
>   Plan:       dmesg-nouveau
> 
> Breaking commit found:
> 
> -------------------------------------------------------------------------------
> commit cfea88a4d86632f28cf80be97079f131645b7869
> Author: Lyude Paul <lyude@redhat.com>
> Date:   Wed Aug 22 21:40:07 2018 -0400
> 
>     drm/nouveau: Start using new drm_dev initialization helpers
>     
>     Per the documentation in drm_get_pci_dev(), this function is deprecated
>     and shouldn't be used anymore. As it turns out, we're going to need to
>     stop using drm_get_pci_dev() anyway in order to allow us to turn off the
>     card before full system shutdowns, otherwise we'll hit race conditions
>     with userspace while trying to tear down the card on shutdown.
>     
>     So, start using drm_dev_get() and drm_dev_put(), and just turn our
>     load/unload callbacks into open coded init/fini() functions.
>     
>     Signed-off-by: Lyude Paul <lyude@redhat.com>
>     Cc: Karol Herbst <kherbst@redhat.com>
>     Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 905956809d21..2b2baf6e0e0d 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -458,75 +458,8 @@ nouveau_accel_init(struct nouveau_drm *drm)
>  	nouveau_bo_move_init(drm);
>  }
>  
> -static int nouveau_drm_probe(struct pci_dev *pdev,
> -			     const struct pci_device_id *pent)
> -{
> -	struct nvkm_device *device;
> -	struct apertures_struct *aper;
> -	bool boot = false;
> -	int ret;
> -
> -	if (vga_switcheroo_client_probe_defer(pdev))
> -		return -EPROBE_DEFER;
> -
> -	/* We need to check that the chipset is supported before booting
> -	 * fbdev off the hardware, as there's no way to put it back.
> -	 */
> -	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0, &device);
> -	if (ret)
> -		return ret;
> -
> -	nvkm_device_del(&device);
> -
> -	/* Remove conflicting drivers (vesafb, efifb etc). */
> -	aper = alloc_apertures(3);
> -	if (!aper)
> -		return -ENOMEM;
> -
> -	aper->ranges[0].base = pci_resource_start(pdev, 1);
> -	aper->ranges[0].size = pci_resource_len(pdev, 1);
> -	aper->count = 1;
> -
> -	if (pci_resource_len(pdev, 2)) {
> -		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
> -		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
> -		aper->count++;
> -	}
> -
> -	if (pci_resource_len(pdev, 3)) {
> -		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
> -		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
> -		aper->count++;
> -	}
> -
> -#ifdef CONFIG_X86
> -	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
> -#endif
> -	if (nouveau_modeset != 2)
> -		drm_fb_helper_remove_conflicting_framebuffers(aper, "nouveaufb", boot);
> -	kfree(aper);
> -
> -	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
> -				  true, true, ~0ULL, &device);
> -	if (ret)
> -		return ret;
> -
> -	pci_set_master(pdev);
> -
> -	if (nouveau_atomic)
> -		driver_pci.driver_features |= DRIVER_ATOMIC;
> -
> -	ret = drm_get_pci_dev(pdev, pent, &driver_pci);
> -	if (ret) {
> -		nvkm_device_del(&device);
> -		return ret;
> -	}
> -
> -	return 0;
> -}
> -
>  static int
> -nouveau_drm_load(struct drm_device *dev, unsigned long flags)
> +nouveau_drm_device_init(struct drm_device *dev)
>  {
>  	struct nouveau_drm *drm;
>  	int ret;
> @@ -613,7 +546,7 @@ nouveau_drm_load(struct drm_device *dev, unsigned long flags)
>  }
>  
>  static void
> -nouveau_drm_unload(struct drm_device *dev)
> +nouveau_drm_device_fini(struct drm_device *dev)
>  {
>  	struct nouveau_drm *drm = nouveau_drm(dev);
>  
> @@ -642,18 +575,116 @@ nouveau_drm_unload(struct drm_device *dev)
>  	kfree(drm);
>  }
>  
> +static int nouveau_drm_probe(struct pci_dev *pdev,
> +			     const struct pci_device_id *pent)
> +{
> +	struct nvkm_device *device;
> +	struct drm_device *drm_dev;
> +	struct apertures_struct *aper;
> +	bool boot = false;
> +	int ret;
> +
> +	if (vga_switcheroo_client_probe_defer(pdev))
> +		return -EPROBE_DEFER;
> +
> +	/* We need to check that the chipset is supported before booting
> +	 * fbdev off the hardware, as there's no way to put it back.
> +	 */
> +	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0, &device);
> +	if (ret)
> +		return ret;
> +
> +	nvkm_device_del(&device);
> +
> +	/* Remove conflicting drivers (vesafb, efifb etc). */
> +	aper = alloc_apertures(3);
> +	if (!aper)
> +		return -ENOMEM;
> +
> +	aper->ranges[0].base = pci_resource_start(pdev, 1);
> +	aper->ranges[0].size = pci_resource_len(pdev, 1);
> +	aper->count = 1;
> +
> +	if (pci_resource_len(pdev, 2)) {
> +		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
> +		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
> +		aper->count++;
> +	}
> +
> +	if (pci_resource_len(pdev, 3)) {
> +		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
> +		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
> +		aper->count++;
> +	}
> +
> +#ifdef CONFIG_X86
> +	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
> +#endif
> +	if (nouveau_modeset != 2)
> +		drm_fb_helper_remove_conflicting_framebuffers(aper, "nouveaufb", boot);
> +	kfree(aper);
> +
> +	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
> +				  true, true, ~0ULL, &device);
> +	if (ret)
> +		return ret;
> +
> +	pci_set_master(pdev);
> +
> +	if (nouveau_atomic)
> +		driver_pci.driver_features |= DRIVER_ATOMIC;
> +
> +	drm_dev = drm_dev_alloc(&driver_pci, &pdev->dev);
> +	if (IS_ERR(drm_dev)) {
> +		ret = PTR_ERR(drm_dev);
> +		goto fail_nvkm;
> +	}
> +
> +	ret = pci_enable_device(pdev);
> +	if (ret)
> +		goto fail_drm;
> +
> +	drm_dev->pdev = pdev;
> +	pci_set_drvdata(pdev, drm_dev);
> +
> +	ret = nouveau_drm_device_init(drm_dev);
> +	if (ret)
> +		goto fail_pci;
> +
> +	ret = drm_dev_register(drm_dev, pent->driver_data);
> +	if (ret)
> +		goto fail_drm_dev_init;
> +
> +	return 0;
> +
> +fail_drm_dev_init:
> +	nouveau_drm_device_fini(drm_dev);
> +fail_pci:
> +	pci_disable_device(pdev);
> +fail_drm:
> +	drm_dev_put(drm_dev);
> +fail_nvkm:
> +	nvkm_device_del(&device);
> +	return ret;
> +}
> +
>  void
>  nouveau_drm_device_remove(struct drm_device *dev)
>  {
> +	struct pci_dev *pdev = dev->pdev;
>  	struct nouveau_drm *drm = nouveau_drm(dev);
>  	struct nvkm_client *client;
>  	struct nvkm_device *device;
>  
> +	drm_dev_unregister(dev);
> +
>  	dev->irq_enabled = false;
>  	client = nvxx_client(&drm->client.base);
>  	device = nvkm_device_find(client->device);
> -	drm_put_dev(dev);
>  
> +	nouveau_drm_device_fini(dev);
> +	pci_disable_device(pdev);
> +	drm_dev_put(dev);
>  	nvkm_device_del(&device);
>  }
>  
> @@ -1020,8 +1051,6 @@ driver_stub = {
>  		DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME | DRIVER_RENDER |
>  		DRIVER_KMS_LEGACY_CONTEXT,
>  
> -	.load = nouveau_drm_load,
> -	.unload = nouveau_drm_unload,
>  	.open = nouveau_drm_open,
>  	.postclose = nouveau_drm_postclose,
>  	.lastclose = nouveau_vga_lastclose,
> -------------------------------------------------------------------------------
> 
> 
> Git bisection log:
> 
> -------------------------------------------------------------------------------
> git bisect start
> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
> git bisect good 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
> # bad: [4c92b7b3080d8281941ae81c51cd62bb49bdc3d4] Add linux-next specific files for 20181206
> git bisect bad 4c92b7b3080d8281941ae81c51cd62bb49bdc3d4
> # bad: [c38239b4be1ac7e4bcf5bbd971353bae51525b8f] Merge branch 'parisc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
> git bisect bad c38239b4be1ac7e4bcf5bbd971353bae51525b8f
> # good: [d49f8a52b15bf35db778035340d8a673149f9f93] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
> git bisect good d49f8a52b15bf35db778035340d8a673149f9f93
> # good: [ac747c0715f29c2be3848b719a1b7e65b07f7b21] Merge tag 'kbuild-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
> git bisect good ac747c0715f29c2be3848b719a1b7e65b07f7b21
> # bad: [46972c03ab667dc298cad0c9db517fb9b1521b5f] Merge tag 'drm-misc-next-fixes-2018-10-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
> git bisect bad 46972c03ab667dc298cad0c9db517fb9b1521b5f
> # good: [6ac99a328ee16d3f8cc253f1df62623cee3e9ea5] drm/exynos: mixer: Make plane alpha configurable
> git bisect good 6ac99a328ee16d3f8cc253f1df62623cee3e9ea5
> # good: [0957dc7097a3f462f6cedb45cf9b9785cc29e5bb] drm/amdgpu: revert "stop using gart_start as offset for the GTT domain"
> git bisect good 0957dc7097a3f462f6cedb45cf9b9785cc29e5bb
> # good: [2de0b0a158bf423208c3898522c8fa1c1078df48] Merge tag 'drm/tegra/for-4.20-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
> git bisect good 2de0b0a158bf423208c3898522c8fa1c1078df48
> # good: [04b96b63c5640a305e30611def7a9c5fcd7a72cf] drm/msm/dpu: Remove unneeded checks in dpu_crtc.c
> git bisect good 04b96b63c5640a305e30611def7a9c5fcd7a72cf
> # good: [6952e3a1dffcb931cf8625aa01642b9afac2af61] Merge branch 'for-upstream/mali-dp' of git://linux-arm.org/linux-ld into drm-next
> git bisect good 6952e3a1dffcb931cf8625aa01642b9afac2af61
> # good: [62e681f7dcab746412dce22d4b75b32c5ea38cdb] Merge tag 'drm-msm-fixes-2018-10-09' of git://people.freedesktop.org/~robclark/linux into drm-next
> git bisect good 62e681f7dcab746412dce22d4b75b32c5ea38cdb
> # bad: [7e6191d4360a2df6cf2a2613dcb79680cb943df8] Merge branch 'linux-4.20' of git://github.com/skeggsb/linux into drm-next
> git bisect bad 7e6191d4360a2df6cf2a2613dcb79680cb943df8
> # good: [c4cee69a4497d9c6ad8868d63568b30e50cac9e9] drm/nouveau: Fix potential memory leak in nouveau_drm_load()
> git bisect good c4cee69a4497d9c6ad8868d63568b30e50cac9e9
> # bad: [a971558c298755d2c07bc5508c65d689471763c8] drm/nouveau/disp: keep track of high-speed state, program into clock
> git bisect bad a971558c298755d2c07bc5508c65d689471763c8
> # bad: [4126b99e744b7a29746e201e2be6644d2edf3c56] drm/nouveau/disp: add a way to configure scrambling/tmds for hdmi 2.0
> git bisect bad 4126b99e744b7a29746e201e2be6644d2edf3c56
> # bad: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
> git bisect bad cfea88a4d86632f28cf80be97079f131645b7869
> # first bad commit: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
> -------------------------------------------------------------------------------
Guillaume Tucker Dec. 10, 2018, 10 a.m. UTC | #2
On 08/12/2018 00:08, Lyude Paul wrote:
> uhhhhhhhhhhhhh
> didn't we fix this weeks ago? with "drm/nouveau: tegra: Call
> nouveau_drm_device_init()"

Yes, here's the fix from Thierry:

  https://patchwork.freedesktop.org/patch/263587/


and I can confirm that it does fix the Oops when applied on top
of next-20181206 (what I used for the bisection last week):

  http://lava.baylibre.com:10080/scheduler/job/71109


However, the fix doesn't appear to have been applied in any
upstream tree yet.

Guillaume
 
> On Fri, 2018-12-07 at 23:31 +0000, Guillaume Tucker wrote:
>> Please find below an automated bisection report for a kernel Oops
>> seen during the initialisation of the nouveau GPU driver on
>> jetson-tk1.
>>
>>
>> All the LAVA test jobs for this bisection can be found here:
>>
>>   
>> http://lava.baylibre.com:10080/scheduler/alljobs?length=25&search=lava-bisect-staging-7366#table
>>
>>
>> Here's the beginning of the Oops stack trace:
>>
>> [    7.485361] [00000064] *pgd=f9e7b835
>> [    7.485372] Internal error: Oops: 17 [#1] SMP ARM
>> [    7.485376] Modules linked in: snd_soc_tegra_rt5640(+)
>> snd_soc_tegra_utils snd_soc_rt5640(+) snd_soc_rl6231 snd_soc_tegra30_ahub
>> snd_hda_tegra snd_soc_core snd_hda_codec snd_hda_core ac97_bus
>> snd_pcm_dmaengine snd_pcm xhci_tegra(+) snd_timer snd soundcore nouveau(+)
>> ttm tegra_devfreq tegra_wdt
>> [    7.542227] CPU: 1 PID: 128 Comm: udevd Not tainted 4.20.0-rc5-next-
>> 20181206 #44
>> [    7.549603] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
>> [    7.555859] PC is at drm_plane_register_all+0x18/0x50
>> [    7.560899] LR is at drm_modeset_register_all+0xc/0x6c
>>
>>
>> Full log:
>>
>>   http://lava.baylibre.com:10080/scheduler/job/68628#L816
>>
>>
>> The bisection was run from next-20181206 as this is where the
>> issue was discovered on kernelci.org but the patch it found has
>> already been merged in mainline.
>>
>> Hope this helps!
>>
>> Guillaume
>>
>>
>> -----------8<------------------------8<-----------
>>
>>
>> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>> * This automated bisection report was sent to you on the basis  *
>> * that you may be involved with the breaking commit it has      *
>> * found.  No manual investigation has been done to verify it,   *
>> * and the root cause of the problem may be somewhere else.      *
>> * Hope this helps!                                              *
>> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>>
>> Bisection result for next/master (next-20181206) on jetson-tk1
>>
>>   Good:       84df9525b0c2 Linux 4.19
>>   Bad:        4c92b7b3080d Add linux-next specific files for 20181206
>>   Found:      cfea88a4d866 drm/nouveau: Start using new drm_dev
>> initialization helpers
>>
>> Checks:
>>   revert:     PASS
>>   verify:     PASS
>>
>> Parameters:
>>   Tree:       next
>>   URL:        None
>>   Branch:     master
>>   Target:     jetson-tk1
>>   Lab:        lab-baylibre
>>   Config:     multi_v7_defconfig
>>   Plan:       dmesg-nouveau
>>
>> Breaking commit found:
>>
>> ----------------------------------------------------------------------------
>> ---
>> commit cfea88a4d86632f28cf80be97079f131645b7869
>> Author: Lyude Paul <lyude@redhat.com>
>> Date:   Wed Aug 22 21:40:07 2018 -0400
>>
>>     drm/nouveau: Start using new drm_dev initialization helpers
>>     
>>     Per the documentation in drm_get_pci_dev(), this function is deprecated
>>     and shouldn't be used anymore. As it turns out, we're going to need to
>>     stop using drm_get_pci_dev() anyway in order to allow us to turn off the
>>     card before full system shutdowns, otherwise we'll hit race conditions
>>     with userspace while trying to tear down the card on shutdown.
>>     
>>     So, start using drm_dev_get() and drm_dev_put(), and just turn our
>>     load/unload callbacks into open coded init/fini() functions.
>>     
>>     Signed-off-by: Lyude Paul <lyude@redhat.com>
>>     Cc: Karol Herbst <kherbst@redhat.com>
>>     Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
>>
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c
>> b/drivers/gpu/drm/nouveau/nouveau_drm.c
>> index 905956809d21..2b2baf6e0e0d 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
>> @@ -458,75 +458,8 @@ nouveau_accel_init(struct nouveau_drm *drm)
>>  	nouveau_bo_move_init(drm);
>>  }
>>  
>> -static int nouveau_drm_probe(struct pci_dev *pdev,
>> -			     const struct pci_device_id *pent)
>> -{
>> -	struct nvkm_device *device;
>> -	struct apertures_struct *aper;
>> -	bool boot = false;
>> -	int ret;
>> -
>> -	if (vga_switcheroo_client_probe_defer(pdev))
>> -		return -EPROBE_DEFER;
>> -
>> -	/* We need to check that the chipset is supported before booting
>> -	 * fbdev off the hardware, as there's no way to put it back.
>> -	 */
>> -	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0,
>> &device);
>> -	if (ret)
>> -		return ret;
>> -
>> -	nvkm_device_del(&device);
>> -
>> -	/* Remove conflicting drivers (vesafb, efifb etc). */
>> -	aper = alloc_apertures(3);
>> -	if (!aper)
>> -		return -ENOMEM;
>> -
>> -	aper->ranges[0].base = pci_resource_start(pdev, 1);
>> -	aper->ranges[0].size = pci_resource_len(pdev, 1);
>> -	aper->count = 1;
>> -
>> -	if (pci_resource_len(pdev, 2)) {
>> -		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
>> -		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
>> -		aper->count++;
>> -	}
>> -
>> -	if (pci_resource_len(pdev, 3)) {
>> -		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
>> -		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
>> -		aper->count++;
>> -	}
>> -
>> -#ifdef CONFIG_X86
>> -	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
>> -#endif
>> -	if (nouveau_modeset != 2)
>> -		drm_fb_helper_remove_conflicting_framebuffers(aper,
>> "nouveaufb", boot);
>> -	kfree(aper);
>> -
>> -	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
>> -				  true, true, ~0ULL, &device);
>> -	if (ret)
>> -		return ret;
>> -
>> -	pci_set_master(pdev);
>> -
>> -	if (nouveau_atomic)
>> -		driver_pci.driver_features |= DRIVER_ATOMIC;
>> -
>> -	ret = drm_get_pci_dev(pdev, pent, &driver_pci);
>> -	if (ret) {
>> -		nvkm_device_del(&device);
>> -		return ret;
>> -	}
>> -
>> -	return 0;
>> -}
>> -
>>  static int
>> -nouveau_drm_load(struct drm_device *dev, unsigned long flags)
>> +nouveau_drm_device_init(struct drm_device *dev)
>>  {
>>  	struct nouveau_drm *drm;
>>  	int ret;
>> @@ -613,7 +546,7 @@ nouveau_drm_load(struct drm_device *dev, unsigned long
>> flags)
>>  }
>>  
>>  static void
>> -nouveau_drm_unload(struct drm_device *dev)
>> +nouveau_drm_device_fini(struct drm_device *dev)
>>  {
>>  	struct nouveau_drm *drm = nouveau_drm(dev);
>>  
>> @@ -642,18 +575,116 @@ nouveau_drm_unload(struct drm_device *dev)
>>  	kfree(drm);
>>  }
>>  
>> +static int nouveau_drm_probe(struct pci_dev *pdev,
>> +			     const struct pci_device_id *pent)
>> +{
>> +	struct nvkm_device *device;
>> +	struct drm_device *drm_dev;
>> +	struct apertures_struct *aper;
>> +	bool boot = false;
>> +	int ret;
>> +
>> +	if (vga_switcheroo_client_probe_defer(pdev))
>> +		return -EPROBE_DEFER;
>> +
>> +	/* We need to check that the chipset is supported before booting
>> +	 * fbdev off the hardware, as there's no way to put it back.
>> +	 */
>> +	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0,
>> &device);
>> +	if (ret)
>> +		return ret;
>> +
>> +	nvkm_device_del(&device);
>> +
>> +	/* Remove conflicting drivers (vesafb, efifb etc). */
>> +	aper = alloc_apertures(3);
>> +	if (!aper)
>> +		return -ENOMEM;
>> +
>> +	aper->ranges[0].base = pci_resource_start(pdev, 1);
>> +	aper->ranges[0].size = pci_resource_len(pdev, 1);
>> +	aper->count = 1;
>> +
>> +	if (pci_resource_len(pdev, 2)) {
>> +		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
>> +		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
>> +		aper->count++;
>> +	}
>> +
>> +	if (pci_resource_len(pdev, 3)) {
>> +		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
>> +		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
>> +		aper->count++;
>> +	}
>> +
>> +#ifdef CONFIG_X86
>> +	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
>> +#endif
>> +	if (nouveau_modeset != 2)
>> +		drm_fb_helper_remove_conflicting_framebuffers(aper,
>> "nouveaufb", boot);
>> +	kfree(aper);
>> +
>> +	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
>> +				  true, true, ~0ULL, &device);
>> +	if (ret)
>> +		return ret;
>> +
>> +	pci_set_master(pdev);
>> +
>> +	if (nouveau_atomic)
>> +		driver_pci.driver_features |= DRIVER_ATOMIC;
>> +
>> +	drm_dev = drm_dev_alloc(&driver_pci, &pdev->dev);
>> +	if (IS_ERR(drm_dev)) {
>> +		ret = PTR_ERR(drm_dev);
>> +		goto fail_nvkm;
>> +	}
>> +
>> +	ret = pci_enable_device(pdev);
>> +	if (ret)
>> +		goto fail_drm;
>> +
>> +	drm_dev->pdev = pdev;
>> +	pci_set_drvdata(pdev, drm_dev);
>> +
>> +	ret = nouveau_drm_device_init(drm_dev);
>> +	if (ret)
>> +		goto fail_pci;
>> +
>> +	ret = drm_dev_register(drm_dev, pent->driver_data);
>> +	if (ret)
>> +		goto fail_drm_dev_init;
>> +
>> +	return 0;
>> +
>> +fail_drm_dev_init:
>> +	nouveau_drm_device_fini(drm_dev);
>> +fail_pci:
>> +	pci_disable_device(pdev);
>> +fail_drm:
>> +	drm_dev_put(drm_dev);
>> +fail_nvkm:
>> +	nvkm_device_del(&device);
>> +	return ret;
>> +}
>> +
>>  void
>>  nouveau_drm_device_remove(struct drm_device *dev)
>>  {
>> +	struct pci_dev *pdev = dev->pdev;
>>  	struct nouveau_drm *drm = nouveau_drm(dev);
>>  	struct nvkm_client *client;
>>  	struct nvkm_device *device;
>>  
>> +	drm_dev_unregister(dev);
>> +
>>  	dev->irq_enabled = false;
>>  	client = nvxx_client(&drm->client.base);
>>  	device = nvkm_device_find(client->device);
>> -	drm_put_dev(dev);
>>  
>> +	nouveau_drm_device_fini(dev);
>> +	pci_disable_device(pdev);
>> +	drm_dev_put(dev);
>>  	nvkm_device_del(&device);
>>  }
>>  
>> @@ -1020,8 +1051,6 @@ driver_stub = {
>>  		DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME | DRIVER_RENDER |
>>  		DRIVER_KMS_LEGACY_CONTEXT,
>>  
>> -	.load = nouveau_drm_load,
>> -	.unload = nouveau_drm_unload,
>>  	.open = nouveau_drm_open,
>>  	.postclose = nouveau_drm_postclose,
>>  	.lastclose = nouveau_vga_lastclose,
>> -------------------------------------------------------------------------------
>>
>>
>> Git bisection log:
>>
>> -------------------------------------------------------------------------------
>> git bisect start
>> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
>> git bisect good 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
>> # bad: [4c92b7b3080d8281941ae81c51cd62bb49bdc3d4] Add linux-next specific files for 20181206
>> git bisect bad 4c92b7b3080d8281941ae81c51cd62bb49bdc3d4
>> # bad: [c38239b4be1ac7e4bcf5bbd971353bae51525b8f] Merge branch 'parisc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
>> git bisect bad c38239b4be1ac7e4bcf5bbd971353bae51525b8f
>> # good: [d49f8a52b15bf35db778035340d8a673149f9f93] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>> git bisect good d49f8a52b15bf35db778035340d8a673149f9f93
>> # good: [ac747c0715f29c2be3848b719a1b7e65b07f7b21] Merge tag 'kbuild-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
>> git bisect good ac747c0715f29c2be3848b719a1b7e65b07f7b21
>> # bad: [46972c03ab667dc298cad0c9db517fb9b1521b5f] Merge tag 'drm-misc-next-fixes-2018-10-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
>> git bisect bad 46972c03ab667dc298cad0c9db517fb9b1521b5f
>> # good: [6ac99a328ee16d3f8cc253f1df62623cee3e9ea5] drm/exynos: mixer: Make plane alpha configurable
>> git bisect good 6ac99a328ee16d3f8cc253f1df62623cee3e9ea5
>> # good: [0957dc7097a3f462f6cedb45cf9b9785cc29e5bb] drm/amdgpu: revert "stop using gart_start as offset for the GTT domain"
>> git bisect good 0957dc7097a3f462f6cedb45cf9b9785cc29e5bb
>> # good: [2de0b0a158bf423208c3898522c8fa1c1078df48] Merge tag 'drm/tegra/for-4.20-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
>> git bisect good 2de0b0a158bf423208c3898522c8fa1c1078df48
>> # good: [04b96b63c5640a305e30611def7a9c5fcd7a72cf] drm/msm/dpu: Remove unneeded checks in dpu_crtc.c
>> git bisect good 04b96b63c5640a305e30611def7a9c5fcd7a72cf
>> # good: [6952e3a1dffcb931cf8625aa01642b9afac2af61] Merge branch 'for-upstream/mali-dp' of git://linux-arm.org/linux-ld into drm-next
>> git bisect good 6952e3a1dffcb931cf8625aa01642b9afac2af61
>> # good: [62e681f7dcab746412dce22d4b75b32c5ea38cdb] Merge tag 'drm-msm-fixes-2018-10-09' of git://people.freedesktop.org/~robclark/linux into drm-next
>> git bisect good 62e681f7dcab746412dce22d4b75b32c5ea38cdb
>> # bad: [7e6191d4360a2df6cf2a2613dcb79680cb943df8] Merge branch 'linux-4.20' of git://github.com/skeggsb/linux into drm-next
>> git bisect bad 7e6191d4360a2df6cf2a2613dcb79680cb943df8
>> # good: [c4cee69a4497d9c6ad8868d63568b30e50cac9e9] drm/nouveau: Fix potential memory leak in nouveau_drm_load()
>> git bisect good c4cee69a4497d9c6ad8868d63568b30e50cac9e9
>> # bad: [a971558c298755d2c07bc5508c65d689471763c8] drm/nouveau/disp: keep track of high-speed state, program into clock
>> git bisect bad a971558c298755d2c07bc5508c65d689471763c8
>> # bad: [4126b99e744b7a29746e201e2be6644d2edf3c56] drm/nouveau/disp: add a way to configure scrambling/tmds for hdmi 2.0
>> git bisect bad 4126b99e744b7a29746e201e2be6644d2edf3c56
>> # bad: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
>> git bisect bad cfea88a4d86632f28cf80be97079f131645b7869
>> # first bad commit: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
>> -------------------------------------------------------------------------------
Mark Brown Dec. 10, 2018, 2:25 p.m. UTC | #3
On Mon, Dec 10, 2018 at 10:00:08AM +0000, Guillaume Tucker wrote:
> On 08/12/2018 00:08, Lyude Paul wrote:
> > uhhhhhhhhhhhhh
> > didn't we fix this weeks ago? with "drm/nouveau: tegra: Call
> > nouveau_drm_device_init()"
> 
> Yes here's the fix from Thierry:
> 
>   https://patchwork.freedesktop.org/patch/263587/
> 
> 
> and I can confirm that it does fix the Oops when applied on top
> of next-20181206 (what I used for the bisection last week):
> 
>   http://lava.baylibre.com:10080/scheduler/job/71109
> 
> 
> However the fix doesn't appear to have been applied in any
> upstream tree yet.

This has been broken for a considerable time now with no response from
Ben - is there some other path we can use to get the fix merged?
Thierry Reding Dec. 10, 2018, 4:26 p.m. UTC | #4
On Mon, Dec 10, 2018 at 02:25:59PM +0000, Mark Brown wrote:
> On Mon, Dec 10, 2018 at 10:00:08AM +0000, Guillaume Tucker wrote:
> > On 08/12/2018 00:08, Lyude Paul wrote:
> > > uhhhhhhhhhhhhh
> > > didn't we fix this weeks ago? with "drm/nouveau: tegra: Call
> > > nouveau_drm_device_init()"
> > 
> > Yes here's the fix from Thierry:
> > 
> >   https://patchwork.freedesktop.org/patch/263587/
> > 
> > 
> > and I can confirm that it does fix the Oops when applied on top
> > of next-20181206 (what I used for the bisection last week):
> > 
> >   http://lava.baylibre.com:10080/scheduler/job/71109
> > 
> > 
> > However the fix doesn't appear to have been applied in any
> > upstream tree yet.
> 
> This has been broken for a considerable time now with no response from
> Ben - is there some other path we can use to get the fix merged?

I suppose we could go directly via Dave. But Ben's usually pretty
responsive, so he probably just missed it. Let me ping him on IRC, maybe
that'll get his attention.

Thierry
Mark Brown Dec. 11, 2018, 12:51 a.m. UTC | #5
On Mon, Dec 10, 2018 at 05:26:22PM +0100, Thierry Reding wrote:
> On Mon, Dec 10, 2018 at 02:25:59PM +0000, Mark Brown wrote:

> > This has been broken for a considerable time now with no response from
> > Ben - is there some other path we can use to get the fix merged?

> I suppose we could go directly via Dave. But Ben's usually pretty
> responsive, so he probably just missed it. Let me ping him on IRC, maybe
> that'll get his attention.

This is at least the third go at reporting this as a boot failure IIRC
so these clearly aren't doing the trick :/ .  Perhaps a resend as well?

Patch

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 905956809d21..2b2baf6e0e0d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -458,75 +458,8 @@  nouveau_accel_init(struct nouveau_drm *drm)
 	nouveau_bo_move_init(drm);
 }
 
-static int nouveau_drm_probe(struct pci_dev *pdev,
-			     const struct pci_device_id *pent)
-{
-	struct nvkm_device *device;
-	struct apertures_struct *aper;
-	bool boot = false;
-	int ret;
-
-	if (vga_switcheroo_client_probe_defer(pdev))
-		return -EPROBE_DEFER;
-
-	/* We need to check that the chipset is supported before booting
-	 * fbdev off the hardware, as there's no way to put it back.
-	 */
-	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0, &device);
-	if (ret)
-		return ret;
-
-	nvkm_device_del(&device);
-
-	/* Remove conflicting drivers (vesafb, efifb etc). */
-	aper = alloc_apertures(3);
-	if (!aper)
-		return -ENOMEM;
-
-	aper->ranges[0].base = pci_resource_start(pdev, 1);
-	aper->ranges[0].size = pci_resource_len(pdev, 1);
-	aper->count = 1;
-
-	if (pci_resource_len(pdev, 2)) {
-		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
-		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
-		aper->count++;
-	}
-
-	if (pci_resource_len(pdev, 3)) {
-		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
-		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
-		aper->count++;
-	}
-
-#ifdef CONFIG_X86
-	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
-#endif
-	if (nouveau_modeset != 2)
-		drm_fb_helper_remove_conflicting_framebuffers(aper, "nouveaufb", boot);
-	kfree(aper);
-
-	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
-				  true, true, ~0ULL, &device);
-	if (ret)
-		return ret;
-
-	pci_set_master(pdev);
-
-	if (nouveau_atomic)
-		driver_pci.driver_features |= DRIVER_ATOMIC;
-
-	ret = drm_get_pci_dev(pdev, pent, &driver_pci);
-	if (ret) {
-		nvkm_device_del(&device);
-		return ret;
-	}
-
-	return 0;
-}
-
 static int
-nouveau_drm_load(struct drm_device *dev, unsigned long flags)
+nouveau_drm_device_init(struct drm_device *dev)
 {
 	struct nouveau_drm *drm;
 	int ret;
@@ -613,7 +546,7 @@  nouveau_drm_load(struct drm_device *dev, unsigned long flags)
 }
 
 static void
-nouveau_drm_unload(struct drm_device *dev)
+nouveau_drm_device_fini(struct drm_device *dev)
 {
 	struct nouveau_drm *drm = nouveau_drm(dev);
 
@@ -642,18 +575,116 @@  nouveau_drm_unload(struct drm_device *dev)
 	kfree(drm);
 }
 
+static int nouveau_drm_probe(struct pci_dev *pdev,
+			     const struct pci_device_id *pent)
+{
+	struct nvkm_device *device;
+	struct drm_device *drm_dev;
+	struct apertures_struct *aper;
+	bool boot = false;
+	int ret;
+
+	if (vga_switcheroo_client_probe_defer(pdev))
+		return -EPROBE_DEFER;
+
+	/* We need to check that the chipset is supported before booting
+	 * fbdev off the hardware, as there's no way to put it back.
+	 */
+	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0, &device);
+	if (ret)
+		return ret;
+
+	nvkm_device_del(&device);
+
+	/* Remove conflicting drivers (vesafb, efifb etc). */
+	aper = alloc_apertures(3);
+	if (!aper)
+		return -ENOMEM;
+
+	aper->ranges[0].base = pci_resource_start(pdev, 1);
+	aper->ranges[0].size = pci_resource_len(pdev, 1);
+	aper->count = 1;
+
+	if (pci_resource_len(pdev, 2)) {
+		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
+		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
+		aper->count++;
+	}
+
+	if (pci_resource_len(pdev, 3)) {
+		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
+		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
+		aper->count++;
+	}
+
+#ifdef CONFIG_X86
+	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
+#endif
+	if (nouveau_modeset != 2)
+		drm_fb_helper_remove_conflicting_framebuffers(aper, "nouveaufb", boot);
+	kfree(aper);
+
+	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
+				  true, true, ~0ULL, &device);
+	if (ret)
+		return ret;
+
+	pci_set_master(pdev);
+
+	if (nouveau_atomic)
+		driver_pci.driver_features |= DRIVER_ATOMIC;
+
+	drm_dev = drm_dev_alloc(&driver_pci, &pdev->dev);
+	if (IS_ERR(drm_dev)) {
+		ret = PTR_ERR(drm_dev);
+		goto fail_nvkm;
+	}
+
+	ret = pci_enable_device(pdev);
+	if (ret)
+		goto fail_drm;
+
+	drm_dev->pdev = pdev;
+	pci_set_drvdata(pdev, drm_dev);
+
+	ret = nouveau_drm_device_init(drm_dev);
+	if (ret)
+		goto fail_pci;
+
+	ret = drm_dev_register(drm_dev, pent->driver_data);
+	if (ret)
+		goto fail_drm_dev_init;
+
+	return 0;
+
+fail_drm_dev_init:
+	nouveau_drm_device_fini(drm_dev);
+fail_pci:
+	pci_disable_device(pdev);
+fail_drm:
+	drm_dev_put(drm_dev);
+fail_nvkm:
+	nvkm_device_del(&device);
+	return ret;
+}
+
 void
 nouveau_drm_device_remove(struct drm_device *dev)
 {
+	struct pci_dev *pdev = dev->pdev;
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct nvkm_client *client;
 	struct nvkm_device *device;
 
+	drm_dev_unregister(dev);
+
 	dev->irq_enabled = false;
 	client = nvxx_client(&drm->client.base);
 	device = nvkm_device_find(client->device);
-	drm_put_dev(dev);
 
+	nouveau_drm_device_fini(dev);
+	pci_disable_device(pdev);
+	drm_dev_put(dev);
 	nvkm_device_del(&device);
 }
 
@@ -1020,8 +1051,6 @@  driver_stub = {
 		DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME | DRIVER_RENDER |
 		DRIVER_KMS_LEGACY_CONTEXT,
 
-	.load = nouveau_drm_load,
-	.unload = nouveau_drm_unload,
 	.open = nouveau_drm_open,
 	.postclose = nouveau_drm_postclose,
 	.lastclose = nouveau_vga_lastclose,
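
A note on the error handling in the new nouveau_drm_probe() above: each
failure label undoes only the steps that had already succeeded, in
reverse order of setup. The following is a minimal user-space sketch of
that goto-unwind pattern, not kernel code. The names probe_sketch(),
step(), note(), trace and enum stage are hypothetical and exist only for
this illustration; the strings recorded in trace merely stand in for
calls such as drm_dev_alloc(), pci_enable_device(),
nouveau_drm_device_init() and drm_dev_register().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stages, loosely named after the steps in the patch. */
enum stage {
	STAGE_NONE,	/* nothing fails */
	STAGE_ALLOC,	/* stands in for drm_dev_alloc() */
	STAGE_ENABLE,	/* stands in for pci_enable_device() */
	STAGE_INIT,	/* stands in for nouveau_drm_device_init() */
	STAGE_REGISTER	/* stands in for drm_dev_register() */
};

/* Records the order in which setup and teardown steps ran. */
static char trace[128];

static void note(const char *s)
{
	strncat(trace, s, sizeof(trace) - strlen(trace) - 1);
}

/* A step fails with -1 if it is the stage selected to fail. */
static int step(enum stage s, enum stage fail_at, const char *name)
{
	if (s == fail_at)
		return -1;
	note(name);
	return 0;
}

static int probe_sketch(enum stage fail_at)
{
	int ret;

	trace[0] = '\0';
	note("nvkm ");	/* device creation, assumed to succeed here */

	ret = step(STAGE_ALLOC, fail_at, "alloc ");
	if (ret)
		goto fail_nvkm;

	ret = step(STAGE_ENABLE, fail_at, "enable ");
	if (ret)
		goto fail_drm;

	ret = step(STAGE_INIT, fail_at, "init ");
	if (ret)
		goto fail_pci;

	ret = step(STAGE_REGISTER, fail_at, "register ");
	if (ret)
		goto fail_init;

	return 0;

	/* Labels fall through, so teardown runs in reverse setup order. */
fail_init:
	note("-init ");
fail_pci:
	note("-enable ");
fail_drm:
	note("-alloc ");
fail_nvkm:
	note("-nvkm ");
	return ret;
}
```

Calling probe_sketch(STAGE_REGISTER) simulates a failure at the
registration step; trace then shows init, enable, alloc and nvkm being
undone in that order, mirroring the fail_drm_dev_init, fail_pci,
fail_drm and fail_nvkm labels in the patch.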