Message ID | 20140821233435.19a9cffa@neptune.home (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
On Thu, Aug 21, 2014 at 4:34 PM, Bruno Prémont <bonbons@linux-vserver.org> wrote:

> A second step would then be to tune vgaarb's initial selection.
> Bjorn, is it possible to verify which I/O ports are decoded by a PCI
> device at the time of adding it to vgaarb? If so, how? I would like to
> check for legacy VGA I/O range (0x03B0-0x03DF) and only let vgaarb set
> a device as default if that I/O range is decoded by the device.

I don't know of a way. I'm pretty sure VGA devices are allowed to
respond to those legacy addresses even if there's no BAR for them, but
I haven't found a spec reference for this. There is the VGA Enable
bit in bridges, of course (PCI Bridge spec, sec 12.1.1). If the VGA
device is behind a bridge that doesn't have the VGA Enable bit set, it
probably isn't the default device.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Thu, 21 Aug 2014 23:39:31 -0500 Bjorn Helgaas wrote:
> On Thu, Aug 21, 2014 at 4:34 PM, Bruno Prémont wrote:
>
> > A second step would then be to tune vgaarb's initial selection.
> > Bjorn, is it possible to verify which I/O ports are decoded by a PCI
> > device at the time of adding it to vgaarb? If so, how? I would like to
> > check for legacy VGA I/O range (0x03B0-0x03DF) and only let vgaarb set
> > a device as default if that I/O range is decoded by the device.
>
> I don't know of a way. I'm pretty sure VGA devices are allowed to
> respond to those legacy addresses even if there's no BAR for them, but
> I haven't found a spec reference for this. There is the VGA Enable
> bit in bridges, of course (PCI Bridge spec, sec 12.1.1). If the VGA
> device is behind a bridge that doesn't have the VGA Enable bit set, it
> probably isn't the default device.

Those VGA devices behind bridges are the easy ones that vgaarb selects
properly.
It's the ones not behind a bridge (integrated graphics) like the intel
one that cause problems.

For Andreas's system the discrete nvidia GPU has no I/O enabled
according to PCI_COMMAND flags while the integrated intel one does have
them (that's why the Intel GPU is chosen).

Unfortunately I don't know what makes his system choke at boot time as
he did not provide logs for the failing case.

If there is no better way to detect the proper legacy VGA device the
only remaining option would be to perform the screen_info testing in
vga_arb_device_init() enclosed in arch #ifdef...

Bruno

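The two signals discussed so far (the I/O enable bit in PCI_COMMAND and the VGA Enable bit on every upstream bridge) can be combined into one check. The following is only a standalone sketch, not kernel code: `struct model_dev`, `may_decode_legacy_vga_io()` and the sample devices are hypothetical stand-ins for `struct pci_dev` and its config space, and the register contents are assumed values. Only the two bit definitions are taken from `pci_regs.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_COMMAND_IO      0x1   /* device responds to I/O space accesses */
#define PCI_COMMAND_MEMORY  0x2   /* device responds to memory space accesses */
#define PCI_BRIDGE_CTL_VGA  0x08  /* bridge forwards legacy VGA cycles */

/* Hypothetical, heavily simplified stand-in for struct pci_dev. */
struct model_dev {
	const char *name;
	uint16_t command;                        /* PCI_COMMAND register */
	uint16_t bridge_control;                 /* PCI_BRIDGE_CONTROL, bridges only */
	const struct model_dev *upstream_bridge; /* NULL when on the root bus */
};

/*
 * A device can only see legacy VGA I/O cycles if its own I/O decoding is
 * enabled and every bridge between it and the root forwards VGA cycles.
 */
bool may_decode_legacy_vga_io(const struct model_dev *dev)
{
	const struct model_dev *bridge;

	assert(dev != NULL);

	if (!(dev->command & PCI_COMMAND_IO))
		return false;

	for (bridge = dev->upstream_bridge; bridge; bridge = bridge->upstream_bridge)
		if (!(bridge->bridge_control & PCI_BRIDGE_CTL_VGA))
			return false;

	return true;
}

/* Assumed sample values, loosely modeled on the system described above. */
const struct model_dev intel_igd = {
	"intel", PCI_COMMAND_IO | PCI_COMMAND_MEMORY, 0, NULL
};
const struct model_dev root_port = {
	"bridge", PCI_COMMAND_IO | PCI_COMMAND_MEMORY, PCI_BRIDGE_CTL_VGA, NULL
};
const struct model_dev nvidia_gpu = {
	"nvidia", PCI_COMMAND_MEMORY, 0, &root_port
};
```

With these assumed values the check accepts the integrated GPU and rejects the discrete one, which is the selection reported as wrong for this machine. That is why a register-based test alone does not look sufficient here.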
On Fri, 22 August 2014 Andreas Noever <andreas.noever@gmail.com> wrote:
> > For Andreas's system the discrete nvidia GPU has no I/O enabled
> > according to PCI_COMMAND flags while the integrated intel one does have
> > them (that's why the Intel GPU is chosen).
> >
> > Unfortunately I don't know what makes his system choke at boot time as
> > he did not provide logs for the failing case.
> Attached dmesg for the failing case (obtained via ssh).
>
> Without blacklisting a small horizontal bar of vertical green bars
> appears (no x, no console).

It's good to know that it's just the graphics (console / X) that are not
displaying properly.

> If nouveau is blacklisted then I get a console, but X will not start
> (No devices found).

The console you get is EFIFB (on the nvidia GPU to which display is
routed). Here the reason why X does not start is probably that i915 did
not find its VBIOS tables nor any connected monitor and thus X thinks
"no active output => I don't start".

Though your X would be able to start if it did not find
xf86-video-intel (intel_drv.so) and/or did find/had an explicit
reference to xf86-video-fbdev (fbdev_drv.so).

If under OSX you told your system to start on intel GPU (I think there
is an option in this direction) your system would probably boot fine as
the initial choice by vgaarb would match gmux/switcheroo settings.

> If i915 is blacklisted then I do not get a console. The screen just
> freezes after a few boot messages.

This is more interesting. Initially you had efifb printing kernel logs
until nouveau gets loaded by udev and replaces efifb.
From there on possibly applegmux does not take over correctly (it may
need both i915 and nouveau active to properly route framebuffer to
panel or connector).

Though your X should be telling the same thing as for nouveau
blacklisted as nvidia GPU is not the one having boot_vga set...
If not it may be worth finding out in what state your system exactly is
with regards to graphics.

> What is vga_default_device() used for? Is it supposed to hold the
> device that is controlling the (boot) screen? Why can't we just read
> the configuration from vga_switcheroo/gmux?

For systems not using vga_switcheroo:
  vga_default_device represents the PCI GPU that was used to boot (and
  normally handles legacy VGA I/O). It's never changed after boot
  (except possibly when a GPU gets hotplugged).

For systems with vga_switcheroo:
  vga_default_device represents the active GPU (the one that would be
  handling legacy VGA I/O if used - and the one controlling the output
  connectors).
  vga_switcheroo is actively changing vga_default_device.

gmux is a driver for vga_switcheroo to perform the low-level platform
operations allowing switching (outputs) from one GPU to the other.

So a guess on my side would be that with both i915 and nouveau loaded
you may be able to get your display working if you can tell X to switch
GPU twice (and thus end up with matching vga_default_device and device
selected by gmux) - though I don't know how one asks for this switch to
happen.

> > If there is no better way to detect the proper legacy VGA device the
> > only remaining option would be to perform the screen_info testing in
> > vga_arb_device_init() enclosed in arch #ifdef...

I will propose a patch in this direction later this weekend.

Bruno

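The "screen_info testing" mentioned above amounts to asking which GPU has a memory BAR that contains the firmware framebuffer. Below is a minimal standalone sketch of that idea, under stated assumptions: `struct model_bar`, `struct model_gpu`, `gpu_owns_framebuffer()` and `pick_boot_gpu()` are hypothetical names, and the addresses are invented for illustration. In the kernel the equivalent inputs would come from `screen_info.lfb_base`, `screen_info.lfb_size` and the device's resources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BAR_COUNT 6

/* Hypothetical stand-in for one memory resource; end is inclusive. */
struct model_bar {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for a GPU and its BARs. */
struct model_gpu {
	const char *name;
	struct model_bar bar[MODEL_BAR_COUNT];
};

/* True if the firmware framebuffer lies entirely inside one BAR of gpu. */
bool gpu_owns_framebuffer(const struct model_gpu *gpu,
			  uint64_t lfb_base, uint64_t lfb_size)
{
	uint64_t lfb_end;
	size_t i;

	assert(gpu != NULL);

	if (lfb_size == 0)
		return false;	/* no firmware framebuffer reported */

	lfb_end = lfb_base + lfb_size - 1;

	for (i = 0; i < MODEL_BAR_COUNT; i++) {
		const struct model_bar *bar = &gpu->bar[i];

		if (!bar->start || !bar->end)
			continue;	/* unused BAR */
		if (lfb_base >= bar->start && lfb_end <= bar->end)
			return true;
	}
	return false;
}

/* Returns the first GPU that owns the framebuffer, or NULL if none does. */
const struct model_gpu *pick_boot_gpu(const struct model_gpu *gpus, size_t count,
				      uint64_t lfb_base, uint64_t lfb_size)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (gpu_owns_framebuffer(&gpus[i], lfb_base, lfb_size))
			return &gpus[i];
	return NULL;
}

/* Invented addresses, for illustration only. */
const struct model_gpu sample_gpus[2] = {
	{ "intel",  { { 0xb0000000u, 0xbfffffffu } } },
	{ "nvidia", { { 0xc0000000u, 0xcfffffffu } } },
};
```

With the invented addresses, a framebuffer at 0xc0010000 of size 0x7e9000 falls inside the second GPU's BAR, so that GPU would be reported as the boot device regardless of enumeration order or PCI_COMMAND state.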
On Fri, Aug 22, 2014 at 08:23:24AM +0200, Bruno Prémont wrote:
> On Thu, 21 Aug 2014 23:39:31 -0500 Bjorn Helgaas wrote:
> > On Thu, Aug 21, 2014 at 4:34 PM, Bruno Prémont wrote:
> >
> > > A second step would then be to tune vgaarb's initial selection.
> > > Bjorn, is it possible to verify which I/O ports are decoded by a PCI
> > > device at the time of adding it to vgaarb? If so, how? I would like to
> > > check for legacy VGA I/O range (0x03B0-0x03DF) and only let vgaarb set
> > > a device as default if that I/O range is decoded by the device.
> >
> > I don't know of a way. I'm pretty sure VGA devices are allowed to
> > respond to those legacy addresses even if there's no BAR for them, but
> > I haven't found a spec reference for this. There is the VGA Enable
> > bit in bridges, of course (PCI Bridge spec, sec 12.1.1). If the VGA
> > device is behind a bridge that doesn't have the VGA Enable bit set, it
> > probably isn't the default device.
>
> Those VGA devices behind bridges are the easy ones that vgaarb selects
> properly.
> It's the ones not behind a bridge (integrated graphics) like the intel
> one that cause problems.
>
> For Andreas's system the discrete nvidia GPU has no I/O enabled
> according to PCI_COMMAND flags while the integrated intel one does have
> them (that's why the Intel GPU is chosen).
>
> Unfortunately I don't know what makes his system choke at boot time as
> he did not provide logs for the failing case.

Very often when something goes wrong with a kms driver we hang while doing
the initial modeset. Which is all done while holding the console_lock
(because fbdev+vt locking is just insane). You can try to get a closer
look with I915_FBDEV=n which will avoid the console_lock, but which also
won't register the legacy/compat i915 fbdev emulation any more, so greatly
changes boot behaviour.

If that doesn't lead to clues the next approach is to "carefully"
drop&reacquire console_lock at a few "interesting" places to get a few
printks out over netconsole or similar. Or just hack up entire netconsole
logging infrastructure which bypasses printk and so all the console_lock
insanity.

It's not pretty, I know :(

Cheers, Daniel

Hi Daniel,

On Mon, 25 Aug 2014 14:16:02 +0200 Daniel Vetter wrote:
> Very often when something goes wrong with a kms driver we hang while doing
> the initial modeset. Which is all done while holding the console_lock
> (because fbdev+vt locking is just insane). You can try to get a closer
> look with I915_FBDEV=n which will avoid the console_lock, but which also
> won't register the legacy/compat i915 fbdev emulation any more, so greatly
> changes boot behaviour.
>
> If that doesn't lead to clues the next approach is to "carefully"
> drop&reacquire console_lock at a few "interesting" places to get a few
> printks out over netconsole or similar. Or just hack up entire netconsole
> logging infrastructure which bypasses printk and so all the console_lock
> insanity.

In this case it's not that bad as Andreas could send the logs for all
cases (captured via ssh). So probably console lock is not held (unless
he did have to do terminal-free ssh which I doubt).

It looks much more as if it's just the output routing that gets weird
on his Mac (or possibly any other dual-GPU MacBook where discrete GPU
is primary). Black screen but alive system :)

See follow-up posts in this thread.

If you have some uncommon or otherwise weird (EFI) multi-GPU systems
around and want to give my patches sent yesterday evening a try, you're
welcome! Some with non-Apple GPU multiplexer would be nice to have
tested as well.

The following part mentioned earlier by Andreas might be of interest to
you though (and my latest patch series should bring the improvement):

> > vga_arbiter_add_pci_device chooses intel simply because it is the
> > first device. Next pci_fixup_video(intel) sees that it is the default
> > device, sets the IORESOURCE_ROM_SHADOW flag and calls
> > vga_set_default_device again. And finally (if the check is removed)
> > pci_fixup_video(nvidia) sees that it owns the framebuffer and sets
> > itself as the default device which allows the system to boot again.
> >
> > Does setting the ROM_SHADOW flag on (possibly) the wrong device have
> > any effect?
> Yes it does. Removing the line changes a long standing
> i915 0000:00:02.0: Invalid ROM contents
> into a
> i915 0000:00:02.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment).
>
> The first is logged at KERN_ERR and the second one only at KERN_INFO.
> We are making progress.

Bruno

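The ordering problem Andreas describes can be reduced to a small model: the arbiter takes the first GPU it is told about, and the later fixup only looks at framebuffer ownership when no default exists yet. The sketch below is an assumption-laden simplification, not the kernel logic. `struct fixup_gpu`, `final_default_gpu()` and `boot_order` are hypothetical names, the IORESOURCE_ROM_SHADOW side effect is left out, and `always_check_fb` stands for removing the "no default device yet" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one entry per GPU, in PCI enumeration order. */
struct fixup_gpu {
	const char *name;
	bool owns_firmware_fb;	/* firmware framebuffer lies in one of its BARs */
};

/*
 * Returns the index of the GPU that ends up as the default device,
 * or -1 if there is no GPU at all.
 */
int final_default_gpu(const struct fixup_gpu *gpus, size_t count,
		      bool always_check_fb)
{
	int def = -1;
	size_t i;

	/* Step 1: the arbiter picks the first device it sees. */
	if (count > 0)
		def = 0;

	/* Step 2: the fixup runs once per device, in the same order. */
	for (i = 0; i < count; i++) {
		/* Without the override, the test is skipped once a default exists. */
		if ((def < 0 || always_check_fb) && gpus[i].owns_firmware_fb)
			def = (int)i;
	}

	return def;
}

/* Enumeration order as described in the thread: intel first, nvidia second. */
const struct fixup_gpu boot_order[2] = {
	{ "intel",  false },
	{ "nvidia", true  },
};
```

In this model the integrated GPU stays the default unless the framebuffer test is always performed, in which case the GPU that actually drives the firmware framebuffer wins.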
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 5b392d2..fc509d5 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -326,7 +326,10 @@ static void pci_fixup_video(struct pci_dev *pdev)
 	struct pci_bus *bus;
 	u16 config;
 
-	if (!vga_default_device()) {
+	if (!vga_default_device() || 1) {
+		/* The `|| 1` condition papers over vgaarb initial GPU selection limitation
+		 * on Apple dual-GPU systems using EFI.
+		 */
 		resource_size_t start, end;
 		int i;
 