x86/ACPI: Make Sony Vaio Z1 series to use "reboot=pci" default

Message ID 1382597377-26797-1-git-send-email-tianyu.lan@intel.com (mailing list archive)
State Not Applicable, archived

Commit Message

lan,Tianyu Oct. 24, 2013, 6:49 a.m. UTC
From: Lan Tianyu <tianyu.lan@intel.com>

Sony Vaio Z1 series laptops require "reboot=pci" for reboot and power off.
This patch adds these machines to the quirk table and makes PCI reboot
the default.

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=61721
Reported-and-tested-by: Adam Williamson <awilliam@redhat.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 arch/x86/kernel/reboot.c |    8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Ingo Molnar Oct. 25, 2013, 10:53 a.m. UTC | #1
* tianyu.lan@intel.com <tianyu.lan@intel.com> wrote:

> From: Lan Tianyu <tianyu.lan@intel.com>
> 
> Sony Vaio Z1 series require "reboot=pci" for reboot and power off.
> This patch is to add them machines to quirk table and set pci reboot
> default.
> 
> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=61721
> Reported-and-tested-by: Adam Williamson <awilliam@redhat.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  arch/x86/kernel/reboot.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index 7e920bf..083ade7 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -382,6 +382,14 @@ static struct dmi_system_id __initdata reboot_dmi_table[] = {
>  			DMI_MATCH(DMI_PRODUCT_NAME, "C6100"),
>  		},
>  	},
> +	{	/* Handle problems with rebooting on Sony Vaio Z1 series*/
> +		.callback = set_pci_reboot,
> +		.ident = "Sony Vaio Z1",
> +		.matches = {
> +		DMI_MATCH(DMI_SYS_VENDOR, "Sony Corporation"),
> +		DMI_MATCH(DMI_PRODUCT_NAME, "VPCZ1"),
> +		},
> +	},

This is becoming somewhat endemic - do we know _why_ the ACPI reboot 
method does not work?

We reworked the x86 reboot sequence 2.5 years ago, in:

 commit 660e34cebf0a11d54f2d5dd8838607452355f321
 Author: Matthew Garrett <mjg@redhat.com>
 Date:   Mon Apr 4 13:55:05 2011 -0400

    x86: Reorder reboot method preferences
    
    We have a never ending stream of 'reboot quirks' for new boxes
    that will not reboot properly under Linux (they will hang on
    reboot).
    
    The reason is widespread 'Windows compatible' assumption of modern
    x86 hardware, which expects the following reboot sequence:
    
     - hitting the ACPI reboot vector (if available)
     - trying the keyboard controller
     - hitting the ACPI reboot vector again
     - then giving the keyboard controller one last go

    This sequence expectation gets more and more embedded in modern
    hardware, which often lacks a keyboard controller and may even
    lock up if the legacy io ports are hit - and which hardware is
    often not tested with Linux during development.
    
    The end result is that reboot works under Windows-alike OSs but not
    under Linux.
    
    Rework our reboot process to meet this hardware externality a little
    better and match this assumption of newer x86 hardware.
    
    In addition to the ACPI,kbd,ACPI,kbd sequence we'll still fall
    through to attempting a legacy triple fault if nothing else
    works - and keep trying that and the kbd reset.

    [...]
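
A minimal userspace sketch of the fallback order described in the commit message above, assuming every reset attempt fails to reset the machine. The enum, the function name and the `max` bound are hypothetical and chosen for illustration; this only models the ordering (ACPI, kbd, ACPI, kbd, then alternating triple fault and kbd) and is not the kernel's code.

```c
#include <assert.h>

/* Hypothetical names for illustration; not the kernel's identifiers. */
enum method { M_ACPI, M_KBD, M_TRIPLE };

/*
 * Record the first 'max' reset attempts into 'out', assuming that no
 * attempt ever succeeds. Returns the number of entries written.
 */
static int reboot_order(enum method start, enum method *out, int max)
{
	enum method cur = start;
	int acpi_retried = 0;
	int n = 0;

	assert(out && max >= 0);

	while (n < max) {
		out[n++] = cur;

		switch (cur) {
		case M_ACPI:
			/* ACPI reset did not take effect: try the keyboard controller. */
			cur = M_KBD;
			break;
		case M_KBD:
			/* Give ACPI exactly one more go if that is where we started. */
			if (start == M_ACPI && !acpi_retried) {
				acpi_retried = 1;
				cur = M_ACPI;
			} else {
				cur = M_TRIPLE;
			}
			break;
		case M_TRIPLE:
			/* Triple fault did not reset either: back to the kbd reset. */
			cur = M_KBD;
			break;
		}
	}
	return n;
}
```

Calling `reboot_order(M_ACPI, seq, 8)` fills `seq` with acpi, kbd, acpi, kbd, triple, kbd, triple, kbd. A quirk such as the one proposed here changes only the starting method, not the fallback chain.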

Do we know why reboot apparently works quickly enough on Windows on 
this laptop, but not under Linux? Does Windows use the ACPI reboot 
method? If yes, does it use a different pattern?

Is it all perhaps virtualization or IRQ routing related?

I.e. we really need a real analysis here, not just a quirk!

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Jones Oct. 25, 2013, 12:44 p.m. UTC | #2
On Fri, Oct 25, 2013 at 12:53:52PM +0200, Ingo Molnar wrote:

 > > Sony Vaio Z1 series require "reboot=pci" for reboot and power off.
 > > This patch is to add them machines to quirk table and set pci reboot
 > > default.
 > > 
 > > Reference: https://bugzilla.kernel.org/show_bug.cgi?id=61721
 > > Reported-and-tested-by: Adam Williamson <awilliam@redhat.com>
 > > Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
 > > ---
 > >  arch/x86/kernel/reboot.c |    8 ++++++++
 > >  1 file changed, 8 insertions(+)
 > 
 > This is becoming somewhat endemic - do we know _why_ the ACPI reboot 
 > method does not work?
 > 
 > We reworked the x86 reboot sequence 2.5 years ago, in:
 > 
 >  commit 660e34cebf0a11d54f2d5dd8838607452355f321
 >  Author: Matthew Garrett <mjg@redhat.com>
 >  Date:   Mon Apr 4 13:55:05 2011 -0400
 > 
 >     x86: Reorder reboot method preferences
 >     [...]
 > 
 > Do we know why reboot apparently works quickly enough on Windows on 
 > this laptop, but not under Linux? Does Windows use the ACPI reboot 
 > method? If yes, does it use a different pattern?
 > 
 > Is it all perhaps virtualization or IRQ routing related?
 > 
 > I.e. we really need a real analysis here, not just a quirk!

I have a few machines at home that have the same problem.
There do seem to be a number of vendor-wide issues here. Pretty much
every Vaio and every Dell seems to need working around.

I'd point at the ACPI dumps, but the machines are currently switched off
and not responding to wake-on-lan for some reason.
I'll send them when I get back from kernel summit.

	Dave

H. Peter Anvin Oct. 25, 2013, 12:48 p.m. UTC | #3
On 10/25/2013 01:44 PM, Dave Jones wrote:
> 
> I have a few machines at home that have the same problem.
> There does seem to be a number of vendor wide issues here. Pretty much
> every vaio and every Dell seems to need working around.
> 

The Dell problem I believe is understood... they are invoking the KBC as
their ACPI reboot port, and it is believed that that triggers an SMI
which invokes the BIOS, and the BIOS is broken if you behave like a
non-Windows system.

	-hpa

Linus Torvalds Oct. 25, 2013, 12:54 p.m. UTC | #4
On Fri, Oct 25, 2013 at 1:48 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> The Dell problem I believe is understood... they are invoking the KBC as
> their ACPI reboot port, and it is believed that that triggers an SMI
> which invokes the BIOS, and the BIOS is broken if you behave like a
> non-Windows system.

Well, that's presumably true of *all* the machines. Because I bet
windows boots on it. The details may matter.

The solution is to act more like Windows. There was some talk about
one likely fundamental difference being in how we enable VT-d. Maybe
we should just change that?

Seriously, if the "fix" is potentially something as simple as
disabling VT-d before reboots, let's just do it. Not add these quirks.
We have people to test a patch, but what _is_ that patch?

               Linus
Adam Williamson Oct. 25, 2013, 4:47 p.m. UTC | #5
On Fri, 2013-10-25 at 12:53 +0200, Ingo Molnar wrote:
> * tianyu.lan@intel.com <tianyu.lan@intel.com> wrote:
> 
> > From: Lan Tianyu <tianyu.lan@intel.com>
> > 
> > Sony Vaio Z1 series require "reboot=pci" for reboot and power off.
> > This patch is to add them machines to quirk table and set pci reboot
> > default.
> > 
> > Reference: https://bugzilla.kernel.org/show_bug.cgi?id=61721
> > Reported-and-tested-by: Adam Williamson <awilliam@redhat.com>
> > Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> > ---
> >  arch/x86/kernel/reboot.c |    8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> > index 7e920bf..083ade7 100644
> > --- a/arch/x86/kernel/reboot.c
> > +++ b/arch/x86/kernel/reboot.c
> > @@ -382,6 +382,14 @@ static struct dmi_system_id __initdata reboot_dmi_table[] = {
> >  			DMI_MATCH(DMI_PRODUCT_NAME, "C6100"),
> >  		},
> >  	},
> > +	{	/* Handle problems with rebooting on Sony Vaio Z1 series*/
> > +		.callback = set_pci_reboot,
> > +		.ident = "Sony Vaio Z1",
> > +		.matches = {
> > +		DMI_MATCH(DMI_SYS_VENDOR, "Sony Corporation"),
> > +		DMI_MATCH(DMI_PRODUCT_NAME, "VPCZ1"),
> > +		},
> > +	},
> 
> This is becoming somewhat endemic - do we know _why_ the ACPI reboot 
> method does not work?

I don't, but one comment I could add is that reboot _used_ to work okay
on the Z1. And I _think_ it's worked OK since the April 2011 commit you
mention (so the introduction of that didn't break it), but I can't
absolutely swear to it.

(And yes, of course, reboot from stock-installed, fully-updated Windows
- IIRC, Win7 - on the same system works OK.)
Linus Torvalds Oct. 25, 2013, 5:16 p.m. UTC | #6
On Fri, Oct 25, 2013 at 5:47 PM, Adam Williamson <awilliam@redhat.com> wrote:
>
> I don't, but one comment I could add is that reboot _used_ to work okay
> on the Z1. And I _think_ it's worked OK since the April 2011 commit you
> mention (so the introduction of that didn't break it), but I can't
> absolutely swear to it.

It might be interesting to try, and if it used to work with the acpi
method, use bisection to see when it stopped working..

               Linus
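
A minimal sketch of the bisection idea suggested above, assuming the builds between the last known good and first known bad kernel can be put in a single linear order and that every build after the first bad one is also bad. `git bisect` does the equivalent over real commit history; the names below and the breakpoint used in `example_is_bad` are hypothetical.

```c
#include <assert.h>

/* Hypothetical callback: nonzero if build 'idx' shows the slow reboot. */
typedef int (*is_bad_fn)(int idx);

/* Made-up example predicate: pretend the regression landed at build 37. */
static int example_is_bad(int idx)
{
	return idx >= 37;
}

/*
 * 'good' is a build known to reboot quickly, 'bad' one known to reboot
 * slowly, with good < bad. Returns the index of the first bad build.
 * If 'steps' is non-NULL it receives the number of builds tested.
 */
static int first_bad(int good, int bad, is_bad_fn is_bad, int *steps)
{
	int tested = 0;

	assert(good < bad && is_bad);

	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		/* Each test halves the remaining range. */
		if (is_bad(mid))
			bad = mid;
		else
			good = mid;
		tested++;
	}
	if (steps)
		*steps = tested;
	return bad;
}
```

With 100 candidate builds, `first_bad(0, 100, example_is_bad, &steps)` needs 7 test boots, which is why even a rough good/bad range is useful before starting.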
Adam Williamson Oct. 25, 2013, 5:43 p.m. UTC | #7
On Fri, 2013-10-25 at 18:16 +0100, Linus Torvalds wrote:
> On Fri, Oct 25, 2013 at 5:47 PM, Adam Williamson <awilliam@redhat.com> wrote:
> >
> > I don't, but one comment I could add is that reboot _used_ to work okay
> > on the Z1. And I _think_ it's worked OK since the April 2011 commit you
> > mention (so the introduction of that didn't break it), but I can't
> > absolutely swear to it.
> 
> It might be interesting to try, and if it used to work with the acpi
> method, use bisection to see when it stopped working..

Yeah, I knew that'd be the reply :) I'll try and dig out some old kernel
builds from Koji and see if I can at least pinpoint when it last worked,
and if it really _was_ after the ACPI method landed.
Rafael J. Wysocki Oct. 25, 2013, 7:44 p.m. UTC | #8
On Friday, October 25, 2013 10:43:00 AM Adam Williamson wrote:
> On Fri, 2013-10-25 at 18:16 +0100, Linus Torvalds wrote:
> > On Fri, Oct 25, 2013 at 5:47 PM, Adam Williamson <awilliam@redhat.com> wrote:
> > >
> > > I don't, but one comment I could add is that reboot _used_ to work okay
> > > on the Z1. And I _think_ it's worked OK since the April 2011 commit you
> > > mention (so the introduction of that didn't break it), but I can't
> > > absolutely swear to it.
> > 
> > It might be interesting to try, and if it used to work with the acpi
> > method, use bisection to see when it stopped working..
> 
> Yeah, I knew that'd be the reply :) I'll try and dig out some old kernel
> builds from Koji and see if I can at least pinpoint when it last worked,
> and if it really _was_ after the ACPI method landed.

This is a shot in the dark, but can you also please check if booting 3.12-rc6
with acpi_osi="!Windows 2012" on the affected machine makes any difference?

Rafael

Adam Williamson Oct. 26, 2013, 1:27 a.m. UTC | #9
On Fri, 2013-10-25 at 18:16 +0100, Linus Torvalds wrote:
> On Fri, Oct 25, 2013 at 5:47 PM, Adam Williamson <awilliam@redhat.com> wrote:
> >
> > I don't, but one comment I could add is that reboot _used_ to work okay
> > on the Z1. And I _think_ it's worked OK since the April 2011 commit you
> > mention (so the introduction of that didn't break it), but I can't
> > absolutely swear to it.
> 
> It might be interesting to try, and if it used to work with the acpi
> method, use bisection to see when it stopped working..

So I might have trouble bisecting, as stone-age kernels don't seem to
boot on my installed system cleanly any more; some kind of
incompatibility with my encrypted partitions. But I booted an F16 live
image, which has kernel 3.1.0, and that reboots quickly. If I'm reading
things correctly, mjg59's patch went into kernel 2.6.39, so if that was
causing the problem, I'd expect the F16 live image to have a slow
reboot.

I'll go back through the live images for F17->F19 and see when the
reboot gets slow, which will give us a very rough range to look at, at
least. I'll also try Rafael's suggestion re acpi_osi.
Adam Williamson Oct. 26, 2013, 2:20 a.m. UTC | #10
On Fri, 2013-10-25 at 18:27 -0700, Adam Williamson wrote:
> On Fri, 2013-10-25 at 18:16 +0100, Linus Torvalds wrote:
> > On Fri, Oct 25, 2013 at 5:47 PM, Adam Williamson <awilliam@redhat.com> wrote:
> > >
> > > I don't, but one comment I could add is that reboot _used_ to work okay
> > > on the Z1. And I _think_ it's worked OK since the April 2011 commit you
> > > mention (so the introduction of that didn't break it), but I can't
> > > absolutely swear to it.
> > 
> > It might be interesting to try, and if it used to work with the acpi
> > method, use bisection to see when it stopped working..
> 
> So I might have trouble bisecting, as stone-age kernels don't seem to
> boot on my installed system cleanly any more; some kind of
> incompatibility with my encrypted partitions. But I booted an F16 live
> image, which has kernel 3.1.0, and that reboots quickly. If I'm reading
> things correctly, mjg59's patch went into kernel 2.6.39, so if that was
> causing the problem, I'd expect the F16 live image to have a slow
> reboot.
> 
> I'll go back through the live images for F17->F19 and see when the
> reboot gets slow, which will give us a very rough range to look at, at
> least. I'll also try Rafael's suggestion re acpi_osi.

OK, so, findings: F16 and F17 live images reboot quickly. F18 is the
first that reboots slowly. That gives us a range of
kernel-3.3.4-5.fc17.x86_64.rpm  (the F17 release kernel - last known
good) to kernel-3.6.10-4.fc18.x86_64.rpm (the F18 release kernel - first
known bad). I'll try and see if I can find a way to narrow the range
any. Well, one report on the vaio-z mailing list -
https://lists.launchpad.net/sony-vaio-z-series/msg02868.html - indicates
the problem existed in Ubuntu kernel 3.5.0-17-generic , so that'd be a
range of 3.3.4 to 3.5.0 for the breakage.

acpi_osi="!Windows 2012" as a kernel parameter on a 3.12-rc6 kernel does
not result in a fast reboot, it's no help.
Dave Jones Oct. 26, 2013, 3:17 a.m. UTC | #11
On Fri, Oct 25, 2013 at 07:20:15PM -0700, Adam Williamson wrote:
 > OK, so, findings: F16 and F17 live images reboot quickly. F18 is the
 > first that reboots slowly.

wait, stop. what does "reboots slowly" mean?
In every one of the failures I've seen, without a quirk it means reboot
doesn't happen at all.

	Dave
Adam Williamson Oct. 26, 2013, 7:22 a.m. UTC | #12
On Fri, 2013-10-25 at 23:17 -0400, Dave Jones wrote:
> On Fri, Oct 25, 2013 at 07:20:15PM -0700, Adam Williamson wrote:
>  > OK, so, findings: F16 and F17 live images reboot quickly. F18 is the
>  > first that reboots slowly.
> 
> wait, stop. what does "reboots slowly" mean ? 
> In every one of the failure I've seen, without a quirk it means reboot
> doesn't happen at all.

If you read the original report - on Z1s, if you leave the system alone,
it sits at the 'Restarting system.' message for about two minutes, then
finally reboots. If you pass 'reboot=pci', reboot is as fast as you'd
expect.

https://bugzilla.kernel.org/show_bug.cgi?id=61721
Ingo Molnar Oct. 26, 2013, 9:15 a.m. UTC | #13
* Adam Williamson <awilliam@redhat.com> wrote:

> On Fri, 2013-10-25 at 23:17 -0400, Dave Jones wrote:
> > On Fri, Oct 25, 2013 at 07:20:15PM -0700, Adam Williamson wrote:
> >  > OK, so, findings: F16 and F17 live images reboot quickly. F18 is the
> >  > first that reboots slowly.
> > 
> > wait, stop. what does "reboots slowly" mean ? 
> > In every one of the failure I've seen, without a quirk it means reboot
> > doesn't happen at all.
> 
> If you read the original report - on Z1s, if you leave the system alone,
> it sits at the 'Restarting system.' message for about two minutes, then
> finally reboots. If you pass 'reboot=pci', reboot is as fast as you'd
> expect.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=61721

So, the best range for the breakage we have is v3.3..v3.5.

As a blind shot into the dark, that turns out to be a range when a 
number of irq-remapping, vt-d changes went upstream:

comet:~/tip> gll v3.3..v3.6 arch/x86 | grep -i remap
399988eea194 irq_remap: Fix compiler warning with CONFIG_IRQ_REMAP=y
79fec2c557cf Merge tag 'intr-remapping-ops-for-ingo' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into core/iommu
8a8f422d3b4f iommu: rename intr_remapping.[ch] to irq_remapping.[ch]
95a02e976c39 iommu: rename intr_remapping references to irq_remapping
263b5e8629c9 x86, iommu/vt-d: Clean up interfaces for interrupt remapping
5e2b930b0784 iommu/vt-d: Convert MSI remapping setup to remap_ops
9d619f657222 iommu/vt-d: Convert free_irte into a remap_ops callback
4c1bad6a0af1 iommu/vt-d: Convert IR set_affinity function to remap_ops
0c3f173a88c4 iommu/vt-d: Convert IR ioapic-setup to use remap_ops
4f3d8b67ad30 iommu/vt-d: Convert missing apic.c intr-remapping call to remap_ops
736baef4472d iommu/vt-d: Make intr-remapping initialization generic
f7219a5300ba x86: Introduce CONFIG_X86_DMA_REMAP

So if you are able to test current kernels, it might be an 
additional data point to see whether the reboot delay (which appears 
to be a reboot hang on other systems) is related to the following 
kernel option:

  CONFIG_IRQ_REMAP=y

(CONFIG_X86_DMA_REMAP is off on Fedora.)

IRQ_REMAP is somewhat of a dual personality feature, living half in 
arch/x86/, half in drivers/iommu/. It should normally matter for 
servers more than for laptops.

Btw., regarding your encrypted partitions boot problem, do you have 
any non-encrypted filesystem? If yes then you could copy /[s]bin and 
/lib to it and boot via init=/bin/bash, you ought to get a minimal 
shell and be able to run /sbin/reboot.

Thanks,

	Ingo
lan,Tianyu Oct. 26, 2013, 12:58 p.m. UTC | #14
On 10/25/2013 06:53 PM, Ingo Molnar wrote:
>
> * tianyu.lan@intel.com <tianyu.lan@intel.com> wrote:
>
>> From: Lan Tianyu <tianyu.lan@intel.com>
>>
>> Sony Vaio Z1 series require "reboot=pci" for reboot and power off.
>> This patch is to add them machines to quirk table and set pci reboot
>> default.
>>
>> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=61721
>> Reported-and-tested-by: Adam Williamson <awilliam@redhat.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>   arch/x86/kernel/reboot.c |    8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
>> index 7e920bf..083ade7 100644
>> --- a/arch/x86/kernel/reboot.c
>> +++ b/arch/x86/kernel/reboot.c
>> @@ -382,6 +382,14 @@ static struct dmi_system_id __initdata reboot_dmi_table[] = {
>>   			DMI_MATCH(DMI_PRODUCT_NAME, "C6100"),
>>   		},
>>   	},
>> +	{	/* Handle problems with rebooting on Sony Vaio Z1 series*/
>> +		.callback = set_pci_reboot,
>> +		.ident = "Sony Vaio Z1",
>> +		.matches = {
>> +		DMI_MATCH(DMI_SYS_VENDOR, "Sony Corporation"),
>> +		DMI_MATCH(DMI_PRODUCT_NAME, "VPCZ1"),
>> +		},
>> +	},
>
> This is becoming somewhat endemic - do we know _why_ the ACPI reboot
> method does not work?
>
> We reworked the x86 reboot sequence 2.5 years ago, in:
>
>   commit 660e34cebf0a11d54f2d5dd8838607452355f321
>   Author: Matthew Garrett <mjg@redhat.com>
>   Date:   Mon Apr 4 13:55:05 2011 -0400
>
>      x86: Reorder reboot method preferences
>
>      We have a never ending stream of 'reboot quirks' for new boxes
>      that will not reboot properly under Linux (they will hang on
>      reboot).
>
>      The reason is widespread 'Windows compatible' assumption of modern
>      x86 hardware, which expects the following reboot sequence:
>
>       - hitting the ACPI reboot vector (if available)
>       - trying the keyboard controller
>       - hitting the ACPI reboot vector again
>       - then giving the keyboard controller one last go
>
>      This sequence expectation gets more and more embedded in modern
>      hardware, which often lacks a keyboard controller and may even
>      lock up if the legacy io ports are hit - and which hardware is
>      often not tested with Linux during development.
>
>      The end result is that reboot works under Windows-alike OSs but not
>      under Linux.
>
>      Rework our reboot process to meet this hardware externality a little
>      better and match this assumption of newer x86 hardware.
>
>      In addition to the ACPI,kbd,ACPI,kbd sequence we'll still fall
>      through to attempting a legacy triple fault if nothing else
>      works - and keep trying that and the kbd reset.
>
>      [...]
>
> Do we know why reboot apparently works quickly enough on Windows on
> this laptop, but not under Linux? Does Windows use the ACPI reboot
> method? If yes, does it use a different pattern?
>
> Is it all perhaps virtualization or IRQ routing related?
>
> I.e. we really need a real analysis here, not just a quirk!

Sorry about this. I will do some analysis.

>
> Thanks,
>
> 	Ingo
>

Adam Williamson Oct. 26, 2013, 5:08 p.m. UTC | #15
On Sat, 2013-10-26 at 11:15 +0200, Ingo Molnar wrote:

> So, the best range for the breakage we have is v3.3..v3.5.

So far, yeah. I will try and narrow it down.

> As a blind shot into the dark, that turns out to be a range when a 
> number of irq-remapping, vt-d changes went upstream:
> 
> comet:~/tip> gll v3.3..v3.6 arch/x86 | grep -i remap
> 399988eea194 irq_remap: Fix compiler warning with CONFIG_IRQ_REMAP=y
> 79fec2c557cf Merge tag 'intr-remapping-ops-for-ingo' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into core/iommu
> 8a8f422d3b4f iommu: rename intr_remapping.[ch] to irq_remapping.[ch]
> 95a02e976c39 iommu: rename intr_remapping references to irq_remapping
> 263b5e8629c9 x86, iommu/vt-d: Clean up interfaces for interrupt remapping
> 5e2b930b0784 iommu/vt-d: Convert MSI remapping setup to remap_ops
> 9d619f657222 iommu/vt-d: Convert free_irte into a remap_ops callback
> 4c1bad6a0af1 iommu/vt-d: Convert IR set_affinity function to remap_ops
> 0c3f173a88c4 iommu/vt-d: Convert IR ioapic-setup to use remap_ops
> 4f3d8b67ad30 iommu/vt-d: Convert missing apic.c intr-remapping call to remap_ops
> 736baef4472d iommu/vt-d: Make intr-remapping initialization generic
> f7219a5300ba x86: Introduce CONFIG_X86_DMA_REMAP
> 
> So if you are able to test current kernels, it might be an 
> additional data point to see whether the reboot delay (which appears 
> to be a reboot hang on other systems) is related to the following 
> kernel option:

Thanks, I'll give it a shot. Note - I suspect some of the other cases
may really be delays rather than hangs too. It's very easy to look at
the screen sitting there doing absolutely nothing at all for 30-60
seconds, lose patience, conclude it's hung, and force a shutdown/reboot.
Several of the earlier Z1 reporters on the vaio-z mailing list reported
it as a 'hang', but after I pointed out that it wasn't, confirmed the
same behaviour on their systems. This could possibly apply to others
too.

>   CONFIG_IRQ_REMAP=y
> 
> (CONFIG_X86_DMA_REMAP is off on Fedora.)
> 
> IRQ_REMAP is somewhat of a dual personality feature, living half in 
> arch/x86/, half in drivers/iommu/. It should normally matter for 
> servers more than for laptops.
> 
> Btw., regarding your encrypted partitions boot problem, do you have 
> any non-encrypted filesystem? If yes then you could copy /[s]bin and 
> /lib to it and boot via init=/bin/bash, you ought to get a minimal 
> shell and be able to run /sbin/reboot.
> 
> Thanks,
> 
> 	Ingo
>
Linus Torvalds Oct. 26, 2013, 8 p.m. UTC | #16
On Sat, Oct 26, 2013 at 10:08 AM, Adam Williamson <awilliam@redhat.com> wrote:
>
> Thanks, I'll give it a shot. Note - I suspect some of the other cases
> may really be delays rather than hangs too. It's very easy to look at
> the screen sitting there doing absolutely nothing at all for 30-60
> seconds, lose patience, conclude it's hung, and force a shutdown/reboot.
> Several of the earlier Z1 reporters on the vaio-z mailing list reported
> it as a 'hang', but after I pointed out that it wasn't, confirmed the
> same behaviour on their systems. This could possibly apply to others
> too.

I agree that a delay of 60s may well be reported as a hang, but at
least for the Dell case I can test, the hang is definitely at least
close to infinite. Definitely longer than a couple of minutes.

So the Sony and Dell issues may be different. That said, vt-d was
suspected for both, and apparently does match your kernel versions, so
it's entirely possible that the fundamental cause is the same even if
the symptoms are slightly different.

                    Linus
Adam Williamson Oct. 27, 2013, 7:06 a.m. UTC | #17
On Sat, 2013-10-26 at 11:15 +0200, Ingo Molnar wrote:

> So if you are able to test current kernels, it might be an 
> additional data point to see whether the reboot delay (which appears 
> to be a reboot hang on other systems) is related to the following 
> kernel option:
> 
>   CONFIG_IRQ_REMAP=y

Agh, sorry - I had this down in my mind as a boot time parameter, not a
compile time option. I'm off on vacation for a week in the morning, and
it's too late to wait around for a kernel compile tonight :/ So I'll
have to check this one when I get back. Sorry again.
Joerg Roedel Nov. 1, 2013, 2:02 p.m. UTC | #18
On Sun, Oct 27, 2013 at 12:06:29AM -0700, Adam Williamson wrote:
> On Sat, 2013-10-26 at 11:15 +0200, Ingo Molnar wrote:
> >   CONFIG_IRQ_REMAP=y
> 
> Agh, sorry - I had this down in my mind as a boot time parameter, not a
> compile time option. I'm off on vacation for a week in the morning, and
> it's too late to wait around for a kernel compile tonight :/ So I'll
> have to check this one when I get back. Sorry again.

Note, instead of recompiling the kernel, you can also pass 'intremap=off'
on the kernel cmdline to disable interrupt remapping and test with that.


	Joerg


Adam Williamson Nov. 18, 2013, 11:43 p.m. UTC | #19
On Fri, 2013-11-01 at 15:02 +0100, Joerg Roedel wrote:
> On Sun, Oct 27, 2013 at 12:06:29AM -0700, Adam Williamson wrote:
> > On Sat, 2013-10-26 at 11:15 +0200, Ingo Molnar wrote:
> > >   CONFIG_IRQ_REMAP=y
> > 
> > Agh, sorry - I had this down in my mind as a boot time parameter, not a
> > compile time option. I'm off on vacation for a week in the morning, and
> > it's too late to wait around for a kernel compile tonight :/ So I'll
> > have to check this one when I get back. Sorry again.
> 
> Note, instead of recompiling the kernel, you can also pass 'intremap=off'
> on the kernel cmdline to disable interrupt remapping and test with that.

Sorry for the delay, folks - just got back to this. Booting with
'intremap=off' results in a slow reboot, i.e., doesn't fix the bug. Is
that a sufficient test, Ingo, or do you still want me to build with
CONFIG_IRQ_REMAP=n and try that?
Ingo Molnar Nov. 19, 2013, 7:03 a.m. UTC | #20
* Adam Williamson <awilliam@redhat.com> wrote:

> On Fri, 2013-11-01 at 15:02 +0100, Joerg Roedel wrote:
> > On Sun, Oct 27, 2013 at 12:06:29AM -0700, Adam Williamson wrote:
> > > On Sat, 2013-10-26 at 11:15 +0200, Ingo Molnar wrote:
> > > >   CONFIG_IRQ_REMAP=y
> > > 
> > > Agh, sorry - I had this down in my mind as a boot time 
> > > parameter, not a compile time option. I'm off on vacation for a 
> > > week in the morning, and it's too late to wait around for a 
> > > kernel compile tonight :/ So I'll have to check this one when I 
> > > get back. Sorry again.
> > 
> > Note, instead of recompiling the kernel, you can also pass 
> > 'intremap=off' on the kernel cmdline to disable interrupt 
> > remapping and test with that.
> 
> Sorry for the delay, folks - just got back to this. Booting with 
> 'intremap=off' results in a slow reboot, i.e., doesn't fix the bug. 
> Is that a sufficient test, Ingo, or do you still want me to build 
> with CONFIG_IRQ_REMAP=n and try that?

That should be a sufficient test I suspect.

Can you disable virtualization in the BIOS - does that affect reboot 
speed?

I'm just shooting into the dark here - if you can make your system 
boot bzImages then you might well be better off trying to bisect 
it.

On Fedora booting bzImages of vanilla kernels is reasonably 
straightforward: a 'make localconfig' done while you are booted into a 
Fedora kernel ought to pick up everything into your .config and you 
won't need any modules to be able to boot up to userspace. That should 
ease bisection.

Thanks,

	Ingo

Patch

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 7e920bf..083ade7 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -382,6 +382,14 @@  static struct dmi_system_id __initdata reboot_dmi_table[] = {
 			DMI_MATCH(DMI_PRODUCT_NAME, "C6100"),
 		},
 	},
+	{	/* Handle problems with rebooting on Sony Vaio Z1 series */
+		.callback = set_pci_reboot,
+		.ident = "Sony Vaio Z1",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Sony Corporation"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "VPCZ1"),
+		},
+	},
 	{ }
 };
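
A minimal userspace sketch of how a table entry like the one above selects machines, assuming (as DMI_MATCH does, to the best of my knowledge) that each string is compared as a substring of the firmware-reported field rather than exactly. The struct and function names are hypothetical stand-ins, not the kernel's DMI API, and the product names used to exercise it are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the firmware-reported DMI fields. */
struct machine_info {
	const char *sys_vendor;
	const char *product_name;
};

/* Hypothetical, simplified quirk entry: one vendor and one product string. */
struct quirk {
	const char *ident;
	const char *vendor_substr;
	const char *product_substr;
};

static const struct quirk quirks[] = {
	{ "Sony Vaio Z1", "Sony Corporation", "VPCZ1" },
};

/*
 * Return the ident of the first quirk whose strings all occur as
 * substrings of the machine's fields, or NULL if none applies.
 */
static const char *find_quirk(const struct machine_info *m)
{
	size_t i;

	assert(m && m->sys_vendor && m->product_name);

	for (i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
		const struct quirk *q = &quirks[i];

		if (strstr(m->sys_vendor, q->vendor_substr) &&
		    strstr(m->product_name, q->product_substr))
			return q->ident;
	}
	return NULL;
}
```

Under that assumption, a product name such as "VPCZ11X9E" would match the "VPCZ1" entry while "VPCZ21" would not, so the quirk covers the whole Z1 series without listing each model.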