diff mbox

[1/2] x86: Set up resources correctly on Hyper-V Generation 2

Message ID 1471547575-14748-1-git-send-email-mawilcox@linuxonhyperv.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Matthew Wilcox Aug. 18, 2016, 7:12 p.m. UTC
From: Matthew Wilcox <mawilcox@microsoft.com>

The Generation 2 Hyper-V virtual machine does not emulate PCI.
This check causes the call to pcibios_resource_survey() to be skipped,
and pcibios_resource_survey() calls e820_reserve_resources_late(), which
is where PMEM resources are added to the resource tree.  With this patch,
the PMEM devices now show up.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
---
 arch/x86/pci/common.c | 5 -----
 1 file changed, 5 deletions(-)

Comments

Dan Williams Aug. 18, 2016, 5:31 p.m. UTC | #1
On Thu, Aug 18, 2016 at 12:12 PM, Matthew Wilcox
<mawilcox@linuxonhyperv.com> wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
>
> The Generation 2 Hyper-V virtual machine does not emulate PCI.
> This check causes the call to pcibios_resource_survey() to be skipped,
> and pcibios_resource_survey() calls e820_reserve_resources_late(), which
> is where PMEM resources are added to the resource tree.  With this patch,
> the PMEM devices now show up.
>
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> ---
>  arch/x86/pci/common.c | 5 -----
>  1 file changed, 5 deletions(-)
>
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 7b6a9d1..d39e799 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -516,11 +516,6 @@ void __init pcibios_set_cache_line_size(void)
>
>  int __init pcibios_init(void)
>  {
> -       if (!raw_pci_ops && !raw_pci_ext_ops) {
> -               printk(KERN_WARNING "PCI: System does not support PCI\n");
> -               return 0;
> -       }
> -
>         pcibios_set_cache_line_size();
>         pcibios_resource_survey();

Is this for the memmap= kernel command line option or the ACPI NFIT
method for defining NVDIMM resources?
Matthew Wilcox Aug. 18, 2016, 5:50 p.m. UTC | #2
This is the memmap= kernel command line option.

-----Original Message-----
From: Dan Williams [mailto:dan.j.williams@intel.com] 
Sent: Thursday, August 18, 2016 1:32 PM
To: Matthew Wilcox <mawilcox@linuxonhyperv.com>
Cc: X86 ML <x86@kernel.org>; linux-kernel@vger.kernel.org; linux-nvdimm@lists.01.org; Matthew Wilcox <mawilcox@microsoft.com>
Subject: Re: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

Is this for the memmap= kernel command line option or the ACPI NFIT
method for defining NVDIMM resources?
Dan Williams Aug. 18, 2016, 6:02 p.m. UTC | #3
On Thu, Aug 18, 2016 at 10:50 AM, Matthew Wilcox <mawilcox@microsoft.com> wrote:
> This is the memmap= kernel command line option.

The Hyper-V ACPI tables are growing NFIT support, or I believe that
work is already underway.  I'd rather not touch the Linux PCI core to
cover for the fact that Hyper-V disables one of the kernel's debug
command line options when there's a platform standard way to
communicate these resources.
Matthew Wilcox Aug. 18, 2016, 6:05 p.m. UTC | #4
I'd rather the memmap= option works on all platforms.  Unless we're going to get rid of it entirely and exclusively use ACPI tables, I see no good reason to leave this landmine lying around for someone else to blow their own leg off.

-----Original Message-----
From: Dan Williams [mailto:dan.j.williams@intel.com] 
Sent: Thursday, August 18, 2016 2:03 PM
To: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com>; X86 ML <x86@kernel.org>; linux-kernel@vger.kernel.org; linux-nvdimm@lists.01.org
Subject: Re: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

On Thu, Aug 18, 2016 at 10:50 AM, Matthew Wilcox <mawilcox@microsoft.com> wrote:
> This is the memmap= kernel command line option.

The Hyper-V ACPI tables are growing NFIT support, or I believe that work is already underway.  I'd rather not touch the Linux PCI core to cover for the fact that Hyper-V disables one of the kernel's debug command line options when there's a platform standard way to communicate these resources.
Dan Williams Aug. 18, 2016, 6:15 p.m. UTC | #5
On Thu, Aug 18, 2016 at 11:05 AM, Matthew Wilcox <mawilcox@microsoft.com> wrote:
> I'd rather the memmap= option works on all platforms.  Unless we're going to get rid of it entirely and exclusively use ACPI tables, I see no good reason to leave this landmine lying around for someone else to blow their own leg off.
>

memmap= is already a landmine: https://lkml.org/lkml/2016/6/28/819

Killing memmap= in favor of NFIT is a better way to go.  I chatted
with Peter off list about this problem and he wondered about
re-purposing using the mBFT mechanism to define memory ranges:

http://omniboot.org/txt/syslinux/memdisk.txt
Matthew Wilcox Aug. 18, 2016, 6:26 p.m. UTC | #6
*Until* such a mechanism is in, I see no reason to not keep feature parity for all platforms.  This is the exact "You should boil the ocean" response that Arjan famously complained about at Kernel Summit a few years ago.

-----Original Message-----
From: Dan Williams [mailto:dan.j.williams@intel.com] 
Sent: Thursday, August 18, 2016 2:15 PM
To: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com>; X86 ML <x86@kernel.org>; linux-kernel@vger.kernel.org; linux-nvdimm@lists.01.org
Subject: Re: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

On Thu, Aug 18, 2016 at 11:05 AM, Matthew Wilcox <mawilcox@microsoft.com> wrote:
> I'd rather the memmap= option works on all platforms.  Unless we're going to get rid of it entirely and exclusively use ACPI tables, I see no good reason to leave this landmine lying around for someone else to blow their own leg off.
>

memmap= is already a landmine: https://lkml.org/lkml/2016/6/28/819

Killing memmap= in favor of NFIT is a better way to go.  I chatted with Peter off list about this problem and he wondered about re-purposing using the mBFT mechanism to define memory ranges:

http://omniboot.org/txt/syslinux/memdisk.txt
Dan Williams Aug. 18, 2016, 7:12 p.m. UTC | #7
On Thu, Aug 18, 2016 at 11:26 AM, Matthew Wilcox <mawilcox@microsoft.com> wrote:
> *Until* such a mechanism is in, I see no reason to not keep feature parity for all platforms.  This is the exact "You should boil the ocean" response that Arjan famously complained about at Kernel Summit a few years ago.

Feature parity for all platforms is the NFIT which Hyper-V is already
implementing.

Setting that aside, if we want to fix memmap= for the general case,
implementing a table like mBFT does not amount to ocean boiling in my
opinion.
Matthew Wilcox Aug. 18, 2016, 7:56 p.m. UTC | #8
Compared to a patch which removes 5 lines of code, almost any additional work is ocean-boiling.

-----Original Message-----
From: Dan Williams [mailto:dan.j.williams@intel.com] 
Sent: Thursday, August 18, 2016 3:12 PM
To: Matthew Wilcox <mawilcox@microsoft.com>
Cc: X86 ML <x86@kernel.org>; linux-kernel@vger.kernel.org; linux-nvdimm@lists.01.org
Subject: Re: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

On Thu, Aug 18, 2016 at 11:26 AM, Matthew Wilcox <mawilcox@microsoft.com> wrote:
> *Until* such a mechanism is in, I see no reason to not keep feature parity for all platforms.  This is the exact "You should boil the ocean" response that Arjan famously complained about at Kernel Summit a few years ago.

Feature parity for all platforms is the NFIT which Hyper-V is already implementing.

Setting that aside, if we want to fix memmap= for the general case, implementing a table like mBFT does not amount to ocean boiling in my opinion.
Dan Williams Aug. 18, 2016, 8:16 p.m. UTC | #9
On Thu, Aug 18, 2016 at 12:56 PM, Matthew Wilcox <mawilcox@microsoft.com> wrote:
> Compared to a patch which removes 5 lines of code, almost any additional work is ocean-boiling.
>

Did you check the state of NFIT enabling in Hyper-V?  Not patching the
Linux kernel at all is even less work.
Matthew Wilcox Aug. 18, 2016, 9:26 p.m. UTC | #10
Yes, but this actually *removes a bug* in the Linux kernel; if any memory resource is left to be set up later, it is currently not set up on x86 machines which don't have PCI busses.  That's not very many x86 systems, I'll agree, but I'm sure some enterprising person is busy creating an SoC which lacks PCI.

-----Original Message-----
From: Dan Williams [mailto:dan.j.williams@intel.com] 
Sent: Thursday, August 18, 2016 4:17 PM
To: Matthew Wilcox <mawilcox@microsoft.com>
Cc: X86 ML <x86@kernel.org>; linux-kernel@vger.kernel.org; linux-nvdimm@lists.01.org
Subject: Re: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

On Thu, Aug 18, 2016 at 12:56 PM, Matthew Wilcox <mawilcox@microsoft.com> wrote:
> Compared to a patch which removes 5 lines of code, almost any additional work is ocean-boiling.
>

Did you check the state of NFIT enabling in Hyper-V?  Not patching the Linux kernel at all is even less work.
Konrad Rzeszutek Wilk Aug. 19, 2016, 4:06 p.m. UTC | #11
On Thu, Aug 18, 2016 at 12:12:54PM -0700, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> The Generation 2 Hyper-V virtual machine does not emulate PCI.
> This check causes the call to pcibios_resource_survey() to be skipped,
> and pcibios_resource_survey() calls e820_reserve_resources_late(), which
> is where PMEM resources are added to the resource tree.  With this patch,
> the PMEM devices now show up.
> 
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> ---
>  arch/x86/pci/common.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 7b6a9d1..d39e799 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -516,11 +516,6 @@ void __init pcibios_set_cache_line_size(void)
>  
>  int __init pcibios_init(void)
>  {
> -	if (!raw_pci_ops && !raw_pci_ext_ops) {
> -		printk(KERN_WARNING "PCI: System does not support PCI\n");
> -		return 0;
> -	}

So shouldn't this be gated on whether the platform is HyperV?

> -
>  	pcibios_set_cache_line_size();
>  	pcibios_resource_survey();
>  
> -- 
> 2.8.1
> 
> _______________________________________________
> Linux-nvdimm mailing list
> Linux-nvdimm@lists.01.org
> https://lists.01.org/mailman/listinfo/linux-nvdimm
Matthew Wilcox Aug. 19, 2016, 4:46 p.m. UTC | #12
I don't think we need to gate this based on the platform.  If you look at the differences between what is done:

 - The printk is no longer called.  That's fine; we already did the printk modified in patch 2/2.  The user is aware there machine doesn't have PCI :-)

 - We set the cacheline size.  That's copying a value from the cpuinfo to a global __read_mostly variable and emitting a printk.  No big deal.

 - In pcibios_resource_survey, we iterate over three empty lists, call e820_reserve_resources_late() (which is what we want!) and call ioapic_insert_resources() (which I rather think we want too)

In short, not calling these functions is a bug.  We just haven't noticed before now because PCI is so omnipresent.

-----Original Message-----
From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com] 
Sent: Friday, August 19, 2016 12:07 PM
To: Matthew Wilcox <mawilcox@linuxonhyperv.com>
Cc: x86@kernel.org; linux-kernel@vger.kernel.org; linux-nvdimm@ml01.01.org; Matthew Wilcox <mawilcox@microsoft.com>
Subject: Re: [PATCH 1/2] x86: Set up resources correctly on Hyper-V Generation 2

On Thu, Aug 18, 2016 at 12:12:54PM -0700, Matthew Wilcox wrote:
> @@ -516,11 +516,6 @@ void __init pcibios_set_cache_line_size(void)
>  
>  int __init pcibios_init(void)
>  {
> -	if (!raw_pci_ops && !raw_pci_ext_ops) {
> -		printk(KERN_WARNING "PCI: System does not support PCI\n");
> -		return 0;
> -	}

So shouldn't this be gated on whether the platform is HyperV?

> -
>  	pcibios_set_cache_line_size();
>  	pcibios_resource_survey();
>  
> --
> 2.8.1
> 
> _______________________________________________
> Linux-nvdimm mailing list
> Linux-nvdimm@lists.01.org
> https://lists.01.org/mailman/listinfo/linux-nvdimm
diff mbox

Patch

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 7b6a9d1..d39e799 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -516,11 +516,6 @@  void __init pcibios_set_cache_line_size(void)
 
 int __init pcibios_init(void)
 {
-	if (!raw_pci_ops && !raw_pci_ext_ops) {
-		printk(KERN_WARNING "PCI: System does not support PCI\n");
-		return 0;
-	}
-
 	pcibios_set_cache_line_size();
 	pcibios_resource_survey();