diff mbox series

[3/3] acpi/hmat: Skip publishing target info for nodes with no online memory

Message ID 20190805142706.22520-4-keith.busch@intel.com (mailing list archive)
State Mainlined, archived
Series HMAT node online fixes

Commit Message

Keith Busch Aug. 5, 2019, 2:27 p.m. UTC
From: Dan Williams <dan.j.williams@intel.com>

There are multiple scenarios where the HMAT may contain information
about proximity domains that are not currently online. Rather than fail
to report any HMAT data, just elide those offline domains.

If and when those domains are later onlined they can be added to the
HMEM reporting at that point.

This was found while testing EFI_MEMORY_SP support, which reserves
"specific purpose" memory from the general allocation pool. If that
reservation results in an empty numa-node then the node is not marked
online, leading to a spurious:

    "acpi/hmat: Ignoring HMAT: Invalid table"

...result for HMAT parsing.

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/hmat/hmat.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)
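
The guard this patch moves into hmat_register_target() can be illustrated
outside the kernel. What follows is only a minimal user-space sketch, not
the kernel code: mock_pxm_to_node(), mock_node_online(), the two lookup
tables and should_register_target() are hypothetical stand-ins invented
for illustration. It shows why the NUMA_NO_NODE test comes first: the
online check is only meaningful for a valid node id.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE	(-1)	/* same value the kernel uses */
#define MOCK_MAX_PXM	4	/* hypothetical table sizes */
#define MOCK_MAX_NODES	4

/*
 * Hypothetical layout, for illustration only:
 *   PXM 0 -> node 0, online
 *   PXM 1 -> node 1, offline (e.g. all of its memory is EFI_MEMORY_SP)
 *   PXM 2, 3 -> no node assigned
 */
static const int mock_pxm_map[MOCK_MAX_PXM] = {
	0, 1, NUMA_NO_NODE, NUMA_NO_NODE
};
static const bool mock_online[MOCK_MAX_NODES] = {
	true, false, false, false
};

/* Stand-in for pxm_to_node(): unknown domains map to NUMA_NO_NODE. */
static int mock_pxm_to_node(unsigned int pxm)
{
	return pxm < MOCK_MAX_PXM ? mock_pxm_map[pxm] : NUMA_NO_NODE;
}

/* Stand-in for node_online(): only valid for a real node id. */
static bool mock_node_online(int nid)
{
	assert(nid >= 0 && nid < MOCK_MAX_NODES);
	return mock_online[nid];
}

/*
 * Mirrors the shape of the new check: skip targets whose proximity
 * domain has no node or whose node is offline, instead of treating
 * the whole table as invalid.
 */
static bool should_register_target(unsigned int memory_pxm)
{
	int nid = mock_pxm_to_node(memory_pxm);

	if (nid == NUMA_NO_NODE || !mock_node_online(nid))
		return false;
	return true;
}
```

With the hypothetical tables above, should_register_target(0) is true,
while PXM 1 (offline node) and PXM 2 (no node) are both skipped without
affecting the other targets.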

Comments

Rafael J. Wysocki Aug. 12, 2019, 8:59 a.m. UTC | #1
On Mon, Aug 5, 2019 at 4:30 PM Keith Busch <keith.busch@intel.com> wrote:
>
> From: Dan Williams <dan.j.williams@intel.com>
>
> There are multiple scenarios where the HMAT may contain information
> about proximity domains that are not currently online. Rather than fail
> to report any HMAT data just elide those offline domains.
>
> If and when those domains are later onlined they can be added to the
> HMEM reporting at that point.
>
> This was found while testing EFI_MEMORY_SP support which reserves
> "specific purpose" memory from the general allocation pool. If that
> reservation results in an empty numa-node then the node is not marked
> online leading a spurious:
>
>     "acpi/hmat: Ignoring HMAT: Invalid table"
>
> ...result for HMAT parsing.
>
> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
> Reviewed-by: Keith Busch <keith.busch@intel.com>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

When you send somebody else's patches, you should sign them off as a
rule, but since you sent this one with your own R-by, I converted that
to a S-o-b.

Rafael J. Wysocki Aug. 26, 2019, 9:05 a.m. UTC | #2
On Monday, August 12, 2019 10:59:58 AM CEST Rafael J. Wysocki wrote:
> On Mon, Aug 5, 2019 at 4:30 PM Keith Busch <keith.busch@intel.com> wrote:
> >
> > From: Dan Williams <dan.j.williams@intel.com>
> >
> > There are multiple scenarios where the HMAT may contain information
> > about proximity domains that are not currently online. Rather than fail
> > to report any HMAT data just elide those offline domains.
> >
> > If and when those domains are later onlined they can be added to the
> > HMEM reporting at that point.
> >
> > This was found while testing EFI_MEMORY_SP support which reserves
> > "specific purpose" memory from the general allocation pool. If that
> > reservation results in an empty numa-node then the node is not marked
> > online leading a spurious:
> >
> >     "acpi/hmat: Ignoring HMAT: Invalid table"
> >
> > ...result for HMAT parsing.
> >
> > Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
> > Reviewed-by: Keith Busch <keith.busch@intel.com>
> > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> When you send somebody else's patches, you should sign them off as a
> rule, but since you sent this one with your own R-by, I converted that
> to a S-o-b.
> 

And all patches in the series have been applied.

Thanks!

Dan Williams Feb. 12, 2020, 4:29 p.m. UTC | #3
On Mon, Aug 26, 2019 at 2:05 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> On Monday, August 12, 2019 10:59:58 AM CEST Rafael J. Wysocki wrote:
> > On Mon, Aug 5, 2019 at 4:30 PM Keith Busch <keith.busch@intel.com> wrote:
> > >
> > > From: Dan Williams <dan.j.williams@intel.com>
> > >
> > > There are multiple scenarios where the HMAT may contain information
> > > about proximity domains that are not currently online. Rather than fail
> > > to report any HMAT data just elide those offline domains.
> > >
> > > If and when those domains are later onlined they can be added to the
> > > HMEM reporting at that point.
> > >
> > > This was found while testing EFI_MEMORY_SP support which reserves
> > > "specific purpose" memory from the general allocation pool. If that
> > > reservation results in an empty numa-node then the node is not marked
> > > online leading a spurious:
> > >
> > >     "acpi/hmat: Ignoring HMAT: Invalid table"
> > >
> > > ...result for HMAT parsing.
> > >
> > > Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
> > > Reviewed-by: Keith Busch <keith.busch@intel.com>
> > > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >
> > When you send somebody else's patches, you should sign them off as a
> > rule, but since you sent this one with your own R-by, I converted that
> > to a S-o-b.
> >
>
> And all patches in the series have been applied.

I want to flag this patch (commit 5c7ed4385424 "HMAT: Skip publishing
target info for nodes with no online memory") for -stable to clean up a
spurious WARN_ON:

WARNING: CPU: 7 PID: 1 at drivers/base/node.c:191 node_set_perf_attrs+0x90/0xa0
CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.3.6-100.fc29.x86_64 #1
RIP: 0010:node_set_perf_attrs+0x90/0xa0
Call Trace:
 ? do_early_param+0x8e/0x8e
 hmat_init+0x2ff/0x443
 ? hmat_parse_subtable+0x55a/0x55a
 ? do_early_param+0x8e/0x8e
 do_one_initcall+0x46/0x1f4

Do you mind if I forward to stable@, or do you collect ACPI patches to
send to stable@?

Rafael J. Wysocki Feb. 12, 2020, 10:23 p.m. UTC | #4
On Wed, Feb 12, 2020 at 5:29 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Mon, Aug 26, 2019 at 2:05 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > On Monday, August 12, 2019 10:59:58 AM CEST Rafael J. Wysocki wrote:
> > > On Mon, Aug 5, 2019 at 4:30 PM Keith Busch <keith.busch@intel.com> wrote:
> > > >
> > > > From: Dan Williams <dan.j.williams@intel.com>
> > > >
> > > > There are multiple scenarios where the HMAT may contain information
> > > > about proximity domains that are not currently online. Rather than fail
> > > > to report any HMAT data just elide those offline domains.
> > > >
> > > > If and when those domains are later onlined they can be added to the
> > > > HMEM reporting at that point.
> > > >
> > > > This was found while testing EFI_MEMORY_SP support which reserves
> > > > "specific purpose" memory from the general allocation pool. If that
> > > > reservation results in an empty numa-node then the node is not marked
> > > > online leading a spurious:
> > > >
> > > >     "acpi/hmat: Ignoring HMAT: Invalid table"
> > > >
> > > > ...result for HMAT parsing.
> > > >
> > > > Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
> > > > Reviewed-by: Keith Busch <keith.busch@intel.com>
> > > > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > >
> > > When you send somebody else's patches, you should sign them off as a
> > > rule, but since you sent this one with your own R-by, I converted that
> > > to a S-o-b.
> > >
> >
> > And all patches in the series have been applied.
>
> I want to flag this patch (commit 5c7ed4385424 "HMAT: Skip publishing
> target info for nodes with no online memory")
> for -stable to cleanup a spurious WARN_ON:
>
> WARNING: CPU: 7 PID: 1 at drivers/base/node.c:191 node_set_perf_attrs+0x90/0xa0
> CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.3.6-100.fc29.x86_64 #1
> RIP: 0010:node_set_perf_attrs+0x90/0xa0
> Call Trace:
>  ? do_early_param+0x8e/0x8e
>  hmat_init+0x2ff/0x443
>  ? hmat_parse_subtable+0x55a/0x55a
>  ? do_early_param+0x8e/0x8e
>  do_one_initcall+0x46/0x1f4
>
> Do you mind if I forward to stable@, or do you collect ACPI patches to
> send to stable@?

Please forward it, thanks!

Patch

diff --git a/drivers/acpi/hmat/hmat.c b/drivers/acpi/hmat/hmat.c
index f86fe7130736..8f9a28a870b0 100644
--- a/drivers/acpi/hmat/hmat.c
+++ b/drivers/acpi/hmat/hmat.c
@@ -108,9 +108,6 @@  static __init void alloc_memory_target(unsigned int mem_pxm)
 {
 	struct memory_target *target;
 
-	if (pxm_to_node(mem_pxm) == NUMA_NO_NODE)
-		return;
-
 	target = find_mem_target(mem_pxm);
 	if (target)
 		return;
@@ -618,7 +615,16 @@  static void hmat_register_target_perf(struct memory_target *target)
 
 static void hmat_register_target(struct memory_target *target)
 {
-	if (!node_online(pxm_to_node(target->memory_pxm)))
+	int nid = pxm_to_node(target->memory_pxm);
+
+	/*
+	 * Skip offline nodes. This can happen when memory
+	 * marked EFI_MEMORY_SP, "specific purpose", is applied
+	 * to all the memory in a proximity domain leading to
+	 * the node being marked offline / unplugged, or if a
+	 * memory-only "hotplug" node is offline.
+	 */
+	if (nid == NUMA_NO_NODE || !node_online(nid))
 		return;
 
 	mutex_lock(&target_lock);