
[v7,04/25] ACPI / APEI: Make hest.c manage the estatus memory pool

Message ID 20181203180613.228133-5-james.morse@arm.com (mailing list archive)
State New, archived
Series APEI in_nmi() rework and SDEI wire-up

Commit Message

James Morse Dec. 3, 2018, 6:05 p.m. UTC
ghes.c has a memory pool it uses for the estatus cache and the estatus
queue. The cache is initialised when registering the platform driver.
For the queue, an NMI-like notification has to grow/shrink the pool
as it is registered and unregistered.

This is all pretty noisy when adding new NMI-like notifications; it
would be better to replace this with a static pool size based on the
number of users.

As a precursor, move the call that creates the pool from ghes_init(),
into hest.c. Later this will take the number of ghes entries and
consolidate the queue allocations.
Remove ghes_estatus_pool_exit() as hest.c doesn't have anywhere to put
this.

The pool is now initialised as part of ACPI's subsys_initcall():
(acpi_init(), acpi_scan_init(), acpi_pci_root_init(), acpi_hest_init())
Before this patch it happened later as a GHES specific device_initcall().

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 33 ++++++---------------------------
 drivers/acpi/apei/hest.c |  5 +++++
 include/acpi/ghes.h      |  2 ++
 3 files changed, 13 insertions(+), 27 deletions(-)

Comments

Borislav Petkov Dec. 11, 2018, 4:48 p.m. UTC | #1
On Mon, Dec 03, 2018 at 06:05:52PM +0000, James Morse wrote:
> ghes.c has a memory pool it uses for the estatus cache and the estatus
> queue. The cache is initialised when registering the platform driver.
> For the queue, an NMI-like notification has to grow/shrink the pool
> as it is registered and unregistered.
> 
> This is all pretty noisy when adding new NMI-like notifications, it
> would be better to replace this with a static pool size based on the
> number of users.
> 
> As a precursor, move the call that creates the pool from ghes_init(),
> into hest.c. Later this will take the number of ghes entries and
> consolidate the queue allocations.
> Remove ghes_estatus_pool_exit() as hest.c doesn't have anywhere to put
> this.
> 
> The pool is now initialised as part of ACPI's subsys_initcall():
> (acpi_init(), acpi_scan_init(), acpi_pci_root_init(), acpi_hest_init())
> Before this patch it happened later as a GHES specific device_initcall().
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 33 ++++++---------------------------
>  drivers/acpi/apei/hest.c |  5 +++++
>  include/acpi/ghes.h      |  2 ++
>  3 files changed, 13 insertions(+), 27 deletions(-)

...

> diff --git a/drivers/acpi/apei/hest.c b/drivers/acpi/apei/hest.c
> index b1e9f81ebeea..da5fabaeb48f 100644
> --- a/drivers/acpi/apei/hest.c
> +++ b/drivers/acpi/apei/hest.c
> @@ -32,6 +32,7 @@
>  #include <linux/io.h>
>  #include <linux/platform_device.h>
>  #include <acpi/apei.h>
> +#include <acpi/ghes.h>
>  
>  #include "apei-internal.h"
>  
> @@ -200,6 +201,10 @@ static int __init hest_ghes_dev_register(unsigned int ghes_count)
>  	if (!ghes_arr.ghes_devs)
>  		return -ENOMEM;
>  
> +	rc = ghes_estatus_pool_init();
> +	if (rc)
> +		goto out;

Right, this happens before...

> +
>  	rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);

... this but do we even want to do any memory allocations if we don't
have any HEST tables or we've been disabled by hest_disable?

IOW, we should swap those two calls, methinks.

James Morse Dec. 14, 2018, 1:56 p.m. UTC | #2
Hi Boris,

On 11/12/2018 16:48, Borislav Petkov wrote:
> On Mon, Dec 03, 2018 at 06:05:52PM +0000, James Morse wrote:
>> ghes.c has a memory pool it uses for the estatus cache and the estatus
>> queue. The cache is initialised when registering the platform driver.
>> For the queue, an NMI-like notification has to grow/shrink the pool
>> as it is registered and unregistered.
>>
>> This is all pretty noisy when adding new NMI-like notifications, it
>> would be better to replace this with a static pool size based on the
>> number of users.
>>
>> As a precursor, move the call that creates the pool from ghes_init(),
>> into hest.c. Later this will take the number of ghes entries and
>> consolidate the queue allocations.
>> Remove ghes_estatus_pool_exit() as hest.c doesn't have anywhere to put
>> this.
>>
>> The pool is now initialised as part of ACPI's subsys_initcall():
>> (acpi_init(), acpi_scan_init(), acpi_pci_root_init(), acpi_hest_init())
>> Before this patch it happened later as a GHES specific device_initcall().

>> diff --git a/drivers/acpi/apei/hest.c b/drivers/acpi/apei/hest.c
>> index b1e9f81ebeea..da5fabaeb48f 100644
>> --- a/drivers/acpi/apei/hest.c
>> +++ b/drivers/acpi/apei/hest.c
>> @@ -32,6 +32,7 @@
>>  #include <linux/io.h>
>>  #include <linux/platform_device.h>
>>  #include <acpi/apei.h>
>> +#include <acpi/ghes.h>
>>  
>>  #include "apei-internal.h"
>>  
>> @@ -200,6 +201,10 @@ static int __init hest_ghes_dev_register(unsigned int ghes_count)
>>  	if (!ghes_arr.ghes_devs)
>>  		return -ENOMEM;
>>  
>> +	rc = ghes_estatus_pool_init();
>> +	if (rc)
>> +		goto out;
> 
> Right, this happens before...
> 
>> +
>>  	rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
> 
> ... this but do we even want to do any memory allocations if we don't
> have any HEST tables or we've been disabled by hest_disable?

I agree we shouldn't,


> IOW, we should swap those two calls, methinks.

/me digs a bit,

ghes_estatus_pool_init() allocates memory from hest_ghes_dev_register().
Its caller is behind an 'if (!ghes_disable)' in acpi_hest_init(), and is after
another 2 calls to apei_hest_parse().

If ghes_disable is set, we don't call this thing.
If hest_disable is set, acpi_hest_init() exits early.
If we don't have a HEST table, acpi_hest_init() exits early.

... if the HEST table doesn't have any GHES entries, hest_ghes_dev_register() is
called with ghes_count==0, and does nothing useful. (kmalloc_array(0, ...),
great!) But we do call ghes_estatus_pool_init().

I think a check that ghes_count is non-zero before calling
hest_ghes_dev_register() is the cleanest way to avoid this.
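
A rough user-space model of the early exits listed above may make it easier to see which combination still reaches the pool allocation. Everything named here (`hest_model`, `pool_would_be_allocated()`, the `guard_zero_count` knob, `check_model()`) is made up for illustration and is not a kernel interface; the control flow paraphrases the description above rather than copying acpi_hest_init().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the inputs discussed above; not a kernel structure. */
struct hest_model {
	bool hest_disable;		/* HEST handling switched off entirely */
	bool have_hest_table;		/* firmware provided a HEST table */
	bool ghes_disable;		/* firmware-first notifications ignored */
	unsigned int ghes_count;	/* GHES entries found in the table */
};

/*
 * Returns true if this model would reach ghes_estatus_pool_init().
 * guard_zero_count selects whether the proposed "ghes_count != 0" check
 * is in place before the register step.
 */
bool pool_would_be_allocated(const struct hest_model *m, bool guard_zero_count)
{
	if (m->hest_disable)
		return false;	/* acpi_hest_init() exits early */
	if (!m->have_hest_table)
		return false;	/* acpi_hest_init() exits early */
	if (m->ghes_disable)
		return false;	/* the register step is skipped */
	if (guard_zero_count && m->ghes_count == 0)
		return false;	/* proposed check: nothing to register */

	return true;		/* pool init is reached */
}

/* The case from the thread: a HEST table with no GHES entries. */
void check_model(void)
{
	struct hest_model m = {
		.hest_disable = false,
		.have_hest_table = true,
		.ghes_disable = false,
		.ghes_count = 0,
	};

	assert(pool_would_be_allocated(&m, false));	/* without the check */
	assert(!pool_would_be_allocated(&m, true));	/* with the check */
}
```

In this model the only path that allocates a pool nobody uses is the zero-count one, which is why a single check before the register step would be enough.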

I wanted the estatus pool to be initialised before creating the platform devices
in case the order of these things is changed in the future and they get probed
immediately, before the pool is initialised.


Thanks,

James

Borislav Petkov Dec. 19, 2018, 2:42 p.m. UTC | #3
On Fri, Dec 14, 2018 at 01:56:16PM +0000, James Morse wrote:
> /me digs a bit,
> 
> ghes_estatus_pool_init() allocates memory from hest_ghes_dev_register().
> Its caller is behind a 'if (!ghes_disable)' in acpi_hest_init(), and is after
> another 2 calls to apei_hest_parse().
> 
> If ghes_disable is set, we don't call this thing.
> If hest_disable is set, acpi_hest_init() exits early.
> If we don't have a HEST table, acpi_hest_init() exits early.
> 
> ... if the HEST table doesn't have any GHES entries, hest_ghes_dev_register() is
> called with ghes_count==0, and does nothing useful. (kmalloc_array(0, ...),
> great!) But we do call ghes_estatus_pool_init().
> 
> I think a check that ghes_count is non-zero before calling
> hest_ghes_dev_register() is the cleanest way to avoid this.

Grrr, what an effing mess that code is! There's hest_disable *and*
ghes_disable. Do we really need them both?

With my simplifier hat on I wanna say, we should have a single switch -
apei_disable - and kill those other two. What a damn mess that is.

> I wanted the estatus pool to be initialised before creating the platform devices
> in case the order of these things is changed in the future and they get probed
> immediately, before the pool is initialised.

Hmmm.

Actually, I meant flipping those two calls:

        rc = ghes_estatus_pool_init(ghes_count);
        if (rc)
                goto out;

        rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
        if (rc)
                goto err;

to

        rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
        if (rc)
                goto err;

        rc = ghes_estatus_pool_init(ghes_count);
        if (rc)
                goto out;

so as not to alloc the pool unnecessarily if the parsing fails.

Also, AFAICT, the order you have them in now might be a problem anyway
if

	apei_hest_parse(hest_parse_ghes, &ghes_arr);

fails because then you goto err and that pool leaks, right?
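
The difference between the two orders can be sketched with a small user-space model. This is only an illustration: `reg_state`, `model_pool_init()`, `model_parse()` and the two `register_*` functions are hypothetical stand-ins, and the model assumes that `out` releases the temporary array while `err` only undoes device registration before jumping to `out`, so neither label frees the pool.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for the model; not kernel state. */
struct reg_state {
	bool pool_allocated;	/* stands in for the estatus pool */
	bool array_allocated;	/* stands in for ghes_arr.ghes_devs */
};

static int model_pool_init(struct reg_state *s)
{
	s->pool_allocated = true;
	return 0;
}

/* Stand-in for apei_hest_parse(); fails on request. */
static int model_parse(bool fail)
{
	return fail ? -1 : 0;
}

/* Order as posted: pool first, then parse. */
int register_pool_first(struct reg_state *s, bool parse_fails)
{
	int rc;

	s->array_allocated = true;

	rc = model_pool_init(s);
	if (rc)
		goto out;

	rc = model_parse(parse_fails);
	if (rc)
		goto err;
out:
	s->array_allocated = false;	/* the array is always released */
	return rc;
err:
	/* devices would be unregistered here; nothing frees the pool */
	goto out;
}

/* Order as suggested: parse first, then pool. */
int register_parse_first(struct reg_state *s, bool parse_fails)
{
	int rc;

	s->array_allocated = true;

	rc = model_parse(parse_fails);
	if (rc)
		goto err;

	rc = model_pool_init(s);
out:
	s->array_allocated = false;
	return rc;
err:
	goto out;
}

/* Compare what is still allocated after a failed parse in each order. */
void check_orders(void)
{
	struct reg_state a = { false, false }, b = { false, false };
	int rc_a = register_pool_first(&a, true);
	int rc_b = register_parse_first(&b, true);

	assert(rc_a != 0 && a.pool_allocated);	/* pool outlives the failure */
	assert(rc_b != 0 && !b.pool_allocated);	/* pool was never created */
}
```

With a failing parse, the first order returns an error while the pool stays allocated; the second never creates it.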

Thx.

James Morse Jan. 10, 2019, 6:20 p.m. UTC | #4
Hi Boris,

On 19/12/2018 14:42, Borislav Petkov wrote:
> On Fri, Dec 14, 2018 at 01:56:16PM +0000, James Morse wrote:
>> /me digs a bit,
>>
>> ghes_estatus_pool_init() allocates memory from hest_ghes_dev_register().
>> Its caller is behind a 'if (!ghes_disable)' in acpi_hest_init(), and is after
>> another 2 calls to apei_hest_parse().
>>
>> If ghes_disable is set, we don't call this thing.
>> If hest_disable is set, acpi_hest_init() exits early.
>> If we don't have a HEST table, acpi_hest_init() exits early.
>>
>> ... if the HEST table doesn't have any GHES entries, hest_ghes_dev_register() is
>> called with ghes_count==0, and does nothing useful. (kmalloc_array(0, ...),
>> great!) But we do call ghes_estatus_pool_init().
>>
>> I think a check that ghes_count is non-zero before calling
>> hest_ghes_dev_register() is the cleanest way to avoid this.
> 
> Grrr, what an effing mess that code is! There's hest_disable *and*
> ghes_disable. Do we really need them both?

ghes_disable lets you ignore the firmware-first notifications, but still 'use'
the other error sources:
drivers/pci/pcie/aer.c picks out the three AER types, and uses apei_hest_parse()
to know if firmware is controlling AER, even if ghes_disable is set.

x86's arch_apei_enable_cmcff() looks like it disables MCE to get firmware to
handle them. hest_disable would stop this, but instead ghes_disable keeps that,
and stops the NOTIFY_NMI being registered.


> With my simplifier hat on I wanna say, we should have a single switch -
> apei_disable - and kill those other two. What a damn mess that is.

(do you consider cmdline arguments as ABI, or hard to justify and hard to remove?)

I don't think it's broken enough to justify ripping them out. A user of
ghes_disable would be someone with broken firmware-first handling of AER. They
need to know firmware is changing the register values behind their back (so need
to parse the HEST), but want to ignore the junk notifications. It doesn't sound
like an unlikely scenario.


>> I wanted the estatus pool to be initialised before creating the platform devices
>> in case the order of these things is changed in the future and they get probed
>> immediately, before the pool is initialised.
> 
> Hmmm.
> 
> Actually, I meant flipping those two calls:
> 
>         rc = ghes_estatus_pool_init(ghes_count);
>         if (rc)
>                 goto out;
> 
>         rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
>         if (rc)
>                 goto err;
> 
> to
> 
>         rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
>         if (rc)
>                 goto err;
> 
>         rc = ghes_estatus_pool_init(ghes_count);
>         if (rc)
>                 goto out;
> 
> so as not to alloc the pool unnecessarily if the parsing fails.
> 
> Also, AFAICT, the order you have them in now might be a problem anyway
> if
> 
> 	apei_hest_parse(hest_parse_ghes, &ghes_arr);
> 
> fails because then you goto err and that pool leaks, right?

Right, yes. I've been ignoring errors like this on the probe path as it implies
you've got busted ACPI tables, or so little memory you're never going to make it
to user-space. I was more worried about ghes_probe() trying to use the pool
memory before it's been allocated. It doesn't seem right to register the device if
the driver wouldn't work yet. But one is a subsys_initcall() and the driver's is a
device_initcall(), which is obvious enough.
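
The ordering argument can be illustrated with a toy user-space runner. The level numbers follow include/linux/init.h (subsys_initcall() is level 4, device_initcall() is level 6); the `model_*` names and the table are invented for this sketch and only mimic the boot-time behaviour of running lower levels first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Initcall levels as numbered in include/linux/init.h (subset). */
enum model_level {
	LEVEL_SUBSYS = 4,	/* subsys_initcall() */
	LEVEL_DEVICE = 6,	/* device_initcall() */
};

static bool pool_ready;		/* set by the HEST step */
static bool probe_ran;		/* set by the GHES driver step */

/* Stands in for the HEST init path, which now creates the pool. */
static void model_hest_init(void)
{
	pool_ready = true;
}

/* Stands in for the GHES driver; it must not run before the pool exists. */
static void model_ghes_init(void)
{
	assert(pool_ready);
	probe_ran = true;
}

struct model_initcall {
	enum model_level level;
	void (*fn)(void);
};

/* Deliberately listed out of order; the runner goes by level. */
static const struct model_initcall calls[] = {
	{ LEVEL_DEVICE, model_ghes_init },
	{ LEVEL_SUBSYS, model_hest_init },
};

/* Run every registered call, lowest level first. */
bool run_model_initcalls(void)
{
	int level;
	size_t i;

	pool_ready = false;
	probe_ran = false;

	for (level = 0; level <= 7; level++)
		for (i = 0; i < sizeof(calls) / sizeof(calls[0]); i++)
			if ((int)calls[i].level == level)
				calls[i].fn();

	return probe_ran;
}
```

Because the pool is created at the lower level, the driver step always finds it in place, regardless of the order in which the two entries are listed.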

Fixed.


Thanks,

James

Patch

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index c15264f2dc4b..78058adb2574 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -162,26 +162,16 @@  static void ghes_iounmap_irq(void)
 	clear_fixmap(FIX_APEI_GHES_IRQ);
 }
 
-static int ghes_estatus_pool_init(void)
+static int ghes_estatus_pool_expand(unsigned long len); //temporary
+
+int ghes_estatus_pool_init(void)
 {
 	ghes_estatus_pool = gen_pool_create(GHES_ESTATUS_POOL_MIN_ALLOC_ORDER, -1);
 	if (!ghes_estatus_pool)
 		return -ENOMEM;
-	return 0;
-}
 
-static void ghes_estatus_pool_free_chunk(struct gen_pool *pool,
-					      struct gen_pool_chunk *chunk,
-					      void *data)
-{
-	vfree((void *)chunk->start_addr);
-}
-
-static void ghes_estatus_pool_exit(void)
-{
-	gen_pool_for_each_chunk(ghes_estatus_pool,
-				ghes_estatus_pool_free_chunk, NULL);
-	gen_pool_destroy(ghes_estatus_pool);
+	return ghes_estatus_pool_expand(GHES_ESTATUS_CACHE_AVG_SIZE *
+					GHES_ESTATUS_CACHE_ALLOCED_MAX);
 }
 
 static int ghes_estatus_pool_expand(unsigned long len)
@@ -1225,18 +1215,9 @@  static int __init ghes_init(void)
 
 	ghes_nmi_init_cxt();
 
-	rc = ghes_estatus_pool_init();
-	if (rc)
-		goto err;
-
-	rc = ghes_estatus_pool_expand(GHES_ESTATUS_CACHE_AVG_SIZE *
-				      GHES_ESTATUS_CACHE_ALLOCED_MAX);
-	if (rc)
-		goto err_pool_exit;
-
 	rc = platform_driver_register(&ghes_platform_driver);
 	if (rc)
-		goto err_pool_exit;
+		goto err;
 
 	rc = apei_osc_setup();
 	if (rc == 0 && osc_sb_apei_support_acked)
@@ -1249,8 +1230,6 @@  static int __init ghes_init(void)
 		pr_info(GHES_PFX "Failed to enable APEI firmware first mode.\n");
 
 	return 0;
-err_pool_exit:
-	ghes_estatus_pool_exit();
 err:
 	return rc;
 }
diff --git a/drivers/acpi/apei/hest.c b/drivers/acpi/apei/hest.c
index b1e9f81ebeea..da5fabaeb48f 100644
--- a/drivers/acpi/apei/hest.c
+++ b/drivers/acpi/apei/hest.c
@@ -32,6 +32,7 @@ 
 #include <linux/io.h>
 #include <linux/platform_device.h>
 #include <acpi/apei.h>
+#include <acpi/ghes.h>
 
 #include "apei-internal.h"
 
@@ -200,6 +201,10 @@  static int __init hest_ghes_dev_register(unsigned int ghes_count)
 	if (!ghes_arr.ghes_devs)
 		return -ENOMEM;
 
+	rc = ghes_estatus_pool_init();
+	if (rc)
+		goto out;
+
 	rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
 	if (rc)
 		goto err;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 82cb4eb225a4..46ef5566e052 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -52,6 +52,8 @@  enum {
 	GHES_SEV_PANIC = 0x3,
 };
 
+int ghes_estatus_pool_init(void);
+
 /* From drivers/edac/ghes_edac.c */
 
 #ifdef CONFIG_EDAC_GHES