[4] mce: acpi/apei: Add a sysctl to control page offlining on firmware report

Message ID 20130702125137.7388.97225.stgit@localhost.localdomain (mailing list archive)
State Not Applicable, archived

Commit Message

Naveen N. Rao July 2, 2013, 12:54 p.m. UTC
I am adding another patch here to disable page offlining in case the firmware
starts acting up.

Thanks,
Naveen

--

Add a sysctl memory_failure_soft_offline to control what is done on receipt of
firmware ghes notification for a corrected error. By default, kernel tries
to soft-offline the page immediately. If set to 0, no action is taken.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 Documentation/sysctl/vm.txt |   12 ++++++++++++
 include/linux/mm.h          |    1 +
 kernel/sysctl.c             |    9 +++++++++
 mm/memory-failure.c         |   10 +++++++---
 4 files changed, 29 insertions(+), 3 deletions(-)
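
For illustration, a minimal user-space sketch of how the proposed knob could be read and toggled. It assumes the sysctl appears as /proc/sys/vm/memory_failure_soft_offline, which would only be true on a kernel carrying this patch. The helper names read_sysctl_int and write_sysctl_int are hypothetical, and writing the real file would need root.

```c
#include <stdio.h>

/* Assumed path: present only on a kernel that carries this patch. */
#define SOFT_OFFLINE_SYSCTL "/proc/sys/vm/memory_failure_soft_offline"

/* Read an integer value from a sysctl-style file. Returns 0 on success, -1 on error. */
int read_sysctl_int(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Write an integer value to a sysctl-style file. Returns 0 on success, -1 on error. */
int write_sysctl_int(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = (fprintf(f, "%d\n", value) > 0);
	/* The data is flushed on close, so a rejected write surfaces here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

With these helpers, write_sysctl_int(SOFT_OFFLINE_SYSCTL, 0) would ask the kernel to ignore firmware-requested soft-offlining, and write_sysctl_int(SOFT_OFFLINE_SYSCTL, 1) would restore the default.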


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Borislav Petkov July 3, 2013, 2:46 p.m. UTC | #1
On Tue, Jul 02, 2013 at 06:24:00PM +0530, Naveen N. Rao wrote:
> I am adding another patch here to disable page offlining in case the firmware
> starts acting up.
> 
> Thanks,
> Naveen
> 
> --
> 
> Add a sysctl memory_failure_soft_offline to control what is done on receipt of
> firmware ghes notification for a corrected error. By default, kernel tries
> to soft-offline the page immediately. If set to 0, no action is taken.

What is the rationale for that? Are we adding it just in case, as a
chicken bit or do you have a specific case?

If the second, we'd love to hear about it in the commit message. :)

Thanks.

Naveen N. Rao July 3, 2013, 3:46 p.m. UTC | #2
On 07/03/2013 08:16 PM, Borislav Petkov wrote:
> On Tue, Jul 02, 2013 at 06:24:00PM +0530, Naveen N. Rao wrote:
>> I am adding another patch here to disable page offlining in case the firmware
>> starts acting up.
>>
>> Thanks,
>> Naveen
>>
>> --
>>
>> Add a sysctl memory_failure_soft_offline to control what is done on receipt of
>> firmware ghes notification for a corrected error. By default, kernel tries
>> to soft-offline the page immediately. If set to 0, no action is taken.
>
> What is the rationale for that? Are we adding it just in case, as a
> chicken bit or do you have a specific case?
>
> If the second, we'd love to hear about it in the commit message. :)

Nope, this is a just-in-case thing. I think you or Tony asked to have 
this in a previous discussion so that we're covered if firmware starts 
acting up. Other than that, I'm ok if this is left out.


Thanks,
Naveen

Tony Luck July 8, 2013, 8:26 p.m. UTC | #3
> Nope, this is a just-in-case thing. I think you or Tony asked to have
> this in a previous discussion so that we're covered if firmware starts
> acting up. Other than that, I'm ok if this is left out.

I'm struggling to think of a case where this would help.  It implies that
we are on a running system, and we somehow notice that the BIOS is
telling us to take some pages offline - and that we know better than the
BIOS that we'd like to just ignore any more such messages from the BIOS.

But we still leave the BIOS in charge of logging the errors and keeping
track of the thresholds.

I'm happy with just the acpi=nocmcff to avoid a BIOS that does weird
stuff.  Or do you think we might still have to deal with a string of APEI
messages?

-Tony

Naveen N. Rao July 10, 2013, 9:17 a.m. UTC | #4
On 07/09/2013 01:56 AM, Luck, Tony wrote:
> I'm happy with just the acpi=nocmcff to avoid a BIOS that does weird
> stuff.  Or do you think we might still have to deal with a string of APEI
> messages?

Agreed - and I don't think this patch can help with a string of APEI 
messages either. So yes, I think we can leave this out for now.

Thanks,
Naveen

Patch

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index dcc75a9..6d0fcba 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -375,6 +375,18 @@  Enable memory failure recovery (when supported by the platform)
 
 ==============================================================
 
+memory_failure_soft_offline
+
+Control soft-offlining of pages on receipt of appropriate firmware error
+report through GHES. Note that this does not affect user-space initiated
+soft-offlining.
+
+1: Attempt soft-offlining.
+
+0: No action.
+
+==============================================================
+
 min_free_kbytes:
 
 This is used to force the Linux VM to keep a minimum number
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 958e9efd..2c16ca4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1791,6 +1791,7 @@  extern void memory_failure_queue(unsigned long pfn, int trapno, int flags);
 extern int unpoison_memory(unsigned long pfn);
 extern int sysctl_memory_failure_early_kill;
 extern int sysctl_memory_failure_recovery;
+extern int sysctl_memory_failure_soft_offline;
 extern void shake_page(struct page *p, int access);
 extern atomic_long_t num_poisoned_pages;
 extern int soft_offline_page(struct page *page, int flags);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b0a1f99..cc4b794 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1427,6 +1427,15 @@  static struct ctl_table vm_table[] = {
 		.extra1		= &zero,
 		.extra2		= &one,
 	},
+	{
+		.procname	= "memory_failure_soft_offline",
+		.data		= &sysctl_memory_failure_soft_offline,
+		.maxlen		= sizeof(sysctl_memory_failure_soft_offline),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
 #endif
 	{
 		.procname	= "user_reserve_kbytes",
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 0d6717e..ec4851c 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -61,6 +61,8 @@  int sysctl_memory_failure_early_kill __read_mostly = 0;
 
 int sysctl_memory_failure_recovery __read_mostly = 1;
 
+int sysctl_memory_failure_soft_offline __read_mostly = 1;
+
 atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
 #if defined(CONFIG_HWPOISON_INJECT) || defined(CONFIG_HWPOISON_INJECT_MODULE)
@@ -1286,9 +1288,11 @@  static void memory_failure_work_func(struct work_struct *work)
 		spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
 		if (!gotten)
 			break;
-		if (entry.flags & MF_SOFT_OFFLINE)
-			soft_offline_page(pfn_to_page(entry.pfn), entry.flags);
-		else
+		if (entry.flags & MF_SOFT_OFFLINE) {
+			if (sysctl_memory_failure_soft_offline)
+				soft_offline_page(pfn_to_page(entry.pfn),
+						entry.flags);
+		} else
 			memory_failure(entry.pfn, entry.trapno, entry.flags);
 	}
 }
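
The gating added to memory_failure_work_func() can be modelled in user space roughly as follows. This is only a sketch: the enum mf_action, dispatch() and set_soft_offline() are hypothetical names, the MF_SOFT_OFFLINE value is illustrative, and the range check assumes that proc_dointvec_minmax refuses out-of-range writes rather than clamping them.

```c
/* Illustrative flag value; the kernel defines its own in include/linux/mm.h. */
#define MF_SOFT_OFFLINE (1 << 3)

/* Hypothetical labels for the three outcomes of one queued entry. */
enum mf_action {
	MF_ACT_NONE,            /* entry dropped, page left alone */
	MF_ACT_SOFT_OFFLINE,    /* stands in for soft_offline_page() */
	MF_ACT_MEMORY_FAILURE   /* stands in for memory_failure() */
};

/* Models sysctl_memory_failure_soft_offline, which defaults to 1. */
static int soft_offline_enabled = 1;

/* Models the [0, 1] bound given by .extra1/.extra2 in the ctl_table entry. */
int set_soft_offline(int value)
{
	if (value < 0 || value > 1)
		return -1;	/* out of range: keep the old setting */
	soft_offline_enabled = value;
	return 0;
}

/* Mirrors the patched branch: the knob gates only the soft-offline path. */
enum mf_action dispatch(int flags)
{
	if (flags & MF_SOFT_OFFLINE) {
		if (soft_offline_enabled)
			return MF_ACT_SOFT_OFFLINE;
		return MF_ACT_NONE;
	}
	return MF_ACT_MEMORY_FAILURE;
}
```

Note that with the knob at 0 a soft-offline request is dropped without any action, while entries lacking MF_SOFT_OFFLINE still go to memory_failure() regardless of the setting.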