diff mbox series

[v2,1/5] mm/memory-failure: try to send SIGBUS even if unmap failed

Message ID 20240510062602.901510-2-jane.chu@oracle.com (mailing list archive)
State New
Series Enhance soft hwpoison handling and injection

Commit Message

Jane Chu May 10, 2024, 6:25 a.m. UTC
For years, when a process had to be killed due to hwpoison,
SIGBUS was delivered only if the unmap succeeded; otherwise
SIGKILL was delivered. The reason was to prevent the involved
process from accessing the hwpoisoned page again.

A lot has changed since then: a hwpoisoned page is now marked, and
when it is re-accessed the memory-failure handler invokes
kill_accessing_process() to kill the process immediately.
So take out the '!unmap_success' factor and try to deliver
SIGBUS if possible.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
---
 mm/memory-failure.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)
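
For illustration only, the signal choice before and after this change
can be modelled in plain userspace C. This is a sketch, not kernel code:
pick_signal_old(), pick_signal_new() and model_self_check() are
hypothetical names, a return value of 0 stands for "no signal sent", and
the model assumes that the non-SIGKILL branch of kill_procs() attempts
SIGBUS, as the commit message describes.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>

/*
 * Hypothetical model of kill_procs() before the patch: a failed unmap
 * ("fail") or an unresolved address (-EFAULT) forces SIGKILL.
 */
static int pick_signal_old(bool forcekill, bool fail, unsigned long addr)
{
	if (!forcekill)
		return 0;	/* list is only freed, nothing is sent */
	if (fail || addr == (unsigned long)-EFAULT)
		return SIGKILL;
	return SIGBUS;		/* assumed: the other branch tries SIGBUS */
}

/*
 * Hypothetical model after the patch: only an unresolved address
 * forces SIGKILL; a failed unmap no longer does.
 */
static int pick_signal_new(bool forcekill, unsigned long addr)
{
	if (!forcekill)
		return 0;
	if (addr == (unsigned long)-EFAULT)
		return SIGKILL;
	return SIGBUS;
}

/* The one case whose outcome changes: unmap failed, address known. */
static void model_self_check(void)
{
	assert(pick_signal_old(true, true, 0x1000UL) == SIGKILL);
	assert(pick_signal_new(true, 0x1000UL) == SIGBUS);
}
```

Note that in hwpoison_user_mappings() forcekill still includes
'!unmap_success', so a failed unmap continues to trigger a kill attempt;
only the signal chosen for it changes.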

Comments

Miaohe Lin May 11, 2024, 7:01 a.m. UTC | #1
On 2024/5/10 14:25, Jane Chu wrote:
> For years when it comes down to kill a process due to hwpoison,
> a SIGBUS is delivered only if unmap has been successful.
> Otherwise, a SIGKILL is delivered. And the reason for that is
> to prevent the involved process from accessing the hwpoisoned
> page again.
> 
> Since then a lot has changed, a hwpoisoned page is marked and
> upon being re-accessed, the memory-failure handler invokes
> kill_accessing_process() to kill the process immediately.
> So let's take out the '!unmap_success' factor and try to deliver
> SIGBUS if possible.
> 
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>

LGTM. Thanks.
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
Patch

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 16ada4fb02b7..739311e121af 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -514,22 +514,15 @@  void add_to_kill_ksm(struct task_struct *tsk, struct page *p,
  *
  * Only do anything when FORCEKILL is set, otherwise just free the
  * list (this is used for clean pages which do not need killing)
- * Also when FAIL is set do a force kill because something went
- * wrong earlier.
  */
-static void kill_procs(struct list_head *to_kill, int forcekill, bool fail,
+static void kill_procs(struct list_head *to_kill, int forcekill,
 		unsigned long pfn, int flags)
 {
 	struct to_kill *tk, *next;
 
 	list_for_each_entry_safe(tk, next, to_kill, nd) {
 		if (forcekill) {
-			/*
-			 * In case something went wrong with munmapping
-			 * make sure the process doesn't catch the
-			 * signal and then access the memory. Just kill it.
-			 */
-			if (fail || tk->addr == -EFAULT) {
+			if (tk->addr == -EFAULT) {
 				pr_err("%#lx: forcibly killing %s:%d because of failure to unmap corrupted page\n",
 				       pfn, tk->tsk->comm, tk->tsk->pid);
 				do_send_sig_info(SIGKILL, SEND_SIG_PRIV,
@@ -1660,7 +1653,7 @@  static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
 	 */
 	forcekill = folio_test_dirty(folio) || (flags & MF_MUST_KILL) ||
 		    !unmap_success;
-	kill_procs(&tokill, forcekill, !unmap_success, pfn, flags);
+	kill_procs(&tokill, forcekill, pfn, flags);
 
 	return unmap_success;
 }
@@ -1724,7 +1717,7 @@  static void unmap_and_kill(struct list_head *to_kill, unsigned long pfn,
 		unmap_mapping_range(mapping, start, size, 0);
 	}
 
-	kill_procs(to_kill, flags & MF_MUST_KILL, false, pfn, flags);
+	kill_procs(to_kill, flags & MF_MUST_KILL, pfn, flags);
 }
 
 /*