From patchwork Tue May 21 23:54:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jane Chu
X-Patchwork-Id: 13669813
From: Jane Chu
To: linmiaohe@huawei.com, nao.horiguchi@gmail.com, akpm@linux-foundation.org,
 osalvador@suse.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 0/5] Enhance soft hwpoison handling and injection
Date: Tue, 21 May 2024 17:54:24 -0600
Message-Id: <20240521235429.2368017-1-jane.chu@oracle.com>
X-Mailer: git-send-email 2.39.3

Changes in v3:
  - rebased to mainline as of 5/20/2024
  - added an acked-by from Miaohe Lin
  - picked up a R-B from Oscar Salvador
  - fixed/clarified comments about MF_IGNORED/MF_FAILED definition and usage.
    - Oscar Salvador
  - invoke hwpoison_filter slightly earlier to avoid unnecessary THP split,
    and with refcount held.
    - Miaohe Lin
  - added comments to try_to_split_thp_page() on when not to release page
    refcount.
    - Oscar Salvador
  - added action_result() in a couple of cases, taking care not to overwrite
    the intended returns.
    - Oscar Salvador

Changes in v2:
  - rebased to mm-stable as of 5/8/2024
  - added RB by Oscar Salvador
  - comments from Oscar on patch 1-of-3: clarify changelog
  - comments from Miaohe Lin on patch 3-of-3: remove unnecessary user page
    checking and remove incorrect put_page() in kill_procs_now(). Invoke
    kill_procs_now() regardless of whether MF_ACTION_REQUIRED is set, and
    move hwpoison_filter() higher up.
  - added two patches 3-of-5 and 4-of-5

This series aims at the following enhancements:

- Let one hwpoison injector, namely madvise(MADV_HWPOISON), behave more as
  if a real UE had occurred. The other two injectors, hwpoison-inject and
  'einj' on x86, cannot, and it seems to me we need a better simulation of
  a real UE scenario.

- For years, if the kernel was unable to unmap a hwpoisoned page, it has
  sent a SIGKILL instead of SIGBUS to prevent the user process from
  potentially accessing the page again. But in doing so, the user process
  also loses important information needed for recovery: the vaddr.
  Fortunately, the kernel already has code to kill a process that
  re-accesses a hwpoisoned page, so remove the '!unmap_success' check.

- Right now, if a THP under a GUP longterm pin is hwpoisoned and the kernel
  cannot split it, memory-failure simply ignores the UE and returns.
  That's not ideal; it could deliver a SIGBUS with useful information for
  userspace recovery.

Jane Chu (5):
  mm/memory-failure: try to send SIGBUS even if unmap failed
  mm/madvise: Add MF_ACTION_REQUIRED to madvise(MADV_HWPOISON)
  mm/memory-failure: improve memory failure action_result messages
  mm/memory-failure: move hwpoison_filter() higher up
  mm/memory-failure: send SIGBUS in the event of thp split fail

 include/linux/mm.h      |   2 +
 include/ras/ras_event.h |   2 +
 mm/madvise.c            |   2 +-
 mm/memory-failure.c     | 108 +++++++++++++++++++++++++++++-----------
 4 files changed, 84 insertions(+), 30 deletions(-)
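
For illustration only, here is a rough userspace sketch of the consumer side that the points above care about: a SIGBUS handler that recovers the vaddr from siginfo, plus a thin wrapper around the madvise(MADV_HWPOISON) injector. It is not part of the series. The names mf_report, mf_decode(), mf_install_handler() and mf_inject() are hypothetical helpers made up for this example; only siginfo_t, BUS_MCEERR_AR/BUS_MCEERR_AO, si_addr, si_addr_lsb and madvise() are existing interfaces. Whether an injected error arrives as BUS_MCEERR_AR is assumed to depend on the series being applied.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical summary of a memory-failure SIGBUS; not a kernel structure. */
struct mf_report {
    int    action_required; /* 1 for BUS_MCEERR_AR, 0 for BUS_MCEERR_AO */
    void  *vaddr;           /* faulting user virtual address (si_addr) */
    size_t granule;         /* size of the poisoned range, 1 << si_addr_lsb */
};

/* Decode a siginfo_t. Returns 0 for a memory-failure SIGBUS, else -1. */
int mf_decode(const siginfo_t *si, struct mf_report *out)
{
    assert(si != NULL && out != NULL);
    if (si->si_signo != SIGBUS)
        return -1;
    if (si->si_code != BUS_MCEERR_AR && si->si_code != BUS_MCEERR_AO)
        return -1;
    out->action_required = (si->si_code == BUS_MCEERR_AR);
    out->vaddr = si->si_addr;
    out->granule = (size_t)1 << si->si_addr_lsb;
    return 0;
}

static volatile sig_atomic_t mf_seen; /* set once a report has been stored */
static struct mf_report mf_last;      /* last decoded report */

/*
 * Record the report and return. For BUS_MCEERR_AR a real program must stop
 * touching the poisoned range (for example by remapping it or jumping out
 * with siglongjmp) before resuming, otherwise the access faults again.
 */
static void mf_sigbus_handler(int sig, siginfo_t *si, void *uctx)
{
    (void)sig;
    (void)uctx;
    if (mf_decode(si, &mf_last) == 0)
        mf_seen = 1;
}

/* Install the handler with SA_SIGINFO so that si_addr is delivered. */
int mf_install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = mf_sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}

/*
 * Inject a soft hwpoison on [page, page + len). This needs CAP_SYS_ADMIN
 * and a kernel built with CONFIG_MEMORY_FAILURE; otherwise madvise()
 * returns -1 with errno set.
 */
int mf_inject(void *page, size_t len)
{
    return madvise(page, len, MADV_HWPOISON);
}
```

A test program would call mf_install_handler(), mmap() and touch one page, call mf_inject() on it, and then read the page again. If the SIGBUS is delivered as described above, mf_last.vaddr should fall within the injected page. With SIGKILL the process gets no chance to see any of this, which is the information loss the second point refers to.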