From patchwork Fri Mar 29 06:36:09 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shiyang Ruan
X-Patchwork-Id: 13610210
From: Shiyang Ruan
To: qemu-devel@nongnu.org, linux-cxl@vger.kernel.org
Cc: Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, dave@stgolabs.net, ira.weiny@intel.com, stable@vger.kernel.org
Subject: [RFC PATCH v2 1/6] cxl/core: correct length of DPA field masks
Date: Fri, 29 Mar 2024 14:36:09 +0800
Message-Id: <20240329063614.362763-2-ruansy.fnst@fujitsu.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To:
 <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
References: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
Precedence: bulk
X-Mailing-List: linux-cxl@vger.kernel.org

The Physical Address field in the General Media Event Record and the DRAM
Event Record is 64 bits wide, so the masks applied to it must be 64-bit as
well. Otherwise, the cxl_general_media and cxl_dram tracepoints mask off the
upper 32 bits of the DPA. The cxl_poison event is unaffected.

If userspace were doing its own DPA-to-HPA translation, this could lead to
incorrect page retirement decisions, but there is no known consumer (such as
rasdaemon) of these events today.
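A minimal sketch of why the mask width matters, written as plain userspace C
rather than kernel code. It uses explicitly typed 32-bit and 64-bit masks; the
DEMO_ names and the sample address are illustrative assumptions and do not
reproduce the kernel macros or their exact expansion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical demo masks; these are NOT the kernel's CXL_DPA_* macros. */
#define DEMO_FLAGS_MASK_32 ((uint32_t)0x3F)	/* mask held in 32 bits */
#define DEMO_FLAGS_MASK_64 ((uint64_t)0x3F)	/* mask held in 64 bits */

/* Strip the flag bits with a 32-bit mask: bits 63:32 are lost as well. */
static uint64_t demo_dpa_narrow(uint64_t raw)
{
	/* ~0x3F as uint32_t is 0xFFFFFFC0; widening it zero-fills the top half. */
	return raw & ~DEMO_FLAGS_MASK_32;
}

/* Strip the flag bits with a 64-bit mask: only bits 5:0 are cleared. */
static uint64_t demo_dpa_wide(uint64_t raw)
{
	return raw & ~DEMO_FLAGS_MASK_64;
}

/* Extract the low six flag bits; the mask width does not matter here. */
static uint64_t demo_dpa_flags(uint64_t raw)
{
	return raw & DEMO_FLAGS_MASK_64;
}

/* Self-check on a made-up address that has bits set above bit 31. */
static void demo_mask_self_check(void)
{
	const uint64_t raw = 0x0000001234567843ULL;

	assert(demo_dpa_flags(raw) == 0x03);
	assert(demo_dpa_wide(raw) == 0x0000001234567840ULL);
	assert(demo_dpa_narrow(raw) == 0x0000000034567840ULL);
}
```

For the sample value 0x0000001234567843, the narrow variant returns
0x34567840 while the wide variant returns 0x0000001234567840; the flag bits
(0x03) are the same in both cases.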

Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Cc:
Cc: Dan Williams
Cc: Davidlohr Bueso
Cc: Jonathan Cameron
Cc: Ira Weiny
Signed-off-by: Shiyang Ruan
---
 drivers/cxl/core/trace.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..e2d1f296df97 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,11 +253,11 @@ TRACE_EVENT(cxl_generic_event,
  * DRAM Event Record
  * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
  */
-#define CXL_DPA_FLAGS_MASK	0x3F
+#define CXL_DPA_FLAGS_MASK	0x3FULL
 #define CXL_DPA_MASK		(~CXL_DPA_FLAGS_MASK)
 
-#define CXL_DPA_VOLATILE	BIT(0)
-#define CXL_DPA_NOT_REPAIRABLE	BIT(1)
+#define CXL_DPA_VOLATILE	BIT_ULL(0)
+#define CXL_DPA_NOT_REPAIRABLE	BIT_ULL(1)
 #define show_dpa_flags(flags)	__print_flags(flags, "|", \
 	{ CXL_DPA_VOLATILE, "VOLATILE" }, \
 	{ CXL_DPA_NOT_REPAIRABLE, "NOT_REPAIRABLE" } \

From patchwork Fri Mar 29 06:36:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shiyang Ruan
X-Patchwork-Id: 13610212
From: Shiyang Ruan
To: qemu-devel@nongnu.org, linux-cxl@vger.kernel.org
Cc: Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, dave@stgolabs.net, ira.weiny@intel.com
Subject: [RFC PATCH v2 2/6] cxl/core: introduce cxl_mem_report_poison()
Date: Fri, 29 Mar 2024 14:36:10 +0800
Message-Id: <20240329063614.362763-3-ruansy.fnst@fujitsu.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
References: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
Precedence: bulk
X-Mailing-List: linux-cxl@vger.kernel.org

If poison is detected (reported by a CXL memdev), the OS should be notified
so that it can handle it. Introduce a helper for later use that:
  1. translates the DPA to an HPA;
  2. enqueues the affected pages on memory_failure's work queue.

Signed-off-by: Shiyang Ruan
---
Currently, poison injection from debugfs always creates a 64-byte record,
which is fine. But injection through QEMU's QMP API,
qmp_cxl_inject_poison(), can create a poison record with a large length,
which results in many calls to memory_failure_queue(). With
MEMORY_FAILURE_FIFO_SIZE at only 1 << 4, the FIFO does not seem large enough.
---
 drivers/cxl/core/mbox.c | 18 ++++++++++++++++++
 drivers/cxl/cxlmem.h    |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 9adda4795eb7..31b1b8711256 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1290,6 +1290,24 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
 
+void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
+			   struct cxl_region *cxlr,
+			   struct cxl_poison_record *poison)
+{
+	u64 dpa = le64_to_cpu(poison->address) & CXL_POISON_START_MASK;
+	u64 len = PAGE_ALIGN(le32_to_cpu(poison->length) * CXL_POISON_LEN_MULT);
+	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
+	unsigned long pfn = PHYS_PFN(hpa);
+	unsigned long pfn_end = pfn + len / PAGE_SIZE - 1;
+
+	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
+		return;
+
+	for (; pfn <= pfn_end; pfn++)
+		memory_failure_queue(pfn, MF_ACTION_REQUIRED);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_mem_report_poison, CXL);
+
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr)
 {
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 20fb3b35e89e..82f80eb381fb 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -828,6 +828,9 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 			    const uuid_t *uuid, union cxl_event *evt);
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
+void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
+			   struct cxl_region *cxlr,
+			   struct cxl_poison_record *poison);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);

From patchwork Fri Mar 29 06:36:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shiyang Ruan
X-Patchwork-Id: 13610202
From: Shiyang Ruan
To: qemu-devel@nongnu.org, linux-cxl@vger.kernel.org
Cc: Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, dave@stgolabs.net, ira.weiny@intel.com
Subject: [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison()
Date: Fri, 29 Mar 2024 14:36:11 +0800
Message-Id: <20240329063614.362763-4-ruansy.fnst@fujitsu.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
References: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
Precedence: bulk
X-Mailing-List: linux-cxl@vger.kernel.org

The General Media Event Record (GMER) carries only a "Physical Address"
field; it has no field that indicates the length of the poisoned range. So,
when a poison event is received, the GET_POISON_LIST command can be used to
fetch the poison list. The driver already has cxl_mem_get_poison(), so reuse
it and add a 'bool report' parameter: when it is true, each poison record is
also reported to MCE via cxl_mem_report_poison().
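When 'report' is true, each record returned by GET_POISON_LIST is handed to
cxl_mem_report_poison(), which expands it into a run of page frame numbers.
The following is a minimal userspace sketch of that arithmetic, not kernel
code. It assumes 4 KiB pages, a record length counted in 64-byte units, a
start-address mask that clears the low six bits, and an identity DPA-to-HPA
mapping; all DEMO_/demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT		12			/* assumed 4 KiB pages */
#define DEMO_PAGE_SIZE		(1ULL << DEMO_PAGE_SHIFT)
#define DEMO_POISON_LEN_MULT	64ULL			/* assumed length unit in bytes */
#define DEMO_POISON_START_MASK	(~0x3FULL)		/* assumed: drop low six bits */

/* First and last page frame covered by one poison record. */
struct demo_pfn_range {
	uint64_t first;
	uint64_t last;
};

/* Round a byte length up to a whole number of pages. */
static uint64_t demo_page_align(uint64_t len)
{
	return (len + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/* Same steps as the helper in this series: mask, scale, align, convert. */
static struct demo_pfn_range demo_poison_pfns(uint64_t address, uint32_t length)
{
	/* Identity DPA-to-HPA translation is an assumption of this sketch. */
	uint64_t hpa = address & DEMO_POISON_START_MASK;
	uint64_t len = demo_page_align((uint64_t)length * DEMO_POISON_LEN_MULT);
	struct demo_pfn_range r;

	r.first = hpa >> DEMO_PAGE_SHIFT;
	r.last = r.first + len / DEMO_PAGE_SIZE - 1;
	return r;
}

/* Number of page frames that would be queued for one record. */
static uint64_t demo_pfn_count(struct demo_pfn_range r)
{
	return r.last - r.first + 1;
}

/* Self-check: a 64-byte record and a 64 KiB record. */
static void demo_pfn_self_check(void)
{
	struct demo_pfn_range one = demo_poison_pfns(0x1003, 1);
	struct demo_pfn_range many = demo_poison_pfns(0x200000, 1024);

	assert(one.first == 1 && one.last == 1);
	assert(many.first == 0x200 && many.last == 0x20F);
	assert(demo_pfn_count(many) == 16);
}
```

Under these assumptions, a single 64-byte record still produces one page
frame, while a record of 1024 units (64 KiB) produces 16 page frames, that
is, 16 queue entries for one record.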

Signed-off-by: Shiyang Ruan
---
 drivers/cxl/core/mbox.c   | 8 ++++++--
 drivers/cxl/core/memdev.c | 4 ++--
 drivers/cxl/core/region.c | 8 ++++----
 drivers/cxl/cxlmem.h      | 2 +-
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 31b1b8711256..19b46fb06ed6 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1309,7 +1309,7 @@ void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
 EXPORT_SYMBOL_NS_GPL(cxl_mem_report_poison, CXL);
 
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr)
+		       struct cxl_region *cxlr, bool report)
 {
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_poison_out *po;
@@ -1340,10 +1340,14 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		if (rc)
 			break;
 
-		for (int i = 0; i < le16_to_cpu(po->count); i++)
+		for (int i = 0; i < le16_to_cpu(po->count); i++) {
 			trace_cxl_poison(cxlmd, cxlr, &po->record[i],
 					 po->flags, po->overflow_ts,
 					 CXL_POISON_TRACE_LIST);
+			if (report)
+				cxl_mem_report_poison(cxlmd, cxlr,
+						      &po->record[i]);
+		}
 
 		/* Protect against an uncleared _FLAG_MORE */
 		nr_records = nr_records + le16_to_cpu(po->count);
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index d4e259f3a7e9..e976141ca4a9 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -200,14 +200,14 @@ static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
 	if (resource_size(&cxlds->pmem_res)) {
 		offset = cxlds->pmem_res.start;
 		length = resource_size(&cxlds->pmem_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc)
 			return rc;
 	}
 	if (resource_size(&cxlds->ram_res)) {
 		offset = cxlds->ram_res.start;
 		length = resource_size(&cxlds->ram_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		/*
 		 * Invalid Physical Address is not an error for
 		 * volatile addresses. Device support is optional.
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5c186e0a39b9..e83c46cb4dea 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2585,7 +2585,7 @@ static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 	if (ctx->mode == CXL_DECODER_RAM) {
 		offset = ctx->offset;
 		length = resource_size(&cxlds->ram_res) - offset;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT)
 			rc = 0;
 		if (rc)
@@ -2603,7 +2603,7 @@ static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 			return 0;
 	}
 
-	return cxl_mem_get_poison(cxlmd, offset, length, NULL);
+	return cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 }
 
 static int poison_by_decoder(struct device *dev, void *arg)
@@ -2637,7 +2637,7 @@ static int poison_by_decoder(struct device *dev, void *arg)
 	if (cxled->skip) {
 		offset = cxled->dpa_res->start - cxled->skip;
 		length = cxled->skip;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 			rc = 0;
 		if (rc)
@@ -2646,7 +2646,7 @@ static int poison_by_decoder(struct device *dev, void *arg)
 
 	offset = cxled->dpa_res->start;
 	length = cxled->dpa_res->end - offset + 1;
-	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region);
+	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region, false);
 	if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 		rc = 0;
 	if (rc)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 82f80eb381fb..1f03130b9d6a 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -832,7 +832,7 @@ void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
 			   struct cxl_region *cxlr,
 			   struct cxl_poison_record *poison);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr);
+		       struct cxl_region *cxlr, bool report);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);

From patchwork Fri Mar 29 06:36:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shiyang Ruan
X-Patchwork-Id: 13610211
From: Shiyang Ruan
To: qemu-devel@nongnu.org, linux-cxl@vger.kernel.org
Cc: Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, dave@stgolabs.net, ira.weiny@intel.com
Subject: [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs
Date: Fri, 29 Mar 2024 14:36:12 +0800
Message-Id: <20240329063614.362763-5-ruansy.fnst@fujitsu.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
References: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
Precedence: bulk
X-Mailing-List: linux-cxl@vger.kernel.org

Poison injection from debugfs is silent too. Call cxl_mem_report_poison()
after injecting so that the injected poison is handed to memory_failure().

Signed-off-by: Shiyang Ruan
---
 drivers/cxl/core/memdev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index e976141ca4a9..b0dcbe6f1004 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -366,6 +366,7 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 		.length = cpu_to_le32(1),
 	};
 	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
+	cxl_mem_report_poison(cxlmd, cxlr, &record);
 out:
 	up_read(&cxl_dpa_rwsem);
 	up_read(&cxl_region_rwsem);

From patchwork Fri Mar 29 06:36:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shiyang Ruan
X-Patchwork-Id: 13610214
From: Shiyang Ruan
To: qemu-devel@nongnu.org, linux-cxl@vger.kernel.org
Cc: Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, dave@stgolabs.net, ira.weiny@intel.com
Subject: [RFC PATCH v2 5/6] cxl: add definition for transaction types
Date: Fri, 29 Mar 2024 14:36:13 +0800
Message-Id: <20240329063614.362763-6-ruansy.fnst@fujitsu.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
References: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
Precedence: bulk
X-Mailing-List: linux-cxl@vger.kernel.org

The transaction types used by the General Media Event Record and the DRAM
Event Record are defined in CXL rev 3.0 Section 8.2.9.2.1.1, Table 8-43 and
Section 8.2.9.2.1.2, Table 8-44. Add definitions for them so that event
record handlers can use them.
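A minimal sketch of how a handler might decode the raw u8 transaction_type
field once these values exist. It is plain userspace C: the enumerators are a
local copy of the values added by this patch with a DEMO_ prefix, and the
helper name and the returned strings are hypothetical, not part of the series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the values added by this patch; DEMO_ names are made up. */
enum demo_transaction_type {
	DEMO_TRANSACTION_UNKNOWN = 0x00,
	DEMO_TRANSACTION_READ,
	DEMO_TRANSACTION_WRITE,
	DEMO_TRANSACTION_SCAN_MEDIA,
	DEMO_TRANSACTION_INJECT_POISON,
	DEMO_TRANSACTION_MEDIA_SCRUB,
	DEMO_TRANSACTION_MEDIA_MANAGEMENT,
};

/* Map the raw one-byte field to a printable name. */
static const char *demo_transaction_name(uint8_t raw)
{
	switch (raw) {
	case DEMO_TRANSACTION_UNKNOWN:
		return "unknown";
	case DEMO_TRANSACTION_READ:
		return "read";
	case DEMO_TRANSACTION_WRITE:
		return "write";
	case DEMO_TRANSACTION_SCAN_MEDIA:
		return "scan media";
	case DEMO_TRANSACTION_INJECT_POISON:
		return "inject poison";
	case DEMO_TRANSACTION_MEDIA_SCRUB:
		return "media scrub";
	case DEMO_TRANSACTION_MEDIA_MANAGEMENT:
		return "media management";
	default:
		/* Values not listed in the enum above. */
		return "reserved";
	}
}

/* Self-check: implicit enumerator values and the fallback path. */
static void demo_transaction_self_check(void)
{
	assert(DEMO_TRANSACTION_INJECT_POISON == 0x04);
	assert(DEMO_TRANSACTION_MEDIA_MANAGEMENT == 0x06);
	assert(strcmp(demo_transaction_name(0x04), "inject poison") == 0);
	assert(strcmp(demo_transaction_name(0x7F), "reserved") == 0);
}
```

Taking the field as a uint8_t rather than as the enum type keeps the default
branch reachable for values a device reports that the enum does not list.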
Signed-off-by: Shiyang Ruan
---
 include/linux/cxl-event.h | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index 03fa6d50d46f..0a50754fc330 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -23,6 +23,19 @@ struct cxl_event_generic {
 	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
 } __packed;
 
+/*
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+enum cxl_event_transaction_type {
+	CXL_EVENT_TRANSACTION_UNKNOWN = 0x00,
+	CXL_EVENT_TRANSACTION_READ,
+	CXL_EVENT_TRANSACTION_WRITE,
+	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
+	CXL_EVENT_TRANSACTION_INJECT_POISON,
+	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
+	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
+};
+
 /*
  * General Media Event Record
  * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
@@ -33,7 +46,7 @@ struct cxl_event_gen_media {
 	__le64 phys_addr;
 	u8 descriptor;
 	u8 type;
-	u8 transaction_type;
+	u8 transaction_type; /* enum cxl_event_transaction_type */
 	u8 validity_flags[2];
 	u8 channel;
 	u8 rank;
@@ -52,7 +65,7 @@ struct cxl_event_dram {
 	__le64 phys_addr;
 	u8 descriptor;
 	u8 type;
-	u8 transaction_type;
+	u8 transaction_type; /* enum cxl_event_transaction_type */
 	u8 validity_flags[2];
 	u8 channel;
 	u8 rank;

From patchwork Fri Mar 29 06:36:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiyang Ruan X-Patchwork-Id: 13610213 Received: from esa4.hc1455-7.c3s2.iphmx.com (esa4.hc1455-7.c3s2.iphmx.com [68.232.139.117]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 24B693C6A6 for ; Fri, 29 Mar 2024 06:37:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=68.232.139.117 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711694255; cv=none; 
b=E1jmxh6Ss1vCFUbFm6g/K93sIpVpObNlC2moykDs58dtoxhkjLZBwynITrs4p9BvqHgDltm11xzPAj7UamtdDLBDCRtW91dfOo9HorJ9nnU22eJLKmJ03ZvaeP2+lQu8uN+edgormCYkDLCnv2SxvZauQimELf6wwyk9T8m3cDg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711694255; c=relaxed/simple; bh=shtuRfAxtrYc7R8K9/CJ4Azd4LkSM+WTHpdkGRJ63DM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Z4xzA25rmvjH6h/40uH8/GRiEuyBZtZUxL29HEe4JdM5jTiy14FqVSfBwxIRH70b2gdS1Y+PbmuEiE7h1jS7o3GoUwnWJ/Ixt7V88lzJHKRSIkmcodiCAo9Na7gFXwbfdeXRSbWrvUNETDFN4UD2CaGAzb+uPIADAKwntsEOJqQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fujitsu.com; spf=pass smtp.mailfrom=fujitsu.com; dkim=pass (2048-bit key) header.d=fujitsu.com header.i=@fujitsu.com header.b=NnW76oZS; arc=none smtp.client-ip=68.232.139.117 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fujitsu.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=fujitsu.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=fujitsu.com header.i=@fujitsu.com header.b="NnW76oZS" DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=fujitsu.com; i=@fujitsu.com; q=dns/txt; s=fj2; t=1711694253; x=1743230253; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=shtuRfAxtrYc7R8K9/CJ4Azd4LkSM+WTHpdkGRJ63DM=; b=NnW76oZS5CH67Av5d+FXlpdoVSX3vwt3GcuQNQeo7oARcP+mnWXnDw6o nAnnuVzbhEKAfcelHTr2q/wzrxKfq7b7ivsU7jRSpWO1WJ/MAqUdiwdJz kqILu5kx8f7n/l4P68yd8xDq+X/bs5h117hihf9KsUpHOm/EXcgFAHIyk FmO/n/nZlrsg4J4z1V5ltG3PeXxBat5U/2PUHuv9/rkWjAE8G3LVcDmvs aU0Z459F+wX7/mCAgCxRsQRy97tdvXO4mG6mtInK/X05AftGada0yZUOU ShVfVFEZTGyj7c+dCwIWxKk2jNTFPF9XDEoDZRTgiEzVqsl8Se5PGN3Ns g==; X-IronPort-AV: E=McAfee;i="6600,9927,11027"; a="154014271" X-IronPort-AV: E=Sophos;i="6.07,164,1708354800"; d="scan'208";a="154014271" Received: from unknown (HELO 
yto-r4.gw.nic.fujitsu.com) ([218.44.52.220]) by esa4.hc1455-7.c3s2.iphmx.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2024 15:36:20 +0900 Received: from yto-m2.gw.nic.fujitsu.com (yto-nat-yto-m2.gw.nic.fujitsu.com [192.168.83.65]) by yto-r4.gw.nic.fujitsu.com (Postfix) with ESMTP id 77FBFD9F07 for ; Fri, 29 Mar 2024 15:36:18 +0900 (JST) Received: from kws-ab4.gw.nic.fujitsu.com (kws-ab4.gw.nic.fujitsu.com [192.51.206.22]) by yto-m2.gw.nic.fujitsu.com (Postfix) with ESMTP id A78E0D6257 for ; Fri, 29 Mar 2024 15:36:17 +0900 (JST) Received: from edo.cn.fujitsu.com (edo.cn.fujitsu.com [10.167.33.5]) by kws-ab4.gw.nic.fujitsu.com (Postfix) with ESMTP id 388BA224950 for ; Fri, 29 Mar 2024 15:36:17 +0900 (JST) Received: from irides.g08.fujitsu.local (unknown [10.167.226.114]) by edo.cn.fujitsu.com (Postfix) with ESMTP id C10561A006E; Fri, 29 Mar 2024 14:36:16 +0800 (CST) From: Shiyang Ruan To: qemu-devel@nongnu.org, linux-cxl@vger.kernel.org Cc: Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, dave@stgolabs.net, ira.weiny@intel.com Subject: [RFC PATCH v2 6/6] cxl/core: add poison injection event handler Date: Fri, 29 Mar 2024 14:36:14 +0800 Message-Id: <20240329063614.362763-7-ruansy.fnst@fujitsu.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240329063614.362763-1-ruansy.fnst@fujitsu.com> References: <20240329063614.362763-1-ruansy.fnst@fujitsu.com> Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-TM-AS-Product-Ver: IMSS-9.1.0.1417-9.0.0.1002-28282.003 X-TM-AS-User-Approved-Sender: Yes X-TMASE-Version: IMSS-9.1.0.1417-9.0.1002-28282.003 X-TMASE-Result: 10--21.177000-10.000000 X-TMASE-MatchedRID: iooG+Wyw6IPvjhWxSrUkKRFbgtHjUWLy/OuUJVcMZhsshTvdDYMpJmdv IBM8UuLSlD/Z1FnrlIQq1udiYlzzOw+AJDU3H+qgMGAKZueP0mZcsgu/IQFPzlFcLhxfrlwkrEi NJh+xJpmolk6IAqhmvEZXTR3Us53S8Aj+/7+3oKUFxov+3JYvY35Lmbb/xUuaOhR0VsdhRrC/BR 
68O365bn9eOltIlLtrGYYZJnYtPhnonyaWYsETGxmCYUYerLHruJpeHGRhXLFHpEd1UrzmFevn/ O99cHKFCazasHFXmjvJI5NrSJkuqtnzIBSf0ZiSBe3KRVyu+k2fmd9HsjZ0Uw3H/quqvfm4NUIc 8ma/DM+LYCqQJUBx8PD9KdZg9ohhrVSkkKFNk8GJXSm2bBmGrSCW0wRZXKAmmyiLZetSf8mfop0 ytGwvXiq2rl3dzGQ1A/3R8k/14e0= X-TMASE-SNAP-Result: 1.821001.0001-0-1-22:0,33:0,34:0-0

Currently the driver only traces CXL events, so poison injection (for both
vmem and pmem types) on a CXL memdev is silent. The OS needs to be notified
so that it can handle the poisoned range in time. Per the CXL spec, a device
error event can be signaled through either the FW-First or the OS-First
method.

So, add a poison event handler for the OS-First method:
 - qemu:
   - the CXL device reports a POISON event to the OS via MSI by sending a
     GMER after injecting a poison record
 - CXL driver:
   a. parse the POISON event from the GMER;   <-- this patch
   b. retrieve the POISON list from the memdev;
   c. translate the poisoned DPA to HPA;
   d. enqueue the poisoned PFN to memory_failure's work queue;

Signed-off-by: Shiyang Ruan
---
Reply to Jonathan's comment on the last version:
> I'm not 100% convinced this is necessary poison causing. Also
> the text tells us we should see 'an appropriate event'.
> DRAM one seems likely to be chosen by some vendors.
I think it's right to use the DRAM Event Record for a volatile memdev, but
should poison on a persistent memdev also use the DRAM Event Record? Its
'Physical Address' field has the 'Volatile' bit too, the same as in the
General Media Event Record. I am a bit confused about this.
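For illustration only (not part of the patch): step (a) above can be sketched in userspace C roughly as follows. All `sketch_`/`SKETCH_` names are hypothetical. The flag layout (low 6 bits of "Physical Address" are flags, bit 0 means volatile) is an assumption that mirrors CXL_DPA_FLAGS_MASK and CXL_DPA_VOLATILE in the driver. The record is taken as already CPU-endian, and the log-type check and decoder lookup done by the real handler are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values follow the enum added in patch 5/6 of this series. */
enum sketch_transaction_type {
	SKETCH_TRANSACTION_UNKNOWN = 0x00,
	SKETCH_TRANSACTION_READ,
	SKETCH_TRANSACTION_WRITE,
	SKETCH_TRANSACTION_SCAN_MEDIA,
	SKETCH_TRANSACTION_INJECT_POISON,
	SKETCH_TRANSACTION_MEDIA_SCRUB,
	SKETCH_TRANSACTION_MEDIA_MANAGEMENT,
};

/*
 * Assumption: the low 6 bits of "Physical Address" carry flags and
 * bit 0 marks a volatile address.
 */
#define SKETCH_DPA_FLAGS_MASK	0x3FULL
#define SKETCH_DPA_MASK		(~SKETCH_DPA_FLAGS_MASK)
#define SKETCH_DPA_VOLATILE	0x1ULL

/* Hypothetical, already CPU-endian subset of a GMER. */
struct sketch_gmer {
	uint64_t phys_addr;
	uint8_t transaction_type;
};

struct sketch_poison {
	uint64_t dpa;
	bool is_volatile;
};

/*
 * Return true and fill @out when the transaction type is one the
 * handler in this patch acts on (read, write, inject poison).
 * Return false and leave @out untouched otherwise.
 */
static bool sketch_parse_poison(const struct sketch_gmer *rec,
				struct sketch_poison *out)
{
	assert(rec != NULL && out != NULL);

	switch (rec->transaction_type) {
	case SKETCH_TRANSACTION_READ:
	case SKETCH_TRANSACTION_WRITE:
	case SKETCH_TRANSACTION_INJECT_POISON:
		/* Strip the flag bits to get the DPA itself. */
		out->dpa = rec->phys_addr & SKETCH_DPA_MASK;
		out->is_volatile = (rec->phys_addr & SKETCH_DPA_VOLATILE) != 0;
		return true;
	default:
		return false;
	}
}
```

A caller would pass the decoded record and, on a true return, use `out->dpa` as the start address for the poison list query in step (b). Scan, scrub and media-management records are ignored here, as in the switch statement of the patch.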
---
 drivers/cxl/core/mbox.c | 100 ++++++++++++++++++++++++++++++++++------
 drivers/cxl/cxlmem.h    |   8 ++--
 2 files changed, 91 insertions(+), 17 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 19b46fb06ed6..97ef45d808b8 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -837,25 +837,99 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt)
+struct cxl_event_poison_context {
+	u64 dpa;
+	u64 length;
+};
+
+static int __cxl_report_poison(struct device *dev, void *arg)
+{
+	struct cxl_event_poison_context *ctx = arg;
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_memdev *cxlmd;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
+		return 0;
+
+	if (cxled->mode == CXL_DECODER_MIXED) {
+		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
+		return 0;
+	}
+
+	if (ctx->dpa > cxled->dpa_res->end || ctx->dpa < cxled->dpa_res->start)
+		return 0;
+
+	cxlmd = cxled_to_memdev(cxled);
+	cxl_mem_get_poison(cxlmd, ctx->dpa, ctx->length, cxled->cxld.region,
+			   true);
+
+	return 1;
+}
+
+static void cxl_event_handle_poison(struct cxl_memdev *cxlmd,
+				    struct cxl_event_gen_media *rec)
+{
+	struct cxl_port *port = cxlmd->endpoint;
+	u64 phys_addr = le64_to_cpu(rec->phys_addr);
+	struct cxl_event_poison_context ctx = {
+		.dpa = phys_addr & CXL_DPA_MASK,
+	};
+
+	/* No regions mapped to this memdev, that is to say no HPA is mapped */
+	if (!port || !is_cxl_endpoint(port) ||
+	    cxl_num_decoders_committed(port) == 0)
+		return;
+
+	/*
+	 * Host Inject Poison may have a range of DPA, but the GMER only has
+	 * "Physical Address" field, no such one indicates length. So it's
+	 * better to call cxl_mem_get_poison() to find this poison record.
+	 */
+	ctx.length = phys_addr & CXL_DPA_VOLATILE ?
+		resource_size(&cxlmd->cxlds->ram_res) :
+		resource_size(&cxlmd->cxlds->pmem_res) - ctx.dpa;
+
+	device_for_each_child(&port->dev, &ctx, __cxl_report_poison);
+}
+
+static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
+					   enum cxl_event_log_type type,
+					   struct cxl_event_gen_media *rec)
+{
+	if (type == CXL_EVENT_TYPE_FAIL) {
+		switch (rec->transaction_type) {
+		case CXL_EVENT_TRANSACTION_READ:
+		case CXL_EVENT_TRANSACTION_WRITE:
+		case CXL_EVENT_TRANSACTION_INJECT_POISON:
+			cxl_event_handle_poison(cxlmd, rec);
+			break;
+		default:
+			break;
+		}
+	}
+}
+
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt)
 {
-	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
 		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
-	else if (event_type == CXL_CPER_EVENT_DRAM)
+		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
+	} else if (event_type == CXL_CPER_EVENT_DRAM)
 		trace_cxl_dram(cxlmd, type, &evt->dram);
 	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
 		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
 	else
 		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
 }
-EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
+EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
 
-static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-				     enum cxl_event_log_type type,
-				     struct cxl_event_record_raw *record)
+static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
+				      enum cxl_event_log_type type,
+				      struct cxl_event_record_raw *record)
 {
 	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
 	const uuid_t *uuid = &record->id;
@@ -867,7 +941,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
 		ev_type = CXL_CPER_EVENT_MEM_MODULE;
 
-	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
+	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
 }
 
 static int cxl_clear_event_record(struct cxl_memdev_state *mds,
@@ -978,8 +1052,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
 			break;
 
 		for (i = 0; i < nr_rec; i++)
-			__cxl_event_trace_record(cxlmd, type,
-						 &payload->records[i]);
+			__cxl_event_handle_record(cxlmd, type,
+						  &payload->records[i]);
 
 		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
 			trace_cxl_overflow(cxlmd, type, payload);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 1f03130b9d6a..dfd7bdd0d66a 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				  unsigned long *cmds);
 void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt);
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt);
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 void cxl_mem_report_poison(struct cxl_memdev *cxlmd,