From patchwork Fri Oct 18 16:12:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gregory Price X-Patchwork-Id: 13842025 From: Gregory Price To: linux-cxl@vger.kernel.org Cc: qemu-devel@nongnu.org, svetly.todorov@memverge.com, jonathan.cameron@huawei.com, nifan.cxl@gmail.com Subject: [PATCH RFC v3 1/3] cxl-mailbox-utils: move CXLUpdateDCExtentListInPl into header Date: Fri, 18 Oct 2024 12:12:50 -0400 Message-ID: <20241018161252.8896-2-gourry@gourry.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241018161252.8896-1-gourry@gourry.net> References: <20241018161252.8896-1-gourry@gourry.net> X-Mailing-List: linux-cxl@vger.kernel.org
From: Svetly Todorov Allows other CXL devices to access host DCD-add-response payload. Signed-off-by: Gregory Price Signed-off-by: Svetly Todorov --- hw/cxl/cxl-mailbox-utils.c | 16 ---------------- include/hw/cxl/cxl_device.h | 16 ++++++++++++++++ 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c index 72c03d85cf..10de26605c 100644 --- a/hw/cxl/cxl-mailbox-utils.c +++ b/hw/cxl/cxl-mailbox-utils.c @@ -2446,22 +2446,6 @@ void cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list) g_free(group); } -/* - * CXL r3.1 Table 8-168: Add Dynamic Capacity Response Input Payload - * CXL r3.1 Table 8-170: Release Dynamic Capacity Input Payload - */ -typedef struct CXLUpdateDCExtentListInPl { - uint32_t num_entries_updated; - uint8_t flags; - uint8_t rsvd[3]; - /* CXL r3.1 Table 8-169: Updated Extent */ - struct { - uint64_t start_dpa; - uint64_t len; - uint8_t rsvd[8]; - } QEMU_PACKED updated_entries[]; -} QEMU_PACKED CXLUpdateDCExtentListInPl; - /* * For the extents in the extent list to operate, check whether they are valid * 1. 
The extent should be in the range of a valid DC region; diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h index c3e93b876a..b2dc7fb769 100644 --- a/include/hw/cxl/cxl_device.h +++ b/include/hw/cxl/cxl_device.h @@ -552,6 +552,22 @@ typedef struct CXLDCExtentGroup { } CXLDCExtentGroup; typedef QTAILQ_HEAD(, CXLDCExtentGroup) CXLDCExtentGroupList; +/* + * CXL r3.1 Table 8-168: Add Dynamic Capacity Response Input Payload + * CXL r3.1 Table 8-170: Release Dynamic Capacity Input Payload + */ +typedef struct CXLUpdateDCExtentListInPl { + uint32_t num_entries_updated; + uint8_t flags; + uint8_t rsvd[3]; + /* CXL r3.1 Table 8-169: Updated Extent */ + struct { + uint64_t start_dpa; + uint64_t len; + uint8_t rsvd[8]; + } QEMU_PACKED updated_entries[]; +} QEMU_PACKED CXLUpdateDCExtentListInPl; + typedef struct CXLDCRegion { uint64_t base; /* aligned to 256*MiB */ uint64_t decode_len; /* aligned to 256*MiB */
From patchwork Fri Oct 18 16:12:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gregory Price X-Patchwork-Id: 13842026 From: Gregory Price To: linux-cxl@vger.kernel.org Cc: qemu-devel@nongnu.org, svetly.todorov@memverge.com, jonathan.cameron@huawei.com, nifan.cxl@gmail.com Subject: [PATCH RFC v3 2/3] cxl_type3: add MHD callbacks Date: Fri, 18 Oct 2024 12:12:51 -0400 Message-ID: <20241018161252.8896-3-gourry@gourry.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241018161252.8896-1-gourry@gourry.net> References: <20241018161252.8896-1-gourry@gourry.net> X-Mailing-List: linux-cxl@vger.kernel.org
From: Svetly Todorov Introduce an API for validating DC adds, removes, and responses against a multi-headed device. mhd_reserve_extents() is called during a DC add request. This allows a multi-headed device to check whether the requested extents belong to another host. 
If not, then this function can claim those extents in the MHD state and allow the cxl_type3 code to follow suit in the host-local blk_bitmap. mhd_reclaim_extents() is called during the DC add response. It allows the MHD to reclaim extents that were preallocated to a host during the request but rejected in the response. mhd_release_extent() is called during the DC release response. It can be invoked after a host frees an extent in its local bitmap, allowing the MHD handler to release that same extent in the multi-host state. Signed-off-by: Gregory Price Signed-off-by: Svetly Todorov --- hw/cxl/cxl-mailbox-utils.c | 28 +++++++++++++++++++++++++++- hw/mem/cxl_type3.c | 17 +++++++++++++++++ include/hw/cxl/cxl_device.h | 8 ++++++++ 3 files changed, 52 insertions(+), 1 deletion(-) diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c index 10de26605c..112272e9ac 100644 --- a/hw/cxl/cxl-mailbox-utils.c +++ b/hw/cxl/cxl-mailbox-utils.c @@ -2545,6 +2545,7 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd, { CXLUpdateDCExtentListInPl *in = (void *)payload_in; CXLType3Dev *ct3d = CXL_TYPE3(cci->d); + CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d); CXLDCExtentList *extent_list = &ct3d->dc.extents; uint32_t i; uint64_t dpa, len; @@ -2579,6 +2580,11 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd, ct3d->dc.total_extent_count += 1; ct3_set_region_block_backed(ct3d, dpa, len); } + + if (cvc->mhd_reclaim_extents) + cvc->mhd_reclaim_extents(&ct3d->parent_obj, &ct3d->dc.extents_pending, + in); + /* Remove the first extent group in the pending list */ cxl_extent_group_list_delete_front(&ct3d->dc.extents_pending); @@ -2612,6 +2618,7 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d, uint32_t *updated_list_size) { CXLDCExtent *ent, *ent_next; + CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d); uint64_t dpa, len; uint32_t i; int cnt_delta = 0; @@ -2632,6 +2639,13 @@ static CXLRetCode 
cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d, goto free_and_exit; } + /* In an MHD, check that this DPA range belongs to this host */ + if (cvc->mhd_access_valid && + !cvc->mhd_access_valid(&ct3d->parent_obj, dpa, len)) { + ret = CXL_MBOX_INVALID_PA; + goto free_and_exit; + } + /* After this point, extent overflow is the only error can happen */ while (len > 0) { QTAILQ_FOREACH(ent, updated_list, node) { @@ -2704,9 +2718,11 @@ static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd, { CXLUpdateDCExtentListInPl *in = (void *)payload_in; CXLType3Dev *ct3d = CXL_TYPE3(cci->d); + CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d); CXLDCExtentList updated_list; CXLDCExtent *ent, *ent_next; - uint32_t updated_list_size; + uint32_t updated_list_size, i; + uint64_t dpa, len; CXLRetCode ret; if (in->num_entries_updated == 0) { @@ -2724,6 +2740,16 @@ static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd, return ret; } + /* updated_entries contains the released extents. Free those in the MHD */ + for (i = 0; cvc->mhd_release_extent && i < in->num_entries_updated; ++i) { + dpa = in->updated_entries[i].start_dpa; + len = in->updated_entries[i].len; + + if (cvc->mhd_release_extent) { + cvc->mhd_release_extent(&ct3d->parent_obj, dpa, len); + } + } + /* * If the dry run release passes, the returned updated_list will * be the updated extent list and we just need to clear the extents diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c index b7b24b6a32..a94b9931d2 100644 --- a/hw/mem/cxl_type3.c +++ b/hw/mem/cxl_type3.c @@ -799,6 +799,7 @@ static void cxl_destroy_dc_regions(CXLType3Dev *ct3d) { CXLDCExtent *ent, *ent_next; CXLDCExtentGroup *group, *group_next; + CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d); int i; CXLDCRegion *region; @@ -817,6 +818,10 @@ static void cxl_destroy_dc_regions(CXLType3Dev *ct3d) for (i = 0; i < ct3d->dc.num_regions; i++) { region = &ct3d->dc.regions[i]; g_free(region->blk_bitmap); + if (cvc->mhd_release_extent) { + 
cvc->mhd_release_extent(&ct3d->parent_obj, region->base, + region->len); + } } } @@ -2077,6 +2082,7 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path, CXLEventDynamicCapacity dCap = {}; CXLEventRecordHdr *hdr = &dCap.hdr; CXLType3Dev *dcd; + CXLType3Class *cvc; uint8_t flags = 1 << CXL_EVENT_TYPE_INFO; uint32_t num_extents = 0; CxlDynamicCapacityExtentList *list; @@ -2094,6 +2100,7 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path, } dcd = CXL_TYPE3(obj); + cvc = CXL_TYPE3_GET_CLASS(dcd); if (!dcd->dc.num_regions) { error_setg(errp, "No dynamic capacity support from the device"); return; @@ -2166,6 +2173,13 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path, num_extents++; } + /* If this is an MHD, attempt to reserve the extents */ + if (type == DC_EVENT_ADD_CAPACITY && cvc->mhd_reserve_extents && + !cvc->mhd_reserve_extents(&dcd->parent_obj, records, rid)) { + error_setg(errp, "mhsld is enabled and extent reservation failed"); + return; + } + /* Create extent list for event being passed to host */ i = 0; list = records; @@ -2304,6 +2318,9 @@ static void ct3_class_init(ObjectClass *oc, void *data) cvc->set_cacheline = set_cacheline; cvc->mhd_get_info = NULL; cvc->mhd_access_valid = NULL; + cvc->mhd_reserve_extents = NULL; + cvc->mhd_reclaim_extents = NULL; + cvc->mhd_release_extent = NULL; } static const TypeInfo ct3d_info = { diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h index b2dc7fb769..13c97b576f 100644 --- a/include/hw/cxl/cxl_device.h +++ b/include/hw/cxl/cxl_device.h @@ -14,6 +14,7 @@ #include "hw/pci/pci_device.h" #include "hw/register.h" #include "hw/cxl/cxl_events.h" +#include "qapi/qapi-commands-cxl.h" #include "hw/cxl/cxl_cpmu.h" /* @@ -682,6 +683,13 @@ struct CXLType3Class { size_t *len_out, CXLCCI *cci); bool (*mhd_access_valid)(PCIDevice *d, uint64_t addr, unsigned int size); + bool (*mhd_reserve_extents)(PCIDevice *d, + 
CxlDynamicCapacityExtentList *records, + uint8_t rid); + bool (*mhd_reclaim_extents)(PCIDevice *d, + CXLDCExtentGroupList *groups, + CXLUpdateDCExtentListInPl *in); + bool (*mhd_release_extent)(PCIDevice *d, uint64_t dpa, uint64_t len); }; struct CSWMBCCIDev {
From patchwork Fri Oct 18 16:12:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gregory Price X-Patchwork-Id: 13842027 From: Gregory Price To: linux-cxl@vger.kernel.org Cc: qemu-devel@nongnu.org, svetly.todorov@memverge.com, jonathan.cameron@huawei.com, nifan.cxl@gmail.com Subject: [PATCH RFC v3 3/3] mhsld: implement MHSLD device Date: Fri, 18 Oct 2024 12:12:52 -0400 Message-ID: <20241018161252.8896-4-gourry@gourry.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241018161252.8896-1-gourry@gourry.net> References: <20241018161252.8896-1-gourry@gourry.net> X-Mailing-List: linux-cxl@vger.kernel.org
From: Svetly Todorov Using a shared-memory bytemap, validates that DC adds, releases, and reclamations happen on extents belonging to the appropriate host. The MHSLD device inherits from the CXL_TYPE3 class and adds the following configuration options: --mhd-head= --mhd-state_file= --mhd-init= --mhd-head specifies the head ID of the host on the given device. --mhd-state_file is the name of the shared-memory-backed file used to store the MHD state. --mhd-init indicates whether this QEMU instance should initialize the state_file; if so, the instance will create the file if it does not exist, ftruncate it to the appropriate size, and initialize its header. It is assumed that the --mhd-init instance is run and allowed to completely finish configuration before any other guests access the shared state. The shared state file only needs to be initialized once. 
Even if a guest dies without clearing the ownership bits associated with its head-ID, future guests with that ID will clear those bits in cxl_mhsld_realize(), regardless of whether mhd_init is true or false.

The following command line options create an MHSLD with 4GB of backing memory, whose state is tracked in /dev/shm/mhd_metadata. --mhd-init=true tells this instance to initialize the state as described above.

./qemu-system-x86_64 \
  [... other options ...] \
  -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \
  -device cxl-rp,id=rp0,bus=cxl.0,chassis=0,port=0,slot=0 \
  -object memory-backend-ram,id=mem0,size=4G \
  -device cxl-mhsld,bus=rp0,num-dc-regions=1,volatile-dc-memdev=mem0,id=cxl-mem0,sn=66667,mhd-head=0,mhd-state_file=mhd_metadata,mhd-init=true \
  -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G \
  -qmp unix:/tmp/qmp-sock-1,server,nowait

Once this guest completes setup, other guests looking to access the device can be booted with the same configuration options, but with --mhd-head != 0, --mhd-init=false, and a different QMP socket.
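
For readers who want the ownership scheme in isolation, the following is a minimal sketch of a claim/release bytemap, written with C11 atomics rather than the __sync builtins used in the patch. It is an illustration under stated assumptions, not the code from this series: the names bytemap_claim() and bytemap_release() are hypothetical, and the map is a plain in-process array instead of the shared-memory state file.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only. One byte per block;
 * a byte of 0 means "unowned", (1 << head) means "owned by head".
 */

/* Clear this head's ownership bit in blocks [start, start + count). */
static void bytemap_release(_Atomic uint8_t *blocks, size_t start,
                            size_t count, unsigned head)
{
    for (size_t i = 0; i < count; i++) {
        atomic_fetch_and(&blocks[start + i], (uint8_t)~(1u << head));
    }
}

/*
 * Try to claim blocks [start, start + count) for `head`.
 * Each block is claimed with a compare-and-swap from 0; if any block is
 * already owned, the blocks claimed so far are released again and the
 * call fails, so a failed claim leaves the map unchanged.
 */
static bool bytemap_claim(_Atomic uint8_t *blocks, size_t start,
                          size_t count, unsigned head)
{
    const uint8_t mine = (uint8_t)(1u << head);

    for (size_t i = 0; i < count; i++) {
        uint8_t expected = 0;
        if (!atomic_compare_exchange_strong(&blocks[start + i],
                                            &expected, mine)) {
            bytemap_release(blocks, start, i, head);
            return false;
        }
    }
    return true;
}
```

Rolling back only the first i blocks avoids touching the block that caused the failure; clearing a bit this head never set would be harmless, but skipping it keeps the rollback exact.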
Signed-off-by: Gregory Price Signed-off-by: Svetly Todorov --- hw/cxl/Kconfig | 1 + hw/cxl/meson.build | 1 + hw/cxl/mhsld/Kconfig | 4 + hw/cxl/mhsld/meson.build | 3 + hw/cxl/mhsld/mhsld.c | 456 +++++++++++++++++++++++++++++++++++++++ hw/cxl/mhsld/mhsld.h | 75 +++++++ 6 files changed, 540 insertions(+) create mode 100644 hw/cxl/mhsld/Kconfig create mode 100644 hw/cxl/mhsld/meson.build create mode 100644 hw/cxl/mhsld/mhsld.c create mode 100644 hw/cxl/mhsld/mhsld.h diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig index e603839a62..919e59b598 100644 --- a/hw/cxl/Kconfig +++ b/hw/cxl/Kconfig @@ -1,3 +1,4 @@ +source mhsld/Kconfig source vendor/Kconfig config CXL diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build index e8c8c1355a..394750dd19 100644 --- a/hw/cxl/meson.build +++ b/hw/cxl/meson.build @@ -16,4 +16,5 @@ system_ss.add(when: 'CONFIG_I2C_MCTP_CXL', if_true: files('i2c_mctp_cxl.c')) system_ss.add(when: 'CONFIG_ALL', if_true: files('cxl-host-stubs.c')) +subdir('mhsld') subdir('vendor') diff --git a/hw/cxl/mhsld/Kconfig b/hw/cxl/mhsld/Kconfig new file mode 100644 index 0000000000..dc2be15140 --- /dev/null +++ b/hw/cxl/mhsld/Kconfig @@ -0,0 +1,4 @@ +config CXL_MHSLD + bool + depends on CXL_MEM_DEVICE + default y diff --git a/hw/cxl/mhsld/meson.build b/hw/cxl/mhsld/meson.build new file mode 100644 index 0000000000..c595558f8a --- /dev/null +++ b/hw/cxl/mhsld/meson.build @@ -0,0 +1,3 @@ +if host_os == 'linux' + system_ss.add(when: 'CONFIG_CXL_MHSLD', if_true: files('mhsld.c',)) +endif diff --git a/hw/cxl/mhsld/mhsld.c b/hw/cxl/mhsld/mhsld.c new file mode 100644 index 0000000000..2a3023607e --- /dev/null +++ b/hw/cxl/mhsld/mhsld.c @@ -0,0 +1,456 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + * + * Copyright (c) 2024 MemVerge Inc. 
+ * + */ + +#include +#include "qemu/osdep.h" +#include "qemu/bitmap.h" +#include "hw/irq.h" +#include "migration/vmstate.h" +#include "qapi/error.h" +#include "hw/cxl/cxl.h" +#include "hw/cxl/cxl_mailbox.h" +#include "hw/cxl/cxl_device.h" +#include "hw/pci/pcie.h" +#include "hw/pci/pcie_port.h" +#include "hw/qdev-properties.h" +#include "sysemu/hostmem.h" +#include "mhsld.h" + +#define TYPE_CXL_MHSLD "cxl-mhsld" +OBJECT_DECLARE_TYPE(CXLMHSLDState, CXLMHSLDClass, CXL_MHSLD) + +/* + * CXL r3.0 section 7.6.7.5.1 - Get Multi-Headed Info (Opcode 5500h) + * + * This command retrieves the number of heads, number of supported LDs, + * and Head-to-LD mapping of a Multi-Headed device. + */ +static CXLRetCode cmd_mhd_get_info(const struct cxl_cmd *cmd, + uint8_t *payload_in, size_t len_in, + uint8_t *payload_out, size_t *len_out, + CXLCCI * cci) +{ + CXLMHSLDState *s = CXL_MHSLD(cci->d); + MHDGetInfoInput *input = (void *)payload_in; + MHDGetInfoOutput *output = (void *)payload_out; + + uint8_t start_ld = input->start_ld; + uint8_t ldmap_len = input->ldmap_len; + uint8_t i; + + if (start_ld >= s->mhd_state->nr_lds) { + return CXL_MBOX_INVALID_INPUT; + } + + output->nr_lds = s->mhd_state->nr_lds; + output->nr_heads = s->mhd_state->nr_heads; + output->resv1 = 0; + output->start_ld = start_ld; + output->resv2 = 0; + + for (i = 0; i < ldmap_len && (start_ld + i) < output->nr_lds; i++) { + output->ldmap[i] = s->mhd_state->ldmap[start_ld + i]; + } + output->ldmap_len = i; + + *len_out = sizeof(*output) + output->ldmap_len; + return CXL_MBOX_SUCCESS; +} + +static const struct cxl_cmd cxl_cmd_set_mhsld[256][256] = { + [MHSLD_MHD][GET_MHD_INFO] = {"GET_MULTI_HEADED_INFO", + cmd_mhd_get_info, 2, 0}, +}; + +static Property cxl_mhsld_props[] = { + DEFINE_PROP_UINT32("mhd-head", CXLMHSLDState, mhd_head, ~(0)), + DEFINE_PROP_STRING("mhd-state_file", CXLMHSLDState, mhd_state_file), + DEFINE_PROP_BOOL("mhd-init", CXLMHSLDState, mhd_init, false), + DEFINE_PROP_END_OF_LIST(), +}; + +static 
int cxl_mhsld_state_open(const char *filename, int flags) +{ + char name[128]; + snprintf(name, sizeof(name), "/%s", filename); + return shm_open(name, flags, 0666); +} + +static int cxl_mhsld_state_unlink(const char *filename) +{ + char name[128]; + snprintf(name, sizeof(name), "/%s", filename); + return shm_unlink(name); +} + +static int cxl_mhsld_state_create(const char *filename, size_t size) +{ + int fd, rc; + + fd = cxl_mhsld_state_open(filename, O_RDWR | O_CREAT); + if (fd == -1) { + return -1; + } + + rc = ftruncate(fd, size); + + if (rc) { + close(fd); + return -1; + } + + return fd; +} + +static bool cxl_mhsld_state_set(CXLMHSLDState *s, size_t block_start, + size_t block_count) +{ + uint8_t prev, val, *block; + size_t i; + + val = (1 << s->mhd_head); + + /* + * Try to claim all extents from start -> start + count; + * break early if a claimed extent is encountered + */ + for (i = 0; i < block_count; ++i) { + block = &s->mhd_state->blocks[block_start + i]; + prev = __sync_val_compare_and_swap(block, 0, val); + if (prev != 0) { + break; + } + } + + if (prev == 0) { + return true; + } + + /* Roll back incomplete claims */ + for (;; --i) { + block = &s->mhd_state->blocks[block_start + i]; + __sync_fetch_and_and(block, ~(1u << s->mhd_head)); + if (i == 0) { + break; + } + } + + return false; +} + +static void cxl_mhsld_state_clear(CXLMHSLDState *s, size_t block_start, + size_t block_count) +{ + size_t i; + uint8_t *block; + + for (i = 0; i < block_count; ++i) { + block = &s->mhd_state->blocks[block_start + i]; + __sync_fetch_and_and(block, ~(1u << s->mhd_head)); + } +} + +static void cxl_mhsld_state_initialize(CXLMHSLDState *s, size_t dc_size) +{ + if (!s->mhd_init) { + cxl_mhsld_state_clear(s, 0, dc_size / MHSLD_BLOCK_SZ); + return; + } + + memset(s->mhd_state, 0, s->mhd_state_size); + s->mhd_state->nr_heads = MHSLD_HEADS; + s->mhd_state->nr_lds = MHSLD_HEADS; + s->mhd_state->nr_blocks = dc_size / MHSLD_BLOCK_SZ; +} + +/* Returns starting index of region in 
MHD map. */ +static inline size_t cxl_mhsld_find_dc_region_start(PCIDevice *d, + CXLDCRegion *r) +{ + CXLType3Dev *dcd = CXL_TYPE3(d); + size_t start = 0; + uint8_t rid; + + for (rid = 0; rid < dcd->dc.num_regions; ++rid) { + if (&dcd->dc.regions[rid] == r) { + break; + } + start += dcd->dc.regions[rid].len / dcd->dc.regions[rid].block_size; + } + + return start; +} + +static MHSLDSharedState *cxl_mhsld_state_map(CXLMHSLDState *s) +{ + void *map; + size_t size = s->mhd_state_size; + int fd = s->mhd_state_fd; + + if (fd < 0) { + return NULL; + } + + map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (map == MAP_FAILED) { + return NULL; + } + + return (MHSLDSharedState *)map; +} + +/* + * Triggered during an add_capacity command to a CXL device: + * takes a list of extent records and preallocates them, + * in anticipation of a "dcd accept" response from the host. + * + * Extents that are not accepted by the host will be rolled + * back later. + */ +static bool cxl_mhsld_reserve_extents(PCIDevice *d, + CxlDynamicCapacityExtentList *records, + uint8_t rid) +{ + uint64_t len, dpa; + bool rc; + + CXLMHSLDState *s = CXL_MHSLD(d); + CxlDynamicCapacityExtentList *list = records, *rollback = NULL; + + CXLType3Dev *ct3d = CXL_TYPE3(d); + CXLDCRegion *region = &ct3d->dc.regions[rid]; + + for (; list; list = list->next) { + len = list->value->len / MHSLD_BLOCK_SZ; + dpa = (list->value->offset + region->base) / MHSLD_BLOCK_SZ; + + rc = cxl_mhsld_state_set(s, dpa, len); + + if (!rc) { + rollback = records; + break; + } + } + + /* Setting the mhd state failed. 
Roll back the extents that were added */ + for (; rollback; rollback = rollback->next) { + len = rollback->value->len / MHSLD_BLOCK_SZ; + dpa = (rollback->value->offset + region->base) / MHSLD_BLOCK_SZ; + + cxl_mhsld_state_clear(s, dpa, len); + + if (rollback == list) { + return false; + } + } + + return true; +} + +static bool cxl_mhsld_reclaim_extents(PCIDevice *d, + CXLDCExtentGroupList *ext_groups, + CXLUpdateDCExtentListInPl *in) +{ + CXLMHSLDState *s = CXL_MHSLD(d); + CXLType3Dev *ct3d = CXL_TYPE3(d); + CXLDCExtentGroup *ext_group = QTAILQ_FIRST(ext_groups); + CXLDCExtent *ent; + CXLDCRegion *region; + g_autofree unsigned long *blk_bitmap = NULL; + uint64_t dpa, off, len, size, i; + + /* Get the DCD region via the first requested extent */ + ent = QTAILQ_FIRST(&ext_group->list); + dpa = ent->start_dpa; + len = ent->len; + region = cxl_find_dc_region(ct3d, dpa, len); + size = region->len / MHSLD_BLOCK_SZ; + blk_bitmap = bitmap_new(size); + + /* Set all requested extents to 1 in a bitmap */ + QTAILQ_FOREACH(ent, &ext_group->list, node) { + off = ent->start_dpa - region->base; + len = ent->len; + bitmap_set(blk_bitmap, off / MHSLD_BLOCK_SZ, len / MHSLD_BLOCK_SZ); + } + + /* Clear bits associated with accepted extents */ + for (i = 0; i < in->num_entries_updated; i++) { + off = in->updated_entries[i].start_dpa - region->base; + len = in->updated_entries[i].len; + bitmap_clear(blk_bitmap, off / MHSLD_BLOCK_SZ, len / MHSLD_BLOCK_SZ); + } + + /* + * Reclaim only the blocks that belong to unaccepted extents, + * i.e. 
those whose bits are still raised in blk_bitmap + */ + for (off = find_first_bit(blk_bitmap, size); off < size;) { + len = find_next_zero_bit(blk_bitmap, size, off) - off; + cxl_mhsld_state_clear(s, off, len); + off = find_next_bit(blk_bitmap, size, off + len); + } + + return true; +} + +static bool cxl_mhsld_release_extent(PCIDevice *d, uint64_t dpa, uint64_t len) +{ + cxl_mhsld_state_clear(CXL_MHSLD(d), dpa / MHSLD_BLOCK_SZ, + len / MHSLD_BLOCK_SZ); + return true; +} + +static bool cxl_mhsld_access_valid(PCIDevice *d, uint64_t addr, + unsigned int size) +{ + CXLType3Dev *ct3d = CXL_TYPE3(d); + CXLMHSLDState *s = CXL_MHSLD(d); + CXLDCRegion *r = cxl_find_dc_region(ct3d, addr, size); + size_t i; + + addr = addr / r->block_size; + size = size / r->block_size; + + for (i = 0; i < size; ++i) { + if (s->mhd_state->blocks[addr + i] != (1 << s->mhd_head)) { + return false; + } + } + + return true; +} + +static void cxl_mhsld_realize(PCIDevice *pci_dev, Error **errp) +{ + CXLMHSLDState *s = CXL_MHSLD(pci_dev); + MemoryRegion *mr; + int fd = -1; + size_t dc_size; + + ct3_realize(pci_dev, errp); + + /* Get number of blocks from dcd size */ + mr = host_memory_backend_get_memory(s->ct3d.dc.host_dc); + if (!mr) { + return; + } + dc_size = memory_region_size(mr); + if (!dc_size) { + error_setg(errp, "MHSLD does not have dynamic capacity to manage"); + return; + } + + s->mhd_state_size = (dc_size / MHSLD_BLOCK_SZ) + sizeof(MHSLDSharedState); + + /* Sanity check the head idx */ + if (s->mhd_head >= MHSLD_HEADS) { + error_setg(errp, "MHD Head ID must be between 0-7"); + return; + } + + /* Create the state file if this is the 'mhd_init' instance */ + if (s->mhd_init) { + fd = cxl_mhsld_state_create(s->mhd_state_file, s->mhd_state_size); + } else { + fd = cxl_mhsld_state_open(s->mhd_state_file, O_RDWR); + } + + if (fd < 0) { + error_setg(errp, "failed to open mhsld state errno %d", errno); + return; + } + + s->mhd_state_fd = fd; + + /* Map the state and initialize it as needed */ + 
s->mhd_state = cxl_mhsld_state_map(s); + if (!s->mhd_state) { + error_setg(errp, "Failed to mmap mhd state file"); + close(fd); + cxl_mhsld_state_unlink(s->mhd_state_file); + return; + } + + cxl_mhsld_state_initialize(s, dc_size); + + /* Set the LD ownership for this head to this system */ + s->mhd_state->ldmap[s->mhd_head] = s->mhd_head; + return; +} + + +static void cxl_mhsld_exit(PCIDevice *pci_dev) +{ + CXLMHSLDState *s = CXL_MHSLD(pci_dev); + + ct3_exit(pci_dev); + + if (s->mhd_state_fd) { + munmap(s->mhd_state, s->mhd_state_size); + close(s->mhd_state_fd); + cxl_mhsld_state_unlink(s->mhd_state_file); + s->mhd_state = NULL; + } +} + +static void cxl_mhsld_reset(DeviceState *d) +{ + CXLMHSLDState *s = CXL_MHSLD(d); + + ct3d_reset(d); + cxl_add_cci_commands(&s->ct3d.cci, cxl_cmd_set_mhsld, 512); + + cxl_mhsld_state_clear(s, 0, s->mhd_state->nr_blocks); +} + +/* + * Example: DCD-add events need to validate that the requested extent + * does not already have a mapping (or, if it does, it is + * a shared extent with the right tagging). + * + * Since this operates on the shared state, we will need to serialize + * these callbacks across QEMU instances via a mutex in shared state. 
+ */ + +static void cxl_mhsld_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + PCIDeviceClass *pc = PCI_DEVICE_CLASS(klass); + + pc->realize = cxl_mhsld_realize; + pc->exit = cxl_mhsld_exit; + device_class_set_legacy_reset(dc, cxl_mhsld_reset); + device_class_set_props(dc, cxl_mhsld_props); + + CXLType3Class *cvc = CXL_TYPE3_CLASS(klass); + cvc->mhd_get_info = cmd_mhd_get_info; + cvc->mhd_access_valid = cxl_mhsld_access_valid; + cvc->mhd_reserve_extents = cxl_mhsld_reserve_extents; + cvc->mhd_reclaim_extents = cxl_mhsld_reclaim_extents; + cvc->mhd_release_extent = cxl_mhsld_release_extent; +} + +static const TypeInfo cxl_mhsld_info = { + .name = TYPE_CXL_MHSLD, + .parent = TYPE_CXL_TYPE3, + .class_size = sizeof(struct CXLMHSLDClass), + .class_init = cxl_mhsld_class_init, + .instance_size = sizeof(CXLMHSLDState), + .interfaces = (InterfaceInfo[]) { + { INTERFACE_CXL_DEVICE }, + { INTERFACE_PCIE_DEVICE }, + {} + }, +}; + +static void cxl_mhsld_register_types(void) +{ + type_register_static(&cxl_mhsld_info); +} + +type_init(cxl_mhsld_register_types) diff --git a/hw/cxl/mhsld/mhsld.h b/hw/cxl/mhsld/mhsld.h new file mode 100644 index 0000000000..e7ead1f0d2 --- /dev/null +++ b/hw/cxl/mhsld/mhsld.h @@ -0,0 +1,75 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + * + * Copyright (c) 2024 MemVerge Inc. + * + */ + +#ifndef CXL_MHSLD_H +#define CXL_MHSLD_H +#include +#include "hw/cxl/cxl.h" +#include "hw/cxl/cxl_mailbox.h" +#include "hw/cxl/cxl_device.h" +#include "qemu/units.h" + +#define MHSLD_BLOCK_SZ (2 * MiB) + +/* + * We limit the number of heads to prevent the shared state + * region from becoming a major memory hog. Each 2MB block takes + * one byte of state, so tracking 8-host ownership of 4GB of memory + * needs 2KB. This can change if the block size is changed. + */ +#define MHSLD_HEADS (8) + +/* + * The shared state cannot have 2 variable sized regions + * so we have to max out the ldmap.
+ */ +typedef struct MHSLDSharedState { + uint8_t nr_heads; + uint8_t nr_lds; + uint8_t ldmap[MHSLD_HEADS]; + uint64_t nr_blocks; + uint8_t blocks[]; +} MHSLDSharedState; + +struct CXLMHSLDState { + CXLType3Dev ct3d; + bool mhd_init; + char *mhd_state_file; + int mhd_state_fd; + size_t mhd_state_size; + uint32_t mhd_head; + MHSLDSharedState *mhd_state; +}; + +struct CXLMHSLDClass { + CXLType3Class parent_class; +}; + +enum { + MHSLD_MHD = 0x55, + #define GET_MHD_INFO 0x0 +}; + +/* + * MHD Get Info Command + * Returns information about the LDs associated with this head + */ +typedef struct MHDGetInfoInput { + uint8_t start_ld; + uint8_t ldmap_len; +} QEMU_PACKED MHDGetInfoInput; + +typedef struct MHDGetInfoOutput { + uint8_t nr_lds; + uint8_t nr_heads; + uint16_t resv1; + uint8_t start_ld; + uint8_t ldmap_len; + uint16_t resv2; + uint8_t ldmap[]; +} QEMU_PACKED MHDGetInfoOutput; +#endif
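
For readers following the state layout, here is a minimal standalone sketch (not part of the patch) of how the shared-state size and the per-block ownership check appear to fit together. It assumes one state byte per 2 MiB block, with a block counting as owned by a head only when its byte equals `1 << head`, as in cxl_mhsld_access_valid. All `Sketch`/`sketch_` names are hypothetical stand-ins. Unlike the patch, the sketch walks every block an access touches, including partially covered ones; that is an illustrative choice, not the author's stated behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOCK_SZ (2ULL << 20) /* mirrors MHSLD_BLOCK_SZ (2 MiB) */
#define SKETCH_HEADS 8               /* mirrors MHSLD_HEADS */

/* Hypothetical stand-in with the same fields as MHSLDSharedState. */
typedef struct SketchSharedState {
    uint8_t nr_heads;
    uint8_t nr_lds;
    uint8_t ldmap[SKETCH_HEADS];
    uint64_t nr_blocks;
    uint8_t blocks[]; /* one ownership byte per block */
} SketchSharedState;

/* Same formula as in realize: one byte per block plus the fixed header. */
static size_t sketch_state_size(uint64_t dc_size)
{
    return (size_t)(dc_size / SKETCH_BLOCK_SZ) + sizeof(SketchSharedState);
}

/*
 * Returns true only if every block touched by [addr, addr + size) is owned
 * exclusively by 'head'. Assumption: addr is already region-relative.
 */
static bool sketch_access_valid(const uint8_t *blocks, uint64_t addr,
                                uint64_t size, unsigned head)
{
    uint64_t first, last, i;

    if (size == 0) {
        return true; /* nothing is accessed */
    }

    first = addr / SKETCH_BLOCK_SZ;
    last = (addr + size - 1) / SKETCH_BLOCK_SZ;

    for (i = first; i <= last; i++) {
        if (blocks[i] != (uint8_t)(1u << head)) {
            return false;
        }
    }
    return true;
}
```

With these definitions, a 4 GiB dynamic capacity region yields 2048 block bytes plus the header, and an 8-byte access that straddles two blocks is valid only if both blocks belong to the accessing head.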