
[iwl-next,v4,08/12] ice: Save and load RX Queue head

Message ID 20231121025111.257597-9-yahui.cao@intel.com
State Awaiting Upstream
Delegated to: Netdev Maintainers
Series Add E800 live migration driver

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/codegen success Generated files up to date
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers warning 2 maintainers not CCed: jesse.brandeburg@intel.com anthony.l.nguyen@intel.com
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 164 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Cao, Yahui Nov. 21, 2023, 2:51 a.m. UTC
From: Lingyu Liu <lingyu.liu@intel.com>

The RX Queue head is a fundamental DMA ring context field which determines
the next RX descriptor to be fetched. However, the RX Queue head is not
visible to the VF; it is only visible to the PF. As a result, the PF needs
to save and load the RX Queue head explicitly.

Since network packets may arrive at any time once the RX Queue is enabled,
the RX Queue head needs to be loaded before the queue is enabled.

The RX Queue head loading handler is implemented by reading the queue
context and then overwriting it with the saved HEAD value.
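
Condensed, the per-queue load step boils down to the sketch below. The
ice_read_rxq_ctx()/ice_write_rxq_ctx() helpers and the rlan_ctx.head
field are the ones used in the diff; the wrapper function itself is
only illustrative.

static int
ice_migration_load_one_rx_head(struct ice_hw *hw, u16 rxq_index, u16 head)
{
	struct ice_rlan_ctx rlan_ctx = {};
	int status;

	/* Read the current RX queue context from HW */
	status = ice_read_rxq_ctx(hw, &rlan_ctx, rxq_index);
	if (status)
		return -EIO;

	/* Overwrite the head, i.e. the next descriptor the HW will fetch */
	rlan_ctx.head = head;

	/* Write the context back before the queue is enabled */
	status = ice_write_rxq_ctx(hw, &rlan_ctx, rxq_index);
	if (status)
		return -EIO;

	return 0;
}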

Signed-off-by: Lingyu Liu <lingyu.liu@intel.com>
Signed-off-by: Yahui Cao <yahui.cao@intel.com>
---
 .../net/ethernet/intel/ice/ice_migration.c    | 125 ++++++++++++++++++
 1 file changed, 125 insertions(+)

Comments

Tian, Kevin Dec. 7, 2023, 7:55 a.m. UTC | #1
> From: Cao, Yahui <yahui.cao@intel.com>
> Sent: Tuesday, November 21, 2023 10:51 AM
>
> +
> +		/* Once RX Queue is enabled, network traffic may come in at any
> +		 * time. As a result, RX Queue head needs to be loaded before
> +		 * RX Queue is enabled.
> +		 * For simplicity and integration, overwrite RX head just after
> +		 * RX ring context is configured.
> +		 */
> +		if (msg_slot->opcode == VIRTCHNL_OP_CONFIG_VSI_QUEUES) {
> +			ret = ice_migration_load_rx_head(vf, devstate);
> +			if (ret) {
> +				dev_err(dev, "VF %d failed to load rx head\n",
> +					vf->vf_id);
> +				goto out_clear_replay;
> +			}
> +		}
> +

Don't we have the same problem here as with the TX head restore: the
vfio migration protocol doesn't carry a way to tell whether the IOAS
associated with the device has been restored, so allowing RX DMA at
this point might cause a device error?

@Jason, is it a common gap applying to all devices which include a
receiving path from link? How is it handled in mlx migration
driver?

I may overlook an important aspect here but if not I wonder whether
the migration driver should keep DMA disabled (at least for RX) even
when the device moves to RUNNING and then introduce an explicit
enable-DMA state which VMM can request after it restores the
relevant IOAS/HWPT...
with the device.
Jason Gunthorpe Dec. 7, 2023, 2:46 p.m. UTC | #2
On Thu, Dec 07, 2023 at 07:55:17AM +0000, Tian, Kevin wrote:
> > From: Cao, Yahui <yahui.cao@intel.com>
> > Sent: Tuesday, November 21, 2023 10:51 AM
> >
> > +
> > +		/* Once RX Queue is enabled, network traffic may come in at any
> > +		 * time. As a result, RX Queue head needs to be loaded before
> > +		 * RX Queue is enabled.
> > +		 * For simplicity and integration, overwrite RX head just after
> > +		 * RX ring context is configured.
> > +		 */
> > +		if (msg_slot->opcode == VIRTCHNL_OP_CONFIG_VSI_QUEUES) {
> > +			ret = ice_migration_load_rx_head(vf, devstate);
> > +			if (ret) {
> > +				dev_err(dev, "VF %d failed to load rx head\n",
> > +					vf->vf_id);
> > +				goto out_clear_replay;
> > +			}
> > +		}
> > +
> 
> Don't we have the same problem here as with the TX head restore: the
> vfio migration protocol doesn't carry a way to tell whether the IOAS
> associated with the device has been restored, so allowing RX DMA at
> this point might cause a device error?

Does this trigger a DMA?

> @Jason, is it a common gap applying to all devices which include a
> receiving path from link? How is it handled in mlx migration
> driver?

There should be no DMA until the device is placed in RUNNING. All
devices may instantly trigger DMA once placed in RUNNING.

The VMM must ensure the entire environment is ready to go before
putting anything in RUNNING, including having setup the IOMMU.

> I may overlook an important aspect here but if not I wonder whether
> the migration driver should keep DMA disabled (at least for RX) even
> when the device moves to RUNNING and then introduce an explicit
> enable-DMA state which VMM can request after it restores the
> relevant IOAS/HWPT...
> with the device.

Why do we need a state like this?

Jason
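
For reference, the ordering Jason describes maps onto the standard VFIO
migration uAPI (VFIO_DEVICE_FEATURE with VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE)
roughly as in the minimal userspace sketch below; the helper name and the
simplified flow are illustrative, and data_fd handling is omitted.

#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Move a migratable VFIO device to @new_state (one of VFIO_DEVICE_STATE_*). */
static int vfio_mig_set_state(int device_fd, unsigned int new_state)
{
	__u8 buf[sizeof(struct vfio_device_feature) +
		 sizeof(struct vfio_device_feature_mig_state)] = { 0 };
	struct vfio_device_feature *feature = (void *)buf;
	struct vfio_device_feature_mig_state *mig = (void *)feature->data;

	feature->argsz = sizeof(buf);
	feature->flags = VFIO_DEVICE_FEATURE_SET |
			 VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE;
	mig->device_state = new_state;

	return ioctl(device_fd, VFIO_DEVICE_FEATURE, feature);
}

/*
 * Destination-side ordering (simplified):
 *   1. restore guest memory and map it into the IOAS/IOMMU
 *   2. vfio_mig_set_state(fd, VFIO_DEVICE_STATE_RESUMING) and write the
 *      saved device state into the data_fd the kernel hands back
 *   3. vfio_mig_set_state(fd, VFIO_DEVICE_STATE_STOP)
 *   4. vfio_mig_set_state(fd, VFIO_DEVICE_STATE_RUNNING), after which the
 *      device may start DMA at any time
 */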
Tian, Kevin Dec. 8, 2023, 2:53 a.m. UTC | #3
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, December 7, 2023 10:46 PM
> 
> On Thu, Dec 07, 2023 at 07:55:17AM +0000, Tian, Kevin wrote:
> > > From: Cao, Yahui <yahui.cao@intel.com>
> > > Sent: Tuesday, November 21, 2023 10:51 AM
> > >
> > > +
> > > +		/* Once RX Queue is enabled, network traffic may come in at any
> > > +		 * time. As a result, RX Queue head needs to be loaded before
> > > +		 * RX Queue is enabled.
> > > +		 * For simplicity and integration, overwrite RX head just after
> > > +		 * RX ring context is configured.
> > > +		 */
> > > +		if (msg_slot->opcode == VIRTCHNL_OP_CONFIG_VSI_QUEUES) {
> > > +			ret = ice_migration_load_rx_head(vf, devstate);
> > > +			if (ret) {
> > > +				dev_err(dev, "VF %d failed to load rx head\n",
> > > +					vf->vf_id);
> > > +				goto out_clear_replay;
> > > +			}
> > > +		}
> > > +
> >
> > Don't we have the same problem here as with the TX head restore: the
> > vfio migration protocol doesn't carry a way to tell whether the IOAS
> > associated with the device has been restored, so allowing RX DMA at
> > this point might cause a device error?
> 
> Does this trigger a DMA?

Looks like yes, judging from the comment.

> 
> > @Jason, is it a common gap applying to all devices which include a
> > receiving path from link? How is it handled in mlx migration
> > driver?
> 
> There should be no DMA until the device is placed in RUNNING. All
> devices may instantly trigger DMA once placed in RUNNING.
> 
> The VMM must ensure the entire environment is ready to go before
> putting anything in RUNNING, including having setup the IOMMU.
> 

Ah, yes, that is the right behavior.

So if there is no other way to block DMA before RUNNING is reached, the
RX queue here should be left disabled until the transition to RUNNING.

Yahui, can you double check?
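
One shape Kevin's suggestion could take, purely as a sketch (none of the
identifiers below exist in the patch): the replay path only records which
RX queues the guest had enabled, and the variant driver applies the
enables when it transitions to RUNNING, after the VMM has restored the
IOAS.

#include <linux/bitmap.h>

#define ICE_MIG_MAX_RXQ 256	/* mirrors ICE_MIG_VF_QRX_TAIL_MAX in the patch */

struct ice_mig_deferred_rx {
	DECLARE_BITMAP(rxq_ena, ICE_MIG_MAX_RXQ);
};

/* Replay path: remember that the guest wanted this queue enabled, but do
 * not enable it yet.
 */
static void ice_mig_defer_rxq_enable(struct ice_mig_deferred_rx *d, u16 rxq)
{
	set_bit(rxq, d->rxq_ena);
}

/* RUNNING transition: the IOAS is guaranteed to be restored by now, so
 * apply the deferred enables through a caller-supplied callback.
 */
static void ice_mig_apply_deferred_rxq(struct ice_mig_deferred_rx *d,
				       void (*ena_rxq)(void *ctx, u16 rxq),
				       void *ctx)
{
	unsigned int rxq;

	for_each_set_bit(rxq, d->rxq_ena, ICE_MIG_MAX_RXQ)
		ena_rxq(ctx, rxq);
}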

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c
index 780d2183011a..473be6a83cf3 100644
--- a/drivers/net/ethernet/intel/ice/ice_migration.c
+++ b/drivers/net/ethernet/intel/ice/ice_migration.c
@@ -2,9 +2,11 @@ 
 /* Copyright (C) 2018-2023 Intel Corporation */
 
 #include "ice.h"
+#include "ice_base.h"
 
 #define ICE_MIG_DEVSTAT_MAGIC			0xE8000001
 #define ICE_MIG_DEVSTAT_VERSION			0x1
+#define ICE_MIG_VF_QRX_TAIL_MAX			256
 
 struct ice_migration_virtchnl_msg_slot {
 	u32 opcode;
@@ -26,6 +28,8 @@  struct ice_migration_dev_state {
 	u16 num_rxq;
 
 	u16 vsi_id;
+	/* next RX desc index to be processed by the device */
+	u16 rx_head[ICE_MIG_VF_QRX_TAIL_MAX];
 	u8 virtchnl_msgs[];
 } __aligned(8);
 
@@ -264,6 +268,54 @@  u32 ice_migration_supported_caps(void)
 	return VIRTCHNL_VF_MIGRATION_SUPPORT_FEATURE;
 }
 
+/**
+ * ice_migration_save_rx_head - save rx head into device state buffer
+ * @vf: pointer to VF structure
+ * @devstate: pointer to migration buffer
+ *
+ * Return 0 for success, negative for error
+ */
+static int
+ice_migration_save_rx_head(struct ice_vf *vf,
+			   struct ice_migration_dev_state *devstate)
+{
+	struct device *dev = ice_pf_to_dev(vf->pf);
+	struct ice_vsi *vsi;
+	int i;
+
+	vsi = ice_get_vf_vsi(vf);
+	if (!vsi) {
+		dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id);
+		return -EINVAL;
+	}
+
+	ice_for_each_rxq(vsi, i) {
+		struct ice_rx_ring *rx_ring = vsi->rx_rings[i];
+		struct ice_rlan_ctx rlan_ctx = {};
+		struct ice_hw *hw = &vf->pf->hw;
+		u16 rxq_index;
+		int status;
+
+		if (WARN_ON_ONCE(!rx_ring))
+			return -EINVAL;
+
+		devstate->rx_head[i] = 0;
+		if (!test_bit(i, vf->rxq_ena))
+			continue;
+
+		rxq_index = rx_ring->reg_idx;
+		status = ice_read_rxq_ctx(hw, &rlan_ctx, rxq_index);
+		if (status) {
+			dev_err(dev, "Failed to read RXQ[%d] context, err=%d\n",
+				rx_ring->q_index, status);
+			return -EIO;
+		}
+		devstate->rx_head[i] = rlan_ctx.head;
+	}
+
+	return 0;
+}
+
 /**
  * ice_migration_save_devstate - save device state to migration buffer
  * @pf: pointer to PF of migration device
@@ -318,6 +370,12 @@  ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_sz)
 	buf = devstate->virtchnl_msgs;
 	devstate->vsi_id = vf->vm_vsi_num;
 
+	ret = ice_migration_save_rx_head(vf, devstate);
+	if (ret) {
+		dev_err(dev, "VF %d failed to save rxq head\n", vf->vf_id);
+		goto out_put_vf;
+	}
+
 	list_for_each_entry(msg_listnode, &vf->virtchnl_msg_list, node) {
 		struct ice_migration_virtchnl_msg_slot *msg_slot;
 		u64 slot_size;
@@ -409,6 +467,57 @@  ice_migration_check_match(struct ice_vf *vf, const u8 *buf, u64 buf_sz)
 	return 0;
 }
 
+/**
+ * ice_migration_load_rx_head - load rx head from device state buffer
+ * @vf: pointer to VF structure
+ * @devstate: pointer to migration device state
+ *
+ * Return 0 for success, negative for error
+ */
+static int
+ice_migration_load_rx_head(struct ice_vf *vf,
+			   struct ice_migration_dev_state *devstate)
+{
+	struct device *dev = ice_pf_to_dev(vf->pf);
+	struct ice_vsi *vsi;
+	int i;
+
+	vsi = ice_get_vf_vsi(vf);
+	if (!vsi) {
+		dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id);
+		return -EINVAL;
+	}
+
+	ice_for_each_rxq(vsi, i) {
+		struct ice_rx_ring *rx_ring = vsi->rx_rings[i];
+		struct ice_rlan_ctx rlan_ctx = {};
+		struct ice_hw *hw = &vf->pf->hw;
+		u16 rxq_index;
+		int status;
+
+		if (WARN_ON_ONCE(!rx_ring))
+			return -EINVAL;
+
+		rxq_index = rx_ring->reg_idx;
+		status = ice_read_rxq_ctx(hw, &rlan_ctx, rxq_index);
+		if (status) {
+			dev_err(dev, "Failed to read RXQ[%d] context, err=%d\n",
+				rx_ring->q_index, status);
+			return -EIO;
+		}
+
+		rlan_ctx.head = devstate->rx_head[i];
+		status = ice_write_rxq_ctx(hw, &rlan_ctx, rxq_index);
+		if (status) {
+			dev_err(dev, "Failed to set LAN RXQ[%d] context, err=%d\n",
+				rx_ring->q_index, status);
+			return -EIO;
+		}
+	}
+
+	return 0;
+}
+
 /**
  * ice_migration_load_devstate - load device state at destination
  * @pf: pointer to PF of migration device
@@ -467,6 +576,22 @@  int ice_migration_load_devstate(struct ice_pf *pf, int vf_id,
 				vf->vf_id, msg_slot->opcode);
 			goto out_clear_replay;
 		}
+
+		/* Once RX Queue is enabled, network traffic may come in at any
+		 * time. As a result, RX Queue head needs to be loaded before
+		 * RX Queue is enabled.
+		 * For simplicity and integration, overwrite RX head just after
+		 * RX ring context is configured.
+		 */
+		if (msg_slot->opcode == VIRTCHNL_OP_CONFIG_VSI_QUEUES) {
+			ret = ice_migration_load_rx_head(vf, devstate);
+			if (ret) {
+				dev_err(dev, "VF %d failed to load rx head\n",
+					vf->vf_id);
+				goto out_clear_replay;
+			}
+		}
+
 		event.msg_buf = NULL;
 		msg_slot = (struct ice_migration_virtchnl_msg_slot *)
 					((char *)msg_slot + slot_sz);