diff mbox series

[net-next,05/12] nfp: introduce keepalive mechanism for multi-PF setup

Message ID 20230724094821.14295-6-louis.peens@corigine.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series nfp: add support for multi-pf configuration | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 1342 this patch: 1343
netdev/cc_maintainers warning 3 maintainers not CCed: tglx@linutronix.de edumazet@google.com niklas.soderlund@corigine.com
netdev/build_clang success Errors and warnings before: 1365 this patch: 1365
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1365 this patch: 1365
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Louis Peens July 24, 2023, 9:48 a.m. UTC
From: Yinjun Zhang <yinjun.zhang@corigine.com>

In multi-PF setup, management firmware is in charge of application
firmware unloading instead of driver by keepalive mechanism.

A new NSP resource area is allocated for keepalive use with name
"nfp.beat". Driver sets the magic number when keepalive is needed
and periodically updates the PF's corresponding qword in "nfp.beat".
Management firmware checks these PFs' qwords to learn whether and
which PFs are alive, and will unload the application firmware if
no PF is running. This only works when magic number is correct.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_main.c | 88 +++++++++++++++++++
 drivers/net/ethernet/netronome/nfp/nfp_main.h |  8 ++
 .../net/ethernet/netronome/nfp/nfpcore/nfp.h  |  4 +
 3 files changed, 100 insertions(+)

Comments

kernel test robot July 29, 2023, 8:20 p.m. UTC | #1
Hi Louis,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Louis-Peens/nsp-generate-nsp-command-with-variable-nsp-major-version/20230724-180015
base:   net-next/main
patch link:    https://lore.kernel.org/r/20230724094821.14295-6-louis.peens%40corigine.com
patch subject: [PATCH net-next 05/12] nfp: introduce keepalive mechanism for multi-PF setup
config: openrisc-randconfig-r081-20230730 (https://download.01.org/0day-ci/archive/20230730/202307300422.oPy5E1hB-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 12.3.0
reproduce: (https://download.01.org/0day-ci/archive/20230730/202307300422.oPy5E1hB-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202307300422.oPy5E1hB-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
   drivers/net/ethernet/netronome/nfp/nfp_main.c: note: in included file (through drivers/net/ethernet/netronome/nfp/nfp_net.h):
>> include/linux/io-64-nonatomic-hi-lo.h:22:16: sparse: sparse: cast truncates bits from constant value (6e66702e62656174 becomes 62656174)

vim +22 include/linux/io-64-nonatomic-hi-lo.h

797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07  18  
3a044178cccfeb include/asm-generic/io-64-nonatomic-hi-lo.h Jason Baron    2014-07-04  19  static inline void hi_lo_writeq(__u64 val, volatile void __iomem *addr)
797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07  20  {
797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07  21  	writel(val >> 32, addr + 4);
797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07 @22  	writel(val, addr);
797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07  23  }
3a044178cccfeb include/asm-generic/io-64-nonatomic-hi-lo.h Jason Baron    2014-07-04  24
Yinjun Zhang July 30, 2023, 4:51 a.m. UTC | #2
On Sun, 30 Jul 2023 04:20:57 +0800, kernel test robot wrote:
> Hi Louis,
> 
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on net-next/main]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Louis-Peens/nsp-generate-nsp-command-with-variable-nsp-major-version/20230724-180015
> base:   net-next/main
> patch link:    https://lore.kernel.org/r/20230724094821.14295-6-louis.peens%40corigine.com
> patch subject: [PATCH net-next 05/12] nfp: introduce keepalive mechanism for multi-PF setup
> config: openrisc-randconfig-r081-20230730 (https://download.01.org/0day-ci/archive/20230730/202307300422.oPy5E1hB-lkp@intel.com/config)
> compiler: or1k-linux-gcc (GCC) 12.3.0
> reproduce: (https://download.01.org/0day-ci/archive/20230730/202307300422.oPy5E1hB-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202307300422.oPy5E1hB-lkp@intel.com/
> 
> sparse warnings: (new ones prefixed by >>)
>    drivers/net/ethernet/netronome/nfp/nfp_main.c: note: in included file (through drivers/net/ethernet/netronome/nfp/nfp_net.h):
> >> include/linux/io-64-nonatomic-hi-lo.h:22:16: sparse: sparse: cast truncates bits from constant value (6e66702e62656174 becomes 62656174)

I think it's more like a callee's problem instead of the caller's.
`writeq` is supposed to be able to be fed with a constant. WDYT?

> 
> vim +22 include/linux/io-64-nonatomic-hi-lo.h
> 
> 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07  18  
> 3a044178cccfeb include/asm-generic/io-64-nonatomic-hi-lo.h Jason Baron    2014-07-04  19  static inline void hi_lo_writeq(__u64 val, volatile void __iomem *addr)
> 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07  20  {
> 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07  21  	writel(val >> 32, addr + 4);
> 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07 @22  	writel(val, addr);
> 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi Mitake 2012-02-07  23  }
> 3a044178cccfeb include/asm-generic/io-64-nonatomic-hi-lo.h Jason Baron    2014-07-04  24  
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
Yinjun Zhang Aug. 7, 2023, 2:10 a.m. UTC | #3
On Sunday, July 30, 2023 12:52 PM, Yinjun Zhang wrote:
> On Sun, 30 Jul 2023 04:20:57 +0800, kernel test robot wrote:
> > Hi Louis,
> >
> > kernel test robot noticed the following build warnings:
> >
> > [auto build test WARNING on net-next/main]
> >
> > url:    https://github.com/intel-lab-lkp/linux/commits/Louis-Peens/nsp-
> generate-nsp-command-with-variable-nsp-major-version/20230724-180015
> > base:   net-next/main
> > patch link:    https://lore.kernel.org/r/20230724094821.14295-6-
> louis.peens%40corigine.com
> > patch subject: [PATCH net-next 05/12] nfp: introduce keepalive mechanism
> for multi-PF setup
> > config: openrisc-randconfig-r081-20230730 (https://download.01.org/0day-
> ci/archive/20230730/202307300422.oPy5E1hB-lkp@intel.com/config)
> > compiler: or1k-linux-gcc (GCC) 12.3.0
> > reproduce: (https://download.01.org/0day-
> ci/archive/20230730/202307300422.oPy5E1hB-lkp@intel.com/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp@intel.com>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202307300422.oPy5E1hB-
> lkp@intel.com/
> >
> > sparse warnings: (new ones prefixed by >>)
> >    drivers/net/ethernet/netronome/nfp/nfp_main.c: note: in included file
> (through drivers/net/ethernet/netronome/nfp/nfp_net.h):
> > >> include/linux/io-64-nonatomic-hi-lo.h:22:16: sparse: sparse: cast
> truncates bits from constant value (6e66702e62656174 becomes 62656174)
> 
> I think it's more like a callee's problem instead of the caller's.
> `writeq` is supposed to be able to be fed with a constant. WDYT?

There's no response for one week. To compromise, I'm going to change
the caller in driver side. Let me know if anybody has some other comments.

> 
> >
> > vim +22 include/linux/io-64-nonatomic-hi-lo.h
> >
> > 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi
> Mitake 2012-02-07  18
> > 3a044178cccfeb include/asm-generic/io-64-nonatomic-hi-lo.h Jason Baron
> 2014-07-04  19  static inline void hi_lo_writeq(__u64 val, volatile void
> __iomem *addr)
> > 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi
> Mitake 2012-02-07  20  {
> > 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi
> Mitake 2012-02-07  21  	writel(val >> 32, addr + 4);
> > 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi
> Mitake 2012-02-07 @22  	writel(val, addr);
> > 797a796a13df6b include/asm-generic/io-64-nonatomic-hi-lo.h Hitoshi
> Mitake 2012-02-07  23  }
> > 3a044178cccfeb include/asm-generic/io-64-nonatomic-hi-lo.h Jason Baron
> 2014-07-04  24
> >
> > --
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests/wiki
diff mbox series

Patch

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.c b/drivers/net/ethernet/netronome/nfp/nfp_main.c
index 778f21dfbbd5..489113c53596 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_main.c
@@ -469,6 +469,77 @@  nfp_get_fw_policy_value(struct pci_dev *pdev, struct nfp_nsp *nsp,
 	return err;
 }
 
+static void
+nfp_nsp_beat_timer(struct timer_list *t)
+{
+	struct nfp_pf *pf = from_timer(pf, t, multi_pf.beat_timer);
+	u8 __iomem *addr;
+
+	/* Each PF has corresponding qword to beat:
+	 * offset | usage
+	 *   0    | magic number
+	 *   8    | beat qword of pf0
+	 *   16   | beat qword of pf1
+	 */
+	addr = pf->multi_pf.beat_addr + ((pf->multi_pf.id + 1) << 3);
+	writeq(jiffies, addr);
+	/* Beat once per second. */
+	mod_timer(&pf->multi_pf.beat_timer, jiffies + HZ);
+}
+
+/**
+ * nfp_nsp_keepalive_start() - Start keepalive mechanism if needed
+ * @pf:		NFP PF Device structure
+ *
+ * Return 0 if no error, errno otherwise
+ */
+static int
+nfp_nsp_keepalive_start(struct nfp_pf *pf)
+{
+	struct nfp_resource *res;
+	u8 __iomem *base;
+	int err = 0;
+	u64 addr;
+	u32 cpp;
+
+	if (!pf->multi_pf.en)
+		return 0;
+
+	res = nfp_resource_acquire(pf->cpp, NFP_KEEPALIVE);
+	if (IS_ERR(res))
+		return PTR_ERR(res);
+
+	cpp = nfp_resource_cpp_id(res);
+	addr = nfp_resource_address(res);
+
+	/* Allocate a fixed area for keepalive. */
+	base = nfp_cpp_map_area(pf->cpp, "keepalive", cpp, addr,
+				nfp_resource_size(res), &pf->multi_pf.beat_area);
+	if (IS_ERR(base)) {
+		nfp_err(pf->cpp, "Failed to map area for keepalive\n");
+		err = PTR_ERR(base);
+		goto res_release;
+	}
+
+	pf->multi_pf.beat_addr = base;
+	timer_setup(&pf->multi_pf.beat_timer, nfp_nsp_beat_timer, 0);
+	mod_timer(&pf->multi_pf.beat_timer, jiffies);
+
+res_release:
+	nfp_resource_release(res);
+	return err;
+}
+
+static void
+nfp_nsp_keepalive_stop(struct nfp_pf *pf)
+{
+	if (!pf->multi_pf.beat_area)
+		return;
+
+	del_timer_sync(&pf->multi_pf.beat_timer);
+	nfp_cpp_area_release_free(pf->multi_pf.beat_area);
+}
+
 static bool
 nfp_skip_fw_load(struct nfp_pf *pf, struct nfp_nsp *nsp)
 {
@@ -550,6 +621,10 @@  nfp_fw_load(struct pci_dev *pdev, struct nfp_pf *pf, struct nfp_nsp *nsp)
 	if (err)
 		return err;
 
+	err = nfp_nsp_keepalive_start(pf);
+	if (err)
+		return err;
+
 	if (nfp_skip_fw_load(pf, nsp))
 		return 1;
 
@@ -620,6 +695,16 @@  nfp_fw_load(struct pci_dev *pdev, struct nfp_pf *pf, struct nfp_nsp *nsp)
 	if (fw_loaded && ifcs == 1 && !pf->multi_pf.en)
 		pf->unload_fw_on_remove = true;
 
+	/* Only setting magic number when fw is freshly loaded here. NSP
+	 * won't unload fw when heartbeat stops if the magic number is not
+	 * correct. It's used when firmware is preloaded and shouldn't be
+	 * unloaded when driver exits.
+	 */
+	if (err < 0)
+		nfp_nsp_keepalive_stop(pf);
+	else if (fw_loaded && pf->multi_pf.en)
+		writeq(NFP_KEEPALIVE_MAGIC, pf->multi_pf.beat_addr);
+
 	return err < 0 ? err : fw_loaded;
 }
 
@@ -666,6 +751,7 @@  static int nfp_nsp_init(struct pci_dev *pdev, struct nfp_pf *pf)
 	}
 
 	pf->multi_pf.en = pdev->multifunction;
+	pf->multi_pf.id = PCI_FUNC(pdev->devfn);
 	dev_info(&pdev->dev, "%s-PF detected\n", pf->multi_pf.en ? "Multi" : "Single");
 
 	err = nfp_nsp_wait(nsp);
@@ -913,6 +999,7 @@  static int nfp_pci_probe(struct pci_dev *pdev,
 err_net_remove:
 	nfp_net_pci_remove(pf);
 err_fw_unload:
+	nfp_nsp_keepalive_stop(pf);
 	kfree(pf->rtbl);
 	nfp_mip_close(pf->mip);
 	if (pf->unload_fw_on_remove)
@@ -952,6 +1039,7 @@  static void __nfp_pci_shutdown(struct pci_dev *pdev, bool unload_fw)
 	nfp_net_pci_remove(pf);
 
 	vfree(pf->dumpspec);
+	nfp_nsp_keepalive_stop(pf);
 	kfree(pf->rtbl);
 	nfp_mip_close(pf->mip);
 	if (unload_fw && pf->unload_fw_on_remove)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.h b/drivers/net/ethernet/netronome/nfp/nfp_main.h
index 72ea3b83d313..c58849a332b0 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_main.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_main.h
@@ -86,6 +86,10 @@  struct nfp_dumpspec {
  * @num_shared_bufs:	Number of elements in @shared_bufs
  * @multi_pf:		Used in multi-PF setup
  * @multi_pf.en:	True if it's a NIC with multiple PFs
+ * @multi_pf.id:	PF index
+ * @multi_pf.beat_timer:Timer for beat to keepalive
+ * @multi_pf.beat_area:	Pointer to CPP area for beat to keepalive
+ * @multi_pf.beat_addr:	Pointer to mapped beat address used for keepalive
  *
  * Fields which may change after proble are protected by devlink instance lock.
  */
@@ -146,6 +150,10 @@  struct nfp_pf {
 
 	struct {
 		bool en;
+		u8 id;
+		struct timer_list beat_timer;
+		struct nfp_cpp_area *beat_area;
+		u8 __iomem *beat_addr;
 	} multi_pf;
 };
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp.h b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp.h
index db94b0bddc92..89a131cffc48 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp.h
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp.h
@@ -64,6 +64,10 @@  int nfp_nsp_read_sensors(struct nfp_nsp *state, unsigned int sensor_mask,
 /* MAC Statistics Accumulator */
 #define NFP_RESOURCE_MAC_STATISTICS	"mac.stat"
 
+/* Keepalive */
+#define NFP_KEEPALIVE			"nfp.beat"
+#define NFP_KEEPALIVE_MAGIC		0x6e66702e62656174ULL /* ASCII of "nfp.beat" */
+
 int nfp_resource_table_init(struct nfp_cpp *cpp);
 
 struct nfp_resource *