Message ID | 20250129004337.36898-2-shannon.nelson@amd.com (mailing list archive) |
---|---|
State | New |
Delegated to: | Netdev Maintainers |
Series | pds_core: fixes for adminq overflow |
On Tue, Jan 28, 2025 at 04:43:36PM -0800, Shannon Nelson wrote:
> From: Brett Creeley <brett.creeley@amd.com>
>
> The pds_core's adminq is protected by the adminq_lock, which prevents
> more than 1 command to be posted onto it at any one time. This makes it
> so the client drivers cannot simultaneously post adminq commands.
> However, the completions happen in a different context, which means
> multiple adminq commands can be posted sequentially and all waiting
> on completion.
>
> On the FW side, the backing adminq request queue is only 16 entries
> long and the retry mechanism and/or overflow/stuck prevention is
> lacking. This can cause the adminq to get stuck, so commands are no
> longer processed and completions are no longer sent by the FW.
>
> As an initial fix, prevent more than 16 outstanding adminq commands so
> there's no way to cause the adminq from getting stuck. This works
> because the backing adminq request queue will never have more than 16
> pending adminq commands, so it will never overflow. This is done by
> reducing the adminq depth to 16.
>
> This is just the first step to fix this issue because there are already
> devices being used. Moving forward a new capability bit will be defined
> and set if the FW can gracefully handle the host driver/device having a
> deeper adminq.
>
> Fixes: 792d36ccc163 ("pds_core: Clean up init/uninit flows to be more readable")
> Signed-off-by: Brett Creeley <brett.creeley@amd.com>
> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
> ---
>  drivers/net/ethernet/amd/pds_core/core.c | 5 +----
>  drivers/net/ethernet/amd/pds_core/core.h | 2 +-
>  2 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/amd/pds_core/core.c b/drivers/net/ethernet/amd/pds_core/core.c
> index 536635e57727..4830292d5f87 100644
> --- a/drivers/net/ethernet/amd/pds_core/core.c
> +++ b/drivers/net/ethernet/amd/pds_core/core.c
> @@ -325,10 +325,7 @@ static int pdsc_core_init(struct pdsc *pdsc)
>  	size_t sz;
>  	int err;
>
> -	/* Scale the descriptor ring length based on number of CPUs and VFs */
> -	numdescs = max_t(int, PDSC_ADMINQ_MIN_LENGTH, num_online_cpus());
> -	numdescs += 2 * pci_sriov_get_totalvfs(pdsc->pdev);
> -	numdescs = roundup_pow_of_two(numdescs);
> +	numdescs = PDSC_ADMINQ_MAX_LENGTH;
>  	err = pdsc_qcq_alloc(pdsc, PDS_CORE_QTYPE_ADMINQ, 0, "adminq",
>  			     PDS_CORE_QCQ_F_CORE | PDS_CORE_QCQ_F_INTR,
>  			     numdescs,
> diff --git a/drivers/net/ethernet/amd/pds_core/core.h b/drivers/net/ethernet/amd/pds_core/core.h
> index 14522d6d5f86..543097983bf6 100644
> --- a/drivers/net/ethernet/amd/pds_core/core.h
> +++ b/drivers/net/ethernet/amd/pds_core/core.h
> @@ -16,7 +16,7 @@
>
>  #define PDSC_WATCHDOG_SECS 5
>  #define PDSC_QUEUE_NAME_MAX_SZ 16
> -#define PDSC_ADMINQ_MIN_LENGTH 16 /* must be a power of two */
> +#define PDSC_ADMINQ_MAX_LENGTH 16 /* must be a power of two */
>  #define PDSC_NOTIFYQ_LENGTH 64 /* must be a power of two */
>  #define PDSC_TEARDOWN_RECOVERY false
>  #define PDSC_TEARDOWN_REMOVING true
> --

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>

> 2.17.1
diff --git a/drivers/net/ethernet/amd/pds_core/core.c b/drivers/net/ethernet/amd/pds_core/core.c
index 536635e57727..4830292d5f87 100644
--- a/drivers/net/ethernet/amd/pds_core/core.c
+++ b/drivers/net/ethernet/amd/pds_core/core.c
@@ -325,10 +325,7 @@ static int pdsc_core_init(struct pdsc *pdsc)
 	size_t sz;
 	int err;

-	/* Scale the descriptor ring length based on number of CPUs and VFs */
-	numdescs = max_t(int, PDSC_ADMINQ_MIN_LENGTH, num_online_cpus());
-	numdescs += 2 * pci_sriov_get_totalvfs(pdsc->pdev);
-	numdescs = roundup_pow_of_two(numdescs);
+	numdescs = PDSC_ADMINQ_MAX_LENGTH;
 	err = pdsc_qcq_alloc(pdsc, PDS_CORE_QTYPE_ADMINQ, 0, "adminq",
 			     PDS_CORE_QCQ_F_CORE | PDS_CORE_QCQ_F_INTR,
 			     numdescs,
diff --git a/drivers/net/ethernet/amd/pds_core/core.h b/drivers/net/ethernet/amd/pds_core/core.h
index 14522d6d5f86..543097983bf6 100644
--- a/drivers/net/ethernet/amd/pds_core/core.h
+++ b/drivers/net/ethernet/amd/pds_core/core.h
@@ -16,7 +16,7 @@

 #define PDSC_WATCHDOG_SECS 5
 #define PDSC_QUEUE_NAME_MAX_SZ 16
-#define PDSC_ADMINQ_MIN_LENGTH 16 /* must be a power of two */
+#define PDSC_ADMINQ_MAX_LENGTH 16 /* must be a power of two */
 #define PDSC_NOTIFYQ_LENGTH 64 /* must be a power of two */
 #define PDSC_TEARDOWN_RECOVERY false
 #define PDSC_TEARDOWN_REMOVING true
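
For readers who want to see the sizing arithmetic, here is a rough userspace sketch of the removed calculation next to the new fixed depth. It is not driver code. `roundup_pow2()`, `old_numdescs()`, `new_numdescs()` and `FW_ADMINQ_BACKING_DEPTH` are hypothetical names invented for this illustration; only the two `PDSC_ADMINQ_*` values and the 16-entry FW queue depth come from the patch and its commit message.

```c
/* Illustrative sketch only: userspace stand-ins, not the pds_core driver. */

/* From the commit message: the FW backing request queue holds 16 entries. */
#define FW_ADMINQ_BACKING_DEPTH 16u
#define PDSC_ADMINQ_MIN_LENGTH  16u	/* old lower bound, removed by the patch */
#define PDSC_ADMINQ_MAX_LENGTH  16u	/* new fixed depth */

/* Hypothetical stand-in for the kernel's roundup_pow_of_two(), for n >= 1. */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Old sizing: max(MIN_LENGTH, CPUs) + 2 * total VFs, rounded up to 2^k. */
static unsigned int old_numdescs(unsigned int cpus, unsigned int total_vfs)
{
	unsigned int n = cpus > PDSC_ADMINQ_MIN_LENGTH ?
			 cpus : PDSC_ADMINQ_MIN_LENGTH;

	n += 2 * total_vfs;
	return roundup_pow2(n);
}

/* New sizing: a constant that never exceeds the FW backing queue depth. */
static unsigned int new_numdescs(void)
{
	return PDSC_ADMINQ_MAX_LENGTH;
}
```

Under these assumptions, a host with 64 online CPUs and 8 total VFs would have sized the ring as roundup(64 + 2 * 8) = 128 descriptors, well past the 16 entries the FW can queue. Even a single VF pushed the old result from 16 to 32. The new value is 16 regardless of host configuration.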