Message ID | 20240206033804.1198416-1-alan.brady@intel.com (mailing list archive)
---|---
Series | idpf: refactor virtchnl messages
From: Alan Brady <alan.brady@intel.com>
Date: Mon, 5 Feb 2024 19:37:54 -0800

> The motivation for this series has two primary goals. We want to enable
> support of multiple simultaneous messages and make the channel more
> robust. The way it works right now, the driver can only send and receive
> a single message at a time and if something goes really wrong, it can
> lead to data corruption and strange bugs.

This works better than v3.
For the basic scenarios:

Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com>

Thanks,
Olek
On Mon, 5 Feb 2024 19:37:54 -0800 Alan Brady wrote:
> The motivation for this series has two primary goals. We want to enable
> support of multiple simultaneous messages and make the channel more
> robust. The way it works right now, the driver can only send and receive
> a single message at a time and if something goes really wrong, it can
> lead to data corruption and strange bugs.

Coccinelle points out some potential places to use min()

testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:1956:17-18: WARNING opportunity for min()
testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:1271:17-18: WARNING opportunity for min()
testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:1319:17-18: WARNING opportunity for min()
testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:2640:17-18: WARNING opportunity for min()
testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:1295:17-18: WARNING opportunity for min()
testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:2157:17-18: WARNING opportunity for min()
testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:3582:17-18: WARNING opportunity for min()
On 2/6/2024 10:57 AM, Jakub Kicinski wrote:
> On Mon, 5 Feb 2024 19:37:54 -0800 Alan Brady wrote:
>> The motivation for this series has two primary goals. We want to enable
>> support of multiple simultaneous messages and make the channel more
>> robust. The way it works right now, the driver can only send and receive
>> a single message at a time and if something goes really wrong, it can
>> lead to data corruption and strange bugs.
>
> Coccinelle points out some potential places to use min()
>
> testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:1956:17-18: WARNING opportunity for min()
> testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:1271:17-18: WARNING opportunity for min()
> testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:1319:17-18: WARNING opportunity for min()
> testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:2640:17-18: WARNING opportunity for min()
> testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:1295:17-18: WARNING opportunity for min()
> testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:2157:17-18: WARNING opportunity for min()
> testing/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c:3582:17-18: WARNING opportunity for min()

We did run coccinelle check and see the min suggestions. It's triggering
on these statements I added:

    return reply_sz < 0 ? reply_sz : 0;

A min here would change it to:

    return min(reply_sz, 0);

I didn't really like that because it's misleading as though we're
returning the size of the reply and might accidentally encourage someone
to change it to a max. Here reply_sz will be negative if an error was
returned from message sending. But this function we only want to return
0 or negative. By being explicit in what we want to do, it seems clearer
to me what the intention is but I could be wrong.

We can definitely change it however if that's preferred here.
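For reference, a minimal, self-contained sketch of the two forms being
discussed; the helper name and types below are made up for the example
and are not the actual idpf code:

    #include <stdio.h>

    /* Hypothetical helper, not the real driver function: reply_sz is a
     * negative error code on failure, otherwise the length of the received
     * reply. Callers only want 0 on success or the negative error code,
     * never a length. */
    static int status_from_reply(int reply_sz)
    {
            /* Explicit form used in the series: keep errors, squash lengths to 0. */
            return reply_sz < 0 ? reply_sz : 0;

            /* The coccinelle-suggested rewrite would be
             *      return min(reply_sz, 0);
             * which computes the same value but reads as though a size is returned. */
    }

    int main(void)
    {
            /* prints "-5 0": errors pass through, positive lengths collapse to 0 */
            printf("%d %d\n", status_from_reply(-5), status_from_reply(128));
            return 0;
    }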
On Tue, 6 Feb 2024 11:18:48 -0800 Alan Brady wrote:
> We did run coccinelle check and see the min suggestions. It's triggering
> on these statements I added:
>
>     return reply_sz < 0 ? reply_sz : 0;
>
> A min here would change it to:
>
>     return min(reply_sz, 0);
>
> I didn't really like that because it's misleading as though we're
> returning the size of the reply and might accidentally encourage someone
> to change it to a max. Here reply_sz will be negative if an error was
> returned from message sending. But this function we only want to return
> 0 or negative. By being explicit in what we want to do, it seems clearer
> to me what the intention is but I could be wrong.
>
> We can definitely change it however if that's preferred here.

Hm, okay, that does sound like making it worse.
I'll disable the minmax coccicheck for now, it seems noisy.
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, February 6, 2024 12:03 PM
> To: Brady, Alan <alan.brady@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org;
> willemdebruijn.kernel@gmail.com; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Bagnucki, Igor <igor.bagnucki@intel.com>;
> Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Subject: Re: [PATCH v4 00/10 iwl-next] idpf: refactor virtchnl messages
>
> On Tue, 6 Feb 2024 11:18:48 -0800 Alan Brady wrote:
> > We did run coccinelle check and see the min suggestions. It's triggering
> > on these statements I added:
> >
> >     return reply_sz < 0 ? reply_sz : 0;
> >
> > A min here would change it to:
> >
> >     return min(reply_sz, 0);
> >
> > I didn't really like that because it's misleading as though we're
> > returning the size of the reply and might accidentally encourage someone
> > to change it to a max. Here reply_sz will be negative if an error was
> > returned from message sending. But this function we only want to return
> > 0 or negative. By being explicit in what we want to do, it seems clearer
> > to me what the intention is but I could be wrong.
> >
> > We can definitely change it however if that's preferred here.
>
> Hm, okay, that does sound like making it worse.
> I'll disable the minmax coccicheck for now, it seems noisy.

Maybe you could make the coccicheck only complain if the value is
non-zero or not const? Maybe that's a bit too complicated... Hm
On Tue, 6 Feb 2024 22:50:25 +0000 Keller, Jacob E wrote:
> > Hm, okay, that does sound like making it worse.
> > I'll disable the minmax coccicheck for now, it seems noisy.
>
> Maybe you could make the coccicheck only complain if the value is
> non-zero or not const? Maybe that's a bit too complicated... Hm

Non-zero could work. It may be worthwhile to look at the warnings cocci
currently generates and figure out what's common among the cases where
the warning doesn't make sense.
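A rough C illustration of that heuristic (the bound and names below are
invented for the example, not taken from the driver): a comparison
against a non-zero constant is the kind of site where the min() rewrite
genuinely reads better, unlike the zero case above.

    #include <stddef.h>

    #define EXAMPLE_BUF_LEN 64      /* hypothetical limit, not from idpf */

    /* Clamping a requested length to a non-zero limit: the ternary is an
     * open-coded min(), and min(want, (size_t)EXAMPLE_BUF_LEN) would be
     * clearer -- exactly the case the warning should keep firing on. */
    size_t clamp_len(size_t want)
    {
            return want < EXAMPLE_BUF_LEN ? want : EXAMPLE_BUF_LEN;
    }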
From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Tue, 6 Feb 2024 18:02:33 +0100

> From: Alan Brady <alan.brady@intel.com>
> Date: Mon, 5 Feb 2024 19:37:54 -0800
>
>> The motivation for this series has two primary goals. We want to enable
>> support of multiple simultaneous messages and make the channel more
>> robust. The way it works right now, the driver can only send and receive
>> a single message at a time and if something goes really wrong, it can
>> lead to data corruption and strange bugs.
>
> This works better than v3.
> For the basic scenarios:
>
> Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com>

Sorry I'm reverting my tag.
After the series, the CP segfaults on each rmmod idpf:

root@mev-imc:/usr/bin/cplane# cp_pim_mdd_handler: MDD interrupt detected
cp_pim_mdd_event_handler: reg = 40
Jan 1 00:27:57 mev-imc local0.warn LanCP[190]: cp_pim_mdd_handler: MDD interrupt detected
cp_pim_mdd_event_handler: reg = 1
Jan 1 00:28:54 mev-imc local0.warn LanCP[190]: [hma_create_vport/4257] WARNING: RSS is configured on 1st contiguous num of queu
Jan 1 00:28:54 mev-imc local0.warn LanCP[190]: [hma_create_vport/4257] WARNING: RSS is configured on 1st contiguous num of queu
Jan 1 00:28:55 mev-imc local0.warn LanCP[190]: [hma_create_vport/4257] WARNING: RSS is configured on 1st contiguous num of queues= 16 start Qid= 34
Jan 1 00:28:55 mev-imc local0.warn LanCP[190]: [hma_create_vport/4257] WARNING: RSS is configured on 1st contiguous num of queu16 start Qid= 640
Jan 1 00:28:55 mev-imc local0.err LanCP[190]: [cp_del_node_rxbuff_lst/4179] ERR: Resource list is empty, so nothing to delete here
Jan 1 00:29:08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_tc_q_region/222] ERR: Failed to init vsi LUT on vsi 1.
Jan 1 00::08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_fxp_config/1101] ERR: cp_uninit_vsi_rss_config() failed on vsi (1).
Jan 1 00:29:08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_tc_q_region/222] ERR: Failed to init vsi LUT on vsi 6.
Jan 1 00:29:08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_rss_config/340] ERR: faininit_vsi_rss_config() failed on vsi (6).
Jan 1 00:29:08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_tc_q_region/222] ERR: Failed to init vsi LUT on vsi 7.
Jan 1 00:29:08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_rss_config/340] ERR: failed to remove vsi (7)'s queue regions.
Jan 1 00:29:08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_fxp_config/1101] ERR: cp_uninit_vo init vsi LUT on vsi 8.
Jan 1 00:29:08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_rss_config/340] ERR: failed to remove vsi (8)'s queue regions.
Jan 1 00:29:08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_fxp_config/1101] ERR: cp_uninit_vsi_rss_config() failed on vsi (8).
Jan 1 00:29:08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_tc_q_region/222] ERR: Failed to init vsi LUT on vsi 1.
Jan 1 00::08 mev-imc local0.err LanCP[190]: [cp_uninit_vsi_fxp_config/1101] ERR: cp_uninit_vsi_rss_config() failed on vsi (1).

[1]+ Segmentation fault ./imccp 0000:00:01.6 0 cp_init.cfg

Only restarting the CP helps -- restarting the imccp daemon makes it
immediately crash again.

This should be dropped from the next-queue until it's fixed.

Thanks,
Olek
On 2/14/2024 6:49 AM, Alexander Lobakin wrote:
> From: Alexander Lobakin <aleksander.lobakin@intel.com>
> Date: Tue, 6 Feb 2024 18:02:33 +0100
>
>> From: Alan Brady <alan.brady@intel.com>
>> Date: Mon, 5 Feb 2024 19:37:54 -0800
>>
>>> The motivation for this series has two primary goals. We want to enable
>>> support of multiple simultaneous messages and make the channel more
>>> robust. The way it works right now, the driver can only send and receive
>>> a single message at a time and if something goes really wrong, it can
>>> lead to data corruption and strange bugs.
>>
>> This works better than v3.
>> For the basic scenarios:
>>
>> Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>
> Sorry I'm reverting my tag.
> After the series, the CP segfaults on each rmmod idpf:
>
> [...]
>
> [1]+ Segmentation fault ./imccp 0000:00:01.6 0 cp_init.cfg
>
> Only restarting the CP helps -- restarting the imccp daemon makes it
> immediately crash again.
>
> This should be dropped from the next-queue until it's fixed.
>
> Thanks,
> Olek

I definitely tested rmmod so I'm frankly not understanding how this can
be possible. If you would like I'm happy to sync on how you're able to
cause this to happen.

Our internal validation verified 1000 load/unloads passed their testing.
Is there more you're doing other than just rmmod?

Alan
From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Wed, 14 Feb 2024 15:49:51 +0100

> From: Alexander Lobakin <aleksander.lobakin@intel.com>
> Date: Tue, 6 Feb 2024 18:02:33 +0100
>
>> From: Alan Brady <alan.brady@intel.com>
>> Date: Mon, 5 Feb 2024 19:37:54 -0800
>>
>>> The motivation for this series has two primary goals. We want to enable
>>> support of multiple simultaneous messages and make the channel more
>>> robust. The way it works right now, the driver can only send and receive
>>> a single message at a time and if something goes really wrong, it can
>>> lead to data corruption and strange bugs.
>>
>> This works better than v3.
>> For the basic scenarios:
>>
>> Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>
> Sorry I'm reverting my tag.
> After the series, the CP segfaults on each rmmod idpf:
>
> [...]
>
> [1]+ Segmentation fault ./imccp 0000:00:01.6 0 cp_init.cfg
>
> Only restarting the CP helps -- restarting the imccp daemon makes it
> immediately crash again.
>
> This should be dropped from the next-queue until it's fixed.

The latest firmware works with this series -- the problem was there, the
series only revealed it.

Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com>

Can be taken back to Tony's tree.

Thanks,
Olek