diff mbox series

[net] ice: fix unaligned access in ice_create_lag_recipe

Message ID 20240131115823.541317-1-mschmidt@redhat.com (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers
Headers show
Series [net] ice: fix unaligned access in ice_create_lag_recipe | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1064 this patch: 1064
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 0 of 0 maintainers
netdev/build_clang success Errors and warnings before: 1081 this patch: 1081
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1081 this patch: 1081
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 10 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-01-31--21-00 (tests: 710)

Commit Message

Michal Schmidt Jan. 31, 2024, 11:58 a.m. UTC
new_rcp->recipe_bitmap was written to as if it were an aligned bitmap.
It is an 8-byte array, but aligned only to 4.
Use put_unaligned to set its value.

Additionally, values in ice commands are typically in little-endian.
I assume the recipe bitmap should be too, so use the *_le64 conversion.
I don't have a big-endian system with ice to test this.

I tested that the driver does not crash when probing on aarch64 anymore,
which is good enough for me. I don't know if the LAG feature actually
works.

This is what the crash looked like without the fix:
[   17.599009] Unable to handle kernel paging request at virtual address ffff07ff9c6dc004
[   17.599011] Mem abort info:
[   17.599011]   ESR = 0x0000000096000021
[   17.599012]   EC = 0x25: DABT (current EL), IL = 32 bits
[   17.599013]   SET = 0, FnV = 0
[   17.599014]   EA = 0, S1PTW = 0
[   17.599014]   FSC = 0x21: alignment fault
[   17.599015] Data abort info:
[   17.599016]   ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000
[   17.599016]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[   17.599017]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[   17.599019] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080dd6bd0000
[   17.599020] [ffff07ff9c6dc004] pgd=1800083fffacf003, p4d=1800083fffacf003, pud=1800083ffface003, pmd=1800083fff9ea003, pte=006808001c6dcf07
[   17.599025] Internal error: Oops: 0000000096000021 [#1] SMP
[   17.599027] Modules linked in: crct10dif_ce ghash_ce sha2_ce sha256_arm64 mlx5_core sha1_ce sbsa_gwdt ice(+) nvme nvme_core mlxfw igb tls nvme_common psample i2c_algo_bit gnss pci_hyperv_intf i2c_designware_platform i2c_designware_core xgene_hwmon dm_mirror dm_region_hash dm_log dm_mod
[   17.599043] CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 5.14.0-407.el9.aarch64 #1
[   17.599044] Hardware name: GIGABYTE R272-P31-00/MP32-AR1-00, BIOS F31L (SCP: 2.10.20220531) 09/29/2022
[   17.599046] Workqueue: events work_for_cpu_fn
[   17.599051] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   17.599053] pc : ice_create_lag_recipe.constprop.0+0xbc/0x11c [ice]
[   17.599091] lr : ice_create_lag_recipe.constprop.0+0x54/0x11c [ice]
[   17.599121] sp : ffff8000084a3c50
[   17.599122] x29: ffff8000084a3c50 x28: ffffabc4a6790f00 x27: ffffabc4a6200fa0
[   17.599124] x26: ffff07ff809e5c34 x25: ffff083e5f41980d x24: ffff07ff8610a0c0
[   17.599126] x23: 0000000000000000 x22: ffff07ff9fe894c0 x21: ffff07ffb771a460
[   17.599128] x20: ffff07ff9c6dc000 x19: 0000000000000000 x18: 0000000000000014
[   17.599130] x17: 00000000c3142fa2 x16: 000000007e77e163 x15: 0000000018c66856
[   17.599132] x14: 00000000b8afd426 x13: 000000007e8b3b19 x12: 000000004a34fdf7
[   17.599134] x11: 00000000a7cb2fcc x10: 00000000ffffff8a x9 : 0000000000000000
[   17.599136] x8 : 0000002000000005 x7 : 0000000000000001 x6 : ffffabc487a054d8
[   17.599138] x5 : ffff07ff9c6dc004 x4 : 000000000000000a x3 : 0000000000000000
[   17.599140] x2 : 0000000000000000 x1 : 0000000000000400 x0 : ffff07ff9c6dc004
[   17.599142] Call trace:
[   17.599143]  ice_create_lag_recipe.constprop.0+0xbc/0x11c [ice]
[   17.599172]  ice_init_lag+0xcc/0x22c [ice]
[   17.599201]  ice_init_features+0x160/0x2b4 [ice]
[   17.599230]  ice_probe+0x2d0/0x30c [ice]
[   17.599258]  local_pci_probe+0x58/0xb0
[   17.599262]  work_for_cpu_fn+0x20/0x30
[   17.599264]  process_one_work+0x1e4/0x4c0
[   17.599266]  worker_thread+0x220/0x450
[   17.599268]  kthread+0xe8/0xf4
[   17.599270]  ret_from_fork+0x10/0x20
[   17.599273] Code: 380044a4 f800429f 8b000ca0 d503201f (f821301f)
[   17.599274] ---[ end trace 168d79e2ecf9f7e3 ]---
[   17.599275] Kernel panic - not syncing: Oops: Fatal exception
[   17.893321] SMP: stopping secondary CPUs
[   17.897374] Kernel Offset: 0x2bc49c400000 from 0xffff800008000000
[   17.903453] PHYS_OFFSET: 0x80000000
[   17.906928] CPU features: 0x0,00000001,70028143,1041720b
[   17.912226] Memory Limit: none
[   17.915268] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---

Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_lag.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Jiri Pirko Jan. 31, 2024, 12:17 p.m. UTC | #1
Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
>new_rcp->recipe_bitmap was written to as if it were an aligned bitmap.
>It is an 8-byte array, but aligned only to 4.
>Use put_unaligned to set its value.
>
>Additionally, values in ice commands are typically in little-endian.
>I assume the recipe bitmap should be too, so use the *_le64 conversion.
>I don't have a big-endian system with ice to test this.
>
>I tested that the driver does not crash when probing on aarch64 anymore,
>which is good enough for me. I don't know if the LAG feature actually
>works.
>
>This is what the crash looked like without the fix:
>[   17.599009] Unable to handle kernel paging request at virtual address ffff07ff9c6dc004
>[   17.599011] Mem abort info:
>[   17.599011]   ESR = 0x0000000096000021
>[   17.599012]   EC = 0x25: DABT (current EL), IL = 32 bits
>[   17.599013]   SET = 0, FnV = 0
>[   17.599014]   EA = 0, S1PTW = 0
>[   17.599014]   FSC = 0x21: alignment fault
>[   17.599015] Data abort info:
>[   17.599016]   ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000
>[   17.599016]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>[   17.599017]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>[   17.599019] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080dd6bd0000
>[   17.599020] [ffff07ff9c6dc004] pgd=1800083fffacf003, p4d=1800083fffacf003, pud=1800083ffface003, pmd=1800083fff9ea003, pte=006808001c6dcf07
>[   17.599025] Internal error: Oops: 0000000096000021 [#1] SMP
>[   17.599027] Modules linked in: crct10dif_ce ghash_ce sha2_ce sha256_arm64 mlx5_core sha1_ce sbsa_gwdt ice(+) nvme nvme_core mlxfw igb tls nvme_common psample i2c_algo_bit gnss pci_hyperv_intf i2c_designware_platform i2c_designware_core xgene_hwmon dm_mirror dm_region_hash dm_log dm_mod
>[   17.599043] CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 5.14.0-407.el9.aarch64 #1
>[   17.599044] Hardware name: GIGABYTE R272-P31-00/MP32-AR1-00, BIOS F31L (SCP: 2.10.20220531) 09/29/2022
>[   17.599046] Workqueue: events work_for_cpu_fn
>[   17.599051] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>[   17.599053] pc : ice_create_lag_recipe.constprop.0+0xbc/0x11c [ice]
>[   17.599091] lr : ice_create_lag_recipe.constprop.0+0x54/0x11c [ice]
>[   17.599121] sp : ffff8000084a3c50
>[   17.599122] x29: ffff8000084a3c50 x28: ffffabc4a6790f00 x27: ffffabc4a6200fa0
>[   17.599124] x26: ffff07ff809e5c34 x25: ffff083e5f41980d x24: ffff07ff8610a0c0
>[   17.599126] x23: 0000000000000000 x22: ffff07ff9fe894c0 x21: ffff07ffb771a460
>[   17.599128] x20: ffff07ff9c6dc000 x19: 0000000000000000 x18: 0000000000000014
>[   17.599130] x17: 00000000c3142fa2 x16: 000000007e77e163 x15: 0000000018c66856
>[   17.599132] x14: 00000000b8afd426 x13: 000000007e8b3b19 x12: 000000004a34fdf7
>[   17.599134] x11: 00000000a7cb2fcc x10: 00000000ffffff8a x9 : 0000000000000000
>[   17.599136] x8 : 0000002000000005 x7 : 0000000000000001 x6 : ffffabc487a054d8
>[   17.599138] x5 : ffff07ff9c6dc004 x4 : 000000000000000a x3 : 0000000000000000
>[   17.599140] x2 : 0000000000000000 x1 : 0000000000000400 x0 : ffff07ff9c6dc004
>[   17.599142] Call trace:
>[   17.599143]  ice_create_lag_recipe.constprop.0+0xbc/0x11c [ice]
>[   17.599172]  ice_init_lag+0xcc/0x22c [ice]
>[   17.599201]  ice_init_features+0x160/0x2b4 [ice]
>[   17.599230]  ice_probe+0x2d0/0x30c [ice]
>[   17.599258]  local_pci_probe+0x58/0xb0
>[   17.599262]  work_for_cpu_fn+0x20/0x30
>[   17.599264]  process_one_work+0x1e4/0x4c0
>[   17.599266]  worker_thread+0x220/0x450
>[   17.599268]  kthread+0xe8/0xf4
>[   17.599270]  ret_from_fork+0x10/0x20
>[   17.599273] Code: 380044a4 f800429f 8b000ca0 d503201f (f821301f)
>[   17.599274] ---[ end trace 168d79e2ecf9f7e3 ]---
>[   17.599275] Kernel panic - not syncing: Oops: Fatal exception
>[   17.893321] SMP: stopping secondary CPUs
>[   17.897374] Kernel Offset: 0x2bc49c400000 from 0xffff800008000000
>[   17.903453] PHYS_OFFSET: 0x80000000
>[   17.906928] CPU features: 0x0,00000001,70028143,1041720b
>[   17.912226] Memory Limit: none
>[   17.915268] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
>
>Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface")
>Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>---
> drivers/net/ethernet/intel/ice/ice_lag.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
>diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c b/drivers/net/ethernet/intel/ice/ice_lag.c
>index 2a25323105e5..d4848f6fe919 100644
>--- a/drivers/net/ethernet/intel/ice/ice_lag.c
>+++ b/drivers/net/ethernet/intel/ice/ice_lag.c
>@@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct ice_hw *hw, u16 *rid,
> 	new_rcp->content.act_ctrl_fwd_priority = prio;
> 	new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
> 	new_rcp->recipe_indx = *rid;
>-	bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
>-		    ICE_MAX_NUM_RECIPES);
>-	set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
>+	put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);

Looks like there might be another incorrect bitmap usage for this in
ice_add_sw_recipe(). Care to fix it there as well?

Otherwise, the patch looks fine.


> 
> 	err = ice_aq_add_recipe(hw, new_rcp, 1, NULL);
> 	if (err)
>-- 
>2.43.0
>
>
Alexander Lobakin Jan. 31, 2024, 4:59 p.m. UTC | #2
From: Jiri Pirko <jiri@resnulli.us>
Date: Wed, 31 Jan 2024 13:17:44 +0100

> Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
>> new_rcp->recipe_bitmap was written to as if it were an aligned bitmap.
>> It is an 8-byte array, but aligned only to 4.
>> Use put_unaligned to set its value.
>>
>> Additionally, values in ice commands are typically in little-endian.
>> I assume the recipe bitmap should be too, so use the *_le64 conversion.
>> I don't have a big-endian system with ice to test this.
>>
>> I tested that the driver does not crash when probing on aarch64 anymore,
>> which is good enough for me. I don't know if the LAG feature actually
>> works.
>>
>> This is what the crash looked like without the fix:
>> [   17.599009] Unable to handle kernel paging request at virtual address ffff07ff9c6dc004
>> [   17.599011] Mem abort info:
>> [   17.599011]   ESR = 0x0000000096000021
>> [   17.599012]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [   17.599013]   SET = 0, FnV = 0
>> [   17.599014]   EA = 0, S1PTW = 0
>> [   17.599014]   FSC = 0x21: alignment fault
>> [   17.599015] Data abort info:
>> [   17.599016]   ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000
>> [   17.599016]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>> [   17.599017]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>> [   17.599019] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080dd6bd0000
>> [   17.599020] [ffff07ff9c6dc004] pgd=1800083fffacf003, p4d=1800083fffacf003, pud=1800083ffface003, pmd=1800083fff9ea003, pte=006808001c6dcf07
>> [   17.599025] Internal error: Oops: 0000000096000021 [#1] SMP
>> [   17.599027] Modules linked in: crct10dif_ce ghash_ce sha2_ce sha256_arm64 mlx5_core sha1_ce sbsa_gwdt ice(+) nvme nvme_core mlxfw igb tls nvme_common psample i2c_algo_bit gnss pci_hyperv_intf i2c_designware_platform i2c_designware_core xgene_hwmon dm_mirror dm_region_hash dm_log dm_mod
>> [   17.599043] CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 5.14.0-407.el9.aarch64 #1
>> [   17.599044] Hardware name: GIGABYTE R272-P31-00/MP32-AR1-00, BIOS F31L (SCP: 2.10.20220531) 09/29/2022
>> [   17.599046] Workqueue: events work_for_cpu_fn
>> [   17.599051] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> [   17.599053] pc : ice_create_lag_recipe.constprop.0+0xbc/0x11c [ice]
>> [   17.599091] lr : ice_create_lag_recipe.constprop.0+0x54/0x11c [ice]
>> [   17.599121] sp : ffff8000084a3c50
>> [   17.599122] x29: ffff8000084a3c50 x28: ffffabc4a6790f00 x27: ffffabc4a6200fa0
>> [   17.599124] x26: ffff07ff809e5c34 x25: ffff083e5f41980d x24: ffff07ff8610a0c0
>> [   17.599126] x23: 0000000000000000 x22: ffff07ff9fe894c0 x21: ffff07ffb771a460
>> [   17.599128] x20: ffff07ff9c6dc000 x19: 0000000000000000 x18: 0000000000000014
>> [   17.599130] x17: 00000000c3142fa2 x16: 000000007e77e163 x15: 0000000018c66856
>> [   17.599132] x14: 00000000b8afd426 x13: 000000007e8b3b19 x12: 000000004a34fdf7
>> [   17.599134] x11: 00000000a7cb2fcc x10: 00000000ffffff8a x9 : 0000000000000000
>> [   17.599136] x8 : 0000002000000005 x7 : 0000000000000001 x6 : ffffabc487a054d8
>> [   17.599138] x5 : ffff07ff9c6dc004 x4 : 000000000000000a x3 : 0000000000000000
>> [   17.599140] x2 : 0000000000000000 x1 : 0000000000000400 x0 : ffff07ff9c6dc004
>> [   17.599142] Call trace:
>> [   17.599143]  ice_create_lag_recipe.constprop.0+0xbc/0x11c [ice]
>> [   17.599172]  ice_init_lag+0xcc/0x22c [ice]
>> [   17.599201]  ice_init_features+0x160/0x2b4 [ice]
>> [   17.599230]  ice_probe+0x2d0/0x30c [ice]
>> [   17.599258]  local_pci_probe+0x58/0xb0
>> [   17.599262]  work_for_cpu_fn+0x20/0x30
>> [   17.599264]  process_one_work+0x1e4/0x4c0
>> [   17.599266]  worker_thread+0x220/0x450
>> [   17.599268]  kthread+0xe8/0xf4
>> [   17.599270]  ret_from_fork+0x10/0x20
>> [   17.599273] Code: 380044a4 f800429f 8b000ca0 d503201f (f821301f)
>> [   17.599274] ---[ end trace 168d79e2ecf9f7e3 ]---
>> [   17.599275] Kernel panic - not syncing: Oops: Fatal exception
>> [   17.893321] SMP: stopping secondary CPUs
>> [   17.897374] Kernel Offset: 0x2bc49c400000 from 0xffff800008000000
>> [   17.903453] PHYS_OFFSET: 0x80000000
>> [   17.906928] CPU features: 0x0,00000001,70028143,1041720b
>> [   17.912226] Memory Limit: none
>> [   17.915268] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
>>
>> Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface")
>> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>> ---
>> drivers/net/ethernet/intel/ice/ice_lag.c | 4 +---
>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c b/drivers/net/ethernet/intel/ice/ice_lag.c
>> index 2a25323105e5..d4848f6fe919 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
>> @@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct ice_hw *hw, u16 *rid,
>> 	new_rcp->content.act_ctrl_fwd_priority = prio;
>> 	new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
>> 	new_rcp->recipe_indx = *rid;
>> -	bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
>> -		    ICE_MAX_NUM_RECIPES);
>> -	set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
>> +	put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
> 
> Looks like there might be another incorrect bitmap usage for this in
> ice_add_sw_recipe(). Care to fix it there as well?

Those are already fixed in one switchdev series and will be sent to IWL
soon.
I believe this patch would also make no sense after it's sent.

> 
> Otherwise, the patch looks fine.
> 
> 
>>
>> 	err = ice_aq_add_recipe(hw, new_rcp, 1, NULL);
>> 	if (err)
>> -- 
>> 2.43.0

Thanks,
Olek
Michal Schmidt Feb. 1, 2024, 6:40 p.m. UTC | #3
On 1/31/24 17:59, Alexander Lobakin wrote:
> From: Jiri Pirko <jiri@resnulli.us>
> Date: Wed, 31 Jan 2024 13:17:44 +0100
> 
>> Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c b/drivers/net/ethernet/intel/ice/ice_lag.c
>>> index 2a25323105e5..d4848f6fe919 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
>>> @@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct ice_hw *hw, u16 *rid,
>>> 	new_rcp->content.act_ctrl_fwd_priority = prio;
>>> 	new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
>>> 	new_rcp->recipe_indx = *rid;
>>> -	bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
>>> -		    ICE_MAX_NUM_RECIPES);
>>> -	set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
>>> +	put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
>>
>> Looks like there might be another incorrect bitmap usage for this in
>> ice_add_sw_recipe(). Care to fix it there as well?
> 
> Those are already fixed in one switchdev series and will be sent to IWL
> soon.
> I believe this patch would also make no sense after it's sent.

Hi Alexander,
When will the series be sent?
The bug causes a kernel panic. Will the series target net.git?
Michal
Alexander Lobakin Feb. 2, 2024, 12:39 p.m. UTC | #4
From: Michal Schmidt <mschmidt@redhat.com>
Date: Thu, 1 Feb 2024 19:40:17 +0100

> On 1/31/24 17:59, Alexander Lobakin wrote:
>> From: Jiri Pirko <jiri@resnulli.us>
>> Date: Wed, 31 Jan 2024 13:17:44 +0100
>>
>>> Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>> b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>> index 2a25323105e5..d4848f6fe919 100644
>>>> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>> @@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct ice_hw
>>>> *hw, u16 *rid,
>>>>     new_rcp->content.act_ctrl_fwd_priority = prio;
>>>>     new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
>>>>     new_rcp->recipe_indx = *rid;
>>>> -    bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
>>>> -            ICE_MAX_NUM_RECIPES);
>>>> -    set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
>>>> +    put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
>>>
>>> Looks like there might be another incorrect bitmap usage for this in
>>> ice_add_sw_recipe(). Care to fix it there as well?
>>
>> Those are already fixed in one switchdev series and will be sent to IWL
>> soon.
>> I believe this patch would also make no sense after it's sent.
> 
> Hi Alexander,
> When will the series be sent?
> The bug causes a kernel panic. Will the series target net.git?

The global fix is here: [0]
It's targeting net-next.

I don't know what the best way here would be. Target net instead or pick
your fix to net and then fix it properly in net-next?

> Michal

Thanks,
Olek
Alexander Lobakin Feb. 2, 2024, 12:40 p.m. UTC | #5
From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Fri, 2 Feb 2024 13:39:28 +0100

> From: Michal Schmidt <mschmidt@redhat.com>
> Date: Thu, 1 Feb 2024 19:40:17 +0100
> 
>> On 1/31/24 17:59, Alexander Lobakin wrote:
>>> From: Jiri Pirko <jiri@resnulli.us>
>>> Date: Wed, 31 Jan 2024 13:17:44 +0100
>>>
>>>> Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
>>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>> b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>> index 2a25323105e5..d4848f6fe919 100644
>>>>> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>> @@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct ice_hw
>>>>> *hw, u16 *rid,
>>>>>     new_rcp->content.act_ctrl_fwd_priority = prio;
>>>>>     new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
>>>>>     new_rcp->recipe_indx = *rid;
>>>>> -    bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
>>>>> -            ICE_MAX_NUM_RECIPES);
>>>>> -    set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
>>>>> +    put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
>>>>
>>>> Looks like there might be another incorrect bitmap usage for this in
>>>> ice_add_sw_recipe(). Care to fix it there as well?
>>>
>>> Those are already fixed in one switchdev series and will be sent to IWL
>>> soon.
>>> I believe this patch would also make no sense after it's sent.
>>
>> Hi Alexander,
>> When will the series be sent?
>> The bug causes a kernel panic. Will the series target net.git?
> 
> The global fix is here: [0]
> It's targeting net-next.
> 
> I don't know what the best way here would be. Target net instead or pick
> your fix to net and then fix it properly in net-next?

Sorry, forgot to paste the link :clownface:

[0]
https://lore.kernel.org/intel-wired-lan/20240130025146.30265-2-steven.zou@intel.com

Thanks,
Olek
Jiri Pirko Feb. 2, 2024, 12:54 p.m. UTC | #6
Fri, Feb 02, 2024 at 01:40:18PM CET, aleksander.lobakin@intel.com wrote:
>From: Alexander Lobakin <aleksander.lobakin@intel.com>
>Date: Fri, 2 Feb 2024 13:39:28 +0100
>
>> From: Michal Schmidt <mschmidt@redhat.com>
>> Date: Thu, 1 Feb 2024 19:40:17 +0100
>> 
>>> On 1/31/24 17:59, Alexander Lobakin wrote:
>>>> From: Jiri Pirko <jiri@resnulli.us>
>>>> Date: Wed, 31 Jan 2024 13:17:44 +0100
>>>>
>>>>> Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
>>>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>> b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>> index 2a25323105e5..d4848f6fe919 100644
>>>>>> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>> @@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct ice_hw
>>>>>> *hw, u16 *rid,
>>>>>>     new_rcp->content.act_ctrl_fwd_priority = prio;
>>>>>>     new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
>>>>>>     new_rcp->recipe_indx = *rid;
>>>>>> -    bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
>>>>>> -            ICE_MAX_NUM_RECIPES);
>>>>>> -    set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
>>>>>> +    put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
>>>>>
>>>>> Looks like there might be another incorrect bitmap usage for this in
>>>>> ice_add_sw_recipe(). Care to fix it there as well?
>>>>
>>>> Those are already fixed in one switchdev series and will be sent to IWL
>>>> soon.
>>>> I believe this patch would also make no sense after it's sent.
>>>
>>> Hi Alexander,
>>> When will the series be sent?
>>> The bug causes a kernel panic. Will the series target net.git?
>> 
>> The global fix is here: [0]
>> It's targeting net-next.
>> 
>> I don't know what the best way here would be. Target net instead or pick
>> your fix to net and then fix it properly in net-next?
>
>Sorry, forgot to paste the link :clownface:
>
>[0]
>https://lore.kernel.org/intel-wired-lan/20240130025146.30265-2-steven.zou@intel.com

Just put this into -net, no? I don't see why not.

>
>Thanks,
>Olek
Fijalkowski, Maciej Feb. 2, 2024, 1 p.m. UTC | #7
On Fri, Feb 02, 2024 at 01:40:18PM +0100, Alexander Lobakin wrote:
> From: Alexander Lobakin <aleksander.lobakin@intel.com>
> Date: Fri, 2 Feb 2024 13:39:28 +0100
> 
> > From: Michal Schmidt <mschmidt@redhat.com>
> > Date: Thu, 1 Feb 2024 19:40:17 +0100
> > 
> >> On 1/31/24 17:59, Alexander Lobakin wrote:
> >>> From: Jiri Pirko <jiri@resnulli.us>
> >>> Date: Wed, 31 Jan 2024 13:17:44 +0100
> >>>
> >>>> Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
> >>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c
> >>>>> b/drivers/net/ethernet/intel/ice/ice_lag.c
> >>>>> index 2a25323105e5..d4848f6fe919 100644
> >>>>> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
> >>>>> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
> >>>>> @@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct ice_hw
> >>>>> *hw, u16 *rid,
> >>>>>     new_rcp->content.act_ctrl_fwd_priority = prio;
> >>>>>     new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
> >>>>>     new_rcp->recipe_indx = *rid;
> >>>>> -    bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
> >>>>> -            ICE_MAX_NUM_RECIPES);
> >>>>> -    set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
> >>>>> +    put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
> >>>>
> >>>> Looks like there might be another incorrect bitmap usage for this in
> >>>> ice_add_sw_recipe(). Care to fix it there as well?
> >>>
> >>> Those are already fixed in one switchdev series and will be sent to IWL
> >>> soon.
> >>> I believe this patch would also make no sense after it's sent.
> >>
> >> Hi Alexander,
> >> When will the series be sent?
> >> The bug causes a kernel panic. Will the series target net.git?
> > 
> > The global fix is here: [0]
> > It's targeting net-next.
> > 
> > I don't know what the best way here would be. Target net instead or pick
> > your fix to net and then fix it properly in net-next?
> 
> Sorry, forgot to paste the link :clownface:

IMHO 1/2 should go to net. Then you would have to wait for it to got
accepted and get merged to -next and then you come back with 2/2. You know
the deal.

> 
> [0]
> https://lore.kernel.org/intel-wired-lan/20240130025146.30265-2-steven.zou@intel.com
> 
> Thanks,
> Olek
>
Alexander Lobakin Feb. 2, 2024, 1:01 p.m. UTC | #8
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Fri, 2 Feb 2024 14:00:03 +0100

> On Fri, Feb 02, 2024 at 01:40:18PM +0100, Alexander Lobakin wrote:
>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>> Date: Fri, 2 Feb 2024 13:39:28 +0100
>>
>>> From: Michal Schmidt <mschmidt@redhat.com>
>>> Date: Thu, 1 Feb 2024 19:40:17 +0100
>>>
>>>> On 1/31/24 17:59, Alexander Lobakin wrote:
>>>>> From: Jiri Pirko <jiri@resnulli.us>
>>>>> Date: Wed, 31 Jan 2024 13:17:44 +0100
>>>>>
>>>>>> Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
>>>>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>> b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>> index 2a25323105e5..d4848f6fe919 100644
>>>>>>> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>> @@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct ice_hw
>>>>>>> *hw, u16 *rid,
>>>>>>>     new_rcp->content.act_ctrl_fwd_priority = prio;
>>>>>>>     new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
>>>>>>>     new_rcp->recipe_indx = *rid;
>>>>>>> -    bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
>>>>>>> -            ICE_MAX_NUM_RECIPES);
>>>>>>> -    set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
>>>>>>> +    put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
>>>>>>
>>>>>> Looks like there might be another incorrect bitmap usage for this in
>>>>>> ice_add_sw_recipe(). Care to fix it there as well?
>>>>>
>>>>> Those are already fixed in one switchdev series and will be sent to IWL
>>>>> soon.
>>>>> I believe this patch would also make no sense after it's sent.
>>>>
>>>> Hi Alexander,
>>>> When will the series be sent?
>>>> The bug causes a kernel panic. Will the series target net.git?
>>>
>>> The global fix is here: [0]
>>> It's targeting net-next.
>>>
>>> I don't know what the best way here would be. Target net instead or pick
>>> your fix to net and then fix it properly in net-next?
>>
>> Sorry, forgot to paste the link :clownface:
> 
> IMHO 1/2 should go to net. Then you would have to wait for it to got
> accepted and get merged to -next and then you come back with 2/2. You know
> the deal.

Agree!

Hi Steve,

Could you please send the first patch from your series to net instead of
net-next?

(and add "Fixes:" tag with the blamed commit)

> 
>>
>> [0]
>> https://lore.kernel.org/intel-wired-lan/20240130025146.30265-2-steven.zou@intel.com
>>
>> Thanks,
>> Olek

Thanks,
Olek
Zou, Steven Feb. 7, 2024, 12:44 a.m. UTC | #9
On 2/2/2024 9:01 PM, Alexander Lobakin wrote:
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Date: Fri, 2 Feb 2024 14:00:03 +0100
>
>> On Fri, Feb 02, 2024 at 01:40:18PM +0100, Alexander Lobakin wrote:
>>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>>> Date: Fri, 2 Feb 2024 13:39:28 +0100
>>>
>>>> From: Michal Schmidt <mschmidt@redhat.com>
>>>> Date: Thu, 1 Feb 2024 19:40:17 +0100
>>>>
>>>>> On 1/31/24 17:59, Alexander Lobakin wrote:
>>>>>> From: Jiri Pirko <jiri@resnulli.us>
>>>>>> Date: Wed, 31 Jan 2024 13:17:44 +0100
>>>>>>
>>>>>>> Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
>>>>>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>>> b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>>> index 2a25323105e5..d4848f6fe919 100644
>>>>>>>> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>>> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>>> @@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct ice_hw
>>>>>>>> *hw, u16 *rid,
>>>>>>>>      new_rcp->content.act_ctrl_fwd_priority = prio;
>>>>>>>>      new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
>>>>>>>>      new_rcp->recipe_indx = *rid;
>>>>>>>> -    bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
>>>>>>>> -            ICE_MAX_NUM_RECIPES);
>>>>>>>> -    set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
>>>>>>>> +    put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
>>>>>>> Looks like there might be another incorrect bitmap usage for this in
>>>>>>> ice_add_sw_recipe(). Care to fix it there as well?
>>>>>> Those are already fixed in one switchdev series and will be sent to IWL
>>>>>> soon.
>>>>>> I believe this patch would also make no sense after it's sent.
>>>>> Hi Alexander,
>>>>> When will the series be sent?
>>>>> The bug causes a kernel panic. Will the series target net.git?
>>>> The global fix is here: [0]
>>>> It's targeting net-next.
>>>>
>>>> I don't know what the best way here would be. Target net instead or pick
>>>> your fix to net and then fix it properly in net-next?
>>> Sorry, forgot to paste the link :clownface:
>> IMHO 1/2 should go to net. Then you would have to wait for it to got
>> accepted and get merged to -next and then you come back with 2/2. You know
>> the deal.
> Agree!
>
> Hi Steve,
>
> Could you please send the first patch from your series to net instead of
> net-next?
>
> (and add "Fixes:" tag with the blamed commit)

Hi Olek,
Sure, I will do it soon.

>
>>> [0]
>>> https://lore.kernel.org/intel-wired-lan/20240130025146.30265-2-steven.zou@intel.com
>>>
>>> Thanks,
>>> Olek
> Thanks,
> Olek
Zou, Steven Feb. 7, 2024, 2:15 a.m. UTC | #10
On 2/7/2024 8:44 AM, Zou, Steven wrote:
>
> On 2/2/2024 9:01 PM, Alexander Lobakin wrote:
>> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>> Date: Fri, 2 Feb 2024 14:00:03 +0100
>>
>>> On Fri, Feb 02, 2024 at 01:40:18PM +0100, Alexander Lobakin wrote:
>>>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>>>> Date: Fri, 2 Feb 2024 13:39:28 +0100
>>>>
>>>>> From: Michal Schmidt <mschmidt@redhat.com>
>>>>> Date: Thu, 1 Feb 2024 19:40:17 +0100
>>>>>
>>>>>> On 1/31/24 17:59, Alexander Lobakin wrote:
>>>>>>> From: Jiri Pirko <jiri@resnulli.us>
>>>>>>> Date: Wed, 31 Jan 2024 13:17:44 +0100
>>>>>>>
>>>>>>>> Wed, Jan 31, 2024 at 12:58:23PM CET, mschmidt@redhat.com wrote:
>>>>>>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>>>> b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>>>> index 2a25323105e5..d4848f6fe919 100644
>>>>>>>>> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>>>> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
>>>>>>>>> @@ -1829,9 +1829,7 @@ static int ice_create_lag_recipe(struct 
>>>>>>>>> ice_hw
>>>>>>>>> *hw, u16 *rid,
>>>>>>>>>      new_rcp->content.act_ctrl_fwd_priority = prio;
>>>>>>>>>      new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
>>>>>>>>>      new_rcp->recipe_indx = *rid;
>>>>>>>>> -    bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
>>>>>>>>> -            ICE_MAX_NUM_RECIPES);
>>>>>>>>> -    set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
>>>>>>>>> +    put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
>>>>>>>> Looks like there might be another incorrect bitmap usage for 
>>>>>>>> this in
>>>>>>>> ice_add_sw_recipe(). Care to fix it there as well?
>>>>>>> Those are already fixed in one switchdev series and will be sent 
>>>>>>> to IWL
>>>>>>> soon.
>>>>>>> I believe this patch would also make no sense after it's sent.
>>>>>> Hi Alexander,
>>>>>> When will the series be sent?
>>>>>> The bug causes a kernel panic. Will the series target net.git?
>>>>> The global fix is here: [0]
>>>>> It's targeting net-next.
>>>>>
>>>>> I don't know what the best way here would be. Target net instead 
>>>>> or pick
>>>>> your fix to net and then fix it properly in net-next?
>>>> Sorry, forgot to paste the link :clownface:
>>> IMHO 1/2 should go to net. Then you would have to wait for it to got
>>> accepted and get merged to -next and then you come back with 2/2. 
>>> You know
>>> the deal.
>> Agree!
>>
>> Hi Steve,
>>
>> Could you please send the first patch from your series to net instead of
>> net-next?
>>
>> (and add "Fixes:" tag with the blamed commit)
>
> Hi Olek,
> Sure, I will do it soon.

Hi team,
The patch has been rebased and submitted on:
https://lore.kernel.org/intel-wired-lan/20240207014959.24012-1-steven.zou@intel.com/
https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20240207014959.24012-1-steven.zou@intel.com/

Thanks,
Steven

>
>>
>>>> [0]
>>>> https://lore.kernel.org/intel-wired-lan/20240130025146.30265-2-steven.zou@intel.com 
>>>>
>>>>
>>>> Thanks,
>>>> Olek
>> Thanks,
>> Olek
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c b/drivers/net/ethernet/intel/ice/ice_lag.c
index 2a25323105e5..d4848f6fe919 100644
--- a/drivers/net/ethernet/intel/ice/ice_lag.c
+++ b/drivers/net/ethernet/intel/ice/ice_lag.c
@@ -1829,9 +1829,7 @@  static int ice_create_lag_recipe(struct ice_hw *hw, u16 *rid,
 	new_rcp->content.act_ctrl_fwd_priority = prio;
 	new_rcp->content.rid = *rid | ICE_AQ_RECIPE_ID_IS_ROOT;
 	new_rcp->recipe_indx = *rid;
-	bitmap_zero((unsigned long *)new_rcp->recipe_bitmap,
-		    ICE_MAX_NUM_RECIPES);
-	set_bit(*rid, (unsigned long *)new_rcp->recipe_bitmap);
+	put_unaligned_le64(BIT_ULL(*rid), new_rcp->recipe_bitmap);
 
 	err = ice_aq_add_recipe(hw, new_rcp, 1, NULL);
 	if (err)