
[v3,0/7,iwl-next] idpf: refactor virtchnl messages

Message ID: 20240130005923.983026-1-alan.brady@intel.com

Message

Alan Brady Jan. 30, 2024, 12:59 a.m. UTC
This series has two primary goals: we want to enable support of
multiple simultaneous messages and make the channel more robust. The
way it works right now, the driver can only send and receive a single
message at a time, and if something goes really wrong, it can lead to
data corruption and strange bugs.

This works by conceptualizing a send and receive as a "virtchnl
transaction" (idpf_vc_xn) and introducing a "transaction manager"
(idpf_vc_xn_manager). The vcxn_mngr initializes a ring of transactions,
and the driver pops entries from a bitmap of free transactions to track
in-flight messages. Instead of needing to handle a complicated
send/recv for every message, the driver now just fills out an xn_params
struct and hands it over to idpf_vc_xn_exec, which takes care of all
the messy bits. Once a message is sent and receives a reply, we
leverage the completion API to signal that the received buffer is ready
to be used (assuming success, or an error code otherwise).
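
As a rough sketch (names follow the series, but treat this as
illustrative rather than exact), a caller now looks something like:

  struct idpf_vc_xn_params xn_params = {};
  ssize_t reply_sz;

  xn_params.vc_op = VIRTCHNL2_OP_GET_CAPS;
  xn_params.send_buf.iov_base = &caps_msg;
  xn_params.send_buf.iov_len = sizeof(caps_msg);
  xn_params.recv_buf.iov_base = &adapter->caps;
  xn_params.recv_buf.iov_len = sizeof(adapter->caps);
  xn_params.timeout_ms = IDPF_VC_XN_DEFAULT_TIMEOUT_MSEC;

  /* blocks until the reply lands (or times out); returns the
   * reply size on success or a negative error code
   */
  reply_sz = idpf_vc_xn_exec(adapter, &xn_params);
  if (reply_sz < 0)
          return reply_sz;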

At a low level, this is enabled by the "sw cookie" field of the
virtchnl message descriptor. It gives us 16 bits in which we can put
whatever we want, and the recipient is required to apply the same
cookie to the reply for that message. We use the first 8 bits as an
index into the array of transactions to enable fast lookups, and the
second 8 bits as a salt to make sure each cookie is unique for that
message. Since transactions complete in arbitrary order, a transaction
index can be reused, and the salt guards against index conflicts to
make certain the lookup is correct. As a primitive example, say index 1
is used with salt 1. The message times out without receiving a reply,
so index 1 is renewed to be ready for a new transaction, we report the
timeout, and we send the message again. Index 1 is now free, so it is
used again, but this time with salt 2. This time we do get a reply;
however, it could be that the reply is _actually_ for the previous send
on index 1 with salt 1. Without the salt we would have no way of
knowing for sure whether it's the correct reply, but with it we know
for certain.
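
Concretely, the cookie packing/unpacking looks roughly like this (an
illustrative sketch; macro and field names approximate the series):

  #define IDPF_VC_XN_IDX_M   GENMASK(7, 0)
  #define IDPF_VC_XN_SALT_M  GENMASK(15, 8)

  /* on send: pack index and salt into the 16-bit sw cookie */
  u16 cookie = FIELD_PREP(IDPF_VC_XN_SALT_M, xn->salt) |
               FIELD_PREP(IDPF_VC_XN_IDX_M, xn->idx);

  /* on receive: look up the transaction and reject stale replies */
  u16 idx = FIELD_GET(IDPF_VC_XN_IDX_M, cookie);
  u16 salt = FIELD_GET(IDPF_VC_XN_SALT_M, cookie);
  struct idpf_vc_xn *xn = &vcxn_mngr->ring[idx];

  if (xn->salt != salt)
          return -EINVAL;  /* reply for a recycled index; drop it */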

Through this conversion we also get several other benefits. We can now
more appropriately handle asynchronously sent messages by providing
space for a callback to be defined. This notably allows us to handle
MAC filter failures better; previously we could be left with stale,
failed filters in our list, which shouldn't have a major impact but is
obviously not correct. I also managed to remove significantly more
lines than I added, which is a win in my book.
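
For an asynchronous send, the caller just flags the transaction and
supplies a handler, roughly (handler name illustrative):

  xn_params.vc_op = VIRTCHNL2_OP_ADD_MAC_ADDR;
  xn_params.async = true;
  xn_params.async_handler = idpf_mac_filter_async_handler;

  /* idpf_vc_xn_exec() returns without waiting; the handler runs
   * when the reply arrives and can purge failed filters from our
   * list instead of leaving them around.
   */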

Additionally, this converts some heap-allocated variables to automatic
(stack) variables where appropriate. This makes the alloc paths much
cleaner and less prone to memory leaks.
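
That is, roughly this kind of change (struct name hypothetical):

  /* before: heap allocation that must be unwound on every exit path */
  struct virtchnl2_foo *msg;

  msg = kzalloc(sizeof(*msg), GFP_KERNEL);
  if (!msg)
          return -ENOMEM;
  /* ... */
  kfree(msg);

  /* after: automatic variable, nothing to free or leak */
  struct virtchnl2_foo msg = {};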


Alan Brady (7):
  idpf: implement virtchnl transaction manager
  idpf: refactor vport virtchnl messages
  idpf: refactor queue related virtchnl messages
  idpf: refactor remaining virtchnl messages
  idpf: add async_handler for MAC filter messages
  idpf: refactor idpf_recv_mb_msg
  idpf: cleanup virtchnl cruft

 drivers/net/ethernet/intel/idpf/idpf.h        |  192 +-
 .../ethernet/intel/idpf/idpf_controlq_api.h   |    5 +
 drivers/net/ethernet/intel/idpf/idpf_lib.c    |   29 +-
 drivers/net/ethernet/intel/idpf/idpf_main.c   |    3 +-
 drivers/net/ethernet/intel/idpf/idpf_vf_dev.c |    2 +-
 .../net/ethernet/intel/idpf/idpf_virtchnl.c   | 2191 ++++++++---------
 6 files changed, 1086 insertions(+), 1336 deletions(-)

Comments

Alexander Lobakin Jan. 31, 2024, 6:33 p.m. UTC | #1
From: Alan Brady <alan.brady@intel.com>
Date: Mon, 29 Jan 2024 16:59:16 -0800

> This series has two primary goals: we want to enable support of
> multiple simultaneous messages and make the channel more robust. The
> way it works right now, the driver can only send and receive a single
> message at a time, and if something goes really wrong, it can lead to
> data corruption and strange bugs.

Have you tested v3?
I have this on my system (net-next + your series), no other patches applied:

> [alobakin@GK3153-KR1-CYP-38282-U39-ETH1 ~]$ sudo modprobe idpf
> [   89.785966] idpf 0000:83:00.0: Device HW Reset initiated
> [alobakin@GK3153-KR1-CYP-38282-U39-ETH1 ~]$ [   90.241658] BUG: unable to handle page fault for address: ff8e1df482000000
> [   90.241704] #PF: supervisor write access in kernel mode
> [   90.241728] #PF: error_code(0x0002) - not-present page
> [   90.241751] PGD 107ffc8067 P4D 107ffc7067 PUD 207fdc8067 PMD 0 
> [   90.241782] Oops: 0002 [#1] PREEMPT SMP NOPTI
> [   90.241805] CPU: 32 PID: 847 Comm: kworker/32:1 Kdump: loaded Not tainted 6.8.0-rc1-libeth+ #1
> [   90.241841] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0008.2305172341 05/17/2023
> [   90.241879] Workqueue: idpf-0000:83:00.0-vc_ev idpf_vc_event_task [idpf]
> [   90.241932] RIP: 0010:__free_pages_ok+0x338/0x4f0
> [   90.241962] Code: e6 06 45 31 e4 41 bd 40 00 00 00 45 85 ff 74 13 4b 8d 34 28 4c 89 c7 e8 36 97 00 00 8b 74 24 04 41 01 c4 66 90 4c 8b 44 24 08 <43> 81 24 28 00 00 80 c0 49 83 c5 40 4d 39 ee 75 d0 e9 7c fd ff ff
> [   90.242027] RSP: 0018:ff3f281b098d7b78 EFLAGS: 00010246
> [   90.242053] RAX: 0000000000100000 RBX: ff8e1df400000000 RCX: 0000000000000034
> [   90.242084] RDX: 0000000000000d80 RSI: 0000000000000034 RDI: ff8e1df481d980c0
> [   90.242115] RBP: 0000000000000000 R08: ff8e1df481d980c0 R09: 0000000000000000
> [   90.242145] R10: ff25062537f9fe00 R11: 0000000000000020 R12: 0000000000000000
> [   90.242174] R13: 0000000000267f40 R14: 0000000004000000 R15: 0000000000000000
> [   90.242206] FS:  0000000000000000(0000) GS:ff2506253ec00000(0000) knlGS:0000000000000000
> [   90.242240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   90.242266] CR2: ff8e1df482000000 CR3: 000000207d420006 CR4: 0000000000771ef0
> [   90.242297] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   90.242327] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   90.242357] PKRU: 55555554
> [   90.242378] Call Trace:
> [   90.242393]  <TASK>
> [   90.242410]  ? __die_body+0x68/0xb0
> [   90.242437]  ? page_fault_oops+0x3a6/0x400
> [   90.242467]  ? exc_page_fault+0xb2/0x1b0
> [   90.242496]  ? asm_exc_page_fault+0x26/0x30
> [   90.242527]  ? __free_pages_ok+0x338/0x4f0
> [   90.242554]  idpf_mb_clean+0xc1/0x110 [idpf]
> [   90.242600]  idpf_send_mb_msg+0x50/0x1b0 [idpf]
> [   90.242643]  idpf_vc_xn_exec+0x189/0x350 [idpf]
> [   90.242688]  idpf_vc_core_init+0x32c/0x6d0 [idpf]
> [   90.242735]  idpf_vc_event_task+0x2da/0x390 [idpf]
> [   90.242779]  process_scheduled_works+0x251/0x460
> [   90.242807]  worker_thread+0x21c/0x2d0
> [   90.242830]  ? __pfx_worker_thread+0x10/0x10
> [   90.242855]  kthread+0xe8/0x110
> [   90.242878]  ? __pfx_kthread+0x10/0x10
> [   90.242902]  ret_from_fork+0x37/0x50
> [   90.242925]  ? __pfx_kthread+0x10/0x10
> [   90.243802]  ret_from_fork_asm+0x1b/0x30
> [   90.244598]  </TASK>
> [   90.245361] Modules linked in: idpf libeth rpcrdma rdma_cm ib_cm iw_cm ib_core qrtr rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel binfmt_misc vfat fat kvm irqbypass rapl ipmi_ssif iTCO_wdt intel_cstate intel_pmc_bxt iTCO_vendor_support dax_hmem cxl_acpi nfsd ioatdma intel_uncore mei_me isst_if_mmio cxl_core i2c_i801 isst_if_mbox_pci mei acpi_ipmi pcspkr intel_vsec isst_if_common i2c_smbus joydev dca ipmi_si intel_pch_thermal ipmi_devintf auth_rpcgss ipmi_msghandler acpi_power_meter acpi_pad nfs_acl lockd sunrpc loop grace zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel nvme bnxt_en sha512_ssse3 ast nvme_core sha256_ssse3 sha1_ssse3 i2c_algo_bit wmi scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables dm_multipath fuse
> [   90.252381] CR2: ff8e1df482000000
> [   90.253263] ---[ end trace 0000000000000000 ]---
> [   90.314201] pstore: backend (erst) writing error (-28)
> [   90.314686] RIP: 0010:__free_pages_ok+0x338/0x4f0
> [   90.314970] Code: e6 06 45 31 e4 41 bd 40 00 00 00 45 85 ff 74 13 4b 8d 34 28 4c 89 c7 e8 36 97 00 00 8b 74 24 04 41 01 c4 66 90 4c 8b 44 24 08 <43> 81 24 28 00 00 80 c0 49 83 c5 40 4d 39 ee 75 d0 e9 7c fd ff ff
> [   90.315511] RSP: 0018:ff3f281b098d7b78 EFLAGS: 00010246
> [   90.315778] RAX: 0000000000100000 RBX: ff8e1df400000000 RCX: 0000000000000034
> [   90.316043] RDX: 0000000000000d80 RSI: 0000000000000034 RDI: ff8e1df481d980c0
> [   90.316308] RBP: 0000000000000000 R08: ff8e1df481d980c0 R09: 0000000000000000
> [   90.316573] R10: ff25062537f9fe00 R11: 0000000000000020 R12: 0000000000000000
> [   90.316838] R13: 0000000000267f40 R14: 0000000004000000 R15: 0000000000000000
> [   90.317104] FS:  0000000000000000(0000) GS:ff2506253ec00000(0000) knlGS:0000000000000000
> [   90.317368] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   90.317631] CR2: ff8e1df482000000 CR3: 000000207d420006 CR4: 0000000000771ef0
> [   90.317897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   90.318163] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   90.318427] PKRU: 55555554
> [   90.318687] note: kworker/32:1[847] exited with irqs disabled
> [   90.319202] BUG: kernel NULL pointer dereference, address: 0000000000000008

[...]

Thanks,
Olek
Alan Brady Jan. 31, 2024, 10:03 p.m. UTC | #2
On 1/31/2024 10:33 AM, Alexander Lobakin wrote:
> From: Alan Brady <alan.brady@intel.com>
> Date: Mon, 29 Jan 2024 16:59:16 -0800
>
>> This series has two primary goals: we want to enable support of
>> multiple simultaneous messages and make the channel more robust. The
>> way it works right now, the driver can only send and receive a single
>> message at a time, and if something goes really wrong, it can lead to
>> data corruption and strange bugs.
> Have you tested v3?
> I have this on my system (net-next + your series), no other patches applied:


Mea culpa, I have made a grave error.

We did test the vast majority of the changes and they were all fine. At
the last minute I noticed it looked like we could clean up
idpf_send_mb_msg as well. I was confident the other changes weren't a
problem, so this seemed innocent to me. I was very wrong. There are
some complications here with how idpf_mb_clean works that I think need
to be worked out separately before we can make this change to
idpf_send_mb_msg. I would like to revert the change to idpf_send_mb_msg
I made in v3 and instead investigate it further in a follow-up series
that attempts to address all cases where automatic variables would be a
good idea. Apologies and thanks.


-Alan


> [...]
>
> Thanks,
> Olek