mbox series

[0/6,iwl-next] idpf: refactor virtchnl messages

Message ID 20240122211125.840833-1-alan.brady@intel.com (mailing list archive)
Headers show
Series idpf: refactor virtchnl messages | expand

Message

Alan Brady Jan. 22, 2024, 9:11 p.m. UTC
The motivation for this series has two primary goals. We want to enable
support of multiple simultaneous messages and make the channel more
robust. The way it works right now, the driver can only send and receive
a single message at a time and if something goes really wrong, it can
lead to data corruption and strange bugs.

This works by conceptualizing a send and receive as a "virtchnl
transaction" (idpf_vc_xn) and introducing a "transaction manager"
(idpf_vc_xn_manager). The vcxn_mngr will init a ring of transactions
from which the driver will pop from a bitmap of free transactions to
track in-flight messages. Instead of needing to handle a complicated
send/recv for every a message, the driver now just needs to fill out a
xn_params struct and hand it over to idpf_vc_xn_exec which will take
care of all the messy bits. Once a message is sent and receives a reply,
we leverage the completion API to signal the received buffer is ready to
be used (assuming success, or an error code otherwise).

At a low-level, this implements the "sw cookie" field of the virtchnl
message descriptor to enable this. We have 16 bits we can put whatever
we want and the recipient is required to apply the same cookie to the
reply for that message.  We use the first 8 bits as an index into the
array of transactions to enable fast lookups and we use the second 8
bits as a salt to make sure each cookie is unique for that message. As
transactions are received in arbitrary order, it's possible to reuse a
transaction index and the salt guards against index conflicts to make
certain the lookup is correct. As a primitive example, say index 1 is
used with salt 1. The message times out without receiving a reply so
index 1 is renewed to be ready for a new transaction, we report the
timeout, and send the message again. Since index 1 is free to be used
again now, index 1 is again sent but now salt is 2. This time we do get
a reply, however it could be that the reply is _actually_ for the
previous send index 1 with salt 1.  Without the salt we would have no
way of knowing for sure if it's the correct reply, but with we will know
for certain.

Through this conversion we also get several other benefits. We can now
more appropriately handle asynchronously sent messages by providing
space for a callback to be defined. This notably allows us to handle MAC
filter failures better; previously we could potentially have stale,
failed filters in our list, which shouldn't really have a major impact
but is obviously not correct. I also managed to remove slightly more
lines than I added which is a win in my book.

Alan Brady (6):
  idpf: implement virtchnl transaction manager
  idpf: refactor vport virtchnl messages
  idpf: refactor queue related virtchnl messages
  idpf: refactor remaining virtchnl messages
  idpf: refactor idpf_recv_mb_msg
  idpf: cleanup virtchnl cruft

 drivers/net/ethernet/intel/idpf/idpf.h        |  192 +-
 .../ethernet/intel/idpf/idpf_controlq_api.h   |    5 +
 drivers/net/ethernet/intel/idpf/idpf_lib.c    |   29 +-
 drivers/net/ethernet/intel/idpf/idpf_main.c   |    3 +-
 drivers/net/ethernet/intel/idpf/idpf_vf_dev.c |    2 +-
 .../net/ethernet/intel/idpf/idpf_virtchnl.c   | 1984 ++++++++---------
 6 files changed, 1045 insertions(+), 1170 deletions(-)

Comments

Willem de Bruijn Jan. 25, 2024, 3:14 a.m. UTC | #1
Alan Brady wrote:
> The motivation for this series has two primary goals. We want to enable
> support of multiple simultaneous messages and make the channel more
> robust. The way it works right now, the driver can only send and receive
> a single message at a time and if something goes really wrong, it can
> lead to data corruption and strange bugs.
> 
> This works by conceptualizing a send and receive as a "virtchnl
> transaction" (idpf_vc_xn) and introducing a "transaction manager"
> (idpf_vc_xn_manager). The vcxn_mngr will init a ring of transactions
> from which the driver will pop from a bitmap of free transactions to
> track in-flight messages. Instead of needing to handle a complicated
> send/recv for every a message, the driver now just needs to fill out a
> xn_params struct and hand it over to idpf_vc_xn_exec which will take
> care of all the messy bits. Once a message is sent and receives a reply,
> we leverage the completion API to signal the received buffer is ready to
> be used (assuming success, or an error code otherwise).
> 
> At a low-level, this implements the "sw cookie" field of the virtchnl
> message descriptor to enable this. We have 16 bits we can put whatever
> we want and the recipient is required to apply the same cookie to the
> reply for that message.  We use the first 8 bits as an index into the
> array of transactions to enable fast lookups and we use the second 8
> bits as a salt to make sure each cookie is unique for that message. As
> transactions are received in arbitrary order, it's possible to reuse a
> transaction index and the salt guards against index conflicts to make
> certain the lookup is correct. As a primitive example, say index 1 is
> used with salt 1. The message times out without receiving a reply so
> index 1 is renewed to be ready for a new transaction, we report the
> timeout, and send the message again. Since index 1 is free to be used
> again now, index 1 is again sent but now salt is 2. This time we do get
> a reply, however it could be that the reply is _actually_ for the
> previous send index 1 with salt 1.  Without the salt we would have no
> way of knowing for sure if it's the correct reply, but with we will know
> for certain.
> 
> Through this conversion we also get several other benefits. We can now
> more appropriately handle asynchronously sent messages by providing
> space for a callback to be defined. This notably allows us to handle MAC
> filter failures better; previously we could potentially have stale,
> failed filters in our list, which shouldn't really have a major impact
> but is obviously not correct. I also managed to remove slightly more
> lines than I added which is a win in my book.
> 
> Alan Brady (6):
>   idpf: implement virtchnl transaction manager
>   idpf: refactor vport virtchnl messages
>   idpf: refactor queue related virtchnl messages
>   idpf: refactor remaining virtchnl messages
>   idpf: refactor idpf_recv_mb_msg
>   idpf: cleanup virtchnl cruft
> 
>  drivers/net/ethernet/intel/idpf/idpf.h        |  192 +-
>  .../ethernet/intel/idpf/idpf_controlq_api.h   |    5 +
>  drivers/net/ethernet/intel/idpf/idpf_lib.c    |   29 +-
>  drivers/net/ethernet/intel/idpf/idpf_main.c   |    3 +-
>  drivers/net/ethernet/intel/idpf/idpf_vf_dev.c |    2 +-
>  .../net/ethernet/intel/idpf/idpf_virtchnl.c   | 1984 ++++++++---------
>  6 files changed, 1045 insertions(+), 1170 deletions(-)

Great improvement, +1.

This makes virtchan more robust during edge case conditions, more
scalable and the code cleaner: less open coded duplication across
every vc operation.

The code mostly matches what I am familiar with and we have extensive
test experience with. From an initial side-by-side comparison.

I'll need to read the code that differs more closely (such as the
xn_bm_lock that Simon commented on) and will run a sanity test. Even
just a successful boot will have exercised much of this code.

One comment for patch [4/6] only.