[0/2] fix gss seqno handling to be more rfc-compliant

Message ID 20250310A15110284b71e58.njha@janestreet.com (mailing list archive)

Nikhil Jha March 10, 2025, 3:11 p.m. UTC
When the client retransmits an operation (for example, because the
server is slow to respond), a new GSS sequence number is associated with
the XID. In the current kernel code the original sequence number is
discarded. Subsequently, if a response to the original request is
received there will be a GSS sequence number mismatch. A mismatch will
trigger another retransmit, possibly repeating the cycle, and after some
number of failed retries EACCES is returned.

RFC 2203, section 5.3.3.1 suggests a possible solution: "cache the
RPCSEC_GSS sequence number of each request it sends" and "compute the
checksum of each sequence number in the cache to try to match the
checksum in the reply's verifier." This is what FreeBSD's implementation
does (rpc_gss_validate in sys/rpc/rpcsec_gss/rpcsec_gss.c).
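
As a rough illustration of the caching idea quoted from the RFC, here is
a minimal user-space sketch. It is not the code from this series: the
names (req_seqno_cache, cache_push, cache_match), the cache size, and
toy_mic() are all hypothetical. In particular, toy_mic() is only a
stand-in for the real GSS checksum over the sequence number, so that the
example can run without a GSS context.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bound on how many seqnos are remembered per request. */
#define SEQNO_CACHE_SIZE 3

/* Seqnos used for one XID: the original send plus retransmits. */
struct req_seqno_cache {
	uint32_t seqnos[SEQNO_CACHE_SIZE];	/* oldest first */
	size_t count;
};

/* Stand-in for the GSS MIC of a seqno. NOT a real MIC, illustration only. */
static uint32_t toy_mic(uint32_t seqno, uint32_t key)
{
	uint32_t x = seqno ^ key;

	x ^= x >> 16;
	x *= 0x45d9f3bU;
	x ^= x >> 16;
	return x;
}

/* Remember the seqno of a (re)transmission, evicting the oldest if full. */
static void cache_push(struct req_seqno_cache *c, uint32_t seqno)
{
	if (c->count == SEQNO_CACHE_SIZE) {
		for (size_t i = 1; i < c->count; i++)
			c->seqnos[i - 1] = c->seqnos[i];
		c->count--;
	}
	c->seqnos[c->count++] = seqno;
}

/*
 * Check the reply verifier against every cached seqno, newest first.
 * On success, store the matching seqno in *matched (if non-NULL).
 */
static bool cache_match(const struct req_seqno_cache *c, uint32_t verifier,
			uint32_t key, uint32_t *matched)
{
	for (size_t i = c->count; i-- > 0; ) {
		if (toy_mic(c->seqnos[i], key) == verifier) {
			if (matched)
				*matched = c->seqnos[i];
			return true;
		}
	}
	return false;
}
```

With a cache like this, a late reply to the original transmission still
verifies after a retransmit, because the original seqno is remembered
rather than overwritten. Only a reply whose seqno has already been
evicted fails to match.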

However, even with this cache, retransmits directly caused by a seqno
mismatch can still produce a bad message interleaving that results in
this bug. The RFC already suggests ignoring incorrect seqnos on the
server side, and the client side seems symmetric, so this patchset
applies the same behavior to the client.
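
The difference between the two client policies can be sketched as
follows. This is a simplified model, not the kernel code path: the enum,
on_reply(), and count_retransmits() are hypothetical names, and the
"verifier_ok" flag stands for "the reply's verifier matched a seqno we
sent for this XID".

```c
#include <stdbool.h>
#include <stddef.h>

enum reply_action {
	REPLY_ACCEPT,		/* verifier matched, complete the request */
	REPLY_IGNORE,		/* drop the reply, keep waiting */
	REPLY_RETRANSMIT	/* resend the request with a fresh seqno */
};

/* Decide what to do with one reply under the chosen policy. */
static enum reply_action on_reply(bool verifier_ok, bool ignore_mismatch)
{
	if (verifier_ok)
		return REPLY_ACCEPT;
	return ignore_mismatch ? REPLY_IGNORE : REPLY_RETRANSMIT;
}

/*
 * Count the retransmits triggered by a sequence of replies, stopping at
 * the first reply that is accepted.
 */
static size_t count_retransmits(const bool *verifier_ok, size_t n,
				bool ignore_mismatch)
{
	size_t retransmits = 0;

	for (size_t i = 0; i < n; i++) {
		enum reply_action a = on_reply(verifier_ok[i], ignore_mismatch);

		if (a == REPLY_ACCEPT)
			break;
		if (a == REPLY_RETRANSMIT)
			retransmits++;
	}
	return retransmits;
}
```

In this model, every mismatched reply under the retransmit policy puts
another request (and another seqno) on the wire, which is what allows the
cycle to repeat. Under the ignore policy the client simply waits for a
reply that verifies, and the normal timeout-driven retransmit remains the
only source of new seqnos.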

These two patches are *not* dependent on each other. I tested them by
delaying packets with a Python script hooked up to NFQUEUE. If it would
be helpful I can send this script along as well.

Nikhil Jha (2):
  sunrpc: implement rfc2203 rpcsec_gss seqnum cache
  sunrpc: don't retransmit on seqno events

 include/linux/sunrpc/xprt.h    | 31 +++++++++++++++++++++++++++++-
 net/sunrpc/auth_gss/auth_gss.c | 35 +++++++++++++++++++++++-----------
 net/sunrpc/clnt.c              |  9 +++++++--
 net/sunrpc/xprt.c              |  1 +
 4 files changed, 62 insertions(+), 14 deletions(-)