[00/27] Convert RPC client transmission to a queued model
mbox series

Message ID 20180903152936.24325-1-trond.myklebust@hammerspace.com
Headers show
Series
  • Convert RPC client transmission to a queued model
Related show

Message

Trond Myklebust Sept. 3, 2018, 3:29 p.m. UTC
For historical reasons, the RPC client is heavily serialised during the
process of transmitting a request by the XPRT_LOCK. A request is
required to take that lock before it can start XDR encoding, and it is
required to hold it until it is done transmitting. In essence the lock
protects the following functions:

- Stream based transport connect/reconnect
- RPCSEC_GSS encoding of the RPC message
- Transmission of a single RPC message

The following patch set assumes that we do not need to do much to
improve performance of the connect/reconnect case, as that is supposed
to be a rare occurrence.

The set looks at dealing with RPCSEC_GSS issues by removing serialisation
while encoding, and simply assuming that if we detect after grabbing the
XPRT_LOCK that we're about to transmit a message with a sequence number
that has fallen outside the window allowed by RFC2203, then we can
abort the transmission of that message, and schedule it for re-encoding.
Since window sizes are typically expected to lie above 100 messages or
so, we expect these cases where we miss the window to be rare, in
general.

Finally, we look at trying to avoid the requirement that every request
must go through the process of being woken up to grab the XPRT_LOCK in
order to transmit itself by allowing a request that currently holds the
XPRT_LOCK to grab other requests from an ordered queue, and to transmit
them too. The bulk of the changes in this patchset are dedicated to
providing this functionality.

Trond Myklebust (27):
  SUNRPC: Clean up initialisation of the struct rpc_rqst
  SUNRPC: If there is no reply expected, bail early from call_decode
  SUNRPC: The transmitted message must lie in the RPCSEC window of
    validity
  SUNRPC: Simplify identification of when the message send/receive is
    complete
  SUNRPC: Avoid holding locks across the XDR encoding of the RPC message
  SUNRPC: Rename TCP receive-specific state variables
  SUNRPC: Move reset of TCP state variables into the reconnect code
  SUNRPC: Add socket transmit queue offset tracking
  SUNRPC: Simplify dealing with aborted partially transmitted messages
  SUNRPC: Refactor the transport request pinning
  SUNRPC: Add a helper to wake up a sleeping rpc_task and set its status
  SUNRPC: Don't wake queued RPC calls multiple times in xprt_transmit
  SUNRPC: Rename xprt->recv_lock to xprt->queue_lock
  SUNRPC: Refactor xprt_transmit() to remove the reply queue code
  SUNRPC: Refactor xprt_transmit() to remove wait for reply code
  SUNRPC: Minor cleanup for call_transmit()
  SUNRPC: Distinguish between the slot allocation list and receive queue
  NFS: Add a transmission queue for RPC requests
  SUNRPC: Refactor RPC call encoding
  SUNRPC: Treat the task and request as separate in the
    xprt_ops->send_request()
  SUNRPC: Don't reset the request 'bytes_sent' counter when releasing
    XPRT_LOCK
  SUNRPC: Simplify xprt_prepare_transmit()
  SUNRPC: Move RPC retransmission stat counter to xprt_transmit()
  SUNRPC: Fix up the back channel transmit
  SUNRPC: Allow calls to xprt_transmit() to drain the entire transmit
    queue
  SUNRPC: Queue the request for transmission immediately after encoding
  SUNRPC: Convert the xprt->sending queue back to an ordinary wait queue

 include/linux/sunrpc/auth.h                |   2 +
 include/linux/sunrpc/auth_gss.h            |   1 +
 include/linux/sunrpc/sched.h               |   7 +-
 include/linux/sunrpc/xprt.h                |  23 +-
 include/linux/sunrpc/xprtsock.h            |  23 +-
 include/trace/events/sunrpc.h              |  10 +-
 net/sunrpc/auth.c                          |  10 +
 net/sunrpc/auth_gss/auth_gss.c             |  41 ++
 net/sunrpc/backchannel_rqst.c              |   3 +-
 net/sunrpc/clnt.c                          | 139 +++---
 net/sunrpc/sched.c                         |  63 ++-
 net/sunrpc/svcsock.c                       |   6 +-
 net/sunrpc/xprt.c                          | 503 +++++++++++++--------
 net/sunrpc/xprtrdma/backchannel.c          |   3 +-
 net/sunrpc/xprtrdma/rpc_rdma.c             |  10 +-
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c |   7 +-
 net/sunrpc/xprtrdma/transport.c            |   5 +-
 net/sunrpc/xprtsock.c                      | 327 +++++++-------
 18 files changed, 728 insertions(+), 455 deletions(-)

Comments

Chuck Lever Sept. 3, 2018, 5:41 p.m. UTC | #1
> On Sep 3, 2018, at 11:29 AM, Trond Myklebust <trondmy@gmail.com> wrote:
> 
> For historical reasons, the RPC client is heavily serialised during the
> process of transmitting a request by the XPRT_LOCK. A request is
> required to take that lock before it can start XDR encoding, and it is
> required to hold it until it is done transmitting. In essence the lock
> protects the following functions:
> 
> - Stream based transport connect/reconnect
> - RPCSEC_GSS encoding of the RPC message
> - Transmission of a single RPC message

It also protects TCP rqst slot allocation:

void xprt_lock_and_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
{
        /* Note: grabbing the xprt_lock_write() ensures that we throttle
         * new slot allocation if the transport is congested (i.e. when
         * reconnecting a stream transport or when out of socket write
         * buffer space).
         */
        if (xprt_lock_write(xprt, task)) {
                xprt_alloc_slot(xprt, task);
                xprt_release_write(xprt, task);
        }
}


> The following patch set assumes that we do not need to do much to
> improve performance of the connect/reconnect case, as that is supposed
> to be a rare occurrence.
> 
> The set looks at dealing with RPCSEC_GSS issues by removing serialisation
> while encoding, and simply assuming that if we detect after grabbing the
> XPRT_LOCK that we're about to transmit a message with a sequence number
> that has fallen outside the window allowed by RFC2203, then we can
> abort the transmission of that message, and schedule it for re-encoding.
> Since window sizes are typically expected to lie above 100 messages or
> so, we expect these cases where we miss the window to be rare, in
> general.
> 
> Finally, we look at trying to avoid the requirement that every request
> must go through the process of being woken up to grab the XPRT_LOCK in
> order to transmit itself by allowing a request that currently holds the
> XPRT_LOCK to grab other requests from an ordered queue, and to transmit
> them too. The bulk of the changes in this patchset are dedicated to
> providing this functionality.

When considering whether this kind of change could work for
xprtrdma: the transport send lock mechanism is used to manage
the credit grant. The transport send lock prevents the client
from sending too many RPC Calls at once.

Congestion- or flow-controlled transports might not be able to
adopt this approach, because there needs to be a check before
each RPC Call is sent to see if the congestion/credit window
has room.


> Trond Myklebust (27):
>  SUNRPC: Clean up initialisation of the struct rpc_rqst
>  SUNRPC: If there is no reply expected, bail early from call_decode
>  SUNRPC: The transmitted message must lie in the RPCSEC window of
>    validity
>  SUNRPC: Simplify identification of when the message send/receive is
>    complete
>  SUNRPC: Avoid holding locks across the XDR encoding of the RPC message
>  SUNRPC: Rename TCP receive-specific state variables
>  SUNRPC: Move reset of TCP state variables into the reconnect code
>  SUNRPC: Add socket transmit queue offset tracking
>  SUNRPC: Simplify dealing with aborted partially transmitted messages
>  SUNRPC: Refactor the transport request pinning
>  SUNRPC: Add a helper to wake up a sleeping rpc_task and set its status
>  SUNRPC: Don't wake queued RPC calls multiple times in xprt_transmit
>  SUNRPC: Rename xprt->recv_lock to xprt->queue_lock
>  SUNRPC: Refactor xprt_transmit() to remove the reply queue code
>  SUNRPC: Refactor xprt_transmit() to remove wait for reply code
>  SUNRPC: Minor cleanup for call_transmit()
>  SUNRPC: Distinguish between the slot allocation list and receive queue
>  NFS: Add a transmission queue for RPC requests
>  SUNRPC: Refactor RPC call encoding
>  SUNRPC: Treat the task and request as separate in the
>    xprt_ops->send_request()
>  SUNRPC: Don't reset the request 'bytes_sent' counter when releasing
>    XPRT_LOCK
>  SUNRPC: Simplify xprt_prepare_transmit()
>  SUNRPC: Move RPC retransmission stat counter to xprt_transmit()
>  SUNRPC: Fix up the back channel transmit
>  SUNRPC: Allow calls to xprt_transmit() to drain the entire transmit
>    queue
>  SUNRPC: Queue the request for transmission immediately after encoding
>  SUNRPC: Convert the xprt->sending queue back to an ordinary wait queue
> 
> include/linux/sunrpc/auth.h                |   2 +
> include/linux/sunrpc/auth_gss.h            |   1 +
> include/linux/sunrpc/sched.h               |   7 +-
> include/linux/sunrpc/xprt.h                |  23 +-
> include/linux/sunrpc/xprtsock.h            |  23 +-
> include/trace/events/sunrpc.h              |  10 +-
> net/sunrpc/auth.c                          |  10 +
> net/sunrpc/auth_gss/auth_gss.c             |  41 ++
> net/sunrpc/backchannel_rqst.c              |   3 +-
> net/sunrpc/clnt.c                          | 139 +++---
> net/sunrpc/sched.c                         |  63 ++-
> net/sunrpc/svcsock.c                       |   6 +-
> net/sunrpc/xprt.c                          | 503 +++++++++++++--------
> net/sunrpc/xprtrdma/backchannel.c          |   3 +-
> net/sunrpc/xprtrdma/rpc_rdma.c             |  10 +-
> net/sunrpc/xprtrdma/svc_rdma_backchannel.c |   7 +-
> net/sunrpc/xprtrdma/transport.c            |   5 +-
> net/sunrpc/xprtsock.c                      | 327 +++++++-------
> 18 files changed, 728 insertions(+), 455 deletions(-)
> 
> -- 
> 2.17.1
> 

--
Chuck Lever
Trond Myklebust Sept. 3, 2018, 5:55 p.m. UTC | #2
On Mon, 2018-09-03 at 13:41 -0400, Chuck Lever wrote:
> > On Sep 3, 2018, at 11:29 AM, Trond Myklebust <trondmy@gmail.com>
> > wrote:
> > 
> > For historical reasons, the RPC client is heavily serialised during
> > the
> > process of transmitting a request by the XPRT_LOCK. A request is
> > required to take that lock before it can start XDR encoding, and it
> > is
> > required to hold it until it is done transmitting. In essence the
> > lock
> > protects the following functions:
> > 
> > - Stream based transport connect/reconnect
> > - RPCSEC_GSS encoding of the RPC message
> > - Transmission of a single RPC message
> 
> It also protects TCP rqst slot allocation:
> 
> void xprt_lock_and_alloc_slot(struct rpc_xprt *xprt, struct rpc_task
> *task)
> {
>         /* Note: grabbing the xprt_lock_write() ensures that we
> throttle
>          * new slot allocation if the transport is congested (i.e.
> when
>          * reconnecting a stream transport or when out of socket
> write
>          * buffer space).
>          */
>         if (xprt_lock_write(xprt, task)) {
>                 xprt_alloc_slot(xprt, task);
>                 xprt_release_write(xprt, task);
>         }
> }

Ack. That needs some thought.

> 
> > The following patch set assumes that we do not need to do much to
> > improve performance of the connect/reconnect case, as that is
> > supposed
> > to be a rare occurrence.
> > 
> > The set looks at dealing with RPCSEC_GSS issues by removing
> > serialisation
> > while encoding, and simply assuming that if we detect after
> > grabbing the
> > XPRT_LOCK that we're about to transmit a message with a sequence
> > number
> > that has fallen outside the window allowed by RFC2203, then we can
> > abort the transmission of that message, and schedule it for re-
> > encoding.
> > Since window sizes are typically expected to lie above 100 messages
> > or
> > so, we expect these cases where we miss the window to be rare, in
> > general.
> > 
> > Finally, we look at trying to avoid the requirement that every
> > request
> > must go through the process of being woken up to grab the XPRT_LOCK
> > in
> > order to transmit itself by allowing a request that currently holds
> > the
> > XPRT_LOCK to grab other requests from an ordered queue, and to
> > transmit
> > them too. The bulk of the changes in this patchset are dedicated to
> > providing this functionality.
> 
> When considering whether this kind of change could work for
> xprtrdma: the transport send lock mechanism is used to manage
> the credit grant. The transport send lock prevents the client
> from sending too many RPC Calls at once.
> 
> Congestion- or flow-controlled transports might not be able to
> adopt this approach, because there needs to be a check before
> each RPC Call is sent to see if the congestion/credit window
> has room.

That's a good point. We might want to put an additional check for
congestion overflow in the send request to push back when we hit the
credit window limit. I'll think about that.

> 
> > Trond Myklebust (27):
> >  SUNRPC: Clean up initialisation of the struct rpc_rqst
> >  SUNRPC: If there is no reply expected, bail early from call_decode
> >  SUNRPC: The transmitted message must lie in the RPCSEC window of
> >    validity
> >  SUNRPC: Simplify identification of when the message send/receive
> > is
> >    complete
> >  SUNRPC: Avoid holding locks across the XDR encoding of the RPC
> > message
> >  SUNRPC: Rename TCP receive-specific state variables
> >  SUNRPC: Move reset of TCP state variables into the reconnect code
> >  SUNRPC: Add socket transmit queue offset tracking
> >  SUNRPC: Simplify dealing with aborted partially transmitted
> > messages
> >  SUNRPC: Refactor the transport request pinning
> >  SUNRPC: Add a helper to wake up a sleeping rpc_task and set its
> > status
> >  SUNRPC: Don't wake queued RPC calls multiple times in
> > xprt_transmit
> >  SUNRPC: Rename xprt->recv_lock to xprt->queue_lock
> >  SUNRPC: Refactor xprt_transmit() to remove the reply queue code
> >  SUNRPC: Refactor xprt_transmit() to remove wait for reply code
> >  SUNRPC: Minor cleanup for call_transmit()
> >  SUNRPC: Distinguish between the slot allocation list and receive
> > queue
> >  NFS: Add a transmission queue for RPC requests
> >  SUNRPC: Refactor RPC call encoding
> >  SUNRPC: Treat the task and request as separate in the
> >    xprt_ops->send_request()
> >  SUNRPC: Don't reset the request 'bytes_sent' counter when
> > releasing
> >    XPRT_LOCK
> >  SUNRPC: Simplify xprt_prepare_transmit()
> >  SUNRPC: Move RPC retransmission stat counter to xprt_transmit()
> >  SUNRPC: Fix up the back channel transmit
> >  SUNRPC: Allow calls to xprt_transmit() to drain the entire
> > transmit
> >    queue
> >  SUNRPC: Queue the request for transmission immediately after
> > encoding
> >  SUNRPC: Convert the xprt->sending queue back to an ordinary wait
> > queue
> > 
> > include/linux/sunrpc/auth.h                |   2 +
> > include/linux/sunrpc/auth_gss.h            |   1 +
> > include/linux/sunrpc/sched.h               |   7 +-
> > include/linux/sunrpc/xprt.h                |  23 +-
> > include/linux/sunrpc/xprtsock.h            |  23 +-
> > include/trace/events/sunrpc.h              |  10 +-
> > net/sunrpc/auth.c                          |  10 +
> > net/sunrpc/auth_gss/auth_gss.c             |  41 ++
> > net/sunrpc/backchannel_rqst.c              |   3 +-
> > net/sunrpc/clnt.c                          | 139 +++---
> > net/sunrpc/sched.c                         |  63 ++-
> > net/sunrpc/svcsock.c                       |   6 +-
> > net/sunrpc/xprt.c                          | 503 +++++++++++++-----
> > ---
> > net/sunrpc/xprtrdma/backchannel.c          |   3 +-
> > net/sunrpc/xprtrdma/rpc_rdma.c             |  10 +-
> > net/sunrpc/xprtrdma/svc_rdma_backchannel.c |   7 +-
> > net/sunrpc/xprtrdma/transport.c            |   5 +-
> > net/sunrpc/xprtsock.c                      | 327 +++++++-------
> > 18 files changed, 728 insertions(+), 455 deletions(-)
> > 
> > -- 
> > 2.17.1
> > 
> 
> --
> Chuck Lever
> 
> 
>