[v3,1/6] SUNRPC: Implement xdr_reserve_space_vec()

Message ID 20200803165954.1348263-2-Anna.Schumaker@Netapp.com (mailing list archive)
State New, archived
Series NFSD: Add support for the v4.2 READ_PLUS operation

Commit Message

Anna Schumaker Aug. 3, 2020, 4:59 p.m. UTC
From: Anna Schumaker <Anna.Schumaker@Netapp.com>

Reserving space for a large READ payload requires special handling when
reserving space in the xdr buffer pages. One problem we can have is use
of the scratch buffer, which is used to get a pointer to a contiguous
region of data up to PAGE_SIZE. When using the scratch buffer, calls to
xdr_commit_encode() shift the data to its proper alignment in the xdr
buffer. If we've reserved several pages in a vector, then this could
potentially invalidate earlier pointers and result in incorrect READ
data being sent to the client.

I get around this by looking at the amount of space left in the current
page, and never reserve more than that for each entry in the read
vector. This lets us place data directly where it needs to go in the
buffer pages.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
 include/linux/sunrpc/xdr.h |  2 ++
 net/sunrpc/xdr.c           | 45 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)
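
The per-page splitting described in the commit message can be illustrated outside the kernel. The following is only a userspace sketch of the arithmetic, not the patch itself: split_by_page(), struct chunk, and the max_chunks bound are hypothetical names introduced for illustration, and plain offsets stand in for the pointers that xdr_reserve_space() would return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one kvec entry: an offset into the page
 * area instead of a real iov_base pointer, plus a length. */
struct chunk {
	size_t offset;
	size_t len;
};

/*
 * Split 'nbytes' into chunks that never cross a page boundary, given
 * that 'page_len' bytes of the page area are already in use. Returns
 * the number of chunks written to 'out', or -1 if 'out' is too small.
 * The max_chunks bound is an addition for this sketch only.
 */
int split_by_page(size_t page_len, size_t nbytes, size_t page_size,
		  struct chunk *out, int max_chunks)
{
	int v = 0;

	assert(page_size > 0);

	while (nbytes) {
		/* Bytes still free in the current page. */
		size_t thislen = page_size - (page_len % page_size);

		if (thislen > nbytes)
			thislen = nbytes;
		if (v == max_chunks)
			return -1;

		out[v].offset = page_len;
		out[v].len = thislen;
		v++;

		page_len += thislen;
		nbytes -= thislen;
	}

	return v;
}
```

With 100 bytes already used, a 10000-byte request, and 4096-byte pages, this yields three chunks of 3996, 4096, and 1908 bytes. Each chunk ends at or before a page boundary, so no chunk would need the scratch buffer.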

Comments

Chuck Lever Aug. 3, 2020, 7:19 p.m. UTC | #1
Hi Anna-

> On Aug 3, 2020, at 12:59 PM, schumaker.anna@gmail.com wrote:
> 
> From: Anna Schumaker <Anna.Schumaker@Netapp.com>
> 
> Reserving space for a large READ payload requires special handling when
> reserving space in the xdr buffer pages. One problem we can have is use
> of the scratch buffer, which is used to get a pointer to a contiguous
> region of data up to PAGE_SIZE. When using the scratch buffer, calls to
> xdr_commit_encode() shift the data to its proper alignment in the xdr
> buffer. If we've reserved several pages in a vector, then this could
> potentially invalidate earlier pointers and result in incorrect READ
> data being sent to the client.
> 
> I get around this by looking at the amount of space left in the current
> page, and never reserve more than that for each entry in the read
> vector. This lets us place data directly where it needs to go in the
> buffer pages.

Nit: This appears to be a refactoring change that should be squashed
together with 2/6.



--
Chuck Lever
Anna Schumaker Aug. 3, 2020, 7:37 p.m. UTC | #2
Hi Chuck,

On Mon, Aug 3, 2020 at 3:21 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> Hi Anna-
>
> > On Aug 3, 2020, at 12:59 PM, schumaker.anna@gmail.com wrote:
> >
> > From: Anna Schumaker <Anna.Schumaker@Netapp.com>
> >
> > Reserving space for a large READ payload requires special handling when
> > reserving space in the xdr buffer pages. One problem we can have is use
> > of the scratch buffer, which is used to get a pointer to a contiguous
> > region of data up to PAGE_SIZE. When using the scratch buffer, calls to
> > xdr_commit_encode() shift the data to its proper alignment in the xdr
> > buffer. If we've reserved several pages in a vector, then this could
> > potentially invalidate earlier pointers and result in incorrect READ
> > data being sent to the client.
> >
> > I get around this by looking at the amount of space left in the current
> > page, and never reserve more than that for each entry in the read
> > vector. This lets us place data directly where it needs to go in the
> > buffer pages.
>
> Nit: This appears to be a refactoring change that should be squashed
> together with 2/6.

My default was to leave sunrpc and nfs changes as separate patches,
but I can squash them together if you want me to!

Anna
Chuck Lever Aug. 3, 2020, 7:44 p.m. UTC | #3
> On Aug 3, 2020, at 3:37 PM, Anna Schumaker <schumaker.anna@gmail.com> wrote:
> 
> Hi Chuck,
> 
> On Mon, Aug 3, 2020 at 3:21 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> Hi Anna-
>> 
>>> On Aug 3, 2020, at 12:59 PM, schumaker.anna@gmail.com wrote:
>>> 
>>> From: Anna Schumaker <Anna.Schumaker@Netapp.com>
>>> 
>>> Reserving space for a large READ payload requires special handling when
>>> reserving space in the xdr buffer pages. One problem we can have is use
>>> of the scratch buffer, which is used to get a pointer to a contiguous
>>> region of data up to PAGE_SIZE. When using the scratch buffer, calls to
>>> xdr_commit_encode() shift the data to its proper alignment in the xdr
>>> buffer. If we've reserved several pages in a vector, then this could
>>> potentially invalidate earlier pointers and result in incorrect READ
>>> data being sent to the client.
>>> 
>>> I get around this by looking at the amount of space left in the current
>>> page, and never reserve more than that for each entry in the read
>>> vector. This lets us place data directly where it needs to go in the
>>> buffer pages.
>> 
>> Nit: This appears to be a refactoring change that should be squashed
>> together with 2/6.
> 
> My default was to leave sunrpc and nfs changes as separate patches,
> but I can squash them together if you want me to!

IMO, in this case the rule about introducing and using a new helper
in the same patch takes precedence.


--
Chuck Lever
Patch

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index 22c207b2425f..bac459584dd0 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -234,6 +234,8 @@  typedef int	(*kxdrdproc_t)(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
 extern void xdr_init_encode(struct xdr_stream *xdr, struct xdr_buf *buf,
 			    __be32 *p, struct rpc_rqst *rqst);
 extern __be32 *xdr_reserve_space(struct xdr_stream *xdr, size_t nbytes);
+extern int xdr_reserve_space_vec(struct xdr_stream *xdr, struct kvec *vec,
+		size_t nbytes);
 extern void xdr_commit_encode(struct xdr_stream *xdr);
 extern void xdr_truncate_encode(struct xdr_stream *xdr, size_t len);
 extern int xdr_restrict_buflen(struct xdr_stream *xdr, int newbuflen);
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index be11d672b5b9..6dfe5dc8b35f 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -648,6 +648,51 @@  __be32 * xdr_reserve_space(struct xdr_stream *xdr, size_t nbytes)
 }
 EXPORT_SYMBOL_GPL(xdr_reserve_space);
 
+
+/**
+ * xdr_reserve_space_vec - Reserves a large amount of buffer space for sending
+ * @xdr: pointer to xdr_stream
+ * @vec: pointer to a kvec array
+ * @nbytes: number of bytes to reserve
+ *
+ * Reserves enough buffer space to encode 'nbytes' of data and stores the
+ * pointers in 'vec'. The size argument passed to xdr_reserve_space() is
+ * determined based on the number of bytes remaining in the current page to
+ * avoid invalidating iov_base pointers when xdr_commit_encode() is called.
+ */
+int xdr_reserve_space_vec(struct xdr_stream *xdr, struct kvec *vec, size_t nbytes)
+{
+	int thislen;
+	int v = 0;
+	__be32 *p;
+
+	/*
+	 * svcrdma requires every READ payload to start somewhere
+	 * in xdr->pages.
+	 */
+	if (xdr->iov == xdr->buf->head) {
+		xdr->iov = NULL;
+		xdr->end = xdr->p;
+	}
+
+	while (nbytes) {
+		thislen = xdr->buf->page_len % PAGE_SIZE;
+		thislen = min_t(size_t, nbytes, PAGE_SIZE - thislen);
+
+		p = xdr_reserve_space(xdr, thislen);
+		if (!p)
+			return -EIO;
+
+		vec[v].iov_base = p;
+		vec[v].iov_len = thislen;
+		v++;
+		nbytes -= thislen;
+	}
+
+	return v;
+}
+EXPORT_SYMBOL_GPL(xdr_reserve_space_vec);
+
 /**
  * xdr_truncate_encode - truncate an encode buffer
  * @xdr: pointer to xdr_stream