Message ID | 20190329142332.12010-1-jencce.kernel@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | nfsd/nfsd3_proc_readdir: check before encoding on temporary page | expand |
On Fri, Mar 29 2019, Murphy Zhou wrote: > After this commit > f875a79 nfsd: allow nfsv3 readdir request to be larger. > nfsv3 readdir request size can be larger than PAGE_SIZE. So if the > request and the directory are large enough, we can run out of pages > in rq_respages. Then the temporary page we are encoding on is a > random page, Oops happen. > > Listing a directory with 30000 files in it can trigger the panic. > > Fixing this by ensuring the temporary page resides in rq_respages. > And when counting how many bytes left currently from buffer to the > end of the page which buffer is pointing to, aka len1, do not > assume that curr_page_addr is at the beginning of the same page. > Also, update resp->count by going through all pages. Because the > pages may not be sequential, the old way is not safe. > > Fixes: f875a79 "nfsd: allow nfsv3 readdir request to be larger" > Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com> Hi, thanks for finding these problems and submitting a patch. Clearly I should have tested better :-( > --- > fs/nfsd/nfs3proc.c | 17 +++++++++++++++-- > fs/nfsd/nfs3xdr.c | 8 +++++--- > 2 files changed, 20 insertions(+), 5 deletions(-) > > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c > index 8f933e84cec1..9bc32af4e2da 100644 > --- a/fs/nfsd/nfs3proc.c > +++ b/fs/nfsd/nfs3proc.c > @@ -442,7 +442,9 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) > struct nfsd3_readdirargs *argp = rqstp->rq_argp; > struct nfsd3_readdirres *resp = rqstp->rq_resp; > __be32 nfserr; > - int count; > + int count = 0; > + struct page **p; > + caddr_t page_addr = NULL; > > dprintk("nfsd: READDIR(3) %s %d bytes at %d\n", > SVCFH_fmt(&argp->fh), > @@ -462,7 +464,18 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) > nfserr = nfsd_readdir(rqstp, &resp->fh, (loff_t*) &argp->cookie, > &resp->common, nfs3svc_encode_entry); > memcpy(resp->verf, argp->verf, 8); > - resp->count = resp->buffer - argp->buffer; > + count = 0; > + for (p = rqstp->rq_respages + 1; p < rqstp->rq_next_page; p++) { > + page_addr = page_address(*p); > + > + if (((caddr_t)resp->buffer >= page_addr) && > + ((caddr_t)resp->buffer < page_addr + PAGE_SIZE)) { > + count += (caddr_t)resp->buffer - page_addr; > + break; > + } > + count += PAGE_SIZE; > + } > + resp->count = count >> 2; > if (resp->offset) { > loff_t offset = argp->cookie; This section makes perfect sense. You have copied code from nfsd3_proc_readdirplus() int nfsd3_proc_readdir(). This is needed because readdir doesn't limit replies to PAGE_SIZE any more. > > diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c > index 93fea246f676..1fabf1952bdb 100644 > --- a/fs/nfsd/nfs3xdr.c > +++ b/fs/nfsd/nfs3xdr.c > @@ -953,6 +953,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > break; > } > > + Adding blank lines is best avoided. > if ((caddr_t)(cd->buffer + elen) < (curr_page_addr + PAGE_SIZE)) { > /* encode entry in current page */ > > @@ -961,11 +962,12 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > if (plus) > p = encode_entryplus_baggage(cd, p, name, namlen, ino); > num_entry_words = p - cd->buffer; > - } else if (*(page+1) != NULL) { > + } else if (*(page+1) != NULL && (page+1) < cd->rqstp->rq_next_page) { Adding the test against rq_next_page looks correct. nfs3svc_decode_readdirplusargs increments ->rq_next_page over enough pages to cover the request count. However nfs4svc_decode_readdirargs doesn't! So it will not advance rq_next_page properly. We need to fix that to make it like nfs3svc_decode_readdirplusargs(). But I don't think this test should be needed. svc_alloc_arg() NULL- terminates the ->rq_pages array which ->rq_next_page points in to. So I think this test is correct as it stands, but nfs4svc_decode_readdirargs() needs to be fixed. > /* temporarily encode entry into next page, then move back to > * current and next page in rq_respages[] */ > __be32 *p1, *tmp; > int len1, len2; > + caddr_t tmp_page_addr = NULL; > > /* grab next page for temporary storage of entry */ > p1 = tmp = page_address(*(page+1)); > @@ -977,7 +979,8 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > /* determine entry word length and lengths to go in pages */ > num_entry_words = p1 - tmp; > - len1 = curr_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; > + tmp_page_addr = (caddr_t)((unsigned long)cd->buffer & PAGE_MASK); > + len1 = tmp_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; I think tmp_page_addr will always be the same as curr_page_addr. At least, it will when nfs4svc_decode_readdirargs() is fixed. Could you please revert the changed you've made to nfs3xdr.c (keeping the change to nfsd3_proc_readdir()), and fix nfs4svc_decode_readdirargs to be more like nfs4svc_decode_readdirplusargs, and see if that fixed the problem? Thanks a lot, NeilBrown > if ((num_entry_words << 2) < len1) { > /* the actual number of words in the entry is less > * than elen and can still fit in the current page > @@ -1026,7 +1029,6 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > cd->buffer = p; > cd->common.err = nfs_ok; > return 0; > - > } > > int > -- > 2.21.0
Hi, Thanks for the review! On Mon, Apr 1, 2019 at 9:20 AM NeilBrown <neilb@suse.com> wrote: > > On Fri, Mar 29 2019, Murphy Zhou wrote: > > > After this commit > > f875a79 nfsd: allow nfsv3 readdir request to be larger. > > nfsv3 readdir request size can be larger than PAGE_SIZE. So if the > > request and the directory are large enough, we can run out of pages > > in rq_respages. Then the temporary page we are encoding on is a > > random page, Oops happen. > > > > Listing a directory with 30000 files in it can trigger the panic. > > > > Fixing this by ensuring the temporary page resides in rq_respages. > > And when counting how many bytes left currently from buffer to the > > end of the page which buffer is pointing to, aka len1, do not > > assume that curr_page_addr is at the beginning of the same page. > > Also, update resp->count by going through all pages. Because the > > pages may not be sequential, the old way is not safe. > > > > Fixes: f875a79 "nfsd: allow nfsv3 readdir request to be larger" > > Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com> > > Hi, > thanks for finding these problems and submitting a patch. Clearly I > should have tested better :-( > > > --- > > fs/nfsd/nfs3proc.c | 17 +++++++++++++++-- > > fs/nfsd/nfs3xdr.c | 8 +++++--- > > 2 files changed, 20 insertions(+), 5 deletions(-) > > > > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c > > index 8f933e84cec1..9bc32af4e2da 100644 > > --- a/fs/nfsd/nfs3proc.c > > +++ b/fs/nfsd/nfs3proc.c > > @@ -442,7 +442,9 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) > > struct nfsd3_readdirargs *argp = rqstp->rq_argp; > > struct nfsd3_readdirres *resp = rqstp->rq_resp; > > __be32 nfserr; > > - int count; > > + int count = 0; > > + struct page **p; > > + caddr_t page_addr = NULL; > > > > dprintk("nfsd: READDIR(3) %s %d bytes at %d\n", > > SVCFH_fmt(&argp->fh), > > @@ -462,7 +464,18 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) > > nfserr = nfsd_readdir(rqstp, &resp->fh, (loff_t*) &argp->cookie, > > &resp->common, nfs3svc_encode_entry); > > memcpy(resp->verf, argp->verf, 8); > > - resp->count = resp->buffer - argp->buffer; > > + count = 0; > > + for (p = rqstp->rq_respages + 1; p < rqstp->rq_next_page; p++) { > > + page_addr = page_address(*p); > > + > > + if (((caddr_t)resp->buffer >= page_addr) && > > + ((caddr_t)resp->buffer < page_addr + PAGE_SIZE)) { > > + count += (caddr_t)resp->buffer - page_addr; > > + break; > > + } > > + count += PAGE_SIZE; > > + } > > + resp->count = count >> 2; > > if (resp->offset) { > > loff_t offset = argp->cookie; > > This section makes perfect sense. You have copied code from > nfsd3_proc_readdirplus() int nfsd3_proc_readdir(). This is needed > because readdir doesn't limit replies to PAGE_SIZE any more. > > > > > diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c > > index 93fea246f676..1fabf1952bdb 100644 > > --- a/fs/nfsd/nfs3xdr.c > > +++ b/fs/nfsd/nfs3xdr.c > > @@ -953,6 +953,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > break; > > } > > > > + > > Adding blank lines is best avoided. > > > if ((caddr_t)(cd->buffer + elen) < (curr_page_addr + PAGE_SIZE)) { > > /* encode entry in current page */ > > > > @@ -961,11 +962,12 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > if (plus) > > p = encode_entryplus_baggage(cd, p, name, namlen, ino); > > num_entry_words = p - cd->buffer; > > - } else if (*(page+1) != NULL) { > > + } else if (*(page+1) != NULL && (page+1) < cd->rqstp->rq_next_page) { > > Adding the test against rq_next_page looks correct. > nfs3svc_decode_readdirplusargs increments ->rq_next_page over enough > pages to cover the request count. > However nfs4svc_decode_readdirargs doesn't! So it will not advance > rq_next_page properly. We need to fix that to make it like > nfs3svc_decode_readdirplusargs(). > > But I don't think this test should be needed. svc_alloc_arg() NULL- > terminates the ->rq_pages array which ->rq_next_page points in to. > > So I think this test is correct as it stands, but > nfs4svc_decode_readdirargs() needs to be fixed. Agreed. If we set the page pointers right. These checks wont be necessary. > > > /* temporarily encode entry into next page, then move back to > > * current and next page in rq_respages[] */ > > __be32 *p1, *tmp; > > int len1, len2; > > + caddr_t tmp_page_addr = NULL; > > > > /* grab next page for temporary storage of entry */ > > p1 = tmp = page_address(*(page+1)); > > @@ -977,7 +979,8 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > > > /* determine entry word length and lengths to go in pages */ > > num_entry_words = p1 - tmp; > > - len1 = curr_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; > > + tmp_page_addr = (caddr_t)((unsigned long)cd->buffer & PAGE_MASK); > > + len1 = tmp_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; > > I think tmp_page_addr will always be the same as curr_page_addr. > At least, it will when nfs4svc_decode_readdirargs() is fixed. > > Could you please revert the changed you've made to nfs3xdr.c (keeping > the change to nfsd3_proc_readdir()), and fix nfs4svc_decode_readdirargs > to be more like nfs4svc_decode_readdirplusargs, and see if that fixed > the problem? Sure. A quick test on 3000 files passed. I'll post v2 if test on 30,000 files pass. Thanks very much for reviewing! M > > Thanks a lot, > NeilBrown > > > > if ((num_entry_words << 2) < len1) { > > /* the actual number of words in the entry is less > > * than elen and can still fit in the current page > > @@ -1026,7 +1029,6 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, > > cd->buffer = p; > > cd->common.err = nfs_ok; > > return 0; > > - > > } > > > > int > > -- > > 2.21.0
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c index 8f933e84cec1..9bc32af4e2da 100644 --- a/fs/nfsd/nfs3proc.c +++ b/fs/nfsd/nfs3proc.c @@ -442,7 +442,9 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) struct nfsd3_readdirargs *argp = rqstp->rq_argp; struct nfsd3_readdirres *resp = rqstp->rq_resp; __be32 nfserr; - int count; + int count = 0; + struct page **p; + caddr_t page_addr = NULL; dprintk("nfsd: READDIR(3) %s %d bytes at %d\n", SVCFH_fmt(&argp->fh), @@ -462,7 +464,18 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp) nfserr = nfsd_readdir(rqstp, &resp->fh, (loff_t*) &argp->cookie, &resp->common, nfs3svc_encode_entry); memcpy(resp->verf, argp->verf, 8); - resp->count = resp->buffer - argp->buffer; + count = 0; + for (p = rqstp->rq_respages + 1; p < rqstp->rq_next_page; p++) { + page_addr = page_address(*p); + + if (((caddr_t)resp->buffer >= page_addr) && + ((caddr_t)resp->buffer < page_addr + PAGE_SIZE)) { + count += (caddr_t)resp->buffer - page_addr; + break; + } + count += PAGE_SIZE; + } + resp->count = count >> 2; if (resp->offset) { loff_t offset = argp->cookie; diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c index 93fea246f676..1fabf1952bdb 100644 --- a/fs/nfsd/nfs3xdr.c +++ b/fs/nfsd/nfs3xdr.c @@ -953,6 +953,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, break; } + if ((caddr_t)(cd->buffer + elen) < (curr_page_addr + PAGE_SIZE)) { /* encode entry in current page */ @@ -961,11 +962,12 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, if (plus) p = encode_entryplus_baggage(cd, p, name, namlen, ino); num_entry_words = p - cd->buffer; - } else if (*(page+1) != NULL) { + } else if (*(page+1) != NULL && (page+1) < cd->rqstp->rq_next_page) { /* temporarily encode entry into next page, then move back to * current and next page in rq_respages[] */ __be32 *p1, *tmp; int len1, len2; + caddr_t tmp_page_addr = NULL; /* grab next page for temporary storage of entry */ p1 = tmp = page_address(*(page+1)); @@ -977,7 +979,8 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, /* determine entry word length and lengths to go in pages */ num_entry_words = p1 - tmp; - len1 = curr_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; + tmp_page_addr = (caddr_t)((unsigned long)cd->buffer & PAGE_MASK); + len1 = tmp_page_addr + PAGE_SIZE - (caddr_t)cd->buffer; if ((num_entry_words << 2) < len1) { /* the actual number of words in the entry is less * than elen and can still fit in the current page @@ -1026,7 +1029,6 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen, cd->buffer = p; cd->common.err = nfs_ok; return 0; - } int
After this commit f875a79 nfsd: allow nfsv3 readdir request to be larger. nfsv3 readdir request size can be larger than PAGE_SIZE. So if the request and the directory are large enough, we can run out of pages in rq_respages. Then the temporary page we are encoding on is a random page, Oops happen. Listing a directory with 30000 files in it can trigger the panic. Fixing this by ensuring the temporary page resides in rq_respages. And when counting how many bytes left currently from buffer to the end of the page which buffer is pointing to, aka len1, do not assume that curr_page_addr is at the beginning of the same page. Also, update resp->count by going through all pages. Because the pages may not be sequential, the old way is not safe. Fixes: f875a79 "nfsd: allow nfsv3 readdir request to be larger" Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com> --- fs/nfsd/nfs3proc.c | 17 +++++++++++++++-- fs/nfsd/nfs3xdr.c | 8 +++++--- 2 files changed, 20 insertions(+), 5 deletions(-)