diff mbox series

[RFC] NFS: Fix missing files in `ls` command output

Message ID 20240829091340.2043-1-laoar.shao@gmail.com (mailing list archive)
State New
Headers show
Series [RFC] NFS: Fix missing files in `ls` command output | expand

Commit Message

Yafang Shao Aug. 29, 2024, 9:13 a.m. UTC
In our production environment, we noticed that some files are missing when
running the ls command in an NFS directory. However, we can still
successfully cd into the missing directories. This issue can be illustrated
as follows:

  $ cd nfs
  $ ls
  a b c e f            <<<< 'd' is missing
  $ cd d               <<<< success

I verified the issue with the latest upstream kernel, and it still
persists. Further analysis reveals that files go missing when the dtsize is
expanded. The default dtsize was reduced from 1MB to 4KB in commit
580f236737d1 ("NFS: Adjust the amount of readahead performed by NFS readdir").
After restoring the default size to 1MB, the issue disappears. I also tried
setting the default size to 8KB, and the issue similarly disappears.

Upon further analysis, it appears that there is a bad entry being decoded
in nfs_readdir_entry_decode(). When a bad entry is encountered, the
decoding process breaks without handling the error. We should revert the
bad entry in such cases. After implementing this change, the issue is
resolved.

However, I am unable to reproduce this issue with a simple example; it only
occurs on our production servers.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 fs/nfs/dir.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

Comments

Benjamin Coddington Aug. 29, 2024, 12:44 p.m. UTC | #1
On 29 Aug 2024, at 5:13, Yafang Shao wrote:

> In our production environment, we noticed that some files are missing when
> running the ls command in an NFS directory. However, we can still
> successfully cd into the missing directories. This issue can be illustrated
> as follows:
>
>   $ cd nfs
>   $ ls
>   a b c e f            <<<< 'd' is missing
>   $ cd d               <<<< success
>
> I verified the issue with the latest upstream kernel, and it still
> persists. Further analysis reveals that files go missing when the dtsize is
> expanded. The default dtsize was reduced from 1MB to 4KB in commit
> 580f236737d1 ("NFS: Adjust the amount of readahead performed by NFS readdir").
> After restoring the default size to 1MB, the issue disappears. I also tried
> setting the default size to 8KB, and the issue similarly disappears.
>
> Upon further analysis, it appears that there is a bad entry being decoded
> in nfs_readdir_entry_decode(). When a bad entry is encountered, the
> decoding process breaks without handling the error. We should revert the
> bad entry in such cases. After implementing this change, the issue is
> resolved.

It seems like you're trying to handle a server bug of some sort.  Have you
been able to look at a wire capture to determine why there's a bad entry?

Ben
Yafang Shao Aug. 29, 2024, 12:54 p.m. UTC | #2
On Thu, Aug 29, 2024 at 8:44 PM Benjamin Coddington <bcodding@redhat.com> wrote:
>
> On 29 Aug 2024, at 5:13, Yafang Shao wrote:
>
> > In our production environment, we noticed that some files are missing when
> > running the ls command in an NFS directory. However, we can still
> > successfully cd into the missing directories. This issue can be illustrated
> > as follows:
> >
> >   $ cd nfs
> >   $ ls
> >   a b c e f            <<<< 'd' is missing
> >   $ cd d               <<<< success
> >
> > I verified the issue with the latest upstream kernel, and it still
> > persists. Further analysis reveals that files go missing when the dtsize is
> > expanded. The default dtsize was reduced from 1MB to 4KB in commit
> > 580f236737d1 ("NFS: Adjust the amount of readahead performed by NFS readdir").
> > After restoring the default size to 1MB, the issue disappears. I also tried
> > setting the default size to 8KB, and the issue similarly disappears.
> >
> > Upon further analysis, it appears that there is a bad entry being decoded
> > in nfs_readdir_entry_decode(). When a bad entry is encountered, the
> > decoding process breaks without handling the error. We should revert the
> > bad entry in such cases. After implementing this change, the issue is
> > resolved.
>
> It seems like you're trying to handle a server bug of some sort.  Have you
> been able to look at a wire capture to determine why there's a bad entry?

I've used tcpdump to analyze the packets but didn't find anything
suspicious. Do you have any suggestions?

Interestingly, when we increase the dtsize, the issue goes away. This
suggests that the problem might not be with the server itself, but
rather with the NFS readdir operation. The change in dtsize is as
follows,

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 56a8aee..39847e1 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -81,7 +81,7 @@
        ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
        if (ctx != NULL) {
                ctx->attr_gencount = nfsi->attr_gencount;
-               ctx->dtsize = NFS_INIT_DTSIZE;
+               ctx->dtsize = 2 * NFS_INIT_DTSIZE; // 8K
                spin_lock(&dir->i_lock);
                if (list_empty(&nfsi->open_files) &&
                    (nfsi->cache_validity & NFS_INO_DATA_INVAL_DEFER))



--
Regards
Yafang
Benjamin Coddington Aug. 29, 2024, 5:57 p.m. UTC | #3
On 29 Aug 2024, at 8:54, Yafang Shao wrote:

> On Thu, Aug 29, 2024 at 8:44 PM Benjamin Coddington <bcodding@redhat.com> wrote:
>>
>> On 29 Aug 2024, at 5:13, Yafang Shao wrote:
>>
>>> In our production environment, we noticed that some files are missing when
>>> running the ls command in an NFS directory. However, we can still
>>> successfully cd into the missing directories. This issue can be illustrated
>>> as follows:
>>>
>>>   $ cd nfs
>>>   $ ls
>>>   a b c e f            <<<< 'd' is missing
>>>   $ cd d               <<<< success
>>>
>>> I verified the issue with the latest upstream kernel, and it still
>>> persists. Further analysis reveals that files go missing when the dtsize is
>>> expanded. The default dtsize was reduced from 1MB to 4KB in commit
>>> 580f236737d1 ("NFS: Adjust the amount of readahead performed by NFS readdir").
>>> After restoring the default size to 1MB, the issue disappears. I also tried
>>> setting the default size to 8KB, and the issue similarly disappears.
>>>
>>> Upon further analysis, it appears that there is a bad entry being decoded
>>> in nfs_readdir_entry_decode(). When a bad entry is encountered, the
>>> decoding process breaks without handling the error. We should revert the
>>> bad entry in such cases. After implementing this change, the issue is
>>> resolved.
>>
>> It seems like you're trying to handle a server bug of some sort.  Have you
>> been able to look at a wire capture to determine why there's a bad entry?
>
> I've used tcpdump to analyze the packets but didn't find anything
> suspicious. Do you have any suggestions?

I'd check to make sure the server isn't overrunning the READDIR request's
dircount and maxcount (they should be the same for the linux client).  If
the server isn't exceeding them, then there's a likely client bug.

Ben
Yafang Shao Sept. 1, 2024, 5:52 a.m. UTC | #4
On Fri, Aug 30, 2024 at 1:57 AM Benjamin Coddington <bcodding@redhat.com> wrote:
>
> On 29 Aug 2024, at 8:54, Yafang Shao wrote:
>
> > On Thu, Aug 29, 2024 at 8:44 PM Benjamin Coddington <bcodding@redhat.com> wrote:
> >>
> >> On 29 Aug 2024, at 5:13, Yafang Shao wrote:
> >>
> >>> In our production environment, we noticed that some files are missing when
> >>> running the ls command in an NFS directory. However, we can still
> >>> successfully cd into the missing directories. This issue can be illustrated
> >>> as follows:
> >>>
> >>>   $ cd nfs
> >>>   $ ls
> >>>   a b c e f            <<<< 'd' is missing
> >>>   $ cd d               <<<< success
> >>>
> >>> I verified the issue with the latest upstream kernel, and it still
> >>> persists. Further analysis reveals that files go missing when the dtsize is
> >>> expanded. The default dtsize was reduced from 1MB to 4KB in commit
> >>> 580f236737d1 ("NFS: Adjust the amount of readahead performed by NFS readdir").
> >>> After restoring the default size to 1MB, the issue disappears. I also tried
> >>> setting the default size to 8KB, and the issue similarly disappears.
> >>>
> >>> Upon further analysis, it appears that there is a bad entry being decoded
> >>> in nfs_readdir_entry_decode(). When a bad entry is encountered, the
> >>> decoding process breaks without handling the error. We should revert the
> >>> bad entry in such cases. After implementing this change, the issue is
> >>> resolved.
> >>
> >> It seems like you're trying to handle a server bug of some sort.  Have you
> >> been able to look at a wire capture to determine why there's a bad entry?
> >
> > I've used tcpdump to analyze the packets but didn't find anything
> > suspicious. Do you have any suggestions?
>
> I'd check to make sure the server isn't overrunning the READDIR request's
> dircount and maxcount (they should be the same for the linux client).  If
> the server isn't exceeding them, then there's a likely client bug.

Thank you for the suggestion. I have captured and analyzed the NFS RPC
traffic using Wireshark. I noticed that the ls command is being split
into two NFS READDIR operations. In the first READDIR request, both
the dircount and maxcount parameters are set to 4008. In the
subsequent READDIR request, both dircount and maxcount are set to
8192.

Interestingly, when I increase the value of ctx->dtsize to 8192, the
ls command now generates only a single NFS READDIR RPC call. In this
case, both the dircount and maxcount parameters are set to 8104. This
issue disappears as well.

--
Regards
Yafang
Yafang Shao Sept. 2, 2024, 11:46 a.m. UTC | #5
On Fri, Aug 30, 2024 at 1:57 AM Benjamin Coddington <bcodding@redhat.com> wrote:
>
> On 29 Aug 2024, at 8:54, Yafang Shao wrote:
>
> > On Thu, Aug 29, 2024 at 8:44 PM Benjamin Coddington <bcodding@redhat.com> wrote:
> >>
> >> On 29 Aug 2024, at 5:13, Yafang Shao wrote:
> >>
> >>> In our production environment, we noticed that some files are missing when
> >>> running the ls command in an NFS directory. However, we can still
> >>> successfully cd into the missing directories. This issue can be illustrated
> >>> as follows:
> >>>
> >>>   $ cd nfs
> >>>   $ ls
> >>>   a b c e f            <<<< 'd' is missing
> >>>   $ cd d               <<<< success
> >>>
> >>> I verified the issue with the latest upstream kernel, and it still
> >>> persists. Further analysis reveals that files go missing when the dtsize is
> >>> expanded. The default dtsize was reduced from 1MB to 4KB in commit
> >>> 580f236737d1 ("NFS: Adjust the amount of readahead performed by NFS readdir").
> >>> After restoring the default size to 1MB, the issue disappears. I also tried
> >>> setting the default size to 8KB, and the issue similarly disappears.
> >>>
> >>> Upon further analysis, it appears that there is a bad entry being decoded
> >>> in nfs_readdir_entry_decode(). When a bad entry is encountered, the
> >>> decoding process breaks without handling the error. We should revert the
> >>> bad entry in such cases. After implementing this change, the issue is
> >>> resolved.
> >>
> >> It seems like you're trying to handle a server bug of some sort.  Have you
> >> been able to look at a wire capture to determine why there's a bad entry?
> >
> > I've used tcpdump to analyze the packets but didn't find anything
> > suspicious. Do you have any suggestions?
>
> I'd check to make sure the server isn't overrunning the READDIR request's
> dircount and maxcount (they should be the same for the linux client).  If
> the server isn't exceeding them, then there's a likely client bug.
>
> Ben
>

Hello Ben,

Upon thorough examination, we have identified the root cause of the
issue to lie within the NFS server, specifically its behavior of
truncating file listings to match the client's READDIR RPC args->size
parameter without appropriately adjusting the cookie value. After
implementing a fix on the server side, the issue has been resolved.
However, to enhance resilience and mitigate future server-side
vulnerabilities, it may be prudent to implement client-side handling
mechanisms for such issues. What do you think?
Chuck Lever Sept. 2, 2024, 6:27 p.m. UTC | #6
> On Sep 2, 2024, at 7:46 AM, Yafang Shao <laoar.shao@gmail.com> wrote:
> 
> On Fri, Aug 30, 2024 at 1:57 AM Benjamin Coddington <bcodding@redhat.com> wrote:
>> 
>> On 29 Aug 2024, at 8:54, Yafang Shao wrote:
>> 
>>> On Thu, Aug 29, 2024 at 8:44 PM Benjamin Coddington <bcodding@redhat.com> wrote:
>>>> 
>>>> On 29 Aug 2024, at 5:13, Yafang Shao wrote:
>>>> 
>>>>> In our production environment, we noticed that some files are missing when
>>>>> running the ls command in an NFS directory. However, we can still
>>>>> successfully cd into the missing directories. This issue can be illustrated
>>>>> as follows:
>>>>> 
>>>>>  $ cd nfs
>>>>>  $ ls
>>>>>  a b c e f            <<<< 'd' is missing
>>>>>  $ cd d               <<<< success
>>>>> 
>>>>> I verified the issue with the latest upstream kernel, and it still
>>>>> persists. Further analysis reveals that files go missing when the dtsize is
>>>>> expanded. The default dtsize was reduced from 1MB to 4KB in commit
>>>>> 580f236737d1 ("NFS: Adjust the amount of readahead performed by NFS readdir").
>>>>> After restoring the default size to 1MB, the issue disappears. I also tried
>>>>> setting the default size to 8KB, and the issue similarly disappears.
>>>>> 
>>>>> Upon further analysis, it appears that there is a bad entry being decoded
>>>>> in nfs_readdir_entry_decode(). When a bad entry is encountered, the
>>>>> decoding process breaks without handling the error. We should revert the
>>>>> bad entry in such cases. After implementing this change, the issue is
>>>>> resolved.
>>>> 
>>>> It seems like you're trying to handle a server bug of some sort.  Have you
>>>> been able to look at a wire capture to determine why there's a bad entry?
>>> 
>>> I've used tcpdump to analyze the packets but didn't find anything
>>> suspicious. Do you have any suggestions?
>> 
>> I'd check to make sure the server isn't overrunning the READDIR request's
>> dircount and maxcount (they should be the same for the linux client).  If
>> the server isn't exceeding them, then there's a likely client bug.
>> 
>> Ben
>> 
> 
> Hello Ben,
> 
> Upon thorough examination, we have identified the root cause of the
> issue to lie within the NFS server, specifically its behavior of
> truncating file listings to match the client's READDIR RPC args->size
> parameter without appropriately adjusting the cookie value. After
> implementing a fix on the server side, the issue has been resolved.

Please post your server fix on this mailing list. Thanks!


> However, to enhance resilience and mitigate future server-side
> vulnerabilities, it may be prudent to implement client-side handling
> mechanisms for such issues. What do you think?

The general policy we follow is to avoid fixing server
bugs via client-side workarounds. Fix the server in
that case.


--
Chuck Lever
Benjamin Coddington Sept. 3, 2024, 12:24 p.m. UTC | #7
On 2 Sep 2024, at 7:46, Yafang Shao wrote:

> Hello Ben,
>
> Upon thorough examination, we have identified the root cause of the
> issue to lie within the NFS server, specifically its behavior of
> truncating file listings to match the client's READDIR RPC args->size
> parameter without appropriately adjusting the cookie value. After
> implementing a fix on the server side, the issue has been resolved.

Nice work!  Out of curiosity, what server implemenation did you fix?

> However, to enhance resilience and mitigate future server-side
> vulnerabilities, it may be prudent to implement client-side handling
> mechanisms for such issues. What do you think?

We have in the past modified the client to be more resilient, but usually
only for cases where the server can cause the client to crash and/or corrupt
data.  For a bug like this, the maintainers usually assert "we do not fix
the client for server bugs", since doing so can paper over protocol
correctness issues created by the server.  I think that's what would happen
here if you keep working to justify your fix --  which you're free to do, of
course!

Regards,
Ben
Chuck Lever Sept. 3, 2024, 1:48 p.m. UTC | #8
> On Sep 2, 2024, at 2:27 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Sep 2, 2024, at 7:46 AM, Yafang Shao <laoar.shao@gmail.com> wrote:
>> 
>> On Fri, Aug 30, 2024 at 1:57 AM Benjamin Coddington <bcodding@redhat.com> wrote:
>>> 
>>> On 29 Aug 2024, at 8:54, Yafang Shao wrote:
>>> 
>>>> On Thu, Aug 29, 2024 at 8:44 PM Benjamin Coddington <bcodding@redhat.com> wrote:
>>>>> 
>>>>> On 29 Aug 2024, at 5:13, Yafang Shao wrote:
>>>>> 
>>>>>> In our production environment, we noticed that some files are missing when
>>>>>> running the ls command in an NFS directory. However, we can still
>>>>>> successfully cd into the missing directories. This issue can be illustrated
>>>>>> as follows:
>>>>>> 
>>>>>> $ cd nfs
>>>>>> $ ls
>>>>>> a b c e f            <<<< 'd' is missing
>>>>>> $ cd d               <<<< success
>>>>>> 
>>>>>> I verified the issue with the latest upstream kernel, and it still
>>>>>> persists. Further analysis reveals that files go missing when the dtsize is
>>>>>> expanded. The default dtsize was reduced from 1MB to 4KB in commit
>>>>>> 580f236737d1 ("NFS: Adjust the amount of readahead performed by NFS readdir").
>>>>>> After restoring the default size to 1MB, the issue disappears. I also tried
>>>>>> setting the default size to 8KB, and the issue similarly disappears.
>>>>>> 
>>>>>> Upon further analysis, it appears that there is a bad entry being decoded
>>>>>> in nfs_readdir_entry_decode(). When a bad entry is encountered, the
>>>>>> decoding process breaks without handling the error. We should revert the
>>>>>> bad entry in such cases. After implementing this change, the issue is
>>>>>> resolved.
>>>>> 
>>>>> It seems like you're trying to handle a server bug of some sort.  Have you
>>>>> been able to look at a wire capture to determine why there's a bad entry?
>>>> 
>>>> I've used tcpdump to analyze the packets but didn't find anything
>>>> suspicious. Do you have any suggestions?
>>> 
>>> I'd check to make sure the server isn't overrunning the READDIR request's
>>> dircount and maxcount (they should be the same for the linux client).  If
>>> the server isn't exceeding them, then there's a likely client bug.
>>> 
>>> Ben
>>> 
>> 
>> Hello Ben,
>> 
>> Upon thorough examination, we have identified the root cause of the
>> issue to lie within the NFS server, specifically its behavior of
>> truncating file listings to match the client's READDIR RPC args->size
>> parameter without appropriately adjusting the cookie value. After
>> implementing a fix on the server side, the issue has been resolved.
> 
> Please post your server fix on this mailing list. Thanks!

I was assuming your test server was Linux NFSD. If not,
then please ignore me!


--
Chuck Lever
diff mbox series

Patch

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 07a7be27182e..1f5a99888a11 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -310,7 +310,7 @@  static int nfs_readdir_array_can_expand(struct nfs_cache_array *array)
 
 static int nfs_readdir_folio_array_append(struct folio *folio,
 					  const struct nfs_entry *entry,
-					  u64 *cookie)
+					  u64 *cookie, u64 *prev_cookie)
 {
 	struct nfs_cache_array *array;
 	struct nfs_cache_array_entry *cache_entry;
@@ -342,6 +342,7 @@  static int nfs_readdir_folio_array_append(struct folio *folio,
 		nfs_readdir_array_set_eof(array);
 out:
 	*cookie = array->last_cookie;
+	*prev_cookie = cache_entry->cookie;
 	kunmap_local(array);
 	return ret;
 }
@@ -826,10 +827,11 @@  static int nfs_readdir_folio_filler(struct nfs_readdir_descriptor *desc,
 {
 	struct address_space *mapping = desc->file->f_mapping;
 	struct folio *new, *folio = *arrays;
+	struct nfs_cache_array *array;
 	struct xdr_stream stream;
+	u64 cookie, prev_cookie;
 	struct page *scratch;
 	struct xdr_buf buf;
-	u64 cookie;
 	int status;
 
 	scratch = alloc_page(GFP_KERNEL);
@@ -841,10 +843,20 @@  static int nfs_readdir_folio_filler(struct nfs_readdir_descriptor *desc,
 
 	do {
 		status = nfs_readdir_entry_decode(desc, entry, &stream);
-		if (status != 0)
+		if (status != 0) {
+			if (status == -EAGAIN && entry->cookie == cookie) {
+				/* Revert the bad entry */
+				array = kmap_local_folio(folio, 0);
+				array->last_cookie = prev_cookie;
+				desc->last_cookie = 0;
+				desc->dir_cookie = 0;
+				array->size--;
+				kunmap_local(array);
+			}
 			break;
+		}
 
-		status = nfs_readdir_folio_array_append(folio, entry, &cookie);
+		status = nfs_readdir_folio_array_append(folio, entry, &cookie, &prev_cookie);
 		if (status != -ENOSPC)
 			continue;
 
@@ -866,7 +878,7 @@  static int nfs_readdir_folio_filler(struct nfs_readdir_descriptor *desc,
 			folio = new;
 		}
 		desc->folio_index_max++;
-		status = nfs_readdir_folio_array_append(folio, entry, &cookie);
+		status = nfs_readdir_folio_array_append(folio, entry, &cookie, &prev_cookie);
 	} while (!status && !entry->eof);
 
 	switch (status) {