diff mbox

[3.0.0+,Regression,Bisected] CIFS: getdents() broken for large dirs

Message ID 20110802120008.3e19d193@tlielax.poochiereds.net (mailing list archive)
State New, archived
Headers show

Commit Message

Jeff Layton Aug. 2, 2011, 4 p.m. UTC
On Tue, 2 Aug 2011 06:44:55 -0400
Jeff Layton <jlayton@redhat.com> wrote:

> On Tue, 2 Aug 2011 02:30:35 +0200
> Jan Seiffert <kaffeemonster@googlemail.com> wrote:
> 
> > Please CC, as not subscibed.
> > 
> > Looks like something broke CIFS.
> > Kernel 3.0.0 works, master after the CIFS merge does not ATM.
> > 
> > Example cifs mount is:
> > $ mount
> > ...
> > //server/portage on /usr/portage type cifs (rw,mand)
> > 
> > Server is:
> > $ smbd --version
> > Version 3.4.12
> > 
> > Actual Output is:
> > $ ls -l /usr/portage/ | wc -l
> > ls: reading directory /usr/portage/: input/output error
> > 1
> > $ ls -l /usr/portage/distfiles | wc -l
> > 105
> > 
> > Expected Output:
> > $ ls -l /usr/portage/ | wc -l
> > 170
> > $ ls -l /usr/portage/distfiles/ | wc -l
> > 47470
> > 
> > /usr/portage contains a lot of directories, /usr/portage/distfiles
> > contains lots of files.
> > 
> > A bisect in fs/cifs gives:
> > $ git bisect bad
> > c4d3396b261473ded6f370edd1e79ba34e089d7e is the first bad commit
> > commit c4d3396b261473ded6f370edd1e79ba34e089d7e
> > Author: Jeff Layton <jlayton@redhat.com>
> > Date:   Tue Jul 26 12:20:18 2011 -0400
> > 
> >     cifs: advertise the right receive buffer size to the server
> > 
> >     Currently, we mirror the same size back to the server that it sends us.
> >     That makes little sense. Instead we should be sending the server the
> >     maximum buffer size that we can handle -- CIFSMaxBufSize minus the
> >     4 byte RFC1001 header.
> > 
> >     Signed-off-by: Jeff Layton <jlayton@redhat.com>
> >     Signed-off-by: Steve French <sfrench@us.ibm.com>
> > 
> > :040000 040000 7a5015e0b47ca27538c4387a3b38470c86a42bae
> > c802cfda7d618f1073e45ce98efe65ce03cd0e79 M      fs
> > 
> > $ git bisect log
> > git bisect start '--' 'fs/cifs/'
> > # bad: [5f66d2b58ca879e70740c82422354144845d6dd3] Merge
> > git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
> > git bisect bad 5f66d2b58ca879e70740c82422354144845d6dd3
> > # good: [02f8c6aee8df3cdc935e9bdd4f2d020306035dbe] Linux 3.0
> > git bisect good 02f8c6aee8df3cdc935e9bdd4f2d020306035dbe
> > # good: [3ca30d40a91fb9b9871e61d5dea2c1a895906a15] CIFS: Fix oops
> > while mounting with prefixpath
> > git bisect good 3ca30d40a91fb9b9871e61d5dea2c1a895906a15
> > # bad: [1f1cff0be05f59d5939edf28ff5ca0c6fd0a8e1c] cifs: trivial: goto
> > out here is unnecessary
> > git bisect bad 1f1cff0be05f59d5939edf28ff5ca0c6fd0a8e1c
> > # good: [e010a5ef95b8b6a12b74b548578f7dcf93564347] [CIFS] Redundant
> > null check after dereference
> > git bisect good e010a5ef95b8b6a12b74b548578f7dcf93564347
> > # good: [1d87c28e680ce4ecb8c260d8ce070b8339d52abb] Merge
> > git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
> > git bisect good 1d87c28e680ce4ecb8c260d8ce070b8339d52abb
> > # bad: [c4d3396b261473ded6f370edd1e79ba34e089d7e] cifs: advertise the
> > right receive buffer size to the server
> > git bisect bad c4d3396b261473ded6f370edd1e79ba34e089d7e
> > 
> > A git revert of c4d3396b261473ded6f370edd1e79ba34e089d7e seems to resolve it.
> > 
> > If more info is needed, please let me know.
> 
> Thanks for the bug report. According to the spec, I think I was a
> *little* off in the calculation but not by much. CIFSMaxBufSize doesn't
> include the size of the header, so the value we're sending is too small
> by 0x58 bytes. But, if anything though that should have led to the
> server sending smaller frames than we can handle, which should not
> cause this sort of problem.
> 
> I tried to reproduce this on my test setup, but couldn't...
> 
> Some questions...
> 
> 1) did anything pop up in dmesg when this error occurred?
> 
> 2) are you setting the CIFSMaxBufSize module parm to anything?
> 
> 3) would it be possible to get debugging output? Instructions on how to
> do that are here:
> 
> http://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Enabling_Debugging
> 
> Thanks,

Nevermind... I was able to reproduce it, and the following patch seems
to fix it for me. Jan, can you test this as well? If so, I'll
"officially" send it to Steve and we'll get this in ASAP.

Long term, it would be better clean up the way CIFSMaxBufSize is
handled to get rid of this source of confusion...

-----------------[snip]-------------------

cifs: fix MaxBufferSize advertisement in session setup

Commit c4d3396b2 failed to account for the fact that CIFSMaxBufSize does
not include the size of the header. This means that we're advertising a
size that's 0x58 bytes smaller than the client can handle.

It's also problematic in the case of a large FIND_FIRST request. That
request will specify its own MaxDataCount value, which can turn out to
be larger than the originally provided MaxBufferSize. This causes the
server to break up the response into multiple pieces when it shouldn't
and throws off the checks in check2ndT2.

Fix this by adding the amount of the header to the maxBufSize that we
advertise to the server.

Reported-by: Jan Seiffert <kaffeemonster@googlemail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/cifs/sess.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

Comments

Steve French Aug. 2, 2011, 4:03 p.m. UTC | #1
your patch does look right.

On Tue, Aug 2, 2011 at 11:00 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Tue, 2 Aug 2011 06:44:55 -0400
> Jeff Layton <jlayton@redhat.com> wrote:
>
>> On Tue, 2 Aug 2011 02:30:35 +0200
>> Jan Seiffert <kaffeemonster@googlemail.com> wrote:
>>
>> > Please CC, as not subscibed.
>> >
>> > Looks like something broke CIFS.
>> > Kernel 3.0.0 works, master after the CIFS merge does not ATM.
>> >
>> > Example cifs mount is:
>> > $ mount
>> > ...
>> > //server/portage on /usr/portage type cifs (rw,mand)
>> >
>> > Server is:
>> > $ smbd --version
>> > Version 3.4.12
>> >
>> > Actual Output is:
>> > $ ls -l /usr/portage/ | wc -l
>> > ls: reading directory /usr/portage/: input/output error
>> > 1
>> > $ ls -l /usr/portage/distfiles | wc -l
>> > 105
>> >
>> > Expected Output:
>> > $ ls -l /usr/portage/ | wc -l
>> > 170
>> > $ ls -l /usr/portage/distfiles/ | wc -l
>> > 47470
>> >
>> > /usr/portage contains a lot of directories, /usr/portage/distfiles
>> > contains lots of files.
>> >
>> > A bisect in fs/cifs gives:
>> > $ git bisect bad
>> > c4d3396b261473ded6f370edd1e79ba34e089d7e is the first bad commit
>> > commit c4d3396b261473ded6f370edd1e79ba34e089d7e
>> > Author: Jeff Layton <jlayton@redhat.com>
>> > Date:   Tue Jul 26 12:20:18 2011 -0400
>> >
>> >     cifs: advertise the right receive buffer size to the server
>> >
>> >     Currently, we mirror the same size back to the server that it sends us.
>> >     That makes little sense. Instead we should be sending the server the
>> >     maximum buffer size that we can handle -- CIFSMaxBufSize minus the
>> >     4 byte RFC1001 header.
>> >
>> >     Signed-off-by: Jeff Layton <jlayton@redhat.com>
>> >     Signed-off-by: Steve French <sfrench@us.ibm.com>
>> >
>> > :040000 040000 7a5015e0b47ca27538c4387a3b38470c86a42bae
>> > c802cfda7d618f1073e45ce98efe65ce03cd0e79 M      fs
>> >
>> > $ git bisect log
>> > git bisect start '--' 'fs/cifs/'
>> > # bad: [5f66d2b58ca879e70740c82422354144845d6dd3] Merge
>> > git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
>> > git bisect bad 5f66d2b58ca879e70740c82422354144845d6dd3
>> > # good: [02f8c6aee8df3cdc935e9bdd4f2d020306035dbe] Linux 3.0
>> > git bisect good 02f8c6aee8df3cdc935e9bdd4f2d020306035dbe
>> > # good: [3ca30d40a91fb9b9871e61d5dea2c1a895906a15] CIFS: Fix oops
>> > while mounting with prefixpath
>> > git bisect good 3ca30d40a91fb9b9871e61d5dea2c1a895906a15
>> > # bad: [1f1cff0be05f59d5939edf28ff5ca0c6fd0a8e1c] cifs: trivial: goto
>> > out here is unnecessary
>> > git bisect bad 1f1cff0be05f59d5939edf28ff5ca0c6fd0a8e1c
>> > # good: [e010a5ef95b8b6a12b74b548578f7dcf93564347] [CIFS] Redundant
>> > null check after dereference
>> > git bisect good e010a5ef95b8b6a12b74b548578f7dcf93564347
>> > # good: [1d87c28e680ce4ecb8c260d8ce070b8339d52abb] Merge
>> > git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
>> > git bisect good 1d87c28e680ce4ecb8c260d8ce070b8339d52abb
>> > # bad: [c4d3396b261473ded6f370edd1e79ba34e089d7e] cifs: advertise the
>> > right receive buffer size to the server
>> > git bisect bad c4d3396b261473ded6f370edd1e79ba34e089d7e
>> >
>> > A git revert of c4d3396b261473ded6f370edd1e79ba34e089d7e seems to resolve it.
>> >
>> > If more info is needed, please let me know.
>>
>> Thanks for the bug report. According to the spec, I think I was a
>> *little* off in the calculation but not by much. CIFSMaxBufSize doesn't
>> include the size of the header, so the value we're sending is too small
>> by 0x58 bytes. But, if anything though that should have led to the
>> server sending smaller frames than we can handle, which should not
>> cause this sort of problem.
>>
>> I tried to reproduce this on my test setup, but couldn't...
>>
>> Some questions...
>>
>> 1) did anything pop up in dmesg when this error occurred?
>>
>> 2) are you setting the CIFSMaxBufSize module parm to anything?
>>
>> 3) would it be possible to get debugging output? Instructions on how to
>> do that are here:
>>
>> http://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Enabling_Debugging
>>
>> Thanks,
>
> Nevermind... I was able to reproduce it, and the following patch seems
> to fix it for me. Jan, can you test this as well? If so, I'll
> "officially" send it to Steve and we'll get this in ASAP.
>
> Long term, it would be better clean up the way CIFSMaxBufSize is
> handled to get rid of this source of confusion...
>
> -----------------[snip]-------------------
>
> cifs: fix MaxBufferSize advertisement in session setup
>
> Commit c4d3396b2 failed to account for the fact that CIFSMaxBufSize does
> not include the size of the header. This means that we're advertising a
> size that's 0x58 bytes smaller than the client can handle.
>
> It's also problematic in the case of a large FIND_FIRST request. That
> request will specify its own MaxDataCount value, which can turn out to
> be larger than the originally provided MaxBufferSize. This causes the
> server to break up the response into multiple pieces when it shouldn't
> and throws off the checks in check2ndT2.
>
> Fix this by adding the amount of the header to the maxBufSize that we
> advertise to the server.
>
> Reported-by: Jan Seiffert <kaffeemonster@googlemail.com>
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/cifs/sess.c |    5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
> index 243d587..c7d80e2 100644
> --- a/fs/cifs/sess.c
> +++ b/fs/cifs/sess.c
> @@ -124,8 +124,9 @@ static __u32 cifs_ssetup_hdr(struct cifs_ses *ses, SESSION_SETUP_ANDX *pSMB)
>        /*      that we use in next few lines                               */
>        /* Note that header is initialized to zero in header_assemble */
>        pSMB->req.AndXCommand = 0xFF;
> -       pSMB->req.MaxBufferSize = cpu_to_le16(min_t(u32, CIFSMaxBufSize - 4,
> -                                               USHRT_MAX));
> +       pSMB->req.MaxBufferSize = cpu_to_le16(min_t(u32,
> +                                       CIFSMaxBufSize + MAX_CIFS_HDR_SIZE - 4,
> +                                       USHRT_MAX));
>        pSMB->req.MaxMpxCount = cpu_to_le16(ses->server->maxReq);
>        pSMB->req.VcNumber = get_next_vcnum(ses);
>
> --
> 1.7.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Jan Seiffert Aug. 2, 2011, 9:40 p.m. UTC | #2
2011/8/2 Jeff Layton <jlayton@redhat.com>:
> On Tue, 2 Aug 2011 06:44:55 -0400
> Jeff Layton <jlayton@redhat.com> wrote:
>
>> On Tue, 2 Aug 2011 02:30:35 +0200
>> Jan Seiffert <kaffeemonster@googlemail.com> wrote:
>>
[snip - bug report]
>>
>> Thanks for the bug report.

Hi Jeff,
thanks for your fast feedback, and sorry it took me so long to get back to you.

>> According to the spec, I think I was a
>> *little* off in the calculation but not by much. CIFSMaxBufSize doesn't
>> include the size of the header, so the value we're sending is too small
>> by 0x58 bytes. But, if anything though that should have led to the
>> server sending smaller frames than we can handle, which should not
>> cause this sort of problem.
>>
>> I tried to reproduce this on my test setup, but couldn't...
>>
>> Some questions...
>>
>> 1) did anything pop up in dmesg when this error occurred?
>>

No, nothing in dmesg.
Looks like it is detected as malformed packet and silently dropped.
(at dmesg default log level and default debug options)

>> 2) are you setting the CIFSMaxBufSize module parm to anything?
>>

No parameters to the CIFS module (i didn't even know till now there
are options...)

>> 3) would it be possible to get debugging output? Instructions on how to
>> do that are here:
>>
>> http://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Enabling_Debugging
>>

I will skip this ...

>> Thanks,
>
> Nevermind... I was able to reproduce it, and the following patch seems
> to fix it for me. Jan, can you test this as well?

.. to test your patch, and it looks good!

$ ls -l /usr/portage/ | wc -l
170
$ ls -l /usr/portage/distfiles/ | wc -l
47470

So here it is:
Tested-by: Jan Seiffert <kaffeemonster@googlemail.com>

> If so, I'll "officially" send it to Steve and we'll get this in ASAP.
>

I hope so ;)

> Long term, it would be better clean up the way CIFSMaxBufSize is
> handled to get rid of this source of confusion...
>

Sounds good, but with a lot of different servers (orig. Windows
versions and different Samba versions), i hope there is no maze of
endless special cases because all those versions did something subtle
different with the MaxBufSize (eg. simple +/-1 Bugs).

Greetings
Jan

[snip - patch]
Jeff Layton Aug. 2, 2011, 9:43 p.m. UTC | #3
On Tue, 2 Aug 2011 11:03:28 -0500
Steve French <smfrench@gmail.com> wrote:

> your patch does look right.
> 

Sigh...as is often the case, things are not quite so simple...

I did some auditing of how server->maxBuf is used in the cifs code and
found it to be very confused. The reason it basically works, I think is
because on NEGOTIATE, the client does this:

       server->maxBuf = min(le32_to_cpu(pSMBr->MaxBufferSize),
                       (__u32) CIFSMaxBufSize + MAX_CIFS_HDR_SIZE);

So using maxBuf and CIFSMaxBufSize interchangeably like the code does
is basically OK, even if it is confusing to limit how much the client
can receive based on how much the server can.

I think that we really need to approach this more comprehensively and
have a clear delineation between server->maxBuf and CIFSMaxBufSize. I
don't think it would be wise though to put that into 3.1 at this point.

I think it would probably be best to just back out commit c4d3396b2 for
now, and I'll plan to do this as a larger (and hopefully better-tested)
patchset for 6.2.

Sound ok? Do you need me to send a revert patch?
Steve French Aug. 2, 2011, 9:49 p.m. UTC | #4
your explanation makes sense.  will revert the commit unless anyone objects.

On Tue, Aug 2, 2011 at 4:43 PM, Jeff Layton <jlayton@redhat.com> wrote:
> On Tue, 2 Aug 2011 11:03:28 -0500
> Steve French <smfrench@gmail.com> wrote:
>
>> your patch does look right.
>>
>
> Sigh...as is often the case, things are not quite so simple...
>
> I did some auditing of how server->maxBuf is used in the cifs code and
> found it to be very confused. The reason it basically works, I think is
> because on NEGOTIATE, the client does this:
>
>       server->maxBuf = min(le32_to_cpu(pSMBr->MaxBufferSize),
>                       (__u32) CIFSMaxBufSize + MAX_CIFS_HDR_SIZE);
>
> So using maxBuf and CIFSMaxBufSize interchangeably like the code does
> is basically OK, even if it is confusing to limit how much the client
> can receive based on how much the server can.
>
> I think that we really need to approach this more comprehensively and
> have a clear delineation between server->maxBuf and CIFSMaxBufSize. I
> don't think it would be wise though to put that into 3.1 at this point.
>
> I think it would probably be best to just back out commit c4d3396b2 for
> now, and I'll plan to do this as a larger (and hopefully better-tested)
> patchset for 6.2.
>
> Sound ok? Do you need me to send a revert patch?
>
> --
> Jeff Layton <jlayton@redhat.com>
>
diff mbox

Patch

diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
index 243d587..c7d80e2 100644
--- a/fs/cifs/sess.c
+++ b/fs/cifs/sess.c
@@ -124,8 +124,9 @@  static __u32 cifs_ssetup_hdr(struct cifs_ses *ses, SESSION_SETUP_ANDX *pSMB)
 	/*	that we use in next few lines                               */
 	/* Note that header is initialized to zero in header_assemble */
 	pSMB->req.AndXCommand = 0xFF;
-	pSMB->req.MaxBufferSize = cpu_to_le16(min_t(u32, CIFSMaxBufSize - 4,
-						USHRT_MAX));
+	pSMB->req.MaxBufferSize = cpu_to_le16(min_t(u32,
+					CIFSMaxBufSize + MAX_CIFS_HDR_SIZE - 4,
+					USHRT_MAX));
 	pSMB->req.MaxMpxCount = cpu_to_le16(ses->server->maxReq);
 	pSMB->req.VcNumber = get_next_vcnum(ses);