Message ID | 1361831310-24260-1-git-send-email-chiluk@canonical.com (mailing list archive) |
---|---|
State | New, archived |
Hi Dave,

> When messages are currently in queue awaiting a response, decrease amount of
> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current
> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120
> seconds) since the last response was recieved. This does not take into account
> the fact that messages waiting for a response should be serviced within a
> reasonable round trip time.

Wouldn't that mean that the client will disconnect a good connection if the server doesn't respond within 10 seconds? Reads and Writes can take longer than 10 seconds...

> This fixes the issue where user moves from wired to wireless or vice versa
> causing the mount to hang for 120 seconds, when it could reconnect considerably
> faster. After this fix it will take SMB_MAX_RTT (10 seconds) from the last
> time the user attempted to access the volume or SMB_MAX_RTT after the last
> echo. The worst case of the latter scenario being
> 2*SMB_ECHO_INTERVAL+SMB_MAX_RTT+small scheduling delay (about 130 seconds).
> Statistically speaking it would normally reconnect sooner. However in the best
> case where the user changes nics, and immediately tries to access the cifs
> share it will take SMB_MAX_RTT=10 seconds.

I think it would be better to detect the broken connection by using an AF_NETLINK socket listening for RTM_DELADDR messages?

metze
On Wed, 27 Feb 2013 12:06:14 +0100 "Stefan (metze) Metzmacher" <metze@samba.org> wrote: > Hi Dave, > > > When messages are currently in queue awaiting a response, decrease amount of > > time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current > > wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 > > seconds) since the last response was recieved. This does not take into account > > the fact that messages waiting for a response should be serviced within a > > reasonable round trip time. > > Wouldn't that mean that the client will disconnect a good connection, > if the server doesn't response within 10 seconds? > Reads and Writes can take longer than 10 seconds... > Where does this magic value of 10s come from? Note that a slow server can take *minutes* to respond to writes that are long past the EOF. > > This fixes the issue where user moves from wired to wireless or vice versa > > causing the mount to hang for 120 seconds, when it could reconnect considerably > > faster. After this fix it will take SMB_MAX_RTT (10 seconds) from the last > > time the user attempted to access the volume or SMB_MAX_RTT after the last > > echo. The worst case of the latter scenario being > > 2*SMB_ECHO_INTERVAL+SMB_MAX_RTT+small scheduling delay (about 130 seconds). > > Statistically speaking it would normally reconnect sooner. However in the best > > case where the user changes nics, and immediately tries to access the cifs > > share it will take SMB_MAX_RTT=10 seconds. > > I think it would be better to detect the broken connection > by using an AF_NETLINK socket listening for RTM_DELADDR > messages? > > metze > Ick -- that sounds horrid ;) Dave, this problem sounds very similar to the one that your colleague Chris J Arges was trying to solve several months ago. You may want to go back and review that thread. Perhaps you can solve both problems at the same time here...
On 02/27/2013 10:34 AM, Jeff Layton wrote:
> On Wed, 27 Feb 2013 12:06:14 +0100 "Stefan (metze) Metzmacher" <metze@samba.org> wrote:
>
>> Hi Dave,
>>
>>> When messages are currently in queue awaiting a response, decrease amount of
>>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current
>>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120
>>> seconds) since the last response was recieved. This does not take into account
>>> the fact that messages waiting for a response should be serviced within a
>>> reasonable round trip time.
>>
>> Wouldn't that mean that the client will disconnect a good connection,
>> if the server doesn't response within 10 seconds?
>> Reads and Writes can take longer than 10 seconds...
>>
>
> Where does this magic value of 10s come from? Note that a slow server
> can take *minutes* to respond to writes that are long past the EOF.

It comes from the desire to decrease the reconnection delay to something better than a random number between 60 and 120 seconds. I am not committed to this number, and it is open for discussion. Additionally, if you look closely at the logic, it is not a 10-second limit per request; rather, once requests have been in flight for more than 10 seconds, we make sure we have heard from the server within the last 10 seconds.

Can you explain more fully your use case of writes that are long past the EOF? Perhaps with a test case or script that I can test? As far as I know, writes long past EOF will just result in a sparse file and return in a reasonable round trip time (that's at least what I'm seeing in my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 seek=100000 starts receiving responses from the server in about .05 seconds, with subsequent responses following at roughly .002-.01 second intervals. This is well within my 10 second value. Even adding the latency of AT&T's 2G cell network brings it up to only about 1 second, still 10x less than my 10 second value.

The new logic goes like this:

    if we've been expecting a response from the server (in_flight), and
       the message has been in flight for more than 10 seconds, and
       we haven't had any other contact from the server in that time:
           reconnect

On a side note, I discovered a small race condition in the previous logic while working on this, which my new patch also fixes:

    1s       request
    2s       response
    61.995   echo job pops
    121.995  echo job pops and sends echo
    122      server_unresponsive called. Finds no response and attempts to reconnect
    122.95   response to echo received

>>> This fixes the issue where user moves from wired to wireless or vice versa
>>> causing the mount to hang for 120 seconds, when it could reconnect considerably
>>> faster. After this fix it will take SMB_MAX_RTT (10 seconds) from the last
>>> time the user attempted to access the volume or SMB_MAX_RTT after the last
>>> echo. The worst case of the latter scenario being
>>> 2*SMB_ECHO_INTERVAL+SMB_MAX_RTT+small scheduling delay (about 130 seconds).
>>> Statistically speaking it would normally reconnect sooner. However in the best
>>> case where the user changes nics, and immediately tries to access the cifs
>>> share it will take SMB_MAX_RTT=10 seconds.
>>
>> I think it would be better to detect the broken connection
>> by using an AF_NETLINK socket listening for RTM_DELADDR
>> messages?
>>
>> metze
>>
>
> Ick -- that sounds horrid ;)
>
> Dave, this problem sounds very similar to the one that your colleague
> Chris J Arges was trying to solve several months ago. You may want to
> go back and review that thread.
> Perhaps you can solve both problems at the same time here...

This is the same problem as was discussed here: https://patchwork.kernel.org/patch/1717841/

From that thread you made the suggestion of "What would really be better is fixing the code to only echo when there are outstanding calls to the server." I thought about that, and liked keeping the echo functionality as a heartbeat when nothing else is going on. If we only echo when there are outstanding calls, then the client will not attempt to reconnect until the user attempts to use the mount. I'd rather it reconnect when nothing is happening.

As for the rest of the suggestion from that thread, we aren't trying to solve a suspend/resume use case, but actually a dock/undock use case: basically, reconnecting quickly when going from wired to wireless or vice versa.

Dave.
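For readers following along, here is a minimal sketch of the check Dave describes, assuming the existing TCP_Server_Info fields lstrp (jiffies of the last response received) and in_flight (number of requests awaiting a response), plus the SMB_MAX_RTT constant the patch description introduces. It is illustration only, not the posted patch, and it simplifies one point: a faithful version would also track when the oldest outstanding request was sent rather than relying on lstrp alone.

    /* Sketch only; assumes fields from fs/cifs/cifsglob.h of that era. */
    #include "cifsglob.h"
    #include "cifsproto.h"

    #define SMB_MAX_RTT (10 * HZ)	/* proposed: 10 seconds */

    static bool
    server_unresponsive(struct TCP_Server_Info *server)
    {
    	if (server->tcpStatus != CifsGood)
    		return false;

    	/*
    	 * Only give up early when a request is actually outstanding and
    	 * nothing at all (data or echo reply) has arrived from the
    	 * server within SMB_MAX_RTT.
    	 */
    	if (server->in_flight > 0 &&
    	    time_after(jiffies, server->lstrp + SMB_MAX_RTT)) {
    		cERROR(1, "Server %s has not responded in %d seconds. "
    			  "Reconnecting...", server->hostname,
    			  SMB_MAX_RTT / HZ);
    		cifs_reconnect(server);
    		wake_up(&server->response_q);
    		return true;
    	}

    	return false;
    }

Called from the demultiplex thread in place of today's 2*SMB_ECHO_INTERVAL check, this would reconnect as soon as a request has gone unanswered for SMB_MAX_RTT with no other traffic from the server, which is exactly the behaviour the reviewers below are pushing back on.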
On Wed, Feb 27, 2013 at 4:24 PM, Dave Chiluk <dave.chiluk@canonical.com> wrote: > On 02/27/2013 10:34 AM, Jeff Layton wrote: >> On Wed, 27 Feb 2013 12:06:14 +0100 >> "Stefan (metze) Metzmacher" <metze@samba.org> wrote: >> >>> Hi Dave, >>> >>>> When messages are currently in queue awaiting a response, decrease amount of >>>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current >>>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 >>>> seconds) since the last response was recieved. This does not take into account >>>> the fact that messages waiting for a response should be serviced within a >>>> reasonable round trip time. >>> >>> Wouldn't that mean that the client will disconnect a good connection, >>> if the server doesn't response within 10 seconds? >>> Reads and Writes can take longer than 10 seconds... >>> >> >> Where does this magic value of 10s come from? Note that a slow server >> can take *minutes* to respond to writes that are long past the EOF. > It comes from the desire to decrease the reconnection delay to something > better than a random number between 60 and 120 seconds. I am not > committed to this number, and it is open for discussion. Additionally > if you look closely at the logic it's not 10 seconds per request, but > actually when requests have been in flight for more than 10 seconds make > sure we've heard from the server in the last 10 seconds. > > Can you explain more fully your use case of writes that are long past > the EOF? Perhaps with a test-case or script that I can test? As far as > I know writes long past EOF will just result in a sparse file, and > return in a reasonable round trip time *(that's at least what I'm seeing > with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 > seek=100000, starts receiving responses from the server in about .05 > seconds with subsequent responses following at roughly .002-.01 second > intervals. This is well within my 10 second value. Note that not all Linux file systems support sparse files and certainly there are cifs servers running on operating systems other than Linux which have popular file systems which don't support sparse files (e.g. FAT32 but there are many others) - in any case, writes after end of file can take a LONG time if sparse files are not supported and I don't know a good way for the client to know that attribute of the server file system ahead of time (although we could attempt to set the sparse flag, servers can and do lie)
On 02/27/2013 04:40 PM, Steve French wrote: > On Wed, Feb 27, 2013 at 4:24 PM, Dave Chiluk <dave.chiluk@canonical.com> wrote: >> On 02/27/2013 10:34 AM, Jeff Layton wrote: >>> On Wed, 27 Feb 2013 12:06:14 +0100 >>> "Stefan (metze) Metzmacher" <metze@samba.org> wrote: >>> >>>> Hi Dave, >>>> >>>>> When messages are currently in queue awaiting a response, decrease amount of >>>>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current >>>>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 >>>>> seconds) since the last response was recieved. This does not take into account >>>>> the fact that messages waiting for a response should be serviced within a >>>>> reasonable round trip time. >>>> >>>> Wouldn't that mean that the client will disconnect a good connection, >>>> if the server doesn't response within 10 seconds? >>>> Reads and Writes can take longer than 10 seconds... >>>> >>> >>> Where does this magic value of 10s come from? Note that a slow server >>> can take *minutes* to respond to writes that are long past the EOF. >> It comes from the desire to decrease the reconnection delay to something >> better than a random number between 60 and 120 seconds. I am not >> committed to this number, and it is open for discussion. Additionally >> if you look closely at the logic it's not 10 seconds per request, but >> actually when requests have been in flight for more than 10 seconds make >> sure we've heard from the server in the last 10 seconds. >> >> Can you explain more fully your use case of writes that are long past >> the EOF? Perhaps with a test-case or script that I can test? As far as >> I know writes long past EOF will just result in a sparse file, and >> return in a reasonable round trip time *(that's at least what I'm seeing >> with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 >> seek=100000, starts receiving responses from the server in about .05 >> seconds with subsequent responses following at roughly .002-.01 second >> intervals. This is well within my 10 second value. > > Note that not all Linux file systems support sparse files and > certainly there are cifs servers running on operating systems other > than Linux which have popular file systems which don't support sparse > files (e.g. FAT32 but there are many others) - in any case, writes > after end of file can take a LONG time if sparse files are not > supported and I don't know a good way for the client to know that > attribute of the server file system ahead of time (although we could > attempt to set the sparse flag, servers can and do lie) > It doesn't matter how long it takes for the entire operation to complete, just so long as the server acks something in less than 10 seconds. Now the question becomes, is there an OS out there that doesn't ack the request or doesn't ack the progress regularly. -- To unsubscribe from this list: send the line "unsubscribe linux-cifs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 27.02.2013 17:34, schrieb Jeff Layton: > On Wed, 27 Feb 2013 12:06:14 +0100 > "Stefan (metze) Metzmacher" <metze@samba.org> wrote: > >> Hi Dave, >> >>> When messages are currently in queue awaiting a response, decrease amount of >>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current >>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 >>> seconds) since the last response was recieved. This does not take into account >>> the fact that messages waiting for a response should be serviced within a >>> reasonable round trip time. >> >> Wouldn't that mean that the client will disconnect a good connection, >> if the server doesn't response within 10 seconds? >> Reads and Writes can take longer than 10 seconds... >> > > Where does this magic value of 10s come from? Note that a slow server > can take *minutes* to respond to writes that are long past the EOF. > >>> This fixes the issue where user moves from wired to wireless or vice versa >>> causing the mount to hang for 120 seconds, when it could reconnect considerably >>> faster. After this fix it will take SMB_MAX_RTT (10 seconds) from the last >>> time the user attempted to access the volume or SMB_MAX_RTT after the last >>> echo. The worst case of the latter scenario being >>> 2*SMB_ECHO_INTERVAL+SMB_MAX_RTT+small scheduling delay (about 130 seconds). >>> Statistically speaking it would normally reconnect sooner. However in the best >>> case where the user changes nics, and immediately tries to access the cifs >>> share it will take SMB_MAX_RTT=10 seconds. >> >> I think it would be better to detect the broken connection >> by using an AF_NETLINK socket listening for RTM_DELADDR >> messages? >> >> metze >> > > Ick -- that sounds horrid ;) This is what winbindd uses to detect that a source ip of outgoing connections are gone. I don't know much of the kernel, there might be a better way from within the kernel to detect this. But this is exactly the correct thing to do to failover to another interface, as it just happens when the ip is removed without messing with a timeout value. Another optimization would be to use tcp keepalives (I think there 10 seconds would be ok), I think that's what Windows SMB3 clients are using. metze
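For reference, a minimal userspace sketch of the RTM_DELADDR approach metze describes follows; this is roughly how a winbindd-style daemon learns that a local address has gone away. An in-kernel consumer such as cifs.ko would need a different hook (for example an inetaddr notifier), so this only illustrates the event being proposed as the trigger.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <linux/netlink.h>
    #include <linux/rtnetlink.h>

    int main(void)
    {
    	char buf[8192];
    	struct sockaddr_nl sa;
    	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    	if (fd < 0)
    		return 1;

    	memset(&sa, 0, sizeof(sa));
    	sa.nl_family = AF_NETLINK;
    	sa.nl_groups = RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR;
    	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
    		return 1;

    	for (;;) {
    		int len = recv(fd, buf, sizeof(buf), 0);
    		struct nlmsghdr *nh;

    		if (len <= 0)
    			break;
    		for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
    		     nh = NLMSG_NEXT(nh, len)) {
    			if (nh->nlmsg_type == RTM_DELADDR) {
    				struct ifaddrmsg *ifa = NLMSG_DATA(nh);
    				/* A local address disappeared; connections
    				 * bound to it are as good as dead. */
    				printf("address removed on ifindex %d\n",
    				       (int)ifa->ifa_index);
    			}
    		}
    	}
    	close(fd);
    	return 0;
    }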
Am 27.02.2013 23:44, schrieb Dave Chiluk: > On 02/27/2013 04:40 PM, Steve French wrote: >> On Wed, Feb 27, 2013 at 4:24 PM, Dave Chiluk <dave.chiluk@canonical.com> wrote: >>> On 02/27/2013 10:34 AM, Jeff Layton wrote: >>>> On Wed, 27 Feb 2013 12:06:14 +0100 >>>> "Stefan (metze) Metzmacher" <metze@samba.org> wrote: >>>> >>>>> Hi Dave, >>>>> >>>>>> When messages are currently in queue awaiting a response, decrease amount of >>>>>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current >>>>>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 >>>>>> seconds) since the last response was recieved. This does not take into account >>>>>> the fact that messages waiting for a response should be serviced within a >>>>>> reasonable round trip time. >>>>> >>>>> Wouldn't that mean that the client will disconnect a good connection, >>>>> if the server doesn't response within 10 seconds? >>>>> Reads and Writes can take longer than 10 seconds... >>>>> >>>> >>>> Where does this magic value of 10s come from? Note that a slow server >>>> can take *minutes* to respond to writes that are long past the EOF. >>> It comes from the desire to decrease the reconnection delay to something >>> better than a random number between 60 and 120 seconds. I am not >>> committed to this number, and it is open for discussion. Additionally >>> if you look closely at the logic it's not 10 seconds per request, but >>> actually when requests have been in flight for more than 10 seconds make >>> sure we've heard from the server in the last 10 seconds. >>> >>> Can you explain more fully your use case of writes that are long past >>> the EOF? Perhaps with a test-case or script that I can test? As far as >>> I know writes long past EOF will just result in a sparse file, and >>> return in a reasonable round trip time *(that's at least what I'm seeing >>> with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 >>> seek=100000, starts receiving responses from the server in about .05 >>> seconds with subsequent responses following at roughly .002-.01 second >>> intervals. This is well within my 10 second value. >> >> Note that not all Linux file systems support sparse files and >> certainly there are cifs servers running on operating systems other >> than Linux which have popular file systems which don't support sparse >> files (e.g. FAT32 but there are many others) - in any case, writes >> after end of file can take a LONG time if sparse files are not >> supported and I don't know a good way for the client to know that >> attribute of the server file system ahead of time (although we could >> attempt to set the sparse flag, servers can and do lie) >> > > It doesn't matter how long it takes for the entire operation to > complete, just so long as the server acks something in less than 10 > seconds. Now the question becomes, is there an OS out there that > doesn't ack the request or doesn't ack the progress regularly. This kind of ack can only be at the tcp layer not at the smb layer. metze
On Wed, 2013-02-27 at 16:44 -0600, Dave Chiluk wrote: > On 02/27/2013 04:40 PM, Steve French wrote: > > On Wed, Feb 27, 2013 at 4:24 PM, Dave Chiluk <dave.chiluk@canonical.com> wrote: > >> On 02/27/2013 10:34 AM, Jeff Layton wrote: > >>> On Wed, 27 Feb 2013 12:06:14 +0100 > >>> "Stefan (metze) Metzmacher" <metze@samba.org> wrote: > >>> > >>>> Hi Dave, > >>>> > >>>>> When messages are currently in queue awaiting a response, decrease amount of > >>>>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current > >>>>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 > >>>>> seconds) since the last response was recieved. This does not take into account > >>>>> the fact that messages waiting for a response should be serviced within a > >>>>> reasonable round trip time. > >>>> > >>>> Wouldn't that mean that the client will disconnect a good connection, > >>>> if the server doesn't response within 10 seconds? > >>>> Reads and Writes can take longer than 10 seconds... > >>>> > >>> > >>> Where does this magic value of 10s come from? Note that a slow server > >>> can take *minutes* to respond to writes that are long past the EOF. > >> It comes from the desire to decrease the reconnection delay to something > >> better than a random number between 60 and 120 seconds. I am not > >> committed to this number, and it is open for discussion. Additionally > >> if you look closely at the logic it's not 10 seconds per request, but > >> actually when requests have been in flight for more than 10 seconds make > >> sure we've heard from the server in the last 10 seconds. > >> > >> Can you explain more fully your use case of writes that are long past > >> the EOF? Perhaps with a test-case or script that I can test? As far as > >> I know writes long past EOF will just result in a sparse file, and > >> return in a reasonable round trip time *(that's at least what I'm seeing > >> with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 > >> seek=100000, starts receiving responses from the server in about .05 > >> seconds with subsequent responses following at roughly .002-.01 second > >> intervals. This is well within my 10 second value. > > > > Note that not all Linux file systems support sparse files and > > certainly there are cifs servers running on operating systems other > > than Linux which have popular file systems which don't support sparse > > files (e.g. FAT32 but there are many others) - in any case, writes > > after end of file can take a LONG time if sparse files are not > > supported and I don't know a good way for the client to know that > > attribute of the server file system ahead of time (although we could > > attempt to set the sparse flag, servers can and do lie) > > > > It doesn't matter how long it takes for the entire operation to > complete, just so long as the server acks something in less than 10 > seconds. Now the question becomes, is there an OS out there that > doesn't ack the request or doesn't ack the progress regularly. IIRC older samba servers were fully synchronous and wouldn't reply to anything while processing an operation. I am sure you can still find old code bases in older (and slow) appliances out there. Simo.
> -----Original Message----- > From: linux-cifs-owner@vger.kernel.org [mailto:linux-cifs- > owner@vger.kernel.org] On Behalf Of Dave Chiluk > Sent: Wednesday, February 27, 2013 5:44 PM > To: Steve French > Cc: Jeff Layton; Stefan (metze) Metzmacher; Dave Chiluk; Steve French; > linux-cifs@vger.kernel.org; samba-technical@lists.samba.org; linux- > kernel@vger.kernel.org > Subject: Re: [PATCH] CIFS: Decrease reconnection delay when switching nics > > On 02/27/2013 04:40 PM, Steve French wrote: > > On Wed, Feb 27, 2013 at 4:24 PM, Dave Chiluk > <dave.chiluk@canonical.com> wrote: > >> On 02/27/2013 10:34 AM, Jeff Layton wrote: > >>> On Wed, 27 Feb 2013 12:06:14 +0100 > >>> "Stefan (metze) Metzmacher" <metze@samba.org> wrote: > >>> > >>>> Hi Dave, > >>>> > >>>>> When messages are currently in queue awaiting a response, decrease > >>>>> amount of time before attempting cifs_reconnect to SMB_MAX_RTT > = > >>>>> 10 seconds. The current wait time before attempting to reconnect > >>>>> is currently 2*SMB_ECHO_INTERVAL(120 > >>>>> seconds) since the last response was recieved. This does not take > >>>>> into account the fact that messages waiting for a response should > >>>>> be serviced within a reasonable round trip time. > >>>> > >>>> Wouldn't that mean that the client will disconnect a good > >>>> connection, if the server doesn't response within 10 seconds? > >>>> Reads and Writes can take longer than 10 seconds... > >>>> > >>> > >>> Where does this magic value of 10s come from? Note that a slow > >>> server can take *minutes* to respond to writes that are long past the > EOF. > >> It comes from the desire to decrease the reconnection delay to > >> something better than a random number between 60 and 120 seconds. I > >> am not committed to this number, and it is open for discussion. > >> Additionally if you look closely at the logic it's not 10 seconds per > >> request, but actually when requests have been in flight for more than > >> 10 seconds make sure we've heard from the server in the last 10 seconds. > >> > >> Can you explain more fully your use case of writes that are long past > >> the EOF? Perhaps with a test-case or script that I can test? As far > >> as I know writes long past EOF will just result in a sparse file, and > >> return in a reasonable round trip time *(that's at least what I'm > >> seeing with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M > >> count=100 seek=100000, starts receiving responses from the server in > >> about .05 seconds with subsequent responses following at roughly > >> .002-.01 second intervals. This is well within my 10 second value. > > > > Note that not all Linux file systems support sparse files and > > certainly there are cifs servers running on operating systems other > > than Linux which have popular file systems which don't support sparse > > files (e.g. FAT32 but there are many others) - in any case, writes > > after end of file can take a LONG time if sparse files are not > > supported and I don't know a good way for the client to know that > > attribute of the server file system ahead of time (although we could > > attempt to set the sparse flag, servers can and do lie) > > > > It doesn't matter how long it takes for the entire operation to complete, just > so long as the server acks something in less than 10 seconds. Now the > question becomes, is there an OS out there that doesn't ack the request or > doesn't ack the progress regularly. 
SMB/CIFS servers will signal the operation "going async" by returning a STATUS_PENDING response if the operation is not prompt, but this only happens once. The client is still expected to run a timer, and recover from possibly lost responses and/or unresponsive servers. Windows clients extend their timeout when this occurs, typically quadrupling it. Some clients will issue ECHO requests to probe the server in this case, but it is neither a protocol requirement nor does it truly address the issue of tracking each pending operation. Windows SMB2 clients do not do this.
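To make the behaviour Tom describes concrete, here is a small hedged sketch: when the interim STATUS_PENDING arrives for an outstanding operation, extend that operation's timer (quadrupling mirrors what Windows clients reportedly do) rather than declaring the call lost. The struct and field names here are hypothetical, purely for illustration; this is not how cifs.ko tracks mids today.

    #include <linux/types.h>
    #include <linux/jiffies.h>

    /* Hypothetical per-request bookkeeping, for illustration only. */
    struct pending_op {
    	__u64 mid;              /* multiplex id of the outstanding request */
    	unsigned long deadline; /* jiffies after which we give up on it */
    };

    /*
    * Called when an interim STATUS_PENDING response is matched to a pending
    * operation: the server is alive but slow, so extend the deadline
    * instead of timing the call out.
    */
    static void extend_on_interim_response(struct pending_op *op,
    				       unsigned long base_timeout)
    {
    	op->deadline = jiffies + 4 * base_timeout;
    }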
> -----Original Message----- > From: samba-technical-bounces@lists.samba.org [mailto:samba-technical- > bounces@lists.samba.org] On Behalf Of Stefan (metze) Metzmacher > Sent: Wednesday, February 27, 2013 7:16 PM > To: Jeff Layton > Cc: Steve French; Dave Chiluk; samba-technical@lists.samba.org; linux- > kernel@vger.kernel.org; linux-cifs@vger.kernel.org > Subject: Re: [PATCH] CIFS: Decrease reconnection delay when switching nics > > Am 27.02.2013 17:34, schrieb Jeff Layton: > > On Wed, 27 Feb 2013 12:06:14 +0100 > > "Stefan (metze) Metzmacher" <metze@samba.org> wrote: > > > >> Hi Dave, > >> > >>> When messages are currently in queue awaiting a response, decrease > >>> amount of time before attempting cifs_reconnect to SMB_MAX_RTT = > 10 > >>> seconds. The current wait time before attempting to reconnect is > >>> currently 2*SMB_ECHO_INTERVAL(120 > >>> seconds) since the last response was recieved. This does not take > >>> into account the fact that messages waiting for a response should be > >>> serviced within a reasonable round trip time. > >> > >> Wouldn't that mean that the client will disconnect a good connection, > >> if the server doesn't response within 10 seconds? > >> Reads and Writes can take longer than 10 seconds... > >> > > > > Where does this magic value of 10s come from? Note that a slow server > > can take *minutes* to respond to writes that are long past the EOF. > > > >>> This fixes the issue where user moves from wired to wireless or vice > >>> versa causing the mount to hang for 120 seconds, when it could > >>> reconnect considerably faster. After this fix it will take > >>> SMB_MAX_RTT (10 seconds) from the last time the user attempted to > >>> access the volume or SMB_MAX_RTT after the last echo. The worst > >>> case of the latter scenario being > 2*SMB_ECHO_INTERVAL+SMB_MAX_RTT+small scheduling delay (about 130 > seconds). > >>> Statistically speaking it would normally reconnect sooner. However > >>> in the best case where the user changes nics, and immediately tries > >>> to access the cifs share it will take SMB_MAX_RTT=10 seconds. > >> > >> I think it would be better to detect the broken connection by using > >> an AF_NETLINK socket listening for RTM_DELADDR messages? > >> > >> metze > >> > > > > Ick -- that sounds horrid ;) > > This is what winbindd uses to detect that a source ip of outgoing connections > are gone. I don't know much of the kernel, there might be a better way from > within the kernel to detect this. But this is exactly the correct thing to do to > failover to another interface, as it just happens when the ip is removed > without messing with a timeout value. > > Another optimization would be to use tcp keepalives (I think there 10 > seconds would be ok), I think that's what Windows SMB3 clients are using. Yes, they do. See MS-SMB2 behavior note 144 attached to section 3.2.5.14.9. 10 seconds seems a fairly rapid keepalive interval. The TCP stack probably won't allow it to be less than the maximum retransmit, for instance. Tom. -- To unsubscribe from this list: send the line "unsubscribe linux-cifs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
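For what it's worth, the keepalive knobs being discussed look roughly like this on a Linux socket. The values are illustrative rather than taken from the Windows SMB3 implementation, and an in-kernel user would set the equivalent options on its own transport socket rather than calling setsockopt from userspace.

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>

    static int enable_fast_keepalive(int fd)
    {
    	int on = 1;
    	int idle = 10;   /* seconds of idle before probing starts */
    	int intvl = 10;  /* seconds between probes */
    	int cnt = 3;     /* failed probes before the connection is declared dead */

    	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
    		return -1;
    	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
    		return -1;
    	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) < 0)
    		return -1;
    	return setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt));
    }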
On Wed, 27 Feb 2013 16:24:07 -0600 Dave Chiluk <dave.chiluk@canonical.com> wrote: > On 02/27/2013 10:34 AM, Jeff Layton wrote: > > On Wed, 27 Feb 2013 12:06:14 +0100 > > "Stefan (metze) Metzmacher" <metze@samba.org> wrote: > > > >> Hi Dave, > >> > >>> When messages are currently in queue awaiting a response, decrease amount of > >>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current > >>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 > >>> seconds) since the last response was recieved. This does not take into account > >>> the fact that messages waiting for a response should be serviced within a > >>> reasonable round trip time. > >> > >> Wouldn't that mean that the client will disconnect a good connection, > >> if the server doesn't response within 10 seconds? > >> Reads and Writes can take longer than 10 seconds... > >> > > > > Where does this magic value of 10s come from? Note that a slow server > > can take *minutes* to respond to writes that are long past the EOF. > It comes from the desire to decrease the reconnection delay to something > better than a random number between 60 and 120 seconds. I am not > committed to this number, and it is open for discussion. Additionally > if you look closely at the logic it's not 10 seconds per request, but > actually when requests have been in flight for more than 10 seconds make > sure we've heard from the server in the last 10 seconds. > > Can you explain more fully your use case of writes that are long past > the EOF? Perhaps with a test-case or script that I can test? As far as > I know writes long past EOF will just result in a sparse file, and > return in a reasonable round trip time *(that's at least what I'm seeing > with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 > seek=100000, starts receiving responses from the server in about .05 > seconds with subsequent responses following at roughly .002-.01 second > intervals. This is well within my 10 second value. Even adding the > latency of AT&T's 2g cell network brings it up to only 1s. Still 10x > less than my 10 second value. > > The new logic goes like this > if( we've been expecting a response from the server (in_flight), and > message has been in_flight for more than 10 seconds and > we haven't had any other contact from the server in that time > reconnect > That will break writes long past the EOF. Note too that reconnects on CIFS are horrifically expensive and problematic. Much of the state on a CIFS mount is tied to the connection. When that drops, open files are closed and things like locks are dropped. SMB1 has no real mechanism for state recovery, so that can really be a problem. > On a side note, I discovered a small race condition in the previous > logic while working on this, that my new patch also fixes. > 1s request > 2s response > 61.995 echo job pops > 121.995 echo job pops and sends echo > 122 server_unresponsive called. Finds no response and attempts to > reconnect > 122.95 response to echo received > Sure, here's a reproducer. Do this against a windows server, preferably one exporting NTFS on relatively slow storage. Make sure that "testfile" doesn't exist first: $ dd if=/dev/zero of=/path/to/cifs/share/testfile bs=1M count=1 seek=3192 NTFS doesn't support sparse files, so the OS has to zero-fill up to the point where you're writing. That can take a looooong time on slow storage (minutes even). 
What we do now is periodically send an SMB echo to make sure the server is alive, rather than trying to time out a particular call. The logic that handles that today is somewhat sub-optimal though: we send an echo every 60s whether there are any calls in flight or not, and then wait another 60s before we decide that the server isn't there. What would be better is to only send one when we've been waiting a long time for a response. That "long time" is debatable -- 10s would be fine with me, but the logic needs to be fixed not to send echoes unless there is an outstanding request first.

I think though that you're trying to use this mechanism to do something that it wasn't really designed to do. A better method might be to try to detect whether the TCP connection is really dead somehow. That would be more immediate, but I'm unclear on how best to do that. Probably it'll mean groveling around down in the TCP layer...

FWIW, there was a thread on the linux-cifs mailing list started on Dec 3, 2010 entitled "cifs client timeouts and hard/soft mounts" that lays out the rationale for the current reconnection behavior. You may want to look over that before you go making changes here...
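A rough sketch of the echo policy Jeff is suggesting (not the current mainline code) might look like the following. The field and helper names (in_flight, lstrp, the echo delayed work, cifsiod_wq, ops->echo) follow the existing cifs structures but should be treated as assumptions, and ECHO_WAIT is a placeholder for whatever threshold is eventually chosen.

    #include "cifsglob.h"
    #include "cifsproto.h"

    #define ECHO_WAIT (10 * HZ)	/* placeholder "long time" threshold */

    static void cifs_echo_request(struct work_struct *work)
    {
    	struct TCP_Server_Info *server =
    		container_of(work, struct TCP_Server_Info, echo.work);

    	/* Probe only when we are actually waiting on a silent server. */
    	if (server->in_flight > 0 &&
    	    time_after(jiffies, server->lstrp + ECHO_WAIT)) {
    		if (server->ops->echo && server->ops->echo(server))
    			cFYI(1, "Unable to send echo request to server: %s",
    			     server->hostname);
    	}

    	/* Re-arm the job either way so a dead server is still noticed. */
    	queue_delayed_work(cifsiod_wq, &server->echo, SMB_ECHO_INTERVAL);
    }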
On Thu, Feb 28, 2013 at 9:26 AM, Jeff Layton <jlayton@samba.org> wrote: > On Wed, 27 Feb 2013 16:24:07 -0600 > Dave Chiluk <dave.chiluk@canonical.com> wrote: > >> On 02/27/2013 10:34 AM, Jeff Layton wrote: >> > On Wed, 27 Feb 2013 12:06:14 +0100 >> > "Stefan (metze) Metzmacher" <metze@samba.org> wrote: >> > >> >> Hi Dave, >> >> >> >>> When messages are currently in queue awaiting a response, decrease amount of >> >>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current >> >>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 >> >>> seconds) since the last response was recieved. This does not take into account >> >>> the fact that messages waiting for a response should be serviced within a >> >>> reasonable round trip time. >> >> >> >> Wouldn't that mean that the client will disconnect a good connection, >> >> if the server doesn't response within 10 seconds? >> >> Reads and Writes can take longer than 10 seconds... >> >> >> > >> > Where does this magic value of 10s come from? Note that a slow server >> > can take *minutes* to respond to writes that are long past the EOF. >> It comes from the desire to decrease the reconnection delay to something >> better than a random number between 60 and 120 seconds. I am not >> committed to this number, and it is open for discussion. Additionally >> if you look closely at the logic it's not 10 seconds per request, but >> actually when requests have been in flight for more than 10 seconds make >> sure we've heard from the server in the last 10 seconds. >> >> Can you explain more fully your use case of writes that are long past >> the EOF? Perhaps with a test-case or script that I can test? As far as >> I know writes long past EOF will just result in a sparse file, and >> return in a reasonable round trip time *(that's at least what I'm seeing >> with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 >> seek=100000, starts receiving responses from the server in about .05 >> seconds with subsequent responses following at roughly .002-.01 second >> intervals. This is well within my 10 second value. Even adding the >> latency of AT&T's 2g cell network brings it up to only 1s. Still 10x >> less than my 10 second value. >> >> The new logic goes like this >> if( we've been expecting a response from the server (in_flight), and >> message has been in_flight for more than 10 seconds and >> we haven't had any other contact from the server in that time >> reconnect >> > > That will break writes long past the EOF. Note too that reconnects on > CIFS are horrifically expensive and problematic. Much of the state on a > CIFS mount is tied to the connection. When that drops, open files are > closed and things like locks are dropped. SMB1 has no real mechanism > for state recovery, so that can really be a problem. > >> On a side note, I discovered a small race condition in the previous >> logic while working on this, that my new patch also fixes. >> 1s request >> 2s response >> 61.995 echo job pops >> 121.995 echo job pops and sends echo >> 122 server_unresponsive called. Finds no response and attempts to >> reconnect >> 122.95 response to echo received >> > > Sure, here's a reproducer. Do this against a windows server, preferably > one exporting NTFS on relatively slow storage. Make sure that > "testfile" doesn't exist first: > > $ dd if=/dev/zero of=/path/to/cifs/share/testfile bs=1M count=1 seek=3192 > > NTFS doesn't support sparse files, so the OS has to zero-fill up to the > point where you're writing. 
> That can take a looooong time on slow storage (minutes even). What we do
> now is periodically send a SMB echo to make sure the server is alive
> rather than trying to time out a particular call.

Writing past end of file in Windows can be very slow, but note that it is possible for a Windows application to mark a file as sparse on an NTFS partition. Windows NTFS does support sparse files (and we could even send the flag over cifs if we wanted), but it has to be explicitly set on the file by the app. Quoting from http://msdn.microsoft.com/en-us/library/windows/desktop/aa365566%28v=vs.85%29.aspx:

"To determine whether a file system supports sparse files, call the GetVolumeInformation function and examine the FILE_SUPPORTS_SPARSE_FILES bit flag returned through the lpFileSystemFlags parameter.

Most applications are not aware of sparse files and will not create sparse files. The fact that an application is reading a sparse file is transparent to the application. An application that is aware of sparse-files should determine whether its data set is suitable to be kept in a sparse file. After that determination is made, the application must explicitly declare a file as sparse, using the FSCTL_SET_SPARSE control code."
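For illustration, the application-side step the MSDN text describes looks like this on Windows; it is the app/server view of the problem, not something the Linux cifs client does today.

    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>

    int main(void)
    {
    	DWORD bytes = 0;
    	HANDLE h = CreateFileA("testfile", GENERIC_READ | GENERIC_WRITE,
    			       0, NULL, CREATE_ALWAYS,
    			       FILE_ATTRIBUTE_NORMAL, NULL);
    	if (h == INVALID_HANDLE_VALUE)
    		return 1;

    	/* Declare the file sparse; subsequent writes far past EOF leave
    	 * holes instead of forcing NTFS to zero-fill the gap. */
    	if (!DeviceIoControl(h, FSCTL_SET_SPARSE, NULL, 0, NULL, 0,
    			     &bytes, NULL)) {
    		fprintf(stderr, "FSCTL_SET_SPARSE failed: %lu\n",
    			GetLastError());
    		CloseHandle(h);
    		return 1;
    	}

    	CloseHandle(h);
    	return 0;
    }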
On Thu, 28 Feb 2013 10:04:36 -0600 Steve French <smfrench@gmail.com> wrote: > On Thu, Feb 28, 2013 at 9:26 AM, Jeff Layton <jlayton@samba.org> wrote: > > On Wed, 27 Feb 2013 16:24:07 -0600 > > Dave Chiluk <dave.chiluk@canonical.com> wrote: > > > >> On 02/27/2013 10:34 AM, Jeff Layton wrote: > >> > On Wed, 27 Feb 2013 12:06:14 +0100 > >> > "Stefan (metze) Metzmacher" <metze@samba.org> wrote: > >> > > >> >> Hi Dave, > >> >> > >> >>> When messages are currently in queue awaiting a response, decrease amount of > >> >>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current > >> >>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 > >> >>> seconds) since the last response was recieved. This does not take into account > >> >>> the fact that messages waiting for a response should be serviced within a > >> >>> reasonable round trip time. > >> >> > >> >> Wouldn't that mean that the client will disconnect a good connection, > >> >> if the server doesn't response within 10 seconds? > >> >> Reads and Writes can take longer than 10 seconds... > >> >> > >> > > >> > Where does this magic value of 10s come from? Note that a slow server > >> > can take *minutes* to respond to writes that are long past the EOF. > >> It comes from the desire to decrease the reconnection delay to something > >> better than a random number between 60 and 120 seconds. I am not > >> committed to this number, and it is open for discussion. Additionally > >> if you look closely at the logic it's not 10 seconds per request, but > >> actually when requests have been in flight for more than 10 seconds make > >> sure we've heard from the server in the last 10 seconds. > >> > >> Can you explain more fully your use case of writes that are long past > >> the EOF? Perhaps with a test-case or script that I can test? As far as > >> I know writes long past EOF will just result in a sparse file, and > >> return in a reasonable round trip time *(that's at least what I'm seeing > >> with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 > >> seek=100000, starts receiving responses from the server in about .05 > >> seconds with subsequent responses following at roughly .002-.01 second > >> intervals. This is well within my 10 second value. Even adding the > >> latency of AT&T's 2g cell network brings it up to only 1s. Still 10x > >> less than my 10 second value. > >> > >> The new logic goes like this > >> if( we've been expecting a response from the server (in_flight), and > >> message has been in_flight for more than 10 seconds and > >> we haven't had any other contact from the server in that time > >> reconnect > >> > > > > That will break writes long past the EOF. Note too that reconnects on > > CIFS are horrifically expensive and problematic. Much of the state on a > > CIFS mount is tied to the connection. When that drops, open files are > > closed and things like locks are dropped. SMB1 has no real mechanism > > for state recovery, so that can really be a problem. > > > >> On a side note, I discovered a small race condition in the previous > >> logic while working on this, that my new patch also fixes. > >> 1s request > >> 2s response > >> 61.995 echo job pops > >> 121.995 echo job pops and sends echo > >> 122 server_unresponsive called. Finds no response and attempts to > >> reconnect > >> 122.95 response to echo received > >> > > > > Sure, here's a reproducer. Do this against a windows server, preferably > > one exporting NTFS on relatively slow storage. 
Make sure that > > "testfile" doesn't exist first: > > > > $ dd if=/dev/zero of=/path/to/cifs/share/testfile bs=1M count=1 seek=3192 > > > > NTFS doesn't support sparse files, so the OS has to zero-fill up to the > > point where you're writing. That can take a looooong time on slow > > storage (minutes even). What we do now is periodically send a SMB echo > > to make sure the server is alive rather than trying to time out a > > particular call. > > Writing past end of file in Windows can be very slow, but note that it > is possible for a windows to set as sparse a file on an NTFS > partition. Quoting from > http://msdn.microsoft.com/en-us/library/windows/desktop/aa365566%28v=vs.85%29.aspx > Windows NTFS does support sparse files (and we could even send it over > cifs if we want) but it has to be explicitly set by the app on the > file: > > "To determine whether a file system supports sparse files, call the > GetVolumeInformation function and examine the > FILE_SUPPORTS_SPARSE_FILES bit flag returned through the > lpFileSystemFlags parameter. > > Most applications are not aware of sparse files and will not create > sparse files. The fact that an application is reading a sparse file is > transparent to the application. An application that is aware of > sparse-files should determine whether its data set is suitable to be > kept in a sparse file. After that determination is made, the > application must explicitly declare a file as sparse, using the > FSCTL_SET_SPARSE control code." > > That's interesting. I didn't know about the fsctl. It doesn't really help us though. Not all servers support passthrough infolevels, and there are other filesystems (e.g. FAT) that don't support sparse files at all. In any case, the upshot of all of this is that we simply can't assume that we'll get the response to a particular call in any given amount of time, so we have to periodically check that the server is still responding via echoes before giving up on it completely.
On 02/28/2013 10:47 AM, Jeff Layton wrote: > On Thu, 28 Feb 2013 10:04:36 -0600 > Steve French <smfrench@gmail.com> wrote: > >> On Thu, Feb 28, 2013 at 9:26 AM, Jeff Layton <jlayton@samba.org> wrote: >>> On Wed, 27 Feb 2013 16:24:07 -0600 >>> Dave Chiluk <dave.chiluk@canonical.com> wrote: >>> >>>> On 02/27/2013 10:34 AM, Jeff Layton wrote: >>>>> On Wed, 27 Feb 2013 12:06:14 +0100 >>>>> "Stefan (metze) Metzmacher" <metze@samba.org> wrote: >>>>> >>>>>> Hi Dave, >>>>>> >>>>>>> When messages are currently in queue awaiting a response, decrease amount of >>>>>>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current >>>>>>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 >>>>>>> seconds) since the last response was recieved. This does not take into account >>>>>>> the fact that messages waiting for a response should be serviced within a >>>>>>> reasonable round trip time. >>>>>> >>>>>> Wouldn't that mean that the client will disconnect a good connection, >>>>>> if the server doesn't response within 10 seconds? >>>>>> Reads and Writes can take longer than 10 seconds... >>>>>> >>>>> >>>>> Where does this magic value of 10s come from? Note that a slow server >>>>> can take *minutes* to respond to writes that are long past the EOF. >>>> It comes from the desire to decrease the reconnection delay to something >>>> better than a random number between 60 and 120 seconds. I am not >>>> committed to this number, and it is open for discussion. Additionally >>>> if you look closely at the logic it's not 10 seconds per request, but >>>> actually when requests have been in flight for more than 10 seconds make >>>> sure we've heard from the server in the last 10 seconds. >>>> >>>> Can you explain more fully your use case of writes that are long past >>>> the EOF? Perhaps with a test-case or script that I can test? As far as >>>> I know writes long past EOF will just result in a sparse file, and >>>> return in a reasonable round trip time *(that's at least what I'm seeing >>>> with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 >>>> seek=100000, starts receiving responses from the server in about .05 >>>> seconds with subsequent responses following at roughly .002-.01 second >>>> intervals. This is well within my 10 second value. Even adding the >>>> latency of AT&T's 2g cell network brings it up to only 1s. Still 10x >>>> less than my 10 second value. >>>> >>>> The new logic goes like this >>>> if( we've been expecting a response from the server (in_flight), and >>>> message has been in_flight for more than 10 seconds and >>>> we haven't had any other contact from the server in that time >>>> reconnect >>>> >>> >>> That will break writes long past the EOF. Note too that reconnects on >>> CIFS are horrifically expensive and problematic. Much of the state on a >>> CIFS mount is tied to the connection. When that drops, open files are >>> closed and things like locks are dropped. SMB1 has no real mechanism >>> for state recovery, so that can really be a problem. >>> >>>> On a side note, I discovered a small race condition in the previous >>>> logic while working on this, that my new patch also fixes. >>>> 1s request >>>> 2s response >>>> 61.995 echo job pops >>>> 121.995 echo job pops and sends echo >>>> 122 server_unresponsive called. Finds no response and attempts to >>>> reconnect >>>> 122.95 response to echo received >>>> >>> >>> Sure, here's a reproducer. 
Do this against a windows server, preferably >>> one exporting NTFS on relatively slow storage. Make sure that >>> "testfile" doesn't exist first: >>> >>> $ dd if=/dev/zero of=/path/to/cifs/share/testfile bs=1M count=1 seek=3192 >>> >>> NTFS doesn't support sparse files, so the OS has to zero-fill up to the >>> point where you're writing. That can take a looooong time on slow >>> storage (minutes even). What we do now is periodically send a SMB echo >>> to make sure the server is alive rather than trying to time out a >>> particular call. >> >> Writing past end of file in Windows can be very slow, but note that it >> is possible for a windows to set as sparse a file on an NTFS >> partition. Quoting from >> http://msdn.microsoft.com/en-us/library/windows/desktop/aa365566%28v=vs.85%29.aspx >> Windows NTFS does support sparse files (and we could even send it over >> cifs if we want) but it has to be explicitly set by the app on the >> file: >> >> "To determine whether a file system supports sparse files, call the >> GetVolumeInformation function and examine the >> FILE_SUPPORTS_SPARSE_FILES bit flag returned through the >> lpFileSystemFlags parameter. >> >> Most applications are not aware of sparse files and will not create >> sparse files. The fact that an application is reading a sparse file is >> transparent to the application. An application that is aware of >> sparse-files should determine whether its data set is suitable to be >> kept in a sparse file. After that determination is made, the >> application must explicitly declare a file as sparse, using the >> FSCTL_SET_SPARSE control code." >> >> > > That's interesting. I didn't know about the fsctl. > > It doesn't really help us though. Not all servers support passthrough > infolevels, and there are other filesystems (e.g. FAT) that don't > support sparse files at all. > > In any case, the upshot of all of this is that we simply can't assume > that we'll get the response to a particular call in any given amount of > time, so we have to periodically check that the server is still > responding via echoes before giving up on it completely. > I just verified this by running the dd testcase against a windows 7 server. I'm going to rewrite my patch to optimise the echo logic as Jeff suggested earlier. The only difference being that, I think we should still have regular echos when nothing else is happening, so that the connection can be rebuilt when nothing urgent is going on. It still makes more sense to me that we should be checking the status of the tcp socket, and it's underlying nic, but I'm still not completely clear on how that could be accomplished. Any pointers to that regard would be appreciated. Thanks for the help guys. -- To unsubscribe from this list: send the line "unsubscribe linux-cifs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
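One possible shape for the socket/NIC check Dave is asking about, sketched under the assumption that cifs keeps its transport socket in TCP_Server_Info->ssocket: register an inetaddr notifier (the in-kernel analogue of metze's RTM_DELADDR suggestion) and force a reconnect when a local address that a connection depends on disappears, instead of waiting for the echo timeout. This is illustration only, not from the posted patch.

    #include <linux/notifier.h>
    #include <linux/inetdevice.h>

    static int cifs_inetaddr_event(struct notifier_block *nb,
    			       unsigned long event, void *ptr)
    {
    	struct in_ifaddr *ifa = ptr;

    	if (event != NETDEV_DOWN)
    		return NOTIFY_DONE;

    	/*
    	 * A local IPv4 address went away. A real implementation would walk
    	 * the list of TCP_Server_Info structures and, for each one whose
    	 * ssocket was bound to ifa->ifa_local (or is no longer in the
    	 * ESTABLISHED state), mark it for cifs_reconnect() right away.
    	 */
    	return NOTIFY_DONE;
    }

    static struct notifier_block cifs_inetaddr_notifier = {
    	.notifier_call = cifs_inetaddr_event,
    };

    /* registered once at module init:
     * register_inetaddr_notifier(&cifs_inetaddr_notifier);
     */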
On Thu, Feb 28, 2013 at 11:31 AM, Dave Chiluk <dave.chiluk@canonical.com> wrote: > On 02/28/2013 10:47 AM, Jeff Layton wrote: >> On Thu, 28 Feb 2013 10:04:36 -0600 >> Steve French <smfrench@gmail.com> wrote: >> >>> On Thu, Feb 28, 2013 at 9:26 AM, Jeff Layton <jlayton@samba.org> wrote: >>>> On Wed, 27 Feb 2013 16:24:07 -0600 >>>> Dave Chiluk <dave.chiluk@canonical.com> wrote: >>>> >>>>> On 02/27/2013 10:34 AM, Jeff Layton wrote: >>>>>> On Wed, 27 Feb 2013 12:06:14 +0100 >>>>>> "Stefan (metze) Metzmacher" <metze@samba.org> wrote: >>>>>> >>>>>>> Hi Dave, >>>>>>> >>>>>>>> When messages are currently in queue awaiting a response, decrease amount of >>>>>>>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current >>>>>>>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 >>>>>>>> seconds) since the last response was recieved. This does not take into account >>>>>>>> the fact that messages waiting for a response should be serviced within a >>>>>>>> reasonable round trip time. >>>>>>> >>>>>>> Wouldn't that mean that the client will disconnect a good connection, >>>>>>> if the server doesn't response within 10 seconds? >>>>>>> Reads and Writes can take longer than 10 seconds... >>>>>>> >>>>>> >>>>>> Where does this magic value of 10s come from? Note that a slow server >>>>>> can take *minutes* to respond to writes that are long past the EOF. >>>>> It comes from the desire to decrease the reconnection delay to something >>>>> better than a random number between 60 and 120 seconds. I am not >>>>> committed to this number, and it is open for discussion. Additionally >>>>> if you look closely at the logic it's not 10 seconds per request, but >>>>> actually when requests have been in flight for more than 10 seconds make >>>>> sure we've heard from the server in the last 10 seconds. >>>>> >>>>> Can you explain more fully your use case of writes that are long past >>>>> the EOF? Perhaps with a test-case or script that I can test? As far as >>>>> I know writes long past EOF will just result in a sparse file, and >>>>> return in a reasonable round trip time *(that's at least what I'm seeing >>>>> with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 >>>>> seek=100000, starts receiving responses from the server in about .05 >>>>> seconds with subsequent responses following at roughly .002-.01 second >>>>> intervals. This is well within my 10 second value. Even adding the >>>>> latency of AT&T's 2g cell network brings it up to only 1s. Still 10x >>>>> less than my 10 second value. >>>>> >>>>> The new logic goes like this >>>>> if( we've been expecting a response from the server (in_flight), and >>>>> message has been in_flight for more than 10 seconds and >>>>> we haven't had any other contact from the server in that time >>>>> reconnect >>>>> >>>> >>>> That will break writes long past the EOF. Note too that reconnects on >>>> CIFS are horrifically expensive and problematic. Much of the state on a >>>> CIFS mount is tied to the connection. When that drops, open files are >>>> closed and things like locks are dropped. SMB1 has no real mechanism >>>> for state recovery, so that can really be a problem. >>>> >>>>> On a side note, I discovered a small race condition in the previous >>>>> logic while working on this, that my new patch also fixes. >>>>> 1s request >>>>> 2s response >>>>> 61.995 echo job pops >>>>> 121.995 echo job pops and sends echo >>>>> 122 server_unresponsive called. 
Finds no response and attempts to >>>>> reconnect >>>>> 122.95 response to echo received >>>>> >>>> >>>> Sure, here's a reproducer. Do this against a windows server, preferably >>>> one exporting NTFS on relatively slow storage. Make sure that >>>> "testfile" doesn't exist first: >>>> >>>> $ dd if=/dev/zero of=/path/to/cifs/share/testfile bs=1M count=1 seek=3192 >>>> >>>> NTFS doesn't support sparse files, so the OS has to zero-fill up to the >>>> point where you're writing. That can take a looooong time on slow >>>> storage (minutes even). What we do now is periodically send a SMB echo >>>> to make sure the server is alive rather than trying to time out a >>>> particular call. >>> >>> Writing past end of file in Windows can be very slow, but note that it >>> is possible for a windows to set as sparse a file on an NTFS >>> partition. Quoting from >>> http://msdn.microsoft.com/en-us/library/windows/desktop/aa365566%28v=vs.85%29.aspx >>> Windows NTFS does support sparse files (and we could even send it over >>> cifs if we want) but it has to be explicitly set by the app on the >>> file: >>> >>> "To determine whether a file system supports sparse files, call the >>> GetVolumeInformation function and examine the >>> FILE_SUPPORTS_SPARSE_FILES bit flag returned through the >>> lpFileSystemFlags parameter. >>> >>> Most applications are not aware of sparse files and will not create >>> sparse files. The fact that an application is reading a sparse file is >>> transparent to the application. An application that is aware of >>> sparse-files should determine whether its data set is suitable to be >>> kept in a sparse file. After that determination is made, the >>> application must explicitly declare a file as sparse, using the >>> FSCTL_SET_SPARSE control code." >>> >>> >> >> That's interesting. I didn't know about the fsctl. >> >> It doesn't really help us though. Not all servers support passthrough >> infolevels, and there are other filesystems (e.g. FAT) that don't >> support sparse files at all. >> >> In any case, the upshot of all of this is that we simply can't assume >> that we'll get the response to a particular call in any given amount of >> time, so we have to periodically check that the server is still >> responding via echoes before giving up on it completely. >> > > I just verified this by running the dd testcase against a windows 7 > server. I'm going to rewrite my patch to optimise the echo logic as > Jeff suggested earlier. The only difference being that, I think we > should still have regular echos when nothing else is happening, so that > the connection can be rebuilt when nothing urgent is going on. > > It still makes more sense to me that we should be checking the status of > the tcp socket, and it's underlying nic, but I'm still not completely > clear on how that could be accomplished. Any pointers to that regard > would be appreciated. It is also worth checking if the witness protocol would help us (even in a nonclustered environment) because it was designed to allow (at least for smb3 mounts) a client to tell when a server is up or down
On Thu, 28 Feb 2013 11:31:54 -0600 Dave Chiluk <dave.chiluk@canonical.com> wrote: > On 02/28/2013 10:47 AM, Jeff Layton wrote: > > On Thu, 28 Feb 2013 10:04:36 -0600 > > Steve French <smfrench@gmail.com> wrote: > > > >> On Thu, Feb 28, 2013 at 9:26 AM, Jeff Layton <jlayton@samba.org> wrote: > >>> On Wed, 27 Feb 2013 16:24:07 -0600 > >>> Dave Chiluk <dave.chiluk@canonical.com> wrote: > >>> > >>>> On 02/27/2013 10:34 AM, Jeff Layton wrote: > >>>>> On Wed, 27 Feb 2013 12:06:14 +0100 > >>>>> "Stefan (metze) Metzmacher" <metze@samba.org> wrote: > >>>>> > >>>>>> Hi Dave, > >>>>>> > >>>>>>> When messages are currently in queue awaiting a response, decrease amount of > >>>>>>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current > >>>>>>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 > >>>>>>> seconds) since the last response was recieved. This does not take into account > >>>>>>> the fact that messages waiting for a response should be serviced within a > >>>>>>> reasonable round trip time. > >>>>>> > >>>>>> Wouldn't that mean that the client will disconnect a good connection, > >>>>>> if the server doesn't response within 10 seconds? > >>>>>> Reads and Writes can take longer than 10 seconds... > >>>>>> > >>>>> > >>>>> Where does this magic value of 10s come from? Note that a slow server > >>>>> can take *minutes* to respond to writes that are long past the EOF. > >>>> It comes from the desire to decrease the reconnection delay to something > >>>> better than a random number between 60 and 120 seconds. I am not > >>>> committed to this number, and it is open for discussion. Additionally > >>>> if you look closely at the logic it's not 10 seconds per request, but > >>>> actually when requests have been in flight for more than 10 seconds make > >>>> sure we've heard from the server in the last 10 seconds. > >>>> > >>>> Can you explain more fully your use case of writes that are long past > >>>> the EOF? Perhaps with a test-case or script that I can test? As far as > >>>> I know writes long past EOF will just result in a sparse file, and > >>>> return in a reasonable round trip time *(that's at least what I'm seeing > >>>> with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 > >>>> seek=100000, starts receiving responses from the server in about .05 > >>>> seconds with subsequent responses following at roughly .002-.01 second > >>>> intervals. This is well within my 10 second value. Even adding the > >>>> latency of AT&T's 2g cell network brings it up to only 1s. Still 10x > >>>> less than my 10 second value. > >>>> > >>>> The new logic goes like this > >>>> if( we've been expecting a response from the server (in_flight), and > >>>> message has been in_flight for more than 10 seconds and > >>>> we haven't had any other contact from the server in that time > >>>> reconnect > >>>> > >>> > >>> That will break writes long past the EOF. Note too that reconnects on > >>> CIFS are horrifically expensive and problematic. Much of the state on a > >>> CIFS mount is tied to the connection. When that drops, open files are > >>> closed and things like locks are dropped. SMB1 has no real mechanism > >>> for state recovery, so that can really be a problem. > >>> > >>>> On a side note, I discovered a small race condition in the previous > >>>> logic while working on this, that my new patch also fixes. > >>>> 1s request > >>>> 2s response > >>>> 61.995 echo job pops > >>>> 121.995 echo job pops and sends echo > >>>> 122 server_unresponsive called. 
Finds no response and attempts to > >>>> reconnect > >>>> 122.95 response to echo received > >>>> > >>> > >>> Sure, here's a reproducer. Do this against a windows server, preferably > >>> one exporting NTFS on relatively slow storage. Make sure that > >>> "testfile" doesn't exist first: > >>> > >>> $ dd if=/dev/zero of=/path/to/cifs/share/testfile bs=1M count=1 seek=3192 > >>> > >>> NTFS doesn't support sparse files, so the OS has to zero-fill up to the > >>> point where you're writing. That can take a looooong time on slow > >>> storage (minutes even). What we do now is periodically send a SMB echo > >>> to make sure the server is alive rather than trying to time out a > >>> particular call. > >> > >> Writing past end of file in Windows can be very slow, but note that it > >> is possible for a windows to set as sparse a file on an NTFS > >> partition. Quoting from > >> http://msdn.microsoft.com/en-us/library/windows/desktop/aa365566%28v=vs.85%29.aspx > >> Windows NTFS does support sparse files (and we could even send it over > >> cifs if we want) but it has to be explicitly set by the app on the > >> file: > >> > >> "To determine whether a file system supports sparse files, call the > >> GetVolumeInformation function and examine the > >> FILE_SUPPORTS_SPARSE_FILES bit flag returned through the > >> lpFileSystemFlags parameter. > >> > >> Most applications are not aware of sparse files and will not create > >> sparse files. The fact that an application is reading a sparse file is > >> transparent to the application. An application that is aware of > >> sparse-files should determine whether its data set is suitable to be > >> kept in a sparse file. After that determination is made, the > >> application must explicitly declare a file as sparse, using the > >> FSCTL_SET_SPARSE control code." > >> > >> > > > > That's interesting. I didn't know about the fsctl. > > > > It doesn't really help us though. Not all servers support passthrough > > infolevels, and there are other filesystems (e.g. FAT) that don't > > support sparse files at all. > > > > In any case, the upshot of all of this is that we simply can't assume > > that we'll get the response to a particular call in any given amount of > > time, so we have to periodically check that the server is still > > responding via echoes before giving up on it completely. > > > > I just verified this by running the dd testcase against a windows 7 > server. I'm going to rewrite my patch to optimise the echo logic as > Jeff suggested earlier. The only difference being that, I think we > should still have regular echos when nothing else is happening, so that > the connection can be rebuilt when nothing urgent is going on. > OTOH, you don't want to hammer the server with echoes. They are fairly light weight, but they aren't completely free. That's why I think we might get better milage out of trying to look at the socket itself to figure out the state. > It still makes more sense to me that we should be checking the status of > the tcp socket, and it's underlying nic, but I'm still not completely > clear on how that could be accomplished. Any pointers to that regard > would be appreciated. > > Thanks for the help guys. You can always look at the sk_state flags to figure out the state of the TCP connection (there are some examples of that in sunrpc code, but there may be simpler ones elsewhere). As far as the underlying interface goes, I'm not sure what you can do. 
There's not always a straightforward 1:1 correspondence between an interface and a connection, is there? Also, we don't necessarily want to reconnect just because NetworkManager got upgraded and took the interface down for a second and brought it right back up again. What if an address migrates to a different interface altogether on the same subnet? Do TCP connections normally keep chugging along in that situation? I think you probably need to nail down the specific circumstances where you want to reconnect and then try to figure out how best to detect them.
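For what it's worth, the sk_state check Jeff describes might look roughly like the sketch below, reading the transport socket that the demultiplex thread already holds. This is only a sketch against the 3.8-era TCP_Server_Info layout (the ssocket field), meant to live in fs/cifs where cifsglob.h is in scope; it is not a tested patch.

    #include <net/sock.h>
    #include <net/tcp_states.h>

    /* Sketch: true if the transport socket under this connection is no
     * longer established, so cifsd could reconnect without waiting for
     * an echo to time out.
     */
    static bool cifs_sock_dead(struct TCP_Server_Info *server)
    {
            struct socket *sock = server->ssocket;

            if (!sock || !sock->sk)
                    return true;

            /* CLOSE_WAIT (peer sent a FIN), CLOSE (reset), etc. all mean
             * the connection is gone as far as the stack is concerned. */
            return sock->sk->sk_state != TCP_ESTABLISHED;
    }

Note that this only helps once the TCP stack itself has noticed the teardown; an idle socket whose address quietly moved to another interface can sit in TCP_ESTABLISHED indefinitely, which is exactly the ambiguity raised above.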
On Thu, 2013-02-28 at 11:31 -0600, Dave Chiluk wrote: > On 02/28/2013 10:47 AM, Jeff Layton wrote: > > On Thu, 28 Feb 2013 10:04:36 -0600 > > Steve French <smfrench@gmail.com> wrote: > > > >> On Thu, Feb 28, 2013 at 9:26 AM, Jeff Layton <jlayton@samba.org> wrote: > >>> On Wed, 27 Feb 2013 16:24:07 -0600 > >>> Dave Chiluk <dave.chiluk@canonical.com> wrote: > >>> > >>>> On 02/27/2013 10:34 AM, Jeff Layton wrote: > >>>>> On Wed, 27 Feb 2013 12:06:14 +0100 > >>>>> "Stefan (metze) Metzmacher" <metze@samba.org> wrote: > >>>>> > >>>>>> Hi Dave, > >>>>>> > >>>>>>> When messages are currently in queue awaiting a response, decrease amount of > >>>>>>> time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current > >>>>>>> wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 > >>>>>>> seconds) since the last response was recieved. This does not take into account > >>>>>>> the fact that messages waiting for a response should be serviced within a > >>>>>>> reasonable round trip time. > >>>>>> > >>>>>> Wouldn't that mean that the client will disconnect a good connection, > >>>>>> if the server doesn't response within 10 seconds? > >>>>>> Reads and Writes can take longer than 10 seconds... > >>>>>> > >>>>> > >>>>> Where does this magic value of 10s come from? Note that a slow server > >>>>> can take *minutes* to respond to writes that are long past the EOF. > >>>> It comes from the desire to decrease the reconnection delay to something > >>>> better than a random number between 60 and 120 seconds. I am not > >>>> committed to this number, and it is open for discussion. Additionally > >>>> if you look closely at the logic it's not 10 seconds per request, but > >>>> actually when requests have been in flight for more than 10 seconds make > >>>> sure we've heard from the server in the last 10 seconds. > >>>> > >>>> Can you explain more fully your use case of writes that are long past > >>>> the EOF? Perhaps with a test-case or script that I can test? As far as > >>>> I know writes long past EOF will just result in a sparse file, and > >>>> return in a reasonable round trip time *(that's at least what I'm seeing > >>>> with my testing). dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 > >>>> seek=100000, starts receiving responses from the server in about .05 > >>>> seconds with subsequent responses following at roughly .002-.01 second > >>>> intervals. This is well within my 10 second value. Even adding the > >>>> latency of AT&T's 2g cell network brings it up to only 1s. Still 10x > >>>> less than my 10 second value. > >>>> > >>>> The new logic goes like this > >>>> if( we've been expecting a response from the server (in_flight), and > >>>> message has been in_flight for more than 10 seconds and > >>>> we haven't had any other contact from the server in that time > >>>> reconnect > >>>> > >>> > >>> That will break writes long past the EOF. Note too that reconnects on > >>> CIFS are horrifically expensive and problematic. Much of the state on a > >>> CIFS mount is tied to the connection. When that drops, open files are > >>> closed and things like locks are dropped. SMB1 has no real mechanism > >>> for state recovery, so that can really be a problem. > >>> > >>>> On a side note, I discovered a small race condition in the previous > >>>> logic while working on this, that my new patch also fixes. > >>>> 1s request > >>>> 2s response > >>>> 61.995 echo job pops > >>>> 121.995 echo job pops and sends echo > >>>> 122 server_unresponsive called. 
Finds no response and attempts to > >>>> reconnect > >>>> 122.95 response to echo received > >>>> > >>> > >>> Sure, here's a reproducer. Do this against a windows server, preferably > >>> one exporting NTFS on relatively slow storage. Make sure that > >>> "testfile" doesn't exist first: > >>> > >>> $ dd if=/dev/zero of=/path/to/cifs/share/testfile bs=1M count=1 seek=3192 > >>> > >>> NTFS doesn't support sparse files, so the OS has to zero-fill up to the > >>> point where you're writing. That can take a looooong time on slow > >>> storage (minutes even). What we do now is periodically send a SMB echo > >>> to make sure the server is alive rather than trying to time out a > >>> particular call. > >> > >> Writing past end of file in Windows can be very slow, but note that it > >> is possible for a windows to set as sparse a file on an NTFS > >> partition. Quoting from > >> http://msdn.microsoft.com/en-us/library/windows/desktop/aa365566%28v=vs.85%29.aspx > >> Windows NTFS does support sparse files (and we could even send it over > >> cifs if we want) but it has to be explicitly set by the app on the > >> file: > >> > >> "To determine whether a file system supports sparse files, call the > >> GetVolumeInformation function and examine the > >> FILE_SUPPORTS_SPARSE_FILES bit flag returned through the > >> lpFileSystemFlags parameter. > >> > >> Most applications are not aware of sparse files and will not create > >> sparse files. The fact that an application is reading a sparse file is > >> transparent to the application. An application that is aware of > >> sparse-files should determine whether its data set is suitable to be > >> kept in a sparse file. After that determination is made, the > >> application must explicitly declare a file as sparse, using the > >> FSCTL_SET_SPARSE control code." > >> > >> > > > > That's interesting. I didn't know about the fsctl. > > > > It doesn't really help us though. Not all servers support passthrough > > infolevels, and there are other filesystems (e.g. FAT) that don't > > support sparse files at all. > > > > In any case, the upshot of all of this is that we simply can't assume > > that we'll get the response to a particular call in any given amount of > > time, so we have to periodically check that the server is still > > responding via echoes before giving up on it completely. > > > > I just verified this by running the dd testcase against a windows 7 > server. I'm going to rewrite my patch to optimise the echo logic as > Jeff suggested earlier. The only difference being that, I think we > should still have regular echos when nothing else is happening, so that > the connection can be rebuilt when nothing urgent is going on. Constant echo requests, keep the server busy and create unnecessary traffic. They would also probably kill connections that would otherwise survive temporary disruption of communications (laptop gets briefly out of range while moving through rooms, etc...) when otherwise not needing to contact the server and the underlying TCP connection would not be dropped. Simo. > It still makes more sense to me that we should be checking the status of > the tcp socket, and it's underlying nic, but I'm still not completely > clear on how that could be accomplished. Any pointers to that regard > would be appreciated.
On 2013-02-28 at 07:26 -0800 Jeff Layton sent off: > NTFS doesn't support sparse files, so the OS has to zero-fill up to the > point where you're writing. That can take a looooong time on slow > storage (minutes even). But you are talking about FAT here, right? NTFS does support sparse files if the sparse bit has explicitly been set on it. But even if the sparse bit is not set, filling a file with zeros by writing after a seek long beyond the end of the file is very fast, because NTFS supports the feature that Unix filesystems like XFS call extents. If writing beyond the end of a file is really slow via the cifs vfs in the test case against an NTFS volume, then I wonder whether that operation is really being done optimally over the wire. NTFS really isn't that bad at handling this kind of file. Cheers Björn
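To make the sparse-bit point concrete: from a local Win32 program, the explicit marking that the MSDN text quoted earlier describes looks roughly like the snippet below. Illustrative only — the in-kernel CIFS client does not send this fsctl today, and the file name and offset are placeholders chosen to mirror the dd reproducer.

    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>

    /* Mark a file sparse with FSCTL_SET_SPARSE, then write far past EOF. */
    int main(void)
    {
            HANDLE h;
            DWORD bytes = 0, written = 0;
            LARGE_INTEGER off;
            char buf[4096] = { 0 };

            h = CreateFileA("testfile", GENERIC_READ | GENERIC_WRITE,
                            0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
            if (h == INVALID_HANDLE_VALUE)
                    return 1;

            if (!DeviceIoControl(h, FSCTL_SET_SPARSE, NULL, 0, NULL, 0,
                                 &bytes, NULL)) {
                    fprintf(stderr, "FSCTL_SET_SPARSE failed: %lu\n",
                            GetLastError());
                    CloseHandle(h);
                    return 1;
            }

            /* same offset as the dd test case: ~3 GB past the start */
            off.QuadPart = 3192LL * 1024 * 1024;
            SetFilePointerEx(h, off, NULL, FILE_BEGIN);
            WriteFile(h, buf, sizeof(buf), &written, NULL);

            CloseHandle(h);
            return 0;
    }

Whether the server can then skip the zero-fill that Jeff describes is exactly the question Björn is raising.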
On Thu, 28 Feb 2013 23:54:13 +0100 Björn JACKE <bj@SerNet.DE> wrote: > On 2013-02-28 at 07:26 -0800 Jeff Layton sent off: > > NTFS doesn't support sparse files, so the OS has to zero-fill up to the > > point where you're writing. That can take a looooong time on slow > > storage (minutes even). > > but you are talking about FAT here, right? NTFS does support sparse files if > the sparse bit has been explicitly been set on it. Bit even if the sparse bit > is not set filling a file with zeros by writing after a seek long beyond the > end of the file is very fast because NTFS supports that feature what Unix > filesystems like xfs call extents. > > If writing beyond the end of a file is really slow via cifs vfs in the test > case against a ntfs volume then I wonder if that operation is being really done > optimally over the wire. ntfs really isn't that bad with handling this kind of > files. > I'm not sure since I don't know the internals of NTFS. I had always assumed that it didn't really handle sparse files well (hence the "rabbit-pellet" thing that windows clients do). All I can say however is that writes long past the EOF can take a *really* long time to run. Typically we just issue a SMB_COM_WRITEX at the offset to which we want to put the data. Is there some other way we ought to be doing this? In any case, it doesn't really change the fact that there is no guaranteed time of response from CIFS servers. They can easily take a really long time to respond to certain requests. The best method we have to deal with that is to periodically "ping" the server with an echo to see if it's still there.
On Thu, Feb 28, 2013 at 6:11 PM, Jeff Layton <jlayton@samba.org> wrote: > On Thu, 28 Feb 2013 23:54:13 +0100 > Björn JACKE <bj@SerNet.DE> wrote: > >> On 2013-02-28 at 07:26 -0800 Jeff Layton sent off: >> > NTFS doesn't support sparse files, so the OS has to zero-fill up to the >> > point where you're writing. That can take a looooong time on slow >> > storage (minutes even). >> >> but you are talking about FAT here, right? NTFS does support sparse files if >> the sparse bit has been explicitly been set on it. Bit even if the sparse bit >> is not set filling a file with zeros by writing after a seek long beyond the >> end of the file is very fast because NTFS supports that feature what Unix >> filesystems like xfs call extents. >> >> If writing beyond the end of a file is really slow via cifs vfs in the test >> case against a ntfs volume then I wonder if that operation is being really done >> optimally over the wire. ntfs really isn't that bad with handling this kind of >> files. >> > > I'm not sure since I don't know the internals of NTFS. I had always > assumed that it didn't really handle sparse files well (hence the > "rabbit-pellet" thing that windows clients do). > > All I can say however is that writes long past the EOF can take a > *really* long time to run. Typically we just issue a SMB_COM_WRITEX at > the offset to which we want to put the data. Is there some other way we > ought to be doing this? > > In any case, it doesn't really change the fact that there is no > guaranteed time of response from CIFS servers. They can easily take a > really long time to respond to certain requests. The best method we > have to deal with that is to periodically "ping" the server with an > echo to see if it's still there. SMB2/SMB3 with better async support may make this easier - but Jeff is right.
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index e6899ce..138c8cf 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -80,6 +80,8 @@
 
 /* SMB echo "timeout" -- FIXME: tunable? */
 #define SMB_ECHO_INTERVAL (60 * HZ)
+/* Maximum acceptable round trip time to server */
+#define SMB_MAX_RTT (10 * HZ)
 
 #include "cifspdu.h"
 
@@ -1141,8 +1143,8 @@ struct mid_q_entry {
 	__u32 pid;		/* process id */
 	__u32 sequence_number;  /* for CIFS signing */
 	unsigned long when_alloc;  /* when mid was created */
-#ifdef CONFIG_CIFS_STATS2
 	unsigned long when_sent; /* time when smb send finished */
+#ifdef CONFIG_CIFS_STATS2
 	unsigned long when_received; /* when demux complete (taken off wire) */
 #endif
 	mid_receive_t *receive; /* call receive callback */
@@ -1179,11 +1181,6 @@ static inline void cifs_num_waiters_dec(struct TCP_Server_Info *server)
 {
 	atomic_dec(&server->num_waiters);
 }
-
-static inline void cifs_save_when_sent(struct mid_q_entry *mid)
-{
-	mid->when_sent = jiffies;
-}
 #else
 static inline void cifs_in_send_inc(struct TCP_Server_Info *server)
 {
@@ -1199,11 +1196,15 @@ static inline void cifs_num_waiters_inc(struct TCP_Server_Info *server)
 static inline void cifs_num_waiters_dec(struct TCP_Server_Info *server)
 {
 }
+#endif
 
+/* We always need to know when a mid was sent in order to determine if
+ * the server is not responding.
+ */
 static inline void cifs_save_when_sent(struct mid_q_entry *mid)
 {
+	mid->when_sent = jiffies;
 }
-#endif
 
 /* for pending dnotify requests */
 struct dir_notify_req {
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 12b3da3..57c78b3 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -456,25 +456,66 @@ allocate_buffers(struct TCP_Server_Info *server)
 	return true;
 }
 
+/* Takes struct *TCP_Server_Info and returns the when_sent jiffy of the
+ * oldest unanswered mid in the pending queue or the newest response.
+ * Whichever is newer.
+ */
+static unsigned long
+oldest_req_or_newest_resp(struct TCP_Server_Info *server)
+{
+	struct mid_q_entry *mid;
+	unsigned long oldest_jiffy = jiffies;
+
+	spin_lock(&GlobalMid_Lock);
+	list_for_each_entry(mid, &server->pending_mid_q, qhead) {
+		if (mid->mid_state == MID_REQUEST_SUBMITTED) {
+			if (time_before(mid->when_sent, oldest_jiffy))
+				oldest_jiffy = mid->when_sent;
+		}
+	}
+	spin_unlock(&GlobalMid_Lock);
+
+	/* Check to see if the last response is newer than the oldest request
+	 * This could mean that the server is just responding very slowly,
+	 * possibly even longer than SMB_MAX_RTT. In which case we don't
+	 * want to cause a reconnect.
+	 */
+	if (time_after(server->lstrp, oldest_jiffy))
+		return server->lstrp;
+	else
+		return oldest_jiffy;
+}
+
 static bool
 server_unresponsive(struct TCP_Server_Info *server)
 {
+	unsigned long oldest;
+
 	/*
-	 * We need to wait 2 echo intervals to make sure we handle such
-	 * situations right:
+	 * When no messages are in flight max wait is
+	 * 2*SMB_ECHO_INTERVAL + SMB_MAX_RTT + scheduling delay
+	 *
+	 * 1s   client sends a normal SMB request
+	 * 2s   client gets a response
+	 * 61s  echo workqueue job pops, and decides we got a response < 60
+	 *      seconds ago and don't need to send another
+	 * 121s kernel_recvmsg times out, and we see that we haven't gotten
+	 *      a response in >60s. Send echo causing in_flight() to return
+	 *      true
+	 * 131s echo hasn't returned run cifs_reconnect
+	 *
+	 * Situation 2 where non-echo messages are in_flight
 	 * 1s  client sends a normal SMB request
 	 * 2s  client gets a response
-	 * 30s echo workqueue job pops, and decides we got a response recently
-	 *     and don't need to send another
-	 * ...
-	 * 65s kernel_recvmsg times out, and we see that we haven't gotten
-	 *     a response in >60s.
+	 * 3s  client sends a normal SMB request
+	 * 13s client still has not received SMB response run cifs_reconnect
 	 */
 	if (server->tcpStatus == CifsGood &&
-	    time_after(jiffies, server->lstrp + 2 * SMB_ECHO_INTERVAL)) {
-		cERROR(1, "Server %s has not responded in %d seconds. "
+	    (in_flight(server) > 0 && time_after(jiffies,
+	    oldest = oldest_req_or_newest_resp(server) + SMB_MAX_RTT))) {
+		cERROR(1, "Server %s has not responded in %lu seconds. "
 			  "Reconnecting...", server->hostname,
-			  (2 * SMB_ECHO_INTERVAL) / HZ);
+			  ((jiffies - oldest) / HZ));
 		cifs_reconnect(server);
 		wake_up(&server->response_q);
 		return true;
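One small reading note on the patch above: oldest_req_or_newest_resp() compares timestamps with time_before()/time_after() rather than plain relational operators. A minimal, generic illustration of why (standard jiffies idiom, nothing CIFS-specific):

    #include <linux/jiffies.h>
    #include <linux/types.h>

    /* time_after()/time_before() compare the signed difference of the two
     * samples, so they stay correct when the jiffies counter wraps around
     * zero in between; a plain "<" on the raw unsigned values would not.
     */
    static bool waited_too_long(unsigned long sent, unsigned long timeout)
    {
            /* true once "timeout" ticks have elapsed since "sent" */
            return time_after(jiffies, sent + timeout);
    }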
When messages are currently in queue awaiting a response, decrease amount of time before attempting cifs_reconnect to SMB_MAX_RTT = 10 seconds. The current wait time before attempting to reconnect is currently 2*SMB_ECHO_INTERVAL(120 seconds) since the last response was recieved. This does not take into account the fact that messages waiting for a response should be serviced within a reasonable round trip time. This fixes the issue where user moves from wired to wireless or vice versa causing the mount to hang for 120 seconds, when it could reconnect considerably faster. After this fix it will take SMB_MAX_RTT (10 seconds) from the last time the user attempted to access the volume or SMB_MAX_RTT after the last echo. The worst case of the latter scenario being 2*SMB_ECHO_INTERVAL+SMB_MAX_RTT+small scheduling delay (about 130 seconds). Statistically speaking it would normally reconnect sooner. However in the best case where the user changes nics, and immediately tries to access the cifs share it will take SMB_MAX_RTT=10 seconds. BugLink: http://bugs.launchpad.net/bugs/1017622 Signed-off-by: Dave Chiluk <chiluk@canonical.com> --- fs/cifs/cifsglob.h | 15 +++++++------ fs/cifs/connect.c | 61 +++++++++++++++++++++++++++++++++++++++++++--------- 2 files changed, 59 insertions(+), 17 deletions(-)