Message ID | 20240813232241.2369855-1-joannelkoong@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | fuse: add timeout option for requests | expand |
On Wed, Aug 14, 2024 at 7:23 AM Joanne Koong <joannelkoong@gmail.com> wrote: > > There are situations where fuse servers can become unresponsive or take > too long to reply to a request. Currently there is no upper bound on > how long a request may take, which may be frustrating to users who get > stuck waiting for a request to complete. > > This patchset adds a timeout option for requests and two dynamically > configurable fuse sysctls "default_request_timeout" and "max_request_timeout" > for controlling/enforcing timeout behavior system-wide. > > Existing fuse servers will not be affected unless they explicitly opt into the > timeout. > > v3: https://lore.kernel.org/linux-fsdevel/20240808190110.3188039-1-joannelkoong@gmail.com/ > Changes from v3 -> v4: > - Fix wording on some comments to make it more clear > - Use simpler logic for timer (eg remove extra if checks, use mod timer API) (Josef) > - Sanity-check should be on FR_FINISHING not FR_FINISHED (Jingbo) > - Fix comment for "processing queue", add req->fpq = NULL safeguard (Bernd) > > v2: https://lore.kernel.org/linux-fsdevel/20240730002348.3431931-1-joannelkoong@gmail.com/ > Changes from v2 -> v3: > - Disarm / rearm timer in dev_do_read to handle race conditions (Bernrd) > - Disarm timer in error handling for fatal interrupt (Yafang) > - Clean up do_fuse_request_end (Jingbo) > - Add timer for notify retrieve requests > - Fix kernel test robot errors for #define no-op functions > > v1: https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/ > Changes from v1 -> v2: > - Add timeout for background requests > - Handle resend race condition > - Add sysctls > > Joanne Koong (2): > fuse: add optional kernel-enforced timeout for requests > fuse: add default_request_timeout and max_request_timeout sysctls > > Documentation/admin-guide/sysctl/fs.rst | 17 +++ > fs/fuse/Makefile | 2 +- > fs/fuse/dev.c | 192 +++++++++++++++++++++++- > fs/fuse/fuse_i.h | 30 ++++ > fs/fuse/inode.c | 24 +++ > fs/fuse/sysctl.c | 42 ++++++ > 6 files changed, 298 insertions(+), 9 deletions(-) > create mode 100644 fs/fuse/sysctl.c > > -- > 2.43.5 > For this series, Tested-by: Yafang Shao <laoar.shao@gmail.com>
On Wed, 14 Aug 2024 at 01:23, Joanne Koong <joannelkoong@gmail.com> wrote: > > There are situations where fuse servers can become unresponsive or take > too long to reply to a request. Currently there is no upper bound on > how long a request may take, which may be frustrating to users who get > stuck waiting for a request to complete. > > This patchset adds a timeout option for requests and two dynamically > configurable fuse sysctls "default_request_timeout" and "max_request_timeout" > for controlling/enforcing timeout behavior system-wide. > > Existing fuse servers will not be affected unless they explicitly opt into the > timeout. I sort of understand the motivation, but do not clearly see why this is required. A well written server will be able to do request timeouts properly, without the kernel having to cut off requests mid flight without the knowledge of the server. The latter could even be dangerous because locking guarantees previously provided by the kernel do not apply anymore. Can you please explain why this needs to be done by the client (kernel) instead of the server (userspace)? Thanks, Miklos > > v3: https://lore.kernel.org/linux-fsdevel/20240808190110.3188039-1-joannelkoong@gmail.com/ > Changes from v3 -> v4: > - Fix wording on some comments to make it more clear > - Use simpler logic for timer (eg remove extra if checks, use mod timer API) (Josef) > - Sanity-check should be on FR_FINISHING not FR_FINISHED (Jingbo) > - Fix comment for "processing queue", add req->fpq = NULL safeguard (Bernd) > > v2: https://lore.kernel.org/linux-fsdevel/20240730002348.3431931-1-joannelkoong@gmail.com/ > Changes from v2 -> v3: > - Disarm / rearm timer in dev_do_read to handle race conditions (Bernrd) > - Disarm timer in error handling for fatal interrupt (Yafang) > - Clean up do_fuse_request_end (Jingbo) > - Add timer for notify retrieve requests > - Fix kernel test robot errors for #define no-op functions > > v1: https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/ > Changes from v1 -> v2: > - Add timeout for background requests > - Handle resend race condition > - Add sysctls > > Joanne Koong (2): > fuse: add optional kernel-enforced timeout for requests > fuse: add default_request_timeout and max_request_timeout sysctls > > Documentation/admin-guide/sysctl/fs.rst | 17 +++ > fs/fuse/Makefile | 2 +- > fs/fuse/dev.c | 192 +++++++++++++++++++++++- > fs/fuse/fuse_i.h | 30 ++++ > fs/fuse/inode.c | 24 +++ > fs/fuse/sysctl.c | 42 ++++++ > 6 files changed, 298 insertions(+), 9 deletions(-) > create mode 100644 fs/fuse/sysctl.c > > -- > 2.43.5 >
On 8/21/24 15:47, Miklos Szeredi wrote: > On Wed, 14 Aug 2024 at 01:23, Joanne Koong <joannelkoong@gmail.com> wrote: >> >> There are situations where fuse servers can become unresponsive or take >> too long to reply to a request. Currently there is no upper bound on >> how long a request may take, which may be frustrating to users who get >> stuck waiting for a request to complete. >> >> This patchset adds a timeout option for requests and two dynamically >> configurable fuse sysctls "default_request_timeout" and "max_request_timeout" >> for controlling/enforcing timeout behavior system-wide. >> >> Existing fuse servers will not be affected unless they explicitly opt into the >> timeout. > > I sort of understand the motivation, but do not clearly see why this > is required. > > A well written server will be able to do request timeouts properly, > without the kernel having to cut off requests mid flight without the > knowledge of the server. The latter could even be dangerous because > locking guarantees previously provided by the kernel do not apply > anymore. > > Can you please explain why this needs to be done by the client > (kernel) instead of the server (userspace)? I don't know about Joannes motivation, I see two use cases - You suggested yourself that timeouts might be a (non-ideal) solution to avoid page copies on writeback https://lore.kernel.org/linux-mm/CAJfpegt_mEYOeeTo2bWS3iJfC38t5bf29mzrxK68dhMptrgamg@mail.gmail.com/raw - Would probably be a solution for this LXCFS issue https://lore.kernel.org/lkml/a8d0c5da-6935-4d28-9380-68b84b8e6e72@shopee.com/ (although I don't fully understand the issue there yet) Thanks, Bernd
On Wed, 21 Aug 2024 at 16:15, Bernd Schubert <bernd.schubert@fastmail.fm> wrote: > - You suggested yourself that timeouts might be a (non-ideal) > solution to avoid page copies on writeback > https://lore.kernel.org/linux-mm/CAJfpegt_mEYOeeTo2bWS3iJfC38t5bf29mzrxK68dhMptrgamg@mail.gmail.com/raw What I suggested is to copy the original writeback page to a tmp page after a timeout. That's not the same as aborting the request, though the mechanism might be similar. > - Would probably be a solution for this LXCFS issue > https://lore.kernel.org/lkml/a8d0c5da-6935-4d28-9380-68b84b8e6e72@shopee.com/ > (although I don't fully understand the issue there yet) Yeah, we first need to understand what's going on. Thanks, Miklos
On Wed, Aug 21, 2024 at 03:47:53PM +0200, Miklos Szeredi wrote: > On Wed, 14 Aug 2024 at 01:23, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > There are situations where fuse servers can become unresponsive or take > > too long to reply to a request. Currently there is no upper bound on > > how long a request may take, which may be frustrating to users who get > > stuck waiting for a request to complete. > > > > This patchset adds a timeout option for requests and two dynamically > > configurable fuse sysctls "default_request_timeout" and "max_request_timeout" > > for controlling/enforcing timeout behavior system-wide. > > > > Existing fuse servers will not be affected unless they explicitly opt into the > > timeout. > > I sort of understand the motivation, but do not clearly see why this > is required. > > A well written server will be able to do request timeouts properly, > without the kernel having to cut off requests mid flight without the > knowledge of the server. The latter could even be dangerous because > locking guarantees previously provided by the kernel do not apply > anymore. > > Can you please explain why this needs to be done by the client > (kernel) instead of the server (userspace)? > "A well written server" is the key part here ;). In our case we had a "well written server" that ended up having a deadlock and we had to run around with a drgn script to find those hung mounts and kill them manually. The usecase here is specifically for bugs in the FUSE server to allow us to cleanup automatically with EIO's rather than a drgn script to figure out if the mount is hung. It also gives us the opportunity to do the things that Bernd points out, specifically remove the double buffering downside as we can trust that eventually writeback will either succeed or timeout. Thanks, Josef
On Wed, 21 Aug 2024 at 20:11, Josef Bacik <josef@toxicpanda.com> wrote: > "A well written server" is the key part here ;). In our case we had a "well > written server" that ended up having a deadlock and we had to run around with a > drgn script to find those hung mounts and kill them manually. The usecase here > is specifically for bugs in the FUSE server to allow us to cleanup automatically > with EIO's rather than a drgn script to figure out if the mount is hung. So you 'd like to automatically abort the connection to an unresponsive server? I'm okay with that. What I'm worried about is the unintended side effects of timed out request without the server's knowledge (i.e. VFS locks released, then new request takes VFS lock). If the connection to the server is aborted, then that's not an issue. It's also much simpler to just time out any response from the server (either read or write on /dev/fuse) than having to do per-request timeouts. > It also gives us the opportunity to do the things that Bernd points out, > specifically remove the double buffering downside as we can trust that > eventually writeback will either succeed or timeout. Thanks, Well see this explanation for how this might deadlock on a memory allocation by the server: https://lore.kernel.org/all/CAJfpegsfF77SV96wvaxn9VnRkNt5FKCnA4mJ0ieFsZtwFeRuYw@mail.gmail.com/ Having a timeout would fix the deadlock, but it doesn't seem to me a proper solution. Thanks, Miklos
On Wed, Aug 21, 2024 at 11:54 AM Miklos Szeredi <miklos@szeredi.hu> wrote: > > On Wed, 21 Aug 2024 at 20:11, Josef Bacik <josef@toxicpanda.com> wrote: > > > "A well written server" is the key part here ;). In our case we had a "well > > written server" that ended up having a deadlock and we had to run around with a > > drgn script to find those hung mounts and kill them manually. The usecase here > > is specifically for bugs in the FUSE server to allow us to cleanup automatically > > with EIO's rather than a drgn script to figure out if the mount is hung. > > So you 'd like to automatically abort the connection to an > unresponsive server? I'm okay with that. > > What I'm worried about is the unintended side effects of timed out > request without the server's knowledge (i.e. VFS locks released, then > new request takes VFS lock). If the connection to the server is > aborted, then that's not an issue. > > It's also much simpler to just time out any response from the server > (either read or write on /dev/fuse) than having to do per-request > timeouts. In our case, the deadlock was triggered by invalidating the inode in the middle of handling the write request. The server becomes stuck since the inode invalidation (eg fuse_reverse_inval_inode()) is attempting to acquire the folio lock but the lock was acquired when servicing the write request (eg fuse_fill_write_pages()) and only gets released after the server has replied to the write request (eg in fuse_send_write_pages()). Without a kernel enforced timeout, the only way out of this is to abort the connection. A userspace timeout wouldn't help in this case with getting the server unstuck. With the kernel timeout, this forces the kernel handling of the write request to proceed, whihc will drop the folio lock and resume the server back to a functioning state. I don't think situations like this are uncommon. For example, it's not obvious or clear to developers that fuse_lowlevel_notify_inval_inode() shouldn't be called inside of a write handler in their server code. I believe Yafang had a use case for this as well in https://lore.kernel.org/linux-fsdevel/20240724071156.97188-1-laoar.shao@gmail.com/ where they were seeing fuse connections becoming indefinitely stuck. For your concern about potential unintended side effects of timed out requests without the server's knowledge, could you elaborate more on the VFS locking example? In my mind, a request that times out is the same thing as a request that behaves normally and completes with an error code, but perhaps not? I think also, having some way for system admins to enforce request timeouts across the board might be useful as well - for example, if a malignant fuse server doesn't reply to any requests, the requests hog memory until the server is killed. Thanks, Joanne > > > It also gives us the opportunity to do the things that Bernd points out, > > specifically remove the double buffering downside as we can trust that > > eventually writeback will either succeed or timeout. Thanks, > > Well see this explanation for how this might deadlock on a memory > allocation by the server: > > https://lore.kernel.org/all/CAJfpegsfF77SV96wvaxn9VnRkNt5FKCnA4mJ0ieFsZtwFeRuYw@mail.gmail.com/ > > Having a timeout would fix the deadlock, but it doesn't seem to me a > proper solution. > > Thanks, > Miklos
On Wed, 21 Aug 2024 at 23:22, Joanne Koong <joannelkoong@gmail.com> wrote: > Without a kernel enforced timeout, the only way out of this is to > abort the connection. A userspace timeout wouldn't help in this case > with getting the server unstuck. With the kernel timeout, this forces > the kernel handling of the write request to proceed, whihc will drop > the folio lock and resume the server back to a functioning state. > > I don't think situations like this are uncommon. For example, it's not > obvious or clear to developers that fuse_lowlevel_notify_inval_inode() > shouldn't be called inside of a write handler in their server code. Documentation is definitely lacking. In fact a simple rule is: never call a notification function from within a request handling function. Notifications are async events that should happen independently of handling regular operations. Anything else is an abuse of the interface. > > For your concern about potential unintended side effects of timed out > requests without the server's knowledge, could you elaborate more on > the VFS locking example? In my mind, a request that times out is the > same thing as a request that behaves normally and completes with an > error code, but perhaps not? - user calls mknod(2) on fuse directory - VFS takes inode lock on parent directory - calls into fuse to create the file - fuse sends request to server - file creation is slow and times out in the kernel - fuse returns -ETIMEDOUT - VFS releases inode lock - meanwhile the server is still working on creating the file and has no idea that something went wrong - user calls the same mknod(2) again - same things happen as last time - server starts to create the file *again* knowing that the VFS takes care of concurrency - server crashes due to corruption > I think also, having some way for system admins to enforce request > timeouts across the board might be useful as well - for example, if a > malignant fuse server doesn't reply to any requests, the requests hog > memory until the server is killed. As I said, I'm not against enforcing a response time for fuse servers, as long as a timeout results in a complete abort and not just an error on the timed out request. Thanks, Miklos
On Thu, Aug 22, 2024 at 3:52 AM Miklos Szeredi <miklos@szeredi.hu> wrote: > > On Wed, 21 Aug 2024 at 23:22, Joanne Koong <joannelkoong@gmail.com> wrote: > > > Without a kernel enforced timeout, the only way out of this is to > > abort the connection. A userspace timeout wouldn't help in this case > > with getting the server unstuck. With the kernel timeout, this forces > > the kernel handling of the write request to proceed, whihc will drop > > the folio lock and resume the server back to a functioning state. > > > > I don't think situations like this are uncommon. For example, it's not > > obvious or clear to developers that fuse_lowlevel_notify_inval_inode() > > shouldn't be called inside of a write handler in their server code. > > Documentation is definitely lacking. In fact a simple rule is: never > call a notification function from within a request handling function. > Notifications are async events that should happen independently of > handling regular operations. Anything else is an abuse of the > interface. > > > > > For your concern about potential unintended side effects of timed out > > requests without the server's knowledge, could you elaborate more on > > the VFS locking example? In my mind, a request that times out is the > > same thing as a request that behaves normally and completes with an > > error code, but perhaps not? > > - user calls mknod(2) on fuse directory > - VFS takes inode lock on parent directory > - calls into fuse to create the file > - fuse sends request to server > - file creation is slow and times out in the kernel > - fuse returns -ETIMEDOUT > - VFS releases inode lock > - meanwhile the server is still working on creating the file and has > no idea that something went wrong > - user calls the same mknod(2) again > - same things happen as last time > - server starts to create the file *again* knowing that the VFS takes > care of concurrency > - server crashes due to corruption Thanks for the details. For cases like these though, isn't the server already responsible for handling errors properly to avoid potential corruption if their reply to the request fails? In your example above, it seems like the server would already need to have the error handling in place to roll back the file creation if their fuse_reply_create() call returned an error (eg -EIO if copying out args in the kernel had an issue). If the request timed out, then the server would get back -ENOENT to their reply. Thanks, Joanne > > > > I think also, having some way for system admins to enforce request > > timeouts across the board might be useful as well - for example, if a > > malignant fuse server doesn't reply to any requests, the requests hog > > memory until the server is killed. > > As I said, I'm not against enforcing a response time for fuse servers, > as long as a timeout results in a complete abort and not just an error > on the timed out request. > > Thanks, > Miklos
On Thu, 22 Aug 2024 at 19:31, Joanne Koong <joannelkoong@gmail.com> wrote: > For cases like these though, isn't the server already responsible for > handling errors properly to avoid potential corruption if their reply > to the request fails? In your example above, it seems like the server > would already need to have the error handling in place to roll back > the file creation if their fuse_reply_create() call returned an error > (eg -EIO if copying out args in the kernel had an issue). No, the server does not need to implement rollback, and does not in fact need to check for the return value of the fuse_reply_create() call unless it wants to mess with interrupts (not enabled by default). See libfuse/lib/fuse.c where most of the fuse_replu_XXX() calls just ignore the return value. Thanks, Miklos
On Thu, Aug 22, 2024 at 10:43 AM Miklos Szeredi <miklos@szeredi.hu> wrote: > > On Thu, 22 Aug 2024 at 19:31, Joanne Koong <joannelkoong@gmail.com> wrote: > > > For cases like these though, isn't the server already responsible for > > handling errors properly to avoid potential corruption if their reply > > to the request fails? In your example above, it seems like the server > > would already need to have the error handling in place to roll back > > the file creation if their fuse_reply_create() call returned an error > > (eg -EIO if copying out args in the kernel had an issue). > > No, the server does not need to implement rollback, and does not in > fact need to check for the return value of the fuse_reply_create() > call unless it wants to mess with interrupts (not enabled by default). > See libfuse/lib/fuse.c where most of the fuse_replu_XXX() calls just > ignore the return value. Ok, I see. For v5, I will update this to abort the connection altogether if a request times out. Thanks, Joanne > > Thanks, > Miklos
On Tue, Aug 20, 2024 at 7:02 PM Yafang Shao <laoar.shao@gmail.com> wrote: > > On Wed, Aug 14, 2024 at 7:23 AM Joanne Koong <joannelkoong@gmail.com> wrote: > > > > There are situations where fuse servers can become unresponsive or take > > too long to reply to a request. Currently there is no upper bound on > > how long a request may take, which may be frustrating to users who get > > stuck waiting for a request to complete. > > > > This patchset adds a timeout option for requests and two dynamically > > configurable fuse sysctls "default_request_timeout" and "max_request_timeout" > > for controlling/enforcing timeout behavior system-wide. > > > > Existing fuse servers will not be affected unless they explicitly opt into the > > timeout. > > > > v3: https://lore.kernel.org/linux-fsdevel/20240808190110.3188039-1-joannelkoong@gmail.com/ > > Changes from v3 -> v4: > > - Fix wording on some comments to make it more clear > > - Use simpler logic for timer (eg remove extra if checks, use mod timer API) (Josef) > > - Sanity-check should be on FR_FINISHING not FR_FINISHED (Jingbo) > > - Fix comment for "processing queue", add req->fpq = NULL safeguard (Bernd) > > > > v2: https://lore.kernel.org/linux-fsdevel/20240730002348.3431931-1-joannelkoong@gmail.com/ > > Changes from v2 -> v3: > > - Disarm / rearm timer in dev_do_read to handle race conditions (Bernrd) > > - Disarm timer in error handling for fatal interrupt (Yafang) > > - Clean up do_fuse_request_end (Jingbo) > > - Add timer for notify retrieve requests > > - Fix kernel test robot errors for #define no-op functions > > > > v1: https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/ > > Changes from v1 -> v2: > > - Add timeout for background requests > > - Handle resend race condition > > - Add sysctls > > > > Joanne Koong (2): > > fuse: add optional kernel-enforced timeout for requests > > fuse: add default_request_timeout and max_request_timeout sysctls > > > > Documentation/admin-guide/sysctl/fs.rst | 17 +++ > > fs/fuse/Makefile | 2 +- > > fs/fuse/dev.c | 192 +++++++++++++++++++++++- > > fs/fuse/fuse_i.h | 30 ++++ > > fs/fuse/inode.c | 24 +++ > > fs/fuse/sysctl.c | 42 ++++++ > > 6 files changed, 298 insertions(+), 9 deletions(-) > > create mode 100644 fs/fuse/sysctl.c > > > > -- > > 2.43.5 > > > > For this series, > > Tested-by: Yafang Shao <laoar.shao@gmail.com> Thanks for testing this version. For v5, the behavior will be modified (if a request times out, the connection will be aborted instead of just the request being aborted) so I'll hold off on adding your Tested-by sign-off until you explicitly give the ok on the newer version. Thanks, Joanne > > -- > Regards > Yafang