[v4,0/2] fuse: add timeout option for requests

Message ID	20240813232241.2369855-1-joannelkoong@gmail.com (mailing list archive)
Headers	show Received: from mail-yw1-f174.google.com (mail-yw1-f174.google.com [209.85.128.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D913A16DEA9 for <linux-fsdevel@vger.kernel.org>; Tue, 13 Aug 2024 23:23:57 +0000 (UTC) From: Joanne Koong <joannelkoong@gmail.com> To: miklos@szeredi.hu, linux-fsdevel@vger.kernel.org Cc: josef@toxicpanda.com, bernd.schubert@fastmail.fm, jefflexu@linux.alibaba.com, laoar.shao@gmail.com, kernel-team@meta.com Subject: [PATCH v4 0/2] fuse: add timeout option for requests Date: Tue, 13 Aug 2024 16:22:39 -0700 Message-ID: <20240813232241.2369855-1-joannelkoong@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	fuse: add timeout option for requests \| expand [v4,0/2] fuse: add timeout option for requests [v4,1/2] fuse: add optional kernel-enforced timeout for requests [v4,2/2] fuse: add default_request_timeout and max_request_timeout sysctls

Joanne Koong Aug. 13, 2024, 11:22 p.m. UTC

There are situations where fuse servers can become unresponsive or take
too long to reply to a request. Currently there is no upper bound on
how long a request may take, which may be frustrating to users who get
stuck waiting for a request to complete.

This patchset adds a timeout option for requests and two dynamically
configurable fuse sysctls "default_request_timeout" and "max_request_timeout"
for controlling/enforcing timeout behavior system-wide.

Existing fuse servers will not be affected unless they explicitly opt into the
timeout.

v3: https://lore.kernel.org/linux-fsdevel/20240808190110.3188039-1-joannelkoong@gmail.com/
Changes from v3 -> v4:
- Fix wording on some comments to make it more clear
- Use simpler logic for timer (eg remove extra if checks, use mod timer API) (Josef)
- Sanity-check should be on FR_FINISHING not FR_FINISHED (Jingbo)
- Fix comment for "processing queue", add req->fpq = NULL safeguard  (Bernd)

v2: https://lore.kernel.org/linux-fsdevel/20240730002348.3431931-1-joannelkoong@gmail.com/
Changes from v2 -> v3:
- Disarm / rearm timer in dev_do_read to handle race conditions (Bernrd)
- Disarm timer in error handling for fatal interrupt (Yafang)
- Clean up do_fuse_request_end (Jingbo)
- Add timer for notify retrieve requests 
- Fix kernel test robot errors for #define no-op functions

v1: https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/
Changes from v1 -> v2:
- Add timeout for background requests
- Handle resend race condition
- Add sysctls

Joanne Koong (2):
  fuse: add optional kernel-enforced timeout for requests
  fuse: add default_request_timeout and max_request_timeout sysctls

 Documentation/admin-guide/sysctl/fs.rst |  17 +++
 fs/fuse/Makefile                        |   2 +-
 fs/fuse/dev.c                           | 192 +++++++++++++++++++++++-
 fs/fuse/fuse_i.h                        |  30 ++++
 fs/fuse/inode.c                         |  24 +++
 fs/fuse/sysctl.c                        |  42 ++++++
 6 files changed, 298 insertions(+), 9 deletions(-)
 create mode 100644 fs/fuse/sysctl.c

Yafang Shao Aug. 21, 2024, 2:01 a.m. UTC | #1

On Wed, Aug 14, 2024 at 7:23 AM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> There are situations where fuse servers can become unresponsive or take
> too long to reply to a request. Currently there is no upper bound on
> how long a request may take, which may be frustrating to users who get
> stuck waiting for a request to complete.
>
> This patchset adds a timeout option for requests and two dynamically
> configurable fuse sysctls "default_request_timeout" and "max_request_timeout"
> for controlling/enforcing timeout behavior system-wide.
>
> Existing fuse servers will not be affected unless they explicitly opt into the
> timeout.
>
> v3: https://lore.kernel.org/linux-fsdevel/20240808190110.3188039-1-joannelkoong@gmail.com/
> Changes from v3 -> v4:
> - Fix wording on some comments to make it more clear
> - Use simpler logic for timer (eg remove extra if checks, use mod timer API) (Josef)
> - Sanity-check should be on FR_FINISHING not FR_FINISHED (Jingbo)
> - Fix comment for "processing queue", add req->fpq = NULL safeguard  (Bernd)
>
> v2: https://lore.kernel.org/linux-fsdevel/20240730002348.3431931-1-joannelkoong@gmail.com/
> Changes from v2 -> v3:
> - Disarm / rearm timer in dev_do_read to handle race conditions (Bernrd)
> - Disarm timer in error handling for fatal interrupt (Yafang)
> - Clean up do_fuse_request_end (Jingbo)
> - Add timer for notify retrieve requests
> - Fix kernel test robot errors for #define no-op functions
>
> v1: https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/
> Changes from v1 -> v2:
> - Add timeout for background requests
> - Handle resend race condition
> - Add sysctls
>
> Joanne Koong (2):
>   fuse: add optional kernel-enforced timeout for requests
>   fuse: add default_request_timeout and max_request_timeout sysctls
>
>  Documentation/admin-guide/sysctl/fs.rst |  17 +++
>  fs/fuse/Makefile                        |   2 +-
>  fs/fuse/dev.c                           | 192 +++++++++++++++++++++++-
>  fs/fuse/fuse_i.h                        |  30 ++++
>  fs/fuse/inode.c                         |  24 +++
>  fs/fuse/sysctl.c                        |  42 ++++++
>  6 files changed, 298 insertions(+), 9 deletions(-)
>  create mode 100644 fs/fuse/sysctl.c
>
> --
> 2.43.5
>

For this series,

Tested-by: Yafang Shao <laoar.shao@gmail.com>

Miklos Szeredi Aug. 21, 2024, 1:47 p.m. UTC | #2

On Wed, 14 Aug 2024 at 01:23, Joanne Koong <joannelkoong@gmail.com> wrote:
>
> There are situations where fuse servers can become unresponsive or take
> too long to reply to a request. Currently there is no upper bound on
> how long a request may take, which may be frustrating to users who get
> stuck waiting for a request to complete.
>
> This patchset adds a timeout option for requests and two dynamically
> configurable fuse sysctls "default_request_timeout" and "max_request_timeout"
> for controlling/enforcing timeout behavior system-wide.
>
> Existing fuse servers will not be affected unless they explicitly opt into the
> timeout.

I sort of understand the motivation, but do not clearly see why this
is required.

A well written server will be able to do request timeouts properly,
without the kernel having to cut off requests mid flight without the
knowledge of the server.  The latter could even be dangerous because
locking guarantees previously provided by the kernel do not apply
anymore.

Can you please explain why this needs to be done by the client
(kernel) instead of the server (userspace)?

Thanks,
Miklos




>
> v3: https://lore.kernel.org/linux-fsdevel/20240808190110.3188039-1-joannelkoong@gmail.com/
> Changes from v3 -> v4:
> - Fix wording on some comments to make it more clear
> - Use simpler logic for timer (eg remove extra if checks, use mod timer API) (Josef)
> - Sanity-check should be on FR_FINISHING not FR_FINISHED (Jingbo)
> - Fix comment for "processing queue", add req->fpq = NULL safeguard  (Bernd)
>
> v2: https://lore.kernel.org/linux-fsdevel/20240730002348.3431931-1-joannelkoong@gmail.com/
> Changes from v2 -> v3:
> - Disarm / rearm timer in dev_do_read to handle race conditions (Bernrd)
> - Disarm timer in error handling for fatal interrupt (Yafang)
> - Clean up do_fuse_request_end (Jingbo)
> - Add timer for notify retrieve requests
> - Fix kernel test robot errors for #define no-op functions
>
> v1: https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/
> Changes from v1 -> v2:
> - Add timeout for background requests
> - Handle resend race condition
> - Add sysctls
>
> Joanne Koong (2):
>   fuse: add optional kernel-enforced timeout for requests
>   fuse: add default_request_timeout and max_request_timeout sysctls
>
>  Documentation/admin-guide/sysctl/fs.rst |  17 +++
>  fs/fuse/Makefile                        |   2 +-
>  fs/fuse/dev.c                           | 192 +++++++++++++++++++++++-
>  fs/fuse/fuse_i.h                        |  30 ++++
>  fs/fuse/inode.c                         |  24 +++
>  fs/fuse/sysctl.c                        |  42 ++++++
>  6 files changed, 298 insertions(+), 9 deletions(-)
>  create mode 100644 fs/fuse/sysctl.c
>
> --
> 2.43.5
>

Bernd Schubert Aug. 21, 2024, 2:15 p.m. UTC | #3

On 8/21/24 15:47, Miklos Szeredi wrote:
> On Wed, 14 Aug 2024 at 01:23, Joanne Koong <joannelkoong@gmail.com> wrote:
>>
>> There are situations where fuse servers can become unresponsive or take
>> too long to reply to a request. Currently there is no upper bound on
>> how long a request may take, which may be frustrating to users who get
>> stuck waiting for a request to complete.
>>
>> This patchset adds a timeout option for requests and two dynamically
>> configurable fuse sysctls "default_request_timeout" and "max_request_timeout"
>> for controlling/enforcing timeout behavior system-wide.
>>
>> Existing fuse servers will not be affected unless they explicitly opt into the
>> timeout.
> 
> I sort of understand the motivation, but do not clearly see why this
> is required.
> 
> A well written server will be able to do request timeouts properly,
> without the kernel having to cut off requests mid flight without the
> knowledge of the server.  The latter could even be dangerous because
> locking guarantees previously provided by the kernel do not apply
> anymore.
> 
> Can you please explain why this needs to be done by the client
> (kernel) instead of the server (userspace)?

I don't know about Joannes motivation, I see two use cases

- You suggested yourself that timeouts might be a (non-ideal) 
solution to avoid page copies on writeback
https://lore.kernel.org/linux-mm/CAJfpegt_mEYOeeTo2bWS3iJfC38t5bf29mzrxK68dhMptrgamg@mail.gmail.com/raw

- Would probably be a solution for this LXCFS issue
https://lore.kernel.org/lkml/a8d0c5da-6935-4d28-9380-68b84b8e6e72@shopee.com/
(although I don't fully understand the issue there yet)


Thanks,
Bernd

Miklos Szeredi Aug. 21, 2024, 2:25 p.m. UTC | #4

On Wed, 21 Aug 2024 at 16:15, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:

> - You suggested yourself that timeouts might be a (non-ideal)
> solution to avoid page copies on writeback
> https://lore.kernel.org/linux-mm/CAJfpegt_mEYOeeTo2bWS3iJfC38t5bf29mzrxK68dhMptrgamg@mail.gmail.com/raw

What I suggested is to copy the original writeback page to a tmp page
after a timeout.  That's not the same as aborting the request, though
the mechanism might be similar.

> - Would probably be a solution for this LXCFS issue
> https://lore.kernel.org/lkml/a8d0c5da-6935-4d28-9380-68b84b8e6e72@shopee.com/
> (although I don't fully understand the issue there yet)

Yeah, we first need to understand what's going on.

Thanks,
Miklos

Josef Bacik Aug. 21, 2024, 6:11 p.m. UTC | #5

On Wed, Aug 21, 2024 at 03:47:53PM +0200, Miklos Szeredi wrote:
> On Wed, 14 Aug 2024 at 01:23, Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > There are situations where fuse servers can become unresponsive or take
> > too long to reply to a request. Currently there is no upper bound on
> > how long a request may take, which may be frustrating to users who get
> > stuck waiting for a request to complete.
> >
> > This patchset adds a timeout option for requests and two dynamically
> > configurable fuse sysctls "default_request_timeout" and "max_request_timeout"
> > for controlling/enforcing timeout behavior system-wide.
> >
> > Existing fuse servers will not be affected unless they explicitly opt into the
> > timeout.
> 
> I sort of understand the motivation, but do not clearly see why this
> is required.
> 
> A well written server will be able to do request timeouts properly,
> without the kernel having to cut off requests mid flight without the
> knowledge of the server.  The latter could even be dangerous because
> locking guarantees previously provided by the kernel do not apply
> anymore.
> 
> Can you please explain why this needs to be done by the client
> (kernel) instead of the server (userspace)?
> 

"A well written server" is the key part here ;).  In our case we had a "well
written server" that ended up having a deadlock and we had to run around with a
drgn script to find those hung mounts and kill them manually.  The usecase here
is specifically for bugs in the FUSE server to allow us to cleanup automatically
with EIO's rather than a drgn script to figure out if the mount is hung.

It also gives us the opportunity to do the things that Bernd points out,
specifically remove the double buffering downside as we can trust that
eventually writeback will either succeed or timeout.  Thanks,

Josef

Miklos Szeredi Aug. 21, 2024, 6:54 p.m. UTC | #6

On Wed, 21 Aug 2024 at 20:11, Josef Bacik <josef@toxicpanda.com> wrote:

> "A well written server" is the key part here ;).  In our case we had a "well
> written server" that ended up having a deadlock and we had to run around with a
> drgn script to find those hung mounts and kill them manually.  The usecase here
> is specifically for bugs in the FUSE server to allow us to cleanup automatically
> with EIO's rather than a drgn script to figure out if the mount is hung.

So you 'd like to automatically abort the connection to an
unresponsive server?  I'm okay with that.

What I'm worried about is the unintended side effects of timed out
request without the server's knowledge (i.e. VFS locks released, then
new request takes VFS lock).   If the connection to the server is
aborted, then that's not an issue.

It's also much simpler to just time out any response from the server
(either read or write on /dev/fuse) than having to do per-request
timeouts.

> It also gives us the opportunity to do the things that Bernd points out,
> specifically remove the double buffering downside as we can trust that
> eventually writeback will either succeed or timeout.  Thanks,

Well see this explanation for how this might deadlock on a memory
allocation by the server:

 https://lore.kernel.org/all/CAJfpegsfF77SV96wvaxn9VnRkNt5FKCnA4mJ0ieFsZtwFeRuYw@mail.gmail.com/

Having a timeout would fix the deadlock, but it doesn't seem to me a
proper solution.

Thanks,
Miklos

Joanne Koong Aug. 21, 2024, 9:22 p.m. UTC | #7

On Wed, Aug 21, 2024 at 11:54 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Wed, 21 Aug 2024 at 20:11, Josef Bacik <josef@toxicpanda.com> wrote:
>
> > "A well written server" is the key part here ;).  In our case we had a "well
> > written server" that ended up having a deadlock and we had to run around with a
> > drgn script to find those hung mounts and kill them manually.  The usecase here
> > is specifically for bugs in the FUSE server to allow us to cleanup automatically
> > with EIO's rather than a drgn script to figure out if the mount is hung.
>
> So you 'd like to automatically abort the connection to an
> unresponsive server?  I'm okay with that.
>
> What I'm worried about is the unintended side effects of timed out
> request without the server's knowledge (i.e. VFS locks released, then
> new request takes VFS lock).   If the connection to the server is
> aborted, then that's not an issue.
>
> It's also much simpler to just time out any response from the server
> (either read or write on /dev/fuse) than having to do per-request
> timeouts.

In our case, the deadlock was triggered by invalidating the inode in
the middle of handling the write request. The server becomes stuck
since the inode invalidation (eg fuse_reverse_inval_inode())  is
attempting to acquire the folio lock but the lock was acquired when
servicing the write request (eg fuse_fill_write_pages()) and only gets
released after the server has replied to the write request (eg in
fuse_send_write_pages()).

Without a kernel enforced timeout, the only way out of this is to
abort the connection. A userspace timeout wouldn't help in this case
with getting the server unstuck. With the kernel timeout, this forces
the kernel handling of the write request to proceed, whihc will drop
the folio lock and resume the server back to a functioning state.

I don't think situations like this are uncommon. For example, it's not
obvious or clear to developers that fuse_lowlevel_notify_inval_inode()
shouldn't be called inside of a write handler in their server code.

I believe Yafang had a use case for this as well in
https://lore.kernel.org/linux-fsdevel/20240724071156.97188-1-laoar.shao@gmail.com/
where they were seeing fuse connections becoming indefinitely stuck.

For your concern about potential unintended side effects of timed out
requests without the server's knowledge, could you elaborate more on
the VFS locking example? In my mind, a request that times out is the
same thing as a request that behaves normally and completes with an
error code, but perhaps not?

I think also, having some way for system admins to enforce request
timeouts across the board might be useful as well - for example, if a
malignant fuse server doesn't reply to any requests, the requests hog
memory until the server is killed.

Thanks,
Joanne

>
> > It also gives us the opportunity to do the things that Bernd points out,
> > specifically remove the double buffering downside as we can trust that
> > eventually writeback will either succeed or timeout.  Thanks,
>
> Well see this explanation for how this might deadlock on a memory
> allocation by the server:
>
>  https://lore.kernel.org/all/CAJfpegsfF77SV96wvaxn9VnRkNt5FKCnA4mJ0ieFsZtwFeRuYw@mail.gmail.com/
>
> Having a timeout would fix the deadlock, but it doesn't seem to me a
> proper solution.
>
> Thanks,
> Miklos

Miklos Szeredi Aug. 22, 2024, 10:52 a.m. UTC | #8

On Wed, 21 Aug 2024 at 23:22, Joanne Koong <joannelkoong@gmail.com> wrote:

> Without a kernel enforced timeout, the only way out of this is to
> abort the connection. A userspace timeout wouldn't help in this case
> with getting the server unstuck. With the kernel timeout, this forces
> the kernel handling of the write request to proceed, whihc will drop
> the folio lock and resume the server back to a functioning state.
>
> I don't think situations like this are uncommon. For example, it's not
> obvious or clear to developers that fuse_lowlevel_notify_inval_inode()
> shouldn't be called inside of a write handler in their server code.

Documentation is definitely lacking.  In fact a simple rule is: never
call a notification function from within a request handling function.
Notifications are async events that should happen independently of
handling regular operations.  Anything else is an abuse of the
interface.

>
> For your concern about potential unintended side effects of timed out
> requests without the server's knowledge, could you elaborate more on
> the VFS locking example? In my mind, a request that times out is the
> same thing as a request that behaves normally and completes with an
> error code, but perhaps not?

- user calls mknod(2) on fuse directory
- VFS takes inode lock on parent directory
- calls into fuse to create the file
- fuse sends request to server
- file creation is slow and times out in the kernel
- fuse returns -ETIMEDOUT
- VFS releases inode lock
- meanwhile the server is still working on creating the file and has
no idea that something went wrong
- user calls the same mknod(2) again
- same things happen as last time
- server starts to create the file *again* knowing that the VFS takes
care of concurrency
- server crashes due to corruption


> I think also, having some way for system admins to enforce request
> timeouts across the board might be useful as well - for example, if a
> malignant fuse server doesn't reply to any requests, the requests hog
> memory until the server is killed.

As I said, I'm not against enforcing a response time for fuse servers,
as long as a timeout results in a complete abort and not just an error
on the timed out request.

Thanks,
Miklos

Joanne Koong Aug. 22, 2024, 5:31 p.m. UTC | #9

On Thu, Aug 22, 2024 at 3:52 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Wed, 21 Aug 2024 at 23:22, Joanne Koong <joannelkoong@gmail.com> wrote:
>
> > Without a kernel enforced timeout, the only way out of this is to
> > abort the connection. A userspace timeout wouldn't help in this case
> > with getting the server unstuck. With the kernel timeout, this forces
> > the kernel handling of the write request to proceed, whihc will drop
> > the folio lock and resume the server back to a functioning state.
> >
> > I don't think situations like this are uncommon. For example, it's not
> > obvious or clear to developers that fuse_lowlevel_notify_inval_inode()
> > shouldn't be called inside of a write handler in their server code.
>
> Documentation is definitely lacking.  In fact a simple rule is: never
> call a notification function from within a request handling function.
> Notifications are async events that should happen independently of
> handling regular operations.  Anything else is an abuse of the
> interface.
>
> >
> > For your concern about potential unintended side effects of timed out
> > requests without the server's knowledge, could you elaborate more on
> > the VFS locking example? In my mind, a request that times out is the
> > same thing as a request that behaves normally and completes with an
> > error code, but perhaps not?
>
> - user calls mknod(2) on fuse directory
> - VFS takes inode lock on parent directory
> - calls into fuse to create the file
> - fuse sends request to server
> - file creation is slow and times out in the kernel
> - fuse returns -ETIMEDOUT
> - VFS releases inode lock
> - meanwhile the server is still working on creating the file and has
> no idea that something went wrong
> - user calls the same mknod(2) again
> - same things happen as last time
> - server starts to create the file *again* knowing that the VFS takes
> care of concurrency
> - server crashes due to corruption

Thanks for the details.

For cases like these though, isn't the server already responsible for
handling errors properly to avoid potential corruption if their reply
to the request fails? In your example above, it seems like the server
would already need to have the error handling in place to roll back
the file creation if their fuse_reply_create() call returned an error
(eg -EIO if copying out args in the kernel had an issue). If the
request timed out, then the server would get back -ENOENT to their
reply.


Thanks,
Joanne

>
>
> > I think also, having some way for system admins to enforce request
> > timeouts across the board might be useful as well - for example, if a
> > malignant fuse server doesn't reply to any requests, the requests hog
> > memory until the server is killed.
>
> As I said, I'm not against enforcing a response time for fuse servers,
> as long as a timeout results in a complete abort and not just an error
> on the timed out request.
>
> Thanks,
> Miklos

Miklos Szeredi Aug. 22, 2024, 5:43 p.m. UTC | #10

On Thu, 22 Aug 2024 at 19:31, Joanne Koong <joannelkoong@gmail.com> wrote:

> For cases like these though, isn't the server already responsible for
> handling errors properly to avoid potential corruption if their reply
> to the request fails? In your example above, it seems like the server
> would already need to have the error handling in place to roll back
> the file creation if their fuse_reply_create() call returned an error
> (eg -EIO if copying out args in the kernel had an issue).

No, the server does not need to implement rollback, and does not in
fact need to check for the return value of the fuse_reply_create()
call unless it wants to mess with interrupts (not enabled by default).
See libfuse/lib/fuse.c where most of the fuse_replu_XXX() calls just
ignore the return value.

Thanks,
Miklos

Joanne Koong Aug. 22, 2024, 10:38 p.m. UTC | #11

On Thu, Aug 22, 2024 at 10:43 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Thu, 22 Aug 2024 at 19:31, Joanne Koong <joannelkoong@gmail.com> wrote:
>
> > For cases like these though, isn't the server already responsible for
> > handling errors properly to avoid potential corruption if their reply
> > to the request fails? In your example above, it seems like the server
> > would already need to have the error handling in place to roll back
> > the file creation if their fuse_reply_create() call returned an error
> > (eg -EIO if copying out args in the kernel had an issue).
>
> No, the server does not need to implement rollback, and does not in
> fact need to check for the return value of the fuse_reply_create()
> call unless it wants to mess with interrupts (not enabled by default).
> See libfuse/lib/fuse.c where most of the fuse_replu_XXX() calls just
> ignore the return value.

Ok, I see.

For v5, I will update this to abort the connection altogether if a
request times out.

Thanks,
Joanne
>
> Thanks,
> Miklos

Joanne Koong Aug. 26, 2024, 8:30 p.m. UTC | #12

On Tue, Aug 20, 2024 at 7:02 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Wed, Aug 14, 2024 at 7:23 AM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > There are situations where fuse servers can become unresponsive or take
> > too long to reply to a request. Currently there is no upper bound on
> > how long a request may take, which may be frustrating to users who get
> > stuck waiting for a request to complete.
> >
> > This patchset adds a timeout option for requests and two dynamically
> > configurable fuse sysctls "default_request_timeout" and "max_request_timeout"
> > for controlling/enforcing timeout behavior system-wide.
> >
> > Existing fuse servers will not be affected unless they explicitly opt into the
> > timeout.
> >
> > v3: https://lore.kernel.org/linux-fsdevel/20240808190110.3188039-1-joannelkoong@gmail.com/
> > Changes from v3 -> v4:
> > - Fix wording on some comments to make it more clear
> > - Use simpler logic for timer (eg remove extra if checks, use mod timer API) (Josef)
> > - Sanity-check should be on FR_FINISHING not FR_FINISHED (Jingbo)
> > - Fix comment for "processing queue", add req->fpq = NULL safeguard  (Bernd)
> >
> > v2: https://lore.kernel.org/linux-fsdevel/20240730002348.3431931-1-joannelkoong@gmail.com/
> > Changes from v2 -> v3:
> > - Disarm / rearm timer in dev_do_read to handle race conditions (Bernrd)
> > - Disarm timer in error handling for fatal interrupt (Yafang)
> > - Clean up do_fuse_request_end (Jingbo)
> > - Add timer for notify retrieve requests
> > - Fix kernel test robot errors for #define no-op functions
> >
> > v1: https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@gmail.com/
> > Changes from v1 -> v2:
> > - Add timeout for background requests
> > - Handle resend race condition
> > - Add sysctls
> >
> > Joanne Koong (2):
> >   fuse: add optional kernel-enforced timeout for requests
> >   fuse: add default_request_timeout and max_request_timeout sysctls
> >
> >  Documentation/admin-guide/sysctl/fs.rst |  17 +++
> >  fs/fuse/Makefile                        |   2 +-
> >  fs/fuse/dev.c                           | 192 +++++++++++++++++++++++-
> >  fs/fuse/fuse_i.h                        |  30 ++++
> >  fs/fuse/inode.c                         |  24 +++
> >  fs/fuse/sysctl.c                        |  42 ++++++
> >  6 files changed, 298 insertions(+), 9 deletions(-)
> >  create mode 100644 fs/fuse/sysctl.c
> >
> > --
> > 2.43.5
> >
>
> For this series,
>
> Tested-by: Yafang Shao <laoar.shao@gmail.com>

Thanks for testing this version. For v5, the behavior will be modified
(if a request times out, the connection will be aborted instead of
just the request being aborted) so I'll hold off on adding your
Tested-by sign-off until you explicitly give the ok on the newer
version.

Thanks,
Joanne

>
> --
> Regards
> Yafang

[v4,0/2] fuse: add timeout option for requests

Message

Comments