[2/2] FUSE: fix congested state leak on aborted connections

Message ID 20180202175414.GM1121507@devbig577.frc2.facebook.com (mailing list archive)
State New, archived

Commit Message

Tejun Heo Feb. 2, 2018, 5:54 p.m. UTC
If a connection gets aborted while congested, FUSE can leave
nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
wait spuriously which can lead to severe performance degradation.

The leak is caused by gating congestion state clearing with
fc->connected test in request_end().  This was added way back in 2009
by 26c3679101db ("fuse: destroy bdi on umount").  While the commit
description doesn't explain why the test was added, it most likely was
to avoid dereferencing bdi after it got destroyed.

Since then, bdi lifetime rules have changed many times and now we're
always guaranteed to have access to the bdi while the superblock is
alive (fc->sb).

Drop fc->connected conditional to avoid leaking congestion states.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Joshua Miller <joshmiller@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
---
 fs/fuse/dev.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Jan Kara Feb. 6, 2018, 4:25 p.m. UTC | #1
On Fri 02-02-18 09:54:14, Tejun Heo wrote:
> If a connection gets aborted while congested, FUSE can leave
> nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
> wait spuriously which can lead to severe performance degradation.
> 
> The leak is caused by gating congestion state clearing with
> fc->connected test in request_end().  This was added way back in 2009
> by 26c3679101db ("fuse: destroy bdi on umount").  While the commit
> description doesn't explain why the test was added, it most likely was
> to avoid dereferencing bdi after it got destroyed.
> 
> Since then, bdi lifetime rules have changed many times and now we're
> always guaranteed to have access to the bdi while the superblock is
> alive (fc->sb).
> 
> Drop fc->connected conditional to avoid leaking congestion states.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Joshua Miller <joshmiller@fb.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Jan Kara <jack@suse.cz>
> Cc: stable@vger.kernel.org

Yeah, this should be fine AFAICT but my knowledge of FUSE is very cursory.
Anyway:

Acked-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/fuse/dev.c |    3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -381,8 +381,7 @@ static void request_end(struct fuse_conn
>  		if (!fc->blocked && waitqueue_active(&fc->blocked_waitq))
>  			wake_up(&fc->blocked_waitq);
>  
> -		if (fc->num_background == fc->congestion_threshold &&
> -		    fc->connected && fc->sb) {
> +		if (fc->num_background == fc->congestion_threshold && fc->sb) {
>  			clear_bdi_congested(fc->sb->s_bdi, BLK_RW_SYNC);
>  			clear_bdi_congested(fc->sb->s_bdi, BLK_RW_ASYNC);
>  		}

Miklos Szeredi May 30, 2018, 2:22 p.m. UTC | #2
On Tue, Feb 6, 2018 at 5:25 PM, Jan Kara <jack@suse.cz> wrote:
> On Fri 02-02-18 09:54:14, Tejun Heo wrote:
>> If a connection gets aborted while congested, FUSE can leave
>> nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
>> wait spuriously which can lead to severe performance degradation.
>>
>> The leak is caused by gating congestion state clearing with
>> fc->connected test in request_end().  This was added way back in 2009
>> by 26c3679101db ("fuse: destroy bdi on umount").  While the commit
>> description doesn't explain why the test was added, it most likely was
>> to avoid dereferencing bdi after it got destroyed.
>>
>> Since then, bdi lifetime rules have changed many times and now we're
>> always guaranteed to have access to the bdi while the superblock is
>> alive (fc->sb).
>>
>> Drop fc->connected conditional to avoid leaking congestion states.
>>
>> Signed-off-by: Tejun Heo <tj@kernel.org>
>> Reported-by: Joshua Miller <joshmiller@fb.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Miklos Szeredi <miklos@szeredi.hu>
>> Cc: Jan Kara <jack@suse.cz>
>> Cc: stable@vger.kernel.org
>
> Yeah, this should be fine AFAICT but my knowledge of FUSE is very cursory.
> Anyway:
>
> Acked-by: Jan Kara <jack@suse.cz>

Can't say I fully understand how the global "is any bdi congested"
state is used in direct reclaim, but the patch is an obvious
improvement, so applied.

Thanks,
Miklos

Patch

--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -381,8 +381,7 @@  static void request_end(struct fuse_conn
 		if (!fc->blocked && waitqueue_active(&fc->blocked_waitq))
 			wake_up(&fc->blocked_waitq);
 
-		if (fc->num_background == fc->congestion_threshold &&
-		    fc->connected && fc->sb) {
+		if (fc->num_background == fc->congestion_threshold && fc->sb) {
 			clear_bdi_congested(fc->sb->s_bdi, BLK_RW_SYNC);
 			clear_bdi_congested(fc->sb->s_bdi, BLK_RW_ASYNC);
 		}