From patchwork Mon Mar 9 20:23:18 2020
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 11427967
From: Josef Bacik
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 1/5] btrfs: Improve global reserve stealing logic
Date: Mon, 9 Mar 2020 16:23:18 -0400
Message-Id: <20200309202322.12327-2-josef@toxicpanda.com>
In-Reply-To: <20200309202322.12327-1-josef@toxicpanda.com>

For unlink transactions and block group removal,
btrfs_start_transaction_fallback_global_rsv() first tries to start an
ordinary transaction and, if that fails, falls back to reserving the
required amount by stealing from the global reserve. This is sound in
theory, but the current code doesn't perform any locking or throttling,
so multiple concurrent unlink() callers can deplete the global reserve,
which results in ENOSPC.
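
To make the failure mode concrete, here is a toy user-space model, not kernel code. It contrasts a fallback that lets every caller drain a shared reserve with one that refuses to dip below a tenth of the reserve's size, the floor that div_factor(global_rsv->size, 1) computes in the diff. The struct and function names are hypothetical, and locking is left out on purpose.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a block reserve; not the kernel's struct btrfs_block_rsv. */
struct toy_rsv {
	uint64_t size;     /* target size of the reserve */
	uint64_t reserved; /* bytes currently held */
};

/* Unthrottled fallback (hypothetical): hands out bytes down to the last one. */
static bool toy_steal_unthrottled(struct toy_rsv *rsv, uint64_t bytes)
{
	if (rsv->reserved < bytes)
		return false;
	rsv->reserved -= bytes;
	return true;
}

/* Throttled variant (hypothetical): never dips below 10% of the size. */
static bool toy_steal_with_floor(struct toy_rsv *rsv, uint64_t bytes)
{
	uint64_t floor = rsv->size / 10;

	if (rsv->reserved < floor + bytes)
		return false;
	rsv->reserved -= bytes;
	return true;
}
```

With a reserve of 100 bytes and callers asking for 10 bytes each, the unthrottled version satisfies ten callers and leaves nothing behind, while the floored version refuses the tenth caller and keeps 10 bytes for the operations the reserve exists to protect.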

Fix this by introducing BTRFS_RESERVE_FLUSH_ALL_STEAL, which is used to
mark unlink reservations. The flushing machinery is modified to steal
from the global reserve when it sees such a reservation on the brink of
failure (in maybe_fail_all_tickets).

Signed-off-by: Josef Bacik
---
 fs/btrfs/block-group.c |  2 +-
 fs/btrfs/ctree.h       |  1 +
 fs/btrfs/inode.c       |  2 +-
 fs/btrfs/space-info.c  | 38 +++++++++++++++++++++++++++++++++++++-
 fs/btrfs/space-info.h  |  1 +
 fs/btrfs/transaction.c | 42 +++++-------------------------------------
 fs/btrfs/transaction.h |  3 +--
 7 files changed, 47 insertions(+), 42 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 60e9bb136f34..faa04093b6b5 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1171,7 +1171,7 @@ struct btrfs_trans_handle *btrfs_start_trans_remove_block_group(
 	free_extent_map(em);
 
 	return btrfs_start_transaction_fallback_global_rsv(fs_info->extent_root,
-							   num_items, 1);
+							   num_items);
 }
 
 /*
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2ccb2a090782..782c63f213e9 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2528,6 +2528,7 @@ enum btrfs_reserve_flush_enum {
 	BTRFS_RESERVE_FLUSH_DATA,
 	BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE,
 	BTRFS_RESERVE_FLUSH_ALL,
+	BTRFS_RESERVE_FLUSH_ALL_STEAL,
 };
 
 enum btrfs_flush_state {
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b8dabffac767..4e3b115ef1d7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3617,7 +3617,7 @@ static struct btrfs_trans_handle *__unlink_start_trans(struct inode *dir)
 	 * 1 for the inode ref
 	 * 1 for the inode
 	 */
-	return btrfs_start_transaction_fallback_global_rsv(root, 5, 5);
+	return btrfs_start_transaction_fallback_global_rsv(root, 5);
 }
 
 static int btrfs_unlink(struct inode *dir, struct dentry *dentry)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 26e1c492b9b5..9c9a4933f72b 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -810,6 +810,35 @@ static inline int need_do_async_reclaim(struct btrfs_fs_info *fs_info,
 		!test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state));
 }
 
+static bool steal_from_global_rsv(struct btrfs_fs_info *fs_info,
+				  struct btrfs_space_info *space_info,
+				  struct reserve_ticket *ticket)
+{
+	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
+	u64 min_bytes;
+
+	if (global_rsv->space_info != space_info)
+		return false;
+
+	spin_lock(&global_rsv->lock);
+	min_bytes = div_factor(global_rsv->size, 1);
+	if (global_rsv->reserved < min_bytes + ticket->bytes) {
+		spin_unlock(&global_rsv->lock);
+		return false;
+	}
+	global_rsv->reserved -= ticket->bytes;
+	ticket->bytes = 0;
+	trace_printk("Satisfied ticket from global rsv\n");
+	list_del_init(&ticket->list);
+	wake_up(&ticket->wait);
+	space_info->tickets_id++;
+	if (global_rsv->reserved < global_rsv->size)
+		global_rsv->full = 0;
+	spin_unlock(&global_rsv->lock);
+
+	return true;
+}
+
 /*
  * maybe_fail_all_tickets - we've exhausted our flushing, start failing tickets
  * @fs_info - fs_info for this fs
@@ -842,6 +871,10 @@ static bool maybe_fail_all_tickets(struct btrfs_fs_info *fs_info,
 		ticket = list_first_entry(&space_info->tickets,
 					  struct reserve_ticket, list);
 
+		if (ticket->steal &&
+		    steal_from_global_rsv(fs_info, space_info, ticket))
+			return true;
+
 		/*
 		 * may_commit_transaction will avoid committing the transaction
 		 * if it doesn't feel like the space reclaimed by the commit
@@ -1195,6 +1228,7 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 	switch (flush) {
 	case BTRFS_RESERVE_FLUSH_DATA:
 	case BTRFS_RESERVE_FLUSH_ALL:
+	case BTRFS_RESERVE_FLUSH_ALL_STEAL:
 		wait_reserve_ticket(fs_info, space_info, ticket);
 		break;
 	case BTRFS_RESERVE_FLUSH_LIMIT:
@@ -1300,8 +1334,10 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 		ticket.bytes = orig_bytes;
 		ticket.error = 0;
 		init_waitqueue_head(&ticket.wait);
+		ticket.steal = (flush == BTRFS_RESERVE_FLUSH_ALL_STEAL);
 		if (flush == BTRFS_RESERVE_FLUSH_ALL ||
-		    flush == BTRFS_RESERVE_FLUSH_DATA) {
+		    flush == BTRFS_RESERVE_FLUSH_DATA ||
+		    flush == BTRFS_RESERVE_FLUSH_ALL_STEAL) {
 			list_add_tail(&ticket.list, &space_info->tickets);
 			if (!space_info->flush) {
 				space_info->flush = 1;
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index 179f757c4a6b..a7f600efb772 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -71,6 +71,7 @@ struct btrfs_space_info {
 struct reserve_ticket {
 	u64 bytes;
 	int error;
+	bool steal;
 	struct list_head list;
 	wait_queue_head_t wait;
 };
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 53af0f55f5f9..d171fd52c82b 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -559,7 +559,8 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
 		 * refill that amount for whatever is missing in the reserve.
 		 */
 		num_bytes = btrfs_calc_insert_metadata_size(fs_info, num_items);
-		if (delayed_refs_rsv->full == 0) {
+		if (flush == BTRFS_RESERVE_FLUSH_ALL &&
+		    delayed_refs_rsv->full == 0) {
 			delayed_refs_bytes = num_bytes;
 			num_bytes <<= 1;
 		}
@@ -686,43 +687,10 @@ struct btrfs_trans_handle *btrfs_start_transaction(struct btrfs_root *root,
 
 struct btrfs_trans_handle *btrfs_start_transaction_fallback_global_rsv(
 					struct btrfs_root *root,
-					unsigned int num_items,
-					int min_factor)
+					unsigned int num_items)
 {
-	struct btrfs_fs_info *fs_info = root->fs_info;
-	struct btrfs_trans_handle *trans;
-	u64 num_bytes;
-	int ret;
-
-	/*
-	 * We have two callers: unlink and block group removal.  The
-	 * former should succeed even if we will temporarily exceed
-	 * quota and the latter operates on the extent root so
-	 * qgroup enforcement is ignored anyway.
-	 */
-	trans = start_transaction(root, num_items, TRANS_START,
-				  BTRFS_RESERVE_FLUSH_ALL, false);
-	if (!IS_ERR(trans) || PTR_ERR(trans) != -ENOSPC)
-		return trans;
-
-	trans = btrfs_start_transaction(root, 0);
-	if (IS_ERR(trans))
-		return trans;
-
-	num_bytes = btrfs_calc_insert_metadata_size(fs_info, num_items);
-	ret = btrfs_cond_migrate_bytes(fs_info, &fs_info->trans_block_rsv,
-				       num_bytes, min_factor);
-	if (ret) {
-		btrfs_end_transaction(trans);
-		return ERR_PTR(ret);
-	}
-
-	trans->block_rsv = &fs_info->trans_block_rsv;
-	trans->bytes_reserved = num_bytes;
-	trace_btrfs_space_reservation(fs_info, "transaction",
-				      trans->transid, num_bytes, 1);
-
-	return trans;
+	return start_transaction(root, num_items, TRANS_START,
+				 BTRFS_RESERVE_FLUSH_ALL_STEAL, false);
 }
 
 struct btrfs_trans_handle *btrfs_join_transaction(struct btrfs_root *root)
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 453cea7c7a72..228e8b560e42 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -192,8 +192,7 @@ struct btrfs_trans_handle *btrfs_start_transaction(struct btrfs_root *root,
 						   unsigned int num_items);
 struct btrfs_trans_handle *btrfs_start_transaction_fallback_global_rsv(
 					struct btrfs_root *root,
-					unsigned int num_items,
-					int min_factor);
+					unsigned int num_items);
 struct btrfs_trans_handle *btrfs_join_transaction(struct btrfs_root *root);
 struct btrfs_trans_handle *btrfs_join_transaction_spacecache(struct btrfs_root *root);
 struct btrfs_trans_handle *btrfs_join_transaction_nostart(struct btrfs_root *root);

From patchwork Mon Mar 9 20:23:19 2020
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 11427971
From: Josef Bacik
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 2/5] btrfs: Account for trans_block_rsv in may_commit_transaction
Date: Mon, 9 Mar 2020 16:23:19 -0400
Message-Id: <20200309202322.12327-3-josef@toxicpanda.com>
In-Reply-To: <20200309202322.12327-1-josef@toxicpanda.com>

On ppc64le with a 64k page size (and 64k block size), generic/320 was
failing, and debug output showed we were getting a premature ENOSPC with
a bunch of space still in btrfs_fs_info::trans_block_rsv. This meant
there were still open transaction handles holding space, yet the flusher
didn't commit the transaction because it deemed that the freed space
wouldn't be enough to satisfy the current reserve ticket.

Fix this by accounting for the space in trans_block_rsv when deciding
whether the current transaction should be committed.
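
The accounting change can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel function: the struct and function names are hypothetical, only the three reserve names mirror the kernel fields touched by this patch, and the real may_commit_transaction() performs further checks that are omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the bytes held in each reserve. */
struct toy_reserves {
	uint64_t delayed_rsv;      /* models delayed_block_rsv->reserved */
	uint64_t delayed_refs_rsv; /* models delayed_refs_rsv->reserved */
	uint64_t trans_rsv;        /* models trans_block_rsv->reserved */
};

/*
 * Decide whether a commit would reclaim enough for the waiting ticket.
 * count_trans_rsv selects the behaviour before (false) or after (true)
 * the change described above.
 */
static bool toy_should_commit(const struct toy_reserves *r,
			      uint64_t bytes_needed, bool count_trans_rsv)
{
	uint64_t reclaim_bytes = r->delayed_rsv + r->delayed_refs_rsv;

	if (count_trans_rsv)
		reclaim_bytes += r->trans_rsv;

	return reclaim_bytes >= bytes_needed;
}
```

In this model, a ticket that needs 8 bytes while the two delayed reserves hold 3 bytes and trans_rsv holds 10 is refused a commit under the old accounting and granted one under the new.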

Signed-off-by: Josef Bacik
Reviewed-by: Nikolay Borisov
---
 fs/btrfs/space-info.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 9c9a4933f72b..8d00a9ee9458 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -575,6 +575,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	struct reserve_ticket *ticket = NULL;
 	struct btrfs_block_rsv *delayed_rsv = &fs_info->delayed_block_rsv;
 	struct btrfs_block_rsv *delayed_refs_rsv = &fs_info->delayed_refs_rsv;
+	struct btrfs_block_rsv *trans_rsv = &fs_info->trans_block_rsv;
 	struct btrfs_trans_handle *trans;
 	u64 reclaim_bytes = 0;
 	u64 bytes_needed;
@@ -637,6 +638,11 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	spin_lock(&delayed_refs_rsv->lock);
 	reclaim_bytes += delayed_refs_rsv->reserved;
 	spin_unlock(&delayed_refs_rsv->lock);
+
+	spin_lock(&trans_rsv->lock);
+	reclaim_bytes += trans_rsv->reserved;
+	spin_unlock(&trans_rsv->lock);
+
 	if (reclaim_bytes >= bytes_needed)
 		goto commit;
 	bytes_needed -= reclaim_bytes;

From patchwork Mon Mar 9 20:23:20 2020
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 11427969
From: Josef Bacik
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 3/5] btrfs: only take normal tickets into account in may_commit_transaction
Date: Mon, 9 Mar 2020 16:23:20 -0400
Message-Id: <20200309202322.12327-4-josef@toxicpanda.com>
In-Reply-To: <20200309202322.12327-1-josef@toxicpanda.com>

While debugging a generic/320 failure on ppc64, Nikolay noticed that
sometimes we'd ENOSPC out with plenty of space to reclaim if we had
committed the transaction. He further discovered that this was because
there was a priority ticket that was small enough to fit in the free
space currently in the space_info.

While that is a problem by itself, it exposed another flaw: we consider
priority tickets in may_commit_transaction. Priority tickets are not
allowed to commit the transaction, so we shouldn't consider them in
may_commit_transaction at all. Instead we need to consider only the
current normal tickets. With this fix in place, we properly commit the
transaction.

Signed-off-by: Josef Bacik
---
 fs/btrfs/space-info.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 8d00a9ee9458..d198cfd45cf7 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -592,10 +592,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	else
 		cur_free_bytes = 0;
 
-	if (!list_empty(&space_info->priority_tickets))
-		ticket = list_first_entry(&space_info->priority_tickets,
-					  struct reserve_ticket, list);
-	else if (!list_empty(&space_info->tickets))
+	if (!list_empty(&space_info->tickets))
 		ticket = list_first_entry(&space_info->tickets,
 					  struct reserve_ticket, list);
 	bytes_needed = (ticket) ? ticket->bytes : 0;

From patchwork Mon Mar 9 20:23:21 2020
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 11427973
From: Josef Bacik
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 4/5] btrfs: only check priority tickets for priority flushing
Date: Mon, 9 Mar 2020 16:23:21 -0400
Message-Id: <20200309202322.12327-5-josef@toxicpanda.com>
In-Reply-To: <20200309202322.12327-1-josef@toxicpanda.com>

While debugging a generic/320 failure on ppc64, Nikolay noticed that
sometimes we'd ENOSPC out with plenty of space to reclaim if we had
committed the transaction. He further discovered that this was because
there was a priority ticket that was small enough to fit in the free
space currently in the space_info.

This is problematic because we prioritize priority tickets, refilling
them first as new space becomes available.
However, this leaves a corner case where we could fail to satisfy a
priority ticket when we would otherwise have succeeded.

Consider the case where there's no flushing left to do other than
committing the transaction, and there are tickets on the normal flushing
list. A priority flusher comes in, and assume there's enough space left
in the space_info to satisfy its request. We will still be added to the
priority list and go through the flushing motions, and eventually fail,
returning ENOSPC.

Instead we should only add ourselves to the list if there's something on
the priority_tickets list already. This way we avoid the incorrect
ENOSPC scenario.

Signed-off-by: Josef Bacik
---
 fs/btrfs/space-info.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index d198cfd45cf7..77ea204f0b6a 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1276,6 +1276,17 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 	return ret;
 }
 
+/*
+ * This returns true if this flush state will go through the ordinary flushing
+ * code.
+ */
+static inline bool is_normal_flushing(enum btrfs_reserve_flush_enum flush)
+{
+	return (flush == BTRFS_RESERVE_FLUSH_DATA) ||
+	       (flush == BTRFS_RESERVE_FLUSH_ALL) ||
+	       (flush == BTRFS_RESERVE_FLUSH_ALL_STEAL);
+}
+
 /**
  * reserve_metadata_bytes - try to reserve bytes from the block_rsv's space
  * @root - the root we're allocating for
@@ -1311,8 +1322,17 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 	spin_lock(&space_info->lock);
 	ret = -ENOSPC;
 	used = btrfs_space_info_used(space_info, true);
-	pending_tickets = !list_empty(&space_info->tickets) ||
-		!list_empty(&space_info->priority_tickets);
+
+	/*
+	 * We don't want NO_FLUSH allocations to jump everybody, they can
+	 * generally handle ENOSPC in a different way, so treat them the same as
+	 * normal flushers when it comes to skipping pending tickets.
+	 */
+	if (is_normal_flushing(flush) || (flush == BTRFS_RESERVE_NO_FLUSH))
+		pending_tickets = !list_empty(&space_info->tickets) ||
+			!list_empty(&space_info->priority_tickets);
+	else
+		pending_tickets = !list_empty(&space_info->priority_tickets);
 
 	/*
 	 * Carry on if we have enough space (short-circuit) OR call
@@ -1338,9 +1358,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 		ticket.error = 0;
 		init_waitqueue_head(&ticket.wait);
 		ticket.steal = (flush == BTRFS_RESERVE_FLUSH_ALL_STEAL);
-		if (flush == BTRFS_RESERVE_FLUSH_ALL ||
-		    flush == BTRFS_RESERVE_FLUSH_DATA ||
-		    flush == BTRFS_RESERVE_FLUSH_ALL_STEAL) {
+		if (is_normal_flushing(flush)) {
 			list_add_tail(&ticket.list, &space_info->tickets);
 			if (!space_info->flush) {
 				space_info->flush = 1;

From patchwork Mon Mar 9 20:23:22 2020
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 11427975
From: Josef Bacik
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 5/5] btrfs: run btrfs_try_granting_tickets if a priority ticket fails
Date: Mon, 9 Mar 2020 16:23:22 -0400
Message-Id: <20200309202322.12327-6-josef@toxicpanda.com>
In-Reply-To: <20200309202322.12327-1-josef@toxicpanda.com>

With normal tickets we could have a large reservation at the front of
the list that cannot be satisfied, but a smaller ticket later on that
can be. The way we handle this is to run btrfs_try_granting_tickets() in
maybe_fail_all_tickets().

However, no such protection exists for priority tickets. Fix this by
handling it in handle_reserve_ticket(). If we've returned after
attempting to flush space in a priority-related way, we'll still be on
the priority list and need to be removed.

We rely on the flushing to free up space and wake the ticket, but if
there is not enough space to reclaim _but_ there's enough space in the
space_info to handle subsequent reservations, then those reservations
would have gotten an ENOSPC erroneously.

Address this by catching the case where we are still on the list, meaning
we were a priority ticket, removing ourselves, and then running
btrfs_try_granting_tickets(). This handles this particular corner case.

Signed-off-by: Josef Bacik
---
 fs/btrfs/space-info.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 77ea204f0b6a..03172ecd9c0b 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1256,11 +1256,17 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 	ret = ticket->error;
 	if (ticket->bytes || ticket->error) {
 		/*
-		 * Need to delete here for priority tickets. For regular tickets
-		 * either the async reclaim job deletes the ticket from the list
-		 * or we delete it ourselves at wait_reserve_ticket().
+		 * We were a priority ticket, so we need to delete ourselves
+		 * from the list.  Because we could have other priority tickets
+		 * behind us that require less space, run
+		 * btrfs_try_granting_tickets() to see if their reservations can
+		 * now be made.
 		 */
-		list_del_init(&ticket->list);
+		if (!list_empty(&ticket->list)) {
+			list_del_init(&ticket->list);
+			btrfs_try_granting_tickets(fs_info, space_info);
+		}
+
 		if (!ret)
 			ret = -ENOSPC;
 	}
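
The corner case handled by this last patch can be modelled in user space roughly as follows. This is a sketch under stated assumptions rather than the kernel implementation: the queue is a plain array instead of a kernel list, all names prefixed with toy_ are hypothetical, and the granting loop simply stops at the first waiting ticket that does not fit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy ticket; the real struct reserve_ticket sits on a kernel list. */
struct toy_ticket {
	uint64_t bytes; /* bytes still needed; 0 once granted */
	bool failed;    /* set when the ticket gave up with ENOSPC */
};

/*
 * Grant tickets in queue order while they fit, stopping at the first
 * waiting ticket that does not. Returns the number of tickets granted.
 */
static size_t toy_try_granting(struct toy_ticket *q, size_t n,
			       uint64_t *free_bytes)
{
	size_t granted = 0;

	for (size_t i = 0; i < n; i++) {
		if (q[i].failed || q[i].bytes == 0)
			continue; /* no longer waiting */
		if (q[i].bytes > *free_bytes)
			break;    /* head of the queue blocks the rest */
		*free_bytes -= q[i].bytes;
		q[i].bytes = 0;
		granted++;
	}
	return granted;
}

/* Fail one ticket, then give the tickets queued behind it another chance. */
static size_t toy_fail_and_regrant(struct toy_ticket *q, size_t n, size_t idx,
				   uint64_t *free_bytes)
{
	q[idx].failed = true;
	return toy_try_granting(q, n, free_bytes);
}
```

With 40 free bytes and a queue of tickets needing 100, 10 and 20 bytes, nothing is granted while the 100-byte ticket heads the queue. Once it fails and the granting pass is rerun, the two smaller tickets are satisfied instead of eventually failing as well.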