From patchwork Mon Jul  6 07:44:33 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo <wqu@suse.com>
X-Patchwork-Id: 11645129
Return-Path: <SRS0=Izu+=AR=vger.kernel.org=linux-btrfs-owner@kernel.org>
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org
 [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 18DB560D
	for <patchwork-linux-btrfs@patchwork.kernel.org>;
 Mon,  6 Jul 2020 07:44:43 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 08EEC2074F
	for <patchwork-linux-btrfs@patchwork.kernel.org>;
 Mon,  6 Jul 2020 07:44:42 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1728797AbgGFHom (ORCPT
        <rfc822;patchwork-linux-btrfs@patchwork.kernel.org>);
        Mon, 6 Jul 2020 03:44:42 -0400
Received: from mx2.suse.de ([195.135.220.15]:44202 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1728248AbgGFHol (ORCPT <rfc822;linux-btrfs@vger.kernel.org>);
        Mon, 6 Jul 2020 03:44:41 -0400
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.221.27])
        by mx2.suse.de (Postfix) with ESMTP id 238B5AEA8
        for <linux-btrfs@vger.kernel.org>;
 Mon,  6 Jul 2020 07:44:41 +0000 (UTC)
From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH RFC 0/2] btrfs: make ticket wait uninterruptible to address
 unexpected RO during balance
Date: Mon,  6 Jul 2020 15:44:33 +0800
Message-Id: <20200706074435.52356-1-wqu@suse.com>
X-Mailer: git-send-email 2.27.0
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-btrfs.vger.kernel.org>
X-Mailing-List: linux-btrfs@vger.kernel.org

There is a report that, unlucky signal timing during balance can cause
btrfs to remounted into RO mode.

This is caused by the fact that, most btrfs_start_transaction() or
delalloc metadata reserve are interruptible.

That would return -EINTR to a lot of critical code section, and under
most case, our way to handle such error is just to abort transaction,
without any consideration for -EINTR.

This is never a good idea to allow random Ctrl-C to make btrfs RO, even
if the window is pretty small for regular operations.

This patchset will address it in a different direction, since most
operations are pretty fast, we don't need that signal check in waiting
ticket.

For those long running operations, signal should be checked in their
call sites.
E.g. __generic_block_fiemap() calls fatal_signal_pending() to check if
it needs to exit, so does btrfs_clone().

We shouldn't check the signal, and just throw a -EINTR for all ticketing
system callers, they don't really want to handle that damn -EINTR.

Only long executing operations really need that signal checking, and let
them to check, not the infrastructure.

Reason for RFC:
I'm not yet completely sure if uninterruptible ticketing system would
cause extra problems.
Any advice on that would be great.

Qu Wenruo (2):
  btrfs: relocation: Allow signal to cancel balance
  btrfs: space-info: Don't allow signal to interrupt ticket waiting

 fs/btrfs/relocation.c | 3 ++-
 fs/btrfs/space-info.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)