
[v2,1/3] md: Fix undefined behaviour in is_mddev_idle

Message ID 20211210051707.2202646-2-lijinlin3@huawei.com (mailing list archive)
State New, archived
Headers show
Series Fix undefined behaviour during device synchronization

Commit Message

Li Jinlin Dec. 10, 2021, 5:17 a.m. UTC
UBSAN reports this problem:

[ 5984.281385] UBSAN: Undefined behaviour in drivers/md/md.c:8175:15
[ 5984.281390] signed integer overflow:
[ 5984.281393] -2147483291 - 2072033152 cannot be represented in type 'int'
[ 5984.281400] CPU: 25 PID: 1854 Comm: md101_resync Kdump: loaded Not tainted 4.19.90
[ 5984.281404] Hardware name: Huawei TaiShan 200 (Model 5280)/BC82AMDDA
[ 5984.281406] Call trace:
[ 5984.281415]  dump_backtrace+0x0/0x310
[ 5984.281418]  show_stack+0x28/0x38
[ 5984.281425]  dump_stack+0xec/0x15c
[ 5984.281430]  ubsan_epilogue+0x18/0x84
[ 5984.281434]  handle_overflow+0x14c/0x19c
[ 5984.281439]  __ubsan_handle_sub_overflow+0x34/0x44
[ 5984.281445]  is_mddev_idle+0x338/0x3d8
[ 5984.281449]  md_do_sync+0x1bb8/0x1cf8
[ 5984.281452]  md_thread+0x220/0x288
[ 5984.281457]  kthread+0x1d8/0x1e0
[ 5984.281461]  ret_from_fork+0x10/0x18

When the stat accum of the disk is greater than INT_MAX, its
value becomes negative after being cast to 'int', and subtracting
a positive number from it may then overflow. In the same way,
when the value of sync_io is greater than INT_MAX, the
subtraction may also overflow. Both situations lead to
undefined behaviour.
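
A minimal userspace sketch of the failure (illustrative constants only,
chosen to match the report above; not the kernel code itself):

  #include <limits.h>
  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          /* accumulated sector count just past INT_MAX */
          uint64_t stat_accum = (uint64_t)INT_MAX + 358;  /* 2147484005 */
          int sync_io = 2072033152;              /* sync_io counter value */

          /* old code: on a two's-complement machine the cast yields
           * -2147483291, and subtracting a large positive number then
           * drops below INT_MIN -- signed overflow, undefined behaviour */
          int curr_events_old = (int)stat_accum - sync_io;

          /* fixed code: do the subtraction in 64 bits, no overflow */
          int64_t curr_events_new = (int64_t)stat_accum - sync_io;

          printf("old: %d  new: %lld\n",
                 curr_events_old, (long long)curr_events_new);
          return 0;
  }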

Moreover, if the stat accum of the disk is close to INT_MAX
when the raid array is created, the initial value of last_events
is set close to INT_MAX when mddev initializes its IO
event counters. 'curr_events - rdev->last_events > 64' will
then always be false during synchronization. If all the disks
of the mddev are in this state, is_mddev_idle() always returns 1,
which can make non-sync IO very slow.
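
For reference, the check this refers to sits in the rdev loop of
is_mddev_idle() and looks roughly like this before the patch
(paraphrased, comments trimmed):

  rdev_for_each_rcu(rdev, mddev) {
          struct gendisk *disk = rdev->bdev->bd_disk;
          curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
                        atomic_read(&disk->sync_io);
          /* ... */
          if (init || curr_events - rdev->last_events > 64) {
                  rdev->last_events = curr_events;
                  idle = 0;
          }
  }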

To address these problems, use a 64-bit signed integer
type for sync_io, last_events, and curr_events.

Of all the in-tree drivers, only the md driver currently uses
the sync_io field in struct gendisk. It fits better in
struct md_rdev, so add a sync_io field to struct md_rdev and
use it instead. md_sync_acct() and md_sync_acct_bio() are
modified to match. md_sync_acct_bio() needs access to the
rdev, so we stash the rdev pointer in bio->bi_bdev before
calling it, and the function restores bio->bi_bdev to the
real block device.

Signed-off-by: Li Jinlin <lijinlin3@huawei.com>
---
 drivers/md/md.c       |  6 +++---
 drivers/md/md.h       | 13 +++++++++----
 drivers/md/raid1.c    |  4 ++--
 drivers/md/raid10.c   | 24 ++++++++++++------------
 drivers/md/raid5.c    |  4 ++--
 include/linux/genhd.h |  1 -
 6 files changed, 28 insertions(+), 24 deletions(-)

Comments

Hannes Reinecke Dec. 10, 2021, 6:45 a.m. UTC | #1
On 12/10/21 6:17 AM, Li Jinlin wrote:
> UBSAN reports this problem:
> 
> [ 5984.281385] UBSAN: Undefined behaviour in drivers/md/md.c:8175:15
> [ 5984.281390] signed integer overflow:
> [ 5984.281393] -2147483291 - 2072033152 cannot be represented in type 'int'
> [ 5984.281400] CPU: 25 PID: 1854 Comm: md101_resync Kdump: loaded Not tainted 4.19.90
> [ 5984.281404] Hardware name: Huawei TaiShan 200 (Model 5280)/BC82AMDDA
> [ 5984.281406] Call trace:
> [ 5984.281415]  dump_backtrace+0x0/0x310
> [ 5984.281418]  show_stack+0x28/0x38
> [ 5984.281425]  dump_stack+0xec/0x15c
> [ 5984.281430]  ubsan_epilogue+0x18/0x84
> [ 5984.281434]  handle_overflow+0x14c/0x19c
> [ 5984.281439]  __ubsan_handle_sub_overflow+0x34/0x44
> [ 5984.281445]  is_mddev_idle+0x338/0x3d8
> [ 5984.281449]  md_do_sync+0x1bb8/0x1cf8
> [ 5984.281452]  md_thread+0x220/0x288
> [ 5984.281457]  kthread+0x1d8/0x1e0
> [ 5984.281461]  ret_from_fork+0x10/0x18
> 
> When the stat accum of the disk is greater than INT_MAX, its
> value becomes negative after being cast to 'int', and subtracting
> a positive number from it may then overflow. In the same way,
> when the value of sync_io is greater than INT_MAX, the
> subtraction may also overflow. Both situations lead to
> undefined behaviour.
> 
> Moreover, if the stat accum of the disk is close to INT_MAX
> when the raid array is created, the initial value of last_events
> is set close to INT_MAX when mddev initializes its IO
> event counters. 'curr_events - rdev->last_events > 64' will
> then always be false during synchronization. If all the disks
> of the mddev are in this state, is_mddev_idle() always returns 1,
> which can make non-sync IO very slow.
> 
> To address these problems, use a 64-bit signed integer
> type for sync_io, last_events, and curr_events.
> 
> Of all the in-tree drivers, only the md driver currently uses
> the sync_io field in struct gendisk. It fits better in
> struct md_rdev, so add a sync_io field to struct md_rdev and
> use it instead. md_sync_acct() and md_sync_acct_bio() are
> modified to match. md_sync_acct_bio() needs access to the
> rdev, so we stash the rdev pointer in bio->bi_bdev before
> calling it, and the function restores bio->bi_bdev to the
> real block device.
> 
Please make that two patches, one for moving sync_io and one for
fixing the undefined behaviour.

> Signed-off-by: Li Jinlin <lijinlin3@huawei.com>
> ---
>   drivers/md/md.c       |  6 +++---
>   drivers/md/md.h       | 13 +++++++++----
>   drivers/md/raid1.c    |  4 ++--
>   drivers/md/raid10.c   | 24 ++++++++++++------------
>   drivers/md/raid5.c    |  4 ++--
>   include/linux/genhd.h |  1 -
>   6 files changed, 28 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 5111ed966947..f1b71a92801e 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -8429,14 +8429,14 @@ static int is_mddev_idle(struct mddev *mddev, int init)
>   {
>   	struct md_rdev *rdev;
>   	int idle;
> -	int curr_events;
> +	s64 curr_events;
>   
>   	idle = 1;
>   	rcu_read_lock();
>   	rdev_for_each_rcu(rdev, mddev) {
>   		struct gendisk *disk = rdev->bdev->bd_disk;
> -		curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
> -			      atomic_read(&disk->sync_io);
> +		curr_events = (s64)part_stat_read_accum(disk->part0, sectors) -
> +			      atomic64_read(&rdev->sync_io);

So you are replacing a 'signed integer' (ie 32bit) calculation with a 
'signed 64-bit integer' calculation.
IE you just shifted the overflow from INT_MAX to LONG_MAX, without 
actually fixing it, or?

>   		/* sync IO will cause sync_io to increase before the disk_stats
>   		 * as sync_io is counted when a request starts, and
>   		 * disk_stats is counted when it completes.
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 53ea7a6961de..584e357e0940 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -50,7 +50,7 @@ struct md_rdev {
>   
>   	sector_t sectors;		/* Device size (in 512bytes sectors) */
>   	struct mddev *mddev;		/* RAID array if running */
> -	int last_events;		/* IO event timestamp */
> +	s64 last_events;		/* IO event timestamp */
>   
>   	/*
>   	 * If meta_bdev is non-NULL, it means that a separate device is
> @@ -138,6 +138,8 @@ struct md_rdev {
>   		unsigned int size;	/* Size in sectors of the PPL space */
>   		sector_t sector;	/* First sector of the PPL space */
>   	} ppl;
> +
> +	atomic64_t sync_io;		/* counter of sync IO (unit sectors) */
>   };
>   enum flag_bits {
>   	Faulty,			/* device is known to have a fault */
> @@ -549,14 +551,17 @@ static inline int mddev_trylock(struct mddev *mddev)
>   }
>   extern void mddev_unlock(struct mddev *mddev);
>   
> -static inline void md_sync_acct(struct block_device *bdev, unsigned long nr_sectors)
> +static inline void md_sync_acct(struct md_rdev *rdev, unsigned long nr_sectors)
>   {
> -	atomic_add(nr_sectors, &bdev->bd_disk->sync_io);
> +	atomic64_add(nr_sectors, &rdev->sync_io);
>   }
>   
>   static inline void md_sync_acct_bio(struct bio *bio, unsigned long nr_sectors)
>   {
> -	md_sync_acct(bio->bi_bdev, nr_sectors);
> +	struct md_rdev *rdev = (void *)bio->bi_bdev;

That looks weird. bio->bi_bdev should be a 'struct gendisk', not an MD
internal data structure.

> +
> +	bio_set_dev(bio, rdev->bdev);
> +	md_sync_acct(rdev, nr_sectors);
>   }
>   
>   struct md_personality
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 7dc8026cf6ee..74c42dabe57c 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -2232,7 +2232,7 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
>   
>   		wbio->bi_end_io = end_sync_write;
>   		atomic_inc(&r1_bio->remaining);
> -		md_sync_acct(conf->mirrors[i].rdev->bdev, bio_sectors(wbio));
> +		md_sync_acct(conf->mirrors[i].rdev, bio_sectors(wbio));
>   
>   		submit_bio_noacct(wbio);
>   	}
> @@ -2791,7 +2791,7 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
>   		if (rdev && bio->bi_end_io) {
>   			atomic_inc(&rdev->nr_pending);
>   			bio->bi_iter.bi_sector = sector_nr + rdev->data_offset;
> -			bio_set_dev(bio, rdev->bdev);
> +			bio->bi_bdev = (void *)rdev;
>   			if (test_bit(FailFast, &rdev->flags))
>   				bio->bi_opf |= MD_FAILFAST;
>   		}
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index dde98f65bd04..fc1e6c0996de 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -2407,7 +2407,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>   
>   		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
>   		atomic_inc(&r10_bio->remaining);
> -		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(tbio));
> +		md_sync_acct(conf->mirrors[d].rdev, bio_sectors(tbio));
>   
>   		if (test_bit(FailFast, &conf->mirrors[d].rdev->flags))
>   			tbio->bi_opf |= MD_FAILFAST;
> @@ -2430,7 +2430,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>   			bio_copy_data(tbio, fbio);
>   		d = r10_bio->devs[i].devnum;
>   		atomic_inc(&r10_bio->remaining);
> -		md_sync_acct(conf->mirrors[d].replacement->bdev,
> +		md_sync_acct(conf->mirrors[d].replacement,
>   			     bio_sectors(tbio));
>   		submit_bio_noacct(tbio);
>   	}
> @@ -2562,12 +2562,12 @@ static void recovery_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>   		wbio2 = NULL;
>   	if (wbio->bi_end_io) {
>   		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
> -		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(wbio));
> +		md_sync_acct(conf->mirrors[d].rdev, bio_sectors(wbio));
>   		submit_bio_noacct(wbio);
>   	}
>   	if (wbio2) {
>   		atomic_inc(&conf->mirrors[d].replacement->nr_pending);
> -		md_sync_acct(conf->mirrors[d].replacement->bdev,
> +		md_sync_acct(conf->mirrors[d].replacement,
>   			     bio_sectors(wbio2));
>   		submit_bio_noacct(wbio2);
>   	}
> @@ -3486,7 +3486,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>   				from_addr = r10_bio->devs[j].addr;
>   				bio->bi_iter.bi_sector = from_addr +
>   					rdev->data_offset;
> -				bio_set_dev(bio, rdev->bdev);
> +				bio->bi_bdev = (void *)rdev;
>   				atomic_inc(&rdev->nr_pending);
>   				/* and we write to 'i' (if not in_sync) */
>   
> @@ -3508,7 +3508,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>   					bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
>   					bio->bi_iter.bi_sector = to_addr
>   						+ mrdev->data_offset;
> -					bio_set_dev(bio, mrdev->bdev);
> +					bio->bi_bdev = (void *)mrdev;
>   					atomic_inc(&r10_bio->remaining);
>   				} else
>   					r10_bio->devs[1].bio->bi_end_io = NULL;
> @@ -3529,7 +3529,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>   				bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
>   				bio->bi_iter.bi_sector = to_addr +
>   					mreplace->data_offset;
> -				bio_set_dev(bio, mreplace->bdev);
> +				bio->bi_bdev = (void *)mreplace;
>   				atomic_inc(&r10_bio->remaining);
>   				break;
>   			}
> @@ -3684,7 +3684,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>   			if (test_bit(FailFast, &rdev->flags))
>   				bio->bi_opf |= MD_FAILFAST;
>   			bio->bi_iter.bi_sector = sector + rdev->data_offset;
> -			bio_set_dev(bio, rdev->bdev);
> +			bio->bi_bdev = (void *)rdev;
>   			count++;
>   
>   			rdev = rcu_dereference(conf->mirrors[d].replacement);
> @@ -3706,7 +3706,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>   			if (test_bit(FailFast, &rdev->flags))
>   				bio->bi_opf |= MD_FAILFAST;
>   			bio->bi_iter.bi_sector = sector + rdev->data_offset;
> -			bio_set_dev(bio, rdev->bdev);
> +			bio->bi_bdev = (void *)rdev;
>   			count++;
>   			rcu_read_unlock();
>   		}
> @@ -4865,7 +4865,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
>   
>   	read_bio = bio_alloc_bioset(GFP_KERNEL, RESYNC_PAGES, &mddev->bio_set);
>   
> -	bio_set_dev(read_bio, rdev->bdev);
> +	read_bio->bi_bdev = (void *)rdev;
>   	read_bio->bi_iter.bi_sector = (r10_bio->devs[r10_bio->read_slot].addr
>   			       + rdev->data_offset);
>   	read_bio->bi_private = r10_bio;
> @@ -4921,7 +4921,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
>   		if (!rdev2 || test_bit(Faulty, &rdev2->flags))
>   			continue;
>   
> -		bio_set_dev(b, rdev2->bdev);
> +		b->bi_bdev = (void *)rdev2;
>   		b->bi_iter.bi_sector = r10_bio->devs[s/2].addr +
>   			rdev2->new_data_offset;
>   		b->bi_end_io = end_reshape_write;
> @@ -5016,7 +5016,7 @@ static void reshape_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>   		}
>   		atomic_inc(&rdev->nr_pending);
>   		rcu_read_unlock();
> -		md_sync_acct_bio(b, r10_bio->sectors);
> +		md_sync_acct(rdev, r10_bio->sectors);
>   		atomic_inc(&r10_bio->remaining);
>   		b->bi_next = NULL;
>   		submit_bio_noacct(b);
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 9c1a5877cf9f..b932282ff50a 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -1167,7 +1167,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>   		if (rdev) {
>   			if (s->syncing || s->expanding || s->expanded
>   			    || s->replacing)
> -				md_sync_acct(rdev->bdev, RAID5_STRIPE_SECTORS(conf));
> +				md_sync_acct(rdev, RAID5_STRIPE_SECTORS(conf));
>   
>   			set_bit(STRIPE_IO_STARTED, &sh->state);
>   
> @@ -1234,7 +1234,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>   		if (rrdev) {
>   			if (s->syncing || s->expanding || s->expanded
>   			    || s->replacing)
> -				md_sync_acct(rrdev->bdev, RAID5_STRIPE_SECTORS(conf));
> +				md_sync_acct(rrdev, RAID5_STRIPE_SECTORS(conf));
>   
>   			set_bit(STRIPE_IO_STARTED, &sh->state);
>   
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 74c410263113..6b84444111e4 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -150,7 +150,6 @@ struct gendisk {
>   	struct list_head slave_bdevs;
>   #endif
>   	struct timer_rand_state *random;
> -	atomic_t sync_io;		/* RAID */
>   	struct disk_events *ev;
>   #ifdef  CONFIG_BLK_DEV_INTEGRITY
>   	struct kobject integrity_kobj;
> 

Cheers,

Hannes
Damien Le Moal Dec. 10, 2021, 7:17 a.m. UTC | #2
On 2021/12/10 15:45, Hannes Reinecke wrote:
> On 12/10/21 6:17 AM, Li Jinlin wrote:
>> UBSAN reports this problem:
>>
>> [ 5984.281385] UBSAN: Undefined behaviour in drivers/md/md.c:8175:15
>> [ 5984.281390] signed integer overflow:
>> [ 5984.281393] -2147483291 - 2072033152 cannot be represented in type 'int'
>> [ 5984.281400] CPU: 25 PID: 1854 Comm: md101_resync Kdump: loaded Not tainted 4.19.90
>> [ 5984.281404] Hardware name: Huawei TaiShan 200 (Model 5280)/BC82AMDDA
>> [ 5984.281406] Call trace:
>> [ 5984.281415]  dump_backtrace+0x0/0x310
>> [ 5984.281418]  show_stack+0x28/0x38
>> [ 5984.281425]  dump_stack+0xec/0x15c
>> [ 5984.281430]  ubsan_epilogue+0x18/0x84
>> [ 5984.281434]  handle_overflow+0x14c/0x19c
>> [ 5984.281439]  __ubsan_handle_sub_overflow+0x34/0x44
>> [ 5984.281445]  is_mddev_idle+0x338/0x3d8
>> [ 5984.281449]  md_do_sync+0x1bb8/0x1cf8
>> [ 5984.281452]  md_thread+0x220/0x288
>> [ 5984.281457]  kthread+0x1d8/0x1e0
>> [ 5984.281461]  ret_from_fork+0x10/0x18
>>
>> When the stat accum of the disk is greater than INT_MAX, its
>> value becomes negative after being cast to 'int', and subtracting
>> a positive number from it may then overflow. In the same way,
>> when the value of sync_io is greater than INT_MAX, the
>> subtraction may also overflow. Both situations lead to
>> undefined behaviour.
>>
>> Moreover, if the stat accum of the disk is close to INT_MAX
>> when the raid array is created, the initial value of last_events
>> is set close to INT_MAX when mddev initializes its IO
>> event counters. 'curr_events - rdev->last_events > 64' will
>> then always be false during synchronization. If all the disks
>> of the mddev are in this state, is_mddev_idle() always returns 1,
>> which can make non-sync IO very slow.
>>
>> To address these problems, use a 64-bit signed integer
>> type for sync_io, last_events, and curr_events.
>>
>> Of all the in-tree drivers, only the md driver currently uses
>> the sync_io field in struct gendisk. It fits better in
>> struct md_rdev, so add a sync_io field to struct md_rdev and
>> use it instead. md_sync_acct() and md_sync_acct_bio() are
>> modified to match. md_sync_acct_bio() needs access to the
>> rdev, so we stash the rdev pointer in bio->bi_bdev before
>> calling it, and the function restores bio->bi_bdev to the
>> real block device.
>>
> Please make that two patches, one for moving sync_io and one for
> fixing the undefined behaviour.
> 
>> Signed-off-by: Li Jinlin <lijinlin3@huawei.com>
>> ---
>>   drivers/md/md.c       |  6 +++---
>>   drivers/md/md.h       | 13 +++++++++----
>>   drivers/md/raid1.c    |  4 ++--
>>   drivers/md/raid10.c   | 24 ++++++++++++------------
>>   drivers/md/raid5.c    |  4 ++--
>>   include/linux/genhd.h |  1 -
>>   6 files changed, 28 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 5111ed966947..f1b71a92801e 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -8429,14 +8429,14 @@ static int is_mddev_idle(struct mddev *mddev, int init)
>>   {
>>   	struct md_rdev *rdev;
>>   	int idle;
>> -	int curr_events;
>> +	s64 curr_events;
>>   
>>   	idle = 1;
>>   	rcu_read_lock();
>>   	rdev_for_each_rcu(rdev, mddev) {
>>   		struct gendisk *disk = rdev->bdev->bd_disk;
>> -		curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
>> -			      atomic_read(&disk->sync_io);
>> +		curr_events = (s64)part_stat_read_accum(disk->part0, sectors) -
>> +			      atomic64_read(&rdev->sync_io);
> 
> So you are replacing a 'signed integer' (ie 32bit) calculation with a 
> 'signed 64-bit integer' calculation.
> IE you just shifted the overflow from INT_MAX to LONG_MAX, without 
> actually fixing it, or?
> 
>>   		/* sync IO will cause sync_io to increase before the disk_stats
>>   		 * as sync_io is counted when a request starts, and
>>   		 * disk_stats is counted when it completes.
>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>> index 53ea7a6961de..584e357e0940 100644
>> --- a/drivers/md/md.h
>> +++ b/drivers/md/md.h
>> @@ -50,7 +50,7 @@ struct md_rdev {
>>   
>>   	sector_t sectors;		/* Device size (in 512bytes sectors) */
>>   	struct mddev *mddev;		/* RAID array if running */
>> -	int last_events;		/* IO event timestamp */
>> +	s64 last_events;		/* IO event timestamp */
>>   
>>   	/*
>>   	 * If meta_bdev is non-NULL, it means that a separate device is
>> @@ -138,6 +138,8 @@ struct md_rdev {
>>   		unsigned int size;	/* Size in sectors of the PPL space */
>>   		sector_t sector;	/* First sector of the PPL space */
>>   	} ppl;
>> +
>> +	atomic64_t sync_io;		/* counter of sync IO (unit sectors) */
>>   };
>>   enum flag_bits {
>>   	Faulty,			/* device is known to have a fault */
>> @@ -549,14 +551,17 @@ static inline int mddev_trylock(struct mddev *mddev)
>>   }
>>   extern void mddev_unlock(struct mddev *mddev);
>>   
>> -static inline void md_sync_acct(struct block_device *bdev, unsigned long nr_sectors)
>> +static inline void md_sync_acct(struct md_rdev *rdev, unsigned long nr_sectors)
>>   {
>> -	atomic_add(nr_sectors, &bdev->bd_disk->sync_io);
>> +	atomic64_add(nr_sectors, &rdev->sync_io);
>>   }
>>   
>>   static inline void md_sync_acct_bio(struct bio *bio, unsigned long nr_sectors)
>>   {
>> -	md_sync_acct(bio->bi_bdev, nr_sectors);
>> +	struct md_rdev *rdev = (void *)bio->bi_bdev;
> 
> That looks weird. bio->bi_bdev should be a 'struct gendisk', not an MD
> internal data structure.

You mean a "struct block_device", right? :)

> 
>> +
>> +	bio_set_dev(bio, rdev->bdev);
>> +	md_sync_acct(rdev, nr_sectors);
>>   }
>>   
>>   struct md_personality
>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>> index 7dc8026cf6ee..74c42dabe57c 100644
>> --- a/drivers/md/raid1.c
>> +++ b/drivers/md/raid1.c
>> @@ -2232,7 +2232,7 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
>>   
>>   		wbio->bi_end_io = end_sync_write;
>>   		atomic_inc(&r1_bio->remaining);
>> -		md_sync_acct(conf->mirrors[i].rdev->bdev, bio_sectors(wbio));
>> +		md_sync_acct(conf->mirrors[i].rdev, bio_sectors(wbio));
>>   
>>   		submit_bio_noacct(wbio);
>>   	}
>> @@ -2791,7 +2791,7 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
>>   		if (rdev && bio->bi_end_io) {
>>   			atomic_inc(&rdev->nr_pending);
>>   			bio->bi_iter.bi_sector = sector_nr + rdev->data_offset;
>> -			bio_set_dev(bio, rdev->bdev);
>> +			bio->bi_bdev = (void *)rdev;
>>   			if (test_bit(FailFast, &rdev->flags))
>>   				bio->bi_opf |= MD_FAILFAST;
>>   		}
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index dde98f65bd04..fc1e6c0996de 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -2407,7 +2407,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>>   
>>   		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
>>   		atomic_inc(&r10_bio->remaining);
>> -		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(tbio));
>> +		md_sync_acct(conf->mirrors[d].rdev, bio_sectors(tbio));
>>   
>>   		if (test_bit(FailFast, &conf->mirrors[d].rdev->flags))
>>   			tbio->bi_opf |= MD_FAILFAST;
>> @@ -2430,7 +2430,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>>   			bio_copy_data(tbio, fbio);
>>   		d = r10_bio->devs[i].devnum;
>>   		atomic_inc(&r10_bio->remaining);
>> -		md_sync_acct(conf->mirrors[d].replacement->bdev,
>> +		md_sync_acct(conf->mirrors[d].replacement,
>>   			     bio_sectors(tbio));
>>   		submit_bio_noacct(tbio);
>>   	}
>> @@ -2562,12 +2562,12 @@ static void recovery_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>>   		wbio2 = NULL;
>>   	if (wbio->bi_end_io) {
>>   		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
>> -		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(wbio));
>> +		md_sync_acct(conf->mirrors[d].rdev, bio_sectors(wbio));
>>   		submit_bio_noacct(wbio);
>>   	}
>>   	if (wbio2) {
>>   		atomic_inc(&conf->mirrors[d].replacement->nr_pending);
>> -		md_sync_acct(conf->mirrors[d].replacement->bdev,
>> +		md_sync_acct(conf->mirrors[d].replacement,
>>   			     bio_sectors(wbio2));
>>   		submit_bio_noacct(wbio2);
>>   	}
>> @@ -3486,7 +3486,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>   				from_addr = r10_bio->devs[j].addr;
>>   				bio->bi_iter.bi_sector = from_addr +
>>   					rdev->data_offset;
>> -				bio_set_dev(bio, rdev->bdev);
>> +				bio->bi_bdev = (void *)rdev;
>>   				atomic_inc(&rdev->nr_pending);
>>   				/* and we write to 'i' (if not in_sync) */
>>   
>> @@ -3508,7 +3508,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>   					bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
>>   					bio->bi_iter.bi_sector = to_addr
>>   						+ mrdev->data_offset;
>> -					bio_set_dev(bio, mrdev->bdev);
>> +					bio->bi_bdev = (void *)mrdev;
>>   					atomic_inc(&r10_bio->remaining);
>>   				} else
>>   					r10_bio->devs[1].bio->bi_end_io = NULL;
>> @@ -3529,7 +3529,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>   				bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
>>   				bio->bi_iter.bi_sector = to_addr +
>>   					mreplace->data_offset;
>> -				bio_set_dev(bio, mreplace->bdev);
>> +				bio->bi_bdev = (void *)mreplace;
>>   				atomic_inc(&r10_bio->remaining);
>>   				break;
>>   			}
>> @@ -3684,7 +3684,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>   			if (test_bit(FailFast, &rdev->flags))
>>   				bio->bi_opf |= MD_FAILFAST;
>>   			bio->bi_iter.bi_sector = sector + rdev->data_offset;
>> -			bio_set_dev(bio, rdev->bdev);
>> +			bio->bi_bdev = (void *)rdev;
>>   			count++;
>>   
>>   			rdev = rcu_dereference(conf->mirrors[d].replacement);
>> @@ -3706,7 +3706,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>   			if (test_bit(FailFast, &rdev->flags))
>>   				bio->bi_opf |= MD_FAILFAST;
>>   			bio->bi_iter.bi_sector = sector + rdev->data_offset;
>> -			bio_set_dev(bio, rdev->bdev);
>> +			bio->bi_bdev = (void *)rdev;
>>   			count++;
>>   			rcu_read_unlock();
>>   		}
>> @@ -4865,7 +4865,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
>>   
>>   	read_bio = bio_alloc_bioset(GFP_KERNEL, RESYNC_PAGES, &mddev->bio_set);
>>   
>> -	bio_set_dev(read_bio, rdev->bdev);
>> +	read_bio->bi_bdev = (void *)rdev;
>>   	read_bio->bi_iter.bi_sector = (r10_bio->devs[r10_bio->read_slot].addr
>>   			       + rdev->data_offset);
>>   	read_bio->bi_private = r10_bio;
>> @@ -4921,7 +4921,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
>>   		if (!rdev2 || test_bit(Faulty, &rdev2->flags))
>>   			continue;
>>   
>> -		bio_set_dev(b, rdev2->bdev);
>> +		b->bi_bdev = (void *)rdev2;
>>   		b->bi_iter.bi_sector = r10_bio->devs[s/2].addr +
>>   			rdev2->new_data_offset;
>>   		b->bi_end_io = end_reshape_write;
>> @@ -5016,7 +5016,7 @@ static void reshape_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>>   		}
>>   		atomic_inc(&rdev->nr_pending);
>>   		rcu_read_unlock();
>> -		md_sync_acct_bio(b, r10_bio->sectors);
>> +		md_sync_acct(rdev, r10_bio->sectors);
>>   		atomic_inc(&r10_bio->remaining);
>>   		b->bi_next = NULL;
>>   		submit_bio_noacct(b);
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index 9c1a5877cf9f..b932282ff50a 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -1167,7 +1167,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>>   		if (rdev) {
>>   			if (s->syncing || s->expanding || s->expanded
>>   			    || s->replacing)
>> -				md_sync_acct(rdev->bdev, RAID5_STRIPE_SECTORS(conf));
>> +				md_sync_acct(rdev, RAID5_STRIPE_SECTORS(conf));
>>   
>>   			set_bit(STRIPE_IO_STARTED, &sh->state);
>>   
>> @@ -1234,7 +1234,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>>   		if (rrdev) {
>>   			if (s->syncing || s->expanding || s->expanded
>>   			    || s->replacing)
>> -				md_sync_acct(rrdev->bdev, RAID5_STRIPE_SECTORS(conf));
>> +				md_sync_acct(rrdev, RAID5_STRIPE_SECTORS(conf));
>>   
>>   			set_bit(STRIPE_IO_STARTED, &sh->state);
>>   
>> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
>> index 74c410263113..6b84444111e4 100644
>> --- a/include/linux/genhd.h
>> +++ b/include/linux/genhd.h
>> @@ -150,7 +150,6 @@ struct gendisk {
>>   	struct list_head slave_bdevs;
>>   #endif
>>   	struct timer_rand_state *random;
>> -	atomic_t sync_io;		/* RAID */
>>   	struct disk_events *ev;
>>   #ifdef  CONFIG_BLK_DEV_INTEGRITY
>>   	struct kobject integrity_kobj;
>>
> 
> Cheers,
> 
> Hannes
Li Jinlin Dec. 10, 2021, 8:46 a.m. UTC | #3
On 12/10/2021 3:17 PM, Damien Le Moal wrote:
> On 2021/12/10 15:45, Hannes Reinecke wrote:
>> On 12/10/21 6:17 AM, Li Jinlin wrote:
>>> UBSAN reports this problem:
>>>
>>> [ 5984.281385] UBSAN: Undefined behaviour in drivers/md/md.c:8175:15
>>> [ 5984.281390] signed integer overflow:
>>> [ 5984.281393] -2147483291 - 2072033152 cannot be represented in type 'int'
>>> [ 5984.281400] CPU: 25 PID: 1854 Comm: md101_resync Kdump: loaded Not tainted 4.19.90
>>> [ 5984.281404] Hardware name: Huawei TaiShan 200 (Model 5280)/BC82AMDDA
>>> [ 5984.281406] Call trace:
>>> [ 5984.281415]  dump_backtrace+0x0/0x310
>>> [ 5984.281418]  show_stack+0x28/0x38
>>> [ 5984.281425]  dump_stack+0xec/0x15c
>>> [ 5984.281430]  ubsan_epilogue+0x18/0x84
>>> [ 5984.281434]  handle_overflow+0x14c/0x19c
>>> [ 5984.281439]  __ubsan_handle_sub_overflow+0x34/0x44
>>> [ 5984.281445]  is_mddev_idle+0x338/0x3d8
>>> [ 5984.281449]  md_do_sync+0x1bb8/0x1cf8
>>> [ 5984.281452]  md_thread+0x220/0x288
>>> [ 5984.281457]  kthread+0x1d8/0x1e0
>>> [ 5984.281461]  ret_from_fork+0x10/0x18
>>>
>>> When the stat accum of the disk is greater than INT_MAX, its
>>> value becomes negative after being cast to 'int', and subtracting
>>> a positive number from it may then overflow. In the same way,
>>> when the value of sync_io is greater than INT_MAX, the
>>> subtraction may also overflow. Both situations lead to
>>> undefined behaviour.
>>>
>>> Moreover, if the stat accum of the disk is close to INT_MAX
>>> when the raid array is created, the initial value of last_events
>>> is set close to INT_MAX when mddev initializes its IO
>>> event counters. 'curr_events - rdev->last_events > 64' will
>>> then always be false during synchronization. If all the disks
>>> of the mddev are in this state, is_mddev_idle() always returns 1,
>>> which can make non-sync IO very slow.
>>>
>>> To address these problems, use a 64-bit signed integer
>>> type for sync_io, last_events, and curr_events.
>>>
>>> Of all the in-tree drivers, only the md driver currently uses
>>> the sync_io field in struct gendisk. It fits better in
>>> struct md_rdev, so add a sync_io field to struct md_rdev and
>>> use it instead. md_sync_acct() and md_sync_acct_bio() are
>>> modified to match. md_sync_acct_bio() needs access to the
>>> rdev, so we stash the rdev pointer in bio->bi_bdev before
>>> calling it, and the function restores bio->bi_bdev to the
>>> real block device.
>>>
>> Please make that two patches, one for moving sync_io and one for
>> fixing the undefined behaviour.

ok.

>>
>>> Signed-off-by: Li Jinlin <lijinlin3@huawei.com>
>>> ---
>>>   drivers/md/md.c       |  6 +++---
>>>   drivers/md/md.h       | 13 +++++++++----
>>>   drivers/md/raid1.c    |  4 ++--
>>>   drivers/md/raid10.c   | 24 ++++++++++++------------
>>>   drivers/md/raid5.c    |  4 ++--
>>>   include/linux/genhd.h |  1 -
>>>   6 files changed, 28 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index 5111ed966947..f1b71a92801e 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -8429,14 +8429,14 @@ static int is_mddev_idle(struct mddev *mddev, int init)
>>>   {
>>>   	struct md_rdev *rdev;
>>>   	int idle;
>>> -	int curr_events;
>>> +	s64 curr_events;
>>>   
>>>   	idle = 1;
>>>   	rcu_read_lock();
>>>   	rdev_for_each_rcu(rdev, mddev) {
>>>   		struct gendisk *disk = rdev->bdev->bd_disk;
>>> -		curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
>>> -			      atomic_read(&disk->sync_io);
>>> +		curr_events = (s64)part_stat_read_accum(disk->part0, sectors) -
>>> +			      atomic64_read(&rdev->sync_io);
>>
>> So you are replacing a 'signed integer' (ie 32bit) calculation with a 
>> 'signed 64-bit integer' calculation.
>> IE you just shifted the overflow from INT_MAX to LONG_MAX, without 
>> actually fixing it, or?

Yes. For a disk with 512-byte sectors, INT_MAX sectors amount to about
1 TiB of data. LONG_MAX is large enough to last until the server
restarts, so the overflow will not occur anymore.
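
As a rough back-of-the-envelope check (approximate figures, assuming
512-byte sectors and a sustained sync rate of about 10 GB/s):

  INT_MAX sectors  ~ 2^31 * 512 B  ~ 1 TiB
  S64_MAX sectors  ~ 2^63 * 512 B  ~ 4 ZiB  (about 4.7 * 10^21 bytes)

  time to wrap the 64-bit counter ~ 4.7 * 10^21 B / 10^10 B/s
                                  ~ 4.7 * 10^11 s, i.e. thousands of years.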

>>
>>>   		/* sync IO will cause sync_io to increase before the disk_stats
>>>   		 * as sync_io is counted when a request starts, and
>>>   		 * disk_stats is counted when it completes.
>>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>>> index 53ea7a6961de..584e357e0940 100644
>>> --- a/drivers/md/md.h
>>> +++ b/drivers/md/md.h
>>> @@ -50,7 +50,7 @@ struct md_rdev {
>>>   
>>>   	sector_t sectors;		/* Device size (in 512bytes sectors) */
>>>   	struct mddev *mddev;		/* RAID array if running */
>>> -	int last_events;		/* IO event timestamp */
>>> +	s64 last_events;		/* IO event timestamp */
>>>   
>>>   	/*
>>>   	 * If meta_bdev is non-NULL, it means that a separate device is
>>> @@ -138,6 +138,8 @@ struct md_rdev {
>>>   		unsigned int size;	/* Size in sectors of the PPL space */
>>>   		sector_t sector;	/* First sector of the PPL space */
>>>   	} ppl;
>>> +
>>> +	atomic64_t sync_io;		/* counter of sync IO (unit sectors) */
>>>   };
>>>   enum flag_bits {
>>>   	Faulty,			/* device is known to have a fault */
>>> @@ -549,14 +551,17 @@ static inline int mddev_trylock(struct mddev *mddev)
>>>   }
>>>   extern void mddev_unlock(struct mddev *mddev);
>>>   
>>> -static inline void md_sync_acct(struct block_device *bdev, unsigned long nr_sectors)
>>> +static inline void md_sync_acct(struct md_rdev *rdev, unsigned long nr_sectors)
>>>   {
>>> -	atomic_add(nr_sectors, &bdev->bd_disk->sync_io);
>>> +	atomic64_add(nr_sectors, &rdev->sync_io);
>>>   }
>>>   
>>>   static inline void md_sync_acct_bio(struct bio *bio, unsigned long nr_sectors)
>>>   {
>>> -	md_sync_acct(bio->bi_bdev, nr_sectors);
>>> +	struct md_rdev *rdev = (void *)bio->bi_bdev;
>>
>> That looks weird. bio->bi_bdev should be a 'struct gendisk', not an MD
>> internal data structure.
> 
> You mean a "struct block_device", right? :)
> 

This is the difficult point of moving sync_io. We need a good way
to solve it, or we should not move sync_io.

>>
>>> +
>>> +	bio_set_dev(bio, rdev->bdev);
>>> +	md_sync_acct(rdev, nr_sectors);
>>>   }
>>>   
>>>   struct md_personality
>>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>>> index 7dc8026cf6ee..74c42dabe57c 100644
>>> --- a/drivers/md/raid1.c
>>> +++ b/drivers/md/raid1.c
>>> @@ -2232,7 +2232,7 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
>>>   
>>>   		wbio->bi_end_io = end_sync_write;
>>>   		atomic_inc(&r1_bio->remaining);
>>> -		md_sync_acct(conf->mirrors[i].rdev->bdev, bio_sectors(wbio));
>>> +		md_sync_acct(conf->mirrors[i].rdev, bio_sectors(wbio));
>>>   
>>>   		submit_bio_noacct(wbio);
>>>   	}
>>> @@ -2791,7 +2791,7 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
>>>   		if (rdev && bio->bi_end_io) {
>>>   			atomic_inc(&rdev->nr_pending);
>>>   			bio->bi_iter.bi_sector = sector_nr + rdev->data_offset;
>>> -			bio_set_dev(bio, rdev->bdev);
>>> +			bio->bi_bdev = (void *)rdev;
>>>   			if (test_bit(FailFast, &rdev->flags))
>>>   				bio->bi_opf |= MD_FAILFAST;
>>>   		}
>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>> index dde98f65bd04..fc1e6c0996de 100644
>>> --- a/drivers/md/raid10.c
>>> +++ b/drivers/md/raid10.c
>>> @@ -2407,7 +2407,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>>>   
>>>   		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
>>>   		atomic_inc(&r10_bio->remaining);
>>> -		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(tbio));
>>> +		md_sync_acct(conf->mirrors[d].rdev, bio_sectors(tbio));
>>>   
>>>   		if (test_bit(FailFast, &conf->mirrors[d].rdev->flags))
>>>   			tbio->bi_opf |= MD_FAILFAST;
>>> @@ -2430,7 +2430,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>>>   			bio_copy_data(tbio, fbio);
>>>   		d = r10_bio->devs[i].devnum;
>>>   		atomic_inc(&r10_bio->remaining);
>>> -		md_sync_acct(conf->mirrors[d].replacement->bdev,
>>> +		md_sync_acct(conf->mirrors[d].replacement,
>>>   			     bio_sectors(tbio));
>>>   		submit_bio_noacct(tbio);
>>>   	}
>>> @@ -2562,12 +2562,12 @@ static void recovery_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>>>   		wbio2 = NULL;
>>>   	if (wbio->bi_end_io) {
>>>   		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
>>> -		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(wbio));
>>> +		md_sync_acct(conf->mirrors[d].rdev, bio_sectors(wbio));
>>>   		submit_bio_noacct(wbio);
>>>   	}
>>>   	if (wbio2) {
>>>   		atomic_inc(&conf->mirrors[d].replacement->nr_pending);
>>> -		md_sync_acct(conf->mirrors[d].replacement->bdev,
>>> +		md_sync_acct(conf->mirrors[d].replacement,
>>>   			     bio_sectors(wbio2));
>>>   		submit_bio_noacct(wbio2);
>>>   	}
>>> @@ -3486,7 +3486,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>>   				from_addr = r10_bio->devs[j].addr;
>>>   				bio->bi_iter.bi_sector = from_addr +
>>>   					rdev->data_offset;
>>> -				bio_set_dev(bio, rdev->bdev);
>>> +				bio->bi_bdev = (void *)rdev;
>>>   				atomic_inc(&rdev->nr_pending);
>>>   				/* and we write to 'i' (if not in_sync) */
>>>   
>>> @@ -3508,7 +3508,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>>   					bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
>>>   					bio->bi_iter.bi_sector = to_addr
>>>   						+ mrdev->data_offset;
>>> -					bio_set_dev(bio, mrdev->bdev);
>>> +					bio->bi_bdev = (void *)mrdev;
>>>   					atomic_inc(&r10_bio->remaining);
>>>   				} else
>>>   					r10_bio->devs[1].bio->bi_end_io = NULL;
>>> @@ -3529,7 +3529,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>>   				bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
>>>   				bio->bi_iter.bi_sector = to_addr +
>>>   					mreplace->data_offset;
>>> -				bio_set_dev(bio, mreplace->bdev);
>>> +				bio->bi_bdev = (void *)mreplace;
>>>   				atomic_inc(&r10_bio->remaining);
>>>   				break;
>>>   			}
>>> @@ -3684,7 +3684,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>>   			if (test_bit(FailFast, &rdev->flags))
>>>   				bio->bi_opf |= MD_FAILFAST;
>>>   			bio->bi_iter.bi_sector = sector + rdev->data_offset;
>>> -			bio_set_dev(bio, rdev->bdev);
>>> +			bio->bi_bdev = (void *)rdev;
>>>   			count++;
>>>   
>>>   			rdev = rcu_dereference(conf->mirrors[d].replacement);
>>> @@ -3706,7 +3706,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>>>   			if (test_bit(FailFast, &rdev->flags))
>>>   				bio->bi_opf |= MD_FAILFAST;
>>>   			bio->bi_iter.bi_sector = sector + rdev->data_offset;
>>> -			bio_set_dev(bio, rdev->bdev);
>>> +			bio->bi_bdev = (void *)rdev;
>>>   			count++;
>>>   			rcu_read_unlock();
>>>   		}
>>> @@ -4865,7 +4865,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
>>>   
>>>   	read_bio = bio_alloc_bioset(GFP_KERNEL, RESYNC_PAGES, &mddev->bio_set);
>>>   
>>> -	bio_set_dev(read_bio, rdev->bdev);
>>> +	read_bio->bi_bdev = (void *)rdev;
>>>   	read_bio->bi_iter.bi_sector = (r10_bio->devs[r10_bio->read_slot].addr
>>>   			       + rdev->data_offset);
>>>   	read_bio->bi_private = r10_bio;
>>> @@ -4921,7 +4921,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
>>>   		if (!rdev2 || test_bit(Faulty, &rdev2->flags))
>>>   			continue;
>>>   
>>> -		bio_set_dev(b, rdev2->bdev);
>>> +		b->bi_bdev = (void *)rdev2;
>>>   		b->bi_iter.bi_sector = r10_bio->devs[s/2].addr +
>>>   			rdev2->new_data_offset;
>>>   		b->bi_end_io = end_reshape_write;
>>> @@ -5016,7 +5016,7 @@ static void reshape_request_write(struct mddev *mddev, struct r10bio *r10_bio)
>>>   		}
>>>   		atomic_inc(&rdev->nr_pending);
>>>   		rcu_read_unlock();
>>> -		md_sync_acct_bio(b, r10_bio->sectors);
>>> +		md_sync_acct(rdev, r10_bio->sectors);
>>>   		atomic_inc(&r10_bio->remaining);
>>>   		b->bi_next = NULL;
>>>   		submit_bio_noacct(b);
>>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>>> index 9c1a5877cf9f..b932282ff50a 100644
>>> --- a/drivers/md/raid5.c
>>> +++ b/drivers/md/raid5.c
>>> @@ -1167,7 +1167,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>>>   		if (rdev) {
>>>   			if (s->syncing || s->expanding || s->expanded
>>>   			    || s->replacing)
>>> -				md_sync_acct(rdev->bdev, RAID5_STRIPE_SECTORS(conf));
>>> +				md_sync_acct(rdev, RAID5_STRIPE_SECTORS(conf));
>>>   
>>>   			set_bit(STRIPE_IO_STARTED, &sh->state);
>>>   
>>> @@ -1234,7 +1234,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>>>   		if (rrdev) {
>>>   			if (s->syncing || s->expanding || s->expanded
>>>   			    || s->replacing)
>>> -				md_sync_acct(rrdev->bdev, RAID5_STRIPE_SECTORS(conf));
>>> +				md_sync_acct(rrdev, RAID5_STRIPE_SECTORS(conf));
>>>   
>>>   			set_bit(STRIPE_IO_STARTED, &sh->state);
>>>   
>>> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
>>> index 74c410263113..6b84444111e4 100644
>>> --- a/include/linux/genhd.h
>>> +++ b/include/linux/genhd.h
>>> @@ -150,7 +150,6 @@ struct gendisk {
>>>   	struct list_head slave_bdevs;
>>>   #endif
>>>   	struct timer_rand_state *random;
>>> -	atomic_t sync_io;		/* RAID */
>>>   	struct disk_events *ev;
>>>   #ifdef  CONFIG_BLK_DEV_INTEGRITY
>>>   	struct kobject integrity_kobj;
>>>
>>
>> Cheers,
>>
>> Hannes
> 
> 
Thanks,
JinLin

Patch

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 5111ed966947..f1b71a92801e 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8429,14 +8429,14 @@  static int is_mddev_idle(struct mddev *mddev, int init)
 {
 	struct md_rdev *rdev;
 	int idle;
-	int curr_events;
+	s64 curr_events;
 
 	idle = 1;
 	rcu_read_lock();
 	rdev_for_each_rcu(rdev, mddev) {
 		struct gendisk *disk = rdev->bdev->bd_disk;
-		curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
-			      atomic_read(&disk->sync_io);
+		curr_events = (s64)part_stat_read_accum(disk->part0, sectors) -
+			      atomic64_read(&rdev->sync_io);
 		/* sync IO will cause sync_io to increase before the disk_stats
 		 * as sync_io is counted when a request starts, and
 		 * disk_stats is counted when it completes.
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 53ea7a6961de..584e357e0940 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -50,7 +50,7 @@  struct md_rdev {
 
 	sector_t sectors;		/* Device size (in 512bytes sectors) */
 	struct mddev *mddev;		/* RAID array if running */
-	int last_events;		/* IO event timestamp */
+	s64 last_events;		/* IO event timestamp */
 
 	/*
 	 * If meta_bdev is non-NULL, it means that a separate device is
@@ -138,6 +138,8 @@  struct md_rdev {
 		unsigned int size;	/* Size in sectors of the PPL space */
 		sector_t sector;	/* First sector of the PPL space */
 	} ppl;
+
+	atomic64_t sync_io;		/* counter of sync IO (unit sectors) */
 };
 enum flag_bits {
 	Faulty,			/* device is known to have a fault */
@@ -549,14 +551,17 @@  static inline int mddev_trylock(struct mddev *mddev)
 }
 extern void mddev_unlock(struct mddev *mddev);
 
-static inline void md_sync_acct(struct block_device *bdev, unsigned long nr_sectors)
+static inline void md_sync_acct(struct md_rdev *rdev, unsigned long nr_sectors)
 {
-	atomic_add(nr_sectors, &bdev->bd_disk->sync_io);
+	atomic64_add(nr_sectors, &rdev->sync_io);
 }
 
 static inline void md_sync_acct_bio(struct bio *bio, unsigned long nr_sectors)
 {
-	md_sync_acct(bio->bi_bdev, nr_sectors);
+	struct md_rdev *rdev = (void *)bio->bi_bdev;
+
+	bio_set_dev(bio, rdev->bdev);
+	md_sync_acct(rdev, nr_sectors);
 }
 
 struct md_personality
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 7dc8026cf6ee..74c42dabe57c 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2232,7 +2232,7 @@  static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
 
 		wbio->bi_end_io = end_sync_write;
 		atomic_inc(&r1_bio->remaining);
-		md_sync_acct(conf->mirrors[i].rdev->bdev, bio_sectors(wbio));
+		md_sync_acct(conf->mirrors[i].rdev, bio_sectors(wbio));
 
 		submit_bio_noacct(wbio);
 	}
@@ -2791,7 +2791,7 @@  static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 		if (rdev && bio->bi_end_io) {
 			atomic_inc(&rdev->nr_pending);
 			bio->bi_iter.bi_sector = sector_nr + rdev->data_offset;
-			bio_set_dev(bio, rdev->bdev);
+			bio->bi_bdev = (void *)rdev;
 			if (test_bit(FailFast, &rdev->flags))
 				bio->bi_opf |= MD_FAILFAST;
 		}
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index dde98f65bd04..fc1e6c0996de 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2407,7 +2407,7 @@  static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 
 		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
 		atomic_inc(&r10_bio->remaining);
-		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(tbio));
+		md_sync_acct(conf->mirrors[d].rdev, bio_sectors(tbio));
 
 		if (test_bit(FailFast, &conf->mirrors[d].rdev->flags))
 			tbio->bi_opf |= MD_FAILFAST;
@@ -2430,7 +2430,7 @@  static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 			bio_copy_data(tbio, fbio);
 		d = r10_bio->devs[i].devnum;
 		atomic_inc(&r10_bio->remaining);
-		md_sync_acct(conf->mirrors[d].replacement->bdev,
+		md_sync_acct(conf->mirrors[d].replacement,
 			     bio_sectors(tbio));
 		submit_bio_noacct(tbio);
 	}
@@ -2562,12 +2562,12 @@  static void recovery_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 		wbio2 = NULL;
 	if (wbio->bi_end_io) {
 		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
-		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(wbio));
+		md_sync_acct(conf->mirrors[d].rdev, bio_sectors(wbio));
 		submit_bio_noacct(wbio);
 	}
 	if (wbio2) {
 		atomic_inc(&conf->mirrors[d].replacement->nr_pending);
-		md_sync_acct(conf->mirrors[d].replacement->bdev,
+		md_sync_acct(conf->mirrors[d].replacement,
 			     bio_sectors(wbio2));
 		submit_bio_noacct(wbio2);
 	}
@@ -3486,7 +3486,7 @@  static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 				from_addr = r10_bio->devs[j].addr;
 				bio->bi_iter.bi_sector = from_addr +
 					rdev->data_offset;
-				bio_set_dev(bio, rdev->bdev);
+				bio->bi_bdev = (void *)rdev;
 				atomic_inc(&rdev->nr_pending);
 				/* and we write to 'i' (if not in_sync) */
 
@@ -3508,7 +3508,7 @@  static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 					bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
 					bio->bi_iter.bi_sector = to_addr
 						+ mrdev->data_offset;
-					bio_set_dev(bio, mrdev->bdev);
+					bio->bi_bdev = (void *)mrdev;
 					atomic_inc(&r10_bio->remaining);
 				} else
 					r10_bio->devs[1].bio->bi_end_io = NULL;
@@ -3529,7 +3529,7 @@  static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 				bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
 				bio->bi_iter.bi_sector = to_addr +
 					mreplace->data_offset;
-				bio_set_dev(bio, mreplace->bdev);
+				bio->bi_bdev = (void *)mreplace;
 				atomic_inc(&r10_bio->remaining);
 				break;
 			}
@@ -3684,7 +3684,7 @@  static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 			if (test_bit(FailFast, &rdev->flags))
 				bio->bi_opf |= MD_FAILFAST;
 			bio->bi_iter.bi_sector = sector + rdev->data_offset;
-			bio_set_dev(bio, rdev->bdev);
+			bio->bi_bdev = (void *)rdev;
 			count++;
 
 			rdev = rcu_dereference(conf->mirrors[d].replacement);
@@ -3706,7 +3706,7 @@  static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 			if (test_bit(FailFast, &rdev->flags))
 				bio->bi_opf |= MD_FAILFAST;
 			bio->bi_iter.bi_sector = sector + rdev->data_offset;
-			bio_set_dev(bio, rdev->bdev);
+			bio->bi_bdev = (void *)rdev;
 			count++;
 			rcu_read_unlock();
 		}
@@ -4865,7 +4865,7 @@  static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
 
 	read_bio = bio_alloc_bioset(GFP_KERNEL, RESYNC_PAGES, &mddev->bio_set);
 
-	bio_set_dev(read_bio, rdev->bdev);
+	read_bio->bi_bdev = (void *)rdev;
 	read_bio->bi_iter.bi_sector = (r10_bio->devs[r10_bio->read_slot].addr
 			       + rdev->data_offset);
 	read_bio->bi_private = r10_bio;
@@ -4921,7 +4921,7 @@  static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
 		if (!rdev2 || test_bit(Faulty, &rdev2->flags))
 			continue;
 
-		bio_set_dev(b, rdev2->bdev);
+		b->bi_bdev = (void *)rdev2;
 		b->bi_iter.bi_sector = r10_bio->devs[s/2].addr +
 			rdev2->new_data_offset;
 		b->bi_end_io = end_reshape_write;
@@ -5016,7 +5016,7 @@  static void reshape_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 		}
 		atomic_inc(&rdev->nr_pending);
 		rcu_read_unlock();
-		md_sync_acct_bio(b, r10_bio->sectors);
+		md_sync_acct(rdev, r10_bio->sectors);
 		atomic_inc(&r10_bio->remaining);
 		b->bi_next = NULL;
 		submit_bio_noacct(b);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 9c1a5877cf9f..b932282ff50a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1167,7 +1167,7 @@  static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
 		if (rdev) {
 			if (s->syncing || s->expanding || s->expanded
 			    || s->replacing)
-				md_sync_acct(rdev->bdev, RAID5_STRIPE_SECTORS(conf));
+				md_sync_acct(rdev, RAID5_STRIPE_SECTORS(conf));
 
 			set_bit(STRIPE_IO_STARTED, &sh->state);
 
@@ -1234,7 +1234,7 @@  static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
 		if (rrdev) {
 			if (s->syncing || s->expanding || s->expanded
 			    || s->replacing)
-				md_sync_acct(rrdev->bdev, RAID5_STRIPE_SECTORS(conf));
+				md_sync_acct(rrdev, RAID5_STRIPE_SECTORS(conf));
 
 			set_bit(STRIPE_IO_STARTED, &sh->state);
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 74c410263113..6b84444111e4 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -150,7 +150,6 @@  struct gendisk {
 	struct list_head slave_bdevs;
 #endif
 	struct timer_rand_state *random;
-	atomic_t sync_io;		/* RAID */
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
 	struct kobject integrity_kobj;