Unreliable disk detection order in 5.x

I'm seeing disk detection order changing across reboots on 5.x kernels
(5.4, 5.10, 5.14), but not 4.9, 4.14, 4.19, with megaraid_sas (Dell
PERC_H700). With 13 disks on 5.14.14, the order changes on almost every boot.

I did initially try to bisect this issue, but it seems to become rarer
in earlier kernels, and some of the kernels between 4.x and 5.x fail to
boot here.

The most common effect is swapping of sda with sdb, or two neighboring
devices in the list; for example:

# diff -u lsblk-S-5.10.0 lsblk-S-5.10.0-2
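
For comparing two such captures by device rather than by line, a minimal
sketch follows. It assumes the first two columns of "lsblk -S" output are
NAME and HCTL; the helper names are hypothetical and not part of any tool.

```python
def read_lsblk_s(text):
    """Map HCTL -> NAME from `lsblk -S` output.

    Assumes the first two columns are NAME and HCTL; the header row
    and any short lines are skipped.
    """
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] == "NAME":
            continue
        mapping[fields[1]] = fields[0]
    return mapping


def renamed(before, after):
    """Return {hctl: (old_name, new_name)} for devices whose name changed."""
    return {
        hctl: (before[hctl], after[hctl])
        for hctl in before
        if hctl in after and before[hctl] != after[hctl]
    }
```

Feeding it the contents of the two saved files would list only the
devices whose sd name moved between boots.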

This is happening on vendor (Debian 5.10.0) and home-built kernels, and
on a variety of hosts. On all kernels, the detection printks appear in a
seemingly arbitrary order, but on older kernels each disk always ends up
with an sd name that follows ascending SCSI ID order:

[    2.289776] sd 0:2:0:0: [sda] 999030784 512-byte logical blocks: (512 GB/476 GiB)
[    2.289918] sd 0:2:4:0: [sdd] 11719933952 512-byte logical blocks: (6.00 TB/5.46 TiB)
[    2.289947] sd 0:2:3:0: [sdc] 11719933952 512-byte logical blocks: (6.00 TB/5.46 TiB)
[    2.290032] sd 0:2:6:0: [sdf] 11719933952 512-byte logical blocks: (6.00 TB/5.46 TiB)
[    2.290210] sd 0:2:7:0: [sdg] 11719933952 512-byte logical blocks: (6.00 TB/5.46 TiB)
[    2.290248] sd 0:2:9:0: [sdi] 11719933952 512-byte logical blocks: (6.00 TB/5.46 TiB)
[    2.290323] sd 0:2:2:0: [sdb] 11719933952 512-byte logical blocks: (6.00 TB/5.46 TiB)
[    2.290461] sd 0:2:5:0: [sde] 11719933952 512-byte logical blocks: (6.00 TB/5.46 TiB)
[    2.290476] sd 0:2:8:0: [sdh] 11719933952 512-byte logical blocks: (6.00 TB/5.46 TiB)
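
A quick way to check a saved dmesg for this property is sketched below.
It is only an illustration: it assumes the "logical blocks" lines have
exactly the format shown above, and the function names are made up for
this example.

```python
import re

# Matches lines such as:
#   [    2.289776] sd 0:2:0:0: [sda] 999030784 512-byte logical blocks: ...
SD_LINE = re.compile(
    r"sd (\d+):(\d+):(\d+):(\d+): \[(sd[a-z]+)\] \d+ \d+-byte logical blocks"
)


def name_key(name):
    # Order sd names as sda..sdz, then sdaa, sdab, ...
    return (len(name), name)


def parse_sd_names(dmesg_text):
    """Return {(host, channel, target, lun): 'sdX'} from dmesg text."""
    mapping = {}
    for m in SD_LINE.finditer(dmesg_text):
        hctl = tuple(int(x) for x in m.group(1, 2, 3, 4))
        mapping[hctl] = m.group(5)
    return mapping


def misordered(mapping):
    """Return (hctl, name) pairs whose name breaks ascending H:C:T:L order."""
    by_hctl = sorted(mapping.items())
    expected = sorted(mapping.values(), key=name_key)
    return [
        (hctl, name)
        for (hctl, name), want in zip(by_hctl, expected)
        if name != want
    ]
```

An empty list from misordered() means the sd names follow SCSI ID order
even though the printks themselves arrived out of order.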

Full "dmesg" is saved here: https://0x.ca/sim/ref/5.10.0/dmesg

Any ideas or suggestions on what I could use to track down what changed
here, or on what might have influenced it?

Simon-

Message-ID: <20211105064623.GD32560@hostway.ca>
State: New, archived
Return-Path: <linux-block-owner@kernel.org>
Date: Thu, 4 Nov 2021 23:46:23 -0700
From: Simon Kirby <sim@hostway.ca>
To: linux-scsi@vger.kernel.org, linux-block@vger.kernel.org
Subject: Unreliable disk detection order in 5.x
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.10.1 (2018-07-13)
Precedence: bulk