diff mbox series

Manage: Block unsafe member failing

Message ID 20220818094721.8969-1-mateusz.kusiak@intel.com (mailing list archive)
State Mainlined, archived
Delegated to: Coly Li
Headers show
Series Manage: Block unsafe member failing | expand

Commit Message

Mateusz Kusiak Aug. 18, 2022, 9:47 a.m. UTC
Kernel may or may not block mdadm from removing member device if it
will cause arrays failed state. It depends on raid personality
implementation in kernel.
Add verification on requested removal path (#mdadm --set-faulty
command).

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
---
 Manage.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 52 insertions(+), 1 deletion(-)

Comments

Jes Sorensen Sept. 8, 2022, 4:53 p.m. UTC | #1
On 8/18/22 05:47, Mateusz Kusiak wrote:
> Kernel may or may not block mdadm from removing member device if it
> will cause arrays failed state. It depends on raid personality
> implementation in kernel.
> Add verification on requested removal path (#mdadm --set-faulty
> command).
> 
> Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
> ---
>  Manage.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 52 insertions(+), 1 deletion(-)

Applied!

Thanks,
Jes
Mateusz Kusiak Nov. 2, 2022, 8:35 a.m. UTC | #2
We have noticed that this patch introduced a regression, unwanted
behavior regarding matrix raid.
We're working on a fix, this in no.1 on our list. We'll post changes soon.

Thanks,
Mateusz

On 08/09/2022 18:53, Jes Sorensen wrote:
> On 8/18/22 05:47, Mateusz Kusiak wrote:
>> Kernel may or may not block mdadm from removing member device if it
>> will cause arrays failed state. It depends on raid personality
>> implementation in kernel.
>> Add verification on requested removal path (#mdadm --set-faulty
>> command).
>>
>> Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
>> ---
>>  Manage.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 52 insertions(+), 1 deletion(-)
> 
> Applied!
> 
> Thanks,
> Jes
> 
>
diff mbox series

Patch

diff --git a/Manage.c b/Manage.c
index f789e0c1..774b8a11 100644
--- a/Manage.c
+++ b/Manage.c
@@ -1285,6 +1285,50 @@  int Manage_with(struct supertype *tst, int fd, struct mddev_dev *dv,
 	return -1;
 }
 
+/**
+ * is_remove_safe() - Check if remove is safe.
+ * @array: Array info.
+ * @fd: Array file descriptor.
+ * @devname: Name of device to remove.
+ * @verbose: Verbose.
+ *
+ * The function determines if array will be operational
+ * after removing &devname.
+ *
+ * Return: True if array will be operational, false otherwise.
+ */
+bool is_remove_safe(mdu_array_info_t *array, const int fd, char *devname, const int verbose)
+{
+	dev_t devid = devnm2devid(devname + 5);
+	struct mdinfo *mdi = sysfs_read(fd, NULL, GET_DEVS | GET_DISKS | GET_STATE);
+
+	if (!mdi) {
+		if (verbose)
+			pr_err("Failed to read sysfs attributes for %s\n", devname);
+		return false;
+	}
+
+	char *avail = xcalloc(array->raid_disks, sizeof(char));
+
+	for (mdi = mdi->devs; mdi; mdi = mdi->next) {
+		if (mdi->disk.raid_disk < 0)
+			continue;
+		if (!(mdi->disk.state & (1 << MD_DISK_SYNC)))
+			continue;
+		if (makedev(mdi->disk.major, mdi->disk.minor) == devid)
+			continue;
+		avail[mdi->disk.raid_disk] = 1;
+	}
+	sysfs_free(mdi);
+
+	bool is_enough = enough(array->level, array->raid_disks,
+				array->layout, (array->state & 1),
+				avail);
+
+	free(avail);
+	return is_enough;
+}
+
 int Manage_subdevs(char *devname, int fd,
 		   struct mddev_dev *devlist, int verbose, int test,
 		   char *update, int force)
@@ -1598,7 +1642,14 @@  int Manage_subdevs(char *devname, int fd,
 			break;
 
 		case 'f': /* set faulty */
-			/* FIXME check current member */
+			if (!is_remove_safe(&array, fd, dv->devname, verbose)) {
+				pr_err("Cannot remove %s from %s, array will be failed.\n",
+				       dv->devname, devname);
+				if (sysfd >= 0)
+					close(sysfd);
+				goto abort;
+			}
+
 			if ((sysfd >= 0 && write(sysfd, "faulty", 6) != 6) ||
 			    (sysfd < 0 && ioctl(fd, SET_DISK_FAULTY,
 						rdev))) {