diff mbox series

[isar-cip-core,RFC] fix: Check return code value of e2fsck

Message ID 20240819121045.1533891-1-Quirin.Gylstorff@siemens.com (mailing list archive)
State Superseded
Headers show
Series [isar-cip-core,RFC] fix: Check return code value of e2fsck | expand

Commit Message

Gylstorff Quirin Aug. 19, 2024, 12:09 p.m. UTC
From: Quirin Gylstorff <quirin.gylstorff@siemens.com>

This fixes the overlay script in case of a recoverable file system error.
In that case we execute the erronously the recouvery script and mke2fs
ask if the filesystem should be overwritten.

The issue was introduced with:

8644fb1 initramfs-overlay-hook: Check file system of INITRAMFS_OVERLAY_STORAGE_DEVICE

Signed-off-by: Quirin Gylstorff <quirin.gylstorff@siemens.com>
---
This fixes the issue but we need to address the problem in case of a
return code of 2 which requires a reboot of the system. Should we:
1. panic
2. silently reboot
3. execute the user defined recovery action (add the return value as a
  additional argument)
4. do nothing
In my testing i used 4.

Also should we force mke2fs?

 .../initramfs-overlay-hook/files/overlay.script.tmpl   | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Jan Kiszka Aug. 19, 2024, 12:16 p.m. UTC | #1
On 19.08.24 14:09, Quirin Gylstorff wrote:
> From: Quirin Gylstorff <quirin.gylstorff@siemens.com>
> 
> This fixes the overlay script in case of a recoverable file system error.
> In that case we execute the erronously the recouvery script and mke2fs
> ask if the filesystem should be overwritten.
> 
> The issue was introduced with:
> 
> 8644fb1 initramfs-overlay-hook: Check file system of INITRAMFS_OVERLAY_STORAGE_DEVICE
> 
> Signed-off-by: Quirin Gylstorff <quirin.gylstorff@siemens.com>
> ---
> This fixes the issue but we need to address the problem in case of a
> return code of 2 which requires a reboot of the system. Should we:
> 1. panic
> 2. silently reboot
> 3. execute the user defined recovery action (add the return value as a
>   additional argument)
> 4. do nothing
> In my testing i used 4.

Why does e2fsck require a reboot in some cases?

> 
> Also should we force mke2fs?

I would say so. Who knows in which state that target partition was left.

> 
>  .../initramfs-overlay-hook/files/overlay.script.tmpl   | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl b/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl
> index c655a4f..02df53d 100644
> --- a/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl
> +++ b/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl
> @@ -40,7 +40,15 @@ partition_fstype=$(get_fstype "${ovl_partition_device}")
>  if ! mountpoint -q "${rootmnt}${storage_mount_point}"; then
>  	case $partition_fstype in
>  	ext*)
> -		if ! e2fsck -p -f "$ovl_partition_device" && [ -x "$ovl_recovery_script" ]; then
> +		e2fsck -p -f "$ovl_partition_device"
> +		fsck_ret="$?"
> +		# e2fsck returns 1 in case ofrepairing the file system or a 2 in case
> +		# a reboot is required.
> +		# https://man7.org/linux/man-pages/man8/e2fsck.8.html#EXIT_CODE
> +		if [ "$fsck_ret" -gt "1" ]; then
> +			# TODO should we force are reboot or customer action
> +			panic "fsck requires a reboot"
> +		elif [ "$fsck_ret" -gt "2" ]  && [ -x "$ovl_recovery_script" ]; then
>  			"$ovl_recovery_script" "$partition_fstype" "$ovl_partition_device"
>  		fi
>  		;;

Please typo check everything.

Does this fix
https://gitlab.com/cip-project/cip-core/isar-cip-core/-/issues/113?

Jan
Gylstorff Quirin Aug. 19, 2024, 12:29 p.m. UTC | #2
On 8/19/24 2:16 PM, Jan Kiszka wrote:
> On 19.08.24 14:09, Quirin Gylstorff wrote:
>> From: Quirin Gylstorff <quirin.gylstorff@siemens.com>
>>
>> This fixes the overlay script in case of a recoverable file system error.
>> In that case we execute the erronously the recouvery script and mke2fs
>> ask if the filesystem should be overwritten.
>>
>> The issue was introduced with:
>>
>> 8644fb1 initramfs-overlay-hook: Check file system of INITRAMFS_OVERLAY_STORAGE_DEVICE
>>
>> Signed-off-by: Quirin Gylstorff <quirin.gylstorff@siemens.com>
>> ---
>> This fixes the issue but we need to address the problem in case of a
>> return code of 2 which requires a reboot of the system. Should we:
>> 1. panic
>> 2. silently reboot
>> 3. execute the user defined recovery action (add the return value as a
>>    additional argument)
>> 4. do nothing
>> In my testing i used 4.
> 
> Why does e2fsck require a reboot in some cases?
After looking into it:
In case of a ext2 system an the check file system is root.
https://github.com/tytso/e2fsprogs/blob/950a0d69c82b585aba30118f01bf80151deffe8c/e2fsck/unix.c#L2133
https://github.com/tytso/e2fsprogs/blob/950a0d69c82b585aba30118f01bf80151deffe8c/e2fsck/util.c#L70

I will remove that check and say that greater than 1 needs the recovery 
script in v2.


> 
>>
>> Also should we force mke2fs?
> 
> I would say so. Who knows in which state that target partition was left.

I will add it in v2.

Quirin
> 
>>
>>   .../initramfs-overlay-hook/files/overlay.script.tmpl   | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl b/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl
>> index c655a4f..02df53d 100644
>> --- a/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl
>> +++ b/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl
>> @@ -40,7 +40,15 @@ partition_fstype=$(get_fstype "${ovl_partition_device}")
>>   if ! mountpoint -q "${rootmnt}${storage_mount_point}"; then
>>   	case $partition_fstype in
>>   	ext*)
>> -		if ! e2fsck -p -f "$ovl_partition_device" && [ -x "$ovl_recovery_script" ]; then
>> +		e2fsck -p -f "$ovl_partition_device"
>> +		fsck_ret="$?"
>> +		# e2fsck returns 1 in case ofrepairing the file system or a 2 in case
>> +		# a reboot is required.
>> +		# https://man7.org/linux/man-pages/man8/e2fsck.8.html#EXIT_CODE
>> +		if [ "$fsck_ret" -gt "1" ]; then
>> +			# TODO should we force are reboot or customer action
>> +			panic "fsck requires a reboot"
>> +		elif [ "$fsck_ret" -gt "2" ]  && [ -x "$ovl_recovery_script" ]; then
>>   			"$ovl_recovery_script" "$partition_fstype" "$ovl_partition_device"
>>   		fi
>>   		;;
> 
> Please typo check everything.
> 
> Does this fix
> https://gitlab.com/cip-project/cip-core/isar-cip-core/-/issues/113?
> 
> Jan
>
diff mbox series

Patch

diff --git a/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl b/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl
index c655a4f..02df53d 100644
--- a/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl
+++ b/recipes-initramfs/initramfs-overlay-hook/files/overlay.script.tmpl
@@ -40,7 +40,15 @@  partition_fstype=$(get_fstype "${ovl_partition_device}")
 if ! mountpoint -q "${rootmnt}${storage_mount_point}"; then
 	case $partition_fstype in
 	ext*)
-		if ! e2fsck -p -f "$ovl_partition_device" && [ -x "$ovl_recovery_script" ]; then
+		e2fsck -p -f "$ovl_partition_device"
+		fsck_ret="$?"
+		# e2fsck returns 1 in case ofrepairing the file system or a 2 in case
+		# a reboot is required.
+		# https://man7.org/linux/man-pages/man8/e2fsck.8.html#EXIT_CODE
+		if [ "$fsck_ret" -gt "1" ]; then
+			# TODO should we force are reboot or customer action
+			panic "fsck requires a reboot"
+		elif [ "$fsck_ret" -gt "2" ]  && [ -x "$ovl_recovery_script" ]; then
 			"$ovl_recovery_script" "$partition_fstype" "$ovl_partition_device"
 		fi
 		;;