diff mbox series

block: Fix validation of ioprio level

Message ID 1724833695-22194-1-git-send-email-ajay.kaher@broadcom.com (mailing list archive)
State New
Headers show
Series block: Fix validation of ioprio level | expand

Commit Message

Ajay Kaher Aug. 28, 2024, 8:28 a.m. UTC
The commit eca2040972b4 introduced a backward compatibility issue in
the function ioprio_check_cap.

Before the change, if ioprio contains a level greater than 0x7, it was
treated as -EINVAL:

    data = ioprio & 0x1FFF
    if data >= 0x7, return -EINVAL 

Since the change, if ioprio contains a level greater than 0x7 say 0x8
it is calculated as 0x0:

    level = ioprio & 0x7

To maintain backward compatibility the kernel should return -EINVAL in
the above case as well.

Fixes: eca2040972b4 ("scsi: block: ioprio: Clean up interface definition")
Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com>
---
 block/ioprio.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Damien Le Moal Aug. 28, 2024, 8:45 a.m. UTC | #1
On 8/28/24 17:28, Ajay Kaher wrote:
> The commit eca2040972b4 introduced a backward compatibility issue in
> the function ioprio_check_cap.
> 
> Before the change, if ioprio contains a level greater than 0x7, it was
> treated as -EINVAL:
> 
>     data = ioprio & 0x1FFF
>     if data >= 0x7, return -EINVAL 
> 
> Since the change, if ioprio contains a level greater than 0x7 say 0x8
> it is calculated as 0x0:
> 
>     level = ioprio & 0x7
> 
> To maintain backward compatibility the kernel should return -EINVAL in
> the above case as well.
> 
> Fixes: eca2040972b4 ("scsi: block: ioprio: Clean up interface definition")
> Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com>
> ---
>  block/ioprio.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/block/ioprio.c b/block/ioprio.c
> index 73301a2..f08e76b 100644
> --- a/block/ioprio.c
> +++ b/block/ioprio.c
> @@ -30,6 +30,15 @@
>  #include <linux/security.h>
>  #include <linux/pid_namespace.h>
>  
> +static inline int ioprio_check_level(int ioprio, int max_level)
> +{
> +	int data = IOPRIO_PRIO_DATA(ioprio);
> +
> +	if (IOPRIO_BAD_VALUE(data, max_level))
> +		return -EINVAL;

No, this cannot possibly work correctly because the prio level part of the prio
data is only 3 bits, so 0 to 7. The remaining 10 bits of the prio data are used
for priority hints (IOPRIO_HINT_XXX).

Your change will thus return an error for cases where the prio data has a level
AND also a hint (e.g. for command duration limits). This change would break
command duration limits. So NACK.

The userspace header file has the ioprio_value() that a user should use to
construct an ioprio. Bad values are checked in that function and errors will be
returned if an invalid level is passed.

> +	return 0;
> +}
> +
>  int ioprio_check_cap(int ioprio)
>  {
>  	int class = IOPRIO_PRIO_CLASS(ioprio);
> @@ -49,7 +58,7 @@ int ioprio_check_cap(int ioprio)
>  			fallthrough;
>  			/* rt has prio field too */
>  		case IOPRIO_CLASS_BE:
> -			if (level >= IOPRIO_NR_LEVELS)
> +			if (ioprio_check_level(ioprio, IOPRIO_NR_LEVELS))
>  				return -EINVAL;
>  			break;
>  		case IOPRIO_CLASS_IDLE:
Ajay Kaher Sept. 9, 2024, 8:59 a.m. UTC | #2
On Wed, Aug 28, 2024 at 2:15 PM Damien Le Moal <dlemoal@kernel.org> wrote:
>
> On 8/28/24 17:28, Ajay Kaher wrote:
> > The commit eca2040972b4 introduced a backward compatibility issue in
> > the function ioprio_check_cap.
> >
> > Before the change, if ioprio contains a level greater than 0x7, it was
> > treated as -EINVAL:
> >
> >     data = ioprio & 0x1FFF
> >     if data >= 0x7, return -EINVAL
> >
> > Since the change, if ioprio contains a level greater than 0x7 say 0x8
> > it is calculated as 0x0:
> >
> >     level = ioprio & 0x7
> >
> > To maintain backward compatibility the kernel should return -EINVAL in
> > the above case as well.
> >
> > Fixes: eca2040972b4 ("scsi: block: ioprio: Clean up interface definition")
> > Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com>
> > ---
> >  block/ioprio.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/ioprio.c b/block/ioprio.c
> > index 73301a2..f08e76b 100644
> > --- a/block/ioprio.c
> > +++ b/block/ioprio.c
> > @@ -30,6 +30,15 @@
> >  #include <linux/security.h>
> >  #include <linux/pid_namespace.h>
> >
> > +static inline int ioprio_check_level(int ioprio, int max_level)
> > +{
> > +     int data = IOPRIO_PRIO_DATA(ioprio);
> > +
> > +     if (IOPRIO_BAD_VALUE(data, max_level))
> > +             return -EINVAL;
>
> No, this cannot possibly work correctly because the prio level part of the prio
> data is only 3 bits, so 0 to 7. The remaining 10 bits of the prio data are used
> for priority hints (IOPRIO_HINT_XXX).
>
> Your change will thus return an error for cases where the prio data has a level
> AND also a hint (e.g. for command duration limits). This change would break
> command duration limits. So NACK.
>
> The userspace header file has the ioprio_value() that a user should use to
> construct an ioprio. Bad values are checked in that function and errors will be
> returned if an invalid level is passed.
>

OK. Thanks for the detailed explanation.

I agree, to use unused bits, functionality (return value in this case)
will be changed. If applications are built using Kernel headers of
v6.1 (doesn't include eca2040972b4) and later only upgrading Kernel to
v6.6, because of the changes in return values applications may have
some sort of regression.

To make the software backward compatible I believe, unused bits should
always be ignored. So that if in future someone uses it, it should not
change the behaviour (return values) of existing software.

- Ajay

> > +     return 0;
> > +}
> > +
> >  int ioprio_check_cap(int ioprio)
> >  {
> >       int class = IOPRIO_PRIO_CLASS(ioprio);
> > @@ -49,7 +58,7 @@ int ioprio_check_cap(int ioprio)
> >                       fallthrough;
> >                       /* rt has prio field too */
> >               case IOPRIO_CLASS_BE:
> > -                     if (level >= IOPRIO_NR_LEVELS)
> > +                     if (ioprio_check_level(ioprio, IOPRIO_NR_LEVELS))
> >                               return -EINVAL;
> >                       break;
> >               case IOPRIO_CLASS_IDLE:
>
> --
> Damien Le Moal
> Western Digital Research
>
Damien Le Moal Sept. 9, 2024, 10:20 a.m. UTC | #3
On 9/9/24 17:59, Ajay Kaher wrote:
> On Wed, Aug 28, 2024 at 2:15 PM Damien Le Moal <dlemoal@kernel.org> wrote:
>>
>> On 8/28/24 17:28, Ajay Kaher wrote:
>>> The commit eca2040972b4 introduced a backward compatibility issue in
>>> the function ioprio_check_cap.
>>>
>>> Before the change, if ioprio contains a level greater than 0x7, it was
>>> treated as -EINVAL:
>>>
>>>     data = ioprio & 0x1FFF
>>>     if data >= 0x7, return -EINVAL
>>>
>>> Since the change, if ioprio contains a level greater than 0x7 say 0x8
>>> it is calculated as 0x0:
>>>
>>>     level = ioprio & 0x7
>>>
>>> To maintain backward compatibility the kernel should return -EINVAL in
>>> the above case as well.
>>>
>>> Fixes: eca2040972b4 ("scsi: block: ioprio: Clean up interface definition")
>>> Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com>
>>> ---
>>>  block/ioprio.c | 11 ++++++++++-
>>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/ioprio.c b/block/ioprio.c
>>> index 73301a2..f08e76b 100644
>>> --- a/block/ioprio.c
>>> +++ b/block/ioprio.c
>>> @@ -30,6 +30,15 @@
>>>  #include <linux/security.h>
>>>  #include <linux/pid_namespace.h>
>>>
>>> +static inline int ioprio_check_level(int ioprio, int max_level)
>>> +{
>>> +     int data = IOPRIO_PRIO_DATA(ioprio);
>>> +
>>> +     if (IOPRIO_BAD_VALUE(data, max_level))
>>> +             return -EINVAL;
>>
>> No, this cannot possibly work correctly because the prio level part of the prio
>> data is only 3 bits, so 0 to 7. The remaining 10 bits of the prio data are used
>> for priority hints (IOPRIO_HINT_XXX).
>>
>> Your change will thus return an error for cases where the prio data has a level
>> AND also a hint (e.g. for command duration limits). This change would break
>> command duration limits. So NACK.
>>
>> The userspace header file has the ioprio_value() that a user should use to
>> construct an ioprio. Bad values are checked in that function and errors will be
>> returned if an invalid level is passed.
>>
> 
> OK. Thanks for the detailed explanation.
> 
> I agree, to use unused bits, functionality (return value in this case)
> will be changed. If applications are built using Kernel headers of
> v6.1 (doesn't include eca2040972b4) and later only upgrading Kernel to
> v6.6, because of the changes in return values applications may have
> some sort of regression.

Which would mean that the application was not first fixed to handle the error
return, and so it would still being doing the wrong thing on 6.6 as well as on
6.1. So I do not see any regression here...

> To make the software backward compatible I believe, unused bits should
> always be ignored. So that if in future someone uses it, it should not
> change the behaviour (return values) of existing software.

There are no unused bits anymore since kernel 6.5: hints = 10 bits, level = 3
bits, and class = 3 bits, for a total of 16-bits for the entire ioprio.

So if an application attempt to set an invalid level that uses more than 3-bits,
the extra bits will be ignored in the kernel and such bad level value can be
caught only if the application uses the ioprio_value() helper defined in
include/uapi/linux/ioprio.h.

And note that before 6.5, checking the prio level was also far from perfect: if
the user tried to set the prio level to a value using more than 13-bits, the
extra bits would be ignored too and the bad value would NOT necessarily cause an
error (e.g. using 1 << 13 = 8192 as the level would result in the kernel seeing
level = 0 and not throw an error). The introduction of ioprio_value() allows
catching these issues, if the application use it.
diff mbox series

Patch

diff --git a/block/ioprio.c b/block/ioprio.c
index 73301a2..f08e76b 100644
--- a/block/ioprio.c
+++ b/block/ioprio.c
@@ -30,6 +30,15 @@ 
 #include <linux/security.h>
 #include <linux/pid_namespace.h>
 
+static inline int ioprio_check_level(int ioprio, int max_level)
+{
+	int data = IOPRIO_PRIO_DATA(ioprio);
+
+	if (IOPRIO_BAD_VALUE(data, max_level))
+		return -EINVAL;
+	return 0;
+}
+
 int ioprio_check_cap(int ioprio)
 {
 	int class = IOPRIO_PRIO_CLASS(ioprio);
@@ -49,7 +58,7 @@  int ioprio_check_cap(int ioprio)
 			fallthrough;
 			/* rt has prio field too */
 		case IOPRIO_CLASS_BE:
-			if (level >= IOPRIO_NR_LEVELS)
+			if (ioprio_check_level(ioprio, IOPRIO_NR_LEVELS))
 				return -EINVAL;
 			break;
 		case IOPRIO_CLASS_IDLE: