[v1,3/3] media: cedrus: Fix endless loop in cedrus_h265_skip_bits()

Message ID 20220818203308.439043-4-nicolas.dufresne@collabora.com (mailing list archive)
State New, archived
Series [v1,1/3] media: cedrus: Fix watchdog race condition

Commit Message

Nicolas Dufresne Aug. 18, 2022, 8:33 p.m. UTC
From: Dmitry Osipenko <dmitry.osipenko@collabora.com>

The busy status bit may never de-assert if the number of programmed skip
bits is incorrect, resulting in a kernel hang because the bit is polled
endlessly in the code. Fix it by adding a timeout to the bit-polling.
Userspace can reproduce the problem by setting the data_bit_offset field
of the HEVC slice params to a wrong value.

Cc: stable@vger.kernel.org
Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
---
 drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
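
For readers without the rest of the series at hand, the idea of bounded
bit-polling described in the commit message can be sketched in plain
userspace C. This is only an illustration under stated assumptions: it is
not the driver's cedrus_wait_for(), and struct fake_hw, fake_read_status(),
wait_for_clear(), STATUS_VLD_BUSY and POLL_MAX_TRIES are all hypothetical
names invented for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the real register layout or timeout. */
#define STATUS_VLD_BUSY	(1u << 0)
#define POLL_MAX_TRIES	100

/*
 * Simulated status register. busy_reads > 0 means the busy bit is
 * reported for that many more reads; busy_reads < 0 models hardware
 * whose busy bit never de-asserts.
 */
struct fake_hw {
	int busy_reads;
};

static uint32_t fake_read_status(struct fake_hw *hw)
{
	if (hw->busy_reads < 0)
		return STATUS_VLD_BUSY;

	if (hw->busy_reads > 0) {
		hw->busy_reads--;
		return STATUS_VLD_BUSY;
	}

	return 0;
}

/*
 * Poll until 'flag' clears, giving up after POLL_MAX_TRIES reads.
 * Returns 0 on success or -ETIMEDOUT if the flag stayed set.
 */
static int wait_for_clear(struct fake_hw *hw, uint32_t flag)
{
	int tries;

	assert(flag != 0);

	for (tries = 0; tries < POLL_MAX_TRIES; tries++) {
		if (!(fake_read_status(hw) & flag))
			return 0;
		/* A real driver would delay between reads here. */
	}

	return -ETIMEDOUT;
}
```

With a stuck busy bit, the old unbounded while loop would spin forever,
whereas a loop of this shape hands control back to the caller with an
error code.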

Comments

Dmitry Osipenko Aug. 18, 2022, 8:39 p.m. UTC | #1
On 8/18/22 23:33, Nicolas Dufresne wrote:
> From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> 
> The busy status bit may never de-assert if number of programmed skip
> bits is incorrect, resulting in a kernel hang because the bit is polled
> endlessly in the code. Fix it by adding timeout for the bit-polling.
> This problem is reproducible by setting the data_bit_offset field of
> the HEVC slice params to a wrong value by userspace.
> 
> Cc: stable@vger.kernel.org
> Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> ---
>  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> index f703c585d91c5..f0bc118021b0a 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
>  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
>  {
>  	int count = 0;
> +	u32 reg;

This "reg" variable isn't needed anymore after switching to
cedrus_wait_for(). Sorry, I missed it :)

>  	while (count < num) {
>  		int tmp = min(num - count, 32);
> @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
>  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
>  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
>  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> -			udelay(1);
> +
> +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
>  
>  		count += tmp;
>  	}
Nicolas Dufresne Aug. 18, 2022, 9:17 p.m. UTC | #2
On Thursday, 18 August 2022 at 23:39 +0300, Dmitry Osipenko wrote:
> On 8/18/22 23:33, Nicolas Dufresne wrote:
> > From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > 
> > The busy status bit may never de-assert if number of programmed skip
> > bits is incorrect, resulting in a kernel hang because the bit is polled
> > endlessly in the code. Fix it by adding timeout for the bit-polling.
> > This problem is reproducible by setting the data_bit_offset field of
> > the HEVC slice params to a wrong value by userspace.
> > 
> > Cc: stable@vger.kernel.org
> > Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > ---
> >  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > index f703c585d91c5..f0bc118021b0a 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
> >  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> >  {
> >  	int count = 0;
> > +	u32 reg;
> 
> This "reg" variable isn't needed anymore after switching to
> cedrus_wait_for(). Sorry, I missed it :)

Good catch thanks, will fix.

> 
> >  	while (count < num) {
> >  		int tmp = min(num - count, 32);
> > @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> >  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
> >  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
> >  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> > -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> > -			udelay(1);
> > +
> > +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> > +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
> >  
> >  		count += tmp;
> >  	}
> 
>
Jernej Škrabec Aug. 19, 2022, 4:16 a.m. UTC | #3
On Thursday, 18 August 2022 at 22:33:08 CEST, Nicolas Dufresne wrote:
> From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> 
> The busy status bit may never de-assert if number of programmed skip
> bits is incorrect, resulting in a kernel hang because the bit is polled
> endlessly in the code. Fix it by adding timeout for the bit-polling.
> This problem is reproducible by setting the data_bit_offset field of
> the HEVC slice params to a wrong value by userspace.
> 
> Cc: stable@vger.kernel.org
> Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>

Fixes tag would be nice.

> ---
>  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> index f703c585d91c5..f0bc118021b0a 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
>  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
>  {
>  	int count = 0;
> +	u32 reg;
> 
>  	while (count < num) {
>  		int tmp = min(num - count, 32);
> @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
>  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
>  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
>  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> -			udelay(1);
> +
> +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");

Reporting the issue is nice, but it would be better to propagate the error,
since there is no way to properly decode this slice if the code block above fails.

Best regards,
Jernej

> 
>  		count += tmp;
>  	}
Nicolas Dufresne Aug. 19, 2022, 3:39 p.m. UTC | #4
On Friday, 19 August 2022 at 06:16 +0200, Jernej Škrabec wrote:
> On Thursday, 18 August 2022 at 22:33:08 CEST, Nicolas Dufresne wrote:
> > From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > 
> > The busy status bit may never de-assert if number of programmed skip
> > bits is incorrect, resulting in a kernel hang because the bit is polled
> > endlessly in the code. Fix it by adding timeout for the bit-polling.
> > This problem is reproducible by setting the data_bit_offset field of
> > the HEVC slice params to a wrong value by userspace.
> > 
> > Cc: stable@vger.kernel.org
> > Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> 
> Fixes tag would be nice.
> 
> > ---
> >  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > index f703c585d91c5..f0bc118021b0a 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
> >  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> >  {
> >  	int count = 0;
> > +	u32 reg;
> > 
> >  	while (count < num) {
> >  		int tmp = min(num - count, 32);
> > @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> >  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
> >  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
> >  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> > -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> > -			udelay(1);
> > +
> > +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> > +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
> 
> Reporting the issue is nice, but it would be better to propagate the error,
> since there is no way to properly decode this slice if the code block above fails.

This mimics what was already there; mind if we do that later? The propagation is
going to be a lot more intrusive.

> 
> Best regards,
> Jernej
> 
> > 
> >  		count += tmp;
> >  	}
> 
> 
> 
>
Jernej Škrabec Aug. 25, 2022, 9:13 p.m. UTC | #5
On Friday, 19 August 2022 at 17:39:25 CEST, Nicolas Dufresne wrote:
> On Friday, 19 August 2022 at 06:16 +0200, Jernej Škrabec wrote:
> > On Thursday, 18 August 2022 at 22:33:08 CEST, Nicolas Dufresne wrote:
> > > From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > > 
> > > The busy status bit may never de-assert if number of programmed skip
> > > bits is incorrect, resulting in a kernel hang because the bit is polled
> > > endlessly in the code. Fix it by adding timeout for the bit-polling.
> > > This problem is reproducible by setting the data_bit_offset field of
> > > the HEVC slice params to a wrong value by userspace.
> > > 
> > > Cc: stable@vger.kernel.org
> > > Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > > Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > > Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > 
> > Fixes tag would be nice.
> > 
> > > ---
> > >  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > > index f703c585d91c5..f0bc118021b0a 100644
> > > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > > @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
> > >  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> > >  {
> > >  	int count = 0;
> > > +	u32 reg;
> > > 
> > >  	while (count < num) {
> > >  		int tmp = min(num - count, 32);
> > > @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> > >  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
> > >  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
> > >  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> > > -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> > > -			udelay(1);
> > > +
> > > +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> > > +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
> > 
> > 
> > Reporting the issue is nice, but it would be better to propagate the error,
> > since there is no way to properly decode this slice if the code block above fails.
> This mimics what was already there; mind if we do that later? The
> propagation is going to be a lot more intrusive.

Since backporting fixes to kernels before 6.0 isn't a priority, the viability of
backporting isn't that important. You would only need to return 0 or -ETIMEDOUT
and check for the error in one location. That doesn't sound very intrusive to me.

Best regards,
Jernej
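
The propagation suggested above (return 0 or -ETIMEDOUT from the skip
helper and check it in one place) could look roughly like the following
userspace sketch. It is not the driver code: struct fake_dev,
fake_flush_bits(), skip_bits() and setup_slice() are hypothetical names,
and the simulated flush stands in for the trigger write plus the bounded
wait from the patch.

```c
#include <assert.h>
#include <errno.h>

#define MAX_BITS_PER_FLUSH 32

/*
 * Hypothetical device model: flushes_before_stall < 0 means the
 * hardware never stalls; otherwise it times out after that many
 * successful flushes.
 */
struct fake_dev {
	int flushes_before_stall;
	int flushed_bits;
};

/* Stand-in for "write the trigger register, then wait with a timeout". */
static int fake_flush_bits(struct fake_dev *dev, int nbits)
{
	if (dev->flushes_before_stall == 0)
		return -ETIMEDOUT;

	if (dev->flushes_before_stall > 0)
		dev->flushes_before_stall--;

	dev->flushed_bits += nbits;
	return 0;
}

/* Skip 'num' bits in chunks of at most 32, stopping at the first failure. */
static int skip_bits(struct fake_dev *dev, int num)
{
	int count = 0;

	assert(num >= 0);

	while (count < num) {
		int left = num - count;
		int tmp = left < MAX_BITS_PER_FLUSH ? left : MAX_BITS_PER_FLUSH;
		int ret = fake_flush_bits(dev, tmp);

		if (ret)
			return ret;

		count += tmp;
	}

	return 0;
}

/* The single call site: give up on the slice if skipping failed. */
static int setup_slice(struct fake_dev *dev, int data_bit_offset)
{
	int ret = skip_bits(dev, data_bit_offset);

	if (ret)
		return ret;

	/* ... the rest of the slice setup would follow here ... */
	return 0;
}
```

The helper changes from void to int and its one caller gains a single
check, which is the extent of the change described in the reply.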

> 
> > Best regards,
> > Jernej
> > 
> > >  		count += tmp;
> > >  	}
Patch

diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
index f703c585d91c5..f0bc118021b0a 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
@@ -227,6 +227,7 @@  static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
 static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
 {
 	int count = 0;
+	u32 reg;
 
 	while (count < num) {
 		int tmp = min(num - count, 32);
@@ -234,8 +235,9 @@  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
 		cedrus_write(dev, VE_DEC_H265_TRIGGER,
 			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
 			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
-		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
-			udelay(1);
+
+		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
+			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
 
 		count += tmp;
 	}