diff mbox series

[v2] fpga: altera-cvp: Truncated bitstream error support

Message ID 20220520013040.2920835-1-tien.sung.ang@intel.com (mailing list archive)
State New
Headers show
Series [v2] fpga: altera-cvp: Truncated bitstream error support | expand

Commit Message

Ang, Tien Sung May 20, 2022, 1:30 a.m. UTC
From: Ang Tien Sung <tien.sung.ang@intel.com>

To support the error handling of a truncated bitstream sent.
The current AIB CvP firmware is not capable of handling a
data stream smaller than 4096bytes. The firmware's limitation
causes a hung-up as it's DMA engine waits forever for the
completion of the instructed 4096bytes.
To resolve this design limitation, both firmware and CvP
driver made several changes. At the CvP driver, we just
have to ensure that anything lesser than 4096bytes are
padded with extra bytes. The CvP will then, initiate the
tear-down by clearing the START_XFER and CVP_CONFIG bits.
We should also check for CVP_ERROR during the CvP completion.
A send_buf which is always 4096bytes is used to copy the
data during every transaction.

Signed-off-by: Ang Tien Sung <tien.sung.ang@intel.com>
---
changelog v2:
* Alignment fix parameter 'conf' altera_cvp_send_block
---
 drivers/fpga/altera-cvp.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

Comments

Xu Yilun May 28, 2022, 9:59 a.m. UTC | #1
On Fri, May 20, 2022 at 09:30:40AM +0800, tien.sung.ang@intel.com wrote:
> From: Ang Tien Sung <tien.sung.ang@intel.com>
> 
> To support the error handling of a truncated bitstream sent.

A blank line here.

> The current AIB CvP firmware is not capable of handling a
> data stream smaller than 4096bytes. The firmware's limitation

So why don't you check the image size on write_init(), and just prevent
the DMA writing at the very beginning?

> causes a hung-up as it's DMA engine waits forever for the
> completion of the instructed 4096bytes.

A blank line here.

> To resolve this design limitation, both firmware and CvP
> driver made several changes. At the CvP driver, we just
> have to ensure that anything lesser than 4096bytes are
> padded with extra bytes. The CvP will then, initiate the
> tear-down by clearing the START_XFER and CVP_CONFIG bits.

The driver pads the data block to 4096 bytes, then why the CvP still
should fail the reprograming?

If the image size is larger than 1 Page but is not aligned to 1 Page,
will the reprogramming still fail?

> We should also check for CVP_ERROR during the CvP completion.
> A send_buf which is always 4096bytes is used to copy the
> data during every transaction.
> 
> Signed-off-by: Ang Tien Sung <tien.sung.ang@intel.com>
> ---
> changelog v2:
> * Alignment fix parameter 'conf' altera_cvp_send_block
> ---
>  drivers/fpga/altera-cvp.c | 24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
> index 4ffb9da537d8..5169f9bcd726 100644
> --- a/drivers/fpga/altera-cvp.c
> +++ b/drivers/fpga/altera-cvp.c
> @@ -81,6 +81,7 @@ struct altera_cvp_conf {
>  	u8			numclks;
>  	u32			sent_packets;
>  	u32			vsec_offset;
> +	u8			*send_buf;
>  	const struct cvp_priv	*priv;
>  };
>  
> @@ -453,7 +454,11 @@ static int altera_cvp_write(struct fpga_manager *mgr, const char *buf,
>  		}
>  
>  		len = min(conf->priv->block_size, remaining);
> -		altera_cvp_send_block(conf, data, len);
> +		/* Copy the requested host data into the transmit buffer */
> +

This blank line is not needed.

> +		memcpy(conf->send_buf, data, len);

Any padding value is OK?

> +		altera_cvp_send_block(conf, (const u32 *)conf->send_buf,
> +				      conf->priv->block_size);

If the len equals block_size, is the copy still needed?

>  		data += len / sizeof(u32);
>  		done += len;
>  		remaining -= len;
> @@ -492,10 +497,13 @@ static int altera_cvp_write_complete(struct fpga_manager *mgr,
>  	if (ret)
>  		return ret;
>  
> -	/* STEP 16 - check CVP_CONFIG_ERROR_LATCHED bit */
> -	altera_read_config_dword(conf, VSE_UNCOR_ERR_STATUS, &val);
> -	if (val & VSE_UNCOR_ERR_CVP_CFG_ERR) {
> -		dev_err(&mgr->dev, "detected CVP_CONFIG_ERROR_LATCHED!\n");
> +	/*
> +	 * STEP 16 - If bitstream error (truncated/miss-matched),
> +	 * we shall exit here.
> +	 */
> +	ret = altera_read_config_dword(conf, VSE_CVP_STATUS, &val);
> +	if (ret || (val & VSE_CVP_STATUS_CFG_ERR)) {
> +		dev_err(&mgr->dev, "CVP_CONFIG_ERROR!\n");

So this new error checking covers the previous "latched error" case?

>  		return -EPROTO;
>  	}
>  
> @@ -661,6 +669,12 @@ static int altera_cvp_probe(struct pci_dev *pdev,
>  
>  	pci_set_drvdata(pdev, mgr);
>  
> +	/* Allocate the 4096 block size transmit buffer */
> +	conf->send_buf = devm_kzalloc(&pdev->dev, conf->priv->block_size, GFP_KERNEL);

If block_size == ALTERA_CVP_V1_SIZE, the copy is still needed?

> +	if (!conf->send_buf) {
> +		ret = -ENOMEM;
> +		goto err_unmap;
> +	}

Maybe it is better move the buffer allocation to write_init()

Thanks,
Yilun

>  	return 0;
>  
>  err_unmap:
> -- 
> 2.25.1
Ang, Tien Sung May 31, 2022, 1:53 a.m. UTC | #2
hi,

I will fix the "blank line" issues. 

>> The current AIB CvP firmware is not capable of handling a
>> data stream smaller than 4096bytes. The firmware's limitation

>So why don't you check the image size on write_init(), and just prevent
>the DMA writing at the very beginning?
We try not to add any kind of logic in the CvP driver to read
or understand the correctness of the bitstream. In future, this
logic may change. The CvP driver to our best interest should
only be responsible to perform the transfer of the bitstream
and perform the respective handshake. We have plans to have
the CvP firmware on the other side be more intelligent to 
handle errors and inform the CvP driver on why a certain 
configuration session fails. We are in the process of making
the next generation CvP to close this gaps.

>> To resolve this design limitation, both firmware and CvP
>> driver made several changes. At the CvP driver, we just
>> have to ensure that anything lesser than 4096bytes are
>> padded with extra bytes. The CvP will then, initiate the
>> tear-down by clearing the START_XFER and CVP_CONFIG bits.

> The driver pads the data block to 4096 bytes, then why the CvP still
> should fail the reprograming?
The block size of 4096bytes containts headers and signatures that
comprise a bitstream to FPGA. Any of the missing pieces will 
cause the FPGA configuration to fail at the SDM (Secure device
manager). We are at Intel are just trying to avoid a user from
using a bad data to reconfigure the FPGA via CvP and then, hitting
into a fault that is non-recoverable. Only a hard-reboot of the 
entire system can undo it. 

> If the image size is larger than 1 Page but is not aligned to 1 Page,
> will the reprogramming still fail?
Yes, the reconfiguration will fail. The above tear-down  is to prevent
that CvP Hardware/firmware in the FPGA from entering into a dead-lock.


>> +		memcpy(conf->send_buf, data, len);

>Any padding value is OK?
Yes, any padding value is fine. The reason for this is that, the
CvP Firmware/Hardware on the other side is using DMA to perform
transfers of 4096bytes. The firmware will wait forever for the
DMA to complete. The firmware is not able to check for status
or respond to any errors while in the wait state. 

>> +		altera_cvp_send_block(conf, (const u32 *)conf->send_buf,
>> +				      conf->priv->block_size);

>If the len equals block_size, is the copy still needed?
Actually, not required. But to maintain a simple design, we use
a common flow for all so you can skip the check.

>> +	ret = altera_read_config_dword(conf, VSE_CVP_STATUS, &val);
>> +	if (ret || (val & VSE_CVP_STATUS_CFG_ERR)) {
>> +		dev_err(&mgr->dev, "CVP_CONFIG_ERROR!\n");

>So this new error checking covers the previous "latched error" case?
Yes, because the new feature requires for the CvP driver to 
be able to subsequently re-perform a fresh new CvP session after
an error. With the above, the CvP driver will forever report
an error and caused the entire CvP to be permanently in error. 
This is not desirable and the user would need to power-cycle the
or hard reboot the entire system to recover. Intel sees this as
a terrible user-experience espcially as CvP is performed 
normally remotely.

>> +	/* Allocate the 4096 block size transmit buffer */
>> +	conf->send_buf = devm_kzalloc(&pdev->dev, conf->priv->block_size, GFP_KERNEL);

>If block_size == ALTERA_CVP_V1_SIZE, the copy is still needed?
Yes, the firmware and hardware on the other side uses DMA
and V1 to my understanding has a smaller buffer size. Intel 
later improved this and introduced a new back-pressure credit 
data flow control to help increase the performance.

>> +	if (!conf->send_buf) {
>> +		ret = -ENOMEM;
>> +		goto err_unmap;
>> +	}

>Maybe it is better move the buffer allocation to write_init()
The altera_cvp_conf has not been created yet at this point.
The init routine in CvP so far just registers the driver.
Xu Yilun June 3, 2022, 9:49 a.m. UTC | #3
On Tue, May 31, 2022 at 09:53:56AM +0800, tien.sung.ang@intel.com wrote:
> hi,
> 
> I will fix the "blank line" issues. 

Please don't top post.

And you don't have to clean too much context, otherwise people have to
search previous mails to remember what is it. People may not read the
mail at once.

> 
> >> The current AIB CvP firmware is not capable of handling a
> >> data stream smaller than 4096bytes. The firmware's limitation
> 
> >So why don't you check the image size on write_init(), and just prevent
> >the DMA writing at the very beginning?
> We try not to add any kind of logic in the CvP driver to read
> or understand the correctness of the bitstream. In future, this
> logic may change. The CvP driver to our best interest should

But you are adding the logic in altera_cvp_write() to keep the
correctness. What's the difference adding it somewhere else? And as I
can see, preventing the writing at the very beginning is a much cleaner
way.

> only be responsible to perform the transfer of the bitstream
> and perform the respective handshake. We have plans to have
> the CvP firmware on the other side be more intelligent to 
> handle errors and inform the CvP driver on why a certain 
> configuration session fails. We are in the process of making
> the next generation CvP to close this gaps.
> 
> >> To resolve this design limitation, both firmware and CvP
> >> driver made several changes. At the CvP driver, we just
> >> have to ensure that anything lesser than 4096bytes are
> >> padded with extra bytes. The CvP will then, initiate the
> >> tear-down by clearing the START_XFER and CVP_CONFIG bits.
> 
> > The driver pads the data block to 4096 bytes, then why the CvP still
> > should fail the reprograming?
> The block size of 4096bytes containts headers and signatures that
> comprise a bitstream to FPGA. Any of the missing pieces will 
> cause the FPGA configuration to fail at the SDM (Secure device
> manager). We are at Intel are just trying to avoid a user from
> using a bad data to reconfigure the FPGA via CvP and then, hitting
> into a fault that is non-recoverable. Only a hard-reboot of the 
> entire system can undo it. 
> 
> > If the image size is larger than 1 Page but is not aligned to 1 Page,
> > will the reprogramming still fail?
> Yes, the reconfiguration will fail. The above tear-down  is to prevent
> that CvP Hardware/firmware in the FPGA from entering into a dead-lock.

So if the image size is not aligned to 1 Page, the reprogram should fail
anyway, is it? Then I really recommend you fail the reprogramming at the
very beginning.

> 
> 
> >> +		memcpy(conf->send_buf, data, len);
> 
> >Any padding value is OK?
> Yes, any padding value is fine. The reason for this is that, the
> CvP Firmware/Hardware on the other side is using DMA to perform
> transfers of 4096bytes. The firmware will wait forever for the
> DMA to complete. The firmware is not able to check for status
> or respond to any errors while in the wait state. 
> 
> >> +		altera_cvp_send_block(conf, (const u32 *)conf->send_buf,
> >> +				      conf->priv->block_size);
> 
> >If the len equals block_size, is the copy still needed?
> Actually, not required. But to maintain a simple design, we use
> a common flow for all so you can skip the check.

I don't think so. We should make the code reasonable. Blindly copy the
whole buffer impacts all normal cases, and makes people confused. The
code seems shorter now, but not simpler, it takes people more time to
figure out why.

> 
> >> +	ret = altera_read_config_dword(conf, VSE_CVP_STATUS, &val);
> >> +	if (ret || (val & VSE_CVP_STATUS_CFG_ERR)) {
> >> +		dev_err(&mgr->dev, "CVP_CONFIG_ERROR!\n");
> 
> >So this new error checking covers the previous "latched error" case?
> Yes, because the new feature requires for the CvP driver to 
> be able to subsequently re-perform a fresh new CvP session after
> an error. With the above, the CvP driver will forever report
> an error and caused the entire CvP to be permanently in error. 
> This is not desirable and the user would need to power-cycle the
> or hard reboot the entire system to recover. Intel sees this as
> a terrible user-experience espcially as CvP is performed 
> normally remotely.
> 
> >> +	/* Allocate the 4096 block size transmit buffer */
> >> +	conf->send_buf = devm_kzalloc(&pdev->dev, conf->priv->block_size, GFP_KERNEL);
> 
> >If block_size == ALTERA_CVP_V1_SIZE, the copy is still needed?
> Yes, the firmware and hardware on the other side uses DMA
> and V1 to my understanding has a smaller buffer size. Intel 
> later improved this and introduced a new back-pressure credit 
> data flow control to help increase the performance.
> 
> >> +	if (!conf->send_buf) {
> >> +		ret = -ENOMEM;
> >> +		goto err_unmap;
> >> +	}
> 
> >Maybe it is better move the buffer allocation to write_init()
> The altera_cvp_conf has not been created yet at this point.
> The init routine in CvP so far just registers the driver.

I meant moving it to altera_cvp_write_init(). But now I think it is not
necessary. Bouncing buffer is not necessary at all.

Thanks,
Yilun
Ang, Tien Sung June 7, 2022, 5:55 a.m. UTC | #4
> But you are adding the logic in altera_cvp_write() to keep the
> correctness. What's the difference adding it somewhere else? And as I
> can see, preventing the writing at the very beginning is a much cleaner
> way.

Though without doubt, your solution is doable, I have a 
discussion with the Intel architect and they insisted that
the device driver must not make such a decision. The decision to 
drop or accept a transfer is up to the firmware. The firmware
insisted that the buffer be padded with whatever value. They
desire the transfer protocol to be obeyed to ensure, that 
there is no hard dependencies on the device driver if they
do one day change the 4kbytes to some other smaller value.

> > > If the image size is larger than 1 Page but is not aligned to 1 Page,
> > > will the reprogramming still fail?
> > Yes, the reconfiguration will fail. The above tear-down  is to prevent
> > that CvP Hardware/firmware in the FPGA from entering into a dead-lock.

> So if the image size is not aligned to 1 Page, the reprogram should fail
> anyway, is it? Then I really recommend you fail the reprogramming at the
> very beginning.

Same reasoning as above. We don't want the driver to make this
decision.

> > >> +		altera_cvp_send_block(conf, (const u32 *)conf->send_buf,
> > >> +				      conf->priv->block_size);
> > 
> > >If the len equals block_size, is the copy still needed?
> > Actually, not required. But to maintain a simple design, we use
> > a common flow for all so you can skip the check.

> I don't think so. We should make the code reasonable. Blindly copy the
> whole buffer impacts all normal cases, and makes people confused. The
> code seems shorter now, but not simpler, it takes people more time to
> figure out why.

Yes, it could look confusing to other programmers. And, yes, the 
padding doesn't matter. Let me relook into this. As the driver is 
already re-tested by the silicon validatioin. 
I want to avoid making any change as it
would meant another couple of weeks of re-testing. 
Can this be accepted as it is?
Xu Yilun June 8, 2022, 10:30 a.m. UTC | #5
On Tue, Jun 07, 2022 at 01:55:30PM +0800, tien.sung.ang@intel.com wrote:
> > But you are adding the logic in altera_cvp_write() to keep the
> > correctness. What's the difference adding it somewhere else? And as I
> > can see, preventing the writing at the very beginning is a much cleaner
> > way.
> 
> Though without doubt, your solution is doable, I have a 
> discussion with the Intel architect and they insisted that
> the device driver must not make such a decision. The decision to 
> drop or accept a transfer is up to the firmware. The firmware

Why dropping or accepting is forbidden, but padding is allowed? If the
decision is no operation to the data, then padding should also be
forbidden.

> insisted that the buffer be padded with whatever value. They

If I understand right, the 2 decisions are conflicted. On one hand,
the driver should not care about the data. On another hand, the
driver should still care about the data for some rule.

So please firstly reach your agreement before making patches.

> desire the transfer protocol to be obeyed to ensure, that 
> there is no hard dependencies on the device driver if they
> do one day change the 4kbytes to some other smaller value.
> 
> > > > If the image size is larger than 1 Page but is not aligned to 1 Page,
> > > > will the reprogramming still fail?
> > > Yes, the reconfiguration will fail. The above tear-down  is to prevent
> > > that CvP Hardware/firmware in the FPGA from entering into a dead-lock.
> 
> > So if the image size is not aligned to 1 Page, the reprogram should fail
> > anyway, is it? Then I really recommend you fail the reprogramming at the
> > very beginning.
> 
> Same reasoning as above. We don't want the driver to make this
> decision.
> 
> > > >> +		altera_cvp_send_block(conf, (const u32 *)conf->send_buf,
> > > >> +				      conf->priv->block_size);
> > > 
> > > >If the len equals block_size, is the copy still needed?
> > > Actually, not required. But to maintain a simple design, we use
> > > a common flow for all so you can skip the check.
> 
> > I don't think so. We should make the code reasonable. Blindly copy the
> > whole buffer impacts all normal cases, and makes people confused. The
> > code seems shorter now, but not simpler, it takes people more time to
> > figure out why.
> 
> Yes, it could look confusing to other programmers. And, yes, the 
> padding doesn't matter. Let me relook into this. As the driver is 
> already re-tested by the silicon validatioin. 
> I want to avoid making any change as it
> would meant another couple of weeks of re-testing. 
> Can this be accepted as it is?

Sorry, I can't. Some clarification is needed.

Thanks,
Yilun
Ang, Tien Sung June 27, 2022, 7:45 a.m. UTC | #6
> > Though without doubt, your solution is doable, I have a 
> > discussion with the Intel architect and they insisted that
> > the device driver must not make such a decision. The decision to 
> > drop or accept a transfer is up to the firmware. The firmware

> Why dropping or accepting is forbidden, but padding is allowed? If the
> decision is no operation to the data, then padding should also be
> forbidden.

> > insisted that the buffer be padded with whatever value. They

> If I understand right, the 2 decisions are conflicted. On one hand,
> the driver should not care about the data. On another hand, the
> driver should still care about the data for some rule.

> So please firstly reach your agreement before making patches.

Had some delays in replying,from the perspective of the driver, 
Intel architects want it to be just purely on the transfer of 
data without inspecting the data. 
If the size of a transfer is 4096bytes, we can padd it. 
I can do that which I think it is not an issue.
Let me create a reworked patch to pad the payload with Zeros.
Would this be OK for patch v3?

> > Yes, it could look confusing to other programmers. And, yes, the 
> > padding doesn't matter. Let me relook into this. As the driver is 
> > already re-tested by the silicon validatioin. 
> > I want to avoid making any change as it
> > would meant another couple of weeks of re-testing. 
> > Can this be accepted as it is?

> Sorry, I can't. Some clarification is needed.

Reply as above, let us implement the padding in the
next patch v3. Appreciate your inputs. Thanks
Xu Yilun June 27, 2022, 8:52 a.m. UTC | #7
On Mon, Jun 27, 2022 at 03:45:53PM +0800, tien.sung.ang@intel.com wrote:
> > > Though without doubt, your solution is doable, I have a 
> > > discussion with the Intel architect and they insisted that
> > > the device driver must not make such a decision. The decision to 
> > > drop or accept a transfer is up to the firmware. The firmware
> 
> > Why dropping or accepting is forbidden, but padding is allowed? If the
> > decision is no operation to the data, then padding should also be
> > forbidden.
> 
> > > insisted that the buffer be padded with whatever value. They
> 
> > If I understand right, the 2 decisions are conflicted. On one hand,
> > the driver should not care about the data. On another hand, the
> > driver should still care about the data for some rule.
> 
> > So please firstly reach your agreement before making patches.
> 
> Had some delays in replying,from the perspective of the driver, 
> Intel architects want it to be just purely on the transfer of 
> data without inspecting the data. 
> If the size of a transfer is 4096bytes, we can padd it. 
> I can do that which I think it is not an issue.
> Let me create a reworked patch to pad the payload with Zeros.

No, it's not about padding with Zeros or other values. If we should
stick on no image touching, then please fix the firmware. If we need
a workaround in driver, then making a decent fix. Blindly copy the image
is just a bad idea, and it could actually be avoided.

Thanks,
Yilun

> Would this be OK for patch v3?
> 
> > > Yes, it could look confusing to other programmers. And, yes, the 
> > > padding doesn't matter. Let me relook into this. As the driver is 
> > > already re-tested by the silicon validatioin. 
> > > I want to avoid making any change as it
> > > would meant another couple of weeks of re-testing. 
> > > Can this be accepted as it is?
> 
> > Sorry, I can't. Some clarification is needed.
> 
> Reply as above, let us implement the padding in the
> next patch v3. Appreciate your inputs. Thanks
diff mbox series

Patch

diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
index 4ffb9da537d8..5169f9bcd726 100644
--- a/drivers/fpga/altera-cvp.c
+++ b/drivers/fpga/altera-cvp.c
@@ -81,6 +81,7 @@  struct altera_cvp_conf {
 	u8			numclks;
 	u32			sent_packets;
 	u32			vsec_offset;
+	u8			*send_buf;
 	const struct cvp_priv	*priv;
 };
 
@@ -453,7 +454,11 @@  static int altera_cvp_write(struct fpga_manager *mgr, const char *buf,
 		}
 
 		len = min(conf->priv->block_size, remaining);
-		altera_cvp_send_block(conf, data, len);
+		/* Copy the requested host data into the transmit buffer */
+
+		memcpy(conf->send_buf, data, len);
+		altera_cvp_send_block(conf, (const u32 *)conf->send_buf,
+				      conf->priv->block_size);
 		data += len / sizeof(u32);
 		done += len;
 		remaining -= len;
@@ -492,10 +497,13 @@  static int altera_cvp_write_complete(struct fpga_manager *mgr,
 	if (ret)
 		return ret;
 
-	/* STEP 16 - check CVP_CONFIG_ERROR_LATCHED bit */
-	altera_read_config_dword(conf, VSE_UNCOR_ERR_STATUS, &val);
-	if (val & VSE_UNCOR_ERR_CVP_CFG_ERR) {
-		dev_err(&mgr->dev, "detected CVP_CONFIG_ERROR_LATCHED!\n");
+	/*
+	 * STEP 16 - If bitstream error (truncated/miss-matched),
+	 * we shall exit here.
+	 */
+	ret = altera_read_config_dword(conf, VSE_CVP_STATUS, &val);
+	if (ret || (val & VSE_CVP_STATUS_CFG_ERR)) {
+		dev_err(&mgr->dev, "CVP_CONFIG_ERROR!\n");
 		return -EPROTO;
 	}
 
@@ -661,6 +669,12 @@  static int altera_cvp_probe(struct pci_dev *pdev,
 
 	pci_set_drvdata(pdev, mgr);
 
+	/* Allocate the 4096 block size transmit buffer */
+	conf->send_buf = devm_kzalloc(&pdev->dev, conf->priv->block_size, GFP_KERNEL);
+	if (!conf->send_buf) {
+		ret = -ENOMEM;
+		goto err_unmap;
+	}
 	return 0;
 
 err_unmap: