diff mbox

fpga zynq: Check the bitstream for validity

Message ID 5ea0e77e-11c5-b4f7-00a9-9c5425ffac5a@suse.com (mailing list archive)
State New, archived
Headers show

Commit Message

Matthias Brugger Oct. 28, 2016, 4:36 p.m. UTC
On 28/10/16 17:47, Jason Gunthorpe wrote:
> On Fri, Oct 28, 2016 at 01:06:08PM +0200, Matthias Brugger wrote:
>
>> The only case we don't check is, if count == 0. If we check that here, we
>> can get rid of the count <= 4 check.
>
> You don't think
>
>   if (count == 0 || buf[3] = 'x')
>
> looks weird and wrong? I do.
>

That wasn't what I meant. Apart it looks quite wrong, because when the 
count is zero buf[3] points to anything but a valid value.

>>> The count <= 4 should stay here since it is primarily guarding against
>>> read past the buffer in the if.
>>
>> If you insist in doing this check, it should be count < 4, because we check
>> the first four elements of buf, or do I miss something?
>
> count = 4 and count = 0 are both invalid. A bitstream consisting of
> only the sync word is also going to fail programming.
>
> As Michal said, the actual min bitstream length is probably >> 50 bytes
>

Sure but we are checking here that the bitstream passed to the kernel is 
correct. I was thinking of something like:

Comments

Jason Gunthorpe Oct. 28, 2016, 4:55 p.m. UTC | #1
On Fri, Oct 28, 2016 at 06:36:06PM +0200, Matthias Brugger wrote:

> Sure but we are checking here that the bitstream passed to the kernel is
> correct.

The intent to check if it *possible* that the bitstream is
correct. Correct means that DONE will assert after programming. A 4
byte bitstream will never assert DONE.

Arguably the threshold should be larger but we don't know what the
true minimum is.

> +	/* All valid bitstreams are multiples of 32 bits */
> +	if (!count || (count % 4) != 0)
> +		return -EINVAL;
> +

Too clever for my taste.

Aside from this, is the general idea even OK? In my world I cannonize
the bitstream to start with the sync word, but others may not be doing
that.

I designed this patch based on the prior work to remove the
auto-detection, eg that the driver should be passed a canonized
bitstream.

The problem with the way it is now is how hard it is to figure out
what the right bitstream format should be. Clear instructions to
canonize by droping the header before the sync word and byteswap so
the sync word is in the given order is much clearer..

Jason
Moritz Fischer Oct. 28, 2016, 6:23 p.m. UTC | #2
On Fri, Oct 28, 2016 at 10:55:55AM -0600, Jason Gunthorpe wrote:
> On Fri, Oct 28, 2016 at 06:36:06PM +0200, Matthias Brugger wrote:
> 
> > Sure but we are checking here that the bitstream passed to the kernel is
> > correct.
> 
> The intent to check if it *possible* that the bitstream is
> correct. Correct means that DONE will assert after programming. A 4
> byte bitstream will never assert DONE.
> 
> Arguably the threshold should be larger but we don't know what the
> true minimum is.
> 
> > +	/* All valid bitstreams are multiples of 32 bits */
> > +	if (!count || (count % 4) != 0)
> > +		return -EINVAL;
> > +
> 
> Too clever for my taste.
> 
> Aside from this, is the general idea even OK? In my world I cannonize
> the bitstream to start with the sync word, but others may not be doing
> that.
> 
> I designed this patch based on the prior work to remove the
> auto-detection, eg that the driver should be passed a canonized
> bitstream.

I'm fine with checking for boundary cases where the bitstreams are
obviously too small or wrong size, I'm not convinced that checking using
internal knowledge about the bistream format inside the kernel is the
right place.

> The problem with the way it is now is how hard it is to figure out
> what the right bitstream format should be. Clear instructions to
> canonize by droping the header before the sync word and byteswap so
> the sync word is in the given order is much clearer..

I don't know about you, but for my designs I can just use what drops out
of my Vivado build. Are you unhappy with the way we document which
format to use, or that we don't slice off the beginning (that gets
ignored by hardware?).

Thanks for addressing the issues with the length calculations,

Moritz
Jason Gunthorpe Oct. 28, 2016, 8:26 p.m. UTC | #3
On Fri, Oct 28, 2016 at 11:23:08AM -0700, Moritz Fischer wrote:

> I'm fine with checking for boundary cases where the bitstreams are
> obviously too small or wrong size, I'm not convinced that checking using
> internal knowledge about the bistream format inside the kernel is the
> right place.

To be clear, the sync word is documented in the Xilinx public docs, it
is mandatory. I don't think there is anything wrong with doing basic
validation on the structure of the bitstream before sending it.

> > The problem with the way it is now is how hard it is to figure out
> > what the right bitstream format should be. Clear instructions to
> > canonize by droping the header before the sync word and byteswap so
> > the sync word is in the given order is much clearer..
> 
> I don't know about you, but for my designs I can just use what drops out
> of my Vivado build.

Are you sure? With a 4.8 kernel?

 # (in vivado 2016.3) write_bitstream -bin_file foo
 $ hexdump -C foo.bin
 00000000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
 *
 00000020  00 00 00 bb 11 22 00 44  ff ff ff ff ff ff ff ff  |.....".D........|
 00000030  aa 99 55 66 20 00 00 00  30 02 20 01 00 00 00 00  |..Uf ...0. .....|

So that isn't going to work, it needs byte swapping

 $ hexdump -C foo.bit
 000000a0  bb 11 22 00 44 ff ff ff  ff ff ff ff ff aa 99 55  |..".D..........U|
 000000b0  66 20 00 00 00 30 02 20  01 00 00 00 00 30 02 00  |f ...0. .....0..|

This also is not going to work, it needs byteswapping and the sync word
has to be 4 byte aligned.

What did you do to get a working bitfile? Are you using one of the
Vivado automatic flows that 'handles' this for you? I am not.

Remember, 4.8 now has the patch to remove the autodetection that used
to correct both of the above two problems..

So from my perspective, this driver is incompatible with the output of
the Xilinx tools. I don't really care, we always post-process the
output of write_bitfile, and I am happy to provide a canonized
bitstream, but the driver needs to do more to help people get this
right.

> Are you unhappy with the way we document which format to use, or
> that we don't slice off the beginning (that gets ignored by hardware?).

Well, I'm unhappy I spent an hour wondering why things didn't work
with no information on what the expected format was now for this
driver. For a bit I wondered if there was a driver bug as this stuff
all worked for me with the original xdevcfg driver.

Some of the problems were bugs on my part (which would have been found
much faster with validation), but at the end of the day this is
horribly unfriendly. You get a timeout and a 'Failed' message, thats
it.

I think all common bitstream formatting errors would be detected by
simply validating the sync word.

Jason
Moritz Fischer Oct. 28, 2016, 9 p.m. UTC | #4
Jason,

On Fri, Oct 28, 2016 at 02:26:19PM -0600, Jason Gunthorpe wrote:
> On Fri, Oct 28, 2016 at 11:23:08AM -0700, Moritz Fischer wrote:
> 
> > I'm fine with checking for boundary cases where the bitstreams are
> > obviously too small or wrong size, I'm not convinced that checking using
> > internal knowledge about the bistream format inside the kernel is the
> > right place.
> 
> To be clear, the sync word is documented in the Xilinx public docs, it
> is mandatory. I don't think there is anything wrong with doing basic
> validation on the structure of the bitstream before sending it.

By 'internal' I meant internal to the bitstream. On second thought,
you might have a point with the error message being not helpful (see
below).
> 
> > > The problem with the way it is now is how hard it is to figure out
> > > what the right bitstream format should be. Clear instructions to
> > > canonize by droping the header before the sync word and byteswap so
> > > the sync word is in the given order is much clearer..
> > 
> > I don't know about you, but for my designs I can just use what drops out
> > of my Vivado build.
> 
> Are you sure? With a 4.8 kernel?
> 
>  # (in vivado 2016.3) write_bitstream -bin_file foo
>  $ hexdump -C foo.bin
>  00000000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>  *
>  00000020  00 00 00 bb 11 22 00 44  ff ff ff ff ff ff ff ff  |.....".D........|
>  00000030  aa 99 55 66 20 00 00 00  30 02 20 01 00 00 00 00  |..Uf ...0. .....|
> 
> So that isn't going to work, it needs byte swapping
> 
>  $ hexdump -C foo.bit
>  000000a0  bb 11 22 00 44 ff ff ff  ff ff ff ff ff aa 99 55  |..".D..........U|
>  000000b0  66 20 00 00 00 30 02 20  01 00 00 00 00 30 02 00  |f ...0. .....0..|
> 
> This also is not going to work, it needs byteswapping and the sync word
> has to be 4 byte aligned.
> 
> What did you do to get a working bitfile? Are you using one of the
> Vivado automatic flows that 'handles' this for you? I am not.

https://github.com/EttusResearch/fpgadev/blob/master/usrp3/tools/scripts/viv_utils.tcl#L165

Is what our build process does (we set the byte_swap_bin parameter to
1). You're right in that write_bitstream will give you a non-swapped
version.

> Remember, 4.8 now has the patch to remove the autodetection that used
> to correct both of the above two problems..

The intial version had all the 'autocorrection' stuff in there, it was
then decided that we're not gonna support this.

The fact that we ended up supporting .bin in the initial version, and
then had a 'patch' to remove that was due to Greg pulling in my code too
early (before I could resubmit a squashed version).

If we reevaluate now that we wanna support swapping we should do this at
a framework level. Maybe a preprocess callback or a FPGA_MGR_SWAP flag.
I'll need to think about this :-)

> So from my perspective, this driver is incompatible with the output of
> the Xilinx tools. I don't really care, we always post-process the
> output of write_bitfile, and I am happy to provide a canonized
> bitstream, but the driver needs to do more to help people get this
> right.

Ok, so I'm fine with adding the checks and a warning if you don't find
the sync word. We could add documentation describing which format we
expect.

> 
> > Are you unhappy with the way we document which format to use, or
> > that we don't slice off the beginning (that gets ignored by hardware?).
> 
> Well, I'm unhappy I spent an hour wondering why things didn't work
> with no information on what the expected format was now for this
> driver. For a bit I wondered if there was a driver bug as this stuff
> all worked for me with the original xdevcfg driver.
> 
> Some of the problems were bugs on my part (which would have been found
> much faster with validation), but at the end of the day this is
> horribly unfriendly. You get a timeout and a 'Failed' message, thats
> it.
> 
> I think all common bitstream formatting errors would be detected by
> simply validating the sync word.

Ok. That sounds reasonable.

Cheers

Moritz
Jason Gunthorpe Oct. 28, 2016, 10:05 p.m. UTC | #5
On Fri, Oct 28, 2016 at 02:00:15PM -0700, Moritz Fischer wrote:

> > What did you do to get a working bitfile? Are you using one of the
> > Vivado automatic flows that 'handles' this for you? I am not.
> 
> https://github.com/EttusResearch/fpgadev/blob/master/usrp3/tools/scripts/viv_utils.tcl#L165

Hm 404

> Is what our build process does (we set the byte_swap_bin parameter to
> 1). You're right in that write_bitstream will give you a non-swapped
> version.

?? byte_swap_bin is not documented in UG835

> If we reevaluate now that we wanna support swapping we should do this at
> a framework level. Maybe a preprocess callback or a FPGA_MGR_SWAP flag.
> I'll need to think about this :-)

I'm perfectly fine with the driver only working with a single canonical
bitfile format. That seems completly reasonable, and I prefer an
efficient programming sequence for performance.

Further, eventually this framework is going to have to be fixed to be
able to DMA out of the page cache and in that world there is no
sensible option to process the data before sending it on to the
hardware.

> > So from my perspective, this driver is incompatible with the output of
> > the Xilinx tools. I don't really care, we always post-process the
> > output of write_bitfile, and I am happy to provide a canonized
> > bitstream, but the driver needs to do more to help people get this
> > right.
> 
> Ok, so I'm fine with adding the checks and a warning if you don't find
> the sync word. We could add documentation describing which format we
> expect.

Maybe you could send a patch to update the comments for the driver, or
add a documentation file how to produce an acceptable format using
Xilinx tools..

> Ok. That sounds reasonable.

So, the question still remains, should the driver require the header
be stripped (eg the sync word is the first 4 bytes), or should it
search the first bit for an aligned sync word?

Either requirement is acceptable to the hardware. My patch does the
former, I suspect you need the later?

Jason
Moritz Fischer Oct. 29, 2016, 12:09 a.m. UTC | #6
Jason,

On Fri, Oct 28, 2016 at 04:05:46PM -0600, Jason Gunthorpe wrote:
> On Fri, Oct 28, 2016 at 02:00:15PM -0700, Moritz Fischer wrote:
> 
> > > What did you do to get a working bitfile? Are you using one of the
> > > Vivado automatic flows that 'handles' this for you? I am not.
> > 
> > https://github.com/EttusResearch/fpgadev/blob/master/usrp3/tools/scripts/viv_utils.tcl#L165
> 
> Hm 404

Whooopsie ... that was the internal link. Try that one:

https://github.com/EttusResearch/fpga/blob/master/usrp3/tools/scripts/viv_utils.tcl#L165

It's not a single command but rather a sequence of steps we take to
create an image that works (using write_cfgmem instead of write_binfile)

> 
> > Is what our build process does (we set the byte_swap_bin parameter to
> > 1). You're right in that write_bitstream will give you a non-swapped
> > version.
> 
> ?? byte_swap_bin is not documented in UG835
> 
> > If we reevaluate now that we wanna support swapping we should do this at
> > a framework level. Maybe a preprocess callback or a FPGA_MGR_SWAP flag.
> > I'll need to think about this :-)
> 
> I'm perfectly fine with the driver only working with a single canonical
> bitfile format. That seems completly reasonable, and I prefer an
> efficient programming sequence for performance.
> 
> Further, eventually this framework is going to have to be fixed to be
> able to DMA out of the page cache and in that world there is no
> sensible option to process the data before sending it on to the
> hardware.
> 
> > > So from my perspective, this driver is incompatible with the output of
> > > the Xilinx tools. I don't really care, we always post-process the
> > > output of write_bitfile, and I am happy to provide a canonized
> > > bitstream, but the driver needs to do more to help people get this
> > > right.
> > 
> > Ok, so I'm fine with adding the checks and a warning if you don't find
> > the sync word. We could add documentation describing which format we
> > expect.
> 
> Maybe you could send a patch to update the comments for the driver, or
> add a documentation file how to produce an acceptable format using
> Xilinx tools..

Yeah will do. I don't know if the generation flow outlined above is perfect,
we just pad our images and I haven't run into issues so far.

> So, the question still remains, should the driver require the header
> be stripped (eg the sync word is the first 4 bytes), or should it
> search the first bit for an aligned sync word?

So currently we don't require it to be stripped, changing it so it does
require stripping would break people's setups that already use the
current implementation.
That being said, I don't like the idea of the driver having to search
either...

Michal, Sören you guys have a preference / input?

> Either requirement is acceptable to the hardware. My patch does the
> former, I suspect you need the later?

For my usecases I could deal with either way, looking at backwards
compat the latter one would be preferential I supose ...

Cheers,

Moritz
Jason Gunthorpe Oct. 31, 2016, 4:23 p.m. UTC | #7
On Fri, Oct 28, 2016 at 05:09:26PM -0700, Moritz Fischer wrote:
> It's not a single command but rather a sequence of steps we take to
> create an image that works (using write_cfgmem instead of write_binfile)

Ah, and it relies on newer Vivado features too.. Never had a use for
write_cfgmem before, and wouldn't have thought it did this.

I always do these transform externally and we tack our own header onto
the bitfile as well..

> > So, the question still remains, should the driver require the header
> > be stripped (eg the sync word is the first 4 bytes), or should it
> > search the first bit for an aligned sync word?
> 
> So currently we don't require it to be stripped, changing it so it does
> require stripping would break people's setups that already use the
> current implementation.

Considering there is a way to produce an acceptable bitfile via
write_cfgmem I think we should stick with that and allow the header to
be present, otherwise all users need yet another tool.

I'll send another patch when I get back from the plumbers conference.

> > Either requirement is acceptable to the hardware. My patch does the
> > former, I suspect you need the later?
> 
> For my usecases I could deal with either way, looking at backwards
> compat the latter one would be preferential I supose ...

Well, there are no in-kernel users, and no uapi, so it isn't such a
big deal. I'm actually surprised this got merged without users ..

Jason
Michal Simek Nov. 1, 2016, 6:39 a.m. UTC | #8
Hi Jason,

On 31.10.2016 17:23, Jason Gunthorpe wrote:
> On Fri, Oct 28, 2016 at 05:09:26PM -0700, Moritz Fischer wrote:
>> It's not a single command but rather a sequence of steps we take to
>> create an image that works (using write_cfgmem instead of write_binfile)
> 
> Ah, and it relies on newer Vivado features too.. Never had a use for
> write_cfgmem before, and wouldn't have thought it did this.
> 
> I always do these transform externally and we tack our own header onto
> the bitfile as well..
> 
>>> So, the question still remains, should the driver require the header
>>> be stripped (eg the sync word is the first 4 bytes), or should it
>>> search the first bit for an aligned sync word?
>>
>> So currently we don't require it to be stripped, changing it so it does
>> require stripping would break people's setups that already use the
>> current implementation.
> 
> Considering there is a way to produce an acceptable bitfile via
> write_cfgmem I think we should stick with that and allow the header to
> be present, otherwise all users need yet another tool.
> 
> I'll send another patch when I get back from the plumbers conference.
> 
>>> Either requirement is acceptable to the hardware. My patch does the
>>> former, I suspect you need the later?
>>
>> For my usecases I could deal with either way, looking at backwards
>> compat the latter one would be preferential I supose ...
> 
> Well, there are no in-kernel users, and no uapi, so it isn't such a
> big deal. I'm actually surprised this got merged without users ..

There were several things in this whole thread.

Regarding BIT and BIN format. This support is in vivado for a long time
and it is up2you what you want to support. We have removed that BIT
support and not doing any swap by saying only BIN format is supported.

If driver check header and if it is not valid you should just error out.
If header is fine driver can program it.

Regarding what you need to do in Vivado to get your bitstream right is
completely out of this thread and TBH I don't know it too.

Thanks,
Michal
Jason Gunthorpe Nov. 1, 2016, 3:33 p.m. UTC | #9
On Tue, Nov 01, 2016 at 07:39:22AM +0100, Michal Simek wrote:

> Regarding BIT and BIN format. This support is in vivado for a long time
> and it is up2you what you want to support. We have removed that BIT
> support and not doing any swap by saying only BIN format is supported.

BIN is not supported, it needs a swap as well.

Moritz has it right, you have to use vivado to create a PROM image to be
compatible with the driver.

Jason
Michal Simek Nov. 1, 2016, 5:48 p.m. UTC | #10
On 1.11.2016 16:33, Jason Gunthorpe wrote:
> On Tue, Nov 01, 2016 at 07:39:22AM +0100, Michal Simek wrote:
> 
>> Regarding BIT and BIN format. This support is in vivado for a long time
>> and it is up2you what you want to support. We have removed that BIT
>> support and not doing any swap by saying only BIN format is supported.
> 
> BIN is not supported, it needs a swap as well.
> 
> Moritz has it right, you have to use vivado to create a PROM image to be
> compatible with the driver.

hm than that's bad.

Thanks,
Michal
Jason Gunthorpe Nov. 8, 2016, 12:05 a.m. UTC | #11
On Tue, Nov 01, 2016 at 06:48:42PM +0100, Michal Simek wrote:
> On 1.11.2016 16:33, Jason Gunthorpe wrote:
> > On Tue, Nov 01, 2016 at 07:39:22AM +0100, Michal Simek wrote:
> > 
> >> Regarding BIT and BIN format. This support is in vivado for a long time
> >> and it is up2you what you want to support. We have removed that BIT
> >> support and not doing any swap by saying only BIN format is supported.
> > 
> > BIN is not supported, it needs a swap as well.
> > 
> > Moritz has it right, you have to use vivado to create a PROM image to be
> > compatible with the driver.
> 
> hm than that's bad.

IMHO, Xilinx made an error with Zynq DevC, the DMA does not accept a
memory image that is output by the usual Xilinx tools. It should have
accepted a byte swapped input.

I think Moritz is right, the fpgamgr *should not* alter the bitstream
in any way. This is important for future work to make the DMA do
gather and avoid the really bad high-order allocation.

So users will have to provide byte swapped .bin files - the vivado
write_cfgmem command will produce them - this all needs to be
documented.

Also, I think Punnaiah (?) was telling me that bitstream encryption
does not work - DevC must be told the bitstream is encrypted.
That seems like something that needs work at the fpgamgr level - and
maybe this driver should auto-detect encryption by looking at the
bitfile (as is typical for Xilinx programming)

Jason
Mike Looijmans Nov. 9, 2016, 2:21 p.m. UTC | #12
On 08-11-16 01:05, Jason Gunthorpe wrote:
> On Tue, Nov 01, 2016 at 06:48:42PM +0100, Michal Simek wrote:
>> On 1.11.2016 16:33, Jason Gunthorpe wrote:
>>> On Tue, Nov 01, 2016 at 07:39:22AM +0100, Michal Simek wrote:
>>>
>>>> Regarding BIT and BIN format. This support is in vivado for a long time
>>>> and it is up2you what you want to support. We have removed that BIT
>>>> support and not doing any swap by saying only BIN format is supported.
>>>
>>> BIN is not supported, it needs a swap as well.
>>>
>>> Moritz has it right, you have to use vivado to create a PROM image to be
>>> compatible with the driver.
>>
>> hm than that's bad.
>
> IMHO, Xilinx made an error with Zynq DevC, the DMA does not accept a
> memory image that is output by the usual Xilinx tools. It should have
> accepted a byte swapped input.
>
> I think Moritz is right, the fpgamgr *should not* alter the bitstream
> in any way. This is important for future work to make the DMA do
> gather and avoid the really bad high-order allocation.
>
> So users will have to provide byte swapped .bin files - the vivado
> write_cfgmem command will produce them - this all needs to be
> documented.
>
> Also, I think Punnaiah (?) was telling me that bitstream encryption
> does not work - DevC must be told the bitstream is encrypted.
> That seems like something that needs work at the fpgamgr level - and
> maybe this driver should auto-detect encryption by looking at the
> bitfile (as is typical for Xilinx programming)
>

I think the basic idea behind the commit is flawed to begin with and the patch 
should be discarded completely. The same discussion was done for the Xilinx 
FPGA manager driver, which resulted in the decision that the tooling should 
create a proper file. This driver should either become obsolete or at least 
move into the same direction as the FPGA manager rather than away from that.

Besides political arguments, there's a more pressing technical argument 
against this theck as well: The whole check is pointless since the hardware
itself already verifies the validity of the stream. Sending bitstreams 
intended for other devices has no effect at all. Even sending random data 
doesn't have any effect, the hardware will discard it. There's no reason to 
waste CPU cycles duplicating this check in software.




Kind regards,

Mike Looijmans
System Expert

TOPIC Products
Materiaalweg 4, NL-5681 RJ Best
Postbus 440, NL-5680 AK Best
Telefoon: +31 (0) 499 33 69 79
E-mail: mike.looijmans@topicproducts.com
Website: www.topicproducts.com

Please consider the environment before printing this e-mail
Matthias Brugger Nov. 9, 2016, 3:18 p.m. UTC | #13
On 11/09/2016 03:21 PM, Mike Looijmans wrote:
> On 08-11-16 01:05, Jason Gunthorpe wrote:
>> On Tue, Nov 01, 2016 at 06:48:42PM +0100, Michal Simek wrote:
>>> On 1.11.2016 16:33, Jason Gunthorpe wrote:
>>>> On Tue, Nov 01, 2016 at 07:39:22AM +0100, Michal Simek wrote:
>>>>
>>>>> Regarding BIT and BIN format. This support is in vivado for a long
>>>>> time
>>>>> and it is up2you what you want to support. We have removed that BIT
>>>>> support and not doing any swap by saying only BIN format is supported.
>>>>
>>>> BIN is not supported, it needs a swap as well.
>>>>
>>>> Moritz has it right, you have to use vivado to create a PROM image
>>>> to be
>>>> compatible with the driver.
>>>
>>> hm than that's bad.
>>
>> IMHO, Xilinx made an error with Zynq DevC, the DMA does not accept a
>> memory image that is output by the usual Xilinx tools. It should have
>> accepted a byte swapped input.
>>
>> I think Moritz is right, the fpgamgr *should not* alter the bitstream
>> in any way. This is important for future work to make the DMA do
>> gather and avoid the really bad high-order allocation.
>>
>> So users will have to provide byte swapped .bin files - the vivado
>> write_cfgmem command will produce them - this all needs to be
>> documented.
>>
>> Also, I think Punnaiah (?) was telling me that bitstream encryption
>> does not work - DevC must be told the bitstream is encrypted.
>> That seems like something that needs work at the fpgamgr level - and
>> maybe this driver should auto-detect encryption by looking at the
>> bitfile (as is typical for Xilinx programming)
>>
>
> I think the basic idea behind the commit is flawed to begin with and the
> patch should be discarded completely. The same discussion was done for
> the Xilinx FPGA manager driver, which resulted in the decision that the
> tooling should create a proper file. This driver should either become
> obsolete or at least move into the same direction as the FPGA manager
> rather than away from that.
>
> Besides political arguments, there's a more pressing technical argument
> against this theck as well: The whole check is pointless since the hardware
> itself already verifies the validity of the stream. Sending bitstreams
> intended for other devices has no effect at all. Even sending random
> data doesn't have any effect, the hardware will discard it. There's no
> reason to waste CPU cycles duplicating this check in software.
>

I get your point. Especially looping to the whole file to find the sync 
header can take a while, especially if the file is big and the sync 
header is not present.

So I think the whole idea behind this patch is to provide feedback to 
the user about what went wrong when trying to update the FPGA. Is there 
a way to get this information from the hardware which discards the update?

Regards,
Matthias
Jason Gunthorpe Nov. 9, 2016, 3:56 p.m. UTC | #14
On Wed, Nov 09, 2016 at 03:21:52PM +0100, Mike Looijmans wrote:
> I think the basic idea behind the commit is flawed to begin with and the
> patch should be discarded completely. The same discussion was done for the
> Xilinx FPGA manager driver, which resulted in the decision that the tooling
> should create a proper file.

That hasn't changed at all, this just produces a clear and obvious
message about what is wrong instead of 'timed out'.

Remember, again, the Xilinx tools do not produce an acceptable
bitstream file by default. You need to do very strange and special
things to get something acceptable to this driver. It is a very easy
mistake to make and hard to track down if you don't already know these
details.

> This driver should either become obsolete or at least move into the
> same direction as the FPGA manager rather than away from that.

I don't understand what you are talking about here, this is a fpga mgr
driver already, and is consistent with the FPGA manger - the auto
detect stuff was removed a while ago and isn't coming back.

It is perfectly reasonable for fpga manager drivers to pre-validate
the proposed bitstream, IMHO all of the drivers should try and do
that step.

Remember, it is deeply unlikely but sending garbage to an FPGA could
damage it.

> Besides political arguments, there's a more pressing technical argument
> against this theck as well: The whole check is pointless since the hardware
> itself already verifies the validity of the stream.

The purpose of the check is not to protect the hardware. The check is
to help the user provide the correct input because the hardware
provides no feedback at all on what is wrong with the input.

And again, the out-of-tree Xilinx driver accepted files that this
driver does not, so having a clear and understandable message is going
to be very important for users.

> doesn't have any effect, the hardware will discard it. There's no reason to
> waste CPU cycles duplicating this check in software.

In the typical success case we are talking about 5 32 bit compares,
which isn't even worth considering.

Jason
Jason Gunthorpe Nov. 9, 2016, 4 p.m. UTC | #15
On Wed, Nov 09, 2016 at 04:18:29PM +0100, Matthias Brugger wrote:

> I get your point. Especially looping to the whole file to find the sync
> header can take a while, especially if the file is big and the sync header
> is not present.

Er, no not at all. If you send garbage to the FPGA the driver detects
it after a 2.5s timeout. The memory scan is always alot faster.

In the normal success case it is ~5 compares.

> So I think the whole idea behind this patch is to provide feedback to the
> user about what went wrong when trying to update the FPGA. Is there a way to
> get this information from the hardware which discards the update?

No, not with such specificity.

Jason
Mike Looijmans Nov. 9, 2016, 5:31 p.m. UTC | #16
On 09-11-16 16:56, Jason Gunthorpe wrote:
> On Wed, Nov 09, 2016 at 03:21:52PM +0100, Mike Looijmans wrote:
>> I think the basic idea behind the commit is flawed to begin with and the
>> patch should be discarded completely. The same discussion was done for the
>> Xilinx FPGA manager driver, which resulted in the decision that the tooling
>> should create a proper file.
>
> That hasn't changed at all, this just produces a clear and obvious
> message about what is wrong instead of 'timed out'.
>
> Remember, again, the Xilinx tools do not produce an acceptable
> bitstream file by default. You need to do very strange and special
> things to get something acceptable to this driver. It is a very easy
> mistake to make and hard to track down if you don't already know these
> details.
>
>> This driver should either become obsolete or at least move into the
>> same direction as the FPGA manager rather than away from that.
>
> I don't understand what you are talking about here, this is a fpga mgr
> driver already, and is consistent with the FPGA manger - the auto
> detect stuff was removed a while ago and isn't coming back.

That's exactly what I mean - the code was kept simple.

> It is perfectly reasonable for fpga manager drivers to pre-validate
> the proposed bitstream, IMHO all of the drivers should try and do
> that step.
>
> Remember, it is deeply unlikely but sending garbage to an FPGA could
> damage it.

Then what's the purpose of pre-validation? Sending valid data should be 
the normal case, not the exception.

>> Besides political arguments, there's a more pressing technical argument
>> against this theck as well: The whole check is pointless since the hardware
>> itself already verifies the validity of the stream.
>
> The purpose of the check is not to protect the hardware. The check is
> to help the user provide the correct input because the hardware
> provides no feedback at all on what is wrong with the input.
>
> And again, the out-of-tree Xilinx driver accepted files that this
> driver does not, so having a clear and understandable message is going
> to be very important for users.

Then just create a "validate my bitstream" tool.

I wrote a Python script to do what Xilinx refused to do years ago:
https://github.com/topic-embedded-products/meta-topic/blob/master/recipes-bsp/fpga/fpga-bit-to-bin/fpga-bit-to-bin.py
So apparently it wasn't hard to figure out what to do.

>> doesn't have any effect, the hardware will discard it. There's no reason to
>> waste CPU cycles duplicating this check in software.
>
> In the typical success case we are talking about 5 32 bit compares,
> which isn't even worth considering.

I'm mostly against the complication of the code. The code is more 
complex, and that's bad. It's firmware loading code, and it need not be 
aware of exactly what it is doing. I did not see any checks like this in 
other firmware loading code I've come across.

What you're creating here will require active maintenance.
There are already 7007 and 7012 devices which aren't in your list so the 
driver will refuse to load them until someone fills in the IDs.
The bitstream format is actually proprietary and undocumented. Any 
"checks" in code are likely based on guesswork and reverse engineering.
We also use partial reprogramming a lot. Did you test that? On all 
devices? And we're actually planning to go beyond that. Maybe I'll be 
providing parts of the data through ICAP and some through PCAP, that 
might even work, but the driver would block it for no apparent reason.
Jason Gunthorpe Nov. 28, 2016, 6 p.m. UTC | #17
I seem to have missed being CC on this follow up from Mike and wanted
to respond:

> What you're creating here will require active maintenance.  There
> are already 7007 and 7012 devices which aren't in your list so the
> driver will refuse to load them until someone fills in the IDs.

I don't know what list you are refering to, are you sure you are
responding to the right patch?

The patch searches for the sync word, it has nothing to do with IDs,
and does not attempt to parse any of the proprietary headers. Based on
the Xilinx documentation this will work on 7 Series, US and US+ at a
minimum. Certainly on all Zynq hardware.

> The bitstream format is actually proprietary and undocumented.
> Any "checks" in code are likely based on guesswork and reverse
> engineering.

No, this part is fully documented in UG470.

> We also use partial reprogramming a lot. Did you test
> that? On all devices?

You should read the patch, it only does the check on a full bitstream.

Jason
diff mbox

Patch

diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c
index c2fb4120bd62..46a38772e7ee 100644
--- a/drivers/fpga/zynq-fpga.c
+++ b/drivers/fpga/zynq-fpga.c
@@ -184,12 +184,26 @@  static int zynq_fpga_ops_write_init(struct 
fpga_manager *mgr, u32 flags,

  	priv = mgr->priv;

+	/* All valid bitstreams are multiples of 32 bits */
+	if (!count || (count % 4) != 0)
+		return -EINVAL;
+

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel