Message ID | ff5a3ea4-4d15-5be3-9cb8-9fd7c716e2e6@plating.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | iio_compute_scan_bytes does not seem to account for alignment if first channel uses more storagebits than its successors | expand |
On 11/29/19 3:30 PM, Lars Möllendorf wrote: > Hi, > > I have written a custom kernel module implementing the IIO device API > backed by an IIO triggered buffer. > > My IIO device provides 3 channels + timestamp. The sizes of the channels are > > index | iio_chan_spec.scan_type.storagebits > -------|------------------------------------------------ > 0 | 32 > 1 | 16 > 2 | 16 > > If I select channel 0 (32bit) and one of channel 1 or 2 (16bit) > indio_dev.scan_bytes and iio_buffer.bytes_per_datum have a value of 6 > Byte which does not account for any alignment. > > > After having a closer look at `iio_compute_scan_bytes` which is > responsible for calculating both, `indio_dev.scan_bytes` and > `iio_buffer.bytes_per_datum` it seems to me that the order of channels > matter: > > ```c > /* How much space will the demuxed element take? */ > for_each_set_bit(i, mask, > indio_dev->masklength) { > length = iio_storage_bytes_for_si(indio_dev, i); > bytes = ALIGN(bytes, length); > bytes += length; > } > ``` > > I understand that in case the length of each scan element is smaller > then the length of the successive scan elements, this algorithm works > because it aligns the current element to its own length. But if, as in > my case, the length of channel 0's scan elements is greater then the > size of the samples of the consecutive channels no alignment seems to be > taken into account. Do I miss something here? [...] > But in my case the latter would bloat the buffer from 16 Byte to 4*16 = > 64 Byte per scan if all channels are selected and timestamp is active. > > For now, I will work around this by using 32 storagebits for all my > channels. This gives my 4 Bytes of overhead per scan if all elements are > selected and additional 2 Byte if timestamp is active. > > In "Why do you align the buffer pointer to a multiple of the size of the > current scan element in iio_buffer_foreach_sample()?" on > https://github.com/analogdevicesinc/libiio/issues/324 I have been > pointed to this mailing list. Hi Lars, The way this is supposed to work is that each element is aligned to its own natural alignment. What seems to be missing at the moment is that the total length is also aligned to the size of the first element, so that alignment is preserved for multiple consecutive samples. I feel like we had that at some point, but maybe I'm misremembering. E.g. putting something like unsigned int first_index = find_first_bit(mask, indio_dev->masklength); length = iio_storage_bytes_for_si(indio_dev, first_index); bytes = ALIGN(bytes, length); below the loop should do the trick I believe. - Lars
On Fri, 29 Nov 2019 18:23:37 +0100 Lars-Peter Clausen <lars@metafoo.de> wrote: > On 11/29/19 3:30 PM, Lars Möllendorf wrote: > > Hi, > > > > I have written a custom kernel module implementing the IIO device API > > backed by an IIO triggered buffer. > > > > My IIO device provides 3 channels + timestamp. The sizes of the channels are > > > > index | iio_chan_spec.scan_type.storagebits > > -------|------------------------------------------------ > > 0 | 32 > > 1 | 16 > > 2 | 16 > > > > If I select channel 0 (32bit) and one of channel 1 or 2 (16bit) > > indio_dev.scan_bytes and iio_buffer.bytes_per_datum have a value of 6 > > Byte which does not account for any alignment. > > > > > > After having a closer look at `iio_compute_scan_bytes` which is > > responsible for calculating both, `indio_dev.scan_bytes` and > > `iio_buffer.bytes_per_datum` it seems to me that the order of channels > > matter: > > > > ```c > > /* How much space will the demuxed element take? */ > > for_each_set_bit(i, mask, > > indio_dev->masklength) { > > length = iio_storage_bytes_for_si(indio_dev, i); > > bytes = ALIGN(bytes, length); > > bytes += length; > > } > > ``` > > > > I understand that in case the length of each scan element is smaller > > then the length of the successive scan elements, this algorithm works > > because it aligns the current element to its own length. But if, as in > > my case, the length of channel 0's scan elements is greater then the > > size of the samples of the consecutive channels no alignment seems to be > > taken into account. Do I miss something here? > [...] > > But in my case the latter would bloat the buffer from 16 Byte to 4*16 = > > 64 Byte per scan if all channels are selected and timestamp is active. > > > > For now, I will work around this by using 32 storagebits for all my > > channels. This gives my 4 Bytes of overhead per scan if all elements are > > selected and additional 2 Byte if timestamp is active. > > > > In "Why do you align the buffer pointer to a multiple of the size of the > > current scan element in iio_buffer_foreach_sample()?" on > > https://github.com/analogdevicesinc/libiio/issues/324 I have been > > pointed to this mailing list. > > Hi Lars, > > The way this is supposed to work is that each element is aligned to its > own natural alignment. What seems to be missing at the moment is that > the total length is also aligned to the size of the first element, so > that alignment is preserved for multiple consecutive samples. I feel > like we had that at some point, but maybe I'm misremembering. > > E.g. putting something like > > unsigned int first_index = find_first_bit(mask, indio_dev->masklength); > length = iio_storage_bytes_for_si(indio_dev, first_index); > bytes = ALIGN(bytes, length); > > below the loop should do the trick I believe. Good find by the way. Not sure how we never hit this before as there are definitely sensors out there with similar mixes to you have. Maybe they are always used with the timestamp enabled? Anyhow, no matter it's clearly wrong. Lars trick doesn't work either (I think) as we can have u16 u16 (might be padding or real channel) u32 u16 which should be 4 byte aligned. I think we need to keep track of the largest element present during the loop and use that length afterwards to pad out the end, but only the end rather than every element. In theory we can also have larger elements than the timestamp, so should not do special handling for that, it's just another element. I think that ends up effectively a combination of the two suggestions? Jonathan > > - Lars >
On Fri, 29 Nov 2019 18:23:37 +0100 Lars-Peter Clausen <lars@metafoo.de> wrote: > On 11/29/19 3:30 PM, Lars Möllendorf wrote: > > Hi, > > > > I have written a custom kernel module implementing the IIO device API > > backed by an IIO triggered buffer. > > > > My IIO device provides 3 channels + timestamp. The sizes of the channels are > > > > index | iio_chan_spec.scan_type.storagebits > > -------|------------------------------------------------ > > 0 | 32 > > 1 | 16 > > 2 | 16 > > > > If I select channel 0 (32bit) and one of channel 1 or 2 (16bit) > > indio_dev.scan_bytes and iio_buffer.bytes_per_datum have a value of 6 > > Byte which does not account for any alignment. > > > > > > After having a closer look at `iio_compute_scan_bytes` which is > > responsible for calculating both, `indio_dev.scan_bytes` and > > `iio_buffer.bytes_per_datum` it seems to me that the order of channels > > matter: > > > > ```c > > /* How much space will the demuxed element take? */ > > for_each_set_bit(i, mask, > > indio_dev->masklength) { > > length = iio_storage_bytes_for_si(indio_dev, i); > > bytes = ALIGN(bytes, length); > > bytes += length; > > } > > ``` > > > > I understand that in case the length of each scan element is smaller > > then the length of the successive scan elements, this algorithm works > > because it aligns the current element to its own length. But if, as in > > my case, the length of channel 0's scan elements is greater then the > > size of the samples of the consecutive channels no alignment seems to be > > taken into account. Do I miss something here? > [...] > > But in my case the latter would bloat the buffer from 16 Byte to 4*16 = > > 64 Byte per scan if all channels are selected and timestamp is active. > > > > For now, I will work around this by using 32 storagebits for all my > > channels. This gives my 4 Bytes of overhead per scan if all elements are > > selected and additional 2 Byte if timestamp is active. > > > > In "Why do you align the buffer pointer to a multiple of the size of the > > current scan element in iio_buffer_foreach_sample()?" on > > https://github.com/analogdevicesinc/libiio/issues/324 I have been > > pointed to this mailing list. > > Hi Lars, I managed to crash my email client whilst sending a first reply to this so no idea if anyone got it! > > The way this is supposed to work is that each element is aligned to its > own natural alignment. What seems to be missing at the moment is that > the total length is also aligned to the size of the first element, so > that alignment is preserved for multiple consecutive samples. I feel > like we had that at some point, but maybe I'm misremembering. It's more than possible we broke this in a cleanup at some stage and a bit surprising it hasn't caused problems before as we definitely have other sensors with similar channel patterns. I guess maybe they always have the timestamp enabled. Good spot by the way. > > E.g. putting something like > > unsigned int first_index = find_first_bit(mask, indio_dev->masklength); > length = iio_storage_bytes_for_si(indio_dev, first_index); > bytes = ALIGN(bytes, length); > > below the loop should do the trick I believe. It would work for Lars case, but I think a combination of both approaches is needed to handle cases like u16 u16 (may be padding) u32 u16 We need to realign to the largest sized element whereever it occurs in the channels list. So find that whilst computing the individual alignments, and apply it only once at the end. Thanks, Jonathan > > - Lars >
On 12/1/19 1:10 PM, Jonathan Cameron wrote: >> >> E.g. putting something like >> >> unsigned int first_index = find_first_bit(mask, indio_dev->masklength); >> length = iio_storage_bytes_for_si(indio_dev, first_index); >> bytes = ALIGN(bytes, length); >> >> below the loop should do the trick I believe. > It would work for Lars case, but I think a combination of both approaches > is needed to handle cases like > > u16 > u16 (may be padding) > u32 > u16 > > We need to realign to the largest sized element whereever it occurs in the > channels list. So find that whilst computing the individual alignments, > and apply it only once at the end. > Ah, yes. We need the same rules as for the alignment or padding of a struct, which mean align the size to size of the largest element.
diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c index 5d05c38c4ba9..3d2c4e26d818 100644 --- a/drivers/iio/industrialio-buffer.c +++ b/drivers/iio/industrialio-buffer.c @@ -546,16 +546,18 @@ static int iio_compute_scan_bytes(struct iio_dev *indio_dev, const unsigned long *mask, bool timestamp) { unsigned bytes = 0; - int length, i; + int length, i, align = 0, count = 0; /* How much space will the demuxed element take? */ for_each_set_bit(i, mask, indio_dev->masklength) { length = iio_storage_bytes_for_si(indio_dev, i); - bytes = ALIGN(bytes, length); - bytes += length; + align = max(align, length); + count++; } + bytes = count * align; + if (timestamp) { length = iio_storage_bytes_for_timestamp(indio_dev); bytes = ALIGN(bytes, length); ``` And if the storage bytes for timestamp have to be taken into account for the alignment of the scan elements (what seems not to be the case as it is currently implemented), then: ```diff diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c index 5d05c38c4ba9..59aee2ea4e19 100644 --- a/drivers/iio/industrialio-buffer.c +++ b/drivers/iio/industrialio-buffer.c @@ -546,21 +546,23 @@ static int iio_compute_scan_bytes(struct iio_dev *indio_dev, const unsigned long *mask, bool timestamp) { unsigned bytes = 0; - int length, i; + int length, i, align = 0, count = 0; + + if (timestamp) { + align = iio_storage_bytes_for_timestamp(indio_dev); + count++; + } /* How much space will the demuxed element take? */ for_each_set_bit(i, mask, indio_dev->masklength) { length = iio_storage_bytes_for_si(indio_dev, i); - bytes = ALIGN(bytes, length); - bytes += length; + align = max(align, length); + count++; } - if (timestamp) { - length = iio_storage_bytes_for_timestamp(indio_dev); - bytes = ALIGN(bytes, length); - bytes += length; - } + bytes = count * align; + return bytes; }