mbox series

[v3,0/8] Clean up of gzip decompressor

Message ID 20240424163422.23276-1-dpsmith@apertussolutions.com (mailing list archive)
Headers show
Series Clean up of gzip decompressor | expand

Message

Daniel P. Smith April 24, 2024, 4:34 p.m. UTC
An issue ran into by hyperlaunch was the need to use the gzip decompressor
multiple times. The current implementation fails when reused due to tainting of
decompressor state from a previous usage. This series seeks to colocate the
gzip unit files under a single directory similar to the other decompression
algorithms.  To enable the refactoring of the state tracking, the code is then
cleaned up in line with Xen coding style.

Changes in v3:
- patch "xen/gzip: Drop huffman code table tracking" was merged
- patch "xen/gzip: Remove custom memory allocator" was merged
- patch "xen/gzip: Drop unused define checks" was merged
- move of the decompressor state into struct was broken up to ease review
- expanded macros that were either only used once or obsfucated the logic
- adjusted variable types as needed to be more appropriate for their usage

Changes in v2:
- patch "xen/gzip: Colocate gunzip code files" was merged
- dropped ifdef chunks that are never enabled
- addressed formatting changes missed in v1
- replaced custom memory allocator with xalloc_bytes()
- renamed gzip_data to gzip_state
- moved gzip_state to being dynamically allocated
- changed crc table to the explicit size of uint32_t 
- instead of moving huffman tracking into state, dropped altogether


Daniel P. Smith (8):
  gzip: clean up comments and fix code alignment
  gzip: refactor and expand macros
  gzip: refactor the gunzip window into common state
  gzip: move window pointer into gunzip state
  gzip: move input buffer handling into gunzip state
  gzip: move output buffer into gunzip state
  gzip: move bitbuffer into gunzip state
  gzip: move crc state into gunzip state

 xen/common/gzip/gunzip.c  |  84 ++--
 xen/common/gzip/inflate.c | 948 +++++++++++++++++++-------------------
 2 files changed, 516 insertions(+), 516 deletions(-)

Comments

Daniel P. Smith April 24, 2024, 4:48 p.m. UTC | #1
On 4/24/24 12:34, Daniel P. Smith wrote:
> An issue ran into by hyperlaunch was the need to use the gzip decompressor
> multiple times. The current implementation fails when reused due to tainting of
> decompressor state from a previous usage. This series seeks to colocate the
> gzip unit files under a single directory similar to the other decompression
> algorithms.  To enable the refactoring of the state tracking, the code is then
> cleaned up in line with Xen coding style.
> 

Forgot to add this comment to the cover letter.

Concern was raised about taking updates from original source to the 
code. In this case the original source is from the Linux kernel, which 
can be found in lib/inflate.c, and added to Xen in 2009. The last time 
there was a logic change to that code in the Linux kernel was in 2008, 
before it was added to Xen. Since then, it has only been updated for 
changes made to included headers. If fact, as far as I can see, while 
the file is still in place, nothing uses it. For zlib decompression, 
three is a new lib/zlib_deflate code base that is used. The scope of 
this work is not to completely fix/replace zlib decompression for Xen, 
but to stabilize it to allow hyperlaunch to decompress more than one 
zlib compressed kernel.

V/r,
Daniel P. Smith
Andrew Cooper May 9, 2024, 3:36 p.m. UTC | #2
On 24/04/2024 5:34 pm, Daniel P. Smith wrote:
> An issue ran into by hyperlaunch was the need to use the gzip decompressor
> multiple times. The current implementation fails when reused due to tainting of
> decompressor state from a previous usage. This series seeks to colocate the
> gzip unit files under a single directory similar to the other decompression
> algorithms.  To enable the refactoring of the state tracking, the code is then
> cleaned up in line with Xen coding style.
>
> Changes in v3:
> - patch "xen/gzip: Drop huffman code table tracking" was merged
> - patch "xen/gzip: Remove custom memory allocator" was merged
> - patch "xen/gzip: Drop unused define checks" was merged
> - move of the decompressor state into struct was broken up to ease review
> - expanded macros that were either only used once or obsfucated the logic
> - adjusted variable types as needed to be more appropriate for their usage
>
> Changes in v2:
> - patch "xen/gzip: Colocate gunzip code files" was merged
> - dropped ifdef chunks that are never enabled
> - addressed formatting changes missed in v1
> - replaced custom memory allocator with xalloc_bytes()
> - renamed gzip_data to gzip_state
> - moved gzip_state to being dynamically allocated
> - changed crc table to the explicit size of uint32_t 
> - instead of moving huffman tracking into state, dropped altogether
>
>
> Daniel P. Smith (8):
>   gzip: clean up comments and fix code alignment
>   gzip: refactor and expand macros
>   gzip: refactor the gunzip window into common state
>   gzip: move window pointer into gunzip state
>   gzip: move input buffer handling into gunzip state
>   gzip: move output buffer into gunzip state
>   gzip: move bitbuffer into gunzip state
>   gzip: move crc state into gunzip state

In hindsight my suggestion that lead to "refactor and expand macros"
wasn't great.

We want to keep the underrun labels for a future when the error handling
isn't panic().  After that, expanding flush_window() is better folded
into "refactor the gunzip window into common state" to reduce the churn.

As that was my fault, and unpicking it is reasonably hard, and we're on
a tight deadline for 4.19 now, I've gone ahead and unpicked this.

Also I've rebased over Jan's patch addressing the memory leaks reported
by Coverity, as the two sets of fixes collide, along with various other
minor notes.

I've conferred with Daniel who is happy with the aformentioned changes.

This area of code still has a lot of work needing doing on it, but
removing the use of global state is a concrete improvement in the status
quo.

~Andrew