mbox series

[v2,0/9] hw, ui, virtfs-proxy-helper: Reduce QEMU .data/.rodata/.bss footprint

Message ID 20200305124525.14555-1-philmd@redhat.com (mailing list archive)
Headers show
Series hw, ui, virtfs-proxy-helper: Reduce QEMU .data/.rodata/.bss footprint | expand

Message

Philippe Mathieu-Daudé March 5, 2020, 12:45 p.m. UTC
Since v1:
- merged 2 series
- reworked hw/usb/quirks
- added R-b/A-b tags

This series reduce the footprint of the QEMU binary:
.bss: 106KiB (moved to .heap)
.data: 1MiB
.rodata: 4.34MiB
(sizes on x86_64 building with -Os)

The elf-dissector tool [1] [2] helped to notice the big array.

[1] https://phabricator.kde.org/source/elf-dissector/
[2] https://www.volkerkrause.eu/2019/06/22/elf-dissector-aarch64-support.html
[heap equivalent tool working with QEMU: https://github.com/KDE/heaptrack]

Supersedes: <20200304221807.25212-1-philmd@redhat.com>
Supersedes: <20200305010446.17029-1-philmd@redhat.com>

Philippe Mathieu-Daudé (9):
  hw/audio/fmopl: Fix a typo twice
  hw/audio/fmopl: Move ENV_CURVE to .heap to save 32KiB of .bss
  hw/audio/intel-hda: Use memory region alias to reduce .rodata by
    4.34MB
  hw/net/e1000: Add readops/writeops typedefs
  hw/net/e1000: Move macreg[] arrays to .rodata to save 1MiB of .data
  hw/usb/quirks: Use smaller types to reduce .rodata by 10KiB
  ui/curses: Make control_characters[] array const
  ui/curses: Move arrays to .heap to save 74KiB of .bss
  virtfs-proxy-helper: Make the helper_opts[] array const

 hw/usb/quirks.h             | 22 +++++++++++++---------
 fsdev/virtfs-proxy-helper.c |  2 +-
 hw/audio/fmopl.c            |  8 +++++---
 hw/audio/intel-hda.c        | 24 ++++++++++--------------
 hw/net/e1000.c              |  6 ++++--
 hw/net/e1000e_core.c        |  6 ++++--
 hw/usb/quirks.c             |  4 ++--
 ui/curses.c                 | 10 +++++++---
 8 files changed, 46 insertions(+), 36 deletions(-)

Comments

no-reply@patchew.org March 5, 2020, 1:15 p.m. UTC | #1
Patchew URL: https://patchew.org/QEMU/20200305124525.14555-1-philmd@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
remote: Counting objects: 5394, done.        
error: RPC failed; result=18, HTTP code = 200
fatal: The remote end hung up unexpectedly
fatal: protocol error: bad pack header
Clone of 'https://git.qemu.org/git/dtc.git' into submodule path 'dtc' failed
failed to update submodule dtc
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) unregistered for path 'dtc'
make[1]: *** [/var/tmp/patchew-tester-tmp-aw2p1w3p/src/docker-src.2020-03-05-08.09.04.27630] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-aw2p1w3p/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    6m3.459s
user    0m2.484s


The full log is available at
http://patchew.org/logs/20200305124525.14555-1-philmd@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
no-reply@patchew.org March 5, 2020, 1:25 p.m. UTC | #2
Patchew URL: https://patchew.org/QEMU/20200305124525.14555-1-philmd@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
remote: Counting objects: 5394, done.        
error: RPC failed; result=18, HTTP code = 200
fatal: The remote end hung up unexpectedly
fatal: protocol error: bad pack header
Clone of 'https://git.qemu.org/git/dtc.git' into submodule path 'dtc' failed
failed to update submodule dtc
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) unregistered for path 'dtc'
make[1]: *** [/var/tmp/patchew-tester-tmp-l4s5r6pv/src/docker-src.2020-03-05-08.15.46.28319] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-l4s5r6pv/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    9m17.768s
user    0m2.937s


The full log is available at
http://patchew.org/logs/20200305124525.14555-1-philmd@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Daniel P. Berrangé March 5, 2020, 1:42 p.m. UTC | #3
On Thu, Mar 05, 2020 at 01:45:16PM +0100, Philippe Mathieu-Daudé wrote:
> Since v1:
> - merged 2 series
> - reworked hw/usb/quirks
> - added R-b/A-b tags
> 
> This series reduce the footprint of the QEMU binary:
> .bss: 106KiB (moved to .heap)

Did this actually have an impact on the binary size, or just on the
size the elf-dissector reports ?  I'm not very familiar with ELF,
but Wikipedia's description of BSS makes me question it...

  "Typically only the length of the bss section, but no data, 
   is stored in the object file. The program loader allocates 
   memory for the bss section when it loads the program. On
   some platforms, some or all of the bss section is initialized
   to zeroes. Unix-like systems and Windows initialize the bss 
   section to zero"

This suggests .bss has no on-disk overhead, only runtime overhead,
which is presumably going to be the same with heap allocations.

> .data: 1MiB
> .rodata: 4.34MiB

These looks useful though in terms of disk footprint.

> (sizes on x86_64 building with -Os)
> 
> The elf-dissector tool [1] [2] helped to notice the big array.
> 
> [1] https://phabricator.kde.org/source/elf-dissector/
> [2] https://www.volkerkrause.eu/2019/06/22/elf-dissector-aarch64-support.html
> [heap equivalent tool working with QEMU: https://github.com/KDE/heaptrack]
> 
> Supersedes: <20200304221807.25212-1-philmd@redhat.com>
> Supersedes: <20200305010446.17029-1-philmd@redhat.com>
> 
> Philippe Mathieu-Daudé (9):
>   hw/audio/fmopl: Fix a typo twice
>   hw/audio/fmopl: Move ENV_CURVE to .heap to save 32KiB of .bss
>   hw/audio/intel-hda: Use memory region alias to reduce .rodata by
>     4.34MB
>   hw/net/e1000: Add readops/writeops typedefs
>   hw/net/e1000: Move macreg[] arrays to .rodata to save 1MiB of .data
>   hw/usb/quirks: Use smaller types to reduce .rodata by 10KiB
>   ui/curses: Make control_characters[] array const
>   ui/curses: Move arrays to .heap to save 74KiB of .bss
>   virtfs-proxy-helper: Make the helper_opts[] array const
> 
>  hw/usb/quirks.h             | 22 +++++++++++++---------
>  fsdev/virtfs-proxy-helper.c |  2 +-
>  hw/audio/fmopl.c            |  8 +++++---
>  hw/audio/intel-hda.c        | 24 ++++++++++--------------
>  hw/net/e1000.c              |  6 ++++--
>  hw/net/e1000e_core.c        |  6 ++++--
>  hw/usb/quirks.c             |  4 ++--
>  ui/curses.c                 | 10 +++++++---
>  8 files changed, 46 insertions(+), 36 deletions(-)
> 
> -- 
> 2.21.1
> 
> 

Regards,
Daniel
no-reply@patchew.org March 5, 2020, 1:48 p.m. UTC | #4
Patchew URL: https://patchew.org/QEMU/20200305124525.14555-1-philmd@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
remote: Counting objects: 5394, done.        
error: RPC failed; result=18, HTTP code = 200
fatal: The remote end hung up unexpectedly
fatal: protocol error: bad pack header
Clone of 'https://git.qemu.org/git/dtc.git' into submodule path 'dtc' failed
failed to update submodule dtc
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) unregistered for path 'dtc'
make[1]: *** [/var/tmp/patchew-tester-tmp-uyru360g/src/docker-src.2020-03-05-08.43.13.1580] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-uyru360g/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    5m33.393s
user    0m2.702s


The full log is available at
http://patchew.org/logs/20200305124525.14555-1-philmd@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Philippe Mathieu-Daudé March 5, 2020, 1:56 p.m. UTC | #5
On 3/5/20 2:42 PM, Daniel P. Berrangé wrote:
> On Thu, Mar 05, 2020 at 01:45:16PM +0100, Philippe Mathieu-Daudé wrote:
>> Since v1:
>> - merged 2 series
>> - reworked hw/usb/quirks
>> - added R-b/A-b tags
>>
>> This series reduce the footprint of the QEMU binary:
>> .bss: 106KiB (moved to .heap)
> 
> Did this actually have an impact on the binary size, or just on the
> size the elf-dissector reports ?  I'm not very familiar with ELF,
> but Wikipedia's description of BSS makes me question it...
> 
>    "Typically only the length of the bss section, but no data,
>     is stored in the object file. The program loader allocates
>     memory for the bss section when it loads the program. On
>     some platforms, some or all of the bss section is initialized
>     to zeroes. Unix-like systems and Windows initialize the bss
>     section to zero"
> 
> This suggests .bss has no on-disk overhead, only runtime overhead,
> which is presumably going to be the same with heap allocations.

IIUC when stored in the .bss, the buffer are always allocated in memory, 
even if not used. By moving them to the .heap, we only allocate them 
when using either the adlib audio device or curses.

> 
>> .data: 1MiB
>> .rodata: 4.34MiB
> 
> These looks useful though in terms of disk footprint.

Memory footprint is more important than disk footprint, but harder to 
track/manage.

> 
>> (sizes on x86_64 building with -Os)
>>
>> The elf-dissector tool [1] [2] helped to notice the big array.
>>
>> [1] https://phabricator.kde.org/source/elf-dissector/
>> [2] https://www.volkerkrause.eu/2019/06/22/elf-dissector-aarch64-support.html
>> [heap equivalent tool working with QEMU: https://github.com/KDE/heaptrack]
>>
>> Supersedes: <20200304221807.25212-1-philmd@redhat.com>
>> Supersedes: <20200305010446.17029-1-philmd@redhat.com>
>>
>> Philippe Mathieu-Daudé (9):
>>    hw/audio/fmopl: Fix a typo twice
>>    hw/audio/fmopl: Move ENV_CURVE to .heap to save 32KiB of .bss
>>    hw/audio/intel-hda: Use memory region alias to reduce .rodata by
>>      4.34MB
>>    hw/net/e1000: Add readops/writeops typedefs
>>    hw/net/e1000: Move macreg[] arrays to .rodata to save 1MiB of .data
>>    hw/usb/quirks: Use smaller types to reduce .rodata by 10KiB
>>    ui/curses: Make control_characters[] array const
>>    ui/curses: Move arrays to .heap to save 74KiB of .bss
>>    virtfs-proxy-helper: Make the helper_opts[] array const
>>
>>   hw/usb/quirks.h             | 22 +++++++++++++---------
>>   fsdev/virtfs-proxy-helper.c |  2 +-
>>   hw/audio/fmopl.c            |  8 +++++---
>>   hw/audio/intel-hda.c        | 24 ++++++++++--------------
>>   hw/net/e1000.c              |  6 ++++--
>>   hw/net/e1000e_core.c        |  6 ++++--
>>   hw/usb/quirks.c             |  4 ++--
>>   ui/curses.c                 | 10 +++++++---
>>   8 files changed, 46 insertions(+), 36 deletions(-)
>>
>> -- 
>> 2.21.1
>>
>>
> 
> Regards,
> Daniel
>
Eric Blake March 5, 2020, 2:21 p.m. UTC | #6
On 3/5/20 6:45 AM, Philippe Mathieu-Daudé wrote:
> Since v1:
> - merged 2 series
> - reworked hw/usb/quirks
> - added R-b/A-b tags
> 
> This series reduce the footprint of the QEMU binary:
> .bss: 106KiB (moved to .heap)

Why is moving stuff from .bss to .heap beneficial?  With .bss, 
0-initialization is cheap (the OS gives it to us for free by mapping the 
entire .bss to a copy-on-write zero page); but with .heap, we generally 
have to zero it at runtime ourselves.

> .data: 1MiB
> .rodata: 4.34MiB

But reducing these sizes makes sense.
Eric Blake March 5, 2020, 2:32 p.m. UTC | #7
On 3/5/20 7:42 AM, Daniel P. Berrangé wrote:
> On Thu, Mar 05, 2020 at 01:45:16PM +0100, Philippe Mathieu-Daudé wrote:
>> Since v1:
>> - merged 2 series
>> - reworked hw/usb/quirks
>> - added R-b/A-b tags
>>
>> This series reduce the footprint of the QEMU binary:
>> .bss: 106KiB (moved to .heap)
> 
> Did this actually have an impact on the binary size, or just on the
> size the elf-dissector reports ?  I'm not very familiar with ELF,
> but Wikipedia's description of BSS makes me question it...
> 
>    "Typically only the length of the bss section, but no data,
>     is stored in the object file. The program loader allocates
>     memory for the bss section when it loads the program. On
>     some platforms, some or all of the bss section is initialized
>     to zeroes. Unix-like systems and Windows initialize the bss
>     section to zero"
> 
> This suggests .bss has no on-disk overhead, only runtime overhead,
> which is presumably going to be the same with heap allocations.

Or even LESS overhead.  Heap allocationhave unspecified contents 
requiring runtime effort (true, some implementations of malloc() handle 
large allocations specially as anonymous mmap initially backed by 
/dev/zero, so that those allocations start life 0-allocated, but you 
can't rely on that optimization), while .bss is required by the C 
language to be 0 initialized (and you CAN rely on the OS implementing 
that as efficiently as possible, generally by starting with COW mapping 
initially backed by the zero page).

In fact, on nbdkit, we hit an interesting case where using .bss instead 
of .rodata is actually beneficial:
https://www.redhat.com/archives/libguestfs/2019-July/msg00074.html
Marking a large array that will always consist only of zero bytes as 
const actually pessimized the image size and load time, because the 
addition of const moved the array from .bss into .rodata.

> 
>> .data: 1MiB
>> .rodata: 4.34MiB
> 
> These looks useful though in terms of disk footprint.

Yes. Smaller data structures allow for smaller binaries and faster loading.
Eric Blake March 5, 2020, 2:34 p.m. UTC | #8
On 3/5/20 7:56 AM, Philippe Mathieu-Daudé wrote:
> On 3/5/20 2:42 PM, Daniel P. Berrangé wrote:
>> On Thu, Mar 05, 2020 at 01:45:16PM +0100, Philippe Mathieu-Daudé wrote:
>>> Since v1:
>>> - merged 2 series
>>> - reworked hw/usb/quirks
>>> - added R-b/A-b tags
>>>
>>> This series reduce the footprint of the QEMU binary:
>>> .bss: 106KiB (moved to .heap)
>>
>> Did this actually have an impact on the binary size, or just on the
>> size the elf-dissector reports ?  I'm not very familiar with ELF,
>> but Wikipedia's description of BSS makes me question it...
>>
>>    "Typically only the length of the bss section, but no data,
>>     is stored in the object file. The program loader allocates
>>     memory for the bss section when it loads the program. On
>>     some platforms, some or all of the bss section is initialized
>>     to zeroes. Unix-like systems and Windows initialize the bss
>>     section to zero"
>>
>> This suggests .bss has no on-disk overhead, only runtime overhead,
>> which is presumably going to be the same with heap allocations.
> 
> IIUC when stored in the .bss, the buffer are always allocated in memory, 
> even if not used. By moving them to the .heap, we only allocate them 
> when using either the adlib audio device or curses.

Virtual memory is cheap on 64-bit platforms.  Just because the address 
range is reserved does not actually change the amount of memory in use 
by the machine, if the application does not touch those virtual 
addresses.  But you do have a potential point on a 32-bit platform, 
where a heap allocation only when needed may avoid address space 
exhaustion for other cases.