Message ID | 20210106034623.GA1128@open-light-1.localdomain (mailing list archive) |
---|---|
Series | hugetlbfs: support free page reporting |
On 06.01.21 04:46, Liang Li wrote:
> A typical usage of hugetlbfs is to reserve an amount of memory
> during the kernel boot stage; the reserved pages are
> unlikely to return to the buddy system. When an application needs
> hugepages, the kernel allocates them from the reserved pool.
> When the application terminates, the huge pages return to the
> reserved pool and are kept in the free list for hugetlbfs;
> these free pages will not return to the buddy freelist unless the
> size of the reserved pool is changed.
> Free page reporting only supports buddy pages; it can't report
> the free pages reserved for hugetlbfs. On the other hand,
> hugetlbfs is a good choice for a system with a huge amount of RAM,
> because it can help reduce memory management overhead and
> improve system performance.
> This patch adds support for reporting hugepages in the free
> list of hugetlbfs. It can be used by the virtio_balloon driver for
> memory overcommit and to pre-zero free pages, speeding up
> memory population and page fault handling.

You should lay out the use case + measurements. Further you should
describe what this patch set actually does, how behavior can be tuned,
pros and cons, etc... And you should most probably keep this as an RFC.

>
> Most of the code is 'copied' from free page reporting because
> they work in the same way. So the code can be refined to
> remove duplication. It can be done later.

Nothing speaks against getting it right from the beginning. Otherwise it
will most likely never happen.

>
> Since some people have concerns about the side effects of the 'buddy
> free page pre zero out' feature, I removed it from this
> series.

You should really point out what changed since the last version. I
remember Alex and Mike had some pretty solid points of what they don't
want to see (especially: don't use free page reporting infrastructure
and don't temporarily allocate huge pages for processing them).

I am not convinced that we want to use the free page reporting
infrastructure for this (pre-zeroing huge pages). What speaks against a
thread simply iterating over huge pages one at a time, zeroing them? The
whole free page reporting infrastructure was invented because we have to
do expensive coordination (+ locking) when going via the hypervisor. For
the main use case of zeroing huge pages in the background, I don't see a
real need for that. If you believe this is the right thing to do, please
add a discussion regarding this.
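
The alternative raised above, a thread that walks the free huge pages one at a time and zeroes them, can be pictured with a small userspace model. This is only a sketch under stated assumptions, not kernel code and not anything from the patch set: `HugePagePool`, `Page`, `zero_one` and `zeroing_worker` are hypothetical names, pages are plain byte buffers, and `PAGE_SIZE` is a stand-in value. The sketch takes a page off the free list while it is being zeroed so that a concurrent `alloc()` cannot hand it out half-zeroed; that is one possible choice for illustration, and the thread does not settle how a real implementation should handle this.

```python
import threading
from collections import deque

PAGE_SIZE = 4096  # stand-in size for the model; real huge pages are far larger


class Page:
    """Hypothetical page object: a buffer plus a 'known to be zero' flag."""

    def __init__(self):
        self.data = bytearray(b"\xff" * PAGE_SIZE)  # starts with stale contents
        self.zeroed = False


class HugePagePool:
    """Toy reserved pool: released pages come back here, never to a buddy allocator."""

    def __init__(self, nr_pages):
        self.lock = threading.Lock()
        self.free = deque(Page() for _ in range(nr_pages))

    def alloc(self):
        with self.lock:
            return self.free.popleft() if self.free else None

    def release(self, page):
        page.zeroed = False  # contents are stale after use
        with self.lock:
            self.free.append(page)

    def zero_one(self):
        """Zero a single dirty free page; return False when none is left."""
        with self.lock:
            page = next((p for p in self.free if not p.zeroed), None)
            if page is None:
                return False
            self.free.remove(page)  # keep alloc() away from it while zeroing
        page.data[:] = bytes(PAGE_SIZE)  # the slow part runs without the lock
        page.zeroed = True
        with self.lock:
            self.free.append(page)
        return True


def zeroing_worker(pool):
    # Background thread: one page at a time until no dirty free page remains.
    while pool.zero_one():
        pass


pool = HugePagePool(4)
worker = threading.Thread(target=zeroing_worker, args=(pool,))
worker.start()
worker.join()
```

In this model the lock is held only to pick a page and to put it back, so the per-page coordination cost is small, which is the contrast with hypervisor-mediated reporting that the reply draws.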
On Wed, Jan 6, 2021 at 5:41 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 06.01.21 04:46, Liang Li wrote:
> > A typical usage of hugetlbfs is to reserve an amount of memory
> > during the kernel boot stage; the reserved pages are
> > unlikely to return to the buddy system. When an application needs
> > hugepages, the kernel allocates them from the reserved pool.
> > When the application terminates, the huge pages return to the
> > reserved pool and are kept in the free list for hugetlbfs;
> > these free pages will not return to the buddy freelist unless the
> > size of the reserved pool is changed.
> > Free page reporting only supports buddy pages; it can't report
> > the free pages reserved for hugetlbfs. On the other hand,
> > hugetlbfs is a good choice for a system with a huge amount of RAM,
> > because it can help reduce memory management overhead and
> > improve system performance.
> > This patch adds support for reporting hugepages in the free
> > list of hugetlbfs. It can be used by the virtio_balloon driver for
> > memory overcommit and to pre-zero free pages, speeding up
> > memory population and page fault handling.
>
> You should lay out the use case + measurements. Further you should
> describe what this patch set actually does, how behavior can be tuned,
> pros and cons, etc... And you should most probably keep this as an RFC.
>
> >
> > Most of the code is 'copied' from free page reporting because
> > they work in the same way. So the code can be refined to
> > remove duplication. It can be done later.
>
> Nothing speaks against getting it right from the beginning. Otherwise it
> will most likely never happen.
>
> >
> > Since some people have concerns about the side effects of the 'buddy
> > free page pre zero out' feature, I removed it from this
> > series.
>
> You should really point out what changed since the last version. I
> remember Alex and Mike had some pretty solid points of what they don't
> want to see (especially: don't use free page reporting infrastructure
> and don't temporarily allocate huge pages for processing them).
>
> I am not convinced that we want to use the free page reporting
> infrastructure for this (pre-zeroing huge pages). What speaks against a
> thread simply iterating over huge pages one at a time, zeroing them? The
> whole free page reporting infrastructure was invented because we have to
> do expensive coordination (+ locking) when going via the hypervisor. For
> the main use case of zeroing huge pages in the background, I don't see a
> real need for that. If you believe this is the right thing to do, please
> add a discussion regarding this.
>
> --
> Thanks,
>
> David / dhildenb
>

I will take all your advice and give more detail in the next revision.
Thanks for your comments!

Liang