Message ID | 20221217105833.24851-1-laoar.shao@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | mm: page_ext: split page_ext flags | expand |
On Sat, Dec 17, 2022 at 10:58:31AM +0000, Yafang Shao wrote: > On 64bit system, page extension is for debugging purpose only currently, > because of its overhead, in particular the memory overhead. > > Once a page_ext is enabled, at least it will take 0.2% of the total > memory because of the page_ext flags, no matter this page_ext uses it or > not. Currently this page_ext flags is only used for page_owner on 64bit > system. So it doesn't make sense to allocate this flags for all page_ext > by default. We'd better move it into page_owner's structure, then when > someone wants to introduce a new page_ext which may be memory-overhead > sensitive, it will save this unneeded overhead. I'm still worried about the approach of using page_ext for BPF memory accounting. Let's say we finally accepted the approach of using page_ext for BPF memory accounting, and if it's not enabled by default, you need to rebuild the kernel when you want to check BPF memory usage. (Or make everyone bear the overhead by enabling page_ext default for BPF memory accounting that one may not be interested) How the problem of accounting kfree_rcu() is going? Doesn't call_rcu() work for the problem? I still think it's much more feasible. > On 32bit system, there's page_idle running on production envrionment, > which also uses this page_ext flags. So it will take another 0.2% of > total memory if the user enable both page_idle and page_owner after this > change, but considering page_owner is for debugging purpose only, the > memory overhead in this case won't be a problem. > > So, let split the page_ext flags. > > Yafang Shao (2): > mm: page_owner: split page_owner's flag from the comm flags > mm: page_idle: split 32bit page_idle's flag from the common flags > > include/linux/page_ext.h | 14 +------------- > include/linux/page_idle.h | 39 +++++++++++++++++++++++++++++++++------ > mm/page_ext.c | 10 ---------- > mm/page_idle.c | 12 ++++++++++++ > mm/page_owner.c | 36 ++++++++++++++++++++++-------------- > 5 files changed, 68 insertions(+), 43 deletions(-) > > -- > 2.30.1 (Apple Git-130) >
On Sun, Dec 18, 2022 at 4:08 PM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote: > > On Sat, Dec 17, 2022 at 10:58:31AM +0000, Yafang Shao wrote: > > On 64bit system, page extension is for debugging purpose only currently, > > because of its overhead, in particular the memory overhead. > > > > Once a page_ext is enabled, at least it will take 0.2% of the total > > memory because of the page_ext flags, no matter this page_ext uses it or > > not. Currently this page_ext flags is only used for page_owner on 64bit > > system. So it doesn't make sense to allocate this flags for all page_ext > > by default. We'd better move it into page_owner's structure, then when > > someone wants to introduce a new page_ext which may be memory-overhead > > sensitive, it will save this unneeded overhead. > Hi Hyeonggon, > I'm still worried about the approach of using page_ext for BPF memory > accounting. > This patchset is independent of BPF memory accounting. It is just an improvement for the page_ext itself. > Let's say we finally accepted the approach of using page_ext for BPF memory > accounting, and if it's not enabled by default, you need to rebuild the kernel > when you want to check BPF memory usage. (Or make everyone bear the overhead > by enabling page_ext default for BPF memory accounting that one may not be interested) > The compile config can be enabled by default, but the static key is disabled by default. That said, the user doesn't need to rebuild the kernel. The user can pass a kernel parameter to enable or disable it. But that doesn't mean I have made a final decision to use page_ext to account for it. I will investigate if the dynamic page_ext can be introduced or not first. If we can allocate page_ext dynamically rather than allocate them at boot, the page_ext will be used in the production environment rather than for debugging purposes only, that will make it more useful. > How the problem of accounting kfree_rcu() is going? > Doesn't call_rcu() work for the problem? > I still think it's much more feasible. > I'm also investigating the call_rcu() solution. The disadvantage of call_rcu(), as I have explained in earlier replyment, is that it adds restrictions for this solution and it can be broken easily if MM guys introduce other deferred freeing functions inside mm/. That will be a burden for further development.
On Sun, Dec 18, 2022 at 06:01:09PM +0800, Yafang Shao wrote: > On Sun, Dec 18, 2022 at 4:08 PM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote: > > > > On Sat, Dec 17, 2022 at 10:58:31AM +0000, Yafang Shao wrote: > > > On 64bit system, page extension is for debugging purpose only currently, > > > because of its overhead, in particular the memory overhead. > > > > > > Once a page_ext is enabled, at least it will take 0.2% of the total > > > memory because of the page_ext flags, no matter this page_ext uses it or > > > not. Currently this page_ext flags is only used for page_owner on 64bit > > > system. So it doesn't make sense to allocate this flags for all page_ext > > > by default. We'd better move it into page_owner's structure, then when > > > someone wants to introduce a new page_ext which may be memory-overhead > > > sensitive, it will save this unneeded overhead. > > > > Hi Hyeonggon, Hi Yafang. > > I'm still worried about the approach of using page_ext for BPF memory > > accounting. > > > > This patchset is independent of BPF memory accounting. It is just an > improvement for the page_ext itself. this improvement doesn't make sense until we have non-debugging/memory overhead sensitive use case of page_ext. > > Let's say we finally accepted the approach of using page_ext for BPF memory > > accounting, and if it's not enabled by default, you need to rebuild the kernel > > when you want to check BPF memory usage. (Or make everyone bear the overhead > > by enabling page_ext default for BPF memory accounting that one may not be interested) > > > > The compile config can be enabled by default, but the static key is > disabled by default. That said, the user doesn't need to rebuild the > kernel. The user can pass a kernel parameter to enable or disable it. > But that doesn't mean I have made a final decision to use page_ext to > account for it. Okay. > I will investigate if the dynamic page_ext can be > introduced or not first. If we can allocate page_ext dynamically > rather than allocate them at boot, the page_ext will be used in the > production environment rather than for debugging purposes only, that > will make it more useful. IMHO that's gonna be quite challenging.... To get page_ext using pfn, we need array of page_ext. And in FLATMEM size of the array can be bigger than maximum allocation size supported by buddy allocator. (that's why page_ext uses early memory allocator, memblock in FLATMEM and memblock is unavailable after boot process) Why challenge page_ext if call_rcu() can solve the problem? > > How the problem of accounting kfree_rcu() is going? > > Doesn't call_rcu() work for the problem? > > I still think it's much more feasible. > > > > I'm also investigating the call_rcu() solution. The disadvantage of > call_rcu(), as I have explained in earlier replyment, is that it adds > restrictions for this solution What kind of restrictions did you mean? > and it can be broken easily if MM guys > introduce other deferred freeing functions inside mm/. That will be a > burden for further development. No, if call_rcu() is called in BPF subsystem and if the callback just accounts memory usage & calls memory free API (like kfree()/vfree()/etc), this is not going to be burden for MM developers at all. Also, even if it is considered as burden, it can be justified because it avoids increasing memory usage.
On 12/17/22 11:58, Yafang Shao wrote: > On 64bit system, page extension is for debugging purpose only currently, > because of its overhead, in particular the memory overhead. > > Once a page_ext is enabled, at least it will take 0.2% of the total > memory because of the page_ext flags, no matter this page_ext uses it or > not. Currently this page_ext flags is only used for page_owner on 64bit > system. So it doesn't make sense to allocate this flags for all page_ext > by default. We'd better move it into page_owner's structure, then when > someone wants to introduce a new page_ext which may be memory-overhead > sensitive, it will save this unneeded overhead. > > On 32bit system, there's page_idle running on production envrionment, > which also uses this page_ext flags. So it will take another 0.2% of > total memory if the user enable both page_idle and page_owner after this > change, but considering page_owner is for debugging purpose only, the > memory overhead in this case won't be a problem. > > So, let split the page_ext flags. Hi, FYI I think Pasha's solution should work to avoid the waste even without the split: https://lore.kernel.org/all/20230113154253.92480-1-pasha.tatashin@soleen.com/ > Yafang Shao (2): > mm: page_owner: split page_owner's flag from the comm flags > mm: page_idle: split 32bit page_idle's flag from the common flags > > include/linux/page_ext.h | 14 +------------- > include/linux/page_idle.h | 39 +++++++++++++++++++++++++++++++++------ > mm/page_ext.c | 10 ---------- > mm/page_idle.c | 12 ++++++++++++ > mm/page_owner.c | 36 ++++++++++++++++++++++-------------- > 5 files changed, 68 insertions(+), 43 deletions(-) >
On Mon, Jan 16, 2023 at 6:58 PM Vlastimil Babka <vbabka@suse.cz> wrote: > > On 12/17/22 11:58, Yafang Shao wrote: > > On 64bit system, page extension is for debugging purpose only currently, > > because of its overhead, in particular the memory overhead. > > > > Once a page_ext is enabled, at least it will take 0.2% of the total > > memory because of the page_ext flags, no matter this page_ext uses it or > > not. Currently this page_ext flags is only used for page_owner on 64bit > > system. So it doesn't make sense to allocate this flags for all page_ext > > by default. We'd better move it into page_owner's structure, then when > > someone wants to introduce a new page_ext which may be memory-overhead > > sensitive, it will save this unneeded overhead. > > > > On 32bit system, there's page_idle running on production envrionment, > > which also uses this page_ext flags. So it will take another 0.2% of > > total memory if the user enable both page_idle and page_owner after this > > change, but considering page_owner is for debugging purpose only, the > > memory overhead in this case won't be a problem. > > > > So, let split the page_ext flags. > > Hi, > > FYI I think Pasha's solution should work to avoid the waste even without the > split: > > https://lore.kernel.org/all/20230113154253.92480-1-pasha.tatashin@soleen.com/ > Thanks for the information! Splitting the flags could make page extensions more clear. We'd better modularize these page extensions. Different page extensions should maintain their members in their own implementations. But if the maintainer prefers Pasha's solution, I don't have strong objectation. Regards Yafang