Message ID | 20220407031525.2368067-1-yuzhao@google.com (mailing list archive) |
---|---|
Headers | show |
Series | Multi-Gen LRU Framework | expand |
On Wed, Apr 6, 2022 at 9:15 PM Yu Zhao <yuzhao@google.com> wrote: > > TLDR > ==== > The current page reclaim is too expensive in terms of CPU usage and it > often makes poor choices about what to evict. This patchset offers an > alternative solution that is performant, versatile and > straightforward. <snipped> > Summery > ======= > The facts are: > 1. The independent lab results and the real-world applications > indicate substantial improvements; there are no known regressions. > 2. Thrashing prevention, working set estimation and proactive reclaim > work out of the box; there are no equivalent solutions. > 3. There is a lot of new code; nobody has demonstrated smaller changes > with similar effects. > > Our options, accordingly, are: > 1. Given the amount of evidence, the reported improvements will likely > materialize for a wide range of workloads. > 2. Gauging the interest from the past discussions [22][23][24], the > new features will likely be put to use for both personal computers > and data centers. > 3. Based on Google's track record, the new code will likely be well > maintained in the long term. It'd be more difficult if not > impossible to achieve similar effects on top of the current > active/inactive LRU. Hi Stephen, Can you please include this patchset in linux-next? Git repo for you to fetch: https://linux-mm.googlesource.com/mglru for-linux-next My goal is to get additional test coverage before I send a pull request for 5.19 to Linus. I've explored all avenues, but ultimately I've failed to rally substantial support from the MM stakeholders [1]. There are no pending technical issues against this patchset [2]. What is more concerning are the fundamental disagreements on priorities, methodologies, etc. that are not specific to this patchset and have been hindering our progress as a collective. (Cheers to the mutual dissatisfaction.) While we plan to discuss those issues during the LSFMM next month, it doesn't seem reasonable to leave this patchset hanging in the air, since it has reached its maturity a while ago and there are strong demands from downstream kernels as well as a large user base. Thus I sent that pull request to Linus a couple of weeks ago, implying that he would have to make the final decision soon. I hope this gives enough background about what's been going on with this patchset. If you decide to take it and it causes you any troubles, please feel free to yell at me. Thanks! [1] https://lore.kernel.org/r/20220104202227.2903605-1-yuzhao@google.com/ [2] https://lore.kernel.org/r/20220326010003.3155137-1-yuzhao@google.com/
Hi Yu, On Wed, 6 Apr 2022 21:24:27 -0600 Yu Zhao <yuzhao@google.com> wrote: > > Can you please include this patchset in linux-next? Git repo for you to fetch: > > https://linux-mm.googlesource.com/mglru for-linux-next I get a message saying "This repository is empty. Push to it to show branches and history." :-( > My goal is to get additional test coverage before I send a pull > request for 5.19 to Linus. Good idea :-) > I've explored all avenues, but ultimately I've failed to rally > substantial support from the MM stakeholders [1]. There are no pending > technical issues against this patchset [2]. What is more concerning > are the fundamental disagreements on priorities, methodologies, etc. > that are not specific to this patchset and have been hindering our > progress as a collective. (Cheers to the mutual dissatisfaction.) I have not been following the discussion as I am not an mm person, but this is not a good sign. > While we plan to discuss those issues during the LSFMM next month, it > doesn't seem reasonable to leave this patchset hanging in the air, > since it has reached its maturity a while ago and there are strong > demands from downstream kernels as well as a large user base. Thus I > sent that pull request to Linus a couple of weeks ago, implying that > he would have to make the final decision soon. > > I hope this gives enough background about what's been going on with > this patchset. If you decide to take it and it causes you any > troubles, please feel free to yell at me. > > Thanks! > > [1] https://lore.kernel.org/r/20220104202227.2903605-1-yuzhao@google.com/ > [2] https://lore.kernel.org/r/20220326010003.3155137-1-yuzhao@google.com/ I had a look at those threads and I guess things are better that your comment above implies. So, a couple of questions: Have you done a trial merge with a current linux-next tree to see what sort of mess/pain we may already be in? Is it all stable enough now that it could be sent as a patch series for Andrew to include in mmotm (with perhaps just smallish followup patches)?
On Thu, Apr 7, 2022 at 2:31 AM Stephen Rothwell <sfr@canb.auug.org.au> wrote: > > Hi Yu, > > On Wed, 6 Apr 2022 21:24:27 -0600 Yu Zhao <yuzhao@google.com> wrote: > > > > Can you please include this patchset in linux-next? Git repo for you to fetch: > > > > https://linux-mm.googlesource.com/mglru for-linux-next > > I get a message saying "This repository is empty. Push to it to show > branches and history." :-( Sorry about that > > My goal is to get additional test coverage before I send a pull > > request for 5.19 to Linus. > > Good idea :-) > > > I've explored all avenues, but ultimately I've failed to rally > > substantial support from the MM stakeholders [1]. There are no pending > > technical issues against this patchset [2]. What is more concerning > > are the fundamental disagreements on priorities, methodologies, etc. > > that are not specific to this patchset and have been hindering our > > progress as a collective. (Cheers to the mutual dissatisfaction.) > > I have not been following the discussion as I am not an mm person, but > this is not a good sign. > > > While we plan to discuss those issues during the LSFMM next month, it > > doesn't seem reasonable to leave this patchset hanging in the air, > > since it has reached its maturity a while ago and there are strong > > demands from downstream kernels as well as a large user base. Thus I > > sent that pull request to Linus a couple of weeks ago, implying that > > he would have to make the final decision soon. > > > > I hope this gives enough background about what's been going on with > > this patchset. If you decide to take it and it causes you any > > troubles, please feel free to yell at me. > > > > Thanks! > > > > [1] https://lore.kernel.org/r/20220104202227.2903605-1-yuzhao@google.com/ > > [2] https://lore.kernel.org/r/20220326010003.3155137-1-yuzhao@google.com/ > > I had a look at those threads and I guess things are better that your > comment above implies. > > So, a couple of questions: > > Have you done a trial merge with a current linux-next tree to see what > sort of mess/pain we may already be in? > > Is it all stable enough now that it could be sent as a patch series for > Andrew to include in mmotm (with perhaps just smallish followup patches)? > > -- > Cheers, > Stephen Rothwell
On Thu, Apr 7, 2022 at 2:31 AM Stephen Rothwell <sfr@canb.auug.org.au> wrote: > > Hi Yu, > > On Wed, 6 Apr 2022 21:24:27 -0600 Yu Zhao <yuzhao@google.com> wrote: > > > > Can you please include this patchset in linux-next? Git repo for you to fetch: > > > > https://linux-mm.googlesource.com/mglru for-linux-next > > I get a message saying "This repository is empty. Push to it to show > branches and history." :-( Sorry about this. It should work now. > > My goal is to get additional test coverage before I send a pull > > request for 5.19 to Linus. > > Good idea :-) > > > I've explored all avenues, but ultimately I've failed to rally > > substantial support from the MM stakeholders [1]. There are no pending > > technical issues against this patchset [2]. What is more concerning > > are the fundamental disagreements on priorities, methodologies, etc. > > that are not specific to this patchset and have been hindering our > > progress as a collective. (Cheers to the mutual dissatisfaction.) > > I have not been following the discussion as I am not an mm person, but > this is not a good sign. > > > While we plan to discuss those issues during the LSFMM next month, it > > doesn't seem reasonable to leave this patchset hanging in the air, > > since it has reached its maturity a while ago and there are strong > > demands from downstream kernels as well as a large user base. Thus I > > sent that pull request to Linus a couple of weeks ago, implying that > > he would have to make the final decision soon. > > > > I hope this gives enough background about what's been going on with > > this patchset. If you decide to take it and it causes you any > > troubles, please feel free to yell at me. > > > > Thanks! > > > > [1] https://lore.kernel.org/r/20220104202227.2903605-1-yuzhao@google.com/ > > [2] https://lore.kernel.org/r/20220326010003.3155137-1-yuzhao@google.com/ > > I had a look at those threads and I guess things are better that your > comment above implies. > > So, a couple of questions: > > Have you done a trial merge with a current linux-next tree to see what > sort of mess/pain we may already be in? Yes, the repo I prepared for you is based on the latest linux-next. There shouldn't be any conflicts. > Is it all stable enough now that it could be sent as a patch series for > Andrew to include in mmotm (with perhaps just smallish followup patches)? Yes, on multiple occasions, e.g., [1][2][3], I've claimed this patchset has an unprecedented test coverage and nobody has proven otherwise so far. Andrew suggested a cycle in linux-next [4]. So here we are :) [1] https://lore.kernel.org/all/YdSuSHa%2FVjl6bPkg@google.com/ [2] https://lore.kernel.org/r/YdiKVJlClB3h1Kmg@google.com/ [3] https://lore.kernel.org/r/YgR+MfXjpg82QyBT@google.com/ [4] https://lore.kernel.org/r/20220326134928.ad739eeecd5d0855dbdc6257@linux-foundation.org/
Hi, On Thu, 7 Apr 2022 03:41:15 -0600 Yu Zhao <yuzhao@google.com> wrote: > > > So, a couple of questions: > > > > Have you done a trial merge with a current linux-next tree to see what > > sort of mess/pain we may already be in? > > Yes, the repo I prepared for you is based on the latest linux-next. > There shouldn't be any conflicts. Ah, that is a problem :-( I can't merge a branch into linux-next if that branch is based on linux-next itself. linux-next rebases everyday, so that merge would bring in the previous version of linux-next - including other branches that may have rebased :-( All the branches in linux-next need to be based on Linus' tree or some tree that does not rebase (or one you can keep up with if it does rebase). The only exception is part of Andrew's patch series which is rebased (by me) on top of linux-next each day. > > Is it all stable enough now that it could be sent as a patch series for > > Andrew to include in mmotm (with perhaps just smallish followup patches)? > > Yes, on multiple occasions, e.g., [1][2][3], I've claimed this > patchset has an unprecedented test coverage and nobody has proven > otherwise so far. > > Andrew suggested a cycle in linux-next [4]. So here we are :) So the easiest thing for me is if Andrew takes it into his mmotm patch series (most of which ends up in linux-next). Otherwise I am probably at some point going to need help fixing the conflicts.
On Thu, Apr 7, 2022 at 6:14 AM Stephen Rothwell <sfr@canb.auug.org.au> wrote: > > Hi, > > On Thu, 7 Apr 2022 03:41:15 -0600 Yu Zhao <yuzhao@google.com> wrote: > > > > > So, a couple of questions: > > > > > > Have you done a trial merge with a current linux-next tree to see what > > > sort of mess/pain we may already be in? > > > > Yes, the repo I prepared for you is based on the latest linux-next. > > There shouldn't be any conflicts. > > Ah, that is a problem :-( I can't merge a branch into linux-next if > that branch is based on linux-next itself. linux-next rebases > everyday, so that merge would bring in the previous version of > linux-next - including other branches that may have rebased :-( > > All the branches in linux-next need to be based on Linus' tree or some > tree that does not rebase (or one you can keep up with if it does > rebase). > > The only exception is part of Andrew's patch series which is rebased > (by me) on top of linux-next each day. Gotcha. > > > Is it all stable enough now that it could be sent as a patch series for > > > Andrew to include in mmotm (with perhaps just smallish followup patches)? > > > > Yes, on multiple occasions, e.g., [1][2][3], I've claimed this > > patchset has an unprecedented test coverage and nobody has proven > > otherwise so far. > > > > Andrew suggested a cycle in linux-next [4]. So here we are :) > > So the easiest thing for me is if Andrew takes it into his mmotm patch > series (most of which ends up in linux-next). Agreed. > Otherwise I am probably > at some point going to need help fixing the conflicts. Yes, very likely. I see three options at the moment: 1. I grab mmotm, rebase it, apply this patchset atop and then route it to you. Based on my rough understanding of your workflow, this might reduce the work on your side. 2. I skip linux-next, and when I send the pull request for 5.19, I'll include incentive for Linus to potentially forgo the required linux-next cycle. 3. You or Andrew would have to do something you/he might not enjoy :) Please let me know. Thanks!
It's pretty clear from the results and from the test coverage to date that Linux wants this. I do think additional review is needed. For the usual code quality reasons, but also to help others get up to speed with the proposed changes and to ensure that we have something which is well maintainable going forward. The code at present is unnecessarily difficult to review and that review will be less effective than it might be, because of its lack of commentary. I'll merge the v10 series into -mm and -next for additional runtime exposure, but I'll be asking for a broad update. The data structures appear to be adequately documented (this is rare) but the code itself is lacking. Please go through each and every new function and ask "is this function so self-evident that it can be presented to a newcomer with no explanatory comments at all". If "yes" then check again. If still "yes" then fine. If "no" then let's craft comments which tell the reader things which are not evident from the code itself. Things which are helpful to that reader. Design concepts, side-effects, preconditions, unobvious traps, etc. Please ensure that the current group of mglru developers review these comment additions as closely as they review code changes. Alas, it's late in the process to be adding these things. But it can be made better. I'll make some high-level comments on the patches themselves now, but I see little point in attempting detailed review when a better-explained version is hopefully forthcoming. A few gratuitous notebook entries: - lru_gen_struct.timestamps works in jiffies? Why? jiffies can vary based on platform and config - why add inter-platform and inter-Kconfig variability like this? Time is measured in seconds! And the amount of work which can be performed in one second varies by, guess, 100x on machines which we care about. This also has potential to introduce tremendous inter-platform variability in the behaviour of this feature. - did lru_gen_use_mm() really need to test nodes_full()? Is that likely enough to be a net benefit? - seq_is_valid() sounds like it belongs to the seq_file() subsystem - does get_nr_evictable() need to return and use signed types (long)? It sums the contents of lru_gen_struct.nr_pages[][][], which is ulong, so I think "no".
The code uses struct lru_gen_mm_walk *walk in some places and struct mm_walk *walk in others. This is already driving me nuts. Can we please use different identifiers for different types?
On Wed, Apr 13, 2022 at 11:06 PM Andrew Morton <akpm@linux-foundation.org> wrote: > > The code uses > > struct lru_gen_mm_walk *walk > > in some places and > > struct mm_walk *walk > > in others. This is already driving me nuts. Can we please use > different identifiers for different types? Sorry. Going with "struct lru_gen_mm_walk *args", unless there are objections. ("struct mm_walk *walk" is an existing convention so I don't think I should change that.)