Message ID | 20210714193542.21857-1-joao.m.martins@oracle.com (mailing list archive) |
---|---|
Series | mm, sparse-vmemmap: Introduce compound pagemaps |
On Wed, 14 Jul 2021 20:35:28 +0100 Joao Martins <joao.m.martins@oracle.com> wrote:

> This series, attempts at minimizing 'struct page' overhead by
> pursuing a similar approach as Muchun Song series "Free some vmemmap
> pages of hugetlb page"[0] but applied to devmap/ZONE_DEVICE which is now
> in mmotm.
>
> [0] https://lore.kernel.org/linux-mm/20210308102807.59745-1-songmuchun@bytedance.com/

[0] is now in mainline.

This patch series looks like it'll clash significantly with the folio work and it is pretty thinly reviewed, so I think I'll take a pass for now. Matthew, thoughts?
On Wed, Jul 14, 2021 at 2:48 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed, 14 Jul 2021 20:35:28 +0100 Joao Martins <joao.m.martins@oracle.com> wrote:
>
> > This series, attempts at minimizing 'struct page' overhead by
> > pursuing a similar approach as Muchun Song series "Free some vmemmap
> > pages of hugetlb page"[0] but applied to devmap/ZONE_DEVICE which is now
> > in mmotm.
> >
> > [0] https://lore.kernel.org/linux-mm/20210308102807.59745-1-songmuchun@bytedance.com/
>
> [0] is now in mainline.
>
> This patch series looks like it'll clash significantly with the folio
> work and it is pretty thinly reviewed,

Sorry about that, I've promised Joao some final reviewed-by tags and testing for a while, and the gears are turning now.

> so I think I'll take a pass for now. Matthew, thoughts?

I'll defer to Matthew about folio collision, but I did not think this compound page geometry setup for memremap_pages() would collide with folios that want to clarify passing multi-order pages around the kernel.

Joao is solving a long-standing criticism of memremap_pages() usage for PMEM where the page metadata is too large to fit in RAM and the page array in PMEM is noticeably slower to pin for frequent pin_user_pages() events. memremap_pages() is a good first candidate for this solution given its pages never get handled by the page allocator. If anything it allows folios to seep deeper into the DAX code as it knocks down the "base-pages only" assumption of those paths.
On Wed, Jul 14, 2021 at 02:48:30PM -0700, Andrew Morton wrote:
> On Wed, 14 Jul 2021 20:35:28 +0100 Joao Martins <joao.m.martins@oracle.com> wrote:
>
> > This series, attempts at minimizing 'struct page' overhead by
> > pursuing a similar approach as Muchun Song series "Free some vmemmap
> > pages of hugetlb page"[0] but applied to devmap/ZONE_DEVICE which is now
> > in mmotm.
> >
> > [0] https://lore.kernel.org/linux-mm/20210308102807.59745-1-songmuchun@bytedance.com/
>
> [0] is now in mainline.
>
> This patch series looks like it'll clash significantly with the folio
> work and it is pretty thinly reviewed, so I think I'll take a pass for
> now. Matthew, thoughts?

I had a look through it, and I don't see anything that looks like it'll clash with the folio patches. The folio work really touches the page cache for now, and this seems mostly to touch the devmap paths.

It would be nice to convert the devmap code to folios too, but that can wait. The mess with page refcounts needs to be sorted out first.
On 7/22/21 3:24 AM, Matthew Wilcox wrote:
> On Wed, Jul 14, 2021 at 02:48:30PM -0700, Andrew Morton wrote:
>> On Wed, 14 Jul 2021 20:35:28 +0100 Joao Martins <joao.m.martins@oracle.com> wrote:
>>
>>> This series, attempts at minimizing 'struct page' overhead by
>>> pursuing a similar approach as Muchun Song series "Free some vmemmap
>>> pages of hugetlb page"[0] but applied to devmap/ZONE_DEVICE which is now
>>> in mmotm.
>>>
>>> [0] https://lore.kernel.org/linux-mm/20210308102807.59745-1-songmuchun@bytedance.com/
>>
>> [0] is now in mainline.
>>
>> This patch series looks like it'll clash significantly with the folio
>> work and it is pretty thinly reviewed, so I think I'll take a pass for
>> now. Matthew, thoughts?
>
> I had a look through it, and I don't see anything that looks like it'll
> clash with the folio patches.

FWIW, I had tried this last week, and this series applies cleanly on top of your 130+ patch series for Folios.

> The folio work really touches the page
> cache for now, and this seems mostly to touch the devmap paths.
>

/me nods -- it really is about devmap infra for usage in device-dax for persistent memory.

Perhaps I should do s/pagemaps/devmap/ throughout the series to avoid confusion.

> It would be nice to convert the devmap code to folios too, but that
> can wait. The mess with page refcounts needs to be sorted out first.
>

I suppose you refer to fixing the current zone-device elevated page ref count?

https://lore.kernel.org/linux-mm/20210717192135.9030-3-alex.sierra@amd.com/
On Thu, Jul 22, 2021 at 3:54 AM Joao Martins <joao.m.martins@oracle.com> wrote:
[..]
> > The folio work really touches the page
> > cache for now, and this seems mostly to touch the devmap paths.
> >
> /me nods -- it really is about devmap infra for usage in device-dax for persistent memory.
>
> Perhaps I should do s/pagemaps/devmap/ throughout the series to avoid confusion.

I also like "devmap" as a more accurate name. It matches the PFN_DEV and PFN_MAP flags that decorate DAX capable pfn_t instances. It also happens to match a recommendation I gave to Ira for his support for supervisor protection keys with devmap pfns.
On 7/28/21 12:23 AM, Dan Williams wrote:
> On Thu, Jul 22, 2021 at 3:54 AM Joao Martins <joao.m.martins@oracle.com> wrote:
> [..]
>>> The folio work really touches the page
>>> cache for now, and this seems mostly to touch the devmap paths.
>>>
>> /me nods -- it really is about devmap infra for usage in device-dax for persistent memory.
>>
>> Perhaps I should do s/pagemaps/devmap/ throughout the series to avoid confusion.
>
> I also like "devmap" as a more accurate name. It matches the PFN_DEV
> and PFN_MAP flags that decorate DAX capable pfn_t instances. It also
> happens to match a recommendation I gave to Ira for his support for
> supervisor protection keys with devmap pfns.
>

/me nods

Additionally, I think I'll be reordering the patches for more clear/easier bisection i.e. first introducing compound pages for devmap, fixing associated issues wrt to the slow pinning and then introduce vmemmap deduplication for devmap.

It should look like below after the reordering from first patch to last. Let me know if you disagree.

    memory-failure: fetch compound_head after pgmap_pfn_valid()
    mm/page_alloc: split prep_compound_page into head and tail subparts
    mm/page_alloc: refactor memmap_init_zone_device() page init
    mm/memremap: add ZONE_DEVICE support for compound pages
    device-dax: use ALIGN() for determining pgoff
    device-dax: compound devmap support
    mm/gup: grab head page refcount once for group of subpages
    mm/sparse-vmemmap: add a pgmap argument to section activation
    mm/sparse-vmemmap: refactor core of vmemmap_populate_basepages() to helper
    mm/hugetlb_vmemmap: move comment block to Documentation/vm
    mm/sparse-vmemmap: populate compound devmaps
    mm/page_alloc: reuse tail struct pages for compound devmaps
    mm/sparse-vmemmap: improve memory savings for compound pud geometry
On Mon, Aug 2, 2021 at 3:41 AM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> On 7/28/21 12:23 AM, Dan Williams wrote:
> > On Thu, Jul 22, 2021 at 3:54 AM Joao Martins <joao.m.martins@oracle.com> wrote:
> > [..]
> >>> The folio work really touches the page
> >>> cache for now, and this seems mostly to touch the devmap paths.
> >>>
> >> /me nods -- it really is about devmap infra for usage in device-dax for persistent memory.
> >>
> >> Perhaps I should do s/pagemaps/devmap/ throughout the series to avoid confusion.
> >
> > I also like "devmap" as a more accurate name. It matches the PFN_DEV
> > and PFN_MAP flags that decorate DAX capable pfn_t instances. It also
> > happens to match a recommendation I gave to Ira for his support for
> > supervisor protection keys with devmap pfns.
> >
> /me nods
>
> Additionally, I think I'll be reordering the patches for more clear/easier
> bisection i.e. first introducing compound pages for devmap, fixing associated
> issues wrt to the slow pinning and then introduce vmemmap deduplication for
> devmap.
>
> It should look like below after the reordering from first patch to last.
> Let me know if you disagree.
>
> memory-failure: fetch compound_head after pgmap_pfn_valid()
> mm/page_alloc: split prep_compound_page into head and tail subparts
> mm/page_alloc: refactor memmap_init_zone_device() page init
> mm/memremap: add ZONE_DEVICE support for compound pages
> device-dax: use ALIGN() for determining pgoff
> device-dax: compound devmap support
> mm/gup: grab head page refcount once for group of subpages
> mm/sparse-vmemmap: add a pgmap argument to section activation
> mm/sparse-vmemmap: refactor core of vmemmap_populate_basepages() to helper
> mm/hugetlb_vmemmap: move comment block to Documentation/vm
> mm/sparse-vmemmap: populate compound devmaps
> mm/page_alloc: reuse tail struct pages for compound devmaps
> mm/sparse-vmemmap: improve memory savings for compound pud geometry

LGTM.