diff mbox series

[05/19] mm/gup: Detect huge pfnmap entries in gup-fast

Message ID 20240809160909.1023470-6-peterx@redhat.com (mailing list archive)
State New
Headers show
Series mm: Support huge pfnmaps | expand

Commit Message

Peter Xu Aug. 9, 2024, 4:08 p.m. UTC
Since gup-fast doesn't have the vma reference, teach it to detect such huge
pfnmaps by checking the special bit for pmd/pud too, just like ptes.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/gup.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

David Hildenbrand Aug. 9, 2024, 4:23 p.m. UTC | #1
On 09.08.24 18:08, Peter Xu wrote:
> Since gup-fast doesn't have the vma reference, teach it to detect such huge
> pfnmaps by checking the special bit for pmd/pud too, just like ptes.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   mm/gup.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index d19884e097fd..a49f67a512ee 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -3038,6 +3038,9 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
>   	if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
>   		return 0;
>   
> +	if (pmd_special(orig))
> +		return 0;
> +
>   	if (pmd_devmap(orig)) {
>   		if (unlikely(flags & FOLL_LONGTERM))
>   			return 0;
> @@ -3082,6 +3085,9 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
>   	if (!pud_access_permitted(orig, flags & FOLL_WRITE))
>   		return 0;
>   
> +	if (pud_special(orig))
> +		return 0;
> +
>   	if (pud_devmap(orig)) {
>   		if (unlikely(flags & FOLL_LONGTERM))
>   			return 0;

In gup_fast_pte_range() we check after checking pte_devmap(). Do we want 
to do it in a similar fashion here, or is there a reason to do it 
differently?

Acked-by: David Hildenbrand <david@redhat.com>
Peter Xu Aug. 9, 2024, 4:59 p.m. UTC | #2
On Fri, Aug 09, 2024 at 06:23:53PM +0200, David Hildenbrand wrote:
> On 09.08.24 18:08, Peter Xu wrote:
> > Since gup-fast doesn't have the vma reference, teach it to detect such huge
> > pfnmaps by checking the special bit for pmd/pud too, just like ptes.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   mm/gup.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> > 
> > diff --git a/mm/gup.c b/mm/gup.c
> > index d19884e097fd..a49f67a512ee 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -3038,6 +3038,9 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> >   	if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
> >   		return 0;
> > +	if (pmd_special(orig))
> > +		return 0;
> > +
> >   	if (pmd_devmap(orig)) {
> >   		if (unlikely(flags & FOLL_LONGTERM))
> >   			return 0;
> > @@ -3082,6 +3085,9 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
> >   	if (!pud_access_permitted(orig, flags & FOLL_WRITE))
> >   		return 0;
> > +	if (pud_special(orig))
> > +		return 0;
> > +
> >   	if (pud_devmap(orig)) {
> >   		if (unlikely(flags & FOLL_LONGTERM))
> >   			return 0;
> 
> In gup_fast_pte_range() we check after checking pte_devmap(). Do we want to
> do it in a similar fashion here, or is there a reason to do it differently?

IIUC they should behave the same, as the two should be mutual exclusive so
far.  E.g. see insert_pfn():

	if (pfn_t_devmap(pfn))
		entry = pte_mkdevmap(pfn_t_pte(pfn, prot));
	else
		entry = pte_mkspecial(pfn_t_pte(pfn, prot));

It might change for sure if Alistair move on with the devmap work, though..
these two always are processed together now, so I hope that won't add much
burden which series will land first, then we may need some care on merging
them.  I don't expect anything too tricky in merge if that was about
removal of the devmap bits.

> 
> Acked-by: David Hildenbrand <david@redhat.com>

Thanks, I'll take this one first.
Jason Gunthorpe Aug. 14, 2024, 12:41 p.m. UTC | #3
On Fri, Aug 09, 2024 at 12:08:55PM -0400, Peter Xu wrote:
> Since gup-fast doesn't have the vma reference, teach it to detect such huge
> pfnmaps by checking the special bit for pmd/pud too, just like ptes.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/gup.c | 6 ++++++
>  1 file changed, 6 insertions(+)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason
Jason Gunthorpe Aug. 14, 2024, 12:42 p.m. UTC | #4
On Fri, Aug 09, 2024 at 12:59:40PM -0400, Peter Xu wrote:
> > In gup_fast_pte_range() we check after checking pte_devmap(). Do we want to
> > do it in a similar fashion here, or is there a reason to do it differently?
> 
> IIUC they should behave the same, as the two should be mutual exclusive so
> far.  E.g. see insert_pfn():

Yes, agree no functional difference, but David has a point to try to
keep the logic structurally the same in all pte/pmd/pud copies.

> 	if (pfn_t_devmap(pfn))
> 		entry = pte_mkdevmap(pfn_t_pte(pfn, prot));
> 	else
> 		entry = pte_mkspecial(pfn_t_pte(pfn, prot));
> 
> It might change for sure if Alistair move on with the devmap work, though..
> these two always are processed together now, so I hope that won't add much
> burden which series will land first, then we may need some care on merging
> them.  I don't expect anything too tricky in merge if that was about
> removal of the devmap bits.

Removing pte_mkdevmap can only make things simpler :)

Jason
Peter Xu Aug. 14, 2024, 3:34 p.m. UTC | #5
On Wed, Aug 14, 2024 at 09:42:28AM -0300, Jason Gunthorpe wrote:
> On Fri, Aug 09, 2024 at 12:59:40PM -0400, Peter Xu wrote:
> > > In gup_fast_pte_range() we check after checking pte_devmap(). Do we want to
> > > do it in a similar fashion here, or is there a reason to do it differently?
> > 
> > IIUC they should behave the same, as the two should be mutual exclusive so
> > far.  E.g. see insert_pfn():
> 
> Yes, agree no functional difference, but David has a point to try to
> keep the logic structurally the same in all pte/pmd/pud copies.

OK, let me reorder them if that helps.

> 
> > 	if (pfn_t_devmap(pfn))
> > 		entry = pte_mkdevmap(pfn_t_pte(pfn, prot));
> > 	else
> > 		entry = pte_mkspecial(pfn_t_pte(pfn, prot));
> > 
> > It might change for sure if Alistair move on with the devmap work, though..
> > these two always are processed together now, so I hope that won't add much
> > burden which series will land first, then we may need some care on merging
> > them.  I don't expect anything too tricky in merge if that was about
> > removal of the devmap bits.
> 
> Removing pte_mkdevmap can only make things simpler :)

Yep. :)

Thanks,
diff mbox series

Patch

diff --git a/mm/gup.c b/mm/gup.c
index d19884e097fd..a49f67a512ee 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3038,6 +3038,9 @@  static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 	if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
 
+	if (pmd_special(orig))
+		return 0;
+
 	if (pmd_devmap(orig)) {
 		if (unlikely(flags & FOLL_LONGTERM))
 			return 0;
@@ -3082,6 +3085,9 @@  static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
 	if (!pud_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
 
+	if (pud_special(orig))
+		return 0;
+
 	if (pud_devmap(orig)) {
 		if (unlikely(flags & FOLL_LONGTERM))
 			return 0;