diff mbox series

[v2,3/3] mm/selftest: uffd: Explain the write missing fault check

Message ID 20221004003705.497782-4-peterx@redhat.com (mailing list archive)
State New
Series mm/hugetlb: Fix selftest failures with write check

Commit Message

Peter Xu Oct. 4, 2022, 12:37 a.m. UTC
It's not obvious why we have a write check for each of the missing messages,
especially since the faulting access should be a locking op.  Add a rich
comment for that, and also try to explain its good side and limitations, so
that if someone hits it again, because of either a bug or a different glibc
impl, there'll be some clue to start with.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tools/testing/selftests/vm/userfaultfd.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

Comments

Mike Kravetz Oct. 4, 2022, 2:37 a.m. UTC | #1
On 10/03/22 20:37, Peter Xu wrote:
> It's not obvious why we have a write check for each of the missing messages,
> especially since the faulting access should be a locking op.  Add a rich
> comment for that, and also try to explain its good side and limitations, so
> that if someone hits it again, because of either a bug or a different glibc
> impl, there'll be some clue to start with.

Thanks!  It did take a while to understand all this, so the comment is
appropriate.

> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  tools/testing/selftests/vm/userfaultfd.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

David Hildenbrand Oct. 4, 2022, 12:20 p.m. UTC | #2
On 04.10.22 02:37, Peter Xu wrote:
> It's not obvious why we have a write check for each of the missing messages,
> especially since the faulting access should be a locking op.  Add a rich
> comment for that, and also try to explain its good side and limitations, so
> that if someone hits it again, because of either a bug or a different glibc
> impl, there'll be some clue to start with.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   tools/testing/selftests/vm/userfaultfd.c | 22 +++++++++++++++++++++-
>   1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
> index 74babdbc02e5..297f250c1d95 100644
> --- a/tools/testing/selftests/vm/userfaultfd.c
> +++ b/tools/testing/selftests/vm/userfaultfd.c
> @@ -774,7 +774,27 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
>   		continue_range(uffd, msg->arg.pagefault.address, page_size);
>   		stats->minor_faults++;
>   	} else {
> -		/* Missing page faults */
> +		/*
> +		 * Missing page faults.
> +		 *
> +		 * Here we force a write check for each of the missing mode
> +		 * faults.  It's guaranteed because the only threads that
> +		 * will trigger uffd faults are the locking threads, and
> +		 * their first instruction to touch the missing page will
> +		 * always be pthread_mutex_lock().
> +		 *
> +		 * Note that here we relied on an NPTL glibc impl detail to
> +		 * always read the lock type at the entry of the lock op
> +		 * (pthread_mutex_t.__data.__type, offset 0x10) before
> +		 * doing any locking operations to guarantee that.  It's
> +		 * actually not good to rely on this impl detail because
> +		 * logically a pthread-compatible lib can implement the
> +		 * locks without types and we can fail when linking with
> +		 * them.  However since we used to find bugs with this
> +		 * strict check we still keep it around.  Hopefully this
> +		 * could be a good hint when it fails again.  If one day
> +		 * it'll break on some other impl of glibc we'll revisit.
> +		 */
>   		if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
>   			err("unexpected write fault");
>   

Reviewed-by: David Hildenbrand <david@redhat.com>

Patch

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 74babdbc02e5..297f250c1d95 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -774,7 +774,27 @@  static void uffd_handle_page_fault(struct uffd_msg *msg,
 		continue_range(uffd, msg->arg.pagefault.address, page_size);
 		stats->minor_faults++;
 	} else {
-		/* Missing page faults */
+		/*
+		 * Missing page faults.
+		 *
+		 * Here we force a write check for each of the missing mode
+		 * faults.  It's guaranteed because the only threads that
+		 * will trigger uffd faults are the locking threads, and
+		 * their first instruction to touch the missing page will
+		 * always be pthread_mutex_lock().
+		 *
+		 * Note that here we relied on an NPTL glibc impl detail to
+		 * always read the lock type at the entry of the lock op
+		 * (pthread_mutex_t.__data.__type, offset 0x10) before
+		 * doing any locking operations to guarantee that.  It's
+		 * actually not good to rely on this impl detail because
+		 * logically a pthread-compatible lib can implement the
+		 * locks without types and we can fail when linking with
+		 * them.  However since we used to find bugs with this
+		 * strict check we still keep it around.  Hopefully this
+		 * could be a good hint when it fails again.  If one day
+		 * it'll break on some other impl of glibc we'll revisit.
+		 */
 		if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
 			err("unexpected write fault");
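
For readers who want to poke at the two facts the new comment leans on, here
is a minimal standalone sketch; it is not part of the patch.  Everything
prefixed with sketch_/SKETCH_ is a hypothetical name introduced for
illustration.  SKETCH_PAGEFAULT_FLAG_WRITE is assumed to mirror
UFFD_PAGEFAULT_FLAG_WRITE (bit 0) from <linux/userfaultfd.h>.  The mutex
helper assumes glibc, where the field the comment calls "__type" is, as far
as I can tell, named __data.__kind; on x86-64 glibc its offset is expected
to be 0x10, but that is an implementation detail and not a guarantee.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: same value as UFFD_PAGEFAULT_FLAG_WRITE in
 * <linux/userfaultfd.h>.  Defined locally so the sketch builds
 * without the kernel UAPI headers.
 */
#define SKETCH_PAGEFAULT_FLAG_WRITE (UINT64_C(1) << 0)

/* True if the fault flags describe a write access. */
static bool sketch_is_write_fault(uint64_t flags)
{
	return (flags & SKETCH_PAGEFAULT_FLAG_WRITE) != 0;
}

/*
 * Mirrors the selftest's strict check: a missing fault caused by the
 * first touch from pthread_mutex_lock() is expected to be a read,
 * because the lock type is loaded before any store to the mutex.
 */
static void sketch_check_missing_fault(uint64_t flags)
{
	assert(!sketch_is_write_fault(flags));
}

/*
 * Offset of the mutex type field inside pthread_mutex_t, or -1 when
 * the C library is not glibc and the layout is unknown to this sketch.
 */
static long sketch_mutex_type_offset(void)
{
#if defined(__GLIBC__)
	return (long)offsetof(pthread_mutex_t, __data.__kind);
#else
	return -1;
#endif
}
```

Calling sketch_mutex_type_offset() on the target C library is a quick way to
see whether the "offset 0x10" assumption still holds there before blaming the
kernel for an "unexpected write fault".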