[V2,1/2] mm: madvise: return correct bytes advised with process_madvise

Message ID 125b61a0edcee5c2db8658aed9d06a43a19ccafc.1647008754.git.quic_charante@quicinc.com (mailing list archive)
State New
Series mm: madvise: return correct bytes processed with process_madvise

Commit Message

Charan Teja Kalla March 11, 2022, 3:29 p.m. UTC
The process_madvise() system call returns an error even after it has
processed some of the VMAs passed in the 'struct iovec' vector list,
which leaves the user unsure where to restart the advice. It also
contradicts the man page[1] for this syscall, which states that the
"return value may be less than the total number of requested bytes, if
an error occurred after some iovec elements were already processed.".

Consider a user who passes 10 VMAs in the 'struct iovec' vector list,
of which the first 9 are processed and the last one fails. The call
then returns only the error from the failed VMA, even though the first
9 VMAs were processed, leaving the user unsure which VMA failed.
Returning the number of bytes processed instead lets the user work out
which VMA failed and then retry or skip the advice on that VMA.

[1] https://man7.org/linux/man-pages/man2/process_madvise.2.html

Fixes: ecb8ac8b1f14 ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
Cc: <stable@vger.kernel.org> # 5.10+
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
---
Changes in V2:
 -- Separated the ENOMEM handling from the return of bytes processed, as per Minchan's comments.
 -- This patch contains only the correction of the bytes returned by process_madvise().

Changes in V1:
 -- Fixed the ENOMEM handling and return bytes processed by process_madvise.
 -- https://patchwork.kernel.org/project/linux-mm/patch/1646803679-11433-1-git-send-email-quic_charante@quicinc.com/

 mm/madvise.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
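
The commit message says that the byte count lets the caller work out
which VMA failed. A minimal user-space sketch of that bookkeeping
follows. The helper name first_unprocessed() is hypothetical and not part
of any API, and the sketch assumes the kernel advances over whole iovec
elements only, so the returned count is a sum of complete iov_len values.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: given the non-negative byte count returned by
 * process_madvise(), return the index of the first iovec element that
 * was not fully processed, or iovcnt if every element was processed.
 * Assumes elements are processed in order and never partially.
 */
static size_t first_unprocessed(const struct iovec *vec, size_t iovcnt,
				ssize_t advised)
{
	size_t left;
	size_t i;

	assert(advised >= 0);
	left = (size_t)advised;

	for (i = 0; i < iovcnt; i++) {
		if (left < vec[i].iov_len)
			return i;	/* this element was not covered */
		left -= vec[i].iov_len;
	}
	return iovcnt;
}
```

With 10 elements of 4096 bytes each and a return value of 36864, the
helper returns index 9, so the caller can retry or skip that element and
continue with whatever follows it.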

Comments

Minchan Kim March 15, 2022, 10:20 p.m. UTC | #1
On Fri, Mar 11, 2022 at 08:59:05PM +0530, Charan Teja Kalla wrote:
> The process_madvise() system call returns error even after processing
> some VMA's passed in the 'struct iovec' vector list which leaves the
> user confused to know where to restart the advise next. It is also
> against this syscall man page[1] documentation where it mentions that
> "return value may be less than the total number of requested bytes, if
> an error occurred after some iovec elements were already processed.".
> 
> Consider a user passed 10 VMA's in the 'struct iovec' vector list of
> which 9 are processed but one. Then it just returns the error caused on
> that failed VMA despite the first 9 VMA's processed, leaving the user
> confused about on which VMA it is failed. Returning the number of bytes
> processed here can help the user to know which VMA it is failed on and
> thus can retry/skip the advise on that VMA.
> 
> [1]https://man7.org/linux/man-pages/man2/process_madvise.2.html.
> 
> Fixes: ecb8ac8b1f14("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
> Cc: <stable@vger.kernel.org> # 5.10+
> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Michal Hocko March 21, 2022, 3:18 p.m. UTC | #2
On Fri 11-03-22 20:59:05, Charan Teja Kalla wrote:
> The process_madvise() system call returns error even after processing
> some VMA's passed in the 'struct iovec' vector list which leaves the
> user confused to know where to restart the advise next. It is also
> against this syscall man page[1] documentation where it mentions that
> "return value may be less than the total number of requested bytes, if
> an error occurred after some iovec elements were already processed.".
> 
> Consider a user passed 10 VMA's in the 'struct iovec' vector list of
> which 9 are processed but one. Then it just returns the error caused on
> that failed VMA despite the first 9 VMA's processed, leaving the user
> confused about on which VMA it is failed. Returning the number of bytes
> processed here can help the user to know which VMA it is failed on and
> thus can retry/skip the advise on that VMA.
> 
> [1]https://man7.org/linux/man-pages/man2/process_madvise.2.html.
> 
> Fixes: ecb8ac8b1f14("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
> Cc: <stable@vger.kernel.org> # 5.10+
> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
> Changes in V2:
>  -- Separated the ENOMEM handling and return bytes processed, as per Minchan comments.
>  -- This contains correcting return bytes processed with process_madvise().
> 
> Changes in V1:
>  -- Fixed the ENOMEM handling and return bytes processed by process_madvise.
>  -- https://patchwork.kernel.org/project/linux-mm/patch/1646803679-11433-1-git-send-email-quic_charante@quicinc.com/
> 
>  mm/madvise.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 38d0f51..e97e6a9 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1433,8 +1433,7 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>  		iov_iter_advance(&iter, iovec.iov_len);
>  	}
>  
> -	if (ret == 0)
> -		ret = total_len - iov_iter_count(&iter);
> +	ret = (total_len - iov_iter_count(&iter)) ? : ret;
>  
>  release_mm:
>  	mmput(mm);
> -- 
> 2.7.4
Patch

diff --git a/mm/madvise.c b/mm/madvise.c
index 38d0f51..e97e6a9 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1433,8 +1433,7 @@  SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 		iov_iter_advance(&iter, iovec.iov_len);
 	}
 
-	if (ret == 0)
-		ret = total_len - iov_iter_count(&iter);
+	ret = (total_len - iov_iter_count(&iter)) ? : ret;
 
 release_mm:
 	mmput(mm);
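
For readers unfamiliar with the GNU "a ?: b" shorthand used in the hunk
above (it yields a when a is non-zero, otherwise b), here is a portable
user-space sketch of the return-value computation before and after the
change. The function names ret_before() and ret_after() and the
parameter remaining (standing in for iov_iter_count(&iter)) are
illustrative assumptions, not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Before the patch: bytes are reported only when no error occurred. */
static ssize_t ret_before(size_t total_len, size_t remaining, ssize_t ret)
{
	assert(remaining <= total_len);
	if (ret == 0)
		ret = (ssize_t)(total_len - remaining);
	return ret;
}

/*
 * After the patch: if any bytes were processed, report that count;
 * otherwise fall back to ret (an error code, or 0 if nothing was asked).
 */
static ssize_t ret_after(size_t total_len, size_t remaining, ssize_t ret)
{
	size_t done;

	assert(remaining <= total_len);
	done = total_len - remaining;
	return done ? (ssize_t)done : ret;
}
```

Taking the example from the commit message with 10 ranges of 4096 bytes
where the last one fails with -ENOMEM, ret_before(40960, 4096, -ENOMEM)
gives -ENOMEM while ret_after(40960, 4096, -ENOMEM) gives 36864. If the
very first range fails, nothing was processed and both return the error.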