diff mbox series

mm: use down_read_killable for locking mmap_sem in access_remote_vm

Message ID 155790847881.2798.7160461383704600177.stgit@buzz (mailing list archive)
State New, archived
Headers show
Series mm: use down_read_killable for locking mmap_sem in access_remote_vm | expand

Commit Message

Konstantin Khlebnikov May 15, 2019, 8:21 a.m. UTC
This function is used by ptrace and proc files like /proc/pid/cmdline and
/proc/pid/environ. Return 0 (bytes read) if current task is killed.

Mmap_sem could be locked for a long time or forever if something wrong.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 mm/memory.c |    4 +++-
 mm/nommu.c  |    3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

Comments

Michal Koutný May 15, 2019, 8:38 a.m. UTC | #1
Hi,
making this holder of mmap_sem killable was for the reasons of /proc/...
diagnostics was an idea I was pondeering too. However, I think the
approach of pretending we read 0 bytes is not correct. The API would IMO
need to be extended to allow pass a result such as EINTR to the end
caller.
Why do you think it's safe to return just 0?

Michal
Konstantin Khlebnikov May 15, 2019, 8:48 a.m. UTC | #2
On 15.05.2019 11:38, Michal Koutný wrote:
> Hi,
> making this holder of mmap_sem killable was for the reasons of /proc/...
> diagnostics was an idea I was pondeering too. However, I think the
> approach of pretending we read 0 bytes is not correct. The API would IMO
> need to be extended to allow pass a result such as EINTR to the end
> caller.
> Why do you think it's safe to return just 0?

This function ignores any error like reading from unmapped area and
returns only size of successful transfer. It never returned any error codes.

> 
> Michal
>
Matthew Wilcox May 15, 2019, 11:49 a.m. UTC | #3
On Wed, May 15, 2019 at 10:38:26AM +0200, Michal Koutný wrote:
> Hi,
> making this holder of mmap_sem killable was for the reasons of /proc/...
> diagnostics was an idea I was pondeering too. However, I think the
> approach of pretending we read 0 bytes is not correct. The API would IMO
> need to be extended to allow pass a result such as EINTR to the end
> caller.
> Why do you think it's safe to return just 0?

_killable_, not _interruptible_.

The return value will never be seen by userspace because it's dead.
Oleg Nesterov May 15, 2019, 2:28 p.m. UTC | #4
> @@ -4348,7 +4348,9 @@ int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
>  	void *old_buf = buf;
>  	int write = gup_flags & FOLL_WRITE;
>  
> -	down_read(&mm->mmap_sem);
> +	if (down_read_killable(&mm->mmap_sem))
> +		return 0;
> +

I too think that "return 0" looks a bit strange even if correct, to me
"return -EINTR" would look better.

But I won't insist, this is cosmetic.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Michal Koutný May 16, 2019, 5 p.m. UTC | #5
On Wed, May 15, 2019 at 11:48:32AM +0300, Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:
> This function ignores any error like reading from unmapped area and
> returns only size of successful transfer. It never returned any error codes.
This is a point I missed. Hence no need to adjust consumers of
__access_remote_vm() (they won't actually handle -EINTR correctly w/out
further changes). This beats my original idea with simplicity.


Reviewed-by: Michal Koutný <mkoutny@suse.com>

Michal
Michal Hocko May 17, 2019, 12:51 p.m. UTC | #6
On Wed 15-05-19 11:21:18, Konstantin Khlebnikov wrote:
> This function is used by ptrace and proc files like /proc/pid/cmdline and
> /proc/pid/environ. Return 0 (bytes read) if current task is killed.

Please add an explanation about why this is OK (as explained in the
follow up email).

> Mmap_sem could be locked for a long time or forever if something wrong.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memory.c |    4 +++-
>  mm/nommu.c  |    3 ++-
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 96f1d473c89a..2e6846d09023 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4348,7 +4348,9 @@ int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
>  	void *old_buf = buf;
>  	int write = gup_flags & FOLL_WRITE;
>  
> -	down_read(&mm->mmap_sem);
> +	if (down_read_killable(&mm->mmap_sem))
> +		return 0;
> +
>  	/* ignore errors, just check how much was successfully transferred */
>  	while (len) {
>  		int bytes, ret, offset;
> diff --git a/mm/nommu.c b/mm/nommu.c
> index b492fd1fcf9f..cad8fb34088f 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -1791,7 +1791,8 @@ int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
>  	struct vm_area_struct *vma;
>  	int write = gup_flags & FOLL_WRITE;
>  
> -	down_read(&mm->mmap_sem);
> +	if (down_read_killable(&mm->mmap_sem))
> +		return 0;
>  
>  	/* the access must start within one of the target process's mappings */
>  	vma = find_vma(mm, addr);
diff mbox series

Patch

diff --git a/mm/memory.c b/mm/memory.c
index 96f1d473c89a..2e6846d09023 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4348,7 +4348,9 @@  int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
 	void *old_buf = buf;
 	int write = gup_flags & FOLL_WRITE;
 
-	down_read(&mm->mmap_sem);
+	if (down_read_killable(&mm->mmap_sem))
+		return 0;
+
 	/* ignore errors, just check how much was successfully transferred */
 	while (len) {
 		int bytes, ret, offset;
diff --git a/mm/nommu.c b/mm/nommu.c
index b492fd1fcf9f..cad8fb34088f 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1791,7 +1791,8 @@  int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
 	struct vm_area_struct *vma;
 	int write = gup_flags & FOLL_WRITE;
 
-	down_read(&mm->mmap_sem);
+	if (down_read_killable(&mm->mmap_sem))
+		return 0;
 
 	/* the access must start within one of the target process's mappings */
 	vma = find_vma(mm, addr);