diff mbox series

[2/2] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock

Message ID: 1597120955-16495-3-git-send-email-chinwen.chang@mediatek.com
State: New, archived
Series: Try to release mmap_lock temporarily in smaps_rollup

Commit Message

Chinwen Chang Aug. 11, 2020, 4:42 a.m. UTC
smaps_rollup grabs mmap_lock and holds it until it has walked the whole
vma list. For large processes, the mmap_lock is therefore held for a long
time, which may block write requests such as mmap and munmap from
progressing smoothly.

There are upcoming mmap_lock optimizations like range-based locks, but
the lock applied to smaps_rollup would be the coarse type, which doesn't
avoid the occurrence of unpleasant contention.

To solve the aforementioned issue, we add a check which detects whether
anyone wants to grab mmap_lock for writing, and release the lock
temporarily if so.

Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
---
 fs/proc/task_mmu.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Steven Price Aug. 12, 2020, 8:39 a.m. UTC | #1
On 11/08/2020 05:42, Chinwen Chang wrote:
> smaps_rollup will try to grab mmap_lock and go through the whole vma
> list until it finishes the iterating. When encountering large processes,
> the mmap_lock will be held for a longer time, which may block other
> write requests like mmap and munmap from progressing smoothly.
> 
> There are upcoming mmap_lock optimizations like range-based locks, but
> the lock applied to smaps_rollup would be the coarse type, which doesn't
> avoid the occurrence of unpleasant contention.
> 
> To solve aforementioned issue, we add a check which detects whether
> anyone wants to grab mmap_lock for write attempts.
> 
> Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
> ---
>   fs/proc/task_mmu.c | 21 +++++++++++++++++++++
>   1 file changed, 21 insertions(+)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index dbda449..4b51f25 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -856,6 +856,27 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
>   	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
>   		smap_gather_stats(vma, &mss);
>   		last_vma_end = vma->vm_end;
> +
> +		/*
> +		 * Release mmap_lock temporarily if someone wants to
> +		 * access it for write request.
> +		 */
> +		if (mmap_lock_is_contended(mm)) {
> +			mmap_read_unlock(mm);
> +			ret = mmap_read_lock_killable(mm);
> +			if (ret) {
> +				release_task_mempolicy(priv);
> +				goto out_put_mm;
> +			}
> +
> +			/* Check whether current vma is available */
> +			vma = find_vma(mm, last_vma_end - 1);
> +			if (vma && vma->vm_start < last_vma_end)

I may be wrong, but this looks like it could return incorrect results. 
For example if we start reading with the following VMAs:

  +------+------+-----------+
  | VMA1 | VMA2 | VMA3      |
  +------+------+-----------+
  |      |      |           |
4k     8k     16k         400k

Then after reading VMA2 we drop the lock due to contention. So:

   last_vma_end = 16k

If VMA2 is then freed while the lock is dropped, we have:

  +------+      +-----------+
  | VMA1 |      | VMA3      |
  +------+      +-----------+
  |      |      |           |
4k     8k     16k         400k

find_vma(mm, 16k-1) will then return VMA3 and the condition vm_start < 
last_vma_end will be false.

> +				continue;
> +
> +			/* Current vma is not available, just break */
> +			break;

Which means we break out here and report an incomplete output (the 
numbers will be much smaller than reality).

Would it be better to have a loop like:

	for (vma = priv->mm->mmap; vma;) {
		smap_gather_stats(vma, &mss);
		last_vma_end = vma->vm_end;

		if (contended) {
			/* drop/acquire lock */

			vma = find_vma(mm, last_vma_end - 1);
			if (!vma)
				break;
			if (vma->vm_start >= last_vma_end)
				continue;
		}
		vma = vma->vm_next;
	}

that way if the VMA is removed while the lock is dropped the loop can 
just continue from the next VMA.

Or perhaps I missed something obvious? I haven't actually tested 
anything above.

Steve

> +		}
>   	}
>   
>   	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
>
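
The scenario in the review above can be reproduced in userspace. What follows is a minimal sketch, not kernel code: `struct vma`, `struct mm`, `build()`, `unmap_vma()`, `walk_patch()` and `walk_suggested()` are hypothetical stand-ins, locking is omitted entirely, and summing the VMA sizes stands in for smap_gather_stats(). The toy find_vma() only mirrors the kernel contract of returning the first VMA whose vm_end is greater than the address. Contention is assumed to hit right after VMA2 is processed, and VMA2 is unmapped while the lock would have been dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct vma {
	unsigned long vm_start, vm_end;
	struct vma *vm_next;
};

struct mm {
	struct vma *mmap;	/* head of the address-sorted VMA list */
};

/* Mirrors find_vma()'s contract: first VMA with vm_end > addr, or NULL. */
static struct vma *find_vma(struct mm *mm, unsigned long addr)
{
	struct vma *v;

	for (v = mm->mmap; v; v = v->vm_next)
		if (v->vm_end > addr)
			return v;
	return NULL;
}

/* Unlink the VMA starting at 'start', as a concurrent munmap() would. */
static void unmap_vma(struct mm *mm, unsigned long start)
{
	struct vma **pp;

	for (pp = &mm->mmap; *pp; pp = &(*pp)->vm_next)
		if ((*pp)->vm_start == start) {
			*pp = (*pp)->vm_next;
			return;
		}
}

/* Layout from the review: VMA1 4k-8k, VMA2 8k-16k, VMA3 16k-400k. */
static void build(struct mm *mm, struct vma v[3])
{
	v[0] = (struct vma){ 4096, 8192, &v[1] };
	v[1] = (struct vma){ 8192, 16384, &v[2] };
	v[2] = (struct vma){ 16384, 409600, NULL };
	mm->mmap = &v[0];
}

/* Loop shape of the posted patch: break when the current VMA is gone. */
static unsigned long walk_patch(struct mm *mm, unsigned long drop_at)
{
	unsigned long total = 0, last_vma_end;
	struct vma *vma;

	for (vma = mm->mmap; vma; vma = vma->vm_next) {
		total += vma->vm_end - vma->vm_start;
		last_vma_end = vma->vm_end;
		if (last_vma_end == drop_at) {	/* pretend the lock is contended */
			unmap_vma(mm, 8192);	/* writer frees VMA2 meanwhile */
			vma = find_vma(mm, last_vma_end - 1);
			if (vma && vma->vm_start < last_vma_end)
				continue;
			break;
		}
	}
	return total;
}

/* Loop shape suggested in the review: resume from the next VMA. */
static unsigned long walk_suggested(struct mm *mm, unsigned long drop_at)
{
	unsigned long total = 0, last_vma_end;
	struct vma *vma;

	for (vma = mm->mmap; vma;) {
		total += vma->vm_end - vma->vm_start;
		last_vma_end = vma->vm_end;
		if (last_vma_end == drop_at) {
			unmap_vma(mm, 8192);
			vma = find_vma(mm, last_vma_end - 1);
			if (!vma)
				break;
			if (vma->vm_start >= last_vma_end)
				continue;	/* old VMA gone: process this one */
		}
		vma = vma->vm_next;
	}
	return total;
}

/* Print both totals for the scenario; call this from a main() to try it. */
static void demo(void)
{
	struct vma v[3];
	struct mm mm;

	build(&mm, v);
	printf("patch loop:     %lu bytes\n", walk_patch(&mm, 16384));
	build(&mm, v);
	printf("suggested loop: %lu bytes\n", walk_suggested(&mm, 16384));
}
```

In this model the patch-style loop stops after VMA2 and never counts VMA3, while the suggested loop picks up VMA3 and walks to the end of the list. The sketch does not cover a VMA that grows or is merged while the lock is dropped.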
Chinwen Chang Aug. 12, 2020, 9:26 a.m. UTC | #2
On Wed, 2020-08-12 at 09:39 +0100, Steven Price wrote:
> On 11/08/2020 05:42, Chinwen Chang wrote:
> > smaps_rollup will try to grab mmap_lock and go through the whole vma
> > list until it finishes the iterating. When encountering large processes,
> > the mmap_lock will be held for a longer time, which may block other
> > write requests like mmap and munmap from progressing smoothly.
> > 
> > There are upcoming mmap_lock optimizations like range-based locks, but
> > the lock applied to smaps_rollup would be the coarse type, which doesn't
> > avoid the occurrence of unpleasant contention.
> > 
> > To solve aforementioned issue, we add a check which detects whether
> > anyone wants to grab mmap_lock for write attempts.
> > 
> > Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
> > ---
> >   fs/proc/task_mmu.c | 21 +++++++++++++++++++++
> >   1 file changed, 21 insertions(+)
> > 
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index dbda449..4b51f25 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -856,6 +856,27 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
> >   	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
> >   		smap_gather_stats(vma, &mss);
> >   		last_vma_end = vma->vm_end;
> > +
> > +		/*
> > +		 * Release mmap_lock temporarily if someone wants to
> > +		 * access it for write request.
> > +		 */
> > +		if (mmap_lock_is_contended(mm)) {
> > +			mmap_read_unlock(mm);
> > +			ret = mmap_read_lock_killable(mm);
> > +			if (ret) {
> > +				release_task_mempolicy(priv);
> > +				goto out_put_mm;
> > +			}
> > +
> > +			/* Check whether current vma is available */
> > +			vma = find_vma(mm, last_vma_end - 1);
> > +			if (vma && vma->vm_start < last_vma_end)
> 
> I may be wrong, but this looks like it could return incorrect results. 
> For example if we start reading with the following VMAs:
> 
>   +------+------+-----------+
>   | VMA1 | VMA2 | VMA3      |
>   +------+------+-----------+
>   |      |      |           |
> 4k     8k     16k         400k
> 
> Then after reading VMA2 we drop the lock due to contention. So:
> 
>    last_vma_end = 16k
> 
> Then if VMA2 is freed while the lock is dropped, so we have:
> 
>   +------+      +-----------+
>   | VMA1 |      | VMA3      |
>   +------+      +-----------+
>   |      |      |           |
> 4k     8k     16k         400k
> 
> find_vma(mm, 16k-1) will then return VMA3 and the condition vm_start < 
> last_vma_end will be false.
> 
Hi Steve,

Thank you for reviewing this patch.

You are correct. If contention is detected and the current vma (VMA2
here) is freed while the lock is dropped, it will report an
incomplete result.

> > +				continue;
> > +
> > +			/* Current vma is not available, just break */
> > +			break;
> 
> Which means we break out here and report an incomplete output (the 
> numbers will be much smaller than reality).
> 
> Would it be better to have a loop like:
> 
> 	for (vma = priv->mm->mmap; vma;) {
> 		smap_gather_stats(vma, &mss);
> 		last_vma_end = vma->vm_end;
> 
> 		if (contended) {
> 			/* drop/acquire lock */
> 
> 			vma = find_vma(mm, last_vma_end - 1);
> 			if (!vma)
> 				break;
> 			if (vma->vm_start >= last_vma_end)
> 				continue;
> 		}
> 		vma = vma->vm_next;
> 	}
> 
> that way if the VMA is removed while the lock is dropped the loop can 
> just continue from the next VMA.
> 
Thanks a lot for your great suggestion.

> Or perhaps I missed something obvious? I haven't actually tested 
> anything above.
> 
> Steve

I will prepare a new patch series for further review.

Thank you.
Chinwen
> 
> > +		}
> >   	}
> >   
> >   	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
> > 
>
Patch

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dbda449..4b51f25 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -856,6 +856,27 @@  static int show_smaps_rollup(struct seq_file *m, void *v)
 	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
 		smap_gather_stats(vma, &mss);
 		last_vma_end = vma->vm_end;
+
+		/*
+		 * Release mmap_lock temporarily if someone wants to
+		 * access it for write request.
+		 */
+		if (mmap_lock_is_contended(mm)) {
+			mmap_read_unlock(mm);
+			ret = mmap_read_lock_killable(mm);
+			if (ret) {
+				release_task_mempolicy(priv);
+				goto out_put_mm;
+			}
+
+			/* Check whether current vma is available */
+			vma = find_vma(mm, last_vma_end - 1);
+			if (vma && vma->vm_start < last_vma_end)
+				continue;
+
+			/* Current vma is not available, just break */
+			break;
+		}
 	}
 
 	show_vma_header_prefix(m, priv->mm->mmap->vm_start,