
[6/6] cma: Isolate pageblocks speculatively during allocation

Message ID 20190218210715.1066-7-krisman@collabora.com (mailing list archive)
State New, archived
Series Improve handling of GFP flags in the CMA allocator

Commit Message

Gabriel Krisman Bertazi Feb. 18, 2019, 9:07 p.m. UTC
Holding the mutex when calling alloc_contig_range() is not essential,
because the bitmap reservation plus the implicit synchronization
inside start_isolate_page_range() already prevent two allocations
from touching the same pageblock.  Some serialization on this path is
still beneficial, though, because it lets a colliding allocation
retry on the same pageblock, when possible, instead of immediately
jumping to another allocatable region.
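
The bitmap reservation mentioned above is the step already taken
under cma->lock earlier in cma_alloc(), which guarantees two
allocations can never reserve the same pages.  Roughly (simplified
from the existing code in mm/cma.c, error handling trimmed):

	mutex_lock(&cma->lock);
	bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
			bitmap_maxno, start, bitmap_count, mask,
			offset);
	if (bitmap_no >= bitmap_maxno) {
		mutex_unlock(&cma->lock);
		break;	/* no free range left in this CMA area */
	}
	/* Mark the range as ours; no other cma_alloc() can pick it. */
	bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
	mutex_unlock(&cma->lock);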

Therefore, instead of serializing every CMA allocation, this patch
speculatively tries the allocation without acquiring the mutex.  If
we race with another thread allocating on the same pageblock, we
retry on the same region after waiting for the colliding allocations
to finish.
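
Condensed from the diff below, the resulting locking pattern is:

	bool has_lock = mutex_trylock(&cma_mutex);
retry:
	ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp_mask);
	if (ret == -EAGAIN && (gfp_mask & __GFP_IO)) {
		/* Raced with a colliding isolation: wait our turn
		 * on the mutex and retry the same region.
		 */
		if (!has_lock) {
			mutex_lock(&cma_mutex);
			has_lock = true;
		}
		goto retry;
	}
	if (has_lock)
		mutex_unlock(&cma_mutex);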

The synchronization of aborted tasks is still done globally for the
CMA allocator.  Ideally, an aborted allocation would wait only for
the migration of the colliding pageblock, but there is no easy way to
track each pageblock isolation in a non-racy way without adding more
code and overhead.  Thus, I believe the global mutex is an acceptable
compromise, as long as it does not stretch the mutex semantics too
much.
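
For illustration only, per-pageblock waiting would need something
like a hashed table of wait queues (all names below are hypothetical;
none of this is part of the patch):

	/* Hypothetical, NOT implemented: one wait queue bucket per
	 * hashed pageblock, so an aborted allocation waits only for
	 * the colliding pageblock instead of the global mutex.
	 */
	#define CMA_WQ_BITS	6
	static wait_queue_head_t cma_pb_wq[1 << CMA_WQ_BITS];

	static wait_queue_head_t *cma_pb_waitqueue(unsigned long pfn)
	{
		return &cma_pb_wq[hash_long(pfn >> pageblock_order,
					    CMA_WQ_BITS)];
	}

Knowing when to wake the right bucket, without racing against other
isolations hashing to the same slot, is exactly the extra bookkeeping
deemed not worth the overhead here.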

Finally, some code paths, like the writeback case, must not blindly
sleep waiting for the mutex, because they can be a dependency of
another allocation thread that already holds it, which would
deadlock.  This exact scenario was observed by Gael Portay: a GPU
thread allocating CMA memory triggered a writeback, and a USB device
on the same ARM system tried to satisfy that writeback with a NOIO
CMA allocation [1].  For that reason, we prevent writeback threads
from waiting on the pageblock and instead let them move on to a
readily available contiguous memory region, which avoids the issue
reported in [1].
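
As a usage sketch, a writeback-context caller reaches cma_alloc()
with __GFP_IO clear, so a pageblock collision makes it move on
instead of sleeping (dev_cma and nr_pages are made-up names, and this
assumes the gfp_t-taking cma_alloc() from earlier in this series):

	/* GFP_NOIO: __GFP_IO is clear, so on -EAGAIN this thread
	 * will not block on cma_mutex and tries another free
	 * region instead.
	 */
	page = cma_alloc(dev_cma, nr_pages, align, GFP_NOIO);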

[1] https://groups.google.com/a/lists.one-eyed-alien.net/forum/#!topic/usb-storage/BXpAsg-G1us

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 mm/cma.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

Patch

diff --git a/mm/cma.c b/mm/cma.c
index 1dff74b1a8c5..ace978623b8d 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -411,6 +411,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	void *kaddr;
 	struct page *page = NULL;
 	int ret = -ENOMEM;
+	bool has_lock = false;
 
 	/* Be noisy about caller asking for unsupported flags. */
 	WARN_ON(unlikely(!(gfp_mask & __GFP_DIRECT_RECLAIM) ||
@@ -451,17 +452,39 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 		mutex_unlock(&cma->lock);
 
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
-		mutex_lock(&cma_mutex);
+
+		/* Mutual exclusion inside alloc_contig_range() is not
+		 * strictly necessary, but it makes the allocation a
+		 * little more likely to succeed, because it serializes
+		 * simultaneous allocations on the same pageblock.  We
+		 * cannot sleep on all paths, though, so try the
+		 * allocation speculatively first; if we race with
+		 * another thread on the same pageblock, fall back to
+		 * the serialized path when allowed to sleep, or try
+		 * another pageblock otherwise.
+		 */
+		has_lock = mutex_trylock(&cma_mutex);
+retry:
 		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
 					 gfp_mask);
-		mutex_unlock(&cma_mutex);
+
+		if (ret == -EAGAIN && (gfp_mask & __GFP_IO)) {
+			if (!has_lock) {
+				mutex_lock(&cma_mutex);
+				has_lock = true;
+			}
+			goto retry;
+		}
+		if (has_lock)
+			mutex_unlock(&cma_mutex);
+
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
 			break;
 		}
 
 		cma_clear_bitmap(cma, pfn, count);
-		if (ret != -EBUSY)
+		if (ret != -EBUSY && ret != -EAGAIN)
 			break;
 
 		pr_debug("%s(): memory range at %p is busy, retrying\n",