[v2,16/40] arm64: mm: Pin down ASIDs for sharing mm with devices

To enable address space sharing with the IOMMU, introduce mm_context_get()
and mm_context_put(), which pin down a context and ensure that it keeps
its ASID across a rollover.

Pinning is necessary because a device constantly needs a valid ASID,
unlike tasks that only require one when running. Without pinning, we would
need to notify the IOMMU when we're about to use a new ASID for a task,
and it would get complicated when a new task is assigned a shared ASID.
Consider the following scenario with no ASID pinned:

1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1)
2. Task t2 is scheduled on CPUx, gets ASID (1, 2)
3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1)
   We would now have to immediately generate a new ASID for t1, notify
   the IOMMU, and finally enable task tn. We are holding the lock during
   all that time, since we can't afford having another CPU trigger a
   rollover. The IOMMU issues invalidation commands that can take tens of
   milliseconds.

It gets needlessly complicated. All we wanted to do was schedule task tn,
which has no business with the IOMMU. By letting the IOMMU pin tasks when
needed, we avoid stalling the slow path, and let the pinning fail when
we're out of shareable ASIDs.

After a rollover, the allocator expects at least one ASID to be available
in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS -
1) is the maximum number of ASIDs that can be shared with the IOMMU.
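
As an illustration of that limit, here is a minimal user-space sketch of
the pin/unpin accounting. It is not the code from this patch: the toy_*
names, the struct layout and the NR_ASIDS/NR_CPUS values are hypothetical,
and the locking that the real allocator needs is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy sizes; the real values depend on hardware and config. */
#define NR_ASIDS 16
#define NR_CPUS  4

/* One reserved ASID per CPU, plus one that must stay free after a rollover. */
#define MAX_PINNED_ASIDS (NR_ASIDS - NR_CPUS - 1)

struct toy_mm {
	unsigned int asid;   /* hardware ASID currently assigned */
	unsigned int pinned; /* number of pin references held on this context */
};

/* Number of distinct contexts that currently hold a pinned ASID. */
static unsigned int nr_pinned_asids;

/* Pin the context. Returns false when no shareable ASID is left. */
static bool toy_context_get(struct toy_mm *mm)
{
	if (mm->pinned == 0) {
		if (nr_pinned_asids >= MAX_PINNED_ASIDS)
			return false;
		nr_pinned_asids++;
	}
	mm->pinned++;
	return true;
}

/* Drop one pin reference; the ASID becomes recyclable at zero. */
static void toy_context_put(struct toy_mm *mm)
{
	assert(mm->pinned > 0);
	if (--mm->pinned == 0)
		nr_pinned_asids--;
}
```

With these toy values, 11 contexts can be pinned at once. Pinning a
twelfth fails until one of the others is fully unpinned, while re-pinning
an already pinned context always succeeds.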

Cc: catalin.marinas@arm.com
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>

---
v1->v2: TLC found a bug in my code :) It was a bit silly.

When updating an mm's context after a rollover, we check if the ASID is
pinned. If it is, then we can reuse it. But even then we do need to update
the generation in the reserved_asids map. v1 didn't do that, so what
happened was:

1. A task t1 is running with ASID (gen=1, asid=1) on CPU1 all along.
2. CPU2 triggers a rollover, but since t1 is running it keeps its ASID.
3. ASID 1 is pinned. t1 is scheduled on CPU2. The ASID allocator sees the
   ASID pinned, and skips the update of reserved_asids. t1 now has ASID
   (2, 1)
4. ASID 1 is unpinned. Another rollover. t1 is scheduled on CPU2. Since it
   is still running on CPU1, the allocator should keep reusing its ASID,
   but as it looks for ASID (2, 1) in reserved_asids, it finds (1, 1), and
   concludes that the task needs a new ASID. Woops.

The fix is simple: check and update reserved_asids *before* checking for
pinned ASIDs. The bug was found this afternoon (after a 4h run), and there
probably will be more. I restarted the validation but it might take a
while or never finish -- I had to stop the penultimate run after 2 weeks,
the parameters were too large. The last successful run was with only two
generations and took 4:30 hours (on 4 Xeon E5-2660v4). This bug was found
with 3 generations and a single pinned task.
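
To make the ordering concrete, here is a minimal user-space sketch of the
reuse decision. It is a toy model and not the code in context.c: the
toy_* names, the TOY_ASID() packing of (gen, asid) and the two-CPU array
are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical packing of (generation, asid) into a single value. */
#define TOY_ASID(gen, asid) (((gen) << 8) | (asid))

/* Per-CPU record of the ASID that was live when the last rollover hit. */
static unsigned int reserved_asids[NR_CPUS];

/* Move every reserved entry equal to 'old' to 'new'; true if any matched. */
static bool toy_check_update_reserved(unsigned int old, unsigned int new)
{
	bool hit = false;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (reserved_asids[cpu] == old) {
			hit = true;
			reserved_asids[cpu] = new;
		}
	}
	return hit;
}

/* Can the mm keep its hardware ASID under the new generation? */
static bool toy_can_reuse(unsigned int old, unsigned int new, bool pinned)
{
	/* Only the generation changes, the ASID bits stay the same. */
	assert((old & 0xff) == (new & 0xff));

	/* Must run first and unconditionally, even for a pinned ASID. */
	if (toy_check_update_reserved(old, new))
		return true;

	return pinned;
}
```

Replaying the steps above against this sketch: with reserved_asids[0]
holding (1, 1), the pinned reuse moves the entry to (2, 1), so the later
lookup for (2, 1) after the unpin and second rollover still matches.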

You can find the asidalloc changes for kernel-tla here, temporarily:
http://jpbrucker.net/git/kernel-tla/commit/?id=b70361
http://jpbrucker.net/git/kernel-tla/commit/?id=f5413d
---
 arch/arm64/include/asm/mmu.h         |  1 +
 arch/arm64/include/asm/mmu_context.h | 11 +++-
 arch/arm64/mm/context.c              | 92 ++++++++++++++++++++++++++--
 3 files changed, 99 insertions(+), 5 deletions(-)
