Message ID | Yq3r2qO//NzJGVlO@mx3210.localdomain (mailing list archive) |
---|---|
State | Accepted, archived |
Headers | show |
Series | parisc: Fix flush_anon_page on PA8800/PA8900 | expand |
Am Samstag, 18. Juni 2022, 17:14:34 CEST schrieb John David Anglin: > Anonymous pages are allocated with the shared mappings colouring, > SHM_COLOUR. Since the alias boundary on machines with PA8800 and > PA8900 processors is unknown, flush_user_cache_page() might not > flush all mappings of a shared anonymous page. Flushing the whole > data cache flushes all mappings. > > This won't fix all coherency issues with shared mappings but it > seems to work well in practice. I haven't seen any random memory > faults in almost a month on a rp3440 running as a debian buildd > machine. > > There is a small preformance hit. Is there a limit we can limit this to the given CPU types? And given that this seems to be a best effort workaround I would suggest adding a comment in the code as explaining why this happens, otherwise someone looking at the code in 3 years may not get the point of it and a quick test will just show "oh, it works without that". Eike
On 2022-06-18 1:08 p.m., Rolf Eike Beer wrote: > Am Samstag, 18. Juni 2022, 17:14:34 CEST schrieb John David Anglin: >> Anonymous pages are allocated with the shared mappings colouring, >> SHM_COLOUR. Since the alias boundary on machines with PA8800 and >> PA8900 processors is unknown, flush_user_cache_page() might not >> flush all mappings of a shared anonymous page. Flushing the whole >> data cache flushes all mappings. >> >> This won't fix all coherency issues with shared mappings but it >> seems to work well in practice. I haven't seen any random memory >> faults in almost a month on a rp3440 running as a debian buildd >> machine. >> >> There is a small preformance hit. > Is there a limit we can limit this to the given CPU types? And given that this It is limited to PA8800 and PA8900 CPUs by the parisc_requires_coherency() check. There are already a bunch of similar checks in cache.c that have comments (e.g., range an mm flush routines). > seems to be a best effort workaround I would suggest adding a comment in the > code as explaining why this happens, otherwise someone looking at the code in > 3 years may not get the point of it and a quick test will just show "oh, it > works without that". The change is not a best effort workaround. Flushing the whole data cache always flushes all aliases to a page. It could be used for all anonymous page flushes but it is slow. Shared anonymous mappings only work when the mappings are equivalent or meet the requirements for multi reader, single writer. The problem is we don't in general know what mappings are equivalent on PA8800/PA8900 processors. There is a comment about the alias boundary of machines with PA8800/PA8900 processors in arch/parisc/include/asm/fixmap.h. This is why we can't flush all aliases of a shared page with single flush or use tmpalias flushes on these machines. Sometimes they work and sometimes they don't. Dave
On 2022-06-18 9:18 p.m., John David Anglin wrote: > Shared anonymous mappings only work when the mappings are equivalent or meet the > requirements for multi reader, single writer. The problem is we don't in general know what > mappings are equivalent on PA8800/PA8900 processors. This change introduced the possibility that shared mappings might not be equivalent on parisc: https://marc.info/?l=git-commits-head&m=139776619924910&w=2 The shmat() man page describes SHMLBA as follows: > SHMLBA Segment low boundary address multiple. When explicitly specify- > ing an attach address in a call to shmat(), the caller should > ensure that the address is a multiple of this value. This is > necessary on some architectures, in order either to ensure good > CPU cache performance or to ensure that different attaches of > the same segment have consistent views within the CPU cache. > SHMLBA is normally some multiple of the system page size. (On > many Linux architectures, SHMLBA is the same as the system page > size.) Further, the manpage states: > o If shmaddr is NULL, the system chooses a suitable (unused) page- > aligned address to attach the segment. > > o If shmaddr isn't NULL and SHM_RND is specified in shmflg, the attach > occurs at the address equal to shmaddr rounded down to the nearest > multiple of SHMLBA. > > o Otherwise, shmaddr must be a page-aligned address at which the attach > occurs. Thus, shared mappings may not be equivalent and we need to flush all aliases. Dave
On Sun, 2022-06-19 at 09:32 -0400, John David Anglin wrote: > On 2022-06-18 9:18 p.m., John David Anglin wrote: > > Shared anonymous mappings only work when the mappings are equivalen > > t or meet the > > requirements for multi reader, single writer. The problem is we do > > n't in general know what mappings are equivalent on PA8800/PA8900 p > > rocessors. > This change introduced the possibility that shared mappings might not > be equivalent on parisc: > https://marc.info/?l=git-commits-head&m=139776619924910&w=2 > > The shmat() man page describes SHMLBA as follows: I wouldn't read what the manpage says, I'd read what the code does. SHMLBA at page size allows a mapping to start anywhere as opposed to only on 4M boundaries. SHM_COLOUR at 4M should ensure that any additional mappings of the area are all aliased correctly. However, the theory of that seems to rest on the ability of GET_LAST_MMAP/SET_LAST_MMAP to stash the mmaped address, which appears to be broken. So I'd say rather that over flushing we should fix that, either by stashing properly or falling back to 4MB SHMLBA. James
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c index 0fd04073d4b6..a20c1c47b780 100644 --- a/arch/parisc/kernel/cache.c +++ b/arch/parisc/kernel/cache.c @@ -722,7 +722,10 @@ void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned lon return; if (parisc_requires_coherency()) { - flush_user_cache_page(vma, vmaddr); + if (vma->vm_flags & VM_SHARED) + flush_data_cache(); + else + flush_user_cache_page(vma, vmaddr); return; }
Anonymous pages are allocated with the shared mappings colouring, SHM_COLOUR. Since the alias boundary on machines with PA8800 and PA8900 processors is unknown, flush_user_cache_page() might not flush all mappings of a shared anonymous page. Flushing the whole data cache flushes all mappings. This won't fix all coherency issues with shared mappings but it seems to work well in practice. I haven't seen any random memory faults in almost a month on a rp3440 running as a debian buildd machine. There is a small preformance hit. Signed-off-by: John David Anglin <dave.anglin@bell.net> ---