parisc: fix mmap(MAP_FIXED|MAP_SHARED) to already mmapped address

Message ID 533DB961.9010607@gmx.de (mailing list archive)
State Not Applicable
Commit Message

Helge Deller April 3, 2014, 7:41 p.m. UTC
On 04/02/2014 11:41 PM, John David Anglin wrote:
> On 4/2/2014 5:09 PM, Helge Deller wrote:
>> On 04/02/2014 09:09 PM, Carlos O'Donell wrote:
>>> On Tue, Apr 1, 2014 at 2:49 PM, Helge Deller <deller@gmx.de> wrote:
>>>> Yes.
>>>> But it's not a kernel bug. Kernel 3.14 and previous stable releases are OK.
>>>>
>>>> I proposed a glibc change in my previous mail (http://www.spinics.net/lists/linux-parisc/msg05384.html).
>>>> Debian bug report with patch is here:
>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=741243
>>>>
>>>> And this is what I proposed:
>>>>
>>>> A trivial FIX/workaround would be to change libc-mmap.h like this:
>>>> #ifdef __hppa__
>>>> #define MAP_FIXED_ALIGNMENT 4096
>>>> #else
>>>> #define MAP_FIXED_ALIGNMENT SHMLBA
>>>> #endif
>>>>
>>>> That works because the new aligned address is then the same as the original
>>>> (the mmap call returns 4k aligned addresses, so it stays unchanged), but I'm not sure
>>>> if such a patch would be acceptable.
>>>> Do you have another idea/proposal?
>>> The responsibility for fixing this falls to me, but I've been busy.
>> No problem.
>>
>>> If someone wants to propose a patch for glibc please email
>>> libc-alpha@sourceware.org, TO me, and I'll review and commit the patch
>>> granted that you show you've done the appropriate testing.
>>>
>>> Otherwise I'll get to this at some point in the next couple of weeks :-(
>> Hi Carlos,
>>
>> I'm not really sure if my patch is the best way to go. Technically it's correct
>> and it's tested since all our debian buildservers currently run with this patch.
>> But all other options would probably involve more code changes.
>>
>> So, I think I'm happy if you can look at it at some point when you find time.
>> Your input would be very valuable here.

> I'm wondering if kernel value for SHMLBA shouldn't change to PAGE_SIZE to better
> reflect that attach addresses are page aligned.  The color alignment for shared maps
> seems a separate issue which maybe userspace doesn't need to worry about.

I think this is a very interesting idea and it should be pretty simple!

The attached patch for eglibc should resolve it.
And the attached patch for kernel isn't necessary, but makes it clear that the colouring is important.
 
I tested the kernel patch, and it seems to work without problems.

I'm not sure if this might introduce userspace compile problems though (although unlikely).

Helge
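
To make the quoted MAP_FIXED_ALIGNMENT workaround concrete, here is a minimal
sketch. It assumes the caller rounds a candidate address up to a multiple of
MAP_FIXED_ALIGNMENT before passing it to mmap with MAP_FIXED. The helper name
align_up and the sample address are hypothetical and not taken from glibc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not glibc's actual code: round addr up to the
   next multiple of alignment. The alignment must be a power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}
```

With an alignment of 4096, a page-aligned address such as 0x40001000 comes
back unchanged. With 0x00400000 it moves up to 0x40400000, which no longer
matches the address mmap originally returned.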

Comments

John David Anglin April 3, 2014, 8:03 p.m. UTC | #1
On 4/3/2014 3:41 PM, Helge Deller wrote:
> On 04/02/2014 11:41 PM, John David Anglin wrote:
>> On 4/2/2014 5:09 PM, Helge Deller wrote:
>>> On 04/02/2014 09:09 PM, Carlos O'Donell wrote:
>>>> On Tue, Apr 1, 2014 at 2:49 PM, Helge Deller <deller@gmx.de> wrote:
>>>>> Yes.
>>>>> But it's not a kernel bug. Kernel 3.14 and previous stable releases are OK.
>>>>>
>>>>> I proposed a glibc change in my previous mail (http://www.spinics.net/lists/linux-parisc/msg05384.html).
>>>>> Debian bug report with patch is here:
>>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=741243
>>>>>
>>>>> And this is what I proposed:
>>>>>
>>>>> A trivial FIX/workaround would be to change libc-mmap.h like this:
>>>>> #ifdef __hppa__
>>>>> #define MAP_FIXED_ALIGNMENT 4096
>>>>> #else
>>>>> #define MAP_FIXED_ALIGNMENT SHMLBA
>>>>> #endif
>>>>>
>>>>> That works because the new aligned address is then the same as the original
>>>>> (the mmap call returns 4k aligned addresses, so it stays unchanged), but I'm not sure
>>>>> if such a patch would be acceptable.
>>>>> Do you have another idea/proposal?
>>>> The responsibility for fixing this falls to me, but I've been busy.
>>> No problem.
>>>
>>>> If someone wants to propose a patch for glibc please email
>>>> libc-alpha@sourceware.org, TO me, and I'll review and commit the patch
>>>> granted that you show you've done the appropriate testing.
>>>>
>>>> Otherwise I'll get to this at some point in the next couple of weeks :-(
>>> Hi Carlos,
>>>
>>> I'm not really sure if my patch is the best way to go. Technically it's correct
>>> and it's tested since all our debian buildservers currently run with this patch.
>>> But all other options would probably involve more code changes.
>>>
>>> So, I think I'm happy if you can look at it at some point when you find time.
>>> Your input would be very valuable here.
>> I'm wondering if kernel value for SHMLBA shouldn't change to PAGE_SIZE to better
>> reflect that attach addresses are page aligned.  The color alignment for shared maps
>> seems a separate issue which maybe userspace doesn't need to worry about.
> I think this is a very interesting idea and it should be pretty simple!
>
> The attached patch for eglibc should resolve it.
> And the attached patch for kernel isn't necessary, but makes it clear that the colouring is important.
>   
> I tested the kernel patch, and it seems to work without problems.
>
> I'm not sure if this might introduce userspace compile problems though (although unlikely).
Very nice!  In our current Debian eglibc build, SHMLBA is set to 4096.
So, it should work just fine with the kernel patch.  The buildds have been
running for some time and I'm not aware of any mmap issues aside from the
pthread_create ENOMEM errors.

Do you think this helps the allocation of small maps (perl locale test bug)?

Dave
John David Anglin April 3, 2014, 8:12 p.m. UTC | #2
On 4/3/2014 3:41 PM, Helge Deller wrote:
> And the attached patch for kernel isn't necessary, but makes it clear that the colouring is important.
Regarding the kernel patch, I see you have used the British/Canadian spelling
of "colour" :-)

I think changing SHMLBA does affect things in other parts of the kernel.
That's why I was uncertain whether it would work.

Dave
Helge Deller April 3, 2014, 8:26 p.m. UTC | #3
On 04/03/2014 10:03 PM, John David Anglin wrote:
> On 4/3/2014 3:41 PM, Helge Deller wrote:
>> On 04/02/2014 11:41 PM, John David Anglin wrote:
>>> On 4/2/2014 5:09 PM, Helge Deller wrote:
>>>> On 04/02/2014 09:09 PM, Carlos O'Donell wrote:
>>>>> On Tue, Apr 1, 2014 at 2:49 PM, Helge Deller <deller@gmx.de> wrote:
>>>>>> Yes.
>>>>>> But it's not a kernel bug. Kernel 3.14 and previous stable releases are OK.
>>>>>>
>>>>>> I proposed a glibc change in my previous mail (http://www.spinics.net/lists/linux-parisc/msg05384.html).
>>>>>> Debian bug report with patch is here:
>>>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=741243
>>>>>>
>>>>>> And this is what I proposed:
>>>>>>
>>>>>> A trivial FIX/workaround would be to change libc-mmap.h like this:
>>>>>> #ifdef __hppa__
>>>>>> #define MAP_FIXED_ALIGNMENT 4096
>>>>>> #else
>>>>>> #define MAP_FIXED_ALIGNMENT SHMLBA
>>>>>> #endif
>>>>>>
>>>>>> That works because the new aligned address is then the same as the original
>>>>>> (the mmap call returns 4k aligned addresses, so it stays unchanged), but I'm not sure
>>>>>> if such a patch would be acceptable.
>>>>>> Do you have another idea/proposal?
>>>>> The responsibility for fixing this falls to me, but I've been busy.
>>>> No problem.
>>>>
>>>>> If someone wants to propose a patch for glibc please email
>>>>> libc-alpha@sourceware.org, TO me, and I'll review and commit the patch
>>>>> granted that you show you've done the appropriate testing.
>>>>>
>>>>> Otherwise I'll get to this at some point in the next couple of weeks :-(
>>>> Hi Carlos,
>>>>
>>>> I'm not really sure if my patch is the best way to go. Technically it's correct
>>>> and it's tested since all our debian buildservers currently run with this patch.
>>>> But all other options would probably involve more code changes.
>>>>
>>>> So, I think I'm happy if you can look at it at some point when you find time.
>>>> Your input would be very valuable here.
>>> I'm wondering if kernel value for SHMLBA shouldn't change to PAGE_SIZE to better
>>> reflect that attach addresses are page aligned.  The color alignment for shared maps
>>> seems a separate issue which maybe userspace doesn't need to worry about.
>> I think this is a very interesting idea and it should be pretty simple!
>>
>> The attached patch for eglibc should resolve it.
>> And the attached patch for kernel isn't necessary, but makes it clear that the colouring is important.
>>
>> I tested the kernel patch, and it seems to work without problems.
>>
>> I'm not sure if this might introduce userspace compile problems though (although unlikely).

> Very nice!  In our current Debian eglibc build, SHMLBA is set to 4096.

*No*, it's not!
In current eglibc it's set to 0x00400000
That's what my eglibc-patch changes...
I'm currently building an eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).

> So, it should work just fine with the kernel patch. 

No, the kernel patch isn't necessary. Only the glibc patch.

> The buildds have been running for some time
> and I'm not aware of any mmap issues aside from the pthread_create ENOMEM errors.

True - we need to find the cause.
I just suspected the arch_get_unmapped_area() kernel functions, but they seem correct.
 
> Do you think this helps the allocation of small maps (perl locale test bug)?

No. Only my other (unfinished) patches will resolve this.

Helge

--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Helge Deller April 3, 2014, 8:27 p.m. UTC | #4
On 04/03/2014 10:12 PM, John David Anglin wrote:
> On 4/3/2014 3:41 PM, Helge Deller wrote:
>> And the attached patch for kernel isn't necessary, but makes it clear that the colouring is important.
> Regarding the kernel patch, I see you have used the British/Canadian spelling
> of "colour" :-)

Yeah - seems to be commonly used in similar parts of the kernel for other arches.
If you have a better idea/naming, please let me know.
 
> I think changing SHMLBA does affect things in other parts of the kernel.
> That's why I was uncertain whether it would work.

I scanned it. It's used very little, and shouldn't affect us.

Helge

Jeroen Roovers April 4, 2014, 3:45 p.m. UTC | #5
On Thu, 3 Apr 2014 16:12:03 -0400
John David Anglin <dave.anglin@bell.net> wrote:

> On 4/3/2014 3:41 PM, Helge Deller wrote:
> Regarding the kernel patch, I see you have used the British/Canadian 
> spelling of "colour" :-)

Also:

+#define SHM_COLOUR 0x00400000	/* shared mappings coulouring */

coulouring => colouring


     jer
Carlos O'Donell Feb. 20, 2015, 9:36 p.m. UTC | #6
On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
> In current eglibc it's set to 0x00400000
> That's what my eglibc-patch changes...
> I'm currently building an eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).

Anyone object to me fixing this upstream by making SHMLBA match the kernel?

I plan to use a fixed value of 4096, since I never expect hppa
userspace to have to care (even if the kernel uses superpages).

Please correct me if I'm wrong.

Cheers,
Carlos.
John David Anglin Feb. 21, 2015, 8:31 p.m. UTC | #7
On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:

> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
>> In current eglibc it's set to 0x00400000
>> That's what my eglibc-patch changes...
>> I'm currently building an eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
> 
> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
> 
> I plan to use a fixed value of 4096, since I never expect hppa
> userspace to have to care (even if the kernel uses superpages).

We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
Is there a performance advantage in using 4096?

> 
> Please correct me if I'm wrong.


At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
they relate to SHMLBA.

Dave
--
John David Anglin	dave.anglin@bell.net



John David Anglin Feb. 21, 2015, 8:40 p.m. UTC | #8
On 2015-02-21, at 3:31 PM, John David Anglin wrote:

> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
> 
>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
>>> In current eglibc it's set to 0x00400000
>>> That's what my eglibc-patch changes...
>>> I'm currently building an eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
>> 
>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
>> 
>> I plan to use a fixed value of 4096, since I never expect hppa
>> userspace to have to care (even if the kernel uses superpages).
> 
> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
> Is there a performance advantage in using 4096?
> 
>> 
>> Please correct me if I'm wrong.
> 
> 
> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
> they relate to SHMLBA.


Actually, the number was 4 Mb (bit).

Dave
--
John David Anglin	dave.anglin@bell.net



James Bottomley Feb. 21, 2015, 11:09 p.m. UTC | #9
On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
> 
> > On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
> > 
> >> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
> >>> In current eglibc it's set to 0x00400000
> >>> That's what my eglibc-patch changes...
> >>> I'm currently building an eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
> >> 
> >> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
> >> 
> >> I plan to use a fixed value of 4096, since I never expect hppa
> >> userspace to have to care (even if the kernel uses superpages).
> > 
> > We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
> > Is there a performance advantage in using 4096?
> > 
> >> 
> >> Please correct me if I'm wrong.
> > 
> > 
> > > At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
> > > allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
> > > they relate to SHMLBA.
> 
> 
> Actually, the number was 4 Mb (bit).

No, it was 4MB.  That's the cache equivalency stride on PA processors
because we have a VIPT cache.  The architectural requirement according
to the dreaded appendix F is 16MB but we were assured by the PA
architects that it was 4 because they never planned producing processors
that would require 16.  The actual meaning is it's the number of bits of
the virtual address that are significant in the virtual index.

The point of SHMLBA is that if the same physical page is mapped into two
different virtual addresses but the two addresses are equal, modulo
SHMLBA, then the L1 cache sees the equivalency and you can't get
inequivalent cache aliases for the page (two writes to the two different
addresses producing two separately dirty cache lines which can never
resolve).  This means that the virtual addresses of all shared mappings
have to be equal modulo SHMLBA for the caches not to alias.

James
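
The equivalence rule described above can be sketched as a simple check,
assuming a 4MB stride (0x00400000, the value mentioned earlier in this
thread). The names EQUIV_STRIDE and same_cache_colour are hypothetical and
do not come from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stride: 4MB, as stated in the message above. */
#define EQUIV_STRIDE ((uintptr_t)0x00400000u)

/* Hypothetical helper: true if two virtual addresses are equal modulo
   stride, i.e. they would index the cache identically.
   The stride must be a power of two. */
static bool same_cache_colour(uintptr_t a, uintptr_t b, uintptr_t stride)
{
    assert(stride != 0 && (stride & (stride - 1)) == 0);
    return ((a ^ b) & (stride - 1)) == 0;
}
```

Two mappings of the same physical page at 0x40001000 and 0x40401000 pass
this check with a 4MB stride, while 0x40001000 and 0x40002000 do not, even
though both pairs are page aligned.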


Helge Deller Feb. 21, 2015, 11:26 p.m. UTC | #10
On 22.02.2015 00:09, James Bottomley wrote:
> On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
>> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
>>
>>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
>>>
>>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
>>>>> In current eglibc it's set to 0x00400000
>>>>> That's what my eglibc-patch changes...
>>>>> I'm currently building an eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
>>>>
>>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
>>>>
>>>> I plan to use a fixed value of 4096, since I never expect hppa
>>>> userspace to have to care (even if the kernel uses superpages).
>>>
>>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
>>> Is there a performance advantage in using 4096?
>>>
>>>>
>>>> Please correct me if I'm wrong.
>>>
>>>
>>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
>>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
>>> they relate to SHMLBA.
>>
>>
>> Actually, the number was 4 Mb (bit).
>
> No, it was 4MB.  That's the cache equivalency stride on PA processors
> because we have a VIPT cache.  The architectural requirement according
> to the dreaded appendix F is 16MB but we were assured by the PA
> architects that it was 4 because they never planned producing processors
> that would require 16.  The actual meaning is it's the number of bits of
> the virtual address that are significant in the virtual index.
>

Your following statement:

> The point of SHMLBA is that if the same physical page is mapped into two
> different virtual addresses but the two addresses are equal, modulo
> SHMLBA, then the L1 cache sees the equivalency and you can't get
> inequivalent cache aliases for the page (two writes to the two different
> addresses producing two separately dirty cache lines which can never
> resolve).  This means that the virtual addresses of all shared mappings
> have to be equal modulo SHMLBA for the caches not to alias.

With this you define SHMLBA to be the representative number which defines
what the current cache equivalency stride of the kernel is, *and* which then can
be used by userspace. I think this is a misinterpretation of SHMLBA (or at
least a parisc-specific interpretation of SHMLBA), which is not like how it
is used on other architectures with similar limitations.
Userspace should not know the kernel/architecture specifics. Instead they
should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
kernel/glibc will return a corrected mapping address (modulo 4MB).
I think this is important, since most userspace programs usually try to mmap at
a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
This has been the issue with localedef in glibc (a strange coding which tries
to be platform-specific with mmap-calculation). Because of that in the end
it turned out to be best for parisc to have SHMLBA defined as 4KB (and not 4MB).

So, your statement above is correct, I would just not use "SHMLBA" in this term,
but maybe "KERNEL_SHMLBA" instead.

Helge
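
A rough illustration of the address space argument above. For the sake of
arithmetic it assumes a 1GB region available for mappings; that size is an
assumption, not a figure from this thread. The helper aligned_slots is
hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: how many distinct start addresses that are a
   multiple of alignment fit into a region of span bytes. */
static unsigned long long aligned_slots(unsigned long long span,
                                        unsigned long long alignment)
{
    assert(alignment != 0);
    return span / alignment;
}
```

For a 1GB span this gives 256 candidate addresses at a 4MB alignment and
262144 at a 4KB alignment, so a program that places every small map on a
multiple of a 4MB SHMLBA runs out of candidates far sooner.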
James Bottomley Feb. 21, 2015, 11:57 p.m. UTC | #11
On Sun, 2015-02-22 at 00:26 +0100, Helge Deller wrote:
> On 22.02.2015 00:09, James Bottomley wrote:
> > On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
> >> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
> >>
> >>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
> >>>
> >>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
> >>>>> In current eglibc it's set to 0x00400000
> >>>>> That's what my eglibc-patch changes...
> >>>>> I'm currently building an eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
> >>>>
> >>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
> >>>>
> >>>> I plan to use a fixed value of 4096, since I never expect hppa
> >>>> userspace to have to care (even if the kernel uses superpages).
> >>>
> >>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
> >>> Is there a performance advantage in using 4096?
> >>>
> >>>>
> >>>> Please correct me if I'm wrong.
> >>>
> >>>
> >>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
> >>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
> >>> they relate to SHMLBA.
> >>
> >>
> >> Actually, the number was 4 Mb (bit).
> >
> > No, it was 4MB.  That's the cache equivalency stride on PA processors
> > because we have a VIPT cache.  The architectural requirement according
> > to the dreaded appendix F is 16MB but we were assured by the PA
> > architects that it was 4 because they never planned producing processors
> > that would require 16.  The actual meaning is it's the number of bits of
> > the virtual address that are significant in the virtual index.
> >
> 
> Your following statement:
> 
> > The point of SHMLBA is that if the same physical page is mapped into two
> > different virtual addresses but the two addresses are equal, modulo
> > SHMLBA, then the L1 cache sees the equivalency and you can't get
> > inequivalent cache aliases for the page (two writes to the two different
> > addresses producing two separately dirty cache lines which can never
> > resolve).  This means that the virtual addresses of all shared mappings
> > have to be equal modulo SHMLBA for the caches not to alias.
> 
> With this you define SHMLBA to be the representative number which defines
> what the current cache equivalency stride of the kernel is, *and* which then can
> be used by userspace. I think this is a misinterpretation of SHMLBA (or at
> least a parisc-specific interpretation of SHMLBA), which is not like how it
> is used on other architectures with similar limitations.
> Userspace should not know the kernel/architecture specifics. Instead they
> should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
> kernel/glibc will return a corrected mapping address (modulo 4MB).
> I think this is important, since most userspace programs usually try to mmap at
> a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
> This has been the issue with localedef in glibc (a strange coding which tries
> to be platform-specific with mmap-calculation). Because of that in the end
> it turned out to be best for parisc to have SHMLBA defined as 4KB (and not 4MB).
> 
> So, your statement above is correct, I would just not use "SHMLBA" in this term,
> but maybe "KERNEL_SHMLBA" instead.

Um, no, SHMLBA comes from the SYS-V IPC primitives.  They were stupid
enough to allow the user to pick the address of the region of shared
memory, so the user had to know these architectural details and SHMLBA
encodes them (man shmat will give you the gory details).

For mmap, we can mostly do the right thing in the kernel, except for
MAP_FIXED, where the user has to know what they're doing again.

For the cases the user thinks they know best, we can't avoid giving out
the knowledge somehow, because inequivalent aliases in writeable
mappings will HPMC a system.  We could be more relaxed about
inequivalent aliases in read only mappings (say shared libraries), but
the consequence of that is an explosion in the use of cache space, so we
would want some libraries (like glibc) with many shared copies to obey
SHMLBA.

James
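
For reference, shmat(2) documents that with SHM_RND a non-NULL attach
address is rounded down to the nearest multiple of SHMLBA. Below is a
minimal sketch of that rounding. The helper name shm_rnd_address is
hypothetical, and the SHMLBA value is passed in as a parameter so that both
4096 and 0x00400000 can be tried.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the documented SHM_RND behaviour:
   round addr down to the nearest multiple of shmlba. */
static uintptr_t shm_rnd_address(uintptr_t addr, uintptr_t shmlba)
{
    assert(shmlba != 0);
    return addr - (addr % shmlba);
}
```

With shmlba set to 0x00400000, a requested address of 0x40403000 is rounded
down to 0x40400000. With 4096 the same address is left as it is.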


John David Anglin Feb. 22, 2015, 4:45 p.m. UTC | #12
On 2015-02-21, at 6:57 PM, James Bottomley wrote:

> On Sun, 2015-02-22 at 00:26 +0100, Helge Deller wrote:
>> On 22.02.2015 00:09, James Bottomley wrote:
>>> On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
>>>> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
>>>> 
>>>>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
>>>>> 
>>>>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
>>>>>>> In current eglibc it's set to 0x00400000
>>>>>>> That's what my eglibc-patch changes...
>>>>>>> I'm currently building an eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
>>>>>> 
>>>>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
>>>>>> 
>>>>>> I plan to use a fixed value of 4096, since I never expect hppa
>>>>>> userspace to have to care (even if the kernel uses superpages).
>>>>> 
>>>>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
>>>>> Is there a performance advantage in using 4096?
>>>>> 
>>>>>> 
>>>>>> Please correct me if I'm wrong.
>>>>> 
>>>>> 
>>>>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
>>>>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
>>>>> they relate to SHMLBA.
>>>> 
>>>> 
>>>> Actually, the number was 4 Mb (bit).
>>> 
>>> No, it was 4MB.  That's the cache equivalency stride on PA processors
>>> because we have a VIPT cache.  The architectural requirement according
>>> to the dreaded appendix F is 16MB but we were assured by the PA
>>> architects that it was 4 because they never planned producing processors
>>> that would require 16.  The actual meaning is it's the number of bits of
>>> the virtual address that are significant in the virtual index.
>>> 
>> 
>> Your following statement:
>> 
>>> The point of SHMLBA is that if the same physical page is mapped into two
>>> different virtual addresses but the two addresses are equal, modulo
>>> SHMLBA, then the L1 cache sees the equivalency and you can't get
>>> inequivalent cache aliases for the page (two writes to the two different
>>> addresses producing two separately dirty cache lines which can never
>>> resolve).  This means that the virtual addresses of all shared mappings
>>> have to be equal modulo SHMLBA for the caches not to alias.
>> 
>> With this you define SHMLBA to be the representative number which defines
>> what the current cache equivalency stride of the kernel is, *and* which then can
>> be used by userspace. I think this is a misinterpretation of SHMLBA (or at
>> least a parisc-specific interpretation of SHMLBA), which is not like how it
>> is used on other architectures with similar limitations.
>> Userspace should not know the kernel/architecture specifics. Instead they
>> should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
>> kernel/glibc will return a corrected mapping address (modulo 4MB).
>> I think this is important, since most userspace programs usually try to mmap at
>> a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
>> This has been the issue with localedef in glibc (a strange coding which tries
>> to be platform-specific with mmap-calculation). Because of that in the end
>> it turned out to be best for parisc to have SHMLBA defined as 4KB (and not 4MB).
>> 
>> So, your statement above is correct, I would just not use "SHMLBA" in this term,
>> but maybe "KERNEL_SHMLBA" instead.
> 
> Um, no, SHMLBA comes from the SYS-V IPC primitives.  They were stupid
> enough to allow the user to pick the address of the region of shared
> memory, so the user had to know these architectural details and SHMLBA
> encodes them (man shmat will give you the gory details).
> 
> For mmap, we can mostly do the right thing in the kernel, except for
> MAP_FIXED, where the user has to know what they're doing again.
> 
> For the cases the user thinks they know best, we can't avoid giving out
> the knowledge somehow, because inequivalent aliases in writeable
> mappings will HPMC a system.  We could be more relaxed about
> inequivalent aliases in read only mappings (say shared libraries), but
> the consequence of that is an explosion in the use of cache space, so we
> would want some libraries (like glibc) with many shared copies to obey
> SHMLBA.


I agree with Helge.  We run out of memory too quickly with 4 MB.  This resulted in various
userspace applications failing as mentioned by Helge.

MAP_FIXED can fail if the address is a problem.  I believe we check for that.

If you look at the pch implementation in gcc, you will see that the MAP_FIXED problem can be
worked around, and this problem is not specific to parisc.  The pch data has to be mapped at the same
address at which it was originally created, as there is no way to relocate the data.

SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.

The following is in <asm-generic/shmparam.h>:
#define SHMLBA PAGE_SIZE	 /* attach addr a multiple of this */

Shared mappings are handled with 
asm/shmparam.h:#define SHM_COLOUR 0x00400000	/* shared mappings colouring */

Dave
--
John David Anglin	dave.anglin@bell.net
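
One common way to avoid MAP_FIXED when data must land at a known address is
to pass the address as a hint and check what comes back. The sketch below
shows that general pattern only. It is not claimed to be what gcc's pch
code does, and the helper name map_at_hint_or_fail is hypothetical. It uses
an anonymous private mapping to stay self-contained.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: try to map len bytes at exactly `wanted`
   without MAP_FIXED. Returns wanted on success, NULL otherwise. */
static void *map_at_hint_or_fail(void *wanted, size_t len)
{
    assert(wanted != NULL && len > 0);
    void *got = mmap(wanted, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED)
        return NULL;
    if (got != wanted) {
        /* The kernel chose another address; undo and report failure. */
        munmap(got, len);
        return NULL;
    }
    return got;
}
```

Because no existing mapping is ever replaced, the caller can fall back to
another strategy when the hint is not honoured instead of silently
clobbering whatever was mapped at that address.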



James Bottomley Feb. 22, 2015, 5:17 p.m. UTC | #13
On Sun, 2015-02-22 at 11:45 -0500, John David Anglin wrote:
> On 2015-02-21, at 6:57 PM, James Bottomley wrote:
> 
> > On Sun, 2015-02-22 at 00:26 +0100, Helge Deller wrote:
> >> On 22.02.2015 00:09, James Bottomley wrote:
> >>> On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
> >>>> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
> >>>> 
> >>>>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
> >>>>> 
> >>>>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
> >>>>>>> In current eglibc it's set to 0x00400000
> >>>>>>> That's what my eglibc-patch changes...
> >>>>>>> I'm currently building an eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
> >>>>>> 
> >>>>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
> >>>>>> 
> >>>>>> I plan to use a fixed value of 4096, since I never expect hppa
> >>>>>> userspace to have to care (even if the kernel uses superpages).
> >>>>> 
> >>>>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
> >>>>> Is there a performance advantage in using 4096?
> >>>>> 
> >>>>>> 
> >>>>>> Please correct me if I'm wrong.
> >>>>> 
> >>>>> 
> >>>>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
> >>>>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
> >>>>> they relate to SHMLBA.
> >>>> 
> >>>> 
> >>>> Actually, the number was 4 Mb (bit).
> >>> 
> >>> No, it was 4MB.  That's the cache equivalency stride on PA processors
> >>> because we have a VIPT cache.  The architectural requirement according
> >>> to the dreaded appendix F is 16MB but we were assured by the PA
> >>> architects that it was 4 because they never planned producing processors
> >>> that would require 16.  The actual meaning is it's the number of bits of
> >>> the virtual address that are significant in the virtual index.
> >>> 
> >> 
> >> Your following statement:
> >> 
> >>> The point of SHMLBA is that if the same physical page is mapped into two
> >>> different virtual addresses but the two addresses are equal, modulo
> >>> SHMLBA, then the L1 cache sees the equivalency and you can't get
> >>> inequivalent cache aliases for the page (two writes to the two different
> >>> addresses producing two separately dirty cache lines which can never
> >>> resolve).  This means that the virtual addresses of all shared mappings
> >>> have to be equal modulo SHMLBA for the caches not to alias.
> >> 
> >> With this you define SHMLBA to be the representative number which defines
> >> what the current cache equivalency stride of the kernel is, *and* which then can
> >> be used by userspace. I think this is a misinterpretation of SHMLBA (or at
> >> least a parisc-specific interpretation of SHMLBA), which is not like how it
> >> is used on other architectures with similar limitations.
> >> Userspace should not know the kernel/architecture specifics. Instead they
> >> should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
> >> kernel/glibc will return a corrected mapping address (modulo 4MB).
> >> I think this is important, since most userspace programs usually try to mmap at
> >> a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
> >> This has been the issue with localedef in glibc (a strange coding which tries
> >> to be platform-specific with mmap-calculation). Because of that in the end
> >> it turned out to be best for parisc to have SHMLBA defined to 4kb (and not 4MB).
> >> 
> >> So, your statement above is correct, I would just not use "SHMLBA" in this term,
> >> but maybe "KERNEL_SHMLBA" instead.
> > 
> > Um, no, SHMLBA comes from the SYS-V IPC primitives.  They were stupid
> > > enough to allow the user to pick the address of the region of shared
> > memory, so the user had to know these architectural details and SHMLBA
> > encodes them (man shmat will give you the gory details).
> > 
> > For mmap, we can mostly do the right thing in the kernel, except for
> > MAP_FIXED, where the user has to know what they're doing again.
> > 
> > For the cases the user thinks they know best, we can't avoid giving out
> > the knowledge somehow, because inequivalent aliases in writeable
> > mappings will HPMC a system.  We could be more relaxed about
> > inequivalent aliases in read only mappings (say shared libraries), but
> > the consequence of that is an explosion in the use of cache space, so we
> > would want some libraries (like glibc) with many shared copies to obey
> > SHMLBA.
> 
> 
> I agree with Helge.  We run out of memory too quickly with 4 MB.  This resulted in various
> userspace applications failing as mentioned by Helge.
> 
> MAP_FIXED can fail if the address is a problem.  I believe we check for that.
> 
> If you look at the pch implementation in gcc, you will see that the MAP_FIXED problem can be
> worked around, and this problem is not specific to parisc.  The pch data has to be mapped at the same
> address as it was originally created as there is no way to relocate the data.
> 
> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.
> 
> The following is in <asm-generic/shmparam.h>:
> #define SHMLBA PAGE_SIZE	 /* attach addr a multiple of this */
> 
> Shared mappings are handled with 
> asm/shmparam.h:#define SHM_COLOUR 0x00400000	/* shared mappings colouring */

So how is the sys-v ipc problem fixed?  There the user is told to select
an address which is a multiple of SHMLBA.  Programs that do this today
will start to break on writeable mappings if we set SHMLBA to PAGE_SIZE
because the colour will be wrong.

James
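
A minimal sketch of the colour argument above, assuming the 4 MB stride quoted from asm/shmparam.h and a 4 KB page. The names COLOUR_STRIDE, PAGE_BYTES, colour_of() and same_colour() are hypothetical and exist only for this illustration; a real kernel check would also have to account for the file offset, which is ignored here.

```c
#include <stdbool.h>

#define COLOUR_STRIDE 0x00400000UL /* SHM_COLOUR value quoted in the thread */
#define PAGE_BYTES    4096UL       /* assumed page size */

/* Page-granular offset of an address inside the 4 MB colour stride. */
static unsigned long colour_of(unsigned long addr)
{
    return addr & (COLOUR_STRIDE - 1) & ~(PAGE_BYTES - 1);
}

/* Two mappings of the same page are cache-equivalent only if their colours match. */
static bool same_colour(unsigned long a, unsigned long b)
{
    return colour_of(a) == colour_of(b);
}
```

With these definitions, two addresses 4 MB apart share a colour, while two addresses that are merely page aligned (one page apart) do not, which is the case James is worried about.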


James Bottomley Feb. 22, 2015, 5:28 p.m. UTC | #14
On Sun, 2015-02-22 at 11:45 -0500, John David Anglin wrote:
> On 2015-02-21, at 6:57 PM, James Bottomley wrote:
> 
> > On Sun, 2015-02-22 at 00:26 +0100, Helge Deller wrote:
> >> On 22.02.2015 00:09, James Bottomley wrote:
> >>> On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
> >>>> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
> >>>> 
> >>>>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
> >>>>> 
> >>>>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
> >>>>>>> In current eglibc it's set to 0x00400000
> >>>>>>> That's what my eglibc-patch changes...
> >>>>>>> I'm currently building a eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
> >>>>>> 
> >>>>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
> >>>>>> 
> >>>>>> I plan to use a fixed value of 4096, since I never expect hppa
> >>>>>> userspace to have to care (even if the kernel uses superpages).
> >>>>> 
> >>>>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
> >>>>> Is there a performance advantage in using 4096?
> >>>>> 
> >>>>>> 
> >>>>>> Please correct me if I'm wrong.
> >>>>> 
> >>>>> 
> >>>>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
> >>>>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
> >>>>> they relate to SHMLBA.
> >>>> 
> >>>> 
> >>>> Actually, the number was 4 Mb (bit).
> >>> 
> >>> No, it was 4MB.  That's the cache equivalency stride on PA processors
> >>> because we have a VIPT cache.  The architectural requirement according
> >>> to the dreaded appendix F is 16MB but we were assured by the PA
> >>> architects that it was 4 because they never planned producing processors
> >>> that would require 16.  The actual meaning is it's the number of bits of
> >>> the virtual address that are significant in the virtual index.
> >>> 
> >> 
> >> Your following statement:
> >> 
> >>> The point of SHMLBA is that if the same physical page is mapped into two
> >>> different virtual addresses but the two addresses are equal, modulo
> >>> SHMLBA, then the L1 cache sees the equivalency and you can't get
> >>> inequivalent cache aliases for the page (two writes to the two different
> >>> addresses producing two separately dirty cache lines which can never
> >>> resolve).  This means that the virtual addresses of all shared mappings
> >>> have to be equal modulo SHMLBA for the caches not to alias.
> >> 
> >> With this you define SHMLBA to be the representative number which defines
> >> what the current cache equivalency stride of the kernel is, *and* which then can
> >> be used by userspace. I think this is a misinterpretation of SHMLBA (or at
> >> least a parisc-specific interpretation of SHMLBA), which is not like how it
> >> is used on other architectures with similar limitations.
> >> Userspace should not know the kernel/architecture specifics. Instead they
> >> should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
> >> kernel/glibc will return a corrected mapping address (modulo 4MB).
> >> I think this is important, since most userspace programs usually try to mmap at
> >> a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
> >> This has been the issue with localedef in glibc (a strange coding which tries
> >> to be platform-specific with mmap-calculation). Because of that in the end
> >> it turned out to be best for parisc to have SHMLBA defined to 4kb (and not 4MB).
> >> 
> >> So, your statement above is correct, I would just not use "SHMLBA" in this term,
> >> but maybe "KERNEL_SHMLBA" instead.
> > 
> > Um, no, SHMLBA comes from the SYS-V IPC primitives.  They were stupid
> > > enough to allow the user to pick the address of the region of shared
> > memory, so the user had to know these architectural details and SHMLBA
> > encodes them (man shmat will give you the gory details).
> > 
> > For mmap, we can mostly do the right thing in the kernel, except for
> > MAP_FIXED, where the user has to know what they're doing again.
> > 
> > For the cases the user thinks they know best, we can't avoid giving out
> > the knowledge somehow, because inequivalent aliases in writeable
> > mappings will HPMC a system.  We could be more relaxed about
> > inequivalent aliases in read only mappings (say shared libraries), but
> > the consequence of that is an explosion in the use of cache space, so we
> > would want some libraries (like glibc) with many shared copies to obey
> > SHMLBA.
> 
> 
> I agree with Helge.  We run out of memory too quickly with 4 MB.  This resulted in various
> userspace applications failing as mentioned by Helge.
> 
> MAP_FIXED can fail if the address is a problem.  I believe we check for that.
> 
> If you look at the pch implementation in gcc, you will see that the MAP_FIXED problem can be
> worked around, and this problem is not specific to parisc.  The pch data has to be mapped at the same
> address as it was originally created as there is no way to relocate the data.
> 
> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.

Do we know what tricks hpux does in shmat() to pull this off?  Making
writeable mappings uncacheable might be one way.  Using the space bits
as part of the VI index generation would be another.  I think there were
also some space bit quadrant tricks hpux pulls, aren't there?

James


Helge Deller Feb. 22, 2015, 5:53 p.m. UTC | #15
On 22.02.2015 18:17, James Bottomley wrote:
> On Sun, 2015-02-22 at 11:45 -0500, John David Anglin wrote:
>> On 2015-02-21, at 6:57 PM, James Bottomley wrote:
>>
>>> On Sun, 2015-02-22 at 00:26 +0100, Helge Deller wrote:
>>>> On 22.02.2015 00:09, James Bottomley wrote:
>>>>> On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
>>>>>> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
>>>>>>
>>>>>>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
>>>>>>>
>>>>>>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
>>>>>>>>> In current eglibc it's set to 0x00400000
>>>>>>>>> That's what my eglibc-patch changes...
>>>>>>>>> I'm currently building a eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
>>>>>>>>
>>>>>>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
>>>>>>>>
>>>>>>>> I plan to use a fixed value of 4096, since I never expect hppa
>>>>>>>> userspace to have to care (even if the kernel uses superpages).
>>>>>>>
>>>>>>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
>>>>>>> Is there a performance advantage in using 4096?
>>>>>>>
>>>>>>>>
>>>>>>>> Please correct me if I'm wrong.
>>>>>>>
>>>>>>>
>>>>>>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
>>>>>>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
>>>>>>> they relate to SHMLBA.
>>>>>>
>>>>>>
>>>>>> Actually, the number was 4 Mb (bit).
>>>>>
>>>>> No, it was 4MB.  That's the cache equivalency stride on PA processors
>>>>> because we have a VIPT cache.  The architectural requirement according
>>>>> to the dreaded appendix F is 16MB but we were assured by the PA
>>>>> architects that it was 4 because they never planned producing processors
>>>>> that would require 16.  The actual meaning is it's the number of bits of
>>>>> the virtual address that are significant in the virtual index.
>>>>>
>>>>
>>>> Your following statement:
>>>>
>>>>> The point of SHMLBA is that if the same physical page is mapped into two
>>>>> different virtual addresses but the two addresses are equal, modulo
>>>>> SHMLBA, then the L1 cache sees the equivalency and you can't get
>>>>> inequivalent cache aliases for the page (two writes to the two different
>>>>> addresses producing two separately dirty cache lines which can never
>>>>> resolve).  This means that the virtual addresses of all shared mappings
>>>>> have to be equal modulo SHMLBA for the caches not to alias.
>>>>
>>>> With this you define SHMLBA to be the representative number which defines
>>>> what the current cache equivalency stride of the kernel is, *and* which then can
>>>> be used by userspace. I think this is a misinterpretation of SHMLBA (or at
>>>> least a parisc-specific interpretation of SHMLBA), which is not like how it
>>>> is used on other architectures with similar limitations.
>>>> Userspace should not know the kernel/architecture specifics. Instead they
>>>> should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
>>>> kernel/glibc will return a corrected mapping address (modulo 4MB).
>>>> I think this is important, since most userspace programs usually try to mmap at
>>>> a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
>>>> This has been the issue with localedef in glibc (a strange coding which tries
>>>> to be platform-specific with mmap-calculation). Because of that in the end
>>>> it turned out to be best for parisc to have SHMLBA defined to 4kb (and not 4MB).
>>>>
>>>> So, your statement above is correct, I would just not use "SHMLBA" in this term,
>>>> but maybe "KERNEL_SHMLBA" instead.
>>>
>>> Um, no, SHMLBA comes from the SYS-V IPC primitives.  They were stupid
>>> enough to allow the user to pick the address of the region of shared
>>> memory, so the user had to know these architectural details and SHMLBA
>>> encodes them (man shmat will give you the gory details).
>>>
>>> For mmap, we can mostly do the right thing in the kernel, except for
>>> MAP_FIXED, where the user has to know what they're doing again.
>>>
>>> For the cases the user thinks they know best, we can't avoid giving out
>>> the knowledge somehow, because inequivalent aliases in writeable
>>> mappings will HPMC a system.  We could be more relaxed about
>>> inequivalent aliases in read only mappings (say shared libraries), but
>>> the consequence of that is an explosion in the use of cache space, so we
>>> would want some libraries (like glibc) with many shared copies to obey
>>> SHMLBA.
>>
>>
>> I agree with Helge.  We run out of memory too quickly with 4 MB.  This resulted in various
>> userspace applications failing as mentioned by Helge.
>>
>> MAP_FIXED can fail if the address is a problem.  I believe we check for that.
>>
>> If you look at the pch implementation in gcc, you will see that the MAP_FIXED problem can be
>> worked around, and this problem is not specific to parisc.  The pch data has to be mapped at the same
>> address as it was originally created as there is no way to relocate the data.
>>
>> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.
>>
>> The following is in <asm-generic/shmparam.h>:
>> #define SHMLBA PAGE_SIZE	 /* attach addr a multiple of this */
>>
>> Shared mappings are handled with
>> asm/shmparam.h:#define SHM_COLOUR 0x00400000	/* shared mappings colouring */
>
> So how is the sys-v ipc problem fixed?  There the user is told to select
> an address which is a multiple of SHMLBA.  Programs that do this today
> will start to break on writeable mappings if we set SHMLBA to PAGE_SIZE
> because the colour will be wrong.

No, an mmap() to a fixed shared address which violates the colouring will fail for userspace.
Instead, most userspace programs today just use a shared mmap() without a fixed address and get back
a new address (calculated by the kernel) which does not violate the colouring.
Dave and I have worked on quite a few such userspace issues over the last years (esp. glibc), and
having SHMLBA=4096 is the way it now works best, as it is similar to the other architectures
and existing userspace programs cope correctly with it.

Helge
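
How a kernel-chosen address can keep the colouring may be sketched as follows. This is an illustration under assumptions, not the parisc implementation: pick_coloured_addr() and its parameters are hypothetical, and COLOUR_STRIDE simply reuses the SHM_COLOUR value quoted above.

```c
#include <assert.h>

#define COLOUR_STRIDE 0x00400000UL /* SHM_COLOUR value quoted in the thread */

/*
 * Return the lowest address >= hint whose offset within the colour
 * stride equals wanted_off (hypothetical helper, illustration only).
 */
static unsigned long pick_coloured_addr(unsigned long hint, unsigned long wanted_off)
{
    assert(wanted_off < COLOUR_STRIDE);

    unsigned long cand = (hint & ~(COLOUR_STRIDE - 1)) + wanted_off;
    if (cand < hint)
        cand += COLOUR_STRIDE; /* the next stride up has the same colour */
    return cand;
}
```

The caller passes only a hint; the returned address may differ from it, which is why this approach works for non-fixed mappings but not for MAP_FIXED.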
John David Anglin Feb. 22, 2015, 5:54 p.m. UTC | #16
On 2015-02-22, at 12:17 PM, James Bottomley wrote:

> On Sun, 2015-02-22 at 11:45 -0500, John David Anglin wrote:
>> On 2015-02-21, at 6:57 PM, James Bottomley wrote:
>> 
>>> On Sun, 2015-02-22 at 00:26 +0100, Helge Deller wrote:
>>>> On 22.02.2015 00:09, James Bottomley wrote:
>>>>> On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
>>>>>> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
>>>>>> 
>>>>>>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
>>>>>>> 
>>>>>>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
>>>>>>>>> In current eglibc it's set to 0x00400000
>>>>>>>>> That's what my eglibc-patch changes...
>>>>>>>>> I'm currently building a eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
>>>>>>>> 
>>>>>>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
>>>>>>>> 
>>>>>>>> I plan to use a fixed value of 4096, since I never expect hppa
>>>>>>>> userspace to have to care (even if the kernel uses superpages).
>>>>>>> 
>>>>>>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
>>>>>>> Is there a performance advantage in using 4096?
>>>>>>> 
>>>>>>>> 
>>>>>>>> Please correct me if I'm wrong.
>>>>>>> 
>>>>>>> 
>>>>>>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
>>>>>>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
>>>>>>> they relate to SHMLBA.
>>>>>> 
>>>>>> 
>>>>>> Actually, the number was 4 Mb (bit).
>>>>> 
>>>>> No, it was 4MB.  That's the cache equivalency stride on PA processors
>>>>> because we have a VIPT cache.  The architectural requirement according
>>>>> to the dreaded appendix F is 16MB but we were assured by the PA
>>>>> architects that it was 4 because they never planned producing processors
>>>>> that would require 16.  The actual meaning is it's the number of bits of
>>>>> the virtual address that are significant in the virtual index.
>>>>> 
>>>> 
>>>> Your following statement:
>>>> 
>>>>> The point of SHMLBA is that if the same physical page is mapped into two
>>>>> different virtual addresses but the two addresses are equal, modulo
>>>>> SHMLBA, then the L1 cache sees the equivalency and you can't get
>>>>> inequivalent cache aliases for the page (two writes to the two different
>>>>> addresses producing two separately dirty cache lines which can never
>>>>> resolve).  This means that the virtual addresses of all shared mappings
>>>>> have to be equal modulo SHMLBA for the caches not to alias.
>>>> 
>>>> With this you define SHMLBA to be the representative number which defines
>>>> what the current cache equivalency stride of the kernel is, *and* which then can
>>>> be used by userspace. I think this is a misinterpretation of SHMLBA (or at
>>>> least a parisc-specific interpretation of SHMLBA), which is not like how it
>>>> is used on other architectures with similar limitations.
>>>> Userspace should not know the kernel/architecture specifics. Instead they
>>>> should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
>>>> kernel/glibc will return a corrected mapping address (modulo 4MB).
>>>> I think this is important, since most userspace programs usually try to mmap at
>>>> a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
>>>> This has been the issue with localedef in glibc (a strange coding which tries
>>>> to be platform-specific with mmap-calculation). Because of that in the end
>>>> it turned out to be best for parisc to have SHMLBA defined to 4kb (and not 4MB).
>>>> 
>>>> So, your statement above is correct, I would just not use "SHMLBA" in this term,
>>>> but maybe "KERNEL_SHMLBA" instead.
>>> 
>>> Um, no, SHMLBA comes from the SYS-V IPC primitives.  They were stupid
>>> enough to allow the user to pick the address of the region of shared
>>> memory, so the user had to know these architectural details and SHMLBA
>>> encodes them (man shmat will give you the gory details).
>>> 
>>> For mmap, we can mostly do the right thing in the kernel, except for
>>> MAP_FIXED, where the user has to know what they're doing again.
>>> 
>>> For the cases the user thinks they know best, we can't avoid giving out
>>> the knowledge somehow, because inequivalent aliases in writeable
>>> mappings will HPMC a system.  We could be more relaxed about
>>> inequivalent aliases in read only mappings (say shared libraries), but
>>> the consequence of that is an explosion in the use of cache space, so we
>>> would want some libraries (like glibc) with many shared copies to obey
>>> SHMLBA.
>> 
>> 
>> I agree with Helge.  We run out of memory too quickly with 4 MB.  This resulted in various
>> userspace applications failing as mentioned by Helge.
>> 
>> MAP_FIXED can fail if the address is a problem.  I believe we check for that.
>> 
>> If you look at the pch implementation in gcc, you will see that the MAP_FIXED problem can be
>> worked around, and this problem is not specific to parisc.  The pch data has to be mapped at the same
>> address as it was originally created as there is no way to relocate the data.
>> 
>> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.
>> 
>> The following is in <asm-generic/shmparam.h>:
>> #define SHMLBA PAGE_SIZE	 /* attach addr a multiple of this */
>> 
>> Shared mappings are handled with 
>> asm/shmparam.h:#define SHM_COLOUR 0x00400000	/* shared mappings colouring */
> 
> So how is the sys-v ipc problem fixed?  There the user is told to select
> an address which is a multiple of SHMLBA.  Programs that do this today
> will start to break on writeable mappings if we set SHMLBA to PAGE_SIZE
> because the colour will be wrong.


The code returns -EINVAL.  See arch_get_unmapped_area.

Dave
--
John David Anglin	dave.anglin@bell.net
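
A rough sketch of the kind of check that yields -EINVAL for a fixed shared address, assuming the 4 MB SHM_COLOUR stride quoted above. The function check_fixed_shared() and its required_off parameter are hypothetical names for this illustration and are not copied from arch_get_unmapped_area.

```c
#include <errno.h>

#define COLOUR_STRIDE 0x00400000UL /* SHM_COLOUR value quoted in the thread */

/*
 * Accept a caller-chosen (fixed) shared address only if it sits at the
 * required offset within the colour stride; otherwise refuse the request.
 */
static int check_fixed_shared(unsigned long addr, unsigned long required_off)
{
    if (((addr - required_off) & (COLOUR_STRIDE - 1)) != 0)
        return -EINVAL; /* wrong colour: mapping would alias in the cache */
    return 0;
}
```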



James Bottomley Feb. 22, 2015, 5:58 p.m. UTC | #17
On Sun, 2015-02-22 at 12:54 -0500, John David Anglin wrote:
> On 2015-02-22, at 12:17 PM, James Bottomley wrote:
> 
> > On Sun, 2015-02-22 at 11:45 -0500, John David Anglin wrote:
> >> On 2015-02-21, at 6:57 PM, James Bottomley wrote:
> >> 
> >>> On Sun, 2015-02-22 at 00:26 +0100, Helge Deller wrote:
> >>>> On 22.02.2015 00:09, James Bottomley wrote:
> >>>>> On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
> >>>>>> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
> >>>>>> 
> >>>>>>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
> >>>>>>> 
> >>>>>>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
> >>>>>>>>> In current eglibc it's set to 0x00400000
> >>>>>>>>> That's what my eglibc-patch changes...
> >>>>>>>>> I'm currently building a eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
> >>>>>>>> 
> >>>>>>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
> >>>>>>>> 
> >>>>>>>> I plan to use a fixed value of 4096, since I never expect hppa
> >>>>>>>> userspace to have to care (even if the kernel uses superpages).
> >>>>>>> 
> >>>>>>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
> >>>>>>> Is there a performance advantage in using 4096?
> >>>>>>> 
> >>>>>>>> 
> >>>>>>>> Please correct me if I'm wrong.
> >>>>>>> 
> >>>>>>> 
> >>>>>>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
> >>>>>>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
> >>>>>>> they relate to SHMLBA.
> >>>>>> 
> >>>>>> 
> >>>>>> Actually, the number was 4 Mb (bit).
> >>>>> 
> >>>>> No, it was 4MB.  That's the cache equivalency stride on PA processors
> >>>>> because we have a VIPT cache.  The architectural requirement according
> >>>>> to the dreaded appendix F is 16MB but we were assured by the PA
> >>>>> architects that it was 4 because they never planned producing processors
> >>>>> that would require 16.  The actual meaning is it's the number of bits of
> >>>>> the virtual address that are significant in the virtual index.
> >>>>> 
> >>>> 
> >>>> Your following statement:
> >>>> 
> >>>>> The point of SHMLBA is that if the same physical page is mapped into two
> >>>>> different virtual addresses but the two addresses are equal, modulo
> >>>>> SHMLBA, then the L1 cache sees the equivalency and you can't get
> >>>>> inequivalent cache aliases for the page (two writes to the two different
> >>>>> addresses producing two separately dirty cache lines which can never
> >>>>> resolve).  This means that the virtual addresses of all shared mappings
> >>>>> have to be equal modulo SHMLBA for the caches not to alias.
> >>>> 
> >>>> With this you define SHMLBA to be the representative number which defines
> >>>> what the current cache equivalency stride of the kernel is, *and* which then can
> >>>> be used by userspace. I think this is a misinterpretation of SHMLBA (or at
> >>>> least a parisc-specific interpretation of SHMLBA), which is not like how it
> >>>> is used on other architectures with similar limitations.
> >>>> Userspace should not know the kernel/architecture specifics. Instead they
> >>>> should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
> >>>> kernel/glibc will return a corrected mapping address (modulo 4MB).
> >>>> I think this is important, since most userspace programs usually try to mmap at
> >>>> a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
> >>>> This has been the issue with localedef in glibc (a strange coding which tries
> >>>> to be platform-specific with mmap-calculation). Because of that in the end
> >>>> it turned out to be best for parisc to have SHMLBA defined to 4kb (and not 4MB).
> >>>> 
> >>>> So, your statement above is correct, I would just not use "SHMLBA" in this term,
> >>>> but maybe "KERNEL_SHMLBA" instead.
> >>> 
> >>> Um, no, SHMLBA comes from the SYS-V IPC primitives.  They were stupid
> >>> enough to allow the user to pick the address of the region of shared
> >>> memory, so the user had to know these architectural details and SHMLBA
> >>> encodes them (man shmat will give you the gory details).
> >>> 
> >>> For mmap, we can mostly do the right thing in the kernel, except for
> >>> MAP_FIXED, where the user has to know what they're doing again.
> >>> 
> >>> For the cases the user thinks they know best, we can't avoid giving out
> >>> the knowledge somehow, because inequivalent aliases in writeable
> >>> mappings will HPMC a system.  We could be more relaxed about
> >>> inequivalent aliases in read only mappings (say shared libraries), but
> >>> the consequence of that is an explosion in the use of cache space, so we
> >>> would want some libraries (like glibc) with many shared copies to obey
> >>> SHMLBA.
> >> 
> >> 
> >> I agree with Helge.  We run out of memory too quickly with 4 MB.  This resulted in various
> >> userspace applications failing as mentioned by Helge.
> >> 
> >> MAP_FIXED can fail if the address is a problem.  I believe we check for that.
> >> 
> >> If you look at the pch implementation in gcc, you will see that the MAP_FIXED problem can be
> >> worked around, and this problem is not specific to parisc.  The pch data has to be mapped at the same
> >> address as it was originally created as there is no way to relocate the data.
> >> 
> >> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.
> >> 
> >> The following is in <asm-generic/shmparam.h>:
> >> #define SHMLBA PAGE_SIZE	 /* attach addr a multiple of this */
> >> 
> >> Shared mappings are handled with 
> >> asm/shmparam.h:#define SHM_COLOUR 0x00400000	/* shared mappings colouring */
> > 
> > So how is the sys-v ipc problem fixed?  There the user is told to select
> > an address which is a multiple of SHMLBA.  Programs that do this today
> > will start to break on writeable mappings if we set SHMLBA to PAGE_SIZE
> > because the colour will be wrong.
> 
> 
> The code returns -EINVAL.  See arch_get_unmapped_area.

But that's not a solution.  Let me try to illustrate: I have an existing
application, it uses sys-v ipc and selects a shmat address based on the
multiple of SHMLBA for a writeable mapping.  Today it works.  Tomorrow
when you make this change, it fails with -EINVAL.  That's breaking an
existing application because chances are the app will just report the
failure and exit.

James
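
The scenario above can be sketched with plain arithmetic, assuming an application that rounds its preferred attach address down to a multiple of SHMLBA before calling shmat(). The helpers round_down_to() and colour_offset() are hypothetical, as is the example address; the 4 MB stride is the SHM_COLOUR value quoted earlier in the thread.

```c
#include <assert.h>

#define COLOUR_STRIDE 0x00400000UL /* SHM_COLOUR value quoted in the thread */

/* Round addr down to a multiple of shmlba (shmlba must be a power of two). */
static unsigned long round_down_to(unsigned long addr, unsigned long shmlba)
{
    assert(shmlba != 0 && (shmlba & (shmlba - 1)) == 0);
    return addr & ~(shmlba - 1);
}

/* Offset of addr inside the 4 MB colour stride; 0 means stride aligned. */
static unsigned long colour_offset(unsigned long addr)
{
    return addr & (COLOUR_STRIDE - 1);
}
```

For a preferred address of 0x40123456, rounding with SHMLBA = 4 MB gives 0x40000000 (colour offset 0), while rounding with SHMLBA = 4096 gives 0x40123000 (colour offset 0x123000). The same application code therefore produces a differently coloured address once the constant changes.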



John David Anglin Feb. 22, 2015, 6:02 p.m. UTC | #18
On 2015-02-22, at 12:28 PM, James Bottomley wrote:

> On Sun, 2015-02-22 at 11:45 -0500, John David Anglin wrote:
>> On 2015-02-21, at 6:57 PM, James Bottomley wrote:
>> 
>>> On Sun, 2015-02-22 at 00:26 +0100, Helge Deller wrote:
>>>> On 22.02.2015 00:09, James Bottomley wrote:
>>>>> On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
>>>>>> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
>>>>>> 
>>>>>>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
>>>>>>> 
>>>>>>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
>>>>>>>>> In current eglibc it's set to 0x00400000
>>>>>>>>> That's what my eglibc-patch changes...
>>>>>>>>> I'm currently building a eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
>>>>>>>> 
>>>>>>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
>>>>>>>> 
>>>>>>>> I plan to use a fixed value of 4096, since I never expect hppa
>>>>>>>> userspace to have to care (even if the kernel uses superpages).
>>>>>>> 
>>>>>>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
>>>>>>> Is there a performance advantage in using 4096?
>>>>>>> 
>>>>>>>> 
>>>>>>>> Please correct me if I'm wrong.
>>>>>>> 
>>>>>>> 
>>>>>>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
>>>>>>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
>>>>>>> they relate to SHMLBA.
>>>>>> 
>>>>>> 
>>>>>> Actually, the number was 4 Mb (bit).
>>>>> 
>>>>> No, it was 4MB.  That's the cache equivalency stride on PA processors
>>>>> because we have a VIPT cache.  The architectural requirement according
>>>>> to the dreaded appendix F is 16MB but we were assured by the PA
>>>>> architects that it was 4 because they never planned producing processors
>>>>> that would require 16.  The actual meaning is it's the number of bits of
>>>>> the virtual address that are significant in the virtual index.
>>>>> 
>>>> 
>>>> Your following statement:
>>>> 
>>>>> The point of SHMLBA is that if the same physical page is mapped into two
>>>>> different virtual addresses but the two addresses are equal, modulo
>>>>> SHMLBA, then the L1 cache sees the equivalency and you can't get
>>>>> inequivalent cache aliases for the page (two writes to the two different
>>>>> addresses producing two separately dirty cache lines which can never
>>>>> resolve).  This means that the virtual addresses of all shared mappings
>>>>> have to be equal modulo SHMLBA for the caches not to alias.
>>>> 
>>>> With this you define SHMLBA to be the representative number which defines
>>>> what the current cache equivalency stride of the kernel is, *and* which then can
>>>> be used by userspace. I think this is a misinterpretation of SHMLBA (or at
>>>> least a parisc-specific interpretation of SHMLBA), which is not like how it
>>>> is used on other architectures with similar limitations.
>>>> Userspace should not know the kernel/architecture specifics. Instead they
>>>> should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
>>>> kernel/glibc will return a corrected mapping address (modulo 4MB).
>>>> I think this is important, since most userspace programs usually try to mmap at
>>>> a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
>>>> This has been the issue with localedef in glibc (a strange coding which tries
>>>> to be platform-specific with mmap-calculation). Because of that in the end
>>>> it turned out to be best for parisc to have SHMLBA defined to 4kb (and not 4MB).
>>>> 
>>>> So, your statement above is correct, I would just not use "SHMLBA" in this term,
>>>> but maybe "KERNEL_SHMLBA" instead.
>>> 
>>> Um, no, SHMLBA comes from the SYS-V IPC primitives.  They were stupid
>>> enough to allow the user pick the address of the region of shared
>>> memory, so the user had to know these architectural details and SHMLBA
>>> encodes them (man shmat will give you the gory details).
>>> 
>>> For mmap, we can mostly do the right thing in the kernel, except for
>>> MAP_FIXED, where the user has to know what they're doing again.
>>> 
>>> For the cases the user thinks they know best, we can't avoid giving out
>>> the knowledge somehow, because inequivalent aliases in writeable
>>> mappings will HPMC a system.  We could be more relaxed about
>>> inequivalent aliases in read only mappings (say shared libraries), but
>>> the consequence of that is an explosion in the use of cache space, so we
>>> would want some libraries (like glibc) with many shared copies to obey
>>> SHMLBA.
>> 
>> 
>> I agree with Helge.  We run out of memory too quickly with 4 MB.  This resulted in various
>> userspace applications failing as mentioned by Helge.
>> 
>> MAP_FIXED can fail if the address is a problem.  I believe we check for that.
>> 
>> If you look at the pch implementation in gcc, you will see that the MAP_FIXED problem can be
>> worked around, and this problem is not specific to parisc.  The pch data has to be mapped at the same
>> address as it was originally created as there is no way to relocate the data.
>> 
>> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.
> 
> Do we know what tricks hpux does in shmat() to pull this off?  Making
> writeable mappings uncacheable might be one way.  Using the space bits
> as part of the VI index generation would be another.  I think there were
> also some space bit quadrant tricks hpux pulls, aren't there?


I don't know anything about the internal details.  I believe that shared mappings of shared libraries are normally in
one quadrant and private mappings in another quadrant.  One can't place a breakpoint in a shared library when
the mapping is shared.

Dave
--
John David Anglin	dave.anglin@bell.net



--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Helge Deller Feb. 22, 2015, 6:07 p.m. UTC | #19
On 22.02.2015 18:58, James Bottomley wrote:
> On Sun, 2015-02-22 at 12:54 -0500, John David Anglin wrote:
>> On 2015-02-22, at 12:17 PM, James Bottomley wrote:
>>
>>> On Sun, 2015-02-22 at 11:45 -0500, John David Anglin wrote:
>>>> On 2015-02-21, at 6:57 PM, James Bottomley wrote:
>>>>
>>>>> On Sun, 2015-02-22 at 00:26 +0100, Helge Deller wrote:
>>>>>> On 22.02.2015 00:09, James Bottomley wrote:
>>>>>>> On Sat, 2015-02-21 at 15:40 -0500, John David Anglin wrote:
>>>>>>>> On 2015-02-21, at 3:31 PM, John David Anglin wrote:
>>>>>>>>
>>>>>>>>> On 2015-02-20, at 4:36 PM, Carlos O'Donell wrote:
>>>>>>>>>
>>>>>>>>>> On Thu, Apr 3, 2014 at 4:26 PM, Helge Deller <deller@gmx.de> wrote:
>>>>>>>>>>> In current eglibc it's set to 0x00400000
>>>>>>>>>>> That's what my eglibc-patch changes...
>>>>>>>>>>> I'm currently building a eglibc on hpviz with SHMLBA set to 4096 (__getpagesize()).
>>>>>>>>>>
>>>>>>>>>> Anyone object to me fixing this upstream by making SHMLBA match the kernel?
>>>>>>>>>>
>>>>>>>>>> I plan to use a fixed value of 4096, since I never expect hppa
>>>>>>>>>> userspace to have to care (even if the kernel uses superpages).
>>>>>>>>>
>>>>>>>>> We currently use (__getpagesize ()) in Debian and this seems to be a common definition.
>>>>>>>>> Is there a performance advantage in using 4096?
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please correct me if I'm wrong.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> At one time, we thought this value needed to be 4 MB.  Helge was working on improving the mmap
>>>>>>>>> allocation scheme but this work stalled after some improvement.  I can't remember the issues and how
>>>>>>>>> they relate to SHMLBA.
>>>>>>>>
>>>>>>>>
>>>>>>>> Actually, the number was 4 Mb (bit).
>>>>>>>
>>>>>>> No, it was 4MB.  That's the cache equivalency stride on PA processors
>>>>>>> because we have a VIPT cache.  The architectural requirement according
>>>>>>> to the dreaded appendix F is 16MB but we were assured by the PA
>>>>>>> architects that it was 4 because they never planned producing processors
>>>>>>> that would require 16.  The actual meaning is it's the number of bits of
>>>>>>> the virtual address that are significant in the virtual index.
>>>>>>>
>>>>>>
>>>>>> Your following statement:
>>>>>>
>>>>>>> The point of SHMLBA is that if the same physical page is mapped into two
>>>>>>> different virtual addresses but the two addresses are equal, modulo
>>>>>>> SHMLBA, then the L1 cache sees the equivalency and you can't get
>>>>>>> inequivalent cache aliases for the page (two writes to the two different
>>>>>>> addresses producing two separately dirty cache lines which can never
>>>>>>> resolve).  This means that the virtual addresses of all shared mappings
>>>>>>> have to be equal modulo SHMLBA for the caches not to alias.
>>>>>>
>>>>>> With this you define SHMLBA to be the representative number which defines
>>>>>> what the current cache equivalency stride of the kernel is, *and* which then can
>>>>>> be used by userspace. I think this is a misinterpretation of SHMLBA (or at
>>>>>> least a parisc-specific interpretation of SHMLBA), which is not like how it
>>>>>> is used on other architectures with similar limitations.
>>>>>> Userspace should not know the kernel/architecture specifics. Instead they
>>>>>> should try to mmap() memory somewhere (e.g. 4KB aligned) and if they need shared mappings then
>>>>>> kernel/glibc will return a corrected mapping address (modulo 4MB).
>>>>>> I think this is important, since most userspace programs usually try to mmap at
>>>>>> a multiple of SHMLBA with which we then run very soon out of userspace (with SHMLBA=4MB).
>>>>>> This has been the issue with localedef in glibc (a strange coding which tries
>>>>>> to be platform-specific with mmap-calculation). Because of that in the end
>>>>>> it turned out to be best for parisc to have SHMLBA defined to 4kb (and not 4MB).
>>>>>>
>>>>>> So, your statement above is correct, I would just not use "SHMLBA" in this term,
>>>>>> but maybe "KERNEL_SHMLBA" instead.
>>>>>
>>>>> Um, no, SHMLBA comes from the SYS-V IPC primitives.  They were stupid
>>>>> enough to allow the user pick the address of the region of shared
>>>>> memory, so the user had to know these architectural details and SHMLBA
>>>>> encodes them (man shmat will give you the gory details).
>>>>>
>>>>> For mmap, we can mostly do the right thing in the kernel, except for
>>>>> MAP_FIXED, where the user has to know what they're doing again.
>>>>>
>>>>> For the cases the user thinks they know best, we can't avoid giving out
>>>>> the knowledge somehow, because inequivalent aliases in writeable
>>>>> mappings will HPMC a system.  We could be more relaxed about
>>>>> inequivalent aliases in read only mappings (say shared libraries), but
>>>>> the consequence of that is an explosion in the use of cache space, so we
>>>>> would want some libraries (like glibc) with many shared copies to obey
>>>>> SHMLBA.
>>>>
>>>>
>>>> I agree with Helge.  We run out of memory too quickly with 4 MB.  This resulted in various
>>>> userspace applications failing as mentioned by Helge.
>>>>
>>>> MAP_FIXED can fail if the address is a problem.  I believe we check for that.
>>>>
>>>> If you look at the pch implementation in gcc, you will see that the MAP_FIXED problem can be
>>>> worked around, and this problem is not specific to parisc.  The pch data has to be mapped at the same
>>>> address as it was originally created as there is no way to relocate the data.
>>>>
>>>> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.
>>>>
>>>> The following is in <asm-generic/shmparam.h>:
>>>> #define SHMLBA PAGE_SIZE	 /* attach addr a multiple of this */
>>>>
>>>> Shared mappings are handled with
>>>> asm/shmparam.h:#define SHM_COLOUR 0x00400000	/* shared mappings colouring */
>>>
>>> So how is the sys-v ipc problem fixed?  There the user is told to select
>>> an address which is a multiple of SHMLBA.  Programs that do this today
>>> will start to break on writeable mappings if we set SHMLBA to PAGE_SIZE
>>> because the colour will be wrong.
>>
>>
>> The code returns -EINVAL.  See arch_get_unmapped_area.
>
> But that's not a solution.  Let me try to illustrate: I have an existing
> application, it uses sys-v ipc and selects a shmat address based on the
> multiple of SHMLBA for a writeable mapping.  Today it works.

It will work as well with SHMLBA=4096, if you just use SHM_RND too
(and most applications do have SHM_RND).
man shmat says:
  * If shmaddr isn't NULL and SHM_RND is specified in shmflg, the attach occurs
    at the address equal to shmaddr rounded down to the nearest multiple of SHMLBA.
  * Otherwise, shmaddr must be a page-aligned address at which the attach occurs.

So, even here shmaddr is mentioned to be page-aligned (4k), not SHMLBA-aligned (4M in your case).
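
To illustrate the rounding that the man page text above describes, here is a minimal sketch in C. It assumes SHMLBA is a power of two (both 4096 and 0x00400000 are), and `shm_rnd_addr` is a hypothetical helper name, not a glibc or kernel function.

```c
#include <stdint.h>

/* Hypothetical helper: models the SHM_RND rounding described in shmat(2).
 * The requested address is rounded down to the nearest multiple of shmlba.
 * Assumption: shmlba is a power of two. */
static uintptr_t shm_rnd_addr(uintptr_t shmaddr, uintptr_t shmlba)
{
    return shmaddr & ~(shmlba - 1);
}
```

With SHMLBA=4096 a requested address of 0x40123456 is rounded down to 0x40123000; with SHMLBA=0x00400000 the same request would be rounded down to 0x40000000.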

> Tomorrow
> when you make this change, it fails with -EINVAL.  That's breaking an
> existing application because chances are the app will just report the
> failure and exit.

Tomorrow?  This change is *already* implemented in eglibc since a year or
so and I don't see any applications which break because of SHMLBA=4096...

Helge
Helge Deller Feb. 22, 2015, 6:28 p.m. UTC | #20
On 22.02.2015 18:54, John David Anglin wrote:
> At one time, we thought this value needed to be 4 MB.
> Helge was working on improving the mmap allocation scheme but this
> work stalled after some improvement.

The patches are still available:
http://git.kernel.org/cgit/linux/kernel/git/deller/parisc-linux.git/commit/?h=parisc-mmap&id=34ae0a4620b50d27ce2f1314322275cbea7f2055
and
http://git.kernel.org/cgit/linux/kernel/git/deller/parisc-linux.git/commit/?h=parisc-mmap&id=7a6e51ddfd3ab3b11a4ebdd995e26672e69a8efa

Basically the idea is:
- Currently we have a static calculation where the mapping should happen inside the 4MB range:
   see: arch/parisc/kernel/sys_parisc.c: (filp ? ((unsigned long) filp->f_mapping) >> 8 : 0UL)
- Replace that by a dynamic mapping, which searches for a best-fit address in the free memory area *if* the file hasn't been mapped yet, and saves this mapping in the struct address_space. If another process then maps the same file again, just reuse the last calculated "dynamic" mapping (offset).
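
As a rough userspace model of what "an offset inside the 4MB range" means for both variants, the sketch below rounds a candidate address up to the next SHM_COLOUR boundary and adds a per-file offset. This is a simplified illustration and not the code from sys_parisc.c; `colour_align` and its `offset` parameter are hypothetical names, and the offset is assumed to be page-aligned.

```c
#define SHM_COLOUR 0x00400000UL /* value from asm/shmparam.h as quoted in this thread */

/* Hypothetical model: place a shared mapping at or above addr so that
 * its address equals offset modulo SHM_COLOUR.
 * addr:   lowest acceptable start address
 * offset: per-file offset; only the bits below SHM_COLOUR are used */
static unsigned long colour_align(unsigned long addr, unsigned long offset)
{
    /* Round up to the next SHM_COLOUR boundary. */
    unsigned long base = (addr + SHM_COLOUR - 1) & ~(SHM_COLOUR - 1);

    return base + (offset & (SHM_COLOUR - 1));
}
```

Whether the offset is derived statically or remembered per file, every process that maps the same file with the same offset ends up at an address with the same low 22 bits.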

This helps a lot to prevent userspace memory fragmentation, but Linus didn't like this approach and proposed this instead:
  https://lkml.org/lkml/2014/5/1/368

Sadly his patch didn't work out of the box. I tried various ways (I'm sure it can work somehow), but I haven't been able to solve it yet.
Maybe some Linux mm expert could help here?

Helge   
James Bottomley Feb. 22, 2015, 7:13 p.m. UTC | #21
On Sun, 2015-02-22 at 19:07 +0100, Helge Deller wrote:
> >>>> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.
> >>>>
> >>>> The following is in <asm-generic/shmparam.h>:
> >>>> #define SHMLBA PAGE_SIZE	 /* attach addr a multiple of this */
> >>>>
> >>>> Shared mappings are handled with
> >>>> asm/shmparam.h:#define SHM_COLOUR 0x00400000	/* shared mappings colouring */
> >>>
> >>> So how is the sys-v ipc problem fixed?  There the user is told to select
> >>> an address which is a multiple of SHMLBA.  Programs that do this today
> >>> will start to break on writeable mappings if we set SHMLBA to PAGE_SIZE
> >>> because the colour will be wrong.
> >>
> >>
> >> The code returns -EINVAL.  See arch_get_unmapped_area.
> >
> > But that's not a solution.  Let me try to illustrate: I have an existing
> > application, it uses sys-v ipc and selects a shmat address based on the
> > multiple of SHMLBA for a writeable mapping.  Today it works.
> 
> It will work as well with SHMLBA=4096, if you just use SHM_RND too
> (and most applications do have SHM_RND).
> man shmat says:
>   * If shmaddr isn't NULL and SHM_RND is specified in shmflg, the attach occurs at the address equal to shmaddr rounded down to the nearest multiple of SHMLBA.
> * Otherwise, shmaddr must be a page-aligned address at which the attach occurs.
> 
> So, even here shmaddr is mentioned to be page-aligned (4k), not
> SHMLBA-aligned (4M in your case).

I think that part is x86.  All the other VI architectures impose their
VI colour constraints in SHMLBA.  We're the odd one out because we have
a huge stride (everyone else is small multiples of pages).
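
For reference, the colour constraint itself can be written as a one-line check. This is only a sketch: `same_colour` is a hypothetical name, and the 4 MB stride is the SHM_COLOUR value quoted earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define SHM_COLOUR 0x00400000UL /* 4 MB cache equivalency stride on parisc */

/* Two virtual addresses of the same physical page are equivalent aliases
 * only if they are equal modulo the stride, i.e. their low bits match. */
static bool same_colour(uintptr_t a, uintptr_t b)
{
    return ((a ^ b) & (SHM_COLOUR - 1)) == 0;
}
```

For example, 0x40001000 and 0x40401000 have the same colour, while 0x40001000 and 0x40002000 do not, even though both are page-aligned.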

But agree if no applications are affected, we can make the ABI change.

> >Tomorrow
> > when you make this change, it fails with -EINVAL.  That's breaking an
> > existing application because chances are the app will just report the
> > failure and exit.
> 
> Tomorrow?  This change is *already* implemented in eglibc since a year or
> so and I don't see any applications which break because of SHMLBA=4096...

Is eglibc a big enough sample to make the claim that no applications
will be broken?

James


Helge Deller Feb. 22, 2015, 7:16 p.m. UTC | #22
On 22.02.2015 20:13, James Bottomley wrote:
> On Sun, 2015-02-22 at 19:07 +0100, Helge Deller wrote:
>>>>>> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.
>>>>>>
>>>>>> The following is in <asm-generic/shmparam.h>:
>>>>>> #define SHMLBA PAGE_SIZE	 /* attach addr a multiple of this */
>>>>>>
>>>>>> Shared mappings are handled with
>>>>>> asm/shmparam.h:#define SHM_COLOUR 0x00400000	/* shared mappings colouring */
>>>>>
>>>>> So how is the sys-v ipc problem fixed?  There the user is told to select
>>>>> an address which is a multiple of SHMLBA.  Programs that do this today
>>>>> will start to break on writeable mappings if we set SHMLBA to PAGE_SIZE
>>>>> because the colour will be wrong.
>>>>
>>>>
>>>> The code returns -EINVAL.  See arch_get_unmapped_area.
>>>
>>> But that's not a solution.  Let me try to illustrate: I have an existing
>>> application, it uses sys-v ipc and selects a shmat address based on the
>>> multiple of SHMLBA for a writeable mapping.  Today it works.
>>
>> It will work as well with SHMLBA=4096, if you just use SHM_RND too
>> (and most applications do have SHM_RND).
>> man shmat says:
>>    * If shmaddr isn't NULL and SHM_RND is specified in shmflg, the attach occurs at the address equal to shmaddr rounded down to the nearest multiple of SHMLBA.
>> * Otherwise, shmaddr must be a page-aligned address at which the attach occurs.
>>
>> So, even here shmaddr is mentioned to be page-aligned (4k), not
>> SHMLBA-aligned (4M in your case).
>
> I think that part is x86.  All the other VI architectures impose their
> VI colour constraints in SHMLBA.  We're the odd one out because we have
> a huge stride (everyone else is small multiples of pages).
>
> But agree if no applications are affected, we can make the ABI change.
>
>>> Tomorrow
>>> when you make this change, it fails with -EINVAL.  That's breaking an
>>> existing application because chances are the app will just report the
>>> failure and exit.
>>
>> Tomorrow?  This change is *already* implemented in eglibc since a year or
>> so and I don't see any applications which break because of SHMLBA=4096...
>
> Is eglibc a big enough sample to make the claim that no applications
> will be broken?

Try it yourself:
https://parisc.wiki.kernel.org/index.php/Debian_Ports_Installation
Just install debian 8.0 (aka unstable for hppa). KDE, Xfce, libreoffice,
and all other packages just work out of the box.

Helge
James Bottomley Feb. 22, 2015, 7:42 p.m. UTC | #23
On Sun, 2015-02-22 at 20:16 +0100, Helge Deller wrote:
> On 22.02.2015 20:13, James Bottomley wrote:
> > On Sun, 2015-02-22 at 19:07 +0100, Helge Deller wrote:
> >>>>>> SHMLBA is 4096 /* (1 << PGSHIFT) */ on hpux.
> >>>>>>
> >>>>>> The following is in <asm-generic/shmparam.h>:
> >>>>>> #define SHMLBA PAGE_SIZE	 /* attach addr a multiple of this */
> >>>>>>
> >>>>>> Shared mappings are handled with
> >>>>>> asm/shmparam.h:#define SHM_COLOUR 0x00400000	/* shared mappings colouring */
> >>>>>
> >>>>> So how is the sys-v ipc problem fixed?  There the user is told to select
> >>>>> an address which is a multiple of SHMLBA.  Programs that do this today
> >>>>> will start to break on writeable mappings if we set SHMLBA to PAGE_SIZE
> >>>>> because the colour will be wrong.
> >>>>
> >>>>
> >>>> The code returns -EINVAL.  See arch_get_unmapped_area.
> >>>
> >>> But that's not a solution.  Let me try to illustrate: I have an existing
> >>> application, it uses sys-v ipc and selects a shmat address based on the
> >>> multiple of SHMLBA for a writeable mapping.  Today it works.
> >>
> >> It will work as well with SHMLBA=4096, if you just use SHM_RND too
> >> (and most applications do have SHM_RND).
> >> man shmat says:
> >>    * If shmaddr isn't NULL and SHM_RND is specified in shmflg, the attach occurs at the address equal to shmaddr rounded down to the nearest multiple of SHMLBA.
> >> * Otherwise, shmaddr must be a page-aligned address at which the attach occurs.
> >>
> >> So, even here shmaddr is mentioned to be page-aligned (4k), not
> >> SHMLBA-aligned (4M in your case).
> >
> > I think that part is x86.  All the other VI architectures impose their
> > VI colour constraints in SHMLBA.  We're the odd one out because we have
> > a huge stride (everyone else is small multiples of pages).
> >
> > But agree if no applications are affected, we can make the ABI change.
> >
> >>> Tomorrow
> >>> when you make this change, it fails with -EINVAL.  That's breaking an
> >>> existing application because chances are the app will just report the
> >>> failure and exit.
> >>
> >> Tomorrow?  This change is *already* implemented in eglibc since a year or
> >> so and I don't see any applications which break because of SHMLBA=4096...
> >
> > Is eglibc a big enough sample to make the claim that no applications
> > will be broken?
> 
> Try it yourself:
> https://parisc.wiki.kernel.org/index.php/Debian_Ports_Installation
> Just install debian 8.0 (aka unstable for hppa). KDE, Xfce, libreoffice,
> and all other packages just work out of the box.

I know, I do, but that's sort of expected: most modern linux apps use
mmap.  It's the older stuff that uses sys-v ipc, but perhaps for us this
doesn't matter.

James



Carlos O'Donell March 7, 2015, 7:05 p.m. UTC | #24
On Sun, Feb 22, 2015 at 2:42 PM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> I know, I do, but that's sort of expected: most modern linux apps use
> mmap.  It's the older stuff that uses sys-v ipc, but perhaps for us this
> doesn't matter.

And that's the real truth.

The usage patterns that appear to matter are:
(a) An initial mmap, followed by an mmap with MAP_FIXED at the same
previous address.
(b) shmat with SHM_RND, in which case we get to round the resulting
address or return EINVAL.

Both of (a) and (b) work today if glibc sets SHMLBA to 4kb.
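
As a rough illustration of pattern (a), the following C sketch maps a shared region at a kernel-chosen address and then maps again with MAP_FIXED at that same address. It uses an anonymous mapping for brevity, which is an assumption (the cases discussed in this thread are file-backed), and `remap_fixed_at_same_address` is a hypothetical name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Returns 0 if a MAP_FIXED|MAP_SHARED mapping can be placed at the
 * address the kernel picked for an earlier mapping, -1 otherwise. */
static int remap_fixed_at_same_address(size_t len)
{
    /* Step 1: let the kernel choose a suitable address. */
    void *first = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (first == MAP_FAILED)
        return -1;

    /* Step 2: map again at exactly that address. */
    void *second = mmap(first, len, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    int ok = (second == first) ? 0 : -1;

    /* Clean up whichever mapping is in place at this address. */
    munmap(first, len);
    return ok;
}
```

The second call never has to realign the address, because the address came from the kernel in the first place; that is the property the MAP_FIXED_ALIGNMENT discussion at the top of the thread relies on.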

On parisc we simply can't deal with the user selecting an arbitrary address.

Cheers,
Carlos.

Patch

diff -up ./ports/sysdeps/unix/sysv/linux/hppa/bits/shm.h.org ./ports/sysdeps/unix/sysv/linux/hppa/bits/shm.h
--- ./ports/sysdeps/unix/sysv/linux/hppa/bits/shm.h.org	2014-04-03 13:20:43.644098000 -0600
+++ ./ports/sysdeps/unix/sysv/linux/hppa/bits/shm.h	2014-04-03 13:22:15.840098000 -0600
@@ -36,7 +36,7 @@ 
 #define SHM_UNLOCK	12		/* unlock segment (root only) */
 
 /* Segment low boundary address multiple.  */
-#define SHMLBA 0x00400000		/* address needs to be 4 Mb aligned */
+#define SHMLBA		(__getpagesize ())
 
 /* Type to count number of attaches.  */
 typedef unsigned long int shmatt_t;