Message ID | 20150725023842.8664.97620.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Dan Williams |
Headers | show |
On Fri, Jul 24, 2015 at 10:38:42PM -0400, Dan Williams wrote: > The behavior change to return NULL on an unsupported request is reserved > for a later patch. Why? > +enum { > + MEMREMAP_WB = 1 << 0, > + MEMREMAP_WT = 1 << 1, > + MEMREMAP_CACHE = MEMREMAP_WB, What's the point of having this MEMREMAP_CACHE alias? Also please document the meaning of the flags for the poor users.
On Sun, Jul 26, 2015 at 10:25 AM, Christoph Hellwig <hch@lst.de> wrote: > On Fri, Jul 24, 2015 at 10:38:42PM -0400, Dan Williams wrote: >> The behavior change to return NULL on an unsupported request is reserved >> for a later patch. > > Why? This is for drivers like pmem that care about the mapping type. For example, if pmem can't get a cache-enabled mapping it is potentially putting the write durability of the persistent media at risk. >> +enum { >> + MEMREMAP_WB = 1 << 0, >> + MEMREMAP_WT = 1 << 1, >> + MEMREMAP_CACHE = MEMREMAP_WB, > > What's the point of having this MEMREMAP_CACHE alias? For developers that are used to seeing ioremap_cache()... > Also please document the meaning of the flags for the poor users. Will do. I'll mostly borrow from the x86 mapping type definitions, but these will also have architecture specific semantics / constraints.
On Sun, Jul 26, 2015 at 10:49:39AM -0700, Dan Williams wrote: > > What's the point of having this MEMREMAP_CACHE alias? > > For developers that are used to seeing ioremap_cache()... Meh. We've got git logs that show clearly what replaced a previous API. Duplicate names for the same API are highly confusing.
On Sun, Jul 26, 2015 at 10:49:39AM -0700, Dan Williams wrote: > On Sun, Jul 26, 2015 at 10:25 AM, Christoph Hellwig <hch@lst.de> wrote: > > On Fri, Jul 24, 2015 at 10:38:42PM -0400, Dan Williams wrote: > >> The behavior change to return NULL on an unsupported request is reserved > >> for a later patch. > > > > Why? > > This is for drivers like pmem that care about the mapping type. For > example, if pmem can't get a cache-enabled mapping it is potentially > putting the write durability of the persistent media at risk. I understand that part, but the question is why the old behavior is retained for now and only changed later.
On Fri, Jul 24, 2015 at 10:38:42PM -0400, Dan Williams wrote: > Existing users of ioremap_cache() are mapping memory that is known in > advance to not have i/o side effects. These users are forced to cast > away the __iomem annotation, or otherwise neglect to fix the sparse > errors thrown when dereferencing pointers to this memory. Provide > memremap() as a non __iomem annotated ioremap_*() in the case when > ioremap is otherwise a pointer to memory. Ok so a special use case. > Outside of ioremap() and > ioremap_nocache(), the expectation is that most calls to > ioremap_<type>() are seeking memory-like semantics (e.g. speculative > reads, and prefetching permitted). These callsites can be moved to > memremap() over time. Such memory-like smantics are not well defined yet and this has caused issues over expectations over a slew of APIs. As you note above your own defined 'semantics' so far for memremap are just that there are no i/o side effects, when the mapped memory is just a pointer to memory, as such I do not think its fair to set the excpectations that we'll switch all other ioremap_*() callers to the same memremap() API. Now, it may be a good idea to use something similar, ie, to pass flags, but setting the expectations outright to move to memremap() without having any agreement having been made over semantics seems uncalled for at this point in time, specially when you are noting that the expectations for both sets of calls are different. So perhaps: " Outside of ioremap() and ioremap_nocache(), all other ioremap_<type>() variant calls are seeking memory-like semantics (e.g. speculative reads, and prefetching permitted) and all calls expecations currently differ depending on architecture. Once and if we get agreement on such semantics we may be able to move such ioremap_*() variant calls to a similar API where the semantics required are clearly spelled out and well defined and passed as arguments. " > diff --git a/kernel/memremap.c b/kernel/memremap.c > new file mode 100644 > index 000000000000..ba206fd11785 > --- /dev/null > +++ b/kernel/memremap.c > @@ -0,0 +1,82 @@ > +/* > + * Copyright(c) 2015 Intel Corporation. All rights reserved. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of version 2 of the GNU General Public License as > + * published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + */ > +#include <linux/types.h> > +#include <linux/io.h> > +#include <linux/mm.h> > + > +#ifndef ioremap_cache > +/* temporary while we convert existing ioremap_cache users to memremap */ > +__weak void __iomem *ioremap_cache(resource_size_t offset, unsigned long size) > +{ > + return ioremap(offset, size); > +} > +#endif > + > +/* > + * memremap() is "ioremap" for cases where it is known that the resource > + * being mapped does not have i/o side effects and the __iomem > + * annotation is not applicable. > + */ > +void *memremap(resource_size_t offset, size_t size, unsigned long flags) > +{ > + int is_ram = region_is_ram(offset, size); > + void *addr = NULL; > + > + if (is_ram < 0) { > + WARN_ONCE(1, "memremap attempted on mixed range %pa size: %zu\n", > + &offset, size); > + return NULL; > + } > + > + /* Try all mapping types requested until one returns non-NULL */ > + if (flags & MEMREMAP_CACHE) { > + flags &= ~MEMREMAP_CACHE; > + /* > + * MEMREMAP_CACHE is special in that it can be satisifed > + * from the direct map. Some archs depend on the > + * capability of memremap() to autodetect cases where > + * the requested range is potentially in "System RAM" > + */ > + if (is_ram) > + addr = __va(offset); The no MMU case seems to match this, can that be switch to this? Luis
On Sun, Jul 26, 2015 at 10:12 PM, Christoph Hellwig <hch@lst.de> wrote: > On Sun, Jul 26, 2015 at 10:49:39AM -0700, Dan Williams wrote: >> On Sun, Jul 26, 2015 at 10:25 AM, Christoph Hellwig <hch@lst.de> wrote: >> > On Fri, Jul 24, 2015 at 10:38:42PM -0400, Dan Williams wrote: >> >> The behavior change to return NULL on an unsupported request is reserved >> >> for a later patch. >> > >> > Why? >> >> This is for drivers like pmem that care about the mapping type. For >> example, if pmem can't get a cache-enabled mapping it is potentially >> putting the write durability of the persistent media at risk. > > I understand that part, but the question is why the old behavior > is retained for now and only changed later. Oh, because all we have at this point is ioremap_cache() which silently falls back. It's not until the introduction of arch_memremp() where we update the arch code to break that behavior. That said, I think it may be beneficial to allow a fallback if the user cares. So maybe memremap() can call plain ioremap() if MEMREMAP_STRICT is not set and none of the other mapping types are satisfied.
On Mon, Jul 27, 2015 at 4:17 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote: > On Fri, Jul 24, 2015 at 10:38:42PM -0400, Dan Williams wrote: >> Existing users of ioremap_cache() are mapping memory that is known in >> advance to not have i/o side effects. These users are forced to cast >> away the __iomem annotation, or otherwise neglect to fix the sparse >> errors thrown when dereferencing pointers to this memory. Provide >> memremap() as a non __iomem annotated ioremap_*() in the case when >> ioremap is otherwise a pointer to memory. > > Ok so a special use case. > >> Outside of ioremap() and >> ioremap_nocache(), the expectation is that most calls to >> ioremap_<type>() are seeking memory-like semantics (e.g. speculative >> reads, and prefetching permitted). These callsites can be moved to >> memremap() over time. > > Such memory-like smantics are not well defined yet and this has caused > issues over expectations over a slew of APIs. As you note above > your own defined 'semantics' so far for memremap are just that there are > no i/o side effects, when the mapped memory is just a pointer to memory, > as such I do not think its fair to set the excpectations that we'll > switch all other ioremap_*() callers to the same memremap() API. Now, > it may be a good idea to use something similar, ie, to pass flags, > but setting the expectations outright to move to memremap() without having > any agreement having been made over semantics seems uncalled for at this > point in time, specially when you are noting that the expectations for > both sets of calls are different. > > So perhaps: > > " > Outside of ioremap() and ioremap_nocache(), all other ioremap_<type>() > variant calls are seeking memory-like semantics (e.g. speculative > reads, and prefetching permitted) and all calls expecations currently > differ depending on architecture. Once and if we get agreement on such > semantics we may be able to move such ioremap_*() variant calls to > a similar API where the semantics required are clearly spelled out > and well defined and passed as arguments. I still think ioremap_wc(), and now ioremap_uc(), are special and are not obvious candidates for conversion to memremap(). > " > >> diff --git a/kernel/memremap.c b/kernel/memremap.c >> new file mode 100644 >> index 000000000000..ba206fd11785 >> --- /dev/null >> +++ b/kernel/memremap.c >> @@ -0,0 +1,82 @@ >> +/* >> + * Copyright(c) 2015 Intel Corporation. All rights reserved. >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of version 2 of the GNU General Public License as >> + * published by the Free Software Foundation. >> + * >> + * This program is distributed in the hope that it will be useful, but >> + * WITHOUT ANY WARRANTY; without even the implied warranty of >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + * General Public License for more details. >> + */ >> +#include <linux/types.h> >> +#include <linux/io.h> >> +#include <linux/mm.h> >> + >> +#ifndef ioremap_cache >> +/* temporary while we convert existing ioremap_cache users to memremap */ >> +__weak void __iomem *ioremap_cache(resource_size_t offset, unsigned long size) >> +{ >> + return ioremap(offset, size); >> +} >> +#endif >> + >> +/* >> + * memremap() is "ioremap" for cases where it is known that the resource >> + * being mapped does not have i/o side effects and the __iomem >> + * annotation is not applicable. >> + */ >> +void *memremap(resource_size_t offset, size_t size, unsigned long flags) >> +{ >> + int is_ram = region_is_ram(offset, size); >> + void *addr = NULL; >> + >> + if (is_ram < 0) { >> + WARN_ONCE(1, "memremap attempted on mixed range %pa size: %zu\n", >> + &offset, size); >> + return NULL; >> + } >> + >> + /* Try all mapping types requested until one returns non-NULL */ >> + if (flags & MEMREMAP_CACHE) { >> + flags &= ~MEMREMAP_CACHE; >> + /* >> + * MEMREMAP_CACHE is special in that it can be satisifed >> + * from the direct map. Some archs depend on the >> + * capability of memremap() to autodetect cases where >> + * the requested range is potentially in "System RAM" >> + */ >> + if (is_ram) >> + addr = __va(offset); > > The no MMU case seems to match this, can that be switch to this? Hmm, it may be possible to consolidate all the NOMMU cases here. That's a good incremental cleanup once these base patches are merged.
On Mon, Jul 27, 2015 at 04:31:09PM -0700, Dan Williams wrote: > On Mon, Jul 27, 2015 at 4:17 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote: > > On Fri, Jul 24, 2015 at 10:38:42PM -0400, Dan Williams wrote: > >> Existing users of ioremap_cache() are mapping memory that is known in > >> advance to not have i/o side effects. These users are forced to cast > >> away the __iomem annotation, or otherwise neglect to fix the sparse > >> errors thrown when dereferencing pointers to this memory. Provide > >> memremap() as a non __iomem annotated ioremap_*() in the case when > >> ioremap is otherwise a pointer to memory. > > > > Ok so a special use case. > > > >> Outside of ioremap() and > >> ioremap_nocache(), the expectation is that most calls to > >> ioremap_<type>() are seeking memory-like semantics (e.g. speculative > >> reads, and prefetching permitted). These callsites can be moved to > >> memremap() over time. > > > > Such memory-like smantics are not well defined yet and this has caused > > issues over expectations over a slew of APIs. As you note above > > your own defined 'semantics' so far for memremap are just that there are > > no i/o side effects, when the mapped memory is just a pointer to memory, > > as such I do not think its fair to set the excpectations that we'll > > switch all other ioremap_*() callers to the same memremap() API. Now, > > it may be a good idea to use something similar, ie, to pass flags, > > but setting the expectations outright to move to memremap() without having > > any agreement having been made over semantics seems uncalled for at this > > point in time, specially when you are noting that the expectations for > > both sets of calls are different. > > > > So perhaps: > > > > " > > Outside of ioremap() and ioremap_nocache(), all other ioremap_<type>() > > variant calls are seeking memory-like semantics (e.g. speculative > > reads, and prefetching permitted) and all calls expecations currently > > differ depending on architecture. Once and if we get agreement on such > > semantics we may be able to move such ioremap_*() variant calls to > > a similar API where the semantics required are clearly spelled out > > and well defined and passed as arguments. > > I still think ioremap_wc(), and now ioremap_uc(), are special and are > not obvious candidates for conversion to memremap(). OK great, then we're in strong agreement, so removing the "Outside of ioremap() and... over time" might be best then? Otherwise what I posted seems to reflect the state of affairs better? > >> +void *memremap(resource_size_t offset, size_t size, unsigned long flags) > >> +{ > >> + int is_ram = region_is_ram(offset, size); > >> + void *addr = NULL; > >> + > >> + if (is_ram < 0) { > >> + WARN_ONCE(1, "memremap attempted on mixed range %pa size: %zu\n", > >> + &offset, size); > >> + return NULL; > >> + } > >> + > >> + /* Try all mapping types requested until one returns non-NULL */ > >> + if (flags & MEMREMAP_CACHE) { > >> + flags &= ~MEMREMAP_CACHE; > >> + /* > >> + * MEMREMAP_CACHE is special in that it can be satisifed > >> + * from the direct map. Some archs depend on the > >> + * capability of memremap() to autodetect cases where > >> + * the requested range is potentially in "System RAM" > >> + */ > >> + if (is_ram) > >> + addr = __va(offset); > > > > The no MMU case seems to match this, can that be switch to this? > > Hmm, it may be possible to consolidate all the NOMMU cases here. > That's a good incremental cleanup once these base patches are merged. Sure fine by me. Luis
On Mon, Jul 27, 2015 at 4:43 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote: > On Mon, Jul 27, 2015 at 04:31:09PM -0700, Dan Williams wrote: >> On Mon, Jul 27, 2015 at 4:17 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote: >> > On Fri, Jul 24, 2015 at 10:38:42PM -0400, Dan Williams wrote: >> >> Existing users of ioremap_cache() are mapping memory that is known in >> >> advance to not have i/o side effects. These users are forced to cast >> >> away the __iomem annotation, or otherwise neglect to fix the sparse >> >> errors thrown when dereferencing pointers to this memory. Provide >> >> memremap() as a non __iomem annotated ioremap_*() in the case when >> >> ioremap is otherwise a pointer to memory. >> > >> > Ok so a special use case. >> > >> >> Outside of ioremap() and >> >> ioremap_nocache(), the expectation is that most calls to >> >> ioremap_<type>() are seeking memory-like semantics (e.g. speculative >> >> reads, and prefetching permitted). These callsites can be moved to >> >> memremap() over time. >> > >> > Such memory-like smantics are not well defined yet and this has caused >> > issues over expectations over a slew of APIs. As you note above >> > your own defined 'semantics' so far for memremap are just that there are >> > no i/o side effects, when the mapped memory is just a pointer to memory, >> > as such I do not think its fair to set the excpectations that we'll >> > switch all other ioremap_*() callers to the same memremap() API. Now, >> > it may be a good idea to use something similar, ie, to pass flags, >> > but setting the expectations outright to move to memremap() without having >> > any agreement having been made over semantics seems uncalled for at this >> > point in time, specially when you are noting that the expectations for >> > both sets of calls are different. >> > >> > So perhaps: >> > >> > " >> > Outside of ioremap() and ioremap_nocache(), all other ioremap_<type>() >> > variant calls are seeking memory-like semantics (e.g. speculative >> > reads, and prefetching permitted) and all calls expecations currently >> > differ depending on architecture. Once and if we get agreement on such >> > semantics we may be able to move such ioremap_*() variant calls to >> > a similar API where the semantics required are clearly spelled out >> > and well defined and passed as arguments. >> >> I still think ioremap_wc(), and now ioremap_uc(), are special and are >> not obvious candidates for conversion to memremap(). > > OK great, then we're in strong agreement, so removing the "Outside of > ioremap() and... over time" might be best then? Otherwise what I posted > seems to reflect the state of affairs better? > Ah yes, I need to clean that up. Thanks!
On Mon, Jul 27, 2015 at 04:26:03PM -0700, Dan Williams wrote: > Oh, because all we have at this point is ioremap_cache() which > silently falls back. It's not until the introduction of > arch_memremp() where we update the arch code to break that behavior. Ok, makes sense. Might be worth to document in the changelog. > That said, I think it may be beneficial to allow a fallback if the > user cares. So maybe memremap() can call plain ioremap() if > MEMREMAP_STRICT is not set and none of the other mapping types are > satisfied. Is there a real use case for it? Fallback APIs always seem confusing and it might make more sense to do this in the caller(s) that actually need it.
On Wed, Jul 29, 2015 at 08:50:04AM +0200, Christoph Hellwig wrote: > On Mon, Jul 27, 2015 at 04:26:03PM -0700, Dan Williams wrote: > > Oh, because all we have at this point is ioremap_cache() which > > silently falls back. It's not until the introduction of > > arch_memremp() where we update the arch code to break that behavior. > > Ok, makes sense. Might be worth to document in the changelog. > > > That said, I think it may be beneficial to allow a fallback if the > > user cares. So maybe memremap() can call plain ioremap() if > > MEMREMAP_STRICT is not set and none of the other mapping types are > > satisfied. > > Is there a real use case for it? Fallback APIs always seem confusing > and it might make more sense to do this in the caller(s) that actually > need it. It seems semantics-wise we are trying to separate these two really, so I agree with this. Having a fallback would onloy make things more complicated for any sanitizer / checker / etc, and I don't think the practical gains of having a fallback outweight the gains of having a clear semantic separation on intended memory type and interactions with it. Luis
On Wed, Jul 29, 2015 at 11:27 AM, Luis R. Rodriguez <mcgrof@suse.com> wrote: > On Wed, Jul 29, 2015 at 08:50:04AM +0200, Christoph Hellwig wrote: >> On Mon, Jul 27, 2015 at 04:26:03PM -0700, Dan Williams wrote: >> > Oh, because all we have at this point is ioremap_cache() which >> > silently falls back. It's not until the introduction of >> > arch_memremp() where we update the arch code to break that behavior. >> >> Ok, makes sense. Might be worth to document in the changelog. >> >> > That said, I think it may be beneficial to allow a fallback if the >> > user cares. So maybe memremap() can call plain ioremap() if >> > MEMREMAP_STRICT is not set and none of the other mapping types are >> > satisfied. >> >> Is there a real use case for it? Fallback APIs always seem confusing >> and it might make more sense to do this in the caller(s) that actually >> need it. > > It seems semantics-wise we are trying to separate these two really, so I agree > with this. Having a fallback would onloy make things more complicated for any > sanitizer / checker / etc, and I don't think the practical gains of having a > fallback outweight the gains of having a clear semantic separation on intended > memory type and interactions with it. > Yup, consider it dropped. Drivers that want fallback behavior can do it explicitly.
On Wed, 2015-07-29 at 11:33 -0700, Dan Williams wrote: > On Wed, Jul 29, 2015 at 11:27 AM, Luis R. Rodriguez <mcgrof@suse.com> > wrote: > > On Wed, Jul 29, 2015 at 08:50:04AM +0200, Christoph Hellwig wrote: > > > On Mon, Jul 27, 2015 at 04:26:03PM -0700, Dan Williams wrote: > > > > Oh, because all we have at this point is ioremap_cache() which > > > > silently falls back. It's not until the introduction of > > > > arch_memremp() where we update the arch code to break that > > > > behavior. > > > > > > Ok, makes sense. Might be worth to document in the changelog. > > > > > > > That said, I think it may be beneficial to allow a fallback if the > > > > user cares. So maybe memremap() can call plain ioremap() if > > > > MEMREMAP_STRICT is not set and none of the other mapping types are > > > > satisfied. > > > > > > Is there a real use case for it? Fallback APIs always seem confusing > > > and it might make more sense to do this in the caller(s) that > > > actually > > > need it. > > > > It seems semantics-wise we are trying to separate these two really, so > > I agree > > with this. Having a fallback would onloy make things more complicated > > for any > > sanitizer / checker / etc, and I don't think the practical gains of > > having a > > fallback outweight the gains of having a clear semantic separation on > > intended > > memory type and interactions with it. > > > > Yup, consider it dropped. Drivers that want fallback behavior can do > it explicitly. I agree in general. However, for the drivers to be able to fall back properly, they will need to know the cache type they can fall back with. ioremap() falls back to the cache type of an existing mapping to avoid aliasing. Thanks, -Toshi
On Wed, 2015-07-29 at 15:00 -0600, Toshi Kani wrote: > On Wed, 2015-07-29 at 11:33 -0700, Dan Williams wrote: > > On Wed, Jul 29, 2015 at 11:27 AM, Luis R. Rodriguez <mcgrof@suse.com> > > wrote: > > > On Wed, Jul 29, 2015 at 08:50:04AM +0200, Christoph Hellwig wrote: > > > > On Mon, Jul 27, 2015 at 04:26:03PM -0700, Dan Williams wrote: > > > > > Oh, because all we have at this point is ioremap_cache() which > > > > > silently falls back. It's not until the introduction of > > > > > arch_memremp() where we update the arch code to break that > > > > > behavior. > > > > > > > > Ok, makes sense. Might be worth to document in the changelog. > > > > > > > > > That said, I think it may be beneficial to allow a fallback if > > > > > the > > > > > user cares. So maybe memremap() can call plain ioremap() if > > > > > MEMREMAP_STRICT is not set and none of the other mapping types > > > > > are > > > > > satisfied. > > > > > > > > Is there a real use case for it? Fallback APIs always seem > > > > confusing > > > > and it might make more sense to do this in the caller(s) that > > > > actually > > > > need it. > > > > > > It seems semantics-wise we are trying to separate these two really, > > > so > > > I agree > > > with this. Having a fallback would onloy make things more complicated > > > > > > for any > > > sanitizer / checker / etc, and I don't think the practical gains of > > > having a > > > fallback outweight the gains of having a clear semantic separation on > > > > > > intended > > > memory type and interactions with it. > > > > > > > Yup, consider it dropped. Drivers that want fallback behavior can do > > it explicitly. > > I agree in general. However, for the drivers to be able to fall back > properly, they will need to know the cache type they can fall back with. > ioremap() falls back to the cache type of an existing mapping to avoid > aliasing. Never mind... In this context, we are talking about fallback in case of " API not implemented" on arch. I was talking about fallback in case of aliasing. Thanks, -Toshi
On Wed, Jul 29, 2015 at 03:00:38PM -0600, Toshi Kani wrote: > On Wed, 2015-07-29 at 11:33 -0700, Dan Williams wrote: > > On Wed, Jul 29, 2015 at 11:27 AM, Luis R. Rodriguez <mcgrof@suse.com> > > wrote: > > > On Wed, Jul 29, 2015 at 08:50:04AM +0200, Christoph Hellwig wrote: > > > > On Mon, Jul 27, 2015 at 04:26:03PM -0700, Dan Williams wrote: > > > > > Oh, because all we have at this point is ioremap_cache() which > > > > > silently falls back. It's not until the introduction of > > > > > arch_memremp() where we update the arch code to break that > > > > > behavior. > > > > > > > > Ok, makes sense. Might be worth to document in the changelog. > > > > > > > > > That said, I think it may be beneficial to allow a fallback if the > > > > > user cares. So maybe memremap() can call plain ioremap() if > > > > > MEMREMAP_STRICT is not set and none of the other mapping types are > > > > > satisfied. > > > > > > > > Is there a real use case for it? Fallback APIs always seem confusing > > > > and it might make more sense to do this in the caller(s) that > > > > actually > > > > need it. > > > > > > It seems semantics-wise we are trying to separate these two really, so > > > I agree > > > with this. Having a fallback would onloy make things more complicated > > > for any > > > sanitizer / checker / etc, and I don't think the practical gains of > > > having a > > > fallback outweight the gains of having a clear semantic separation on > > > intended > > > memory type and interactions with it. > > > > > > > Yup, consider it dropped. Drivers that want fallback behavior can do > > it explicitly. > > I agree in general. However, for the drivers to be able to fall back > properly, they will need to know the cache type they can fall back with. That would depend on the purpose of the region and the driver developer should in theory know best. One issue with this of course is that, as we've discovered, the semantics of on the ioremap*() variant front regarding cache types is not clearly well defined, or at least it may be only well defined implicitly on x86 only, so the driver developer can only *hope* for the best across architectures. Our ambiguity on our semantics on ioremap*() variants therefore means driver developers can resonably be puzzled by what fallbacks to use. That also means architectures maintainers should not whip driver developers for any misuse. Such considerations are why although we're now revisiting semantics for ioremap*() variants I was in hopes we could be at least somewhat pedantic about memremap() semantics. For instance since memremap() only has 2 types right now can a respective fallback be documented as an alternative to help with this ? Or can we not generalize this ? One for MEMREMAP_WB and one for MEMREMAP_WT ? > ioremap() falls back to the cache type of an existing mapping to avoid > aliasing. Does that assume an existing ioremap*() call was used on a bigger range? Do you know if that happens to only be the case for x86 (I'd think so) or if its the same for other architectures ? While at it, Dan, will / should memremap() support overlapping calls ? What is the expectation on behaviour ? PS: I did see you reply about this being about lacking an arch implementation for a memremap() type, not a cache type, but as the driver uses one memremap() type the goal for a region is just as important as the resulting type. Luis
On Wed, Jul 29, 2015 at 2:43 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote: > While at it, Dan, will / should memremap() support overlapping calls ? > What is the expectation on behaviour ? It will be the same as ioremap(). Each mapping gets its own virtual address and overlapping calls are permitted. The only protection is that RAM is never mapped with an alias.
On Wed, Jul 29, 2015 at 02:47:40PM -0700, Dan Williams wrote: > On Wed, Jul 29, 2015 at 2:43 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote: > > While at it, Dan, will / should memremap() support overlapping calls ? > > What is the expectation on behaviour ? > > It will be the same as ioremap(). Each mapping gets its own virtual > address and overlapping calls are permitted. The only protection is > that RAM is never mapped with an alias. If you could document this as part of your next respin that'd be highly appreciated. Luis
On Wed, 2015-07-29 at 23:43 +0200, Luis R. Rodriguez wrote: > On Wed, Jul 29, 2015 at 03:00:38PM -0600, Toshi Kani wrote: > > On Wed, 2015-07-29 at 11:33 -0700, Dan Williams wrote: > > > On Wed, Jul 29, 2015 at 11:27 AM, Luis R. Rodriguez <mcgrof@suse.com> > > > wrote: > > > > On Wed, Jul 29, 2015 at 08:50:04AM +0200, Christoph Hellwig wrote: > > > > > On Mon, Jul 27, 2015 at 04:26:03PM -0700, Dan Williams wrote: > > > > > > Oh, because all we have at this point is ioremap_cache() which > > > > > > silently falls back. It's not until the introduction of > > > > > > arch_memremp() where we update the arch code to break that > > > > > > behavior. > > > > > > > > > > Ok, makes sense. Might be worth to document in the changelog. > > > > > > > > > > > That said, I think it may be beneficial to allow a fallback if > > > > > > the user cares. So maybe memremap() can call plain ioremap() if > > > > > > MEMREMAP_STRICT is not set and none of the other mapping types > > > > > > are satisfied. > > > > > > > > > > Is there a real use case for it? Fallback APIs always seem > > > > > confusing and it might make more sense to do this in the caller(s) > > > > > that actually need it. > > > > > > > > It seems semantics-wise we are trying to separate these two really, > > > > so I agree with this. Having a fallback would onloy make things more > > > > complicated for any sanitizer / checker / etc, and I don't think the > > > > practical gains of having a fallback outweight the gains of having a > > > > clear semantic separation on intended memory type and interactions > > > > with it. > > > > > > > > > > Yup, consider it dropped. Drivers that want fallback behavior can do > > > it explicitly. > > > > I agree in general. However, for the drivers to be able to fall back > > properly, they will need to know the cache type they can fall back with. > > > > That would depend on the purpose of the region and the driver developer > should in theory know best. One issue with this of course is that, as > we've discovered, the semantics of on the ioremap*() variant front > regarding cache types is not clearly well defined, or at least it may be > only well defined implicitly on x86 only, so the driver developer can only > *hope* for the best across architectures. Our ambiguity on our semantics > on ioremap*() variants therefore means driver developers can resonably be > puzzled by what fallbacks to use. That also means architectures > maintainers should not whip driver developers for any misuse. Such > considerations are why although we're now revisiting semantics for > ioremap*() variants I was in hopes we could be at least somewhat > pedantic about memremap() semantics. I agree. However, there are a few exceptions like /dev/mem, which can map a target range without knowledge. > For instance since memremap() only has 2 types right now can a respective > fallback be documented as an alternative to help with this ? Or can we not > generalize this ? One for MEMREMAP_WB and one for MEMREMAP_WT ? Yes, if a target range can be only mapped by memremap(). However, there is no restriction that a range can be mapped with a single interface alone. For example, a range can be mapped with remap_pfn_range() to user space with any cache type. So, in theory, memremap() can overlap with any other types. > > ioremap() falls back to the cache type of an existing mapping to avoid > > aliasing. > > Does that assume an existing ioremap*() call was used on a bigger range? > Do you know if that happens to only be the case for x86 (I'd think so) > or if its the same for other architectures ? In the /dev/mem example, it is remap_pfn_range(). I think other archs have the same issue, but I do not know if they fall back in case of overlapping call. > While at it, Dan, will / should memremap() support overlapping calls ? > What is the expectation on behaviour ? > > PS: I did see you reply about this being about lacking an arch > implementation for a memremap() type, not a cache type, but as the driver > uses one memremap() type the goal for a region is just as important as the > resulting type. Agreed. Drivers cannot tell if a fallback is due to lacking implementation of overlapping, unless they check with #ifdef ioremap_xxx. Thanks, -Toshi
On Wed, Jul 29, 2015 at 06:00:04PM -0600, Toshi Kani wrote: > On Wed, 2015-07-29 at 23:43 +0200, Luis R. Rodriguez wrote: > > On Wed, Jul 29, 2015 at 03:00:38PM -0600, Toshi Kani wrote: > > > On Wed, 2015-07-29 at 11:33 -0700, Dan Williams wrote: > > > > On Wed, Jul 29, 2015 at 11:27 AM, Luis R. Rodriguez <mcgrof@suse.com> > > > > wrote: > > > > > On Wed, Jul 29, 2015 at 08:50:04AM +0200, Christoph Hellwig wrote: > > > > > > On Mon, Jul 27, 2015 at 04:26:03PM -0700, Dan Williams wrote: > > > > > > > Oh, because all we have at this point is ioremap_cache() which > > > > > > > silently falls back. It's not until the introduction of > > > > > > > arch_memremp() where we update the arch code to break that > > > > > > > behavior. > > > > > > > > > > > > Ok, makes sense. Might be worth to document in the changelog. > > > > > > > > > > > > > That said, I think it may be beneficial to allow a fallback if > > > > > > > the user cares. So maybe memremap() can call plain ioremap() if > > > > > > > MEMREMAP_STRICT is not set and none of the other mapping types > > > > > > > are satisfied. > > > > > > > > > > > > Is there a real use case for it? Fallback APIs always seem > > > > > > confusing and it might make more sense to do this in the caller(s) > > > > > > that actually need it. > > > > > > > > > > It seems semantics-wise we are trying to separate these two really, > > > > > so I agree with this. Having a fallback would onloy make things more > > > > > complicated for any sanitizer / checker / etc, and I don't think the > > > > > practical gains of having a fallback outweight the gains of having a > > > > > clear semantic separation on intended memory type and interactions > > > > > with it. > > > > > > > > > > > > > Yup, consider it dropped. Drivers that want fallback behavior can do > > > > it explicitly. > > > > > > I agree in general. However, for the drivers to be able to fall back > > > properly, they will need to know the cache type they can fall back with. > > > > > > > That would depend on the purpose of the region and the driver developer > > should in theory know best. One issue with this of course is that, as > > we've discovered, the semantics of on the ioremap*() variant front > > regarding cache types is not clearly well defined, or at least it may be > > only well defined implicitly on x86 only, so the driver developer can only > > *hope* for the best across architectures. Our ambiguity on our semantics > > on ioremap*() variants therefore means driver developers can resonably be > > puzzled by what fallbacks to use. That also means architectures > > maintainers should not whip driver developers for any misuse. Such > > considerations are why although we're now revisiting semantics for > > ioremap*() variants I was in hopes we could be at least somewhat > > pedantic about memremap() semantics. > > I agree. However, there are a few exceptions like /dev/mem, which can map a > target range without knowledge. Still, the expectation to require support for overlapping ioremap() calls would seem to be more of an exception than the norm, so I'd argue that requiring callers who know they do need overlapping support to be explicit about it may help us simplify expecations on semantics in other areas of the kernel. > > For instance since memremap() only has 2 types right now can a respective > > fallback be documented as an alternative to help with this ? Or can we not > > generalize this ? One for MEMREMAP_WB and one for MEMREMAP_WT ? > > Yes, if a target range can be only mapped by memremap(). However, there is > no restriction that a range can be mapped with a single interface alone. > For example, a range can be mapped with remap_pfn_range() to user space > with any cache type. So, in theory, memremap() can overlap with any other > types. Shouldn't that be an issue or area of concern ? It seems the flakiness on ioremap() and its wide array flexibility would spill over the any semantics which folks would be trying to set out with memremap(). That should not stop the evolution of memremap() though, just pointing out that perhaps we should be a bit more restrictive over how things can criss-cross and who areas can do it. > > > ioremap() falls back to the cache type of an existing mapping to avoid > > > aliasing. > > > > Does that assume an existing ioremap*() call was used on a bigger range? > > Do you know if that happens to only be the case for x86 (I'd think so) > > or if its the same for other architectures ? > > In the /dev/mem example, it is remap_pfn_range(). I think other archs have > the same issue, but I do not know if they fall back in case of overlapping > call. What should happen if remap_pfn_range() was used with pgprot_writecombine() and later memremap() is used for instance with a smaller overlappin section, or perhaps bigger? Luis
On Tue, 2015-08-11 at 23:30 +0200, Luis R. Rodriguez wrote: > On Wed, Jul 29, 2015 at 06:00:04PM -0600, Toshi Kani wrote: > > On Wed, 2015-07-29 at 23:43 +0200, Luis R. Rodriguez wrote: > > > On Wed, Jul 29, 2015 at 03:00:38PM -0600, Toshi Kani wrote: > > > > On Wed, 2015-07-29 at 11:33 -0700, Dan Williams wrote: : > > > That would depend on the purpose of the region and the driver > > > developer should in theory know best. One issue with this of course is > > > that, as we've discovered, the semantics of on the ioremap*() variant > > > front regarding cache types is not clearly well defined, or at least > > > it may be only well defined implicitly on x86 only, so the driver > > > developer can only *hope* for the best across architectures. Our > > > ambiguity on our semantics on ioremap*() variants therefore means > > > driver developers can resonably be puzzled by what fallbacks to use. > > > That also means architectures maintainers should not whip driver > > > developers for any misuse. Such considerations are why although we're > > > now revisiting semantics for ioremap*() variants I was in hopes we > > > could be at least somewhat pedantic about memremap() semantics. > > > > I agree. However, there are a few exceptions like /dev/mem, which can > > map a target range without knowledge. > > Still, the expectation to require support for overlapping ioremap() calls > would seem to be more of an exception than the norm, so I'd argue that > requiring callers who know they do need overlapping support to be explicit > about it may help us simplify expecations on semantics in other areas of > the kernel. Again, I agree. I am simply saying that the fallback in an overlapping case may need to remain supported for such exceptional cases, possibly with a separate interface. > > > For instance since memremap() only has 2 types right now can a > > > respective fallback be documented as an alternative to help with this > > > ? Or can we not generalize this ? One for MEMREMAP_WB and one for > > > MEMREMAP_WT ? > > > > Yes, if a target range can be only mapped by memremap(). However, there > > is no restriction that a range can be mapped with a single interface > > alone. For example, a range can be mapped with remap_pfn_range() to > > user space with any cache type. So, in theory, memremap() can overlap > > with any other types. > > Shouldn't that be an issue or area of concern ? It seems the flakiness on > ioremap() and its wide array flexibility would spill over the any > semantics which folks would be trying to set out with memremap(). That > should not stop the evolution of memremap() though, just pointing out that > perhaps we should be a bit more restrictive over how things can criss > -cross and who areas can do it. reserve_pfn_range() allows the caller to specify if the fallback needs to be enabled or disabled with 'strict_prot'. track_pfn_remap() called from remap_pfn_range() enables it, and track_pfn_copy() disables it. I think we can do similar for the memremap and ioremap families as well. The fallback could be set disabled in the regular interfaces, and enabled in some interface as necessary. This also allows gradual transition, ex. disable in memremap while ioremap remains enabled for now. > > > > ioremap() falls back to the cache type of an existing mapping to > > > > avoid aliasing. > > > > > > Does that assume an existing ioremap*() call was used on a bigger > > > range? Do you know if that happens to only be the case for x86 (I'd > > > think so) or if its the same for other architectures ? > > > > In the /dev/mem example, it is remap_pfn_range(). I think other archs > > have the same issue, but I do not know if they fall back in case of > > overlapping call. > > What should happen if remap_pfn_range() was used with > pgprot_writecombine() and later memremap() is used for instance with a > smaller overlappin section, or perhaps bigger? With the fallback disabled, memremap() should fail in this case. Thanks, -Toshi
On Tue, Aug 11, 2015 at 3:40 PM, Toshi Kani <toshi.kani@hp.com> wrote: > On Tue, 2015-08-11 at 23:30 +0200, Luis R. Rodriguez wrote: >> On Wed, Jul 29, 2015 at 06:00:04PM -0600, Toshi Kani wrote: >> > On Wed, 2015-07-29 at 23:43 +0200, Luis R. Rodriguez wrote: >> > > On Wed, Jul 29, 2015 at 03:00:38PM -0600, Toshi Kani wrote: >> > > > On Wed, 2015-07-29 at 11:33 -0700, Dan Williams wrote: > : >> > > That would depend on the purpose of the region and the driver >> > > developer should in theory know best. One issue with this of course is >> > > that, as we've discovered, the semantics of on the ioremap*() variant >> > > front regarding cache types is not clearly well defined, or at least >> > > it may be only well defined implicitly on x86 only, so the driver >> > > developer can only *hope* for the best across architectures. Our >> > > ambiguity on our semantics on ioremap*() variants therefore means >> > > driver developers can resonably be puzzled by what fallbacks to use. >> > > That also means architectures maintainers should not whip driver >> > > developers for any misuse. Such considerations are why although we're >> > > now revisiting semantics for ioremap*() variants I was in hopes we >> > > could be at least somewhat pedantic about memremap() semantics. >> > >> > I agree. However, there are a few exceptions like /dev/mem, which can >> > map a target range without knowledge. >> >> Still, the expectation to require support for overlapping ioremap() calls >> would seem to be more of an exception than the norm, so I'd argue that >> requiring callers who know they do need overlapping support to be explicit >> about it may help us simplify expecations on semantics in other areas of >> the kernel. > > Again, I agree. I am simply saying that the fallback in an overlapping case > may need to remain supported for such exceptional cases, possibly with a > separate interface. Great. >> > > For instance since memremap() only has 2 types right now can a >> > > respective fallback be documented as an alternative to help with this >> > > ? Or can we not generalize this ? One for MEMREMAP_WB and one for >> > > MEMREMAP_WT ? >> > >> > Yes, if a target range can be only mapped by memremap(). However, there >> > is no restriction that a range can be mapped with a single interface >> > alone. For example, a range can be mapped with remap_pfn_range() to >> > user space with any cache type. So, in theory, memremap() can overlap >> > with any other types. >> >> Shouldn't that be an issue or area of concern ? It seems the flakiness on >> ioremap() and its wide array flexibility would spill over the any >> semantics which folks would be trying to set out with memremap(). That >> should not stop the evolution of memremap() though, just pointing out that >> perhaps we should be a bit more restrictive over how things can criss >> -cross and who areas can do it. > > reserve_pfn_range() allows the caller to specify if the fallback needs to be > enabled or disabled with 'strict_prot'. track_pfn_remap() called from > remap_pfn_range() enables it, and track_pfn_copy() disables it. I think we > can do similar for the memremap and ioremap families as well. The fallback > could be set disabled in the regular interfaces, and enabled in some > interface as necessary. This also allows gradual transition, ex. disable in > memremap while ioremap remains enabled for now. Sounds sexy to me. >> > > > ioremap() falls back to the cache type of an existing mapping to >> > > > avoid aliasing. >> > > >> > > Does that assume an existing ioremap*() call was used on a bigger >> > > range? Do you know if that happens to only be the case for x86 (I'd >> > > think so) or if its the same for other architectures ? >> > >> > In the /dev/mem example, it is remap_pfn_range(). I think other archs >> > have the same issue, but I do not know if they fall back in case of >> > overlapping call. >> >> What should happen if remap_pfn_range() was used with >> pgprot_writecombine() and later memremap() is used for instance with a >> smaller overlappin section, or perhaps bigger? > > With the fallback disabled, memremap() should fail in this case. Excellent. Luis
On Tue, Aug 11, 2015 at 3:52 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote: > On Tue, Aug 11, 2015 at 3:40 PM, Toshi Kani <toshi.kani@hp.com> wrote: >> On Tue, 2015-08-11 at 23:30 +0200, Luis R. Rodriguez wrote: >>> On Wed, Jul 29, 2015 at 06:00:04PM -0600, Toshi Kani wrote: >>> > On Wed, 2015-07-29 at 23:43 +0200, Luis R. Rodriguez wrote: >>> > > On Wed, Jul 29, 2015 at 03:00:38PM -0600, Toshi Kani wrote: >>> > > > On Wed, 2015-07-29 at 11:33 -0700, Dan Williams wrote: >> : >>> > > That would depend on the purpose of the region and the driver >>> > > developer should in theory know best. One issue with this of course is >>> > > that, as we've discovered, the semantics of on the ioremap*() variant >>> > > front regarding cache types is not clearly well defined, or at least >>> > > it may be only well defined implicitly on x86 only, so the driver >>> > > developer can only *hope* for the best across architectures. Our >>> > > ambiguity on our semantics on ioremap*() variants therefore means >>> > > driver developers can resonably be puzzled by what fallbacks to use. >>> > > That also means architectures maintainers should not whip driver >>> > > developers for any misuse. Such considerations are why although we're >>> > > now revisiting semantics for ioremap*() variants I was in hopes we >>> > > could be at least somewhat pedantic about memremap() semantics. >>> > >>> > I agree. However, there are a few exceptions like /dev/mem, which can >>> > map a target range without knowledge. >>> >>> Still, the expectation to require support for overlapping ioremap() calls >>> would seem to be more of an exception than the norm, so I'd argue that >>> requiring callers who know they do need overlapping support to be explicit >>> about it may help us simplify expecations on semantics in other areas of >>> the kernel. >> >> Again, I agree. I am simply saying that the fallback in an overlapping case >> may need to remain supported for such exceptional cases, possibly with a >> separate interface. > > Great. > >>> > > For instance since memremap() only has 2 types right now can a >>> > > respective fallback be documented as an alternative to help with this >>> > > ? Or can we not generalize this ? One for MEMREMAP_WB and one for >>> > > MEMREMAP_WT ? >>> > >>> > Yes, if a target range can be only mapped by memremap(). However, there >>> > is no restriction that a range can be mapped with a single interface >>> > alone. For example, a range can be mapped with remap_pfn_range() to >>> > user space with any cache type. So, in theory, memremap() can overlap >>> > with any other types. >>> >>> Shouldn't that be an issue or area of concern ? It seems the flakiness on >>> ioremap() and its wide array flexibility would spill over the any >>> semantics which folks would be trying to set out with memremap(). That >>> should not stop the evolution of memremap() though, just pointing out that >>> perhaps we should be a bit more restrictive over how things can criss >>> -cross and who areas can do it. >> >> reserve_pfn_range() allows the caller to specify if the fallback needs to be >> enabled or disabled with 'strict_prot'. track_pfn_remap() called from >> remap_pfn_range() enables it, and track_pfn_copy() disables it. I think we >> can do similar for the memremap and ioremap families as well. The fallback >> could be set disabled in the regular interfaces, and enabled in some >> interface as necessary. This also allows gradual transition, ex. disable in >> memremap while ioremap remains enabled for now. > > Sounds sexy to me. Cool, sounds like something we can tackle in 4.4 along with the ioremap_cache removal cleanups.
On Wed, Aug 12, 2015 at 12:13 AM, Dan Williams <dan.j.williams@intel.com> wrote: > On Tue, Aug 11, 2015 at 3:52 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote: >> On Tue, Aug 11, 2015 at 3:40 PM, Toshi Kani <toshi.kani@hp.com> wrote: >>> On Tue, 2015-08-11 at 23:30 +0200, Luis R. Rodriguez wrote: >>>> On Wed, Jul 29, 2015 at 06:00:04PM -0600, Toshi Kani wrote: >>>> > On Wed, 2015-07-29 at 23:43 +0200, Luis R. Rodriguez wrote: >>>> > > On Wed, Jul 29, 2015 at 03:00:38PM -0600, Toshi Kani wrote: >>>> > > > On Wed, 2015-07-29 at 11:33 -0700, Dan Williams wrote: >>> : >>>> > > That would depend on the purpose of the region and the driver >>>> > > developer should in theory know best. One issue with this of course is >>>> > > that, as we've discovered, the semantics of on the ioremap*() variant >>>> > > front regarding cache types is not clearly well defined, or at least >>>> > > it may be only well defined implicitly on x86 only, so the driver >>>> > > developer can only *hope* for the best across architectures. Our >>>> > > ambiguity on our semantics on ioremap*() variants therefore means >>>> > > driver developers can resonably be puzzled by what fallbacks to use. >>>> > > That also means architectures maintainers should not whip driver >>>> > > developers for any misuse. Such considerations are why although we're >>>> > > now revisiting semantics for ioremap*() variants I was in hopes we >>>> > > could be at least somewhat pedantic about memremap() semantics. >>>> > >>>> > I agree. However, there are a few exceptions like /dev/mem, which can >>>> > map a target range without knowledge. >>>> >>>> Still, the expectation to require support for overlapping ioremap() calls >>>> would seem to be more of an exception than the norm, so I'd argue that >>>> requiring callers who know they do need overlapping support to be explicit >>>> about it may help us simplify expecations on semantics in other areas of >>>> the kernel. >>> >>> Again, I agree. I am simply saying that the fallback in an overlapping case >>> may need to remain supported for such exceptional cases, possibly with a >>> separate interface. >> >> Great. >> >>>> > > For instance since memremap() only has 2 types right now can a >>>> > > respective fallback be documented as an alternative to help with this >>>> > > ? Or can we not generalize this ? One for MEMREMAP_WB and one for >>>> > > MEMREMAP_WT ? >>>> > >>>> > Yes, if a target range can be only mapped by memremap(). However, there >>>> > is no restriction that a range can be mapped with a single interface >>>> > alone. For example, a range can be mapped with remap_pfn_range() to >>>> > user space with any cache type. So, in theory, memremap() can overlap >>>> > with any other types. >>>> >>>> Shouldn't that be an issue or area of concern ? It seems the flakiness on >>>> ioremap() and its wide array flexibility would spill over the any >>>> semantics which folks would be trying to set out with memremap(). That >>>> should not stop the evolution of memremap() though, just pointing out that >>>> perhaps we should be a bit more restrictive over how things can criss >>>> -cross and who areas can do it. >>> >>> reserve_pfn_range() allows the caller to specify if the fallback needs to be >>> enabled or disabled with 'strict_prot'. track_pfn_remap() called from >>> remap_pfn_range() enables it, and track_pfn_copy() disables it. I think we >>> can do similar for the memremap and ioremap families as well. The fallback >>> could be set disabled in the regular interfaces, and enabled in some >>> interface as necessary. This also allows gradual transition, ex. disable in >>> memremap while ioremap remains enabled for now. >> >> Sounds sexy to me. > > Cool, sounds like something we can tackle in 4.4 along with the > ioremap_cache removal cleanups. Just a reminder that we should expect some of these changes soon :D Luis
diff --git a/arch/ia64/include/asm/io.h b/arch/ia64/include/asm/io.h index 80a7e34be009..9041bbe2b7b4 100644 --- a/arch/ia64/include/asm/io.h +++ b/arch/ia64/include/asm/io.h @@ -435,6 +435,7 @@ static inline void __iomem * ioremap_cache (unsigned long phys_addr, unsigned lo { return ioremap(phys_addr, size); } +#define ioremap_cache ioremap_cache /* diff --git a/arch/sh/include/asm/io.h b/arch/sh/include/asm/io.h index 728c4c571f40..6194e20fccca 100644 --- a/arch/sh/include/asm/io.h +++ b/arch/sh/include/asm/io.h @@ -342,6 +342,7 @@ ioremap_cache(phys_addr_t offset, unsigned long size) { return __ioremap_mode(offset, size, PAGE_KERNEL); } +#define ioremap_cache ioremap_cache #ifdef CONFIG_HAVE_IOREMAP_PROT static inline void __iomem * diff --git a/arch/xtensa/include/asm/io.h b/arch/xtensa/include/asm/io.h index c39bb6e61911..867840f5400f 100644 --- a/arch/xtensa/include/asm/io.h +++ b/arch/xtensa/include/asm/io.h @@ -57,6 +57,7 @@ static inline void __iomem *ioremap_cache(unsigned long offset, else BUG(); } +#define ioremap_cache ioremap_cache #define ioremap_wc ioremap_nocache #define ioremap_wt ioremap_nocache diff --git a/include/linux/io.h b/include/linux/io.h index fb5a99800e77..dfed9d608bb3 100644 --- a/include/linux/io.h +++ b/include/linux/io.h @@ -121,4 +121,13 @@ static inline int arch_phys_wc_index(int handle) #endif #endif +enum { + MEMREMAP_WB = 1 << 0, + MEMREMAP_WT = 1 << 1, + MEMREMAP_CACHE = MEMREMAP_WB, +}; + +extern void *memremap(resource_size_t offset, size_t size, unsigned long flags); +extern void memunmap(void *addr); + #endif /* _LINUX_IO_H */ diff --git a/kernel/Makefile b/kernel/Makefile index 43c4c920f30a..92866d36e376 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -99,6 +99,8 @@ obj-$(CONFIG_JUMP_LABEL) += jump_label.o obj-$(CONFIG_CONTEXT_TRACKING) += context_tracking.o obj-$(CONFIG_TORTURE_TEST) += torture.o +obj-$(CONFIG_HAS_IOMEM) += memremap.o + $(obj)/configs.o: $(obj)/config_data.h # config_data.h contains the same information as ikconfig.h but gzipped. diff --git a/kernel/memremap.c b/kernel/memremap.c new file mode 100644 index 000000000000..ba206fd11785 --- /dev/null +++ b/kernel/memremap.c @@ -0,0 +1,82 @@ +/* + * Copyright(c) 2015 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ +#include <linux/types.h> +#include <linux/io.h> +#include <linux/mm.h> + +#ifndef ioremap_cache +/* temporary while we convert existing ioremap_cache users to memremap */ +__weak void __iomem *ioremap_cache(resource_size_t offset, unsigned long size) +{ + return ioremap(offset, size); +} +#endif + +/* + * memremap() is "ioremap" for cases where it is known that the resource + * being mapped does not have i/o side effects and the __iomem + * annotation is not applicable. + */ +void *memremap(resource_size_t offset, size_t size, unsigned long flags) +{ + int is_ram = region_is_ram(offset, size); + void *addr = NULL; + + if (is_ram < 0) { + WARN_ONCE(1, "memremap attempted on mixed range %pa size: %zu\n", + &offset, size); + return NULL; + } + + /* Try all mapping types requested until one returns non-NULL */ + if (flags & MEMREMAP_CACHE) { + flags &= ~MEMREMAP_CACHE; + /* + * MEMREMAP_CACHE is special in that it can be satisifed + * from the direct map. Some archs depend on the + * capability of memremap() to autodetect cases where + * the requested range is potentially in "System RAM" + */ + if (is_ram) + addr = __va(offset); + else + addr = ioremap_cache(offset, size); + } + + /* + * If we don't have a mapping yet and more request flags are + * pending then we will be attempting to establish a new virtual + * address mapping. Enforce that this mapping is not aliasing + * "System RAM" + */ + if (!addr && is_ram && flags) { + WARN_ONCE(1, "memremap attempted on ram %pa size: %zu\n", + &offset, size); + return NULL; + } + + if (!addr && (flags & MEMREMAP_WT)) { + flags &= ~MEMREMAP_WT; + addr = ioremap_wt(offset, size); + } + + return addr; +} +EXPORT_SYMBOL(memremap); + +void memunmap(void *addr) +{ + if (is_vmalloc_addr(addr)) + iounmap((void __iomem *) addr); +} +EXPORT_SYMBOL(memunmap);
Existing users of ioremap_cache() are mapping memory that is known in advance to not have i/o side effects. These users are forced to cast away the __iomem annotation, or otherwise neglect to fix the sparse errors thrown when dereferencing pointers to this memory. Provide memremap() as a non __iomem annotated ioremap_*() in the case when ioremap is otherwise a pointer to memory. Outside of ioremap() and ioremap_nocache(), the expectation is that most calls to ioremap_<type>() are seeking memory-like semantics (e.g. speculative reads, and prefetching permitted). These callsites can be moved to memremap() over time. memremap() is a break from the ioremap implementation pattern of adding a new memremap_<type>() for each mapping type and having silent compatibility fall backs. Instead, the implementation defines flags that are passed to the central memremap() and if a mapping type is not supported by an arch memremap returns NULL. The behavior change to return NULL on an unsupported request is reserved for a later patch. This initial implementation starts off by using ioremap_cache() directly. Once all ioremap_cache() and ioremap_wt() instances have been converted the functionality for establishing these mappings will be pushed to a per-architecture arch_memremap() implementation. Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- arch/ia64/include/asm/io.h | 1 + arch/sh/include/asm/io.h | 1 + arch/xtensa/include/asm/io.h | 1 + include/linux/io.h | 9 +++++ kernel/Makefile | 2 + kernel/memremap.c | 82 ++++++++++++++++++++++++++++++++++++++++++ 6 files changed, 96 insertions(+) create mode 100644 kernel/memremap.c