Message ID | 20191111192258.2234502-1-arnd@arndb.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | video: fbdev: atyfb: only use ioremap_uc() on i386 and ia64 | expand |
On Tue, Nov 12, 2019 at 11:55 AM Christoph Hellwig <hch@lst.de> wrote: > > On Mon, Nov 11, 2019 at 08:22:50PM +0100, Arnd Bergmann wrote: > > ioremap_uc() is only meaningful on old x86-32 systems with the PAT > > extension, and on ia64 with its slightly unconventional ioremap() > > behavior, everywhere else this is the same as ioremap() anyway. > > > > Change the only driver that still references ioremap_uc() to only do so > > on x86-32/ia64 in order to allow removing that interface at some > > point in the future for the other architectures. > > > > On some architectures, ioremap_uc() just returns NULL, changing > > the driver to call ioremap() means that they now have a chance > > of working correctly. > > So this whole area is a bit of a mess. ioremap_uc on ia64 is brand > new in this cycle, for ia64-internal users that were previously > using ioremap_nocache. And ioremap_nocache was identical to ioremap > everywhere but ia64. So I think we can safely skip ia64 in the ifdef > if we go down that route. > > But also on x86 I'd actually rather prefer killing off this only mainline > user of ioremap_uc. It was added by Luis in commit 3cc2dac5be3f > ("drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC", > which looks like an optimization of MTRR usage. I feel like I really > don't understand the point there, but it also seems a pity to keep > the API around just for that. Wut ... Maybe I'm missing something, but from how we use mtrr in other gpu drivers it's a) either you use MTRR because that's all you got or b) you use pat. Mixing both sounds like a pretty bad idea, since if you need MTRR for performance (because you dont have PAT) then you can't fix the wc with the PAT-based ioremap_uc. And if you have PAT, then you don't really need an MTRR to get wc. So I'd revert this patch from Luis and ... > That being said linux-next added another call to ioremap_uc in > drivers/mfd/intel-lpss.c, so it looks like the API might have to stay. > > So I guess modulo excluding ia64 your patch looks fine, and should go > along something like the one below. I'll happily carry them in the > ioremap tree with the right ACKs. > > --- > From 81243b2aa78babcc5f97b2c2a29957fcf3fd3664 Mon Sep 17 00:00:00 2001 > From: Christoph Hellwig <hch@lst.de> > Date: Tue, 12 Nov 2019 11:52:25 +0100 > Subject: ioremap: remove ioremap_uc except on x86 and ia64 > > ioremap_uc is a special API to work around caching oddities on > Intel platforms. Remove it from all other architectures now that > only x86 and ia64-specific callers are left. ... apply this one. Since the same reasoning should apply to anything that's running on any cpu with PAT. -Daniel > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > arch/alpha/include/asm/io.h | 1 - > arch/m68k/include/asm/kmap.h | 1 - > arch/mips/include/asm/io.h | 1 - > arch/parisc/include/asm/io.h | 1 - > arch/powerpc/include/asm/io.h | 1 - > arch/sh/include/asm/io.h | 1 - > arch/sparc/include/asm/io_64.h | 1 - > include/asm-generic/io.h | 15 --------------- > include/linux/io.h | 2 ++ > lib/devres.c | 4 ++++ > 10 files changed, 6 insertions(+), 22 deletions(-) > > diff --git a/arch/alpha/include/asm/io.h b/arch/alpha/include/asm/io.h > index 1989b946a28d..11fdcade3c5c 100644 > --- a/arch/alpha/include/asm/io.h > +++ b/arch/alpha/include/asm/io.h > @@ -290,7 +290,6 @@ static inline void __iomem * ioremap_nocache(unsigned long offset, > } > > #define ioremap_wc ioremap_nocache > -#define ioremap_uc ioremap_nocache > > static inline void iounmap(volatile void __iomem *addr) > { > diff --git a/arch/m68k/include/asm/kmap.h b/arch/m68k/include/asm/kmap.h > index 559cb91bede1..22b5ea4fc8b8 100644 > --- a/arch/m68k/include/asm/kmap.h > +++ b/arch/m68k/include/asm/kmap.h > @@ -28,7 +28,6 @@ static inline void __iomem *ioremap(unsigned long physaddr, unsigned long size) > } > > #define ioremap_nocache ioremap > -#define ioremap_uc ioremap > #define ioremap_wt ioremap_wt > static inline void __iomem *ioremap_wt(unsigned long physaddr, > unsigned long size) > diff --git a/arch/mips/include/asm/io.h b/arch/mips/include/asm/io.h > index 3f6ce74335b4..9195ded1d6a7 100644 > --- a/arch/mips/include/asm/io.h > +++ b/arch/mips/include/asm/io.h > @@ -249,7 +249,6 @@ static inline void __iomem *ioremap_prot(phys_addr_t offset, > */ > #define ioremap_nocache(offset, size) \ > __ioremap_mode((offset), (size), _CACHE_UNCACHED) > -#define ioremap_uc ioremap_nocache > > /* > * ioremap_cache - map bus memory into CPU space > diff --git a/arch/parisc/include/asm/io.h b/arch/parisc/include/asm/io.h > index 46212b52c23e..0674f5cd3045 100644 > --- a/arch/parisc/include/asm/io.h > +++ b/arch/parisc/include/asm/io.h > @@ -130,7 +130,6 @@ static inline void gsc_writeq(unsigned long long val, unsigned long addr) > void __iomem *ioremap(unsigned long offset, unsigned long size); > #define ioremap_nocache(off, sz) ioremap((off), (sz)) > #define ioremap_wc ioremap_nocache > -#define ioremap_uc ioremap_nocache > > extern void iounmap(const volatile void __iomem *addr); > > diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h > index a63ec938636d..119bcbe3e328 100644 > --- a/arch/powerpc/include/asm/io.h > +++ b/arch/powerpc/include/asm/io.h > @@ -716,7 +716,6 @@ extern void __iomem *ioremap_wc(phys_addr_t address, unsigned long size); > void __iomem *ioremap_wt(phys_addr_t address, unsigned long size); > void __iomem *ioremap_coherent(phys_addr_t address, unsigned long size); > #define ioremap_nocache(addr, size) ioremap((addr), (size)) > -#define ioremap_uc(addr, size) ioremap((addr), (size)) > #define ioremap_cache(addr, size) \ > ioremap_prot((addr), (size), pgprot_val(PAGE_KERNEL)) > > diff --git a/arch/sh/include/asm/io.h b/arch/sh/include/asm/io.h > index 1495489225ac..30bbe787f1ef 100644 > --- a/arch/sh/include/asm/io.h > +++ b/arch/sh/include/asm/io.h > @@ -368,7 +368,6 @@ static inline int iounmap_fixed(void __iomem *addr) { return -EINVAL; } > #endif > > #define ioremap_nocache ioremap > -#define ioremap_uc ioremap > > /* > * Convert a physical pointer to a virtual kernel pointer for /dev/mem > diff --git a/arch/sparc/include/asm/io_64.h b/arch/sparc/include/asm/io_64.h > index f4afa301954a..688911051b44 100644 > --- a/arch/sparc/include/asm/io_64.h > +++ b/arch/sparc/include/asm/io_64.h > @@ -407,7 +407,6 @@ static inline void __iomem *ioremap(unsigned long offset, unsigned long size) > } > > #define ioremap_nocache(X,Y) ioremap((X),(Y)) > -#define ioremap_uc(X,Y) ioremap((X),(Y)) > #define ioremap_wc(X,Y) ioremap((X),(Y)) > #define ioremap_wt(X,Y) ioremap((X),(Y)) > > diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h > index 325fc98cc9ff..357e8638040c 100644 > --- a/include/asm-generic/io.h > +++ b/include/asm-generic/io.h > @@ -972,21 +972,6 @@ static inline void __iomem *ioremap(phys_addr_t addr, size_t size) > #define ioremap_wt ioremap > #endif > > -/* > - * ioremap_uc is special in that we do require an explicit architecture > - * implementation. In general you do not want to use this function in a > - * driver and use plain ioremap, which is uncached by default. Similarly > - * architectures should not implement it unless they have a very good > - * reason. > - */ > -#ifndef ioremap_uc > -#define ioremap_uc ioremap_uc > -static inline void __iomem *ioremap_uc(phys_addr_t offset, size_t size) > -{ > - return NULL; > -} > -#endif > - > #ifdef CONFIG_HAS_IOPORT_MAP > #ifndef CONFIG_GENERIC_IOMAP > #ifndef ioport_map > diff --git a/include/linux/io.h b/include/linux/io.h > index a59834bc0a11..6574bb0f28e6 100644 > --- a/include/linux/io.h > +++ b/include/linux/io.h > @@ -64,8 +64,10 @@ static inline void devm_ioport_unmap(struct device *dev, void __iomem *addr) > > void __iomem *devm_ioremap(struct device *dev, resource_size_t offset, > resource_size_t size); > +#ifdef ioremap_uc > void __iomem *devm_ioremap_uc(struct device *dev, resource_size_t offset, > resource_size_t size); > +#endif > void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t offset, > resource_size_t size); > void __iomem *devm_ioremap_wc(struct device *dev, resource_size_t offset, > diff --git a/lib/devres.c b/lib/devres.c > index f56070cf970b..0299279ddf02 100644 > --- a/lib/devres.c > +++ b/lib/devres.c > @@ -40,9 +40,11 @@ static void __iomem *__devm_ioremap(struct device *dev, resource_size_t offset, > case DEVM_IOREMAP_NC: > addr = ioremap_nocache(offset, size); > break; > +#ifdef ioremap_uc > case DEVM_IOREMAP_UC: > addr = ioremap_uc(offset, size); > break; > +#endif > case DEVM_IOREMAP_WC: > addr = ioremap_wc(offset, size); > break; > @@ -72,6 +74,7 @@ void __iomem *devm_ioremap(struct device *dev, resource_size_t offset, > } > EXPORT_SYMBOL(devm_ioremap); > > +#ifdef ioremap_uc > /** > * devm_ioremap_uc - Managed ioremap_uc() > * @dev: Generic device to remap IO address for > @@ -86,6 +89,7 @@ void __iomem *devm_ioremap_uc(struct device *dev, resource_size_t offset, > return __devm_ioremap(dev, offset, size, DEVM_IOREMAP_UC); > } > EXPORT_SYMBOL_GPL(devm_ioremap_uc); > +#endif /* ioremap_uc */ > > /** > * devm_ioremap_nocache - Managed ioremap_nocache() > -- > 2.20.1 >
On Tue, Nov 12, 2019 at 3:06 PM Christoph Hellwig <hch@lst.de> wrote: > On Tue, Nov 12, 2019 at 02:04:16PM +0100, Daniel Vetter wrote: > > Wut ... Maybe I'm missing something, but from how we use mtrr in other > > gpu drivers it's a) either you use MTRR because that's all you got or > > b) you use pat. Mixing both sounds like a pretty bad idea, since if > > you need MTRR for performance (because you dont have PAT) then you > > can't fix the wc with the PAT-based ioremap_uc. And if you have PAT, > > then you don't really need an MTRR to get wc. > > > > So I'd revert this patch from Luis and ... > > Sounds great to me.. > > > ... apply this one. Since the same reasoning should apply to anything > > that's running on any cpu with PAT. > > Can you take a look at "mfd: intel-lpss: Use devm_ioremap_uc for MMIO" > in linux-next, which also looks rather fishy to me? Can't we use > the MTRR APIs to override the broken BIOS MTRR setup there as well? Hm so that's way out of my knowledge, but I think mtrr_cleanup() was supposed to fix up messy/broken MTRR setups by the bios. So maybe they simply didn't enable that in their .config with CONFIG_MTRR_SANITIZER. An explicit cleanup is currently not possible for drivers, since the only interface exported to drivers is arch_phys_wc_add/del (which short-circuits if pat works since you don't need mtrr in that case). Adding everyone from that commit, plus Luis. Drivers really shouldn't assume/work around the bios setting up superflous/wrong MTRR. > With that we could kill ioremap_uc entirely. So yeah removing that seems definitely like the right thing. -Daniel
On Tue, Nov 12, 2019 at 03:06:31PM +0100, Christoph Hellwig wrote: > On Tue, Nov 12, 2019 at 02:04:16PM +0100, Daniel Vetter wrote: > > Wut ... Maybe I'm missing something, but from how we use mtrr in other > > gpu drivers it's a) either you use MTRR because that's all you got or > > b) you use pat. Mixing both sounds like a pretty bad idea, You misread the patch. And indeed there is a bit of complexity involved here folks should be aware of as .. well, its been a while. A mix of both MTRR and PAT is not effectively done on the code patch for the atyb driver. If you have PAT only PAT is used. If you don't have PAT a solution is provided to use MTRR. The goal of the patch really was to help finally avoid direct calls to MTRR. *This* driver was the *one* crazy exception where we needed to adddress this with a solution which would work effectively for both non-PAT and PAT world which had crazy constraints. So with this out of the way, no direct calls of MTRRs was possible and there are future possible gains with this for x86. The biggest two were: 1) Xen didn't have to implement MTRR hypervisor calls out for Linux guests. This means Xen guests don't have to enable MTRRs. Any code path avoiding such craziness as stop_machine() on each CPU during bootup, resume, CPU online and whenever an MTRR is set is a blessing. 2) We may be closer in the future to getting ioremap_nocache to use UC isntead of UC-, this would be a win. x86 ioremap_nocache() does not use UC (strong UC), it just uses UC-. Note though that BIOSes can *only* enable UC by using MTRR directly, fan control for a system was one use case example that can come up, just as an example. Ideally your BIOS won't need this. When and how this is done is platform and BIOS specific though. So effectively, if a BIOS enables MTRRs the Linux must keep them enabled. If the BIOS disables MTRRs the kernel keeps them disabled. > Can you take a look at "mfd: intel-lpss: Use devm_ioremap_uc for MMIO" > in linux-next, which also looks rather fishy to me? Can't we use > the MTRR APIs to override the broken BIOS MTRR setup there as well? The call there was put to allow precisely for such work around but also allow the code to work on PAT / non-PAT systems by using the same API. > With that we could kill ioremap_uc entirely. ioremap_uc() is a compromise to avoid direct calls to MTRRs, since ioremap_nocache() is not effectively yet using UC. Whether or not other archs carry it. Luis
On Tue, Nov 12, 2019 at 03:26:35PM +0100, Daniel Vetter wrote: > On Tue, Nov 12, 2019 at 3:06 PM Christoph Hellwig <hch@lst.de> wrote: > > On Tue, Nov 12, 2019 at 02:04:16PM +0100, Daniel Vetter wrote: > > > Wut ... Maybe I'm missing something, but from how we use mtrr in other > > > gpu drivers it's a) either you use MTRR because that's all you got or > > > b) you use pat. Mixing both sounds like a pretty bad idea, since if > > > you need MTRR for performance (because you dont have PAT) then you > > > can't fix the wc with the PAT-based ioremap_uc. And if you have PAT, > > > then you don't really need an MTRR to get wc. > > > > > > So I'd revert this patch from Luis and ... > > > > Sounds great to me.. > > > > > ... apply this one. Since the same reasoning should apply to anything > > > that's running on any cpu with PAT. > > > > Can you take a look at "mfd: intel-lpss: Use devm_ioremap_uc for MMIO" > > in linux-next, which also looks rather fishy to me? Can't we use > > the MTRR APIs to override the broken BIOS MTRR setup there as well? > > Hm so that's way out of my knowledge, but I think mtrr_cleanup() was > supposed to fix up messy/broken MTRR setups by the bios. So maybe they > simply didn't enable that in their .config with CONFIG_MTRR_SANITIZER. I had originally suggested to just make the driver build on x86, but an atlternative was to provide the call for the missing architecture. > An explicit cleanup is currently not possible for drivers, since the > only interface exported to drivers is arch_phys_wc_add/del (which > short-circuits if pat works since you don't need mtrr in that case). Right, the goal was to not call MTRR directly. > Adding everyone from that commit, plus Luis. Drivers really shouldn't > assume/work around the bios setting up superflous/wrong MTRR. Such things are needed, otherwise some systems may not boot... > > With that we could kill ioremap_uc entirely. > > So yeah removing that seems definitely like the right thing. I think this would be possible if we could flop ioremap_nocache() to UC instead of UC- on x86. Otherwise, I can't see how we can remove this by still not allowing direct MTRR calls. Luis
On Wed, Nov 13, 2019 at 8:27 AM Christoph Hellwig <hch@lst.de> wrote: > > On Tue, Nov 12, 2019 at 10:24:23PM +0000, Luis Chamberlain wrote: > > I think this would be possible if we could flop ioremap_nocache() to UC > > instead of UC- on x86. Otherwise, I can't see how we can remove this by > > still not allowing direct MTRR calls. > > If everything goes well ioremap_nocache will be gone as of 5.5. As ioremap_nocache() just an alias for ioremap(), I suppose the idea would then be to make x86 ioremap be UC instead of UC-, again matching what the other architectures do already. Arnd
On Wed, Nov 13, 2019 at 08:38:15AM +0100, Arnd Bergmann wrote: > On Wed, Nov 13, 2019 at 8:27 AM Christoph Hellwig <hch@lst.de> wrote: > > > > On Tue, Nov 12, 2019 at 10:24:23PM +0000, Luis Chamberlain wrote: > > > I think this would be possible if we could flop ioremap_nocache() to UC > > > instead of UC- on x86. Otherwise, I can't see how we can remove this by > > > still not allowing direct MTRR calls. > > > > If everything goes well ioremap_nocache will be gone as of 5.5. > > As ioremap_nocache() just an alias for ioremap(), I suppose the idea would > then be to make x86 ioremap be UC instead of UC-, again matching what the > other architectures do already. I think it's right thing to do, i.e. assume that ioremap() always does strong UC independently on MTRR settings.
On Wed, Nov 13, 2019 at 11:31:54AM +0200, Andy Shevchenko wrote: > On Wed, Nov 13, 2019 at 08:38:15AM +0100, Arnd Bergmann wrote: > > On Wed, Nov 13, 2019 at 8:27 AM Christoph Hellwig <hch@lst.de> wrote: > > > > > > On Tue, Nov 12, 2019 at 10:24:23PM +0000, Luis Chamberlain wrote: > > > > I think this would be possible if we could flop ioremap_nocache() to UC > > > > instead of UC- on x86. Otherwise, I can't see how we can remove this by > > > > still not allowing direct MTRR calls. > > > > > > If everything goes well ioremap_nocache will be gone as of 5.5. > > > > As ioremap_nocache() just an alias for ioremap(), I suppose the idea would > > then be to make x86 ioremap be UC instead of UC-, again matching what the > > other architectures do already. > > I think it's right thing to do, i.e. assume that ioremap() always does strong > UC independently on MTRR settings. Agreed wholeheartedly. What are the blockers from making that happen? Do we have any left? Luis
diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c index 79d548746efd..bdbaca7200b2 100644 --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -3420,11 +3420,15 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info, } info->fix.mmio_start = raddr; +#if defined(__i386__) || defined(__ia64__) /* * By using strong UC we force the MTRR to never have an * effect on the MMIO region on both non-PAT and PAT systems. */ par->ati_regbase = ioremap_uc(info->fix.mmio_start, 0x1000); +#else + par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000); +#endif if (par->ati_regbase == NULL) return -ENOMEM;
ioremap_uc() is only meaningful on old x86-32 systems with the PAT extension, and on ia64 with its slightly unconventional ioremap() behavior, everywhere else this is the same as ioremap() anyway. Change the only driver that still references ioremap_uc() to only do so on x86-32/ia64 in order to allow removing that interface at some point in the future for the other architectures. On some architectures, ioremap_uc() just returns NULL, changing the driver to call ioremap() means that they now have a chance of working correctly. Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- drivers/video/fbdev/aty/atyfb_base.c | 4 ++++ 1 file changed, 4 insertions(+)