diff mbox

[v1,09/47] vidoe: fbdev: atyfb: remove and fix MTRR MMIO "hole" work around

Message ID 1426893517-2511-10-git-send-email-mcgrof@do-not-panic.com (mailing list archive)
State New, archived
Headers show

Commit Message

Luis R. Rodriguez March 20, 2015, 11:17 p.m. UTC
From: "Luis R. Rodriguez" <mcgrof@suse.com>

The atyfb driver uses an MTRR work around since some
cards use the same PCI BAR for the framebuffer and MMIO.
In such cards the last page is used for MMIO, the rest for
the framebuffer, so on those cards we ioremap() the MMIO
page alone, then again ioremap() the full framebuffer
including the MMIO space *and* ___then___ use an MTRR with
MTRR_TYPE_WRCOMB on the full PCI BAR... and finally "hole"
in an MTRR_TYPE_UNCACHABLE MTRR only for MMIO.

This is a terrible fucking work around, and should by no means
be necessary however evidence through a large series of conversion
of drivers to ioremap_wc() for the framebuffer shows that around
the time MTRR started becoming popular devices did not have things
lined up for easily separating the framebuffer and MMIO register
access. In some cases a driver requires significant intrusive
changes in order to make the split for an ioremap() for MMIO registers
and another ioremap_wc() for the framebuffer, at other times a
bit of careful study of the driver suffices. This example driver
falls into the later category.

We can replace the MTRR MTRR_TYPE_UNCACHABLE
work around by using ioremap_nocache(), the length of the
MMIO space should already be correct. The other part we
need to correct is ensuring we ioremap() for the framebuffer
only the required size. Since the ioremap() happens early
on probe for PCI devices before aty_init() where we typically
adjust the length and know how to do it, we can fix this by
pegging the bus type as PCI on PCI probe, and finally fudging
and framebuffer length just as we do on aty_init().

The last thing we do must do to remain sane is ensure we
use the info->fix.smem_start and info->fix.smem_len for
the framebuffer MTRR as we know that is always well adjusted.
The *one* concern here would be if the MTRR is not in units
of 4K __but__ we already know that in the PCI case this cannot
happen, in the shared space setting the MTRR would be up to
0x7ff000 and assuming a 4K page:

; 0x7ff000 / 0x1000
	2047

Also, internally when MTRR is used mtrr_add() will use mtrr_check()
and that should splat a warning when the MTRR base and size are
not compatible with what is expected for MTRR usage.

This fix lets us nuke the MTRR_TYPE_UNCACHABLE MTRR "hole".

Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 drivers/video/fbdev/aty/atyfb.h      |  1 -
 drivers/video/fbdev/aty/atyfb_base.c | 28 ++++++----------------------
 2 files changed, 6 insertions(+), 23 deletions(-)

Comments

Andy Lutomirski March 20, 2015, 11:52 p.m. UTC | #1
On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
<mcgrof@do-not-panic.com> wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>
> The atyfb driver uses an MTRR work around since some
> cards use the same PCI BAR for the framebuffer and MMIO.
> In such cards the last page is used for MMIO, the rest for
> the framebuffer, so on those cards we ioremap() the MMIO
> page alone, then again ioremap() the full framebuffer
> including the MMIO space *and* ___then___ use an MTRR with
> MTRR_TYPE_WRCOMB on the full PCI BAR... and finally "hole"
> in an MTRR_TYPE_UNCACHABLE MTRR only for MMIO.
>
> This is a terrible fucking work around, and should by no means
> be necessary however evidence through a large series of conversion
> of drivers to ioremap_wc() for the framebuffer shows that around
> the time MTRR started becoming popular devices did not have things
> lined up for easily separating the framebuffer and MMIO register
> access. In some cases a driver requires significant intrusive
> changes in order to make the split for an ioremap() for MMIO registers
> and another ioremap_wc() for the framebuffer, at other times a
> bit of careful study of the driver suffices. This example driver
> falls into the later category.
>
> We can replace the MTRR MTRR_TYPE_UNCACHABLE
> work around by using ioremap_nocache(), the length of the
> MMIO space should already be correct. The other part we
> need to correct is ensuring we ioremap() for the framebuffer
> only the required size. Since the ioremap() happens early
> on probe for PCI devices before aty_init() where we typically
> adjust the length and know how to do it, we can fix this by
> pegging the bus type as PCI on PCI probe, and finally fudging
> and framebuffer length just as we do on aty_init().
>
> The last thing we do must do to remain sane is ensure we
> use the info->fix.smem_start and info->fix.smem_len for
> the framebuffer MTRR as we know that is always well adjusted.
> The *one* concern here would be if the MTRR is not in units
> of 4K __but__ we already know that in the PCI case this cannot
> happen, in the shared space setting the MTRR would be up to
> 0x7ff000 and assuming a 4K page:
>
> ; 0x7ff000 / 0x1000
>         2047
>
> Also, internally when MTRR is used mtrr_add() will use mtrr_check()
> and that should splat a warning when the MTRR base and size are
> not compatible with what is expected for MTRR usage.
>
> This fix lets us nuke the MTRR_TYPE_UNCACHABLE MTRR "hole".
>
> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Antonino Daplas <adaplas@gmail.com>
> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
> Cc: linux-fbdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
> ---
>  drivers/video/fbdev/aty/atyfb.h      |  1 -
>  drivers/video/fbdev/aty/atyfb_base.c | 28 ++++++----------------------
>  2 files changed, 6 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/video/fbdev/aty/atyfb.h b/drivers/video/fbdev/aty/atyfb.h
> index 1f39a62..89ec439 100644
> --- a/drivers/video/fbdev/aty/atyfb.h
> +++ b/drivers/video/fbdev/aty/atyfb.h
> @@ -184,7 +184,6 @@ struct atyfb_par {
>         spinlock_t int_lock;
>  #ifdef CONFIG_MTRR
>         int mtrr_aper;
> -       int mtrr_reg;
>  #endif
>         u32 mem_cntl;
>         struct crtc saved_crtc;
> diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> index 8025624..8875e56 100644
> --- a/drivers/video/fbdev/aty/atyfb_base.c
> +++ b/drivers/video/fbdev/aty/atyfb_base.c
> @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
>
>  #ifdef CONFIG_MTRR
>         par->mtrr_aper = -1;
> -       par->mtrr_reg = -1;
>         if (!nomtrr) {
> -               /* Cover the whole resource. */
> -               par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> +               par->mtrr_aper = mtrr_add(info->fix.smem_start,
> +                                         info->fix.smem_len,
>                                           MTRR_TYPE_WRCOMB, 1);
> -               if (par->mtrr_aper >= 0 && !par->aux_start) {
> -                       /* Make a hole for mmio. */
> -                       par->mtrr_reg = mtrr_add(par->res_start + 0x800000 -
> -                                                GUI_RESERVE, GUI_RESERVE,
> -                                                MTRR_TYPE_UNCACHABLE, 1);
> -                       if (par->mtrr_reg < 0) {
> -                               mtrr_del(par->mtrr_aper, 0, 0);
> -                               par->mtrr_aper = -1;
> -                       }
> -               }
>         }
>  #endif
>
> @@ -2776,10 +2765,6 @@ aty_init_exit:
>         par->pll_ops->set_pll(info, &par->saved_pll);
>
>  #ifdef CONFIG_MTRR
> -       if (par->mtrr_reg >= 0) {
> -               mtrr_del(par->mtrr_reg, 0, 0);
> -               par->mtrr_reg = -1;
> -       }
>         if (par->mtrr_aper >= 0) {
>                 mtrr_del(par->mtrr_aper, 0, 0);
>                 par->mtrr_aper = -1;
> @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
>         }
>
>         info->fix.mmio_start = raddr;
> -       par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000);
> +       par->ati_regbase = ioremap_nocache(info->fix.mmio_start, 0x1000);

Double-check me, but I think that ioremap_nocache + WC MTRR = WC.  I
think we might need ioremap_nocache_me_harder (or maybe ioremap_x86_uc
if you prefer that bikeshed color) for this.
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ville Syrjälä March 21, 2015, 9:15 a.m. UTC | #2
On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
> diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> index 8025624..8875e56 100644
> --- a/drivers/video/fbdev/aty/atyfb_base.c
> +++ b/drivers/video/fbdev/aty/atyfb_base.c
> @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
>  
>  #ifdef CONFIG_MTRR
>  	par->mtrr_aper = -1;
> -	par->mtrr_reg = -1;
>  	if (!nomtrr) {
> -		/* Cover the whole resource. */
> -		par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> +		par->mtrr_aper = mtrr_add(info->fix.smem_start,
> +					  info->fix.smem_len,
>  					  MTRR_TYPE_WRCOMB, 1);

MTRRs need power of two size, so how is this supposed to work?

> -		if (par->mtrr_aper >= 0 && !par->aux_start) {
> -			/* Make a hole for mmio. */
> -			par->mtrr_reg = mtrr_add(par->res_start + 0x800000 -
> -						 GUI_RESERVE, GUI_RESERVE,
> -						 MTRR_TYPE_UNCACHABLE, 1);
> -			if (par->mtrr_reg < 0) {
> -				mtrr_del(par->mtrr_aper, 0, 0);
> -				par->mtrr_aper = -1;
> -			}
> -		}
>  	}
>  #endif
>  
> @@ -2776,10 +2765,6 @@ aty_init_exit:
>  	par->pll_ops->set_pll(info, &par->saved_pll);
>  
>  #ifdef CONFIG_MTRR
> -	if (par->mtrr_reg >= 0) {
> -		mtrr_del(par->mtrr_reg, 0, 0);
> -		par->mtrr_reg = -1;
> -	}
>  	if (par->mtrr_aper >= 0) {
>  		mtrr_del(par->mtrr_aper, 0, 0);
>  		par->mtrr_aper = -1;
> @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
>  	}
>  
>  	info->fix.mmio_start = raddr;
> -	par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000);
> +	par->ati_regbase = ioremap_nocache(info->fix.mmio_start, 0x1000);
>  	if (par->ati_regbase == NULL)
>  		return -ENOMEM;
>  
> @@ -3491,6 +3476,8 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
>  	info->fix.smem_start = addr;
>  	info->fix.smem_len = 0x800000;
>  
> +	aty_fudge_framebuffer_len(info);
> +
>  	info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
>  	if (info->screen_base == NULL) {
>  		ret = -ENOMEM;
> @@ -3563,6 +3550,7 @@ static int atyfb_pci_probe(struct pci_dev *pdev,
>  		return -ENOMEM;
>  	}
>  	par = info->par;
> +	par->bus_type = PCI;
>  	info->fix = atyfb_fix;
>  	info->device = &pdev->dev;
>  	par->pci_id = pdev->device;
> @@ -3732,10 +3720,6 @@ static void atyfb_remove(struct fb_info *info)
>  #endif
>  
>  #ifdef CONFIG_MTRR
> -	if (par->mtrr_reg >= 0) {
> -		mtrr_del(par->mtrr_reg, 0, 0);
> -		par->mtrr_reg = -1;
> -	}
>  	if (par->mtrr_aper >= 0) {
>  		mtrr_del(par->mtrr_aper, 0, 0);
>  		par->mtrr_aper = -1;
> -- 
> 2.3.2.209.gd67f9d5.dirty
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ville Syrjälä March 27, 2015, 8:37 a.m. UTC | #3
On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
> On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> > index 8025624..8875e56 100644
> > --- a/drivers/video/fbdev/aty/atyfb_base.c
> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
> >  
> >  #ifdef CONFIG_MTRR
> >  	par->mtrr_aper = -1;
> > -	par->mtrr_reg = -1;
> >  	if (!nomtrr) {
> > -		/* Cover the whole resource. */
> > -		par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> > +		par->mtrr_aper = mtrr_add(info->fix.smem_start,
> > +					  info->fix.smem_len,
> >  					  MTRR_TYPE_WRCOMB, 1);
> 
> MTRRs need power of two size, so how is this supposed to work?

Still waiting for an answer...

> 
> > -		if (par->mtrr_aper >= 0 && !par->aux_start) {
> > -			/* Make a hole for mmio. */
> > -			par->mtrr_reg = mtrr_add(par->res_start + 0x800000 -
> > -						 GUI_RESERVE, GUI_RESERVE,
> > -						 MTRR_TYPE_UNCACHABLE, 1);
> > -			if (par->mtrr_reg < 0) {
> > -				mtrr_del(par->mtrr_aper, 0, 0);
> > -				par->mtrr_aper = -1;
> > -			}
> > -		}
> >  	}
> >  #endif
> >  
> > @@ -2776,10 +2765,6 @@ aty_init_exit:
> >  	par->pll_ops->set_pll(info, &par->saved_pll);
> >  
> >  #ifdef CONFIG_MTRR
> > -	if (par->mtrr_reg >= 0) {
> > -		mtrr_del(par->mtrr_reg, 0, 0);
> > -		par->mtrr_reg = -1;
> > -	}
> >  	if (par->mtrr_aper >= 0) {
> >  		mtrr_del(par->mtrr_aper, 0, 0);
> >  		par->mtrr_aper = -1;
> > @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
> >  	}
> >  
> >  	info->fix.mmio_start = raddr;
> > -	par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000);
> > +	par->ati_regbase = ioremap_nocache(info->fix.mmio_start, 0x1000);
> >  	if (par->ati_regbase == NULL)
> >  		return -ENOMEM;
> >  
> > @@ -3491,6 +3476,8 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
> >  	info->fix.smem_start = addr;
> >  	info->fix.smem_len = 0x800000;
> >  
> > +	aty_fudge_framebuffer_len(info);
> > +
> >  	info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
> >  	if (info->screen_base == NULL) {
> >  		ret = -ENOMEM;
> > @@ -3563,6 +3550,7 @@ static int atyfb_pci_probe(struct pci_dev *pdev,
> >  		return -ENOMEM;
> >  	}
> >  	par = info->par;
> > +	par->bus_type = PCI;
> >  	info->fix = atyfb_fix;
> >  	info->device = &pdev->dev;
> >  	par->pci_id = pdev->device;
> > @@ -3732,10 +3720,6 @@ static void atyfb_remove(struct fb_info *info)
> >  #endif
> >  
> >  #ifdef CONFIG_MTRR
> > -	if (par->mtrr_reg >= 0) {
> > -		mtrr_del(par->mtrr_reg, 0, 0);
> > -		par->mtrr_reg = -1;
> > -	}
> >  	if (par->mtrr_aper >= 0) {
> >  		mtrr_del(par->mtrr_aper, 0, 0);
> >  		par->mtrr_aper = -1;
> > -- 
> > 2.3.2.209.gd67f9d5.dirty
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> -- 
> Ville Syrjälä
> syrjala@sci.fi
> http://www.sci.fi/~syrjala/
Luis Chamberlain March 27, 2015, 7:38 p.m. UTC | #4
On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
> On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> > index 8025624..8875e56 100644
> > --- a/drivers/video/fbdev/aty/atyfb_base.c
> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
> >  
> >  #ifdef CONFIG_MTRR
> >  	par->mtrr_aper = -1;
> > -	par->mtrr_reg = -1;
> >  	if (!nomtrr) {
> > -		/* Cover the whole resource. */
> > -		par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> > +		par->mtrr_aper = mtrr_add(info->fix.smem_start,
> > +					  info->fix.smem_len,
> >  					  MTRR_TYPE_WRCOMB, 1);
> 
> MTRRs need power of two size, so how is this supposed to work?

As per mtrr_add_page() [0] the base and size are just supposed to be in units
of 4 KiB, although the practice is to use powers of 2 in *some* drivers this
is not standardized and by no means recorded as a requirement. Obviously
powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add()
will use mtrr_check() to verify the the same requirement. Furthermore,
as per my commit log message:

---
    The last thing we do must do to remain sane is ensure we
    use the info->fix.smem_start and info->fix.smem_len for
    the framebuffer MTRR as we know that is always well adjusted.
    The *one* concern here would be if the MTRR is not in units
    of 4K __but__ we already know that in the PCI case this cannot
    happen, in the shared space setting the MTRR would be up to
    0x7ff000 and assuming a 4K page:
    
    ; 0x7ff000 / 0x1000
        2047
    
    Also, internally when MTRR is used mtrr_add() will use mtrr_check()
    and that should splat a warning when the MTRR base and size are
    not compatible with what is expected for MTRR usage.
---

If any of this is too risky we can use the __arch_phys_wc_add() (or as
Andy suggested perhaps use set_page_* stuff, although I am still evaluating
this) but I did this change to show the effort required for a change when
the registers / framebuffer is on the same PCI BAR but at different offsets.

[0] scripts/kernel-doc -man -function mtrr_add_page arch/x86/kernel/cpu/mtrr/main.c | nroff -man | less

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain March 27, 2015, 7:38 p.m. UTC | #5
On Fri, Mar 27, 2015 at 10:37:04AM +0200, Ville Syrjälä wrote:
> On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
> > On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
> > > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> > > index 8025624..8875e56 100644
> > > --- a/drivers/video/fbdev/aty/atyfb_base.c
> > > +++ b/drivers/video/fbdev/aty/atyfb_base.c
> > > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
> > >  
> > >  #ifdef CONFIG_MTRR
> > >  	par->mtrr_aper = -1;
> > > -	par->mtrr_reg = -1;
> > >  	if (!nomtrr) {
> > > -		/* Cover the whole resource. */
> > > -		par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> > > +		par->mtrr_aper = mtrr_add(info->fix.smem_start,
> > > +					  info->fix.smem_len,
> > >  					  MTRR_TYPE_WRCOMB, 1);
> > 
> > MTRRs need power of two size, so how is this supposed to work?
> 
> Still waiting for an answer...

Sorry was in the desert for a bit, I'm back now.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Lutomirski March 27, 2015, 7:43 p.m. UTC | #6
On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
> On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
>> On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
>> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
>> > index 8025624..8875e56 100644
>> > --- a/drivers/video/fbdev/aty/atyfb_base.c
>> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
>> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
>> >
>> >  #ifdef CONFIG_MTRR
>> >     par->mtrr_aper = -1;
>> > -   par->mtrr_reg = -1;
>> >     if (!nomtrr) {
>> > -           /* Cover the whole resource. */
>> > -           par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
>> > +           par->mtrr_aper = mtrr_add(info->fix.smem_start,
>> > +                                     info->fix.smem_len,
>> >                                       MTRR_TYPE_WRCOMB, 1);
>>
>> MTRRs need power of two size, so how is this supposed to work?
>
> As per mtrr_add_page() [0] the base and size are just supposed to be in units
> of 4 KiB, although the practice is to use powers of 2 in *some* drivers this
> is not standardized and by no means recorded as a requirement. Obviously
> powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add()
> will use mtrr_check() to verify the the same requirement. Furthermore,
> as per my commit log message:

Whatever the code may or may not do, the x86 architecture uses
power-of-two MTRR sizes.  So I'm confused.

--Andy

>
> ---
>     The last thing we do must do to remain sane is ensure we
>     use the info->fix.smem_start and info->fix.smem_len for
>     the framebuffer MTRR as we know that is always well adjusted.
>     The *one* concern here would be if the MTRR is not in units
>     of 4K __but__ we already know that in the PCI case this cannot
>     happen, in the shared space setting the MTRR would be up to
>     0x7ff000 and assuming a 4K page:
>
>     ; 0x7ff000 / 0x1000
>         2047
>
>     Also, internally when MTRR is used mtrr_add() will use mtrr_check()
>     and that should splat a warning when the MTRR base and size are
>     not compatible with what is expected for MTRR usage.
> ---
>
> If any of this is too risky we can use the __arch_phys_wc_add() (or as
> Andy suggested perhaps use set_page_* stuff, although I am still evaluating
> this) but I did this change to show the effort required for a change when
> the registers / framebuffer is on the same PCI BAR but at different offsets.
>
> [0] scripts/kernel-doc -man -function mtrr_add_page arch/x86/kernel/cpu/mtrr/main.c | nroff -man | less
>
>   Luis
Luis Chamberlain March 27, 2015, 7:57 p.m. UTC | #7
On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
> On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
> > On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
> >> On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
> >> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> >> > index 8025624..8875e56 100644
> >> > --- a/drivers/video/fbdev/aty/atyfb_base.c
> >> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
> >> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
> >> >
> >> >  #ifdef CONFIG_MTRR
> >> >     par->mtrr_aper = -1;
> >> > -   par->mtrr_reg = -1;
> >> >     if (!nomtrr) {
> >> > -           /* Cover the whole resource. */
> >> > -           par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> >> > +           par->mtrr_aper = mtrr_add(info->fix.smem_start,
> >> > +                                     info->fix.smem_len,
> >> >                                       MTRR_TYPE_WRCOMB, 1);
> >>
> >> MTRRs need power of two size, so how is this supposed to work?
> >
> > As per mtrr_add_page() [0] the base and size are just supposed to be in units
> > of 4 KiB, although the practice is to use powers of 2 in *some* drivers this
> > is not standardized and by no means recorded as a requirement. Obviously
> > powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add()
> > will use mtrr_check() to verify the the same requirement. Furthermore,
> > as per my commit log message:
> 
> Whatever the code may or may not do, the x86 architecture uses
> power-of-two MTRR sizes.  So I'm confused.

There should be no confusion, I simply did not know that *was* the
requirement for x86, if that is the case we should add a check for that
and perhaps generalize a helper that does the power of two helper changes,
the cleanest I found was the vesafb driver solution.

Thoughts?

 Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain March 27, 2015, 8:12 p.m. UTC | #8
On Fri, Mar 20, 2015 at 04:52:18PM -0700, Andy Lutomirski wrote:
> On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
> <mcgrof@do-not-panic.com> wrote:
> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> > index 8025624..8875e56 100644
> > --- a/drivers/video/fbdev/aty/atyfb_base.c
> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
> >
> >  #ifdef CONFIG_MTRR
> >         par->mtrr_aper = -1;
> > -       par->mtrr_reg = -1;
> >         if (!nomtrr) {
> > -               /* Cover the whole resource. */
> > -               par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> > +               par->mtrr_aper = mtrr_add(info->fix.smem_start,
> > +                                         info->fix.smem_len,
> >                                           MTRR_TYPE_WRCOMB, 1);
> > -               if (par->mtrr_aper >= 0 && !par->aux_start) {
> > -                       /* Make a hole for mmio. */
> > -                       par->mtrr_reg = mtrr_add(par->res_start + 0x800000 -
> > -                                                GUI_RESERVE, GUI_RESERVE,
> > -                                                MTRR_TYPE_UNCACHABLE, 1);
> > -                       if (par->mtrr_reg < 0) {
> > -                               mtrr_del(par->mtrr_aper, 0, 0);
> > -                               par->mtrr_aper = -1;
> > -                       }
> > -               }
> >         }
> >  #endif
> >
> > @@ -2776,10 +2765,6 @@ aty_init_exit:
> >         par->pll_ops->set_pll(info, &par->saved_pll);
> >
> >  #ifdef CONFIG_MTRR
> > -       if (par->mtrr_reg >= 0) {
> > -               mtrr_del(par->mtrr_reg, 0, 0);
> > -               par->mtrr_reg = -1;
> > -       }
> >         if (par->mtrr_aper >= 0) {
> >                 mtrr_del(par->mtrr_aper, 0, 0);
> >                 par->mtrr_aper = -1;
> > @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
> >         }
> >
> >         info->fix.mmio_start = raddr;
> > -       par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000);
> > +       par->ati_regbase = ioremap_nocache(info->fix.mmio_start, 0x1000);
> 
> Double-check me, but I think that ioremap_nocache + WC MTRR = WC. 

Precicely, in this case the WC hole was obtained by using MTRR WC. This
patch removes that WC hole trick and now we can be explciit about
only wanting ioremap_nocache() on the registers, that is WC is not
desired here and is not used. The patch does not highlight the fact
that there was left in place another ioremap() call for the framebuffer:

info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);

That is the one that later after this patch we use ioremap_wc() for.
This patch just removes the hole solution. That's all.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Lutomirski March 27, 2015, 9:21 p.m. UTC | #9
On Fri, Mar 27, 2015 at 1:12 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
> On Fri, Mar 20, 2015 at 04:52:18PM -0700, Andy Lutomirski wrote:
>> On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
>> <mcgrof@do-not-panic.com> wrote:
>> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
>> > index 8025624..8875e56 100644
>> > --- a/drivers/video/fbdev/aty/atyfb_base.c
>> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
>> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
>> >
>> >  #ifdef CONFIG_MTRR
>> >         par->mtrr_aper = -1;
>> > -       par->mtrr_reg = -1;
>> >         if (!nomtrr) {
>> > -               /* Cover the whole resource. */
>> > -               par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
>> > +               par->mtrr_aper = mtrr_add(info->fix.smem_start,
>> > +                                         info->fix.smem_len,
>> >                                           MTRR_TYPE_WRCOMB, 1);
>> > -               if (par->mtrr_aper >= 0 && !par->aux_start) {
>> > -                       /* Make a hole for mmio. */
>> > -                       par->mtrr_reg = mtrr_add(par->res_start + 0x800000 -
>> > -                                                GUI_RESERVE, GUI_RESERVE,
>> > -                                                MTRR_TYPE_UNCACHABLE, 1);
>> > -                       if (par->mtrr_reg < 0) {
>> > -                               mtrr_del(par->mtrr_aper, 0, 0);
>> > -                               par->mtrr_aper = -1;
>> > -                       }
>> > -               }
>> >         }
>> >  #endif
>> >
>> > @@ -2776,10 +2765,6 @@ aty_init_exit:
>> >         par->pll_ops->set_pll(info, &par->saved_pll);
>> >
>> >  #ifdef CONFIG_MTRR
>> > -       if (par->mtrr_reg >= 0) {
>> > -               mtrr_del(par->mtrr_reg, 0, 0);
>> > -               par->mtrr_reg = -1;
>> > -       }
>> >         if (par->mtrr_aper >= 0) {
>> >                 mtrr_del(par->mtrr_aper, 0, 0);
>> >                 par->mtrr_aper = -1;
>> > @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
>> >         }
>> >
>> >         info->fix.mmio_start = raddr;
>> > -       par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000);
>> > +       par->ati_regbase = ioremap_nocache(info->fix.mmio_start, 0x1000);
>>
>> Double-check me, but I think that ioremap_nocache + WC MTRR = WC.
>
> Precicely, in this case the WC hole was obtained by using MTRR WC. This
> patch removes that WC hole trick and now we can be explciit about
> only wanting ioremap_nocache() on the registers, that is WC is not
> desired here and is not used. The patch does not highlight the fact
> that there was left in place another ioremap() call for the framebuffer:
>
> info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
>
> That is the one that later after this patch we use ioremap_wc() for.
> This patch just removes the hole solution. That's all.
>

I don't understand.

If I read it right, there's a 2^n byte BAR.  You're requesting WC for
the whole think using arch_phys_wc_add.  On a PAT system that has no
effect and all is well.  On a non-PAT system, it adds an MTRR.  That
means that you need to override the MTRR somehow for the mmio regs,
and UC- won't do the trick.

Or am I missing something here?

--Andy

>   Luis
Ville Syrjälä March 27, 2015, 9:56 p.m. UTC | #10
On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
> On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
> > On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
> > > On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
> > >> On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
> > >> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> > >> > index 8025624..8875e56 100644
> > >> > --- a/drivers/video/fbdev/aty/atyfb_base.c
> > >> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
> > >> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
> > >> >
> > >> >  #ifdef CONFIG_MTRR
> > >> >     par->mtrr_aper = -1;
> > >> > -   par->mtrr_reg = -1;
> > >> >     if (!nomtrr) {
> > >> > -           /* Cover the whole resource. */
> > >> > -           par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> > >> > +           par->mtrr_aper = mtrr_add(info->fix.smem_start,
> > >> > +                                     info->fix.smem_len,
> > >> >                                       MTRR_TYPE_WRCOMB, 1);
> > >>
> > >> MTRRs need power of two size, so how is this supposed to work?
> > >
> > > As per mtrr_add_page() [0] the base and size are just supposed to be in units
> > > of 4 KiB, although the practice is to use powers of 2 in *some* drivers this
> > > is not standardized and by no means recorded as a requirement. Obviously
> > > powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add()
> > > will use mtrr_check() to verify the the same requirement. Furthermore,
> > > as per my commit log message:
> > 
> > Whatever the code may or may not do, the x86 architecture uses
> > power-of-two MTRR sizes.  So I'm confused.
> 
> There should be no confusion, I simply did not know that *was* the
> requirement for x86, if that is the case we should add a check for that
> and perhaps generalize a helper that does the power of two helper changes,
> the cleanest I found was the vesafb driver solution.
> 
> Thoughts?

The vesafb solution is bad since you'll only end up covering only
the first 4MB of the framebuffer instead of the almost 8MB you want.
Which in practice will mean throwing away half the VRAM since you really
don't want the massive performance hit from accessing it as UC. And that
would mean giving up decent display resolutions as well :(

And the other option of trying to cover the remainder with multiple ever
smaller MTRRs doesn't work either since you'll run out of MTRRs very
quickly.

This is precisely why I used the hole method in atyfb in the first
place.

I don't really like the idea of any new mtrr code not supporting that
use case, especially as these things tend to be present in older machines
where PAT isn't an option.
Andy Lutomirski March 27, 2015, 10:02 p.m. UTC | #11
On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä <syrjala@sci.fi> wrote:
> On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
>> On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
>> > On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
>> > > On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
>> > >> On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
>> > >> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
>> > >> > index 8025624..8875e56 100644
>> > >> > --- a/drivers/video/fbdev/aty/atyfb_base.c
>> > >> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
>> > >> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
>> > >> >
>> > >> >  #ifdef CONFIG_MTRR
>> > >> >     par->mtrr_aper = -1;
>> > >> > -   par->mtrr_reg = -1;
>> > >> >     if (!nomtrr) {
>> > >> > -           /* Cover the whole resource. */
>> > >> > -           par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
>> > >> > +           par->mtrr_aper = mtrr_add(info->fix.smem_start,
>> > >> > +                                     info->fix.smem_len,
>> > >> >                                       MTRR_TYPE_WRCOMB, 1);
>> > >>
>> > >> MTRRs need power of two size, so how is this supposed to work?
>> > >
>> > > As per mtrr_add_page() [0] the base and size are just supposed to be in units
>> > > of 4 KiB, although the practice is to use powers of 2 in *some* drivers this
>> > > is not standardized and by no means recorded as a requirement. Obviously
>> > > powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add()
>> > > will use mtrr_check() to verify the the same requirement. Furthermore,
>> > > as per my commit log message:
>> >
>> > Whatever the code may or may not do, the x86 architecture uses
>> > power-of-two MTRR sizes.  So I'm confused.
>>
>> There should be no confusion, I simply did not know that *was* the
>> requirement for x86, if that is the case we should add a check for that
>> and perhaps generalize a helper that does the power of two helper changes,
>> the cleanest I found was the vesafb driver solution.
>>
>> Thoughts?
>
> The vesafb solution is bad since you'll only end up covering only
> the first 4MB of the framebuffer instead of the almost 8MB you want.
> Which in practice will mean throwing away half the VRAM since you really
> don't want the massive performance hit from accessing it as UC. And that
> would mean giving up decent display resolutions as well :(
>
> And the other option of trying to cover the remainder with multiple ever
> smaller MTRRs doesn't work either since you'll run out of MTRRs very
> quickly.
>
> This is precisely why I used the hole method in atyfb in the first
> place.
>
> I don't really like the idea of any new mtrr code not supporting that
> use case, especially as these things tend to be present in older machines
> where PAT isn't an option.

According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6,
non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have
an effective memory type of UC.  Hence my suggestion to add
ioremap_x86_uc and/or set_memory_x86_uc to punch a UC hole in an
otherwise WC MTRR-covered region.

ioremap_nocache is UC- (even on non-PAT unless I misunderstood how
this stuff works), so ioremap_nocache by itself isn't good enough.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain March 27, 2015, 11:31 p.m. UTC | #12
On Fri, Mar 27, 2015 at 02:21:34PM -0700, Andy Lutomirski wrote:
> On Fri, Mar 27, 2015 at 1:12 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
> > On Fri, Mar 20, 2015 at 04:52:18PM -0700, Andy Lutomirski wrote:
> >> On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
> >> <mcgrof@do-not-panic.com> wrote:
> >> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> >> > index 8025624..8875e56 100644
> >> > --- a/drivers/video/fbdev/aty/atyfb_base.c
> >> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
> >> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
> >> >
> >> >  #ifdef CONFIG_MTRR
> >> >         par->mtrr_aper = -1;
> >> > -       par->mtrr_reg = -1;
> >> >         if (!nomtrr) {
> >> > -               /* Cover the whole resource. */
> >> > -               par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> >> > +               par->mtrr_aper = mtrr_add(info->fix.smem_start,
> >> > +                                         info->fix.smem_len,
> >> >                                           MTRR_TYPE_WRCOMB, 1);
> >> > -               if (par->mtrr_aper >= 0 && !par->aux_start) {
> >> > -                       /* Make a hole for mmio. */
> >> > -                       par->mtrr_reg = mtrr_add(par->res_start + 0x800000 -
> >> > -                                                GUI_RESERVE, GUI_RESERVE,
> >> > -                                                MTRR_TYPE_UNCACHABLE, 1);
> >> > -                       if (par->mtrr_reg < 0) {
> >> > -                               mtrr_del(par->mtrr_aper, 0, 0);
> >> > -                               par->mtrr_aper = -1;
> >> > -                       }
> >> > -               }
> >> >         }
> >> >  #endif
> >> >
> >> > @@ -2776,10 +2765,6 @@ aty_init_exit:
> >> >         par->pll_ops->set_pll(info, &par->saved_pll);
> >> >
> >> >  #ifdef CONFIG_MTRR
> >> > -       if (par->mtrr_reg >= 0) {
> >> > -               mtrr_del(par->mtrr_reg, 0, 0);
> >> > -               par->mtrr_reg = -1;
> >> > -       }
> >> >         if (par->mtrr_aper >= 0) {
> >> >                 mtrr_del(par->mtrr_aper, 0, 0);
> >> >                 par->mtrr_aper = -1;
> >> > @@ -3466,7 +3451,7 @@ static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
> >> >         }
> >> >
> >> >         info->fix.mmio_start = raddr;
> >> > -       par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000);
> >> > +       par->ati_regbase = ioremap_nocache(info->fix.mmio_start, 0x1000);
> >>
> >> Double-check me, but I think that ioremap_nocache + WC MTRR = WC.
> >
> > Precicely, in this case the WC hole was obtained by using MTRR WC. This
> > patch removes that WC hole trick and now we can be explciit about
> > only wanting ioremap_nocache() on the registers, that is WC is not
> > desired here and is not used. The patch does not highlight the fact
> > that there was left in place another ioremap() call for the framebuffer:
> >
> > info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
> >
> > That is the one that later after this patch we use ioremap_wc() for.
> > This patch just removes the hole solution. That's all.
> >
> 
> I don't understand.
> 
> If I read it right, there's a 2^n byte BAR.  You're requesting WC for
> the whole think using arch_phys_wc_add.

I believe there is a misunderstanding of order of changes.

Let's split when we use mtrr_add() Vs arch_phys_wc_add() to avoid
confusion as in this patch we don't yet use arch_phys_wc_add(). That
is done in the next patch.

The commit log describes best the state of affairs prior to this
patch:

    The atyfb driver uses an MTRR work around since some
    cards use the same PCI BAR for the framebuffer and MMIO.
    In such cards the last page is used for MMIO, the rest for
    the framebuffer, so on those cards we ioremap() the MMIO
    page alone, then again ioremap() the full framebuffer
    including the MMIO space *and* ___then___ use an MTRR with
    MTRR_TYPE_WRCOMB on the full PCI BAR... and finally "hole"
    in an MTRR_TYPE_UNCACHABLE MTRR only for MMIO.

Then this patch drops the MTRR_TYPE_UNCACHABLE rewrite thing
and instead corrects the ioremap() call for the framebuffer
to only be called for the framebuffer alone. For the MMIO
area we adjust to use then ioremap_nocache(). The MTRR left
is now only *for the framebuffer* and it should not be touching
the MMIO area. So the MMIO area has its own ioremap_nocache()
area alone, the framebuffer is left with an ioremap() followed
by an mtrr_add() call.

The next patch replaces the mtrr_add() with arch_phys_wc_add()
and then also uses ioremap_wc().

> On a PAT system that has no effect and all is well.

Yeah we're not doing arch_phys_wc_add() on the entire PCI BAR.
That was dumb, this fixes that, and on this patch mtrr_add()
is still used.

> On a non-PAT system, it adds an MTRR.  That
> means that you need to override the MTRR somehow for the mmio regs,
> and UC- won't do the trick.

We don't need to solve that problem here as the MTRR should only
be for the framebuffer.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain March 28, 2015, 12:21 a.m. UTC | #13
On Fri, Mar 27, 2015 at 11:56:55PM +0200, Ville Syrjälä wrote:
> On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
> > On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
> > > On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
> > > > On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
> > > >> On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
> > > >> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> > > >> > index 8025624..8875e56 100644
> > > >> > --- a/drivers/video/fbdev/aty/atyfb_base.c
> > > >> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
> > > >> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
> > > >> >
> > > >> >  #ifdef CONFIG_MTRR
> > > >> >     par->mtrr_aper = -1;
> > > >> > -   par->mtrr_reg = -1;
> > > >> >     if (!nomtrr) {
> > > >> > -           /* Cover the whole resource. */
> > > >> > -           par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> > > >> > +           par->mtrr_aper = mtrr_add(info->fix.smem_start,
> > > >> > +                                     info->fix.smem_len,
> > > >> >                                       MTRR_TYPE_WRCOMB, 1);
> > > >>
> > > >> MTRRs need power of two size, so how is this supposed to work?
> > > >
> > > > As per mtrr_add_page() [0] the base and size are just supposed to be in units
> > > > of 4 KiB, although the practice is to use powers of 2 in *some* drivers this
> > > > is not standardized and by no means recorded as a requirement. Obviously
> > > > powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add()
> > > > will use mtrr_check() to verify the the same requirement. Furthermore,
> > > > as per my commit log message:
> > > 
> > > Whatever the code may or may not do, the x86 architecture uses
> > > power-of-two MTRR sizes.  So I'm confused.
> > 
> > There should be no confusion, I simply did not know that *was* the
> > requirement for x86, if that is the case we should add a check for that
> > and perhaps generalize a helper that does the power of two helper changes,
> > the cleanest I found was the vesafb driver solution.
> > 
> > Thoughts?
> 
> The vesafb solution is bad since you'll only end up covering only
> the first 4MB of the framebuffer instead of the almost 8MB you want.

OK so the power of 2 requirement implicates us *having* to use a large
MTRR that includes the MMIo region in the shared PCI case?

Andy, Ville, are we 100% certain about this power of two requirement?
Is that for the base and size or just the size?

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain March 28, 2015, 12:28 a.m. UTC | #14
On Fri, Mar 27, 2015 at 03:02:10PM -0700, Andy Lutomirski wrote:
> On Fri, Mar 27, 2015 at 2:56 PM, Ville Syrjälä <syrjala@sci.fi> wrote:
> > On Fri, Mar 27, 2015 at 08:57:59PM +0100, Luis R. Rodriguez wrote:
> >> On Fri, Mar 27, 2015 at 12:43:55PM -0700, Andy Lutomirski wrote:
> >> > On Fri, Mar 27, 2015 at 12:38 PM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
> >> > > On Sat, Mar 21, 2015 at 11:15:14AM +0200, Ville Syrjälä wrote:
> >> > >> On Fri, Mar 20, 2015 at 04:17:59PM -0700, Luis R. Rodriguez wrote:
> >> > >> > diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
> >> > >> > index 8025624..8875e56 100644
> >> > >> > --- a/drivers/video/fbdev/aty/atyfb_base.c
> >> > >> > +++ b/drivers/video/fbdev/aty/atyfb_base.c
> >> > >> > @@ -2630,21 +2630,10 @@ static int aty_init(struct fb_info *info)
> >> > >> >
> >> > >> >  #ifdef CONFIG_MTRR
> >> > >> >     par->mtrr_aper = -1;
> >> > >> > -   par->mtrr_reg = -1;
> >> > >> >     if (!nomtrr) {
> >> > >> > -           /* Cover the whole resource. */
> >> > >> > -           par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
> >> > >> > +           par->mtrr_aper = mtrr_add(info->fix.smem_start,
> >> > >> > +                                     info->fix.smem_len,
> >> > >> >                                       MTRR_TYPE_WRCOMB, 1);
> >> > >>
> >> > >> MTRRs need power of two size, so how is this supposed to work?
> >> > >
> >> > > As per mtrr_add_page() [0] the base and size are just supposed to be in units
> >> > > of 4 KiB, although the practice is to use powers of 2 in *some* drivers this
> >> > > is not standardized and by no means recorded as a requirement. Obviously
> >> > > powers of 2 will work too and you'd end up neatly aligned as well. mtrr_add()
> >> > > will use mtrr_check() to verify the the same requirement. Furthermore,
> >> > > as per my commit log message:
> >> >
> >> > Whatever the code may or may not do, the x86 architecture uses
> >> > power-of-two MTRR sizes.  So I'm confused.
> >>
> >> There should be no confusion, I simply did not know that *was* the
> >> requirement for x86, if that is the case we should add a check for that
> >> and perhaps generalize a helper that does the power of two helper changes,
> >> the cleanest I found was the vesafb driver solution.
> >>
> >> Thoughts?
> >
> > The vesafb solution is bad since you'll only end up covering only
> > the first 4MB of the framebuffer instead of the almost 8MB you want.
> > Which in practice will mean throwing away half the VRAM since you really
> > don't want the massive performance hit from accessing it as UC. And that
> > would mean giving up decent display resolutions as well :(
> >
> > And the other option of trying to cover the remainder with multiple ever
> > smaller MTRRs doesn't work either since you'll run out of MTRRs very
> > quickly.
> >
> > This is precisely why I used the hole method in atyfb in the first
> > place.
> >
> > I don't really like the idea of any new mtrr code not supporting that
> > use case, especially as these things tend to be present in older machines
> > where PAT isn't an option.
> 
> According to the Intel SDM, volume 3, section 11.5.2.1, table 11-6,
> non-PAT CPUs that have a WC MTRR, PCD = 1, and PWT = 1 (aka UC) have
> an effective memory type of UC.  Hence my suggestion to add
> ioremap_x86_uc and/or set_memory_x86_uc to punch a UC hole in an
> otherwise WC MTRR-covered region.

OK I think I get it now.

And I take it this would hopefully only be used for non-PAT systems?
Would there be a use case for PAT systems? I wonder if we can wrap
this under some APIs to make it clean and hide this dirty thing
behind the scenes, it seems a fragile and error prone and my hope
would be that we won't need more specialization in this area for
PAT systems.

> ioremap_nocache is UC- (even on non-PAT unless I misunderstood how
> this stuff works), so ioremap_nocache by itself isn't good enough.

Thanks for the clarification.

 Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/video/fbdev/aty/atyfb.h b/drivers/video/fbdev/aty/atyfb.h
index 1f39a62..89ec439 100644
--- a/drivers/video/fbdev/aty/atyfb.h
+++ b/drivers/video/fbdev/aty/atyfb.h
@@ -184,7 +184,6 @@  struct atyfb_par {
 	spinlock_t int_lock;
 #ifdef CONFIG_MTRR
 	int mtrr_aper;
-	int mtrr_reg;
 #endif
 	u32 mem_cntl;
 	struct crtc saved_crtc;
diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
index 8025624..8875e56 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -2630,21 +2630,10 @@  static int aty_init(struct fb_info *info)
 
 #ifdef CONFIG_MTRR
 	par->mtrr_aper = -1;
-	par->mtrr_reg = -1;
 	if (!nomtrr) {
-		/* Cover the whole resource. */
-		par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
+		par->mtrr_aper = mtrr_add(info->fix.smem_start,
+					  info->fix.smem_len,
 					  MTRR_TYPE_WRCOMB, 1);
-		if (par->mtrr_aper >= 0 && !par->aux_start) {
-			/* Make a hole for mmio. */
-			par->mtrr_reg = mtrr_add(par->res_start + 0x800000 -
-						 GUI_RESERVE, GUI_RESERVE,
-						 MTRR_TYPE_UNCACHABLE, 1);
-			if (par->mtrr_reg < 0) {
-				mtrr_del(par->mtrr_aper, 0, 0);
-				par->mtrr_aper = -1;
-			}
-		}
 	}
 #endif
 
@@ -2776,10 +2765,6 @@  aty_init_exit:
 	par->pll_ops->set_pll(info, &par->saved_pll);
 
 #ifdef CONFIG_MTRR
-	if (par->mtrr_reg >= 0) {
-		mtrr_del(par->mtrr_reg, 0, 0);
-		par->mtrr_reg = -1;
-	}
 	if (par->mtrr_aper >= 0) {
 		mtrr_del(par->mtrr_aper, 0, 0);
 		par->mtrr_aper = -1;
@@ -3466,7 +3451,7 @@  static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
 	}
 
 	info->fix.mmio_start = raddr;
-	par->ati_regbase = ioremap(info->fix.mmio_start, 0x1000);
+	par->ati_regbase = ioremap_nocache(info->fix.mmio_start, 0x1000);
 	if (par->ati_regbase == NULL)
 		return -ENOMEM;
 
@@ -3491,6 +3476,8 @@  static int atyfb_setup_generic(struct pci_dev *pdev, struct fb_info *info,
 	info->fix.smem_start = addr;
 	info->fix.smem_len = 0x800000;
 
+	aty_fudge_framebuffer_len(info);
+
 	info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
 	if (info->screen_base == NULL) {
 		ret = -ENOMEM;
@@ -3563,6 +3550,7 @@  static int atyfb_pci_probe(struct pci_dev *pdev,
 		return -ENOMEM;
 	}
 	par = info->par;
+	par->bus_type = PCI;
 	info->fix = atyfb_fix;
 	info->device = &pdev->dev;
 	par->pci_id = pdev->device;
@@ -3732,10 +3720,6 @@  static void atyfb_remove(struct fb_info *info)
 #endif
 
 #ifdef CONFIG_MTRR
-	if (par->mtrr_reg >= 0) {
-		mtrr_del(par->mtrr_reg, 0, 0);
-		par->mtrr_reg = -1;
-	}
 	if (par->mtrr_aper >= 0) {
 		mtrr_del(par->mtrr_aper, 0, 0);
 		par->mtrr_aper = -1;