
drm/ttm: set TTM allocated pages as reserved

Message ID 20230329135401.105592-1-christian.koenig@amd.com (mailing list archive)
State New, archived
Series drm/ttm: set TTM allocated pages as reserved

Commit Message

Christian König March 29, 2023, 1:54 p.m. UTC
KVM tries to grab references to pages in VMAs marked with VM_PFNMAP.

This is illegal and can cause data corruption with TTM pages because
only some of them are actually reference counted.

Mark all pages allocated by TTM as reserved; this way KVM handles the
PFNs as if they pointed to MMIO space.

This still results in a warning, but at least no other problems.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_pool.c | 62 ++++++++++++++++++++--------------
 1 file changed, 36 insertions(+), 26 deletions(-)

Comments

Paolo Bonzini March 29, 2023, 2:28 p.m. UTC | #1
On 3/29/23 15:54, Christian König wrote:
> KVM tries to grab references to pages in VMAs marked with VM_PFNMAP.
> This is illegal and can cause data corruption with TTM pages because
> only some of them are actually reference counted.

I think that you are referring to this:

         /* Don't set the __GFP_COMP flag for higher order allocations.
          * Mapping pages directly into an userspace process and calling
          * put_page() on a TTM allocated page is illegal.
          */
         if (order)
                 gfp_flags |= __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN |
                         __GFP_KSWAPD_RECLAIM;

By "directly" I guess you mean without going through remap_pfn_range().

Based on our discussion offlist, it should be possible to remove the
get_page/put_page from the path that fills in the KVM page table, but
it is difficult to remove it altogether (it requires changing everything
to use userspace virtual addresses).

Indeed KVM needs to detect non-reference-counted pages because unfortunately
there are cases where people want to map VM_PFNMAP pages into a guest.  If
it is not enough to have PageReserved set, and there is a better check, KVM
can be fixed too.
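
For reference, the detection in question looks roughly like this; paraphrased
from memory of virt/kvm/kvm_main.c around the time of this thread (~v6.3), so
treat names and details as approximate rather than a verbatim quote:

	/* Paraphrase, not a verbatim copy of the tree. */
	struct page *kvm_pfn_to_refcounted_page(kvm_pfn_t pfn)
	{
		struct page *page;

		if (!pfn_valid(pfn))
			return NULL;

		page = pfn_to_page(pfn);
		if (!PageReserved(page))
			return page;

		/* The ZERO_PAGE(s) are PG_reserved but still refcounted. */
		if (is_zero_pfn(pfn))
			return page;

		/* ZONE_DEVICE pages set PG_reserved but are refcounted for
		 * KVM's purposes.
		 */
		if (kvm_is_zone_device_page(page))
			return page;

		/* Anything else marked PG_reserved is treated as not
		 * refcounted, i.e. like MMIO -- which is what this patch
		 * relies on for TTM pages.
		 */
		return NULL;
	}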

> Mark all pages allocated by TTM as reserved, this way KVM handles the
> PFNs like they would point to MMIO space.
> 
> This still results in a warning, but at least no other problem.

What warning is it?

Paolo

> Signed-off-by: Christian König <christian.koenig@amd.com>
Paolo Bonzini March 29, 2023, 3:08 p.m. UTC | #2
On 3/29/23 16:28, Paolo Bonzini wrote:
> On 3/29/23 15:54, Christian König wrote:
>> KVM tries to grab references to pages in VMAs marked with VM_PFNMAP.
>> This is illegal and can cause data corruption with TTM pages because
>> only some of them are actually reference counted.

After some other offlist discussions, I also would like to understand 
what you mean by corruption.

First, is it a _host_ corruption or a guest corruption/crash?  A guest 
crash would be KVM doing exactly what it's meant to do: it detects the 
non-reserved, non-refcounted page and refuses to map it into the guest.

On the other hand, if it is a host crash, my understanding is that an 
order>0 allocation leaves the tail pages with a zero reference count 
(and without a compound_head if, as in this case, __GFP_COMP is unset). 
If that's correct, more analysis is needed to understand why 
get_page_unless_zero() isn't rejecting the tail pages.
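
As a minimal illustration of that scenario, here is a hypothetical demo (the
helpers are real kernel functions, but the demo function itself is made up and
exists nowhere in the tree):

	#include <linux/mm.h>
	#include <linux/gfp.h>

	/* Hypothetical demo: why the tail pages of a non-compound (no
	 * __GFP_COMP) higher-order allocation cannot be grabbed by refcount.
	 */
	static void demo_noncompound_tail_pages(void)
	{
		struct page *head = alloc_pages(GFP_KERNEL, 2);	/* order 2 */
		struct page *tail;

		if (!head)
			return;

		tail = head + 1;

		/* The allocation's single reference lives on the head page;
		 * the tail pages stay at refcount zero and have no
		 * compound_head link.
		 */
		pr_info("head count=%d tail count=%d\n",
			page_count(head), page_count(tail));

		/* get_page_unless_zero() therefore rejects the tail page... */
		if (!get_page_unless_zero(tail))
			pr_info("tail page rejected, as expected\n");

		/* ...while an unconditional get_page()/put_page() pair on it
		 * would corrupt the refcount, which is the failure mode being
		 * discussed here.
		 */
		__free_pages(head, 2);
	}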

Paolo


>> Mark all pages allocated by TTM as reserved, this way KVM handles the
>> PFNs like they would point to MMIO space.
>>
>> This still results in a warning, but at least no other problem.
> 
> What warning is it?
> 
> Paolo
> 
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>
Christian König March 29, 2023, 3:29 p.m. UTC | #3
Am 29.03.23 um 17:08 schrieb Paolo Bonzini:
> On 3/29/23 16:28, Paolo Bonzini wrote:
>> On 3/29/23 15:54, Christian König wrote:
>>> KVM tries to grab references to pages in VMAs marked with VM_PFNMAP.
>>> This is illegal and can cause data corruption with TTM pages because
>>> only some of them are actually reference counted.
>
> After some other offlist discussions, I also would like to understand 
> what you mean by corruption.

I think what was meant here is that only parts of the buffers were
updated, but see below.

>
> First, is it a _host_ corruption or a guest corruption/crash?  A guest 
> crash would be KVM doing exactly what it's meant to do: it detects the 
> non-reserved, non-refcounted page and refuses to map it into the guest.

Yes, I think that this is what happens. The use case and mapping are 
indeed valid as far as I can see, but the handling that KVM does here is 
really problematic.

When the PFN points to an IO address it just works because that isn't 
backed by struct pages. When the PFN points to the first page of a 
higher-order allocation it also works; only when the PFN points to a 
following page does kvm_try_get_pfn() fail and cause the problems.
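
For reference, the helper that fails here looks roughly like this; paraphrased 
from memory of the KVM code of that era (it is called at the end of 
hva_to_pfn_remapped() for VM_PFNMAP/VM_IO mappings, as far as I remember), so 
treat it as approximate:

	/* Paraphrase, not a verbatim copy of virt/kvm/kvm_main.c. */
	static int kvm_try_get_pfn(kvm_pfn_t pfn)
	{
		struct page *page = kvm_pfn_to_refcounted_page(pfn);

		/* No struct page behind the PFN (e.g. real MMIO): nothing
		 * to refcount, so this succeeds.
		 */
		if (!page)
			return 1;

		/* Head page of the TTM allocation: refcount > 0, succeeds.
		 * Tail page of a non-compound higher-order allocation:
		 * refcount == 0, so this fails and the fault is rejected --
		 * exactly the case described above.
		 */
		return get_page_unless_zero(page);
	}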

What needs to happen instead is that when both hva_to_pfn_fast() and 
hva_to_pfn_slow() fail you don't try to convert the PFN into a page and 
so also don't get a reference to the page.

This somehow needs to be communicated to the callers of hva_to_pfn() so 
that kvm_release_pfn() knows what to do.

Regards,
Christian.

>
> On the other hand, if it is a host crash, my understanding is that an 
> order>0 allocation leaves the tail pages with a zero reference count 
> (and without a compound_head if, as in this case, __GFP_COMP is 
> unset). If that's correct, more analysis is needed to understand why 
> get_page_unless_zero() isn't rejecting the tail pages.
>
> Paolo
>
>
>>> Mark all pages allocated by TTM as reserved, this way KVM handles the
>>> PFNs like they would point to MMIO space.
>>>
>>> This still results in a warning, but at least no other problem.
>>
>> What warning is it?
>>
>> Paolo
>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>
>
Paolo Bonzini March 29, 2023, 3:51 p.m. UTC | #4
On 3/29/23 17:29, Christian König wrote:
>> First, is it a _host_ corruption or a guest corruption/crash?  A guest 
>> crash would be KVM doing exactly what it's meant to do: it detects the 
>> non-reserved, non-refcounted page and refuses to map it into the guest.
> 
> Yes I think that this is what happens.

Ok, I was worried all along that this was a host corruption/crash, which 
obviously would have been much worse.

> The use case and mapping is indeed valid as far as I can see, but
> the handling that KVM does here is really problematic.
> 
> What needs to happen instead is that when both hva_to_pfn_fast() and 
> hva_to_pfn_slow() fails you don't try to convert the PFN into a page and 
> so also don't get a reference to the page.
> 
> This somehow needs to be communicated to the callers of hva_to_pfn() so 
> that kvm_release_pfn() knows what to do.

There's a bit more complication here:

1) in the guest page fault path we can avoid taking the reference altogether

2) in other MMU-notifier-protected paths, we can also avoid taking the 
reference but we also must stop using kmap() in virt/kvm/pfncache.c.

3) other uses of kmap() must switch to MMU-notifier protection.


If the DRM people are okay with SetPageReserved() as a temporary hack, 
we can change or remove the WARN in kvm_is_zone_device_page(), since 
that is what you are referring to in the commit message.
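
The WARN in question should be this one; again a paraphrase from memory rather 
than a verbatim copy of virt/kvm/kvm_main.c:

	bool kvm_is_zone_device_page(struct page *page)
	{
		/* The ZONE_DEVICE metadata is only guaranteed to be valid
		 * while the device memory is pinned, so warn when the page
		 * has a zero refcount -- which is presumably what the
		 * PG_reserved TTM tail pages would now trigger.
		 */
		if (WARN_ON_ONCE(!page_count(page)))
			return false;

		return is_zone_device_page(page);
	}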

Paolo
Christian König March 29, 2023, 4:43 p.m. UTC | #5
Am 29.03.23 um 17:51 schrieb Paolo Bonzini:
> On 3/29/23 17:29, Christian König wrote:
>>> First, is it a _host_ corruption or a guest corruption/crash?  A 
>>> guest crash would be KVM doing exactly what it's meant to do: it 
>>> detects the non-reserved, non-refcounted page and refuses to map it 
>>> into the guest.
>>
>> Yes I think that this is what happens.
>
> Ok I was worried all the time that this was host corruption/crash; 
> which obviously would have been much worse.
>
>> The use case and mapping is indeed valid as far as I can see, but
>> the handling that KVM does here is really problematic.
>>
>> What needs to happen instead is that when both hva_to_pfn_fast() and 
>> hva_to_pfn_slow() fails you don't try to convert the PFN into a page 
>> and so also don't get a reference to the page.
>>
>> This somehow needs to be communicated to the callers of hva_to_pfn() 
>> so that kvm_release_pfn() knows what to do.
>
> There's a bit more complication here:
>
> 1) in the guest page fault path we can avoid taking the reference 
> altogether
>
> 2) in other MMU-notifier-protected paths, we can also avoid taking the 
> reference but we also must stop using kmap() in virt/kvm/pfncache.c.
>
> 3) other uses of kmap() must switch to MMU-notifier protection.

I would rather suggest returning the page in addition to the pfn from 
hva_to_pfn() when the function was able to grab a reference to it.

When the page is not available you can't call kmap() on it either.
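
Purely as a hypothetical sketch of that suggestion (the extra out-parameter and 
its name are made up here, and the existing parameter list is reproduced from 
memory, so it may not match any particular kernel version exactly):

	/* Hypothetical interface change, not an existing KVM API. */
	kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
			     bool *async, bool write_fault, bool *writable,
			     struct page **refcounted_page);

	/* *refcounted_page would be set only when hva_to_pfn_fast()/
	 * hva_to_pfn_slow() actually took a reference; otherwise it stays
	 * NULL, and callers such as kvm_release_pfn() and the kmap() users
	 * check it instead of guessing from the pfn.
	 */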

>
> If the DRM people are okay with SetPageReserved() as a temporary hack, 
> we can change or remove the WARN in kvm_is_zone_device_page(), since 
> that is what you are referring to in the commit message.

Adding Daniel for additional comments on this, but essentially we have 
changed quite some infrastructure to make sure that everybody uses 
VM_PFNMAP to prevent stuff like this from happening.

I would really prefer a proper solution in KVM instead.

Christian.

>
> Paolo
>
Paolo Bonzini March 29, 2023, 4:56 p.m. UTC | #6
On 3/29/23 18:43, Christian König wrote:
>>
>>
>> 3) other uses of kmap() must switch to MMU-notifier protection.
> 
> I would rather suggest to return the page additionally to the pfn from 
> hva_to_pfn() when the function was able to grab a reference to it.
> 
> When the page is then not available you can't call kmap() on that either.
> 
>>
>> If the DRM people are okay with SetPageReserved() as a temporary hack, 
>> we can change or remove the WARN in kvm_is_zone_device_page(), since 
>> that is what you are referring to in the commit message.
> 
> Adding Daniel for additional comments on this, but essentially we have 
> changed quite some infrastructure to make sure that everybody uses 
> VM_PFNMAP to prevent stuff like this from happening.
> 
> I would really prefer a proper solution in KVM instead.

Hmm... Now that I think about it that would be

https://lore.kernel.org/kvm/Yc4H+dGfK83BaGpC@google.com/t/

Time to resurrect that work.

Paolo
Sean Christopherson March 29, 2023, 5:22 p.m. UTC | #7
+David

On Wed, Mar 29, 2023, Paolo Bonzini wrote:
> On 3/29/23 18:43, Christian König wrote:
> > > 
> > > 
> > > 3) other uses of kmap() must switch to MMU-notifier protection.
> > 
> > I would rather suggest to return the page additionally to the pfn from
> > hva_to_pfn() when the function was able to grab a reference to it.
> > 
> > When the page is then not available you can't call kmap() on that either.
> > 
> > > 
> > > If the DRM people are okay with SetPageReserved() as a temporary
> > > hack, we can change or remove the WARN in kvm_is_zone_device_page(),
> > > since that is what you are referring to in the commit message.
> > 
> > Adding Daniel for additional comments on this, but essentially we have
> > changed quite some infrastructure to make sure that everybody uses
> > VM_PFNMAP to prevent stuff like this from happening.
> > 
> > I would really prefer a proper solution in KVM instead.
> 
> Hmm... Now that I think about it that would be
> 
> https://lore.kernel.org/kvm/Yc4H+dGfK83BaGpC@google.com/t/
> 
> Time to resurrect that work.

Ya.  I had previously asked David to first eliminate the usage that isn't
protected by mmu_notifiers[1], but after seeing the resulting complexity, I had a
change of heart[2].  Can you weigh in on the latter thread, specifically my
proposal of using a module param to let the admin opt in to "unsafe" kmap usage?

[1] https://lore.kernel.org/kvm/Ydhq5aHW+JFo15UF@google.com
[2] https://lore.kernel.org/all/ZBEEQtmtNPaEqU1i@google.com/
Christian König March 29, 2023, 5:31 p.m. UTC | #8
Am 29.03.23 um 19:22 schrieb Sean Christopherson:
> +David
>
> On Wed, Mar 29, 2023, Paolo Bonzini wrote:
>> On 3/29/23 18:43, Christian König wrote:
>>>>
>>>> 3) other uses of kmap() must switch to MMU-notifier protection.
>>> I would rather suggest to return the page additionally to the pfn from
>>> hva_to_pfn() when the function was able to grab a reference to it.
>>>
>>> When the page is then not available you can't call kmap() on that either.
>>>
>>>> If the DRM people are okay with SetPageReserved() as a temporary
>>>> hack, we can change or remove the WARN in kvm_is_zone_device_page(),
>>>> since that is what you are referring to in the commit message.
>>> Adding Daniel for additional comments on this, but essentially we have
>>> changed quite some infrastructure to make sure that everybody uses
>>> VM_PFNMAP to prevent stuff like this from happening.
>>>
>>> I would really prefer a proper solution in KVM instead.
>> Hmm... Now that I think about it that would be
>>
>> https://lore.kernel.org/kvm/Yc4H+dGfK83BaGpC@google.com/t/
>>
>> Time to resurrect that work.
> Ya.  I had previously asked David to first eliminated the usage that isn't
> protected by mmu_notifiers, but after seeing the resulting complexity, I had a
> change of heart[2].  Can you weigh in on the latter thread, specifically my
> proposal of using a module param to let the admin opt-in to "unsafe" kmap usage.

I don't think that this is something an admin should be able to decide.

x86 systems are rather forgiving, but if you kmap() pages accessed by GPUs 
on ARM the system might just reboot spontaneously.

Robin Murphy <robin.murphy@arm.com> can fill you in on the hw details if 
needed.

Christian.

>
> [1] https://lore.kernel.org/kvm/Ydhq5aHW+JFo15UF@google.com
> [2] https://lore.kernel.org/all/ZBEEQtmtNPaEqU1i@google.com/
Sean Christopherson March 29, 2023, 5:39 p.m. UTC | #9
On Wed, Mar 29, 2023, Christian König wrote:
> Am 29.03.23 um 19:22 schrieb Sean Christopherson:
> > +David
> > 
> > On Wed, Mar 29, 2023, Paolo Bonzini wrote:
> > On 3/29/23 18:43, Christian König wrote:
> > > > > 
> > > > > 3) other uses of kmap() must switch to MMU-notifier protection.
> > > > I would rather suggest to return the page additionally to the pfn from
> > > > hva_to_pfn() when the function was able to grab a reference to it.
> > > > 
> > > > When the page is then not available you can't call kmap() on that either.
> > > > 
> > > > > If the DRM people are okay with SetPageReserved() as a temporary
> > > > > hack, we can change or remove the WARN in kvm_is_zone_device_page(),
> > > > > since that is what you are referring to in the commit message.
> > > > Adding Daniel for additional comments on this, but essentially we have
> > > > changed quite some infrastructure to make sure that everybody uses
> > > > VM_PFNMAP to prevent stuff like this from happening.
> > > > 
> > > > I would really prefer a proper solution in KVM instead.
> > > Hmm... Now that I think about it that would be
> > > 
> > > https://lore.kernel.org/kvm/Yc4H+dGfK83BaGpC@google.com/t/
> > > 
> > > Time to resurrect that work.
> > Ya.  I had previously asked David to first eliminated the usage that isn't
> > protected by mmu_notifiers, but after seeing the resulting complexity, I had a
> > change of heart[2].  Can you weigh in on the latter thread, specifically my
> > proposal of using a module param to let the admin opt-in to "unsafe" kmap usage.
> 
> I don't think that this is something an admin should be able to decide.

Why not?  Assuming the admin has CAP_SYS_ADMIN, they can reboot the host in a
myriad of ways.  The idea with the KVM module param is to allow curated setups
where the userspace VMM is trusted to a large extent, e.g. AWS' Nitro, to opt-in
to the unsafe behavior.  I.e. by enabling the flag, the admin is acknowledging
that bad things can happen if untrusted/compromised userspace gets ahold of
/dev/kvm.

> x86 system are rather grateful, but if you kmap() pages accessed by GPUs on
> ARM the system might just reboot spontaneously.

FWIW, the dangers of an unsafe kmap() are arguably far worse than spontaneous
reboots, e.g. there's potential for use-after-free and possibly even write access
to arbitrary memory.
Christian König March 29, 2023, 5:43 p.m. UTC | #10
Am 29.03.23 um 19:39 schrieb Sean Christopherson:
> On Wed, Mar 29, 2023, Christian König wrote:
>> Am 29.03.23 um 19:22 schrieb Sean Christopherson:
>>> +David
>>>
>>> On Wed, Mar 29, 2023, Paolo Bonzini wrote:
>>>> On 3/29/23 18:43, Christian König wrote:
>>>>>> 3) other uses of kmap() must switch to MMU-notifier protection.
>>>>> I would rather suggest to return the page additionally to the pfn from
>>>>> hva_to_pfn() when the function was able to grab a reference to it.
>>>>>
>>>>> When the page is then not available you can't call kmap() on that either.
>>>>>
>>>>>> If the DRM people are okay with SetPageReserved() as a temporary
>>>>>> hack, we can change or remove the WARN in kvm_is_zone_device_page(),
>>>>>> since that is what you are referring to in the commit message.
>>>>> Adding Daniel for additional comments on this, but essentially we have
>>>>> changed quite some infrastructure to make sure that everybody uses
>>>>> VM_PFNMAP to prevent stuff like this from happening.
>>>>>
>>>>> I would really prefer a proper solution in KVM instead.
>>>> Hmm... Now that I think about it that would be
>>>>
>>>> https://lore.kernel.org/kvm/Yc4H+dGfK83BaGpC@google.com/t/
>>>>
>>>> Time to resurrect that work.
>>> Ya.  I had previously asked David to first eliminated the usage that isn't
>>> protected by mmu_notifiers, but after seeing the resulting complexity, I had a
>>> change of heart[2].  Can you weigh in on the latter thread, specifically my
>>> proposal of using a module param to let the admin opt-in to "unsafe" kmap usage.
>> I don't think that this is something an admin should be able to decide.
> Why not?  Assuming the admin has CAP_SYS_ADMIN, they can reboot the host in a
> myriad of ways.  The idea with the KVM module param is to allow curated setups
> where the userspace VMM is trusted to a large extent, e.g. AWS' Nitro, to opt-in
> to the unsafe behavior.  I.e. by enabling the flag, the admin is acknowledging
> that bad things can happen if untrusted/compromised userspace gets ahold of
> /dev/kvm.

Well, that's exactly the point: you don't need untrusted/compromised 
userspace; the system could just go spontaneously into nirvana during 
normal operation.

This would result in very very hard to debug problems since the issues 
only happen rather rarely.

On the other hand why do you need the kmap() in the first place?

Regards,
Christian.

>
>> x86 system are rather grateful, but if you kmap() pages accessed by GPUs on
>> ARM the system might just reboot spontaneously.
> FWIW, the dangers of an unsafe kmap() are arguably far worse than spontaneous
> reboots, e.g. there's potential for use-after-free and possibly even write access
> to arbitrary memory.
Sean Christopherson March 29, 2023, 6:40 p.m. UTC | #11
On Wed, Mar 29, 2023, Christian König wrote:
> Am 29.03.23 um 19:39 schrieb Sean Christopherson:
> > On Wed, Mar 29, 2023, Christian König wrote:
> > > Am 29.03.23 um 19:22 schrieb Sean Christopherson:
> > > > +David
> > > > 
> > > > On Wed, Mar 29, 2023, Paolo Bonzini wrote:
> > > > > On 3/29/23 18:43, Christian König wrote:
> > > > > > > 3) other uses of kmap() must switch to MMU-notifier protection.
> > > > > > I would rather suggest to return the page additionally to the pfn from
> > > > > > hva_to_pfn() when the function was able to grab a reference to it.
> > > > > > 
> > > > > > When the page is then not available you can't call kmap() on that either.
> > > > > > 
> > > > > > > If the DRM people are okay with SetPageReserved() as a temporary
> > > > > > > hack, we can change or remove the WARN in kvm_is_zone_device_page(),
> > > > > > > since that is what you are referring to in the commit message.
> > > > > > Adding Daniel for additional comments on this, but essentially we have
> > > > > > changed quite some infrastructure to make sure that everybody uses
> > > > > > VM_PFNMAP to prevent stuff like this from happening.
> > > > > > 
> > > > > > I would really prefer a proper solution in KVM instead.
> > > > > Hmm... Now that I think about it that would be
> > > > > 
> > > > > https://lore.kernel.org/kvm/Yc4H+dGfK83BaGpC@google.com/t/
> > > > > 
> > > > > Time to resurrect that work.
> > > > Ya.  I had previously asked David to first eliminated the usage that isn't
> > > > protected by mmu_notifiers, but after seeing the resulting complexity, I had a
> > > > change of heart[2].  Can you weigh in on the latter thread, specifically my
> > > > proposal of using a module param to let the admin opt-in to "unsafe" kmap usage.
> > > I don't think that this is something an admin should be able to decide.
> > Why not?  Assuming the admin has CAP_SYS_ADMIN, they can reboot the host in a
> > myriad of ways.  The idea with the KVM module param is to allow curated setups
> > where the userspace VMM is trusted to a large extent, e.g. AWS' Nitro, to opt-in
> > to the unsafe behavior.  I.e. by enabling the flag, the admin is acknowledging
> > that bad things can happen if untrusted/compromised userspace gets ahold of
> > /dev/kvm.
> 
> Well exactly that's the point, you don't need untrusted/compromised
> userspace the system could just go spontaneously into nirvana during normal
> operation.

The proposal is that the module param is off by default, and the expectation is
that it would be turned on only by very specific setups.  I would not object to
making it even more difficult to enable, burying it behind a Kconfig, but IMO
that is unnecessary.

> This would result in very very hard to debug problems since the issues only
> happen rather rarely.
> 
> On the other hand why do you need the kmap() in the first place?

The Nitro setup manages guest memory in userspace, and hides that memory from
the kernel, e.g. to avoid struct page overhead.  To enable KVM to access guest
memory in select flows, e.g. for paravirtualization, without taking on too much
complexity, KVM supports kmap() of that memory.

For some uses, e.g. nested virtualization, switching to uaccess might be feasible,
but the paravirt stuff, especially Xen support, would likely be insanely complex
to do without kmap().
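
For context, the kmap()-based access in question is, to my understanding, the 
kvm_vcpu_map()/kvm_vcpu_unmap() pattern used by the nested virtualization and 
Xen paravirt code. A rough sketch of the usage (the demo function is made up; 
the KVM helper names and signatures are from memory and may differ slightly 
between kernel versions):

	#include <linux/kvm_host.h>

	/* Hypothetical demo of the usage pattern only, not code from the tree. */
	static int demo_touch_guest_page(struct kvm_vcpu *vcpu, gpa_t gpa)
	{
		struct kvm_host_map map;

		/* Resolves gfn->pfn via hva_to_pfn() and kmap()s the page
		 * (or memremap()s it for memory without a struct page, if I
		 * remember right).
		 */
		if (kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
			return -EFAULT;

		/* Host-side access to guest memory through the kernel mapping. */
		memset(map.hva, 0, 8);

		/* Unmaps and, for refcounted pages, drops the reference taken
		 * by hva_to_pfn(); the 'true' marks the page dirty.
		 */
		kvm_vcpu_unmap(vcpu, &map, true);
		return 0;
	}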
David Stevens March 30, 2023, 9:41 a.m. UTC | #12
On Thu, Mar 30, 2023 at 2:22 AM Sean Christopherson <seanjc@google.com> wrote:
>
> +David
>
> On Wed, Mar 29, 2023, Paolo Bonzini wrote:
> > > On 3/29/23 18:43, Christian König wrote:
> > > >
> > > >
> > > > 3) other uses of kmap() must switch to MMU-notifier protection.
> > >
> > > I would rather suggest to return the page additionally to the pfn from
> > > hva_to_pfn() when the function was able to grab a reference to it.
> > >
> > > When the page is then not available you can't call kmap() on that either.
> > >
> > > >
> > > > If the DRM people are okay with SetPageReserved() as a temporary
> > > > hack, we can change or remove the WARN in kvm_is_zone_device_page(),
> > > > since that is what you are referring to in the commit message.
> > >
> > > Adding Daniel for additional comments on this, but essentially we have
> > > changed quite some infrastructure to make sure that everybody uses
> > > VM_PFNMAP to prevent stuff like this from happening.
> > >
> > > I would really prefer a proper solution in KVM instead.
> >
> > Hmm... Now that I think about it that would be
> >
> > https://lore.kernel.org/kvm/Yc4H+dGfK83BaGpC@google.com/t/
> >
> > Time to resurrect that work.
>
> Ya.  I had previously asked David to first eliminated the usage that isn't
> protected by mmu_notifiers, but after seeing the resulting complexity, I had a
> change of heart[2].  Can you weigh in on the latter thread, specifically my
> proposal of using a module param to let the admin opt-in to "unsafe" kmap usage.
>
> [1] https://lore.kernel.org/kvm/Ydhq5aHW+JFo15UF@google.com
> [2] https://lore.kernel.org/all/ZBEEQtmtNPaEqU1i@google.com/

I just finished rebasing my patch series and sent out v6:

https://lore.kernel.org/all/20230330085802.2414466-1-stevensd@google.com/

-David

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index aa116a7bbae3..c665a8bf366a 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -82,6 +82,7 @@  static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
 	struct ttm_pool_dma *dma;
 	struct page *p;
+	unsigned int i;
 	void *vaddr;
 
 	/* Don't set the __GFP_COMP flag for higher order allocations.
@@ -94,38 +95,43 @@  static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 
 	if (!pool->use_dma_alloc) {
 		p = alloc_pages(gfp_flags, order);
-		if (p)
-			p->private = order;
-		return p;
-	}
+		if (!p)
+			return NULL;
 
-	dma = kmalloc(sizeof(*dma), GFP_KERNEL);
-	if (!dma)
-		return NULL;
+		p->private = order;
+	} else {
 
-	if (order)
-		attr |= DMA_ATTR_NO_WARN;
+		dma = kmalloc(sizeof(*dma), GFP_KERNEL);
+		if (!dma)
+			return NULL;
 
-	vaddr = dma_alloc_attrs(pool->dev, (1ULL << order) * PAGE_SIZE,
-				&dma->addr, gfp_flags, attr);
-	if (!vaddr)
-		goto error_free;
+		if (order)
+			attr |= DMA_ATTR_NO_WARN;
 
-	/* TODO: This is an illegal abuse of the DMA API, but we need to rework
-	 * TTM page fault handling and extend the DMA API to clean this up.
-	 */
-	if (is_vmalloc_addr(vaddr))
-		p = vmalloc_to_page(vaddr);
-	else
-		p = virt_to_page(vaddr);
+		vaddr = dma_alloc_attrs(pool->dev, (1ULL << order) * PAGE_SIZE,
+					&dma->addr, gfp_flags, attr);
+		if (!vaddr) {
+			kfree(dma);
+			return NULL;
+		}
 
-	dma->vaddr = (unsigned long)vaddr | order;
-	p->private = (unsigned long)dma;
-	return p;
+		/* TODO: This is an illegal abuse of the DMA API, but we need
+		 * to rework TTM page fault handling and extend the DMA API to
+		 * clean this up.
+		 */
+		if (is_vmalloc_addr(vaddr))
+			p = vmalloc_to_page(vaddr);
+		else
+			p = virt_to_page(vaddr);
 
-error_free:
-	kfree(dma);
-	return NULL;
+		dma->vaddr = (unsigned long)vaddr | order;
+		p->private = (unsigned long)dma;
+	}
+
+	for (i = 0; i < (1 << order); ++i)
+		SetPageReserved(&p[i]);
+
+	return p;
 }
 
 /* Reset the caching and pages of size 1 << order */
@@ -134,6 +140,7 @@  static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
 {
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
 	struct ttm_pool_dma *dma;
+	unsigned int i;
 	void *vaddr;
 
 #ifdef CONFIG_X86
@@ -144,6 +151,9 @@  static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
 		set_pages_wb(p, 1 << order);
 #endif
 
+	for (i = 0; i < (1 << order); ++i)
+		ClearPageReserved(&p[i]);
+
 	if (!pool || !pool->use_dma_alloc) {
 		__free_pages(p, order);
 		return;