Message ID | 20190311090314.GB3310@redhat.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Johannes Berg |
Series | iommu/amd: fix sg->dma_address for sg->offset bigger than PAGE_SIZE |
On Mon, 2019-03-11 at 10:03 +0100, Stanislaw Gruszka wrote:
> Take into account that sg->offset can be bigger than PAGE_SIZE when
> setting segment sg->dma_address. Otherwise sg->dma_address will point
> at a different page, which makes DMA impossible, with errors like this:
>
> xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa70c0 flags=0x0020]
> xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7040 flags=0x0020]
> xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7080 flags=0x0020]
> xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7100 flags=0x0020]
> xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7000 flags=0x0020]
>
> Additionally, with a wrong sg->dma_address, unmap_sg will free the wrong
> pages, which can cause crashes like this:
>
> Feb 28 19:27:45 kernel: BUG: Bad page state in process cinnamon pfn:39e8b1
> Feb 28 19:27:45 kernel: Disabling lock debugging due to kernel taint
> Feb 28 19:27:45 kernel: flags: 0x2ffff0000000000()
> Feb 28 19:27:45 kernel: raw: 02ffff0000000000 0000000000000000 ffffffff00000301 0000000000000000
> Feb 28 19:27:45 kernel: raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> Feb 28 19:27:45 kernel: page dumped because: nonzero _refcount
> Feb 28 19:27:45 kernel: Modules linked in: ccm fuse arc4 nct6775 hwmon_vid amdgpu nls_iso8859_1 nls_cp437 edac_mce_amd vfat fat kvm_amd ccp rng_core kvm mt76x0u mt76x0_common mt76x02_usb irqbypass mt76_usb mt76x02_lib mt76 crct10dif_pclmul crc32_pclmul chash mac80211 amd_iommu_v2 ghash_clmulni_intel gpu_sched i2c_algo_bit ttm wmi_bmof snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper snd_hda_codec_hdmi snd_hda_intel drm snd_hda_codec aesni_intel snd_hda_core snd_hwdep aes_x86_64 crypto_simd snd_pcm cfg80211 cryptd mousedev snd_timer glue_helper pcspkr r8169 input_leds realtek agpgart libphy rfkill snd syscopyarea sysfillrect sysimgblt fb_sys_fops soundcore sp5100_tco k10temp i2c_piix4 wmi evdev gpio_amdpt pinctrl_amd mac_hid pcc_cpufreq acpi_cpufreq sg ip_tables x_tables ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sd_mod(E) hid_generic(E) usbhid(E) hid(E) dm_mod(E) serio_raw(E) atkbd(E) libps2(E) crc32c_intel(E) ahci(E) libahci(E) libata(E) xhci_pci(E) xhci_hcd(E)
> Feb 28 19:27:45 kernel: scsi_mod(E) i8042(E) serio(E) bcache(E) crc64(E)
> Feb 28 19:27:45 kernel: CPU: 2 PID: 896 Comm: cinnamon Tainted: G B W E 4.20.12-arch1-1-custom #1
> Feb 28 19:27:45 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P1.20 06/26/2018
> Feb 28 19:27:45 kernel: Call Trace:
> Feb 28 19:27:45 kernel: dump_stack+0x5c/0x80
> Feb 28 19:27:45 kernel: bad_page.cold.29+0x7f/0xb2
> Feb 28 19:27:45 kernel: __free_pages_ok+0x2c0/0x2d0
> Feb 28 19:27:45 kernel: skb_release_data+0x96/0x180
> Feb 28 19:27:45 kernel: __kfree_skb+0xe/0x20
> Feb 28 19:27:45 kernel: tcp_recvmsg+0x894/0xc60
> Feb 28 19:27:45 kernel: ? reuse_swap_page+0x120/0x340
> Feb 28 19:27:45 kernel: ? ptep_set_access_flags+0x23/0x30
> Feb 28 19:27:45 kernel: inet_recvmsg+0x5b/0x100
> Feb 28 19:27:45 kernel: __sys_recvfrom+0xc3/0x180
> Feb 28 19:27:45 kernel: ? handle_mm_fault+0x10a/0x250
> Feb 28 19:27:45 kernel: ? syscall_trace_enter+0x1d3/0x2d0
> Feb 28 19:27:45 kernel: ? __audit_syscall_exit+0x22a/0x290
> Feb 28 19:27:45 kernel: __x64_sys_recvfrom+0x24/0x30
> Feb 28 19:27:45 kernel: do_syscall_64+0x5b/0x170
> Feb 28 19:27:45 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Cc: stable@vger.kernel.org
> Reported-and-tested-by: jan.viktorin@gmail.com
> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
> ---
>  drivers/iommu/amd_iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 6b0760dafb3e..949621f33624 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2604,7 +2604,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
>
>  	/* Everything is mapped - write the right values into s->dma_address */
>  	for_each_sg(sglist, s, nelems, i) {
> -		s->dma_address += address + s->offset;
> +		s->dma_address += address + (s->offset & ~PAGE_MASK);
>  		s->dma_length = s->length;
>  	}
>

You should add a comment calling out that this is needed because the
sg_phys(s) call above this is masked with PAGE_MASK. Then this makes
much more sense. Otherwise I would have assumed you needed either the
full offset or none.

Other than that, from what I can tell the code itself looks to be
correct, but just difficult to read.

Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>

On Mon, Mar 11, 2019 at 08:47:44AM -0700, Alexander Duyck wrote:
> >  drivers/iommu/amd_iommu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> > index 6b0760dafb3e..949621f33624 100644
> > --- a/drivers/iommu/amd_iommu.c
> > +++ b/drivers/iommu/amd_iommu.c
> > @@ -2604,7 +2604,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
> >
> >  	/* Everything is mapped - write the right values into s->dma_address */
> >  	for_each_sg(sglist, s, nelems, i) {
> > -		s->dma_address += address + s->offset;
> > +		s->dma_address += address + (s->offset & ~PAGE_MASK);
> >  		s->dma_length = s->length;
> >  	}
> >
>
> You should add a comment calling out that this is needed because the
> sg_phys(s) call above this is masked with PAGE_MASK. Then this makes
> much more sense. Otherwise I would have assumed you needed either the
> full offset or none.

Would something like this

	/*
	 * Everything is mapped - write the right values into s->dma_address.
	 * Take into account s->offset can be bigger than page size and sg_phys(s)
	 * address has to be aligned to page granularity.
	 */

be appropriate?

Stanislaw

On Tue, 2019-03-12 at 08:08 +0100, Stanislaw Gruszka wrote:
> On Mon, Mar 11, 2019 at 08:47:44AM -0700, Alexander Duyck wrote:
> > >  drivers/iommu/amd_iommu.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> > > index 6b0760dafb3e..949621f33624 100644
> > > --- a/drivers/iommu/amd_iommu.c
> > > +++ b/drivers/iommu/amd_iommu.c
> > > @@ -2604,7 +2604,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
> > >
> > >  	/* Everything is mapped - write the right values into s->dma_address */
> > >  	for_each_sg(sglist, s, nelems, i) {
> > > -		s->dma_address += address + s->offset;
> > > +		s->dma_address += address + (s->offset & ~PAGE_MASK);
> > >  		s->dma_length = s->length;
> > >  	}
> > >
> >
> > You should add a comment calling out that this is needed because the
> > sg_phys(s) call above this is masked with PAGE_MASK. Then this makes
> > much more sense. Otherwise I would have assumed you needed either the
> > full offset or none.
>
> Would something like this
>
> /*
>  * Everything is mapped - write the right values into s->dma_address.
>  * Take into account s->offset can be bigger than page size and sg_phys(s)
>  * address has to be aligned to page granularity.
>  */
>
> be appropriate ?
>
> Stanislaw
>

No, that isn't a good description. If you take a look at the code a few
lines up you find:

	phys_addr = (sg_phys(s) & PAGE_MASK) + (j << PAGE_SHIFT);

Now if I am not mistaken the whole reason why you are having to make
the change here is because of the application of PAGE_MASK in this line.

Basically what sg_phys() will do is take the address of the page,
convert it to a physical address and add the offset. However what the
mask is doing is limiting how much of that offset can be added. As a
result you have to add the remainder that was masked out.

So maybe a better comment would be something like:

	/*
	 * Add in the remaining piece of the scatter-gather offset that was
	 * masked out when we were determining the physical address via
	 * (sg_phys(s) & PAGE_MASK) earlier.
	 */

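The masking argument above can be checked with a small userspace sketch. Everything in it is an assumption made for illustration: the 4 KiB page size, the local PAGE_SHIFT/PAGE_SIZE/PAGE_MASK definitions (written to mirror the usual kernel convention) and the helpers demo_sg_phys(), mapped_page_base() and masked_out_remainder(), which are hypothetical names and not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel macros, assuming 4 KiB pages. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/*
 * Hypothetical model of what the thread says sg_phys() does: take the
 * physical address of the page and add the scatterlist offset to it.
 */
static inline uint64_t demo_sg_phys(uint64_t page_phys, uint64_t offset)
{
	assert((page_phys & ~PAGE_MASK) == 0); /* page addresses are page-aligned */
	return page_phys + offset;
}

/* Page base that gets mapped: the (sg_phys(s) & PAGE_MASK) part. */
static inline uint64_t mapped_page_base(uint64_t phys)
{
	return phys & PAGE_MASK;
}

/* The piece of the offset that the mask drops: only the low bits. */
static inline uint64_t masked_out_remainder(uint64_t offset)
{
	return offset & ~PAGE_MASK;
}
```

For a page at 0x100000 and an offset of 0x10c0 (one full page plus 0xc0), the mapped base is 0x101000 and the remainder is 0xc0, so base plus remainder gives back 0x1010c0. Adding the full offset to the base instead would give 0x1020c0, which is one page too far.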
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 6b0760dafb3e..949621f33624 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2604,7 +2604,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,

 	/* Everything is mapped - write the right values into s->dma_address */
 	for_each_sg(sglist, s, nelems, i) {
-		s->dma_address += address + s->offset;
+		s->dma_address += address + (s->offset & ~PAGE_MASK);
 		s->dma_length = s->length;
 	}

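As a rough illustration of the one-line change, the sketch below compares the two expressions in plain userspace C. It is not the driver code: struct demo_sg is a hypothetical, reduced stand-in for struct scatterlist, the page macros are local definitions assuming 4 KiB pages, and dma_address is assumed to hold a page-aligned offset into the allocated I/O virtual range before the loop runs.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel macros, assuming 4 KiB pages. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical, reduced stand-in for struct scatterlist. */
struct demo_sg {
	uint64_t dma_address; /* assumed: page-aligned offset into the IOVA range */
	uint64_t offset;      /* byte offset from the start of the backing page */
};

/* Expression on the removed line: adds the whole offset. */
static inline uint64_t dma_address_before_patch(const struct demo_sg *s,
						 uint64_t address)
{
	return s->dma_address + address + s->offset;
}

/* Expression on the added line: adds only the in-page part of the offset. */
static inline uint64_t dma_address_after_patch(const struct demo_sg *s,
						uint64_t address)
{
	return s->dma_address + address + (s->offset & ~PAGE_MASK);
}
```

With address = 0x80000000 and offset = 0xc0 both variants return 0x800000c0, so nothing changes for offsets smaller than a page. With offset = 0x10c0 the old expression returns 0x800010c0, one page past the mapped one, while the new expression still returns 0x800000c0.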
Take into account that sg->offset can be bigger than PAGE_SIZE when
setting segment sg->dma_address. Otherwise sg->dma_address will point
at a different page, which makes DMA impossible, with errors like this:

xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa70c0 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7040 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7080 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7100 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7000 flags=0x0020]

Additionally, with a wrong sg->dma_address, unmap_sg will free the wrong
pages, which can cause crashes like this:

Feb 28 19:27:45 kernel: BUG: Bad page state in process cinnamon pfn:39e8b1
Feb 28 19:27:45 kernel: Disabling lock debugging due to kernel taint
Feb 28 19:27:45 kernel: flags: 0x2ffff0000000000()
Feb 28 19:27:45 kernel: raw: 02ffff0000000000 0000000000000000 ffffffff00000301 0000000000000000
Feb 28 19:27:45 kernel: raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
Feb 28 19:27:45 kernel: page dumped because: nonzero _refcount
Feb 28 19:27:45 kernel: Modules linked in: ccm fuse arc4 nct6775 hwmon_vid amdgpu nls_iso8859_1 nls_cp437 edac_mce_amd vfat fat kvm_amd ccp rng_core kvm mt76x0u mt76x0_common mt76x02_usb irqbypass mt76_usb mt76x02_lib mt76 crct10dif_pclmul crc32_pclmul chash mac80211 amd_iommu_v2 ghash_clmulni_intel gpu_sched i2c_algo_bit ttm wmi_bmof snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper snd_hda_codec_hdmi snd_hda_intel drm snd_hda_codec aesni_intel snd_hda_core snd_hwdep aes_x86_64 crypto_simd snd_pcm cfg80211 cryptd mousedev snd_timer glue_helper pcspkr r8169 input_leds realtek agpgart libphy rfkill snd syscopyarea sysfillrect sysimgblt fb_sys_fops soundcore sp5100_tco k10temp i2c_piix4 wmi evdev gpio_amdpt pinctrl_amd mac_hid pcc_cpufreq acpi_cpufreq sg ip_tables x_tables ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sd_mod(E) hid_generic(E) usbhid(E) hid(E) dm_mod(E) serio_raw(E) atkbd(E) libps2(E) crc32c_intel(E) ahci(E) libahci(E) libata(E) xhci_pci(E) xhci_hcd(E)
Feb 28 19:27:45 kernel: scsi_mod(E) i8042(E) serio(E) bcache(E) crc64(E)
Feb 28 19:27:45 kernel: CPU: 2 PID: 896 Comm: cinnamon Tainted: G B W E 4.20.12-arch1-1-custom #1
Feb 28 19:27:45 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P1.20 06/26/2018
Feb 28 19:27:45 kernel: Call Trace:
Feb 28 19:27:45 kernel: dump_stack+0x5c/0x80
Feb 28 19:27:45 kernel: bad_page.cold.29+0x7f/0xb2
Feb 28 19:27:45 kernel: __free_pages_ok+0x2c0/0x2d0
Feb 28 19:27:45 kernel: skb_release_data+0x96/0x180
Feb 28 19:27:45 kernel: __kfree_skb+0xe/0x20
Feb 28 19:27:45 kernel: tcp_recvmsg+0x894/0xc60
Feb 28 19:27:45 kernel: ? reuse_swap_page+0x120/0x340
Feb 28 19:27:45 kernel: ? ptep_set_access_flags+0x23/0x30
Feb 28 19:27:45 kernel: inet_recvmsg+0x5b/0x100
Feb 28 19:27:45 kernel: __sys_recvfrom+0xc3/0x180
Feb 28 19:27:45 kernel: ? handle_mm_fault+0x10a/0x250
Feb 28 19:27:45 kernel: ? syscall_trace_enter+0x1d3/0x2d0
Feb 28 19:27:45 kernel: ? __audit_syscall_exit+0x22a/0x290
Feb 28 19:27:45 kernel: __x64_sys_recvfrom+0x24/0x30
Feb 28 19:27:45 kernel: do_syscall_64+0x5b/0x170
Feb 28 19:27:45 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9

Cc: stable@vger.kernel.org
Reported-and-tested-by: jan.viktorin@gmail.com
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
 drivers/iommu/amd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)