Message ID | 1442838371-21484-1-git-send-email-tiffany.lin@mediatek.com (mailing list archive) |
---|---|
State | New, archived |
Hi Tiffany! On 21-09-15 14:26, Tiffany Lin wrote: > vb2_dc_prepare use the number of SG entries dma_map_sg_attrs return. > But in dma_sync_sg_for_device, it use lengths of each SG entries > before dma_map_sg_attrs. dma_map_sg_attrs will concatenate > SGs until dma length > dma seg bundary. sgt->nents will less than > sgt->orig_nents. Using SG entries after dma_map_sg_attrs > in vb2_dc_prepare will make some SGs are not sync to device. > After add DMA_ATTR_SKIP_CPU_SYNC in vb2_dc_get_userptr to remove > sync data to device twice. Device randomly get incorrect data because > some SGs are not sync to device. Change to use number of SG entries > before dma_map_sg_attrs in vb2_dc_prepare to prevent this issue. > > Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com> > --- > drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c > index 2397ceb..c5d00bd 100644 > --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c > +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c > @@ -100,7 +100,7 @@ static void vb2_dc_prepare(void *buf_priv) > if (!sgt || buf->db_attach) > return; > > - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > } > > static void vb2_dc_finish(void *buf_priv) > @@ -112,7 +112,7 @@ static void vb2_dc_finish(void *buf_priv) > if (!sgt || buf->db_attach) > return; > > - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > } I don't really understand it. I am assuming that this happens on an arm and that the dma_map_sg_attrs and dma_sync_sg_* functions used are arm_iommu_map_sg() and arm_iommu_sync_sg_* as implemented in arch/arm/mm/dma-mapping.c. 
Now, as I understand it (and my understanding may very well be flawed!) the map_sg function concatenates SG entries if possible, so it may return fewer entries. But the dma_sync_sg functions use those updated SG entries, so the full buffer should be covered by this. Using orig_nents will actually sync parts of the buffer twice! The first nents entries already cover the full buffer so any remaining entries up to orig_nents will just duplicate parts of the buffer. So this patch makes no sense in the current code. If I understand your log text correctly this patch goes on top of Sakari Ailus' vb2 sync patch series. So if it wasn't needed before, but it is needed after his patch series, then the problem is in that patch series. In any case, I need some help understanding this patch. And *if* this patch is correct, then the same thing should likely be done for videobuf2-dma-sg.c. Regards, Hans > > /*********************************************/ > -- To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Hans, On Mon, 2015-09-21 at 15:13 +0200, Hans Verkuil wrote: > Hi Tiffany! > > On 21-09-15 14:26, Tiffany Lin wrote: > > vb2_dc_prepare use the number of SG entries dma_map_sg_attrs return. > > But in dma_sync_sg_for_device, it use lengths of each SG entries > > before dma_map_sg_attrs. dma_map_sg_attrs will concatenate > > SGs until dma length > dma seg bundary. sgt->nents will less than > > sgt->orig_nents. Using SG entries after dma_map_sg_attrs > > in vb2_dc_prepare will make some SGs are not sync to device. > > After add DMA_ATTR_SKIP_CPU_SYNC in vb2_dc_get_userptr to remove > > sync data to device twice. Device randomly get incorrect data because > > some SGs are not sync to device. Change to use number of SG entries > > before dma_map_sg_attrs in vb2_dc_prepare to prevent this issue. > > > > Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com> > > --- > > drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c > > index 2397ceb..c5d00bd 100644 > > --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c > > +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c > > @@ -100,7 +100,7 @@ static void vb2_dc_prepare(void *buf_priv) > > if (!sgt || buf->db_attach) > > return; > > > > - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > > + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > > } > > > > static void vb2_dc_finish(void *buf_priv) > > @@ -112,7 +112,7 @@ static void vb2_dc_finish(void *buf_priv) > > if (!sgt || buf->db_attach) > > return; > > > > - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > > + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > > } > > I don't really understand it. 
I am assuming that this happens on an arm and that > the dma_map_sg_attrs and dma_sync_sg_* functions used are arm_iommu_map_sg() and > arm_iommu_sync_sg_* as implemented in arch/arm/mm/dma-mapping.c. > We are using __iommu_* implemented in "arch/arm64/mm/dma-mapping.c" in review patch http://lists.linuxfoundation.org/pipermail/iommu/2015-July/013898.html Without patch "[media] vb2: use dma_map_sg_attrs to prevent unnecessary sync", vb2 will sync data to device twice. One is from "dma_map_sg" in "vb2_dc_get_userptr", the other is from "dma_sync_sg_for_device" in "vb2_dc_prepare". dma_map_sg use orig_nents, and dma_sync_sg_for_device use nents." We do not run in 32bits mode, but check "arm_dma_sync_sg_for_device" in "arch/arm/mm/dma-mapping.c", ops->sync_single_for_device(dev, sg_dma_address(s), s->length, dir); It looks like has same issue. > Now, as I understand it (and my understanding may very well be flawed!) the map_sg > function concatenates SG entries if possible, so it may return fewer entries. But > the dma_sync_sg functions use those updated SG entries, so the full buffer should > be covered by this. Using orig_nents will actually sync parts of the buffer twice! > The first nents entries already cover the full buffer so any remaining entries up > to orig_nents will just duplicate parts of the buffer. > I found that in __iommu_sync_sg_for_device, it use sg->length , do not cover full buffer. By adding log in " __iommu_sync_sg_for_device" without patch "[media] vb2: use dma_map_sg_attrs to prevent unnecessary sync", we could see total synced size are different between called from dma_map_sg and dma_sync_sg_for_device. __iommu_sync_sg_for_device called from dma_sync_sg_for_device got updated SG entries number but it use un-updated sg length. After using "DMA_ATTR_SKIP_CPU_SYNC" to skip sync in vb2_dc_get_userptr, we got some part of the buffer not sync. > So this patch makes no sense in the current code. 
> > If I understand your log text correctly this patch goes on top of Sakari Ailus' vb2 > sync patch series. So if it wasn't needed before, but it is needed after his patch > series, then the problem is in that patch series. > This patch goes on top of these two patchs https://www.mail-archive.com/linux-media%40vger.kernel.org/msg82143.html http://lists.linuxfoundation.org/pipermail/iommu/2015-July/013898.html > In any case, I need some help understanding this patch. > > And *if* this patch is correct, then the same thing should likely be done for > videobuf2-dma-sg.c. > Yes, if this patch correct, same thing should be done for videobuf2-dma-sg.c > Regards, > > Hans > > > > > /*********************************************/ > > -- To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Tiffany, On Tue, Sep 22, 2015 at 06:19:25PM +0800, tiffany lin wrote: > Hi Hans, > > On Mon, 2015-09-21 at 15:13 +0200, Hans Verkuil wrote: > > Hi Tiffany! > > > > On 21-09-15 14:26, Tiffany Lin wrote: > > > vb2_dc_prepare use the number of SG entries dma_map_sg_attrs return. > > > But in dma_sync_sg_for_device, it use lengths of each SG entries > > > before dma_map_sg_attrs. dma_map_sg_attrs will concatenate > > > SGs until dma length > dma seg bundary. sgt->nents will less than > > > sgt->orig_nents. Using SG entries after dma_map_sg_attrs > > > in vb2_dc_prepare will make some SGs are not sync to device. > > > After add DMA_ATTR_SKIP_CPU_SYNC in vb2_dc_get_userptr to remove > > > sync data to device twice. Device randomly get incorrect data because > > > some SGs are not sync to device. Change to use number of SG entries > > > before dma_map_sg_attrs in vb2_dc_prepare to prevent this issue. > > > > > > Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com> > > > --- > > > drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c > > > index 2397ceb..c5d00bd 100644 > > > --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c > > > +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c > > > @@ -100,7 +100,7 @@ static void vb2_dc_prepare(void *buf_priv) > > > if (!sgt || buf->db_attach) > > > return; > > > > > > - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > > > + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > > > } > > > > > > static void vb2_dc_finish(void *buf_priv) > > > @@ -112,7 +112,7 @@ static void vb2_dc_finish(void *buf_priv) > > > if (!sgt || buf->db_attach) > > > return; > > > > > > - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > > > + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); 
> > > } > > > > I don't really understand it. I am assuming that this happens on an arm and that > > the dma_map_sg_attrs and dma_sync_sg_* functions used are arm_iommu_map_sg() and > > arm_iommu_sync_sg_* as implemented in arch/arm/mm/dma-mapping.c. > > > > We are using __iommu_* implemented in "arch/arm64/mm/dma-mapping.c" in > review patch > http://lists.linuxfoundation.org/pipermail/iommu/2015-July/013898.html > Without patch "[media] vb2: use dma_map_sg_attrs to prevent unnecessary > sync", vb2 will sync data to device twice. > One is from "dma_map_sg" in "vb2_dc_get_userptr", the other is from > "dma_sync_sg_for_device" in "vb2_dc_prepare". dma_map_sg use orig_nents, > and dma_sync_sg_for_device use nents." > > We do not run in 32bits mode, but check "arm_dma_sync_sg_for_device" in > "arch/arm/mm/dma-mapping.c", > ops->sync_single_for_device(dev, sg_dma_address(s), s->length, dir); > It looks like has same issue. > > > Now, as I understand it (and my understanding may very well be flawed!) the map_sg > > function concatenates SG entries if possible, so it may return fewer entries. But > > the dma_sync_sg functions use those updated SG entries, so the full buffer should > > be covered by this. Using orig_nents will actually sync parts of the buffer twice! > > The first nents entries already cover the full buffer so any remaining entries up > > to orig_nents will just duplicate parts of the buffer. > > > I found that in __iommu_sync_sg_for_device, it use sg->length , do not > cover full buffer. > By adding log in " __iommu_sync_sg_for_device" without patch "[media] > vb2: use dma_map_sg_attrs to prevent unnecessary sync", we could see > total synced size are different between called from dma_map_sg and > dma_sync_sg_for_device. I had the same question Hans did, but I still fail to understand where in the code things are going wrong the way you described at the moment --- after dma_map_sg() there are nents entries in the scatterlist. But. 
sg_dma_len() should be used instead of the length field to get the size of the entry. If something is wrong, then it's this AFAICT. Could you try whether changing this fixes it? > __iommu_sync_sg_for_device called from dma_sync_sg_for_device got > updated SG entries number but it use un-updated sg length. > After using "DMA_ATTR_SKIP_CPU_SYNC" to skip sync in vb2_dc_get_userptr, > we got some part of the buffer not sync. > > > So this patch makes no sense in the current code. > > > > If I understand your log text correctly this patch goes on top of Sakari Ailus' vb2 > > sync patch series. So if it wasn't needed before, but it is needed after his patch > > series, then the problem is in that patch series. > > > This patch goes on top of these two patchs > https://www.mail-archive.com/linux-media%40vger.kernel.org/msg82143.html This patch has been merged long time ago. > http://lists.linuxfoundation.org/pipermail/iommu/2015-July/013898.html > > > > In any case, I need some help understanding this patch. > > > > And *if* this patch is correct, then the same thing should likely be done for > > videobuf2-dma-sg.c. > > > Yes, if this patch correct, same thing should be done for > videobuf2-dma-sg.c > > Regards, > > > > Hans > > > > > > > > /*********************************************/ > > > > >
Hi Sakari, On Tue, 2015-09-22 at 15:07 +0300, Sakari Ailus wrote: > Hi Tiffany, > > On Tue, Sep 22, 2015 at 06:19:25PM +0800, tiffany lin wrote: > > Hi Hans, > > > > On Mon, 2015-09-21 at 15:13 +0200, Hans Verkuil wrote: > > > Hi Tiffany! > > > > > > On 21-09-15 14:26, Tiffany Lin wrote: > > > > vb2_dc_prepare use the number of SG entries dma_map_sg_attrs return. > > > > But in dma_sync_sg_for_device, it use lengths of each SG entries > > > > before dma_map_sg_attrs. dma_map_sg_attrs will concatenate > > > > SGs until dma length > dma seg bundary. sgt->nents will less than > > > > sgt->orig_nents. Using SG entries after dma_map_sg_attrs > > > > in vb2_dc_prepare will make some SGs are not sync to device. > > > > After add DMA_ATTR_SKIP_CPU_SYNC in vb2_dc_get_userptr to remove > > > > sync data to device twice. Device randomly get incorrect data because > > > > some SGs are not sync to device. Change to use number of SG entries > > > > before dma_map_sg_attrs in vb2_dc_prepare to prevent this issue. 
> > > > > > > > Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com> > > > > --- > > > > drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++-- > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c > > > > index 2397ceb..c5d00bd 100644 > > > > --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c > > > > +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c > > > > @@ -100,7 +100,7 @@ static void vb2_dc_prepare(void *buf_priv) > > > > if (!sgt || buf->db_attach) > > > > return; > > > > > > > > - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > > > > + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > > > > } > > > > > > > > static void vb2_dc_finish(void *buf_priv) > > > > @@ -112,7 +112,7 @@ static void vb2_dc_finish(void *buf_priv) > > > > if (!sgt || buf->db_attach) > > > > return; > > > > > > > > - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > > > > + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > > > > } > > > > > > I don't really understand it. I am assuming that this happens on an arm and that > > > the dma_map_sg_attrs and dma_sync_sg_* functions used are arm_iommu_map_sg() and > > > arm_iommu_sync_sg_* as implemented in arch/arm/mm/dma-mapping.c. > > > > > > > We are using __iommu_* implemented in "arch/arm64/mm/dma-mapping.c" in > > review patch > > http://lists.linuxfoundation.org/pipermail/iommu/2015-July/013898.html > > Without patch "[media] vb2: use dma_map_sg_attrs to prevent unnecessary > > sync", vb2 will sync data to device twice. > > One is from "dma_map_sg" in "vb2_dc_get_userptr", the other is from > > "dma_sync_sg_for_device" in "vb2_dc_prepare". dma_map_sg use orig_nents, > > and dma_sync_sg_for_device use nents." 
> > > > We do not run in 32bits mode, but check "arm_dma_sync_sg_for_device" in > > "arch/arm/mm/dma-mapping.c", > > ops->sync_single_for_device(dev, sg_dma_address(s), s->length, dir); > > It looks like has same issue. > > > > > Now, as I understand it (and my understanding may very well be flawed!) the map_sg > > > function concatenates SG entries if possible, so it may return fewer entries. But > > > the dma_sync_sg functions use those updated SG entries, so the full buffer should > > > be covered by this. Using orig_nents will actually sync parts of the buffer twice! > > > The first nents entries already cover the full buffer so any remaining entries up > > > to orig_nents will just duplicate parts of the buffer. > > > > > I found that in __iommu_sync_sg_for_device, it use sg->length , do not > > cover full buffer. > > By adding log in " __iommu_sync_sg_for_device" without patch "[media] > > vb2: use dma_map_sg_attrs to prevent unnecessary sync", we could see > > total synced size are different between called from dma_map_sg and > > dma_sync_sg_for_device. > > I had the same question Hans did, but I still fail to understand where in > the code things are going wrong the way you described at the moment --- > after dma_map_sg() there are nents entries in the scatterlist. But. > sg_dma_len() should be used instead of the length field to get the size of > the entry. If something is wrong, then it's this AFAICT. > > Could you try whether changing this fixes it? > Do you mean try to change to use sg_dma_len in __iommu_sync_sg_for_device? I tried to change using sg_dam_len() in __iommu_sync_sg_for_device for_each_sg(sgl, sg, nelems, i) { //__dma_map_area(sg_virt(sg), sg->length, dir); __dma_map_area(sg_virt(sg), sg_dma_len(sg), dir); } I still see the issue. Probably there are some other issue in iommu_dma_map_sg. The __iommu_sync_sg_for_device could be called before and after iommu_dma_map_sg called. 
I am not sure whether there is any side effect change it to sg_dma_len(sg) > > __iommu_sync_sg_for_device called from dma_sync_sg_for_device got > > updated SG entries number but it use un-updated sg length. > > After using "DMA_ATTR_SKIP_CPU_SYNC" to skip sync in vb2_dc_get_userptr, > > we got some part of the buffer not sync. > > > > > So this patch makes no sense in the current code. > > > > > > If I understand your log text correctly this patch goes on top of Sakari Ailus' vb2 > > > sync patch series. So if it wasn't needed before, but it is needed after his patch > > > series, then the problem is in that patch series. > > > > > This patch goes on top of these two patchs > > https://www.mail-archive.com/linux-media%40vger.kernel.org/msg82143.html > > This patch has been merged long time ago. > > > http://lists.linuxfoundation.org/pipermail/iommu/2015-July/013898.html > > > > > > > In any case, I need some help understanding this patch. > > > > > > And *if* this patch is correct, then the same thing should likely be done for > > > videobuf2-dma-sg.c. > > > > > Yes, if this patch correct, same thing should be done for > > videobuf2-dma-sg.c > > > Regards, > > > > > > Hans > > > > > > > > > > > /*********************************************/ > > > > > > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Hans, On 21/09/15 14:13, Hans Verkuil wrote: > Hi Tiffany! > > On 21-09-15 14:26, Tiffany Lin wrote: >> vb2_dc_prepare use the number of SG entries dma_map_sg_attrs return. >> But in dma_sync_sg_for_device, it use lengths of each SG entries >> before dma_map_sg_attrs. dma_map_sg_attrs will concatenate >> SGs until dma length > dma seg bundary. sgt->nents will less than >> sgt->orig_nents. Using SG entries after dma_map_sg_attrs >> in vb2_dc_prepare will make some SGs are not sync to device. >> After add DMA_ATTR_SKIP_CPU_SYNC in vb2_dc_get_userptr to remove >> sync data to device twice. Device randomly get incorrect data because >> some SGs are not sync to device. Change to use number of SG entries >> before dma_map_sg_attrs in vb2_dc_prepare to prevent this issue. >> >> Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com> >> --- >> drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c >> index 2397ceb..c5d00bd 100644 >> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c >> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c >> @@ -100,7 +100,7 @@ static void vb2_dc_prepare(void *buf_priv) >> if (!sgt || buf->db_attach) >> return; >> >> - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); >> + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); >> } >> >> static void vb2_dc_finish(void *buf_priv) >> @@ -112,7 +112,7 @@ static void vb2_dc_finish(void *buf_priv) >> if (!sgt || buf->db_attach) >> return; >> >> - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); >> + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); >> } > > I don't really understand it. 
I am assuming that this happens on an arm and that > the dma_map_sg_attrs and dma_sync_sg_* functions used are arm_iommu_map_sg() and > arm_iommu_sync_sg_* as implemented in arch/arm/mm/dma-mapping.c. > > Now, as I understand it (and my understanding may very well be flawed!) the map_sg > function concatenates SG entries if possible, so it may return fewer entries. But > the dma_sync_sg functions use those updated SG entries, so the full buffer should > be covered by this. Using orig_nents will actually sync parts of the buffer twice! > The first nents entries already cover the full buffer so any remaining entries up > to orig_nents will just duplicate parts of the buffer. As Documentation/DMA-API.txt says, the parameters to dma_sync_sg_* must be the same as those originally passed into dma_map_sg. The segments are only merged *from the point of view of the device*: if I have a scatterlist of two discontiguous 4K segments, I can remap them with an IOMMU so the device sees them as a single 8K buffer, and tell it as such. If on the other hand I want to do maintenance from the CPU side (i.e. any DMA API call), then those DMA addresses mean nothing and I can only operate on the CPU addresses of the underlying pages, which are still very much discontiguous in the linear map; ergo I still need to iterate over the original entries. Whilst I can't claim much familiarity with v4l itself, from a brief look over the existing code this patch does look to be doing the right thing. Robin. -- To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
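Robin's point about device-side merging versus CPU-side maintenance can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct model_sg`, `model_map_sg()` and `model_sync_sg()` are hypothetical stand-ins written for this illustration, not the kernel's `struct scatterlist` or DMA API, and the mapping step simply merges every segment into one device-visible range, as an IOMMU might.

```c
#include <stddef.h>

/* Hypothetical model of one scatterlist entry (not the kernel struct). */
struct model_sg {
    size_t cpu_addr;  /* CPU-side address of the segment */
    size_t length;    /* CPU-side length; mapping does not change it */
    size_t dma_addr;  /* device-side address, filled in by mapping */
    size_t dma_len;   /* device-side length, filled in by mapping */
};

/*
 * Model of an IOMMU-backed mapping: every segment is made contiguous in
 * device address space, so the device sees one merged range described by
 * entry 0. Returns the number of mapped entries (the "nents" value).
 */
static unsigned int model_map_sg(struct model_sg *sgl,
                                 unsigned int orig_nents, size_t iova_base)
{
    size_t total = 0;
    unsigned int i;

    if (orig_nents == 0)
        return 0;

    for (i = 0; i < orig_nents; i++) {
        total += sgl[i].length;
        sgl[i].dma_addr = 0;
        sgl[i].dma_len = 0;
    }
    sgl[0].dma_addr = iova_base;
    sgl[0].dma_len = total;
    return 1;
}

/*
 * Model of a CPU-side sync: cache maintenance works on CPU addresses, so
 * it walks nelems entries using cpu_addr/length. Returns bytes covered.
 */
static size_t model_sync_sg(const struct model_sg *sgl, unsigned int nelems)
{
    size_t synced = 0;
    unsigned int i;

    for (i = 0; i < nelems; i++)
        synced += sgl[i].length;
    return synced;
}
```

With two discontiguous 4K segments, the model maps them to a single 8K device range. Walking only the mapped count in `model_sync_sg()` then covers 4K of CPU memory, while walking the original count covers all 8K, which is the gap the patch closes.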
Hi Tiffany, (Robin and Hans cc'd.) On Mon, Sep 21, 2015 at 08:26:11PM +0800, Tiffany Lin wrote: > vb2_dc_prepare use the number of SG entries dma_map_sg_attrs return. > But in dma_sync_sg_for_device, it use lengths of each SG entries > before dma_map_sg_attrs. dma_map_sg_attrs will concatenate > SGs until dma length > dma seg bundary. sgt->nents will less than > sgt->orig_nents. Using SG entries after dma_map_sg_attrs > in vb2_dc_prepare will make some SGs are not sync to device. > After add DMA_ATTR_SKIP_CPU_SYNC in vb2_dc_get_userptr to remove > sync data to device twice. Device randomly get incorrect data because > some SGs are not sync to device. Change to use number of SG entries > before dma_map_sg_attrs in vb2_dc_prepare to prevent this issue. > > Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com> > --- > drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c > index 2397ceb..c5d00bd 100644 > --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c > +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c > @@ -100,7 +100,7 @@ static void vb2_dc_prepare(void *buf_priv) > if (!sgt || buf->db_attach) > return; > > - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > } > > static void vb2_dc_finish(void *buf_priv) > @@ -112,7 +112,7 @@ static void vb2_dc_finish(void *buf_priv) > if (!sgt || buf->db_attach) > return; > > - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > } > > /*********************************************/ Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Could you post a similar patch for videobuf2-dma-sg as well, please? This should probably go to stable (since when?). 
videobuf-dma-sg appears to be broken, too, but the fix is more changes than one or two lines.
Resent, hopefully without html this time. On September 22, 2015 11:10:15 PM GMT+02:00, Sakari Ailus <sakari.ailus@iki.fi> wrote: >Hi Tiffany, > >(Robin and Hans cc'd.) > >On Mon, Sep 21, 2015 at 08:26:11PM +0800, Tiffany Lin wrote: >> vb2_dc_prepare use the number of SG entries dma_map_sg_attrs return. >> But in dma_sync_sg_for_device, it use lengths of each SG entries >> before dma_map_sg_attrs. dma_map_sg_attrs will concatenate >> SGs until dma length > dma seg bundary. sgt->nents will less than >> sgt->orig_nents. Using SG entries after dma_map_sg_attrs >> in vb2_dc_prepare will make some SGs are not sync to device. >> After add DMA_ATTR_SKIP_CPU_SYNC in vb2_dc_get_userptr to remove >> sync data to device twice. Device randomly get incorrect data because >> some SGs are not sync to device. Change to use number of SG entries >> before dma_map_sg_attrs in vb2_dc_prepare to prevent this issue. >> >> Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com> >> --- >> drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c >b/drivers/media/v4l2-core/videobuf2-dma-contig.c >> index 2397ceb..c5d00bd 100644 >> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c >> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c >> @@ -100,7 +100,7 @@ static void vb2_dc_prepare(void *buf_priv) >> if (!sgt || buf->db_attach) >> return; >> >> - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, >buf->dma_dir); >> + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents, >buf->dma_dir); >> } >> >> static void vb2_dc_finish(void *buf_priv) >> @@ -112,7 +112,7 @@ static void vb2_dc_finish(void *buf_priv) >> if (!sgt || buf->db_attach) >> return; >> >> - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); >> + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, >buf->dma_dir); >> } >> >> /*********************************************/ > >Acked-by: 
Sakari Ailus <sakari.ailus@linux.intel.com> > >Could you post a similar patch for videobuf2-dma-sg as well, please? >This >should probably go to stable (since when?). > >videobuf-dma-sg appears to be broken, too, but the fix is more changes >than one or two lines. > >-- >Kind regards, > >Sakari Ailus >e-mail: sakari.ailus@iki.fi XMPP: sailus@retiisi.org.uk Sakari, can you take a careful look at the vb2 code? If I remember correctly, the nents field receives the result of the map_sg function. I have no idea if that's correct. BTW, don't spend too much time on vb1, nobody cares about that old framework, and vb1 drivers are rarely used on arm platforms. Regards, Hans
Hi Hans, On Wed, Sep 23, 2015 at 10:40:56AM +0200, Hans Verkuil wrote: > Resent, hopefully without html this time. > > On September 22, 2015 11:10:15 PM GMT+02:00, Sakari Ailus <sakari.ailus@iki.fi> wrote: > >Hi Tiffany, > > > >(Robin and Hans cc'd.) > > > >On Mon, Sep 21, 2015 at 08:26:11PM +0800, Tiffany Lin wrote: > >> vb2_dc_prepare use the number of SG entries dma_map_sg_attrs return. > >> But in dma_sync_sg_for_device, it use lengths of each SG entries > >> before dma_map_sg_attrs. dma_map_sg_attrs will concatenate > >> SGs until dma length > dma seg bundary. sgt->nents will less than > >> sgt->orig_nents. Using SG entries after dma_map_sg_attrs > >> in vb2_dc_prepare will make some SGs are not sync to device. > >> After add DMA_ATTR_SKIP_CPU_SYNC in vb2_dc_get_userptr to remove > >> sync data to device twice. Device randomly get incorrect data because > >> some SGs are not sync to device. Change to use number of SG entries > >> before dma_map_sg_attrs in vb2_dc_prepare to prevent this issue. 
> >> > >> Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com> > >> --- > >> drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c > >b/drivers/media/v4l2-core/videobuf2-dma-contig.c > >> index 2397ceb..c5d00bd 100644 > >> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c > >> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c > >> @@ -100,7 +100,7 @@ static void vb2_dc_prepare(void *buf_priv) > >> if (!sgt || buf->db_attach) > >> return; > >> > >> - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, > >buf->dma_dir); > >> + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents, > >buf->dma_dir); > >> } > >> > >> static void vb2_dc_finish(void *buf_priv) > >> @@ -112,7 +112,7 @@ static void vb2_dc_finish(void *buf_priv) > >> if (!sgt || buf->db_attach) > >> return; > >> > >> - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > >> + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, > >buf->dma_dir); > >> } > >> > >> /*********************************************/ > > > >Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> > > > >Could you post a similar patch for videobuf2-dma-sg as well, please? > >This > >should probably go to stable (since when?). > > > >videobuf-dma-sg appears to be broken, too, but the fix is more changes > >than one or two lines. > > > >-- > >Kind regards, > > > >Sakari Ailus > >e-mail: sakari.ailus@iki.fi XMPP: sailus@retiisi.org.uk > > Sakari, can you take a careful look at the vb2 code? If I remember > correctly, the nents field receives the result of the map_sg function. I > have no idea if that's correct. As far as I can tell, it is. According to a comment in the definition of struct sg_table in include/linux/scatterlist.h, this is the number of *mapped* entries in the table. 
Although a number of drivers construct the table by themselves use nents only, __sg_alloc_table() assigns the same number to both. The videobuf2 bug appears to be one of its kind --- I checked the other users of struct sg_table for the purpose. drivers/spi/spi.c has the same pattern except that it does not involve syncing the cache. There could be other users of dma_map_sg() that get this wrong though. Perhaps the comment on the sg_table shouldn't be added to the documentation as most of the users appear to be using it differently, even if it appears to be in a conflict with the intended usage. As far as I understand, what we need a similar fix for dma-sg allocator. > > BTW, don't spend too much time on vb1, nobody cares about that old > framework, and vb1 drivers are rarely used on arm platforms. In that case the wrong number of sglist entries is also passed to dma_unmap_sg(). Although in most cases it still works. I think the BTTV driver is using it, for instance.
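The bookkeeping discussed above (nents holds the number of mapped entries, orig_nents the number originally in the table) can be sketched as a small userspace model. Everything here is an assumption made for illustration: `struct model_sg_table` and the `model_*` helpers are hypothetical and only mirror the two counters, not the real `struct sg_table` or any kernel function.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two counters kept in an sg_table. */
struct model_sg_table {
    unsigned int nents;      /* number of mapped entries */
    unsigned int orig_nents; /* number of entries originally in the list */
};

/* Who is about to walk the list. */
enum model_sg_user {
    MODEL_DEVICE_WALK, /* programming the device with DMA addresses */
    MODEL_CPU_SYNC,    /* sync for CPU or for device */
    MODEL_UNMAP        /* tearing the mapping down */
};

/* After allocation both counters hold the same value. */
static void model_alloc_table(struct model_sg_table *t, unsigned int n)
{
    t->nents = n;
    t->orig_nents = n;
}

/*
 * Record the result of mapping: only nents changes. Returns 0 on success,
 * -1 if the mapped count is zero or larger than the original count.
 */
static int model_record_mapping(struct model_sg_table *t, unsigned int mapped)
{
    if (mapped == 0 || mapped > t->orig_nents)
        return -1;
    t->nents = mapped;
    return 0;
}

/*
 * Pick the entry count for a given user: the device walks the mapped
 * entries, while sync and unmap take the count that was passed to map.
 */
static unsigned int model_count_for(const struct model_sg_table *t,
                                    enum model_sg_user who)
{
    return who == MODEL_DEVICE_WALK ? t->nents : t->orig_nents;
}
```

In this model a table allocated with four entries and mapped down to two yields a count of two for the device walk and four for both sync and unmap, matching the rule that those calls take the same count that was passed to the map call.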
diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 2397ceb..c5d00bd 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -100,7 +100,7 @@ static void vb2_dc_prepare(void *buf_priv)
 	if (!sgt || buf->db_attach)
 		return;
 
-	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
+	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
 }
 
 static void vb2_dc_finish(void *buf_priv)
@@ -112,7 +112,7 @@ static void vb2_dc_finish(void *buf_priv)
 	if (!sgt || buf->db_attach)
 		return;
 
-	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
+	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
 }
 
 /*********************************************/
vb2_dc_prepare uses the number of SG entries returned by dma_map_sg_attrs, but dma_sync_sg_for_device uses the length of each SG entry as it was before dma_map_sg_attrs. dma_map_sg_attrs concatenates SGs until the DMA length exceeds the DMA segment boundary, so sgt->nents can be less than sgt->orig_nents. Using the post-mapping entry count in vb2_dc_prepare therefore leaves some SGs unsynced to the device.

After DMA_ATTR_SKIP_CPU_SYNC was added in vb2_dc_get_userptr to avoid syncing data to the device twice, the device randomly gets incorrect data because some SGs are not synced to the device. Change vb2_dc_prepare and vb2_dc_finish to use the number of SG entries from before dma_map_sg_attrs to prevent this issue.

Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)