Message ID | 20241028143553.36704-1-jacopo.mondi@ideasonboard.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] media: rkisp1: Reduce min_queued_buffers to 0 | expand |
On 28/10/2024 15:35, Jacopo Mondi wrote: > There apparently is no reason to require 3 queued buffers to call > streamon() for the RkISP1 as the driver operates with a scratch buffer > where frames can be directed to if there's no available buffer provided > by userspace. > > Reduce the number of required buffers to 0 to allow applications to > operate by queueing capture buffers on-demand. > > Tested with libcamera, by operating with a single capture request. The > same request (and associated capture buffer) gets recycled once > completed. This of course causes a frame rate drop but doesn't hinder > operations. > > Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> > --- > The first version of this patch set min_queued_buffers to 1, but setting it > to 0 doesn't compromise operations and it's even better as it allows application > to queue buffers to the capture devices on-demand. If a buffer is not provided > to the DMA engines, image data gets directed to the driver's internal scratch > buffer. > --- > drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > index 2bddb4fa8a5c..5fcf9731f41b 100644 > --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > @@ -35,8 +35,6 @@ > #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" > #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" > > -#define RKISP1_MIN_BUFFERS_NEEDED 3 > - > enum rkisp1_plane { > RKISP1_PLANE_Y = 0, > RKISP1_PLANE_CB = 1, > @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) > q->ops = &rkisp1_vb2_ops; > q->mem_ops = &vb2_dma_contig_memops; > q->buf_struct_size = sizeof(struct rkisp1_buffer); > - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; > + q->min_queued_buffers = 0; You can probably just drop this since the vb2_queue struct is zeroed when it is allocated. So no need to set it to 0. And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? Also, see my RFC I posted today: https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 and expect the driver to change that to a workable value, will, in fact, now just get one buffer. And streaming that will cause lots of frame drops. It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in libcamera to get much better control over how many buffers should be allocated. Regards, Hans > q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC; > q->lock = &node->vlock; > q->dev = cap->rkisp1->dev; > -- > 2.47.0 > >
Hi Hans, Jacopo, On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: > On 28/10/2024 15:35, Jacopo Mondi wrote: > > There apparently is no reason to require 3 queued buffers to call > > streamon() for the RkISP1 as the driver operates with a scratch buffer > > where frames can be directed to if there's no available buffer provided > > by userspace. > > > > Reduce the number of required buffers to 0 to allow applications to > > operate by queueing capture buffers on-demand. > > > > Tested with libcamera, by operating with a single capture request. The > > same request (and associated capture buffer) gets recycled once > > completed. This of course causes a frame rate drop but doesn't hinder > > operations. > > > > Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> > > --- > > The first version of this patch set min_queued_buffers to 1, but setting it > > to 0 doesn't compromise operations and it's even better as it allows application > > to queue buffers to the capture devices on-demand. If a buffer is not provided > > to the DMA engines, image data gets directed to the driver's internal scratch > > buffer. > > --- > > drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- > > 1 file changed, 1 insertion(+), 3 deletions(-) > > > > diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > index 2bddb4fa8a5c..5fcf9731f41b 100644 > > --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > @@ -35,8 +35,6 @@ > > #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" > > #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" > > > > -#define RKISP1_MIN_BUFFERS_NEEDED 3 > > - > > enum rkisp1_plane { > > RKISP1_PLANE_Y = 0, > > RKISP1_PLANE_CB = 1, > > @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) > > q->ops = &rkisp1_vb2_ops; > > q->mem_ops = &vb2_dma_contig_memops; > > q->buf_struct_size = sizeof(struct rkisp1_buffer); > > - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; > > + q->min_queued_buffers = 0; > > You can probably just drop this since the vb2_queue struct is zeroed when it > is allocated. So no need to set it to 0. > > And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? Probably not, as the patch drops the define :-) > Also, see my RFC I posted today: > > https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u > > My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 > and expect the driver to change that to a workable value, will, in fact, now just get > one buffer. And streaming that will cause lots of frame drops. > > It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, > but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. The rkisp1 driver requires a device-specific userspace, so I don't think we need to care about old and/or dumb applications calling REQBUFS(1). > My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in > libcamera to get much better control over how many buffers should be allocated. > > > q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC; > > q->lock = &node->vlock; > > q->dev = cap->rkisp1->dev; Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Hi Hans On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: > On 28/10/2024 15:35, Jacopo Mondi wrote: > > There apparently is no reason to require 3 queued buffers to call > > streamon() for the RkISP1 as the driver operates with a scratch buffer > > where frames can be directed to if there's no available buffer provided > > by userspace. > > > > Reduce the number of required buffers to 0 to allow applications to > > operate by queueing capture buffers on-demand. > > > > Tested with libcamera, by operating with a single capture request. The > > same request (and associated capture buffer) gets recycled once > > completed. This of course causes a frame rate drop but doesn't hinder > > operations. > > > > Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> > > --- > > The first version of this patch set min_queued_buffers to 1, but setting it > > to 0 doesn't compromise operations and it's even better as it allows application > > to queue buffers to the capture devices on-demand. If a buffer is not provided > > to the DMA engines, image data gets directed to the driver's internal scratch > > buffer. > > --- > > drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- > > 1 file changed, 1 insertion(+), 3 deletions(-) > > > > diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > index 2bddb4fa8a5c..5fcf9731f41b 100644 > > --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > @@ -35,8 +35,6 @@ > > #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" > > #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" > > > > -#define RKISP1_MIN_BUFFERS_NEEDED 3 > > - > > enum rkisp1_plane { > > RKISP1_PLANE_Y = 0, > > RKISP1_PLANE_CB = 1, > > @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) > > q->ops = &rkisp1_vb2_ops; > > q->mem_ops = &vb2_dma_contig_memops; > > q->buf_struct_size = sizeof(struct rkisp1_buffer); > > - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; > > + q->min_queued_buffers = 0; > > You can probably just drop this since the vb2_queue struct is zeroed when it > is allocated. So no need to set it to 0. I suspected so :) > > And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? No, and this patch removes it in facts -#define RKISP1_MIN_BUFFERS_NEEDED 3 - > > Also, see my RFC I posted today: > > https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u > > My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 > and expect the driver to change that to a workable value, will, in fact, now just get > one buffer. And streaming that will cause lots of frame drops. > > It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, > but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. Yeah, I've read the discussion between you and Tomi and it seemed like a good time to re-send this patch. > > My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in > libcamera to get much better control over how many buffers should be allocated. > In my understanding min_queued_buffers identifies how many buffers should be queued before calling start_streaming, and this comes directly from an hw/driver requirement. This doesn't mean that at least min_queue_buffers should be queued at all the times during streaming, at least, I don't see how and where videobuf2 enforces this. Or does it ? If the above is correct, then the number of buffers to be queued during streaming is, in my opinion, less an hw/driver requirement but more an application decision. As you said an application should be good with 3 buffers (one queued, one currently being written to, one to be consumed by the application), but in very specific cases where an application retains the buffer for longer, for whatever reason, it might need a larger number of queued buffers to provide the DMA engines a space where to write data without them being discarded (to scratch buffers or discarded by the DMA engine itself, if the HW supports that). Or maybe an application is fine to drop frames and only queue buffers sporadically (if the HW supports that ofc). For libcamera, and for this specific platform in particular, we're going to base new developments on the assumption that min_queued_buffers == 0, and it would be more convenient for use to be able to access its value from userspace to identify if we're running on a kernel with or without this patch being applied. Thanks j > Regards, > > Hans > > > q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC; > > q->lock = &node->vlock; > > q->dev = cap->rkisp1->dev; > > -- > > 2.47.0 > > > > >
On 28/10/2024 16:30, Jacopo Mondi wrote: > Hi Hans > > On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: >> On 28/10/2024 15:35, Jacopo Mondi wrote: >>> There apparently is no reason to require 3 queued buffers to call >>> streamon() for the RkISP1 as the driver operates with a scratch buffer >>> where frames can be directed to if there's no available buffer provided >>> by userspace. >>> >>> Reduce the number of required buffers to 0 to allow applications to >>> operate by queueing capture buffers on-demand. >>> >>> Tested with libcamera, by operating with a single capture request. The >>> same request (and associated capture buffer) gets recycled once >>> completed. This of course causes a frame rate drop but doesn't hinder >>> operations. >>> >>> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> >>> --- >>> The first version of this patch set min_queued_buffers to 1, but setting it >>> to 0 doesn't compromise operations and it's even better as it allows application >>> to queue buffers to the capture devices on-demand. If a buffer is not provided >>> to the DMA engines, image data gets directed to the driver's internal scratch >>> buffer. >>> --- >>> drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- >>> 1 file changed, 1 insertion(+), 3 deletions(-) >>> >>> diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c >>> index 2bddb4fa8a5c..5fcf9731f41b 100644 >>> --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c >>> +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c >>> @@ -35,8 +35,6 @@ >>> #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" >>> #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" >>> >>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 >>> - >>> enum rkisp1_plane { >>> RKISP1_PLANE_Y = 0, >>> RKISP1_PLANE_CB = 1, >>> @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) >>> q->ops = &rkisp1_vb2_ops; >>> q->mem_ops = &vb2_dma_contig_memops; >>> q->buf_struct_size = sizeof(struct rkisp1_buffer); >>> - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; >>> + q->min_queued_buffers = 0; >> >> You can probably just drop this since the vb2_queue struct is zeroed when it >> is allocated. So no need to set it to 0. > > I suspected so :) > >> >> And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? > > No, and this patch removes it in facts > > -#define RKISP1_MIN_BUFFERS_NEEDED 3 > - I should have checked the patch :-) Sorry for the noise. > >> >> Also, see my RFC I posted today: >> >> https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u >> >> My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 >> and expect the driver to change that to a workable value, will, in fact, now just get >> one buffer. And streaming that will cause lots of frame drops. >> >> It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, >> but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. > > Yeah, I've read the discussion between you and Tomi and it seemed like > a good time to re-send this patch. > >> >> My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in >> libcamera to get much better control over how many buffers should be allocated. >> > > In my understanding min_queued_buffers identifies how many buffers > should be queued before calling start_streaming, and this comes > directly from an hw/driver requirement. This doesn't mean that at > least min_queue_buffers should be queued at all the times during > streaming, at least, I don't see how and where videobuf2 enforces > this. Or does it ? It's an intrinsic property of the HW/driver: e.g. if it needs two buffers queued up for the DMA engine to work, then it really is always holding on to two buffers. The only thing the framework does is postpone calling start_streaming until that number of buffers is queued to ensure the DMA engine has what it needs to start. But after that vb2 doesn't check it. > > If the above is correct, then the number of buffers to be queued > during streaming is, in my opinion, less an hw/driver requirement but > more an application decision. No, min_queued_buffers is a HW/drivers property: the DMA engine can't start until that many buffers are queued up, and once it is started it will always hold on to that many buffers. So the application has to know somehow how many buffers are needed to actually stream. One way is via VIDIOC_REQBUFS since that is supposed to always return a workable number of buffers, the other is by actually reporting the minimum number of buffers as per my RFC. > As you said an application should be good with> 3 buffers (one queued, one currently being written to, one to be > consumed by the application), but in very specific cases where an > application retains the buffer for longer, for whatever reason, it > might need a larger number of queued buffers to provide the DMA > engines a space where to write data without them being discarded (to > scratch buffers or discarded by the DMA engine itself, if the HW > supports that). Or maybe an application is fine to drop frames and > only queue buffers sporadically (if the HW supports that ofc). > > For libcamera, and for this specific platform in particular, we're > going to base new developments on the assumption that > min_queued_buffers == 0, and it would be more convenient for use to be > able to access its value from userspace to identify if we're running > on a kernel with or without this patch being applied. So my proposal in my RFC to expose min_num_buffers would work for libcamera? It sounds like that's what you need. Regards, Hans
Hi Hans, On Mon, Oct 28, 2024 at 04:48:55PM +0100, Hans Verkuil wrote: > On 28/10/2024 16:30, Jacopo Mondi wrote: > > On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: > >> On 28/10/2024 15:35, Jacopo Mondi wrote: > >>> There apparently is no reason to require 3 queued buffers to call > >>> streamon() for the RkISP1 as the driver operates with a scratch buffer > >>> where frames can be directed to if there's no available buffer provided > >>> by userspace. > >>> > >>> Reduce the number of required buffers to 0 to allow applications to > >>> operate by queueing capture buffers on-demand. > >>> > >>> Tested with libcamera, by operating with a single capture request. The > >>> same request (and associated capture buffer) gets recycled once > >>> completed. This of course causes a frame rate drop but doesn't hinder > >>> operations. > >>> > >>> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> > >>> --- > >>> The first version of this patch set min_queued_buffers to 1, but setting it > >>> to 0 doesn't compromise operations and it's even better as it allows application > >>> to queue buffers to the capture devices on-demand. If a buffer is not provided > >>> to the DMA engines, image data gets directed to the driver's internal scratch > >>> buffer. > >>> --- > >>> drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- > >>> 1 file changed, 1 insertion(+), 3 deletions(-) > >>> > >>> diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > >>> index 2bddb4fa8a5c..5fcf9731f41b 100644 > >>> --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > >>> +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > >>> @@ -35,8 +35,6 @@ > >>> #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" > >>> #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" > >>> > >>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 > >>> - > >>> enum rkisp1_plane { > >>> RKISP1_PLANE_Y = 0, > >>> RKISP1_PLANE_CB = 1, > >>> @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) > >>> q->ops = &rkisp1_vb2_ops; > >>> q->mem_ops = &vb2_dma_contig_memops; > >>> q->buf_struct_size = sizeof(struct rkisp1_buffer); > >>> - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; > >>> + q->min_queued_buffers = 0; > >> > >> You can probably just drop this since the vb2_queue struct is zeroed when it > >> is allocated. So no need to set it to 0. > > > > I suspected so :) > > > >> > >> And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? > > > > No, and this patch removes it in facts > > > > -#define RKISP1_MIN_BUFFERS_NEEDED 3 > > - > > I should have checked the patch :-) Sorry for the noise. > > >> > >> Also, see my RFC I posted today: > >> > >> https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u > >> > >> My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 > >> and expect the driver to change that to a workable value, will, in fact, now just get > >> one buffer. And streaming that will cause lots of frame drops. > >> > >> It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, > >> but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. > > > > Yeah, I've read the discussion between you and Tomi and it seemed like > > a good time to re-send this patch. > > > >> My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in > >> libcamera to get much better control over how many buffers should be allocated. > > > > In my understanding min_queued_buffers identifies how many buffers > > should be queued before calling start_streaming, and this comes > > directly from an hw/driver requirement. This doesn't mean that at > > least min_queue_buffers should be queued at all the times during > > streaming, at least, I don't see how and where videobuf2 enforces > > this. Or does it ? > > It's an intrinsic property of the HW/driver: e.g. if it needs two buffers > queued up for the DMA engine to work, then it really is always holding on > to two buffers. The only thing the framework does is postpone calling > start_streaming until that number of buffers is queued to ensure the > DMA engine has what it needs to start. But after that vb2 doesn't check > it. The "driver" part of "HW/driver" is important here, as drivers can influence this in multiple ways. One of them is usage of scratch buffers, but even without that, a DMA engine that requires two buffers can easily be operated with a single buffer by programming the DMA engine with the same buffer address twice. Drivers should really do so unless they really can't. > > If the above is correct, then the number of buffers to be queued > > during streaming is, in my opinion, less an hw/driver requirement but > > more an application decision. > > No, min_queued_buffers is a HW/drivers property: the DMA engine can't > start until that many buffers are queued up, and once it is started > it will always hold on to that many buffers. That's not always true. The imx7-media-csi driver, for instance, sets min_queued_buffers to 2, but allocates scratch buffers and uses them at runtime, so that it can return all queued buffers to userspace. Grepping for min_queued_buffers I see drivers setting it to 4 (rcar-dma.c, rzg2l-video.c), 6 (cxusb-analog.c) or even 9 (zoran_driver.c) ! I doubt the zoran driver holds on to 9 buffers at runtime. Your statement is not universally true today?. This could be considered as driver issues, and the min_queued_buffers values should be fixed to match the runtime behaviour. In some cases I expect it will require more work than just changing the value, as drivers may implement the logic to operate with less buffers at runtime but not at start time. This would be fixable, but it may also call for asking if the start at runtime behaviours need to be identical. > So the application has to know somehow how many buffers are needed to > actually stream. One way is via VIDIOC_REQBUFS since that is supposed to > always return a workable number of buffers, the other is by actually > reporting the minimum number of buffers as per my RFC. > > > As you said an application should be good with> 3 buffers (one queued, one currently being written to, one to be > > consumed by the application), but in very specific cases where an > > application retains the buffer for longer, for whatever reason, it > > might need a larger number of queued buffers to provide the DMA > > engines a space where to write data without them being discarded (to > > scratch buffers or discarded by the DMA engine itself, if the HW > > supports that). Or maybe an application is fine to drop frames and > > only queue buffers sporadically (if the HW supports that ofc). > > > > For libcamera, and for this specific platform in particular, we're > > going to base new developments on the assumption that > > min_queued_buffers == 0, and it would be more convenient for use to be > > able to access its value from userspace to identify if we're running > > on a kernel with or without this patch being applied. > > So my proposal in my RFC to expose min_num_buffers would work for libcamera? > It sounds like that's what you need. It may be useful, but I think we may also just require min_num_buffers == 0 for a device to be supported in libcamera. We have to implement APIs such as the Android camera HAL that has no concept of buffers being kept by the device. This could possibly be handled within libcamera by allocating scratch buffers in userspace, but that comes with other challenges. I would like to at least try to get help from the kernel until proven that it's a bad idea.
Hi Laurent, Hans, On Mon, Oct 28, 2024 at 06:21:41PM +0200, Laurent Pinchart wrote: > Hi Hans, > > On Mon, Oct 28, 2024 at 04:48:55PM +0100, Hans Verkuil wrote: > > On 28/10/2024 16:30, Jacopo Mondi wrote: > > > On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: > > >> On 28/10/2024 15:35, Jacopo Mondi wrote: > > >>> There apparently is no reason to require 3 queued buffers to call > > >>> streamon() for the RkISP1 as the driver operates with a scratch buffer > > >>> where frames can be directed to if there's no available buffer provided > > >>> by userspace. > > >>> > > >>> Reduce the number of required buffers to 0 to allow applications to > > >>> operate by queueing capture buffers on-demand. > > >>> > > >>> Tested with libcamera, by operating with a single capture request. The > > >>> same request (and associated capture buffer) gets recycled once > > >>> completed. This of course causes a frame rate drop but doesn't hinder > > >>> operations. > > >>> > > >>> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> > > >>> --- > > >>> The first version of this patch set min_queued_buffers to 1, but setting it > > >>> to 0 doesn't compromise operations and it's even better as it allows application > > >>> to queue buffers to the capture devices on-demand. If a buffer is not provided > > >>> to the DMA engines, image data gets directed to the driver's internal scratch > > >>> buffer. > > >>> --- > > >>> drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- > > >>> 1 file changed, 1 insertion(+), 3 deletions(-) > > >>> > > >>> diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > >>> index 2bddb4fa8a5c..5fcf9731f41b 100644 > > >>> --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > >>> +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > >>> @@ -35,8 +35,6 @@ > > >>> #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" > > >>> #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" > > >>> > > >>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 > > >>> - > > >>> enum rkisp1_plane { > > >>> RKISP1_PLANE_Y = 0, > > >>> RKISP1_PLANE_CB = 1, > > >>> @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) > > >>> q->ops = &rkisp1_vb2_ops; > > >>> q->mem_ops = &vb2_dma_contig_memops; > > >>> q->buf_struct_size = sizeof(struct rkisp1_buffer); > > >>> - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; > > >>> + q->min_queued_buffers = 0; > > >> > > >> You can probably just drop this since the vb2_queue struct is zeroed when it > > >> is allocated. So no need to set it to 0. > > > > > > I suspected so :) > > > > > >> > > >> And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? > > > > > > No, and this patch removes it in facts > > > > > > -#define RKISP1_MIN_BUFFERS_NEEDED 3 > > > - > > > > I should have checked the patch :-) Sorry for the noise. > > > > >> > > >> Also, see my RFC I posted today: > > >> > > >> https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u > > >> > > >> My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 > > >> and expect the driver to change that to a workable value, will, in fact, now just get > > >> one buffer. And streaming that will cause lots of frame drops. > > >> > > >> It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, > > >> but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. > > > > > > Yeah, I've read the discussion between you and Tomi and it seemed like > > > a good time to re-send this patch. > > > > > >> My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in > > >> libcamera to get much better control over how many buffers should be allocated. > > > > > > In my understanding min_queued_buffers identifies how many buffers > > > should be queued before calling start_streaming, and this comes > > > directly from an hw/driver requirement. This doesn't mean that at > > > least min_queue_buffers should be queued at all the times during > > > streaming, at least, I don't see how and where videobuf2 enforces > > > this. Or does it ? > > > > It's an intrinsic property of the HW/driver: e.g. if it needs two buffers > > queued up for the DMA engine to work, then it really is always holding on > > to two buffers. The only thing the framework does is postpone calling > > start_streaming until that number of buffers is queued to ensure the > > DMA engine has what it needs to start. But after that vb2 doesn't check > > it. > > The "driver" part of "HW/driver" is important here, as drivers can > influence this in multiple ways. One of them is usage of scratch > buffers, but even without that, a DMA engine that requires two buffers > can easily be operated with a single buffer by programming the DMA > engine with the same buffer address twice. Drivers should really do so > unless they really can't. > > > > If the above is correct, then the number of buffers to be queued > > > during streaming is, in my opinion, less an hw/driver requirement but > > > more an application decision. > > > > No, min_queued_buffers is a HW/drivers property: the DMA engine can't > > start until that many buffers are queued up, and once it is started > > it will always hold on to that many buffers. I get it, my point was that once start_streaming has been called, even if min_queued_buffers=2, there is nothing preventing userspace from queing one buffer at the time once the first two have completed. Sure, the hw/driver might not like it, but while delaying start_streaming prevents bad things from happening, there is nothing in the core that prevents applications from potentially stalling the capture operations. But I get your point, if the system needs 2 buffers to start streaming, it will probably need two buffers to continue producing frames. > > That's not always true. The imx7-media-csi driver, for instance, sets > min_queued_buffers to 2, but allocates scratch buffers and uses them at > runtime, so that it can return all queued buffers to userspace. That's interesting. From your mention of "scratch buffers" I get there actually is a need to have 2 buffers queued to the HW ? How does that work, after all queuing a buffer to the DMA engine usually means pointing its write engine to one (set of) addresses. Or is it a driver-only requirement to ask for two buffers ? > Grepping for min_queued_buffers I see drivers setting it to 4 > (rcar-dma.c, rzg2l-video.c), 6 (cxusb-analog.c) or even 9 > (zoran_driver.c) ! I doubt the zoran driver holds on to 9 buffers at > runtime. Your statement is not universally true today?. > > This could be considered as driver issues, and the min_queued_buffers > values should be fixed to match the runtime behaviour. In some cases I > expect it will require more work than just changing the value, as > drivers may implement the logic to operate with less buffers at runtime > but not at start time. This would be fixable, but it may also call for > asking if the start at runtime behaviours need to be identical. > > > So the application has to know somehow how many buffers are needed to > > actually stream. One way is via VIDIOC_REQBUFS since that is supposed to > > always return a workable number of buffers, the other is by actually > > reporting the minimum number of buffers as per my RFC. > > > > > As you said an application should be good with> 3 buffers (one queued, one currently being written to, one to be > > > consumed by the application), but in very specific cases where an > > > application retains the buffer for longer, for whatever reason, it > > > might need a larger number of queued buffers to provide the DMA > > > engines a space where to write data without them being discarded (to > > > scratch buffers or discarded by the DMA engine itself, if the HW > > > supports that). Or maybe an application is fine to drop frames and > > > only queue buffers sporadically (if the HW supports that ofc). > > > > > > For libcamera, and for this specific platform in particular, we're > > > going to base new developments on the assumption that > > > min_queued_buffers == 0, and it would be more convenient for use to be > > > able to access its value from userspace to identify if we're running > > > on a kernel with or without this patch being applied. > > > > So my proposal in my RFC to expose min_num_buffers would work for libcamera? > > It sounds like that's what you need. > My immediate need is to know if I'm running on a "legacy" version of this driver that still requires 3 buffers for no apparent reason, or on a new version. Your proposal might work, but I still feel like we should report the HW/driver requirement (min_queued_buffers) instead of trying to suggest applications how many buffers they need to allocate to get "smooth streaming" or similar, as the use cases might be different. > It may be useful, but I think we may also just require min_num_buffers > == 0 for a device to be supported in libcamera. We have to implement While I concur this would be ideal, how would it work for existing rkisp1 implementation that do not include this patch ? libcamera should be able to run on both, probably in two different "modes" /o\ > APIs such as the Android camera HAL that has no concept of buffers being > kept by the device. This could possibly be handled within libcamera by > allocating scratch buffers in userspace, but that comes with other > challenges. I would like to at least try to get help from the kernel > until proven that it's a bad idea. > > -- > Regards, > > Laurent Pinchart
On Mon, Oct 28, 2024 at 06:08:18PM +0100, Jacopo Mondi wrote: > On Mon, Oct 28, 2024 at 06:21:41PM +0200, Laurent Pinchart wrote: > > On Mon, Oct 28, 2024 at 04:48:55PM +0100, Hans Verkuil wrote: > > > On 28/10/2024 16:30, Jacopo Mondi wrote: > > > > On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: > > > >> On 28/10/2024 15:35, Jacopo Mondi wrote: > > > >>> There apparently is no reason to require 3 queued buffers to call > > > >>> streamon() for the RkISP1 as the driver operates with a scratch buffer > > > >>> where frames can be directed to if there's no available buffer provided > > > >>> by userspace. > > > >>> > > > >>> Reduce the number of required buffers to 0 to allow applications to > > > >>> operate by queueing capture buffers on-demand. > > > >>> > > > >>> Tested with libcamera, by operating with a single capture request. The > > > >>> same request (and associated capture buffer) gets recycled once > > > >>> completed. This of course causes a frame rate drop but doesn't hinder > > > >>> operations. > > > >>> > > > >>> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> > > > >>> --- > > > >>> The first version of this patch set min_queued_buffers to 1, but setting it > > > >>> to 0 doesn't compromise operations and it's even better as it allows application > > > >>> to queue buffers to the capture devices on-demand. If a buffer is not provided > > > >>> to the DMA engines, image data gets directed to the driver's internal scratch > > > >>> buffer. > > > >>> --- > > > >>> drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- > > > >>> 1 file changed, 1 insertion(+), 3 deletions(-) > > > >>> > > > >>> diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > > >>> index 2bddb4fa8a5c..5fcf9731f41b 100644 > > > >>> --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > > >>> +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > > >>> @@ -35,8 +35,6 @@ > > > >>> #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" > > > >>> #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" > > > >>> > > > >>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 > > > >>> - > > > >>> enum rkisp1_plane { > > > >>> RKISP1_PLANE_Y = 0, > > > >>> RKISP1_PLANE_CB = 1, > > > >>> @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) > > > >>> q->ops = &rkisp1_vb2_ops; > > > >>> q->mem_ops = &vb2_dma_contig_memops; > > > >>> q->buf_struct_size = sizeof(struct rkisp1_buffer); > > > >>> - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; > > > >>> + q->min_queued_buffers = 0; > > > >> > > > >> You can probably just drop this since the vb2_queue struct is zeroed when it > > > >> is allocated. So no need to set it to 0. > > > > > > > > I suspected so :) > > > > > > > >> > > > >> And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? > > > > > > > > No, and this patch removes it in facts > > > > > > > > -#define RKISP1_MIN_BUFFERS_NEEDED 3 > > > > - > > > > > > I should have checked the patch :-) Sorry for the noise. > > > > > > >> > > > >> Also, see my RFC I posted today: > > > >> > > > >> https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u > > > >> > > > >> My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 > > > >> and expect the driver to change that to a workable value, will, in fact, now just get > > > >> one buffer. And streaming that will cause lots of frame drops. > > > >> > > > >> It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, > > > >> but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. > > > > > > > > Yeah, I've read the discussion between you and Tomi and it seemed like > > > > a good time to re-send this patch. > > > > > > > >> My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in > > > >> libcamera to get much better control over how many buffers should be allocated. > > > > > > > > In my understanding min_queued_buffers identifies how many buffers > > > > should be queued before calling start_streaming, and this comes > > > > directly from an hw/driver requirement. This doesn't mean that at > > > > least min_queue_buffers should be queued at all the times during > > > > streaming, at least, I don't see how and where videobuf2 enforces > > > > this. Or does it ? > > > > > > It's an intrinsic property of the HW/driver: e.g. if it needs two buffers > > > queued up for the DMA engine to work, then it really is always holding on > > > to two buffers. The only thing the framework does is postpone calling > > > start_streaming until that number of buffers is queued to ensure the > > > DMA engine has what it needs to start. But after that vb2 doesn't check > > > it. > > > > The "driver" part of "HW/driver" is important here, as drivers can > > influence this in multiple ways. One of them is usage of scratch > > buffers, but even without that, a DMA engine that requires two buffers > > can easily be operated with a single buffer by programming the DMA > > engine with the same buffer address twice. Drivers should really do so > > unless they really can't. > > > > > > If the above is correct, then the number of buffers to be queued > > > > during streaming is, in my opinion, less an hw/driver requirement but > > > > more an application decision. > > > > > > No, min_queued_buffers is a HW/drivers property: the DMA engine can't > > > start until that many buffers are queued up, and once it is started > > > it will always hold on to that many buffers. > > I get it, my point was that once start_streaming has been called, even > if min_queued_buffers=2, there is nothing preventing userspace from > queing one buffer at the time once the first two have completed. Sure, the > hw/driver might not like it, but while delaying start_streaming > prevents bad things from happening, there is nothing in the core that > prevents applications from potentially stalling the capture > operations. > > But I get your point, if the system needs 2 buffers to start > streaming, it will probably need two buffers to continue producing > frames. > > > That's not always true. The imx7-media-csi driver, for instance, sets > > min_queued_buffers to 2, but allocates scratch buffers and uses them at > > runtime, so that it can return all queued buffers to userspace. > > That's interesting. From your mention of "scratch buffers" I get there > actually is a need to have 2 buffers queued to the HW ? How does > that work, after all queuing a buffer to the DMA engine usually means > pointing its write engine to one (set of) addresses. > > Or is it a driver-only requirement to ask for two buffers ? The hardware has a ping-pong mechanism with two addresses and switches between them automatically. > > Grepping for min_queued_buffers I see drivers setting it to 4 > > (rcar-dma.c, rzg2l-video.c), 6 (cxusb-analog.c) or even 9 > > (zoran_driver.c) ! I doubt the zoran driver holds on to 9 buffers at > > runtime. Your statement is not universally true today?. > > > > This could be considered as driver issues, and the min_queued_buffers > > values should be fixed to match the runtime behaviour. In some cases I > > expect it will require more work than just changing the value, as > > drivers may implement the logic to operate with less buffers at runtime > > but not at start time. This would be fixable, but it may also call for > > asking if the start at runtime behaviours need to be identical. > > > > > So the application has to know somehow how many buffers are needed to > > > actually stream. One way is via VIDIOC_REQBUFS since that is supposed to > > > always return a workable number of buffers, the other is by actually > > > reporting the minimum number of buffers as per my RFC. > > > > > > > As you said an application should be good with> 3 buffers (one queued, one currently being written to, one to be > > > > consumed by the application), but in very specific cases where an > > > > application retains the buffer for longer, for whatever reason, it > > > > might need a larger number of queued buffers to provide the DMA > > > > engines a space where to write data without them being discarded (to > > > > scratch buffers or discarded by the DMA engine itself, if the HW > > > > supports that). Or maybe an application is fine to drop frames and > > > > only queue buffers sporadically (if the HW supports that ofc). > > > > > > > > For libcamera, and for this specific platform in particular, we're > > > > going to base new developments on the assumption that > > > > min_queued_buffers == 0, and it would be more convenient for use to be > > > > able to access its value from userspace to identify if we're running > > > > on a kernel with or without this patch being applied. > > > > > > So my proposal in my RFC to expose min_num_buffers would work for libcamera? > > > It sounds like that's what you need. > > My immediate need is to know if I'm running on a "legacy" version of > this driver that still requires 3 buffers for no apparent reason, or > on a new version. Your proposal might work, but I still feel like we > should report the HW/driver requirement (min_queued_buffers) instead > of trying to suggest applications how many buffers they need to > allocate to get "smooth streaming" or similar, as the use cases > might be different. > > > It may be useful, but I think we may also just require min_num_buffers > > == 0 for a device to be supported in libcamera. We have to implement > > While I concur this would be ideal, how would it work for existing > rkisp1 implementation that do not include this patch ? libcamera > should be able to run on both, probably in two different "modes" /o\ A simple option is to check the kernel version, we do that in a few places. Over time we'll increase the minimum kernel version and drop support for legacy APIs. > > APIs such as the Android camera HAL that has no concept of buffers being > > kept by the device. This could possibly be handled within libcamera by > > allocating scratch buffers in userspace, but that comes with other > > challenges. I would like to at least try to get help from the kernel > > until proven that it's a bad idea.
Hi Laurent On Mon, Oct 28, 2024 at 08:39:36PM +0200, Laurent Pinchart wrote: > On Mon, Oct 28, 2024 at 06:08:18PM +0100, Jacopo Mondi wrote: > > On Mon, Oct 28, 2024 at 06:21:41PM +0200, Laurent Pinchart wrote: > > > On Mon, Oct 28, 2024 at 04:48:55PM +0100, Hans Verkuil wrote: > > > > On 28/10/2024 16:30, Jacopo Mondi wrote: > > > > > On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: > > > > >> On 28/10/2024 15:35, Jacopo Mondi wrote: > > > > >>> There apparently is no reason to require 3 queued buffers to call > > > > >>> streamon() for the RkISP1 as the driver operates with a scratch buffer > > > > >>> where frames can be directed to if there's no available buffer provided > > > > >>> by userspace. > > > > >>> > > > > >>> Reduce the number of required buffers to 0 to allow applications to > > > > >>> operate by queueing capture buffers on-demand. > > > > >>> > > > > >>> Tested with libcamera, by operating with a single capture request. The > > > > >>> same request (and associated capture buffer) gets recycled once > > > > >>> completed. This of course causes a frame rate drop but doesn't hinder > > > > >>> operations. > > > > >>> > > > > >>> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> > > > > >>> --- > > > > >>> The first version of this patch set min_queued_buffers to 1, but setting it > > > > >>> to 0 doesn't compromise operations and it's even better as it allows application > > > > >>> to queue buffers to the capture devices on-demand. If a buffer is not provided > > > > >>> to the DMA engines, image data gets directed to the driver's internal scratch > > > > >>> buffer. > > > > >>> --- > > > > >>> drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- > > > > >>> 1 file changed, 1 insertion(+), 3 deletions(-) > > > > >>> > > > > >>> diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > > > >>> index 2bddb4fa8a5c..5fcf9731f41b 100644 > > > > >>> --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > > > >>> +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > > > > >>> @@ -35,8 +35,6 @@ > > > > >>> #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" > > > > >>> #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" > > > > >>> > > > > >>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 > > > > >>> - > > > > >>> enum rkisp1_plane { > > > > >>> RKISP1_PLANE_Y = 0, > > > > >>> RKISP1_PLANE_CB = 1, > > > > >>> @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) > > > > >>> q->ops = &rkisp1_vb2_ops; > > > > >>> q->mem_ops = &vb2_dma_contig_memops; > > > > >>> q->buf_struct_size = sizeof(struct rkisp1_buffer); > > > > >>> - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; > > > > >>> + q->min_queued_buffers = 0; > > > > >> > > > > >> You can probably just drop this since the vb2_queue struct is zeroed when it > > > > >> is allocated. So no need to set it to 0. > > > > > > > > > > I suspected so :) > > > > > > > > > >> > > > > >> And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? > > > > > > > > > > No, and this patch removes it in facts > > > > > > > > > > -#define RKISP1_MIN_BUFFERS_NEEDED 3 > > > > > - > > > > > > > > I should have checked the patch :-) Sorry for the noise. > > > > > > > > >> > > > > >> Also, see my RFC I posted today: > > > > >> > > > > >> https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u > > > > >> > > > > >> My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 > > > > >> and expect the driver to change that to a workable value, will, in fact, now just get > > > > >> one buffer. And streaming that will cause lots of frame drops. > > > > >> > > > > >> It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, > > > > >> but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. > > > > > > > > > > Yeah, I've read the discussion between you and Tomi and it seemed like > > > > > a good time to re-send this patch. > > > > > > > > > >> My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in > > > > >> libcamera to get much better control over how many buffers should be allocated. > > > > > > > > > > In my understanding min_queued_buffers identifies how many buffers > > > > > should be queued before calling start_streaming, and this comes > > > > > directly from an hw/driver requirement. This doesn't mean that at > > > > > least min_queue_buffers should be queued at all the times during > > > > > streaming, at least, I don't see how and where videobuf2 enforces > > > > > this. Or does it ? > > > > > > > > It's an intrinsic property of the HW/driver: e.g. if it needs two buffers > > > > queued up for the DMA engine to work, then it really is always holding on > > > > to two buffers. The only thing the framework does is postpone calling > > > > start_streaming until that number of buffers is queued to ensure the > > > > DMA engine has what it needs to start. But after that vb2 doesn't check > > > > it. > > > > > > The "driver" part of "HW/driver" is important here, as drivers can > > > influence this in multiple ways. One of them is usage of scratch > > > buffers, but even without that, a DMA engine that requires two buffers > > > can easily be operated with a single buffer by programming the DMA > > > engine with the same buffer address twice. Drivers should really do so > > > unless they really can't. > > > > > > > > If the above is correct, then the number of buffers to be queued > > > > > during streaming is, in my opinion, less an hw/driver requirement but > > > > > more an application decision. > > > > > > > > No, min_queued_buffers is a HW/drivers property: the DMA engine can't > > > > start until that many buffers are queued up, and once it is started > > > > it will always hold on to that many buffers. > > > > I get it, my point was that once start_streaming has been called, even > > if min_queued_buffers=2, there is nothing preventing userspace from > > queing one buffer at the time once the first two have completed. Sure, the > > hw/driver might not like it, but while delaying start_streaming > > prevents bad things from happening, there is nothing in the core that > > prevents applications from potentially stalling the capture > > operations. > > > > But I get your point, if the system needs 2 buffers to start > > streaming, it will probably need two buffers to continue producing > > frames. > > > > > That's not always true. The imx7-media-csi driver, for instance, sets > > > min_queued_buffers to 2, but allocates scratch buffers and uses them at > > > runtime, so that it can return all queued buffers to userspace. > > > > That's interesting. From your mention of "scratch buffers" I get there > > actually is a need to have 2 buffers queued to the HW ? How does > > that work, after all queuing a buffer to the DMA engine usually means > > pointing its write engine to one (set of) addresses. > > > > Or is it a driver-only requirement to ask for two buffers ? > > The hardware has a ping-pong mechanism with two addresses and switches > between them automatically. > > > > Grepping for min_queued_buffers I see drivers setting it to 4 > > > (rcar-dma.c, rzg2l-video.c), 6 (cxusb-analog.c) or even 9 > > > (zoran_driver.c) ! I doubt the zoran driver holds on to 9 buffers at > > > runtime. Your statement is not universally true today?. > > > > > > This could be considered as driver issues, and the min_queued_buffers > > > values should be fixed to match the runtime behaviour. In some cases I > > > expect it will require more work than just changing the value, as > > > drivers may implement the logic to operate with less buffers at runtime > > > but not at start time. This would be fixable, but it may also call for > > > asking if the start at runtime behaviours need to be identical. > > > > > > > So the application has to know somehow how many buffers are needed to > > > > actually stream. One way is via VIDIOC_REQBUFS since that is supposed to > > > > always return a workable number of buffers, the other is by actually > > > > reporting the minimum number of buffers as per my RFC. > > > > > > > > > As you said an application should be good with> 3 buffers (one queued, one currently being written to, one to be > > > > > consumed by the application), but in very specific cases where an > > > > > application retains the buffer for longer, for whatever reason, it > > > > > might need a larger number of queued buffers to provide the DMA > > > > > engines a space where to write data without them being discarded (to > > > > > scratch buffers or discarded by the DMA engine itself, if the HW > > > > > supports that). Or maybe an application is fine to drop frames and > > > > > only queue buffers sporadically (if the HW supports that ofc). > > > > > > > > > > For libcamera, and for this specific platform in particular, we're > > > > > going to base new developments on the assumption that > > > > > min_queued_buffers == 0, and it would be more convenient for use to be > > > > > able to access its value from userspace to identify if we're running > > > > > on a kernel with or without this patch being applied. > > > > > > > > So my proposal in my RFC to expose min_num_buffers would work for libcamera? > > > > It sounds like that's what you need. > > > > My immediate need is to know if I'm running on a "legacy" version of > > this driver that still requires 3 buffers for no apparent reason, or > > on a new version. Your proposal might work, but I still feel like we > > should report the HW/driver requirement (min_queued_buffers) instead > > of trying to suggest applications how many buffers they need to > > allocate to get "smooth streaming" or similar, as the use cases > > might be different. > > > > > It may be useful, but I think we may also just require min_num_buffers > > > == 0 for a device to be supported in libcamera. We have to implement > > > > While I concur this would be ideal, how would it work for existing > > rkisp1 implementation that do not include this patch ? libcamera > > should be able to run on both, probably in two different "modes" /o\ > > A simple option is to check the kernel version, we do that in a few > places. Over time we'll increase the minimum kernel version and drop I considered that, but this patch is pretty easy to backport, having something that tells to userspace the value of min_queued_buffers might be useful indeed. > support for legacy APIs. > > > > APIs such as the Android camera HAL that has no concept of buffers being > > > kept by the device. This could possibly be handled within libcamera by > > > allocating scratch buffers in userspace, but that comes with other > > > challenges. I would like to at least try to get help from the kernel > > > until proven that it's a bad idea. > > -- > Regards, > > Laurent Pinchart
On 29/10/2024 08:01, Jacopo Mondi wrote: > Hi Laurent > > On Mon, Oct 28, 2024 at 08:39:36PM +0200, Laurent Pinchart wrote: >> On Mon, Oct 28, 2024 at 06:08:18PM +0100, Jacopo Mondi wrote: >>> On Mon, Oct 28, 2024 at 06:21:41PM +0200, Laurent Pinchart wrote: >>>> On Mon, Oct 28, 2024 at 04:48:55PM +0100, Hans Verkuil wrote: >>>>> On 28/10/2024 16:30, Jacopo Mondi wrote: >>>>>> On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: >>>>>>> On 28/10/2024 15:35, Jacopo Mondi wrote: >>>>>>>> There apparently is no reason to require 3 queued buffers to call >>>>>>>> streamon() for the RkISP1 as the driver operates with a scratch buffer >>>>>>>> where frames can be directed to if there's no available buffer provided >>>>>>>> by userspace. >>>>>>>> >>>>>>>> Reduce the number of required buffers to 0 to allow applications to >>>>>>>> operate by queueing capture buffers on-demand. >>>>>>>> >>>>>>>> Tested with libcamera, by operating with a single capture request. The >>>>>>>> same request (and associated capture buffer) gets recycled once >>>>>>>> completed. This of course causes a frame rate drop but doesn't hinder >>>>>>>> operations. >>>>>>>> >>>>>>>> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> >>>>>>>> --- >>>>>>>> The first version of this patch set min_queued_buffers to 1, but setting it >>>>>>>> to 0 doesn't compromise operations and it's even better as it allows application >>>>>>>> to queue buffers to the capture devices on-demand. If a buffer is not provided >>>>>>>> to the DMA engines, image data gets directed to the driver's internal scratch >>>>>>>> buffer. >>>>>>>> --- >>>>>>>> drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- >>>>>>>> 1 file changed, 1 insertion(+), 3 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c >>>>>>>> index 2bddb4fa8a5c..5fcf9731f41b 100644 >>>>>>>> --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c >>>>>>>> +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c >>>>>>>> @@ -35,8 +35,6 @@ >>>>>>>> #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" >>>>>>>> #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" >>>>>>>> >>>>>>>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 >>>>>>>> - >>>>>>>> enum rkisp1_plane { >>>>>>>> RKISP1_PLANE_Y = 0, >>>>>>>> RKISP1_PLANE_CB = 1, >>>>>>>> @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) >>>>>>>> q->ops = &rkisp1_vb2_ops; >>>>>>>> q->mem_ops = &vb2_dma_contig_memops; >>>>>>>> q->buf_struct_size = sizeof(struct rkisp1_buffer); >>>>>>>> - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; >>>>>>>> + q->min_queued_buffers = 0; >>>>>>> >>>>>>> You can probably just drop this since the vb2_queue struct is zeroed when it >>>>>>> is allocated. So no need to set it to 0. >>>>>> >>>>>> I suspected so :) >>>>>> >>>>>>> >>>>>>> And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? >>>>>> >>>>>> No, and this patch removes it in facts >>>>>> >>>>>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 >>>>>> - >>>>> >>>>> I should have checked the patch :-) Sorry for the noise. >>>>> >>>>>>> >>>>>>> Also, see my RFC I posted today: >>>>>>> >>>>>>> https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u >>>>>>> >>>>>>> My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 >>>>>>> and expect the driver to change that to a workable value, will, in fact, now just get >>>>>>> one buffer. And streaming that will cause lots of frame drops. >>>>>>> >>>>>>> It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, >>>>>>> but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. >>>>>> >>>>>> Yeah, I've read the discussion between you and Tomi and it seemed like >>>>>> a good time to re-send this patch. >>>>>> >>>>>>> My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in >>>>>>> libcamera to get much better control over how many buffers should be allocated. >>>>>> >>>>>> In my understanding min_queued_buffers identifies how many buffers >>>>>> should be queued before calling start_streaming, and this comes >>>>>> directly from an hw/driver requirement. This doesn't mean that at >>>>>> least min_queue_buffers should be queued at all the times during >>>>>> streaming, at least, I don't see how and where videobuf2 enforces >>>>>> this. Or does it ? >>>>> >>>>> It's an intrinsic property of the HW/driver: e.g. if it needs two buffers >>>>> queued up for the DMA engine to work, then it really is always holding on >>>>> to two buffers. The only thing the framework does is postpone calling >>>>> start_streaming until that number of buffers is queued to ensure the >>>>> DMA engine has what it needs to start. But after that vb2 doesn't check >>>>> it. >>>> >>>> The "driver" part of "HW/driver" is important here, as drivers can >>>> influence this in multiple ways. One of them is usage of scratch >>>> buffers, but even without that, a DMA engine that requires two buffers >>>> can easily be operated with a single buffer by programming the DMA >>>> engine with the same buffer address twice. Drivers should really do so >>>> unless they really can't. >>>> >>>>>> If the above is correct, then the number of buffers to be queued >>>>>> during streaming is, in my opinion, less an hw/driver requirement but >>>>>> more an application decision. >>>>> >>>>> No, min_queued_buffers is a HW/drivers property: the DMA engine can't >>>>> start until that many buffers are queued up, and once it is started >>>>> it will always hold on to that many buffers. >>> >>> I get it, my point was that once start_streaming has been called, even >>> if min_queued_buffers=2, there is nothing preventing userspace from >>> queing one buffer at the time once the first two have completed. Sure, the >>> hw/driver might not like it, but while delaying start_streaming >>> prevents bad things from happening, there is nothing in the core that >>> prevents applications from potentially stalling the capture >>> operations. >>> >>> But I get your point, if the system needs 2 buffers to start >>> streaming, it will probably need two buffers to continue producing >>> frames. Right, it won't be able to complete anything and return it to userspace until it gets a third buffer. Only then can it return one buffer to userspace. >>> >>>> That's not always true. The imx7-media-csi driver, for instance, sets >>>> min_queued_buffers to 2, but allocates scratch buffers and uses them at >>>> runtime, so that it can return all queued buffers to userspace. >>> >>> That's interesting. From your mention of "scratch buffers" I get there >>> actually is a need to have 2 buffers queued to the HW ? How does >>> that work, after all queuing a buffer to the DMA engine usually means >>> pointing its write engine to one (set of) addresses. >>> >>> Or is it a driver-only requirement to ask for two buffers ? >> >> The hardware has a ping-pong mechanism with two addresses and switches >> between them automatically. >> >>>> Grepping for min_queued_buffers I see drivers setting it to 4 >>>> (rcar-dma.c, rzg2l-video.c), 6 (cxusb-analog.c) or even 9 >>>> (zoran_driver.c) ! I doubt the zoran driver holds on to 9 buffers at >>>> runtime. Your statement is not universally true today?. >>>> >>>> This could be considered as driver issues, and the min_queued_buffers >>>> values should be fixed to match the runtime behaviour. In some cases I >>>> expect it will require more work than just changing the value, as >>>> drivers may implement the logic to operate with less buffers at runtime >>>> but not at start time. This would be fixable, but it may also call for >>>> asking if the start at runtime behaviours need to be identical. >>>> >>>>> So the application has to know somehow how many buffers are needed to >>>>> actually stream. One way is via VIDIOC_REQBUFS since that is supposed to >>>>> always return a workable number of buffers, the other is by actually >>>>> reporting the minimum number of buffers as per my RFC. >>>>> >>>>>> As you said an application should be good with> 3 buffers (one queued, one currently being written to, one to be >>>>>> consumed by the application), but in very specific cases where an >>>>>> application retains the buffer for longer, for whatever reason, it >>>>>> might need a larger number of queued buffers to provide the DMA >>>>>> engines a space where to write data without them being discarded (to >>>>>> scratch buffers or discarded by the DMA engine itself, if the HW >>>>>> supports that). Or maybe an application is fine to drop frames and >>>>>> only queue buffers sporadically (if the HW supports that ofc). >>>>>> >>>>>> For libcamera, and for this specific platform in particular, we're >>>>>> going to base new developments on the assumption that >>>>>> min_queued_buffers == 0, and it would be more convenient for use to be >>>>>> able to access its value from userspace to identify if we're running >>>>>> on a kernel with or without this patch being applied. >>>>> >>>>> So my proposal in my RFC to expose min_num_buffers would work for libcamera? >>>>> It sounds like that's what you need. >>> >>> My immediate need is to know if I'm running on a "legacy" version of >>> this driver that still requires 3 buffers for no apparent reason, or >>> on a new version. Your proposal might work, but I still feel like we >>> should report the HW/driver requirement (min_queued_buffers) instead >>> of trying to suggest applications how many buffers they need to >>> allocate to get "smooth streaming" or similar, as the use cases >>> might be different. >>> >>>> It may be useful, but I think we may also just require min_num_buffers >>>> == 0 for a device to be supported in libcamera. We have to implement >>> >>> While I concur this would be ideal, how would it work for existing >>> rkisp1 implementation that do not include this patch ? libcamera >>> should be able to run on both, probably in two different "modes" /o\ >> >> A simple option is to check the kernel version, we do that in a few >> places. Over time we'll increase the minimum kernel version and drop > > I considered that, but this patch is pretty easy to backport, having > something that tells to userspace the value of > min_queued_buffers might be useful indeed. The implementation is similar to that of the new max_num_buffers field in struct v4l2_create_buffers: it will be signaled by a new buffer capability flag. So you can check that in the code. Much better than relying on kernel versions. Regards, Hans > >> support for legacy APIs. >> >>>> APIs such as the Android camera HAL that has no concept of buffers being >>>> kept by the device. This could possibly be handled within libcamera by >>>> allocating scratch buffers in userspace, but that comes with other >>>> challenges. I would like to at least try to get help from the kernel >>>> until proven that it's a bad idea. >> >> -- >> Regards, >> >> Laurent Pinchart
On Tue, Oct 29, 2024 at 09:26:58AM +0100, Hans Verkuil wrote: > On 29/10/2024 08:01, Jacopo Mondi wrote: > > On Mon, Oct 28, 2024 at 08:39:36PM +0200, Laurent Pinchart wrote: > >> On Mon, Oct 28, 2024 at 06:08:18PM +0100, Jacopo Mondi wrote: > >>> On Mon, Oct 28, 2024 at 06:21:41PM +0200, Laurent Pinchart wrote: > >>>> On Mon, Oct 28, 2024 at 04:48:55PM +0100, Hans Verkuil wrote: > >>>>> On 28/10/2024 16:30, Jacopo Mondi wrote: > >>>>>> On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: > >>>>>>> On 28/10/2024 15:35, Jacopo Mondi wrote: > >>>>>>>> There apparently is no reason to require 3 queued buffers to call > >>>>>>>> streamon() for the RkISP1 as the driver operates with a scratch buffer > >>>>>>>> where frames can be directed to if there's no available buffer provided > >>>>>>>> by userspace. > >>>>>>>> > >>>>>>>> Reduce the number of required buffers to 0 to allow applications to > >>>>>>>> operate by queueing capture buffers on-demand. > >>>>>>>> > >>>>>>>> Tested with libcamera, by operating with a single capture request. The > >>>>>>>> same request (and associated capture buffer) gets recycled once > >>>>>>>> completed. This of course causes a frame rate drop but doesn't hinder > >>>>>>>> operations. > >>>>>>>> > >>>>>>>> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> > >>>>>>>> --- > >>>>>>>> The first version of this patch set min_queued_buffers to 1, but setting it > >>>>>>>> to 0 doesn't compromise operations and it's even better as it allows application > >>>>>>>> to queue buffers to the capture devices on-demand. If a buffer is not provided > >>>>>>>> to the DMA engines, image data gets directed to the driver's internal scratch > >>>>>>>> buffer. > >>>>>>>> --- > >>>>>>>> drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- > >>>>>>>> 1 file changed, 1 insertion(+), 3 deletions(-) > >>>>>>>> > >>>>>>>> diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > >>>>>>>> index 2bddb4fa8a5c..5fcf9731f41b 100644 > >>>>>>>> --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > >>>>>>>> +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > >>>>>>>> @@ -35,8 +35,6 @@ > >>>>>>>> #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" > >>>>>>>> #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" > >>>>>>>> > >>>>>>>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 > >>>>>>>> - > >>>>>>>> enum rkisp1_plane { > >>>>>>>> RKISP1_PLANE_Y = 0, > >>>>>>>> RKISP1_PLANE_CB = 1, > >>>>>>>> @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) > >>>>>>>> q->ops = &rkisp1_vb2_ops; > >>>>>>>> q->mem_ops = &vb2_dma_contig_memops; > >>>>>>>> q->buf_struct_size = sizeof(struct rkisp1_buffer); > >>>>>>>> - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; > >>>>>>>> + q->min_queued_buffers = 0; > >>>>>>> > >>>>>>> You can probably just drop this since the vb2_queue struct is zeroed when it > >>>>>>> is allocated. So no need to set it to 0. > >>>>>> > >>>>>> I suspected so :) > >>>>>> > >>>>>>> > >>>>>>> And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? > >>>>>> > >>>>>> No, and this patch removes it in facts > >>>>>> > >>>>>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 > >>>>>> - > >>>>> > >>>>> I should have checked the patch :-) Sorry for the noise. > >>>>> > >>>>>>> > >>>>>>> Also, see my RFC I posted today: > >>>>>>> > >>>>>>> https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u > >>>>>>> > >>>>>>> My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 > >>>>>>> and expect the driver to change that to a workable value, will, in fact, now just get > >>>>>>> one buffer. And streaming that will cause lots of frame drops. > >>>>>>> > >>>>>>> It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, > >>>>>>> but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. > >>>>>> > >>>>>> Yeah, I've read the discussion between you and Tomi and it seemed like > >>>>>> a good time to re-send this patch. > >>>>>> > >>>>>>> My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in > >>>>>>> libcamera to get much better control over how many buffers should be allocated. > >>>>>> > >>>>>> In my understanding min_queued_buffers identifies how many buffers > >>>>>> should be queued before calling start_streaming, and this comes > >>>>>> directly from an hw/driver requirement. This doesn't mean that at > >>>>>> least min_queue_buffers should be queued at all the times during > >>>>>> streaming, at least, I don't see how and where videobuf2 enforces > >>>>>> this. Or does it ? > >>>>> > >>>>> It's an intrinsic property of the HW/driver: e.g. if it needs two buffers > >>>>> queued up for the DMA engine to work, then it really is always holding on > >>>>> to two buffers. The only thing the framework does is postpone calling > >>>>> start_streaming until that number of buffers is queued to ensure the > >>>>> DMA engine has what it needs to start. But after that vb2 doesn't check > >>>>> it. > >>>> > >>>> The "driver" part of "HW/driver" is important here, as drivers can > >>>> influence this in multiple ways. One of them is usage of scratch > >>>> buffers, but even without that, a DMA engine that requires two buffers > >>>> can easily be operated with a single buffer by programming the DMA > >>>> engine with the same buffer address twice. Drivers should really do so > >>>> unless they really can't. > >>>> > >>>>>> If the above is correct, then the number of buffers to be queued > >>>>>> during streaming is, in my opinion, less an hw/driver requirement but > >>>>>> more an application decision. > >>>>> > >>>>> No, min_queued_buffers is a HW/drivers property: the DMA engine can't > >>>>> start until that many buffers are queued up, and once it is started > >>>>> it will always hold on to that many buffers. > >>> > >>> I get it, my point was that once start_streaming has been called, even > >>> if min_queued_buffers=2, there is nothing preventing userspace from > >>> queing one buffer at the time once the first two have completed. Sure, the > >>> hw/driver might not like it, but while delaying start_streaming > >>> prevents bad things from happening, there is nothing in the core that > >>> prevents applications from potentially stalling the capture > >>> operations. > >>> > >>> But I get your point, if the system needs 2 buffers to start > >>> streaming, it will probably need two buffers to continue producing > >>> frames. > > Right, it won't be able to complete anything and return it to userspace > until it gets a third buffer. Only then can it return one buffer to userspace. > > >>>> That's not always true. The imx7-media-csi driver, for instance, sets > >>>> min_queued_buffers to 2, but allocates scratch buffers and uses them at > >>>> runtime, so that it can return all queued buffers to userspace. > >>> > >>> That's interesting. From your mention of "scratch buffers" I get there > >>> actually is a need to have 2 buffers queued to the HW ? How does > >>> that work, after all queuing a buffer to the DMA engine usually means > >>> pointing its write engine to one (set of) addresses. > >>> > >>> Or is it a driver-only requirement to ask for two buffers ? > >> > >> The hardware has a ping-pong mechanism with two addresses and switches > >> between them automatically. > >> > >>>> Grepping for min_queued_buffers I see drivers setting it to 4 > >>>> (rcar-dma.c, rzg2l-video.c), 6 (cxusb-analog.c) or even 9 > >>>> (zoran_driver.c) ! I doubt the zoran driver holds on to 9 buffers at > >>>> runtime. Your statement is not universally true today?. > >>>> > >>>> This could be considered as driver issues, and the min_queued_buffers > >>>> values should be fixed to match the runtime behaviour. In some cases I > >>>> expect it will require more work than just changing the value, as > >>>> drivers may implement the logic to operate with less buffers at runtime > >>>> but not at start time. This would be fixable, but it may also call for > >>>> asking if the start at runtime behaviours need to be identical. > >>>> > >>>>> So the application has to know somehow how many buffers are needed to > >>>>> actually stream. One way is via VIDIOC_REQBUFS since that is supposed to > >>>>> always return a workable number of buffers, the other is by actually > >>>>> reporting the minimum number of buffers as per my RFC. > >>>>> > >>>>>> As you said an application should be good with> 3 buffers (one queued, one currently being written to, one to be > >>>>>> consumed by the application), but in very specific cases where an > >>>>>> application retains the buffer for longer, for whatever reason, it > >>>>>> might need a larger number of queued buffers to provide the DMA > >>>>>> engines a space where to write data without them being discarded (to > >>>>>> scratch buffers or discarded by the DMA engine itself, if the HW > >>>>>> supports that). Or maybe an application is fine to drop frames and > >>>>>> only queue buffers sporadically (if the HW supports that ofc). > >>>>>> > >>>>>> For libcamera, and for this specific platform in particular, we're > >>>>>> going to base new developments on the assumption that > >>>>>> min_queued_buffers == 0, and it would be more convenient for use to be > >>>>>> able to access its value from userspace to identify if we're running > >>>>>> on a kernel with or without this patch being applied. > >>>>> > >>>>> So my proposal in my RFC to expose min_num_buffers would work for libcamera? > >>>>> It sounds like that's what you need. > >>> > >>> My immediate need is to know if I'm running on a "legacy" version of > >>> this driver that still requires 3 buffers for no apparent reason, or > >>> on a new version. Your proposal might work, but I still feel like we > >>> should report the HW/driver requirement (min_queued_buffers) instead > >>> of trying to suggest applications how many buffers they need to > >>> allocate to get "smooth streaming" or similar, as the use cases > >>> might be different. > >>> > >>>> It may be useful, but I think we may also just require min_num_buffers > >>>> == 0 for a device to be supported in libcamera. We have to implement > >>> > >>> While I concur this would be ideal, how would it work for existing > >>> rkisp1 implementation that do not include this patch ? libcamera > >>> should be able to run on both, probably in two different "modes" /o\ > >> > >> A simple option is to check the kernel version, we do that in a few > >> places. Over time we'll increase the minimum kernel version and drop > > > > I considered that, but this patch is pretty easy to backport, having > > something that tells to userspace the value of > > min_queued_buffers might be useful indeed. > > The implementation is similar to that of the new max_num_buffers field > in struct v4l2_create_buffers: it will be signaled by a new buffer > capability flag. So you can check that in the code. Much better than > relying on kernel versions. Sure, I agree that an explicit API is better. I'm not a big fan of adding a field to v4l2_create_buffers though (I didn't realize it was extended with a max_num_buffers field) as it would require calling VIDIOC_CREATE_BUFS to get the information. That can get quite awkward to use for userspace. This being said, libcamera today won't behave properly with rkisp1, as it won't be able to give the application the last buffers, as guaranteed by the libcamera API. Relying on min_num_buffers == 0 will fix it and won't cause any regression on kernels that still use 3, it will just keep the current behaviour. > >> support for legacy APIs. > >> > >>>> APIs such as the Android camera HAL that has no concept of buffers being > >>>> kept by the device. This could possibly be handled within libcamera by > >>>> allocating scratch buffers in userspace, but that comes with other > >>>> challenges. I would like to at least try to get help from the kernel > >>>> until proven that it's a bad idea.
On 10/29/24 09:56, Laurent Pinchart wrote: > On Tue, Oct 29, 2024 at 09:26:58AM +0100, Hans Verkuil wrote: >> On 29/10/2024 08:01, Jacopo Mondi wrote: >>> On Mon, Oct 28, 2024 at 08:39:36PM +0200, Laurent Pinchart wrote: >>>> On Mon, Oct 28, 2024 at 06:08:18PM +0100, Jacopo Mondi wrote: >>>>> On Mon, Oct 28, 2024 at 06:21:41PM +0200, Laurent Pinchart wrote: >>>>>> On Mon, Oct 28, 2024 at 04:48:55PM +0100, Hans Verkuil wrote: >>>>>>> On 28/10/2024 16:30, Jacopo Mondi wrote: >>>>>>>> On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: >>>>>>>>> On 28/10/2024 15:35, Jacopo Mondi wrote: >>>>>>>>>> There apparently is no reason to require 3 queued buffers to call >>>>>>>>>> streamon() for the RkISP1 as the driver operates with a scratch buffer >>>>>>>>>> where frames can be directed to if there's no available buffer provided >>>>>>>>>> by userspace. >>>>>>>>>> >>>>>>>>>> Reduce the number of required buffers to 0 to allow applications to >>>>>>>>>> operate by queueing capture buffers on-demand. >>>>>>>>>> >>>>>>>>>> Tested with libcamera, by operating with a single capture request. The >>>>>>>>>> same request (and associated capture buffer) gets recycled once >>>>>>>>>> completed. This of course causes a frame rate drop but doesn't hinder >>>>>>>>>> operations. >>>>>>>>>> >>>>>>>>>> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> >>>>>>>>>> --- >>>>>>>>>> The first version of this patch set min_queued_buffers to 1, but setting it >>>>>>>>>> to 0 doesn't compromise operations and it's even better as it allows application >>>>>>>>>> to queue buffers to the capture devices on-demand. If a buffer is not provided >>>>>>>>>> to the DMA engines, image data gets directed to the driver's internal scratch >>>>>>>>>> buffer. >>>>>>>>>> --- >>>>>>>>>> drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- >>>>>>>>>> 1 file changed, 1 insertion(+), 3 deletions(-) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c >>>>>>>>>> index 2bddb4fa8a5c..5fcf9731f41b 100644 >>>>>>>>>> --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c >>>>>>>>>> +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c >>>>>>>>>> @@ -35,8 +35,6 @@ >>>>>>>>>> #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" >>>>>>>>>> #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" >>>>>>>>>> >>>>>>>>>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 >>>>>>>>>> - >>>>>>>>>> enum rkisp1_plane { >>>>>>>>>> RKISP1_PLANE_Y = 0, >>>>>>>>>> RKISP1_PLANE_CB = 1, >>>>>>>>>> @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) >>>>>>>>>> q->ops = &rkisp1_vb2_ops; >>>>>>>>>> q->mem_ops = &vb2_dma_contig_memops; >>>>>>>>>> q->buf_struct_size = sizeof(struct rkisp1_buffer); >>>>>>>>>> - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; >>>>>>>>>> + q->min_queued_buffers = 0; >>>>>>>>> >>>>>>>>> You can probably just drop this since the vb2_queue struct is zeroed when it >>>>>>>>> is allocated. So no need to set it to 0. >>>>>>>> >>>>>>>> I suspected so :) >>>>>>>> >>>>>>>>> >>>>>>>>> And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? >>>>>>>> >>>>>>>> No, and this patch removes it in facts >>>>>>>> >>>>>>>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 >>>>>>>> - >>>>>>> >>>>>>> I should have checked the patch :-) Sorry for the noise. >>>>>>> >>>>>>>>> >>>>>>>>> Also, see my RFC I posted today: >>>>>>>>> >>>>>>>>> https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u >>>>>>>>> >>>>>>>>> My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 >>>>>>>>> and expect the driver to change that to a workable value, will, in fact, now just get >>>>>>>>> one buffer. And streaming that will cause lots of frame drops. >>>>>>>>> >>>>>>>>> It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, >>>>>>>>> but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. >>>>>>>> >>>>>>>> Yeah, I've read the discussion between you and Tomi and it seemed like >>>>>>>> a good time to re-send this patch. >>>>>>>> >>>>>>>>> My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in >>>>>>>>> libcamera to get much better control over how many buffers should be allocated. >>>>>>>> >>>>>>>> In my understanding min_queued_buffers identifies how many buffers >>>>>>>> should be queued before calling start_streaming, and this comes >>>>>>>> directly from an hw/driver requirement. This doesn't mean that at >>>>>>>> least min_queue_buffers should be queued at all the times during >>>>>>>> streaming, at least, I don't see how and where videobuf2 enforces >>>>>>>> this. Or does it ? >>>>>>> >>>>>>> It's an intrinsic property of the HW/driver: e.g. if it needs two buffers >>>>>>> queued up for the DMA engine to work, then it really is always holding on >>>>>>> to two buffers. The only thing the framework does is postpone calling >>>>>>> start_streaming until that number of buffers is queued to ensure the >>>>>>> DMA engine has what it needs to start. But after that vb2 doesn't check >>>>>>> it. >>>>>> >>>>>> The "driver" part of "HW/driver" is important here, as drivers can >>>>>> influence this in multiple ways. One of them is usage of scratch >>>>>> buffers, but even without that, a DMA engine that requires two buffers >>>>>> can easily be operated with a single buffer by programming the DMA >>>>>> engine with the same buffer address twice. Drivers should really do so >>>>>> unless they really can't. >>>>>> >>>>>>>> If the above is correct, then the number of buffers to be queued >>>>>>>> during streaming is, in my opinion, less an hw/driver requirement but >>>>>>>> more an application decision. >>>>>>> >>>>>>> No, min_queued_buffers is a HW/drivers property: the DMA engine can't >>>>>>> start until that many buffers are queued up, and once it is started >>>>>>> it will always hold on to that many buffers. >>>>> >>>>> I get it, my point was that once start_streaming has been called, even >>>>> if min_queued_buffers=2, there is nothing preventing userspace from >>>>> queing one buffer at the time once the first two have completed. Sure, the >>>>> hw/driver might not like it, but while delaying start_streaming >>>>> prevents bad things from happening, there is nothing in the core that >>>>> prevents applications from potentially stalling the capture >>>>> operations. >>>>> >>>>> But I get your point, if the system needs 2 buffers to start >>>>> streaming, it will probably need two buffers to continue producing >>>>> frames. >> >> Right, it won't be able to complete anything and return it to userspace >> until it gets a third buffer. Only then can it return one buffer to userspace. >> >>>>>> That's not always true. The imx7-media-csi driver, for instance, sets >>>>>> min_queued_buffers to 2, but allocates scratch buffers and uses them at >>>>>> runtime, so that it can return all queued buffers to userspace. >>>>> >>>>> That's interesting. From your mention of "scratch buffers" I get there >>>>> actually is a need to have 2 buffers queued to the HW ? How does >>>>> that work, after all queuing a buffer to the DMA engine usually means >>>>> pointing its write engine to one (set of) addresses. >>>>> >>>>> Or is it a driver-only requirement to ask for two buffers ? >>>> >>>> The hardware has a ping-pong mechanism with two addresses and switches >>>> between them automatically. >>>> >>>>>> Grepping for min_queued_buffers I see drivers setting it to 4 >>>>>> (rcar-dma.c, rzg2l-video.c), 6 (cxusb-analog.c) or even 9 >>>>>> (zoran_driver.c) ! I doubt the zoran driver holds on to 9 buffers at >>>>>> runtime. Your statement is not universally true today?. >>>>>> >>>>>> This could be considered as driver issues, and the min_queued_buffers >>>>>> values should be fixed to match the runtime behaviour. In some cases I >>>>>> expect it will require more work than just changing the value, as >>>>>> drivers may implement the logic to operate with less buffers at runtime >>>>>> but not at start time. This would be fixable, but it may also call for >>>>>> asking if the start at runtime behaviours need to be identical. >>>>>> >>>>>>> So the application has to know somehow how many buffers are needed to >>>>>>> actually stream. One way is via VIDIOC_REQBUFS since that is supposed to >>>>>>> always return a workable number of buffers, the other is by actually >>>>>>> reporting the minimum number of buffers as per my RFC. >>>>>>> >>>>>>>> As you said an application should be good with> 3 buffers (one queued, one currently being written to, one to be >>>>>>>> consumed by the application), but in very specific cases where an >>>>>>>> application retains the buffer for longer, for whatever reason, it >>>>>>>> might need a larger number of queued buffers to provide the DMA >>>>>>>> engines a space where to write data without them being discarded (to >>>>>>>> scratch buffers or discarded by the DMA engine itself, if the HW >>>>>>>> supports that). Or maybe an application is fine to drop frames and >>>>>>>> only queue buffers sporadically (if the HW supports that ofc). >>>>>>>> >>>>>>>> For libcamera, and for this specific platform in particular, we're >>>>>>>> going to base new developments on the assumption that >>>>>>>> min_queued_buffers == 0, and it would be more convenient for use to be >>>>>>>> able to access its value from userspace to identify if we're running >>>>>>>> on a kernel with or without this patch being applied. >>>>>>> >>>>>>> So my proposal in my RFC to expose min_num_buffers would work for libcamera? >>>>>>> It sounds like that's what you need. >>>>> >>>>> My immediate need is to know if I'm running on a "legacy" version of >>>>> this driver that still requires 3 buffers for no apparent reason, or >>>>> on a new version. Your proposal might work, but I still feel like we >>>>> should report the HW/driver requirement (min_queued_buffers) instead >>>>> of trying to suggest applications how many buffers they need to >>>>> allocate to get "smooth streaming" or similar, as the use cases >>>>> might be different. >>>>> >>>>>> It may be useful, but I think we may also just require min_num_buffers >>>>>> == 0 for a device to be supported in libcamera. We have to implement >>>>> >>>>> While I concur this would be ideal, how would it work for existing >>>>> rkisp1 implementation that do not include this patch ? libcamera >>>>> should be able to run on both, probably in two different "modes" /o\ >>>> >>>> A simple option is to check the kernel version, we do that in a few >>>> places. Over time we'll increase the minimum kernel version and drop >>> >>> I considered that, but this patch is pretty easy to backport, having >>> something that tells to userspace the value of >>> min_queued_buffers might be useful indeed. >> >> The implementation is similar to that of the new max_num_buffers field >> in struct v4l2_create_buffers: it will be signaled by a new buffer >> capability flag. So you can check that in the code. Much better than >> relying on kernel versions. > > Sure, I agree that an explicit API is better. I'm not a big fan of > adding a field to v4l2_create_buffers though (I didn't realize it was > extended with a max_num_buffers field) as it would require calling > VIDIOC_CREATE_BUFS to get the information. That can get quite awkward to > use for userspace. Why would that be awkward? If called with count = 0, then it will just report the capabilities and it will not create any buffers. Regards, Hans > > This being said, libcamera today won't behave properly with rkisp1, as > it won't be able to give the application the last buffers, as guaranteed > by the libcamera API. Relying on min_num_buffers == 0 will fix it and > won't cause any regression on kernels that still use 3, it will just > keep the current behaviour. > >>>> support for legacy APIs. >>>> >>>>>> APIs such as the Android camera HAL that has no concept of buffers being >>>>>> kept by the device. This could possibly be handled within libcamera by >>>>>> allocating scratch buffers in userspace, but that comes with other >>>>>> challenges. I would like to at least try to get help from the kernel >>>>>> until proven that it's a bad idea. >
On Tue, Oct 29, 2024 at 10:54:31AM +0100, Hans Verkuil wrote: > On 10/29/24 09:56, Laurent Pinchart wrote: > > On Tue, Oct 29, 2024 at 09:26:58AM +0100, Hans Verkuil wrote: > >> On 29/10/2024 08:01, Jacopo Mondi wrote: > >>> On Mon, Oct 28, 2024 at 08:39:36PM +0200, Laurent Pinchart wrote: > >>>> On Mon, Oct 28, 2024 at 06:08:18PM +0100, Jacopo Mondi wrote: > >>>>> On Mon, Oct 28, 2024 at 06:21:41PM +0200, Laurent Pinchart wrote: > >>>>>> On Mon, Oct 28, 2024 at 04:48:55PM +0100, Hans Verkuil wrote: > >>>>>>> On 28/10/2024 16:30, Jacopo Mondi wrote: > >>>>>>>> On Mon, Oct 28, 2024 at 04:02:13PM +0100, Hans Verkuil wrote: > >>>>>>>>> On 28/10/2024 15:35, Jacopo Mondi wrote: > >>>>>>>>>> There apparently is no reason to require 3 queued buffers to call > >>>>>>>>>> streamon() for the RkISP1 as the driver operates with a scratch buffer > >>>>>>>>>> where frames can be directed to if there's no available buffer provided > >>>>>>>>>> by userspace. > >>>>>>>>>> > >>>>>>>>>> Reduce the number of required buffers to 0 to allow applications to > >>>>>>>>>> operate by queueing capture buffers on-demand. > >>>>>>>>>> > >>>>>>>>>> Tested with libcamera, by operating with a single capture request. The > >>>>>>>>>> same request (and associated capture buffer) gets recycled once > >>>>>>>>>> completed. This of course causes a frame rate drop but doesn't hinder > >>>>>>>>>> operations. > >>>>>>>>>> > >>>>>>>>>> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> > >>>>>>>>>> --- > >>>>>>>>>> The first version of this patch set min_queued_buffers to 1, but setting it > >>>>>>>>>> to 0 doesn't compromise operations and it's even better as it allows application > >>>>>>>>>> to queue buffers to the capture devices on-demand. If a buffer is not provided > >>>>>>>>>> to the DMA engines, image data gets directed to the driver's internal scratch > >>>>>>>>>> buffer. > >>>>>>>>>> --- > >>>>>>>>>> drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- > >>>>>>>>>> 1 file changed, 1 insertion(+), 3 deletions(-) > >>>>>>>>>> > >>>>>>>>>> diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > >>>>>>>>>> index 2bddb4fa8a5c..5fcf9731f41b 100644 > >>>>>>>>>> --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > >>>>>>>>>> +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c > >>>>>>>>>> @@ -35,8 +35,6 @@ > >>>>>>>>>> #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" > >>>>>>>>>> #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" > >>>>>>>>>> > >>>>>>>>>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 > >>>>>>>>>> - > >>>>>>>>>> enum rkisp1_plane { > >>>>>>>>>> RKISP1_PLANE_Y = 0, > >>>>>>>>>> RKISP1_PLANE_CB = 1, > >>>>>>>>>> @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) > >>>>>>>>>> q->ops = &rkisp1_vb2_ops; > >>>>>>>>>> q->mem_ops = &vb2_dma_contig_memops; > >>>>>>>>>> q->buf_struct_size = sizeof(struct rkisp1_buffer); > >>>>>>>>>> - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; > >>>>>>>>>> + q->min_queued_buffers = 0; > >>>>>>>>> > >>>>>>>>> You can probably just drop this since the vb2_queue struct is zeroed when it > >>>>>>>>> is allocated. So no need to set it to 0. > >>>>>>>> > >>>>>>>> I suspected so :) > >>>>>>>> > >>>>>>>>> > >>>>>>>>> And is the RKISP1_MIN_BUFFERS_NEEDED define still needed after this change? > >>>>>>>> > >>>>>>>> No, and this patch removes it in facts > >>>>>>>> > >>>>>>>> -#define RKISP1_MIN_BUFFERS_NEEDED 3 > >>>>>>>> - > >>>>>>> > >>>>>>> I should have checked the patch :-) Sorry for the noise. > >>>>>>> > >>>>>>>>> > >>>>>>>>> Also, see my RFC I posted today: > >>>>>>>>> > >>>>>>>>> https://lore.kernel.org/linux-media/126cd76a-6224-483b-a18d-a3cc89e5ff2d@xs4all.nl/T/#u > >>>>>>>>> > >>>>>>>>> My main concern is that applications that just call VIDIOC_REQBUFS with count = 1 > >>>>>>>>> and expect the driver to change that to a workable value, will, in fact, now just get > >>>>>>>>> one buffer. And streaming that will cause lots of frame drops. > >>>>>>>>> > >>>>>>>>> It makes sense to leave min_queued_buffers at 0 if a scratch buffer is available, > >>>>>>>>> but I'm unhappy with the fact that you get a poor experience when REQBUFS(1) is called. > >>>>>>>> > >>>>>>>> Yeah, I've read the discussion between you and Tomi and it seemed like > >>>>>>>> a good time to re-send this patch. > >>>>>>>> > >>>>>>>>> My RFC suggests improvements in the uAPI. With that in place you can use CREATE_BUFS in > >>>>>>>>> libcamera to get much better control over how many buffers should be allocated. > >>>>>>>> > >>>>>>>> In my understanding min_queued_buffers identifies how many buffers > >>>>>>>> should be queued before calling start_streaming, and this comes > >>>>>>>> directly from an hw/driver requirement. This doesn't mean that at > >>>>>>>> least min_queue_buffers should be queued at all the times during > >>>>>>>> streaming, at least, I don't see how and where videobuf2 enforces > >>>>>>>> this. Or does it ? > >>>>>>> > >>>>>>> It's an intrinsic property of the HW/driver: e.g. if it needs two buffers > >>>>>>> queued up for the DMA engine to work, then it really is always holding on > >>>>>>> to two buffers. The only thing the framework does is postpone calling > >>>>>>> start_streaming until that number of buffers is queued to ensure the > >>>>>>> DMA engine has what it needs to start. But after that vb2 doesn't check > >>>>>>> it. > >>>>>> > >>>>>> The "driver" part of "HW/driver" is important here, as drivers can > >>>>>> influence this in multiple ways. One of them is usage of scratch > >>>>>> buffers, but even without that, a DMA engine that requires two buffers > >>>>>> can easily be operated with a single buffer by programming the DMA > >>>>>> engine with the same buffer address twice. Drivers should really do so > >>>>>> unless they really can't. > >>>>>> > >>>>>>>> If the above is correct, then the number of buffers to be queued > >>>>>>>> during streaming is, in my opinion, less an hw/driver requirement but > >>>>>>>> more an application decision. > >>>>>>> > >>>>>>> No, min_queued_buffers is a HW/drivers property: the DMA engine can't > >>>>>>> start until that many buffers are queued up, and once it is started > >>>>>>> it will always hold on to that many buffers. > >>>>> > >>>>> I get it, my point was that once start_streaming has been called, even > >>>>> if min_queued_buffers=2, there is nothing preventing userspace from > >>>>> queing one buffer at the time once the first two have completed. Sure, the > >>>>> hw/driver might not like it, but while delaying start_streaming > >>>>> prevents bad things from happening, there is nothing in the core that > >>>>> prevents applications from potentially stalling the capture > >>>>> operations. > >>>>> > >>>>> But I get your point, if the system needs 2 buffers to start > >>>>> streaming, it will probably need two buffers to continue producing > >>>>> frames. > >> > >> Right, it won't be able to complete anything and return it to userspace > >> until it gets a third buffer. Only then can it return one buffer to userspace. > >> > >>>>>> That's not always true. The imx7-media-csi driver, for instance, sets > >>>>>> min_queued_buffers to 2, but allocates scratch buffers and uses them at > >>>>>> runtime, so that it can return all queued buffers to userspace. > >>>>> > >>>>> That's interesting. From your mention of "scratch buffers" I get there > >>>>> actually is a need to have 2 buffers queued to the HW ? How does > >>>>> that work, after all queuing a buffer to the DMA engine usually means > >>>>> pointing its write engine to one (set of) addresses. > >>>>> > >>>>> Or is it a driver-only requirement to ask for two buffers ? > >>>> > >>>> The hardware has a ping-pong mechanism with two addresses and switches > >>>> between them automatically. > >>>> > >>>>>> Grepping for min_queued_buffers I see drivers setting it to 4 > >>>>>> (rcar-dma.c, rzg2l-video.c), 6 (cxusb-analog.c) or even 9 > >>>>>> (zoran_driver.c) ! I doubt the zoran driver holds on to 9 buffers at > >>>>>> runtime. Your statement is not universally true today?. > >>>>>> > >>>>>> This could be considered as driver issues, and the min_queued_buffers > >>>>>> values should be fixed to match the runtime behaviour. In some cases I > >>>>>> expect it will require more work than just changing the value, as > >>>>>> drivers may implement the logic to operate with less buffers at runtime > >>>>>> but not at start time. This would be fixable, but it may also call for > >>>>>> asking if the start at runtime behaviours need to be identical. > >>>>>> > >>>>>>> So the application has to know somehow how many buffers are needed to > >>>>>>> actually stream. One way is via VIDIOC_REQBUFS since that is supposed to > >>>>>>> always return a workable number of buffers, the other is by actually > >>>>>>> reporting the minimum number of buffers as per my RFC. > >>>>>>> > >>>>>>>> As you said an application should be good with> 3 buffers (one queued, one currently being written to, one to be > >>>>>>>> consumed by the application), but in very specific cases where an > >>>>>>>> application retains the buffer for longer, for whatever reason, it > >>>>>>>> might need a larger number of queued buffers to provide the DMA > >>>>>>>> engines a space where to write data without them being discarded (to > >>>>>>>> scratch buffers or discarded by the DMA engine itself, if the HW > >>>>>>>> supports that). Or maybe an application is fine to drop frames and > >>>>>>>> only queue buffers sporadically (if the HW supports that ofc). > >>>>>>>> > >>>>>>>> For libcamera, and for this specific platform in particular, we're > >>>>>>>> going to base new developments on the assumption that > >>>>>>>> min_queued_buffers == 0, and it would be more convenient for use to be > >>>>>>>> able to access its value from userspace to identify if we're running > >>>>>>>> on a kernel with or without this patch being applied. > >>>>>>> > >>>>>>> So my proposal in my RFC to expose min_num_buffers would work for libcamera? > >>>>>>> It sounds like that's what you need. > >>>>> > >>>>> My immediate need is to know if I'm running on a "legacy" version of > >>>>> this driver that still requires 3 buffers for no apparent reason, or > >>>>> on a new version. Your proposal might work, but I still feel like we > >>>>> should report the HW/driver requirement (min_queued_buffers) instead > >>>>> of trying to suggest applications how many buffers they need to > >>>>> allocate to get "smooth streaming" or similar, as the use cases > >>>>> might be different. > >>>>> > >>>>>> It may be useful, but I think we may also just require min_num_buffers > >>>>>> == 0 for a device to be supported in libcamera. We have to implement > >>>>> > >>>>> While I concur this would be ideal, how would it work for existing > >>>>> rkisp1 implementation that do not include this patch ? libcamera > >>>>> should be able to run on both, probably in two different "modes" /o\ > >>>> > >>>> A simple option is to check the kernel version, we do that in a few > >>>> places. Over time we'll increase the minimum kernel version and drop > >>> > >>> I considered that, but this patch is pretty easy to backport, having > >>> something that tells to userspace the value of > >>> min_queued_buffers might be useful indeed. > >> > >> The implementation is similar to that of the new max_num_buffers field > >> in struct v4l2_create_buffers: it will be signaled by a new buffer > >> capability flag. So you can check that in the code. Much better than > >> relying on kernel versions. > > > > Sure, I agree that an explicit API is better. I'm not a big fan of > > adding a field to v4l2_create_buffers though (I didn't realize it was > > extended with a max_num_buffers field) as it would require calling > > VIDIOC_CREATE_BUFS to get the information. That can get quite awkward to > > use for userspace. > > Why would that be awkward? If called with count = 0, then it will just > report the capabilities and it will not create any buffers. I wasn't aware you could call it with count = 0 for that purpose. This addresses my concerns. > > This being said, libcamera today won't behave properly with rkisp1, as > > it won't be able to give the application the last buffers, as guaranteed > > by the libcamera API. Relying on min_num_buffers == 0 will fix it and > > won't cause any regression on kernels that still use 3, it will just > > keep the current behaviour. > > > >>>> support for legacy APIs. > >>>> > >>>>>> APIs such as the Android camera HAL that has no concept of buffers being > >>>>>> kept by the device. This could possibly be handled within libcamera by > >>>>>> allocating scratch buffers in userspace, but that comes with other > >>>>>> challenges. I would like to at least try to get help from the kernel > >>>>>> until proven that it's a bad idea.
diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c index 2bddb4fa8a5c..5fcf9731f41b 100644 --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c @@ -35,8 +35,6 @@ #define RKISP1_SP_DEV_NAME RKISP1_DRIVER_NAME "_selfpath" #define RKISP1_MP_DEV_NAME RKISP1_DRIVER_NAME "_mainpath" -#define RKISP1_MIN_BUFFERS_NEEDED 3 - enum rkisp1_plane { RKISP1_PLANE_Y = 0, RKISP1_PLANE_CB = 1, @@ -1563,7 +1561,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) q->ops = &rkisp1_vb2_ops; q->mem_ops = &vb2_dma_contig_memops; q->buf_struct_size = sizeof(struct rkisp1_buffer); - q->min_queued_buffers = RKISP1_MIN_BUFFERS_NEEDED; + q->min_queued_buffers = 0; q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC; q->lock = &node->vlock; q->dev = cap->rkisp1->dev;
There apparently is no reason to require 3 queued buffers to call streamon() for the RkISP1 as the driver operates with a scratch buffer where frames can be directed to if there's no available buffer provided by userspace. Reduce the number of required buffers to 0 to allow applications to operate by queueing capture buffers on-demand. Tested with libcamera, by operating with a single capture request. The same request (and associated capture buffer) gets recycled once completed. This of course causes a frame rate drop but doesn't hinder operations. Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> --- The first version of this patch set min_queued_buffers to 1, but setting it to 0 doesn't compromise operations and it's even better as it allows application to queue buffers to the capture devices on-demand. If a buffer is not provided to the DMA engines, image data gets directed to the driver's internal scratch buffer. --- drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) -- 2.47.0