Message ID | 1535576973-8067-1-git-send-email-eajames@linux.vnet.ibm.com (mailing list archive) |
---|---|
Headers | show |
Series | media: platform: Add Aspeed Video Engine driver | expand |
Quoting Eddie James (2018-08-29 14:09:29) > The Video Engine (VE) embedded in the Aspeed AST2400 and AST2500 SOCs > can capture and compress video data from digital or analog sources. With > the Aspeed chip acting as a service processor, the Video Engine can > capture the host processor graphics output. > > This series adds a V4L2 driver for the VE, providing a read() interface > only. The driver triggers the hardware to capture the host graphics output > and compress it to JPEG format. > > Testing on an AST2500 determined that the videobuf/streaming/mmap interface > was significantly slower than the simple read() interface, so I have not > included the streaming part. > > It's also possible to use an automatic mode for the VE such that > re-triggering the HW every frame isn't necessary. However this wasn't > reliable on the AST2400, and probably used more CPU anyway due to excessive > interrupts. It was approximately 15% faster. > > The series also adds the necessary parent clock definitions to the Aspeed > clock driver, with both a mux and clock divider. Please let me know your merge strategy here. I can ack the clk patches because they look fine from high-level clk driver perspective (maybe Joel can take a closer look) or I can merge the patches into clk-next and get them into next release while the video driver gets reviewed.
On 08/31/2018 12:56 PM, Stephen Boyd wrote: > Quoting Eddie James (2018-08-29 14:09:29) >> The Video Engine (VE) embedded in the Aspeed AST2400 and AST2500 SOCs >> can capture and compress video data from digital or analog sources. With >> the Aspeed chip acting as a service processor, the Video Engine can >> capture the host processor graphics output. >> >> This series adds a V4L2 driver for the VE, providing a read() interface >> only. The driver triggers the hardware to capture the host graphics output >> and compress it to JPEG format. >> >> Testing on an AST2500 determined that the videobuf/streaming/mmap interface >> was significantly slower than the simple read() interface, so I have not >> included the streaming part. >> >> It's also possible to use an automatic mode for the VE such that >> re-triggering the HW every frame isn't necessary. However this wasn't >> reliable on the AST2400, and probably used more CPU anyway due to excessive >> interrupts. It was approximately 15% faster. >> >> The series also adds the necessary parent clock definitions to the Aspeed >> clock driver, with both a mux and clock divider. > Please let me know your merge strategy here. I can ack the clk patches > because they look fine from high-level clk driver perspective (maybe > Joel can take a closer look) or I can merge the patches into clk-next > and get them into next release while the video driver gets reviewed. Thanks for taking a look! Probably preferable to get the clk patches into clk-next first (though Joel reviewing would be great). I just put everything in the same set for the sake of explaining the necessity of the clk changes. Thanks, Eddie >
Quoting Eddie James (2018-08-31 12:30:02) > > > On 08/31/2018 12:56 PM, Stephen Boyd wrote: > > Quoting Eddie James (2018-08-29 14:09:29) > > Please let me know your merge strategy here. I can ack the clk patches > > because they look fine from high-level clk driver perspective (maybe > > Joel can take a closer look) or I can merge the patches into clk-next > > and get them into next release while the video driver gets reviewed. > > Thanks for taking a look! Probably preferable to get the clk patches > into clk-next first (though Joel reviewing would be great). I just put > everything in the same set for the sake of explaining the necessity of > the clk changes. > Ok. If Joel is able to review then I can easily merge it into clk-next.
Hi Eddie, Thank you for your work on this. Interesting to see support for this SoC :-) On 08/29/2018 11:09 PM, Eddie James wrote: > The Video Engine (VE) embedded in the Aspeed AST2400 and AST2500 SOCs > can capture and compress video data from digital or analog sources. With > the Aspeed chip acting as a service processor, the Video Engine can > capture the host processor graphics output. > > This series adds a V4L2 driver for the VE, providing a read() interface > only. The driver triggers the hardware to capture the host graphics output > and compress it to JPEG format. > > Testing on an AST2500 determined that the videobuf/streaming/mmap interface > was significantly slower than the simple read() interface, so I have not > included the streaming part. Do you know why? It should be equal or faster, not slower. I reviewed about half of the driver, but then I stopped since there were too many things missing. First of all, you need to test your driver with v4l2-compliance (available here: https://git.linuxtv.org/v4l-utils.git/). Always compile from the git repo since the versions from distros tend to be too old. Just run 'v4l2-compliance -d /dev/videoX' and fix all issues. Then run 'v4l2-compliance -s -d /dev/videoX' to test streaming. This utility checks if the driver follows the V4L2 API correctly, implements all ioctls that it should and fills in all the fields that it should. Please add the output of 'v4l2-compliance -s' to future versions of this patch series: I don't accept V4L2 drivers without a clean report of this utility. If you have any questions, then mail me or (usually quicker) ask on the #v4l freenode irc channel (I'm in the CET timezone). One thing that needs more explanation: from what I could tell from the driver the VIDIOC_G_FMT ioctl returns the detected format instead of the current format. This is wrong. Instead you should implement the VIDIOC_*_DV_TIMINGS ioctls and the V4L2_EVENT_SOURCE_CHANGE event. The normal sequence is that userspace queries the current timings with VIDIOC_QUERY_DV_TIMINGS, if it finds valid timings, then it sets these timings with _S_DV_TIMINGS. Now it can call G/S_FMT. If the timings change, then the driver should detect that and send a V4L2_EVENT_SOURCE_CHANGE event. When the application receives this event it can take action, such as increasing the size of the buffer for the jpeg data that it reads into. The reason for this sequence of events is that you can't just change the format/resolution mid-stream without giving userspace the chance to reconfigure. Regards, Hans > > It's also possible to use an automatic mode for the VE such that > re-triggering the HW every frame isn't necessary. However this wasn't > reliable on the AST2400, and probably used more CPU anyway due to excessive > interrupts. It was approximately 15% faster. > > The series also adds the necessary parent clock definitions to the Aspeed > clock driver, with both a mux and clock divider. > > Eddie James (4): > clock: aspeed: Add VIDEO reset index definition > clock: aspeed: Setup video engine clocking > dt-bindings: media: Add Aspeed Video Engine binding documentation > media: platform: Add Aspeed Video Engine driver > > .../devicetree/bindings/media/aspeed-video.txt | 23 + > drivers/clk/clk-aspeed.c | 41 +- > drivers/media/platform/Kconfig | 8 + > drivers/media/platform/Makefile | 1 + > drivers/media/platform/aspeed-video.c | 1307 ++++++++++++++++++++ > include/dt-bindings/clock/aspeed-clock.h | 1 + > 6 files changed, 1379 insertions(+), 2 deletions(-) > create mode 100644 Documentation/devicetree/bindings/media/aspeed-video.txt > create mode 100644 drivers/media/platform/aspeed-video.c >
On 09/03/2018 06:57 AM, Hans Verkuil wrote: > Hi Eddie, > > Thank you for your work on this. Interesting to see support for this SoC :-) > > On 08/29/2018 11:09 PM, Eddie James wrote: >> The Video Engine (VE) embedded in the Aspeed AST2400 and AST2500 SOCs >> can capture and compress video data from digital or analog sources. With >> the Aspeed chip acting as a service processor, the Video Engine can >> capture the host processor graphics output. >> >> This series adds a V4L2 driver for the VE, providing a read() interface >> only. The driver triggers the hardware to capture the host graphics output >> and compress it to JPEG format. >> >> Testing on an AST2500 determined that the videobuf/streaming/mmap interface >> was significantly slower than the simple read() interface, so I have not >> included the streaming part. > Do you know why? It should be equal or faster, not slower. Yes, it seems to be an issue with the timing of the video engine interrupts compared with how a normal v4l2 application queues buffers. With the simple read() application, the driver can swap between DMA buffers freely and get a frame ahead. With the streaming buffers, I found the driver ran through the queue quite quickly, but then, once userspace queues again, we had to wait for the next frame, as I couldn't get a frame ahead since no buffers were available during that time period. This could possibly be solved with more buffers but this gets to require a lot of memory, since each buffer is allocated for the full frame size even though we only fill a fraction of it with JPEG data... > > I reviewed about half of the driver, but then I stopped since there were too > many things missing. > > First of all, you need to test your driver with v4l2-compliance (available here: > https://git.linuxtv.org/v4l-utils.git/). Always compile from the git repo since > the versions from distros tend to be too old. > > Just run 'v4l2-compliance -d /dev/videoX' and fix all issues. Then run > 'v4l2-compliance -s -d /dev/videoX' to test streaming. > > This utility checks if the driver follows the V4L2 API correctly, implements > all ioctls that it should and fills in all the fields that it should. > > Please add the output of 'v4l2-compliance -s' to future versions of this patch > series: I don't accept V4L2 drivers without a clean report of this utility. Sure thing. Thanks for the guidance. > > If you have any questions, then mail me or (usually quicker) ask on the #v4l > freenode irc channel (I'm in the CET timezone). > > One thing that needs more explanation: from what I could tell from the driver > the VIDIOC_G_FMT ioctl returns the detected format instead of the current > format. This is wrong. Instead you should implement the VIDIOC_*_DV_TIMINGS > ioctls and the V4L2_EVENT_SOURCE_CHANGE event. > > The normal sequence is that userspace queries the current timings with > VIDIOC_QUERY_DV_TIMINGS, if it finds valid timings, then it sets these > timings with _S_DV_TIMINGS. Now it can call G/S_FMT. If the timings > change, then the driver should detect that and send a V4L2_EVENT_SOURCE_CHANGE > event. OK I see. I ended up simplifying this part anyway since it's not possible to change the video size from the driver. I don't think there is a need for VIDIOC_QUERY_DV_TIMINGS now, but feel free to review. Thanks again, Eddie > > When the application receives this event it can take action, such as > increasing the size of the buffer for the jpeg data that it reads into. > > The reason for this sequence of events is that you can't just change the > format/resolution mid-stream without giving userspace the chance to > reconfigure. > > Regards, > > Hans > >> It's also possible to use an automatic mode for the VE such that >> re-triggering the HW every frame isn't necessary. However this wasn't >> reliable on the AST2400, and probably used more CPU anyway due to excessive >> interrupts. It was approximately 15% faster. >> >> The series also adds the necessary parent clock definitions to the Aspeed >> clock driver, with both a mux and clock divider. >> >> Eddie James (4): >> clock: aspeed: Add VIDEO reset index definition >> clock: aspeed: Setup video engine clocking >> dt-bindings: media: Add Aspeed Video Engine binding documentation >> media: platform: Add Aspeed Video Engine driver >> >> .../devicetree/bindings/media/aspeed-video.txt | 23 + >> drivers/clk/clk-aspeed.c | 41 +- >> drivers/media/platform/Kconfig | 8 + >> drivers/media/platform/Makefile | 1 + >> drivers/media/platform/aspeed-video.c | 1307 ++++++++++++++++++++ >> include/dt-bindings/clock/aspeed-clock.h | 1 + >> 6 files changed, 1379 insertions(+), 2 deletions(-) >> create mode 100644 Documentation/devicetree/bindings/media/aspeed-video.txt >> create mode 100644 drivers/media/platform/aspeed-video.c >>
On 09/13/2018 09:11 PM, Eddie James wrote: > > > On 09/03/2018 06:57 AM, Hans Verkuil wrote: >> Hi Eddie, >> >> Thank you for your work on this. Interesting to see support for this SoC :-) >> >> On 08/29/2018 11:09 PM, Eddie James wrote: >>> The Video Engine (VE) embedded in the Aspeed AST2400 and AST2500 SOCs >>> can capture and compress video data from digital or analog sources. With >>> the Aspeed chip acting as a service processor, the Video Engine can >>> capture the host processor graphics output. >>> >>> This series adds a V4L2 driver for the VE, providing a read() interface >>> only. The driver triggers the hardware to capture the host graphics output >>> and compress it to JPEG format. >>> >>> Testing on an AST2500 determined that the videobuf/streaming/mmap interface >>> was significantly slower than the simple read() interface, so I have not >>> included the streaming part. >> Do you know why? It should be equal or faster, not slower. > > Yes, it seems to be an issue with the timing of the video engine > interrupts compared with how a normal v4l2 application queues buffers. > With the simple read() application, the driver can swap between DMA > buffers freely and get a frame ahead. With the streaming buffers, I > found the driver ran through the queue quite quickly, but then, once > userspace queues again, we had to wait for the next frame, as I couldn't > get a frame ahead since no buffers were available during that time > period. This could possibly be solved with more buffers but this gets to > require a lot of memory, since each buffer is allocated for the full > frame size even though we only fill a fraction of it with JPEG data... For stream I/O you usually need 3 buffers: one is being DMAed, one is the next queued up frame for the DMA and the third is being processed in userspace. If userspace doesn't process buffers fast enough, then the driver will need to capture into the same buffer over and over again until userspace finally queues another buffer. What I don't understand here is what the frame rate is. Is the capture framerate the same as the host processor graphics output? Or is it unrelated to that? The problem with read() is that 1) it requires copying the video data, and 2) you cannot use dmabuf for zero-copying pipelines. Whether or not 2 is needed depends on your hardware. > >> >> I reviewed about half of the driver, but then I stopped since there were too >> many things missing. >> >> First of all, you need to test your driver with v4l2-compliance (available here: >> https://git.linuxtv.org/v4l-utils.git/). Always compile from the git repo since >> the versions from distros tend to be too old. >> >> Just run 'v4l2-compliance -d /dev/videoX' and fix all issues. Then run >> 'v4l2-compliance -s -d /dev/videoX' to test streaming. >> >> This utility checks if the driver follows the V4L2 API correctly, implements >> all ioctls that it should and fills in all the fields that it should. >> >> Please add the output of 'v4l2-compliance -s' to future versions of this patch >> series: I don't accept V4L2 drivers without a clean report of this utility. > > Sure thing. Thanks for the guidance. > >> >> If you have any questions, then mail me or (usually quicker) ask on the #v4l >> freenode irc channel (I'm in the CET timezone). >> >> One thing that needs more explanation: from what I could tell from the driver >> the VIDIOC_G_FMT ioctl returns the detected format instead of the current >> format. This is wrong. Instead you should implement the VIDIOC_*_DV_TIMINGS >> ioctls and the V4L2_EVENT_SOURCE_CHANGE event. >> >> The normal sequence is that userspace queries the current timings with >> VIDIOC_QUERY_DV_TIMINGS, if it finds valid timings, then it sets these >> timings with _S_DV_TIMINGS. Now it can call G/S_FMT. If the timings >> change, then the driver should detect that and send a V4L2_EVENT_SOURCE_CHANGE >> event. > > OK I see. I ended up simplifying this part anyway since it's not > possible to change the video size from the driver. I don't think there > is a need for VIDIOC_QUERY_DV_TIMINGS now, but feel free to review. You are capturing the host graphics output, right? So you need to know the resolution, timings, etc. of that output? What if the host changes resolution? That's something you need to know, or am I missing something? This is obviously a somewhat different environment than what I am used to, so bear with me if I ask stupid questions... Regards, Hans > > Thanks again, > Eddie > >> >> When the application receives this event it can take action, such as >> increasing the size of the buffer for the jpeg data that it reads into. >> >> The reason for this sequence of events is that you can't just change the >> format/resolution mid-stream without giving userspace the chance to >> reconfigure. >> >> Regards, >> >> Hans >> >>> It's also possible to use an automatic mode for the VE such that >>> re-triggering the HW every frame isn't necessary. However this wasn't >>> reliable on the AST2400, and probably used more CPU anyway due to excessive >>> interrupts. It was approximately 15% faster. >>> >>> The series also adds the necessary parent clock definitions to the Aspeed >>> clock driver, with both a mux and clock divider. >>> >>> Eddie James (4): >>> clock: aspeed: Add VIDEO reset index definition >>> clock: aspeed: Setup video engine clocking >>> dt-bindings: media: Add Aspeed Video Engine binding documentation >>> media: platform: Add Aspeed Video Engine driver >>> >>> .../devicetree/bindings/media/aspeed-video.txt | 23 + >>> drivers/clk/clk-aspeed.c | 41 +- >>> drivers/media/platform/Kconfig | 8 + >>> drivers/media/platform/Makefile | 1 + >>> drivers/media/platform/aspeed-video.c | 1307 ++++++++++++++++++++ >>> include/dt-bindings/clock/aspeed-clock.h | 1 + >>> 6 files changed, 1379 insertions(+), 2 deletions(-) >>> create mode 100644 Documentation/devicetree/bindings/media/aspeed-video.txt >>> create mode 100644 drivers/media/platform/aspeed-video.c >>> >
On 09/14/2018 01:56 AM, Hans Verkuil wrote: > On 09/13/2018 09:11 PM, Eddie James wrote: >> >> On 09/03/2018 06:57 AM, Hans Verkuil wrote: >>> Hi Eddie, >>> >>> Thank you for your work on this. Interesting to see support for this SoC :-) >>> >>> On 08/29/2018 11:09 PM, Eddie James wrote: >>>> The Video Engine (VE) embedded in the Aspeed AST2400 and AST2500 SOCs >>>> can capture and compress video data from digital or analog sources. With >>>> the Aspeed chip acting as a service processor, the Video Engine can >>>> capture the host processor graphics output. >>>> >>>> This series adds a V4L2 driver for the VE, providing a read() interface >>>> only. The driver triggers the hardware to capture the host graphics output >>>> and compress it to JPEG format. >>>> >>>> Testing on an AST2500 determined that the videobuf/streaming/mmap interface >>>> was significantly slower than the simple read() interface, so I have not >>>> included the streaming part. >>> Do you know why? It should be equal or faster, not slower. >> Yes, it seems to be an issue with the timing of the video engine >> interrupts compared with how a normal v4l2 application queues buffers. >> With the simple read() application, the driver can swap between DMA >> buffers freely and get a frame ahead. With the streaming buffers, I >> found the driver ran through the queue quite quickly, but then, once >> userspace queues again, we had to wait for the next frame, as I couldn't >> get a frame ahead since no buffers were available during that time >> period. This could possibly be solved with more buffers but this gets to >> require a lot of memory, since each buffer is allocated for the full >> frame size even though we only fill a fraction of it with JPEG data... > For stream I/O you usually need 3 buffers: one is being DMAed, one is > the next queued up frame for the DMA and the third is being processed > in userspace. If userspace doesn't process buffers fast enough, then > the driver will need to capture into the same buffer over and over again > until userspace finally queues another buffer. > > What I don't understand here is what the frame rate is. Is the capture > framerate the same as the host processor graphics output? Or is it > unrelated to that? Yes, the frame rate is the capture frame rate, unrelated to the host graphics output frame rate. It's really just a bandwidth-saving measure. > > The problem with read() is that 1) it requires copying the video data, and > 2) you cannot use dmabuf for zero-copying pipelines. Whether or not 2 is > needed depends on your hardware. > >>> I reviewed about half of the driver, but then I stopped since there were too >>> many things missing. >>> >>> First of all, you need to test your driver with v4l2-compliance (available here: >>> https://git.linuxtv.org/v4l-utils.git/). Always compile from the git repo since >>> the versions from distros tend to be too old. >>> >>> Just run 'v4l2-compliance -d /dev/videoX' and fix all issues. Then run >>> 'v4l2-compliance -s -d /dev/videoX' to test streaming. >>> >>> This utility checks if the driver follows the V4L2 API correctly, implements >>> all ioctls that it should and fills in all the fields that it should. >>> >>> Please add the output of 'v4l2-compliance -s' to future versions of this patch >>> series: I don't accept V4L2 drivers without a clean report of this utility. >> Sure thing. Thanks for the guidance. >> >>> If you have any questions, then mail me or (usually quicker) ask on the #v4l >>> freenode irc channel (I'm in the CET timezone). >>> >>> One thing that needs more explanation: from what I could tell from the driver >>> the VIDIOC_G_FMT ioctl returns the detected format instead of the current >>> format. This is wrong. Instead you should implement the VIDIOC_*_DV_TIMINGS >>> ioctls and the V4L2_EVENT_SOURCE_CHANGE event. >>> >>> The normal sequence is that userspace queries the current timings with >>> VIDIOC_QUERY_DV_TIMINGS, if it finds valid timings, then it sets these >>> timings with _S_DV_TIMINGS. Now it can call G/S_FMT. If the timings >>> change, then the driver should detect that and send a V4L2_EVENT_SOURCE_CHANGE >>> event. >> OK I see. I ended up simplifying this part anyway since it's not >> possible to change the video size from the driver. I don't think there >> is a need for VIDIOC_QUERY_DV_TIMINGS now, but feel free to review. > You are capturing the host graphics output, right? So you need to know > the resolution, timings, etc. of that output? What if the host changes > resolution? That's something you need to know, or am I missing something? Yes, that's why I had those extra lines in set_format, so that if the set format resolution matches the actual frame size, then it can change... I guess I'm probably mis-using the API. Will look into the source event change and timings calls. > > This is obviously a somewhat different environment than what I am used to, > so bear with me if I ask stupid questions... No problem! Thanks, Eddie > > Regards, > > Hans > >> Thanks again, >> Eddie >> >>> When the application receives this event it can take action, such as >>> increasing the size of the buffer for the jpeg data that it reads into. >>> >>> The reason for this sequence of events is that you can't just change the >>> format/resolution mid-stream without giving userspace the chance to >>> reconfigure. >>> >>> Regards, >>> >>> Hans >>> >>>> It's also possible to use an automatic mode for the VE such that >>>> re-triggering the HW every frame isn't necessary. However this wasn't >>>> reliable on the AST2400, and probably used more CPU anyway due to excessive >>>> interrupts. It was approximately 15% faster. >>>> >>>> The series also adds the necessary parent clock definitions to the Aspeed >>>> clock driver, with both a mux and clock divider. >>>> >>>> Eddie James (4): >>>> clock: aspeed: Add VIDEO reset index definition >>>> clock: aspeed: Setup video engine clocking >>>> dt-bindings: media: Add Aspeed Video Engine binding documentation >>>> media: platform: Add Aspeed Video Engine driver >>>> >>>> .../devicetree/bindings/media/aspeed-video.txt | 23 + >>>> drivers/clk/clk-aspeed.c | 41 +- >>>> drivers/media/platform/Kconfig | 8 + >>>> drivers/media/platform/Makefile | 1 + >>>> drivers/media/platform/aspeed-video.c | 1307 ++++++++++++++++++++ >>>> include/dt-bindings/clock/aspeed-clock.h | 1 + >>>> 6 files changed, 1379 insertions(+), 2 deletions(-) >>>> create mode 100644 Documentation/devicetree/bindings/media/aspeed-video.txt >>>> create mode 100644 drivers/media/platform/aspeed-video.c >>>>