

@dericed
Member

@dericed dericed commented Sep 17, 2025

This is a draft to get to #810. It starts by adding the FFREPORT environment variable, and the play or record command adds the metadata filter to print the hex captions there. A background process called cchex_to_display then periodically skims the end of the FFREPORT and uses a simple ASCII lookup table to convert some of the hex caption data into text; a drawtext filter then uses textfile and reload to draw it live. Currently the live captions only include basic ASCII and not any extended set, and no placement or control codes are processed. It is only implemented in the "Captions" viewer thus far.
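
A minimal sketch of the "skim the end of the log and convert hex to text" step described above, written in Python for illustration only; it is not the PR's cchex_to_display script. The log line shape (a key ending in `.cc=` followed by a 16-bit hex word) and the names `read_tail`, `decode_cc_word`, and `tail_to_text` are assumptions. Instead of a lookup table it strips the parity bit and keeps printable ASCII, so the few CEA-608 codes that differ from ASCII come out wrong, and control pairs are simply dropped.

```python
import os
import re

# Assumed log line shape (not confirmed by this thread): a metadata key
# ending in ".cc=" followed by a 16-bit hex word, e.g. "...cc=0xC8E5".
CC_WORD = re.compile(r"\.cc=0x([0-9A-Fa-f]{4})")


def read_tail(path, max_bytes=4096):
    """Return up to the last max_bytes of a file as text."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        size = handle.tell()
        handle.seek(max(0, size - max_bytes))
        return handle.read().decode("utf-8", errors="replace")


def decode_cc_word(word):
    """Turn one 4-hex-digit caption word into zero to two basic characters."""
    # Bit 7 of each byte is a parity bit, so mask it off first.
    first = int(word[0:2], 16) & 0x7F
    second = int(word[2:4], 16) & 0x7F
    if first < 0x20:
        # Null padding or a control/placement code pair: skipped in this sketch.
        return ""
    return "".join(chr(b) for b in (first, second) if 0x20 <= b <= 0x7E)


def tail_to_text(log_text):
    """Decode every caption word found in a chunk of log text, in order."""
    return "".join(decode_cc_word(w) for w in CC_WORD.findall(log_text))
```

A background loop could call `tail_to_text(read_tail(report_path))` every second or so and hand the result to whatever writes the drawtext file; `report_path` here is a placeholder for wherever FFREPORT points.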

Since the captions aren't really fully processed, I'm not using the black-boxed style that captions usually have; instead I'm using a shadowed, right-to-left scroll. I tried to do this all in ffmpeg, but I'm not sure it's possible without a video->subtitle filter. @richardpl I'm curious about your thoughts.

vrecord_live_captions.mp4

@bturkus
Collaborator

bturkus commented Sep 17, 2025

This CLIP....perfection

@richardpl

In theory you could do the same thing with ass/subtitle and, to a limited extent, with the drawtext filter (limited because it has fewer text styling features than the libass-powered filters). It would take a lot of work, with some runtime commands for dynamic insertion. This could be a nodelay-ish solution, as the conversion to text subtitles and drawing/overlaying back over the video is going to add additional delay, but for vrecord that is irrelevant. Anyway, I think it should be possible to do all of it in real time, but not yet with the ffmpeg tool alone.

When using the libraries directly, one could insert a new draw command into the filter on each frame displayed, and the code using the libraries would alone be responsible for converting the CC output to actual captions/formatted text; this is even more work than your current working "demo" solution here. As you all know, subtitle media filtering is still far away from ffmpeg. Some code was posted on the mailing list months ago, but no consensus was ever reached, so the changes were never merged. I never got the motivation to use such changes in my fork because of possible API/ABI conflicts with future versions of the ffmpeg code ...

It's also possible to write a new filter that would do the CC decoding and then the rendering within that same filter, which is even more work...

So, I'm afraid there are no shortcuts here, even though I would like to have one. Fully deciphering CC text into actual .ass styling is not trivial work for real-time, on-the-fly CC display. For offline, non-real-time use you just decode all frames, filter out the relevant lines, collect all the CC data, generate a file, and give it to the demuxer/decoder combo, but you all know this boring stuff already...

@dericed
Member Author

dericed commented Sep 18, 2025

@richardpl I didn't think the subtitle filter would have worked since the subtitle data doesn't exist when the filterchain starts. With the drawtext filter, I could pre-make a non-empty text file, run a background process to fill it periodically and then use textfile and reload so that the contents are periodically reread.
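
One detail of the textfile and reload approach above is worth a sketch: the drawtext documentation advises updating the text file atomically when reload is in use, since otherwise it may be read partially. Below is a minimal Python illustration of one way to do that, by writing a temporary file and renaming it into place. The function name `write_caption_text` and the single-space fallback for keeping the file non-empty are assumptions, not taken from this PR.

```python
import os
import tempfile


def write_caption_text(path, text):
    """Replace the caption text file in one step so a reader never sees a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that os.replace
    # stays on one filesystem and can swap the file in atomically.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".captions-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            # Keep the file non-empty even when there is nothing to show yet.
            handle.write(text if text else " ")
        os.replace(tmp_path, path)
    except Exception:
        # Clean up the temporary file if the write or the rename failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The background process would call this each time it has new caption text, while the filter keeps pointing at the same `path`.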

Would love for a V->S route in libavfilter someday. I remembered a patchset on that concept, but I don't think it was resolved.

At any rate, this ... works, and feasibly I could keep updating the cchex_to_text script to add extended characters and possibly even placement.

@iamdamosuzuki can you test this with a long tape? I'm hoping that it's not adding too much load to the process, but so far I've only tested on file-based inputs.

@dericed
Member Author

dericed commented Sep 19, 2025

I'm realizing that this approach of running cchex_to_display before the capture starts and stopping it after also fits in with how we could run a background process to start and stop vrecord conditionally. For example, we could have this script make the capture preview, but it could also occasionally check for a timecode value or whether the video went to black and then stop the capture. I suggest we finish up this PR first, and then I could build off this for #499.
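
As a rough illustration of the "video went to black" check floated above, here is a hypothetical Python sketch. Nothing in it comes from vrecord: the function name `should_stop`, the per-frame average luma input, and the threshold and frame-count defaults are all assumptions, and how those luma values would be gathered from the log is left open.

```python
def should_stop(luma_averages, black_level=20.0, min_frames=150):
    """Return True when the most recent min_frames samples are all at or below black_level."""
    if len(luma_averages) < min_frames:
        # Not enough history yet to call it a sustained black section.
        return False
    return all(value <= black_level for value in luma_averages[-min_frames:])
```

Requiring a sustained run rather than a single dark frame keeps fades and scene cuts from ending a capture early; the right run length would need testing on real tapes.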

@iamdamosuzuki
Contributor

I ran a 2 hour transfer and it dropped frames around the 01:55:00 mark:

 _________________________________________
/ WARNING: There were presentation        \
| timestamp discontinuities in the file's |
| frame MD5s for these frame ranges:      |
| 208286 208291 210840 210844 . This      |
| error may indicate frames dropped by    |
| FFmpeg or vrecord. The file may have    |
\ sync issues.                            /
 -----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
 ________________________________________
/ WARNING: FFmpeg Decklink input         \
| reported dropped frames in the         |
| following 6 locations. This error may  |
| indicate an interrupted signal between |
| hardware components. The file may be   |
| missing content. With decklink inputs, |
| this cow recommends reviewing your     |
| settings in Desktop Video Setup and    |
| setting the video and audio inputs to  |
\ match what those set in vrecord.       /
 ----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
Dropped frames timestamps: 01:55:49.843 01:55:50.109 01:57:14.994 01:57:15.194 01:59:28.861 01:59:29.061 
Checking file conformance against a mediaconch policy...
 2025-09-22T10:58:07 - File passed policy check for the video.

@dericed
Member Author

dericed commented Sep 22, 2025

@iamdamosuzuki did the caption preview work as expected?

@iamdamosuzuki
Contributor

@dericed yes it worked. It's only visible in the "Captions" View, right?

@dericed
Member Author

dericed commented Sep 22, 2025

Yes, for now. Once this is stable and mergeable, then we can think about adding it to other views.

@iamdamosuzuki
Contributor

I've had mixed results. Some transfers work fine; others have dropouts and other errors. Here are a few logs; I'll do another transfer today. I think the caption decoding is correct, but it'll probably have to wait for a refactor in order to be live during capture. Is there a way we could have this working just in passthrough mode for now, and push it to capture in a later PR?

20250922_CaptionViewer01_vrecord_input.log
20250922_CaptionViewer03_vrecord_input.log
20250923_CaptionViewer04_vrecord_input.log

@privatezero
Member

In theory that would just involve removing it from the list of available options that generate the drop down list for capture view modes, right?

Contributor

@iamdamosuzuki iamdamosuzuki left a comment

The caption decoding is working as expected. I dropped a few frames on one of my transfers, but not on any others. I say this is good to merge.

@dericed
Member Author

dericed commented Sep 29, 2025

@bturkus able to test as well?

@iamdamosuzuki
Contributor

Here's a log from another test. I dropped two frames at the very beginning. I think it may have been caused by me pressing the right arrow key, which seems to have confused the decoder. That's not standard operating procedure, so it's probably not a big deal.
20250929_CaptionViewer_01_vrecord_input.log

I do think we should consider having this view available only in passthrough mode and not in capture mode.

@dericed
Member Author

dericed commented Sep 29, 2025

Having it in passthrough only could work, but one opportunity in this approach is that we could use the same intermediate log that gathers caption data to also include timecode and some qctools basis data which could permit us to conditionally start and stop the capture.
