CCExtractor and detecting Closed Caption language #11

rlaphoenix · 2023-02-06T18:11:32Z

Describe the bug
Let's imagine a title that has 4 subtitle tracks as WebVTT: German, French, Italian, and Spanish. The title itself is English, but the only English sub is in a C608 box.

The current code does not run CCExtractor unless there are NO other subtitles available, so this title would end up missing English subtitles. However, we also cannot assume the C608 is English. If the C608 were German and we extracted it because there was no English sub, we would still be missing English subtitles and would now also have a duplicate German subtitle.

C608 boxes extracted via CCExtractor are currently missing language information (unless I'm not looking hard enough). We need to detect the language to be able to proceed with this effectively.

Expected behavior
CCExtractor should run if there are no subtitles in the title's original language. For example, if there are no English subtitles on an English video of The Sopranos, then it should run CCExtractor to check for potential English C608 boxes. It should also check the C608 box's language and ensure that it is English. If it is not English, it should only be used if there is no other subtitle in that language.
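The expected behavior above can be sketched roughly as two checks. This is only an illustration, not the project's actual code: `primary_tag`, `should_run_ccextractor`, and `should_keep_cc` are hypothetical names, languages are assumed to be simple tags like `en` or `de-DE`, and discarding a caption track whose language is unknown is an assumed policy.

```python
from typing import Iterable, Optional


def primary_tag(language: str) -> str:
    """Reduce a tag such as 'en-US' or 'en_US' to its primary subtag 'en'."""
    return language.replace("_", "-").split("-")[0].lower()


def should_run_ccextractor(original_language: str, subtitle_languages: Iterable[str]) -> bool:
    """Run CCExtractor only when no existing subtitle is in the original language."""
    wanted = primary_tag(original_language)
    return all(primary_tag(lang) != wanted for lang in subtitle_languages)


def should_keep_cc(cc_language: Optional[str], subtitle_languages: Iterable[str]) -> bool:
    """Keep an extracted C608 track only if it adds a language we do not have yet.

    cc_language is None when the language could not be determined.
    Assumption: an unknown-language track is discarded rather than guessed at.
    """
    if cc_language is None:
        return False
    detected = primary_tag(cc_language)
    # Covers both cases from the issue: an original-language CC fills the gap,
    # and any other CC is kept only when it is not a duplicate.
    return all(primary_tag(lang) != detected for lang in subtitle_languages)


if __name__ == "__main__":
    subs = ["de", "fr", "it", "es"]
    print(should_run_ccextractor("en", subs))  # no English sub, so extraction is worth trying
    print(should_keep_cc("de", subs))          # a German CC would only duplicate the WebVTT
```

Both checks compare primary subtags only, so `en-US` and `en-GB` count as the same language here. Whether that is the right granularity is a separate decision.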

Another option would be to have some way to detect what language a subtitle is in by analyzing its text content. If we can do that, then we could just check whether we still need a sub for the detected language and, if so, take it.
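As a rough idea of what text-based detection could look like, here is a toy stopword-counting sketch using only the standard library. Everything in it is an assumption for illustration: the `STOPWORDS` table, the `detect_language` name, and the `min_hits` threshold are made up, and the word lists are far too small for real use. A dedicated language-detection library would likely be a better fit in practice.

```python
import re
from typing import Dict, FrozenSet, Optional

# Tiny, hand-picked stopword lists (assumption: illustrative only, not exhaustive).
STOPWORDS: Dict[str, FrozenSet[str]] = {
    "en": frozenset({"the", "and", "is", "you", "that", "it", "of", "to", "what", "this"}),
    "de": frozenset({"der", "die", "das", "und", "ist", "nicht", "ich", "sie", "ein", "zu"}),
    "fr": frozenset({"le", "la", "les", "et", "est", "vous", "je", "pas", "que", "un"}),
    "it": frozenset({"il", "lo", "gli", "e", "non", "che", "di", "per", "sono", "una"}),
    "es": frozenset({"el", "los", "y", "es", "no", "que", "de", "por", "una", "usted"}),
}


def detect_language(text: str, min_hits: int = 3) -> Optional[str]:
    """Guess the language of subtitle text by counting stopword hits.

    Returns None when there is too little evidence or the top two languages tie.
    """
    # Letters only, so cue numbers, timestamps, and symbols are ignored.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(word in stops for word in words)
        for lang, stops in STOPWORDS.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_hits), (_, second_hits) = ranked[0], ranked[1]
    if best_hits < min_hits or best_hits == second_hits:
        return None
    return best


if __name__ == "__main__":
    print(detect_language("What is the point of this? You know that it is the truth."))
```

Returning `None` on weak or tied evidence matters here, since a wrong guess would reintroduce the duplicate-language problem described above.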

rlaphoenix added the bug label Feb 6, 2023

rlaphoenix changed the title ~~CCExtractor should run if there is no subtitles in the title's Original Language~~ CCExtractor and detecting Closed Caption language Feb 6, 2023

rlaphoenix added the help wanted label Mar 8, 2024

rlaphoenix pinned this issue Mar 8, 2024
