Describe the bug
Imagine a title with four WebVTT subtitles: German, French, Italian, and Spanish. The title itself is English, and its only English subtitle is a C608 box.
The current code only runs CCExtractor when no other subtitles are available, so this title would end up without English subtitles. However, we also cannot assume the C608 is English. If the C608 were German and we extracted it because there was no English subtitle, we would still be missing English subtitles and would now have a duplicate German subtitle.
C608 boxes extracted via CCExtractor currently carry no language information (unless I'm not looking hard enough). We need to detect the language to handle this properly.
Expected behavior
CCExtractor should run if there are no subtitles in the title's original language. For example, if an English title such as The Sopranos has no English subtitles, CCExtractor should run to check for English C608 boxes. It should also check the language of each C608 box: if it is English, use it; otherwise, use it only if there is no other subtitle in that language.
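The rule above could be sketched roughly as follows. This is only an illustration: `primary_tag`, `should_run_ccextractor`, and `should_keep_closed_caption` are hypothetical names, not existing functions in the project, and rejecting a closed caption whose language is unknown is an assumption rather than something decided in this issue.

```python
from typing import Iterable, Optional


def primary_tag(language: str) -> str:
    """Reduce a tag such as 'en-US' or 'en_US' to its primary subtag 'en'."""
    return language.replace("_", "-").split("-")[0].lower()


def should_run_ccextractor(original_language: str,
                           subtitle_languages: Iterable[str]) -> bool:
    """Run CCExtractor when no existing subtitle is in the original language."""
    wanted = primary_tag(original_language)
    return all(primary_tag(lang) != wanted for lang in subtitle_languages)


def should_keep_closed_caption(cc_language: Optional[str],
                               original_language: str,
                               subtitle_languages: Iterable[str]) -> bool:
    """Decide whether an extracted closed caption should be kept."""
    if cc_language is None:
        # Assumption: with no known language we cannot rule out a duplicate.
        return False
    existing = {primary_tag(lang) for lang in subtitle_languages}
    cc = primary_tag(cc_language)
    if cc == primary_tag(original_language):
        # Original-language caption: keep it unless one already exists.
        return cc not in existing
    # Any other language: keep only if it would not duplicate a subtitle.
    return cc not in existing
```

With the example from this issue, `should_run_ccextractor("en", ["de", "fr", "it", "es"])` is true, and a German closed caption would then be rejected as a duplicate while an English one would be kept.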
Another option would be to detect a subtitle's language by analyzing its text content. If we can do that, we could simply check whether we still need a subtitle in the detected language and, if so, keep it.
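As a rough illustration of detecting language from text content, the sketch below counts common function words per language and picks the clear winner. The word lists, the `detect_language` name, and the `min_hits` threshold are all assumptions made for this example; it is a toy heuristic, not a proposal for the actual detection method.

```python
import re
from typing import Dict, FrozenSet, Optional

# Tiny, hand-picked lists of common words. Purely illustrative.
STOPWORDS: Dict[str, FrozenSet[str]] = {
    "en": frozenset({"the", "and", "is", "you", "that", "it", "of", "to", "what", "this"}),
    "de": frozenset({"der", "die", "das", "und", "ist", "nicht", "ich", "sie", "ein", "was"}),
    "fr": frozenset({"le", "les", "et", "est", "que", "je", "vous", "pas", "une", "des"}),
    "it": frozenset({"il", "che", "non", "sono", "per", "una", "di", "gli", "cosa", "questo"}),
    "es": frozenset({"el", "los", "que", "es", "una", "por", "las", "pero", "como", "muy"}),
}


def detect_language(text: str, min_hits: int = 3) -> Optional[str]:
    """Guess the language of subtitle text, or return None if unsure."""
    # Letters only, so cue numbers and timestamps are ignored.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = {
        lang: sum(1 for word in words if word in stopwords)
        for lang, stopwords in STOPWORDS.items()
    }
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    (best, best_hits), (_, runner_up_hits) = ranked[0], ranked[1]
    # Refuse to guess on too little evidence or on a tie.
    if best_hits < min_hits or best_hits == runner_up_hits:
        return None
    return best
```

Returning `None` for short or ambiguous text keeps the caller from mislabeling a caption. A real implementation would likely rely on a dedicated language-identification library rather than a hand-written word list.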
rlaphoenix changed the title from "CCExtractor should run if there is no subtitles in the title's Original Language" to "CCExtractor and detecting Closed Caption language" on Feb 6, 2023