Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subtitle Filtering 2022 #18

Open
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

softworkz
Copy link
Collaborator

@softworkz softworkz commented Jan 13, 2022

Subtitle Filtering 2022

This is a substantial update to the earlier subtitle filtering patch series.
A primary goal has been to address others' concerns as much as possible on one side and to provide more clarity and control over the way things are working. Clarity is is specifically important to allow for a better understanding of the need for a subtitle start pts value that can be different from the frame's pts value. This is done by refactoring the subtitle timing fields in AVFrame, adding a frame field to indicate repeated subtitle frames, and finally the full removal of the heartbeat functionality, replaced by a new 'subfeed' filter that provides different modes for arbitrating subtitle frames in a filter graph. Finally, each subtitle filter's documentation has been amended by a section describing the filter's timeline behavior (in v3 update).

Subtitle Filtering Demos

I published a demonstration of subtitle filtering capabilities with OCR, text and bitmap subtitle manipulation involved:
Demo 1: Text-Manipulation with Bitmap Subtitles

v9

  • Cleared the cc recipient list
  • Resending after e-mail error

v8

  • Rebased and adapted to upstream changes

v7

  • Revert loglevel change
  • snull: propagate format
  • fftools/ffmpeg: ensure monotonic (frame-)pts values for subtitles

v6 - Fix assertion errors

  • text2graphicsub: fix null point on uninit after error
  • strim: propagate width and height
  • avfilter: add default propagation time_base from inlink to outlink

v5 - Conversion to Graphic Subtitles, and other enhancements

  • I'm glad to announce that Traian (@tcoza) has joined the project and contributed a new 'text2graphicsub' filter to convert text subtitles to graphic subtitles, which can in turn be encoded as dvd, dvb or x-subs (and any other encoder for graphic subs that might be added in the future).
    This filter closes the last open "gap" in subtitle processing.
  • stripstyles filter: now allows very fine-grained control over which ASS style codes should be preserved or stripped
  • stripstyles: do not drop dialog margin values
  • subfeed filter: eliminates duplicate frames with duplicate start times when 'fix_overlap' is specified
  • textmod: do not drop effect values
  • graphicsub2text: reduce font size jitter
  • ass_split: add function to selectively preserve elements when splitting
  • add strim, snull and ssink and further unify subtitle frame handling with audio and video
  • ffmpeg_filter: get simple filter notation working for subtitles

v4 - Quality Improvements

  • finally an updated version
  • includes many improvements from internal testing
  • all FATE tests passed
  • all example commands from the docs verified to work
  • can't list all the detail changes..
  • I have left out the extra commits which can be handled separately, just in case somebody wonders why these are missing:
    • avcodec/webvttenc: Don't encode drawing codes and empty lines
    • avcodec/webvttenc: convert hard-space tags to  
    • avutil/ass_split: Add parsing of hard-space tags (\h)
    • avutil/ass_split: Treat all content in curly braces as hidden
    • avutil/ass_split: Fix ass parsing of style codes with comments

v3 - Rebase

due to merge conflicts - apologies.

Changes in v2

  • added .gitattributes file to enforce binary diffs for the test refs that cannot be applied when being sent via e-mail
  • perform filter graph re-init due to subtitle "frame size" change only when the size was unknown before and not set via -canvas_size
  • overlaytextsubs: Make sure to request frames on the subtitle input
  • avfilter/splitcc: Start parsing cc data on key frames only
  • avcodec/webvttenc: Don't encode ass drawing codes and empty lines
  • stripstyles: fix mem leak
  • gs2t: improve color detection
  • gs2t: empty frames must not be skipped
  • subfeed: fix name
  • textmod: preserve margins
  • added .gitattributes file to enforce binary diffs for the test refs that cannot be applied when being sent via e-mail
  • perform filter graph re-init due to subtitle "frame size" change only when the size was unknown before and not set via -canvas_size
  • avcodec/dvbsubdec: Fix conditions for fallback to default resolution
  • Made changes suggested by Andreas
  • Fixed failing command line reported by Michael

Changes from previous version v24:

AVFrame

  • Removed sub_start_time
    The start time is now added to the subtitle start_pts during decoding
    The sub_end_time field is adjusted accordingly
  • Renamed sub_end_time to duration
    which it is effectively after removing the start_time
  • Added a sub-struct 'subtitle_timing' to av frame
    Contains subtitle_pts renamed to 'subtitle_timing.start_pts'
    and 'subtitle_timing.duration'
  • Change both fields to (fixed) time_base AV_TIMEBASE
  • add repeat_sub field
    provides a clear indication whether a subtitle frame is an actual
    subtitle event or a repeated subtitle frame in a filter graph

Heartbeat Removal

  • completely removed the earlier heartbeat implementation
  • filtering arbitration is now implemented in a new filter: 'subfeed'
  • subfeed will be auto-inserted for compatiblity with sub2video
    command lines
  • the new behavior is not exactly identical to the earlier behavior, but it basically allows to achieve the same results
  • there's a small remainder, now named subtitle kickoff which serves to get things (in the filter graph) going right from the start

New 'subfeed' Filter

  • a versatile filter for solving all kinds of problems with subtile frame flow in filter graphs
  • Can be inserted at any position in a graph
  • Auto-inserted for sub2video command lines (in repeat-mode)
  • Allows duration fixup
    delay input frames with unknown duration and infer duration from start of subsequent frame
  • Provides multiple modes of operation:
    • repeat mode (default)
      Queues input frames
      Outputs frames at a fixed (configurable) rate
      Either sends a matching input frame (repeatedly) or empty frames otherwise
    • scatter mode
      similar to repeat mode, but splits input frames by duration into small segments with same content
    • forward mode
      No fixed output rate
      Useful in combination with duration fixup or overlap fixup

ffmpeg Tool Changes

  • delay subtitle output stream initialization (like for audio and video)
    This is needed for example when a format header depends on having received an initial frame to derive certain header values from
  • decoding: set subtitle frame size from decoding context
  • re-init graph when subtitle size changes
  • always insert subscale filter for sub2video command lines
    (to ensure correct scaling)

Subtitle Encoding

  • ignore repeated frames for encoding
    based on repeat_sub field in AVFrame
  • support multi-area encoding for text subtitles
    Subtitle OCR can create multiple areas at different positions. Previously, the texts were always squashed into a single area ('subtitle rect'), which was not ideal.
    Multiple text areas are now generally supported:
    • ASS Encoder
      Changed to use the 'receive_packet' encoding API
      A single frame with multiple text areas will create multiple packets now
    • All other text subtitle encoders
      A newline is inserted between the text from multiple areas

graphicsub2text (OCR)

  • enhanced preprocessing
    • using elbg algorithm for color quantization
    • detection and removal of text outlines
    • map-based identification of colors per word (text, outline, background)
  • add option for duration fixup
  • add option to dump preprocessing bitmaps
  • Recognize formatting and apply as ASS inline styles
    • per word(!)
    • paragraph alignment
    • positioning
    • font names
    • font size
    • font style (italic, underline, bold)
    • text color, outline color

Other Filter Changes

  • all: Make sure to forward all link properties (time base, frame rate, w, h) where appropriate
  • overlaytextsubs: request frames on the subtitle input
  • overlaytextsubs: disable read-order checking
  • overlaytextsubs: improve implementation of render_latest_only
  • overlaytextsubs: ensure equal in/out video formats
  • splitcc: derive framerate from realtime_latency
  • graphicsub2video: implement caching of converted frames
  • graphicsub2video: use 1x1 output frame size as long as subtitle size is unknown (0x0)

Plus a dozen of things I forgot..

cc: Hendrik Leppkes [email protected]
cc: Soft Works [email protected]
cc: Michael Niedermayer [email protected]

@softworkz
Copy link
Collaborator Author

/preview

@softworkz softworkz force-pushed the submit_subfiltering branch from b9ab108 to 25cb9c8 Compare January 14, 2022 00:33
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@softworkz softworkz force-pushed the submit_subfiltering branch from 25cb9c8 to a764c14 Compare January 14, 2022 00:49
@ffmpeg-codebot
Copy link

There is an issue in commit 8e1a90b:
First line of commit message is too long (> 100 columns): avcodec/subtitles: Migrate subtitle encoders to frame-based API and provide a compatibility shim for the legacy api

@softworkz softworkz force-pushed the submit_subfiltering branch from a764c14 to c001133 Compare January 14, 2022 00:53
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@ffstaging ffstaging deleted a comment from ffmpeg-codebot bot Jan 14, 2022
@softworkz
Copy link
Collaborator Author

/submit

@ffmpeg-codebot
Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/ffstaging/FFmpeg pr-ffstaging-18/softworkz/submit_subfiltering-v1

To fetch this version to local tag pr-ffstaging-18/softworkz/submit_subfiltering-v1:

git fetch --no-tags https://github.com/ffstaging/FFmpeg tag pr-ffstaging-18/softworkz/submit_subfiltering-v1

fftools/ffmpeg.c Outdated Show resolved Hide resolved
Added a text2graphicsub subtitle filter which converts text-based
subtitle tracks to bitmap-based subtitle tracks. The filter uses libass
to render the subtitles.
It takes as parameters an output height and width, as well as a number
of colors in the output palette as well as sources of fonts. All its
arguments are optional.

Reviewed-by: softworkz <[email protected]>
Signed-off-by: softworkz <[email protected]>
Signed-off-by: tcoza <[email protected]>
and provide a compatibility shim for the legacy api

Signed-off-by: softworkz <[email protected]>
…itle encoding

This commit actually enables subtitle filtering in ffmpeg by
sending and receiving subtitle frames to and from a filtergraph.

The heartbeat functionality from the previous sub2video implementation
is removed and now provided by the 'subfeed' filter.
The other part of sub2video functionality is retained by
auto-insertion of the new graphicsub2video filter.

Justification for changed test refs:

- sub2video
  The previous results were incorrect. The command line for this test
  specifies -r 5 (5 frames per second), which is now fulfilled by the
  additional frames in the reference output.
  Example: The first subtitle time is 499000, the second is 15355000,
  which means 0.5s and 15.35s with a difference of 14.85s.
  15s * 5fps = 75 frames and that's now the exact number of video
  frames between these two subtitle events.

- sub2video_basic
  The previous results had some incorrect output because multiple
  frames had the same dts
  The non-empty content frames are visually identical, the different
  CRC is due to the different blending algorithm that is being used.

- sub2video_time_limited
  Subtitle frames are emitted to the filter graphs at a 5 fps rate
  by default. The time limit for this test is 15s * 5fps = 75 frames
  which matches the count in the new ref.

- sub-dvb
  Running ffprobe -show_frames on the source file shows that there
  are 7 subtitle frames with 0 rects in the source at the start
  and 2 at the end. This translates to the 14 and 4 additional
  entries in the new test results.

- filter-overlay-dvdsub-2397
  Overlay results have slightly different CRCs due to different
  blending implementation

- sub-scc
  The first entry is no longer in the output because it is before
  the actual start time and the strim filter removes such entries
  now (like for video and audio)

Signed-off-by: softworkz <[email protected]>
@softworkz softworkz force-pushed the submit_subfiltering branch from fa0d5eb to c5487d7 Compare October 24, 2022 06:05
@softworkz
Copy link
Collaborator Author

/submit

@ffmpeg-codebot
Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/ffstaging/FFmpeg pr-ffstaging-18/softworkz/submit_subfiltering-v8

To fetch this version to local tag pr-ffstaging-18/softworkz/submit_subfiltering-v8:

git fetch --no-tags https://github.com/ffstaging/FFmpeg tag pr-ffstaging-18/softworkz/submit_subfiltering-v8

The previous code expected a segment of type CLUT definition to exist
in order to accept a set of segments to be complete.
This was an incorrect assumption as the presence of a CLUT segment
is not mandatory.
(version 1.6.1 of the spec is probably a bit more clear about this
than earlier versions: https://www.etsi.org/deliver/etsi_en/
300700_300799/300743/01.06.01_20/en_300743v010601a.pdf)

The flawed condition prevented proper fallback to using the default
resolution for the decoding context.

Signed-off-by: softworkz <[email protected]>
@softworkz softworkz force-pushed the submit_subfiltering branch from c5487d7 to bb11798 Compare October 25, 2022 09:09
@softworkz
Copy link
Collaborator Author

/submit

@ffmpeg-codebot
Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/ffstaging/FFmpeg pr-ffstaging-18/softworkz/submit_subfiltering-v9

To fetch this version to local tag pr-ffstaging-18/softworkz/submit_subfiltering-v9:

git fetch --no-tags https://github.com/ffstaging/FFmpeg tag pr-ffstaging-18/softworkz/submit_subfiltering-v9

@@ -77,6 +77,7 @@ HEADERS = adler32.h \
sha512.h \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the FFmpeg mailing list, Hendrik Leppkes wrote (reply to this):

On Tue, Oct 25, 2022 at 11:14 AM softworkz <[email protected]> wrote:
>
> @@ -712,6 +712,53 @@ typedef struct AVFrame {
>       * Duration of the frame, in the same units as pts. 0 if unknown.
>       */
>      int64_t duration;
> +
> +    /**
> +     * Media type of the frame (audio, video, subtitles..)
> +     *
> +     * See AVMEDIA_TYPE_xxx
> +     */
> +    enum AVMediaType type;
> +
> +    /**
> +     * Number of items in the @ref subtitle_areas array.
> +     */
> +    unsigned num_subtitle_areas;
> +
> +    /**
> +     * Array of subtitle areas, may be empty.
> +     */
> +    AVSubtitleArea **subtitle_areas;
> +
> +    /**
> +     * Header containing style information for text subtitles.
> +     */
> +    AVBufferRef *subtitle_header;
> +
> +    /**
> +     * Indicates that a subtitle frame is a repeated frame for arbitrating flow
> +     * in a filter graph.
> +     * The field subtitle_timing.start_pts always indicates the original presentation
> +     * time, while the frame's pts field may be different.
> +     */
> +    int repeat_sub;
> +
> +    struct SubtitleTiming
> +    {
> +        /**
> +         * The display start time, in AV_TIME_BASE.
> +         *
> +         * For subtitle frames, AVFrame.pts is populated from the packet pts value,
> +         * which is not always the same as this value.
> +         */
> +        int64_t start_pts;

There is still no explanation here why they are not the same, why they
could not just be the same, and which field a user should look at.
The cover letter talks about clarity why this is needed is important,
but then provides none of that.

"Its required" is not an argument. So please enlighten us once again
why we absolutely need two timestamps for subtitles, instead of just
one. As far as I can see, subtitle frames only have one relevant time
- when its supposed to be shown on the screen. What would the other
time ever be good for to a user?
Similarly for the duration, of course. I can even see the
AVFrame.duration field in this patch snippet just above the additions
that would seem to fully replace this one.

- Hendrik



> +
> +        /**
> +         * Display duration, in AV_TIME_BASE.
> +         */
> +        int64_t duration;
> +
> +    } subtitle_timing;
>  } AVFrame;
>
>
> @@ -788,6 +835,8 @@ void av_frame_move_ref(AVFrame *dst, AVFrame *src);
>  /**
>   * Allocate new buffer(s) for audio or video data.
>   *
> + * Note: For subtitle data, use av_frame_get_buffer2
> + *
>   * The following fields must be set on frame before calling this function:
>   * - format (pixel format for video, sample format for audio)
>   * - width and height for video
> @@ -807,9 +856,39 @@ void av_frame_move_ref(AVFrame *dst, AVFrame *src);
>   *              recommended to pass 0 here unless you know what you are doing.
>   *
>   * @return 0 on success, a negative AVERROR on error.
> + *
> + * @deprecated Use @ref av_frame_get_buffer2 instead and set @ref AVFrame.type
> + * before calling.
>   */
> +attribute_deprecated
>  int av_frame_get_buffer(AVFrame *frame, int align);
>
> +/**
> + * Allocate new buffer(s) for audio, video or subtitle data.
> + *
> + * The following fields must be set on frame before calling this function:
> + * - format (pixel format for video, sample format for audio)
> + * - width and height for video
> + * - nb_samples and channel_layout for audio
> + * - type (AVMediaType)
> + *
> + * This function will fill AVFrame.data and AVFrame.buf arrays and, if
> + * necessary, allocate and fill AVFrame.extended_data and AVFrame.extended_buf.
> + * For planar formats, one buffer will be allocated for each plane.
> + *
> + * @warning: if frame already has been allocated, calling this function will
> + *           leak memory. In addition, undefined behavior can occur in certain
> + *           cases.
> + *
> + * @param frame frame in which to store the new buffers.
> + * @param align Required buffer size alignment. If equal to 0, alignment will be
> + *              chosen automatically for the current CPU. It is highly
> + *              recommended to pass 0 here unless you know what you are doing.
> + *
> + * @return 0 on success, a negative AVERROR on error.
> + */
> +int av_frame_get_buffer2(AVFrame *frame, int align);
> +
>  /**
>   * Check if the frame data is writable.
>   *
> diff --git a/libavutil/subfmt.c b/libavutil/subfmt.c
> new file mode 100644
> index 0000000000..c72ebe2a43
> --- /dev/null
> +++ b/libavutil/subfmt.c
> @@ -0,0 +1,45 @@
> +/*
> + * Copyright (c) 2021 softworkz
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include <string.h>
> +#include "common.h"
> +#include "subfmt.h"
> +
> +static const char sub_fmt_info[AV_SUBTITLE_FMT_NB][24] = {
> +    [AV_SUBTITLE_FMT_UNKNOWN] = "Unknown subtitle format",
> +    [AV_SUBTITLE_FMT_BITMAP]  = "Graphical subtitles",
> +    [AV_SUBTITLE_FMT_TEXT]    = "Text subtitles (plain)",
> +    [AV_SUBTITLE_FMT_ASS]     = "Text subtitles (ass)",
> +};
> +
> +const char *av_get_subtitle_fmt_name(enum AVSubtitleType sub_fmt)
> +{
> +    if (sub_fmt < 0 || sub_fmt >= AV_SUBTITLE_FMT_NB)
> +        return NULL;
> +    return sub_fmt_info[sub_fmt];
> +}
> +
> +enum AVSubtitleType av_get_subtitle_fmt(const char *name)
> +{
> +    for (int i = 0; i < AV_SUBTITLE_FMT_NB; i++)
> +        if (!strcmp(sub_fmt_info[i], name))
> +            return i;
> +    return AV_SUBTITLE_FMT_NONE;
> +}
> diff --git a/libavutil/subfmt.h b/libavutil/subfmt.h
> index 791b45519f..3b8a0613b2 100644
> --- a/libavutil/subfmt.h
> +++ b/libavutil/subfmt.h
> @@ -21,6 +21,9 @@
>  #ifndef AVUTIL_SUBFMT_H
>  #define AVUTIL_SUBFMT_H
>
> +#include <stdint.h>
> +
> +#include "buffer.h"
>  #include "version.h"
>
>  enum AVSubtitleType {
> @@ -65,4 +68,48 @@ enum AVSubtitleType {
>      AV_SUBTITLE_FMT_NB,         ///< number of subtitle formats, DO NOT USE THIS if you want to link with shared libav* because the number of formats might differ between versions.
>  };
>
> +typedef struct AVSubtitleArea {
> +#define AV_NUM_BUFFER_POINTERS 1
> +
> +    enum AVSubtitleType type;
> +    int flags;
> +
> +    int x;         ///< top left corner  of area.
> +    int y;         ///< top left corner  of area.
> +    int w;         ///< width            of area.
> +    int h;         ///< height           of area.
> +    int nb_colors; ///< number of colors in bitmap palette (@ref pal).
> +
> +    /**
> +     * Buffers and line sizes for the bitmap of this subtitle.
> +     *
> +     * @{
> +     */
> +    AVBufferRef *buf[AV_NUM_BUFFER_POINTERS];
> +    int linesize[AV_NUM_BUFFER_POINTERS];
> +    /**
> +     * @}
> +     */
> +
> +    uint32_t pal[256]; ///< RGBA palette for the bitmap.
> +
> +    char *text;        ///< 0-terminated plain UTF-8 text
> +    char *ass;         ///< 0-terminated ASS/SSA compatible event line.
> +
> +} AVSubtitleArea;
> +
> +/**
> + * Return the name of sub_fmt, or NULL if sub_fmt is not
> + * recognized.
> + */
> +const char *av_get_subtitle_fmt_name(enum AVSubtitleType sub_fmt);
> +
> +/**
> + * Return a subtitle format corresponding to name, or AV_SUBTITLE_FMT_NONE
> + * on error.
> + *
> + * @param name Subtitle format name.
> + */
> +enum AVSubtitleType av_get_subtitle_fmt(const char *name);
> +
>  #endif /* AVUTIL_SUBFMT_H */
> --
> ffmpeg-codebot
>
> _______________________________________________
> ffmpeg-devel mailing list
> [email protected]
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> [email protected] with subject "unsubscribe".
_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the FFmpeg mailing list, Soft Works wrote (reply to this):



> -----Original Message-----
> From: ffmpeg-devel <[email protected]> On Behalf Of
> Hendrik Leppkes
> Sent: Tuesday, October 25, 2022 11:38 AM
> To: FFmpeg development discussions and patches <ffmpeg-
> [email protected]>
> Subject: Re: [FFmpeg-devel] [PATCH v9 02/25] avutil/frame: Prepare
> AVFrame for subtitle handling
> 
> On Tue, Oct 25, 2022 at 11:14 AM softworkz <[email protected]>
> wrote:
> >
> > @@ -712,6 +712,53 @@ typedef struct AVFrame {
> >       * Duration of the frame, in the same units as pts. 0 if
> unknown.
> >       */
> >      int64_t duration;
> > +
> > +    /**
> > +     * Media type of the frame (audio, video, subtitles..)
> > +     *
> > +     * See AVMEDIA_TYPE_xxx
> > +     */
> > +    enum AVMediaType type;
> > +
> > +    /**
> > +     * Number of items in the @ref subtitle_areas array.
> > +     */
> > +    unsigned num_subtitle_areas;
> > +
> > +    /**
> > +     * Array of subtitle areas, may be empty.
> > +     */
> > +    AVSubtitleArea **subtitle_areas;
> > +
> > +    /**
> > +     * Header containing style information for text subtitles.
> > +     */
> > +    AVBufferRef *subtitle_header;
> > +
> > +    /**
> > +     * Indicates that a subtitle frame is a repeated frame for
> arbitrating flow
> > +     * in a filter graph.
> > +     * The field subtitle_timing.start_pts always indicates the
> original presentation
> > +     * time, while the frame's pts field may be different.
> > +     */
> > +    int repeat_sub;
> > +
> > +    struct SubtitleTiming
> > +    {
> > +        /**
> > +         * The display start time, in AV_TIME_BASE.
> > +         *
> > +         * For subtitle frames, AVFrame.pts is populated from the
> packet pts value,
> > +         * which is not always the same as this value.
> > +         */
> > +        int64_t start_pts;
> 
> There is still no explanation here why they are not the same, why
> they
> could not just be the same, and which field a user should look at.
> The cover letter talks about clarity why this is needed is important,
> but then provides none of that.
> 
> "Its required" is not an argument. So please enlighten us once again
> why we absolutely need two timestamps for subtitles, instead of just
> one. As far as I can see, subtitle frames only have one relevant time
> - when its supposed to be shown on the screen. What would the other
> time ever be good for to a user?
> Similarly for the duration, of course. I can even see the
> AVFrame.duration field in this patch snippet just above the additions
> that would seem to fully replace this one.
> 
> - Hendrik

Hi Hendrik,

thanks a lot for your reply.

Probably I should have better advertised the article I had written 
specifically to explain the background of this:

https://github.com/softworkz/SubtitleFilteringDemos/issues/1

I hope it's understandable - please let me know when you have 
questions.


Thanks again,
softworkz
_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the FFmpeg mailing list, Soft Works wrote (reply to this):



> -----Original Message-----
> From: ffmpeg-devel <[email protected]> On Behalf Of
> Soft Works
> Sent: Tuesday, October 25, 2022 11:59 AM
> To: FFmpeg development discussions and patches <ffmpeg-
> [email protected]>
> Subject: Re: [FFmpeg-devel] [PATCH v9 02/25] avutil/frame: Prepare
> AVFrame for subtitle handling
> 
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <[email protected]> On Behalf Of
> > Hendrik Leppkes
> > Sent: Tuesday, October 25, 2022 11:38 AM
> > To: FFmpeg development discussions and patches <ffmpeg-
> > [email protected]>
> > Subject: Re: [FFmpeg-devel] [PATCH v9 02/25] avutil/frame: Prepare
> > AVFrame for subtitle handling
> >
> > On Tue, Oct 25, 2022 at 11:14 AM softworkz <[email protected]>
> > wrote:
> > >
> > > @@ -712,6 +712,53 @@ typedef struct AVFrame {
> > >       * Duration of the frame, in the same units as pts. 0 if
> > unknown.
> > >       */
> > >      int64_t duration;
> > > +
> > > +    /**
> > > +     * Media type of the frame (audio, video, subtitles..)
> > > +     *
> > > +     * See AVMEDIA_TYPE_xxx
> > > +     */
> > > +    enum AVMediaType type;
> > > +
> > > +    /**
> > > +     * Number of items in the @ref subtitle_areas array.
> > > +     */
> > > +    unsigned num_subtitle_areas;
> > > +
> > > +    /**
> > > +     * Array of subtitle areas, may be empty.
> > > +     */
> > > +    AVSubtitleArea **subtitle_areas;
> > > +
> > > +    /**
> > > +     * Header containing style information for text subtitles.
> > > +     */
> > > +    AVBufferRef *subtitle_header;
> > > +
> > > +    /**
> > > +     * Indicates that a subtitle frame is a repeated frame for
> > arbitrating flow
> > > +     * in a filter graph.
> > > +     * The field subtitle_timing.start_pts always indicates the
> > original presentation
> > > +     * time, while the frame's pts field may be different.
> > > +     */
> > > +    int repeat_sub;
> > > +
> > > +    struct SubtitleTiming
> > > +    {
> > > +        /**
> > > +         * The display start time, in AV_TIME_BASE.
> > > +         *
> > > +         * For subtitle frames, AVFrame.pts is populated from
> the
> > packet pts value,
> > > +         * which is not always the same as this value.
> > > +         */
> > > +        int64_t start_pts;
> >
> > There is still no explanation here why they are not the same, why
> > they
> > could not just be the same, and which field a user should look at.
> > The cover letter talks about clarity why this is needed is
> important,
> > but then provides none of that.
> >
> > "Its required" is not an argument. So please enlighten us once
> again
> > why we absolutely need two timestamps for subtitles, instead of
> just
> > one. As far as I can see, subtitle frames only have one relevant
> time
> > - when its supposed to be shown on the screen. What would the other
> > time ever be good for to a user?
> > Similarly for the duration, of course. I can even see the
> > AVFrame.duration field in this patch snippet just above the
> additions
> > that would seem to fully replace this one.
> >
> > - Hendrik
> 
> Hi Hendrik,
> 
> thanks a lot for your reply.
> 
> Probably I should have better advertised the article I had written
> specifically to explain the background of this:
> 
> https://github.com/softworkz/SubtitleFilteringDemos/issues/1

@Hendrik - did that answer your question?

You are also welcome to contact me directly for discussing details
or going through specific examples; or just to ping me for logging
on to IRC for chat.

softworkz

_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".

@ffmpeg-codebot
Copy link

User Hendrik Leppkes <[email protected]> has been added to the cc: list.

@ffmpeg-codebot
Copy link

User Soft Works <[email protected]> has been added to the cc: list.

@@ -24,68 +24,193 @@
#include <string.h>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the FFmpeg mailing list, Michael Niedermayer wrote (reply to this):


--===============1175881865082314697==
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="l0l+eSofNeLXHSnY"
Content-Disposition: inline


--l0l+eSofNeLXHSnY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Oct 25, 2022 at 09:13:44AM +0000, softworkz wrote:
> From: softworkz <[email protected]>
>=20
> and provide a compatibility shim for the legacy api
>=20
> Signed-off-by: softworkz <[email protected]>
> ---
>  libavcodec/assenc.c         | 189 ++++++++++++++++++++++++++++++------
>  libavcodec/avcodec.h        |   5 +-
>  libavcodec/codec_internal.h |  12 ---
>  libavcodec/dvbsubenc.c      |  96 ++++++++++--------
>  libavcodec/dvdsubenc.c      | 103 ++++++++++++--------
>  libavcodec/encode.c         |  61 +++++++++++-
>  libavcodec/movtextenc.c     | 114 ++++++++++++++++------
>  libavcodec/srtenc.c         | 108 ++++++++++++++-------
>  libavcodec/tests/avcodec.c  |   5 +-
>  libavcodec/ttmlenc.c        | 101 ++++++++++++++-----
>  libavcodec/utils.c          |   1 -
>  libavcodec/webvttenc.c      |  86 +++++++++++-----
>  libavcodec/xsubenc.c        |  88 ++++++++++-------
>  13 files changed, 689 insertions(+), 280 deletions(-)

Causes this testcase to fail:
=2E/ffmpeg -i 'bgc.sub.dub.ogm'  -vframes 3 -bitexact -y nosubs.webm

https://samples.ffmpeg.org/ogg/

I did not investgate why or if this a bug or expected. Just reporting a dif=
ference
ive seen

thx

[...]

--=20
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No human being will ever know the Truth, for even if they happen to say it
by chance, they would not even known they had done so. -- Xenophanes

--l0l+eSofNeLXHSnY
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iF0EABEIAB0WIQSf8hKLFH72cwut8TNhHseHBAsPqwUCY1rFogAKCRBhHseHBAsP
q0GUAJ4ruktwZAPMK9T2QMRBO1ikUnVDZQCffgDCPU3M8x+JM5bkMUhFzyzBjRU=
=Sw6W
-----END PGP SIGNATURE-----

--l0l+eSofNeLXHSnY--

--===============1175881865082314697==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".

--===============1175881865082314697==--

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the FFmpeg mailing list, Soft Works wrote (reply to this):



> -----Original Message-----
> From: ffmpeg-devel <[email protected]> On Behalf Of
> Michael Niedermayer
> Sent: Thursday, October 27, 2022 7:54 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> [email protected]>
> Subject: Re: [FFmpeg-devel] [PATCH v9 23/25] avcodec/subtitles:
> Migrate subtitle encoders to frame-based API
> 
> On Tue, Oct 25, 2022 at 09:13:44AM +0000, softworkz wrote:
> > From: softworkz <[email protected]>
> >
> > and provide a compatibility shim for the legacy api
> >
> > Signed-off-by: softworkz <[email protected]>
> > ---
> >  libavcodec/assenc.c         | 189 ++++++++++++++++++++++++++++++--
> ----
> >  libavcodec/avcodec.h        |   5 +-
> >  libavcodec/codec_internal.h |  12 ---
> >  libavcodec/dvbsubenc.c      |  96 ++++++++++--------
> >  libavcodec/dvdsubenc.c      | 103 ++++++++++++--------
> >  libavcodec/encode.c         |  61 +++++++++++-
> >  libavcodec/movtextenc.c     | 114 ++++++++++++++++------
> >  libavcodec/srtenc.c         | 108 ++++++++++++++-------
> >  libavcodec/tests/avcodec.c  |   5 +-
> >  libavcodec/ttmlenc.c        | 101 ++++++++++++++-----
> >  libavcodec/utils.c          |   1 -
> >  libavcodec/webvttenc.c      |  86 +++++++++++-----
> >  libavcodec/xsubenc.c        |  88 ++++++++++-------
> >  13 files changed, 689 insertions(+), 280 deletions(-)
> 
> Causes this testcase to fail:
> ./ffmpeg -i 'bgc.sub.dub.ogm'  -vframes 3 -bitexact -y nosubs.webm
> 
> https://samples.ffmpeg.org/ogg/
> 
> I did not investgate why or if this a bug or expected. Just reporting
> a difference
> ive seen

Hi Michael,

thanks a lot for testing my patchset! 

I have run the command with and without my patchset and I can confirm
that the output is different.

Though, from analyzing the output files, it appears that the output
with my patchset applied seems "more correct" than the output that ffmpeg
is currently creating.

The details of the investigation can be found here:
https://github.com/softworkz/SubtitleFilteringDemos/issues/2


Thanks again,
softworkz


_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".

@ffmpeg-codebot
Copy link

User Michael Niedermayer <[email protected]> has been added to the cc: list.

@ffmpeg-codebot
Copy link

ffmpeg-codebot bot commented Nov 6, 2022

On the FFmpeg mailing list, Soft Works wrote (reply to this):



> -----Original Message-----
> From: ffmpegagent <[email protected]>
> Sent: Tuesday, October 25, 2022 11:13 AM
> To: [email protected]
> Cc: softworkz <[email protected]>
> Subject: [PATCH v9 00/25] Subtitle Filtering 2022
> 
> 
> Subtitle Filtering 2022
> =======================
> 
> This is a substantial update to the earlier subtitle filtering patch
> series.
> A primary goal has been to address others' concerns as much as
> possible on
> one side and to provide more clarity and control over the way things
> are
> working. Clarity is is specifically important to allow for a better
> understanding of the need for a subtitle start pts value that can be
> different from the frame's pts value. This is done by refactoring the
> subtitle timing fields in AVFrame, adding a frame field to indicate
> repeated
> subtitle frames, and finally the full removal of the heartbeat
> functionality, replaced by a new 'subfeed' filter that provides
> different
> modes for arbitrating subtitle frames in a filter graph. Finally,
> each
> subtitle filter's documentation has been amended by a section
> describing the
> filter's timeline behavior (in v3 update).

Another PING.

I'm not aware of any unanswered questions or pending objections.

Thanks,
softworkz
_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".

@ctrlcctrlv
Copy link

Please merge this

@softworkz
Copy link
Collaborator Author

Please merge this

I can't. You need to write to the ffmpeg-devel mailing list that you want it: https://lists.ffmpeg.org/mailman/listinfo/ffmpeg-devel/

@ctrlcctrlv
Copy link

Right I'm sorry I thought because I saw a bot here that it was just mirroring all of my comments to that mailing list. I will email

@softworkz
Copy link
Collaborator Author

Right I'm sorry I thought because I saw a bot here that it was just mirroring all of my comments to that mailing list. I will email

Yea, this works for the other direction - from ML to here, but not from here to the ML.

@softworkz
Copy link
Collaborator Author

positions are preserved.

@cooliobr
Copy link

hi, how can I fix this error
[Parsed_graphicsub2text_0 @ 0x5555d0f0db00] ASS Result: {\1c&H00FFFF&\fnVerdana\fs30}to set up{\fnTrebuchet MS Bold\b1} &{\b0\fnTrebuchet MS} shoot{\fnVerdana} for{\fnTrebuchet MS Bold\b1} e{\b0\fnVerdana} rock{\fnTrebuchet MS Bold\b1} band [Parsed_graphicsub2text_0 @ 0x5555d0f0db00] OCR Result: tonight. So, um... Think about it, [Parsed_graphicsub2text_0 @ 0x5555d0f0db00] ASS Result: {\1c&H00FFFF&\fnVerdana\fs27}tonight. So,{\fnArial Bold\b1} um...{\b0\fnVerdana} Think{\fnTrebuchet MS} about{\fnVerdana} it, subtitle_kickoff: resend - pts: 260168 subtitle_kickoff: call subtitle_resend_current 261068 frame->format: 1 [Parsed_graphicsub2text_0 @ 0x5555d0f0db00] filter_frame sub_pts: 2900756, duration: 10000, num_areas: 0 Subtitle encoding currently only possible from text to text or bitmap to bitmapError initializing output stream 0:0 --

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants