[prev in list] [next in list] [prev in thread] [next in thread] 

List:       xine-devel
Subject:    Re: [xine-devel] mpeg-ts enhancement
From:       Petri Hintukainen <phintuka () users ! sourceforge ! net>
Date:       2011-07-31 12:04:24
Message-ID: 1312113864.2068.226.camel () ph-laptop
[Download RAW message or body]

Torsten Jager wrote:
> Hello,
> 
> big thanks for answering :)

No, thanks for the patch. xine needs new contributors.

[...]
> Secondly, there are severe problems splitting patches when they
> affect the same file. I can manually split the diff but I dont
> know what to adjust the line numbers to. When they reflect the
> unchanged original file, you cannot auto-apply both patches.
> Things get even worse when both patches share part of their code.
> 
> Can somebody help me with that?

When one patch depends on another or two patches touch (almost) the same
lines of code you can create incremental patches (-> patches must be
applied in some specific order over each other). Previous versions
should make it quite easy to split the patch (diff first version against
second version, second against third, ...).
If changes are in separate diff chunks, it is usually easiest to remove
unrelated chunks by hand. Don't worry about line numbers. Editing chunks
by hand is also possible but requires quite much work so some other
method might be easier ...

> Now for the mpeg-ts demuxer.
> 
> You are completely right concerning those preview buffers.
> That code is seriously broken. It cannot handle frames >8kb
> (that is, spanning multiple fifo buffers) correctly. Hence
> the HDTV trouble, where large frames are common.
> Right again, bogus frame rejection is already there.
> I removed my incomplete frame stuff again, being unnecessary.
> I kept the part that turned out to be the actual fix: flagging
> first buffer of every frame as BUF_FLAG_FRAME_START, causing
> decoder to discard previous stuff that came without
> BUF_FLAG_FRAME_END.
> 
> Instead, I rewrote preview buffer handling. I did not want
> to remove it completely because someone must have put it in
> for a reason. My method now is sending first 5 frames wholly
> as preview, then rewind input if supported. 

Definitely better than the current situation. Just two problems:
- You removed BUF_FLAG_HEADER. It is still needed.
- Seeking to 0 breaks starting playback from any other position

Following two problems still exist, but I don't expect those to be fixed
in the same patch:
1) preview/header buffers are more or less randomly selected (not first
packets of each stream, but first packets of the whole program). It
should probably be per stream, so that first frame of each stream is
sent as header, next ones as preview (?). This makes seeking back
difficult: first packets of some streams will be far from the beginning,
and there can be non-existing streams in the PMT.
2) It does not reflect to changes in the stream. I believe we should
invalidate old headers and send new headers when PMT changes (ex. audio
tracks / codec can change ; DVB channel can be switched ...).
But there's huge problem with header buffers: those are stored in
audio_decoder until the stream is closed. So we would eventually run out
of audio buffers and freeze the engine.
 
> That still image stuff
> should now work at least for seekable input.

Unfortunately not. BluRay still image clips typically mask seek
operations. Even without that you would need to know the position where
to seek to (there can be multiple still images in single clip).


Would it make any sense to send a copy of the first fragment of first
demuxed frame as PREVIEW or HEADER ? That should contain all the header
information decoder might need, while preserving stream integrity. I
don't believe full video frames are needed as preview.


First task in fixing PREVIEW/HEADER stuff is investigating the decoders.
We really need to know what decoders need and why. Anything else is just
guessing ... Maybe the required information could be taken directly from
the stream at decoder side like I implemented it for ffmpeg VC-1 codec ?
There are not too many decoders to be checked: currently only 6 audio
and 4 video codecs are supported inside mpeg-ts.

Video:
Sending first frame(s) as PREVIEW is typically not enough when using
random access start. Ex. for VC-1 or MPEG2 one needs sequence header.
libmpeg2 wrapper expects to find sequence header from preview data. It
does not use header buffers ; actually it decodes header buffers as
normal data ?
ffmpeg wrapper is able to scan VC-1 sequence header from the stream, so
there's no need for preview. Also ffmpeg MPEG2/H.264 decoders do not
need header/preview.
--> headers are not required for video. preview is required only with
libmpeg2.

Audio/SPU:
Audio/SPU decoders use preview buffers to form track map. Preview
buffers can be used to "fixate" channel mapping of demuxer with the one
user sees. This is important if demuxer feeds only some of the streams
to decoder. If preview buffers (or first buffer of each stream) are not
sent in the same order as those appear in demuxer stream map, user will
"see" wrong track languages in the GUI. So, first audio preview buffers
should be sent when PMT is being parsed (or pids being learned). Those
would only indicate presence of the channel (BUF_AUDIO_xxx | channel_no,
no data) and allocate stream index for the stream.
Looks like ffmpeg audio decoder initializes the codec only when it
receives HEADER buffer. In most cases buffer can be empty; only codec
type is used ? It should be easy to fix to behave as
ffmpeg_video_decoder.c

Maybe sending HEADER buffers could be omitted after "fixing" audio
decoders ? That would allow changing audio track types on the fly
without reserving all audio fifo buffers. Another solution is to
implement resetting decoder so that all header buffers are freed.
There's no limit on how much PREVIEW buffers can be sent.


> Bitrate estimation got a little enhancement as well.
> It uses the currently selected audio (not all streams) PCR time

(there's only one PCR in program)

> or pts when PCR is missing (ffmpeg encoded files).

I still can't see where it uses currently selected audio stream ; it
just checks for m->fifo == stream->audio_fifo ; this check is true for
all audio streams.
I don't know if that matters if the algorithm can handle negative time
differences (ex. two audio streams (a, b) pts sequence goes like a: 5,
b: 1, a: 6, a: 7, b: 3, ... --> diff is -4, 5, 1, -4, ...). Small
offsets between streams are not visible in 2 minute interval.

> After 2 seconds the estimation will be used for time display,
> and after 2 minutes I stop analyzing any further.

How reliable is it when calculating 2 first minutes from movie ? If
there are lot of still scenes (maybe just some scrolling texts), bitrate
of beginning can be much smaller than in actual movie ?
Also DVB programs may be encoded with different bitrates. I was thinking
about channels that broadcasts only some hours in a day, and send some
still logo other times. If you tune to the channel 3 minutes before
program starts you'll get far too small bit rate for the program.

> You talk about blu-ray a lot. How do you get past that heavy
> encryption? My brother almost returned his new TV to the store,
> when I found out that "just" his BD player needed a new firmware
> to play newer BDs in HD.
> 
> BTW. It did put me down a bit that it actually was bitrate
> estimation what you liked the most - in my eyes the least
> important of my patches...

Well, that was clearly separate from others and relatively simple. I've
been working a lot with mpeg-ts lately ...

I'm also interested in the clock sync stuff, it is useful with other
broadcast-like streams/inputs too. But I don't know if monitoring buffer
fill levels is good solution. It should not add any extra delay, and
playback should be started as soon as possible. Buffer sizes are
configurable - with default buffers some streams are unplayable even
with 100% buffer fill. With large enough buffers you'll add huge delay
to SD/low bitrate DVB streams.
Ideal solution would be monitoring stream timestamps - maybe what comes
out from decoders - and try to maintain enough ready audio and video
frames for continuous playback, while keeping buffers as minimal as
possible...
So, start playback as soon as possible when decoded audio and video have
reached the same timestamp (+ some margin).
For clock synchronization PCR would be ideal. We could calculate PCR <->
SCR drift, do some filtering and adjust SCR speed based on the result.
Changing playback speed is not safe, you should instead adjust clock
speed.

For real/.flv stuff I can't say much - I've never even looked the code
as I have no such files ... I'm also not xine maintainer so I can safely
skip those parts :)


Most comments are more like cosmetics:
> --- xine-lib-1.1.19/src/demuxers/demux_ts.c     2010-04-09 23:08:08.000000000 +0200
> +++ xine-lib-1.1.19/src/demuxers/demux_ts.c     2011-07-20 23:13:15.000000000 +0200
> @@ -275,7 +275,11 @@ typedef struct {
> int              corrupted_pes;
> uint32_t         buffered_bytes;
> int              autodetected;
> -
> +  /* TJ. */

No need for such markers, diff itself identifies what lines are new
ones. Those just make the code harder to read in the long run.

> +  int              input_normpos;
> +  int              input_time;

I don't know if adding this information to every fragment makes any
sense; one frame can have only one position in the stream, so maybe it
is enough to add this information only to the first fragment ? That
would make caching it here unnecessary.

> +  int              preview;
> +  /* /TJ. */
> } demux_ts_media;
> 
> /* DVBSUB */
> @@ -372,6 +376,13 @@ typedef struct {
> 
> int numPreview;
> 
> +  /* TJ. total bitrate measurement */
> +  int64_t tbrm_time, tbrm_bytes;
> +  int64_t tbrm_lasttime, tbrm_lastpos;
> +  int64_t tbrm_buf_pos, tbrm_frame_pos;

Last two (buf_pos and frame_pos) can be combined

> +  int64_t tbrm_pcr;

Just "pcr" should be enough

> +  /* #/TJ. */
> +
> } demux_ts_t;
> 
> typedef struct {
> @@ -982,25 +993,12 @@ static void demux_ts_buffer_pes(demux_ts
> m->buf->decoder_info[1] = BUF_SPECIAL_SPU_DVD_SUBTYPE;
> m->buf->decoder_info[2] = SPU_DVD_SUBTYPE_PACKAGE;
> }
> -      else {
> -        if (this->numPreview<5)
> -         ++this->numPreview;
> -       if ( this->numPreview==1 )
> -         m->buf->decoder_flags=BUF_FLAG_HEADER | BUF_FLAG_FRAME_END;

BUF_FLAG_HEADER | BUF_FLAG_FRAME_END is missing from the new code

> -       else if ( this->numPreview<5 )
> -         m->buf->decoder_flags=BUF_FLAG_PREVIEW;
> -       else
> -         m->buf->decoder_flags |= BUF_FLAG_FRAME_END;
> -      }
> +      m->buf->decoder_flags |= BUF_FLAG_FRAME_END;
> +      if (m->preview) m->buf->decoder_flags |= BUF_FLAG_PREVIEW;

if (xxx) {
  ...
}
is more readable and consistent with current code.

> m->buf->pts = m->pts;
> m->buf->decoder_info[0] = 1;
> -
> -      if( this->input->get_length (this->input) )
> -        m->buf->extra_info->input_normpos = (int)( (double) \
>                 this->input->get_current_pos (this->input) *
> -                                         65535 / this->input->get_length \
>                 (this->input) );
> -      if (this->rate)
> -        m->buf->extra_info->input_time = \
>                 (int)((int64_t)this->input->get_current_pos (this->input)
> -                                         * 1000 / (this->rate * 50));
> +      m->buf->extra_info->input_normpos = m->input_normpos;
> +      m->buf->extra_info->input_time = m->input_time;
> m->fifo->put(m->fifo, m->buf);
> m->buffered_bytes = 0;
> m->buf = NULL; /* forget about buf -- not our responsibility anymore */
> @@ -1010,6 +1008,8 @@ static void demux_ts_buffer_pes(demux_ts
> }
> /* allocate the buffer here, as pes_header needs a valid buf for dvbsubs */
> m->buf = m->fifo->buffer_pool_alloc(m->fifo);
> +    /* TJ. dont let decoder crash on incomplete frames */
> +    m->buf->decoder_flags |= BUF_FLAG_FRAME_START;

This does not work if PES packets have been splitted (= one frame spans
over multiple PES packets). Well, I don't know if those exist inside
mpeg-ts ...

> if (!demux_ts_parse_pes_header(this->stream->xine, m, ts, len, this->stream)) {
> m->buf->free_buffer(m->buf);
> @@ -1030,6 +1030,52 @@ static void demux_ts_buffer_pes(demux_ts
> m->corrupted_pes = 0;
> memcpy(m->buf->mem, ts+len-m->size, m->size);
> m->buffered_bytes = m->size;
> +      /* TJ. estimate overall bitrate from selected audio stream pts (or
> +         muxing time, when available) and byte position. Skip discontinuities.
> +         After 2 seconds, start using that estimation for time display.
> +         After 2 minutes, stop analyzing and freeze bitrate */

Could this be splitted to separate function ?

> +      {
> +        double l = this->input->get_length (this->input);

"length" might be better as it is used ~10 lines later.

> +        int64_t now = this->tbrm_pcr ? this->tbrm_pcr : m->pts;
> +        if ((this->tbrm_time < 120 * 90000) && now && (this->audio_fifo == \
> m->fifo)) { +          if (this->tbrm_lasttime) {
> +            int64_t diff = now - this->tbrm_lasttime;
> +            if ((diff >= -220000) && (diff < 220000)) {
> +              this->tbrm_time += diff;
> +              this->tbrm_bytes += this->tbrm_frame_pos - this->tbrm_lastpos;
> +              if (this->tbrm_time > 2 * 90000)
> +                this->rate = this->tbrm_bytes * 90000 / this->tbrm_time;
> +            }
> +          }
> +          this->tbrm_lasttime = now;
> +          this->tbrm_lastpos = this->tbrm_frame_pos;
> +        }
> +        /* cache buffer values */
> +        if (l) m->input_normpos = (double)this->tbrm_frame_pos * 65535.0 / l;

Checking if (<double> == 0) is not valid; either do the check before
converting to double or use something like if (l <= 2*DBL_MIN && l >=
-2*DBL_MIN).
Also check if l is reasonable (> 0 ?). Many streams have no length (like
DVB, v4l, rtp, ...) and input plugin returns -1 - except for DVB where
plugin returns 0.

> +        if (this->rate) m->input_time = (int64_t)this->tbrm_frame_pos * 1000 / \
> this->rate; +        /* send first 5 frames as preview */
> +        m->preview = 0;
> +        if (this->numPreview < 5) {
> +          int i;
> +          this->numPreview++;
> +          /* after preview, rewind */
> +          if ((this->numPreview == 5) &&
> +            (this->input->get_capabilities(this->input) & INPUT_CAP_SEEKABLE)) {
> +            this->input->seek (this->input, 0, SEEK_SET);

Seeking to 0 breaks starting playback from the middle of stream
  xine_play("/path/file.ts", <start_time>);

> +            this->send_newpts = 1;
> +            for (i = 0; i < MAX_PIDS; i++) {
> +              m = &this->media[i];
> +              if (m->buf) m->buf->free_buffer (m->buf);
> +              m->buf = NULL;
> +              m->counter = INVALID_CC;
> +              m->corrupted_pes = 1;
> +              m->buffered_bytes = 0;
> +            }
> +            this->buf_flag_seek = 1;

This could be factorized to separate function, it is used in several places

> +          } else m->preview = 1;
> +        }
> +      }
> +      /* /TJ. */
> }
> 
> } else if (!m->corrupted_pes) { /* no pus -- PES packet continuation */
> @@ -1038,15 +1084,11 @@ static void demux_ts_buffer_pes(demux_ts
> m->buf->content = m->buf->mem;
> m->buf->size = m->buffered_bytes;
> m->buf->type = m->type;
> +      if (m->preview) m->buf->decoder_flags |= BUF_FLAG_PREVIEW;
> m->buf->pts = m->pts;
> m->buf->decoder_info[0] = 1;
> -      if( this->input->get_length (this->input) )
> -        m->buf->extra_info->input_normpos = (int)( (double) \
>                 this->input->get_current_pos (this->input) *
> -                                         65535 / this->input->get_length \
>                 (this->input) );
> -      if (this->rate)
> -        m->buf->extra_info->input_time = \
>                 (int)((int64_t)this->input->get_current_pos (this->input)
> -                                         * 1000 / (this->rate * 50));
> -
> +      m->buf->extra_info->input_normpos = m->input_normpos;
> +      m->buf->extra_info->input_time = m->input_time;
> m->fifo->put(m->fifo, m->buf);
> m->buffered_bytes = 0;
> m->buf = m->fifo->buffer_pool_alloc(m->fifo);
> @@ -1730,6 +1772,7 @@ static unsigned char * demux_synchronise
> 
> /* NEW: handle read returning less packets than NPKT_PER_READ... */
> do {
> +      this->tbrm_buf_pos = this->input->get_current_pos (this->input); /* TJ. */

Note that sync_detect() / sync_correct() can change current position. Of
course that has no visible difference in the result, unless the stream
starts with lot of garbage or positions are used for something else than
bitrate calculation. But you are calculating precise packet offsets
later...
Maybe you can query position after sync_detect() and subtract
this->npkt_read * this->pkt_size from it.

Current position can also be stored directly to tbrm_frame_pos:
this->tbrm_frame_pos = this->input->get_current_pos (this->input) - this->pkt_size;
...

> read_length = this->input->read(this->input, this->buf,
> this->pkt_size * NPKT_PER_READ);
> if (read_length < 0 || read_length % this->pkt_size) {
> @@ -1768,6 +1811,7 @@ static unsigned char * demux_synchronise
> }
> return_pointer = &(this->buf)[this->pkt_offset + this->pkt_size * \
> this->packet_number]; this->packet_number++;
> +  this->tbrm_frame_pos = this->tbrm_buf_pos + (return_pointer - this->buf); /* TJ. \
> */

... and increment tbrm_frame_pos by this->pkt_size here.

Correct position for m2ts packet is 4 bytes before return_pointer. m2ts
packet starts 4 bytes before mpeg-ts packet so one needs to subtract
this->pkt_offset from return_pointer.

> return return_pointer;
> }
> 
> @@ -1948,7 +1992,8 @@ static void demux_ts_parse_packet (demux
> if( adaptation_field_control & 0x2 ){
> uint32_t adaptation_field_length = originalPkt[4];
> if (adaptation_field_length > 0) {
> -      demux_ts_adaptation_field_parse (originalPkt+5, adaptation_field_length);
> +      this->tbrm_pcr =
> +        demux_ts_adaptation_field_parse (originalPkt+5, adaptation_field_length);
> }
> /*
> * Skip adaptation header.
> @@ -2235,9 +2280,7 @@ static int demux_ts_seek (demux_plugin_t
> if (this->input->get_capabilities(this->input) & INPUT_CAP_SEEKABLE) {
> 
> if ((!start_pos) && (start_time)) {
> -      start_pos = start_time;
> -      start_pos *= this->rate;
> -      start_pos *= 50;
> +      start_pos = (int64_t)start_time * this->rate / 1000;
> }
> this->input->seek (this->input, start_pos, SEEK_SET);
> 
> @@ -2268,6 +2311,7 @@ static int demux_ts_seek (demux_plugin_t
> 
> }
> 
> +  this->tbrm_lasttime = 0; /* TJ. */
> return this->status;
> }
> 
> @@ -2277,7 +2321,7 @@ static int demux_ts_get_stream_length (d
> 
> if (this->rate)
> return (int)((int64_t) this->input->get_length (this->input)
> -                 * 1000 / (this->rate * 50));
> +                 * 1000 / this->rate);
> else
> return 0;
> }
> @@ -2460,7 +2504,7 @@ static demux_plugin_t *open_plugin (demu
> this->audio_tracks_count = 0;
> this->last_pmt_crc = 0;
> 
> -  this->rate = 16000; /* FIXME */
> +  this->rate = 0; /* byte/sec */

I would leave some default there, it will be overwritten later with
calculated rate anyway. That would help in the very beginning of the
stream and with video-only streams.

> 
> this->status = DEMUX_FINISHED;

And you still should split preview part from rate estimation :)


- Petri


------------------------------------------------------------------------------
Got Input?   Slashdot Needs You.
Take our quick survey online.  Come on, we don't ask for help often.
Plus, you'll get a chance to win $100 to spend on ThinkGeek.
http://p.sf.net/sfu/slashdot-survey
_______________________________________________
xine-devel mailing list
xine-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xine-devel


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic