PyAV inconsistency when parsing packets from h264 frames

Time:04-04

When producing H.264 frames and decoding them with PyAV, codec.parse() returns packets only after it has been invoked twice on the same data.

Consider the following test H.264 input, created using:

ffmpeg -f lavfi -i testsrc=duration=10:size=1280x720:rate=30 -f image2 -vcodec libx264 -bsf h264_mp4toannexb -force_key_frames source -x264-params keyint=1:scenecut=0 "frame-%4d.h264"

Now, using pyAV to parse the first frame:

import av
codec = av.CodecContext.create('h264', 'r')
with open('/path/to/frame-0001.h264', 'rb') as file_handler:
    chunk = file_handler.read()
    packets = codec.parse(chunk) # This line needs to be invoked twice to parse packets

packets remain empty unless the last line is invoked again (packets = codec.parse(chunk))

Also, for various real-life inputs that I cannot characterize, decoding frames from packets seems to require several decode invocations as well:

packet = packets[0]
frames = codec.decode(packet) # This line needs to be invoked 2-3 times to actually receive frames.

Does anyone know anything about this inconsistent behavior of PyAV?

(Using Python 3.8.12 on macOS Monterey 12.3.1, ffmpeg 4.4.1, pyAV 9.0.2)

CodePudding user response:

This is expected PyAV behavior. More than that, it is the expected behavior of the underlying libav* libraries. One packet does not guarantee a frame, and several packets may be needed before the decoder produces one. This is apparent in FFmpeg's video decoder example:

    while (ret >= 0) {
        ret = avcodec_receive_frame(dec_ctx, frame);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
            return;

If the decoder needs more packets to form a frame, avcodec_receive_frame() returns AVERROR(EAGAIN). That is a request for more input, not a failure.

[edit]

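Actually, the above excerpt can be misleading on its own because it simply returns on EAGAIN. That return goes back to the caller, which sends the next packet and then tries to receive again. To retrieve a frame, treat EAGAIN as "feed more input" and keep going, and stop only on EOF:

    while (ret >= 0) {
        ret = avcodec_receive_frame(dec_ctx, frame);
        if (ret == AVERROR(EAGAIN))
            break;      /* needs more input: send the next packet, then receive again */
        if (ret == AVERROR_EOF)
            return;     /* decoder is fully drained */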
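In PyAV terms, the same pattern might look like the sketch below. decode_all is a hypothetical helper name, not part of PyAV. It assumes that codec.decode(packet) returns a possibly empty list of frames and that codec.decode(None) flushes the decoder, which is how PyAV documents it.

```python
def decode_all(codec, packets):
    """Decode every packet, then drain the frames the decoder still holds.

    `codec` is assumed to behave like an av.CodecContext opened for reading.
    """
    frames = []
    for packet in packets:
        # An empty list here is normal: the decoder may still be buffering input.
        frames.extend(codec.decode(packet))
    # decode(None) asks the decoder to flush its internal buffers.
    frames.extend(codec.decode(None))
    return frames
```

With the objects from the question this would be called as frames = decode_all(codec, packets), replacing the repeated codec.decode(packet) calls.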

[edit]

PyAV's codec.parse()

The decoding sometimes needing additional calls is a fairly well-known fact, but the parser needing to flush is less common. Here is the difference between PyAV and FFmpeg:

PyAV parses the input data with av_parser_parse2() like this [ref]:


        while True:

            with nogil:
                consumed = lib.av_parser_parse2(
                    self.parser,
                    self.ptr,
                    &out_data, &out_size,
                    in_data, in_size,
                    lib.AV_NOPTS_VALUE, lib.AV_NOPTS_VALUE,
                    0
                )
            err_check(consumed)

            # ...snip...

            if not in_size:
                # This was a flush. Only one packet should ever be returned.
                break

            in_data += consumed
            in_size -= consumed

            if not in_size:
                # Aaaand now we're done.
                break

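So it loops until the input data is 100% consumed. Note that it does not call av_parser_parse2 once more at the end of the buffer, which makes sense because the input may be only one part of the stream. As a result, the last packet stays inside the parser until more data arrives or the parser is flushed.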
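To see why a parser can hold back the last packet, here is a toy model in pure Python. It is not PyAV or FFmpeg code: ToyAnnexBParser and its splitting rule are made up for illustration, and it splits on every Annex B start code, whereas the real H.264 parser works on whole access units. The point it tries to show is only that a unit's end becomes known when the start of the next unit shows up, or when the caller flushes.

```python
START_CODE = b"\x00\x00\x00\x01"  # Annex B start code


class ToyAnnexBParser:
    """Toy stand-in for a stream parser; emits a unit once the next one starts."""

    def __init__(self):
        self.buffer = b""

    def parse(self, data=b""):
        if not data:
            # Flush: whatever is buffered is handed out as the final unit.
            out = [self.buffer] if self.buffer else []
            self.buffer = b""
            return out
        self.buffer += data
        units = []
        while True:
            # Search from offset 1 so the unit's own start code is skipped.
            nxt = self.buffer.find(START_CODE, 1)
            if nxt == -1:
                break  # end of the current unit is not known yet
            units.append(self.buffer[:nxt])
            self.buffer = self.buffer[nxt:]
        return units
```

Feeding one unit to this toy returns nothing. Feeding the same bytes a second time returns the first unit, because only then is its end visible, which mirrors the "call parse twice" symptom in the question. An empty parse() call releases the unit that is still buffered.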

In contrast, FFmpeg's demuxing code wraps av_parser_parse2 in its internal parse_packet function, and you can see how it handles the same situation:

while (size > 0 || (flush && got_output)) {
   int64_t next_pts = pkt->pts;
   int64_t next_dts = pkt->dts;
   int len;

   len = av_parser_parse2(sti->parser, sti->avctx,
                          &out_pkt->data, &out_pkt->size, data, size,
                          pkt->pts, pkt->dts, pkt->pos);

It also calls av_parser_parse2 to flush the parser after the input data is exhausted. So you need to do the same in PyAV: after all your input has been fed, call codec.parse() one last time with no data to flush out the last packet.
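A minimal sketch of that flush step follows. parse_and_flush and demo are hypothetical helper names, and the sketch assumes that codec.parse() called without an argument flushes the parser, as PyAV's documentation describes for None or empty input.

```python
def parse_and_flush(codec, chunks):
    """Feed every chunk to codec.parse(), then flush the parser once."""
    packets = []
    for chunk in chunks:
        packets.extend(codec.parse(chunk))
    # No data means "flush": the parser releases what it still buffers.
    packets.extend(codec.parse())
    return packets


def demo(path):
    """Parse a single Annex B file, e.g. one frame written by the ffmpeg command."""
    import av  # PyAV; imported lazily so parse_and_flush has no hard dependency

    codec = av.CodecContext.create('h264', 'r')
    with open(path, 'rb') as file_handler:
        return parse_and_flush(codec, [file_handler.read()])
```

For the file from the question, demo('/path/to/frame-0001.h264') should then return the packet without a second parse(chunk) call on the same bytes.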
