This issue was moved to a discussion.

How to perform transcoding efficiently #534

Closed
ArtanisCV opened this issue Jun 21, 2019 · 9 comments

@ArtanisCV

ArtanisCV commented Jun 21, 2019

Currently, PyAV returns decoded frames in dts order. However, an encoder needs to receive frames in pts order. Thus, I have to sort the frames between decoding and encoding like this (otherwise the output video is incorrect when the input contains B-frames):

import av

input_container = av.open('input.mp4')
video_stream = input_container.streams.video[0]

output_container = av.open('output.mp4', mode='w')
stream = output_container.add_stream('libx264', rate=video_stream.average_rate)
stream.width = video_stream.width
stream.height = video_stream.height
stream.pix_fmt = 'yuv420p'

# decode (frames arrive in the order the decoder emits them)
frames = []
for frame in input_container.decode(video=0):
    frames.append(frame)

# sort into presentation order
frames = sorted(frames, key=lambda f: f.pts)

# encode
for frame in frames:
    # let libav decide the correct pts and time base (in case of changing fps)
    frame.pts = None
    frame.time_base = None
    for packet in stream.encode(frame):
        output_container.mux(packet)

# flush any frames still buffered in the encoder
for packet in stream.encode(None):
    output_container.mux(packet)

output_container.close()
input_container.close()

Is there a more efficient way?

Thanks in advance.
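
One way to avoid holding every decoded frame in memory is a bounded reorder buffer: push each frame onto a min-heap keyed by pts and pop the smallest once the heap exceeds a fixed size. The following is a minimal sketch, assuming every frame has a non-None pts and that no frame arrives more than `window` positions late. The names `reorder_by_pts`, `window`, and the stand-in `Frame` tuple are hypothetical and not part of PyAV.

```python
import heapq
from collections import namedtuple
from itertools import count

# Stand-in for a decoded frame; only the .pts attribute matters here.
Frame = namedtuple("Frame", ["pts", "data"])

def reorder_by_pts(frames, window=4):
    """Yield frames in ascending pts order using a bounded min-heap.

    Assumes no frame arrives more than `window` positions late;
    `window` is an illustrative parameter, not a PyAV setting.
    """
    heap = []
    tiebreak = count()  # keeps heap entries comparable when pts values tie
    for frame in frames:
        heapq.heappush(heap, (frame.pts, next(tiebreak), frame))
        if len(heap) > window:
            yield heapq.heappop(heap)[2]
    # Drain whatever is left once the input is exhausted.
    while heap:
        yield heapq.heappop(heap)[2]

decode_order = [Frame(1, "a"), Frame(4, "b"), Frame(3, "c"), Frame(6, "d"), Frame(5, "e")]
ordered = [f.pts for f in reorder_by_pts(decode_order, window=2)]
print(ordered)
```

With PyAV, the iterable passed in could be `input_container.decode(video=0)`, so that at most `window` frames are buffered at any time instead of the whole video.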

@mikeboers
Member

I don't think we have encountered this behavior. PyAV yields packets/frames in the order that FFmpeg provides them, which has never been anything but pts order in my experience.

Does this happen for all videos for you?

@ArtanisCV
Author

Thanks for your reply. I usually encounter this issue when reading avi/mkv files.

For example, when reading https://filebin.ca/4lNZ9xo4YIFZ (a file taken from the HMDB51 dataset), the dts and pts returned by pyav (i.e., for frame in input_container.decode(video=0): print(f"dts: {frame.dts}, pts: {frame.pts}")) are:

dts: 2, pts: 1
dts: 3, pts: 4
dts: 4, pts: 3
dts: 5, pts: 6
dts: 6, pts: 5
dts: 7, pts: 8
dts: 8, pts: 7
dts: 9, pts: 10
dts: 10, pts: 9
dts: 11, pts: 11
dts: 12, pts: 13
dts: 13, pts: 12
dts: 14, pts: 15
dts: 15, pts: 14
dts: 16, pts: 16
dts: 17, pts: 17
dts: 18, pts: 18
dts: 19, pts: 20
dts: 20, pts: 19
dts: 21, pts: 21
dts: 22, pts: 22
dts: 23, pts: 23
dts: 24, pts: 25
dts: 25, pts: 24
dts: 26, pts: 27
dts: 27, pts: 26
dts: 28, pts: 28
dts: 29, pts: 30
dts: 30, pts: 29
dts: 31, pts: 32
dts: 32, pts: 31
dts: 33, pts: 34
dts: 34, pts: 33
dts: 35, pts: 35
dts: 36, pts: 37
dts: 37, pts: 36
dts: 38, pts: 38
dts: 39, pts: 39
dts: 40, pts: 41
dts: 41, pts: 40
dts: 42, pts: 43
dts: 43, pts: 42
dts: 44, pts: 45
dts: 45, pts: 44
dts: 46, pts: 46
dts: 47, pts: 48
dts: 48, pts: 47
dts: 49, pts: 49
dts: 50, pts: 51
dts: 51, pts: 50
dts: 52, pts: 52
dts: 53, pts: 53
dts: 54, pts: 55
dts: 55, pts: 54
dts: 56, pts: 57
dts: None, pts: 56

You can see that dts is increasing while pts is not. Interestingly, ffplay seems to play the video in dts order instead of pts order. Maybe the pts returned by PyAV is not correct?
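
The observation can be checked mechanically by counting how often each sequence steps backwards. This is a small sketch using the first ten (dts, pts) pairs from the listing above; `count_regressions` is a hypothetical helper name.

```python
# (dts, pts) pairs copied from the start of the listing above.
pairs = [(2, 1), (3, 4), (4, 3), (5, 6), (6, 5),
         (7, 8), (8, 7), (9, 10), (10, 9), (11, 11)]

def count_regressions(values):
    """Count positions where a value is lower than its predecessor."""
    return sum(1 for prev, cur in zip(values, values[1:]) if cur < prev)

dts_regressions = count_regressions([d for d, _ in pairs])
pts_regressions = count_regressions([p for _, p in pairs])
print(f"dts regressions: {dts_regressions}, pts regressions: {pts_regressions}")
```

A nonzero pts count with a zero dts count matches the pattern described here: frames come out in dts order while pts jumps back and forth.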

@mikeboers
Member

ffprobe on that file indicates that packets/frames do not have a pts. I don't know what that implies though. Maybe we assumed that there is always a pts and are somehow generating a garbage one. It is odd to me that there wouldn't be one at all.

@ArtanisCV
Author

Thanks for pointing out this issue. Now the problem is clearer.

I am not familiar with AVI files, but I remember that AVI files do not support variable frame rate. As the frame duration never changes, pts values are not strictly required by AVI files. In libav, pts may be AV_NOPTS_VALUE. In PyAV, the pts seems to be always valid, which (I guess) leads to a non-monotonic pts sequence for AVI files with AV_NOPTS_VALUE.
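
To illustrate why a constant frame rate makes stored timestamps redundant, the pts of each frame can be derived from its index alone. This is a sketch under assumed values: the 30 fps rate and the 1/90000 time base are made up for the example, and `pts_from_index` is a hypothetical helper.

```python
from fractions import Fraction

def pts_from_index(index, frame_rate, time_base):
    """Return the pts (in time_base units) of frame `index` at a constant rate."""
    seconds = Fraction(index) / frame_rate
    return int(seconds / time_base)

frame_rate = Fraction(30)          # assumed constant rate, frames per second
time_base = Fraction(1, 90000)     # assumed stream time base, seconds per tick
timestamps = [pts_from_index(i, frame_rate, time_base) for i in range(5)]
print(timestamps)
```

Each frame advances by the same number of ticks, so the sequence is monotonic by construction.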

@mikeboers
Member

AV_NOPTS_VALUE gets turned into None.

We do assign a pts when encoding.

I didn't do an exhaustive search for pts, but I don't think there is anything where we do anything special with pts for decode. I'm tempted to drop a print right after we receive the frame...
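
The sentinel-to-None mapping mentioned above can be pictured as follows. This is only an illustration of the idea, not PyAV's actual source; `to_optional_timestamp` is a hypothetical name. The constant mirrors libavutil's definition of AV_NOPTS_VALUE as the most negative 64-bit integer.

```python
# AV_NOPTS_VALUE is defined in libavutil as the most negative 64-bit integer.
AV_NOPTS_VALUE = -(2 ** 63)

def to_optional_timestamp(raw):
    """Map the raw sentinel to None; pass real timestamps through unchanged."""
    return None if raw == AV_NOPTS_VALUE else raw

print(to_optional_timestamp(AV_NOPTS_VALUE), to_optional_timestamp(56))
```

Under this mapping, any integer a frame reports as pts was supplied by the underlying library, since only the sentinel becomes None.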

@mikeboers
Member

If I put a print right after this line then it is revealed that FFmpeg is providing those pts values.

Everything on PyAV's side is behaving normally. I don't know if there is anything reasonable to do here.

@ArtanisCV
Author

> If I put a print right after this line then it is revealed that FFmpeg is providing those pts values.
>
> Everything on PyAV's side is behaving normally. I don't know if there is anything reasonable to do here.

Interestingly, I wrote the following C code for testing (modified from http://dranger.com/ffmpeg/tutorial01.c), and it seems that frame->best_effort_timestamp reports the correct pts.

// Use
//
// gcc -o test test.c -lavformat -lavcodec -lavutil
//
// to build (assuming libavformat, libavcodec and libavutil are correctly
// installed on your system).
//
// Run using
//
// ./test input.avi

#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>

#include <stdio.h>

// compatibility with newer API
#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(55, 28, 1)
#define av_frame_alloc avcodec_alloc_frame
#define av_frame_free avcodec_free_frame
#endif

int main(int argc, char *argv[])
{
    // Initializing these to NULL prevents segfaults!
    AVFormatContext *pFormatCtx = NULL;
    int i, videoStream;
    AVCodecContext *pCodecCtxOrig = NULL;
    AVCodecContext *pCodecCtx = NULL;
    AVCodec *pCodec = NULL;
    AVFrame *pFrame = NULL;
    AVPacket packet;
    int frameFinished;

    if (argc < 2)
    {
        printf("Please provide a movie file\n");
        return -1;
    }
    // Register all formats and codecs
    av_register_all();

    // Open video file
    if (avformat_open_input(&pFormatCtx, argv[1], NULL, NULL) != 0)
        return -1; // Couldn't open file

    // Retrieve stream information
    if (avformat_find_stream_info(pFormatCtx, NULL) < 0)
        return -1; // Couldn't find stream information

    // Find the first video stream
    videoStream = -1;
    for (i = 0; i < pFormatCtx->nb_streams; i++)
        if (pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
        {
            videoStream = i;
            break;
        }
    if (videoStream == -1)
        return -1; // Didn't find a video stream

    // Get a pointer to the codec context for the video stream
    pCodecCtxOrig = pFormatCtx->streams[videoStream]->codec;
    // Find the decoder for the video stream
    pCodec = avcodec_find_decoder(pCodecCtxOrig->codec_id);
    if (pCodec == NULL)
    {
        fprintf(stderr, "Unsupported codec!\n");
        return -1; // Codec not found
    }
    // Copy context
    pCodecCtx = avcodec_alloc_context3(pCodec);
    if (avcodec_copy_context(pCodecCtx, pCodecCtxOrig) != 0)
    {
        fprintf(stderr, "Couldn't copy codec context");
        return -1; // Error copying codec context
    }

    // Open codec
    if (avcodec_open2(pCodecCtx, pCodec, NULL) < 0)
        return -1; // Could not open codec

    // Allocate video frame
    pFrame = av_frame_alloc();

    // Read frames
    while (av_read_frame(pFormatCtx, &packet) >= 0)
    {
        // Is this a packet from the video stream?
        if (packet.stream_index == videoStream)
        {
            // Decode video frame
            avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);

            // Did we get a video frame?
            if (frameFinished)
            {
                // best_effort_timestamp is an int64_t, so cast it for printf
                printf("pts: %lld\n", (long long)pFrame->best_effort_timestamp);
            }
        }

        // Free the packet that was allocated by av_read_frame
        av_free_packet(&packet);
    }

    // Free the YUV frame
    av_frame_free(&pFrame);

    // Close the codecs
    avcodec_close(pCodecCtx);
    avcodec_close(pCodecCtxOrig);

    // Close the video file
    avformat_close_input(&pFormatCtx);

    return 0;
}

@mikeboers
Member

So.... how about exposing Frame.best_effort_timestamp or something for when you want to use it?
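
For readers unfamiliar with the idea behind a "best effort" timestamp, here is a simplified sketch of one possible heuristic: track how often pts and dts each fail to increase, and return whichever has misbehaved less. It is only loosely modeled on that idea. FFmpeg's real logic lives inside libavcodec and may differ, and `TimestampGuesser` is a hypothetical class, not a PyAV or FFmpeg API.

```python
class TimestampGuesser:
    """Pick pts or dts per frame, preferring whichever has regressed less often."""

    def __init__(self):
        self.last_pts = None
        self.last_dts = None
        self.faulty_pts = 0
        self.faulty_dts = 0

    def guess(self, pts, dts):
        # Count a fault whenever a timestamp fails to increase.
        if dts is not None:
            if self.last_dts is not None and dts <= self.last_dts:
                self.faulty_dts += 1
            self.last_dts = dts
        if pts is not None:
            if self.last_pts is not None and pts <= self.last_pts:
                self.faulty_pts += 1
            self.last_pts = pts
        # Prefer pts unless it has been faulty more often than dts.
        if pts is not None and (dts is None or self.faulty_pts <= self.faulty_dts):
            return pts
        return dts

guesser = TimestampGuesser()
# (pts, dts) pairs taken from the first rows of the listing earlier in the thread.
pairs = [(1, 2), (4, 3), (3, 4), (6, 5), (5, 6)]
guessed = [guesser.guess(pts, dts) for pts, dts in pairs]
print(guessed)
```

In this sketch the guesser trusts pts until it first steps backwards and then falls back to dts, so the result never decreases even though the raw pts values do.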

@ArtanisCV
Author

> So.... how about exposing Frame.best_effort_timestamp or something for when you want to use it?

I think exposing it as a property or something else is fine.

However, according to #13, maybe this will lead to some incompatibility?

@PyAV-Org PyAV-Org locked and limited conversation to collaborators Mar 25, 2022
@jlaine jlaine converted this issue into discussion #933 Mar 25, 2022
