Saving Numpy Array as Video Relative to a Time Array

Let a numpy array video of shape (T,w,h,3) be given. Here T is a positive integer representing the number of frames, w is a positive integer representing the width, and h is a positive integer representing the height. Every entry of video is an integer from 0 to 255. In other words, video is a numpy array that represents a video in the sense that video[t] is an RGB image for every non-negative integer t < T. After video is given, another array of floats time of shape (T) is given. This array time satisfies time[0]=0 and time[t] < time[t+1] for every non-negative integer t < T-1. An example of the above situation is given here:

import numpy as np

shape = (200, 500, 1000, 3)
video = np.random.randint(0, 256, shape, dtype=np.uint8)  # upper bound is exclusive
time = np.zeros(shape[0], dtype=np.float64)  # float16 is too coarse to keep the sums strictly increasing
time[0] = 0
for i in range(1, shape[0]):
    x = np.random.random_sample()
    time[i] = time[i-1] + x

My goal is to save video and time as a playable video file such that:

The video file is in format of either avi or mp4 (so that we can just double click it and play it).
Each frame of the video respects the time array in the following sense: for every non-negative integer t < T-1, the viewer sees the picture video[t] during the time period from time[t] to time[t+1]. The moment time[T-1] is the end of the video.
If possible, keep the original size (in the given example the size is (500,1000)).

How can I achieve this? I tried using OpenCV's video writer, but it seems I have to enter some fps information, which I do not have because the time array can be very non-uniform in terms of when each picture is displayed.

CodePudding user response：

That is impossible with OpenCV. OpenCV's VideoWriter only supports fixed/constant frame rate. Anything based on that will require rounding to the nearest frame time and/or needlessly high frame rates and duplicated frames (or rather frames that contain no change).
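
To illustrate the "needlessly high frame rates" point, here is a minimal sketch under the assumption that the timestamps have already been rounded to whole milliseconds. The helper name exact_constant_fps and the sample values are hypothetical. It computes the lowest constant frame rate at which every timestamp falls exactly on a frame boundary, and how many frames that costs.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def exact_constant_fps(times_ms):
    """Return (fps, total_frames) for timestamps given in whole milliseconds.

    Assumes times_ms has at least two entries, starts at 0 and is strictly
    increasing.
    """
    # The frame period must divide every timestamp, so take their gcd.
    period_ms = reduce(gcd, times_ms)
    fps = Fraction(1000, period_ms)
    total_frames = times_ms[-1] // period_ms
    return fps, total_frames


# Three images shown for 40 ms, 60 ms and 150 ms (hypothetical values).
fps, total_frames = exact_constant_fps([0, 40, 100, 250])
print(fps, total_frames)
```

With irregular timestamps the gcd quickly shrinks toward 1 ms, which would mean a 1000 fps file consisting almost entirely of duplicated frames.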

You want presentation timestamps (PTS). That's an inherent aspect of media containers and streams. A word of caution: some video players may assume a "reasonable" time span between frames, and may glitch otherwise, like becoming laggy/unresponsive because the whole GUI is tied to video timing... That's the fault of the video player though.

Use PyAV. It's the only ffmpeg wrapper for python I know of that actually uses API calls rather than messing around with subprocesses.

Here's the relevant example: https://github.com/PyAV-Org/PyAV/blob/main/examples/numpy/generate_video_with_pts.py

In short: set frame.pts = int(round(my_pts / stream.codec_context.time_base)) where my_pts is something in seconds.
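
The conversion from seconds to PTS can be sketched without any video library. This is only an illustration: seconds_to_pts is a hypothetical helper name, and the 1/1000 time base (milliseconds) is an assumption. In PyAV the time base would come from stream.codec_context.time_base, as in the formula above.

```python
from fractions import Fraction


def seconds_to_pts(times, time_base=Fraction(1, 1000)):
    """Convert timestamps in seconds to integer PTS counts of time_base."""
    pts = [round(Fraction(float(t)) / time_base) for t in times]
    # Encoders generally expect strictly increasing PTS, so a time base that
    # is too coarse for the given timestamps is reported here.
    for earlier, later in zip(pts, pts[1:]):
        if later <= earlier:
            raise ValueError("time_base too coarse: PTS values collide")
    return pts


pts = seconds_to_pts([0.0, 0.5, 1.25])
print(pts)
```

Each resulting integer would then be assigned to frame.pts before the frame is encoded. Choosing a finer time base reduces the rounding error at no cost in file size, since no frames are duplicated.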

I wrote that example, derived from the sibling "fixed rate" example. I put some effort into getting the ffmpeg API usage "right" (time bases, containers/streams/contexts) but if it happens to fail or act up, you're allowed and encouraged to question what I did there.

CodePudding user response：

The solution to your problem is to generate all the video frames necessary for a given FPS value. Since the video needs a constant frame rate, you first have to decide at which granularity you want your video.

Once you have decided on the FPS value, generate all the required video frames so that you can use an export-to-video method with a constant frame rate.

Each image stays in the video, repeated frame after frame, until it is time to switch to the next one. The chosen FPS then determines how accurately the switches to a new image hit the specified time values.
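
As a rough sketch of that trade-off (not the code from this answer), the number of times each image has to be repeated at a given FPS can be computed by rounding every time value to the nearest frame boundary. frame_repeat_counts is a hypothetical helper name and the sample values are made up.

```python
def frame_repeat_counts(times, fps):
    """Return how many output frames each image occupies at a constant fps.

    times[i] is the moment image i appears and the last entry marks the end
    of the video, so the result has len(times) - 1 entries.
    """
    # Snap every time value to the nearest frame boundary.
    boundaries = [round(t * fps) for t in times]
    return [later - earlier for earlier, later in zip(boundaries, boundaries[1:])]


counts = frame_repeat_counts([0.0, 0.5, 1.2, 2.0], fps=10)
print(counts)
```

Each switch lands within 1/(2*FPS) seconds of its requested time. A count of 0 means the image is too short for the chosen FPS and gets dropped.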

Below is Python code with an improved way of generating the time values. It generates all the video frames, and the variable names are chosen to be self-explanatory. The logic of the algorithm is to generate a single frame image and repeat it as a frame of the resulting video until the next value on the time axis is reached. When that value is reached, a new image is generated and repeated until the video time reaches the following time value. The code writes the created data to an .mp4 file:

import numpy as np
import cv2   as cv

FPS = 15
fps_timeDelta  = 1/FPS
noOfImages     = 5              # 200
imageShape     = (210, 297 , 3) # (500, 1000, 3)

vidWriter = cv.VideoWriter(
    'opencv_writeVideo.mp4',
    cv.VideoWriter_fourcc(*'mp4v'),  # MPEG-4 tag that matches the .mp4 container
    FPS, (imageShape[1], imageShape[0])  # frame size is given as (width, height)
)

vidFrameTime = np.concatenate( 
    (np.zeros(1), np.add.accumulate(
                     np.random.random_sample(size=noOfImages))) 
)

vidTime          = 0.0
indxVidFrameTime = 1
singleImageRGB = np.random.randint(
                              0, 256, imageShape, dtype= np.uint8)
cv.imshow("singleImageRGB", singleImageRGB/255 )
cv.waitKey(0)

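while vidTime <= vidFrameTime[-1]:
    vidTime += fps_timeDelta
    # switch to a new image once the next time value has been reached
    if (indxVidFrameTime < len(vidFrameTime)
            and vidTime >= vidFrameTime[indxVidFrameTime]):
        singleImageRGB = np.random.randint(0, 256, imageShape, dtype=np.uint8)
        indxVidFrameTime += 1
    vidWriter.write(singleImageRGB)

vidWriter.release()  # finalize the file so that players can open it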