Fastest way to load an animated GIF in Python into a numpy array

Surprisingly, I couldn't find any coverage of this.

I've found three recognised ways of doing this: Pillow, OpenCV, and imageio. The results surprised me, so I've posted them as a self-answered Q&A (below).

CodePudding user response：

This seems to be the standard way of loading a GIF in each library:

import os
import cv2
import time
import imageio
import numpy as np
from tqdm import tqdm
from glob import glob
from PIL import Image, ImageSequence

gifs = glob(os.path.join("/folder/of/gifs", "*"))
print(f"Found {len(gifs)} GIFs")

def load_gif_as_video_pil(gif_path):
    im = Image.open(gif_path)
    frames = []
    for frame in ImageSequence.Iterator(im):
        frame = np.array(frame.copy().convert('RGB').getdata(), dtype=np.uint8).reshape(frame.size[1],
                                                                                        frame.size[0],
                                                                                        3)
        frames.append(frame)

    return np.array(frames)

def load_gif_as_video_imageio(gif_path):
    return imageio.mimread(gif_path)

def load_gif_as_video_opencv(filename):
    gif = cv2.VideoCapture(filename)
    frames = []
    while True:
        ret, frame = gif.read()
        if not ret:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    return np.array(frames)


start = time.time()
[load_gif_as_video_imageio(path) for path in tqdm(gifs)]
end = time.time()
print(f"ImageIO: {end - start}")

start = time.time()
[load_gif_as_video_opencv(path) for path in tqdm(gifs)]
end = time.time()
print(f"OpenCV: {end - start}")

start = time.time()
[load_gif_as_video_pil(path) for path in tqdm(gifs)]
end = time.time()
print(f"PIL: {end - start}")

Over 250 GIFs, these are the results:

100%|██████████| 250/250 [00:13<00:00, 18.32it/s]
ImageIO: 13.829721689224243
100%|██████████| 250/250 [00:06<00:00, 39.04it/s]
OpenCV: 6.478164434432983
100%|██████████| 250/250 [03:00<00:00,  1.38it/s]
PIL: 181.03292179107666

OpenCV is roughly twice as fast as imageio, which in turn is about 13x faster than PIL (using my method, anyway).

CodePudding user response：

Your Pillow code is very inefficient! Pillow images are compatible with NumPy's array interface, so your conversion code is overcomplicating things.
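To make that concrete, here is a minimal sketch comparing the two conversions on a tiny synthetic image (a hypothetical stand-in for one GIF frame, not one of the GIFs from the question). It only checks that both routes give the same array; it doesn't time anything.

```python
import numpy as np
from PIL import Image

# A tiny synthetic RGB image (hypothetical stand-in for one GIF frame).
frame = Image.new("RGB", (4, 3), (10, 20, 30))

# Route from the question: getdata() hands NumPy one Python tuple per pixel,
# and the flat result then has to be reshaped to (height, width, 3).
slow = np.array(frame.getdata(), dtype=np.uint8).reshape(frame.size[1], frame.size[0], 3)

# Direct route: NumPy reads the image through its array interface.
fast = np.array(frame)

# Both routes produce the same (height, width, 3) uint8 array.
same = np.array_equal(slow, fast)
```

The difference is only in how the pixels get there: the first route goes through a per-pixel Python sequence, the second doesn't.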

I'd use the following helper to get the frames out into a Numpy array:

from PIL import Image, ImageSequence
import numpy as np

def load_frames(image: Image.Image, mode='RGBA'):
    return np.array([
        np.array(frame.convert(mode))
        for frame in ImageSequence.Iterator(image)
    ])

with Image.open('animated.gif') as im:
    frames = load_frames(im)

This runs in basically the same time as the others. For example, with a 400x400-pixel, 21-frame GIF I have, mimread takes ~140ms, while Pillow takes ~130ms.

Update: I've just had a play with CV2 and noticed that its "wall clock" time (i.e. what you were measuring) is better because it's doing work in other threads. For example, if I run each loader using the Jupyter %time magic, I get the following output:

ImageIO

CPU times: user 135 ms, sys: 9.81 ms, total: 145 ms
Wall time: 145 ms

PIL

CPU times: user 127 ms, sys: 3.03 ms, total: 130 ms
Wall time: 130 ms

CV2

CPU times: user 309 ms, sys: 95 ms, total: 404 ms
Wall time: 89.7 ms

I.e. although it's finishing the loop in 90ms, it's used ~4.5x that CPU time in total.
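If you're not in Jupyter, something like the following sketch gives a similar wall-vs-CPU comparison using only the standard library. The helper name time_call is my own, not part of any library, and unlike %time it reports user and sys CPU time as one combined figure.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func once and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # user + system CPU time of this process (all threads)
    result = func(*args, **kwargs)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


# Example with a cheap stand-in workload; swap in one of the GIF loaders.
total, wall, cpu = time_call(sum, range(1_000_000))
print(f"wall: {wall * 1000:.1f} ms, cpu: {cpu * 1000:.1f} ms")
```

When cpu comes out noticeably larger than wall, the call was spreading work over more than one thread, which is what the CV2 numbers above show.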

So if you're interested in the time to complete for a single large image, you might want to use CV2. But if you were batch processing lots of images, I'd suggest using Pillow in a multiprocessing Pool.
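For the batch case, a minimal sketch of what I mean is below. The function names load_gif and load_gifs_parallel are made up for illustration, and I haven't benchmarked this against the 250 GIFs from the question, so treat it as a starting point rather than a measured result.

```python
from multiprocessing import Pool

import numpy as np
from PIL import Image, ImageSequence


def load_gif(path, mode="RGB"):
    """Decode every frame of one GIF into a (frames, height, width, channels) array."""
    with Image.open(path) as im:
        return np.array([
            np.array(frame.convert(mode))
            for frame in ImageSequence.Iterator(im)
        ])


def load_gifs_parallel(paths, processes=None):
    """Decode several GIFs in worker processes; results keep the order of paths."""
    # processes=None lets Pool pick a worker count based on the CPU count.
    with Pool(processes=processes) as pool:
        return pool.map(load_gif, paths)
```

Call load_gifs_parallel(gifs) from under an if __name__ == "__main__": guard, since multiprocessing needs that on platforms that spawn workers rather than fork them. Bear in mind that each decoded array is pickled and sent back to the parent process, so for very large GIFs it may be worth doing the downstream processing inside the worker instead.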