
Convert a video (.mp4) into a 2D matrix where each row represents a frame

Hello,

as explained in the title, I want to load a video in Python (import cv2) and then represent it as a 2D array, video_matrix. The number of rows of video_matrix equals the total number of frames, and the number of columns equals the total number of features that describe a single frame.

For example, a 3 s video at 30 fps has 90 frames; each frame has height 100 and width 100, and each pixel is described by 3 values (RGB).

In my current method I convert each frame, which is a 3D array of shape (100, 100, 3), into a 1D vector of size 100 x 100 x 3 = 30000.

def image_to_vector(image):
    """
    Args:
        image: numpy array of shape (height, width, depth)

    Returns:
        v: a column vector of shape (height * width * depth, 1)
    """
    height, width, depth = image.shape
    return image.reshape((height * width * depth, 1))

Then I append the resulting vector to video_matrix:

video_matrix = np.column_stack((video_matrix, frame_vector))

I repeat this procedure for every frame of the video, so in the end I get a numerical representation of the video as a 2D array in which each row represents one frame. The video_matrix must have this form because I want to apply machine learning algorithms to it.
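To make the target layout concrete, here is a minimal sketch that uses synthetic frames in place of a real video. The random data and the names frames and columns are assumptions for illustration only. It also shows that stacking column vectors with np.column_stack yields one column per frame, so that result would need a transpose to get one row per frame.

```python
import numpy as np

# Synthetic stand-in for a 3 s, 30 fps video of 100 x 100 colour frames.
num_frames, height, width, depth = 90, 100, 100, 3
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(num_frames, height, width, depth), dtype=np.uint8)

# One row per frame: flatten every axis except the frame axis.
video_matrix = frames.reshape(num_frames, -1)
print(video_matrix.shape)  # (90, 30000)

# Stacking column vectors instead gives one column per frame.
columns = np.column_stack([f.reshape(-1, 1) for f in frames])
print(columns.shape)  # (30000, 90)
```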

My problem is that the second step (appending frame_vector to video_matrix) takes too much time. For example, for a three-minute video it takes almost 2 hours to build the corresponding video_matrix. Is there a built-in tool in OpenCV for Python that lets me build video_matrix faster, even for longer videos?
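One likely contributor to the slowdown (an assumption, not something measured here) is that np.column_stack returns a new array and copies every existing column on each call, so the total copying grows roughly quadratically with the number of frames. The sketch below contrasts that with allocating the matrix once and writing each frame into its own row. It runs on small synthetic frames, and build_by_stacking and build_preallocated are hypothetical names.

```python
import numpy as np


def build_by_stacking(frames):
    """Grow the matrix one column at a time (copies everything on each call)."""
    matrix = np.empty((frames[0].size, 0), dtype=frames[0].dtype)
    for frame in frames:
        matrix = np.column_stack((matrix, frame.reshape(-1, 1)))
    return matrix.T  # transpose so that each row is one frame


def build_preallocated(frames):
    """Allocate the matrix once and fill one row per frame."""
    matrix = np.empty((len(frames), frames[0].size), dtype=frames[0].dtype)
    for i, frame in enumerate(frames):
        matrix[i] = frame.reshape(-1)
    return matrix


# Small synthetic frames so the comparison runs quickly.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(20, 20, 3), dtype=np.uint8) for _ in range(50)]

slow = build_by_stacking(frames)
fast = build_preallocated(frames)
print(slow.shape, fast.shape, np.array_equal(slow, fast))
```

Both functions produce the same matrix; the difference is only in how often the data is copied along the way.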

My Code:

import os  # create the output directory for the frames
import numpy as np
import cv2  # extract frames from the videos
from PIL import Image  # to manipulate images

# Create frames of a video and store them
video = cv2.VideoCapture('path/video.mp4')
if not os.path.exists('data'):
    os.makedirs('data')

counter = 0
while True:
    # reading from frame
    ret, frame = video.read()

    if ret:
        # if video is still left continue creating images
        name = './data/frame' + str(counter) + '.jpg'
        # print('Creating...' + name)

        # writing the extracted images
        cv2.imwrite(name, frame)

        # increasing counter so that it will
        # show how many frames are created
        counter += 1
    else:
        break

# Release all space and windows once done
video.release()
cv2.destroyAllWindows()

# frame size taken from the first saved frame (PIL's size is (width, height))
width, height = Image.open('./data/frame0.jpg').size
video_matrix = np.zeros(width * height * 3)  # initialize 1D array which will become the 2D array; first column will be deleted at the end

for i in range(counter):  # loops over the total amount of frames
    current_frame = np.asarray(Image.open('./data/frame' + str(i) + '.jpg'))  # 3D array = current frame
    frame_vector = image_to_vector(current_frame)  # convert frame into a column vector
    video_matrix = np.column_stack((video_matrix, frame_vector))  # append frame i as a new column of the matrix that will represent the video

video_matrix = np.delete(video_matrix, 0, 1)  # delete the placeholder first column
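For comparison, here is a hedged sketch of a variant that skips the JPEG round trip and writes frames straight from the capture into a preallocated matrix with one row per frame. The names capture_to_matrix and video_to_matrix are hypothetical. The frame count reported by cv2.CAP_PROP_FRAME_COUNT can be inexact for some files, so the loop stops at the first failed read and trims unused rows. Note also that OpenCV delivers pixels in BGR order and that the matrix stays uint8 rather than float64.

```python
import numpy as np


def capture_to_matrix(capture, frame_count, height, width, channels=3):
    """Fill one row per frame from any object with a cv2-style read() method."""
    matrix = np.empty((frame_count, height * width * channels), dtype=np.uint8)
    n = 0
    while n < frame_count:
        ret, frame = capture.read()
        if not ret:  # fewer frames than reported: stop early
            break
        matrix[n] = frame.reshape(-1)  # flatten (height, width, channels) into a row
        n += 1
    return matrix[:n]  # drop rows that were never filled


def video_to_matrix(path):
    """Open a video file with OpenCV and return its frames as a 2D uint8 array."""
    import cv2  # imported here so capture_to_matrix can be used without OpenCV

    capture = cv2.VideoCapture(path)
    try:
        frame_count = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
        width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
        return capture_to_matrix(capture, frame_count, height, width)
    finally:
        capture.release()
```

Under these assumptions, video_to_matrix('path/video.mp4') would return an array of shape (number of frames, height * width * 3).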