How to predict video frames? [closed]

asked 2018-12-29 20:57:35 -0600

Maria12

updated 2018-12-30 02:48:44 -0600

I found a post about video prediction:

https://stackoverflow.com/questions/5...

How can I predict video? The problem is that no video window is shown.

At first it raised a memory error; when there is no memory error, it becomes very slow. How much memory does this need? If I use historical data, how many minutes can this method predict? If I feed the predicted result back in to predict further, what is the maximum number of minutes it can predict?

I edited the code as follows, but no video is shown:

from __future__ import print_function

from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
from keras.layers import Conv1D, MaxPooling1D
from keras.datasets import imdb

# Embedding
max_features = 20000
maxlen = 100
embedding_size = 128

# Convolution
kernel_size = 5
filters = 64
pool_size = 4

# LSTM
lstm_output_size = 70

# Training
batch_size = 30
epochs = 2

'''
Note:
batch_size is highly sensitive.
Only 2 epochs are needed as the dataset is very small.
'''

#print('Loading data...')
#(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
#print(len(x_train), 'train sequences')
#print(len(x_test), 'test sequences')

def getit(x_train,y_train, x_test, y_test):
    print('Pad sequences (samples x time)')
    x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
    x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
    print('x_train shape:', x_train.shape)
    print('x_test shape:', x_test.shape)

    print('Build model...')

    model = Sequential()
    model.add(Embedding(max_features, embedding_size, input_length=maxlen))
    model.add(Dropout(0.25))
    model.add(Conv1D(filters,
                     kernel_size,
                     padding='valid',
                     activation='relu',
                     strides=1))
    model.add(MaxPooling1D(pool_size=pool_size))
    model.add(LSTM(lstm_output_size))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))

    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])

    print('Train...')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test))
    score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
    print('Test score:', score)
    print('Test accuracy:', acc)
    return model

import cv2
import numpy as np

vid = cv2.VideoCapture(r"C:\Users\martlee2\Downloads\stupidwoman.mp4")

while(vid.isOpened()):
    ret, frame = vid.read()
    if ret == True:
        count_frames = 0
        count_framesb = 0
        frame_list = []
        frame_listb = []
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frame = cv2.resize(frame,(480,360),interpolation=cv2.INTER_AREA)
        while count_frames < 30:
            frame_list.append(frame)
        while count_framesb < 60 and count_framesb >= 30:
            frame_listb.append(frame)
        if count_frames >= 30:
            count_frames = 0
        if count_framesb >= 30:
            count_framesb = 0
        model = getit(frame_list, np.array([1 for i in range(0,30)]),frame_listb, np.array([1 for i in range(0,30)]))
        frame_set = np.array(frame_list)
        frame_set = frame_set.reshape(1, 15, 480, 360, 1)
        pred = model.predict(frame_set)
        pred_ = np.argmax(pred,axis=1) #i'm using the Model object from Keras
        count_frames = count_frames + 1
        count_framesb = count_framesb + 1
        #frame = cv2.resize(frame,(480,360),interpolation=cv2.INTER_AREA)
        #frame = cv2.putText(frame,str(texto),(0,130), cv2.FONT_HERSHEY_SIMPLEX, 2.5, (255, 0, 0), 2, cv2.LINE_AA)
        cv2.imshow('Video', frame)
        cv2.imshow('Video2', pred)
        if cv2.waitKey(25) & 0xFF == ord('q'):
            break
    else:
        break

vid.release()
cv2.destroyAllWindows()

Closed for the following reason: not a real question, by berak
close date 2018-12-30 04:30:36.659518

Comments

and what is the actual problem now?

(apart from retraining your model on every single frame...)

((imho, you have to separate the training and the evaluation process, and put those into 2 separate functions, or even separate programs))

also, please tell us what you're trying to achieve here. what is the purpose of your program?

berak ( 2018-12-29 22:38:24 -0600 )
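
A note on the loop in the question, offered as a hedged sketch and not as a fix from the thread: the inner `while count_frames < 30:` loop never changes `count_frames`, so it appends the same frame forever, which would explain the memory error and why `cv2.imshow` is never reached. One way to follow the advice above is to keep a fixed-size rolling buffer and call an already-trained predictor only when the buffer is full. The names `run_inference`, `predict_fn` and `window` are hypothetical, and the frames can be any objects (for example the arrays returned by `vid.read()`).

```python
from collections import deque


def run_inference(frames, predict_fn, window=30):
    """Yield (frame, prediction) pairs using a rolling buffer.

    frames     -- any iterable of frames (assumption: read elsewhere,
                  e.g. from cv2.VideoCapture)
    predict_fn -- hypothetical callable wrapping a model trained beforehand
    window     -- number of past frames handed to predict_fn
    """
    buffer = deque(maxlen=window)  # the oldest frame is dropped automatically
    for frame in frames:
        buffer.append(frame)
        if len(buffer) < window:
            # Not enough history yet: nothing to predict.
            yield frame, None
        else:
            # Inference only; no training happens inside the loop.
            yield frame, predict_fn(list(buffer))


if __name__ == "__main__":
    # Stand-in "model": predicts that the next value repeats the last one.
    def last_value(history):
        return history[-1]

    for frame, pred in run_inference(range(5), last_value, window=3):
        print(frame, pred)
```

Because the buffer has a fixed maximum length, memory use stays bounded no matter how long the video is, and the training code can live in a separate function or program.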

The problem is that it still cannot predict, and it cannot show two windows to compare the difference. I want to predict future video frames, or predict 1 minute of video from historical video. I want to know whether there are multiple futures, like the television said. Will there be a health problem if I do not act as the predicted video shows?

Maria12 ( 2018-12-30 00:04:12 -0600 )

Will the predicted sound wave, video, or even text influence my future or my health? Could YouTube videos already have been edited by terrorists so that they are hard to predict, or so that they influence the future?

Maria12 ( 2018-12-30 02:37:21 -0600 )

At first it raised a memory error; when there is no memory error, it becomes very slow. How much memory does this need? If I use historical data, how many minutes can this method predict? If I feed the predicted result back in to predict further, what is the maximum number of minutes it can predict?

Maria12 ( 2018-12-30 02:48:56 -0600 )

you probably need to take a course in machine learning before you can go on.

also, while you cannot hold a whole video, uncompressed, in a frames[] buffer (do the math!), none of those problems matter now, as long as you've got the general logic wrong.

it also seems you just took a TEXT based idea of an LSTM model, and tried to apply that to IMAGES.

berak ( 2018-12-30 04:09:23 -0600 )
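
To make the "do the math" remark concrete, here is a rough estimate of the memory needed to hold uncompressed frames. The 480x360 grayscale size comes from the question's code; the rate of 30 frames per second and the helper name `raw_video_bytes` are assumptions for illustration.

```python
def raw_video_bytes(width, height, channels, fps, seconds, bytes_per_value=1):
    """Bytes needed to keep every frame of a clip uncompressed in memory."""
    frames = fps * seconds
    return width * height * channels * bytes_per_value * frames


if __name__ == "__main__":
    # One minute of 480x360 grayscale video stored as uint8 (1 byte per pixel).
    as_uint8 = raw_video_bytes(480, 360, 1, fps=30, seconds=60)
    # The same clip converted to float32 (4 bytes per pixel).
    as_float32 = raw_video_bytes(480, 360, 1, fps=30, seconds=60, bytes_per_value=4)
    print(f"uint8:   {as_uint8 / 1e6:.0f} MB")
    print(f"float32: {as_float32 / 1e6:.0f} MB")
```

Under these assumptions, one minute comes to about 311 MB as uint8 and about 1.2 GB as float32, before counting any memory the model itself needs.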

let me close this. it's back to the drawing board for you. please read some books, take courses, don't try to hack it if you have NO idea what you're doing.

  • you can't do anything with a single video file
  • you need to separate the training from the inference
  • you cannot use a text lstm model for images
berak ( 2018-12-30 04:31:55 -0600 )
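
The bullets above suggest that a next-frame predictor would be trained on (past frames, next frame) pairs drawn from many clips, not on 0/1 labels as in the IMDB text example. As a minimal sketch, assuming grayscale frames stored as NumPy arrays, such pairs could be built like this. `make_windows` and the window length are hypothetical; the 5-D layout (samples, time, height, width, channels) is the one Keras's `ConvLSTM2D` layer takes by default, though no model is built here.

```python
import numpy as np


def make_windows(clip, window):
    """Split one clip into (inputs, targets) for next-frame prediction.

    clip   -- array of shape (num_frames, height, width), grayscale
    window -- number of past frames used as input (hypothetical choice)
    Returns inputs of shape (samples, window, height, width, 1)
    and targets of shape (samples, height, width, 1).
    """
    clip = clip.astype(np.float32) / 255.0   # scale pixels to [0, 1]
    clip = clip[..., np.newaxis]             # add a channel axis
    samples = len(clip) - window
    if samples <= 0:
        raise ValueError("clip is shorter than window + 1 frames")
    inputs = np.stack([clip[i:i + window] for i in range(samples)])
    targets = clip[window:]                  # the frame right after each window
    return inputs, targets


if __name__ == "__main__":
    # Random stand-in data; real training would use frames from many videos.
    rng = np.random.default_rng(0)
    clips = [rng.integers(0, 256, size=(20, 36, 48), dtype=np.uint8) for _ in range(3)]
    pairs = [make_windows(c, window=15) for c in clips]
    x = np.concatenate([p[0] for p in pairs])
    y = np.concatenate([p[1] for p in pairs])
    print(x.shape, y.shape)
```

Here the target of each sample is an image, so a pixel-wise loss such as mean squared error would be a more natural fit than the `binary_crossentropy` on a single 0/1 output used in the text model.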

How should the feature that is 1 or 0 be defined? Should an ordered result use 1 and a non-ordered result use 0?

Maria12 ( 2018-12-30 19:01:27 -0600 )