I am trying to detect patterns in video. To do this, I use optical flow to collect the movement of the pixels in a certain region, and I then want to train an MLP to recognize the pattern. In effect, I have moved from digital image processing to digital signal processing.
The collected values are the displacement of each pixel from one frame to the next.
A 3-second video yields about 5000 values. I have several videos showing the action I want to recognize, and others showing similar actions that are not the one I want. All of these videos are meant to serve as training data for the MLP.
My question: how can I train the MLP on this data? All the training I have done so far was on time series. In this case I need to gather all the data and indicate, through supervised training, when the values represent the action (1) and when they do not (0).
How can I model this data, and tell the network that an entire group of 5000 values represents the action I want, using the R language?
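
To make the setup concrete, here is a minimal sketch of how I imagine the data could be laid out: one row per video, one 0/1 label per row. Everything in it is an assumption for illustration. The flow values are synthetic stand-ins for my real data. `resample_flow` and `make_video` are hypothetical helpers. I picked the `nnet` package, the fixed length of 50 and the 5 hidden units only as examples. Each video is resampled to the same length because the videos do not give exactly the same number of values, and because `nnet` refuses networks with more than 1000 weights unless `MaxNWts` is raised.

```r
library(nnet)  # recommended package, normally installed alongside R

set.seed(42)

# Hypothetical helper: linearly interpolate one video's flow values onto
# n_out equally spaced points, so every video has the same number of features.
resample_flow <- function(values, n_out = 50) {
  approx(x = seq_along(values), y = values, n = n_out)$y
}

# Synthetic stand-in for real optical-flow data (an assumption, not real
# measurements): "action" videos carry an oscillation, the others only noise.
make_video <- function(is_action) {
  n <- sample(4800:5200, 1)              # roughly 5000 values per video
  t <- seq(0, 1, length.out = n)
  signal <- if (is_action) sin(2 * pi * 3 * t) else rep(0, n)
  signal + rnorm(n, sd = 0.3)
}

# 20 videos with the action (label 1) and 20 without it (label 0).
labels <- rep(c(1, 0), each = 20)
videos <- lapply(labels, function(l) make_video(l == 1))

# Design matrix: one row per video, one column per resampled value.
X <- t(sapply(videos, resample_flow))
y <- labels

# Single-hidden-layer MLP with a logistic output unit; entropy = TRUE
# fits the 0/1 target with cross-entropy instead of least squares.
fit <- nnet(x = X, y = y, size = 5, entropy = TRUE,
            decay = 0.01, maxit = 500, trace = FALSE)

# Predicted probability of "action" per video, thresholded at 0.5.
pred <- as.numeric(predict(fit, X) > 0.5)
train_accuracy <- mean(pred == y)
```

The point of the sketch is that the whole video, not each individual value, is the training example: the label applies to the entire row. The accuracy computed here is on the training videos only, so a real evaluation would need videos held out from training.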