
grab and process multiple frames in different threads - opencv semi-related

asked 2015-12-22 04:48:55 -0600

theodore

updated 2015-12-22 05:03:17 -0600

As the title says, I would like to grab and process multiple frames in different threads using one or more circular buffers, and I hope you can point me to the better design. For grabbing frames I am not using OpenCV's VideoCapture class but the libfreenect2 library and its listener, since I am working with the Kinect One sensor, which is not yet compatible with the VideoCapture + OpenNI2 functionality.

My intention is to have one thread grabbing frames continuously, so that the frame rate I get from the Kinect is not affected, and another thread that does all the processing. I might also add a viewer to the grabbing thread as a live monitor, although I do not know how feasible that is or how it would affect the frame rate.

From the libfreenect2 listener I can obtain several views at the same time: one frame from the RGB camera, one from the IR, one from the depth and, with some processing, an RGB-D frame as well. My question is how to share these frames between the two threads. After looking at the questions "Time delay in VideoCapture opencv due to capture buffer" and "waitKey(1) timing issues causing frame rate slow down - fix?", I think a good approach is two threads and a circular buffer. But which is more logical: one circular buffer per view, or a single circular buffer whose elements are a container (e.g. an STL vector<>) holding the frames from every view?


At the moment I am using the second approach, with the vector, adapting @pklab's approach from the first question mentioned above. The code is below:

//! [headers]
#include <iostream>
#include <stdio.h>
#include <iomanip>
#include <tchar.h>
#include <signal.h>
#include <opencv2/opencv.hpp>

#include <thread>
#include <mutex>
#include <queue>
#include <atomic>

#include <libfreenect2/libfreenect2.hpp>
#include <libfreenect2/frame_listener_impl.h>
#include <libfreenect2/registration.h>
#include <libfreenect2/packet_pipeline.h>
#include <libfreenect2/logger.h>
//! [headers]

using namespace std;
using namespace cv;

enum Process { cl, gl, cpu };

std::queue<vector<cv::Mat> > buffer;
std::mutex mtxCam;
std::atomic<bool> grabOn; // this is lock free

void grabber()
{
    //! [context]
    libfreenect2::Freenect2 freenect2;
    libfreenect2::Freenect2Device *dev = nullptr;
    libfreenect2::PacketPipeline *pipeline = nullptr;
    //! [context]

    //! [discovery]
    if(freenect2.enumerateDevices() == 0)
    {
        std::cout << "no device connected!" << endl;
        exit(EXIT_FAILURE);
//        return -1;
    }

    string serial = freenect2.getDefaultDeviceSerialNumber();
//    string serial = "014947350647";

    std::cout << "SERIAL: " << serial << endl;

    //! [discovery]

    int depthProcessor = Process::cl;

    if(depthProcessor == Process::cpu)
    {
        if(!pipeline)
            //! [pipeline]
            pipeline = new libfreenect2::CpuPacketPipeline();
            //! [pipeline]
    } else if (depthProcessor == Process::gl) {
#ifdef LIBFREENECT2_WITH_OPENGL_SUPPORT
        if(!pipeline)
            pipeline = new libfreenect2::OpenGLPacketPipeline();
#else
        std::cout << "OpenGL pipeline is not supported!" << std::endl;
#endif
    } else if (depthProcessor == Process::cl) {
#ifdef LIBFREENECT2_WITH_OPENCL_SUPPORT
        if(!pipeline)
            pipeline = new libfreenect2::OpenCLPacketPipeline();
#else
        std::cout << "OpenCL pipeline is not supported!" << std::endl;
#endif
    }

    if(pipeline)
    {
        //! [open]
        dev = freenect2.openDevice(serial, pipeline);
        //! [open]
    } else {
        dev = freenect2 ...

Comments

It's not exactly the same problem, but I have three threads: one for video capture, one to display the time signal and another to display the FFT.

PS: when I close the program there is still a bug.

LBerger ( 2015-12-22 09:10:50 -0600 )

I don't think there's much difference between the two. It's probably slightly easier to keep track of the second one. It also lets you add further layers of processing (e.g. an edge map) without changing the infrastructure by adding another circular buffer.

Also, as a note: Boost has a lock-free single-producer/single-consumer queue (exactly this problem) that might be easier to use, if it takes care of everything.

A warning, though: the default copy is shallow, so make sure to use .clone() when doing the put. Otherwise, by the time you do the get, you'll find that the contents have changed on you.

You are doing that, but you have the comment questioning it. Yes, you do want to use .clone().

Tetragramm ( 2015-12-22 16:42:58 -0600 )

@LBerger thanks for your code, I'll have a look; I might be able to get some ideas from it. @Tetragramm thanks for the suggestions, they seem reasonable.

theodore ( 2015-12-23 02:49:32 -0600 )

2 answers


answered 2015-12-23 11:05:52 -0600

pklab

updated 2016-01-08 09:44:16 -0600

A better, more general answer is outside the scope of OpenCV and would require many pages to cover the different scenarios and the events and communication between threads.

Short answer: use a thread-safe queue for your logical data.

If you want to use a simple implementation:

  • Think in terms of your application, then find the best compromise between safety, performance and code clarity.
  • What is your logical data? Create a class for it. If the class has cv::Mat or object-pointer members, it should have a copy constructor and an overloaded operator=.
  • Use a thread-safe queue for this class. You can find many examples of general-purpose thread-safe circular buffers.

Mine is below.

The class MyData implements your logical data. The class VerySimpleThreadSafeFIFOBuffer is, as the name says, a generic thread-safe FIFO. See the test functions for how to use it.
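To illustrate the "thread-safe queue" advice, here is a minimal sketch. It is not pklab's VerySimpleThreadSafeFIFOBuffer: the names `BoundedFifo` and `produceAndConsume` are hypothetical, and the drop-oldest policy is an assumption made here so that a slow consumer cannot make the queue grow without limit. For frames, `T` would be a class that deep-copies its cv::Mat members.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical bounded FIFO guarded by one mutex and one condition variable.
template <typename T>
class BoundedFifo {
public:
    explicit BoundedFifo(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Producer side. If the queue is full, the oldest item is discarded
    // (assumed policy). Returns true when an item had to be dropped.
    bool push(T item) {
        bool dropped = false;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (items_.size() == capacity_) {
                items_.pop_front();
                dropped = true;
            }
            items_.push_back(std::move(item));
        }
        notEmpty_.notify_one();  // wake a waiting consumer, if any
        return dropped;
    }

    // Consumer side. Blocks until an item is available, then moves it out.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop_front();
        return item;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::condition_variable notEmpty_;
    std::deque<T> items_;
};

// Demo: a producer thread pushes 1..n while the calling thread pops n items.
// The capacity equals n here, so nothing is dropped.
inline long long produceAndConsume(int n) {
    BoundedFifo<int> fifo(static_cast<std::size_t>(n));
    std::thread producer([&fifo, n] {
        for (int i = 1; i <= n; ++i) fifo.push(i);
    });
    long long sum = 0;
    for (int i = 0; i < n; ++i) sum += fifo.pop();
    producer.join();
    return sum;
}
```

Because every operation takes the same mutex, this sketch also tolerates several producers or consumers, at the price of lock contention. Whether dropping the oldest frame is acceptable depends on the application; blocking the producer when the queue is full is the usual alternative.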

Compare GetDataMemoryCount() and GetMatMemoryCount() with GetItemCount() to see that memory recycling is effective. One test:

Queued Item:496
Queue Max Size: 15
Unique Data Allocation: 32
Unique Mat Allocation: 28

EDIT: Answer to some comments:

The implementation of VerySimpleThreadSafeFIFOBuffer assumes nothing about consumers and producers; it only requires the item class to have a deep-copying copy constructor and an overloaded operator=. This means you can use VerySimpleThreadSafeFIFOBuffer with multiple producers and consumers. It also means you can't use VerySimpleThreadSafeFIFOBuffer<cv::Mat>, because the cv::Mat copy constructor copies the header but not the image (it isn't a deep copy).
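The difference between a header copy and a deep copy can be sketched without OpenCV. In the example below, `FakeMat` is a stand-in invented for illustration: like cv::Mat, copying it shares the pixel buffer and `clone()` duplicates it. `FrameItem` is a hypothetical item class (not pklab's MyData) whose copy constructor and operator= always clone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Stand-in for cv::Mat (assumption): the handle shares its pixel buffer
// when copied, and clone() makes an independent buffer.
struct FakeMat {
    std::shared_ptr<std::vector<std::uint8_t>> pixels;

    explicit FakeMat(std::size_t n = 0, std::uint8_t value = 0)
        : pixels(std::make_shared<std::vector<std::uint8_t>>(n, value)) {}

    FakeMat clone() const {
        FakeMat copy;
        copy.pixels = std::make_shared<std::vector<std::uint8_t>>(*pixels);
        return copy;
    }
};

// Hypothetical queue item: every copy owns its own pixels.
class FrameItem {
public:
    FrameItem() = default;
    explicit FrameItem(const FakeMat& m) : image_(m.clone()) {}

    // Deep copy on construction and on assignment.
    FrameItem(const FrameItem& other) : image_(other.image_.clone()) {}
    FrameItem& operator=(const FrameItem& other) {
        if (this != &other) image_ = other.image_.clone();
        return *this;
    }

    std::uint8_t at(std::size_t i) const {
        assert(i < image_.pixels->size());
        return (*image_.pixels)[i];
    }
    void set(std::size_t i, std::uint8_t v) {
        assert(i < image_.pixels->size());
        (*image_.pixels)[i] = v;
    }

private:
    FakeMat image_;
};
```

With these types, writing through one `FakeMat` is visible through a plain copy of it, whereas a copied `FrameItem` keeps its own values. That is why the queue should hold a wrapper class of this kind rather than the shared handle itself.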

In some circumstances, such as a single producer with a single consumer, you can write your own queue without using locks. This reduces lock overhead and the risk of deadlock.
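A lock-free single-producer/single-consumer queue can be sketched as a ring buffer with two atomic indices. The class name `SpscRing` and its interface are hypothetical, and the sketch is only valid when exactly one thread calls `tryPush` and exactly one other thread calls `tryPop`. The Boost queue mentioned in the comments above (boost::lockfree::spsc_queue) is a tested alternative.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical lock-free ring buffer for ONE producer and ONE consumer.
// T must be default-constructible and move-assignable.
template <typename T>
class SpscRing {
public:
    // One extra slot is kept empty to tell "full" apart from "empty".
    explicit SpscRing(std::size_t capacity)
        : slots_(capacity + 1), head_(0), tail_(0) {
        assert(capacity > 0);
    }

    // Producer thread only. Returns false when the ring is full.
    bool tryPush(T value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % slots_.size();
        if (next == head_.load(std::memory_order_acquire)) return false;
        slots_[tail] = std::move(value);
        // Publish the slot only after it has been written.
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer thread only. Returns false when the ring is empty.
    bool tryPop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;
        out = std::move(slots_[head]);
        // Hand the slot back to the producer only after it has been read.
        head_.store((head + 1) % slots_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> slots_;
    std::atomic<std::size_t> head_;  // next slot to read (written by consumer)
    std::atomic<std::size_t> tail_;  // next slot to write (written by producer)
};
```

Neither call blocks, so the caller decides what to do when the ring is full (drop the frame, retry) or empty (sleep, spin).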

With multiple producers or consumers you have to take care of synchronization, and remember that with multiple consumers each consumer pops its own item from the queue. You can't use a simple queue with, say, one thread that writes frames and another that processes them, because the two will pop different frames. In that case you need a queue with a release mechanism, and you need to know when all consumers have completed their task. Alternatively, you can have a single consumer that pops the item from the queue and then starts N threads that use the same item in read-only mode.
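The single-consumer alternative can be sketched as follows. `Frame` is a stand-in for the popped item, and `processShared`, `sumOf` and `maxOf` are hypothetical names; the point is only that every worker receives the same frame through a const reference and that the frame is released after all workers have been joined.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

using Frame = std::vector<int>;                  // stand-in for one popped frame
using Task = std::function<long(const Frame&)>;  // read-only work on a frame

// Runs each task on the same frame in its own thread. Returning from this
// function means all workers are done and the frame may be released.
std::vector<long> processShared(const Frame& frame, const std::vector<Task>& tasks) {
    std::vector<long> results(tasks.size());
    std::vector<std::thread> workers;
    workers.reserve(tasks.size());
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        // Each worker writes only its own element of results, so no lock is needed.
        workers.emplace_back([&, i] { results[i] = tasks[i](frame); });
    }
    for (auto& w : workers) w.join();
    return results;
}

// Two example read-only tasks.
long sumOf(const Frame& f) { return std::accumulate(f.begin(), f.end(), 0L); }
long maxOf(const Frame& f) {
    assert(!f.empty());
    return *std::max_element(f.begin(), f.end());
}
```

Starting threads for every frame has a cost of its own; a fixed pool of worker threads would avoid it, at the price of more code.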

Threads are useful for doing work during idle time, such as processing while waiting for the next frame.

If you grab at 25 fps (one frame every 40 ms), only a few ms are used to get the frame from the camera, let's say 2 ms. That leaves 38 ms of idle time that can be used for processing, which also means your processing must take less than 38 ms. You can use a queue to protect yourself from the occasional unusually long processing step (you could also calculate the queue length you need).
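The arithmetic above can be written down as a small sketch. The function names are hypothetical, and the queue-length estimate rests on a simplifying assumption: the average processing time stays below the idle budget, and only an occasional step takes longer.

```cpp
#include <cassert>
#include <cmath>

// Time between two frames, in milliseconds.
double framePeriodMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Time left for processing in each period once the grab itself is done.
double idleBudgetMs(double fps, double grabMs) {
    return framePeriodMs(fps) - grabMs;
}

// Rough number of frames that arrive while one unusually long processing
// step of spikeMs is running. The queue should hold at least this many.
int framesDuringSpike(double fps, double spikeMs) {
    return static_cast<int>(std::ceil(spikeMs / framePeriodMs(fps)));
}
```

With the numbers from the text, `idleBudgetMs(25, 2)` gives 38 ms, and a single 200 ms stall at 25 fps lets 5 frames pile up. If processing is slower than the budget on average, no queue length is enough and the queue keeps growing.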

If your application has no idle time, using threads will introduce additional complexity and overhead without any real performance gain.

That said, on multicore/hyperthreading CPUs a single-threaded application performs poorly because it uses just one core. Multithreading spreads threads across the cores of your processor, giving much better performance. But if you use ...


Comments

@pklab man, you keep surprising me :-). Sorry for the late response, but I was away for the holidays. I haven't tried the class you implemented yet, but I will as soon as I am back at my workstation. Just two questions. First, what do you mean by the "@warning THIS IS AN EXAMPLE, MANY IMPROVEMENT CAN BE DONE" on the VerySimpleThreadSafeFIFOBuffer() class? Second, could I use the above implementation with one producer and multiple consumers in multiple threads? At the moment I am using one producer and one consumer, where the consumer calls imwrite() to save frames to disk, but I noticed that the buffer size keeps increasing, which will eventually lead to an out-of-memory situation. Apparently writing images is much slower than grabbing, so I was thinking of using multiple processors/consumers

theodore ( 2015-12-27 09:32:35 -0600 )

to access the buffer in order to be able to run in real time.

theodore ( 2015-12-27 09:33:39 -0600 )

The "warning" means what it says. For example, you could use TBB instead of std::thread, or events, or something else. It's also a disclaimer for users who like to copy and paste an example on a complex subject without trying to understand it!

About imwrite, you should know that it's very fast. VideoWriter works at up to 250 fps on an i3 machine without problems. See the edit in my answer for the other subject.

pklab ( 2016-01-08 09:47:04 -0600 )

Many thanks @pklab, when I get some time I will go through your edits. Anyway, I think your answers are useful for other users as well ;-)

theodore ( 2016-01-09 17:20:48 -0600 )

@pklab thanks for the example. I am trying to run it in Qt and am facing some issues; can you please help with that?

Kira ( 2018-03-19 01:05:30 -0600 )

answered 2017-04-08 05:37:29 -0600

sagiz

You can use the TBB parallel pipeline with OpenCV to do just that.

I have a post about using it with OpenCV here - https://www.theimpossiblecode.com/blo...

TBB has advanced multithreading patterns and tools. Some of them are used internally by OpenCV (if it was built with TBB), and some can be used externally, regardless of whether OpenCV was built with TBB support.

