
How to use readNet (or readNetFromDarknet) instead of readNetFromCaffe?

asked 2018-11-06 19:19:33 -0600

voo_doo

I did object detection using OpenCV by loading a pre-trained MobileNet SSD model, following this post. It reads a video and detects objects without any problem. But I would like to use readNet (or readNetFromDarknet) instead of readNetFromCaffe

net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

because I have pre-trained weights and a cfg file for my own objects only in the Darknet framework. So I simply changed readNetFromCaffe to readNet in the above post and got an error:

Traceback (most recent call last):
  File "people_counter.py", line 124, in <module>
    for i in np.arange(0, detections.shape[2]):
IndexError: tuple index out of range

Here detections is an output from

blob = cv2.dnn.blobFromImage(frame, 1.0/255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
detections = net.forward()

Its shape is the tuple (1, 1, 100, 7) when using readNetFromCaffe.
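For reference, here is a minimal sketch of how a (1, 1, N, 7) blob of this kind is indexed, and why the same loop fails on a 2-D output. The arrays are synthetic (made-up values, not real network output), and the helper name `iter_ssd_detections` is hypothetical, not part of OpenCV or of the linked post:

```python
import numpy as np

def iter_ssd_detections(detections, conf_threshold):
    """Yield (class_id, confidence, box) from a (1, 1, N, 7) SSD-style blob.

    Each row is [batchId, classId, confidence, left, top, right, bottom],
    with box coordinates relative to the frame size.
    """
    for i in range(detections.shape[2]):
        confidence = float(detections[0, 0, i, 2])
        if confidence > conf_threshold:
            class_id = int(detections[0, 0, i, 1])
            box = detections[0, 0, i, 3:7]
            yield class_id, confidence, box

# Synthetic blob with two detections (values are made up for illustration).
ssd_like = np.zeros((1, 1, 2, 7), dtype=np.float32)
ssd_like[0, 0, 0] = [0, 15, 0.9, 0.1, 0.2, 0.5, 0.8]
ssd_like[0, 0, 1] = [0, 7, 0.2, 0.3, 0.3, 0.6, 0.6]

kept = list(iter_ssd_detections(ssd_like, conf_threshold=0.4))

# A 2-D array (such as an N x C output) has no third axis,
# so indexing shape[2] raises the IndexError from the traceback.
two_dim = np.zeros((845, 6), dtype=np.float32)
try:
    two_dim.shape[2]
    raised = False
except IndexError:
    raised = True
```

Only the first synthetic row passes the 0.4 threshold, and the 2-D array reproduces the "tuple index out of range" failure.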

I was kind of expecting that it wouldn't work just by swapping the model loader, so I looked for object detector code that uses readNet and found it here. Reading through the code, I found the equivalent lines:

blob = cv2.dnn.blobFromImage(image, scale, (416,416), (0,0,0), True, crop=False)
net.setInput(blob)
outs = net.forward(get_output_layers(net))

Here, outs is a list with shape (1, 845, 6). But for me to be able to use it right away (here), outs would need to have the same shape as detections. I've got as far as this part and have no clue how to proceed.
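A small sketch of what that shape implies, assuming outs is a Python list holding one 2-D array per output layer (here a single 845 x 6 array). The data is a synthetic placeholder and `stack_outputs` is a hypothetical helper name:

```python
import numpy as np

def stack_outputs(outs):
    """Concatenate per-layer outputs (each N_k x C) into one (sum of N_k) x C array."""
    arrays = [np.asarray(o) for o in outs]
    return np.vstack([a.reshape(-1, a.shape[-1]) for a in arrays])

# Synthetic stand-in for the list returned by net.forward(<output layer names>).
outs = [np.zeros((845, 6), dtype=np.float32)]

rows = stack_outputs(outs)          # one row per candidate detection
list_shape = np.shape(outs)         # shape of the list as a whole
```

Each row of `rows` is then one candidate detection, which is a different layout from the four-dimensional `detections` blob of the Caffe model.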

If something isn't clear: I just need help using readNet (or readNetFromDarknet) instead of readNetFromCaffe in this post.


Comments

darknet can train several different architectures (and e.g. the output shapes from yolo2 and yolo3 will differ), so please be more specific about your model.

and maybe you should use code from here not things you find in blogposts on the net.

berak ( 2018-11-06 21:01:14 -0600 )

@berak, my cfg file is very similar to this, and the weights file is a pre-trained yolo-voc.weights. The reason I am using that is that it gives me everything I need, but it uses Caffe; I'd like to change it to Darknet's YOLO for the reasons mentioned above. Could you help me solve this issue?

voo_doo ( 2018-11-07 00:59:19 -0600 )

again, different architectures require different handling. have a look e.g. here

berak ( 2018-11-07 01:02:43 -0600 )

I had a look, thank you for that reference, I do understand that. Since Darknet produces different output than Caffe, do you mean I can't use Darknet's output inside this post? I am asking about the code modifications (that I have to make) inside the post so that I can handle Darknet's blob output. Sorry if I couldn't convey what I mean.

voo_doo ( 2018-11-07 01:51:21 -0600 )

i won't read any blogposts, to clear up your confusion (no time for that)

but mobilenet and yolo v2 / v3 have each a different output structure, and you need to adapt your code respectively. (even yolo v2 / v3 differ, and you're unclear about, what you're using, exactly !)

berak ( 2018-11-07 01:56:13 -0600 )

@berak thank you for your time, let me work on that for some time and if I succeed I'll come back and leave an answer.

voo_doo ( 2018-11-07 02:25:28 -0600 )

@voo_doo , -- that would be perfect ! ;)

berak ( 2018-11-07 06:35:44 -0600 )

@berak, I could solve the issue by looking at the different architectures as you referred to. Please visit this flow if you are interested. Thanks

voo_doo ( 2018-11-14 03:24:54 -0600 )

@voo_doo , if you could do a short writeup here, that would be really useful for this site. a link to SO - not so much.

berak ( 2018-11-14 03:27:07 -0600 )

1 answer


answered 2018-11-14 23:19:55 -0600

voo_doo

If we look at the code closely, we can see that everything depends on the output detections (line 121), so we need to adapt it to match the outs of this (line 63). After spending almost a day, I came to a reasonable (not perfect) solution. Basically, it is all about the output blobs of readNetFromCaffe and readNetFromDarknet: they have shapes 1x1xNx7 and NxC, respectively. In both cases N is the number of detections, but the per-detection vectors differ. In 1x1xNx7, every detection is a vector of values [batchId, classId, confidence, left, top, right, bottom]. In NxC, C is the number of classes + 5: the first 4 numbers are [center_x, center_y, width, height], the fifth is the objectness confidence, and the rest are the per-class scores (which is why the replacement code reads the scores from index 5 onward). With this in mind, we may replace (lines 124-130)

for i in np.arange(0, detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > args["confidence"]:
        idx = int(detections[0, 0, i, 1])
        if CLASSES[idx] != "person":
            continue
        box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
        (startX, startY, endX, endY) = box.astype("int")

with equivalent lines

for i in np.arange(0, detections.shape[0]):
    # columns 5 onward hold the per-class scores
    scores = detections[i][5:]
    classId = np.argmax(scores)
    confidence = scores[classId]
    if confidence > args["confidence"]:
        idx = int(classId)
        if CLASSES[idx] != "person":
            continue

        # box values are relative to the frame, so scale by its size (W, H)
        center_x = int(detections[i][0] * W)
        center_y = int(detections[i][1] * H)
        width = int(detections[i][2] * W)
        height = int(detections[i][3] * H)
        left = int(center_x - width / 2)
        top = int(center_y - height / 2)
        right = width + left - 1
        bottom = height + top - 1

        # use the corners (not width/height) so the unpacking is correct
        box = [left, top, right, bottom]
        (startX, startY, endX, endY) = box

This way we can keep track of the "person" class using Darknet's cfg and weights and count people up/down with a visualization line.

Again, there might be simpler ways of tracking the detections from a Darknet weights file, but this works for this particular case.
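One possible alternative, sketched under assumptions rather than tested against a real model: convert the N x C rows into the 1x1xNx7 layout once, so that the original loop over detections can stay unchanged. The row layout assumed here is [center_x, center_y, width, height, objectness, class scores...] with values relative to the frame size; the function name `yolo_to_ssd_layout` and the sample values are hypothetical:

```python
import numpy as np

def yolo_to_ssd_layout(rows, conf_threshold=0.0):
    """Convert N x (5 + classes) rows into a (1, 1, M, 7) array.

    Input row:  [center_x, center_y, width, height, objectness, class scores...]
    Output row: [batchId, classId, confidence, left, top, right, bottom]
    All box values stay relative to the frame size (0..1).
    """
    rows = np.asarray(rows, dtype=np.float32)
    scores = rows[:, 5:]
    class_ids = np.argmax(scores, axis=1)
    confidences = scores[np.arange(len(rows)), class_ids]
    keep = confidences > conf_threshold

    cx, cy, w, h = (rows[keep, k] for k in range(4))
    out = np.zeros((1, 1, int(keep.sum()), 7), dtype=np.float32)
    out[0, 0, :, 1] = class_ids[keep]
    out[0, 0, :, 2] = confidences[keep]
    out[0, 0, :, 3] = cx - w / 2  # left
    out[0, 0, :, 4] = cy - h / 2  # top
    out[0, 0, :, 5] = cx + w / 2  # right
    out[0, 0, :, 6] = cy + h / 2  # bottom
    return out

# Synthetic example: 2 rows, 2 classes (values made up for illustration).
rows = np.array([
    [0.5, 0.5, 0.2, 0.4, 0.9, 0.1, 0.8],
    [0.3, 0.3, 0.1, 0.1, 0.2, 0.05, 0.02],
], dtype=np.float32)
detections = yolo_to_ssd_layout(rows, conf_threshold=0.4)
```

With an array in this layout, the original lines that index detections[0, 0, i, ...] and multiply the box by np.array([W, H, W, H]) should work as they are, provided CLASSES lists the Darknet class names in the model's order.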

A reference: more about blobs output by readNetFromCaffe and readNetFromDarknet


Comments

thank you ;)

berak ( 2018-11-15 00:47:23 -0600 )

Stats

Asked: 2018-11-06 19:19:33 -0600

Seen: 5,670 times

Last updated: Nov 14 '18