Having trouble using VGG16 to classify images

Hello all,

Following this tutorial: https://www.pyimagesearch.com/2017/08/21/deep-learning-with-opencv/ - the author uses the MobileNet-SSD network architecture. I've found this network to be very hit and miss: it fails to capture even the most basic objects from a car dash cam (other cars, trucks, etc.).

I have altered the code to use the VGG16 network, but I can't get it working, and I suspect it has something to do with the way I'm converting the image to a blob with the cv2.dnn.blobFromImage() method. I'm resizing my image to the required 224 x 224 format, but it still produces an error.
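For reference, this is my understanding of what the blobFromImage(frame, 1, (224, 224), (104, 117, 123)) call should be doing, sketched in plain NumPy (the real function also handles resizing and interpolation, which I've skipped by starting from a 224 x 224 array):

```python
import numpy as np

# Sketch of blobFromImage with scalefactor=1, size=(224, 224), mean=(104, 117, 123):
frame = np.random.randint(0, 256, (224, 224, 3)).astype(np.float32)  # H x W x C, BGR
mean = np.array([104, 117, 123], dtype=np.float32)                   # per-channel mean (B, G, R)
blob = (frame - mean).transpose(2, 0, 1)[np.newaxis, ...]            # reorder to N x C x H x W
print(blob.shape)  # (1, 3, 224, 224)
```

So as far as I can tell, the blob I'm feeding in should have the shape VGG16 expects.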

At the moment the goal of my code is simply to convert each frame to a blob and print the return value of net.forward().

import imutils
import numpy as np
import argparse
import cv2

ap = argparse.ArgumentParser()
ap.add_argument('-v', '--video', required=True,
    help='path to input video')
ap.add_argument('-p', '--prototxt', required=True,
    help='path to Caffe deploy prototxt file')
ap.add_argument('-m', '--model', required=True,
    help='path to Caffe pre-trained model')
ap.add_argument('-l', '--labels', required=True,
    help='path to ImageNet labels')
ap.add_argument('-c', '--confidence', type=float, default=0.5,
    help='minimum probability to filter weak detections')
args = vars(ap.parse_args())

rows = open(args['labels']).read().strip().split('\n')
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]

COLORS = np.random.uniform(0, 255, size=(len(classes), 3))

net = cv2.dnn.readNetFromCaffe(args['prototxt'], args['model'])

video = cv2.VideoCapture(args['video'])
while True:
    (grabbed, frame) = video.read()

    if not grabbed:
        break

    frame = imutils.resize(frame, width=500)
    frameClone = frame.copy()

    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1, (224, 224), (104, 117, 123))

    net.setInput(blob)
    # NOTE: the error is produced on the line below - see the bottom for the error output
    detections = net.forward()
    print(detections)
    wait = input()  # pause here so the printed output can be inspected
    idxs = np.argsort(detections[0])[::-1][:5]

    for (i, idx) in enumerate(idxs):
        if i == 0:
            label = "Label: {}, {:.2f}%".format(classes[idx],
                detections[0][idx] * 100)
            cv2.putText(frameClone, label, (5, 25), cv2.FONT_HERSHEY_SIMPLEX,
                0.7, (0, 0, 255), 2)

    cv2.imshow('Video', frameClone)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video.release()
cv2.destroyAllWindows()
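In case it matters, the label-parsing lines near the top assume synset_words.txt lines of the form `n01440764 tench, Tinca tinca`. A quick standalone check of that parsing with two made-up lines:

```python
# Standalone check of the synset_words.txt parsing used in the script above.
raw = "n01440764 tench, Tinca tinca\nn01443537 goldfish, Carassius auratus"
rows = raw.strip().split("\n")
# drop the synset ID, keep only the first comma-separated name
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]
print(classes)  # ['tench', 'goldfish']
```

That part seems to work fine on its own, so I don't think the labels are the problem.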

As per above, the code breaks at detections = net.forward(). The full error output is:

PS C:\Users\Quo\OneDrive\Projects\Python\Computer Vision\Work\Video> python .\deep_learning_object_detection_video.py -v .\videos\1522-front.mp4 -p .\prototxt\VGG_ILSVRC_16_layers_deploy.prototxt.txt -m C:\Users\Quo\Downloads\VGG16_SOD_finetune.caffemodel -l .\labels\synset_words.txt
[libprotobuf WARNING D:\Build\OpenCV\opencv-3.3.1\3rdparty\protobuf\src\google\protobuf\io\coded_stream.cc:605] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING D:\Build\OpenCV\opencv-3.3.1\3rdparty\protobuf\src\google\protobuf\io\coded_stream.cc:82] The total number of bytes read was 538683157
OpenCV Error: Assertion failed (1 <= blobs.size() && blobs.size() <= 2) in cv::dnn::FullyConnectedLayerImpl::FullyConnectedLayerImpl, file D:\Build\OpenCV\opencv-3.3.1\modules\dnn\src\layers\fully_connected_layer.cpp, line 71
Traceback (most recent call last):
  File ".\deep_learning_object_detection_video.py", line 43, in <module>
    detections = net.forward()
cv2.error: D:\Build\OpenCV\opencv-3.3.1\modules\dnn\src\layers\fully_connected_layer.cpp:71: error: (-215) 1 <= blobs.size() && blobs.size() <= 2 in function cv::dnn::FullyConnectedLayerImpl::FullyConnectedLayerImpl

I'm not sure what any of these errors mean, but at a glance it looks like something to do with the blob. I fear I'm not passing in the correct arguments for the VGG model.
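For context, this is what I expect to do with the output once net.forward() succeeds - take the top-5 indices of a 1 x 1000 probability array (dummy data here, since I can't get a real forward pass out of the network):

```python
import numpy as np

# Dummy stand-in for a VGG16 classification output: 1 x 1000 class probabilities.
detections = np.random.rand(1, 1000)
# Indices of the 5 highest scores, best first.
idxs = np.argsort(detections[0])[::-1][:5]
print(idxs.shape)  # (5,)
```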

Does anyone have any suggestions? Any help is greatly appreciated.
