
How to recognize outdoor objects from different angles

asked 2019-06-10 07:52:33 -0500

Agribot


I'd like to get my wheeled robot, with its vision system, to identify various objects located in an outdoor car park.

The objects are things like a door, a window, a lamp-post, a gate, a tree, etc.

The objective is that the robot will start at an undefined position, move across about 100 meters, and search for one or more objects, and thus determine its location and position in the car park.

The problem I foresee (among several challenges) is that the camera and the vision system will see the objects from different angles and positions, so simply programming it to learn and identify each object using standard object-recognition code might not work.

If the wheeled robot approaches from the right-hand side one day but from the left-hand side on another, how can it reliably identify an object or feature?

What general solution would work? (I do not want to try complex 3-D solid-object recognition.)

What Python-based solution can detect objects viewed from different positions and then figure out what the object is, from a list of about 20 different objects?

I've already created various basic/intermediate-level applications that detect objects: a red ball, faces, a dog, a cat, etc.

I'm using Python on a Raspberry Pi, networked to a Windows laptop over Wi-Fi using client-server socket code.

I'd like to get your advice before starting to code - I hope to avoid creating a solution that is fundamentally wrong and having to start all over again!

Thanks for your suggestions,



1 answer


answered 2019-06-11 18:11:16 -0500

Witek

updated 2019-06-11 18:16:54 -0500

One of the best methods (to my knowledge) would be to use a deep neural network. And one of the best among them is YOLO (YOLOv3, to be exact). It will detect and recognize objects almost regardless of the viewpoint.

Although DNNs generally require time-consuming training, there are many out there that have already been trained. For example, YOLO was trained on the COCO dataset, which contains 80 common object categories, and it is ready to use out of the box. If your robot is to recognize these objects, you're home. If, however, your viewpoint is going to be drastically different, or you want to recognize objects that are not in the COCO dataset, you will have to (re)train the network for your specific purposes.

YOLO will run on the RPi but, naturally, it will not be as fast or as accurate as when run on a PC (if I am not wrong, due to memory requirements you have to run a smaller version of YOLO on the RPi, called YOLO-tiny, which is faster but less accurate).

On a PC you can run the full version of YOLO in real time. The latter is unfortunately not possible with the OpenCV DNN module implementation, as it does not use the power of an NVIDIA graphics card, only the CPU. Still, on a decent gaming laptop you can get about 5 frames per second at full-HD resolution just using the CPU, which is the simplest and fastest way to go if you want to experiment with YOLO and OpenCV. Should you have a low-voltage CPU, however, expect 1-3 fps.

If you want to unleash the full power of YOLO, you have to use the original Darknet project, or rather one of its Python wrappers (like this one? I don't know, I never used it). And, of course, you need a good graphics card, like a GTX 1080 or similar. This solution will, however, require streaming the video from the RPi to the PC, which might be a bottleneck.
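To give you an idea, the OpenCV DNN route could look roughly like this. This is only a sketch, assuming you have downloaded the standard `yolov3.cfg`, `yolov3.weights` and `coco.names` files from the Darknet project (the file names and `carpark.jpg` are placeholders for your own setup); the post-processing follows the usual convention that each output row is `[cx, cy, w, h, objectness, 80 class scores]` in frame-relative coordinates:

```python
import numpy as np

def parse_detections(layer_outputs, frame_w, frame_h, conf_threshold=0.5):
    """Convert raw YOLO output rows into pixel boxes, confidences, class ids."""
    boxes, confidences, class_ids = [], [], []
    for output in layer_outputs:
        for row in output:
            scores = row[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence < conf_threshold:
                continue
            # Scale the normalised centre/size up to pixels, then convert
            # to a top-left-corner box for drawing and NMS.
            cx, cy, w, h = row[0:4] * np.array(
                [frame_w, frame_h, frame_w, frame_h]
            )
            boxes.append([int(cx - w / 2), int(cy - h / 2), int(w), int(h)])
            confidences.append(confidence)
            class_ids.append(class_id)
    return boxes, confidences, class_ids

if __name__ == "__main__":
    import cv2  # needs opencv-python (3.4.2+ for getUnconnectedOutLayersNames)

    net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
    labels = open("coco.names").read().strip().split("\n")

    frame = cv2.imread("carpark.jpg")
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(
        frame, 1 / 255.0, (416, 416), swapRB=True, crop=False
    )
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())

    boxes, confidences, class_ids = parse_detections(outputs, w, h)
    # Non-maximum suppression removes overlapping duplicates of one object.
    keep = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
    for i in np.array(keep).flatten():
        print(labels[class_ids[i]], confidences[i], boxes[i])
```

On the RPi you would swap in the `yolov3-tiny` cfg/weights pair; the rest of the code stays the same.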

Have a look at:

