# Simple augmented reality using OpenCV

Hi everyone! I'm a total newbie at OpenCV and computer vision in general, and I'm facing some problems. I'm trying to build a simple program that detects a planar object and draws a cube (or at least 3D axes) on it. I used this guide to build it, and here is my code:

```python
import sys
import numpy as np
import cv2

CANNY_LOW = 30
CANNY_HIGH = 200
GAUSSIAN = (5, 5)

MIN_RECT_AREA = 100
IS_CLOSED = True
DELTA = 0.01

BLUE = (255, 0, 0)
GREEN = (0, 255, 0)
RED = (0, 0, 255)
BLACK = (0, 0, 0)


def show(image):
    cv2.namedWindow('image', cv2.WINDOW_NORMAL)
    cv2.imshow('image', image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


def filt(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edged = cv2.Canny(gray, CANNY_LOW, CANNY_HIGH)
    blured = cv2.GaussianBlur(edged, GAUSSIAN, 5)
    return blured


def search_for_table_corners(raw_image):
    filtered = filt(raw_image)
    _, cnts, _ = cv2.findContours(filtered,
                                  cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
    for cnt in cnts:
        cnt_len = cv2.arcLength(cnt, IS_CLOSED)
        approx = cv2.approxPolyDP(cnt, DELTA * cnt_len, IS_CLOSED)
        if len(approx) == 4:
            cv2.drawContours(raw_image, [approx], -1, BLACK, 4)
            return np.float32(approx)
    return None


def draw(img, corners, imgpts):
    corner = tuple(corners[0].ravel())
    img = cv2.line(img, corner, tuple(imgpts[0].ravel()), BLUE, 5)
    img = cv2.line(img, corner, tuple(imgpts[1].ravel()), GREEN, 5)
    img = cv2.line(img, corner, tuple(imgpts[2].ravel()), RED, 5)
    return img


def generate_camera_matrix(image):
    h, w = image.shape[:2]
    # let it be a full-frame sensor
    sx, sy = (36, 24)
    # focal length
    f = 50
    fx = w * f / sx
    fy = h * f / sy
    cx = w / 2
    cy = h / 2
    mtx = np.zeros((3, 3), np.float32)
    mtx[0, 0] = fx  # [ fx  0  cx ]
    mtx[0, 2] = cx  # [  0 fy  cy ]
    mtx[1, 1] = fy  # [  0  0   1 ]
    mtx[1, 2] = cy
    mtx[2, 2] = 1
    return mtx


def generate_distortions():
    return np.zeros((1, 4), np.float32)


def get_object_points(corners):
    x1, y1 = corners[0][0]
    x2, y2 = corners[1][0]
    x3, y3 = corners[2][0]
    x4, y4 = corners[3][0]
    return np.float32([
        # hardbone
        [x2, y2, 0],
        [x1, y1, 0],
        [x3, y3, 0],
        [x4, y4, 0],
    ])


def generate_axis(a):
    axis = np.float32([[a, 0, 0], [0, a, 0], [0, 0, -a]]).reshape(-1, 3)
    return axis


def get_corners_subpixels(raw_image, corners):
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER,
                30, 0.001)
    gray = cv2.cvtColor(raw_image, cv2.COLOR_BGR2GRAY)
    corners_subpxs = cv2.cornerSubPix(gray, corners,
                                      (11, 11), (-1, -1), criteria)
    return corners_subpxs


def estimate_pose(raw_image, table_corners):
    print('table_corners:\n', table_corners, '\n', '-' * 70)
    object_points = get_object_points(table_corners)
    print('object_points:\n', object_points, '\n', '-' * 70)
    corners_subpxs = get_corners_subpixels(raw_image, table_corners)
    print('corners_subpxs:\n', corners_subpxs, '\n', '-' * 70)
    camera_matrix = generate_camera_matrix(raw_image)
    print('camera_matrix:\n', camera_matrix, '\n', '-' * 70)
    distortions = generate_distortions()
    rotation_vec, translation_vec = cv2.solvePnPRansac(
        object_points, corners_subpxs, camera_matrix, distortions,
        iterationsCount=500, reprojectionError=50)[1:3]
    print('rotation_vec:\n', rotation_vec, '\n', '-' * 70)
    print('translation_vec:\n', translation_vec, '\n', '-' * 70)
    return rotation_vec, translation_vec, camera_matrix, distortions, \
        corners_subpxs


def create_canvas(image):
    return image.copy()


def get_projection_points(raw_image, table_corners):
    rvecs, tvecs, mcam, dist, corn2 = estimate_pose(raw_image, table_corners)
    size = round(raw_image.shape ...
```


You seem to have a fundamental misunderstanding of how solvePnP works. HERE is a tutorial that should explain things better. In short, both your imagePoints and your worldPoints come from the same place. They're actually the same numbers. That can't work.

Your image points are correct: the locations, in pixels, inside the image.

Your world points should be the actual corners of the table. So for a pretend table that is 1x2 meters, you have [0,0,0], [1,0,0], [0,2,0], [1,2,0]. And then you need to make the order of your image points and world points match. In other words, the world point you called [0,0,0] must always be paired with the pixel coordinates of that same corner.
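To make the pairing concrete, here is a minimal sketch for that 1x2 meter table. The pixel values are invented for illustration; in the question's code they would come from approxPolyDP, reordered to match the world points:

```python
import numpy as np

# World points: the real table corners, measured in meters, with Z = 0
# because the tabletop is planar. The order is arbitrary but fixed.
object_points = np.float32([
    [0, 0, 0],   # one corner of the table, chosen as the origin
    [1, 0, 0],   # the corner 1 m away along the short edge
    [0, 2, 0],   # the corner 2 m away along the long edge
    [1, 2, 0],   # the corner diagonally opposite the origin
])

# Image points: where those SAME corners appear in the photo, in pixels.
# Row i here must describe the same physical corner as row i above --
# that correspondence is the whole input to solvePnP.
image_points = np.float32([
    [210, 120],  # pixel location of the [0, 0, 0] corner
    [530, 135],  # pixel location of the [1, 0, 0] corner
    [150, 400],  # pixel location of the [0, 2, 0] corner
    [590, 420],  # pixel location of the [1, 2, 0] corner
])
```

Note that the world points are in physical units and have nothing to do with pixel coordinates; that separation is exactly what the question's `get_object_points` is missing.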

Lastly, and least important, you don't have enough points to use RANSAC. Just use solvePnP with the SOLVEPNP_P3P flag.


Thanks for the answer! So how can I find the actual corners of the table if I have only the 2D image?

( 2016-10-23 11:37:00 -0500 )

Well, you have to get out a tape measure. Or if you know the width/height ratio of the table, you can just use unitless numbers. For example, if you know it's a square table, you can just use 1x1.

This is why most augmented reality stuff right now uses markers of some kind, like this: ARUCO

( 2016-10-23 17:11:43 -0500 )

Well, I've got your point; you explained it pretty clearly. Is there a completely different way to do the same task in OpenCV?

( 2016-10-24 03:33:21 -0500 )

ARUCO or another marker system is the best/simplest way. Pretty easy. If you have a stereo camera you can measure distances and model the world that way, though that needs a lot of processing power. Or you can use some other way of knowing where the camera is, combined with motion, to model the world. That last approach needs a very precise locating system and a lot of processing power.

( 2016-10-24 17:18:45 -0500 )

I don't have a stereo camera. The whole task is to find a planar rectangle (a table, for example) from a web camera and draw a 3D cube on it, meaning that when I move the camera around the planar object, the cube's projection changes too. I tried to find the table using cascades, but I didn't train the network correctly, so I decided to find the planar rectangle using the algorithm above. Is there another way to do this task without any additional devices such as a stereo camera?

( 2016-10-25 02:59:57 -0500 )

Not unless you know the aspect ratio (and preferably the exact size) of the rectangle. If you don't, you don't have enough information to find orientation. Knowing things about the rectangle is basically the same as a marker system, just not as easy, and you can't tell which side you're on.

There are lots of rectangles of different sizes and shapes that can match what you see at different orientations. So you can't tell which is correct without some outside knowledge, whether it's the size, another camera, or a marker.
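The simplest form of that ambiguity can be shown with a few lines of numpy (the intrinsics here are illustrative): doubling both the rectangle's size and its distance leaves every projected pixel untouched, so a single view cannot recover the size.

```python
import numpy as np

# Pinhole intrinsics: focal length 800 px, principal point (320, 240).
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])

def project(points_3d):
    """Project Nx3 camera-frame points to Nx2 pixels with intrinsics K."""
    p = points_3d @ K.T
    return p[:, :2] / p[:, 2:3]

# A 1x1 m square 4 m away, and a 2x2 m square 8 m away.
square_1m = np.array([[0, 0, 4.0], [1, 0, 4], [1, 1, 4], [0, 1, 4]])
square_2m = 2.0 * square_1m  # twice as big, twice as far

print(project(square_1m))
print(project(square_2m))  # identical pixel coordinates
```

Orientation adds even more degrees of freedom than this pure scaling, which is why some outside knowledge is required.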

( 2016-10-25 10:58:54 -0500 )
