Trying to get 3D depth from stereo images to work

asked 2020-04-16 07:51:02 -0500

oziphantom

updated 2020-04-17 00:48:46 -0500

So I've been tumbling down a rabbit hole and I'm at a loss. In Python (the code is below) I've been following many tutorials and trying to convert the C++ examples to Python. At first I was getting transformation matrices that would basically turn the image into a Salvador Dali picture. But by not using the sample checkerboard and switching to one that has different row and column counts, I've been able to get something that looks OK. I take images of the checkerboard displayed on an iPad (5, retina screen), as that guarantees the pattern is flat and my printer sucks. But the disparity map I get might as well be white noise; it contains nothing recognisable from the image. The images are my test checkerboard images, which show me holding an iPad in front of a wall. When I try it with more complicated images I still basically get white noise. The photos are 3584x2016.

What am I doing wrong? What should I try next?

Updates: I tried swapping my column and row counts in case things were getting flipped at some point; it just gave me a pure grey disparity map.

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
import glob
from PIL import Image

def convertPILToCV(pil_image_in):
    """
    This will take a PIL loaded image and return it as a CV image.
    Image should be converted to RGB first.
    :param pil_image_in: a PIL Image object, seeked to the current image
    :return: a cv image representation in BGR
    """
    open_cv_image = np.array(pil_image_in)
    # Convert RGB to BGR
    img_cv = open_cv_image[:, :, ::-1].copy()
    return img_cv

def findCornersInCVImage(cv_image_in):
    """
    This will convert the image to grey, then return the findChessboardCorners result.
    :param cv_image_in: a cv image in BGR
    :return: ret, corners from findChessboardCorners, and the grey image
    """
    gray = cv.cvtColor(cv_image_in, cv.COLOR_BGR2GRAY)
    #  scale_percent = 50  # percent of original size
    #  width = int(img.shape[1] * scale_percent / 100)
    #  height = int(img.shape[0] * scale_percent / 100)
    #  dim = (width, height)
    # resize image
    #  gray = cv.resize(gray, dim, interpolation=cv.INTER_AREA)

    # gray = cv.bitwise_not(gray)
    # grey_shape = gray.shape[::-1]
    # cv.imshow('gray', gray)
    # Find the chess board corners

    ret, corners = cv.findChessboardCorners(gray, (check_rows, check_columns), flags)

    return ret, corners, gray

check_columns = 7
check_rows = 9

# termination criteria
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(8,6,0)
objpL = np.zeros((check_rows*check_columns,3), np.float32)
objpL[:,:2] = np.mgrid[0:check_rows,0:check_columns].T.reshape(-1,2)
objpR = np.zeros((check_rows*check_columns,3), np.float32)
objpR[:,:2] = np.mgrid[0:check_rows,0:check_columns].T.reshape(-1,2)

# Arrays to store object points and image points from all the images.
objpointsL = [] # 3d point in real world space
imgpointsL = [] # 2d points in image plane.
objpointsR = [] # 3d point in real world space
imgpointsR = [] # 2d points in image plane.

images = glob.glob(r'F:\3D_calib\*.mpo')
# images = [r'F:\3D_calib\DSCF1901.JPG']
grey_shape = []

flags =  flags | cv ...
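Since the post mentions swapping the row and column counts, here is a minimal sketch of how the pattern size and the object-point grid relate. It only uses NumPy. The helper name `make_object_points` and the `square_size` parameter are assumptions for illustration and are not part of the original code. The idea is that the tuple given to `cv.findChessboardCorners` (inner corners per row, inner corners per column) should be the same tuple used to build the grid, so that object point i corresponds to detected corner i.

```python
import numpy as np

def make_object_points(pattern_size, square_size=1.0):
    """Return an (N, 3) float32 grid of board corners on the Z=0 plane.

    pattern_size is (points_per_row, points_per_column), i.e. the same
    tuple that is passed to cv.findChessboardCorners.
    square_size is a hypothetical scale (e.g. square edge length in mm).
    """
    per_row, per_col = pattern_size
    objp = np.zeros((per_row * per_col, 3), np.float32)
    # x varies fastest, so the points are listed one board row at a time.
    objp[:, :2] = np.mgrid[0:per_row, 0:per_col].T.reshape(-1, 2)
    return objp * square_size

pattern_size = (9, 7)  # same values as (check_rows, check_columns) in the post
objp = make_object_points(pattern_size)
print(objp.shape)  # number of corners x 3
print(objp[:3])    # first three corners along the first board row
print(objp[-1])    # last corner of the grid
```

Because both cameras see the same physical board, one grid can be appended for the left and the right view alike; separate `objpL` and `objpR` arrays are not strictly needed.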



My ROIs are 0,0,0,0 and the ret from cv.stereoCalibrate is 10.59214738462698 with no flags and 16 with some flags. I've read elsewhere that I want it < 1.0? I printed out the pattern and took photos using the printed one instead of the iPad, and I get the same results. It turns out there is a bug in the Pillow read code: it didn't actually get the R frame, and hence was giving the function the same L frame for both sides. Fixing this issue now gets me a ret value of 126.21512348449606, and the ROIs are still 0,0,0,0. But the disparity maps no longer look like noise; they look more like a picture of water at night, mostly black with wavy peaks where the waves catch the light.

oziphantom ( 2020-04-17 08:21:05 -0500 )
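Two checks from the comment above can be sketched with NumPy alone. The first guards against the left and right frames being the same image, which is what the read bug caused. The second shows what an RMS reprojection error in pixels measures, which is the kind of value `cv.stereoCalibrate` returns. Both function names and the `tol` threshold are assumptions for illustration. The exact normalisation OpenCV uses internally is not reproduced here.

```python
import numpy as np

def frames_look_identical(left, right, tol=1.0):
    """True if two same-shaped images differ by less than tol on average.

    tol is a hypothetical threshold in grey levels; tune it for real data.
    """
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    if left.shape != right.shape:
        return False
    return bool(np.abs(left - right).mean() < tol)

def rms_reprojection_error(observed, reprojected):
    """Root-mean-square distance in pixels between two sets of 2D points."""
    observed = np.asarray(observed, dtype=np.float64).reshape(-1, 2)
    reprojected = np.asarray(reprojected, dtype=np.float64).reshape(-1, 2)
    sq_dist = np.sum((observed - reprojected) ** 2, axis=1)
    return float(np.sqrt(sq_dist.mean()))

# Synthetic stand-ins for a decoded stereo pair.
left = np.zeros((4, 6, 3), np.uint8)
right = left.copy()
right[:, 3:] = 200  # the right view differs from the left
print(frames_look_identical(left, left.copy()))  # same frame passed twice
print(frames_look_identical(left, right))        # genuinely different frames

# One corner is off by (3, 4) pixels, the other matches exactly.
observed = [[0.0, 0.0], [10.0, 0.0]]
reprojected = [[3.0, 4.0], [10.0, 0.0]]
print(rms_reprojection_error(observed, reprojected))
```

Running the identity check on every L/R pair before calibration would have flagged the duplicated L frame early. An RMS of 10 or more pixels indicates corners that are far from where the fitted model puts them.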

I found on another page that a high RMS means you have images where the corners are identified in a different order. I guess this means across all the images and not just within an R/L pair. After trimming out the "odd" ones, I now get the error down to ~8. Using the leftXX/rightXX images from the samples does get me down to 0.44; however, the depth match is "noise" again. So running the stereo_match example mostly works. I then use Calibrate on both the leftXX and rightXX images. The results then won't work as a stereo pair because they are different sizes, so I stop it from cropping to the ROI and do both again. I then put a "correct l and r" into the stereo_match sample, and it gives a useless result that is mostly black, with no areas that correlate to anything in the image.

oziphantom ( 2020-04-17 10:20:38 -0500 )
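The "different order" problem mentioned above can be screened for automatically instead of trimming pairs by eye. The sketch below is a heuristic and not an OpenCV feature. It assumes the two cameras sit roughly side by side with similar orientation, so that the vector from the first to the last detected corner should point the same way in both views. The names `same_corner_order` and `keep_consistent_pairs` are hypothetical.

```python
import numpy as np

def same_corner_order(corners_left, corners_right):
    """Heuristic: do both views walk the board in the same direction?

    Accepts arrays shaped (N, 1, 2) or (N, 2), as corner detectors return.
    """
    left = np.asarray(corners_left, dtype=np.float64).reshape(-1, 2)
    right = np.asarray(corners_right, dtype=np.float64).reshape(-1, 2)
    dir_left = left[-1] - left[0]     # first-to-last corner in the left view
    dir_right = right[-1] - right[0]  # first-to-last corner in the right view
    return bool(np.dot(dir_left, dir_right) > 0)

def keep_consistent_pairs(pairs):
    """Drop (left, right) corner pairs whose ordering disagrees."""
    return [(l, r) for l, r in pairs if same_corner_order(l, r)]

# Synthetic 9x7 corner grid standing in for a detection result.
grid = np.mgrid[0:9, 0:7].T.reshape(-1, 1, 2).astype(np.float32) * 50.0
shifted = grid + np.float32([5.0, 0.0])  # same order, small horizontal shift
reversed_order = shifted[::-1]           # same corners, listed back to front

pairs = [(grid, shifted), (grid, reversed_order)]
print(same_corner_order(grid, shifted))
print(same_corner_order(grid, reversed_order))
print(len(keep_consistent_pairs(pairs)))
```

Instead of discarding a mismatched pair, reversing one side (`corners_right[::-1]`) may recover it, provided the board really was detected back to front and not transposed.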