
Hilman's profile - activity

2021-01-10 18:32:22 -0600 received badge  Famous Question (source)
2020-12-15 11:48:38 -0600 received badge  Notable Question (source)
2019-08-16 09:24:56 -0600 received badge  Popular Question (source)
2019-08-08 08:34:20 -0600 received badge  Famous Question (source)
2019-01-22 20:14:47 -0600 received badge  Notable Question (source)
2018-08-29 07:34:20 -0600 received badge  Notable Question (source)
2018-06-08 02:47:10 -0600 received badge  Popular Question (source)
2018-03-05 01:37:49 -0600 received badge  Popular Question (source)
2017-12-29 09:27:33 -0600 received badge  Famous Question (source)
2017-06-19 05:30:53 -0600 received badge  Notable Question (source)
2017-04-22 14:28:16 -0600 marked best answer Can a paper printed chessboard affect camera calibration?

In calibrating my camera, I used a paper-printed chessboard. I guess that because of the nature of the paper, it is not "perfect", since there will be some crumples/wrinkles. Of the 15 images I have used, below are 3 examples showing the original image and its undistorted version.

Original image 1 Undistorted image 1

Original image 2 Undistorted image 2

Original image 3 Undistorted image 3

As you can see, there are some wrinkles in the paper (especially at the edges). Can this somehow greatly affect the camera matrix and distortion coefficients that will be calculated, or am I just being overly paranoid here?

Thanks.

2017-03-28 13:00:43 -0600 received badge  Popular Question (source)
2017-03-10 00:07:36 -0600 commented answer What is the basic pipeline of 3D reconstruction from more than two images?

I have read a bit on that topic (I own that book), but I believe that is just for 3 images. What if the images are arbitrary? Any idea? I have also done some searching, and it looks like OpenCV's solvePnP can be used, although I am not really sure how. Any links/keywords I can use to read more? That would be very helpful.

2017-02-13 01:39:51 -0600 asked a question What is the basic pipeline of 3D reconstruction from more than two images?

I am doing 3D reconstruction using OpenCV with Python and I have already reconstructed the 3D structure from two images. Let us name the images image_1 and image_2. Now I want to add another view from a third image, image_3. From what I understand, this has to do with Bundle Adjustment, but I have trouble understanding how things should be done.

So, to make my pipeline simpler, I have decided to use Python's sba module, which is described as a wrapper for Lourakis' sparse bundle adjustment C library. Basically, what I understand is that I will need to pass the 3D points of a structure and their corresponding 2D points from multiple images. In the example given with the source code, a point looks something like this:

|0.188427 -0.036568 -0.851884| 3 | 0 |500.7 521.2| 1 |902.8 974.4| 2 |147.9 580.2|

From the first to the eighth column, this is what it represents:

  1. The 3D point
  2. The number of views/images in which that 3D point is visible
  3. The mask (here, 0 means that the 3D point is visible in the first image, the 0'th)
  4. The 2D point of that 3D point in the first image

The 5th, 6th, 7th and 8th columns have the same meaning as the 3rd and 4th columns (for the second and third images).

As for my situation, I have made a 3D reconstruction called structure_1 from image_1 and image_2. Then I have also made a 3D reconstruction called structure_2 from image_2 and image_3.

Then, consider that a point called point_1 is visible in image_1, image_2 and image_3.

This means I have two 3D points (one from structure_1 and one from structure_2) for point_1. How should I put this into the same form as the example in the source code? The example already has a single 3D point seen from three views, as shown in the snippet above.
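
To make the question concrete, here is a minimal sketch of how I imagine writing such a points line, assuming each track stores its 3D point and an {image_index: (x, y)} map of observations (a hypothetical structure of mine, not the sba module's API):

# Each track: a 3D point plus an {image_index: (x, y)} map of observations
# (hypothetical data structure, not part of the sba module)
tracks = [
    {"xyz": (0.188427, -0.036568, -0.851884),
     "obs": {0: (500.7, 521.2), 1: (902.8, 974.4), 2: (147.9, 580.2)}},
]

with open("points.txt", "w") as f:
    for track in tracks:
        # X Y Z  n_views  view_idx x y  view_idx x y ...
        parts = ["%f %f %f" % track["xyz"], str(len(track["obs"]))]
        for img_idx, (x, y) in sorted(track["obs"].items()):
            parts.append("%d %f %f" % (img_idx, x, y))
        f.write(" ".join(parts) + "\n")

My difficulty is what to put in "xyz" for point_1, since structure_1 and structure_2 each give me a (different) 3D coordinate for it.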

2017-02-11 04:57:16 -0600 marked best answer What does the getOptimalNewCameraMatrix function do?

For example, in this tutorial, I have some problems understanding cv2.getOptimalNewCameraMatrix. I have already read the documentation and done some searching, but I still can't understand it. I don't get the meaning of this function's return value:

Returns the new camera matrix based on the free scaling parameter

What does "free scaling parameter" means?

I hope someone can give an explanation and maybe some examples of this. Thanks.
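
For reference, here is a minimal sketch of how I am calling it (the intrinsics are placeholder values; in practice they come from cv2.calibrateCamera). The 4th argument, alpha, is what the docs call the free scaling parameter:

import numpy as np
import cv2

# Placeholder intrinsics/distortion -- in practice these come from cv2.calibrateCamera
mtx = np.array([[800.0, 0.0, 320.0],
                [0.0, 800.0, 180.0],
                [0.0, 0.0, 1.0]])
dist = np.array([-0.3, 0.1, 0.0, 0.0, 0.0])

img = cv2.imread('frame.jpg')   # any image from the calibrated camera (hypothetical file name)
h, w = img.shape[:2]

# The 4th argument is the free scaling parameter (alpha):
#   alpha = 0 -> zoom in so that only valid pixels remain in the undistorted image
#   alpha = 1 -> keep every source pixel, possibly with black regions at the borders
for alpha in (0.0, 1.0):
    new_mtx, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w, h), alpha, (w, h))
    undistorted = cv2.undistort(img, mtx, dist, None, new_mtx)
    print("alpha =", alpha, "valid ROI =", roi)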

2016-12-26 18:59:00 -0600 commented question How to construct 3D representation of the scene with more than 2 images?

My goal is more about trying to understand the essence of SfM first. Because of that, I am not using a more complete library that already has the pipeline implemented. I also want to tweak a lot of stuff and experiment. That is why I am using OpenCV, since it is more general (it has other CV algorithms) and I can try many things with it. Also, I am using a Mac and having problems building OpenCV with the SfM module. Or do you have any idea what the best library would be to achieve my goal?

2016-12-24 22:10:45 -0600 asked a question How to construct 3D representation of the scene with more than 2 images?

From a video, I have taken three images: i1, i2 and i3. The steps for getting the keypoints in each image are:

  1. I detect keypoints in i1 and track them with optical flow up until i2.
  2. From i2, I add more keypoints (good keypoints from i1 still exist) and track them up until i3.

Then, from the corresponding keypoints in i1 and i2, I managed to build a 3D representation. Using the same pipeline, I of course also managed to reconstruct a 3D representation from the corresponding keypoints in i2 and i3. Now, I want to build one scene using these two reconstructed 3D scenes.

I have done a little bit of reading, and I am stuck in some parts. I know I will need to call solvePnPRansac. I have done bookkeeping during the optical flow and know which keypoints in the reconstructed 3D scene (between i1 and i2) are present in i3. So, I just need to pass the reconstructed 3D points, together with the corresponding i3 keypoints, to solvePnPRansac. From there, I can get the rotation and translation of the reconstructed 3D scene with respect to i3's camera. From there, what should I do?
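
To make the question concrete, here is a minimal sketch of the solvePnPRansac step, with synthetic data standing in for my real points (not my actual code):

import numpy as np
import cv2

# Synthetic stand-ins for my real data: 3D points from the i1/i2 reconstruction and
# their tracked 2D locations in i3 (generated here by projecting through a known pose)
np.random.seed(0)
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 180.0],
              [0.0, 0.0, 1.0]])
pts3d = np.random.uniform(-1.0, 1.0, (50, 3)) + np.array([0.0, 0.0, 6.0])
rvec_true = np.array([0.02, -0.1, 0.01])
tvec_true = np.array([0.3, 0.0, 0.1])
pts2d_i3, _ = cv2.projectPoints(pts3d, rvec_true, tvec_true, K, None)

# Pose of i3's camera with respect to the existing (i1/i2) reconstruction
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts3d.astype(np.float32), pts2d_i3.astype(np.float32), K, None)

R3, _ = cv2.Rodrigues(rvec)
P3 = K @ np.hstack([R3, tvec.reshape(3, 1)])  # projection matrix of i3
# New matches seen only in i2/i3 could then be triangulated against i2's projection
# matrix with cv2.triangulatePoints(P2, P3, pts_i2.T, pts_i3.T).
print("rvec:", rvec.ravel(), "tvec:", tvec.ravel())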

2016-12-16 05:33:19 -0600 asked a question 3D reconstruction (SfM) - confusion with camera's extrinsic parameter

From two images (with known camera intrinsic parameters), I follow the usual pipeline to reconstruct 3D points. In particular, I use the cv2.findEssentialMat and cv2.recoverPose functions. From cv2.recoverPose, I get the rotation and translation of the second camera (the first camera has an identity matrix for its rotation and a zero vector for its translation). What confuses me is the value of camera 2's translation, which is [ -0.98681175 0.08603786 0.1371133 ], i.e. a negative x value. Clearly from my images, the second camera moves to the right, which suggests a positive x value. Why is this? Is it because the matrix describes the movement of the points and not of the camera itself (I always thought the extrinsic parameters describe the movement of the camera)?

Below are the two images (with highlighted keypoints used for 3D reconstruction) and the reconstructed scene.


Reconstructed scene

The two blue X marks above are camera 1's and camera 2's positions (here, I have multiplied camera 2's translation by -1 to get a positive x value).
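
For reference, this is the relationship I am unsure about: if [R|t] from recoverPose maps points from the world (camera 1) frame into camera 2's frame, then camera 2's centre would be -Rᵀt rather than t itself. A minimal sketch (with a placeholder rotation and my actual translation value):

import numpy as np

# If [R|t] maps world (camera 1) coordinates into camera 2's frame, i.e.
# x_cam2 = R @ x_world + t, then camera 2's centre in world coordinates is C = -R.T @ t.
R = np.eye(3)                                              # placeholder rotation just to run the snippet
t = np.array([[-0.98681175], [0.08603786], [0.1371133]])   # my translation from recoverPose
C2 = -R.T @ t
print(C2.ravel())   # with R close to identity, this flips the sign, as in my plot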

2016-11-26 17:09:05 -0600 asked a question Undistort images or not before finding the Fundamental/Essential Matrix?

I am quite confused right now. In order to find the Fundamental Matrix and the Essential Matrix, my usual approach is to first undistort the images before doing the other steps, like detecting keypoints, matching the keypoints, finding the Fundamental Matrix and then the Essential Matrix. Is this correct? Can I find the Fundamental Matrix and the Essential Matrix without undistorting the images?

Another question: as for OpenCV's findEssentialMat function, does it operate on undistorted points, distorted points, or both?
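
To illustrate what I mean, here is a minimal sketch with synthetic points, comparing undistorting the matched points first (passing P=K so they stay in pixel units) against feeding the raw distorted points directly (the latter is the part I am unsure about):

import numpy as np
import cv2

np.random.seed(0)
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 180.0],
              [0.0, 0.0, 1.0]])
dist = np.array([-0.25, 0.07, 0.0, 0.0, 0.0])

# Synthetic scene: random 3D points seen by two cameras, imaged WITH lens distortion
pts3d = np.random.uniform(-1.0, 1.0, (120, 3)) + np.array([0.0, 0.0, 6.0])
rvec2 = np.array([0.0, -0.05, 0.0])
tvec2 = np.array([0.4, 0.0, 0.0])
pts1, _ = cv2.projectPoints(pts3d, np.zeros(3), np.zeros(3), K, dist)
pts2, _ = cv2.projectPoints(pts3d, rvec2, tvec2, K, dist)

# Option A: undistort the matched points first (P=K keeps pixel coordinates), then findEssentialMat
u1 = cv2.undistortPoints(pts1.astype(np.float32), K, dist, P=K)
u2 = cv2.undistortPoints(pts2.astype(np.float32), K, dist, P=K)
E_undist, _ = cv2.findEssentialMat(u1, u2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)

# Option B: feed the raw (distorted) points directly -- the part I am unsure about
E_raw, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
print(E_undist, "\n\n", E_raw)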

2016-11-14 22:59:48 -0600 asked a question Why does findFundamentalMat give different results for the same points in a different order?

Sorry if the title is kind of weird. It is quite hard to express my problem in a question title.

So, I am in the middle of a 3D reconstruction project. The pipeline is more or less the same as the standard pipeline:

  1. Undistort image
  2. Detect points with keypoint detector
  3. Track the points across frames (optical flow)
  4. Calculate the fundamental matrix

and so on. The only different part is step 2, where I use a Line Segment Detector and track the line segments across frames.

So, if I were using a keypoint detector, given two frames, I would get two sets of keypoints (one set per frame). But in my situation, I have four sets of keypoints (two sets per frame, since a line has a start point and an end point).

In order to calculate the Fundamental Matrix, I need to concatenate the two sets of points of each frame.

One way is to just vertically concatenate them: np.vstack([start_point, end_point]).

The other way is np.hstack([start_point, end_point]).reshape(-1, 2). This means the points are concatenated 'alternately', i.e. the rows of the resulting array are ordered as:

[start_point[0],
 end_point[0],
 start_point[1],
 end_point[1],
 ...]

Both end up with the same shape, but, fair enough, they produce quite different results. From my observation, vstack produces a more '3D-like' reconstruction, while hstack produces a more 'planar-like' one.

The question is: why is this? And which one is supposed to be better?

Below is sample code to illustrate the question:

import numpy as np
import cv2

np.random.seed(0)

def prepare_points(pts_frame1, pts_frame2):
    # Prepare the four sets of points
    (p1_f1, p2_f1) = pts_frame1
    (p1_f2, p2_f2) = pts_frame2

    v_stacked_f1f2 = (np.vstack([p1_f1, p2_f1]), np.vstack([p1_f2, p2_f2]))
    h_stacked_f1f2 = (np.hstack([p1_f1, p2_f1]).reshape(-1, 2),
                      np.hstack([p1_f2, p2_f2]).reshape(-1, 2))

    return (v_stacked_f1f2, h_stacked_f1f2)

pts_frame1 = np.random.random_sample((60, 2)).astype("float32")
pts_frame2 = np.random.random_sample((60, 2)).astype("float32")

# Emulate the two sets of points for each frame where
# the first set is the start point, while
# the second set is the end point of a line
pts_frame1 = (pts_frame1[::2], pts_frame1[1::2])
pts_frame2 = (pts_frame2[::2], pts_frame2[1::2])

(v_stacked_f1f2, h_stacked_f1f2) = prepare_points(pts_frame1, pts_frame2)

F_vstacked = cv2.findFundamentalMat(v_stacked_f1f2[0], v_stacked_f1f2[1],
                                    cv2.FM_RANSAC, 3, 0.99)[0]
F_hstacked = cv2.findFundamentalMat(h_stacked_f1f2[0], h_stacked_f1f2[1],
                                    cv2.FM_RANSAC, 3, 0.99)[0]

print("F_vstacked:\n", F_vstacked, "\n")
print("F_hstacked:\n", F_hstacked, "\n")

# The output:    
# F_vstacked:
# [[ 3.31788127 -2.24336615 -0.77866782]
# [ 0.83418839 -1.4066019  -0.92088302]
# [-2.75413748  2.27311637  1.        ]] 

# F_hstacked:
# [[  7.70558741  25.29966782 -16.20835082]
# [-12.95357284  -0.54474384  14.95490469]
# [  1.79050172 -10.40077071   1.        ]]
2016-09-18 23:56:32 -0600 edited question Problem with getOptimalNewCameraMatrix

I want to calibrate a car video recorder and use it for 3D reconstruction with Structure from Motion (SfM). The original size of the pictures I took with this camera is 1920x1080. Basically, I have been using the source code from the OpenCV tutorial for the calibration.

But there are some problems and I would really appreciate any help.

So, as usual (at least in the above source code), here is the pipeline:

  1. Find the chessboard corner with findChessboardCorners
  2. Get its subpixel value with cornerSubPix
  3. Draw it for visualisation with drawChessboardCorners
  4. Then, we calibrate the camera with a call to calibrateCamera
  5. Call the getOptimalNewCameraMatrix and the undistort function to undistort the image

In my case, since the image is too big (1920x1080), I have resized it to 640x360 and used that for the calibration (during SfM, I will also use this image size, so I don't think that should be a problem). I have also used a chessboard with 9x6 corners for the calibration.

Here, the problem arises. After a call to getOptimalNewCameraMatrix, the undistortion comes out totally wrong. Even the returned ROI is [0,0,0,0]. Below are the original image and its undistorted version:


You can see that in the undistorted version, the image content ends up squeezed into the bottom left.

But if I don't call getOptimalNewCameraMatrix and just undistort directly, I get quite a good image.

So, I have two questions.

  1. Why is this? I have tried with another dataset taken with the same camera, and also with my iPhone 6 Plus, but the results are the same as above.
  2. For SfM, I guess the call to getOptimalNewCameraMatrix is important? Because without it, the undistorted image would be zoomed and blurred, making keypoint detection harder (in my case, I will be using optical flow)? I have tested the code with the OpenCV sample pictures and the results are just fine.

Below is my source code:

from sys import argv
import numpy as np
import imutils  # To use the imutils.resize function.
                # Resizing while preserving the image's ratio.
                # In this case, resizing 1920x1080 into 640x360.
import cv2
import glob

# termination criteria
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
objp = np.zeros((9*6,3), np.float32)
objp[:,:2] = np.mgrid[0:9,0:6].T.reshape(-1,2)

# Arrays to store object points and image points from all the images.
objpoints = [] # 3d point in real world space
imgpoints = [] # 2d points in image plane.

images = glob.glob(argv[1] + '*.jpg')
width = 640

for fname in images:
    img = cv2.imread(fname)
    if width:
        img = imutils.resize(img, width=width)

    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

    # Find the chess board corners
    ret, corners = cv2.findChessboardCorners(gray, (9,6),None)

    # If found, add object points, image points (after refining them)
    if ret == True:
        objpoints.append(objp)

        corners2 = cv2.cornerSubPix(gray,corners,(11,11),(-1,-1),criteria)
        imgpoints.append(corners2)
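
For context, the remaining steps 4 and 5 roughly follow the tutorial. A minimal sketch of what I mean (not my exact code), continuing from the loop above:

# 4. Calibrate the camera from the collected correspondences
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)

# 5. Refine the camera matrix and undistort one of the (resized) images
img = cv2.imread(images[0])
img = imutils.resize(img, width=width)
h, w = img.shape[:2]
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w, h), 1, (w, h))
dst = cv2.undistort(img, mtx, dist, None, newcameramtx)

# Crop to the returned ROI -- this is where I get [0,0,0,0]
x, y, w, h = roi
dst = dst[y:y+h, x:x+w]
cv2.imwrite('undistorted.jpg', dst)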