
takahashi's profile - activity

2019-10-22 06:39:07 -0600 received badge  Famous Question (source)
2019-10-16 09:41:54 -0600 received badge  Notable Question (source)
2018-05-07 07:34:36 -0600 received badge  Popular Question (source)
2017-10-04 03:30:44 -0600 received badge  Notable Question (source)
2017-05-08 06:58:07 -0600 received badge  Popular Question (source)
2017-03-19 15:35:12 -0600 received badge  Student (source)
2016-04-26 02:31:15 -0600 commented question Cannot identify device '/dev/video0' - video reading OpenCV 3.1

if you list the connected video devices, is there any device listed? (e.g. video1 or so). Also, are you using a virtual machine by any chance?

2016-04-25 09:28:22 -0600 commented question Cannot identify device '/dev/video0' - video reading OpenCV 3.1

Make sure your video0 device exists. Use this command to list the connected video devices: ls -ltr /dev/video*

2016-03-21 03:50:09 -0600 commented question Help with line segment detection in java

you should apply the line segment detector directly on the image and NOT preprocess it using the canny edge detector. Both algorithms are designed to extract lines/edges from a grayscale image.
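For illustration, applying OpenCV's built-in line segment detector directly to a grayscale image looks roughly like this in C++ (the question is about Java, but the call structure is the same; the file name and window title are placeholders):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

using namespace cv;

int main() {
    // Load the input directly as grayscale -- no Canny preprocessing.
    Mat gray = imread("input.png", IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    Ptr<LineSegmentDetector> lsd = createLineSegmentDetector();
    std::vector<Vec4f> lines;   // each entry: x1, y1, x2, y2
    lsd->detect(gray, lines);   // LSD runs on the raw grayscale image

    // visualize the detected segments on a color copy
    Mat vis;
    cvtColor(gray, vis, COLOR_GRAY2BGR);
    lsd->drawSegments(vis, lines);
    imshow("LSD segments", vis);
    waitKey(0);
    return 0;
}
```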

2016-03-10 01:33:47 -0600 answered a question VideoCapture select timeout with OpenCV 3.0.0-rc1

I found the problem: When using a webcam, make sure to connect it to the Virtual Machine using Devices->Webcams and NOT Devices->USB. Even though the webcam is detected as video0 when attached via Devices->USB, for some reason it does not work.

2016-03-09 13:03:20 -0600 asked a question VideoCapture select timeout with OpenCV 3.0.0-rc1

I am using OpenCV 3.0.0-rc1 on Ubuntu 14.04 LTS Guest in VirtualBox with Windows 8 Host. I have an extremely simple program to read in frames from a webcam (Logitech C170) (from the OpenCV documentation). Unfortunately, it doesn't work (I have tried 3 different webcams). It throws an error "select timeout" every couple of seconds and reads a frame, but the frame is black. Any ideas?

The code is the following:

#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>

using namespace std;
using namespace cv;

// Main
int main(int argc, char **argv) {

    /* webcam setup */
    VideoCapture stream;; // open the default video device (index 0 assumed)
    // check if the video device has been initialized
    if (!stream.isOpened()) {
        fprintf(stderr, "Could not open Webcam device");
        return -1;
    }

    int image_width = 640; // image resolution
    int image_height = 480;
    Mat colorImage, currentImage;
    bool loop = true;
    /* infinite loop for video stream */
    while (loop) {
        loop =;  // read webcam stream
        cvtColor(colorImage, currentImage, COLOR_BGR2GRAY); // color to gray for current image
        imshow("Matches", currentImage);
        if (waitKey(30) >= 0) break;
    } // end stream while-loop
    return 0;
}

2015-11-19 01:56:10 -0600 commented question Pose estimation using PNP: Strange wrong results

how large is the subset of 2D-3D correspondences that gives you the wrong pose? It would also be helpful, if you could post your code (maybe append it to your question). The distance ratio is an indicator that your triangulated points are correct. So it really seems like there is something wrong with the pose estimation from the subset of correspondences.

2015-11-18 02:43:51 -0600 commented question Pose estimation using PNP: Strange wrong results

I would start off by checking your feature point correspondences (display them on your synthetic images). Then, since you know the ground truth of the corresponding 3D points from your Gazebo model, check whether the values of your triangulated feature points actually make sense.

2015-11-17 03:55:30 -0600 commented question Pose estimation using PNP: Strange wrong results

your choice of coordinate systems seems a bit arbitrary to me. Make sure that you follow OpenCV's convention by using right-handed coordinate systems with Z pointing away from the camera in direction of the observed scene. Also you state that your right camera is defined as the origin. However, in the very next sentence, you state that the right camera is at (-1,1,1). So if the projection matrices of your two views are wrong, so will the triangulated points be wrong and PnP of course won't be able to calculate the correct solution.

2015-10-09 11:03:21 -0600 commented question VideoCapture returns empty Mat and imshow() crashes.

aha! I didn't know that ROS actually installs OpenCV as well! hmm.. this certainly is annoying. But thanks for the hint, very helpful!

2015-10-09 01:53:34 -0600 commented question VideoCapture returns empty Mat and imshow() crashes.

I installed OpenCV following this tutorial (with OpenCV 3.0.0-rc1 instead of OpenCV 3.0.0 alpha). I never installed anything else regarding OpenCV..

2015-10-08 09:52:08 -0600 commented question VideoCapture returns empty Mat and imshow() crashes.

yes, didn't work either..

2015-10-07 04:41:23 -0600 commented question VideoCapture returns empty Mat and imshow() crashes.

I thought about this as well, but removing all ROS code still does not change anything. I will leave this annoying problem for now and will update this question once I have figured it out. But thanks anyway!

2015-10-07 03:48:38 -0600 commented question VideoCapture returns empty Mat and imshow() crashes.

returns 1. As I stated above, frame.empty() returns 0. BUT the strange thing is that frame.type() returns 16, and if I comment out imshow(), the code works fine and msg = cv_bridge::CvImage(std_msgs::Header(), "rgb8", frame).toImageMsg(); actually sends a valid frame, which I can read and display without any problems.

2015-10-07 03:27:40 -0600 asked a question VideoCapture returns empty Mat and imshow() crashes.

I have the following code (in ROS), where I read an image stream from a webcam, convert it, send it to another ROS program, and display it:

#include <ros/ros.h>
#include <image_transport/image_transport.h>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <cv_bridge/cv_bridge.h>
#include <thread>
#include <chrono>

using namespace std;

int main(int argc, char** argv) {

    ros::init(argc, argv, "image_publisher");
    ros::NodeHandle nh;
    image_transport::ImageTransport it(nh);
    image_transport::Publisher pub = it.advertise("camera/image", 1);

    cv::VideoCapture cap;; // open the default video device (index 0 assumed)
    // give webcam some time
    std::this_thread::sleep_for(std::chrono::seconds(1));
    // Check if video device can be opened with the given index
    if (!cap.isOpened())
        return 1;
    sensor_msgs::ImagePtr msg;
    ROS_INFO("Webcam publisher running");

    ros::Rate loop_rate(10);
    while (nh.ok()) {
        cv::Mat frame;
        cap >> frame;
        cout << "frame type: " << frame.type() << "\n";
        cout << "frame empty: " << frame.empty() << "\n";
        // Check if grabbed frame is actually full with some content
        if (!frame.empty()) {
            msg = cv_bridge::CvImage(std_msgs::Header(), "rgb8", frame).toImageMsg();
            pub.publish(msg);
            cv::imshow("Webcam", frame); // this is the line that crashes
            cv::waitKey(0);
        }
        ros::spinOnce();
        loop_rate.sleep();
    }
    return 0;
}

My problem is that the code always outputs "frame empty: 0". The if condition if(!frame.empty()) is nevertheless entered and the ROS msg (created from the frame) which is sent, is just fine. The subscriber to this ROS msg has no problem to read the frame and show it with imshow(). (For those who aren't familiar with ROS: I send the frame to another program where I can read and display it without any problems).

I need to display the frame already in the code above, but it always crashes at cv::imshow("Webcam",frame);, giving the following debugging (with gdb) output:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff760cd20 in cv::_InputArray::size(int) const ()
   from /usr/lib/x86_64-linux-gnu/
(gdb) bt
#0  0x00007ffff760cd20 in cv::_InputArray::size(int) const ()
   from /usr/lib/x86_64-linux-gnu/
#1  0x00007ffff66e4887 in cv::imshow(cv::String const&, cv::_InputArray const&) () from /usr/local/lib/
#2  0x00000000004023c8 in main (argc=1, argv=<optimized out>)
    at /ibox/ros/catkin_ws/src/ros/webcam_publisher/src/webcam_publisher.cpp:41

If I comment out the lines cv::imshow("Webcam",frame); (and cv::waitKey(0);), everything works just fine. I am using OpenCV 3.

Does anybody have a clue what could be going on here?

2015-08-10 03:26:22 -0600 commented answer Using triangulatePoints() and projectPoints() with stereo cameras

You'll get a correct estimation/scene reconstruction up-to-scale. So the relative translation between the cameras and the positions of the triangulated 3D points will be correct with respect to the translation from the first to the second camera, which is assumed to have length 1. You will obtain an up-to-scale metric reconstruction, which unfortunately does not carry any absolute information. BA will work without that information, as it refines your pose and 3D points up-to-scale as well. To get the absolute scale, you either need prior knowledge about the size of an object in the scene or, as you plan, you can use the information from GPS/IMU. This can be included after the BA by multiplying all your metric quantities with the GPS/IMU estimate of the first translation vector.

2015-08-10 03:11:30 -0600 commented answer Using triangulatePoints() and projectPoints() with stereo cameras

The camera extrinsics are the pose of the camera w.r.t. an (arbitrary, user-defined) 3D coordinate system, so for a 2-camera setup with the 3D coordinate system defined at the center of the first camera, the translation part of the second camera's extrinsics corresponds to the baseline when speaking in terms of a stereo setup. You're right about the change of scale: 2D-to-2D via the essential matrix normalizes the translation vector to unit length for each pair of views. So what you could do is use 2D-to-2D to get an initial estimate of the pose, then use this estimate to triangulate the points. Once you have established this initial set of pose and 3D points, you can use PnP for further pose estimation (with correct scale). Then you can use BA to minimize the reprojection error.
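The sequence sketched here (2D-to-2D for the initial pose, triangulation, then PnP) could look roughly like this with the OpenCV 3 calib3d API. This is only a sketch: the matched point vectors, focal length, and principal point are assumed to come from elsewhere, and the function/variable names are illustrative:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

using namespace cv;

// points1/points2: matched 2D features between the first two views (assumed given)
void bootstrapPose(const std::vector<Point2f>& points1,
                   const std::vector<Point2f>& points2,
                   double focal, Point2d pp)
{
    // 1) 2D-to-2D: essential matrix and relative pose; t comes out with unit norm
    Mat E = findEssentialMat(points1, points2, focal, pp, RANSAC, 0.999, 1.0);
    Mat R, t;
    recoverPose(E, points1, points2, R, t, focal, pp);

    // 2) triangulate the initial 3D point set with P1 = K[I|0], P2 = K[R|t]
    Mat K = (Mat_<double>(3, 3) << focal, 0, pp.x, 0, focal, pp.y, 0, 0, 1);
    Mat P1 = K * Mat::eye(3, 4, CV_64F);
    Mat Rt;
    hconcat(R, t, Rt);
    Mat P2 = K * Rt;
    Mat points4D;   // homogeneous 3D points, one column per correspondence
    triangulatePoints(P1, P2, points1, points2, points4D);

    // 3) for later views, solvePnP on 2D-3D correspondences keeps the scale
    //    consistent with this initial reconstruction (then refine with BA)
}
```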

2015-08-10 02:56:49 -0600 answered a question Using triangulatePoints() and projectPoints() with stereo cameras

You are trying to mix two different approaches for pose estimation: 2D-to-2D (via the essential matrix) and 2D-to-3D using triangulated 3D points. solvePnP is an algorithm for the latter approach. Bundle adjustment basically minimizes the reprojection error over the set of poses and 3D points, so changing extrinsics are actually essential for bundle adjustment. The tutorials here and here give a very good overview of the topic of visual odometry.

Usually, the 3D coordinate frame is defined such that its origin is at the center of the first camera so in this sense, yes, K1[I|0] for the first camera and K2[R|t] for the second camera is correct. The first time you triangulate the points, the obtained 3D points are expressed in the 3D coordinate frame defined by the first camera. Therefore, when you want to reproject the 3D points in e.g. view 3 in order to calculate the reprojection error, you need to know the pose of the 3rd camera w.r.t the first view.
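To make the reprojection check concrete, a small helper along these lines could compute the mean reprojection error for a view (all variable names are illustrative; cv::projectPoints is the standard calib3d call):

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

using namespace cv;

// objectPoints: triangulated 3D points in the world frame (first camera),
// observed: their measured 2D locations in the target view,
// rvec/tvec: pose of that view w.r.t. the world frame, K: intrinsics.
double meanReprojectionError(const std::vector<Point3f>& objectPoints,
                             const std::vector<Point2f>& observed,
                             const Mat& rvec, const Mat& tvec, const Mat& K)
{
    std::vector<Point2f> projected;
    projectPoints(objectPoints, rvec, tvec, K, noArray(), projected);

    double err = 0.0;
    for (size_t i = 0; i < observed.size(); ++i) {
        double dx = observed[i].x - projected[i].x;
        double dy = observed[i].y - projected[i].y;
        err += std::sqrt(dx * dx + dy * dy);
    }
    return err / observed.size();   // mean error in pixels
}
```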

Hope this helps.

2015-07-24 14:56:34 -0600 received badge  Good Answer (source)
2015-07-24 14:56:34 -0600 received badge  Enlightened (source)
2015-07-24 11:52:20 -0600 commented answer Units of Rotation and translation from Essential Matrix

I think the problem is that the Fundamental matrix is defined by the epipolar geometry, but I can't recall the details right now, so I can't say what is probably causing your problem.

2015-07-24 06:37:21 -0600 commented answer Units of Rotation and translation from Essential Matrix

If the scene is static, then the scale factor of all 3D points should be constant w.r.t. one specific view. You never get [identity|0] from the SVD decomposition of the Essential matrix, because this is the underlying assumption for the first camera that allows you to calculate [R|t] of the second camera. Or am I maybe not understanding your question correctly? So by static scene, do you mean a rigid scene with two images taken from different poses?

2015-07-24 04:17:32 -0600 commented answer Units of Rotation and translation from Essential Matrix

The camera matrix from an essential matrix obtained with two views is always the camera matrix of the second camera, P2 = K*[R|t] (considering calibration matrix K), based on the assumption that the world coordinate frame has its origin at the first camera center. A 2D point x seen in the second image is given by x = P2*X, where X is a 3D point in world coordinates. So whenever you calculate the camera pose from a new pair of views, the camera pose is always relative to the first view of this pair. So let's say you have n different views. You could e.g. always calculate the essential matrix w.r.t. the first view and then, from the first set of triangulated keypoints, you should ideally be able to estimate the correct scale of all other views w.r.t. the first (unit) translation.

2015-07-24 03:19:47 -0600 commented answer Units of Rotation and translation from Essential Matrix

Yes, the camera origin c is the position in world coordinates. When extracting the camera matrix P2 = [R|t] of your second view, you always assume that the world coordinates coincide with the first camera, such that the first camera actually has a camera matrix P1 = [identity|0]. Therefore, R and t represent the motion from the second camera to the first camera (or world coordinate origin) in the second camera's coordinate frame. Finally, you need to invert this motion to obtain the pose of your second camera w.r.t. the world coordinate system. This blog post is very helpful for understanding these geometric relations. So regarding your last question in the above comment: it's exactly the other way around.

2015-07-24 02:16:53 -0600 received badge  Editor (source)
2015-07-24 02:08:13 -0600 received badge  Nice Answer (source)
2015-07-24 02:00:29 -0600 received badge  Teacher (source)
2015-07-24 01:49:57 -0600 answered a question Units of Rotation and translation from Essential Matrix

It depends on whether you have knowledge about an object's absolute metric size (e.g., the baseline of a stereo camera setup). The Essential Matrix E decomposes into rotation R and translation t, but notice that there are four possible solutions, of which only one is geometrically feasible (see here, slide 8). After doing the cheirality check, i.e., verifying that all 3D points are in front of the two cameras, you get the feasible solution.

The translation vector t (and rotation matrix R) yield your camera matrix P = [R|t], where R corresponds to the camera orientation in world coordinates and t is the position of the world origin in camera coordinates. Therefore, to obtain the camera origin c in world coordinates, you calculate:

c = -inverse(R)*t

As you noticed in your EDIT 1: the translation vector is normalized to unit length. This is due to the SVD, which always returns a solution normalized to unit length. Therefore, it is impossible to retrieve an absolute metric translation (in m, mm, or whatever) without any additional knowledge of the absolute dimensions of an observed object; you only obtain a correct solution up-to-scale.

Edit: you might also take a look at this post for the calculation of the four possible solutions.

2015-06-12 05:15:08 -0600 received badge  Enthusiast
2015-06-09 11:06:15 -0600 asked a question Trifocal Tensor with OpenCV

Is there any specific module in OpenCV dedicated to three-view geometry (trifocal tensor)?

2015-06-02 06:25:57 -0600 asked a question Why does BinaryDescriptor::compute give segfault when modifying the input KeyLine vector

I am using OpenCV 3.0.0-rc1. I want to detect lines, filter them according to line length, and compute the descriptors. However, it always gives a segfault and I don't get why. When I use the original keylines vector instead of the filtered vector (created by the function keylineLengthFilter), it works perfectly. When I set min_length = 0.0, it also works. When I set it to, e.g., min_length = 10.0, it still detects a non-zero number of keylines, but the segfault occurs. The documentation where I got the example code from is here. Any help is greatly appreciated!

My code is:

The length filter:

vector<KeyLine> keylineLengthFilter(vector<KeyLine>& keylines, const float &length)
{
    /* keyline filter that returns a vector with the keylines whose length is larger than the input value */
    vector<KeyLine> keylines_filtered;
    for (int i = 0; i < (int)keylines.size(); ++i)
        if (keylines[i].lineLength >= length)
            keylines_filtered.push_back(keylines[i]);
    return keylines_filtered;
}

The Code: (image, min_length and mask are not included here, but that part works fine on my original code)

vector<KeyLine> keylines_filtered;
/* create a pointer to a BinaryDescriptor object with default parameters */
Ptr<BinaryDescriptor> bd = BinaryDescriptor::createBinaryDescriptor();
/* create a pointer to an LSDDetector object */
Ptr<LSDDetector> lsd = LSDDetector::createLSDDetector();
/* compute lines */
lsd->detect(image, keylines, 2, 1, mask);
/* filter keylines */
keylines_filtered = keylineLengthFilter(keylines, min_length);

// THE SEGFAULT HAPPENS HERE, but only when I use keylines_filtered as argument:
/* compute descriptors */
bd->compute(image, keylines_filtered, descriptors); // this gives segfault
bd->compute(image, keylines, descriptors); // this works