Help with homography transformation

asked 2018-07-24 02:58:28 -0500

Nbb

updated 2018-07-24 22:03:55 -0500

I am trying to get a top-down view of an image via a homography transformation. I am using the BIWI Walking Pedestrians dataset, which includes an H.txt file containing the 3x3 homography matrix. However, I am unable to get a decent output when applying the homography to the first frame of the video. Below is my code.

import numpy as np
import cv2
import matplotlib.pyplot as plt

# read homography matrix
h = []
with open('/home/ast/datasets/ewap/seq_hotel/H.txt', 'r') as f:
    for line in f:
        h.append([float(l) for l in line.strip().split()])
h = np.array(h)

# read first image of the video
cap = cv2.VideoCapture('/home/ast/datasets/ewap/seq_hotel/seq_hotel.avi')
_, frame = cap.read()

# transform
im_dst = cv2.warpPerspective(frame, h, (frame.shape[1],frame.shape[0]))

And these are the images. You can see that the output is completely black. The README of the dataset mentions that the positions and velocities are in meters and are obtained with the homography matrix stored in H.txt. I'd assume that the homography matrix must transform the points (and image) from the current view to a top-down view in order to get the actual positions in meters. Am I wrong? Or have I just applied the function incorrectly? I have verified that my h array is correct by comparing it to H.txt.

image description

image description

This is the matrix H

[[  1.10482000e-02   6.69589000e-04  -3.32953000e+00]
 [ -1.59660000e-03   1.16324000e-02  -5.39514000e+00]
 [  1.11907000e-04   1.36174000e-05   5.42766000e-01]]

EDIT: I applied the homography to the top-left and bottom-right coordinates of the image, [0, 0] and [576, 720], and got [-6.13437462 -9.94008446] and [5.69882059 3.33946466]. I think this is why the entire image is black: every pixel in the image is mapped to somewhere between [-6.13437462 -9.94008446] and [5.69882059 3.33946466], so (1) all pixels mapped to negative coordinates do not appear in the output image, and (2) the range is too small. What I will try is to scale each pixel individually, e.g. so that they get mapped to centimeters instead of meters, and also to translate all points so that they are non-negative.
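For reference, the corner check described in this edit can be reproduced with plain NumPy. This is only a sketch: `apply_homography` is a helper name made up for illustration, and the point [576, 720] is passed in the same order as written in the post.

```python
import numpy as np

# Homography matrix as posted in the question (contents of H.txt)
H = np.array([
    [1.10482000e-02, 6.69589000e-04, -3.32953000e+00],
    [-1.59660000e-03, 1.16324000e-02, -5.39514000e+00],
    [1.11907000e-04, 1.36174000e-05, 5.42766000e-01],
])

def apply_homography(H, points):
    """Map an (N, 2) array of points through H, including the division by w."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous (N, 3)
    mapped = homog @ H.T                              # rows are [x', y', w']
    return mapped[:, :2] / mapped[:, 2:3]             # back to (N, 2)

# The two image corners quoted in the post
corners = apply_homography(H, [[0, 0], [576, 720]])
print(corners)
```

Both results fall within a few units of the origin, and one of them is negative in both coordinates, which is consistent with the explanation above: warpPerspective only fills destination pixels at non-negative coordinates, so almost nothing from the source frame lands inside the output image.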

EDIT: I tried projecting each point individually, but the output is not satisfactory. What can I do to the homography matrix so that the output of warpPerspective is in centimeters instead of meters?

# offset matrix for the hotel sequence
offset_hotel = np.array([[1,0,10], [0,1,10], [0,0,1]], dtype="float32")

# generate image coordinates
im_coordinates = []
for y in range(np.shape(frame)[0]):
    for x in range(np.shape(frame)[1]):
        im_coordinates.append([x, y])  # loop body is missing in the post; assumed
im_coordinates = np.array([im_coordinates], dtype = "float32")

# convert to ...
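One possible way to get centimeters directly out of warpPerspective, instead of projecting points one by one, is to left-multiply H by a scale-and-translate matrix. The sketch below assumes that H maps pixel coordinates to meters as the post supposes, uses the image extent quoted in the post (576 and 720), and picks 100 output pixels per meter. `apply_homography`, `scale`, `T` and `H_cm` are names made up for illustration.

```python
import numpy as np

# Homography matrix as posted in the question (contents of H.txt)
H = np.array([
    [1.10482000e-02, 6.69589000e-04, -3.32953000e+00],
    [-1.59660000e-03, 1.16324000e-02, -5.39514000e+00],
    [1.11907000e-04, 1.36174000e-05, 5.42766000e-01],
])

def apply_homography(H, points):
    """Map an (N, 2) array of points through H, including the division by w."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

scale = 100.0  # assumed choice: 100 output pixels per meter, i.e. 1 px = 1 cm

# Four source corners, using the extent quoted in the post (assumption)
src_corners = np.array([[0, 0], [576, 0], [576, 720], [0, 720]], dtype=float)
world = apply_homography(H, src_corners)  # corner positions in meters
min_xy = world.min(axis=0)
max_xy = world.max(axis=0)

# Shift the smallest coordinate to the origin and scale meters to centimeters
T = np.array([
    [scale, 0.0, -scale * min_xy[0]],
    [0.0, scale, -scale * min_xy[1]],
    [0.0, 0.0, 1.0],
])
H_cm = T @ H  # apply H first, then T

# Output canvas large enough to hold all four warped corners
out_w, out_h = (np.ceil(scale * (max_xy - min_xy)).astype(int) + 1).tolist()
print(H_cm, (out_w, out_h))
```

The result would then be used as `cv2.warpPerspective(frame, H_cm, (out_w, out_h))`. Note that the output size has to be the size of the warped region, not `frame.shape`. Using the four corners as a bounding box is enough here because the third homogeneous coordinate stays positive over the whole image for this matrix.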

Please show us what is in your h matrix.

berak (2018-07-24 03:36:03 -0500)

Hi berak, sorry, I did not know that. I looked through the OpenCV tutorial and assumed they are the same. I have updated my post with the h matrix.

Nbb (2018-07-24 03:42:23 -0500)

Maybe try to normalize to have h_33 == 1?

Eduardo (2018-07-24 09:19:54 -0500)

I tried dividing h by h[2,2] to make h[2,2] (h_33) equal to 1, but the result is still the same.

Nbb (2018-07-24 19:22:36 -0500)
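The unchanged result after normalization is expected: a homography is only defined up to a non-zero scale factor, because the factor cancels in the division by the third homogeneous coordinate. A small NumPy check, with `apply_homography` as an illustrative helper and arbitrarily chosen test points:

```python
import numpy as np

# Homography matrix as posted in the question (contents of H.txt)
H = np.array([
    [1.10482000e-02, 6.69589000e-04, -3.32953000e+00],
    [-1.59660000e-03, 1.16324000e-02, -5.39514000e+00],
    [1.11907000e-04, 1.36174000e-05, 5.42766000e-01],
])
H_norm = H / H[2, 2]  # normalized so that h_33 == 1

def apply_homography(H, points):
    """Map an (N, 2) array of points through H, including the division by w."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

pts = [[0, 0], [576, 720], [100, 200]]  # arbitrary test points
same = np.allclose(apply_homography(H, pts), apply_homography(H_norm, pts))
print(same)
```

So normalizing h_33 cannot change where pixels land. Only a change to the mapping itself, such as an added scale or translation, moves the warped image into view.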