I ended up getting this to almost work. The problem is that it only works in an environment where:
- The floor has unique features. Feature-based stitching will likely work on e.g. a wooden floor, where the grain provides many distinctive features, but not on e.g. a linoleum floor where the pattern is uniform or non-existent.
- There is nothing placed on the floor. This requires a bit of explaining: say there is a chair on the floor that both Cam1 and Cam2 can "see". Cam1 views the chair from one angle and Cam2 from a different angle, so the same chair looks very different in the two images. When the stitching algorithm searches for similar features between the Cam1 and Cam2 images, it will therefore not recognize the chair as the same "object". This is essentially a parallax problem: a transform that aligns the floor plane between the two views cannot also align points sitting above that plane. There may be a way around this, but I didn't find a solution myself.
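The chair problem in the second point can be made concrete with a small numpy sketch. A homography fitted to points on the floor plane stitches the floor correctly, but the same homography mis-maps any point above the floor. All the numbers below (intrinsics, camera positions, the "chair" height) are made up for illustration; the DLT homography fit is a standard technique, not the exact algorithm I used:

```python
import numpy as np

def projection_matrix(K, R, t):
    """Build a 3x4 pinhole projection matrix P = K [R | t]."""
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(P, X):
    """Project a 3D world point X (length 3) to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def homography_from_points(src, dst):
    """Estimate a 3x3 homography from >= 4 point pairs via the DLT."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def warp(H, pt):
    """Apply a homography to a 2D pixel coordinate."""
    q = H @ np.append(pt, 1.0)
    return q[:2] / q[2]

# Two cameras with identical (made-up) intrinsics, both looking straight
# down along +z. Camera centers sit at z = -3 (3 m above the floor at z = 0),
# so "above the floor" means negative z here.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
R = np.eye(3)
P1 = projection_matrix(K, R, np.array([0.0, 0.0, 3.0]))   # center (0, 0, -3)
P2 = projection_matrix(K, R, np.array([-1.0, 0.0, 3.0]))  # center (1, 0, -3)

# Four points on the floor plane fix the stitching homography.
floor = [np.array([x, y, 0.0]) for x, y in [(-1, -1), (1, -1), (1, 1), (-1, 1)]]
H = homography_from_points([project(P1, X) for X in floor],
                           [project(P2, X) for X in floor])

# A fifth floor point maps from Cam1 to Cam2 essentially perfectly...
flat = np.array([0.3, -0.4, 0.0])
err_floor = np.linalg.norm(warp(H, project(P1, flat)) - project(P2, flat))

# ...but a point 0.5 m above the floor (the "chair") lands tens of pixels
# away from where Cam2 actually sees it: parallax.
chair = np.array([0.3, -0.4, -0.5])
err_chair = np.linalg.norm(warp(H, project(P1, chair)) - project(P2, chair))

print(f"floor-point error: {err_floor:.4f} px")
print(f"chair-point error: {err_chair:.2f} px")
```

This is why the stitching only looks right when everything in the overlap region lies on the floor plane itself.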
I don't quite know how to solve this problem other than by creating your own reference points: e.g. projecting an infrared grid onto the floor and checking which cameras can "see" it, or having people walk through the scene and using them as reference points for the camera calibration.
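The "people as reference points" idea can be sketched as follows: if both cameras detect the same person's foot contact point (which lies on the floor plane) at the same instant, each detection is one point correspondence, and four or more of them determine the floor homography. Everything below — the simulated detections, the noise level, the least-squares DLT fit — is illustrative, not a tested pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_homography(src, dst):
    """Least-squares DLT fit of a 3x3 homography from point pairs."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_h(H, pt):
    """Apply a homography to a 2D pixel coordinate."""
    q = H @ np.array([pt[0], pt[1], 1.0])
    return q[:2] / q[2]

# Hypothetical ground-truth mapping between the two views: a foot detected
# at pixel (u, v) in Cam1 appears shifted and slightly rotated in Cam2.
theta = np.deg2rad(5.0)
H_true = np.array([[np.cos(theta), -np.sin(theta), 180.0],
                   [np.sin(theta),  np.cos(theta), -40.0],
                   [0.0,            0.0,             1.0]])

# Simulated foot detections as a person walks through the overlap region,
# with ~1 px detection noise in each camera.
cam1_pts = rng.uniform(100, 500, size=(12, 2))
cam2_pts = np.array([apply_h(H_true, p) for p in cam1_pts])
cam1_pts = cam1_pts + rng.normal(0, 1.0, cam1_pts.shape)
cam2_pts = cam2_pts + rng.normal(0, 1.0, cam2_pts.shape)

H_est = fit_homography(cam1_pts, cam2_pts)

# Check the calibration on a held-out floor point.
test_pt = np.array([320.0, 240.0])
err = np.linalg.norm(apply_h(H_est, test_pt) - apply_h(H_true, test_pt))
print(f"held-out reprojection error: {err:.2f} px")
```

The appeal of feet (or a projected IR grid) as reference points is precisely that they are guaranteed to lie on the floor plane, which sidesteps the parallax problem that off-floor objects like chairs introduce.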