Understanding relationship between image resolution and "zoom" capability

asked 2016-09-08 00:11:53 -0500

Crashalot

New to CV so sorry if this question lacks proper terminology or is completely nonsensical. :)

The goal is to track a specific player during a basketball game and produce a video that follows him around during the game. More specifically, every frame of the video will be centered around him instead of the ball.

One CV expert suggested the following:

The general approach is to have a very high definition video feed with a still camera and zoom into and around the still image and create the illusion that you are zooming/panning a regular camera.

Could someone elaborate on how "high definition" this camera needs to be? 4K? And more abstractly, is there a relationship between the resolution of the video feed and how much "zoom" capability this yields? For instance, if you have a 4K feed, could you zoom around a basketball court of 90 feet? Could you zoom around a football field of 100 yards?



His suggestion is simply bullshit ... software PTZ cameras only work for fairly stable scenes. Instead, get yourself a PTZ setup and apply a tracker combined with a detector. In your case, if you can select the player beforehand, the OpenTLD tracker will be exactly what you need! Then, based on the movement of your tracker in the image, adjust your PTZ setup.

StevenPuttemans ( 2016-09-08 05:47:14 -0500 )

@StevenPuttemans could you elaborate please? Why is his answer not helpful -- isn't a basketball game considered a stable scene? What do you mean by "get yourself a PTZ setup and apply a tracker combined with a detector"? Are you referring to hardware or software? Sorry for not understanding, but thanks for your patience!

Crashalot ( 2016-09-08 12:31:45 -0500 )

The downside of a fixed camera with software PTZ (pan-tilt-zoom), which moves a crop window around in the image, is that the original camera has to have a huge resolution to maintain a decent end video. In sports scenes, cameras are mostly placed up high, somewhere above the crowd, to capture a player. If you compare the size of a single player to the overall frame, be prepared to pay a lot for that high-resolution camera. A lower-resolution PTZ camera might not be able to capture the whole field at once, but it can put its full resolution on a specific area. By detection I mean you need to detect the player somehow, either automatically or manually, then apply a feature tracker to follow them around in the image. Start googling for CV tracking.

StevenPuttemans ( 2016-09-09 02:03:39 -0500 )

@StevenPuttemans thanks for the clarification. Yes, this wouldn't be used for professional sporting events so the camera will be perched right next to the court. If I understand you, the downside to the lower resolution camera means you must do real-time player tracking on the camera device (let's ignore the manual option where a human pans the camera to follow the player) -- wouldn't that also increase the cost? The question then becomes: is it cheaper to do on-device tracking or to increase the camera resolution?

Crashalot ( 2016-09-09 02:08:44 -0500 )

1 answer


answered 2016-09-08 07:53:37 -0500

Tetragramm

It can work, especially in this scenario. If you have a physically mounted and fixed camera, you have a stable scene, so you just need to follow the guy.

The problem is, your question depends on too much we can't know. It's very simple to answer though.

First, you decide what the final resolution is going to be, and how much space around him you want in the picture.

Then do some simple geometry. From where your camera is, what angle (FOV) do you need to cover the area around the player? What angle (FOV) do you need to cover the entire area?

Take the number of pixels (in the horizontal direction) you need on the player, divide by the first angle, multiply by the second angle. That number is the number of pixels you need (in the horizontal direction) for the camera. It's likely to be very large.
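As a rough illustration of the arithmetic just described, here is a minimal Python sketch. The function name and the example numbers (a 20 degree crop inside a 120 degree total view) are made up for illustration, and it assumes pixels are spread evenly across the field of view, which real lenses only approximate.

```python
def required_camera_width_px(output_width_px, fov_sub_deg, fov_total_deg):
    """Horizontal pixels the camera needs so that a crop spanning
    fov_sub_deg contains output_width_px pixels.

    Assumption: pixels are distributed evenly over the angle, so the
    pixel count scales linearly with the field of view.
    """
    if fov_sub_deg <= 0 or fov_total_deg < fov_sub_deg:
        raise ValueError("need 0 < fov_sub_deg <= fov_total_deg")
    # pixels on the player / first angle * second angle
    return output_width_px / fov_sub_deg * fov_total_deg


# Hypothetical numbers: 1280 px wide output, 20 degree crop, 120 degree total view.
print(required_camera_width_px(1280, 20.0, 120.0))  # 7680.0
```

With these example values the camera would need twice the width of a 4K (3840 px) frame, which shows how quickly the requirement grows.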

Your setup will probably need multiple cameras next to each other to cover the entire basketball court. At least, if you're using cheap cameras. If you're making your own, pretty much anything is possible if you spend enough money.
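To put a number on "multiple cameras", the required horizontal pixel count can be divided by what one camera provides. The sketch below is only a back-of-the-envelope estimate: the function name and the example values are hypothetical, and it ignores the overlap that adjacent cameras would need for stitching.

```python
import math


def cameras_needed(required_width_px, per_camera_width_px):
    """Number of side-by-side cameras needed to reach required_width_px.

    Assumption: views tile perfectly with no overlap, so this is a
    lower bound on a real multi-camera rig.
    """
    if required_width_px <= 0 or per_camera_width_px <= 0:
        raise ValueError("pixel counts must be positive")
    return math.ceil(required_width_px / per_camera_width_px)


# Hypothetical example: 7680 px required, 4K cameras (3840 px wide).
print(cameras_needed(7680, 3840))  # 2
```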



Thanks so much, this is helpful! Assume a custom camera with two sensors combining to provide 180 FOV so the entire court is covered (assume the camera is at half-court). To answer your questions, it seems like the FOV would be 180 for both, does that seem right?

Crashalot ( 2016-09-08 12:35:33 -0500 )

That would be the second FOV, of the total system. The first is dependent on the location of the player and how large an area you want included in your image.

Tetragramm ( 2016-09-08 17:28:33 -0500 )

Thanks for the prompt response! Let's assume you want the player always centered in a 1280x720 video. What would FOV #1 be?

Crashalot ( 2016-09-08 18:08:41 -0500 )

Nuh uh. You're asking the wrong question. If the player is centered in a 1280x720 video, what does that mean? Draw a line under the player from one side of the frame to the other. Draw another two lines from the location of the camera to the ends of that line. That makes a triangle. What is the angle of the triangle at the camera?

Let's call that FOV_sub, and the 180 is FOV. FOV/FOV_sub * (1280x720) gives you the necessary resolution.

Tetragramm ( 2016-09-08 18:24:15 -0500 )
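The triangle construction in the comment above can be sketched in Python. This is only an illustration under stated assumptions: the function names, the flat top-down geometry, and the example distances (a 20 ft wide strip viewed from 30 ft away) are not from the thread, and the linear FOV/FOV_sub scaling assumes pixels are spread evenly over the angle.

```python
import math


def fov_sub_deg(x_left, x_right, distance):
    """Angle at the camera (in degrees) subtended by a line under the player.

    x_left, x_right: lateral offsets of the line's ends, measured from the
    point directly in front of the camera (same units as distance).
    distance: perpendicular distance from the camera to that line.
    """
    if distance <= 0 or x_right <= x_left:
        raise ValueError("need distance > 0 and x_right > x_left")
    angle = math.atan2(x_right, distance) - math.atan2(x_left, distance)
    return math.degrees(angle)


def required_width_px(fov_total_deg, sub_deg, output_width_px=1280):
    """FOV / FOV_sub * output width, as in the comment above."""
    return fov_total_deg / sub_deg * output_width_px


# Hypothetical example: a 20 ft wide strip centered in front of the camera,
# 30 ft away, with a 180 degree total field of view.
sub = fov_sub_deg(-10.0, 10.0, 30.0)
print(round(sub, 2), round(required_width_px(180.0, sub)))
```

The same strip seen farther away or off to one side subtends a smaller angle, so the worst case (the farthest corner of the court) is the one that sets the required resolution.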

Sorry for the confusion. What I meant was the video produced should have the player centered and have a resolution of 1280x720. I thought that's what you meant by "area to cover the player"? If you're referring to the camera covering the player, then 180 will cover the player as this FOV already covers the whole court, and we can discard frames where the player is off the court.

Crashalot ( 2016-09-08 18:30:53 -0500 )

Yes, I know. HERE is a picture of a basketball player, and HERE is a picture of a basketball player. Yes, I know they're not from the same point of view, but they make my point.

One has a small area around the player, the other has a very large area. You need to decide what is acceptable, then do the math to figure out what resolution that requires.

Tetragramm ( 2016-09-08 18:49:21 -0500 )

Thanks for your patience in explaining this, very much appreciated. My understanding is the area around the player would be defined in the production of the video since the camera view is supposed to capture the whole court. Put another way, the image captured by the camera(s) would cover both ends of the court. But say Player A stays on one end. So for frame 1 of the video, the algorithm would take the original image and crop everything except Player A (plus the area to yield the 1280x720 video).

Crashalot ( 2016-09-08 18:56:51 -0500 )
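The per-frame cropping step described in the comment above might look like the following Python sketch. It only computes the crop rectangle; the player's pixel position is assumed to come from some tracker that is not shown, and the function name and the 7680x2160 frame size are hypothetical.

```python
def crop_window(cx, cy, frame_w, frame_h, out_w=1280, out_h=720):
    """Return (x0, y0, x1, y1) of an out_w x out_h crop centered on (cx, cy).

    The window is shifted back inside the frame when the player is near an
    edge, so the player is only approximately centered there.
    """
    if out_w > frame_w or out_h > frame_h:
        raise ValueError("output size must fit inside the frame")
    x0 = int(round(cx - out_w / 2))
    y0 = int(round(cy - out_h / 2))
    # Clamp so the crop never leaves the full-court frame.
    x0 = max(0, min(x0, frame_w - out_w))
    y0 = max(0, min(y0, frame_h - out_h))
    return x0, y0, x0 + out_w, y0 + out_h


# Hypothetical 7680x2160 stitched frame, player tracked at pixel (4000, 1000).
print(crop_window(4000, 1000, 7680, 2160))  # (3360, 640, 4640, 1360)
# Player near the top-left corner: the window is clamped to the frame.
print(crop_window(100, 100, 7680, 2160))    # (0, 0, 1280, 720)
```

With an image stored as a NumPy array, the returned rectangle would be applied as `frame[y0:y1, x0:x1]`.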

Your question was, "How high definition does the camera need to be?" If your total resolution is too low, it will look like the second image and the player will be almost indistinguishable.

Tetragramm ( 2016-09-08 19:05:05 -0500 )

Yes, so I'm trying to work backwards to determine the minimum resolution needed for a custom camera (which may require multiple lenses to produce the resolution and FOV needed) if the goal is to produce a 1280x720 video where an arbitrary player is always centered. Could I email more specifics that may make this question simpler to answer? These comment boxes are not ideal. Thanks again for your assistance!

Crashalot ( 2016-09-08 19:14:47 -0500 )

I would rather not. The problem is rather simple, and I've explained it down to the exact equations you need. All that's left is you deciding how large an area needs to be in the video.

Tetragramm ( 2016-09-08 20:12:14 -0500 )


Asked: 2016-09-08 00:11:53 -0500

Seen: 916 times

Last updated: Sep 08 '16