The mean shift algorithm is described in D. Comaniciu and P. Meer, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 24 (2002), pp. 603-619. The functions found in the cv::cuda module implement the variant of the algorithm that uses the joint domain representation of the pixel feature space. In the joint domain representation, each pixel is described by five features: three color coordinates and two spatial coordinates. As a result of the mean shift algorithm, each pixel is assigned a point in the 5D feature space corresponding to the location of the closest local maximum of the pixel density distribution (a.k.a. the mode of the distribution).
In the context of image segmentation, the mean shift procedure in the joint feature space has at least two advantages over the variant that uses only the color domain:
- Small local features of the color distribution are not trumped by features that dominate the overall image.
- Spatially separate segments of similar average colors are recognized as separate segments.
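The second advantage can be illustrated with a small, self-contained sketch. It is not the cv::cuda implementation: the names Feature, findMode and makeTwoBlobImage are hypothetical, and the flat kernel (a square spatial window of half-width sp combined with a Euclidean color radius sr) is an assumption made for illustration. Two red blocks of identical color, separated by a blue band, end up at two different modes because the spatial window keeps them apart.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One point of the joint domain: two spatial and three color coordinates.
struct Feature {
    double x, y;   // spatial part
    double c[3];   // color part
};

// Follow the mean shift iteration from `start` until the shift drops below `eps`.
// Flat kernel (assumed for this sketch): a pixel contributes when it lies inside
// the square spatial window of half-width `sp` AND within Euclidean color
// distance `sr` of the current point.
Feature findMode(const std::vector<Feature>& pixels, Feature start,
                 double sp, double sr, int maxIter = 100, double eps = 1e-3) {
    assert(sp > 0 && sr > 0);
    Feature cur = start;
    for (int it = 0; it < maxIter; ++it) {
        Feature sum{0, 0, {0, 0, 0}};
        int n = 0;
        for (const Feature& p : pixels) {
            if (std::abs(p.x - cur.x) > sp || std::abs(p.y - cur.y) > sp) continue;
            double d2 = 0;
            for (int k = 0; k < 3; ++k) d2 += (p.c[k] - cur.c[k]) * (p.c[k] - cur.c[k]);
            if (d2 > sr * sr) continue;
            sum.x += p.x;
            sum.y += p.y;
            for (int k = 0; k < 3; ++k) sum.c[k] += p.c[k];
            ++n;
        }
        if (n == 0) break;  // empty window: nothing to average, stay where we are
        Feature next{sum.x / n, sum.y / n, {sum.c[0] / n, sum.c[1] / n, sum.c[2] / n}};
        double shift = std::abs(next.x - cur.x) + std::abs(next.y - cur.y);
        for (int k = 0; k < 3; ++k) shift += std::abs(next.c[k] - cur.c[k]);
        cur = next;
        if (shift < eps) break;
    }
    return cur;
}

// Hypothetical test image, stored row by row: 20x5 pixels, a red block on the
// left (x < 5), an identical red block on the right (x >= 15), blue in between.
std::vector<Feature> makeTwoBlobImage() {
    std::vector<Feature> img;
    for (int y = 0; y < 5; ++y)
        for (int x = 0; x < 20; ++x) {
            bool red = (x < 5) || (x >= 15);
            img.push_back(Feature{double(x), double(y),
                                  {red ? 200.0 : 0.0, 0.0, red ? 0.0 : 200.0}});
        }
    return img;
}
```

Calling findMode with sp = 3 and sr = 10 from a pixel in the left block and from a pixel in the right block yields modes with the same color but different spatial coordinates. A color-only variant would have a single mode for both blocks.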
The dstsp matrix maps each pixel of the original image to the (approximate) location of the mode it belongs to, projected onto the spatial part of the joint feature space. All pixels belonging to the same segment are mapped to approximately the same location, within the spatial tolerance given by the input parameter sp.
Similarly, the dstr matrix maps each pixel to the approximate location of its mode in the color space.
The dstsp and the dstr matrices are jointly used in the meanShiftSegmentation() function to cluster pixels into segments. Beyond this, the dstsp matrix need not have any special meaning. The use of the spatial coordinates in the mean shift algorithm simply prevents mixing of spatially separate modes.
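To make the clustering step concrete, here is a rough sketch of how per-pixel mode locations could be grouped into segments. It is only an assumption about the general idea, not the actual meanShiftSegmentation() code: PixelMode and labelSegments are hypothetical names, and the greedy "join the first matching segment" rule is chosen for brevity. Each PixelMode holds what one pixel's entries in dstsp (spatial part) and dstr (color part) would represent.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Per-pixel result of the mean shift procedure: where the pixel's mode sits
// in space (the role of dstsp) and in color (the role of dstr).
struct PixelMode {
    double x, y;   // spatial location of the mode
    double c[3];   // color of the mode
};

// Assign a segment label to every pixel. A pixel joins the first existing
// segment whose representative mode agrees with its own mode within `sp` in
// space (per axis) and within `sr` in color (Euclidean distance); otherwise
// it starts a new segment. Greedy rule assumed for this sketch.
std::vector<int> labelSegments(const std::vector<PixelMode>& modes, double sp, double sr) {
    assert(sp >= 0 && sr >= 0);
    std::vector<PixelMode> reps;                 // one representative mode per segment
    std::vector<int> labels(modes.size(), -1);
    for (std::size_t i = 0; i < modes.size(); ++i) {
        const PixelMode& m = modes[i];
        for (std::size_t s = 0; s < reps.size() && labels[i] < 0; ++s) {
            double dc2 = 0;
            for (int k = 0; k < 3; ++k) dc2 += (m.c[k] - reps[s].c[k]) * (m.c[k] - reps[s].c[k]);
            bool closeInSpace = std::abs(m.x - reps[s].x) <= sp && std::abs(m.y - reps[s].y) <= sp;
            if (closeInSpace && dc2 <= sr * sr) labels[i] = static_cast<int>(s);
        }
        if (labels[i] < 0) {                     // no match: open a new segment
            labels[i] = static_cast<int>(reps.size());
            reps.push_back(m);
        }
    }
    return labels;
}
```

In this sketch the spatial part of the mode is used only as a test of "same place or not", which matches the point above: its role is to keep spatially separate modes from being merged, even when their colors agree.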
Really? https://docs.opencv.org/master/d0/d05...
Maybe looking at remapping helps to understand those (for every pixel position, they contain the new, mapped position).
Thanks for your comment. As I understand, cv::remap performs a (2D) geometrical transformation of the image, as opposed to meanShiftProc(), which performs a (3D) transformation in the color space. Unless I am wrong on either of these points, the exact nature of the "position of mapped points" is still unclear. Since the type of dstsp is the same as one of the types that can be used as the map for cv::remap, it sounds reasonable to assume that there is a connection between the two. Now the question is what kind of connection.