Wednesday, August 5, 2009

Activity 12 - Color Image Segmentation


From the previous activities, we found that thresholding is a powerful way of separating a region of interest (ROI) from the rest of an image. The technique has its limitations, however. Thresholding can only be applied to images with single-valued pixels, such as grayscale images. A colored image (where each pixel has 3 values, RGB) must first be reduced to grayscale before thresholding, and after the reduction different colors may appear as the same gray level. As a result, separating the ROI may not be possible by thresholding alone. In this activity, we use another technique for ROI separation in colored images: color image segmentation.
Colored images may contain different shades (brightness levels) of the same color. It is therefore better to represent the image in terms of separate brightness and chromaticity information. This representation is called the normalized chromaticity coordinates (NCC).

Transforming RGB to NCC
Per pixel, we add up the RGB values I = R+G+B. The NCC is then:
r = R/I
g = G/I
b = B/I
Note that r+g+b = 1, so b = 1-r-g and only the r and g values are needed to represent the chromaticity. We have essentially transformed RGB to rgI values, where I accounts for the brightness of the pixel.
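
As a small illustration of the transformation above, here is a minimal Python sketch. The function name rgb_to_ncc and the sample pixel values are hypothetical, and the handling of a pure black pixel (I = 0, where the chromaticity is undefined) is an assumption made here, not something taken from the activity.

```python
def rgb_to_ncc(pixel):
    """Convert one (R, G, B) pixel to (r, g, I)."""
    R, G, B = pixel
    I = R + G + B
    if I == 0:
        # Assumption: a black pixel has no defined chromaticity,
        # so we assign it the neutral point r = g = 1/3.
        return (1.0 / 3.0, 1.0 / 3.0, 0)
    return (R / I, G / I, I)


# Two shades of the same red: same ratio of R:G:B, different brightness.
bright_red = rgb_to_ncc((200, 40, 40))
dark_red = rgb_to_ncc((100, 20, 20))

print(bright_red)
print(dark_red)
```

The two pixels end up with the same (r, g) pair and differ only in I, which is the reason for working in NCC instead of raw RGB.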

Two techniques will be used in this activity - parametric and non-parametric segmentation.

Parametric segmentation

In parametric segmentation, we crop a portion of the ROI and compute the mean and standard deviation of its r and g chromaticities. To decide whether a pixel belongs to the ROI, we compute its probability of belonging to the ROI. Since each pixel has two chromaticities, we calculate the joint probability, treating r and g as independent:

p = p(r)*p(g)

where p(r) and p(g) are the probabilities that a pixel with chromaticities r and g, respectively, belongs to the ROI. We assume a Gaussian function for each chromaticity, using the calculated mean and standard deviation, so that p(r) and p(g) each take the form:

p(x) = 1/(σ*sqrt(2π)) * exp( -(x-μ)^2 / (2σ^2) )

where x is the r or g chromaticity of the pixel, and μ and σ are the mean and standard deviation of that chromaticity in the cropped portion of the ROI.
The result of the whole process is a matrix of p values with the same size as the image. High values of p occur only in the ROI, so we have effectively separated the ROI from the rest of the image. Below are examples of the results using this method.
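
The procedure can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the code used for the figures: the function names, the tiny 2x2 image and the red patch values are all hypothetical, and the small floor on the standard deviation is an added safeguard against a perfectly uniform patch.

```python
import math


def chromaticity(pixel):
    """Return the (r, g) chromaticity of an (R, G, B) pixel."""
    R, G, B = pixel
    I = R + G + B
    # Assumption: black pixels are mapped to the neutral point.
    return (1.0 / 3.0, 1.0 / 3.0) if I == 0 else (R / I, G / I)


def mean_std(values):
    """Mean and (population) standard deviation of a list of numbers."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    # Assumption: floor sigma so a uniform patch does not divide by zero.
    return mu, max(math.sqrt(var), 1e-6)


def gaussian(x, mu, sigma):
    """Gaussian function evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def parametric_segment(image, patch):
    """Return a matrix of p = p(r) * p(g), one value per image pixel."""
    rs, gs = zip(*(chromaticity(p) for p in patch))
    mu_r, sd_r = mean_std(rs)
    mu_g, sd_g = mean_std(gs)
    result = []
    for row in image:
        out_row = []
        for pixel in row:
            r, g = chromaticity(pixel)
            out_row.append(gaussian(r, mu_r, sd_r) * gaussian(g, mu_g, sd_g))
        result.append(out_row)
    return result


# Hypothetical data: a cropped red patch and a 2x2 image (red, blue / green, dark red).
patch = [(200, 40, 40), (180, 50, 45), (220, 35, 50), (190, 45, 40)]
image = [[(210, 42, 44), (30, 40, 200)],
         [(40, 180, 50), (100, 22, 21)]]

p_map = parametric_segment(image, patch)
for row in p_map:
    print(row)
```

Note that the Gaussian values are densities, so p can be larger than 1; what matters for segmentation is that the red pixels get values many orders of magnitude higher than the others. The dark red pixel also scores high because its chromaticity matches the patch even though its brightness does not.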


Fig.1 (upper left) the raw image and (rest) the segmented images of red, yellow and blue patches.


Fig.2. (left) the raw image and (right) the segmented image using parametric segmentation

Here we see a complete segmentation of the ROI.

Non-parametric segmentation

Unlike the previous technique, this one does not assume any functional form. All that is needed is the 2D chromaticity histogram of a portion of the ROI. (Note that the axes are the r and g chromaticities divided into discrete levels.) Binning is important when creating the histogram because it determines the quality of the segmentation. If the bin sizes are small, the details of the ROI are preserved, but the tolerance of the technique to variations in chromaticity is very low, which results in dark regions within the ROI. If, on the other hand, the bin sizes are large, the tolerance is high and most of the pixels in the ROI are bright, but we compromise its details. The choice of binning depends on what we want. In this particular case, we used histograms with 256, 32 and 10 bins to demonstrate the differences in their reconstructions. Once the histogram is created, we back-project the chromaticity values of each pixel in the image onto the histogram. The steps of backprojection are listed below.
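
To make the effect of binning concrete, here is a minimal sketch of building the 2D chromaticity histogram in plain Python. The function names and the four yellow patch pixels are hypothetical, and the sketch assumes the same number of bins along the r and g axes.

```python
def chromaticity_histogram(patch, n_bins):
    """Build an n_bins x n_bins histogram of (r, g) chromaticities."""
    hist = [[0] * n_bins for _ in range(n_bins)]
    for R, G, B in patch:
        I = R + G + B
        if I == 0:
            continue  # Assumption: skip black pixels (chromaticity undefined).
        # Map r and g in [0, 1] to a bin index; clamp so that 1.0 lands in the last bin.
        r_idx = min(int(R / I * n_bins), n_bins - 1)
        g_idx = min(int(G / I * n_bins), n_bins - 1)
        hist[r_idx][g_idx] += 1
    return hist


def occupied_bins(hist):
    """Count the bins that received at least one pixel."""
    return sum(1 for row in hist for count in row if count > 0)


# Hypothetical yellow patch with slight variations in shade.
patch = [(200, 190, 20), (210, 200, 30), (190, 185, 25), (205, 200, 15)]

for n_bins in (256, 32, 10):
    hist = chromaticity_histogram(patch, n_bins)
    print(n_bins, "bins ->", occupied_bins(hist), "occupied")
```

With many small bins the patch pixels scatter over several separate bins, so a slightly different yellow in the image can fall into an empty bin (low tolerance). With few large bins they pile into the same bin, which tolerates more variation but no longer distinguishes nearby chromaticities.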


Figure 3. Verification of the histogram

Before the backprojection, we first verify that the peak of the histogram corresponds to the color of the patch in chromaticity space. The patch used is yellow, and we observe that the peak coincides with the yellow region.

  • Histogram backprojection
    • Obtain the chromaticity (r,g) values of each pixel in the image.
    • For each pixel, find its position (r,g) on the histogram and find its value.
    • Replace the value at the pixel location with the histogram value found.
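
The steps above can be sketched in plain Python as follows. This is a minimal illustration and not the code used for the figures: the function names, the yellow patch and the one-row image are hypothetical, and 32 bins is just one of the bin counts mentioned in the text.

```python
def bin_index(value, n_bins):
    """Map a chromaticity in [0, 1] to a bin index, clamping 1.0 to the last bin."""
    return min(int(value * n_bins), n_bins - 1)


def chromaticity(pixel):
    """Return the (r, g) chromaticity of an (R, G, B) pixel."""
    R, G, B = pixel
    I = R + G + B
    # Assumption: black pixels are mapped to the neutral point.
    return (1.0 / 3.0, 1.0 / 3.0) if I == 0 else (R / I, G / I)


def build_histogram(patch, n_bins):
    """2D histogram of the (r, g) values of the cropped ROI patch."""
    hist = [[0] * n_bins for _ in range(n_bins)]
    for pixel in patch:
        r, g = chromaticity(pixel)
        hist[bin_index(r, n_bins)][bin_index(g, n_bins)] += 1
    return hist


def backproject(image, hist):
    """Replace every pixel with the histogram value at its (r, g) position."""
    n_bins = len(hist)
    result = []
    for row in image:
        out_row = []
        for pixel in row:
            r, g = chromaticity(pixel)                      # step 1: get (r, g)
            value = hist[bin_index(r, n_bins)][bin_index(g, n_bins)]  # step 2: look up
            out_row.append(value)                           # step 3: write the value
        result.append(out_row)
    return result


# Hypothetical data: a yellow patch and an image with one yellow and one blue pixel.
patch = [(200, 190, 20), (210, 200, 30), (190, 185, 25), (205, 200, 15)]
image = [[(202, 192, 22), (20, 30, 200)]]

segmented = backproject(image, build_histogram(patch, 32))
print(segmented)
```

The yellow pixel lands in a populated bin and receives a nonzero value, while the blue pixel lands in an empty bin and becomes zero.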

In this technique, high values are obtained when the (r,g) values of a pixel in the image correspond to (r,g) values where the histogram is high. Shown in Fig.4 is the result of this technique.


Figure 4. (clockwise from upper left): the raw image and the segmented images using (1) 256 bins, (2) 10 bins and (3) 32 bins in the histogram. (Note that a larger number of bins corresponds to smaller bin sizes.)

In Fig.4, we indeed see that the segmentation quality differs for different bin sizes. This technique is much better than the previous one since it does not make any assumptions about the form of the distribution. The only challenge is finding the proper bin size for the result one wants.

In this activity, I give myself a grade of 9/10 for doing a good job.

I thank Ma'am Jing for helping me get through some problems encountered in this activity.

