1 of 9 for Introduction to Image Processing

Why image processing?

Welcome to my blog for the course Introduction to Image Processing!

Apr 20, 2023

Images are all around us: capturing precious memories, conveying crucial information, and stirring emotions. In this digital age, millions of digital photographs (and videos too!) are likely taken every minute. Images hold substantial amounts of information, ranging from your family’s holiday trips and X-ray images of your body to spy satellite imagery tracking the troop and military hardware movements of adversarial nations.

As a matter of fact, image processing played a major role in the events preceding the most intense 13 days in the history of the world: the Cuban Missile Crisis. In October 1962, high-altitude reconnaissance images taken over Cuba were manually processed by National Photographic Interpretation Center (NPIC) technicians. They used photogrammetry to measure the size and location of the objects in the images, and image enhancement techniques to increase their clarity. Their analysis revealed the presence of Soviet ballistic missiles in Cuba, all within striking distance of almost the entire continental United States. While this event brought the world to the brink of nuclear war, the crisis was resolved peacefully, thanks in great part to how image processing can extract critical information from images and inform decision-making in an event that could have changed the course of human history forever.

Cuban missile launch sites were identified from U-2 spy plane images (National Archives)

On a less gloomy and cuter note, image processing can also be used to distinguish between two oddly similar-looking objects such as chihuahuas and muffins! This remains one of the most famous binary classification problems, one that continues to challenge computer vision algorithms. To be fair, if you squint hard enough, a raisin muffin does look like your feisty little chihuahua doggo.

Perhaps the most famous example for computer vision — chihuahua or muffin? (Medium)

But let’s backtrack for a minute here: what exactly are images? In simple terms, an image is a two-dimensional representation of an object or a scene, which can be captured by a camera or generated by a computer. Images can be grayscale or colored, static or dynamic, and vary in resolution and size. Images are simply functions and representations of data! Each datapoint in an image is encapsulated in a pixel, the basic building block of digital images. A combination of pixels constitutes a digital image. The more pixels you have, the higher the resolution of the image.
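To make the "images are data" idea concrete, here is a minimal sketch of how a tiny image could be stored as a NumPy array. The arrays `tiny_gray` and `tiny_rgb` are made-up examples for illustration, not taken from any photo in this post.

```python
import numpy as np

# A tiny 2x3 grayscale "image": each entry is one pixel's brightness (0-255).
tiny_gray = np.array([[0, 128, 255],
                      [255, 128, 0]], dtype=np.uint8)

# A 2x3 color image adds a third axis holding the (R, G, B) values of each pixel.
tiny_rgb = np.zeros((2, 3, 3), dtype=np.uint8)
tiny_rgb[0, 0] = [255, 0, 0]  # make the top-left pixel pure red

# The number of pixels is simply height times width.
height, width = tiny_gray.shape
pixel_count = height * width
print(tiny_gray.shape, tiny_rgb.shape, pixel_count)
```

A real photo works the same way, only with far more rows and columns.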

Color models

In this section, we will explore the three most commonly used color models in image processing and computer vision: binary, grayscale, and colored images.

Let’s use my mini dachshund, Pepper, to help us better understand the differences between these color models.

from skimage.io import imread, imshow
ppr = imread('ppr.jpg')  # load Pepper's photo as an array of pixels
imshow(ppr)

**Pepper**, my two-year-old dapple mini dachshund

Binary

Binary images typically only contain black and white. To put it simply, this color model shows either the presence or the absence of color. These are typically used in edge detection, object detection, and image segmentation. An input image is converted to a binary image by setting each pixel to either black (0) or white (1) based on a set threshold value.

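from skimage import img_as_uint
from skimage.color import rgb2gray
ppr_binary = img_as_uint(rgb2gray(ppr) > rgb2gray(ppr).mean())  # threshold at the mean gray level
imshow(ppr_binary)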

Binary images are reminiscent of old textbooks printed on dot matrix printers. The binary version of Pepper’s photo clearly delineates Pepper’s left ear against the orange bedroom wall. However, the green wall outside the bedroom seems to fall below the set threshold (i.e. the grayscale mean), turning it black together with Pepper’s already black fur.
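The same mean-threshold idea can be sketched on a handful of made-up brightness values, which makes it easy to see which pixels end up white and which end up black. The `gray` array below is a hypothetical stand-in for a grayscale photo, with values already scaled between 0 and 1.

```python
import numpy as np

# Hypothetical grayscale values between 0 (black) and 1 (white).
gray = np.array([[0.1, 0.2, 0.9],
                 [0.8, 0.3, 0.7]])

# Use the mean brightness as the threshold, as in the Pepper example.
threshold = gray.mean()

# Pixels brighter than the threshold become True (white), the rest False (black).
binary = gray > threshold
print(binary.astype(np.uint8))
```

Picking the mean is only one option. A different threshold would move the boundary between black and white regions.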

Grayscale

Grayscale images, on the other hand, contain shades of gray between black (0) and white (1). These shades of gray are floats between 0 and 1. The values of the grays are taken from the weighted average of the red, green, and blue channels of the image. More specifically, the single value of each pixel in a grayscale image is a representation of its brightness level.

ppr_gray = rgb2gray(ppr)
imshow(ppr_gray)

Unlike with binary images, different levels of intensities (or brightness) are evident from the image. Even without colors, one can easily identify the objects present in the image.
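To illustrate the weighted average behind a grayscale conversion, here is a small sketch using plain NumPy. The weights are the ones documented for skimage's `rgb2gray`. The helper `to_gray` and the sample pixels are my own illustration, not the library's implementation.

```python
import numpy as np

# Luminance weights for red, green, and blue (as documented for skimage's rgb2gray).
WEIGHTS = np.array([0.2125, 0.7154, 0.0721])

def to_gray(rgb_uint8):
    """Convert an 8-bit RGB array (..., 3) to brightness floats in [0, 1]."""
    rgb = rgb_uint8.astype(np.float64) / 255.0
    return rgb @ WEIGHTS

# One row of pixels: white, black, pure red, pure green, pure blue.
pixels = np.array([[[255, 255, 255],
                    [0, 0, 0],
                    [255, 0, 0],
                    [0, 255, 0],
                    [0, 0, 255]]], dtype=np.uint8)

gray = to_gray(pixels)
print(gray.round(4))
```

Green carries the largest weight, so a pure green pixel comes out brighter than a pure red or pure blue one.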

Colored

As the name implies, colored images contain information based on a full range of colors. We will tackle two different color models used for colored images — namely the RGB and the HSV color models.

RGB Color Model

The RGB color model is the most commonly used model for colored images. It uses the three primary colors: red, green, and blue. Each pixel holds three values, one each for red, green, and blue, also called the color channels. In a standard 8-bit image, each value ranges from 0 to 255, where 0 means that the color channel is completely turned off and 255 means that it is completely turned on. Each color channel is therefore represented by an 8-bit integer (i.e. 256 possible values).

import numpy as np

# One row of three pixels: pure red, pure green, pure blue
imshow(np.array([[
    [255, 0, 0],
    [0, 255, 0],
    [0, 0, 255]
]], dtype=np.uint8))

Red, *green*, and *blue* channels completely on

# Mixed channel values: orange, light cyan, and a dark blue
imshow(np.array([[
    [255, 128, 0],
    [103, 255, 250],
    [32, 69, 120]
]], dtype=np.uint8))

Different channels turned on result in different color combinations. Colors aside from red, green, and blue are simply combinations of these three primary colors in different proportions.
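As a quick sketch of how channel combinations produce other colors, the snippet below adds pure primaries together and counts how many distinct colors 8 bits per channel can encode. The variable names are just for illustration.

```python
import numpy as np

red = np.array([255, 0, 0], dtype=np.uint8)
green = np.array([0, 255, 0], dtype=np.uint8)
blue = np.array([0, 0, 255], dtype=np.uint8)

# These primaries do not overlap, so the sums never exceed 255 and uint8 is safe.
yellow = red + green          # red and green fully on
magenta = red + blue          # red and blue fully on
white = red + green + blue    # all three channels fully on

# 8 bits per channel gives 256 levels, and three channels multiply together.
levels_per_channel = 2 ** 8
total_colors = levels_per_channel ** 3
print(yellow, magenta, white, total_colors)
```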

Going back to Pepper, we can dissect the colored image into its different RGB channels to see the varying levels of each primary color (or color channel) for each pixel, extending to the entire image.

import matplotlib.pyplot as plt

# Show each channel with a matching colormap (light = low value, dark = high value)
fig, ax = plt.subplots(1, 3, figsize=(12,4))
ax[0].imshow(ppr[:,:,0], cmap='Reds')
ax[0].set_title('Red')
ax[1].imshow(ppr[:,:,1], cmap='Greens')
ax[1].set_title('Green')
ax[2].imshow(ppr[:,:,2], cmap='Blues')
ax[2].set_title('Blue')

Because the 'Reds', 'Greens', and 'Blues' colormaps run from near white (low values) to a dark shade (high values), darker regions in each panel mean more of that color. The blue squares on the bedsheet are dark in the blue channel and visibly white in the red and green channels, reflecting the low values of red and green in a blue section of the image. Meanwhile, the yellow sections of the bedsheet appear dark in both the red and green channels while significantly lighter in the blue channel, because yellow is simply a combination of red and green. Lastly, Pepper’s black fur is extremely light in all channels, since black has zero or low values across all three color channels.
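The channel behavior described above can be checked on a made-up three-pixel image containing a blue, a yellow, and a black pixel. This is only a sketch with hypothetical values, not Pepper's actual photo.

```python
import numpy as np

# One row of three pixels: blue, yellow, black.
img = np.zeros((1, 3, 3), dtype=np.uint8)
img[0, 0] = [0, 0, 255]      # blue: only the blue channel is on
img[0, 1] = [255, 255, 0]    # yellow: red and green on, blue off
img[0, 2] = [0, 0, 0]        # black: every channel off

# Slicing the last axis pulls out one channel at a time.
red_channel = img[:, :, 0]
green_channel = img[:, :, 1]
blue_channel = img[:, :, 2]
print(red_channel, green_channel, blue_channel)
```

The yellow pixel shows up in the red and green channels, the blue pixel only in the blue channel, and the black pixel in none of them.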

HSV Color Model

Colored images can likewise be represented in terms of their hue, saturation, and value — shortened as HSV.

Hue refers to the actual color of the pixel such as red, green, or blue, regardless of how bright or saturated the color is. This is measured in terms of its relative angle on the color wheel.

HSV Color Wheel (or Color Cone?) (Mathworks)

Saturation refers to the intensity or purity of the hue. Low saturation makes the color appear whiter or grayer while higher saturation makes the color appear more vibrant and pure.

Lastly, value simply refers to the brightness of the color. Low values make the color appear dark while high values make it appear brighter.
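To see hue, saturation, and value move independently, here is a small sketch using Python's built-in `colorsys` module (a swap-in for skimage so the example has no dependencies). The helper `describe` and the sample colors are hypothetical choices for illustration.

```python
import colorsys

def describe(r, g, b):
    """Return (hue in degrees, saturation, value) for an 8-bit RGB color."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s, 2), round(v, 2)

pure_red = describe(255, 0, 0)      # fully saturated, fully bright
pink = describe(255, 128, 128)      # same hue, lower saturation (whiter)
dark_red = describe(128, 0, 0)      # same hue, lower value (darker)
pure_green = describe(0, 255, 0)    # a different angle on the color wheel

for name, hsv in [("red", pure_red), ("pink", pink),
                  ("dark red", dark_red), ("green", pure_green)]:
    print(name, hsv)
```

Red, pink, and dark red all share a hue of 0 degrees. Only their saturation or value changes.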

RGB color models are typically used when color information across the three color channels is the most important aspect of the image, such as in digital photography, color-based image manipulation, and white balancing. On the other hand, HSV color models are preferable when the hue itself is the most important aspect. Tasks such as color-based object detection and tracking tend to focus on the target color instead of the brightness or intensity of the color. The skimage library can convert between RGB and HSV, as in Pepper’s image below.

ppr_hsv = rgb2hsv(ppr)
fig, ax = plt.subplots(1, 3, figsize=(12,4))
ax[0].imshow(ppr_hsv[:,:,0], cmap='hsv')
ax[0].set_title('Hue')
ax[1].imshow(ppr_hsv[:,:,1], cmap='gray')
ax[1].set_title('Saturation')
ax[2].imshow(ppr_hsv[:,:,2], cmap='gray')
ax[2].set_title('Value')

Quite noticeably, the hue channel simply contains the raw color information of each pixel without regard to its intensity and brightness. In the saturation channel, more vibrant colors appear lighter while washed-out colors appear darker, since the gray colormap maps high values to white. Interestingly, the value channel is akin to a grayscale version of the image. As mentioned previously, the value gives the level of brightness of each pixel, taken as the maximum of its red, green, and blue values. This is similar to grayscale color models, which also provide different levels of brightness between black (0) and white (1) as floats, although grayscale uses a weighted average of the channels instead. Awesome.
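The value channel and a grayscale conversion both describe brightness, but they are not computed the same way. Below is a rough sketch of the difference on three made-up pixels, assuming value is the largest of the three channels and grayscale uses the luminance weights documented for skimage's `rgb2gray`.

```python
import numpy as np

# Three pixels scaled to [0, 1]: pure red, pure blue, white.
pixels = np.array([[255, 0, 0],
                   [0, 0, 255],
                   [255, 255, 255]], dtype=np.float64) / 255.0

# HSV value: the largest of the three channels.
value = pixels.max(axis=1)

# Grayscale: a weighted average of the channels (assumed rgb2gray weights).
gray = pixels @ np.array([0.2125, 0.7154, 0.0721])

print(value)
print(gray.round(4))
```

Pure red and pure blue both have the maximum value of 1, yet their grayscale brightness is much lower than white's, so the two images look alike without being identical.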

Conclusion

Binary, grayscale, and colored images are the three most common types of images used in image processing. Binary images contain only two values: black and white. Grayscale images contain different shades of gray between black and white. Colored images can be represented using multiple color models, such as RGB and HSV.

The RGB color model is the most commonly used color model representing pixels in terms of the three primary colors. The HSV color model represents colors based on their hue, saturation, and value.

The selection of a color model boils down to the specific use case and the types of images to be analyzed. There is no one-size-fits-all solution in image processing, considering the high variability between images and their expansive uses in our everyday lives. Understanding the concepts and the potential applications of image processing can lead to exciting discoveries and innovations in various fields.

References

National Archives. (1962). Photograph of MRBM Field Launch Site №1 in San Cristobal, Cuba, 14 October 1962. photograph. Retrieved from https://catalog.archives.gov/id/193926.

MathWorks. (n.d.). Understanding Color Spaces and Color Space Conversion. MATLAB & Simulink. Retrieved April 14, 2023, from https://www.mathworks.com/help/images/understanding-color-spaces-and-color-space-conversion.html

Yao, M. (2018, August 7). Chihuahua or Muffin? my search for the Best Computer Vision API. Medium. Retrieved April 14, 2023, from https://medium.com/free-code-camp/chihuahua-or-muffin-my-search-for-the-best-computer-vision-api-cbda4d6b425d