Image Segmentation Part 1

Samruddhi Chitnis
3 min read · Jan 14, 2022

Table of Contents:

  1. What is Image Segmentation/Semantic Segmentation?
  2. Where is Image Segmentation used?
  3. How can we implement Image Segmentation?
  4. Unet Architecture
  5. Conclusion
  6. References

Image Source: https://www.researchgate.net/figure/Example-of-2D-semantic-segmentation-Top-input-image-Bottom-prediction_fig3_326875064

1. What is Image Segmentation/Semantic Segmentation?

Image segmentation is a kind of classification in which every pixel of the image is assigned to a class. For instance, in the image above, the upper part is the actual image and the lower part is the segmented image, where we can clearly see that every pixel has been assigned to some class. Pixels that do not belong to any class of interest are usually put in a background class and labeled black.
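To make the "one class per pixel" idea concrete, here is a minimal sketch in plain Python. It assumes a model has already produced one score per class for every pixel. The class names and the score values are made up for illustration, and index 0 is assumed to be the background class.

```python
# Hypothetical class names; index 0 is assumed to be the background class.
CLASS_NAMES = ["background", "road", "car"]


def scores_to_mask(scores):
    """Turn per-pixel class scores (H x W x num_classes) into a label mask (H x W)."""
    mask = []
    for row in scores:
        mask_row = []
        for pixel_scores in row:
            # The predicted class for a pixel is the index of its highest score.
            best_class = max(range(len(pixel_scores)), key=lambda c: pixel_scores[c])
            mask_row.append(best_class)
        mask.append(mask_row)
    return mask


# A made-up 2 x 3 "image": every pixel carries one score per class.
scores = [
    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]],
    [[0.6, 0.3, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]],
]

mask = scores_to_mask(scores)
for mask_row in mask:
    print([CLASS_NAMES[c] for c in mask_row])
```

The resulting mask has the same height and width as the input, which is what separates segmentation from ordinary image classification, where the whole image gets a single label.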

2. Where is Image Segmentation used?

Image segmentation is used in a wide range of applications and domains. One of the domains where it is used the most is the medical domain. Beyond that, image segmentation is also used in self-driving cars, geo-sensing, agriculture, construction, and more.

3. How can we implement Image Segmentation?

One of the most common and widely used architectures I came across while implementing image segmentation is the Unet architecture. Unet was originally developed for biomedical image segmentation. You can read the original Unet paper here.

4. Unet Architecture

Image Source: https://miro.medium.com/max/1838/1*f7YOaE4TWubwaFF7Z1fzNw.png

As can be seen from the image above, the Unet architecture resembles the letter U. Unet has two main components: the Encoder and the Decoder. The Encoder is made up of convolutional layers and max pooling layers, while the Decoder is made up of convolutional layers together with upsampling or transposed convolution layers. Because Unet contains only convolutional layers and no Dense layer, it is not tied to one fixed input size. In practice the height and width usually need to be compatible with the number of pooling steps so that the Encoder and Decoder feature maps line up.
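The U shape can be traced with a little arithmetic. The sketch below is not a model. It only tracks the height and width of the feature maps, assuming each Encoder level halves the size with 2x2 max pooling, each Decoder level doubles it again, and the convolutions keep the size unchanged ("same" padding). The function name `unet_spatial_sizes` and the default depth of 4 are assumptions for illustration.

```python
def unet_spatial_sizes(height, width, depth=4):
    """Trace feature-map sizes down the Encoder and back up the Decoder."""
    factor = 2 ** depth
    if height % factor or width % factor:
        # With plain halving and doubling, sizes only line up when the input
        # is divisible by 2**depth; otherwise cropping or padding is needed.
        raise ValueError(f"height and width must be divisible by {factor}")

    encoder = []
    h, w = height, width
    for _ in range(depth):
        encoder.append((h, w))      # feature map size before pooling
        h, w = h // 2, w // 2       # 2x2 max pooling halves each dimension
    bottleneck = (h, w)

    decoder = []
    for _ in range(depth):
        h, w = h * 2, w * 2         # upsampling doubles each dimension
        decoder.append((h, w))
    return encoder, bottleneck, decoder


encoder, bottleneck, decoder = unet_spatial_sizes(128, 128)
print("encoder:   ", encoder)
print("bottleneck:", bottleneck)
print("decoder:   ", decoder)

# A different, non-square input goes through the same steps.
print(unet_spatial_sizes(96, 160))
```

The Decoder sizes come out as the Encoder sizes in reverse order, which is why each Decoder level has an Encoder level of matching size to pair with.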

What is an Encoder?

The Encoder is a stack of convolutional and max pooling layers. It is responsible for capturing the context, or features, of the image, which is the job of the convolutional layers. We can either build the Encoder from scratch or use a pretrained model such as VGG16 or MobileNet as the Encoder (transfer learning). I will explain how to implement both of them using Keras in the next parts of this Image Segmentation series.
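To show what the max pooling step in the Encoder does to a feature map, here is a small plain-Python sketch of 2x2 max pooling with stride 2. It works on a single-channel feature map stored as a list of rows. The helper name `max_pool_2x2` and the numbers are made up for illustration, and a real model would use a framework layer instead.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value from each non-overlapping 2x2 block."""
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    # Step through the map two rows and two columns at a time.
    # A trailing odd row or column is dropped in this sketch.
    for r in range(0, height - 1, 2):
        pooled_row = []
        for c in range(0, width - 1, 2):
            block = (
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            pooled_row.append(max(block))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 9, 8],
    [2, 6, 7, 3],
]
pooled = max_pool_2x2(feature_map)
print(pooled)
```

The 4 x 4 map shrinks to 2 x 2, so each pooling step halves the height and width while keeping the strongest response in every block.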

Skip Connections

While the Encoder captures the context, or features, of the image, we store some of its intermediate feature maps and pass them to the Decoder through skip connections, so that the Decoder can reuse the features present at the early stages when recreating the segmented image.
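One common way to hand the stored features to the Decoder is to concatenate them with the Decoder's own features along the channel axis. The sketch below assumes that approach and uses plain Python lists in height x width x channels layout. The helper name `concat_channels` and the values are hypothetical.

```python
def concat_channels(decoder_features, skip_features):
    """Join two H x W x C feature maps along the channel axis."""
    same_height = len(decoder_features) == len(skip_features)
    same_width = len(decoder_features[0]) == len(skip_features[0])
    if not (same_height and same_width):
        # Skip connections only work when both maps have the same height and width.
        raise ValueError("feature maps must have the same height and width")
    return [
        [d_pixel + s_pixel for d_pixel, s_pixel in zip(d_row, s_row)]
        for d_row, s_row in zip(decoder_features, skip_features)
    ]


# Made-up 2 x 2 feature maps: the Decoder map has 2 channels per pixel,
# the map saved from the Encoder has 1 channel per pixel.
decoder_features = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6], [0.7, 0.8]],
]
skip_features = [
    [[1.0], [2.0]],
    [[3.0], [4.0]],
]

merged = concat_channels(decoder_features, skip_features)
print(merged)
```

After merging, every pixel carries 3 channels: the two from the Decoder followed by the one saved from the Encoder, so the next convolution can look at both at once.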

What is a Decoder?

The Decoder is responsible for constructing the segmented image. If you look at the Decoder architecture, which I will walk you through in the next part, you will find that it has no max pooling layer. Instead, it has either an upsampling layer followed by a convolutional layer, or a transposed convolutional layer, both of which upsample the outputs received from the Encoder. The Decoder takes the Encoder's output as its input and also uses the skip connections stored earlier from the Encoder to construct the final output image.
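As a rough picture of what the upsampling step does, here is a plain-Python sketch of nearest-neighbour upsampling by a factor of 2, which simply repeats every value in both directions. This is only one of the upsampling options mentioned above. A transposed convolution learns its own weights instead of repeating values. The helper name `upsample_nearest_2x` is made up for illustration.

```python
def upsample_nearest_2x(feature_map):
    """Double the height and width by repeating every value in a 2x2 block."""
    upsampled = []
    for row in feature_map:
        wide_row = []
        for value in row:
            wide_row.extend([value, value])   # repeat each value horizontally
        upsampled.append(wide_row)
        upsampled.append(list(wide_row))      # repeat the widened row vertically
    return upsampled


small = [
    [1, 2],
    [3, 4],
]
large = upsample_nearest_2x(small)
for row in large:
    print(row)
```

The 2 x 2 map becomes 4 x 4, undoing the size reduction of one max pooling step, although the fine detail lost by pooling has to come from the convolutions and the skip connections.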

5. Conclusion

This blog should give you a basic idea of what image segmentation is, its applications, and what an image segmentation model architecture looks like.

6. References

  1. https://towardsdatascience.com/understanding-semantic-segmentation-with-unet-6be4f42d4b47

Thank you all for reading this blog. I am just a beginner at writing blogs, so please leave any comments, feedback, or suggestions you have.

This is just part 1 of the Image Segmentation series. In the upcoming blogs I will explain image segmentation using Unet in more depth, with code.

You can find the next parts of this series here:

Part 2: https://medium.com/@samruddhichitnis02/image-segmentation-code-implementation-8efa18163d1f

Part 3: https://medium.com/@samruddhichitnis02/image-segmentation-transfer-learning-db40295abb58
