Computer Vision LAB (using PyTorch): Exhaustive Hands-On

Computer vision is a $20 billion industry that will be one of the most important job markets in the years to come.


Computer vision allows us to analyze and leverage image and video data, with applications in a variety of industries, including self-driving cars, social network apps and medical diagnostics.

Every 60 seconds, users upload more than 300 hours of video to YouTube, Netflix subscribers stream over 80,000 hours of video, and Instagram users like over 2 million photos!


The course aims to use the Python stack to work with vision data (images and videos) and build digital utility products.

It starts with reading and writing images using the Python libraries NumPy, OpenCV and PIL, then applies a variety of effects with OpenCV, including colour mapping, blending, thresholds, gradients and more.
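To make two of the effects named above concrete, here is a minimal sketch of binary thresholding and blending written in plain NumPy. It is not taken from the course material: the helper names `binary_threshold` and `blend` are hypothetical, and a small synthetic gradient stands in for a loaded image. OpenCV offers the equivalent operations as `cv2.threshold` (with `cv2.THRESH_BINARY`) and `cv2.addWeighted`.

```python
import numpy as np


def binary_threshold(gray, thresh=127, maxval=255):
    """Set pixels strictly above `thresh` to `maxval` and all others to 0."""
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)


def blend(img_a, img_b, alpha=0.5):
    """Weighted sum alpha*img_a + (1 - alpha)*img_b, rounded and clipped to uint8."""
    mixed = alpha * img_a.astype(np.float32) + (1.0 - alpha) * img_b.astype(np.float32)
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)


# Synthetic 4x4 grayscale gradient (0, 16, ..., 240) standing in for a real image.
gray = np.arange(0, 256, 16, dtype=np.uint8).reshape(4, 4)

mask = binary_threshold(gray, thresh=127)   # bright half becomes 255, dark half 0
blended = blend(gray, mask, alpha=0.5)      # equal mix of the image and its mask
```

With a real photo, `gray` would come from an image loader such as `cv2.imread` in grayscale mode, and the rest of the sketch would stay the same.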

The course then covers video basics with OpenCV, including working with streaming video from a webcam, and video topics such as optical flow, object detection, face detection and object tracking. It then moves on to using deep learning and deep neural networks for image recognition and classification. We’ll even cover the latest deep learning networks, including YOLO (You Only Look Once).
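Object detection and tracking code commonly needs a way to measure how much two bounding boxes overlap, and intersection over union (IoU) is the usual choice. The sketch below is an illustration only, not part of the course material; it assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates, and the function name `iou` is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Two 2x2 boxes sharing a 1x1 corner: overlap 1, union 4 + 4 - 1 = 7.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A tracker could use such a score to decide whether a detection in the current frame matches a box from the previous one.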


Prerequisites

  • Python programming skills: matrix operations, statistics and calculus
  • Deep learning knowledge is a must

What you'll learn

  • Implementation of CNN
  • Handling bias and dealing with limited data
  • Concept of Transfer Learning
  • Multi-label image classification
  • PyTorch implementation in image processing with a capstone project
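
As a rough idea of what implementing a CNN and applying transfer learning can look like in PyTorch, here is a minimal sketch. It is an illustration under stated assumptions, not the course's own code: `SmallCNN` is a hypothetical model for 1x28x28 inputs, its weights are randomly initialised rather than pretrained, and a random batch stands in for real images. The freezing step shows the mechanism that transfer learning relies on, where the feature extractor is kept fixed and only the classifier head is trained.

```python
import torch
from torch import nn


class SmallCNN(nn.Module):
    """Hypothetical two-block CNN for 1x28x28 inputs (MNIST-sized)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1x28x28 -> 16x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 16x14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 32x14x14
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 32x7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, start_dim=1))


model = SmallCNN()

# Transfer-learning style freeze: keep the convolutional features fixed
# and leave only the classifier head trainable.
for param in model.features.parameters():
    param.requires_grad = False

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

# One training step on a random batch standing in for real images and labels.
images = torch.randn(8, 1, 28, 28)
labels = torch.randint(0, 10, (8,))
logits = model(images)
loss = nn.functional.cross_entropy(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Unfreezing for fine-tuning is the reverse operation: set `requires_grad = True` on the frozen parameters and build a new optimizer that includes them, typically with a smaller learning rate.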

Course Outline

  • Introduction to Convolutional Neural Networks (CNN)
  • Convolution, Pooling, Padding and their mechanisms
  • Activation & Loss Functions
  • Optimizers (SGD and Adam)
  • Learning Rate
  • Regularization techniques – Data Augmentation, Dropout, Batch Normalization
  • Forward propagation & Backpropagation for CNNs
  • Popular CNN architectures – VGG16, ResNet34, ResNet50
  • Transfer Learning
  • Unfreezing and fine-tuning models
  • Choosing optimal Learning Rate
  • Handwritten digit recognition using MNIST dataset
  • Multi-label image classification
  • Model development, interpretation and validation
  • Object Detection

Course Id: A105
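
The convolution, pooling and padding items in the outline come down to simple arithmetic on the spatial size of a feature map. The sketch below is an illustration rather than course material; the helper name `conv_output_size` is hypothetical, and it assumes no dilation, in which case the output length along one axis is floor((n + 2p - k) / s) + 1 for input size n, kernel size k, stride s and padding p.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one spatial axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


# 28-pixel input, 3x3 kernel, padding 1: the size is preserved.
same = conv_output_size(28, kernel=3, padding=1)      # 28

# 2x2 pooling with stride 2 halves the size.
pooled = conv_output_size(same, kernel=2, stride=2)   # 14

# No padding with a 5x5 kernel shrinks the size by kernel - 1.
valid = conv_output_size(28, kernel=5)                # 24
```

The same formula applies to pooling layers, since a pooling window slides over the input in the same way a convolution kernel does.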