Overview

To a computer, a photo is just a grid of numbers (pixels). It doesn’t know that the blob of pixels in the middle is a cat. Computer Vision (CV) is the field that teaches computers to interpret those numbers. It is what lets self-driving cars detect pedestrians and Facebook suggest tags for the faces in your photos.
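
To make the "grid of numbers" concrete, here is a minimal sketch in plain Python. The 3x3 image and its pixel values are made up for illustration, and the grayscale weights are the commonly used BT.601 luma coefficients, not something this article prescribes.

```python
# A tiny 3x3 "photo": each pixel is an (R, G, B) triple with values 0-255.
# The values are invented for illustration: a dark blob on a light background.
image = [
    [(255, 255, 255), (200, 200, 200), (255, 255, 255)],
    [(200, 200, 200), (30, 30, 30), (200, 200, 200)],
    [(255, 255, 255), (200, 200, 200), (255, 255, 255)],
]


def to_grayscale(rgb_image):
    """Collapse each (R, G, B) pixel to one brightness value (BT.601 luma weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


gray = to_grayscale(image)
height, width = len(gray), len(gray[0])

# All the computer "sees" is this table of numbers; nothing in it says "cat".
for row in gray:
    print(row)
```

Everything a vision system does starts from a table like this one; real photos simply have millions of entries instead of nine.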

Core Idea

The core idea is pattern recognition. Convolutional Neural Networks (CNNs) scan the image with small filters to detect features, and each layer builds on the features found by the layer before it. Roughly:

  • Layer 1 sees edges (lines).
  • Layer 2 sees shapes (circles).
  • Layer 3 sees objects (eyes).
  • Layer 4 sees faces.
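
As a rough illustration of what "seeing edges" means in the first layer, the sketch below slides one 3x3 filter over a small grayscale image. The filter here is a hand-written Sobel-style kernel chosen for the example; in a real CNN the filter values are learned from data, and the function name `convolve2d` is just a label used in this sketch.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map.

    Like most deep-learning libraries, this does not flip the kernel, so it is
    technically cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# A 4x6 grayscale image: dark on the left (0), bright on the right (9).
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Sobel-style kernel that responds to dark-to-bright changes from left to right.
vertical_edge = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

response = convolve2d(image, vertical_edge)
for row in response:
    print(row)  # large values where the window straddles the edge, 0 in flat regions
```

The response is zero wherever the window covers a flat region and large where it straddles the boundary. Later layers apply the same operation to these response maps rather than to raw pixels, which is how edges get combined into shapes and shapes into objects.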

Formal Definition

Computer vision is the field that deals with how computers can gain high-level understanding from digital images or videos.

Intuition

  • Human: You see a dog instantly.
  • Computer: It sees a matrix of Red, Green, and Blue values. It has to do millions of calculations to realize that “Pointy Ears” + “Fur Texture” + “Snout” = Dog.
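
The "Pointy Ears + Fur Texture + Snout = Dog" step can be sketched as a weighted sum squashed into a probability. This is a toy, not a real network: the feature names, scores, weights, and bias below are all hypothetical numbers picked for the example, whereas a trained model learns its own features and weights.

```python
import math

# Hypothetical outputs of earlier feature detectors, each between 0 and 1.
features = {"pointy_ears": 0.9, "fur_texture": 0.8, "snout": 0.7}

# Hypothetical weights and bias; a real network would learn these from data.
weights = {"pointy_ears": 1.5, "fur_texture": 1.0, "snout": 2.0}
bias = -2.0


def dog_probability(features, weights, bias):
    """Combine feature scores into a weighted sum, then squash it with a sigmoid."""
    score = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


p_dog = dog_probability(features, weights, bias)
p_blank = dog_probability({name: 0.0 for name in features}, weights, bias)

print(f"strong features: {p_dog:.2f}")
print(f"no features:     {p_blank:.2f}")
```

With strong feature scores the probability lands above 0.5, and with none it falls below, which is the arithmetic version of "these parts together look like a dog."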

Examples

  • Self-Driving Cars: Tesla Autopilot uses cameras to track lane lines, read stop signs, and avoid pedestrians.
  • Medical Imaging: On some narrow tasks, such as flagging certain tumors in X-rays, AI models have matched or outperformed specialists in published studies. A model also never gets tired or distracted.
  • Deepfakes: Using CV to swap faces in a video. A convincing fake can make the President appear to say anything. Scary stuff.

Common Misconceptions

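  • It’s solved: It isn’t; models are still easily fooled. Researchers have shown that a few carefully placed stickers on a stop sign can make a classifier read it as a speed limit sign, which could cause a self-driving car to drive straight through (an adversarial attack).

Related Concepts

  • OCR (Optical Character Recognition): Reading text from images (for example, turning a scanned PDF into searchable text).
  • Biometrics: Unlocking your phone with your face (Face ID).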
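
The adversarial attack mentioned under Common Misconceptions can be illustrated with a deliberately tiny model. The sketch below applies the fast gradient sign idea to a linear classifier over four "pixels": every number in it (weights, bias, pixel values, epsilon) is hypothetical, and a positive score is assumed to mean "stop sign". Real attacks target deep networks, but the mechanism is the same: nudge each pixel slightly in the direction that hurts the score most.

```python
def score(pixels, weights, bias):
    """Linear classifier: a positive score is read as 'stop sign' in this toy."""
    return bias + sum(w * x for w, x in zip(weights, pixels))


def sign(value):
    """Return -1, 0, or 1 depending on the sign of `value`."""
    return (value > 0) - (value < 0)


def fgsm_perturb(pixels, weights, epsilon):
    """Shift every pixel by at most `epsilon` in the direction that lowers the score.

    For a linear model the gradient of the score with respect to pixel i is
    simply weights[i], so the attack subtracts epsilon * sign(weights[i]).
    Results are clamped to the valid pixel range [0, 1].
    """
    return [
        min(1.0, max(0.0, x - epsilon * sign(w)))
        for x, w in zip(pixels, weights)
    ]


# Hypothetical model parameters and a hypothetical 4-pixel "stop sign" image.
weights = [0.9, -0.6, 0.8, -0.7]
bias = -0.2
stop_sign = [0.6, 0.4, 0.6, 0.4]
epsilon = 0.15

clean_score = score(stop_sign, weights, bias)
adversarial = fgsm_perturb(stop_sign, weights, epsilon)
attacked_score = score(adversarial, weights, bias)

print("clean score:   ", round(clean_score, 3))
print("attacked score:", round(attacked_score, 3))
```

No pixel moves by more than `epsilon`, yet the score changes sign, so the toy classifier stops reporting a stop sign. A sticker on a real sign plays the same role as this small, targeted perturbation.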

Applications

  • Retail: Amazon Go stores. Cameras track what you pick off the shelf and charge you automatically. No checkout lines.
  • Agriculture: Drones flying over fields, counting every single plant and spotting weeds.

Criticism / Limitations

  • Surveillance: Facial recognition erodes privacy, because it lets governments track people anywhere cameras are installed. Some cities have restricted it; San Francisco banned its use by city agencies in 2019.

Further Reading

  • Szeliski, Richard. Computer Vision: Algorithms and Applications.
  • Mitchell, Melanie. Artificial Intelligence: A Guide for Thinking Humans.