Getting Started with Computer Vision
Set up your environment and explore the fundamentals of image processing with OpenCV and the MNIST dataset
Objectives
By the end of this practical work, you will be able to:
- Set up a development environment for computer vision projects
- Load and display images using OpenCV
- Explore the MNIST dataset and understand its structure
- Understand how images are represented as numerical arrays
Prerequisites
- Python 3.8 or higher installed
- Basic Python programming knowledge
- Familiarity with NumPy arrays (helpful but not required)
Install the required packages:
pip install opencv-python numpy matplotlib tensorflow
Instructions
Step 1: Environment Setup
Choose one of the following options to set up your development environment:
Option A: Google Colab (Recommended for beginners)
- Go to Google Colab
- Create a new notebook
- All required libraries are pre-installed
Option B: Local Environment with Virtual Environment
# Create a new project directory
mkdir computer-vision-lab
cd computer-vision-lab
# Create and activate virtual environment
python -m venv venv
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# Install required packages
pip install opencv-python numpy matplotlib tensorflow
Tip: Using a virtual environment keeps your project dependencies isolated and prevents conflicts with other Python projects.
Step 2: Load and Display an Image with OpenCV
Create a new Python file or notebook cell and add the following code:
import cv2 # (#1:Import OpenCV library)
import matplotlib.pyplot as plt # (#2:Import matplotlib for displaying images)
import numpy as np # (#3:Import NumPy for array operations)
# Download a sample image or use your own
# For this example, we'll create a simple gradient image
image = np.zeros((256, 256, 3), dtype=np.uint8) # (#4:Create a black image)
# Create a gradient effect
for i in range(256):
    image[i, :, 0] = i # (#5:Blue channel gradient)
    image[:, i, 1] = i # (#6:Green channel gradient)
# Display the image
plt.figure(figsize=(8, 8)) # (#7:Set figure size)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB)) # (#8:Convert BGR to RGB for matplotlib)
plt.title('Generated Gradient Image')
plt.axis('off') # (#9:Hide axis)
plt.show()
Info: OpenCV uses BGR (Blue, Green, Red) color ordering by default, while matplotlib expects RGB. Always convert when displaying OpenCV images with matplotlib.
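The channel swap mentioned above can be sketched without loading any real image. The snippet below uses a single synthetic BGR pixel and reverses the last axis with NumPy slicing, which produces the same result as `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`:

```python
import numpy as np

# A 1x1 "image" holding one pure-blue pixel in OpenCV's BGR order.
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)  # B=255, G=0, R=0

# Reversing the channel axis swaps B and R, giving the RGB order
# matplotlib expects; cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB) is equivalent.
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # the blue pixel is now [0, 0, 255] in RGB order
```

If you skip the conversion, matplotlib will render blue regions as red and vice versa, which is the most common "my colors look wrong" bug when mixing the two libraries.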
Step 3: Explore Image Properties
Understand how images are represented as numerical arrays:
# Explore image properties
print(f"Image shape: {image.shape}") # (#1:Height, Width, Channels)
print(f"Image data type: {image.dtype}") # (#2:Data type - usually uint8)
print(f"Min pixel value: {image.min()}") # (#3:Minimum value in the array)
print(f"Max pixel value: {image.max()}") # (#4:Maximum value in the array)
print(f"Image size in bytes: {image.nbytes}") # (#5:Memory footprint)
# Access individual pixels
print(f"\nPixel at (100, 100): {image[100, 100]}") # (#6:BGR values at specific location)
print(f"Blue channel value: {image[100, 100, 0]}")
print(f"Green channel value: {image[100, 100, 1]}")
print(f"Red channel value: {image[100, 100, 2]}")
Expected output: You should see the image dimensions (256, 256, 3), dtype of uint8, and pixel values ranging from 0 to 255.
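The `uint8` dtype is more than a detail: values are limited to 0-255, and arithmetic wraps around (modulo 256). A minimal standalone sketch of why this matters when doing math on raw pixel arrays:

```python
import numpy as np

# uint8 pixels hold 0-255; adding past 255 wraps around silently.
a = np.array([250], dtype=np.uint8)
print(a + 10)                    # [4] -- wrapped, not 260

# Casting to a wider type first avoids the wraparound.
print(a.astype(np.int32) + 10)   # [260]
```

This is why image-processing code often converts to a wider or floating-point type before brightness adjustments or averaging, then clips back to the 0-255 range.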
Step 4: Load MNIST Dataset with Keras
Load the famous MNIST handwritten digits dataset:
from tensorflow.keras.datasets import mnist # (#1:Import MNIST from Keras)
# Load the dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data() # (#2:Load training and test sets)
# Explore dataset dimensions
print("Training set:")
print(f" Images shape: {X_train.shape}") # (#3:60,000 images of 28x28 pixels)
print(f" Labels shape: {y_train.shape}") # (#4:60,000 labels)
print("\nTest set:")
print(f" Images shape: {X_test.shape}") # (#5:10,000 images)
print(f" Labels shape: {y_test.shape}")
print(f"\nPixel value range: {X_train.min()} to {X_train.max()}") # (#6:Grayscale values 0-255)
print(f"Label values: {np.unique(y_train)}") # (#7:Digits 0-9)
Note: The first time you run this, TensorFlow will download the MNIST dataset (approximately 11 MB).
Training set:
  Images shape: (60000, 28, 28)
  Labels shape: (60000,)

Test set:
  Images shape: (10000, 28, 28)
  Labels shape: (10000,)

Pixel value range: 0 to 255
Label values: [0 1 2 3 4 5 6 7 8 9]
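Since the raw pixels are `uint8` values in 0-255, a common next step (in later labs, not required here) is scaling them to floats in [0, 1]. A minimal sketch on a tiny synthetic array standing in for an MNIST batch, so it runs without the download:

```python
import numpy as np

# Tiny stand-in for an MNIST batch: uint8 values in 0-255.
batch = np.array([[0, 128, 255]], dtype=np.uint8)

# Scale to float32 in [0, 1] -- typical preprocessing before feeding
# images to a neural network.
scaled = batch.astype("float32") / 255.0
print(scaled.min(), scaled.max())  # 0.0 1.0
```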
Step 5: Visualize MNIST Samples Grid
Create a 4x4 grid to display sample images from the dataset:
# Create a 4x4 grid of sample images
fig, axes = plt.subplots(4, 4, figsize=(10, 10)) # (#1:Create subplot grid)
for i, ax in enumerate(axes.flat): # (#2:Iterate through all subplots)
    ax.imshow(X_train[i], cmap='gray') # (#3:Display image in grayscale)
    ax.set_title(f'Label: {y_train[i]}', fontsize=12) # (#4:Show the digit label)
    ax.axis('off') # (#5:Hide axes for cleaner look)
plt.suptitle('MNIST Dataset - First 16 Samples', fontsize=16) # (#6:Add main title)
plt.tight_layout() # (#7:Adjust spacing)
plt.show()
Step 6: Calculate and Plot Class Distribution
Analyze the distribution of digits in the training set:
# Calculate class distribution
unique, counts = np.unique(y_train, return_counts=True) # (#1:Count occurrences of each digit)
# Create histogram
plt.figure(figsize=(10, 6))
plt.bar(unique, counts, color='steelblue', edgecolor='black') # (#2:Create bar chart)
plt.xlabel('Digit Class', fontsize=12)
plt.ylabel('Number of Samples', fontsize=12)
plt.title('MNIST Training Set - Class Distribution', fontsize=14)
plt.xticks(unique) # (#3:Show all digit labels on x-axis)
# Add count labels on top of bars
for digit, count in zip(unique, counts): # (#4:Annotate each bar)
    plt.text(digit, count + 100, str(count), ha='center', fontsize=10)
plt.tight_layout()
plt.show()
# Print statistics
print("Class Distribution Summary:")
for digit, count in zip(unique, counts):
    percentage = count / len(y_train) * 100
    print(f"  Digit {digit}: {count:,} samples ({percentage:.1f}%)")
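When the labels are small non-negative integers, as here, `np.bincount` is a compact alternative to `np.unique(..., return_counts=True)`. A standalone sketch using toy labels in place of `y_train` (so it runs without the MNIST download):

```python
import numpy as np

# Toy labels standing in for y_train.
labels = np.array([0, 1, 1, 2, 2, 2, 9])

# minlength=10 guarantees one bin per digit class, even for
# digits that never appear in the labels.
counts = np.bincount(labels, minlength=10)
print(counts)  # [1 2 3 0 0 0 0 0 0 1]
```

Note that `np.bincount` always returns one entry per integer from 0 up to the maximum (or `minlength`), whereas `np.unique` only reports classes that actually occur.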
Step 7: Display Samples from Each Class
Show one example of each digit (0-9):
# Display one sample from each class
fig, axes = plt.subplots(2, 5, figsize=(15, 6)) # (#1:2 rows, 5 columns for digits 0-9)
for digit in range(10): # (#2:Loop through digits 0-9)
    # Find the first occurrence of this digit
    idx = np.where(y_train == digit)[0][0] # (#3:Get index of first matching sample)
    # Calculate subplot position
    row = digit // 5 # (#4:Row index)
    col = digit % 5 # (#5:Column index)
    # Display the image
    axes[row, col].imshow(X_train[idx], cmap='gray')
    axes[row, col].set_title(f'Digit: {digit}', fontsize=14, fontweight='bold')
    axes[row, col].axis('off')
plt.suptitle('One Sample from Each MNIST Class', fontsize=16)
plt.tight_layout()
plt.show()
Warning: Make sure to run the cells in order. The variables X_train and y_train must be loaded before this step.
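The indexing pattern used in Step 7 (`np.where(...)[0][0]`) and the boolean masking used in the bonus challenge (`X_train[y_train == digit]`) can be sketched on a tiny synthetic label array, so this runs without the dataset:

```python
import numpy as np

# Toy labels standing in for y_train.
y = np.array([7, 3, 3, 5, 3])

# np.where returns a tuple of index arrays; [0] takes the array for
# the first (only) axis, and a second [0] picks the first match.
first_three = np.where(y == 3)[0][0]
print(first_three)  # 1

# A boolean mask selects every matching entry at once; on the real
# data, X_train[y_train == digit] selects all images of one digit.
print((y == 3).sum())  # 3
```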
Expected Output
After completing this practical work, you should have:
- A working development environment with all required libraries
- A gradient image displayed using OpenCV and matplotlib
- Understanding of image dimensions: 60,000 training images, 10,000 test images, each 28x28 pixels
- A 4x4 grid visualization showing sample MNIST digits
- A histogram showing a roughly balanced distribution across all 10 digit classes (approximately 5,400-6,750 samples each)
- A visualization showing one representative sample from each digit class (0-9)
Success Criteria: All visualizations render correctly, and you can explain what each property (shape, dtype, min, max) tells us about the image data.
Deliverables
- Jupyter Notebook: Complete notebook (.ipynb) with all code cells executed and outputs visible
- Screenshots: Screenshots of the following visualizations:
- 4x4 MNIST samples grid
- Class distribution histogram
- Samples from each class visualization
Bonus Challenges
- Challenge 1: Fashion-MNIST
Repeat all exercises using the Fashion-MNIST dataset instead:
from tensorflow.keras.datasets import fashion_mnist
(X_train_fashion, y_train_fashion), (X_test_fashion, y_test_fashion) = fashion_mnist.load_data()
# Class names for Fashion-MNIST
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
- Challenge 2: Mean Pixel Value per Class
Calculate and compare the mean pixel value for each digit class. This can reveal interesting patterns about how different digits are typically drawn:
# Calculate mean pixel value for each class
mean_values = []
for digit in range(10):
    digit_images = X_train[y_train == digit]
    mean_val = digit_images.mean()
    mean_values.append(mean_val)
    print(f"Digit {digit}: Mean pixel value = {mean_val:.2f}")
# Visualize as a bar chart
plt.bar(range(10), mean_values)
plt.xlabel('Digit')
plt.ylabel('Mean Pixel Value')
plt.title('Mean Pixel Value by Digit Class')
plt.show()