Exploring Images as Data
Understanding how computers represent and manipulate visual information
Objectives
By the end of this practical work, you will be able to:
- Load and inspect images using Python libraries
- Understand image representation as numerical arrays (tensors)
- Examine pixel values, dimensions, and color channels
- Perform basic image manipulations
- Visualize images and their properties using matplotlib
Prerequisites
- Python 3.8+ installed
- Basic Python knowledge (variables, loops, functions)
- Jupyter Notebook or any Python IDE
Install required packages:
pip install pillow numpy matplotlib opencv-python
Instructions
Step 1: Setup Your Environment
Create a new Python file or Jupyter notebook and import the necessary libraries:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
# Optional: only needed for the OpenCV edge-detection bonus challenge
import cv2
Download a sample image for this exercise, or use any image from your computer. Save it as sample.jpg in your working directory.
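If no suitable photo is at hand, a synthetic image can stand in for sample.jpg. The sketch below is only one way to do it: the helper name make_sample_image and the 400x300 size are assumptions chosen for illustration, picked so that the pixel coordinates and crop box used in later steps stay inside the image.

```python
import numpy as np
from PIL import Image


def make_sample_image(path, width=400, height=300):
    """Write a synthetic RGB gradient image to `path` and return its pixels.

    Hypothetical helper for this exercise; not part of any library.
    """
    # Horizontal ramp 0..255 across the columns, vertical ramp down the rows.
    x = np.linspace(0, 255, width, dtype=np.float32)
    y = np.linspace(0, 255, height, dtype=np.float32)
    red = np.tile(x, (height, 1))            # brighter to the right
    green = np.tile(y[:, None], (1, width))  # brighter toward the bottom
    blue = 255.0 - red                       # brighter to the left
    pixels = np.stack([red, green, blue], axis=-1).astype(np.uint8)
    Image.fromarray(pixels).save(path)
    return pixels
```

Calling make_sample_image("sample.jpg") once writes the file into the working directory. A real photograph will give more interesting channel views and histograms than this gradient.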
Step 2: Load and Inspect an Image
Load an image and examine its basic properties:
# Load the image
img = Image.open("sample.jpg")
# Display basic information
print(f"Format: {img.format}")
print(f"Mode: {img.mode}")
print(f"Size: {img.size}") # (width, height)
# Convert to numpy array for numerical analysis
img_array = np.array(img)
print(f"\nArray shape: {img_array.shape}")
print(f"Data type: {img_array.dtype}")
print(f"Min value: {img_array.min()}")
print(f"Max value: {img_array.max()}")
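Understanding the output: For a color image, the shape tuple is (height, width, channels), with 3 channels for RGB and 4 for RGBA. Note that this order is the reverse of img.size, which reports (width, height). A grayscale image (mode "L") converts to a 2-D array of shape (height, width) with no channel axis. The following steps assume an RGB image; if the printed mode is not RGB, call img.convert("RGB") before creating the array.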
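To see how the image mode affects the array shape without depending on a particular file, the following sketch builds three tiny blank images in memory. The 4x3 size is arbitrary and chosen only so that width and height are easy to tell apart.

```python
import numpy as np
from PIL import Image

# Synthetic 4x3 images (width=4, height=3); no file on disk is needed.
rgb = np.array(Image.new("RGB", (4, 3)))
rgba = np.array(Image.new("RGBA", (4, 3)))
gray = np.array(Image.new("L", (4, 3)))

print(rgb.shape)   # (3, 4, 3): height, width, 3 channels
print(rgba.shape)  # (3, 4, 4): the alpha channel adds a fourth
print(gray.shape)  # (3, 4): no channel axis at all

# Adding an explicit channel axis gives grayscale the (H, W, 1) layout.
gray_hwc = gray[:, :, np.newaxis]
print(gray_hwc.shape)  # (3, 4, 1)
```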
Step 3: Explore Pixel Values
Access and understand individual pixel values:
# Access a single pixel (row 100, column 150)
pixel = img_array[100, 150]
print(f"Pixel at (100, 150): {pixel}")
print(f"Red: {pixel[0]}, Green: {pixel[1]}, Blue: {pixel[2]}")
# Extract color channels
red_channel = img_array[:, :, 0]
green_channel = img_array[:, :, 1]
blue_channel = img_array[:, :, 2]
print(f"\nRed channel shape: {red_channel.shape}")
print(f"Red channel mean: {red_channel.mean():.2f}")
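A common source of confusion is that NumPy indexes an image as [row, column], that is [y, x], while PIL methods take (x, y). The small sketch below marks one pixel in a synthetic image and reads it back both ways; the image size and coordinates are arbitrary illustrative values.

```python
import numpy as np
from PIL import Image

# Synthetic 5x4 black image with one pure-red pixel at x=3, y=1.
img = Image.new("RGB", (5, 4))
img.putpixel((3, 1), (255, 0, 0))   # PIL takes (x, y)

arr = np.array(img)
print(arr[1, 3])             # NumPy takes [row, column] = [y, x]: the red pixel
print(img.getpixel((3, 1)))  # same pixel through PIL
print(arr[3, 1])             # swapped indices reach a different, black pixel
```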
Step 4: Visualize the Image and Channels
Create visualizations to understand the image structure:
# Create a figure with subplots
fig, axes = plt.subplots(2, 2, figsize=(10, 10))
# Original image
axes[0, 0].imshow(img_array)
axes[0, 0].set_title("Original Image")
axes[0, 0].axis("off")
# Red channel
axes[0, 1].imshow(red_channel, cmap="Reds")
axes[0, 1].set_title("Red Channel")
axes[0, 1].axis("off")
# Green channel
axes[1, 0].imshow(green_channel, cmap="Greens")
axes[1, 0].set_title("Green Channel")
axes[1, 0].axis("off")
# Blue channel
axes[1, 1].imshow(blue_channel, cmap="Blues")
axes[1, 1].set_title("Blue Channel")
axes[1, 1].axis("off")
plt.tight_layout()
plt.savefig("channel_visualization.png")
plt.show()
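Matplotlib's sequential colormaps such as "Reds" run from near-white for low values to a dark shade for high values, so a strongly red region appears dark rather than bright. One alternative view, sketched here under the assumption of an (H, W, 3) uint8 array, keeps a single channel and zeroes the other two so the result can be shown as an ordinary RGB image. The helper name isolate_channel is hypothetical.

```python
import numpy as np


def isolate_channel(img_array, index):
    """Return a copy of an (H, W, 3) array with only one channel kept."""
    isolated = np.zeros_like(img_array)
    isolated[:, :, index] = img_array[:, :, index]
    return isolated


# Tiny synthetic 1x2 image: one orange pixel and one white pixel.
demo = np.array([[[255, 128, 0], [255, 255, 255]]], dtype=np.uint8)
red_only = isolate_channel(demo, 0)
print(red_only)  # both pixels become pure red: [255, 0, 0]
```

Passing isolate_channel(img_array, 0) to imshow without a cmap argument then displays the red channel on a scale from black to bright red.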
Step 5: Basic Image Manipulations
Perform common preprocessing operations:
# Resize to 224x224 (this does not preserve the aspect ratio)
resized = img.resize((224, 224))
print(f"Resized shape: {np.array(resized).shape}")
# Convert to grayscale
grayscale = img.convert("L")
gray_array = np.array(grayscale)
print(f"Grayscale shape: {gray_array.shape}")
# Crop a region (left, top, right, bottom)
cropped = img.crop((100, 100, 300, 300))
# Rotate
rotated = img.rotate(45)
# Visualize transformations
fig, axes = plt.subplots(2, 2, figsize=(10, 10))
axes[0, 0].imshow(np.array(resized))
axes[0, 0].set_title("Resized (224x224)")
axes[0, 1].imshow(gray_array, cmap="gray")
axes[0, 1].set_title("Grayscale")
axes[1, 0].imshow(np.array(cropped))
axes[1, 0].set_title("Cropped")
axes[1, 1].imshow(np.array(rotated))
axes[1, 1].set_title("Rotated 45°")
for ax in axes.flat:
    ax.axis("off")
plt.tight_layout()
plt.savefig("transformations.png")
plt.show()
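Two of the transformations above alter more than they may appear to: resizing to a fixed square stretches non-square images, and rotating keeps the original canvas size. The sketch below shows one possible way to resize while keeping proportions, plus the expand option of Image.rotate. The helper resize_keep_aspect and its longest_side parameter are illustrative assumptions, not part of Pillow.

```python
from PIL import Image


def resize_keep_aspect(img, longest_side=224):
    """Scale so the longer edge equals `longest_side`, keeping proportions."""
    scale = longest_side / max(img.size)
    new_size = (max(1, round(img.width * scale)),
                max(1, round(img.height * scale)))
    return img.resize(new_size)


# Synthetic 400x200 image (width x height) standing in for a real photo.
demo = Image.new("RGB", (400, 200))

print(resize_keep_aspect(demo).size)      # (224, 112)
print(demo.rotate(45).size)               # (400, 200): canvas unchanged
print(demo.rotate(90, expand=True).size)  # (200, 400): canvas grows to fit
```

Whether stretching matters depends on the downstream use; many pre-trained models expect a fixed square input, in which case a resize followed by a center crop is a common compromise.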
Step 6: Normalize Pixel Values
Learn to normalize images for machine learning:
# Standard normalization (0-1 range)
normalized = img_array.astype(np.float32) / 255.0
print(f"Normalized range: [{normalized.min():.2f}, {normalized.max():.2f}]")
# ImageNet normalization (commonly used for pre-trained models)
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])
imagenet_normalized = (normalized - mean) / std
print(f"ImageNet normalized range: [{imagenet_normalized.min():.2f}, {imagenet_normalized.max():.2f}]")
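Why normalize? Neural networks generally train better when input values fall in a small, consistent range. ImageNet normalization goes one step further: it subtracts a per-channel mean and divides by a per-channel standard deviation, both computed on the ImageNet dataset, so each channel is roughly zero-centered. Use it when feeding images to models pre-trained on ImageNet, since those models expect inputs prepared this way.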
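A quick worked example makes the resulting range concrete. Using the same mean and std values as above, the sketch below normalizes a pure black and a pure white pixel, which give the smallest and largest values any image can reach, and then inverts the operation. The two-pixel array is synthetic.

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

# Two extreme pixels: pure black and pure white, as a 1x2 RGB image.
pixels = np.array([[[0, 0, 0], [255, 255, 255]]], dtype=np.uint8)

scaled = pixels.astype(np.float32) / 255.0
normalized = (scaled - mean) / std   # broadcasts over the last (channel) axis

print(normalized[0, 0])  # black: -mean / std, about [-2.12 -2.04 -1.80]
print(normalized[0, 1])  # white: (1 - mean) / std, about [2.25 2.43 2.64]

# Undo the normalization to recover the original uint8 values.
restored = np.rint((normalized * std + mean) * 255.0).astype(np.uint8)
print(np.array_equal(restored, pixels))
```

The inverse step is useful when a normalized array needs to be displayed again, because imshow expects floats in the 0 to 1 range or integers in the 0 to 255 range.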
Step 7: Calculate Image Statistics
Compute useful statistics about your image:
# Per-channel statistics
for i, color in enumerate(["Red", "Green", "Blue"]):
    channel = img_array[:, :, i]
    print(f"{color}: mean={channel.mean():.1f}, std={channel.std():.1f}")
# Histogram of pixel values
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
colors = ["red", "green", "blue"]
for i, (ax, color) in enumerate(zip(axes, colors)):
    ax.hist(img_array[:, :, i].ravel(), bins=256, color=color, alpha=0.7)
    ax.set_title(f"{color.capitalize()} Channel Histogram")
    ax.set_xlabel("Pixel Value")
    ax.set_ylabel("Frequency")
plt.tight_layout()
plt.savefig("histograms.png")
plt.show()
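The histogram is simply a count of how many pixels take each intensity value. As a sketch of what the plot summarizes, the counts can also be computed directly with np.bincount; the tiny single-channel array here is synthetic and stands in for one channel of a real image.

```python
import numpy as np

# Small synthetic 2x3 single-channel image with repeated values.
channel = np.array([[0, 0, 255],
                    [10, 255, 255]], dtype=np.uint8)

# One count per possible intensity 0..255.
counts = np.bincount(channel.ravel(), minlength=256)

print(counts[0], counts[10], counts[255])  # 2 1 3
print(counts.sum() == channel.size)        # True: every pixel is counted once
print(counts.argmax())                     # 255, the most frequent intensity
```

Having the counts as an array makes it easy to answer numeric questions, such as which intensity is most common or what fraction of pixels is fully saturated at 255.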
Expected Output
After completing this practical work, you should have:
- Console output showing image dimensions, data types, and pixel values
- channel_visualization.png: a 2x2 grid showing the original image and its RGB channels
- transformations.png: a 2x2 grid showing resized, grayscale, cropped, and rotated versions
- histograms.png: histograms of pixel values for each color channel
Deliverables
- Your Python script or Jupyter notebook with all code and outputs
- The three generated visualization images
- A brief written summary (3-5 sentences) of what you learned about how images are represented as data
Bonus Challenges
- Challenge 1: Load multiple images and compare their statistics. Are there patterns based on image content?
- Challenge 2: Implement a simple brightness adjustment by adding/subtracting values from pixels
- Challenge 3: Create a function that converts an image to its negative (invert pixel values)
- Challenge 4: Use OpenCV to apply edge detection and compare the result to the original
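A hint for the brightness and negative challenges: image arrays are stored as uint8, and uint8 arithmetic wraps around modulo 256 rather than stopping at 255. The sketch below only demonstrates the pitfall and one possible way around it on three made-up values; it is not a complete solution to any challenge.

```python
import numpy as np

pixels = np.array([100, 200, 250], dtype=np.uint8)

# uint8 arithmetic wraps modulo 256 instead of saturating at 255.
wrapped = pixels + np.uint8(60)
print(wrapped)  # [160   4  54]

# Widening to a larger type first, then clipping, keeps values in 0..255.
clipped = np.clip(pixels.astype(np.int16) + 60, 0, 255).astype(np.uint8)
print(clipped)  # [160 255 255]
```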