Computer Vision

Session 5 - Transfer Learning & Optimization

Leveraging pretrained models and optimizing performance

Today's Agenda

What is Transfer Learning?

flowchart LR
    subgraph Source["Source: ImageNet"]
        S1["1.2M images"]
        S2["1000 classes"]
        S3["~25M params"]
    end
    subgraph Pretrained["Pretrained Model"]
        P1["Layers 1-45<br/>FROZEN<br/>Generic features"]
        P2["Layers 46-50<br/>TRAINABLE<br/>Task-specific"]
    end
    subgraph Target["Target Task"]
        T1["500-5000 images"]
        T2["5-50 classes"]
        T3["90%+ accuracy"]
    end
    Source -->|"weeks of training"| Pretrained
    Pretrained -->|"hours of fine-tuning"| Target
    classDef source fill:#78909c,stroke:#546e7a,color:#fff
    classDef frozen fill:#4a90d9,stroke:#2e6da4,color:#fff
    classDef trainable fill:#ff9800,stroke:#ef6c00,color:#fff
    classDef target fill:#7cb342,stroke:#558b2f,color:#fff
    class S1,S2,S3 source
    class P1 frozen
    class P2 trainable
    class T1,T2,T3 target

Why Use Transfer Learning?

Limited Data

Works well with small datasets (100s-1000s of images)

Faster Training

Converges much faster than training from scratch

Better Performance

Often achieves higher accuracy than random initialization

Rule of thumb: Always start with transfer learning unless you have millions of labeled images and significant compute resources.

Loading Pretrained Models: Keras

from tensorflow.keras.applications import (
    ResNet50, VGG16, EfficientNetB0, MobileNetV2
)

# Load model with ImageNet weights
base_model = ResNet50(
    weights='imagenet',  # (#1:Pretrained on ImageNet)
    include_top=False,   # (#2:Remove classification head)
    input_shape=(224, 224, 3)
)

# Check model architecture
base_model.summary()

# Available models: VGG16, VGG19, ResNet50/101/152,
# InceptionV3, EfficientNetB0-B7, MobileNetV2/V3, etc.

Loading Pretrained Models: PyTorch

import torchvision.models as models

# Load pretrained ResNet50
model = models.resnet50(pretrained=True)  # (#1:ImageNet weights)

# Modern weights API (torchvision 0.13+)
from torchvision.models import ResNet50_Weights
model = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)  # (#2:Specific weights)

# Remove classification head
import torch.nn as nn
model.fc = nn.Identity()  # (#3:Replace final layer)

# Or get features directly
model = models.resnet50(pretrained=True)
features = nn.Sequential(*list(model.children())[:-1])  # (#4:All but last layer)

Loading Pretrained Models: timm

import timm

# List available models
print(timm.list_models('*efficientnet*'))  # (#1:Search models)

# Load model with pretrained weights
model = timm.create_model(
    'efficientnet_b0',
    pretrained=True,  # (#2:Download pretrained weights)
    num_classes=10    # (#3:Custom number of classes)
)

# Load as feature extractor
model = timm.create_model(
    'vit_base_patch16_224',
    pretrained=True,
    num_classes=0  # (#4:0 removes classifier)
)

# timm has 700+ models: ViT, Swin, ConvNeXt, etc.

timm (PyTorch Image Models) provides the largest collection of pretrained vision models.

Strategy 1: Feature Extraction

Approach

  • Freeze all pretrained layers
  • Add new classifier head
  • Train only the new layers

When to Use

  • Very small dataset
  • Similar domain to ImageNet
  • Limited compute resources
# Keras implementation
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import ResNet50
base_model = ResNet50(
    weights='imagenet',
    include_top=False,
    input_shape=(224,224,3)
)

# Freeze base model
base_model.trainable = False

# Add classifier
model = keras.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(num_classes, activation='softmax')
])

Feature Extraction: PyTorch Implementation

import torch
import torch.nn as nn
from torchvision import models

class FeatureExtractor(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # Load pretrained backbone
        self.backbone = models.resnet50(pretrained=True)

        # Freeze all layers
        for param in self.backbone.parameters():  # (#1:Freeze backbone)
            param.requires_grad = False

        # Replace classifier
        num_features = self.backbone.fc.in_features
        self.backbone.fc = nn.Sequential(
            nn.Linear(num_features, 256),  # (#2:New classifier head)
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes)
        )

    def forward(self, x):
        return self.backbone(x)

Strategy 2: Full Fine-tuning

Approach

  • Unfreeze all layers
  • Train entire network end-to-end
  • Use lower learning rate

When to Use

  • Large dataset available
  • Different domain from ImageNet
  • Maximum performance needed
# Keras: Unfreeze all layers
base_model.trainable = True

# Use lower learning rate
model.compile(
    optimizer=keras.optimizers.Adam(
        learning_rate=1e-5  # Lower LR
    ),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# PyTorch: All params trainable
for param in model.parameters():
    param.requires_grad = True

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-5
)

Warning: Use a learning rate 10-100x smaller than you would when training from scratch, to avoid destroying the pretrained features.

Strategy 3: Partial Fine-tuning

# Keras: Freeze early layers, unfreeze later ones
base_model = ResNet50(weights='imagenet', include_top=False)

# Freeze first N layers
for layer in base_model.layers[:100]:  # (#1:Freeze early layers)
    layer.trainable = False
for layer in base_model.layers[100:]:  # (#2:Unfreeze later layers)
    layer.trainable = True

# PyTorch: Freeze by layer name
model = models.resnet50(pretrained=True)
for name, param in model.named_parameters():
    if 'layer4' in name or 'fc' in name:  # (#3:Unfreeze layer4 and fc)
        param.requires_grad = True
    else:
        param.requires_grad = False  # (#4:Freeze other layers)

# Check trainable parameters
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable params: {trainable:,}")

Progressive Unfreezing

Concept

Gradually unfreeze layers during training, starting from the top (closest to output) and moving toward the bottom.

Benefits

  • More stable training
  • Preserves low-level features
  • Better generalization
# Progressive unfreezing schedule
def unfreeze_layers(model, epoch):
    if epoch == 0:
        # Train only the classifier head
        for p in model.backbone.parameters():
            p.requires_grad = False
        for p in model.backbone.fc.parameters():
            p.requires_grad = True
    elif epoch == 5:
        # Unfreeze layer4
        for p in model.backbone.layer4.parameters():
            p.requires_grad = True
    elif epoch == 10:
        # Unfreeze layer3
        for p in model.backbone.layer3.parameters():
            p.requires_grad = True
    elif epoch == 15:
        # Unfreeze all layers
        for p in model.parameters():
            p.requires_grad = True

Quick Exercise: Choose Fine-tuning Strategy

Which strategy would you recommend for each scenario?

Scenario A

500 chest X-ray images for pneumonia detection. Single GPU, limited time.

Feature extraction, partial, or full fine-tuning?

Scenario B

50,000 product images for e-commerce classification. Multiple GPUs available.

Feature extraction, partial, or full fine-tuning?

Scenario C

200 satellite images (very different from ImageNet). Need maximum accuracy.

Which approach? What LR strategy?

Discriminative Learning Rates

# Different learning rates for different layers
model = models.resnet50(pretrained=True)

# Group parameters by layer depth
param_groups = [
    {'params': model.conv1.parameters(), 'lr': 1e-6},   # (#1:Lowest LR for early layers)
    {'params': model.layer1.parameters(), 'lr': 1e-6},
    {'params': model.layer2.parameters(), 'lr': 1e-5},
    {'params': model.layer3.parameters(), 'lr': 1e-5},
    {'params': model.layer4.parameters(), 'lr': 1e-4},  # (#2:Higher LR for later layers)
    {'params': model.fc.parameters(), 'lr': 1e-3}       # (#3:Highest LR for new head)
]

optimizer = torch.optim.Adam(param_groups)

# Keras equivalent using layer-wise LR multiplier
# Requires custom training loop or optimizer modification

Intuition: Early layers capture generic features that shouldn't change much; later layers need more adaptation.

Classification Metrics Overview

Accuracy

Correct predictions / Total predictions

Precision

TP / (TP + FP) - Exactness

Recall

TP / (TP + FN) - Completeness

F1 Score

2 * (P * R) / (P + R) - Harmonic mean

| Metric | Use When | Limitation |
|---|---|---|
| Accuracy | Balanced classes | Misleading with imbalanced data |
| Precision | False positives are costly (spam filter) | Ignores false negatives |
| Recall | False negatives are costly (disease detection) | Ignores false positives |
| F1 Score | Need balance between precision and recall | Assumes equal importance of P and R |
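As a worked example, all four metrics can be computed directly from confusion-matrix counts (the counts below are made up for illustration):

```python
# Hypothetical counts for a binary classifier
tp, fp, fn, tn = 40, 10, 20, 30

accuracy = (tp + tn) / (tp + tn + fp + fn)          # correct / total
precision = tp / (tp + fp)                          # exactness
recall = tp / (tp + fn)                             # completeness
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"acc={accuracy:.2f} p={precision:.2f} r={recall:.2f} f1={f1:.2f}")
```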

Understanding the Confusion Matrix

|                 | Predicted Positive  | Predicted Negative  |
|-----------------|---------------------|---------------------|
| Actual Positive | TP (True Positive)  | FN (False Negative) |
| Actual Negative | FP (False Positive) | TN (True Negative)  |

Reading the Matrix

  • TP: Correctly predicted positive
  • TN: Correctly predicted negative
  • FP: Type I error (false alarm)
  • FN: Type II error (missed detection)

Precision = TP / (TP + FP) - "Of all positive predictions, how many were correct?"

Recall = TP / (TP + FN) - "Of all actual positives, how many did we find?"

F1 Score and Confusion Matrix Code

from sklearn.metrics import (
    confusion_matrix, classification_report, f1_score
)
import seaborn as sns
import matplotlib.pyplot as plt

# Predictions
y_pred = model.predict(X_test).argmax(axis=1)

# F1 Score
f1 = f1_score(y_true, y_pred, average='weighted')  # (#1:Weighted for imbalanced data)
print(f"F1 Score: {f1:.4f}")

# Confusion Matrix
cm = confusion_matrix(y_true, y_pred)  # (#2:Shows prediction distribution)

# Visualize
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')  # (#3:Heatmap visualization)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.show()

Quick Exercise: Interpret Confusion Matrix

Given this confusion matrix for a COVID test model:

|                  | Pred: Negative | Pred: Positive |
|------------------|----------------|----------------|
| Actual: Negative | 850            | 50             |
| Actual: Positive | 20             | 80             |

Question 1

What is the accuracy?

Formula: (TP+TN)/Total

Question 2

What is the recall (sensitivity) for COVID+?

Formula: TP/(TP+FN)

Question 3

Is this model better for screening or confirmation? Why?

Think: Cost of false negatives vs false positives

ROC-AUC and Classification Report

from sklearn.metrics import roc_curve, auc, classification_report
import numpy as np

# Get probability predictions
y_proba = model.predict(X_test)  # (#1:Softmax probabilities)

# For binary classification
fpr, tpr, thresholds = roc_curve(y_true, y_proba[:, 1])  # (#2:False/True positive rates)
roc_auc = auc(fpr, tpr)

# Plot ROC curve
plt.plot(fpr, tpr, label=f'ROC (AUC = {roc_auc:.3f})')
plt.plot([0, 1], [0, 1], 'k--')  # (#3:Random baseline)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.legend()
plt.show()

# Classification report (precision, recall, f1 per class)
print(classification_report(y_true, y_pred, target_names=class_names))  # (#4:Full report)

ROC-AUC for Multi-class

from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_auc_score

# One-vs-Rest AUC
y_true_bin = label_binarize(y_true, classes=range(num_classes))  # (#1:One-hot encode)

# Calculate AUC
auc_ovr = roc_auc_score(
    y_true_bin, y_proba,
    multi_class='ovr',  # (#2:One-vs-Rest strategy)
    average='weighted'
)
print(f"Weighted OvR AUC: {auc_ovr:.4f}")

# Plot ROC for each class
for i in range(num_classes):
    fpr, tpr, _ = roc_curve(y_true_bin[:, i], y_proba[:, i])
    plt.plot(fpr, tpr, label=f'Class {i} (AUC={auc(fpr, tpr):.2f})')  # (#3:Per-class curves)

plt.legend()
plt.show()

Object Detection Metrics

IoU (Intersection over Union)

Measures overlap between predicted and ground truth boxes.

def calculate_iou(box1, box2):
    # box format: [x1, y1, x2, y2]
    x1 = max(box1[0], box2[0])
    y1 = max(box1[1], box2[1])
    x2 = min(box1[2], box2[2])
    y2 = min(box1[3], box2[3])

    inter = max(0, x2-x1) * max(0, y2-y1)
    area1 = (box1[2]-box1[0]) * (box1[3]-box1[1])
    area2 = (box2[2]-box2[0]) * (box2[3]-box2[1])

    return inter / (area1 + area2 - inter)

IoU Thresholds

  • IoU >= 0.5: COCO standard
  • IoU >= 0.75: Strict matching
  • IoU >= 0.5:0.95: COCO mAP

Detection as TP/FP

  • TP: IoU >= threshold + correct class
  • FP: IoU < threshold or wrong class
  • FN: Missed ground truth
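A quick sanity check of `calculate_iou` on hand-computable boxes (the function is repeated here so the snippet runs standalone):

```python
def calculate_iou(box1, box2):
    # box format: [x1, y1, x2, y2]
    x1 = max(box1[0], box2[0])
    y1 = max(box1[1], box2[1])
    x2 = min(box1[2], box2[2])
    y2 = min(box1[3], box2[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter / (area1 + area2 - inter)

# Two 2x2 boxes offset by (1, 1): intersection 1, union 4 + 4 - 1 = 7
print(calculate_iou([0, 0, 2, 2], [1, 1, 3, 3]))  # 1/7 ≈ 0.143
# Identical boxes give IoU = 1
print(calculate_iou([0, 0, 2, 2], [0, 0, 2, 2]))  # 1.0
```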

Mean Average Precision (mAP)

Calculation Steps

  1. Sort predictions by confidence
  2. Calculate precision/recall at each threshold
  3. Compute AP (area under PR curve) per class
  4. Average AP across all classes

Common Metrics

  • mAP@0.5: IoU threshold 0.5
  • mAP@0.5:0.95: Average over IoU thresholds
  • mAP-small/medium/large: By object size
# Using torchmetrics
from torchmetrics.detection import MeanAveragePrecision

metric = MeanAveragePrecision()

# Format: list of dicts
preds = [
    {'boxes': tensor, 'scores': tensor, 'labels': tensor}
]
targets = [
    {'boxes': tensor, 'labels': tensor}
]

metric.update(preds, targets)
result = metric.compute()

print(f"mAP@0.5: {result['map_50']:.4f}")
print(f"mAP@0.5:0.95: {result['map']:.4f}")
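The calculation steps above can also be sketched by hand. This is a simplified all-point-interpolation AP for a single class, taking precomputed TP/FP flags as input (it skips the IoU matching that torchmetrics performs):

```python
import numpy as np

def average_precision(tp_flags, num_gt):
    """All-point-interpolation AP for one class.

    tp_flags: 1/0 per prediction, sorted by descending confidence.
    num_gt:   number of ground-truth boxes for this class.
    """
    tp_flags = np.asarray(tp_flags)
    tp = np.cumsum(tp_flags)
    fp = np.cumsum(1 - tp_flags)
    recall = tp / num_gt
    precision = tp / (tp + fp)
    # Precision envelope: make precision non-increasing from the right
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    # Area under the precision-recall curve
    r = np.concatenate(([0.0], recall))
    return float(np.sum((r[1:] - r[:-1]) * precision))

# 3 predictions over 2 ground truths: TP, FP, TP in confidence order
print(average_precision([1, 0, 1], 2))  # 5/6 ≈ 0.833
```

mAP then averages this AP over classes (and, for mAP@0.5:0.95, over IoU thresholds).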

Segmentation Metrics

import numpy as np

def dice_coefficient(pred, target, smooth=1e-6):
    """Dice = 2 * |A intersection B| / (|A| + |B|)"""
    intersection = np.sum(pred * target)  # (#1:Pixel overlap)
    return (2 * intersection + smooth) / (np.sum(pred) + np.sum(target) + smooth)

def iou_score(pred, target, smooth=1e-6):
    """IoU = |A intersection B| / |A union B|"""
    intersection = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - intersection  # (#2:Union calculation)
    return (intersection + smooth) / (union + smooth)

def pixel_accuracy(pred, target):
    """Simple accuracy per pixel"""
    return np.mean(pred == target)  # (#3:Correct pixels / Total)

# Per-class metrics
def mean_iou(pred, target, num_classes):
    ious = []
    for c in range(num_classes):
        iou = iou_score(pred == c, target == c)  # (#4:IoU per class)
        ious.append(iou)
    return np.mean(ious)
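A quick check of the Dice and IoU functions above on toy binary masks (definitions repeated so the snippet runs on its own):

```python
import numpy as np

def dice_coefficient(pred, target, smooth=1e-6):
    intersection = np.sum(pred * target)
    return (2 * intersection + smooth) / (np.sum(pred) + np.sum(target) + smooth)

def iou_score(pred, target, smooth=1e-6):
    intersection = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - intersection
    return (intersection + smooth) / (union + smooth)

pred = np.array([1, 1, 0, 0])
target = np.array([1, 0, 0, 0])
# Overlap is 1 pixel: Dice = 2/3, IoU = 1/2 (up to the smoothing term)
print(dice_coefficient(pred, target), iou_score(pred, target))
```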

Dice vs IoU Comparison

| Aspect | Dice Coefficient | IoU (Jaccard) |
|---|---|---|
| Formula | 2\|A ∩ B\| / (\|A\| + \|B\|) | \|A ∩ B\| / \|A ∪ B\| |
| Range | 0 to 1 | 0 to 1 |
| Relationship | Dice = 2*IoU / (1 + IoU) | IoU = Dice / (2 - Dice) |
| Values | Always >= IoU | Always <= Dice |
| Use Case | Medical imaging, loss function | Detection, standard benchmark |

Example: If IoU = 0.5, then Dice = 0.667. Perfect overlap gives both = 1.
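The conversion formulas from the table can be verified directly:

```python
def dice_from_iou(iou):
    # Dice = 2*IoU / (1 + IoU)
    return 2 * iou / (1 + iou)

def iou_from_dice(dice):
    # IoU = Dice / (2 - Dice)
    return dice / (2 - dice)

print(dice_from_iou(0.5))    # 0.666...
print(iou_from_dice(2 / 3))  # 0.5
```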

Analyzing Learning Curves

What to Look For

  • Convergence: Loss decreasing and stabilizing
  • Gap: Difference between train/val curves
  • Spikes: Learning rate too high
  • Plateau: Stuck in local minimum
# Plot learning curves
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Loss curves
axes[0].plot(history['loss'], label='Train')
axes[0].plot(history['val_loss'], label='Val')
axes[0].set_title('Loss')
axes[0].legend()

# Accuracy curves
axes[1].plot(history['accuracy'], label='Train')
axes[1].plot(history['val_accuracy'], label='Val')
axes[1].set_title('Accuracy')
axes[1].legend()

plt.show()

Diagnosing Overfitting vs Underfitting

| Pattern | Diagnosis | Solution |
|---|---|---|
| High train error, high val error | Underfitting | More capacity, longer training, less regularization |
| Low train error, high val error | Overfitting | More data, regularization, early stopping |
| Low train error, low val error | Good fit | Can still tune for better performance |
| Val loss rises while train loss falls | Overfitting (late) | Early stopping, fewer epochs |
| Oscillating loss | LR too high | Reduce learning rate |

Regularization: Dropout

# Keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(64, 3, activation='relu'),
    layers.Dropout(0.25),  # (#1:25% dropout after conv)
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5),  # (#2:50% dropout before output)
    layers.Dense(num_classes, activation='softmax')
])

# PyTorch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3),
            nn.ReLU(),
            nn.Dropout2d(0.25)  # (#3:Spatial dropout for conv)
        )
        self.classifier = nn.Sequential(
            nn.Linear(64*26*26, 256),
            nn.ReLU(),
            nn.Dropout(0.5),  # (#4:Standard dropout for FC)
            nn.Linear(256, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = x.flatten(1)  # flatten for the FC classifier
        return self.classifier(x)

Regularization: L2 and BatchNorm

# L2 Regularization (Weight Decay)
# Keras
from tensorflow.keras import regularizers

layers.Dense(256,
    kernel_regularizer=regularizers.l2(0.01)  # (#1:L2 penalty)
)

# PyTorch - weight decay in optimizer
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3,
    weight_decay=1e-4  # (#2:L2 via optimizer)
)

# Batch Normalization - implicit regularization
# Keras
model.add(layers.BatchNormalization())  # (#3:After conv/dense)

# PyTorch
nn.Sequential(
    nn.Conv2d(64, 128, 3),
    nn.BatchNorm2d(128),  # (#4:Normalize activations)
    nn.ReLU()
)

Label Smoothing

Concept

Instead of hard labels (0 or 1), use soft labels that prevent overconfidence.

Smoothed label = (1 - epsilon) * original + epsilon / num_classes

Benefits

  • Prevents overconfident predictions
  • Improves generalization
  • Better calibrated probabilities
# Keras - built-in support
loss = keras.losses.CategoricalCrossentropy(
    label_smoothing=0.1  # epsilon
)

# PyTorch - CrossEntropyLoss
loss_fn = nn.CrossEntropyLoss(
    label_smoothing=0.1
)

# Manual implementation
def smooth_labels(labels, epsilon=0.1):
    n_classes = labels.shape[-1]
    return labels * (1 - epsilon) + \
           epsilon / n_classes
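Plugging numbers into the manual version (re-stated so the snippet runs standalone): with epsilon = 0.1 and two classes, a hard one-hot label becomes [0.95, 0.05].

```python
import numpy as np

def smooth_labels(labels, epsilon=0.1):
    n_classes = labels.shape[-1]
    return labels * (1 - epsilon) + epsilon / n_classes

one_hot = np.array([[1.0, 0.0]])
print(smooth_labels(one_hot))  # [[0.95 0.05]]
```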

Essential Callbacks

from tensorflow.keras.callbacks import (
    ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
)

# Save best model
checkpoint = ModelCheckpoint(
    'best_model.keras',
    monitor='val_loss',  # (#1:Track validation loss)
    save_best_only=True,
    mode='min'
)

# Stop when no improvement
early_stop = EarlyStopping(
    monitor='val_loss',
    patience=10,  # (#2:Wait 10 epochs)
    restore_best_weights=True  # (#3:Revert to best)
)

# Reduce LR on plateau
reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,  # (#4:Halve learning rate)
    patience=5,
    min_lr=1e-7
)

callbacks = [checkpoint, early_stop, reduce_lr]
model.fit(X_train, y_train, callbacks=callbacks, epochs=100)

PyTorch: Custom Training Loop

best_val_loss = float('inf')
patience_counter = 0
patience = 10

for epoch in range(num_epochs):
    # Training
    model.train()
    for batch in train_loader:
        # ... training step ...

    # Validation
    model.eval()
    val_loss = evaluate(model, val_loader)  # (#1:Calculate val loss)

    # Save best model
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), 'best_model.pt')  # (#2:Save checkpoint)
        patience_counter = 0
    else:
        patience_counter += 1

    # Early stopping
    if patience_counter >= patience:  # (#3:Stop if no improvement)
        print(f"Early stopping at epoch {epoch}")
        break

    # LR scheduling
    scheduler.step(val_loss)  # (#4:ReduceLROnPlateau)

Hyperparameter Optimization with Optuna

What is Optuna?

Optuna is an automatic hyperparameter optimization framework.

  • Bayesian optimization
  • Pruning bad trials early
  • Visualization tools
  • Parallelization support

Key Concepts

  • Study: Optimization session
  • Trial: Single run with specific params
  • Objective: Function to optimize
  • Sampler: Search algorithm
  • Pruner: Early stopping strategy
pip install optuna optuna-dashboard

Optuna: Complete Example

import optuna

def objective(trial):
    # Suggest hyperparameters
    lr = trial.suggest_float('lr', 1e-5, 1e-2, log=True)  # (#1:Log-uniform)
    dropout = trial.suggest_float('dropout', 0.1, 0.5)
    batch_size = trial.suggest_categorical('batch_size', [16, 32, 64])  # (#2:Categorical)
    optimizer_name = trial.suggest_categorical('optimizer', ['Adam', 'SGD'])

    # Build model with suggested params
    model = build_model(dropout=dropout)

    # Train and evaluate
    accuracy = train_and_evaluate(
        model, lr=lr, batch_size=batch_size, optimizer=optimizer_name
    )

    return accuracy  # (#3:Maximize this)

# Create and run study
study = optuna.create_study(direction='maximize')  # (#4:Maximize accuracy)
study.optimize(objective, n_trials=100)

# Best parameters
print(f"Best params: {study.best_params}")
print(f"Best accuracy: {study.best_value:.4f}")

Optuna: Pruning and Visualization

# Enable pruning for unpromising trials
def objective(trial):
    model = build_model(trial)

    for epoch in range(num_epochs):
        train(model)
        val_acc = evaluate(model)

        # Report intermediate value
        trial.report(val_acc, epoch)  # (#1:Report progress)

        # Check if trial should be pruned
        if trial.should_prune():  # (#2:Prune bad trials)
            raise optuna.TrialPruned()

    return val_acc

# Create study with pruner
study = optuna.create_study(
    direction='maximize',
    pruner=optuna.pruners.MedianPruner()  # (#3:Prune below median)
)

# Visualize results
from optuna.visualization import plot_optimization_history, plot_param_importances
plot_optimization_history(study)  # (#4:Show progress)
plot_param_importances(study)  # (#5:Parameter importance)

Weights & Biases (W&B) Tracking

Features

  • Experiment tracking
  • Hyperparameter sweeps
  • Model versioning
  • Collaborative dashboards
  • Artifact management
import wandb

# Initialize run
wandb.init(
    project='cv-transfer-learning',
    config={
        'learning_rate': 1e-4,
        'epochs': 50,
        'batch_size': 32
    }
)

# Log metrics
for epoch in range(num_epochs):
    train_loss = train(model)
    val_loss = evaluate(model)

    wandb.log({
        'train_loss': train_loss,
        'val_loss': val_loss,
        'epoch': epoch
    })

wandb.finish()

W&B: Hyperparameter Sweeps

# Define sweep configuration
sweep_config = {
    'method': 'bayes',  # (#1:Bayesian optimization)
    'metric': {'name': 'val_accuracy', 'goal': 'maximize'},
    'parameters': {
        'learning_rate': {
            'distribution': 'log_uniform_values',  # (#2:Log-uniform sampling)
            'min': 1e-5, 'max': 1e-2
        },
        'dropout': {'values': [0.2, 0.3, 0.4, 0.5]},  # (#3:Discrete values)
        'batch_size': {'values': [16, 32, 64]},
        'optimizer': {'values': ['adam', 'sgd']}
    }
}

# Create sweep
sweep_id = wandb.sweep(sweep_config, project='cv-sweeps')

# Define training function
def train_sweep():
    wandb.init()
    config = wandb.config  # (#4:Get suggested params)
    model = build_model(config.dropout)
    # ... train with config.learning_rate, etc.

# Run sweep
wandb.agent(sweep_id, train_sweep, count=50)  # (#5:Run 50 trials)

Hands-on Lab: Transfer Learning

Objectives

Exercises

  1. Load EfficientNetB0 and prepare for transfer learning
  2. Train with frozen backbone (feature extraction)
  3. Fine-tune with discriminative learning rates
  4. Generate confusion matrix and classification report
  5. Run a simple Optuna optimization

Lab: Dataset and Setup

# Use a standard dataset for practice
import tensorflow_datasets as tfds

# Load dataset
(train_ds, val_ds), info = tfds.load(
    'oxford_flowers102',  # (#1:102 flower classes)
    split=['train', 'validation'],
    with_info=True,
    as_supervised=True
)

# Preprocessing function
def preprocess(image, label):
    image = tf.image.resize(image, (224, 224))
    image = tf.keras.applications.efficientnet.preprocess_input(image)  # (#2:Model-specific preprocessing)
    return image, label

# Prepare datasets
BATCH_SIZE = 32
train_ds = train_ds.map(preprocess).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)  # (#3:Optimize pipeline)
val_ds = val_ds.map(preprocess).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

Key Takeaways

Transfer Learning

Always start with pretrained models - they provide a strong foundation

Fine-tuning Strategy

Choose between feature extraction, partial, or full fine-tuning based on data size

Proper Evaluation

Use appropriate metrics for your task: classification, detection, or segmentation

Regularization

Dropout, L2, BatchNorm, and label smoothing prevent overfitting

Callbacks

Use checkpoints, early stopping, and LR scheduling for stable training

HPO Tools

Optuna and W&B automate the search for optimal hyperparameters

Next Session Preview

Session 6: Object Detection Fundamentals

Preparation: Review IoU and mAP concepts. Install Ultralytics YOLO package.

Resources

| Type | Resource |
|---|---|
| Library | timm - PyTorch Image Models |
| Documentation | Keras Transfer Learning Guide |
| Tool | Optuna - Hyperparameter Optimization |
| Platform | Weights & Biases |
| Paper | How Transferable are Features in DNNs? |
| Course | fast.ai - Practical Deep Learning |

Questions?

Lab Time

Fine-tune EfficientNet on the Flowers dataset

Experiment

Compare different fine-tuning strategies

Optimize

Try Optuna or W&B sweeps for hyperparameter search
