AI Agents

Building Autonomous Systems

Lecture 9

From assistants to autonomous development agents

Assistants vs Agents

AI Assistants

Respond to single prompts
No persistent memory
Human orchestrates every step
Limited to text in/out
Reactive: wait for instructions

AI Agents

Execute multi-step tasks autonomously
Maintain context across actions
Self-direct based on goals
Use tools (files, APIs, terminal)
Proactive: plan and execute

Agents are AI systems that can take actions in the world, not just generate text.

Agent Architecture

Goal (define objective) → Plan (break into steps) → Execute (run actions) → Observe (check results) → Reflect (adjust plan)

LLM Core

Reasoning engine (Claude, GPT)

Tools

Actions: files, APIs, shell

Memory

Context across interactions
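The loop above can be sketched in a few lines of Python. This is a toy illustration, not a production design: the `Agent` class, the `scripted_planner` function, and the `echo` tool are hypothetical names, and the planner is a hard-coded stand-in for the LLM core.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    """Toy agent loop: plan, execute, observe, reflect."""
    planner: Callable[[str, list], list]   # stand-in for the LLM core
    tools: dict                            # tool name -> callable taking one string
    memory: list = field(default_factory=list)
    max_rounds: int = 3                    # hard stop so the loop always terminates

    def run(self, goal: str) -> list:
        for _ in range(self.max_rounds):
            plan = self.planner(goal, self.memory)        # Plan
            if not plan:                                  # Reflect: nothing left to do
                break
            for tool_name, arg in plan:
                result = self.tools[tool_name](arg)       # Execute
                self.memory.append(f"{tool_name}({arg}) -> {result}")  # Observe
        return self.memory


def scripted_planner(goal, memory):
    """Hypothetical planner; a real agent would ask an LLM for the next steps."""
    if not memory:
        return [("echo", goal)]
    return []  # one observation is enough for this toy goal


agent = Agent(planner=scripted_planner, tools={"echo": lambda text: text.upper()})
history = agent.run("fix the bug")
print(history)
```

Memory here is just a list of observations passed back to the planner on each round, which is the simplest way to keep context across actions.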

Common Agent Tools

File Operations

Read, write, edit files in the codebase

read_file("src/app.py")
write_file("output.json", data)
edit_file("config.yaml", changes)
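One possible way to back these tool names with the standard library is sketched below. The signatures are assumptions: in particular, `edit_file` here takes an old and a new string rather than the `changes` argument shown on the slide, and the temporary directory exists only for the demo.

```python
import tempfile
from pathlib import Path


def read_file(path: str) -> str:
    """Return the file's text so the agent can observe it."""
    return Path(path).read_text(encoding="utf-8")


def write_file(path: str, content: str) -> str:
    """Create parent directories if needed, then write the content."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return f"wrote {len(content)} characters to {target.name}"


def edit_file(path: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old`; refuse ambiguous edits."""
    text = read_file(path)
    matches = text.count(old)
    if matches != 1:
        raise ValueError(f"expected exactly one match for {old!r}, found {matches}")
    write_file(path, text.replace(old, new))
    return "edited"


# Demo inside a throwaway directory
with tempfile.TemporaryDirectory() as workdir:
    config = str(Path(workdir) / "config.yaml")
    write_file(config, "debug: true\n")
    edit_file(config, "true", "false")
    edited = read_file(config)

print(edited)
```

Requiring a unique match in `edit_file` is a design choice: an agent that misjudges the file's contents gets an error back instead of silently changing the wrong line.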

Shell Commands

Execute terminal commands

run_command("pytest tests/")
run_command("npm install")
run_command("git status")
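A minimal sketch of `run_command`, assuming the tool should return the exit code and captured output as the agent's observation. The return shape and the timeout default are illustrative choices, not part of any particular framework.

```python
import shlex
import subprocess
import sys


def run_command(command: str, timeout: float = 30.0) -> dict:
    """Run a command without a shell and return what the agent should observe."""
    args = shlex.split(command)  # no shell=True, so pipes and redirects are not interpreted
    try:
        completed = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        return {"exit_code": 127, "stdout": "", "stderr": f"command not found: {args[0]}"}
    except subprocess.TimeoutExpired:
        return {"exit_code": None, "stdout": "", "stderr": f"timed out after {timeout}s"}
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


# Demo: run the current Python interpreter so the example works without pytest or npm
demo_command = shlex.join([sys.executable, "-c", "print(6 * 7)"])
observation = run_command(demo_command)
print(observation["exit_code"], observation["stdout"].strip())
```

Returning errors as data rather than raising lets the agent read the failure and adjust its plan.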

Web/API Access

Fetch data from the internet

web_search("Python async patterns")
fetch_url("https://api.example.com")
call_api("POST", url, payload)

Code Analysis

Parse and understand code

find_function("calculate_total")
list_imports("src/")
search_codebase("TODO")
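For Python code, tools like these can be approximated with the standard `ast` module. The sketch below works on a source string rather than a directory, so the signatures differ from the slide's; `search_text` is a hypothetical plain-text stand-in for `search_codebase`.

```python
import ast

SOURCE = '''
import os
from decimal import Decimal

def calculate_total(prices):
    # TODO: apply tax
    return sum(Decimal(p) for p in prices)
'''


def find_function(source: str, name: str):
    """Return the 1-based line number where `name` is defined, or None."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            return node.lineno
    return None


def list_imports(source: str) -> list:
    """Return the sorted names of all imported modules."""
    modules = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.append(node.module)
    return sorted(modules)


def search_text(source: str, needle: str) -> list:
    """Return the 1-based line numbers that contain `needle`."""
    return [i for i, line in enumerate(source.splitlines(), start=1) if needle in line]


print(find_function(SOURCE, "calculate_total"))
print(list_imports(SOURCE))
print(search_text(SOURCE, "TODO"))
```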

Building Agents with LangChain

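# Imports follow the classic LangChain agent API; newer releases may locate these elsewhere
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain_anthropic import ChatAnthropic

# read_file, write_file, run_pytest, and prompt_template are assumed to be
# defined elsewhere. The prompt must include the {tools}, {tool_names},
# {input}, and {agent_scratchpad} variables that create_react_agent expects.

# Define tools
tools = [
    Tool(name="read_file", func=read_file, description="Read a file's contents"),
    Tool(name="write_file", func=write_file, description="Write content to a file"),
    Tool(name="run_tests", func=run_pytest, description="Run pytest tests"),
]

# Create agent with Claude
llm = ChatAnthropic(model="claude-sonnet-4-20250514")
agent = create_react_agent(llm, tools, prompt_template)

# create_react_agent only builds the reasoning chain; AgentExecutor runs the tool loop
executor = AgentExecutor(agent=agent, tools=tools, max_iterations=10)

# Run agent with a goal
result = executor.invoke({
    "input": "Add input validation to the register_user function, then run tests"
})

print(result["output"])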

Claude Agent SDK

# Sketch assuming the claude-agent-sdk Python package (pip install claude-agent-sdk);
# check the SDK documentation for the current option names.
import asyncio
from claude_agent_sdk import ClaudeAgentOptions, query

# Configure the agent with built-in tools
options = ClaudeAgentOptions(
    model="claude-sonnet-4-20250514",
    allowed_tools=["Read", "Write", "Bash"],
    system_prompt="""You are a coding assistant.
    Read vision.md before any task.
    Use TDD: write tests first, then implement.""",
)

async def main():
    # Run agent task; messages stream back as the agent works
    async for message in query(
        prompt="Implement user registration following the spec in docs/specs/user-auth.md",
        options=options,
    ):
        # Agent will: read spec, write tests, implement, run tests
        print(message)  # Assistant text, tool calls, and the final result

asyncio.run(main())

Agent Guardrails

Agents need boundaries to be safe and effective

Dangerous Actions

Deleting production data
Pushing to main branch
Modifying system files
Running arbitrary scripts

Guardrail Strategies

Allowlist of safe directories
Command approval workflow
Sandbox environments
Action logging & audit

Rule: Never give an agent more permissions than it needs for the task.
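Three of these strategies (directory allowlist, command approval, action logging) can be sketched with the standard library. The allowed directory, the safe-command set, and the `approve` callback are hypothetical placeholders; a real deployment would combine checks like these with a sandbox instead of relying on them alone.

```python
import logging
from pathlib import Path

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("agent.audit")

# Hypothetical policy for illustration only
ALLOWED_DIRS = [Path("/workspace/project").resolve()]
SAFE_COMMANDS = {"pytest", "git status", "ls"}


def is_path_allowed(path: str) -> bool:
    """Allow a path only if it resolves to a location inside an allowed directory."""
    resolved = Path(path).resolve()  # collapses ".." so traversal tricks are caught
    allowed = any(resolved.is_relative_to(root) for root in ALLOWED_DIRS)
    audit_log.info("path=%s allowed=%s", resolved, allowed)
    return allowed


def check_command(command: str, approve=lambda cmd: False) -> bool:
    """Allow listed commands outright; anything else needs the approval callback."""
    allowed = command in SAFE_COMMANDS or approve(command)
    audit_log.info("command=%r allowed=%s", command, allowed)
    return allowed


print(is_path_allowed("/workspace/project/src/app.py"))
print(is_path_allowed("/workspace/project/../../etc/passwd"))
print(check_command("git status"))
print(check_command("rm -rf /"))
```

The default `approve` callback denies everything, so an unlisted command runs only when a human (or a stricter policy) explicitly says yes. `Path.is_relative_to` requires Python 3.9 or later.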

Agents Automate AIDD Phases

AIDD Phase	Agent Capability
Discover	Read vision.md, analyze codebase, map dependencies
Plan	Break task into steps, create subtasks, estimate scope
Review	Check plan against specs, identify risks, suggest alternatives
Execute	Write tests first, implement code, iterate until green
Commit	Run linting, security checks, create commit with message
Test	Run full test suite, generate coverage report, flag issues
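One way to picture the table is as an ordered pipeline in which each phase must succeed before the next one starts. The sketch below is illustrative only: the handler functions are hypothetical stand-ins for real agent work, and the `tests_green` flag is an invented context key.

```python
from typing import Callable

# Phase order taken from the table above
PHASES = ["Discover", "Plan", "Review", "Execute", "Commit", "Test"]


def run_phases(handlers: dict, context: dict) -> list:
    """Run phases in order and stop at the first one that is missing or fails."""
    completed = []
    for phase in PHASES:
        handler: Callable = handlers.get(phase)
        if handler is None or not handler(context):
            break
        completed.append(phase)
    return completed


# Hypothetical handlers: every phase passes except Execute, which needs green tests
handlers = {phase: (lambda ctx: True) for phase in PHASES}
handlers["Execute"] = lambda ctx: ctx.get("tests_green", False)

blocked = run_phases(handlers, {"tests_green": False})
finished = run_phases(handlers, {"tests_green": True})
print(blocked)
print(finished)
```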

Agent System Prompt

You are a coding agent following AIDD methodology.

Before ANY task: Read vision.md and relevant specs in docs/specs/
For implementation: Write failing test first, implement minimum to pass
Before committing: Run tests, check linting, verify alignment with vision.md

Agent Use Cases

Code Review Agent

Automatically reviews PRs for bugs, style, security issues

Test Generation Agent

Analyzes code changes, generates relevant tests

Documentation Agent

Keeps docs in sync with code changes

Migration Agent

Automates library upgrades and API migrations

Key Takeaways

Autonomous

Agents execute multi-step tasks independently

Tools

Files, APIs, shell - agents take real actions

Guardrails

Essential for safe, predictable behavior

AIDD

Framework provides structure for agent tasks

Questions?

Building AI Agents

Next: Best Practices & Future Trends
