AI Agents

Building Autonomous Systems

Lecture 9

From assistants to autonomous development agents

Assistants vs Agents

AI Assistants

  • Respond to single prompts
  • No persistent memory
  • Human orchestrates every step
  • Limited to text in/out
  • Reactive: wait for instructions

AI Agents

  • Execute multi-step tasks autonomously
  • Maintain context across actions
  • Self-direct based on goals
  • Use tools (files, APIs, terminal)
  • Proactive: plan and execute

Agents are AI systems that can take actions in the world, not just generate text.

Agent Architecture

Goal: Define objective
Plan: Break into steps
Execute: Run actions
Observe: Check results
Reflect: Adjust plan

LLM Core

Reasoning engine (Claude, GPT)

Tools

Actions: files, APIs, shell

Memory

Context across interactions
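
The Goal, Plan, Execute, Observe, Reflect cycle can be sketched as a short loop. This is a minimal illustration under stated assumptions, not a real framework: `fake_llm` is a hypothetical stand-in for the LLM core that returns scripted decisions, and `AgentState`, `TOOLS`, and the two toy tools are names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    goal: str
    memory: list[str] = field(default_factory=list)  # observations so far


def fake_llm(state: AgentState) -> tuple[str, str]:
    """Hypothetical stand-in for the LLM core: choose the next (tool, argument)."""
    if not state.memory:
        return ("list_files", ".")
    if len(state.memory) == 1:
        return ("count_items", state.memory[-1])
    return ("finish", state.memory[-1])


# Toy tools (assumptions for this sketch); real tools would touch files or APIs.
TOOLS: dict[str, Callable[[str], str]] = {
    "list_files": lambda path: "app.py,test_app.py",
    "count_items": lambda csv: str(len(csv.split(","))),
}


def run_agent(goal: str, max_steps: int = 5) -> str:
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        tool, arg = fake_llm(state)        # Plan: decide the next action
        if tool == "finish":
            return arg                     # Goal reached
        observation = TOOLS[tool](arg)     # Execute: run the action
        state.memory.append(observation)   # Observe: the next plan sees this result
    return "stopped: step limit reached"   # Guard against endless loops


result = run_agent("Count the files in the project")
```

The step limit matters in practice: without it, a model that never decides to finish would loop indefinitely.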

Common Agent Tools

File Operations

Read, write, edit files in the codebase

read_file("src/app.py")
write_file("output.json", data)
edit_file("config.yaml", changes)
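
One way the file tools above might be implemented is shown here as a sketch. The signatures are assumptions: the slide does not define what `changes` contains, so this `edit_file` takes an explicit old/new string pair and refuses ambiguous edits.

```python
import tempfile
from pathlib import Path


def read_file(path: str) -> str:
    """Return the file's text content."""
    return Path(path).read_text(encoding="utf-8")


def write_file(path: str, content: str) -> None:
    """Write text to a file, creating parent directories if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")


def edit_file(path: str, old: str, new: str) -> None:
    """Replace exactly one occurrence of `old`; fail if it is missing or ambiguous."""
    text = read_file(path)
    occurrences = text.count(old)
    if occurrences != 1:
        raise ValueError(f"expected exactly 1 match for {old!r}, found {occurrences}")
    write_file(path, text.replace(old, new))


# Usage: work in a throwaway directory so nothing real is modified.
with tempfile.TemporaryDirectory() as workdir:
    config = str(Path(workdir) / "conf" / "config.yaml")
    write_file(config, "debug: false\n")
    edit_file(config, "debug: false", "debug: true")
    updated = read_file(config)
```

Requiring exactly one match keeps an agent from silently rewriting the wrong part of a file.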

Shell Commands

Execute terminal commands

run_command("pytest tests/")
run_command("npm install")
run_command("git status")
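
A `run_command` tool could be built on the standard `subprocess` module. The sketch below is an assumption about its shape, not a reference implementation: it avoids `shell=True`, enforces a timeout, and returns the exit code and output so the agent can observe the result.

```python
import shlex
import subprocess
import sys


def run_command(command: str, timeout: float = 60.0) -> dict:
    """Run a command without a shell; return exit code, stdout, and stderr."""
    try:
        completed = subprocess.run(
            shlex.split(command),   # split into argv, so no shell interpretation
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"exit_code": None, "stdout": "", "stderr": f"timed out after {timeout}s"}
    except FileNotFoundError as error:
        return {"exit_code": None, "stdout": "", "stderr": str(error)}
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


# Usage: run the current Python interpreter as a harmless demo command.
result = run_command(f"{shlex.quote(sys.executable)} -c \"print('ok')\"")
```

Returning errors as data instead of raising lets the agent read the failure and adjust its plan.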

Web/API Access

Fetch data from the internet

web_search("Python async patterns")
fetch_url("https://api.example.com")
call_api("POST", url, payload)
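
A `fetch_url` tool can be sketched with the standard library alone. The scheme allowlist, size cap, and `User-Agent` string are assumptions added for this example; they stop an agent from reading local files through `file://` URLs or downloading unbounded responses.

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen

ALLOWED_SCHEMES = {"http", "https"}  # assumption: web access only


def fetch_url(url: str, timeout: float = 10.0, max_bytes: int = 1_000_000) -> str:
    """Fetch a URL and return at most `max_bytes` of decoded text."""
    scheme = urlparse(url).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"refusing to fetch {scheme!r} URL")
    request = Request(url, headers={"User-Agent": "agent-demo/0.1"})
    with urlopen(request, timeout=timeout) as response:
        body = response.read(max_bytes)
        charset = response.headers.get_content_charset() or "utf-8"
    return body.decode(charset, errors="replace")
```

A call such as `fetch_url("https://api.example.com")` would need network access, so it is not executed here.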

Code Analysis

Parse and understand code

find_function("calculate_total")
list_imports("src/")
search_codebase("TODO")
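
For Python code, tools like these can be sketched with the standard `ast` module, which parses source into a syntax tree instead of relying on text search. The signatures are assumptions: these versions take source text directly, whereas the slide's versions presumably search a whole codebase.

```python
import ast


def find_function(source: str, name: str) -> list[int]:
    """Return the line numbers where a function called `name` is defined."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name == name
    ]


def list_imports(source: str) -> list[str]:
    """Return the sorted names of all modules imported by the source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return sorted(modules)


SAMPLE = '''import os
from decimal import Decimal

def calculate_total(items):
    return sum(items)
'''

lines = find_function(SAMPLE, "calculate_total")
imports = list_imports(SAMPLE)
```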

Building Agents with LangChain

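# Targets the classic LangChain agent API (0.1 to 0.3); newer releases may differ.
import subprocess
from pathlib import Path

from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain_anthropic import ChatAnthropic

# Tool functions. A ReAct agent passes each tool a single string.
def read_file(path: str) -> str:
    return Path(path.strip()).read_text(encoding="utf-8")

def write_file(spec: str) -> str:
    path, _, content = spec.partition("\n")  # convention for this example: "<path>\n<content>"
    Path(path.strip()).write_text(content, encoding="utf-8")
    return f"Wrote {path.strip()}"

def run_pytest(target: str) -> str:
    done = subprocess.run(["pytest", target.strip() or "tests/"], capture_output=True, text=True)
    return done.stdout + done.stderr

# Define tools
tools = [
    Tool(name="read_file", func=read_file, description="Read a file's contents"),
    Tool(name="write_file", func=write_file, description="Write a file. Input: path, newline, content"),
    Tool(name="run_tests", func=run_pytest, description="Run pytest tests"),
]

# Create agent with Claude
llm = ChatAnthropic(model="claude-sonnet-4-20250514")
prompt_template = hub.pull("hwchase17/react")  # standard ReAct prompt; needs network access
agent = create_react_agent(llm, tools, prompt_template)
executor = AgentExecutor(agent=agent, tools=tools, max_iterations=10)  # runs the tool loop

# Run agent with a goal
result = executor.invoke({
    "input": "Add input validation to the register_user function, then run tests"
})

print(result["output"])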

Claude Agent SDK

# Requires the claude-agent-sdk package (pip install claude-agent-sdk)
import asyncio

from claude_agent_sdk import ClaudeAgentOptions, ResultMessage, query

# Configure the agent with built-in tools
options = ClaudeAgentOptions(
    model="claude-sonnet-4-20250514",
    allowed_tools=["Read", "Write", "Bash"],
    system_prompt="""You are a coding assistant.
    Read vision.md before any task.
    Use TDD: write tests first, then implement.""",
)

async def main() -> None:
    # Run agent task; messages stream back as the agent works
    async for message in query(
        prompt="Implement user registration following the spec in docs/specs/user-auth.md",
        options=options,
    ):
        # Agent will: read spec, write tests, implement, run tests
        if isinstance(message, ResultMessage):
            print(message.result)  # Final result
        else:
            print(message)  # Intermediate messages, including actions taken

asyncio.run(main())

Agent Guardrails

Agents need boundaries to be safe and effective

Dangerous Actions

  • Deleting production data
  • Pushing to main branch
  • Modifying system files
  • Running arbitrary scripts

Guardrail Strategies

  • Allowlist of safe directories
  • Command approval workflow
  • Sandbox environments
  • Action logging & audit

Rule: Never give an agent more permissions than it needs for the task.
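
Three of the guardrail strategies above (directory allowlist, command approval, action logging) can be combined in a small policy layer. This is a sketch under stated assumptions: the `/workspace/project` root, the command allowlist, and the "allow"/"ask" decision values are all hypothetical choices for the example.

```python
from pathlib import Path

# Hypothetical policy values; a real deployment would load these from config.
ALLOWED_DIRS = [Path("/workspace/project").resolve()]
ALLOWED_COMMANDS = {"pytest", "git", "npm"}
GIT_NEEDS_APPROVAL = {"push", "reset", "clean"}

audit_log: list[str] = []  # every decision is recorded for later review


def is_path_allowed(path: str) -> bool:
    """True only if the resolved path stays inside an allowed directory."""
    resolved = Path(path).resolve()  # collapses ".." before the check
    return any(resolved.is_relative_to(root) for root in ALLOWED_DIRS)


def check_command(command: list[str]) -> str:
    """Return "allow", or "ask" when a human must approve the command first."""
    program = command[0] if command else ""
    if program not in ALLOWED_COMMANDS:
        decision = "ask"
    elif program == "git" and len(command) > 1 and command[1] in GIT_NEEDS_APPROVAL:
        decision = "ask"
    else:
        decision = "allow"
    audit_log.append(f"{decision}: {' '.join(command)}")
    return decision


inside = is_path_allowed("/workspace/project/src/app.py")
escaped = is_path_allowed("/workspace/project/../../etc/passwd")
test_run = check_command(["pytest", "tests/"])
push = check_command(["git", "push", "origin", "main"])
```

Checks like these complement a sandbox but do not replace one, since an allowed program can still do harm through its own arguments.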

Agents Automate AIDD Phases

AIDD Phase | Agent Capability
Discover   | Read vision.md, analyze codebase, map dependencies
Plan       | Break task into steps, create subtasks, estimate scope
Review     | Check plan against specs, identify risks, suggest alternatives
Execute    | Write tests first, implement code, iterate until green
Commit     | Run linting, security checks, create commit with message
Test       | Run full test suite, generate coverage report, flag issues

Agent System Prompt

You are a coding agent following AIDD methodology.

Before ANY task: Read vision.md and relevant specs in docs/specs/
For implementation: Write failing test first, implement minimum to pass
Before committing: Run tests, check linting, verify alignment with vision.md

Agent Use Cases

Code Review Agent

Automatically reviews PRs for bugs, style, security issues

Test Generation Agent

Analyzes code changes, generates relevant tests

Documentation Agent

Keeps docs in sync with code changes

Migration Agent

Automates library upgrades and API migrations

Key Takeaways

Autonomous

Agents execute multi-step tasks independently

Tools

Files, APIs, shell - agents take real actions

Guardrails

Essential for safe, predictable behavior

AIDD

Framework provides structure for agent tasks

Questions?

Building AI Agents

Next: Best Practices & Future Trends
