PW 2

Building an LLM Agent in Java

From API call to autonomous agent with custom skills

Duration: 3h
Level: Advanced
Session: 2

Objectives

By the end of this practical work, you will have:

  • Made a direct API call to GPT-5-mini from vanilla Java
  • Built a complete agent loop that calls the LLM and processes tool invocations
  • Implemented 2+ tools (Calculator, ReadFile) the agent can invoke autonomously
  • Created 1+ skill (CodeReview) combining a system prompt with focused tools
  • Written unit tests for your tools and integration tests for the agent
  • Given a live demo of your agent solving real tasks

Grading Criteria

  • Methodology (15 pts) — AIDD workflow, vision.md, commits
  • Agent Core (25 pts) — LLMClient, Agent loop, ToolRegistry
  • Tools & Skills (25 pts) — 2+ tools, 1+ skill, correct schemas
  • Code Quality (15 pts) — clean code, error handling, no hardcoded keys
  • Tests (10 pts) — unit tests for tools, integration test concept
  • Demo (10 pts) — live demonstration, multi-tool queries

Timeline

  • Step 1: Setup & First API Call (20 min)
  • Step 2: Build Agent Core (40 min)
  • Step 3: Implement 2+ Tools (30 min)
  • Step 4: Build a Skill (15 min)
  • Step 5: Personal Project (45 min)
  • Step 6: Tests + Cross Review (15 min)
  • Step 7: Demos & Wrap-up (20 min)

Time management is critical. Steps 1–4 are guided and must be completed before moving to the personal project in Step 5. Do not spend extra time polishing early steps — a working agent with 2 tools is worth more than a perfect single tool.

Step 1 ~20 min

Setup & First API Call

1.1 Create the Maven project

Create the following directory structure:

mkdir -p llm-agent/src/main/java/fr/epita/agent/services
mkdir -p llm-agent/src/main/java/fr/epita/agent/tools
mkdir -p llm-agent/src/main/java/fr/epita/agent/skills
mkdir -p llm-agent/src/main/java/fr/epita/agent/launchers
mkdir -p llm-agent/src/test/java/fr/epita/agent/tools
cd llm-agent

1.2 Create pom.xml

Create pom.xml at the project root:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
         http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>fr.epita.agent</groupId>
    <artifactId>llm-agent</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>

    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.json</groupId>
            <artifactId>json</artifactId>
            <version>20240303</version>
        </dependency>
        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter</artifactId>
            <version>5.10.2</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>3.1.0</version>
                <configuration>
                    <mainClass>fr.epita.agent.launchers.Main</mainClass>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

1.3 Set your API key

Set the OPENAI_API_KEY environment variable. Never hardcode it in source code.

# Linux / macOS
export OPENAI_API_KEY="sk-..."

# Windows PowerShell
$env:OPENAI_API_KEY = "sk-..."

1.4 Create LLMClient.java

This class wraps the OpenAI Chat Completions API using only java.net.http (no external HTTP library). Create src/main/java/fr/epita/agent/services/LLMClient.java:

package fr.epita.agent.services;

import org.json.JSONArray;
import org.json.JSONObject;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

public class LLMClient {
    private static final String API_URL =
        "https://api.openai.com/v1/chat/completions";
    private static final String MODEL = "gpt-5-mini";
    private final String apiKey;
    private final HttpClient client;

    public LLMClient(String apiKey) {
        this.apiKey = apiKey;
        this.client = HttpClient.newHttpClient();
    }

    public JSONObject call(List<JSONObject> messages,
                           JSONArray tools) throws Exception {
        var body = new JSONObject()
            .put("model", MODEL)
            .put("messages", new JSONArray(messages));
        if (tools != null && !tools.isEmpty()) {
            body.put("tools", tools);
            body.put("tool_choice", "auto");
        }
        var req = HttpRequest.newBuilder()
            .uri(URI.create(API_URL))
            .header("Content-Type", "application/json")
            .header("Authorization", "Bearer " + apiKey)
            .POST(HttpRequest.BodyPublishers.ofString(
                body.toString()))
            .build();
        var resp = client.send(req,
            HttpResponse.BodyHandlers.ofString());
        if (resp.statusCode() != 200) {
            throw new RuntimeException("API error "
                + resp.statusCode() + ": " + resp.body());
        }
        return new JSONObject(resp.body());
    }
}
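
Note that HttpClient.newHttpClient() applies no timeouts, so a stalled API call can hang the agent indefinitely. If you want to harden the client, a minimal sketch (the class and method names here are illustrative, not part of the workshop code) looks like this:

```java
import java.net.http.HttpClient;
import java.time.Duration;

public class TimeoutClientDemo {
    // Sketch: an HttpClient with an explicit connect timeout.
    // A per-request read timeout is set separately on the request:
    // HttpRequest.newBuilder().timeout(Duration.ofSeconds(60))...
    static HttpClient build(int connectSeconds) {
        return HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(connectSeconds))
            .build();
    }

    public static void main(String[] args) {
        // connectTimeout() returns an Optional<Duration>
        System.out.println(
            build(10).connectTimeout().get().getSeconds());
    }
}
```

In LLMClient you would apply the same builder call in the constructor and add .timeout(...) to the HttpRequest built in call().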

1.5 Test with a simple main

Create a quick test in src/main/java/fr/epita/agent/launchers/Main.java:

package fr.epita.agent.launchers;

import fr.epita.agent.services.LLMClient;
import org.json.JSONObject;
import java.util.List;

public class Main {
    public static void main(String[] args) throws Exception {
        String apiKey = System.getenv("OPENAI_API_KEY");
        if (apiKey == null || apiKey.isBlank()) {
            System.err.println(
                "Error: set OPENAI_API_KEY environment variable.");
            System.exit(1);
        }
        var llm = new LLMClient(apiKey);
        var msg = new JSONObject()
            .put("role", "user")
            .put("content", "Hello! What can you do?");
        var response = llm.call(List.of(msg), null);
        String reply = response
            .getJSONArray("choices")
            .getJSONObject(0)
            .getJSONObject("message")
            .getString("content");
        System.out.println("GPT-5-mini says: " + reply);
    }
}

1.6 Run it

mvn compile exec:java

You should see a friendly reply from GPT-5-mini in your terminal.

Checkpoint: The API responds with a natural language reply. You have a working Maven project that can talk to GPT-5-mini.

Step 2 ~40 min

Build Agent Core

An agent is an LLM in a loop that can decide to call tools, observe the results, and reason about the next step. We need three components: a Tool interface, a ToolRegistry, and the Agent loop itself.

2.1 Tool.java — The tool contract

Create src/main/java/fr/epita/agent/services/Tool.java:

package fr.epita.agent.services;

import org.json.JSONObject;

public interface Tool {
    String name();
    String description();
    JSONObject parameters();   // JSON Schema for the input
    String execute(JSONObject args);
}

Every tool declares its name, a natural language description (the LLM reads this to decide when to call the tool), a JSON Schema for its parameters, and an execute method that returns a string result.

2.2 ToolRegistry.java — Tool management

Create src/main/java/fr/epita/agent/services/ToolRegistry.java:

package fr.epita.agent.services;

import org.json.JSONArray;
import org.json.JSONObject;
import java.util.HashMap;
import java.util.Map;

public class ToolRegistry {
    private final Map<String, Tool> tools = new HashMap<>();

    public void register(Tool tool) {
        tools.put(tool.name(), tool);
    }

    public String run(String name, JSONObject args) {
        Tool tool = tools.get(name);
        if (tool == null)
            throw new IllegalArgumentException(
                "Unknown tool: " + name);
        return tool.execute(args);
    }

    public JSONArray declarations() {
        var arr = new JSONArray();
        for (var t : tools.values()) {
            arr.put(new JSONObject()
                .put("type", "function")
                .put("function", new JSONObject()
                    .put("name", t.name())
                    .put("description", t.description())
                    .put("parameters", t.parameters())));
        }
        return arr;
    }
}

The declarations() method converts all registered tools into the OpenAI function-calling format that the API expects.
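
For a single registered tool, the array produced by declarations() has this shape (a hypothetical calculate tool, values abbreviated):

```json
[
  {
    "type": "function",
    "function": {
      "name": "calculate",
      "description": "Evaluate a mathematical expression.",
      "parameters": {
        "type": "object",
        "properties": {
          "expression": { "type": "string" }
        },
        "required": ["expression"]
      }
    }
  }
]
```

This array is what LLMClient attaches as the tools field of the request body.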

2.3 Agent.java — The agent loop

This is the heart of the system. Create src/main/java/fr/epita/agent/services/Agent.java:

package fr.epita.agent.services;

import org.json.JSONObject;
import java.util.ArrayList;

public class Agent {
    private static final int MAX_ITERATIONS = 10;
    private final String systemPrompt;
    private final ToolRegistry registry;
    private final LLMClient llm;

    public Agent(String systemPrompt,
                 ToolRegistry registry,
                 LLMClient llm) {
        this.systemPrompt = systemPrompt;
        this.registry = registry;
        this.llm = llm;
    }

    public String run(String userInput) throws Exception {
        var messages = new ArrayList<JSONObject>();
        messages.add(new JSONObject()
            .put("role", "system")
            .put("content", systemPrompt));
        messages.add(new JSONObject()
            .put("role", "user")
            .put("content", userInput));

        for (int i = 0; i < MAX_ITERATIONS; i++) {
            var response = llm.call(
                messages, registry.declarations());
            var message = response
                .getJSONArray("choices")
                .getJSONObject(0)
                .getJSONObject("message");

            if (message.has("tool_calls")) {
                // The LLM wants to call one or more tools
                messages.add(message);
                var toolCalls =
                    message.getJSONArray("tool_calls");

                for (int j = 0; j < toolCalls.length(); j++) {
                    var toolCall =
                        toolCalls.getJSONObject(j);
                    String callId =
                        toolCall.getString("id");
                    var function =
                        toolCall.getJSONObject("function");
                    String toolName =
                        function.getString("name");
                    var args = new JSONObject(
                        function.getString("arguments"));

                    System.out.println("  [tool] "
                        + toolName + "(" + args + ")");

                    String result;
                    try {
                        result =
                            registry.run(toolName, args);
                    } catch (Exception e) {
                        result = "Error: " + e.getMessage();
                    }

                    System.out.println(
                        "  [result] " + result);

                    messages.add(new JSONObject()
                        .put("role", "tool")
                        .put("tool_call_id", callId)
                        .put("content", result));
                }
            } else {
                // No tool calls = final answer
                return message.getString("content");
            }
        }
        return "Agent reached maximum iterations ("
            + MAX_ITERATIONS + ").";
    }
}

2.4 Test with an EchoTool

Before building real tools, verify the agent loop works with a trivial tool. You can put this in a separate file or as an inner class in Main:

import fr.epita.agent.services.Tool;
import org.json.JSONObject;

class EchoTool implements Tool {
    public String name() { return "echo"; }

    public String description() {
        return "Echo back the input text";
    }

    public JSONObject parameters() {
        return new JSONObject("""
            {"type":"object",
             "properties":{
               "text":{"type":"string"}
             },
             "required":["text"]}""");
    }

    public String execute(JSONObject args) {
        return "Echo: " + args.getString("text");
    }
}

Then test it in your Main:

var registry = new ToolRegistry();
registry.register(new EchoTool());

var agent = new Agent(
    "You are a helpful assistant. Use tools when needed.",
    registry, llm);

String result = agent.run("Please echo the word hello");
System.out.println("Agent: " + result);

You should see in the console:

  [tool] echo({"text":"hello"})
  [result] Echo: hello
Agent: The echo tool returned: "Echo: hello"

What just happened? The LLM decided on its own to call the echo tool. The agent executed it, fed the result back to the LLM, and the LLM produced a final human-readable answer. This is the core agent loop: LLM decides → tool executes → result feeds back → LLM responds.

Checkpoint: The agent calls EchoTool, feeds the result back to the LLM, and produces a final answer. You have a working agent loop.

Step 3 ~30 min

Implement 2+ Tools

Now replace the toy EchoTool with real, useful tools the agent can use to solve actual problems.

3.1 CalculatorTool.java

Create src/main/java/fr/epita/agent/tools/CalculatorTool.java. This tool uses a recursive descent parser to safely evaluate math expressions — no eval(), no scripting engine, just clean parsing:

package fr.epita.agent.tools;

import fr.epita.agent.services.Tool;
import org.json.JSONObject;

public class CalculatorTool implements Tool {
    @Override
    public String name() {
        return "calculate";
    }

    @Override
    public String description() {
        return "Evaluate a mathematical expression. "
            + "Supports +, -, *, /, parentheses.";
    }

    @Override
    public JSONObject parameters() {
        return new JSONObject("""
            {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description":
                          "The math expression, e.g. '2 + 3 * 4'"
                    }
                },
                "required": ["expression"]
            }""");
    }

    @Override
    public String execute(JSONObject args) {
        String expr = args.getString("expression");
        try {
            double result = eval(expr.trim());
            // Return integer format if no decimal part
            if (result == Math.floor(result)
                    && !Double.isInfinite(result)) {
                return String.valueOf((long) result);
            }
            return String.valueOf(result);
        } catch (Exception e) {
            return "Error evaluating '"
                + expr + "': " + e.getMessage();
        }
    }

    /** Recursive descent parser for basic math. */
    private double eval(String expr) {
        return new Object() {
            int pos = -1, ch;

            void next() {
                ch = (++pos < expr.length())
                    ? expr.charAt(pos) : -1;
            }

            boolean eat(int c) {
                while (ch == ' ') next();
                if (ch == c) { next(); return true; }
                return false;
            }

            double parse() {
                next();
                double x = parseExpr();
                if (pos < expr.length())
                    throw new RuntimeException(
                        "Unexpected: " + (char) ch);
                return x;
            }

            double parseExpr() {
                double x = parseTerm();
                for (;;) {
                    if (eat('+')) x += parseTerm();
                    else if (eat('-')) x -= parseTerm();
                    else return x;
                }
            }

            double parseTerm() {
                double x = parseFactor();
                for (;;) {
                    if (eat('*')) x *= parseFactor();
                    else if (eat('/')) x /= parseFactor();
                    else return x;
                }
            }

            double parseFactor() {
                if (eat('+')) return +parseFactor();
                if (eat('-')) return -parseFactor();
                double x;
                if (eat('(')) {
                    x = parseExpr();
                    eat(')');
                } else {
                    int start = pos;
                    while ((ch >= '0' && ch <= '9')
                            || ch == '.') next();
                    x = Double.parseDouble(
                        expr.substring(start, pos));
                }
                return x;
            }
        }.parse();
    }
}

3.2 ReadFileTool.java

Create src/main/java/fr/epita/agent/tools/ReadFileTool.java. This tool includes safety checks to prevent directory traversal attacks:

package fr.epita.agent.tools;

import fr.epita.agent.services.Tool;
import org.json.JSONObject;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadFileTool implements Tool {
    @Override
    public String name() {
        return "read_file";
    }

    @Override
    public String description() {
        return "Read the contents of a file given "
            + "its relative path.";
    }

    @Override
    public JSONObject parameters() {
        return new JSONObject("""
            {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description":
                          "Relative file path, e.g. 'pom.xml'"
                    }
                },
                "required": ["path"]
            }""");
    }

    @Override
    public String execute(JSONObject args) {
        String filePath = args.getString("path");
        try {
            Path path = Path.of(filePath).normalize();
            // Safety: reject absolute paths and traversal
            if (path.isAbsolute()
                    || path.toString().contains("..")) {
                return "Error: only relative paths "
                    + "within the project are allowed.";
            }
            String content = Files.readString(path);
            if (content.length() > 5000) {
                content = content.substring(0, 5000)
                    + "\n... (truncated)";
            }
            return content;
        } catch (Exception e) {
            return "Error reading '" + filePath
                + "': " + e.getMessage();
        }
    }
}
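
The guard works because normalize() collapses redundant segments first, so any ".." that survives normalization would escape the starting directory. The same logic, extracted into a standalone sketch (the class name is illustrative):

```java
import java.nio.file.Path;

public class PathSafetyDemo {
    // Mirrors the ReadFileTool guard: reject absolute paths and any
    // path that still contains ".." after normalization.
    static boolean isAllowed(String raw) {
        Path p = Path.of(raw).normalize();
        return !p.isAbsolute() && !p.toString().contains("..");
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("pom.xml"));               // allowed
        System.out.println(isAllowed("docs/../../etc/passwd")); // rejected
        System.out.println(isAllowed("/etc/passwd"));           // rejected on POSIX
    }
}
```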

3.3 Register and test both tools

Update your Main.java to register both tools and run test queries:

var registry = new ToolRegistry();
registry.register(new CalculatorTool());
registry.register(new ReadFileTool());

var agent = new Agent(
    "You are a helpful assistant. Use tools when needed "
    + "to answer questions accurately.",
    registry, llm);

// Test 1: Calculator
String r1 = agent.run("What is 147 * 38 + 529?");
System.out.println("\nAgent: " + r1);

// Test 2: File reading
String r2 = agent.run(
    "Read the pom.xml file and tell me "
    + "what dependencies this project uses.");
System.out.println("\nAgent: " + r2);

Run it again:

mvn compile exec:java

Expected output (calculator):

  [tool] calculate({"expression":"147 * 38 + 529"})
  [result] 6115
Agent: The result of 147 * 38 + 529 is **6,115**.

Expected output (file reading):

  [tool] read_file({"path":"pom.xml"})
  [result] <?xml version="1.0" ...> ...

Agent: The project uses two dependencies:
  1. org.json:json (version 20240303) - JSON processing
  2. org.junit.jupiter:junit-jupiter (version 5.10.2, test scope)

3.4 Optional: Build a 3rd tool

If you finish early, add one more tool. Ideas:

  • CurrentTimeTool — returns the current date and time
  • ListFilesTool — lists files in a directory
  • HttpFetchTool — fetches a URL and returns the response body

Use AIDD to build your 3rd tool! Describe the tool you want in natural language, let the AI generate the code, review the JSON Schema it produces, then test it with the agent.

Checkpoint: Both CalculatorTool and ReadFileTool work. The agent autonomously decides which tool to call based on the user's question.

Step 4 ~15 min

Build a Skill

4.1 What is a skill?

A skill is a reusable, focused capability built on top of the agent. It combines:

  • A specialized system prompt that constrains the agent's behavior
  • A curated set of tools relevant to the task
  • A simple API (one method) that hides the agent complexity

Think of it as: Skill = System Prompt + Tools + Focused Task

4.2 CodeReviewSkill.java

Create src/main/java/fr/epita/agent/skills/CodeReviewSkill.java:

package fr.epita.agent.skills;

import fr.epita.agent.services.*;
import fr.epita.agent.tools.ReadFileTool;

/**
 * A reusable skill that reviews code for quality issues.
 * Combines a focused system prompt with read-only tools.
 */
public class CodeReviewSkill {
    private static final String SYSTEM_PROMPT = """
        You are a code reviewer. When asked to review a file:
        1. Use the read_file tool to read its contents
        2. Analyze the code for: security issues, bugs,
           readability, performance
        3. Return a structured report with severity levels
           (critical/major/minor)
        Keep the review concise and actionable.
        """;

    private final LLMClient llm;

    public CodeReviewSkill(LLMClient llm) {
        this.llm = llm;
    }

    public String review(String filePath) throws Exception {
        var registry = new ToolRegistry();
        registry.register(new ReadFileTool());
        var agent = new Agent(SYSTEM_PROMPT, registry, llm);
        return agent.run("Review the file: " + filePath);
    }
}

4.3 Test the skill

Add this to your Main.java:

System.out.println("\n=== Code Review Skill ===\n");

var reviewSkill = new CodeReviewSkill(llm);
String review = reviewSkill.review(
    "src/main/java/fr/epita/agent/services/Agent.java");
System.out.println("Review:\n" + review);

Here is the actual output from a test run reviewing our Agent.java:

  [tool] read_file({"path":"src/main/java/fr/epita/agent/services/Agent.java"})
  [result] package fr.epita.agent.services; ...

Review:
## Code Review: Agent.java

### Major
- **No input validation**: userInput is passed directly
  without sanitization. A null or empty input would produce
  an unhelpful error.
- **Unbounded message list**: The messages ArrayList grows
  with every iteration but is never trimmed. Long agent
  runs could exceed the model's context window.

### Minor
- **System.out.println in library code**: Debug logging
  should use a proper logging framework (SLF4J/Logback).
- **Magic number**: MAX_ITERATIONS = 10 could be
  configurable via the constructor.
- **No timeout on HTTP calls**: The LLMClient has no
  read timeout, risking indefinite hangs.

Notice the pattern: the skill created its own agent with a focused prompt and only the read_file tool. The agent autonomously read the file, then the LLM produced a structured review. The caller just called reviewSkill.review(path) — all complexity is hidden behind a single method.

Checkpoint: The CodeReviewSkill runs end-to-end. It reads the file autonomously and produces a structured review with severity levels.

4.4 Bonus: From Java Skill to Claude Code Skill

Your Java Skill class and a Claude Code SKILL.md are the same concept: system prompt + tools + input → structured output. Let’s create a real one.

Create the directory and file:

mkdir -p ~/.claude/skills/craft

Create ~/.claude/skills/craft/SKILL.md with this content:

---
name: craft
description: Generate a CRAFT-structured prompt for
  AI-driven development tasks
user-invocable: true
allowed-tools: Read Grep Glob
argument-hint: [feature-description]
---

The user wants to build: $ARGUMENTS

Your job is to generate a complete CRAFT prompt by
analyzing the project first.

## Step 1: Context (auto-discover)
Use Read, Grep, and Glob to find:
- Framework and language (check pom.xml, package.json, etc.)
- Existing files and project structure
- Relevant types, interfaces, existing patterns
- Conventions (naming, folder structure)

## Step 2: Requirement
Restate what the user wants with:
- Precise acceptance criteria (checkboxes)
- Edge cases and error behavior
- What should NOT happen

## Step 3: Action
Specify the exact file(s) to create or modify,
with full paths based on the project structure
you discovered.

## Step 4: Format
State the technical constraints:
- Language version, framework conventions
- Types to use or extend
- Patterns to follow (from existing code)

## Step 5: Test
Describe how to verify:
- Expected behavior
- Edge cases to test
- What must NOT happen

## Output
Return the CRAFT prompt in a fenced code block,
ready to copy-paste into any AI coding assistant.

Test it

In Claude Code, type:

/craft add a CurrentTimeTool that returns the current date and time

Claude will scan your project, find the Tool interface, the existing tools as examples, and generate a complete CRAFT prompt with the right file path, parameter schema, and test cases.

The pattern is universal: your Java Skill class encapsulates system prompt + tools + execute(). A Claude Code SKILL.md does the same thing: frontmatter (metadata) + body (prompt) + allowed-tools. Same architecture, different runtimes.

Step 5 ~45 min

Personal Project

Now extend your agent with your own ideas. Choose a direction and build something unique.

5.1 Choose your direction

Pick one of these project directions (or propose your own):

  • DevOps Agent — tools for reading logs, listing processes, checking disk space
  • Data Agent — tools for reading CSV files, computing statistics, generating summaries
  • Documentation Agent — tools for reading source files, generating Javadoc, writing README sections
  • Testing Agent — tools for reading code and generating JUnit test cases
  • Refactoring Agent — tools for reading code, identifying code smells, suggesting improvements

5.2 Write your vision.md

Create a vision.md at the project root describing your extended agent:

# My Agent - Vision

## Purpose
[What does your agent do? What problem does it solve?]

## Tools
- **Tool 1**: [name] - [what it does]
- **Tool 2**: [name] - [what it does]
- **Tool 3**: [name] - [what it does]

## Skills
- **Skill 1**: [name] - [system prompt focus + which tools]

## User Stories
- As a user, I can ask "..." and the agent will ...
- As a user, I can ask "..." and the agent will ...

5.3 Implement your extensions

Build at least:

  • 1–2 more tools relevant to your chosen direction
  • Enhance your skill or create a new one
  • Error handling — graceful failures, input validation, helpful error messages

Use the AIDD workflow: describe what you want, let the AI generate code, review it, test it, commit.

5.4 Implementation tips

  • Keep tool descriptions clear and specific — the LLM reads them to decide when to call each tool
  • Return structured data from tools (not just raw text) for better LLM reasoning
  • Test each tool in isolation before integrating with the agent
  • Use try/catch in every tool's execute() method — a crashing tool kills the agent loop
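
To make the "structured data" tip concrete: a hypothetical disk-space tool for the DevOps direction might return a small JSON string rather than a prose sentence, giving the model unambiguous fields to reason over:

```java
public class StructuredResultDemo {
    // Hypothetical tool result: hand-built JSON with explicit fields
    // instead of free text like "750 MB of 1000 MB used".
    static String diskUsageResult(long usedMb, long totalMb) {
        return "{\"used_mb\":" + usedMb
             + ",\"total_mb\":" + totalMb
             + ",\"percent\":" + (100 * usedMb / totalMb) + "}";
    }

    public static void main(String[] args) {
        System.out.println(diskUsageResult(750, 1000));
    }
}
```

For used=750 and total=1000 this returns {"used_mb":750,"total_mb":1000,"percent":75}.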

Mid-sprint checkpoint (25 min mark): You should have at least one new tool working by now. If you are stuck, ask the instructor or simplify your scope. A working simple tool is better than a broken complex one.

Checkpoint: Your extended agent has at least 3 tools total and 1 skill. It can handle queries specific to your chosen direction. Code is committed with a clear vision.md.

Step 6 ~15 min

Tests + Cross Review

6.1 Unit tests for tools

Tools are plain Java objects: given JSON input, they produce a string output, so they can be tested without any API call. Create src/test/java/fr/epita/agent/tools/CalculatorToolTest.java:

package fr.epita.agent.tools;

import org.json.JSONObject;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

class CalculatorToolTest {
    private final CalculatorTool tool = new CalculatorTool();

    @Test
    void handlesSimpleAddition() {
        var args = new JSONObject()
            .put("expression", "2 + 3");
        assertEquals("5", tool.execute(args));
    }

    @Test
    void respectsOperatorPrecedence() {
        var args = new JSONObject()
            .put("expression", "2 + 3 * 4");
        assertEquals("14", tool.execute(args));
    }

    @Test
    void handlesParentheses() {
        var args = new JSONObject()
            .put("expression", "(2 + 3) * 4");
        assertEquals("20", tool.execute(args));
    }

    @Test
    void handlesComplexExpression() {
        var args = new JSONObject()
            .put("expression", "147 * 38 + 529");
        assertEquals("6115", tool.execute(args));
    }

    @Test
    void returnsErrorForInvalidInput() {
        var args = new JSONObject()
            .put("expression", "abc");
        String result = tool.execute(args);
        assertTrue(result.startsWith("Error"));
    }
}

Run the tests:

mvn test

All 5 tests should pass green.

6.2 Integration test concept

Integration tests for an agent are harder because they involve live API calls. Two approaches:

  • Real API test (slow, costs money): call the real API and assert the agent uses the expected tool
  • Mock test (fast, free): mock LLMClient to return a pre-built tool_calls response and verify the agent dispatches correctly

For this workshop, focus on unit tests for each tool. Write at least 3 test methods per tool you created.
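
The mock approach can still be exercised without any network call or JSON library: stand in for the model's decision with a fixed tool name, and assert that dispatch succeeds or fails as expected. A stdlib-only sketch of the idea (this is an illustration, not the workshop's ToolRegistry):

```java
import java.util.Map;
import java.util.function.UnaryOperator;

public class DispatchDemo {
    // Minimal stand-in for ToolRegistry.run: look up a tool by the
    // (mocked) name the model chose and execute it, or return an error
    // string the agent loop would feed back to the model.
    static String dispatch(Map<String, UnaryOperator<String>> tools,
                           String name, String arg) {
        UnaryOperator<String> tool = tools.get(name);
        if (tool == null) return "Error: unknown tool " + name;
        return tool.apply(arg);
    }

    public static void main(String[] args) {
        Map<String, UnaryOperator<String>> tools =
            Map.of("echo", s -> "Echo: " + s);
        System.out.println(dispatch(tools, "echo", "hi"));
        System.out.println(dispatch(tools, "translate", "hi"));
    }
}
```

The same pattern scales up: a mocked LLMClient returns a canned tool_calls response on the first call and a plain content message on the second, and the test asserts the right tool ran.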

6.3 Cross review

Pair up with a classmate and perform a cross review:

  1. Clone your partner's repository
  2. Set OPENAI_API_KEY and run mvn compile exec:java
  3. Run mvn test to execute their test suite
  4. Try creative queries: edge cases, multi-tool questions, error scenarios
  5. Open 2–3 GitHub issues with your findings (bugs, suggestions, improvements)

Checkpoint: All tests pass (mvn test is green). You have opened 2–3 issues on your partner's repository.

Step 7 ~20 min

Demos & Wrap-up

7.1 Live demos

Each student gets 5 minutes to demonstrate their agent. Structure:

  • 30 seconds — Show your vision.md and chosen direction
  • 2 minutes — Live query demonstrating tool use
  • 1 minute — Your skill in action
  • 1 minute — One thing you learned building this
  • 30 seconds — Q&A

7.2 What your agent can do that a simple chatbot cannot

These examples illustrate the fundamental difference between a chatbot (text in → text out) and an agent (text in → actions → observations → text out):

Example 1: Read and review real code

> "Read my Agent.java and review it for issues"

  [tool] read_file({"path":"src/.../Agent.java"})
  [result] package fr.epita.agent.services; ...

Agent:
## Code Review: Agent.java
### Major
- No input validation on userInput
- Unbounded message list could exceed context window
### Minor
- System.out.println should use a logging framework

A chatbot would hallucinate the file contents. The agent actually reads the file.

Example 2: Precise computation

> "What is the total cost of 29 students x $8.25 per student?"

  [tool] calculate({"expression":"29 * 8.25"})
  [result] 239.25

Agent: The total cost for 29 students at $8.25 each
is **$239.25**.

A chatbot might get the math wrong. The agent delegates the arithmetic to a calculator tool, so the result is computed rather than predicted.

Example 3: Chain multiple tools in one query

> "Read pom.xml and calculate how many dependencies
   there are, then multiply by 100"

  [tool] read_file({"path":"pom.xml"})
  [result] ...json:20240303... junit-jupiter:5.10.2...
  [tool] calculate({"expression":"2 * 100"})
  [result] 200

Agent: The pom.xml has 2 dependencies.
Multiplied by 100, that gives **200**.

The agent chains tools autonomously: it read the file, counted the dependencies, then used the calculator — all from a single natural language request.

7.3 Workshop wrap-up

What you built today:

  • A vanilla Java LLM client using only java.net.http and org.json
  • A complete agent loop with tool calling and result injection
  • A tool system with clean interfaces, a registry, and JSON Schema declarations
  • A skill abstraction that packages agent capabilities into reusable units

7.4 The complete Main.java

Here is the final launcher that tests everything together:

package fr.epita.agent.launchers;

import fr.epita.agent.services.*;
import fr.epita.agent.tools.*;
import fr.epita.agent.skills.*;

public class Main {
    public static void main(String[] args)
            throws Exception {
        String apiKey = System.getenv("OPENAI_API_KEY");
        if (apiKey == null || apiKey.isBlank()) {
            System.err.println("Error: set "
                + "OPENAI_API_KEY environment variable.");
            System.exit(1);
        }

        var llm = new LLMClient(apiKey);

        // --- Test 1: Agent with tools ---
        System.out.println(
            "=== Test 1: Agent with Calculator"
            + " + ReadFile ===\n");

        var registry = new ToolRegistry();
        registry.register(new CalculatorTool());
        registry.register(new ReadFileTool());

        var agent = new Agent(
            "You are a helpful assistant. Use tools "
            + "when needed to answer accurately.",
            registry, llm);

        String result1 =
            agent.run("What is 147 * 38 + 529?");
        System.out.println("\nAgent: " + result1);

        System.out.println("\n--- ---\n");

        String result2 = agent.run(
            "Read the pom.xml file and tell me "
            + "what dependencies this project uses.");
        System.out.println("\nAgent: " + result2);

        // --- Test 2: Code Review Skill ---
        System.out.println(
            "\n=== Test 2: Code Review Skill ===\n");

        var reviewSkill = new CodeReviewSkill(llm);
        String review = reviewSkill.review(
            "src/main/java/fr/epita/agent"
            + "/services/Agent.java");
        System.out.println("Review:\n" + review);

        System.out.println(
            "\n=== All tests passed! ===");
    }
}

7.5 Going further

Production frameworks that build on these same concepts:

  • LangChain4j (Java): full agent framework with memory, RAG, chains
  • Spring AI (Java): Spring Boot integration, function calling, vector stores
  • OpenAI Agents SDK (Python): official OpenAI agent framework with handoffs
  • Claude Agent SDK (TypeScript): Anthropic's agent framework with tool use

The core ideas — tool interfaces, registries, agent loops, skills — are the same across all of them. You now understand the fundamentals that every agent framework is built on.

Final checkpoint: You have demonstrated your agent live, shown multi-tool queries, and understand how the agent pattern extends to production frameworks. Commit all final code and push to your repository.
