Building an AI Agent from Scratch
Most AI agent tutorials start with LangChain. I started with a text file and a loop. Here's what I learned building one from the ground up — tool calling, memory, and all the things that break in production.
Everyone says you need a framework to build AI agents. LangChain, AutoGen, CrewAI — pick your poison. I spent a weekend trying all of them before realizing: I had no idea what was actually happening under the hood.
So I deleted everything and started with the bare minimum: a Python script, the Anthropic API, and a JSON file for memory.
What even is an agent?
Strip away the buzzwords and an AI agent is just this:
- Send a prompt to an LLM
- Parse the response
- Execute a tool if the model asked for one
- Feed the result back
- Repeat until done
That's it. The loop is the agent. Everything else is tooling around that loop.
The minimal implementation
Here's the core loop in about 40 lines:
import anthropic
import json

client = anthropic.Anthropic()

def run_agent(task: str, tools: list, max_steps: int = 10):
    messages = [{"role": "user", "content": task}]
    for step in range(max_steps):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        # Append the assistant's turn to the conversation
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            # Model is done; the final block is its answer
            return response.content[-1].text
        if response.stop_reason == "tool_use":
            # Execute every tool call in this turn
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result),
                    })
            # Results go back as a user message
            messages.append({"role": "user", "content": tool_results})
    return "Max steps reached"
That's your agent. No framework. No magic. Just a loop and the API.
Tool calling is the hard part
The loop is easy. The tools are where things get interesting — and where things break.
A tool in the Anthropic API is a JSON schema that tells the model what functions it can call and what parameters they take. When the model wants to use one, it returns a tool_use block instead of finishing. You execute the tool and feed the result back.
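To make that concrete, here's what a tool definition looks like. The `get_weather` tool and its parameter are invented for the example; the structure (`name`, `description`, `input_schema` as JSON Schema) is the shape the Anthropic Messages API expects:

```python
# A hypothetical example tool definition. You'd pass a list of these
# as the `tools=` argument to client.messages.create().
get_weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name, e.g. 'Berlin'",
            },
        },
        "required": ["city"],
    },
}
```

The description matters more than you'd think: it's the only documentation the model gets, so write it like you're explaining the function to a new teammate.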
Things that will bite you:
- Tool errors need to be returned as results, not raised as exceptions. If your tool throws and you don't catch it, the loop dies. Wrap everything in try/except and return the error as a string.
- Models will sometimes call tools that don't exist, especially if your system prompt is vague. Add validation.
- The model will loop forever if the task is ambiguous. Set a max_steps and be aggressive about it in dev.
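Putting the first two fixes together, here's one way `execute_tool` could look. The registry dict and the stand-in tool are mine, not anything from the API; the point is that every failure path returns a string instead of raising:

```python
# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {
    "get_weather": lambda city: f"Sunny in {city}",  # stand-in implementation
}

def execute_tool(name: str, tool_input: dict) -> str:
    # Guard against hallucinated tool names instead of crashing the loop.
    if name not in TOOL_REGISTRY:
        return f"Error: unknown tool '{name}'"
    try:
        return str(TOOL_REGISTRY[name](**tool_input))
    except Exception as e:
        # Return the error as a result so the model can see it and recover.
        return f"Error: {type(e).__name__}: {e}"
```

Returning errors as strings feels wrong at first, but it's what lets the model self-correct: it reads the error, adjusts its arguments, and tries again on the next step.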
Memory: easier than you think
For a personal agent that runs once a day, "memory" is just a JSON file that gets appended to the system prompt:
def load_memory() -> str:
    try:
        with open("memory.json") as f:
            facts = json.load(f)
        return "## What I remember:\n" + "\n".join(f"- {fact}" for fact in facts)
    except FileNotFoundError:
        return ""

system = f"""You are a helpful assistant.
{load_memory()}
"""
For anything more complex, you need a vector store. But start here. Ship the simple thing first.
What I learned
The framework is usually hiding something you need to understand anyway.
Building from scratch forced me to actually understand what stop_reason means, why tool_use_id needs to match exactly, and why context window management matters in multi-step tasks.
I eventually went back to using frameworks — but now I know what they're doing, and I know when to reach past them.