Building an AI Agent from Scratch
Most AI agent tutorials start with LangChain. I started with a text file and a loop. Here's what I learned building one from the ground up — tool calling, memory, and all the things that break in production.
Everyone says you need a framework to build AI agents. LangChain, AutoGen, CrewAI — pick your poison. I spent a weekend trying all of them before realizing: I had no idea what was actually happening under the hood.
So I deleted everything and started with the bare minimum: a Python script, the Anthropic API, and a JSON file for memory.
What even is an agent?
Strip away the buzzwords and an AI agent is just this:
- Send a prompt to an LLM
- Parse the response
- Execute a tool if the model asked for one
- Feed the result back
- Repeat until done
That's it. The loop is the agent. Everything else is tooling around that loop.
The minimal implementation
Here's the core loop in about 40 lines:
import anthropic
import json

client = anthropic.Anthropic()

def run_agent(task: str, tools: list, max_steps: int = 10):
    messages = [{"role": "user", "content": task}]
    for step in range(max_steps):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        # Append the assistant's turn to the conversation
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            # Model is done; the final block is its answer
            return response.content[-1].text
        if response.stop_reason == "tool_use":
            # Execute every tool call in this turn
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result),
                    })
            # Results go back as a user message
            messages.append({"role": "user", "content": tool_results})
    return "Max steps reached"
That's your agent. No framework. No magic. Just a loop and the API.
Tool calling is the hard part
The loop is easy. The tools are where things get interesting — and where things break.
A tool in the Anthropic API is a JSON schema that tells the model what functions it can call and what parameters they take. When the model wants to use one, it returns a tool_use block instead of finishing. You execute the tool and feed the result back.
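To make that concrete, here's what a tool definition looks like. The `get_weather` tool and its parameter are invented for the example; the structure (`name`, `description`, `input_schema` as JSON Schema) is the shape the Anthropic Messages API expects:

```python
# A hypothetical example tool definition. You'd pass a list of these
# as the `tools=` argument to client.messages.create().
get_weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name, e.g. 'Berlin'",
            },
        },
        "required": ["city"],
    },
}
```

The description matters more than you'd think: it's the only documentation the model gets, so write it like you're explaining the function to a new teammate.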
Things that will bite you:
- Tool errors need to be returned as results, not raised as exceptions. If your tool throws and you don't catch it, the loop dies. Wrap everything in try/except and return the error as a string.
- Models will sometimes call tools that don't exist, especially if your system prompt is vague. Add validation.
- The model will loop forever if the task is ambiguous. Set a max_steps and be aggressive about it in dev.
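Putting the first two fixes together, here's one way `execute_tool` could look. The registry dict and the stand-in tool are mine, not anything from the API; the point is that every failure path returns a string instead of raising:

```python
# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {
    "get_weather": lambda city: f"Sunny in {city}",  # stand-in implementation
}

def execute_tool(name: str, tool_input: dict) -> str:
    # Guard against hallucinated tool names instead of crashing the loop.
    if name not in TOOL_REGISTRY:
        return f"Error: unknown tool '{name}'"
    try:
        return str(TOOL_REGISTRY[name](**tool_input))
    except Exception as e:
        # Return the error as a result so the model can see it and recover.
        return f"Error: {type(e).__name__}: {e}"
```

Returning errors as strings feels wrong at first, but it's what lets the model self-correct: it reads the error, adjusts its arguments, and tries again on the next step.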
Memory: easier than you think
For a personal agent that runs once a day, "memory" is just a JSON file that gets appended to the system prompt:
def load_memory() -> str:
    try:
        with open("memory.json") as f:
            facts = json.load(f)
        return "## What I remember:\n" + "\n".join(f"- {fact}" for fact in facts)
    except FileNotFoundError:
        return ""

system = f"""You are a helpful assistant.
{load_memory()}
"""
For anything more complex, you need a vector store. But start here. Ship the simple thing first.
What I learned
The framework is usually hiding something you need to understand anyway.
Building from scratch forced me to actually understand what stop_reason means, why tool_use_id needs to match exactly, and why context window management matters in multi-step tasks.
I eventually went back to using frameworks — but now I know what they're doing, and I know when to reach past them.