Tool Registry

Every agent in Ishvana has a set of tools: functions the model can call to do something outside the chat. Hawken has tools for lore search, document fetch, and style analysis. Lagan has tools for web research, URL analysis, and Wikipedia lookup. Lorekeeper has tools for entity extraction and canon lookup. There are dozens of tools across all the agents, and every one of them has to be declared to the model in each request so the model knows what’s available to call.

Managing those declarations used to be a mess. Each agent exported its tools separately, and the tool lists drifted subtly from request to request, which broke prompt caching. The Tool Registry is Etherforce’s solution: one central place where every tool from every agent is registered, sorted, tracked, and served up to the LLM provider in a stable order. It’s one of the least visible parts of Etherforce, and it has a disproportionate impact on both caching efficiency and agent observability.

The Tool Registry is a singleton service that every agent registers its tools with on startup. When the engine starts, the boot coordinator calls each agent’s register_tools() method, which hands the tool definitions to the registry. Every tool has:

  • A name (unique across all agents).
  • A description (what the tool does).
  • A parameter schema (what the tool takes as arguments, in JSON schema format).
  • The agent that owns it (so the registry knows which agent gets credit for the tool).
  • An implementation function (the actual code the tool executes when called).

Once registered, the tool is available for any agent to call — though in practice each agent only sees its own tools unless explicitly told otherwise.
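The registration flow described above can be sketched in a few lines. This is a minimal illustration, not Etherforce’s actual code: the class names ToolDefinition and ToolRegistry, the field names, and the tools_for() helper are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class ToolDefinition:
    """Hypothetical record holding the five fields listed above."""
    name: str                           # unique across all agents
    description: str                    # what the tool does
    parameters: Dict[str, Any]          # JSON Schema for the arguments
    owner: str                          # agent that registered the tool
    implementation: Callable[..., Any]  # code that runs when the tool is called


class ToolRegistry:
    """Hypothetical central registry (assumed shape, not the real class)."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolDefinition] = {}

    def register(self, tool: ToolDefinition) -> None:
        # Names are global, so a clash between two agents is an error.
        if tool.name in self._tools:
            existing = self._tools[tool.name]
            raise ValueError(
                f"tool {tool.name!r} is already registered by {existing.owner}"
            )
        self._tools[tool.name] = tool

    def tools_for(self, agent: str) -> List[ToolDefinition]:
        # By default an agent only sees the tools it owns.
        return [t for t in self._tools.values() if t.owner == agent]
```

In this sketch an agent’s register_tools() method would simply call registry.register() once per tool, and a duplicate name fails loudly at startup rather than at call time.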

The most important thing the registry does isn’t holding the tool definitions. It’s making sure the tool list is sorted in a stable, deterministic order across every request.

Here’s why that matters.

Anthropic’s API supports prompt caching — if you send the same system prompt and tool list across multiple requests, the cached prefix gets reused at a reduced cost (up to 90% cheaper for the cached portion). Prompt caching is one of the biggest cost-saving features available, especially for long writing sessions where the system prompt and tools don’t change between requests.

But prompt caching only matches an exact prefix. If two consecutive requests have the same tools but listed in a slightly different order, the cache misses. That can happen because the list was built from an unordered collection such as a Python set (dicts have preserved insertion order since Python 3.7, but sets do not), or because a new agent registered its tool mid-session and shifted the order. Either way, you pay the full cost of the request instead of the cached cost.

Before the Tool Registry, Ishvana had this problem constantly. Every request had its tool list built dynamically from whichever agent was active, in whatever order the agent’s executor happened to iterate its tool functions. The order was undefined, so the cache would hit on some requests and miss on others, and the miss pattern wasn’t predictable.

The Tool Registry fixes this by sorting every tool list alphabetically by tool name before sending it to the provider. Same tools every request means same order every request means cache hits every request. Combined with system prompt caching, the effect is that long writing sessions cost significantly less — because most of the static prefix (system prompt + tool list) is cached and reused.
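A minimal sketch of the idea, assuming tool declarations are plain dicts with a "name" key. The helper names stable_tool_list and serialize, and the "web_research" tool name, are made up for the example.

```python
import json
from typing import Any, Dict, Iterable, List


def stable_tool_list(tools: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return tool declarations sorted by name, independent of input order."""
    return sorted(tools, key=lambda tool: tool["name"])


def serialize(tools: List[Dict[str, Any]]) -> str:
    # sort_keys also pins the key order inside each declaration.
    return json.dumps(tools, sort_keys=True)


# The same two tools, collected in a different order on two requests.
first_request = [{"name": "web_research"}, {"name": "lore_search"}]
second_request = [{"name": "lore_search"}, {"name": "web_research"}]

# After sorting, both requests serialize to identical text,
# which is what an exact-prefix cache needs.
assert serialize(stable_tool_list(first_request)) == serialize(
    stable_tool_list(second_request)
)
```

Sorting by name is only one possible stable key. Any deterministic order would work, but alphabetical order is easy to verify by eye.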

In addition to stable ordering, Etherforce’s Anthropic provider marks the last tool in the list with a cache_control marker of type ephemeral. This tells Anthropic to cache everything up to and including that tool block.

The marker has to be on the last tool specifically because Anthropic’s cache control works from the beginning of the request up to (and including) the marked block. Marking a middle-of-list tool would only cache up to that tool, leaving the rest of the tool list uncached. Marking the last tool caches the entire tool list. Anthropic builds the cached prefix in the order tools, then system prompt, then messages, so the system prompt comes after the tools and is covered by its own cache marker rather than by the one on the last tool.

This is the kind of detail you shouldn’t have to think about, and the registry handles it automatically. Every Anthropic request has the last tool marked. You don’t configure it.
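For illustration, here is roughly what that marking step could look like. In Anthropic’s Messages API the marker is a cache_control field whose value is {"type": "ephemeral"}, and tool declarations carry name, description, and input_schema. The function name mark_last_tool is an assumption, and Etherforce’s real provider code may differ.

```python
import copy
from typing import Any, Dict, List


def mark_last_tool(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a name-sorted copy of `tools` with a cache marker on the last one."""
    # Deep-copy so the registry's own declarations are never mutated.
    ordered = sorted(copy.deepcopy(tools), key=lambda tool: tool["name"])
    if ordered:
        # Everything up to and including this block becomes cacheable.
        ordered[-1]["cache_control"] = {"type": "ephemeral"}
    return ordered
```

Copying before marking matters in this sketch: if the marker were written onto the shared declarations, a later request with a different last tool would leave a stale marker behind and change the prefix.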

The second job of the registry is observability. Because every tool is registered in one place, Etherforce can track invocation counts per tool, per agent, across sessions. The Tool Registry tab in the Analysis workspace shows exactly this — a full table of every tool every agent has registered, with columns for name, description, parameter count, and total invocations.

The invocation counts are useful for:

  • Dead tool detection. If a tool shows zero invocations after a month of heavy use, something is wrong. Either the tool is broken (agents can’t call it), the tool is never actually needed for your workflow, or the tool’s description is unclear enough that the models don’t know when to use it. Any of those is worth investigating.
  • Hot tool identification. Tools with disproportionately high invocation counts are worth optimizing. If Hawken’s “lore_search” tool is called 200 times a day, its implementation’s latency matters more than a tool called 2 times a day.
  • Per-agent distribution. Which agents are calling which tools. A tool registered for Hawken but being called by Lagan means something unexpected is happening — either the tool is being called by the wrong agent, or the tool’s scope is broader than originally intended.

The observability data is cheap to collect because it’s literally a counter that increments every time the tool’s implementation function is called. No extra instrumentation required.
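A counter of this kind can be sketched as a thin wrapper around the implementation function. The names invocations, counted, and dead_tools are hypothetical, and the real registry presumably also records which agent made the call.

```python
import functools
from collections import Counter
from typing import Any, Callable, Iterable, List

# Tool name -> number of calls (assumed in-memory store for the sketch).
invocations: Counter = Counter()


def counted(name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool implementation so every call bumps its counter."""
    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        invocations[name] += 1
        return fn(*args, **kwargs)
    return wrapper


def dead_tools(registered: Iterable[str]) -> List[str]:
    """Registered tools that have never been called."""
    # Counter returns 0 for names it has never seen.
    return [name for name in registered if invocations[name] == 0]
```

Wrapping at registration time means tool authors write plain functions and the counting stays out of their code.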

One of Ishvana’s more advanced features is parallel delegation — when Ishvana (the orchestrator) fires multiple specialist agents simultaneously on a multi-domain request. Parallel delegation uses a tool called delegate_parallel that takes an array of {agent, task, context} objects and fires each one via asyncio.gather().

The delegate_parallel tool lives in the registry like any other tool, and it’s registered to Ishvana specifically. When Ishvana decides to use it, the tool implementation fans out to the specialist agents concurrently and returns their combined results.

The whole parallel-delegation capability depends on the registry, because the registry is where the specialist agents’ interfaces are declared. Ishvana couldn’t call into Hawken’s generation tool without the registry telling her what the interface looks like.
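The fan-out can be sketched as follows. The {agent, task, context} shape and asyncio.gather() come from the description above. The agents mapping and the assumption that each agent is an async callable taking (task, context) are hypothetical stand-ins for the registry lookup.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Assumed agent interface: an async callable taking (task, context).
AgentFn = Callable[[str, str], Awaitable[str]]


async def delegate_parallel(
    requests: List[Dict[str, str]], agents: Dict[str, AgentFn]
) -> List[Dict[str, Any]]:
    """Run every {agent, task, context} request concurrently."""

    async def run_one(request: Dict[str, str]) -> Dict[str, Any]:
        agent = agents[request["agent"]]
        result = await agent(request["task"], request.get("context", ""))
        return {"agent": request["agent"], "result": result}

    # gather() returns results in the same order as the requests.
    return await asyncio.gather(*(run_one(r) for r in requests))
```

Because gather() preserves input order, the combined result lines up with the request array even when the agents finish in a different order.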

A smaller but meaningful detail: when the LLM generates a tool call during streaming, Etherforce dispatches the tool via asyncio.create_task() the instant the tool’s arguments are complete — not after the full response finishes streaming.

This matters because LLMs often generate multiple tool calls in a single response. The naive approach is to wait until the whole response is generated, then execute the tools sequentially. The streaming approach kicks off each tool as its arguments arrive, so tool execution overlaps with the remaining stream.

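For a response that generates three tool calls, the naive approach takes stream_time + tool1_time + tool2_time + tool3_time. The streaming approach takes at least max(stream_time, tool1_time + tool2_time + tool3_time) if the tools run one after another, since a tool cannot start before its arguments arrive; when the dispatched tasks also overlap with each other, the tool term shrinks toward the longest single tool. In practice, the streaming approach cuts end-to-end latency meaningfully on tool-heavy requests.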
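Here is a small sketch of dispatch-while-streaming. The event shape (a dict with type, name, and arguments) is invented for the example, and real provider streams look different, but the asyncio.create_task() pattern is the one described above.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, Dict, List

# Assumed tool interface: an async callable taking keyword arguments.
ToolFn = Callable[..., Awaitable[Any]]


async def run_with_streaming_dispatch(
    stream: AsyncIterator[Dict[str, Any]], tools: Dict[str, ToolFn]
) -> List[Any]:
    """Start each tool as soon as its arguments are complete."""
    tasks: List[asyncio.Task] = []
    async for event in stream:
        # Hypothetical event type meaning "this tool call is fully received".
        if event["type"] == "tool_call_complete":
            tool = tools[event["name"]]
            # create_task schedules the tool now; the loop keeps reading
            # the stream while the tool runs.
            tasks.append(asyncio.create_task(tool(**event["arguments"])))
    # Collect results, in dispatch order, once the stream has ended.
    return await asyncio.gather(*tasks)
```

The overlap only helps when tool implementations actually yield to the event loop (network or disk I/O). A CPU-bound tool would block the stream reader unless it is moved to a thread or process.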

The Anthropic provider gets the most benefit from this because Anthropic emits per-block stop events that let Etherforce detect tool completion mid-stream. OpenRouter-format providers detect tool completion when a new tool index starts in the stream (imperfect but usable). Ollama’s streaming behavior is more basic.

A few things worth noting:

  • Tools aren’t hot-swappable. Once Etherforce is running with a registered tool list, you can’t add new tools mid-session. If you want to add a new tool, you restart the engine.
  • Tool names must be unique across all agents. Two agents can’t both register a tool called search. If you’re writing a custom skill that adds a tool, make sure the name is distinctive.
  • Parameter schemas are validated. Etherforce validates each implementation against its declared parameter schema at registration time. Without that check, a mismatched implementation would only fail later, when the LLM calls it with arguments that match the schema but not the function.
  • The registry is a singleton per app, not per project. There’s one registry instance for the whole Ishvana app. Different projects share the same registry, which means different projects see the same tools.
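As an illustration of the schema check mentioned in the list above, here is one plausible form of registration-time validation: comparing a function’s signature with the JSON Schema’s properties and required keys. How Etherforce actually performs the check is not documented here, so the function name check_signature and its rules are assumptions.

```python
import inspect
from typing import Any, Callable, Dict


def check_signature(fn: Callable[..., Any], schema: Dict[str, Any]) -> None:
    """Raise TypeError if `fn` cannot accept the arguments `schema` declares."""
    declared = set(schema.get("properties", {}))
    required = set(schema.get("required", []))
    params = inspect.signature(fn).parameters

    accepts_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    named = {
        name for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }

    # Every declared property must be accepted by the function.
    missing = declared - named
    if missing and not accepts_kwargs:
        raise TypeError(f"{fn.__name__} does not accept {sorted(missing)}")

    # Parameters without defaults must be required by the schema,
    # otherwise the model may legally omit them.
    mandatory = {n for n in named if params[n].default is inspect.Parameter.empty}
    unguarded = mandatory - required
    if unguarded:
        raise TypeError(f"{fn.__name__} needs {sorted(unguarded)} marked required")
```

A check like this catches renamed parameters and forgotten required entries at startup. It does not verify argument types, which would need a JSON Schema validator run against real call arguments.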