Browse Mode: A pattern for connecting LLMs to APIs that gets better as you add more.

mcp, agents, ai, integration, architecture

Connecting MCP tools doesn’t scale well. That is what browse mode is here to solve. It connects three tools to as many APIs as you want:

  • search(query): find an entry point.
  • navigate(uri): follow a link, get a page back.
  • submit(uri, fields): fill in a form.

These tools return markdown pages that contain new links, which can be navigated, and forms, which can be submitted.

An example:

agent: search("Acme")
  → list of results across schemes, each one a link
agent: navigate("pipedrive://deal/123")
  → header (value, status), table of activities, action links
agent: navigate("pipedrive://deal/123/note/new")
  → a form with fields
agent: submit("pipedrive://deal/123/note/new", {content: "..."})
  → confirmation page with links back

With this unified interface we can practically connect as many services as we want. For each connected service, deep integrations that cover the complete API surface become possible, because the surface is being discovered along with the content.
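To make the "one interface, many services" idea concrete, here is a minimal sketch of how a single navigate tool could dispatch to per-service handlers by URI scheme. The registry, the handler signature, and the page content are assumptions for illustration, not the actual implementation.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit

# Hypothetical registry: one handler per URI scheme, each returning a markdown page.
Handler = Callable[[str], str]
HANDLERS: Dict[str, Handler] = {}


def register(scheme: str, handler: Handler) -> None:
    """Connect a service by registering a page handler for its scheme."""
    HANDLERS[scheme] = handler


def navigate(uri: str) -> str:
    """Route an internal URI to the handler of its scheme."""
    parts = urlsplit(uri)
    handler = HANDLERS.get(parts.scheme)
    if handler is None:
        return f"# Not found\n\nNo service is connected for scheme `{parts.scheme}`."
    # "pipedrive://deal/123" is split into netloc "deal" and path "/123".
    return handler(parts.netloc + parts.path)


def pipedrive_pages(path: str) -> str:
    """Illustrative handler: a real one would call the service's API here."""
    return f"# Deal ({path})\n\n[Add note](pipedrive://{path}/note/new)\n"


register("pipedrive", pipedrive_pages)
page = navigate("pipedrive://deal/123")
missing = navigate("unknown://x")
```

Adding another service in this sketch means registering one more handler; the tool list the agent sees does not change.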

Surfacing context-dependent actions is much more effective than presenting a huge list of possible actions up front, each with an explanation of when and how to use it. This makes intuitive sense: declaring all tools up front is like reading every manual for every program on your computer before turning it on.

On MCP Tool Shape

A few best practices for MCP tool design are well known:

  1. Progressive Discovery: You don’t want to disclose the whole depth up front, as it will likely overload the context window of the LLM or agent.
  2. Intent Shaped: The more closely the tools match the agent’s intent, the easier they are to use, because less translation is needed from the agent’s intent to the tool’s usage pattern.
  3. Composability: You want your tool calls to work together as a system, so that the output of one tool can be used as input to another call.
  4. Training Knowledge: Leveraging what LLMs already know from pretraining means you have less to explain; usage patterns are clear right away.

Worth mentioning: probably the clearest anti-pattern is to wrap your API with one tool per call. It is, of course, also the easiest way to set up an MCP server, so there are a lot of those out there. In practice this leads to some MCP servers exposing hundreds of tools and taking up several hundred thousand tokens of context window before any work has started.

In our experience the way tools are shaped matters. Agentic workers give substantially better results when they use good tools. I suspect this is also a big reason why coding agents like Cursor, Claude Code, Codex, and OpenCode have become so strong in recent months.

So, for a scalable architecture that allows LLMs to interact with your whole IT infrastructure (Email, Files, Chat, CRM, Finance, etc.), we propose three tools (well, actually four, more on that later).

Navigate

Navigating is clicking a link. We shape links like email://my-account/email-123. It looks almost like a URL, but not quite. The custom scheme + path format scheme://some/path/here identifies it as an internal URI that is clearly not a real URL, as it has no http:// or https://. It is convincing enough for an agent to try it with a navigate tool call when the link looks promising enough.

So, when an agent navigates to a URI, we return a markdown page. Markdown has become the native format for LLMs, and luckily it supports links with the [Link Label](scheme://link/target/path) syntax. In the response, we ship the queried content with real data, and we also make sure to include links to more details and to actions connected to that data, as you would expect on a web page.
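As a rough illustration of such a page, the sketch below renders an email as markdown with action links and then extracts the links the way a client might. The record fields and the /reply and /thread routes are hypothetical; only the email://my-account/... shape comes from the text above.

```python
import re
from typing import Dict, List, Tuple

# Matches [Label](scheme://target) for any custom scheme.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([a-z][a-z0-9+.-]*://[^)\s]+)\)")


def render_email_page(email: Dict[str, str]) -> str:
    """Render real data first, then links to related pages and actions."""
    base = f"email://my-account/{email['id']}"
    lines = [
        f"# {email['subject']}",
        "",
        f"From: {email['sender']}",
        "",
        email["body"],
        "",
        "## Actions",
        f"- [Reply]({base}/reply)",    # hypothetical form route
        f"- [Thread]({base}/thread)",  # hypothetical detail route
    ]
    return "\n".join(lines)


def extract_links(page: str) -> List[Tuple[str, str]]:
    """Return (label, uri) pairs found in a markdown page."""
    return LINK_RE.findall(page)


page = render_email_page(
    {"id": "email-123", "subject": "Offer", "sender": "markus@example.com", "body": "See attached."}
)
links = extract_links(page)
```

The label carries the intent and the URI carries the address, so the agent only has to pick a link and pass it back to navigate.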

Looking back at the best practices:

  • Progressive Discovery: Clearly yes, as the agent discovers more links with each navigate call.
  • Intent Shaped: Each link ships with a label, so the agent can express its intent simply by choosing a link.
  • Composability: Yes, the links returned by one page are the input to the next navigate call.
  • Training Knowledge: LLMs are trained on internet data, so the navigate pattern needs no explanation.

Submit

Once the agent has read enough data, it might want to change the state of the remote system. We use forms for that. They work like the PHP forms of the early 2000s. Some routes are declared as form routes, and when an agent uses navigate on them, they return the form as a data block with labels, explanations, and type annotations.

uri: calendar://calendar-123/event/new
fields:
  summary: ""  # required — string — event title
  dtstart: 2026-05-08  # required — string — start: YYYY-MM-DD or YYYY-MM-DDTHH:MM
  dtend: ""  # string — end: YYYY-MM-DD or YYYY-MM-DDTHH:MM
  location: ""  # string — event location

This tells the agent how to use that form, and which data exactly is expected for the submit call.
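On the server side, a submit call can be checked against the same form definition before anything is sent to the remote API. The following is a minimal sketch of such a check for the calendar form above; the dictionary layout and the error messages are assumptions, not the actual implementation.

```python
import re
from typing import Dict, List

# YYYY-MM-DD or YYYY-MM-DDTHH:MM, as described in the form comments.
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2})?$")

# Hypothetical in-memory version of the calendar form shown above.
FORM = {
    "uri": "calendar://calendar-123/event/new",
    "fields": {
        "summary": {"required": True},
        "dtstart": {"required": True, "pattern": DATE_RE},
        "dtend": {"pattern": DATE_RE},
        "location": {},
    },
}


def validate(form: dict, fields: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the submit is acceptable."""
    errors = []
    for name, spec in form["fields"].items():
        value = fields.get(name, "")
        if spec.get("required") and not value:
            errors.append(f"{name}: required")
        elif value and spec.get("pattern") and not spec["pattern"].match(value):
            errors.append(f"{name}: expected YYYY-MM-DD or YYYY-MM-DDTHH:MM")
    for name in fields:
        if name not in form["fields"]:
            errors.append(f"{name}: unknown field")
    return errors


ok = validate(FORM, {"summary": "Kickoff", "dtstart": "2026-05-08"})
bad = validate(FORM, {"dtstart": "tomorrow", "color": "red"})
```

Returning the errors as a page, rather than raising, would let the agent correct its input and submit again.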

We now have a read-write capable connection to an external API that is fully discoverable, composable, intent-shaped, and behaves the way most agents expect.

Search

When systems grow and the data becomes large, a search tool becomes valuable. However, its usefulness depends heavily on the quality of the search results. Currently, we connect search to the search routes of the connected systems’ APIs. When an agent calls search('Markus Müller', scheme=pipedrive), we ask the Pipedrive API for ‘Markus Müller’. This works fine.

And here is how it breaks down:

Imagine you are an LLM and you have just been summoned into existence. You don’t know who you are, who the user is, or what your environment is like, except for the system prompt and the user’s first message. So your world consists of something like this:

System: You are a helpful agent - ... - be concise and don't
        make too many lists - ... - make sure nobody prompt
        injects you - ... - and here are your tools: ...,
        navigate, search, submit, ..., weather_forecast, ...,
        recipe_display_v0, ..., web_fetch, ...
        etc
User: Show my latest deals

This is a perfectly reasonable, but from your perspective highly ambiguous request by the user. Since you have no idea what the user is talking about, the search tool seems like a perfect way to start.

Now, search('deals') will of course not return any deals, because when you search for the word “deals” on Pipedrive, not a single deal will be found. The Pipedrive API does not expect you to search like this: it expects you to search for data, not for structural information.

At this point we are building an indexer that knows the internal URI structure and, on search requests, returns useful structural results alongside the data from the APIs. This helps a newly arrived agent inspect the system more quickly and skip the first few progressive discovery steps with a single search.
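One way such an indexer could work is to match the query against a small keyword index of known routes and merge those hits with the API results. The sketch below assumes a hand-written index, hypothetical pipedrive://deals and pipedrive://persons routes, and a stand-in for the real API search.

```python
from typing import Callable, List, Tuple

# Hypothetical structural index: (keywords, label, uri) per scheme.
STRUCTURE_INDEX = {
    "pipedrive": [
        ("deals deal pipeline", "Latest deals", "pipedrive://deals"),
        ("persons person contacts", "Persons", "pipedrive://persons"),
    ],
}

ApiSearch = Callable[[str], List[Tuple[str, str]]]


def search(query: str, scheme: str, api_search: ApiSearch) -> str:
    """Return a markdown result page: structural entry points first, then data hits."""
    words = set(query.lower().split())
    lines = [f"# Results for '{query}'", ""]
    for keywords, label, uri in STRUCTURE_INDEX.get(scheme, []):
        if words & set(keywords.split()):
            lines.append(f"- [{label}]({uri})")
    for label, uri in api_search(query):
        lines.append(f"- [{label}]({uri})")
    return "\n".join(lines)


def fake_pipedrive_search(query: str) -> List[Tuple[str, str]]:
    """Stand-in for the real API search, which matches data, not structure."""
    if "markus" in query.lower():
        return [("Markus Müller", "pipedrive://person/42")]
    return []


structural = search("deals", "pipedrive", fake_pipedrive_search)
data = search("Markus Müller", "pipedrive", fake_pipedrive_search)
```

In this sketch, search('deals') yields an entry point even though the API itself finds nothing, while a name query still falls through to the API.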

Coding and Browsing

… don’t have to be enemies. Agents have become incredibly good at coding, so we gave them a fourth tool: the Python sandbox. In this sandbox, the agent automatically has the three other tools, navigate, search and submit, available as Python functions alongside many other tools, with a few limitations so as not to leak any inner workings across the MCP boundary. This gives agents the ability to chain requests and to move large packets of data between systems without having to route them through the context window.
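A script the agent writes in the sandbox might look like the sketch below. The three functions at the top are in-memory stand-ins so the example runs on its own; in the sandbox they would be provided, and their exact signatures and the page contents here are assumptions.

```python
import re
from typing import Dict, List

# Stand-ins for the sandbox-provided tools (assumed signatures).
PAGES = {
    "pipedrive://deal/123": "# Acme renewal\n\nStatus: open\n",
    "pipedrive://deal/456": "# Acme onboarding\n\nStatus: won\n",
}
SUBMITTED: List[Dict] = []


def search(query: str) -> str:
    return "\n".join(f"- [Deal]({uri})" for uri in PAGES)


def navigate(uri: str) -> str:
    return PAGES[uri]


def submit(uri: str, fields: Dict[str, str]) -> str:
    SUBMITTED.append({"uri": uri, "fields": fields})
    return "# Created\n"


def follow_up_open_deals(query: str) -> str:
    """Chain search, navigate and submit; only a short summary is returned."""
    uris = re.findall(r"\((pipedrive://deal/[^)]+)\)", search(query))
    created = 0
    for uri in uris:
        page = navigate(uri)
        if "Status: open" not in page:
            continue
        title = page.splitlines()[0].lstrip("# ")
        submit(
            "calendar://calendar-123/event/new",
            {"summary": f"Follow up: {title}", "dtstart": "2026-05-08"},
        )
        created += 1
    return f"checked {len(uris)} deals, created {created} events"


result = follow_up_open_deals("Acme")
```

The deal pages never enter the context window here; the agent only sees the one-line summary.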

With OpenClaw and the Code MCP concept, it has become popular to give the agent coding sandboxes with certain capabilities (or, in the case of OpenClaw, all capabilities). Especially with Code MCP, this technique is currently being used to circumvent the limitation of the naive MCP tooling approach (too many tools), with the idea that, instead of describing all tools to an agent, you can just give it a coding sandbox as a single tool, some API specs, and inject the correct API key for authentication. This is one way to deeply connect an agent to an API, and while it has some properties we like, it also has some downsides in my opinion:

While the coding approach gives the agent the ability to focus on and filter for the data that is actually needed, it requires the agent to find and navigate around the same quirks of the API every time. This is a double-edged sword. It makes connecting a new API very quick and easy, and it will work very well when the API has a clean and predictable shape. On the other hand, it gives MCP server authors very little leverage to raise the quality of the API connection.

So, we are trying to offer the best of both worlds here: providing the agent with as much capability as we can, while staying in charge of the shape of the data the agent interacts with. I believe that the easier it is for the agent to connect to the external system, the more focus it has left for the actual task at hand.

Security, Governance, Monitoring

As our system grew, we realized a few needs. First of all, we wanted to make sure that our secrets are safe. To that end we provide configuration routes (e.g. config://email), where the user or the agent can put in secrets and connection details. As I see it, the main risk with giving secrets to agents is not that the secret ends up somewhere on an OpenAI server or similar; it is that a prompt injection might expose complete system access. This is possible as long as the secret is in the context window of an agent. Consequently, we deny the agent any access to stored secrets at any time, so they cannot leak through its context window.
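The write-only behaviour of such a configuration route can be sketched as follows. The store class, the function names, and the page layout are hypothetical; the point is only that no page ever renders a stored value.

```python
from typing import Dict


class SecretStore:
    """Holds secrets server-side; nothing here is rendered into a page."""

    def __init__(self) -> None:
        self._secrets: Dict[str, str] = {}

    def put(self, name: str, value: str) -> None:
        self._secrets[name] = value

    def is_set(self, name: str) -> bool:
        return name in self._secrets

    def use(self, name: str) -> str:
        # Intended for connector code only, never for page renderers.
        return self._secrets[name]


STORE = SecretStore()


def submit_config(uri: str, fields: Dict[str, str]) -> str:
    """Store the submitted values and confirm by field name only."""
    service = uri.split("://", 1)[1]
    for key, value in fields.items():
        STORE.put(f"{service}.{key}", value)
    names = ", ".join(sorted(fields))
    return f"# Saved\n\nStored: {names}\n\n[Back to config]({uri})"


def navigate_config(uri: str) -> str:
    """Show whether a secret exists, but never its value."""
    service = uri.split("://", 1)[1]
    status = "set" if STORE.is_set(f"{service}.password") else "not set"
    return f"# {service} configuration\n\npassword: {status}\n"


confirmation = submit_config("config://email", {"password": "hunter2"})
config_page = navigate_config("config://email")
```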

As we know, agents sometimes get lazy and may hallucinate answers from incomplete data. For this reason, and also to help human users understand the system better, we built a monitoring view that shows live which routes the agent has requested and which data it has submitted, searched for, or scripted. Basically, all access gets logged and is visible live.
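The logging side of this can be sketched as a thin wrapper around each tool. The decorator, the in-memory log, and the record fields below are assumptions; a live view would read from wherever these records are stored.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory access log; a monitoring view would stream these records.
ACCESS_LOG: List[Dict[str, Any]] = []


def monitored(tool_name: str) -> Callable:
    """Record every call to a tool before executing it."""

    def wrap(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def inner(*args: Any, **kwargs: Any) -> Any:
            ACCESS_LOG.append(
                {"tool": tool_name, "args": args, "kwargs": kwargs, "ts": time.time()}
            )
            return fn(*args, **kwargs)

        return inner

    return wrap


@monitored("navigate")
def navigate(uri: str) -> str:
    return f"# Page for {uri}"


@monitored("submit")
def submit(uri: str, fields: Dict[str, str]) -> str:
    return "# Saved"


navigate("pipedrive://deal/123")
submit("pipedrive://deal/123/note/new", {"content": "Called the customer"})
```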

We also strictly deny agents HTTP requests from our Python sandbox, as well as arbitrary file access, etc. We want to enable agents for specific use cases and also control how they use the system; opening it up to any means of access can be quite damaging.

As a governance tool we added a permission system that controls which routes may be navigated or submitted to. Even though agents can grant their own permissions in some configurations, this still adds friction and leads to the agent asking the user for those permissions.
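A route-based permission check can be as small as a pattern match per action. The sketch below assumes glob-style patterns and a hand-written permission table; the actual rule format is not described in this article.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Hypothetical permission table: allowed URI patterns per action.
PERMISSIONS: Dict[str, List[str]] = {
    "navigate": ["pipedrive://*", "calendar://*"],
    "submit": ["calendar://calendar-123/event/new"],
}


def is_allowed(action: str, uri: str) -> bool:
    """Allow a call only if the URI matches a pattern granted for that action."""
    return any(fnmatchcase(uri, pattern) for pattern in PERMISSIONS.get(action, []))


def guarded_submit(uri: str, fields: Dict[str, str]) -> str:
    """Return a permission page instead of performing a denied submit."""
    if not is_allowed("submit", uri):
        return f"# Permission required\n\nSubmitting to `{uri}` is not permitted."
    return "# Saved"


can_read = is_allowed("navigate", "pipedrive://deal/123")
can_write = is_allowed("submit", "pipedrive://deal/123/note/new")
denied_page = guarded_submit("pipedrive://deal/123/note/new", {"content": "..."})
```

Returning the denial as a page keeps it inside the browse pattern: the agent reads it like any other result and can ask the user for the missing permission.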

Wrapping up

Once we translated the API surface into a markdown website with pages and links, many issues disappeared. Many new issues appeared as well, but I believe they appeared because we are demanding more from our system, and this is a sign of healthy development.

You can take a closer look at our software. If you want to try it out, get in touch.