The supervisor sidepanel concept is the right mental model — I've found that the biggest problem with agent-driven browsers isn't the automation itself, it's the human not knowing where the agent is in its reasoning loop when things go wrong.
Curious about the MCP tool schema design: with 40+ tools, are you using a flat tool registry or do you have any grouping/namespacing? I've run into issues where agents get confused when they have too many tools with similar-sounding names and start hallucinating tool calls that don't exist.
Really good question. Vessel currently uses a flat registry, but the design is specifically aimed at the hallucination problem.
The way I manage that in Vessel is contextual surfacing: instead of having 40+ tools to choose from at any given time, the model is steered toward the subset of tools that apply to the current page context.
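To make the idea concrete, here is a minimal sketch of contextual surfacing over a flat registry. It is not Vessel's code: the tool names, the `requires` field, and the page feature labels are all hypothetical, chosen only to show how a flat list can still present a small window of tools per page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    # Page features that make this tool relevant (hypothetical labels).
    # An empty set means the tool is always surfaced.
    requires: frozenset = frozenset()


# Hypothetical flat registry; these are illustrative names, not Vessel's tools.
REGISTRY = [
    Tool("navigate"),
    Tool("click_element", frozenset({"button", "link"})),
    Tool("fill_form", frozenset({"form"})),
    Tool("select_option", frozenset({"select"})),
    Tool("play_media", frozenset({"video", "audio"})),
]


def surface_tools(page_features: set) -> list:
    """Return the names of tools that apply to the current page.

    A tool is surfaced if it has no requirements, or if at least one of
    its required features was detected on the page.
    """
    return [
        tool.name
        for tool in REGISTRY
        if not tool.requires or tool.requires & page_features
    ]


if __name__ == "__main__":
    # A page with a form and a button surfaces three of the five tools.
    print(surface_tools({"form", "button"}))
```

The registry itself stays flat; only the list handed to the model shrinks, so the model never sees tool names that cannot apply to the page in front of it.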
Speed and efficiency are my number one priority with this browser, so this framework may change as time goes on, but this is an approach I'm particularly interested in exploring.
Contextual surfacing is the right instinct — it's essentially dynamic tool windowing. The interesting design question is what signals you use to determine relevance: URL pattern, page content category, prior agent action, or something else?
I've been packaging reusable MCP tool schema definitions (the JSON spec layer, not implementations) as a way to give agents a consistent vocabulary across different harnesses. The hallucination problem you're solving at the surfacing layer is related but distinct from the schema consistency problem — both matter for reliable tool use in production.
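For readers unfamiliar with the "JSON spec layer", a tool definition at that layer is roughly a name, a description, and a JSON Schema for the arguments (`name`, `description`, and `inputSchema` are the field names MCP uses). The specific tool below and the `missing_arguments` helper are hypothetical, just a sketch of what a reusable definition and a cheap consistency check could look like.

```python
import json

# Hypothetical tool definition; the tool and its arguments are illustrative.
FILL_FORM = {
    "name": "fill_form",
    "description": "Fill the fields of a form identified by a CSS selector.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "selector": {
                "type": "string",
                "description": "CSS selector of the form element",
            },
            "values": {
                "type": "object",
                "description": "Map of field name to the text to enter",
                "additionalProperties": {"type": "string"},
            },
        },
        "required": ["selector", "values"],
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list:
    """Return the required argument names absent from a proposed call.

    This only checks presence, not types; a full check would run a JSON
    Schema validator against tool["inputSchema"].
    """
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]


if __name__ == "__main__":
    # The definition is plain data, so it serializes to JSON unchanged.
    print(json.dumps(FILL_FORM, indent=2))
    print(missing_arguments(FILL_FORM, {"selector": "#login"}))
```

Because the definition is plain data, the same file can be shipped to different harnesses without carrying any implementation along with it.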
Vessel primarily uses DOM content analysis to identify the interactive elements on the page, which in turn indicates which tools are relevant to surface to the agent. URL pattern is the obvious next indicator, and prior agent action is where it gets REALLY interesting imo, because that's where you start building a predictive model of what the agent needs next rather than just reacting to the current state.
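One way the three signals could be combined is a simple weighted score per tool. This is only a sketch under assumptions: the weights, the hint table, the follow-up table, and every tool name are hypothetical, and a real predictive model of agent behaviour would be learned rather than hand-written.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical weights: DOM evidence dominates, prior action and URL refine it.
W_DOM, W_URL, W_PRIOR = 3.0, 1.0, 2.0

# Hypothetical per-tool hints: element types and URL substrings suggesting relevance.
TOOL_HINTS = {
    "fill_form": {"dom": {"form", "input"}, "url": ("login", "signup", "checkout")},
    "submit_form": {"dom": {"form"}, "url": ("login", "checkout")},
    "click_element": {"dom": {"button", "link"}, "url": ()},
    "play_media": {"dom": {"video"}, "url": ("watch",)},
}

# Hypothetical transitions: tools that tend to follow a given previous tool call.
FOLLOW_UPS = {"fill_form": {"submit_form"}}


@dataclass
class PageContext:
    dom_elements: set                 # interactive element types found in the DOM
    url: str = ""
    last_tool: Optional[str] = None   # the agent's previous tool call, if any


def score(tool: str, ctx: PageContext) -> float:
    """Sum the weights of every signal that points at this tool."""
    hints = TOOL_HINTS[tool]
    total = 0.0
    if hints["dom"] & ctx.dom_elements:
        total += W_DOM
    if any(fragment in ctx.url for fragment in hints["url"]):
        total += W_URL
    if ctx.last_tool and tool in FOLLOW_UPS.get(ctx.last_tool, set()):
        total += W_PRIOR
    return total


def rank_tools(ctx: PageContext, limit: int = 3) -> list:
    """Return up to `limit` tool names with a positive score, best first."""
    scored = [(score(tool, ctx), tool) for tool in TOOL_HINTS]
    relevant = [(s, tool) for s, tool in scored if s > 0]
    relevant.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by name
    return [tool for _, tool in relevant[:limit]]


if __name__ == "__main__":
    ctx = PageContext({"form", "input", "button"}, "https://example.com/login",
                      last_tool="fill_form")
    print(rank_tools(ctx))
```

In this sketch the prior-action signal is what moves `submit_form` ahead of `fill_form` once the form has been filled, which is the "predict what comes next" behaviour as opposed to reacting to the page alone.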
Your schema work sounds pretty interesting - any links? I'd be curious to check it out!