feat: Add render_page_text MCP tool for token-efficient visual layout rendering #1966

@lindezhong

Is your feature request related to a problem? Please describe.

To understand a page's visual layout, users of chrome-devtools-mcp currently have two options: take_snapshot (semantic structure) or
take_screenshot (image). Each has a limitation:

  • take_snapshot provides interactive uids but carries no spatial layout information
  • take_screenshot captures the full visual result but consumes significant tokens, making it expensive for LLM processing

We need a middle ground between the two: a plain-text grid rendering that preserves column layouts, heading hierarchy, sidebar positions, and
other visual structure, in a token-efficient text format.

Describe the solution you'd like

I would like a new MCP tool called render_page_text that renders the currently selected page as a fixed-width plain-text grid, inspired by the way browsh (https://github.com/browsh-org/browsh) converts modern web pages into text-based representations.

Core Concept: Browsh-Inspired Text Rendering

Browsh's key insight: Modern web pages can be represented as ASCII-art-style text grids while preserving spatial relationships. Instead of treating web content as linear HTML (like lynx), browsh uses the browser's rendering engine to capture actual positions, then discretizes them into a character grid.
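
To make the discretization step concrete, here is a minimal TypeScript sketch. It is not browsh's implementation and not an existing chrome-devtools-mcp API: the names TextBox, GridOptions, and renderTextGrid are hypothetical, and the cell size (8x16 pixels) is an assumed value. It assumes each text fragment's pixel position has already been collected from the browser's layout engine.

```typescript
// Hypothetical input shape: one text fragment plus the pixel position the
// browser's layout engine reported for it (assumed to be collected elsewhere).
interface TextBox {
  text: string;
  x: number; // left edge in CSS pixels
  y: number; // top edge in CSS pixels
}

// Hypothetical options; the cell dimensions are illustrative assumptions.
interface GridOptions {
  cols: number; // grid width in characters
  rows: number; // grid height in characters
  cellWidth: number; // pixels covered by one character cell
  cellHeight: number; // pixels covered by one character row
}

// Map each positioned fragment onto a fixed-width character grid.
function renderTextGrid(boxes: TextBox[], opts: GridOptions): string {
  // Start with a grid filled with spaces.
  const grid: string[][] = Array.from({ length: opts.rows }, () =>
    Array.from({ length: opts.cols }, () => " "),
  );

  for (const box of boxes) {
    const row = Math.floor(box.y / opts.cellHeight);
    const line = grid[row];
    if (line === undefined) continue; // fragment lies outside the grid

    let col = Math.floor(box.x / opts.cellWidth);
    for (const ch of box.text) {
      if (col >= opts.cols) break; // clip at the right edge
      if (col >= 0) line[col] = ch; // later fragments overwrite earlier ones
      col++;
    }
  }

  // Join cells into lines and drop trailing padding on each line.
  return grid.map((line) => line.join("").replace(/\s+$/, "")).join("\n");
}

// Usage: a title above a two-column row (sidebar on the left, body on the right).
const page = renderTextGrid(
  [
    { text: "Title", x: 0, y: 0 },
    { text: "Nav", x: 0, y: 32 },
    { text: "Body text", x: 80, y: 32 },
  ],
  { cols: 24, rows: 3, cellWidth: 8, cellHeight: 16 },
);
console.log(page);
```

In this sketch the sidebar and body text land on the same output line at different columns, which is the spatial information a linear text dump would lose. Overlapping fragments simply overwrite each other; a real tool would need a policy for overlap, wrapping, and scrolling.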

Describe alternatives you've considered

w3m-Style Text Rendering Engine

Approach: Build a custom HTML → text layout engine (like https://github.com/tats/w3m)

Additional context

No response
