HyperionGray · Copilot · Nov 22, 2025
diff --git a/README.md b/README.md
@@ -13,6 +13,19 @@ I/O using [Trio](https://trio.readthedocs.io/). This library handles the
 WebSocket negotiation and session management, allowing you to transparently
 multiplex commands, responses, and events over a single connection.
 
+## Features
+
+- **Pure CDP**: Direct access to Chrome DevTools Protocol
+- **Async/Await**: Built on Trio for structured concurrency
+- **Type Safety**: Full type hints for better IDE support
+- **High-Level Utilities**: Puppeteer-inspired abstractions for common tasks
+  - Keyboard and mouse simulation
+  - Element interaction and querying
+  - Wait for elements to appear
+  - Pure CDP implementation (no JavaScript injection)
+
+## Basic Example
+
 The example below demonstrates the salient features of the library by navigating to a
 web page and extracting the document title.
 
@@ -40,5 +53,30 @@ async with open_cdp(cdp_url) as conn:
         print(html)
 ```
 
+## High-Level Utilities Example
+
+The library also provides high-level utilities for common automation tasks:
+
+```python
+from trio_cdp import open_cdp, page, target
+from trio_cdp.util import query_selector, Keyboard
+
+async with open_cdp(cdp_url) as conn:
+    async with conn.open_session(target_id) as session:
+        # Navigate to a page
+        await page.enable()
+        await page.navigate(url)
+
+        # Find an input field and type into it
+        input_field = await query_selector(session, 'input[name="search"]')
+        if input_field:
+            await input_field.type('Hello, World!')
+
+        # Press Enter to submit
+        keyboard = Keyboard(session)
+        await keyboard.press('Enter')
+```
+
 This example code is explained [in the documentation](https://trio-cdp.readthedocs.io)
-and more example code can be found in the `examples/` directory of this repository.
+and more example code can be found in the `examples/` directory of this repository,
+including examples for taking screenshots and monitoring network events.
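
The README example above ends with `keyboard.press('Enter')`. As a rough illustration of what such a call could send over the wire, the sketch below builds the parameter dictionaries for the CDP `Input.dispatchKeyEvent` command: one `keyDown` followed by one `keyUp`. This is an assumption about how a key-press helper might work, not the code added in this PR. `key_press_events` and `type_text_events` are hypothetical names, and only the `type`, `key`, and `text` parameters of the CDP command are used.

```python
from typing import Dict, List, Optional


def key_press_events(key: str, text: Optional[str] = None) -> List[Dict[str, str]]:
    """Build Input.dispatchKeyEvent params for one press (hypothetical helper)."""
    down = {"type": "keyDown", "key": key}
    if text is not None:
        # "text" is the character the key produces, if any.
        down["text"] = text
    up = {"type": "keyUp", "key": key}
    return [down, up]


def type_text_events(text: str) -> List[Dict[str, str]]:
    """Expand a string into one keyDown/keyUp pair per character."""
    events: List[Dict[str, str]] = []
    for ch in text:
        events.extend(key_press_events(ch, text=ch))
    return events
```

Each dictionary would be sent as a separate `Input.dispatchKeyEvent` command. A real implementation would likely also fill in fields such as `code` and `windowsVirtualKeyCode`, which this sketch omits.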
diff --git a/UTILITIES_IMPLEMENTATION.md b/UTILITIES_IMPLEMENTATION.md
@@ -0,0 +1,177 @@
+# Utilities Module Implementation Summary
+
+## Overview
+
+This implementation addresses the GitHub issue about extending `trio-chrome-devtools-protocol` with higher-level utility functions and classes for common browser automation tasks, inspired by Puppeteer/Pyppeteer.
+
+## Decision: Integrated Approach
+
+Rather than creating a separate `trio-puppeteer` package, the utilities are integrated directly into the main `trio_cdp` package as a `util` module. This approach was chosen because:
+
+1. **Lightweight**: The utilities are thin wrappers around CDP commands
+2. **No External Dependencies**: Everything uses native CDP, no JavaScript injection
+3. **Tight Integration**: Direct access to session and connection objects
+4. **Simplicity**: Users don't need to install/manage a separate package
+
+## Implementation
+
+### New Module: `trio_cdp/util.py`
+
+Contains three main classes and a set of element selection functions:
+
+#### 1. Keyboard Class
+Provides keyboard input simulation:
+- `down(key, text=None)` - Press key down
+- `up(key)` - Release key
+- `press(key, delay=0)` - Complete key press (down + up)
+- `type(text, delay=0)` - Type a string character by character
+
+**Example:**
+```python
+keyboard = Keyboard(session)
+await keyboard.type("Hello, World!")
+await keyboard.press("Enter")
+```
+
+#### 2. Mouse Class
+Provides mouse action simulation:
+- `move(x, y, steps=1)` - Move mouse with optional smooth interpolation
+- `click(x, y, button='left', click_count=1, delay=0)` - Click at position
+- `down(button='left', click_count=1)` - Mouse button down
+- `up(button='left', click_count=1)` - Mouse button up
+
+**Example:**
+```python
+mouse = Mouse(session)
+await mouse.move(100, 200, steps=10)  # Smooth movement
+await mouse.click(100, 200)
+```
+
+#### 3. ElementHandle Class
+Represents a handle to a DOM element with convenient interaction methods:
+- `click(button='left', click_count=1, delay=0)` - Click the element
+- `type(text, delay=0)` - Focus and type into element
+- `get_attribute(name)` - Get HTML attribute value
+- `get_property(name)` - Get JavaScript property value
+- `get_text_content()` - Extract text content
+
+**Example:**
+```python
+input_field = await query_selector(session, 'input[name="email"]')
+if input_field:
+    await input_field.type('user@example.com')
+```
+
+#### Element Selection Functions
+- `query_selector(session, selector, node_id=None)` - Find first matching element
+- `query_selector_all(session, selector, node_id=None)` - Find all matching elements
+- `wait_for_selector(session, selector, timeout=30, visible=False)` - Wait for element
+
+**Example:**
+```python
+# Find and interact with elements
+button = await query_selector(session, 'button.submit')
+if button:
+    await button.click()
+
+# Wait for dynamic content
+result = await wait_for_selector(session, '.result', timeout=10, visible=True)
+```
+
+## Documentation
+
+### Added Files
+1. **docs/utilities.rst** - Comprehensive documentation for all utilities
+2. **examples/form_interaction.py** - Example showing form interaction
+3. **examples/keyboard_mouse.py** - Example demonstrating keyboard/mouse usage
+4. **tests/test_util.py** - Unit tests for utility functions
+5. **validate_utilities.py** - Validation script to verify module structure
+
+### Updated Files
+1. **README.md** - Added utilities section with examples
+2. **docs/index.rst** - Added utilities to documentation table of contents
+3. **trio_cdp/__init__.py** - Export util module
+
+## Key Design Principles
+
+1. **Pure CDP**: No JavaScript injection, all interactions use native CDP commands
+2. **Async-First**: Fully compatible with Trio's async/await patterns
+3. **Lightweight**: Minimal abstractions, close to underlying CDP
+4. **Type-Safe**: Complete type hints for IDE support
+5. **Composable**: Small, focused utilities that work well together
+6. **Optional**: Core CDP functionality remains available; utilities are opt-in
+
+## Benefits
+
+### For Users
+- **Intuitive API**: Familiar patterns for anyone coming from Puppeteer
+- **Less Boilerplate**: Common tasks simplified with high-level methods
+- **Type Safety**: Full IDE support with autocomplete and type checking
+- **Pure Python**: No JavaScript knowledge required
+
+### For the Project
+- **Maintains Philosophy**: Stays true to lightweight, CDP-focused approach
+- **No Breaking Changes**: Completely additive, existing code unaffected
+- **Extensible**: Users can easily add custom utilities following same patterns
+- **Well-Documented**: Comprehensive docs and examples
+
+## Technical Details
+
+### Generator Fix
+Fixed `generator/generate.py` to handle `typing.Optional` type hints. The missing handling was preventing regeneration of the CDP bindings with newer Python versions.
+
+### CDP Bindings Regenerated
+Regenerated all CDP binding code to be compatible with `chrome-devtools-protocol==0.4.0`, resolving import errors with the generated code.
+
+## Testing & Validation
+
+1. **Unit Tests**: Comprehensive test suite in `tests/test_util.py`
+2. **Validation Script**: `validate_utilities.py` verifies all classes and methods exist
+3. **Code Quality**: Passed CodeQL security scan with 0 alerts
+4. **Examples**: Two working examples demonstrate real-world usage
+
+## Usage Example
+
+Here's a complete example showing the utilities in action:
+
+```python
+import trio
+from trio_cdp import open_cdp, page, target
+from trio_cdp.util import query_selector, wait_for_selector, Keyboard
+
+async def automate_form(cdp_url):
+    async with open_cdp(cdp_url) as conn:
+        # Get a target
+        targets = await target.get_targets()
+        target_id = targets[0].target_id
+
+        async with conn.open_session(target_id) as session:
+            # Navigate
+            await page.enable()
+            await page.navigate('https://example.com/form')
+
+            # Wait for and fill form
+            name_field = await wait_for_selector(session, 'input[name="name"]', timeout=10)
+            if name_field:
+                await name_field.type('John Doe')
+
+            # Use keyboard for submission
+            keyboard = Keyboard(session)
+            await keyboard.press('Enter')
+```
+
+## Future Enhancements
+
+Potential additions that maintain the same design philosophy:
+
+1. **Page utilities**: Screenshot helpers, PDF generation utilities
+2. **Network utilities**: Request interception helpers, mock response utilities
+3. **Cookie utilities**: Easy cookie management
+4. **Dialog utilities**: Alert/prompt/confirm handlers
+5. **File upload**: File chooser utilities
+
+Each would follow the same pattern: lightweight wrappers around CDP commands with convenient async interfaces.
+
+## Conclusion
+
+This implementation extends `trio-chrome-devtools-protocol` with higher-level utilities while keeping the library's core principles: lightweight, pure CDP, and Trio-native. The utilities provide a more intuitive interface for common automation tasks without sacrificing the power and flexibility of the underlying protocol.
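
The summary above describes `Mouse.move(x, y, steps=1)` as offering "optional smooth interpolation" but does not show how the intermediate points are chosen. A minimal sketch, assuming plain linear interpolation in the style of Puppeteer, is given below. `interpolate_move` is a hypothetical helper, not a function from `trio_cdp/util.py`. In a CDP client, each returned point would typically be sent as an `Input.dispatchMouseEvent` command of type `mouseMoved`.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_move(start: Point, end: Point, steps: int = 1) -> List[Point]:
    """Return `steps` evenly spaced points from just after `start` up to `end`."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    x0, y0 = start
    x1, y1 = end
    return [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(1, steps + 1)
    ]
```

With `steps=1` the list contains only the destination, which matches a single direct move. Larger values produce more intermediate `mouseMoved` events, which some pages need in order to register hover or drag behaviour.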
diff --git a/docs/changelog.rst b/docs/changelog.rst
@@ -1,6 +1,12 @@
 Changelog
 =========
 
+Unreleased
+----------
+
+* Add ``find_chrome_debugger_url()`` function for programmatic discovery of Chrome's WebSocket URL.
+* ``open_cdp()`` now accepts HTTP URLs (e.g., ``http://localhost:9222``) which are automatically resolved to WebSocket URLs.
+
 0.6.0
 -----
 

diff --git a/docs/getting_started.rst b/docs/getting_started.rst
@@ -3,6 +3,53 @@ Getting Started
 
 .. highlight:: python
 
+Connecting to Chrome
+--------------------
+
+Trio CDP provides flexible ways to connect to a Chrome browser (or any browser that
+supports the Chrome DevTools Protocol).
+
+Starting Chrome with Remote Debugging
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+First, start Chrome with remote debugging enabled::
+
+    $ chrome --remote-debugging-port=9222
+
+You can use any free port number. When it starts, Chrome prints the debugging URL to
+the console on a line beginning with ``DevTools listening on``.
+
+Connecting Programmatically
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The simplest way to connect is by using an HTTP URL::
+
+    from trio_cdp import open_cdp
+
+    async with open_cdp('http://localhost:9222') as conn:
+        # Your code here
+        ...
+
+The library will automatically discover the WebSocket URL from Chrome's HTTP endpoint.
+
+Alternatively, you can use the discovery function explicitly::
+
+    from trio_cdp import find_chrome_debugger_url, open_cdp
+
+    # Discover the WebSocket URL
+    browser_url = find_chrome_debugger_url(port=9222)
+
+    async with open_cdp(browser_url) as conn:
+        ...
+
+You can also provide a WebSocket URL directly if you already have it::
+
+    async with open_cdp('ws://localhost:9222/devtools/browser/...') as conn:
+        ...
+
+Basic Example
+-------------
+
 The following example shows how to connect to browser, navigate to a specified web page,
 and then extract the page title.
 
@@ -120,6 +167,63 @@ we get the outer HTML of the node. This snippet shows some new APIs, but the
 mechanics of sending commands and getting responses are the same as the previous
 snippets.
 
-A more complete version of this example can be found in ``examples/get_title.py`` in
-the repository. There is also a screenshot example in ``examples/screenshot.py``. The 
-unit tests in ``tests/`` also provide some helpful sample code.
+Listening to Events
+-------------------
+
+Trio CDP provides two patterns for handling browser events:
+
+Using ``wait_for()``
+~~~~~~~~~~~~~~~~~~~~
+
+The ``wait_for()`` method is useful when you need to wait for a single event before
+continuing execution. We've already seen this in the navigation example above, where
+we wait for ``page.LoadEventFired``. Here's the pattern:
+
+.. code::
+
+    async with session.wait_for(page.LoadEventFired) as event_proxy:
+        # Trigger an action that will cause the event
+        await page.navigate(url='https://example.com')
+    # After the context exits, event_proxy.value contains the event
+    print(f"Page loaded at timestamp: {event_proxy.value.timestamp}")
+
+Using ``listen()``
+~~~~~~~~~~~~~~~~~~
+
+The ``listen()`` method returns an async iterator that continuously yields events as
+they occur. This is useful for monitoring ongoing activity, such as network requests:
+
+.. code::
+
+    # Enable network events
+    await network.enable()
+
+    # Listen for network events
+    async for event in session.listen(
+        network.RequestWillBeSent,
+        network.ResponseReceived
+    ):
+        if isinstance(event, network.RequestWillBeSent):
+            print(f"Request: {event.request.url}")
+        elif isinstance(event, network.ResponseReceived):
+            print(f"Response: {event.response.url} (status: {event.response.status})")
+
+You can listen to multiple event types at once by passing them all to ``listen()``.
+The iterator will yield events of any of the specified types as they occur.
+
+**Important:** Don't forget to enable events for the domain you're interested in!
+For example, call ``await network.enable()`` before listening to network events,
+or ``await page.enable()`` before listening to page events. You can also use the
+context managers ``session.page_enable()`` or ``session.dom_enable()`` for automatic
+cleanup.
+
+Examples
+--------
+
+A more complete version of the basic example can be found in ``examples/get_title.py`` in
+the repository. Additional examples include:
+
+- ``examples/screenshot.py`` - Taking screenshots of web pages
+- ``examples/network_events.py`` - Monitoring network events using both ``wait_for()`` and ``listen()``
+
+The unit tests in ``tests/`` also provide helpful sample code.
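
The getting-started text above says an HTTP URL is "automatically resolved" to a WebSocket URL without showing how. A minimal sketch of one way to do it is given below, relying on the fact that Chrome's debugging port serves a `/json/version` endpoint whose JSON response includes a `webSocketDebuggerUrl` field. This is not the implementation of `find_chrome_debugger_url()` or `open_cdp()` from this PR. `resolve_ws_url` and its `fetch` parameter are hypothetical names introduced for illustration.

```python
import json
import urllib.request
from typing import Callable
from urllib.parse import urlsplit


def _http_get(url: str) -> bytes:
    """Blocking HTTP GET using only the standard library."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read()


def resolve_ws_url(url: str, fetch: Callable[[str], bytes] = _http_get) -> str:
    """Return a WebSocket debugger URL for `url` (hypothetical helper)."""
    scheme = urlsplit(url).scheme
    if scheme in ("ws", "wss"):
        # Already a WebSocket URL: nothing to discover.
        return url
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported URL scheme: {scheme!r}")
    # Chrome describes the browser endpoint at /json/version.
    payload = json.loads(fetch(url.rstrip("/") + "/json/version"))
    return payload["webSocketDebuggerUrl"]
```

The `fetch` argument exists only so the lookup can be exercised without a running browser. Because `urllib` blocks, a Trio program would probably run this through `trio.to_thread.run_sync` or use an async HTTP client instead.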
diff --git a/docs/index.rst b/docs/index.rst
@@ -15,4 +15,5 @@ responses, and events over a single connection.
    installation
    getting_started
    api
+   utilities
    changelog