37 changes: 37 additions & 0 deletions README.md
@@ -13,6 +13,19 @@ I/O using [Trio](https://trio.readthedocs.io/). This library handles the
WebSocket negotiation and session management, allowing you to transparently
multiplex commands, responses, and events over a single connection.

## Features

- **Pure CDP**: Direct access to Chrome DevTools Protocol
- **Async/Await**: Built on Trio for structured concurrency
- **Type Safety**: Full type hints for better IDE support
- **High-Level Utilities**: Puppeteer-inspired abstractions for common tasks
  - Keyboard and mouse simulation
  - Element interaction and querying
  - Wait for elements to appear
  - Pure CDP implementation (no JavaScript injection)

## Basic Example

The example below demonstrates the salient features of the library by navigating to a
web page and extracting the document title.

@@ -40,5 +53,29 @@ async with open_cdp(cdp_url) as conn:
print(html)
```

## High-Level Utilities Example

The library also provides high-level utilities for common automation tasks:

```python
from trio_cdp import open_cdp, page, target
from trio_cdp.util import query_selector, Keyboard

async with open_cdp(cdp_url) as conn:
    async with conn.open_session(target_id) as session:
        # Navigate to a page
        await page.enable()
        await page.navigate(url)

        # Find an input field and type into it
        input_field = await query_selector(session, 'input[name="search"]')
        if input_field:
            await input_field.type('Hello, World!')

        # Press Enter to submit
        keyboard = Keyboard(session)
        await keyboard.press('Enter')
```

This example code is explained [in the documentation](https://trio-cdp.readthedocs.io)
and more example code can be found in the `examples/` directory of this repository.
177 changes: 177 additions & 0 deletions UTILITIES_IMPLEMENTATION.md
@@ -0,0 +1,177 @@
# Utilities Module Implementation Summary

## Overview

This implementation addresses the GitHub issue about extending `trio-chrome-devtools-protocol` with higher-level utility functions and classes for common browser automation tasks, inspired by Puppeteer/Pyppeteer.

## Decision: Integrated Approach

Rather than creating a separate `trio-puppeteer` package, the utilities are integrated directly into the main `trio_cdp` package as a `util` module. This approach was chosen because:

1. **Lightweight**: The utilities are thin wrappers around CDP commands
2. **No External Dependencies**: Everything uses native CDP, no JavaScript injection
3. **Tight Integration**: Direct access to session and connection objects
4. **Simplicity**: Users don't need to install/manage a separate package

## Implementation

### New Module: `trio_cdp/util.py`

Contains three main classes and utility functions:

#### 1. Keyboard Class
Provides keyboard input simulation:
- `down(key, text=None)` - Press key down
- `up(key)` - Release key
- `press(key, delay=0)` - Complete key press (down + up)
- `type(text, delay=0)` - Type a string character by character

**Example:**
```python
keyboard = Keyboard(session)
await keyboard.type("Hello, World!")
await keyboard.press("Enter")
```
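
One way to picture `type()` is as a flat sequence of key-down/key-up pairs, one pair per character. The sketch below builds parameter dictionaries shaped like a subset of CDP's `Input.dispatchKeyEvent` parameters (`type`, `key`, `text`). The helper name `key_events_for_text` is hypothetical and is not part of `trio_cdp.util`; the actual implementation may send additional fields or handle special keys differently.

```python
from typing import Dict, List


def key_events_for_text(text: str) -> List[Dict[str, str]]:
    """Build one keyDown/keyUp pair per character of ``text``.

    Hypothetical helper for illustration only; it is not part of trio_cdp.
    Each dictionary mirrors a subset of Input.dispatchKeyEvent parameters.
    """
    events: List[Dict[str, str]] = []
    for char in text:
        # keyDown carries the text to insert; keyUp only names the key.
        events.append({"type": "keyDown", "key": char, "text": char})
        events.append({"type": "keyUp", "key": char})
    return events


if __name__ == "__main__":
    for event in key_events_for_text("Hi"):
        print(event)
```

A `delay` argument would presumably translate into a sleep between consecutive pairs, which is why the sequence is kept flat and ordered.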

#### 2. Mouse Class
Provides mouse action simulation:
- `move(x, y, steps=1)` - Move mouse with optional smooth interpolation
- `click(x, y, button='left', click_count=1, delay=0)` - Click at position
- `down(button='left', click_count=1)` - Mouse button down
- `up(button='left', click_count=1)` - Mouse button up

**Example:**
```python
mouse = Mouse(session)
await mouse.move(100, 200, steps=10) # Smooth movement
await mouse.click(100, 200)
```
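
The `steps` parameter suggests that a move is split into evenly spaced intermediate positions. A minimal sketch of that interpolation, assuming plain linear spacing (as Puppeteer does) and a hypothetical helper name `interpolate_move` that is not part of `trio_cdp.util`:

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_move(start: Point, end: Point, steps: int = 1) -> List[Point]:
    """Return the positions visited when moving from ``start`` to ``end``.

    Hypothetical helper for illustration only. The start point itself is
    not included; the final entry is always ``end``.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(1, steps + 1)
    ]


if __name__ == "__main__":
    for point in interpolate_move((0, 0), (100, 200), steps=4):
        print(point)
```

With `steps=1` the list contains only the destination, which matches the default of a single direct move.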

#### 3. ElementHandle Class
Represents a handle to a DOM element with convenient interaction methods:
- `click(button='left', click_count=1, delay=0)` - Click the element
- `type(text, delay=0)` - Focus and type into element
- `get_attribute(name)` - Get HTML attribute value
- `get_property(name)` - Get JavaScript property value
- `get_text_content()` - Extract text content

**Example:**
```python
input_field = await query_selector(session, 'input[name="email"]')
if input_field:
    await input_field.type('user@example.com')
```
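
Clicking an element through CDP requires page coordinates. A common approach, assumed here rather than confirmed for `ElementHandle.click`, is to take the element's content quad (CDP's `DOM.getBoxModel` reports it as eight numbers, `[x1, y1, x2, y2, x3, y3, x4, y4]`) and click its center. The helper `quad_center` below is hypothetical:

```python
from typing import Sequence, Tuple


def quad_center(quad: Sequence[float]) -> Tuple[float, float]:
    """Return the center of a quad given as [x1, y1, x2, y2, x3, y3, x4, y4].

    Hypothetical helper for illustration only; it is not part of trio_cdp.
    """
    if len(quad) != 8:
        raise ValueError("a quad must contain exactly 8 numbers")
    xs = quad[0::2]  # x coordinates of the four corners
    ys = quad[1::2]  # y coordinates of the four corners
    return (sum(xs) / 4, sum(ys) / 4)


if __name__ == "__main__":
    # A 100x40 rectangle whose top-left corner is at (10, 20).
    print(quad_center([10, 20, 110, 20, 110, 60, 10, 60]))
```

The resulting point could then be passed to something like `Mouse.click(x, y)`.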

#### Element Selection Functions
- `query_selector(session, selector, node_id=None)` - Find first matching element
- `query_selector_all(session, selector, node_id=None)` - Find all matching elements
- `wait_for_selector(session, selector, timeout=30, visible=False)` - Wait for element

**Example:**
```python
# Find and interact with elements
button = await query_selector(session, 'button.submit')
if button:
    await button.click()

# Wait for dynamic content
result = await wait_for_selector(session, '.result', timeout=10, visible=True)
```
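
Waiting for a selector is typically a polling loop with a deadline. The sketch below shows that pattern in a library-agnostic form: `poll_until` is a hypothetical helper, and the `sleep` and `clock` callables are injected so the loop does not depend on a particular async runtime. Returning `None` on timeout is an assumption; the text above does not say whether `wait_for_selector` returns `None` or raises.

```python
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def poll_until(
    check: Callable[[], Awaitable[Optional[T]]],
    sleep: Callable[[float], Awaitable[None]],
    clock: Callable[[], float],
    timeout: float = 30,
    interval: float = 0.1,
) -> Optional[T]:
    """Call ``check`` until it returns a non-None value or ``timeout`` elapses.

    Hypothetical helper for illustration only; returns None on timeout.
    """
    deadline = clock() + timeout
    while True:
        result = await check()
        if result is not None:
            return result
        if clock() >= deadline:
            return None
        # Yield to the event loop before the next attempt.
        await sleep(interval)
```

Under Trio, `trio.sleep` and `trio.current_time` would be natural choices for `sleep` and `clock`, and `check` could wrap a call such as `query_selector(session, selector)`.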

## Documentation

### Added Files
1. **docs/utilities.rst** - Comprehensive documentation for all utilities
2. **examples/form_interaction.py** - Example showing form interaction
3. **examples/keyboard_mouse.py** - Example demonstrating keyboard/mouse usage
4. **tests/test_util.py** - Unit tests for utility functions
5. **validate_utilities.py** - Validation script to verify module structure

### Updated Files
1. **README.md** - Added utilities section with examples
2. **docs/index.rst** - Added utilities to documentation table of contents
3. **trio_cdp/__init__.py** - Export util module

## Key Design Principles

1. **Pure CDP**: No JavaScript injection, all interactions use native CDP commands
2. **Async-First**: Fully compatible with Trio's async/await patterns
3. **Lightweight**: Minimal abstractions, close to underlying CDP
4. **Type-Safe**: Complete type hints for IDE support
5. **Composable**: Small, focused utilities that work well together
6. **Optional**: Core CDP functionality remains available; utilities are opt-in

## Benefits

### For Users
- **Intuitive API**: Familiar patterns for anyone coming from Puppeteer
- **Less Boilerplate**: Common tasks simplified with high-level methods
- **Type Safety**: Full IDE support with autocomplete and type checking
- **Pure Python**: No JavaScript knowledge required

### For the Project
- **Maintains Philosophy**: Stays true to lightweight, CDP-focused approach
- **No Breaking Changes**: Completely additive, existing code unaffected
- **Extensible**: Users can easily add custom utilities following same patterns
- **Well-Documented**: Comprehensive docs and examples

## Technical Details

### Generator Fix
Fixed `generator/generate.py` to handle `typing.Optional` type hints. Before the fix, these hints prevented regeneration of the CDP bindings on newer Python versions.
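
For context, `typing.Optional[X]` is represented at runtime as `typing.Union[X, None]`, so code that inspects annotations has to unwrap it explicitly. The sketch below shows one way to do that with the standard library; `unwrap_optional` is a hypothetical helper and is not taken from `generator/generate.py`, whose actual fix may look quite different.

```python
import typing


def unwrap_optional(annotation: object) -> typing.Tuple[object, bool]:
    """Return ``(inner_type, is_optional)`` for a type annotation.

    Hypothetical helper for illustration only. Only the
    ``Union[X, None]`` form produced by ``typing.Optional[X]`` is unwrapped.
    """
    if typing.get_origin(annotation) is typing.Union:
        args = typing.get_args(annotation)
        non_none = [arg for arg in args if arg is not type(None)]
        # Exactly one non-None member plus None means Optional[X].
        if len(non_none) == 1 and len(non_none) != len(args):
            return non_none[0], True
    return annotation, False


if __name__ == "__main__":
    print(unwrap_optional(typing.Optional[int]))
    print(unwrap_optional(str))
```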

### CDP Bindings Regenerated
Regenerated all CDP binding code to be compatible with `chrome-devtools-protocol==0.4.0`, resolving import errors with the generated code.

## Testing & Validation

1. **Unit Tests**: Comprehensive test suite in `tests/test_util.py`
2. **Validation Script**: `validate_utilities.py` verifies all classes and methods exist
3. **Code Quality**: Passed CodeQL security scan with 0 alerts
4. **Examples**: Two working examples demonstrate real-world usage

## Usage Example

Here's a complete example showing the utilities in action:

```python
import trio
from trio_cdp import open_cdp, page, target
from trio_cdp.util import query_selector, wait_for_selector, Keyboard

async def automate_form(cdp_url):
    async with open_cdp(cdp_url) as conn:
        # Get a target
        targets = await target.get_targets()
        target_id = targets[0].target_id

        async with conn.open_session(target_id) as session:
            # Navigate
            await page.enable()
            await page.navigate('https://example.com/form')

            # Wait for and fill form
            name_field = await wait_for_selector(session, 'input[name="name"]', timeout=10)
            if name_field:
                await name_field.type('John Doe')

            # Use keyboard for submission
            keyboard = Keyboard(session)
            await keyboard.press('Enter')
```

## Future Enhancements

Potential additions that maintain the same design philosophy:

1. **Page utilities**: Screenshot helpers, PDF generation utilities
2. **Network utilities**: Request interception helpers, mock response utilities
3. **Cookie utilities**: Easy cookie management
4. **Dialog utilities**: Alert/prompt/confirm handlers
5. **File upload**: File chooser utilities

Each would follow the same pattern: lightweight wrappers around CDP commands with convenient async interfaces.

## Conclusion

This implementation successfully extends `trio-chrome-devtools-protocol` with higher-level utilities while maintaining the library's core principles of being lightweight, pure-CDP, and Trio-native. The utilities provide a more intuitive interface for common automation tasks without sacrificing the power and flexibility of the underlying CDP protocol.
1 change: 1 addition & 0 deletions docs/index.rst
@@ -15,4 +15,5 @@ responses, and events over a single connection.
installation
getting_started
api
utilities
changelog