
[WIP] Extend more abstract functions and classes using API #24

Open
Copilot wants to merge 6 commits into master from copilot/extend-abstract-functions-api

Contributor

Copilot AI commented Nov 16, 2025

Add higher-level utility functions and classes to trio-chrome-devtools-protocol

This PR addresses the issue about extending the library with more abstract functions and classes for common browser automation tasks, inspired by Puppeteer/Pyppeteer.

✅ Completed

Core Implementation

  • Create a trio_cdp/util.py module for higher-level utility functions
  • Implement Keyboard utility class for keyboard input actions
  • Implement Mouse utility class for mouse actions
  • Implement ElementHandle utility class for element interactions
  • Add utility functions for finding and manipulating elements
    • query_selector() - Find element by CSS selector
    • query_selector_all() - Find all matching elements
    • wait_for_selector() - Wait for element to appear
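
A wait_for_selector()-style helper is usually a polling loop with a deadline. The following is a minimal sketch of that idea, not the PR's actual code: `poll_until_found`, `find`, `sleep`, and `clock` are hypothetical names, and the lookup and sleep are passed in so that the loop does not depend on any particular CDP session object.

```python
import time
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def poll_until_found(
    find: Callable[[], Awaitable[Optional[T]]],
    sleep: Callable[[float], Awaitable[None]],
    timeout: float = 30.0,
    interval: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call ``find`` repeatedly until it returns something other than None.

    ``find`` stands in for a query_selector()-style lookup (an assumption);
    ``sleep`` is the async sleep of whichever event loop is in use.
    """
    deadline = clock() + timeout
    while True:
        result = await find()
        if result is not None:
            return result
        # Give up with a clear error once the deadline has passed.
        if clock() >= deadline:
            raise TimeoutError(f"no match within {timeout} seconds")
        await sleep(interval)
```

In a Trio program, `trio.sleep` would be passed as `sleep`; wrapping the loop in `trio.fail_after` is an alternative way to enforce the timeout.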

Documentation & Examples

  • Create comprehensive utilities documentation (docs/utilities.rst)
  • Add form interaction example (examples/form_interaction.py)
  • Add keyboard/mouse example (examples/keyboard_mouse.py)
  • Update README with utilities overview
  • Add implementation summary (UTILITIES_IMPLEMENTATION.md)

Testing & Quality

  • Add test suite for utility functions
  • Add validation script for module structure
  • Fix generator to handle Optional type hints
  • Regenerate CDP bindings for compatibility
  • Pass CodeQL security scan (0 alerts)

🎯 Key Features

Keyboard Class

  • down(), up(), press() - Individual key events
  • type() - Type a string character by character

Mouse Class

  • move() - Move mouse with smooth interpolation
  • click(), down(), up() - Mouse button events
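
"Smooth interpolation" for move() can be read as splitting the path into evenly spaced intermediate points and sending one mouse-move event per point. A minimal sketch of the arithmetic, assuming linear interpolation (the PR may use a different scheme); `interpolate_path` is a hypothetical helper name:

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_path(start: Point, end: Point, steps: int) -> List[Point]:
    """Return ``steps`` evenly spaced points from ``start`` to ``end``.

    The start point itself is excluded (the pointer is already there) and
    the end point is always the last entry.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    x0, y0 = start
    x1, y1 = end
    return [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(1, steps + 1)
    ]
```

Each returned point would then be sent as one `mouseMoved` event through CDP's `Input.dispatchMouseEvent`.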

ElementHandle Class

  • click() - Click on element at its center
  • type() - Focus and type into element
  • get_attribute() - Get HTML attributes
  • get_property() - Get JavaScript properties
  • get_text_content() - Extract text content
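
Clicking "at its center" needs a point derived from the element's geometry. CDP's `DOM.getBoxModel` reports each box as a quad of eight numbers (x1, y1, x2, y2, x3, y3, x4, y4). A minimal sketch of computing the center from such a quad, assuming the content quad is the one used; `quad_center` is a hypothetical helper name:

```python
from typing import Sequence, Tuple


def quad_center(quad: Sequence[float]) -> Tuple[float, float]:
    """Return the center of a CDP quad given as [x1, y1, x2, y2, x3, y3, x4, y4]."""
    if len(quad) != 8:
        raise ValueError("a quad must contain exactly 8 numbers")
    xs = quad[0::2]  # x coordinates of the four corners
    ys = quad[1::2]  # y coordinates of the four corners
    return sum(xs) / 4, sum(ys) / 4
```

Averaging the four corners also works for rotated or skewed elements, where a simple left/top plus half width/height calculation would not.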

🏆 Design Principles

All utilities maintain the library's core principles:

  • Pure CDP - No JavaScript injection
  • Lightweight - Thin wrappers, minimal abstraction
  • Trio-native - Fully async with structured concurrency
  • Type-safe - Complete type hints for IDE support
  • Optional - Core CDP remains available, utilities are opt-in

📚 Decision

Utilities are integrated into the main package rather than creating a separate trio-puppeteer module, as they are lightweight and tightly coupled with CDP concepts.

Original prompt

This section details the original issue you should resolve

<issue_title>Extended more abstract Functions and Classes using API</issue_title>
<issue_description>@mehaase I have been extensively using the #6 API and will soon move to #9. I have a number of functional classes that implement more abstract functions, such as keyboard and mouse actions, plus utility functions for finding, focusing, reading, or manipulating elements. As a result, the end application is very clean and reasonably elegant.

By and large I have been incorporating functionality from the likes of https://github.com/miyakogi/pyppeteer, an unofficial port of https://github.com/puppeteer/puppeteer.

Some key benefits are

  1. the async functions of https://github.com/python-trio/trio provided through https://github.com/HyperionGray/trio-chrome-devtools-protocol.
  2. keep the application to pure CDP with no JavaScript injection or manipulation
  3. Lightweight

I feel many of the classes I have created are worth putting up on GitHub, but under what structure? Do we add on to trio-chrome-devtools-protocol or create another module, trio-puppeteer?
There are many pros and cons, but I have only implemented a small (but to me very useful) subset of Puppeteer.

Interested in perspectives from @mehaase or other users or contributors who may have ideas and relevant comments on structure and naming.
</issue_description>

Comments on the Issue (you are @copilot in this section)




Note

Introduces documentation and examples for new high-level utilities and updates tooling/deps.

  • Adds comprehensive utilities docs (docs/utilities.rst), links them in docs/index.rst, and expands README.md with features and a utilities example
  • New examples/form_interaction.py and examples/keyboard_mouse.py showcasing keyboard/mouse and element interactions
  • Adds UTILITIES_IMPLEMENTATION.md summarizing the utilities design and scope
  • Fixes generator/generate.py to format typing.Optional annotations
  • Refreshes poetry.lock with broad dependency updates

Written by Cursor Bugbot for commit 7a1cb46.

Copilot AI and others added 5 commits November 16, 2025 04:13
Co-authored-by: P4X-ng <223870169+P4X-ng@users.noreply.github.com>

chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7a1cb466fb


Comment thread trio_cdp/util.py
Comment on lines +273 to +277
Get the text content of the element.

:return: Text content
"""
outer_html = await self.session.execute(dom.get_outer_html(self.node_id))

P2: Return text content instead of outer HTML

get_text_content promises text but returns dom.get_outer_html, which includes tags/attributes; any caller expecting plain text (e.g., comparing "Hello") will instead get "<span>Hello</span>" and similar markup. This makes the method unreliable for text extraction and breaks consumers when elements have nested tags. Consider using textContent via runtime.call_function_on or a dedicated CDP call so the method name matches behavior.
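
One way to stay within the PR's "no JavaScript injection" principle would be to keep the `dom.get_outer_html` call and strip the markup on the Python side. The following is a minimal sketch using only the standard library; `text_from_outer_html` is a hypothetical helper, and the result only approximates the browser's `textContent` (it works from the serialized HTML, not the live DOM).

```python
from html.parser import HTMLParser
from typing import List


class _TextCollector(HTMLParser):
    """Collect character data and ignore tags, attributes, and comments."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: List[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def text_from_outer_html(outer_html: str) -> str:
    """Return the concatenated text nodes found in ``outer_html``."""
    collector = _TextCollector()
    collector.feed(outer_html)
    collector.close()  # flush any text still buffered by the parser
    return "".join(collector.parts)
```

With this helper, `text_from_outer_html("<span>Hello</span>")` yields `"Hello"`, so a comparison such as `text == "Hello"` behaves as the method name suggests.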


Comment thread trio_cdp/util.py
await self.press(char, delay)
if delay > 0:
import trio
await trio.sleep(delay)

Double delay in Keyboard.type() method

Medium Severity

The Keyboard.type() method applies the delay twice per character. It passes delay to press() which uses it as the keydown-to-keyup hold time, then calls trio.sleep(delay) again after each press. The documentation states the delay is "Time to wait between key presses" but the actual behavior doubles this time: once during press() and once after. If a user specifies delay=0.1, each character takes 0.2 seconds instead of the expected 0.1 second gap between presses. The call to press(char, delay) should likely be press(char) without passing the delay.
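
A minimal sketch of the suggested fix, written as a free function so it can be exercised without a browser. `type_text`, `press`, and `sleep` are hypothetical names, not the PR's API, and sleeping only between presses (not after the last one) is an assumption about what "between key presses" should mean.

```python
from typing import Awaitable, Callable


async def type_text(
    text: str,
    press: Callable[[str], Awaitable[None]],
    sleep: Callable[[float], Awaitable[None]],
    delay: float = 0.0,
) -> None:
    """Press each character of ``text``, waiting ``delay`` between presses."""
    last_index = len(text) - 1
    for index, char in enumerate(text):
        # The delay is deliberately not forwarded to press(), so it is
        # applied once per gap rather than once inside press() and once here.
        await press(char)
        if delay > 0 and index < last_index:
            await sleep(delay)
```

In the real class, `press` would be the bound `Keyboard.press` method and `sleep` would be `trio.sleep`.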


target_id = t.target_id
break

logger.info('Attaching to target id=%s', target_id)

Undefined variable if no matching target found

Medium Severity

The target_id variable is only assigned inside the for loop when a matching target is found. If targets is empty or no target matches the criteria (e.g., all pages are attached or are devtools pages), target_id is never assigned. When line 46 attempts to use target_id, this causes a NameError crash with an unhelpful error message rather than a clear explanation that no suitable target was found.
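
A minimal sketch of a guard against this, assuming the selection criteria are the ones implied above (a page target that is not yet attached and is not a devtools page). `TargetInfo` here is a simplified stand-in for the CDP type, and `pick_page_target` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TargetInfo:
    """Simplified stand-in for the fields of CDP's Target.TargetInfo used here."""

    target_id: str
    type_: str
    url: str
    attached: bool


def pick_page_target(targets: Iterable[TargetInfo]) -> str:
    """Return the id of the first usable page target, or raise a clear error."""
    candidates = list(targets)
    for t in candidates:
        is_devtools = t.url.startswith("devtools://")
        if t.type_ == "page" and not t.attached and not is_devtools:
            return t.target_id
    # Reaching this point means the loop never matched, so fail explicitly
    # instead of letting a later use of an unassigned name raise NameError.
    raise RuntimeError(
        f"no unattached, non-devtools page target among {len(candidates)} target(s)"
    )
```

Initializing `target_id = None` before the loop and checking it afterwards would achieve the same result with a smaller change to the existing example.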

Additional Locations (1)


Comment thread trio_cdp/util.py
# This is a simplified approach - in production you might want to use
# runtime evaluation for more accurate text extraction
# For now, return the HTML (caller can parse it if needed)
return outer_html

get_text_content returns HTML not text content

Medium Severity

The get_text_content() method is named and documented as returning "text content" but actually returns outer HTML via dom.get_outer_html(). Code using this method expecting plain text (e.g., for string comparison like if text == "Submit") will fail because it receives HTML markup like <button>Submit</button> instead. The implementation contradicts the function's name, return type documentation, and all external docs that describe it as extracting text content.



Development

Successfully merging this pull request may close these issues.

Extended more abstract Functions and Classes using API

2 participants