Commit eeaaa24

Merge pull request #9 from HyperionGray/new_api
New API
2 parents 65fa030 + cbcacc4 commit eeaaa24

69 files changed

Lines changed: 8559 additions & 241 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
.coverage
22
*.egg-info
3+
.ipynb_checkpoints
34
.mypy_cache
45
.vscode
56
__pycache__

.travis.yml

Lines changed: 1 addition & 1 deletion
@@ -13,4 +13,4 @@ install:
1313
- pip install -r requirements.txt
1414

1515
script:
16-
- make test
16+
- make

Makefile

Lines changed: 23 additions & 3 deletions
@@ -1,9 +1,29 @@
1-
.PHONY: test
1+
.PHONY: docs
22

3-
publish: test
3+
default: mypy-generate test-generate generate test-import mypy-cdp test-cdp
4+
5+
docs:
6+
$(MAKE) -C docs html
7+
8+
generate:
9+
python generator/generate.py
10+
11+
mypy-cdp:
12+
mypy trio_cdp/
13+
14+
mypy-generate:
15+
mypy generator/
16+
17+
publish: test-import mypy-cdp test-cdp
418
rm -fr dist trio_chrome_devtools_protocol.egg-info
519
python setup.py sdist
620
twine upload dist/*
721

8-
test:
22+
test-cdp:
923
pytest tests/ --cov=trio_cdp --cov-report=term-missing
24+
25+
test-generate:
26+
pytest generator/
27+
28+
test-import:
29+
python -c 'import trio_cdp; print(trio_cdp.accessibility)'

README.md

Lines changed: 19 additions & 166 deletions
@@ -4,6 +4,7 @@
44
![Python Versions](https://img.shields.io/pypi/pyversions/trio-chrome-devtools-protocol)
55
![MIT License](https://img.shields.io/github/license/HyperionGray/trio-chrome-devtools-protocol.svg)
66
[![Build Status](https://img.shields.io/travis/com/HyperionGray/trio-chrome-devtools-protocol.svg?branch=master)](https://travis-ci.com/HyperionGray/trio-chrome-devtools-protocol)
7+
[![Read the Docs](https://img.shields.io/readthedocs/trio-cdp.svg)](https://trio-cdp.readthedocs.io)
78

89
This Python library performs remote control of any web browser that implements
910
the Chrome DevTools Protocol. It is built using the type wrappers in
@@ -12,180 +13,32 @@ I/O using [Trio](https://trio.readthedocs.io/). This library handles the
1213
WebSocket negotiation and session management, allowing you to transparently
1314
multiplex commands, responses, and events over a single connection.
1415

15-
The example demonstrates the salient features of the library.
16+
The example below demonstrates the salient features of the library by navigating to a
17+
web page and extracting the document title.
1618

1719
```python
20+
from trio_cdp import open_cdp, page, dom, target
21+
1822
async with open_cdp(cdp_url) as conn:
1923
# Find the first available target (usually a browser tab).
20-
targets = await conn.execute(target.get_targets())
24+
targets = await target.get_targets()
2125
target_id = targets[0].id
2226

2327
# Create a new session with the chosen target.
24-
session = await conn.open_session(target_id)
25-
26-
# Navigate to a website.
27-
await session.execute(page.enable())
28-
async with session.wait_for(page.LoadEventFired):
29-
await session.execute(page.navigate(target_url))
30-
31-
# Extract the page title.
32-
root_node = await session.execute(dom.get_document())
33-
title_node_id = await session.execute(dom.query_selector(root_node.node_id,
34-
'title'))
35-
html = await session.execute(dom.get_outer_html(title_node_id))
36-
print(html)
37-
```
38-
39-
We'll go through this example bit by bit. First, it starts with a context
40-
manager:
41-
42-
```python
43-
async with open_cdp(cdp_url) as conn:
44-
```
45-
46-
This context manager opens a connection to the browser when the block is entered
47-
and closes the connection automatically when the block exits. Now we have a
48-
connection to the browser, but the browser has multiple targets that can be
49-
operated independently. For example, each browser tab is a separate target. In
50-
order to interact with one of them, we have to create a session for it.
51-
52-
```python
53-
targets = await conn.execute(target.get_targets())
54-
target_id = targets[0].id
55-
```
56-
57-
The first line here executes the `get_targets()` command in the browser. Note
58-
the form of the command: `await conn.execute(...)` will send a command to the
59-
browser, parse its response, and return a value (if any). The command is one of
60-
the methods in the PyCDP package. Trio CDP multiplexes commands and responses on
61-
a single connection, so we can send commands concurrently if we want, and the
62-
responses will be routed back to the correct task.
63-
64-
In this case, the command is `target.get_targets()`, which returns a list of
65-
`TargetInfo` objects. We grab the first object and extract its `target_id`.
66-
67-
```python
68-
session = await conn.open_session(target_id)
69-
```
70-
71-
In order to connect to a target, we open a session based on the target ID.
72-
73-
```python
74-
await session.execute(page.enable())
75-
async with session.wait_for(page.LoadEventFired):
76-
await session.execute(page.navigate(target_url))
77-
```
78-
79-
Here we use the session (remember, it corresponds to a tab in the browser) to
80-
navigate to the target URL. Just like the connection object, the session object
81-
has an `execute(...)` method that sends a command to the target, parses the
82-
response, and returns a value (if any).
83-
84-
This snippet also introduces another concept: events. When we ask the browser to
85-
navigate to a URL, it acknowledges our request with a response, then starts the
86-
navigation process. How do we know when the page is actually loaded, though?
87-
Easy: the browser can send us an event!
88-
89-
We first have to enable page-level events by calling `page.enable()`. Then we
90-
use `session.wait_for(...)` to wait for an event of the desired type. In this
91-
example, the script will suspend until it receives a `page.LoadEventFired`
92-
event. (After this block finishes executing, you can run `page.disable()` to
93-
turn off page-level events if you want to save some bandwidth and processing
94-
power, or you can use the context manager `async with session.page_enable(): ...`
95-
to automatically enable page-level events just for a specific block.)
96-
97-
Note that we wait for the event inside an `async with` block, and we do this
98-
_before_ executing the command that will trigger this event. This order of
99-
operations may be surprising, but it avoids race conditions. If we executed a
100-
command and then tried to listen for an event, the browser might fire the event
101-
very quickly before we have had a chance to set up our event listener, and then
102-
we would miss it! The `async with` block sets up the listener before we run the
103-
command, so that no matter how fast the event fires, we are guaranteed to catch
104-
it.
105-
106-
```python
107-
root_node = await session.execute(dom.get_document())
108-
title_node_id = await session.execute(
109-
dom.query_selector(root_node.node_id, 'title'))
110-
html = await session.execute(dom.get_outer_html(title_node_id))
111-
print(html)
112-
```
113-
114-
The last part of the script navigates the DOM to find the `<title>` element.
115-
First we get the document's root node, then we query for a CSS selector, then
116-
we get the outer HTML of the node. This snippet shows some new APIs, but the
117-
mechanics of sending commands and getting responses are the same as the previous
118-
snippets.
28+
async with conn.open_session(target_id) as session:
11929

120-
A more complete version of this example can be found in `examples/get_title.py`.
121-
There is also a screenshot example in `examples/screenshot.py`. The unit tests
122-
in `tests/` also provide more examples.
30+
# Navigate to a website.
31+
async with session.page_enable():
32+
async with session.wait_for(page.LoadEventFired):
33+
await session.execute(page.navigate(target_url))
12334

124-
To run the examples, you need a Chrome binary in your system. You can get one like this:
125-
126-
## Running Examples on MacOS
127-
128-
**Terminal 1**
129-
130-
This sets up the chrome browser in a specific version, and runs it in debug mode with Tor proxy for network traffic.
131-
132-
```
133-
wget https://www.googleapis.com/download/storage/v1/b/chromium-browser-snapshots/o/Mac%2F678035%2Fchrome-mac.zip?generation=1563322360871926&alt=media
134-
unzip chrome-mac.zip && rm chrome-mac.zip
135-
./chrome-mac/Chromium.app/Contents/MacOS/Chromium --remote-debugging-port=9000
136-
> DevTools listening on ws://127.0.0.1:9000/devtools/browser/<DEV_SESSION_GUID>
137-
```
138-
139-
**Terminal 2**
140-
141-
This runs the example browser automation script on the instantiated browser window.
142-
143-
```bash
144-
python examples/get_title.py ws://127.0.0.1:9000/devtools/browser/<DEV_SESSION_GUID> https://hyperiongray.com
145-
```
146-
147-
## Running Examples on Linux
148-
149-
**Terminal 1**
150-
151-
This sets up the chrome browser in a specific version, and runs it in debug mode with Tor proxy for network traffic.
152-
153-
```
154-
wget https://storage.googleapis.com/chromium-browser-snapshots/Linux_x64/678025/chrome-linux.zip
155-
unzip chrome-linux.zip && rm chrome-linux.zip
156-
./chrome-linux/chrome --remote-debugging-port=9000
157-
> DevTools listening on ws://127.0.0.1:9000/devtools/browser/<DEV_SESSION_GUID>
35+
# Extract the page title.
36+
root_node = await session.execute(dom.get_document())
37+
title_node_id = await session.execute(dom.query_selector(root_node.node_id,
38+
'title'))
39+
html = await session.execute(dom.get_outer_html(title_node_id))
40+
print(html)
15841
```
15942

160-
**Terminal 2**
161-
162-
This runs the example browser automation script on the instantiated browser window.
163-
164-
```bash
165-
python examples/get_title.py ws://127.0.0.1:9000/devtools/browser/<DEV_SESSION_GUID> https://hyperiongray.com
166-
```
167-
168-
## Changelog
169-
170-
### 0.5.0
171-
172-
* **Backwards Compability Break:** Rename `open_cdp_connection()` to `open_cdp()`.
173-
* Fix `ConnectionClosed` bug.
174-
175-
### 0.4.0
176-
177-
* Add support for passing in a nursery. (Supports usage in Jupyter notebook.)
178-
179-
### 0.3.0
180-
181-
* New APIs for enabling DOM events and Page events.
182-
183-
### 0.2.0
184-
185-
* Restructure event listeners.
186-
187-
### 0.1.0
188-
189-
* Initial version
190-
191-
<a href="https://www.hyperiongray.com/?pk_campaign=github&pk_kwd=trio-cdp"><img alt="define hyperion gray" width="500px" src="https://hyperiongray.s3.amazonaws.com/define-hg.svg"></a>
43+
This example code is explained [in the documentation](https://trio-cdp.readthedocs.io)
44+
and more example code can be found in the `examples/` directory of this repository.
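
The README example opens `session.wait_for(page.LoadEventFired)` before it sends `page.navigate(...)`, so the listener already exists when the event fires. The sketch below illustrates that ordering with a toy session. It is not the trio_cdp implementation: `FakeSession`, `fire()`, and `navigate()` are hypothetical names, and `asyncio` stands in for Trio only to keep the example within the standard library.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeSession:
    """Hypothetical stand-in for a CDP session; not the trio_cdp class."""

    def __init__(self):
        self._listeners = []

    @asynccontextmanager
    async def wait_for(self, event_name):
        # Register the listener *before* the block body runs.
        fut = asyncio.get_running_loop().create_future()
        entry = (event_name, fut)
        self._listeners.append(entry)
        try:
            yield fut
            # Leaving the block suspends until the event has been delivered.
            await fut
        finally:
            self._listeners.remove(entry)

    def fire(self, event_name, payload):
        # Deliver an event to every listener waiting for that name.
        for name, fut in self._listeners:
            if name == event_name and not fut.done():
                fut.set_result(payload)

    async def navigate(self, url):
        # Simulate a browser that fires the load event immediately.
        self.fire("LoadEventFired", url)


async def main():
    session = FakeSession()
    async with session.wait_for("LoadEventFired") as loaded:
        await session.navigate("https://example.com")
    return loaded.result()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If the two steps were swapped (navigate first, then start waiting), the toy `fire()` would find no registered listener and the event would be lost, which is the race the `async with` ordering avoids.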

docs/.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
_build

docs/Makefile

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
1+
# Minimal makefile for Sphinx documentation
2+
#
3+
4+
# You can set these variables from the command line, and also
5+
# from the environment for the first two.
6+
SPHINXOPTS ?=
7+
SPHINXBUILD ?= sphinx-build
8+
SOURCEDIR = .
9+
BUILDDIR = _build
10+
11+
# Put it first so that "make" without argument is like "make help".
12+
help:
13+
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
14+
15+
.PHONY: help Makefile
16+
17+
# Catch-all target: route all unknown targets to Sphinx using the new
18+
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
19+
%: Makefile
20+
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

docs/api.rst

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
1+
API
2+
===
3+
4+
Trio CDP replicates the entire API of PyCDP. For example, PyCDP has a ``cdp.dom`` module,
5+
while in Trio CDP, we have ``trio_cdp.dom``. The Trio CDP version has all of the same data
6+
types, commands, and events as the PyCDP version.
7+
8+
The only difference between the two libraries is that the Trio CDP commands are ``async
9+
def`` functions that can be called from Trio, whereas the PyCDP commands are generators
10+
that must be executed in a special way. This document explains both flavors and how to
11+
use them.
12+
13+
You should consult the `PyCDP documentation
14+
<https://py-cdp.readthedocs.io/en/latest/api.html>`_ for a complete reference of the
15+
data types, commands, and events that are available.
16+
17+
Simplified API
18+
--------------
19+
20+
.. highlight:: python
21+
22+
The simplified API allows you to await CDP commands directly. For example, to run a CSS
23+
query to get a blockquote element, the code is:
24+
25+
.. code::
26+
27+
from trio_cdp import dom
28+
29+
...other code...
30+
31+
node_id = await dom.query_selector(root, 'blockquote')
32+
33+
As you can see in `the PyCDP documentation
34+
<https://py-cdp.readthedocs.io/en/latest/api/dom.html#cdp.dom.query_selector>`_, the
35+
command takes a ``NodeId`` as its first argument and a string as its second argument.
36+
The arguments in Trio CDP are always the same as in PyCDP.
37+
38+
The return types in Trio CDP are different from PyCDP, however. The PyCDP command is a
39+
generator, whereas the Trio CDP command is an ``async def``. The PyCDP command
40+
signature shows ``Generator[Dict[str, Any], Dict[str, Any], NodeId]``. The generator
41+
contains 3 parts. The first two parts are always the same: ``Dict[str, Any]``. The third
42+
part indicates the real return type, and that is the type that the Trio CDP command will
43+
return. In this case, it returns a ``NodeId``.
44+
45+
If you have code completion in your Python IDE, it will help you see what the return
46+
type is for each Trio CDP command.
47+
48+
.. note::
49+
50+
In order for this calling style to work, you must be inside a "session
51+
context", i.e. your code must be nested in (or called from inside of) an ``async with
52+
conn.open_session()`` block. If you try calling this from outside of a session context,
53+
you will get an exception.
54+
55+
Low-level API
56+
-------------
57+
58+
The low-level API is a bit more verbose, but you may find it necessary or preferable in
59+
some situations. With this API, you import commands from PyCDP and pass them into the
60+
session ``execute()`` method. Taking the same example as the previous section, here is
61+
how you would execute a CSS query:
62+
63+
.. code::
64+
65+
from cdp import dom
66+
67+
...other code...
68+
69+
node_id = await session.execute(dom.query_selector(root, 'blockquote'))
70+
71+
If you compare this example to the example in the previous section, there are two big changes.
72+
First, ``dom`` is imported from PyCDP instead of Trio CDP. This means that the command is a generator,
73+
not an ``async def``.
74+
75+
Second, in order to run the command on a given session, we have to call that session's
76+
``execute()`` method and pass in the PyCDP generator.
77+
78+
Other than being a little more verbose (calling ``session.execute(...)`` for every CDP
79+
command), the low-level API is otherwise very similar to the simplified API described in
80+
the previous section. It still takes the same arguments and returns the same type (here
81+
a ``NodeId``).
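
The relationship between the two calling styles described in ``docs/api.rst`` can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: ``query_selector`` here is a hand-written generator that mimics the shape of a PyCDP command, ``FakeSession`` always answers with node ID 42, and ``open_session`` / ``simplified_query_selector`` are hypothetical helpers that use a ``contextvars.ContextVar`` as one plausible way to find the current session. ``asyncio`` stands in for Trio to keep the sketch within the standard library.

```python
import asyncio
import contextvars
from contextlib import contextmanager
from typing import Any, Dict, Generator


def query_selector(node_id: int, selector: str) -> Generator[Dict[str, Any], Dict[str, Any], int]:
    """Hypothetical PyCDP-style command: yield a request, receive a response, return a value."""
    response = yield {
        "method": "DOM.querySelector",
        "params": {"nodeId": node_id, "selector": selector},
    }
    return response["nodeId"]


class FakeSession:
    """Hypothetical session that answers every request with nodeId 42."""

    async def execute(self, cmd):
        request = next(cmd)        # first part: the outgoing request dict
        assert "method" in request
        response = {"nodeId": 42}  # a real session would send the request to the browser
        try:
            cmd.send(response)     # second part: the incoming response dict
        except StopIteration as done:
            return done.value      # third part: the command's real return value
        raise RuntimeError("command generator did not finish")


# Assumption: the "session context" is tracked with a context variable.
_current_session = contextvars.ContextVar("current_session")


@contextmanager
def open_session(session):
    token = _current_session.set(session)
    try:
        yield session
    finally:
        _current_session.reset(token)


async def simplified_query_selector(node_id, selector):
    # Simplified style: the session is looked up implicitly.
    session = _current_session.get()  # raises LookupError outside a session context
    return await session.execute(query_selector(node_id, selector))


async def main():
    session = FakeSession()
    low_level = await session.execute(query_selector(1, "blockquote"))
    with open_session(session):
        simplified = await simplified_query_selector(1, "blockquote")
    return low_level, simplified


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both calls take the same arguments and return the same value; the simplified form only hides the ``session.execute(...)`` step, which is why it fails when no session context is active.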

docs/changelog.rst

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
1+
Changelog
2+
=========
3+
4+
0.6.0
5+
-----
6+
7+
* Simplified calling convention for CDP commands.
8+
9+
0.5.0
10+
-----
11+
12+
* **Backwards Compatibility Break:** Rename ``open_cdp_connection()`` to ``open_cdp()``.
13+
* Fix ``ConnectionClosed`` bug.
14+
15+
0.4.0
16+
-----
17+
18+
* Add support for passing in a nursery. (Supports usage in Jupyter notebook.)
19+
20+
0.3.0
21+
-----
22+
23+
* New APIs for enabling DOM events and Page events.
24+
25+
0.2.0
26+
-----
27+
28+
* Restructure event listeners.
29+
30+
0.1.0
31+
-----
32+
33+
* Initial version
