Skip to content

Track: manual testing findings from 0.7.2 (2026-04-16)#227

Open
szymdzum wants to merge 29 commits intomainfrom
bug-tracking/test-report-2026-04-16
Open

Track: manual testing findings from 0.7.2 (2026-04-16)#227
szymdzum wants to merge 29 commits intomainfrom
bug-tracking/test-report-2026-04-16

Conversation

@szymdzum
Copy link
Copy Markdown
Owner

@szymdzum szymdzum commented Apr 16, 2026

Tracking PR for manual testing findings from v0.7.2.
Full report: docs/quality/MANUAL_TESTING_2026-04-16.md.

Status: 26 of 29 addressed in this branch. Three remaining (#9, #10, #23) couldn't be reproduced on current code and need a concrete repro before they can be attempted. Each fix is its own focused commit (run git log --oneline main..HEAD on this branch for the breakdown).

🔴 Critical (7/7)

🟡 Significant (13/15)

🟠 Minor (6/7)

Output-control story (added by #17 + #24 + #26)

  • --json → machine-readable
  • Default → human-readable with hints/next-steps
  • --quiet / -q → global; strips tips, next-steps, suggestions, commands footers
  • --verbose → on session start, force landing page when piped
  • Session-start landing page auto-suppressed when stdout isn't a TTY

szymdzum added 21 commits April 16, 2026 18:52
…very

generateSelector() produced input[name="X"] for any name-bearing input,
which collided across radios/checkboxes sharing a name. The resolver then
treated the preview selector as uniquely identifying and dropped the
index, so dom click/fill on a grouped input always hit the first match.

For radios and checkboxes that carry a value, append [value="..."] to
the preview so each element in a group resolves uniquely.
fetchNetworkRequests() called fetchPreviewData() with no lastN argument,
so the daemon's peek handler applied its 10-item default and truncated
the response before the list command ever saw it. The --last option
operated on that pre-truncated slice, so users couldn't retrieve more
than 10 requests regardless of the value (the HAR export path, which
uses a different IPC call, kept working).

Pass lastN=0 (unlimited) and let the list command's own --last do the
slicing, matching the pattern fetchConsoleMessages already uses.
status.ts suggested 'bdg query <script>' (the command is 'bdg dom eval')
and 'bdg tabs' (no such command). dom.ts suggested 'bdg details dom N'
(details only supports network/console). All three sent users down
dead-end paths.

- status 'Commands' section: 'Query browser: bdg dom eval <script>'
- status 'Suggestions' section: drop the 'List Chrome tabs' entry
- dom query 'Next steps': point to 'bdg dom a11y describe N' instead
handleIndexGetSemantic() passed the string 'cached query' as the
selector argument to elementAtIndexNotFoundError, so users saw
'Re-run "bdg dom query cached query" to refresh the cache' — where
'cached query' reads as a required argument name.

Make selector optional on the error helper and emit a generic
'Re-run your original query' suggestion when the caller doesn't know
the original selector.
formatHeaderSection prepended a colon to the key before calling
keyValue(), which adds its own colon for the padding. Result was
'Referer::  https://...' throughout headers output. Let keyValue own
the colon.
…ment

findTargetRequestForHeaders() fell back to any request whose mimeType
contained 'html' when no current-nav Document was found, and further
fell back to any request with response headers. A favicon.ico 404
response body typed 'text/html' would win the mimeType fallback, so
'bdg network document' returned the favicon on pages like /post. With
no html-mimed request at all, 'bdg network headers' would return an
arbitrary request (the headers fallback).

Drop both loose fallbacks. Prefer the current-navigation Document,
then any Document resource, then error — resourceType is the only
signal that reliably distinguishes the main document from collateral.

Contract tests updated to encode resourceType-based selection and to
verify favicon-with-html-mimetype no longer wins.
The default action treats any bare argument as a URL, so 'bdg statsu'
was interpreted as 'http://statsu' and launched a Chrome session. Real
subcommand names get routed by Commander first, but their misspellings
fall through to the [url] positional.

Before URL validation, check the argument against the registered
command names with a Levenshtein-2 threshold and error out with a 'Did
you mean' suggestion. Truly bogus single-token strings (no dots/slashes/
scheme and not a command typo) still reach Chrome — scope-limited fix
focused on the typo case the report called out.
Commander treats '' as a present argument, so 'bdg dom query ""' slipped
through and reached the CDP query helper which silently returned zero
nodes and a success exit code. Agents couldn't tell the difference
between 'selector matched nothing' (exit 0, valid answer) and 'you
passed an empty string' (exit 81, usage error).

Guard the selector at the command boundary.
…nothing

buildHeader fell into the 'last X of Y' branch even when the filter
produced zero matches, so empty results read as 'last 0 of 10' — which
is technically true but reads like a paging error. Surface the intent:
'(no matches; 10 captured)' when the filter emptied the list, '(none
captured)' when there were no requests to begin with.
'bdg https://example.com status' silently ignored 'status' and launched
a session because the root Commander program accepted arbitrary excess
arguments. Turn on allowExcessArguments(false) on the root so any
trailing positional after [url] raises Commander's 'too many arguments'
error instead of being dropped.

Subcommands declare their own arguments and are unaffected.
Two linked a11y issues:

1. Bare 'bdg dom a11y' printed the help block instead of doing anything,
   even though the search argument is documented as optional. Route
   bare invocations to handleA11yTree so the subcommand dumps the tree
   — the most useful default for an a11y-shorthand.

2. 'bdg dom a11y Example' wrapped the term in asterisks to form
   'name:*Example*', which parseQueryPattern then passed through to
   matchesPattern where includes() looks for the literal string
   '*example*'. A heading named 'Example Domain' didn't contain the
   asterisks so the match failed.

   Drop the asterisk wrap in the bare-mode shorthand and strip
   leading/trailing '*' wildcards in parseQueryPattern. Users who habit
   in asterisks now get the expected substring match; users who don't
   never had to type them in the first place.
The empty-state guard checked options.network === undefined, but
Commander defaults the flag to false, so 'bdg peek --console' never
hit the marker branch when there were no messages — users saw the
PREVIEW header, skipped straight to the tip line, and couldn't tell
the difference between 'no data captured' and 'command silently
misbehaved'.

Gate on options.console / !options.network / hasConsoleData so the
empty console section renders its '(none)' marker in both the user-
asked-for-console and default cases, matching the DOM block.
…%O, %%)

formatConsoleArgs blindly joined every RemoteObject with spaces, so
console.log('%cFoo', 'color:red') rendered as '%cFoo color:red' —
the format directive and its CSS argument both leaked into the text,
badly on pages with lots of %c styling.

Detect a format string in the first argument and apply the WHATWG
console substitution rules for the common directives: %s/%d/%i/%f
consume and format their arg, %o/%O render structured objects, %c
silently drops the CSS arg, %% renders a literal %. Unknown directives
pass through as literal text.
formatConsole chose summarize mode whenever --list wasn't set, so
'bdg console --level info' rendered the error/warning summary header
(and 'No errors or warnings found') on an already-filtered info stream.

Treat an explicit --level as implying chronological mode — the
summary's deduplicated errors/warnings layout has no meaningful
equivalent for info/debug.
Peek's --network/--console/--dom flags were documented as filters but
implemented as 'addSections'. The gates only dropped the network block
when --console was set and vice-versa — --dom never suppressed network
or console, so 'bdg peek --dom' printed everything plus the tree.

Introduce a hasExplicitFilter signal: when any of network/console/dom
is explicitly set, only the requested sections render. With no flags,
the default (network + console) stays; DOM remains opt-in due to the
tree's verbosity.
formatDomEval ran every result through JSON.stringify, so a script
returning a string came back wrapped in quotes — agents parsing
document.title had to strip them (and couldn't tell a plain string
apart from an accidentally-JSON-encoded one). Emit strings raw;
numbers/objects/booleans/null keep JSON formatting so they remain
round-trippable.
… none

The main entrypoint called ensureDaemonRunning() for every non-
documentation invocation, so status and stop spawned (and then tore
down) an entire worker just to learn what the caller already implied
by running them: there's no session yet. Slow and wasteful on clean
systems.

Gate the spawn: for the status and stop commands, only launch a daemon
when one is already running (reconnect path). Otherwise let each
command's existing daemon-connection-error handler produce the
appropriate 'not running' message without the side effect.
--brief dropped the Value column entirely, so after running fill/click
agents had to switch to the full table to verify the change landed —
defeating the point of a quick-state summary.

Add a compact VAL column that shows the field's current state:
- text-like fields → quoted value, or ∅ when empty
- checkbox/radio/switch → [x] / [ ]

Widen the brief table to 65 cols to fit the new column without
truncating labels further.
…GUMENTS)

Commander's default exit handler collapsed every usage failure — unknown
option, missing required argument, too-many arguments, invalid argument
value — onto exit 1, which scripts couldn't distinguish from a real
crash.

Install exitOverride on the root program to translate the usage-level
Commander error codes to EXIT_CODES.INVALID_ARGUMENTS while letting
help/version still exit 0. Commander prints its own error text to
stderr, so the override only needs to own the exit code.
setTargetInfo() was only called during initial Chrome setup; subsequent
navigations (location.href assignment, pushState-style routing, server
redirects) left the cached URL and title frozen at the original values.
Agents checking 'bdg status' after a navigation got stale state and
couldn't use it to confirm page changes.

- Thread an onMainFrameNavigation callback through startNavigationTracking
  and subscribe to Page.frameNavigated + Page.navigatedWithinDocument so
  in-document routing updates the URL too.
- Update the stored URL immediately from the event (cheap, event-driven);
  refresh the title via Runtime.evaluate on Page.loadEventFired plus
  scheduled retries, since the event fires before the new document's
  <title> is ready.
- Fall back to host+pathname when document.title is empty so the format
  matches what Chrome's /json endpoint populated at session start.
@szymdzum szymdzum self-assigned this Apr 16, 2026
…submit

waitForCompletion only subscribed to Page.frameNavigated when the caller
passed --wait-navigation, so a default 'bdg dom submit' that POSTed and
then navigated came back with Navigation: no because the listener was
never attached — navigationOccurred stayed at its initial false.

Always subscribe to Page.frameNavigated so the reported flag reflects
what happened; waitNavigation continues to control whether we block
for navigation. Filter to main-frame events so subframe loads during
submit (ads, iframes) don't falsify the flag.
The soft deprecation had lived long enough that users were trained to
ignore it while scripts silently depended on the shorthand. Cut it:
any invocation of 'bdg peek --network' now fails with exit 81 and a
message pointing to 'bdg network list' — the strict superset
replacement. Option description updated so 'bdg peek --help' reflects
the removal.
… --verbose

The 50+ line post-start walkthrough is genuinely useful when a human
runs bdg at their terminal for the first time, but it drowns script
and CI output where nobody reads it. Drive the choice off
process.stdout.isTTY with explicit overrides:

- TTY → landing page (unchanged for interactive users)
- piped/redirected → 'Session started: <url>' only
- --verbose → force the landing page even when piped
- --quiet → already forced minimal; stays that way

No breaking change for interactive usage, and automation no longer
needs --quiet purely to tame the banner.
--quiet was scoped to 'bdg <url>' via the start command's own option
list, so 'bdg --quiet status' parsed the flag (applyCollectorOptions
attaches to the root) but no other formatter consulted it. Users got a
flag that looked universal but only did anything at session start.

Contract: --quiet suppresses tips, 'Next steps:', 'Suggestions:', and
'Commands:' sections while keeping the primary data/result. --json
continues to be the machine-readable path; the two compose.

Covered in this pass (where the noise actually lives):
- bdg status — drops the 'Commands:' footer
- bdg status (no session) — drops the 'Suggestions:' block
- bdg dom query — drops both 'Next steps:' and 'Suggestions:'
- bdg peek — drops the 'Tip:' tail in compact and verbose modes
- bdg dom submit — drops the 'Next steps:' block and updates its
  reference to point at 'bdg network list' instead of the now-removed
  'bdg peek --network'
- daemon-startup log lines on the main entrypoint are also silenced

Commands without decorative sections accept the flag as a global and
no-op on it.
@szymdzum szymdzum marked this pull request as ready for review April 16, 2026 17:58
Four regressions/new bugs surfaced in the 2026-04-16 retest:

- network headers <bogus-id> surfaced the worker's "not found" error via
  IPCError default, landing on exit 104 (UNHANDLED_EXCEPTION). Check
  response.status explicitly and return RESOURCE_NOT_FOUND (83) with a
  "list captured requests" suggestion. Same pattern for the no-arg
  document alias when nothing is captured.

- cdp <method> with a CDP "Invalid parameters" reply also fell through
  to 104. Map likely-user-input CDP errors (invalid parameters / must
  have / is required) to INVALID_ARGUMENTS (81) and point at
  `--describe` for the parameter schema.

- dom query "Next steps -> Extract text" rendered
  `bdg cdp Runtime.evaluate --params '{"expression": "document.querySelector('h1').textContent"}'`
  The outer single-quote wrapper collided with the inner selector quotes,
  so pasted verbatim the shell split the string and the CDP call ran
  `document.querySelector(h1)` -> ReferenceError. Switch to
  `bdg dom eval "document.querySelector('sel').textContent"` which is
  bash-safe (double-quoted outer, single-quoted inner, no nesting).

- dom get <index> stale-cache path threw CommandError with
  RESOURCE_NOT_FOUND (83), but CLAUDE.md documents STALE_CACHE (87) as
  the semantic code for stale nodeIds. Align with the documented
  convention so agents can discriminate "selector never matched" from
  "indexed cache went stale."
Moving the slice into the data-producing lambda so `--json` and the
human formatter agree on what "last N" means. Prior behaviour:

  bdg network list --last 5 --json
  -> .data.requests  = [all 80]
  -> .data.filtered  = [all 80]   # bug: ignored --last
  -> .data.totalCount = 80

  bdg network list --last 5
  -> "last 5 of 80" header, 5 rows printed

Agents scripting against the JSON path got the full capture regardless
of --last, inverse of the original 10-item cap but equally confusing.

New shape:

  .data.requests       - raw unfiltered capture
  .data.filtered       - post-filter, post-slice (what humans see)
  .data.filteredTotal  - post-filter, pre-slice (new, so callers can
                         still reason about "how much survived the
                         filter" without re-running)
  .data.totalCount     - unchanged (full capture count)
…headers

#3 follow-up: commit 3781fcb guarded typos close to existing commands
(edit distance <= 2), so `bdg statuss` is rejected with "Did you mean
bdg status?". But arbitrary single tokens like `bdg unknown` or
`bdg myproject` still fell through validateUrl because normalizeUrl
blindly prepends http://, producing a parseable `http://unknown/`. A
real Chrome process would launch, against what the user almost
certainly meant.

Add a pre-normalisation shape check: the input must contain `.`, `/`,
`:`, an explicit scheme, or match `localhost` to count as a URL. Users
with a genuine bare intranet hostname can disambiguate with an
explicit `bdg http://foo` or a port (`bdg foo:80`). Failures exit 80
(INVALID_URL) with a message that points at both escape hatches and
`bdg --help`.

#6 follow-up: commit f6ebaaf intentionally made `bdg network headers`
default to the main document when no ID is passed (deterministic vs
the old "random request" behaviour). Leave the default, but when the
human path runs without an ID add a one-line stderr hint pointing at
`bdg network list` so agents don't silently assume they got the
request they asked for. Skip the hint under --json to keep the
automation path stderr-clean (matches #9/#10 contract).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant