Track: manual testing findings from 0.7.2 (2026-04-16) by szymdzum · Pull Request #227 · szymdzum/browser-debugger-cli

szymdzum · 2026-04-16T16:56:01Z

Tracking PR for manual testing findings from v0.7.2.
Full report: docs/quality/MANUAL_TESTING_2026-04-16.md.

Status: 26 of 29 addressed in this branch. Three remaining (#9, #10, #23) couldn't be reproduced on current code and need a concrete repro before they can be attempted. Each fix is its own focused commit (run git log --oneline main..HEAD on this branch for the breakdown).

🔴 Critical (7/7)

Refactor CLI to modular architecture and add cleanup --all flag #1 Form index click hits wrong radio/checkbox in grouped inputs
Add comprehensive code quality tooling and refactoring #2 network list silently capped at 10 regardless of --last / --all
Performance: Async file I/O with mutex and comprehensive instrumentation #3 bdg <unknown-command> silently treats typo as a URL
Refactor session controller architecture and integrate chrome-launcher #4 bdg status shows stale URL/title after in-page navigation
Add comprehensive testing foundation and enhanced code quality #5 bdg network document returns favicon instead of main HTML document
feat: comprehensive agent-friendly CLI improvements #6 bdg network headers (no arg) returns random request instead of error
feat: implement collector selector flags #7 Form submit reports Navigation: no when navigation actually occurred

🟡 Significant (13/15)

🟠 Minor (6/7)

build(deps-dev): Bump @typescript-eslint/eslint-plugin from 8.46.2 to 8.46.3 #23 Race in parallel DOM queries (flaky, needs a repeatable repro)
build(deps-dev): Bump knip from 5.67.1 to 5.68.0 #24 peek --network hard-error with migration hint to bdg network list
build(deps-dev): Bump eslint from 9.39.0 to 9.39.1 #25 dom eval returns quoted JSON for simple string results
build(deps-dev): Bump @typescript-eslint/parser from 8.46.2 to 8.46.3 #26 Session-start landing page auto-suppressed outside a TTY; --verbose forces it back
build(deps-dev): Bump @eslint/js from 9.39.0 to 9.39.1 #27 Stale-cache error message reads "cached query" as literal
Refactor: Extract daemon and worker into modular plugin architecture #28 Filter last 0 of N phrasing when no matches
Architecture improvements and smoke test implementation #29 bdg <url> <subcommand> silently ignores subcommand

Output-control story (added by #17 + #24 + #26)

--json → machine-readable
Default → human-readable with hints/next-steps
--quiet / -q → global; strips tips, next-steps, suggestions, commands footers
--verbose → on session start, force landing page when piped
Session-start landing page auto-suppressed when stdout isn't a TTY

…very generateSelector() produced input[name="X"] for any name-bearing input, which collided across radios/checkboxes sharing a name. The resolver then treated the preview selector as uniquely identifying and dropped the index, so dom click/fill on a grouped input always hit the first match. For radios and checkboxes that carry a value, append [value="..."] to the preview so each element in a group resolves uniquely.

fetchNetworkRequests() called fetchPreviewData() with no lastN argument, so the daemon's peek handler applied its 10-item default and truncated the response before the list command ever saw it. The --last option operated on that pre-truncated slice, so users couldn't retrieve more than 10 requests regardless of the value (the HAR export path, which uses a different IPC call, kept working). Pass lastN=0 (unlimited) and let the list command's own --last do the slicing, matching the pattern fetchConsoleMessages already uses.

status.ts suggested 'bdg query <script>' (the command is 'bdg dom eval') and 'bdg tabs' (no such command). dom.ts suggested 'bdg details dom N' (details only supports network/console). All three sent users down dead-end paths. - status 'Commands' section: 'Query browser: bdg dom eval <script>' - status 'Suggestions' section: drop the 'List Chrome tabs' entry - dom query 'Next steps': point to 'bdg dom a11y describe N' instead

handleIndexGetSemantic() passed the string 'cached query' as the selector argument to elementAtIndexNotFoundError, so users saw 'Re-run "bdg dom query cached query" to refresh the cache' — where 'cached query' reads as a required argument name. Make selector optional on the error helper and emit a generic 'Re-run your original query' suggestion when the caller doesn't know the original selector.

formatHeaderSection prepended a colon to the key before calling keyValue(), which adds its own colon for the padding. Result was 'Referer:: https://...' throughout headers output. Let keyValue own the colon.

…ment findTargetRequestForHeaders() fell back to any request whose mimeType contained 'html' when no current-nav Document was found, and further fell back to any request with response headers. A favicon.ico 404 response body typed 'text/html' would win the mimeType fallback, so 'bdg network document' returned the favicon on pages like /post. With no html-mimed request at all, 'bdg network headers' would return an arbitrary request (the headers fallback). Drop both loose fallbacks. Prefer the current-navigation Document, then any Document resource, then error — resourceType is the only signal that reliably distinguishes the main document from collateral. Contract tests updated to encode resourceType-based selection and to verify favicon-with-html-mimetype no longer wins.

The default action treats any bare argument as a URL, so 'bdg statsu' was interpreted as 'http://statsu' and launched a Chrome session. Real subcommand names get routed by Commander first, but their misspellings fall through to the [url] positional. Before URL validation, check the argument against the registered command names with a Levenshtein-2 threshold and error out with a 'Did you mean' suggestion. Truly bogus single-token strings (no dots/slashes/ scheme and not a command typo) still reach Chrome — scope-limited fix focused on the typo case the report called out.

Commander treats '' as a present argument, so 'bdg dom query ""' slipped through and reached the CDP query helper which silently returned zero nodes and a success exit code. Agents couldn't tell the difference between 'selector matched nothing' (exit 0, valid answer) and 'you passed an empty string' (exit 81, usage error). Guard the selector at the command boundary.

…nothing buildHeader fell into the 'last X of Y' branch even when the filter produced zero matches, so empty results read as 'last 0 of 10' — which is technically true but reads like a paging error. Surface the intent: '(no matches; 10 captured)' when the filter emptied the list, '(none captured)' when there were no requests to begin with.

'bdg https://example.com status' silently ignored 'status' and launched a session because the root Commander program accepted arbitrary excess arguments. Turn on allowExcessArguments(false) on the root so any trailing positional after [url] raises Commander's 'too many arguments' error instead of being dropped. Subcommands declare their own arguments and are unaffected.

Two linked a11y issues: 1. Bare 'bdg dom a11y' printed the help block instead of doing anything, even though the search argument is documented as optional. Route bare invocations to handleA11yTree so the subcommand dumps the tree — the most useful default for an a11y-shorthand. 2. 'bdg dom a11y Example' wrapped the term in asterisks to form 'name:*Example*', which parseQueryPattern then passed through to matchesPattern where includes() looks for the literal string '*example*'. A heading named 'Example Domain' didn't contain the asterisks so the match failed. Drop the asterisk wrap in the bare-mode shorthand and strip leading/trailing '*' wildcards in parseQueryPattern. Users who habit in asterisks now get the expected substring match; users who don't never had to type them in the first place.

The empty-state guard checked options.network === undefined, but Commander defaults the flag to false, so 'bdg peek --console' never hit the marker branch when there were no messages — users saw the PREVIEW header, skipped straight to the tip line, and couldn't tell the difference between 'no data captured' and 'command silently misbehaved'. Gate on options.console / !options.network / hasConsoleData so the empty console section renders its '(none)' marker in both the user- asked-for-console and default cases, matching the DOM block.

…%O, %%) formatConsoleArgs blindly joined every RemoteObject with spaces, so console.log('%cFoo', 'color:red') rendered as '%cFoo color:red' — the format directive and its CSS argument both leaked into the text, badly on pages with lots of %c styling. Detect a format string in the first argument and apply the WHATWG console substitution rules for the common directives: %s/%d/%i/%f consume and format their arg, %o/%O render structured objects, %c silently drops the CSS arg, %% renders a literal %. Unknown directives pass through as literal text.

formatConsole chose summarize mode whenever --list wasn't set, so 'bdg console --level info' rendered the error/warning summary header (and 'No errors or warnings found') on an already-filtered info stream. Treat an explicit --level as implying chronological mode — the summary's deduplicated errors/warnings layout has no meaningful equivalent for info/debug.

Peek's --network/--console/--dom flags were documented as filters but implemented as 'addSections'. The gates only dropped the network block when --console was set and vice-versa — --dom never suppressed network or console, so 'bdg peek --dom' printed everything plus the tree. Introduce a hasExplicitFilter signal: when any of network/console/dom is explicitly set, only the requested sections render. With no flags, the default (network + console) stays; DOM remains opt-in due to the tree's verbosity.

formatDomEval ran every result through JSON.stringify, so a script returning a string came back wrapped in quotes — agents parsing document.title had to strip them (and couldn't tell a plain string apart from an accidentally-JSON-encoded one). Emit strings raw; numbers/objects/booleans/null keep JSON formatting so they remain round-trippable.

… none The main entrypoint called ensureDaemonRunning() for every non- documentation invocation, so status and stop spawned (and then tore down) an entire worker just to learn what the caller already implied by running them: there's no session yet. Slow and wasteful on clean systems. Gate the spawn: for the status and stop commands, only launch a daemon when one is already running (reconnect path). Otherwise let each command's existing daemon-connection-error handler produce the appropriate 'not running' message without the side effect.

--brief dropped the Value column entirely, so after running fill/click agents had to switch to the full table to verify the change landed — defeating the point of a quick-state summary. Add a compact VAL column that shows the field's current state: - text-like fields → quoted value, or ∅ when empty - checkbox/radio/switch → [x] / [ ] Widen the brief table to 65 cols to fit the new column without truncating labels further.

…GUMENTS) Commander's default exit handler collapsed every usage failure — unknown option, missing required argument, too-many arguments, invalid argument value — onto exit 1, which scripts couldn't distinguish from a real crash. Install exitOverride on the root program to translate the usage-level Commander error codes to EXIT_CODES.INVALID_ARGUMENTS while letting help/version still exit 0. Commander prints its own error text to stderr, so the override only needs to own the exit code.

setTargetInfo() was only called during initial Chrome setup; subsequent navigations (location.href assignment, pushState-style routing, server redirects) left the cached URL and title frozen at the original values. Agents checking 'bdg status' after a navigation got stale state and couldn't use it to confirm page changes. - Thread an onMainFrameNavigation callback through startNavigationTracking and subscribe to Page.frameNavigated + Page.navigatedWithinDocument so in-document routing updates the URL too. - Update the stored URL immediately from the event (cheap, event-driven); refresh the title via Runtime.evaluate on Page.loadEventFired plus scheduled retries, since the event fires before the new document's <title> is ready. - Fall back to host+pathname when document.title is empty so the format matches what Chrome's /json endpoint populated at session start.

…submit waitForCompletion only subscribed to Page.frameNavigated when the caller passed --wait-navigation, so a default 'bdg dom submit' that POSTed and then navigated came back with Navigation: no because the listener was never attached — navigationOccurred stayed at its initial false. Always subscribe to Page.frameNavigated so the reported flag reflects what happened; waitNavigation continues to control whether we block for navigation. Filter to main-frame events so subframe loads during submit (ads, iframes) don't falsify the flag.

The soft deprecation had lived long enough that users were trained to ignore it while scripts silently depended on the shorthand. Cut it: any invocation of 'bdg peek --network' now fails with exit 81 and a message pointing to 'bdg network list' — the strict superset replacement. Option description updated so 'bdg peek --help' reflects the removal.

… --verbose The 50+ line post-start walkthrough is genuinely useful when a human runs bdg at their terminal for the first time, but it drowns script and CI output where nobody reads it. Drive the choice off process.stdout.isTTY with explicit overrides: - TTY → landing page (unchanged for interactive users) - piped/redirected → 'Session started: <url>' only - --verbose → force the landing page even when piped - --quiet → already forced minimal; stays that way No breaking change for interactive usage, and automation no longer needs --quiet purely to tame the banner.

--quiet was scoped to 'bdg <url>' via the start command's own option list, so 'bdg --quiet status' parsed the flag (applyCollectorOptions attaches to the root) but no other formatter consulted it. Users got a flag that looked universal but only did anything at session start. Contract: --quiet suppresses tips, 'Next steps:', 'Suggestions:', and 'Commands:' sections while keeping the primary data/result. --json continues to be the machine-readable path; the two compose. Covered in this pass (where the noise actually lives): - bdg status — drops the 'Commands:' footer - bdg status (no session) — drops the 'Suggestions:' block - bdg dom query — drops both 'Next steps:' and 'Suggestions:' - bdg peek — drops the 'Tip:' tail in compact and verbose modes - bdg dom submit — drops the 'Next steps:' block and updates its reference to point at 'bdg network list' instead of the now-removed 'bdg peek --network' - daemon-startup log lines on the main entrypoint are also silenced Commands without decorative sections accept the flag as a global and no-op on it.

Four regressions/new bugs surfaced in the 2026-04-16 retest: - network headers <bogus-id> surfaced the worker's "not found" error via IPCError default, landing on exit 104 (UNHANDLED_EXCEPTION). Check response.status explicitly and return RESOURCE_NOT_FOUND (83) with a "list captured requests" suggestion. Same pattern for the no-arg document alias when nothing is captured. - cdp <method> with a CDP "Invalid parameters" reply also fell through to 104. Map likely-user-input CDP errors (invalid parameters / must have / is required) to INVALID_ARGUMENTS (81) and point at `--describe` for the parameter schema. - dom query "Next steps -> Extract text" rendered `bdg cdp Runtime.evaluate --params '{"expression": "document.querySelector('h1').textContent"}'` The outer single-quote wrapper collided with the inner selector quotes, so pasted verbatim the shell split the string and the CDP call ran `document.querySelector(h1)` -> ReferenceError. Switch to `bdg dom eval "document.querySelector('sel').textContent"` which is bash-safe (double-quoted outer, single-quoted inner, no nesting). - dom get <index> stale-cache path threw CommandError with RESOURCE_NOT_FOUND (83), but CLAUDE.md documents STALE_CACHE (87) as the semantic code for stale nodeIds. Align with the documented convention so agents can discriminate "selector never matched" from "indexed cache went stale."

Moving the slice into the data-producing lambda so `--json` and the human formatter agree on what "last N" means. Prior behaviour: bdg network list --last 5 --json -> .data.requests = [all 80] -> .data.filtered = [all 80] # bug: ignored --last -> .data.totalCount = 80 bdg network list --last 5 -> "last 5 of 80" header, 5 rows printed Agents scripting against the JSON path got the full capture regardless of --last, inverse of the original 10-item cap but equally confusing. New shape: .data.requests - raw unfiltered capture .data.filtered - post-filter, post-slice (what humans see) .data.filteredTotal - post-filter, pre-slice (new, so callers can still reason about "how much survived the filter" without re-running) .data.totalCount - unchanged (full capture count)

…headers #3 follow-up: commit 3781fcb guarded typos close to existing commands (edit distance <= 2), so `bdg statuss` is rejected with "Did you mean bdg status?". But arbitrary single tokens like `bdg unknown` or `bdg myproject` still fell through validateUrl because normalizeUrl blindly prepends http://, producing a parseable `http://unknown/`. A real Chrome process would launch, against what the user almost certainly meant. Add a pre-normalisation shape check: the input must contain `.`, `/`, `:`, an explicit scheme, or match `localhost` to count as a URL. Users with a genuine bare intranet hostname can disambiguate with an explicit `bdg http://foo` or a port (`bdg foo:80`). Failures exit 80 (INVALID_URL) with a message that points at both escape hatches and `bdg --help`. #6 follow-up: commit f6ebaaf intentionally made `bdg network headers` default to the main document when no ID is passed (deterministic vs the old "random request" behaviour). Leave the default, but when the human path runs without an ID add a one-line stderr hint pointing at `bdg network list` so agents don't silently assume they got the request they asked for. Skip the hint under --json to keep the automation path stderr-clean (matches #9/#10 contract).

szymdzum added 21 commits April 16, 2026 18:52

docs: add manual testing report for 0.7.2 (2026-04-16)

9ca4913

fix(#19): drop extra colon from network headers output

7fae2fd

formatHeaderSection prepended a colon to the key before calling keyValue(), which adds its own colon for the padding. Result was 'Referer:: https://...' throughout headers output. Let keyValue own the colon.

szymdzum self-assigned this Apr 16, 2026

szymdzum added 4 commits April 16, 2026 19:42

szymdzum marked this pull request as ready for review April 16, 2026 17:58

szymdzum added 3 commits April 16, 2026 20:18

docs: add manual testing retest report for 0.7.2 (2026-04-16)

27811a0

szymdzum mentioned this pull request Apr 16, 2026

fix: bundled QA retest findings from 0.7.2 (2026-04-16) #228

Open

12 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Track: manual testing findings from 0.7.2 (2026-04-16)#227

Track: manual testing findings from 0.7.2 (2026-04-16)#227
szymdzum wants to merge 29 commits intomainfrom
bug-tracking/test-report-2026-04-16

szymdzum commented Apr 16, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

szymdzum commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔴 Critical (7/7)

🟡 Significant (13/15)

🟠 Minor (6/7)

Output-control story (added by #17 + #24 + #26)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

szymdzum commented Apr 16, 2026 •

edited

Loading