[kernel-1310] browser telemetry - CLI integration#168
Conversation
knownTelemetryTypes was 22 hardcoded strings that would silently drift as new server-side event types shipped. Drop the warning; unrecognized --types values simply produce no matches, which is obvious. knownTelemetryCategories was settableCategories + "system" with no single source of truth. Remove it; validate --categories against settableCategories with an explicit carve-out for "system".
- mention browsers create as a --telemetry surface (not just update) - document partial-update semantics for per-category config - note default --seq behavior (omit to stream from now) - correct the default output example to call out tab-separated format
Sayan-
left a comment
There was a problem hiding this comment.
integration looks good overall!
…eam --categories
- buildTelemetryParam: drop Enabled=true from the per-category branch so the
request goes through the API's merge path. Previously, enabled=true +
browser={...} collided with the enable-all encoding and was treated as a
replace, defaulting unmentioned categories back to on.
- TelemetryStream: validate --categories against the known set (console,
interaction, network, page, system) and error on typos. Mirrors the
config-side validation in parseTelemetryCategories.
- Tests: cover the three wire-encoding shapes for buildTelemetryParam; flip
the create-with-categories test to assert Enabled is unset; flip the
unknown-category stream test to assert error.
rgarcia
left a comment
There was a problem hiding this comment.
Review: browser telemetry CLI integration
Built the branch, ran the unit tests, did a live QA pass against the prod API, and reviewed the diff for API-surface consistency and implementation. Overall this is a solid PR — builds clean, tests pass, and the core feature works end-to-end. A few behavior issues and consistency nits below.
Build & tests
go build ./cmd/kernelsucceeds;go vet ./...clean;go test ./cmd/...all pass.
QA (live, against prod API)
Drove real CDP traffic to generate events. Note: browsers curl <id> <url> generates no telemetry — it uses Chrome's net stack without opening a CDP target — so it's not a usable way to drive events for testing the stream.
Works:
- Config lifecycle:
create --telemetry=all, partial update (--telemetry=network=offflips only network, others retained),--telemetry=offclears, per-category merge — all good. - Live text stream (
HH:MM:SS\t[cat]\ttype) and-o jsonNDJSON (every line validjq -c; envelopes carryseqandevent.category). - Filters:
--categories=network,--types=network_responsecorrectly narrow output. - Resume:
--seq=Nreplays contiguously from N+1. - Validation/error paths: bad value, unknown category, bad on/off, negative seq, bad
--output, empty update — all error cleanly with exit 1, no session leak on a failed create.
Issues found:
- Ctrl-C on a stream always errors and exits 1. SIGINT prints
ERROR Unexpected end of JSON inputand returns exit 1 (the SSE decoder hits a partial read on teardown; no signal handling). This breaks any script that gates on exit code. With-o jsonthe NDJSON stays valid (error goes to stderr) but the exit code is still 1. Suggest handling SIGINT to exit 0 (or 130) without the error. cdp_connect/cdp_disconnectare mislabeled and unfilterable. The server tags thesecategory:"system"on the wire (visible in-o json), but they render as[cdp]in text output and are dropped by--categories=system. The help text "system covers allmonitor_*events" is therefore incomplete. Root cause is the category-derivation logic — see code review below.--seq=0replays nothing despite docs saying "from the beginning."--seq=0returns zero events;--seq=1replays full history. Both the flag help and the error string (use --seq=0 to resume from the beginning) are misleading against actual server behavior.--telemetry=allis opaque onget. Returnstelemetry: {"browser": {}}(empty object), so you can't confirm "all enabled" from the output. Explicit per-category lists do enumerate.- Minor cosmetics: error messages capitalize the leading flag (
--Seq must be >= 0);createprintsINFO Creating browser session...before telemetry validation fails (no session is actually created — just misleading ordering).
console / interaction categories weren't exercised (the driver only navigated; no console/click traffic) — network, page, and system were observed live.
Code review
Must-fix
shouldEmitis dead code; the streaming loop runs an untested duplicate.shouldEmit(cmd/browsers_telemetry.go:120-129) is only referenced from tests (cmd/browsers_telemetry_test.go:209);TelemetryStreaminlines the identical filter atcmd/browsers_telemetry.go:159-163. So the 8TestShouldEmitcases cover a code path that never actually runs in production. CallshouldEmitfrom the loop and delete the inline copy, and add a drop-case test throughTelemetryStream(assert a non-matching--categories/--typesevent is absent) — there's currently no end-to-end test of a dropped event.
Should-fix
eventCategoryprefix derivation is the riskiest assumption — and the cause of thecdp_*mislabeling above.eventCategory(cmd/browsers_telemetry.go:106-118) derives the category from the substring before the first_, special-casing onlymonitor_→system. The server already sends a correctcategoryfield on the wire (the test fixtures even carry an unused"category"field atcmd/browsers_telemetry_test.go:197-204). Reading category from the SDK/wire field instead of the prefix split would fix thecdp_*classification and resolve theTODO(sdk)in one go.- Hand-rolled
-oflag diverges from the CLI standard.cmd/browsers_telemetry.go:135-137,187validates output manually; the rest of the CLI registers viaaddJSONOutputFlag(pkg/util/output.go:18) and validates viavalidateJSONOutput(pkg/util/output.go:11). The current error (unsupported --output value: use 'json') also omits the offending value and uses single quotes vs. the standard%qform. Keep the custom help text; just share the validation. - Cross-file
init()for cobra wiring. Thetelemetrysubcommand is wired in a secondinit()incmd/browsers_telemetry.go:180-190, relying on a comment about package var-init ordering; every other browser subgroup is wired in the singleinit()incmd/browsers.go. Works, but a latent footgun — move it alongside the others. - Fix the
--seq=0semantics (see QA): either make--seq=0actually replay from the beginning, or correct the help/error text to reflect that--seq=1is the earliest, and reconcile the--seqbound (the check allows-1while the message says>= 0,cmd/browsers_telemetry.go:138-139).
Nits
buildNewTelemetryParam/buildUpdateTelemetryParam(cmd/browsers_telemetry.go:67-97) are byte-for-byte identical except return type — factor the shared logic if a third call site ever appears.- Stream timestamps drop the date (
time.UnixMicro(...).Local().Format("15:04:05"),cmd/browsers_telemetry.go:171);LogsStreamusesutil.FormatLocal(cmd/browsers.go:698). Fine for a live tail, but--seqreplays and midnight-spanning streams lose date context. - Nil-service handling returns an error (
cmd/browsers_telemetry.go:132-133) where the siblingLogsStreamprints + returns nil (cmd/browsers.go:672-675). The error approach is better — just flagging the inconsistency.
Confirmed good (no action): the system category asymmetry (settable: 4 categories; stream-filterable: 5) is clean and tested; --telemetry symmetry on create/update with the extended empty-update guard (cmd/browsers.go:587-591); partial-update merge semantics are locked down by TestBuildTelemetryParam_WireEncoding; parsing/validation is solid.
Test gaps (besides the drop-case above): no coverage for --seq < -1 rejection, unsupported --output rejection, the --types path through TelemetryStream, or trailing-comma / empty-segment parsing.
| params.LastEventID = kernel.Opt(strconv.FormatInt(in.Seq, 10)) | ||
| } | ||
| stream := b.telemetry.StreamStreaming(ctx, br.SessionID, params) | ||
| defer stream.Close() |
There was a problem hiding this comment.
Missing nil telemetry stream guard
Medium Severity
TelemetryStream calls defer stream.Close() immediately after StreamStreaming without checking for a nil stream. Other browser streaming handlers (LogsStream, ProcessStdoutStream, FSWatchEvents) guard nil before Close, so a nil return here can panic instead of failing cleanly.
Reviewed by Cursor Bugbot for commit 351e6b2. Configure here.
rgarcia
left a comment
There was a problem hiding this comment.
Re-review — feedback addressed
Re-pulled at 351e6b2, rebuilt, re-ran tests, and did a fresh live QA pass against the API focused on the behaviors that changed. 8 of 9 items are confirmed fixed; one (--seq=0) is now a docs-vs-behavior mismatch worth reconciling. Recommendation: approve-with-nits once --seq=0 is sorted.
Verified fixed
cdp_*classification — now renders[system]in text andcategory:"system"in JSON, and--categories=systemincludes them (confirmed live:cdp_connect/cdp_disconnectare caught).eventCategorynow reads the wirecategoryfield with a prefix fallback (cmd/browsers_telemetry.go:114-129), covered byTestEventCategory.shouldEmitwired in + tested — the stream loop callsshouldEmit(...)(cmd/browsers_telemetry.go:171) and the inline duplicate is gone; the new drop-path tests (cmd/browsers_telemetry_test.go:177,206) assert the dropped event is absent. Live,--categories=networkand--types=network_responsenarrow correctly.- Ctrl-C clean exit —
signal.NotifyContext+context.Canceledswallow (cmd/browsers_telemetry.go:157,184); live SIGINT now exits0with noUnexpected end of JSON input, in both text and-o json(NDJSON stays valid). Covered byTestTelemetryStream_ContextCanceledExitsCleanly. - Dated timestamps —
2006-01-02 15:04:05(cmd/browsers_telemetry.go:180); confirmed live. - Create banner ordering — moved below telemetry validation (
cmd/browsers.go:418-420);create --telemetry=garbagenow errors with no banner and no session created. - Output validation — shared
validateJSONOutput(cmd/browsers_telemetry.go:146); error echoes the offending value with%q. - Error-message style —
must be "on" or "off",invalid --categories value %q,invalid --seq value %d(echoes the value),--outputdouble-quoted; all confirmed live and consistent with the rest of the CLI. - Cross-file
init()removed — telemetry wiring folded intobrowsers.go'sinit()(cmd/browsers.go:2545-2551).
Build/go vet/full cmd test suite all green; no regressions in the sweep (config create/get, partial update/merge/clear, filters).
Still open — one decision needed
--seq=0 does not replay from the beginning, and the docs now state that it does. Live behavior: --seq=0 replays 0 events, --seq=1 replays the full history, --seq=50 resumes after 50. Root cause: for seq=0 the CLI forwards Last-Event-ID: "0" verbatim (cmd/browsers_telemetry.go:164-165), which the server treats as "from now" (sequences start at 1). This batch updated the help string but not the behavior, so the --seq help (cmd/browsers.go:2549), the error message (cmd/browsers_telemetry.go:150), and README.md:306 now all claim --seq=0 means "from the beginning" — which contradicts what the server does. Either map --seq=0 to a value the server honors as from-beginning (or send "1"), or correct the docs to say --seq=1 is the earliest replay and --seq=0 behaves as from-now.
(Separately, --telemetry=all shows telemetry:{"browser":{}} on get — opaque, but that looks like a server-side representation rather than a CLI concern. Noting only.)
New nits
apicategory isn't accepted by--categories.api_callevents classify as[api], but--categoriesonly acceptsconsole, interaction, network, page, system, so they're filterable only via--types. Likely worth addingapito the filter set (or confirming it's intentionally excluded).- Help text "system covers all
monitor_*events" is now incomplete — sincesystemalso covers server-classifiedcdp_*events, updatecmd/browsers.go:2547andREADME.md:304. - (perf) double
json.Unmarshalper event on the text path —eventCategoryis invoked by bothshouldEmit(cmd/browsers_telemetry.go:133) and the print line (:181), unmarshallingRawJSON()twice per emitted event. Compute once and reuse. Not a correctness issue. - (pre-existing, optional)
buildNewTelemetryParam/buildUpdateTelemetryParamare still near-identical (cmd/browsers_telemetry.go:73-102); fine to ship as-is.
Nice turnaround on these — the structured-category fix in particular cleans up the riskiest part of the original design.
| telemetryStream.Flags().StringSlice("categories", []string{}, "Filter by API event category (api,console,interaction,network,page,system); system covers monitor_* and cdp_* events") | ||
| telemetryStream.Flags().StringSlice("types", []string{}, "Filter by event type (e.g. network_response,console_error)") | ||
| telemetryStream.Flags().Int64("seq", -1, "Resume stream from sequence number (Last-Event-ID); --seq=1 replays from the first event, default -1 streams from now") | ||
| telemetryStream.Flags().StringP("output", "o", "", "Output format: json for newline-delimited JSON envelopes") |
There was a problem hiding this comment.
Inline output flag instead of shared helper
Low Severity
The telemetryStream command registers its --output flag with an inline Flags().StringP("output", "o", "", "Output format: json for newline-delimited JSON envelopes") instead of calling addJSONOutputFlag(telemetryStream). Every other browser subcommand (list, get, view, update, replays, process, fs) uses addJSONOutputFlag, which delegates to the centralized helper with the standard description string. This introduces drift in the help text and flag definition pattern.
Triggered by learned rule: Use shared JSON output helpers in CLI commands
Reviewed by Cursor Bugbot for commit 5bfd91b. Configure here.
| if prefix == "monitor" { | ||
| return "system" | ||
| } | ||
| return prefix |
There was a problem hiding this comment.
Fallback misses cdp prefix to system mapping
Medium Severity
eventCategory fallback maps monitor_* to "system" but not cdp_*, even though the flag help text and README both document that system covers monitor_* and cdp_* events. When a cdp_* event arrives without a wire category field, it gets categorized as "cdp" — which isn't in streamFilterCategories, making those events invisible to any --categories filter and unreachable by the user.
Reviewed by Cursor Bugbot for commit 5bfd91b. Configure here.
rgarcia
left a comment
There was a problem hiding this comment.
Re-review at 5bfd91b — live QA + code
Re-pulled at 5bfd91b, rebuilt, ran a fresh parallelized live QA pass against prod, and re-read the delta since my last re-review. Build / go vet / go test ./cmd/... all green.
Recommendation: approve once the --seq docs/behavior is reconciled. Everything from both prior reviews is confirmed fixed and verified live — the only item still open is the one I flagged last round (--seq=0), and the docs-only fix has left the help/README inaccurate rather than resolving it (inline note on the --seq flag).
Verified fixed (live, against prod)
cdp_*classification —cdp_connect/cdp_disconnectrender[system]and are caught by--categories=system(0 leak to[cdp]);api_call→apiand--categories=apiis accepted.- Filters —
--categories(single + comma union),--types, and combined all narrow correctly with no leaks; typo/invalid categories error (exit 1) instead of silently matching nothing. - Ctrl-C clean exit — SIGINT and SIGTERM exit
0in both text and-o json, noUnexpected end of JSON input, verified mid-write of ~190 buffered JSON lines. - Dated timestamps, partial-update merge (
--telemetry=network=offflips only network, others retained), shared output validation with%qecho, error-message style, banner ordering (no banner / no session leak on a failed create), and the perf double-unmarshal dedup — all confirmed. Help + README carry theapicategory andsystem covers monitor_*/cdp_*claims.
Still open — --seq (see inline)
Resume is strictly after N (Last-Event-ID=N → seq > N), so --seq=1 starts at seq 2 — there's no --seq value that replays the genuine first event, making "replays from the first event" (flag help, error browsers_telemetry.go:150, README.md:306) an off-by-one. And --seq=0 is now a silent no-op: validation accepts it ("must be >= 0"), the code sends Last-Event-ID="0", the server replays nothing, and TestTelemetryStream_SeqZeroSetsLastEventID pins exactly that — while the docs no longer mention 0 at all. Smallest fix: reword the three surfaces to "earliest available / lowest retained sequence" and drop "first event", and either reject/warn on --seq=0 or map it to the earliest retained seq.
Two open Bugbot Mediums — low real-world impact
- nil-stream guard (
browsers_telemetry.go:167): provably dead code —ssestream.NewStreamnever returns nil andClose()already nil-guards. Only a consistency gap vsLogsStream/ProcessStdoutStream; optional. cdp_fallback (inline note): help advertises "system coverscdp_*" but the prefix fallback only mapsmonitor_→system. Masked today by the wirecategoryfield (and nocdp_*event types in the SDK yet) — forward-compat gap, not a live bug.
Minor (non-blocking, mostly server-side)
create --telemetry=offreturns.telemetry=nullwhilegetreturns{}—-o jsonconsumers should handle both.browsers geton a missing session printsERRORbut exits0(generalgetpath, outside this PR's scope).
| telemetryStream := &cobra.Command{Use: "stream <id>", Short: "Stream live telemetry events", Args: cobra.ExactArgs(1), RunE: runBrowsersTelemetryStream} | ||
| telemetryStream.Flags().StringSlice("categories", []string{}, "Filter by API event category (api,console,interaction,network,page,system); system covers monitor_* and cdp_* events") | ||
| telemetryStream.Flags().StringSlice("types", []string{}, "Filter by event type (e.g. network_response,console_error)") | ||
| telemetryStream.Flags().Int64("seq", -1, "Resume stream from sequence number (Last-Event-ID); --seq=1 replays from the first event, default -1 streams from now") |
There was a problem hiding this comment.
This wording is an off-by-one, and it leaves --seq=0 as a silent no-op.
Resume is strictly after the Last-Event-ID (--seq=N → events with seq > N), confirmed live: --seq=1 → min seq 2, --seq=2 → 3, --seq=100 → 101. So --seq=1 does not replay the first event (seq 1), and no value does. The same wording is in the error string (browsers_telemetry.go:150) and README.md:306.
Separately, --seq=0 is accepted by the >= 0 validation, sends Last-Event-ID="0", and the server replays nothing — a silent empty stream with no diagnostic. 0 is no longer mentioned in any docs, yet TestTelemetryStream_SeqZeroSetsLastEventID locks in the 0→"0" behavior.
Suggestion: reword to "earliest available / lowest retained sequence" (drop "first event"), and either reject/warn on --seq=0 or map it to the earliest retained seq. Minor: the bound copy says "must be >= 0" while the guard is < -1, so -1 is actually permitted.
| if prefix == "monitor" { | ||
| return "system" | ||
| } | ||
| return prefix |
There was a problem hiding this comment.
The flag help + README now state system covers cdp_*, but this fallback only maps monitor_→system; a cdp_* event arriving without a wire category returns "cdp", which isn't in streamFilterCategories, so --categories=system would silently drop it.
Not a live bug today — the primary path reads the wire category (server sends system for cdp_*, verified live) and the SDK has no cdp_* event types yet — so this is forward-compat only. But since the docs now advertise it, a one-liner closes the gap: if prefix == "monitor" || prefix == "cdp".
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 4 total unresolved issues (including 3 from previous reviews).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 0e33313. Configure here.
rgarcia
left a comment
There was a problem hiding this comment.
Approving — --seq resolved at 0e33313
--seq=0 now errors with a clear message instead of silently streaming nothing, the help/error/README all state the honest seq > N (resume-after) semantics, and the test asserts the rejection (TestTelemetryStream_SeqZeroErrors). Verified locally: build + go test ./cmd/... green; --seq=0 and --seq=-50 rejected (exit 1), --seq=1 / --seq=N / omit still behave correctly. That was my only blocking item — LGTM.
Two earlier nits remain, both intentionally optional / non-blocking (fine to defer or fast-follow):
eventCategory's prefix fallback mapsmonitor_→systembut notcdp_→system, while help/README now advertisesystemcoverscdp_*. Masked today by the wirecategoryfield; forward-compat only.telemetry streamis the one stream handler without a post-StreamStreamingnil check — provably dead code (the SDK never returns nil), pure consistency withLogsStreamet al.
Follow-up idea: a true "replay from the first event"
Now that --seq honestly documents resume-after-N, there's no way for a client to ask for "everything from the start": --seq=1 begins at seq 2, and --seq=0/omit means from-now. Worth a follow-up because the capability already exists server-side and just isn't exposed — and notably it's not something the control-plane API can add on its own:
StreamBrowserTelemetryin the control-plane API (packages/api/cmd/api/api/telemetry.go) is a pure passthrough — it forwardsLast-Event-IDto the VM and never picks a start position. So an API-only change can't do it.- The start position is decided VM-side in
kernel-imagesserver/cmd/api/api/events.go(StreamTelemetryEvents):afterSeqdefaults to the current seq and is only overriddenif n > 0, so0/absent → from-now. The ring buffer underneath (server/lib/events/ringbuffer.go) already treatsnewReader(0)as "from oldest retained" — the handler simply never passes it 0. - So a real "from start" is a small kernel-images change: accept an explicit sentinel (a dedicated query param like
?replay=all— notLast-Event-ID: 0, since SSE auto-reconnect resends the last id and absent/0 already mean from-now) that maps toNewReader(0), then thread it through the passthrough API param and a new CLI flag (e.g.--replay-all, distinct from--seq). - Caveat: the ring holds 1024 envelopes (
server/cmd/api/main.go), so it's "from oldest retained," not guaranteed seq 1 — a long/busy session will have evicted the earliest events (the reader surfaces aDroppedmarker and resumes at the oldest).
Not blocking this PR — flagging while it's fresh.
|
Noted - will follow up on the |


Adds browser telemetry to the CLI:
kernel browsers create --telemetry=all|off|<list>kernel browsers update <id> --telemetry=all|off|<list>(partial — unspecified categories retain state)kernel browsers telemetry stream <id>with--categories,--types,--seq(Last-Event-ID resume),-o json(NDJSON)Full test matrix → https://gist.github.com/archandatta/f27eb3ea93d58e932b9fd9a7b7b9090f
Essential run steps to validate
Build:
1. Config (create / update / get):
2. Live stream (text + JSON):
3. Filters:
4. Resume from sequence:
5. Validation paths (all should error cleanly):
Cleanup:
/tmp/kernel browsers delete $IDTests
cmd/browsers_telemetry_test.go(go test ./cmd/...)Note
Low Risk
CLI-only feature wiring to existing browser APIs; no auth or core infra changes, with broad unit test coverage for parsing and streaming behavior.
Overview
Adds browser telemetry to the Kernel CLI: configure it on session create and update, and stream live events from a running session.
Configuration — New
--telemetryonbrowsers createandbrowsers updateacceptsall,off, or per-category lists (network=on,page=off). Per-category updates only send named categories so the API can merge;all/offset the globalEnabledflag.browsers updatenow allows telemetry-only updates (no proxy/profile/viewport required).Streaming — New
browsers telemetry stream <id>uses the SDK SSE client, resolves the session ID, supports--categories,--types,--seq(Last-Event-ID resume), human-readable lines or-o jsonNDJSON via newPrintCompactJSONLine. Client-side filtering and category inference (wirecategoryor type-prefix, withmonitor_*→system).Docs & tests — README documents flags and telemetry section; unit tests cover param wire shapes, stream validation, filters, and create/update telemetry mapping.
Reviewed by Cursor Bugbot for commit 0e33313. Bugbot is set up for automated code reviews on this repo. Configure here.