Skip to content

fix(server): handle ECONNRESET on the WS upgrade socket#22556

Open
sashanicolas wants to merge 2 commits into
vitejs:mainfrom
sashanicolas:fix/ws-upgrade-econnreset
Open

fix(server): handle ECONNRESET on the WS upgrade socket#22556
sashanicolas wants to merge 2 commits into
vitejs:mainfrom
sashanicolas:fix/ws-upgrade-econnreset

Conversation

@sashanicolas
Copy link
Copy Markdown

Description

Vite's HMR upgrade handler delegates straight to wss.handleUpgrade without attaching an 'error' listener on the raw socket. If a client RSTs the TCP connection during the upgrade handshake (e.g. a browser tab closing mid-HMR), the underlying socket emits 'error' with ECONNRESET while no listener is attached yet, and Node aborts the dev server with:

Error: read ECONNRESET
    at TCP.onStreamRead (node:internal/stream_base_commons:216:20)
Emitted 'error' event on Socket instance at ...

Evidence

The gap is in packages/vite/src/node/server/ws.ts (the upgrade handler funnels through handleUpgrade, which calls wss.handleUpgrade directly). Several downstream forks have hit the same stack trace and patched around it locally:

PR #12007 closed a related crash class on the wss 'connection' path, but the upgrade-window path was missed.

Fix

Attach a logger-backed 'error' listener on the raw socket before calling wss.handleUpgrade. The ws library attaches its own as part of the handshake (websocket-server.js line 239), so the two coexist for the rest of the connection lifetime.

Test

Added a unit test in packages/vite/src/node/server/__tests__/ws.spec.ts that opens an upgrade connection and asserts the socket has at least 2 'error' listeners after the upgrade event (vite's + ws library's). The test deterministically fails without the fix and passes with it.

The full race condition requires precise timing and is hard to trigger on demand, which is why the test verifies the safety net is present rather than trying to provoke the crash.

Notes

  • No public API change.
  • All existing unit tests in packages/vite pass (756 tests, 4 skipped).

The HMR WebSocket upgrade handler delegates straight to wss.handleUpgrade
without attaching an 'error' listener on the raw socket first. If the
client RSTs the TCP connection during the upgrade handshake (e.g. a
browser tab closing while vite is pushing an HMR update), the underlying
socket emits 'error' with ECONNRESET while no listener is attached yet,
and Node crashes the dev server with:

  Error: read ECONNRESET
      at TCP.onStreamRead (node:internal/stream_base_commons:216:20)
  Emitted 'error' event on Socket instance at ...

This bug is exclusively in vite/ws.ts, but several downstream forks have
hit the same stack trace and patched around it locally — see
cloudflare/workers-sdk#12047, cloudflare/vinext#905,
livestorejs/livestore#982, nuxt/cli#994, cypress-io/cypress#28688.

The fix attaches a logger-backed error listener on the raw socket
before calling wss.handleUpgrade. The ws library attaches its own as
part of the handshake, so the two coexist for the remainder of the
connection lifetime. The added unit test asserts that vite's listener
is present alongside the ws library's after the upgrade event.
The previous commit logged every error on the upgrade socket, including
the benign client-disconnect codes (ECONNRESET, EPIPE) that fire on
normal tab close / page navigation. This broke playground/ssr's
'should restart ssr' test, which asserts the server logs contain no
'error' string after restart, and added noise to legitimate test runs
(see playground/ssr-deps and ssr-noexternal output in CI).

The client is already gone by the time the error fires, so there's no
actionable recovery — silently absorb the benign codes and only log
unexpected ones. The unit test still passes because it counts socket
'error' listeners, not log output.
@sashanicolas
Copy link
Copy Markdown
Author

The two red checks on this PR look pre-existing and unrelated to the diff:

  • Zizmor is failing on every open PR right now (#22552#22556) — same impostor-commit audit against actions-cool/issues-helper returns HTTP 403 from the GitHub API. Looks like a transient rate-limit / token-scope issue in the audit job.
  • Build&Test Windows (node-24) failed in playground/hmr/__tests__/hmr.spec.ts > acceptExports with a 30s page.waitForEvent('load') timeout. PR #22554 (unrelated change) hit the same acceptExports describe block on a different sub-test with the same Playwright timeout pattern, suggesting a flake. Linux/Mac on Node 20/22/24 and Lint all pass here.

Happy to rebase, narrow the patch, or address anything specific if reviewers spot a real concern in the diff.

@sapphi-red sapphi-red added p2-edge-case Bug, but has workaround or limited in scope (priority) feat: dev dev server labels Jun 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feat: dev dev server p2-edge-case Bug, but has workaround or limited in scope (priority)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants