Skip to content

fix(tables): reliable stop-all, accurate "X running", and rate/usage gating for cell runs#4838

Merged
TheodoreSpeaks merged 6 commits into
stagingfrom
fix/table-stop-workflow
Jun 2, 2026
Merged

fix(tables): reliable stop-all, accurate "X running", and rate/usage gating for cell runs#4838
TheodoreSpeaks merged 6 commits into
stagingfrom
fix/table-stop-workflow

Conversation

@TheodoreSpeaks
Copy link
Copy Markdown
Collaborator

Summary

  • Stop all now reliably cancels every dispatched cell. The cancel guard is status-based (not executionId-scoped), so a cell cancelled while it's still a queued pre-stamp can't be resurrected by the worker that later claims it. Adds an explicit pre-execution cancel check (covers cells dequeuing from the queue after Stop) and a resume-worker guard so a paused/awaiting cell doesn't resume after Stop. Cancellation logic consolidated into shared isExecCancelled / isExecCancelledAfter predicates.
  • "X running" / Stop-all control appears as soon as a dispatch starts (new dispatch-start SSE) and stays visible while any dispatch is active — no more "0 running" while cells are clearly queued.
  • Unchecking a checkbox no longer reruns dependents — boolean false is treated as an unmet dependency, so only checking a box triggers downstream runs; unchecking still persists the value.
  • Table cell runs now go through preprocessExecution (the same path other triggers use): attributes usage/billing to the workspace billed account, enforces the usage limit, applies the per-plan timeout, keeps running draft. Rate limit → pace & retry on the async counter (rows aren't skipped); usage limit → halt the dispatch without marking cells and surface an upgrade prompt.

Type of Change

  • Bug fix

Testing

  • Unit tests added (deps predicates incl. boolean-false-as-unmet; preprocessExecution logging-suppression option) — passing.
  • bun run lint:check and bun run check:api-validation:strict pass.
  • Manually exercised rate-limit pacing locally (cells pace and retry; at very low limits they fall back to a re-runnable error after the retry budget — see note below).
  • Stop-all, uncheck-no-rerun, and usage-limit halt + upgrade flows are logic/SSE-wired and worth a manual QA pass before merge.

Note: rate-limit handling is currently per-cell (pace, then re-runnable error after a bounded retry). A follow-up to move pacing to the dispatcher (visible "rate limited — resuming" run state, no cell errors) is discussed but not in this PR.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

…gating for cell runs

Stop-all:
- Make the cancellation guard status-based (not executionId-scoped) so a
  `cancelled` tombstone stamped while a cell is still a dispatcher pre-stamp
  (null executionId) keeps the cell dead — fixes function-execute cells that
  resurrected after Stop all. Consolidated into shared isExecCancelled /
  isExecCancelledAfter predicates in deps.ts, reused by the in-memory guard,
  the SQL guard, the dispatcher tombstone filter, the worker, and resume.
- Add an explicit pre-execution cancellation read so a cell that dequeues
  after Stop all (e.g. from the trigger.dev queue) never runs.
- Resume worker aborts a cancelled paused/awaiting cell before resuming;
  cancelWorkflowGroupRuns marks paused executions cancelling.

"X running":
- Emit a dispatch SSE at dispatch start so auto-fired/capped runs surface
  immediately; show the control whenever a dispatch is active.

Checkbox dependency:
- Treat boolean `false` as an unmet dependency so unchecking never reruns
  dependents — only checking does. deriveExecClearsForDataPatch no longer
  re-arms a downstream group whose deps are unmet after the patch.

Rate / usage gating:
- Route table cell execution through preprocessExecution (billing actor =
  workspace billed account, usage limit, per-plan timeout), keeping draft.
- Rate limit: pace & retry per cell (async counter) so rows aren't skipped.
- Usage limit: halt the dispatch without marking cells and emit a
  usageLimitReached event; the client shows an Upgrade prompt that routes to
  subscription settings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented Jun 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Jun 2, 2026 12:21am

Request Review

@cursor
Copy link
Copy Markdown

cursor Bot commented Jun 1, 2026

PR Summary

High Risk
Touches authentication-adjacent billing/usage enforcement, dispatch/cancel concurrency, and core table execution paths where regressions would affect paid usage and data integrity.

Overview
Stop all and cancel semantics are tightened end-to-end: cancelled cells block worker writes by status (not executionId), with matching SQL guards, pre-run dequeue checks, paused-workflow cancellation, and resume skipped when the cell is already cancelled.

Run status UI keeps Stop all visible while any dispatch is active (hasActiveDispatch), not only when per-row running counts are non-zero. The dispatcher emits a dispatch-start SSE when a run begins so auto-fired/capped runs show activity before the first window finishes; stale per-window events are suppressed after a mid-window halt.

Checkbox dependencies: false counts as unmet, so unchecking does not re-trigger downstream groups; dep-driven clears only run when deps are still satisfied after the patch.

Table cell execution now uses preprocessExecution (billing actor, usage limits, plan timeout, draft runs). Usage limit (402) clears pre-stamps, completes the dispatch once (completeDispatchIfActive), emits usageLimitReached SSE, and the table page shows an upgrade toast. Rate limits are retried with jittered backoff per cell; other preprocess failures surface as cell errors without duplicate execution logs (logPreprocessingErrors: false).

Reviewed by Cursor Bugbot for commit aee1374. Bugbot is set up for automated code reviews on this repo. Configure here.

@TheodoreSpeaks
Copy link
Copy Markdown
Collaborator Author

@greptile review

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 1, 2026

Greptile Summary

This PR hardens the table cell-run pipeline across four axes: reliable Stop All (status-based cancel guard that can't be resurrected by a later worker), accurate "X running" display via a new dispatch-start SSE event and a client hasActiveDispatch flag, correct boolean-false-as-unmet-dependency semantics for checkboxes, and full preprocessExecution integration for table cell runs (billing attribution, usage-limit halting, per-plan timeout, and rate-limit pacing).

  • Cancel guard is now status-only (status <> 'cancelled' in SQL, isExecCancelled in JS) so a pre-stamp cancelled by Stop All can never be resurrected by a later worker that claims the cell with a real executionId.
  • "X running" / Stop-all becomes visible the moment a dispatch starts (new dispatch-start SSE in dispatcherStep) and stays visible via hasActiveDispatch while any dispatch is active, even before cells stamp their progress.
  • Usage-limit path routes through preprocessExecution, clears blocked cells' pre-stamps, and uses completeDispatchIfActive to collapse concurrent 402 hits into a single usageLimitReached SSE event with toast deduplication.
  • Rate-limit pacing retries the async counter up to RATE_LIMIT_MAX_ATTEMPTS times with jittered backoff and a mid-sleep DB cancel check, so cells pace to the user's limit rather than being skipped.

Confidence Score: 5/5

Safe to merge — the cancel guard, billing integration, and dispatch-visibility changes are well-reasoned and each corner case is explicitly addressed with a comment or test.

The three concurrent-hazard fixes (status-based cancel guard, completeDispatchIfActive CAS, mid-sleep DB cancel check) are all correctly implemented and mutually consistent. The preprocessExecution integration preserves the existing billing/usage contract. The hasActiveDispatch and dispatch-start SSE changes are narrowly scoped to the client overlay and carry no data-correctness risk.

workflow-column-execution.ts and dispatcher.ts warrant the most attention — they carry the new usage-limit and rate-limit paths. The interactions between completeDispatchIfActive, the post-window readDispatch guard, and the absence of a complete dispatch SSE event after a usage-limit halt are correct but subtle.

Important Files Changed

Filename Overview
apps/sim/background/workflow-column-execution.ts Largest change: adds preprocessExecution gate (billing/usage/timeout), rate-limit pacing loop with jitter and mid-sleep cancel check, and cancelledBeforeRun pre-execution guard. The blocked propagation through the cascade and re-drive loops is correct.
apps/sim/lib/table/dispatcher.ts Adds dispatch-start SSE, completeDispatchIfActive CAS, dispatchId injection into buildPendingRuns payloads, and a post-window readDispatch check to suppress stale events after a usage-limit halt.
apps/sim/lib/table/service.ts SQL cancel guard broadened to status-only; deriveExecClearsForDataPatch now skips dep propagation when areGroupDepsSatisfied is false after the patch, preventing uncheck-no-rerun regressions.
apps/sim/lib/table/deps.ts New isExecCancelled, isExecCancelledAfter, and isDepValueUnmet predicates (false now unmet). Well-tested in deps.test.ts.
apps/sim/lib/table/cell-write.ts Cancel guard now uses isExecCancelled (status-only, not executionId-scoped), correctly preventing worker resurrection of a killed pre-stamp.
apps/sim/background/resume-execution.ts Adds a cancelled-cell guard before the resume worker runs, preventing wasted compute on paused cells killed by Stop All.
apps/sim/app/workspace/[workspaceId]/tables/[tableId]/hooks/use-table-event-stream.ts Adds applyUsageLimit handler and stable-ref callback. SSE dispatch-event handler correctly manages both the new early dispatch-start and existing per-window events.
apps/sim/lib/execution/preprocessing.ts Adds logPreprocessingErrors option so table cells can suppress execution-log row writes.
apps/sim/lib/table/events.ts New usageLimitReached event type with optional dispatchId, matching the server-side sentinel semantics.
apps/sim/app/workspace/[workspaceId]/tables/[tableId]/table.tsx Wires onUsageLimitReached into useTableEventStream and adds hasActiveDispatch to the two RunStatusControl gates.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Dispatcher
    participant CellWorker
    participant PreprocessExecution
    participant RateLimiter
    participant DB

    Client->>Dispatcher: trigger dispatch
    Dispatcher->>DB: pending to dispatching
    Dispatcher->>Client: SSE dispatch status dispatching cursor 0
    Note over Client: hasActiveDispatch true, Show X running and Stop All

    Dispatcher->>DB: stamp cells pending null executionId
    Dispatcher->>CellWorker: batchEnqueueAndWait

    CellWorker->>DB: getRowById
    CellWorker->>CellWorker: cancelledBeforeRun check
    alt Stop All fired
        DB-->>CellWorker: exec status cancelled
        CellWorker-->>Dispatcher: return error skip
    else Not cancelled
        CellWorker->>PreprocessExecution: checkUsage billing checkRateLimit false
        alt Usage limit hit 402
            PreprocessExecution-->>CellWorker: success false statusCode 402
            CellWorker->>DB: clear pre-stamp executionsPatch null
            CellWorker->>DB: completeDispatchIfActive CAS
            CellWorker->>Client: SSE usageLimitReached
            Note over Client: Remove dispatch overlay toast upgrade prompt
            CellWorker-->>Dispatcher: return blocked
        else Usage OK
            loop Rate-limit pacing up to 6 attempts
                CellWorker->>RateLimiter: checkRateLimitWithSubscription
                alt Rate limited
                    CellWorker->>CellWorker: sleep jitter backoff
                    CellWorker->>DB: re-check cancelled tombstone
                else Allowed
                    CellWorker->>DB: markWorkflowGroupPickedUp
                    CellWorker->>CellWorker: executeWorkflow with timeout
                    CellWorker->>DB: writeTerminalState
                end
            end
        end
    end

    Dispatcher->>DB: readDispatch check if halted
    alt Dispatch still active
        Dispatcher->>Client: SSE dispatch status dispatching cursor N
        Dispatcher->>Dispatcher: next window
    else Halted
        Dispatcher-->>Client: done no stale event
    end
Loading

Reviews (7): Last reviewed commit: "fix(tables): don't emit stale dispatchin..." | Re-trigger Greptile

Comment thread apps/sim/background/workflow-column-execution.ts
Comment thread apps/sim/background/workflow-column-execution.ts
… cancel

Addresses PR review:
- Usage limit: only the cell that transitions the dispatch active→complete
  (via completeDispatchIfActive) emits usageLimitReached, so concurrent cells
  don't fire up to 20 identical "upgrade" toasts.
- Rate-limit retry: re-check the cancelled tombstone after each sleep so a
  Stop All mid-wait releases the concurrency slot promptly (signal never fires
  on the trigger.dev backend).
@TheodoreSpeaks
Copy link
Copy Markdown
Collaborator Author

@greptile review

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 1, 2026

Greptile Summary

This PR hardens table cell execution across four dimensions: reliable stop-all cancellation (status-based guard instead of executionId-scoped), accurate "X running" display (new dispatch-start SSE event), correct checkbox dependency semantics (false treated as unmet), and usage/billing/rate-limit gating for table cell runs via preprocessExecution.

  • Cancel guard: both the in-memory check (cell-write.ts) and the SQL WHERE clause (service.ts) are now status-only (status <> 'cancelled'), closing the resurrection bug where a worker claiming a pre-stamp with a new executionId could overwrite a stop-click; resume workers gain a matching pre-resume check.
  • Usage limit halt: workers that hit a 402 from preprocessExecution call markDispatchComplete and emit a usageLimitReached SSE event that surfaces an upgrade toast and clears the dispatch overlay; rate-limit 429s pace the cell via a bounded retry loop (up to 6 attempts) rather than skipping the row.
  • Checkbox fix: areGroupDepsSatisfied now treats false as unmet, so unchecking a box no longer re-arms downstream groups; deriveExecClearsForDataPatch gates on the post-patch dep state before propagating dirty marks.

Confidence Score: 3/5

Mostly safe to merge with one real UX defect: concurrent workers hitting the usage cap all fire usageLimitReached events, and the client shows a duplicate toast for every event beyond the first because the callback is invoked unconditionally rather than only when the dispatch is actually removed from the cache.

The cancel-guard, checkbox-dep, and resume-worker fixes are clean and well-tested. The dispatch-start SSE and preprocessExecution integration are straightforward. The defect in applyUsageLimit — calling onUsageLimitReachedRef.current regardless of whether the cache actually changed — will produce repeated identical upgrade toasts in any multi-worker dispatch that exhausts the usage limit before the first event propagates to all workers.

apps/sim/app/workspace/[workspaceId]/tables/[tableId]/hooks/use-table-event-stream.ts (the applyUsageLimit handler) and apps/sim/lib/table/events.ts (dispatchId optionality in the usageLimitReached event type).

Important Files Changed

Filename Overview
apps/sim/app/workspace/[workspaceId]/tables/[tableId]/hooks/use-table-event-stream.ts Adds usageLimitReached SSE event handling; unconditional callback invocation can fire multiple toasts when concurrent workers all hit the 402 limit for the same dispatch.
apps/sim/background/workflow-column-execution.ts Adds preprocessExecution gating (usage/billing/timeout), rate-limit pacing with backoff, and pre-execution cancel checks; uses dispatchId ?? '' as an empty-string sentinel for cascade payloads.
apps/sim/lib/table/deps.ts Adds isExecCancelled, isExecCancelledAfter, and isDepValueUnmet predicates; boolean false treated as unmet dependency; clean, well-tested.
apps/sim/lib/table/service.ts SQL cancel guard simplified to status-only (status <> 'cancelled') fixing the resurrection bug; deriveExecClearsForDataPatch now checks areGroupDepsSatisfied before arming dependents, fixing uncheck-no-rerun.
apps/sim/lib/table/dispatcher.ts Emits dispatch-start SSE event before first window completes; payloads now carry dispatchId; tombstone filter refactored to use shared isExecCancelledAfter; markDispatchComplete exported for worker use.
apps/sim/background/resume-execution.ts Adds cancel guard for paused/awaiting cells before resuming; reads current cell state via getRowById and bails if already cancelled — correctly closes the resume-after-stop gap.
apps/sim/lib/table/events.ts Adds usageLimitReached event type; dispatchId is required (string) even for cascade payloads that have no dispatch, forcing an empty-string sentinel at the call site.

Sequence Diagram

sequenceDiagram
    participant D as Dispatcher
    participant W as Cell Worker
    participant DB as Database
    participant SSE as SSE Stream
    participant UI as Client UI

    D->>DB: stamp cells pending
    D->>SSE: dispatch-start event
    SSE->>UI: hasActiveDispatch true, show X running
    D->>W: enqueue trigger.dev task with dispatchId

    W->>DB: getRowById cancelledBeforeRun check
    alt cell cancelled before dequeue
        W-->>W: return error, skip without overwriting cancel
    else cell not cancelled
        W->>W: preprocessExecution billing usage timeout
        alt usage limit 402
            W->>DB: markDispatchComplete
            W->>SSE: usageLimitReached event
            SSE->>UI: remove dispatch and show upgrade toast once
            W-->>W: return blocked
        else rate limited 429
            loop up to 6 attempts
                W->>W: sleep clamped retryAfterMs
                W->>W: checkRateLimit
            end
        end
        W->>DB: markWorkflowGroupPickedUp with SQL guard status not cancelled
        alt Stop All cancelled cell during sleep
            DB-->>W: skipped, guard rejected
        else normal execution
            W->>W: executeWorkflow with timeout signal
            W->>DB: writeWorkflowGroupState
            W->>SSE: cell event
            SSE->>UI: update cell state
        end
    end
Loading

Reviews (2): Last reviewed commit: "fix(tables): dedupe usage-limit event + ..." | Re-trigger Greptile

Comment thread apps/sim/lib/table/events.ts
Comment thread apps/sim/background/workflow-column-execution.ts Outdated
@TheodoreSpeaks
Copy link
Copy Markdown
Collaborator Author

@greptile review

Passing the bucket's shared resetAt as retryAfterMs made backoffWithJitter
return a fixed clamped value (no jitter, attempt ignored), so all concurrent
cells retried in lockstep. Pass null to get jittered exponential backoff.
@TheodoreSpeaks
Copy link
Copy Markdown
Collaborator Author

@greptile review

Comment thread apps/sim/background/workflow-column-execution.ts
…kflow

# Conflicts:
#	apps/sim/app/workspace/[workspaceId]/tables/[tableId]/table.tsx
Addresses PR review:
- Clear each blocked cell's pre-stamp on a 402 so it reverts to un-run instead
  of being stuck "Queued" (no error/cancelled badge); covers auto-fire cells
  with no owning dispatch.
- Client re-syncs run-state counts and refetches rows on usageLimitReached so
  the stale "X running" / Stop-all control clears and queued cells drop.
- Make usageLimitReached.dispatchId optional; client only touches the dispatch
  overlay when present.
@TheodoreSpeaks
Copy link
Copy Markdown
Collaborator Author

@greptile review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 8bd4694. Configure here.

Comment thread apps/sim/lib/table/dispatcher.ts
If a cell halts the dispatch mid-window (usage limit), re-read the dispatch
status after the batch and bail instead of emitting a per-window 'dispatching'
event that would arrive after the client dropped the dispatch and re-add it
(flickering 'X running' back).
@TheodoreSpeaks
Copy link
Copy Markdown
Collaborator Author

@greptile review

@TheodoreSpeaks TheodoreSpeaks merged commit 3ccb3a3 into staging Jun 2, 2026
14 checks passed
@TheodoreSpeaks TheodoreSpeaks deleted the fix/table-stop-workflow branch June 2, 2026 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant