feat: Token usage details logging for tracking #1003

Draft
Dhruvkumar-Microsoft wants to merge 6 commits into dev-v4 from psl-token-usage

@Dhruvkumar-Microsoft
Collaborator

Purpose

This pull request introduces comprehensive tracking and reporting of token usage for agents and models throughout the orchestration and API layers. The changes enable detailed monitoring of token consumption per agent and model, real-time updates via WebSocket, and improved logging and analytics for Responsible AI (RAI) checks. The most important changes are grouped below.

Token Usage Tracking and Reporting Enhancements:

  • Added new fields to the Plan data model that record total input, output, and overall token counts, per-agent and per-model usage breakdowns, and orchestration execution times. (src/backend/common/models/messages_af.py)
  • Introduced the TokenUsageUpdate dataclass and a new WebSocket message type (TOKEN_USAGE_UPDATE) to report token usage per agent in real time, including cumulative statistics. (src/backend/v4/models/messages.py)
  • Added infrastructure in the orchestration manager to extract token usage from agent responses and events, track cumulative usage per agent and per model, and associate each agent with its deployment model for each user session. (src/backend/v4/orchestration/orchestration_manager.py, src/backend/v4/config/settings.py)
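
As a rough illustration of the per-agent and per-model accumulation described above, here is a minimal sketch. It is not the code in this PR: the field names on `TokenUsageUpdate`, the `TokenUsageTracker` class, and the `"token_usage_update"` type string are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import asdict, dataclass


@dataclass
class TokenUsageUpdate:
    """Hypothetical message shape; the real dataclass may use different fields."""

    agent_name: str
    model: str
    input_tokens: int
    output_tokens: int
    cumulative_input_tokens: int
    cumulative_output_tokens: int
    type: str = "token_usage_update"  # assumed wire value for TOKEN_USAGE_UPDATE


class TokenUsageTracker:
    """Hypothetical helper that keeps running totals per agent and per model."""

    def __init__(self) -> None:
        self._by_agent = defaultdict(lambda: {"input": 0, "output": 0})
        self._by_model = defaultdict(lambda: {"input": 0, "output": 0})

    def record(self, agent_name: str, model: str,
               input_tokens: int, output_tokens: int) -> TokenUsageUpdate:
        # Add this call's usage to both the agent bucket and the model bucket.
        for bucket in (self._by_agent[agent_name], self._by_model[model]):
            bucket["input"] += input_tokens
            bucket["output"] += output_tokens
        agent_totals = self._by_agent[agent_name]
        return TokenUsageUpdate(
            agent_name=agent_name,
            model=model,
            input_tokens=input_tokens,
            output_tokens=output_tokens,
            cumulative_input_tokens=agent_totals["input"],
            cumulative_output_tokens=agent_totals["output"],
        )

    def totals_by_model(self) -> dict:
        # Return plain copies so callers cannot mutate the internal state.
        return {model: dict(totals) for model, totals in self._by_model.items()}


tracker = TokenUsageTracker()
tracker.record("planner", "model-a", 120, 30)
update = tracker.record("planner", "model-a", 80, 20)
payload = asdict(update)  # dict that could be serialized for a WebSocket client
```

Keeping the agent and model buckets separate means several agents that share one deployment still roll up into a single model total, which is the figure cost reporting would usually need.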

Responsible AI (RAI) Compliance and Analytics:

  • Refactored the RAI agent invocation utility to return both the compliance verdict and token usage details, and updated all RAI-check call sites to handle and record both values. (src/backend/common/utils/utils_af.py, src/backend/v4/api/router.py)
  • Integrated Application Insights event tracking for RAI agent token usage and model-level token consumption, enabling analytics on RAI compliance checks and resource usage. (src/backend/v4/api/router.py)
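
The refactor above changes the RAI utility's return value from a single verdict to a verdict plus usage. The sketch below shows what that pattern could look like at a call site. The names `TokenUsage`, `rai_check`, and `rai_event_properties`, the response dictionary layout, and the `"PASS"` verdict string are hypothetical; the telemetry call itself is left out, and only the properties dictionary that would be sent with an event is built.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class TokenUsage:
    """Hypothetical container for the usage reported by one agent call."""

    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


def rai_check(text: str,
              invoke_agent: Callable[[str], dict]) -> Tuple[bool, TokenUsage]:
    """Return (passed, usage) instead of a bare verdict.

    `invoke_agent` stands in for the real agent call and is assumed to return
    a dict such as {"verdict": "PASS", "usage": {"input_tokens": 1, ...}}.
    """
    response = invoke_agent(text)
    raw = response.get("usage") or {}
    usage = TokenUsage(
        input_tokens=int(raw.get("input_tokens", 0)),
        output_tokens=int(raw.get("output_tokens", 0)),
    )
    # Fail closed: anything other than an explicit PASS counts as not passed.
    passed = str(response.get("verdict", "")).strip().upper() == "PASS"
    return passed, usage


def rai_event_properties(passed: bool, usage: TokenUsage,
                         model: str) -> Dict[str, str]:
    """Build string-valued properties suitable for a custom telemetry event."""
    return {
        "rai_passed": str(passed),
        "model": model,
        "input_tokens": str(usage.input_tokens),
        "output_tokens": str(usage.output_tokens),
        "total_tokens": str(usage.total_tokens),
    }


def fake_agent(text: str) -> dict:
    # Stand-in for the RAI agent, used only to exercise the sketch.
    return {"verdict": "pass", "usage": {"input_tokens": 42, "output_tokens": 3}}


passed, usage = rai_check("Summarize the onboarding plan.", fake_agent)
props = rai_event_properties(passed, usage, model="model-a")
```

Returning a tuple forces every call site to unpack both values, so a caller cannot silently drop the usage data when it only cares about the verdict.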

These changes lay the groundwork for robust observability of LLM agent and model usage, facilitate cost and compliance monitoring, and enable real-time feedback to clients about token consumption.

Does this introduce a breaking change?

  • Yes
  • No

How to Test

  • Get the code
git clone [repo-address]
cd [repo-name]
git checkout [branch-name]
npm install
  • Test the code

What to Check

Verify that the following are valid

  • ...

Other Information

@github-actions

github-actions Bot commented May 25, 2026

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| common/models/messages_af.py | 189 | 13 | 93% | 229, 238–239, 241–248, 251–252 |
| common/utils/utils_af.py | 153 | 49 | 67% | 46–49, 129–145, 147–152, 165–166, 168–173, 175–180, 204, 206, 213–215, 217–218, 299 |
| v4/config/settings.py | 221 | 16 | 92% | 114, 163–165, 176, 196, 207–209, 250–251, 302–304, 329–330 |
| v4/models/messages.py | 137 | 11 | 91% | 26, 49, 59, 69, 130, 135–137, 151, 176, 198 |
| v4/orchestration/orchestration_manager.py | 475 | 190 | 60% | 267–268, 300, 313–318, 332–339, 342–347, 350–355, 357–360, 366–369, 371–375, 377–380, 383–389, 394–400, 412–418, 423, 428–432, 436–438, 440–445, 482–484, 488, 494, 506–508, 512, 518, 572–573, 593, 625–632, 634–643, 648–649, 657, 661–662, 712–719, 721–730, 735–736, 744, 748–749, 758–759, 766, 768–772, 774–777, 779, 829, 834, 838, 848–855, 883, 890–891, 900–901, 910–911, 945–956, 960–961, 965–966, 968–969 |
| TOTAL | 3321 | 577 | 82% | |

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 883 | 5 💤 | 14 ❌ | 0 🔥 | 8.363s ⏱️ |
