Skip to content

feat: 支持文件编辑的模糊匹配 && ChatUI工具调用结果美化#8413

Open
elecvoid243 wants to merge 17 commits into
AstrBotDevs:masterfrom
elecvoid243:file_edit_tool
Open

feat: 支持文件编辑的模糊匹配 && ChatUI工具调用结果美化#8413
elecvoid243 wants to merge 17 commits into
AstrBotDevs:masterfrom
elecvoid243:file_edit_tool

Conversation

@elecvoid243
Copy link
Copy Markdown
Contributor

@elecvoid243 elecvoid243 commented May 29, 2026

Modifications / 改动点

📋 概述

本次 PR 包含两大核心改进:

  1. 后端astrbot_file_edit_tool 引入 9 级多策略模糊匹配引擎,大幅提升 LLM 生成编辑操作的成功率,并返回git diff信息
  2. 前端 — ChatUI 工具调用显示美化:编辑结果以 Git diff 可视化展示,各类工具结果差异化渲染

🔧 第一部分:astrbot_file_edit_tool 模糊匹配引擎

问题背景

LLM 在生成文件编辑操作时,经常出现与原始文件内容不完全匹配的情况:

  • 缩进偏移(多余/缺失空格或 Tab)
  • 转义序列不一致(\n 字面量 vs 实际换行符)
  • 空行/尾部空白差异
  • 整体缩进级别偏差

原有的 astrbot_file_edit_tool 仅支持精确匹配,这些微小差异会导致编辑失败。

解决方案

参考 opencode 等编程智能体的多级回退机制,新增 astrbot/core/tools/computer_tools/edit_engine.py,实现了一个多策略替换器链(Replacer Chain),从最严格到最宽松依次尝试 9 种匹配策略,这让astrbot_file_edit_tool的修改成功率大幅提高:

序号 替换器 策略描述
1 SimpleReplacer 精确匹配(兼容原有行为)
2 EscapeNormalizedReplacer 处理 \n\t\xNN\uNNNN 等转义序列
3 LineTrimmedReplacer 每行去除首尾空白后匹配
4 BlockAnchorReplacer 首尾行作为锚点,中间行用 Levenshtein 相似度匹配
5 WhitespaceNormalizedReplacer 将所有连续空白折叠为单个空格后匹配
6 IndentationFlexibleReplacer 去除公共缩进前缀后匹配,保留原始缩进
7 TrimmedBoundaryReplacer 去除整体首尾空白后匹配
8 ContextAwareReplacer 首尾行作为上下文锚点,≥50% 中间行匹配即接受
9 MultiOccurrenceReplacer 查找所有精确匹配(配合 replace_all 使用)

核心函数

def robust_replace(
    content: str,
    old_string: str,
    new_string: str,
    *,
    replace_all: bool = False,
) -> tuple[str, int]:
  • 依次遍历 9 个替换器,遇到第一个唯一定位即返回
  • 若在单替换模式发现多个匹配,继续尝试更严格的策略
  • 返回 (新内容, 替换次数) 元组
  • old_string == new_string 则直接抛出 ValueError

额外保障

  • 文件级异步锁 — 使用 asyncio.Lock + WeakValueDictionary 防止并发写入冲突
  • BOM 保留 — 自动检测并保留 UTF-8 BOM(\xef\xbb\xbf
  • 行尾保留 — 自动检测 \n vs \r\n,编辑后恢复原始行尾格式
  • Unified Diff 输出 — 编辑结果以 diff -u 格式返回,前端可直接渲染

fs.py 的改动

FileEditTool 支持两种运行模式:

  • 本地运行时 — 使用完整的 edit_file() 函数(BOM + CRLF + 锁 + diff)
  • 沙箱运行时 — 通过 booter 中介读写,同样使用 robust_replace() 模糊匹配后写回,并生成 build_unified_diff() 统一 diff 输出

同时优化了工具描述和参数说明:

  • 明确提示 9 种匹配策略的存在
  • 强调从 Read tool 输出中保留精确缩进(行号前缀格式说明)
  • 提醒 replace_all 用于变量重命名场景
  • 结果输出格式化为 _format_result() 函数,包含编辑摘要 + markdown diff 代码块(超过 2000 字符截断)

🎨 第二部分:ChatUI 工具调用界面优化

新增组件

1. DiffPreview.vue

GitHub 风格的 unified diff 渲染组件:

  • 解析 @@ -a,b +c,d @@ hunk 头,逐行渲染
  • 行号显示(旧/新),添加行空白旧行号、删除行空白新行号
  • 新增行(绿色背景 + 深绿色左边框)、删除行(红色背景 + 深红色左边框),悬停高亮
  • 可折叠的头部,显示文件路径(带 mdi-file-document-edit-outline 图标)和增减统计(+N −M
  • 摘要文本(如 "Edited path. Replaced N occurrence(s)...")独立显示
  • 内容截断支持,超过 30 行可点击"Show all"(显示剩余行数),超过 2000 字符显示截断警告
  • 完整深色/浅色主题适配(CSS 变量)
  • 支持 collapsible 属性控制是否可折叠

2. ToolResultView.vue

按工具类型差异化渲染结果,集成 Shiki 语法高亮和系统通知智能剥离:

工具 渲染方式
astrbot_file_read_tool 显示文件路径(优先从 tool args 获取)和范围元数据,使用 Shiki 自动检测语言进行代码高亮,无高亮时回落纯文本
astrbot_file_write_tool 成功/失败状态徽章(mdi-check-circle / mdi-alert-circle
astrbot_grep_tool 结构化结果列表:显示匹配计数 + file:lineno:text 格式行(蓝色文件名、灰色行号、正文)
astrbot_execute_shell 终端风格三行显示 — stdout 黑底代码块、stderr 红色文字、exit code 彩色标签(绿色成功/红色失败),[SYSTEM NOTICE] 不剥离(shell 自行处理)
astrbot_execute_python / astrbot_execute_ipython 终端风格黑色背景等宽字体块
其他工具 智能 JSON 格式化回落 / 原始文本

所有非 shell 工具使用 findSystemNoticeIndex() 智能剥离 [SYSTEM NOTICE] 后缀,在结果底部单独 .result-suffix 区域渲染。

3. systemNotice.ts

系统通知后缀智能检测与剥离工具:

  • findSystemNoticeIndex(text):找到真实系统通知链的起始位置
    • 同时支持 [SYSTEM NOTICE] 标记和 "Truncated tool output preview shown above." 截断通知
    • 特殊情况处理:考虑到读取/修改文件操作时,如果文件内容里也包含了[SYSTEM NOTICE]字段(例如阅读的是astrbot日志文件),可能造成误匹配。使用已知前缀白名单("By the way," / "Important:" / "User sent")区分真实系统通知与文件中误包含的相同文字,并通过关键不变量:从第一个真实候选位置到文本末尾必须全部是通知链内容(因为notice会加入到tool结果的末尾,后面不再会有除了notice以外的其他内容),来找到真正的notice字段
    • 支持链式通知(如重复工具警告 + 跟进通知)
  • splitSystemNotice(text):将结果拆分为 { content, notice } 两部分

修改组件

4. ToolCallCard.vue

  • Args 展示 — 从原始 JSON 块升级为带图标的键值对表格
    • 路径参数显示 📁、内容参数显示 📝、搜索参数显示 🔍、命令参数显示 ⚡、Python 参数显示 🐍、replace_all 显示 🔄、偏移/限制显示 #⃣
    • 超过 60 字符的参数自动截断,点击 可展开
    • 超过 5 个参数支持 "+N more" 折叠/展开
  • astrbot_file_edit_tool — 结果使用 DiffPreview 渲染,自动提取文件路径(正则 Edited (.+?)\\.)和摘要文本,系统通知 [SYSTEM NOTICE]edit-tool-notice 区域单独显示
  • 其他工具 — 委托给 ToolResultView 组件差异化渲染
  • 图标系统 — 每种工具使用专属 MDI 图标(文件读写编辑、搜索、终端、Python、上传下载、网络等)
  • 移除了旧的 @keyframes spin 动画和 formattedResult computed(由 ToolResultView 负责)

5. ThemeAwareMarkdownCodeBlock.vue

  • 当代码块语言为 diff 时,使用 DiffPreview 代替标准代码块渲染(max 30 行),超出可展开显示

6. IPythonToolBlock.vue

  • 样式与 ToolResultView 统一(字体大小 11.5px、行高 1.55、背景色、圆角 4px)
  • Shiki 高亮代码块使用 code-result-shiki class,与 ToolResultView 共享样式
  • 结果区域新增 result-suffix:由 findSystemNoticeIndex() 提取的 [SYSTEM NOTICE] 后缀独立展示
  • formattedResult 先尝试 JSON 格式化,失败回落原文(截断系统通知后部分)

7. ReasoningSidebar.vue

  • 新增拖拽调整宽度功能
    • 最小宽度 280px,最大宽度 1200px,默认宽度 380px
    • 左侧 6px 拖拽手柄,悬停/拖拽时显示主题色半透明背景
    • onBeforeUnmount 中自动清理事件监听器
  • 移动端适配(≤760px):固定全屏宽度覆盖 !important、隐藏拖拽手柄

Shiki 语法高亮扩展

shikiLimitedBundle.js 新增 4 种语言和对应别名:

新增语言 别名
C (c) h
C++ (cpp) cc, cxx, hpp, c++
Go (go) golang
Rust (rust) rs

此外,ToolResultView.vue 内置了 EXT_TO_LANG 映射表,根据文件扩展名自动检测语言进行高亮。


已知问题说明

问题背景

astrbot_file_edit_tool 在沙箱模式下编辑文件时会丢失 CRLF 行尾(\r\n\n),这使得文件内容可能错位。在local环境则没有该问题。

有一种临时方案能解决该问题,并确保结果的正确性,但更好的修复方式超出了本PR的范围。

根因分析

沙箱的所有 booter(ShipyardBooter、ShipyardNeoBooter、CUABooter)的 FileSystemComponent.read_file() 均使用文本模式读取,自动将 CRLF 规范化为 LF。而 edit_engine.py 中已有的行尾保留逻辑仅在本地运行时模式下使用。

运行时 读取方式 行尾处理 结果
本地 (edit_engine.py:edit_file) open("rb") 二进制读取 _detect_line_ending + _convert_to_line_ending ✅ CRLF 保留
沙箱 (fs.py:FileEditTool.call) sb.fs.read_file() 文本读取 无行尾处理 ❌ CRLF 丢失

临时解决方案

核心方案:考虑到沙箱都是配备python环境的,在沙箱模式下,使用 sb.python.exec() 执行二进制 I/O(open("rb") / open("wb")),通过 base64 编码安全传输二进制数据,复用 edit_engine.py 中已有的行尾保留逻辑。

实现细节

edit_engine.py — 提取核心编辑逻辑

从现有的 edit_file() 中提取出 bytes_edit_file() 函数,接受 bytes 输入并返回 bytes 输出,内置行尾检测与恢复逻辑:

def bytes_edit_file(
    raw_bytes: bytes,
    old_string: str,
    new_string: str,
    *,
    replace_all: bool = False,
    encoding: str = "utf-8",
) -> tuple[bytes, EditResult]:
    """
    核心行尾感知编辑逻辑,操作原始字节。
    
    执行步骤:
    1. BOM 检测与保留
    2. 行尾检测(CRLF vs LF)
    3. 规范化为 LF 进行匹配(替换器链基于 LF 工作)
    4. 执行 robust_replace()
    5. 转换回原始行尾格式
    6. 如存在 BOM 则重新附加
    """

简化现有的 edit_file() 使其委托给 bytes_edit_file()

fs.py — 沙箱二进制 I/O

新增两个辅助函数:

  • _sandbox_read_bytes(sb, path) — 通过 sb.python.exec 在沙箱内以 open("rb") 二进制读取,返回 base64 编码后解码的原始字节
  • _sandbox_write_bytes(sb, path, data) — 将数据 base64 编码后通过 sb.python.exec 在沙箱内以 open("wb") 写入

修改 FileEditTool.call() 沙箱分支为三步流程:二进制读取 → bytes_edit_file() 行尾感知编辑 → 二进制写入。

修复后的数据流

原始 CRLF 文件 (\r\n)
    ↓
_sandbox_read_bytes()  ← sb.python.exec("open('rb')") 二进制读取
    ↓
raw_bytes: b"line1\r\nline2\r\nline3\r\n"  ← 原始字节完整保留
    ↓
bytes_edit_file()
    ├─ _detect_line_ending() → "\r\n"  ← 正确检测 CRLF
    ├─ _normalize_line_endings() → "\n"  ← 规范化为 LF 供匹配引擎使用
    ├─ robust_replace() → 替换成功
    └─ _convert_to_line_ending("\r\n") → 恢复 CRLF
    ↓
write_bytes: b"line1\r\nmodified\r\nline3\r\n"  ← CRLF 完整恢复
    ↓
_sandbox_write_bytes()  ← sb.python.exec("open('wb')") 二进制写入
    ↓
最终文件 (CRLF: \r\n)  ← 行尾完美保留 ✅

测试结果

测试用例 描述 结果
CRLF 文件 × LF 搜索字符串 用 LF 搜索字符串编辑 CRLF 文件 ✅ CRLF 保留
CRLF 文件 × CRLF 搜索字符串 用 CRLF 搜索字符串编辑 CRLF 文件 ✅ CRLF 保留
LF 文件保持 LF 编辑 LF 文件,验证 LF 不变 ✅ LF 不变
replace_all × CRLF CRLF 文件批量替换 ✅ 3 处替换,CRLF 保留
BOM + CRLF 保留 同时保留 BOM 和 CRLF ✅ 完美保留
文件未找到错误处理 验证文件缺失时的正确错误处理 ✅ 正确抛出 ValueError
沙箱辅助函数往返测试 CRLF 字节经过读→写后存活 ✅ 字节完整

修复前通过率 62.5%(5/8),修复后通过率 100%(8/8)

已知局限

当前通过 sb.python.exec() 实现沙箱二进制 I/O 是一个可用的临时方案,引入了代码生成、base64 编码、子进程执行等间接开销。对于典型编辑场景(<100KB 文件)开销可忽略,但大文件或高频操作可能有明显延迟。

建议的未来改进:为沙箱运行时添加原生二进制 I/O 支持(FileSystemComponent.read_file_bytes() / write_file_bytes()),消除 Python exec 中介。但这超出了本PR的范畴。


  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

工具调用显示页面

为不同的内置工具设置了各自的样式
共性

  1. 传入的ARGS字段以表格形式呈现
  2. 隔离了工具结果和[SYSTEM NOTICE]的显示

astrbot_execute_shell

当stderr有值时,会显示为红色
shell_tool
shell_tool2

astrbot_file_write_tool

write_tool

astrbot_grep_tool

grep_tool

astrbot_file_edit_tool

文件修改会以diff形式显示
edit_tool

astrbot_file_read_tool

如果读取到的文件符合某种语法格式,会自动高亮显示
read_tool

astrbot_execute_python

与原版样式一致(但适当减小了字体)+ SYSTEM NOTICE单独显示
python_tool

一般工具调用

由于其他工具调用的返回没有统一格式,以及插件的存在,所以直接输出Result文本
tool_result


Checklist / 检查清单

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    / 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    / 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    / 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到 requirements.txtpyproject.toml 文件相应位置。

  • 😮 My changes do not introduce malicious code.
    / 我的更改没有引入恶意代码。

Summary by Sourcery

Introduce a robust multi-strategy file edit engine with diff output and enhance ChatUI tool call visualization, including diff previews, syntax-highlighted file reads, and resizable reasoning/thread sidebars.

New Features:

  • Add a fuzzy-matching file edit engine that preserves BOM/line endings and returns unified diffs for edits.
  • Render file edit results as Git-style diffs in ChatUI, with a unified diff viewer component and diff-aware markdown code block handling.
  • Provide specialized rendering for different tool results (file read/write, grep, shell) including syntax-highlighted code views and terminal-style outputs.
  • Allow reasoning and thread side panels to be horizontally resizable by dragging their edges.
  • Expand Shiki-limited syntax highlighting bundle with additional languages and aliases (C/C++, Go, Rust, etc.).

@dosubot dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. area:core The bug / feature is about astrbot's core, backend feature:chatui The bug / feature is about astrbot's chatui, webchat labels May 29, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a robust file editing engine with fuzzy matching capabilities, integrates it into the file system tools, adds drag-resize functionality to the chat sidebar and thread panel, and enhances the dashboard's tool result rendering with syntax highlighting and diff previews. The review feedback highlights several critical issues, including a Vue template compilation error in ToolResultView.vue due to duplicate v-else branches, a module-level asyncio.Lock anti-pattern in edit_engine.py, and potential race conditions in sandbox file editing. Additionally, the reviewer pointed out unhandled global style resets during component unmounting, state-handling mismatches in ToolCallCard.vue, and rendering bugs in DiffPreview.vue, while offering actionable suggestions to simplify state management and parsing logic.

Comment thread dashboard/src/components/chat/message_list_comps/ToolResultView.vue
Comment thread dashboard/src/components/chat/message_list_comps/ToolCallCard.vue Outdated
Comment thread astrbot/core/tools/computer_tools/edit_engine.py Outdated
Comment thread astrbot/core/tools/computer_tools/edit_engine.py Outdated
Comment thread dashboard/src/components/chat/message_list_comps/ToolCallCard.vue Outdated
Comment thread dashboard/src/components/chat/message_list_comps/DiffPreview.vue Outdated
Comment thread dashboard/src/components/chat/ReasoningSidebar.vue
Comment thread dashboard/src/components/chat/ThreadPanel.vue Outdated
Comment thread astrbot/core/tools/computer_tools/fs.py
Comment thread astrbot/core/tools/computer_tools/fs.py Outdated
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • In ToolResultView.vue there are two <template v-else> branches at the same level, which makes the template invalid and the second fallback unreachable; consider collapsing the logic into a single v-else-if/v-else chain or removing the redundant block.
  • The arg expansion logic in ToolCallCard.vue mutates displayedArgEntries inside toggleArgExpand, but since this is derived from argEntries the mutation won’t persist and the index may not align with the original entries; it would be more robust to derive the display field purely from argEntries plus expandedArgs in a computed, keyed by argument name rather than slice index.
  • ReasoningSidebar.vue and ThreadPanel.vue duplicate almost identical drag-resize logic; extracting this into a shared composable or utility would reduce code duplication and keep future behavior changes consistent between the two panels.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In ToolResultView.vue there are two `<template v-else>` branches at the same level, which makes the template invalid and the second fallback unreachable; consider collapsing the logic into a single `v-else-if`/`v-else` chain or removing the redundant block.
- The arg expansion logic in ToolCallCard.vue mutates `displayedArgEntries` inside `toggleArgExpand`, but since this is derived from `argEntries` the mutation won’t persist and the index may not align with the original entries; it would be more robust to derive the `display` field purely from `argEntries` plus `expandedArgs` in a computed, keyed by argument name rather than slice index.
- ReasoningSidebar.vue and ThreadPanel.vue duplicate almost identical drag-resize logic; extracting this into a shared composable or utility would reduce code duplication and keep future behavior changes consistent between the two panels.

## Individual Comments

### Comment 1
<location path="dashboard/src/components/chat/message_list_comps/ToolCallCard.vue" line_range="178-187" />
<code_context>
+  return "";
+}
+
+function toggleArgExpand(index) {
+  const entry = argEntries.value[index];
+  if (!entry || !entry.long) return;
+  if (expandedArgs.has(index)) {
+    expandedArgs.delete(index);
+  } else {
+    expandedArgs.add(index);
   }
-  if (name.includes("web_search") || name.includes("tavily")) {
-    return "mdi-web";
+  // Update display
+  const displayed = displayedArgEntries.value;
+  for (let i = 0; i < displayed.length; i++) {
+    const e = argEntries.value.find((x) => x.key === displayed[i].key);
+    if (e && e.long) {
+      displayed[i].display = expandedArgs.has(i) ? e.raw : e.raw.slice(0, 60);
+    }
   }
</code_context>
<issue_to_address>
**issue (bug_risk):** Arg expansion logic relies on mutating computed output and ignores expanded state on recomputation.

`argEntries` rebuilds fresh entries from `toolCall.args` on each recompute, but `toggleArgExpand` mutates their `display` field and maintains expansion in a separate `expandedArgs` Set. When `argEntries` recomputes, those mutations are lost. It also mutates `displayedArgEntries`, which is derived from a computed and should be treated as read-only. Instead, have `argEntries` compute `display` from `expandedArgs` (and `showAllArgs`) inside the computed—e.g., using `expandedArgs.has(index)` in the mapping—and let `toggleArgExpand` only add/remove indices in `expandedArgs` so display state remains derived and stable across recomputations.
</issue_to_address>

### Comment 2
<location path="dashboard/src/components/chat/message_list_comps/ToolResultView.vue" line_range="54-55" />
<code_context>
+      </div>
+    </template>
+
+    <!-- ── fallback ────────────────────────────────────────────── -->
+    <template v-else>
+      <div class="result-status" :class="resultOk ? 'success' : 'error'">
+        <v-icon size="16">{{ resultOk ? 'mdi-check-circle' : 'mdi-alert-circle' }}</v-icon>
</code_context>
<issue_to_address>
**issue (bug_risk):** There are two sibling `v-else` templates, which is invalid and will cause a template compile error.

The root `<div class="tool-result-view">` already has a `v-else` that renders the generic status (`<div class="result-status" ...>`), and then you add another `<template v-else>` for the raw fallback. Vue only allows a single `v-else` in the `v-if` chain and it must directly follow its `v-if`/`v-else-if` without other siblings. This second `v-else` is therefore invalid and will not behave as intended. If you need both status and raw fallback, use one `v-else` and branch inside it, or change the second `v-else` to an inner conditional within the existing branch.
</issue_to_address>

### Comment 3
<location path="astrbot/core/tools/computer_tools/edit_engine.py" line_range="440" />
<code_context>
+# ---------------------------------------------------------------------------
+
+
+def robust_replace(
+    content: str,
+    old_string: str,
</code_context>
<issue_to_address>
**issue (complexity):** Consider centralizing empty-string guarding and match-position construction in helpers to simplify `robust_replace` and the replacers while preserving behavior.

You can reduce complexity in `robust_replace` and the replacers without changing behavior by centralizing two cross-cutting concerns:

1. **Guarding against empty `old_string` once**, instead of in every replacer.
2. **Building/deduplicating `match_positions` in one helper**, instead of inlined in `robust_replace`.

### 1. Centralize empty-string handling

Right now almost every replacer starts with:

```python
if not find:
    return
```

You can enforce this once in `robust_replace` and drop it from the replacers:

```python
def robust_replace(
    content: str,
    old_string: str,
    new_string: str,
    *,
    replace_all: bool = False,
) -> tuple[str, int]:
    if not old_string:
        raise ValueError("old_string must be non-empty")

    if old_string == new_string:
        raise ValueError(
            "No changes to apply: old_string and new_string are identical."
        )

    not_found = True
    ...
```

Then in replacers you can safely remove the `if not find: return` boilerplate:

```python
def _simple_replacer(content: str, find: str) -> Iterator[str]:
    """Exact match."""
    yield find
```

Same for the other replacers (keeping any additional early-return logic that depends on `find` content, like `trimmed == find`).

### 2. Extract match-position builder

The `match_positions` building/deduplication logic inside `robust_replace` is non-trivial and repeated for every replacer invocation. You can move it into a small helper to make `robust_replace` easier to read:

```python
def _build_match_positions(content: str, matches: list[str]) -> list[tuple[int, str]]:
    """Return (index, match) pairs, deduplicated by overlapping region."""
    match_positions: list[tuple[int, str]] = []
    for match in matches:
        start = 0
        while True:
            idx = content.find(match, start)
            if idx == -1:
                break
            if not any(pos <= idx < pos + len(m) for pos, m in match_positions):
                match_positions.append((idx, match))
            start = idx + 1
    return match_positions
```

Then `robust_replace`’s core loop becomes:

```python
for replacer in _REPLACERS:
    matches = list(replacer(content, old_string))
    if not matches:
        continue

    match_positions = _build_match_positions(content, matches)
    if not match_positions:
        continue

    not_found = False

    if replace_all:
        new_content = content
        replacements = 0
        for idx, match in sorted(match_positions, key=lambda x: x[0], reverse=True):
            new_content = (
                new_content[:idx] + new_string + new_content[idx + len(match):]
            )
            replacements += 1
        return new_content, replacements

    if len(match_positions) == 1:
        idx, match = match_positions[0]
        return content[:idx] + new_string + content[idx + len(match):], 1

    # Multiple matches: try next replacer
```

This keeps all existing behavior (including the overlapping-match dedup logic) but:

- Makes `robust_replace` shorter and easier to follow.
- Encapsulates the trickiest part (overlap-aware dedup) behind a small, named helper.
- Avoids readers having to re-parse the same logic inline every time they look at the function.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread dashboard/src/components/chat/message_list_comps/ToolCallCard.vue Outdated
Comment thread dashboard/src/components/chat/message_list_comps/ToolResultView.vue
Comment thread astrbot/core/tools/computer_tools/edit_engine.py
@elecvoid243 elecvoid243 marked this pull request as draft May 29, 2026 13:48
@elecvoid243 elecvoid243 marked this pull request as ready for review May 29, 2026 14:56
@elecvoid243
Copy link
Copy Markdown
Contributor Author

@sourcery-ai review

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 5 issues, and left some high level feedback:

  • In FileEditTool.call, the local-runtime path normalizes line endings via edit_file while the sandbox path passes old/new directly into robust_replace; if sandbox read_file preserves CRLFs this could lead to inconsistent matching behavior between runtimes, so consider applying the same _normalize_line_endings logic before robust_replace in the sandbox branch for parity.
  • The DiffPreview.parseUnifiedDiff logic currently truncates by line count and character count independently; if a diff is both very long and highly fragmented into many hunks, it may be useful to cap the number of hunks or clearly indicate when hunks are partially omitted so users are not confused by missing context around the first truncated hunk.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `FileEditTool.call`, the local-runtime path normalizes line endings via `edit_file` while the sandbox path passes `old`/`new` directly into `robust_replace`; if sandbox `read_file` preserves CRLFs this could lead to inconsistent matching behavior between runtimes, so consider applying the same `_normalize_line_endings` logic before `robust_replace` in the sandbox branch for parity.
- The `DiffPreview.parseUnifiedDiff` logic currently truncates by line count and character count independently; if a diff is both very long and highly fragmented into many hunks, it may be useful to cap the number of hunks or clearly indicate when hunks are partially omitted so users are not confused by missing context around the first truncated hunk.

## Individual Comments

### Comment 1
<location path="astrbot/core/tools/computer_tools/fs.py" line_range="487-496" />
<code_context>
+            if local_env:
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Line-ending normalization is inconsistent between local and sandbox runtimes

In the local path, `edit_file` normalizes line endings before matching and then restores the originals, but in the sandbox path you pass `old_content` and `old` directly to `robust_replace`. This can cause CRLF files with LF-based `old` patterns to match locally but fail in sandbox. Please apply the same normalization for `old_content`, `old`, and `new` in the sandbox path (or extract the normalization into a shared helper used by both paths) to keep behavior consistent across environments.

Suggested implementation:

```python
        try:
            if local_env:
                normalized_path = _normalize_rw_path(
                    path,
                    restricted=restricted,
                    local_env=local_env,
                    umo=umo,
                    write=True,
                )

```

```python
        # Normalize line endings for consistent matching across local/sandbox
        # Convert CRLF -> LF for matching, then restore the original style afterwards
        original_has_crlf = "\r\n" in old_content

        normalized_old_content = old_content.replace("\r\n", "\n")
        normalized_old = old.replace("\r\n", "\n")
        normalized_new = new.replace("\r\n", "\n")

        new_content, num_replacements = robust_replace(
            normalized_old_content,
            normalized_old,
            normalized_new,
        )

        if original_has_crlf:
            # Restore CRLF if that was the original style
            new_content = new_content.replace("\n", "\r\n")

```

I assumed the sandbox path currently calls `robust_replace` via a line like:

```python
new_content, num_replacements = robust_replace(old_content, old, new)
```

If the actual call signature or variable names differ (for example, if you don't capture `num_replacements` or use different variable names), adjust the `<<<<<<< SEARCH` line to match your current code and keep the `REPLACE` block as-is, updating variable names as needed. The key behavior is:

1. Detect whether the original `old_content` uses CRLF.
2. Normalize `old_content`, `old`, and `new` to LF-only before calling `robust_replace`.
3. After replacement, convert the result back to CRLF if the original content used CRLF.

Make sure this normalization is applied only in the sandbox path (i.e., the non-`local_env` branch) so that the local path continues using its existing `edit_file` logic, which already handles line-ending normalization.
</issue_to_address>

### Comment 2
<location path="dashboard/src/components/chat/message_list_comps/ToolResultView.vue" line_range="66-67" />
<code_context>
+    </template>
+
+    <!-- ── execute_python ──────────────────────────────────────── -->
+    <template v-else-if="toolName === 'astrbot_execute_python' || toolName === 'astrbot_execute_ipython'">
+      <pre class="result-terminal" v-text="resultText"></pre>
+    </template>
+
</code_context>
<issue_to_address>
**nitpick (bug_risk):** Missing CSS for .result-terminal leads to inconsistent styling vs other result blocks

This template uses `class="result-terminal"` for Python/IPython output, but the stylesheet defines only an (empty) `.result-terminal-deprecated {}` and no `.result-terminal` rules, so these outputs don’t match the styling of other result blocks.

Please either reuse an existing styled class (e.g. `.result-code` / `.result-raw`) or add a `.result-terminal` rule that mirrors the other monospace, scrollable result styles.
</issue_to_address>

### Comment 3
<location path="dashboard/src/components/chat/message_list_comps/ToolCallCard.vue" line_range="213-217" />
<code_context>
+  return match ? match[1] : raw;
+});
+
+const editToolFilePath = computed(() => {
+  if (!isEditTool.value) return "";
+  const raw = editToolCleanResult.value;
+  const match = raw.match(/^Edited\s+(.+?)\./m);
+  return match ? match[1] : "";
+});
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Regex used to extract edited file path from result string breaks on paths with dots

`raw.match(/^Edited\s+(.+?)\./m)` stops at the first `.` after `Edited`, so `Edited src/tools/foo.bar/baz.py.` yields `src/tools/foo`, and the DiffPreview header shows an incomplete path.

Since `_format_result` is under our control, consider either:
- Matching to end of line and stripping an optional trailing dot, e.g. `/^Edited\s+(.+?)\.?$/m`, or
- Emitting a more structured header (e.g. `Edited path: {path}`) and updating the regex accordingly.

That will keep paths with dots working correctly.
</issue_to_address>

### Comment 4
<location path="astrbot/core/tools/computer_tools/edit_engine.py" line_range="438" />
<code_context>
+# ---------------------------------------------------------------------------
+
+
+def robust_replace(
+    content: str,
+    old_string: str,
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring `robust_replace` so that replacers yield character spans instead of substrings, allowing the core function to apply replacements directly and with simpler control flow.

You can significantly simplify `robust_replace` by having replacers yield *spans* instead of substrings. Right now each replacer “finds” a match by returning the text, and `robust_replace` has to re-find those substrings with a second pass over `content`. That doubles work and drives the nested control flow.

A minimal refactor that keeps all behavior but reduces complexity:

1. **Change `Replacer` to work on spans**

```python
from typing import Iterator, Callable, Tuple

Span = tuple[int, int]
Replacer = Callable[[str, str], Iterator[Span]]
```

Then update simple replacers to emit `(start, end)` instead of substrings:

```python
def _simple_replacer(content: str, find: str) -> Iterator[Span]:
    if not find:
        return
    idx = content.find(find)
    if idx != -1:
        yield (idx, idx + len(find))


def _multi_occurrence_replacer(content: str, find: str) -> Iterator[Span]:
    if not find:
        return
    start = 0
    flen = len(find)
    while True:
        idx = content.find(find, start)
        if idx == -1:
            break
        yield (idx, idx + flen)
        start = idx + flen
```

For the more complex replacers that currently construct blocks, you can keep their internal logic the same and compute spans instead of yielding the block string:

```python
def _line_trimmed_replacer(content: str, find: str) -> Iterator[Span]:
    if not find:
        return
    original_lines = content.split("\n")
    search_lines = find.split("\n")
    if search_lines and search_lines[-1] == "":
        search_lines.pop()
    if not search_lines:
        return

    # Precompute line offsets to map (line_start, line_end) → (char_start, char_end)
    offsets = []
    pos = 0
    for ln in original_lines:
        offsets.append(pos)
        pos += len(ln) + 1  # +1 for '\n'

    for i in range(len(original_lines) - len(search_lines) + 1):
        if all(
            original_lines[i + j].strip() == search_lines[j].strip()
            for j in range(len(search_lines))
        ):
            start = offsets[i]
            end_line = i + len(search_lines) - 1
            end = offsets[end_line] + len(original_lines[end_line])
            yield (start, end)
```

2. **Simplify `robust_replace` using spans**

Once all replacers yield spans, `robust_replace` no longer needs to re-scan `content` or manage overlapping substring matches. The core logic becomes:

```python
def robust_replace(
    content: str,
    old_string: str,
    new_string: str,
    *,
    replace_all: bool = False,
) -> tuple[str, int]:
    if old_string == new_string:
        raise ValueError(
            "No changes to apply: old_string and new_string are identical."
        )

    for replacer in _REPLACERS:
        spans = list(replacer(content, old_string))
        if not spans:
            continue

        # Deduplicate and sort spans
        spans = sorted(set(spans), key=lambda s: s[0])

        if replace_all:
            # Apply from end to start to keep indices valid
            new_content = content
            for start, end in reversed(spans):
                new_content = new_content[:start] + new_string + new_content[end:]
            return new_content, len(spans)

        # Single-replacement mode
        if len(spans) == 1:
            start, end = spans[0]
            new_content = content[:start] + new_string + content[end:]
            return new_content, 1

        # Multiple matches in single-replacement mode: try next replacer
        continue

    # No replacer produced any match
    raise ValueError(
        "Could not find oldString in the file. It must match exactly, "
        "including whitespace, indentation, and line endings. "
        "Try providing more surrounding context to make the match unique."
    )
```

This preserves the multi-strategy behavior and error semantics while:

- Removing the inner `while find` loop and overlap checks.
- Eliminating the need to rediscover positions from ambiguous substring payloads.
- Making `replace_all` vs single-replacement behavior explicit and easy to follow.

You can incrementally migrate each replacer to spans, keeping the outer API (`edit_file`, etc.) unchanged.
</issue_to_address>

### Comment 5
<location path="dashboard/src/utils/systemNotice.ts" line_range="67" />
<code_context>
+ *   - Continuation lines of a notice paragraph (non-blank, directly after
+ *     a notice marker line, before any blank line gap).
+ */
+function isNoticeChainRegion(text: string, fromPos: number): boolean {
+  const region = text.slice(fromPos);
+  const lines = region.split("\n");
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the notice parsing logic by using an explicit state machine and a single merged scan over markers instead of booleans plus an intermediate positions array.

You can reduce the complexity without changing behavior by making the implicit state machine explicit and by avoiding the intermediate `positions` array.

### 1. Make the notice-chain parsing use explicit states

`isNoticeChainRegion` currently encodes states via `inFollowUp` + `sawBlankAfterNotice`. Making these states explicit clarifies the grammar and removes the need to mentally track boolean interactions.

For example:

```ts
const enum ChainState {
  Outside,           // between notices, after a blank line
  AfterNoticeHeader, // immediately after a [SYSTEM NOTICE] header line
  InFollowUpBody,    // inside a "User sent" follow-up body
  InOverflowBody,    // inside overflow notice body
}

function isNoticeChainRegion(text: string, fromPos: number): boolean {
  const region = text.slice(fromPos);
  const lines = region.split("\n");
  let state: ChainState = ChainState.Outside;

  for (const rawLine of lines) {
    const trimmed = rawLine.trim();
    if (!trimmed) {
      // blank line: reset to Outside, but stay within chain
      state = ChainState.Outside;
      continue;
    }

    if (trimmed.includes(MARKER)) {
      const markerIdx = trimmed.indexOf(MARKER);
      const afterMarker = trimmed.slice(markerIdx).trim();
      state = afterMarker.startsWith(`${MARKER} User sent`)
        ? ChainState.InFollowUpBody
        : ChainState.AfterNoticeHeader;
      continue;
    }

    if (trimmed.startsWith(OVERFLOW_NOTICE_PREFIX)) {
      state = ChainState.InOverflowBody;
      continue;
    }

    if (state === ChainState.InFollowUpBody && /^\d+\.\s/.test(trimmed)) {
      // numbered list items allowed only in follow-up bodies
      continue;
    }

    // continuation lines: allowed in any "body" state, but not in Outside
    if (state === ChainState.AfterNoticeHeader ||
        state === ChainState.InFollowUpBody ||
        state === ChainState.InOverflowBody) {
      continue;
    }

    return false;
  }

  return true;
}
```

This keeps the same rules, but the states and transitions are explicit and easier to extend (e.g., adding a new notice type is a new state/branch instead of another boolean interaction).

### 2. Avoid building and sorting `positions`

You can preserve the “earliest valid candidate wins” semantics without allocating an array and sorting by doing a merged linear scan over the two markers.

A small refactor of `findSystemNoticeIndex`:

```ts
export function findSystemNoticeIndex(text: string): number {
  let systemIdx = text.indexOf(MARKER, 0);
  let overflowIdx = text.indexOf(OVERFLOW_NOTICE_PREFIX, 0);

  while (systemIdx !== -1 || overflowIdx !== -1) {
    let pos: number;
    let type: "system" | "overflow";

    if (systemIdx === -1 || (overflowIdx !== -1 && overflowIdx < systemIdx)) {
      pos = overflowIdx!;
      type = "overflow";
      overflowIdx = text.indexOf(OVERFLOW_NOTICE_PREFIX, pos + OVERFLOW_NOTICE_PREFIX.length);
    } else {
      pos = systemIdx;
      type = "system";
      systemIdx = text.indexOf(MARKER, pos + MARKER.length);
    }

    if (type === "system" && !isGenuineSystemNotice(text, pos)) {
      continue;
    }
    if (isNoticeChainRegion(text, pos)) {
      return pos;
    }
  }

  return -1;
}
```

This preserves the current behavior (candidates considered in ascending `pos` across both types) but removes the `positions` array and the sort, making the flow more direct and easier to follow.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread astrbot/core/tools/computer_tools/fs.py
Comment on lines +66 to +67
<template v-else-if="toolName === 'astrbot_execute_python' || toolName === 'astrbot_execute_ipython'">
<pre class="result-terminal" v-text="resultText"></pre>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick (bug_risk): Missing CSS for .result-terminal leads to inconsistent styling vs other result blocks

This template uses class="result-terminal" for Python/IPython output, but the stylesheet defines only an (empty) .result-terminal-deprecated {} and no .result-terminal rules, so these outputs don’t match the styling of other result blocks.

Please either reuse an existing styled class (e.g. .result-code / .result-raw) or add a .result-terminal rule that mirrors the other monospace, scrollable result styles.

Comment thread dashboard/src/components/chat/message_list_comps/ToolCallCard.vue
Comment thread astrbot/core/tools/computer_tools/edit_engine.py
Comment thread dashboard/src/utils/systemNotice.ts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core The bug / feature is about astrbot's core, backend feature:chatui The bug / feature is about astrbot's chatui, webchat size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant