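s03 - The First Tool

In the previous two chapters the model could only talk. You ask a question and it gives an answer, but it cannot do anything. This chapter gives it one tool: reading files. From this point on it is no longer just a chatbot. It starts to become an Agent.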

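What problem this chapter solves

The code in s01 and s02 has one fundamental limit: the model can only generate text.

Ask it "show me what's in config.json" and it can only guess, because it has never seen that file.

Tool calling (also called function calling) solves this. You tell the model "these are the tools you can use", the model says "I want to call this tool" when it needs one, your code executes the call, and you pass the result back.

The goal of this chapter: let the model read files.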

Code

Building on the previous chapter, create or replace agent.py:

python

import os
import json
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

# ============================================================
# Tool: read a file
# ============================================================

def read_file(path: str) -> str:
    """Read the file at the given path and return its content."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return f"Error: file '{path}' does not exist."
    except Exception as e:
        return f"Error: failed to read the file: {e}"

# Tell the model which tools it can use
tools = [
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read the content of the file at the given path. Use this tool when you need to look at a file.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "Path of the file to read",
                    }
                },
                "required": ["path"],
            },
        },
    }
]

# Mapping from tool name to the actual function
available_functions = {
    "read_file": read_file,
}

# ============================================================
# Conversation loop
# ============================================================

messages = [
    {"role": "system", "content": "You are a helpful assistant. When the user asks about a file, use the tool to read its content."},
]

print("Conversation started. Type exit to quit.\n")

while True:
    user_input = input("You: ")
    if user_input.strip().lower() in ("exit", "quit"):
        break

    messages.append({"role": "user", "content": user_input})

    # Step 1: send the messages to the model, together with the tool definitions
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages,
        tools=tools,
    )

    assistant_message = response.choices[0].message

    # Does the model want to talk, or to call a tool?
    if assistant_message.tool_calls:
        # The model wants to call a tool: store its message in the history first
        messages.append(assistant_message)

        # Execute each tool call the model requested
        for tool_call in assistant_message.tool_calls:
            function_name = tool_call.function.name
            # The arguments arrive as a JSON string
            function_args = json.loads(tool_call.function.arguments)

            print(f"  [tool call] {function_name}({function_args})")

            # Run the function
            result = available_functions[function_name](**function_args)

            # Feed the result back to the model
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            })

        # Step 2: the model sees the tool result and writes the final reply
        second_response = client.chat.completions.create(
            model="deepseek-chat",
            messages=messages,
            tools=tools,
        )
        final_reply = second_response.choices[0].message.content
        messages.append({"role": "assistant", "content": final_reply})
        print(f"\nAssistant: {final_reply}\n")

    else:
        # The model only wants to talk; no tool call
        reply = assistant_message.content
        messages.append({"role": "assistant", "content": reply})
        print(f"\nAssistant: {reply}\n")

First, prepare a test file:

bash

echo '{"name": "agent-demo", "version": "1.0"}' > config.json

Run:

bash

python agent.py

Conversation started. Type exit to quit.

You: Show me what's in config.json
  [tool call] read_file({'path': 'config.json'})

A: config.json contains a JSON object with two fields:
- name: "agent-demo"
- version: "1.0"

You: exit

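What happened

Compared with the previous two chapters, this one adds one core mechanism: the model no longer answers directly. It first tells you what it wants to do, you do it, and then it answers.

1. Define the tool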

python

tools = [
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read the content of the file at the given path. Use this tool when you need to look at a file.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "Path of the file to read",
                    }
                },
                "required": ["path"],
            },
        },
    }
]

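The structure looks long, but taken apart it is simple. It is a "manual" that tells the model:

name: what the tool is called
description: what the tool does (the model relies on this text to decide when to use it)
parameters: which arguments the tool takes, with the type and meaning of each

parameters follows the JSON Schema format. You do not need to memorize the format; it is enough to know that it describes "which arguments this function accepts".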
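To make the role of parameters more concrete, here is a minimal sketch that checks an argument dict against such a schema by hand. `check_args` and `READ_FILE_PARAMETERS` are hypothetical names, not part of the OpenAI SDK, and the helper only understands `required` and a few flat types. A real project would more likely use a full JSON Schema validator.

```python
# Hypothetical helper: checks only "required" keys and the top-level "type"
# of each property. It is not a complete JSON Schema implementation.
READ_FILE_PARAMETERS = {
    "type": "object",
    "properties": {
        "path": {"type": "string", "description": "Path of the file to read"},
    },
    "required": ["path"],
}

# The JSON Schema type names this sketch understands, mapped to Python types.
_JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_args(schema: dict, args: dict) -> list:
    """Return a list of problems; an empty list means the arguments look fine."""
    problems = []
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    for key, value in args.items():
        spec = schema.get("properties", {}).get(key)
        if spec is None:
            problems.append(f"unexpected argument: {key}")
            continue
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"argument {key} should be of type {spec['type']}")
    return problems
```

Calling `check_args(READ_FILE_PARAMETERS, {"path": "config.json"})` returns an empty list, while an empty dict reports the missing `path`.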

2. The first API call

python

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
    tools=tools,
)

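Compared with the previous chapter, there is one extra argument: tools=tools. That is the only change. You pass the tool manual to the model, and the model knows which tools it can use.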

3. Determine the model's intent

python

if assistant_message.tool_calls:

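The model's reply takes one of two forms:

No tool_calls: the model decided to answer directly (same as the previous two chapters)
Has tool_calls: the model decided to call a tool first

This is the key difference. The model is no longer answering passively; it is making a decision.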
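The two forms can be pictured as plain dicts in the Chat Completions message shape. This is a hand-written sketch: the id `call_example_1` is made up, `describe_intent` is a hypothetical helper, and the SDK actually returns objects with attribute access (`message.tool_calls`) rather than dicts.

```python
import json

# A direct answer: content is text and there are no tool_calls.
plain_reply = {"role": "assistant", "content": "Hello!"}

# A tool request: content is typically empty and tool_calls lists what to run.
tool_request = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_example_1",  # made-up id for illustration
            "type": "function",
            "function": {"name": "read_file", "arguments": '{"path": "config.json"}'},
        }
    ],
}


def describe_intent(message: dict) -> str:
    """Return "reply" for a direct answer, or a summary of the requested calls."""
    calls = message.get("tool_calls") or []
    if not calls:
        return "reply"
    parts = []
    for call in calls:
        # The arguments arrive as a JSON string, not as a dict.
        args = json.loads(call["function"]["arguments"])
        parts.append(f"{call['function']['name']}({args})")
    return "call " + ", ".join(parts)
```

The same truthiness check drives the branch in agent.py: an empty or missing tool_calls means "just print the reply".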

4. Execute the tool and pass the result back

python

for tool_call in assistant_message.tool_calls:
    function_name = tool_call.function.name
    function_args = json.loads(tool_call.function.arguments)
    result = available_functions[function_name](**function_args)
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": result,
    })

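Three things happen here:

Extract the tool name and arguments from the model's reply
Call the actual Python function
Append the result to messages with role: "tool"; the tool_call_id tells the model which request this result answers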
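The three steps can also be wrapped in one function. The sketch below is an assumption-laden variant, not the chapter's code: `run_tool` is a hypothetical name, and it reports an unknown tool name or malformed arguments as an error string (the same way read_file reports a missing file) instead of crashing the loop. Retry and error-handling strategy proper is left to later chapters.

```python
import json


def run_tool(name: str, raw_arguments: str, registry: dict) -> str:
    """Execute one requested tool call and always return a string for the model."""
    # Step 1: look up the function and decode the JSON argument string.
    func = registry.get(name)
    if func is None:
        return f"Error: unknown tool '{name}'."
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as e:
        return f"Error: arguments are not valid JSON ({e})."
    if not isinstance(args, dict):
        return "Error: arguments must be a JSON object."
    # Step 2: call the actual Python function.
    try:
        return str(func(**args))
    except TypeError as e:
        # Typically raised when parameter names are missing or unexpected.
        return f"Error: bad arguments for '{name}' ({e})."
```

Step 3 stays in the caller: append `{"role": "tool", "tool_call_id": tool_call.id, "content": run_tool(...)}` to messages.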

5. The second API call

python

second_response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
    tools=tools,
)

After receiving the tool result, the model generates the final, human-readable reply.

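The two-step dance

The whole flow is a two-step dance:

User speaks
  → Model sees the message + the tool list
  → Model says: "I want to call read_file with config.json as the argument"
  → Your code actually reads config.json
  → The content is fed back to the model
  → Model sees the tool result
  → Model generates the final reply
  → User sees the reply

The model decides what to do; your code does it. The model does not read files, call APIs, or send emails on its own. It only tells you what it wants done, and you execute it.

That is the essence of tool calling.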
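The dance can be rehearsed without network access by swapping the API for a scripted stand-in. Everything below is a sketch under assumptions: `run_turn` and `fake_model` are hypothetical names, messages are plain dicts rather than SDK objects, and the `max_rounds` cap is an addition that the chapter's code does not have.

```python
import json


def run_turn(call_model, messages: list, registry: dict, max_rounds: int = 5) -> str:
    """Call the model until it answers in plain text, running tools in between."""
    for _ in range(max_rounds):
        message = call_model(messages)  # returns an assistant message dict
        messages.append(message)
        calls = message.get("tool_calls") or []
        if not calls:
            return message["content"]  # the model chose to talk
        for call in calls:
            args = json.loads(call["function"]["arguments"])
            result = registry[call["function"]["name"]](**args)
            messages.append({"role": "tool", "tool_call_id": call["id"], "content": result})
    return "Stopped: too many tool rounds."


def fake_model(messages: list) -> dict:
    """Scripted stand-in for the API: first asks for a tool, then answers."""
    if messages[-1]["role"] == "tool":
        return {"role": "assistant", "content": "The file says: " + messages[-1]["content"]}
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_example_1",  # made-up id
            "type": "function",
            "function": {"name": "read_file", "arguments": '{"path": "config.json"}'},
        }],
    }
```

With a user message in the history and a registry such as `{"read_file": read_file}`, one call to `run_turn(fake_model, history, registry)` leaves the history with the roles user, assistant, tool, assistant, which is the two-step dance laid out as data.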

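Previous two chapters vs this chapter

s01-s02	s03
The model can only generate text	The model can request tool calls
One API call per turn	Possibly two API calls per turn (when a tool is called)
The model's answers rely entirely on its training data and the conversation	The model can get up-to-date information through tools
Chatbot	The beginnings of an Agent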

Try changing it

1. Add a file-writing tool

python

def write_file(path: str, content: str) -> str:
    """Write the content to the file at the given path."""
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write(content)
        return f"Successfully wrote '{path}'."
    except Exception as e:
        return f"Error: failed to write the file: {e}"

Then add it to the tools list and the available_functions dict. Try asking the model to write a file for you.
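A tool entry for write_file could look like the sketch below. The description texts are assumptions written for illustration; only the function name and its two parameters come from the code above.

```python
# Assumed schema for write_file; the wording of the descriptions is illustrative.
write_file_tool = {
    "type": "function",
    "function": {
        "name": "write_file",
        "description": "Write text content to the file at the given path, overwriting it if it exists.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Path of the file to write"},
                "content": {"type": "string", "description": "Text to write into the file"},
            },
            # Both parameters of write_file(path, content) are mandatory.
            "required": ["path", "content"],
        },
    },
}

# In agent.py it would then be registered next to read_file:
# tools.append(write_file_tool)
# available_functions["write_file"] = write_file
```

The name in the schema must match the key in available_functions exactly, because that string is all the model sends back.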

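2. Remove the tools parameter and see how the model answers

Delete tools=tools and ask the same question ("Show me what's in config.json"). The model will either make up an answer or say that it cannot access the file, because it no longer has any way to read files.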

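3. Print the length of messages

At the end of the loop, add:

python

print(f"  messages length: {len(messages)}")

After a tool call you will see that messages grew by several entries: the user message, the model's tool_calls message, the tool result message, and the final reply. Understanding the structure of messages is the foundation for every later chapter.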
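To see more than the length, a small helper can print one line per message. `summarize` is a hypothetical name; the sketch assumes each entry is either a plain dict or an SDK message object exposing `model_dump()`, which matches how agent.py mixes the two.

```python
def summarize(messages: list) -> list:
    """Return one short line per message: its role plus a hint of what it carries."""
    lines = []
    for m in messages:
        # SDK message objects expose model_dump(); plain dicts are used as-is.
        d = m if isinstance(m, dict) else m.model_dump()
        if d.get("tool_calls"):
            names = ", ".join(c["function"]["name"] for c in d["tool_calls"])
            lines.append(f"{d['role']}: tool_calls -> {names}")
        else:
            # Truncate long content so each message stays on one line.
            text = (d.get("content") or "")[:40]
            lines.append(f"{d['role']}: {text}")
    return lines
```

Adding `print("\n".join(summarize(messages)))` at the end of the loop shows the history growing turn by turn.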

4. See what tool_calls looks like

After messages.append(assistant_message), add:

python

print(f"  [tool_calls] {json.dumps([tc.function.model_dump() for tc in assistant_message.tool_calls], ensure_ascii=False, indent=2)}")

You will see the name and the raw JSON argument string of each tool call the model returned.

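Teaching boundary

This chapter does one thing only: give the model one tool so that it can read files.

Not covered:

Calling several tools at once (the code supports it, but this chapter demonstrates only one)
Retry or error-handling strategies when a tool call fails
MCP (Model Context Protocol)
Tool safety (the model can read a file at any path)
Tool calls with streaming output

All of these come in later chapters. What you need to remember now:

The model itself cannot act. It can only tell you what it wants to do. Tool calling is the bridge: the model expresses intent, you execute it, then you feed the result back.

Remember it in one sentence

The third step toward an Agent is giving the model a tool. The model says "I want to call this", and you actually do it. That is tool calling, the dividing line between a chatbot and an Agent.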
