# Build perspective-taking agents with SimToM
Prompting strategies like Chain-of-Thought (CoT) can improve LLMs' reasoning capabilities. However, they underperform on tasks that require keeping track of inconsistent world states. SimToM proposes a simple, two-stage prompting framework for LLMs inspired by Simulation Theory. The authors showed that this approach outperforms zero-shot prompting and CoT on ToMI and BigToM, two benchmarks with Theory of Mind questions.
In this example, we will implement SimToM with a few lines of code, using Outlines' prompt templating and structured generation capabilities.
## How SimToM works
SimToM calls an LLM with two consecutive prompts:

1. **Perspective-taking**: The first prompt receives a `story` and a `character`. The goal is to understand the situation based on the character's point of view and filter out the rest of the story.
2. **Question-answering**: The second prompt receives the character's point of view from the previous step and tasks the LLM to answer a question using that context.
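The two-stage flow above can be sketched in a few lines of plain Python. This is only an illustrative skeleton, not the paper's code: `call_llm` is a hypothetical stand-in for any LLM call, stubbed here with canned replies so the example runs.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call, stubbed with canned replies."""
    if "What events does" in prompt:
        # Stage-1 style prompt: return the filtered perspective.
        return "1 Aria entered the front_yard.\n3 The grapefruit is in the green_bucket."
    # Stage-2 style prompt: return a one-word answer.
    return "green_bucket"


def simtom(story: str, character: str, question: str) -> str:
    # Stage 1: perspective-taking -- keep only events the character knows about.
    perspective = call_llm(
        f"Story: {story}\nWhat events does {character} know about?"
    )
    # Stage 2: question-answering -- answer using only the filtered context.
    return call_llm(f"{perspective}\nYou are {character}.\n{question}")


answer = simtom(
    "1 Aria entered the front_yard.",
    "Aria",
    "Where was the grapefruit at the beginning?",
)
```

The key design point is that stage 2 never sees the full story, only the events stage 1 decided the character knows about.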
## Outlines implementation
To implement SimToM with Outlines, we need to:

1. Write the prompts with prompt templates.
2. Define the JSON object each prompt will return using Pydantic.
3. Generate responses with a Mistral model using the transformers integration.

Let's dive into it!
### Using a prompt template
The authors have shared their code, prompts and data in this GitHub repository. Below, we define in Outlines the prompts they used for the ToMI dataset:
```python
from outlines import Template

perspective_taking = Template.from_string(
    """<s>[INST] The following is a sequence of events about some characters, that takes place in multiple locations.
Your job is to output only the events that the specified character, {{character}}, knows about.
Here are a few rules:
1. A character knows about all events that they do.
2. If a character is in a certain room/location, that character knows about all other events that happens in the room. This includes other characters leaving or exiting the location, the locations of objects in that location, and whether somebody moves an object to another place.
3. If a character leaves a location, and is NOT in that location, they no longer know about any events that happen within that location. However, they can re-enter the location.
Story: {{story}}
What events does {{character}} know about? Only output the events according to the above rules, do not provide an explanation. [/INST]"""  # noqa
)

simulation = Template.from_string(
    """<s>[INST] {% for event in events %}
{{event}}
{% endfor %}
You are {{name}}.
Based on the above information, answer the following question:
{{question}}
You must choose one of the above choices, do not say there is not enough information. Answer with a single word, do not output anything else. [/INST]"""  # noqa
)
```
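To make the templates concrete, here is a rough stdlib sketch of what rendering the `simulation` template amounts to. This is illustrative only, not Outlines' actual Jinja rendering; `render_simulation` is a hypothetical helper.

```python
def render_simulation(events, name, question):
    # Mimics the Jinja template above: loop over the events, then the fixed lines.
    lines = ["<s>[INST] "]
    lines.extend(events)  # {% for event in events %} ... {% endfor %}
    lines.append(f"You are {name}.")
    lines.append("Based on the above information, answer the following question:")
    lines.append(question)
    lines.append(
        "You must choose one of the above choices, do not say there is not "
        "enough information. Answer with a single word, do not output "
        "anything else. [/INST]"
    )
    return "\n".join(lines)


prompt = render_simulation(
    ["1 Aria entered the front_yard."],
    "Aria",
    "7 Where was the grapefruit at the beginning?",
)
```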
### JSON structured generation
Outlines guarantees that the LLM will return a valid JSON object, which we can specify as a Pydantic model.

We need two Pydantic models for SimToM, one for each prompt:
```python
from pydantic import BaseModel, Field
from typing import List


class PerspectiveTaking(BaseModel):
    """This is for the first prompt."""
    character: str = Field(description="The character we extract the events for.")
    events: List[str] = Field(description="All events that the character knows about.")


class Simulation(BaseModel):
    """This is for the second prompt."""
    answer: str
```
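Because generation is constrained to this schema, the raw model output is guaranteed to parse as JSON with exactly these fields. For illustration, here is a hand-written sample of such an output (not a real model response) parsed with the standard library:

```python
import json

# Hand-written example of the kind of output the constrained generator produces.
raw_output = '{"character": "Aria", "events": ["1 Aria entered the front_yard."]}'

# Parsing always succeeds for schema-constrained output.
parsed = json.loads(raw_output)
```

In practice you never call `json.loads` yourself: Outlines parses and validates the output into the Pydantic model for you, as shown below.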
### Calling the LLM
Let's try SimToM with an example from the ToMI dataset:
```python
story = """
1 Aria entered the front_yard.
2 Aiden entered the front_yard.
3 The grapefruit is in the green_bucket.
4 Aria moved the grapefruit to the blue_container.
5 Aiden exited the front_yard.
6 Noah entered the playroom.
"""
question = "7 Where was the grapefruit at the beginning?"
character = "Aria"
```
We load `Mistral-7B-Instruct-v0.3`, create the prompts with the templates we defined earlier, and generate structured responses. As a reminder, the goal of the first call is to get all the events the character, `Aria`, knows about.
```python
import outlines

# Load an LLM from Hugging Face
MODEL_NAME = "mistral-community/Mistral-7B-Instruct-v0.3"
model = outlines.models.transformers(MODEL_NAME, device="cuda")

perspective_prompt = perspective_taking(story=story, character=character)

# Call Mistral 7B with the first prompt
generator = outlines.generate.json(model, PerspectiveTaking)
perspective = generator(perspective_prompt)

print(perspective.model_dump())
# {'character': 'Aria', 'events': ['1 Aria entered the front_yard.', '3 The grapefruit is in the green_bucket.', '4 Aria moved the grapefruit to the blue_container.']}
```
Not bad! We will now use those events to generate the second prompt.
```python
sim_prompt = simulation(events=perspective.events, name=character, question=question)

# Call Mistral 7B with the second prompt
generator = outlines.generate.json(model, Simulation)
result = generator(sim_prompt)

print(result.model_dump())
# {'answer': 'green_bucket'}
```
And that's it! SimToM can be useful in agentic workflows, where agents must act based on what they know, rather than on all available information. One caveat of SimToM is that the perspective-taking step may remove important information, leading to wrong results. As the authors note in their paper, it can serve as a simple and effective baseline for evaluating LLMs on Theory of Mind reasoning tasks.