ReAct Agent
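This example shows how to use outlines to build your own agent with an open-weights local model and structured outputs. It is inspired by Simon Willison's blog post A simple Python implementation of the ReAct pattern for LLMs.
The ReAct pattern (short for Reason+Act) is described in the paper ReAct: Synergizing Reasoning and Acting in Language Models. In this pattern you implement additional actions that the LLM can take, such as searching Wikipedia or running calculations, teach the LLM how to request those actions, and then feed their results back into the LLM.
In addition, we give the LLM a scratchpad, as described in the paper Show Your Work: Scratchpads for Intermediate Computation with Language Models, which improves the ability of LLMs to perform multi-step computations.
We use llama.cpp through the llama-cpp-python library. Outlines supports llama-cpp-python, but we need to install it ourselves (for example with pip install llama-cpp-python).
We download the model weights by passing the name of the repository on the HuggingFace Hub and the file name (or a glob pattern):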
import llama_cpp
from outlines import generate, models
model = models.llamacpp(
    "NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF",
    "Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
    tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(
        "NousResearch/Hermes-2-Pro-Llama-3-8B"
    ),
    n_gpu_layers=-1,
    flash_attn=True,
    n_ctx=8192,
    verbose=False,
)
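(Optional) Store the model weights in a custom folder
By default the model weights are downloaded to the hub cache. If we want to store them in a custom folder instead, we can pull the quantized GGUF model Hermes-2-Pro-Llama-3-8B by NousResearch from HuggingFace: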
wget https://hugging-face.cn/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
We then initialize the model from the downloaded file:
import llama_cpp
from llama_cpp import Llama
from outlines import generate, models
llm = Llama(
    "/path/to/model/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
    tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(
        "NousResearch/Hermes-2-Pro-Llama-3-8B"
    ),
    n_gpu_layers=-1,
    flash_attn=True,
    n_ctx=8192,
    verbose=False,
)
# Wrap the llama-cpp-python instance so that the rest of the example can use `model`
# (assumption: the installed outlines version exposes models.LlamaCpp).
model = models.LlamaCpp(llm)
Build a ReAct agent
In this example, we use two tools:
- wikipedia: <search term> - search Wikipedia and return the snippet of the first result
- calculate: <expression> - evaluate the expression using Python's eval() function
import httpx
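def wikipedia(q):
    # Query the MediaWiki search API and return the snippet of the first hit.
    return httpx.get("https://en.wikipedia.org/w/api.php", params={
        "action": "query",
        "list": "search",
        "srsearch": q,
        "format": "json"
    }).json()["query"]["search"][0]["snippet"]
def calculate(numexp):
    # Evaluates arbitrary Python, so only use it with trusted input.
    return eval(numexp)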
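Because calculate hands model-generated text straight to eval(), a more defensive variant may be worth considering. The following is a minimal sketch and not part of the original example: safe_calculate is a hypothetical helper that walks the expression's syntax tree with the standard ast module and accepts only numeric literals and basic arithmetic operators.

```python
import ast
import operator

# Arithmetic operators this sketch allows; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(numexp):
    """Hypothetical alternative to calculate(): evaluate arithmetic only."""

    def _eval(node):
        # Plain int/float literals (bool is excluded on purpose).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {numexp!r}")

    return _eval(ast.parse(numexp, mode="eval").body)


print(safe_calculate("2**10"))  # 1024
```

Function calls, attribute access, and names all raise ValueError. The sketch does not bound the size of exponents, so a very large power can still take a long time to compute.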
We define the logic of the agent through Pydantic classes. First, we want the LLM to choose only between the two tools defined above:
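The enumeration itself is not shown in this section. A minimal sketch, assuming the two choices are the tool names "wikipedia" and "calculate" that appear in the generated regex further down:

```python
from enum import Enum


class Action(str, Enum):
    # One member per tool; the values are the strings the LLM may emit.
    wikipedia = "wikipedia"
    calculate = "calculate"
```

Subclassing str keeps the members comparable to plain strings, which is convenient when the chosen action is later read back from JSON.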
Our agent will loop through Thought and Action. We explicitly give it an Action Input field so that it does not forget to add the arguments of the Action. We also add a scratchpad (optional).
from pydantic import BaseModel, Field
class Reason_and_Act(BaseModel):
    Scratchpad: str = Field(..., description="Information from the Observation useful to answer the question")
    Thought: str = Field(..., description="It describes your thoughts about the question you have been asked")
    Action: Action
    Action_Input: str = Field(..., description="The arguments of the Action.")
Our agent will eventually reach a Final Answer. We also add a scratchpad (optional).
class Final_Answer(BaseModel):
    Scratchpad: str = Field(..., description="Information from the Observation useful to answer the question")
    Final_Answer: str = Field(..., description="Answer to the question grounded on the Observation")
Our agent will decide for itself when it has reached a Final Answer and should therefore stop the loop of Thought and Action.
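The model that expresses this choice is not shown in this section. The sketch below assumes it is a Pydantic wrapper named Decision (the name on which the later code calls Decision.model_json_schema()) with a single Decision field that holds either a Reason_and_Act or a Final_Answer, which matches the top-level "Decision" key in the regex printed further down. The Action enum and the two models are repeated so that the snippet runs on its own, and the two sample payloads are made up for illustration.

```python
from enum import Enum
from typing import Union

from pydantic import BaseModel, Field


class Action(str, Enum):
    wikipedia = "wikipedia"
    calculate = "calculate"


class Reason_and_Act(BaseModel):
    Scratchpad: str = Field(..., description="Information from the Observation useful to answer the question")
    Thought: str = Field(..., description="It describes your thoughts about the question you have been asked")
    Action: Action
    Action_Input: str = Field(..., description="The arguments of the Action.")


class Final_Answer(BaseModel):
    Scratchpad: str = Field(..., description="Information from the Observation useful to answer the question")
    Final_Answer: str = Field(..., description="Answer to the question grounded on the Observation")


class Decision(BaseModel):
    # Assumed wrapper: each step is either another action or the final answer.
    Decision: Union[Reason_and_Act, Final_Answer]


# Pydantic picks the branch of the Union that validates.
step = Decision.model_validate({"Decision": {
    "Scratchpad": "", "Thought": "I need to compute 2**10.",
    "Action": "calculate", "Action_Input": "2**10",
}})
done = Decision.model_validate({"Decision": {
    "Scratchpad": "2**10 is 1024.", "Final_Answer": "1024",
}})
print(type(step.Decision).__name__)  # Reason_and_Act
print(type(done.Decision).__name__)  # Final_Answer
```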
We could generate the response directly from the JSON schema, but we will use the regex instead and check that everything works as expected:
from outlines.fsm.json_schema import convert_json_schema_to_str
from outlines_core.fsm.json_schema import build_regex_from_schema
json_schema = Decision.model_json_schema()
schema_str = convert_json_schema_to_str(json_schema=json_schema)
regex_str = build_regex_from_schema(schema_str)
print(regex_str)
# '\\{[ ]?"Decision"[ ]?:[ ]?(\\{[ ]?"Scratchpad"[ ]?:[ ]?"([^"\\\\\\x00-\\x1F\\x7F-\\x9F]|\\\\["\\\\])*"[ ]?,[ ]?"Thought"[ ]?:[ ]?"([^"\\\\\\x00-\\x1F\\x7F-\\x9F]|\\\\["\\\\])*"[ ]?,[ ]?"Action"[ ]?:[ ]?("wikipedia"|"calculate")[ ]?,[ ]?"Action_Input"[ ]?:[ ]?"([^"\\\\\\x00-\\x1F\\x7F-\\x9F]|\\\\["\\\\])*"[ ]?\\}|\\{[ ]?"Scratchpad"[ ]?:[ ]?"([^"\\\\\\x00-\\x1F\\x7F-\\x9F]|\\\\["\\\\])*"[ ]?,[ ]?"Final_Answer"[ ]?:[ ]?"([^"\\\\\\x00-\\x1F\\x7F-\\x9F]|\\\\["\\\\])*"[ ]?\\})[ ]?\\}'
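To see what this pattern accepts, it can help to rebuild a pattern with the same structure from its parts and try it on a few strings. The sketch below uses only Python's re module; the fragment names (WS, STRING, field, obj) and the sample strings are illustrative and not part of outlines. Note that the pattern fixes the order of the keys and restricts Action to the two tool names.

```python
import re

WS = r'[ ]?'  # at most one space between tokens
# A JSON string without control characters; only \" and \\ are allowed as escapes.
STRING = r'"([^"\\\x00-\x1F\x7F-\x9F]|\\["\\])*"'


def field(name, value_pattern):
    """Regex fragment for one `"name": value` pair."""
    return f'"{name}"{WS}:{WS}{value_pattern}'


def obj(*fields):
    """Regex fragment for a JSON object whose keys appear in a fixed order."""
    return r'\{' + WS + f'{WS},{WS}'.join(fields) + WS + r'\}'


reason_and_act = obj(
    field("Scratchpad", STRING),
    field("Thought", STRING),
    field("Action", '("wikipedia"|"calculate")'),
    field("Action_Input", STRING),
)
final_answer = obj(field("Scratchpad", STRING), field("Final_Answer", STRING))
pattern = re.compile(obj(field("Decision", f'({reason_and_act}|{final_answer})')))

ok_action = '{"Decision": {"Scratchpad": "", "Thought": "Compute it.", "Action": "calculate", "Action_Input": "2**10"}}'
ok_final = '{"Decision": {"Scratchpad": "2**10 is 1024.", "Final_Answer": "1024"}}'
bad_tool = '{"Decision": {"Scratchpad": "", "Thought": "t", "Action": "search", "Action_Input": "x"}}'
bad_order = '{"Decision": {"Final_Answer": "1024", "Scratchpad": ""}}'

for sample in (ok_action, ok_final, bad_tool, bad_order):
    print(pattern.fullmatch(sample) is not None)
# True, True, False, False
```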
We then need to adapt our prompt to the Hermes prompt format for JSON schemas and explain the agent logic:
import datetime
def generate_hermes_prompt(question, schema=""):
    # Build the system + user prompt and leave the assistant turn open.
    return (
        "<|im_start|>system\n"
        "You are a world class AI model who answers questions in JSON with correct Pydantic schema. "
        f"Here's the json schema you must adhere to:\n<schema>\n{schema}\n</schema>\n"
        "Today is " + datetime.datetime.today().strftime('%Y-%m-%d') + ".\n" +
        "You run in a loop of Scratchpad, Thought, Action, Action Input, PAUSE, Observation. "
        "At the end of the loop you output a Final Answer. "
        "Use Scratchpad to store the information from the Observation useful to answer the question "
        "Use Thought to describe your thoughts about the question you have been asked "
        "and reflect carefully about the Observation if it exists. "
        "Use Action to run one of the actions available to you. "
        "Use Action Input to input the arguments of the selected action - then return PAUSE. "
        "Observation will be the result of running those actions. "
        "Your available actions are:\n"
        "calculate:\n"
        "e.g. calculate: 4**2 / 3\n"
        "Runs a calculation and returns the number - uses Python so be sure to use floating point syntax if necessary\n"
        "wikipedia:\n"
        "e.g. wikipedia: Django\n"
        "Returns a summary from searching Wikipedia\n"
        "DO NOT TRY TO GUESS THE ANSWER. Begin! <|im_end|>"
        "\n<|im_start|>user\n" + question + "<|im_end|>"
        "\n<|im_start|>assistant\n"
    )
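The <|im_start|> and <|im_end|> markers in this prompt delimit one chat turn each: a role name, a newline, the content, and the end marker. As a sketch of that layout alone, the hypothetical helper to_chatml below renders a list of (role, content) pairs and leaves the assistant turn open so that generation continues from there. It is an illustration, not a replacement for generate_hermes_prompt.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render (role, content) pairs in the <|im_start|>/<|im_end|> layout."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


prompt = to_chatml([
    ("system", "You answer questions in JSON."),
    ("user", "What's 2 to the power of 10?"),
])
print(prompt)
```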
We define a ChatBot class:
class ChatBot:
    def __init__(self, prompt=""):
        self.prompt = prompt
    def __call__(self, user_prompt):
        # Append the new turn to the running prompt, then generate.
        self.prompt += user_prompt
        result = self.execute()
        return result
    def execute(self):
        # Constrain generation to strings that match the Decision regex.
        generator = generate.regex(model, regex_str)
        result = generator(self.prompt, max_tokens=1024, temperature=0, seed=42)
        return result
We define a query function:
import json
def query(question, max_turns=5):
    i = 0
    next_prompt = (
        "\n<|im_start|>user\n" + question + "<|im_end|>"
        "\n<|im_start|>assistant\n"
    )
    previous_actions = []
    while i < max_turns:
        i += 1
        prompt = generate_hermes_prompt(question=question, schema=Decision.model_json_schema())
        bot = ChatBot(prompt=prompt)
        result = bot(next_prompt)
        json_result = json.loads(result)['Decision']
        if "Final_Answer" not in list(json_result.keys()):
            # i has already been incremented above, so this condition is never
            # true and the scratchpad of an intermediate step is always empty.
            scratchpad = json_result['Scratchpad'] if i == 0 else ""
            thought = json_result['Thought']
            action = json_result['Action']
            action_input = json_result['Action_Input']
            print(f"\x1b[34m Scratchpad: {scratchpad} \x1b[0m")
            print(f"\x1b[34m Thought: {thought} \x1b[0m")
            print(f"\x1b[36m -- running {action}: {str(action_input)}\x1b[0m")
            if action + ": " + str(action_input) in previous_actions:
                # Discourage the model from repeating the same call.
                observation = "You already ran that action. **TRY A DIFFERENT ACTION INPUT.**"
            else:
                if action == "calculate":
                    try:
                        observation = eval(str(action_input))
                    except Exception as e:
                        observation = f"{e}"
                elif action == "wikipedia":
                    try:
                        observation = wikipedia(str(action_input))
                    except Exception as e:
                        observation = f"{e}"
            print()
            print(f"\x1b[33m Observation: {observation} \x1b[0m")
            print()
            previous_actions.append(action + ": " + str(action_input))
            # Feed the step and its Observation back into the next prompt.
            next_prompt += (
                "\nScratchpad: " + scratchpad +
                "\nThought: " + thought +
                "\nAction: " + action +
                "\nAction Input: " + action_input +
                "\nObservation: " + str(observation)
            )
        else:
            scratchpad = json_result["Scratchpad"]
            final_answer = json_result["Final_Answer"]
            print(f"\x1b[34m Scratchpad: {scratchpad} \x1b[0m")
            print(f"\x1b[34m Final Answer: {final_answer} \x1b[0m")
            return final_answer
    # No Final Answer was produced within max_turns.
    print("\nFinal Answer: I am sorry, but I am unable to answer your question. Please provide more information or a different question.")
    return "No answer found"
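The control flow of query can be exercised without loading a model. The sketch below is a simplified stand-in rather than the function above: react_loop, scripted_model, and the tools dictionary are hypothetical names, and the "model" simply replays canned JSON decisions. It shows the same three ideas: dispatch on Action, feed the Observation back into the prompt, and stop on Final_Answer or after max_turns.

```python
import json


def react_loop(decide, tools, question, max_turns=5):
    """Run Thought/Action steps until `decide` returns a Final_Answer."""
    transcript = question
    previous_actions = set()
    for _ in range(max_turns):
        decision = json.loads(decide(transcript))["Decision"]
        if "Final_Answer" in decision:
            return decision["Final_Answer"]
        action, action_input = decision["Action"], decision["Action_Input"]
        if (action, action_input) in previous_actions:
            observation = "You already ran that action. Try a different input."
        else:
            try:
                observation = tools[action](action_input)
            except Exception as e:  # report tool errors back instead of crashing
                observation = str(e)
        previous_actions.add((action, action_input))
        # The observation becomes part of the next prompt.
        transcript += (
            f"\nAction: {action}\nAction Input: {action_input}"
            f"\nObservation: {observation}"
        )
    return "No answer found"


def scripted_model(transcript):
    """Stand-in for the LLM: act first, answer once an Observation exists."""
    if "Observation: 1024" in transcript:
        return json.dumps({"Decision": {"Scratchpad": "2**10 = 1024", "Final_Answer": "1024"}})
    return json.dumps({"Decision": {
        "Scratchpad": "", "Thought": "Compute it.",
        "Action": "calculate", "Action_Input": "2**10",
    }})


# eval() mirrors the tutorial's calculate tool; the input here is trusted.
tools = {"calculate": lambda expression: eval(expression)}
print(react_loop(scripted_model, tools, "What's 2 to the power of 10?"))  # 1024
```

Passing the decision function and the tools in as arguments keeps the loop independent of any particular model, which makes the stopping logic easy to check in isolation.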
We can now test our ReAct agent:
print(query("What's 2 to the power of 10?"))
# Scratchpad:
# Thought: I need to perform a mathematical calculation to find the result of 2 to the power of 10.
# -- running calculate: 2**10
#
# Observation: 1024
#
# Scratchpad: 2 to the power of 10 is 1024.
# Final Answer: 2 to the power of 10 is 1024.
# 2 to the power of 10 is 1024.
print(query("What does England share borders with?"))
# Scratchpad:
# Thought: To answer this question, I will use the 'wikipedia' action to gather information about England's geographical location and its borders.
# -- running wikipedia: England borders
#
# Observation: Anglo-Scottish <span class="searchmatch">border</span> (Scottish Gaelic: Crìochan Anglo-Albannach) is an internal <span class="searchmatch">border</span> of the United Kingdom separating Scotland and <span class="searchmatch">England</span> which runs for
#
# Scratchpad: Anglo-Scottish border (Scottish Gaelic: Crìochan Anglo-Albannach) is an internal border of the United Kingdom separating Scotland and England which runs for
# Final Answer: England shares a border with Scotland.
# England shares a border with Scotland.
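The Observation in the second run still contains the span markup that the search API places around matched terms. If that noise is unwanted in the prompt, the snippet could be cleaned before it is fed back. A minimal sketch using only the standard library; clean_snippet is a hypothetical helper, and the regex is a simple tag stripper rather than a full HTML parser:

```python
import html
import re


def clean_snippet(snippet):
    """Remove HTML tags and decode entities in a Wikipedia search snippet."""
    without_tags = re.sub(r"<[^>]+>", "", snippet)
    return html.unescape(without_tags)


raw = (
    'Anglo-Scottish <span class="searchmatch">border</span> is an internal '
    '<span class="searchmatch">border</span> of the United Kingdom'
)
print(clean_snippet(raw))
# Anglo-Scottish border is an internal border of the United Kingdom
```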
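As mentioned in Simon's blog post, this is far from a robust implementation and there is a lot of room for improvement. Still, it is remarkable that a few lines of Python are enough to give an LLM these extra capabilities, and you can now run it locally with an open-weights LLM.
This example was originally contributed by Alonso Silva.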