Serve with LM Studio

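Would rather not self-host?

If you want to get started quickly with JSON-structured generation, you can call .json instead, a .txt API that guarantees valid JSON output.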

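LM Studio is an application for running local LLMs. It can flexibly mix GPU and CPU compute in hardware-constrained environments.

As of LM Studio 0.3.4, it natively supports Outlines for structured text generation through an OpenAI-compatible endpoint.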

Setup

1. Install LM Studio from its downloads page.
2. Enable the LM Studio server functionality.
3. Download a model.
4. Install the Python dependencies.
```
pip install pydantic openai
```
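Once the server is enabled and a model is downloaded, one way to confirm that both steps worked is to ask the server which models it exposes. The sketch below assumes the server answers the OpenAI-style `GET /v1/models` route at the default address; `parse_model_ids` and `list_model_ids` are hypothetical helper names, not part of LM Studio or the `openai` package.

```python
import json
import urllib.request


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model identifiers from an OpenAI-style model list payload."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_model_ids(
    base_url: str = "http://localhost:1234/v1",  # assumed default address
    timeout: float = 5.0,
) -> list[str]:
    """Fetch the model list from the server (the server must be running)."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))
```

With the server running, `list_model_ids()` should return the identifiers that can be passed as the `model` argument in a request. If the server is not reachable, `urlopen` raises an `OSError` subclass.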

Calling the server

By default, LM Studio serves from http://localhost:1234. If you are serving on a different port or host, make sure to change the base_url argument of OpenAI to the relevant location.
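If the host or port changes between machines, it can help to build the base URL in one place instead of hard-coding it. This is a minimal sketch; `lmstudio_base_url` and the environment variable names `LMSTUDIO_HOST` and `LMSTUDIO_PORT` are hypothetical names chosen for illustration, and LM Studio itself does not read them.

```python
import os


def lmstudio_base_url(default_host: str = "localhost", default_port: int = 1234) -> str:
    """Build the OpenAI-compatible base URL, honoring optional overrides.

    LMSTUDIO_HOST and LMSTUDIO_PORT are hypothetical variable names used
    only by this sketch.
    """
    host = os.environ.get("LMSTUDIO_HOST", default_host)
    port = int(os.environ.get("LMSTUDIO_PORT", default_port))
    # The OpenAI-compatible routes live under the /v1 prefix.
    return f"http://{host}:{port}/v1"
```

The returned string can then be passed as `base_url` when constructing the `openai.OpenAI` client.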

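```python
import openai
from pydantic import BaseModel


class Testing(BaseModel):
    """
    A class representing a testing schema.
    """
    name: str
    age: int


# Point the OpenAI client at the local LM Studio server.
# localhost is used here because 0.0.0.0 is a bind address and is not
# reachable as a destination on every platform.
openai_client = openai.OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="dopeness"  # placeholder; the client requires some key to be set
)

# Make a request to the local LM Studio server
response = openai_client.beta.chat.completions.parse(
    model="hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF",
    messages=[
        {"role": "system", "content": "You are like so good at whatever you do."},
        {"role": "user", "content": "My name is Cameron and I am 28 years old. What's my name and age?"}
    ],
    response_format=Testing  # the Pydantic model defines the expected JSON schema
)
```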

You should receive a ParsedChatCompletion[Testing] object back:

```python
ParsedChatCompletion[Testing](
    id='chatcmpl-3hykyf0fxus7jc90k6gwlw',
    choices=[
        ParsedChoice[Testing](
            finish_reason='stop',
            index=0,
            logprobs=None,
            message=ParsedChatCompletionMessage[Testing](
                content='{ "age": 28, "name": "Cameron" }',
                refusal=None,
                role='assistant',
                function_call=None,
                tool_calls=[],
                parsed=Testing(name='Cameron', age=28)
            )
        )
    ],
    created=1728595622,
    model='lmstudio-community/Phi-3.1-mini-128k-instruct-GGUF/Phi-3.1-mini-128k-instruct-Q4_K_M.gguf',
    object='chat.completion',
    service_tier=None,
    system_fingerprint='lmstudio-community/Phi-3.1-mini-128k-instruct-GGUF/Phi-3.1-mini-128k-instruct-Q4_K_M.gguf',
    usage=CompletionUsage(
        completion_tokens=17,
        prompt_tokens=47,
        total_tokens=64,
        completion_tokens_details=None,
        prompt_tokens_details=None
    )
)
```

You can retrieve your Testing object with:

```python
response.choices[0].message.parsed
```
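The `parsed` field is optional in the `openai` client and may be `None`, for instance when the message carries a `refusal`. A defensive accessor could look like the sketch below. `extract_parsed` is a hypothetical helper name, and the `SimpleNamespace` object is only a stand-in that mimics the attribute layout of the response shown above so the helper can be exercised without a running server.

```python
from types import SimpleNamespace


def extract_parsed(response):
    """Return the parsed object from the first choice, or raise if unavailable."""
    message = response.choices[0].message
    if getattr(message, "refusal", None):
        raise ValueError(f"Model refused the request: {message.refusal}")
    if message.parsed is None:
        raise ValueError(f"No parsed object; raw content was: {message.content!r}")
    return message.parsed


# Stand-in response with the same attribute names as the real object.
fake_message = SimpleNamespace(
    refusal=None,
    content='{"name": "Cameron", "age": 28}',
    parsed={"name": "Cameron", "age": 28},
)
fake_response = SimpleNamespace(choices=[SimpleNamespace(message=fake_message)])
person = extract_parsed(fake_response)
```

With a real response from the server, `extract_parsed(response)` would return the `Testing` instance instead of the plain dictionary used in this stand-in.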