Ollama 使用指南：Python 与 JavaScript 集成实践

什么是 Ollama？

Ollama 是一个用于在本地运行大语言模型（LLM）的工具和框架。它简化了在本地环境中部署和使用大语言模型的过程，使得开发者可以在自己的计算机上运行强大的 AI 模型，而无需依赖云服务。

Ollama 提供了 REST API 接口，并且为 Python 和 JavaScript 开发者提供了专门的 SDK，使得集成变得非常简单。

安装 Ollama

首先需要在本地安装 Ollama，可以从 Ollama 官网下载适用于您操作系统的安装包。

安装完成后，可以通过以下命令验证安装：

1	ollama --version

拉取模型

在使用 Ollama 之前，需要先拉取所需的模型。以 gemma3 模型为例：

1	ollama pull gemma3

您可以在 Ollama 模型库中找到更多可用的模型。

Python 中使用 Ollama

安装 Python SDK

首先需要安装 Ollama 的 Python SDK：

1	pip install ollama

基本聊天功能

以下是一个简单的聊天示例：

from ollama import chat
from ollama import ChatResponse

# 发送聊天消息
response: ChatResponse = chat(model='gemma3', messages=[
    {
        'role': 'user',
        'content': '为什么天空是蓝色的？',
    },
])

# 输出响应内容
print(response['message']['content'])

# 或者直接通过响应对象访问字段
print(response.message.content)

流式响应

对于需要实时显示响应的场景，可以启用流式响应：

from ollama import chat

# 启用流式响应
stream = chat(
    model='gemma3',
    messages=[{'role': 'user', 'content': '为什么天空是蓝色的？'}],
    stream=True,
)

# 实时输出响应内容
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)

异步客户端

对于异步应用，可以使用 AsyncClient：

import asyncio
from ollama import AsyncClient

async def chat():
    message = {'role': 'user', 'content': '为什么天空是蓝色的？'}
    response = await AsyncClient().chat(model='gemma3', messages=[message])
    print(response.message.content)

# 运行异步函数
asyncio.run(chat())

流式异步响应

import asyncio
from ollama import AsyncClient

async def chat():
    message = {'role': 'user', 'content': '为什么天空是蓝色的？'}
    async for part in await AsyncClient().chat(model='gemma3', messages=[message], stream=True):
        print(part['message']['content'], end='', flush=True)

# 运行异步函数
asyncio.run(chat())

自定义客户端

可以创建自定义客户端来配置特定的选项：

from ollama import Client

# 创建自定义客户端
client = Client(
    host='http://localhost:11434',
    headers={'x-some-header': 'some-value'}
)

# 发送请求
response = client.chat(model='gemma3', messages=[
    {
        'role': 'user',
        'content': '为什么天空是蓝色的？',
    },
])

print(response.message.content)

其他 API 功能

Ollama Python SDK 还提供了其他功能：

import ollama

# 列出所有模型
models = ollama.list()
print(models)

# 显示模型信息
model_info = ollama.show('gemma3')
print(model_info)

# 生成文本（非聊天模式）
response = ollama.generate(model='gemma3', prompt='写一首关于春天的诗')
print(response.response)

# 删除模型
# ollama.delete('gemma3')

# 复制模型
# ollama.copy('gemma3', 'user/gemma3')

# 嵌入文本
embedding = ollama.embed(model='gemma3', input='天空是蓝色的因为瑞利散射')
print(embedding)

JavaScript 中使用 Ollama

安装 JavaScript SDK

在 Node.js 项目中安装 Ollama SDK：

1	npm install ollama

基本聊天功能

import ollama from 'ollama'

// 发送聊天消息
const response = await ollama.chat({
    model: 'gemma3',
    messages: [{ role: 'user', content: '为什么天空是蓝色的？' }],
})

// 输出响应内容
console.log(response.message.content)

流式响应

import ollama from 'ollama'

const message = { role: 'user', content: '为什么天空是蓝色的？' }
const response = await ollama.chat({
    model: 'gemma3',
    messages: [message],
    stream: true,
})

// 实时输出响应内容
for await (const part of response) {
    process.stdout.write(part.message.content)
}

浏览器中使用

在浏览器环境中，需要导入浏览器模块：

import ollama from 'ollama/browser'

// 在浏览器中使用
const response = await ollama.chat({
    model: 'gemma3',
    messages: [{ role: 'user', content: '为什么天空是蓝色的？' }],
})

console.log(response.message.content)

自定义客户端

import { Ollama } from 'ollama'

// 创建自定义客户端
const ollama = new Ollama({
    host: 'http://localhost:11434',
    headers: { 'x-some-header': 'some-value' }
})

// 发送请求
const response = await ollama.chat({
    model: 'gemma3',
    messages: [{ role: 'user', content: '为什么天空是蓝色的？' }],
})

console.log(response.message.content)

其他 API 功能

import ollama from 'ollama'

// 列出所有模型
const models = await ollama.list()
console.log(models)

// 显示模型信息
const modelInfo = await ollama.show('gemma3')
console.log(modelInfo)

// 生成文本（非聊天模式）
const response = await ollama.generate({
    model: 'gemma3',
    prompt: '写一首关于春天的诗'
})
console.log(response.response)

// 嵌入文本
const embedding = await ollama.embed({
    model: 'gemma3',
    input: '天空是蓝色的因为瑞利散射'
})
console.log(embedding)

错误处理

在使用 Ollama 时，需要适当处理可能出现的错误：

Python 错误处理

import ollama
from ollama import ResponseError

model = 'does-not-yet-exist'

try:
    ollama.chat(model)
except ResponseError as e:
    print('错误:', e.error)
    if e.status_code == 404:
        print('模型不存在，需要先拉取模型')
        # ollama.pull(model)

JavaScript 错误处理

import ollama from 'ollama'

try {
    const response = await ollama.chat({
        model: 'does-not-yet-exist',
        messages: [{ role: 'user', content: '为什么天空是蓝色的？' }],
    })
    console.log(response.message.content)
} catch (error) {
    console.error('错误:', error.message)
    if (error.status === 404) {
        console.log('模型不存在，需要先拉取模型')
        // await ollama.pull('does-not-yet-exist')
    }
}

实际应用示例

Python 聊天机器人

from ollama import chat
import asyncio

class ChatBot:
    def __init__(self, model='gemma3'):
        self.model = model
        self.history = []
    
    def send_message(self, message):
        # 添加用户消息到历史记录
        self.history.append({'role': 'user', 'content': message})
        
        # 发送请求
        response = chat(model=self.model, messages=self.history)
        
        # 添加助手响应到历史记录
        assistant_message = response['message']
        self.history.append(assistant_message)
        
        return assistant_message['content']
    
    def reset(self):
        self.history = []

# 使用示例
bot = ChatBot()
response = bot.send_message('你好，介绍一下你自己')
print(response)

JavaScript 聊天应用

import ollama from 'ollama'

class ChatBot {
    constructor(model = 'gemma3') {
        this.model = model
        this.history = []
    }
    
    async sendMessage(message) {
        // 添加用户消息到历史记录
        this.history.push({ role: 'user', content: message })
        
        // 发送请求
        const response = await ollama.chat({
            model: this.model,
            messages: this.history
        })
        
        // 添加助手响应到历史记录
        const assistantMessage = response.message
        this.history.push(assistantMessage)
        
        return assistantMessage.content
    }
    
    reset() {
        this.history = []
    }
}

// 使用示例
const bot = new ChatBot()
const response = await bot.sendMessage('你好，介绍一下你自己')
console.log(response)