Merge branch 'labring:main' into local

This commit is contained in:
Theresa
2025-03-26 15:54:01 +08:00
committed by GitHub
26 changed files with 1427 additions and 6 deletions

12 binary image files added (contents not shown; sizes range from 9.0 KiB to 179 KiB).

View File

@@ -0,0 +1,184 @@
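---
title: 'Connecting Local Models with Ollama'
description: 'Deploy your own models with Ollama'
icon: 'api'
draft: false
toc: true
weight: 950
---
[Ollama](https://ollama.com/) is an open-source tool for deploying large AI models. It focuses on simplifying the deployment and use of large language models, and supports downloading and running a wide range of models with a single command.
## Installing Ollama
Ollama supports several installation methods, but deploying it from the Docker image is recommended. If you install Ollama directly on a personal machine, you then have to work out how the FastGPT container in Docker can reach Ollama on the host, which is more troublesome.
### Docker installation (recommended)
You can use the official Ollama Docker image to install and start the Ollama service in one step (make sure Docker is already installed on your machine). The commands are:
```bash
docker pull ollama/ollama
docker run --rm -d --name ollama -p 11434:11434 ollama/ollama
```
If FastGPT is deployed in Docker, run the Ollama container on the same network as the FastGPT container; otherwise FastGPT may be unable to reach it. The command is:
```bash
docker run --rm -d --name ollama --network <your FastGPT container network> -p 11434:11434 ollama/ollama
```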
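### Host installation
If you prefer not to use Docker, you can install Ollama directly on the host. The options are listed below.
#### macOS
If you are on macOS and have the Homebrew package manager installed, install Ollama with:
```bash
brew install ollama
ollama serve # start the service once installation completes
```
#### Linux
On Linux, you can install Ollama with the official install script. Taking Ubuntu as an example, run the following in a terminal:
```bash
curl -fsSL https://ollama.com/install.sh | sh # downloads the install script from the official site and runs it
ollama serve # start the service once installation completes
```
#### Windows
On Windows, download the Windows installer from the [Ollama website](https://ollama.com/), run it, and follow the wizard to complete the installation. Then start the service from Command Prompt or PowerShell:
```bash
ollama serve # once the service is running, open http://localhost:11434 in a browser to verify the installation
```
#### Additional notes
If you run Ollama as a host application rather than from the image, make sure it listens on 0.0.0.0.
##### 1. Linux
If Ollama runs as a systemd service, open a terminal and edit its service file with `sudo systemctl edit ollama.service`. In the `[Service]` section, add `Environment="OLLAMA_HOST=0.0.0.0"`. Save and exit the editor, then run `sudo systemctl daemon-reload` and `sudo systemctl restart ollama` to apply the change.
##### 2. macOS
Open a terminal, set the environment variable with `launchctl setenv OLLAMA_HOST "0.0.0.0"`, then restart the Ollama application for the change to take effect.
##### 3. Windows
Open "Edit the system environment variables" from the Start menu or the search bar, click "Environment Variables" in the "System Properties" window, then click "New" under "System variables". Create a variable named `OLLAMA_HOST` with the value `0.0.0.0`, click "OK" to save, and restart the Ollama application from the Start menu.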
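### Pulling models with Ollama
A fresh Ollama installation has no models locally, so you need to pull the ones you want yourself:
```bash
# For a Docker deployment, enter the container first: docker exec -it <Ollama container name> /bin/sh
ollama pull <model name>
```
![](/imgs/Ollama-pull.png)
### Testing connectivity
After installation, verify connectivity. Enter the FastGPT container and try to reach your Ollama instance:
```bash
docker exec -it <FastGPT container name> /bin/sh
curl http://XXX.XXX.XXX.XXX:11434 # container deployment: "http://<container name>:<port>"; host installation: "http://<host IP>:<port>" (the host IP must not be localhost)
```
If the response shows that your Ollama service is running, the two can communicate normally.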
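The same reachability check can be scripted instead of run by hand. The following is a minimal sketch, not part of FastGPT or Ollama: the function name `ollama_is_reachable` is hypothetical, and it assumes the Ollama root endpoint answers with the text "Ollama is running", which is what the `curl` call reports on a healthy instance.

```python
import urllib.error
import urllib.request


def ollama_is_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET <base_url> answers 200 with Ollama's banner text."""
    # Ignore HTTP proxy environment variables: Ollama is expected on the
    # local Docker network or the local host, not behind a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status == 200 and "Ollama is running" in body
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, HTTP error, or malformed URL.
        return False
```

Call it with the address FastGPT will use, for example `ollama_is_reachable("http://<container name>:11434")` for a container deployment or `ollama_is_reachable("http://<host IP>:11434")` for a host installation.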
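## Connecting Ollama to FastGPT
### 1. List the models available in Ollama
First, list the models available in Ollama with the following command:
```bash
# For a Docker deployment, enter the container first: docker exec -it <Ollama container name> /bin/sh
ollama ls
```
![](/imgs/Ollama-models1.png)
### 2. Connecting through AI Proxy
If you deployed FastGPT with the default configuration file (see [here](/docs/development/docker.md)), it starts with AI Proxy by default.
![](/imgs/Ollama-aiproxy1.png)
Also make sure FastGPT can reach the Ollama container directly. If it cannot, go back through the installation section above and check whether the host Ollama is not listening on 0.0.0.0 or whether the containers are not on the same network.
![](/imgs/Ollama-aiproxy2.png)
In FastGPT, go to Account -> Model Providers -> Model Configuration -> Add Model and add your model. The model ID must match the model name in OneAPI. For details, see [here](/docs/development/modelConfig/intro.md).
![](/imgs/Ollama-models2.png)
![](/imgs/Ollama-models3.png)
Start FastGPT and go to Account -> Model Providers -> Model Channels -> Add Channel. Select Ollama as the channel type, add the models you pulled, and fill in the proxy address in the form http://<address>:<port>. For a container deployment the address is "http://<container name>:<port>"; for a host installation it is "http://<host IP>:<port>" (the host IP must not be localhost).
![](/imgs/Ollama-aiproxy3.png)
Create an application in the workbench and select the model you added; the model name shown is the alias you set when adding it. Note: the same model cannot be added more than once; the system uses the alias from the most recent addition.
![](/imgs/Ollama-models4.png)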
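### 3. Connecting through OneAPI
If you want to use OneAPI, first pull the OneAPI image, then run it on the same network as the FastGPT container:
```bash
# Pull the OneAPI image
docker pull justsong/one-api
# Run the container on the FastGPT network with a name of your choice
docker run -d --network <FastGPT network> --name <container name> -p 3000:3000 justsong/one-api
```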
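Open the OneAPI page and add a new channel with the type set to Ollama. Enter your Ollama models in the model field; the names must match the model names in Ollama. Below that, fill in your Ollama proxy address, by default http://<address>:<port>, without the /v1 suffix. After the channel is added, run the channel test in OneAPI; a successful test confirms the channel works. The screenshots show Ollama deployed with Docker; for Ollama on the host, change the proxy address to http://<host IP>:<port>.
![](/imgs/Ollama-oneapi1.png)
Once the channel is added, click Tokens, click Add Token, enter a name, and adjust the settings.
![](/imgs/Ollama-oneapi2.png)
Edit the docker-compose.yml file used to deploy FastGPT: comment out the AI Proxy settings and set OPENAI_BASE_URL to your OneAPI address, by default http://<address>:<port>/v1 (the /v1 suffix is required). Set KEY to the token you created in OneAPI.
![](/imgs/Ollama-oneapi3.png)
[Skip to step 5](#5-adding-and-using-the-model) to add the model and use it.
### 4. Direct connection
If you want to use neither AI Proxy nor OneAPI, you can connect directly. Edit the docker-compose.yml file used to deploy FastGPT, comment out the AI Proxy settings, and use a configuration similar to the OneAPI one: set OPENAI_BASE_URL to your Ollama address, by default http://<address>:<port>/v1 (the /v1 suffix is required). KEY can be any value, because Ollama has no authentication by default; if you have enabled authentication, fill in the real key. Everything else is the same as connecting Ollama through OneAPI: add your model in FastGPT and it is ready to use. The screenshot shows Ollama deployed with Docker; for Ollama on the host, change the address to http://<host IP>:<port>.
![](/imgs/Ollama-direct1.png)
When you are done, [go to step 5](#5-adding-and-using-the-model) to add the model and use it.
### 5. Adding and using the model
In FastGPT, go to Account -> Model Providers -> Model Configuration -> Add Model and add your model. The model ID must match the model name in OneAPI.
![](/imgs/Ollama-models2.png)
![](/imgs/Ollama-models3.png)
Create an application in the workbench and select the model you added; the model name shown is the alias you set when adding it. Note: the same model cannot be added more than once; the system uses the alias from the most recent addition.
![](/imgs/Ollama-models4.png)
### 6. Additional notes
In all of the Ollama proxy addresses above, use "http://<host IP>:<port>" when Ollama is installed on the host and "http://<container name>:<port>" when it is deployed as a container.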
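The address rules repeated throughout this page can be condensed into a small helper. This is only an illustrative sketch: `ollama_proxy_url` and its parameters are hypothetical names, not part of FastGPT or Ollama, and rejecting `127.0.0.1` alongside `localhost` is an added assumption (inside the FastGPT container both refer to the container itself).

```python
def ollama_proxy_url(deployment: str, host: str, port: int = 11434,
                     openai_compatible: bool = False) -> str:
    """Build the Ollama address to enter in FastGPT, AI Proxy, or OneAPI.

    deployment: "container" (host is the container name) or "host" (host is the host IP).
    openai_compatible: True for OPENAI_BASE_URL, which needs the /v1 suffix;
                       False for AI Proxy / OneAPI channel addresses, which do not.
    """
    if deployment not in ("container", "host"):
        raise ValueError("deployment must be 'container' or 'host'")
    # Assumption: loopback names are rejected because they resolve to the
    # FastGPT container itself, not to the machine running Ollama.
    if deployment == "host" and host in ("localhost", "127.0.0.1"):
        raise ValueError("use the host's real IP address, not a loopback name")
    url = f"http://{host}:{port}"
    return f"{url}/v1" if openai_compatible else url
```

For instance, `ollama_proxy_url("container", "ollama")` gives the channel address for the Docker deployment shown earlier, while `ollama_proxy_url("host", "<host IP>", openai_compatible=True)` gives the value for `OPENAI_BASE_URL` in the direct connection.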

View File

@@ -1,5 +1,5 @@
 protobuf
-transformers==4.30.2
+transformers==4.48.0
 cpm_kernels
 torch>=2.0
 gradio

View File

@@ -6,6 +6,6 @@ sentence_transformers==2.2.2
 sse_starlette==1.6.5
 starlette==0.27.0
 tiktoken==0.4.0
-torch==2.0.1
-transformers==4.31.0
+torch==2.4.0
+transformers==4.48.0
 uvicorn==0.23.2

View File

@@ -0,0 +1,85 @@
# Readme
# Project overview
---
Modeled on the official **pdf-marker** plugin, this project uses MinerU to provide an efficient **PDF-to-Markdown API service** that quickly converts PDF documents into Markdown text.
- **Simple:** no code changes are needed; adjust the file paths and it is ready to use
- **Easy to use:** with a small API, developers only need to send an HTTP request to convert a PDF
- **Flexible:** supports local deployment, so it is quick to start with and easy to integrate
# Recommended configuration
For hardware requirements and processing speed, see the official [MinerU project](https://github.com/opendatalab/MinerU/blob/master/README_zh-CN.md) documentation.
# Local development
## Basic workflow
1. Install the base environment, following the official guide to running MinerU [on CPU and GPU](https://github.com/opendatalab/MinerU/blob/master/README_zh-CN.md#%E4%BD%BF%E7%94%A8GPU). First create the runtime environment with Anaconda:
```bash
conda create -n mineru python=3.10
conda activate mineru
pip install -U "magic-pdf[full]" --extra-index-url https://wheels.myhloli.com -i https://mirrors.aliyun.com/pypi/simple
```
2. [Download the model weights](https://github.com/opendatalab/MinerU/blob/master/docs/how_to_download_models_zh_cn.md):
```bash
pip install modelscope
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/scripts/download_models.py -O download_models.py
python download_models.py
```
The Python script downloads the model files automatically and writes the model directory into the configuration file.
The configuration file is named `magic-pdf.json` and is located in your user directory.
> The user directory is "C:\\Users\\<username>" on Windows, "/home/<username>" on Linux, and "/Users/<username>" on macOS.
3. If your GPU has **8 GB** of VRAM or more, you can follow the steps below to try CUDA-accelerated parsing. The default mode is CPU; to use the GPU, change the value of "device-mode" in `magic-pdf.json` in your user directory.
```json
{
  "device-mode": "cuda"
}
```
4. GPU acceleration requires extra dependencies:
```bash
pip install --force-reinstall torch==2.3.1 torchvision==0.18.1 "numpy<2.0.0" --index-url https://download.pytorch.org/whl/cu118
```
```bash
pip install paddlepaddle-gpu==2.6.1
```
5. Clone the FastGPT repository:
```bash
git clone https://github.com/labring/FastGPT.git
```
6. Change into the pdf-mineru folder under plugins/model:
```bash
cd FastGPT/plugins/model/pdf-mineru/
```
7. Run pdf_parser_mineru.py to start the service:
```bash
python pdf_parser_mineru.py
```
# Example request
The request format follows **pdf-marker**.
```bash
curl --location --request POST "http://localhost:7231/v2/parse/file" \
--header "Authorization: Bearer your_access_token" \
--form "file=@./file/chinese_test.pdf"
```
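For callers that prefer Python over curl, the same multipart upload can be sketched with the standard library only. The helper names `build_multipart` and `parse_pdf` are hypothetical and not part of this plugin; the URL is passed in by the caller and must match the route the running service actually registers. A non-2xx reply raises `urllib.error.HTTPError`.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode a single file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def parse_pdf(url: str, pdf_path: str, token: str = "") -> dict:
    """POST a PDF to the parsing service and return the decoded JSON reply."""
    path = Path(pdf_path)
    body, content_type = build_multipart("file", path.name, path.read_bytes())
    headers = {"Content-Type": content_type}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call would look like `parse_pdf("<service URL>", "./file/chinese_test.pdf", token="your_access_token")`, with the reply carrying the Markdown text and the page count.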

View File

@@ -0,0 +1,282 @@
import json
import os
from base64 import b64encode
from glob import glob
from io import StringIO
from typing import Tuple, Union
import uvicorn
from fastapi import FastAPI, UploadFile, File
from fastapi.responses import JSONResponse
from loguru import logger
from tempfile import TemporaryDirectory
from pathlib import Path
import fitz # PyMuPDF
import asyncio
from concurrent.futures import ProcessPoolExecutor
import torch
import multiprocessing as mp
from contextlib import asynccontextmanager
import time
import magic_pdf.model as model_config
from magic_pdf.config.enums import SupportedPdfParseMethod
from magic_pdf.data.data_reader_writer import DataWriter, FileBasedDataWriter
from magic_pdf.data.dataset import PymuDocDataset
from magic_pdf.model.doc_analyze_by_custom_model import doc_analyze
from magic_pdf.operators.models import InferenceResult
from magic_pdf.operators.pipes import PipeResult
model_config.__use_inside_model__ = True
app = FastAPI()
process_variables = {}
my_pool = None
class MemoryDataWriter(DataWriter):
    def __init__(self):
        self.buffer = StringIO()

    def write(self, path: str, data: bytes) -> None:
        if isinstance(data, str):
            self.buffer.write(data)
        else:
            self.buffer.write(data.decode("utf-8"))

    def write_string(self, path: str, data: str) -> None:
        self.buffer.write(data)

    def get_value(self) -> str:
        return self.buffer.getvalue()  # Fix: call StringIO.getvalue(), not get_value()

    def close(self):
        self.buffer.close()
def worker_init(counter, lock):
    num_gpus = torch.cuda.device_count()
    processes_per_gpu = int(os.environ.get('PROCESSES_PER_GPU', 1))
    with lock:
        worker_id = counter.value
        counter.value += 1
    if num_gpus == 0:
        device = 'cpu'
        device_id = None  # CPU-only host: there is no GPU to pin this worker to
    else:
        device_id = worker_id // processes_per_gpu
        if device_id >= num_gpus:
            raise ValueError(f"Worker ID {worker_id} exceeds available GPUs ({num_gpus}).")
        device = f'cuda:{device_id}'
    config = {
        "parse_method": "auto",
        "ADDITIONAL_KEY": "VALUE"
    }
    converter = init_converter(config, device_id)
    pid = os.getpid()
    process_variables[pid] = converter
    print(f"Worker {worker_id}: Models loaded successfully on {device}!")


def init_converter(config, device_id):
    # Restrict the worker to its assigned GPU; skip this on CPU-only hosts.
    if device_id is not None:
        os.environ["CUDA_VISIBLE_DEVICES"] = str(device_id)
    return config
def img_to_base64(img_path: str) -> str:
    with open(img_path, "rb") as img_file:
        return b64encode(img_file.read()).decode('utf-8')


def embed_images_as_base64(md_content: str, image_dir: str) -> str:
    lines = md_content.split('\n')
    new_lines = []
    for line in lines:
        if line.startswith("![") and "](" in line and ")" in line:
            start_idx = line.index("](") + 2
            end_idx = line.index(")", start_idx)
            img_rel_path = line[start_idx:end_idx]
            img_name = os.path.basename(img_rel_path)
            img_path = os.path.join(image_dir, img_name)
            logger.info(f"Checking image: {img_path}")
            if os.path.exists(img_path):
                img_base64 = img_to_base64(img_path)
                new_line = f"![](data:image/png;base64,{img_base64})"
                new_lines.append(new_line)
            else:
                logger.warning(f"Image not found: {img_path}")
                new_lines.append(line)
        else:
            new_lines.append(line)
    return '\n'.join(new_lines)
def process_pdf(pdf_path, output_dir):
    try:
        pid = os.getpid()
        config = process_variables.get(pid, "No variable")
        parse_method = config["parse_method"]
        with open(str(pdf_path), "rb") as f:
            pdf_bytes = f.read()
        output_path = Path(output_dir) / f"{Path(pdf_path).stem}_output"
        os.makedirs(str(output_path), exist_ok=True)
        image_dir = os.path.join(str(output_path), "images")
        os.makedirs(image_dir, exist_ok=True)
        image_writer = FileBasedDataWriter(str(output_path))
        # Process the PDF
        infer_result, pipe_result = process_pdf_content(pdf_bytes, parse_method, image_writer)
        md_content_writer = MemoryDataWriter()
        pipe_result.dump_md(md_content_writer, "", "images")
        md_content = md_content_writer.get_value()
        md_content_writer.close()
        # Collect the paths of the images that were saved
        image_paths = glob(os.path.join(image_dir, "*.jpg"))
        logger.info(f"Saved images by magic_pdf: {image_paths}")
        # If magic_pdf did not save enough images, extract them with fitz
        if not image_paths or len(image_paths) < 3:  # assumes at least 3 images
            logger.warning("Insufficient images saved by magic_pdf, falling back to fitz extraction")
            image_map = {}
            original_names = []
            # Collect every image file name referenced in the Markdown
            for line in md_content.split('\n'):
                if line.startswith("![") and "](" in line and ")" in line:
                    start_idx = line.index("](") + 2
                    end_idx = line.index(")", start_idx)
                    img_rel_path = line[start_idx:end_idx]
                    original_names.append(os.path.basename(img_rel_path))
            # Extract the images and map them to the referenced names
            with fitz.open(pdf_path) as doc:
                img_counter = 0
                for page_num, page in enumerate(doc):
                    for img_index, img in enumerate(page.get_images(full=True)):
                        xref = img[0]
                        base = doc.extract_image(xref)
                        if img_counter < len(original_names):
                            img_name = original_names[img_counter]  # reuse the original name from the Markdown
                        else:
                            img_name = f"page_{page_num}_img_{img_index}.jpg"
                        img_path = os.path.join(image_dir, img_name)
                        with open(img_path, "wb") as f:
                            f.write(base["image"])
                        if img_counter < len(original_names):
                            image_map[original_names[img_counter]] = img_name
                        img_counter += 1
            image_paths = glob(os.path.join(image_dir, "*.jpg"))
            logger.info(f"Images extracted by fitz: {image_paths}")
            # Update the Markdown, replacing names only where they changed
            for original_name, new_name in image_map.items():
                if original_name != new_name:
                    md_content = md_content.replace(f"images/{original_name}", f"images/{new_name}")
        return {
            "status": "success",
            "text": md_content,
            "output_path": str(output_path),
            "images": image_paths
        }
    except Exception as e:
        logger.error(f"Error processing PDF: {str(e)}")
        return {
            "status": "error",
            "message": str(e),
            "file": str(pdf_path)
        }
def process_pdf_content(pdf_bytes, parse_method, image_writer):
    ds = PymuDocDataset(pdf_bytes)
    infer_result: InferenceResult = None
    pipe_result: PipeResult = None
    if parse_method == "ocr":
        infer_result = ds.apply(doc_analyze, ocr=True)
        pipe_result = infer_result.pipe_ocr_mode(image_writer)
    elif parse_method == "txt":
        infer_result = ds.apply(doc_analyze, ocr=False)
        pipe_result = infer_result.pipe_txt_mode(image_writer)
    else:  # auto
        if ds.classify() == SupportedPdfParseMethod.OCR:
            infer_result = ds.apply(doc_analyze, ocr=True)
            pipe_result = infer_result.pipe_ocr_mode(image_writer)
        else:
            infer_result = ds.apply(doc_analyze, ocr=False)
            pipe_result = infer_result.pipe_txt_mode(image_writer)
    return infer_result, pipe_result
@asynccontextmanager
async def lifespan(app: FastAPI):
    try:
        mp.set_start_method('spawn')
    except RuntimeError:
        raise RuntimeError("Set start method to spawn twice. This may be a temporary issue with the script. Please try running it again.")
    global my_pool
    manager = mp.Manager()
    worker_counter = manager.Value('i', 0)
    worker_lock = manager.Lock()
    gpu_count = torch.cuda.device_count()
    # Keep at least one worker so that CPU-only hosts (gpu_count == 0) still get a pool.
    my_pool = ProcessPoolExecutor(max_workers=max(gpu_count, 1) * int(os.environ.get('PROCESSES_PER_GPU', 1)),
                                  initializer=worker_init, initargs=(worker_counter, worker_lock))
    yield
    if my_pool:
        my_pool.shutdown(wait=True)
    print("Application shutdown, cleaning up...")


app.router.lifespan_context = lifespan
@app.post("/v2/parse/file")
async def process_pdfs(file: UploadFile = File(...)):
    s_time = time.time()
    with TemporaryDirectory() as temp_dir:
        temp_path = Path(temp_dir) / file.filename
        with open(str(temp_path), "wb") as buffer:
            buffer.write(await file.read())
        # Validate the PDF file
        try:
            with fitz.open(str(temp_path)) as pdf_document:
                total_pages = pdf_document.page_count
        except fitz.FileDataError:
            return JSONResponse(content={"success": False, "message": "", "error": "Invalid PDF file"}, status_code=400)
        except Exception as e:
            logger.error(f"Error opening PDF: {str(e)}")
            return JSONResponse(content={"success": False, "message": "", "error": f"Internal server error: {str(e)}"}, status_code=500)
        try:
            loop = asyncio.get_running_loop()
            results = await loop.run_in_executor(
                my_pool,
                process_pdf,
                str(temp_path),
                str(temp_dir)
            )
            if results.get("status") == "error":
                return JSONResponse(content={
                    "success": False,
                    "message": "",
                    "error": results.get("message")
                }, status_code=500)
            # Embed the images as base64
            image_dir = os.path.join(results.get("output_path"), "images")
            md_content_with_base64 = embed_images_as_base64(results.get("text"), image_dir)
            return {
                "success": True,
                "message": "",
                "markdown": md_content_with_base64,
                "pages": total_pages
            }
        except Exception as e:
            logger.error(f"Error in process_pdfs: {str(e)}")
            return JSONResponse(content={
                "success": False,
                "message": "",
                "error": f"Internal server error: {str(e)}"
            }, status_code=500)


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=7231)

View File

@@ -0,0 +1 @@
MISTRAL_API_KEY=

View File

@@ -0,0 +1,143 @@
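# PDF-Mistral plugin
This plugin uses Mistral's OCR API to convert PDF files into Markdown text. It extracts the text and images from a PDF document and returns them as Markdown with embedded base64 images.
## Features
- PDF text extraction with the Mistral OCR API
- Images embedded in the Markdown as base64
- Thorough error handling
- Multi-page PDF support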
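The base64 image embedding listed above amounts to rewriting each Markdown image reference so that its target is a data URI. The following is a simplified sketch of that idea, not the plugin's exact code: `inline_images` is a hypothetical name, untyped base64 data is assumed to be PNG, and unlike the server this version keeps the alt text.

```python
import os
import re

# Matches Markdown image references such as ![alt](path/to/img-0.jpeg)
IMAGE_REF = re.compile(r"!\[(.*?)\]\((.*?)\)")


def inline_images(markdown: str, images: dict) -> str:
    """Replace image targets with data URIs when base64 data for them is known.

    images maps a file name (no directory part) to its base64 string.
    """
    def _swap(match):
        name = os.path.basename(match.group(2))
        data = images.get(name)
        if data is None:
            return match.group(0)  # unknown image: keep the original reference
        if not data.startswith("data:image/"):
            # Assumption: treat data without a prefix as PNG.
            data = f"data:image/png;base64,{data}"
        return f"![{match.group(1)}]({data})"

    return IMAGE_REF.sub(_swap, markdown)
```

Images with no matching entry are left untouched, so a partial image map degrades to ordinary relative links rather than broken output.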
## Setup
### Prerequisites
- Python 3.8+
- A Mistral API key ([get one here](https://mistral.ai/))
### Installation
1. Install the required packages:
```bash
pip install -r requirements.txt
```
2. Set the environment variable by creating or editing the `.env` file:
```bash
# In the .env file
MISTRAL_API_KEY=your-mistral-api-key
```
## Usage
### Starting the server
Run the server with:
```bash
python api_mp.py
```
Or run it directly with uvicorn:
```bash
uvicorn api_mp:app --host 0.0.0.0 --port 7231
```
Then add the service to the FastGPT configuration file:
```json
{
  xxx
  "systemEnv": {
    xxx
    "customPdfParse": {
      "url": "http://localhost:7231/v1/parse/file", // address of the custom PDF parsing service
    }
  }
}
```
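### API endpoints
#### Parse a PDF file
**Endpoint:** `POST /v1/parse/file`
**Request:**
- Multipart form data containing a `file` field
**Response:**
```json
{
  "pages": 5, // number of pages in the PDF
  "markdown": "...", // Markdown content with embedded base64 images
  "duration": 10.5 // processing time in seconds
}
```
**Error response:**
```json
{
  "pages": 0,
  "markdown": "",
  "error": "error message"
}
```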
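Because a failure is reported through the `error` field of the JSON body rather than through a separate response shape, a caller has to inspect the payload before using it. A minimal sketch of that check follows; `unpack_parse_response` is a hypothetical helper name, and the field names are the ones shown in the two responses above.

```python
from typing import Tuple


def unpack_parse_response(payload: dict) -> Tuple[str, int]:
    """Return (markdown, pages), or raise RuntimeError if the service reported an error."""
    error = payload.get("error")
    if error:
        raise RuntimeError(f"PDF parsing failed: {error}")
    return payload.get("markdown", ""), payload.get("pages", 0)
```

Raising on the `error` field keeps an empty `markdown` string from being mistaken for a successfully parsed but empty document.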
### Examples
With curl:
```bash
curl -X POST -F "file=@path/to/your/document.pdf" http://localhost:7231/v1/parse/file
```
With JavaScript/Axios:
```javascript
const formData = new FormData();
formData.append('file', pdfFile);
const response = await axios.post('http://localhost:7231/v1/parse/file', formData, {
  headers: {
    'Content-Type': 'multipart/form-data'
  }
});
if (response.data.error) {
  console.error('Error:', response.data.error);
} else {
  console.log('Pages:', response.data.pages);
  console.log('Markdown:', response.data.markdown);
}
```
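## Limitations
- The PDF file must be readable and not password-protected
- The maximum file size depends on the Mistral API limit, currently 52.4 MB
- The Mistral API also has a page limit of at most 1,000 pages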
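Checking the size limit locally avoids uploading a file that the API would reject anyway. The sketch below is illustrative only: `within_size_limit` and `MAX_UPLOAD_BYTES` are hypothetical names, and reading "52.4M" as 52,400,000 bytes is an assumption (the real limit may be expressed in MiB). The page limit is not checked here because counting pages needs a PDF library.

```python
import os

# Assumption: the documented "52.4M" limit is read as decimal megabytes.
MAX_UPLOAD_BYTES = 52_400_000


def within_size_limit(path: str, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if the file at path is no larger than limit bytes."""
    return os.path.getsize(path) <= limit
```

A caller would run `within_size_limit("document.pdf")` before posting the file and skip the upload when it returns `False`.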
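## Troubleshooting
### Common errors
1. **"MISTRAL_API_KEY environment variable not set"**
- Make sure you have added your Mistral API key to the `.env` file
- Make sure the `.env` file is in the same directory as the script
2. **"Failed to process PDF file"**
- The PDF may be corrupted or password-protected
- Try a different PDF file
3. **Mistral API errors**
- Check that your Mistral API key is valid
- Make sure you are within the Mistral API rate limits
- Verify that the PDF is within the size and page limits
## License
MIT License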

View File

@@ -0,0 +1,230 @@
import time
import base64
import fitz
import re
import json
from contextlib import asynccontextmanager
from loguru import logger
from fastapi import HTTPException, FastAPI, UploadFile, File
from fastapi.responses import JSONResponse
from mistralai import Mistral
import os
import shutil
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()
app = FastAPI()
temp_dir = "./temp"
# Initialize Mistral client with API key from environment variable
mistral_api_key = os.environ.get("MISTRAL_API_KEY", "")
if not mistral_api_key:
    logger.warning("MISTRAL_API_KEY environment variable not set. PDF processing will fail.")
mistral_client = Mistral(api_key=mistral_api_key) if mistral_api_key else None
@asynccontextmanager
async def lifespan(app: FastAPI):
    # Create temp directory if it doesn't exist
    global temp_dir
    if not os.path.exists(temp_dir):
        os.makedirs(temp_dir)
    print("Application startup, creating temp directory...")
    yield
    if temp_dir and os.path.exists(temp_dir):
        shutil.rmtree(temp_dir)
    print("Application shutdown, cleaning up...")


app.router.lifespan_context = lifespan
@app.post("/v1/parse/file")
async def read_file(
        file: UploadFile = File(...)):
    temp_file_path = None
    try:
        start_time = time.time()
        global temp_dir
        os.makedirs(temp_dir, exist_ok=True)
        temp_file_path = os.path.join(temp_dir, file.filename)
        with open(temp_file_path, "wb") as temp_file:
            file_content = await file.read()
            temp_file.write(file_content)
        # Get page count using PyMuPDF
        try:
            pdf_document = fitz.open(temp_file_path)
            total_pages = pdf_document.page_count
            pdf_document.close()
        except Exception as e:
            logger.error(f"Failed to open PDF file: {str(e)}")
            return {
                "pages": 0,
                "markdown": "",
                "error": f"Failed to process PDF file: {str(e)}"
            }
        if mistral_client is None:
            return {
                "pages": 0,
                "markdown": "",
                "error": "MISTRAL_API_KEY environment variable not set."
            }
        # Step 1: Upload the file to Mistral's servers
        logger.info(f"Uploading file {file.filename} to Mistral servers")
        with open(temp_file_path, "rb") as f:
            try:
                uploaded_file = mistral_client.files.upload(
                    file={
                        "file_name": file.filename,
                        "content": f,
                    },
                    purpose="ocr"
                )
            except Exception as e:
                error_msg = str(e)
                # Try to parse Mistral API error format
                try:
                    error_data = json.loads(error_msg)
                    if error_data.get("object") == "error":
                        error_msg = error_data.get("message", error_msg)
                except Exception:
                    # Not a JSON error payload: keep the raw message
                    pass
                return {
                    "pages": 0,
                    "markdown": "",
                    "error": f"Mistral API upload error: {error_msg}"
                }
        # Step 2: Get a signed URL for the uploaded file
        logger.info(f"Getting signed URL for file ID: {uploaded_file.id}")
        try:
            signed_url = mistral_client.files.get_signed_url(file_id=uploaded_file.id)
        except Exception as e:
            error_msg = str(e)
            # Try to parse Mistral API error format
            try:
                error_data = json.loads(error_msg)
                if error_data.get("object") == "error":
                    error_msg = error_data.get("message", error_msg)
            except Exception:
                # Not a JSON error payload: keep the raw message
                pass
            return {
                "pages": 0,
                "markdown": "",
                "error": f"Mistral API signed URL error: {error_msg}"
            }
        # Step 3: Process the file using the signed URL
        logger.info("Processing file with OCR API")
        try:
            ocr_response = mistral_client.ocr.process(
                model="mistral-ocr-latest",
                document={
                    "type": "document_url",
                    "document_url": signed_url.url,
                },
                include_image_base64=True
            )
        except Exception as e:
            error_msg = str(e)
            # Try to parse Mistral API error format
            try:
                error_data = json.loads(error_msg)
                if error_data.get("object") == "error":
                    error_msg = error_data.get("message", error_msg)
            except Exception:
                # Not a JSON error payload: keep the raw message
                pass
            return {
                "pages": 0,
                "markdown": "",
                "error": f"Mistral OCR processing error: {error_msg}"
            }
        # Combine all pages' markdown content
        markdown_content = "\n".join(page.markdown for page in ocr_response.pages)
        # Create a dictionary to map image filenames to their base64 data
        image_map = {}
        for page in ocr_response.pages:
            for img in page.images:
                # Extract the image filename from the image id
                img_id = img.id
                img_base64 = img.image_base64
                # Print a sample of the first image base64 data for debugging
                if len(image_map) == 0 and img_base64:
                    print("Sample image base64 prefix:", img_base64[:50] if len(img_base64) > 50 else img_base64)
                    print("Does base64 already include prefix?", img_base64.startswith("data:image/"))
                # Ensure the base64 data is in the correct format for the upstream system
                # If it doesn't already have the prefix, add it
                if not img_base64.startswith("data:image/"):
                    # Assume it's a PNG if we can't determine the type
                    img_base64 = f"data:image/png;base64,{img_base64}"
                # Add both potential formats to the map
                image_map[f"{img_id}.jpeg"] = img_base64
                image_map[f"{img_id}.png"] = img_base64
                image_map[img_id] = img_base64
        # Use regex to find all image references in the markdown content
        # This will match patterns like ![any-text](any-filename.extension)
        image_pattern = r'!\[(.*?)\]\((.*?)\)'

        def replace_image_with_base64(match):
            alt_text = match.group(1)
            img_filename = match.group(2)
            # Extract just the filename without path
            img_filename_only = os.path.basename(img_filename)
            # Check if we have base64 data for this image
            if img_filename_only in image_map:
                return f"![]({image_map[img_filename_only]})"
            else:
                # If we don't have base64 data, keep the original reference
                logger.warning(f"No base64 data found for image: {img_filename_only}")
                return match.group(0)

        # Replace all image references with base64 data
        markdown_content = re.sub(image_pattern, replace_image_with_base64, markdown_content)
        # Clean up the uploaded file from Mistral's servers
        try:
            logger.info(f"Deleting uploaded file from Mistral servers: {uploaded_file.id}")
            mistral_client.files.delete(file_id=uploaded_file.id)
        except Exception as e:
            logger.warning(f"Failed to delete uploaded file: {e}")
        end_time = time.time()
        duration = end_time - start_time
        print(file.filename + " Total time:", duration)
        # Return with format matching client expectations
        return {
            "pages": total_pages,
            "markdown": markdown_content,
            "duration": duration  # Keep this for logging purposes
        }
    except Exception as e:
        logger.exception(e)
        return {
            "pages": 0,
            "markdown": "",
            "error": f"Internal server error: {str(e)}"
        }
    finally:
        if temp_file_path and os.path.exists(temp_file_path):
            os.remove(temp_file_path)
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=7231)

View File

@@ -0,0 +1,8 @@
fastapi==0.115.5
uvicorn==0.32.1
mistralai>=1.5.0
PyMuPDF==1.24.14
python-multipart==0.0.18
python-dotenv==1.0.1
loguru==0.7.2
requests==2.32.3

View File

@@ -18,7 +18,7 @@ const QuoteList = React.memo(function QuoteList({
 rawSearch: SearchDataResponseItemType[];
 }) {
 const theme = useTheme();
-const { chatId, appId, outLinkAuthData } = useChatStore();
+const { appId, outLinkAuthData } = useChatStore();
 const RawSourceBoxProps = useContextSelector(ChatBoxContext, (v) => ({
 chatItemDataId,
@@ -39,10 +39,11 @@ const QuoteList = React.memo(function QuoteList({
 collectionIdList: [...new Set(rawSearch.map((item) => item.collectionId))],
 chatItemDataId,
 appId,
-chatId,
+chatId: RawSourceBoxProps.chatId,
 ...outLinkAuthData
 }),
 {
+refreshDeps: [rawSearch, RawSourceBoxProps.chatId],
 manual: false
 }
 );

View File

@@ -3,7 +3,7 @@ import { ChatHistoryItemResType, ChatItemType } from '@fastgpt/global/core/chat/
 import { SearchDataResponseItemType } from '@fastgpt/global/core/dataset/type';
 import { FlowNodeTypeEnum } from '@fastgpt/global/core/workflow/node/constant';
-const isLLMNode = (item: ChatHistoryItemResType) =>
+export const isLLMNode = (item: ChatHistoryItemResType) =>
 item.moduleType === FlowNodeTypeEnum.chatNode || item.moduleType === FlowNodeTypeEnum.tools;
 export function transformPreviewHistories(

View File

@@ -0,0 +1,191 @@
import { describe, expect, it } from 'vitest';
import { ChatRoleEnum } from '@fastgpt/global/core/chat/constants';
import { FlowNodeTypeEnum } from '@fastgpt/global/core/workflow/node/constant';
import { ChatHistoryItemResType, ChatItemType } from '@fastgpt/global/core/chat/type';
import {
transformPreviewHistories,
addStatisticalDataToHistoryItem
} from '@/global/core/chat/utils';
describe('transformPreviewHistories', () => {
it('should transform histories correctly with responseDetail=true', () => {
const histories: ChatItemType[] = [
{
obj: ChatRoleEnum.AI,
value: 'test response',
responseData: [
{
moduleType: FlowNodeTypeEnum.chatNode,
runningTime: 1.5
}
]
}
];
const result = transformPreviewHistories(histories, true);
expect(result[0]).toEqual({
obj: ChatRoleEnum.AI,
value: 'test response',
responseData: undefined,
llmModuleAccount: 1,
totalQuoteList: [],
totalRunningTime: 1.5,
historyPreviewLength: undefined
});
});
it('should transform histories correctly with responseDetail=false', () => {
const histories: ChatItemType[] = [
{
obj: ChatRoleEnum.AI,
value: 'test response',
responseData: [
{
moduleType: FlowNodeTypeEnum.chatNode,
runningTime: 1.5
}
]
}
];
const result = transformPreviewHistories(histories, false);
expect(result[0]).toEqual({
obj: ChatRoleEnum.AI,
value: 'test response',
responseData: undefined,
llmModuleAccount: 1,
totalQuoteList: undefined,
totalRunningTime: 1.5,
historyPreviewLength: undefined
});
});
});
describe('addStatisticalDataToHistoryItem', () => {
it('should return original item if obj is not AI', () => {
const item: ChatItemType = {
obj: ChatRoleEnum.Human,
value: 'test'
};
expect(addStatisticalDataToHistoryItem(item)).toBe(item);
});
it('should return original item if totalQuoteList is already defined', () => {
const item: ChatItemType = {
obj: ChatRoleEnum.AI,
value: 'test',
totalQuoteList: []
};
expect(addStatisticalDataToHistoryItem(item)).toBe(item);
});
it('should return original item if responseData is undefined', () => {
const item: ChatItemType = {
obj: ChatRoleEnum.AI,
value: 'test'
};
expect(addStatisticalDataToHistoryItem(item)).toBe(item);
});
it('should calculate statistics correctly', () => {
const item: ChatItemType = {
obj: ChatRoleEnum.AI,
value: 'test',
responseData: [
{
moduleType: FlowNodeTypeEnum.chatNode,
runningTime: 1.5,
historyPreview: ['preview1']
},
{
moduleType: FlowNodeTypeEnum.datasetSearchNode,
quoteList: [{ id: '1', q: 'test', a: 'answer' }],
runningTime: 0.5
},
{
moduleType: FlowNodeTypeEnum.tools,
runningTime: 1,
toolDetail: [
{
moduleType: FlowNodeTypeEnum.chatNode,
runningTime: 0.5
}
]
}
]
};
const result = addStatisticalDataToHistoryItem(item);
expect(result).toEqual({
...item,
llmModuleAccount: 3,
totalQuoteList: [{ id: '1', q: 'test', a: 'answer' }],
totalRunningTime: 3,
historyPreviewLength: 1
});
});
it('should handle empty arrays and undefined values', () => {
const item: ChatItemType = {
obj: ChatRoleEnum.AI,
value: 'test',
responseData: [
{
moduleType: FlowNodeTypeEnum.chatNode,
runningTime: 0
}
]
};
const result = addStatisticalDataToHistoryItem(item);
expect(result).toEqual({
...item,
llmModuleAccount: 1,
totalQuoteList: [],
totalRunningTime: 0,
historyPreviewLength: undefined
});
});
it('should handle nested plugin and loop details', () => {
const item: ChatItemType = {
obj: ChatRoleEnum.AI,
value: 'test',
responseData: [
{
moduleType: FlowNodeTypeEnum.chatNode,
runningTime: 1,
pluginDetail: [
{
moduleType: FlowNodeTypeEnum.chatNode,
runningTime: 0.5
}
],
loopDetail: [
{
moduleType: FlowNodeTypeEnum.tools,
runningTime: 0.3
}
]
}
]
};
const result = addStatisticalDataToHistoryItem(item);
expect(result).toEqual({
...item,
llmModuleAccount: 3,
totalQuoteList: [],
totalRunningTime: 1,
historyPreviewLength: undefined
});
});
});

View File

@@ -0,0 +1,59 @@
import { describe, it, expect } from 'vitest';
import { authType2UsageSource } from '@/service/support/wallet/usage/utils';
import { AuthUserTypeEnum } from '@fastgpt/global/support/permission/constant';
import { UsageSourceEnum } from '@fastgpt/global/support/wallet/usage/constants';
describe('authType2UsageSource', () => {
it('should return source if provided', () => {
const result = authType2UsageSource({
authType: AuthUserTypeEnum.apikey,
shareId: 'share123',
source: UsageSourceEnum.api
});
expect(result).toBe(UsageSourceEnum.api);
});
it('should return shareLink if shareId is provided', () => {
const result = authType2UsageSource({
authType: AuthUserTypeEnum.apikey,
shareId: 'share123'
});
expect(result).toBe(UsageSourceEnum.shareLink);
});
it('should return api if authType is apikey', () => {
const result = authType2UsageSource({
authType: AuthUserTypeEnum.apikey
});
expect(result).toBe(UsageSourceEnum.api);
});
it('should return fastgpt as default', () => {
const result = authType2UsageSource({});
expect(result).toBe(UsageSourceEnum.fastgpt);
});
it('should return fastgpt for non-apikey authType', () => {
const result = authType2UsageSource({
authType: AuthUserTypeEnum.owner
});
expect(result).toBe(UsageSourceEnum.fastgpt);
});
it('should prioritize source over shareId and authType', () => {
const result = authType2UsageSource({
source: UsageSourceEnum.api,
shareId: 'share123',
authType: AuthUserTypeEnum.apikey
});
expect(result).toBe(UsageSourceEnum.api);
});
it('should prioritize shareId over authType', () => {
const result = authType2UsageSource({
shareId: 'share123',
authType: AuthUserTypeEnum.apikey
});
expect(result).toBe(UsageSourceEnum.shareLink);
});
});

View File

@@ -0,0 +1,237 @@
import { vi, describe, it, expect } from 'vitest';
import type {
  FlowNodeTemplateType,
  StoreNodeItemType
} from '@fastgpt/global/core/workflow/type/node';
import type { Node, Edge } from 'reactflow';
import {
  FlowNodeTypeEnum,
  FlowNodeInputTypeEnum,
  FlowNodeOutputTypeEnum,
  EDGE_TYPE
} from '@fastgpt/global/core/workflow/node/constant';
import {
  WorkflowIOValueTypeEnum,
  NodeInputKeyEnum,
  NodeOutputKeyEnum
} from '@fastgpt/global/core/workflow/constants';
import {
  nodeTemplate2FlowNode,
  storeNode2FlowNode,
  storeEdgesRenderEdge,
  computedNodeInputReference,
  getRefData,
  filterWorkflowNodeOutputsByType,
  checkWorkflowNodeAndConnection,
  getLatestNodeTemplate
} from '@/web/core/workflow/utils';
describe('workflow utils', () => {
  describe('nodeTemplate2FlowNode', () => {
    it('should convert template to flow node', () => {
      const template: FlowNodeTemplateType = {
        name: 'Test Node',
        flowNodeType: FlowNodeTypeEnum.userInput,
        inputs: [],
        outputs: []
      };
      const result = nodeTemplate2FlowNode({
        template,
        position: { x: 100, y: 100 },
        selected: true,
        parentNodeId: 'parent1',
        t: (key) => key
      });
      expect(result).toMatchObject({
        type: FlowNodeTypeEnum.userInput,
        position: { x: 100, y: 100 },
        selected: true,
        data: {
          name: 'Test Node',
          flowNodeType: FlowNodeTypeEnum.userInput,
          parentNodeId: 'parent1'
        }
      });
      expect(result.id).toBeDefined();
    });
  });
  describe('storeNode2FlowNode', () => {
    it('should convert store node to flow node', () => {
      const storeNode: StoreNodeItemType = {
        nodeId: 'node1',
        flowNodeType: FlowNodeTypeEnum.userInput,
        position: { x: 100, y: 100 },
        inputs: [],
        outputs: [],
        name: 'Test Node',
        version: '1.0'
      };
      const result = storeNode2FlowNode({
        item: storeNode,
        selected: true,
        t: (key) => key
      });
      expect(result).toMatchObject({
        id: 'node1',
        type: FlowNodeTypeEnum.userInput,
        position: { x: 100, y: 100 },
        selected: true
      });
    });
    it('should handle dynamic inputs and outputs', () => {
      const storeNode: StoreNodeItemType = {
        nodeId: 'node1',
        flowNodeType: FlowNodeTypeEnum.userInput,
        position: { x: 0, y: 0 },
        inputs: [
          {
            key: 'dynamicInput',
            renderTypeList: [FlowNodeInputTypeEnum.addInputParam]
          }
        ],
        outputs: [
          {
            key: 'dynamicOutput',
            type: FlowNodeOutputTypeEnum.dynamic
          }
        ],
        name: 'Test Node',
        version: '1.0'
      };
      const result = storeNode2FlowNode({
        item: storeNode,
        t: (key) => key
      });
      expect(result.data.inputs).toHaveLength(1);
      expect(result.data.outputs).toHaveLength(1);
    });
  });
  describe('filterWorkflowNodeOutputsByType', () => {
    it('should filter outputs by type', () => {
      const outputs = [
        { id: '1', valueType: WorkflowIOValueTypeEnum.string },
        { id: '2', valueType: WorkflowIOValueTypeEnum.number },
        { id: '3', valueType: WorkflowIOValueTypeEnum.boolean }
      ];
      const result = filterWorkflowNodeOutputsByType(outputs, WorkflowIOValueTypeEnum.string);
      expect(result).toHaveLength(1);
      expect(result[0].id).toBe('1');
    });
    it('should return all outputs for any type', () => {
      const outputs = [
        { id: '1', valueType: WorkflowIOValueTypeEnum.string },
        { id: '2', valueType: WorkflowIOValueTypeEnum.number }
      ];
      const result = filterWorkflowNodeOutputsByType(outputs, WorkflowIOValueTypeEnum.any);
      expect(result).toHaveLength(2);
    });
    it('should handle array types correctly', () => {
      const outputs = [
        { id: '1', valueType: WorkflowIOValueTypeEnum.string },
        { id: '2', valueType: WorkflowIOValueTypeEnum.arrayString }
      ];
      const result = filterWorkflowNodeOutputsByType(outputs, WorkflowIOValueTypeEnum.arrayString);
      expect(result).toHaveLength(2);
    });
  });
  describe('checkWorkflowNodeAndConnection', () => {
    it('should validate nodes and connections', () => {
      const nodes: Node[] = [
        {
          id: 'node1',
          type: FlowNodeTypeEnum.userInput,
          data: {
            nodeId: 'node1',
            flowNodeType: FlowNodeTypeEnum.userInput,
            inputs: [
              {
                key: NodeInputKeyEnum.userInput,
                required: true,
                value: undefined,
                renderTypeList: [FlowNodeInputTypeEnum.input]
              }
            ],
            outputs: []
          },
          position: { x: 0, y: 0 }
        }
      ];
      const edges: Edge[] = [
        {
          id: 'edge1',
          source: 'node1',
          target: 'node2',
          type: EDGE_TYPE
        }
      ];
      const result = checkWorkflowNodeAndConnection({ nodes, edges });
      expect(result).toEqual(['node1']);
    });
    it('should handle empty nodes and edges', () => {
      const result = checkWorkflowNodeAndConnection({ nodes: [], edges: [] });
      expect(result).toBeUndefined();
    });
  });
  describe('getLatestNodeTemplate', () => {
    it('should update node to latest template version', () => {
      const node = {
        nodeId: 'node1',
        flowNodeType: FlowNodeTypeEnum.userInput,
        inputs: [{ key: 'input1', value: 'test' }],
        outputs: [{ key: 'output1', value: 'test' }],
        name: 'Old Name',
        intro: 'Old Intro'
      };
      const template = {
        flowNodeType: FlowNodeTypeEnum.userInput,
        inputs: [{ key: 'input1' }, { key: 'input2' }],
        outputs: [{ key: 'output1' }, { key: 'output2' }]
      };
      const result = getLatestNodeTemplate(node, template);
      expect(result.inputs).toHaveLength(2);
      expect(result.outputs).toHaveLength(2);
      expect(result.name).toBe('Old Name');
    });
    it('should preserve existing values when updating template', () => {
      const node = {
        nodeId: 'node1',
        flowNodeType: FlowNodeTypeEnum.userInput,
        inputs: [{ key: 'input1', value: 'existingValue' }],
        outputs: [{ key: 'output1', value: 'existingOutput' }],
        name: 'Node Name',
        intro: 'Node Intro'
      };
      const template = {
        flowNodeType: FlowNodeTypeEnum.userInput,
        inputs: [{ key: 'input1', value: 'newValue' }],
        outputs: [{ key: 'output1', value: 'newOutput' }]
      };
      const result = getLatestNodeTemplate(node, template);
      expect(result.inputs[0].value).toBe('existingValue');
      expect(result.outputs[0].value).toBe('existingOutput');
    });
  });
});
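The `filterWorkflowNodeOutputsByType` cases above imply three rules: an exact type match is kept, `any` keeps everything, and an array type also accepts its element type. A minimal sketch of that behavior follows. It is inferred from the test expectations only and is not the FastGPT source; `SketchValueType`, `elementTypeOf`, and `filterOutputsByType` are hypothetical names, and the real enum has more members than shown here.

```typescript
// Hypothetical subset of WorkflowIOValueTypeEnum; values are assumptions.
const SketchValueType = {
  any: 'any',
  string: 'string',
  number: 'number',
  boolean: 'boolean',
  arrayString: 'arrayString'
} as const;
type SketchValueType = (typeof SketchValueType)[keyof typeof SketchValueType];

type OutputLike = { id: string; valueType?: SketchValueType };

// Assumed mapping from an array type to the element type it also accepts.
const elementTypeOf: Partial<Record<SketchValueType, SketchValueType>> = {
  [SketchValueType.arrayString]: SketchValueType.string
};

function filterOutputsByType(outputs: OutputLike[], wanted: SketchValueType): OutputLike[] {
  // `any` places no restriction on the outputs.
  if (wanted === SketchValueType.any) return outputs;

  // Accept the requested type, plus its element type when it is an array type.
  const accepted = new Set<SketchValueType>([wanted]);
  const element = elementTypeOf[wanted];
  if (element) accepted.add(element);

  // Outputs without a valueType are dropped in this sketch (an assumption).
  return outputs.filter((o) => o.valueType !== undefined && accepted.has(o.valueType));
}
```

Collecting the accepted types in a set keeps the array-type rule in one lookup table instead of scattering per-type branches through the filter.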