Compare commits
8 Commits
v4.6.6
...
v4.6.7-alp
| Author | SHA1 | Date | |
|---|---|---|---|
| | 5e2adb22f0 | | |
| | c031e6dcc9 | | |
| | 8ee7407c4c | | |
| | 006ad17c6a | | |
| | 414b693303 | | |
| | dfa6586e5e | | |
| | 5968bfeb12 | | |
| | 5876a47da6 | | |
5
.gitignore
vendored
@@ -36,4 +36,7 @@ dist/
|
||||
docSite/public/
|
||||
docSite/resources/_gen/
|
||||
docSite/.vercel
|
||||
*.local.*
|
||||
*.local.*
|
||||
|
||||
|
||||
.idea/
|
||||
|
||||
17
dev.md
Normal file
@@ -0,0 +1,17 @@
|
||||
# Build commands
|
||||
|
||||
```sh
|
||||
# Build image, not proxy
|
||||
docker build -t registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.4.7 --build-arg name=app .
|
||||
|
||||
# build image with proxy
|
||||
docker build -t registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.4.7 --build-arg name=app --build-arg proxy=taobao .
|
||||
```
|
||||
|
||||
# Common Pg indexes
|
||||
|
||||
```sql
|
||||
CREATE INDEX IF NOT EXISTS modelData_dataset_id_index ON modeldata (dataset_id);
|
||||
CREATE INDEX IF NOT EXISTS modelData_collection_id_index ON modeldata (collection_id);
|
||||
CREATE INDEX IF NOT EXISTS modelData_teamId_index ON modeldata (team_id);
|
||||
```
|
||||
@@ -44,15 +44,19 @@ FastGPT commercial edition has 3 pricing models, depending on the deployment method.
|
||||
**Exclusive services**
|
||||
|
||||
{{< table "table-hover table-striped-columns" >}}
|
||||
| Deployment | Exclusive services | Time to go live | Price |
|
||||
| Deployment | Exclusive services | Time to go live | Standard product price |
|
||||
| ---- | ---- | ---- | ---- |
|
||||
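| Sealos fully managed | 1. Free upgrades during the validity period.<br>2. No need to operate the service or database yourself. | Half a day | From 3,000 CNY/month (3-month minimum)<br>or<br>from 30,000 CNY/year |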
|
||||
| Own server, single-node edition | 1. Upgrade service for 6 versions. | Within 14 days | 60,000 CNY per license (no time limit) |
|
||||
| Own server, Sealos edition | 1. Upgrade service for 6 versions. | Within 14 days | 150,000 CNY per license (no time limit) |
|
||||
| Own server, high-availability edition | 1. Upgrade service for 6 versions. | Within 14 days | 150,000 CNY per license (no time limit) |
|
||||
{{< /table >}}
|
||||
|
||||
{{% alert icon="🤖 " context="success" %}}
|
||||
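The 6-version upgrade service does not mean you can only use 6 versions; it refers to upgrades performed by the FastGPT team. In most cases we recommend upgrading yourself, which is not difficult.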
|
||||
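- The 6-version upgrade service does not mean you can only use 6 versions; it refers to upgrades performed by the FastGPT team. In most cases we recommend upgrading yourself, which is not difficult.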
|
||||
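- The fully managed edition suits teams short on technical staff: you focus on the business and do not need to worry about whether the service is running properly.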
|
||||
- The single-node and high-availability editions can be deployed entirely on your own servers.
|
||||
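- The single-node edition suits small and medium teams serving internal users; you maintain database backups and similar tasks yourself.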
|
||||
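- The high-availability edition suits public-facing online services and includes production infrastructure such as visual monitoring, multiple replicas, load balancing, and automatic database backups.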
|
||||
{{% /alert %}}
|
||||
|
||||
|
||||
|
||||
@@ -29,7 +29,7 @@ weight: 106
|
||||
|
||||
### Full-text search
|
||||
|
||||
Uses the traditional full-text search method. Suitable for finding key subjects, predicates, and similar terms.
|
||||
Uses the traditional full-text search method. Suitable for finding key subjects, predicates, and similar terms.
|
||||
|
||||
### Hybrid search
|
||||
|
||||
@@ -55,4 +55,4 @@ FastGPT 会使用 `RRF` 对重排结果、向量搜索结果、全文检索结
|
||||
|
||||
A value in the range `0-1`; search results with low relevance are filtered out.
|
||||
|
||||
This value only takes effect with `semantic search` or when `result reranking` is used.
|
||||
This value only takes effect with `semantic search` or when `result reranking` is used.
|
||||
|
||||
@@ -277,7 +277,7 @@ weight: 708
|
||||
"maxContext": 1600,
|
||||
"maxResponse": 4000,
|
||||
"inputPrice": 0,
|
||||
"outputPrice": 0,
|
||||
"outputPrice": 0
|
||||
}
|
||||
],
|
||||
"vectorModels": [ // 向量模型
|
||||
|
||||
@@ -28,7 +28,7 @@ weight: 910
|
||||
### Deploy from source
|
||||
|
||||
1. Set up the environment according to the requirements above; for a detailed walkthrough, ask GPT;
|
||||
2. Download the [python file](app.py)
|
||||
2. Download the [python file](https://github.com/labring/FastGPT/tree/main/python/reranker/bge-reranker-base)
|
||||
3. Run `pip install -r requirments.txt` on the command line;
|
||||
4. Following [https://huggingface.co/BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base), download the model repository into the same directory as app.py
|
||||
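5. Set the environment variable `export ACCESS_TOKEN=XXXXXX` to configure the token. The token only adds a layer of verification so that the endpoint cannot be used by others; its default value is `ACCESS_TOKEN`;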
|
||||
|
||||
@@ -17,6 +17,11 @@ weight: 707
|
||||
| 5 million vectors | 8c32g | 16c64g 200GB |
|
||||
{{< /table >}}
|
||||
|
||||
## Deployment architecture diagram
|
||||
|
||||

|
||||
|
||||
|
||||
### 1. Prepare a proxy environment (can be skipped for servers outside China)
|
||||
|
||||
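Make sure OpenAI is reachable; for specific options see [proxy solutions](/docs/development/proxy/). Alternatively, [deploy OneAPI](/docs/development/one-api) directly on Sealos, which solves the proxy problem and also enables multi-key rotation and access to other large models.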
|
||||
|
||||
@@ -62,7 +62,7 @@ git clone git@github.com:<github_username>/FastGPT.git
|
||||
|
||||
**Note: the json config file must not contain comments; the comments shown here are added only for readability**
|
||||
|
||||
This file rarely needs changes. You only need to pay attention to the parameters in SystemParams:
|
||||
This file rarely needs changes. You only need to pay attention to the parameters in `systemEnv`:
|
||||
|
||||
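- `vectorMaxProcess`: maximum number of vector-generation processes, determined by the concurrency of the database and the keys; typically, for a single 120 account on a 2c4g server, set 10~15.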
|
||||
- `qaMaxProcess`: maximum number of QA-generation processes
|
||||
|
||||
File diff suppressed because it is too large
@@ -19,13 +19,17 @@ images: []
|
||||
|
||||
## General questions
|
||||
|
||||
### Can it run fully locally?
|
||||
|
||||
Yes. You need to prepare a vector model and an LLM.
|
||||
|
||||
### insufficient_user_quota user quota is not enough
|
||||
|
||||
The OneAPI account balance is insufficient. The default root user only has 200 USD, which can be changed manually.
|
||||
|
||||
### Channel xxx not found
|
||||
|
||||
The channel for this model is not configured in OneAPI.
|
||||
The channel for this model is not configured in OneAPI. Or some of the models in the config file were changed, but not all of them.
|
||||
|
||||
### Replies work in the web page, but the API returns an error
|
||||
|
||||
@@ -35,6 +39,15 @@ The channel for this model is not configured in OneAPI.
|
||||
|
||||
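The OneAPI API key is misconfigured. Update the `OPENAI_API_KEY` environment variable and restart the container (stop it, rm it, then run up -d again). You can `exec` into the container and run `env` to check whether the variable has taken effect.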
|
||||
|
||||
### Other models cannot do question classification / content extraction
|
||||
|
||||
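Set `toolChoice=false` for those models so that they fall back to prompt mode. The built-in prompts have so far only been tested against commercial model APIs; most commercial models, domestic or international, work.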
|
||||
|
||||
### Page crashes
|
||||
|
||||
1. Turn off page translation
|
||||
2. Check whether the config file loaded correctly. If it did not, system information will be missing, and some operations will hit a null pointer.
|
||||
|
||||
## Common Docker deployment issues
|
||||
|
||||
### How do I update?
|
||||
@@ -96,7 +109,7 @@ mongo connection failed, check
|
||||
|
||||
### TypeError: Cannot read properties of null (reading 'useMemo' )
|
||||
|
||||
Try Node18; the latest Node may have issues. Local development workflow:
|
||||
Delete all `node_modules` and reinstall with Node18; the latest Node may have issues. Local development workflow:
|
||||
|
||||
1. In the root directory: `pnpm i`
|
||||
2. Copy `config.json` -> `config.local.json`
|
||||
|
||||
33
docSite/content/docs/development/upgrading/467.md
Normal file
@@ -0,0 +1,33 @@
|
||||
---
|
||||
title: 'V4.6.7 (initialization required)'
|
||||
description: 'FastGPT V4.6.7'
|
||||
icon: 'upgrade'
|
||||
draft: false
|
||||
toc: true
|
||||
weight: 829
|
||||
---
|
||||
|
||||
## 1. Run the initialization API
|
||||
|
||||
Send 1 HTTP request (replace {{rootkey}} with the `rootkey` from your environment variables, and {{host}} with your own domain)
|
||||
|
||||
1. https://xxxxx/api/admin/initv467
|
||||
|
||||
```bash
|
||||
curl --location --request POST 'https://{{host}}/api/admin/initv467' \
|
||||
--header 'rootkey: {{rootkey}}' \
|
||||
--header 'Content-Type: application/json'
|
||||
```
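The same request can also be issued from a script. The sketch below only builds the request; the endpoint path and the `rootkey` header are taken from the curl command above, while `buildInitRequest`, `InitRequest`, and the argument names are hypothetical and exist only for illustration.

```typescript
// Hypothetical helper: assembles the URL and options for the initv467 call.
// Only the path and the two headers come from the curl command; everything
// else (names, types) is an assumption made for this sketch.
type InitRequest = {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
};

function buildInitRequest(host: string, rootkey: string): InitRequest {
  return {
    url: `https://${host}/api/admin/initv467`,
    method: 'POST',
    headers: {
      rootkey,
      'Content-Type': 'application/json'
    }
  };
}

// Usage (not executed here): hand the pieces to fetch.
// const req = buildInitRequest('your-domain.example', 'your-root-key');
// await fetch(req.url, { method: req.method, headers: req.headers });
```

Keeping the request construction separate from the network call makes it easy to check the URL and headers before anything is sent.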
|
||||
|
||||
Initialization notes:
|
||||
1. Re-associates images with their datasets (skipping initialization is not a big problem, but it may leave permanent dirty data behind)
|
||||
|
||||
|
||||
## V4.6.7 release notes
|
||||
|
||||
1. Reworked the knowledge base UI and introduced a new import flow.
|
||||
2. Optimized data indexes for knowledge bases and chats.
|
||||
3. Knowledge base openAPI: knowledge bases can now be operated through the API. (Documentation to be added.)
|
||||
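4. New - variable hints in input boxes. Typing { shows the available variables. Based on community feedback on advanced orchestration, we plan to improve variables in the February release, supporting module-local variables and writing to more global variables.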
|
||||
5. Fix - chatId conflicts during API chats.
|
||||
6. Fix - possible window.onLoad conflicts when embedding in a web page via Iframe.
|
||||
@@ -25,7 +25,9 @@ FastGPT 采用了 RAG 中的 Embedding 方案构建知识库,要使用好 Fast
|
||||
|
||||
FastGPT uses the `PG Vector` extension of `PostgreSQL` as its vector retriever, with an `HNSW` index. `PostgreSQL` is used only for vector retrieval; `MongoDB` stores and serves all other data.
|
||||
|
||||
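The `PostgreSQL` table has an `index` field that stores the vector and a `data_id` used to look up the corresponding record in `MongoDB`. Multiple `index` entries can map to one `data_id`, that is, one set of vectors can correspond to multiple sets of data. During retrieval, identical data is merged.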
|
||||
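The `dataset.datas` collection in `MongoDB` stores the raw data behind each vector, along with an `indexes` field that records the IDs of its vectors. This field is an array, that is, one set of vectors can correspond to multiple sets of data.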
|
||||
|
||||
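The `PostgreSQL` table has an `index` field that stores the vector. At retrieval time, vectors are recalled first, then the vector IDs are used to look up the raw data in `MongoDB`. Vectors that map to the same raw data are merged, and the highest vector score is kept.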
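The merge step described above can be sketched roughly as follows. This is a minimal illustration, not FastGPT's actual implementation: the `VectorHit` and `MergedHit` shapes, their field names, and `mergeHitsByData` are all assumptions made for the example.

```typescript
// A recalled vector: its ID, the raw-data record it maps to, and its score.
// Field names are illustrative and do not mirror the real schema.
type VectorHit = { vectorId: string; dataId: string; score: number };
type MergedHit = { dataId: string; vectorIds: string[]; score: number };

function mergeHitsByData(hits: VectorHit[]): MergedHit[] {
  // Lookup table keyed by raw-data ID (no prototype, so any key is safe).
  const byId: Record<string, MergedHit | undefined> = Object.create(null);
  const result: MergedHit[] = [];

  for (const hit of hits) {
    const existing = byId[hit.dataId];
    if (existing) {
      // Same raw data recalled again: keep one entry, take the best score.
      existing.vectorIds.push(hit.vectorId);
      existing.score = Math.max(existing.score, hit.score);
    } else {
      const entry: MergedHit = {
        dataId: hit.dataId,
        vectorIds: [hit.vectorId],
        score: hit.score
      };
      byId[hit.dataId] = entry;
      result.push(entry);
    }
  }

  // Highest score first.
  return result.sort((a, b) => b.score - a.score);
}
```

With hits for data `a` at 0.5 and 0.9 and for data `b` at 0.7, the result holds two entries: `a` with score 0.9 first, then `b`.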
|
||||
|
||||

|
||||
|
||||
|
||||
Binary file not shown.
Binary file not shown.
@@ -1,43 +0,0 @@
|
||||
mixed-port: 7890
|
||||
allow-lan: false
|
||||
bind-address: '*'
|
||||
mode: rule
|
||||
log-level: warning
|
||||
dns:
|
||||
enable: true
|
||||
ipv6: false
|
||||
nameserver:
|
||||
- 8.8.8.8
|
||||
- 8.8.4.4
|
||||
cache-size: 400
|
||||
proxies:
|
||||
|
||||
proxy-groups:
|
||||
- {
|
||||
name: '♻️ 自动选择',
|
||||
type: url-test,
|
||||
proxies:
|
||||
[
|
||||
香港V02×1.5,
|
||||
ABC,
|
||||
印度01,
|
||||
台湾03,
|
||||
新加坡02,
|
||||
新加坡03,
|
||||
日本01,
|
||||
日本02,
|
||||
新加坡01,
|
||||
美国01,
|
||||
美国02,
|
||||
台湾01,
|
||||
台湾02
|
||||
],
|
||||
url: 'https://api.openai.com',
|
||||
interval: 3600
|
||||
}
|
||||
rules:
|
||||
- 'DOMAIN-SUFFIX,google.com,♻️ 自动选择'
|
||||
- 'DOMAIN-SUFFIX,ai.fastgpt.in,♻️ 自动选择'
|
||||
- 'DOMAIN-SUFFIX,openai.com,♻️ 自动选择'
|
||||
- 'DOMAIN-SUFFIX,api.openai.com,♻️ 自动选择'
|
||||
- 'MATCH,DIRECT'
|
||||
@@ -1,18 +0,0 @@
|
||||
export ALL_PROXY=socks5://127.0.0.1:7891
|
||||
export http_proxy=http://127.0.0.1:7890
|
||||
export https_proxy=http://127.0.0.1:7890
|
||||
export HTTP_PROXY=http://127.0.0.1:7890
|
||||
export HTTPS_PROXY=http://127.0.0.1:7890
|
||||
|
||||
OLD_PROCESS=$(pgrep clash)
|
||||
if [ ! -z "$OLD_PROCESS" ]; then
|
||||
echo "Killing old process: $OLD_PROCESS"
|
||||
kill $OLD_PROCESS
|
||||
fi
|
||||
sleep 2
|
||||
|
||||
cd /root/fastgpt/clash/fast
|
||||
rm -f ./nohup.out || true
|
||||
rm -f ./cache.db || true
|
||||
nohup ./clash-linux-amd64-v3 -d ./ &
|
||||
echo "Restart clash fast"
|
||||
@@ -1,10 +0,0 @@
|
||||
export ALL_PROXY=''
|
||||
export http_proxy=''
|
||||
export https_proxy=''
|
||||
export HTTP_PROXY=''
|
||||
export HTTPS_PROXY=''
|
||||
OLD_PROCESS=$(pgrep clash)
|
||||
if [ ! -z "$OLD_PROCESS" ]; then
|
||||
echo "Killing old process: $OLD_PROCESS"
|
||||
kill $OLD_PROCESS
|
||||
fi
|
||||
11
packages/global/common/file/api.d.ts
vendored
@@ -1,9 +1,15 @@
|
||||
export type UploadImgProps = {
|
||||
base64Img: string;
|
||||
import { MongoImageTypeEnum } from './image/constants';
|
||||
|
||||
export type preUploadImgProps = {
|
||||
type: `${MongoImageTypeEnum}`;
|
||||
|
||||
expiredTime?: Date;
|
||||
metadata?: Record<string, any>;
|
||||
shareId?: string;
|
||||
};
|
||||
export type UploadImgProps = preUploadImgProps & {
|
||||
base64Img: string;
|
||||
};
|
||||
|
||||
export type UrlFetchParams = {
|
||||
urlList: string[];
|
||||
@@ -11,6 +17,7 @@ export type UrlFetchParams = {
|
||||
};
|
||||
export type UrlFetchResponse = {
|
||||
url: string;
|
||||
title: string;
|
||||
content: string;
|
||||
selector?: string;
|
||||
}[];
|
||||
|
||||
@@ -1,12 +1,14 @@
|
||||
export const fileImgs = [
|
||||
{ suffix: 'pdf', src: '/imgs/files/pdf.svg' },
|
||||
{ suffix: 'csv', src: '/imgs/files/csv.svg' },
|
||||
{ suffix: '(doc|docs)', src: '/imgs/files/doc.svg' },
|
||||
{ suffix: 'txt', src: '/imgs/files/txt.svg' },
|
||||
{ suffix: 'md', src: '/imgs/files/markdown.svg' }
|
||||
{ suffix: 'pdf', src: 'file/fill/pdf' },
|
||||
{ suffix: 'csv', src: 'file/fill/csv' },
|
||||
{ suffix: '(doc|docs)', src: 'file/fill/doc' },
|
||||
{ suffix: 'txt', src: 'file/fill/txt' },
|
||||
{ suffix: 'md', src: 'file/fill/markdown' },
|
||||
{ suffix: 'html', src: 'file/fill/html' }
|
||||
|
||||
// { suffix: '.', src: '/imgs/files/file.svg' }
|
||||
];
|
||||
|
||||
export function getFileIcon(name = '', defaultImg = '/imgs/files/file.svg') {
|
||||
export function getFileIcon(name = '', defaultImg = 'file/fill/file') {
|
||||
return fileImgs.find((item) => new RegExp(item.suffix, 'gi').test(name))?.src || defaultImg;
|
||||
}
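Because `getFileIcon` tests each `suffix` as an unanchored pattern, a name such as `pdf-notes.txt` matches the `pdf` entry before `txt` is ever tried. The sketch below shows one possible stricter variant that only looks at the trailing extension; `getFileIconByExtension` is a hypothetical name, and the icon table is copied from the added lines above.

```typescript
// Icon table as introduced by this diff.
const fileImgs = [
  { suffix: 'pdf', src: 'file/fill/pdf' },
  { suffix: 'csv', src: 'file/fill/csv' },
  { suffix: '(doc|docs)', src: 'file/fill/doc' },
  { suffix: 'txt', src: 'file/fill/txt' },
  { suffix: 'md', src: 'file/fill/markdown' },
  { suffix: 'html', src: 'file/fill/html' }
];

// Hypothetical variant: the pattern is anchored to ".<suffix>" at the end
// of the name, so text earlier in the file name cannot trigger a match.
function getFileIconByExtension(name = '', defaultImg = 'file/fill/file'): string {
  for (const item of fileImgs) {
    const extensionPattern = new RegExp(`\\.${item.suffix}$`, 'i');
    if (extensionPattern.test(name)) return item.src;
  }
  return defaultImg;
}
```

Dropping the `g` flag also avoids the `lastIndex` state that global regular expressions carry between `test` calls.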
|
||||
|
||||
52
packages/global/common/file/image/constants.ts
Normal file
@@ -0,0 +1,52 @@
|
||||
export const imageBaseUrl = '/api/system/img/';
|
||||
|
||||
export enum MongoImageTypeEnum {
|
||||
systemAvatar = 'systemAvatar',
|
||||
appAvatar = 'appAvatar',
|
||||
pluginAvatar = 'pluginAvatar',
|
||||
datasetAvatar = 'datasetAvatar',
|
||||
userAvatar = 'userAvatar',
|
||||
teamAvatar = 'teamAvatar',
|
||||
|
||||
chatImage = 'chatImage',
|
||||
collectionImage = 'collectionImage'
|
||||
}
|
||||
export const mongoImageTypeMap = {
|
||||
[MongoImageTypeEnum.systemAvatar]: {
|
||||
label: 'common.file.type.appAvatar',
|
||||
unique: true
|
||||
},
|
||||
[MongoImageTypeEnum.appAvatar]: {
|
||||
label: 'common.file.type.appAvatar',
|
||||
unique: true
|
||||
},
|
||||
[MongoImageTypeEnum.pluginAvatar]: {
|
||||
label: 'common.file.type.pluginAvatar',
|
||||
unique: true
|
||||
},
|
||||
[MongoImageTypeEnum.datasetAvatar]: {
|
||||
label: 'common.file.type.datasetAvatar',
|
||||
unique: true
|
||||
},
|
||||
[MongoImageTypeEnum.userAvatar]: {
|
||||
label: 'common.file.type.userAvatar',
|
||||
unique: true
|
||||
},
|
||||
[MongoImageTypeEnum.teamAvatar]: {
|
||||
label: 'common.file.type.teamAvatar',
|
||||
unique: true
|
||||
},
|
||||
|
||||
[MongoImageTypeEnum.chatImage]: {
|
||||
label: 'common.file.type.chatImage',
|
||||
unique: false
|
||||
},
|
||||
[MongoImageTypeEnum.collectionImage]: {
|
||||
label: 'common.file.type.collectionImage',
|
||||
unique: false
|
||||
}
|
||||
};
|
||||
|
||||
export const uniqueImageTypeList = Object.entries(mongoImageTypeMap)
|
||||
.filter(([key, value]) => value.unique)
|
||||
.map(([key]) => key as `${MongoImageTypeEnum}`);
|
||||
14
packages/global/common/file/image/type.d.ts
vendored
Normal file
@@ -0,0 +1,14 @@
|
||||
import { MongoImageTypeEnum } from './constants';
|
||||
|
||||
export type MongoImageSchemaType = {
|
||||
_id: string;
|
||||
teamId: string;
|
||||
binary: Buffer;
|
||||
createTime: Date;
|
||||
expiredTime?: Date;
|
||||
type: `${MongoImageTypeEnum}`;
|
||||
|
||||
metadata?: {
|
||||
relatedId?: string; // This id is associated with a set of images
|
||||
};
|
||||
};
|
||||
10
packages/global/common/math/date.ts
Normal file
@@ -0,0 +1,10 @@
|
||||
// The number of days left in the month is calculated as 30 days per month, and less than 1 day is calculated as 1 day
|
||||
export const getMonthRemainingDays = () => {
|
||||
const now = new Date();
|
||||
const year = now.getFullYear();
|
||||
const month = now.getMonth();
|
||||
const date = now.getDate();
|
||||
const days = new Date(year, month + 1, 0).getDate();
|
||||
const remainingDays = days - date;
|
||||
return remainingDays + 1;
|
||||
};
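To make the arithmetic easy to check, here is the same calculation with the date passed in instead of read from the clock. The header comment mentions 30 days per month, while the body as written uses the calendar length of the current month; the sketch follows the body. `remainingDaysInMonth` is a hypothetical name used only for this illustration.

```typescript
// Same arithmetic as getMonthRemainingDays, with the date injected so the
// result is deterministic. The function name is an assumption.
function remainingDaysInMonth(now: Date): number {
  const year = now.getFullYear();
  const month = now.getMonth();
  // Day 0 of the next month is the last day of the current month.
  const daysInMonth = new Date(year, month + 1, 0).getDate();
  // Today counts as a full remaining day, hence the +1.
  return daysInMonth - now.getDate() + 1;
}

// Example inputs (local time):
//   31 Jan 2024 -> 1 day left (only today)
//   1 Feb 2024  -> 29 days left (leap year)
const lastDayOfJanuary = remainingDaysInMonth(new Date(2024, 0, 31));
const firstDayOfFebruary = remainingDaysInMonth(new Date(2024, 1, 1));
```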
|
||||
@@ -15,10 +15,10 @@ export const simpleMarkdownText = (rawText: string) => {
|
||||
return `[${cleanedLinkText}](${url})`;
|
||||
});
|
||||
|
||||
// replace special \.* ……
|
||||
const reg1 = /\\([-.!`_(){}\[\]])/g;
|
||||
// replace special #\.* ……
|
||||
const reg1 = /\\([#`!*()+\-_\[\]{}\\.])/g;
|
||||
if (reg1.test(rawText)) {
|
||||
rawText = rawText.replace(/\\([`!*()+-_\[\]{}\\.])/g, '$1');
|
||||
rawText = rawText.replace(reg1, '$1');
|
||||
}
|
||||
|
||||
// replace \\n
|
||||
@@ -45,14 +45,15 @@ export const uploadMarkdownBase64 = async ({
|
||||
uploadImgController
|
||||
}: {
|
||||
rawText: string;
|
||||
uploadImgController: (base64: string) => Promise<string>;
|
||||
uploadImgController?: (base64: string) => Promise<string>;
|
||||
}) => {
|
||||
// match base64, upload and replace it
|
||||
const base64Regex = /data:image\/.*;base64,([^\)]+)/g;
|
||||
const base64Arr = rawText.match(base64Regex) || [];
|
||||
// upload base64 and replace it
|
||||
await Promise.all(
|
||||
base64Arr.map(async (base64Img) => {
|
||||
if (uploadImgController) {
|
||||
// match base64, upload and replace it
|
||||
const base64Regex = /data:image\/.*;base64,([^\)]+)/g;
|
||||
const base64Arr = rawText.match(base64Regex) || [];
|
||||
|
||||
// upload base64 and replace it
|
||||
for await (const base64Img of base64Arr) {
|
||||
try {
|
||||
const str = await uploadImgController(base64Img);
|
||||
|
||||
@@ -61,8 +62,8 @@ export const uploadMarkdownBase64 = async ({
|
||||
rawText = rawText.replace(base64Img, '');
|
||||
rawText = rawText.replace(/!\[.*\]\(\)/g, '');
|
||||
}
|
||||
})
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
// Remove white space on both sides of the picture
|
||||
const trimReg = /(!\[.*\]\(.*\))\s*/g;
|
||||
@@ -70,5 +71,20 @@ export const uploadMarkdownBase64 = async ({
|
||||
rawText = rawText.replace(trimReg, '$1');
|
||||
}
|
||||
|
||||
return simpleMarkdownText(rawText);
|
||||
return rawText;
|
||||
};
|
||||
|
||||
export const markdownProcess = async ({
|
||||
rawText,
|
||||
uploadImgController
|
||||
}: {
|
||||
rawText: string;
|
||||
uploadImgController?: (base64: string) => Promise<string>;
|
||||
}) => {
|
||||
const imageProcess = await uploadMarkdownBase64({
|
||||
rawText,
|
||||
uploadImgController
|
||||
});
|
||||
|
||||
return simpleMarkdownText(imageProcess);
|
||||
};
|
||||
|
||||
@@ -13,13 +13,12 @@ export const splitText2Chunks = (props: {
|
||||
chunkLen: number;
|
||||
overlapRatio?: number;
|
||||
customReg?: string[];
|
||||
countTokens?: boolean;
|
||||
}): {
|
||||
chunks: string[];
|
||||
tokens: number;
|
||||
chars: number;
|
||||
overlapRatio?: number;
|
||||
} => {
|
||||
let { text = '', chunkLen, overlapRatio = 0.2, customReg = [], countTokens = true } = props;
|
||||
let { text = '', chunkLen, overlapRatio = 0.2, customReg = [] } = props;
|
||||
const splitMarker = 'SPLIT_HERE_SPLIT_HERE';
|
||||
const codeBlockMarker = 'CODE_BLOCK_LINE_MARKER';
|
||||
const overlapLen = Math.round(chunkLen * overlapRatio);
|
||||
@@ -240,13 +239,11 @@ export const splitText2Chunks = (props: {
|
||||
mdTitle: ''
|
||||
}).map((chunk) => chunk?.replaceAll(codeBlockMarker, '\n') || ''); // restore code block
|
||||
|
||||
const tokens = countTokens
|
||||
? chunks.reduce((sum, chunk) => sum + countPromptTokens(chunk, 'system'), 0)
|
||||
: 0;
|
||||
const chars = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
|
||||
|
||||
return {
|
||||
chunks,
|
||||
tokens
|
||||
chars
|
||||
};
|
||||
} catch (err) {
|
||||
throw new Error(getErrText(err));
|
||||
|
||||
@@ -33,6 +33,12 @@ export function countPromptTokens(
|
||||
) {
|
||||
const enc = getTikTokenEnc();
|
||||
const text = `${role}\n${prompt}`;
|
||||
|
||||
// too large a text will block the thread
|
||||
if (text.length > 15000) {
|
||||
return text.length * 1.7;
|
||||
}
|
||||
|
||||
try {
|
||||
const encodeText = enc.encode(text);
|
||||
return encodeText.length + role.length; // add an estimate for the role
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
import dayjs from 'dayjs';
|
||||
|
||||
export const formatTime2YMDHM = (time: Date) => dayjs(time).format('YYYY-MM-DD HH:mm');
|
||||
export const formatTime2YMDHM = (time?: Date) =>
|
||||
time ? dayjs(time).format('YYYY-MM-DD HH:mm') : '';
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
import crypto from 'crypto';
|
||||
import { customAlphabet } from 'nanoid';
|
||||
|
||||
/* check string is a web link */
|
||||
export function strIsLink(str?: string) {
|
||||
@@ -36,3 +37,7 @@ export function replaceVariable(text: string, obj: Record<string, string | numbe
|
||||
}
|
||||
return text || '';
|
||||
}
|
||||
|
||||
export const getNanoid = (size = 12) => {
|
||||
return customAlphabet('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890', size)();
|
||||
};
|
||||
|
||||
@@ -51,6 +51,12 @@ export type FastGPTFeConfigsType = {
|
||||
favicon?: string;
|
||||
customApiDomain?: string;
|
||||
customSharePageDomain?: string;
|
||||
subscription?: {
|
||||
datasetStoreFreeSize?: number;
|
||||
datasetStorePrice?: number;
|
||||
};
|
||||
|
||||
uploadFileMaxSize?: number;
|
||||
};
|
||||
|
||||
export type SystemEnvType = {
|
||||
@@ -63,4 +69,5 @@ export type SystemEnvType = {
|
||||
declare global {
|
||||
var feConfigs: FastGPTFeConfigsType;
|
||||
var systemEnv: SystemEnvType;
|
||||
var systemInitd: boolean;
|
||||
}
|
||||
|
||||
2
packages/global/core/app/type.d.ts
vendored
@@ -4,7 +4,7 @@ import { PermissionTypeEnum } from '../../support/permission/constant';
|
||||
import type { AIChatModuleProps, DatasetModuleProps } from '../module/node/type.d';
|
||||
import { VariableInputEnum } from '../module/constants';
|
||||
import { SelectedDatasetType } from '../module/api';
|
||||
import { DatasetSearchModeEnum } from '../dataset/constant';
|
||||
import { DatasetSearchModeEnum } from '../dataset/constants';
|
||||
|
||||
export interface AppSchema {
|
||||
_id: string;
|
||||
|
||||
@@ -4,7 +4,7 @@ import { ModuleOutputKeyEnum, ModuleInputKeyEnum } from '../module/constants';
|
||||
import type { FlowNodeInputItemType } from '../module/node/type.d';
|
||||
import { getGuideModule, splitGuideModule } from '../module/utils';
|
||||
import { ModuleItemType } from '../module/type.d';
|
||||
import { DatasetSearchModeEnum } from '../dataset/constant';
|
||||
import { DatasetSearchModeEnum } from '../dataset/constants';
|
||||
|
||||
export const getDefaultAppForm = (templateId = 'fastgpt-universal'): AppSimpleEditFormType => {
|
||||
return {
|
||||
|
||||
@@ -31,16 +31,16 @@ export enum ChatSourceEnum {
|
||||
}
|
||||
export const ChatSourceMap = {
|
||||
[ChatSourceEnum.test]: {
|
||||
name: 'chat.logs.test'
|
||||
name: 'core.chat.logs.test'
|
||||
},
|
||||
[ChatSourceEnum.online]: {
|
||||
name: 'chat.logs.online'
|
||||
name: 'core.chat.logs.online'
|
||||
},
|
||||
[ChatSourceEnum.share]: {
|
||||
name: 'chat.logs.share'
|
||||
name: 'core.chat.logs.share'
|
||||
},
|
||||
[ChatSourceEnum.api]: {
|
||||
name: 'chat.logs.api'
|
||||
name: 'core.chat.logs.api'
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
4
packages/global/core/chat/type.d.ts
vendored
@@ -4,7 +4,7 @@ import { ChatRoleEnum, ChatSourceEnum, ChatStatusEnum } from './constants';
|
||||
import { FlowNodeTypeEnum } from '../module/node/constant';
|
||||
import { ModuleOutputKeyEnum } from '../module/constants';
|
||||
import { AppSchema } from '../app/type';
|
||||
import { DatasetSearchModeEnum } from '../dataset/constant';
|
||||
import { DatasetSearchModeEnum } from '../dataset/constants';
|
||||
|
||||
export type ChatSchema = {
|
||||
_id: string;
|
||||
@@ -22,6 +22,7 @@ export type ChatSchema = {
|
||||
shareId?: string;
|
||||
outLinkUid?: string;
|
||||
content: ChatItemType[];
|
||||
metadata?: Record<string, any>;
|
||||
};
|
||||
|
||||
export type ChatWithAppSchema = Omit<ChatSchema, 'appId'> & {
|
||||
@@ -91,6 +92,7 @@ export type moduleDispatchResType = {
|
||||
runningTime?: number;
|
||||
inputTokens?: number;
|
||||
outputTokens?: number;
|
||||
charsLength?: number;
|
||||
model?: string;
|
||||
query?: string;
|
||||
contextTotalLen?: number;
|
||||
|
||||
50
packages/global/core/dataset/api.d.ts
vendored
@@ -1,5 +1,5 @@
|
||||
import { DatasetDataIndexItemType, DatasetSchemaType } from './type';
|
||||
import { DatasetCollectionTrainingModeEnum, DatasetCollectionTypeEnum } from './constant';
|
||||
import { TrainingModeEnum, DatasetCollectionTypeEnum } from './constants';
|
||||
import type { LLMModelItemType } from '../ai/model.d';
|
||||
|
||||
/* ================= dataset ===================== */
|
||||
@@ -16,28 +16,47 @@ export type DatasetUpdateBody = {
|
||||
};
|
||||
|
||||
/* ================= collection ===================== */
|
||||
export type CreateDatasetCollectionParams = {
|
||||
datasetId: string;
|
||||
export type DatasetCollectionChunkMetadataType = {
|
||||
parentId?: string;
|
||||
trainingType?: `${TrainingModeEnum}`;
|
||||
chunkSize?: number;
|
||||
chunkSplitter?: string;
|
||||
qaPrompt?: string;
|
||||
metadata?: Record<string, any>;
|
||||
};
|
||||
export type CreateDatasetCollectionParams = DatasetCollectionChunkMetadataType & {
|
||||
datasetId: string;
|
||||
name: string;
|
||||
type: `${DatasetCollectionTypeEnum}`;
|
||||
trainingType?: `${DatasetCollectionTrainingModeEnum}`;
|
||||
chunkSize?: number;
|
||||
fileId?: string;
|
||||
rawLink?: string;
|
||||
qaPrompt?: string;
|
||||
rawTextLength?: number;
|
||||
hashRawText?: string;
|
||||
metadata?: Record<string, any>;
|
||||
};
|
||||
|
||||
export type ApiCreateDatasetCollectionParams = DatasetCollectionChunkMetadataType & {
|
||||
datasetId: string;
|
||||
};
|
||||
export type TextCreateDatasetCollectionParams = ApiCreateDatasetCollectionParams & {
|
||||
name: string;
|
||||
text: string;
|
||||
};
|
||||
export type LinkCreateDatasetCollectionParams = ApiCreateDatasetCollectionParams & {
|
||||
link: string;
|
||||
};
|
||||
export type FileCreateDatasetCollectionParams = ApiCreateDatasetCollectionParams & {
|
||||
name: string;
|
||||
rawTextLength: number;
|
||||
hashRawText: string;
|
||||
|
||||
fileMetadata?: Record<string, any>;
|
||||
collectionMetadata?: Record<string, any>;
|
||||
};
|
||||
|
||||
/* ================= data ===================== */
|
||||
export type PgSearchRawType = {
|
||||
id: string;
|
||||
team_id: string;
|
||||
tmb_id: string;
|
||||
collection_id: string;
|
||||
data_id: string;
|
||||
score: number;
|
||||
};
|
||||
export type PushDatasetDataChunkProps = {
|
||||
@@ -51,3 +70,14 @@ export type PostWebsiteSyncParams = {
|
||||
datasetId: string;
|
||||
billId: string;
|
||||
};
|
||||
|
||||
export type PushDatasetDataProps = {
|
||||
collectionId: string;
|
||||
data: PushDatasetDataChunkProps[];
|
||||
trainingMode: `${TrainingModeEnum}`;
|
||||
prompt?: string;
|
||||
billId?: string;
|
||||
};
|
||||
export type PushDatasetDataResponse = {
|
||||
insertLen: number;
|
||||
};
|
||||
|
||||
@@ -6,7 +6,7 @@ export enum DatasetTypeEnum {
|
||||
}
|
||||
export const DatasetTypeMap = {
|
||||
[DatasetTypeEnum.folder]: {
|
||||
icon: 'core/dataset/folderDataset',
|
||||
icon: 'common/folderFill',
|
||||
label: 'core.dataset.Folder Dataset',
|
||||
collectionLabel: 'common.Folder'
|
||||
},
|
||||
@@ -53,23 +53,7 @@ export const DatasetCollectionTypeMap = {
|
||||
name: 'core.dataset.link'
|
||||
},
|
||||
[DatasetCollectionTypeEnum.virtual]: {
|
||||
name: 'core.dataset.Virtual File'
|
||||
}
|
||||
};
|
||||
export enum DatasetCollectionTrainingModeEnum {
|
||||
manual = 'manual',
|
||||
chunk = 'chunk',
|
||||
qa = 'qa'
|
||||
}
|
||||
export const DatasetCollectionTrainingTypeMap = {
|
||||
[DatasetCollectionTrainingModeEnum.manual]: {
|
||||
label: 'core.dataset.collection.training.type manual'
|
||||
},
|
||||
[DatasetCollectionTrainingModeEnum.chunk]: {
|
||||
label: 'core.dataset.collection.training.type chunk'
|
||||
},
|
||||
[DatasetCollectionTrainingModeEnum.qa]: {
|
||||
label: 'core.dataset.collection.training.type qa'
|
||||
name: 'core.dataset.Manual collection'
|
||||
}
|
||||
};
|
||||
|
||||
@@ -120,10 +104,12 @@ export enum TrainingModeEnum {
|
||||
|
||||
export const TrainingTypeMap = {
|
||||
[TrainingModeEnum.chunk]: {
|
||||
label: 'core.dataset.training.type chunk'
|
||||
label: 'core.dataset.training.Chunk mode',
|
||||
tooltip: 'core.dataset.import.Chunk Split Tip'
|
||||
},
|
||||
[TrainingModeEnum.qa]: {
|
||||
label: 'core.dataset.training.type qa'
|
||||
label: 'core.dataset.training.QA mode',
|
||||
tooltip: 'core.dataset.import.QA Import Tip'
|
||||
}
|
||||
};
|
||||
|
||||
@@ -184,4 +170,8 @@ export const SearchScoreTypeMap = {
|
||||
}
|
||||
};
|
||||
|
||||
export const FolderAvatarSrc = '/imgs/files/folder.svg';
|
||||
export const FolderIcon = 'file/fill/folder';
|
||||
export const FolderImgUrl = '/imgs/files/folder.svg';
|
||||
|
||||
export const CustomCollectionIcon = 'common/linkBlue';
|
||||
export const LinkCollectionIcon = 'common/linkBlue';
|
||||
2
packages/global/core/dataset/controller.d.ts
vendored
@@ -21,7 +21,7 @@ export type UpdateDatasetDataProps = {
|
||||
};
|
||||
|
||||
export type PatchIndexesProps = {
|
||||
type: 'create' | 'update' | 'delete';
|
||||
type: 'create' | 'update' | 'delete' | 'unChange';
|
||||
index: Omit<DatasetDataIndexItemType, 'dataId'> & {
|
||||
dataId?: string;
|
||||
};
|
||||
|
||||
10
packages/global/core/dataset/type.d.ts
vendored
@@ -8,7 +8,7 @@ import {
|
||||
DatasetTypeEnum,
|
||||
SearchScoreTypeEnum,
|
||||
TrainingModeEnum
|
||||
} from './constant';
|
||||
} from './constants';
|
||||
|
||||
/* schema */
|
||||
export type DatasetSchemaType = {
|
||||
@@ -42,15 +42,21 @@ export type DatasetCollectionSchemaType = {
|
||||
type: `${DatasetCollectionTypeEnum}`;
|
||||
createTime: Date;
|
||||
updateTime: Date;
|
||||
|
||||
trainingType: `${TrainingModeEnum}`;
|
||||
chunkSize: number;
|
||||
chunkSplitter?: string;
|
||||
qaPrompt?: string;
|
||||
|
||||
fileId?: string;
|
||||
rawLink?: string;
|
||||
qaPrompt?: string;
|
||||
|
||||
rawTextLength?: number;
|
||||
hashRawText?: string;
|
||||
metadata?: {
|
||||
webPageSelector?: string;
|
||||
relatedImgId?: string; // The id of the associated image collections
|
||||
|
||||
[key: string]: any;
|
||||
};
|
||||
};
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
import { DatasetCollectionTypeEnum, DatasetDataIndexTypeEnum } from './constant';
|
||||
import { TrainingModeEnum, DatasetCollectionTypeEnum, DatasetDataIndexTypeEnum } from './constants';
|
||||
import { getFileIcon } from '../../common/file/icon';
|
||||
import { strIsLink } from '../../common/string/tools';
|
||||
|
||||
@@ -7,18 +7,13 @@ export function getCollectionIcon(
|
||||
name = ''
|
||||
) {
|
||||
if (type === DatasetCollectionTypeEnum.folder) {
|
||||
return '/imgs/files/folder.svg';
|
||||
return 'common/folderFill';
|
||||
}
|
||||
if (type === DatasetCollectionTypeEnum.link) {
|
||||
return '/imgs/files/link.svg';
|
||||
return 'common/linkBlue';
|
||||
}
|
||||
if (type === DatasetCollectionTypeEnum.virtual) {
|
||||
if (name === '手动录入') {
|
||||
return '/imgs/files/manual.svg';
|
||||
} else if (name === '手动标注') {
|
||||
return '/imgs/files/mark.svg';
|
||||
}
|
||||
return '/imgs/files/collection.svg';
|
||||
return 'file/fill/manual';
|
||||
}
|
||||
return getFileIcon(name);
|
||||
}
|
||||
@@ -30,19 +25,14 @@ export function getSourceNameIcon({
|
||||
sourceId?: string;
|
||||
}) {
|
||||
if (strIsLink(sourceId)) {
|
||||
return '/imgs/files/link.svg';
|
||||
return 'common/linkBlue';
|
||||
}
|
||||
const fileIcon = getFileIcon(sourceName, '');
|
||||
if (fileIcon) {
|
||||
return fileIcon;
|
||||
}
|
||||
|
||||
if (sourceName === '手动录入') {
|
||||
return '/imgs/files/manual.svg';
|
||||
} else if (sourceName === '手动标注') {
|
||||
return '/imgs/files/mark.svg';
|
||||
}
|
||||
return '/imgs/files/collection.svg';
|
||||
return 'file/fill/manual';
|
||||
}
|
||||
|
||||
export function getDefaultIndex(props?: { q?: string; a?: string; dataId?: string }) {
|
||||
@@ -55,3 +45,8 @@ export function getDefaultIndex(props?: { q?: string; a?: string; dataId?: strin
|
||||
dataId
|
||||
};
|
||||
}
|
||||
|
||||
export const predictDataLimitLength = (mode: `${TrainingModeEnum}`, data: any[]) => {
|
||||
if (mode === TrainingModeEnum.qa) return data.length * 20;
|
||||
return data.length;
|
||||
};
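A small standalone illustration of `predictDataLimitLength`: in QA mode every input chunk is budgeted as 20 data entries, otherwise one entry per chunk. The enum below is re-declared locally so the snippet runs on its own; its `chunk` and `qa` values are assumed from the `TrainingTypeMap` keys in this diff, and the sample data is made up.

```typescript
// Local stand-in for the enum in ./constants (values are an assumption).
enum TrainingModeEnum {
  chunk = 'chunk',
  qa = 'qa'
}

// Same logic as the function added in the diff.
const predictDataLimitLength = (mode: `${TrainingModeEnum}`, data: unknown[]) => {
  if (mode === TrainingModeEnum.qa) return data.length * 20;
  return data.length;
};

// Three chunks about to be pushed to a collection (sample data).
const pendingChunks = [{ q: 'first' }, { q: 'second' }, { q: 'third' }];

const chunkModeEstimate = predictDataLimitLength('chunk', pendingChunks); // 3
const qaModeEstimate = predictDataLimitLength('qa', pendingChunks); // 60
```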
|
||||
|
||||
@@ -113,5 +113,16 @@ export enum VariableInputEnum {
|
||||
textarea = 'textarea',
|
||||
select = 'select'
|
||||
}
|
||||
export const variableMap = {
|
||||
[VariableInputEnum.input]: {
|
||||
icon: 'core/app/variable/input'
|
||||
},
|
||||
[VariableInputEnum.textarea]: {
|
||||
icon: 'core/app/variable/textarea'
|
||||
},
|
||||
[VariableInputEnum.select]: {
|
||||
icon: 'core/app/variable/select'
|
||||
}
|
||||
};
|
||||
|
||||
export const DYNAMIC_INPUT_KEY = 'DYNAMIC_INPUT_KEY';
|
||||
|
||||
@@ -54,10 +54,9 @@ export enum FlowNodeTypeEnum {
|
||||
pluginModule = 'pluginModule',
|
||||
pluginInput = 'pluginInput',
|
||||
pluginOutput = 'pluginOutput',
|
||||
cfr = 'cfr',
|
||||
cfr = 'cfr'
|
||||
|
||||
// abandon
|
||||
variable = 'variable'
|
||||
}
|
||||
|
||||
export const EDGE_TYPE = 'default';
|
||||
|
||||
@@ -23,15 +23,15 @@ export const AiChatModule: FlowModuleTemplateType = {
|
||||
templateType: ModuleTemplateTypeEnum.textAnswer,
|
||||
flowType: FlowNodeTypeEnum.chatNode,
|
||||
avatar: '/imgs/module/AI.png',
|
||||
name: 'AI 对话',
|
||||
intro: 'AI 大模型对话',
|
||||
name: 'core.module.template.Ai chat',
|
||||
intro: 'core.module.template.Ai chat intro',
|
||||
showStatus: true,
|
||||
inputs: [
|
||||
Input_Template_Switch,
|
||||
{
|
||||
key: ModuleInputKeyEnum.aiModel,
|
||||
type: FlowNodeInputTypeEnum.selectChatModel,
|
||||
label: '对话模型',
|
||||
label: 'core.module.input.label.aiModel',
|
||||
required: true,
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
showTargetInApp: false,
|
||||
@@ -41,42 +41,31 @@ export const AiChatModule: FlowModuleTemplateType = {
|
||||
{
|
||||
key: ModuleInputKeyEnum.aiChatTemperature,
|
||||
type: FlowNodeInputTypeEnum.hidden, // Set in the pop-up window
|
||||
label: '温度',
|
||||
label: '',
|
||||
value: 0,
|
||||
valueType: ModuleIOValueTypeEnum.number,
|
||||
min: 0,
|
||||
max: 10,
|
||||
step: 1,
|
||||
markList: [
|
||||
{ label: '严谨', value: 0 },
|
||||
{ label: '发散', value: 10 }
|
||||
],
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
},
|
||||
{
|
||||
key: ModuleInputKeyEnum.aiChatMaxToken,
|
||||
type: FlowNodeInputTypeEnum.hidden, // Set in the pop-up window
|
||||
label: '回复上限',
|
||||
label: '',
|
||||
value: 2000,
|
||||
valueType: ModuleIOValueTypeEnum.number,
|
||||
min: 100,
|
||||
max: 4000,
|
||||
step: 50,
|
||||
markList: [
|
||||
{ label: '100', value: 100 },
|
||||
{
|
||||
label: `${4000}`,
|
||||
value: 4000
|
||||
}
|
||||
],
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
},
|
||||
{
|
||||
key: ModuleInputKeyEnum.aiChatIsResponseText,
|
||||
type: FlowNodeInputTypeEnum.hidden,
|
||||
label: '返回AI内容',
|
||||
label: '',
|
||||
value: true,
|
||||
valueType: ModuleIOValueTypeEnum.boolean,
|
||||
showTargetInApp: false,
|
||||
@@ -85,7 +74,7 @@ export const AiChatModule: FlowModuleTemplateType = {
|
||||
{
|
||||
key: ModuleInputKeyEnum.aiChatQuoteTemplate,
|
||||
type: FlowNodeInputTypeEnum.hidden,
|
||||
label: '引用内容模板',
|
||||
label: '',
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
@@ -93,7 +82,7 @@ export const AiChatModule: FlowModuleTemplateType = {
|
||||
{
|
||||
key: ModuleInputKeyEnum.aiChatQuotePrompt,
|
||||
type: FlowNodeInputTypeEnum.hidden,
|
||||
label: '引用内容提示词',
|
||||
label: '',
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
@@ -110,7 +99,7 @@ export const AiChatModule: FlowModuleTemplateType = {
|
||||
{
|
||||
key: ModuleInputKeyEnum.aiSystemPrompt,
|
||||
type: FlowNodeInputTypeEnum.textarea,
|
||||
label: '系统提示词',
|
||||
label: 'core.ai.Prompt',
|
||||
max: 300,
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
description: chatNodeSystemPromptTip,
|
||||
@@ -122,8 +111,8 @@ export const AiChatModule: FlowModuleTemplateType = {
|
||||
{
|
||||
key: ModuleInputKeyEnum.aiChatDatasetQuote,
|
||||
type: FlowNodeInputTypeEnum.target,
|
||||
label: '引用内容',
|
||||
description: "对象数组格式,结构:\n [{q:'问题',a:'回答'}]",
|
||||
label: 'core.module.input.label.Quote',
|
||||
description: 'core.module.input.description.Quote',
|
||||
valueType: ModuleIOValueTypeEnum.datasetQuote,
|
||||
showTargetInApp: true,
|
||||
showTargetInPlugin: true
|
||||
@@ -134,16 +123,16 @@ export const AiChatModule: FlowModuleTemplateType = {
|
||||
Output_Template_UserChatInput,
|
||||
{
|
||||
key: ModuleOutputKeyEnum.history,
|
||||
label: '新的上下文',
|
||||
description: '将本次回复内容拼接上历史记录,作为新的上下文返回',
|
||||
label: 'core.module.output.label.New context',
|
||||
description: 'core.module.output.description.New context',
|
||||
valueType: ModuleIOValueTypeEnum.chatHistory,
|
||||
type: FlowNodeOutputTypeEnum.source,
|
||||
targets: []
|
||||
},
|
||||
{
|
||||
key: ModuleOutputKeyEnum.answerText,
|
||||
label: 'AI回复内容',
|
||||
description: '将在 stream 回复完毕后触发',
|
||||
label: 'core.module.output.label.Ai response content',
|
||||
description: 'core.module.output.description.Ai response content',
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
type: FlowNodeOutputTypeEnum.source,
|
||||
targets: []
|
||||
|
||||
@@ -9,19 +9,17 @@ export const AssignedAnswerModule: FlowModuleTemplateType = {
|
||||
templateType: ModuleTemplateTypeEnum.textAnswer,
|
||||
flowType: FlowNodeTypeEnum.answerNode,
|
||||
avatar: '/imgs/module/reply.png',
|
||||
name: '指定回复',
|
||||
intro: '该模块可以直接回复一段指定的内容。常用于引导、提示',
|
||||
name: 'core.module.template.Assigned reply',
|
||||
intro: 'core.module.template.Assigned reply intro',
|
||||
inputs: [
|
||||
Input_Template_Switch,
|
||||
{
|
||||
key: ModuleInputKeyEnum.answerText,
|
||||
type: FlowNodeInputTypeEnum.textarea,
|
||||
valueType: ModuleIOValueTypeEnum.any,
|
||||
label: '回复的内容',
|
||||
description:
|
||||
'可以使用 \\n 来实现连续换行。\n可以通过外部模块输入实现回复,外部模块输入时会覆盖当前填写的内容。\n如传入非字符串类型数据将会自动转成字符串',
|
||||
placeholder:
|
||||
'可以使用 \\n 来实现连续换行。\n可以通过外部模块输入实现回复,外部模块输入时会覆盖当前填写的内容。\n如传入非字符串类型数据将会自动转成字符串',
|
||||
label: 'core.module.input.label.Response content',
|
||||
description: 'core.module.input.description.Response content',
|
||||
placeholder: 'core.module.input.description.Response content',
|
||||
showTargetInApp: true,
|
||||
showTargetInPlugin: true
|
||||
}
|
||||
|
||||
@@ -17,12 +17,8 @@ export const ClassifyQuestionModule: FlowModuleTemplateType = {
|
||||
templateType: ModuleTemplateTypeEnum.functionCall,
|
||||
flowType: FlowNodeTypeEnum.classifyQuestion,
|
||||
avatar: '/imgs/module/cq.png',
|
||||
name: '问题分类',
|
||||
intro: `根据用户的历史记录和当前问题判断该次提问的类型。可以添加多组问题类型,下面是一个模板例子:
|
||||
类型1: 打招呼
|
||||
类型2: 关于商品“使用”问题
|
||||
类型3: 关于商品“购买”问题
|
||||
类型4: 其他问题`,
|
||||
name: 'core.module.template.Classify question',
|
||||
intro: `core.module.template.Classify question intro`,
|
||||
showStatus: true,
|
||||
inputs: [
|
||||
Input_Template_Switch,
|
||||
@@ -30,7 +26,7 @@ export const ClassifyQuestionModule: FlowModuleTemplateType = {
|
||||
key: ModuleInputKeyEnum.aiModel,
|
||||
type: FlowNodeInputTypeEnum.selectCQModel,
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
label: '分类模型',
|
||||
label: 'core.module.input.label.Classify model',
|
||||
required: true,
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
@@ -39,11 +35,9 @@ export const ClassifyQuestionModule: FlowModuleTemplateType = {
|
||||
key: ModuleInputKeyEnum.aiSystemPrompt,
|
||||
type: FlowNodeInputTypeEnum.textarea,
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
label: '背景知识',
|
||||
description:
|
||||
'你可以添加一些特定内容的介绍,从而更好的识别用户的问题类型。这个内容通常是给模型介绍一个它不知道的内容。',
|
||||
placeholder:
|
||||
'例如: \n1. AIGC(人工智能生成内容)是指使用人工智能技术自动或半自动地生成数字内容,如文本、图像、音乐、视频等。\n2. AIGC技术包括但不限于自然语言处理、计算机视觉、机器学习和深度学习。这些技术可以创建新内容或修改现有内容,以满足特定的创意、教育、娱乐或信息需求。',
|
||||
label: 'core.module.input.label.Background',
|
||||
description: 'core.module.input.description.Background',
|
||||
placeholder: 'core.module.input.placeholder.Classify background',
|
||||
showTargetInApp: true,
|
||||
showTargetInPlugin: true
|
||||
},
|
||||
|
||||
@@ -17,8 +17,8 @@ export const ContextExtractModule: FlowModuleTemplateType = {
|
||||
templateType: ModuleTemplateTypeEnum.functionCall,
|
||||
flowType: FlowNodeTypeEnum.contentExtract,
|
||||
avatar: '/imgs/module/extract.png',
|
||||
name: '文本内容提取',
|
||||
intro: '可从文本中提取指定的数据,例如:sql语句、搜索关键词、代码等',
|
||||
name: 'core.module.template.Extract field',
|
||||
intro: 'core.module.template.Extract field intro',
|
||||
showStatus: true,
|
||||
inputs: [
|
||||
Input_Template_Switch,
|
||||
@@ -26,7 +26,7 @@ export const ContextExtractModule: FlowModuleTemplateType = {
|
||||
key: ModuleInputKeyEnum.aiModel,
|
||||
type: FlowNodeInputTypeEnum.selectExtractModel,
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
label: '提取模型',
|
||||
label: 'core.module.input.label.LLM',
|
||||
required: true,
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
|
||||
@@ -12,22 +12,22 @@ import {
|
||||
} from '../../constants';
|
||||
import { Input_Template_Switch, Input_Template_UserChatInput } from '../input';
|
||||
import { Output_Template_Finish, Output_Template_UserChatInput } from '../output';
|
||||
import { DatasetSearchModeEnum } from '../../../dataset/constant';
|
||||
import { DatasetSearchModeEnum } from '../../../dataset/constants';
|
||||
|
||||
export const DatasetSearchModule: FlowModuleTemplateType = {
|
||||
id: FlowNodeTypeEnum.datasetSearchNode,
|
||||
templateType: ModuleTemplateTypeEnum.functionCall,
|
||||
flowType: FlowNodeTypeEnum.datasetSearchNode,
|
||||
avatar: '/imgs/module/db.png',
|
||||
name: '知识库搜索',
|
||||
intro: '去知识库中搜索对应的答案。可作为 AI 对话引用参考。',
|
||||
name: 'core.module.template.Dataset search',
|
||||
intro: 'core.module.template.Dataset search intro',
|
||||
showStatus: true,
|
||||
inputs: [
|
||||
Input_Template_Switch,
|
||||
{
|
||||
key: ModuleInputKeyEnum.datasetSelectList,
|
||||
type: FlowNodeInputTypeEnum.selectDataset,
|
||||
label: '关联的知识库',
|
||||
label: 'core.module.input.label.Select dataset',
|
||||
value: [],
|
||||
valueType: ModuleIOValueTypeEnum.selectDataset,
|
||||
list: [],
|
||||
@@ -38,7 +38,7 @@ export const DatasetSearchModule: FlowModuleTemplateType = {
|
||||
{
|
||||
key: ModuleInputKeyEnum.datasetSimilarity,
|
||||
type: FlowNodeInputTypeEnum.hidden,
|
||||
label: '最低相关性',
|
||||
label: '',
|
||||
value: 0.4,
|
||||
valueType: ModuleIOValueTypeEnum.number,
|
||||
min: 0,
|
||||
@@ -54,8 +54,7 @@ export const DatasetSearchModule: FlowModuleTemplateType = {
|
||||
{
|
||||
key: ModuleInputKeyEnum.datasetLimit,
|
||||
type: FlowNodeInputTypeEnum.hidden,
|
||||
label: '引用上限',
|
||||
description: '单次搜索最大的 Tokens 数量,中文约1字=1.7Tokens,英文约1字=1Tokens',
|
||||
label: '',
|
||||
value: 1500,
|
||||
valueType: ModuleIOValueTypeEnum.number,
|
||||
showTargetInApp: false,
|
||||
@@ -93,23 +92,22 @@ export const DatasetSearchModule: FlowModuleTemplateType = {
|
||||
Output_Template_UserChatInput,
|
||||
{
|
||||
key: ModuleOutputKeyEnum.datasetIsEmpty,
|
||||
label: '搜索结果为空',
|
||||
label: 'core.module.output.label.Search result empty',
|
||||
type: FlowNodeOutputTypeEnum.source,
|
||||
valueType: ModuleIOValueTypeEnum.boolean,
|
||||
targets: []
|
||||
},
|
||||
{
|
||||
key: ModuleOutputKeyEnum.datasetUnEmpty,
|
||||
label: '搜索结果不为空',
|
||||
label: 'core.module.output.label.Search result not empty',
|
||||
type: FlowNodeOutputTypeEnum.source,
|
||||
valueType: ModuleIOValueTypeEnum.boolean,
|
||||
targets: []
|
||||
},
|
||||
{
|
||||
key: ModuleOutputKeyEnum.datasetQuoteQA,
|
||||
label: '引用内容',
|
||||
description:
|
||||
'始终返回数组,如果希望搜索结果为空时执行额外操作,需要用到上面的两个输入以及目标模块的触发器',
|
||||
label: 'core.module.output.label.Quote',
|
||||
description: 'core.module.output.label.Quote intro',
|
||||
type: FlowNodeOutputTypeEnum.source,
|
||||
valueType: ModuleIOValueTypeEnum.datasetQuote,
|
||||
targets: []
|
||||
|
||||
@@ -17,8 +17,8 @@ export const HttpModule: FlowModuleTemplateType = {
|
||||
templateType: ModuleTemplateTypeEnum.externalCall,
|
||||
flowType: FlowNodeTypeEnum.httpRequest,
|
||||
avatar: '/imgs/module/http.png',
|
||||
name: 'HTTP模块',
|
||||
intro: '可以发出一个 HTTP POST 请求,实现更为复杂的操作(联网搜索、数据库查询等)',
|
||||
name: 'core.module.template.Http request',
|
||||
intro: 'core.module.template.Http request intro',
|
||||
showStatus: true,
|
||||
inputs: [
|
||||
Input_Template_Switch,
|
||||
|
||||
@@ -22,8 +22,8 @@ export const RunAppModule: FlowModuleTemplateType = {
|
||||
templateType: ModuleTemplateTypeEnum.externalCall,
|
||||
flowType: FlowNodeTypeEnum.runApp,
|
||||
avatar: '/imgs/module/app.png',
|
||||
name: '应用调用',
|
||||
intro: '可以选择一个其他应用进行调用',
|
||||
name: 'core.module.template.Running app',
|
||||
intro: 'core.module.template.Running app intro',
|
||||
showStatus: true,
|
||||
inputs: [
|
||||
Input_Template_Switch,
|
||||
|
||||
@@ -8,7 +8,7 @@ export const RunPluginModule: FlowModuleTemplateType = {
|
||||
flowType: FlowNodeTypeEnum.pluginModule,
|
||||
avatar: '/imgs/module/custom.png',
|
||||
intro: '',
|
||||
name: '自定义模块',
|
||||
name: '',
|
||||
showStatus: false,
|
||||
inputs: [], // [{key:'pluginId'},...]
|
||||
outputs: []
|
||||
|
||||
@@ -8,14 +8,14 @@ export const UserGuideModule: FlowModuleTemplateType = {
|
||||
templateType: ModuleTemplateTypeEnum.userGuide,
|
||||
flowType: FlowNodeTypeEnum.userGuide,
|
||||
avatar: '/imgs/module/userGuide.png',
|
||||
name: '用户引导',
|
||||
name: 'core.module.template.User guide',
|
||||
intro: userGuideTip,
|
||||
inputs: [
|
||||
{
|
||||
key: ModuleInputKeyEnum.welcomeText,
|
||||
type: FlowNodeInputTypeEnum.hidden,
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
label: '开场白',
|
||||
label: 'core.app.Welcome Text',
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
},
|
||||
@@ -23,7 +23,7 @@ export const UserGuideModule: FlowModuleTemplateType = {
|
||||
key: ModuleInputKeyEnum.variables,
|
||||
type: FlowNodeInputTypeEnum.hidden,
|
||||
valueType: ModuleIOValueTypeEnum.any,
|
||||
label: '对话框变量',
|
||||
label: 'core.module.Variable',
|
||||
value: [],
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
@@ -32,7 +32,7 @@ export const UserGuideModule: FlowModuleTemplateType = {
|
||||
key: ModuleInputKeyEnum.questionGuide,
|
||||
valueType: ModuleIOValueTypeEnum.boolean,
|
||||
type: FlowNodeInputTypeEnum.switch,
|
||||
label: '问题引导',
|
||||
label: '',
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
},
|
||||
@@ -40,7 +40,7 @@ export const UserGuideModule: FlowModuleTemplateType = {
|
||||
key: ModuleInputKeyEnum.tts,
|
||||
type: FlowNodeInputTypeEnum.hidden,
|
||||
valueType: ModuleIOValueTypeEnum.any,
|
||||
label: '语音播报',
|
||||
label: '',
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
}
|
||||
|
||||
@@ -16,14 +16,14 @@ export const UserInputModule: FlowModuleTemplateType = {
|
||||
templateType: ModuleTemplateTypeEnum.systemInput,
|
||||
flowType: FlowNodeTypeEnum.questionInput,
|
||||
avatar: '/imgs/module/userChatInput.png',
|
||||
name: '用户问题(入口)',
|
||||
intro: '用户输入的内容。该模块通常作为应用的入口,用户在发送消息后会首先执行该模块。',
|
||||
name: 'core.module.template.Chat entrance',
|
||||
intro: 'core.module.template.Chat entrance intro',
|
||||
inputs: [
|
||||
{
|
||||
key: ModuleInputKeyEnum.userChatInput,
|
||||
type: FlowNodeInputTypeEnum.systemInput,
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
label: '用户问题',
|
||||
label: 'core.module.input.label.user question',
|
||||
showTargetInApp: false,
|
||||
showTargetInPlugin: false
|
||||
}
|
||||
@@ -31,7 +31,7 @@ export const UserInputModule: FlowModuleTemplateType = {
|
||||
outputs: [
|
||||
{
|
||||
key: ModuleOutputKeyEnum.userChatInput,
|
||||
label: '用户问题',
|
||||
label: 'core.module.input.label.user question',
|
||||
type: FlowNodeOutputTypeEnum.source,
|
||||
valueType: ModuleIOValueTypeEnum.string,
|
||||
targets: []
|
||||
|
||||
@@ -1,7 +1,4 @@
|
||||
export const chatNodeSystemPromptTip =
|
||||
'模型固定的引导词,通过调整该内容,可以引导模型聊天方向。该内容会被固定在上下文的开头。可使用变量,例如 {{language}}';
|
||||
export const userGuideTip = '可以在对话前设置引导语,设置全局变量,设置下一步指引';
|
||||
export const welcomeTextTip =
|
||||
'每次对话开始前,发送一个初始内容。支持标准 Markdown 语法,可使用的额外标记:\n[快捷按键]: 用户点击后可以直接发送该问题';
|
||||
export const variableTip =
|
||||
'可以在对话开始前,要求用户填写一些内容作为本轮对话的特定变量。该模块位于开场引导之后。\n变量可以通过 {{变量key}} 的形式注入到其他模块 string 类型的输入中,例如:提示词、限定词等';
|
||||
export const chatNodeSystemPromptTip = 'core.app.tip.chatNodeSystemPromptTip';
|
||||
export const userGuideTip = 'core.app.tip.userGuideTip';
|
||||
export const welcomeTextTip = 'core.app.tip.welcomeTextTip';
|
||||
export const variableTip = 'core.app.tip.variableTip';
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
import { FlowNodeInputTypeEnum, FlowNodeTypeEnum } from './node/constant';
|
||||
import { ModuleIOValueTypeEnum, ModuleInputKeyEnum } from './constants';
|
||||
import { ModuleIOValueTypeEnum, ModuleInputKeyEnum, variableMap } from './constants';
|
||||
import { FlowNodeInputItemType, FlowNodeOutputItemType } from './node/type';
|
||||
import { AppTTSConfigType, ModuleItemType, VariableItemType } from './type';
|
||||
import { Input_Template_Switch } from './template/input';
|
||||
@@ -94,3 +94,12 @@ export function plugin2ModuleIO(
|
||||
: []
|
||||
};
|
||||
}
|
||||
|
||||
export const formatVariablesIcon = (
|
||||
variables: VariableItemType[]
|
||||
): (VariableItemType & { icon: string })[] => {
|
||||
return variables.map((item) => ({
|
||||
...item,
|
||||
icon: variableMap[item.type]?.icon
|
||||
}));
|
||||
};
|
||||
|
||||
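The `formatVariablesIcon` helper added above attaches an icon path to each variable by looking up its `type` in `variableMap`. A minimal self-contained sketch of that lookup follows; the `VariableItem` type here is a simplified, hypothetical stand-in for the real `VariableItemType`, and the map is re-declared locally with plain string keys instead of the enum.

```typescript
// Simplified stand-ins (assumptions): the real code keys the map by VariableInputEnum
// and VariableItemType carries more fields than shown here.
type VariableType = 'input' | 'textarea' | 'select';

const variableMap: Record<VariableType, { icon: string }> = {
  input: { icon: 'core/app/variable/input' },
  textarea: { icon: 'core/app/variable/textarea' },
  select: { icon: 'core/app/variable/select' }
};

type VariableItem = { key: string; label: string; type: VariableType };

// Same shape as the helper in the diff: copy each item and add its icon path.
const formatVariablesIcon = (
  variables: VariableItem[]
): (VariableItem & { icon: string })[] =>
  variables.map((item) => ({
    ...item,
    icon: variableMap[item.type]?.icon
  }));

const formatted = formatVariablesIcon([
  { key: 'language', label: 'Language', type: 'select' },
  { key: 'topic', label: 'Topic', type: 'input' }
]);
```

Keeping the icon out of the stored variable and deriving it at read time means the icon paths can change without migrating saved app configurations.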
@@ -7,7 +7,7 @@
|
||||
"encoding": "^0.1.13",
|
||||
"js-tiktoken": "^1.0.7",
|
||||
"openai": "4.23.0",
|
||||
"pdfjs-dist": "^4.0.269",
|
||||
"nanoid": "^4.0.1",
|
||||
"timezones-list": "^3.0.2"
|
||||
},
|
||||
"devDependencies": {
|
||||
|
||||
1
packages/global/support/user/api.d.ts
vendored
@@ -3,7 +3,6 @@ import { OAuthEnum } from './constant';
|
||||
export type PostLoginProps = {
|
||||
username: string;
|
||||
password: string;
|
||||
tmbId?: string;
|
||||
};
|
||||
|
||||
export type OauthLoginProps = {
|
||||
|
||||
1
packages/global/support/user/team/type.d.ts
vendored
@@ -9,7 +9,6 @@ export type TeamSchema = {
|
||||
createTime: Date;
|
||||
balance: number;
|
||||
maxSize: number;
|
||||
lastDatasetBillTime: Date;
|
||||
limit: {
|
||||
lastExportDatasetTime: Date;
|
||||
lastWebsiteSyncTime: Date;
|
||||
|
||||
1
packages/global/support/user/type.d.ts
vendored
@@ -13,6 +13,7 @@ export type UserModelSchema = {
|
||||
createTime: number;
|
||||
timezone: string;
|
||||
status: `${UserStatusEnum}`;
|
||||
lastLoginTmbId?: string;
|
||||
openaiAccount?: {
|
||||
key: string;
|
||||
baseUrl: string;
|
||||
|
||||
3
packages/global/support/wallet/bill/api.d.ts
vendored
@@ -3,8 +3,7 @@ import { BillListItemCountType, BillListItemType } from './type';
|
||||
|
||||
export type CreateTrainingBillProps = {
|
||||
name: string;
|
||||
vectorModel?: string;
|
||||
agentModel?: string;
|
||||
datasetId: string;
|
||||
};
|
||||
|
||||
export type ConcatBillProps = BillListItemCountType & {
|
||||
|
||||
@@ -7,7 +7,7 @@ export enum BillSourceEnum {
|
||||
api = 'api',
|
||||
shareLink = 'shareLink',
|
||||
training = 'training',
|
||||
datasetStore = 'datasetStore'
|
||||
datasetExpand = 'datasetExpand'
|
||||
}
|
||||
|
||||
export const BillSourceMap: Record<`${BillSourceEnum}`, string> = {
|
||||
@@ -15,5 +15,5 @@ export const BillSourceMap: Record<`${BillSourceEnum}`, string> = {
|
||||
[BillSourceEnum.api]: 'Api',
|
||||
[BillSourceEnum.shareLink]: '免登录链接',
|
||||
[BillSourceEnum.training]: '数据训练',
|
||||
[BillSourceEnum.datasetStore]: '知识库存储'
|
||||
[BillSourceEnum.datasetExpand]: '知识库扩容'
|
||||
};
|
||||
|
||||
@@ -4,9 +4,8 @@ import { BillSourceEnum } from './constants';
|
||||
export type BillListItemCountType = {
|
||||
inputTokens?: number;
|
||||
outputTokens?: number;
|
||||
textLen?: number;
|
||||
charsLength?: number;
|
||||
duration?: number;
|
||||
dataLen?: number;
|
||||
|
||||
// abandon
|
||||
tokenLen?: number;
|
||||
|
||||
4
packages/global/support/wallet/sub/api.d.ts
vendored
Normal file
@@ -0,0 +1,4 @@
|
||||
export type SubDatasetSizeParams = {
|
||||
size: number;
|
||||
renew: boolean;
|
||||
};
|
||||
37
packages/global/support/wallet/sub/constants.ts
Normal file
@@ -0,0 +1,37 @@
|
||||
export enum SubTypeEnum {
|
||||
datasetStore = 'datasetStore'
|
||||
}
|
||||
|
||||
export const subTypeMap = {
|
||||
[SubTypeEnum.datasetStore]: {
|
||||
label: 'support.user.team.subscription.type.datasetStore'
|
||||
}
|
||||
};
|
||||
|
||||
export enum SubModeEnum {
|
||||
month = 'month',
|
||||
year = 'year'
|
||||
}
|
||||
|
||||
export const subModeMap = {
|
||||
[SubModeEnum.month]: {
|
||||
label: 'support.user.team.subscription.mode.month'
|
||||
},
|
||||
[SubModeEnum.year]: {
|
||||
label: 'support.user.team.subscription.mode.year'
|
||||
}
|
||||
};
|
||||
|
||||
export enum SubStatusEnum {
|
||||
active = 'active',
|
||||
expired = 'expired'
|
||||
}
|
||||
|
||||
export const subStatusMap = {
|
||||
[SubStatusEnum.active]: {
|
||||
label: 'support.user.team.subscription.status.active'
|
||||
},
|
||||
[SubStatusEnum.expired]: {
|
||||
label: 'support.user.team.subscription.status.expired'
|
||||
}
|
||||
};
|
||||
12
packages/global/support/wallet/sub/type.d.ts
vendored
Normal file
@@ -0,0 +1,12 @@
|
||||
import { SubModeEnum, SubStatusEnum, SubTypeEnum } from './constants';
|
||||
|
||||
export type TeamSubSchema = {
|
||||
teamId: string;
|
||||
type: `${SubTypeEnum}`;
|
||||
mode: `${SubModeEnum}`;
|
||||
status: `${SubStatusEnum}`;
|
||||
renew: boolean;
|
||||
startTime: Date;
|
||||
expiredTime: Date;
|
||||
datasetStoreAmount?: number;
|
||||
};
|
||||
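`TeamSubSchema` stores both a `status` and an `expiredTime`, but the diff does not show how the two are reconciled. As a sketch only, a caller might treat a subscription as usable when both agree; `isSubUsable` and the trimmed `TeamSub` type are hypothetical names introduced for illustration, not part of the change.

```typescript
// Trimmed, hypothetical copy of TeamSubSchema (only the fields this sketch needs).
type TeamSub = {
  teamId: string;
  status: 'active' | 'expired';
  renew: boolean;
  startTime: Date;
  expiredTime: Date;
  datasetStoreAmount?: number;
};

// Assumption: a subscription counts as usable only if it is marked active
// AND its expiry lies strictly in the future relative to `now`.
const isSubUsable = (sub: TeamSub, now: Date = new Date()): boolean =>
  sub.status === 'active' && sub.expiredTime.getTime() > now.getTime();

const exampleSub: TeamSub = {
  teamId: 'team-1',
  status: 'active',
  renew: true,
  startTime: new Date('2024-01-01T00:00:00Z'),
  expiredTime: new Date('2024-02-01T00:00:00Z')
};
```

Checking the timestamp as well as the flag guards against a scheduled job not yet having flipped `status` to `expired`.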
6
packages/service/common/file/constants.ts
Normal file
@@ -0,0 +1,6 @@
|
||||
import path from 'path';
|
||||
|
||||
export const tmpFileDirPath =
|
||||
process.env.NODE_ENV === 'production' ? '/app/tmp' : path.join(process.cwd(), 'tmp');
|
||||
|
||||
export const previewMaxCharCount = 3000;
|
||||
@@ -3,9 +3,10 @@ import { BucketNameEnum } from '@fastgpt/global/common/file/constants';
|
||||
import fsp from 'fs/promises';
|
||||
import fs from 'fs';
|
||||
import { DatasetFileSchema } from '@fastgpt/global/core/dataset/type';
|
||||
import { delImgByFileIdList } from '../image/controller';
|
||||
import { MongoFileSchema } from './schema';
|
||||
|
||||
export function getGFSCollection(bucket: `${BucketNameEnum}`) {
|
||||
MongoFileSchema;
|
||||
return connectionMongo.connection.db.collection(`${bucket}.files`);
|
||||
}
|
||||
export function getGridBucket(bucket: `${BucketNameEnum}`) {
|
||||
@@ -21,6 +22,7 @@ export async function uploadFile({
|
||||
tmbId,
|
||||
path,
|
||||
filename,
|
||||
contentType,
|
||||
metadata = {}
|
||||
}: {
|
||||
bucketName: `${BucketNameEnum}`;
|
||||
@@ -28,6 +30,7 @@ export async function uploadFile({
|
||||
tmbId: string;
|
||||
path: string;
|
||||
filename: string;
|
||||
contentType?: string;
|
||||
metadata?: Record<string, any>;
|
||||
}) {
|
||||
if (!path) return Promise.reject(`filePath is empty`);
|
||||
@@ -44,7 +47,7 @@ export async function uploadFile({
|
||||
|
||||
const stream = bucket.openUploadStream(filename, {
|
||||
metadata,
|
||||
contentType: metadata?.contentType
|
||||
contentType
|
||||
});
|
||||
|
||||
// save to gridfs
|
||||
@@ -96,40 +99,6 @@ export async function delFileByFileIdList({
|
||||
}
|
||||
}
|
||||
}
|
||||
// delete file by metadata(datasetId)
|
||||
export async function delFileByMetadata({
|
||||
bucketName,
|
||||
datasetId
|
||||
}: {
|
||||
bucketName: `${BucketNameEnum}`;
|
||||
datasetId?: string;
|
||||
}) {
|
||||
const bucket = getGridBucket(bucketName);
|
||||
|
||||
const files = await bucket
|
||||
.find(
|
||||
{
|
||||
...(datasetId && { 'metadata.datasetId': datasetId })
|
||||
},
|
||||
{
|
||||
projection: {
|
||||
_id: 1
|
||||
}
|
||||
}
|
||||
)
|
||||
.toArray();
|
||||
|
||||
const idList = files.map((item) => String(item._id));
|
||||
|
||||
// delete img
|
||||
await delImgByFileIdList(idList);
|
||||
|
||||
// delete file
|
||||
await delFileByFileIdList({
|
||||
bucketName,
|
||||
fileIdList: idList
|
||||
});
|
||||
}
|
||||
|
||||
export async function getDownloadStream({
|
||||
bucketName,
|
||||
|
||||
15
packages/service/common/file/gridfs/schema.ts
Normal file
@@ -0,0 +1,15 @@
|
||||
import { connectionMongo, type Model } from '../../mongo';
|
||||
const { Schema, model, models } = connectionMongo;
|
||||
|
||||
const FileSchema = new Schema({});
|
||||
|
||||
try {
|
||||
FileSchema.index({ 'metadata.teamId': 1 });
|
||||
FileSchema.index({ 'metadata.uploadDate': -1 });
|
||||
} catch (error) {
|
||||
console.log(error);
|
||||
}
|
||||
|
||||
export const MongoFileSchema = models['dataset.files'] || model('dataset.files', FileSchema);
|
||||
|
||||
MongoFileSchema.syncIndexes();
|
||||
@@ -1 +0,0 @@
|
||||
export const imageBaseUrl = '/api/system/img/';
|
||||
@@ -1,5 +1,5 @@
|
||||
import { UploadImgProps } from '@fastgpt/global/common/file/api';
|
||||
import { imageBaseUrl } from './constant';
|
||||
import { imageBaseUrl } from '@fastgpt/global/common/file/image/constants';
|
||||
import { MongoImage } from './schema';
|
||||
|
||||
export function getMongoImgUrl(id: string) {
|
||||
@@ -8,10 +8,13 @@ export function getMongoImgUrl(id: string) {
|
||||
|
||||
export const maxImgSize = 1024 * 1024 * 12;
|
||||
export async function uploadMongoImg({
|
||||
type,
|
||||
base64Img,
|
||||
teamId,
|
||||
expiredTime,
|
||||
metadata
|
||||
metadata,
|
||||
|
||||
shareId
|
||||
}: UploadImgProps & {
|
||||
teamId: string;
|
||||
}) {
|
||||
@@ -20,12 +23,16 @@ export async function uploadMongoImg({
|
||||
}
|
||||
|
||||
const base64Data = base64Img.split(',')[1];
|
||||
const binary = Buffer.from(base64Data, 'base64');
|
||||
|
||||
const { _id } = await MongoImage.create({
|
||||
type,
|
||||
teamId,
|
||||
binary: Buffer.from(base64Data, 'base64'),
|
||||
binary,
|
||||
expiredTime: expiredTime,
|
||||
metadata
|
||||
metadata,
|
||||
|
||||
shareId
|
||||
});
|
||||
|
||||
return getMongoImgUrl(String(_id));
|
||||
@@ -39,8 +46,17 @@ export async function readMongoImg({ id }: { id: string }) {
|
||||
return data?.binary;
|
||||
}
|
||||
|
||||
export async function delImgByFileIdList(fileIds: string[]) {
|
||||
export async function delImgByRelatedId({
|
||||
teamId,
|
||||
relateIds
|
||||
}: {
|
||||
teamId: string;
|
||||
relateIds: string[];
|
||||
}) {
|
||||
if (relateIds.length === 0) return;
|
||||
|
||||
return MongoImage.deleteMany({
|
||||
'metadata.fileId': { $in: fileIds.map((item) => String(item)) }
|
||||
teamId,
|
||||
'metadata.relatedId': { $in: relateIds.map((id) => String(id)) }
|
||||
});
|
||||
}
|
||||
|
||||
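`delImgByRelatedId` replaces the old `fileId`-based delete with one that is scoped by `teamId` and matches on `metadata.relatedId`, and it returns early when the id list is empty. The filter construction can be sketched in isolation as below; `buildImageDeleteFilter` is a hypothetical helper introduced for illustration, and the real function passes the equivalent object straight to `MongoImage.deleteMany`.

```typescript
// Shape of the filter object built inside delImgByRelatedId in the diff.
type ImageDeleteFilter = {
  teamId: string;
  'metadata.relatedId': { $in: string[] };
};

// Hypothetical helper: returns null for an empty list so the caller can skip the query,
// mirroring the early `return` in the diff.
const buildImageDeleteFilter = (
  teamId: string,
  relateIds: unknown[]
): ImageDeleteFilter | null => {
  if (relateIds.length === 0) return null;
  return {
    teamId,
    // Ids are normalised to strings, as in the diff, so ObjectId-like values also match.
    'metadata.relatedId': { $in: relateIds.map((id) => String(id)) }
  };
};
```

Including `teamId` in the filter lines up with the compound index `{ teamId: 1, 'metadata.relatedId': 1 }` that the same change adds to `ImageSchema`.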
@@ -1,5 +1,7 @@
|
||||
import { TeamCollectionName } from '@fastgpt/global/support/user/team/constant';
|
||||
import { connectionMongo, type Model } from '../../mongo';
|
||||
import { MongoImageSchemaType } from '@fastgpt/global/common/file/image/type.d';
|
||||
import { mongoImageTypeMap } from '@fastgpt/global/common/file/image/constants';
|
||||
const { Schema, model, models } = connectionMongo;
|
||||
|
||||
const ImageSchema = new Schema({
|
||||
@@ -12,12 +14,18 @@ const ImageSchema = new Schema({
|
||||
type: Date,
|
||||
default: () => new Date()
|
||||
},
|
||||
binary: {
|
||||
type: Buffer
|
||||
},
|
||||
expiredTime: {
|
||||
type: Date
|
||||
},
|
||||
binary: {
|
||||
type: Buffer
|
||||
},
|
||||
type: {
|
||||
type: String,
|
||||
enum: Object.keys(mongoImageTypeMap),
|
||||
required: true
|
||||
},
|
||||
|
||||
metadata: {
|
||||
type: Object
|
||||
}
|
||||
@@ -25,14 +33,14 @@ const ImageSchema = new Schema({
|
||||
|
||||
try {
|
||||
ImageSchema.index({ expiredTime: 1 }, { expireAfterSeconds: 60 });
|
||||
ImageSchema.index({ type: 1 });
|
||||
ImageSchema.index({ createTime: 1 });
|
||||
ImageSchema.index({ teamId: 1, 'metadata.relatedId': 1 });
|
||||
} catch (error) {
|
||||
console.log(error);
|
||||
}
|
||||
|
||||
export const MongoImage: Model<{
|
||||
teamId: string;
|
||||
binary: Buffer;
|
||||
metadata?: { fileId?: string };
|
||||
}> = models['image'] || model('image', ImageSchema);
|
||||
export const MongoImage: Model<MongoImageSchemaType> =
|
||||
models['image'] || model('image', ImageSchema);
|
||||
|
||||
MongoImage.syncIndexes();
|
||||
|
||||
@@ -1,11 +1,8 @@
|
||||
import type { NextApiRequest, NextApiResponse } from 'next';
|
||||
import { customAlphabet } from 'nanoid';
|
||||
import multer from 'multer';
|
||||
import path from 'path';
|
||||
import { BucketNameEnum, bucketNameMap } from '@fastgpt/global/common/file/constants';
|
||||
import fs from 'fs';
|
||||
|
||||
const nanoid = customAlphabet('1234567890abcdef', 12);
|
||||
import { getNanoid } from '@fastgpt/global/common/string/tools';
|
||||
|
||||
type FileType = {
|
||||
fieldname: string;
|
||||
@@ -17,7 +14,7 @@ type FileType = {
|
||||
size: number;
|
||||
};
|
||||
|
||||
export function getUploadModel({ maxSize = 500 }: { maxSize?: number }) {
|
||||
export const getUploadModel = ({ maxSize = 500 }: { maxSize?: number }) => {
|
||||
maxSize *= 1024 * 1024;
|
||||
class UploadModel {
|
||||
uploader = multer({
|
||||
@@ -26,17 +23,25 @@ export function getUploadModel({ maxSize = 500 }: { maxSize?: number }) {
|
||||
},
|
||||
preservePath: true,
|
||||
storage: multer.diskStorage({
|
||||
filename: (_req, file, cb) => {
|
||||
// destination: (_req, _file, cb) => {
|
||||
// cb(null, tmpFileDirPath);
|
||||
// },
|
||||
filename: async (req, file, cb) => {
|
||||
const { ext } = path.parse(decodeURIComponent(file.originalname));
|
||||
cb(null, nanoid() + ext);
|
||||
cb(null, `${getNanoid()}${ext}`);
|
||||
}
|
||||
})
|
||||
}).any();
|
||||
}).single('file');
|
||||
|
||||
async doUpload<T = Record<string, any>>(req: NextApiRequest, res: NextApiResponse) {
|
||||
async doUpload<T = Record<string, any>>(
|
||||
req: NextApiRequest,
|
||||
res: NextApiResponse,
|
||||
originBuckerName?: `${BucketNameEnum}`
|
||||
) {
|
||||
return new Promise<{
|
||||
files: FileType[];
|
||||
metadata: T;
|
||||
file: FileType;
|
||||
metadata: Record<string, any>;
|
||||
data: T;
|
||||
bucketName?: `${BucketNameEnum}`;
|
||||
}>((resolve, reject) => {
|
||||
// @ts-ignore
|
||||
@@ -46,25 +51,33 @@ export function getUploadModel({ maxSize = 500 }: { maxSize?: number }) {
|
||||
}
|
||||
|
||||
// check bucket name
|
||||
const bucketName = req.body?.bucketName as `${BucketNameEnum}`;
|
||||
const bucketName = (req.body?.bucketName || originBuckerName) as `${BucketNameEnum}`;
|
||||
if (bucketName && !bucketNameMap[bucketName]) {
|
||||
return reject('BucketName is invalid');
|
||||
}
|
||||
|
||||
// @ts-ignore
|
||||
const file = req.file as FileType;
|
||||
|
||||
resolve({
|
||||
...req.body,
|
||||
files:
|
||||
// @ts-ignore
|
||||
req.files?.map((file) => ({
|
||||
...file,
|
||||
originalname: decodeURIComponent(file.originalname)
|
||||
})) || [],
|
||||
file: {
|
||||
...file,
|
||||
originalname: decodeURIComponent(file.originalname)
|
||||
},
|
||||
bucketName,
|
||||
metadata: (() => {
|
||||
if (!req.body?.metadata) return {};
|
||||
try {
|
||||
return JSON.parse(req.body.metadata);
|
||||
} catch (error) {
|
||||
console.log(error);
|
||||
return {};
|
||||
}
|
||||
})(),
|
||||
data: (() => {
|
||||
if (!req.body?.data) return {};
|
||||
try {
|
||||
return JSON.parse(req.body.data);
|
||||
} catch (error) {
|
||||
return {};
|
||||
}
|
||||
})()
|
||||
@@ -75,14 +88,4 @@ export function getUploadModel({ maxSize = 500 }: { maxSize?: number }) {
|
||||
}
|
||||
|
||||
return new UploadModel();
|
||||
}
|
||||
|
||||
export const removeFilesByPaths = (paths: string[]) => {
|
||||
paths.forEach((path) => {
|
||||
fs.unlink(path, (err) => {
|
||||
if (err) {
|
||||
console.error(err);
|
||||
}
|
||||
});
|
||||
});
|
||||
};
|
||||
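In `doUpload`, both the `metadata` and the new `data` form fields arrive as JSON strings and fall back to `{}` when missing or malformed. The two inline parsers follow the same pattern, which can be sketched as one small function; `parseJsonField` is a hypothetical name, and unlike the diff this version also treats non-string input as missing.

```typescript
// Hypothetical helper: parse a multipart form field that is expected to hold JSON.
// Returns an empty object instead of throwing, matching the fallback used in doUpload.
const parseJsonField = <T = Record<string, unknown>>(raw: unknown): T => {
  if (typeof raw !== 'string' || raw === '') return {} as T;
  try {
    return JSON.parse(raw) as T;
  } catch {
    // Malformed JSON is tolerated: the upload proceeds with no extra fields.
    return {} as T;
  }
};

const parsedOk = parseJsonField<{ datasetId: string }>('{"datasetId":"abc"}');
const parsedBad = parseJsonField('not json');
```

Swallowing the parse error keeps a bad optional field from failing an otherwise valid upload, at the cost of hiding client bugs, so logging inside the `catch` (as the `metadata` branch in the diff does) may be worth keeping.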
11
packages/service/common/file/utils.ts
Normal file
@@ -0,0 +1,11 @@
|
||||
import fs from 'fs';
|
||||
|
||||
export const removeFilesByPaths = (paths: string[]) => {
|
||||
paths.forEach((path) => {
|
||||
fs.unlink(path, (err) => {
|
||||
if (err) {
|
||||
console.error(err);
|
||||
}
|
||||
});
|
||||
});
|
||||
};
|
||||
@@ -50,8 +50,11 @@ export const cheerioToHtml = ({
|
||||
.get()
|
||||
.join('\n');
|
||||
|
||||
const title = $('head title').text() || $('h1:first').text() || fetchUrl;
|
||||
|
||||
return {
|
||||
html,
|
||||
title,
|
||||
usedSelector
|
||||
};
|
||||
};
|
||||
@@ -61,39 +64,39 @@ export const urlsFetch = async ({
|
||||
}: UrlFetchParams): Promise<UrlFetchResponse> => {
|
||||
urlList = urlList.filter((url) => /^(http|https):\/\/[^ "]+$/.test(url));
|
||||
|
||||
const response = (
|
||||
await Promise.all(
|
||||
urlList.map(async (url) => {
|
||||
try {
|
||||
const fetchRes = await axios.get(url, {
|
||||
timeout: 30000
|
||||
});
|
||||
const response = await Promise.all(
|
||||
urlList.map(async (url) => {
|
||||
try {
|
||||
const fetchRes = await axios.get(url, {
|
||||
timeout: 30000
|
||||
});
|
||||
|
||||
const $ = cheerio.load(fetchRes.data);
|
||||
const { html, usedSelector } = cheerioToHtml({
|
||||
fetchUrl: url,
|
||||
$,
|
||||
selector
|
||||
});
|
||||
const md = await htmlToMarkdown(html);
|
||||
const $ = cheerio.load(fetchRes.data);
|
||||
const { title, html, usedSelector } = cheerioToHtml({
|
||||
fetchUrl: url,
|
||||
$,
|
||||
selector
|
||||
});
|
||||
const md = await htmlToMarkdown(html);
|
||||
|
||||
return {
|
||||
url,
|
||||
content: md,
|
||||
selector: usedSelector
|
||||
};
|
||||
} catch (error) {
|
||||
console.log(error, 'fetch error');
|
||||
return {
|
||||
url,
|
||||
title,
|
||||
content: md,
|
||||
selector: usedSelector
|
||||
};
|
||||
} catch (error) {
|
||||
console.log(error, 'fetch error');
|
||||
|
||||
return {
|
||||
url,
|
||||
content: '',
|
||||
selector: ''
|
||||
};
|
||||
}
|
||||
})
|
||||
)
|
||||
).filter((item) => item.content);
|
||||
return {
|
||||
url,
|
||||
title: '',
|
||||
content: '',
|
||||
selector: ''
|
||||
};
|
||||
}
|
||||
})
|
||||
);
|
||||
|
||||
return response;
|
||||
};
|
||||
|
||||
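Note on the `urlsFetch` change above: the trailing `.filter((item) => item.content)` is gone, so failed fetches now come back as entries with empty `content` and `title`. The following is a minimal sketch of how a caller might handle that, assuming a simplified result shape; `isFetchableUrl` and `dropEmptyResults` are hypothetical names, not part of the diff.

```typescript
// Simplified stand-in for the items returned by urlsFetch in the diff.
type FetchResult = { url: string; title: string; content: string; selector: string };

// Same pattern the diff uses: keep only http/https URLs without spaces or quotes.
const isFetchableUrl = (url: string): boolean => /^(http|https):\/\/[^ "]+$/.test(url);

// Hypothetical caller-side helper: skip entries whose fetch failed (empty content).
const dropEmptyResults = (results: FetchResult[]): FetchResult[] =>
  results.filter((item) => item.content);

const urls = ['https://example.com/a', 'ftp://example.com/b', 'http://bad url'];
const fetchable = urls.filter(isFetchableUrl);

const results: FetchResult[] = [
  { url: 'https://example.com/a', title: 'A', content: '# A', selector: 'body' },
  { url: 'https://example.com/c', title: '', content: '', selector: '' }
];
const usable = dropEmptyResults(results);
```

Callers that index `result[0]` directly, as `getCollectionAndRawText` does further down with optional chaining, would see the empty entry rather than `undefined`.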
@@ -15,7 +15,9 @@ export const htmlToMarkdown = (html?: string | null) =>
|
||||
worker.on('message', (md: string) => {
|
||||
worker.terminate();
|
||||
|
||||
resolve(simpleMarkdownText(md));
|
||||
let rawText = simpleMarkdownText(md);
|
||||
|
||||
resolve(rawText);
|
||||
});
|
||||
worker.on('error', (err) => {
|
||||
worker.terminate();
|
||||
|
||||
6
packages/service/common/system/cron.ts
Normal file
@@ -0,0 +1,6 @@
|
||||
import nodeCron from 'node-cron';
|
||||
|
||||
export const setCron = (time: string, cb: () => void) => {
|
||||
// second minute hour day month week
|
||||
return nodeCron.schedule(time, cb);
|
||||
};
|
||||
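The comment in `cron.ts` lists six fields (second, minute, hour, day, month, week). As a rough illustration of that field order, here is a small sketch that splits an expression into named parts. `describeCron` is a hypothetical helper for illustration only and does not call `node-cron`.

```typescript
// Field order taken from the comment in cron.ts: second minute hour day month week.
const CRON_FIELDS = ['second', 'minute', 'hour', 'day', 'month', 'week'] as const;
type CronField = (typeof CRON_FIELDS)[number];

// Hypothetical helper: split a six-field expression into named parts.
function describeCron(expression: string): Record<CronField, string> {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected ${CRON_FIELDS.length} fields, got ${parts.length}`);
  }
  const result = {} as Record<CronField, string>;
  CRON_FIELDS.forEach((name, i) => {
    result[name] = parts[i];
  });
  return result;
}

// Second 0, minute 0, every hour.
const everyHour = describeCron('0 0 * * * *');
```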
@@ -49,6 +49,7 @@ export const addLog = {
|
||||
},
|
||||
error(msg: string, error?: any) {
|
||||
this.log('error', msg, {
|
||||
message: error?.message,
|
||||
stack: error?.stack,
|
||||
...(error?.config && {
|
||||
config: {
|
||||
|
||||
@@ -1,19 +1,19 @@
|
||||
export type DeleteDatasetVectorProps = {
|
||||
teamId: string;
|
||||
|
||||
id?: string;
|
||||
datasetIds?: string[];
|
||||
collectionIds?: string[];
|
||||
dataIds?: string[];
|
||||
idList?: string[];
|
||||
};
|
||||
|
||||
export type InsertVectorProps = {
|
||||
teamId: string;
|
||||
tmbId: string;
|
||||
datasetId: string;
|
||||
collectionId: string;
|
||||
dataId: string;
|
||||
};
|
||||
|
||||
export type EmbeddingRecallProps = {
|
||||
similarity?: number;
|
||||
datasetIds: string[];
|
||||
similarity?: number;
|
||||
};
|
||||
|
||||
@@ -10,6 +10,7 @@ const getVectorObj = () => {
|
||||
export const initVectorStore = getVectorObj().init;
|
||||
export const deleteDatasetDataVector = getVectorObj().delete;
|
||||
export const recallFromVectorStore = getVectorObj().recall;
|
||||
export const checkVectorDataExist = getVectorObj().checkDataExist;
|
||||
export const getVectorDataByTime = getVectorObj().getVectorDataByTime;
|
||||
export const getVectorCountByTeamId = getVectorObj().getVectorCountByTeamId;
|
||||
|
||||
@@ -21,7 +22,7 @@ export const insertDatasetDataVector = async ({
|
||||
query: string;
|
||||
model: string;
|
||||
}) => {
|
||||
const { vectors, tokens } = await getVectorsByText({
|
||||
const { vectors, charsLength } = await getVectorsByText({
|
||||
model,
|
||||
input: query
|
||||
});
|
||||
@@ -31,32 +32,27 @@ export const insertDatasetDataVector = async ({
|
||||
});
|
||||
|
||||
return {
|
||||
tokens,
|
||||
charsLength,
|
||||
insertId
|
||||
};
|
||||
};
|
||||
|
||||
export const updateDatasetDataVector = async ({
|
||||
id,
|
||||
query,
|
||||
model
|
||||
}: {
|
||||
...props
|
||||
}: InsertVectorProps & {
|
||||
id: string;
|
||||
query: string;
|
||||
model: string;
|
||||
}) => {
|
||||
// get vector
|
||||
const { vectors, tokens } = await getVectorsByText({
|
||||
model,
|
||||
input: query
|
||||
// insert new vector
|
||||
const { charsLength, insertId } = await insertDatasetDataVector(props);
|
||||
|
||||
// delete old vector
|
||||
await deleteDatasetDataVector({
|
||||
teamId: props.teamId,
|
||||
id
|
||||
});
|
||||
|
||||
await getVectorObj().update({
|
||||
id,
|
||||
vectors
|
||||
});
|
||||
|
||||
return {
|
||||
tokens
|
||||
};
|
||||
return { charsLength, insertId };
|
||||
};
|
||||
|
||||
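The rewritten `updateDatasetDataVector` no longer updates a row in place: it inserts a new vector, deletes the old one, and returns the new `insertId`. The sketch below shows that pattern against an in-memory store. `MemoryVectorStore` and `replaceVector` are hypothetical stand-ins, not the PG-backed code in the diff.

```typescript
// In-memory stand-in for the vector table; ids are assigned on insert.
type VectorRow = { id: string; teamId: string; vector: number[] };

class MemoryVectorStore {
  private rows = new Map<string, VectorRow>();
  private nextId = 1;

  insert(teamId: string, vector: number[]): string {
    const id = String(this.nextId++);
    this.rows.set(id, { id, teamId, vector });
    return id;
  }

  // Deletion is scoped by team, mirroring the team_id condition added in the diff.
  delete(teamId: string, id: string): void {
    const row = this.rows.get(id);
    if (row && row.teamId === teamId) this.rows.delete(id);
  }

  has(id: string): boolean {
    return this.rows.has(id);
  }
}

// Replace a vector by inserting the new one first, then deleting the old one.
function replaceVector(
  store: MemoryVectorStore,
  teamId: string,
  oldId: string,
  vector: number[]
): { insertId: string } {
  const insertId = store.insert(teamId, vector);
  store.delete(teamId, oldId);
  return { insertId };
}

const store = new MemoryVectorStore();
const firstId = store.insert('team-1', [0.1, 0.2]);
const { insertId } = replaceVector(store, 'team-1', firstId, [0.3, 0.4]);
```

Because the row id changes on every update, callers presumably need to store the returned `insertId` in place of the old id.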
@@ -1,20 +1,20 @@
|
||||
import {
|
||||
initPg,
|
||||
insertDatasetDataVector,
|
||||
updateDatasetDataVector,
|
||||
deleteDatasetDataVector,
|
||||
embeddingRecall,
|
||||
getVectorDataByTime,
|
||||
getVectorCountByTeamId
|
||||
getVectorCountByTeamId,
|
||||
checkDataExist
|
||||
} from './controller';
|
||||
|
||||
export class PgVector {
|
||||
constructor() {}
|
||||
init = initPg;
|
||||
insert = insertDatasetDataVector;
|
||||
update = updateDatasetDataVector;
|
||||
delete = deleteDatasetDataVector;
|
||||
recall = embeddingRecall;
|
||||
checkDataExist = checkDataExist;
|
||||
getVectorCountByTeamId = getVectorCountByTeamId;
|
||||
getVectorDataByTime = getVectorDataByTime;
|
||||
}
|
||||
|
||||
@@ -4,7 +4,7 @@ import { delay } from '@fastgpt/global/common/system/utils';
|
||||
import { PgClient, connectPg } from './index';
|
||||
import { PgSearchRawType } from '@fastgpt/global/core/dataset/api';
|
||||
import { EmbeddingRecallItemType } from '../type';
|
||||
import { DeleteDatasetVectorProps, EmbeddingRecallProps } from '../controller.d';
|
||||
import { DeleteDatasetVectorProps, EmbeddingRecallProps, InsertVectorProps } from '../controller.d';
|
||||
import dayjs from 'dayjs';
|
||||
|
||||
export async function initPg() {
|
||||
@@ -16,11 +16,9 @@ export async function initPg() {
|
||||
id BIGSERIAL PRIMARY KEY,
|
||||
vector VECTOR(1536) NOT NULL,
|
||||
team_id VARCHAR(50) NOT NULL,
|
||||
tmb_id VARCHAR(50) NOT NULL,
|
||||
dataset_id VARCHAR(50) NOT NULL,
|
||||
collection_id VARCHAR(50) NOT NULL,
|
||||
data_id VARCHAR(50) NOT NULL,
|
||||
createTime TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
||||
createtime TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
||||
);
|
||||
`);
|
||||
|
||||
@@ -34,26 +32,22 @@ export async function initPg() {
|
||||
}
|
||||
}
|
||||
|
||||
export const insertDatasetDataVector = async (props: {
|
||||
teamId: string;
|
||||
tmbId: string;
|
||||
datasetId: string;
|
||||
collectionId: string;
|
||||
dataId: string;
|
||||
vectors: number[][];
|
||||
retry?: number;
|
||||
}): Promise<{ insertId: string }> => {
|
||||
const { dataId, teamId, tmbId, datasetId, collectionId, vectors, retry = 3 } = props;
|
||||
export const insertDatasetDataVector = async (
|
||||
props: InsertVectorProps & {
|
||||
vectors: number[][];
|
||||
retry?: number;
|
||||
}
|
||||
): Promise<{ insertId: string }> => {
|
||||
const { teamId, datasetId, collectionId, vectors, retry = 3 } = props;
|
||||
|
||||
try {
|
||||
const { rows } = await PgClient.insert(PgDatasetTableName, {
|
||||
values: [
|
||||
[
|
||||
{ key: 'vector', value: `[${vectors[0]}]` },
|
||||
{ key: 'team_id', value: String(teamId) },
|
||||
{ key: 'tmb_id', value: String(tmbId) },
|
||||
{ key: 'dataset_id', value: datasetId },
|
||||
{ key: 'collection_id', value: collectionId },
|
||||
{ key: 'data_id', value: String(dataId) }
|
||||
{ key: 'dataset_id', value: String(datasetId) },
|
||||
{ key: 'collection_id', value: String(collectionId) }
|
||||
]
|
||||
]
|
||||
});
|
||||
@@ -72,43 +66,33 @@ export const insertDatasetDataVector = async (props: {
|
||||
}
|
||||
};
|
||||
|
||||
export const updateDatasetDataVector = async (props: {
|
||||
id: string;
|
||||
vectors: number[][];
|
||||
retry?: number;
|
||||
}): Promise<void> => {
|
||||
const { id, vectors, retry = 2 } = props;
|
||||
try {
|
||||
// update pg
|
||||
await PgClient.update(PgDatasetTableName, {
|
||||
where: [['id', id]],
|
||||
values: [{ key: 'vector', value: `[${vectors[0]}]` }]
|
||||
});
|
||||
} catch (error) {
|
||||
if (retry <= 0) {
|
||||
return Promise.reject(error);
|
||||
}
|
||||
await delay(500);
|
||||
return updateDatasetDataVector({
|
||||
...props,
|
||||
retry: retry - 1
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
export const deleteDatasetDataVector = async (
|
||||
props: DeleteDatasetVectorProps & {
|
||||
retry?: number;
|
||||
}
|
||||
): Promise<any> => {
|
||||
const { id, datasetIds, collectionIds, dataIds, retry = 2 } = props;
|
||||
const { teamId, id, datasetIds, collectionIds, idList, retry = 2 } = props;
|
||||
|
||||
const teamIdWhere = `team_id='${String(teamId)}' AND`;
|
||||
|
||||
const where = await (() => {
|
||||
if (id) return `id=${id}`;
|
||||
if (datasetIds) return `dataset_id IN (${datasetIds.map((id) => `'${String(id)}'`).join(',')})`;
|
||||
if (collectionIds)
|
||||
return `collection_id IN (${collectionIds.map((id) => `'${String(id)}'`).join(',')})`;
|
||||
if (dataIds) return `data_id IN (${dataIds.map((id) => `'${String(id)}'`).join(',')})`;
|
||||
if (id) return `${teamIdWhere} id=${id}`;
|
||||
|
||||
if (datasetIds) {
|
||||
return `${teamIdWhere} dataset_id IN (${datasetIds
|
||||
.map((id) => `'${String(id)}'`)
|
||||
.join(',')})`;
|
||||
}
|
||||
|
||||
if (collectionIds) {
|
||||
return `${teamIdWhere} collection_id IN (${collectionIds
|
||||
.map((id) => `'${String(id)}'`)
|
||||
.join(',')})`;
|
||||
}
|
||||
|
||||
if (idList) {
|
||||
return `${teamIdWhere} id IN (${idList.map((id) => `'${String(id)}'`).join(',')})`;
|
||||
}
|
||||
return Promise.reject('deleteDatasetData: no where');
|
||||
})();
|
||||
|
||||
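To make the effect of the new `teamIdWhere` prefix concrete, here is a sketch that reproduces the where-clause construction as a pure function and shows the resulting string. `buildDeleteWhere` and `quoteList` are hypothetical names. Like the diff, the sketch interpolates values directly into SQL, so it assumes the ids are trusted internal values.

```typescript
type DeleteProps = {
  teamId: string;
  id?: string;
  datasetIds?: string[];
  collectionIds?: string[];
  idList?: string[];
};

// Quote each id and join with commas, as the diff does inline.
const quoteList = (ids: string[]): string => ids.map((id) => `'${String(id)}'`).join(',');

// Every branch is prefixed with the team condition, so a delete never crosses teams.
function buildDeleteWhere(props: DeleteProps): string {
  const teamIdWhere = `team_id='${String(props.teamId)}' AND`;

  if (props.id) return `${teamIdWhere} id=${props.id}`;
  if (props.datasetIds) return `${teamIdWhere} dataset_id IN (${quoteList(props.datasetIds)})`;
  if (props.collectionIds) {
    return `${teamIdWhere} collection_id IN (${quoteList(props.collectionIds)})`;
  }
  if (props.idList) return `${teamIdWhere} id IN (${quoteList(props.idList)})`;

  throw new Error('deleteDatasetData: no where');
}

const byCollections = buildDeleteWhere({ teamId: 't1', collectionIds: ['c1', 'c2'] });
const bySingleId = buildDeleteWhere({ teamId: 't1', id: '42' });
```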
@@ -137,13 +121,13 @@ export const embeddingRecall = async (
|
||||
): Promise<{
|
||||
results: EmbeddingRecallItemType[];
|
||||
}> => {
|
||||
const { vectors, limit, similarity = 0, datasetIds, retry = 2 } = props;
|
||||
const { datasetIds, vectors, limit, similarity = 0, retry = 2 } = props;
|
||||
|
||||
try {
|
||||
const results: any = await PgClient.query(
|
||||
`BEGIN;
|
||||
SET LOCAL hnsw.ef_search = ${global.systemEnv.pgHNSWEfSearch || 100};
|
||||
select id, collection_id, data_id, (vector <#> '[${vectors[0]}]') * -1 AS score
|
||||
select id, collection_id, (vector <#> '[${vectors[0]}]') * -1 AS score
|
||||
from ${PgDatasetTableName}
|
||||
where dataset_id IN (${datasetIds.map((id) => `'${String(id)}'`).join(',')})
|
||||
AND vector <#> '[${vectors[0]}]' < -${similarity}
|
||||
@@ -153,21 +137,10 @@ export const embeddingRecall = async (
|
||||
|
||||
const rows = results?.[2]?.rows as PgSearchRawType[];
|
||||
|
||||
// concat same data_id
|
||||
const filterRows: PgSearchRawType[] = [];
|
||||
let set = new Set<string>();
|
||||
for (const row of rows) {
|
||||
if (!set.has(row.data_id)) {
|
||||
filterRows.push(row);
|
||||
set.add(row.data_id);
|
||||
}
|
||||
}
|
||||
|
||||
return {
|
||||
results: filterRows.map((item) => ({
|
||||
results: rows.map((item) => ({
|
||||
id: item.id,
|
||||
collectionId: item.collection_id,
|
||||
dataId: item.data_id,
|
||||
score: item.score
|
||||
}))
|
||||
};
|
||||
@@ -179,7 +152,11 @@ export const embeddingRecall = async (
|
||||
}
|
||||
};
|
||||
|
||||
// bill
|
||||
export const checkDataExist = async (id: string) => {
|
||||
const { rows } = await PgClient.query(`SELECT id FROM ${PgDatasetTableName} WHERE id=${id};`);
|
||||
|
||||
return rows.length > 0;
|
||||
};
|
||||
export const getVectorCountByTeamId = async (teamId: string) => {
|
||||
const total = await PgClient.count(PgDatasetTableName, {
|
||||
where: [['team_id', String(teamId)]]
|
||||
@@ -188,15 +165,20 @@ export const getVectorCountByTeamId = async (teamId: string) => {
|
||||
return total;
|
||||
};
|
||||
export const getVectorDataByTime = async (start: Date, end: Date) => {
|
||||
const { rows } = await PgClient.query<{ id: string; data_id: string }>(`SELECT id, data_id
|
||||
const { rows } = await PgClient.query<{
|
||||
id: string;
|
||||
team_id: string;
|
||||
dataset_id: string;
|
||||
}>(`SELECT id, team_id, dataset_id
|
||||
FROM ${PgDatasetTableName}
|
||||
WHERE createTime BETWEEN '${dayjs(start).format('YYYY-MM-DD')}' AND '${dayjs(end).format(
|
||||
'YYYY-MM-DD 23:59:59'
|
||||
WHERE createtime BETWEEN '${dayjs(start).format('YYYY-MM-DD HH:mm:ss')}' AND '${dayjs(end).format(
|
||||
'YYYY-MM-DD HH:mm:ss'
|
||||
)}';
|
||||
`);
|
||||
|
||||
return rows.map((item) => ({
|
||||
id: item.id,
|
||||
dataId: item.data_id
|
||||
id: String(item.id),
|
||||
teamId: item.team_id,
|
||||
datasetId: item.dataset_id
|
||||
}));
|
||||
};
|
||||
|
||||
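The `getVectorDataByTime` change switches the range bounds from whole days (`YYYY-MM-DD` to `YYYY-MM-DD 23:59:59`) to exact timestamps. The sketch below builds the same kind of condition without `dayjs`. `formatTimestamp` and `buildTimeRangeWhere` are hypothetical helpers, and the formatting uses local time on the assumption that this matches what `dayjs(...).format(...)` produces by default.

```typescript
// Zero-pad a number to two digits.
const pad2 = (n: number): string => ('0' + n).slice(-2);

// Format a Date as 'YYYY-MM-DD HH:mm:ss' in local time.
function formatTimestamp(date: Date): string {
  const day = `${date.getFullYear()}-${pad2(date.getMonth() + 1)}-${pad2(date.getDate())}`;
  const time = `${pad2(date.getHours())}:${pad2(date.getMinutes())}:${pad2(date.getSeconds())}`;
  return `${day} ${time}`;
}

// Build the BETWEEN condition with second-level precision on both ends.
function buildTimeRangeWhere(start: Date, end: Date): string {
  return `createtime BETWEEN '${formatTimestamp(start)}' AND '${formatTimestamp(end)}'`;
}

const rangeWhere = buildTimeRangeWhere(
  new Date(2024, 0, 1, 0, 0, 0),
  new Date(2024, 0, 1, 12, 30, 0)
);
```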
@@ -7,6 +7,5 @@ declare global {
|
||||
export type EmbeddingRecallItemType = {
|
||||
id: string;
|
||||
collectionId: string;
|
||||
dataId: string;
|
||||
score: number;
|
||||
};
|
||||
|
||||
@@ -18,10 +18,9 @@ export async function getVectorsByText({
|
||||
}
|
||||
|
||||
try {
|
||||
// get chat API
|
||||
const ai = getAIApi();
|
||||
|
||||
// convert the input content into a vector
|
||||
// input text to vector
|
||||
const result = await ai.embeddings
|
||||
.create({
|
||||
model,
|
||||
@@ -32,13 +31,13 @@ export async function getVectorsByText({
|
||||
return Promise.reject('Embedding API 404');
|
||||
}
|
||||
if (!res?.data?.[0]?.embedding) {
|
||||
console.log(res?.data);
|
||||
console.log(res);
|
||||
// @ts-ignore
|
||||
return Promise.reject(res.data?.err?.message || 'Embedding API Error');
|
||||
}
|
||||
|
||||
return {
|
||||
tokens: res.usage.total_tokens || 0,
|
||||
charsLength: input.length,
|
||||
vectors: await Promise.all(res.data.map((item) => unityDimensional(item.embedding)))
|
||||
};
|
||||
});
|
||||
@@ -53,7 +52,9 @@ export async function getVectorsByText({
|
||||
|
||||
function unityDimensional(vector: number[]) {
|
||||
if (vector.length > 1536) {
|
||||
console.log(`Current vector dimension: ${vector.length}; the dimension cannot exceed 1536, so the first 1536 dimensions were kept automatically`);
|
||||
console.log(
|
||||
`The current vector dimension is ${vector.length}, and the vector dimension cannot exceed 1536. The first 1536 dimensions are automatically captured`
|
||||
);
|
||||
return vector.slice(0, 1536);
|
||||
}
|
||||
let resultVector = vector;
|
||||
|
||||
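Only the truncation branch of `unityDimensional` is visible in this hunk. A minimal sketch of that branch follows. `capDimension` is a hypothetical name, and what the real function does with shorter vectors is not shown here, so the sketch returns them unchanged.

```typescript
// Upper bound taken from the diff (the PG column is VECTOR(1536)).
const MAX_DIMENSION = 1536;

// Keep only the first MAX_DIMENSION entries of an over-long embedding.
function capDimension(vector: number[]): number[] {
  if (vector.length > MAX_DIMENSION) {
    return vector.slice(0, MAX_DIMENSION);
  }
  // Assumption: shorter vectors pass through as-is in this sketch.
  return vector;
}

const longVector = Array.from({ length: 2000 }, (_, i) => i);
const capped = capDimension(longVector);
```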
@@ -2,8 +2,7 @@ import { connectionMongo, type Model } from '../../common/mongo';
|
||||
const { Schema, model, models } = connectionMongo;
|
||||
import { ChatItemSchema as ChatItemType } from '@fastgpt/global/core/chat/type';
|
||||
import { ChatRoleMap } from '@fastgpt/global/core/chat/constants';
|
||||
import { customAlphabet } from 'nanoid';
|
||||
const nanoid = customAlphabet('abcdefghijklmnopqrstuvwxyz1234567890', 24);
|
||||
import { getNanoid } from '@fastgpt/global/common/string/tools';
|
||||
import {
|
||||
TeamCollectionName,
|
||||
TeamMemberCollectionName
|
||||
@@ -12,25 +11,9 @@ import { appCollectionName } from '../app/schema';
|
||||
import { userCollectionName } from '../../support/user/schema';
|
||||
import { ModuleOutputKeyEnum } from '@fastgpt/global/core/module/constants';
|
||||
|
||||
export const ChatItemCollectionName = 'chatitems';
|
||||
|
||||
const ChatItemSchema = new Schema({
|
||||
dataId: {
|
||||
type: String,
|
||||
require: true,
|
||||
default: () => nanoid()
|
||||
},
|
||||
appId: {
|
||||
type: Schema.Types.ObjectId,
|
||||
ref: appCollectionName,
|
||||
required: true
|
||||
},
|
||||
chatId: {
|
||||
type: String,
|
||||
require: true
|
||||
},
|
||||
userId: {
|
||||
type: Schema.Types.ObjectId,
|
||||
ref: userCollectionName
|
||||
},
|
||||
teamId: {
|
||||
type: Schema.Types.ObjectId,
|
||||
ref: TeamCollectionName,
|
||||
@@ -41,6 +24,24 @@ const ChatItemSchema = new Schema({
|
||||
ref: TeamMemberCollectionName,
|
||||
required: true
|
||||
},
|
||||
userId: {
|
||||
type: Schema.Types.ObjectId,
|
||||
ref: userCollectionName
|
||||
},
|
||||
chatId: {
|
||||
type: String,
|
||||
require: true
|
||||
},
|
||||
dataId: {
|
||||
type: String,
|
||||
require: true,
|
||||
default: () => getNanoid(22)
|
||||
},
|
||||
appId: {
|
||||
type: Schema.Types.ObjectId,
|
||||
ref: appCollectionName,
|
||||
required: true
|
||||
},
|
||||
time: {
|
||||
type: Date,
|
||||
default: () => new Date()
|
||||
@@ -80,19 +81,24 @@ const ChatItemSchema = new Schema({
|
||||
});
|
||||
|
||||
try {
|
||||
ChatItemSchema.index({ dataId: -1 });
|
||||
ChatItemSchema.index({ time: -1 });
|
||||
ChatItemSchema.index({ appId: 1 });
|
||||
ChatItemSchema.index({ chatId: 1 });
|
||||
ChatItemSchema.index({ userGoodFeedback: 1 });
|
||||
ChatItemSchema.index({ userBadFeedback: 1 });
|
||||
ChatItemSchema.index({ customFeedbacks: 1 });
|
||||
ChatItemSchema.index({ adminFeedback: 1 });
|
||||
ChatItemSchema.index({ dataId: 1 }, { background: true });
|
||||
/* delete by app;
|
||||
delete by chat id;
|
||||
get chat list;
|
||||
get chat logs;
|
||||
close custom feedback;
|
||||
*/
|
||||
ChatItemSchema.index({ appId: 1, chatId: 1, dataId: 1 }, { background: true });
|
||||
ChatItemSchema.index({ time: -1 }, { background: true });
|
||||
ChatItemSchema.index({ userGoodFeedback: 1 }, { background: true });
|
||||
ChatItemSchema.index({ userBadFeedback: 1 }, { background: true });
|
||||
ChatItemSchema.index({ customFeedbacks: 1 }, { background: true });
|
||||
ChatItemSchema.index({ adminFeedback: 1 }, { background: true });
|
||||
} catch (error) {
|
||||
console.log(error);
|
||||
}
|
||||
|
||||
export const MongoChatItem: Model<ChatItemType> =
|
||||
models['chatItem'] || model('chatItem', ChatItemSchema);
|
||||
models[ChatItemCollectionName] || model(ChatItemCollectionName, ChatItemSchema);
|
||||
|
||||
MongoChatItem.syncIndexes();
|
||||
|
||||
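The new compound index `{ appId: 1, chatId: 1, dataId: 1 }` replaces the separate single-field indexes on `appId` and `chatId`. As a simplified illustration of the prefix rule for equality filters on compound indexes, the sketch below reports which leading index keys a query can use. `usableIndexPrefix` is a hypothetical helper, and real query planning takes more factors into account.

```typescript
// Return the leading index keys that are all present in the query's equality fields.
function usableIndexPrefix(indexKeys: string[], queryFields: string[]): string[] {
  const fields = new Set(queryFields);
  const prefix: string[] = [];
  for (const key of indexKeys) {
    if (!fields.has(key)) break;
    prefix.push(key);
  }
  return prefix;
}

const chatItemIndex = ['appId', 'chatId', 'dataId'];

const deleteByApp = usableIndexPrefix(chatItemIndex, ['appId']);
const byAppAndChat = usableIndexPrefix(chatItemIndex, ['appId', 'chatId']);
const byChatOnly = usableIndexPrefix(chatItemIndex, ['chatId']);
```

Under this rule a filter on `chatId` alone matches no prefix, which may be why `getChatItems` in the same change set now filters on `{ appId, chatId }`.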
@@ -1,13 +1,12 @@
|
||||
import { connectionMongo, type Model } from '../../common/mongo';
|
||||
const { Schema, model, models } = connectionMongo;
|
||||
import { ChatSchema as ChatType } from '@fastgpt/global/core/chat/type.d';
|
||||
import { ChatRoleMap, ChatSourceMap } from '@fastgpt/global/core/chat/constants';
|
||||
import { ChatSourceMap } from '@fastgpt/global/core/chat/constants';
|
||||
import {
|
||||
TeamCollectionName,
|
||||
TeamMemberCollectionName
|
||||
} from '@fastgpt/global/support/user/team/constant';
|
||||
import { appCollectionName } from '../app/schema';
|
||||
import { ModuleOutputKeyEnum } from '@fastgpt/global/core/module/constants';
|
||||
|
||||
export const chatCollectionName = 'chat';
|
||||
|
||||
@@ -48,7 +47,8 @@ const ChatSchema = new Schema({
|
||||
default: ''
|
||||
},
|
||||
top: {
|
||||
type: Boolean
|
||||
type: Boolean,
|
||||
default: false
|
||||
},
|
||||
source: {
|
||||
type: String,
|
||||
@@ -69,34 +69,20 @@ const ChatSchema = new Schema({
|
||||
//For special storage
|
||||
type: Object,
|
||||
default: {}
|
||||
},
|
||||
content: {
|
||||
type: [
|
||||
{
|
||||
obj: {
|
||||
type: String,
|
||||
required: true,
|
||||
enum: Object.keys(ChatRoleMap)
|
||||
},
|
||||
value: {
|
||||
type: String,
|
||||
default: ''
|
||||
},
|
||||
[ModuleOutputKeyEnum.responseData]: {
|
||||
type: Array,
|
||||
default: []
|
||||
}
|
||||
}
|
||||
],
|
||||
default: []
|
||||
}
|
||||
});
|
||||
|
||||
try {
|
||||
ChatSchema.index({ appId: 1 });
|
||||
ChatSchema.index({ tmbId: 1 });
|
||||
ChatSchema.index({ shareId: 1 });
|
||||
ChatSchema.index({ updateTime: -1 });
|
||||
ChatSchema.index({ chatId: 1 }, { background: true });
|
||||
// get user history
|
||||
ChatSchema.index({ tmbId: 1, appId: 1, top: -1, updateTime: -1 }, { background: true });
|
||||
// delete by appid; clear history; init chat; update chat; auth chat;
|
||||
ChatSchema.index({ appId: 1, chatId: 1 }, { background: true });
|
||||
|
||||
// get chat logs;
|
||||
ChatSchema.index({ teamId: 1, appId: 1, updateTime: -1 }, { background: true });
|
||||
// get share chat history
|
||||
ChatSchema.index({ shareId: 1, outLinkUid: 1 }, { background: true });
|
||||
} catch (error) {
|
||||
console.log(error);
|
||||
}
|
||||
|
||||
@@ -3,10 +3,12 @@ import { MongoChatItem } from './chatItemSchema';
|
||||
import { addLog } from '../../common/system/log';
|
||||
|
||||
export async function getChatItems({
|
||||
appId,
|
||||
chatId,
|
||||
limit = 30,
|
||||
field
|
||||
}: {
|
||||
appId: string;
|
||||
chatId?: string;
|
||||
limit?: number;
|
||||
field: string;
|
||||
@@ -15,7 +17,10 @@ export async function getChatItems({
|
||||
return { history: [] };
|
||||
}
|
||||
|
||||
const history = await MongoChatItem.find({ chatId }, field).sort({ _id: -1 }).limit(limit).lean();
|
||||
const history = await MongoChatItem.find({ appId, chatId }, field)
|
||||
.sort({ _id: -1 })
|
||||
.limit(limit)
|
||||
.lean();
|
||||
|
||||
history.reverse();
|
||||
|
||||
@@ -23,10 +28,12 @@ export async function getChatItems({
|
||||
}
|
||||
|
||||
export const addCustomFeedbacks = async ({
|
||||
appId,
|
||||
chatId,
|
||||
chatItemId,
|
||||
feedbacks
|
||||
}: {
|
||||
appId: string;
|
||||
chatId?: string;
|
||||
chatItemId?: string;
|
||||
feedbacks: string[];
|
||||
|
||||
@@ -1,9 +1,20 @@
|
||||
import {
|
||||
DatasetCollectionTrainingModeEnum,
|
||||
TrainingModeEnum,
|
||||
DatasetCollectionTypeEnum
|
||||
} from '@fastgpt/global/core/dataset/constant';
|
||||
} from '@fastgpt/global/core/dataset/constants';
|
||||
import type { CreateDatasetCollectionParams } from '@fastgpt/global/core/dataset/api.d';
|
||||
import { MongoDatasetCollection } from './schema';
|
||||
import {
|
||||
CollectionWithDatasetType,
|
||||
DatasetCollectionSchemaType
|
||||
} from '@fastgpt/global/core/dataset/type';
|
||||
import { MongoDatasetTraining } from '../training/schema';
|
||||
import { delay } from '@fastgpt/global/common/system/utils';
|
||||
import { MongoDatasetData } from '../data/schema';
|
||||
import { delImgByRelatedId } from '../../../common/file/image/controller';
|
||||
import { deleteDatasetDataVector } from '../../../common/vectorStore/controller';
|
||||
import { delFileByFileIdList } from '../../../common/file/gridfs/controller';
|
||||
import { BucketNameEnum } from '@fastgpt/global/common/file/constants';
|
||||
|
||||
export async function createOneCollection({
|
||||
teamId,
|
||||
@@ -12,11 +23,15 @@ export async function createOneCollection({
|
||||
parentId,
|
||||
datasetId,
|
||||
type,
|
||||
trainingType = DatasetCollectionTrainingModeEnum.manual,
|
||||
chunkSize = 0,
|
||||
|
||||
trainingType = TrainingModeEnum.chunk,
|
||||
chunkSize = 512,
|
||||
chunkSplitter,
|
||||
qaPrompt,
|
||||
|
||||
fileId,
|
||||
rawLink,
|
||||
qaPrompt,
|
||||
|
||||
hashRawText,
|
||||
rawTextLength,
|
||||
metadata = {},
|
||||
@@ -30,11 +45,15 @@ export async function createOneCollection({
|
||||
datasetId,
|
||||
name,
|
||||
type,
|
||||
|
||||
trainingType,
|
||||
chunkSize,
|
||||
chunkSplitter,
|
||||
qaPrompt,
|
||||
|
||||
fileId,
|
||||
rawLink,
|
||||
qaPrompt,
|
||||
|
||||
rawTextLength,
|
||||
hashRawText,
|
||||
metadata
|
||||
@@ -74,26 +93,59 @@ export function createDefaultCollection({
|
||||
datasetId,
|
||||
parentId,
|
||||
type: DatasetCollectionTypeEnum.virtual,
|
||||
trainingType: DatasetCollectionTrainingModeEnum.manual,
|
||||
trainingType: TrainingModeEnum.chunk,
|
||||
chunkSize: 0,
|
||||
updateTime: new Date('2099')
|
||||
});
|
||||
}
|
||||
|
||||
// check same collection
|
||||
export const getSameRawTextCollection = async ({
|
||||
datasetId,
|
||||
hashRawText
|
||||
/**
|
||||
* delete collection and its related data
|
||||
*/
|
||||
export async function delCollectionAndRelatedSources({
|
||||
collections
|
||||
}: {
|
||||
datasetId: string;
|
||||
hashRawText?: string;
|
||||
}) => {
|
||||
if (!hashRawText) return undefined;
|
||||
collections: (CollectionWithDatasetType | DatasetCollectionSchemaType)[];
|
||||
}) {
|
||||
if (collections.length === 0) return;
|
||||
|
||||
const collection = await MongoDatasetCollection.findOne({
|
||||
datasetId,
|
||||
hashRawText
|
||||
const teamId = collections[0].teamId;
|
||||
|
||||
if (!teamId) return Promise.reject('teamId is not exist');
|
||||
|
||||
const collectionIds = collections.map((item) => String(item._id));
|
||||
const fileIdList = collections.map((item) => item?.fileId || '').filter(Boolean);
|
||||
const relatedImageIds = collections
|
||||
.map((item) => item?.metadata?.relatedImgId || '')
|
||||
.filter(Boolean);
|
||||
|
||||
// delete training data
|
||||
await MongoDatasetTraining.deleteMany({
|
||||
teamId,
|
||||
collectionId: { $in: collectionIds }
|
||||
});
|
||||
|
||||
return collection;
|
||||
};
|
||||
await delay(2000);
|
||||
|
||||
// delete dataset.datas
|
||||
await MongoDatasetData.deleteMany({ teamId, collectionId: { $in: collectionIds } });
|
||||
// delete pg data
|
||||
await deleteDatasetDataVector({ teamId, collectionIds });
|
||||
|
||||
// delete file and imgs
|
||||
await Promise.all([
|
||||
delImgByRelatedId({
|
||||
teamId,
|
||||
relateIds: relatedImageIds
|
||||
}),
|
||||
delFileByFileIdList({
|
||||
bucketName: BucketNameEnum.dataset,
|
||||
fileIdList
|
||||
})
|
||||
]);
|
||||
|
||||
// delete collections
|
||||
await MongoDatasetCollection.deleteMany({
|
||||
_id: { $in: collectionIds }
|
||||
});
|
||||
}
|
||||
|
||||
@@ -1,10 +1,7 @@
|
||||
import { connectionMongo, type Model } from '../../../common/mongo';
|
||||
const { Schema, model, models } = connectionMongo;
|
||||
import { DatasetCollectionSchemaType } from '@fastgpt/global/core/dataset/type.d';
|
||||
import {
|
||||
DatasetCollectionTrainingTypeMap,
|
||||
DatasetCollectionTypeMap
|
||||
} from '@fastgpt/global/core/dataset/constant';
|
||||
import { TrainingTypeMap, DatasetCollectionTypeMap } from '@fastgpt/global/core/dataset/constants';
|
||||
import { DatasetCollectionName } from '../schema';
|
||||
import {
|
||||
TeamCollectionName,
|
||||
@@ -56,15 +53,23 @@ const DatasetCollectionSchema = new Schema({
|
||||
type: Date,
|
||||
default: () => new Date()
|
||||
},
|
||||
|
||||
trainingType: {
|
||||
type: String,
|
||||
enum: Object.keys(DatasetCollectionTrainingTypeMap),
|
||||
enum: Object.keys(TrainingTypeMap),
|
||||
required: true
|
||||
},
|
||||
chunkSize: {
|
||||
type: Number,
|
||||
required: true
|
||||
},
|
||||
chunkSplitter: {
|
||||
type: String
|
||||
},
|
||||
qaPrompt: {
|
||||
type: String
|
||||
},
|
||||
|
||||
fileId: {
|
||||
type: Schema.Types.ObjectId,
|
||||
ref: 'dataset.files'
|
||||
@@ -72,9 +77,6 @@ const DatasetCollectionSchema = new Schema({
|
||||
rawLink: {
|
||||
type: String
|
||||
},
|
||||
qaPrompt: {
|
||||
type: String
|
||||
},
|
||||
|
||||
rawTextLength: {
|
||||
type: Number
|
||||
@@ -89,10 +91,19 @@ const DatasetCollectionSchema = new Schema({
|
||||
});
|
||||
|
||||
try {
|
||||
DatasetCollectionSchema.index({ datasetId: 1 });
|
||||
DatasetCollectionSchema.index({ datasetId: 1, parentId: 1 });
|
||||
DatasetCollectionSchema.index({ updateTime: -1 });
|
||||
DatasetCollectionSchema.index({ hashRawText: -1 });
|
||||
// auth file
|
||||
DatasetCollectionSchema.index({ teamId: 1, fileId: 1 }, { background: true });
|
||||
|
||||
// list collection; deep find collections
|
||||
DatasetCollectionSchema.index(
|
||||
{
|
||||
teamId: 1,
|
||||
datasetId: 1,
|
||||
parentId: 1,
|
||||
updateTime: -1
|
||||
},
|
||||
{ background: true }
|
||||
);
|
||||
} catch (error) {
|
||||
console.log(error);
|
||||
}
|
||||
|
||||
@@ -4,16 +4,32 @@ import type { ParentTreePathItemType } from '@fastgpt/global/common/parentFolder
|
||||
import { splitText2Chunks } from '@fastgpt/global/common/string/textSplitter';
|
||||
import { MongoDatasetTraining } from '../training/schema';
|
||||
import { urlsFetch } from '../../../common/string/cheerio';
|
||||
import { DatasetCollectionTypeEnum } from '@fastgpt/global/core/dataset/constant';
|
||||
import {
|
||||
DatasetCollectionTypeEnum,
|
||||
TrainingModeEnum
|
||||
} from '@fastgpt/global/core/dataset/constants';
|
||||
import { hashStr } from '@fastgpt/global/common/string/tools';
|
||||
|
||||
/**
|
||||
* get all collection by top collectionId
|
||||
*/
|
||||
export async function findCollectionAndChild(id: string, fields = '_id parentId name metadata') {
|
||||
export async function findCollectionAndChild({
|
||||
teamId,
|
||||
datasetId,
|
||||
collectionId,
|
||||
fields = '_id parentId name metadata'
|
||||
}: {
|
||||
teamId: string;
|
||||
datasetId: string;
|
||||
collectionId: string;
|
||||
fields?: string;
|
||||
}) {
|
||||
async function find(id: string) {
|
||||
// find children
|
||||
const children = await MongoDatasetCollection.find({ parentId: id }, fields);
|
||||
const children = await MongoDatasetCollection.find(
|
||||
{ teamId, datasetId, parentId: id },
|
||||
fields
|
||||
).lean();
|
||||
|
||||
let collections = children;
|
||||
|
||||
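The reworked `findCollectionAndChild` scopes every child lookup by `teamId` and `datasetId`. The sketch below shows the same recursive walk over an in-memory list instead of `MongoDatasetCollection.find`. `CollectionNode`, `findCollectionAndChildren`, and the error message are illustrative assumptions.

```typescript
type CollectionNode = {
  _id: string;
  teamId: string;
  datasetId: string;
  parentId: string | null;
};

function findCollectionAndChildren(
  all: CollectionNode[],
  teamId: string,
  datasetId: string,
  collectionId: string
): CollectionNode[] {
  // Collect direct children, then recurse into each child.
  const find = (id: string): CollectionNode[] => {
    const children = all.filter(
      (c) => c.teamId === teamId && c.datasetId === datasetId && c.parentId === id
    );
    let collections = children;
    for (const child of children) {
      collections = collections.concat(find(child._id));
    }
    return collections;
  };

  const root = all.find((c) => c._id === collectionId);
  if (!root) throw new Error('Collection not found');

  return [root, ...find(collectionId)];
}

const nodes: CollectionNode[] = [
  { _id: 'r', teamId: 't1', datasetId: 'd1', parentId: null },
  { _id: 'a', teamId: 't1', datasetId: 'd1', parentId: 'r' },
  { _id: 'b', teamId: 't1', datasetId: 'd1', parentId: 'a' },
  // Same parent id but another team: must not be returned.
  { _id: 'x', teamId: 't2', datasetId: 'd1', parentId: 'r' }
];

const tree = findCollectionAndChildren(nodes, 't1', 'd1', 'r');
```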
@@ -25,8 +41,8 @@ export async function findCollectionAndChild(id: string, fields = '_id parentId
|
||||
return collections;
|
||||
}
|
||||
const [collection, childCollections] = await Promise.all([
|
||||
MongoDatasetCollection.findById(id, fields),
|
||||
find(id)
|
||||
MongoDatasetCollection.findById(collectionId, fields),
|
||||
find(collectionId)
|
||||
]);
|
||||
|
||||
if (!collection) {
|
||||
@@ -92,8 +108,12 @@ export const getCollectionAndRawText = async ({
|
||||
return Promise.reject('Collection not found');
|
||||
}
|
||||
|
||||
const rawText = await (async () => {
|
||||
if (newRawText) return newRawText;
|
||||
const { title, rawText } = await (async () => {
|
||||
if (newRawText)
|
||||
return {
|
||||
title: '',
|
||||
rawText: newRawText
|
||||
};
|
||||
// link
|
||||
if (col.type === DatasetCollectionTypeEnum.link && col.rawLink) {
|
||||
// crawl new data
|
||||
@@ -102,19 +122,26 @@ export const getCollectionAndRawText = async ({
|
||||
selector: col.datasetId?.websiteConfig?.selector || col?.metadata?.webPageSelector
|
||||
});
|
||||
|
||||
return result[0].content;
|
||||
return {
|
||||
title: result[0]?.title,
|
||||
rawText: result[0]?.content
|
||||
};
|
||||
}
|
||||
|
||||
// file
|
||||
|
||||
return '';
|
||||
return {
|
||||
title: '',
|
||||
rawText: ''
|
||||
};
|
||||
})();
|
||||
|
||||
const hashRawText = hashStr(rawText);
|
||||
const isSameRawText = col.hashRawText === hashRawText;
|
||||
const isSameRawText = rawText && col.hashRawText === hashRawText;
|
||||
|
||||
return {
|
||||
collection: col,
|
||||
title,
|
||||
rawText,
|
||||
isSameRawText
|
||||
};
|
||||
@@ -135,6 +162,7 @@ export const reloadCollectionChunks = async ({
|
||||
rawText?: string;
|
||||
}) => {
|
||||
const {
|
||||
title,
|
||||
rawText: newRawText,
|
||||
collection: col,
|
||||
isSameRawText
|
||||
@@ -149,11 +177,15 @@ export const reloadCollectionChunks = async ({
|
||||
// split data
|
||||
const { chunks } = splitText2Chunks({
|
||||
text: newRawText,
|
||||
chunkLen: col.chunkSize || 512,
|
||||
countTokens: false
|
||||
chunkLen: col.chunkSize || 512
|
||||
});
|
||||
|
||||
// insert to training queue
|
||||
const model = await (() => {
|
||||
if (col.trainingType === TrainingModeEnum.chunk) return col.datasetId.vectorModel;
|
||||
if (col.trainingType === TrainingModeEnum.qa) return col.datasetId.agentModel;
|
||||
return Promise.reject('Training model error');
|
||||
})();
|
||||
await MongoDatasetTraining.insertMany(
|
||||
chunks.map((item, i) => ({
|
||||
teamId: col.teamId,
|
||||
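In `reloadCollectionChunks`, the training model is now picked from the collection's training mode rather than always using the vector model. Below is a small sketch of that selection as a pure function. The `TrainingMode` values and the model names are placeholders, since the real `TrainingModeEnum` is defined in `@fastgpt/global` and is not shown in this diff.

```typescript
// Local stand-in for TrainingModeEnum; the string values are assumed.
const TrainingMode = { chunk: 'chunk', qa: 'qa' } as const;

type DatasetModels = { vectorModel: string; agentModel: string };

// chunk mode trains with the embedding model, qa mode with the agent (chat) model.
function pickTrainingModel(mode: string, dataset: DatasetModels): string {
  if (mode === TrainingMode.chunk) return dataset.vectorModel;
  if (mode === TrainingMode.qa) return dataset.agentModel;
  throw new Error('Training model error');
}

const datasetModels: DatasetModels = { vectorModel: 'embedding-model', agentModel: 'chat-model' };

const chunkModel = pickTrainingModel(TrainingMode.chunk, datasetModels);
const qaModel = pickTrainingModel(TrainingMode.qa, datasetModels);
```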
@@ -163,7 +195,7 @@ export const reloadCollectionChunks = async ({
|
||||
billId,
|
||||
mode: col.trainingType,
|
||||
prompt: '',
|
||||
model: col.datasetId.vectorModel,
|
||||
model,
|
||||
q: item,
|
||||
a: '',
|
||||
chunkIndex: i
|
||||
@@ -172,6 +204,7 @@ export const reloadCollectionChunks = async ({
|
||||
|
||||
// update raw text
|
||||
await MongoDatasetCollection.findByIdAndUpdate(col._id, {
|
||||
...(title && { name: title }),
|
||||
rawTextLength: newRawText.length,
|
||||
hashRawText: hashStr(newRawText)
|
||||
});
|
||||
|
||||
@@ -1,24 +1,47 @@
|
||||
import { CollectionWithDatasetType } from '@fastgpt/global/core/dataset/type';
|
||||
import { CollectionWithDatasetType, DatasetSchemaType } from '@fastgpt/global/core/dataset/type';
|
||||
import { MongoDatasetCollection } from './collection/schema';
|
||||
import { MongoDataset } from './schema';
|
||||
import { delCollectionAndRelatedSources } from './collection/controller';
|
||||
|
||||
/* ============= dataset ========== */
|
||||
/* find all datasetId by top datasetId */
|
||||
export async function findDatasetIdTreeByTopDatasetId(
|
||||
id: string,
|
||||
result: string[] = []
|
||||
): Promise<string[]> {
|
||||
let allChildrenIds = [...result];
|
||||
export async function findDatasetAndAllChildren({
|
||||
teamId,
|
||||
datasetId,
|
||||
fields
|
||||
}: {
|
||||
teamId: string;
|
||||
datasetId: string;
|
||||
fields?: string;
|
||||
}): Promise<DatasetSchemaType[]> {
|
||||
const find = async (id: string) => {
|
||||
const children = await MongoDataset.find(
|
||||
{
|
||||
teamId,
|
||||
parentId: id
|
||||
},
|
||||
fields
|
||||
).lean();
|
||||
|
||||
// find children
|
||||
const children = await MongoDataset.find({ parentId: id });
|
||||
let datasets = children;
|
||||
|
||||
for (const child of children) {
|
||||
const grandChildrenIds = await findDatasetIdTreeByTopDatasetId(child._id, result);
|
||||
allChildrenIds = allChildrenIds.concat(grandChildrenIds);
|
||||
for (const child of children) {
|
||||
const grandChildrenIds = await find(child._id);
|
||||
datasets = datasets.concat(grandChildrenIds);
|
||||
}
|
||||
|
||||
return datasets;
|
||||
};
|
||||
const [dataset, childDatasets] = await Promise.all([
|
||||
MongoDataset.findById(datasetId),
|
||||
find(datasetId)
|
||||
]);
|
||||
|
||||
if (!dataset) {
|
||||
return Promise.reject('Dataset not found');
|
||||
}
|
||||
|
||||
return [String(id), ...allChildrenIds];
|
||||
return [dataset, ...childDatasets];
|
||||
}
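The new `findDatasetAndAllChildren` walks the dataset tree by querying the children of each parent and concatenating the results. The block below is a minimal in-memory sketch of that traversal, not the project's implementation. It assumes a hypothetical `DatasetNode` type and a plain array in place of `MongoDataset.find`; `findChildren` and `collectDatasetAndAllChildren` are illustrative names.

```typescript
// Hypothetical, simplified stand-in for a dataset document.
type DatasetNode = { _id: string; teamId: string; parentId: string | null };

// Hypothetical in-memory lookup used instead of MongoDataset.find.
function findChildren(all: DatasetNode[], teamId: string, parentId: string): DatasetNode[] {
  return all.filter((d) => d.teamId === teamId && d.parentId === parentId);
}

// Return the dataset itself followed by all of its descendants.
function collectDatasetAndAllChildren(
  all: DatasetNode[],
  teamId: string,
  datasetId: string
): DatasetNode[] {
  const root = all.find((d) => d._id === datasetId);
  if (!root) throw new Error('Dataset not found');

  // Direct children first, then each child's own subtree.
  const collect = (id: string): DatasetNode[] => {
    const children = findChildren(all, teamId, id);
    let result = children;
    for (const child of children) {
      result = result.concat(collect(child._id));
    }
    return result;
  };

  return [root, ...collect(datasetId)];
}

const sample: DatasetNode[] = [
  { _id: 'a', teamId: 't1', parentId: null },
  { _id: 'b', teamId: 't1', parentId: 'a' },
  { _id: 'c', teamId: 't1', parentId: 'b' },
  { _id: 'd', teamId: 't2', parentId: 'a' } // belongs to another team, so it is skipped
];

const ids = collectDatasetAndAllChildren(sample, 't1', 'a').map((d) => d._id);
```

Filtering by `teamId` at every level mirrors the `{ teamId, parentId: id }` query in the diff, so a child that belongs to another team is never included.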
|
||||
|
||||
export async function getCollectionWithDataset(collectionId: string) {
|
||||
@@ -30,3 +53,22 @@ export async function getCollectionWithDataset(collectionId: string) {
|
||||
}
|
||||
return data;
|
||||
}
|
||||
|
||||
/* delete all data by datasetIds */
|
||||
export async function delDatasetRelevantData({ datasets }: { datasets: DatasetSchemaType[] }) {
|
||||
if (!datasets.length) return;
|
||||
|
||||
const teamId = datasets[0].teamId;
|
||||
const datasetIds = datasets.map((item) => String(item._id));
|
||||
|
||||
// Get _id, teamId, fileId, metadata.relatedImgId for all collections
|
||||
const collections = await MongoDatasetCollection.find(
|
||||
{
|
||||
teamId,
|
||||
datasetId: { $in: datasetIds }
|
||||
},
|
||||
'_id teamId fileId metadata'
|
||||
).lean();
|
||||
|
||||
await delCollectionAndRelatedSources({ collections });
|
||||
}
|
||||
|
||||
@@ -1,81 +1,2 @@
|
||||
import { MongoDatasetData } from './schema';
|
||||
import { MongoDatasetTraining } from '../training/schema';
|
||||
import { delFileByFileIdList, delFileByMetadata } from '../../../common/file/gridfs/controller';
|
||||
import { BucketNameEnum } from '@fastgpt/global/common/file/constants';
|
||||
import { MongoDatasetCollection } from '../collection/schema';
|
||||
import { delay } from '@fastgpt/global/common/system/utils';
|
||||
import { delImgByFileIdList } from '../../../common/file/image/controller';
|
||||
import { deleteDatasetDataVector } from '../../../common/vectorStore/controller';
|
||||
|
||||
/* delete all data by datasetIds */
|
||||
export async function delDatasetRelevantData({ datasetIds }: { datasetIds: string[] }) {
|
||||
datasetIds = datasetIds.map((item) => String(item));
|
||||
|
||||
// delete training data(There could be a training mission)
|
||||
await MongoDatasetTraining.deleteMany({
|
||||
datasetId: { $in: datasetIds }
|
||||
});
|
||||
|
||||
await delay(2000);
|
||||
|
||||
// delete dataset.datas
|
||||
await MongoDatasetData.deleteMany({ datasetId: { $in: datasetIds } });
|
||||
// delete pg data
|
||||
await deleteDatasetDataVector({ datasetIds });
|
||||
|
||||
// delete collections
|
||||
await MongoDatasetCollection.deleteMany({
|
||||
datasetId: { $in: datasetIds }
|
||||
});
|
||||
|
||||
// delete related files
|
||||
await Promise.all(
|
||||
datasetIds.map((id) => delFileByMetadata({ bucketName: BucketNameEnum.dataset, datasetId: id }))
|
||||
);
|
||||
}
|
||||
/**
|
||||
* delete all data by collectionIds
|
||||
*/
|
||||
export async function delCollectionRelevantData({
|
||||
collectionIds,
|
||||
fileIds
|
||||
}: {
|
||||
collectionIds: string[];
|
||||
fileIds: string[];
|
||||
}) {
|
||||
collectionIds = collectionIds.filter(Boolean).map((item) => String(item));
|
||||
const filterFileIds = fileIds.filter(Boolean).map((item) => String(item));
|
||||
|
||||
// delete training data
|
||||
await MongoDatasetTraining.deleteMany({
|
||||
collectionId: { $in: collectionIds }
|
||||
});
|
||||
|
||||
await delay(2000);
|
||||
|
||||
// delete dataset.datas
|
||||
await MongoDatasetData.deleteMany({ collectionId: { $in: collectionIds } });
|
||||
// delete pg data
|
||||
await deleteDatasetDataVector({ collectionIds });
|
||||
|
||||
// delete collections
|
||||
await MongoDatasetCollection.deleteMany({
|
||||
_id: { $in: collectionIds }
|
||||
});
|
||||
|
||||
// delete file and imgs
|
||||
await Promise.all([
|
||||
delImgByFileIdList(filterFileIds),
|
||||
delFileByFileIdList({
|
||||
bucketName: BucketNameEnum.dataset,
|
||||
fileIdList: filterFileIds
|
||||
})
|
||||
]);
|
||||
}
|
||||
/**
|
||||
* delete one data by mongoDataId
|
||||
*/
|
||||
export async function delDatasetDataByDataId(mongoDataId: string) {
|
||||
await deleteDatasetDataVector({ dataIds: [mongoDataId] });
|
||||
await MongoDatasetData.findByIdAndDelete(mongoDataId);
|
||||
}
|
||||
|
||||
@@ -10,7 +10,7 @@ import { DatasetColCollectionName } from '../collection/schema';
|
||||
import {
|
||||
DatasetDataIndexTypeEnum,
|
||||
DatasetDataIndexTypeMap
|
||||
} from '@fastgpt/global/core/dataset/constant';
|
||||
} from '@fastgpt/global/core/dataset/constants';
|
||||
|
||||
export const DatasetDataCollectionName = 'dataset.datas';
|
||||
|
||||
@@ -71,6 +71,7 @@ const DatasetDataSchema = new Schema({
|
||||
],
|
||||
default: []
|
||||
},
|
||||
|
||||
updateTime: {
|
||||
type: Date,
|
||||
default: () => new Date()
|
||||
@@ -85,12 +86,18 @@ const DatasetDataSchema = new Schema({
|
||||
});
|
||||
|
||||
try {
|
||||
DatasetDataSchema.index({ datasetId: 1 });
|
||||
DatasetDataSchema.index({ collectionId: 1 });
|
||||
DatasetDataSchema.index({ updateTime: -1 });
|
||||
// same data check
|
||||
DatasetDataSchema.index({ teamId: 1, collectionId: 1, q: 1, a: 1 }, { background: true });
|
||||
// list collection and count data; list data
|
||||
DatasetDataSchema.index(
|
||||
{ teamId: 1, datasetId: 1, collectionId: 1, chunkIndex: 1, updateTime: -1 },
|
||||
{ background: true }
|
||||
);
|
||||
// full text index
|
||||
DatasetDataSchema.index({ datasetId: 1, fullTextToken: 'text' });
|
||||
DatasetDataSchema.index({ inited: 1 });
|
||||
DatasetDataSchema.index({ teamId: 1, datasetId: 1, fullTextToken: 'text' }, { background: true });
|
||||
// Recall vectors after data matching
|
||||
DatasetDataSchema.index({ teamId: 1, datasetId: 1, 'indexes.dataId': 1 }, { background: true });
|
||||
DatasetDataSchema.index({ updateTime: 1 }, { background: true });
|
||||
} catch (error) {
|
||||
console.log(error);
|
||||
}
|
||||
|
||||
@@ -5,7 +5,7 @@ import {
|
||||
DatasetStatusEnum,
|
||||
DatasetStatusMap,
|
||||
DatasetTypeMap
|
||||
} from '@fastgpt/global/core/dataset/constant';
|
||||
} from '@fastgpt/global/core/dataset/constants';
|
||||
import {
|
||||
TeamCollectionName,
|
||||
TeamMemberCollectionName
|
||||
@@ -92,7 +92,7 @@ const DatasetSchema = new Schema({
|
||||
});
|
||||
|
||||
try {
|
||||
DatasetSchema.index({ userId: 1 });
|
||||
DatasetSchema.index({ teamId: 1 });
|
||||
} catch (error) {
|
||||
console.log(error);
|
||||
}
|
||||
|
||||
@@ -1,5 +1,15 @@
|
||||
import { delay } from '@fastgpt/global/common/system/utils';
|
||||
import { MongoDatasetTraining } from './schema';
|
||||
import type {
|
||||
PushDatasetDataChunkProps,
|
||||
PushDatasetDataProps,
|
||||
PushDatasetDataResponse
|
||||
} from '@fastgpt/global/core/dataset/api.d';
|
||||
import { getCollectionWithDataset } from '../controller';
|
||||
import { TrainingModeEnum } from '@fastgpt/global/core/dataset/constants';
|
||||
import { simpleText } from '@fastgpt/global/common/string/tools';
|
||||
import { countPromptTokens } from '@fastgpt/global/common/string/tiktoken';
|
||||
import type { VectorModelItemType, LLMModelItemType } from '@fastgpt/global/core/ai/model.d';
|
||||
|
||||
export const lockTrainingDataByTeamId = async (teamId: string, retry = 3): Promise<any> => {
|
||||
try {
|
||||
@@ -19,3 +29,165 @@ export const lockTrainingDataByTeamId = async (teamId: string, retry = 3): Promi
|
||||
return Promise.reject(error);
|
||||
}
|
||||
};
|
||||
|
||||
export async function pushDataListToTrainingQueue({
|
||||
teamId,
|
||||
tmbId,
|
||||
collectionId,
|
||||
data,
|
||||
prompt,
|
||||
billId,
|
||||
trainingMode = TrainingModeEnum.chunk,
|
||||
|
||||
vectorModelList = [],
|
||||
qaModelList = []
|
||||
}: {
|
||||
teamId: string;
|
||||
tmbId: string;
|
||||
vectorModelList: VectorModelItemType[];
|
||||
qaModelList: LLMModelItemType[];
|
||||
} & PushDatasetDataProps): Promise<PushDatasetDataResponse> {
|
||||
const {
|
||||
datasetId: { _id: datasetId, vectorModel, agentModel }
|
||||
} = await getCollectionWithDataset(collectionId);
|
||||
|
||||
const checkModelValid = async ({ collectionId }: { collectionId: string }) => {
|
||||
if (!collectionId) return Promise.reject(`CollectionId is empty`);
|
||||
|
||||
if (trainingMode === TrainingModeEnum.chunk) {
|
||||
const vectorModelData = vectorModelList?.find((item) => item.model === vectorModel);
|
||||
if (!vectorModelData) {
|
||||
return Promise.reject(`Model ${vectorModel} is inValid`);
|
||||
}
|
||||
|
||||
return {
|
||||
maxToken: vectorModelData.maxToken * 1.5,
|
||||
model: vectorModelData.model,
|
||||
weight: vectorModelData.weight
|
||||
};
|
||||
}
|
||||
|
||||
if (trainingMode === TrainingModeEnum.qa) {
|
||||
const qaModelData = qaModelList?.find((item) => item.model === agentModel);
|
||||
if (!qaModelData) {
|
||||
return Promise.reject(`Model ${agentModel} is inValid`);
|
||||
}
|
||||
return {
|
||||
maxToken: qaModelData.maxContext * 0.8,
|
||||
model: qaModelData.model,
|
||||
weight: 0
|
||||
};
|
||||
}
|
||||
return Promise.reject(`Training mode "${trainingMode}" is inValid`);
|
||||
};
|
||||
|
||||
const { model, maxToken, weight } = await checkModelValid({
|
||||
collectionId
|
||||
});
|
||||
|
||||
// format q and a, remove empty char
|
||||
data.forEach((item) => {
|
||||
item.q = simpleText(item.q);
|
||||
item.a = simpleText(item.a);
|
||||
|
||||
item.indexes = item.indexes
|
||||
?.map((index) => {
|
||||
return {
|
||||
...index,
|
||||
text: simpleText(index.text)
|
||||
};
|
||||
})
|
||||
.filter(Boolean);
|
||||
});
|
||||
|
||||
// filter repeat or equal content
|
||||
const set = new Set();
|
||||
const filterResult: Record<string, PushDatasetDataChunkProps[]> = {
|
||||
success: [],
|
||||
overToken: [],
|
||||
repeat: [],
|
||||
error: []
|
||||
};
|
||||
|
||||
// filter repeat content
|
||||
data.forEach((item) => {
|
||||
if (!item.q) {
|
||||
filterResult.error.push(item);
|
||||
return;
|
||||
}
|
||||
|
||||
const text = item.q + item.a;
|
||||
|
||||
// count q token
|
||||
const token = countPromptTokens(item.q);
|
||||
|
||||
if (token > maxToken) {
|
||||
filterResult.overToken.push(item);
|
||||
return;
|
||||
}
|
||||
|
||||
if (set.has(text)) {
|
||||
console.log('repeat', item);
|
||||
filterResult.repeat.push(item);
|
||||
} else {
|
||||
filterResult.success.push(item);
|
||||
set.add(text);
|
||||
}
|
||||
});
|
||||
|
||||
// insert data to db
|
||||
const insertData = async (dataList: PushDatasetDataChunkProps[], retry = 3): Promise<number> => {
|
||||
try {
|
||||
const results = await MongoDatasetTraining.insertMany(
|
||||
dataList.map((item, i) => ({
|
||||
teamId,
|
||||
tmbId,
|
||||
datasetId,
|
||||
collectionId,
|
||||
billId,
|
||||
mode: trainingMode,
|
||||
prompt,
|
||||
model,
|
||||
q: item.q,
|
||||
a: item.a,
|
||||
chunkIndex: item.chunkIndex ?? i,
|
||||
weight: weight ?? 0,
|
||||
indexes: item.indexes
|
||||
}))
|
||||
);
|
||||
await delay(500);
|
||||
return results.length;
|
||||
} catch (error) {
|
||||
if (retry > 0) {
|
||||
await delay(500);
|
||||
return insertData(dataList, retry - 1);
|
||||
}
|
||||
return Promise.reject(error);
|
||||
}
|
||||
};
|
||||
|
||||
let insertLen = 0;
|
||||
const chunkSize = 50;
|
||||
const chunkList = filterResult.success.reduce(
|
||||
(acc, cur) => {
|
||||
const lastChunk = acc[acc.length - 1];
|
||||
if (lastChunk.length < chunkSize) {
|
||||
lastChunk.push(cur);
|
||||
} else {
|
||||
acc.push([cur]);
|
||||
}
|
||||
return acc;
|
||||
},
|
||||
[[]] as PushDatasetDataChunkProps[][]
|
||||
);
|
||||
for await (const chunks of chunkList) {
|
||||
insertLen += await insertData(chunks);
|
||||
}
|
||||
|
||||
delete filterResult.success;
|
||||
|
||||
return {
|
||||
insertLen,
|
||||
...filterResult
|
||||
};
|
||||
}
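`pushDataListToTrainingQueue` first sorts incoming items into `success`, `overToken`, `repeat` and `error`, then inserts the successful ones in batches of 50. The following is a rough, self-contained sketch of those two steps. It assumes a simplified `Item` type and a character-length stand-in for `countPromptTokens`; `classify`, `toBatches` and `countTokensApprox` are illustrative names, not part of the codebase.

```typescript
// Hypothetical, simplified item shape (the diff uses PushDatasetDataChunkProps).
type Item = { q: string; a?: string };

type FilterResult = { success: Item[]; overToken: Item[]; repeat: Item[]; error: Item[] };

// Assumption: a stand-in token counter; the diff calls countPromptTokens instead.
const countTokensApprox = (text: string): number => text.length;

function classify(data: Item[], maxToken: number): FilterResult {
  const seen = new Set<string>();
  const result: FilterResult = { success: [], overToken: [], repeat: [], error: [] };

  for (const item of data) {
    // Items without a question are rejected outright.
    if (!item.q) {
      result.error.push(item);
      continue;
    }
    // Only the question is counted against the token limit.
    if (countTokensApprox(item.q) > maxToken) {
      result.overToken.push(item);
      continue;
    }
    // Question plus answer is the de-duplication key.
    const key = item.q + (item.a ?? '');
    if (seen.has(key)) {
      result.repeat.push(item);
    } else {
      seen.add(key);
      result.success.push(item);
    }
  }
  return result;
}

// Split a list into consecutive batches of at most `size` items.
function toBatches<T>(list: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < list.length; i += size) {
    batches.push(list.slice(i, i + size));
  }
  return batches;
}

const filtered = classify(
  [{ q: 'hello' }, { q: 'hello' }, { q: '' }, { q: 'x'.repeat(20) }, { q: 'hi', a: 'there' }],
  10
);
const batches = toBatches(filtered.success, 1);
```

One difference from the diff: the `reduce` there is seeded with `[[]]`, so an empty input still yields a single empty batch, whereas `toBatches` returns no batches at all for an empty list.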
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
import { connectionMongo, type Model } from '../../../common/mongo';
|
||||
const { Schema, model, models } = connectionMongo;
|
||||
import { DatasetTrainingSchemaType } from '@fastgpt/global/core/dataset/type';
|
||||
import { DatasetDataIndexTypeMap, TrainingTypeMap } from '@fastgpt/global/core/dataset/constant';
|
||||
import { DatasetDataIndexTypeMap, TrainingTypeMap } from '@fastgpt/global/core/dataset/constants';
|
||||
import { DatasetColCollectionName } from '../collection/schema';
|
||||
import { DatasetCollectionName } from '../schema';
|
||||
import {
|
||||
@@ -102,10 +102,11 @@ const TrainingDataSchema = new Schema({
|
||||
});
|
||||
|
||||
try {
|
||||
// lock training data; delete training data
|
||||
TrainingDataSchema.index({ teamId: 1, collectionId: 1 });
|
||||
// get training data and sort
|
||||
TrainingDataSchema.index({ weight: -1 });
|
||||
TrainingDataSchema.index({ lockTime: 1 });
|
||||
TrainingDataSchema.index({ datasetId: 1 });
|
||||
TrainingDataSchema.index({ collectionId: 1 });
|
||||
TrainingDataSchema.index({ expireAt: 1 }, { expireAfterSeconds: 7 * 24 * 60 });
|
||||
} catch (error) {
|
||||
console.log(error);
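The `expireAt` index in this hunk is a MongoDB TTL index, and `expireAfterSeconds` is measured in seconds. The expression `7 * 24 * 60` evaluates to 10080, which is 168 minutes rather than 7 days. The diff does not say which duration is intended. The arithmetic can be checked with a short sketch; the constant names are illustrative only.

```typescript
const SECONDS_PER_MINUTE = 60;
const MINUTES_PER_HOUR = 60;
const HOURS_PER_DAY = 24;

// Value as written in the diff: 7 * 24 * 60.
const asWritten = 7 * 24 * 60;

// Seven days expressed in seconds, for comparison.
const sevenDaysInSeconds = 7 * HOURS_PER_DAY * MINUTES_PER_HOUR * SECONDS_PER_MINUTE;

// Duration of the written value when interpreted as seconds.
const asWrittenInMinutes = asWritten / SECONDS_PER_MINUTE;
```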
|
||||
|
||||
@@ -3,17 +3,17 @@
|
||||
"version": "1.0.0",
|
||||
"dependencies": {
|
||||
"@fastgpt/global": "workspace:*",
|
||||
"axios": "^1.5.1",
|
||||
"cheerio": "1.0.0-rc.12",
|
||||
"cookie": "^0.5.0",
|
||||
"dayjs": "^1.11.7",
|
||||
"encoding": "^0.1.13",
|
||||
"jsonwebtoken": "^9.0.2",
|
||||
"mongoose": "^7.0.2",
|
||||
"nanoid": "^4.0.1",
|
||||
"dayjs": "^1.11.7",
|
||||
"next": "13.5.2",
|
||||
"multer": "1.4.5-lts.1",
|
||||
"axios": "^1.5.1",
|
||||
"cheerio": "1.0.0-rc.12",
|
||||
"next": "13.5.2",
|
||||
"nextjs-cors": "^2.1.2",
|
||||
"node-cron": "^3.0.3",
|
||||
"pg": "^8.10.0",
|
||||
"tunnel": "^0.0.6"
|
||||
},
|
||||
@@ -21,6 +21,7 @@
|
||||
"@types/cookie": "^0.5.2",
|
||||
"@types/jsonwebtoken": "^9.0.3",
|
||||
"@types/multer": "^1.4.10",
|
||||
"@types/node-cron": "^3.0.11",
|
||||
"@types/pg": "^8.6.6",
|
||||
"@types/tunnel": "^0.0.4"
|
||||
}
|
||||
|
||||
@@ -1,18 +1,22 @@
|
||||
import { MongoOpenApi } from './schema';
|
||||
|
||||
export async function updateApiKeyUsedTime(id: string) {
|
||||
await MongoOpenApi.findByIdAndUpdate(id, {
|
||||
export function updateApiKeyUsedTime(id: string) {
|
||||
MongoOpenApi.findByIdAndUpdate(id, {
|
||||
lastUsedTime: new Date()
|
||||
}).catch((err) => {
|
||||
console.log('update apiKey used time error', err);
|
||||
});
|
||||
}
|
||||
|
||||
export async function updateApiKeyUsage({ apikey, usage }: { apikey: string; usage: number }) {
|
||||
await MongoOpenApi.findOneAndUpdate(
|
||||
export function updateApiKeyUsage({ apikey, usage }: { apikey: string; usage: number }) {
|
||||
MongoOpenApi.findOneAndUpdate(
|
||||
{ apiKey: apikey },
|
||||
{
|
||||
$inc: {
|
||||
usage
|
||||
}
|
||||
}
|
||||
);
|
||||
).catch((err) => {
|
||||
console.log('update apiKey usage error', err);
|
||||
});
|
||||
}
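Both helpers change from `async` functions that await the update to plain functions that start the update and attach a `.catch`. The caller therefore no longer waits for the write, and a failed write is only logged. A minimal sketch of that fire-and-forget pattern follows. It assumes a hypothetical `UpdateFn` in place of `MongoOpenApi.findByIdAndUpdate` and collects messages in an array instead of calling `console.log`.

```typescript
// Hypothetical async update standing in for MongoOpenApi.findByIdAndUpdate.
type UpdateFn = (id: string) => Promise<void>;

// Collected error messages; the diff logs to the console instead.
const errors: string[] = [];

// Start the update without awaiting it; failures are recorded, not thrown.
function touchLastUsed(update: UpdateFn, id: string): void {
  update(id).catch((err: unknown) => {
    errors.push(`update apiKey used time error: ${String(err)}`);
  });
}

// A failing update does not throw into the caller.
const failingUpdate: UpdateFn = () => Promise.reject(new Error('db down'));
touchLastUsed(failingUpdate, 'key-1');
```

The trade-off is that the caller can no longer tell whether the update succeeded, which is acceptable only for bookkeeping fields such as a last-used timestamp or a usage counter.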
|
||||
|
||||
@@ -9,17 +9,15 @@ export const updateOutLinkUsage = async ({
|
||||
shareId: string;
|
||||
total: number;
|
||||
}) => {
|
||||
try {
|
||||
await MongoOutLink.findOneAndUpdate(
|
||||
{ shareId },
|
||||
{
|
||||
$inc: { total },
|
||||
lastTime: new Date()
|
||||
}
|
||||
);
|
||||
} catch (err) {
|
||||
MongoOutLink.findOneAndUpdate(
|
||||
{ shareId },
|
||||
{
|
||||
$inc: { total },
|
||||
lastTime: new Date()
|
||||
}
|
||||
).catch((err) => {
|
||||
console.log('update shareChat error', err);
|
||||
}
|
||||
});
|
||||
};
|
||||
|
||||
export const pushResult2Remote = async ({
|
||||
|
||||
@@ -6,7 +6,7 @@ import { TeamMemberRoleEnum } from '@fastgpt/global/support/user/team/constant';
|
||||
import { parseHeaderCert } from '../controller';
|
||||
import { PermissionTypeEnum } from '@fastgpt/global/support/permission/constant';
|
||||
import { AppErrEnum } from '@fastgpt/global/common/error/code/app';
|
||||
import { getTeamInfoByTmbId } from '../../user/team/controller';
|
||||
import { getTmbInfoByTmbId } from '../../user/team/controller';
|
||||
|
||||
// Verify permission to use the model
|
||||
export async function authApp({
|
||||
@@ -24,7 +24,7 @@ export async function authApp({
|
||||
> {
|
||||
const result = await parseHeaderCert(props);
|
||||
const { teamId, tmbId } = result;
|
||||
const { role } = await getTeamInfoByTmbId({ tmbId });
|
||||
const { role } = await getTmbInfoByTmbId({ tmbId });
|
||||
|
||||
const { app, isOwner, canWrite } = await (async () => {
|
||||
// get app
|
||||
|
||||
@@ -13,8 +13,9 @@ import {
|
||||
} from '@fastgpt/global/core/dataset/type';
|
||||
import { getFileById } from '../../../common/file/gridfs/controller';
|
||||
import { BucketNameEnum } from '@fastgpt/global/common/file/constants';
|
||||
import { getTeamInfoByTmbId } from '../../user/team/controller';
|
||||
import { getTmbInfoByTmbId } from '../../user/team/controller';
|
||||
import { CommonErrEnum } from '@fastgpt/global/common/error/code/common';
|
||||
import { MongoDatasetCollection } from '../../../core/dataset/collection/schema';
|
||||
|
||||
export async function authDatasetByTmbId({
|
||||
teamId,
|
||||
@@ -27,7 +28,7 @@ export async function authDatasetByTmbId({
|
||||
datasetId: string;
|
||||
per: AuthModeType['per'];
|
||||
}) {
|
||||
const { role } = await getTeamInfoByTmbId({ tmbId });
|
||||
const { role } = await getTmbInfoByTmbId({ tmbId });
|
||||
|
||||
const { dataset, isOwner, canWrite } = await (async () => {
|
||||
const dataset = await MongoDataset.findOne({ _id: datasetId, teamId }).lean();
|
||||
@@ -107,7 +108,7 @@ export async function authDatasetCollection({
|
||||
}
|
||||
> {
|
||||
const { userId, teamId, tmbId } = await parseHeaderCert(props);
|
||||
const { role } = await getTeamInfoByTmbId({ tmbId });
|
||||
const { role } = await getTmbInfoByTmbId({ tmbId });
|
||||
|
||||
const { collection, isOwner, canWrite } = await (async () => {
|
||||
const collection = await getCollectionWithDataset(collectionId);
|
||||
@@ -163,47 +164,40 @@ export async function authDatasetFile({
|
||||
}
|
||||
> {
|
||||
const { userId, teamId, tmbId } = await parseHeaderCert(props);
|
||||
const { role } = await getTeamInfoByTmbId({ tmbId });
|
||||
|
||||
const file = await getFileById({ bucketName: BucketNameEnum.dataset, fileId });
|
||||
const [file, collection] = await Promise.all([
|
||||
getFileById({ bucketName: BucketNameEnum.dataset, fileId }),
|
||||
MongoDatasetCollection.findOne({
|
||||
teamId,
|
||||
fileId
|
||||
})
|
||||
]);
|
||||
|
||||
if (!file) {
|
||||
return Promise.reject(CommonErrEnum.fileNotFound);
|
||||
}
|
||||
|
||||
if (file.metadata.teamId !== teamId) {
|
||||
if (!collection) {
|
||||
return Promise.reject(DatasetErrEnum.unAuthDatasetFile);
|
||||
}
|
||||
|
||||
const { dataset } = await authDataset({
|
||||
...props,
|
||||
datasetId: file.metadata.datasetId,
|
||||
per
|
||||
});
|
||||
const isOwner =
|
||||
role !== TeamMemberRoleEnum.visitor &&
|
||||
(String(dataset.tmbId) === tmbId || role === TeamMemberRoleEnum.owner);
|
||||
// file role = collection role
|
||||
try {
|
||||
const { isOwner, canWrite } = await authDatasetCollection({
|
||||
...props,
|
||||
collectionId: collection._id,
|
||||
per
|
||||
});
|
||||
|
||||
const canWrite =
|
||||
isOwner ||
|
||||
(role !== TeamMemberRoleEnum.visitor && dataset.permission === PermissionTypeEnum.public);
|
||||
|
||||
if (per === 'r' && !isOwner && dataset.permission !== PermissionTypeEnum.public) {
|
||||
return {
|
||||
userId,
|
||||
teamId,
|
||||
tmbId,
|
||||
file,
|
||||
isOwner,
|
||||
canWrite
|
||||
};
|
||||
} catch (error) {
|
||||
return Promise.reject(DatasetErrEnum.unAuthDatasetFile);
|
||||
}
|
||||
if (per === 'w' && !canWrite) {
|
||||
return Promise.reject(DatasetErrEnum.unAuthDatasetFile);
|
||||
}
|
||||
if (per === 'owner' && !isOwner) {
|
||||
return Promise.reject(DatasetErrEnum.unAuthDatasetFile);
|
||||
}
|
||||
|
||||
return {
|
||||
userId,
|
||||
teamId,
|
||||
tmbId,
|
||||
file,
|
||||
isOwner,
|
||||
canWrite
|
||||
};
|
||||
}
|
||||
|
||||
@@ -2,7 +2,7 @@ import { AuthResponseType } from '@fastgpt/global/support/permission/type';
|
||||
import { AuthModeType } from '../type';
|
||||
import { OpenApiSchema } from '@fastgpt/global/support/openapi/type';
|
||||
import { parseHeaderCert } from '../controller';
|
||||
import { getTeamInfoByTmbId } from '../../user/team/controller';
|
||||
import { getTmbInfoByTmbId } from '../../user/team/controller';
|
||||
import { MongoOpenApi } from '../../openapi/schema';
|
||||
import { OpenApiErrEnum } from '@fastgpt/global/common/error/code/openapi';
|
||||
import { TeamMemberRoleEnum } from '@fastgpt/global/support/user/team/constant';
|
||||
@@ -21,7 +21,7 @@ export async function authOpenApiKeyCrud({
|
||||
const result = await parseHeaderCert(props);
|
||||
const { tmbId, teamId } = result;
|
||||
|
||||
const { role } = await getTeamInfoByTmbId({ tmbId });
|
||||
const { role } = await getTmbInfoByTmbId({ tmbId });
|
||||
|
||||
const { openapi, isOwner, canWrite } = await (async () => {
|
||||
const openapi = await MongoOpenApi.findOne({ _id: id, teamId });
|
||||
|
||||
@@ -9,7 +9,7 @@ import { MongoApp } from '../../../core/app/schema';
|
||||
import { OutLinkErrEnum } from '@fastgpt/global/common/error/code/outLink';
|
||||
import { PermissionTypeEnum } from '@fastgpt/global/support/permission/constant';
|
||||
import { AppErrEnum } from '@fastgpt/global/common/error/code/app';
|
||||
import { getTeamInfoByTmbId } from '../../user/team/controller';
|
||||
import { getTmbInfoByTmbId } from '../../user/team/controller';
|
||||
|
||||
/* crud outlink permission */
|
||||
export async function authOutLinkCrud({
|
||||
@@ -27,7 +27,7 @@ export async function authOutLinkCrud({
|
||||
const result = await parseHeaderCert(props);
|
||||
const { tmbId, teamId } = result;
|
||||
|
||||
const { role } = await getTeamInfoByTmbId({ tmbId });
|
||||
const { role } = await getTmbInfoByTmbId({ tmbId });
|
||||
|
||||
const { app, outLink, isOwner, canWrite } = await (async () => {
|
||||
const outLink = await MongoOutLink.findOne({ _id: outLinkId, teamId });
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import { AuthResponseType } from '@fastgpt/global/support/permission/type';
|
||||
import { AuthModeType } from '../type';
|
||||
import { parseHeaderCert } from '../controller';
|
||||
import { getTeamInfoByTmbId } from '../../user/team/controller';
|
||||
import { getTmbInfoByTmbId } from '../../user/team/controller';
|
||||
import { TeamMemberRoleEnum } from '@fastgpt/global/support/user/team/constant';
|
||||
import { MongoPlugin } from '../../../core/plugin/schema';
|
||||
import { PluginErrEnum } from '@fastgpt/global/common/error/code/plugin';
|
||||
@@ -23,7 +23,7 @@ export async function authPluginCrud({
|
||||
const result = await parseHeaderCert(props);
|
||||
const { tmbId, teamId } = result;
|
||||
|
||||
const { role } = await getTeamInfoByTmbId({ tmbId });
|
||||
const { role } = await getTmbInfoByTmbId({ tmbId });
|
||||
|
||||
const { plugin, isOwner, canWrite } = await (async () => {
|
||||
const plugin = await MongoPlugin.findOne({ _id: id, teamId });
|
||||
@@ -73,7 +73,7 @@ export async function authPluginCanUse({
|
||||
}
|
||||
|
||||
if (source === PluginSourceEnum.personal) {
|
||||
const { role } = await getTeamInfoByTmbId({ tmbId });
|
||||
const { role } = await getTmbInfoByTmbId({ tmbId });
|
||||
const plugin = await MongoPlugin.findOne({ _id: pluginId, teamId });
|
||||
if (!plugin) {
|
||||
return Promise.reject(PluginErrEnum.unExist);
|
||||
|
||||
@@ -3,7 +3,7 @@ import { AuthModeType } from '../type';
|
||||
import { TeamItemType } from '@fastgpt/global/support/user/team/type';
|
||||
import { TeamMemberRoleEnum } from '@fastgpt/global/support/user/team/constant';
|
||||
import { parseHeaderCert } from '../controller';
|
||||
import { getTeamInfoByTmbId } from '../../user/team/controller';
|
||||
import { getTmbInfoByTmbId } from '../../user/team/controller';
|
||||
import { UserErrEnum } from '../../../../global/common/error/code/user';
|
||||
|
||||
export async function authUserNotVisitor(props: AuthModeType): Promise<
|
||||
@@ -13,7 +13,7 @@ export async function authUserNotVisitor(props: AuthModeType): Promise<
|
||||
}
|
||||
> {
|
||||
const { userId, teamId, tmbId } = await parseHeaderCert(props);
|
||||
const team = await getTeamInfoByTmbId({ tmbId });
|
||||
const team = await getTmbInfoByTmbId({ tmbId });
|
||||
|
||||
if (team.role === TeamMemberRoleEnum.visitor) {
|
||||
return Promise.reject(UserErrEnum.binVisitor);
|
||||
@@ -38,7 +38,7 @@ export async function authUserRole(props: AuthModeType): Promise<
|
||||
}
|
||||
> {
|
||||
const result = await parseHeaderCert(props);
|
||||
const { role: userRole, canWrite } = await getTeamInfoByTmbId({ tmbId: result.tmbId });
|
||||
const { role: userRole, canWrite } = await getTmbInfoByTmbId({ tmbId: result.tmbId });
|
||||
|
||||
return {
|
||||
...result,
|
||||
|
||||
Some files were not shown because too many files have changed in this diff.