Compare commits

...

22 Commits

Author SHA1 Message Date
archer
4b8dfeef12 perf: i18n 2025-06-03 23:01:58 +08:00
archer
98b00ae86d perf: multiple menu 2025-06-03 22:53:42 +08:00
dreamer6680
c1f8d5b032 add secondary.tsx (#4946)
* add secondary.tsx

* fix

---------

Co-authored-by: dreamer6680 <146868355@qq.com>
2025-06-03 22:05:07 +08:00
archer
4adb8b7e6f perf: log 2025-06-03 22:03:12 +08:00
archer
e32ca8a3e9 perf: api dataset code 2025-06-03 21:30:54 +08:00
dreamer6680
2507997d20 Thirddatasetmd (#4942)
* add thirddataset.md

* fix thirddataset.md

* fix

* delete wrong png

---------

Co-authored-by: dreamer6680 <146868355@qq.com>
2025-06-03 21:30:54 +08:00
Archer
86f5a68d8c fix: ts (#4948) 2025-06-03 21:30:54 +08:00
Archer
92c38d9d2f Feat: Images dataset collection (#4941)
* New pic (#4858)

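* Update dataset-related types, adding image file ID and preview URL support; optimize dataset import and add an image dataset processing component; fix some i18n text; update the file upload logic to support the new features.

* Differences from the original code

* Add V4.9.10 release notes: support setting the `systemEnv.hnswMaxScanTuples` parameter for PG, optimize LLM stream call timeouts, and fix result ordering in full-text search across multiple knowledge bases. Also update the dataset indexes, removing the datasetId field to simplify queries.

* Switch to the fileId_image logic and add training queue matching logic

* Add image collection detection and optimize preview URL generation so that preview URLs are generated only when the dataset is an image collection; add related log output for debugging.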

* Refactor Docker Compose configuration to comment out exposed ports for production environments, update image versions for pgvector, fastgpt, and mcp_server, and enhance Redis service with a health check. Additionally, standardize dataset collection labels in constants and improve internationalization strings across multiple languages.

* Enhance TrainingStates component by adding internationalization support for the imageParse training mode and update defaultCounts to include imageParse mode in trainingDetail API.

* Enhance dataset import context by adding additional steps for image dataset import process and improve internationalization strings for modal buttons in the useEditTitle hook.

* Update DatasetImportContext to conditionally render MyStep component based on data source type, improving the import process for non-image datasets.

* Refactor image dataset handling by improving internationalization strings, enhancing error messages, and streamlining the preview URL generation process.

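* Upload images to the newly created dataset_collection_images table and update the logic accordingly

* Fix issues in everything except the controller

* Consolidate the image dataset logic into the controller

* Add missing i18n

* Add missing i18n

* Resolve review comments: mainly upload logic changes and component reuse

* Show icons for image names

* Fix naming issues that caused compile errors

* Remove the unneeded collectionid parts

* Clean up redundant files and change a delete button

* Resolve everything except loading and the unified imageId

* Fix icon errors

* Reuse MyPhotoView and rename imageFileId to imageId via replace-all

* Revert unnecessary file changes

* Fix errors and update fields

* Delete temporary files after a successful upload and roll back some changes

* Remove the path field, store images in gridfs, and update the create/delete code accordingly

* Fix compile errors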

---------

Co-authored-by: archer <545436317@qq.com>

* perf: image dataset

* feat: insert image

* perf: image icon

* fix: training state

---------

Co-authored-by: Zhuangzai fa <143257420+ctrlz526@users.noreply.github.com>
2025-06-03 21:30:50 +08:00
gggaaallleee
9fb5d05865 add audit (#4923)
* add audit

* update audit

* update audit
2025-06-03 21:28:26 +08:00
dependabot[bot]
b974574157 chore(deps): bump tar-fs in /plugins/webcrawler/SPIDER (#4945)
Bumps [tar-fs](https://github.com/mafintosh/tar-fs) from 3.0.8 to 3.0.9.
- [Commits](https://github.com/mafintosh/tar-fs/compare/v3.0.8...v3.0.9)

---
updated-dependencies:
- dependency-name: tar-fs
  dependency-version: 3.0.9
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-03 16:02:05 +08:00
gggaaallleee
5a5367d30b add Bocha search template (#4933)
* add bocha

* Delete packages/service/support/operationLog/util.ts
2025-05-30 21:07:49 +08:00
Archer
8ed35ffe7e Update dataset.md (#4927) 2025-05-29 18:25:59 +08:00
Archer
0f866fc552 feat: text collecion auto save for a txt file (#4924) 2025-05-29 17:57:27 +08:00
Archer
05c7ba4483 feat: Workflow node search (#4920)
* add node find (#4902)

* add node find

* plugin header

* fix

* fix

* remove

* type

* add searched status

* optimize

* perf: search nodes

---------

Co-authored-by: heheer <heheer@sealos.io>
2025-05-29 14:29:28 +08:00
heheer
fa80ce3a77 fix child app external variables (#4919) 2025-05-29 13:37:59 +08:00
Archer
830358aa72 remove invalid code (#4915) 2025-05-28 22:11:40 +08:00
Archer
02b214b3ec feat: remove buffer;fix: custom pdf parse (#4914)
* fix: doc

* fix: remove buffer

* fix: pdf parse
2025-05-28 21:48:10 +08:00
Archer
a171c7b11c perf: buffer;fix: back up split (#4913)
* perf: buffer

* fix: back up split

* fix: app limit

* doc
2025-05-28 18:18:25 +08:00
heheer
802de11363 fix runtool empty message (#4911)
* fix runtool empty message

* del unused code

* fix
2025-05-28 17:48:30 +08:00
Archer
b4ecfb0b79 Feat: Node latest version (#4905)
* node versions add keep the latest option (#4899)

* node versions add keep the latest option

* i18n

* perf: version code

* fix: ts

* hide system version

* hide system version

* hide system version

* fix: ts

* fix: ts

---------

Co-authored-by: heheer <heheer@sealos.io>
2025-05-28 10:46:32 +08:00
heheer
331b851a78 fix has tool node condition (#4907) 2025-05-28 10:34:02 +08:00
Archer
50d235c42a fix: i18n (#4898) 2025-05-27 10:45:25 +08:00
282 changed files with 6378 additions and 1715 deletions

View File

@@ -132,15 +132,15 @@ services:
# fastgpt # fastgpt
sandbox: sandbox:
container_name: sandbox container_name: sandbox
image: ghcr.io/labring/fastgpt-sandbox:v4.9.10 # git image: ghcr.io/labring/fastgpt-sandbox:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-sandbox:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-sandbox:v4.9.10-fix2 # 阿里云
networks: networks:
- fastgpt - fastgpt
restart: always restart: always
fastgpt-mcp-server: fastgpt-mcp-server:
container_name: fastgpt-mcp-server container_name: fastgpt-mcp-server
image: ghcr.io/labring/fastgpt-mcp_server:v4.9.10 # git image: ghcr.io/labring/fastgpt-mcp_server:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-mcp_server:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-mcp_server:v4.9.10-fix2 # 阿里云
ports: ports:
- 3005:3000 - 3005:3000
networks: networks:
@@ -150,8 +150,8 @@ services:
- FASTGPT_ENDPOINT=http://fastgpt:3000 - FASTGPT_ENDPOINT=http://fastgpt:3000
fastgpt: fastgpt:
container_name: fastgpt container_name: fastgpt
image: ghcr.io/labring/fastgpt:v4.9.10 # git image: ghcr.io/labring/fastgpt:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.9.10-fix2 # 阿里云
ports: ports:
- 3000:3000 - 3000:3000
networks: networks:

View File

@@ -109,15 +109,15 @@ services:
# fastgpt # fastgpt
sandbox: sandbox:
container_name: sandbox container_name: sandbox
image: ghcr.io/labring/fastgpt-sandbox:v4.9.10 # git image: ghcr.io/labring/fastgpt-sandbox:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-sandbox:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-sandbox:v4.9.10-fix2 # 阿里云
networks: networks:
- fastgpt - fastgpt
restart: always restart: always
fastgpt-mcp-server: fastgpt-mcp-server:
container_name: fastgpt-mcp-server container_name: fastgpt-mcp-server
image: ghcr.io/labring/fastgpt-mcp_server:v4.9.10 # git image: ghcr.io/labring/fastgpt-mcp_server:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-mcp_server:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-mcp_server:v4.9.10-fix2 # 阿里云
ports: ports:
- 3005:3000 - 3005:3000
networks: networks:
@@ -127,8 +127,8 @@ services:
- FASTGPT_ENDPOINT=http://fastgpt:3000 - FASTGPT_ENDPOINT=http://fastgpt:3000
fastgpt: fastgpt:
container_name: fastgpt container_name: fastgpt
image: ghcr.io/labring/fastgpt:v4.9.10 # git image: ghcr.io/labring/fastgpt:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.9.10-fix2 # 阿里云
ports: ports:
- 3000:3000 - 3000:3000
networks: networks:

View File

@@ -96,15 +96,15 @@ services:
# fastgpt # fastgpt
sandbox: sandbox:
container_name: sandbox container_name: sandbox
image: ghcr.io/labring/fastgpt-sandbox:v4.9.10 # git image: ghcr.io/labring/fastgpt-sandbox:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-sandbox:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-sandbox:v4.9.10-fix2 # 阿里云
networks: networks:
- fastgpt - fastgpt
restart: always restart: always
fastgpt-mcp-server: fastgpt-mcp-server:
container_name: fastgpt-mcp-server container_name: fastgpt-mcp-server
image: ghcr.io/labring/fastgpt-mcp_server:v4.9.10 # git image: ghcr.io/labring/fastgpt-mcp_server:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-mcp_server:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-mcp_server:v4.9.10-fix2 # 阿里云
ports: ports:
- 3005:3000 - 3005:3000
networks: networks:
@@ -114,8 +114,8 @@ services:
- FASTGPT_ENDPOINT=http://fastgpt:3000 - FASTGPT_ENDPOINT=http://fastgpt:3000
fastgpt: fastgpt:
container_name: fastgpt container_name: fastgpt
image: ghcr.io/labring/fastgpt:v4.9.10 # git image: ghcr.io/labring/fastgpt:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.9.10-fix2 # 阿里云
ports: ports:
- 3000:3000 - 3000:3000
networks: networks:

View File

@@ -72,15 +72,15 @@ services:
sandbox: sandbox:
container_name: sandbox container_name: sandbox
image: ghcr.io/labring/fastgpt-sandbox:v4.9.10 # git image: ghcr.io/labring/fastgpt-sandbox:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-sandbox:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-sandbox:v4.9.10-fix2 # 阿里云
networks: networks:
- fastgpt - fastgpt
restart: always restart: always
fastgpt-mcp-server: fastgpt-mcp-server:
container_name: fastgpt-mcp-server container_name: fastgpt-mcp-server
image: ghcr.io/labring/fastgpt-mcp_server:v4.9.10 # git image: ghcr.io/labring/fastgpt-mcp_server:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-mcp_server:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-mcp_server:v4.9.10-fix2 # 阿里云
ports: ports:
- 3005:3000 - 3005:3000
networks: networks:
@@ -90,8 +90,8 @@ services:
- FASTGPT_ENDPOINT=http://fastgpt:3000 - FASTGPT_ENDPOINT=http://fastgpt:3000
fastgpt: fastgpt:
container_name: fastgpt container_name: fastgpt
image: ghcr.io/labring/fastgpt:v4.9.10 # git image: ghcr.io/labring/fastgpt:v4.9.10-fix2 # git
# image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.9.10 # 阿里云 # image: registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.9.10-fix2 # 阿里云
ports: ports:
- 3000:3000 - 3000:3000
networks: networks:

232
dev.md
View File

@@ -1,114 +1,118 @@
## Premise ## Premise
Since FastGPT is managed in the same way as monorepo, it is recommended to install make first during development. Since FastGPT is managed in the same way as monorepo, it is recommended to install make first during development.
monorepo Project Name: monorepo Project Name:
- app: main project - app: main project
-...... -......
## Dev ## Dev
```sh ```sh
# Give automatic script code execution permission (on non-Linux systems, you can manually execute the postinstall.sh file content) # Give automatic script code execution permission (on non-Linux systems, you can manually execute the postinstall.sh file content)
chmod -R +x ./scripts/ chmod -R +x ./scripts/
# Executing under the code root directory installs all dependencies within the root package, projects, and packages # Executing under the code root directory installs all dependencies within the root package, projects, and packages
pnpm i pnpm i
# Not make cmd # Not make cmd
cd projects/app cd projects/app
pnpm dev pnpm dev
# Make cmd # Make cmd
make dev name=app make dev name=app
``` ```
Note: If the Node version is >= 20, you need to pass the `--no-node-snapshot` parameter to Node when running `pnpm i` Note: If the Node version is >= 20, you need to pass the `--no-node-snapshot` parameter to Node when running `pnpm i`
```sh ```sh
NODE_OPTIONS=--no-node-snapshot pnpm i NODE_OPTIONS=--no-node-snapshot pnpm i
``` ```
### Jest ### Jest
https://fael3z0zfze.feishu.cn/docx/ZOI1dABpxoGhS7xzhkXcKPxZnDL https://fael3z0zfze.feishu.cn/docx/ZOI1dABpxoGhS7xzhkXcKPxZnDL
## I18N ## I18N
### Install i18n-ally Plugin ### Install i18n-ally Plugin
1. Open the Extensions Marketplace in VSCode, search for and install the `i18n Ally` plugin. 1. Open the Extensions Marketplace in VSCode, search for and install the `i18n Ally` plugin.
### Code Optimization Examples ### Code Optimization Examples
#### Fetch Specific Namespace Translations in `getServerSideProps` #### Fetch Specific Namespace Translations in `getServerSideProps`
```typescript ```typescript
// pages/yourPage.tsx // pages/yourPage.tsx
export async function getServerSideProps(context: any) { export async function getServerSideProps(context: any) {
return { return {
props: { props: {
currentTab: context?.query?.currentTab || TabEnum.info, currentTab: context?.query?.currentTab || TabEnum.info,
...(await serverSideTranslations(context.locale, ['publish', 'user'])) ...(await serverSideTranslations(context.locale, ['publish', 'user']))
} }
}; };
} }
``` ```
#### Use useTranslation Hook in Page #### Use useTranslation Hook in Page
```typescript ```typescript
// pages/yourPage.tsx // pages/yourPage.tsx
import { useTranslation } from 'next-i18next'; import { useTranslation } from 'next-i18next';
const YourComponent = () => { const YourComponent = () => {
const { t } = useTranslation(); const { t } = useTranslation();
return ( return (
<Button <Button
variant="outline" variant="outline"
size="sm" size="sm"
mr={2} mr={2}
onClick={() => setShowSelected(false)} onClick={() => setShowSelected(false)}
> >
{t('common:close')} {t('common:close')}
</Button> </Button>
); );
}; };
export default YourComponent; export default YourComponent;
``` ```
#### Handle Static File Translations #### Handle Static File Translations
```typescript ```typescript
// utils/i18n.ts // utils/i18n.ts
import { i18nT } from '@fastgpt/web/i18n/utils'; import { i18nT } from '@fastgpt/web/i18n/utils';
const staticContent = { const staticContent = {
id: 'simpleChat', id: 'simpleChat',
avatar: 'core/workflow/template/aiChat', avatar: 'core/workflow/template/aiChat',
name: i18nT('app:template.simple_robot'), name: i18nT('app:template.simple_robot'),
}; };
export default staticContent; export default staticContent;
``` ```
### Standardize Translation Format ### Standardize Translation Format
- Use the t(namespace:key) format to ensure consistent naming. - Use the t(namespace:key) format to ensure consistent naming.
- Translation keys should use lowercase letters and underscores, e.g., common.close. - Translation keys should use lowercase letters and underscores, e.g., common.close.
## Build ## audit
```sh To add an audit log entry, fill in OperationLogEventEnum and the operationLog/audit function in the ts file, add the matching i18n entries, and call the addOpearationLog function at the place where the log should be recorded.
# Docker cmd: Build image, not proxy
docker build -f ./projects/app/Dockerfile -t registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.8.1 . --build-arg name=app ## Build
# Make cmd: Build image, not proxy
make build name=app image=registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.8.1 ```sh
# Docker cmd: Build image, not proxy
# Docker cmd: Build image with proxy docker build -f ./projects/app/Dockerfile -t registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.8.1 . --build-arg name=app
docker build -f ./projects/app/Dockerfile -t registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.8.1 . --build-arg name=app --build-arg proxy=taobao # Make cmd: Build image, not proxy
# Make cmd: Build image with proxy make build name=app image=registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.8.1
make build name=app image=registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.8.1 proxy=taobao
``` # Docker cmd: Build image with proxy
docker build -f ./projects/app/Dockerfile -t registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.8.1 . --build-arg name=app --build-arg proxy=taobao
# Make cmd: Build image with proxy
make build name=app image=registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt:v4.8.1 proxy=taobao
```

Binary files not shown: 23 image files added (sizes from 6.0 KiB to 110 KiB).

View File

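@@ -645,7 +645,7 @@ data is the ID of the collection.
{{< /tab >}} {{< /tab >}}
{{< /tabs >}} {{< /tabs >}}
### Create an external file library collection (commercial edition) ### Create an external file library collection (deprecated)
{{< tabs tabTotal="3" >}} {{< tabs tabTotal="3" >}}
{{< tab tabName="Request example" >}} {{< tab tabName="Request example" >}}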

View File

@@ -1,5 +1,5 @@
--- ---
title: 'V4.9.1' title: 'V4.9.1 (includes upgrade script)'
description: 'FastGPT V4.9.1 release notes' description: 'FastGPT V4.9.1 release notes'
icon: 'upgrade' icon: 'upgrade'
draft: false draft: false

View File

@@ -1,5 +1,5 @@
--- ---
title: 'V4.9.10 (in progress)' title: 'V4.9.10'
description: 'FastGPT V4.9.10 release notes' description: 'FastGPT V4.9.10 release notes'
icon: 'upgrade' icon: 'upgrade'
draft: false draft: false
@@ -15,8 +15,8 @@ weight: 790
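### 2. Update the image tag ### 2. Update the image tag
- Update the FastGPT image tag: v4.9.10 - Update the FastGPT image tag: v4.9.10-fix2
- Update the FastGPT commercial edition image tag: v4.9.10 - Update the FastGPT commercial edition image tag: v4.9.10-fix2
- mcp_server does not need updating - mcp_server does not need updating
- Sandbox does not need updating - Sandbox does not need updating
- AIProxy does not need updating - AIProxy does not need updating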

View File

@@ -0,0 +1,42 @@
---
title: 'V4.9.11 (in progress)'
description: 'FastGPT V4.9.11 release notes'
icon: 'upgrade'
draft: false
toc: true
weight: 789
---
## Run the upgrade script
Only commercial edition users need to run this script.
From any terminal, send one HTTP request. Replace {{rootkey}} with the `rootkey` from your environment variables, and {{host}} with your **FastGPT domain**.
```bash
curl --location --request POST 'https://{{host}}/api/admin/initv4911' \
--header 'rootkey: {{rootkey}}' \
--header 'Content-Type: application/json'
```
**What the script does**
1. Moves the third-party knowledge base API configuration.
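
For scripted upgrades, the same request can be sketched in TypeScript. This is a minimal sketch, not part of FastGPT: `buildUpgradeRequest`, `runUpgrade`, and the `Sender` type are hypothetical names introduced here, and the sender is passed in so that the example does not assume any particular HTTP client.

```typescript
// Hypothetical helper types; only the endpoint path and headers come from the curl command.
type UpgradeRequest = {
  url: string;
  init: { method: 'POST'; headers: Record<string, string> };
};

// Anything that can send the request and report an HTTP status (assumption of this sketch).
type Sender = (url: string, init: UpgradeRequest['init']) => Promise<{ status: number }>;

// Build the URL and headers that mirror the curl command.
const buildUpgradeRequest = (host: string, rootkey: string): UpgradeRequest => ({
  url: `https://${host}/api/admin/initv4911`,
  init: {
    method: 'POST',
    headers: {
      rootkey,
      'Content-Type': 'application/json'
    }
  }
});

// Send the request through the injected sender and return the HTTP status code.
const runUpgrade = async (host: string, rootkey: string, send: Sender): Promise<number> => {
  const { url, init } = buildUpgradeRequest(host, rootkey);
  const { status } = await send(url, init);
  return status;
};
```

In a runtime that provides a global `fetch`, that function can likely be passed as the `send` argument; a stub sender is enough for a dry run.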
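## 🚀 New features
1. The commercial edition supports image knowledge bases.
2. Added node search to workflows.
3. In workflows, sub-workflow version control now offers a "keep the latest version" option, so manual updates are no longer needed.
4. Added more audit operation logs.
## ⚙️ Improvements
1. The raw-text cache now uses gridfs storage, raising the size limit.
## 🐛 Fixes
1. In workflows, global system tools declared by an administrator could not be version-managed.
2. The context was incorrect when an interactive node preceded a tool-call node.
3. Backup imports shorter than 1000 characters could not be chunked.
4. Custom PDF parsing could not save base64 images.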

View File

@@ -1,5 +1,5 @@
--- ---
title: 'V4.9.4' title: 'V4.9.4 (includes upgrade script)'
description: 'FastGPT V4.9.4 release notes' description: 'FastGPT V4.9.4 release notes'
icon: 'upgrade' icon: 'upgrade'
draft: false draft: false

View File

@@ -0,0 +1,161 @@
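---
title: 'Third-Party Knowledge Base Development'
description: 'This section describes in detail how to integrate a third-party knowledge base into FastGPT yourself'
icon: 'language'
draft: false
toc: true
weight: 410
---
Many document libraries exist today, such as Feishu and Yuque, and different FastGPT users may rely on different ones. FastGPT currently has built-in support for the Feishu and Yuque document libraries; to integrate another one, follow this section.
## Unified interface specification
To integrate different document libraries in a uniform way, FastGPT defines an interface specification for third-party document libraries consisting of 4 interfaces; see the [API file library interface](/docs/guide/knowledge_base/api_datase).
All built-in document libraries are extensions of the standard API file library. Use the code in `FastGPT/packages/service/core/dataset/apiDataset/yuqueDataset/api.ts` as a reference when adding another document library. Four interfaces need to be implemented:
1. Get the file list
2. Get the file content or file link
3. Get the preview URL of the original document
4. Get the file details
## Starting a third-party file library
For ease of explanation, this walkthrough uses adding the Feishu knowledge base as an example.
### 1. Add the third-party document library parameters
First, open `FastGPT\packages\global\core\dataset\apiDataset.d.ts` in the FastGPT project and add a Server type for the third-party document library. For example, Yuque requires the two fields `userId` and `token` as authentication information.
```ts
export type YuqueServer = {
  userId: string;
  token?: string;
  basePath?: string;
};
```
{{% alert icon="🤖 " context="success" %}}
If the document library supports selecting a `root directory`, add a `basePath` field.
{{% /alert %}}
### 2. Create the Hook file
Each third-party document library maintains its set of API interfaces through a Hook, which contains 4 functions that must be implemented.
- Create a folder for the document library under `FastGPT\packages\service\core\dataset\apiDataset\`, then create an `api.ts` file in that folder.
- In `api.ts`, define 4 functions:
  - `listFiles`: get the file list
  - `getFileContent`: get the file content or file link
  - `getFileDetail`: get the file details
  - `getFilePreviewUrl`: get the preview URL of the original document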
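
The four functions above can be sketched as a single Hook. This is a minimal sketch under stated assumptions, not FastGPT's actual implementation: `ExampleServer`, `ExampleFileItem`, `useExampleDatasetRequest`, and the in-memory `mockFiles` list are all hypothetical stand-ins, and a real `api.ts` would call the document library's HTTP API instead.

```typescript
// Hypothetical types for this sketch; the real ones live in the FastGPT packages.
type ExampleServer = {
  baseUrl: string;
  token?: string;
  basePath?: string;
};

type ExampleFileItem = {
  id: string;
  parentId: string | null;
  name: string;
  type: 'file' | 'folder';
};

// In-memory stand-in for the remote document library, so the sketch runs without a network.
const mockFiles: ExampleFileItem[] = [
  { id: 'folder-1', parentId: null, name: 'Docs', type: 'folder' },
  { id: 'file-1', parentId: 'folder-1', name: 'intro.md', type: 'file' }
];

// Look up one item by ID, throwing if it does not exist.
const findItem = (id: string): ExampleFileItem => {
  const item = mockFiles.find((file) => file.id === id);
  if (!item) throw new Error(`Item not found: ${id}`);
  return item;
};

// Hypothetical Hook name; it bundles the four functions behind one server config.
const useExampleDatasetRequest = ({ exampleServer }: { exampleServer: ExampleServer }) => {
  // Get the file list under a parent (null means the root directory).
  const listFiles = async ({ parentId = null }: { parentId?: string | null }) =>
    mockFiles.filter((file) => file.parentId === parentId);

  // Get the file content; a real implementation might return a download link instead.
  const getFileContent = async ({ apiFileId }: { apiFileId: string }) => {
    const file = findItem(apiFileId);
    return { title: file.name, rawText: `Content of ${file.name}` };
  };

  // Get the file details.
  const getFileDetail = async ({ apiFileId }: { apiFileId: string }) => findItem(apiFileId);

  // Get the URL where the original document can be previewed.
  const getFilePreviewUrl = async ({ apiFileId }: { apiFileId: string }) =>
    `${exampleServer.baseUrl}/preview/${findItem(apiFileId).id}`;

  return { listFiles, getFileContent, getFileDetail, getFilePreviewUrl };
};
```

Returning all four functions from one factory keeps the server credentials in a single closure, so callers only pass file IDs.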
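### 3. Add configuration fields to the database
- In `packages/service/core/dataset/schema.ts`, add the configuration field for the third-party document library, with its type set to `Object`.
- In `FastGPT/packages/global/core/dataset/type.d.ts`, add the data type of the third-party document library configuration field, set to the parameter type created in step 1.
![](/imgs/thirddataset-7.png)
{{% alert icon="🤖 " context="success" %}}
After modifying `schema.ts`, restart the FastGPT project for the change to take effect.
{{% /alert %}}
### 4. Add the knowledge base type
In `projects/app/src/web/core/dataset/constants.ts`, add your own knowledge base type.
```TS
export const datasetTypeCourseMap: Record<`${DatasetTypeEnum}`, string> = {
  [DatasetTypeEnum.folder]: '',
  [DatasetTypeEnum.dataset]: '',
  [DatasetTypeEnum.apiDataset]: '/docs/guide/knowledge_base/api_dataset/',
  [DatasetTypeEnum.websiteDataset]: '/docs/guide/knowledge_base/websync/',
  [DatasetTypeEnum.feishuShare]: '/docs/guide/knowledge_base/lark_share_dataset/',
  [DatasetTypeEnum.feishuKnowledge]: '/docs/guide/knowledge_base/lark_knowledge_dataset/',
  [DatasetTypeEnum.yuque]: '/docs/guide/knowledge_base/yuque_dataset/',
  [DatasetTypeEnum.externalFile]: ''
};
```
{{% alert icon="🤖 " context="success" %}}
Add your own knowledge base type to datasetTypeCourseMap. The value inside `' '` is the path of the corresponding documentation page; add it if one exists.
Documentation goes under `FastGPT\docSite\content\zh-cn\docs\guide\knowledge_base\`.
{{% /alert %}}
## Add the frontend
Add your own i18n translations to `FastGPT\packages\web\i18n\zh-CN\dataset.json`, `FastGPT\packages\web\i18n\en\dataset.json`, and `FastGPT\packages\web\i18n\zh-Hant\dataset.json`. Taking the Chinese translation as an example, roughly the following entries are needed:
![](/imgs/thirddataset-24.png)
Add your knowledge base icons under `FastGPT\packages\web\components\common\Icon\icons\core\dataset\`. Two icons are needed, `Outline` and `Color`: the uncolored and the colored version respectively, as shown below.
![](/imgs/thirddataset-10.png)
In `FastGPT\packages\web\components\common\Icon\constants.ts`, register your icons. The `import` is the path where the icon is stored.
![](/imgs/thirddataset-9.png)
In `FastGPT\packages\global\core\dataset\constants.ts`, add your own knowledge base type.
![](/imgs/thirddataset-8.png)
{{% alert icon="🤖 " context="success" %}}
`label` is the knowledge base name you added earlier through the i18n translations.
`icon` is the icon you added earlier. For the i18n additions, see the checklist at the end.
{{% /alert %}}
In `FastGPT\projects\app\src\pages\dataset\list\index.tsx`, add the following. This file handles the menu shown after clicking the `New` button on the knowledge base list page; a knowledge base of your type can only be created after it has been added to this file.
![](/imgs/thirddataset-12.png)
In `FastGPT\projects\app\src\pageComponents\dataset\detail\Info\index.tsx`, add the following.
![](/imgs/thirddataset-18.png)
In `FastGPT\projects\app\src\pageComponents\dataset\list\CreateModal.tsx`, add the following.

| | |
| --- | --- |
| ![](/imgs/thirddataset-19.png) | ![](/imgs/thirddataset-20.png) |

In `FastGPT\projects\app\src\pageComponents\dataset\list\SideTag.tsx`, add the following.
![](/imgs/thirddataset-21.png)
In `FastGPT\projects\app\src\web\core\dataset\context\datasetPageContext.tsx`, add the following.
![](/imgs/thirddataset-23.png)
## Add the configuration form
In `FastGPT\projects\app\src\pageComponents\dataset\ApiDatasetForm.tsx`, add the following. This file handles the fields filled in on the knowledge base creation page.

| | | |
| --- | --- | --- |
| ![](/imgs/thirddataset-13.png) | ![](/imgs/thirddataset-14.png) | ![](/imgs/thirddataset-15.png) |

The two components added in the code render the root directory selection and correspond to the `getFileDetail` method of the API you designed. If your file library does not support it, you can leave them out.
```
{renderBaseUrlSelector()} // renders the `Base URL` field
{renderDirectoryModal()} // the `Select root directory` dialog shown after clicking `Select`; see the image
```

| | |
| --- | --- |
| ![](/imgs/thirddataset-16.png) | ![](/imgs/thirddataset-17.png) |

If the knowledge base needs to support a root directory, add the related content in the `ApiDatasetForm` file as well.
## Add miscellaneous items
Finally, the `server` type must be added in many files. Because the files are numerous and the changes small, they are not listed one by one. Instead, use your editor's global search to look for `YuqueServer` and `yuqueServer`, and add your own knowledge base type in each file found.
## Tips
After creating the knowledge base, test all of its features end to end to check for gaps. If something is wrong and you cannot find the corresponding file in this document, a miscellaneous item was most likely missed: repeat the global search for `YuqueServer` and `yuqueServer` and check whether any place is missing your type.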

View File

@@ -6,7 +6,8 @@ export const fileImgs = [
{ suffix: '(doc|docs)', src: 'file/fill/doc' }, { suffix: '(doc|docs)', src: 'file/fill/doc' },
{ suffix: 'txt', src: 'file/fill/txt' }, { suffix: 'txt', src: 'file/fill/txt' },
{ suffix: 'md', src: 'file/fill/markdown' }, { suffix: 'md', src: 'file/fill/markdown' },
{ suffix: 'html', src: 'file/fill/html' } { suffix: 'html', src: 'file/fill/html' },
{ suffix: '(jpg|jpeg|png|gif|bmp|webp|svg|ico|tiff|tif)', src: 'image' }
// { suffix: '.', src: '/imgs/files/file.svg' } // { suffix: '.', src: '/imgs/files/file.svg' }
]; ];

View File

@@ -2,4 +2,5 @@ export type AuthFrequencyLimitProps = {
eventId: string; eventId: string;
maxAmount: number; maxAmount: number;
expiredTime: Date; expiredTime: Date;
num?: number;
}; };

View File

@@ -34,7 +34,7 @@ export const valToStr = (val: any) => {
}; };
// replace {{variable}} to value // replace {{variable}} to value
export function replaceVariable(text: any, obj: Record<string, string | number>) { export function replaceVariable(text: any, obj: Record<string, string | number | undefined>) {
if (typeof text !== 'string') return text; if (typeof text !== 'string') return text;
for (const key in obj) { for (const key in obj) {

View File

@@ -10,6 +10,8 @@ import { AppTypeEnum } from './constants';
import { AppErrEnum } from '../../common/error/code/app'; import { AppErrEnum } from '../../common/error/code/app';
import { PluginErrEnum } from '../../common/error/code/plugin'; import { PluginErrEnum } from '../../common/error/code/plugin';
import { i18nT } from '../../../web/i18n/utils'; import { i18nT } from '../../../web/i18n/utils';
import appErrList from '../../common/error/code/app';
import pluginErrList from '../../common/error/code/plugin';
export const getDefaultAppForm = (): AppSimpleEditFormType => { export const getDefaultAppForm = (): AppSimpleEditFormType => {
return { return {
@@ -190,17 +192,10 @@ export const getAppType = (config?: WorkflowTemplateBasicType | AppSimpleEditFor
return ''; return '';
}; };
export const formatToolError = (error?: string) => { export const formatToolError = (error?: any) => {
const unExistError: Array<string> = [ if (!error || typeof error !== 'string') return;
AppErrEnum.unAuthApp,
AppErrEnum.unExist,
PluginErrEnum.unAuth,
PluginErrEnum.unExist
];
if (error && unExistError.includes(error)) { const errorText = appErrList[error]?.message || pluginErrList[error]?.message;
return i18nT('app:un_auth');
} else { return errorText || error;
return error;
}
}; };

View File

@@ -1,4 +1,9 @@
import type { ChunkSettingsType, DatasetDataIndexItemType, DatasetSchemaType } from './type'; import type {
ChunkSettingsType,
DatasetDataIndexItemType,
DatasetDataFieldType,
DatasetSchemaType
} from './type';
import type { import type {
DatasetCollectionTypeEnum, DatasetCollectionTypeEnum,
DatasetCollectionDataProcessModeEnum, DatasetCollectionDataProcessModeEnum,
@@ -7,12 +12,14 @@ import type {
ChunkTriggerConfigTypeEnum, ChunkTriggerConfigTypeEnum,
ParagraphChunkAIModeEnum ParagraphChunkAIModeEnum
} from './constants'; } from './constants';
import type { LLMModelItemType } from '../ai/model.d'; import type { ParentIdType } from '../../common/parentFolder/type';
import type { ParentIdType } from 'common/parentFolder/type';
/* ================= dataset ===================== */ /* ================= dataset ===================== */
export type DatasetUpdateBody = { export type DatasetUpdateBody = {
id: string; id: string;
apiDatasetServer?: DatasetSchemaType['apiDatasetServer'];
parentId?: ParentIdType; parentId?: ParentIdType;
name?: string; name?: string;
avatar?: string; avatar?: string;
@@ -24,9 +31,6 @@ export type DatasetUpdateBody = {
websiteConfig?: DatasetSchemaType['websiteConfig']; websiteConfig?: DatasetSchemaType['websiteConfig'];
externalReadUrl?: DatasetSchemaType['externalReadUrl']; externalReadUrl?: DatasetSchemaType['externalReadUrl'];
defaultPermission?: DatasetSchemaType['defaultPermission']; defaultPermission?: DatasetSchemaType['defaultPermission'];
apiServer?: DatasetSchemaType['apiServer'];
yuqueServer?: DatasetSchemaType['yuqueServer'];
feishuServer?: DatasetSchemaType['feishuServer'];
chunkSettings?: DatasetSchemaType['chunkSettings']; chunkSettings?: DatasetSchemaType['chunkSettings'];
// sync schedule // sync schedule
@@ -100,6 +104,9 @@ export type ExternalFileCreateDatasetCollectionParams = ApiCreateDatasetCollecti
externalFileUrl: string; externalFileUrl: string;
filename?: string; filename?: string;
}; };
export type ImageCreateDatasetCollectionParams = ApiCreateDatasetCollectionParams & {
collectionName: string;
};
/* ================= tag ===================== */ /* ================= tag ===================== */
export type CreateDatasetCollectionTagParams = { export type CreateDatasetCollectionTagParams = {
@@ -125,8 +132,9 @@ export type PgSearchRawType = {
score: number; score: number;
}; };
export type PushDatasetDataChunkProps = { export type PushDatasetDataChunkProps = {
q: string; // embedding content q?: string;
a?: string; // bonus content a?: string;
imageId?: string;
chunkIndex?: number; chunkIndex?: number;
indexes?: Omit<DatasetDataIndexItemType, 'dataId'>[]; indexes?: Omit<DatasetDataIndexItemType, 'dataId'>[];
}; };

View File

@@ -1,5 +1,5 @@
import { RequireOnlyOne } from '../../common/type/utils'; import { RequireOnlyOne } from '../../../common/type/utils';
import type { ParentIdType } from '../../common/parentFolder/type.d'; import type { ParentIdType } from '../../../common/parentFolder/type';
export type APIFileItem = { export type APIFileItem = {
id: string; id: string;
@@ -28,6 +28,12 @@ export type YuqueServer = {
basePath?: string; basePath?: string;
}; };
export type ApiDatasetServerType = {
apiServer?: APIFileServer;
feishuServer?: FeishuServer;
yuqueServer?: YuqueServer;
};
// Api dataset api // Api dataset api
export type APIFileListResponse = APIFileItem[]; export type APIFileListResponse = APIFileItem[];

View File

@@ -0,0 +1,31 @@
import type { ApiDatasetServerType } from './type';
export const filterApiDatasetServerPublicData = (apiDatasetServer?: ApiDatasetServerType) => {
if (!apiDatasetServer) return undefined;
const { apiServer, yuqueServer, feishuServer } = apiDatasetServer;
return {
apiServer: apiServer
? {
baseUrl: apiServer.baseUrl,
authorization: '',
basePath: apiServer.basePath
}
: undefined,
yuqueServer: yuqueServer
? {
userId: yuqueServer.userId,
token: '',
basePath: yuqueServer.basePath
}
: undefined,
feishuServer: feishuServer
? {
appId: feishuServer.appId,
appSecret: '',
folderToken: feishuServer.folderToken
}
: undefined
};
};

View File

@@ -6,45 +6,80 @@ export enum DatasetTypeEnum {
dataset = 'dataset', dataset = 'dataset',
websiteDataset = 'websiteDataset', // depp link websiteDataset = 'websiteDataset', // depp link
externalFile = 'externalFile', externalFile = 'externalFile',
apiDataset = 'apiDataset', apiDataset = 'apiDataset',
feishu = 'feishu', feishu = 'feishu',
yuque = 'yuque' yuque = 'yuque'
} }
export const DatasetTypeMap = {
// @ts-ignore
export const ApiDatasetTypeMap: Record<
`${DatasetTypeEnum}`,
{
icon: string;
avatar: string;
label: any;
collectionLabel: string;
courseUrl?: string;
}
> = {
[DatasetTypeEnum.apiDataset]: {
icon: 'core/dataset/externalDatasetOutline',
avatar: 'core/dataset/externalDatasetColor',
label: i18nT('dataset:api_file'),
collectionLabel: i18nT('common:File'),
courseUrl: '/docs/guide/knowledge_base/api_dataset/'
},
[DatasetTypeEnum.feishu]: {
icon: 'core/dataset/feishuDatasetOutline',
avatar: 'core/dataset/feishuDatasetColor',
label: i18nT('dataset:feishu_dataset'),
collectionLabel: i18nT('common:File'),
courseUrl: '/docs/guide/knowledge_base/lark_dataset/'
},
[DatasetTypeEnum.yuque]: {
icon: 'core/dataset/yuqueDatasetOutline',
avatar: 'core/dataset/yuqueDatasetColor',
label: i18nT('dataset:yuque_dataset'),
collectionLabel: i18nT('common:File'),
courseUrl: '/docs/guide/knowledge_base/yuque_dataset/'
}
};
export const DatasetTypeMap: Record<
`${DatasetTypeEnum}`,
{
icon: string;
avatar: string;
label: any;
collectionLabel: string;
courseUrl?: string;
}
> = {
...ApiDatasetTypeMap,
[DatasetTypeEnum.folder]: { [DatasetTypeEnum.folder]: {
icon: 'common/folderFill', icon: 'common/folderFill',
avatar: 'common/folderFill',
label: i18nT('dataset:folder_dataset'), label: i18nT('dataset:folder_dataset'),
collectionLabel: i18nT('common:Folder') collectionLabel: i18nT('common:Folder')
}, },
[DatasetTypeEnum.dataset]: { [DatasetTypeEnum.dataset]: {
icon: 'core/dataset/commonDatasetOutline', icon: 'core/dataset/commonDatasetOutline',
avatar: 'core/dataset/commonDatasetColor',
label: i18nT('dataset:common_dataset'), label: i18nT('dataset:common_dataset'),
collectionLabel: i18nT('common:File') collectionLabel: i18nT('common:File')
}, },
[DatasetTypeEnum.websiteDataset]: { [DatasetTypeEnum.websiteDataset]: {
icon: 'core/dataset/websiteDatasetOutline', icon: 'core/dataset/websiteDatasetOutline',
avatar: 'core/dataset/websiteDatasetColor',
label: i18nT('dataset:website_dataset'), label: i18nT('dataset:website_dataset'),
collectionLabel: i18nT('common:Website') collectionLabel: i18nT('common:Website'),
courseUrl: '/docs/guide/knowledge_base/websync/'
}, },
[DatasetTypeEnum.externalFile]: { [DatasetTypeEnum.externalFile]: {
icon: 'core/dataset/externalDatasetOutline', icon: 'core/dataset/externalDatasetOutline',
avatar: 'core/dataset/externalDatasetColor',
label: i18nT('dataset:external_file'), label: i18nT('dataset:external_file'),
collectionLabel: i18nT('common:File') collectionLabel: i18nT('common:File')
},
[DatasetTypeEnum.apiDataset]: {
icon: 'core/dataset/externalDatasetOutline',
label: i18nT('dataset:api_file'),
collectionLabel: i18nT('common:File')
},
[DatasetTypeEnum.feishu]: {
icon: 'core/dataset/feishuDatasetOutline',
label: i18nT('dataset:feishu_dataset'),
collectionLabel: i18nT('common:File')
},
[DatasetTypeEnum.yuque]: {
icon: 'core/dataset/yuqueDatasetOutline',
label: i18nT('dataset:yuque_dataset'),
collectionLabel: i18nT('common:File')
} }
}; };
@@ -77,7 +112,8 @@ export enum DatasetCollectionTypeEnum {
file = 'file', file = 'file',
link = 'link', // one link link = 'link', // one link
externalFile = 'externalFile', externalFile = 'externalFile',
apiFile = 'apiFile' apiFile = 'apiFile',
images = 'images'
} }
export const DatasetCollectionTypeMap = { export const DatasetCollectionTypeMap = {
[DatasetCollectionTypeEnum.folder]: { [DatasetCollectionTypeEnum.folder]: {
@@ -97,6 +133,9 @@ export const DatasetCollectionTypeMap = {
}, },
[DatasetCollectionTypeEnum.apiFile]: { [DatasetCollectionTypeEnum.apiFile]: {
name: i18nT('common:core.dataset.apiFile') name: i18nT('common:core.dataset.apiFile')
},
[DatasetCollectionTypeEnum.images]: {
name: i18nT('dataset:core.dataset.Image collection')
} }
}; };
@@ -120,6 +159,7 @@ export const DatasetCollectionSyncResultMap = {
export enum DatasetCollectionDataProcessModeEnum { export enum DatasetCollectionDataProcessModeEnum {
chunk = 'chunk', chunk = 'chunk',
qa = 'qa', qa = 'qa',
imageParse = 'imageParse',
backup = 'backup', backup = 'backup',
auto = 'auto' // abandon auto = 'auto' // abandon
@@ -133,6 +173,10 @@ export const DatasetCollectionDataProcessModeMap = {
label: i18nT('common:core.dataset.training.QA mode'), label: i18nT('common:core.dataset.training.QA mode'),
tooltip: i18nT('common:core.dataset.import.QA Import Tip') tooltip: i18nT('common:core.dataset.import.QA Import Tip')
}, },
[DatasetCollectionDataProcessModeEnum.imageParse]: {
label: i18nT('dataset:training.Image mode'),
tooltip: i18nT('common:core.dataset.import.Chunk Split Tip')
},
[DatasetCollectionDataProcessModeEnum.backup]: { [DatasetCollectionDataProcessModeEnum.backup]: {
label: i18nT('dataset:backup_mode'), label: i18nT('dataset:backup_mode'),
tooltip: i18nT('dataset:backup_mode') tooltip: i18nT('dataset:backup_mode')
@@ -172,14 +216,16 @@ export enum ImportDataSourceEnum {
fileCustom = 'fileCustom', fileCustom = 'fileCustom',
externalFile = 'externalFile', externalFile = 'externalFile',
apiDataset = 'apiDataset', apiDataset = 'apiDataset',
reTraining = 'reTraining' reTraining = 'reTraining',
imageDataset = 'imageDataset'
} }
export enum TrainingModeEnum { export enum TrainingModeEnum {
chunk = 'chunk', chunk = 'chunk',
qa = 'qa', qa = 'qa',
auto = 'auto', auto = 'auto',
image = 'image' image = 'image',
imageParse = 'imageParse'
} }
/* ------------ search -------------- */ /* ------------ search -------------- */

View File

@@ -8,17 +8,19 @@ export type CreateDatasetDataProps = {
chunkIndex?: number; chunkIndex?: number;
q: string; q: string;
a?: string; a?: string;
imageId?: string;
indexes?: Omit<DatasetDataIndexItemType, 'dataId'>[]; indexes?: Omit<DatasetDataIndexItemType, 'dataId'>[];
}; };
export type UpdateDatasetDataProps = { export type UpdateDatasetDataProps = {
dataId: string; dataId: string;
q?: string; q: string;
a?: string; a?: string;
indexes?: (Omit<DatasetDataIndexItemType, 'dataId'> & { indexes?: (Omit<DatasetDataIndexItemType, 'dataId'> & {
dataId?: string; // pg data id dataId?: string; // pg data id
})[]; })[];
imageId?: string;
}; };
export type PatchIndexesProps = export type PatchIndexesProps =

View File

@@ -0,0 +1,13 @@
export type DatasetImageSchema = {
_id: string;
teamId: string;
datasetId: string;
collectionId?: string;
name: string;
contentType: string;
size: number;
metadata?: Record<string, any>;
expiredTime?: Date;
createdAt: Date;
updatedAt: Date;
};

View File

@@ -13,9 +13,15 @@ import type {
ChunkTriggerConfigTypeEnum ChunkTriggerConfigTypeEnum
} from './constants'; } from './constants';
import type { DatasetPermission } from '../../support/permission/dataset/controller'; import type { DatasetPermission } from '../../support/permission/dataset/controller';
import type { APIFileServer, FeishuServer, YuqueServer } from './apiDataset'; import type {
ApiDatasetServerType,
APIFileServer,
FeishuServer,
YuqueServer
} from './apiDataset/type';
import type { SourceMemberType } from 'support/user/type'; import type { SourceMemberType } from 'support/user/type';
import type { DatasetDataIndexTypeEnum } from './data/constants'; import type { DatasetDataIndexTypeEnum } from './data/constants';
import type { ParentIdType } from 'common/parentFolder/type';
export type ChunkSettingsType = { export type ChunkSettingsType = {
trainingType?: DatasetCollectionDataProcessModeEnum; trainingType?: DatasetCollectionDataProcessModeEnum;
@@ -49,7 +55,7 @@ export type ChunkSettingsType = {
export type DatasetSchemaType = { export type DatasetSchemaType = {
_id: string; _id: string;
parentId?: string; parentId: ParentIdType;
userId: string; userId: string;
teamId: string; teamId: string;
tmbId: string; tmbId: string;
@@ -72,14 +78,16 @@ export type DatasetSchemaType = {
chunkSettings?: ChunkSettingsType; chunkSettings?: ChunkSettingsType;
inheritPermission: boolean; inheritPermission: boolean;
apiServer?: APIFileServer;
feishuServer?: FeishuServer; apiDatasetServer?: ApiDatasetServerType;
yuqueServer?: YuqueServer;
// abandon // abandon
autoSync?: boolean; autoSync?: boolean;
externalReadUrl?: string; externalReadUrl?: string;
defaultPermission?: number; defaultPermission?: number;
apiServer?: APIFileServer;
feishuServer?: FeishuServer;
yuqueServer?: YuqueServer;
}; };
export type DatasetCollectionSchemaType = ChunkSettingsType & { export type DatasetCollectionSchemaType = ChunkSettingsType & {
@@ -132,7 +140,13 @@ export type DatasetDataIndexItemType = {
dataId: string; // pg data id dataId: string; // pg data id
text: string; text: string;
}; };
export type DatasetDataSchemaType = {
export type DatasetDataFieldType = {
q: string; // large chunks or question
a?: string; // answer or custom content
imageId?: string;
};
export type DatasetDataSchemaType = DatasetDataFieldType & {
_id: string; _id: string;
userId: string; userId: string;
teamId: string; teamId: string;
@@ -141,13 +155,9 @@ export type DatasetDataSchemaType = {
collectionId: string; collectionId: string;
chunkIndex: number; chunkIndex: number;
updateTime: Date; updateTime: Date;
q: string; // large chunks or question history?: (DatasetDataFieldType & {
a: string; // answer or custom content
history?: {
q: string;
a: string;
updateTime: Date; updateTime: Date;
}[]; })[];
forbid?: boolean; forbid?: boolean;
fullTextToken: string; fullTextToken: string;
indexes: DatasetDataIndexItemType[]; indexes: DatasetDataIndexItemType[];
@@ -179,6 +189,7 @@ export type DatasetTrainingSchemaType = {
dataId?: string; dataId?: string;
q: string; q: string;
a: string; a: string;
imageId?: string;
chunkIndex: number; chunkIndex: number;
indexSize?: number; indexSize?: number;
weight: number; weight: number;
@@ -244,20 +255,18 @@ export type DatasetCollectionItemType = CollectionWithDatasetType & {
}; };
/* ================= data ===================== */ /* ================= data ===================== */
export type DatasetDataItemType = { export type DatasetDataItemType = DatasetDataFieldType & {
id: string; id: string;
teamId: string; teamId: string;
datasetId: string; datasetId: string;
imagePreivewUrl?: string;
updateTime: Date; updateTime: Date;
collectionId: string; collectionId: string;
sourceName: string; sourceName: string;
sourceId?: string; sourceId?: string;
q: string;
a: string;
chunkIndex: number; chunkIndex: number;
indexes: DatasetDataIndexItemType[]; indexes: DatasetDataIndexItemType[];
isOwner: boolean; isOwner: boolean;
// permission: DatasetPermission;
}; };
/* --------------- file ---------------------- */ /* --------------- file ---------------------- */
@@ -284,3 +293,14 @@ export type SearchDataResponseItemType = Omit<
score: { type: `${SearchScoreTypeEnum}`; value: number; index: number }[]; score: { type: `${SearchScoreTypeEnum}`; value: number; index: number }[];
// score: number; // score: number;
}; };
export type DatasetCiteItemType = {
_id: string;
q: string;
a?: string;
imagePreivewUrl?: string;
history?: DatasetDataSchemaType['history'];
updateTime: DatasetDataSchemaType['updateTime'];
index: DatasetDataSchemaType['chunkIndex'];
updated?: boolean;
};
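The refactor above moves `q`, `a` and `imageId` into a shared `DatasetDataFieldType` that both the record and its `history` entries reuse. The sketch below illustrates that reuse; `DataRecord` is a cut-down stand-in for `DatasetDataSchemaType`, and `applyUpdate` is a hypothetical helper that does not appear in the diff.

```typescript
// Shared content fields, as introduced in the diff.
type DatasetDataFieldType = {
  q: string; // large chunks or question
  a?: string; // answer or custom content
  imageId?: string;
};

// Reduced stand-in for DatasetDataSchemaType: only the fields needed here.
type DataRecord = DatasetDataFieldType & {
  updateTime: Date;
  history?: (DatasetDataFieldType & { updateTime: Date })[];
};

// Hypothetical helper: snapshot the current fields into history, then apply the new ones.
const applyUpdate = (record: DataRecord, next: DatasetDataFieldType, now: Date): DataRecord => {
  const { history = [], ...current } = record;
  return { ...next, updateTime: now, history: [current, ...history] };
};

const before: DataRecord = { q: 'old caption', imageId: 'img-1', updateTime: new Date(0) };
const after = applyUpdate(before, { q: 'new caption', imageId: 'img-1' }, new Date(1000));
console.log(after.q, after.history?.length);
```

Because history entries share the field type, a snapshot automatically carries `imageId` along with `q` and `a`, with no second list of fields to keep in sync.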

View File

@@ -2,10 +2,15 @@ import { TrainingModeEnum, DatasetCollectionTypeEnum } from './constants';
import { getFileIcon } from '../../common/file/icon'; import { getFileIcon } from '../../common/file/icon';
import { strIsLink } from '../../common/string/tools'; import { strIsLink } from '../../common/string/tools';
export function getCollectionIcon( export function getCollectionIcon({
type: DatasetCollectionTypeEnum = DatasetCollectionTypeEnum.file, type = DatasetCollectionTypeEnum.file,
name = '' name = '',
) { sourceId
}: {
type?: DatasetCollectionTypeEnum;
name?: string;
sourceId?: string;
}) {
if (type === DatasetCollectionTypeEnum.folder) { if (type === DatasetCollectionTypeEnum.folder) {
return 'common/folderFill'; return 'common/folderFill';
} }
@@ -15,7 +20,10 @@ export function getCollectionIcon(
if (type === DatasetCollectionTypeEnum.virtual) { if (type === DatasetCollectionTypeEnum.virtual) {
return 'file/fill/manual'; return 'file/fill/manual';
} }
return getFileIcon(name); if (type === DatasetCollectionTypeEnum.images) {
return 'core/dataset/imageFill';
}
return getSourceNameIcon({ sourceName: name, sourceId });
} }
export function getSourceNameIcon({ export function getSourceNameIcon({
sourceName, sourceName,
@@ -40,5 +48,6 @@ export function getSourceNameIcon({
export const predictDataLimitLength = (mode: TrainingModeEnum, data: any[]) => { export const predictDataLimitLength = (mode: TrainingModeEnum, data: any[]) => {
if (mode === TrainingModeEnum.qa) return data.length * 20; if (mode === TrainingModeEnum.qa) return data.length * 20;
if (mode === TrainingModeEnum.auto) return data.length * 5; if (mode === TrainingModeEnum.auto) return data.length * 5;
if (mode === TrainingModeEnum.image) return data.length * 2;
return data.length; return data.length;
}; };
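A small sketch of what the updated `predictDataLimitLength` returns per mode. The enum is re-declared locally so the snippet is self-contained, and the three-element `rows` array is made-up sample data.

```typescript
// Local copy of the enum after this change (values as shown in the constants diff).
enum TrainingModeEnum {
  chunk = 'chunk',
  qa = 'qa',
  auto = 'auto',
  image = 'image',
  imageParse = 'imageParse'
}

// Same multipliers as the diff: qa x20, auto x5, image x2, everything else x1.
const predictDataLimitLength = (mode: TrainingModeEnum, data: unknown[]) => {
  if (mode === TrainingModeEnum.qa) return data.length * 20;
  if (mode === TrainingModeEnum.auto) return data.length * 5;
  if (mode === TrainingModeEnum.image) return data.length * 2;
  return data.length;
};

const rows = ['a', 'b', 'c'];
console.log(predictDataLimitLength(TrainingModeEnum.qa, rows)); // 60
console.log(predictDataLimitLength(TrainingModeEnum.auto, rows)); // 15
console.log(predictDataLimitLength(TrainingModeEnum.image, rows)); // 6
console.log(predictDataLimitLength(TrainingModeEnum.chunk, rows)); // 3
console.log(predictDataLimitLength(TrainingModeEnum.imageParse, rows)); // 3
```

Note that only `image` gets the new multiplier; `imageParse` falls through to the default of one predicted item per input row.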

View File

@@ -59,7 +59,6 @@ export type FlowNodeCommonType = {
}; };
export type PluginDataType = { export type PluginDataType = {
version?: string;
diagram?: string; diagram?: string;
userGuide?: string; userGuide?: string;
courseUrl?: string; courseUrl?: string;
@@ -126,6 +125,7 @@ export type FlowNodeItemType = FlowNodeTemplateType & {
nodeId: string; nodeId: string;
parentNodeId?: string; parentNodeId?: string;
isError?: boolean; isError?: boolean;
searchedText?: string;
debugResult?: { debugResult?: {
status: 'running' | 'success' | 'skipped' | 'failed'; status: 'running' | 'success' | 'skipped' | 'failed';
message?: string; message?: string;

View File

@@ -1,4 +1,5 @@
export enum OperationLogEventEnum { export enum OperationLogEventEnum {
//Team
LOGIN = 'LOGIN', LOGIN = 'LOGIN',
CREATE_INVITATION_LINK = 'CREATE_INVITATION_LINK', CREATE_INVITATION_LINK = 'CREATE_INVITATION_LINK',
JOIN_TEAM = 'JOIN_TEAM', JOIN_TEAM = 'JOIN_TEAM',
@@ -11,5 +12,52 @@ export enum OperationLogEventEnum {
RELOCATE_DEPARTMENT = 'RELOCATE_DEPARTMENT', RELOCATE_DEPARTMENT = 'RELOCATE_DEPARTMENT',
CREATE_GROUP = 'CREATE_GROUP', CREATE_GROUP = 'CREATE_GROUP',
DELETE_GROUP = 'DELETE_GROUP', DELETE_GROUP = 'DELETE_GROUP',
ASSIGN_PERMISSION = 'ASSIGN_PERMISSION' ASSIGN_PERMISSION = 'ASSIGN_PERMISSION',
//APP
CREATE_APP = 'CREATE_APP',
UPDATE_APP_INFO = 'UPDATE_APP_INFO',
MOVE_APP = 'MOVE_APP',
DELETE_APP = 'DELETE_APP',
UPDATE_APP_COLLABORATOR = 'UPDATE_APP_COLLABORATOR',
DELETE_APP_COLLABORATOR = 'DELETE_APP_COLLABORATOR',
TRANSFER_APP_OWNERSHIP = 'TRANSFER_APP_OWNERSHIP',
CREATE_APP_COPY = 'CREATE_APP_COPY',
CREATE_APP_FOLDER = 'CREATE_APP_FOLDER',
UPDATE_PUBLISH_APP = 'UPDATE_PUBLISH_APP',
CREATE_APP_PUBLISH_CHANNEL = 'CREATE_APP_PUBLISH_CHANNEL',
UPDATE_APP_PUBLISH_CHANNEL = 'UPDATE_APP_PUBLISH_CHANNEL',
DELETE_APP_PUBLISH_CHANNEL = 'DELETE_APP_PUBLISH_CHANNEL',
EXPORT_APP_CHAT_LOG = 'EXPORT_APP_CHAT_LOG',
//Dataset
CREATE_DATASET = 'CREATE_DATASET',
UPDATE_DATASET = 'UPDATE_DATASET',
DELETE_DATASET = 'DELETE_DATASET',
MOVE_DATASET = 'MOVE_DATASET',
UPDATE_DATASET_COLLABORATOR = 'UPDATE_DATASET_COLLABORATOR',
DELETE_DATASET_COLLABORATOR = 'DELETE_DATASET_COLLABORATOR',
TRANSFER_DATASET_OWNERSHIP = 'TRANSFER_DATASET_OWNERSHIP',
EXPORT_DATASET = 'EXPORT_DATASET',
CREATE_DATASET_FOLDER = 'CREATE_DATASET_FOLDER',
//Collection
CREATE_COLLECTION = 'CREATE_COLLECTION',
UPDATE_COLLECTION = 'UPDATE_COLLECTION',
DELETE_COLLECTION = 'DELETE_COLLECTION',
RETRAIN_COLLECTION = 'RETRAIN_COLLECTION',
//Data
CREATE_DATA = 'CREATE_DATA',
UPDATE_DATA = 'UPDATE_DATA',
DELETE_DATA = 'DELETE_DATA',
//SearchTest
SEARCH_TEST = 'SEARCH_TEST',
//Account
CHANGE_PASSWORD = 'CHANGE_PASSWORD',
CHANGE_NOTIFICATION_SETTINGS = 'CHANGE_NOTIFICATION_SETTINGS',
CHANGE_MEMBER_NAME_ACCOUNT = 'CHANGE_MEMBER_NAME_ACCOUNT',
PURCHASE_PLAN = 'PURCHASE_PLAN',
EXPORT_BILL_RECORDS = 'EXPORT_BILL_RECORDS',
CREATE_INVOICE = 'CREATE_INVOICE',
SET_INVOICE_HEADER = 'SET_INVOICE_HEADER',
CREATE_API_KEY = 'CREATE_API_KEY',
UPDATE_API_KEY = 'UPDATE_API_KEY',
DELETE_API_KEY = 'DELETE_API_KEY'
} }

View File

@@ -13,6 +13,7 @@ const staticPluginList = [
'WeWorkWebhook', 'WeWorkWebhook',
'google', 'google',
'bing', 'bing',
'bocha',
'delay' 'delay'
]; ];
// Run in worker thread (Have npm packages) // Run in worker thread (Have npm packages)

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "4816",
"name": "钉钉 webhook", "name": "钉钉 webhook",
"avatar": "plugins/dingding", "avatar": "plugins/dingding",
"intro": "向钉钉机器人发起 webhook 请求。", "intro": "向钉钉机器人发起 webhook 请求。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "Menghuan1918", "author": "Menghuan1918",
"version": "488",
"name": "PDF识别", "name": "PDF识别",
"avatar": "plugins/doc2x", "avatar": "plugins/doc2x",
"intro": "将PDF文件发送至Doc2X进行解析返回结构化的LaTeX公式的文本(markdown)支持传入String类型的URL或者流程输出中的文件链接变量", "intro": "将PDF文件发送至Doc2X进行解析返回结构化的LaTeX公式的文本(markdown)支持传入String类型的URL或者流程输出中的文件链接变量",

View File

@@ -1,6 +1,5 @@
{ {
"author": "Menghuan1918", "author": "Menghuan1918",
"version": "488",
"name": "Doc2X服务", "name": "Doc2X服务",
"avatar": "plugins/doc2x", "avatar": "plugins/doc2x",
"intro": "将传入的图片或PDF文件发送至Doc2X进行解析返回带LaTeX公式的markdown格式的文本。", "intro": "将传入的图片或PDF文件发送至Doc2X进行解析返回带LaTeX公式的markdown格式的文本。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "4816",
"name": "企业微信 webhook", "name": "企业微信 webhook",
"avatar": "plugins/qiwei", "avatar": "plugins/qiwei",
"intro": "向企业微信机器人发起 webhook 请求。只能内部群使用。", "intro": "向企业微信机器人发起 webhook 请求。只能内部群使用。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "4811",
"name": "Bing搜索", "name": "Bing搜索",
"avatar": "core/workflow/template/bing", "avatar": "core/workflow/template/bing",
"intro": "在Bing中搜索。", "intro": "在Bing中搜索。",

View File

@@ -0,0 +1,677 @@
{
"author": "",
"name": "Bocha Search",
"avatar": "core/workflow/template/bocha",
"intro": "Search the web using the Bocha AI search engine.",
"showStatus": true,
"weight": 10,
"courseUrl": "",
"isTool": true,
"templateType": "search",
"workflow": {
"nodes": [
{
"nodeId": "pluginInput",
"name": "workflow:template.plugin_start",
"intro": "workflow:intro_plugin_input",
"avatar": "core/workflow/template/workflowStart",
"flowNodeType": "pluginInput",
"showStatus": false,
"position": {
"x": 636.3048409085379,
"y": -238.61714728578016
},
"version": "481",
"inputs": [
{
"renderTypeList": [
"input"
],
"selectedTypeIndex": 0,
"valueType": "string",
"canEdit": true,
"key": "apiKey",
"label": "apiKey",
"description": "Bocha API key",
"defaultValue": "",
"required": true
},
{
"renderTypeList": [
"input",
"reference"
],
"selectedTypeIndex": 0,
"valueType": "string",
"canEdit": true,
"key": "query",
"label": "query",
"description": "Search query",
"defaultValue": "",
"required": true,
"toolDescription": "Search query"
},
{
"renderTypeList": [
"input",
"reference"
],
"selectedTypeIndex": 0,
"valueType": "string",
"canEdit": true,
"key": "freshness",
"label": "freshness",
"description": "Restrict results to pages from a given time range. Allowed values: oneDay (past day), oneWeek (past week), oneMonth (past month), oneYear (past year), noLimit (no restriction, default), YYYY-MM-DD..YYYY-MM-DD (date range), YYYY-MM-DD (specific date).",
"defaultValue": "noLimit",
"required": false,
"toolDescription": "Search time range"
},
{
"renderTypeList": [
"input",
"reference"
],
"selectedTypeIndex": 0,
"valueType": "boolean",
"canEdit": true,
"key": "summary",
"label": "summary",
"description": "Whether to return a text summary: true to show it, false to omit it (default).",
"defaultValue": false,
"required": false,
"toolDescription": "Whether to return a text summary"
},
{
"renderTypeList": [
"input",
"reference"
],
"selectedTypeIndex": 0,
"valueType": "string",
"canEdit": true,
"key": "include",
"label": "include",
"description": "Sites to restrict the search to. Separate multiple domains with | or , (at most 20). Example: qq.com|m.163.com",
"defaultValue": "",
"required": false,
"toolDescription": "Sites to restrict the search to"
},
{
"renderTypeList": [
"input",
"reference"
],
"selectedTypeIndex": 0,
"valueType": "string",
"canEdit": true,
"key": "exclude",
"label": "exclude",
"description": "Sites to exclude from the search. Separate multiple domains with | or , (at most 20). Example: qq.com|m.163.com",
"defaultValue": "",
"required": false,
"toolDescription": "Sites to exclude from the search"
},
{
"renderTypeList": [
"input",
"reference"
],
"selectedTypeIndex": 0,
"valueType": "number",
"canEdit": true,
"key": "count",
"label": "count",
"description": "Number of results to return. Allowed range: 1-50; default 10.",
"defaultValue": 10,
"required": false,
"min": 1,
"max": 50,
"toolDescription": "Number of results to return"
}
],
"outputs": [
{
"id": "apiKey",
"valueType": "string",
"key": "apiKey",
"label": "apiKey",
"type": "hidden"
},
{
"id": "query",
"valueType": "string",
"key": "query",
"label": "query",
"type": "hidden"
},
{
"id": "freshness",
"valueType": "string",
"key": "freshness",
"label": "freshness",
"type": "hidden"
},
{
"id": "summary",
"valueType": "boolean",
"key": "summary",
"label": "summary",
"type": "hidden"
},
{
"id": "include",
"valueType": "string",
"key": "include",
"label": "include",
"type": "hidden"
},
{
"id": "exclude",
"valueType": "string",
"key": "exclude",
"label": "exclude",
"type": "hidden"
},
{
"id": "count",
"valueType": "number",
"key": "count",
"label": "count",
"type": "hidden"
}
]
},
{
"nodeId": "pluginOutput",
"name": "common:core.module.template.self_output",
"intro": "workflow:intro_custom_plugin_output",
"avatar": "core/workflow/template/pluginOutput",
"flowNodeType": "pluginOutput",
"showStatus": false,
"position": {
"x": 2764.1105686698083,
"y": -30.617147285780163
},
"version": "481",
"inputs": [
{
"renderTypeList": [
"reference"
],
"valueType": "object",
"canEdit": true,
"key": "result",
"label": "result",
"isToolOutput": true,
"description": "",
"value": [
"nyA6oA8mF1iW",
"httpRawResponse"
]
}
],
"outputs": []
},
{
"nodeId": "pluginConfig",
"name": "common:core.module.template.system_config",
"intro": "",
"avatar": "core/workflow/template/systemConfig",
"flowNodeType": "pluginConfig",
"position": {
"x": 184.66337662472682,
"y": -216.05298493910115
},
"version": "4811",
"inputs": [],
"outputs": []
},
{
"nodeId": "nyA6oA8mF1iW",
"name": "HTTP Request",
"intro": "Call the Bocha search API",
"avatar": "core/workflow/template/httpRequest",
"flowNodeType": "httpRequest468",
"showStatus": true,
"position": {
"x": 1335.0647252518884,
"y": -455.9043948565971
},
"version": "481",
"inputs": [
{
"key": "system_addInputParam",
"renderTypeList": [
"addInputParam"
],
"valueType": "dynamic",
"label": "",
"required": false,
"description": "common:core.module.input.description.HTTP Dynamic Input",
"customInputConfig": {
"selectValueTypeList": [
"string",
"number",
"boolean",
"object",
"arrayString",
"arrayNumber",
"arrayBoolean",
"arrayObject",
"arrayAny",
"any",
"chatHistory",
"datasetQuote",
"dynamic",
"selectDataset",
"selectApp"
],
"showDescription": false,
"showDefaultValue": true
},
"debugLabel": "",
"toolDescription": ""
},
{
"key": "system_httpMethod",
"renderTypeList": [
"custom"
],
"valueType": "string",
"label": "",
"value": "POST",
"required": true,
"debugLabel": "",
"toolDescription": ""
},
{
"key": "system_httpTimeout",
"renderTypeList": [
"custom"
],
"valueType": "number",
"label": "",
"value": 30,
"min": 5,
"max": 600,
"required": true,
"debugLabel": "",
"toolDescription": ""
},
{
"key": "system_httpReqUrl",
"renderTypeList": [
"hidden"
],
"valueType": "string",
"label": "",
"description": "common:core.module.input.description.Http Request Url",
"placeholder": "https://api.ai.com/getInventory",
"required": false,
"value": "https://api.bochaai.com/v1/web-search",
"debugLabel": "",
"toolDescription": ""
},
{
"key": "system_httpHeader",
"renderTypeList": [
"custom"
],
"valueType": "any",
"value": [
{
"key": "Authorization",
"type": "string",
"value": "Bearer {{$pluginInput.apiKey$}}"
},
{
"key": "Content-Type",
"type": "string",
"value": "application/json"
}
],
"label": "",
"description": "common:core.module.input.description.Http Request Header",
"placeholder": "common:core.module.input.description.Http Request Header",
"required": false,
"debugLabel": "",
"toolDescription": ""
},
{
"key": "system_httpParams",
"renderTypeList": [
"hidden"
],
"valueType": "any",
"value": [],
"label": "",
"required": false,
"debugLabel": "",
"toolDescription": ""
},
{
"key": "system_httpJsonBody",
"renderTypeList": [
"hidden"
],
"valueType": "any",
"value": "{\n \"query\": \"{{query}}\",\n \"freshness\": \"{{freshness}}\",\n \"summary\": {{summary}},\n \"include\": \"{{include}}\",\n \"exclude\": \"{{exclude}}\",\n \"count\": {{count}}\n}",
"label": "",
"required": false,
"debugLabel": "",
"toolDescription": ""
},
{
"key": "system_httpFormBody",
"renderTypeList": [
"hidden"
],
"valueType": "any",
"value": [],
"label": "",
"required": false,
"debugLabel": "",
"toolDescription": ""
},
{
"key": "system_httpContentType",
"renderTypeList": [
"hidden"
],
"valueType": "string",
"value": "json",
"label": "",
"required": false,
"debugLabel": "",
"toolDescription": ""
},
{
"valueType": "string",
"renderTypeList": [
"reference"
],
"key": "query",
"label": "query",
"toolDescription": "Bocha search query",
"required": true,
"canEdit": true,
"editField": {
"key": true,
"description": true
},
"customInputConfig": {
"selectValueTypeList": [
"string",
"number",
"boolean",
"object",
"arrayString",
"arrayNumber",
"arrayBoolean",
"arrayObject",
"arrayAny",
"any",
"chatHistory",
"datasetQuote",
"dynamic",
"selectApp",
"selectDataset"
],
"showDescription": false,
"showDefaultValue": true
},
"value": [
"pluginInput",
"query"
]
},
{
"valueType": "string",
"renderTypeList": [
"reference"
],
"key": "freshness",
"label": "freshness",
"toolDescription": "Search time range",
"required": false,
"canEdit": true,
"editField": {
"key": true,
"description": true
},
"customInputConfig": {
"selectValueTypeList": [
"string",
"number",
"boolean",
"object",
"arrayString",
"arrayNumber",
"arrayBoolean",
"arrayObject",
"arrayAny",
"any",
"chatHistory",
"datasetQuote",
"dynamic",
"selectApp",
"selectDataset"
],
"showDescription": false,
"showDefaultValue": true
},
"value": [
"pluginInput",
"freshness"
]
},
{
"valueType": "boolean",
"renderTypeList": [
"reference"
],
"key": "summary",
"label": "summary",
"toolDescription": "Whether to return a text summary",
"required": false,
"canEdit": true,
"editField": {
"key": true,
"description": true
},
"customInputConfig": {
"selectValueTypeList": [
"string",
"number",
"boolean",
"object",
"arrayString",
"arrayNumber",
"arrayBoolean",
"arrayObject",
"arrayAny",
"any",
"chatHistory",
"datasetQuote",
"dynamic",
"selectApp",
"selectDataset"
],
"showDescription": false,
"showDefaultValue": true
},
"value": [
"pluginInput",
"summary"
]
},
{
"valueType": "string",
"renderTypeList": [
"reference"
],
"key": "include",
"label": "include",
"toolDescription": "Sites to restrict the search to",
"required": false,
"canEdit": true,
"editField": {
"key": true,
"description": true
},
"customInputConfig": {
"selectValueTypeList": [
"string",
"number",
"boolean",
"object",
"arrayString",
"arrayNumber",
"arrayBoolean",
"arrayObject",
"arrayAny",
"any",
"chatHistory",
"datasetQuote",
"dynamic",
"selectApp",
"selectDataset"
],
"showDescription": false,
"showDefaultValue": true
},
"value": [
"pluginInput",
"include"
]
},
{
"valueType": "string",
"renderTypeList": [
"reference"
],
"key": "exclude",
"label": "exclude",
"toolDescription": "Sites to exclude from the search",
"required": false,
"canEdit": true,
"editField": {
"key": true,
"description": true
},
"customInputConfig": {
"selectValueTypeList": [
"string",
"number",
"boolean",
"object",
"arrayString",
"arrayNumber",
"arrayBoolean",
"arrayObject",
"arrayAny",
"any",
"chatHistory",
"datasetQuote",
"dynamic",
"selectApp",
"selectDataset"
],
"showDescription": false,
"showDefaultValue": true
},
"value": [
"pluginInput",
"exclude"
]
},
{
"valueType": "number",
"renderTypeList": [
"reference"
],
"key": "count",
"label": "count",
"toolDescription": "Number of results to return",
"required": false,
"canEdit": true,
"editField": {
"key": true,
"description": true
},
"customInputConfig": {
"selectValueTypeList": [
"string",
"number",
"boolean",
"object",
"arrayString",
"arrayNumber",
"arrayBoolean",
"arrayObject",
"arrayAny",
"any",
"chatHistory",
"datasetQuote",
"dynamic",
"selectApp",
"selectDataset"
],
"showDescription": false,
"showDefaultValue": true
},
"value": [
"pluginInput",
"count"
]
}
],
"outputs": [
{
"id": "error",
"key": "error",
"label": "workflow:request_error",
"description": "Error information for the HTTP request; empty on success.",
"valueType": "object",
"type": "static"
},
{
"id": "httpRawResponse",
"key": "httpRawResponse",
"required": true,
"label": "workflow:raw_response",
"description": "Raw response of the HTTP request. Only string or JSON response data is accepted.",
"valueType": "any",
"type": "static"
},
{
"id": "system_addOutputParam",
"key": "system_addOutputParam",
"type": "dynamic",
"valueType": "dynamic",
"label": "",
"editField": {
"key": true,
"valueType": true
}
}
]
}
],
"edges": [
{
"source": "pluginInput",
"target": "nyA6oA8mF1iW",
"sourceHandle": "pluginInput-source-right",
"targetHandle": "nyA6oA8mF1iW-target-left"
},
{
"source": "nyA6oA8mF1iW",
"target": "pluginOutput",
"sourceHandle": "nyA6oA8mF1iW-source-right",
"targetHandle": "pluginOutput-target-left"
}
]
},
"chatConfig": {}
}
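The template above wires seven plugin inputs into a single POST to `https://api.bochaai.com/v1/web-search`. As a rough sketch of the request the HTTP node would end up sending, the snippet below builds the same headers and JSON body outside the workflow engine. `BochaSearchParams` and `buildBochaRequest` are hypothetical names, the API key is a placeholder, and clamping `count` to 1-50 is an assumption based on the declared `min`/`max`, not behavior confirmed by the template. Nothing is sent over the network.

```typescript
// Hypothetical parameter type mirroring the plugin inputs (apiKey is passed separately).
type BochaSearchParams = {
  query: string;
  freshness?: string;
  summary?: boolean;
  include?: string;
  exclude?: string;
  count?: number;
};

// Build the request described by the template; defaults follow the "defaultValue" fields.
const buildBochaRequest = (apiKey: string, params: BochaSearchParams) => {
  // Assumption: enforce the declared min/max of 1 and 50 on the caller's behalf.
  const count = Math.min(50, Math.max(1, params.count ?? 10));
  return {
    url: 'https://api.bochaai.com/v1/web-search',
    method: 'POST' as const,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      query: params.query,
      freshness: params.freshness ?? 'noLimit',
      summary: params.summary ?? false,
      include: params.include ?? '',
      exclude: params.exclude ?? '',
      count
    })
  };
};

// Placeholder key and query, for illustration only.
const request = buildBochaRequest('sk-example', { query: 'FastGPT', count: 80 });
console.log(request.method, request.url);
console.log(request.body);
```

Using `JSON.stringify` instead of string interpolation avoids the quoting problems a raw `"{{query}}"` template could run into when the query itself contains double quotes.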

View File

@@ -1,6 +1,5 @@
{ {
"author": "silencezhang", "author": "silencezhang",
"version": "4811",
"name": "数据库连接", "name": "数据库连接",
"avatar": "core/workflow/template/datasource", "avatar": "core/workflow/template/datasource",
"intro": "可连接常用数据库并执行sql", "intro": "可连接常用数据库并执行sql",

View File

@@ -1,6 +1,5 @@
{ {
"author": "collin", "author": "collin",
"version": "4817",
"name": "流程等待", "name": "流程等待",
"avatar": "core/workflow/template/sleep", "avatar": "core/workflow/template/sleep",
"intro": "让工作流等待指定时间后运行", "intro": "让工作流等待指定时间后运行",

View File

@@ -1,6 +1,5 @@
{ {
"author": "silencezhang", "author": "silencezhang",
"version": "4817",
"name": "基础图表", "name": "基础图表",
"avatar": "core/workflow/template/baseChart", "avatar": "core/workflow/template/baseChart",
"intro": "根据数据生成图表可根据chartType生成柱状图折线图饼图", "intro": "根据数据生成图表可根据chartType生成柱状图折线图饼图",

View File

@@ -1,6 +1,5 @@
{ {
"author": "silencezhang", "author": "silencezhang",
"version": "486",
"name": "BI图表功能", "name": "BI图表功能",
"avatar": "core/workflow/template/BI", "avatar": "core/workflow/template/BI",
"intro": "BI图表功能可以生成一些常用的图表如饼图柱状图折线图等", "intro": "BI图表功能可以生成一些常用的图表如饼图柱状图折线图等",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "486",
"name": "DuckDuckGo 网络搜索", "name": "DuckDuckGo 网络搜索",
"avatar": "core/workflow/template/duckduckgo", "avatar": "core/workflow/template/duckduckgo",
"intro": "使用 DuckDuckGo 进行网络搜索", "intro": "使用 DuckDuckGo 进行网络搜索",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "486",
"name": "DuckDuckGo 图片搜索", "name": "DuckDuckGo 图片搜索",
"avatar": "core/workflow/template/duckduckgo", "avatar": "core/workflow/template/duckduckgo",
"intro": "使用 DuckDuckGo 进行图片搜索", "intro": "使用 DuckDuckGo 进行图片搜索",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "486",
"name": "DuckDuckGo 新闻检索", "name": "DuckDuckGo 新闻检索",
"avatar": "core/workflow/template/duckduckgo", "avatar": "core/workflow/template/duckduckgo",
"intro": "使用 DuckDuckGo 进行新闻检索", "intro": "使用 DuckDuckGo 进行新闻检索",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "486",
"name": "DuckDuckGo 视频搜索", "name": "DuckDuckGo 视频搜索",
"avatar": "core/workflow/template/duckduckgo", "avatar": "core/workflow/template/duckduckgo",
"intro": "使用 DuckDuckGo 进行视频搜索", "intro": "使用 DuckDuckGo 进行视频搜索",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "486",
"name": "DuckDuckGo服务", "name": "DuckDuckGo服务",
"avatar": "core/workflow/template/duckduckgo", "avatar": "core/workflow/template/duckduckgo",
"intro": "DuckDuckGo 服务,包含网络搜索、图片搜索、新闻搜索等。", "intro": "DuckDuckGo 服务,包含网络搜索、图片搜索、新闻搜索等。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "488",
"name": "飞书 webhook", "name": "飞书 webhook",
"avatar": "core/app/templates/plugin-feishu", "avatar": "core/app/templates/plugin-feishu",
"intro": "向飞书机器人发起 webhook 请求。", "intro": "向飞书机器人发起 webhook 请求。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "486",
"name": "网页内容抓取", "name": "网页内容抓取",
"avatar": "core/workflow/template/fetchUrl", "avatar": "core/workflow/template/fetchUrl",
"intro": "可获取一个网页链接内容,并以 Markdown 格式输出,仅支持获取静态网站。", "intro": "可获取一个网页链接内容,并以 Markdown 格式输出,仅支持获取静态网站。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "481",
"templateType": "tools", "templateType": "tools",
"name": "获取当前时间", "name": "获取当前时间",
"avatar": "core/workflow/template/getTime", "avatar": "core/workflow/template/getTime",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "4811",
"name": "Google搜索", "name": "Google搜索",
"avatar": "core/workflow/template/google", "avatar": "core/workflow/template/google",
"intro": "在google中搜索。", "intro": "在google中搜索。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "486",
"name": "数学公式执行", "name": "数学公式执行",
"avatar": "core/workflow/template/mathCall", "avatar": "core/workflow/template/mathCall",
"intro": "用于执行数学表达式的工具,通过 js 的 expr-eval 库运行表达式并返回结果。", "intro": "用于执行数学表达式的工具,通过 js 的 expr-eval 库运行表达式并返回结果。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "4816",
"name": "Search XNG 搜索", "name": "Search XNG 搜索",
"avatar": "core/workflow/template/searxng", "avatar": "core/workflow/template/searxng",
"intro": "使用 Search XNG 服务进行搜索。", "intro": "使用 Search XNG 服务进行搜索。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "cloudpense", "author": "cloudpense",
"version": "1.0.0",
"name": "Email 邮件发送", "name": "Email 邮件发送",
"avatar": "plugins/email", "avatar": "plugins/email",
"intro": "通过SMTP协议发送电子邮件(nodemailer)", "intro": "通过SMTP协议发送电子邮件(nodemailer)",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "489",
"name": "文本加工", "name": "文本加工",
"avatar": "/imgs/workflow/textEditor.svg", "avatar": "/imgs/workflow/textEditor.svg",
"intro": "可对固定或传入的文本进行加工后输出,非字符串类型数据最终会转成字符串类型。", "intro": "可对固定或传入的文本进行加工后输出,非字符串类型数据最终会转成字符串类型。",

View File

@@ -1,6 +1,5 @@
{ {
"author": "", "author": "",
"version": "4811",
"name": "Wiki搜索", "name": "Wiki搜索",
"avatar": "core/workflow/template/wiki", "avatar": "core/workflow/template/wiki",
"intro": "在Wiki中查询释义。", "intro": "在Wiki中查询释义。",

View File

@@ -1,5 +1,8 @@
import type { ApiDatasetDetailResponse } from '@fastgpt/global/core/dataset/apiDataset'; import type {
import { FeishuServer, YuqueServer } from '@fastgpt/global/core/dataset/apiDataset'; ApiDatasetDetailResponse,
FeishuServer,
YuqueServer
} from '@fastgpt/global/core/dataset/apiDataset/type';
import type { import type {
DeepRagSearchProps, DeepRagSearchProps,
SearchDatasetDataResponse SearchDatasetDataResponse

View File

@@ -0,0 +1,181 @@
import { retryFn } from '@fastgpt/global/common/system/utils';
import { connectionMongo } from '../../mongo';
import { MongoRawTextBufferSchema, bucketName } from './schema';
import { addLog } from '../../system/log';
import { setCron } from '../../system/cron';
import { checkTimerLock } from '../../system/timerLock/utils';
import { TimerIdEnum } from '../../system/timerLock/constants';
const getGridBucket = () => {
return new connectionMongo.mongo.GridFSBucket(connectionMongo.connection.db!, {
bucketName: bucketName
});
};
export const addRawTextBuffer = async ({
sourceId,
sourceName,
text,
expiredTime
}: {
sourceId: string;
sourceName: string;
text: string;
expiredTime: Date;
}) => {
const gridBucket = getGridBucket();
const metadata = {
sourceId,
sourceName,
expiredTime
};
const buffer = Buffer.from(text);
const fileSize = buffer.length;
// Chunk size: as large as possible, but no more than 14MB and no less than 128KB
const chunkSizeBytes = (() => {
// Ideal chunk size: file size ÷ target chunk count (10), capped at 14MB per chunk
const idealChunkSize = Math.min(Math.ceil(fileSize / 10), 14 * 1024 * 1024);
// Ensure the chunk size is at least 128KB
const minChunkSize = 128 * 1024; // 128KB
// Take the larger of the ideal and minimum chunk sizes
let chunkSize = Math.max(idealChunkSize, minChunkSize);
// Round the chunk size up to the nearest multiple of 64KB to keep it tidy
chunkSize = Math.ceil(chunkSize / (64 * 1024)) * (64 * 1024);
return chunkSize;
})();
const uploadStream = gridBucket.openUploadStream(sourceId, {
metadata,
chunkSizeBytes
});
return retryFn(async () => {
return new Promise((resolve, reject) => {
uploadStream.end(buffer);
uploadStream.on('finish', () => {
resolve(uploadStream.id);
});
uploadStream.on('error', (error) => {
addLog.error('addRawTextBuffer error', error);
resolve('');
});
});
});
};
export const getRawTextBuffer = async (sourceId: string) => {
const gridBucket = getGridBucket();
return retryFn(async () => {
const bufferData = await MongoRawTextBufferSchema.findOne(
{
'metadata.sourceId': sourceId
},
'_id metadata'
).lean();
if (!bufferData) {
return null;
}
// Read file content
const downloadStream = gridBucket.openDownloadStream(bufferData._id);
const chunks: Buffer[] = [];
return new Promise<{
text: string;
sourceName: string;
} | null>((resolve, reject) => {
downloadStream.on('data', (chunk) => {
chunks.push(chunk);
});
downloadStream.on('end', () => {
const buffer = Buffer.concat(chunks);
const text = buffer.toString('utf8');
resolve({
text,
sourceName: bufferData.metadata?.sourceName || ''
});
});
downloadStream.on('error', (error) => {
addLog.error('getRawTextBuffer error', error);
resolve(null);
});
});
});
};
export const deleteRawTextBuffer = async (sourceId: string): Promise<boolean> => {
const gridBucket = getGridBucket();
return retryFn(async () => {
const buffer = await MongoRawTextBufferSchema.findOne({ 'metadata.sourceId': sourceId });
if (!buffer) {
return false;
}
await gridBucket.delete(buffer._id);
return true;
});
};
export const updateRawTextBufferExpiredTime = async ({
sourceId,
expiredTime
}: {
sourceId: string;
expiredTime: Date;
}) => {
return retryFn(async () => {
return MongoRawTextBufferSchema.updateOne(
{ 'metadata.sourceId': sourceId },
{ $set: { 'metadata.expiredTime': expiredTime } }
);
});
};
export const clearExpiredRawTextBufferCron = async () => {
const gridBucket = getGridBucket();
const clearExpiredRawTextBuffer = async () => {
addLog.debug('Clear expired raw text buffer start');
const data = await MongoRawTextBufferSchema.find(
{
'metadata.expiredTime': { $lt: new Date() }
},
'_id'
).lean();
for (const item of data) {
try {
await gridBucket.delete(item._id);
} catch (error) {
addLog.error('Delete expired raw text buffer error', error);
}
}
addLog.debug('Clear expired raw text buffer end');
};
setCron('*/10 * * * *', async () => {
if (
await checkTimerLock({
timerId: TimerIdEnum.clearExpiredRawTextBuffer,
lockMinuted: 9
})
) {
try {
await clearExpiredRawTextBuffer();
} catch (error) {
addLog.error('clearExpiredRawTextBufferCron error', error);
}
}
});
};
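Reviewer note: this file replaces the previous TTL index (`expireAfterSeconds`) with an explicit `metadata.expiredTime` plus a cron sweep every ten minutes. The pattern can be sketched in memory as below. This is a minimal illustration, not the project's code: `RawTextBufferSketch` and `addMinutesSketch` are hypothetical names, and a plain object stands in for the GridFS bucket.

```typescript
// Hypothetical in-memory stand-in for the GridFS-backed raw text buffer.
type BufferEntry = { text: string; sourceName: string; expiredTime: Date };

class RawTextBufferSketch {
  private entries: Record<string, BufferEntry> = {};

  // Store (or overwrite) the buffered text for a source id.
  add(sourceId: string, entry: BufferEntry): void {
    this.entries[sourceId] = entry;
  }

  // Return the buffered entry, or null when nothing is stored.
  get(sourceId: string): BufferEntry | null {
    return this.entries[sourceId] ?? null;
  }

  // Mirrors the cron sweep: delete every entry whose expiredTime is before `now`.
  clearExpired(now: Date = new Date()): number {
    let removed = 0;
    for (const sourceId of Object.keys(this.entries)) {
      if (this.entries[sourceId].expiredTime.getTime() < now.getTime()) {
        delete this.entries[sourceId];
        removed++;
      }
    }
    return removed;
  }
}

// Plain-Date equivalent of date-fns `addMinutes`, to keep the sketch dependency-free.
const addMinutesSketch = (date: Date, minutes: number): Date =>
  new Date(date.getTime() + minutes * 60 * 1000);
```

Because expiry is only enforced by the sweep, an entry can outlive its `expiredTime` by up to one sweep interval; the read path shown in this diff does not check `expiredTime` itself.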

View File

@@ -1,33 +1,22 @@
import { getMongoModel, Schema } from '../../mongo'; import { getMongoModel, type Types, Schema } from '../../mongo';
import { type RawTextBufferSchemaType } from './type';
export const collectionName = 'buffer_rawtexts'; export const bucketName = 'buffer_rawtext';
const RawTextBufferSchema = new Schema({ const RawTextBufferSchema = new Schema({
sourceId: { metadata: {
type: String, sourceId: { type: String, required: true },
required: true sourceName: { type: String, required: true },
}, expiredTime: { type: Date, required: true }
rawText: { }
type: String,
default: ''
},
createTime: {
type: Date,
default: () => new Date()
},
metadata: Object
}); });
RawTextBufferSchema.index({ 'metadata.sourceId': 'hashed' });
RawTextBufferSchema.index({ 'metadata.expiredTime': -1 });
try { export const MongoRawTextBufferSchema = getMongoModel<{
RawTextBufferSchema.index({ sourceId: 1 }); _id: Types.ObjectId;
// 20 minutes metadata: {
RawTextBufferSchema.index({ createTime: 1 }, { expireAfterSeconds: 20 * 60 }); sourceId: string;
} catch (error) { sourceName: string;
console.log(error); expiredTime: Date;
} };
}>(`${bucketName}.files`, RawTextBufferSchema);
export const MongoRawTextBuffer = getMongoModel<RawTextBufferSchemaType>(
collectionName,
RawTextBufferSchema
);

View File

@@ -1,8 +0,0 @@
export type RawTextBufferSchemaType = {
sourceId: string;
rawText: string;
createTime: Date;
metadata?: {
filename: string;
};
};

View File

@@ -6,13 +6,14 @@ import { type DatasetFileSchema } from '@fastgpt/global/core/dataset/type';
import { MongoChatFileSchema, MongoDatasetFileSchema } from './schema'; import { MongoChatFileSchema, MongoDatasetFileSchema } from './schema';
import { detectFileEncoding, detectFileEncodingByPath } from '@fastgpt/global/common/file/tools'; import { detectFileEncoding, detectFileEncodingByPath } from '@fastgpt/global/common/file/tools';
import { CommonErrEnum } from '@fastgpt/global/common/error/code/common'; import { CommonErrEnum } from '@fastgpt/global/common/error/code/common';
import { MongoRawTextBuffer } from '../../buffer/rawText/schema';
import { readRawContentByFileBuffer } from '../read/utils'; import { readRawContentByFileBuffer } from '../read/utils';
import { gridFsStream2Buffer, stream2Encoding } from './utils'; import { computeGridFsChunSize, gridFsStream2Buffer, stream2Encoding } from './utils';
import { addLog } from '../../system/log'; import { addLog } from '../../system/log';
import { readFromSecondary } from '../../mongo/utils';
import { parseFileExtensionFromUrl } from '@fastgpt/global/common/string/tools'; import { parseFileExtensionFromUrl } from '@fastgpt/global/common/string/tools';
import { Readable } from 'stream'; import { Readable } from 'stream';
import { addRawTextBuffer, getRawTextBuffer } from '../../buffer/rawText/controller';
import { addMinutes } from 'date-fns';
import { retryFn } from '@fastgpt/global/common/system/utils';
export function getGFSCollection(bucket: `${BucketNameEnum}`) { export function getGFSCollection(bucket: `${BucketNameEnum}`) {
MongoDatasetFileSchema; MongoDatasetFileSchema;
@@ -64,23 +65,7 @@ export async function uploadFile({
// create a gridfs bucket // create a gridfs bucket
const bucket = getGridBucket(bucketName); const bucket = getGridBucket(bucketName);
const fileSize = stats.size; const chunkSizeBytes = computeGridFsChunSize(stats.size);
// Chunk size: as large as possible, but no more than 14MB and no less than 512KB
const chunkSizeBytes = (() => {
// Ideal chunk size: file size ÷ target chunk count (10), capped at 14MB per chunk
const idealChunkSize = Math.min(Math.ceil(fileSize / 10), 14 * 1024 * 1024);
// Ensure the chunk size is at least 512KB
const minChunkSize = 512 * 1024; // 512KB
// Take the larger of the ideal and minimum chunk sizes
let chunkSize = Math.max(idealChunkSize, minChunkSize);
// Round the chunk size up to the nearest multiple of 64KB to keep it tidy
chunkSize = Math.ceil(chunkSize / (64 * 1024)) * (64 * 1024);
return chunkSize;
})();
const stream = bucket.openUploadStream(filename, { const stream = bucket.openUploadStream(filename, {
metadata, metadata,
@@ -173,24 +158,18 @@ export async function getFileById({
export async function delFileByFileIdList({ export async function delFileByFileIdList({
bucketName, bucketName,
fileIdList, fileIdList
retry = 3
}: { }: {
bucketName: `${BucketNameEnum}`; bucketName: `${BucketNameEnum}`;
fileIdList: string[]; fileIdList: string[];
retry?: number;
}): Promise<any> { }): Promise<any> {
try { return retryFn(async () => {
const bucket = getGridBucket(bucketName); const bucket = getGridBucket(bucketName);
for await (const fileId of fileIdList) { for await (const fileId of fileIdList) {
await bucket.delete(new Types.ObjectId(fileId)); await bucket.delete(new Types.ObjectId(fileId));
} }
} catch (error) { });
if (retry > 0) {
return delFileByFileIdList({ bucketName, fileIdList, retry: retry - 1 });
}
}
} }
export async function getDownloadStream({ export async function getDownloadStream({
@@ -223,15 +202,13 @@ export const readFileContentFromMongo = async ({
rawText: string; rawText: string;
filename: string; filename: string;
}> => { }> => {
const bufferId = `${fileId}-${customPdfParse}`; const bufferId = `${String(fileId)}-${customPdfParse}`;
// read buffer // read buffer
const fileBuffer = await MongoRawTextBuffer.findOne({ sourceId: bufferId }, undefined, { const fileBuffer = await getRawTextBuffer(bufferId);
...readFromSecondary
}).lean();
if (fileBuffer) { if (fileBuffer) {
return { return {
rawText: fileBuffer.rawText, rawText: fileBuffer.text,
filename: fileBuffer.metadata?.filename || '' filename: fileBuffer?.sourceName
}; };
} }
@@ -265,16 +242,13 @@ export const readFileContentFromMongo = async ({
} }
}); });
// < 14M // Add buffer
if (fileBuffers.length < 14 * 1024 * 1024 && rawText.trim()) { addRawTextBuffer({
MongoRawTextBuffer.create({ sourceId: bufferId,
sourceId: bufferId, sourceName: file.filename,
rawText, text: rawText,
metadata: { expiredTime: addMinutes(new Date(), 20)
filename: file.filename });
}
});
}
return { return {
rawText, rawText,

View File

@@ -1,16 +1,16 @@
import { Schema, getMongoModel } from '../../mongo'; import { Schema, getMongoModel } from '../../mongo';
const DatasetFileSchema = new Schema({}); const DatasetFileSchema = new Schema({
const ChatFileSchema = new Schema({}); metadata: Object
});
const ChatFileSchema = new Schema({
metadata: Object
});
try { DatasetFileSchema.index({ uploadDate: -1 });
DatasetFileSchema.index({ uploadDate: -1 });
ChatFileSchema.index({ uploadDate: -1 }); ChatFileSchema.index({ uploadDate: -1 });
ChatFileSchema.index({ 'metadata.chatId': 1 }); ChatFileSchema.index({ 'metadata.chatId': 1 });
} catch (error) {
console.log(error);
}
export const MongoDatasetFileSchema = getMongoModel('dataset.files', DatasetFileSchema); export const MongoDatasetFileSchema = getMongoModel('dataset.files', DatasetFileSchema);
export const MongoChatFileSchema = getMongoModel('chat.files', ChatFileSchema); export const MongoChatFileSchema = getMongoModel('chat.files', ChatFileSchema);

View File

@@ -1,5 +1,57 @@
import { detectFileEncoding } from '@fastgpt/global/common/file/tools'; import { detectFileEncoding } from '@fastgpt/global/common/file/tools';
import { PassThrough } from 'stream'; import { PassThrough } from 'stream';
import { getGridBucket } from './controller';
import { type BucketNameEnum } from '@fastgpt/global/common/file/constants';
import { retryFn } from '@fastgpt/global/common/system/utils';
export const createFileFromText = async ({
bucket,
filename,
text,
metadata
}: {
bucket: `${BucketNameEnum}`;
filename: string;
text: string;
metadata: Record<string, any>;
}) => {
const gridBucket = getGridBucket(bucket);
const buffer = Buffer.from(text);
const fileSize = buffer.length;
// Chunk size: as large as possible, but no more than 14MB and no less than 128KB
const chunkSizeBytes = (() => {
// Ideal chunk size: file size ÷ target chunk count (10), capped at 14MB per chunk
const idealChunkSize = Math.min(Math.ceil(fileSize / 10), 14 * 1024 * 1024);
// Ensure the chunk size is at least 128KB
const minChunkSize = 128 * 1024; // 128KB
// Take the larger of the ideal and minimum chunk sizes
let chunkSize = Math.max(idealChunkSize, minChunkSize);
// Round the chunk size up to the nearest multiple of 64KB to keep it tidy
chunkSize = Math.ceil(chunkSize / (64 * 1024)) * (64 * 1024);
return chunkSize;
})();
const uploadStream = gridBucket.openUploadStream(filename, {
metadata,
chunkSizeBytes
});
return retryFn(async () => {
return new Promise<{ fileId: string }>((resolve, reject) => {
uploadStream.end(buffer);
uploadStream.on('finish', () => {
resolve({ fileId: String(uploadStream.id) });
});
uploadStream.on('error', reject);
});
});
};
export const gridFsStream2Buffer = (stream: NodeJS.ReadableStream) => { export const gridFsStream2Buffer = (stream: NodeJS.ReadableStream) => {
return new Promise<Buffer>((resolve, reject) => { return new Promise<Buffer>((resolve, reject) => {
@@ -53,3 +105,20 @@ export const stream2Encoding = async (stream: NodeJS.ReadableStream) => {
stream: copyStream stream: copyStream
}; };
}; };
// Chunk size: as large as possible, but no more than 14MB and no less than 512KB
export const computeGridFsChunSize = (fileSize: number) => {
// Ideal chunk size: file size ÷ target chunk count (10), capped at 14MB per chunk
const idealChunkSize = Math.min(Math.ceil(fileSize / 10), 14 * 1024 * 1024);
// Ensure the chunk size is at least 512KB
const minChunkSize = 512 * 1024; // 512KB
// Take the larger of the ideal and minimum chunk sizes
let chunkSize = Math.max(idealChunkSize, minChunkSize);
// Round the chunk size up to the nearest multiple of 64KB to keep it tidy
chunkSize = Math.ceil(chunkSize / (64 * 1024)) * (64 * 1024);
return chunkSize;
};
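Reviewer note: the same chunk-size rule appears three times in this diff with two different minimums (512KB for file uploads, 128KB for text buffers). A worked sketch of the arithmetic follows, with the minimum as a parameter. `computeChunkSizeSketch` is a hypothetical name and the parameterisation is an assumption made here for illustration; the diff hard-codes each minimum.

```typescript
const KB = 1024;
const MB = 1024 * KB;

// Hypothetical generalisation of the rule in this diff:
// aim for about 10 chunks, cap at 14MB, enforce a minimum, round up to 64KB.
const computeChunkSizeSketch = (fileSize: number, minChunkSize: number = 512 * KB): number => {
  // Ideal size: one tenth of the file, never more than 14MB.
  const idealChunkSize = Math.min(Math.ceil(fileSize / 10), 14 * MB);
  // Never go below the minimum.
  const chunkSize = Math.max(idealChunkSize, minChunkSize);
  // Round up to the next multiple of 64KB.
  return Math.ceil(chunkSize / (64 * KB)) * (64 * KB);
};

// A 6,000,000-byte file: ideal 600,000 -> rounded up to 10 * 64KB = 655,360.
const sixMillionBytes = computeChunkSizeSketch(6_000_000);
// A 1GiB file hits the 14MB cap: 14,680,064.
const oneGiB = computeChunkSizeSketch(1024 * MB);
// A 2,000,000-byte text with the 128KB minimum: ideal 200,000 -> 4 * 64KB = 262,144.
const smallText = computeChunkSizeSketch(2_000_000, 128 * KB);
```

Small files always get a single chunk of the minimum size, so the "about 10 chunks" target only applies once the file exceeds ten times the minimum.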

View File

@@ -22,7 +22,7 @@ export const getUploadModel = ({ maxSize = 500 }: { maxSize?: number }) => {
maxSize *= 1024 * 1024; maxSize *= 1024 * 1024;
class UploadModel { class UploadModel {
uploader = multer({ uploaderSingle = multer({
limits: { limits: {
fieldSize: maxSize fieldSize: maxSize
}, },
@@ -41,8 +41,7 @@ export const getUploadModel = ({ maxSize = 500 }: { maxSize?: number }) => {
} }
}) })
}).single('file'); }).single('file');
async getUploadFile<T = any>(
async doUpload<T = any>(
req: NextApiRequest, req: NextApiRequest,
res: NextApiResponse, res: NextApiResponse,
originBucketName?: `${BucketNameEnum}` originBucketName?: `${BucketNameEnum}`
@@ -54,7 +53,7 @@ export const getUploadModel = ({ maxSize = 500 }: { maxSize?: number }) => {
bucketName?: `${BucketNameEnum}`; bucketName?: `${BucketNameEnum}`;
}>((resolve, reject) => { }>((resolve, reject) => {
// @ts-ignore // @ts-ignore
this.uploader(req, res, (error) => { this.uploaderSingle(req, res, (error) => {
if (error) { if (error) {
return reject(error); return reject(error);
} }
@@ -94,6 +93,58 @@ export const getUploadModel = ({ maxSize = 500 }: { maxSize?: number }) => {
}); });
}); });
} }
uploaderMultiple = multer({
limits: {
fieldSize: maxSize
},
preservePath: true,
storage: multer.diskStorage({
// destination: (_req, _file, cb) => {
// cb(null, tmpFileDirPath);
// },
filename: (req, file, cb) => {
if (!file?.originalname) {
cb(new Error('File not found'), '');
} else {
const { ext } = path.parse(decodeURIComponent(file.originalname));
cb(null, `${getNanoid()}${ext}`);
}
}
})
}).array('file', global.feConfigs?.uploadFileMaxSize);
async getUploadFiles<T = any>(req: NextApiRequest, res: NextApiResponse) {
return new Promise<{
files: FileType[];
data: T;
}>((resolve, reject) => {
// @ts-ignore
this.uploaderMultiple(req, res, (error) => {
if (error) {
console.log(error);
return reject(error);
}
// @ts-ignore
const files = req.files as FileType[];
resolve({
files: files.map((file) => ({
...file,
originalname: decodeURIComponent(file.originalname)
})),
data: (() => {
if (!req.body?.data) return {};
try {
return JSON.parse(req.body.data);
} catch (error) {
return {};
}
})()
});
});
});
}
} }
return new UploadModel(); return new UploadModel();

View File

@@ -110,7 +110,7 @@ export const readRawContentByFileBuffer = async ({
return { return {
rawText: text, rawText: text,
formatText: rawText, formatText: text,
imageList imageList
}; };
}; };

View File

@@ -4,7 +4,8 @@ import { MongoFrequencyLimit } from './schema';
export const authFrequencyLimit = async ({ export const authFrequencyLimit = async ({
eventId, eventId,
maxAmount, maxAmount,
expiredTime expiredTime,
num = 1
}: AuthFrequencyLimitProps) => { }: AuthFrequencyLimitProps) => {
try { try {
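// Increment the amount for the matching eventId; create the record if it does not exist // Increment the amount for the matching eventId; create the record if it does not exist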
@@ -14,7 +15,7 @@ export const authFrequencyLimit = async ({
expiredTime: { $gte: new Date() } expiredTime: { $gte: new Date() }
}, },
{ {
$inc: { amount: 1 }, $inc: { amount: num },
// If not exist, set the expiredTime // If not exist, set the expiredTime
$setOnInsert: { expiredTime } $setOnInsert: { expiredTime }
}, },
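Reviewer note: the new `num` parameter lets one call consume several units of the limit at once (for example, a batch upload). The upsert-and-increment behaviour can be sketched in memory as below. `authFrequencyLimitSketch`, `limitRecords`, and the `now` parameter are hypothetical, and the final comparison against `maxAmount` is an assumption, since the visible hunks do not show how the counter is checked.

```typescript
type LimitRecord = { amount: number; expiredTime: Date };

// In-memory stand-in for the MongoFrequencyLimit collection (illustration only).
const limitRecords: Record<string, LimitRecord> = {};

const authFrequencyLimitSketch = ({
  eventId,
  maxAmount,
  expiredTime,
  num = 1,
  now = new Date()
}: {
  eventId: string;
  maxAmount: number;
  expiredTime: Date;
  num?: number;
  now?: Date;
}): boolean => {
  const existing = limitRecords[eventId];

  let record: LimitRecord;
  if (existing && existing.expiredTime.getTime() >= now.getTime()) {
    // A live window exists: add `num` and keep its original expiry ($inc).
    record = { amount: existing.amount + num, expiredTime: existing.expiredTime };
  } else {
    // No live window: start a new one at `num` ($inc on upsert plus $setOnInsert).
    record = { amount: num, expiredTime };
  }
  limitRecords[eventId] = record;

  // Assumption: the request is allowed while the counter stays within maxAmount.
  return record.amount <= maxAmount;
};
```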

View File

@@ -5,7 +5,10 @@ export enum TimerIdEnum {
clearExpiredSubPlan = 'clearExpiredSubPlan', clearExpiredSubPlan = 'clearExpiredSubPlan',
updateStandardPlan = 'updateStandardPlan', updateStandardPlan = 'updateStandardPlan',
scheduleTriggerApp = 'scheduleTriggerApp', scheduleTriggerApp = 'scheduleTriggerApp',
notification = 'notification' notification = 'notification',
clearExpiredRawTextBuffer = 'clearExpiredRawTextBuffer',
clearExpiredDatasetImage = 'clearExpiredDatasetImage'
} }
export enum LockNotificationEnum { export enum LockNotificationEnum {

View File

@@ -20,6 +20,10 @@ export const getVlmModel = (model?: string) => {
?.find((item) => item.model === model || item.name === model); ?.find((item) => item.model === model || item.name === model);
}; };
export const getVlmModelList = () => {
return Array.from(global.llmModelMap.values())?.filter((item) => item.vision) || [];
};
export const getDefaultEmbeddingModel = () => global?.systemDefaultModel.embedding!; export const getDefaultEmbeddingModel = () => global?.systemDefaultModel.embedding!;
export const getEmbeddingModel = (model?: string) => { export const getEmbeddingModel = (model?: string) => {
if (!model) return getDefaultEmbeddingModel(); if (!model) return getDefaultEmbeddingModel();

View File

@@ -30,8 +30,7 @@ import { Types } from 'mongoose';
community: community-id community: community-id
commercial: commercial-id commercial: commercial-id
*/ */
export function splitCombineToolId(id: string) {
export async function splitCombinePluginId(id: string) {
const splitRes = id.split('-'); const splitRes = id.split('-');
if (splitRes.length === 1) { if (splitRes.length === 1) {
// app id // app id
@@ -42,7 +41,7 @@ export async function splitCombinePluginId(id: string) {
} }
const [source, pluginId] = id.split('-') as [PluginSourceEnum, string]; const [source, pluginId] = id.split('-') as [PluginSourceEnum, string];
if (!source || !pluginId) return Promise.reject('pluginId not found'); if (!source || !pluginId) throw new Error('pluginId not found');
return { source, pluginId: id }; return { source, pluginId: id };
} }
@@ -54,7 +53,7 @@ const getSystemPluginTemplateById = async (
versionId?: string versionId?: string
): Promise<ChildAppType> => { ): Promise<ChildAppType> => {
const item = getSystemPluginTemplates().find((plugin) => plugin.id === pluginId); const item = getSystemPluginTemplates().find((plugin) => plugin.id === pluginId);
if (!item) return Promise.reject(PluginErrEnum.unAuth); if (!item) return Promise.reject(PluginErrEnum.unExist);
const plugin = cloneDeep(item); const plugin = cloneDeep(item);
@@ -64,10 +63,10 @@ const getSystemPluginTemplateById = async (
{ pluginId: plugin.id, 'customConfig.associatedPluginId': plugin.associatedPluginId }, { pluginId: plugin.id, 'customConfig.associatedPluginId': plugin.associatedPluginId },
'associatedPluginId' 'associatedPluginId'
).lean(); ).lean();
if (!systemPlugin) return Promise.reject(PluginErrEnum.unAuth); if (!systemPlugin) return Promise.reject(PluginErrEnum.unExist);
const app = await MongoApp.findById(plugin.associatedPluginId).lean(); const app = await MongoApp.findById(plugin.associatedPluginId).lean();
if (!app) return Promise.reject(PluginErrEnum.unAuth); if (!app) return Promise.reject(PluginErrEnum.unExist);
const version = versionId const version = versionId
? await getAppVersionById({ ? await getAppVersionById({
@@ -77,6 +76,12 @@ const getSystemPluginTemplateById = async (
}) })
: await getAppLatestVersion(plugin.associatedPluginId, app); : await getAppLatestVersion(plugin.associatedPluginId, app);
if (!version.versionId) return Promise.reject('App version not found'); if (!version.versionId) return Promise.reject('App version not found');
const isLatest = version.versionId
? await checkIsLatestVersion({
appId: plugin.associatedPluginId,
versionId: version.versionId
})
: true;
return { return {
...plugin, ...plugin,
@@ -85,12 +90,19 @@ const getSystemPluginTemplateById = async (
edges: version.edges, edges: version.edges,
chatConfig: version.chatConfig chatConfig: version.chatConfig
}, },
version: versionId || String(version.versionId), version: versionId ? version?.versionId : '',
versionLabel: version?.versionName,
isLatestVersion: isLatest,
teamId: String(app.teamId), teamId: String(app.teamId),
tmbId: String(app.tmbId) tmbId: String(app.tmbId)
}; };
} }
return plugin;
return {
...plugin,
version: undefined,
isLatestVersion: true
};
}; };
/* Format plugin to workflow preview node data */ /* Format plugin to workflow preview node data */
@@ -102,11 +114,11 @@ export async function getChildAppPreviewNode({
versionId?: string; versionId?: string;
}): Promise<FlowNodeTemplateType> { }): Promise<FlowNodeTemplateType> {
const app: ChildAppType = await (async () => { const app: ChildAppType = await (async () => {
const { source, pluginId } = await splitCombinePluginId(appId); const { source, pluginId } = splitCombineToolId(appId);
if (source === PluginSourceEnum.personal) { if (source === PluginSourceEnum.personal) {
const item = await MongoApp.findById(appId).lean(); const item = await MongoApp.findById(appId).lean();
if (!item) return Promise.reject('plugin not found'); if (!item) return Promise.reject(PluginErrEnum.unExist);
const version = await getAppVersionById({ appId, versionId, app: item }); const version = await getAppVersionById({ appId, versionId, app: item });
@@ -132,8 +144,8 @@ export async function getChildAppPreviewNode({
}, },
templateType: FlowNodeTemplateTypeEnum.teamApp, templateType: FlowNodeTemplateTypeEnum.teamApp,
version: version.versionId, version: versionId ? version?.versionId : '',
versionLabel: version?.versionName || '', versionLabel: version?.versionName,
isLatestVersion: isLatest, isLatestVersion: isLatest,
originCost: 0, originCost: 0,
@@ -142,7 +154,7 @@ export async function getChildAppPreviewNode({
pluginOrder: 0 pluginOrder: 0
}; };
} else { } else {
return getSystemPluginTemplateById(pluginId); return getSystemPluginTemplateById(pluginId, versionId);
} }
})(); })();
@@ -216,12 +228,12 @@ export async function getChildAppRuntimeById(
id: string, id: string,
versionId?: string versionId?: string
): Promise<PluginRuntimeType> { ): Promise<PluginRuntimeType> {
const app: ChildAppType = await (async () => { const app = await (async () => {
const { source, pluginId } = await splitCombinePluginId(id); const { source, pluginId } = splitCombineToolId(id);
if (source === PluginSourceEnum.personal) { if (source === PluginSourceEnum.personal) {
const item = await MongoApp.findById(id).lean(); const item = await MongoApp.findById(id).lean();
if (!item) return Promise.reject('plugin not found'); if (!item) return Promise.reject(PluginErrEnum.unExist);
const version = await getAppVersionById({ const version = await getAppVersionById({
appId: id, appId: id,
@@ -244,8 +256,6 @@ export async function getChildAppRuntimeById(
}, },
templateType: FlowNodeTemplateTypeEnum.teamApp, templateType: FlowNodeTemplateTypeEnum.teamApp,
// Not used
version: item?.pluginData?.nodeVersion,
originCost: 0, originCost: 0,
currentCost: 0, currentCost: 0,
hasTokenFee: false, hasTokenFee: false,
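Reviewer note: `splitCombinePluginId` was async and rejected with a string; the renamed `splitCombineToolId` is synchronous and throws an `Error`, so callers drop the `await`. A minimal sketch of the parsing follows. `PluginSourceSketch` and its string values are assumptions (only the `community-` and `commercial-` prefixes appear in the source comment), and the return value for a bare app id is assumed because that branch is elided in the diff.

```typescript
// Hypothetical stand-in for PluginSourceEnum; the string values are assumed.
enum PluginSourceSketch {
  personal = 'personal',
  community = 'community',
  commercial = 'commercial'
}

const splitCombineToolIdSketch = (
  id: string
): { source: PluginSourceSketch; pluginId: string } => {
  const parts = id.split('-');

  // A bare id with no prefix is treated as a personal app id (assumed return shape).
  if (parts.length === 1) {
    return { source: PluginSourceSketch.personal, pluginId: id };
  }

  const [source, pluginId] = parts;
  // Synchronous failure: throw instead of returning a rejected promise.
  if (!source || !pluginId) throw new Error('pluginId not found');

  // As in the diff, the full combined id is returned as pluginId.
  return { source: source as PluginSourceSketch, pluginId: id };
};
```

Callers inside `async` functions behave the same either way, since a thrown error still rejects the surrounding promise; the difference only matters for synchronous call sites.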

View File

@@ -1,6 +1,6 @@
import { type ChatNodeUsageType } from '@fastgpt/global/support/wallet/bill/type'; import { type ChatNodeUsageType } from '@fastgpt/global/support/wallet/bill/type';
import { type PluginRuntimeType } from '@fastgpt/global/core/plugin/type'; import { type PluginRuntimeType } from '@fastgpt/global/core/plugin/type';
import { splitCombinePluginId } from './controller'; import { splitCombineToolId } from './controller';
import { PluginSourceEnum } from '@fastgpt/global/core/plugin/constants'; import { PluginSourceEnum } from '@fastgpt/global/core/plugin/constants';
/* /*
@@ -20,7 +20,7 @@ export const computedPluginUsage = async ({
childrenUsage: ChatNodeUsageType[]; childrenUsage: ChatNodeUsageType[];
error?: boolean; error?: boolean;
}) => { }) => {
const { source } = await splitCombinePluginId(plugin.id); const { source } = splitCombineToolId(plugin.id);
const childrenUsages = childrenUsage.reduce((sum, item) => sum + (item.totalPoints || 0), 0); const childrenUsages = childrenUsage.reduce((sum, item) => sum + (item.totalPoints || 0), 0);
if (source !== PluginSourceEnum.personal) { if (source !== PluginSourceEnum.personal) {

View File

@@ -1,14 +1,13 @@
import { MongoDataset } from '../dataset/schema'; import { MongoDataset } from '../dataset/schema';
import { getEmbeddingModel } from '../ai/model'; import { getEmbeddingModel } from '../ai/model';
import { import { FlowNodeTypeEnum } from '@fastgpt/global/core/workflow/node/constant';
AppNodeFlowNodeTypeMap,
FlowNodeTypeEnum
} from '@fastgpt/global/core/workflow/node/constant';
import { NodeInputKeyEnum } from '@fastgpt/global/core/workflow/constants'; import { NodeInputKeyEnum } from '@fastgpt/global/core/workflow/constants';
import type { StoreNodeItemType } from '@fastgpt/global/core/workflow/type/node'; import type { StoreNodeItemType } from '@fastgpt/global/core/workflow/type/node';
import { MongoAppVersion } from './version/schema'; import { getChildAppPreviewNode, splitCombineToolId } from './plugin/controller';
import { checkIsLatestVersion } from './version/controller'; import { PluginSourceEnum } from '@fastgpt/global/core/plugin/constants';
import { Types } from '../../common/mongo'; import { authAppByTmbId } from '../../support/permission/app/auth';
import { ReadPermissionVal } from '@fastgpt/global/support/permission/constant';
import { getErrText } from '@fastgpt/global/common/error/utils';
export async function listAppDatasetDataByTeamIdAndDatasetIds({ export async function listAppDatasetDataByTeamIdAndDatasetIds({
teamId, teamId,
@@ -33,53 +32,58 @@ export async function listAppDatasetDataByTeamIdAndDatasetIds({
export async function rewriteAppWorkflowToDetail({ export async function rewriteAppWorkflowToDetail({
nodes, nodes,
teamId, teamId,
isRoot isRoot,
ownerTmbId
}: { }: {
nodes: StoreNodeItemType[]; nodes: StoreNodeItemType[];
teamId: string; teamId: string;
isRoot: boolean; isRoot: boolean;
ownerTmbId: string;
}) { }) {
const datasetIdSet = new Set<string>(); const datasetIdSet = new Set<string>();
// Add node(App Type) versionlabel and latest sign /* Add node(App Type) versionlabel and latest sign ==== */
const appNodes = nodes.filter((node) => AppNodeFlowNodeTypeMap[node.flowNodeType]); await Promise.all(
const versionIds = appNodes nodes.map(async (node) => {
.filter((node) => node.version && Types.ObjectId.isValid(node.version)) if (!node.pluginId) return;
.map((node) => node.version); const { source } = splitCombineToolId(node.pluginId);
if (versionIds.length > 0) { try {
const versionDataList = await MongoAppVersion.find( const [preview] = await Promise.all([
{ getChildAppPreviewNode({
_id: { $in: versionIds } appId: node.pluginId,
}, versionId: node.version
'_id versionName appId time' }),
).lean(); ...(source === PluginSourceEnum.personal
? [
authAppByTmbId({
tmbId: ownerTmbId,
appId: node.pluginId,
per: ReadPermissionVal
})
]
: [])
]);
const versionMap: Record<string, any> = {}; node.pluginData = {
diagram: preview.diagram,
const isLatestChecks = await Promise.all( userGuide: preview.userGuide,
versionDataList.map(async (version) => { courseUrl: preview.courseUrl,
const isLatest = await checkIsLatestVersion({ name: preview.name,
appId: version.appId, avatar: preview.avatar
versionId: version._id };
}); node.versionLabel = preview.versionLabel;
node.isLatestVersion = preview.isLatestVersion;
return { versionId: String(version._id), isLatest }; node.version = preview.version;
}) } catch (error) {
); node.pluginData = {
const isLatestMap = new Map(isLatestChecks.map((item) => [item.versionId, item.isLatest])); error: getErrText(error)
versionDataList.forEach((version) => { };
versionMap[String(version._id)] = version;
});
appNodes.forEach((node) => {
if (!node.version) return;
const versionData = versionMap[String(node.version)];
if (versionData) {
node.versionLabel = versionData.versionName;
node.isLatestVersion = isLatestMap.get(String(node.version)) || false;
} }
}); })
} );
/* Add node(App Type) versionlabel and latest sign ==== */
// Get all dataset ids from nodes // Get all dataset ids from nodes
nodes.forEach((node) => { nodes.forEach((node) => {

View File

@@ -68,6 +68,9 @@ export const checkIsLatestVersion = async ({
appId: string; appId: string;
versionId: string; versionId: string;
}) => { }) => {
if (!Types.ObjectId.isValid(versionId)) {
return false;
}
const version = await MongoAppVersion.findOne( const version = await MongoAppVersion.findOne(
{ {
appId, appId,

View File

@@ -3,12 +3,11 @@ import type {
ApiFileReadContentResponse, ApiFileReadContentResponse,
APIFileReadResponse, APIFileReadResponse,
ApiDatasetDetailResponse, ApiDatasetDetailResponse,
APIFileServer, APIFileServer
APIFileItem } from '@fastgpt/global/core/dataset/apiDataset/type';
} from '@fastgpt/global/core/dataset/apiDataset';
import axios, { type Method } from 'axios'; import axios, { type Method } from 'axios';
import { addLog } from '../../../common/system/log'; import { addLog } from '../../../../common/system/log';
import { readFileRawTextByUrl } from '../read'; import { readFileRawTextByUrl } from '../../read';
import { type ParentIdType } from '@fastgpt/global/common/parentFolder/type'; import { type ParentIdType } from '@fastgpt/global/common/parentFolder/type';
import { type RequireOnlyOne } from '@fastgpt/global/common/type/utils'; import { type RequireOnlyOne } from '@fastgpt/global/common/type/utils';

View File

@@ -3,10 +3,10 @@ import type {
ApiFileReadContentResponse, ApiFileReadContentResponse,
ApiDatasetDetailResponse, ApiDatasetDetailResponse,
FeishuServer FeishuServer
} from '@fastgpt/global/core/dataset/apiDataset'; } from '@fastgpt/global/core/dataset/apiDataset/type';
import { type ParentIdType } from '@fastgpt/global/common/parentFolder/type'; import { type ParentIdType } from '@fastgpt/global/common/parentFolder/type';
import axios, { type Method } from 'axios'; import axios, { type Method } from 'axios';
import { addLog } from '../../../common/system/log'; import { addLog } from '../../../../common/system/log';
type ResponseDataType = { type ResponseDataType = {
success: boolean; success: boolean;

View File

@@ -1,18 +1,10 @@
import type { import { useApiDatasetRequest } from './custom/api';
APIFileServer, import { useYuqueDatasetRequest } from './yuqueDataset/api';
YuqueServer, import { useFeishuDatasetRequest } from './feishuDataset/api';
FeishuServer import type { ApiDatasetServerType } from '@fastgpt/global/core/dataset/apiDataset/type';
} from '@fastgpt/global/core/dataset/apiDataset';
import { useApiDatasetRequest } from './api';
import { useYuqueDatasetRequest } from '../yuqueDataset/api';
import { useFeishuDatasetRequest } from '../feishuDataset/api';
export const getApiDatasetRequest = async (data: { export const getApiDatasetRequest = async (apiDatasetServer?: ApiDatasetServerType) => {
apiServer?: APIFileServer; const { apiServer, yuqueServer, feishuServer } = apiDatasetServer || {};
yuqueServer?: YuqueServer;
feishuServer?: FeishuServer;
}) => {
const { apiServer, yuqueServer, feishuServer } = data;
if (apiServer) { if (apiServer) {
return useApiDatasetRequest({ apiServer }); return useApiDatasetRequest({ apiServer });

View File

@@ -3,9 +3,9 @@ import type {
ApiFileReadContentResponse, ApiFileReadContentResponse,
YuqueServer, YuqueServer,
ApiDatasetDetailResponse ApiDatasetDetailResponse
} from '@fastgpt/global/core/dataset/apiDataset'; } from '@fastgpt/global/core/dataset/apiDataset/type';
import axios, { type Method } from 'axios'; import axios, { type Method } from 'axios';
import { addLog } from '../../../common/system/log'; import { addLog } from '../../../../common/system/log';
import { type ParentIdType } from '@fastgpt/global/common/parentFolder/type'; import { type ParentIdType } from '@fastgpt/global/common/parentFolder/type';
type ResponseDataType = { type ResponseDataType = {
@@ -105,7 +105,6 @@ export const useYuqueDatasetRequest = ({ yuqueServer }: { yuqueServer: YuqueServ
if (!parentId) { if (!parentId) {
if (yuqueServer.basePath) parentId = yuqueServer.basePath; if (yuqueServer.basePath) parentId = yuqueServer.basePath;
} }
let files: APIFileItem[] = []; let files: APIFileItem[] = [];
if (!parentId) { if (!parentId) {

View File

@@ -5,9 +5,10 @@ import {
} from '@fastgpt/global/core/dataset/constants'; } from '@fastgpt/global/core/dataset/constants';
import type { CreateDatasetCollectionParams } from '@fastgpt/global/core/dataset/api.d'; import type { CreateDatasetCollectionParams } from '@fastgpt/global/core/dataset/api.d';
import { MongoDatasetCollection } from './schema'; import { MongoDatasetCollection } from './schema';
import { import type {
type DatasetCollectionSchemaType, DatasetCollectionSchemaType,
type DatasetSchemaType DatasetDataFieldType,
DatasetSchemaType
} from '@fastgpt/global/core/dataset/type'; } from '@fastgpt/global/core/dataset/type';
import { MongoDatasetTraining } from '../training/schema'; import { MongoDatasetTraining } from '../training/schema';
import { MongoDatasetData } from '../data/schema'; import { MongoDatasetData } from '../data/schema';
@@ -15,7 +16,7 @@ import { delImgByRelatedId } from '../../../common/file/image/controller';
import { deleteDatasetDataVector } from '../../../common/vectorDB/controller'; import { deleteDatasetDataVector } from '../../../common/vectorDB/controller';
import { delFileByFileIdList } from '../../../common/file/gridfs/controller'; import { delFileByFileIdList } from '../../../common/file/gridfs/controller';
import { BucketNameEnum } from '@fastgpt/global/common/file/constants'; import { BucketNameEnum } from '@fastgpt/global/common/file/constants';
import { type ClientSession } from '../../../common/mongo'; import type { ClientSession } from '../../../common/mongo';
import { createOrGetCollectionTags } from './utils'; import { createOrGetCollectionTags } from './utils';
import { rawText2Chunks } from '../read'; import { rawText2Chunks } from '../read';
import { checkDatasetLimit } from '../../../support/permission/teamLimit'; import { checkDatasetLimit } from '../../../support/permission/teamLimit';
@@ -38,20 +39,25 @@ import {
getLLMMaxChunkSize getLLMMaxChunkSize
} from '@fastgpt/global/core/dataset/training/utils'; } from '@fastgpt/global/core/dataset/training/utils';
import { DatasetDataIndexTypeEnum } from '@fastgpt/global/core/dataset/data/constants'; import { DatasetDataIndexTypeEnum } from '@fastgpt/global/core/dataset/data/constants';
import { deleteDatasetImage } from '../image/controller';
import { clearCollectionImages, removeDatasetImageExpiredTime } from '../image/utils';
export const createCollectionAndInsertData = async ({ export const createCollectionAndInsertData = async ({
dataset, dataset,
rawText, rawText,
relatedId, relatedId,
imageIds,
createCollectionParams, createCollectionParams,
backupParse = false, backupParse = false,
billId, billId,
session session
}: { }: {
dataset: DatasetSchemaType; dataset: DatasetSchemaType;
rawText: string; rawText?: string;
relatedId?: string; relatedId?: string;
imageIds?: string[];
createCollectionParams: CreateOneCollectionParams; createCollectionParams: CreateOneCollectionParams;
backupParse?: boolean; backupParse?: boolean;
billId?: string; billId?: string;
@@ -69,15 +75,18 @@ export const createCollectionAndInsertData = async ({
// Set default params // Set default params
const trainingType = const trainingType =
createCollectionParams.trainingType || DatasetCollectionDataProcessModeEnum.chunk; createCollectionParams.trainingType || DatasetCollectionDataProcessModeEnum.chunk;
const chunkSize = computeChunkSize({
...createCollectionParams,
trainingType,
llmModel: getLLMModel(dataset.agentModel)
});
const chunkSplitter = computeChunkSplitter(createCollectionParams); const chunkSplitter = computeChunkSplitter(createCollectionParams);
const paragraphChunkDeep = computeParagraphChunkDeep(createCollectionParams); const paragraphChunkDeep = computeParagraphChunkDeep(createCollectionParams);
const trainingMode = getTrainingModeByCollection({
trainingType: trainingType,
autoIndexes: createCollectionParams.autoIndexes,
imageIndex: createCollectionParams.imageIndex
});
if (trainingType === DatasetCollectionDataProcessModeEnum.qa) { if (
trainingType === DatasetCollectionDataProcessModeEnum.qa ||
trainingType === DatasetCollectionDataProcessModeEnum.backup
) {
delete createCollectionParams.chunkTriggerType; delete createCollectionParams.chunkTriggerType;
delete createCollectionParams.chunkTriggerMinSize; delete createCollectionParams.chunkTriggerMinSize;
delete createCollectionParams.dataEnhanceCollectionName; delete createCollectionParams.dataEnhanceCollectionName;
@@ -87,35 +96,60 @@ export const createCollectionAndInsertData = async ({
delete createCollectionParams.qaPrompt; delete createCollectionParams.qaPrompt;
} }
// 1. split chunks // 1. split chunks or create image chunks
const chunks = rawText2Chunks({ const {
rawText, chunks,
chunkTriggerType: createCollectionParams.chunkTriggerType, chunkSize
chunkTriggerMinSize: createCollectionParams.chunkTriggerMinSize, }: {
chunkSize, chunks: Array<{
paragraphChunkDeep, q?: string;
paragraphChunkMinSize: createCollectionParams.paragraphChunkMinSize, a?: string; // answer or custom content
maxSize: getLLMMaxChunkSize(getLLMModel(dataset.agentModel)), imageId?: string;
overlapRatio: trainingType === DatasetCollectionDataProcessModeEnum.chunk ? 0.2 : 0, indexes?: string[];
customReg: chunkSplitter ? [chunkSplitter] : [], }>;
backupParse chunkSize?: number;
}); } = (() => {
if (rawText) {
const chunkSize = computeChunkSize({
...createCollectionParams,
trainingType,
llmModel: getLLMModel(dataset.agentModel)
});
// Process text chunks
const chunks = rawText2Chunks({
rawText,
chunkTriggerType: createCollectionParams.chunkTriggerType,
chunkTriggerMinSize: createCollectionParams.chunkTriggerMinSize,
chunkSize,
paragraphChunkDeep,
paragraphChunkMinSize: createCollectionParams.paragraphChunkMinSize,
maxSize: getLLMMaxChunkSize(getLLMModel(dataset.agentModel)),
overlapRatio: trainingType === DatasetCollectionDataProcessModeEnum.chunk ? 0.2 : 0,
customReg: chunkSplitter ? [chunkSplitter] : [],
backupParse
});
return { chunks, chunkSize };
}
if (imageIds) {
// Process image chunks
const chunks = imageIds.map((imageId: string) => ({
imageId,
indexes: []
}));
return { chunks };
}
throw new Error('Either rawText or imageIds must be provided');
})();
// 2. auth limit // 2. auth limit
await checkDatasetLimit({ await checkDatasetLimit({
teamId, teamId,
insertLen: predictDataLimitLength( insertLen: predictDataLimitLength(trainingMode, chunks)
getTrainingModeByCollection({
trainingType: trainingType,
autoIndexes: createCollectionParams.autoIndexes,
imageIndex: createCollectionParams.imageIndex
}),
chunks
)
}); });
const fn = async (session: ClientSession) => { const fn = async (session: ClientSession) => {
// 3. create collection // 3. Create collection
const { _id: collectionId } = await createOneCollection({ const { _id: collectionId } = await createOneCollection({
...createCollectionParams, ...createCollectionParams,
trainingType, trainingType,
@@ -123,8 +157,8 @@ export const createCollectionAndInsertData = async ({
chunkSize, chunkSize,
chunkSplitter, chunkSplitter,
hashRawText: hashStr(rawText), hashRawText: rawText ? hashStr(rawText) : undefined,
rawTextLength: rawText.length, rawTextLength: rawText?.length,
nextSyncTime: (() => { nextSyncTime: (() => {
// ignore auto collections sync for website datasets // ignore auto collections sync for website datasets
if (!dataset.autoSync && dataset.type === DatasetTypeEnum.websiteDataset) return undefined; if (!dataset.autoSync && dataset.type === DatasetTypeEnum.websiteDataset) return undefined;
@@ -166,11 +200,7 @@ export const createCollectionAndInsertData = async ({
vectorModel: dataset.vectorModel, vectorModel: dataset.vectorModel,
vlmModel: dataset.vlmModel, vlmModel: dataset.vlmModel,
indexSize: createCollectionParams.indexSize, indexSize: createCollectionParams.indexSize,
mode: getTrainingModeByCollection({ mode: trainingMode,
trainingType: trainingType,
autoIndexes: createCollectionParams.autoIndexes,
imageIndex: createCollectionParams.imageIndex
}),
prompt: createCollectionParams.qaPrompt, prompt: createCollectionParams.qaPrompt,
billId: traingBillId, billId: traingBillId,
data: chunks.map((item, index) => ({ data: chunks.map((item, index) => ({
@@ -184,7 +214,12 @@ export const createCollectionAndInsertData = async ({
session session
}); });
// 6. remove related image ttl // 6. Remove images ttl index
await removeDatasetImageExpiredTime({
ids: imageIds,
collectionId,
session
});
if (relatedId) { if (relatedId) {
await MongoImage.updateMany( await MongoImage.updateMany(
{ {
@@ -204,7 +239,7 @@ export const createCollectionAndInsertData = async ({
} }
return { return {
collectionId, collectionId: String(collectionId),
insertResults insertResults
}; };
}; };
@@ -285,17 +320,20 @@ export const delCollectionRelatedSource = async ({
.map((item) => item?.metadata?.relatedImgId || '') .map((item) => item?.metadata?.relatedImgId || '')
.filter(Boolean); .filter(Boolean);
// Delete files // Delete files and images in parallel
await delFileByFileIdList({ await Promise.all([
bucketName: BucketNameEnum.dataset, // Delete files
fileIdList delFileByFileIdList({
}); bucketName: BucketNameEnum.dataset,
// Delete images fileIdList
await delImgByRelatedId({ }),
teamId, // Delete images
relateIds: relatedImageIds, delImgByRelatedId({
session teamId,
}); relateIds: relatedImageIds,
session
})
]);
}; };
/** /**
* delete collection and its related data * delete collection and its related data
@@ -340,16 +378,16 @@ export async function delCollection({
datasetId: { $in: datasetIds }, datasetId: { $in: datasetIds },
collectionId: { $in: collectionIds } collectionId: { $in: collectionIds }
}), }),
// Delete dataset_images
clearCollectionImages(collectionIds),
// Delete images if needed
...(delImg ...(delImg
? [ ? collections
delImgByRelatedId({ .map((item) => item?.metadata?.relatedImgId || '')
teamId, .filter(Boolean)
relateIds: collections .map((imageId) => deleteDatasetImage(imageId))
.map((item) => item?.metadata?.relatedImgId || '')
.filter(Boolean)
})
]
: []), : []),
// Delete files if needed
...(delFile ...(delFile
? [ ? [
delFileByFileIdList({ delFileByFileIdList({
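
Review note: `createCollectionAndInsertData` now accepts either `rawText` or `imageIds` and builds its chunk list from whichever is present, with text taking precedence. The branching can be sketched as below. `splitText` is a hypothetical fixed-size splitter standing in for `rawText2Chunks`, and the default `chunkSize` of 512 is an assumption made for the sketch, not a value from the diff.

```typescript
type Chunk = { q?: string; a?: string; imageId?: string; indexes?: string[] };

// Hypothetical splitter: fixed-size slices only, unlike rawText2Chunks.
const splitText = (text: string, chunkSize: number): Chunk[] => {
  const chunks: Chunk[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push({ q: text.slice(i, i + chunkSize) });
  }
  return chunks;
};

const buildChunks = ({
  rawText,
  imageIds,
  chunkSize = 512 // assumed default, for illustration only
}: {
  rawText?: string;
  imageIds?: string[];
  chunkSize?: number;
}): { chunks: Chunk[]; chunkSize?: number } => {
  // Text wins when both inputs are supplied; an empty string is falsy and
  // falls through to the image branch.
  if (rawText) {
    return { chunks: splitText(rawText, chunkSize), chunkSize };
  }
  // Image chunks carry only an id; no chunkSize applies to them.
  if (imageIds) {
    return { chunks: imageIds.map((imageId) => ({ imageId, indexes: [] })) };
  }
  throw new Error('Either rawText or imageIds must be provided');
};
```

Because `chunkSize` is only returned on the text path, downstream code has to treat it as optional, which matches the `chunkSize?: number` annotation in the hunk.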

View File

@@ -1,11 +1,9 @@
import { MongoDatasetCollection } from './schema'; import { MongoDatasetCollection } from './schema';
import { type ClientSession } from '../../../common/mongo'; import type { ClientSession } from '../../../common/mongo';
import { MongoDatasetCollectionTags } from '../tag/schema'; import { MongoDatasetCollectionTags } from '../tag/schema';
import { readFromSecondary } from '../../../common/mongo/utils'; import { readFromSecondary } from '../../../common/mongo/utils';
import { import type { CollectionWithDatasetType } from '@fastgpt/global/core/dataset/type';
type CollectionWithDatasetType, import { DatasetCollectionSchemaType } from '@fastgpt/global/core/dataset/type';
type DatasetCollectionSchemaType
} from '@fastgpt/global/core/dataset/type';
import { import {
DatasetCollectionDataProcessModeEnum, DatasetCollectionDataProcessModeEnum,
DatasetCollectionSyncResultEnum, DatasetCollectionSyncResultEnum,
@@ -159,9 +157,7 @@ export const syncCollection = async (collection: CollectionWithDatasetType) => {
return { return {
type: DatasetSourceReadTypeEnum.apiFile, type: DatasetSourceReadTypeEnum.apiFile,
sourceId, sourceId,
apiServer: dataset.apiServer, apiDatasetServer: dataset.apiDatasetServer
feishuServer: dataset.feishuServer,
yuqueServer: dataset.yuqueServer
}; };
})(); })();
@@ -233,18 +229,37 @@ export const syncCollection = async (collection: CollectionWithDatasetType) => {
QA: separate process QA: separate process
Chunk: Image Index -> Auto index -> chunk index Chunk: Image Index -> Auto index -> chunk index
*/ */
export const getTrainingModeByCollection = (collection: { export const getTrainingModeByCollection = ({
trainingType: DatasetCollectionSchemaType['trainingType']; trainingType,
autoIndexes?: DatasetCollectionSchemaType['autoIndexes']; autoIndexes,
imageIndex?: DatasetCollectionSchemaType['imageIndex']; imageIndex
}: {
trainingType: DatasetCollectionDataProcessModeEnum;
autoIndexes?: boolean;
imageIndex?: boolean;
}) => { }) => {
if (collection.trainingType === DatasetCollectionDataProcessModeEnum.qa) { if (
trainingType === DatasetCollectionDataProcessModeEnum.imageParse &&
global.feConfigs?.isPlus
) {
return TrainingModeEnum.imageParse;
}
if (trainingType === DatasetCollectionDataProcessModeEnum.qa) {
return TrainingModeEnum.qa; return TrainingModeEnum.qa;
} }
if (collection.imageIndex && global.feConfigs?.isPlus) { if (
trainingType === DatasetCollectionDataProcessModeEnum.chunk &&
imageIndex &&
global.feConfigs?.isPlus
) {
return TrainingModeEnum.image; return TrainingModeEnum.image;
} }
if (collection.autoIndexes && global.feConfigs?.isPlus) { if (
trainingType === DatasetCollectionDataProcessModeEnum.chunk &&
autoIndexes &&
global.feConfigs?.isPlus
) {
return TrainingModeEnum.auto; return TrainingModeEnum.auto;
} }
return TrainingModeEnum.chunk; return TrainingModeEnum.chunk;
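
Review note: the rewritten `getTrainingModeByCollection` makes the precedence explicit, and `imageIndex` / `autoIndexes` now apply only when `trainingType` is `chunk`. A dependency-free sketch of that ordering is below. The two enums are local stand-ins for the ones in `@fastgpt/global` (their string values are assumptions), and `isPlus` is passed as a parameter in place of `global.feConfigs?.isPlus`.

```typescript
// Local stand-ins for the project's enums; values are assumed.
enum ProcessMode {
  chunk = 'chunk',
  qa = 'qa',
  backup = 'backup',
  imageParse = 'imageParse'
}
enum TrainingMode {
  chunk = 'chunk',
  auto = 'auto',
  qa = 'qa',
  image = 'image',
  imageParse = 'imageParse'
}

const resolveTrainingMode = ({
  trainingType,
  autoIndexes,
  imageIndex,
  isPlus
}: {
  trainingType: ProcessMode;
  autoIndexes?: boolean;
  imageIndex?: boolean;
  isPlus?: boolean; // stands in for global.feConfigs?.isPlus
}): TrainingMode => {
  // Order matters: imageParse, then qa, then image index, then auto index.
  if (trainingType === ProcessMode.imageParse && isPlus) return TrainingMode.imageParse;
  if (trainingType === ProcessMode.qa) return TrainingMode.qa;
  if (trainingType === ProcessMode.chunk && imageIndex && isPlus) return TrainingMode.image;
  if (trainingType === ProcessMode.chunk && autoIndexes && isPlus) return TrainingMode.auto;
  // Everything else, including backup and non-plus deployments, is plain chunk.
  return TrainingMode.chunk;
};
```

One visible consequence: a `backup` collection with `imageIndex` set resolves to `chunk`, whereas the old code would have returned `image` on a plus deployment.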

View File

@@ -9,6 +9,7 @@ import { deleteDatasetDataVector } from '../../common/vectorDB/controller';
import { MongoDatasetDataText } from './data/dataTextSchema'; import { MongoDatasetDataText } from './data/dataTextSchema';
import { DatasetErrEnum } from '@fastgpt/global/common/error/code/dataset'; import { DatasetErrEnum } from '@fastgpt/global/common/error/code/dataset';
import { retryFn } from '@fastgpt/global/common/system/utils'; import { retryFn } from '@fastgpt/global/common/system/utils';
import { clearDatasetImages } from './image/utils';
/* ============= dataset ========== */ /* ============= dataset ========== */
/* find all datasetId by top datasetId */ /* find all datasetId by top datasetId */
@@ -102,8 +103,10 @@ export async function delDatasetRelevantData({
}), }),
//delete dataset_datas //delete dataset_datas
MongoDatasetData.deleteMany({ teamId, datasetId: { $in: datasetIds } }), MongoDatasetData.deleteMany({ teamId, datasetId: { $in: datasetIds } }),
// Delete Image and file // Delete collection image and file
delCollectionRelatedSource({ collections }), delCollectionRelatedSource({ collections }),
// Delete dataset Image
clearDatasetImages(datasetIds),
// Delete vector data // Delete vector data
deleteDatasetDataVector({ teamId, datasetIds }) deleteDatasetDataVector({ teamId, datasetIds })
]); ]);

View File

@@ -0,0 +1,56 @@
import { getDatasetImagePreviewUrl } from '../image/utils';
import type { DatasetCiteItemType, DatasetDataSchemaType } from '@fastgpt/global/core/dataset/type';
export const formatDatasetDataValue = ({
q,
a,
imageId,
teamId,
datasetId
}: {
q: string;
a?: string;
imageId?: string;
teamId: string;
datasetId: string;
}): {
q: string;
a?: string;
imagePreivewUrl?: string;
} => {
if (!imageId) {
return {
q,
a
};
}
const previewUrl = getDatasetImagePreviewUrl({
imageId,
teamId,
datasetId,
expiredMinutes: 60 * 24 * 7 // 7 days
});
return {
q: `![${q.replaceAll('\n', '\\n')}](${previewUrl})`,
a,
imagePreivewUrl: previewUrl
};
};
export const getFormatDatasetCiteList = (list: DatasetDataSchemaType[]) => {
return list.map<DatasetCiteItemType>((item) => ({
_id: item._id,
...formatDatasetDataValue({
teamId: item.teamId,
datasetId: item.datasetId,
q: item.q,
a: item.a,
imageId: item.imageId
}),
history: item.history,
updateTime: item.updateTime,
index: item.chunkIndex
}));
};
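
Review note: for image data, `formatDatasetDataValue` wraps `q` as the alt text of a markdown image and escapes newlines so the alt text stays on one line. A small sketch of that transformation follows. `buildPreviewUrl` and its URL shape are hypothetical (the real `getDatasetImagePreviewUrl` also takes `teamId`, `datasetId` and an expiry, and is not shown in this diff), and `replace` with a global regex is used here in place of `replaceAll`, which behaves the same for this input.

```typescript
// The property name keeps the spelling used in the diff (imagePreivewUrl).
type FormattedValue = { q: string; a?: string; imagePreivewUrl?: string };

// Hypothetical URL builder; the real URL format is an unknown here.
const buildPreviewUrl = (imageId: string): string =>
  `https://example.invalid/dataset-image/${imageId}`;

const formatValue = ({
  q,
  a,
  imageId
}: {
  q: string;
  a?: string;
  imageId?: string;
}): FormattedValue => {
  // Text-only data passes through unchanged.
  if (!imageId) return { q, a };

  const previewUrl = buildPreviewUrl(imageId);
  return {
    // Real newlines become the two characters "\" and "n" inside the alt text.
    q: `![${q.replace(/\n/g, '\\n')}](${previewUrl})`,
    a,
    imagePreivewUrl: previewUrl
  };
};
```

Note that only newlines are escaped; a `]` inside `q` would still terminate the alt text early in most markdown renderers.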

View File

@@ -37,8 +37,7 @@ const DatasetDataSchema = new Schema({
required: true required: true
}, },
a: { a: {
type: String, type: String
default: ''
}, },
history: { history: {
type: [ type: [
@@ -74,6 +73,9 @@ const DatasetDataSchema = new Schema({
default: [] default: []
}, },
imageId: {
type: String
},
updateTime: { updateTime: {
type: Date, type: Date,
default: () => new Date() default: () => new Date()

View File

@@ -0,0 +1,166 @@
import { addMinutes } from 'date-fns';
import { bucketName, MongoDatasetImageSchema } from './schema';
import { connectionMongo, Types } from '../../../common/mongo';
import fs from 'fs';
import type { FileType } from '../../../common/file/multer';
import fsp from 'fs/promises';
import { computeGridFsChunSize } from '../../../common/file/gridfs/utils';
import { setCron } from '../../../common/system/cron';
import { checkTimerLock } from '../../../common/system/timerLock/utils';
import { TimerIdEnum } from '../../../common/system/timerLock/constants';
import { addLog } from '../../../common/system/log';
const getGridBucket = () => {
return new connectionMongo.mongo.GridFSBucket(connectionMongo.connection.db!, {
bucketName: bucketName
});
};
export const createDatasetImage = async ({
teamId,
datasetId,
file,
expiredTime = addMinutes(new Date(), 30)
}: {
teamId: string;
datasetId: string;
file: FileType;
expiredTime?: Date;
}): Promise<{ imageId: string; previewUrl: string }> => {
const path = file.path;
const gridBucket = getGridBucket();
const metadata = {
teamId: String(teamId),
datasetId: String(datasetId),
expiredTime
};
const stats = await fsp.stat(path);
if (!stats.isFile()) return Promise.reject(`${path} is not a file`);
const readStream = fs.createReadStream(path, {
highWaterMark: 256 * 1024
});
const chunkSizeBytes = computeGridFsChunSize(stats.size);
const stream = gridBucket.openUploadStream(file.originalname, {
metadata,
contentType: file.mimetype,
chunkSizeBytes
});
// save to gridfs
await new Promise((resolve, reject) => {
readStream
.pipe(stream as any)
.on('finish', resolve)
.on('error', reject);
});
return {
imageId: String(stream.id),
previewUrl: ''
};
};
export const getDatasetImageReadData = async (imageId: string) => {
// Get file metadata to get contentType
const fileInfo = await MongoDatasetImageSchema.findOne({
_id: new Types.ObjectId(imageId)
}).lean();
if (!fileInfo) {
return Promise.reject('Image not found');
}
const gridBucket = getGridBucket();
return {
stream: gridBucket.openDownloadStream(new Types.ObjectId(imageId)),
fileInfo
};
};
export const getDatasetImageBase64 = async (imageId: string) => {
// Get file metadata to get contentType
const fileInfo = await MongoDatasetImageSchema.findOne({
_id: new Types.ObjectId(imageId)
}).lean();
if (!fileInfo) {
return Promise.reject('Image not found');
}
// Get image stream from GridFS
const { stream } = await getDatasetImageReadData(imageId);
// Convert stream to buffer
const chunks: Buffer[] = [];
return new Promise<string>((resolve, reject) => {
stream.on('data', (chunk: Buffer) => {
chunks.push(chunk);
});
stream.on('end', () => {
// Combine all chunks into a single buffer
const buffer = Buffer.concat(chunks);
// Convert buffer to base64 string
const base64 = buffer.toString('base64');
const dataUrl = `data:${fileInfo.contentType || 'image/jpeg'};base64,${base64}`;
resolve(dataUrl);
});
stream.on('error', reject);
});
};
export const deleteDatasetImage = async (imageId: string) => {
const gridBucket = getGridBucket();
try {
await gridBucket.delete(new Types.ObjectId(imageId));
} catch (error: any) {
const msg = error?.message;
if (msg?.includes('File not found')) {
addLog.warn('Delete dataset image error', error);
return;
} else {
return Promise.reject(error);
}
}
};
export const clearExpiredDatasetImageCron = async () => {
const gridBucket = getGridBucket();
const clearExpiredDatasetImages = async () => {
addLog.debug('Clear expired dataset image start');
const data = await MongoDatasetImageSchema.find(
{
'metadata.expiredTime': { $lt: new Date() }
},
'_id'
).lean();
for (const item of data) {
try {
await gridBucket.delete(item._id);
} catch (error) {
addLog.error('Delete expired dataset image error', error);
}
}
addLog.debug('Clear expired dataset image end');
};
setCron('*/10 * * * *', async () => {
if (
await checkTimerLock({
timerId: TimerIdEnum.clearExpiredDatasetImage,
lockMinuted: 9
})
) {
try {
await clearExpiredDatasetImages();
} catch (error) {
addLog.error('clearExpiredDatasetImageCron error', error);
}
}
});
};
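
Review note: uploaded images start with an `expiredTime` 30 minutes ahead, and the cron job deletes every file whose `metadata.expiredTime` is in the past. The selection step can be sketched as a pure function, shown below. `ImageMeta` is a hypothetical shape, and treating a missing `expiredTime` as "keep forever" is an assumption about what `removeDatasetImageExpiredTime` does, since that helper is not shown in this part of the diff.

```typescript
// Hypothetical, flattened view of a dataset_image.files document.
type ImageMeta = { id: string; expiredTime?: Date };

// Mirrors the { 'metadata.expiredTime': { $lt: now } } filter in memory.
const selectExpiredImages = (images: ImageMeta[], now: Date): string[] =>
  images
    .filter(
      (image) =>
        // Assumption: images without an expiry have been claimed by a collection.
        image.expiredTime !== undefined && image.expiredTime.getTime() < now.getTime()
    )
    .map((image) => image.id);
```

With a 10-minute cron interval, an orphaned upload can outlive its 30-minute expiry by up to roughly 10 minutes before it is removed.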

View File

@@ -0,0 +1,36 @@
import type { Types } from '../../../common/mongo';
import { getMongoModel, Schema } from '../../../common/mongo';
export const bucketName = 'dataset_image';
const MongoDatasetImage = new Schema({
length: { type: Number, required: true },
chunkSize: { type: Number, required: true },
uploadDate: { type: Date, required: true },
filename: { type: String, required: true },
contentType: { type: String, required: true },
metadata: {
teamId: { type: String, required: true },
datasetId: { type: String, required: true },
collectionId: { type: String },
expiredTime: { type: Date, required: true }
}
});
MongoDatasetImage.index({ 'metadata.datasetId': 'hashed' });
MongoDatasetImage.index({ 'metadata.collectionId': 'hashed' });
MongoDatasetImage.index({ 'metadata.expiredTime': -1 });
export const MongoDatasetImageSchema = getMongoModel<{
_id: Types.ObjectId;
length: number;
chunkSize: number;
uploadDate: Date;
filename: string;
contentType: string;
metadata: {
teamId: string;
datasetId: string;
collectionId: string;
expiredTime: Date;
};
}>(`${bucketName}.files`, MongoDatasetImage);

Some files were not shown because too many files have changed in this diff