
Commit b755253

ming1753 authored and chang-wenbin committed

[Docs] PaddleOCR-VL add RTX3060 server param (PaddlePaddle#4765)

* [Docs] PaddleOCR-VL add RTX3060 server param
* modify config
* fix bug
1 parent f3eae82 commit b755253

File tree

2 files changed: +35 −8 lines changed

docs/best_practices/PaddleOCR-VL-0.9B.md

Lines changed: 18 additions & 4 deletions

````diff
@@ -5,7 +5,7 @@
 ## 1. Environment Preparation
 ### 1.1 Support Status
 Recommended Hardware Configuration:
-- GPU Memory: 24GB or more
+- GPU Memory: 12GB or more
 - Shared Memory: 2GB or more
 
 ### 1.2 Install Fastdeploy
@@ -14,7 +14,7 @@ Installation process reference documentation [FastDeploy GPU Install](../get_sta
 
 ## 2.How to Use
 ### 2.1 Basic: Launching the Service
-**Example 1:** Deploying a 16K Context Service on a Single RTX 4090 GPU
+**Example 1:** Deploying a 16K Context Service on a Single RTX 3060 GPU
 ```shell
 python -m fastdeploy.entrypoints.openai.api_server \
 --model PaddlePaddle/PaddleOCR-VL \
@@ -23,10 +23,24 @@ python -m fastdeploy.entrypoints.openai.api_server \
 --engine-worker-queue-port 8182 \
 --max-model-len 16384 \
 --max-num-batched-tokens 16384 \
---gpu-memory-utilization 0.8 \
+--gpu-memory-utilization 0.9 \
 --max-num-seqs 128
 ```
-**Example 2:** Deploying a 16K Context Service on a Single A100 GPU
+
+**Example 2:** Deploying a 16K Context Service on a Single RTX 4090 GPU
+```shell
+python -m fastdeploy.entrypoints.openai.api_server \
+--model PaddlePaddle/PaddleOCR-VL \
+--port 8180 \
+--metrics-port 8181 \
+--engine-worker-queue-port 8182 \
+--max-model-len 16384 \
+--max-num-batched-tokens 16384 \
+--gpu-memory-utilization 0.8 \
+--max-num-seqs 196
+```
+
+**Example 3:** Deploying a 16K Context Service on a Single A100 GPU
 ```shell
 python -m fastdeploy.entrypoints.openai.api_server \
 --model PaddlePaddle/PaddleOCR-VL \
````
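Since the commands in this diff launch an OpenAI-compatible server (`fastdeploy.entrypoints.openai.api_server`), the deployed service accepts standard chat-completions requests. A minimal client-side sketch, assuming the Example 1 server is reachable on `localhost:8180`; the helper name, prompt text, and image URL are illustrative, not part of the diff:

```python
import json

def build_ocr_request(image_url: str, prompt: str = "Recognize the text in this image.") -> dict:
    """Assemble an OpenAI-style chat-completions payload with one image part
    and one text part. POST the JSON to
    http://localhost:8180/v1/chat/completions (port from Example 1)."""
    return {
        "model": "PaddlePaddle/PaddleOCR-VL",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }

payload = build_ocr_request("https://example.com/page.png")
print(json.dumps(payload, indent=2))
```
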

docs/zh/best_practices/PaddleOCR-VL-0.9B.md

Lines changed: 17 additions & 4 deletions

(Chinese content translated below.)

````diff
@@ -5,7 +5,7 @@
 ## 1. Environment Preparation
 ### 1.1 Support Status
 Recommended hardware configuration:
-- GPU memory: 24GB or more
+- GPU memory: 12GB or more
 - Shared memory: 2GB or more
 
 ### 1.2 Install fastdeploy
@@ -14,7 +14,7 @@
 
 ## 2. How to Use
 ### 2.1 Basics: Launching the Service
-**Example 1:** Deploying a 16K-context service on a single 4090 GPU
+**Example 1:** Deploying a 16K-context service on a single 3060 GPU
 ```shell
 python -m fastdeploy.entrypoints.openai.api_server \
 --model PaddlePaddle/PaddleOCR-VL \
@@ -23,11 +23,24 @@ python -m fastdeploy.entrypoints.openai.api_server \
 --engine-worker-queue-port 8182 \
 --max-model-len 16384 \
 --max-num-batched-tokens 16384 \
---gpu-memory-utilization 0.8 \
+--gpu-memory-utilization 0.9 \
 --max-num-seqs 128
 ```
 
-**Example 2:** Deploying a 16K-context service on a single A100 GPU
+**Example 2:** Deploying a 16K-context service on a single 4090 GPU
+```shell
+python -m fastdeploy.entrypoints.openai.api_server \
+--model PaddlePaddle/PaddleOCR-VL \
+--port 8180 \
+--metrics-port 8181 \
+--engine-worker-queue-port 8182 \
+--max-model-len 16384 \
+--max-num-batched-tokens 16384 \
+--gpu-memory-utilization 0.8 \
+--max-num-seqs 196
+```
+
+**Example 3:** Deploying a 16K-context service on a single A100 GPU
 ```shell
 python -m fastdeploy.entrypoints.openai.api_server \
 --model PaddlePaddle/PaddleOCR-VL \
````
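The parameter changes in this commit track the memory budget: `--gpu-memory-utilization` is the fraction of device memory the engine may claim, so a 12GB RTX 3060 at 0.9 still has a much smaller absolute budget than a 24GB RTX 4090 at 0.8, which is why the 3060 example also keeps `--max-num-seqs` lower (128 vs 196). A rough arithmetic sketch; the exact amount FastDeploy reserves also depends on model weights and KV cache, so treat these numbers as illustrative only:

```python
def engine_budget_gb(total_gb: float, utilization: float) -> float:
    """Upper bound on memory the engine may allocate: total * utilization."""
    return total_gb * utilization

# GPU sizes and utilization values taken from the examples in the diff.
for name, total, util in [("RTX 3060", 12, 0.9), ("RTX 4090", 24, 0.8)]:
    print(f"{name} @ {util}: {engine_budget_gb(total, util):.1f} GB budget")
```
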
