
Commit b755253

ming1753 authored and chang-wenbin committed

[Docs] PaddleOCR-VL add RTX3060 server param (PaddlePaddle#4765)

* [Docs] PaddleOCR-VL add RTX3060 server param
* modify config
* fix bug
1 parent f3eae82 commit b755253

File tree

2 files changed: +35 −8 lines changed

docs/best_practices/PaddleOCR-VL-0.9B.md

Lines changed: 18 additions & 4 deletions

````diff
@@ -5,7 +5,7 @@
 ## 1. Environment Preparation
 ### 1.1 Support Status
 Recommended Hardware Configuration:
-- GPU Memory: 24GB or more
+- GPU Memory: 12GB or more
 - Shared Memory: 2GB or more
 
 ### 1.2 Install Fastdeploy
@@ -14,7 +14,7 @@ Installation process reference documentation [FastDeploy GPU Install](../get_sta
 
 ## 2.How to Use
 ### 2.1 Basic: Launching the Service
-**Example 1:** Deploying a 16K Context Service on a Single RTX 4090 GPU
+**Example 1:** Deploying a 16K Context Service on a Single RTX 3060 GPU
 ```shell
 python -m fastdeploy.entrypoints.openai.api_server \
 --model PaddlePaddle/PaddleOCR-VL \
@@ -23,10 +23,24 @@ python -m fastdeploy.entrypoints.openai.api_server \
 --engine-worker-queue-port 8182 \
 --max-model-len 16384 \
 --max-num-batched-tokens 16384 \
---gpu-memory-utilization 0.8 \
+--gpu-memory-utilization 0.9 \
 --max-num-seqs 128
 ```
-**Example 2:** Deploying a 16K Context Service on a Single A100 GPU
+
+**Example 2:** Deploying a 16K Context Service on a Single RTX 4090 GPU
+```shell
+python -m fastdeploy.entrypoints.openai.api_server \
+--model PaddlePaddle/PaddleOCR-VL \
+--port 8180 \
+--metrics-port 8181 \
+--engine-worker-queue-port 8182 \
+--max-model-len 16384 \
+--max-num-batched-tokens 16384 \
+--gpu-memory-utilization 0.8 \
+--max-num-seqs 196
+```
+
+**Example 3:** Deploying a 16K Context Service on a Single A100 GPU
 ```shell
 python -m fastdeploy.entrypoints.openai.api_server \
 --model PaddlePaddle/PaddleOCR-VL \
````
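Since the commands in this diff launch an OpenAI-compatible server (`fastdeploy.entrypoints.openai.api_server`), the deployed service accepts standard chat-completions requests. A minimal client-side sketch, assuming the Example 1 server is reachable on `localhost:8180`; the helper name, prompt text, and image URL are illustrative, not part of the diff:

```python
import json

def build_ocr_request(image_url: str, prompt: str = "Recognize the text in this image.") -> dict:
    """Assemble an OpenAI-style chat-completions payload with one image part
    and one text part. POST the JSON to
    http://localhost:8180/v1/chat/completions (port from Example 1)."""
    return {
        "model": "PaddlePaddle/PaddleOCR-VL",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }

payload = build_ocr_request("https://example.com/page.png")
print(json.dumps(payload, indent=2))
```
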

docs/zh/best_practices/PaddleOCR-VL-0.9B.md

Lines changed: 17 additions & 4 deletions

(Chinese content translated below.)

````diff
@@ -5,7 +5,7 @@
 ## 1. Environment Preparation
 ### 1.1 Support Status
 Recommended hardware configuration:
-- GPU memory: 24GB or more
+- GPU memory: 12GB or more
 - Shared memory: 2GB or more
 
 ### 1.2 Install fastdeploy
@@ -14,7 +14,7 @@
 
 ## 2. How to Use
 ### 2.1 Basics: Launching the Service
-**Example 1:** Deploying a 16K-context service on a single 4090 GPU
+**Example 1:** Deploying a 16K-context service on a single 3060 GPU
 ```shell
 python -m fastdeploy.entrypoints.openai.api_server \
 --model PaddlePaddle/PaddleOCR-VL \
@@ -23,11 +23,24 @@ python -m fastdeploy.entrypoints.openai.api_server \
 --engine-worker-queue-port 8182 \
 --max-model-len 16384 \
 --max-num-batched-tokens 16384 \
---gpu-memory-utilization 0.8 \
+--gpu-memory-utilization 0.9 \
 --max-num-seqs 128
 ```
 
-**Example 2:** Deploying a 16K-context service on a single A100 GPU
+**Example 2:** Deploying a 16K-context service on a single 4090 GPU
+```shell
+python -m fastdeploy.entrypoints.openai.api_server \
+--model PaddlePaddle/PaddleOCR-VL \
+--port 8180 \
+--metrics-port 8181 \
+--engine-worker-queue-port 8182 \
+--max-model-len 16384 \
+--max-num-batched-tokens 16384 \
+--gpu-memory-utilization 0.8 \
+--max-num-seqs 196
+```
+
+**Example 3:** Deploying a 16K-context service on a single A100 GPU
 ```shell
 python -m fastdeploy.entrypoints.openai.api_server \
 --model PaddlePaddle/PaddleOCR-VL \
````
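The parameter changes in this commit track the memory budget: `--gpu-memory-utilization` is the fraction of device memory the engine may claim, so a 12GB RTX 3060 at 0.9 still has a much smaller absolute budget than a 24GB RTX 4090 at 0.8, which is why the 3060 example also keeps `--max-num-seqs` lower (128 vs 196). A rough arithmetic sketch; the exact amount FastDeploy reserves also depends on model weights and KV cache, so treat these numbers as illustrative only:

```python
def engine_budget_gb(total_gb: float, utilization: float) -> float:
    """Upper bound on memory the engine may allocate: total * utilization."""
    return total_gb * utilization

# GPU sizes and utilization values taken from the examples in the diff.
for name, total, util in [("RTX 3060", 12, 0.9), ("RTX 4090", 24, 0.8)]:
    print(f"{name} @ {util}: {engine_budget_gb(total, util):.1f} GB budget")
```
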
