Adding files to deploy CodeTrans application on ROCm vLLM #1545

Merged
Changes from all commits (33 commits)
097dcaf  Add a new section to change LLM model such as deepseek based on valid… (louie-tsai, Feb 12, 2025)
aa5a93e  CodeTrans - add deploy app with vLLM ROCm (Feb 12, 2025)
488aee3  CodeTrans - add deploy app with vLLM ROCm (Feb 12, 2025)
3243f4b  CodeTrans - add deploy app with vLLM ROCm (Feb 12, 2025)
f5430e5  CodeTrans - add deploy app with vLLM ROCm (Feb 12, 2025)
1106d0e  CodeTrans - fix Dockerfile for vLLM (Feb 12, 2025)
265d67b  CodeTrans - fix Dockerfile for vLLM (Feb 12, 2025)
bea90fa  CodeTrans - fix files for deploy with ROCm vLLM (Feb 13, 2025)
9090354  CodeTrans - fix files for deploy with ROCm vLLM (Feb 13, 2025)
470d88b  CodeTrans - fix files for deploy with ROCm vLLM (Feb 17, 2025)
a3fe495  CodeTrans - fix files for deploy with ROCm vLLM (Feb 17, 2025)
cc3ab59  CodeTrans - fix files for deploy with ROCm vLLM (Feb 19, 2025)
26a443c  CodeTrans - fix ROCm docker compose file (Mar 6, 2025)
7a317ec  CodeTrans - fix files for deploy on ROCm (Mar 17, 2025)
19bc3c5  CodeTrans - fix files for deploy on ROCm (Mar 17, 2025)
9feb285  Merge branch 'main' into feature/Codetrans_vLLM (chyundunovDatamonsters, Mar 17, 2025)
2d0602c  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Mar 17, 2025)
061db6c  CodeTrans - fix files for deploy on ROCm (Mar 18, 2025)
3b0e35e  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Mar 18, 2025)
0d92a20  CodeTrans - fix files for deploy on ROCm (Mar 18, 2025)
acf55d9  Merge remote-tracking branch 'origin/feature/Codetrans_vLLM' into fea… (Mar 18, 2025)
3db4ac4  CodeTrans - fix files for deploy on ROCm (Mar 18, 2025)
204e3a7  CodeTrans - fix files for deploy on ROCm (Mar 18, 2025)
f597a85  CodeTrans - fix files for deploy on ROCm (Mar 18, 2025)
4f577ad  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Mar 18, 2025)
92a8034  CodeTrans - fix files for deploy on ROCm (Mar 18, 2025)
524421a  Merge remote-tracking branch 'origin/feature/Codetrans_vLLM' into fea… (Mar 18, 2025)
289cffd  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Mar 18, 2025)
7415b59  CodeTrans - fix files for deploy on ROCm (Mar 19, 2025)
cf68881  CodeTrans - fix files for deploy on ROCm (Mar 19, 2025)
e011a9b  CodeTrans - fix files for deploy on ROCm (Mar 19, 2025)
4693488  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Mar 19, 2025)
b21ed34  Merge branch 'main' into feature/Codetrans_vLLM (chyundunovDatamonsters, Mar 19, 2025)
Binary file added CodeTrans/assets/img/ui-result-page.png
Binary file added CodeTrans/assets/img/ui-starting-page.png
390 changes: 343 additions & 47 deletions CodeTrans/docker_compose/amd/gpu/rocm/README.md

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion CodeTrans/docker_compose/amd/gpu/rocm/compose.yaml
@@ -1,4 +1,5 @@
# Copyright (C) 2024 Intel Corporation
+# Copyright (c) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

services:
@@ -19,7 +20,7 @@ services:
HUGGINGFACEHUB_API_TOKEN: ${CODEGEN_HUGGINGFACEHUB_API_TOKEN}
host_ip: ${host_ip}
healthcheck:
-      test: ["CMD-SHELL", "curl -f http://$host_ip:8008/health || exit 1"]
+      test: ["CMD-SHELL", "curl -f http://${HOST_IP}:${CODETRANS_TGI_SERVICE_PORT}/health || exit 1"]
interval: 10s
timeout: 10s
retries: 100
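The healthcheck fix replaces a hard-coded `$host_ip:8008` with two variables that Compose interpolates at deploy time, so the probe follows whatever port the env script assigns. A minimal sketch of what the test command expands to (the address is a documentation placeholder, not a value from this PR):

```shell
# Sketch: what the parameterized healthcheck expands to (placeholder values)
HOST_IP=203.0.113.5
CODETRANS_TGI_SERVICE_PORT=18156
health_cmd="curl -f http://${HOST_IP}:${CODETRANS_TGI_SERVICE_PORT}/health || exit 1"
echo "$health_cmd"
```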
113 changes: 113 additions & 0 deletions CodeTrans/docker_compose/amd/gpu/rocm/compose_vllm.yaml
@@ -0,0 +1,113 @@
# Copyright (C) 2024 Intel Corporation
# Copyright (c) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

services:
codetrans-vllm-service:
image: ${REGISTRY:-opea}/vllm-rocm:${TAG:-latest}
container_name: codetrans-vllm-service
ports:
- "${CODETRANS_VLLM_SERVICE_PORT:-8081}:8011"
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HUGGINGFACEHUB_API_TOKEN: ${CODETRANS_HUGGINGFACEHUB_API_TOKEN}
HF_TOKEN: ${CODETRANS_HUGGINGFACEHUB_API_TOKEN}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
WILM_USE_TRITON_FLASH_ATTENTION: 0
PYTORCH_JIT: 0
healthcheck:
test: [ "CMD-SHELL", "curl -f http://${HOST_IP}:${CODETRANS_VLLM_SERVICE_PORT:-8028}/health || exit 1" ]
interval: 10s
timeout: 10s
retries: 100
volumes:
- "./data:/data"
shm_size: 20G
devices:
- /dev/kfd:/dev/kfd
- /dev/dri/:/dev/dri/
cap_add:
- SYS_PTRACE
group_add:
- video
security_opt:
- seccomp:unconfined
- apparmor=unconfined
command: "--model ${CODETRANS_LLM_MODEL_ID} --swap-space 16 --disable-log-requests --dtype float16 --tensor-parallel-size 4 --host 0.0.0.0 --port 8011 --num-scheduler-steps 1 --distributed-executor-backend \"mp\""
ipc: host
codetrans-llm-server:
image: ${REGISTRY:-opea}/llm-textgen:${TAG:-latest}
container_name: codetrans-llm-server
depends_on:
codetrans-vllm-service:
condition: service_healthy
ports:
- "${CODETRANS_LLM_SERVICE_PORT:-9000}:9000"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
LLM_ENDPOINT: ${CODETRANS_LLM_ENDPOINT}
LLM_MODEL_ID: ${CODETRANS_LLM_MODEL_ID}
HUGGINGFACEHUB_API_TOKEN: ${CODETRANS_HUGGINGFACEHUB_API_TOKEN}
HF_TOKEN: ${CODETRANS_HUGGINGFACEHUB_API_TOKEN}
LLM_COMPONENT_NAME: "OpeaTextGenService"
restart: unless-stopped
codetrans-backend-server:
image: ${REGISTRY:-opea}/codetrans:${TAG:-latest}
container_name: codetrans-backend-server
depends_on:
- codetrans-llm-server
ports:
- "${CODETRANS_BACKEND_SERVICE_PORT:-7777}:7777"
environment:
no_proxy: ${no_proxy}
https_proxy: ${https_proxy}
http_proxy: ${http_proxy}
MEGA_SERVICE_HOST_IP: ${HOST_IP}
LLM_SERVICE_HOST_IP: ${HOST_IP}
LLM_SERVICE_PORT: ${CODETRANS_LLM_SERVICE_PORT}
ipc: host
restart: always
codetrans-ui-server:
image: ${REGISTRY:-opea}/codetrans-ui:${TAG:-latest}
container_name: codetrans-ui-server
depends_on:
- codetrans-backend-server
ports:
- "${CODETRANS_FRONTEND_SERVICE_PORT:-5173}:5173"
environment:
no_proxy: ${no_proxy}
https_proxy: ${https_proxy}
http_proxy: ${http_proxy}
BASE_URL: ${CODETRANS_BACKEND_SERVICE_URL}
BASIC_URL: ${CODETRANS_BACKEND_SERVICE_URL}
ipc: host
restart: always
codetrans-nginx-server:
image: ${REGISTRY:-opea}/nginx:${TAG:-latest}
container_name: codetrans-nginx-server
depends_on:
- codetrans-backend-server
- codetrans-ui-server
ports:
- "${CODETRANS_NGINX_PORT:-80}:80"
environment:
- no_proxy=${no_proxy}
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
- FRONTEND_SERVICE_IP=${CODETRANS_FRONTEND_SERVICE_IP}
- FRONTEND_SERVICE_PORT=${CODETRANS_FRONTEND_SERVICE_PORT}
- BACKEND_SERVICE_NAME=${CODETRANS_BACKEND_SERVICE_NAME}
- BACKEND_SERVICE_IP=${CODETRANS_BACKEND_SERVICE_IP}
- BACKEND_SERVICE_PORT=${CODETRANS_BACKEND_SERVICE_PORT}
ipc: host
restart: always

networks:
default:
driver: bridge
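The vLLM service's healthcheck uses a 10-second interval with 100 retries, presumably to give the container time for first-run model download and engine warm-up before `codetrans-llm-server` (which waits on `condition: service_healthy`) is started. A quick sketch of the worst-case wait this implies, using the values from the compose file above:

```shell
# Sketch: worst-case time Compose waits for the vLLM healthcheck
interval=10   # seconds between probes (from the compose file)
retries=100   # failed probes tolerated before "unhealthy" (from the compose file)
max_wait=$(( interval * retries ))
echo "Compose waits up to ${max_wait}s (~$(( max_wait / 60 )) min) for vLLM to become healthy"
```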
17 changes: 11 additions & 6 deletions CodeTrans/docker_compose/amd/gpu/rocm/set_env.sh
@@ -1,10 +1,15 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
+# Copyright (c) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

### The IP address or domain name of the server on which the application is running
-export HOST_IP=direct-supercomputer1.powerml.co
+# If your server is located behind a firewall or proxy, you will need to specify its external address,
+# which can be used to connect to the server from the Internet. It must be specified in the EXTERNAL_HOST_IP variable.
+# If the server is used only on the internal network or has a direct external address,
+# specify it in HOST_IP and in EXTERNAL_HOST_IP.
+export HOST_IP=''
+export EXTERNAL_HOST_IP=''

### Model ID
export CODETRANS_LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
@@ -16,7 +21,7 @@ export CODETRANS_TGI_SERVICE_PORT=18156
export CODETRANS_TGI_LLM_ENDPOINT="http://${HOST_IP}:${CODETRANS_TGI_SERVICE_PORT}"

### A token for accessing repositories with models
-export CODETRANS_HUGGINGFACEHUB_API_TOKEN=''
+export CODETRANS_HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}

### The port of the LLM service. On this port, the LLM service will accept connections
export CODETRANS_LLM_SERVICE_PORT=18157
@@ -28,7 +33,7 @@ export CODETRANS_MEGA_SERVICE_HOST_IP=${HOST_IP}
export CODETRANS_LLM_SERVICE_HOST_IP=${HOST_IP}

### The ip address of the host on which the container with the frontend service is running
-export CODETRANS_FRONTEND_SERVICE_IP=192.165.1.21
+export CODETRANS_FRONTEND_SERVICE_IP=${HOST_IP}

### The port of the frontend service
export CODETRANS_FRONTEND_SERVICE_PORT=18155
@@ -37,7 +42,7 @@ export CODETRANS_FRONTEND_SERVICE_PORT=18155
export CODETRANS_BACKEND_SERVICE_NAME=codetrans

### The ip address of the host on which the container with the backend service is running
-export CODETRANS_BACKEND_SERVICE_IP=192.165.1.21
+export CODETRANS_BACKEND_SERVICE_IP=${HOST_IP}

### The port of the backend service
export CODETRANS_BACKEND_SERVICE_PORT=18154
@@ -46,4 +51,4 @@ export CODETRANS_BACKEND_SERVICE_PORT=18154
export CODETRANS_NGINX_PORT=18153

### Endpoint of the backend service
-export CODETRANS_BACKEND_SERVICE_URL="http://${HOST_IP}:${CODETRANS_BACKEND_SERVICE_PORT}/v1/codetrans"
+export CODETRANS_BACKEND_SERVICE_URL="http://${EXTERNAL_HOST_IP}:${CODETRANS_BACKEND_SERVICE_PORT}/v1/codetrans"
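With the hard-coded hostname gone, the frontend-facing backend URL is now derived entirely from two exports. A sketch of that derivation (the address is a documentation placeholder; real deployments set their own `EXTERNAL_HOST_IP`):

```shell
# Sketch: deriving the backend URL the way the updated set_env.sh does
EXTERNAL_HOST_IP=203.0.113.5          # placeholder external address
CODETRANS_BACKEND_SERVICE_PORT=18154  # backend port from set_env.sh
CODETRANS_BACKEND_SERVICE_URL="http://${EXTERNAL_HOST_IP}:${CODETRANS_BACKEND_SERVICE_PORT}/v1/codetrans"
echo "$CODETRANS_BACKEND_SERVICE_URL"
```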
54 changes: 54 additions & 0 deletions CodeTrans/docker_compose/amd/gpu/rocm/set_env_vllm.sh
@@ -0,0 +1,54 @@
#!/usr/bin/env bash

# Copyright (c) 2025 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

### The IP address or domain name of the server on which the application is running
# If your server is located behind a firewall or proxy, you will need to specify its external address,
# which can be used to connect to the server from the Internet. It must be specified in the EXTERNAL_HOST_IP variable.
# If the server is used only on the internal network or has a direct external address,
# specify it in HOST_IP and in EXTERNAL_HOST_IP.
export HOST_IP=''
export EXTERNAL_HOST_IP=''

### Model ID
export CODETRANS_LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"

### The port of the TGI service. On this port, the TGI service will accept connections
export CODETRANS_VLLM_SERVICE_PORT=18156

### The endpoint of the TGI service to which requests to this service will be sent (formed from previously set variables)
export CODETRANS_LLM_ENDPOINT="http://${HOST_IP}:${CODETRANS_VLLM_SERVICE_PORT}"

### A token for accessing repositories with models
export CODETRANS_HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}

### The port of the LLM service. On this port, the LLM service will accept connections
export CODETRANS_LLM_SERVICE_PORT=18157

### The IP address or domain name of the server for CodeTrans MegaService
export CODETRANS_MEGA_SERVICE_HOST_IP=${HOST_IP}

### The endpoint of the LLM service to which requests to this service will be sent
export CODETRANS_LLM_SERVICE_HOST_IP=${HOST_IP}

### The ip address of the host on which the container with the frontend service is running
export CODETRANS_FRONTEND_SERVICE_IP=${HOST_IP}

### The port of the frontend service
export CODETRANS_FRONTEND_SERVICE_PORT=18155

### Name of GenAI service for route requests to application
export CODETRANS_BACKEND_SERVICE_NAME=codetrans

### The ip address of the host on which the container with the backend service is running
export CODETRANS_BACKEND_SERVICE_IP=${HOST_IP}

### The port of the backend service
export CODETRANS_BACKEND_SERVICE_PORT=18154

### The port of the Nginx reverse proxy for application
export CODETRANS_NGINX_PORT=18153

### Endpoint of the backend service
export CODETRANS_BACKEND_SERVICE_URL="http://${EXTERNAL_HOST_IP}:${CODETRANS_BACKEND_SERVICE_PORT}/v1/codetrans"
5 changes: 5 additions & 0 deletions CodeTrans/docker_image_build/build.yaml
@@ -41,3 +41,8 @@ services:
dockerfile: comps/third_parties/nginx/src/Dockerfile
extends: codetrans
image: ${REGISTRY:-opea}/nginx:${TAG:-latest}
vllm-rocm:
build:
context: GenAIComps
dockerfile: comps/third_parties/vllm/src/Dockerfile.amd_gpu
image: ${REGISTRY:-opea}/vllm-rocm:${TAG:-latest}
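The new `vllm-rocm` build entry resolves its image reference with the same `${VAR:-default}` fallbacks used throughout `build.yaml`. A sketch of what the reference becomes when `REGISTRY` and `TAG` are left unset:

```shell
# Sketch: image reference resolution for the vllm-rocm entry
unset REGISTRY TAG
image="${REGISTRY:-opea}/vllm-rocm:${TAG:-latest}"   # defaults apply when unset
echo "building image: $image"
```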
4 changes: 2 additions & 2 deletions CodeTrans/tests/test_compose_on_rocm.sh
@@ -57,7 +57,7 @@ function start_services() {
export CODETRANS_BACKEND_SERVICE_PORT=7777
export CODETRANS_NGINX_PORT=8088
export CODETRANS_BACKEND_SERVICE_URL="http://${ip_address}:${CODETRANS_BACKEND_SERVICE_PORT}/v1/codetrans"
-export host_ip=${ip_address}
+export HOST_IP=${ip_address}

sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env

@@ -111,7 +111,7 @@ function validate_microservices() {
"codetrans-tgi-service" \
"codetrans-tgi-service" \
'{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}'

+    sleep 10
# llm microservice
validate_services \
"${ip_address}:${CODETRANS_LLM_SERVICE_PORT}/v1/chat/completions" \
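The test script pauses between service checks (the added `sleep 10` before the llm microservice validation). A hedged sketch of a retry loop one might use for the same purpose; the helper name and URLs are illustrative, not part of the repository:

```shell
# Sketch: a minimal wait-until-healthy helper (illustrative, not from the repo)
wait_for_health() {
  url=$1    # health endpoint to poll
  tries=$2  # number of attempts before giving up
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -sf "$url" > /dev/null 2>&1; then
      echo "healthy after $i retries"
      return 0
    fi
    i=$(( i + 1 ))
    sleep 1
  done
  echo "gave up after $tries tries"
  return 1
}
```

Calling `wait_for_health "http://${HOST_IP}:18156/health" 100` would mirror the compose healthcheck budget from the shell side.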