
Commit f37ce2c

Authored by letonghan, pre-commit-ci[bot], zehao-intel, Spycsh, and chensuyue
Support Embedding Microservice with Llama Index (#150)
* fix stream=false doesn't work issue
* support embedding comp with llama_index
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* Add More Contents to the Table of MicroService (#141)
  * Add More Contents to the Table of MicroService
  * reorder
  * Update README.md
  * refine structure
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * fix model
  * refine table
  * put llm to the ground
* Use common security content for OPEA projects (#151)
  * add python coverage
  * docs update
  * Revert "add python coverage" (reverts commit 69615b1)
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
* Enable vLLM Gaudi support for LLM service based on the official Habana vLLM release (#137)
* support embedding comp with llama_index
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* add test script for embedding llama_index
* remove conflict requirements
* update test script
* update
* update
* update
* fix ut issue

Signed-off-by: letonghan <[email protected]>
Signed-off-by: zehao-intel <[email protected]>
Signed-off-by: chensuyue <[email protected]>
Signed-off-by: tianyil1 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: zehao-intel <[email protected]>
Co-authored-by: Sihan Chen <[email protected]>
Co-authored-by: chen, suyue <[email protected]>
Co-authored-by: Tianyi Liu <[email protected]>
1 parent f7443f2 commit f37ce2c

File tree

8 files changed (+217 additions, −1 deletion)


comps/embeddings/README.md

Lines changed: 21 additions & 1 deletion
@@ -27,7 +27,10 @@ For both of the implementations, you need to install requirements first.
 ## 1.1 Install Requirements

 ```bash
+# run with langchain
 pip install -r langchain/requirements.txt
+# run with llama_index
+pip install -r llama_index/requirements.txt
 ```

 ## 1.2 Start Embedding Service
@@ -57,8 +60,12 @@ curl localhost:$your_port/embed \
 Start the embedding service with the TEI_EMBEDDING_ENDPOINT.

 ```bash
+# run with langchain
 cd langchain
+# run with llama_index
+cd llama_index
 export TEI_EMBEDDING_ENDPOINT="http://localhost:$yourport"
+export TEI_EMBEDDING_MODEL_NAME="BAAI/bge-large-en-v1.5"
 export LANGCHAIN_TRACING_V2=true
 export LANGCHAIN_API_KEY=${your_langchain_api_key}
 export LANGCHAIN_PROJECT="opea/gen-ai-comps:embeddings"
@@ -68,7 +75,10 @@ python embedding_tei_gaudi.py
 ### Start Embedding Service with Local Model

 ```bash
+# run with langchain
 cd langchain
+# run with llama_index
+cd llama_index
 python local_embedding.py
 ```

@@ -98,19 +108,29 @@ Export the `TEI_EMBEDDING_ENDPOINT` for later usage:

 ```bash
 export TEI_EMBEDDING_ENDPOINT="http://localhost:$yourport"
+export TEI_EMBEDDING_MODEL_NAME="BAAI/bge-large-en-v1.5"
 ```

 ## 2.2 Build Docker Image

+### Build Langchain Docker (Option a)
+
 ```bash
 cd ../../
 docker build -t opea/embedding-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/langchain/docker/Dockerfile .
 ```

+### Build LlamaIndex Docker (Option b)
+
+```bash
+cd ../../
+docker build -t opea/embedding-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/llama_index/docker/Dockerfile .
+```
+
 ## 2.3 Run Docker with CLI

 ```bash
-docker run -d --name="embedding-tei-server" -p 6000:6000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT opea/embedding-tei:latest
+docker run -d --name="embedding-tei-server" -p 6000:6000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT -e TEI_EMBEDDING_MODEL_NAME=$TEI_EMBEDDING_MODEL_NAME opea/embedding-tei:latest
 ```

 ## 2.4 Run Docker with Docker Compose
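
Once the container from README section 2.3 is up, a quick smoke test looks like the following (a minimal sketch; port 6000, the /v1/embeddings route, and the payload all match the service definition and test script added in this commit):

```bash
# Assumes the embedding-tei-server container is running on the default port 6000
curl http://localhost:6000/v1/embeddings \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"text":"What is Deep Learning?"}'
# A successful reply echoes the input text along with an "embedding" array of floats
```
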
Lines changed: 2 additions & 0 deletions

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

comps/embeddings/llama_index/docker/Dockerfile

Lines changed: 30 additions & 0 deletions

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM ubuntu:22.04

RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
    libgl1-mesa-glx \
    libjemalloc-dev \
    vim \
    python3 \
    python3-pip

RUN useradd -m -s /bin/bash user && \
    mkdir -p /home/user && \
    chown -R user /home/user/

USER user

COPY comps /home/user/comps

RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir -r /home/user/comps/embeddings/llama_index/requirements.txt

ENV PYTHONPATH=$PYTHONPATH:/home/user

WORKDIR /home/user/comps/embeddings/llama_index

ENTRYPOINT ["python3", "embedding_tei_gaudi.py"]
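
The image builds from the repo root exactly as the README's Option b shows. The optional import check below is an illustrative sketch: the --entrypoint override is a standard docker flag, and the module path matches the import in embedding_tei_gaudi.py, but the check itself is an assumption, not part of this commit:

```bash
# Build the LlamaIndex embedding image (mirrors README section 2.2, Option b)
docker build -t opea/embedding-tei:latest \
  --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy \
  -f comps/embeddings/llama_index/docker/Dockerfile .

# Optional sanity check (assumption): confirm the TEI embedding client installed
docker run --rm --entrypoint python3 opea/embedding-tei:latest \
  -c "from llama_index.embeddings.text_embeddings_inference import TextEmbeddingsInference; print('ok')"
```
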
Lines changed: 23 additions & 0 deletions

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

version: "3.8"

services:
  embedding:
    image: opea/embedding-tei:latest
    container_name: embedding-tei-server
    ports:
      - "6000:6000"
    ipc: host
    environment:
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT}
      TEI_EMBEDDING_MODEL_NAME: ${TEI_EMBEDDING_MODEL_NAME}
      LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
    restart: unless-stopped

networks:
  default:
    driver: bridge
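
With the variables from README section 2.1 exported, the compose route is a one-liner. A sketch, assuming this file is saved as docker_compose_embedding.yaml — this view does not show the file's actual path:

```bash
# Assumption: compose file name and working directory; variables mirror README section 2.1
export TEI_EMBEDDING_ENDPOINT="http://localhost:$yourport"
export TEI_EMBEDDING_MODEL_NAME="BAAI/bge-large-en-v1.5"
docker compose -f docker_compose_embedding.yaml up -d
docker logs embedding-tei-server
```
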
comps/embeddings/llama_index/embedding_tei_gaudi.py

Lines changed: 34 additions & 0 deletions

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import os

from langsmith import traceable
from llama_index.embeddings.text_embeddings_inference import TextEmbeddingsInference

from comps import EmbedDoc768, ServiceType, TextDoc, opea_microservices, register_microservice


@register_microservice(
    name="opea_service@embedding_tgi_gaudi",
    service_type=ServiceType.EMBEDDING,
    endpoint="/v1/embeddings",
    host="0.0.0.0",
    port=6000,
    input_datatype=TextDoc,
    output_datatype=EmbedDoc768,
)
@traceable(run_type="embedding")
def embedding(input: TextDoc) -> EmbedDoc768:
    # Query the TEI endpoint, then truncate: BAAI/bge-large-en-v1.5 emits
    # 1024-dim vectors, while EmbedDoc768 carries only 768 elements.
    embed_vector = embeddings._get_query_embedding(input.text)
    embed_vector = embed_vector[:768]  # keep only the first 768 elements
    res = EmbedDoc768(text=input.text, embedding=embed_vector)
    return res


if __name__ == "__main__":
    tei_embedding_model_name = os.getenv("TEI_EMBEDDING_MODEL_NAME", "BAAI/bge-large-en-v1.5")
    tei_embedding_endpoint = os.getenv("TEI_EMBEDDING_ENDPOINT", "http://localhost:8090")
    embeddings = TextEmbeddingsInference(model_name=tei_embedding_model_name, base_url=tei_embedding_endpoint)
    print("TEI Gaudi Embedding initialized.")
    opea_microservices["opea_service@embedding_tgi_gaudi"].start()
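
To run this file outside Docker, point it at an existing TEI endpoint, as README section 1.2 describes (a sketch; the endpoint value below is simply the code's own default):

```bash
# Run the TEI-backed service directly; values mirror the defaults in the code above
export TEI_EMBEDDING_ENDPOINT="http://localhost:8090"
export TEI_EMBEDDING_MODEL_NAME="BAAI/bge-large-en-v1.5"
python embedding_tei_gaudi.py
```
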
comps/embeddings/llama_index/local_embedding.py

Lines changed: 28 additions & 0 deletions

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from langsmith import traceable
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

from comps import EmbedDoc1024, ServiceType, TextDoc, opea_microservices, register_microservice


@register_microservice(
    name="opea_service@local_embedding",
    service_type=ServiceType.EMBEDDING,
    endpoint="/v1/embeddings",
    host="0.0.0.0",
    port=6000,
    input_datatype=TextDoc,
    output_datatype=EmbedDoc1024,
)
@traceable(run_type="embedding")
def embedding(input: TextDoc) -> EmbedDoc1024:
    # The in-process model returns the full 1024-dim vector; no truncation here.
    embed_vector = embeddings.get_text_embedding(input.text)
    res = EmbedDoc1024(text=input.text, embedding=embed_vector)
    return res


if __name__ == "__main__":
    embeddings = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
    opea_microservices["opea_service@local_embedding"].start()
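
A quick way to confirm the local service really returns full-width vectors (a sketch; assumes jq is available and the first-run model download has finished):

```bash
# Start the local-model service in the background, then check the vector width
python local_embedding.py &
sleep 60   # first run downloads BAAI/bge-large-en-v1.5
curl -s http://localhost:6000/v1/embeddings \
  -X POST -H 'Content-Type: application/json' \
  -d '{"text":"What is Deep Learning?"}' | jq '.embedding | length'
# Expected: 1024 (EmbedDoc1024), versus 768 from the truncating TEI service above
```
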
comps/embeddings/llama_index/requirements.txt

Lines changed: 9 additions & 0 deletions

docarray[full]
fastapi
huggingface_hub
langsmith
llama-index-embeddings-text-embeddings-inference
opentelemetry-api
opentelemetry-exporter-otlp
opentelemetry-sdk
shortuuid

tests/test_embeddings_llama_index.sh

Lines changed: 70 additions & 0 deletions
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -xe

WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
ip_address=$(hostname -I | awk '{print $1}')

function build_docker_images() {
    cd $WORKPATH
    echo $(pwd)
    docker build --no-cache -t opea/embedding-tei:comps --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/llama_index/docker/Dockerfile .
}

function start_service() {
    tei_endpoint=5001
    model="BAAI/bge-large-en-v1.5"
    revision="refs/pr/5"
    docker run -d --name="test-comps-embedding-tei-endpoint" -p $tei_endpoint:80 -v ./data:/data -e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.2 --model-id $model --revision $revision
    export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:${tei_endpoint}"
    tei_service_port=5010
    docker run -d --name="test-comps-embedding-tei-server" -e http_proxy=$http_proxy -e https_proxy=$https_proxy -p ${tei_service_port}:6000 --ipc=host -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT opea/embedding-tei:comps
    sleep 3m
}

function validate_microservice() {
    tei_service_port=5010
    URL="http://${ip_address}:$tei_service_port/v1/embeddings"
    docker logs test-comps-embedding-tei-server >> ${LOG_PATH}/embedding.log
    HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST -d '{"text":"What is Deep Learning?"}' -H 'Content-Type: application/json' "$URL")
    if [ "$HTTP_STATUS" -eq 200 ]; then
        echo "[ embedding - llama_index ] HTTP status is 200. Checking content..."
        local CONTENT=$(curl -s -X POST -d '{"text":"What is Deep Learning?"}' -H 'Content-Type: application/json' "$URL" | tee ${LOG_PATH}/embedding.log)

        # Verify the response echoes the input text and includes an embedding array.
        if echo "$CONTENT" | grep -q '"text":"What is Deep Learning?","embedding":\['; then
            echo "[ embedding - llama_index ] Content is as expected."
        else
            echo "[ embedding - llama_index ] Content does not match the expected result: $CONTENT"
            docker logs test-comps-embedding-tei-server >> ${LOG_PATH}/embedding.log
            exit 1
        fi
    else
        echo "[ embedding - llama_index ] HTTP status is not 200. Received status was $HTTP_STATUS"
        docker logs test-comps-embedding-tei-server >> ${LOG_PATH}/embedding.log
        exit 1
    fi
}

function stop_docker() {
    cid=$(docker ps -aq --filter "name=test-comps-embedding-*")
    if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
}

function main() {

    stop_docker

    build_docker_images
    start_service

    validate_microservice

    stop_docker
    echo y | docker system prune

}

main
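
To exercise this commit end to end, the script is meant to be invoked from the repository's tests directory, since WORKPATH resolves to the parent of the current directory (a sketch):

```bash
# From the repository root: builds the image, starts both containers,
# validates the endpoint, and cleans up afterwards
cd tests
bash test_embeddings_llama_index.sh
# Trace output is echoed via set -x; service logs land in tests/embedding.log
```
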
