
Commit 445c9b1

Authored by s-gobriel, pre-commit-ci[bot], chensuyue, BaoHuiling, and XuhuiRen
add VDMS retriever microservice for v0.9 Milestone (#539)
* add VDMS retriever microservice
* add retrieval gateway and logger back to init
* use 5009 in CI
* change index_name to collection_name
* fix var name
* use index name all
* add deps
* changes to address code reviews
* resolve docarray
* add optional docarray embeddoc constraints
* fix bug in comment
* import DEBUG
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)

Signed-off-by: s-gobriel <[email protected]>
Signed-off-by: BaoHuiling <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <[email protected]>
Co-authored-by: BaoHuiling <[email protected]>
Co-authored-by: XuhuiRen <[email protected]>
1 parent 01886fe commit 445c9b1

File tree

14 files changed: +643 −8 lines changed

comps/retrievers/langchain/README.md

Lines changed: 7 additions & 3 deletions
@@ -6,14 +6,18 @@ The service primarily utilizes similarity measures in vector space to rapidly re
 
 Overall, this microservice provides robust backend support for applications requiring efficient similarity searches, playing a vital role in scenarios such as recommendation systems, information retrieval, or any other context where precise measurement of document similarity is crucial.
 
-## Retriever Microservice with Redis
+# Retriever Microservice with Redis
 
 For details, please refer to this [readme](redis/README.md)
 
-## Retriever Microservice with Milvus
+# Retriever Microservice with Milvus
 
 For details, please refer to this [readme](milvus/README.md)
 
-## Retriever Microservice with PGVector
+# Retriever Microservice with PGVector
 
 For details, please refer to this [readme](pgvector/README.md)
+
+# Retriever Microservice with VDMS
+
+For details, please refer to this [readme](vdms/README.md)
Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
# Retriever Microservice

This retriever microservice is a highly efficient search service designed for handling and retrieving embedding vectors. It operates by receiving an embedding vector as input and conducting a similarity search against vectors stored in a VectorDB database. Users must specify the VectorDB's host, port, and the index/collection name; the service then searches that index for the documents most similar to the input vector.

The service primarily utilizes similarity measures in vector space to rapidly retrieve contextually similar documents. The vector-based retrieval approach is particularly suited to large datasets, offering fast and accurate search results that significantly enhance the efficiency and quality of information retrieval.

Overall, this microservice provides robust backend support for applications requiring efficient similarity searches, playing a vital role in scenarios such as recommendation systems, information retrieval, or any other context where precise measurement of document similarity is crucial.

# Visual Data Management System (VDMS)

VDMS is a storage solution for efficient access to big "visual" data. It aims to achieve cloud scale by searching for relevant visual data via visual metadata stored as a graph, and by enabling machine-friendly enhancements to visual data for faster access.

VDMS offers the functionality of a VectorDB: it provides multiple engines to index a large number of embeddings and to search them for similarity. Depending on the use case, the chosen engine trades off indexing speed, search speed, total memory footprint, and search accuracy.

VDMS also supports a graph database to store the metadata associated with each vector embedding and to retrieve it, supporting relationships ranging from simple to very complex.

In summary, VDMS supports:

- K nearest neighbor search
- Euclidean distance (L2) and inner product (IP)
- Libraries for indexing and computing distances: TileDBDense, TileDBSparse, FaissFlat (default), FaissIVFFlat, Flinng
- Embeddings for text, images, and video
- Vector and metadata searches
- Scalability to allow for definition of different relationships across the metadata
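For intuition, the two distance measures listed above can be sketched in plain Python (no VDMS required; `l2_distance` and `inner_product` are illustrative helpers, not part of the VDMS API):

```python
import math

def l2_distance(a, b):
    # Euclidean (L2) distance: smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    # Inner product (IP): larger means more similar (for normalized vectors).
    return sum(x * y for x, y in zip(a, b))

query = [1.0, 0.0]
docs = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0]}

# doc_a is identical to the query, doc_b is orthogonal to it.
print(l2_distance(query, docs["doc_a"]))   # 0.0
print(inner_product(query, docs["doc_a"]))  # 1.0
print(inner_product(query, docs["doc_b"]))  # 0.0
```

Note that L2 ranks by smallest distance while IP ranks by largest score, which is why the engine choice affects how thresholds are interpreted.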
# 🚀1. Start Microservice with Python (Option 1)

To start the retriever microservice, you must first install the required Python packages.

## 1.1 Install Requirements

```bash
pip install -r requirements.txt
```
## 1.2 Start TEI Service

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=${your_langchain_api_key}
export LANGCHAIN_PROJECT="opea/retriever"
model=BAAI/bge-base-en-v1.5
revision=refs/pr/4
volume=$PWD/data
docker run -d -p 6060:80 -v $volume:/data -e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 --model-id $model --revision $revision
```
## 1.3 Verify the TEI Service

The service above runs an embedding model, so verify it against the `/embed` endpoint:

```bash
curl 127.0.0.1:6060/embed \
  -X POST \
  -d '{"inputs":"What is Deep Learning?"}' \
  -H 'Content-Type: application/json'
```
## 1.4 Setup VectorDB Service

You need to set up your own VectorDB service (VDMS in this example) and ingest your knowledge documents into the vector database.

For VDMS, you can start a docker container using the following command. Remember to ingest data into it manually.

```bash
docker run -d --name="vdms-vector-db" -p 55555:55555 intellabs/vdms:latest
```
## 1.5 Start Retriever Service

```bash
export TEI_EMBEDDING_ENDPOINT="http://${your_ip}:6060"
python langchain/retriever_vdms.py
```
# 🚀2. Start Microservice with Docker (Option 2)

## 2.1 Setup Environment Variables

```bash
export RETRIEVE_MODEL_ID="BAAI/bge-base-en-v1.5"
export INDEX_NAME=${your_index_name}  # also used as the collection name
export TEI_EMBEDDING_ENDPOINT="http://${your_ip}:6060"
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=${your_langchain_api_key}
export LANGCHAIN_PROJECT="opea/retrievers"
```
## 2.2 Build Docker Image

```bash
cd ../../
docker build -t opea/retriever-vdms:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/langchain/vdms/docker/Dockerfile .
```
To start a docker container, you have two options:

- A. Run Docker with CLI
- B. Run Docker with Docker Compose

You can choose either one as needed.

## 2.3 Run Docker with CLI (Option A)

```bash
docker run -d --name="retriever-vdms-server" -p 7000:7000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e INDEX_NAME=$INDEX_NAME -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT opea/retriever-vdms:latest
```
## 2.4 Run Docker with Docker Compose (Option B)

```bash
cd langchain/vdms/docker
docker compose -f docker_compose_retriever.yaml up -d
```
# 🚀3. Consume Retriever Service

## 3.1 Check Service Status

```bash
curl http://localhost:7000/v1/health_check \
  -X GET \
  -H 'Content-Type: application/json'
```
## 3.2 Consume Retriever Service

To consume the Retriever Microservice, you can generate a mock embedding vector of length 768 with Python.

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${your_ip}:7000/v1/retrieval \
  -X POST \
  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding}}" \
  -H 'Content-Type: application/json'
```
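The same mock request can be prepared in Python rather than inline bash. This sketch only builds the JSON body; posting it (e.g. with `requests`) is left to the reader, and the question text is simply the example used above:

```python
import json
import random

# Mock embedding of length 768, matching the bge-base-en-v1.5 dimension.
embedding = [random.uniform(-1, 1) for _ in range(768)]

payload = json.dumps({
    "text": "What is the revenue of Nike in 2023?",
    "embedding": embedding,
})

# The body round-trips cleanly and keeps the full vector.
decoded = json.loads(payload)
print(len(decoded["embedding"]))  # 768
```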
You can also set retrieval parameters. For a plain similarity search returning the top `k` results:

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
  -X POST \
  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity\", \"k\":4}" \
  -H 'Content-Type: application/json'
```
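The request bodies for the different search types differ only in a few optional fields, so they can be assembled with a small helper (`build_retrieval_body` is a hypothetical convenience function, not part of the microservice):

```python
import json

def build_retrieval_body(text, embedding, search_type="similarity", **kwargs):
    """Build the JSON body for /v1/retrieval.

    kwargs carries the optional knobs shown in the curl examples:
    k, distance_threshold, score_threshold, fetch_k, lambda_mult.
    """
    body = {"text": text, "embedding": embedding, "search_type": search_type}
    body.update(kwargs)
    return json.dumps(body)

body = build_retrieval_body("What is the revenue of Nike in 2023?",
                            [0.1] * 768, search_type="mmr",
                            k=4, fetch_k=20, lambda_mult=0.5)
print(json.loads(body)["search_type"])  # mmr
```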
To filter results by a maximum vector distance:

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
  -X POST \
  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_distance_threshold\", \"k\":4, \"distance_threshold\":1.0}" \
  -H 'Content-Type: application/json'
```

To keep only results above a minimum similarity score:

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
  -X POST \
  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_score_threshold\", \"k\":4, \"score_threshold\":0.2}" \
  -H 'Content-Type: application/json'
```

To use maximal marginal relevance (MMR), which balances relevance with diversity:

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
  -X POST \
  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"mmr\", \"k\":4, \"fetch_k\":20, \"lambda_mult\":0.5}" \
  -H 'Content-Type: application/json'
```
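For intuition about the `mmr` search type: maximal marginal relevance greedily selects results that are relevant to the query yet dissimilar to results already chosen, weighted by `lambda_mult`. A minimal sketch, assuming similarity scores are precomputed (illustrative only, not the service's implementation):

```python
def mmr_select(query_sim, doc_sim, k=2, lambda_mult=0.5):
    """Pick k document indices balancing relevance and diversity.

    query_sim: query_sim[i] = similarity(query, doc_i)
    doc_sim:   doc_sim[i][j] = similarity(doc_i, doc_j)
    """
    selected = []
    candidates = list(range(len(query_sim)))
    while candidates and len(selected) < k:
        def score(i):
            # Penalize closeness to anything already selected.
            diversity = max((doc_sim[i][j] for j in selected), default=0.0)
            return lambda_mult * query_sim[i] - (1 - lambda_mult) * diversity
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Docs 0 and 1 are near-duplicates; MMR picks 0, then the diverse doc 2.
query_sim = [0.9, 0.85, 0.5]
doc_sim = [[1.0, 0.95, 0.1], [0.95, 1.0, 0.1], [0.1, 0.1, 1.0]]
print(mmr_select(query_sim, doc_sim, k=2))  # [0, 2]
```

With `lambda_mult=1.0` the loop degenerates to plain top-k by relevance; lower values favor diversity, which is the trade-off `fetch_k` candidates are drawn from.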
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM langchain/langchain:latest

ARG ARCH="cpu"

RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
    libgl1-mesa-glx \
    libjemalloc-dev \
    iputils-ping \
    vim

RUN useradd -m -s /bin/bash user && \
    mkdir -p /home/user && \
    chown -R user /home/user/

COPY comps /home/user/comps

# RUN chmod +x /home/user/comps/retrievers/langchain/vdms/run.sh

USER user
RUN pip install --no-cache-dir --upgrade pip && \
    if [ ${ARCH} = "cpu" ]; then pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu; fi && \
    pip install --no-cache-dir -r /home/user/comps/retrievers/langchain/vdms/requirements.txt

RUN pip install -U langchain
RUN pip install -U langchain-community

RUN pip install --upgrade huggingface-hub

ENV PYTHONPATH=$PYTHONPATH:/home/user

ENV HUGGINGFACEHUB_API_TOKEN=dummy

ENV USECLIP=0

ENV no_proxy=localhost,127.0.0.1

ENV http_proxy=""
ENV https_proxy=""

WORKDIR /home/user/comps/retrievers/langchain/vdms

#ENTRYPOINT ["/home/user/comps/retrievers/langchain/vdms/run.sh"]
#ENTRYPOINT ["/bin/bash"]

ENTRYPOINT ["python", "retriever_vdms.py"]
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

version: "3.8"

services:
  tei_xeon_service:
    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
    container_name: tei-xeon-server
    ports:
      - "6060:80"
    volumes:
      - "./data:/data"
    shm_size: 1g
    command: --model-id ${RETRIEVE_MODEL_ID}
  retriever:
    image: opea/retriever-vdms:latest
    container_name: retriever-vdms-server
    ports:
      - "7000:7000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      INDEX_NAME: ${INDEX_NAME}
      LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
    restart: unless-stopped

networks:
  default:
    driver: bridge
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
docarray[full]
easyocr
einops
fastapi
langchain-community
langchain-core
langchain-huggingface
opentelemetry-api
opentelemetry-exporter-otlp
opentelemetry-sdk
prometheus-fastapi-instrumentator
pymupdf
sentence_transformers
shortuuid
uvicorn
vdms
