Add milvus support for data-prep #468

Merged 1 commit on Oct 18, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,7 +1,9 @@
 **/bin/
 *.out
 *.swp
+**/Chart.lock
+**/charts/*.tgz

 bazel-*
 compile_commands.json
 .gitconfig
4 changes: 4 additions & 0 deletions helm-charts/common/data-prep/Chart.yaml
@@ -17,3 +17,7 @@ dependencies:
     version: 1.0.0
     repository: file://../redis-vector-db
     condition: redis-vector-db.enabled
+  - name: milvus
+    version: 4.2.12
+    repository: https://zilliztech.github.io/milvus-helm/
+    condition: milvus.enabled
4 changes: 4 additions & 0 deletions helm-charts/common/data-prep/README.md
@@ -52,3 +52,7 @@ curl http://localhost:6007/v1/dataprep \
 | service.port | string | `"6007"` | |
 | REDIS_URL | string | `""` | |
 | TEI_EMBEDDING_ENDPOINT | string | `""` | |
+
+## Milvus support
+
+Refer to the milvus-values.yaml for milvus configurations.
2 changes: 2 additions & 0 deletions helm-charts/common/data-prep/ci-values.yaml
@@ -9,3 +9,5 @@ tei:
   enabled: true
 redis-vector-db:
   enabled: true
+milvus:
+  enabled: false
33 changes: 33 additions & 0 deletions helm-charts/common/data-prep/milvus-values.yaml
@@ -0,0 +1,33 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+# Default values for data-prep.
+# This is a YAML-formatted file.
+# Declare variables to be passed into your templates.
+milvus:
+  enabled: true
+  cluster:
+    enabled: false
+  etcd:
+    replicaCount: 1
+  pulsar:
+    enabled: false
+  minio:
+    mode: standalone
+redis-vector-db:
+  enabled: false
+tei:
+  enabled: true
+
+image:
+  repository: opea/dataprep-milvus
+
+port: 6010
+# text embedding inference service URL, e.g. http://<service-name>:<port>
+#TEI_EMBEDDING_ENDPOINT: "http://embedding-tei:80"
+# milvus DB configurations
+#MILVUS_HOST: "milvustest"
+MILVUS_PORT: "19530"
+COLLECTION_NAME: "rag_milvus"
+MOSEC_EMBEDDING_ENDPOINT: ""
+MOSEC_EMBEDDING_MODEL: ""
23 changes: 20 additions & 3 deletions helm-charts/common/data-prep/templates/configmap.yaml
@@ -8,12 +8,19 @@ metadata:
   labels:
     {{- include "data-prep.labels" . | nindent 4 }}
 data:
-  {{- if .Values.TEI_EMBEDDING_ENDPOINT }}
+  {{- if .Values.MOSEC_EMBEDDING_ENDPOINT }}
+  MOSEC_EMBEDDING_ENDPOINT: {{ .Values.MOSEC_EMBEDDING_ENDPOINT | quote}}
+  MOSEC_EMBEDDING_MODEL: {{ .Values.MOSEC_EMBEDDING_MODEL | quote}}
+  {{- else if .Values.TEI_EMBEDDING_ENDPOINT }}
   TEI_ENDPOINT: {{ .Values.TEI_EMBEDDING_ENDPOINT | quote}}
-  {{- else if not .Values.EMBED_MODEL }}
+  TEI_EMBEDDING_ENDPOINT: {{ .Values.TEI_EMBEDDING_ENDPOINT | quote}}
+  {{- else if not .Values.LOCAL_EMBEDDING_MODEL }}
   TEI_ENDPOINT: "http://{{ .Release.Name }}-tei"
   {{- end }}
-  EMBED_MODEL: {{ .Values.EMBED_MODEL | quote }}
+  {{- if .Values.LOCAL_EMBEDDING_MODEL }}
+  EMBED_MODEL: {{ .Values.LOCAL_EMBEDDING_MODEL | quote }}
+  LOCAL_EMBEDDING_MODEL: {{ .Values.LOCAL_EMBEDDING_MODEL | quote }}
+  {{- end }}
   {{- if .Values.REDIS_URL }}
   REDIS_URL: {{ .Values.REDIS_URL | quote}}
   {{- else }}
@@ -22,6 +29,16 @@ data:
   INDEX_NAME: {{ .Values.INDEX_NAME | quote }}
   KEY_INDEX_NAME: {{ .Values.KEY_INDEX_NAME | quote }}
   SEARCH_BATCH_SIZE: {{ .Values.SEARCH_BATCH_SIZE | quote }}
+  {{- if .Values.MILVUS_HOST }}
+  MILVUS_HOST: {{ .Values.MILVUS_HOST | quote }}
+  {{- else }}
+  MILVUS_HOST: "{{ .Release.Name }}-milvus"
+  {{- end }}
+  MILVUS: {{ .Values.MILVUS_HOST | quote }}
+  MILVUS_PORT: {{ .Values.MILVUS_PORT | quote }}
+  {{- if .Values.COLLECTION_NAME }}
+  COLLECTION_NAME: {{ .Values.COLLECTION_NAME | quote }}
+  {{- end }}
   HUGGINGFACEHUB_API_TOKEN: {{ .Values.global.HUGGINGFACEHUB_API_TOKEN | quote}}
   HF_HOME: "/tmp/.cache/huggingface"
   {{- if .Values.global.HF_ENDPOINT }}
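To make the branching in this ConfigMap template easier to follow, here is a sketch of the Milvus- and embedding-related entries it would likely render for data-prep. It assumes a release named `myrag` (a hypothetical name) installed with `-f milvus-values.yaml` and no further overrides. Keys from parts of the template not shown in this diff are omitted.

```yaml
# Illustrative render only; "myrag" is a hypothetical release name.
data:
  # MOSEC_EMBEDDING_ENDPOINT, TEI_EMBEDDING_ENDPOINT and LOCAL_EMBEDDING_MODEL
  # are all empty, so the chain falls through to the bundled TEI service.
  TEI_ENDPOINT: "http://myrag-tei"
  # MILVUS_HOST is unset, so the else branch derives it from the release name.
  MILVUS_HOST: "myrag-milvus"
  # MILVUS reads .Values.MILVUS_HOST directly and therefore stays empty here.
  MILVUS: ""
  MILVUS_PORT: "19530"
  COLLECTION_NAME: "rag_milvus"
```

Note that `MILVUS` does not pick up the release-name default that `MILVUS_HOST` gets, so the two keys differ unless `MILVUS_HOST` is set explicitly.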
2 changes: 1 addition & 1 deletion helm-charts/common/data-prep/templates/deployment.yaml
@@ -43,7 +43,7 @@ spec:
           imagePullPolicy: {{ .Values.image.pullPolicy }}
           ports:
             - name: data-prep
-              containerPort: 6007
+              containerPort: {{ .Values.port }}
               protocol: TCP
           volumeMounts:
             - mountPath: /tmp
2 changes: 1 addition & 1 deletion helm-charts/common/data-prep/templates/service.yaml
@@ -11,7 +11,7 @@ spec:
   type: {{ .Values.service.type }}
   ports:
     - port: {{ .Values.service.port }}
-      targetPort: 6007
+      targetPort: {{ .Values.port }}
       protocol: TCP
       name: data-prep
   selector:
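With `containerPort` and `targetPort` now both read from `.Values.port`, the Service port and the container port can differ. As a sketch, this is the ports stanza the Service template would likely render for data-prep when `milvus-values.yaml` is applied, assuming `service.port` keeps its default of 6007 from values.yaml while `port` is overridden to 6010.

```yaml
# Illustrative render only: clients still reach the Service on 6007,
# and traffic is forwarded to the container listening on 6010.
ports:
  - port: 6007
    targetPort: 6010
    protocol: TCP
    name: data-prep
```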
4 changes: 4 additions & 0 deletions helm-charts/common/data-prep/templates/tests/test-pod.yaml
@@ -27,5 +27,9 @@ spec:
             curlcode=$?
             if [[ $curlcode -eq 7 ]]; then sleep 10; else echo "curl failed with code $curlcode"; exit 1; fi;
           done;
+          curl http://{{ include "data-prep.fullname" . }}:{{ .Values.service.port }}/v1/dataprep/delete_file -sS \
+            -X POST \
+            -H "Content-Type: application/json" \
+            -d '{"file_path": "file1.txt"}';
           if [ $i -gt $max_retry ]; then echo "test failed with maximum retry"; exit 1; fi
   restartPolicy: Never
12 changes: 11 additions & 1 deletion helm-charts/common/data-prep/values.yaml
@@ -7,6 +7,8 @@

 tei:
   enabled: false
+milvus:
+  enabled: false
 redis-vector-db:
   enabled: false

@@ -38,6 +40,7 @@ securityContext:
   seccompProfile:
     type: RuntimeDefault

+port: 6007
 service:
   type: ClusterIP
   port: 6007
@@ -89,14 +92,21 @@ LOGFLAG: ""
 TEI_EMBEDDING_ENDPOINT: ""

 # local embedder's model
-EMBED_MODEL: ""
+LOCAL_EMBEDDING_MODEL: ""

 # redis DB service URL, e.g. redis://<service-name>:<port>
 REDIS_URL: ""
 INDEX_NAME: "rag-redis"
 KEY_INDEX_NAME: "file-keys"
 SEARCH_BATCH_SIZE: 10

+# milvus DB configurations
+MILVUS_HOST: ""
+MILVUS_PORT: ""
+COLLECTION_NAME: ""
+MOSEC_EMBEDDING_ENDPOINT: ""
+MOSEC_EMBEDDING_MODEL: ""
+
 global:
   http_proxy: ""
   https_proxy: ""
4 changes: 4 additions & 0 deletions helm-charts/common/retriever-usvc/Chart.yaml
@@ -17,3 +17,7 @@ dependencies:
     version: 1.0.0
     repository: file://../redis-vector-db
     condition: redis-vector-db.enabled
+  - name: milvus
+    version: 4.2.12
+    repository: https://zilliztech.github.io/milvus-helm/
+    condition: milvus.enabled
4 changes: 4 additions & 0 deletions helm-charts/common/retriever-usvc/README.md
@@ -52,3 +52,7 @@ curl http://localhost:7000/v1/retrieval \
 | service.port | string | `"7000"` | |
 | REDIS_URL | string | `""` | |
 | TEI_EMBEDDING_ENDPOINT | string | `""` | |
+
+## Milvus support
+
+Refer to the milvus-values.yaml for milvus configurations.
2 changes: 2 additions & 0 deletions helm-charts/common/retriever-usvc/ci-values.yaml
@@ -9,3 +9,5 @@ tei:
   enabled: true
 redis-vector-db:
   enabled: true
+milvus:
+  enabled: false
33 changes: 33 additions & 0 deletions helm-charts/common/retriever-usvc/milvus-values.yaml
@@ -0,0 +1,33 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+# Default values for retriever-usvc.
+# This is a YAML-formatted file.
+# Declare variables to be passed into your templates.
+
+milvus:
+  enabled: true
+  cluster:
+    enabled: false
+  etcd:
+    replicaCount: 1
+  pulsar:
+    enabled: false
+  minio:
+    mode: standalone
+redis-vector-db:
+  enabled: false
+tei:
+  enabled: true
+
+image:
+  repository: opea/retriever-milvus
+port: 7000
+# text embedding inference service URL, e.g. http://<service-name>:<port>
+#TEI_EMBEDDING_ENDPOINT: "http://dataprep-tei:80"
+# milvus DB configurations
+#MILVUS_HOST: "dataprep-milvus"
+MILVUS_PORT: "19530"
+COLLECTION_NAME: "rag_milvus"
+MOSEC_EMBEDDING_ENDPOINT: ""
+MOSEC_EMBEDDING_MODEL: ""
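The commented-out `TEI_EMBEDDING_ENDPOINT` and `MILVUS_HOST` lines hint at running the retriever against services owned by a separate data-prep release. A minimal sketch of such an override, layered on top of this milvus-values.yaml, might look like the following. It assumes a data-prep release named `dataprep`, whose Milvus and TEI services would then be reachable as `dataprep-milvus` and `dataprep-tei` by the `<release>-milvus` and `<release>-tei` naming used in the ConfigMap defaults. The file itself is hypothetical and not part of this PR.

```yaml
# Hypothetical override file for retriever-usvc (not part of this PR).
# Assumes a separate data-prep release named "dataprep" already runs Milvus and TEI.
milvus:
  enabled: false          # do not deploy a second Milvus instance
tei:
  enabled: false          # reuse the TEI service from the data-prep release
TEI_EMBEDDING_ENDPOINT: "http://dataprep-tei:80"
MILVUS_HOST: "dataprep-milvus"
```

Both releases would need the same `COLLECTION_NAME` ("rag_milvus" in the two milvus-values.yaml files) so that the retriever reads the collection that data-prep writes.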
22 changes: 19 additions & 3 deletions helm-charts/common/retriever-usvc/templates/configmap.yaml
@@ -8,18 +8,34 @@ metadata:
   labels:
     {{- include "retriever-usvc.labels" . | nindent 4 }}
 data:
-  {{- if .Values.TEI_EMBEDDING_ENDPOINT }}
+  {{- if .Values.MOSEC_EMBEDDING_ENDPOINT }}
+  MOSEC_EMBEDDING_ENDPOINT: {{ .Values.MOSEC_EMBEDDING_ENDPOINT | quote}}
+  MOSEC_EMBEDDING_MODEL: {{ .Values.MOSEC_EMBEDDING_MODEL | quote}}
+  {{- else if .Values.TEI_EMBEDDING_ENDPOINT }}
   TEI_EMBEDDING_ENDPOINT: {{ .Values.TEI_EMBEDDING_ENDPOINT | quote }}
-  {{- else if not .Values.EMBED_MODEL }}
+  {{- else if not .Values.LOCAL_EMBEDDING_MODEL }}
   TEI_EMBEDDING_ENDPOINT: "http://{{ .Release.Name }}-tei"
   {{- end }}
-  EMBED_MODEL: {{ .Values.EMBED_MODEL | quote }}
+  {{- if .Values.LOCAL_EMBEDDING_MODEL }}
+  EMBED_MODEL: {{ .Values.LOCAL_EMBEDDING_MODEL | quote }}
+  LOCAL_EMBEDDING_MODEL: {{ .Values.LOCAL_EMBEDDING_MODEL | quote }}
+  {{- end }}
   {{- if .Values.REDIS_URL }}
   REDIS_URL: {{ .Values.REDIS_URL | quote}}
   {{- else }}
   REDIS_URL: "redis://{{ .Release.Name }}-redis-vector-db:6379"
   {{- end }}
   INDEX_NAME: {{ .Values.INDEX_NAME | quote }}
+  {{- if .Values.MILVUS_HOST }}
+  MILVUS_HOST: {{ .Values.MILVUS_HOST | quote }}
+  {{- else }}
+  MILVUS_HOST: "{{ .Release.Name }}-milvus"
+  {{- end }}
+  MILVUS: {{ .Values.MILVUS_HOST | quote }}
+  MILVUS_PORT: {{ .Values.MILVUS_PORT | quote }}
+  {{- if .Values.COLLECTION_NAME }}
+  COLLECTION_NAME: {{ .Values.COLLECTION_NAME | quote }}
+  {{- end }}
   EASYOCR_MODULE_PATH: "/tmp/.EasyOCR"
   http_proxy: {{ .Values.global.http_proxy | quote }}
   https_proxy: {{ .Values.global.https_proxy | quote }}
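The embedding settings in this template form an if / else-if chain, so a non-empty `MOSEC_EMBEDDING_ENDPOINT` takes precedence and no `TEI_EMBEDDING_ENDPOINT` entry is emitted. A sketch of a values override that would select the MOSEC branch is shown below. The endpoint URL and model name are placeholders invented for illustration, not values taken from this PR.

```yaml
# Hypothetical override: the URL and model name below are placeholders.
tei:
  enabled: false          # the TEI branch is skipped once a MOSEC endpoint is set
MOSEC_EMBEDDING_ENDPOINT: "http://mosec-embedding:80"
MOSEC_EMBEDDING_MODEL: "example-embedding-model"
```

With these values the ConfigMap would carry `MOSEC_EMBEDDING_ENDPOINT` and `MOSEC_EMBEDDING_MODEL` only. Leaving `MOSEC_EMBEDDING_ENDPOINT` empty restores the TEI or local-model behaviour.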
@@ -43,7 +43,7 @@ spec:
           imagePullPolicy: {{ .Values.image.pullPolicy }}
           ports:
             - name: retriever-usvc
-              containerPort: 7000
+              containerPort: {{ .Values.port }}
               protocol: TCP
           volumeMounts:
             - mountPath: /tmp
2 changes: 1 addition & 1 deletion helm-charts/common/retriever-usvc/templates/service.yaml
@@ -11,7 +11,7 @@ spec:
   type: {{ .Values.service.type }}
   ports:
     - port: {{ .Values.service.port }}
-      targetPort: 7000
+      targetPort: {{ .Values.port }}
       protocol: TCP
       name: retriever-usvc
   selector:
12 changes: 11 additions & 1 deletion helm-charts/common/retriever-usvc/values.yaml
@@ -7,6 +7,8 @@

 tei:
   enabled: false
+milvus:
+  enabled: false
 redis-vector-db:
   enabled: false

@@ -17,7 +19,7 @@ replicaCount: 1
 LOGFLAG: ""

 TEI_EMBEDDING_ENDPOINT: ""
-EMBED_MODEL: ""
+LOCAL_EMBEDDING_MODEL: ""

 REDIS_URL: ""
 INDEX_NAME: "rag-redis"
@@ -48,6 +50,7 @@ securityContext:
   seccompProfile:
     type: RuntimeDefault

+port: 7000
 service:
   type: ClusterIP
   # The default port for retriever service is 7000
@@ -92,6 +95,13 @@ tolerations: []

 affinity: {}

+# milvus DB configurations
+MILVUS_HOST: ""
+MILVUS_PORT: ""
+COLLECTION_NAME: ""
+MOSEC_EMBEDDING_ENDPOINT: ""
+MOSEC_EMBEDDING_MODEL: ""
+
 global:
   http_proxy: ""
   https_proxy: ""