Commit fcfc586

lvliang-intel authored and Chingis Yundunov committed
Adapt code for dataprep microservice refactor (opea-project#1408)
opea-project/GenAIComps#1153

Signed-off-by: lvliang-intel <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
1 parent b046d01 commit fcfc586

91 files changed: +400 -354 lines changed

AgentQnA/docker_compose/amd/gpu/rocm/launch_agent_service_tgi_rocm.sh

Lines changed: 3 additions & 3 deletions
@@ -40,8 +40,8 @@ export EMBEDDING_SERVICE_HOST_IP=${host_ip}
 export RETRIEVER_SERVICE_HOST_IP=${host_ip}
 export RERANK_SERVICE_HOST_IP=${host_ip}
 export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
-export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
-export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/get_file"
-export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/delete_file"
+export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
+export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/get"
+export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/delete"

 docker compose -f compose.yaml up -d
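
The same three renames recur throughout this commit: `/v1/dataprep` becomes `/v1/dataprep/ingest`, `/v1/dataprep/get_file` becomes `/v1/dataprep/get`, and `/v1/dataprep/delete_file` becomes `/v1/dataprep/delete`. For anyone holding old URLs in their own configs, a minimal sketch of a migration helper might look like the following. The names `LEGACY_TO_NEW` and `migrate_dataprep_url` are hypothetical and not part of the commit; only the path mapping is taken from the hunks.

```python
from urllib.parse import urlsplit, urlunsplit

# Path renames taken from the hunks in this commit.
LEGACY_TO_NEW = {
    "/v1/dataprep": "/v1/dataprep/ingest",
    "/v1/dataprep/get_file": "/v1/dataprep/get",
    "/v1/dataprep/delete_file": "/v1/dataprep/delete",
}


def migrate_dataprep_url(url: str) -> str:
    """Rewrite a legacy dataprep URL to its post-refactor path.

    URLs whose path is not one of the known legacy paths (including
    already-migrated URLs) are returned unchanged.
    """
    parts = urlsplit(url)
    # Tolerate a trailing slash on the legacy path.
    new_path = LEGACY_TO_NEW.get(parts.path.rstrip("/"))
    if new_path is None:
        return url
    return urlunsplit(parts._replace(path=new_path))
```

Because the new paths are not keys of the mapping, applying the helper twice leaves an already-migrated URL untouched.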

AgentQnA/docker_compose/amd/gpu/rocm/set_env.sh

Lines changed: 3 additions & 3 deletions
@@ -41,6 +41,6 @@ export EMBEDDING_SERVICE_HOST_IP=${host_ip}
 export RETRIEVER_SERVICE_HOST_IP=${host_ip}
 export RERANK_SERVICE_HOST_IP=${host_ip}
 export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
-export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
-export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/get_file"
-export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/delete_file"
+export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
+export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/get"
+export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/delete"

AgentQnA/retrieval_tool/index_data.py

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ def main():
     host_ip = args.host_ip
     port = args.port
     proxies = {"http": ""}
-    url = "http://{host_ip}:{port}/v1/dataprep".format(host_ip=host_ip, port=port)
+    url = "http://{host_ip}:{port}/v1/dataprep/ingest".format(host_ip=host_ip, port=port)

     # Split jsonl file into json files
     files = split_jsonl_into_txts(os.path.join(args.filedir, args.filename))
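
For readers who want to see what a client call against the renamed ingest path could look like, here is a hedged, standard-library sketch that builds (but does not send) a multipart POST mirroring the `curl -F "files=@..."` commands in the README hunks of this commit. The function name `build_ingest_request` is hypothetical, and the form field names `files` and `chunk_size` are taken from those curl examples rather than from the service code, so treat them as assumptions.

```python
import uuid
from urllib.request import Request


def build_ingest_request(url, filename, data, chunk_size=None):
    """Build a multipart/form-data POST request for a dataprep ingest URL.

    Field names ("files", "chunk_size") follow the curl examples in this
    commit's README hunks; they are assumptions about the service API.
    The request is only constructed here, not sent.
    """
    boundary = uuid.uuid4().hex
    parts = []

    if chunk_size is not None:
        # Plain text form field, like `-F "chunk_size=3800"`.
        field = (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="chunk_size"\r\n'
            "\r\n"
            f"{chunk_size}\r\n"
        )
        parts.append(field.encode("utf-8"))

    # File field, like `-F "files=@./upload_file.txt"`.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    )
    parts.append(file_header.encode("utf-8") + data + b"\r\n")

    # Closing boundary terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return Request(url, data=b"".join(parts), headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch has no network side effects.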

AgentQnA/retrieval_tool/launch_retrieval_tool.sh

Lines changed: 3 additions & 3 deletions
@@ -19,8 +19,8 @@ export EMBEDDING_SERVICE_HOST_IP=${host_ip}
 export RETRIEVER_SERVICE_HOST_IP=${host_ip}
 export RERANK_SERVICE_HOST_IP=${host_ip}
 export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool"
-export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
-export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
-export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete_file"
+export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
+export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get"
+export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete"

 docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml up -d

AgentQnA/tests/step1_build_images.sh

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ function build_docker_images_for_retrieval_tool(){
     # git clone https://github.com/opea-project/GenAIComps.git && cd GenAIComps && git checkout "${opea_branch:-"main"}" && cd ../
     get_genai_comps
     echo "Build all the images with --no-cache..."
-    service_list="doc-index-retriever dataprep-redis embedding retriever reranking"
+    service_list="doc-index-retriever dataprep embedding retriever reranking"
     docker compose -f build.yaml build ${service_list} --no-cache
     docker pull ghcr.io/huggingface/text-embeddings-inference:cpu-1.5

ChatQnA/README.md

Lines changed: 3 additions & 3 deletions
@@ -202,8 +202,8 @@ Gaudi default compose.yaml
 | Embedding | Langchain | Xeon | 6000 | /v1/embeddings |
 | Retriever | Langchain, Redis | Xeon | 7000 | /v1/retrieval |
 | Reranking | Langchain, TEI | Gaudi | 8000 | /v1/reranking |
-| LLM | Langchain, vLLM | Gaudi | 9000 | /v1/chat/completions |
-| Dataprep | Redis, Langchain | Xeon | 6007 | /v1/dataprep |
+| LLM | Langchain, TGI | Gaudi | 9000 | /v1/chat/completions |
+| Dataprep | Redis, Langchain | Xeon | 6007 | /v1/dataprep/ingest |

 ### Required Models

@@ -294,7 +294,7 @@ Here is an example of `Nike 2023` pdf.
 # download pdf file
 wget https://raw.githubusercontent.com/opea-project/GenAIComps/v1.1/comps/retrievers/redis/data/nke-10k-2023.pdf
 # upload pdf file with dataprep
-curl -X POST "http://${host_ip}:6007/v1/dataprep" \
+curl -X POST "http://${host_ip}:6007/v1/dataprep/ingest" \
     -H "Content-Type: multipart/form-data" \
     -F "files=@./nke-10k-2023.pdf"
 ```

ChatQnA/benchmark/accuracy/README.md

Lines changed: 4 additions & 4 deletions
@@ -72,14 +72,14 @@ python eval_multihop.py --docs_path MultiHop-RAG/dataset/corpus.json --dataset_
 If you are using Kubernetes manifest/helm to deploy `ChatQnA` system, you must specify more arguments as following:

 ```bash
-python eval_multihop.py --docs_path MultiHop-RAG/dataset/corpus.json --dataset_path MultiHop-RAG/dataset/MultiHopRAG.json --ingest_docs --retrieval_metrics --ragas_metrics --llm_endpoint http://{llm_as_judge_ip}:{llm_as_judge_port}/generate --database_endpoint http://{your_dataprep_ip}:{your_dataprep_port}/v1/dataprep --embedding_endpoint http://{your_embedding_ip}:{your_embedding_port}/v1/embeddings --tei_embedding_endpoint http://{your_tei_embedding_ip}:{your_tei_embedding_port} --retrieval_endpoint http://{your_retrieval_ip}:{your_retrieval_port}/v1/retrieval --service_url http://{your_chatqna_ip}:{your_chatqna_port}/v1/chatqna
+python eval_multihop.py --docs_path MultiHop-RAG/dataset/corpus.json --dataset_path MultiHop-RAG/dataset/MultiHopRAG.json --ingest_docs --retrieval_metrics --ragas_metrics --llm_endpoint http://{llm_as_judge_ip}:{llm_as_judge_port}/generate --database_endpoint http://{your_dataprep_ip}:{your_dataprep_port}/v1/dataprep/ingest --embedding_endpoint http://{your_embedding_ip}:{your_embedding_port}/v1/embeddings --tei_embedding_endpoint http://{your_tei_embedding_ip}:{your_tei_embedding_port} --retrieval_endpoint http://{your_retrieval_ip}:{your_retrieval_port}/v1/retrieval --service_url http://{your_chatqna_ip}:{your_chatqna_port}/v1/chatqna
 ```

 The default values for arguments are:
 |Argument|Default value|
 |--------|-------------|
 |service_url|http://localhost:8888/v1/chatqna|
-|database_endpoint|http://localhost:6007/v1/dataprep|
+|database_endpoint|http://localhost:6007/v1/dataprep/ingest|
 |embedding_endpoint|http://localhost:6000/v1/embeddings|
 |tei_embedding_endpoint|http://localhost:8090|
 |retrieval_endpoint|http://localhost:7000/v1/retrieval|
@@ -139,14 +139,14 @@ python eval_crud.py --dataset_path ./data/split_merged.json --docs_path ./data/8
 If you are using Kubernetes manifest/helm to deploy `ChatQnA` system, you must specify more arguments as following:

 ```bash
-python eval_crud.py --dataset_path ./data/split_merged.json --docs_path ./data/80000_docs --ingest_docs --database_endpoint http://{your_dataprep_ip}:{your_dataprep_port}/v1/dataprep --embedding_endpoint http://{your_embedding_ip}:{your_embedding_port}/v1/embeddings --retrieval_endpoint http://{your_retrieval_ip}:{your_retrieval_port}/v1/retrieval --service_url http://{your_chatqna_ip}:{your_chatqna_port}/v1/chatqna
+python eval_crud.py --dataset_path ./data/split_merged.json --docs_path ./data/80000_docs --ingest_docs --database_endpoint http://{your_dataprep_ip}:{your_dataprep_port}/v1/dataprep/ingest --embedding_endpoint http://{your_embedding_ip}:{your_embedding_port}/v1/embeddings --retrieval_endpoint http://{your_retrieval_ip}:{your_retrieval_port}/v1/retrieval --service_url http://{your_chatqna_ip}:{your_chatqna_port}/v1/chatqna
 ```

 The default values for arguments are:
 |Argument|Default value|
 |--------|-------------|
 |service_url|http://localhost:8888/v1/chatqna|
-|database_endpoint|http://localhost:6007/v1/dataprep|
+|database_endpoint|http://localhost:6007/v1/dataprep/ingest|
 |embedding_endpoint|http://localhost:6000/v1/embeddings|
 |retrieval_endpoint|http://localhost:7000/v1/retrieval|
 |reranking_endpoint|http://localhost:8000/v1/reranking|

ChatQnA/benchmark/accuracy/eval_crud.py

Lines changed: 1 addition & 1 deletion
@@ -149,7 +149,7 @@ def args_parser():
     parser.add_argument("--tasks", default=["question_answering"], nargs="+", help="Task to perform")
     parser.add_argument("--ingest_docs", action="store_true", help="Whether to ingest documents to vector database")
     parser.add_argument(
-        "--database_endpoint", type=str, default="http://localhost:6007/v1/dataprep", help="Service URL address."
+        "--database_endpoint", type=str, default="http://localhost:6007/v1/dataprep/ingest", help="Service URL address."
     )
     parser.add_argument(
         "--embedding_endpoint", type=str, default="http://localhost:6000/v1/embeddings", help="Service URL address."
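
Changing an `argparse` default only affects callers who omit the flag. Anyone who passes `--database_endpoint` explicitly with the old `/v1/dataprep` path keeps the old value. The sketch below illustrates this under stated assumptions: `build_parser` and `points_at_legacy_path` are hypothetical helpers, not functions from `eval_crud.py`, and only the default URL is taken from the hunk above.

```python
import argparse

# New default taken from the hunk above.
NEW_DEFAULT = "http://localhost:6007/v1/dataprep/ingest"


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the relevant part of args_parser()."""
    parser = argparse.ArgumentParser(description="Sketch of the --database_endpoint option")
    parser.add_argument(
        "--database_endpoint", type=str, default=NEW_DEFAULT, help="Service URL address."
    )
    return parser


def points_at_legacy_path(endpoint: str) -> bool:
    """Return True if the endpoint still ends with the pre-refactor path."""
    return endpoint.rstrip("/").endswith("/v1/dataprep")


if __name__ == "__main__":
    args = build_parser().parse_args()
    if points_at_legacy_path(args.database_endpoint):
        print("warning: --database_endpoint uses the pre-refactor /v1/dataprep path")
```

A check like this could be run before ingestion to surface stale command lines early; whether the refactored service still answers on the old path is not stated in this commit.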

ChatQnA/benchmark/accuracy/eval_multihop.py

Lines changed: 1 addition & 1 deletion
@@ -211,7 +211,7 @@ def args_parser():
     parser.add_argument("--ragas_metrics", action="store_true", help="Whether to compute ragas metrics.")
     parser.add_argument("--limits", type=int, default=100, help="Number of examples to be evaluated by llm-as-judge")
     parser.add_argument(
-        "--database_endpoint", type=str, default="http://localhost:6007/v1/dataprep", help="Service URL address."
+        "--database_endpoint", type=str, default="http://localhost:6007/v1/dataprep/ingest", help="Service URL address."
     )
     parser.add_argument(
         "--embedding_endpoint", type=str, default="http://localhost:6000/v1/embeddings", help="Service URL address."

ChatQnA/benchmark/performance/kubernetes/intel/gaudi/README.md

Lines changed: 1 addition & 1 deletion
@@ -164,7 +164,7 @@ Use the following `cURL` command to upload file:

 ```bash
 cd GenAIEval/evals/benchmark/data
-curl -X POST "http://${cluster_ip}:6007/v1/dataprep" \
+curl -X POST "http://${cluster_ip}:6007/v1/dataprep/ingest" \
     -H "Content-Type: multipart/form-data" \
     -F "chunk_size=3800" \
     -F "files=@./upload_file.txt"
