
Commit 61795fd

lvliang-intel, pre-commit-ci[bot], and root authored
Support llamaindex for retrieval microservice and remove langchain dependency for llm and rerank microservice (#152)
* remove langchain dependency for llm and rerank
* add llamaindex support for retrieval
* fix schema issue
* fix dockerfile
* update readme
* fix entrypoint
* add dataprep process in test script
* fix redis url for dataprep
* update code
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci

Signed-off-by: lvliang-intel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: root <[email protected]>
1 parent 9b658f4 commit 61795fd
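Per the commit title, the retrieval microservice gains llamaindex support while the llm and rerank microservices drop langchain. The retriever changes themselves are not among the diffs reproduced below, but as a rough, hedged sketch of the llamaindex retrieval pattern (illustrative only: the document text, query, and top-k value are assumptions, and llama-index falls back to OpenAI embeddings unless another embedding model is configured):

from llama_index.core import Document, VectorStoreIndex

# Build a small in-memory vector index over one illustrative document.
docs = [Document(text="OPEA provides a collection of GenAI microservices.")]
index = VectorStoreIndex.from_documents(docs)

# Retrieve the top-k chunks most similar to the query.
retriever = index.as_retriever(similarity_top_k=2)
for result in retriever.retrieve("What does OPEA provide?"):
    print(result.node.get_content(), result.score)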

File tree

18 files changed (+356, -36 lines)

comps/llms/text-generation/tgi/llm.py

Lines changed: 27 additions & 17 deletions
@@ -5,7 +5,7 @@
 import time
 
 from fastapi.responses import StreamingResponse
-from langchain_community.llms import HuggingFaceEndpoint
+from huggingface_hub import AsyncInferenceClient
 from langsmith import traceable
 
 from comps import (
@@ -28,26 +28,23 @@
 )
 @traceable(run_type="llm")
 @register_statistics(names=["opea_service@llm_tgi"])
-def llm_generate(input: LLMParamsDoc):
+async def llm_generate(input: LLMParamsDoc):
+    stream_gen_time = []
     start = time.time()
-    llm_endpoint = os.getenv("TGI_LLM_ENDPOINT", "http://localhost:8080")
-    llm = HuggingFaceEndpoint(
-        endpoint_url=llm_endpoint,
-        max_new_tokens=input.max_new_tokens,
-        top_k=input.top_k,
-        top_p=input.top_p,
-        typical_p=input.typical_p,
-        temperature=input.temperature,
-        repetition_penalty=input.repetition_penalty,
-        streaming=input.streaming,
-        timeout=600,
-    )
     if input.streaming:
-        stream_gen_time = []
 
         async def stream_generator():
             chat_response = ""
-            async for text in llm.astream(input.query):
+            text_generation = await llm.text_generation(
+                prompt=input.query,
+                stream=input.streaming,
+                max_new_tokens=input.max_new_tokens,
+                repetition_penalty=input.repetition_penalty,
+                temperature=input.temperature,
+                top_k=input.top_k,
+                top_p=input.top_p,
+            )
+            async for text in text_generation:
                 stream_gen_time.append(time.time() - start)
                 chat_response += text
                 chunk_repr = repr(text.encode("utf-8"))
@@ -59,10 +56,23 @@ async def stream_generator():
 
         return StreamingResponse(stream_generator(), media_type="text/event-stream")
     else:
-        response = llm.invoke(input.query)
+        response = await llm.text_generation(
+            prompt=input.query,
+            stream=input.streaming,
+            max_new_tokens=input.max_new_tokens,
+            repetition_penalty=input.repetition_penalty,
+            temperature=input.temperature,
+            top_k=input.top_k,
+            top_p=input.top_p,
+        )
         statistics_dict["opea_service@llm_tgi"].append_latency(time.time() - start, None)
         return GeneratedDoc(text=response, prompt=input.query)
 
 
 if __name__ == "__main__":
+    llm_endpoint = os.getenv("TGI_LLM_ENDPOINT", "http://localhost:8080")
+    llm = AsyncInferenceClient(
+        model=llm_endpoint,
+        timeout=600,
+    )
     opea_microservices["opea_service@llm_tgi"].start()
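The change above swaps langchain's HuggingFaceEndpoint for huggingface_hub's AsyncInferenceClient: the client is now created once at startup, and text_generation() serves both the streaming and non-streaming paths. A minimal standalone sketch of that client usage, assuming a TGI server is reachable at http://localhost:8080 (the endpoint, prompts, and token budget here are illustrative):

import asyncio

from huggingface_hub import AsyncInferenceClient

llm = AsyncInferenceClient(model="http://localhost:8080", timeout=600)

async def main():
    # Non-streaming: the awaited call returns the generated text as a string.
    text = await llm.text_generation(prompt="What is a microservice?", max_new_tokens=64)
    print(text)

    # Streaming: with stream=True, the awaited call returns an async iterator
    # that yields tokens as the server produces them.
    tokens = await llm.text_generation(prompt="What is a microservice?", stream=True, max_new_tokens=64)
    async for token in tokens:
        print(token, end="", flush=True)

asyncio.run(main())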

comps/llms/text-generation/tgi/requirements.txt

Lines changed: 0 additions & 1 deletion
@@ -1,7 +1,6 @@
 docarray[full]
 fastapi
 huggingface_hub
-langchain==0.1.16
 langsmith
 opentelemetry-api
 opentelemetry-exporter-otlp

comps/reranks/requirements.txt

Lines changed: 0 additions & 1 deletion
@@ -1,6 +1,5 @@
 docarray[full]
 fastapi
-langchain
 langsmith
 opentelemetry-api
 opentelemetry-exporter-otlp
File renamed without changes.

comps/reranks/langchain/docker/Dockerfile renamed to comps/reranks/tei/docker/Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -27,7 +27,7 @@ RUN pip install --no-cache-dir --upgrade pip && \
 
 ENV PYTHONPATH=$PYTHONPATH:/home/user
 
-WORKDIR /home/user/comps/reranks/langchain
+WORKDIR /home/user/comps/reranks/tei
 
-ENTRYPOINT ["python", "reranking_tei_xeon.py"]
+ENTRYPOINT ["python", "reranking_tei.py"]
 

comps/reranks/langchain/reranking_tei_xeon.py renamed to comps/reranks/tei/reranking_tei.py

Lines changed: 16 additions & 8 deletions
@@ -8,7 +8,6 @@
 import time
 
 import requests
-from langchain_core.prompts import ChatPromptTemplate
 from langsmith import traceable
 
 from comps import (
@@ -48,14 +47,23 @@ def reranking(input: SearchedDoc) -> LLMParamsDoc:
             context_str = context_str + " " + input.retrieved_docs[best_response["index"]].text
         if context_str and len(re.findall("[\u4E00-\u9FFF]", context_str)) / len(context_str) >= 0.3:
             # chinese context
-            template = "仅基于以下背景回答问题:\n{context}\n问题: {question}"
+            template = """
+### 你将扮演一个乐于助人、尊重他人并诚实的助手,你的目标是帮助用户解答问题。有效地利用来自本地知识库的搜索结果。确保你的回答中只包含相关信息。如果你不确定问题的答案,请避免分享不准确的信息。
+### 搜索结果:{context}
+### 问题:{question}
+### 回答:
+"""
         else:
-            template = """Answer the question based only on the following context:
-{context}
-Question: {question}
-"""
-        prompt = ChatPromptTemplate.from_template(template)
-        final_prompt = prompt.format(context=context_str, question=input.initial_query)
+            template = """
+### You are a helpful, respectful and honest assistant to help the user with questions. \
+Please refer to the search results obtained from the local knowledge base. \
+But be careful to not incorporate the information that you think is not relevant to the question. \
+If you don't know the answer to a question, please don't share false information. \
+### Search results: {context} \n
+### Question: {question} \n
+### Answer:
+"""
+        final_prompt = template.format(context=context_str, question=input.initial_query)
         statistics_dict["opea_service@reranking_tgi_gaudi"].append_latency(time.time() - start, None)
         return LLMParamsDoc(query=final_prompt.strip())
     else:
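With ChatPromptTemplate removed, the prompt is assembled with plain str.format on the raw template string. A minimal sketch of that substitution (the context and question values below are illustrative):

# The {context} and {question} placeholders are filled directly;
# no langchain prompt object is needed for a simple string template.
template = """
### You are a helpful, respectful and honest assistant to help the user with questions.
### Search results: {context}
### Question: {question}
### Answer:
"""
final_prompt = template.format(
    context="OPEA hosts a collection of GenAI microservices.",
    question="What does OPEA provide?",
)
print(final_prompt.strip())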

comps/retrievers/README.md renamed to comps/retrievers/langchain/README.md

Lines changed: 3 additions & 3 deletions
@@ -8,12 +8,12 @@ Overall, this microservice provides robust backend support for applications requ
 
 # Retriever Microservice with Redis
 
-For details, please refer to this [readme](langchain/redis/README.md)
+For details, please refer to this [readme](redis/README.md)
 
 # Retriever Microservice with Milvus
 
-For details, please refer to this [readme](langchain/milvus/README.md)
+For details, please refer to this [readme](milvus/README.md)
 
 # Retriever Microservice with PGVector
 
-For details, please refer to this [readme](langchain/pgvector/README.md)
+For details, please refer to this [readme](pgvector/README.md)

comps/retrievers/langchain/pinecone/docker/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ RUN chmod +x /home/user/comps/retrievers/langchain/pinecone/run.sh
 USER user
 
 RUN pip install --no-cache-dir --upgrade pip && \
-    pip install --no-cache-dir -r /home/user/comps/retrievers/requirements.txt
+    pip install --no-cache-dir -r /home/user/comps/retrievers/langchain/pinecone/requirements.txt
 
 ENV PYTHONPATH=$PYTHONPATH:/home/user
 
