Commit 86412c8

Supported image summarization with LVM in dataprep microservice (#215)
Signed-off-by: Xinyu Ye <[email protected]>
1 parent 6b7bec4 commit 86412c8

2 files changed: 29 additions and 1 deletion


comps/dataprep/README.md

Lines changed: 12 additions & 0 deletions
@@ -2,10 +2,22 @@
 
 The Dataprep Microservice aims to preprocess the data from various sources (either structured or unstructured data) to text data, and convert the text data to embedding vectors then store them in the database.
 
+## Use LVM (Large Vision Model) for Summarizing Image Data
+
+Occasionally unstructured data will contain image data, to convert the image data to the text data, LVM can be used to summarize the image. To leverage LVM, please refer to this [readme](../lvms/README.md) to start the LVM microservice first and then set the below environment variable, before starting any dataprep microservice.
+
+```bash
+export SUMMARIZE_IMAGE_VIA_LVM=1
+```
+
 # Dataprep Microservice with Redis
 
 For details, please refer to this [readme](redis/README.md)
 
+# Dataprep Microservice with Milvus
+
+For details, please refer to this [readme](milvus/README.md)
+
 # Dataprep Microservice with Qdrant
 
 For details, please refer to this [readme](qdrant/README.md)
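
The README change above gates image summarization on `SUMMARIZE_IMAGE_VIA_LVM`. The following is a minimal sketch of how that switch behaves, assuming the same exact-match comparison against `"1"` that `load_image` uses in this commit. The helper name `lvm_summarization_enabled` is hypothetical and does not exist in the repository.

```python
import os


def lvm_summarization_enabled(environ=os.environ):
    """Return True only when SUMMARIZE_IMAGE_VIA_LVM is exactly "1".

    Hypothetical helper: it mirrors the comparison in load_image, so values
    such as "true" or "yes" leave the LVM path disabled.
    """
    return environ.get("SUMMARIZE_IMAGE_VIA_LVM") == "1"


if __name__ == "__main__":
    # Plain dicts stand in for the process environment in this demonstration.
    print(lvm_summarization_enabled({"SUMMARIZE_IMAGE_VIA_LVM": "1"}))
    print(lvm_summarization_enabled({"SUMMARIZE_IMAGE_VIA_LVM": "true"}))
    print(lvm_summarization_enabled({}))
```

Because the commit reads the variable inside `load_image`, it has to be present in the environment of the dataprep process itself, not only in the shell that started the LVM microservice.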

comps/dataprep/utils.py

Lines changed: 17 additions & 1 deletion
@@ -1,6 +1,7 @@
 # Copyright (C) 2024 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0
 
+import base64
 import errno
 import functools
 import io
@@ -198,6 +199,16 @@ def load_csv(input_path):
 
 def load_image(image_path):
     """Load the image file."""
+    if os.getenv("SUMMARIZE_IMAGE_VIA_LVM", None) == "1":
+        query = "Please summarize this image."
+        image_b64_str = base64.b64encode(open(image_path, "rb").read()).decode()
+        response = requests.post(
+            "http://localhost:9399/v1/lvm",
+            data=json.dumps({"image": image_b64_str, "prompt": query}),
+            headers={"Content-Type": "application/json"},
+            proxies={"http": None},
+        )
+        return response.json()["text"].strip()
     loader = UnstructuredImageLoader(image_path)
     text = loader.load()[0].page_content
     return text
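
The `load_image` change above opens the image without closing it, sets no timeout, and does not check the HTTP status before reading `"text"`. One possible hardening is sketched below, not as the author's implementation but as an illustration under stated assumptions: `urllib` from the standard library stands in for `requests`, the 60-second timeout is an arbitrary illustrative value, and the names `build_lvm_payload`, `post_json`, and `summarize_image` are hypothetical. The endpoint, prompt, and JSON field names are the ones in the commit.

```python
import base64
import json
import urllib.request

LVM_ENDPOINT = "http://localhost:9399/v1/lvm"  # endpoint hard-coded in the commit


def build_lvm_payload(image_path, prompt="Please summarize this image."):
    """Read the image and return the JSON body the commit sends to the LVM service."""
    # A context manager closes the file handle even if reading fails.
    with open(image_path, "rb") as image_file:
        image_b64_str = base64.b64encode(image_file.read()).decode()
    return json.dumps({"image": image_b64_str, "prompt": prompt})


def post_json(url, body, timeout=60):
    """POST a JSON string and return the decoded JSON reply.

    Assumption: urllib replaces the requests call from the commit, and the
    timeout value is illustrative rather than taken from the source.
    """
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler disables environment proxies, similar in spirit
    # to proxies={"http": None} in the commit.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    # The opener raises urllib.error.HTTPError on 4xx/5xx responses.
    with opener.open(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_image(image_path, post=post_json):
    """Return the stripped "text" field of the LVM reply for one image."""
    reply = post(LVM_ENDPOINT, build_lvm_payload(image_path))
    return reply["text"].strip()
```

Passing the transport in as `post` keeps the payload logic testable without a running LVM microservice; a caller could supply a stub that returns `{"text": "..."}`.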
@@ -239,7 +250,12 @@ def document_loader(doc_path):
         return load_xlsx(doc_path)
     elif doc_path.endswith(".csv"):
         return load_csv(doc_path)
-    elif doc_path.endswith(".tiff"):
+    elif (
+        doc_path.endswith(".tiff")
+        or doc_path.endswith(".jpg")
+        or doc_path.endswith(".jpeg")
+        or doc_path.endswith(".png")
+    ):
         return load_image(doc_path)
     elif doc_path.endswith(".svg"):
         return load_image(doc_path)
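
The chained `endswith` calls in the hunk above could be collapsed, since `str.endswith` accepts a tuple of suffixes. The sketch below shows that form; the names `IMAGE_SUFFIXES` and `is_image_path` are hypothetical, and the case-insensitive match is an assumption that differs from the commit, whose checks are case-sensitive.

```python
# Extensions routed to load_image by this commit, plus ".svg" from the
# branch that already existed.
IMAGE_SUFFIXES = (".tiff", ".jpg", ".jpeg", ".png", ".svg")


def is_image_path(doc_path):
    """Return True when doc_path ends with one of the image extensions.

    Assumption: the path is lower-cased first, so "photo.PNG" also matches;
    the commit's chained endswith calls would not match it.
    """
    return doc_path.lower().endswith(IMAGE_SUFFIXES)


if __name__ == "__main__":
    for candidate in ("scan.tiff", "photo.PNG", "table.csv"):
        print(candidate, is_image_path(candidate))
```

With such a helper, the two image branches in `document_loader` could share a single `elif is_image_path(doc_path):` test.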
