Commit b873cf8

lianhao and XuhuiRen authored

dataprep: Fix issue in uploading docx with embedding image (#561)

Fix issue #407

Signed-off-by: Lianhao Lu <[email protected]>
Co-authored-by: XuhuiRen <[email protected]>
1 parent 72123b2 commit b873cf8

8 files changed: +8 -9 lines changed


comps/dataprep/milvus/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -25,5 +25,5 @@ python-pptx
 sentence_transformers
 shortuuid
 tiktoken
-unstructured[all-docs]==0.11.5
+unstructured[all-docs]==0.15.7
 uvicorn

comps/dataprep/pgvector/langchain/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -26,6 +26,6 @@ python-pptx
 sentence_transformers
 shortuuid
 tiktoken
-unstructured[all-docs]==0.11.5
+unstructured[all-docs]==0.15.7
 uvicorn
 

comps/dataprep/pinecone/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -26,5 +26,5 @@ python-docx
 python-pptx
 sentence_transformers
 shortuuid
-unstructured[all-docs]==0.11.5
+unstructured[all-docs]==0.15.7
 uvicorn

comps/dataprep/qdrant/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -23,5 +23,5 @@ python-pptx
 qdrant-client
 sentence_transformers
 shortuuid
-unstructured[all-docs]==0.11.5
+unstructured[all-docs]==0.15.7
 uvicorn

comps/dataprep/redis/langchain/docker/docker-compose-dataprep-redis.yaml

Lines changed: 0 additions & 2 deletions
@@ -32,8 +32,6 @@ services:
 no_proxy: ${no_proxy}
 http_proxy: ${http_proxy}
 https_proxy: ${https_proxy}
-REDIS_HOST: ${REDIS_HOST}
-REDIS_PORT: ${REDIS_PORT}
 REDIS_URL: ${REDIS_URL}
 INDEX_NAME: ${INDEX_NAME}
 TEI_ENDPOINT: ${TEI_ENDPOINT}

comps/dataprep/redis/langchain/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -25,5 +25,5 @@ python-pptx
 redis
 sentence_transformers
 shortuuid
-unstructured[all-docs]==0.11.5
+unstructured[all-docs]==0.15.7
 uvicorn

comps/dataprep/redis/langchain_ray/requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -24,5 +24,6 @@ ray
 redis
 sentence_transformers
 shortuuid
+unstructured[all-docs]==0.15.7
 uvicorn
 virtualenv

comps/dataprep/utils.py

Lines changed: 2 additions & 2 deletions
@@ -12,6 +12,7 @@
 import shutil
 import signal
 import subprocess
+import tempfile
 import timeit
 import unicodedata
 import urllib.parse
@@ -192,8 +193,7 @@ def load_docx(docx_path):
         if isinstance(r._target, docx.parts.image.ImagePart):
             rid2img[r.rId] = os.path.basename(r._target.partname)
     if rid2img:
-        save_path = "./imgs/"
-        os.makedirs(save_path, exist_ok=True)
+        save_path = tempfile.mkdtemp()
         docx2txt.process(docx_path, save_path)
     for paragraph in doc.paragraphs:
         if hasattr(paragraph, "text"):
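
Reviewer note: the `load_docx` hunk swaps a fixed, shared `./imgs/` directory for a per-call directory from `tempfile.mkdtemp()`. The hunk as shown does not include any cleanup of that directory, and `mkdtemp()` leaves removal to the caller. The following is a minimal sketch of the pattern with explicit cleanup, not the code in this commit. The names `extract_images` and `fake_writer` are hypothetical, and `fake_writer` stands in for `docx2txt.process(docx_path, save_path)` so that the example runs with the standard library only.

```python
import os
import shutil
import tempfile


def extract_images(write_images):
    """Call write_images(save_path) on a fresh temp dir; return the file names.

    write_images is a hypothetical stand-in for
    docx2txt.process(docx_path, save_path).
    """
    # mkdtemp() creates a new, uniquely named directory on every call,
    # so concurrent uploads do not share (or clobber) one "./imgs/" folder.
    save_path = tempfile.mkdtemp()
    try:
        write_images(save_path)
        return sorted(os.listdir(save_path))
    finally:
        # mkdtemp() never deletes the directory itself; the caller must.
        shutil.rmtree(save_path, ignore_errors=True)


def fake_writer(save_path):
    # Hypothetical extraction step: writes one placeholder image file.
    with open(os.path.join(save_path, "image1.png"), "wb") as f:
        f.write(b"\x89PNG")


names = extract_images(fake_writer)
print(names)
```

If the extracted images are needed only inside the function, `tempfile.TemporaryDirectory()` used as a context manager would give the same isolation and remove the directory automatically on exit.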
