 docker compose -f docker-compose-dataprep-milvus.yaml up -d
 ```

-## Invoke Microservice
+## 🚀3. Consume Microservice
+
+### 3.1 Consume Upload API

 Once the document preparation microservice for Milvus has started, use the command below to invoke it, which converts the document to embeddings and saves them to the database.

@@ -65,13 +111,13 @@ curl -X POST \
   http://localhost:6010/v1/dataprep
 ```

-You can specify `chunk_size` and `chunk_overlap` with the following command.
+You can specify `chunk_size` and `chunk_overlap` with the following command. To avoid large chunks, pass a small `chunk_size` such as 500, as shown below (the default is 1500).

 ```bash
 curl -X POST \
   -H "Content-Type: multipart/form-data" \
   -F "files=@./file.pdf" \
-  -F "chunk_size=1500" \
+  -F "chunk_size=500" \
   -F "chunk_overlap=100" \
   http://localhost:6010/v1/dataprep
 ```
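
How `chunk_size` and `chunk_overlap` interact can be pictured with a minimal sketch. It assumes plain fixed-size character windows, which may differ from the splitter the dataprep service actually uses; `chunk_text` is a hypothetical helper introduced only for illustration.

```bash
#!/usr/bin/env bash
# Illustration only: split a single-line ASCII string into fixed-size
# character windows that overlap. chunk_text is a hypothetical helper,
# NOT the service's real splitter.
chunk_text() {
  local text="$1" size="$2" overlap="$3"
  local step=$(( size - overlap ))
  if [ "$step" -le 0 ]; then
    echo "chunk_overlap must be smaller than chunk_size" >&2
    return 1
  fi
  local len=${#text} start=1 end
  while [ "$start" -le "$len" ]; do
    end=$(( start + size - 1 ))
    # cut -c uses 1-based, inclusive character positions.
    printf '%s\n' "$text" | cut -c "${start}-${end}"
    # Stop once this window has reached the end of the text.
    if [ "$end" -ge "$len" ]; then break; fi
    start=$(( start + step ))
  done
}

# chunk_size=4, chunk_overlap=1: each chunk starts 3 characters after the last.
chunk_text "abcdefghij" 4 1
# prints abcd, defg, ghij (one per line)
```

Under the same assumption, `chunk_size=1500` with `chunk_overlap=100` would start each new chunk 1400 characters after the previous one, and `chunk_size=500` would start one every 400 characters.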
@@ -132,3 +178,70 @@ Note: If you specify "table_strategy=llm", You should first start TGI Service, p
 ```bash
 curl -X POST -H "Content-Type: application/json" -d '{"path":"/home/user/doc/your_document_name","process_table":true,"table_strategy":"hq"}' http://localhost:6010/v1/dataprep
 ```
+
+### 3.2 Consume get_file API
+
+To list the uploaded files and their structure, use the following command:
+
+```bash
+curl -X POST \
+  -H "Content-Type: application/json" \
+  http://localhost:6010/v1/dataprep/get_file
+```
+
+The response is JSON like the following:
+
+```json
+[
+  {
+    "name": "uploaded_file_1.txt",
+    "id": "uploaded_file_1.txt",
+    "type": "File",
+    "parent": ""
+  },
+  {
+    "name": "uploaded_file_2.txt",
+    "id": "uploaded_file_2.txt",
+    "type": "File",
+    "parent": ""
+  }
+]
+```
+
+### 3.3 Consume delete_file API
+
+To delete an uploaded file or link, use the following commands.
+
+The `file_path` here should be the `id` returned by the `/v1/dataprep/get_file` API.
+
+```bash
+# delete link
+curl -X POST \
+  -H "Content-Type: application/json" \
+  -d '{"file_path": "https://www.ces.tech/.txt"}' \
+  http://localhost:6010/v1/dataprep/delete_file
+
+# delete file
+curl -X POST \
+  -H "Content-Type: application/json" \
+  -d '{"file_path": "uploaded_file_1.txt"}' \
+  http://localhost:6010/v1/dataprep/delete_file
+
+# delete all files and links; this drops the entire db collection
+curl -X POST \
+  -H "Content-Type: application/json" \
+  -d '{"file_path": "all"}' \
+  http://localhost:6010/v1/dataprep/delete_file
+```
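
Because `file_path` must match an `id` from the `get_file` response, the two calls can be chained. The following is a rough sketch under stated assumptions: `extract_ids` and `delete_payload` are hypothetical helpers, the text-based parsing only copes with simple string ids (a real JSON parser would be more robust), and the sample response is copied from the `get_file` example rather than fetched from a running service.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pull every "id" value out of a get_file-style
# JSON response read from stdin. Handles only ids without escaped quotes.
extract_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' | cut -d '"' -f 4
}

# Hypothetical helper: build the JSON body for /v1/dataprep/delete_file.
delete_payload() {
  printf '{"file_path": "%s"}' "$1"
}

# Sample response taken from the get_file example in this document.
response='[{"name":"uploaded_file_1.txt","id":"uploaded_file_1.txt","type":"File","parent":""},
{"name":"uploaded_file_2.txt","id":"uploaded_file_2.txt","type":"File","parent":""}]'

# Print the payload that would be sent for each id; nothing is deleted here.
printf '%s\n' "$response" | extract_ids | while read -r id; do
  echo "would delete: $(delete_payload "$id")"
done
```

To actually delete, each payload would be passed to `curl -d` against `/v1/dataprep/delete_file`, as in the commands above.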
+
+## 🚀4. Troubleshooting
+
+1. If you get errors from the Mosec Embedding Endpoint such as `cannot find this task, maybe it has expired` while uploading files, try reducing the `chunk_size` in the curl command as shown below (the default is `chunk_size=1500`).
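
One way to make the retry convenient is a small wrapper around the upload call. This is only a sketch: `upload_with_chunk_size`, `DATAPREP_URL`, and `DRY_RUN` are names introduced here for illustration, while the flags and endpoint are the ones used in the upload example in section 3.1.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the documented upload command.
# DATAPREP_URL and DRY_RUN are illustrative names, not service settings.
DATAPREP_URL="${DATAPREP_URL:-http://localhost:6010/v1/dataprep}"

upload_with_chunk_size() {
  local file="$1" chunk_size="${2:-500}" chunk_overlap="${3:-100}"
  # Reuse the positional parameters to hold the full curl command line.
  set -- curl -X POST \
    -H "Content-Type: multipart/form-data" \
    -F "files=@${file}" \
    -F "chunk_size=${chunk_size}" \
    -F "chunk_overlap=${chunk_overlap}" \
    "$DATAPREP_URL"
  if [ -n "${DRY_RUN:-}" ]; then
    # Show the command without running it (shell quoting is not reproduced).
    echo "$*"
  else
    "$@"
  fi
}

# Preview an upload with a smaller chunk_size before sending it.
DRY_RUN=1 upload_with_chunk_size ./file.pdf 500
```

Dropping `DRY_RUN=1` sends the request. If the error persists, a still smaller second argument can be tried.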