Labels
module: platforms (Issues related to platforms, hardware, and support matrix)
Description
I'm trying to serve an embedding model (FastText) in Triton Server using the Python backend. The only external dependency is the fasttext module, which in turn depends on numpy. I created a custom execution environment as described here.
The problem is that I'm hitting the following error when running the Triton server as a Docker container:
+-------------------+---------+----------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------------------+---------+----------------------------------------------------------------------------------------------------+
| fast-text-service | 1 | UNAVAILABLE: Internal: ImportError: Error importing numpy: you should not try to import numpy from |
| | | its source directory; please exit the numpy source tree, and relaunch |
| | | your python interpreter from there. |
| | | |
| | | At: |
| | | /tmp/python_env_3RJ5YZ/0/lib/python3.10/site-packages/numpy/__init__.py(119): <module> |
| | | <frozen importlib._bootstrap>(241): _call_with_frames_removed |
| | | <frozen importlib._bootstrap_external>(883): exec_module |
| | | <frozen importlib._bootstrap>(703): _load_unlocked |
| | | <frozen importlib._bootstrap>(1006): _find_and_load_unlocked |
| | | <frozen importlib._bootstrap>(1027): _find_and_load |
| | | /opt/tritonserver/backends/python/triton_python_backend_utils.py(30): <module> |
| | | <frozen importlib._bootstrap>(241): _call_with_frames_removed |
| | | <frozen importlib._bootstrap_external>(883): exec_module |
| | | <frozen importlib._bootstrap>(703): _load_unlocked |
| | | <frozen importlib._bootstrap>(1006): _find_and_load_unlocked |
| | | <frozen importlib._bootstrap>(1027): _find_and_load |
| | | /mnt/data/model_repository/fast-text-service/1/model.py(1): <module> |
| | | <frozen importlib._bootstrap>(241): _call_with_frames_removed |
| | | <frozen importlib._bootstrap_external>(883): exec_module |
| | | <frozen importlib._bootstrap>(703): _load_unlocked |
| | | <frozen importlib._bootstrap>(1006): _find_and_load_unlocked |
| | | <frozen importlib._bootstrap>(1027): _find_and_load |
+-------------------+---------+----------------------------------------------------------------------------------------------------+
Triton Information
I'm running the Triton container on an M2 chip; the image is nvcr.io/nvidia/tritonserver:24.09-pyt-python-py3.
To Reproduce
- Create a conda environment:
conda create -k -y -n ${CONDA_ENV_NAME} python=3.10.12
- Set:
export PYTHONNOUSERSITE=True
- Install fasttext:
pip3 install fasttext
- Pack the environment into a tar file:
conda pack -o ./model-repository/fast-text-service/${CONDA_ENV_NAME}.tar.gz
- Create a basic model.py file as required by the Triton server.
- Load the fasttext model in model.py.
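For reference, a minimal sketch of what the model.py in the steps above could look like, matching the tensor names (TEXT, Status, Embedding) from the config in this issue. The fasttext model path is an assumption, and pb_utils only exists inside the Triton server process, so its import is guarded for local inspection:

```python
import numpy as np

try:
    # Provided by the Triton server at runtime; not pip-installable.
    import triton_python_backend_utils as pb_utils
except ImportError:
    pb_utils = None


class TritonPythonModel:
    def initialize(self, args):
        # Deferred import so the file parses without fasttext installed.
        # The .bin path below is a hypothetical placeholder.
        import fasttext
        self.model = fasttext.load_model(
            "/mnt/data/model_repository/fast-text-service/1/model.bin"
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            # TYPE_STRING tensors arrive as numpy object arrays of bytes,
            # shape [batch, 1] per the config's dims: [ 1 ].
            texts = pb_utils.get_input_tensor_by_name(request, "TEXT").as_numpy()
            embeddings = np.stack(
                [self.model.get_sentence_vector(row[0].decode("utf-8"))
                 for row in texts]
            ).astype(np.float32)                      # shape [batch, 300]
            status = np.ones((len(texts), 1), dtype=np.float32)
            responses.append(
                pb_utils.InferenceResponse(
                    output_tensors=[
                        pb_utils.Tensor("Status", status),
                        pb_utils.Tensor("Embedding", embeddings),
                    ]
                )
            )
        return responses
```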
requirements.txt:
fasttext==0.9.3
numpy==2.1.2
pybind11==2.13.6
setuptools==75.2.0
config.pbtxt
name: "fast-text-service"
backend: "python"
max_batch_size: 8
dynamic_batching { }
input [
{
name: "TEXT"
data_type: TYPE_STRING
dims: [ 1 ]
}
]
output [
{
name: "Status"
data_type: TYPE_FP32
dims: [ 1 ]
},
{
name: "Embedding"
data_type: TYPE_FP32
dims: [ 300 ]
}
]
parameters: {
key: "EXECUTION_ENV_PATH"
value: {string_value: "/mnt/data/model_repository/fast-text-service/fasttext-server.tar.gz"}
}
instance_group [
{
count: 1
kind: KIND_CPU
}
]
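With max_batch_size: 8 and dynamic batching enabled, the dims in the config above are per-item shapes; Triton prepends the batch dimension. A quick numpy sketch of the resulting tensor shapes for a batch of size B (the sample text is an assumption):

```python
import numpy as np

B = 4  # example batch size, must be <= max_batch_size (8)

# TYPE_STRING input with dims [1]: object array of bytes, shape [B, 1].
text = np.array([[b"hello world"]] * B, dtype=object)

# Outputs per the config: Status dims [1] -> [B, 1], Embedding dims [300] -> [B, 300].
status = np.zeros((B, 1), dtype=np.float32)
embedding = np.zeros((B, 300), dtype=np.float32)

assert text.shape == (B, 1)
assert status.shape == (B, 1)
assert embedding.shape == (B, 300)
```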
Expected behavior
The container exits with: error: creating server: Internal - failed to load all models.
Below is a segment of the log generated by the Triton container:
....
I1020 05:12:57.580370 1 model_lifecycle.cc:472] "loading: fast-text-service:1"
I1020 05:12:57.580948 1 backend_model.cc:503] "Adding default backend config setting: default-max-batch-size,4"
I1020 05:12:57.580969 1 shared_library.cc:112] "OpenLibraryHandle: /opt/tritonserver/backends/python/libtriton_python.so"
I1020 05:12:57.585539 1 python_be.cc:1618] "'python' TRITONBACKEND API version: 1.19"
I1020 05:12:57.585553 1 python_be.cc:1640] "backend configuration:\n{\"cmdline\":{\"auto-complete-config\":\"true\",\"backend-directory\":\"/opt/tritonserver/backends\",\"min-compute-capability\":\"6.000000\",\"default-max-batch-size\":\"4\"}}"
I1020 05:12:57.585837 1 python_be.cc:1778] "Shared memory configuration is shm-default-byte-size=1048576,shm-growth-byte-size=1048576,stub-timeout-seconds=30"
I1020 05:12:57.586479 1 python_be.cc:2075] "TRITONBACKEND_GetBackendAttribute: setting attributes"
I1020 05:12:57.586515 1 python_be.cc:1879] "TRITONBACKEND_ModelInitialize: fast-text-service (version 1)"
I1020 05:12:57.587722 1 model_config_utils.cc:1941] "ModelConfig 64-bit fields:"
I1020 05:12:57.587729 1 model_config_utils.cc:1943] "\tModelConfig::dynamic_batching::default_priority_level"
I1020 05:12:57.587730 1 model_config_utils.cc:1943] "\tModelConfig::dynamic_batching::default_queue_policy::default_timeout_microseconds"
I1020 05:12:57.587732 1 model_config_utils.cc:1943] "\tModelConfig::dynamic_batching::max_queue_delay_microseconds"
I1020 05:12:57.587734 1 model_config_utils.cc:1943] "\tModelConfig::dynamic_batching::priority_levels"
I1020 05:12:57.587735 1 model_config_utils.cc:1943] "\tModelConfig::dynamic_batching::priority_queue_policy::key"
I1020 05:12:57.587736 1 model_config_utils.cc:1943] "\tModelConfig::dynamic_batching::priority_queue_policy::value::default_timeout_microseconds"
I1020 05:12:57.587738 1 model_config_utils.cc:1943] "\tModelConfig::ensemble_scheduling::step::model_version"
I1020 05:12:57.587740 1 model_config_utils.cc:1943] "\tModelConfig::input::dims"
I1020 05:12:57.587741 1 model_config_utils.cc:1943] "\tModelConfig::input::reshape::shape"
I1020 05:12:57.587743 1 model_config_utils.cc:1943] "\tModelConfig::instance_group::secondary_devices::device_id"
I1020 05:12:57.587744 1 model_config_utils.cc:1943] "\tModelConfig::model_warmup::inputs::value::dims"
I1020 05:12:57.587746 1 model_config_utils.cc:1943] "\tModelConfig::optimization::cuda::graph_spec::graph_lower_bound::input::value::dim"
I1020 05:12:57.587748 1 model_config_utils.cc:1943] "\tModelConfig::optimization::cuda::graph_spec::input::value::dim"
I1020 05:12:57.587749 1 model_config_utils.cc:1943] "\tModelConfig::output::dims"
I1020 05:12:57.587751 1 model_config_utils.cc:1943] "\tModelConfig::output::reshape::shape"
I1020 05:12:57.587752 1 model_config_utils.cc:1943] "\tModelConfig::sequence_batching::direct::max_queue_delay_microseconds"
I1020 05:12:57.587754 1 model_config_utils.cc:1943] "\tModelConfig::sequence_batching::max_sequence_idle_microseconds"
I1020 05:12:57.587755 1 model_config_utils.cc:1943] "\tModelConfig::sequence_batching::oldest::max_queue_delay_microseconds"
I1020 05:12:57.587757 1 model_config_utils.cc:1943] "\tModelConfig::sequence_batching::state::dims"
I1020 05:12:57.587759 1 model_config_utils.cc:1943] "\tModelConfig::sequence_batching::state::initial_state::dims"
I1020 05:12:57.587760 1 model_config_utils.cc:1943] "\tModelConfig::version_policy::specific::versions"
I1020 05:12:57.588199 1 python_be.cc:1485] "Using Python execution env /mnt/data/model_repository/fast-text-service/fasttext-server.tar.gz"
I1020 05:12:57.588459 1 pb_env.cc:292] "Extracting Python execution env /mnt/data/model_repository/fast-text-service/fasttext-server.tar.gz"
I1020 05:12:58.119991 1 stub_launcher.cc:385] "Starting Python backend stub: source /tmp/python_env_V8xPqb/0/bin/activate && exec env LD_LIBRARY_PATH=/tmp/python_env_V8xPqb/0/lib:$LD_LIBRARY_PATH /opt/tritonserver/backends/python/triton_python_backend_stub /mnt/data/model_repository/fast-text-service/1/model.py triton_python_backend_shm_region_a2fedd57-ea76-415c-9ec4-31883a32f342 1048576 1048576 1 /opt/tritonserver/backends/python 336 fast-text-service DEFAULT"
I1020 05:12:58.157720 98 pb_stub.cc:298] Failed to initialize Python stub for auto-complete: ImportError: Error importing numpy: you should not try to import numpy from
its source directory; please exit the numpy source tree, and relaunch
your python interpreter from there.
At:
/tmp/python_env_V8xPqb/0/lib/python3.10/site-packages/numpy/__init__.py(119): <module>
<frozen importlib._bootstrap>(241): _call_with_frames_removed
<frozen importlib._bootstrap_external>(883): exec_module
<frozen importlib._bootstrap>(703): _load_unlocked
<frozen importlib._bootstrap>(1006): _find_and_load_unlocked
<frozen importlib._bootstrap>(1027): _find_and_load
/opt/tritonserver/backends/python/triton_python_backend_utils.py(30): <module>
<frozen importlib._bootstrap>(241): _call_with_frames_removed
<frozen importlib._bootstrap_external>(883): exec_module
<frozen importlib._bootstrap>(703): _load_unlocked
<frozen importlib._bootstrap>(1006): _find_and_load_unlocked
<frozen importlib._bootstrap>(1027): _find_and_load
/mnt/data/model_repository/fast-text-service/1/model.py(1): <module>
<frozen importlib._bootstrap>(241): _call_with_frames_removed
<frozen importlib._bootstrap_external>(883): exec_module
<frozen importlib._bootstrap>(703): _load_unlocked
<frozen importlib._bootstrap>(1006): _find_and_load_unlocked
<frozen importlib._bootstrap>(1027): _find_and_load
I1020 05:12:58.158159 1 python_be.cc:1902] "TRITONBACKEND_ModelFinalize: delete model state"
E1020 05:12:58.158182 1 model_lifecycle.cc:642] "failed to load 'fast-text-service' version 1: Internal: ImportError: Error importing numpy: you should not try to import numpy from\n its source directory; please exit the numpy source tree, and relaunch\n your python interpreter from there.\n\nAt:\n /tmp/python_env_V8xPqb/0/lib/python3.10/site-packages/numpy/__init__.py(119): <module>\n <frozen importlib._bootstrap>(241): _call_with_frames_removed\n <frozen importlib._bootstrap_external>(883): exec_module\n <frozen importlib._bootstrap>(703): _load_unlocked\n <frozen importlib._bootstrap>(1006): _find_and_load_unlocked\n <frozen importlib._bootstrap>(1027): _find_and_load\n /opt/tritonserver/backends/python/triton_python_backend_utils.py(30): <module>\n <frozen importlib._bootstrap>(241): _call_with_frames_removed\n <frozen importlib._bootstrap_external>(883): exec_module\n <frozen importlib._bootstrap>(703): _load_unlocked\n <frozen importlib._bootstrap>(1006): _find_and_load_unlocked\n <frozen importlib._bootstrap>(1027): _find_and_load\n /mnt/data/model_repository/fast-text-service/1/model.py(1): <module>\n <frozen importlib._bootstrap>(241): _call_with_frames_removed\n <frozen importlib._bootstrap_external>(883): exec_module\n <frozen importlib._bootstrap>(703): _load_unlocked\n <frozen importlib._bootstrap>(1006): _find_and_load_unlocked\n <frozen importlib._bootstrap>(1027): _find_and_load\n"
I1020 05:12:58.158204 1 model_lifecycle.cc:777] "failed to load 'fast-text-service'"
I1020 05:12:58.158289 1 server.cc:604]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I1020 05:12:58.158645 1 server.cc:631]
+---------+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
I1020 05:12:58.158680 1 server.cc:674]
+-------------------+---------+----------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------------------+---------+----------------------------------------------------------------------------------------------------+
| fast-text-service | 1 | UNAVAILABLE: Internal: ImportError: Error importing numpy: you should not try to import numpy from |
| | | its source directory; please exit the numpy source tree, and relaunch |
| | | your python interpreter from there. |
| | | |
| | | At: |
| | | /tmp/python_env_V8xPqb/0/lib/python3.10/site-packages/numpy/__init__.py(119): <module> |
| | | <frozen importlib._bootstrap>(241): _call_with_frames_removed |
| | | <frozen importlib._bootstrap_external>(883): exec_module |
| | | <frozen importlib._bootstrap>(703): _load_unlocked |
| | | <frozen importlib._bootstrap>(1006): _find_and_load_unlocked |
| | | <frozen importlib._bootstrap>(1027): _find_and_load |
| | | /opt/tritonserver/backends/python/triton_python_backend_utils.py(30): <module> |
| | | <frozen importlib._bootstrap>(241): _call_with_frames_removed |
| | | <frozen importlib._bootstrap_external>(883): exec_module |
| | | <frozen importlib._bootstrap>(703): _load_unlocked |
| | | <frozen importlib._bootstrap>(1006): _find_and_load_unlocked |
| | | <frozen importlib._bootstrap>(1027): _find_and_load |
| | | /mnt/data/model_repository/fast-text-service/1/model.py(1): <module> |
| | | <frozen importlib._bootstrap>(241): _call_with_frames_removed |
| | | <frozen importlib._bootstrap_external>(883): exec_module |
| | | <frozen importlib._bootstrap>(703): _load_unlocked |
| | | <frozen importlib._bootstrap>(1006): _find_and_load_unlocked |
| | | <frozen importlib._bootstrap>(1027): _find_and_load |
+-------------------+---------+----------------------------------------------------------------------------------------------------+
I1020 05:12:58.158773 1 metrics.cc:770] "Collecting CPU metrics"
I1020 05:12:58.158837 1 tritonserver.cc:2598]
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.50.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging |
| model_repository_path[0] | /mnt/data/model_repository |
| model_control_mode | MODE_NONE |
| strict_model_config | 0 |
| model_config_name | |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
| cache_enabled | 0 |
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I1020 05:12:58.158866 1 server.cc:305] "Waiting for in-flight requests to complete."
I1020 05:12:58.158868 1 server.cc:321] "Timeout 30: Found 0 model versions that have in-flight inferences"
I1020 05:12:58.158926 1 server.cc:336] "All models are stopped, unloading models"
I1020 05:12:58.158934 1 server.cc:345] "Timeout 30: Found 0 live models and 0 in-flight non-inference requests"
I1020 05:12:58.158947 1 backend_manager.cc:138] "unloading backend 'python'"
I1020 05:12:58.158950 1 python_be.cc:1859] "TRITONBACKEND_Finalize: Start"
I1020 05:12:58.221014 1 python_be.cc:1864] "TRITONBACKEND_Finalize: End"
error: creating server: Internal - failed to load all models
theslyone