vLLM is also available via `LangChain <https://github.com/langchain-ai/langchain>`_.

To install LangChain, run:

.. code-block:: console

    $ pip install langchain -q

To run inference on a single GPU or on multiple GPUs, use the ``VLLM`` class from ``langchain``.

.. code-block:: python

    from langchain.llms import VLLM

    llm = VLLM(model="mosaicml/mpt-7b",
               trust_remote_code=True,  # mandatory for Hugging Face models
               max_new_tokens=128,
               top_k=10,
               top_p=0.95,
               temperature=0.8,
               # tensor_parallel_size=...  # for distributed inference
    )

    print(llm("What is the capital of France?"))

Please refer to this `Tutorial <https://github.com/langchain-ai/langchain/blob/master/docs/extras/integrations/llms/vllm.ipynb>`_ for more details.
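
The ``top_k``, ``top_p``, and ``temperature`` arguments passed to ``VLLM`` above are standard sampling controls: temperature rescales the logits before the softmax, top-k keeps only the ``k`` most likely tokens, and top-p (nucleus sampling) keeps the smallest set of tokens whose cumulative probability reaches ``p``. A minimal plain-Python sketch of these filters on a toy distribution (the helper names here are illustrative only, not part of vLLM or LangChain):

.. code-block:: python

    import math

    def apply_temperature(logits, temperature):
        """Soften (T > 1) or sharpen (T < 1) a distribution via softmax(logits / T)."""
        scaled = [l / temperature for l in logits]
        m = max(scaled)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scaled]
        z = sum(exps)
        return [e / z for e in exps]

    def top_k_top_p_filter(probs, top_k, top_p):
        """Keep the top_k most likely tokens, then the smallest prefix of those
        whose cumulative probability reaches top_p; renormalize the survivors."""
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        kept, cum = [], 0.0
        for tok, p in ranked:
            kept.append((tok, p))
            cum += p
            if cum >= top_p:
                break
        total = sum(p for _, p in kept)
        return {tok: p / total for tok, p in kept}

    # Toy next-token distribution for illustration.
    probs = {"Paris": 0.6, "Lyon": 0.2, "Nice": 0.1, "Rome": 0.06, "Oslo": 0.04}
    filtered = top_k_top_p_filter(probs, top_k=3, top_p=0.9)
    # top_k=3 keeps Paris/Lyon/Nice; their cumulative probability 0.9 meets
    # top_p=0.9, so all three survive and are renormalized to sum to 1.

In the real engine this filtering happens over the model's full vocabulary at every decoding step; the sketch only illustrates what the knobs select.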