@@ -49,7 +49,7 @@ We provide a docker file, which bases on Triton image `nvcr.io/nvidia/tritonserv
  
  ``` bash
  mkdir workspace && cd workspace
- git clone https://gitlab-master.nvidia.com/liweim/transformer_backend.git
+ git clone https://github.com/triton-inference-server/fastertransformer_backend.git
  nvidia-docker build --tag ft_backend --file transformer_backend/Dockerfile .
  nvidia-docker run --gpus=all -it --rm --volume $HOME:$HOME --volume $PWD:$PWD -w $PWD --name ft-work ft_backend
  cd workspace
@@ -120,4 +120,4 @@ The model configuration for Triton server is put in `all_models/transformer/conf
  - vocab_size: size of vocabulary
  - decoder_layers: number of transformer layers
  - batch_size: max supported batch size
- - is_fuse_QKV: fusing QKV in one matrix multiplication or not. It also depends on the weights of QKV.
+ - is_fuse_QKV: fusing QKV in one matrix multiplication or not. It also depends on the weights of QKV.