
Commit fd4e3e8

Merge pull request #1095 from pytorch/readmerefactor
Doc cleanup
2 parents d8ec14d + 610ba43 commit fd4e3e8

32 files changed: +143 −190 lines

CONTRIBUTING.md

Lines changed: 5 additions & 2 deletions
@@ -21,11 +21,14 @@ If you are interested in contributing to TorchServe, your contributions will fall
 ```
 > Supported CUDA versions: cu111, cu102, cu101, cu92

-- Execute sanity suite
+- Run sanity suite
 ```bash
-python ./torchserve_sanity.py
+python torchserve_sanity.py
 ```
+- Run the regression tests: `python test/regression_tests.py`
 - For running individual test suites, refer to the [code_coverage](docs/code_coverage.md) documentation
+- If you are updating an existing model, make sure performance hasn't degraded by running [benchmarks](https://github.com/pytorch/serve/tree/master/benchmarks) on the master branch and on your branch, and verify there is no performance regression
+- For large changes, make sure to run the [automated benchmark suite](https://github.com/pytorch/serve/tree/master/test/benchmark), which runs the Apache Bench tests on several CUDA and EC2 instance configurations
 - If you need more context on a particular issue, please raise a ticket on the [`TorchServe` GH repo](https://github.com/pytorch/serve/issues/new/choose) or connect to [PyTorch's slack channel](https://pytorch.slack.com/)

 Once you finish implementing a feature or bug-fix, please send a Pull Request to https://github.com/pytorch/serve. Use this [template](pull_request_template.md) when creating a Pull Request.
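Taken together, the pre-PR checks referenced in this hunk might be run as in the sketch below (a hedged sketch, assuming you are at the repository root; the dependency-install command and flags are the ones shown later in this commit for `ci/buildspec.yml` and `binaries/README.md`):

```bash
# Hedged sketch of a local pre-PR check run from the repo root.
# Install dev dependencies (add --cuda cu102 on a CUDA 10.2 GPU host).
python ts_scripts/install_dependencies.py --environment=dev

# Run the sanity suite and the regression tests referenced above.
python torchserve_sanity.py
python test/regression_tests.py
```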

README.md

Lines changed: 17 additions & 7 deletions
@@ -22,6 +22,8 @@ TorchServe is a flexible and easy to use tool for serving PyTorch models.
 * [Serve a Model](#serve-a-model)
 * [Serve a Workflow](docs/workflows.md)
 * [Quick start with docker](#quick-start-with-docker)
+* [Highlighted Examples](#highlighted-examples)
+* [Featured Community Projects](#featured-community-projects)
 * [Contributing](#contributing)

 ## Install TorchServe and torch-model-archiver
@@ -48,17 +50,17 @@ TorchServe is a flexible and easy to use tool for serving PyTorch models.

 Refer to the documentation [here](docs/torchserve_on_win_native.md).

-2. Install torchserve and torch-model-archiver
+2. Install torchserve, torch-model-archiver and torch-workflow-archiver

 For [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install)
 Note: Conda packages are not supported for Windows. Refer to the documentation [here](docs/torchserve_on_win_native.md).
 ```
-conda install torchserve torch-model-archiver -c pytorch
+conda install torchserve torch-model-archiver torch-workflow-archiver -c pytorch
 ```

 For Pip
 ```
-pip install torchserve torch-model-archiver
+pip install torchserve torch-model-archiver torch-workflow-archiver
 ```

 Now you are ready to [package and serve models with TorchServe](#serve-a-model).
@@ -71,7 +73,7 @@ Ensure that you have `python3` installed, and the user has access to the site-pa

 Run the following script from the top of the source directory.

-NOTE: This script uninstalls existing `torchserve` and `torch-model-archiver` installations
+NOTE: This script uninstalls existing `torchserve`, `torch-model-archiver` and `torch-workflow-archiver` installations

 #### For Debian Based Systems/ MacOS

@@ -136,7 +138,7 @@ torchserve --start --ncs --model-store model_store --models densenet161.mar

 After you execute the `torchserve` command above, TorchServe runs on your host, listening for inference requests.

-**Note**: If you specify model(s) when you run TorchServe, it automatically scales backend workers to the number equal to available vCPUs (if you run on a CPU instance) or to the number of available GPUs (if you run on a GPU instance). In case of powerful hosts with a lot of compute resoures (vCPUs or GPUs), this start up and autoscaling process might take considerable time. If you want to minimize TorchServe start up time you should avoid registering and scaling the model during start up time and move that to a later point by using corresponding [Management API](docs/management_api.md#register-a-model), which allows finer grain control of the resources that are allocated for any particular model).
+**Note**: If you specify model(s) when you run TorchServe, it automatically scales backend workers to the number equal to available vCPUs (if you run on a CPU instance) or to the number of available GPUs (if you run on a GPU instance). In case of powerful hosts with a lot of compute resources (vCPUs or GPUs), this start up and autoscaling process might take considerable time. If you want to minimize TorchServe start up time you should avoid registering and scaling the model during start up time and move that to a later point by using corresponding [Management API](docs/management_api.md#register-a-model), which allows finer grain control of the resources that are allocated for any particular model).

 ### Get predictions from a model

@@ -212,6 +214,11 @@ To stop the currently running TorchServe instance, run:
 torchserve --stop
 ```

+### Inspect the logs
+All the logs you've seen as output to stdout related to model registration, management, inference are recorded in the `/logs` folder.
+
+High level performance data like Throughput or Percentile Precision can be generated with [Benchmark](benchmark/README.md) and visualized in a report.
+
 ### Concurrency And Number of Workers
 TorchServe exposes configurations that allow the user to configure the number of worker threads on CPU and GPUs. There is an important config property that can speed up the server depending on the workload.
 *Note: the following property has bigger impact under heavy workloads.*
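Returning to the "Inspect the logs" addition above: a quick way to follow those files while the server runs is sketched below (the file names `ts_log.log` and `model_log.log` are assumptions taken from the stock log4j configuration; adjust to your setup):

```bash
# Hedged sketch: tail the frontend and model worker logs written under ./logs
# (file names assumed from the default log4j configuration).
tail -f logs/ts_log.log logs/model_log.log
```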
@@ -239,9 +246,12 @@ Feel free to skim the full list of [available examples](examples/README.md)
 ## Learn More

 * [Full documentation on TorchServe](docs/README.md)
-* [Manage models API](docs/management_api.md)
+* [Model Management API](docs/management_api.md)
 * [Inference API](docs/inference_api.md)
+* [Metrics API](docs/metrics.md)
 * [Package models for use with TorchServe](model-archiver/README.md)
+* [Deploying TorchServe with Kubernetes](kubernetes/README.md)
+* [TorchServe Workflows](examples/Workflows/README.md)
 * [TorchServe model zoo for pre-trained and pre-packaged models-archives](docs/model_zoo.md)

 ## Contributing
@@ -255,4 +265,4 @@ To file a bug or request a feature, please file a GitHub issue. For filing pull
 ## Disclaimer
 This repository is jointly operated and maintained by Amazon, Facebook and a number of individual contributors listed in the [CONTRIBUTORS](https://github.com/pytorch/serve/graphs/contributors) file. For questions directed at Facebook, please send an email to [email protected]. For questions directed at Amazon, please send an email to [email protected]. For all other questions, please open up an issue in this repository [here](https://github.com/pytorch/serve/issues).

-*TorchServe acknowledges the [Multi Model Server (MMS)](https://github.com/awslabs/multi-model-server) project from which it was derived*
+*TorchServe acknowledges the [Multi Model Server (MMS)](https://github.com/awslabs/multi-model-server) project from which it was derived*

benchmarks/README.md

Lines changed: 5 additions & 1 deletion
@@ -6,6 +6,7 @@ We currently support benchmarking with JMeter & Apache Bench. One can also profi

 * [Benchmarking with JMeter](#benchmarking-with-jmeter)
 * [Benchmarking with Apache Bench](#benchmarking-with-apache-bench)
+* [Auto-benchmarking Apache Bench on AWS](#benchmarking-apache-bench-aws)
 * [Profiling](#profiling)

 # Benchmarking with JMeter

@@ -304,7 +305,7 @@ Note: These pre-defined parameters in test plan can be overwritten by cmd line a
 The reports are generated at location "/tmp/benchmark/"
 - CSV report: /tmp/benchmark/ab_report.csv
 - latency graph: /tmp/benchmark/predict_latency.png
-- torhcserve logs: /tmp/benchmark/logs/model_metrics.log
+- torchserve logs: /tmp/benchmark/logs/model_metrics.log
 - raw ab output: /tmp/benchmark/result.txt

 ### Sample output CSV

@@ -315,6 +316,9 @@ The reports are generated at location "/tmp/benchmark/"
 ### Sample latency graph
 ![](predict_latency.png)

+# Benchmarking Apache Bench AWS
+If you're making a large change to TorchServe, it's best to run an [automated benchmarking suite on AWS](https://github.com/pytorch/serve/tree/master/test/benchmark) so that you can test multiple CUDA versions and EC2 hardware configurations easily.
+
 # Profiling

 ## Frontend
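For context on the Apache Bench reports mentioned in the hunk above, the raw `ab` run that the benchmark scripts automate looks roughly like the hedged sketch below (it assumes the default inference port 8080, an already-registered `densenet161` model, and an illustrative `kitten.jpg` payload):

```bash
# Hedged sketch of a raw Apache Bench run against a locally served model.
# -n: total requests, -c: concurrency, -p/-T: POST payload and content type.
ab -n 1000 -c 10 -p kitten.jpg -T image/jpeg \
  http://127.0.0.1:8080/predictions/densenet161
```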

binaries/README.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 ```pwsh
 python .\ts_scripts\install_dependencies.py --environment=dev
 ```
-> For GPU with Cuda 10.1, make sure add the `--cuda cu101` arg to the above command
+> For GPU with Cuda 10.2, make sure to add the `--cuda cu102` arg to the above command


 2. To build a `torchserve` and `torch-model-archiver` wheel execute:

ci/buildspec.yml

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ phases:
     commands:
       - apt-get update
       - apt-get install sudo -y
-      - python ts_scripts/install_dependencies.py --cuda=cu101 --environment=dev
+      - python ts_scripts/install_dependencies.py --cuda=cu102 --environment=dev

   build:
     commands:

codebuild/README.md

Lines changed: 0 additions & 27 deletions
This file was deleted.

docker/README.md

Lines changed: 9 additions & 25 deletions
@@ -32,44 +32,28 @@ Use `build_image.sh` script to build the docker images. The script builds the `p
 |-b, --branch_name|Specify a branch name to use. Default: master |
 |-g, --gpu|Build image with GPU based ubuntu base image|
 |-bt, --buildtype|Which type of docker image to build. Can be one of : production, dev, codebuild|
-|-t, --tag|Tag name for image. If not specified, script uses torchserv default tag names.|
+|-t, --tag|Tag name for image. If not specified, script uses torchserve default tag names.|
 |-cv, --cudaversion| Specify the cuda version to use. Supported values `cu92`, `cu101`, `cu102`, `cu111`. Default `cu102`|
+|--codebuild| Set if you need [AWS CodeBuild](https://aws.amazon.com/codebuild/)|
+

 **PRODUCTION ENVIRONMENT IMAGES**

 Creates a docker image with publicly available `torchserve` and `torch-model-archiver` binaries installed.

-- For creating CPU based image :
+- To create a CPU based image

 ```bash
 ./build_image.sh
 ```

-- For creating GPU based image with cuda version 11.1:
-
-```bash
-./build_image.sh -g -cv cu111
-```
-
-- For creating GPU based image with cuda version 10.2:
-
-```bash
-./build_image.sh -g -cv cu102
-```
-
-- For creating GPU based image with cuda version 10.1:
+- To create a GPU based image with cuda 10.2. Options are `cu92`, `cu101`, `cu102`, `cu111`

-```bash
-./build_image.sh -g -cv cu101
-```
-
-- For creating GPU based image with cuda version 9.2:
+```bash
+./build_image.sh -g -cv cu102
+```

-```bash
-./build_image.sh -g -cv cu92
-```
-
-- For creating image with a custom tag:
+- To create an image with a custom tag

 ```bash
 ./build_image.sh -t torchserve:1.0
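Since the flags in the table above compose, a combined invocation is also possible; the following is a hedged sketch (assuming the options behave as documented, with an illustrative tag name) that builds a dev-type GPU image for CUDA 10.2 with a custom tag:

```bash
# Hedged sketch: combine the -bt, -g, -cv and -t options documented above.
./build_image.sh -bt dev -g -cv cu102 -t torchserve:dev-gpu
```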

docs/FAQs.md

Lines changed: 11 additions & 10 deletions
@@ -46,15 +46,15 @@ Refer [configuration.md](configuration.md) for more details.


 ### How can I resolve model specific python dependency?
-You can provide a requirements.txt while creating a mar file using "--requirements-file/ -r" flag. Also, you can add dependency files using "--extra-files" flag.
+You can provide a `requirements.txt` while creating a mar file using the "--requirements-file/ -r" flag. Also, you can add dependency files using the "--extra-files" flag.
 Refer [configuration.md](configuration.md) for more details.

 ### Can I deploy Torchserve in Kubernetes?
 Yes, you can deploy Torchserve in Kubernetes using Helm charts.
 Refer [Kubernetes deployment](../kubernetes/README.md) for more details.

 ### Can I deploy Torchserve with AWS ELB and AWS ASG?
-Yes, you can deploy Torchserve on a multinode ASG AWS EC2 cluster. There is a cloud formation template available [here](https://github.com/pytorch/serve/blob/master/cloudformation/ec2-asg.yaml) for this type of deployment. Refer [ Multi-node EC2 deployment behind Elastic LoadBalancer (ELB)](https://github.com/pytorch/serve/tree/master/cloudformation#multi-node-ec2-deployment-behind-elastic-loadbalancer-elb) more details.
+Yes, you can deploy Torchserve on a multi-node ASG AWS EC2 cluster. There is a cloud formation template available [here](https://github.com/pytorch/serve/blob/master/cloudformation/ec2-asg.yaml) for this type of deployment. Refer [Multi-node EC2 deployment behind Elastic LoadBalancer (ELB)](https://github.com/pytorch/serve/tree/master/cloudformation#multi-node-ec2-deployment-behind-elastic-loadbalancer-elb) for more details.

 ### How can I backup and restore Torchserve state?
 TorchServe preserves server runtime configuration across sessions such that a TorchServe instance experiencing either a planned or unplanned service stop can restore its state upon restart. These saved runtime configuration files can be used for backup and restore.
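Tying the python-dependency answer above to a concrete command, a hedged sketch of packaging a model with a per-model `requirements.txt` and extra dependency files follows (the model and file names are illustrative; see the model-archiver README for the full flag reference):

```bash
# Hedged sketch; model and file names are illustrative.
torch-model-archiver --model-name my_model --version 1.0 \
  --serialized-file model.pth \
  --handler image_classifier \
  --extra-files index_to_name.json \
  --requirements-file requirements.txt \
  --export-path model_store
```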
@@ -65,7 +65,7 @@ Torchserve has a utility [script](../docker/build_image.sh) for creating docker

 All these docker images can be created using `build_image.sh` with appropriate options.

-Run `./build_image.sh --help` for all availble options.
+Run `./build_image.sh --help` for all available options.

 Refer [Create Torchserve docker image from source](../docker/README.md#create-torchserve-docker-image) for more details.

@@ -80,11 +80,11 @@ To create a Docker image for a specific branch and specific tag, use the followi


 ### What is the difference between image created using Dockerfile and image created using Dockerfile.dev?
-The image created using Dockerfile.dev has Torchserve installed from source where as image created using Dockerfile has Torchserve installed from pypi distribution.
+The image created using Dockerfile.dev has Torchserve installed from source, whereas the image created using Dockerfile has Torchserve installed from the PyPI distribution.

 ## API
 Relevant documents
-- [Torchserve Rest API](../docs/model_zoo.md#model-zoo)
+- [Torchserve Rest API](rest_api.md)

 ### What can I use other than *curl* to make requests to Torchserve?
 You can use any tool like Postman, Insomnia or even use a python script to do so. Find sample python script [here](https://github.com/pytorch/serve/blob/master/docs/default_handlers.md#torchserve-default-inference-handlers).
@@ -103,7 +103,7 @@ Relevant documents
 - [Custom Handlers](custom_service.md#custom-handlers)

 ### How do I return an image output for a model?
-You would have to write a custom handler with the post processing to return image.
+You would have to write a custom handler and modify the postprocessing to return the image.
 Refer [custom service documentation](custom_service.md#custom-handlers) for more details.

 ### How to enhance the default handlers?
@@ -116,24 +116,25 @@ Refer [default handlers](default_handlers.md#torchserve-default-inference-handle

 ### Is it possible to deploy Hugging Face models?
 Yes, you can deploy Hugging Face models using a custom handler.
-Refer [Huggingface_Transformers](../examples/Huggingface_Transformers/README.md) for example.
+Refer [HuggingFace_Transformers](https://github.com/pytorch/serve/blob/master/examples/Huggingface_Transformers/README.md) for an example.

 ## Model-archiver
 Relevant documents
 - [Model-archiver](../model-archiver/README.md#torch-model-archiver-for-torchserve)
 - [Docker Readme](../docker/README.md)

 ### What is a mar file?
-A mar file is a zip file consisting of all model artifacts with the ".mar" extension. The cmd-line utility *torch-model-archiver* is used to create a mar file.
+A mar file is a zip file consisting of all model artifacts with the ".mar" extension. The cmd-line utility `torch-model-archiver` is used to create a mar file.

 ### How can I create a mar file using the Torchserve docker container?
 Yes, you can create your mar file using a Torchserve container. Follow the steps given [here](../docker/README.md#create-torch-model-archiver-from-container).

 ### Can I add multiple serialized files in a single mar file?
-Currently `TorchModelArchiver` allows supplying only one serialized file with `--serialized-file` parameter while creating the mar. However, you can supply any number and any type of file with `--extra-files` flag. All the files supplied in the mar file are available in `model_dir` location which can be accessed through the context object supplied to the handler's entry point.
+Currently `torch-model-archiver` allows supplying only one serialized file with the `--serialized-file` parameter while creating the mar. However, you can supply any number and any type of file with the `--extra-files` flag. All the files supplied in the mar file are available in the `model_dir` location, which can be accessed through the context object supplied to the handler's entry point.

 Sample code snippet:
-```
+
+```python
 properties = context.system_properties
 model_dir = properties.get("model_dir")
 ```
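To make the two handler-related answers above concrete, here is a hedged sketch of a custom handler that reads an extra file from `model_dir` and returns image bytes from `postprocess`. Names such as `MyImageHandler` and `labels.json` are illustrative, the `BaseHandler` import assumes TorchServe's packaged `ts` module, and the output range is assumed to be [0, 1]; see custom_service.md for the authoritative interface.

```python
# Hedged sketch of a custom handler; class and file names are illustrative.
import io
import json
import os

from PIL import Image
from ts.torch_handler.base_handler import BaseHandler


class MyImageHandler(BaseHandler):
    def initialize(self, context):
        super().initialize(context)
        # Extra files passed via --extra-files land in model_dir.
        properties = context.system_properties
        model_dir = properties.get("model_dir")
        # labels.json is a hypothetical extra file supplied at archiving time.
        with open(os.path.join(model_dir, "labels.json")) as f:
            self.labels = json.load(f)

    def postprocess(self, inference_output):
        # Return one PNG image per request item as raw bytes
        # (assumes the model emits float tensors scaled to [0, 1]).
        responses = []
        for tensor in inference_output:
            array = (tensor.squeeze().cpu().numpy() * 255).astype("uint8")
            buffer = io.BytesIO()
            Image.fromarray(array).save(buffer, format="PNG")
            responses.append(buffer.getvalue())
        return responses
```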
