CONTRIBUTING.md (5 additions, 2 deletions)
@@ -21,11 +21,14 @@ If you are interested in contributing to TorchServe, your contributions will fall
 ```
 > Supported cuda versions are cu111, cu102, cu101, cu92

-- Execute sanity suite
+- Run sanity suite
 ```bash
-python ./torchserve_sanity.py
+python torchserve_sanity.py
 ```
+- Run the regression tests: `python test/regression_tests.py`
 - For running individual test suites, refer to the [code_coverage](docs/code_coverage.md) documentation
+- If you are updating an existing model, make sure that performance hasn't degraded by running [benchmarks](https://github.com/pytorch/serve/tree/master/benchmarks) on both the master branch and your branch, and verify there is no performance regression
+- For large changes, make sure to run the [automated benchmark suite](https://github.com/pytorch/serve/tree/master/test/benchmark), which runs the apache bench tests on several configurations of CUDA and EC2 instances
 - If you need more context on a particular issue, please raise a ticket on the [`TorchServe` GH repo](https://github.com/pytorch/serve/issues/new/choose) or connect to [PyTorch's slack channel](https://pytorch.slack.com/)

 Once you finish implementing a feature or bug-fix, please send a Pull Request to https://github.com/pytorch/serve. Use this [template](pull_request_template.md) when creating a Pull Request.
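For instance, a typical pre-PR check based on the commands above might look like this (run from the repository root, assuming the dev dependencies are already installed):

```bash
# Run the sanity suite
python torchserve_sanity.py

# Run the regression tests
python test/regression_tests.py
```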
README.md

 After you execute the `torchserve` command above, TorchServe runs on your host, listening for inference requests.

-**Note**: If you specify model(s) when you run TorchServe, it automatically scales backend workers to the number equal to available vCPUs (if you run on a CPU instance) or to the number of available GPUs (if you run on a GPU instance). In case of powerful hosts with a lot of compute resoures (vCPUs or GPUs), this start up and autoscaling process might take considerable time. If you want to minimize TorchServe start up time you should avoid registering and scaling the model during start up time and move that to a later point by using corresponding [Management API](docs/management_api.md#register-a-model), which allows finer grain control of the resources that are allocated for any particular model).
+**Note**: If you specify model(s) when you run TorchServe, it automatically scales backend workers to a number equal to the available vCPUs (if you run on a CPU instance) or to the number of available GPUs (if you run on a GPU instance). On powerful hosts with a lot of compute resources (vCPUs or GPUs), this start-up and autoscaling process might take considerable time. If you want to minimize TorchServe start-up time, avoid registering and scaling models during start up and move that to a later point by using the corresponding [Management API](docs/management_api.md#register-a-model), which allows finer-grained control of the resources allocated to any particular model.

 ### Get predictions from a model
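A minimal sketch of that deferred-registration pattern (the model store path and the `densenet161.mar` archive name are placeholders; the management port assumes the default `8081`):

```bash
# Start TorchServe without registering any model at start-up
torchserve --start --model-store model_store

# Register and scale the model later through the Management API
curl -X POST "http://localhost:8081/models?url=densenet161.mar&initial_workers=2&synchronous=true"
```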
@@ -212,6 +214,11 @@ To stop the currently running TorchServe instance, run:
 torchserve --stop
 ```

+### Inspect the logs
+All the logs you've seen as output to stdout related to model registration, management, and inference are recorded in the `/logs` folder.
+
+High level performance data like Throughput or Percentile Precision can be generated with [Benchmark](benchmark/README.md) and visualized in a report.
+
 ### Concurrency And Number of Workers
 TorchServe exposes configurations that allow the user to configure the number of worker threads on CPU and GPUs. There is an important config property that can speed up the server depending on the workload.
 *Note: the following property has bigger impact under heavy workloads.*
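As an illustration of the log inspection described above, assuming TorchServe was started from the current directory and uses its default log configuration (the file names below are the defaults, not something this change introduces):

```bash
# Follow the backend/model logs (registration, worker lifecycle, inference)
tail -f logs/model_log.log

# Frontend access log and server-side metrics
tail -f logs/access_log.log logs/ts_metrics.log
```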
@@ -239,9 +246,12 @@ Feel free to skim the full list of [available examples](examples/README.md)
 ## Learn More

 * [Full documentation on TorchServe](docs/README.md)
-* [Manage models API](docs/management_api.md)
+* [Model Management API](docs/management_api.md)
 * [Inference API](docs/inference_api.md)
+* [Metrics API](docs/metrics.md)
 * [Package models for use with TorchServe](model-archiver/README.md)
+* [Deploying TorchServe with Kubernetes](kubernetes/README.md)
 * [TorchServe model zoo for pre-trained and pre-packaged models-archives](docs/model_zoo.md)

 ## Contributing
@@ -255,4 +265,4 @@ To file a bug or request a feature, please file a GitHub issue. For filing pull
 ## Disclaimer
 This repository is jointly operated and maintained by Amazon, Facebook and a number of individual contributors listed in the [CONTRIBUTORS](https://github.com/pytorch/serve/graphs/contributors) file. For questions directed at Facebook, please send an email to [email protected]. For questions directed at Amazon, please send an email to [email protected]. For all other questions, please open up an issue in this repository [here](https://github.com/pytorch/serve/issues).

-*TorchServe acknowledges the [Multi Model Server (MMS)](https://github.com/awslabs/multi-model-server) project from which it was derived*
+*TorchServe acknowledges the [Multi Model Server (MMS)](https://github.com/awslabs/multi-model-server) project from which it was derived*
@@ -315,6 +316,9 @@ The reports are generated at location "/tmp/benchmark/"
 ### Sample latency graph
 

+# Benchmarking with Apache Bench on AWS
+If you're making a large change to TorchServe, it's best to run an [automated benchmarking suite on AWS](https://github.com/pytorch/serve/tree/master/test/benchmark) so that you can test multiple CUDA versions and EC2 hardware configurations easily.
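As a rough, hand-rolled illustration of the kind of measurement behind these latency reports (this is not the automated suite itself; the model name `densenet161` and the payload file are placeholders for a locally registered model):

```bash
# Send 1000 inference requests at concurrency 10 to a locally served model
ab -n 1000 -c 10 -p kitten.jpg -T application/octet-stream \
  http://127.0.0.1:8080/predictions/densenet161
```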
docs/FAQs.md (11 additions, 10 deletions)
@@ -46,15 +46,15 @@ Refer [configuration.md](configuration.md) for more details.

 ### How can I resolve model specific python dependency?
-You can provide a requirements.txt while creating a mar file using "--requirements-file/ -r" flag. Also, you can add dependency files using "--extra-files" flag.
+You can provide a `requirements.txt` while creating a mar file using the `--requirements-file` / `-r` flag. You can also add dependency files using the `--extra-files` flag.
 Refer [configuration.md](configuration.md) for more details.

 ### Can I deploy Torchserve in Kubernetes?
 Yes, you can deploy Torchserve in Kubernetes using Helm charts.
 Refer [Kubernetes deployment](../kubernetes/README.md) for more details.

 ### Can I deploy Torchserve with AWS ELB and AWS ASG?
-Yes, you can deploy Torchserve on a multinode ASG AWS EC2 cluster. There is a cloud formation template available [here](https://github.com/pytorch/serve/blob/master/cloudformation/ec2-asg.yaml) for this type of deployment. Refer [ Multi-node EC2 deployment behind Elastic LoadBalancer (ELB)](https://github.com/pytorch/serve/tree/master/cloudformation#multi-node-ec2-deployment-behind-elastic-loadbalancer-elb) more details.
+Yes, you can deploy Torchserve on a multi-node ASG AWS EC2 cluster. There is a CloudFormation template available [here](https://github.com/pytorch/serve/blob/master/cloudformation/ec2-asg.yaml) for this type of deployment. Refer to [Multi-node EC2 deployment behind Elastic LoadBalancer (ELB)](https://github.com/pytorch/serve/tree/master/cloudformation#multi-node-ec2-deployment-behind-elastic-loadbalancer-elb) for more details.

 ### How can I backup and restore Torchserve state?
 TorchServe preserves server runtime configuration across sessions such that a TorchServe instance experiencing either a planned or unplanned service stop can restore its state upon restart. These saved runtime configuration files can be used for backup and restore.
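For the python-dependency question above, a minimal sketch of the archiver invocation might look like this (the model name, weight file, and extra file are placeholders; the flags are the ones named in the answer):

```bash
torch-model-archiver --model-name my_model --version 1.0 \
  --serialized-file model.pt \
  --handler image_classifier \
  --requirements-file requirements.txt \
  --extra-files index_to_name.json
```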
@@ -65,7 +65,7 @@ Torchserve has a utility [script](../docker/build_image.sh) for creating docker
 All these docker images can be created using `build_image.sh` with appropriate options.

-Run `./build_image.sh --help` for all availble options.
+Run `./build_image.sh --help` for all available options.

 Refer [Create Torchserve docker image from source](../docker/README.md#create-torchserve-docker-image) for more details.
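A sketch of building with the script's defaults and then running the result (this assumes the default tag the script applies is `pytorch/torchserve:latest`; check `./build_image.sh --help` for the actual tagging options):

```bash
# Build the default image from the docker/ folder of the repo
cd docker
./build_image.sh

# Run the freshly built image, exposing the inference and management ports
docker run --rm -it -p 8080:8080 -p 8081:8081 pytorch/torchserve:latest
```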
@@ -80,11 +80,11 @@ To create a Docker image for a specific branch and specific tag, use the following

 ### What is the difference between image created using Dockerfile and image created using Dockerfile.dev?
-The image created using Dockerfile.dev has Torchserve installed from source where as image created using Dockerfile has Torchserve installed from pypi distribution.
+The image created using Dockerfile.dev has Torchserve installed from source, whereas the image created using Dockerfile has Torchserve installed from the PyPI distribution.

 ### What can I use other than *curl* to make requests to Torchserve?
 You can use any tool like Postman, Insomnia or even a python script to do so. Find a sample python script [here](https://github.com/pytorch/serve/blob/master/docs/default_handlers.md#torchserve-default-inference-handlers).
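Any HTTP client that can POST a binary body will do; as one more alternative to the tools named above, here is a sketch with plain `wget` against a hypothetical locally registered `densenet161` model:

```bash
# POST an image to the inference endpoint and print the JSON response to stdout
wget --quiet --output-document=- \
  --header="Content-Type: application/octet-stream" \
  --post-file=kitten.jpg \
  http://127.0.0.1:8080/predictions/densenet161
```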
 Yes, you can deploy Hugging Face models using a custom handler.
-Refer [Huggingface_Transformers](../examples/Huggingface_Transformers/README.md) for example.
+Refer to [HuggingFace_Transformers](https://github.com/pytorch/serve/blob/master/examples/Huggingface_Transformers/README.md) for an example.
-A mar file is a zip file consisting of all model artifacts with the ".mar" extension. The cmd-line utility *torch-model-archiver* is used to create a mar file.
+A mar file is a zip file consisting of all model artifacts with the ".mar" extension. The cmd-line utility `torch-model-archiver` is used to create a mar file.

 ### How can I create a mar file using the Torchserve docker container?
 Yes, you can create your mar file using a Torchserve container. Follow the steps given [here](../docker/README.md#create-torch-model-archiver-from-container).

 ### Can I add multiple serialized files in a single mar file?
-Currently `TorchModelArchiver` allows supplying only one serialized file with `--serialized-file` parameter while creating the mar. However, you can supply any number and any type of file with `--extra-files` flag. All the files supplied in the mar file are available in `model_dir` location which can be accessed through the context object supplied to the handler's entry point.
+Currently `torch-model-archiver` allows supplying only one serialized file with the `--serialized-file` parameter while creating the mar. However, you can supply any number and any type of file with the `--extra-files` flag. All the files supplied in the mar file are available in the `model_dir` location, which can be accessed through the context object supplied to the handler's entry point.
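A sketch of that multi-file case (all file names are hypothetical): one serialized checkpoint plus several auxiliary files bundled via `--extra-files`, all of which end up in `model_dir` for the handler to read.

```bash
# One --serialized-file, any number of extra files (comma-separated)
torch-model-archiver --model-name my_model --version 1.0 \
  --serialized-file model.pt \
  --handler my_handler.py \
  --extra-files "vocab.txt,tokenizer_config.json,labels.json"
# The handler can then read these files from the model_dir provided via its context object.
```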