This repository was archived by the owner on Aug 7, 2025. It is now read-only.
Merged
7 changes: 5 additions & 2 deletions CONTRIBUTING.md
@@ -21,11 +21,14 @@ If you are interested in contributing to TorchServe, your contributions will fall
```
> Supported cuda versions: cu111, cu102, cu101, cu92

- Execute sanity suite
- Run sanity suite
```bash
python ./torchserve_sanity.py
python torchserve_sanity.py
```
- Run regression tests: `python test/regression_tests.py`
- To run individual test suites, refer to the [code_coverage](docs/code_coverage.md) documentation
- If you are updating an existing model, make sure performance hasn't degraded: run the [benchmarks](https://github.com/pytorch/serve/tree/master/benchmarks) on both the master branch and your branch and verify there is no performance regression (see the sketch after this list)
- For large changes, make sure to run the [automated benchmark suite](https://github.com/pytorch/serve/tree/master/test/benchmark), which runs the apache bench tests on several configurations of CUDA and EC2 instances
- If you need more context on a particular issue, please raise a ticket on the [`TorchServe` GH repo](https://github.com/pytorch/serve/issues/new/choose) or connect via [PyTorch's slack channel](https://pytorch.slack.com/)
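A hypothetical before/after comparison using the apache-bench driver script in `benchmarks/` might look like the sketch below; the script name, flags, branch name, and model URL are placeholders and should be checked against the [benchmarks README](https://github.com/pytorch/serve/tree/master/benchmarks) before use.
```bash
# Hypothetical comparison run -- verify script name and flags against benchmarks/README.md
git checkout master
python benchmarks/benchmark-ab.py --url https://torchserve.pytorch.org/mar_files/resnet-18.mar \
    --concurrency 10 --requests 1000
cp /tmp/benchmark/ab_report.csv /tmp/ab_report_master.csv

git checkout my-feature-branch   # placeholder branch name
python benchmarks/benchmark-ab.py --url https://torchserve.pytorch.org/mar_files/resnet-18.mar \
    --concurrency 10 --requests 1000

# Compare throughput and latency columns between the two reports
diff /tmp/ab_report_master.csv /tmp/benchmark/ab_report.csv
```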

Once you finish implementing a feature or bug-fix, please send a Pull Request to https://github.com/pytorch/serve. Use this [template](pull_request_template.md) when creating a Pull Request.
24 changes: 17 additions & 7 deletions README.md
@@ -22,6 +22,8 @@ TorchServe is a flexible and easy to use tool for serving PyTorch models.
* [Serve a Model](#serve-a-model)
* [Serve a Workflow](docs/workflows.md)
* [Quick start with docker](#quick-start-with-docker)
* [Highlighted Examples](#highlighted-examples)
* [Featured Community Projects](#featured-community-projects)
* [Contributing](#contributing)

## Install TorchServe and torch-model-archiver
@@ -48,17 +50,17 @@ TorchServe is a flexible and easy to use tool for serving PyTorch models.

Refer to the documentation [here](docs/torchserve_on_win_native.md).

2. Install torchserve and torch-model-archiver
2. Install torchserve, torch-model-archiver and torch-workflow-archiver

For [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install)
Note: Conda packages are not supported for Windows. Refer to the documentation [here](docs/torchserve_on_win_native.md).
```
conda install torchserve torch-model-archiver -c pytorch
conda install torchserve torch-model-archiver torch-workflow-archiver -c pytorch
```

For Pip
```
pip install torchserve torch-model-archiver
pip install torchserve torch-model-archiver torch-workflow-archiver
```

Now you are ready to [package and serve models with TorchServe](#serve-a-model).
@@ -71,7 +73,7 @@ Ensure that you have `python3` installed, and the user has access to the site-pa

Run the following script from the top of the source directory.

NOTE: This script uninstalls existing `torchserve` and `torch-model-archiver` installations
NOTE: This script uninstalls existing `torchserve`, `torch-model-archiver` and `torch-workflow-archiver` installations

#### For Debian Based Systems/ MacOS

@@ -136,7 +138,7 @@ torchserve --start --ncs --model-store model_store --models densenet161.mar

After you execute the `torchserve` command above, TorchServe runs on your host, listening for inference requests.

**Note**: If you specify model(s) when you run TorchServe, it automatically scales backend workers to the number equal to available vCPUs (if you run on a CPU instance) or to the number of available GPUs (if you run on a GPU instance). In case of powerful hosts with a lot of compute resoures (vCPUs or GPUs), this start up and autoscaling process might take considerable time. If you want to minimize TorchServe start up time you should avoid registering and scaling the model during start up time and move that to a later point by using corresponding [Management API](docs/management_api.md#register-a-model), which allows finer grain control of the resources that are allocated for any particular model).
**Note**: If you specify model(s) when you run TorchServe, it automatically scales backend workers to the number equal to available vCPUs (if you run on a CPU instance) or to the number of available GPUs (if you run on a GPU instance). On powerful hosts with a lot of compute resources (vCPUs or GPUs), this start up and autoscaling process might take considerable time. If you want to minimize TorchServe start up time, avoid registering and scaling models during start up and move that to a later point by using the corresponding [Management API](docs/management_api.md#register-a-model), which allows finer grain control of the resources that are allocated for any particular model (see the sketch below).
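For example, a minimal sketch of deferred registration through the Management API, assuming the default management port 8081 and the `densenet161.mar` archive from the example above:
```bash
# Start TorchServe without any models so start up stays fast
torchserve --start --ncs --model-store model_store

# Register the model later with a single initial worker
curl -X POST "http://localhost:8081/models?url=densenet161.mar&initial_workers=1&synchronous=true"

# Scale workers up once traffic demands it
curl -X PUT "http://localhost:8081/models/densenet161?min_worker=4&synchronous=true"

# Inspect the current worker allocation
curl "http://localhost:8081/models/densenet161"
```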

### Get predictions from a model

@@ -212,6 +214,11 @@ To stop the currently running TorchServe instance, run:
torchserve --stop
```

### Inspect the logs
All the logs you've seen as output to stdout related to model registration, management, and inference are recorded in the `/logs` folder.

High-level performance data like throughput or percentile precision can be generated with [Benchmark](benchmark/README.md) and visualized in a report.
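For example, a typical default layout of the log folder looks roughly like this; the exact file names depend on your log4j configuration:
```bash
ls logs/
# access_log.log       <- one line per API request
# ts_log.log           <- frontend log (model registration, management)
# model_log.log        <- backend worker / handler output
# ts_metrics.log       <- system metrics (CPU, memory, disk)
# model_metrics.log    <- per-request metrics such as latency
```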

### Concurrency And Number of Workers
TorchServe exposes configurations that allow the user to configure the number of worker threads on CPUs and GPUs. There is an important config property that can speed up the server depending on the workload.
*Note: the following property has a bigger impact under heavy workloads.*
@@ -239,9 +246,12 @@ Feel free to skim the full list of [available examples](examples/README.md)
## Learn More

* [Full documentation on TorchServe](docs/README.md)
* [Manage models API](docs/management_api.md)
* [Model Management API](docs/management_api.md)
* [Inference API](docs/inference_api.md)
* [Metrics API](docs/metrics.md)
* [Package models for use with TorchServe](model-archiver/README.md)
* [Deploying TorchServe with Kubernetes](kubernetes/README.md)
* [TorchServe Workflows](examples/Workflows/README.md)
* [TorchServe model zoo for pre-trained and pre-packaged models-archives](docs/model_zoo.md)

## Contributing
@@ -255,4 +265,4 @@ To file a bug or request a feature, please file a GitHub issue. For filing pull
## Disclaimer
This repository is jointly operated and maintained by Amazon, Facebook and a number of individual contributors listed in the [CONTRIBUTORS](https://github.com/pytorch/serve/graphs/contributors) file. For questions directed at Facebook, please send an email to [email protected]. For questions directed at Amazon, please send an email to [email protected]. For all other questions, please open up an issue in this repository [here](https://github.com/pytorch/serve/issues).

*TorchServe acknowledges the [Multi Model Server (MMS)](https://github.com/awslabs/multi-model-server) project from which it was derived*
*TorchServe acknowledges the [Multi Model Server (MMS)](https://github.com/awslabs/multi-model-server) project from which it was derived*
6 changes: 5 additions & 1 deletion benchmarks/README.md
@@ -6,6 +6,7 @@ We currently support benchmarking with JMeter & Apache Bench. One can also profi

* [Benchmarking with JMeter](#benchmarking-with-jmeter)
* [Benchmarking with Apache Bench](#benchmarking-with-apache-bench)
* [Auto-benchmarking Apache Bench on AWS](#benchmarking-apache-bench-aws)
* [Profiling](#profiling)

# Benchmarking with JMeter
@@ -304,7 +305,7 @@ Note: These pre-defined parameters in test plan can be overwritten by cmd line a
The reports are generated at location "/tmp/benchmark/"
- CSV report: /tmp/benchmark/ab_report.csv
- latency graph: /tmp/benchmark/predict_latency.png
- torhcserve logs: /tmp/benchmark/logs/model_metrics.log
- torchserve logs: /tmp/benchmark/logs/model_metrics.log
- raw ab output: /tmp/benchmark/result.txt

### Sample output CSV
@@ -315,6 +316,9 @@ The reports are generated at location "/tmp/benchmark/"
### Sample latency graph
![](predict_latency.png)

# Benchmarking Apache Bench AWS
If you're making a large change to TorchServe, it's best to run the [automated benchmarking suite on AWS](https://github.com/pytorch/serve/tree/master/test/benchmark) so that you can easily test multiple CUDA versions and EC2 hardware configurations.

# Profiling

## Frontend
2 changes: 1 addition & 1 deletion binaries/README.md
@@ -9,7 +9,7 @@
```pwsh
python .\ts_scripts\install_dependencies.py --environment=dev
```
> For GPU with Cuda 10.1, make sure add the `--cuda cu101` arg to the above command
> For GPU with CUDA 10.2, make sure to add the `--cuda cu102` arg to the above command
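For example, a dev-environment install for CUDA 10.2 would combine both flags from the command above:
```pwsh
python .\ts_scripts\install_dependencies.py --environment=dev --cuda=cu102
```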


2. To build a `torchserve` and `torch-model-archiver` wheel execute:
2 changes: 1 addition & 1 deletion ci/buildspec.yml
@@ -7,7 +7,7 @@ phases:
commands:
- apt-get update
- apt-get install sudo -y
- python ts_scripts/install_dependencies.py --cuda=cu101 --environment=dev
- python ts_scripts/install_dependencies.py --cuda=cu102 --environment=dev

build:
commands:
27 changes: 0 additions & 27 deletions codebuild/README.md

This file was deleted.

34 changes: 9 additions & 25 deletions docker/README.md
@@ -32,44 +32,28 @@ Use `build_image.sh` script to build the docker images. The script builds the `p
|-b, --branch_name|Specify a branch name to use. Default: master |
|-g, --gpu|Build image with GPU based ubuntu base image|
|-bt, --buildtype|Which type of docker image to build. Can be one of: production, dev, codebuild|
|-t, --tag|Tag name for image. If not specified, script uses torchserv default tag names.|
|-t, --tag|Tag name for image. If not specified, script uses torchserve default tag names.|
|-cv, --cudaversion| Specify the cuda version to use. Supported values `cu92`, `cu101`, `cu102`, `cu111`. Default `cu102`|
|--codebuild| Set if you need [AWS CodeBuild](https://aws.amazon.com/codebuild/)|


**PRODUCTION ENVIRONMENT IMAGES**

Creates a docker image with publicly available `torchserve` and `torch-model-archiver` binaries installed.

- For creating CPU based image :
- To create a CPU based image

```bash
./build_image.sh
```

- For creating GPU based image with cuda version 11.1:

```bash
./build_image.sh -g -cv cu111
```

- For creating GPU based image with cuda version 10.2:

```bash
./build_image.sh -g -cv cu102
```

- For creating GPU based image with cuda version 10.1:
- To create a GPU based image with CUDA 10.2 (supported options are `cu92`, `cu101`, `cu102`, `cu111`)

```bash
./build_image.sh -g -cv cu101
```

- For creating GPU based image with cuda version 9.2:
```bash
./build_image.sh -g -cv cu102
```

```bash
./build_image.sh -g -cv cu92
```

- For creating image with a custom tag:
- To create an image with a custom tag

```bash
./build_image.sh -t torchserve:1.0
21 changes: 11 additions & 10 deletions docs/FAQs.md
@@ -46,15 +46,15 @@ Refer [configuration.md](configuration.md) for more details.


### How can I resolve model specific python dependency?
You can provide a requirements.txt while creating a mar file using "--requirements-file/ -r" flag. Also, you can add dependency files using "--extra-files" flag.
You can provide a `requirements.txt` while creating a mar file using the "--requirements-file / -r" flag. Also, you can add dependency files using the "--extra-files" flag.
Refer [configuration.md](configuration.md) for more details.
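For illustration, a hypothetical archiver invocation that bundles a model-specific requirements file (all file names below are placeholders):

```bash
torch-model-archiver --model-name my_text_model \
    --version 1.0 \
    --serialized-file model.pt \
    --handler text_classifier \
    --requirements-file requirements.txt \
    --extra-files "index_to_name.json,vocab.txt"
```

Note that installing model-specific dependencies at load time also needs to be enabled on the server side; see [configuration.md](configuration.md).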

### Can I deploy Torchserve in Kubernetes?
Yes, you can deploy Torchserve in Kubernetes using Helm charts.
Refer [Kubernetes deployment ](../kubernetes/README.md) for more details.

### Can I deploy Torchserve with AWS ELB and AWS ASG?
Yes, you can deploy Torchserve on a multinode ASG AWS EC2 cluster. There is a cloud formation template available [here](https://github.com/pytorch/serve/blob/master/cloudformation/ec2-asg.yaml) for this type of deployment. Refer [ Multi-node EC2 deployment behind Elastic LoadBalancer (ELB)](https://github.com/pytorch/serve/tree/master/cloudformation#multi-node-ec2-deployment-behind-elastic-loadbalancer-elb) more details.
Yes, you can deploy Torchserve on a multi-node ASG AWS EC2 cluster. There is a CloudFormation template available [here](https://github.com/pytorch/serve/blob/master/cloudformation/ec2-asg.yaml) for this type of deployment. Refer to [Multi-node EC2 deployment behind Elastic LoadBalancer (ELB)](https://github.com/pytorch/serve/tree/master/cloudformation#multi-node-ec2-deployment-behind-elastic-loadbalancer-elb) for more details.

### How can I backup and restore Torchserve state?
TorchServe preserves server runtime configuration across sessions such that a TorchServe instance experiencing either a planned or unplanned service stop can restore its state upon restart. These saved runtime configuration files can be used for backup and restore.
@@ -65,7 +65,7 @@ Torchserve has a utility [script](../docker/build_image.sh) for creating docker

All these docker images can be created using `build_image.sh` with appropriate options.

Run `./build_image.sh --help` for all availble options.
Run `./build_image.sh --help` for all available options.

Refer [Create Torchserve docker image from source](../docker/README.md#create-torchserve-docker-image) for more details.

@@ -80,11 +80,11 @@ To create a Docker image for a specific branch and specific tag, use the followi


### What is the difference between image created using Dockerfile and image created using Dockerfile.dev?
The image created using Dockerfile.dev has Torchserve installed from source where as image created using Dockerfile has Torchserve installed from pypi distribution.
The image created using Dockerfile.dev has Torchserve installed from source, whereas the image created using Dockerfile has Torchserve installed from the PyPI distribution.

## API
Relevant documents
- [Torchserve Rest API](../docs/model_zoo.md#model-zoo)
- [Torchserve Rest API](rest_api.md)

### What can I use other than *curl* to make requests to Torchserve?
You can use any tool like Postman, Insomnia or even use a python script to do so. Find sample python script [here](https://github.com/pytorch/serve/blob/master/docs/default_handlers.md#torchserve-default-inference-handlers).
@@ -103,7 +103,7 @@ Relevant documents
- [Custom Handlers](custom_service.md#custom-handlers)

### How do I return an image output for a model?
You would have to write a custom handler with the post processing to return image.
You would have to write a custom handler and modify the postprocessing to return the image.
Refer [custom service documentation](custom_service.md#custom-handlers) for more details.
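As an illustration only, here is a minimal sketch of such a handler, assuming the `BaseHandler` interface and returning the image as base64-encoded PNG bytes; the class name, tensor layout, and encoding choice are assumptions, not the project's prescribed approach.

```python
import base64
import io

from PIL import Image
from ts.torch_handler.base_handler import BaseHandler


class ImageOutputHandler(BaseHandler):
    """Hypothetical handler that returns an image instead of class scores."""

    def postprocess(self, data):
        # `data` is the raw model output; the conversion below assumes a
        # single CHW float tensor with values in [0, 1].
        tensor = data[0].detach().cpu().clamp(0, 1)
        array = (tensor.permute(1, 2, 0).numpy() * 255).astype("uint8")
        image = Image.fromarray(array)

        # Serialize to PNG and base64-encode so the bytes survive the JSON response.
        buffer = io.BytesIO()
        image.save(buffer, format="PNG")
        encoded = base64.b64encode(buffer.getvalue()).decode("utf-8")

        # TorchServe expects a list with one entry per request in the batch.
        return [encoded]
```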

### How to enhance the default handlers?
@@ -116,24 +116,25 @@ Refer [default handlers](default_handlers.md#torchserve-default-inference-handle

### Is it possible to deploy Hugging Face models?
Yes, you can deploy Hugging Face models using a custom handler.
Refer [Huggingface_Transformers](../examples/Huggingface_Transformers/README.md) for example.
Refer [HuggingFace_Transformers](https://github.com/pytorch/serve/blob/master/examples/Huggingface_Transformers/README.md) for example.

## Model-archiver
Relevant documents
- [Model-archiver ](../model-archiver/README.md#torch-model-archiver-for-torchserve)
- [Docker Readme](../docker/README.md)

### What is a mar file?
A mar file is a zip file consisting of all model artifacts with the ".mar" extension. The cmd-line utility *torch-model-archiver* is used to create a mar file.
A mar file is a zip file consisting of all model artifacts with the ".mar" extension. The cmd-line utility `torch-model-archiver` is used to create a mar file.

### How can I create a mar file using a Torchserve docker container?
Yes, you can create your mar file using a Torchserve container. Follow the steps given [here](../docker/README.md#create-torch-model-archiver-from-container).

### Can I add multiple serialized files in a single mar file?
Currently `TorchModelArchiver` allows supplying only one serialized file with `--serialized-file` parameter while creating the mar. However, you can supply any number and any type of file with `--extra-files` flag. All the files supplied in the mar file are available in `model_dir` location which can be accessed through the context object supplied to the handler's entry point.
Currently `torch-model-archiver` allows supplying only one serialized file with `--serialized-file` parameter while creating the mar. However, you can supply any number and any type of file with `--extra-files` flag. All the files supplied in the mar file are available in `model_dir` location which can be accessed through the context object supplied to the handler's entry point.
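For illustration, a hypothetical archive that bundles extra artifacts alongside the single serialized file (file names are placeholders); the sample snippet below then shows how a handler locates them through `model_dir`:

```bash
torch-model-archiver --model-name my_model \
    --version 1.0 \
    --serialized-file model.pt \
    --handler my_handler.py \
    --extra-files "config.json,tokenizer.json,labels.txt"
```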

Sample code snippet:
```

```python
properties = context.system_properties
model_dir = properties.get("model_dir")
```