AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
harshbafna
left a comment
I'm not sure it is a good idea to introduce default batch_size and max_batch_delay parameters in TorchServe, as every model may use a different batch size.
There are two other possible approaches:
- Allow users to provide these parameters as part of model-archive creation:
  --batch_size 2 --max_batch_delay 50
- Allow users to specify batch_size and max_batch_delay with the load_models parameter. Something like:
  load_models=model1:2:50,model2:5:100
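To make the second suggestion concrete, here is a minimal sketch of how a `name:batch_size:max_batch_delay` value could be parsed. The syntax is only the one proposed in this comment, not existing TorchServe behavior, and `ModelBatchConfig` and `parse_load_models` are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelBatchConfig:
    """Per-model batching settings (hypothetical container for this sketch)."""
    name: str
    batch_size: Optional[int] = None       # None means "use the server default"
    max_batch_delay: Optional[int] = None  # milliseconds; None means "use the default"


def parse_load_models(value: str) -> List[ModelBatchConfig]:
    """Parse the proposed 'model1:2:50,model2:5:100' syntax.

    Assumption: entries are separated by ',' and fields by ':'.
    A bare model name with no batching fields is also accepted.
    """
    configs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parts = entry.split(":")
        if len(parts) == 1:
            configs.append(ModelBatchConfig(parts[0]))
        elif len(parts) == 3:
            name, batch_size, delay = parts
            configs.append(ModelBatchConfig(name, int(batch_size), int(delay)))
        else:
            raise ValueError(
                f"expected 'name' or 'name:batch_size:max_batch_delay', got {entry!r}"
            )
    return configs
```

One caveat with this syntax: `load_models` values may also be archive file names or URLs, and URLs contain `:`, so a real implementation would likely need a less ambiguous separator.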
Yeah, each model may have a different batchSize and maxBatchDelay. Ideally, each model's manifest would define default values for all of its parameters, such as the number of workers, batchSize, and maxBatchDelay. However, the existing implementation uses a properties file to define the default number of workers for all models, and each model's attributes (number of workers, batchSize, maxBatchDelay) can be changed via an HTTP request. To follow the existing implementation pattern, I chose to define default values for all models rather than define only batchSize and maxBatchDelay in the manifest.
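For reference, the per-model HTTP override mentioned in this comment can go through the management API's register-model call, which accepts `batch_size` and `max_batch_delay` as query parameters. The sketch below only builds the request URL and sends nothing; the host, the default management port 8081, the archive name `model1.mar`, and the helper name `register_model_url` are assumptions for illustration.

```python
from urllib.parse import urlencode


def register_model_url(base: str, mar_url: str, batch_size: int, max_batch_delay: int) -> str:
    """Build the URL for a POST /models registration request."""
    query = urlencode({
        "url": mar_url,
        "batch_size": batch_size,
        "max_batch_delay": max_batch_delay,  # milliseconds
    })
    return f"{base}/models?{query}"


print(register_model_url("http://localhost:8081", "model1.mar", 2, 50))
```

Sending a POST to the resulting URL would register the model with its own batching settings, independent of any server-wide default.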
@harshbafna any update on this?

@lxning should we go ahead and get this merged? There are a couple of issues asking for configurable batch sizes. Just to clarify: which manifest should users change if they want to deploy several models at once, each with its own batch size?

Dupe of #1122
Description
Please include a summary of the feature or issue being fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes #(issue)
Type of change
Please delete options that are not relevant.
Feature/Issue validation/testing
Allow user to configure default batchSize and maxBatchDelay in properties when model is initiated.
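As a rough illustration of the intended behavior, the sketch below resolves a model's batching settings from server-wide defaults in a properties file, with explicit per-model values taking precedence. The property keys `default_batch_size` and `default_max_batch_delay`, the function names, and the built-in fallbacks of 1 and 100 ms are assumptions for this sketch, not names confirmed by this PR.

```python
from typing import Dict, Optional, Tuple

# Hypothetical property keys; the actual key names are not shown in this PR.
BATCH_SIZE_KEY = "default_batch_size"
MAX_BATCH_DELAY_KEY = "default_max_batch_delay"


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple 'key=value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve_batching(
    props: Dict[str, str],
    batch_size: Optional[int] = None,
    max_batch_delay: Optional[int] = None,
) -> Tuple[int, int]:
    """Explicit per-model values win; otherwise fall back to the properties,
    then to the assumed built-in fallbacks (1 and 100 ms)."""
    if batch_size is None:
        batch_size = int(props.get(BATCH_SIZE_KEY, 1))
    if max_batch_delay is None:
        max_batch_delay = int(props.get(MAX_BATCH_DELAY_KEY, 100))
    return batch_size, max_batch_delay
```

With this ordering, a model registered without explicit values picks up the configured defaults, while a model that specifies its own batch size keeps it.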
Please describe the tests [UT/IT] that you ran to verify your changes, with a summary of the relevant results. Provide instructions so they can be reproduced.
Please also list any relevant details for your test configuration.
Test A
Test B
UT/IT execution results
Logs
Checklist: