A brief survey of PaddlePaddle Serving #394

@gongweibao

TensorFlow Serving

I started by looking into TensorFlow Serving. Its architecture overview document is here or here.

Beyond basic RPC server functionality, its highlights are the following features:

  • Multi-version management:
    • Multiple versions of a model can be loaded at the same time, and a client can request a specific version.
    • Hot model loading: when a new version of a model is published, it is loaded automatically.
      • The version management policy is customizable. Two policies are implemented by default: the Availability Preserving Policy and the Resource Preserving Policy.
TensorFlow Serving includes two policies that accommodate most known use-cases.
These are the Availability Preserving Policy (avoid leaving zero versions loaded;
typically load a new version before unloading an old one), and the Resource Preserving Policy
(avoid having two versions loaded simultaneously, thus requiring double the resources;
unload an old version before loading a new one)
  • Loading models from multiple kinds of storage:
    • It can be extended to support additional storage types.
  • Batching of client requests:
    • Likewise, the batching policy is customizable.
Batching of multiple requests into a single request can significantly reduce the cost
of performing inference, especially in the presence of hardware accelerators such as GPUs.
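
As a rough illustration of the request batching quoted above, here is a minimal Python sketch. It is not TensorFlow Serving's batching API: `MicroBatcher`, `infer_batch`, and `max_batch_size` are hypothetical names, and the only policy modelled is "flush when the batch is full" (a real server would also flush on a timeout).

```python
from typing import Callable, List, Sequence


class MicroBatcher:
    """Collects single requests and runs them through the model in groups.

    Hypothetical sketch: `infer_batch` stands in for one model invocation
    that accepts a whole batch and returns one result per input.
    """

    def __init__(self,
                 infer_batch: Callable[[Sequence[float]], List[float]],
                 max_batch_size: int = 4) -> None:
        if max_batch_size < 1:
            raise ValueError("max_batch_size must be at least 1")
        self.infer_batch = infer_batch
        self.max_batch_size = max_batch_size
        self.pending: List[float] = []
        self.results: List[float] = []

    def submit(self, request: float) -> None:
        """Queue one request; run the model once the batch is full."""
        self.pending.append(request)
        if len(self.pending) >= self.max_batch_size:
            self.flush()

    def flush(self) -> None:
        """Run whatever is queued as a single batch (no-op if empty)."""
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.results.extend(self.infer_batch(batch))


if __name__ == "__main__":
    model_calls = []

    def double_all(batch: Sequence[float]) -> List[float]:
        # Toy "model": record the call and double every input.
        model_calls.append(len(batch))
        return [2 * x for x in batch]

    batcher = MicroBatcher(double_all, max_batch_size=4)
    for i in range(10):
        batcher.submit(float(i))
    batcher.flush()  # drain the final partial batch
    print("requests:", len(batcher.results), "model calls:", len(model_calls))
```

The point of the sketch is only that the number of model invocations grows with the number of batches rather than the number of requests, which is where the saving on accelerators comes from.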

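According to the documentation, its Loaders are extensible, which makes it possible to support non-TensorFlow models. Someone in the community has already added Caffe model support to TensorFlow Serving: tensorflow/serving#261
https://github.com/rayglover-ibm/serving-caffe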
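
To make the Loader extension point and the two version policies quoted earlier more concrete, here is a small Python sketch. It is an assumption-laden illustration, not TensorFlow Serving's actual C++ API: `Loader`, `RecordingLoader`, `switch_version`, and the policy strings are all hypothetical names, and the "loader" only records calls instead of touching a real model.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class Loader(ABC):
    """Hypothetical extension point: one implementation per model format."""

    @abstractmethod
    def load(self) -> None:
        """Bring this model version into memory."""

    @abstractmethod
    def unload(self) -> None:
        """Release the resources held by this model version."""


class RecordingLoader(Loader):
    """Stand-in for a format-specific loader; it only logs what happens."""

    def __init__(self, name: str, version: int, log: List[str]) -> None:
        self.name = name
        self.version = version
        self.log = log

    def load(self) -> None:
        self.log.append(f"load {self.name} v{self.version}")

    def unload(self) -> None:
        self.log.append(f"unload {self.name} v{self.version}")


def switch_version(old: Optional[Loader], new: Loader, policy: str) -> Loader:
    """Replace `old` with `new`, ordering load/unload according to `policy`."""
    if policy == "availability_preserving":
        # Never leave zero versions loaded: load first, then unload.
        new.load()
        if old is not None:
            old.unload()
    elif policy == "resource_preserving":
        # Never hold two versions at once: unload first, then load.
        if old is not None:
            old.unload()
        new.load()
    else:
        raise ValueError(f"unknown policy: {policy}")
    return new


if __name__ == "__main__":
    log: List[str] = []
    v1 = RecordingLoader("paddle_model", 1, log)  # hypothetical model name
    v2 = RecordingLoader("paddle_model", 2, log)
    current = switch_version(None, v1, "availability_preserving")
    current = switch_version(current, v2, "availability_preserving")
    print("\n".join(log))
```

Because the manager side only depends on `load` and `unload`, supporting another model format in this sketch means adding one more `Loader` subclass, which is the property that makes a PaddlePaddle plugin conceivable.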

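In-house situation

Discussion:

Building a Serving service comparable to TensorFlow Serving as a highlight before the conference does not look feasible in the time available.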

  • Build a plugin on top of TensorFlow Serving to support PaddlePaddle models?
  • Build a simple C++ Infer Server (for example, an HttpServer)?
