Original link: https://www.bodunhu.com/blog/posts/shepherd-serving-dnns-in-the-wild/
Paper link: SHEPHERD: Serving DNNs in the Wild
Achieving scalability, high system goodput, and maximal resource utilization at the same time is hard for an inference system.
SHEPHERD’s main observation is that while individual request streams can be highly unpredictable, aggregating request streams into moderately-sized groups greatly improves predictability, permitting high resource utilization as well as scalability.
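The predictability gain from aggregation can be illustrated with a small simulation. This is only a sketch under stated assumptions, not SHEPHERD's method or data: the streams are synthetic, they are assumed to be statistically independent, and the names `simulate_stream`, `aggregate_cv`, `base_rate`, `burst_prob`, and `burst_factor` are hypothetical. Predictability is measured here by the coefficient of variation (CV, standard deviation divided by mean) of the per-window load; a lower CV means the load is easier to provision for.

```python
import random
import statistics


def coefficient_of_variation(samples):
    """Standard deviation divided by mean; lower means more predictable."""
    return statistics.pstdev(samples) / statistics.mean(samples)


def simulate_stream(rng, num_windows, base_rate=20.0, burst_prob=0.1, burst_factor=5.0):
    """Per-window request counts for one hypothetical bursty stream.

    Most windows sit near base_rate, but some burst to burst_factor times
    that rate. All parameters are illustrative, not taken from the paper.
    """
    counts = []
    for _ in range(num_windows):
        rate = base_rate * (burst_factor if rng.random() < burst_prob else 1.0)
        # Gaussian noise around the rate, clipped at zero, as a simple
        # stand-in for real arrival counts.
        counts.append(max(0.0, rng.gauss(rate, rate ** 0.5)))
    return counts


def aggregate_cv(group_size, num_windows=2000, seed=0):
    """CV of the summed load of group_size independent synthetic streams."""
    rng = random.Random(seed)
    streams = [simulate_stream(rng, num_windows) for _ in range(group_size)]
    # Sum the streams window by window to get the group's total load.
    totals = [sum(window) for window in zip(*streams)]
    return coefficient_of_variation(totals)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(f"group size {n:3d}: CV = {aggregate_cv(n):.3f}")
```

For independent streams the CV of the sum shrinks roughly in proportion to 1/sqrt(n), so the printed values should fall as the group grows. Real request streams are often correlated (for example, by time of day), in which case the improvement would be smaller than this sketch suggests.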