TY - JOUR
T1 - Automated fine-grained CPU cap control in serverless computing platform
AU - Kim, Young Ki
AU - HoseinyFarahabady, M. Reza
AU - Lee, Young Choon
AU - Zomaya, Albert Y.
PY - 2020/10
Y1 - 2020/10
N2 - Serverless computing has emerged as a new cloud computing execution model that liberates users and application developers from explicitly managing 'physical' resources, leaving such a resource management burden to service providers. In this article, we study the problem of resource allocation for multi-tenant serverless computing platforms explicitly taking into account workload fluctuations including sudden surges. In particular, we investigate different root causes of performance degradation in these platforms where tenants (their applications) have different workload characteristics. To this end, we develop a fine-grained CPU cap control solution as a resource manager that dynamically adjusts CPU usage limit (or CPU cap) concerning applications with same/similar performance requirements, i.e., application groups. The adjustment of CPU caps applies primarily to co-located worker processes of serverless computing platforms to minimize resource contention, which is the major source of performance degradation. The actual adjustment decisions are made based on performance metrics (e.g., throttled time and queue length) using a group-aware scheduling algorithm. The extensive experimental results performed in our local cluster confirm that the proposed resource manager can effectively eliminate the burden of explicit reservation of computing capacity, even when fluctuations and sudden surges in the incoming workload exist. We measure the robustness of the proposed resource manager by comparing it with several heuristics which are extensively used in practice, including the enhanced version of round robin and the least length queue scheduling policies, under various workload intensities driven by real-world scenarios. Notably, our resource manager outperforms other heuristics by decreasing skewness and average response time up to 44 and 94 percent, respectively, while it does not over-use the CPU resources.
AB - Serverless computing has emerged as a new cloud computing execution model that liberates users and application developers from explicitly managing 'physical' resources, leaving such a resource management burden to service providers. In this article, we study the problem of resource allocation for multi-tenant serverless computing platforms explicitly taking into account workload fluctuations including sudden surges. In particular, we investigate different root causes of performance degradation in these platforms where tenants (their applications) have different workload characteristics. To this end, we develop a fine-grained CPU cap control solution as a resource manager that dynamically adjusts CPU usage limit (or CPU cap) concerning applications with same/similar performance requirements, i.e., application groups. The adjustment of CPU caps applies primarily to co-located worker processes of serverless computing platforms to minimize resource contention, which is the major source of performance degradation. The actual adjustment decisions are made based on performance metrics (e.g., throttled time and queue length) using a group-aware scheduling algorithm. The extensive experimental results performed in our local cluster confirm that the proposed resource manager can effectively eliminate the burden of explicit reservation of computing capacity, even when fluctuations and sudden surges in the incoming workload exist. We measure the robustness of the proposed resource manager by comparing it with several heuristics which are extensively used in practice, including the enhanced version of round robin and the least length queue scheduling policies, under various workload intensities driven by real-world scenarios. Notably, our resource manager outperforms other heuristics by decreasing skewness and average response time up to 44 and 94 percent, respectively, while it does not over-use the CPU resources.
KW - dynamic CPU scheduling
KW - operating system process management
KW - performance modeling
KW - Serverless computing
KW - virtualized cloud platforms
UR - http://www.scopus.com/inward/record.url?scp=85084929023&partnerID=8YFLogxK
UR - http://purl.org/au-research/grants/arc/DP190103710
U2 - 10.1109/TPDS.2020.2989771
DO - 10.1109/TPDS.2020.2989771
M3 - Article
AN - SCOPUS:85084929023
SN - 1045-9219
VL - 31
SP - 2289
EP - 2301
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 10
ER -