Under today's bursty web traffic, the fine-grained per-container control promises more efficient resource provisioning for web services and better resource utilization in cloud datacenters. In this paper, we present Two-stage Stochastic Programming Resource Allocator (2SPRA). It optimizes resource provisioning for containerized n-tier web services in accordance with fluctuations of incoming workload to accommodate predefined SLOs on response latency. In particular, 2SPRA is capable of minimizing resource over-provisioning by addressing dynamics of web traffic as workload uncertainty in a native stochastic optimization model. Using special-purpose OpenOpt optimization framework, we fully implement 2SPRA in Python and evaluate it against three other existing allocation schemes, in a Docker-based CoreOS Linux VMs on Amazon EC2. We generate workloads based on four real-world web traces of various traffic variations: AOL, WorldCup98, ClarkNet, and NASA. Our experimental results demonstrate that 2SPRA achieves the minimum resource over-provisioning outperforming other schemes. In particular, 2SPRA allocates only 6.16 percent more than application's actual demand on average and at most 7.75 percent in the worst case. It achieves 3x further reduction in total resources provisioned compared to other schemes delivering overall cost-savings of 53.6 percent on average and up to 66.8 percent. Furthermore, 2SPRA demonstrates consistency in its provisioning decisions and robust responsiveness against workload fluctuations.
|Number of pages||14|
|Journal||IEEE Transactions on Parallel and Distributed Systems|
|Publication status||Published - 1 Jul 2017|
- Cloud resources provisioning
- Multi-tier web applications
- Stochastic optimization
- Workload uncertainty