As more applications need to analyze unbounded data streams in real time, data stream processing platforms such as Storm have drawn considerable research attention, particularly with regard to scheduling. However, many challenges remain unnoticed or unsolved. In this paper, we propose and implement an adaptive online scheme that addresses three important scheduling challenges. First, how can scaling decisions be made in real time to handle a fluctuating load without congestion? Second, how can the number of workers affected by rescheduling be minimized while still satisfying the resource demand of each instance? In this context, we also point out that stateful instances should not be placed on the same worker as stateless instances. Third, application performance currently cannot be guaranteed because of resource contention, even if the computation platform implements an optimal scheduling algorithm. To address this, we realize resource isolation using Cgroup, which mitigates the performance interference caused by resource contention. We implement our scheduling scheme and plug it into Storm, and our experiments demonstrate that, in some respects, our scheme achieves better performance than state-of-the-art solutions.
- Resource allocation
- Stream processing