Cloud computing providers face the problem of matching heterogeneous customer workloads to the resources that will serve them. This is particularly challenging when customers who are already running a job on a cluster scale their resource usage up and down over time. The provider therefore has to decide continuously whether she can add workloads to a given cluster or whether doing so would impair existing workloads’ ability to scale. Currently, this is often done with simple threshold policies that reserve large parts of each cluster, which leads to low efficiency (i.e., low average utilization of the cluster). We propose more sophisticated policies for controlling admission to a cluster and demonstrate that they significantly increase cluster utilization. We first introduce the cluster admission problem and formalize it as a constrained Partially Observable Markov Decision Process (POMDP). As it is infeasible to solve the POMDP optimally, we then systematically design admission policies that estimate moments of each workload’s distribution of future resource usage. Via extensive simulations grounded in a trace from Microsoft Azure, we show that our admission policies lead to a substantial improvement over the simple threshold policy. We then show that substantial further gains are possible if high-quality information about arriving workloads is available. Based on this, we propose an information elicitation approach that incentivizes users to provide this information, and we simulate its effects.
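To make the contrast between the two kinds of policy concrete, the following is a minimal illustrative sketch and not the policies evaluated in the article. It assumes that each workload comes with an estimated mean and variance of its future resource usage, that workloads are independent (so means and variances add), and that the provider tolerates a small probability of running out of capacity. The names `Workload`, `threshold_admit`, `moment_admit`, and all parameter values are hypothetical; the moment-based rule uses Cantelli's one-sided inequality as one possible way to turn two moments into an admission decision.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    current_usage: float  # resources in use right now (e.g. cores)
    mean_future: float    # assumed estimate of mean future usage
    var_future: float     # assumed estimate of variance of future usage


def threshold_admit(active, new, capacity, threshold=0.5):
    """Baseline: admit only if current usage stays below a fixed
    fraction of capacity, keeping the rest in reserve for scaling."""
    used = sum(w.current_usage for w in active) + new.current_usage
    return used <= threshold * capacity


def moment_admit(active, new, capacity, max_fail_prob=0.01):
    """Moment-based rule (illustrative assumption): admit if Cantelli's
    inequality bounds P(total future usage > capacity) by max_fail_prob.
    Assumes independent workloads, so means and variances add."""
    workloads = list(active) + [new]
    mean = sum(w.mean_future for w in workloads)
    var = sum(w.var_future for w in workloads)
    slack = capacity - mean
    if slack <= 0:
        return False  # expected usage alone already exceeds capacity
    # Cantelli: P(X - mean >= slack) <= var / (var + slack^2)
    return var / (var + slack ** 2) <= max_fail_prob


# Hypothetical cluster: 100 cores, five steady workloads already running.
active = [Workload(10.0, 10.0, 1.0) for _ in range(5)]
new = Workload(10.0, 10.0, 1.0)
print(threshold_admit(active, new, capacity=100.0))  # False: 60 > 50
print(moment_admit(active, new, capacity=100.0))     # True: bound is about 0.004
```

In this toy setting the threshold rule rejects the arrival because it holds half of the cluster in reserve regardless of how the workloads behave, whereas the moment-based rule admits it because the estimated variance of total usage is small relative to the remaining slack.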