Cloud 2.0: cloud infrastructure costs halved by automatically adjusting instances to QoE and QoS
Back in 2011, we had been mulling over an idea about the future of cloud resource management. Our intuition, shaped by research projects on self-managing operator network architectures (autonomic networks), led us to believe that these techniques could be adapted to cloud platforms and thereby foster a new generation of platform, “Cloud 2.0”.
In our whitepaper, Toward Sustainable Performance of Cloud Computing, we described automatic management of instance elasticity, or auto-scaling: adjusting supply precisely to demand. Such a mechanism cannot be optimal unless it takes the platform’s QoS (quality of service) and QoE (quality of experience) parameters into account.
Since then we have witnessed the birth of automatic resource management mechanisms or services (Auto Scaling from Amazon, Metrics Hub, recently bought out by Microsoft, RightScale, and others). Still, decision rules based solely on quality of service (QoS) indicators do not allow optimal decision-making. For example, a CPU usage rate of 100% indicates that the virtual machine is busy, but it does not mean that the service is slow from the user’s point of view. If auto-scaling relies on this metric alone, it may add instances needlessly, leading to useless additional costs.
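To make this concrete, here is a minimal sketch of a scaling rule that keys on user-perceived latency (a QoE signal) rather than CPU utilization alone. The thresholds, the latency SLO, and the function itself are hypothetical illustrations, not the vendors' actual algorithms:

```python
def scaling_decision(cpu_percent, p95_latency_ms, latency_slo_ms=500):
    """Return +1 (scale out), -1 (scale in) or 0 (hold)."""
    if p95_latency_ms > latency_slo_ms:
        return +1  # users actually experience slowness: add capacity
    if cpu_percent > 90:
        return 0   # busy but still fast: adding instances wastes money
    if cpu_percent < 30 and p95_latency_ms < latency_slo_ms / 2:
        return -1  # idle and fast: release capacity
    return 0

# A CPU at 100% with a healthy 200 ms latency triggers no scale-out:
print(scaling_decision(cpu_percent=100, p95_latency_ms=200))  # → 0
```

The point of the sketch is the ordering of the tests: the QoE signal is consulted first, so a saturated but responsive machine does not trigger a costly scale-out.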
As a SaaS vendor faced with the need to adjust our infrastructure to our customers’ audiences (to handle sales seasons, unforeseeable traffic peaks, etc.), we ported some of our servers to two public cloud platforms (IaaS and PaaS). Once this operation was finished, we tackled the implementation of a policy of automatic management of web front-ends, in order to match demand even more closely.
Demand is shown in the following graph (the number of requests received on our web front-ends, broken down into 5-minute segments).
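Building such a demand curve amounts to bucketing request timestamps into 5-minute windows. A minimal sketch, assuming requests are available as Unix timestamps (the function name and input format are illustrative):

```python
from collections import Counter

WINDOW = 5 * 60  # window size in seconds

def requests_per_window(timestamps):
    """Map each 5-minute window start (epoch seconds) to its request count."""
    return Counter(ts - ts % WINDOW for ts in timestamps)

# Five requests: three in the first window, one each in the next two.
counts = requests_per_window([0, 10, 299, 300, 650])
print(counts)  # → Counter({0: 3, 300: 1, 600: 1})
```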
It was clear to us that this information could help us build a module for automatically deciding the number of instances actually necessary. We therefore developed this module using the cloud platforms’ APIs to handle sizing, and then put it into production to manage the platform. Safety nets were set up, such as a minimum and maximum number of instances, to forestall any malfunction. Furthermore, the platforms’ economic models were taken into account, including hourly billing on one platform and per-minute invoicing on another, so as to adjust decisions accordingly. We were surprised by the results we obtained:
- the default number of instances allocated was oversized relative to the traffic we actually handle
- the adaptive mechanism tracks demand very accurately
- the resulting savings were very significant
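The sizing logic with its safety nets can be sketched as follows. The target throughput per instance, the min/max bounds, and the billing-granularity rule (on hourly-billed platforms, only scale in near the end of a paid hour) are hypothetical parameters, not the exact production values:

```python
import math

def desired_instances(req_per_5min, target_per_instance=1000,
                      min_instances=2, max_instances=20):
    """Size the fleet to demand, clamped by the safety nets."""
    n = math.ceil(req_per_5min / target_per_instance)
    return max(min_instances, min(max_instances, n))

def should_scale_in(minutes_into_billed_hour, hourly_billing=True):
    """On hourly billing, terminating early wastes already-paid capacity."""
    return (not hourly_billing) or minutes_into_billed_hour >= 55

print(desired_instances(7300))  # → 8
```

Clamping the decision between a floor and a ceiling is what keeps a faulty metric or a decision bug from shrinking the fleet to zero or scaling it out without bound.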
Load peaks were thus managed automatically, without any manual intervention. This made it possible to handle unexpected events and thereby improve the perceived quality of service.
Moreover, major savings resulted even at the scale of a small infrastructure. They can be estimated by comparing the cost of a fixed fleet sized for peak load with the cost of the auto-scaled fleet.
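As a purely illustrative calculation (the hourly rate, fleet sizes, and average utilization below are hypothetical numbers, not the author’s actual figures):

```python
HOURLY_RATE = 0.10  # $/instance-hour, hypothetical

def monthly_cost(instance_hours):
    return instance_hours * HOURLY_RATE

fixed_fleet = 10 * 24 * 30    # 10 instances running all month, sized for peak
auto_scaled = 4.6 * 24 * 30   # ~4.6 instances on average with auto-scaling

savings = 1 - monthly_cost(auto_scaled) / monthly_cost(fixed_fleet)
print(f"{savings:.0%}")  # → 54%
```

Under these assumptions the savings depend only on the ratio of average to peak fleet size, which is why bursty workloads benefit the most from auto-scaling.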
We have already planned further developments, such as automatic management of other types of instances (databases, for example), to reduce costs further while maintaining optimal quality of service.
William Rang, CTO