Metrics #1 - Either Simple or Useful
Many organisations, from governments to vendors, are seeking metrics to describe the efficiency of data centres and the components within them. Each is seeking some sort of magic number that can be adopted to measure how efficient we are. These metrics range from device-level benchmarks through to attempts to state the “business value” or “efficiency” of entire data centres or IT estates, along with all the attendant issues of trying to create common and portable definitions of those terms. One of the recurrent themes we see in the evolution of these metrics is a basic dichotomy: you can have either simple or meaningful. In this article we will explore this basic problem and ask whether we should be learning from other sectors how to preserve enough meaning to still be useful.

What do we want from metrics?
To understand the tension we first need to look at what it is we want from metrics and what makes a metric successful once released. Note that effectiveness and usefulness are frequently not major factors in the success of a metric; only the appearance of relevance or applicability is required.
Simple
The recurring theme in most successful metrics is that they are simple, preferably a single number that can be explained in one sentence to somebody who has no domain-specific knowledge. In the case of a data centre this might be “Our DCIE is 0.6, that means 60% of the power you pay for goes to the IT equipment and the rest is lost in the other stuff”. In the case of a car it might be “It does 45 MPG combined cycle, this means that under the test combination of urban and highway driving the car achieved 45 MPG, you are likely to get something close to that”.
Simplicity is most appealing to those without the domain-specific knowledge to understand the inherent problems with the metric, and to marketing folks who are interested in something apparently quantitative and scientific that can support the product or brand message. This very quickly led to the PUE / DCIE metric being appropriated as a comparison and advertising tool, leaving The Green Grid trying to regain control of their own metrics and define how they should be used.
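To make the DCIE example concrete, here is a minimal sketch of the arithmetic behind the single number. It uses the standard definitions (DCIE is IT equipment power divided by total facility power, and PUE is its reciprocal). The function names and the 600 kW / 1,000 kW readings are made up for illustration.

```python
def dcie(it_power_kw: float, total_facility_power_kw: float) -> float:
    """Data Centre Infrastructure Efficiency: share of facility power reaching IT."""
    if total_facility_power_kw <= 0:
        raise ValueError("total facility power must be positive")
    return it_power_kw / total_facility_power_kw


def pue(it_power_kw: float, total_facility_power_kw: float) -> float:
    """Power Usage Effectiveness: the reciprocal of DCIE."""
    if it_power_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_facility_power_kw / it_power_kw


# Hypothetical readings: 600 kW delivered to IT equipment, 1,000 kW drawn by the site.
example_dcie = dcie(600.0, 1000.0)
example_pue = pue(600.0, 1000.0)
print(f"DCIE = {example_dcie:.2f}, PUE = {example_pue:.2f}")
```

Note that the calculation says nothing about what the IT equipment does with its 60%, which is exactly why the number is easy to quote and easy to misuse.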
Meaningful
It seems obvious to say that metrics must be meaningful, but what is meaningful depends on both the thing being measured and the audience receiving that measurement. Some of the requirements for a metric to be meaningful are:
- Relevant: the metric must measure and report (or at least appear to) something that the audience wants to know
- Applicable: the metric must be applicable to the audience’s situation
- Transferable: if we measure the metric for one user we should be able to do the same for another user
- Comparable: if we measure the metric for two different users we should be able to make a comparison between them
So what is the problem?
If we take the combined cycle MPG metric, it is easy to see how a customer who only drove in stop-start town traffic would get worse mileage than the motorway (freeway) commuter. This customer is far more interested in the low speed and idle fuel consumption than in cruising speed efficiency. Unfortunately the combined cycle MPG metric is of little use to them, as they cannot determine the relative contribution of town and cruising speed consumption to the 45 MPG figure.
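The loss of information can be shown with a small sketch. It assumes, purely for illustration, that the combined figure is a distance-weighted harmonic mean of an urban and a highway figure with an equal split of distance; real test cycles use their own weightings, and the two cars below are invented.

```python
def combined_mpg(urban_mpg: float, highway_mpg: float, urban_share: float = 0.5) -> float:
    """Distance-weighted harmonic mean of urban and highway fuel economy.

    urban_share is the assumed fraction of distance driven in town.
    """
    highway_share = 1.0 - urban_share
    return 1.0 / (urban_share / urban_mpg + highway_share / highway_mpg)


# Two hypothetical cars with very different characteristics.
car_a = {"urban": 30.0, "highway": 90.0}  # poor in town, excellent at cruising speed
car_b = {"urban": 45.0, "highway": 45.0}  # the same everywhere

for name, car in (("Car A", car_a), ("Car B", car_b)):
    headline = combined_mpg(car["urban"], car["highway"])
    town_only = combined_mpg(car["urban"], car["highway"], urban_share=1.0)
    print(f"{name}: combined {headline:.1f} MPG, town-only driver gets {town_only:.1f} MPG")
```

Under these assumptions both cars report the same combined figure, yet the town-only driver sees a very different result from each, and nothing in the headline number lets them tell the two apart.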
The data centre contains many examples of these use-specific variations. Each organisation has its own specific blend of requirements, applications and issues in a complex environment, which must be considered in context to provide any sort of useful measure. Each organisation or data centre will therefore need its own specific weighting of factors to determine a meaningful measure. No simplistic “Average Data Centre Efficiency” metric is ever going to be useful across organisations, as too much information is destroyed in collapsing to this single value. Can we realistically compare the “efficiency” of a bank trading platform and YouTube?
Price comparison sites
It is not just data centres and IT that have this problem. Price comparison websites are another example of where simplified metrics have failed, even for an audience with no wish to be domain experts. Trying to list each car insurance company from ‘best’ to ‘worst’ on a website is obviously meaningless and you would get no customers. What do you mean by ‘best’? The lowest premiums, quickest to pay claims, fewest claims rejected or those that provide a courtesy car? Even if we could determine a ‘standard’ weighting of these parameters that suitably munged up all the customers’ requirements, we are still left with a serious problem: each insurer calculates premiums and excesses differently, based on a range of information about the customer.
The price comparison sites are well aware of this issue but still need to provide their customers with the sort of simple answer that makes for a successful metric. To resolve this dichotomy they collect enough information about each customer to provide an answer meaningful to them, with an attractive simple ranking to select from. Behind the scenes the comparison site takes the customer’s information and carries out some relatively complex analysis.
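As a rough sketch of the idea, the ranking below is computed per customer from that customer's own weighting of the criteria, rather than from one global ordering. The insurers, criteria names, scores and weights are all hypothetical, and a real comparison site's analysis is certainly more involved than a weighted sum.

```python
def rank_options(options: dict, weights: dict) -> list:
    """Return option names sorted by weighted score, best first.

    options maps a name to per-criterion scores (0.0 to 1.0, higher is better);
    weights maps the same criteria to how much this customer cares about them.
    """
    def score(name: str) -> float:
        return sum(weights.get(criterion, 0.0) * value
                   for criterion, value in options[name].items())

    return sorted(options, key=score, reverse=True)


# Hypothetical insurers scored on three of the criteria mentioned above.
insurers = {
    "Insurer A": {"low_premium": 0.9, "fast_claims": 0.3, "courtesy_car": 0.0},
    "Insurer B": {"low_premium": 0.5, "fast_claims": 0.9, "courtesy_car": 1.0},
}

# Two hypothetical customers with different priorities.
price_sensitive = {"low_premium": 1.0, "fast_claims": 0.2, "courtesy_car": 0.0}
service_focused = {"low_premium": 0.2, "fast_claims": 1.0, "courtesy_car": 0.5}

print("Price-sensitive customer:", rank_options(insurers, price_sensitive))
print("Service-focused customer:", rank_options(insurers, service_focused))
```

Each customer still sees a simple ordered list, but the two lists differ because the weighting behind them is theirs and not an average of everybody's.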
Should we do this for Data Centres?
I would argue that we have little choice but to approach data centres this way. It is easy to see that there are many parameters for a data centre: reliability, Google-style or corporate workloads, and so on. Otherwise we are wasting our time trying to stuff together some averaged munge-mark metric that will be equally meaningless to every data centre operator, open to misinterpretation and misuse, and likely to actually obstruct efforts to improve efficiency.

The business does not care about DCIE or Compute Units or any other technical gobbledygook. The business cares about the cost of delivering the IT services it requires to support business activities. For us to provide the sort of simple and meaningful metrics that will allow both data centre specialists and the businesses we support to understand their performance, we need to take a leaf out of the price comparison sites’ book and use a computer to help represent the complex and interdependent aspects of our performance and efficiency in a relevant and applicable way.


