Overview: cost savings potential with BVQ
There are several areas in which BVQ can help to reduce procurement cost, operating cost and operating risks.
This document describes these areas.
These areas save cost in different ways: by capturing and allocating cost, by giving better control over operating risks, and by improving transparency, which speeds up problem analysis and thereby raises availability.
Reduction of purchasing cost
Monitoring and analysis by using the BVQ Treemap with IO density
The BVQ Treemap gives an overview of capacities from which the administrator can draw conclusions about the utilization of the technical systems. This makes it quick and easy to determine which storage class each volume should be placed in.
The BVQ Capacity Performance Analysis (carried out at each HKC analysis today) prepares the results in a way that allows a cost-effective decision on future storage classes.
Especially in large environments, the financial impact of choosing the right storage classes is huge. The right choice can easily save 100 € or more per TB. At first these savings show up as reduced purchasing cost; later they become noticeable as reduced operating cost.
In the following customer example, the DS8300 storage capacity is technically fully utilized and due to be expanded.
Supported by the BVQ Treemap and the IO density coloring, the analysis clearly shows that much of the capacity stored on the DS8300 does not require high-performance storage. Approx. 60% of all stored data could be moved to cheaper storage immediately.
Fig 1: The BVQ Treemap representation of the client system with IO density heat coloring. The high-performance areas based on the DS8300 systems are at the top right. The blue coloring of the storage objects indicates that they use less than 20% of the offered performance. The blue volumes are thus candidates for a cheaper storage class. With this list of candidates it is simple to relieve the high-performance storage class to such an extent that an extension of this class is no longer necessary.
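The candidate selection described in the caption can be sketched in a few lines. This is only an illustration, not BVQ's implementation: the `Volume` record, its field names, the sample values and the offered IO density of the storage class are assumptions made for this example. Only the 20% threshold comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    capacity_tb: float  # allocated capacity in TB
    iops: float         # measured average IOPS


def io_density(volume: Volume) -> float:
    """IO density as IOPS per TB of capacity."""
    return volume.iops / volume.capacity_tb


def cheaper_class_candidates(volumes, offered_density, threshold=0.20):
    """Volumes that use less than `threshold` of the IO density
    offered by their current storage class."""
    limit = threshold * offered_density
    return [v for v in volumes if io_density(v) < limit]


# Invented sample data; offered_density is an assumed figure for the class.
volumes = [
    Volume("db_log", 2.0, 3000.0),    # 1500 IOPS/TB
    Volume("archive", 10.0, 500.0),   # 50 IOPS/TB
    Volume("fileshare", 5.0, 400.0),  # 80 IOPS/TB
]
candidates = cheaper_class_candidates(volumes, offered_density=1000.0)
print([v.name for v in candidates])  # ['archive', 'fileshare']
```

In this sketch a volume is compared against the density its class offers rather than against a fixed IOPS figure, so the same rule can be reused for every storage class.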
In the capacity storage classes it is also visible that large areas can be moved to storage with lower performance.
The decision derived from this analysis was to introduce a new, very cheap but high-quality storage class to hold the data with low performance requirements. This relieved the more expensive storage systems to such an extent that a further expansion of these systems was no longer necessary.
The resulting figures depend on the customer's situation. Suppose the customer wants to expand by 100 TB and the entire expansion is done in the low-cost, low-performance class. Suppose also that the price difference between a low-performance and a high-performance extension is 1500 € / TB. In this case a saving of 150,000 € is possible.
Furthermore, the follow-up cost should also be considered. Extensions in the low-performance range generally use disk types with double to triple the capacity. This means fewer frames, less floor space and lower energy cost.
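The arithmetic of this example is simple enough to write down as a small sketch. The function name and the `low_cost_fraction` parameter are assumptions added for illustration, so that the calculation can be repeated when only part of the expansion goes to the cheaper class. The figures 100 TB and 1500 € / TB are the ones from the text.

```python
def expansion_saving_eur(expansion_tb, price_diff_eur_per_tb, low_cost_fraction=1.0):
    """Saving when `low_cost_fraction` of the expansion is bought in the
    cheaper class instead of the high-performance class."""
    if not 0.0 <= low_cost_fraction <= 1.0:
        raise ValueError("low_cost_fraction must be between 0 and 1")
    return expansion_tb * low_cost_fraction * price_diff_eur_per_tb


# Figures from the text: 100 TB, 1500 EUR/TB, expansion entirely low-cost.
saving = expansion_saving_eur(100, 1500)
print(f"{saving:,.0f} EUR")  # 150,000 EUR
```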
Analytical storage planning reduces procurement cost
Another very useful method is analytical storage planning based on measurement data. In the following example, the MS Exchange environment of a company was analyzed to determine the performance it requires from storage.
Here a simple interpretation of the measured values showed a performance requirement of 2500 IOPS. An in-depth analysis of the behavior of the data streams revealed that more than half of these 2500 IOPS were served by the SVC caches.
The improved planning basis now uses a value of 1094 required IOPS that have to be supplied by the storage infrastructure. This performance can be provided with RAID 6 using ¼ of the previously estimated disks.
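The relation between the two figures can be checked with a short sketch. It assumes that the IOPS absorbed by the cache are simply the difference between the measured value and the value the storage infrastructure has to supply. The variable and function names are invented for this example, and the RAID 6 disk count is not modeled because the original disk estimate is not given here.

```python
def cache_share(measured_iops: float, backend_iops: float) -> float:
    """Fraction of the measured IOPS that never reaches the disks."""
    return 1.0 - backend_iops / measured_iops


measured_iops = 2500  # simple interpretation of the measured values
backend_iops = 1094   # improved planning basis from the in-depth analysis

share = cache_share(measured_iops, backend_iops)
print(f"{share:.1%} of the measured IOPS are absorbed by the cache")
```

With these inputs the share is about 56%, which matches the statement that more than half of the measured IOPS are handled by the caches.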
Reduction of operational risks
Operational risks are assessed individually and can generally not be expressed in numbers. Therefore, the following two important questions have to be asked for any organization:
The resulting cost has to include the complete downtime, from the moment the damage occurs until regular, redundant operation is restored.
One of our reference customers quantifies the damage caused by an unplanned outage as follows:
"After 4 hours which we can bridge up without major damage, a financial loss of 200,000 € is generated per hour in all locations."
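As a rough illustration, the customer's statement can be turned into a simple cost model. The linear growth after the first four hours is an assumption of this sketch, as are the function and parameter names. The 4 hours and 200,000 € per hour are taken from the quote.

```python
def outage_loss_eur(outage_hours, grace_hours=4.0, loss_per_hour_eur=200_000.0):
    """Estimated loss for an unplanned outage: nothing during the
    bridgeable period, then a constant loss per hour (assumed linear)."""
    billable_hours = max(0.0, outage_hours - grace_hours)
    return billable_hours * loss_per_hour_eur


for hours in (3, 4, 6, 10):
    print(f"{hours:>2} h outage: {outage_loss_eur(hours):,.0f} EUR")
```

Under these assumptions a six-hour outage already costs 400,000 €, which shows how quickly the downtime until redundant operation is restored dominates the cost.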
Operating risks can also be divided into several different classes:
Technical failures can happen at any time. They definitely have to be intercepted by redundancies like clusters, data mirrors, RAIDs, etc.
In rare cases technical defects announce themselves in advance and can be recognized early through monitoring and analysis.
SVC and Storwize provide many functions that the administrator uses to change the properties of volumes. A key feature here is the data migration function, which can be used to move volumes between storage pools or storage locations.
The management interface of SVC / Storwize is not designed to recognize and avoid logical placement errors. The probability that an error creeps in during the migration of a larger number of volumes depends on the quality and consistency of the system documentation and the administrative processes. In case of a failure, the effect of an incorrectly placed volume can be fatal. Monitoring volume placement manually takes a lot of effort and is error-prone. BVQ automates this monitoring and thereby contributes greatly to excluding these risks, or at least to controlling them better.
When volumes are managed by SLA rules, an SLA can be broken when a migration changes the location of a volume. BVQ checks each individual volume for conformity with the existing storage rules and uses alerts or colored markings to show whether volumes violate an SLA rule.
Manual tracking of the rules is very time-consuming and associated with a high error rate. BVQ fully automates the monitoring process using the integrated alerting module together with the BVQ service level management package.
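The kind of rule check that is tedious to do by hand can be sketched as follows. This is not how BVQ stores or evaluates its rules: the `PlacedVolume` record, the SLA class names, the pool names and the rule table are all hypothetical and only illustrate the idea of comparing each volume's placement with the placement its SLA allows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlacedVolume:
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    sla: str   # SLA class assigned to the volume
    pool: str  # storage pool the volume currently resides in


# Hypothetical rule table: SLA class -> pools in which a volume may reside.
ALLOWED_POOLS = {
    "gold": {"pool_highperf_site_a", "pool_highperf_site_b"},
    "bronze": {"pool_nearline_site_a"},
}


def sla_violations(volumes, allowed_pools):
    """Volumes whose current pool is not permitted by their SLA class."""
    return [v for v in volumes if v.pool not in allowed_pools.get(v.sla, set())]


volumes = [
    PlacedVolume("erp_db", "gold", "pool_highperf_site_a"),
    PlacedVolume("erp_log", "gold", "pool_nearline_site_a"),  # misplaced by a migration
    PlacedVolume("scans", "bronze", "pool_nearline_site_a"),
]
for v in sla_violations(volumes, ALLOWED_POOLS):
    print(f"ALERT: {v.name} ({v.sla}) is placed in {v.pool}")
```

A volume with an unknown SLA class is reported as a violation in this sketch, which is the safer default for a monitoring rule.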
Metro / global mirror relations have to be monitored during operation. When a switchover becomes necessary, the expected data has to be present at the disaster recovery site. The mirror relations therefore have to be complete and active.
BVQ can monitor flash copies and metro / global mirrors and generates appropriate alerts in case of an error. Manual checking is again very time-consuming and associated with a high error rate. BVQ automates this check with the built-in alerting module in cooperation with the BVQ copy service package.
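The two conditions named above, "complete" and "active", can be expressed as a small check. The `MirrorRelation` record, the state label `synchronized` and the sample data are hypothetical and do not reflect the actual state names reported by SVC / Storwize or the way BVQ raises its alerts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirrorRelation:
    # Hypothetical record; state labels are invented for this sketch.
    name: str
    state: str           # e.g. "synchronized", "stopped", "copying"
    copied_percent: int  # share of the data present on the remote side


HEALTHY_STATE = "synchronized"


def mirror_alerts(relations):
    """Alert texts for relations that are not active or not complete."""
    alerts = []
    for r in relations:
        if r.state != HEALTHY_STATE:
            alerts.append(f"{r.name}: relation is not active (state={r.state})")
        elif r.copied_percent < 100:
            alerts.append(f"{r.name}: mirror incomplete ({r.copied_percent}% copied)")
    return alerts


relations = [
    MirrorRelation("rel_erp_db", "synchronized", 100),
    MirrorRelation("rel_mail", "stopped", 100),
    MirrorRelation("rel_files", "synchronized", 87),
]
alerts = mirror_alerts(relations)
for line in alerts:
    print("ALERT:", line)
```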
Quick handling of performance bottlenecks with BVQ
The central SAN storage is the basis of the entire IT. Bottlenecks here affect all system levels above it and immediately impair operation.
The following text distinguishes between two principal bottleneck situations: the general shortage that results from an overloaded system, and the performance peak that appears more or less sporadically.
A general advantage of BVQ is that even the basic version provides very deep analysis tools for handling both situations. The instruments are highly interactive, and because they avoid detours via reporting interfaces or SQL database accesses, a conclusion can be reached very quickly. Other important factors are the many analysis examples (available online) and the rapid availability of SVA staff to assist the customer in problem analysis.
The general shortage – a constant companion
The general shortage shows itself in consistently high response times that cause poor performance. This bottleneck is relatively easy to analyze and to control, because it is usually possible to establish a direct connection between the current load and the response time. A general overload is mostly solved by expanding the technical systems. This might not be the best method, since restructuring and organizational changes are often more cost-effective and lead to more sustainable improvements.
BVQ supports this with reporting or the selective use of the treemap GUI. If necessary, the conspicuous volumes or storage areas can be analyzed with just a few mouse clicks to identify a bottleneck beyond doubt. When the storage infrastructure is enlarged, the BVQ IO density analysis can be used to work out a cost-effective scaling strategy.
The response time peak – dangerous and difficult to control
The second type of bottleneck, the "peak", has the annoying property of occurring more or less sporadically and seemingly unpredictably. The response time of individual systems suddenly shoots up to 10 to 20 times its normal value (and possibly higher).
In the simplest case it degrades the response times of the affected servers and thus also the execution time of processes, but it can also lead to a crash of individual systems.
A dangerous feature of the peaks is that their occurrence and impact are unpredictable. For this reason, peaks have to be eliminated immediately, because it has to be assumed that the next peak will have a much greater impact.
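To make the 10- to 20-fold jump concrete, the following sketch flags samples in a response-time series that exceed ten times the typical value. Using the median as the "normal" response time is an assumption of this example, as are the function name and the sample data. It only finds the peaks and does not replace the root-cause analysis.

```python
from statistics import median


def find_peaks(response_times_ms, factor=10.0):
    """Return (index, value) pairs for samples at least `factor` times
    the median response time of the series."""
    baseline = median(response_times_ms)
    return [
        (i, rt)
        for i, rt in enumerate(response_times_ms)
        if rt >= factor * baseline
    ]


# Invented sample series in milliseconds; the median is 2.1 ms.
samples = [2.0, 2.2, 1.9, 2.1, 41.0, 2.0, 2.3, 25.0, 2.1]
print(find_peaks(samples))  # [(4, 41.0), (7, 25.0)]
```

The median is used here because a few extreme samples barely move it, whereas the mean would be pulled upward by the very peaks that are to be detected.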
Because a quick fix is necessary, peaks are often fought with a massive upgrade of the existing infrastructure. However, this usually only suppresses them until they occur again and lead to even higher follow-on investments.
The only valid response to a peak is a root-cause analysis to understand the reason for the peak, followed by a targeted investment in the remedy. In a high percentage of cases, however, the root-cause analysis leads only to organizational changes. Investments would often not be required if the reason for the peak were understood.
BVQ provides all the necessary analytical tools to perform a peak analysis. Due to the highly interactive analysis interface, an analysis can be performed in a very short time. Many BVQ analysis examples are also available online and, if necessary, SVA-trained staff is available to assist the client in this situation.
Proactive problem avoidance
While working with BVQ, the administrator gets to know the limits of the system, so many errors can be avoided. In many cases, changes are watched very closely with BVQ. A classic case is the introduction of compression, where an administrator has to expect a higher system load. With the help of BVQ, the administrator can introduce compression gradually, area by area and only where needed, while always keeping an eye on the effect on the overall system. This is an example of proactively avoiding a problem with simple methods.
Proactive service procedure: storage system health check
Health checks are also often offered with BVQ. These checks are mostly carried out once per quarter to ensure that all performance parameters of the system meet the requirements. A health check always uses the current data and comparison data from previous periods.
Creation of additional storage values
With the BVQ Accounting Package an organizational layer can be placed on all storage objects. With the help of this layer, capacities are grouped at organizational levels such as application, server and cost center and can be evaluated later by reporting.
The organizational summary can also be used with the BVQ Treemap and provides here many opportunities to improve the overview for monitoring and problem analysis. A typical case is the identification of a bottleneck not only by using volume or server objects, but also by working with individual applications or application groups.
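The idea of an organizational layer can be illustrated with a simple aggregation. The tuple layout, the volume names and the cost center identifiers below are invented for this sketch and do not reflect the data model of the BVQ Accounting Package.

```python
from collections import defaultdict


def capacity_by_cost_center(volumes):
    """Sum the capacity in TB per cost center.

    `volumes` is an iterable of (volume_name, cost_center, capacity_tb)
    tuples; this layout is an assumption made for the example.
    """
    totals = defaultdict(float)
    for _name, cost_center, capacity_tb in volumes:
        totals[cost_center] += capacity_tb
    return dict(totals)


volumes = [
    ("vol_sap_01", "CC-100", 4.0),
    ("vol_sap_02", "CC-100", 6.0),
    ("vol_mail_01", "CC-200", 2.5),
]
totals = capacity_by_cost_center(volumes)
print(totals)  # {'CC-100': 10.0, 'CC-200': 2.5}
```

The same grouping could be done by application or server instead of cost center by changing which field is used as the key.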
SLM enables the management and monitoring of storage-related SLAs and many other security relations in the whole storage area. With the integrated alerting, the SLM package can ensure that contractual specifications are kept and security misconfigurations are detected at an early stage during migrations.
BVQ is able to read storage-related data from the connected VMware systems and relate it to the SVC / Storwize data. This enables the administrator to see and evaluate the distribution of VMware systems in the SVC / Storwize storage area. Using the BVQ database, accounting can, for example, be established at the VMware level.
Another important application is support for problem analysis. If a performance bottleneck occurs in a VMware system, all connected storage is made visible with a single click and can be analyzed immediately. This creates an important time advantage for VMware problem analysis.
The BVQ copy service management helps to monitor the metro and global mirror relations, which are used to save the data to an external location for the case of a disaster. These mirrors provide a very high level of security, provided they are complete and working properly at all times. In collaboration with the integrated alerting, BVQ supports the monitoring of these mirrors. Many administrative tasks are made easier and mistakes are avoided in this way.
BVQ is a product of the SVA System Vertrieb Alexander GmbH