Choosing the right enterprise-grade SSD means running real-world tests to know how well it will perform when rolled out into production. Tests and comparisons stress the drives to identify the limits of failure and actual endurance, and to see if the performance will change over time.
Part I of this series looked at the hardware considerations for building a testing rig. In this part, we look at how to understand your organisation's real-world needs and how to use benchmarking software to design and run an appropriate battery of tests.
What should you be testing for?
Simply put, you want to know the maximum stress you can put on a device. This means looking at I/O performance over a long period of time rather than a short snapshot. Run your tests on a preconditioned drive, and make them long enough to reveal drops in IOPS and rises in latency in both sequential and random read and write scenarios.
In these long tests, look at total drive saturation, I/O latency, boot latency, bandwidth congestion and how the system degrades as the workload increases. Also compare sustained use against start, stop and pause patterns to see whether other performance patterns emerge.
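One way to judge whether a long run has gone on long enough is to check that recent IOPS readings have stopped drifting. The sketch below is only an illustration: the function name, the five-sample window and the 20% and 10% thresholds are assumptions chosen for the example, so adjust them to your own acceptance criteria.

```python
from statistics import mean


def is_steady(iops, window=5, max_excursion=0.20, max_slope=0.10):
    """Return True if the last `window` IOPS samples look stable.

    Assumed criteria (illustrative, not taken from the article):
      * every sample lies within `max_excursion` of the window average
      * the best-fit line drifts by no more than `max_slope` of the
        average from the first to the last sample in the window
    """
    if window < 2 or len(iops) < window:
        return False
    recent = iops[-window:]
    avg = mean(recent)
    if avg <= 0:
        return False

    # Excursion check: no single sample strays too far from the average.
    if max(abs(value - avg) for value in recent) > max_excursion * avg:
        return False

    # Slope check: least-squares fit over the sample index 0..window-1.
    xs = range(window)
    x_avg = mean(xs)
    numerator = sum((x - x_avg) * (y - avg) for x, y in zip(xs, recent))
    denominator = sum((x - x_avg) ** 2 for x in xs)
    drift = abs(numerator / denominator) * (window - 1)
    return drift <= max_slope * avg


# Hypothetical per-interval IOPS readings.
fresh_drive = [90000, 70000, 55000, 45000, 40000]
settled_drive = [40000, 40500, 39800, 40200, 39900]
print(is_steady(fresh_drive))    # False
print(is_steady(settled_drive))  # True
```

A drive that is still falling from its fresh-out-of-box performance fails the check, which suggests the test needs to run longer before the numbers are worth comparing.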
Web hosting and streaming applications have very specific workloads that are sequential in nature, so make sure that your storage array delivers adequate sequential performance. Even so, an application that performs sequential reads and writes can produce random I/O at the drive level once the data is spread across a cluster of drives, so test across the cluster to evaluate how random the resulting reads and writes are.
For data center drives, testing with higher queue depths is also very important. Queue depth refers to the number of outstanding access operations, or the number of I/Os waiting in the device queue at a single point in time. This test measures the ability of the drive to deal with a high number of concurrent I/Os, which is typical of multi-threaded applications and virtualisation.
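Queue depth, IOPS and average latency are tied together by Little's Law: in a steady state, the average number of outstanding I/Os equals throughput multiplied by average latency. The helpers below are a minimal sketch of that relationship. The function names and the example figures are hypothetical, and the relationship only holds for averages over a stable run.

```python
def expected_iops(queue_depth, mean_latency_us):
    """IOPS implied by a queue depth and a mean latency in microseconds."""
    return queue_depth * 1_000_000 / mean_latency_us


def expected_latency_us(queue_depth, iops):
    """Mean latency in microseconds implied by a queue depth and IOPS."""
    return queue_depth * 1_000_000 / iops


# Hypothetical example: 32 outstanding I/Os at a 200 microsecond mean latency.
print(expected_iops(32, 200))            # 160000.0
# If the drive tops out at 160,000 IOPS, doubling the queue depth to 64
# cannot add throughput, so the mean latency has to double instead.
print(expected_latency_us(64, 160_000))  # 400.0
```

This is why results at high queue depths need to be read alongside latency: past the saturation point, extra queue depth only adds waiting time.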
Overall, make sure your drives are hitting the QoS latency and consistency metrics while meeting all PRD (product requirements document) performance numbers, as well as passing all your RAID, vSAN and OLTP testing suites.
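QoS latency targets are usually expressed as percentiles rather than averages, because a small number of slow I/Os can hide behind a healthy mean. The following sketch assumes you have a list of per-I/O completion latencies in microseconds. The function names, the nearest-rank percentile method and the target values are assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # round() guards against floating-point noise before taking the ceiling.
    rank = math.ceil(round(pct * len(ordered) / 100, 9))
    return ordered[max(rank, 1) - 1]


def meets_qos(latencies_us, targets_us):
    """Map each target percentile to True if the measured value is within it."""
    return {
        pct: percentile(latencies_us, pct) <= limit
        for pct, limit in targets_us.items()
    }


# Hypothetical run: 1,000 I/Os, mostly fast, with a slow tail.
latencies = [100] * 990 + [500] * 9 + [5000]
# Hypothetical targets: p99 within 200 us, p99.9 within 400 us.
print(meets_qos(latencies, {99: 200, 99.9: 400}))
# {99: True, 99.9: False}
```

In this made-up run the average latency looks fine, but the 99.9th percentile misses its target, which is exactly the kind of inconsistency a QoS check is meant to surface.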
Your goal is to test the drive with a variety of read, write and mixed (R/W/M) workloads for a period long enough to expose any deficiencies that may exist.
“Having an understanding of what your performance requirements are is important to adequately design a configuration that will meet your Quality of Service (QoS) and service level objectives (SLOs) for VDI deployment in addition to knowing what to look for in candidate server, storage and networking technologies… Knowing what your actual performance and application characteristics are helps to align the applicable technology to your QoS and SLO needs while avoiding apples to oranges benchmark comparisons.”
– Greg Schulz, Storage IO Blog
Use the right benchmarking software
The hardest part of testing isn’t choosing the right software or hardware; it’s designing the test parameters. Believe it or not, the best benchmarking tool won’t be found in your test bed. It’s on your network right now. Before you start testing, run a trace using your built-in OS tools. If you’re running high-performance production applications, you need to find out the exact requirements of each app.
When do I/Os spike? When people are pulling reports? When everyone is writing simultaneously? To answer these questions, you need to pull a trace or use the built-in OS tools (Windows Performance Monitor, iostat, htop, vCenter performance reports, nmon) to watch how your application uses the physical disk, CPU, DRAM and network over time in order to identify bottlenecks, as well as read and write latencies.
This will help you to understand the types of workloads, the bandwidth requirements and when bottlenecks occur. Once you have established these baselines, you can design an appropriate test for your organisation and choose an appropriate software platform to measure the results.
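As a rough illustration of turning monitoring data into test parameters, the sketch below summarises a handful of samples into a peak period and a read/write mix. The `Sample` structure, its field names and the figures are hypothetical. In practice you would fill them from whatever your OS tool exports, and the read percentage assumes every sample covers an interval of the same length.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    label: str           # e.g. the time of day the sample was taken
    reads_per_s: float
    writes_per_s: float


def summarise(samples):
    """Return the busiest sample and the overall read percentage."""
    total_reads = sum(s.reads_per_s for s in samples)
    total_writes = sum(s.writes_per_s for s in samples)
    total = total_reads + total_writes
    peak = max(samples, key=lambda s: s.reads_per_s + s.writes_per_s)
    return {
        "peak_label": peak.label,
        "peak_iops": peak.reads_per_s + peak.writes_per_s,
        "read_pct": round(100 * total_reads / total) if total else 0,
    }


# Hypothetical samples taken across a working day.
trace = [
    Sample("09:00", 1200, 300),
    Sample("12:00", 800, 200),
    Sample("17:00", 4200, 1800),
]
print(summarise(trace))
# {'peak_label': '17:00', 'peak_iops': 6000, 'read_pct': 73}
```

The peak figure tells you how hard the test needs to push the drive, and the read percentage gives a starting point for the read/write mix of a synthetic workload.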
You might look at popular press reviews and notice that they’re essentially using three main testing tools: CrystalDiskMark, Iometer and ATTO Disk Benchmark. Most of the tests that use these tools are looking at consumer drives, which don’t undergo the same stress as an enterprise drive.
Comprehensive enterprise testing should start with a tool called fio. This open-source tool lets you measure IOPS, random read and write performance, and latency under real-world conditions. Its tests are highly customisable for your applications and can vary I/O types, block or data sizes, I/O depth, target files and the number of simultaneous processes. It isn’t the only tool that you should use, but it is one of the more comprehensive and makes a great start to your battery of tests.
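To make those knobs concrete, here is a minimal sketch that assembles a fio command line for a mixed random workload. The helper name, the default values and the target path are assumptions for illustration. The sketch only builds and prints the command rather than running it, and the `libaio` engine assumes a Linux host. Pointing fio at a raw device instead of a test file will overwrite the data on it.

```python
def build_fio_command(target, read_pct=70, block_size="4k", iodepth=32,
                      numjobs=4, runtime_s=3600, size="1G"):
    """Build a fio argument list for a time-based mixed random workload."""
    return [
        "fio",
        "--name=mixed-random",        # job name shown in the report
        f"--filename={target}",       # target file (or device: destructive!)
        f"--size={size}",             # amount of the target each job covers
        "--ioengine=libaio",          # Linux asynchronous I/O engine
        "--direct=1",                 # bypass the page cache
        "--rw=randrw",                # mixed random reads and writes
        f"--rwmixread={read_pct}",    # percentage of reads in the mix
        f"--bs={block_size}",         # block size per I/O
        f"--iodepth={iodepth}",       # outstanding I/Os per job
        f"--numjobs={numjobs}",       # simultaneous processes
        "--time_based",               # run for the full runtime
        f"--runtime={runtime_s}",     # seconds
        "--group_reporting",          # aggregate results across jobs
        "--output-format=json",       # machine-readable results
    ]


# Hypothetical scratch file; review the command before running it yourself.
command = build_fio_command("/tmp/fio-test.bin", read_pct=73)
print(" ".join(command))
```

The read percentage, block size and queue depth should come from what you observed in production rather than from these placeholder defaults.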
The important thing about choosing an SSD for your data center is to remember that you’re not just choosing one drive – you may be choosing hundreds or even thousands of drives. They have to last, have the appropriate endurance ratings for your applications and be backed by a manufacturer that will support you.
Check out Part I: How to test an enterprise SSD – hardware requirements for a test bed